Data Is Plural

... is a weekly newsletter of useful/curious datasets.

2018.04.25 edition

School shootings, America’s roads, wind turbines, NYT front-page stories, and 32 cooks in the kitchen.

School shootings. Over the past year, reporters at the Washington Post “attempted to identify every act of gunfire at a primary or secondary school during school hours since the Columbine High massacre on April 20, 1999.” Using a range of sources, the reporters “reviewed more than 1,000 alleged incidents, but counted only those that happened on campuses immediately before, during or just after classes.” The resulting database, published last week, currently contains more than 200 incidents and can be downloaded as a CSV. For each shooting, the database includes details about the location, timing, circumstances, shooter, casualties, and the school’s students. [h/t The INN Nerds]
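For readers who want a first look at the CSV, here is a minimal Python sketch that counts incidents per year. The column name `year` and the sample rows are assumptions made up for illustration, not the Post's actual schema or data; check the header of the downloaded file before relying on any column name.

```python
import csv
import io
from collections import Counter


def incidents_per_year(csv_file, year_column="year"):
    """Count rows per value of `year_column` in an open CSV file.

    `year_column` is a hypothetical column name; replace it with
    whatever the downloaded file's header actually uses.
    """
    reader = csv.DictReader(csv_file)
    counts = Counter()
    for row in reader:
        # Skip rows where the year is missing or blank.
        year = (row.get(year_column) or "").strip()
        if year:
            counts[year] += 1
    # Return a plain dict ordered by year.
    return dict(sorted(counts.items()))


# Made-up rows for illustration only; these are not real incidents.
SAMPLE = """school_name,year,casualties
Example High,1999,3
Sample Middle,1999,0
Placeholder Elementary,2018,1
"""

if __name__ == "__main__":
    print(incidents_per_year(io.StringIO(SAMPLE)))
```

To run this against the real download, open the file with `open(path, newline="", encoding="utf-8")` and pass the resulting file object, along with the actual column name, to `incidents_per_year`.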

America’s roads. The federal Highway Performance Monitoring System “includes inventory information for all of the Nation’s public roads as certified by the States’ Governors annually.” And it’s not just highways: “All roads open to public travel are reported in HPMS regardless of ownership, including Federal, State, county, city, and privately owned roads such as toll facilities.” Shapefiles representing the HPMS data are available for 2011–2015. For each segment of road, the dataset indicates the average daily traffic, number of turn lanes, surface type, and dozens of other variables. Related: America’s Quietest Routes, which uses the data.

Wind turbines. Lawrence Berkeley National Laboratory, the U.S. Geological Survey, and the American Wind Energy Association have partnered to publish the U.S. Wind Turbine Database. The dataset, which the government says will be “continuously updated,” currently contains 57,636 turbines and includes each turbine’s location, development project, manufacturer, model, height, rotor diameter, and other characteristics. You can download the data in several formats, and also explore it on an interactive map. [h/t Ed Vine]

A decade of New York Times front-page stories. For her 2013 book, Making the News: Politics, the Media, and Agenda Setting, UC Davis professor Amber E. Boydstun oversaw the compilation of a dataset of every front-page article in the New York Times from 1996 to 2006. Each of the 31,034 articles has been categorized by topic, according to a detailed codebook, and given a short summary. Related: The Comparative Agendas Project’s list of datasets that use its topic-classification system, including Boydstun’s data. Also related: The NYT’s APIs. [h/t Cornelius Puschmann]

Cooks in the kitchen. Computer-vision researchers convinced 32 participants (of 10 nationalities, living in 4 cities) to record everything they did in their kitchens for three days using a head-mounted camera. Later, the participants narrated what they had been doing. Taken together, the EPIC-Kitchens dataset includes 55 hours of video, nearly 40,000 narration segments, and more. [h/t Duncan Geere]