Data Is Plural

... is a weekly newsletter of useful/curious datasets.

2018.02.14 edition

Nepal after the Gorkha Earthquake, spatiotemporal congressional results, human speech, UK fire stats, and imported bats.

Nepal, post-earthquake. In April 2015, the Gorkha Earthquake killed more than 8,000 people in Nepal and destroyed hundreds of thousands of homes. In early 2016, a team led by the not-for-profit Kathmandu Living Labs, in collaboration with Nepal’s government, undertook “a massive household survey using mobile technology to assess building damage in the earthquake-affected districts.” The responses to that survey are now available at the 2015 Nepal Earthquake Open Data Portal; you can explore the data online or download it in bulk. In all, the datasets include details on millions of individuals, plus information about each surveyed household and building. [h/t Reddit user “phishfart”]

Historical congressional results, historical boundaries. Through the Constituency-Level Elections Archive (DIP 2016.09.28) and other sources, you can get historical election results for the U.S. Congress. And through the work of Jeffrey B. Lewis et al., you can get data describing the historical boundaries of each congressional district. In a Scientific Data article published last year, quantitative geographer Levi John Wolf presented a dataset that brings the two types of information together, so that all congressional election results from 1896 to 2014 are “explicitly linked to the geospatial data about the districts themselves.”

Human speech. Common Voice is a Mozilla-led project that aims “to make voice recognition technology easily accessible to everyone.” To that end, the project asks visitors to record themselves speaking specific sentences, and to validate the recordings of other users. The whole dataset is available to download and currently clocks in at 12 gigabytes, compressed. (Bonus: That download page also links to other freely available voice datasets.) Related: The project’s FAQ.

UK fire stats. The United Kingdom’s Home Office publishes dozens of fire-safety-related datasets, including aggregate statistics on response times, smoke alarms, and fire department staffing; incident-level data on appliance fires, vehicle fires, and fatalities; and much more. Of the 100,000+ domestic appliance fires reported over a six-year span, 52% were believed to have been caused by a “cooker incl. oven,” 11% by a “grill/toaster,” 2% by dishwashers, and just over 1% by deep-fat fryers. Semi-related: Jamie Oliver’s Bad Cheese Idea Is Still Starting Toaster Fires. [h/t Owen Boswarva]

Imported bats. Via a Freedom of Information Act request to the Fish and Wildlife Service, Newsweek reporter Kristin Hugo obtained a spreadsheet listing all imports of bats — vampire, fruit, yellow-shouldered, leaf-nosed, and more — to the United States between January 2016 and October 2017.