Data Is Plural

... is a weekly newsletter of useful/curious datasets.

2017.08.02 edition

The search for MH370, EU lobbyists, test scores, individual library checkouts, and philharmonic subscribers.

Data from the search for MH370. After Malaysia Airlines flight MH370 disappeared in March 2014, the Australian government undertook an enormous seafloor-mapping operation in search of the lost Boeing 777. Last month, it released data from the first phase of the project, which collected 278,000 square kilometers of bathymetry (i.e., seafloor topography) measurements. “In general, the world’s deep oceans have had little investigation,” the government explains in an interactive map. “Only 10 to 15 percent of the ocean has been mapped with the sonar technology similar to that used in the search for MH370.” As a result, the MH370 search area “is now among the most thoroughly mapped regions of the deep ocean on the planet.” [h/t Soh Kam Yung]

European Union lobbying. The EU publishes a searchable database of people and organizations registered to lobby the European Parliament and the European Commission. The website LobbyFacts.eu takes that data and makes it available via an API. LobbyFacts also scrapes the European Commission’s disclosed lobbying meetings, which you can download here (warning: 10-megabyte direct download). Related: You can also explore the lobbyists and meetings via IntegrityWatch.eu, which uses LobbyFacts’ data. Previously: U.S. government lobbyists (DIP 2017.05.31). [h/t Enigma Public + Xavier Dutoit]

SAT, ACT, and AP scores. The California Department of Education publishes aggregate scores on these high-school tests for each county, district, and school going back to the late 1990s. One hitch: For more than two months, the 2016 AP data “contained 350,000 more tests than had actually been taken,” according to inewsource.org’s Megan Wood, who spotted the discrepancies (and others) and got the department to fix them. Similar datasets are available from other states, including Texas, Florida, and Pennsylvania. Bonus: inewsource.org has also published easy-to-search tables of the California AP, SAT, and ACT scores.

Individual library checkouts. The Seattle Public Library publishes a dataset of every checkout of every physical item (e.g., paperback books and DVDs, but not e-books) since April 2005. It currently contains more than 90 million rows. Previously: The library’s monthly checkout counts, by title (DIP 2017.03.01). [h/t David Christensen]

Harmony lovers. The New York Philharmonic has published three spreadsheets listing its subscribers — including where they sat, how much they paid, and where they had their tickets sent — for a slew of orchestral seasons between 1883 and the late 1990s. The earliest data includes names, too. (“Miss A. Brown” of 715 Fifth Avenue seems to have been a big fan, having subscribed to 26 seats for the 1890-91 season.) Previously: The Philharmonic’s performance history (DIP 2016.10.12). [h/t Rachel Shorey]