Data Is Plural

... is a weekly newsletter of useful/curious datasets.

2017.11.29 edition

Political violence in developing countries, stolen guns, (some) White House visitor logs, California election data, and folktales.

Protests and political violence in Africa and Asia. The Armed Conflict Location & Event Data Project (ACLED) records the locations, dates, actors, and outcomes of “all reported political violence and protest events in over 60 developing countries in Africa and Asia.” The Africa datasets currently go back to 1997 and cover more than 50 countries. The Asia datasets currently go back only to 2015, but ACLED’s website says it plans to add data going back to 2010 soon. Both datasets are extensively documented, as is the methodology. [h/t Lari McEdward]

Stolen guns. Missing Pieces is “a yearlong investigation by The Trace and more than a dozen NBC TV stations [that has] identified more than 23,000 stolen firearms recovered by police between 2010 and 2016 — the vast majority connected with crimes.” To support the investigation, the reporters obtained more than 800,000 records of stolen and recovered guns, which they’ve standardized into a single CSV file and supplemented with a data dictionary. The dataset “contains nearly complete stolen-gun records for the states of California and Florida, both of which have centralized collections of gun-theft data,” as well as records from nearly 300 other agencies across the country. Previously: The ATF’s gun trace statistics (DIP 2017.11.08) and firearm background checks (DIP 2015.12.09). [h/t Sarah Ryley]

(Some) White House visitor logs. ProPublica has published a searchable and downloadable dataset of visitor logs and meeting calendars from five White House agencies: the Office of Management and Budget, the Office of the U.S. Trade Representative, the Office of National Drug Control Policy, the Office of Science and Technology Policy, and the Council on Environmental Quality. ProPublica received the underlying documents from Property of the People, a transparency group that sued the Trump administration to release the records under the Freedom of Information Act. (The administration has not released the White House’s main visitor logs.) Related: Politico has manually compiled a searchable database it calls “The Unauthorized White House Visitor Logs”, based on thousands of known visits, meetings, phone calls, and other presidential interactions. Also related: The Obama administration’s White House visitor logs.

California elections and campaign finance. Since 2014, the California Civic Data Coalition has been working to improve access to CAL-ACCESS, “the jumbled, dirty and difficult government database that tracks campaign finance and lobbying activity in California politics.” Their cleaned-up datasets are updated often and include formats suitable for beginners, “database junkies,” and masochists. Last month, the organization released data files cataloging every state ballot measure and candidate for public office since 2000. [h/t Zack Quaintance]

Folktales. The Aarne-Thompson-Uther Classification of Folk Tales organizes (mostly Indo-European) folktales into groups and hierarchies. As Atlas Obscura’s Cara Giaimo puts it, the ATU is “like the Dewey Decimal System, but with more ogres.” The ATU doesn’t publish any downloadable versions of its data, but researchers studying the “ancient roots” of such stories have built a data-matrix that denotes the presence/absence of the 275 ATU “tales of magic” across 50 Indo-European-speaking populations. [h/t Andrew McCartney]