Data Is Plural

... is a weekly newsletter of useful/curious datasets.

2023.07.12 edition

US military interventions, the latest White House visitors, NY prison employee misconduct, Australian mine production, and “the global human day.”

US military interventions. For the Military Intervention Project, Sidita Kushi and Monica Duffy Toft have constructed a dataset of “all instances of US military intervention from 1776 until 2019, alongside key drivers and consequences of these interventions.” The 392 cases include wars, occupations, major troop deployments, humanitarian assistance, and other military actions abroad. The dataset categorizes their objectives, use of force, and outcomes; indicates their location, foreign states involved, starting/ending year, and human costs; and provides many additional variables. Its sources include government publications, media reports, other datasets (such as the International Military Intervention and Military Intervention by Powerful States projects), and more. [h/t David Vine]

The latest White House visitors. In May 2021, the Biden-Harris administration began releasing its logs of visitors to the White House. The records now include 500,000+ entries from January 2021 to March 2023, featuring 350,000+ distinct names and indicating the visit’s timing, visitee, meeting location, and more. Caveat: “However, a Bloomberg News analysis of the data found duplications, anomalies and missing names,” write Eric Fan and Josh Wingrove. “For example, the records […] show just five visits from Nancy Pelosi when she was House Speaker, despite at least 20 known instances when she was there.” Previously: Official logs from the Obama-Biden White House, plus ProPublica’s and Politico’s attempts to compile them for Trump-Pence (DIP 2017.11.29).

NY prison employee misconduct. For a series of articles (co-published with the New York Times) investigating abuse by prison guards in New York State, The Marshall Project obtained and analyzed data representing 12 years of Department of Corrections and Community Supervision employee disciplinary notices. The newsroom is publishing those records, which it has converted from two PDFs into tabular data, along with additional context and caveats. The records contain ~6,000 (non-redacted) notices; they indicate the employee name, title, facility, union, type of misconduct, case disposition, description, and penalty, among other details.

Australian mine production. “No […] study has ever compiled a national mine production data set which includes basic mining data such as ore processed, grades, extracted products (e.g., metals, concentrates, saleable ore) and waste rock,” writes Gavin M. Mudd, whose new dataset aims to do exactly that for Australia from 1799 to 2021. It contains mine-by-mine metrics, as well as annual production by state and element/mineral.

“The global human day.” By harmonizing “data collected by national statistics agencies, international organizations, and researchers from over 140 countries,” William Fajzel et al. have “assemble[d] a complete estimate of what humans are doing, averaged over time and across the entire population, to provide [what] we refer to as the global human day.” Their published data include global and national estimates for 24 subcategories, such as sleep (~9 hours) and childcare (~17 minutes).