Data Is Plural

... is a weekly newsletter of useful/curious datasets.

2024.03.06 edition

Global military spending, fatal police pursuits, price-fixing cartels, real-time airport disruptions, and pinball machines.

Global military spending. How much money has each country spent, each year, on its military? Different datasets have different answers, cover different timeframes, and use different methodologies. Miriam Barnum et al.’s Global Military Spending Dataset attempts to bring them together. By uniting “76 variables from 9 dataset collection projects,” the authors write, “we provide the most comprehensive and complete set of published datasets on military spending ever assembled.” Each of the variables represents one source/methodology, and each observation is a country-year. “Disagreement on the actual expenditure value for a given country-year is common, even between datasets produced by the same project,” they find. Previously: The Stockholm International Peace Research Institute’s Military Expenditure Database (DIP 2017.03.29), one of the sources.
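To make the "one variable per source, one observation per country-year" layout concrete, here is a minimal sketch of how you might measure disagreement between sources for each country-year. The column names (`source_x`, `source_y`, `source_z`) and all of the numbers are invented for illustration and are not taken from the dataset; the real file's variable names and units will differ.

```python
from statistics import median

# Toy rows in the wide layout described above: one row per country-year,
# one column per source/methodology. All values are made up for illustration.
rows = [
    {"country": "A", "year": 2000, "source_x": 100.0, "source_y": 110.0, "source_z": None},
    {"country": "A", "year": 2001, "source_x": 120.0, "source_y": 120.0, "source_z": 120.0},
    {"country": "B", "year": 2000, "source_x": None, "source_y": 50.0, "source_z": None},
]

KEY_COLUMNS = ("country", "year")


def relative_spread(row):
    """Return (max - min) / median across the sources that report a value.

    Returns None when fewer than two sources cover the country-year,
    or when the median is zero, since the ratio is undefined there.
    """
    values = [v for k, v in row.items() if k not in KEY_COLUMNS and v is not None]
    if len(values) < 2:
        return None
    mid = median(values)
    if mid == 0:
        return None
    return (max(values) - min(values)) / mid


# Map each (country, year) pair to its disagreement score.
spreads = {(r["country"], r["year"]): relative_spread(r) for r in rows}

for key, spread in spreads.items():
    print(key, spread)
```

A spread of zero means every reporting source agrees; `None` means there are not enough sources to compare. Dividing by the median is just one reasonable way to normalize, and it is a choice made here for illustration rather than the authors' method.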

Fatal police pursuits. Reporters at the San Francisco Chronicle have compiled a national dataset of 3,300+ deaths in police car chases in 2017–2022. To build it, they used information from the federal government’s Fatality Analysis Reporting System (DIP 2016.08.31), research organizations, news reports, lawsuits, and public records requests. For each death, the dataset indicates the person’s name, age, gender, race, and connection to the pursuit (driver, passenger, bystander, officer). It also includes the incident’s date, location, reason given for the pursuit, and main law enforcement agency involved. Read more: “Fast and Fatal,” the Chronicle’s investigation based on the dataset. [h/t Susie Neilson]

Price-fixing cartels. Industrial economist John M. Connor has constructed the Private International Cartels dataset, “which the author believes to be the largest collection of legal-economic information on contemporary price-fixing cartels.” It spans three decades (1990–2019) and covers 1,500+ suspected or convicted cartels, including 1,100+ that “have been deemed guilty of price fixing by one or more antitrust authority.” It also links those cartels to tens of thousands of companies and to 2,000+ individuals indicted or punished for their involvement. The dataset’s variables include information about cartel geography, industry, market share, overcharges, penalties, and much more.

Real-time airport disruptions. The Federal Aviation Administration’s National Airspace System Status dashboard provides real-time listings of delays and closures at US airports. For each disruption, it indicates the type of problem, reason, current average delay times, and more. A minimal API linked from the site provides the information as an XML-formatted file. Read more: Ruihai Youngblood describes his experience helping to redesign the dashboard. [h/t Jason Scott]
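If you want to work with an XML feed like this one, a minimal sketch using Python's standard library might look like the following. The element and attribute names in `SAMPLE_XML` are hypothetical placeholders, not the FAA's actual schema, and the sample values are invented; check the real file's structure before adapting this.

```python
import xml.etree.ElementTree as ET

# Invented sample document. The tag and attribute names are assumptions
# made for illustration and do not reflect the real feed's schema.
SAMPLE_XML = """<status>
  <delay airport="AAA" type="Ground Delay" reason="weather" avg_minutes="45"/>
  <closure airport="BBB" reason="runway construction"/>
</status>"""


def parse_disruptions(xml_text):
    """Turn each child element of the root into a flat dict.

    The element's tag is stored under "kind"; its attributes are copied as-is.
    """
    root = ET.fromstring(xml_text)
    records = []
    for element in root:
        record = {"kind": element.tag}
        record.update(element.attrib)
        records.append(record)
    return records


disruptions = parse_disruptions(SAMPLE_XML)

for item in disruptions:
    print(item["kind"], item["airport"], item.get("reason"))
```

Because the feed is real-time, you would typically re-fetch and re-parse it on a schedule instead of caching the result for long.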

Pinball machines. The Open Pinball Database provides a searchable inventory and API of ~2,000 pinball machines and 120+ manufacturers. Details include each machine’s name, manufacture date, mechanism type, display type, player count, and more. Related: Pinball Map’s crowdsourced global map and API of the locations of installed pinball machines. [h/t Jeremy Herrman + technophiliac]