Data Is Plural

... is a weekly newsletter of useful/curious datasets.

2023.09.06 edition

Eponymic streets, doctors in practice, intra-state ceasefires, metabolisms, and The Onion’s “American Voices.”

Eponymic streets. Mapping Diversity, created by the European Data Journalism Network, “is a platform for discovering key facts about diversity and representation in street names across Europe, and to spark a debate about who is missing from our urban spaces.” The interactive analysis of 145,000+ streets in 30 major European cities launched in March, accompanied by spreadsheets calculating city-level statistics and listing all the streets named after women. Last month the team released its full dataset, which provides information for every street examined, plus data for six more cities. Each row indicates a street’s country, city, and name; whether it’s named after anyone; and, if so, the person’s name, gender, and various attributes from Wikidata, such as occupation and date of birth. Previously: Las Calles de las Mujeres (DIP 2019.05.29).
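For readers who want to poke at a one-row-per-street table like this, here is a minimal sketch of computing, per city, the share of person-named streets that are named after women. The column names (`city`, `named_after`, `person_gender`) and the sample rows are placeholders invented for illustration; the real file's headers and values may well differ, so check them before adapting this.

```python
import csv
import io
from collections import defaultdict

# Toy rows with HYPOTHETICAL column names and made-up values,
# standing in for a one-row-per-street file.
SAMPLE = """country,city,street_name,named_after,person_gender
XX,City A,Street 1,Person One,female
XX,City A,Street 2,Person Two,male
XX,City A,Street 3,,
XX,City B,Street 4,Person Three,male
"""


def share_named_after_women(rows):
    """Return {city: fraction of person-named streets named after women}."""
    named = defaultdict(int)  # streets named after any person, per city
    women = defaultdict(int)  # streets named after a woman, per city
    for row in rows:
        if not row["named_after"]:
            continue  # skip streets not named after a person
        named[row["city"]] += 1
        if row["person_gender"] == "female":
            women[row["city"]] += 1
    return {city: women[city] / named[city] for city in named}


shares = share_named_after_women(csv.DictReader(io.StringIO(SAMPLE)))
```

Streets not named after anyone are left out of the denominator here, which is one reasonable choice among several; dividing by all streets would answer a different question.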

Doctors in practice. The Physician and Physician Practice Research Database, published by the US government’s Agency for Healthcare Research and Quality, harmonizes data on medical practices from 13 participating states. The public-use files provide each practice’s ZIP code, number of physicians, most common specialty, organizational NPI, and other characteristics. They also provide statistical aggregates at the 3-digit ZIP code level, such as the number of physicians accepting Medicare and/or Medicaid, average claims per month, and more. [h/t Gary Price]
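To make "3-digit ZIP code level" concrete: a 3-digit ZIP is simply the first three characters of the 5-digit code, so practice-level rows can be rolled up by that prefix. The sketch below assumes a hypothetical list of (ZIP code, physician count) pairs with made-up counts; it does not reflect the database's actual file layout or the agency's own aggregation method.

```python
from collections import Counter

# HYPOTHETICAL practice-level records: (5-digit ZIP as a string, physician count).
# The counts are made up for illustration.
practices = [("02139", 4), ("02142", 10), ("90210", 3)]


def physicians_by_zip3(records):
    """Sum physician counts by the first three characters of each ZIP code."""
    totals = Counter()
    for zip_code, n_physicians in records:
        totals[zip_code[:3]] += n_physicians
    return dict(totals)


by_zip3 = physicians_by_zip3(practices)
```

Keeping ZIP codes as strings rather than integers matters: parsing "02139" as a number would drop the leading zero and yield the wrong prefix.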

Intra-state ceasefires. Govinda Clayton et al.’s Civil Conflict Ceasefire Dataset “covers all ceasefires in civil conflict between 1989 and 2020, including multilateral, bilateral and unilateral arrangements, ranging from verbal arrangements to detailed written agreements.” The dataset’s 2,200+ ceasefires span 109 conflicts in 66 countries and are largely based on news articles discussing the agreements. A team of reviewers manually coded each instance, indicating its type and stated purpose; the sides participating; the dates it was declared, took effect, and ended; and more. Previously: The PA-X Peace Agreements Database (DIP 2018.02.28).

Metabolisms. Drawing on 20+ published sources, Tori M. Hoehler et al. have compiled a dataset of 10,000+ measurements of metabolic rates of mammals, fish, birds, insects, tree saplings, bacteria, and other living organisms. The dataset includes several types of metabolic rates (primarily basal, field, and maximum), and a mix of individual-organism and species-average measurements.

Vox populi, except not. The Onion runs a regular feature called “American Voices,” which presents fake quotes from fake people responding to not-fake events in the news. Cody Winchester has built a spreadsheet listing the headlines, descriptions, and dates for 7,000+ of these features since August 1996; the 23,000+ quotes in them; and the names, occupations, and (almost always recycled) photos of the fictional personae purportedly quoted.