Data Is Plural

... is a weekly newsletter of useful/curious datasets.

2023.11.15 edition

Trans rights, home schooling, migrant arrivals in Italy, toxic wastewater spills, and coffee tasting.

Trans rights. Myles Williamson’s Trans Rights Indicator Project “provides insight into the legal situations transgender people faced in 173 countries from 2000 to 2021.” The project’s dataset is “the only public dataset covering trans rights with wide spatial and temporal coverage (to my knowledge),” Williamson writes. For each country-year combination, it “includes 14 indicators that capture the presence or absence of laws related to criminalization, legal gender recognition, and anti-discrimination protections.” For instance: Does the country require a psychological diagnosis before someone can change their gender on identity documents?

Home schooling. The Washington Post has gathered data on home-school enrollment figures in dozens of US states and 6,700+ school districts over the past six academic years. Post reporters, with help from students at American University, “trawled state websites, contacted education officials in all 50 states and the District of Columbia and submitted multiple public records requests” to build the dataset, released last week. Each entry indicates the state/district, school year, and the number of students registered for home schooling. Read more: The Post’s analysis, which “reveals that a dramatic rise in home schooling at the onset of the pandemic has largely sustained itself through the 2022-23 academic year, defying predictions […].” [h/t Meghan Hoyer]

Migrant arrivals in Italy. The Dati Bene Comune campaign’s latest initiative, Liberiamoli tutti! (Let’s free them all!), aims to improve the availability of Italian government data. Its first project has been to convert the interior ministry’s statistics on migrant arrivals from PDF reports into spreadsheets. The extracted information includes daily counts of sea arrivals, twice-monthly totals by nationality, and the number of people in reception centers by center type and Italian region. A nice shout-out: Liberiamoli tutti! says it’s inspired by Data Is Plural’s sibling, The Data Liberation Project.

Toxic wastewater spills. For a recent Inside Climate News investigation, Martha Pskowski and Peter Aldhous wrangled data on 10,000+ wastewater spills reported by oil and gas companies to the Texas government between 2013 and 2022. These “spill logs” — obtained through public records requests, then cleaned and standardized by the journalists — correspond to more than 148 million gallons of “produced water,” a byproduct of drilling and fracking. The data indicate each spill’s date, location, facility, operator, type of operation, volume of wastewater released, volume recovered, and much more.

Coffee tasting. Last month, British YouTuber (and former World Barista Champion) James Hoffmann virtually hosted the Great American Coffee Taste Test, during which thousands of people simultaneously blind-tasted the same four coffees. Hoffmann has published a video summarizing the results, as well as a spreadsheet of anonymized survey responses from 4,000+ participants. It includes tasters’ demographics, general coffee drinking habits and preferences, assessments of the four coffees, and more. [h/t Dan Brady]