Data Is Plural

... is a weekly newsletter of useful/curious datasets.

2023.02.22 edition

Facilities handling hazardous chemicals, Animal Welfare Act inspections, daily European gas imports, unclaimed estates, and data journalists.

Facilities handling hazardous chemicals. The US Environmental Protection Agency’s Risk Management Program rule requires facilities that handle “extremely hazardous substances” to tell the government, at least every five years, about those substances, their safety plans, their recent accident history, and more. Through a FOIA request to the EPA, the Data Liberation Project (which, disclosure, I run) obtained a copy of the agency’s database of these filings (minus some parts the government deems non-disclosable), containing submissions by 21,000+ facilities from early 1999 to February 2022. You can now access that data, in various formats, along with documentation guiding you through it.

Animal Welfare Act inspections. The USDA’s Animal and Plant Health Inspection Service checks whether animal dealers, exhibitors, research facilities, and transporters are complying with the care standards set by the Animal Welfare Act. The agency provides public access to the inspection reports but no bulk data on them. So, in a collaboration between Big Local News and the Data Liberation Project (same disclosure as above), Ben Welsh and I wrote code to fetch the 80,000+ (and counting) inspections going back to 2014, parse their PDFs, and make the data more accessible. The information includes each inspection’s date, type, licensee, violation counts, species inspected, and more.

Daily European gas imports. Researchers at Bruegel are tracking daily and weekly natural gas imports to Europe, using data from the European Network of Transmission System Operators for Gas’s transparency portal. Alongside the imports, which they’re aggregating by source (e.g., Russia, Norway, Algeria) and by route (e.g., Nord Stream, TurkStream), the researchers are also tracking gas storage levels, using data from Gas Infrastructure Europe (DIP 2022.01.26). Previously: Eurostat’s data on annual European energy imports and exports (DIP 2022.03.16).

Unclaimed estates. The UK government’s Bona Vacantia division publishes a dataset of unclaimed estates: inheritances that no one has yet claimed. The entries indicate the deceased person’s name, aliases, date/place of birth and death, marital status, and more. Related: California provides a dataset of unclaimed property, such as “lost or forgotten” bank accounts, insurance benefits, and stock holdings.

Data journalists, surveyed. The European Journalism Center has published a dataset of 1,800+ anonymized responses to its second annual State of Data Journalism Survey, including 50+ entries each from the US, UK, Italy, Germany, Spain, India, and Nigeria, plus double-digit counts from dozens of other countries. The questions touch on demographics, employment, training, skills, the COVID-19 pandemic, and more. [h/t Simona Bisiani]