2025.07.16 edition
Landfills. The EPA’s Landfill Methane Outreach Program, “a voluntary program that works cooperatively with industry stakeholders and waste officials to reduce or avoid methane emissions,” maintains a database of 2,600+ municipal solid waste (MSW) landfills in the United States. One table provides each site’s name, location, owners, operators, ownership type, year opened, year closed, size, waste capacity, latest waste tonnage, methane-related metrics, and more. Another table lists “landfill gas energy projects in various stages, such as planned, under-construction, operational and shutdown.” As seen in: “America’s Hot Garbage Problem” (Bloomberg). [h/t Laura Bliss]
Car crash datasets. Transportation policy scholars Hannah Younes and Robert B. Noland have compiled a catalog of US states’ car crash data resources. According to the authors’ survey, 30 states and DC publish raw data, 14 states provide only a dashboard or map, and six provide no public data. They’ve provided links to each of the resources, such as New Jersey’s data downloads, Arizona’s dashboard, and Wisconsin’s map. The data sources vary in detail, time span, and ease of access. Previously: The National Highway Traffic Safety Administration’s Fatality Analysis Reporting System (DIP 2016.08.31). [h/t Michael Allen]
AI legal flubs. Damien Charlotin’s AI Hallucination Cases database “tracks legal decisions in cases where generative AI produced hallucinated content – typically fake citations, but also other types of arguments.” Charlotin, who teaches a course called “Large Language Models and the Future of the Legal Profession”, has collected 212 examples so far. The database lists each case’s name, jurisdiction, decision date, party responsible, AI tool used, “nature of hallucination”, outcome, penalties, and description. [h/t Simon Willison + Avi Levin]
British literary prizes. The Selected British Literary Prizes dataset, created by Katherine Binhammer and colleagues, “contains information on nine major literary prizes in the U.K. from 1990 to 2022 and demographic information on 682 prize winners and shortlisted authors.” The demographic attributes include “gender, sexuality, UK residency, ethnicity, geography and details of educational background.” The documentation includes a section on how the researchers approached the ethnicity categorizations. Previously: US literary prizewinners (DIP 2022.12.07), also via the Post45 Data Collective. [h/t Melanie Walsh]
Subway art. The Metropolitan Transportation Authority’s Permanent Art Program commissions public artworks for New York City Transit, Metro-North Railroad, Long Island Rail Road, and even one of the MTA’s tunnels. The program publishes a dataset of the collection’s 380+ pieces, which include mosaics, stained glass windows, sculptures, passageway floors, murals, fences, and more. The dataset provides each piece’s location, transit agency, artist, artwork name, date constructed, material, description, and online catalog link. As seen in: Stephanie Dang’s Art Off the Rails, winner of the 2024 MTA Open Data Challenge. [h/t Matt Yarri]