Data Is Plural

... is a weekly newsletter of useful/curious datasets.

2022.11.09 edition

Income patterns, mine safety, William Still’s freedom-seekers, Sierra snowpack, and the turkey industry.

Income patterns. The Global Repository of Income Dynamics is a new “open-access international database that provides a wealth of micro statistics on income inequality and income dynamics.” It was constructed by an international team of economists using longitudinal administrative data and “designed from ground up with a focus on comparability across countries.” The project’s data access tool provides stats ranging from the widely understood (e.g., share of income going to the top 1%) to the more specialized (e.g., kurtosis coefficients of various income-change distributions). It currently covers 13 countries (although access to the UK’s data is listed as “coming soon”), with timespans that typically stretch from the 1980s or 1990s to the mid/late-2010s.
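For readers less familiar with the two example statistics, here is a minimal Python sketch of how a top-1% income share and a kurtosis coefficient can be computed. It is not the repository's own methodology: the function names, the plain Pearson definition of kurtosis, and the synthetic log-normal incomes are all assumptions made for illustration.

```python
import math
import random
import statistics


def top_share(incomes, fraction=0.01):
    """Share of total income received by the top `fraction` of earners."""
    ranked = sorted(incomes, reverse=True)
    n_top = max(1, round(len(ranked) * fraction))
    return sum(ranked[:n_top]) / sum(ranked)


def kurtosis(values):
    """Pearson kurtosis: fourth central moment over squared variance.

    A normal distribution scores 3; heavier tails score higher.
    """
    mean = statistics.fmean(values)
    m2 = statistics.fmean([(v - mean) ** 2 for v in values])
    m4 = statistics.fmean([(v - mean) ** 4 for v in values])
    return m4 / m2 ** 2


# Synthetic two-year panel (illustrative only, not real data).
rng = random.Random(0)
year1 = [rng.lognormvariate(10, 1) for _ in range(10_000)]
year2 = [y * math.exp(rng.gauss(0, 0.3)) for y in year1]

# One-year log income changes, the kind of distribution the blurb mentions.
log_changes = [math.log(b) - math.log(a) for a, b in zip(year1, year2)]

print("top 1% share:", top_share(year1))
print("kurtosis of log changes:", kurtosis(log_changes))
```

The repository publishes these figures precomputed, so a sketch like this is mainly useful for sanity-checking what a given column means.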

Mine safety. The Department of Labor’s Mine Safety and Health Administration “develops and enforces safety and health rules for all U.S. mines regardless of size, number of employees, commodity mined, or method of extraction.” Its Mine Data Retrieval System provides a search interface and downloads of records collected through this oversight role, including detailed information about individual accidents since January 2000, civil penalties, coal dust samples, mine owner and operator histories, employment and production levels, and more.

Still’s freedom-seekers. William Still, a Philadelphia-based abolitionist and key leader of the Underground Railroad, kept records of the freedom-seekers he helped to escape the South, details of which he recounted in his 1872 magnum opus, The Underground Railroad. In the early 2000s, the late James A. McGowan compiled those descriptions into a spreadsheet listing each runaway’s name, gender, age, date of escape, enslaver’s name, and more. William C. Kashatus’s 2021 biography of Still draws on McGowan’s efforts and other records to construct a listing of 995 runaways whom Still assisted, which historian Nick Sacco has converted into a spreadsheet. [h/t Eric Gardner]

Sierra snowpack. For decades, UC Berkeley’s Central Sierra Snow Lab has been collecting daily temperature, snowfall, snowpack, and related measurements at its field station near Donner Pass. Until recently, the records were available only upon request. That changed last year, when the lab published a fully public dataset of the daily measurements for October 1970 to September 2019.

The turkey industry. The USDA’s National Agricultural Statistics Service publishes monthly and annual reports estimating the number of turkeys incubating, raised, slaughtered, and in cold storage, based on surveys and food-safety inspections. The reports contain semi-structured data tables, while more-structured data can be fetched through the agency’s Quick Stats tool and API. [h/t Emily Stewart + Walt Hickey]
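As a rough illustration of the API route, the Python sketch below builds a Quick Stats request URL for turkey records. The endpoint path, the parameter names (`key`, `commodity_desc`, `year__GE`, `format`), and the `TURKEYS` commodity value are assumptions drawn from memory of the agency's API documentation, so check them against the current docs before relying on them. The function name and the `DEMO` key are hypothetical, and a real API key has to be requested from NASS.

```python
from urllib.parse import urlencode

# Assumed endpoint for the Quick Stats API; verify against the NASS docs.
QUICK_STATS_ENDPOINT = "https://quickstats.nass.usda.gov/api/api_GET/"


def turkey_query_url(api_key, since_year, fmt="JSON"):
    """Build a request URL for turkey records from `since_year` onward.

    Parameter names and the commodity value are assumptions, not confirmed
    by the newsletter.
    """
    params = {
        "key": api_key,
        "commodity_desc": "TURKEYS",
        "year__GE": since_year,
        "format": fmt,
    }
    return f"{QUICK_STATS_ENDPOINT}?{urlencode(params)}"


# Placeholder key for illustration; no request is sent here.
print(turkey_query_url("DEMO", 2015))
```

The sketch stops at building the URL so it runs without a network connection. Fetching it with any HTTP client should return the matching records, provided the assumptions above hold.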