Data Is Plural

... is a weekly newsletter of useful/curious datasets.

2020.08.19 edition

Federal sentencing, the demographics of power, House job listings, late-medieval English immigrants, and bees.

Federal sentencing. The United States Sentencing Commission publishes annual datasets, going back to 2002, on people and organizations criminally sentenced in federal court. The files are anonymized, but contain hundreds of variables detailing the circumstances and outcomes of each decision. The commission also publishes “special collections” with additional information on drug-trafficking and economic crimes. Note: The datasets are published as SAS and SPSS files, but Kevin H. Wilson has shared Python code to convert them to CSVs. [h/t Giuseppe Sollazzo]
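For readers who want a sense of what such a conversion can look like, here is a minimal sketch using pandas. It is not Kevin H. Wilson's code, and it assumes the file you have is a binary SAS file (.sas7bdat or .xpt) or an SPSS file (.sav); if the download is packaged differently, this will not apply as written. The function names `csv_path_for` and `convert_to_csv` are hypothetical, and reading .sav files through pandas additionally requires the pyreadstat package.

```python
from pathlib import Path

# Map of file extensions to the pandas reader that handles them.
# Assumption: the input is a binary SAS or SPSS file with one of these suffixes.
READERS = {
    ".sas7bdat": "read_sas",
    ".xpt": "read_sas",
    ".sav": "read_spss",  # pandas.read_spss needs pyreadstat installed
}


def csv_path_for(src):
    """Return the default output path: same name, .csv extension."""
    return Path(src).with_suffix(".csv")


def convert_to_csv(src, dst=None):
    """Read a SAS or SPSS file with pandas and write it out as a CSV."""
    src = Path(src)
    reader_name = READERS.get(src.suffix.lower())
    if reader_name is None:
        raise ValueError(f"Unsupported file type: {src.suffix!r}")

    # Imported here so the extension check works even without pandas.
    import pandas as pd

    dst = Path(dst) if dst is not None else csv_path_for(src)
    frame = getattr(pd, reader_name)(str(src))
    frame.to_csv(dst, index=False)
    return dst
```

Calling `convert_to_csv("example.sav")` on a hypothetical local file would write `example.csv` next to it. With hundreds of variables per file, it may be worth selecting only the columns you need before writing.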

The demographics of power. The Reflective Democracy Campaign and the Center for Technology and Civic Life have partnered to produce datasets examining the demographics of 3,000+ elected sheriffs; 2,800+ elected prosecutors; and candidates for federal, state, and local offices in 2012, 2014, 2016, and 2018. [h/t Stacy Montemayor]

House work. With help from academics and former Hill staffers, journalist Derek Willis has assembled an archive of the weekly job and internship bulletins sent by the US House of Representatives. The archive, which goes back to late 2013, includes both the original PDFs and text extracted from them.

Late-medieval English immigrants. England’s Immigrants 1330-1550 is “a fully-searchable database containing over 64,000 names of people known to have migrated to England during the period of the Hundred Years’ War and the Black Death, the Wars of the Roses and the Reformation,” drawing on “taxation assessments, letters of denization and protection, and a variety of other licences and grants.” In addition to names, the dataset includes nationalities, places of residence, occupations, and more. [h/t W. Mark Ormrod]

Bees. The US Geological Survey’s Native Bee Inventory and Monitoring Lab keeps tabs on the country’s bee species, including through a dataset of more than 400,000 observations of “native and non-native bees, wasps and other insects.” (Free registration required.) Related: The lab also publishes “The Very Handy Manual: How to Catch and Identify Bees and Manage a Collection,” plus thousands of high-resolution photos.