Data Is Plural

... is a weekly newsletter of useful/curious datasets.

2019.04.17 edition

Medical device safety, SCOTUS confirmation transcripts, scientific publishing, coroner inquests, and Skittles.

Medical device safety. The International Consortium of Investigative Journalists, along with media partners in dozens of countries, has been compiling a cross-border database of medical-device safety alerts. The alerts include recalls as well as less-urgent notifications published by health authorities and manufacturers. You can download the public database, which so far includes 90,000+ notices for devices in 18 countries. The records include the date and type of notice; a device identifier; the reason for the alert; a classification of its severity; and more. Related: The Implant Files, an investigative series by the consortium, based on the data.

SCOTUS confirmation transcripts. The R Street Institute has converted the last five decades of successful Supreme Court confirmation hearings into a spreadsheet, with one row for each statement, question, and answer. The 15 transcripts begin with William Rehnquist’s 1971 hearing and end with Neil Gorsuch’s in 2017. (Robert Bork’s failed nomination is excluded, and Brett Kavanaugh’s 2018 transcript is not yet available.) [h/t Zachary Agatstein + Alex Spurrier]

Scientific publishing, linked. The Microsoft Academic Knowledge Graph, published under the Open Data Commons Attribution License (ODC-By), describes 8+ billion relationships between scientific papers, their authors, affiliated institutions, conferences, journals, fields of study, and more. The data can be downloaded and also queried online through a SPARQL interface. [h/t Michael Färber]

18th-century coroner inquests. The London Lives initiative “makes available, in a fully digitised and searchable form, a wide range of primary sources about eighteenth-century London, with a particular focus on plebeian Londoners.” As part of the project, digital historian Sharon Howard has compiled a dataset of 2,894 Westminster coroners’ inquests from 1760 to 1799. The fields include the date of death, the name of the deceased, the cause of death, the coroner’s verdict, and more. Bonus: A recent Twitter thread from Howard highlighting more datasets.

Double rainbows. The question: How many bags of Skittles must you open before finding two identical color-distributions? The answer: “82 days, 13 boxes, 468 packs, and 27,740 individual Skittles later […]”. The data: available on GitHub. [h/t u/cavedave]
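
For a rough feel of why the answer lands in the hundreds of packs rather than the millions, the question can be treated as a birthday-problem simulation. The Python sketch below is not the experimenter's method. It rests on simplifying assumptions that are not from the original write-up: five flavors, each Skittle equally likely to be any flavor, and a fixed number of Skittles per pack (real packs vary in size, which makes matches rarer). All function names and parameters are hypothetical.

```python
import random

FLAVORS = 5  # assumption: five flavors, each equally likely


def random_pack(rng, pack_size, flavors=FLAVORS):
    """Return one simulated pack as a tuple of per-flavor counts."""
    counts = [0] * flavors
    for _ in range(pack_size):
        counts[rng.randrange(flavors)] += 1
    return tuple(counts)


def packs_until_duplicate(rng, pack_size):
    """Open simulated packs until one matches a pack seen earlier.

    Returns the total number of packs opened, including the matching one.
    """
    seen = set()
    opened = 0
    while True:
        opened += 1
        pack = random_pack(rng, pack_size)
        if pack in seen:
            return opened
        seen.add(pack)


def average_packs(trials, pack_size, seed=0):
    """Average packs_until_duplicate over several independent trials."""
    rng = random.Random(seed)
    total = sum(packs_until_duplicate(rng, pack_size) for _ in range(trials))
    return total / trials


if __name__ == "__main__":
    # Assumption: a fixed pack size of 59, roughly 27,740 / 468 from the quote above.
    print(average_packs(trials=50, pack_size=59))
```

Because the pack size is held constant here, the simulated average should come out lower than the 468 packs of the real experiment; letting pack_size vary from pack to pack would be a natural next refinement.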