Data Is Plural

... is a weekly newsletter of useful/curious datasets.

2019.01.30 edition

Fatal and non-fatal gun crime, hourly rainfall, ethnonationalism, uncertain spellings, and Twin City radio spins.

Fatal and non-fatal gun crime. On Thursday, Sarah Ryley, Sean Campbell, and I published a deeply reported investigation into U.S. cities’ failure to solve shootings, a year-long collaboration between The Trace and BuzzFeed News. To reach our quantitative findings, we analyzed (and standardized) three major FBI datasets, internal data from 22 police departments, and a database of Baltimore victims and suspects. Data, code, and methodologies for the analyses are available on GitHub. Related: Last year, The Washington Post published Murder with Impunity, a series examining unsolved homicides; their data, on 52,000+ homicides in 50 cities, is also available on GitHub.

Hourly rainfall. Since 1997, the Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks (PERSIANN) algorithm has used satellite imagery to estimate rainfall rates around the world. The system’s hourly, daily, monthly, and annual estimates can now be explored online and downloaded.

Ethnonationalism. Christina Isabel Zuber and Edina Szöcsik’s Ethnonationalism in Party Competition dataset compiles ratings for more than 200 political parties in 22 European countries. Experts rated the parties twice — first in 2011, and then again in 2017 — on a range of factors, such as the centrality of ethnonationalism to the parties’ platforms, and their positions on territorial autonomy for minorities. (Dataset access requires providing a name and email address.) [h/t Erik Gahner]

Uncertain spellings. “Funemployed programmer” Colin Morris looked for all the times where commenters on Reddit added “(sp?)”, or a related annotation, to their remarks. E.g., “SF is putting on quite a show, especially Kapernick (sp?).” Morris then compiled a dataset of the words that preceded those annotations, accompanied by examples of their usage. [h/t Rich Posert]

Twin City radio spins. Shane Nackerud needed to know: “Does 89.3 the Current play the Replacements every day?” To figure it out, the University of Minnesota librarian extracted track listings from 1.1 million @currentplaylist tweets from 2009 through 2018. He’s also published the total play counts by artist and the raw data. [h/t Kent Gerber + Amy Riegelman]