Data Is Plural

... is a weekly newsletter of useful/curious datasets.

2020.04.22 edition

A COVID policy tracker–tracker, every US Census question since 1790, chemical weapons in Syria, NYC sidewalks, and crowdsourced moral judgments.

Tracking the COVID policy trackers. Economics graduate student Lukas Lehner has gathered, with help from the hive-mind, a list of dozens of websites and datasets tracking policy responses to the COVID-19 pandemic – a few that have been mentioned in DIP, plus many that haven’t. Lehner’s tracker-tracker groups the resources into several topic areas, summarizes them, and indicates their data formats. [h/t François Briatte]

Every Census question, ever. Programmer Alec Barrett has built a spreadsheet listing every question asked on every decennial US Census since 1790 — more than 900 items overall. In addition to the questions themselves, the dataset describes the subgroups of people questioned and the types of answers expected. Related: The dataset powers “The Evolution of the American Census,” Barrett’s interactive exploration of how the questions have changed over time, and what they say about America.

Chemical weapons attacks in Syria. “Building on years of painstaking work alongside our Syrian and international partners,” the Global Public Policy Institute “has compiled the most comprehensive dataset of incidents of chemical weapons use in Syria to date.” The institute has published a new interactive data portal to display the data on 345 attacks between 2012 and 2019. The data fields include the date and time of the attack, location, chemical agent, munition type and method of delivery, perpetrator, confidence rating, and more. [h/t Tobias Schneider]

NYC sidewalks, narrow and broad. Urban planner Meli Harvey has taken New York City’s official dataset of sidewalks and dissected its geometries to map the width of each segment of walkable pavement. [h/t Dan Brady]

Crowdsourced moral judgments. Data scientist Elle O’Brien recently described how she built and cleaned a dataset of the moral dilemmas posted to r/AmItheAsshole, “a semi-structured online forum that’s the internet’s closest approximation of a judicial system.” For each of the 97,628 posts collected, the dataset includes the title, body, date, number of Reddit upvotes, and number of comments, plus the community’s verdict. [h/t u/thumbsdrivesmecrazy]