Data Is Plural

... is a weekly newsletter of useful/curious datasets.

2016.09.14 edition

Minimum wages, health habits, sea ice, state prison admissions, cartoon captions.

Minimum wages. Researchers at the Washington Center for Equitable Growth have compiled a dataset of current and historical minimum wages in America. The federal and state minimum-wage data stretches back to May 1974 (when the federal minimum was $2.00 per hour, roughly equivalent to $9.76 per hour in today’s dollars), while the data for cities and counties starts in January 2004. [h/t Ben Casselman]
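
For readers who want to put other historical wages on the same footing, here is a minimal sketch of the arithmetic behind that comparison. It assumes only the two figures quoted above ($2.00 nominal, $9.76 in today's dollars) and backs out the price ratio they imply. The function names are illustrative and are not part of the dataset, and a real analysis would use a published price index rather than a single implied ratio.

```python
def implied_price_ratio(nominal, adjusted):
    """Price-level ratio implied by a nominal wage and its inflation-adjusted value."""
    return adjusted / nominal


def to_todays_dollars(nominal, ratio):
    """Scale a nominal wage from the same period by the implied ratio."""
    return round(nominal * ratio, 2)


# Figures quoted in the item above: May 1974 federal minimum wage.
ratio = implied_price_ratio(2.00, 9.76)

print(f"Implied price ratio since May 1974: {ratio:.2f}")
print(f"$2.00 then is about ${to_todays_dollars(2.00, ratio):.2f} today")
```

The ratio only applies to wages from the same month as the reference figure. Wages from other years need their own index values.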

Health habits. The CDC calls its Behavioral Risk Factor Surveillance System “the largest continuously conducted health survey system in the world.” Every year, the survey asks more than 400,000 American adults about a range of health-related topics, from tobacco to seatbelt use, from alcohol consumption to arthritis, from HIV testing to immunizations. Annual datasets from 1984–2015 are currently available. [h/t Ricardo Pietrobon]

Sea ice. The National Snow and Ice Data Center, based at the University of Colorado, publishes the Sea Ice Index. The data files, which track ice coverage in the Arctic and Antarctic oceans, include daily and monthly measurements from November 1978 to the present. Lately, the extent of sea ice in the Arctic Ocean has been two or more standard deviations below its long-term average, according to the center, while Antarctic sea ice has remained at average levels. [h/t Dan Vergano]
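
As a rough illustration of what "two or more standard deviations below average" means, the sketch below measures how far a single reading sits from a baseline. Every number in it is made up for demonstration and is not taken from the Sea Ice Index. The function name and the use of a sample standard deviation are also assumptions, since the item above does not describe how the center computes its baseline.

```python
from statistics import mean, stdev


def std_devs_from_mean(value, baseline):
    """Number of sample standard deviations `value` lies from the baseline mean."""
    return (value - mean(baseline)) / stdev(baseline)


# Hypothetical extents for one calendar month across baseline years
# (illustrative values only, NOT real measurements).
baseline_extent = [7.0, 6.5, 7.2, 6.8, 7.5]
recent_extent = 6.0  # hypothetical recent reading

z = std_devs_from_mean(recent_extent, baseline_extent)
print(f"{z:.2f} standard deviations from the baseline mean")
```

A result below -2 would correspond to the situation described for the Arctic.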

State prison admissions, by county. Reporters at the New York Times have assembled a dataset counting the number of inmates each U.S. county sent to state prison in 2006, 2013, and 2014. The reporters derived the numbers from the Bureau of Justice Statistics’ National Corrections Reporting Program, which only certain researchers can access. Related: This small Indiana county sends more people to prison than San Francisco and Durham, N.C., combined. Why?

Captionless cartoons, captioned. A group of computer scientists and the New Yorker’s cartoon editor walk into a room… and write an academic article titled, “Humor in Collective Discourse: Unsupervised Funniness Detection in the New Yorker Cartoon Caption Contest.” The corresponding dataset — available via the “cartoons” link on this page — includes 50 cartoons and nearly 300,000 reader-submitted captions.