Data Is Plural

... is a weekly newsletter of useful/curious datasets.

2016.04.13 edition

Rainfall, legal opinions, inequality in health, inequality in Hollywood, and registered aircraft.

Global rainfall. To create the most detailed measurements of global rainfall ever, researchers at UC Santa Barbara’s Climate Hazards Group harmonize data from satellites and on-the-ground weather stations. The dataset, known as CHIRPS, stretches back more than 30 years and is freely available. Related: Eric Holthaus provides more details and explains why the dataset is so important. [h/t Dave Riordan]

Order in the courts. CourtListener gathers and publishes bulk data from the Supreme Court, all federal appeals courts, and hundreds of other jurisdictions. The files include opinions, audio from oral arguments, dockets, and citations. It also has an API. (If you register, you can also create and explore networks of citation-linked cases.) [h/t Jeff Grove]

Health and wealth. The Health Inequality Project calculates American life expectancies by income, gender, and geography. You can download the data at the national, state, county, and “commuting zone” levels. Where do poor Americans live the longest? New York City, Santa Barbara, and San Jose. [h/t Margot Sanger-Katz]

He said, she said (less). Over the weekend, Hannah Anderson and Matt Daniels published an interactive analysis of male and female speaking roles in 2,000 movie scripts. Among their findings: 308 scripts gave 90%+ of the film’s dialogue to men, while just 8 scripts did so for women. The duo has also released “as much data as we can share (without getting sued)” on GitHub.

Plane papers. The Federal Aviation Administration maintains a database of all non-military aircraft registrations, which includes extensive details about each plane/helicopter/glider/blimp and its owner. Related: “Spies In The Skies.” [h/t Peter Aldhous]