Data Is Plural

... is a weekly newsletter of useful/curious datasets.

2019.03.13 edition

Workplace discrimination, wildfire costs, Dem endorsements, therapists, and zoo animal lifespans.

Employment discrimination cases. “Thousands of people report workplace discrimination to the government each year. Employers are rarely held accountable,” according to an investigation by the Center for Public Integrity. Reporters Maryam Jameel and Joe Yerardi “analyzed eight years of complaint data — through fiscal 2017 — from the [U.S. Equal Employment Opportunity Commission] as well as its state and local counterparts, reviewed hundreds of court cases and interviewed dozens of people who filed complaints.” The data (on more than 3.7 million allegations and their outcomes) and code are available online. Related: A visual exploration of the data. Previously: Two decades of workplace sexual harassment complaints (DIP 2017.12.06). [h/t Reddit user “cavedave” + Giuseppe Sollazzo]

U.S. wildfire costs. Stanford University’s Big Local News project has compiled data from 100,000+ daily situation reports (known as “SIT-209”s) filed by federal firefighting authorities, detailing their efforts to suppress large wildfires. The dataset covers 2014 to 2017 and includes 240+ variables from each report, such as estimated costs, damaged/destroyed buildings, injuries, and fatalities. Related: Eric Sagara’s quick introduction to the dataset.

Democratic endorsements. FiveThirtyEight is tracking who’s endorsing whom to be the Democrats’ 2020 presidential nominee. The site has published a methodology describing its approach, plus the underlying data, which includes each endorser’s name, state, relevant position, and other details. (According to the site’s formula, Sens. Cory Booker and Kamala Harris are currently leading, although their leads rest almost entirely on home-state endorsements.)

50,000 therapists. The magazine Psychology Today hosts paid listings for therapists, who advertise their services to prospective patients. Andrew Thompson has created a dataset of the 50,000+ U.S. listings (as of October 2018), with each therapist’s name, city, specialties, and subject areas.

Zoo animal lifespans. Researchers based at Chicago’s Lincoln Park Zoo have published “life expectancy estimates for hundreds of vertebrate species based on carefully vetted studbook data from North American zoos and aquariums.” Their dataset includes “sex-specific median life expectancies as well as sample size and 95% confidence limits for each estimate.”