Data Is Plural

... is a weekly newsletter of useful/curious datasets.

2023.12.13 edition

NYC shelter counts, US solar facilities, state supreme court justices, antimicrobial peptides, and Finland transport.

NYC shelter counts. Every night, tens of thousands of people stay in New York City homeless shelters, which are managed by a half-dozen city agencies. Earlier this year, thanks to a 2022 law, the city began publishing monthly statistics on shelter population counts and average lengths of stay, broken down by agency, program, and family composition. One agency, the Department of Homeless Services, publishes some structured daily data, but shares its most comprehensive daily counts as a nightly-overwritten PDF. So Patrick Spauster and Adrian Nesta have built a data pipeline to scrape, archive, and standardize these datasets, and have partnered with City Limits to revamp the publication’s shelter population tracker. In October, more than 143,000 people slept in the city’s shelters, the highest monthly count on record.
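A nightly-overwritten file can only be tracked over time if someone saves each version before it is replaced. The following is a minimal sketch of that archiving step, not Spauster and Nesta's actual pipeline: the function name, the directory layout, and the date-stamped filenames are all assumptions made for illustration, and downloading the PDF is left out.

```python
import hashlib
from datetime import date
from pathlib import Path
from typing import Optional


def archive_snapshot(content: bytes, archive_dir: Path, day: date) -> Optional[Path]:
    """Save one day's copy of a file that its publisher overwrites in place.

    Hypothetical helper: returns the path written, or None when the content
    is byte-for-byte identical to the most recent archived copy (that is,
    the publisher has not replaced the file since the last run).
    """
    archive_dir.mkdir(parents=True, exist_ok=True)
    digest = hashlib.sha256(content).hexdigest()

    # ISO dates sort lexicographically, so the last filename is the newest snapshot.
    existing = sorted(archive_dir.glob("*.pdf"))
    if existing:
        latest_digest = hashlib.sha256(existing[-1].read_bytes()).hexdigest()
        if latest_digest == digest:
            return None

    target = archive_dir / f"{day.isoformat()}.pdf"
    target.write_bytes(content)
    return target
```

Run once a day with the freshly downloaded bytes, this keeps one dated file per distinct version, which a later step could parse and standardize.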

US solar facilities. The United States Large-Scale Solar Photovoltaic Database, released last month, maps and describes 3,600+ solar power facilities with capacities of at least 1 megawatt. The project, a collaboration between the Lawrence Berkeley National Laboratory and US Geological Survey, includes an online viewer, downloadable database, and API (with example usage). The data indicate each facility’s boundary and centroid coordinates, state, county, and site type, plus data extracted from EIA Form 860, such as capacity, axis type, tilt angle, and more. Related: A map and dataset of Illinois solar installations, large and small, compiled by volunteers at Chi Hack Night. Previously: The Global Energy Monitor’s Global Solar Power Tracker (DIP 2022.06.01). [h/t Kate Martin + Derek Eder]

State supreme court justices. David A. Hughes et al., building on prior work, have calculated ideology scores (on a conservative-liberal continuum) for 1,600+ state supreme court justices from 1970 to 2019. For each justice and year, the dataset provides the justice’s state, last name, political affiliation, the authors’ ideology score estimates, and ideology scores produced by two other research teams. Previously: Ideology estimates for state legislators (DIP 2020.01.01), members of Congress (DIP 2023.03.01), and local populations (DIP 2023.08.02).

Antimicrobial peptides. In a recent journal article, the University of Nebraska Medical Center’s Guangshun Wang discusses the history and future of the Antimicrobial Peptide Database, which his lab launched more than 20 years ago. The database now catalogs 3,500+ such molecules, which play an important role in the innate immune system. You can search the peptides by name, structural and functional properties, source organism, and other criteria, and can download their amino acid sequences.

Finland transport. The Finnish Transport and Communications Agency’s public datasets span a range of topics, including the radio spectrum, mobile networks, cybersecurity, registered vehicles, ships, aircraft, and more. It also publishes downloadable statistics unavailable from many other countries, such as driving exam results and passenger cars’ inspection failure rates by make, model, and year. [h/t Stagnant]
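To make "inspection failure rates by make, model, and year" concrete, here is a small sketch of the calculation behind such a table. The agency publishes these statistics already computed; the record layout, function name, and sample values below are invented for illustration and do not reflect the agency's file format.

```python
from collections import defaultdict


def failure_rates(rows):
    """Return {(make, model, year): share of inspections failed}.

    Each row is a hypothetical (make, model, year, passed) tuple
    describing a single inspection.
    """
    totals = defaultdict(lambda: [0, 0])  # key -> [inspected, failed]
    for make, model, year, passed in rows:
        counts = totals[(make, model, year)]
        counts[0] += 1
        if not passed:
            counts[1] += 1
    return {key: failed / inspected for key, (inspected, failed) in totals.items()}


# Invented sample records, not real inspection data.
sample = [
    ("MakeA", "ModelX", 2015, True),
    ("MakeA", "ModelX", 2015, False),
    ("MakeB", "ModelY", 2018, True),
]
rates = failure_rates(sample)
```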