Data Is Plural

... is a weekly newsletter of useful/curious datasets.

2021.09.22 edition

K-12 learning arrangements, rushing waters, voluntary union recognition, four decades of Spanish elections, and cryptic crossword clues.

K-12 learning arrangements. The COVID-19 School Data Hub, which launched last week, is “a central database for educators, researchers, and policymakers to understand how the COVID-19 pandemic shaped students’ modes of learning in 2020-21.” The project’s team, led by economist Emily Oster, has gathered data on learning models (in-person, virtual, or hybrid) used by public schools and districts at various points in time, their masking policies, and reported COVID-19 cases. The datasets can be downloaded in bulk or by state. The coverage and granularity vary by topic and state; the project’s documentation describes the collection methods and availability.

Rushing waters. The US Geological Survey’s National Water Information System provides data on the “occurrence, quantity, quality, distribution, and movement of surface and underground waters” around the country. The surface water measurements — mainly streamflow and gage height — come from tens of thousands of monitoring sites. (Here’s a site near Baton Rouge before and after hurricanes Ida and Nicholas.) There’s an API for accessing the records, including daily summaries and real-time measurements. Previously: NOAA’s water-level data (DIP 2016.03.23). [h/t Michael Allen]
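For readers who want to try the API mentioned above, here is a minimal Python sketch. It assumes the USGS Water Services endpoints at waterservices.usgs.gov ("iv" for real-time values, "dv" for daily summaries) and their JSON response layout; the helper names and the site number in the usage note are illustrative choices, not taken from the newsletter or from the site it links to.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed base URL for the USGS Water Services API.
BASE_URL = "https://waterservices.usgs.gov/nwis"


def build_url(service, site, parameter_codes, period="P7D"):
    """Build a request URL.

    service: "iv" (real-time values) or "dv" (daily summaries).
    parameter_codes: USGS codes, e.g. "00060" (streamflow, cubic feet
    per second) and "00065" (gage height, feet).
    period: ISO 8601 duration counted back from now, e.g. "P7D".
    """
    query = urlencode(
        {
            "format": "json",
            "sites": site,
            "parameterCd": ",".join(parameter_codes),
            "period": period,
        },
        safe=",",  # keep the comma-separated code list readable
    )
    return f"{BASE_URL}/{service}/?{query}"


def fetch(url):
    """Download and decode one JSON response (requires network access)."""
    with urlopen(url, timeout=30) as response:
        return json.load(response)


def extract_readings(payload):
    """Flatten a response into (variable name, timestamp, value) tuples.

    Assumes the layout payload["value"]["timeSeries"][i]["values"][j]["value"].
    """
    rows = []
    for series in payload["value"]["timeSeries"]:
        name = series["variable"]["variableName"]
        for block in series["values"]:
            for point in block["value"]:
                rows.append((name, point["dateTime"], float(point["value"])))
    return rows
```

To pull a week of daily summaries, one might call extract_readings(fetch(build_url("dv", "07374000", ["00060", "00065"]))), where 07374000 is used here as a placeholder site number for a Mississippi River gauge at Baton Rouge. Splitting URL construction from parsing keeps both steps easy to check without a network connection.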

Voluntary union recognition. Civic technologist Forest Gregg has begun filing FOIA requests to the National Labor Relations Board to collect newly available data on employers’ voluntary recognition of employee unions, drawn from the agency’s relevant notification form. The records so far include 70+ recognitions in late 2020 and early 2021, plus nearly 1,000 from a prior reporting program between 2007 and 2009; they list the employer, union, case number, relevant dates, and more. Previously: Union election results (DIP 2021.05.05).

Four decades of Spanish elections. The Spanish Electoral Archive, published this summer, provides detailed results of all municipal, regional, general, and European Parliament elections in Spain since the country’s transition to democracy in the late 1970s. The project’s datasets standardize records from various official sources that, in many cases, drill down to the level of individual ballot boxes.

Hidden clue here will be! Data scientist George Ho has compiled a dataset of 589,000+ clues to cryptic crosswords, “collected from various blogs and publicly available digital archives.” The collection, released earlier this month, is available to download and to explore online. (For example.) Its “datasheet” describes the motivation, collection process, composition, and more.