Data Is Plural

... is a weekly newsletter of useful/curious datasets.

2023.03.29 edition

Local public meetings, health workers, Kremlin posts, carbon capture projects, and Milan drinking fountains.

Local public meetings. LocalView, developed by Soubhik Barari and Tyler Simko, “is the largest dataset of local government public meetings — the central policy-making process in American local government — as they are captured on video.” In a recent paper, the authors describe how they built the dataset, which is based on 130,000+ YouTube-hosted videos of such meetings in 1,000+ US cities and counties, covering the years 2006 to 2022. The dataset lists each meeting’s date, jurisdiction, and government body (e.g., municipal council or school board), plus the video’s ID, title, channel, transcript, and more. [h/t Chris Goodman]

Health workers. The World Health Organization’s Global Health Workforce Statistics database presents annual, national estimates of the number of medical doctors, nursing and midwifery professionals, community health workers, and several other types of health personnel. The estimates come from the WHO’s National Health Workforce Accounts system, national censuses, labor force surveys, and other sources. For medical doctors, the estimates span nearly 200 countries, with the majority having estimates as recent as 2020 or 2021. Twenty countries’ estimates go back to the 1960s (and, for Spain, all the way back to 1952). [h/t Datasketch]

Kremlin posts. Giorgio Comai’s “text as data & data in the text” project “aims at facilitating structured analysis of on-line contents related to conflicts in the post-Soviet space by providing easier access to relevant datasets and tools.” Those datasets include the URL, title, text, date, and other metadata of all posts published on the Kremlin’s English-language website since the year 2000; on the Kremlin’s Russian-language website; and by Zavtra since late 1996. Previously: Foreign ministry statements (DIP 2022.03.09). [h/t EDJNet]

Carbon capture projects. The National Energy Technology Laboratory maintains a map and dataset of carbon capture and storage projects “active, proposed, and terminated” in 30+ countries since the 1970s. The latest version of the dataset includes 400+ entries, slightly more than are on the map. It lists each project’s name, company, location, date, type, scope, magnitude, status, technology, cost, summary, and other details. As seen in: “What is carbon capture and storage? Where is it happening in the US?” (USAFacts).

Milan drinking fountains. Milan’s government publishes a dataset of 600+ local vedovelle, the distinctive (green, cast iron, dragon-headed) drinking fountains that dot the city. The dataset provides each fountain’s coordinates, municipal zone, and neighborhood. As seen in: “Tutto sulle fontanelle di Milano,” with a map of the locations, by Il Post’s Isaia Invernizzi.