Data Is Plural

... is a weekly newsletter of useful/curious datasets.

2021.07.07 edition

Water politics, government regulations, rural hospital closures, vertebrates’ viruses, and Andean roadkill.

Water politics. Oregon State University’s International Water Events Database summarizes 7,000+ episodes from 1948 to 2008 that “concern water as a scarce or consumable resource or as a quantity to be managed,” categorizing their intensities and indicating the basins and countries involved. The Pacific Institute’s Water Conflict Chronology documents 926 events (including a few legends) between 3000 BC and 2019 that involved violence or the threat of it. Bernauer et al.’s Water-Related Intrastate Conflict and Cooperation dataset details 10,000+ events in the Middle East, Mediterranean, and Sahel between 1997 and 2009. Related: Takeshi Wada’s “Geographic Distribution of Water Conflicts Worldwide: A Comparative Analysis of Four Databases” (pdf) discusses these resources plus GDELT, a broader-scope event database.

Government regulations. QuantGov, an open-source project of the free-market-oriented Mercatus Center, “solves the problem of quantifying large amounts of policy text for research and comprehension by using machine learning and natural language processing.” Its RegData initiative applies this approach to government regulations over time — quantifying their length and linguistic complexity and trying to identify their relevant NAICS-classified industries. Its datasets examine rules enacted by the federal government and most US states, as well as federal and subnational regulations from Australia, Canada, and India. [h/t Aaron Staples et al.]

Rural hospital closures. The North Carolina Rural Health Research Program maintains a list of rural hospital closures in the US since 2005. The dataset contains 181 entries through June 2021, each representing a complete closure or a conversion from inpatient care to other services, and indicates the hospital, number of beds, Medicare payment program, month of closure, and the location’s Rural-Urban Commuting Area classification. [h/t Betsy Ladyzhets]

Vertebrates’ viruses. VIRION is “an open atlas of the vertebrate virome” that launched in May. It represents the associations between 9,000+ viruses and 3,700+ host species, drawing from a range of sources, including USAID’s PREDICT project and the GloBI project, which organizes data about inter-species interactions. [h/t Timothée Poisot]

Andean roadkill. For their recent paper, “Geography of roadkills within the Tropical Andes Biodiversity Hotspot,” ecologists Pablo Medrano-Vizcaíno and Santiago Espinosa surveyed three 33-kilometer road segments dozens of times in 2014. Their publication datasets provide details about the 445 dead vertebrates they found, including one new-to-science species of snake. [h/t Christian Miles + The Economist]