Data Is Plural

... is a weekly newsletter of useful/curious datasets.

2022.01.05 edition

Civil asset forfeiture, joint military exercises, medical drug names, foundation shades, and honey bees.

Civil asset forfeiture. “Most states and the federal government have laws allowing police and prosecutors to seize and permanently keep Americans’ cash, cars, homes and other property suspected of being involved in a crime — without regard to the owners’ guilt or innocence,” the nonprofit law firm Institute for Justice writes in its third edition of Policing for Profit, published in 2020. The report gathers and analyzes datasets on property seized in dozens of states through this practice of civil asset forfeiture, and on the spending of forfeiture funds. It also examines seizures recorded in the federal Consolidated Asset Tracking System; the Department of Justice updates detailed public extracts of that system quarterly. As seen in: “Cops still take more stuff from people than burglars do” (The Why Axis, 2021), and “Stop and Seize” (Washington Post, 2014).

Joint military exercises. Jordan Bernhardt’s Joint Military Exercises Dataset describes 5,000+ such operations undertaken between 1977 and 2016, drawn from historical news reports. The dataset lists each exercise’s name, location, when it began and ended, the countries that participated, activities involved, and more. Related: Brandon J. Kinne’s Defense Cooperation Agreement Dataset, a “comprehensive, human-coded dataset” covering bilateral defense treaties between 1980 and 2010.

Medical drug names. To build their International Drug Dictionary, Mohammad A. Khaleel et al. collected trade names and ingredient names “from open access websites belonging to official drug regulatory agencies, official healthcare systems, or recognized scientific bodies from 44 countries around the world,” among other sources. Each of the 450,000+ entries maps a name to standardized ingredient information from the National Library of Medicine’s RxNorm database.

Foundation shades. For “The Naked Truth,” a Pudding article published in 2021 with Ofunne Amaka, Amber Thomas scraped information about 6,800+ foundation shades from the websites of two major cosmetics retailers. The project’s datasets identify each product’s name, description, URL, and the predominant RGB/HSL color value in its swatch image.
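
For readers unfamiliar with the two color notations mentioned above, here is a minimal sketch of how an RGB value relates to its HSL equivalent, using Python's standard colorsys module. The function name hex_to_hsl and the "#RRGGBB" input format are assumptions made for illustration; they are not taken from the project's datasets or code.

```python
import colorsys


def hex_to_hsl(hex_color):
    """Convert a '#RRGGBB' string to (hue in degrees, saturation %, lightness %).

    Hypothetical helper for illustration; not part of the original project.
    """
    hex_color = hex_color.lstrip("#")
    # Split into red, green, and blue channels and scale each to the 0-1 range.
    r, g, b = (int(hex_color[i:i + 2], 16) / 255 for i in (0, 2, 4))
    # colorsys returns hue, lightness, saturation (HLS order), each in 0-1.
    h, l, s = colorsys.rgb_to_hls(r, g, b)
    return round(h * 360), round(s * 100), round(l * 100)


# Pure red is used here only because its HSL value is well known.
print(hex_to_hsl("#FF0000"))  # (0, 100, 50)
```

In HSL terms, the lightness component is the one that most directly captures how light or dark a swatch is, which may be why both notations are useful when comparing shades.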

Honey bees. Since the 1980s, the US Department of Agriculture has conducted an annual Bee and Honey Inquiry Survey, which generates estimates of “the number of colonies producing honey, yield per colony, honey production, average price, price by color class and value as well as honey stocks at the state and national levels.” Since 2016, it has also published annual reports that examine the gain and loss of colonies, including losses due to colony collapse disorder.