Data Is Plural

... is a weekly newsletter of useful/curious datasets.

2023.05.31 edition

Correctional control, local business, crowd accidents, Antarctic geology, and NYC sidewalk scaffolding.

Correctional control. The Prison Policy Initiative’s Punishment Beyond Prisons report cross-references data from the Bureau of Justice Statistics and several other sources to count or estimate the number of people under eight forms of “correctional control” in each state and DC: federal prisons, state prisons, local jails, Indian Country jails, youth confinement, involuntary commitment, parole, and probation — approximately 5.5 million people in total. Related: The Census of Juveniles in Residential Placement, one of the report’s sources, provides a tool to tabulate youth confinement counts by sex, age, race, status, offense, and facility characteristics. [h/t Mike Wessler]

Local business. Despite the name, the Census Bureau’s County Business Patterns datasets cover a range of geographic units, including states, congressional districts, metro areas, counties, and ZIP codes. Generated from the Bureau’s confidential Business Register, they provide the number of establishments and (noise-infused) employee counts and payroll figures, disaggregated by industry code. Last month, the Bureau released the data for 2021. Historical availability varies: the Bureau directly provides county data back to 1986 and ZIP code data back to 1994, for example. Fabian Eckert and colleagues, meanwhile, have converted two older archives of the records into comparable data, spanning 1946 to 1974 and 1975 onward.

Crowd accidents. Claudio Feliciani et al. have compiled a dataset of 281 crowd accidents from 1900 to 2019, based on “a comprehensive investigation of the press and media reports.” The researchers focus on accidents with at least one fatality or ten injuries “caused by a collective crowd motion which could have been potentially prevented by employing a different design or through a proper crowd management.” The dataset lists each accident’s date, country, coordinates, gathering type (sport, religious, political, etc.), fatality and injury counts, crowd size, and sources. The deadliest accident included, by far, is the 2015 Mina stampede. [h/t Neil Martin]
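
For readers who want to apply the same severity threshold to their own records, here is a minimal sketch of the "at least one fatality or ten injuries" rule. The `Accident` class, its field names, and the three sample records are hypothetical and made up for illustration; they are not taken from the dataset, whose actual column names may differ. The threshold is also only part of the researchers' criterion, which additionally requires that the accident was caused by collective crowd motion.

```python
from dataclasses import dataclass


@dataclass
class Accident:
    # Hypothetical record layout; the real dataset's columns may differ.
    label: str
    fatalities: int
    injuries: int


def meets_threshold(accident: Accident) -> bool:
    """Return True if the record has at least one fatality or ten injuries."""
    return accident.fatalities >= 1 or accident.injuries >= 10


# Made-up records for illustration only (not real events).
sample = [
    Accident("hypothetical A", fatalities=0, injuries=3),
    Accident("hypothetical B", fatalities=0, injuries=12),
    Accident("hypothetical C", fatalities=2, injuries=0),
]

included = [a.label for a in sample if meets_threshold(a)]
print(included)
```

Only the second and third made-up records pass: one has ten or more injuries, the other has at least one fatality.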

Antarctic geology. Antarctica’s surface is mostly ice, but there’s also lots of rock. In a recent paper, the Scientific Committee on Antarctic Research’s GeoMAP team describes building the “first detailed geological map dataset covering all of Antarctica,” assembled and refined from 589 sources. The dataset, which you can download or explore as an interactive map, uses 99,000+ polygons to describe the rock-scape, associating each unit with a name (or group name), rock type, lithology, geologic period, and more.

NYC sidewalk scaffolding. New York City’s Department of Buildings publishes a map and dataset of active permits for “sidewalk sheds,” the ubiquitous, temporary structures (often colloquially called scaffolding) meant to shield pedestrians from falling debris. Each of the ~9,000 entries indicates the permit address, date issued, material, linear feet, age, and more. The oldest shed has had a permit for more than 17 years. Read more: An introduction to the sheds and several visualizations of the data, by BetaNYC’s Zhi Keng He. [h/t Giuseppe Sollazzo]