Data Is Plural

... is a weekly newsletter of useful/curious datasets.

2021.12.22 edition

Tobacco habits, working hours, COVID-19 in European prisons, California water wells, and toothbrushing.

Tobacco habits. Every few years since 1992, the National Cancer Institute has sponsored the Tobacco Use Supplement to the Current Population Survey, administered by the US Census Bureau. In addition to extensive demographic information, the supplement asks about historical tobacco usage (“Have you smoked at least 100 cigarettes in your entire life?”), preferences (“Do you usually smoke menthol or non-menthol cigarettes?”), purchasing habits, e-cigarettes, and much more. Anonymized responses and documentation are available for all survey waves through 2018–19. Previously: The CDC’s Behavioral Risk Factor Surveillance System (DIP 2016.09.14). [h/t Christian Gunadi et al. + Kevin Lewis]

Working hours. Political scientist Magnus Bergli Rasmussen has compiled data on the regulation of laborers’ total work hours in nearly every country since 1789, available as a Stata file. For each year and territory, the dataset indicates whether such laws existed, the “normal” number of contractually obligated weekly hours, the maximum number of hours allowed, and increases in pay for overtime. Read more: “The Great Standardization: Working Hours Around the World,” in which Rasmussen describes the dataset’s construction.

COVID-19 in European prisons. A collaboration coordinated by Deutsche Welle and the European Data Journalism Network has gathered data on the pandemic’s impact on prisoners and prison staff in dozens of European countries, including the number of COVID-19 tests, cases, and deaths over time. The data also note the types of preventative measures and vaccine policies in place. Previously: US prison COVID-19 data from the Marshall Project and AP (DIP 2020.05.06) and the New York Times (DIP 2021.04.21). [h/t Lorenzo Ferrari]

California water wells. Domestic wells in the San Joaquin Valley “are drying up at an alarming pace” amid “a frenzy of new well construction and heavy agricultural pumping,” according to a Los Angeles Times article last week. Data reporter Gabrielle LaMarr LeMee’s analysis provides the quantitative backbone, drawing on three state datasets: well completion reports and periodic groundwater level measurements, both of which go back more than a century, and household water supply shortage reports since 2013.

Toothbrushing. Zawar Hussain et al. recorded data from 120 electric and manual toothbrushing sessions, using sensors attached to the brush handle and the brusher’s wrist. Each session’s data files trace the sensors’ positions over time and indicate the brush type; the participant’s gender, age, and handedness; and more.