Data Is Plural

... is a weekly newsletter of useful/curious datasets.

2021.04.07 edition

Post offices, Han Chinese names, working hours, London’s COVID restrictions, and 1½ centuries of Great Lakes fishing.

Post offices. Historian Cameron Blevins has released a dataset of 166,000+ post offices operating in the US between 1639 and 2000. It includes their years of service and precise/approximate geocoordinates, “making it one of the most fine-grained and expansive datasets currently available for studying the historical geography of the United States.” The project builds on research by the late Richard W. Helbock and provides the data foundation for Blevins’s new book, Paper Trails: The US Post and the Making of the American West, and its companion website. Read more: Blevins’s introductory Twitter thread. [h/t Eric Gardner + Alex Albright]

Han Chinese names. The Chinese Name Database, published by social psychology grad student Han-Wu-Shuang (Bruce) Bao (包寒吴霜), “contains nationwide frequency statistics of 1,806 Chinese surnames and 2,614 Chinese characters used in given names, covering about 1.2 billion Han Chinese population (96.8% of the Han Chinese household-registered population born from 1930 to 2008 and still alive in 2008).” The statistics, obtained from China’s National Citizen Identity Information Center, also record the frequencies of given-name characters for six age cohorts, based on decade of birth. Read more: “What can we tell from the evolution of Han Chinese names?” — an explainer and analysis by Isabella Chua in Kontinentalist. [h/t Nathan Yau]

Working hours. For an article initially published in 2013 and since updated, Our World in Data has examined historical trends in the number of hours that people work. The data sources vary in geographic and temporal scope, with most spanning decades; they include Huberman and Minns (PDF, see Table 3), the Penn World Table, the Total Economy Database (registration required), the OECD, and more.

London’s COVID restrictions. London is providing an “experimental dataset” categorizing the various coronavirus-related restrictions that have affected the UK capital since March 2020. It lists the dates of 22 policy changes, including three separate lockdowns; whether schools, pubs, restaurants, and/or shops were closed; whether household mixing was banned; and more. [h/t Olivier Lejeune]

Fishing the Great Lakes. The Great Lakes Fishery Commission, whose founding was spurred by an invasion of sea lampreys, publishes several datasets relevant to the famous North American basin. Among them: “Commercial Fish Production In The Great Lakes 1867-2015,” which tallies the number of pounds caught each year by lake, jurisdiction, and species. [h/t Forest Gregg]