Data Is Plural

... is a weekly newsletter of useful/curious datasets.

2020.01.29 edition

Time use surveys, dams, Spanish migration, darknet markets, and (more) vanity plate applications.

How we spend our time. Every year since 2003, the American Time Use Survey has measured how much time we spend sleeping, eating, and working; with friends, with family, and alone; and much more. Unlike many other time use surveys, the ATUS’s respondent-level datasets are freely available to the general public. Other time use surveys with downloadable data include the Russia Longitudinal Monitoring Survey and a Kosovo time use survey sponsored by the Millennium Challenge Corporation. Related: Through IPUMS, you can build custom data extracts of the ATUS, of historical US time survey data, and of the Multinational Time Use Study (registration required). As seen in: “How the American Work Day Changed in 15 Years” and “A Day in the Life: Women and Men,” two visualizations by Nathan Yau. [h/t Petrit Selimi]

Dams. The Global Georeferenced Database of Dams contains geographic data on more than 38,000 dams and their watersheds. The project, published by geographers at King’s College London, is based on a combination of satellite imagery, national registries, and other sources. At least one co-author has been working on the project since 2008. [h/t Jida Wang]

Three decades of Spanish migration. Spain’s national statistics institute publishes annual microdata on the relocation of residents within, into, and out of the country. The datasets indicate each relocator’s gender, age, birthplace, previous location, and destination, down to the municipality for locations within Spain. Currently, the records cover 1988 to 2018. Related: El Confidencial’s report on intra-provincial migration patterns, which uses this data, and its accompanying GitHub repository. [h/t Giuseppe Sollazzo]

Darknet market survival. The researcher who goes by “Gwern” maintains a dataset of more than 80 darknet marketplaces founded between 2011 and 2015. For each market (such as the infamous Silk Road), the dataset lists when it began, when it closed, why it closed, its URL, what cryptocurrencies it accepted, whether guns were allowed to be sold, and more.

California vanity plates. Through a public records request, Noah Veltman has obtained data on more than 23,000 personalized license plate applications flagged for review by the California DMV. For each flagged application, the dataset contains the applicant’s justification, comments from the state’s reviewers, and the proposal’s outcome. The reviewers rejected about 80% of the proposals, including those believed to be referencing racial slurs, swearing, drugs, sex, violence, and more. Previously: New York vanity plate applications (DIP 2015.10.21).