Data Is Plural

... is a weekly newsletter of useful/curious datasets.

2021.07.14 edition

Refugees resettled in the US, Arizona migrant deaths, Canadian candidates, NBER working papers, and lightning intensity.

Refugees resettled in the US. Axel Dreher et al. have published person-level data on 2.5+ million refugees who arrived in the US between 1975 and 2008. The anonymized records, obtained from the National Archives and originally collected by the Office of Refugee Resettlement, indicate each refugee’s country and date of birth, marital and family status, education level and English proficiency, date of US arrival, US city of resettlement, and more. The researchers also combined these records with public reports from the Bureau of Population, Refugees, and Migration (DIP 2015.11.25) to create a geocoded dataset of annual resettlements by citizenship and destination city from 1975 to 2018. [h/t Chris Parsons]

Arizona migrant deaths. The Arizona OpenGIS Initiative for Deceased Migrants is a collaboration between Pima County’s Office of the Medical Examiner and Humane Borders, a nonprofit that maintains water stations in the Sonoran Desert. “Although each organization has a distinct mission, both are committed to the common vision of raising awareness about migrant deaths and lessening the suffering of families by helping to provide closure through the identification of the deceased and the return of remains.” The initiative’s maps and dataset provide details on 3,700+ deaths since 1990, including the deceased’s name, sex, age, and cause of death; their body’s location and condition; and the date reported. [h/t Olaya Argüeso Pérez]

Canadian candidates. PhD candidate Semra Sevi recently compiled a dataset of 44,000+ candidates for Canadian federal office from 1867 to 2019 (and a similar dataset for Ontario provincial candidates). It lists each candidate’s name, gender, birth year, occupation, party, and incumbency status, plus the election’s date, riding, and outcome. And a new dataset from Anna Johnson et al. delves into the demographics of 4,516 Canadian federal candidates from 2008 to 2019, including their gender, age, race, Indigenous background, occupational category, and more. [h/t Marina Smailes + Erin Tolley]

NBER working papers. Since 1973, the National Bureau of Economic Research’s 1,500+ affiliated researchers have published 29,000+ (pre-peer-review) articles through the organization’s working papers series. The NBER provides structured information about each paper, using a template from the RePEc project. PhD student Ben Davies has converted those files into CSV tables and an R package listing each paper’s title, publication month, ID, and authors. [h/t Alex Albright]

Lightning intensity. The World Wide Lightning Location Network uses radio sensors to detect the location and power of 200+ million lightning strokes per year. Access to WWLLN’s raw, detailed data costs money, but earth scientists Jed O. Kaplan and Katie Hong-Kiu Lau have converted it into a few public-access gridded-globe timeseries of lightning activity from 2010 to 2020: daily and monthly strokes per km², and monthly stroke power. [h/t Robin Sloan]