Data Is Plural

... is a weekly newsletter of useful/curious datasets.

2020.07.15 edition

1.75 million US COVID patients, more government responses to the coronavirus, COVID cases in migrant worker dorms, 43 million court case citations, and old British lighthouses.

1.75 million US COVID patients. The US Centers for Disease Control and Prevention has published a dataset containing demographic and medical information on 1.75 million deidentified COVID-19 patients. For each confirmed or probable case, the dataset reports the patient’s age group and race/ethnicity, the date of their initial symptoms, whether they were hospitalized, whether they had an “underlying morbidity or disease,” and more, although several of the fields contain high percentages of “unknown” values. The dataset is similar to the one The New York Times got from the CDC through a Freedom of Information Act lawsuit (see: “The Fullest Look Yet at the Racial Inequity of Coronavirus”), but it does not specify the patient’s county. [h/t Marc Bevand + Steven Mosher]

More government responses to the coronavirus. The CoronaNet Research Project aims “to collect as much information as we can about the various fine-grained actions governments are taking to defeat the coronavirus.” The project, which has drawn contributions from more than 400 researchers around the world, published its initial release a few weeks ago, and now details nearly 16,000 policy events in nearly 200 countries. Related: The nonprofit Hikma Health says it has compiled “the largest county-level COVID-19 policy dataset in the nation,” covering 1,200 US counties and more than 120 Native American communities. The dataset indicates the dates on which each jurisdiction undertook various responses, such as closing schools and restricting large gatherings. [h/t Alex Pashanov]

COVID cases in migrant worker dorms. The anonymous author of Squirrelling Data has been collating information from the Singapore Ministry of Health’s coronavirus press releases. Among the datasets: daily case counts in dozens of migrant worker dormitories, which have been hit hard. [h/t Joses Ho]

Case citations. The Caselaw Access Project (DIP 2018.11.07) has begun publishing a citation graph, a dataset listing the previous cases that each court decision cites. The latest release covers 43 million citations. The project also provides aggregated versions of the data, plus interactive graphics showing the frequency of citations to and from courts in each state.

Old British lighthouses. The Historical Light Aids to Navigation dataset “shows the development of historical lighthouses, lightships, harbour lights and beacons in England and Wales for several benchmark years between 1514-1911,” drawn from navigational charts, government publications, and other sources. For each of the 600+ entries, the dataset provides the light aid’s name, geocoordinates, and (when available) its visibility range, height, and number of lights.