Data Is Plural

... is a weekly newsletter of useful/curious datasets.

2017.01.18 edition

TrumpWorld, food stamp foods, offline prices versus online prices, German rail, and .gov domains.

TrumpWorld. At BuzzFeed News, a few colleagues and I spent the past two months compiling a big database of organizations and people connected to President-elect Trump, his family, advisers, and Cabinet picks. On Sunday, we published what we’ve found so far — connections between more than 1,500 organizations and people altogether. Still, there are certainly things we’ve missed. So you can download and search the data, but you can also help us expand it. See something we’ve overlooked? Let us know!

Food stamp foods. Late last year, the USDA published a study that used “point-of-sale transaction data from a leading grocery retailer to examine the food choices” of households receiving Supplemental Nutrition Assistance Program (SNAP) benefits. In an appendix, the report ranks the total spending on major commodities by SNAP households and non-SNAP households. Soft drinks, “fluid milk products,” and ground beef were the top three commodities purchased by SNAP households. Milk, soft drinks, and cheese were the top three for non-SNAP households. That information is presented as a PDF table, but I’ve converted it to a spreadsheet-friendly text file for you. [h/t Reddit user “junglejuicy”]

Online and offline prices. Between December 2014 and March 2016, Alberto Cavallo — co-founder of MIT’s Billion Prices Project — sent 323 crowdsourced workers to collect product prices from 56 large retailers in 10 countries. Then, he found the prices for the same products on the retailers’ websites. The results, which contain tens of thousands of observations, are available as several Excel spreadsheets. (Caveat: The dataset’s “Terms of Use” rules stipulate that the information is “EXCLUSIVELY FOR USE IN ACADEMIC RESEARCH AND PUBLICATIONS”.) Related: Cavallo summarized his findings in a paper published recently by the American Economic Review.

German rail. State-owned Deutsche Bahn AG is Europe’s largest railway company by revenue, serving 12 million train and bus passengers each day. It also happens to publish a bunch of open data, including datasets on its routes, stations, platforms, and cargo facilities. [h/t Martin Bergmann]

Dot-gov domains. The General Services Administration recently updated its list of known .gov domains. It currently includes more than 1,300 federal domains, plus more than 4,300 domains registered by state, local, and native sovereign agencies.
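
If you want to reproduce a federal versus non-federal tally like the one above, here is a minimal Python sketch. It assumes the list is a CSV with one row per domain and a column naming the domain's type. The column name `domain_type`, the type labels, and the sample rows are all hypothetical placeholders, not the actual layout of the GSA file, so adjust them to whatever the real headers turn out to be.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows. The real file's column names and type labels
# may differ; these exist only so the sketch runs on its own.
SAMPLE_CSV = """domain_name,domain_type,agency
EXAMPLE-FEDERAL.GOV,Federal Agency,Example Agency
EXAMPLE-CITY.GOV,City,Non-Federal Agency
EXAMPLE-COUNTY.GOV,County,Non-Federal Agency
EXAMPLE-TRIBE.GOV,Native Sovereign Nation,Non-Federal Agency
"""


def count_by_type(csv_text, type_column="domain_type"):
    """Count rows per value of type_column (an assumed column name)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[type_column].strip() for row in reader)


counts = count_by_type(SAMPLE_CSV)

# Treat any type label starting with "federal" as federal; everything else
# (state, local, native sovereign) is lumped together as non-federal.
federal = sum(n for kind, n in counts.items() if kind.lower().startswith("federal"))
non_federal = sum(counts.values()) - federal

print(f"federal: {federal}, non-federal: {non_federal}")
```

To run it against a downloaded copy, read the file's text and pass it to `count_by_type` in place of `SAMPLE_CSV`.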