Data Is Plural

... is a weekly newsletter of useful/curious datasets.

2016.12.14 edition

Drug costs, breached accounts, Earth’s surface water, crowdfunding, and energy use at 10 Downing Street.

Medicare drug costs. The federal government has released data on Medicare’s prescription drug spending from 2011 to 2015. Previously, Medicare had only published data on the most expensive drugs; the new release includes data on all drugs used by at least 11 Medicare patients in a given year. Caveat: Medicare “is prohibited from publicly disclosing drug-specific information on manufacturer rebates,” so the “spending metrics do not reflect any manufacturers’ rebates or other price concessions.” [h/t Charles Ornstein]

Breached accounts. Troy Hunt runs HaveIBeenPwned, a service that lets you see whether your email address has been included in any major data breaches. Last week, Hunt published an anonymized dataset based on the breaches he’s collected. (That post provides a torrent file for the dataset; you can also download the data here.) Unlike the HaveIBeenPwned website, the dataset doesn’t include information about specific accounts; instead, it counts the number of email addresses that have been compromised on particular combinations of websites. For example, 14.6 million email addresses appeared in both the LinkedIn and Dropbox breaches. (You can read more about each breach here.)
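To make the "combinations of websites" idea concrete, here is a minimal Python sketch. It assumes, hypothetically, that each row pairs one exact set of breached sites with the number of addresses found in exactly that set; the real file layout may differ, and the rows and the helper name addresses_in_all are invented for illustration.

```python
# Hypothetical rows: (exact combination of breached sites, number of addresses).
# These counts are made up; they are not taken from the real dataset.
rows = [
    (frozenset({"LinkedIn"}), 100),
    (frozenset({"Dropbox"}), 40),
    (frozenset({"LinkedIn", "Dropbox"}), 25),
    (frozenset({"LinkedIn", "Dropbox", "Adobe"}), 5),
]


def addresses_in_all(rows, sites):
    """Sum the counts of every combination that contains all of `sites`."""
    wanted = frozenset(sites)
    return sum(count for combo, count in rows if wanted <= combo)


both = addresses_in_all(rows, {"LinkedIn", "Dropbox"})
print(both)
```

Under that assumption, a figure like "appeared in both the LinkedIn and Dropbox breaches" is the sum over every combination that includes both sites, not only the row listing exactly those two.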

Water world. The European Commission and Google engineers have mapped surface water – including lakes, rivers, reservoirs, oceans, and more – on every 30-meter-by-30-meter square on Earth between 1984 and 2015. During that time, “permanent surface water has disappeared from an area of almost 90,000 square kilometres, roughly equivalent to that of Lake Superior, though new permanent bodies of surface water covering 184,000 square kilometres have formed elsewhere.” The data, based on the U.S. government’s Landsat satellite images, are available to download and explore online. Related: “Mapping Three Decades of Global Water Change,” published by The New York Times, based on this dataset.

Crowdfunding. A Lithuania-based web-scraping company has been collecting data on Kickstarter projects and Indiegogo campaigns every month. The datasets include (among other things) each project’s number of backers, amount pledged, and category. You can also explore the data online. [h/t Vincent Granville]

Energy use at 10 Downing Street. UK-based CarbonCulture helps organizations measure and publish their buildings’ energy and water use in near-realtime. Among the first users: 10 Downing Street, the Tate Modern, and University College London. For each building, you can download yearly datasets, which are broken down into 30-minute intervals. [h/t Max Roser]
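If you want daily figures rather than 30-minute intervals, a rollup like the one below may help. It is only a sketch: the (ISO timestamp, kWh) tuple format, the sample values, and the function name daily_totals are assumptions, not CarbonCulture's actual export format.

```python
from collections import defaultdict
from datetime import datetime


def daily_totals(readings):
    """Sum half-hourly (ISO timestamp, kWh) readings into per-day totals."""
    totals = defaultdict(float)
    for timestamp, kwh in readings:
        day = datetime.fromisoformat(timestamp).date()
        totals[day] += kwh
    return dict(totals)


# Made-up sample readings, for illustration only.
sample = [
    ("2016-12-01T00:00:00", 1.5),
    ("2016-12-01T00:30:00", 2.0),
    ("2016-12-02T00:00:00", 1.0),
]

for day, total in sorted(daily_totals(sample).items()):
    print(day, total)
```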