Data Is Plural

... is a weekly newsletter of useful/curious datasets.

2017.01.11 edition

What Facebook knows, global warming, U.S. bombing missions, so many satellites, and Waldo.

What Facebook knows about us. In September, ProPublica published a Chrome extension that showed readers what Facebook said it knew about them — and then asked readers to share that data. In the following months, readers unearthed more than 52,000 of the “unique interest categories” that Facebook uses for advertising, such as “yoga,” “beer,” and “Scent of a Woman (1992 film).” But ProPublica’s reporters also found that Facebook doesn’t tell users about the “far more sensitive” data it buys about their offline lives, which can include “their income, the types of restaurants they frequent and even how many credit cards are in their wallets.” To support these findings, ProPublica published two key datasets: the crowdsourced “interest categories” and the list of categories that Facebook allows advertisers to target.

Getting warmer. Scientists expect that, when the final numbers come in, 2016 will have been Earth’s hottest year on record. The National Oceanic and Atmospheric Administration publishes monthly data on “temperature anomalies”: how much hotter or cooler a month was than the 20th-century average. (November 2016, the most recent month available, was 0.73° Celsius warmer than the average November.) You can grab the data for the entire globe, by hemisphere, or by continent; for the land and ocean combined, or separately; and going all the way back to 1880. Related: My colleague Peter Aldhous demonstrates how he charted this data using R. Also: NOAA released its 2016 U.S. “State of the Climate” report on Monday.
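
To make the anomaly definition concrete, here is a minimal Python sketch. It assumes the baseline is the mean of the same calendar month over 1901-2000. The function name `monthly_anomaly` is hypothetical, and the temperatures are made-up illustrative values, not NOAA data.

```python
from statistics import mean


def monthly_anomaly(temps_by_year, year, baseline=(1901, 2000)):
    """Return how much warmer (+) or cooler (-) `year` was than the baseline mean.

    temps_by_year maps year -> temperature (deg C) for one calendar month.
    The default baseline period is an assumption for this sketch.
    """
    start, end = baseline
    baseline_temps = [t for y, t in temps_by_year.items() if start <= y <= end]
    if not baseline_temps:
        raise ValueError("no observations inside the baseline period")
    return temps_by_year[year] - mean(baseline_temps)


# Made-up November temperatures (deg C), for illustration only.
november = {1901: 12.0, 1950: 12.0, 2000: 12.0, 2016: 12.5}
anomaly = monthly_anomaly(november, 2016)
print(f"{anomaly:+.2f} deg C")
```

A positive result means the month ran warmer than the baseline average; a negative one means cooler.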

Four wars’ bombing missions. Years ago, Lt. Col. Jenns Robertson began entering information into “a simple Excel spreadsheet that eventually matured into the largest compilation of releasable U.S. air operations data in existence.” Last month, the Department of Defense published a “beta” version of this data, known as Theater History of Operations Reports (THOR). Currently, THOR’s data covers bombing operations from World War I, World War II, the Korean War, and the Vietnam War. For each bombing, the reports include data about the aircraft, munitions, targets, results, and more.

So many satellites. CelesTrak’s T.S. Kelso has been obsessively transcribing NORAD’s “resident space object” data for decades. Among his offerings: the SATCAT satellite catalog, which provides data on all known satellites launched since 1957, more than 41,900 of ’em. Kelso also provides a SATCAT Boxscore, which is like a baseball box score … but for satellites. The U.S., it turns out, is responsible for almost exactly one-third of the 1,590 satellites classified as “active.” Previously: The Union of Concerned Scientists’ satellite database, featured Dec. 30, 2015. [h/t Noah Veltman]

Where Waldo is. In 2015, computer scientist Randy Olson tried computing “the optimal search strategy for finding Waldo” in the seven original Where’s Waldo? books. In doing so, he transcribed a 2013 Slate chart of Waldo’s locations (itself transcribed from those seven original books). The resulting dataset contains 68 rows — one for each Waldo — and four columns: book, page, x coordinate, and y coordinate.
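
The four-column layout described above maps naturally onto a small record type. Here is a minimal Python sketch, assuming a CSV with the lowercase headers `book`, `page`, `x`, and `y`; the actual header names and value types in Olson’s file may differ. The `WaldoLocation` class and `read_locations` function are hypothetical names, and the two sample rows are invented for illustration.

```python
import csv
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class WaldoLocation:
    book: int
    page: int
    x: float
    y: float


def read_locations(handle):
    """Parse rows with book, page, x, y columns into WaldoLocation records."""
    reader = csv.DictReader(handle)
    return [
        WaldoLocation(int(r["book"]), int(r["page"]), float(r["x"]), float(r["y"]))
        for r in reader
    ]


# Invented sample rows; header names are an assumption.
SAMPLE = "book,page,x,y\n1,1,2.5,4.0\n1,2,7.5,1.0\n"
locations = read_locations(io.StringIO(SAMPLE))

# Average horizontal position across the sample rows.
mean_x = sum(loc.x for loc in locations) / len(locations)
print(len(locations), mean_x)
```

With the real file, passing an open file handle instead of `io.StringIO(SAMPLE)` should work the same way, provided the headers match.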