Data Is Plural

... is a weekly newsletter of useful/curious datasets.

2016.04.20 edition

IP addresses, consumer spending, cricket, clouds, and distilleries.

Where computers (maybe) are. An under-scrutinized quirk in a little-known, widely used database “turned a random Kansas farm into a digital hell.” How? The database contains best-guess geographic coordinates for every IP address on the internet. But for millions of IP addresses, the best guess is just “somewhere in the United States.” And, until recently, the database translated that vague location into the latitude and longitude of a farm in Potwin, Kansas. (Now it points to a lake.)
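For anyone working with geolocated IP data, here is a minimal Python sketch of how one might separate precise lookups from ones that merely landed on a country-level default point. Everything in it is an assumption for illustration: the record layout (ip, latitude, longitude), the function names, and the placeholder default coordinate are hypothetical, not taken from the database itself.

```python
from collections import Counter

# Hypothetical placeholder for a "somewhere in the country" default point.
# Replace it with the default coordinates documented by your data provider.
DEFAULT_COORDS = {(38.0, -97.0)}


def flag_default_locations(records, defaults=DEFAULT_COORDS, precision=4):
    """Split (ip, lat, lon) records into (precise, vague) lists.

    A record counts as vague when its coordinates, rounded to `precision`
    decimal places, match one of the known default points.
    """
    rounded_defaults = {
        (round(lat, precision), round(lon, precision)) for lat, lon in defaults
    }
    precise, vague = [], []
    for ip, lat, lon in records:
        key = (round(lat, precision), round(lon, precision))
        if key in rounded_defaults:
            vague.append((ip, lat, lon))
        else:
            precise.append((ip, lat, lon))
    return precise, vague


def most_common_coords(records, n=3):
    """Return the n most frequent coordinate pairs with their counts.

    A single point shared by a large number of addresses is a hint that
    it may be a default rather than a real location.
    """
    counts = Counter((lat, lon) for _, lat, lon in records)
    return counts.most_common(n)


# Made-up sample records using documentation-reserved IP ranges.
sample = [
    ("192.0.2.1", 38.0, -97.0),
    ("192.0.2.2", 40.7128, -74.0060),
    ("198.51.100.7", 38.0, -97.0),
    ("203.0.113.9", 34.0522, -118.2437),
]
precise, vague = flag_default_locations(sample)
```

Treating the vague records as "location unknown," rather than plotting them on a map, avoids pinning millions of addresses to one spot.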

The American consumer. Last week, the Bureau of Labor Statistics published its midyear update to the Consumer Expenditure Survey. The survey collects data on the spending, income, and a handful of other characteristics of U.S. consumers. One tidbit: On average, Americans are spending approximately 33% of their income on housing, and a tad less than 1% on alcohol. [h/t Nathan Yau]

Cricket. Baseball season is in full swing, basketball and hockey playoffs have begun, and the NFL draft is nigh. No better time to highlight some cricket data! One project has gathered ball-by-ball data on more than 2,700 matches played since the mid-2000s. Looking for historical data? A new GitHub repository contains stats for more than 40,000 matches going back to 1773 (but mostly since the 1970s), scraped from ESPN Cricinfo. Related: How, statistically, the coin toss affects who’ll win. [h/t Derek Willis]

Where clouds congregate, and when. Researchers have analyzed 15 years of satellite imagery to create a nearly global dataset of seasonal cloud coverage. The data, available at a resolution of one square kilometer, could help scientists monitor and predict changes in ecosystems. [h/t Grant Smith + Joanna Klein]

License to distill. The U.S. Alcohol and Tobacco Tax and Trade Bureau publishes a few permit datasets, including this table of 1,900+ businesses licensed to produce and/or bottle liquor. [h/t Maggie Lee]