Data Is Plural

... is a weekly newsletter of useful/curious datasets.

2016.02.17 edition

Risky behavior, emotion-tinged words, land cover, chess games, and dead insects.

The kids are alright. Every two years since 1991, the CDC has conducted the Youth Risk Behavior Survey, which asks high school students questions about drug use, sex, eating habits, and more. The results are available at the national, state, and district level. Results from the 2015 survey will be published in June, the CDC says. Related: Today’s teens _______ less than you did.

Word-emotion associations. Computational linguists at Canada’s National Research Council used Mechanical Turk to crowdsource the emotional associations of 14,182 words. For each word, participants were asked whether it was “positive” and/or “negative”, and whether it was associated with any of eight emotions: anger, anticipation, disgust, fear, joy, sadness, surprise, and trust. The resulting Word-Emotion Association Lexicon was first published in 2010. Of the full lexicon, only two words — “treat” and “feeling” — were associated with all eight emotions. [h/t Bipul Mohanto]
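
If you want to check the "all eight emotions" claim yourself, here is a minimal sketch. It assumes the lexicon is distributed as tab-separated rows of word, category, and a 0/1 flag; verify that layout against the file you actually download. The function names are hypothetical, and the sample rows are illustrative only ("placeholder" is a made-up entry, not a real lexicon row).

```python
# The eight emotions named in the lexicon description.
EMOTIONS = ("anger", "anticipation", "disgust", "fear",
            "joy", "sadness", "surprise", "trust")

def parse_rows(lines):
    """Map each word to the set of categories flagged 1.

    Assumes each line is 'word<TAB>category<TAB>0-or-1'.
    """
    assoc = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        word, category, flag = line.split("\t")
        assoc.setdefault(word, set())
        if flag == "1":
            assoc[word].add(category)
    return assoc

def words_with_all_emotions(assoc):
    """Return the words associated with every one of the eight emotions."""
    return sorted(w for w, cats in assoc.items() if set(EMOTIONS) <= cats)

# Illustrative rows only: "treat" gets all eight, as the newsletter reports;
# "placeholder" is an invented entry with a single association.
sample_rows = [f"treat\t{e}\t1" for e in EMOTIONS] + [
    "placeholder\tjoy\t1",
    "placeholder\tfear\t0",
]

result = words_with_all_emotions(parse_rows(sample_rows))
```

Run against the full file, the same function should return the two words mentioned above, provided the row layout matches the assumption.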

The United States of Land. In 2011, agriculture occupied about 22% of all land in the contiguous U.S., according to the National Land Cover Database. The NLCD classifies every 30-meter-by-30-meter chunk of land into one of 16 categories, including “woody wetlands,” “cultivated crops,” and several intensities of “developed” land. (Alaska’s unique landscape has earned it a few additional categories, such as “dwarf scrub.”) The database is presented as raster files, so you’ll need some geospatial software to dig in. [h/t Ryan McNeill]
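
The arithmetic behind a figure like "22% of all land" is just cell counting: each raster cell covers 30 m x 30 m, or 900 square meters. The sketch below uses a tiny invented grid instead of a real raster. The class codes (81 for pasture/hay, 82 for cultivated crops, and so on) are assumptions to check against the NLCD legend, and the function names are hypothetical.

```python
from collections import Counter

CELL_AREA_M2 = 30 * 30  # each cell is 30 m by 30 m

# Invented 3x4 grid of land-cover codes, for illustration only.
grid = [
    [82, 82, 41, 11],
    [82, 81, 41, 41],
    [21, 81, 90, 41],
]

# Assumed codes for agricultural classes; confirm against the NLCD legend.
AGRICULTURE = {81, 82}

def class_shares(grid):
    """Return each code's share of all cells in the grid."""
    counts = Counter(code for row in grid for code in row)
    total = sum(counts.values())
    return {code: n / total for code, n in counts.items()}

def area_m2(grid, codes):
    """Return the total area, in square meters, of cells whose code is in codes."""
    n = sum(1 for row in grid for code in row if code in codes)
    return n * CELL_AREA_M2

shares = class_shares(grid)
ag_share = sum(shares.get(code, 0.0) for code in AGRICULTURE)
ag_area = area_m2(grid, AGRICULTURE)
```

A real analysis would read the raster with geospatial tooling and use only land cells as the denominator, but the counting step is the same.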

Hundreds of thousands of chess games. Portable Game Notation, a file format used to describe chess matches, was invented in 1993. Since then, enthusiasts have created PGN files for virtually all top players’ games and every high-level tournament. You can find them at sites such as PGN Mentor and Chess DB. [h/t Seth Kadish]
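
PGN is plain text: a block of bracketed tag pairs, then the moves. Here is a minimal sketch of reading one game with only the standard library. The sample game and the function name are made up for illustration, and the parser deliberately ignores comments, variations, and annotations, which real PGN files can contain.

```python
import re

# A made-up sample record in PGN's tag-pairs-then-movetext layout.
SAMPLE_PGN = '''[Event "Example Game"]
[White "Player A"]
[Black "Player B"]
[Result "1-0"]

1. e4 e5 2. Qh5 Nc6 3. Bc4 Nf6 4. Qxf7# 1-0
'''

# Matches a tag pair such as: [White "Player A"]
TAG_RE = re.compile(r'^\[(\w+)\s+"(.*)"\]\s*$')
RESULT_TOKENS = {"1-0", "0-1", "1/2-1/2", "*"}

def parse_pgn_game(text):
    """Return (tags, moves) for a single simple PGN game.

    Does not handle comments in braces, variations in parentheses,
    or files containing more than one game.
    """
    tags = {}
    move_lines = []
    for line in text.splitlines():
        match = TAG_RE.match(line)
        if match:
            tags[match.group(1)] = match.group(2)
        elif line.strip():
            move_lines.append(line.strip())
    tokens = " ".join(move_lines).split()
    # Drop move-number tokens such as "1." or "1...".
    moves = [t for t in tokens if not re.fullmatch(r"\d+\.+", t)]
    # Drop the trailing game-result token, if present.
    if moves and moves[-1] in RESULT_TOKENS:
        moves.pop()
    return tags, moves

tags, moves = parse_pgn_game(SAMPLE_PGN)
```

For large archives, a dedicated chess library is likely a better fit than this hand-rolled parser.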

A quarter-million bugs. For 18 years, a trap on the roof of the University of Copenhagen’s Zoological Museum lured moths, butterflies, and beetles to their early deaths. Researchers at the university counted and identified more than 250,000 specimens from 1,500+ species. The most common: Yponomeuta evonymella, a moth species also known as the bird-cherry ermine, which got trapped nearly 40,000 times.