Data Is Plural

... is a weekly newsletter of useful/curious datasets.

2018.12.12 edition

Bike and pedestrian safety, computer vulnerabilities, sunniness, German political speeches, and boy bands.

Bike and pedestrian safety. A growing number of cities publish detailed data on bicyclist and pedestrian injuries involving cars, including New York City, Chicago, Boston, Seattle, St. Paul, Minn., Chapel Hill, N.C., Tempe, Ariz., Toronto, and London. Many do so through their “Vision Zero” street-safety initiatives. (Some of the datasets also include car-on-car collisions.) Related: “The most dangerous intersections in Seattle for bicyclists and pedestrians.” [h/t Rachel Schallom + Jeff Asher]

Computer vulnerabilities. Common Vulnerabilities and Exposures is a downloadable list of more than 110,000 “publicly known cybersecurity vulnerabilities.” Each vulnerability is assigned a unique identifier (e.g., CVE-2014-0160) and given a description. The National Institute of Standards and Technology’s National Vulnerability Database takes the list and adds more information for each entry, “such as fix information, severity scores, and impact ratings.” That database is available in a variety of bulk downloads and data feeds; you can also search it online. [h/t GitHub user “nanoseconds”]
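
For anyone working with the list programmatically, here is a minimal sketch of how an identifier like CVE-2014-0160 could be split into its parts. It assumes the usual "CVE", four-digit year, then four-or-more-digit sequence number layout; the function name parse_cve_id is hypothetical and not part of any official tool.

```python
import re

# Assumed layout: "CVE-YYYY-NNNN", where the sequence number has
# four or more digits.
CVE_PATTERN = re.compile(r"^CVE-(\d{4})-(\d{4,})$")


def parse_cve_id(cve_id):
    """Return (year, sequence_number) as ints, or None if malformed.

    Hypothetical helper for illustration only.
    """
    match = CVE_PATTERN.match(cve_id.strip().upper())
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


if __name__ == "__main__":
    print(parse_cve_id("CVE-2014-0160"))
```

Splitting out the year this way makes it straightforward to group or filter entries by the year in their identifier.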

Sunniness. The National Renewable Energy Laboratory’s solar datasets measure the average annual and monthly “total solar resource” for the United States, broken down by state, county, ZIP code, and roughly-10-square-kilometer chunks of the country. Bonus: More sun-radiation datasets via this Stack Overflow answer. [h/t Joe Hourclé]

German political speeches. Academic researcher Adrien Barbaresi has compiled a corpus of thousands of speeches from the German Presidency, Presidency of the Bundestag, Chancellery, and Ministry of Foreign Affairs. The corpus, now in its third version, was first released in 2011. [h/t Adrien Barbaresi]

Boy bands. The Pudding’s Internet Boy Band Database is “an audio-visual history of every boy band to chart on the Billboard Hot 100 since 1980.” You can download the underlying data, which is stored in two files: boys.csv and bands.csv.