Data Is Plural

... is a weekly newsletter of useful/curious datasets.

2018.12.05 edition

A spyware’d novel, body camera usage, UK grants, subtitle word frequencies, and snow plows.

Novel-writing, recorded. In 2014, author C. M. Taylor began writing a new novel, this time with a twist: He would write the entire story on a laptop intentionally infected with spyware. With the help of the British Library, a program recorded every keystroke Taylor typed and took screenshots every few seconds. The novel, Staying On, was published in October; soon after, Taylor and the library made the spyware recordings available to download. [h/t Dan Hett]

Body camera usage. The New Orleans Police Department’s “Body Worn Camera Metadata” contains the dates, times, durations, and locations for 2.7 million body camera recordings, going back to 2014. Related: The agency publishes similar data for 1.5 million in-car camera recordings. [h/t Alexandre Léchenet]

UK grants. The British nonprofit 360Giving helps grantmakers “to publish their grants data in an open, standardised way and helps people to understand and use the data.” Through its GrantNav platform, you can search across more than 300,000 grants — totalling more than £25 billion — given by scores of funders to nearly 180,000 recipients. You can download the results of each search, as well as the underlying datasets. [h/t Enigma Public]

Subtitle word frequencies. SUBTLEXus is a dataset of word frequencies in American English, derived from the subtitles for 8,388 films. The dataset, which covers more than 74,000 words, includes each word’s total frequency, the number of films in which the word appeared, and several other metrics. Bonus: Similar datasets are also available for Chinese and Dutch. [h/t The Language Goldmine]
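
For anyone comparing words across corpora, here is a rough sketch of how the two metrics named above (total frequency and number of films) might be normalized. The function name, the sample counts, and the corpus size are all hypothetical; the only figure taken from the description is the 8,388-film total. The dataset's own derived columns may be computed differently.

```python
TOTAL_FILMS = 8388  # number of films, per the description above


def summarize(word, count, films_with_word, corpus_words):
    """Normalize raw counts for one word.

    corpus_words is the total number of word tokens in the corpus; it is
    an input here because the description above does not state it.
    """
    return {
        "word": word,
        # Occurrences per million tokens, a common way to compare corpora.
        "per_million": count * 1_000_000 / corpus_words,
        # Fraction of films in which the word appears at least once.
        "film_share": films_with_word / TOTAL_FILMS,
    }


# Hypothetical numbers, for illustration only.
row = summarize("example", count=500, films_with_word=4194, corpus_words=10_000_000)
print(row)
```

The per-million figure and the film share can diverge: a word that is frequent overall but concentrated in a handful of films will score high on the first and low on the second.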

Snow plows. Last month, I Quant NY’s Ben Wellington analyzed New York City’s raw snow plow data, “which had only been viewed 41 times before apparently.” The 250 million–row dataset is, as Wellington notes, “stored in an odd format”: snapshots that indicate, every 15 minutes, the last time each of the city’s street segments was plowed. Related: ClearStreets provides historical data from the City of Chicago’s Plow Tracker; the Iowa Department of Transportation publishes a live plow tracker; Syracuse and Pittsburgh have published historical snow plow data.
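
Because each snapshot in the New York City data repeats a segment's last-plowed time until the segment is plowed again, most rows are duplicates of the same event. A minimal sketch of collapsing snapshots into distinct plow events follows. It assumes each row has a snapshot time, a segment ID, and a last-plowed time; the field layout, segment IDs, and timestamps are invented for illustration and are not the dataset's actual schema.

```python
from datetime import datetime


def distinct_plow_events(snapshots):
    """Collapse repeated snapshots into a sorted list of plow times per segment.

    snapshots: iterable of (snapshot_time, segment_id, last_plowed) tuples,
    where last_plowed is an ISO 8601 string or None if never plowed.
    """
    events = {}
    for _snapshot_time, segment_id, last_plowed in snapshots:
        if last_plowed is None:
            continue  # segment had not been plowed as of this snapshot
        # A set drops the copies repeated across consecutive snapshots.
        events.setdefault(segment_id, set()).add(datetime.fromisoformat(last_plowed))
    return {segment: sorted(times) for segment, times in events.items()}


# Hypothetical rows: three 15-minute snapshots covering two street segments.
snapshots = [
    ("2018-11-15T18:00", "seg-A", "2018-11-15T17:40"),
    ("2018-11-15T18:15", "seg-A", "2018-11-15T17:40"),
    ("2018-11-15T18:30", "seg-A", "2018-11-15T18:22"),
    ("2018-11-15T18:00", "seg-B", None),
    ("2018-11-15T18:15", "seg-B", "2018-11-15T18:05"),
]

events = distinct_plow_events(snapshots)
for segment, times in events.items():
    print(segment, [t.isoformat() for t in times])
```

One caveat with this approach: if a segment is plowed twice between two snapshots, only the later pass is visible, so the result is a lower bound on the number of plow passes.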