Data Is Plural

... is a weekly newsletter of useful/curious datasets.

2018.01.03 edition

Taxes filed, distance learning, animals on the move, building permits, and knotted string.

Taxes filed. The IRS publishes a ton of tax statistics. One of the most interesting portions: data aggregated from individual income tax returns (i.e., Form 1040s), which the IRS provides at the state, county, and ZIP code level. Those datasets’ 100+ fields include details that range from the basic (e.g., the number of tax filings and total income reported) to the more obscure (e.g., the number of returns that included “educator expenses” and the total amount of overpayments refunded). [h/t Cecilia Reyes]

Distance learning. The Open University Learning Analytics dataset features demographic information about 28,000+ students who, in 2013 and 2014, enrolled in any of seven particular distance learning courses at the UK’s Open University; their final results (distinction, pass, fail, or withdrawn); 173,000+ graded assignments; and 10+ million rows describing each student’s interactions with the courses’ “virtual learning environments.” Useful: The researchers’ academic article describing the dataset.

Animals on the move. Movebank is “a free, online database of animal tracking data hosted by the Max Planck Institute for Ornithology.” On the site’s data map, you can display the animal tracks from particular studies, such as the migrations of more than a dozen turkey vultures. Contributing researchers can decide whether to share the underlying data; not all do. (Here’s the data for those vultures, plus six buffalo in Kruger National Park, and seven Venezuelan oilbirds.) [h/t Hari Karthic]

Building permits. The Census Bureau’s Building Permits Survey collects data from thousands of municipalities every month. For each municipality, metro area, and state, the datasets provide the number of permits issued for new residential housing, the number of housing units authorized, and the total estimated value of the new construction. Previously: The Census Bureau’s Annual Characteristics of New Housing survey (DIP 2016.06.22). [h/t Susie Cambria + Issi Romem]

Knotted string. “The Khipu Database Project began in the fall of 2002, with the goal of collecting all known information about khipu” (the knotted string textiles used for recordkeeping in the Inca Empire) “into one centralized repository.” The project’s datasets include detailed structural data about hundreds of khipu, as well as an inventory of all known specimens. Related: The College Student Who Decoded the Data Hidden in Inca Knots.