Data Is Plural

... is a weekly newsletter of useful/curious datasets.

2017.03.15 edition

Visas, sound clips, Freddie Mac home loans, Chicago traffic violations, and constitutional amendments.

Who’s visited the U.S. on visas, and how. Donald Trump’s new travel ban is scheduled to take effect at 12:01 a.m. Eastern tonight. The State Department doesn’t publish real-time visa data, but it does publish historical data, including the number of non-immigrant visas issued each fiscal year between 1997 and 2016, by nationality and visa type. (For example, the government issued 226 “fiancé(e)” K-1 visas to Syrian nationals in fiscal year 2016.) The agency also reports how many visas of each type it refused each year, as well as refusal rates by nationality. [h/t Thomas Kasang]

Sounds of YouTube. Last week, a research team at Google published AudioSet, a dataset of “2,084,320 human-labeled 10-second sound clips drawn from YouTube videos.” The clips have been classified into hundreds of categories, including “plucked string instrument,” “computer keyboard,” “chuckle, chortle,” “snoring,” and “fowl.” [h/t Suman Deb Roy]

Many millions of mortgages. Freddie Mac (the government-sponsored, publicly traded company also known as the Federal Home Loan Mortgage Corporation) publishes data on 23 million single-family home mortgages it has purchased or guaranteed since 1999. The dataset includes the loan amount and interest rate, the borrower’s credit score, the property type (e.g., condo, co-op, manufactured housing), metro area, first payment month, whether the borrower is a first-time homebuyer, and lots more. Freddie Mac requests that you register before downloading the data, but you can also access the files directly. Don’t miss the terms and conditions, which prohibit republishing the files. Previously: Data on millions more loans from the Home Mortgage Disclosure Act (Dec. 30, 2015).

Chicago traffic camera violations. The Windy City publishes two datasets on traffic violations. One tallies the daily number of speeding violations in each Children’s Safety Zone; the other, red-light violations at each camera-surveilled intersection. Both go back to July 2014. The city also publishes a spreadsheet of city-towed vehicles. Related: The Chicago Tribune’s long-running investigation into the city’s traffic camera troubles. [h/t Jacob Sheff]

Nearly every proposed amendment to the Constitution. To prepare for an exhibition last year, the National Archives and Records Administration created a dataset of more than 11,000 constitutional amendment proposals introduced in Congress between 1787 and 2014. [h/t Justin Lewis]