Data Is Plural

... is a weekly newsletter of useful/curious datasets.

2022.11.23 edition

Presidential pardons, radioactive waste, songs of the world, semiconductor logistics, and “every Star Trek ever.”

Presidential pardons. The US Department of Justice’s Office of the Pardon Attorney reviews all federal clemency requests and advises the president on such matters. The office provides a search tool and spreadsheet with the names, case statuses, and decision dates for all 76,000+ pardon and commutation requests since 1989. It also publishes a series of tables, with varying structure, listing the people who received clemency from each president since Richard Nixon. Those tables additionally include recipients who did not submit formal requests and, for more recent presidents, details about the recipients’ offenses and sentences. Another page contains clemency statistics for each year and president since William McKinley.

Radioactive waste. Through its Nuclear Fuel Data Survey, the US Energy Information Administration “collects data on spent nuclear fuel from all utilities that operate commercial nuclear reactors and from all others that possess irradiated fuel from commercial nuclear reactors.” The agency’s latest release, published last year with data through 2017, includes several basic tables. One tallies the annual amount of nuclear fuel discharged and stored at commercial sites since 1968, measured by assembly count and metric tons of uranium. As seen in: “As nuclear waste piles up, scientists seek the best long-term storage solutions” (Chemical & Engineering News, 2020).

Songs of the world. In a paper published this month, researchers describe the Global Jukebox, an interactive map and compilation of datasets focused on traditional songs from around the world. The project traces its history to prototypes developed in the 1980s by Alan Lomax, a musicologist who collected many of the recordings. The core dataset, called Cantometrics, encodes “37 aspects of musical style for 5,776 traditional songs from 1,026 societies.” Other datasets cover, for example, song instruments and phrasing patterns. [h/t Pat Savage]

Semiconductor logistics. The Emerging Technology Observatory, a new initiative based at Georgetown University, aims to provide a “public platform for high-quality, actionable data resources on the global emerging technology landscape.” Its Advanced Semiconductor Supply Chain Dataset, which you can download and explore online, contains “manually compiled, high-level information about the tools, materials, processes, countries, and firms involved in the production of advanced logic chips.” [h/t Zach Arnold]

“Every Star Trek ever.” The dataset: Flourish Klink’s compilation of every official Star Trek book, audiobook, comic, episode, movie, and more. “My continuing mission: To consume every piece of Star Trek content ever made!” Previously: Star Trek computer-talk (DIP 2022.08.10) and a Star Trek API (DIP 2018.11.07). [h/t Lisa Cee]