Data Is Plural

... is a weekly newsletter of useful/curious datasets.

2018.02.21 edition

Rohingya refugees, Treasury sanctions, happy moments, GitHub activity, and public commemorations.

Rohingya refugees. The Humanitarian Data Exchange has collated dozens of datasets related to the Rohingya refugee crisis. Among them: the geographic boundaries of Rohingya refugee settlements in Bangladesh, the numbers of refugees living in those settlements, and the infrastructure available there.

U.S. Treasury sanctions. Through its Office of Foreign Assets Control, the Treasury publishes several datasets that describe the people and companies subject to U.S. economic sanctions. The two main listings are the Specially Designated Nationals and Blocked Persons List (“SDN”) and the Consolidated Sanctions List. Those contain only currently sanctioned entities, but the Treasury also publishes (semi-structured) documents describing historical additions and removals. Related: Enigma Public’s Sanctions Tracker. [h/t Jennifer Roscoe]

Happy moments. HappyDB is “a corpus of 100,000 crowd-sourced happy moments.” An example: “My son gave me a big hug in the morning when I woke him up.” The researchers, who recently described their efforts in an academic paper, collected the sentiments from Mechanical Turk workers, who also supplied basic demographic information, such as age, gender, and whether they have children. [h/t Marcel Weiher]

Seven years of GitHub activity. The GitHub Archive is an effort to record the popular code-sharing website’s public timeline, “archive it, and make it easily accessible for further analysis.” The dataset, which includes more than 20 types of events and often contains more than 1 million events per day, goes back to February 2011. Related: Structured data representing the “commit histories” of two dozen popular open-source projects, including Rust, Pandas, Redis, and Bitcoin.

Public commemorations. The Open Plaques project is dedicated to “documenting the historical links between people and places as recorded by commemorative plaques.” The latest data dump contains nearly 40,000 plaques — the vast majority in the U.S., U.K., and Germany. OpenBenches, meanwhile, has collected similar data for 4,300+ memorial benches. [h/t Jason Norwood-Young]