Data Is Plural

... is a weekly newsletter of useful/curious datasets.

2021.09.01 edition

Attacks against aid workers, fact-checks, offshore wind turbines, worker strikes in China, and creative Twitter bots.

Attacks against aid workers. The Aid Worker Security Database is “a global compilation of reports on major security incidents involving deliberate acts of violence affecting aid workers,” with more than 3,200 records since 1997. Researchers gather, evaluate, and categorize information from official reports, partnerships with humanitarian agencies, news media, and other sources. For each incident, the database indicates its date and location; the number of workers killed, wounded, or kidnapped; their general affiliations; the type of attacker and means of attack; a brief description; and more. [h/t The Costs of War Project]

Fact-checks. ClaimReview is an open standard for adding structured information to fact-check articles, such as the specific claim reviewed, where it appeared, the fact-checking organization, and the reviewer’s rating. The schema has been adopted by a range of big-name publishers, including the Washington Post, PolitiFact, and Univision, as well as smaller outlets around the world. The structured-data website Data Commons hosts a feed of 29,000+ ClaimReview-tagged fact-checks, as well as a curated subset.
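
For a sense of what that structured information looks like, here is a minimal Python sketch that reads one ClaimReview record and pulls out the four fields mentioned above. The property names (`claimReviewed`, `itemReviewed`, `appearance`, `author`, `reviewRating`, `alternateName`) follow the schema.org vocabulary as I understand it. The sample record, its URLs, and the `summarize_claim_review` helper are invented for illustration and are not taken from the Data Commons feed, whose records may be nested differently.

```python
import json

# Hypothetical ClaimReview record; every value here is invented for illustration.
SAMPLE = """
{
  "@context": "https://schema.org",
  "@type": "ClaimReview",
  "url": "https://example.org/fact-checks/123",
  "datePublished": "2021-08-30",
  "claimReviewed": "Example claim text",
  "author": {"@type": "Organization", "name": "Example Fact Checker"},
  "itemReviewed": {
    "@type": "Claim",
    "appearance": {"@type": "CreativeWork", "url": "https://example.com/post/456"}
  },
  "reviewRating": {"@type": "Rating", "alternateName": "False"}
}
"""


def summarize_claim_review(record):
    """Return the claim, where it appeared, the reviewer, and the rating.

    Missing fields come back as None rather than raising an error,
    since publishers do not always fill in every property.
    """
    item = record.get("itemReviewed") or {}
    appearance = item.get("appearance") or {}
    # Some publishers list several appearances; keep only the first one.
    if isinstance(appearance, list):
        appearance = appearance[0] if appearance else {}
    return {
        "claim": record.get("claimReviewed"),
        "appeared_at": appearance.get("url"),
        "organization": (record.get("author") or {}).get("name"),
        "rating": (record.get("reviewRating") or {}).get("alternateName"),
    }


summary = summarize_claim_review(json.loads(SAMPLE))
print(summary)
```

The helper returns a flat dictionary per fact-check, which is one convenient shape for loading many records into a table.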

Offshore wind turbines. Ting Zhang et al. have trained an algorithm to identify wind turbines in coastal satellite imagery, and have used it to build a dataset listing the location and construction month of 6,924 turbines off the coasts of 14 countries between 2015 and 2019. To test the algorithm’s accuracy, the researchers compared its results to other sources, including the US Wind Turbine Database (DIP 2018.04.25), the UK’s Renewable Energy Planning Database, the European Marine Observation and Data Network, and Open Power System Data (DIP 2019.08.14).

Worker strikes in China. China Labour Bulletin, founded in 1994 as a monthly newsletter, is a Hong Kong–based organization “that supports and actively engages with the workers’ movement in China.” Its map and dataset of worker strikes and protests provide details on 13,000+ events since 2011, including their location, date, and description; industry categories and ownership types; employee demands; and authorities’ response. [h/t The China Data Lab]

Creative Twitter bots. The website Botwiki “was created in July 2015 by Stefan Bohacek with the goal of preserving examples of interesting and creative online bots” and providing tutorials for building them. Bohacek has curated a dataset of 70+ popular examples running on Twitter, drawn from Botwiki and from Tully Hansen’s Omnibots list. Among them: @year_progress, @nyt_first_said, and @tiny_star_field.