Data Is Plural

... is a weekly newsletter of useful/curious datasets.

2020.08.05 edition

Cabinet members, the World Values Survey, software supply chain attacks, animals with SARS-CoV-2, and 15 million CAD sketches.

Cabinet members. WhoGov, a new project led by two graduate students at Oxford, provides “bibliographic information, such as gender and party affiliation, on cabinet members in July every year in the period 1966-2016 in all countries with a population of more than 400,000 citizens.” In all, the dataset covers more than 50,000 officials and “makes it possible to answer questions such as; what is the share of female cabinet members globally, which type of regime has the highest cabinet turnover, and have cabinets increased in size over time?” [h/t Yujin Julia Jung + Max Grömping]

Values and beliefs. The World Values Survey, first fielded in 1981, “is the largest non-commercial, cross-national, time series investigation of human beliefs and values ever executed, currently including interviews with almost 400,000 respondents.” Last month, the project began releasing data from its seventh wave of interviews, conducted in 77 countries and covering hundreds of questions about religion, migration, stereotypes, trust, and more. [h/t Michael Howlett + Seth J. Meyer]

Software supply chain attacks. Breaking Trust, a new project from the Atlantic Council, provides a dataset of 115 “software supply chain” attacks and disclosures in the past decade. These vulnerabilities occur “when an attacker accesses and edits software somewhere in the complex software development supply chain to compromise a target farther up the chain by inserting their own malicious code.” The dataset’s examples include Stuxnet, malicious browser extensions, and various attacks on software package registries. [h/t Maya Kaczorowski + Lily Liu]

Animals with SARS-CoV-2. The USDA has been publishing basic information about animals that its National Veterinary Services Laboratories have confirmed as having contracted the novel coronavirus. The simple table, unavailable to download but easy enough to copy-paste into a spreadsheet, lists the types of animals (mostly “Cat” and “Dog,” but also a lion and a tiger), the states they lived in, the dates confirmed, and the methods of diagnosis.

A lot of CAD sketches. SketchGraphs characterizes 15 million computer-aided design (CAD) sketches, “extracted from real-world CAD models” and obtained from a popular online CAD platform. Each of the dataset’s sketches “is represented as a geometric constraint graph where edges denote designer-imposed geometric relationships between primitives, the nodes of the graph.”
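
For readers unfamiliar with the term, a geometric constraint graph can be pictured with a small sketch like the one below. This is only an illustration of the idea in the quote (primitives as nodes, designer-imposed relationships as edges). The class, method, and label names are hypothetical and are not the SketchGraphs API or file format.

```python
from dataclasses import dataclass, field


@dataclass
class Sketch:
    """Toy constraint graph: primitives are nodes, constraints are edges.

    All names here are illustrative assumptions, not the dataset's schema.
    """
    primitives: dict = field(default_factory=dict)   # node id -> primitive type
    constraints: list = field(default_factory=list)  # (constraint type, id, id)

    def add_primitive(self, node_id, kind):
        self.primitives[node_id] = kind

    def add_constraint(self, kind, a, b):
        # An edge may only connect primitives that already exist in the sketch.
        if a not in self.primitives or b not in self.primitives:
            raise KeyError("constraint references an unknown primitive")
        self.constraints.append((kind, a, b))

    def neighbors(self, node_id):
        """Return (constraint type, other node id) pairs touching node_id."""
        result = []
        for kind, a, b in self.constraints:
            if a == node_id:
                result.append((kind, b))
            elif b == node_id:
                result.append((kind, a))
        return result


# Two lines meeting at a right angle, with a circle tangent to the second line.
sketch = Sketch()
sketch.add_primitive("line_1", "Line")
sketch.add_primitive("line_2", "Line")
sketch.add_primitive("circle_1", "Circle")
sketch.add_constraint("Perpendicular", "line_1", "line_2")
sketch.add_constraint("Tangent", "line_2", "circle_1")

print(sketch.neighbors("line_2"))
```

Here `line_2` is linked to both other primitives, so asking for its neighbors returns one perpendicular and one tangent relationship.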