Data Is Plural

... is a weekly newsletter of useful/curious datasets.

2019.08.28 edition

Multinational corporations, scientific citations, whip counts, German federal judges, and movie shots.

Multinational corporations. The OECD’s ADIMA database tracks multinational corporations — Walmart, Toyota, Nestle, etc. — and their subsidiaries. It currently includes economic statistics about each of the world’s 100 largest multinationals, the names and locations of 26,000 subsidiaries, and information about nearly 20,000 of their websites. The OECD says it plans to expand the number of companies in the future. Now you know: In 2016, the companies in the dataset “generated nearly $10 trillion in revenues (almost 20% of global GDP), earned $730 billion in profits and paid $185 billion in taxes,” according to the OECD.

Citations and self-citations. A team led by meta-research pioneer John Ioannidis has developed a dataset of citation metrics for science’s 100,000 most-cited authors. The dataset includes each author’s name, institutional affiliation, number of publications, total citations, “h-index,” and more. For each citation metric, there’s a second version that excludes self-citations. Related: “Hundreds of extreme self-citing scientists revealed in new database” (Nature).

Congressional whip counts. Government professor C. Lawrence Evans’s dataset of US House “whip counts” describes more than 650 of the informal polls conducted by party leadership — covering 1955–86 for Democrats and 1975–80 for Republicans, on topics as varied as dairy prices, Alaskan statehood, voting rights, and Vietnam. It also indicates how each party member responded. [h/t Neil Malhotra + Janet Box-Steffensmeier]

German federal judges. Legal scholar and open-data enthusiast Hanjo Hamann has digitized seventy years of rosters from Germany’s seven federal courts, extracted structured data about the judges, and linked them to their Wikidata IDs. Related: Hamann’s detailed description of the dataset’s historical context and its construction. [h/t Erik Gahner Larsen]

Movie shots. James E. Cutting, a Cornell University psychology professor, has compiled several datasets on the structure of popular films, including one that indicates the length of each shot in 220 movies from 1915 to 2015. [h/t Igor Schwarzmann + Noah Brier]