Data Is Plural

... is a weekly newsletter of useful/curious datasets.

2022.03.16 edition

European energy imports/exports, Ukraine border crossing sites, conspiracy theory language, ISS photo locations, and software sunsets.

European energy imports/exports. The EU’s Eurostat office publishes a range of statistical datasets on energy usage and economics, including annual imports and exports of petroleum, natural gas, and coal between European countries and their trading partners. Related: The Energy Information Administration tracks US imports and exports of petroleum, natural gas, and coal. As seen in: How Europe is dependent on Russian gas (New Statesman) and Why the Toughest Sanctions on Russia Are the Hardest for Europe to Wield (New York Times). Previously: European gas storage (DIP 2022.01.26), state-owned oil companies (DIP 2019.05.01), and global oil and gas infrastructure (DIP 2018.06.06). [h/t Lisa Charlotte Muth]

Ukraine border crossing sites. The UN’s Office for the Coordination of Humanitarian Affairs in Ukraine has combined information from multiple sources to create a geospatial dataset of the country’s international border crossings with Moldova (11 crossings listed), Poland (8), Hungary (5), Romania (4), and Slovakia (2). It provides each crossing’s latitude, longitude, border country, and name in both Ukrainian and English. The associated metadata indicates an “expected update frequency” of weekly.

Conspiracy theory language. Alessandro Miani et al. have built a dataset to study the language of conspiracy theories. Starting with a set of phrases associated with major conspiracy theories (e.g., those surrounding Princess Diana’s death), the authors searched online for sources that mentioned them often, ultimately selecting 150 websites — both mainstream and conspiracy-laden — that met their criteria. Then, in mid-2020, they collected 72,000+ topical articles from those sites. For each article, the project includes its text, lexical features, and metadata. [h/t Gwern Branwen]

ISS photo locations. In 2013, Nathan Bergey crawled NASA’s website of International Space Station imagery, creating a dataset that listed the mission, roll, frame, latitude, and longitude of each of the million-plus photographs taken from the habitable satellite up to that point. Related: NASA itself also provides tools for searching, browsing, and mapping ISS imagery. [h/t Sasha Trubetskoy]

Software sunsets. The website endoflife.date tracks “end of life” dates and “support lifecycles” for roughly 100 software products — languages, frameworks, operating systems, and more. It indicates, for example, that Python 3.10 appeared in October 2021 and will lose security support in October 2026. You can access the project’s machine-readable data through its GitHub repository and API.
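
For readers who want to try the machine-readable route, here is a minimal Python sketch of looking up one release cycle's end-of-life date. It assumes the API serves a JSON list per product at a URL shaped like `https://endoflife.date/api/{product}.json`, with `cycle` and `eol` fields on each record; check the project's documentation before relying on either. The helper names (`fetch_cycles`, `eol_date`) are hypothetical, and the sample record is hand-written: its months come from the Python 3.10 example above, while the day-of-month values are placeholders.

```python
import json
from datetime import date
from urllib.request import urlopen

# Assumed URL shape for the per-product endpoint; verify against the project's docs.
API_TEMPLATE = "https://endoflife.date/api/{product}.json"


def fetch_cycles(product):
    """Download the list of release-cycle records for one product (needs network)."""
    with urlopen(API_TEMPLATE.format(product=product), timeout=10) as resp:
        return json.load(resp)


def eol_date(cycles, cycle):
    """Return the end-of-life date for the named cycle, or None if unknown.

    Assumes each record has a "cycle" field and an "eol" field that is
    either an ISO date string or a non-date value such as a boolean.
    """
    for entry in cycles:
        if str(entry.get("cycle")) == cycle:
            eol = entry.get("eol")
            if isinstance(eol, str):
                return date.fromisoformat(eol)
            return None
    return None


# Hand-written sample record; the day-of-month values are placeholders.
sample = [{"cycle": "3.10", "releaseDate": "2021-10-01", "eol": "2026-10-01"}]
print(eol_date(sample, "3.10"))
```

To query live data instead of the sample, something like `eol_date(fetch_cycles("python"), "3.10")` should work if the assumptions above hold.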