Data Is Plural

... is a weekly newsletter of useful/curious datasets.

From the archives: Energy-related datasets

This is an unedited collection of previous dataset summaries from Data Is Plural's archives, presented in reverse chronological order. Please note: The excerpts are copied verbatim from the original editions, and might be out of date or missing essential context. If you'd like to suggest a dataset for inclusion, please get in touch.

Strategic petroleum. The US Energy Information Administration maintains a dataset tracking the monthly volume of the country’s Strategic Petroleum Reserve, measured in thousands of barrels. The figures go back to 1977, the year the first crude oil was delivered to the reserve, but lag by a couple of months; the end-of-August volume is scheduled for publication on October 31. Read more: The Department of Energy’s history of reserve releases. Previously: Petroleum Supply Monthly reports (DIP 2017.08.16) and weekly gas prices (DIP 2021.06.09), both also published by the EIA. [h/t u/CountBayesie]

Carbon pricing. In a paper published last month, Geoffroy Dolphin and Qinrui Xiahou describe their World Carbon Pricing Database. For each country (as well as each US state and certain other subnational jurisdictions), the database indicates the price per metric ton of CO2 equivalent associated with any carbon taxes and cap-and-trade mechanisms in place, for each year going back to 1990. It lists these prices for each combination of type of fuel and sectoral classification. Previously: The Voluntary Registry Offsets Database and the World Bank’s database of carbon pricing initiatives (DIP 2021.11.17).

Grid emissions. Ember, an “energy think tank that uses data-driven insights to shift the world from coal to clean electricity,” has begun compiling annual and monthly statistics on electricity demand, generation, and estimated greenhouse gas emissions by country, standardized from national and international sources. The annual estimates span two decades and 200+ countries and territories; the monthly dataset provides somewhat less coverage. Both can also be explored online. Related: Singularity’s Open Grid Emissions initiative estimates the hourly grid emissions of balancing authorities and power plants in the US, currently for 2019 and 2020. Previously: Other energy-related datasets. [h/t Philippe Quirion]

Wind and solar power. The Global Energy Monitor’s Global Wind Power Tracker is “a worldwide dataset of utility-scale wind facilities,” focusing on those with planned or installed capacities of at least 10 megawatts. It provides each facility’s name, location, status, capacity, installation type, owner, and other details. The project launched last week alongside a sibling dataset, the Global Solar Power Tracker. They join a growing collection of trackers from the organization, including those examining coal infrastructure, steel plants, and oil and gas resources. [h/t Nathaniel Hoffman]

Solar panels. The Berkeley Lab’s Tracking the Sun project examines US trends in residential and small non-residential solar panel installations. Its latest report describes more than 2 million such projects, based on records provided by state governments, utility companies, and other organizations. It features an interactive dashboard, summary tables, and a public dataset that lists each installation’s location, capacity, price, cost rebated, owner type, installer, physical orientation, component details, and other characteristics. A companion report and dataset examine utility-scale solar plants. [h/t Ed Vine]

European energy imports/exports. The EU’s Eurostat office publishes a range of statistical datasets on energy usage and economics, including annual imports and exports of petroleum, natural gas, and coal between European countries and their trading partners. Related: The Energy Information Administration tracks US imports and exports of petroleum, natural gas, and coal. As seen in: How Europe is dependent on Russian gas (New Statesman) and Why the Toughest Sanctions on Russia Are the Hardest for Europe to Wield (New York Times). Previously: European gas storage (DIP 2022.01.26), state-owned oil companies (DIP 2019.05.01), and global gas and oil infrastructure (DIP 2018.06.06). [h/t Lisa Charlotte Muth]

European gas storage. The industry group Gas Infrastructure Europe publishes data on daily fuel storage levels at its members’ facilities. For each facility, and aggregated to the provider and country level, the data indicate the amount of fuel in storage, the percent capacity that represents, fuel added and withdrawn, and more. As seen in: “Earlier Than Ever, European Gas Storage Is Half-Empty” (Bloomberg). [h/t Rose Mintzer-Sweeney]

Offshore wind turbines. Ting Zhang et al. have trained an algorithm to identify wind turbines in coastal satellite imagery, and have used it to build a dataset listing the location and construction month of 6,924 turbines offshore of 14 countries between 2015 and 2019. To test the algorithm’s accuracy, the researchers compared its results to other sources, including the US Wind Turbine Database (DIP 2018.04.25), the UK’s Renewable Energy Planning Database, the European Marine Observation and Data Network, and Open Power System Data (DIP 2019.08.14).

India’s coal mines. Thanks to India’s Right to Information Act, energy researcher Sandeep Pai has compiled a dataset of the country’s 459 operational coal mines. It includes each mine’s name, location (state, district, latitude, longitude), ownership, production tonnage, and more. Related: Pai’s introductory Twitter thread.

Electric utilities, standardized. “Electric utilities report a huge amount of information to the US government,” but “much of this data is not released in well documented, ready-to-use, machine readable formats.” That assessment comes from the Public Utility Data Liberation (PUDL) project, which aims to clean, standardize, and cross-link the electric utility information gathered by various agencies. Earlier this month, PUDL published its first data release; it includes information originally collected through Energy Information Administration Form 860 (details about individual generators) and Form 923 (individual power plants), the Environmental Protection Agency’s Continuous Emissions Monitoring System (hourly emissions), and the Federal Energy Regulatory Commission’s Form 1 (price rates and financial audits). The code PUDL uses to download, extract, and standardize the raw data is also available online. [h/t Zane Selvans]

Electricity prices. OpenEI’s Utility Rate Database contains nearly 50,000 expert-verified rates — current and historical — for residential, commercial, industrial, and street-lighting electricity from thousands of US utility companies. Related: OpenEI’s other data offerings. [h/t Arik Levinson and Emilson Delfino Silva]

California power outages. Last week, Pacific Gas and Electric began cutting power to hundreds of thousands of Californians — a precaution to keep the company’s aging infrastructure from sparking wildfires. Simon Willison has been scraping PG&E’s outage website every 10 minutes, and pushing the results into a database you can query and download. [h/t Lam Thuy Vo]

Electricity in rural India. Last month, Smart Power India and the Initiative for Sustainable Energy Policy published Rural Electricity Demand in India, a new survey dataset that “covers 10,000 households and 2,000 rural enterprises across 200 villages in Bihar, Uttar Pradesh, Odisha, and Rajasthan.” Respondents were asked, among other things, how many hours per day they get electricity, whether they have solar panels, and the price they pay for kerosene. [h/t Hisham Zerriffi + Johannes Urpelainen]

European electricity. The Open Power System Data platform has aggregated energy data from across Europe into a series of standardized datasets, including electricity consumption, power plants, and generation capacity. The project has also published an “IT philosophy,” a guide for new users, and a detailed listing of primary sources.

State-owned oil companies. The browseable and downloadable National Oil Company Database, a project of the Natural Resource Governance Institute, pulls together official data on nearly 100 metrics concerning 71 oil/gas companies owned by 61 countries. For instance: Petróleos de Venezuela, S.A., reported transferring roughly $5.5 billion to its government in 2016, down from nearly $28 billion in 2013; Saudi Aramco produces the equivalent of 13 million barrels of oil daily; and in 2017, Russia’s Rosneft generated approximately $283,000 in revenue per employee. [h/t Rachel Ziemba]

Power plants. The Global Power Plant Database, published by the World Resources Institute, “is a comprehensive, open source database of power plants around the world” and contains “information on plant capacity, generation, ownership, and fuel type.” The current edition, released in June 2018, covers 28,600+ power plants in 164 countries — including more than 1,000 each in Brazil, Canada, China, Great Britain, France, and the United States. Previously: U.S. power plants (DIP 2016.02.10). [h/t Kelly Rose + Paul Deane]

Coal cleanup funds. What happens when coal mines shut down? Money for their cleanup is supposed to be ensured by a system of bonds. But when Climate Home News’ Mark Olalde investigated these remediation funds, he found “a system incapable of dealing with large-scale bankruptcies, amid a declining industry, which severely threatens the environment and future of coal-mining communities across the country.” You can download the data behind Olalde’s findings — including bond databases covering the “23 states that produce 99% of US coal,” obtained via public records requests. [h/t Megan Darby]

Electric utilities. The U.S. Energy Information Administration uses Form EIA-861 to collect annual data from thousands of electric utilities about their sales, revenue, peak loads, customer counts, energy efficiency savings, and more. More than 3,400 utilities submitted the form (or its shorter cousin, EIA-861S) for 2017, and the data go back to 1990. [h/t Jordan Wirfs-Brock]

Home energy consumption. For many decades, the Department of Energy’s Residential Energy Consumption Survey has been asking people about their homes’ energy-related characteristics (e.g., number of bedrooms and roofing materials) and energy-consuming appliances (e.g., television size and dishwasher use). Then, the agency cross-references those answers with billing data collected “directly from energy suppliers under a mandatory authority granted by Congress.” The survey has been conducted 14 times since 1978; survey microdata is available for the eight most recent iterations.

Power outages. Utility companies are required to report major power outages and other “electric disturbance events” to the Department of Energy within a business day (or, depending on the type of event, sooner) of the incident. The federal agency then aggregates the reports into annual summary datasets. For each event, the data includes the time it began and was resolved, the geographic areas it affected, the type of incident, and the estimated number of customers affected. [h/t Jordan Wirfs-Brock]

Global gas and oil infrastructure. The Department of Energy’s National Energy Technology Laboratory has published what it says is the “first-ever database inventory of oil and natural gas infrastructure information from the top hydrocarbon-producing and consuming countries in the world.” The database contains tons of geospatial information and “identifies more than 4.8 million individual features like wells, pipelines, and ports from more than 380 datasets in 194 countries. It includes information about the type, age, status, and owner/operator of infrastructure features.” Helpful: The authors’ (detailed) methodology paper. [h/t Michael McLaughlin]

Wind turbines. Lawrence Berkeley National Laboratory, the U.S. Geological Survey, and the American Wind Energy Association have partnered to publish the U.S. Wind Turbine Database. The dataset, which the government says will be “continuously updated,” currently contains 57,636 turbines and includes each turbine’s location, development project, manufacturer, model, height, rotor diameter, and other characteristics. You can download the data in several formats, and also explore it on an interactive map. [h/t Ed Vine]

Offshore drilling. The Bureau of Ocean Energy Management and the Bureau of Safety and Environmental Enforcement — two of the agencies that replaced the troubled U.S. Minerals Management Service in the wake of the Deepwater Horizon spill — publish a few dozen bulk datasets related to their oversight of offshore drilling operations. Among them: lease owners, production metrics, company details, pipeline permits and locations, incident investigations, and platform structures. Related: “American Idle: Decommissioning costs sink offshore drillers into latest crisis,” a 2017 Debtwire investigation that used the platform data. [h/t Alex Plough]

Carbon-conscious energy policies. The Database of State Incentives for Renewables & Efficiency “is the most comprehensive source of information on incentives and policies that support renewables and energy efficiency in the United States.” The database, which was founded in 1995 and is funded by the Department of Energy, includes tax rebates, solar energy buybacks, building standards, and more. You can download the data in several formats, or browse and search it online. [h/t Carol Brotman White]

Energy use at 10 Downing Street. UK-based CarbonCulture helps organizations measure and publish their buildings’ energy and water use in near-realtime. Among the first users: 10 Downing Street, the Tate Modern, and University College London. For each building, you can download yearly datasets, which are broken down into 30-minute intervals. [h/t Max Roser]

Solar panels. The Open PV Project is a “community driven, comprehensive database” of solar panel installations in the U.S., ranging from home installations to utility-scale projects. The database, run by the Department of Energy, contains more than 1 million installations — with a total capacity of 16,000+ megawatts — and tracks their locations, sizes, costs, installers, and other variables. [h/t Dad]

Pipelines. The U.S. Energy Information Administration publishes a bunch of geographic data, including shapefiles mapping the country’s crude oil, petroleum product, hydrocarbon gas liquid, and natural gas pipelines. (They were last updated five months ago.) Additionally, the Pipeline and Hazardous Materials Safety Administration keeps track of “significant incidents” — for example, those that caused a serious injury or $50,000 in damage. Related: “Six maps that show the anatomy of America’s vast infrastructure.” Also related: ProPublica’s Pipeline Safety Tracker, covering 1986–2012.

Electricity prices. In May 2016, U.S. residential consumers paid an average of roughly 12.8 cents per kilowatt hour of electricity. The price was lowest in Louisiana (9.28 cents) and Washington state (9.54 cents), and highest in Hawaii (26.87 cents) and Connecticut (21.63 cents). These data-points, and more, are available through the Energy Information Administration’s electric power reports, which are updated monthly. [h/t Jordan Wirfs-Brock]

Powering America. Every year, the U.S. Energy Information Administration requires thousands of power plants to report detailed data on fuel consumption and electricity generation. The datasets stretch back more than three decades, to 1989. In 2014, the most recent year available, Arizona’s Palo Verde Nuclear Generating Station generated more electricity — 32 million megawatt hours — than any other power plant in the country. [h/t Marc DaCosta]