Data Is Plural

... is a weekly newsletter of useful/curious datasets.

2023.03.15 edition

Recent bank financials, historical bank financials, policy categorizations, irrigation water use, and the market for X-Men.

Bank financials, 1976–present. The Federal Financial Institutions Examination Council’s National Information Center “provides comprehensive financial and structure information on banks and other institutions for which the Federal Reserve has a supervisory, regulatory, or research interest.” Its datasets include quarterly financial statements for bank holding companies, going back to 2016, plus detailed attributes of all active banks, 150,000+ banks closed since the mid-1930s (including Silicon Valley Bank), and 160,000+ bank branches. The agency also provides bank financials in the form of “call reports” going back to 2001. Earlier call reports, going back to 1976, are available from the Chicago Fed. Related: The FDIC’s list of failed banks since October 2000. [h/t Sergio Correia]

Bank financials, 1867–1904. Federal Reserve economists Sergio Correia and Stephan Luck have compiled a dataset of “annual national bank balance sheets for more than 7,000 unique national banks, covering the years 1867 to 1904.” They did so by “combining optical character recognition (OCR) techniques with modern layout separation techniques,” which allowed them to extract information from scans of the Office of the Comptroller of the Currency’s annual reports to Congress. The data include asset and liability subtotals, receivership dates, city-level variables, and more. Related: Correia and Luck describe their methodology in a recent paper and open-access preprint.

Policies, categorized. The Comparative Agendas Project “assembles and codes information on the policy processes of governments from around the world,” categorizing them into 20+ topics (e.g., “Civil Rights”) and 200+ subtopics (e.g., “Handicap Discrimination”). It “actively monitors thirty different data series,” which you can download and explore online, “all coded by this same predictable, reliable coding system.” Previously: CAP categorizations for a decade of NYT front-page stories (DIP 2018.04.25). [h/t E.J. Fagan]

Irrigation by county and crop. P. J. Ruess et al. have developed annual, county-level estimates of irrigation water use for 20 crop groups between 2008 and 2020. The calculations draw on water use data from the US Geological Survey, as well as high-resolution data on crop locations, climate, and more. The authors provide separate estimates for surface water withdrawals, groundwater withdrawals, and nonrenewable groundwater depletion, making the findings “the first national-scale assessment of irrigation by crop, water source, and year.” [h/t Mike Stucka]

The market for X-Men. Anderson Evans’s Mutant Moneyball project uses comic book market data to explore the financial value of individual X-Men characters. The project’s dataset provides decade-by-decade statistics for 26 members of the team, drawn from sales histories and pricing guides, as well as a matrix indicating the issues in which each character appeared.