
wri / Global Power Plant Database

A comprehensive, global, open source database of power plants

Global Power Plant Database

This project aims to build a comprehensive, global, open source database of power plants. To get involved, email the team or fork the repo and code! To learn more about how to contribute to this repository, read the CONTRIBUTING document.

The latest database release (v1.2.0) is available in CSV format here under a Creative Commons-Attribution 4.0 (CC BY 4.0) license. A bleeding-edge version is in the output_database directory of this repo.

All Python source code is available under an MIT license.

This work is made possible and supported by Google, among other organizations.

Database description

The Global Power Plant Database is built in several steps.

  • The first step involves gathering and processing country-level data. In some cases, these data are read automatically from official government websites; the code to implement this is in the build_databases directory.
  • In other cases we gather country-level data manually. These data are saved in raw_source_files/WRI and processed with the build_database_WRI.py script in the build_databases directory.
  • The second step is to integrate data from different sources, particularly for geolocation of power plants and annual total electricity generation. Some of these different sources are multi-national databases. For this step, we rely on offline work to match records; the concordance table mapping record IDs across databases is saved in resources/master_plant_concordance.csv.
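The concordance lookup described in the second step can be sketched as a mapping from one database's record ID to another's. This is only an illustration under stated assumptions: the column names (`wri_id`, `geo_id`), the IDs, and the coordinates are hypothetical, and the actual layout of resources/master_plant_concordance.csv may differ.

```python
import csv
import io

# Hypothetical concordance rows; the real file's columns may differ.
CONCORDANCE_CSV = """wri_id,geo_id
WRI0000001,GEO0000042
WRI0000002,
"""

# Hypothetical geolocation records keyed by the second database's ID.
GEO_LOCATIONS = {"GEO0000042": (12.5, -45.25)}


def load_concordance(text):
    """Return a dict mapping wri_id -> geo_id, skipping unmatched rows."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["wri_id"]: row["geo_id"] for row in reader if row["geo_id"]}


def lookup_location(wri_id, concordance, locations):
    """Return (latitude, longitude) for a record, or None if unmatched."""
    geo_id = concordance.get(wri_id)
    return locations.get(geo_id) if geo_id else None


concordance = load_concordance(CONCORDANCE_CSV)
print(lookup_location("WRI0000001", concordance, GEO_LOCATIONS))
print(lookup_location("WRI0000002", concordance, GEO_LOCATIONS))
```

Keeping the matches in a table like this separates the slow, offline matching work from the build itself, which only needs a dictionary lookup.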

Throughout the processing, we represent power plants as instances of the PowerPlant class, defined in powerplant_database.py. The final database is in a flat-file CSV format.
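As a rough illustration of going from plant objects to a flat file, the sketch below serializes a list of objects to CSV text. `PlantRecord` and its field names are hypothetical stand-ins chosen for this example; they are not the `PowerPlant` class defined in powerplant_database.py, whose attributes may differ.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from typing import Optional


@dataclass
class PlantRecord:
    """Hypothetical stand-in for a plant object; not the repo's PowerPlant."""
    plant_id: str
    name: str
    country: str
    capacity_mw: float
    latitude: Optional[float] = None
    longitude: Optional[float] = None


def to_flat_csv(plants):
    """Serialize plant objects to CSV text, one row per plant."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=[f.name for f in fields(PlantRecord)],
        lineterminator="\n",
    )
    writer.writeheader()
    for plant in plants:
        # Missing values (None) are written as empty cells.
        writer.writerow(asdict(plant))
    return buffer.getvalue()


plants = [
    PlantRecord("WRI0000001", "Example Hydro", "BRA", 120.0, -15.5, -47.75),
    PlantRecord("WRI0000002", "Example Solar", "IND", 35.5),
]
csv_text = to_flat_csv(plants)
print(csv_text)
```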

Key attributes of the database

The database includes the following indicators:

  • Plant name
  • Fuel type(s)
  • Generation capacity
  • Country
  • Ownership
  • Latitude/longitude of plant
  • Data source & URL
  • Data source year
  • Annual generation

We will expand this list in the future as we extend the database.

Fuel Type Aggregation

We define the "Fuel Type" attribute of our database based on common fuel categories. In order to parse the different fuel types used in our various data sources, we map fuel name synonyms to our fuel categories here. We plan to expand the database in the future to report more disaggregated fuel types.
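A synonym mapping of this kind can be sketched as an inverted dictionary. The categories and synonyms below are hypothetical examples chosen for illustration, not the project's actual mapping.

```python
# Hypothetical synonym table; the project's actual mapping is not shown here.
FUEL_SYNONYMS = {
    "Coal": ["coal", "lignite", "anthracite"],
    "Hydro": ["hydro", "hydroelectric", "water"],
    "Solar": ["solar", "pv", "photovoltaic"],
}

# Invert to a lookup of normalized synonym -> category.
SYNONYM_TO_CATEGORY = {
    synonym: category
    for category, synonyms in FUEL_SYNONYMS.items()
    for synonym in synonyms
}


def standardize_fuel(raw_name):
    """Map a source-specific fuel name to a category, or None if unknown."""
    return SYNONYM_TO_CATEGORY.get(raw_name.strip().lower())


print(standardize_fuel(" Lignite "))
print(standardize_fuel("PV"))
print(standardize_fuel("unobtainium"))
```

Returning None for unknown names, rather than guessing, makes unmapped fuel names easy to find and add to the table.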

Combining Multiple Data Sources

A major challenge for this project is that data come from a variety of sources, including government ministries, utility companies, equipment manufacturers, crowd-sourced databases, financial reports, and more. The reliability of the data varies, and in many cases there are conflicting values for the same attribute of the same power plant from different data sources. To handle this, we match and de-duplicate records and then develop rules for which data sources to report for each indicator. We provide a clear data lineage for each datum in the database. We plan to ultimately allow users to choose alternative rules for which data sources to draw on.
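One simple way to express such rules is a per-indicator priority list of sources, with the chosen source returned alongside the value so the lineage is preserved. This is a sketch under assumptions: the source names, the priority orders, and the `resolve` helper are hypothetical and do not describe the project's actual rules.

```python
# Hypothetical priority order per indicator: earlier sources win.
SOURCE_PRIORITY = {
    "capacity_mw": ["national_ministry", "utility_report", "crowd_sourced"],
    "owner": ["utility_report", "national_ministry", "crowd_sourced"],
}


def resolve(indicator, candidates):
    """Pick one value for an indicator and report its source.

    candidates maps source name -> reported value for a single plant.
    Returns (value, source), or (None, None) if no ranked source has a value.
    """
    for source in SOURCE_PRIORITY[indicator]:
        value = candidates.get(source)
        if value is not None:
            return value, source
    return None, None


# Two sources disagree on capacity; the higher-ranked source is reported.
conflicting = {"crowd_sourced": 510.0, "utility_report": 500.0}
value, source = resolve("capacity_mw", conflicting)
print(value, source)
```

Because the priorities live in a plain table, letting users choose alternative rules would amount to swapping in a different `SOURCE_PRIORITY`.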

To the maximum extent possible, we read data automatically from trusted sources and integrate them into the database. Our current strategy involves these steps:

  • Automate data collection from machine-readable national data sources where possible.
  • For countries where machine-readable data are not available, gather and curate power plant data by hand, and then match these power plants to plants in other databases, including the Global Energy Observatory (GEO) and CARMA (see below), to determine their geolocation.
  • For a limited number of countries with small total power-generation capacity, use data directly from GEO.

A table describing the data source(s) for each country is listed below.

Finally, we are examining ways to automatically incorporate data from the following supra-national data sources:

ID numbers

We assign a unique ID to each line of data that we read from each source. In some cases, these lines represent plant-level data, while in other cases they represent unit-level data. For unit-level data, we commonly perform an aggregation step and assign a new, unique plant-level ID to the result. For plants drawn from machine-readable national data sources, the reference ID is formed from a three-letter country code (ISO 3166-1 alpha-3) and a seven-digit number. For plants drawn from other databases (including the manually maintained WRI dataset), the reference ID is formed from a variable-size prefix code and a seven-digit number.
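The ID format and the unit-to-plant aggregation can be sketched as follows. This is an illustration only: the helper names are hypothetical, zero-padding the number to seven digits is an assumption, and grouping units by plant name is a simplification of whatever key the real sources provide.

```python
from collections import defaultdict


def make_id(prefix, number):
    """Join a prefix code and a zero-padded seven-digit number."""
    if not 0 <= number <= 9_999_999:
        raise ValueError("number must fit in seven digits")
    return f"{prefix}{number:07d}"


def aggregate_units(units, prefix, start=1):
    """Sum unit capacities per plant name and assign new plant-level IDs.

    units is a list of (plant_name, unit_capacity_mw) pairs.
    """
    totals = defaultdict(float)
    for plant_name, capacity_mw in units:
        totals[plant_name] += capacity_mw
    # Sort by name so that ID assignment is deterministic.
    return {
        make_id(prefix, start + offset): (name, capacity)
        for offset, (name, capacity) in enumerate(sorted(totals.items()))
    }


units = [("Alpha", 100.0), ("Beta", 50.0), ("Alpha", 150.0)]
plants = aggregate_units(units, "USA")
print(plants)
print(make_id("WRI", 42))
```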

Power plant matching

In many cases our data sources do not include power plant geolocation information. To address this, we attempt to match these plants with records in the GEO and CARMA databases so that we can use their geolocation data. We use an elastic search matching technique developed by Enipedia to match on plant name, country, capacity, and location, and we store confirmed matches in a concordance file. This matching procedure is complex, and the algorithm can sometimes wrongly match two power plants or fail to match two entries for the same power plant. We are investigating the Duke framework for matching, which would allow us to do the matching offline.
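To make the idea of attribute-based matching concrete, here is a deliberately simplified sketch. It is not the Enipedia technique or the Duke framework: it swaps in Python's standard-library `difflib.SequenceMatcher` for name similarity and adds a same-country filter and a capacity tolerance. The thresholds, field names, and sample records are all assumptions.

```python
from difflib import SequenceMatcher


def name_similarity(a, b):
    """Case-insensitive similarity ratio between two plant names (0 to 1)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_match(plant, candidates, min_name_score=0.8, capacity_tolerance=0.1):
    """Return the best candidate for plant, or None.

    A candidate qualifies only if it is in the same country and its
    capacity is within capacity_tolerance (as a fraction) of the plant's.
    """
    best, best_score = None, min_name_score
    for candidate in candidates:
        if candidate["country"] != plant["country"]:
            continue
        gap = abs(candidate["capacity_mw"] - plant["capacity_mw"])
        if gap > capacity_tolerance * plant["capacity_mw"]:
            continue
        score = name_similarity(plant["name"], candidate["name"])
        if score >= best_score:
            best, best_score = candidate, score
    return best


plant = {"name": "Example Dam", "country": "BRA", "capacity_mw": 100.0}
candidates = [
    {"name": "Example Dam", "country": "ARG", "capacity_mw": 100.0},
    {"name": "example dam", "country": "BRA", "capacity_mw": 104.0},
    {"name": "Example Dam", "country": "BRA", "capacity_mw": 300.0},
]
print(best_match(plant, candidates))
```

Even in this toy form, the failure modes noted above are visible: a threshold that is too loose pairs different plants, and one that is too strict misses plants whose names are spelled differently across sources.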
