## Input data

The input data mainly stem from the European Network of Transmission System Operators for Electricity (ENTSO-E). Following EU Regulation 543/2013, the Transmission System Operators (TSOs) of the EU member states are mandated to collect and publish data about electricity generation, trading, balancing, transmission and consumption in their respective grid areas. These data are transferred to ENTSO-E, which has published them in bulk on its Transparency Platform (TP) [1] since 2015. The ENTSO-E TP also provides data for several countries outside the EU, including Switzerland, Norway, Moldova, and several countries in the Balkans.

Since Brexit, ENTSO-E no longer provides data for Great Britain. Instead, we retrieve data for Great Britain directly from its TSO, National Grid ESO [2], with the exception of close-to-real-time load data, which are currently obtained from Elexon BMRS [3]. To these data we add the generation and load data for Northern Ireland, which we retrieve from EirGrid [4].

Another exception is the Republic of Ireland. Although Ireland is part of the EU, the data provided by ENTSO-E for it have frequent and extensive gaps. Unfortunately, no alternative source provides generation data per type, so these gaps remain in our data. To improve quality nonetheless, we retrieve load data for Ireland from EirGrid [4].

The last exception regarding generation and load data is Türkiye, for which we obtain data from the Turkish energy market operator, EPİAŞ [5].

All transmission data are retrieved from the ENTSO-E TP [1], with the exception of the connection between Great Britain and Ireland, for which data are retrieved from EirGrid [4].

## Balancing generation, load and flows

For various reasons, the data reported about electricity generation, consumption and inter-regional flows are typically not balanced, i.e., they do not fulfill the physical law of energy conservation. However, to determine the probable origin of any quantum of electricity consumed, it is necessary to have a balanced network of generation, transmission and consumption. To obtain this, it is necessary to pre-process any data retrieved from the TSOs, balancing out the (mostly) minor differences between generation, consumption and inter-regional exchanges for every regarded region. This is done following the data reconciliation approach developed in [6].

Say we have retrieved electricity consumption \(C_r(t)\) and generation \(G_r^\tau(t)\) (per type \(\tau\)) for region \(r\) at time \(t\), alongside the positive flows \(F_{r_1\rightarrow r_2}(t)\) from any region \(r_1\) to any other region \(r_2\neq r_1\). Then, energy conservation would require for every region \(r\) and time \(t\) that

$$\sum_\tau G_r^\tau(t)-C_r(t)+\sum_{r_2\neq r}\left(F_{r_2\rightarrow r}(t)-F_{r\rightarrow r_2}(t)\right)=0.$$

But in general, this will not be fulfilled. Instead, we aim to fulfill the energy conservation in a slightly altered form, where we allow every datum to be altered by an additive/subtractive correction term \(\delta\). The revised conservation reads

$$\sum_\tau\left(G_r^\tau+\delta_r^{G\tau}\right)-\left(C_r+\delta_r^C\right)+\sum_{r_2\neq r}\left(\tilde{F}_{r_2\rightarrow r}+\delta_{r_2\rightarrow r}^F\right)=0,$$

where \(\tilde{F}_{r_2\rightarrow r}=F_{r_2\rightarrow r}-F_{r\rightarrow r_2}\). To make sure the corrections are only minor, we obtain them as the solution to this quadratic minimization problem

$$\min\left\{\sum_r\left(\sum_\tau w_G(G_r^\tau)\left(\delta_r^{G\tau}\right)^2+w_C(C_r)\left(\delta_r^C\right)^2\right)+\sum_{(r_1,r_2)}w_F(|\tilde{F}_{r_1\rightarrow r_2}|)\left(\delta_{r_1\rightarrow r_2}^F\right)^2\right\},$$

where the weighting functions are chosen to be

$$w_i(x)=A_i\cdot\frac{x_0}{\max\{x_\mathrm{min},x\}}.$$

Following [6], we choose \(x_0=10\,000\,\mathrm{MW}\) and \(x_\mathrm{min}=100\,\mathrm{MW}\). To decide on the different weighting constants \(A_i\) for generation, consumption and flows, we look at how these values are measured. The easiest to determine are probably the inter-regional flows, as these are recorded directly by the TSOs at the cross-border interconnectors as current or power measurements. We therefore trust these values the most and use a high weighting constant of \(A_F=100\) to keep the corrections to a minimum.

Generation and load data, on the other hand, are typically the sum of several independent measurements combined with electricity market results and probably also estimates or modelling results, especially for solar and wind power. Both therefore contain many potential sources of error, which makes it difficult to decide which of the two to trust more. To decide, we consider the consequences of each choice. If we placed more trust in the load data, the generation data would receive larger corrections. This, however, introduces another source of error, as it is unclear how large corrections should be distributed among the different generation types. Distributing them uniformly in relative terms would ignore that the data quality may differ considerably between generation types. We have therefore decided to use \(A_G=10\) and \(A_C=1\) for the EnergieTakt Karte, and thus place more trust in the generation data. This results in larger corrections for the load, which however introduce no additional error beyond the numerical change in the load data itself.
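To illustrate how the weights steer the corrections, the following minimal sketch balances a single, isolated region. It is a simplification and not the implementation used for the EnergieTakt Karte: the real problem couples all regions through their shared flows and adds inequality constraints, whereas a single region with only the equality constraint has a closed-form solution via a Lagrange multiplier. The function names, the dictionary layout and the example numbers are assumptions made for illustration.

```python
X0_MW = 10_000.0     # x_0 from the text
X_MIN_MW = 100.0     # x_min from the text
A = {"G": 10.0, "C": 1.0, "F": 100.0}  # weighting constants A_G, A_C, A_F from the text


def weight(kind, value_mw):
    """w_i(x) = A_i * x_0 / max(x_min, x); flows enter with their absolute value."""
    x = abs(value_mw) if kind == "F" else value_mw
    return A[kind] * X0_MW / max(X_MIN_MW, x)


def balance_single_region(generation, consumption, net_imports):
    """Corrections minimising sum(w * delta**2) subject to the region's energy balance.

    Simplified sketch: one region only, no non-negativity or capacity constraints.
    """
    # Each term is (kind, key, value, sign in the balance equation).
    terms = [("G", tau, g, +1.0) for tau, g in generation.items()]
    terms.append(("C", "load", consumption, -1.0))
    terms += [("F", nb, f, +1.0) for nb, f in net_imports.items()]

    residual = sum(sign * value for _, _, value, sign in terms)
    inverse_weights = [1.0 / weight(kind, value) for kind, _, value, _ in terms]
    scale = residual / sum(inverse_weights)

    # Lagrange solution: delta_i = -sign_i * residual / (w_i * sum_j 1/w_j)
    return {
        (kind, key): -sign * scale * inv_w
        for (kind, key, _, sign), inv_w in zip(terms, inverse_weights)
    }


# Hypothetical snapshot in MW with an imbalance of -100 MW.
generation = {"Wind Onshore": 3000.0, "Fossil Gas": 2000.0}
consumption = 5200.0
net_imports = {"neighbour": 100.0}

corrections = balance_single_region(generation, consumption, net_imports)
for key, delta in corrections.items():
    print(key, round(delta, 3))
```

In this example almost the entire mismatch is absorbed by the load, a few megawatts by the generation types, and the flow stays practically unchanged, which mirrors the ordering \(A_F>A_G>A_C\).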

Other than energy conservation in each region, we also require the minimization result to fulfill certain additional criteria. These include that any generation and load must not be negative, except for generation from hydro pumped storage, where negative values indicate storage charging. Furthermore, we require that if generation of a type is reported to be zero, the zero is retained unchanged. And lastly, we also require inter-regional flows to be below the respective line capacity, though at the moment, we have implemented this only for large HVDC interconnectors where the nominal capacity is publicly known through e.g. press statements. To be precise though, this does not add much of a constraint anyway, since we set a high weighting factor, \(A_F=100\), which generally inhibits large flow corrections.

## Flow tracing

Once we have obtained a balanced set of generation, consumption and inter-regional flow data, we can move on to tracing the origin of any quantum of electricity consumed. Origin tracing or flow tracing requires some assumptions to be made about the mixing of in-flows in the different regions. For the EnergieTakt Karte, we follow the authors of [7], who differentiate between two flow tracing schemes called *direct coupling* and *aggregated coupling*. As these are in some sense diametrically opposed and deliver largely different results for some regions as indeed shown in [7], we do not choose a single one of the schemes but rather apply both in parallel and thus allow for a direct comparison.

The result of any flow tracing calculation will be a complete set of mixing factors \(q_{r_2\rightarrow r}\) which measure the share of electricity mix in region \(r\) that was produced in region \(r_2\) (here \(r_2\) may indeed be \(r\) itself as well). For any proximate region \(r\) and remote region \(r_2\), the mixing factor is obtained as the sum of all in-flows \(F_{r_3\rightarrow r}\) from its neighboring regions \(r_3\neq r\) each multiplied by the respective mixing factor \(q_{r_2\rightarrow r_3}\) in that neighboring region, plus the locally injected electricity \(P_r^\mathrm{in}\) if \(r=r_2\), and this sum divided by the sum of all out-flows \(F_{r\rightarrow r_3}\) to neighboring regions \(r_3\neq r\) plus the locally extracted electricity \(P_r^\mathrm{out}\),

$$q_{r_2\rightarrow r}=\frac{\delta_{r,r_2}P_r^\mathrm{in}+\displaystyle\sum_{r_3\neq r}q_{r_2\rightarrow r_3}F_{r_3\rightarrow r}}{P_r^\mathrm{out}+\displaystyle\sum_{r_3\neq r}F_{r\rightarrow r_3}}.$$

It always holds that \(\sum_{r_2}q_{r_2\rightarrow r}=1\). From these mixing factors follows a set of origin factors \(\sigma_{r_2\rightarrow r}\) which measure the share of electricity consumed in region \(r\) that was produced in region \(r_2\) (again \(r_2\) may also be \(r\) itself). It again holds that \(\sum_{r_2}\sigma_{r_2\rightarrow r}=1\). How the mixing factors and origin factors are different from and related to each other depends on the scheme used and will be explained in the next sections.

### Direct coupling

The direct coupling scheme assumes that local generation in any region is directly coupled to the network of inter-regional interconnections, i.e., the injected electricity in region \(r\) is assumed to be equal to the total generation in that region,

$$P_r^\mathrm{in}=G_r^\mathrm{tot}=\sum_\tau G_r^\tau.$$

Correspondingly, the extracted electricity in region \(r\) is assumed to be equal to the total consumption,

$$P_r^\mathrm{out}=C_r.$$

With all variables assigned, we re-arrange the defining equation as a linear equation for each set of proximate region \(r\) and remote region \(r_2\),

$$\left(C_r+\sum_{r_3\neq r}F_{r\rightarrow r_3}\right)q_{r_2\rightarrow r}+\sum_{r_3\neq r}\left(-F_{r_3\rightarrow r}\right)q_{r_2\rightarrow r_3}=\delta_{r,r_2}G_r^\mathrm{tot}.$$

Together with the requirement \(\sum_{r_2}q_{r_2\rightarrow r}=1\) for each region \(r\), we obtain a system of linear equations, which is solved to obtain the mixing factors \(q_{r_2\rightarrow r}\).

In the last step, the origin factors \(\sigma_{r_2\rightarrow r}\) are calculated from the mixing factors \(q_{r_2\rightarrow r}\). Since the extracted electricity in any region \(r\) is assumed to be equal to the total consumption, the origin factors for the consumed electricity in region \(r\) are precisely equal to the mixing factors in \(r\),

$$\sigma_{r_2\rightarrow r}=q_{r_2\rightarrow r}.$$
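The linear system above can be sketched for a small network as follows. This is an illustrative example under stated assumptions, not the production code: the three-region chain, the function names and the plain Gaussian elimination are hypothetical choices, and the sketch does not add the normalization \(\sum_{r_2}q_{r_2\rightarrow r}=1\) as a separate equation, because for balanced input data it follows from the other equations. The injected and extracted electricity are passed in as arguments, so direct coupling corresponds to passing total generation and consumption.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(m[i][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    x = [0.0] * n
    for row in reversed(range(n)):
        known = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - known) / m[row][row]
    return x


def mixing_factors(regions, p_in, p_out, flows):
    """Return q[r][r2], the share of the mix in region r that was injected in r2.

    flows[(r1, r2)] is the positive flow from r1 to r2.
    """
    matrix = []
    for r in regions:
        outflow = sum(f for (src, _), f in flows.items() if src == r)
        # Off-diagonal entries: minus the in-flow from each neighbour r3.
        row = [-flows.get((r3, r), 0.0) for r3 in regions]
        # Diagonal entry: extracted electricity plus all out-flows.
        row[regions.index(r)] = p_out[r] + outflow
        matrix.append(row)

    q = {r: {} for r in regions}
    for r2 in regions:
        rhs = [p_in[r] if r == r2 else 0.0 for r in regions]
        for r, value in zip(regions, solve_linear(matrix, rhs)):
            q[r][r2] = value
    return q


# Hypothetical balanced chain A -> B -> C (all values in MW).
regions = ["A", "B", "C"]
generation = {"A": 100.0, "B": 50.0, "C": 30.0}
consumption = {"A": 60.0, "B": 70.0, "C": 50.0}
flows = {("A", "B"): 40.0, ("B", "C"): 20.0}

# Direct coupling: P_in = total generation, P_out = consumption, sigma = q.
q = mixing_factors(regions, generation, consumption, flows)
for r in regions:
    print(r, {r2: round(share, 4) for r2, share in q[r].items()})
```

In this example, region C covers 60% of its consumption locally, while the remainder is split between B and A in proportion to the mix that B passes on.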

### Aggregated coupling

The aggregated coupling scheme assumes that only the aggregated net-export or net-import of a region is coupled to the network of inter-regional interconnections, i.e., the injected electricity in region \(r\) is assumed to be equal to the net-export of that region,

$$P_r^\mathrm{in}=\operatorname{EX}_r=\max\{G_r^\mathrm{tot}-C_r,0\}.$$

Correspondingly, the extracted electricity in region \(r\) is assumed to be equal to net-import,

$$P_r^\mathrm{out}=\operatorname{IM}_r=\max\{C_r-G_r^\mathrm{tot},0\}.$$

Note that at most one of the two is non-zero. With all variables assigned, we again re-arrange the defining equation as a linear equation for each set of proximate region \(r\) and remote region \(r_2\),

$$\left(\operatorname{IM}_r+\sum_{r_3\neq r}F_{r\rightarrow r_3}\right)q_{r_2\rightarrow r}+\sum_{r_3\neq r}\left(-F_{r_3\rightarrow r}\right)q_{r_2\rightarrow r_3}=\delta_{r,r_2}\operatorname{EX}_r.$$

Together with the requirement \(\sum_{r_2}q_{r_2\rightarrow r}=1\) for each region \(r\), we obtain again a system of linear equations, which is solved to obtain the mixing factors \(q_{r_2\rightarrow r}\).

In the last step, the origin factors \(\sigma_{r_2\rightarrow r}\) are calculated from the mixing factors \(q_{r_2\rightarrow r}\). Since injected and extracted electricity in any region \(r\) are assumed to be different from generation and consumption, the origin factors for the consumed electricity in region \(r\) are not equal to the mixing factors in \(r\). Instead, they are given by

$$\sigma_{r\rightarrow r}=\begin{cases}1 & \text{if }G_r^\mathrm{tot}\geq C_r,\\ \displaystyle\frac{G_r^\mathrm{tot}}{C_r} & \text{otherwise},\end{cases}$$

$$\sigma_{r_2\neq r\rightarrow r}=\begin{cases}0 & \text{if }G_r^\mathrm{tot}\geq C_r,\\ q_{r_2\rightarrow r}\cdot\displaystyle\frac{\operatorname{IM}_r}{C_r} & \text{otherwise}.\end{cases}$$
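The case distinction above can be written as a short function. This is a minimal sketch that takes the mixing factors of one region as given; the function name and the example numbers are hypothetical. The mixing factors in the example were worked out by hand for an importing region whose entire in-flow originates from a single exporter A.

```python
def origin_factors_aggregated(region, q_row, generation_total, consumption):
    """Origin factors sigma[r2] for `region` under aggregated coupling.

    q_row[r2] is the mixing factor q_{r2 -> region} from the aggregated scheme.
    """
    if generation_total >= consumption:
        # Net exporter (or balanced): all consumption is covered locally.
        return {r2: (1.0 if r2 == region else 0.0) for r2 in q_row}

    net_import = consumption - generation_total
    sigma = {
        r2: q * net_import / consumption
        for r2, q in q_row.items()
        if r2 != region
    }
    sigma[region] = generation_total / consumption
    return sigma


# Hypothetical importing region B: 50 MW generation, 70 MW consumption.
sigma_b = origin_factors_aggregated("B", {"A": 1.0, "B": 0.0, "C": 0.0}, 50.0, 70.0)
print(sigma_b)

# Hypothetical exporting region A: everything consumed is local.
sigma_a = origin_factors_aggregated("A", {"A": 1.0, "B": 0.0, "C": 0.0}, 100.0, 60.0)
print(sigma_a)
```

For the importing region, the local share is \(50/70\) and the imported share \(20/70\) is attributed entirely to A, so the origin factors again sum to one.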

## Calculation of emission intensities

When estimating the regional GHG emission intensity, one can either regard only the local generation in that region, in which case we arrive at an average *production-based* emission intensity value, or one can also take into account the inter-regional exchanges of that region with its neighbors, in which case we arrive at an average *consumption-based* emission intensity value.

### Production-based

The average production-based emission intensity \(\varepsilon_r^G\) in any region \(r\) is obtained by averaging the emission factors \(\varepsilon_r^\tau\) of the different generation types in that region weighted according to the respective generation \(G_r^\tau\),

$$\varepsilon_r^G=\frac{\displaystyle\sum_\tau\varepsilon_r^\tau\cdot G_r^\tau}{\displaystyle\sum_\tau G_r^\tau}.$$

This calculation method assumes that all power plants of a given generation type release the same amount of emissions per quantum of electricity generated, and thus neglects possible differences between individual power plants. The production-based emission intensities obtained this way are independent of the flow tracing calculation. The emission factors we use are listed below.
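As a worked example of the weighted average, consider the following minimal sketch. The function name and the generation values are hypothetical; the three emission factors are taken from the emission factor table of this document. Returning `None` when there is no generation is a choice made for this sketch only.

```python
# Subset of the emission factors from the table, in gCO2eq/kWh.
EMISSION_FACTORS = {"Fossil Gas": 500.0, "Nuclear": 5.0, "Wind Onshore": 11.0}


def production_intensity(generation_mw, factors=EMISSION_FACTORS):
    """Generation-weighted average emission factor of one region."""
    total = sum(generation_mw.values())
    if total <= 0:
        return None  # no meaningful average without generation (sketch-only choice)
    weighted = sum(factors[tau] * g for tau, g in generation_mw.items())
    return weighted / total


# Hypothetical generation snapshot in MW.
snapshot = {"Fossil Gas": 200.0, "Wind Onshore": 300.0, "Nuclear": 500.0}
intensity = production_intensity(snapshot)
print(round(intensity, 1))  # (500*200 + 11*300 + 5*500) / 1000 = 105.8
```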

### Emission factors

At the moment, we use a single emission factor for each generation type across all considered regions, i.e., we neglect differences between the respective power plants in different countries. The emission factors were estimated and published by the Intergovernmental Panel on Climate Change (IPCC) [8] and the United Nations Economic Commission for Europe (UNECE) [9], respectively. They are so-called life-cycle emission factors, i.e., they include not only direct emissions from electricity generation but also emissions from building, commissioning and decommissioning the respective power plants; and furthermore, they include not only carbon dioxide (CO\(_2\)) but also other greenhouse gases (GHG) like methane.

For some generation types, the IPCC [8] and UNECE [9] reports state significantly different values or multiple values each corresponding to a subset of power plants within the regarded generation type. In these cases, we made an estimation ourselves within the range of values given in the reports. The final values we utilize are listed in the following table.

| Generation type | Emission factor (gCO\(_2\)eq/kWh) | Source |
|---|---|---|
| Biomass | 230 | [8] |
| Lignite | 1100 | Estimated based on [9] |
| Fossil Gas | 500 | Estimated based on [8], [9] |
| Hard Coal | 850 | Estimated based on [8], [9] |
| Geothermal | 38 | [8] |
| Hydro | 24 | [8] |
| Nuclear | 5 | [9] |
| Other | 700 | Estimated based on Coal and Gas |
| Other Renewable | 100 | Estimated based on Biomass, Solar, Wind |
| Solar | 35 | Estimated based on [8], [9] |
| Wind Offshore | 12 | [8] |
| Wind Onshore | 11 | [8] |

### Consumption-based

The average consumption-based emission intensity \(\varepsilon_r^C\) in any region \(r\) is obtained by averaging the production-based emission intensities \(\varepsilon_{r_2}^G\) in all regions \(r_2\) weighted according to the respective origin factor \(\sigma_{r_2\rightarrow r}\),

$$\varepsilon_r^C=\sum_{r_2}\varepsilon_{r_2}^G\cdot\sigma_{r_2\rightarrow r}.$$

As we obtain different origin factors for the direct and aggregated flow tracing scheme, we also get two different consumption-based emission intensities for each region.
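The weighted sum can be sketched in a few lines. The origin factors and production-based intensities below are hypothetical numbers chosen for illustration, and the function name is an assumption.

```python
def consumption_intensity(origin_factors, production_intensities):
    """Sum of production-based intensities weighted by the origin factors."""
    return sum(
        sigma * production_intensities[r2]
        for r2, sigma in origin_factors.items()
    )


# Hypothetical origin factors for one region (they sum to one) ...
sigma_r = {"A": 0.2, "B": 0.3, "C": 0.5}
# ... and hypothetical production-based intensities in gCO2eq/kWh.
eps_g = {"A": 100.0, "B": 400.0, "C": 50.0}

eps_c = consumption_intensity(sigma_r, eps_g)
print(round(eps_c, 1))  # 0.2*100 + 0.3*400 + 0.5*50 = 165.0
```

Running the same function once with the origin factors from direct coupling and once with those from aggregated coupling yields the two consumption-based values per region.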

### Storage units

At the moment, the only type of storage for which data are available is Hydro Pumped Storage. Throughout the above calculation steps, storage units are considered either as generation units when they produce electricity or as additional load when they consume electricity (i.e., charge the storage). As a result, the storage units are also attributed emissions while charging, according to the consumption-based emission intensity in their region. To prevent these emissions from being lost, they have to be reallocated to the electricity that the storage units generate later. We do this by assigning an emission factor to storage electricity generation, just as for any other generation type. Since the consumption-based emission intensity varies greatly between regions, storage electricity generation needs a different emission factor in each region.

The emission factors for storage electricity generation are estimated as the yearly average consumption-based emission intensity in the respective region, weighted by the storage consumption in that region wherever such values are reported and by the region's normal load otherwise. Additionally, we factor in a round-trip efficiency of 80%. For the current year, the emission factors from the previous year are used.
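A minimal sketch of this estimate is given below. It rests on an assumption that the text does not spell out: the round-trip efficiency is taken to enter by division, since only 80% of the electricity used for charging is recovered, so the emissions taken up per kWh charged are spread over 0.8 kWh generated. The function name and the two-interval example are hypothetical.

```python
ROUND_TRIP_EFFICIENCY = 0.8  # from the text


def storage_emission_factor(intensities, weights, efficiency=ROUND_TRIP_EFFICIENCY):
    """Weighted average consumption-based intensity, scaled by the storage losses.

    intensities: consumption-based intensity per time step (gCO2eq/kWh)
    weights:     storage consumption per time step, or the region's load
                 where no storage consumption is reported
    Assumption of this sketch: the efficiency enters as a division.
    """
    total_weight = sum(weights)
    average = sum(e * w for e, w in zip(intensities, weights)) / total_weight
    return average / efficiency


# Hypothetical example: the storage charges mostly while the grid mix is clean.
factor = storage_emission_factor([100.0, 300.0], [300.0, 100.0])
print(factor)  # weighted average 150 gCO2eq/kWh, divided by 0.8 -> 187.5
```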

## Discussion & caveats

### Incomplete or missing input data

Both the close-to-real-time and the historical input data are frequently incomplete or missing. While the balancing procedure is meant to fix the generally expected (minor) inconsistencies, it is neither intended nor appropriate for correcting larger issues. As a first step to avoid unrealistic balancing corrections, we therefore exclude a region from the entire calculation if generation data, load data, or both are missing for it. If transmission data are reported, though, we still include the excluded region as a network node that is balanced purely by the inter-regional flows. As soon as non-zero generation and load data are available, the region is fully taken into account again.

Apart from the case of completely missing data, various combinations of incomplete data are also conceivable. As this is difficult to detect algorithmically, the affected region is still calculated normally, with the balancing procedure indeed making major corrections if necessary. But we have built in a basic integrity check that will detect the most serious cases of incompleteness and flag the corresponding data (the EnergieTakt Karte will display a warning sign in this case). However, due to the larger corrections applied by the balancing procedure, neighboring regions can also be affected, which is not explicitly indicated.

## References

[1] ENTSO-E Transparency Platform. https://transparency.entsoe.eu/

[2] National Grid ESO Data Portal. https://www.nationalgrideso.com/data-portal

[3] Elexon BMRS. https://bmreports.com/

[4] EirGrid Smart Grid Dashboard. https://smartgriddashboard.com/

[5] EPİAŞ Transparency Platform. https://seffaflik.epias.com.tr/

[6] J. A. de Chalendar, S. M. Benson. *A physics-informed data reconciliation framework for real-time electricity and emissions tracking*. Applied Energy 304, 117761 (2021). https://doi.org/10.1016/j.apenergy.2021.117761

[7] M. Schäfer, B. Tranberg, D. Jones, A. Weidlich. *Tracing carbon dioxide emissions in the European electricity markets*. 17th International Conference on the European Energy Market (2020). https://doi.org/10.1109/EEM49802.2020.9221928

[8] S. Schlömer, T. Bruckner, L. Fulton, E. Hertwich et al. *Annex III: Technology-specific cost and performance parameters*. Climate Change 2014: Mitigation of Climate Change. Contribution of Working Group III to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change (2014). https://www.ipcc.ch/site/assets/uploads/2018/02/ipcc_wg3_ar5_annex-iii.pdf

[9] United Nations Economic Commission for Europe. *Carbon Neutrality in the UNECE Region: Integrated Life-cycle Assessment of Electricity Sources*. (2022). https://unece.org/sites/default/files/2022-04/LCA_3_FINAL%20March%202022.pdf