Databases with secondary data are the fourth building block for carbon footprints. This chapter discusses the landscape of Life Cycle Assessment (LCA) databases, and where the data in those databases come from. The chapter also discusses the geographic specificity of existing LCA databases, and the scope to improve their interoperability.
Measuring Carbon Footprints of Agri‑Food Products
7. Databases with secondary data
A recurring theme in this report is the importance of primary data. However, this should not obscure the role of secondary data, such as data on average emissions of agri-food products or inputs. Given the challenges in collecting primary data, calculations will initially need to rely mostly on secondary data.
One use of secondary data was already discussed in the previous chapter: some farm level calculation tools rely on secondary data to include an estimate of ‘embedded’ emissions of farm inputs such as fertilisers, feed, or electricity. The lens taken in this report, of a system of reliable and widespread carbon footprints in food systems, suggests that this should be a temporary workaround until supplier-specific carbon footprint estimates can be obtained directly from input suppliers. The next chapter discusses how such information could be efficiently communicated through the supply chain and easily incorporated in carbon footprint calculations, using the cradle-to-gate principle. But even when the use of primary data is scaled up, secondary data may continue to play an important role, for example in providing a ‘default’ value in the absence of primary data.
7.1. The landscape of LCA databases
Secondary data are typically found in Life Cycle Assessment (LCA) databases, sometimes also referred to as Life Cycle Inventory (LCI) databases.1 These databases provide estimates of carbon footprints (and other environmental impacts) of a wide range of products, based on the LCA methodology (discussed in Chapter 4). The estimates themselves could be based on a range of sources, such as direct measurement, modelling based on direct measurement of proxy variables, data taken from other LCA databases, expert judgment, or a combination of these sources (Hauschild, Rosenbaum and Olsen, 2018[1]). This is discussed further in the next section.
The landscape of LCA databases is vast and interlinked. The GHG Protocol lists at least 53 third-party LCA databases that can be used for constructing product life cycle or corporate value chain GHG inventories.2 These include industry specific databases (e.g. Worldsteel Association, ICE for building materials, BUWAL for packaging materials), country specific databases that may feature specific industries (e.g. US Lifecycle Inventory Database, Canadian Raw Materials Database, Australian Life Cycle Inventory Database, Chinese Life Cycle Database), and other multisector databases that provide access to multiple databases and datasets at once, such as Ecoinvent. There also exist LCA software tools that help users combine data from different databases, calculate environmental impacts, and generate reports. They vary in complexity and features, ranging from free tools such as OpenLCA to commercial tools such as SimaPro.
Among the many LCA databases relevant for food systems, a few stand out because of their size, widespread use, or specialisation. Table 7.1 presents key characteristics of six of these databases: Agri-footprint, Agribalyse, Ecoinvent, Sphera (formerly known as GaBi), the Global Feed LCA Institute database (GFLI), and the World Food LCA database (WFLDB).
As this overview shows, databases differ in terms of geographical coverage, specialisation, and other characteristics. While Agribalyse covers a single country (France) and Agri-footprint covers 63 countries, other databases claim global coverage. The number of products varies too, from some 130 products for WFLDB to thousands for some of the other databases. However, not all of the possible datasets (that is, “country x product” combinations) exist in the databases; the number of datasets varies from some 1 800 for GFLI to more than 20 000 for Ecoinvent.3
Table 7.1. Key characteristics of selected LCA databases relevant for food systems
| | Agri-footprint | Agribalyse | Ecoinvent | Sphera (GaBi) | GFLI database | World Food LCA Database (WFLDB) |
|---|---|---|---|---|---|---|
| Developer | Blonk Consultants (private) | French Environment and Energy Management Agency (public) | Ecoinvent (non-profit) | Sphera (private) | The Global Feed LCA Institute (non-profit) | Quantis (private) |
| Sector / products | Food, feed and agricultural intermediate products | Agri-food products | General (all sectors) | General (all sectors) | Feed ingredients | Agri-food products |
| Created | 2014 | 2010 | 2003 | 1989 | 2020 | 2012 |
| Latest version (as of June 2024) | v6.3 Aug 2022 | v3.1.1 June 2023 | v3.10 Jan 2024 | v2022.2 Nov 2022 | v2.1 Oct 2023 | v3.9 Oct 2023 |
| Datasets | Unclear | Unclear | 20 000+ | 18 000+ | 1 800+ | 2 600+ |
| Products | 5 000 products and processes | 2 517 products | 3 500 products and services | Unclear | Unclear | 130+ |
| Geographic scope | 63 countries | France | Global | Global | Global | 150+ countries |
| Impact categories | 19 | 14 | 23 | 13 | 16 to 19 | Unclear |
| Allocation approach | Economic, mass, energy | Economic (biophysical for dairy, mass for cheese) | Economic | Physical, economic | Economic, mass, energy | Physical, economic (but may differ for different supply chain stages) |
| Alignment on standards | ISO 14040; ISO 14044; PEF (2021); PEFCR (2018); SBTi’s Forest Land and Agriculture Guidance (FLAG) | ISO 14040; LEAP and PEF | ISO 14040; ISO 14044; ISO/TS 14048 | GHG Protocol, ISO 14040/44, EN 15804+A2, ILCD DN entry level, subset of data: EF 2.0 and 3.0 | FAO LEAP feed guidelines (2016), LEAP feed additives guidelines (2020), Feed PEF database methodology (2017), and Feed PEFCR (2018) | ISO 14040; ISO 14044; ILCD; PEF |
| Alignment on IPCC 2019 and most recent GWP | Yes | Yes | Yes | Yes | Yes | Yes |
| Cost | Three different pricing options (Research, Commercial, Developer) + through SimaPro and openLCA | Full version through SimaPro, openLCA and Brightway; Simplified version as Excel files | Four different pricing options + through other dataset initiatives and software tools (e.g. SimaPro) | Partly free, partly with a license fee | Four different pricing options (Membership, Commercial, Developer, Per project) | Only through SimaPro subscription with five different pricing options available |
| Languages | English + SimaPro provides 13 different languages | English, French | English, German + SimaPro provides 13 different languages | English, German | English | English + SimaPro provides 13 different languages |
Note: “Unclear” means information not found in online documentation of the databases.
Source: OECD analysis based on online database documentation.
While the focus in this report is on carbon footprints, LCA databases typically include more than a dozen different impact categories, often including land use, water use, eutrophication, and acidification, which are of particular relevance to food systems. All LCA databases comply with the ISO standards 14040 and 14044, which define the LCA methodology. Some also comply with data documentation standards (ISO/TS 14048), data standards (the EU’s ILCD) as well as sector or product specific standards such as FAO’s LEAP or the EU Product Environmental Footprint rules (Chapter 4). All LCA databases seem to align with the latest IPCC (2019) recommendations.
A preliminary assessment by GHG Protocol shows that WFLDB, Ecoinvent and Sphera (GaBi) include most of the emission accounting metrics relevant for the GHG Protocol Land Sector and Removals Guidance, with the exception of land use-related variables such as direct and indirect land use change or carbon opportunity costs.
Databases typically see frequent updates; as of June 2024, all databases had been updated within the last two years. Some LCA databases, WFLDB for example, can be accessed exclusively through paid-for software such as SimaPro. By contrast, openLCA is open source software and hence free, and its developer provides access to datasets formatted for use in openLCA (some free, some paid).4 Sphera (GaBi) is both a dataset and a software tool.
Databases also vary in terms of cost. The publicly funded Agribalyse database is free while others have different pricing options. A perpetual commercial license to the Ecoinvent database costs EUR 3 800 for a single user; a commercial license for Agri-footprint starts at EUR 1 160 per year. Alternatively, some LCA software packages are bundled with licenses to LCA databases. For example, a SimaPro commercial license starts at EUR 5 900 per year but includes access to the Ecoinvent and Agri-footprint databases (among others).
7.2. Where do the data in an LCA database come from?
As noted earlier, data sources can include direct measurement, modelling based on direct measurement of proxy variables, data taken from other LCA databases, expert judgment, or a combination of these (Hauschild, Rosenbaum and Olsen, 2018[1]).
As an example, consider an estimate for the carbon footprint of fluid milk at farm gate in Switzerland. This estimate could be based on observations from a sample of dairy farmers in the country. On-farm GHG emissions could be modelled based on farm level data, as with the farm level tools discussed in the previous chapter. Embedded emissions of feed and fertiliser could be calculated by combining input use observed on those farms with average emission factors from existing LCA databases. Transparency on assumptions and methods is important: ideally, underlying activity data are stored so that results can be re-calculated when improved models become available.5
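The calculation described above can be sketched in a few lines. This is only an illustration under stated assumptions: the emission factors, the farm records, and the function names are hypothetical and are not taken from any LCA database or from the Swiss example itself.

```python
# Hypothetical illustration: every number below is invented for the example.

# Average emission factors for purchased inputs (kg CO2e per kg of input),
# as they might be looked up in an existing LCA database.
EMISSION_FACTORS = {
    "concentrate_feed": 0.5,
    "nitrogen_fertiliser": 4.0,
}

# Activity data for a small sample of farms (all quantities in kg per year).
FARMS = [
    {
        "on_farm_kg_co2e": 400_000,  # modelled on-farm emissions
        "input_use_kg": {"concentrate_feed": 100_000, "nitrogen_fertiliser": 5_000},
        "milk_kg": 500_000,
    },
    {
        "on_farm_kg_co2e": 250_000,
        "input_use_kg": {"concentrate_feed": 40_000, "nitrogen_fertiliser": 2_500},
        "milk_kg": 250_000,
    },
]


def embedded_emissions(input_use_kg):
    """Embedded emissions: observed input use times average emission factors."""
    return sum(qty * EMISSION_FACTORS[name] for name, qty in input_use_kg.items())


def total_emissions(farm):
    """On-farm emissions plus embedded emissions of purchased inputs."""
    return farm["on_farm_kg_co2e"] + embedded_emissions(farm["input_use_kg"])


def farm_gate_footprint(farm):
    """Carbon footprint of one farm's milk, in kg CO2e per kg of milk."""
    return total_emissions(farm) / farm["milk_kg"]


def sample_average(farms):
    """Output-weighted average footprint across the sampled farms."""
    return sum(total_emissions(f) for f in farms) / sum(f["milk_kg"] for f in farms)


for farm in FARMS:
    print(f"{farm_gate_footprint(farm):.2f}")  # 0.94, then 1.12
print(f"{sample_average(FARMS):.2f}")  # 1.00
```

Keeping the activity data (`FARMS`) separate from the emission factors means the average can be re-calculated when improved factors or models become available, without collecting the farm data again.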
Estimates in a secondary database could therefore be based on primary data. What makes it nevertheless a secondary database is that the estimates will be used as a substitute for primary data in another context. To continue the example of milk in Switzerland, if a Swiss dairy processor wants to calculate the carbon footprint of its cheese, it might decide not to collect primary data from the farmers that supply the milk, but instead use the estimate for fluid milk at farm gate in Switzerland.
As noted in Chapter 4, important parameters for an LCA are the definition of the relevant system boundaries, and the allocation rules used. LCA databases differ in their methodological choices, which are typically documented in guidelines. For example, Nemecek et al. (2019[2]) provide guidelines for the World Food LCA Database (version 3.5), including general principles around the structure of the database, naming conventions, system boundaries, required representativeness of the data in terms of geographical, temporal, and technological coverage, and allocation rules. The guidelines also cover principles for data collection (e.g. how to identify the most appropriate data source) and specify which emission models are used to translate activity data into Life Cycle Inventory data. For example, the guidelines specify that the IPCC Tier 2 approach is used to estimate methane emissions from livestock.
Because LCA databases make different methodological choices, their results will usually differ. For example, Pauer et al. (2020) found that the Ecoinvent 3.6 database led to higher environmental impacts compared to GaBi (now Sphera), because Ecoinvent datasets often include more background processes. LCA databases may also make different modelling choices, for example on whether to use Tier 2 or Tier 3 models, and if Tier 3, which ones.
As with farm level calculation tools, different choices lead to different results. Such differences have been documented for other sectors (Herrmann and Moltesen, 2015[3]; Kalverkamp, Helmers and Pehlken, 2020[4]; Speck et al., 2015[5]; Lopes Silva et al., 2019[6]; Säynäjoki et al., 2017[7]) although this kind of comparison has apparently not yet been undertaken for agri-food products.
Some LCA databases also allow the user a choice between different options, e.g. between physical and economic allocation across co-products.
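To illustrate why the choice of allocation rule matters, the following minimal sketch splits the emissions of one process across two co-products, first by mass and then by economic value. The process, quantities, prices, and function name are hypothetical; actual databases apply more detailed rules.

```python
# Hypothetical co-products of a single crushing process (invented numbers).
CO_PRODUCTS = {
    "oil": {"mass_kg": 200, "price_per_kg": 1.0},
    "meal": {"mass_kg": 800, "price_per_kg": 0.25},
}


def allocate(total_kg_co2e, co_products, basis):
    """Split total emissions across co-products in proportion to the chosen basis.

    basis="mass":     shares are proportional to mass
    basis="economic": shares are proportional to mass times price
    """
    if basis == "mass":
        weights = {name: p["mass_kg"] for name, p in co_products.items()}
    elif basis == "economic":
        weights = {
            name: p["mass_kg"] * p["price_per_kg"] for name, p in co_products.items()
        }
    else:
        raise ValueError(f"unknown allocation basis: {basis}")
    total_weight = sum(weights.values())
    return {name: total_kg_co2e * w / total_weight for name, w in weights.items()}


print(allocate(1000, CO_PRODUCTS, "mass"))      # {'oil': 200.0, 'meal': 800.0}
print(allocate(1000, CO_PRODUCTS, "economic"))  # {'oil': 500.0, 'meal': 500.0}
```

In this invented case the emissions attributed to the oil more than double when moving from mass to economic allocation, even though the underlying process data are identical.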
7.3. Geographic specificity of LCA databases
LCA databases by construction contain average data rather than producer-specific information. However, databases may differ in the level of granularity. For example, data could represent a global average, a regional average, a national average, or a sub-national average. Similarly, average data could distinguish different production methods or practices. Geographic specificity and distinctions between production methods are important for agri-food products, given the variability of biological processes (Notarnicola et al., 2017[8]).
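In practice, a user looking for a secondary value often falls back from the most specific geography to broader averages until a dataset is found. The sketch below shows this lookup order; the dataset keys, geography codes, and values are hypothetical and do not correspond to any actual database.

```python
# Hypothetical datasets keyed by (product, geography); values in kg CO2e per kg.
DATABASE = {
    ("milk, at farm gate", "CH"): 1.1,
    ("milk, at farm gate", "Europe"): 1.3,
    ("milk, at farm gate", "GLO"): 2.0,
    ("soybean, at farm gate", "GLO"): 0.6,
}


def lookup(product, geographies):
    """Return (value, geography) for the first geography that has a dataset.

    `geographies` is ordered from most to least specific, for example
    sub-national, national, regional, global.
    """
    for geography in geographies:
        key = (product, geography)
        if key in DATABASE:
            return DATABASE[key], geography
    raise KeyError(f"no dataset for {product!r} in any of {geographies}")


# No sub-national dataset exists, so the national average is used.
print(lookup("milk, at farm gate", ["CH-BE", "CH", "Europe", "GLO"]))  # (1.1, 'CH')

# Only a global average exists for this product.
print(lookup("soybean, at farm gate", ["BR-MT", "BR", "Latin America", "GLO"]))  # (0.6, 'GLO')
```

Returning the geography alongside the value makes it visible when a broad average has been substituted for a more specific one, which is the situation where results are most likely to misrepresent local conditions.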
In this regard, there are important evidence gaps for some products and regions, particularly in the developing world (Deconinck and Toyama, 2022[9]; Edelen et al., 2017[10]). Practical challenges in the developing world include a diversity of production systems, a lack of reliable data, and highly diverse natural contexts (Basset-Mens et al., 2021[11]). The problem here is not merely about a lack of activity data (e.g. farm level data), but also gaps regarding the models used to estimate emissions: available models have often been developed and validated for countries with more temperate climates and may not be appropriate, for example, for tropical agriculture. In terms of the “building blocks” identified in this report, this corresponds to a lack of suitable Tier 3 science-based methods (Chapter 5).
In response, global initiatives have emerged to provide greater “regionalisation” of LCA databases. The UNEP-led Life Cycle Initiative is a global, public-private multi-stakeholder initiative that promotes the establishment of regionally representative databases for LCA studies.6 It focuses on creating national LCA databases that can better reflect local conditions and cover sectors and products most critical for each country, using methodologies, data quality assurance mechanisms, and data format requirements in accordance with widely adopted standards. One concrete project under this initiative was a cooperation between Ecoinvent and the Brazilian Agricultural Research Corporation (Embrapa), which resulted in the creation of more regionalised Brazilian land use change data. Research using these granular sub-national data demonstrated the importance of going beyond national-level data, particularly for large and heterogeneous countries like Brazil: national-level data misrepresent direct land use change emissions for many agricultural products (Donke et al., 2020[12]; Novaes et al., 2017[13]). Similar initiatives to improve LCA databases in low- and middle-income countries should be encouraged.
Populating an LCA database ideally happens based on primary research rather than extrapolation from other data points. There may be data gaps when some activities in some regions have not been studied. The data actually available in LCA databases are often the cumulative result of many ad hoc research projects, rather than a deliberately planned effort to fill in data gaps. It would be helpful to have a more explicit strategy for identifying and addressing data gaps in LCA databases. Publicly funded research (e.g. by agricultural research institutes) can play an important role here.
Data quality ratings can be a useful tool in prioritising new research. LCA databases often compute a data quality rating reflecting how representative an estimate is in terms of geography, technology, time, and precision. The Ecoinvent data quality guidelines (Weidema et al., 2013[14]) are used as a reference across many databases. Some databases also rely on supplementary requirements and product rules for their data quality ratings. For example, Agri-footprint’s data quality ratings for feed materials follow the EU Product Environmental Footprint methodology.
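As a rough sketch of how such ratings could help prioritise research, the example below scores hypothetical datasets on the four criteria named above and ranks them from weakest to strongest. The scoring scale (1 = best, 5 = worst), the use of a simple average, and all dataset names and scores are assumptions made for illustration; individual databases define their own rating formulas.

```python
from statistics import mean

# The four criteria named in the text; scores run from 1 (best) to 5 (worst).
# Both the scale and the simple average are assumptions for this sketch.
CRITERIA = ("geography", "technology", "time", "precision")

# Hypothetical datasets with invented scores.
DATASETS = {
    "wheat, FR": {"geography": 1, "technology": 2, "time": 1, "precision": 2},
    "cassava, NG": {"geography": 4, "technology": 4, "time": 5, "precision": 3},
    "milk, CH": {"geography": 1, "technology": 2, "time": 2, "precision": 3},
}


def data_quality_rating(scores):
    """Average the scores across the four criteria (lower is better)."""
    return mean(scores[criterion] for criterion in CRITERIA)


def research_priorities(datasets):
    """Order dataset names from worst to best data quality rating."""
    return sorted(
        datasets, key=lambda name: data_quality_rating(datasets[name]), reverse=True
    )


print(research_priorities(DATASETS))
# ['cassava, NG', 'milk, CH', 'wheat, FR']
```

A ranking of this kind only identifies where existing estimates are weakest; deciding which gaps to fill first would also depend on how widely a dataset is used.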
7.4. Interconnectedness and interoperability of LCA databases
Because they take a life-cycle perspective, LCA databases are often built on a “modular” principle, where results from one LCA (e.g. fertiliser) become an input in another LCA (e.g. wheat), which may in turn be an input for yet other LCAs (e.g. bread). In many cases, information originally came from a different LCA database. This leads to a certain level of interconnectedness. For example, both Agri-footprint and the World Food LCA database (WFLDB) use Ecoinvent as a background database for fuel and energy. Similarly, the Australian National Life Cycle Inventory Database (AusLCI) combines Australian data with selected emission factors adapted from the Ecoinvent database.
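The modular principle can be sketched as a recursive calculation in which each product's footprint is its own direct emissions plus the footprints of its inputs. The products mirror the fertiliser, wheat, and bread example above, but all quantities are hypothetical, and the sketch assumes that no product is (directly or indirectly) an input to itself.

```python
# Hypothetical modules: direct emissions (kg CO2e per kg of output) and the
# quantity of each input needed per kg of output. All numbers are invented.
MODULES = {
    "fertiliser": {"direct": 3.0, "inputs": {}},
    "wheat": {"direct": 0.2, "inputs": {"fertiliser": 0.05}},
    "bread": {"direct": 0.3, "inputs": {"wheat": 0.8}},
}


def footprint(product, modules=MODULES):
    """Cradle-to-gate footprint: direct emissions plus footprints of all inputs.

    Assumes the supply chain contains no loops; otherwise the recursion
    would not terminate.
    """
    module = modules[product]
    upstream = sum(
        quantity * footprint(name, modules)
        for name, quantity in module["inputs"].items()
    )
    return module["direct"] + upstream


for product in ("fertiliser", "wheat", "bread"):
    print(product, f"{footprint(product):.2f}")
# fertiliser 3.00
# wheat 0.35
# bread 0.58
```

Because the bread result depends on the fertiliser value two steps upstream, a change in a background dataset propagates to every product built on it. Real inventories do contain loops (for example, energy used to produce energy), which is why LCA software typically solves a system of linear equations rather than recursing as done here.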
The interconnectedness of LCA databases could create problems such as a lack of clarity on where data come from, inconsistencies in methods and data collection, and difficulties in translating and converting across different sources, possibly leading to a loss of information and incorrect interpretations (Edelen et al., 2017[10]). Such difficulties are particularly likely when underlying databases are updated. For example, AusLCI is based on version 2.2 of the Ecoinvent database, even though Ecoinvent version 3.10 is currently available. As with other elements of the building blocks, this example suggests the importance of regular updates to incorporate new versions of underlying datasets (Chapter 11). However, it also highlights the importance of ensuring interoperability between various LCA databases. For example, databases may use different nomenclatures, making it hard to match data across databases. Edelen et al. (2017[10]), looking at four databases commonly used in the United States, found that when the original nomenclature of the different databases was used, automatic name-to-name matching was typically difficult. In the United States, the Federal LCA Commons (an initiative to harmonise public LCA research) created a Federal Elementary Flow List (FEDEFL) as a common nomenclature; Edelen et al. (2022[15]) found that this greatly facilitated automatic matching.
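The difference between name-to-name matching and matching through a common nomenclature can be illustrated with a small sketch. The flow names, the identifiers, and the mapping table are hypothetical; they are not taken from FEDEFL or from any of the databases discussed here.

```python
# Flow names as they might appear in two databases (hypothetical).
DB_A = ["Carbon dioxide, fossil", "Methane, biogenic", "Dinitrogen monoxide"]
DB_B = ["CO2 (fossil)", "methane, biogenic", "Nitrous oxide"]

# Hypothetical common flow list: each database-specific name (lower-cased)
# is mapped onto one shared identifier.
COMMON_FLOW_LIST = {
    "carbon dioxide, fossil": "CO2_fossil",
    "co2 (fossil)": "CO2_fossil",
    "methane, biogenic": "CH4_biogenic",
    "dinitrogen monoxide": "N2O",
    "nitrous oxide": "N2O",
}


def match_by_name(names_a, names_b):
    """Match flows whose names are identical apart from letter case."""
    lowered_b = {name.lower() for name in names_b}
    return [name for name in names_a if name.lower() in lowered_b]


def match_by_common_list(names_a, names_b):
    """Match flows that map onto the same identifier in the common list."""
    by_id = {
        COMMON_FLOW_LIST[name.lower()]: name
        for name in names_b
        if name.lower() in COMMON_FLOW_LIST
    }
    matches = {}
    for name in names_a:
        flow_id = COMMON_FLOW_LIST.get(name.lower())
        if flow_id in by_id:
            matches[name] = by_id[flow_id]
    return matches


print(match_by_name(DB_A, DB_B))
# ['Methane, biogenic']
print(match_by_common_list(DB_A, DB_B))
# {'Carbon dioxide, fossil': 'CO2 (fossil)', 'Methane, biogenic': 'methane, biogenic',
#  'Dinitrogen monoxide': 'Nitrous oxide'}
```

In this invented case, direct name matching links only one of three flows, while the shared list links all three. The effort shifts from pairwise matching between every two databases to maintaining one mapping per database.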
An important tool in creating interoperability is adopting a common data format. The International Life Cycle Data System (ILCD) is a data format developed by the European Union which is increasingly used in the LCA community (Pré Sustainability, 2019[16]).
Specifically for agri-food products, the HESTIA project (https://www.hestia.earth/) has also developed a standardised data format that can be used to represent not only LCA data but also other agri-environmental data, including data from farms, farm surveys, and experimental field trials. The HESTIA format is discussed in more detail in Box 8.1 in Chapter 8.
Other actions can be taken to improve access and interoperability. For example, governments may have LCA data and models which could be made available to the public. In the United States, the Federal LCA Commons initiative mentioned earlier is a collaboration between several federal agencies (including the US Department of Agriculture) to make LCA datasets freely available through an online platform.7
7.5. A first assessment
LCA databases are well established and cover a large number of products and geographies. Most are consistent with key standards (notably ISO standards), and updated regularly. Databases also often cross-reference each other, for example as a source of information for “background” processes.
However, there is room for improvement. First, databases differ in their methodological choices, which influences the results. While databases tend to document their choices, differences still make it hard to compare and combine data. This goes beyond interoperability of data formats, as it concerns more fundamental choices around system boundaries, allocation rules, and the like. One option would be for existing databases to harmonise their methodological guidelines, taking into account not only general standards such as ISO 14040/14044 and ISO 14067 but also more detailed product category rules. Another option is for databases to provide users with the flexibility to adjust methodological parameters, perhaps using “presets” corresponding to different standards or product category rules. More research on how methodological choices influence carbon footprint results for agri-food products would be welcome, too.
Second, there exist data gaps. Not all products, activities and geographies are equally well covered by existing databases. It would be valuable to develop a deliberate strategy to identify and address data gaps as part of an ongoing process of continuous improvement. Data quality ratings can be a useful tool in prioritising areas where new research is needed. Actually addressing the data gaps may require in-depth scientific research (e.g. to create new science-based methods) and farm surveys to collect the necessary activity data. It is a task which LCA database providers may not be able to undertake by themselves and where collaboration with, for example, agricultural research institutes may be important.
Third, the existence of databases does not mean all supply chain actors can easily access and use them. The cost of commercial databases and commercial LCA software is one element, but correctly using the data also requires specialised skills. It is possible that consultants will be able to provide an integrated service to firms that lack the means and capabilities to do everything in-house. But the financial cost of accessing and analysing secondary data should be kept in mind as a potential barrier to widespread carbon footprints (see also Chapter 10, which discusses other possible barriers and ways to address them).
References
[11] Basset-Mens, C. et al. (2021), Life Cycle Assessment of agri-food systems, éditions Quae, https://doi.org/10.35690/978-2-7592-3467-7.
[9] Deconinck, K. and L. Toyama (2022), “Environmental impacts along food supply chains: Methods, findings, and evidence gaps”, OECD Food, Agriculture and Fisheries Papers, No. 185, OECD Publishing, Paris, https://doi.org/10.1787/48232173-en.
[12] Donke, A. et al. (2020), “Integrating regionalized Brazilian land use change datasets into the ecoinvent database: new data, premises and uncertainties have large effects in the results”, The International Journal of Life Cycle Assessment, Vol. 25/6, pp. 1027-1042, https://doi.org/10.1007/s11367-020-01763-3.
[15] Edelen, A. et al. (2022), “Life Cycle Data Interoperability Improvements through Implementation of the Federal LCA Commons Elementary Flow List”, Applied Sciences, Vol. 12/19, p. 9687, https://doi.org/10.3390/app12199687.
[10] Edelen, A. et al. (2017), “Critical review of elementary flows in LCA data”, The International Journal of Life Cycle Assessment, Vol. 23/6, pp. 1261-1273, https://doi.org/10.1007/s11367-017-1354-3.
[1] Hauschild, M., R. Rosenbaum and S. Olsen (eds.) (2018), Life Cycle Assessment, Springer International Publishing, Cham, https://doi.org/10.1007/978-3-319-56475-3.
[3] Herrmann, I. and A. Moltesen (2015), “Does it matter which Life Cycle Assessment (LCA) tool you choose? – a comparative assessment of SimaPro and GaBi”, Journal of Cleaner Production, Vol. 86, pp. 163-169, https://doi.org/10.1016/j.jclepro.2014.08.004.
[4] Kalverkamp, M., E. Helmers and A. Pehlken (2020), “Impacts of life cycle inventory databases on life cycle assessments: A review by means of a drivetrain case study”, Journal of Cleaner Production, Vol. 269, p. 121329, https://doi.org/10.1016/j.jclepro.2020.121329.
[6] Lopes Silva, D. et al. (2019), “Why using different Life Cycle Assessment software tools can generate different results for the same product system? A cause–effect analysis of the problem”, Sustainable Production and Consumption, Vol. 20, pp. 304-315, https://doi.org/10.1016/j.spc.2019.07.005.
[2] Nemecek, T. et al. (2019), Methodological Guidelines for the Life Cycle Inventory of Agricultural Products (Version 3.5), December 2019, World Food LCA Database (WFLDB), Quantis and Agroscope, Lausanne and Zurich, Switzerland, https://simapro.com/wp-content/uploads/2020/11/WFLDB_MethodologicalGuidelines_v3.5.pdf.
[8] Notarnicola, B. et al. (2017), “The role of life cycle assessment in supporting sustainable agri-food systems: A review of the challenges”, Journal of Cleaner Production, Vol. 140, pp. 399-409, https://doi.org/10.1016/j.jclepro.2016.06.071.
[13] Novaes, R. et al. (2017), “Estimating 20‐year land‐use change and derived CO2 emissions associated with crops, pasture and forestry in Brazil and each of its 27 states”, Global Change Biology, Vol. 23/9, pp. 3716-3728, https://doi.org/10.1111/gcb.13708.
[16] Pré Sustainability (2019), The ILCD format – solving LCA data exchange problems, https://pre-sustainability.com/articles/the-ilcd-format-solving-lca-data-exchange-problems/.
[7] Säynäjoki, A. et al. (2017), “Input–output and process LCAs in the building sector: are the results compatible with each other?”, Carbon Management, Vol. 8/2, pp. 155-166, https://doi.org/10.1080/17583004.2017.1309200.
[5] Speck, R. et al. (2015), “Life Cycle Assessment Software: Selection Can Impact Results”, Journal of Industrial Ecology, Vol. 20/1, pp. 18-28, https://doi.org/10.1111/jiec.12245.
[14] Weidema, B. et al. (2013), Overview and methodology: Data quality guideline for the ecoinvent database version 3, The ecoinvent Centre, https://lca-net.com/files/Overview_and_methodology.pdf.
Notes
1. In the LCA methodology, a Life Cycle Inventory (LCI) can be thought of as a flowchart containing all emissions and material flows, including resources extracted from the environment and waste products. In a carbon footprint context, this would include estimates of GHG emissions. In the LCA methodology, the LCI stage is then followed by the Life Cycle Impact Assessment phase, where LCI numbers (such as GHG emissions) are translated into impacts in terms of human or environmental health. For the purposes of this report, the LCI phase is the more relevant one.
2. These databases are not necessarily endorsed by the GHG Protocol.
3. As an example, the documentation for WFLDB lists as some of the datasets “Broiler husbandry, poultry industrial broiler systems, at farm, Brazil”, “Broiler husbandry, poultry industrial broiler systems, at farm, Canada”, and so on. In some cases there is a separate global dataset too.
4. The LCA models made available through the US Federal LCA Commons mentioned earlier are free, and also formatted for use in openLCA.
5. This is an example of a process-based approach. An alternative approach is environmentally extended input-output analysis (EEIO). An input-output table captures how the output of one sector is used as input in another sector. If information is available on emissions related to the production of inputs, it is then possible to use input-output tables to follow how these ‘embedded’ emissions flow through the economy until the final product. For a discussion in the context of environmental impacts in food supply chains, see Deconinck and Toyama (2022[9]).
6. See https://www.lifecycleinitiative.org (accessed 4 June 2024).
7. See https://www.lcacommons.gov/about-us (accessed 16 October 2024).