Improving Governance with Policy Evaluation

Chapter 3. How do countries address the challenges of promoting quality and use of evaluations?

Abstract

Quality and use of evaluations are essential to ensure their relevance to, and impact on, policy-making. They are key to promoting learning, accountability and the effective contribution of evaluation to decision-making tools such as regulation and budgeting. However, achieving both quality and use is widely recognised as the most important challenge faced by policy-makers and practitioners in this area, owing to a mix of skills and institutional gaps, heterogeneous oversight of evaluation processes, and insufficient mechanisms for quality control and capacity for uptake of evidence. This chapter discusses the external and internal factors that affect the quality and use of policy evaluations, as well as their interlinkages. It examines the various mechanisms put in place by governments to promote the quality and use of policy evaluations, and highlights relevant country practices in this regard.
Key findings
Quality is key to ensuring the robustness of policy evaluations, and can be achieved through quality control and quality assurance processes. Countries have put these in place through different means:
Standards for quality play an important role in quality assurance, but are more often set out in guidelines than embedded in normative instruments such as legal and policy frameworks.
Quality control mechanisms are much less common across the sample of countries that responded to the OECD Survey, both within and outside the executive, and may constitute an area for development to ensure that evaluation reports and evaluative evidence meet high quality standards.
While quality is very important and can facilitate use of evaluation, it is not enough to guarantee such use, which remains an important challenge faced by many countries. However, the use of evaluation is crucial to ensure impact and to promote evidence-informed policy-making and learning.
Organisations and institutional mechanisms within the executive play an important role in creating a marketplace for the use of evaluations. Yet the heterogeneity of country approaches suggests that there is no one-size-fits-all approach and that the set-up depends on the local political and cultural context.
Countries have recognised the importance of competences for promoting the quality and use of evaluation. Most mechanisms for the development of skills and competences are aimed at evaluators, managers or senior civil servants, and seek to ensure high quality evaluations. Strengthening the competences of policy and decision-makers and their capacity to use evaluations, on the other hand, may increase demand for evaluative evidence.
The role of institutions outside the executive in promoting both the quality and the use of evaluations remains limited, aside from their involvement in the budgetary cycle. Parliaments can play a role in some countries, and Supreme Audit Institutions play an important role in the supply of unbiased evaluations in a significant number of countries.
Introduction
Quality and use of evaluations are essential to ensure impact on policy-making, and thus to ensure that evaluations actually serve as tools for learning, accountability and better decision-making. However, achieving both quality and use is widely recognised as among the most important challenges faced by policy-makers and practitioners in this area. This is due to a mix of skills gaps, heterogeneous oversight of evaluation processes, and insufficient mechanisms for quality control and capacity for uptake of evidence.
This chapter discusses the quality and use of policy evaluation, as well as their interlinkages, and the external and internal factors that affect them. Finally, the chapter examines the various mechanisms put in place by governments to promote the quality and use of policy evaluations, and highlights some notable country practices in this area.
Quality and use are essential
Quality matters
Not all evaluations are created equal: some deserve to be given more weight in decision-making than others. High quality evaluations generate robust and credible results that can be used with confidence. As a result, good quality evaluations enable policies to be improved and are thus a key part of the policy cycle. In particular, quality impact evaluations provide evidence on the outcomes of policies, as well as on whether these changes can be attributed to the intervention in question. In this sense, they facilitate learning in decision-making and policy design, by providing reliable information on why and how a policy was successful or not, and on the underlying causal mechanisms leading to success or failure.
Quality evaluations also have the potential to increase policy accountability as they can provide trustworthy evidence on how resources were spent, what benefits were achieved and what the returns were. Good quality evaluations give citizens and stakeholders access to information on whether the efforts carried out by the government, including allocation of financial resources, are producing the expected results (OECD, 2018[2]) . As such, good quality evaluations are fundamental to democratic accountability (HM Treasury, 2011[9]).
Conversely, poor quality evaluations carry the risk of providing unfit evidence, or evidence that is subject to bias and undue influence. Poor quality evidence also implies that an ineffective, or even harmful, policy might be implemented or kept in place. Finally, opportunities to use public funds more effectively may be missed.
Use is also important
Effective use of evaluations is key to embedding them in policy-making processes and to generating incentives for the dissemination of evaluation practices. Use is a critical source of feedback for generating new policies and developing the rationale for government interventions. If evaluations are not used, gaps will remain between what the evidence suggests is effective and policy and decision-making in practice. Simply put, evaluations that are not used represent missed opportunities for learning and accountability.
Connections between evidence and policy-making remain elusive (OECD, 2020[99]): the use of policy evaluation continues to be one of the most important challenges. This is compounded by the fact that the underuse of evaluations may jeopardise the legitimacy of the evaluative exercise in the first place. When decision-makers ignore the results of evaluations, the case for further analysis is undermined (Leviton and Hughes, 1981[100]). Unused evaluations may also contribute to an impression of excess supply, whereby quality evidence gets lost in the shuffle.
Underuse also represents a waste of public resources: policy evaluations, whether conducted internally or contracted out to external stakeholders, require significant public human and financial resources (Stern, Saunders and Stame, 2015[101]), which are wasted if the evaluations lead to no outcome.
Quality and use are closely interrelated
Quality and use of evaluations are intrinsically linked, thereby increasing their significance for policy-makers. Some academic authors consider use to be a key component of an evaluation’s quality (Patton, 1978[102]; Kusters, 2011[103]; Vaessen, 2018[104]). From this perspective, the extent to which an evaluation meets the needs of different groups of users dictates its quality. For instance, stakeholder involvement and iterative learning are seen as the foundation for the use of evaluation, and by implication, its quality. Conversely, evaluations that adhere to the quality standard of appropriateness – that is, evaluations that address multiple political considerations, are useful for achieving policy goals and consider the local context – are by definition more useful to intended users.
In addition, quality should be conducive to greater potential for use: insofar as good quality evaluations benefit from greater credibility, both because they are technically rigorous and because they are well governed, they are likely to be given more weight in decision-making. Similarly, the quality of unused evaluations is likely to suffer, as they are not subject to critical questioning. In practice, however, it is important to recognise that quality may come with greater complexity of results, owing to methodological requirements and the limits of quantitative methods, which may make results difficult to read and interpret for a lay audience.
Exogenous factors affecting quality and use of evaluations
Quality and use can be influenced both by internal factors that are amenable to policy intervention and by a range of exogenous factors determined by cultural, historical and environmental circumstances.
The extent to which policies can be evaluated
For an evaluation to be of high quality and to be useful for policy-making, the policy or programme should be easily evaluable in the first place, meaning that it should be possible to evaluate it in a credible and reliable manner (OECD, 2010[48]). Two main factors may affect the degree to which a policy can easily be evaluated:
the nature and design of the policy or programme itself
the quality and availability of non-evaluation-specific data.
The nature and design of the policy
Clearly laying out the objectives of a policy and the levers to attain them will facilitate the evaluation (OECD, 2017[105]). This also implies that the original intentions of the programme developers are explicit and open to critique (OECD, forthcoming[106]). One way to facilitate clear policy objectives is to develop a theory of change and logic model, either at the stage of policy design or when developing an evaluation. A theory of change can be defined as a set of interrelated assumptions explaining how and why an intervention is likely to produce outcomes in the target population (OECD, forthcoming[106]). Developing a theory of change can lead to better policy planning and evaluation because the policy or programme activities are linked to a detailed and plausible understanding of how change actually happens. A logic model sets out the conceptual connections between concepts in the theory of change to show what intervention, at what intensity, delivered to whom and at what intervals would likely produce specified short-term, intermediate and long-term outcomes (OECD, forthcoming[106]).
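To make the logic-model idea concrete, it can help to think of it as a small structured object that links inputs, activities and outputs to expected outcomes. The minimal sketch below (in Python) is purely illustrative: the class design and the job-training example are hypothetical, not taken from any country's guidance.

```python
from dataclasses import dataclass, field


@dataclass
class LogicModel:
    """Minimal logic model: inputs -> activities -> outputs -> outcomes."""
    inputs: list = field(default_factory=list)
    activities: list = field(default_factory=list)
    outputs: list = field(default_factory=list)
    short_term_outcomes: list = field(default_factory=list)
    long_term_outcomes: list = field(default_factory=list)

    def candidate_metrics(self):
        # Outputs and outcomes are the natural candidates for evaluation metrics.
        return self.outputs + self.short_term_outcomes + self.long_term_outcomes


# Hypothetical example: a job-training programme.
model = LogicModel(
    inputs=["programme budget", "certified trainers"],
    activities=["weekly training sessions for unemployed adults"],
    outputs=["number of participants completing the course"],
    short_term_outcomes=["improved job-search skills among participants"],
    long_term_outcomes=["higher employment rate among participants"],
)
print(model.candidate_metrics())
```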
Box 3.1. Benefits of developing an intervention theory of change and logic model for policy or programme development
The evaluability of the programme (for both implementation and outcomes) is facilitated by signposting appropriate metrics.
The original intentions of the programme developers are clearly set out, and are explicit and open to critique.
The underlying logic of the assumptions made in the theory, for example, that undertaking a certain activity will lead to a particular outcome, can be scrutinised.
The realism of the assumptions made by the programme developers can be checked against wider evidence of ‘what works’ to assess the likelihood of the programme being successful.
Commissioners can check whether the programme meets their needs; and providers and practitioners delivering the programme can check their own assumptions and the alignment of their expectations against the original intentions of the programme developers.
The key parameters or boundaries (e.g., who is the programme for, and under what specific circumstances) can be set out, reducing the likelihood that the programme is used inappropriately or ineffectively.
Core components (of content, or of implementation, or both) that are believed to be essential to the programme’s effectiveness can be identified.
Activity traps can be identified and avoided.
The most important features of the implementation model of the programme can be captured, enabling delivery that adheres to the original model and helping to prevent programme drift during maturation and scaling.
Source: Ghate, D. (2018), “Developing theories of change for social programmes: co-producing evidence-supported quality improvement”, Palgrave Communications, Vol. 4/1, p. 90, http://dx.doi.org/10.1057/s41599-018-0139-z.
The quality and availability of data
The quality and availability of non-evaluation-specific data (big data, open data, statistical data, programme monitoring data, etc.) is a critical factor in how easily a policy can be evaluated. Similarly, the quality of the data has an important influence on the rigour of the resulting evaluation. For data to meet the quality criteria required for evaluation, they need to be accurate, verifiable and documented. Furthermore, policy evaluation and evidence-informed policy making (EIPM) can be hindered by a lack of adequate data and by capacity gaps among government departments and agencies in generating data in a usable format. Such challenges include understanding what data and data sets currently exist in ministries and how they can be used for policy analysis. Evaluators and analysts are not necessarily aware of all the data that exist, nor do they necessarily have access to administrative data, which may be especially true of external evaluators. Another issue is that departments may not have comprehensive inventories of all their data holdings and knowledge of their quality. Beyond this, there is a broader data governance challenge concerning the capacity of the public sector to generate the data necessary to produce evidence and evaluation, which should also, in theory, be facilitated by the increasing digitalisation of public sector processes.1
Another challenge relates to the use of individual administrative data. Data protection legislation can constitute an obstacle to using individual-level data to evaluate policies and programmes in some countries, specifically when carrying out statistical analysis and when merging files, which requires access to single identifiers. Political reticence towards sharing evidence on policy impact and effectiveness may be a further barrier to accessing data.
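To illustrate the single-identifier point, the sketch below shows one common pattern: two administrative extracts are pseudonymised by hashing the identifier with a salt before being merged for analysis, so analysts never handle the raw identifier. All file, column and salt names are hypothetical, and real data-protection arrangements are considerably more involved.

```python
import hashlib

import pandas as pd

# Hypothetical administrative extracts: tax records and training-programme records.
tax = pd.DataFrame({"national_id": ["A1", "B2", "C3"],
                    "income": [21000, 35000, 28000]})
training = pd.DataFrame({"national_id": ["A1", "C3"],
                         "completed_training": [True, False]})


def pseudonymise(df, key="national_id", salt="eval-2018"):
    # Replace the direct identifier with a salted hash; in practice the salt
    # would be held by a trusted third party, not hard-coded.
    out = df.copy()
    out["pid"] = out[key].map(
        lambda v: hashlib.sha256((salt + v).encode()).hexdigest())
    return out.drop(columns=[key])


# Merge the two files on the pseudonymised identifier.
linked = pseudonymise(tax).merge(pseudonymise(training), on="pid", how="left")
print(linked)
```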
Box 3.2. Potential sources of data used for policy evaluation
Conducting quality evaluation requires quality data, which may come from various sources:
Statistical data: commonly used in research, it corresponds to census data or more generally to information on a given population collected through national or international surveys.
Administrative data: this data is generally collected through administrative systems managed by government departments or ministries, and usually concerns whole sets of individuals, communities and businesses that are concerned by a particular policy. For instance, it includes housing data and tax records.
Big data: mainly drawn from a variety of sources such as citizen inputs and the private sector, big data is most often digital and continuously generated. It has the advantage of coming in greater volume and variety.
Evaluation data: this data is collected for the purpose of the evaluation. It can take the form of qualitative questionnaires, on-site observations, focus groups, or experimental data. See further down for a description of impact evaluation methods to collect and analyse data.
Combining different data sources also has the potential to unlock relevant insights for policy evaluation. Applying big data analysis techniques to public procurement data can contribute to creating stronger, sounder and more relevant evaluations.
Sources: based on Results for America (2017), Government Mechanisms to Advance the Use of Data and Evidence in Policymaking: A Landscape Review
Overall, strategies and policies to combine, link and reuse data, as well as to connect actors and decisions within and outside the public sector, are necessary to enable open data to deliver results (OECD, 2019[107]). Evidence from the OECD OURData Index suggests that the countries achieving better results are those that clearly assign the responsibility to co-ordinate open data policies.
Some countries have sought to develop EIPM strategies by fostering systematic access to, and use of, administrative data. The US and Japan, for example, have both institutionalised and implemented more systematic structural approaches to facilitate evidence informed policy making. They have done this by mobilising institutional resources, promoting internal champions and exploring the possibility to fully use existing data on a systematic basis through significant governance changes.
The presence of an enabling environment for quality and use
Quality and use are also influenced by a wider enabling environment. The incentives and attitudes of potential users and of evaluators towards conducting evaluations are influenced by:
the existence of an enabling environment within the evaluation unit and within the institution as a whole
the wider environment beyond institutional boundaries and the overall evaluation culture (Vaessen, 2018[104]).
At the level of evaluation units and individual institutions, quality can benefit from managerial independence – when the evaluation unit can take resource decisions independently – and functional independence – when the evaluation unit can decide on what and how to evaluate (see Box 3.8 for a discussion of evaluation independence) (Vaessen, 2018[104]). Decisions about how to use evidence will also be shaped by the internal dynamics of individual government departments, including the organisational culture and the internal structures or processes that shape how teams work with each other (Shaxson, 2019[108]).
The wider environment beyond institutional boundaries also affects the use of evidence (OECD, 2020[99]). This context can refer to the extent to which ministries are networked with other external organisations, such as knowledge brokers, who can support evidence use (Damschroder et al., 2009[109]; Greenhalgh et al., 2004[110]).
Cultural and societal factors may also affect the extent to which evidence gets used in policy-making (OECD, 2020[99]). For instance, societal attitudes towards policy-making, and what and who should contribute to it, can also affect the use of evidence (Newman, Fisher and Shaxson, 2012[111]). Declining trust in traditional institutions and the digital revolution in communication have eroded the authority of science in some instances. Social media and web-based sources can diffuse opinions very quickly, irrespective of whether they are grounded in scientific evidence (OECD, 2017[112]). Existing challenges in the communication of science, such as vaccination, have led to the recognition that ‘more facts’ are not enough to address them (Sinatra, Kienhues and Hofer, 2014[113]). Therefore, in order to promote the use of evidence, policy makers must also address the societal drivers of resistance to it, and recognise the emotional, as well as rational, elements of decision-making.
Besides these external factors, the following sections focus on the factors that are amenable to policy intervention, including the institutions, strategies and tools developed by governments to promote the quality and use of evaluations.
Promoting quality through good governance and sound methodology
Understanding quality evaluations
To be credible, a policy evaluation must be technically rigorous as well as well governed; that is, it must be independent and appropriate for the decision-making process (Picciotto, 2013[114]). Therefore, quality evaluations are:
technically and methodologically sound
well-governed.
On the one hand, independent processes alone do not guarantee that policy evaluations are of high quality: proper design, sound data collection, rigorous methods and adequate resources are also required. Independent but technically weak evaluations can lead to poor evidence, which can be costly and misleading.
On the other hand, technical quality is necessary but not sufficient to promote an evidence-informed approach to policy making. This is because evaluations inherently take place in a political context, as they are usually commissioned by policy and decision-makers, making their outcome susceptible to influence (Parkhurst, 2017[32]; Pleger and Hadorn, 2018[115]). In other words, even when methodologically and technically robust, an evaluation process is ‘never truly neutral’ (Desautels and Jacob, 2012[116]). The evaluations of policies and programmes can suffer from a range of biases, whether technical or political, which can affect the evidence-production process. At the same time, evaluations are but one input into policy making, and policy and practice decisions must also weigh broader considerations, such as ethics, equity, values and political considerations (Parkhurst, 2017[32]). The academic literature includes rich discussions of the governance challenges relating to evaluations that may affect the quality of evidence collected (Barnett and Camfield, 2016[117]; Jacob and Boisvert, 2010[118]; Brown and Newman, 1992[119]). Still, the question remains as to what governments can do to promote quality in practice, which is presented below.
Overview of mechanisms to promote quality evaluations
A large majority of surveyed countries (29 of 42 respondent countries, of which 24 OECD countries) have put in place one or several mechanisms in order to promote quality through various means – thus suggesting that survey respondents have recognised the importance of ensuring the good quality of evaluations.
In general, countries have sought to promote the quality of evaluations via four main determinants:
developing standards on the quality of the evaluation process, which can be embedded in evaluation guidelines or in legal/policy frameworks
controlling the quality of the evaluation end product
supporting and promoting evaluator competences
fostering quality at an institutional level.
Quality standards for the evaluative process
Firstly, countries have developed mechanisms to ensure that evaluations are properly conducted, that is to say that the process of evaluating a policy respects certain quality criteria. In order to do so, countries have developed quality standards, which serve to impose a certain uniformity in the design and process of evaluations (Picciotto, n.d.[120]).
In many countries, standards for good quality evaluations are embedded in guidelines, which are non-binding documents or recommendations that aim to support governments in the design and implementation of a policy and/or practice (examples include white-books and handbooks). Fewer countries, on the other hand, have embedded such standards in policy or legal frameworks, or normative instruments.
The results of the survey show that most countries have developed standards regarding both the technical quality of evaluation and its good governance, reflecting their understanding of the dual determinants of quality evaluations. Nevertheless, large differences remain across OECD countries in the content of these guidelines and norms. An analysis of the existing standards for the design, implementation and evaluation of specific public interventions will also complement this analysis (OECD, forthcoming[106]).
Table 3.1. Mechanisms for the promotion of quality
| | Provisions expressed in a policy/legal framework | Guidelines for policy evaluation across government | Competence requirements for evaluators | Peer review (internal/external) of evaluations | Systematic and meta-evaluations | Other |
|---|---|---|---|---|---|---|
| Australia | ○ | ● | ○ | ○ | ○ | ● |
| Austria | ● | ● | ● | ● | ○ | ● |
| Belgium | ○ | ○ | ○ | ○ | ○ | ○ |
| Canada | ● | ● | ● | ● | ○ | ● |
| Chile | ● | ○ | ● | ○ | ○ | ○ |
| Czech Republic | ○ | ● | ○ | ○ | ○ | ○ |
| Denmark | ○ | ○ | ○ | ○ | ○ | ● |
| Estonia | ● | ● | ● | ○ | ○ | ○ |
| Finland | ● | ● | ● | ● | ○ | ○ |
| France | ● | ● | ● | ● | ○ | ○ |
| Germany | ● | ● | ○ | ● | ● | ● |
| Great Britain | ○ | ● | ● | ● | ● | ○ |
| Greece | ● | ● | ● | ○ | ○ | ○ |
| Hungary | ○ | ○ | ○ | ○ | ○ | ● |
| Iceland | ○ | ○ | ○ | ○ | ○ | ○ |
| Ireland | ○ | ● | ○ | ○ | ○ | ● |
| Israel | ○ | ○ | ○ | ○ | ○ | ○ |
| Italy | ○ | ● | ○ | ○ | ○ | ● |
| Japan | ● | ● | ● | ● | ○ | ○ |
| Korea | ● | ● | ● | ○ | ○ | ○ |
| Latvia | ● | ● | ○ | ○ | ○ | ○ |
| Lithuania | ○ | ● | ○ | ○ | ○ | ○ |
| Mexico | ● | ● | ● | ● | ● | ○ |
| Netherlands | ○ | ● | ○ | ● | ○ | ○ |
| New Zealand | ○ | ● | ○ | ○ | ○ | ○ |
| Norway | ○ | ● | ○ | ○ | ○ | ○ |
| Poland | ● | ● | ○ | ○ | ● | ● |
| Portugal | ○ | ● | ○ | ○ | ● | ○ |
| Slovakia | ○ | ● | ● | ○ | ○ | ○ |
| Slovenia | ○ | ○ | ○ | ○ | ○ | ○ |
| Spain | ○ | ● | ○ | ● | ○ | ● |
| Sweden | ○ | ○ | ○ | ○ | ○ | ○ |
| Switzerland | ○ | ● | ○ | ○ | ○ | ○ |
| Turkey | ○ | ○ | ○ | ○ | ○ | ○ |
| United States | ● | ● | ● | ○ | ○ | ○ |
| OECD total: ● Yes | 14 | 26 | 13 | 10 | 5 | 10 |
| OECD total: ○ No | 21 | 9 | 22 | 25 | 30 | 25 |
| Argentina | ○ | ● | ● | ○ | ○ | ○ |
| Brazil | ○ | ● | ○ | ○ | ○ | ● |
| Bulgaria | ○ | ○ | ○ | ○ | ○ | ○ |
| Colombia | ● | ● | ● | ○ | ○ | ● |
| Costa Rica | ● | ● | ● | ● | ● | ○ |
| Kazakhstan | ○ | ● | ○ | ○ | ○ | ○ |
| Romania | ● | ○ | ● | ○ | ○ | ○ |
Note: n=42 (35 OECD member countries). 14 countries (12 OECD member countries) answered that there are no mechanisms to ensure the quality of evaluations across government. Answers reflect responses to the questions “How does your government ensure the quality of evaluations across government?” and “Are there guidelines available to support the implementation of policy evaluation across government?”. Systematic and meta-evaluations refer to evaluations designed to aggregate findings from a series of evaluations. Under “Other”: in Brazil, some ministries promote the training of evaluators through their schools of government and make the findings of their evaluations and databases available on public sites; in Germany, regular exchanges take place within the network of evaluation units of development co-operation agencies and externally through the OECD DAC EvalNet; Hungary has a consultation process to review evaluations; in Ireland, each Accounting Officer is responsible for ensuring compliance with the Public Spending Code in their Department/Office; Italy has different mechanisms to improve the quality of evaluations as part of the national evaluation system, such as steering groups; and Poland has a system for assessing the quality of conducted evaluations in its policy evaluation guidelines.
Source: OECD Survey on Policy Evaluation (2018)
Standards set out in guidelines, including provisions for technical quality
A majority of countries (20 countries, of which 17 OECD countries) have developed guidelines that seek to address both the technical quality of evaluations and the good governance of evaluations. Seven countries have developed a single reference guideline for public sector evaluations. Other countries have chosen to adopt distinct guidelines for standards of good governance and for standards regarding methodological rigor. In Estonia, for instance, the Methodology of Impact Assessment (2012) guidelines describe the technical features of impact evaluations of policies and programmes, while the Good Public Engagement Code of Practice (2012) focuses on the principles for the good governance of evaluations, such as the involvement of the public and interests groups in decision-making processes.
International organisations have also adopted such guidelines in order to set standards for quality evaluations and the appropriate principles for their oversight (United Nations Evaluation Group, 2016[121]). The international organisation that brings together Supreme Audit Institutions has done so as well (INTOSAI, 2010[122]). At the OECD, the Development Assistance Committee’s Quality Standards for Development Evaluation (OECD, 2010[48]) include overarching considerations regarding evaluation ethics and transparency in the evaluation process, as well as technical guidelines for the design, conduct and follow-up of development evaluations by countries. The OECD Best practices on ex post evaluations of regulations (OECD, 2018[123]) also provide standards relating to the ex post evaluation of laws and regulations, and the OECD best practice principles for regulatory policy on Regulatory Impact assessment (GOV/RPC (2018)12/REV2) include provisions for the ex ante assessment of regulatory impacts. Similarly, the World Bank Group Evaluation Principles sets out core evaluation principles for selecting, conducting and using evaluations (World Bank et al., 2019[124]) aimed at ensuring that all World Bank Group evaluations are technically robust, as well as credible.
Table 3.2. Quality standards included in evaluation guidelines
In the original table, the first six columns are grouped under “Technical quality of evaluations” and the “Independence” and “Ethical conduct” columns under “Good governance of evaluations”.

| | Identification and design of evaluation approaches | Course of action for commissioning evaluations | Establishment of a calendar for policy evaluation | Identification of human and financial resources | Design of data collection methods | Quality standards of evaluations | Independence of the evaluations | Ethical conduct of evaluations | None of the above |
|---|---|---|---|---|---|---|---|---|---|
| Australia | ○ | ○ | ○ | ● | ○ | ○ | ○ | ○ | ○ |
| Austria | ○ | ○ | ○ | ● | ● | ○ | ○ | ○ | ○ |
| Canada | ● | ○ | ○ | ○ | ● | ● | ● | ● | ○ |
| Czech Republic | ● | ○ | ○ | ○ | ● | ● | ● | ● | ○ |
| Estonia | ● | ● | ○ | ● | ● | ● | ● | ● | ○ |
| Finland | ○ | ● | ● | ○ | ○ | ● | ● | ● | ○ |
| France | ● | ○ | ○ | ○ | ○ | ○ | ○ | ○ | ○ |
| Germany | ● | ● | ● | ● | ● | ● | ● | ● | ○ |
| Great Britain | ● | ○ | ● | ● | ● | ● | ● | ● | ○ |
| Greece | ● | ● | ● | ● | ● | ● | ● | ○ | ○ |
| Ireland | ● | ○ | ○ | ○ | ● | ● | ● | ○ | ○ |
| Italy | ○ | ● | ○ | ● | ○ | ○ | ● | ○ | ○ |
| Japan | ● | ○ | ● | ○ | ● | ● | ○ | ○ | ○ |
| Korea | ● | ○ | ● | ○ | ● | ● | ○ | ○ | ○ |
| Latvia | ● | ● | ● | ● | ● | ● | ○ | ○ | ○ |
| Lithuania | ● | ○ | ○ | ● | ● | ○ | ● | ○ | ○ |
| Mexico | ● | ● | ● | ○ | ○ | ● | ● | ● | ○ |
| Netherlands | ○ | ○ | ○ | ○ | ○ | ● | ○ | ○ | ○ |
| New Zealand | ● | ● | ○ | ● | ● | ● | ● | ● | ○ |
| Norway | ● | ○ | ○ | ● | ● | ○ | ○ | ○ | ○ |
| Poland | ○ | ○ | ● | ○ | ● | ● | ● | ○ | ○ |
| Portugal | ○ | ● | ○ | ○ | ○ | ○ | ○ | ○ | ○ |
| Slovakia | ● | ○ | ○ | ○ | ○ | ● | ● | ○ | ○ |
| Spain | ● | ● | ○ | ● | ○ | ● | ● | ● | ○ |
| Switzerland | ○ | ○ | ● | ● | ● | ● | ● | ● | ○ |
| United States | ● | ○ | ● | ● | ● | ● | ● | ● | ○ |
| OECD total: ● Yes | 18 | 10 | 11 | 14 | 17 | 19 | 17 | 11 | 0 |
| OECD total: ○ No | 8 | 16 | 15 | 12 | 9 | 7 | 9 | 15 | 26 |
| Argentina | ○ | ● | ● | ○ | ● | ○ | ○ | ○ | ○ |
| Brazil | ● | ● | ○ | ● | ● | ● | ● | ○ | ○ |
| Colombia | ● | ● | ○ | ○ | ○ | ○ | ● | ● | ○ |
| Costa Rica | ● | ● | ● | ● | ● | ● | ● | ● | ○ |
| Kazakhstan | ○ | ○ | ● | ○ | ○ | ○ | ○ | ○ | ○ |
Note: n=31 (26 OECD member countries). 11 countries (9 OECD member countries) answered that they do not have guidelines to support the implementation of policy evaluation across government. Answers reflect responses to the question, “Do the guidelines contain specific guidance related to the: (Check all that apply)”.
Source: OECD Survey on Policy Evaluation (2018)
Identification and design of evaluation approaches
About two thirds of countries (21 countries, of which 18 OECD countries) have included provisions for the design of evaluation approaches in their guidelines. The Spanish State Agency for the Evaluation of Public Policies and Quality of Services (AEVAL), for instance, introduced a practical guide for the design and implementation of public policy evaluation in 2015. The guidelines seek to provide theoretical and practical advice for better evaluation approaches and include detailed recommendations for evaluation design, for instance by proposing key steps for drawing out an intervention’s theory of change or illustrating common scenarios for evaluators with local examples.
A further analysis of country guidelines shows that they recommend that the purpose, scope (for example, time period, target population, geographic area, etc.) and objectives of an evaluation be clear. These guidelines underline the importance of making sure that the questions the evaluation intends to answer are clear and well defined, as the evaluation criteria and questions define the evidence that the evaluation will generate. Some guidelines also emphasise that the analysis conducted to answer the evaluation question should be clearly and explicitly stated and explained (OECD, forthcoming[106]).
Box 3.3. The example of the Magenta Book in the United Kingdom: Core questions of policy evaluations
In the United Kingdom, the Magenta Book provides guidance on what to consider when designing an evaluation. It invites analysts to consider a series of questions, such as:
Should it work? (theory of change) What is the underlying ‘theory of change’, which explains how the policy will make an impact? An understanding of the theory of change that underpins the project will ensure that we measure the things that really matter during the evaluation.
Can it work? How was the policy delivered (process/implementation evaluation)? How was the policy implemented? Has the policy been properly implemented? What were the challenges to implementation and how were they overcome?
Does it work? (impact evaluation) Many of our evaluations investigate the impact of the intervention.
Is it worth it? Do the benefits justify the costs (economic evaluation)? It is anticipated that, if successful, policies/interventions might receive a wider roll-out. It will therefore be important to consider whether they are cost effective.
Finally, an evaluation plan or matrix may be a useful tool to lay out the evaluation’s focus, the main questions it seeks to answer, the key information needed for indicators, data collection methods, etc. Importantly, such an evaluation plan should also mention the purpose of the evaluation and how its results should be put to use (Kusters et al., 2011[125]). The Lithuanian ministry of finance, for example, issued Recommendations on Implementation of Programs Evaluation (2011), which give advice on how to plan and design an evaluation, from identifying the need for an evaluation to establishing an evaluation plan, including methods.
Course of action for commissioning evaluations
Some country guidelines (14 countries overall, of which 10 OECD countries) include specific standards or recommendations regarding the commissioning of evaluations, as is the case in Costa Rica, where the ministry of national planning and economic policy (Mideplan) has dedicated a separate guideline to the establishment of an evaluation’s terms of reference.
Box 3.4. Standards for commissioning evaluations in Costa Rica
The ministry of national planning and economic policy (Mideplan) has developed specific guidelines and standards for the preparation of terms of reference for policy evaluations (“Guía de Términos de Referencia: Orientaciones para su elaboración: estructura y contenido”). This handbook provides inputs on the recommended technical content and basic structure of terms of reference (ToRs) for policy evaluations that are commissioned to an external agent. The methodological tool is composed of two main parts:
1. what the ToRs are and what they are for
2. the basic structure that this document should present and the essential characteristics of its content.
According to the document, terms of reference should include at least the following criteria:
Finally, the guidelines recommend that the ToRs be clear and concrete, and that the main actors involved in the public intervention and its evaluation submit them for consultation and validation.
Source: (Mideplan, 2018[126]).
In fact, drafting the terms of reference (ToRs) frames the work to be carried out during the evaluation process; ToRs therefore constitute an essential tool for quality assurance (Kusters et al., 2011[125]) and an essential document of any evaluation. Country guidelines mention that ToRs should cover the background context of the evaluation, its scope, goals, methodology, team composition, stakeholders to be engaged and the evaluation budget (Independent Evaluation Office of UNDP, 2019[127]). Evaluation guidelines developed by countries may also specify that ToRs should be drafted by the evaluation manager once the relevant data and documents have been collected.
Planning out evaluations and identifying the appropriate resources
Good evaluation planning may also be important to ensure quality, as well as use. Many researchers emphasise the importance of the timeliness of evaluation results in promoting their use in decision-making (Leviton and Hughes, 1981[100]): the consensus is that evaluations should be thought through well in advance and the evaluation process planned out carefully. Likewise, resource limitations can strongly influence an evaluation’s impact and use, making the identification of human and financial resources an important step in planning out the evaluation process.
However, only a minority of countries (14 countries, of which 11 OECD countries) include clauses regarding the establishment of a calendar for policy evaluation in their guidelines. Similarly, less than half of OECD countries and of overall respondents (16 countries, of which 14 OECD countries) include standards regarding the identification of human and financial resources for evaluation in their guidelines.
One notable exception is Korea’s office for government policy coordination (2017) framework act on government performance evaluation, which recommends a systematic approach to evaluation on a yearly basis in order to facilitate planning out resources for this purpose.
Box 3.5. Korea’s office for government policy coordination (2017) framework act on government performance evaluation
The framework act on government performance evaluation recommends that all government agencies formulate a yearly internal evaluation plan to identify the major policies to undergo review each year. The results of the evaluations are to be submitted to the government performance evaluation committee (GPEC) in the spring. Such evaluation plans may allow evaluators to adequately plan the necessary resources for the evaluations, as well as to ensure that the results are useful to decision-makers, as the timeline for the publication of results is clear.
The document gives specific instructions regarding the composition of the GPEC, which is in charge of implementing government performance evaluation:
It should be composed of no more than fifteen members including two chairpersons.
Members should have earned a degree in a discipline related to evaluation and have experience related to academia or a research institute.
Members who are not government officials should hold office for a two-year term, and may serve only one additional consecutive term.
Source: Korea Office for Government Policy Coordination (2017), Framework Act on Government Performance Evaluation.
Design of data collection methods
A majority of countries (20 countries, of which 17 OECD countries) have included standards for the design of data collection methods in their guidelines. Indeed, although experimental approaches, such as randomised controlled trials, can sometimes take advantage of existing administrative data, it is often necessary to collect new data for an evaluation, using social research methods.
The design of data collection methods is key to conducting policy evaluations, and many country guidelines include data collection standards: samples should be representative, the collection approach well designed and the sample size appropriate (OECD, forthcoming[106]). Moreover, guidelines prescribe that data should be open to critical questioning and challenge, and that their use in an analysis should be explained and justified. The analysis itself should subsequently be well defined and well executed (see examples for France and Norway in Box 3.6).
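On the question of appropriate sample size, a standard power calculation indicates how many observations are needed to detect a given effect. The sketch below uses the usual two-sample z-approximation for comparing two means; the effect size and standard deviation in the example are invented.

```python
import math

from scipy.stats import norm


def n_per_group(delta, sigma, alpha=0.05, power=0.8):
    """Sample size per group to detect a mean difference `delta` between two
    groups with outcome standard deviation `sigma` (two-sided z-test)."""
    z_alpha = norm.ppf(1 - alpha / 2)  # critical value for the two-sided test
    z_beta = norm.ppf(power)           # quantile corresponding to desired power
    return math.ceil(2 * ((z_alpha + z_beta) * sigma / delta) ** 2)


# Hypothetical example: detect a 2-point gain on a test score with std 10.
print(n_per_group(delta=2, sigma=10))  # ~393 participants per group
```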
Box 3.6. Data collection standards in guidelines
French guidelines on evaluating the impact of public policies
The French guidelines for decision makers and practitioners on how to evaluate the impact of public policies (Comment évaluer l’impact des politiques publiques: un guide à l’usage des décideurs et des praticiens, 2016) underline how the quality of an impact evaluation depends on the availability, breadth and quality of data on the policy being evaluated. According to the guidelines, creating relevant indicators to measure the impact of a policy requires access to various data sources and variables, and thereby a frequent matching of statistical sources. The right type of data should be collected for a valid implementation of the evaluation method chosen. For example, the guidelines describe how, when using the matching method to establish the causal effect of a policy on certain outcomes, data on individuals and their social and economic environment has to be sufficiently rich to minimise selection bias.
The guidelines recommend that, when using qualitative data (from surveys, field observations or case studies), the credibility of results be increased by comparing and combining information from different actors and methods. These guidelines conclude on the need to institutionalise and better operationalise the production of and access to data. Examples of processes to promote data access include the accelerated provision of administrative files and facilitated access procedures to institutions such as the National Council for Statistical Information (CNIS). Lastly, the guidelines note the virtue of conducting a systematic review of readily available data to assess whether collecting new data is needed in the first place. In addition, France has created secure access to statistical and administrative micro data through a single entry point to a large number of data producers (www.casd.eu/en/).
Norwegian guidelines on carrying out evaluations
These guidelines give an explicit methodology for collecting data, from choosing the collection instrument (survey, interview, observation, etc.) and the subjects (individuals, businesses, etc.) to obtaining the information, and registering and processing the data. They give advice on the choice of data collection and analysis methods, and recommend combining them to increase the quality of a single evaluation. Among other examples, a precise step-by-step guide is provided on how to conduct a survey, one of the most common data collection methods. This guide includes suggestions on reaching as many relevant respondents as possible, designing clear and precise questions and achieving a high response rate. Finally, these guidelines emphasise the importance of evaluators’ analytical knowledge and skills to ensure the correct use of data and the avoidance of data saturation.
Sources: France Stratégie (2016[128]); Norway Ministry of Finance (2005), Guidelines on carrying out evaluations.
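The matching method mentioned in the French guidelines can be sketched in a few lines: each participant is paired with the most similar non-participant on observed covariates, and the average outcome gap across pairs estimates the effect on the treated. The data below are simulated, the covariates (age, prior income) are chosen purely for illustration, and a real application would also need to check common support and covariate balance.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)

# Simulated covariates (age, prior income) and outcomes; the true effect is +800.
X_treated = rng.normal([40, 25_000], [5, 4_000], size=(200, 2))
X_control = rng.normal([42, 27_000], [6, 5_000], size=(1_000, 2))
y_treated = X_treated @ [10.0, 0.02] + 800 + rng.normal(0, 100, 200)
y_control = X_control @ [10.0, 0.02] + rng.normal(0, 100, 1_000)

# Standardise covariates so distances are not dominated by the income scale.
mu, sd = X_control.mean(axis=0), X_control.std(axis=0)
nn = NearestNeighbors(n_neighbors=1).fit((X_control - mu) / sd)
_, idx = nn.kneighbors((X_treated - mu) / sd)

# Average treatment effect on the treated: gap versus matched controls.
att = (y_treated - y_control[idx.ravel()]).mean()
print(f"Estimated effect: {att:.0f} (true effect: 800)")
```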
Evaluation methods
Choosing the appropriate evaluation method is paramount to an evaluation’s quality. A high quality evaluation method addresses the issue of attribution (causality) by providing insights on whether, and to what extent, a policy delivered its intended outcomes.
Table 3.3. Impact evaluation methods
| Category | Evaluation method | Description | Limits |
|---|---|---|---|
| Quasi-experimental | Pre-post | Impact is measured as the change in the outcomes of participants before and after the policy is implemented. | Factors other than the policy itself that might have influenced the outcomes of participants are not accounted for. |
| Quasi-experimental | Simple difference | Outcomes of participants and non-participants after the policy is implemented are compared. | Results are biased if participants and non-participants had different chances of being affected by the policy before its implementation, or if they differ in ways other than their participation status. |
| Quasi-experimental | Difference-in-differences | The policy effect is measured by comparing the evolution of participants’ outcomes before and after its implementation with the evolution of non-participants’ outcomes over the same period. | There will be bias if the control group does not actually reflect what would have happened to the treatment group had it not been treated. For valid results, the observable and unobservable differences between the two groups should also be constant over time. |
| Quasi-experimental | Multiple linear regression | The outcomes of participants and non-participants are compared, controlling for observable differences between the two groups that might affect their outcomes (gender, income, education, age, etc.). | Unobservable, unmeasurable and unmeasured factors may still differ across the two groups and affect the measured outcome, which would limit the validity of the estimated causal impact of the programme. |
| Quasi-experimental | Statistical matching | Participants and non-participants who have otherwise similar characteristics are compared. | Unobserved, unmeasurable and unmeasured characteristics may still bias the estimated effect. |
| Quasi-experimental | Regression discontinuity design | Individuals are ranked according to a measurable criterion, and a cut-off determines their participation in the policy. Participants just above the cut-off are compared to non-participants just below. | There is a risk that individuals manipulate their score on the criterion to become eligible (or not) for the policy, which introduces bias. Moreover, the measured effect is only “local”, meaning that it holds only for individuals close to the cut-off. |
| Quasi-experimental | Instrumental variables | The effect is measured by identifying an “instrumental” variable that affects the outcome of interest only indirectly, through determining whether an individual participates in the policy. This instrument should not be related to any other factor affecting the outcome of interest. | The validity of results relies on finding a good instrument, i.e. one that predicts the outcome only through programme participation, which is difficult in practice. |
| Experimental | Randomised evaluation | This experimental method consists in randomly assigning individuals to participate in the policy or not, and comparing the outcomes of the two groups. Random assignment removes, on average, any differences between participants and non-participants apart from their participation status. | Causal estimation from a randomised evaluation is valid only if randomisation was properly conducted. Examples of bias are that the effect on the treatment group “spilled over” onto the control group (spill-over effects), or that individuals assigned to treatment ended up not participating in or dropping out of the programme (attrition bias). |
Source: Based on J-PAL (2016), Impact Evaluation Methods: What are they and what assumptions must hold for each to be valid?
Specifically, impact evaluation methods provide a solid counterfactual, that is to say, they take into account all the other factors that could generate an observed outcome (Campbell and Harper, 2012[129]). The question of which approach is most appropriate will depend on the complexity of the relationships between an intervention’s inputs, activities, outputs, outcomes and impacts.
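As a concrete illustration of this counterfactual logic, the sketch below simulates the difference-in-differences method from Table 3.3: a naive pre-post comparison absorbs the common time trend, while differencing against a control group recovers the true effect. All numbers are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 5_000

# Simulated outcomes with a common time trend (+2), a persistent gap between
# the treated and control groups (+5), and a true policy effect (+3).
trend, group_gap, true_effect = 2.0, 5.0, 3.0
treated_before = 10 + group_gap + rng.normal(0, 1, n)
treated_after = 10 + group_gap + trend + true_effect + rng.normal(0, 1, n)
control_before = 10 + rng.normal(0, 1, n)
control_after = 10 + trend + rng.normal(0, 1, n)

# A naive pre-post comparison confounds the policy effect with the time trend.
pre_post = treated_after.mean() - treated_before.mean()

# Difference-in-differences nets out both the group gap and the common trend.
did = (treated_after.mean() - treated_before.mean()) - (
    control_after.mean() - control_before.mean())

print(f"Pre-post estimate:         {pre_post:.2f} (biased by the trend)")
print(f"Difference-in-differences: {did:.2f} (true effect: {true_effect})")
```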
According to OECD data, two thirds of countries (21 countries overall, of which 19 OECD countries) include provisions detailing quality standards for evaluation methods (see the column “Quality standards of evaluations” in Table 3.2). Some countries have developed methodological guidebooks or manuals with the primary intent of delving deeper into evaluative methods in order to provide evaluators with practical advice for the implementation of an evaluation. In Great Britain, for instance, several methodological handbooks provide detailed recommendations on how to evaluate policy impacts and conduct programme appraisals and evaluations (Box 3.7). Other countries that have developed such handbooks or detailed guidelines on evaluative methods include France (France Stratégie, Desplatz and Ferracci, 2016[130]), Spain (AEVAL, 2015[131]) and Lithuania (Ministry of Finance (Lithuania), 2011[132]).
Box 3.7. Evaluation guidelines in Great Britain
The UK Government has been committed to improving central and local government efficiency and effectiveness through the development of different tools to ensure public policies are based on reliable and robust evidence. To achieve this, HM Treasury’s Green and Magenta Books together provide detailed guidance, aimed at policy makers and analysts, on how policies and projects should be assessed and reviewed, which makes the two sets of guidance complementary.
The Magenta Book: guidance for evaluation
The Magenta Book comprises central government guidance on public policy evaluations. It presents standards of good practice in conducting evaluations, and seeks to provide an understanding of the issues faced when undertaking evaluations of projects, policies, programmes and the delivery of services.
The Green Book: central government guidance on appraisal and evaluation
The Green Book is guidance issued by HM Treasury on how to appraise policies, programmes and projects. It also provides guidance on the design and use of monitoring and evaluation before, during and after implementation. A range of templates and guidance on specific analysis topics and analysis techniques, which are frequently encountered during government analysis, are found in the Aqua book.
The Aqua Book: guidance on producing quality analysis for government
The Aqua Book is a suite of resources aimed at improving analytical quality assurance. Combining the high-level principles of analytical quality assurance, together with clarified roles and responsibilities, the Aqua Book helps departments and agencies embed an analytical environment that assists the delivery of quality analysis, deliver greater consistency in the approach to analytical quality assurance processes across government and ensure that the commissioners of analysis have greater confidence in analysis.
In practice, nevertheless, only a minority of countries use impact evaluation methods, such as randomised controlled trials, to evaluate their government-wide policy priorities (8 countries, of which 7 OECD countries).
Table 3.4. Methods used by countries in the evaluation of government-wide policy priorities
| | Regression/econometrics/structural equation modelling | Randomised controlled trials | Qualitative comparative analysis | Contribution analysis | (Comparative) case studies | Process tracing | Theory-based evaluation |
|---|---|---|---|---|---|---|---|
| Australia | ○ | ○ | ○ | ○ | ○ | ○ | ○ |
| Austria | ○ | ○ | ○ | ○ | ○ | ● | ○ |
| Canada | ● | ○ | ● | ● | ● | ○ | ● |
| Chile | ○ | ○ | ● | ○ | ● | ○ | ○ |
| Estonia | ● | ○ | ● | ● | ○ | ○ | ○ |
| Finland | ○ | ○ | ● | ● | ● | ● | ○ |
| France | ● | ● | ● | ○ | ● | ○ | ○ |
| Germany | ● | ● | ● | ● | ● | ● | ● |
| Great Britain | ● | ● | ● | ● | ● | ● | ● |
| Greece | ● | ○ | ● | ● | ● | ● | ○ |
| Hungary | ○ | ○ | ● | ○ | ○ | ● | ○ |
| Ireland | ● | ● | ● | ○ | ● | ○ | ○ |
| Israel | ○ | ○ | ○ | ○ | ○ | ● | ○ |
| Italy | ○ | ● | ○ | ○ | ● | ○ | ○ |
| Japan | ○ | ○ | ○ | ○ | ● | ○ | ● |
| Korea | ○ | ○ | ○ | ○ | ○ | ○ | ○ |
| Lithuania | ○ | ○ | ● | ○ | ● | ○ | ○ |
| Latvia | ● | ○ | ● | ○ | ○ | ○ | ● |
| Mexico | ● | ● | ● | ● | ○ | ● | ● |
| Poland | ● | ● | ● | ● | ● | ● | ● |
| Portugal | ○ | ○ | ○ | ○ | ○ | ● | ○ |
| Slovakia | ● | ○ | ○ | ○ | ○ | ○ | ● |
| Spain | ● | ○ | ● | ○ | ○ | ● | ○ |
| Sweden | ● | ○ | ○ | ○ | ● | ○ | ● |
| OECD total: ● Yes | 13 | 7 | 15 | 8 | 13 | 11 | 9 |
| OECD total: ○ No | 11 | 17 | 9 | 16 | 11 | 13 | 15 |
| Argentina | ○ | ○ | ○ | ○ | ○ | ○ | ○ |
| Brazil | ● | ○ | ● | ○ | ● | ● | ● |
| Colombia | ● | ● | ● | ○ | ● | ○ | ○ |
| Costa Rica | ● | ○ | ● | ● | ● | ● | ● |
| Romania | ● | ○ | ○ | ○ | ○ | ○ | ○ |
Note: n=29 for the main institution in charge of government-wide policy priorities. 4 countries answered that they do not have government-wide policy priorities. Moreover, 9 countries answered that they do not evaluate their government-wide policy priorities. Answers reflect responses to the question, “Which quantitative or qualitative methods of impact evaluation have been used over the past three years for the evaluation of government wide policy priorities? (Check all that apply)”.
Source: OECD Survey on Policy Evaluation (2018)
Guidelines for the good governance of the evaluation process
Individuals and organisations conducting policy evaluations also need to ensure the credibility of the evidence produced by putting in place mechanisms to promote the integrity of the evaluation process (OECD, forthcoming[106]). In fact, an evaluation’s impact can depend on its perceived quality, in terms of readability, transparency and lack of bias, as much as on its technical quality. Stakeholders and an evaluation’s clients must therefore trust its findings and find them credible (Heider, 2018[133]).
Independence of evaluations
The independence of the process for conducting policy evaluations is a crucial element of their credibility (France Stratégie, Desplatz and Ferracci, n.d.[134]). The notion of independence can be understood as an evaluation being free from undue political pressure and organisational influence. The literature distinguishes between several types of independence: structural, functional and behavioural (Vaessen, 2018[104]; Picciotto, 2013[114]).
Box 3.8. Understanding independence in evaluations
Independence in evaluations is a critical element of their credibility and, ultimately, their quality. It consists in evaluations being free from, and protected against, undue political and managerial influence. Three types of independence are mentioned in the literature (Vaessen, 2018[104]):
Structural and functional independence refer to the independence of the evaluation team with respect to management, both in terms of the object and processes of the evaluation and in the decisions concerning human and financial resources.
Behavioural independence relates to the unbiasedness and integrity of the evaluator.
As such, independence requires avoiding conflicts of interests, complying with ethical norms of conduct and the independence of the evaluation commissioners themselves. In practice, independence is usually difficult to achieve in internal evaluations, where political influence is often exerted and various political interests are at stake. Accordingly, appointing an external evaluator is a common solution to foster more impartial and trustworthy results, but it may not always solve the issue of the pressures from private interests and lobbying efforts, which can implicitly weigh on external evaluators.
Independence can only be pursued to a certain extent, as there is a complex trade-off between evaluation independence and quality. External evaluators are indeed more likely to be free from political biases, but they risk lacking sufficiently thorough knowledge of the policy being evaluated. They can also be subject to influence by specific private interest groups and may have more difficulty accessing relevant administrative data. Conversely, internal evaluators have the potential to offer constructive views and expertise thanks to their familiarity with the policy subject and knowledge of its political relevance.
Lastly, managerial influence can also provide effective incentives and positive support so that the results of evaluations are used and understood.
OECD data shows that 20 countries include provisions regarding the independence of evaluations in their evaluation guidelines, understood broadly. While evaluation guidelines usually put emphasis on behavioural independence (i.e. how the evaluator should act to maintain independence in the evaluative process), countries have also put in place other safeguard mechanisms to ensure the structural and functional independence of government evaluators. Refer also to Chapter 2 on the institutionalisation of policy evaluation for a detailed discussion of the subject.
Box 3.9. Australia’s Productivity Commission: An autonomous government body
The Australian Government’s Productivity Commission is an autonomous research and advisory body that focuses on a range of economic, social and environmental issues affecting the wellbeing of Australians. At the request of the Australian Government, it provides independent, high-quality advice and information on key policy and regulatory issues. It also conducts self-initiated research to support the Government in its performance and annual reporting, and acts as the secretariat, under the Council of Australian Governments, for the inter-governmental review of government service provision.
The Commission is located in the Government’s Treasury portfolio and its activities range across all levels of government. It does not have executive power and does not administer government programmes. The Commission is nevertheless effective in informing policy formulation and public debate thanks to three characteristics:
Independence: it operates under its own legislation, and its independence is formalised through the Productivity Commission Act. Moreover, it has its own budget allocation and permanent staff working at arm’s length from government agencies. Although the Commission’s work programme is largely defined by the government, its results and advice are always derived from its own analyses.
Transparent processes: all advice, information and analysis produced and provided to government is subject to public scrutiny through consultative forums and the release of preliminary findings and draft reports.
Community-wide perspective: under its statutory guidelines, the Commission is required to take a view that encompasses the interests of the Australian community as a whole rather than those of particular groups.
Source: Australian Government. “About the Commission” and “How we operate”. Accessed September 2nd 2019. https://www.pc.gov.au/about, https://www.pc.gov.au/about/operate
Ethical conduct of evaluators
Standards for the ethical conduct of evaluators are found in approximately a third of the sample: out of the 42 countries that responded to the survey, 13 have developed such standards (11 of them OECD countries). These standards can include provisions for the use of administrative and big data, for instance where issues of consent are raised because information provided by citizens is being used. Other approaches focus on ensuring that evaluators conduct their research and data collection in ways that safeguard the dignity, rights, safety and privacy of participants (e.g. the US Office of Management and Budget (OMB) guidance).
Finally, standards for the ethical conduct of evaluators include mechanisms focused on the prevention of conflicts of interest. In fact, a key part of standards of public life is that officials do not act or take decisions in such a way as to gain financial or other material benefits. Such principles of ethical conduct are outlined in the US Office of Management and Budget’s monitoring and evaluation guidelines for agencies that administer foreign assistance, which advise the full disclosure of any conflict of interest among evaluators (Office of Management and Budget, 2018[136]), as well as in the recent programme evaluation standards and practices issued as part of the implementation of the 2018 Foundations for Evidence-Based Policy Making Act (Table 2.4). Similarly, the Swiss guide for evaluating the Confederation’s efficacy also underlines the importance of determining relevant actors’ and stakeholders’ needs and interests early enough to allow sufficient time to identify and resolve conflicts of interest (Office fédéral de la justice, 2005[137]).
The OECD has also developed ‘Guidelines for managing conflict of interest in the public service’, whose primary aim is to help countries, at the central government level, consider their conflict of interest policies and practices. These guidelines pertain to all public officials, in any capacity, and are not specifically geared towards evaluators or other producers of evidence.
Box 3.10. Sources of conflict of interest in evaluations
Conflicts of interest often arise when evaluators have previous or intended future work experience related to the policy being evaluated (Independent Evaluation Office of UNDP, 2019[127]). To minimise such conflicts, evaluation commissioners may avoid employing evaluators who had prior engagement in the decision-making, financing or design of the policy being evaluated. Evaluators of a particular policy should not subsequently be involved in any service related to that same policy, from implementation to design.
Conflicts of interest may also stem from particular personal relationships between evaluators and commissioners, such as close family members who may be in a position to influence the evaluation or its outcome on the policy (Picciotto, 2013[56]). Research has shown that, although often unnoticed, evaluation clients can exert pressure on evaluators, a source of conflict that may be avoided by improving communication between the two parties (Pleger and Hadorn, 2018[115]).
Lastly, lobbyists and advocacy groups can exert influence to further their particular interests, often at the expense of the public interest. It is nevertheless important to note that these groups also have the capacity to bring valuable information into the evaluation and its related policy debate. Overall, evaluators should follow the principle of full disclosure of any actual or potential conflicts of interest, and procedures should be put in place to identify relationships that might put the objectivity of the evaluation at risk. An example of such a procedure is analysing the résumés of current and potential evaluators and circulating them to partners and stakeholders to decide whether they should be engaged or dismissed.
Sources: OECD (2019) Meeting of the Coalition of Influencers on Integrity in Public Decision-Making, (Independent Evaluation Office of UNDP, 2019[127]), (Pleger and Hadorn, 2018[138]), (Picciotto, 2013[56]).
Other standards relating to the good governance of the evaluation process
In addition to the previously mentioned standards for the oversight of evaluations, other standards have been identified as relevant in the literature. These include the principles of transparency, accountability, appropriateness and integrity. The OECD is currently conducting a mapping of principles and standards for the good governance of evidence. This exercise involves an extensive stocktaking of country and academic experiences to identify a list of core principles for the governance of evidence. These principles, which are equally applicable to the governance of policy evaluations, mainly address issues such as the appropriateness of the evidence, the accountability and transparency of evidence, and the need for evidence to be open to critical questioning and public scrutiny (OECD, forthcoming[106]).
Standards embedded in legal frameworks
Some countries have also embedded such standards in their policy or legal framework, meaning that the standards are included in normative instruments. Overall, fewer countries have chosen to embed standards for good quality evaluations in a normative instrument, suggesting that countries view these standards as recommendations to be applied proportionately by evaluators and managers depending on the local context, rather than as fixed rules. For instance, only nine OECD countries have adopted standards for quality methods in their policy/legal framework related to policy evaluations.
Figure 3.1. Countries that have standards for quality evaluations in their policy/legal framework
Note: n=20 (all surveyed countries that have a policy framework, of which 16 are OECD countries). Answers reflect responses to the question, “Which elements do(es) the document/s referred to under Q4 and Q5 cover concerning policy evaluation across government? (Check all that apply)”.
Source: OECD Survey on Policy Evaluation (2018).
Korea’s Office for Government Policy Coordination’s Framework Act on Government Performance Evaluation (2017) contains quality standards relating to planning and carrying out evaluations. Likewise, the National Evaluation Policy (PNE) in Costa Rica seeks to ensure the quality of evaluations by promoting the evaluability of government programmes, increasing the involvement of stakeholders in the evaluative process and establishing competency requirements for evaluators. The Czech Republic, Germany, Spain, Estonia, Great Britain, Korea, Lithuania, Poland and Costa Rica have also embedded standards related to the ethical conduct of evaluators in their legal frameworks.
While this report focuses mainly on public sector standards related to the quality of evaluation, there are also many standards established and proposed by the private sector (OECD, forthcoming[106]).
Measures to control the quality of the evaluation product
In various countries, quality control mechanisms have been developed in addition to the standards and guidelines in place to ensure the quality of policy evaluations. Mechanisms for quality control ensure that the evaluation design, as well as its planning and delivery, have been properly conducted to meet pre-determined quality criteria. While quality assurance mechanisms seek to ensure credibility in how the evaluation is conducted (the process), quality control tools ensure that the end product of the evaluation (the report) meets a certain standard of quality. Both are key elements in ensuring the robustness of policy evaluations (HM Treasury, 2011[9]). Overall, quality control mechanisms are much less common than quality assurance mechanisms, with only approximately one third of countries (31% overall) using a quality control mechanism, such as a peer review of evaluations or meta-evaluations. Japan offers an example: the Ministry of Internal Affairs and Communications (MIC) is in charge of the quality assurance of policy evaluation. It checks the ministries’ evaluations, holds inter-ministerial liaison meetings, draws on the insights of academic and practical experts, and publicises information about policy evaluation (see Box 3.11).
Box 3.11. The review function of Japanese Ministry of Internal Affairs and Communications (MIC)
The MIC conducts coherent and comprehensive quality control of the policy evaluations carried out by the ministries. The Administrative Evaluation Bureau (AEB) reviews the evaluations carried out by the ministries, identifying elements that need to be improved and publicised, on the basis of the basic guidelines for implementing policy evaluation (Cabinet Decision of 2005, last revised in 2017). This includes:
an examination of the objectivity and rigour of policy evaluations conducted by ministries
a determination of whether a new or further evaluation needs to be implemented
the conduct of evaluations where objective and rigorous implementation is deemed impossible if left to the ministry concerned.
The role of liaison meetings
The MIC hosts inter-ministerial liaison meetings to foster close communication, ensure the implementation of evaluations and promote initiatives related to policy evaluation, with a view to improving quality.
The use of academic experts
The insights of academic and practical experts are used to ensure the objective and rigorous implementation of policy evaluation, thereby assuring quality. Experts’ insights are collected through interviews at various steps, including defining how policy evaluation should feed the policy management cycle (such as the plan-do-check-act (PDCA) cycle), setting the primary goals of policies, and summarising policy evaluation results.
The policy evaluation council
The Policy Evaluation Council, established under the MIC, investigates and discusses important matters relating to policy evaluation and to the AEB’s investigations. The council is composed of members selected for their expertise in the academic, administrative and private fields. With regard to policy evaluation, the council discusses important matters relating to the development and revision of guidelines, and to the objectivity and rigour of evaluation results.
Publicising information about policy evaluation
Policy evaluation reports are made public, together with information on how the results are used in the development of policy. The MIC must also publicise its evaluation plan and evaluation reports. The MIC also prepares an annual report on the status of policy evaluation conducted by the ministries and on how the results of evaluations have been reflected in the policy-making process; this report must be publicised and reported to the National Diet of Japan. The MIC aggregates the results of policy evaluation by the ministries on the Portal Site for Policy Evaluation. This system contributes to ensuring the quality of evaluations as well as the accountability and transparency of their implementation.
Source: Ministry of Internal Affairs and Communications (Japan).
Peer review of evaluation products
The most common control mechanism used by countries to promote the quality of evaluations is the peer review process. Peer reviews consist of a panel or reference group, composed of external or internal experts, subjecting an evaluation to a review of its technical quality and substantive content. The peer review process helps determine whether the evaluation meets adequate quality standards and can therefore be published, as illustrated by examples from Portugal and Germany (Box 3.12).
Figure 3.2. Peer reviews
Note: For the main institution n=42 (35 OECD member countries). For the health ministries n=31 (28 OECD member countries); 9 countries (7 OECD member countries) did not participate in this survey, and 2 countries (1 OECD member country) are not included, as they answered that none of the policies that fall under their institution’s responsibility are evaluated. For the Public Sector Reform (PSR) ministries n=25 (20 OECD member countries); 11 countries (10 OECD member countries) did not participate in this survey, and 6 countries (5 OECD member countries) are not included, as they answered that none of the policies that fall under their institution’s responsibility are evaluated. Answers reflect responses to the question, “How does your government ensure the quality of evaluations across government?”.
Source: OECD Survey on Policy Evaluation (2018).
Box 3.12. Internal and external peer reviews in Portugal and Germany
Evaluation of the Portuguese Simplex Programme
The evaluation of the Portuguese Simplex programme shows a form of combined (internal and external) peer review. At the internal level, project managers are required to report regularly on the progress of the project plan. Reporting is done through an electronic platform and during meetings with key stakeholders and partners, allowing for relevant internal and external insights. The results of such reviews are uploaded to a publicly accessible website, so that citizens may also have a critical say on the advancement of the programme and on the progress report shared every quarter. At the same time, external contractors, such as academics from Nova University and evaluators from the European Commission, evaluate the programme.
Evaluation of the German Strategy on Sustainable Development Goals
The German Chancellor invited an international group of recognised experts to review the country’s 2013 Sustainability Strategy. Following a first external peer review conducted in 2009, this review included a variety of experts, such as Korean and German experts, the former Senior Vice-President of Unilever, members of parliament, the Chair of WWF South Africa, and the former Director General of the UK Department of Environment. Such a diverse peer review group may provide constructive insights on the evaluation of the German strategy, allowing for further improvements in its quality and, ultimately, its use.
Source: OECD (2018) Survey on Policy Evaluation, German Council for Sustainable Development (2013) Peer review: Germany leads the way https://www.bundesregierung.de/breg-en/issues/sustainability/peer-review-germany-leads-the-way-402952 (Accessed 28th of August 2019).
Meta-evaluations
Meta-evaluation originally referred to the evaluation of an evaluation to control its quality and/or assess the overall performance of the evaluation (Scriven, 1969[139]). Nowadays, the term mainly refers to evaluations designed to aggregate findings from a series of evaluations. In the latter meaning, meta-evaluation is an evidence synthesis method (see the section on ‘Methods for Reviewing and Assessing the Evidence Base’ for other evidence synthesis methods), which serves to evaluate the quality of a series of evaluations (by assessing them through reports and other relevant sources) and their adherence to established standards. As such, meta-evaluations constitute a useful tool for reviewing the quality of policy evaluations before they are made publicly available. The figure below shows that a relatively limited number of countries use meta-evaluations to control the quality of evaluations, which might be due to a lack of skills, familiarity or methods.
Figure 3.3. Systematic and meta-evaluations
Note: For the main institution n=42 (35 OECD member countries). For the health ministries n=31 (28 OECD member countries); 9 countries (7 OECD member countries) did not participate in this survey, and 2 countries (1 OECD member country) are not included, as they answered that none of the policies that fall under their institution’s responsibility are evaluated. For the PSR ministries n=25 (20 OECD member countries); 11 countries (10 OECD member countries) did not participate in this survey, and 6 countries (5 OECD member countries) are not included, as they answered that none of the policies that fall under their institution’s responsibility are evaluated. Answers reflect responses to the question, “How does your government ensure the quality of evaluations across government?”. Systematic and meta-evaluations refer to evaluations designed to aggregate findings from a series of evaluations. The term can also denote the evaluation of an evaluation to judge its quality and/or assess its performance. Countries that reported no mechanisms to ensure the quality of evaluations across government are counted as zero.
Source: OECD Survey on Policy Evaluation (2018).
Box 3.13. Meta-evaluations
A meta-evaluation is a systematic, managed and controlled method to assess the quality of the processes and results of the evaluations carried out (Malčík and Seberová, 2010[140]). Meta-evaluations can take several forms:
Formative meta-evaluations intend to guide a primary evaluation (Stufflebeam, 1978[141]). In this dimension, the meta-evaluation is used as an instrument to improve or change an ongoing evaluation’s design and implementation (Better Evaluation, 2019[142]).
Summative meta-evaluations denote studies that judge the merits of completed evaluations (Better Evaluation, 2019[142]). This dimension is connected to ensuring the quality, validity and correctness of the primary evaluation, thus verifying whether key principles have been followed and whether its results can themselves be judged relevant, valid and reliable.
Sources: as cited in the text.
A notable exception is the meta-evaluation in Costa Rica, led in 2016 by the programme for the promotion of evaluation capacities in Latin America (Programa de Fomento de Capacidades en Evaluación en diversos países de América Latina, FOCEVAL). This meta-evaluation sought to assess the usefulness of a set of evaluations2, their methodological rigour, their success in resource management, and their professional and ethical performance. The meta-evaluation provided relevant information to the Ministry of National Planning and Economic Policy (Mideplan) for improving stages of the evaluation process, such as an agreement among institutional authorities to reduce the time needed to start evaluations, and an enhancement of the terms of reference to promote a more rigorous and clear evaluation process.
Self-evaluation tools and checklists
Finally, some countries have also developed tools aimed either at evaluators themselves (i.e. self-evaluation) or at the managing and/or commissioning team (for example, quality control checklists) to help them check whether their work meets the appropriate quality criteria.
Quality control checklists aim to standardise quality control practices for evaluation deliverables and as such can be useful to evaluation managers, commissioners, decision-makers or other stakeholders reviewing evaluations against a set of pre-determined criteria (Stufflebeam, 2001[143]). The evaluation unit of the European Commission, for example, includes a clear quality criteria grid in its terms of reference, against which the evaluation manager assesses the work of the external evaluators (OECD, 2016[144]).
Self-evaluation, on the other hand, is a critical review of project/programme performance by the operations team in charge of the intervention. Although less commonly used (only two respondent countries mentioned their use), self-evaluation tools can form an important element of a quality control system (OECD, 2016[144]), as they constitute the first step in the control process.
Box 3.14. Self-evaluation checklists in Spain and Poland
Only two countries reported the use of a self-evaluation checklist; their practices are presented below:
The Spanish Institute for the Evaluation of Public Policies’ satisfaction survey
The Spanish Institute for the Evaluation of Public Policies (IEPP, formerly AEVAL) has developed a satisfaction survey through which participants in the evaluation share their satisfaction with the evaluation process and its quality. This stage of the evaluation follow-up process favours responsiveness to the evaluation client by providing specific measurements of the quality and degree of usefulness of evaluation products such as the evaluation report.
The Polish Ministry of Infrastructure and Development’s self-assessment checklist
This self-assessment checklist, presented in the national guidelines on the evaluation of cohesion policy for 2014-2020, aims to prevent recommendations from poor-quality evaluations from being implemented. The checklist is one of the components of meta-evaluation, focusing on the skills and practices of the evaluators rather than on the evaluation more broadly. It includes criteria such as the extent to which the objectives were achieved, the methodology used and the reliability of the data. Each criterion is given a numerical rating that can be supplemented with qualitative comments (Polish Ministry of Infrastructure and Development, 2015[145]).
Sources: OECD (2018) Survey on Policy Evaluation, Polish Ministry of Infrastructure and Development (2015), Self-Assessment Checklist.
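By way of illustration, the minimal sketch below shows how a checklist of the kind described in Box 3.14 could be represented and aggregated: each criterion receives a numerical rating, optionally supplemented by a qualitative comment, and an evaluation falling below a quality threshold is flagged so that its recommendations are not taken up. The criterion names, the 1-5 scale and the threshold are illustrative assumptions, not taken from the Polish guidelines.

```python
from dataclasses import dataclass, field

# Hypothetical criteria inspired by those described in Box 3.14;
# the actual Polish guidelines define their own criteria and scale.
CRITERIA = ["objectives_achieved", "methodology", "data_reliability"]

@dataclass
class CriterionScore:
    rating: int        # assumed numerical scale: 1 (poor) to 5 (excellent)
    comment: str = ""  # optional qualitative comment

@dataclass
class ChecklistAssessment:
    evaluation_title: str
    scores: dict = field(default_factory=dict)

    def rate(self, criterion: str, rating: int, comment: str = "") -> None:
        """Record a rating (and optional comment) for one criterion."""
        if criterion not in CRITERIA:
            raise ValueError(f"Unknown criterion: {criterion}")
        if not 1 <= rating <= 5:
            raise ValueError("Rating must be between 1 and 5")
        self.scores[criterion] = CriterionScore(rating, comment)

    def average(self) -> float:
        """Aggregate the criterion ratings into a single quality score."""
        return sum(s.rating for s in self.scores.values()) / len(self.scores)

    def recommendations_usable(self, threshold: float = 3.0) -> bool:
        # An evaluation scoring below the (assumed) threshold is flagged
        # so that its recommendations are not implemented.
        return self.average() >= threshold

# Usage example with illustrative ratings
assessment = ChecklistAssessment("Evaluation of programme X")
assessment.rate("objectives_achieved", 4)
assessment.rate("methodology", 2, "Counterfactual poorly justified")
assessment.rate("data_reliability", 3)
print(assessment.average())                 # 3.0
print(assessment.recommendations_usable())  # True
```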
Promoting competencies for policy evaluation
While quality guidelines and standards provide evaluators with resources to help them make appropriate decisions when conducting evaluations, evaluators also need the appropriate competencies. Competencies ensure or promote quality in evaluation practice, as individuals who possess the right competencies are more likely to produce high-quality and utilisation-focused evaluations (Mcguire and Zorzi, 2005[146]).
Simply put, evaluators’ competencies imply having the appropriate skills, knowledge, experience and abilities (Stevahn et al., 2005[147]; American Evaluation Association, 2018[148]). Nevertheless, the wide variety of contexts (internal or external evaluations) and fields (health, education, etc.) in which policy evaluations take place means that it has been difficult for the literature to draw out a universal set of competencies needed by evaluators (King et al., 2001[149]; Stevahn et al., 2005[147]). The knowledge, skills and abilities required to conduct policy evaluation are indeed situation-dependent: they depend on the policy being evaluated, the resources available, the needs of the client and stakeholders, etc. (Mcguire and Zorzi, 2005[146]). Evaluation networks and associations have worked to establish a list of core competencies required of an evaluator, in an effort to professionalise evaluation (Podems, 2013[150]). The American Evaluation Association, for instance, has developed a list of core evaluator competencies (American Evaluation Association, 2015[151]), which focus on the professional, technical, interpersonal, management and organisational skills necessary to be an evaluator, thus reflecting the wide variety of competencies the profession requires beyond technical expertise.
Notwithstanding this heterogeneity, OECD countries have recognised the crucial role of competencies in promoting quality evaluations. Survey data shows that a number of main respondents (17, of which 13 are OECD countries) use mechanisms to support the competence development of evaluators. Sector-level practices do not differ significantly, as 16 health and 13 PSR respondents report having competence requirements for evaluators. Indeed, competency requirements are the most commonly used measure to promote quality amongst respondents across all sectors (Figure 3.4).
Competency development covers a range of training and support functions, aimed either at individual evaluators or at organisations in their entirety, as explored in the following sections.
Figure 3.4. Competence requirements for evaluators
Note: For the main institution n=42 (35 OECD member countries). For the health ministries n=31 (28 OECD member countries); 9 countries (7 OECD member countries) did not participate in this survey, and 2 countries (1 OECD member country) are not included, as they answered that none of the policies that fall under their institution’s responsibility are evaluated. For the PSR ministries n=25 (20 OECD member countries); 11 countries (10 OECD member countries) did not participate in this survey, and 6 countries (5 OECD member countries) are not included, as they answered that none of the policies that fall under their institution’s responsibility are evaluated. Answers reflect responses to the question, “How does your government ensure the quality of evaluations across government?”. Countries that reported no mechanisms to ensure the quality of evaluations across government are counted as zero.
Source: OECD Survey on Policy Evaluation (2018).
Promoting individual evaluators’ competencies
Training for internal or external evaluators
The appropriate competencies to carry out quality evaluations can also be developed by training internal and/or external evaluators, a mechanism that a number of surveyed countries have used. OECD survey data shows that training evaluators is the most commonly used technique for competency development: half (21) of respondent countries (including 19 OECD countries) provide such training. This practice is also relatively frequent at the sector level, with 13 ministries of health and 11 ministries of public sector reform organising training for their evaluators.
Figure 3.5. Training for internal or external evaluators
Note: For the main institution n=42 (35 OECD member countries). For the health ministries n=31 (28 OECD member countries); 9 countries (7 OECD member countries) did not participate in this survey, and 2 countries (1 OECD member country) are not included, as they answered that none of the policies that fall under their institution’s responsibility are evaluated. For the PSR ministries n=25 (20 OECD member countries); 11 countries (10 OECD member countries) did not participate in this survey, and 6 countries (5 OECD member countries) are not included, as they answered that none of the policies that fall under their institution’s responsibility are evaluated. Answers reflect responses to the question, “How does your institution support the competence development of evaluators? (Check all that apply)”. The option “Other” is not included.
Source: OECD Survey on Policy Evaluation (2018).
Evaluator training curricula may be created at the level of individual ministries (see Box 3.15 for an example from Slovakia) or harmonised across government, as in Austria, where several ministries, including the Ministry of Finance and the Ministry of Women and Public Services, collaborated on a manual that gives guidance on training public officials in evaluation matters (Bundesministerium für Finanzen and Bundesministerin für Frauen und öffentlichen Dienst, 2013[152]).
Box 3.15. Training Evaluators in Slovakia
Before entering the analytical team of a given ministry, Slovak analysts working on policy evaluation have to pass a test that assesses their competencies against quality standards. For instance, some institutes use a centralised test that examines the analytical skills of candidates for an evaluation job. Other institutes use their own tests to account for their specific evaluation requirements, such as good knowledge of econometric and qualitative methods and expertise in the specific policy topics the ministry focuses on.
As analysts join policy evaluation units in ministries, they are offered the opportunity to attend a broad variety of courses to deepen their knowledge of the evaluation of a specific policy topic. For this purpose, the Value for Money Institute provides an extensive list of recommended courses. To participate, analysts have to provide documents such as a motivation statement explaining their interest in the course. A board then reviews these documents and decides whether or not to offer the analyst a place in the course. Under the “Harvard 2 programme”, the European Structural and Investment Funds cover the expenses for these courses.
Source: OECD (2018) Survey on Policy Evaluation.
A specific job category for evaluators in the government
A further competency development strategy implemented by some governments has been to establish a specific job category for evaluators. This mechanism has been adopted by 8 main respondents, of which only 5 are OECD countries. At the sector level, 10 ministries of health reported having specific evaluator positions, whereas Austria is the only surveyed country with a job category for evaluators in its public sector reform ministry.
Figure 3.6. A specific job category in government
Note: For the main institution n=42 (35 OECD member countries). For the health ministries n=31 (28 OECD member countries); 9 countries (7 OECD member countries) did not participate in this survey, and 2 countries (1 OECD member country) are not included, as they answered that none of the policies that fall under their institution’s responsibility are evaluated. For the PSR ministries n=25 (20 OECD member countries); 11 countries (10 OECD member countries) did not participate in this survey, and 6 countries (5 OECD member countries) are not included, as they answered that none of the policies that fall under their institution’s responsibility are evaluated. Answers reflect responses to the question, “How does your institution support the competence development of evaluators? (Check all that apply)”. The option “Other” is not included.
Source: OECD Survey on Policy Evaluation (2018).
In particular, some OECD countries have developed dedicated professions aimed at promoting policy evaluation across government. In the UK, a total of 15 000 analysts are based across government departments. In Ireland, the Irish Government Economic and Evaluation Service (IGEES) operates as an integrated, cross-government service, supporting better policy formulation through economic analysis and policy evaluation (IGEES, 2014[153]). In the US, the recent Foundations for Evidence-Based Policy Making Act requires agencies to create three new positions: evaluation officer, statistical official and chief data officer. It also requires the creation of a new (or the enhancement of an existing) job series in the civil service for program evaluation.
Box 3.16. Policy evaluation as a profession in Ireland and the UK
The Irish Government Economic and Evaluation Service (IGEES) was created in 2012 under an initiative to extend analytical capacities for evidence-informed policy-making across the whole of government. Today, IGEES still plays a major role in building capacities for better policy formulation and implementation in all government departments on the basis of economics, statistics and evaluation practices. The service particularly aims to improve the design and targeting of policies and to contribute to better outcomes for citizens by building on existing analytical work and playing a lead role in policy analysis.
IGEES staff are integrated in each department, adding their specific analytical and policy skills and expertise across the whole of government. More than 160 IGEES staff work across all of the Irish government’s departments at different hierarchical levels, including assistant principal and administrative officer. They are either serving civil servants or staff directly recruited through the open competition process of the IGEES stream. The latter are graduates, experienced economists, evaluators and policy analysts who join analytical resources in all departments. As IGEES is an established brand among economics graduates in Ireland, it has ensured a continuous inflow of well-trained professional economics staff across government. IGEES supports capacity building and the enhancement and transfer of skills for individuals and departments through structured mobility, a learning and development framework and targeted opportunities, and platforms for discussion of analytical output and its relevance for policy. The IGEES Learning and Development (L&D) Framework intends to support capacity development according to the specific individual and business needs of each department. Following a consultation process, a cluster of skills and competencies specific to IGEES roles was developed. These skills include policy and data analysis, evaluation, quantitative methods, application of economics and civil service competencies (OECD, 2020[98]).
The UK Government Social Research Profession
The Government Social Research (GSR) profession is one of the civil service professions and works alongside other analysts (economists, statisticians and operational researchers). GSR professionals use the core methods of social scientific enquiry, such as surveys, controlled trials, qualitative research, case studies and analysis of administrative and statistical data, in order to explain and predict social and economic phenomena for policy-making.
Members of the GSR profession come from a wide variety of social science backgrounds, including candidates with degrees in psychology, geography, sociology and criminology. The GSR profession has its own competency framework, spanning entry-level graduates recruited as part of the Fast Stream through to members of the senior civil service. Most UK government departments have a chief social researcher who leads and supports the activity of social researchers within the department.
Sources: OECD (2020) (OECD, 2020[98]) Study of the Irish Government Economic and Evaluation Service; IGEES (2017) Work Programme for 2018 and IGEES Achievements in 2017; UK Government, “Government Social Research Profession”. Accessed September 2nd 2019. https://www.gov.uk/government/organisations/civil-service-government-social-research-profession/about
Certification system for evaluators
Finally, certification systems for evaluators are the least common mechanism for competency development: out of the 42 countries surveyed, only Korea and Colombia indicated using one.
Organisational measures for the promotion of competencies
Advisory panels and committees
Fifteen respondents also use organisational measures, such as advisory panels and committees, to promote the quality of evaluations. OECD data shows that panels and committees may be composed of policy practitioners, managers or evaluation experts. They may be established on an ad hoc basis or systematically. Their main aim is to provide comments and feedback throughout the different phases of the evaluation (design, data collection, synthesis, etc.).
Figure 3.7. Advisory panels/ steering committees
Note: For the main institution n=42 (35 OECD member countries). For the health ministries n=31 (28 OECD member countries); 9 countries (7 OECD member countries) did not participate in this survey, and 2 countries (1 OECD member country) are not included, as they answered that none of the policies that fall under their institution’s responsibility are evaluated. For the PSR ministries n=25 (20 OECD member countries); 11 countries (10 OECD member countries) did not participate in this survey, and 6 countries (5 OECD member countries) are not included, as they answered that none of the policies that fall under their institution’s responsibility are evaluated. Answers reflect responses to the question, “How does your institution support the competence development of evaluators? (Check all that apply)”. The option “Other” is not included.
Source: OECD Survey on Policy Evaluation (2018).
Establishment and/or support of a network of evaluators
OECD data shows that 17 main respondents have established or support a network of evaluators. Such networks are less common at the sector level. Examples can be found in the United States, Japan and Norway. In Norway, the EVA-forum is an informal network organisation chaired by the Agency for Financial Management. It is aimed at sharing experiences on issues regarding the evaluation initiation phase, the writing of terms of reference, follow-up during and after evaluation, and the sharing of evaluation results. The forum organises several networking and workshop seminars per year, as well as a yearly national evaluation conference that brings together over one hundred participants. The network collaborates closely with the national evaluation association, whose members include government officials, researchers, academics and consultants.
Figure 3.8. Establishment of a network of evaluators
Note: For the main institution n=42 (35 OECD member countries). For the health ministries n=31 (28 OECD member countries); 9 countries (7 OECD member countries) did not participate in this survey, and 2 countries (1 OECD member country) are not included, as they answered that none of the policies that fall under their institution’s responsibility are evaluated. For the PSR ministries n=25 (20 OECD member countries); 11 countries (10 OECD member countries) did not participate in this survey, and 6 countries (5 OECD member countries) are not included, as they answered that none of the policies that fall under their institution’s responsibility are evaluated. Answers reflect responses to the question, “How does your institution support the competence development of evaluators? (Check all that apply)”. The option “Other” is not included.
Source: OECD Survey on Policy Evaluation (2018).
The role of institutions and actors beyond the executive
Outside of the executive, Supreme Audit Institutions are the main institutions that play a role in promoting the quality of evaluations.
Role of SAIs in quality of evaluations and audit of the evaluation function
National audit institutions play an important role in evaluation discussions in countries that have developed a more mature evaluation culture (Jacob, Speer and Furubo, 2015[63]). Accordingly, Supreme Audit Institutions (SAIs) have taken an active part in the promotion of evaluation quality. The role played by SAIs in this regard varies, but focuses mostly on ‘soft’ instruments for quality assurance rather than on quality control instruments such as audits.
First, SAIs are often key players in the national discourse concerning evaluation quality (see, for example, the role played by the United States’ Government Accountability Office), bringing in their particular expertise in performance auditing, which gives countries external insights on how to improve the quality of their evaluation systems (Jacob, Speer and Furubo, 2015[63]). Moreover, some SAIs, such as the United Kingdom’s National Audit Office, have developed guidelines on the quality of the evaluative process (National Audit Office (GBR), 2013[154]). At the international level, the International Organisation of Supreme Audit Institutions (INTOSAI) also supports numerous SAIs in producing quality evaluations through the provision of specific and exhaustive guidelines (INTOSAI, 2016[155]).
Finally, some SAIs have promoted quality evaluations by conducting audits of the national policy evaluation system. This practice is still relatively infrequent, however; OECD data shows that about 36% of countries overall (15 surveyed countries and 12 OECD countries) have seen their policy evaluation system audited by the Supreme Audit Institution in the past ten years.
Figure 3.9. Audit of the policy evaluation system by Supreme Audit Institutions
Note: n=42. Answers reflect responses to the question, “Has your Supreme Audit Institution audited the executive’s policy evaluation system in the past ten years?”.
Source: OECD Survey on Policy Evaluation (2018).
SAIs play a dual role, as they can offer both strict audits and evaluative audits, the latter coming closer to evaluation practices. These audits are not exclusively focused on compliance with quality norms or standards, but also look at how the system functions as a whole. For instance, they may assess the legal and institutional system, the evaluative processes, the information systems in place to operate them, as well as the evaluation results and their use (Operational and Evaluation Audit Division of Costa Rica, 2014[156]). The fact that quality standards are not explicitly included in a policy/legal framework or in evaluation guidelines has not prevented the Belgian Court of Audit from recently analysing the performance of the national evaluation system.
Box 3.17. Audits of the evaluation system by SAIs: Examples from Belgium and Estonia
Belgium: audit of the ability of federal government departments to assess public policies
In its report to the federal Parliament, the Belgian Court of Audit examines whether federal government departments have the ability to assess public policies in an organised and professional way. An analysis of the evaluation system’s steering function shows that the majority of public services have developed evaluation practices. However, the Court underlines the lack of a central vision and strategy on evaluation within the public administration, which blurs the division of labour and hinders coordination between actors. The report recommends fully incorporating evaluation into the policy cycle and the budget by delivering yearly evaluation notes to Parliament. The audit also analyses the resources dedicated to evaluation, warning against the diminishing budgets of certain services. It notes the lack of a clear evaluation function and advocates facilitating data access and developing analytical tools.
The implementation of evaluations and quality assurance is audited as well. According to the Court, the country’s public services rarely have well-defined tasks, processes and methodologies for quality assurance. The audit thus reiterates the importance of quality methods and of making policies evaluable by clearly and explicitly defining the logic of intervention and collecting the necessary data. Finally, the Court characterises the use of evaluation results as insufficient, because public services rarely make them publicly accessible. This lack of transparency suggests that evaluations are not seen as a means to justify public policies at the federal level.
Estonia: audit of the planning, conduct and use of impact evaluations
The Estonian National Audit Office’s 2011 report to Parliament assesses the planning and conduct of evaluations and whether their results are continuously provided to Parliament and the public through coordination mechanisms. According to the audit, ministries lack coordination mechanisms and requirements for establishing evaluations. The lack of resources and capacity building dedicated to evaluative practices, together with perceptions regarding evaluations, are identified as reasons for the low quantity and quality of evaluations. The report recommends establishing sustainable quality control at the executive and parliamentary levels, clarifying the scope and methods of evaluations, involving stakeholders in legislative drafting, and clearly communicating impact analyses in explanatory memoranda (Estonian National Audit Office, 2011[157]).
Sources: Belgian Court of Audit (2018); Estonian National Audit Office (2011), The state of affairs with legislative impact assessment.
Only Slovenia, Brazil and Colombia specifically mentioned that their SAI had conducted several audits of the policy evaluation system in the past ten years. No country surveyed by the OECD has reported conducting systematic audits, as is the case with the European Court of Auditors (see Box 3.18).
Box 3.18. The role of the European Court of Auditors in auditing the evaluation system of a Directorate General
The European Court of Auditors’ major activity consists of conducting performance audits, which entail examining the quality of the evaluation system of a directorate-general (DG) against European Commission standards. A quality evaluation system is one that ensures effective management of evaluation demand, quality of supply and use of results. DGs are mandated to implement those standards to foster quality evaluation systems.
Accordingly, the European Court of Auditors offers practical guidelines to support the assessment of the quality of these evaluation systems. Since evaluation systems have to be tailored to their environment, the guidelines advise a thorough understanding of the policies’ intervention logic, their legal framework, the available resources, etc. This also requires processes for programming, monitoring, supporting and reporting on evaluations.
In terms of effectively managing evaluation demand, DGs should attach sufficient importance to the evaluation itself, which requires gaining support from high-level decision-makers and creating the right incentives for carrying out evaluations.
Supporting evaluation quality, in turn, requires DGs to implement procedures for training evaluation staff appropriately, involving stakeholders, ensuring robust methods and rigorously planning evaluations.
Finally, to foster the use of evaluation results, arrangements should be made to identify users, understand their needs, communicate the results clearly and deliver them on time, and follow up on their ultimate use.
National evaluation associations or societies
Outside of government, national associations of evaluators play an important role in promoting the competencies of evaluators and the quality of evaluations (Cooksy and Mark, 2012[159]). Almost all OECD countries have a national evaluation association (Table 3.5). Evaluation associations use a variety of approaches to encourage competencies for quality analysis in the evaluation community. Some evaluation societies, such as the American Evaluation Association, seek to create a policy environment conducive to quality evaluations by advocating the utility of good analysis to policy-makers, establishing guidelines and increasing awareness through workshops, training sessions and webinars (Cooksy and Mark, 2012[159]). Others, such as the Canadian Evaluation Society, have developed a professional designations programme, which imposes a minimum competency requirement to be considered an evaluator.
Table 3.5. National Evaluation Societies in OECD countries
| Country | Name of the Society/Network | Website |
|---|---|---|
| Australia | Australian Evaluation Society | |
| Austria | Austrian-German Evaluation Association | |
| Belgium | Flemish Evaluation Platform | |
| Canada | Canadian Evaluation Society | |
| Chile | Red Chilena de Evaluación | |
| Czech Republic | Czech Evaluation Society | |
| Denmark | Danish Evaluation Society | |
| Estonia | Estonian Evaluation Society | |
| Finland | Finnish Evaluation Society | |
| France | French Evaluation Society (SFE) | |
| Germany | German Evaluation Society (DeGEval) | |
| Greece | Hellenic Evaluation Society | |
| Hungary | Hungarian Evaluation Society | https://www.europeanevaluation.org/content/hungarian-evaluation-society |
| Iceland | | |
| Ireland | Irish Evaluation Network | |
| Israel | Israeli Association for Program Evaluation | |
| Italy | Italian Evaluation Association | |
| Japan | Japan Evaluation Society (JES) | |
| Korea | Korean Evaluation Association | |
| Latvia | Latvian Evaluation Society (LATES) | |
| Lithuania | | |
| Luxembourg | Luxembourg Evaluation and Foresight Society | |
| Mexico | Academia Nacional de Evaluadores Mexicanos (ACEVAL) | |
| Netherlands | Dutch Evaluation Society (VIDE) | |
| New Zealand | Aotearoa New Zealand Evaluation Association (ANZEA) | |
| Norway | Norwegian Evaluation Society | |
| Poland | Polish Evaluation Society | |
| Portugal | Portugal Evaluation Association (AvalPortugal) | |
| Slovak Republic | Slovak Evaluation Society (SES) | |
| Slovenia | Slovenian Evaluation Society | |
| Spain | Spanish Public Policy Evaluation Society (SEE) | |
| Sweden | Swedish Evaluation Society | |
| Switzerland | Swiss Evaluation Society (SEVAL) | |
| Turkey | Turkish Monitoring and Evaluation Society (TMES) | |
| United Kingdom | UK Evaluation Society | |
| United States | American Evaluation Association | |
Ensuring evaluation for impact: promoting the use by politicians, practitioners and citizens
Understanding the use of evaluations
As policy-makers invest public funds in evaluations, their use, and their ability to improve policies, programmes or projects, are key to their success. One of the most fundamental rationales for conducting policy evaluations is their usefulness in informing policy and decision-making in general, and in improving the interventions they consider in particular. Indeed, one of the principal goals of evaluation is to support decision-making with useful insights on public issues and evidence on the impact of policies and their underlying change mechanisms.
The literature on policy evaluation use identifies three types of use (Weiss, 1998[160]; Alkin and Taut, 2002[161]; Fleischer and Christie, 2009[162]; Ledermann, 2012[163]):
Symbolic use (also known as persuasive use) occurs when the results of evaluations are taken up to justify or legitimise a pre-existing position, without changing it. Examples are ministers using evaluations to justify their policy choices, or members of congress using findings from an evaluation to push for a legislative proposal (Ledermann, 2012[163]).
Conceptual use happens when evaluation results lead to an improved understanding or a change in the conception of the subject of evaluation. An example is the identification of a policy’s collateral impacts or of reverse causation (Ledermann, 2012[163]).
Instrumental use occurs when evaluation recommendations inform decision-making and lead to actual change in the policy being evaluated. An example is the reallocation of funds following poor performance (Ledermann, 2012[163]).
The users of evaluations include not only decision-makers, for whom conceptual and instrumental use are key, but also civil servants, experts and practitioners (local authorities, programme managers, health practitioners, etc.), who are looking for increased accountability, learning and better strategic decision-making. Evaluations can be used to improve regulations, inform resource allocation on the ground, monitor the implementation of policies, and so on.
Despite these many potential users, the use of evaluations remains a constant challenge and often falls short of expectations. Despite the potential for policies to be based on evidence, in reality an effective connection between research evidence and policy making remains elusive (Newman, Cherney and Head, 2017[164]). For example, estimates from the United States show that under the two Obama administrations, only 1% of government funding was informed by evidence (Bridgeland and Orszag, 2013[165]).
Furthermore, while many factors contribute to evaluation use, the way in which specific barriers and facilitators to evidence use operate, and how they interact with each other, depends on the local context. Use of evaluation is “more of an art than a science” (Results for America, 2017[166]). Thus, in order to promote the use of evaluations, it is important to understand these determinants and their interactions before discussing the range of practices countries have put in place to promote use.
Overview of mechanisms to promote the use of evaluations
A large majority of countries (31 countries, of which 27 are OECD countries) have put in place one or several mechanisms to influence these determinants, and thus promote the use of evaluations (Table 3.6). In general, countries have sought to promote the use of evaluations by:
conducting utilisation-focused evaluative processes
promoting access to evaluations
supporting the uptake of evaluations results
increasing demand for evaluations through competency development
embedding use in the institutional set-up, within and outside of the executive.
Conducting utilisation-focused evaluations
Countries have developed mechanisms to ensure that evaluative processes are utilisation-focused, meaning that evaluations are conducted in a way that is fit for purpose and takes into account the needs of their primary users and the types of intended uses (Patton, 1978[102]). Empirical research (Johnson et al., 2009[167]) has found that user-focused evaluations share several features:
They are methodologically robust and credible (for a discussion of the determinants of credible evaluations, see the section on ‘Mechanisms to promote quality evaluations’ as well as the OECD (forthcoming[106]) report on Principles and Standards for Good Governance of Evidence).
Users and stakeholders are involved in the evaluation process.
The evaluation methodology is perceived as appropriate by users.
Involving stakeholders throughout the evaluative process
Governments overall are increasingly eager to engage a wide range of internal and external stakeholders in the decision-making process to generate a broader consensus and increase the legitimacy of public-policy decisions (OECD, 2016[43]). There is a general consensus in the academic literature that engagement with those concerned and affected by evaluations is fundamental to improving their design, relevance, transparency and, ultimately, use (Patton, 1978[102]; Kusters et al., 2011[125]; Gauthier, 2015[168]). Consistent with this, OECD data show that 72% of countries overall (and 71% of OECD countries) report engaging stakeholders in the evaluation of their policy priorities.
Evidence shows that policy-makers are more likely to seek and use evaluation results obtained from trusted, familiar individuals or organisations rather than from formal sources (Oliver et al., 2015[169]; Haynes et al., 2012[170]). Stakeholder participation and interaction in the evaluative process can help build trusted relationships and increase the opportunities for evaluation results to influence policy making. Similarly, communicating findings to stakeholders as the evaluation progresses, or involving stakeholders in the design of the evaluation, can foster their acceptance and understanding of the results (Fleischer and Christie, 2009[162]).
Table 3.6. Mechanisms to promote the use of evaluations
| Country | Management response mechanism in place | Incorporation of findings into the budget cycle | A rating / grading system | Coordination platform to promote use of evidence | Discussions of findings at the Council of Ministers | No specific initiatives in place |
|---|---|---|---|---|---|---|
| Australia | ○ | ● | ○ | ○ | ○ | ○ |
| Austria | ● | ● | ○ | ○ | ○ | ○ |
| Belgium | ○ | ○ | ○ | ○ | ○ | ● |
| Canada | ● | ● | ○ | ● | ○ | ○ |
| Chile | ○ | ● | ○ | ○ | ● | ○ |
| Czech Republic | ○ | ○ | ○ | ○ | ○ | ● |
| Denmark | ○ | ○ | ○ | ○ | ○ | ○ |
| Estonia | ● | ● | ○ | ● | ○ | ○ |
| Finland | ○ | ● | ○ | ● | ● | ○ |
| France | ○ | ● | ○ | ○ | ○ | ● |
| Germany | ○ | ● | ○ | ● | ● | ○ |
| Great Britain | ○ | ● | ○ | ● | ○ | ○ |
| Greece | ● | ● | ○ | ○ | ● | ○ |
| Hungary | ○ | ○ | ○ | ○ | ● | ○ |
| Iceland | ○ | ○ | ○ | ○ | ○ | ● |
| Ireland | ○ | ● | ○ | ● | ○ | ○ |
| Israel | ○ | ● | ○ | ○ | ○ | ○ |
| Italy | ○ | ○ | ○ | ○ | ○ | ○ |
| Japan | ● | ● | ● | ● | ● | ○ |
| Korea | ● | ○ | ○ | ○ | ● | ○ |
| Latvia | ● | ● | ○ | ● | ● | ○ |
| Lithuania | ○ | ● | ○ | ○ | ● | ○ |
| Mexico | ● | ● | ● | ● | ● | ○ |
| Netherlands | ○ | ● | ○ | ○ | ○ | ○ |
| New Zealand | ● | ○ | ○ | ○ | ○ | ○ |
| Norway | ○ | ○ | ○ | ● | ○ | ○ |
| Poland | ○ | ○ | ● | ● | ○ | ○ |
| Portugal | ○ | ● | ○ | ○ | ● | ○ |
| Slovakia | ● | ● | ○ | ○ | ○ | ○ |
| Slovenia | ○ | ○ | ○ | ○ | ○ | ● |
| Spain | ○ | ○ | ○ | ○ | ○ | ● |
| Sweden | ○ | ● | ○ | ○ | ○ | ○ |
| Switzerland | ○ | ○ | ○ | ○ | ○ | ● |
| Turkey | ○ | ○ | ○ | ○ | ○ | ● |
| United States | ○ | ● | ○ | ● | ○ | ○ |
| OECD Total – ● Yes | 10 | 21 | 3 | 12 | 11 | 8 |
| OECD Total – ○ No | 25 | 14 | 32 | 23 | 24 | 27 |
| Argentina | ○ | ○ | ○ | ○ | ● | ○ |
| Bulgaria | ○ | ○ | ○ | ○ | ○ | ● |
| Brazil | ○ | ○ | ○ | ● | ○ | ○ |
| Colombia | ○ | ○ | ○ | ● | ● | ○ |
| Costa Rica | ● | ○ | ● | ○ | ● | ○ |
| Kazakhstan | ○ | ○ | ○ | ○ | ○ | ● |
| Romania | ○ | ○ | ○ | ○ | ○ | ● |
Note: n=42 (35 OECD member countries). Answers reflect responses to the question “How does your government promote the use of the findings of policy evaluations”. The option “Other” is not included. A rating / grading system refers to a system for classifying the robustness of the evidence provided and of the recommendations derived from policy evaluations.
Source: OECD Survey on Policy Evaluation (2018).
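As a purely illustrative sketch (the country rows below are a small excerpt transcribed from Table 3.6, and the script is not OECD survey tooling), the “OECD Total” rows can be reproduced by tallying the per-mechanism responses:

```python
# Illustrative only: tally "yes" responses per mechanism, as in the
# "OECD Total" row of Table 3.6. Only four country rows are shown here;
# running this over all 35 OECD rows would reproduce the published totals.
MECHANISMS = [
    "Management response mechanism",
    "Findings in budget cycle",
    "Rating / grading system",
    "Coordination platform",
    "Council of Ministers discussions",
    "No specific initiatives",
]

# 1 = mechanism reported in place, 0 = not reported (excerpt from Table 3.6).
responses = {
    "Australia": [0, 1, 0, 0, 0, 0],
    "Austria":   [1, 1, 0, 0, 0, 0],
    "Japan":     [1, 1, 1, 1, 1, 0],
    "Mexico":    [1, 1, 1, 1, 1, 0],
}

totals = [sum(row[i] for row in responses.values()) for i in range(len(MECHANISMS))]
for mechanism, count in zip(MECHANISMS, totals):
    print(f"{mechanism}: {count} of {len(responses)} countries in this excerpt")
```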
Requirements for stakeholder participation
In order to promote stakeholder participation, 65% of all countries (56% of OECD countries) have adopted formal requirements for stakeholder engagement in their legal/policy framework related to policy evaluations.
In the Netherlands, for example, the Ministry of Finance’s Regulations for periodic evaluation research (15 March 2018) lays down rules for the participation of stakeholders in periodic evaluations. With each policy evaluation, at least one independent expert must give an opinion on the quality of the evaluation. Similarly, the European Commission’s Better Regulation Guidelines contain a chapter that describes standards for stakeholder engagement. According to these guidelines, views from stakeholders should be included in the evaluation of all programmes and policies issued by the Commission, as well as in initiatives with impact assessments (European Commission, 2017[171]). Public participation in the processes for designing new regulations is also underlined in the area of regulatory policy [GOV/RPC(2019)].
Figure 3.10. Requirements related to stakeholder engagement in policy/legal frameworks
Note: n=20 (all countries surveyed who have a policy framework, among which 14 are OECD). Answers reflect responses to the question, “Which elements do(es) the document/s referred to under Q4 and Q5 cover concerning policy evaluation across government? (Check all that apply)”.
Source: OECD Survey on Policy Evaluation (2018).
Variety of stakeholders involved in the evaluation process
According to the OECD survey, stakeholders include a variety of actors, such as citizens, CSOs/NGOs, representatives of academia, representatives of the private sector and international organisations. Still, it is revealing that, overall, representatives of academia (17 countries for the evaluation of government-wide policy priorities (GWPP), 23 for that of health sector policies, and 16 for PSR) and the private sector (14 GWPP, 21 health, 14 PSR) are more likely to be engaged in the evaluation process than citizens, according to the survey results.
These results suggest that countries mostly engage with stakeholders emanating from traditional sources of authority and expertise (academia, international organisations, the private sector). Yet citizens, as the primary intended users of the policy being evaluated, can be considered the most important stakeholders to include in the evaluative process (Kusters et al., 2011[125]). Other under-represented stakeholders include staff and managers, who can produce actionable knowledge by being engaged in the evaluation process (Gauthier, 2015[168]). In an era of relative discontent with public policies in a significant number of countries, this apparent gap in public engagement may reveal the need to explore how to re-engage with citizens on evaluation results that they will both understand and find useful. This is particularly true for key challenges such as taxation, health or climate change that generate significant concerns among citizens.
Figure 3.11. Types of stakeholders engaged in policy evaluations
Note: For the main institution in charge of government-wide policy evaluation, n=21. 4 countries answered that they do not have government-wide policy priorities. Moreover, 9 countries answered that they do not evaluate their government-wide policy priorities. 8 countries answered that they do not engage stakeholders in the policy evaluation process. For health ministries, n=28. 9 countries did not participate in this survey. Moreover, 2 countries are not included as they answered that none of the policies that fall within their institution’s responsibility are evaluated. 3 countries answered that they do not engage stakeholders in the policy evaluation process. For the PSR ministries, n=19. 11 countries did not participate in this survey. Moreover, 6 countries are not included as they answered that none of the policies that fall within their institution’s responsibility are evaluated. 5 countries answered that they do not engage stakeholders in the policy evaluation process. Answers reflect responses to the question “Which stakeholders are engaged in the evaluation of government-wide policy priorities (Check all that apply)”.
Source: OECD Survey on Policy Evaluation (2018).
Engagement at different stages of the evaluation process
Involving stakeholders during every step of the evaluation can be very useful. The earlier and more actively the intended users are involved in an evaluation process and in the dissemination of results, the more likely they are to use the evaluation’s results (Patton, 1978[102]). However, there is no general agreement in the literature on the recommended degree of engagement between the evaluator and the users, or on when users should be involved in the evaluation process (Fleischer and Christie, 2009[162]).
Country practices are mixed. A large majority of respondents engage stakeholders during the implementation of the evaluation (designing evaluations, providing the data relevant to the evaluation, carrying out evaluations) and during the dissemination of the results (discussing the results of the evaluation, communicating the results of the evaluation). Fewer countries engage stakeholders when deciding what policies should be evaluated and in following-up on the use of results.
Figure 3.12. Stakeholder engagement in the evaluative process
Note: For the main institution in charge of government-wide policy evaluation, n=21 (15 OECD member countries). 4 countries (all OECD member countries) answered that they do not have government-wide policy priorities. Moreover, 9 countries (7 OECD member countries) answered that they do not evaluate their government-wide policy priorities. 8 countries (7 OECD member countries) answered that they do not engage stakeholders in the policy evaluation process.
Canada offers an interesting example of stakeholder involvement at the early stages of an evaluation process (see Box 3.19). Results in some sectors are less clear-cut. Overall, the results suggest that stakeholder engagement in the field of policy evaluation may be aimed at symbolic use, whereby evaluations are conducted in order to justify prior decisions. Yet stakeholders’ willingness to consider evaluation evidence is key to promoting learning. Similar results can be found in the OECD Regulatory Policy Outlook (OECD, 2018[172]), which found that most consultation efforts in regulatory policy development continue to focus on the later stages of the rule-making process, i.e. when a preferred solution has been identified and/or a draft regulation has been prepared.
Box 3.19. The experience of Alberta in Canada and the “What We Heard” report
The Canadian province of Alberta’s ministry of health has established an innovative approach to communicating the results of a citizen-centred approach to policy evaluations, including its inputs, the methodology used and the outputs. In 2005, the Government of Alberta introduced the “Getting on with Better Health Care” package, which contained 13 concrete actions for the advancement of the health care system. One of these actions was the “Health Policy Framework”. In order to ensure a needs-tailored design and implementation of this framework, the government sought the opinions of 420 health system stakeholders, health care professionals, unions, municipal leaders, educators and community organisations. Through letters and e-mails, meetings, phone calls and online expressions of opinion, the government heard from 4 056 individuals from Alberta and their suggestions on how to design and implement the best health policy framework possible.
The consultations were collected and summarised in a report entitled “What We Heard…from Albertans during March 2006”, which is accessible to everyone on Alberta’s ministry of health’s website. In a concise and easy-to-read report, the ministry of health provides information on the approach of the consultation, the timeframe, the content of the input received and the “lessons learned” from the consultation process. The results of the evaluation process were subsequently used to further improve approaches to integrating citizens’ opinions. The necessity and added value of such consultation processes is underscored by one of the major findings of the report: “Albertans would like more information and communication about Alberta’s policy directions to better understand the framework and what it will mean for them.”
Source: based on OECD (2016), Open Government: The Global context and the Way Forward; Alberta Health and Wellness (2006), “What We Heard… from Albertans during March 2006”, www.health.alberta.ca/documents/What-We-Heard-Report-2006.pdf (accessed 02 August 2016).
Designing and implementing evaluations for use
The evaluation’s set-up – understood as the planning, resources and communication channels involved – also affects the use of evaluations in policy-making. The set-up needs to be tailored to policy makers’ needs if use is to be facilitated in practice. The resources for evidence should match the demand of policy makers in terms of timing and format. Finally, the evaluation questions framed by the evaluator should match the users’ needs (Patton, 1978[102]).
Box 3.20. The concept of Utilization-Focused Evaluations
The concept of Utilization-Focused Evaluation (Patton, 1978[102]), developed by Michael Quinn Patton, refers to the principle according to which an evaluation should be useful to its intended users and should be judged based on its utility. Evaluations should be planned and conducted in ways that enhance their likely use. The following 17-step guidelines were identified to increase the impact and usefulness of evaluations:
1. Assess and build program and organisational readiness for utilisation-focused evaluation
2. Assess and enhance evaluator readiness and competence to undertake a utilization-focused evaluation
3. Identify, organise and engage primary intended users: the personal factor
4. Situation analysis conducted jointly with primary intended users
5. Identify and prioritise primary intended uses by determining priority purposes
6. Consider and build in process uses if and as appropriate
7. Focus priority evaluation questions
8. Check that fundamental areas for evaluation inquiry are being adequately addressed: implementation, outcomes and attribution questions
9. Determine what intervention model or theory of change is being evaluated
10. Negotiate appropriate methods to generate credible findings that support intended use by intended users
11. Make sure intended users understand potential methods controversies and their implications
12. Simulate use of findings: evaluation's equivalent of a dress rehearsal
13. Gather data with ongoing attention to use
14. Organize and present the data for interpretation and use by primary intended users: analysis, interpretation, judgment, and recommendations
15. Prepare an evaluation report to facilitate use and disseminate significant findings to expand influence
16. Follow up with primary intended users to facilitate and enhance use
17. Meta-evaluation of use: be accountable, learn, and improve.
Sources: Patton (1978), Utilization-Focused Evaluation; OECD (2019), Evaluating Public Sector Innovation: Support or Hindrance to Innovation?
Methods and tools to promote access to evaluation results
Policy makers and stakeholders cannot use evidence and the results of evaluations if they do not know about them (Haynes et al., 2018[173]). The first step in promoting use is therefore to make results available to their intended users – simply put, to communicate and disseminate them to stakeholders. While communication supplies evidence to specific users and publics of policy evaluations, dissemination aims to maximise general access to research and increase stakeholders’ understanding of, and confidence in, such content.
Communicating and disseminating results
Publicity of results
Making results public is an important element in ensuring impact and thus increasing the use of evaluations. The survey results show that evaluation results are increasingly made public by countries, through greater openness and transparency. Only one country reported that evaluation results are available only to selected officials on an ad hoc basis. 18 countries overall – of which 16 are OECD countries – make evaluation findings and recommendations available to the general public by default, for example by publishing the reports on the commissioning institutions’ websites. Such availability is important for promoting use: if citizens are aware of the results and of their implications, this may also build pressure on policy makers to pay attention to the results and ensure that they feed into policy making (OECD, 2020[99]).
Finally, five countries make results available to public agents and officials across government. Such communication can be done through internal circulation, for example, via email or intranet. OECD data shows that uptake of evaluation results by policy and decision makers may be more likely when information is easily accessible to them (OECD, 2020[99]).
Figure 3.13. Publicity of evaluation results
Note: n=42. Answers reflect responses to the question “The results of the evaluation are”. Under “Other”, the majority of countries indicated that the public availability of evaluation results depends on the specific agency that commissioned the evaluation and on its organisation.
Source: OECD Survey on Policy Evaluation (2018).
Some countries have also adopted differentiated approaches to the publicity of evaluation results depending on the commissioning institution. In Australia, for example, the results of performance and other audits by the Auditor-General, reports of parliamentary inquiries and annual performance statements are publicly available by default. The reports of the Productivity Commission are also public. Individual government agencies can decide whether or not to make the results of their evaluations systematically available.
In many countries, unpublished evaluations may be sought through freedom of information laws. Freedom of information (FOI) laws – also referred to as access to information laws – presume a principle of maximum disclosure of information, i.e. information held by the state is in principle available to the public. However, these laws also contain a list of exemptions that may be applied to justify withholding certain information from disclosure (see the OECD’s Government at a Glance (OECD, 2011[174]) for an overview of FOI in OECD countries). It is worth noting that in European countries, evaluations of cohesion policies are made publicly available, but this may not be the case for all evaluations done at the national level, depending on the country.
Nevertheless, the figure above shows an uneven level of dissemination and publicity of results, with much scope to reflect on the potential to further increase use through greater awareness and use of modern communication and dissemination techniques. This was, for example, highlighted as a key message of a recent OECD study of the Irish Government Economic Evaluation System (OECD, 2020[98]).
Evaluation databases and portals
Online databases, for instance, seek to increase access to specific types of research. Some surveyed countries have created national databases or evaluation portals with the aim of centralising evaluation evidence in one easily accessible place.
Box 3.21. Evaluation portals to promote the use of evidence
Poland’s national evaluation database for the evaluation of cohesion policy
All evaluations commissioned in Poland, including those concerning the implementation of EU funds, must be made accessible to the public. Concerning the evaluations related to Cohesion Policy, a national database has been created: all evaluations are published on the website www.ewaluacja.gov.pl. This platform shares the results of more than a thousand studies conducted since 2004, as well as methodological tools aimed at evaluators.
Norway’s evaluation portal
Norway’s evaluation portal (https://evalueringsportalen.no/) is a publicly accessible web service that gathers all the findings of evaluations carried out by the central government. This database is operated by the Directorate for Financial Management and the National Library of Norway. It contains evaluations carried out on behalf of government agencies from 2005 until today, as well as a selection of central evaluations from 1994 to 2004. Evaluation reports are registered in the database as soon as they are made available to the public. Moreover, the portal provides evaluation guidelines, a calendar of the key activities in the evaluation area, news and professional papers.
By increasing the accessibility of evaluation results, the portal allows the knowledge and findings from evaluations to be used and reused in all state policy areas, in future evaluations and in society as a whole. It ultimately increases the legitimacy and transparency of government activities.
Source: OECD Survey on Policy Evaluation (2018).
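As an illustration of what such a portal stores, the sketch below models a registry record; the field names are assumptions made for illustration and do not reflect the actual schema of evalueringsportalen.no or ewaluacja.gov.pl:

```python
# A minimal sketch of the metadata an evaluation portal might store per
# report so that evaluations remain findable and reusable. All field names
# here are illustrative assumptions, not any portal's actual schema.
from dataclasses import dataclass, field
from typing import List

@dataclass
class EvaluationRecord:
    title: str
    commissioning_body: str       # agency that ordered the evaluation
    year: int
    policy_area: str              # e.g. "health", "education"
    url: str                      # link to the full report
    keywords: List[str] = field(default_factory=list)

# Registering a report as soon as it is made public keeps the portal current.
record = EvaluationRecord(
    title="Evaluation of programme X",          # hypothetical example
    commissioning_body="Ministry of Finance",
    year=2018,
    policy_area="public administration",
    url="https://example.org/report.pdf",
    keywords=["effectiveness", "cohesion"],
)
print(record.title, record.year, record.policy_area)
```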
Communication strategies and tools
Research suggests that publicity alone does not significantly improve the uptake of evaluations in policy-making (Haynes et al., 2018[173]; Langer, Tripney and Gough, 2016[175]; Dobbins et al., 2009[176]). Rather, the way evidence is presented should be strategic and driven by the evaluation’s purpose and the information needs of intended users (Patton, 1978[102]). When evaluation results are well synthesised, tailored to specific users and sent directly to them, their use is facilitated (Haynes et al., 2018[173]). Tailored communication and dissemination strategies that increase access to clearly presented research findings are therefore very important for use.
These strategies can include the use of infographics, tailored syntheses of research evidence (for example in the form of executive summaries), the dissemination of ‘information nuggets’ through social media, seminars to present research findings, etc. (OECD, 2016[144]; OECD, 2020[99]). The UK What Works Centres – including the Education Endowment Foundation, the Early Intervention Foundation and the What Works Centre for Local Economic Growth – produce a range of policy briefs to disseminate key messages to their target audiences. Similarly, in Canada, departments are diffusing evaluation findings beyond departmental websites via platforms such as Twitter and LinkedIn.
Methods for reviewing and assessing the evidence base
Several methods exist for reviewing and assessing the evidence base. Portals serving as passive repositories of information are less likely to promote evidence use (Results for America, 2017[166]). Compiling evaluations in portals or databases runs the risk of information overload, thus hindering the incorporation of findings and reducing the effectiveness of evaluation. As the number of evaluations increases, it becomes more difficult for policy makers and practitioners to keep abreast of the literature. Yet policies should ideally take into account the full body of evidence rather than single studies, which may not provide a complete picture of the effectiveness of a policy or programme. In addition, such repositories do not necessarily allow stakeholders to understand the quality of the evidence produced by an evaluation – its rigour or replicability, for example.
These needs have led to an increase in the use of evidence synthesis. Evidence syntheses, through secondary processing of existing evaluations, provide a vital tool for policy makers and practitioners to:
inform them about what can be known, or derived, from previous research
understand what works and how it works.
Evidence synthesis methodologies seek not only to aggregate evaluation findings and review them in a more or less systematic manner (see Box 3.22 for a discussion of methods), but also to assess and rate the strength of the evidence. Evidence syntheses provide a useful dissemination tool, since they allow decision-makers to access large bodies of evidence and to rapidly assess the extent to which they can trust it. They can also play an important role in promoting the quality of evaluations, as discussed in the section on ‘Mechanisms to promote the quality of evaluations’.
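To make the idea concrete, the sketch below shows one very simple synthesis approach – weighted vote-counting over hypothetical studies. Real evidence syntheses use richer, domain-specific rating schemes; nothing here reflects a specific country’s methodology.

```python
# Illustrative sketch: aggregate findings across evaluations, weighting each
# by a rated strength of evidence, so that weaker studies count for less.
# The studies, effects, ratings and weights below are all hypothetical.
studies = [
    {"id": "E1", "effect": +0.30, "strength": "high"},
    {"id": "E2", "effect": +0.10, "strength": "low"},
    {"id": "E3", "effect": -0.05, "strength": "medium"},
]

# Map each strength rating to a numeric weight.
WEIGHTS = {"high": 3, "medium": 2, "low": 1}

total_weight = sum(WEIGHTS[s["strength"]] for s in studies)
weighted_effect = sum(s["effect"] * WEIGHTS[s["strength"]] for s in studies) / total_weight

print(f"Synthesised effect across {len(studies)} studies: {weighted_effect:+.3f}")
```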
Box 3.22. Different methodologies for reviewing the evidence base
Effective policy-making requires using the best available evidence, which itself requires reviewing and choosing from the existing evidence on the policy question. Different reviewing methods enable managing and interpreting the results of a large evidence base:
Quick Scoping Review: this non-systematic method can take from 1 week to 2 months. It consists of a quick overview of the available research on a specific topic to determine the range of existing studies. It allows the literature on a delimited question to be mapped using only easily accessible, electronic and key resources, going up to two bibliographical references.
Rapid Evidence Assessment (REA): this systematic but more time-consuming method (2 to 6 months) consists of quickly reviewing the existing research on a specific policy issue and synthesising the evidence it provides. It is intended to search for and critically appraise this evidence rigorously and explicitly. To save time, it may limit certain aspects of the systematic review process, such as narrowing the REA question or the type and breadth of data considered. Shortening the traditional systematic review process provides a rapid synthesis of the existing relevant evidence, but risks introducing bias.
Systematic Review: this is the most robust method for reviewing, synthesising and mapping existing evidence on a particular policy topic. It is more resource-intensive, as it typically takes at least 8 to 12 months and requires a research team. It has explicit objectives and a thorough search strategy that considers a broad range of data. Studies are chosen and screened according to explicit and uniform criteria, and reasons for excluding certain studies have to be stated. This transparent and comprehensive method minimises bias in the search, choice and synthesis of the existing research. Moreover, it allows the creation of a cumulative and sound evidence base on a specific policy subject. Lastly, systematic reviews are applicable to quantitative studies as well as to other types of questions.
Source: The UK Civil Service, What is a Rapid Evidence Assessment?, https://webarchive.nationalarchives.gov.uk/20140402163359/http://www.civilservice.gov.uk/networks/gsr/resources-and-guidance/rapid-evidence-assessment/what-is (accessed 12 August 2019).
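The screening step common to REAs and systematic reviews can be made concrete with a small sketch. The inclusion criteria and studies below are hypothetical; the point is only that every exclusion is recorded against explicit, uniform criteria:

```python
# Illustrative sketch of systematic screening: studies are kept or excluded
# against explicit, uniform criteria, and the reason for every exclusion is
# recorded so the review stays transparent. Criteria and studies are made up.
studies = [
    {"id": "S1", "year": 2015, "design": "RCT",        "population": "national"},
    {"id": "S2", "year": 1998, "design": "case study", "population": "national"},
    {"id": "S3", "year": 2017, "design": "quasi-exp",  "population": "regional"},
]

def screen(study):
    """Return (included, reason). Stated reasons make the review auditable."""
    if study["year"] < 2005:
        return False, "published before cut-off (2005)"
    if study["design"] not in {"RCT", "quasi-exp"}:
        return False, "design outside inclusion criteria"
    return True, "meets all criteria"

for s in studies:
    included, reason = screen(s)
    print(s["id"], "INCLUDED" if included else "EXCLUDED", "-", reason)
```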
According to the survey results, only a small number of countries conduct evidence synthesis within government. In fact, only two countries (Japan and Poland) declared using a rating system to classify the robustness of evaluations at the level of the main institution in charge of policy evaluations. The results from the UK What Works Centres, or corresponding knowledge brokerage institutions in the United States or Australia, may not have been included, as they might not be considered to be “within government”.
These clearinghouses or what works centres play a significant role as knowledge brokers. According to the Results for America initiative, clearinghouses can be defined as “an information source that aggregates, standardizes, reviews and rates the evidence base of interventions” (Neuhoff et al., 2015[177]) (Results for America, 2017[166]). Clearinghouses, therefore, conduct evidence syntheses to make information available to decision-makers and translate the research into language relevant to them.
Because of the nature of their mandate, clearinghouses usually work at arm’s length from government (for instance receiving government funding but functioning autonomously, such as the What Works Network in the UK) or completely independently from government (in the case of non-government initiatives, such as the Campbell Collaboration or the Cochrane Library). Many focus on only one area of specialisation (Results for America, 2017[166]) and have their own review and rating process.
Box 3.23. The United Kingdom’s What Works Network
The What Works Network comprises seven independent What Works Centres and two affiliate members. It is intended to support the government and other organisations in creating, sharing and using high quality evidence to make better decisions for the improvement of public services. What Works is a unique national approach taken by a government to inform decision-making with the best available or newly created evidence. Its success can be attributed to three key features:
Its autonomy:
a. The Network operates at arm’s length from government and independently assesses the evidence that it encourages policymakers to incorporate in their decisions.
b. It is also funded out of non-budgetary resources, such as lottery funds, and its operations concern the public sector at national and local levels, covering policy areas that receive more than £200 billion of public spending.
Its role in promoting use of evidence:
c. It allows policy makers, commissioners and practitioners to use evidence on what works to make decisions and provide cost-efficient and useful services, distinguishing itself from standard research centres.
d. Several centres also support the development of a civil service with the skills, capability and commitment to use evidence effectively (Results for America, 2017[166]).
e. Additionally, the What Works National Adviser located in the Cabinet Office runs a Cross-Government Trial Advice Panel, which includes experts from academia and government who provide free support to all civil servants to assess whether policies are working.
f. This Adviser also frames findings from all Centres in an accessible and understandable format and shares them across government. This practice encourages cross-government discussions on ‘what works’ and assists policy makers in making evidence-based decisions regarding investment in value-for-money services that are intended to have a positive impact on citizens.
Its role in producing systematic reviews:
g. Where evidence is lacking, centres create high quality synthesis reports in their policy domain (What Works Centres are usually focused on one policy area, such as wellbeing or early intervention).
h. Centres also collate existing evidence on the effectiveness of policies and practices.
Sources: UK Government, “What Works Network”, https://www.gov.uk/guidance/what-works-network (accessed 2 September 2019); (Results for America, 2017[166]).
Guidelines to promote the uptake of evaluation results
While communication strategies, platforms and evidence synthesis methodologies are designed to give users quick and easy access to clear evaluation evidence, they do not systematically translate to better uptake of policy evaluations in decision-making.
According to the OECD data, 21 countries (of which 18 OECD members) have developed guidelines on public policy evaluations that contain specific provisions or standards for the use of policy evaluation.
Figure 3.14. Guidelines containing standards for the use of policy evaluation
Note: n=31 (26 OECD member countries). 11 countries (9 OECD member countries) answered that they do not have guidelines to support the implementation of policy evaluation across government. Answers reflect responses to the question, “Do the guidelines contain specific guidance related to the: (Check all that apply)”.
Source: OECD Survey on Policy Evaluation (2018).
Few countries have developed specific guidelines aimed at policy-makers for the uptake of evaluation evidence. These include New Zealand, Japan and Costa Rica (Box 3.24). In the US, the use of evaluation findings is an important part of the 2010 Federal Performance Framework (OMB, 2010[178]). Overall, OECD data shows that the development of guidelines on the use of policy evaluations aimed at policy and decision-makers is a relatively recent practice, suggesting an increased awareness by countries of the need to develop demand for evidence.
Box 3.24. Guidelines for the use of evaluation evidence
New Zealand: Making sense of evidence: A guide to using evidence in policy (2018)
New Zealand’s Social Policy Evaluation and Research Unit (Superu) released this guide to provide central and local governments, the voluntary sector and the community with a structured approach to using evidence in every stage of the policy development cycle. This guide gives practical advice on:
understanding the different sources and types of evidence, and the questions each is best suited to answer at each stage of policy development
choosing and using evidence effectively according to three guiding principles: making sure that the evidence is appropriate, credible and transparent, and then explicitly stating how it is considered in every stage of the policy process.
dealing with gaps in the evidence base and weak, uncertain and conflicting evidence
taking into account different cultural values, and following a framework for bridging cultural perspectives, which is especially relevant in multicultural settings
finally, getting stakeholders to engage early and commit to using evidence in policymaking through communicating findings (Superu, 2018[179]).
Costa Rica: Guide for the use of evaluations: Guidelines for its implementation and follow-up on recommendations (2018)
The guide published by the Costa Rican ministry of national planning and economic policy (Mideplan) provides support to decision-makers and those who execute policy interventions in concretely applying evaluation recommendations to improve the management of the public intervention evaluated. Specifically:
It first defines the different types of use (instrumental, conceptual, persuasive and political) and emphasises their importance.
Secondly, it details each step required to operationalise such use: analysis of recommendations, elaboration of a plan of action, implementation of the plan and analysis of its effects. These steps include formalising and communicating decisions, identifying the actors and activities involved, and elaborating a results report (Mideplan, 2018[180]).
Japan: Policy Evaluation Implementation Guidelines (2005)
These guidelines state the importance of reflecting the results of an evaluation in the policy evaluated. They recommend that individual administrative organs prepare and release an evaluation report and compile a budget request to ensure that results are incorporated in policy planning. They suggest holding ministerial discussions on the results when the fiscal budget is being compiled or when important policy decisions are made, to strengthen cooperation between the evaluation unit and the ministry in charge of developing policies. The Ministry of Internal Affairs and Communications and administrative organs should explicitly state evaluation recommendations (such as the discontinuation, scaling up or down, or specific targeting of a policy) when releasing the evaluation results in budget requests.
Other initiatives used by countries to promote the use of evaluations include self-assessment tools to assess the capacity of organisations to demand and apply research. A key first step in enabling organisations to increase their ability to identify, assess and use research in decision making is to examine their existing organisational capacity to access, interpret and use research findings (Kothari et al., 2009[181]). Colombia, for example, launched the first guidelines for constructing an evidence gap map, aimed at civil servants, academics and external organisations interested in using existing evidence through a more robust approach.
Box 3.25. Guidelines for the construction of Evidence Gaps Maps: A tool for decision making in Colombia
The Colombian ministry of planning (DNP) created Guidelines for the construction of evidence gap maps (MBEs) to strengthen evidence-based decision-making. MBEs systematise and synthesise evaluation results in a clear way, giving decision-makers easy and comprehensive access to them and ultimately reinforcing use.
These guidelines can be used by any national public entities and international organisations interested in improving their decision-making processes. They present the steps required for the construction of an evidence gap map (MBE), accompanied by concrete examples and recommendations. They also describe the human resources needed to build the team responsible for constructing the MBE as well as the optimal planning for it.
Source: Colombia Ministry of Planning (2019), Guidelines for the Construction of Evidence Gap Maps: A Tool for Decision Making.
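To illustrate the underlying data structure, the sketch below builds a toy evidence gap map as an interventions-by-outcomes matrix. The interventions, outcomes and counts are hypothetical and do not come from the Colombian guidelines:

```python
# Minimal sketch of an evidence gap map: a matrix of interventions x outcomes
# where each cell counts the available evaluations, so empty cells expose
# gaps in the evidence base. All entries below are hypothetical.
interventions = ["cash transfers", "job training", "school meals"]
outcomes = ["employment", "health", "learning"]

# (intervention, outcome) -> number of evaluations found
evidence = {
    ("cash transfers", "health"): 4,
    ("job training", "employment"): 7,
    ("school meals", "learning"): 2,
}

# Print the matrix; cells with 0 mark gaps where new evaluations add most value.
print(f"{'':<16}" + "".join(f"{o:>12}" for o in outcomes))
for i in interventions:
    cells = [evidence.get((i, o), 0) for o in outcomes]
    print(f"{i:<16}" + "".join(f"{c:>12}" for c in cells))
```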
Increasing demand for evaluation by promoting competencies
In some countries, mechanisms to promote demand for evaluations are developed in addition to those aimed at promoting their supply. In fact, supply of evaluative evidence is not a sufficient condition for use: demand from primary intended users also needs to be there. Both research and practice indicate that despite the extensive production, communication and dissemination of evaluation reports, the use of evidence by decision makers remains limited, and the commitment of top management to evaluation activities remains low (Olejniczak and Raimondo, 2016[182]).
Specifically, evaluation users – policy makers, in particular – can also face challenges related to their lack of competence to analyse and interpret evidence (Results for America, 2017[166]), meaning that they do not have the appropriate skills, knowledge, experience and abilities to use evaluation results (Stevahn et al., 2005[147]) (American Evaluation Association, 2018[148]) (Newman, Fisher and Shaxson, 2012[183]).
According to the survey, mechanisms aimed specifically at increasing demand for evaluations are less frequent than mechanisms aimed at promoting supply. Nevertheless, country practices reveal a wide range of approaches aimed at developing competences for use. These include practices such as training aimed at senior civil servants or policy professionals, and mentoring initiatives.
Understanding skills and competencies for policy evaluation
The OECD and the Joint Research Centre of the European Commission developed a mapping of the relevant skills and competencies for evidence-informed policy making, drawing on country experiences. The skillset is presented below in Box 3.26 (a detailed discussion of this mapping can be found in the OECD forthcoming report on Building Capacity for Evidence Informed Policy Making).
Box 3.26. The skillset for Evidence Informed Policy Making
This skill-set is defined as a collective skill-set for the improvement of the public service in the future, not as a full list of skills that each public servant needs to master. The skillset does not apply to a single scenario; instead, it is cross-cutting in character and can be applied on multiple occasions. It includes elements such as critical thinking, systems thinking and engaging with stakeholders. The skillset is organised into six clusters.
Source: Adapted from Building Capacity for Evidence Informed Policy Making: Lessons from country experiences, OECD (2020).
Training for policy makers and civil servants
Training refers to active preparation based on appropriate approaches and strong guidance for a specific audience. The OECD’s work on how to engage public employees for a high-performing civil service highlights the importance of learning and training in a modern civil service, enabling civil servants to continually update their skills and capacity to innovate (OECD, 2016[184]). There is, therefore, a strong justification for investment in learning and training, and there is also a strong call from employers and employees for investment in skills and competency development.
The work by the OECD on Building Capacity for Evidence Informed Policy Making (OECD, 2020[99]) suggests that training for Senior Civil Service leadership is aimed at increasing managers’ understanding of evidence informed policy making and policy evaluation, enabling them to become champions for evidence use. Intensive skills training programmes aimed at policy makers may be more focused on interrogating and assessing evidence and on using and applying it in policy making.
Training for Senior Civil Service leadership can include training courses or seminars given by national schools of government in the context of their leadership programmes, or specific training courses developed by ministries or agencies. In Canada, for example, the executive training in research application (EXTRA) programme provides support and development for leaders in using research. The programme targets leaders in the healthcare field, with the objective that, after completing the training, participants will be able to use evidence in their decision-making, train their co-workers and bring about organisational change.
Intensive skills training programmes geared towards policy makers can provide them with the necessary skills to increase the use of evidence in their work. Through such trainings, policy makers not only learn new skills but often also gain increased motivation to use evidence, and many become research champions and train or mentor others (Haynes et al., 2018[173]). Such trainings can take the form of workshops, masterclasses or seminars (see Box 3.27 for examples).
Box 3.27. Intensive skills training and mentoring programmes
In the UK, the Alliance for Useful Evidence organises an evidence masterclass where policy makers can learn how to use evidence in their policy work and can practise their new skills through simulations. Through this programme, policy makers are able to build their confidence in compiling, assimilating, distilling, interpreting and presenting evidence. Participants learn how to find research that is relevant to their policy question, and they develop their ability to assess the quality and trustworthiness of research.
Mexico has also implemented capacity-building initiatives concerning Regulatory Impact Assessment. Training seminars were held by Mexico’s ministry of the economy for federal and provincial officials on how to draft and implement Regulatory Impact Assessments (RIAs). The learning programme provided a step-by-step methodology on how to produce and analyse impact assessments in practice, using guidance, case studies and advice from peer government officials, experts and OECD insights (OECD, 2020[99]).
South Africa has a longstanding history of initiatives to improve the demand side for evidence use in policy making, including the implementation of workshops and a mentorship programme throughout government. The programme was created to address the disconnect between the widespread support for EIPM in principle and its practical application. The workshops and group mentoring were geared towards laying the foundations for individuals to acquire evidence informed policymaking skills. The group orientation created an environment in which there was greater acceptance of the value and practice of EIPM and therefore made individual mentoring possible. Those individuals were then able to mentor their colleagues on integrating evidence into their work.
Sources: adapted from OECD (forthcoming), Building Capacity for Evidence Informed Policy Making: Lessons from Country Experience.
Mentoring programmes
Mentoring initiatives, on the other hand, refer to more personalised guidance on ‘real-world’ applications. Here, the expertise and interpersonal skills of the mentors are key to the credibility of the process. While evidence suggests that mentoring initiatives can be successful in helping policy makers use and apply evidence in their work, these types of programmes are less frequently used. One exception is the Data for Decision Making (DDM) programme implemented in Mexico, which relied mainly on mentoring to improve the use of evidence in health policy-making (Punton et al., 2016[185]). Box 3.27 above provides details on South African mentorship initiatives for government.
Creating an evaluation market place by embedding use of evidence in the institutional set-up
While individual competencies are important, formal organisations and institutional mechanisms set up a foundation for evidence-informed policy making that can withstand leadership transitions (Results for America, 2017[166]). The use of evaluations is intimately linked to organisational structures and systems, insofar as they create fertile ground for the supply of and demand for evaluations to meet.
Institutional or organisational mechanisms which enable the creation of an evaluation market place can be found either at the level of specific institutions, such as management response mechanisms, or within the wider policy cycle, such as through the incorporation of policy evaluation findings into the budget cycle or discussions of findings at the highest political level.
Management response mechanisms at the level of specific institutions
The first level of use for evaluations is management response mechanisms. A management response indicates whether senior management agrees, partially agrees or disagrees with the assessment and strategic recommendations contained in a policy evaluation. The reason for agreement or disagreement is provided, and the actions to be taken in response to the evaluation are described.
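As an illustration, the sketch below models the information such a mechanism typically records per recommendation, following the description above; the field names and example values are assumptions made for illustration:

```python
# Sketch of a management response record: the position taken on each
# recommendation, the rationale, and the planned follow-up actions.
# Field names and example values are illustrative assumptions.
from dataclasses import dataclass
from typing import List

@dataclass
class ManagementResponse:
    recommendation: str
    position: str          # "agree" | "partially agree" | "disagree"
    rationale: str         # stated reason for the position
    actions: List[str]     # follow-up actions to be tracked

response = ManagementResponse(
    recommendation="Reallocate funds from underperforming component B",  # hypothetical
    position="partially agree",
    rationale="Component B underperforms, but reallocation needs legal review.",
    actions=["Commission legal review by Q3", "Report back to the audit committee"],
)
print(response.position, "-", len(response.actions), "follow-up actions")
```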
According to OECD data, the use of formal management response and follow-up systems is relatively infrequent. Amongst country respondents, 11 main institutions in charge of policy evaluations (10 in OECD countries), and a similar number of health ministries (8), used management response mechanisms to react to internal or external evaluations. Exceptions include the management response mechanisms in Mexico and Costa Rica (see Box 3.28).
The information from the OECD survey also suggests that a larger number of countries use management response and follow-up mechanisms for the evaluation of government-wide policy priorities. In Japan, for instance, the government submits an annual report to the Diet (Houses of Representatives and Councillors) on the status of policy evaluation and on how its results have been reflected in policy planning and development. In Korea, based on the review of evaluation results, improvements to be made to evaluated policies are identified and the evaluation plan for the subsequent year is adapted accordingly.
Figure 3.15. Management response mechanisms at the level of specific institutions
Note: For the main institution, n=42 (35 OECD countries). For the main institution on government-wide policy priorities, n=29 (24 OECD countries). 4 countries (all OECD) answered that they do not have government-wide policy priorities. 9 countries (7 OECD) answered that they do not evaluate their government-wide policy priorities. For the health ministries, n=31 (28 OECD). 9 countries (7 OECD) did not participate in this survey. 2 countries (1 OECD) are not included as they answered that none of the policies that fall within their institution’s responsibility are evaluated. For the PSR ministries, n=25 (20 OECD). 11 countries (10 OECD) did not participate in this survey. 6 countries (5 OECD) are not included as they answered that none of the policies that fall within their institution’s responsibility are evaluated. Answers reflect responses to the questions “How does your government promote the use of the findings of policy evaluations” and “How does your institution promote the use of the findings of policy evaluations? (Check all that apply)”, focused on “A management response mechanism at the level of specific institutions is in place”.
Source: OECD Survey on Policy Evaluation (2018).
Box 3.28. Management response mechanisms in Mexico and Costa Rica
Mexico implemented a mechanism to establish a follow-up process on external evaluation recommendations, which defines the actors responsible for constructing the tools that will track the aspects of programmes and policies to be improved. The Mexican National Council for the Evaluation of Social Development Policy (CONEVAL) awards a prize to federal ministries and states that contribute to the generation and use of evaluation results to improve policies, as well as to the development of their staff’s skills for that purpose.
Costa Rica’s ministry of national planning and economic policy (Mideplan) developed a guide for the use of evaluations. It advises that an action plan be developed based on an analysis of evaluations’ recommendations. This requires convening key actors and stakeholders, and formulating and communicating decisions. The action plan should then be formalised by defining activities, roles and responsibilities, establishing expected results, and finally validating and communicating the plan. Implementing the plan requires incorporating it in instruments of organisational planning, monitoring compliance with its activities and generating a report. Finally, the impact of the action plan should be assessed by collecting and analysing the reports, consolidating its results, and publishing them.
Sources: OECD Survey on Policy Evaluation (2018); (Mideplan, 2018[180]).
Management response systems can also be informal. In the United Kingdom, for example, there are formally no specific requirements on how policy evaluation results are to be followed up. However, most institutions will develop some form of ministerial/management response to the results.
The role of knowledge brokers
Knowledge brokers are pivotal actors that connect knowledge producers and users in networks where knowledge and evidence are produced (Olejniczak and Raimondo, 2016[182]). They can help facilitate policymakers’ access to the results of evaluations and to research evidence by helping them navigate research material that may be unfamiliar. They can also help articulate policymakers’ needs, constraints and expectations, translating them for researchers who may be unfamiliar with the policy process (see Box 3.29).
Box 3.29. The role of knowledge brokers
Knowledge brokers play a key role in strengthening the relationship and collaboration between evidence producers and policymakers. A knowledge broker is an individual, organisation or structure that shares information, strengthens capacity and builds partnerships.
Governments can rely on knowledge brokers to improve their communication towards the evidence community regarding their particular needs and expectations for policymaking. Conversely, knowledge brokers may also help evidence producers “translate” their results for policy makers, by synthesising them, disseminating them and expressing them in a clear and relevant manner.
Overall, knowledge brokers have to both understand the technicalities of the research and evaluation world, as well as the practicalities of the actions and decisions taken by policy makers and the political, economic and social factors that influence them. More precisely, they can undertake the following activities to effectively transfer the knowledge they create to policy-makers and society more broadly:
identifying the information gaps and needs of the users of evidence (decision-makers and policy actors)
acquiring quality evidence from appropriate sources and in a timely manner (at the right stage of an intervention)
transferring evidence to users by translating it in an appealing, tailored and actionable message, which may involve discussion and persuasion
building networks between evidence producers and users to facilitate interactions and collaboration, ultimately allowing capacity building and dissemination
accumulating evidence over time to build a robust and diversified evidence base, which requires building institutional capabilities for extracting useful evidence
fostering an evidence-based culture by organising interactive workshops with decision makers to develop their skills and commitment to using evidence.
Knowledge brokers can take a variety of forms, ranging from individual professionals (such as government chief science advisors in some countries, or ministerial advisors) to dedicated organisations. In terms of institutions, some are specifically connected to knowledge producers, such as brokering units within academic institutions (for example, the Centre for Evaluation and Analysis of Public Policies in Poland and the Top Institute for Evidence-Based Education Research in the Netherlands). Other approaches, on which this section focuses, locate the function closer to decision makers, either within government or at arm’s length.
Evaluation units as knowledge brokers
Firstly, evaluation units or advisory bodies within ministries play an important knowledge brokerage role within their institution, as they convey their findings to the departments responsible for planning and implementing interventions. As such, evaluation units are the first knowledge brokers: they typically act as intermediaries between knowledge producers (evaluators) and actors involved in policy decisions (Olejniczak and Raimondo, 2016[182]).
In France, for instance, many of the knowledge brokerage functions are integrated within the ministries. Analytical directorates in the ministries of labour (DARES), social affairs (DREES) and the environment (CGEDD) provide strategic advice and access to evidence, integrating the knowledge broker function within the day-to-day work of the ministries.
Bodies at arm’s length of government
Other countries have seen the development of knowledge brokerage organisations at arm’s length of government. These units may function with a certain degree of independence, for instance in terms of staffing or budget, but receive government funding. They often concentrate on one thematic area of specialisation. Examples of such organisations are the Australian Institute of Family Studies (AIFS), the Research and Evaluation Unit of the Department of Children and Youth Affairs in Ireland and the What Works Network in the United Kingdom (see the section on clearinghouses for a description of the What Works Network). Others, like Australia’s Productivity Commission (see Box 3.30), are cross-disciplinary. The experience of the Productivity Commission is in many regards exemplary in terms of communication and external engagement (for a full review of policy advisory bodies, see OECD (2017[186])).
Box 3.30. The experience of Australia’s Productivity Commission in communication and public inquiries
The Australian Productivity Commission provides analysis and recommendations on specific policies and a range of economic, social and environmental issues. One of the commission’s main activities is the communication of its ideas and analyses, a key determinant of the use of evaluation results. First, the commission’s research reports are formally presented for discussion to the Australian Parliament through the Treasurer. Then, as the commission is statutorily required to promote public understanding of policy issues, it directs its reports and other activities at the wider community. For instance, all draft reports and preliminary findings are shared with the public for discussion through workshops, presentations and forums.
Public inquiries are another means used by the Commission to handle policy issues that require significant public exposure and consultation. When policies have complex consequences or potentially important impacts on society, citizens are consulted and their perspectives considered during policy formulation. To reach as many citizens as possible and foster their active involvement, these public inquiries are widely advertised. For instance, the Commission’s inquiry on disability care and support received 1062 submissions, involved visits to 119 organisations and included 23 days of public hearings.
Sources: Australian Government, “About the Commission” and “Core Functions”, https://www.pc.gov.au/about, https://www.pc.gov.au/about/core-functions (accessed 2 September 2019).
Coordination platforms or units across government
The results of the survey show that some governments (14 countries, of which 12 OECD countries) have established dedicated cross-governmental units to champion the use of policy evaluations in a horizontal manner (see Figure 3.16). These coordination platforms, or knowledge brokers, can take on a variety of organisational forms, often close to the Centre of Government. Examples include the UK Cabinet Office’s ‘What Works’ team, the evidence team within the Office of Management and Budget in the US (see Box 3.31) and France Stratégie, a think tank attached to the Prime Minister’s Office in France.
Figure 3.16. Coordination platforms across government to promote use of evidence
Note: For the main institution n=42 (35 OECD member countries). Answers reflect responses to the question “How does your government promote the use of the findings of policy evaluations” focused on "A coordination platform across government to promote the use of evidence (produced by policy evaluations) in policy making".
Source: OECD Survey on Policy Evaluation (2018).
Box 3.31. The role of the US Office of Management and Budget in promoting the use of evaluation
The United States’ Office of Management and Budget (OMB) has a dedicated evidence team that acts as a central hub of expertise across the federal government. The team works with other OMB offices to set research priorities and ensure the use of appropriate evaluation methodologies in federal evaluations. In July 2019, the team created an interagency council that brings together Evaluation Officers. This council is intended to serve as a forum for officers to exchange information and advise the OMB on issues affecting the evaluation function, such as evaluator competencies, best practices for programme evaluation and evaluation capacity building. The council also allows for coordination and collaboration between evaluators and the government, and plays a leadership role for the larger federal evaluation community. To ensure that evidence is used in policy design, the evidence team is also actively involved in offering technical assistance to federal agencies.
Source: Clark, C. (2019) “OMB Moving Ahead to Steer Agencies on Evidence-Based Policymaking” https://www.govexec.com/management/2019/07/omb-moving-ahead-steer-agencies-evidence-based-policymaking/158381/
Of the 14 countries making use of such platforms, only five have mandates focused on matching the supply of and demand for evaluations (see Table 3.7).
Table 3.7. Mandates of coordination platforms
| Country | Mapping the evidence brokerage function across government as a way to foster systematic use of evidence | Ensure that policy evaluation and resources for evidence use are directed to inform policy design for government priorities | Ensure that the production of evidence matches the demand of policy makers in terms of timing and format | Enable the sharing of policy evaluations and of evidence results to practitioners and local governments to improve service delivery | Facilitate international cooperation in evidence production and use to enable efficiency gains |
|---|---|---|---|---|---|
| Canada | ○ | ○ | ○ | ○ | ○ |
| Estonia | ● | ● | ○ | ○ | ○ |
| Finland | ○ | ● | ● | ● | ○ |
| Germany | ○ | ● | ● | ● | ● |
| Great Britain | ○ | ● | ● | ● | ○ |
| Ireland | ○ | ● | ● | ● | ○ |
| Japan | ○ | ○ | ○ | ○ | ○ |
| Latvia | ● | ● | ○ | ● | ○ |
| Mexico | ○ | ● | ● | ● | ● |
| Norway | ○ | ○ | ○ | ● | ○ |
| Poland | ● | ● | ○ | ● | ● |
| United States | ● | ○ | ○ | ● | ○ |
| Brazil | ○ | ● | ○ | ○ | ○ |
| Colombia | ○ | ● | ○ | ● | ● |

Key: ● = function carried out by the coordination platform; ○ = function not carried out.
Note: n=14 (12 OECD). Answers reflect responses to the question “What functions are being carried out by this coordination platform?”. The information reported here refers only to the countries that selected the option “A coordination platform across government to promote the use of evidence (produced by policy evaluations) in policy making” in the question “How does your government promote the use of the findings of policy evaluations”. Under the option “other”, Poland reported “assessment of evaluation reports influencing on robustness of evaluations”, and Japan “information sharing”.
Source: OECD Survey on Policy Evaluation (2018).
A majority of countries have mandates relating to ensuring that evaluation resources are directed to inform policy design and decision-making. Estonia, Latvia, Poland and the United States report that the coordination platform plays a role in mapping the evidence brokerage function, while Germany, Mexico, Colombia and Poland are the only countries that have explicitly attributed to the platforms a role in facilitating international cooperation for the use of evaluations.
In Japan, for instance, the Ministry of Internal Affairs and Communications (MIC) prepares an annual report on the status of policy evaluations carried out by the ministries and on how the results of the evaluations have been reflected in policy planning and development. The MIC then aggregates the results of the evaluations conducted by ministries on the ‘Portal Site for Policy Evaluation’.
Embedding the use of evaluation findings into policy planning/making processes
Incorporation of evaluation findings in the budgetary cycle
Incorporation of evaluation findings in the budgetary cycle is one of the most commonly used mechanisms for promoting the use of evaluations (see Figure 3.17). In fact, the results of the survey show that half of surveyed countries report incorporating evaluation evidence into the budgetary cycle. Sectoral respondents seem to incorporate such evidence less, with 35% (11 countries) of health respondents using evaluation evidence in budgetary decision-making and only 25% (6 countries) of public sector reform respondents.
Country practices in this regard can take on a variety of forms depending on:
the nature of the evidence produced: spending reviews or policy evaluation.
the extent to which this evidence will impact budgetary decisions.
According to data from the budgeting and public expenditures survey (2018) (OECD, 2019[12]), spending reviews are a widely used tool in OECD countries as part of the budget cycle. As discussed in Chapter 1, spending reviews produce performance evidence on programmes and policies. Nevertheless, while spending reviews focus on the effectiveness and efficiency of currently funded programmes in order to propose options for savings and fund reallocations, policy evaluations also look at the impact of public interventions. Spending reviews therefore also need to be informed by evaluations and an assessment of the effectiveness of programmes (Robinson, 2014[187]; Smismans, 2015[14]; The World Bank, 2018[15]).
Many OECD countries (27 out of 33 respondents) (OECD, 2019[12]) make use of spending reviews and the evidence they produce in their budgetary cycle. Some countries, such as Denmark, the Netherlands or Germany, conduct ad hoc spending reviews to inform some allocation decisions every year. Others have used more comprehensive spending reviews, typically on a rolling basis over the period of an electoral mandate. The Irish Department of Public Expenditure and Reform has introduced a rolling spending review process, which consists of examining national expenditures over a three-year period and assessing the effectiveness of existing programmes. The three-year rolling nature of the review enables building up expertise and awareness of the process, and allows analysts to revisit emerging issues and further embed an evaluation culture across the Public Service.
Figure 3.17. Incorporation of policy evaluation findings into the budget cycle
Note: For the main institution n=42 (35 OECD). For the main institution on government-wide policy priorities n=29 (24 OECD). 4 countries (all OECD) answered that they do not have government-wide policy priorities. 9 countries (7 OECD) answered that they do not evaluate their government-wide policy priorities. For the Health ministries n=31 (28 OECD). 9 countries (7 OECD) did not participate in this survey. 2 countries (1 OECD) are not included as they answered that none of the policies that fall in their institution's responsibility are evaluated. For the PSR ministries n=25 (20 OECD). 11 countries (10 OECD) did not participate in this survey. Moreover, 6 countries (5 OECD) are not included as they answered that none of the policies that fall in their institution's responsibility are evaluated. Answers reflect responses to the questions “How does your government promote the use of the findings of policy evaluations” and "How does your institution promote the use of the findings of policy evaluations? (Check all that apply)", focused on "Incorporation of policy evaluation findings into the budget cycle".
Source: OECD Survey on Policy Evaluation (2018).
According to the results of the survey on performance budgeting for a sample of 20 countries, the impact of policy evaluations on budget decisions remains relatively limited compared to spending reviews.
Figure 3.18. Influence of evaluation findings on budget allocation decisions
Note: All OECD countries, n=28. 7 countries did not provide answers to this question. Answers reflect responses to the question, "At what level, and to what extent, do evaluation findings influence budget allocation decisions?".
Source: OECD Survey on Performance Budgeting (2018).
One exception is the budgetary cycle in Lithuania, where the Office of the Government, together with the ministry of finance, summarises the results of evaluations in preparation for budget negotiations, in a note that provides information on progress achieved by the agency since the evaluation and on any implementation gaps. Another such exception is the budgetary cycle in Canada (see Box 3.32).
Box 3.32. Use of evaluation findings in the budgetary cycle in Canada
The Treasury Board’s reviews, expenditure and allocation decisions are required to be informed by evaluation findings. Such use of evaluation findings is ensured by requiring organisations in charge of policy-making to seek permission from the Treasury Board to obtain expenditure authority for policies, programmes and projects. When drafting a submission for the Treasury Board, these organisations have to address a list of practical questions related to evaluation, which can be found on the Canadian Government’s website.
The organisation making a policy proposal has to define the expected results of the policy in light of existing policies, state whether an evaluation has been conducted and, if so, share its results, or otherwise state whether a future evaluation is planned. The proposal also has to indicate whether the head of evaluation or head of performance measurement has been consulted in the development of the policy. In any case, the policy has to be supported by relevant evidence, and any unfavourable evidence has to be discussed.
Source: OECD (2018) Survey on Policy Evaluation.
Finally, evaluative evidence may be used in a more or less systematic manner in the budget cycle. For instance, the OECD has identified four main models of performance budgeting, which reflect the differing strength of the links between performance evidence and budgeting (OECD, 2019[12]):
presentational (evidence presented separately from the main budget document)
performance informed (performance evidence included within the budget document that is presented on the basis of programmes)
managerial (performance evidence used primarily for internal management and accountability purposes)
direct performance budgeting (direct link between results and resources).
In most OECD countries, performance evidence is included in the budget cycle according to one of the first three approaches. France offers an example of strong links between key performance indicators established at the national level, their ex post evaluation and the budget cycle. France’s organic budget law (loi organique relative aux lois de finances, LOLF) groups expenditures by “missions” and programmes, each associated with policy objectives and performance indicators. In each budgetary cycle, the programmes of the previous cycle’s budget are evaluated against these objectives and indicators in annual performance reports (rapports annuels de performance). These evaluations are included in the annex of the main budget document, which is examined by Parliament.
Another long-standing example of embedding evaluations in policy-making is the domain of regulatory policy, where there are requirements both for using evaluation ex ante as part of the RIA process and for the implementation and use of evaluations in laws and policies. An increasing number of laws and regulatory acts contain clauses with formal requirements for policy evaluation. The OECD Council Recommendation on Regulatory Policy and Governance makes numerous references to evaluation as part of promoting evidence-based decision-making, including through ex ante and ex post assessment of regulations (OECD, 2012). The OECD is currently developing best practice principles for RIA (GOV/RPC(2018)12/REV2) as well as for ex post review (GOV/RPC(2018)5/REV2), as the attention to regulatory quality is increasingly shifting the focus not only to the ex ante evaluation of impacts, but also to ex post evaluation.
In the case of ex ante evaluations, these requirements can promote the use of evaluations, as the evidence produced through the assessment is ultimately meant to inform decision makers on whether and how to regulate to achieve public policy goals (OECD, 2018[172]). Regulatory impact assessment (RIA) is “a systematic process of identification and quantification of benefits and costs likely to flow from regulatory or non-regulatory options for a policy under consideration. A RIA may be based on benefit-cost analysis, cost-effectiveness analysis, business impact analysis, etc.” (OECD, 2018[172]). RIA can thus be an important tool for promoting an evidence-informed policy-making agenda.
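As an illustrative sketch only (neither the survey nor the OECD definition above prescribes a single method), a benefit-cost based RIA typically compares regulatory options by the net present value of their expected benefits $B_t$ and costs $C_t$ over an appraisal horizon $T$, discounted at rate $r$:

$$\mathrm{NPV} = \sum_{t=0}^{T} \frac{B_t - C_t}{(1+r)^t}$$

An option is generally favoured when its NPV is positive and highest among the alternatives considered, while a cost-effectiveness analysis instead ranks options by their cost per unit of a given outcome.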
RIA is now required in almost all OECD countries for the development of at least some regulations and implementation gaps are slowly diminishing (OECD, 2018[172]).
Figure 3.19. Formal requirements to conduct RIA and ex post evaluation of primary laws
Note: Data for OECD countries is based on the 34 countries that were OECD members in 2014 and the European Union. Data on new OECD member and accession countries 2017 includes Colombia, Costa Rica, Latvia and Lithuania.
Source: Indicators of Regulatory Policy and Governance Surveys 2014 and 2017, http://oe.cd/ireg.
Ex post evaluations of regulations seek to establish whether laws and regulations continue to be fit for purpose. At the same time, ex post evaluations provide an opportunity to assess whether there are better means of achieving the original policy goals and thereby further enhance societal welfare. According to data from the indicators of regulatory policy and governance survey (see OECD, 2018[123]), ex post review of regulations remains less institutionalised than ex ante assessment, with fewer countries having formalised arrangements. Less than one third of OECD countries have systematic requirements for ex post evaluation of regulations (OECD, 2018[172]).
Using evaluation at the Centre of Government, and monitoring of government wide policy priorities
Countries have also used various mechanisms to discuss evaluation results at the highest political level. This practice is more frequent for the evaluation of government-wide policy priorities (GWPP), with about half of countries (14, of which 11 OECD countries) discussing evaluation findings at the level of the Council of Ministers (or equivalent), compared with a third of countries (14, of which 11 OECD countries) for policy evaluations in general. In Korea, for instance, in the context of the “100 Policy Tasks” five-year plan, evaluation results are discussed at the council of ministers (see Box 3.33).
Figure 3.20. Discussion of evaluation findings at the Centre of government
Note: For the main institution n=42 (35 OECD member countries). For the main institution on government-wide policy priorities n=29 (24 OECD). 4 countries (all OECD) answered that they do not have government-wide policy priorities. 9 countries (7 OECD) answered that they do not evaluate their government-wide policy priorities. For the Health ministries n=31 (28 OECD). 9 countries (7 OECD) did not participate in this survey. 2 countries (1 OECD member country) are not included as they answered that none of the policies that fall in their institution's responsibility are evaluated. For the PSR ministries n=25 (20 OECD member countries). 11 countries (10 OECD) did not participate in this survey. 6 countries (5 OECD) are not included as they answered that none of the policies that fall in their institution's responsibility are evaluated. Answers reflect responses to the questions “How does your government promote the use of the findings of policy evaluations” and "How does your institution promote the use of the findings of policy evaluations? (Check all that apply)", focused on "Discussions of evaluation findings at the Council of Ministers (or equivalent)".
Source: OECD Survey on Policy Evaluation (2018).
Box 3.33. Discussion of evaluation results at the Council of Ministers in Korea
In Korea, according to the Government performance evaluation implementation plan, evaluation results have to be discussed at the council of ministers. This plan is part of a larger five-year plan that requires ministerial and vice-ministerial agencies to carry out 100 policy tasks to turn the Republic into a more people-centred democracy. Ministerial capacity is evaluated with regard to these 100 policy tasks on the basis of job creation, attainment of targets and policy impact. Ministries are encouraged to take continued interest in this evaluation plan and to give inputs on the implementation of the evaluated tasks by participating in forums within the state administration to discuss evaluation results. On the basis of these results, the evaluation plan for the subsequent year is adapted and rewards are given to the best performing agencies, further incentivising them to make evidence-based decisions. Lastly, when necessary, the Prime Minister presides over Government performance evaluation committee meetings, during which evaluation reports are reviewed.
Source: OECD (2018) Survey on Policy Evaluation.
Other countries have set-up specific committees or councils, most often at the centre of government, in order to follow-up on the implementation of policy evaluations and/or discuss their findings. The Brazilian committee for monitoring and evaluation of federal public policies is an example of such committee, which brings together high-level representatives from the executive (Presidency of the Republic, ministry of finance, ministry of planning, and the ministry of transparency) and from the comptroller general of the Union (CGU) (see Box 3.34).
Box 3.34. The Brazilian Committee for Monitoring and Evaluation of Federal Public Policies
In 2016, the Brazilian Government established the committee for monitoring and evaluation of federal public policies (CMAP) with the objective of encouraging the use of evaluation results to improve public policy outcomes and performance, the allocation of resources, and the quality of public spending. The committee involves the following institutions: the ministry of planning, development and management, the ministry of finance, the ministry of transparency, the Union’s general comptroller, and the civil house of the Presidency. They meet periodically to monitor and evaluate the public policies selected by the CMAP and to propose alternative designs and adjustments to them accordingly. All policymakers in charge of the evaluated policies are invited to participate in the CMAP’s evaluation activities. Moreover, although not always in a systematic way, most evaluation findings feed into broader political discussions on public policy. The CMAP has thereby been able to promote several reforms in the legal framework and design of evaluated policies. The CMAP can be seen as an effective mechanism for fostering evaluation use at the highest political level thanks to its composition, which consists of central ministries responsible for the public budget, public resources and political coordination.
Source: OECD (2018) Survey on Policy Evaluation.
At the sector level, such councils or committees can be set up on an ad hoc basis in order to follow up on the implementation of recommendations from policy evaluations. In Canada, for example, after the review of federally funded pan-Canadian health organizations was completed in March 2018, an implementation steering group (ISG) was formed to develop a detailed implementation plan and provide advice to Health Canada on how to move forward with the recommendations of the review.
The role of institutions beyond the executive
The role of Parliaments
Beyond their role as evaluation producers, parliaments have a particular role to play in promoting the use of evaluations. First, as they contribute to ensuring accountability, parliaments have played an important role in promoting a more structured or systematic approach to conducting evaluations (Gaarder and Briceño, 2010[55]). For instance, parliaments have been instrumental in increasing evaluation use by promoting the use of evaluative evidence in the budgetary cycle (by requiring more performance data on government spending), introducing evaluation clauses into laws, and commissioning evaluations at the committee level in the context of hearings (Jacob, Speer and Furubo, 2015[63]). Second, parliaments rely on verifiable and sound data on which to base their policy initiatives and can thus push for the establishment of a structured approach to gathering this information. Most parliaments have research and information services that help members of parliament order or request evaluation reports. Some even conduct evidence syntheses, thus playing a knowledge brokerage function (Jacob, Speer and Furubo, 2015[63]). Independent fiscal institutions and parliamentary budget offices attached to parliament are among the main users of such data. Parliaments are also recipients of evaluations conducted by other institutions or bodies, such as supreme audit institutions. In Denmark, the evaluations of Rigsrevisionen are handed over to parliament for a formal reaction. In Japan, the government submits to the Diet each year a report on policy evaluation and on how the results of such evaluations have been reflected in policy planning and development.
Nevertheless, the results of the survey suggest that discussion of evaluation findings in parliament is a relatively infrequent practice. Only 10 countries discuss findings from the evaluation of their government-wide policy priorities in parliament; only 8 countries do so for health and 4 for public sector reform. In 2018, France’s parliamentary finance committee launched the “Spring of Evaluation”, which provides a platform to discuss work on policy evaluation, following the evaluation theme and planning agreed by the committee at the beginning of the year. Ministries are invited to hearings to discuss the performance of the public policies for which they are responsible. Three days of full discussions are then organised in a public hearing, which includes questions, discussions and the adoption of parliamentary resolutions.3
Figure 3.21. Discussion of evaluation findings in Parliament
Note: For the main institution on government-wide policy priorities n=29 (24 OECD). 4 countries (all OECD) answered that they do not have government-wide policy priorities. 9 countries (7 OECD) answered that they do not evaluate their government-wide policy priorities. For the Health ministries n=31 (28 OECD). 9 countries (7 OECD) did not participate in this survey. 2 countries (1 OECD) are not included as they answered that none of the policies that fall in their institution's responsibility are evaluated. For the PSR ministries n=25 (20 OECD). 11 countries (10 OECD) did not participate in the survey. 6 countries (5 OECD) are not included as they answered that none of the policies that fall in their institution's responsibility are evaluated. Answers reflect responses to the questions “How does your government promote the use of the findings of policy evaluations” and "How does your institution promote the use of the findings of policy evaluations? (Check all that apply)", focused on "Through discussion of evaluation findings in Parliament (or equivalent)".
Source: OECD Survey on Policy Evaluation (2018).
The role of Supreme Audit Institutions in promoting the use of evaluations
SAIs play a role in promoting the use of evaluations in three main ways. First, as part of their mandate, many SAIs assess the mechanisms through which governments manage performance evidence, which includes looking at how evidence is used in the budgeting process and in other systems for managing information. Second, SAIs contribute to evaluation use by disclosing the results of the evaluations they conduct (OECD, 2016[21]).
Indeed, an INTOSAI survey of 14 SAIs shows that a majority (62%) generally publish their evaluations, and 75% of those state that their evaluations are frequently covered by the media (INTOSAI Working Group on Evaluation of Public Policies and Programs, 2019[188]). Supreme audit institutions may thus contribute to the use of evaluations by promoting public awareness of their results. Other SAIs use active communication strategies to ensure the use of the evaluations they conduct. The Swiss federal audit office, for instance, uses advisory groups as “multiplier agents” that help disseminate and communicate its results, thereby fostering the use of its evaluations (Swiss Federal Audit Office, 2019[189]).
Third, some SAIs contribute to use by assessing government entities’ use of evidence in decision-making as part of their mandate to evaluate for results. For example, the US Government Accountability Office produces reports and recommendations, targeted at both the executive and Congress, on the implementation of the GPRA Modernization Act (GPRAMA), which gives the Office of Management and Budget (OMB) an important role in disseminating and integrating a results- and performance-based approach to public administration.
References
[7] 115th Congress (2019), Public Law No: 115-435 (01/14/2019) - Foundations for Evidence-Based Policymaking Act of 2018, https://www.congress.gov/bill/115th-congress/house-bill/4174.
[18] Acquah, D., K. Lisek and S. Jacobzone (2019), “The Role of Evidence Informed Policy Making in Delivering on Performance: Social Investment in New Zealand”, OECD Journal on Budgeting, Vol. 19/1, https://dx.doi.org/10.1787/74fa8447-en.
[131] AEVAL (2015), Practical guide for the design and implementation of public policy evaluations(Guía práctica para el diseño y la realización de evaluaciones de políticas públicas), http://www.aeval.es/export/sites/aeval/comun/pdf/evaluaciones/Guia_Evaluaciones_AEVAL.pdf (accessed on 21 August 2019).
[161] Alkin, M. and S. Taut (2002), “Unbundling evaluation use”, Studies in Educational Evaluation, Vol. 29/1, pp. 1-12, http://dx.doi.org/10.1016/S0191-491X(03)90001-0.
[148] American Evaluation Association (2018), Guiding Principles.
[151] American Evaluation Association (2015), Core Evaluator Competencies, http://www.eval.org.
[117] Barnett, C. and L. Camfield (2016), “Ethics in evaluation”, Journal of Development Effectiveness.
[142] Better evaluation (2019), Review evaluation (do meta-evaluation), https://www.betterevaluation.org/en/rainbow_framework/manage/review_evaluation_do_meta_evaluation (accessed on 19 August 2019).
[67] Bossuyt, J., L. Shaxson and A. Datta (2014), “Study on the uptake of learning from EuropeAid’s strategic evaluations into development policy and practice”, Evaluation Unit of the Directorate General for Development and Cooperation-EuropeAid (European Commission).
[165] Bridgeland, J. and P. Orszag (2013), Can Government Play Moneyball? - The Atlantic, https://www.theatlantic.com/magazine/archive/2013/07/can-government-play-moneyball/309389/ (accessed on 6 December 2018).
[60] Brown, L. and S. Osborne (2013), “Risk and Innovation”, Public Management Review, Vol. 15/2, pp. 186-208, http://dx.doi.org/10.1080/14719037.2012.707681.
[119] Brown, R. and D. Newman (1992), “Ethical Principles and Evaluation Standards: Do They Match?”, Evaluation Review, Vol. 16/6, pp. 650-663, http://dx.doi.org/10.1177/0193841X9201600605.
[152] Bundesministerium für Finanzen and Bundesministerin für Frauen und öffentlichen Dienst (2013), Handbuch Wirkungsorientierte Folgenabschätzung Arbeitsunterlage, http://www.oeffentlicherdienst.gv.at (accessed on 28 August 2019).
[129] Campbell, S. and G. Harper (2012), Quality in policy impact evaluation: understanding the effects of policy from other influences (supplementary Magenta Book guidance), HM Treasury, http://www.nationalarchives.gov.uk/doc/open- (accessed on 9 July 2019).
[8] Canada Treasury Board (2016), Policy on results, https://www.tbs-sct.gc.ca/pol/doc-eng.aspx?id=31300 (accessed on 20 September 2019).
[133] Caroline Heider (2018), The Three Pillars of a Working Evaluation Function: IEG’s Experience, http://ieg.worldbankgroup.org/blog/three-pillars-working-evaluation-function-iegs-experience (accessed on 22 August 2019).
[65] Cinar, E., P. Trott and C. Simms (2018), “Public Management Review A systematic review of barriers to public sector innovation process”, http://dx.doi.org/10.1080/14719037.2018.1473477.
[31] Commission on Evidence-Based Policymaking (2017), The Promise of Evidence-Based Policymaking: Report of the Commission on Evidence-Based Policymaking, https://www.cep.gov/report/cep-final-report.pdf (accessed on 6 August 2019).
[23] CONEVAL (2007), Lineamientos generales para la evaluación de los Programas Federales de la Administración Pública Federal, https://www.coneval.org.mx/rw/resource/coneval/eval_mon/361.pdf (accessed on 18 June 2019).
[159] Cooksy, L. and M. Mark (2012), “Influences on evaluation quality”, American Journal of Evaluation, Vol. 33/1, pp. 79-84, http://dx.doi.org/10.1177/1098214011426470.
[45] Crowley, D. et al. (2018), “Standards of Evidence for Conducting and Reporting Economic Evaluations in Prevention Science”, Prevention Science, Vol. 19/3, pp. 366-390, http://dx.doi.org/10.1007/s11121-017-0858-1.
[109] Damschroder, L. et al. (2009), “Fostering implementation of health services research findings into practice: a consolidated framework for advancing implementation science”, Implementation Science, Vol. 4/1, p. 50, http://dx.doi.org/10.1186/1748-5908-4-50.
[25] Departamento Nacional de Planeación (2016), ¿Qué es una Evaluación?, https://sinergia.dnp.gov.co/Paginas/Internas/Evaluaciones/%C2%BFQu%C3%A9-es-Evaluaciones.aspx.
[42] Department of the Prime Minister and Cabinet (2014), Guide to Implementation Planning, Australian government, https://www.pmc.gov.au/sites/default/files/files/guide-to-implementation-planning.pdf (accessed on 12 July 2019).
[85] Department of the Prime Minister and Cabinet (Australia) (2013), Policy Implementation |, https://www.pmc.gov.au/government/policy-implementation (accessed on 20 September 2019).
[116] Desautels, G. and S. Jacob (2012), “The ethical sensitivity of evaluators: A qualitative study using a vignette design”, Evaluation, Vol. 18/4, pp. 437-450, http://dx.doi.org/10.1177/1356389012461192.
[176] Dobbins, M. et al. (2009), “A randomized controlled trial evaluating the impact of knowledge translation and exchange strategies”, Implementation Science, Vol. 4/1, p. 61, http://dx.doi.org/10.1186/1748-5908-4-61.
[157] Estonian National Audit Office (2011), The state of affairs with the legislative impact assessment.
[171] European Commission (2017), Better Regulation Guidelines, http://europa.eu/about-eu/basic-information/decision-making/treaties/index_en.htm.
[158] European Court of Auditors (2013), Audit Guidelines on Evaluation, https://www.eca.europa.eu/Lists/ECADocuments/GUIDELINES_EVALUATION/Evaluation-Guideline-EN-Oct2013.pdf (accessed on 23 August 2019).
[27] European Environment Agency (2017), “EEA guidance document-policy evaluation”, https://www.researchgate.net/publication/317594615.
[50] Flay, B. et al. (2005), “Standards of Evidence: Criteria for Efficacy, Effectiveness and Dissemination”, Prevention Science, Vol. 6/3, pp. 151-175, http://dx.doi.org/10.1007/s11121-005-5553-y.
[162] Fleischer, D. and C. Christie (2009), “Evaluation use: Results from a survey of U.S. American Evaluation Association members”, American Journal of Evaluation, Vol. 30/2, pp. 158-175, http://dx.doi.org/10.1177/1098214008331009.
[61] Flemig, S., S. Osborne and T. Kinder (2016), “Risky business—reconceptualizing risk and innovation in public services”, Public Money & Management, Vol. 36/6, pp. 425-432, http://dx.doi.org/10.1080/09540962.2016.1206751.
[128] France Stratégie (2016), How to evaluate the impact of public policies: a guide for the use of decision makers and practitioners (Comment évaluer l’impact des politiques publiques : un guide à l’usage des décideurs et des praticiens), https://www.strategie.gouv.fr/sites/strategie.gouv.fr/files/atoms/files/guide_methodologique_20160906web.pdf (accessed on 21 August 2019).
[130] France Stratégie, R. Desplatz and M. Ferracci (2016), Comment évaluer l’impact des politiques publiques ? Un guide à l’usage des décideurs et praticiens.
[134] France Stratégie, R. Desplatz and M. Ferracci (n.d.), Comment évaluer l’impact des politiques publiques ? Un guide à l’usage des décideurs et praticiens.
[55] Gaarder, M. and B. Briceño (2010), “Institutionalisation of government evaluation: balancing trade-offs”, Journal of Development Effectiveness, Vol. 2/3, pp. 289-309, http://dx.doi.org/10.1080/19439342.2010.505027.
[28] Gasper, D. (2018), “Policy Evaluation: From Managerialism and Econocracy to a Governance Perspective”, in International Development Governance, Routledge, http://dx.doi.org/10.4324/9781315092577-37.
[30] Gasper, D. (2018), “Policy Evaluation: From Managerialism and Econocracy to a Governance Perspective”, in International Development Governance, Routledge, http://dx.doi.org/10.4324/9781315092577-37.
[168] Gauthier, B. (2015), “Some pointers concerning Evaluation Utilization”.
[51] Goldstein, C. et al. (2018), Ethical issues in pragmatic randomized controlled trials: A review of the recent literature identifies gaps in ethical argumentation, BioMed Central Ltd., http://dx.doi.org/10.1186/s12910-018-0253-x.
[90] Government Policy Analysis Unit (2017), Global Evidence Policy Units: Finland, https://www.ksi-indonesia.org/file_upload/Evidence-Policy-Unit-in-Finland-the-Government-Po-14Jun2017163532.pdf.
[110] Greenhalgh, T. et al. (2004), “Diffusion of Innovations in Service Organizations: Systematic Review and Recommendations”, The Milbank Quarterly, Vol. 82/4, pp. 581-629, http://dx.doi.org/10.1111/j.0887-378X.2004.00325.x.
[170] Haynes, A. et al. (2012), “Identifying Trustworthy Experts: How Do Policymakers Find and Assess Public Health Researchers Worth Consulting or Collaborating With?”, PLoS ONE, Vol. 7/3, p. e32665, http://dx.doi.org/10.1371/journal.pone.0032665.
[173] Haynes, A. et al. (2018), “What can we learn from interventions that aim to increase policy-makers’ capacity to use research? A realist scoping review”, Health Research Policy and Systems, Vol. 16/1, p. 31, http://dx.doi.org/10.1186/s12961-018-0277-1.
[29] Heider, C. (2017), Rethinking Evaluation - Efficiency, Efficiency, Efficiency, https://ieg.worldbankgroup.org/blog/rethinking-evaluation-efficiency.
[41] Hildén, M. (2014), “Evaluation, assessment, and policy innovation: exploring the links in relation to emissions trading”, Environmental Politics, Vol. 23/5, pp. 839-859, http://dx.doi.org/10.1080/09644016.2014.924199.
[9] HM Treasury (2011), The Magenta Book: Guidance for evaluation, https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/220542/magenta_book_combined.pdf (accessed on 18 June 2019).
[191] HM Treasury (2011), The Magenta Book: Guidance for evaluation, https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/220542/magenta_book_combined.pdf (accessed on 18 June 2019).
[3] Howlett, M. (2019), “Policy analytical capacity and evidence‐based policy‐making: Lessons from Canada”, https://doi.org/10.1111/j.1754-7121.2009.00070_1.x.
[153] IGEES (2014), Irish Government Economic and Evaluation Service, https://igees.gov.ie/ (accessed on 28 January 2019).
[94] Impact Assessment Office (2018), “The Uncompleted Evaluation of Legislative in Italy: Critical Issues, Prospects and Good Practice”, http://www.senato.it/service/PDF/PDFServer/BGT/01082854.pdf (accessed on 23 September 2019).
[127] Independent Evaluation Office of UNDP (2019), UNDP Evaluation Guidelines.
[82] Innovate UK (2018), “Evaluation Framework, How we assess our impact on business and the economy”, https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/681741/17.3253_Innovate_UK_Evaluation_Framework_RatherNiceDesign_V2_FINAL_WEB.pdf (accessed on 20 September 2019).
[20] International Organisation of Supreme Audit Institutions (2019), ISSAI 100: Fundamental Principles of Public-Sector Auditing, http://www.issai.org, http://www.intosai.org (accessed on 8 January 2020).
[155] INTOSAI (2016), Guidelines on the Evaluation of Public Policies, http://www.issai.org (accessed on 22 August 2019).
[122] INTOSAI (2010), Program Evaluation for SAIs, A Primer, https://www.eurosai.org/handle404?exporturi=/export/sites/eurosai/.content/documents/materials/Program-Evaluation-for-SAIs.pdf (accessed on 22 August 2019).
[188] INTOSAI Working Group on Evaluation of Public Policies and Programs (2019), Implementation of the INTOSAI GOV 9400 Guidelines: Survey Results.
[97] INTOSAI and Cour des Comptes (France) (2019), Implementation of the INTOSAI GOV 9400 Guidelines: Survey results.
[118] Jacob, S. and Y. Boisvert (2010), To Be or Not to Be a Profession: Pros, Cons and Challenges for Evaluation, http://dx.doi.org/10.1177/1356389010380001.
[63] Jacob, S., S. Speer and J. Furubo (2015), “The institutionalization of evaluation matters: Updating the International Atlas of Evaluation 10 years later”, Evaluation, Vol. 21/1, pp. 6-31, http://dx.doi.org/10.1177/1356389014564248.
[167] Johnson, K. et al. (2009), “Research on Evaluation Use A Review of the Empirical Literature From 1986 to 2005”, http://dx.doi.org/10.1177/1098214009341660.
[149] King, J. et al. (2001), Toward a Taxonomy of Essential Evaluator Competencies.
[181] Kothari, A. et al. (2009), “Is research working for you? validating a tool to examine the capacity of health organizations to use research”, Implementation Science, Vol. 4/1, p. 46, http://dx.doi.org/10.1186/1748-5908-4-46.
[17] Kroll, A. and D. Moynihan (2018), “The Design and Practice of Integrating Evidence: Connecting Performance Management with Program Evaluation”, Public Administration Review, Vol. 78/2, pp. 183-194, http://dx.doi.org/10.1111/puar.12865.
[103] Kusters, C. (2011), Making evaluations matter: a practical guide for evaluators, https://www.researchgate.net/publication/254840956.
[125] Kusters, C. et al. (2011), “Making evaluations matter: a practical guide for evaluators”, Centre for Development Innovation, Wageningen University & Research centre., https://www.researchgate.net/publication/254840956.
[175] Langer, L., J. Tripney and D. Gough (2016), The science of using science: researching the use of research evidence in decision-making.
[5] Lazaro, B. (2015), Comparative Study on the Institutionalization of Evaluation in Europe and Latin America, Eurosocial Programme.
[54] Lázaro, B. (2015), Comparative study on the institutionalisation of evaluation in Europe and Latin America, Eurosocial Programme, Madrid, http://sia.eurosocial-ii.eu/files/docs/1456851768-E_15_ENfin.pdf (accessed on 9 July 2019).
[57] Ledermann, S. (2012), “Exploring the Necessary Conditions for Evaluation Use in Program Change”, American Journal of Evaluation, Vol. 33/2, pp. 159-178, http://dx.doi.org/10.1177/1098214011411573.
[163] Ledermann, S. (2012), “Exploring the Necessary Conditions for Evaluation Use in Program Change”, http://dx.doi.org/10.1177/1098214011411573.
[100] Leviton, L. and E. Hughes (1981), Research on the Utilization of Evaluations: A Review and Synthesis.
[135] Little, B. (ed.) (1979), Speaking Truth to Power: The Art and Craft of Policy Analysis.
[68] Liverani, M., B. Hawkins and J. Parkhurst (2013), Political and institutional influences on the use of evidence in public health policy. A systematic review., http://dx.doi.org/10.1371/journal.pone.0077404.
[58] Mackay, K. (2007), How to Build M&E Systems to Support Better Government, The World Bank, http://dx.doi.org/10.1596/978-0-8213-7191-6.
[69] Maeda, A., M. Harrit and S. Mabuchi (2012), Human Development Creating Evidence for Better Health Financing Decisions A Strategic Guide for the Institutionalization of National Health Accounts, The World Bank, Washington, DC, http://dx.doi.org/10.1596/978-0-8213-9469-4.
[140] Malčík, M. and A. Seberová (2010), “Meta-evaluation and Quality Standard of Final Evaluation Report”, The New Educational Review, Vol. 22, pp. 149-164.
[10] McDavid, J., I. Huse and L. Hawthorn (2006), Program Evaluation and Performance Measurement: An Introduction to Practice, https://study.sagepub.com/mcdavid3e (accessed on 28 January 2020).
[146] Mcguire, M. and R. Zorzi (2005), “Evaluator Competencies and Performance Development”, The Canadian Journal of Program Evaluation, Vol. 20/2, pp. 73-99.
[37] Mergaert, L. and R. Minto (2015), “Ex Ante and Ex Post Evaluations: Two Sides of the Same Coin? The Case of Gender Mainstreaming in EU Research Policy”, Symposium on Policy Evaluation in the EU, http://dx.doi.org/10.1017/S1867299X0000427X.
[126] Mideplan (2018), Guide for Terms of References (Guia de Términos de Referencia), https://documentos.mideplan.go.cr/share/s/DVyxtc0OR3a0T6E2QbfQww (accessed on 21 August 2019).
[180] Mideplan (2018), Guide for the use of evaluations: guidelines for its implementation and follow-up on recommendations, https://documentos.mideplan.go.cr/share/s/DDVZ114kTjCsTAxiihi5Kw (accessed on 3 September 2019).
[83] Mideplan (2018), National Evaluation Policy, https://documentos.mideplan.go.cr/share/s/Ymx1WmMJTOWe9YyjyeCHKQ (accessed on 20 September 2019).
[35] Ministerio de Desarrollo Social y Familia de Chile (2019), Evaluación Social Ex Ante, http://sni.ministeriodesarrollosocial.gob.cl/evaluacion-iniciativas-de-inversion/evaluacion-ex-ante/ (accessed on 5 July 2019).
[24] Ministerio de Planificación Nacional y Política Económica (2018), Manual de Evaluación para Intervenciones Públicas, https://documentos.mideplan.go.cr/share/s/6eepeLCESrKkft6Mf5SToA (accessed on 5 August 2019).
[38] Ministry of Finance (2006), The Norwegian Government Agency for Financial Management (DFØ), https://www.regjeringen.no/en/dep/fin/about-the-ministry/etater-og-virksomheter-under-finansdepartementet/subordinateagencies/the-norwegian-government-agency-for-fina/id270409/ (accessed on 12 July 2019).
[132] Ministry of Finance (Lithuania) (2011), Recommendations on Implementation of Programs Evaluation Methodology, https://finmin.lrv.lt/uploads/finmin/documents/files/LT_ver/Veiklos_sritys/Veiklos_efektyvumo_tobulinimas/PVrekomendacijos2011.pdf (accessed on 27 August 2019).
[6] Ministry of Finance of The Netherlands (2018), Arrangements for periodic evaluation research, https://wetten.overheid.nl/BWBR0040754/2018-03-27 (accessed on 12 July 2019).
[76] Ministry of Internal Affairs and Communications (2017), “Basic Guidelines for Implementing Policy Evaluation (Revised)”, https://www.soumu.go.jp/main_content/000556221.pdf (accessed on 16 September 2019).
[190] Ministry of Internal Affairs and Communications (2017), Basic Guidelines for Implementing Policy Evaluation (Revised), http://www.soumu.go.jp/main_content/000556221.pdf.
[49] Morton, M. (2009), Applicability of Impact Evaluation to Cohesion Policy, Report Working Paper, https://ec.europa.eu/regional_policy/archive/policy/future/pdf/4_morton_final-formatted.pdf (accessed on 8 August 2019).
[154] National Audit Office (GBR) (2013), Evaluation in Government, https://www.nao.org.uk/report/evaluation-government/ (accessed on 22 August 2019).
[177] Neuhoff, A. et al. (2015), The What Works Marketplace Helping Leaders Use Evidence to Make Smarter Choices 2 Invest in What Works Policy Series, The Bridgespan group, http://www.results4america.org.
[164] Newman, J., A. Cherney and B. Head (2017), “Policy capacity and evidence-based policy in the public service”, Public Management Review, Vol. 19/2, pp. 157-174, http://dx.doi.org/10.1080/14719037.2016.1148191.
[111] Newman, K., C. Fisher and L. Shaxson (2012), “Stimulating Demand for Research Evidence: What Role for Capacity-building?”, IDS Bulletin, Vol. 43/5, pp. 17-24, http://dx.doi.org/10.1111/j.1759-5436.2012.00358.x.
[183] Newman, K., C. Fisher and L. Shaxson (2012), “Stimulating Demand for Research Evidence: What Role for Capacity-building?”, IDS Bulletin, Vol. 43/5, pp. 17-24, http://dx.doi.org/10.1111/j.1759-5436.2012.00358.x.
[99] OECD (2020), Building capacity for evidence informed policy making, OECD, Paris.
[98] OECD (2020), The Irish Government Economic and Evaluation Service: Using Evidence-Informed Policy Making to Improve Performance, OECD Public Governance Reviews, OECD Publishing, Paris, https://dx.doi.org/10.1787/cdda3cb0-en.
[12] OECD (2019), Budgeting and Public Expenditures in OECD Countries 2019, OECD Publishing, Paris, https://dx.doi.org/10.1787/9789264307957-en.
[59] OECD (2019), Evaluating Public Sector Innovation Support or hindrance to innovation?, Observatory of Public Sector Innovation-OPSI, Paris, https://oecd-opsi.org/wp-content/uploads/2019/05/Evaluating-Public-Sector-Innovation-Part-5a-of-Lifecycle-Report.pdf (accessed on 11 September 2019).
[107] OECD (2019), “Open Government Data Report”.
[11] OECD (2019), Open Government in Biscay, OECD Public Governance Reviews, OECD Publishing, Paris, https://dx.doi.org/10.1787/e4e1a40c-en.
[1] OECD (2018), Building Capacity for Evidence Informed Policy Making: Towards a Baseline Skill Set, http://www.oecd.org/gov/building-capacity-for-evidence-informed-policymaking.pdf (accessed on 3 September 2019).
[93] OECD (2018), Centre Stage 2- The organisation and functions of the centre of government in OECD countries, https://www.oecd.org/gov/centre-stage-2.pdf.
[47] OECD (2018), Cost-Benefit Analysis and the Environment: Further Developments and Policy Use, OECD Publishing, Paris, https://dx.doi.org/10.1787/9789264085169-en.
[2] OECD (2018), Draft Policy Framework on Sound Public Governance, http://www.oecd.org/gov/draft-policy-framework-on-sound-public-governance.pdf (accessed on 8 July 2019).
[123] OECD (2018), “OECD Best Practice Principles for Regulatory Policy: Reviewing the Stock of Regulation”.
[192] OECD (2018), OECD Performance Budgeting Survey.
[172] OECD (2018), OECD Regulatory Policy Outlook 2018, OECD Publishing, Paris, https://dx.doi.org/10.1787/9789264303072-en.
[112] OECD (2017), Governing Better Through Evidence-Informed Policy Making Options for an OECD Work Agenda.
[4] OECD (2017), Government at a Glance - OECD, OECD, Paris, http://www.oecd.org/gov/govataglance.htm (accessed on 9 July 2019).
[105] OECD (2017), Making policy evaluation work.
[186] OECD (2017), Policy Advisory Systems: Supporting Good Governance and Sound Public Decision Making, OECD Public Governance Reviews, OECD Publishing, Paris, https://dx.doi.org/10.1787/9789264283664-en.
[53] OECD (2017), Systems Approaches to Public Sector Challenges: Working with Change, OECD Publishing, Paris, https://dx.doi.org/10.1787/9789264279865-en.
[184] OECD (2016), Engaging Public Employees for a High-Performing Civil Service, OECD Public Governance Reviews, OECD Publishing, Paris, https://dx.doi.org/10.1787/9789264267190-en.
[144] OECD (2016), Evaluation Systems in Development Co-operation: 2016 Review, OECD Publishing, Paris, https://dx.doi.org/10.1787/9789264262065-en.
[43] OECD (2016), Open Government: The Global Context and the Way Forward, OECD Publishing, Paris, https://dx.doi.org/10.1787/9789264268104-en.
[21] OECD (2016), Supreme Audit Institutions and Good Governance: Oversight, Insight and Foresight, OECD Public Governance Reviews, OECD Publishing, Paris, https://doi.org/10.1787/9789264263871-en.
[81] OECD (2014), “Budget Review: Germany”, OECD Journal on Budgeting, Vol. 2.
[92] OECD (2014), “Centre Stage Driving Better Policies from the Centre of Government”, http://www.oecd.org/officialdocuments/publicdisplaydocumentpdf/?cote=gov/pgc/mpm(2014)3&doclanguage=en (accessed on 23 September 2019).
[96] OECD (2014), Chile’s Supreme Audit Institution: Enhancing Strategic Agility and Public Trust, OECD Public Governance Reviews, OECD Publishing, Paris, https://dx.doi.org/10.1787/9789264207561-en.
[174] OECD (2011), “Government at a Glance”.
[13] OECD (2011), “Typology and implementation of spending reviews”, OECD SBO Meeting on Performance and Results, http://www.oecd.org/officialdocuments/publicdisplaydocumentpdf/?cote=GOV/PGC/SBO(2011)9&doclanguage=en (accessed on 1 August 2019).
[48] OECD (2010), “DAC Guidelines and Reference Series: Quality Standards for Development Evaluation”, https://www.oecd.org/development/evaluation/qualitystandards.pdf (accessed on 9 July 2019).
[95] OECD (2010), Good Practices in Supporting Supreme Audit Institutions, http://www.oecd.org/dac/effectiveness/Final%20SAI%20Good%20Practice%20Note.pdf.
[19] OECD (2008), Performance Budgeting: A Users’ Guide, https://www.oecd.org/gov/budgeting/Performance-Budgeting-Guide.pdf (accessed on 2 August 2019).
[106] OECD (forthcoming), Ensuring the Good Governance of Evidence, taking stock of standards for policy design, implementation and evaluation, OECD, Paris.
[84] OECD-DAC (2009), “Guidelines for Project and Programme Evaluations”, https://www.entwicklung.at/fileadmin/user_upload/Dokumente/Projektabwicklung/Englisch/Guidelines_for_Project_and_Progamme_Evaluations.PDF (accessed on 20 September 2019).
[26] OECD-DAC (2002), Glossary of Key Terms in Evaluation and Results Based Management, https://www.oecd.org/dac/evaluation/18074294.pdf (accessed on 18 June 2019).
[137] Office fédéral de la justice (2005), Guide for the evaluation of the efficacy of the Confederation (Guide de l’évaluation de l’efficacité à la Confédération), http://www.ofj.admin.ch/ejpd/fr/home/themen/staat_und_buerger/ref_evaluation/ref_umsetzung_art.html (accessed on 27 August 2019).
[86] Office of Management and Budget (2020), MB-20-12 Program Evaluation Standards and Practices, https://www.whitehouse.gov/wp-content/uploads/2020/03/M-20-12.pdf).
[136] Office of Management and Budget (2018), Monitoring and Evaluation Guidelines for Federal Departments and Agencies that Administer United States Foreign Assistance, https://www.whitehouse.gov/wp-content/uploads/2017/11/M-18-04-Final.pdf (accessed on 27 August 2019).
[182] Olejniczak, K. and E. Raimondo (2016), “Evaluation units as knowledge brokers: Testing and calibrating an innovative framework”, Evaluation, Vol. 22/2, pp. 168-189, http://dx.doi.org/10.1177/1356389016638752.
[64] Olejniczak, K., E. Raimondo and T. Kupiec (2016), “Evaluation units as knowledge brokers: Testing and calibrating an innovative framework”, Evaluation, Vol. 22/2, pp. 168-189, http://dx.doi.org/10.1177/1356389016638752.
[169] Oliver, K. et al. (2015), “Identifying public health policymakers’ sources of information: comparing survey and network analyses”, The European Journal of Public Health, Vol. 27/suppl_2, p. ckv083, http://dx.doi.org/10.1093/eurpub/ckv083.
[178] OMB (2010), Section 200 - Overview of the Federal Performance Framework.
[156] Operational and Evaluation Audit Division of Costa Rica (2014), Special audit report on the processes of monitoring, evaluation and public accountability (Informe de auditoría de carácter especial sobre los procesos de seguimiento, evaluación y rendición de cuentas pública), https://cgrfiles.cgr.go.cr/publico/jaguar/sad_docs/2015/DFOE-SAF-IF-09-2014-Recurrido.pdf (accessed on 22 August 2019).
[32] Parkhurst, J. (2017), The politics of evidence : from evidence-based policy to the good governance of evidence, Routledge, London, http://researchonline.lshtm.ac.uk/3298900/ (accessed on 23 November 2018).
[102] Patton, M. (1978), “Utilization-focused evaluation”.
[56] Picciotto, R. (2013), “Evaluation Independence in Organizations”, Journal of MultiDisciplinary Evaluation, Vol. 9/20, p. 15.
[120] Picciotto, R. (n.d.), The Value of Evaluation Standards: A Comparative Assessment, http://evaluation.wmich.edu/jmde/Articles.
[115] Pleger, L. and S. Hadorn (2018), “The big bad wolf’s view: The evaluation clients’ perspectives on independence of evaluations”, Evaluation, Vol. 24/4, pp. 456-474, http://dx.doi.org/10.1177/1356389018796004.
[138] Pleger, L. and S. Hadorn (2018), “The big bad wolf’s view: The evaluation clients’ perspectives on independence of evaluations”, Evaluation, Vol. 24/4, pp. 456-474, http://dx.doi.org/10.1177/1356389018796004.
[150] Podems, D. (2013), “Evaluator competencies and professionalizing the field: Where are we now?”, Canadian Journal of Program Evaluation, Vol. 28/3, pp. 127-136.
[33] Poder Ejecutivo Nacional (2018), Decreto 292/2018: Evaluación de Políticas y Programas Sociales, 11-04-2018, http://servicios.infoleg.gob.ar/infolegInternet/verNorma.do?id=308653 (accessed on 8 July 2019).
[22] Poder Ejecutivo Nacional de Argentina (2018), Decreto 292/2018: Evaluación de Políticas y Programas Sociales.
[145] Polish Ministry of Infrastructure and Development (2015), Guidelines of cohesion policy evaluation for period 2014-2020, http://www.ewaluacja.gov.pl/media/13209/wytyczne_090915_final.pdf (accessed on 22 August 2019).
[185] Punton, M. et al. (2016), How can capacity development promote evidence-informed policy making? Literature Review for the Building Capacity to Use Research Evidence (BCURE) Programme, http://www.itad.com/knowledge-and-resources/bcure (accessed on 6 September 2019).
[166] Results for America (2017), Government Mechanisms to Advance the Use of Data and Evidence in Policymaking: A Landscape Review.
[193] Results for America (2017), Government Mechanisms to Advance the Use of Data and Evidence in Policymaking: A Landscape Review.
[114] Picciotto, R. (2013), “Evaluation Independence in Organizations”, Journal of MultiDisciplinary Evaluation, Vol. 9/20, p. 15.
[187] Robinson, M. (2014), Spending reviews, http://www.pfmresults.com. (accessed on 25 June 2019).
[79] Roh, J. (2018), “Improving the government performance management system in South Korea”, Asian Education and Development Studies, Vol. 7/3, pp. 266-278, http://dx.doi.org/10.1108/AEDS-11-2017-0112.
[52] Rutter, J. (2012), Evidence and Evaluation in Policy Making, Institute for Government, https://www.instituteforgovernment.org.uk/sites/default/files/publications/evidence%20and%20evaluation%20in%20template_final_0.pdf.
[62] Schillemans, T. and M. Bovens (2011), The Challenge of Multiple Accountability: Does Redundancy Lead to Overload?
[39] Schoenefeld, J. and A. Jordan (2017), “Governing policy evaluation? Towards a new typology”, Evaluation, Vol. 23/3, pp. 274-293, http://dx.doi.org/10.1177/1356389017715366.
[139] Scriven, M. (1969), “An introduction to meta-evaluation”, Educational Product Report, Vol. 2.
[91] Secretaría de Desarrollo Social (2015), Decree by which the Council of Social Development Policy Evaluation is regulated.
[108] Shaxson, L. (2019), “Uncovering the practices of evidence-informed policy-making”, Public Money & Management, Vol. 39/1, pp. 46-55, http://dx.doi.org/10.1080/09540962.2019.1537705.
[113] Sinatra, G., D. Kienhues and B. Hofer (2014), “Addressing Challenges to Public Understanding of Science: Epistemic Cognition, Motivated Reasoning, and Conceptual Change”, Educational Psychologist, Vol. 49/2, pp. 123-138, http://dx.doi.org/10.1080/00461520.2014.916216.
[14] Smismans, S. (2015), “Policy Evaluation in the EU: The Challenges of Linking Ex Ante and Ex Post Appraisal”, Symposium on Policy Evaluation in the EU, http://dx.doi.org/10.1017/S1867299X00004244.
[34] Smismans, S. (2015), “Policy Evaluation in the EU: The Challenges of Linking Ex Ante and Ex Post Appraisal”, Symposium on Policy Evaluation in the EU, http://dx.doi.org/10.1017/S1867299X00004244.
[101] Stern, E., M. Saunders and N. Stame (2015), “Standing back and looking forward: Editors’ reflections on the 20th Anniversary of Evaluation”, Evaluation, Vol. 21/4, pp. 380-390, http://dx.doi.org/10.1177/1356389015608757.
[46] Steuerle, E. and L. Jackson (2016), Advancing the power of economic evidence to inform investments in children, youth, and families, National Academies Press, http://dx.doi.org/10.17226/23481.
[147] Stevahn, L. et al. (2005), “Establishing Essential Competencies for Program Evaluators”, American Journal of Evaluation, http://dx.doi.org/10.1177/1098214004273180.
[143] Stufflebeam, D. (2001), “Evaluation Checklists: Practical Tools for Guiding and Judging Evaluations”, http://www.wmich.edu/evalctr/checklists/.
[141] Stufflebeam, D. (1978), “Meta evaluation: an overview”, Evaluation and The Health Professions, Vol. 1/1, https://journals.sagepub.com/doi/pdf/10.1177/016327877800100102 (accessed on 19 August 2019).
[179] Superu (2018), Making sense of evidence: A guide to using evidence in policy, https://thehub.sia.govt.nz/assets/Uploads/Making-Sense-of-Evidence-handbook-FINAL.pdf (accessed on 3 September 2019).
[189] Swiss Federal Audit Office (2019), Involving Stakeholders in Evaluation at the Swiss Federal Audit Office, http://www.program-evaluation.ccomptes.fr/images/stories/evenements/Vilnius_2019/Presentation_Switzerland_Stakeholder_Involvement_WGEPPP_2019.pdf (accessed on 6 September 2019).
[74] The Cabinet Secretariat (2019), The status-quo about the promotion of statistics reform, http://www.kantei.go.jp/jp/singi/toukeikaikaku/dai5/siryou1.pdf (accessed on 2 September 2019).
[73] The Committee on Promoting EBPM (2017), Guidelines on securing and developing human resources for the promotion of EBPM, https://www.gyoukaku.go.jp/ebpm/img/guideline1.pdf (accessed on 2 September 2019).
[36] The European Network for Rural Development (2014), Guidelines for the Ex Ante Evaluation of 2014-2020 RDPs, https://enrd.ec.europa.eu/evaluation/publications/guidelines-ex-ante-evaluation-2014-2020-rdps_en (accessed on 29 January 2020).
[78] The Ministry of Internal Affairs and Communications (2005), Policy Evaluation Implementation Guidelines, https://www.soumu.go.jp/main_content/000556222.pdf.
[77] The Ministry of Internal Affairs and Communications (2010), “Guidelines for Publication of Information on Policy Evaluation”, http://www.soumu.go.jp/main_content/000556224.pdf (accessed on 16 September 2019).
[72] The Statistical Reform Promotion Council (2017), The final report of the Statistical Reform Promotion Council. (In Japanese), http://www.kantei.go.jp/jp/singi/toukeikaikaku/pdf/saishu_honbun.pdf (accessed on 2 September 2019).
[15] The World Bank (2018), Spending Review Manual: Bulgaria.
[87] Treasury Board Secretariat (2019), Evaluation in the Government of Canada, https://www.canada.ca/en/treasury-board-secretariat/services/audit-evaluation/evaluation-government-canada.html (accessed on 20 September 2019).
[88] Treasury Board Secretariat (2013), Assessing program resource utilization when evaluating federal programs, https://www.canada.ca/en/treasury-board-secretariat/services/audit-evaluation/centre-excellence-evaluation/assessing-program-resource-utilization-evaluating-federal-programs.html (accessed on 2 August 2019).
[89] Treasury Board Secretariat (2010), Supporting Effective Evaluations: A Guide to Developing Performance Measurement Strategies, https://www.canada.ca/en/treasury-board-secretariat/services/audit-evaluation/centre-excellence-evaluation/guide-developing-performance-measurement-strategies.html (accessed on 2 August 2019).
[121] United Nations Evaluation Group (2016), Norms and Standards for Evaluation.
[75] United States Office of Management and Budget (2019), Phase 1 Implementation of the Foundations for Evidence-Based Policymaking Act of 2018: Learning Agendas, Personnel, and Planning Guidance, https://www.whitehouse.gov/wp-content/uploads/2019/07/M-19-23.pdf.
[104] Vaessen, J. (2018), “Five ways to think about quality in evaluation”, blog post, https://www.linkedin.com/pulse/new-blogpost-five-ways-think-quality-evaluation-jos-vaessen (accessed on 21 June 2019).
[44] Vammalle, C. and A. Ruiz Rivadeneira (2017), “Budgeting in Chile”, OECD Journal on Budgeting, Vol. 16/3, https://dx.doi.org/10.1787/budget-16-5jfw22b3c0r3.
[71] van Ooijen, C., B. Ubaldi and B. Welby (2019), “A data-driven public sector: Enabling the strategic use of data for productive, inclusive and trustworthy governance”, OECD Working Papers on Public Governance, No. 33, OECD Publishing, Paris, https://dx.doi.org/10.1787/09ab162c-en.
[66] Viñuela, L., D. Ortega and F. Gomes (2015), Technical Note: Mechanisms and Incentives for the Adoption of Evaluation of Policies and Programs to Improve the Efficiency of Public Expenditure.
[16] Walker, K. and K. Moore (2011), Performance Management and Evaluation: What’s the difference?, https://www.childtrends.org/wp-content/uploads/2013/06/2011-02PerformMgmt.pdf (accessed on 18 July 2019).
[40] Weiss, C. (1993), “Where politics and evaluation research meet”, Evaluation Practice, Vol. 14/1, pp. 93-106, http://dx.doi.org/10.1016/0886-1633(93)90046-R.
[160] Weiss, C. (1998), “Have We Learned Anything New About the Use of Evaluation?”, American Journal of Evaluation, Vol. 19/1, pp. 21-33.
[124] World Bank et al. (2019), World Bank Group Evaluation Principles, http://www.worldbank.org.
[80] Yang, S. and A. Torneo (2016), “Government Performance Management and Evaluation in South Korea: History and Current Practices”, Public Performance & Management Review, Vol. 39/2, pp. 279-296, http://dx.doi.org/10.1080/15309576.2015.1108767.
[70] Zida, A. et al. (2017), “Evaluating the Process and Extent of Institutionalization: A Case Study of a Rapid Response Unit for Health Policy in Burkina Faso”, International Journal of Health Policy and Management, Vol. 7/1, pp. 15-26, http://dx.doi.org/10.15171/ijhpm.2017.39.
Notes
1. See A data-driven public sector: Enabling the strategic use of data for productive, inclusive and trustworthy governance (Working paper) [GOV/PGC(2019)57].
2. Eleven evaluations from the National Agency of Evaluations (ANE), the Ministry of Health, and academia, and their terms of reference.
3. See the French Parliament website: www.ausimplementationconference.net.au/