This chapter examines challenges identified in the analysed use cases and in broader research and analysis. It finds that many government AI initiatives remain at the pilot stage. It highlights common system-wide barriers – skills gaps, difficulties accessing and sharing high-quality data, limited actionable guidance, risk aversion, and weak measurement of results and return on investment. Other challenges vary by function, including inflexible or ambiguous regulation, high or uncertain costs, and outdated legacy information technology systems.
Governing with Artificial Intelligence
3. Implementation challenges that hinder the strategic use of AI in government
Key messages
The OECD’s review of AI use cases in government indicates a high presence of early-stage initiatives, such as experiments and pilots. This indicates:
Possible challenges in transitioning from experimentation to implementation
A need for increased monitoring and sharing of information
A possible need for policy actions to encourage implementation and scaling.
Implementation challenges that are shared across all government functions are:
Skills gaps
Challenges to obtaining or sharing quality data
A lack of actionable frameworks and guidance on AI usage, including for specific policy areas
Risk aversion
Demonstrating results and return on investment (ROI).
Government functions face a variety of other challenges, some more prevalent in certain functions than in others:
Inflexible or outdated legal and regulatory environments
High or uncertain costs of AI adoption and scaling
Outdated legacy information technology (IT) systems.
Governments face a variety of implementation challenges in adopting AI. Some of these challenges span all functions of government, while others appear more acute in certain ones. For instance, nearly all areas struggle with skills gaps and with accessing and sharing quality data. AI implementation should account for distinct regulatory landscapes in different functions, with compliance requirements varying significantly between functions such as public procurement, law enforcement and tax administration. Additionally, some functions are more challenged than others in securing funding or in coping with outdated systems, as digital infrastructure that shares services with greater interoperability and integration is only now emerging. These challenges can translate into broader issues, such as difficulties in scaling up solutions and a risk aversion that hinders innovation. This chapter discusses the various implementation challenges faced by functions of government, which are discussed further in relation to each function in Chapter 5. Many of these challenges mirror regional analyses of AI in government (OECD/CAF, 2022[1]; Brizuela et al., 2025[2]).
Most government AI efforts exist in exploratory or pilot phases, with limited scaling and documentation
Possible challenges in moving from experimentation to implementation
The OECD’s review of AI uses in government indicates a high presence of early-stage initiatives, such as experiments and pilots. This is consistent with discussions held among relevant OECD working parties and networks, in which government officials at both national and local levels report being in the early stages of using AI in government, seeking to learn by starting small and testing different approaches.
Overall, this approach is a good one. The OECD has long encouraged governments at both national and local levels to experiment with new approaches in a controlled and iterative manner to minimise risks and costs and to ensure that failures — which are inevitable — occur quickly and generate lessons learned for future efforts (OECD, 2017[3]; 2024[4]). This is critical, especially when first adopting AI, as some estimates suggest that more than 80% of AI projects fail, double the rate of non-AI projects (Ryseff, De Bruhl and Newberry, 2024[5]). For instance, a successful path to AI adoption can involve incorporating AI into low-risk areas and processes and using internally generated or open data to demonstrate value and establish quick wins. An example is the Finnish Government Shared Services Centre for Finance and HR (Palkeet), which began with modest applications of robotic process automation (RPA), paving the way towards “hyper-automation” with machine learning (ML) (see Chapter 5, “AI in public financial management”).
Yet the end goal of most AI projects is eventual implementation and, as appropriate, the scaling up of successful solutions. In many countries and functions of government, AI use cases largely exist in the exploratory phase (e.g. proofs of concept, pilot projects) and have not yet been more broadly implemented or scaled beyond limited use. For instance, in the United Kingdom (UK) — generally one of the more advanced governments when it comes to using AI — a report by the Parliament’s Public Accounts Committee (PAC) (2025[6]) found the government had “no systematic mechanism for bringing together learning from pilots and there are few examples of successful at-scale adoption across government”. The results of a Deloitte survey (2024[7]) in 14 countries reinforce challenges in scaling GenAI in particular, noting that this challenge is not unique to government. According to the survey, “a significant number of both commercial and government respondents have transitioned fewer than one-third of their GenAI experiments into full production”. This is attributed to other challenges discussed in this chapter: a lack of expertise and difficulty in measuring mission value from GenAI.
Existing data sources provide insights about the current state of AI adoption in government. For example, most of the use cases analysed for this report have moved beyond small pilots to be more fully implemented in some way. Of these implemented cases, most have not scaled beyond their original contexts (e.g. certain offices or processes) to address other needs. Scaling is not always a goal, however, and AI uses suited to some government functions are often not appropriate for others. The AI use cases in public service design and delivery in Chapter 5 demonstrate, nonetheless, that successful approaches can indeed be scaled up. In this report, the preponderance of cases implemented beyond pilots is partly due to a tendency for the OECD to select more fully implemented use cases, as more public information is available about them and governments are somewhat more likely to report on them in OECD data collection exercises.
Other data sources offer complementary perspectives. The Public Sector Tech Watch observatory of the European Commission (EC) has systematically collected AI use cases for several years. Its data on nearly 1 500 AI use cases indicate that most AI solutions are still in the planned, pilot or in-development phase (58%), suggesting that, across the EU public sector, the majority of cases remain experimental or not fully implemented (Figure 3.1). Although moving from pilots to production appears to be a challenge — as reinforced in OECD discussions with governments — the proportion of implemented projects has increased in the latest data collections. This suggests that administrations may be transitioning their initiatives from initial testing to fuller implementation (EC, 2024[8]). These data do not consider the extent to which implemented projects have scaled beyond their initial context.
Figure 3.1. Most European Union (EU) AI use cases are in pilot or development phases
The “AI Systems in the Public Sector in Latin America and the Caribbean” database indicates a much higher proportion of implemented use cases (70%) (Muñoz-Cadena et al., 2025[10]). However, this is likely due to the data collection approach for this database, which was built by researchers using publicly available information. Publicly available information may be less likely to include details on planned or piloted use cases than submissions from governments, which largely informed the OECD and EU databases.
Overall, based on the OECD’s work on AI in government since 2019, including directly with many governments, the state of implementation seems lower than would be expected relative to early-stage testing. The high presence of early-stage cases would be easily explained if government use cases placed heavy emphasis on leveraging the latest AI technologies, such as generative AI foundation models. However, OECD research for this report indicates this is largely not the case; governments still tend toward more longstanding approaches. These observations suggest that further work in this area is warranted to gain additional understanding of governments’ journeys from planning to implementation and scale-up, and to derive more specific lessons learned and factors for success.
As discussed in Chapter 2, it is possible that some narrow but traditional applications of AI — which may be more likely to be implemented or even scaled up — have become so integrated or commonplace that they no longer trigger external reporting or a response to data collection efforts. This could skew the numbers through underrepresentation of implemented initiatives that do not rely on newer AI systems and approaches.
Lack of evidence on continuity calls for increased monitoring
Additional information is required to gain a better understanding of the status and evolution of government AI use cases. Future research should investigate this further by monitoring the progression of solution development over time, which could generate lessons from both successes and failures. The OECD sought to explore the progress of relevant AI use cases discussed in previous publications; however, there are generally few updated reports on the results of pilots or the status of implemented use cases, making it challenging to follow them longitudinally. Most of the research done to determine whether a use case is operational relies on the ability to access a functional product (if public), or on the presence of government press releases, news coverage, blog articles or public presentations. Another source is use cases shared through periodic public reports or official repositories, including those by the OECD. Current inventories or catalogues are often static; they depict projects as a snapshot in time, without providing insights into their development and evolution, and are often not updated. Primary data collection by researchers (i.e. directly surveying government organisations to identify new efforts and obtain updates and lessons learned on known initiatives) would be useful for further research, though it demands greater resources and time.
These challenges in accessing up-to-date information on individual use cases underscore the importance of governments’ monitoring of AI use cases, along with systematic and regular sharing of information, which would further reinforce the OECD AI Principles on transparency and accountability. This would not only be valuable for external audiences; documenting and disseminating successful (and unsuccessful) methods and use cases can help government organisations replicate and scale AI projects more effectively. This approach helps avoid common errors, ensures consistency and accelerates the adoption of AI technologies across government entities (OECD/UNESCO, 2024[11]). Inadequate or absent monitoring can also affect future use cases, as potentially effective AI innovations may be overlooked or, conversely, disproven approaches may be scaled up inappropriately. Further discussion and examples of how some governments are doing this can be found in Chapter 4, “Promoting transparency in how government uses AI”.
Policy actions may be needed to encourage implementation and scaling
The fact that many AI initiatives are in the planning and pilot phases, or are unclear regarding their progression, suggests governments need to enhance their implementation capabilities to advance projects beyond initial testing stages, secure successful deployment and sustain long-term impact (EC, 2024[8]). This involves establishing foundational elements, such as ensuring access to the datasets, computing resources and expertise required to develop and scale AI projects. Such factors are discussed in depth in Chapter 4. It also requires overcoming other implementation challenges, as discussed below.
Common challenges shared across core government functions
Skills gaps, the most common challenge
A recent survey in five countries from Salesforce (2024[12]) found a lack of internal skills for using AI to be the primary barrier to government AI adoption, with 60% of public sector respondents highlighting this challenge.1 Public sector respondents were a third more likely to indicate a skills gap in their organisation than the industry average. National-level reviews find comparable results, with 70% of UK government bodies reporting skills as a barrier to AI adoption (UK NAO, 2024[13]). In a National Trade Union Study of 2 000 Australian Public Service (APS) employees conducted from August to October 2024, 92% said they had received no training on using AI, and only 16% said they felt equipped to use the technology.2 The Australian Government released an AI in Government Fundamentals training module, made available to all APS employees in October 2024, along with a series of MasterClass sessions on AI run by practitioners. It also has a number of capability-building structures in place to enhance AI capability: the AI CoLab initiative provides a framework for cross-sector collaboration, co-design and regular events, and access to a government AI tool through a closed beta trial of GovAI commenced on 5 May 2025, with access expanded to all APS employees on 31 August 2025.
Skills gaps are also a significant challenge when focusing specifically on sub-national governments (UN Habitat, 2024[14]). For instance, United States (US) surveys of state Chief Information Officers and local-level IT executives found that only 20% of state CIOs and 25% of local respondents were even slightly confident their technology workforce possessed the expertise necessary to respond to the advent of generative AI (NASCIO, 2024[15]; PTI, 2024[16]). Beyond the commonly discussed challenge of governments competing with the private sector for talent, in some instances, national and sub-national governments are in competition with one another for the same limited talent pool. Smaller cities can also suffer from “brain drain”, with young talent moving to larger cities that provide more career possibilities (de Mello and Ter-Minassian, 2020[17]).
Skills gap challenges can be seen across nearly all functions of government discussed in Chapter 5. Skills challenges limit governments’ ability to take advantage of the latest developments in AI and can contribute to reluctance among public servants to accept the use of AI in general. In several functions, governments are struggling to determine exactly what kinds of skills are needed, and for whom.
Skills gaps can exacerbate other risks and challenges (Trajkovski, 2024[18]). For instance, they can lead to poor outcomes, overestimation of and misplaced trust in AI capabilities and systems, inadvertent misuse, and general non-compliance with laws and other rules. Inconsistent levels of skill across governments can lead to pockets of innovation, but with little ability to scale beyond them.
In addition, a lack of internal skills in public administrations can result in an overreliance on outsourcing through public procurement (Mitchell, 2025[19]; Autio, Communigs and Elliott, 2023[20]). While procurement is an important and normal aspect of obtaining AI-related goods and services, relying too heavily on procurement relative to building internal capacities can result in a hollowing-out of government capacities (Trajkovski, 2024[18]). This can form a vicious cycle, in which governments lack the right skills to design upskilling programmes, understand which skills to recruit for, and fully understand vendor offerings so as to procure the right goods and services at a fair price. Overall, without proactive skills development, “public agencies will find themselves merely reacting to technological shifts rather than steering these emerging technologies to serve societal interests effectively” (Trajkovski, 2024[18]). If government cannot demonstrate efficient, effective and self-controlled use of AI, it is unlikely to be able to regulate the technology. This also contributes to the challenge of high costs for AI adoption, discussed below; hiring contractors can cost three to four times as much per person as government employees (UK DSIT, 2025[21]).
Several governments have instituted upskilling and targeted recruitment programmes (see Chapter 4, “Fostering skills and talent”), with some even using AI as a tool to achieve these goals (see Chapter 5, “AI in civil service reform”).
Lack of high-quality data and the ability to share it
Across all levels of government and nearly every government function discussed in depth in Chapter 5, data challenges are an impediment to developing and using AI in government. Recent work by RAND (2024[5]; 2025[22]) found that data issues — including a lack of suitable data — were one of the main drivers of failed AI projects, and noted the importance of work often perceived to have low “activity prestige” (e.g. data cleaning).
For some functions, the needed data may simply not exist or may never have been digitised from paper (as is often seen in justice administration), or the quality of the available data is deficient (e.g. poorly structured, incomplete or mistyped records, discrepancies in data formats). This can arise for a variety of reasons, including poorly controlled data input processes or a previous lack of foresight that such data could someday be important. While time-consuming and burdensome, such quality issues can often be overcome through digitisation, data cleaning and validation processes.
The repeated emphasis on a lack of sufficient, quality data may seem counterintuitive, as governments hold tremendous amounts of data and often make them available as open government data (OGD) to, among other things, serve as inputs for training AI systems (OECD, 2023[23]). However, in many cases, it is the ability of government agencies to share data amongst themselves that is the challenge. This can be due to rules that prevent sharing or are unclear to public servants, months-long approval processes or, less commonly, governments signing away data reuse rights in contracts with companies. These issues also arise when sharing data between jurisdictions and levels of government, adding challenges for AI adoption in subnational governments. Some countries are grappling with how to comply with data protection and management rules, such as the EU General Data Protection Regulation (GDPR), though these data management issues generally existed long before such rules were put into force. In other cases, there is a lack of technical or policy protocols for sharing, or a lack of interoperability across IT systems and data formats. In most cases, these issues are symptoms of a more systemic problem: inadequate data governance and associated rules across government, resulting in non-strategic, sporadic and fragmented data collection and management. Antiquated or burdensome rules around data sharing also contribute and may need to be reconsidered to account for technological advancements while still protecting privacy.
Overall, it is critical that governments establish sound data governance and management activities to succeed in adopting AI — yet only 59% of OECD countries have a data strategy in place for the public sector, and even fewer provide actionable guidance for implementation (OECD, 2024[24]). Without strong data governance in place, governments risk developing and deploying AI systems that use poor-quality data, resulting in outcomes ranging from simple inaccuracies to systemic bias and unfair outcomes for citizens. Without robust data governance across organisations and levels of government, governments’ AI ambitions would largely need to be confined to the small experiments and pilots that are the norm today, as discussed above.
Fostering the development of data-driven public sectors has long been a focus of the OECD (2019[25]; 2023[23]), and a number of governments are putting in place the data foundations needed to reap AI’s benefits (see Chapter 4, “Creating a strong data foundation”).
Lack of actionable frameworks and guidance on AI usage
National strategies for AI in government — either dedicated strategies or those embedded in broader instruments — are now common, and they are important in defining a vision for AI success. However, they generally provide only high-level details on commitments and aspirations, offering limited concrete guidance on realising AI’s benefits while safeguarding against its risks. They also often fail to address key operational considerations that would make them effective. Investments and procurement, for instance, are often overlooked, despite being crucial for AI in government (van Noordt, Medaglia and Tangi, 2023[26]; Monteiro, Hlacs and Boéchat, 2024[27]). To bridge this gap, governments need actionable guidance that is aligned to strategies and provides their institutions with tangible direction and assurances. Guidance is also important for sub-national governments, such as cities. It can help ensure alignment with national approaches — as sub-national governments often follow or take inspiration from national efforts — as well as help sub-national governments meet their own digital and AI ambitions and the needs of their citizens and residents.
Such guidance can be either cross-cutting, addressing system-wide issues to promote trustworthy AI adoption, or vertical, focusing on specific government functions and applications. Cross-cutting guidance provides clarity and direction on fundamental elements such as data governance, talent development and investment. Vertical guidance, in contrast, helps public servants navigate AI’s opportunities and risks via means tailored to different policy domains. These approaches are not mutually exclusive and can be pursued in tandem.
Overall, there is a lack of concrete cross-cutting and vertical guidance for AI in government. For instance, an inquiry by the Australian Parliamentary Joint Committee of Public Accounts and Audit (2025[28]) — which does not necessarily reflect the views of the Australian government — found that while some government entities are starting to adopt AI, they lack guidance in doing so. It recommended that a whole-of-government working group be established to consider what rules and governance frameworks are necessary for AI systems across the public administration. There are notable exceptions, however, such as the “AI Playbook for the UK Government” (Chapter 4, Box 4.2) and France’s structured approach to integrating AI in HRM (Box 5.19). Governments need to move beyond statements of intent to providing clear, practical guidance on AI adoption, investment, data governance, procurement and workforce development (Morley et al., 2019[29]).
This gap is cited as a challenge in many functions of government in Chapter 5. Overall, a lack of guidance can contribute to risk aversion, as guidance can clarify uncertainty and reduce doubts among civil servants. The need for guidance has been raised specifically in the functions of regulatory design, public procurement, fighting corruption and promoting public integrity, and tax administration, to address legal and governance uncertainties that leave public servants unsure whether and how they can use AI in these functions. In some instances, guidance is needed to clearly interpret laws and regulations and their practical application. In others, it provides clarity in situations where formal rules may not yet exist. Given sectoral variations, cross-cutting AI approaches need to be complemented by specialised guidance accounting for unique policy challenges, risk profiles and data landscapes. Governments will also need to keep in mind that AI is a rapidly evolving field; such guidance should be flexible and will likely need adaptation and iteration to keep up with the pace of change.
Further discussion on government actions to overcome this challenge can be found in Chapter 4, “Establishing key governance mechanisms and processes” and “Using policy levers to guide trustworthy AI”.
Driving innovation while mitigating risks
Recent OECD work ([30]) found that when it comes to AI in government, governments tend to focus on the negatives (i.e. AI risks and how to mitigate them) rather than the positives (i.e. AI benefits and how to take advantage of them). This is not unique to governments’ approaches to internal AI activities. With regard to the broader economy and society, AI experts have found that government policy discussions and initiatives often recognise that AI may yield significant benefits, but government actions often do not explicitly target the achievement of benefits. Rather, they address them indirectly through positive spillover effects when seeking to mitigate risks (OECD, 2024[31]). These experts have urged governments to take more direct action to seize the opportunities presented by AI.
Much of this risk-oriented focus can be attributed to risk aversion, which has long hindered digital government and public sector innovation efforts, fostering a culture resistant to change in which failure is to be avoided at all costs, including with regard to AI (OECD, 2021[32]; 2017[3]; 2019[33]; Desouza, 2018[34]; SAS, 2020[35]; Richter, 2024[36]). A survey from Deloitte (2024[7]) showed that 63% of public sector leaders believed GenAI would erode the overall level of trust in national and global institutions. This caution has contributed to slower adoption of AI in government than in industry. Instances of risk aversion can be seen in several government functions in Chapter 5, especially in public procurement and in fighting corruption and promoting public integrity. For instance, integrity institutions can be risk-averse due to fear of making mistakes in the AI adoption process — with government guidance emphasising what not to do rather than providing actionable guidance on how to adopt AI in a trustworthy manner. Issues also arise as a result of over-correction in response to AI incidents, especially those covered in the media, such as instances in which government chatbots provided misinformation or were hacked (Hodges, 2024[37]; Fagan, 2024[38]). Risk aversion is also a common topic of conversation when discussing AI in government in meetings of relevant OECD working parties and fora.
Governments should pursue generally accepted best practices in AI development and use, such as considering the level of potential risk or impact that an AI system might have and developing tailored, commensurate measures to overcome potential adverse issues. Yet governments often seem to treat most AI efforts as if they were high-risk or high-impact, imposing exacting, cumbersome bureaucratic requirements across the board that are daunting to public servants seeking to innovate. This can be a sufficient disincentive to prevent exploration, as can be seen in a study from Deloitte (2023[39]), which analysed all policy initiatives in the OECD.AI Database of National Policies & Strategies.3 It found that risk-weighted policies, which aim to shift from one-size-fits-all approaches to a data-driven and risk-based approach, were rare, representing 2% of initiatives in the database. The problem is also recognised in the April 2025 policy from the US on “Accelerating Federal Use of AI through Innovation, Governance, and Public Trust” ([40]), which charges government agencies to “remove unnecessary and bureaucratic requirements that inhibit innovation and responsible adoption”.
Public servants can also develop “algorithmic aversion”, somewhat the opposite of the “automation bias” discussed in Chapter 1, in which “humans are reluctant to use algorithms despite their superior performance” (Cheng and Chouldechova, 2023[41]; Sunstein and Gaffe, 2024[42]), often after seeing mistakes in AI outputs. This suggests potential skills issues pertaining to the understanding of AI, its relative strengths and weaknesses, and how to use its outputs optimally. It also suggests a lack of confidence among public servants in their abilities for human-machine collaboration, as well as a lack of controlled environments for testing and safely experimenting with AI. These biases, which can distort perceptions of AI’s reliability, can be mitigated through structured interventions, such as training on AI’s strengths and limitations, as discussed further in Chapter 4 (Featherson, Shlonsky and Lewis, 2019[43]). Workers also need to feel they have a voice in the inputs used for an AI system, and to be able to use their professional judgement in how the outputs are used (Dietvorst, Simmons and Massey, 2018[44]; Cheng and Chouldechova, 2023[41]).
There is some evidence that risk aversion for AI in government may be receding as governments become more familiar with the technology. A recent study by Google Public Sector found that AI concerns among public IT leaders in the US over issues such as privacy and security are receding (Teale, 2025[45]). However, governments will need to be more active in overcoming this risk-oriented focus so as to better consider trade-offs and target opportunities. A variety of AI enablers and safeguards discussed in Chapter 4 can help shift from a culture of risk aversion towards more controlled adoption and informed risk management.
Demonstrating results and return on investment
Governments have made significant strides in implementing AI solutions across various public domains, demonstrating tangible benefits in efficiency, accuracy and service delivery. However, monitoring of progress and thorough retrospective evaluation of impact remain underdeveloped aspects of government AI implementation. While isolated cases of success are well documented, as seen below, comprehensive efforts to assess AI’s contribution to public value creation are often lacking. This can be seen in the case of the UK, where “only 8% of AI projects show measurable benefits and only 16% show forecast costs, making it difficult to assess these against a cost-benefits analysis” (UK DSIT, 2025[21]). Specific to generative AI, a survey from Deloitte (2024[7]) in 14 countries shows that, despite anticipating increasing AI investments, 78% of government leaders surveyed report struggling to measure impacts from GenAI — significantly higher than in other sectors. This poses a barrier to AI adoption and scaling even when other challenges, such as talent gaps, are resolved.
A handful of AI solutions have demonstrated concrete, measurable results that illustrate the technology's potential to transform government delivery. These quantifiable results provide valuable benchmarks for understanding AI's direct impact on operational efficiency and service quality:
Peru’s Amauta Pro AI system has transformed the speed at which courts can respond to victims of domestic violence. This AI-powered system has reduced the time needed to draft resolutions for protection measures from a lengthy three hours to 40 seconds (see Box 5.63).
In the EU, the DATACROS project developed a tool to detect anomalies in corporate ownership structures that may indicate risks of corruption, money laundering and other financial crimes. In 2021, the predictive tool correctly identified 83% of companies targeted by sanctions and 88% of companies with sanctioned owners (see Box 5.27).
The US Federal Emergency Management Agency (FEMA) developed an AI system to assess structural damage across areas affected by Hurricane Ian, which reduced the number of structures requiring human review from over 1 million to just 77 000. Within 72 hours of the hurricane’s landfall in 2022, FEMA had insights into the extent of damage across affected regions, enabling faster resource allocation and recovery planning (see Box 5.58).
Particularly notable are cases where AI systems have been explicitly compared against human performance, highlighting significant improvements in speed, scale and resource utilisation that exceed human capability. In Singapore, government agencies transformed hiring with AI tools available on the market, enabling one agency to efficiently process over 3 000 applications for its Management Associate Program, saving EUR 44 000 (equivalent) and over 150 days of staff productivity.4 Comparisons with human performance are important because they provide the key counterfactual needed for evidence-based decision-making. Further, they push for a deeper understanding of human performance, making it possible to unveil implicit assumptions and biases that affect human delivery.
Beyond individual use cases, some governments have begun documenting AI's impact at organisational and national levels, revealing substantial financial benefits and operational improvements. These broader assessments help establish the cumulative value of AI investments across government functions. The Australian Taxation Office, for instance, reported that their AI approach combining real-time analytics, pre-filled forms, and anomaly detection systems helped protect approximately AUD 78.9 million in revenue across over 636 000 interactions with users in 2023-2024 (Box 5.5). Similarly, substantial results were observed in Austria with the activity of the Federal Ministry of Finance’s Predictive Analytics Competence Centre (PACC), which made it possible to analyse 6.5 million cases across income, corporate and value-added tax sectors as well as customs transactions in 2023 (Box 5.3). These analyses detected instances of false reporting in employee tax assessments and identified fraudulent activities, resulting in additional tax revenues of approximately EUR 185 million. Looking to the future, a recent study by The Alan Turing Institute (2024[46]) on UK public services found AI could help automate 84% of the central government’s service-related transactions, saving an equivalent of approximately 1 200 person-years of work every year.
Despite these successes, such assessments are rare, and governments face significant challenges in systematically monitoring AI’s progress and impact. One key barrier is the lack of well-defined measurement and evaluation frameworks that can assess AI’s contributions in a standardised manner. Many AI applications are integrated into complex administrative processes, making it difficult to isolate and measure their specific effects. Additionally, the challenge of benchmarking AI against human performance is compounded by the fact that many AI-enabled tasks would be infeasible or prohibitively time-consuming absent automation. There is also limited understanding of the long-term impact of LLM use on human cognition, and whether consistent use affects the creativity, critical thinking skills and productivity of those who use them.
A final consideration is that different contexts may call for different methodologies. For instance, a theme of discussion at the latest OECD (2024[4]) Roundtable on Smart Cities was that cities need to explore different methodologies for measuring and evaluating success that align with their own objectives, allowing them to set measurable goals. Some initial government efforts to address these problems are emerging, such as the UK government publication on best-practice methods for evaluating the impact of AI (Frontier Economics, 2024[47]). The US (2025[48]) has also recently issued an AI acquisition policy recognising that government agencies need to be “safeguarding taxpayer dollars by tracking AI performance and managing risks”. Without robust monitoring mechanisms, governments risk misjudging AI’s value and potential risks, and missing opportunities for improvement.
Establishing effective impact measurement frameworks is crucial for ensuring AI investments deliver real value to public administrations and citizens. As governments allocate increasing resources to AI development and deployment, demonstrating a clear return on investment (ROI) will become imperative. Reliable retrospective impact assessment mechanisms can help policymakers make informed decisions about scaling AI solutions, optimising resource allocation and justifying further funding. Furthermore, impact assessment provides essential feedback for refining AI systems and approaches, enabling continuous improvement cycles. Documented outputs also facilitate knowledge sharing across government entities, helping to scale successful approaches and avoid repeating unsuccessful ones. Perhaps most importantly, transparent reporting on AI’s impacts — both positive and negative — is essential for maintaining public trust and accountability as these technologies become more deeply embedded in core government functions and activities. Different evaluation methods are appropriate for different contexts, but governments should try to compare the implementation of AI against the counterfactual of its absence. The OECD has produced guidance on choosing an evaluation approach based on a variety of key considerations (Varazzani et al., 2023[49]).
Challenges that are somewhat less common or vary among government functions
Inflexible or outdated legal and regulatory environments
Inflexible, outdated or otherwise inadequate (e.g. excessive or lacking) regulatory environments pose many challenges. Many functions face regulatory or legal restrictions on data access and sharing, as discussed above. Beyond this, there can be confusion about AI accuracy and whether inadvertent errors introduced through the use of AI could lead to non-compliance with regulations and other rules, such as in fiscal reporting. Complexity in regulations is also a factor. For instance, tax administration officials face highly complex laws around tax processes, leading them to rely largely on classic rules-based approaches. These challenges are as common at the local level as they are in national governments (OECD, 2024[4]).
Sometimes the challenge is not issues with existing regulation, but gaps in regulation that lead to confusion over what is acceptable with AI. This confusion can contribute to other challenges, such as risk aversion or a preference for maintaining the existing state of affairs (Samuelson and Zeckhauser, 1988[50]). For instance, because the issue is not specifically addressed in many countries, public procurement officials are often unclear on whether AI can be used in procurement processes, fearing that doing so could expose them to challenges from unsuccessful bidders or others who question the fairness of the process. This creates a general disincentive for change. Confusion also exists around whether advanced AI systems, which are often highly capable but function in an opaque manner, can meet regulatory standards, such as International Standards on Auditing or evidentiary rules in the criminal justice system. As a result, people may continue to operate without AI, foregoing these risks but also the benefits of AI use.
Regulatory environments pose a unique challenge for regulatory design and delivery. Beyond rules that restrict AI use, regulators should be cautious and avoid making frequent changes to regulations and to how they are implemented and enforced. Regulated entities need a level of clarity and predictability so that they can comply with regulations in a manner that causes minimal disruption to business operations. Frequent regulatory shifts, even if based on quality AI-informed insights, can lead to a volatile regulatory environment, making it difficult for businesses to plan long-term strategies and for the public to stay informed about current laws.
Governments can overcome these challenges by ensuring regulations and other formal rules are up to date, agile and minimally ambiguous. There is some evidence that governments are moving in this direction, with Deloitte (2023[39]) finding adaptive regulation — shifting from a “regulate and forget” mode to a responsive and interactive approach — to be one of the most common types of policies tracked by the OECD.AI Policy Observatory. Further guidance that helps contextualise regulations in particular functions of government is also important. Discussion of how governments are doing this can be found in Chapter 4, “Establishing key governance mechanisms and processes” and “Using policy levers to guide trustworthy AI”.
High or uncertain costs of AI adoption and scaling
While AI adoption has the potential to reduce costs through enhanced productivity and efficiency, many government organisations struggle to make the initial financial investments to begin their AI adoption journeys, or to scale up use cases that prove successful. These costs can range from licensing fees per employee for service-based AI offerings, such as ChatGPT or Microsoft Copilot, to extensive development, customisation and support costs for more tailored or in-house solutions (Shark, 2025[51]; Barrett and Greene, 2024[52]). In the UK, a survey of government officials by SAS (2025[53]) found cost and budget restrictions to be the main challenge (raised by 67% of respondents), closely followed by a lack of internal skills (63%). Despite the critical nature of funding for AI, the OECD (2024[24]) Digital Government Index (DGI) highlights that only 15% of OECD countries have an investment framework in place for public AI investments.
Chapter 5 cites financial challenges in adopting AI for several functions of government, including in regulatory design and delivery, public services, tax administration, fighting corruption and promoting public integrity, and civic participation. In some instances, financial challenges also relate to the costs of recruiting or procuring skilled talent, with skills gaps discussed as a separate challenge above. Functions like tax administration have also indicated the process for securing budgets in government is a challenge.
This challenge can contribute to the OECD finding that governments often seem stuck in exploratory and pilot phases, with limited scaling of successful solutions. For instance, tax authorities have told the OECD that conducting small pilots is inexpensive and easy, even with advanced systems obtained from the private sector. However, such costs can grow exponentially as AI offerings are implemented more broadly within organisations or scaled up across other parts of government. Costs are particularly high for purpose-built solutions, with a group of 10 countries focusing on AI in the Public Interest stating that “the barrier to scaling AI models has been assumed to be primarily the lack of availability and affordability of compute” (France Élysée, 2025[54]).
Governments need to recognise that underinvestment in technology increases long-term costs and total costs of ownership (UK DSIT, 2025[21]). Some governments are seeking to address these issues through targeted investments, as well as the provisioning of central services that help to ease the need for each agency to build or buy their own solutions. These can be seen in Chapter 4 under “Investing purposefully”, “Building out digital infrastructure” and “Creating spaces to experiment”. Some are also using open-source models or exploring smaller models that can be designed to respond to specific societal and community needs, requiring less computational power and data (France Élysée, 2025[54]).
Although the costs highlighted by governments and in Chapter 5 tend to be financial, governments should keep in mind that monetary costs are not the only costs that affect AI adoption and scaling. Psychological costs related to AI use can also affect the extent to which individuals use AI tools in their day-to-day work, even if the investments are made to make them available. These costs can include search costs — incurred when people searching for information encounter outdated information, unclear language or confusing requirements — or cognitive costs, the mental resources people expend understanding complex information (Shahab and Lades, 2021[55]). “Sludge audits” are structured behavioural assessments of a decision-making process that aim to identify, prevent and reduce unnecessary frictions and psychological costs that prevent people from taking actions they would otherwise take (OECD, 2024[56]). By conducting sludge audits on the use of AI tools, governments can understand and address the barriers to acceptance that may limit AI adoption and scaling.
Governments can lack clarity on how much AI development and use can or should cost
The challenges discussed above are most relevant when such costs are known. Governments have reported to the OECD that there is often uncertainty or confusion about how much the development or use of different types of AI systems could or should cost. This makes it difficult for public institutions to plan effectively and to evaluate vendor offerings when considering public procurement to source solutions. Gaining clarity on these costs can help ensure governments are prepared to adopt AI systems in a strategic and sustainable way. Yet the OECD could identify no research discussing costs for the development or use of different types of AI systems in government, suggesting a promising area for further research and analysis. Still, understanding costs in a broader sense, or from specific government AI projects, can help governments arrive at estimates for planning purposes. This section aims to take the first steps in helping governments do this, with the potential for more in-depth OECD work on this topic in the future.
The cost of adopting AI can vary significantly depending on the type of system and its scale of use. For instance, governments can pursue a variety of options in adopting AI, such as those touched on below.
Licensing private sector tools with fixed pricing per user or license
Companies offering AI tools often charge per user. This is the case for services such as Microsoft 365 Copilot, OpenAI’s ChatGPT or Anthropic’s Claude. Licenses for these services can be purchased at the enterprise level, with prices ranging between USD 30-100 per user per month. The enterprise version of Microsoft 365 Copilot costs USD 30 per user per month, for example.5 Thus, a government running a pilot project comparable to Australia’s six-month pilot of 7 600 staff across 60 agencies (2024[57]) could estimate costs of around USD 1.37 million for licenses alone; this does not include additions such as staff time for pilot administration and reporting results, or other overhead. For Australia’s pilot, the Australian Treasury (2025[58]) estimated the license fee could pay for itself for a mid-level government staffer if it freed 13 minutes of their time per week for higher-value tasks. ChatGPT Enterprise costs are not published on OpenAI’s website, although when asked, ChatGPT suggested a cost of USD 60-100 per user per month, varying based on volume and features. Anthropic’s Claude Enterprise Plan pricing also depends on business needs and characteristics. While prices are not available on Anthropic’s website, third-party websites estimate a cost of USD 60 per person per month.6
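As a minimal sketch of this arithmetic, the snippet below reproduces the license-only estimate for a pilot on the scale of Australia’s. The user count, duration and per-user price come from the paragraph above; staff time and other overhead are deliberately excluded, as noted.

```python
# License-only cost estimate for a Copilot-style pilot (figures from the text).
users = 7_600                # staff in Australia's pilot across 60 agencies
months = 6                   # pilot duration
price_per_user_month = 30.0  # USD, enterprise Microsoft 365 Copilot price

license_cost = users * months * price_per_user_month
print(f"Licenses alone: USD {license_cost:,.0f}")  # USD 1,368,000 (~1.37M)
```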
Using private sector generative AI systems with volume-based pricing, such as tokens
GenAI systems are often offered through volume-based pricing, where the volume is calculated through API traffic. Whereas provision through licenses is more relevant when public servants use AI tools directly (e.g. using Copilot to help draft documents), volume-based pricing is more relevant when governments build internal or public-facing services that interface with a proprietary model, or when they want to customise (i.e. fine-tune) the content the model considers or how it produces outputs. The main components of the price include:
Input tokens: tokens included in a prompt, such as instructions, context or data sent to the model.
Output tokens: tokens generated by the model in response to an input.
Training tokens: data (e.g. chunks of text) that an AI model learns from during training.
Models see data as tokens, not sentences or paragraphs. The cost of tokens depends on the company and may be based on the level of complexity and resources needed for an individual model. For instance, the costs for some commonly used models are shown in Table 3.1; a worked cost estimate follows the table.
Table 3.1. Costs for 1 million tokens on common Generative AI models (USD)
One million tokens represent approximately 750 000 words, 100 000 lines of code, 11 hours of transcribed audio speech, or 1 hour of transcribed video.
| Model | Input tokens | Output tokens | Training tokens (if applicable) |
|---|---|---|---|
| OpenAI GPT-4o | USD 2.50 | USD 10 | – |
| OpenAI GPT-4o (if fine-tuning to customise) | USD 3.75 | USD 15 | USD 25 |
| OpenAI GPT-3.5-turbo | USD 0.50 | USD 1.50 | USD 8 |
| Google Gemini 2.5 Pro | USD 1.25 | USD 10 | – |
| Google Gemini 2.0 Flash | USD 0.10 | USD 0.40 | – |
| Mistral Large 24.11 | USD 2 | USD 6 | USD 9 |
| Mistral NeMo | USD 0.15 | USD 0.15 | USD 1 |
Note: As of 10 April 2025. The inference costs for using a particular model or its equivalent tends to decrease over time (Stanford HAI, 2025[59]).
Source: https://openai.com/api/pricing, https://ai.google.dev/gemini-api/docs/pricing, https://mistral.ai/products/la-plateforme, https://help.openai.com/en/articles/4936856-what-are-tokens-and-how-to-count-them, https://prompt.16x.engineer/blog/code-to-tokens-conversion, https://blog.google/technology/ai/google-gemini-next-generation-model-february-2024.
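To illustrate how these prices translate into a budget, the sketch below estimates monthly API costs for a hypothetical public-facing chatbot using the GPT-4o prices in Table 3.1. The traffic figures (requests per day, tokens per request) are illustrative assumptions, not figures from this report.

```python
# Hypothetical monthly API cost for a government chatbot, using Table 3.1 prices.
PRICE_PER_M_INPUT = 2.50    # USD per 1M input tokens (GPT-4o)
PRICE_PER_M_OUTPUT = 10.00  # USD per 1M output tokens (GPT-4o)

requests_per_day = 5_000           # assumed traffic
input_tokens_per_request = 1_200   # assumed prompt + context size
output_tokens_per_request = 400    # assumed response size
days = 30

input_tokens = requests_per_day * input_tokens_per_request * days    # 180M
output_tokens = requests_per_day * output_tokens_per_request * days  # 60M

cost = (input_tokens / 1e6) * PRICE_PER_M_INPUT + \
       (output_tokens / 1e6) * PRICE_PER_M_OUTPUT
print(f"Estimated monthly cost: USD {cost:,.0f}")  # USD 1,050
```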
Further, some companies are offering foundation models through specific tiers dedicated to government agencies.7 These services seek to meet governments’ stringent security standards. They also aim to be tailored to the needs of governments, providing solutions that make it easier for them to manage their own security, privacy and compliance requirements, as well as enabling them to use the services for activities that may fall outside standard usage policies.
The experiences of one central government AI lab in pursuing this approach are discussed in Box 3.1. Proprietary models can also be used in concert with open-source models, as touched on below.
Box 3.1. Government AI lab’s experience using proprietary AI
Operations and expenses
A central government AI lab in one country follows a phased approach to exploring, piloting and scaling AI projects for use in the public sector by civil servants. Overall, there may be as many as 100 projects being considered, around 15 undergoing limited testing, and around five to seven accessible to real users for a pilot or full deployment.
The lab uses cloud hosting and AI from Azure OpenAI, Vertex AI (Google), and Amazon Web Services (AWS). It has a budget of around EUR 17.5 million. All work is conducted in-house. Most of its expenses are staff costs, including roughly:
15 full-time equivalent staff (FTE) for technical talent (seven for development and AI engineering, four for design and user research, four for cloud/infrastructure).
Six FTEs for delivery management, which is critical for ensuring technical talent can focus on technical challenges, while delivery managers focus on addressing challenges regarding policy and bureaucracy.
Six FTEs for impact analysts who use data science to study project results and impact.
Its largest deployed project has around 4 000 users, with around eight FTEs working on it. Other projects are smaller, with some having one or two FTEs. Overall, the lab’s products have around 10 000 monthly users. The total costs for AI cloud services, including tokens, are around EUR 3 500 per month.
Lessons learned
The first few projects are by far the most expensive and time-consuming, with significant investments in setting up cloud infrastructure and deployment templates that can be easily re-used for future projects. Deployments that took three weeks each for the first few projects now take 30 minutes.
Having robust cloud infrastructure in place is important, optimally shared among projects to promote synergies.
The lab considered the pros and cons of using proprietary AI models versus custom deployments of open-source models (e.g. Meta’s Llama). It determined volume-priced proprietary models to be more effective because civil servants tend to use the AI systems from 9:00-18:00. For a custom deployment, they would need to pay for GPU usage all day, even when the models are not being used. Overall, token-based pricing was less expensive for their needs. In addition, this approach allows the lab to spin-up new model instances more quickly and easily. For instance, it can deploy a new GPT-4.1 model in around five minutes, whereas custom deploying an open-source model could take weeks of infrastructure work.
Overall, the lab estimated it would have cost EUR 9 300 per month to self-host a Llama model, whereas it currently spends around EUR 3 500 per month for tokens (see the sketch following this box).
As usage of the lab’s AI tools has grown, the lab is reaching the limits of what cloud providers are willing to provide in terms of pay-as-you-go pricing. As their usage continues to increase, they face the choice of either 1) purchasing GPU capacity from the cloud providers, or 2) self-hosting open-source models and paying for GPU access directly (as mentioned in the previous bullet). For the lab, option 1 may be optimal because it still allows for rapid deployment.
The cost of developing AI, in terms of both technical resources (e.g. cloud services, tokens) and human resources, is decreasing rapidly. The lab is finding that it can increasingly use AI to build AI, potentially changing labour demands. The implications of this have yet to be determined.
Source: OECD interview with officials from an undisclosed country on 18 April 2025. The OECD is not publishing the name of the country or lab because of the preliminary nature of the estimates and analysis.
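As a rough sketch of the trade-off described in Box 3.1, the following estimates the usage level at which self-hosting would start to pay off. It assumes token fees scale linearly with usage while self-hosting is a flat monthly cost, which simplifies real cloud pricing.

```python
# Token-based vs. self-hosted cost comparison, using Box 3.1 figures.
token_cost_now = 3_500.0   # EUR/month, the lab's current token spending
self_host_cost = 9_300.0   # EUR/month, the lab's estimate for hosting Llama

# If token fees grow linearly with usage and self-hosting stays flat,
# self-hosting breaks even once usage reaches this multiple of today's:
break_even = self_host_cost / token_cost_now
print(f"Self-hosting breaks even at ~{break_even:.1f}x current usage")  # ~2.7x
```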
Developing narrow, purpose-built custom ML applications (either in-house or procured)
Narrow AI systems tailored for specific public-sector tasks can range from relatively small expenses to multi-million-dollar projects. These systems involve ML approaches developed for a specific use case, such as fraud detection, traffic optimisation or document classification. Simple pilots might be built for a few thousand dollars, whereas complex national systems can cost millions or tens of millions (USD), especially when scoped for defence applications (Barnett, 2020[60]). As one example, the South Australian government is piloting four AI-enabled cameras aimed at cutting traffic congestion by analysing congestion and adjusting traffic light cycles, at a cost of USD 218 000 (equivalent) (Jackson, 2025[61]).
However, costs vary widely depending on complexity and context, and additional costs may be incurred for data preparation, infrastructure, and ongoing monitoring and maintenance. Because approaches and associated costs vary significantly depending on the use case, it is difficult to provide estimates beyond these examples. Further analysis may be warranted to consider different aspects of such use cases and what different governments around the world have paid for them.
Developing systems using pre-trained open-source models
Compared to the custom development and training of a GenAI model, discussed below, pre-trained open-source AI models (such as Meta’s Llama models) can offer reduced costs, with governments still able to customise the model extensively to meet their needs.8 Open-source models can be self-hosted either on the cloud or on premises, offering governments greater control over their data and long-term cost efficiency. While leveraging a pre-trained open-source model can reduce training costs to those incurred for fine-tuning, other remaining costs can be significant. For instance, self-hosting eliminates recurrent fees for tokens or licenses, but it requires significant upfront investment in hardware and infrastructure, cloud resources, energy consumption, and maintenance and support.
Despite the higher initial investment compared to licence or volume-based pricing, some governments have found that self-hosting can be cost-effective at scale and can unlock use cases not feasible through commercial APIs (e.g. sensitive intelligence tasks, always-on local services or offline operations in critical infrastructure). For instance, Chinese Taipei invested USD 7.4 million to develop its own foundation model, the Trustworthy AI Dialogue Engine (TAIDE), which builds on Meta’s Llama open-source models (Creery, 2024[62]).9
Government use of the open-source platform Polis (Box 5.36) represents a narrower use of open-source AI than the TAIDE effort to develop a foundation model. One government organisation that built a customised, self-deployed version of Polis for a large-scale public engagement campaign, consisting of 33 regional and national Polis discussions with 30 000 participants, incurred overall costs of around EUR 422 500 over a 14-month period.10 These expenses consisted of EUR 195 000 for expert web development, cloud services and an outsourced user experience (UX) co-design sprint, and EUR 227 500 in staff time to integrate Polis into the organisation’s existing workflows and to coordinate and conduct civic engagement activities. The same organisation has since invested EUR 200 000 in further enhancing its Polis application, split 50/50 between technical and staff costs. This enhancement work included user interface (UI) design and implementation and other technical development, which have been open sourced for other Polis users. Overall, the organisation estimates that two full-time equivalent (FTE) staff are sufficient to manage the work, drawing on a mix of expertise (such as technical development, project management, UX design and digital participation).
Open-source models can also be used in conjunction with proprietary models. For instance, one of the virtual assistants discussed in Chapter 5 uses both Google’s Gemini 1.5 Flash and a Llama model from Meta, with the chat interface and orchestration of the two models developed in-house using open-source technologies.11 The system is being piloted with 18 000 users, with the main costs associated with the Gemini LLM and cloud web hosting for the chat application. While the LLM platform costs the government around EUR 18 000 per month, officials expect a substantial reduction as they learn to use the model more efficiently. The web hosting costs around EUR 2 300 per month. Overall, they estimate their costs at EUR 0.93-1.55 per user per month. Officials are in early discussions to scale the pilot to other departments. The development and coordination team consists of approximately 10 FTEs.
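As a quick consistency check, the reported per-user figure can be approximated from the monthly platform and hosting costs alone. The short sketch below is illustrative and assumes these are the only recurring costs (staff time, for instance, is excluded).

```python
# Approximating the pilot's per-user cost from its reported monthly figures.
# Assumes the LLM platform and web hosting are the only recurring costs.
llm_cost_eur = 18_000     # monthly LLM platform cost
hosting_cost_eur = 2_300  # monthly web hosting for the chat application
users = 18_000            # users in the pilot

cost_per_user = (llm_cost_eur + hosting_cost_eur) / users
print(f"EUR {cost_per_user:.2f} per user per month")  # EUR 1.13
```

The result of around EUR 1.13 falls within the reported range of EUR 0.93-1.55, with the spread presumably reflecting month-to-month variation in usage.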
Developing custom GenAI models built and trained from scratch
Developing custom-built and custom-trained GenAI models is generally the most expensive option (for comparable performance) due to high initial investment and operational complexity. The cost of training an LLM depends on model size (larger models with more parameters require more computational power and consume more energy), the quality and quantity of the training data (which influence the cost of data acquisition and curation), infrastructure choices (whether training occurs on-premises or in the cloud), and the efficiency of the training algorithms used.
AI companies often do not publicly disclose the training costs associated with their models, although researchers estimate that current popular models cost around USD 41-192 million to train (Stanford HAI, 2025[59]). With the cost of training state-of-the-art models increasing two- to threefold each year, some research estimates that training the largest models may cost over EUR 1 billion by 2027 (Cottier et al., 2024[63]). While these training costs can seem high, they can pale in comparison to the research and development investments, staff costs and data-gathering efforts needed to deliver the latest foundation models (Stanford HAI, 2025[59]). Leading AI companies also have other prerequisites for developing such models, such as deep technical talent and, often, strategic partnerships with other companies.
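A widely used back-of-the-envelope method for turning these cost drivers into a rough figure is the approximation that training a transformer model requires about six floating-point operations per parameter per training token. The sketch below applies this rule with purely illustrative hardware assumptions; it is not the estimation method used by the studies cited above, and real budgets also cover data acquisition, staff and failed training runs.

```python
# Rough training-cost estimate using the common approximation that training
# a transformer requires ~6 FLOPs per parameter per training token.
# All hardware and price figures below are illustrative assumptions.
params = 7e9    # a 7-billion-parameter model
tokens = 1e12   # 1 trillion training tokens
total_flops = 6 * params * tokens  # ~4.2e22 FLOPs

effective_flops_per_s = 3e14 * 0.4  # assumed per-GPU throughput x utilisation
usd_per_gpu_hour = 2.0              # assumed cloud price per GPU-hour

gpu_hours = total_flops / effective_flops_per_s / 3_600
print(f"~{gpu_hours:,.0f} GPU-hours, ~USD {gpu_hours * usd_per_gpu_hour:,.0f}")
# ~97,222 GPU-hours, ~USD 194,444 at these assumptions
```

Estimates of this order are broadly in line with the smaller national-language models discussed below, while the multi-million-dollar figures above correspond to frontier models with far more parameters trained on far more data.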
Yet governments do not necessarily need to build such extensive and powerful systems that seek to beat market competitors. Training government-funded LLMs from scratch can require fewer staff and financial resources, especially for models with fewer parameters or those seeking to maximise relevance for a specific country, region or language. For instance, OpenEuroLLM has a total budget of EUR 37.4 million, implying that a fraction of this sum will be dedicated to training its foundation model, with another fraction for staff (EC, 2025[64]).12 In another example, one European country custom-developed and trained an LLM in its national language, with the total cost, including personnel, coming in at around EUR 500 000, of which EUR 300 000 was dedicated to GPU usage.13 Another example is GPT-NL in the Netherlands, which is investing around EUR 13.5 million provided by the Netherlands Ministry of Economic Affairs and Climate Policy (EZK) to train a model (Government of the Netherlands, 2023[65]). Other efforts have been undertaken in Japan, Singapore, Spain, Sweden and the United Arab Emirates (Chavez, 2024[66]). The collaboratively developed BigScience Large Open-science Open-access Multilingual Language Model (BLOOM)14 involved significant contributions from government agencies. The project was primarily led by Hugging Face and the French National Centre for Scientific Research (CNRS), supported by a public compute grant on the French public supercomputer “Jean Zay”. The estimated cost of training is USD 2-5 million.
Outdated legacy information technology systems
Many governments’ AI ambitions are slow to materialise because of outdated legacy IT systems that are unsuitable for AI development or use, or inadequate for managing and exchanging large amounts of quality, interoperable data (Irani et al., 2023[67]). Such systems can result in significant missed opportunities. For example, in the UK alone, “taxpayer funded services from the NHS to local councils are missing out on GBP 45 billion in productivity savings — more than enough to pay for every primary school in the UK for a full year — because they are too often dependent on old and outdated technology” (UK DSIT, 2025[68]). The UK government (2025[21]) estimates that 28% of central government IT systems are outdated, a share reaching 70% in some organisations, and 57% of UK government officials surveyed by the software company SAS (2025[53]) cited legacy systems as a barrier to AI adoption. The issue has also been raised by the UK Committee of Public Accounts (2025[6]) as an impediment to the use of AI in government.
Chapter 5 discusses how outdated legacy technology affects AI adoption. For instance, the potential for AI in public financial management is limited by outdated financial management information systems in governments around the world, with such systems exceeding a decade in age in most OECD countries (Rivero del Paso et al., 2023[69]; OECD, 2024[70]). Despite the significance of the challenge, analysis of legacy technology’s adverse effects on AI adoption appears light in most countries and government functions. The previous paragraph details the scale of the challenge in the UK largely because most other governments have not conducted the analysis necessary to articulate the problem in comparable terms.
This challenge is intertwined with others, including the significant costs of funding the remediation of legacy systems. Outdated legacy technology also contributes to other challenges, such as data issues and an “overreliance on contractors sending costs rocketing”, including to maintain outdated systems, with “maintenance of legacy systems costing often three to four times that of modern alternatives” (UK DSIT, 2025[68]). These expenses could be better spent on innovation and modernisation efforts.
Governments are taking a variety of measures to modernise their systems to be more AI-ready. In a novel instance, the US Department of Defense is using AI to modernise legacy code (Harper, 2024[71]). More traditionally, some governments are providing targeted funding for modernisation efforts (see Chapter 4, “Funding AI and supporting coherent investments across government”).
To overcome the implementation challenges outlined in this chapter, as well as mitigate the risks outlined in Chapter 1, governments can turn to policy. Indeed, some governments are already doing so. The following chapter looks at policy measures governments can take — and actions some are already taking — to deliver AI that is trustworthy and to benefit from AI’s full potential.
References
[7] Austin, T. et al. (2024), A snapshot of how public sector leaders feel about generative AI, https://www2.deloitte.com/us/en/insights/industry/public-sector/ai-adoption-in-public-sector.html.
[57] Australia DTA (2024), Evaluation of whole-of-government trial into generative AI: Now available, https://www.dta.gov.au/blogs/evaluation-whole-government-trial-generative-ai-now-available.
[58] Australia Treasury (2025), Evaluation of a trial of generative AI (Copilot) in The Treasury, https://evaluation.treasury.gov.au/publications/evaluation-generative-artificial-intelligence.
[20] Autio, C., K. Cummings and B. Elliott (2023), A Snapshot of Artificial Intelligence Procurement Challenges, https://files.thegovlab.org/a-snapshot-of-ai-procurement-challenges-june2023.pdf.
[60] Barnett, J. (2020), JAIC awards biggest contract yet to buy AI for the battlefield, https://fedscoop.com/battlefield-ai-jaic-booz-allen.
[52] Barrett, K. and R. Greene (2024), The Future of AI For the Public Sector: The Challenges and Solutions, https://www.businessofgovernment.org/blog/future-ai-public-sector-challenges-and-solutions.
[2] Brizuela, A. et al. (2025), Analysis of the generative AI landscape in the European public sector, European Commission, https://op.europa.eu/s/z4XY.
[66] Chavez, P. (2024), Sovereign AI in a Hybrid World: National Strategies and Policy Responses, https://www.lawfaremedia.org/article/sovereign-ai-in-a-hybrid-world--national-strategies-and-policy-responses.
[41] Cheng, L. and A. Chouldechova (2023), “Overcoming Algorithm Aversion: A Comparison between Process and Outcome Control”, Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, pp. 1-27, https://doi.org/10.1145/3544548.3581253.
[63] Cottier, B. et al. (2024), The rising costs of training frontier AI models, https://arxiv.org/abs/2405.21015.
[62] Creery, J. (2024), Taiwan Builds Own AI Language Model to Counter China’s Influence, https://www.bloomberg.com/news/articles/2024-01-25/taiwan-builds-own-ai-language-model-to-counter-china-s-influence.
[17] de Mello, L. and T. Ter-Minassian (2020), “Digitalisation challenges and opportunities for subnational governments”, OECD Working Papers on Fiscal Federalism, No. 31, OECD Publishing, Paris, https://doi.org/10.1787/9582594a-en.
[34] Desouza, K. (2018), Delivering Artificial Intelligence in Government: Challenges and opportunities, https://www.businessofgovernment.org/sites/default/files/Delivering%20Artificial%20Intelligence%20in%20Government.pdf.
[44] Dietvorst, B., J. Simmons and C. Massey (2018), “Overcoming Algorithm Aversion: People Will Use Imperfect Algorithms If They Can (Even Slightly) Modify Them”, Management Science, Vol. 64/3, pp. 1155-1170, https://doi.org/10.1287/mnsc.2016.2643.
[64] EC (2025), A pioneering AI project awarded for opening Large Language Models to European languages, https://digital-strategy.ec.europa.eu/en/news/pioneering-ai-project-awarded-opening-large-language-models-european-languages.
[8] EC (2024), Adoption of AI, blockchain and other emerging technologies within the European public sector – A public sector Tech Watch report, Publications Office of the European Union, https://data.europa.eu/doi/10.2799/3438251.
[9] EC (2024), Public Sector Tech Watch latest dataset of selected cases, http://data.europa.eu/89h/e8e7bddd-8510-4936-9fa6-7e1b399cbd92 (accessed on 4 March 2025).
[38] Fagan, M. (2024), AI for the People: Use Cases for Government, https://www.hks.harvard.edu/sites/default/files/centers/mrcbg/working.papers/M-RCBG%20Working%20Paper%202024-02_AI%20for%20the%20People.pdf.
[43] Featherson, R., A. Shlonsky and C. Lewis (2019), “Interventions to Mitigate Bias in Social Work Decision-Making: A Systematic Review”, Research on Social Work Practice, Vol. 29/7, https://doi.org/10.1177/1049731518819160.
[54] France Élysée (2025), The Paris Charter on Artificial Intelligence in the Public Interest, https://www.elysee.fr/en/emmanuel-macron/2025/02/11/the-paris-charter-on-artificial-intelligence-in-the-public-interest.
[47] Frontier Economics (2024), Guidance on the impact of AI interventions, https://www.frontier-economics.com/uk/en/news-and-insights/news/news-article-i21121-analysing-the-impact-of-ai-interventions-in-government.
[65] Government of the Netherlands (2023), The Netherlands is building its own open language model GPT-NL, https://www.digitaleoverheid.nl/nieuws/nederland-bouwt-eigen-open-taalmodel-gpt-nl/.
[71] Harper, J. (2024), Pentagon using AI to modernize legacy code, https://defensescoop.com/2024/09/12/pentagon-artificial-intelligence-modernize-legacy-code-john-hale/.
[37] Hodges, D. (2024), Fumbles can’t kill the government’s AI appetite, https://www.themandarin.com.au/249756-red-faces-and-fumbles-cant-kill-governments-ai-appetite/.
[67] Irani, Z. et al. (2023), “The impact of legacy systems on digital transformation in European public administration: Lesson learned from a multi case analysis”, Government Information Quarterly, Vol. 40/1, p. 101784, https://doi.org/10.1016/j.giq.2022.101784.
[61] Jackson, B. (2025), South Australian drivers to be monitored by AI cameras, https://www.news.com.au/technology/south-australian-drivers-to-be-monitored-by-ai-cameras/news-story/8a63be3e80d4bba60735d58ec8c473db.
[39] Mariani, J., W. Eggers and P. Kishnani (2023), The AI regulations that aren’t being talked about, https://www2.deloitte.com/us/en/insights/industry/public-sector/ai-regulations-around-the-world.html.
[19] Mitchell, S. (2025), Skills gap in public sector IT fuels outsourcing reliance, https://itbrief.co.uk/story/skills-gap-in-public-sector-it-fuels-outsourcing-reliance.
[27] Monteiro, B., A. Hlacs and P. Boéchat (2024), “Public procurement for public sector innovation: Facilitating innovators’ access to innovation procurement”, OECD Working Papers on Public Governance, No. 80, OECD Publishing, Paris, https://doi.org/10.1787/9aad76b7-en.
[29] Morley, J. et al. (2019), “From What to How: An Initial Review of Publicly Available AI Ethics Tools, Methods and Research to Translate Principles into Practices”, Science and Engineering Ethics, Vol. 26/4, pp. 2141-2168, https://doi.org/10.1007/s11948-019-00165-5.
[10] Muñoz-Cadena, S. et al. (2025), Sistemas de IA en el sector público de América Latina y el Caribe (Versión V2), https://sistemaspublicos.tech/sistemas-de-ia-en-america-latina/ (accessed on 29 April 2025).
[15] NASCIO (2024), Generative Artificial Intelligence and its Impact on State Government IT Workforces, National Association of State Chief Information Officers, https://www.nascio.org/resource-center/resources/generative-artificial-intelligence-and-its-impact-on-state-government-it-workforces/.
[24] OECD (2024), “2023 OECD Digital Government Index: Results and key findings”, OECD Public Governance Policy Papers, No. 44, OECD Publishing, Paris, https://doi.org/10.1787/1a89ed5e-en.
[31] OECD (2024), “Assessing potential future artificial intelligence risks, benefits and policy imperatives”, OECD Artificial Intelligence Papers, No. 27, OECD Publishing, Paris, https://doi.org/10.1787/3f4e3dfb-en.
[70] OECD (2024), “Financial Management Information Systems in OECD countries”, OECD Papers on Budgeting, No. 2024/02, OECD Publishing, Paris, https://doi.org/10.1787/ce8367cd-en.
[56] OECD (2024), Fixing Frictions: ’Sludge audits’ around the world, OECD Publishing, https://doi.org/10.1787/14e1c5e8-en-fr.
[30] OECD (2024), “Governing with Artificial Intelligence: Are governments ready?”, OECD Artificial Intelligence Papers, No. 20, OECD Publishing, Paris, https://doi.org/10.1787/26324bc2-en.
[4] OECD (2024), Shaping smart cities of all sizes, OECD Publishing, https://www.oecd.org/content/dam/oecd/en/about/programmes/cfe/the-oecd-programme-on-smart-cities-and-inclusive-growth/Proceedings-4th-Roundtable-Smart-Cities-Inclusive-Growth.pdf/_jcr_content/renditions/original./Proceedings-4th-Roundtable-Smart-Cities-In.
[23] OECD (2023), “2023 OECD Open, Useful and Re-usable data (OURdata) Index: Results and key findings”, OECD Public Governance Policy Papers, No. 43, OECD Publishing, Paris, https://doi.org/10.1787/a37f51c3-en.
[32] OECD (2021), “The OECD Framework for digital talent and skills in the public sector”, OECD Working Papers on Public Governance, No. 45, OECD Publishing, Paris, https://doi.org/10.1787/4e7c3f58-en.
[25] OECD (2019), The Path to Becoming a Data-Driven Public Sector, OECD Digital Government Studies, OECD Publishing, Paris, https://doi.org/10.1787/059814a7-en.
[3] OECD (2017), Fostering Innovation in the Public Sector, OECD Publishing, Paris, https://doi.org/10.1787/9789264270879-en.
[1] OECD/CAF (2022), The Strategic and Responsible Use of Artificial Intelligence in the Public Sector of Latin America and the Caribbean, OECD Public Governance Reviews, OECD Publishing, Paris, https://doi.org/10.1787/1f334543-en.
[11] OECD/UNESCO (2024), G7 Toolkit for Artificial Intelligence in the Public Sector, OECD Publishing, Paris, https://doi.org/10.1787/421c1244-en.
[28] Parliament of Australia (2025), Report 510: Inquiry into the use and governance of artificial intelligence systems by public sector entities - ’Proceed with Caution’, https://parlinfo.aph.gov.au/parlInfo/download/committees/reportjnt/RB000567/toc_pdf/Report510Inquiryintotheuseandgovernanceofartificialintelligencesystemsbypublicsectorentities-'ProceedwithCaution'.pdf.
[16] PTI (2024), AI and City/County Government: Survey Results, Public Technology Institute, https://fusionlp.org/wp-content/uploads/2024/11/AI-Survey-City-and-County-Final-2024.pdf.
[36] Richter, A. (2024), Navigating Generative AI in Government, https://businessofgovernment.org/report/navigating-generative-ai-government.
[69] Rivero del Paso, L. et al. (2023), Digital Solutions Guidelines for Public Financial Management, https://www.imf.org/en/Publications/TNM/Issues/2023/10/06/Digital-Solutions-Guidelines-for-Public-Financial-Management-537781.
[5] Ryseff, J., B. De Bruhl and S. Newberry (2024), The Root Causes of Failure for Artificial Intelligence Projects and How They Can Succeed, RAND, https://www.rand.org/pubs/research_reports/RRA2680-1.html.
[22] Ryseff, J. and A. Narayanan (2025), Why AI Projects Fail, https://www.rand.org/pubs/presentations/PTA2680-1.html.
[12] Salesforce (2024), 6 in 10 IT Workers Report Shortage of AI Skills in the Public Sector, https://www.salesforce.com/news/stories/public-sector-ai-statistics/.
[50] Samuelson, W. and R. Zeckhauser (1988), “Status quo bias in decision making”, Journal of Risk and Uncertainty, Vol. 1, pp. 7-59, https://doi.org/10.1007/bf00055564.
[53] SAS (2025), Slow uptake of AI in government hindering strategic goals, new research finds, https://www.sas.com/en_gb/news/press-releases/2024/september/slow-uptake-of-ai-in-government-hindering-strategic-goals.html.
[35] SAS (2020), AI in government: The path to adoption and deployment, https://www.sas.com/en_sa/insights/articles/analytics/ai-in-government.html.
[55] Shahab, S. and L. Lades (2021), “Sludge and transaction costs”, Behavioural Public Policy, Vol. 1/22, https://doi.org/10.1017/bpp.2021.12.
[51] Shark, A. (2025), What the Rising Costs of AI Means for Government, https://statetechmagazine.com/article/2025/01/what-rising-costs-ai-means-government.
[59] Stanford HAI (2025), Artificial Intelligence Index Report 2025, https://hai-production.s3.amazonaws.com/files/hai_ai_index_report_2025.pdf.
[42] Sunstein, C. and J. Gaffe (2024), An Anatomy of Algorithm Aversion, Elsevier BV, https://doi.org/10.2139/ssrn.4865492.
[45] Teale, C. (2025), Public-sector concerns over AI are lessening, survey says, https://www.route-fifty.com/artificial-intelligence/2025/02/public-sector-concerns-over-ai-are-lessening-survey-says/403328.
[46] The Alan Turing Institute (2024), AI for bureaucratic productivity: Measuring the potential of AI to help automate 143 million UK government transactions, https://www.turing.ac.uk/news/publications/ai-bureaucratic-productivity-measuring-potential-ai-help-automate-143-million-uk.
[18] Trajkovski, G. (2024), “Bridging the public administration‐AI divide: A skills perspective”, Public Administration and Development, Vol. 44/5, pp. 412-426, https://doi.org/10.1002/pad.2061.
[33] Ubaldi, B. et al. (2019), “State of the art in the use of emerging technologies in the public sector”, OECD Working Papers on Public Governance, No. 31, OECD Publishing, Paris, https://doi.org/10.1787/932780bc-en.
[6] UK Committee of Public Accounts (2025), Use of AI in Government, https://committees.parliament.uk/publications/47199/documents/244683/default/.
[68] UK DSIT (2025), Archaic tech sees public sector miss £45 billion annual savings, https://www.gov.uk/government/news/archaic-tech-sees-public-sector-miss-45-billion-annual-savings.
[21] UK DSIT (2025), State of digital government review, https://www.gov.uk/government/publications/state-of-digital-government-review/state-of-digital-government-review.
[13] UK NAO (2024), Use of artificial intelligence in government, National Audit Office, https://www.nao.org.uk/wp-content/uploads/2024/03/use-of-artificial-intelligence-in-government.pdf.
[14] UN Habitat (2024), Global assessment of Responsible AI in cities, https://unhabitat.org/sites/default/files/2024/08/global_assessment_of_responsible_ai_in_cities_21082024.pdf.
[40] US OMB (2025), Accelerating Federal Use of AI through Innovation, Governance, and Public Trust, https://www.whitehouse.gov/wp-content/uploads/2025/02/M-25-21-Accelerating-Federal-Use-of-AI-through-Innovation-Governance-and-Public-Trust.pdf.
[48] US OMB (2025), Driving Efficient Acquisition of Artificial Intelligence in Government, White House Office of Management and Budget, https://www.whitehouse.gov/wp-content/uploads/2025/02/M-25-22-Driving-Efficient-Acquisition-of-Artificial-Intelligence-in-Government.pdf.
[26] van Noordt, C., R. Medaglia and L. Tangi (2023), “Policy initiatives for Artificial Intelligence-enabled government: An analysis of national strategies in Europe”, Public Policy and Administration, https://doi.org/10.1177/09520767231198411.
[49] Varazzani, C. et al. (2023), “Seven routes to experimentation in policymaking: A guide to applied behavioural science methods”, OECD Working Papers on Public Governance, No. 64, OECD Publishing, Paris, https://doi.org/10.1787/918b6a04-en.
[72] Waters, R. (2024), Meta under fire for ‘polluting’ open-source, https://www.ft.com/content/397c50d8-8796-4042-a814-0ac2c068361f.
Notes
← 1. “Salesforce conducted a double-anonymous survey of 600 IT professionals (200 IT leaders and 400 IT individual contributors) in Australia, France, Germany, the United Kingdom and the United States. Respondents work across industries, including technology, financial services, media and entertainment, manufacturing, retail, healthcare, the public sector and more. The survey was fielded in December 2023 and January 2024” (2024[12]).
← 2. The source for this sentence is a report by the Joint Committee of Public Accounts and Audit, Parliament of Australia (2025[28]). The findings of the Committee are not necessarily representative of the Australian Government’s views.
← 5. Based on Microsoft’s US-oriented website (https://www.microsoft.com/en-us/microsoft-365/copilot) as of 10 April 2024. Annual subscription pricing.
← 7. See, for example, https://openai.com/global-affairs/introducing-chatgpt-gov and https://www.anthropic.com/news/expanding-access-to-claude-for-government.
← 8. The use of “open-source” models for this report does not imply that such models are released under an open-source license approved by the Open Source Initiative (OSI), a nonprofit steward of The Open Source Definition (https://opensource.org/osd). OSI has criticised some companies that call their models open source because they only provide the weights for the model, and not other elements, such as the training data, code and training practices (Waters, 2024[72]). Some argue that such models should be called “open weight” instead of “open source”.
← 10. Information provided by an undisclosed country to the OECD. The OECD is not publishing the name of the country or project because of the preliminary nature of the estimates and analysis.
← 11. Information provided by an undisclosed country to the OECD. The OECD is not publishing the name of the country or project because of the preliminary nature of the estimates and analysis.
← 13. Figures reported to the OECD by a non-disclosed country.