This chapter introduces principles for upper secondary certification that articulate how the perspectives of key stakeholders – namely students, teachers, schools and post-school destinations – can be brought together coherently. The principles build on, and are yet distinct from, the long and rich body of assessment theory. The chapter develops a matrix for upper secondary certificates that sets out the wide range of different assessment types and approaches, including how assessments are undertaken. The principles and matrix aim to support countries to review their certificates, analyse how effectively they balance the needs of different stakeholders and develop a comprehensive overview of how certificates might be built, drawing on the range of approaches that are used internationally.
The Theory and Practice of Upper Secondary Certification
2. Towards principles and a matrix for the design of upper secondary certificates
What matters for upper secondary certification? The perspectives of multiple stakeholders
This report concentrates on the purposes and uses of upper secondary certificates that have a direct focus and impact on the student. Upper secondary certification often serves other important functions, such as being used in accountability mechanisms for schools, which are not the focus of this report.1
Upper secondary certification is a passport to the next stage of young people’s lives
The positioning of upper secondary certificates – between school and the adult world – and their functional purposes which are both forward- and backward-looking (Chapter 1 discusses the purpose of upper secondary certification) mean that they carry important consequences for students and their education systems.
A key transition point in a young person’s life
The move from school to higher education or employment makes the end of upper secondary education a key transition point in the life of any young person. This makes a written or formal record – in the form of a certificate – critical. Progression from one institution (and sector) to another usually necessitates a formal record that the student has completed their upper secondary programme, and often also needs a more detailed report of the learning content, level and standard that the student has achieved. By providing such a record or report, upper secondary certification provides a passport that allows young people to move from school to the next phase of life, whether that is education, training and/or employment.
Regulating access to post-secondary opportunities
Upper secondary certificates play a major role in regulating who accesses higher education, what programme is accessed and how selection decisions are made. There is a wide range of financial and personal benefits associated with higher education and in 2023 almost half (47%) of 25-34-year-olds across OECD countries had completed higher education (OECD, 2024[1]). In line with the associated benefits and increasing “massification” of higher education, almost seven in 10 15-year-old students (69.5%) on average across the OECD expect to complete higher education (OECD, 2024[2]).
In all OECD countries with available data (38), access to first-degree higher education programmes across both public and private institutions requires a minimum qualification level, which is usually at the level of upper secondary education. Around half of OECD systems (18 out of 38) also use students’ upper secondary certificate or national examination to set minimum academic requirements for young people to enter at least some first-degree tertiary programmes or institutions (OECD, 2019[3]).
Motivating performance
While motivating performance is not (and should not be) a primary purpose of upper secondary certification, it is an important washback effect, and one that is often mentioned as vital by teachers, parents and sometimes students themselves. Studies of teacher beliefs reveal that it is common to find teachers who view motivating students as a defining principle of their assessment practice (Barnes, Fives and Dacey, 2014[4]). The belief that assessment motivates performance affects not only teachers’ classroom assessment practices, but also their views about the nature of the assessment and certification system that best serves their students. For example, teachers in Scotland (United Kingdom) have repeatedly provided consultation feedback suggesting they believe that upper secondary certificates for all students should be assessed using externally-marked, exam-based assessment, arguing that external assessment is necessary to motivate students (OECD, 2021[5]; SQA, 2021[6]; SQA, 2017[7]).
Some studies have suggested that the high stakes attached specifically to final examinations can motivate students to work harder (Bishop, 1998[8]). This might be one reason why systems with external examinations tend to demonstrate higher levels of student achievement than systems without such examinations (Wößmann, 2000[9]). These findings relate only to external examinations (and not upper secondary certificates more broadly which can draw on a range of different types of assessment tasks). As is discussed further in Chapter 6, new analysis conducted for this report using PISA 2022 performance data shows that the association between external exams and student performance is mostly positive but is not statistically significant after accounting for students’ and schools’ socio-economic background.
Shaping teaching and learning
Whether through a formal accountability system or as a byproduct of the publication of results, the certification system can become a powerful lever to ensure that the defined curriculum is resourced by schools and taught by teachers. Upper secondary certification is high stakes for individual students and their parents or carers, especially if it plays a strong role in facilitating selection. This may be seen as a positive motivating force for students, and it may have a backwash effect on teachers and schools, encouraging effective and comprehensive resourcing and teaching of the curriculum. Where the certification system does not provide a good match with the agreed curriculum, such backwash effects can be perceived as perverse and seen as having detrimental effects on the curriculum, on teaching and learning and on students.
Reflecting diverse perspectives on upper secondary certification through principles of relevance, credibility, fairness and manageability
To ground the OECD’s analysis of certification in the fundamental purposes it serves and the perspective of key stakeholders, this report develops principles for upper secondary certification: relevance; credibility; fairness; and manageability (discussed in the box below). The principles aim to provide countries with guidance and direction when developing and reviewing their upper secondary certificates. The principles do not seek to replace long standing and well-evidenced assessment principles, such as reliability and validity (discussed in Chapter 1), but rather provide a guiding framework that reflects the unique function and priorities associated with upper secondary certification.
It is important to note that, although discussed separately, the upper secondary principles suggested here interact with each other. For example, a certificate that is not viewed as relevant is unlikely to be credible. Similarly, certification and assessment that are viewed as unmanageable from a student’s perspective might not be viewed as fair, and may not be trusted, reducing credibility. As in other aspects of upper secondary certification, policymakers must seek an appropriate interplay between the principles.
Towards principles for upper secondary certification
Relevance
Relevance is concerned with how well the knowledge, understanding and skills assessed in the certificate match those defined as important in the curriculum. It also relates to the extent to which the skills and knowledge assessed are relevant to students’ lives and their next steps after school, whether that is higher education, training, work or life more generally.
Credibility
Credibility is about the trustworthiness of a certification. To effectively perform the selection function, certificates must be perceived to be a reliable and trusted measure of students’ achievement. Perceptions of credibility are also closely related to relevance: to be seen as credible, the certificate needs to cover what society values, with agreement as to what it should represent and that it does indeed represent this.
Fairness
All groups of students, regardless of their individual characteristics (such as socio-economic background, gender, ethnicity or a particular special educational need) or their education programme (vocational or general) should have equal opportunities to demonstrate what they know and can do. To be fair, an assessment must avoid bias and must be as accessible as possible to all, so that all students can partake on equal footing.
Manageability
Upper secondary certification and its assessment must be manageable in design, delivery and use for the system and all of its actors. The certificate and assessment must be seen to be manageable for the students being assessed, the teachers who assess them, any assessment bodies involved in the assessment process and users of assessment results who must understand the certificate.
Applying the principles of upper secondary certification to the functions of certification and selection
Relevance is a driving principle for the ‘certifying knowledge, skills and understanding’ function of certificates
When considering upper secondary certification in terms of its key function of certifying knowledge and skills, it is the closeness of the relationship between certification and the curriculum that is the key consideration. The certificate’s relevance can be understood in terms of whether and how it is relevant to the curriculum, and students’ lives and future. Relevance is not static – the skills and knowledge most relevant can change, just as the wider context and needs of society evolve. However, excessive emphasis on relevance may come at the expense of other principles. For example, a wide variety of assessment types and tasks may be included to increase relevance and validity, laying the foundations for a diverse range of students – with differing strengths – to show that they have the skills that matter. Yet a wider variety of assessment types will almost always mean greater use of teacher assessment, increasing teacher (and student) workload, reducing manageability while also risking concerns about fairness and credibility.
Credibility is central to the selection function
The overriding priority for stakeholders focused on the selection function (i.e. the certificate’s relevance for progression routes into work, training and post-secondary education) tends to be the certificate’s ability to reliably summarise the skills a student has and their readiness for further education or suitability for roles in the labour market. However, when the upper secondary certificate has been designed to provide universal or near-universal certification, its power for selection can be diminished – resulting in attainment of the certificate losing its currency and, accordingly, a greater focus on individuals’ grades in order to make direct comparisons between certificate holders. Chapter 1 discusses the tensions between the need for upper secondary certificates to certify the full cohort and the need for selection and discrimination, especially in the context of selection for higher education.
Such a situation can lead to debate about whether the upper secondary certificate is well-designed for selection and whether additional or alternative tests might be used to fulfil the selection function. Such standardised tests often use psychometric approaches to provide perceived reliability, credibility and, through reducing bias, fairness. For example, such tests provide the main evidence for higher education selection in Japan, Korea and the Republic of Türkiye (hereafter Türkiye) (as discussed in Chapters 3-5), as well as for the additional selection tests used in the United States, such as the Advanced Placement tests (Morgan, 2018[10]). Additional tests alongside the main upper secondary certificate can create a threat to the manageability of the overall assessment system, with student workload, for example, being a frequent concern. Additional selection tests can also raise fairness issues since the link between the curriculum and assessed content is typically weaker, potentially advantaging students with access to private tutoring (discussed further in Chapter 6).
Each stakeholder has a slightly different perspective on what a “fit-for-purpose” certificate means
How far an upper secondary certificate fulfils the principles set out in this report – relevance, credibility, fairness and manageability – has direct and crucial consequences for three main groups of stakeholders. These include:
Students: as the individuals taking the assessments and for whom the certificate is their passport to future pathways, a certificate’s credibility and relevance are essential. Students also have strong concerns about fairness, given the high stakes associated with certificates, and, when they perceive unfairness, they tend to strongly voice their concerns, notably on social media. Chapter 1 discusses how perceived unfairness in assessments has sometimes led to a social media “storm”.
Teachers and schools: the views of teachers and schools are influenced by their educational context and assessment culture. However, with a key role in supporting and preparing students for assessments – and, often, for life more broadly after school – many teachers and schools will naturally have a view on how relevant an assessment is to the subject content and to students’ pathways. For teachers in many systems, fairness and objectivity are treated as a key part of their concepts of ethics and professionalism.
Post-secondary destinations: representatives of post-school destinations tend to be very concerned about the robustness of certificates and their effective use for selection, i.e. the credibility and relevance of upper secondary certification. Higher education institutions may also be concerned about fairness amid concerns about equitable access.
Table 2.1 provides an overview of stakeholders’ perspectives of the principles for upper secondary certificates developed by this report. The table conveys that each stakeholder will have slightly different views of what “fit-for-purpose” means, which naturally makes the development and continuous improvement of certificates challenging given the multiple perspectives to balance. As discussed in Box 2.1, student groups and teacher unions provided feedback and contributions to the table as well as the wider discussion about different stakeholders’ perspectives in this report. While Table 2.1 sets out the overall tendencies that students, teachers and post-secondary destinations are likely to express, it should be noted that diverse contexts and cultures across different systems mean that there will always be variations and nuances in all systems.
Table 2.1. Stakeholder perspectives on principles for upper secondary certification
| | Relevance | Credibility | Fairness | Manageability |
|---|---|---|---|---|
| Students | The skills and knowledge assessed are relevant for my life after school. Assessment focuses on the skills and knowledge I learnt in class. | My results and those of my peers can be trusted. My grades reflect my abilities and the effort I put in. | I have an equal opportunity to acquire the skills and knowledge assessed and show what I know. Marking and grading are consistent across students. | The pressure and workload are manageable. I still have time for extracurricular activities, personal obligations and to spend time with friends. |
| Teachers and schools | The skills and knowledge assessed are relevant to the curriculum. | The results for my school and across all schools are marked professionally and responsibly. The rigor of assessment is upheld and secure, with no leaking or cheating. | Assessments enable all students to equally display their strengths and to receive an appropriate grade. | The workload of assessment is manageable within contracted hours and there is sufficient space and time for quality teaching and learning across the breadth of the curriculum. |
| Representatives of post-secondary destinations | Assessments at upper secondary level focus on the content, skills and knowledge students will need to succeed in post-secondary institutions and at work. | Achievement of a certificate is seen as a reliable indicator that the student has built strong foundations and is ready for further study or employment. The results indicate students’ suitability for particular programmes and enable direct comparison between students. | Students from different backgrounds and pathways have an equal chance to do well and access our institution or organisation. We have the necessary information to fairly select students for competitive places. | We can easily interpret and use results to inform our high-stakes decisions. |
Note: The stakeholder perspectives in this table come from discussions with stakeholders within the Transitions in Upper Secondary Education project.
Unbalanced certificates are unsustainable
In practice, no upper secondary certificate is likely to achieve perfect balance across the four principles and two functions of certification. Moreover, the principles are subjective; in all systems, there will likely be individuals or groups who perceive weaknesses in their certification system in one or more of the principles at any given time. However, when there are significant concerns about a principle, this can lead to pressure for change.
Box 2.1. Stakeholder feedback on upper secondary certification
The Transitions in Upper Secondary Education project engaged with policy makers, teachers and teacher representatives and students and young people while defining relevance, credibility, fairness and manageability as the key principles of upper secondary certification discussed in this report.
In November 2024, the Trade Union Advisory Committee (TUAC) to the OECD’s Working Group on Education and Skills provided feedback on the principles as they were then defined. This group is made up of representatives from teacher trade unions across the OECD. Engagement with the TUAC working group helped refine the definitions of the principles – for example, emphasising that relevance should be considered both in relation to the curriculum and to students’ futures – and also provided nuanced perspectives on the pros and cons of different assessment approaches.
In November 2024, the Union of Upper Secondary School Students in Finland also provided feedback on the principles and connected the principles to ongoing discussions and debates in the Finnish context. For the students, credibility was of paramount importance and the trustworthiness of the matriculation exam was linked to it being an externally-marked, standardised exam. Beyond student workload, the student union representatives also considered manageability in the sense of the administrative centre’s capacity to give fair and considered grades without markers being rushed to meet deadlines. This example shows the interdependence of the principles – in this case, manageability, fairness and credibility – in shaping stakeholders’ perceptions.
Content not felt to be relevant faces pressure to evolve
As discussed in Chapter 1, a common concern about upper secondary certification in general is its relevance to the curriculum and, more broadly, the skills and knowledge that young people will need to succeed in the world beyond school. The Queensland Core Skills (QCS) test in Queensland (Australia) provides a well-documented illustration of the systemic response to an element of the State’s upper secondary certification that was increasingly felt not to be relevant by the education sector. The QCS test was introduced in 1992 as a standardised test of cross-curriculum skills, set and marked by the state curriculum and assessment body (QCAA). The key function of the QCS test was to act as a scaling mechanism for students intending to continue their studies at university as, otherwise, student assessment was entirely based on externally moderated school-based assessment. The QCS test was compulsory for students applying to higher education through a specific mechanism – the Overall Position (which in 1993 reflected around 90% of entries to higher education) – but was not a requirement for awarding the Queensland Certificate of Education (Matters and Masters, 2014[11]).
In 2014, the Queensland Government commissioned an independent review of the system of senior assessment and higher education entrance for students completing Year 12. One of the seven major themes from the public consultation for the review was a widespread rejection of the QCS test.
Tests of 21st Century skills were seen to have no purpose and to occupy a lot of teaching time. There were no positive comments about the QCS Test which is construed as an external exam. The QCS Test is seen as an impost that has a significant effect on teaching time through a focus on teaching to, and preparing for, the test. (Matters and Masters, 2014, p. 28[11])
The schooling sector was adamant in rejecting the QCS Test, while the higher education sector was ambivalent. Stakeholders felt that the cross-curriculum testing through the QCS Test did not serve any clear purpose. They also highlighted that “inordinate” amounts of time were spent preparing for the QCS Test, and that an industry of commercial providers had developed around this which was taking “students and teachers away from the main learning game” (Matters and Masters, 2014[11]). The review recommended ending the QCS Test and it was last sat by students in 2019 (QCAA, 2019[12]). From 2019, Queensland’s system of assessment has evolved to include both internal and external assessment for subjects. Subject results are now used to calculate a rank score for higher education entrance and there is no need for an additional, cross-curricular scaling test.
There are similar examples from other systems of certificates, or elements of a certificate, being removed because they were not widely perceived to be relevant to students’ futures. In England (United Kingdom), General Studies A level (a qualification typically taken at 18) was introduced in the 1950s as an interdisciplinary subject encouraging students to recognise the interdependence across their different areas of study and experience and to promote complex skills like critical and logical thinking, analysis and evaluation, study skills and effective communication (OCR, 2013[13]). However, there was a perception among some students that it was simply not a good use of time, especially because not all higher education providers saw it as a credible certificate (interacting with the credibility principle). Only 40 people took the examination in 2019 and it was removed in 2020.
Perceived credibility of a certificate is a strong driver of change
Fundamentally, the credibility of a national certificate can be tied to perceptions of trust and authority in a society, with grading “being an exercise of authority, which manifests a power relationship where the public sector exercises power over individual citizens” (Skolverket, 2020[14]). If assessments of skills and knowledge are not perceived to be credible – by students, teachers, schools, post-secondary institutions and society more broadly – this may undermine notions of meritocracy and fairness in society. At a more practical level, an upper secondary certificate must be deemed credible for it to effectively influence young people’s progression into work and further education.
In Sweden, upper secondary certification is based on teachers’ grades for their students, with teachers having full autonomy in determining students’ grades. Teachers should take into account nationally-set tests in subjects when giving grades, but they are not required to follow the results from these tests when grading the students. Entry to higher education is selective, based on results from upper secondary certification. National analysis shows that there are differences in the relationship between national test results and teacher-given grades across schools. In some schools, teacher-given grades are very high relative to national test results, while in others, the reverse is true (Skolverket, 2020[15]).
To some extent, it is natural and expected that results on a written examination and teacher-given grades will differ – they are different assessment tasks with varying effectiveness at assessing different types of knowledge and skills (discussed in Chapter 3). However, interviews with teachers in Sweden suggest that, while many teachers feel relatively confident in grading their own students’ knowledge, they are uncertain about whether other teachers are making similar judgements. This means that a given grade does not represent the same level of knowledge and skills across schools. According to national research, this reflects inherent challenges for teachers to achieve a common interpretation of curricula expectations that leads to nationally equivalent grades across the student population (Skolverket, 2020[14]).
Differences in students’ grades are an issue of national concern in Sweden. There are concerns that they affect student success in higher education. National analysis shows that students from schools run by private entities perform comparatively less successfully in their first year of higher education despite having higher grades on average upon entry when compared with their peers from public schools (Skolverket, 2020[14]). In Sweden’s selective tertiary system, this also means that a student with more ‘generous’ grading may access a programme instead of a student with more appropriate skills and knowledge for a programme but who had ‘stricter’ upper secondary grading, with consequences for how effectively young people’s skills are developed and used across society (Skolverket, 2020[15]).
Upper secondary school completion rates and university drop-out rates are high-profile problems and it is reasonable to assume that part of the reason is generous grading, not insufficient knowledge (Skolverket, 2020[14])
Multiple national initiatives have been introduced to try to promote greater comparability in teachers’ grades, but so far these have had limited success (Skolverket, 2020[14]). In February 2025, a special investigation commissioned by the Swedish Government proposed that student grades should be based on both teacher-given grades and results from national exams, to support fair comparison across students for entry to higher education. The new final exams are set to be introduced in the first half of 2030 (SOU, 2025[16]).
While the concerns about credibility are well-documented in the Swedish case, it is a common concern in many systems and may lead to significant reform. In Lithuania, upper secondary certification – the national Matura – has historically been based on a combination of school-level and state examinations. However, there were widespread concerns about the reliability of the school-level exams and higher education institutions did not take them into account for tertiary selection because they did not perceive them to be credible (and only used state level exam results). In 2022, the country announced a series of reforms to the Matura including removing school-level examinations and requiring all examinations for certification to be developed and marked at the state level to enhance reliability and perceptions of credibility (OECD, 2023[17]).
Concerns about fairness can create strong demands from students and schools
In most systems, there will be some groups of students to whom a certificate does not provide equal opportunities to demonstrate their skills and knowledge. Awareness of perceived unfairness might lead to changes for those specific groups. In many systems, for example, students with diagnosed special education needs have specific accommodations, such as extra time, or access arrangements, such as having a scribe to write their answers, so that they can fairly access assessments (Guez, Ketan and Piacentini, 2024[18]).
It is also a common challenge across systems to ensure that students in vocational upper secondary education have equitable opportunities to demonstrate their skills and knowledge fairly for access to higher education. Assessing students in the same way and on the same content across general and vocational programmes is typically not considered to be fair since students in different programmes often study different content and develop different skills. In Poland, a vast majority of secondary school graduates take the Matura (a general education exam in three compulsory subjects and at least one elective subject), whose scores are used for entry to higher education. Since 2018, changes to the Matura have enabled vocational school students to use their vocational exam score in place of the elective general subject, facilitating their entry to higher education (Republic of Poland, 2018[19]).
At a larger scale, when a certificate is not considered to be fair for an entire cohort, this might lead to widespread concerns. In the first half of 2020, in the context of the COVID-19 pandemic, many systems were forced to quickly respond to the urgent and unprecedented context of in-person examinations and assessments not being possible. In England (United Kingdom), in-person written examinations for GCSE and A levels – the main general upper secondary certificates – were cancelled and replaced with “calculated grades”. Schools proposed (for each student) a Centre Assessment Grade, based on the grade that a student’s school or college believed that they would have been most likely to achieve had exams gone ahead as normal, plus an associated rank order. Awarding organisations then standardised these Centre Assessment Grades, based on evidence, including: expected grade distributions at national level; results in previous years at individual centre level; and the prior attainment profile of students at centre level (Department of Education, 2020[20]). This standardisation process resulted in a “calculated grade” for each student. The standardisation aimed to maintain standards, and to ensure that, as far as possible, students were not unfairly advantaged or disadvantaged (Ofqual, 2020[21]).
The published results suggest that the results were largely consistent with previous years and that this was the case across groups of students, notably across different ethnic and socio-economic backgrounds (Ofqual, 2020[22]). However, the differences between schools’ initial grades and students’ final grades – 39% of the grades provided by schools and colleges were revised down – led to immediate accusations of unfairness. For statistical reasons, larger schools and colleges were more likely to have grades revised during the standardisation process than smaller, independent schools, with students from advantaged backgrounds being represented in these types of schools. This led to a perception that downgrading disproportionately affected students from more disadvantaged backgrounds. In practice, while there were some differences across student groups overall, these differences seem relatively balanced across different grades. For example, among students whose school or college had awarded them a Grade C or above, 10.4% from disadvantaged backgrounds had their grades reduced by standardisation compared to 8.3% from the least disadvantaged backgrounds. Yet on the other hand, fewer students from disadvantaged backgrounds whose schools awarded them the highest grades – an A/A* – had their grades reduced by standardisation (10.39%) compared to students from the highest socio-economic backgrounds (11.44%) (Ofqual, 2020[23]).
Despite the overall national picture suggesting that the 2020 cohort was not unfairly advantaged or disadvantaged by the measures put in place in 2020, at an individual level many students felt that they had been unfairly treated.
“While there has been an overall increase in top grades, we are very concerned that this disguises a great deal of volatility among the results at school and student level. We have received heartbreaking feedback from school leaders about grades being pulled down in a way that they feel to be utterly unfair and unfathomable. They are extremely concerned about the detrimental impact on their students.” Geoff Barton, the general secretary of the Association of School and College Leaders (Adams, Weale and Barr, 2020[24])
Following widespread student distress, the Government reverted to school- or college-awarded grades, removing the standardisation component of the certification process. In announcing this decision, the then Department of Education’s statement focused on fairness, noting that “the system [of grade standardisation] has resulted in too many inconsistent and unfair outcomes for A and AS level students.” (Department of Education, 2020[25]). The case of the A levels in England shows that, even when fairness appears to be achieved at a national scale, individual students can still perceive their experience as unfair. One of the challenges in England in 2020 was perhaps the apparent opacity of how student grades were reached when school-awarded grades were transformed by standardisation: the process was not transparent to students and therefore seemed unfair.
Unmanageable certificates are unlikely to remain unchanged for long
The extent to which the workload of high-stakes assessment is perceived to be manageable – for the students sitting the assessments, the teachers administering them and the overall system – affects the overall sustainability of a certificate. If assessments are deemed to be too time-consuming or demanding, then policymakers can expect calls for change and even public outcry. Manageability criticisms are often linked to the other principles: students who can afford extensive tutoring to prepare for examinations have an ‘unfair’ advantage, and it might be said that students are wasting time studying intensively at the expense of developing more ‘relevant’ skills. Both critiques affect the overall value and currency that people place on the certification.
In New Zealand, the National Certificates of Educational Achievement (NCEA) were introduced between 2002 and 2004 as certificates to assess student skills against set standards, with internal school-based assessment occurring throughout the year in addition to end-of-year external assessments. NCEA contrasted with previous secondary school qualifications that relied heavily on external exams (New Zealand Qualifications Authority, 2024[26]). This shift was seen as a positive one by students since they felt it enhanced their performance and enabled them to pace their assessment workload throughout the year (New Zealand Ministry of Education, 2006[27]).
However, 15 years on from NCEA’s introduction, a comprehensive review was launched amid concerns of ‘over-assessment’, ‘wellbeing’ and ‘learner and teacher workload’ (Ministry of Education, 2018[28]). The review identified that the design of NCEA incentivised over-assessment, with teachers designing courses that included more assessment components – worth ‘credits’ – than necessary, and, ultimately, many students achieving vastly more credits than required to attain an NCEA. Additionally, over time, an increasing proportion of the NCEA credits achieved by students has been gained through internal assessment, at the expense of external assessment (New Zealand Qualifications Authority, 2024[29]). In response, a change package was put forward that would re-set the balance between internal and external assessment, organise learning into more consistent and coherent subjects and set a common minimum standard for literacy and numeracy (New Zealand Ministry of Education, 2019[30]). However, the full implementation of these changes was repeatedly delayed and, before the changes were fully implemented, a new proposal was put forward to replace NCEA with new national qualifications that would go further in mandating a consistent approach to the delivery of subjects and ensuring students take external exams (New Zealand Ministry of Education, 2025[31]).
Similar concerns about over-assessment and workload, particularly for teachers, led to qualification reforms in Scotland from 2016, resulting in less emphasis on internal assessment and final exams becoming more significant (Hayward et al., 2023[32]; Scottish Government, 2016[33]). Recently, however, the Scottish Government has signalled a move to reduce the focus on exams and to broaden the range and types of assessment methods used (Scottish Government, 2025[34]).
A matrix for assessment design
Assessment for upper secondary certification can be described as “composite” in nature: the overall assessment package is made up of various components, often selected to achieve different aims in policy contexts that require trade-offs between the strengths of different assessment approaches and methods.
The composite nature of upper secondary certificates means that it is not possible to map out a simple taxonomy of assessment types. Instead, a matrix of assessment approaches and methods is more helpful. The matrix presented in Figure 2.1 has been developed based on extensive discussion and consultation with countries about their upper secondary certificates through the OECD’s Informal Working Group on Assessment and Certification in Upper Secondary Education (outlined in Chapter 1).
To help make sense of the variety of certification approaches, the matrix maps out different approaches for:
Assessment design:
Nature of the assessment task: what tasks does a student have to undertake and engage with?
What are the conditions under which students are required to take the assessment?
When does the assessment take place?
Assessment responsibility:
Who sets the assessment?
Who marks and judges the student’s assessment responses or evidence?
Evolving the matrix to reflect practices across OECD systems
The matrix in Figure 2.1 draws on theory and practice. Chapters 3-5 of this paper map upper secondary certificates across OECD countries. For this mapping, the OECD Secretariat worked closely with members of the Informal Working Group on Assessment and Certification to categorise the upper secondary certificates in their systems. This process entailed categorising the individual components of each upper secondary certificate according to the matrix. For example, if students must complete a written examination, this written exam component was categorised according to each category of the matrix: nature of the assessment task; conditions for taking the assessment; timing of the assessment; who sets the assessment; and who marks the assessment. In most cases, certificates are made up of multiple components and the majority of countries have more than one certificate – typically, separate certificates for general and vocational upper secondary education.
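As an illustration only, the categorisation step described above can be sketched as a small data structure in which each certificate component receives one value on each of the matrix's five dimensions. The class names, field names and the two example components below are assumptions made for this sketch; they are not part of the OECD's mapping tools, and the example categorisations are hypothetical rather than drawn from any system's actual mapping.

```python
from dataclasses import dataclass
from enum import Enum


class TaskNature(Enum):
    NOT_SPECIFIED = "nature of task not specified"
    NATURALLY_OCCURRING_EVIDENCE = "naturally occurring evidence"
    PROJECT_OR_PORTFOLIO = "projects and portfolios"
    PRACTICAL_OR_PERFORMANCE = "practical activities and performance assessments"
    UNSEEN_QUESTIONS = "assessment activities with unseen questions/tasks"


class Conditions(Enum):
    NOT_SPECIFIED = "conditions not specified"
    NO_CONSTRAINTS = "no constraints"
    SOME_CONSTRAINTS = "some constraints"
    STRICTLY_CONTROLLED = "strictly controlled"


class Timing(Enum):
    NOT_SPECIFIED = "timing not specified"
    CONTINUOUS = "continuous assessment"
    DEFINED_STAGES = "at defined stages"
    END_OF_PROGRAMME = "end of the programme of learning"


class Responsibility(Enum):
    OWN_TEACHER_OR_SCHOOL = "own teacher or school"
    INTERMEDIATE_OR_MIXED = "intermediate body or mixed approach"
    EXTERNAL_AGENCY = "external agency"


@dataclass(frozen=True)
class Component:
    """One component of a certificate, categorised on all five dimensions."""
    name: str
    task: TaskNature
    conditions: Conditions
    timing: Timing
    set_by: Responsibility
    marked_by: Responsibility


def profile(component: Component) -> dict[str, str]:
    """Return the component's category label on each matrix dimension."""
    return {
        "nature of task": component.task.value,
        "conditions": component.conditions.value,
        "timing": component.timing.value,
        "who sets": component.set_by.value,
        "who marks": component.marked_by.value,
    }


# Hypothetical components of an invented certificate (illustrative values only).
written_exam = Component(
    name="written examination",
    task=TaskNature.UNSEEN_QUESTIONS,
    conditions=Conditions.STRICTLY_CONTROLLED,
    timing=Timing.END_OF_PROGRAMME,
    set_by=Responsibility.EXTERNAL_AGENCY,
    marked_by=Responsibility.EXTERNAL_AGENCY,
)
project = Component(
    name="research project",
    task=TaskNature.PROJECT_OR_PORTFOLIO,
    conditions=Conditions.SOME_CONSTRAINTS,
    timing=Timing.DEFINED_STAGES,
    set_by=Responsibility.OWN_TEACHER_OR_SCHOOL,
    marked_by=Responsibility.INTERMEDIATE_OR_MIXED,
)

for component in (written_exam, project):
    print(component.name, profile(component))
```

Keeping "who sets" and "who marks" as separate fields reflects the point made later in the chapter that the two responsibilities do not necessarily mirror each other, which is why a single internal/external label is hard to apply.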
Through the mapping process, the OECD Secretariat found that some categories did not fully capture the reality of country practices, and the matrix was adjusted to reflect the diversity of those practices more accurately. Some of these changes are discussed below. It should be noted that the matrix continues to be a developing tool. While the OECD Secretariat has now mapped the majority of OECD systems, not every system has been mapped. As the remaining OECD systems are mapped, the matrix may continue to evolve.
Assessment design
The matrix first discusses the design of assessments i.e. the decisions that inform the tasks and how students undertake them.
Nature of assessment task: what tasks does a student have to undertake and engage with?
The types of tasks that the student may have to undertake and engage with can be many and varied. Different assessment systems and organisations may use different terminology for very similar sorts of tasks. To help make sense of this diversity, the matrix distinguishes between four main categories of task. Box 2.2 discusses the rationale for the labelling of the different categories of assessment task included in the matrix.
Figure 2.1. Matrix for categorising assessment tasks
1. Nature of task not specified
In some cases, it is not nationally specified what assessment task the student will experience. The nature of the assessment task – for either a particular component contributing towards certification or even the student’s entire assessment programme – may not be specified in policy, and this may be left up to schools and teachers to decide. The policy intent may be to provide an open and flexible assessment system in which assessment tasks can be tailored, perhaps to the needs of the class or individual student. However, without national monitoring of school-level assessment policies, the policymaker may not know whether the intended flexibility is actually being implemented. Assessments in the ‘nature of task not specified’ category may even be experienced by students as ‘tests’ or ‘exams’; the key feature of this category is that there is no national specification for how students must be assessed.
2. Naturally occurring evidence
Assessment using naturally occurring evidence does not require the design and administration of a formal assessment task and need not be planned in advance. Instead, the student’s knowledge and skills are judged on the basis of activities that happen in the course of a learning programme (or in the workplace). It is considered good practice to inform students that their work may be assessed when it is to be used for high-stakes assessment. Naturally occurring evidence can be used in any assessment context, but at the upper secondary level, it is most common in assessment of vocational programmes and skills.
3. Projects and portfolios
Projects and portfolios are forms of assessment that can take place in classroom contexts and throughout the course of study, where the student provides evidence for certain learning outcomes. Rules around the tightness of supervision, access to resources, ability to rework responses, and whether or not the assessment must be completed individually or in groups vary according to system and task rules. Tasks for projects and portfolios will often be relatively broad, providing scope for student personalisation and agency. Projects and portfolios may be found in any assessment context, including assessment of general and vocational programmes of learning.
4. Practical activities and performance assessments
Practical activities and performance assessments are modes of assessment requiring students to demonstrate a skill(s), perform a task(s) or give a performance of some kind. They may be integrated into classroom activities, attached to particular units of study, or they may be stand-alone, summative assessment events.
The matrix defines performance assessment as an assessment of skills that are ephemeral and cannot be easily captured and judged except by an assessor who watches the assessment activity live or, in some cases, engages with a digital recording. Practical activities are assessment tasks where the student is required to demonstrate technical, creative or artistic skills by using physical equipment and materials, often to produce an artefact or product. By their nature, practical activities tend to be found in practical and vocational contexts but can also occur in more knowledge-based subjects such as sciences.
Performance assessments and practical activities can sometimes be treated by countries as exams and scheduled within the exam timetable. They may also be used in the classroom, alongside a range of other assessment methods. Practical activities and performance assessments tend to be point-in-time performances, e.g. a music performance or vocational exam, whereas projects and portfolios might involve evidence for assessment developed or collected over a longer timeframe.
5. Assessment activities with unseen questions/tasks
This category encompasses traditional paper-based tests or ‘examinations’, their digitised formats and other sorts of tests of knowledge and skills that impose the constraint that the test task or questions should be unseen. The defining feature is that the student is presented with a task of which they have no advance knowledge and to which they respond on the spot.
The assessment may consist of a single task but is likely to be made up of a collection of assessment items, sometimes of different types (e.g. objective questions or items, short answer or restricted response items and extended response items – further discussed in Chapter 3), that are designed to assess a range of related knowledge, understanding and skills. Students are required to complete responses in a set timeframe, usually without resources, and usually individually, under supervised conditions. Typically, to preserve the unseen nature of the assessment task, all students in the cohort (the class, school or jurisdiction) sit the same assessment at the same time. Activities with unseen assessment tasks may be used in any assessment context, and in some systems are used for almost all high-stakes assessment. Such assessments tend to dominate assessment of general or academic skills and knowledge in many education systems.
Box 2.2. Towards a common international understanding of ‘assessment tasks’
Different systems use different terminology – or the same terminology in different ways – to describe assessment tasks. In some systems, ‘assessment’ is synonymous with tests and examinations while in others upper secondary certification is referred to as an exam. Many systems also have some kind of coursework or teacher-assessed component, with the terminology used emphasising either key aspects of the format, the timing or even the roles and responsibilities for setting/marking.
In mapping systems’ upper secondary certificates to the matrix developed by this paper (the mapping is discussed in greater depth in Chapters 3-5), the original matrix was adapted and adjusted to reflect the diversity of assessment approaches across OECD systems as accurately as possible. Two of the central changes are discussed below.
Categorising ‘examinations’
Students in most upper secondary systems are expected to sit exams, but the concept of an exam is not consistently defined. Across the countries involved in the Informal Working Group on Assessment and Certification, most tended to use the term ‘exam’ for assessment components with some combination of the following characteristics:
terminal (i.e. occurring at the end of the programme or year)
summative (i.e. assessing what has been learnt once the unit of study is complete)
individually-assessed and taken in closed-book conditions
standardised, with students responding to a consistent set of questions or tasks
externally-set questions or tasks
externally-marked questions or tasks.
Bearing in mind this typical characterisation of exams, it can be hard to describe the nature of the assessment task without pre-determining other aspects of the assessment’s design – such as the assessment responsibilities. The OECD Secretariat identified that, when focusing solely on assessment format and not other characteristics, the key aspect which makes this kind of assessment task unique is its unseen nature i.e. that the student is presented with a task that they have no or limited advance knowledge of and respond on the spot. To avoid the risk of each reader engaging with the term ‘exam’ in overlapping but potentially distinct ways, the term exam is not used in the matrix. Instead, this category in the matrix has been defined based on the unseen nature of the questions/tasks that students respond to.
Categorising other assessment formats
Just as ‘exams’ often – but not always – carry the connotation of being set and assessed by a central exam board or qualifications agency, assessments which are not exams are often associated with being teacher-set and assessed. However, where the matrix seeks to focus on the assessment format, it uses terms that describe the way in which evidence for assessment is collected e.g. ‘naturally occurring evidence’, ‘projects and portfolios’ and ‘practical activities and performances’. For projects and portfolios, students might work towards these over a period of time, with assessment evidence coming from different stages of the project. This is distinct from practical activities and performances where, although students might prepare over a long period of time, evidence is normally collected from a scheduled, point-in-time assessment event.
What are the conditions under which the student is required to take the assessment?
The required conditions and rules for sitting the assessment lead to important differences in how students engage with and experience assessment. While such conditions and constraints can be many, varied and combined in different ways, the matrix suggests conditions operate within four broad categories.
1. Conditions not specified
In some education systems, there is no central specification of the conditions under which the assessment should take place. When assessment conditions are not specified nationally, schools set and manage the conditions for assessment, so student experiences of assessments in this category may range from no constraints to strictly controlled conditions. Chapter 4 examines some of the possible implications of this approach to assessment and certificate design. In some cases, as in Poland, parliamentary bills and other legal frameworks specify the conditions for assessment in very broad terms. However, when this category is applied, the specific decisions are left up to schools or teachers.
2. No constraints, restrictions or rules on assessment conditions
At one end of the continuum are assessments that have no or very limited constraints on where and how the student must complete the assessment. This might be the case when assessment is closely integrated with teaching and learning, including in the workplace. In vocational programmes, assessment with ‘no constraints’ may look like assessments where students have access to all the resources – including time, peers/mentors and paper-based/online resources – that they would ordinarily have when applying the skills in a real-life context. While there will always be some practical constraints and considerations, this category stands in contrast to strictly controlled exam or closed-book conditions.
3. Some constraints, rules or restrictions on assessment conditions
Systems may adopt a defined mixed or midway approach for some assessments. This may mean that certain restrictions are defined, or that certain stages of the task have more constraints/rules placed on them than other stages. For example, some education systems may require students to complete project work in supervised conditions, while others will have few restrictions during the research phase but may require students to complete their final ‘write-up’ in supervised conditions. How accommodations are allocated may also create slightly varying assessment conditions across the cohort – for instance, in OECD countries which offer digital accommodations to take examinations, roughly half make these available to all students, while the other half requires specific medical or special needs documentation.
Access to resources may be defined, too. For example, in the case of a project assessment, students may have free access to resources during their research phase but this may be precluded or constrained for the final stage of the assessment. Chapter 4 discusses student use of AI during upper secondary assessments (see Box 4.1).
4. Strictly controlled assessment conditions
Assessment, particularly final, summative assessment, is traditionally associated with tightly defined and controlled conditions, often involving time limits and various rules intended to preserve the security and integrity of the assessment, including rules about prior sharing of assessment tasks, rules against using source materials and rules against conferring. Taken together, a set of such rules creates the traditional ‘examination’ conditions. Such examination conditions are most associated with pencil and paper test-type situations but can be used for several different forms of assessment task, including performances and practical activities. Digital assessment in which students work in a closed digital environment, with no access to the internet or to tools like spellcheck, also meets this definition (OECD, Forthcoming[35]). The constraints commonly applied to digital examinations are similar to those that might apply in analogue examinations; they may simply manifest differently. The most common types of constraints include limits on preparation time, on the sources of information consulted (such as the internet or an AI system such as ChatGPT) and on the tools used (such as speech-to-text accommodations). Such rules are common in both general and vocational assessment systems, so much so that the terms ‘examination’ and ‘assessment’ may be conflated.
When does the assessment take place?
How the assessment is scheduled with relation to the programme of learning is another factor that has a major influence on students’ assessment experiences. Again, this is a continuum, and when one also considers informal, unplanned assessment that takes place in the classroom, perhaps to be used for formative purposes, or as a practice or mock assessment before a formal assessment, then the distinctions can become very blurred. Nevertheless, for the purposes of analysis and comparison, this working paper suggests that it may be useful to think about assessment scheduling as falling into four broad categories.
1. Timing not specified
In some education systems, the scheduling of the assessment task may not be specified, with the aim of leaving it open for the school, teacher or even the student to decide. Schools or teachers decide when assessment takes place to best fit with the needs of the learning programme, and teachers in such systems may be able to follow an ‘assess when ready’ approach. When this category is applied, there may still be some reasonable constraints over timing and scheduling. For example, non-exam assessment may have to take place before the end of the year, and workplace-based assessment may have to take place when the student is in the workplace. Beyond these broad parameters, this category indicates that the precise timing of assessment is not pre-determined.
2. Continuous assessment
In many systems, there is an assessment component that comprises regular and frequent assessment and/or in which, in theory, anything the student does may count towards assessment. This is the case when students get a grade for each term or year and anything the student does during this timeframe may theoretically contribute to this overall grade. This may include tests and pop quizzes, responses given when called upon in class, student work produced in class or for homework, behaviour and attendance. The grade may be a holistic judgement made by the teacher and/or it may be underpinned by various items of assessment evidence, and it might be up to the teacher to determine this. For this category, as with the previous category (timing not specified), teachers tend to have a lot of influence over when students are assessed. The key distinction between the two categories is that, with continuous assessment, assessment is intended, according to assessment policy, to be frequent and/or anything the student does may inform their grade.
3. At defined stages i.e. an external or national authority defines when assessments are scheduled
In many systems, certain assessments must take place across schools at the same time or during the same window. Assessment at defined stages may involve students responding to a consistent or standardised task, but this may be scheduled outside of the examination window for logistical reasons e.g. a vocational practical examination. Other examples include staged assessments or interim assessments, which may take place at the end of a semester or of the penultimate year, with students being assessed summatively on what they have covered over the course of the programme to date. Modularised or unitised systems in particular may go beyond adding just one interim assessment, adding several assessments taken at defined points throughout the course. This category is distinct from the next and final category, which is reserved for assessment taking place at the very end of the programme.
4. Assessment at the end of the programme of learning
In many systems, high-stakes assessment, for both general and vocational certificates, tends to happen at the end of the course of learning. There may be a tradition in which all students sit all examinations or vocational assessments during one or two set months, often at the end of the academic year. Since the end of the programme is the point in time at which students have had maximum opportunity to learn and revise what they have learnt, this timing is particularly associated with assessments that aim to comprehensively verify students’ mastery of the curriculum, including acquisition of the intended knowledge and skills.
Responsibility for the assessment
When thinking about categories and analysing assessment systems, it often seems natural to think of an apparently dichotomous scale of external vs internal assessment, or a continuum of external to internal assessment. Yet, these approaches mean it is still difficult to categorise and compare assessments, particularly when different actors may have responsibilities for different parts of the assessment. The matrix focuses on the individuals or bodies that are responsible for the various aspects of assessment design and administration.
Who sets the assessment?
As for many of the categories in the matrix, responsibilities can be shared in a great number of ways, and no description will ever capture every possible scenario across all systems. However, for ease of comparison and analysis, the matrix suggests that it is useful to consider three broad categories, which are discussed below.
1. Assessment set by the student’s own teacher or school
In some systems, setting the assessment is done by the student’s own teacher, who has the autonomy to design the assessment tasks to suit their learning programme. Sometimes this will be left to the individual teacher and sometimes responsibility will sit at school level, creating a degree of separation from the very personal relationship between the teacher and their students, and introducing the possibility of sharing of responsibilities between teachers. Having assessment set by the student’s own teacher or school does not preclude teachers following some nationally-set high-level guidance.
2. Assessment set by an intermediate person or body or a mixed approach
In some systems, responsibility for setting the assessment may lie with an intermediate individual or body like a local authority. In other systems, devolving responsibility to an entity that is local but external to the school might involve a mixed approach, where responsibility for setting the assessment is shared. For example, in assessment of vocational programmes, local employers may be involved in the assessment process, although they may not actually set the assessment. Mixed approaches might involve an external body or a committee of mixed school-based and external actors providing procedures for assessment-setting, setting broad topics, or providing a template or model from which the teacher or school must develop the detail of the assessment tasks. External bodies might even have a verification role – checking that the proposed assessment meets a national standard or will be an appropriate way to assess a learning outcome – before the student actually sits the assessment. This may be done through a committee structure, with a selected group of stakeholders having oversight of the assessment process, or responsibility for providing strategic advice.
3. Assessment set by an external agency
In many education systems, assessment is set by an external agency, often a national or state examinations board. This is typically the case when there is a need to have standardised assessment, with all students engaging with the same items or task. In this case, agencies or boards are usually part of the overall state education system but are likely to be managed and governed separately from the school system and are often separate from the body that defines and sets the curriculum. Such external agencies may be responsible for assessment of general qualifications, vocational qualifications or both. Frequently, they are not only responsible for assessment of upper secondary certification but have a wider remit of assessment at other stages of the education system.
Who marks and judges the student’s assessment responses or evidence?
Marking and judging assessment responses can be carried out in a range of ways, and features of different approaches may be mixed and matched, sometimes for different subjects. These features may be put together in ways that are similar to approaches used for setting the assessment, but they do not necessarily mirror each other i.e. for some assessments in some educational contexts, the bodies or individuals setting assessments are different from those who mark or judge the assessment responses. The fact that mixed approaches to setting and judging assessments frequently occur across systems makes it difficult to classify an assessment simply as ‘internal’ or ‘external’.
1. Assessment marked or judged by the student’s own teacher or school
In many upper secondary certification systems, all marking and judging of student assessment responses is done by their own teacher. As with responsibility for setting the assessment, sometimes this will be left to the individual teacher and sometimes responsibility will sit at school level. If the teacher sets the assessment, it may be seen as more practical for them to decide the mark scheme and apply it.
2. Assessment marked or judged by an intermediate person or body or a mixed approach
There are many ways in which responsibility for marking the assessment might be mixed. Perhaps because marking or judging the student evidence brings into play the individual relationship between the teacher and the student, it is probably in marking, more than setting, that systems introduce some oversight or externality into the assessment process. For example, the teacher may be responsible for carrying out the marking, but there may be very specific marking instructions for them to follow and external actors may verify that judgements are accurate and consistent with other schools. This can be the approach taken when the assessment is set by an intermediate body or external agency, and here the instructions will be designed to ensure consistency and fairness when different teachers have responsibility for marking the same task.
Another way to introduce some externality to marking is for teachers to mark the assessments of other teachers’ students, or for a local education authority to act like a local exam board, organising the marking of student responses. This brings a degree of objectivity and externality while retaining more local control. It is sometimes used in systems where responsibility for setting the assessment is devolved to local or teacher level, to provide an overview of the marking process and assurance of consistency and reliability, supporting the certificate’s credibility. This may happen in all subjects, or only in subjects that are defined as particularly important (or especially difficult to mark consistently e.g. subjects with essays, artistic projects and performances).
3. Assessments marked or judged by an external agency
Student responses and assessment evidence may be submitted to an external examination board, which takes responsibility for marking them. Often, there will be systems and rules in place to ensure that examiners are genuinely independent of the student and their school. In modern digital marking systems, student responses will be anonymised, and assessment tasks may be separated into their individual component items, with each examiner allocated a group of items to mark, further reducing the possibility of examiner malpractice or biased marking.
Analysing assessment tasks in the context of principles and stakeholder needs
Upper secondary education systems necessarily make trade-offs in the design of their upper secondary certificates, balancing the function of certificates that is prioritised (i.e. certifying knowledge or facilitating selection) and the principles that are deemed most important in that assessment culture (i.e. relevance, credibility, fairness or manageability). No single assessment approach, even if it combines aspects of the matrix’s categories in highly sophisticated ways, is likely to achieve all of a country’s aims for its upper secondary certification system, in particular because different stakeholders may have conflicting aims.
Table 2.2 provides an overview of how the categories of assessment types discussed in the matrix (set out in Figure 2.1) might be evaluated against each of the principles for upper secondary certificates developed by this working paper. There are no definitive positions here, as each system will evaluate each assessment type within its own assessment context and culture. However, Table 2.2 provides general indications of how each assessment type might be viewed. The table attempts to cover differing views where these are likely but does not claim to represent every possible view of each category of assessment type. Chapters 3-5 provide more detailed descriptions of categories and their contents, examples of practice, and evidence of evaluation and stakeholder views.
Table 2.2. Different assessment methods and stakeholder perceptions of how these fit with principles for upper secondary certification
| | Assessment activities with unseen questions/tasks | Projects and portfolios, practical activities and performances | Naturally occurring evidence |
|---|---|---|---|
| Relevance | | | |
| Overall | All assessments can align with the curriculum, but full coverage usually requires a mix of types. | | |
| Students | Memorisation can make assessment feel detached from real-world applications. | Personalised projects and workplace-based tasks can feel authentic. | |
| Teachers and schools | Washback effect - narrowing the curriculum to what is examined | Projects fit well in arts, languages, PE, sciences and social sciences. | Teachers may worry about inconsistent marking and grading in relation to their own |
| Post-secondary destinations | Can help to promote consistent content coverage across regions and schools | Students develop important skills beyond knowledge learning. | Naturally occurring evidence requires high trust in assessors |
| Fairness | | | |
| Overall | Can create a ‘level playing field’ but does not suit all students. | Enables diverse students to showcase strengths. | Tailored to the individual students. |
| Students | May contribute to stress and anxiety. Need for consistent marking and grading. Not all students feel this assesses their strengths. | Projects etc. scheduled outside the exam window may relieve end-of-year stress. | Not all students have same opportunities and experiences. Concern about biased marking and grading inconsistencies. |
| Teachers and schools | Perception of unfairness if students’ exam results diverge from teachers’ expectations | Projects risk unequal support (e.g. devices, parental support) or plagiarism. | Autonomy to give grades based on a wide range of evidence and their judgement |
| Post-secondary destinations | Often see standardised exams and tests as the fairest assessment for selection decisions | Post-secondary institutions see the consistency of marking across schools for any form of teacher-based assessment as a key issue. | |
| Credibility | | | |
| Overall | Often seen as most credible form of assessment. | Value of skills and knowledge assessed but potential risk of external influence and unreliable marking. | Naturally occurring evidence tends to be used for select subjects and programmes – rarely for academic pathways. |
| Students | Credibility linked to fairness | Perceptions of credibility may be shaped by conditions and moderation. | |
| Teachers and schools | Generally seen to be credible | Importance of moderation. | Broader, holistic signalling i.e. that a student has grit, determination, etc. |
| Post-secondary destinations | Exam security underpins credibility | Engaging employers in assessment can build faith in the assessment. | Trust in results is context dependent. |
| Manageability | | | |
| Overall | Easy to schedule but may disrupt learning when not at the end of the year. | Can overload students and staff, especially when deadlines clash. | Can feel burdensome if overly detailed. |
| Students | Exam preparation can become overwhelming, particularly for students with special education needs | Projects may become repetitive. | Ongoing evidence collection feels like constant pressure |
| Teachers and schools | Exam focus may limit broader teaching | Projects add scheduling and moderation pressure. | In theory, very manageable |
| Post-school destinations | Exams support student comparison | Overreliance on project-based or unmoderated assessments may undermine selection decisions. | |
References
[24] Adams, R., S. Weale and C. Barr (2020), A-level results: almost 40% of teacher assessments in England downgraded, https://www.theguardian.com/education/2020/aug/13/almost-40-of-english-students-have-a-level-results-downgraded (accessed on 26 November 2024).
[10] Baird, J. et al. (eds.) (2018), Setting Standards in the United States: The Advanced Placement programme, IOE Press.
[4] Barnes, N., H. Fives and C. Dacey (2014), “Teachers’ beliefs about assessment”, in International Handbook of Research on Teachers’ Beliefs, https://doi.org/10.4324/9780203108437-25.
[8] Bishop, J. (1998), “The Effect of Curriculum-Based External Exit Exam Systems on Student Achievement”, The Journal of Economic Education, Vol. 29/2, pp. 171-182, https://doi.org/10.1080/00220489809597951.
[25] Department of Education (2020), GCSE and A level students to receive centre assessment grades, https://www.gov.uk/government/news/gcse-and-a-level-students-to-receive-centre-assessment-grades (accessed on 21 November 2024).
[20] Department of Education (2020), Guidance: Taking exams during the coronavirus (COVID-19) outbreak, https://www.gov.uk/government/publications/coronavirus-covid-19-cancellation-of-gcses-as-and-a-levels-in-2020/coronavirus-covid-19-cancellation-of-gcses-as-and-a-levels-in-2020 (accessed on 21 November 2024).
[18] Guez, A., Ketan and M. Piacentini (2024), “Mapping study for the integration of accommodations for students with Special Education Needs (SEN) in PISA”, OECD Education Working Papers, Vol. 308, https://doi.org/10.1787/ed03c717-en.
[32] Hayward, L. et al. (2023), “National qualifications in Scotland: A lightning rod for public concern about equity during the pandemic”, European Journal of Education, Vol. 58/1, https://doi.org/10.1111/ejed.12543.
[11] Matters, G. and G. Masters (2014), Redesigning the secondary–tertiary interface: Queensland Review of Senior Assessment and Tertiary Entrance, ACER, https://research.acer.edu.au/cgi/viewcontent.cgi?article=1000&context=qld_review (accessed on 8 November 2024).
[28] Ministry of Education (2018), NCEA Review Discussion Document Background to the NCEA Review, https://conversation-space.s3.ap-southeast-2.amazonaws.com/NCEA%20Review%20Background_WEB.pdf/NCEA%20Review%20Background_WEB.pdf (accessed on 25 November 2024).
[31] New Zealand Ministry of Education (2025), Proposal to replace NCEA with new national qualifications, https://web-assets.education.govt.nz/s3fs-public/2025-08/NCEA%20Discussion%20Document%202025_web%206Aug.pdf?VersionId=s3FX.AjGhhVp5riTS4iUi_ZIvI_Ul7xP (accessed on 4 August 2025).
[30] New Zealand Ministry of Education (2019), Annual Report, https://www.education.govt.nz/our-work/publications/corporate-documents/annual-reports-2018-2022 (accessed on 14 January 2025).
[27] New Zealand Ministry of Education (2006), The impact of NCEA on student motivation, https://www.educationcounts.govt.nz/publications/schooling2/learners/student-engagement-and-behaviour/29252 (accessed on 14 January 2026).
[26] New Zealand Qualifications Authority (2024), History of NCEA, https://www2.nzqa.govt.nz/ncea/about-ncea/history-of-ncea/ (accessed on 25 November 2024).
[29] New Zealand Qualifications Authority (2024), NCEA – six indicators and how they have changed over time, https://www2.nzqa.govt.nz/about-us/publications/insights-papers/ncea-six-indicators/ (accessed on 26 August 2025).
[13] OCR (2013), AS/A Level GCE: GCE General Studies Specification, OCR, https://www.ocr.org.uk/images/73471-specification.pdf (accessed on 21 November 2024).
[1] OECD (2024), Education at a Glance 2024: OECD Indicators, OECD Publishing, https://doi.org/10.1787/c00cad36-en.
[2] OECD (2024), PISA 2022 Results (Volume V): Learning Strategies and Attitudes for Life, PISA, OECD Publishing, Paris, https://doi.org/10.1787/c2e44201-en.
[17] OECD (2023), Strengthening Upper Secondary Education in Lithuania, OECD Publishing, https://doi.org/10.1787/a69409d7-en.
[5] OECD (2021), Scotland’s Curriculum for Excellence: Into the Future, Implementing Education Policies, OECD Publishing, Paris, https://doi.org/10.1787/bf624417-en.
[3] OECD (2019), Education at a Glance 2019: OECD Indicators, OECD Publishing, https://doi.org/10.1787/f8d7880d-en.
[36] OECD (2013), Synergies for Better Learning: An International Perspective on Evaluation and Assessment, OECD Reviews of Evaluation and Assessment in Education, OECD Publishing, Paris, https://doi.org/10.1787/9789264190658-en.
[35] OECD (Forthcoming), Delivering examinations digitally: balancing innovation and risk when the stakes are high.
[23] Ofqual (2020), Annex Q: Differences between CAGs and final grades by socio-economic group, ofqual, https://assets.publishing.service.gov.uk/media/5f36711a8fa8f5174816b25f/6656-3_Annex_Q_-_Awarding_GCSE__AS__A_level__advanced_extension_awards_and_extended_project_qualifications_in_summer_2020_-_interim_report.pdf (accessed on 21 November 2024).
[21] Ofqual (2020), Guidance: Summer 2020 grades for GCSE, AS and A level, Extended Project Qualification and Advanced Extension Awards, ofqual, https://assets.publishing.service.gov.uk/media/5ec7a699e90e0754d2437c0f/Summer_2020_Awarding_GCSEs_A_levels_-_Info_for_Heads_of_Centre_22MAY2020.pdf (accessed on 21 November 2024).
[22] Ofqual (2020), Research and Analysis: Awarding GCSE, AS, A level, advanced extension awards and extended project qualifications in summer 2020: interim report, ofqual, https://assets.publishing.service.gov.uk/media/5f3571778fa8f5173f593d61/6656-1_Awarding_GCSE__AS__A_level__advanced_extension_awards_and_extended_project_qualifications_in_summer_2020_-_interim_report.pdf (accessed on 21 November 2024).
[12] QCAA (2019), Queensland Core Skills (QCS) Test: Guideline, The State of Queensland, https://www.qcaa.qld.edu.au/downloads/senior/qcs_test_guideline_19.pdf (accessed on 8 November 2024).
[19] Republic of Poland (2018), Article 70 of the Act of 20 July 2018 - Law on Higher Education and Science, https://isap.sejm.gov.pl/isap.nsf/download.xsp/WDU20180001668/T/D20181668L.pdf (accessed on 21 October 2025).
[34] Scottish Government (2025), Curriculum, Qualifications and Assessment Reform: progress to date and next steps, https://www.gov.scot/publications/curriculum-qualifications-assessment-reform-progress-date-next-steps/pages/2/ (accessed on 23 June 2025).
[33] Scottish Government (2016), Action on teacher workload confirmed, https://www.gov.scot/news/action-on-teacher-workload-confirmed/ (accessed on 26 May 2025).
[15] Skolverket (2020), Analyses of equivalence in secondary school grading: Comparisons between course grades and course tests, National Agency for Education Skolverket, https://www.skolverket.se/download/18.1a8151cc170ae4599bce10/1585902805741/pdf6564.pdf (accessed on 8 November 2024).
[14] Skolverket (2020), Equivalent ratings and merit values: A knowledge base on models to promote equivalence of grades and credits, National Agency for Education Skolverket, https://www.skolverket.se/getFile?file=7582 (accessed on 8 November 2024).
[16] SOU (2025), An equal grading system: Report of the Inquiry on Equivalent Grades and Merit Values, https://www.regeringen.se/rattsliga-dokument/statens-offentliga-utredningar/2025/02/sou-202518/ (accessed on 2 June 2025).
[6] SQA (2021), SQA information for OECD Independent Review of qualifications and assessment, https://www.sqa.org.uk/sqa/files_ccc/20210312-sqa-information-for-oecd.pdf (accessed on 17 June 2024).
[7] SQA (2017), Guide to Assessment, Scottish Qualifications Authority, https://www.sqa.org.uk/files_ccc/Guide_To_Assessment.pdf (accessed on 28 November 2024).
[9] Wößmann, L. (2000), “Schooling Resources, Educational Institutions, and Student Performance: The International Evidence”, Kiel Working Paper, https://hdl.handle.net/10419/17917 (accessed on 5 November 2024).
Note
1. See, for example, (OECD, 2013[36]) for a discussion on the wider accountability uses of student assessment data.