This chapter outlines the methodology of the OECD’s PaRIS, emphasising its pioneering approach to collecting data from patients and primary care practices across 19 countries. The chapter explains the conceptual framework, research questions, questionnaires, target population, eligibility criteria, and the nested study design, addressing sampling frames and methods. It also examines implementation details in each participating country, limitations in the approach, and response rates, ensuring transparency in the representativeness of the results. Lastly, the analytical methods used to generate comparable country estimates of patient-reported indicators are outlined, covering standardisation, case‑mix adjustment, and the management of missing data to ensure the reliability and validity of the results across diverse healthcare settings.
Does Healthcare Deliver?
7. PaRIS data and methods
7.1. What is PaRIS?
PaRIS aims to improve healthcare by assessing and publishing patient-reported measures across OECD and partner countries to facilitate international comparison and cross-country learning.
PaRIS is a survey of unprecedented scale and includes data collected in 2023‑24 from 107 011 patients linked to 1 816 primary care practices across 19 countries.1
PaRIS addresses gaps in healthcare evaluation by systematically measuring patient-reported experiences and outcomes and using results to drive system-level improvements. In doing so, PaRIS supports countries in making health systems more responsive to people’s needs.
PaRIS has been co-created with policy makers, patients, healthcare professionals, academics, and other stakeholders worldwide for comprehensive and policy-relevant data collection. It focuses on the whole person and enables primary care and health system characteristics to be linked to the outcomes and experiences of patients.
PaRIS is designed for cross-country comparison. The main instrument is a survey among primary care service users aged 45 years or older. An additional survey among their primary care practices is included to collect data on the characteristics of primary care services and the care they provide.
The flagship report provides insights into multimorbidity, people‑centred care, inequalities, trust in healthcare and key indicators like self-reported health, well-being and experienced quality of care.
7.1.1. The unique features of PaRIS
In contrast to most traditional health data, PaRIS offers comprehensive details on people’s lives, including nationality and country of birth, financial situation, education, household composition, health-related behaviour, and more. This rich dataset allows for a deeper exploration of how health outcomes and experiences vary across different groups.
PaRIS has a people‑centred approach to patient-reported measures. While much of the existing work on patient-reported measures focuses on specific conditions or procedures, PaRIS stands out by taking a broader perspective, encompassing multiple chronic conditions. PaRIS looks beyond isolated health events and considers the ongoing experiences of people with chronic conditions.
Unlike population-based health surveys, PaRIS focuses exclusively on primary care users aged 45 years or older. As such, it does not serve as a source for estimating prevalence or incidence rates in the population. Instead, it offers a nuanced understanding of the experiences and outcomes of people who have firsthand experience with primary care services, an area that has often been poorly understood in many countries to date.
PaRIS links characteristics of primary care settings with outcomes and experiences of patients. The survey was conducted on two levels: questionnaires were filled out by both patients and their primary care practices. This allows analysis and further interpretation of findings in light of primary care characteristics – creating a foundation for primary care policy reforms.
PaRIS is a collaborative initiative. The PaRIS-SUR Consortium is an international team led by the Netherlands Institute for Health Services Research that supports the development and implementation of PaRIS. Moreover, the OECD also partnered with patients, healthcare professionals, policy makers, and academics worldwide. These stakeholders played a vital role throughout the project, selecting and testing survey questions, designing the survey implementation, conducting the survey in the field, and analysing results.
7.1.2. Countries and economies participating in PaRIS
Nineteen countries participated in the first cycle of PaRIS: seventeen OECD member states – Australia, Belgium, Canada, Czechia, France, Greece, Iceland, Italy, Luxembourg, the Netherlands, Norway, Portugal, Slovenia, Spain, Switzerland, the United States, Wales – and two non-member countries – Romania and Saudi Arabia. Countries collected data during 2023, except for Switzerland, which collected patient data in early 2024. Data collection was managed by National Project Managers in each country, supported and co‑ordinated by the PaRIS Consortium and the OECD.
7.1.3. Patient survey respondents
A total of 114 576 patients took part in the first cycle of PaRIS. Of these patients, 107 011 could be linked to a primary care practice and were thus available for analysis.2 PaRIS invited people aged 45 and over who had at least one contact with a primary care practice in the 6 months preceding the day of sampling, and who lived in a private household (i.e. not an institution such as a long-term care facility). Eighty-two percent of people surveyed reported having at least one chronic condition.
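The linkage step described above can be sketched in a few lines. This is an illustrative sketch only: the field name `practice_id` and the dictionary-based records are assumptions made for the example, not the structure of the PaRIS database.

```python
# Hypothetical record layout: each patient response may carry a "practice_id".
# The real PaRIS linkage key and data structures are not described in this chapter.

def link_patients(patients, practices):
    """Split patient responses into those linked to a participating practice and the rest."""
    known_ids = {practice["practice_id"] for practice in practices}
    linked = [p for p in patients if p.get("practice_id") in known_ids]
    unlinked = [p for p in patients if p.get("practice_id") not in known_ids]
    return linked, unlinked


def linked_share(n_linked, n_total):
    """Percentage of respondents retained for analysis after linkage."""
    return 100.0 * n_linked / n_total


# Figures quoted in the text: 107 011 of 114 576 respondents could be linked.
print(round(linked_share(107_011, 114_576), 1))
```

With the figures reported in this section, roughly 93% of respondents were retained for analysis, and 7 565 could not be linked to a practice.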
Patients were asked to self-report their chronic conditions. Self-reporting was preferred to health records to avoid patient privacy breaches and to facilitate data collection in all countries. Self-reporting chronic conditions also avoided comparability challenges related to differences in registration practices between and within countries.
While PaRIS focuses mainly on people living with chronic conditions, this was not a precondition to participate in the survey. As a result, the patient questionnaire was also completed by primary care service users who did not report having a chronic condition. These respondents were included to provide valuable context for the interpretation of results. For example, when comparing the quality of life of people with chronic conditions across countries, reference data on people living without chronic conditions can shed light on healthcare system factors, independent of the presence and prevalence of health conditions.
7.1.4. Primary care practices
A total of 1 816 primary care practices participated in PaRIS. Primary care is considered the first point of contact for non-emergency medical care available to all members of a community, regardless of age, gender or health conditions. It is frequently delivered by teams of healthcare professionals, usually co‑ordinated by a family doctor or a general practitioner. Primary care practitioners offer generalist care, covering a broad spectrum of health needs, including the management of chronic conditions.
Generalist care emphasises comprehensive, continuous, and co‑ordinated care, without restrictions based on patient categories. The focus is on ensuring that patients receive all necessary care, maintain regular contact with a healthcare professional, and experience well-organised and efficient care delivery.
The term primary care practice in this report refers to places where people receive care, such as practices or medical offices, rather than – but not excluding – individual healthcare professionals. These settings may vary in size, ranging from small single‑practitioner offices to larger facilities with multiple healthcare professionals.
Primary care practices typically employ doctors known as general practitioners, family physicians, or family doctors. These professionals serve as a first point of care and deliver care to community members without requiring a referral. These facilities may also employ other healthcare professionals, such as nurses, medical assistants or social workers.
Primary care practices were requested to complete a primary care practice questionnaire. This questionnaire focuses on the organisational structure of the practice and does not include individual sociodemographic items pertaining to the respondent, such as age and gender. Additionally, except in France, Norway and Wales, primary care practices were also requested to facilitate patient sampling and recruitment.
Box 7.1. Relevant publications
The following is a list of relevant publications detailing the methods behind PaRIS:
PaRIS project presented to the Health Committee (OECD, 2019[1])
PaRIS Study Protocol (de Boer et al., 2022[2]), updates (Rijken M et al., 2023[3])
PaRIS conceptual framework (Valderas et al., 2024[4])
PaRIS Primary care practice Questionnaire (Bloemeke-Cammin et al., 2024[5])
PaRIS Patient Questionnaire (Valderas et al., 2025[6])
PaRIS Analysis Plan (Groenewegen et al., 2024[7])
PaRIS Field Trial Report (van den Berg et al., 2024[8])
Other relevant publications
All hands on deck: Co-developing the first international survey of people living with chronic conditions: Stakeholder engagement in the design, development, and field trial implementation of the PaRIS survey (Kendir et al., 2023[9])
Lessons from early implementation of the OECD’s Patient-Reported Indicator Surveys (PaRIS) in primary care: making the case for co-development and adaptation to national contexts. (Kendir et al., 2023[10])
Engaging primary care professionals in OECD’s international PaRIS survey: a documentary analysis (Kendir et al., 2024[11])
7.2. What does PaRIS measure?
7.2.1. Conceptual framework
The PaRIS conceptual framework was developed together with a wide range of stakeholders, including policy makers, patients, healthcare professionals, academics, and others. A detailed description of the PaRIS conceptual framework was published in 2024 (Valderas et al., 2024[4]).
The framework consists of the following domains (Figure 7.1): patient-reported outcomes (symptoms, functioning, self-reported health status, health-related quality of life); patient-reported experiences of care (access, comprehensiveness, continuity, co‑ordination, safety, people‑centredness, self-management support, trust, overall perceived quality of care); patients’ health and care capabilities; patients’ health behaviours (physical activity, diet, tobacco use, alcohol use); patients’ individual and sociodemographic characteristics; primary care delivery system (characteristics of the primary care facility; characteristics of the main primary care professional); characteristics of the health system, policy and context.
Figure 7.1. Conceptual framework of PaRIS
As shown in the framework, the care experiences and outcomes of people living with chronic conditions are expected to be determined by their personal and condition-related characteristics, their capabilities to self-manage their health and care, and their health behaviours. Moreover, the structural characteristics of primary care practices, as well as the way they organise the delivery of chronic care, are expected to affect patients’ care experiences and outcomes. Finally, characteristics of the health system, policy and context determine how the provision of primary care is organised and how chronic conditions are managed in a country.
PaRIS data addresses many questions that are relevant for people living with chronic conditions and their families, primary care practices, policy makers and health authorities. This report mainly focuses on the overall goal of PaRIS, that is, to inform countries on the patient-reported outcomes and care experiences of their citizens living with chronic conditions compared with similar populations in other countries. The data analysis is guided by the following main research questions:
7.2.2. Research questions
What are the patient-reported outcomes of primary care service users aged 45 and over with chronic conditions, compared with those without chronic conditions, in the areas of symptoms, physical, mental and social functioning, self-reported health and health-related quality of life? How do these results vary across countries?
What are the experiences of primary care service users aged 45 and over with chronic conditions, compared with those without chronic conditions, in the areas of access, comprehensiveness, continuity, co‑ordination, safety and people‑centredness of care, self-management support, trust and overall perceived quality of care? How do these results vary across countries?
How do patient-reported outcomes and care experiences vary for primary care service users aged 45 and over with chronic conditions by background characteristics such as age group, gender, education level, occupational status, household composition, health-risk behaviours, level of multimorbidity, disease status and confidence in managing one’s own care?
How do key characteristics of primary care practices relate to the care experiences and outcomes of primary care service users aged 45 and over with chronic conditions?
How do characteristics of health systems and countries relate to the care experiences and outcomes of primary care service users aged 45 and over with chronic conditions?
7.2.3. The Questionnaires
PaRIS uses two questionnaires – one for patients and another for primary care practices. These questionnaires were co-created by the OECD and key stakeholders, including patients and primary care professionals (Kendir et al., 2023[9]). Ensuring comparability across countries was a priority in the development process, given the challenges posed by cultural biases, differences in expectations, and response styles that can impact measurement invariance in cross-country studies. To address these issues, the questionnaires were designed with intercultural differences in mind, incorporating insights from international guidelines and expert recommendations for cross-cultural validation and adaptability (Avvisati, Le Donné and Paccagnella, 2019[12]; Van de Vijver et al., 2019[13]).
This process included extensive consultation with patient groups and primary care professionals, rigorous translation procedures to ensure linguistic accuracy, and cognitive testing in each participating country to refine items and minimise cultural discrepancies (see Section 7.2.4). Moreover, during the Field Trial, questionnaires were tested in 17 countries and modifications to enhance cross-country validity were incorporated in the final questionnaires. Additionally, the reliability and validity of the measures across diverse contexts were assessed with Field Trial data.
The PaRIS Patient Questionnaire (PaRIS-PQ)
Themes covered in the patient questionnaire are:
Patient-Reported Outcome Measures (PROMs): PROMs refer to the patient’s health, quality of life, or functional status that could be affected by healthcare.
Patient-Reported Experience Measures (PREMs): PREMs refer to people’s experience with healthcare, for example with access, communication, co‑ordination of care, etc.
Trust: people’s confidence in primary care practices and in the healthcare system as a whole.
Healthcare capabilities, such as people’s ability to manage own health and well-being.
Health behaviour, such as diet, exercise and the use of alcohol and tobacco products.
Healthcare use.
Demographic variables, such as age, gender and education.
Self-reported chronic conditions.
The full source questionnaire in English is available at the OECD PaRIS website.3 A detailed description of the development of the questionnaire will be published in 2025 (Valderas et al., 2025[6]).
The PaRIS Primary Care Practice questionnaire (PaRIS-PCPQ)
Themes covered in the PaRIS-PCPQ:
The organisation of care
Chronic care management
Information about the practice
The full source questionnaire in English is available at the OECD PaRIS website.2 A detailed description of the development of the questionnaire has been published by Bloemeke‑Cammin et al. (2024[5]).
7.2.4. Translation and cognitive testing
Two source versions were developed for both questionnaires: one in UK English and one in French. Translations into the main languages of participating countries were conducted by a specialised translation agency (cApStAn / CEDAR (for Welsh)) in close collaboration with national translators and experts, and followed the TRAPD model (Translation, Review, Adjudication, Pretest, Documentation) (Harkness, van de Vijver and Johnson, 2003[14]). The method requires two independent translations followed by four stages: reconciliation, adjudication, adaptation and final proofreading (Figure 7.2).
The survey was translated into 16 languages: Arabic, Czech, Dutch, English, French, German, Greek, Hebrew, Icelandic, Italian, Norwegian, Portuguese, Romanian, Slovene, Spanish and Welsh. National project managers were encouraged to co‑ordinate additional translations for minority languages in their countries using the same method. Additional translations were made into Turkish, Russian, Basque, Chinese, Vietnamese, Punjabi, Catalan and Galician.
Figure 7.2. The TRAP-D method for survey translation
Source: PaRIS Translation guidelines and resourcing requirements, based on Harkness, J., F. van de Vijver and T. Johnson (2003[14]), “Questionnaire design in comparative research”, in Cross-Cultural Survey Methods.
At a later stage, the questionnaires underwent the four‑stage model of cognitive testing developed by Tourangeau (1984[15]): Comprehension – Retrieval – Judgement – Response. The model evaluates how respondents understand, recall, and provide accurate responses to survey questions. This process helps ensure that the questionnaires are both reliable and valid, minimising bias and errors in the data collection.
7.2.5. Chronic conditions
The presence of chronic conditions among PaRIS respondents was measured via self-report. The initial version of self-assessment categories was based on a comprehensive scoping review focusing on the documentation of self-reported chronic conditions in primary care. The authors employed specific criteria for the selection of chronic conditions, including their relevance to primary care services, impact on affected patients, prevalence among primary care users, and how often the conditions were present in the condition lists retrieved from literature (Fortin, Almirall and Nicholson, 2017[16]).
Subsequent modifications involved removing categories with anticipated low prevalence in OECD countries based on Global Burden of Disease studies,4 followed by rigorous assessment and testing by both patients and healthcare professionals. To ensure accessibility to laypersons, conditions were presented in simple, self-explanatory terms rather than professional medical language. Conditions were grouped into broader categories such as “breathing conditions” and “cardiovascular or heart conditions” to minimise potential misunderstanding stemming from the use of specific terminology.
The list of conditions was reviewed by healthcare professionals and patients. Box 7.2 shows the final question and options presented to survey respondents.
Box 7.2. Self-reporting tool used in the PaRIS Patient Questionnaire for asking about chronic conditions
Have you ever been told by a doctor that you have any of the following health conditions?
Please select all the options that apply.
High blood pressure
Cardiovascular or heart condition
Diabetes (type 1 or 2)
Arthritis or ongoing problem with back or joints
Breathing condition (e.g. asthma or COPD)
Alzheimer’s disease or other cause of dementia
Depression, anxiety or other mental health condition (e.g. bipolar disorder or schizophrenia)
Neurological condition (e.g. epilepsy or migraine)
Chronic kidney disease
Chronic liver disease
Cancer (diagnosis or treatment in the last 5 years)
Other long-term problem(s)
I have never been told by a doctor that I have any of these problems.
Source: PaRIS Patient Questionnaire 2024.
Because the survey relied on self-reporting, it was not necessary to access patients’ medical records to identify whether a respondent had received a relevant diagnosis. This approach mitigated registration bias due to different registration practices across countries and reduced concerns related to accessing sensitive health information in the medical record. However, some limitations should be considered when interpreting the data:
While we refer throughout the report to people with 1, 2, or more chronic conditions, or people with multimorbidity, the method may underestimate the prevalence of multimorbidity in cases where individuals have multiple conditions falling into the same category. To avoid overestimation of the number of chronic conditions, respondents were asked to report conditions that they were “told by a doctor” they have.
Due to the broad categorisation, the PaRIS data offer limited insight into the analysis of specific conditions.
Varying criteria, definitions, and measurements across countries may impact the interpretation of certain conditions. For instance, although some standards are published by international professional associations, national guidelines for defining high blood pressure can differ in participating countries (Justin et al., 2022[17]).
Despite these limitations, stakeholders and experts involved in PaRIS agreed that the self-assessment tool was the best choice for capturing chronic conditions in this survey. It safeguards privacy, minimises bias, and promotes participation. While its constraints are recognised, the tool’s robust development and user-friendly design provide valuable insights into health status and support international comparability.
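The first limitation above, undercounting when several diagnoses fall into one category, can be illustrated with a short sketch. The category labels are taken from Box 7.2; the set-based encoding of answers and the threshold of two categories for multimorbidity are assumptions made for this example and may differ from the definitions applied in the PaRIS analyses.

```python
from typing import Iterable

# Response options as listed in Box 7.2.
CONDITION_CATEGORIES = frozenset({
    "High blood pressure",
    "Cardiovascular or heart condition",
    "Diabetes (type 1 or 2)",
    "Arthritis or ongoing problem with back or joints",
    "Breathing condition (e.g. asthma or COPD)",
    "Alzheimer’s disease or other cause of dementia",
    "Depression, anxiety or other mental health condition (e.g. bipolar disorder or schizophrenia)",
    "Neurological condition (e.g. epilepsy or migraine)",
    "Chronic kidney disease",
    "Chronic liver disease",
    "Cancer (diagnosis or treatment in the last 5 years)",
    "Other long-term problem(s)",
})
NONE_OPTION = "I have never been told by a doctor that I have any of these problems."


def count_conditions(selected: Iterable[str]) -> int:
    """Count ticked condition categories (categories, not individual diagnoses)."""
    return len(set(selected) & CONDITION_CATEGORIES)


def has_multimorbidity(selected: Iterable[str], threshold: int = 2) -> bool:
    """Flag multimorbidity; the threshold of two categories is an assumption."""
    return count_conditions(selected) >= threshold


# A respondent with both asthma and COPD can tick only one category,
# so two diagnoses are counted as a single condition.
asthma_and_copd = {"Breathing condition (e.g. asthma or COPD)"}
print(count_conditions(asthma_and_copd), has_multimorbidity(asthma_and_copd))
```

Under this encoding, a respondent with two diagnoses in the same category is not flagged as multimorbid, which is the direction of bias the text describes.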
7.2.6. PaRIS Field Trial
Nineteen countries (Australia, Belgium, Canada, Czechia, England, France, Greece, Iceland, Italy, Luxembourg, the Netherlands, Norway, Portugal, Romania, Saudi Arabia, Slovenia, Spain, Switzerland and Wales) participated in the Field Trial and collected data on different timelines between March 2022 and March 2023.
The Field Trial aimed to assess the psychometric quality, reliability, and validity of the questionnaires. Further, it was used to test the implementation and collection design in participating countries. The field trial report was published in June 2024 (van den Berg et al., 2024[8]).
A total of 11 153 patients completed the Field Trial patient questionnaire. Patient participation per country ranged from 2 360 to 698. Furthermore, 547 primary care practices in 18 countries completed the Field Trial practice questionnaire.
Response rates varied strongly between countries. Overall, the sample sizes were adequate for a robust test of survey instruments. In confirmatory factor analyses, the Field Trial showed an acceptable performance of the internationally validated PROMs (PROMIS Global – Physical and Mental health scales, and WHO‑5 Well-being Index).
The Field Trial provided valuable lessons for the Main Survey. Based on the results, the survey tools were revised and survey administration procedures were improved. National Project Managers of participating countries critically reflected on achieved response rates and shared experiences and good practices on recruitment, engagement and communication. Several countries considerably improved their recruitment strategies to ensure higher response rates in the Main Survey.
7.3. The PaRIS population
7.3.1. PaRIS target population
To secure international comparability, the standardisation of instruments and procedures is essential. Participants in PaRIS are 1) primary care practices and 2) primary care service users aged 45 and over, living in a private household, who had at least one contact with their primary care practice in the 6 months preceding the sampling. Although specific procedures to achieve national or partially national representativeness differ between countries, this simple, concise definition makes populations comparable across participating countries.
7.3.2. Why this population?
Unlike many studies that concentrate on specific diseases or evaluate “before‑and-after” outcomes of specific medical procedures, PaRIS focuses on the continuous management of chronic conditions. This approach is crucial for understanding the real-world experiences of patients who require ongoing care. PaRIS targets a diverse group of people living with chronic conditions who are managed in primary care settings. The choice for this population followed from the recommendations of the high-level reflection group and the 2017 OECD Health Ministerial declaration. Reasons behind the selection of this population are:
High prevalence and impact
Chronic conditions, such as diabetes, hypertension, and chronic obstructive pulmonary disease, represent a significant and growing burden on people and healthcare systems worldwide. These conditions often coexist, complicating patient care and necessitating a more comprehensive understanding of their management. People with chronic conditions form the largest and fastest growing group of healthcare users worldwide.
The central role of primary care
Primary care serves as the cornerstone of chronic disease management. It provides continuous, co‑ordinated, and comprehensive care, which is essential for managing long-term conditions. Unlike care that focuses on acute episodes or specific interventions, primary care offers an ongoing relationship with patients. This continuity allows for a deeper understanding of patient needs, the progression of their conditions, and the effectiveness of management strategies over time.
Most existing healthcare knowledge is derived from hospital settings and focuses on specific conditions, providing detailed insights into acute care and specialised treatments. However, most healthcare, especially for chronic conditions, occurs outside hospital walls. Despite this, primary care remains largely a “black box” in terms of research and understanding. By concentrating on this often‑overlooked aspect of healthcare, PaRIS seeks to shed light on the quality and nature of care provided in these settings. This focus helps to uncover gaps, improve care delivery, and ensure that the insights gained are reflective of where most patient care actually happens.
Facilitating international comparisons and policy development
By concentrating on a patient group that is universally relevant and requires long-term care, PaRIS enables meaningful international comparisons. These comparisons can shed light on different healthcare systems’ strengths and weaknesses, offering valuable insights for policy development and healthcare reforms. The initiative’s global perspective fosters the sharing of best practices and innovative solutions, ultimately improving the quality of life for patients worldwide.
7.3.3. Selection criteria
Eligibility criteria for primary care practices
Primary care practices in PaRIS:
are staffed with care professionals who are licensed to serve the general population of a community, and
provide ambulatory generalist medical care (i.e. in an outpatient setting), including services addressing chronic care management.
The term “primary care practice” refers to a facility, unit or practice rather than to an individual primary care practitioner. Primary care practices can be small, for example a solo practice of a family doctor, or large, for example a health centre with staff from multiple disciplines. “Generalist care” refers to care that is focused on the whole person; not restricted to particular body systems.
Eligibility criteria for patients
Patients in PaRIS:
are aged 45 years or older at the time of sampling; and
live in a private household in the community (i.e. not in a nursing home or other residential institution); and
had at least one registered contact with a primary care practice – either face‑to-face, by telephone or online – for any medical or administrative reason, during the six months preceding the selection procedure in the practice information system.
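The three patient criteria above amount to a simple filter over a practice’s patient list. The sketch below is illustrative only: the record fields are hypothetical, and treating six months as 183 days is an assumption, since the operational definition used in each practice information system is not given here.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class PatientRecord:
    """Hypothetical extract from a practice information system."""
    birth_date: date
    lives_in_private_household: bool
    last_contact: Optional[date]  # most recent registered contact, any mode or reason


def age_on(birth_date: date, on: date) -> int:
    """Completed years of age on a given day."""
    had_birthday = (on.month, on.day) >= (birth_date.month, birth_date.day)
    return on.year - birth_date.year - (0 if had_birthday else 1)


def is_eligible(p: PatientRecord, sampling_day: date, window_days: int = 183) -> bool:
    """Apply the age, household and recent-contact criteria.

    window_days=183 approximates six months (an assumption for this sketch).
    The reason for contact and the presence of a chronic condition are
    deliberately not inspected, in line with Box 7.3.
    """
    if age_on(p.birth_date, sampling_day) < 45:
        return False
    if not p.lives_in_private_household:
        return False
    if p.last_contact is None:
        return False
    earliest = sampling_day - timedelta(days=window_days)
    return earliest <= p.last_contact <= sampling_day
```

For example, `is_eligible(PatientRecord(date(1970, 1, 1), True, date(2023, 3, 1)), date(2023, 6, 1))` evaluates to `True`, whereas the same patient with a last contact a year earlier would be excluded.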
Chronic conditions are prevalent among both young and older people. In that respect, the exclusion of people under 45 is a limitation of the PaRIS study. The age‑threshold has been a pragmatic choice: the prevalence of chronic conditions increases with age. This means that the older the sampled group, the higher the chance to include people with chronic conditions. For the younger cohorts, much larger samples would have been required. Experts involved in the development of the study design agreed on 45 years as a pragmatic trade‑off between ensuring enough statistical power to draw meaningful conclusions about the included age groups while avoiding solely focusing on older people.
For patients who were unable to fill out the questionnaire themselves, for example due to visual impairments, proxy respondents were allowed to complete the questionnaire on their behalf. Proxy responses are identifiable in the data for analytical purposes. Patients who had a contact but whose participation in the survey was deemed too burdensome by the primary care practice could be excluded. Further considerations underlying these eligibility criteria for patients are provided in Box 7.3.
Box 7.3. Clarification of eligibility criteria for patients
Reason for contact was not a selection criterion
Why patients contacted the primary care practice in the past six months, i.e. reason for contact, does not play a role in the identification of eligible patients, because the reason for contact is privacy-sensitive information.
Presence of a chronic condition was not a selection criterion
Whether or not the patient had one or more registered (medically diagnosed) chronic conditions was not considered in the identification of eligible patients. There are several reasons for this:
It would have required the use of privacy-sensitive data;
It would have created too much of a burden for primary care practices and/or practice staff;
It would have resulted in registration bias as coding and practice information systems vary.
How people living with a chronic condition were identified
The patient questionnaire contains questions that make it possible to identify these patients based on their self-reported chronic condition(s). The instrument used is based on literature on self-reporting. Conditions are formulated on a generic level to make them understandable for laypersons, for example “heart conditions”, “cancer”, etc. See Section 7.2.5 “Chronic conditions”.
People without a chronic condition participating in the survey
Because no medical information about patients was known beforehand, some of the sampled patients did not have a chronic condition. The invitation letter invited all sampled patients to participate and did not mention anything related to chronic conditions.
Most questions of the patient questionnaire have been developed and validated for use in the general population including both people with and without (chronic) conditions. A few questions are only applicable to people with chronic conditions; respondents who do not report a chronic condition could skip these questions.
7.4. Study design
PaRIS has a nested design with three levels: patients (primary care service users) are nested in primary care practices, which are nested in national healthcare systems. This allows analysis of the variation in patient-reported data in relation to characteristics of and care provided by primary care practices within and across participating countries.
The nested design of the survey allows for a better understanding of the hierarchical structure of patient-reported information. Because patients experience primary healthcare through practices in their own countries, each practice influences only the responses of its own patients, and country characteristics affect only patients in that country. As a result, the outcomes and experiences of patients in the same practice are not independent of each other, which violates the assumption of independent observations that underlies standard statistical methods. Ignoring the hierarchical structure can lead to underestimating the variability in the data and drawing inaccurate conclusions. Multilevel models account for this nested structure and allow the variation in outcomes to be partitioned across the different levels: countries, primary care practices, and patients. This approach not only enhances our understanding of patient-level factors (such as demographics), practice-level factors (such as primary care practice capacities), and country-level factors (such as healthcare system characteristics), but also aids in designing targeted policies. It offers a nuanced view of the data’s complexity, avoiding the oversimplifications that might arise from treating observations as independent.
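To illustrate how a multilevel model partitions variation across the three levels, a minimal random-intercept specification can be written as follows. The notation is assumed here for illustration and is not taken from the PaRIS study protocol: $y_{ijk}$ is the outcome of patient $i$ in practice $j$ in country $k$, and $v_k$, $u_{jk}$ and $e_{ijk}$ are the country, practice and patient deviations.

```latex
y_{ijk} = \beta_0 + v_k + u_{jk} + e_{ijk},
\qquad
v_k \sim N(0, \sigma_v^2),\quad
u_{jk} \sim N(0, \sigma_u^2),\quad
e_{ijk} \sim N(0, \sigma_e^2)

\text{share of variance at country level}
  = \frac{\sigma_v^2}{\sigma_v^2 + \sigma_u^2 + \sigma_e^2},
\qquad
\text{share at practice level}
  = \frac{\sigma_u^2}{\sigma_v^2 + \sigma_u^2 + \sigma_e^2}
```

Under this sketch, a model that treats all observations as independent effectively sets $\sigma_v^2 = \sigma_u^2 = 0$, which is why it tends to understate the uncertainty of practice- and country-level estimates.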
Figure 7.3. PaRIS nested design
The three levels of PaRIS
7.4.1. Sample size
The target number of primary care practices in a country was based on the number of eligible practices in the country. In countries with 1 000 or more eligible primary care practices, the goal was to include 100 practices. For countries with fewer than 1 000 eligible primary care practices, the target was 75, or as many as possible if the total number of eligible practices was fewer than 75.
The target number of patients was set to 75 per primary care practice, a standard applied to all countries. The required number of participating patients was based on the assumption that at least 70% would report having one or more chronic conditions. This assumption was based on the Field Trial results. Some countries aimed for higher numbers to enable additional national-level analyses.
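The targeting rule described above can be sketched in a few lines of Python. The function and constant names are hypothetical and chosen only for illustration; the thresholds (1 000 eligible practices, targets of 100 and 75 practices, 75 patients per practice) come from the text.

```python
PATIENTS_PER_PRACTICE = 75  # patient target per practice, same for all countries


def target_practices(eligible_practices: int) -> int:
    """Target number of practices for a country, following the rule in the text."""
    if eligible_practices >= 1000:
        return 100
    # Fewer than 1 000 eligible practices: aim for 75, or all of them if fewer exist.
    return min(75, eligible_practices)


def target_patients(eligible_practices: int) -> int:
    """Patient target implied by the practice target."""
    return target_practices(eligible_practices) * PATIENTS_PER_PRACTICE


for eligible in (4200, 600, 40):  # hypothetical numbers of eligible practices
    print(eligible, target_practices(eligible), target_patients(eligible))
# 4200 100 7500
# 600 75 5625
# 40 40 3000
```

The resulting patient targets of 7 500 and 5 625 correspond to 100 and 75 practices with 75 patients each.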
The PaRIS study protocol (de Boer et al., 2022[2]) provides a detailed description of the sample calculation, which determined the optimal number of practices and patients per practice required to reliably assess the survey’s main outcome measures. This calculation also ensured sufficient statistical power to address PaRIS’ key research questions.
In a two‑step approach, the first step involved calculating the sample size required to achieve reliable outcomes. Outcome reliability for four patient-reported measures was analysed across three levels: patient, primary care practice, and country, using the multilevel model reliability measure (Raudenbush, 2003[18]; Leyland and Groenewegen, 2020[19]).
Simulations from Field Trial data determined that at least 50 primary care practices per country with 100 patients each, or 100 practices with 75 patients each, were needed to meet the reliability criterion of 0.70, akin to Cronbach’s alpha in single‑level models (Cronbach, 1951[20]).
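As a rough illustration of why the number of patients per practice matters for the 0.70 criterion, the sketch below uses one common form of multilevel reliability for a group mean: between-group variance divided by between-group variance plus within-group variance over the group size. This is an assumption about the general approach, not a reproduction of the protocol's calculation, and the variance components used are hypothetical.

```python
def group_mean_reliability(n: int, var_between: float, var_within: float) -> float:
    """Reliability of a group (e.g. practice) mean based on n respondents."""
    return var_between / (var_between + var_within / n)


def min_group_size(var_between: float, var_within: float,
                   threshold: float = 0.70) -> int:
    """Smallest n for which the group-mean reliability reaches the threshold."""
    if var_between <= 0:
        raise ValueError("var_between must be positive")
    n = 1
    while group_mean_reliability(n, var_between, var_within) < threshold:
        n += 1
    return n


# Hypothetical variance components: 5% of the variance lies between practices.
var_between, var_within = 0.05, 0.95
for n in (25, 50, 75, 100):
    print(n, round(group_mean_reliability(n, var_between, var_within), 2))
print(min_group_size(var_between, var_within))  # 45 under these assumed values
```

With a smaller share of variance between practices, more patients per practice are needed to reach the same reliability, which is the trade-off the simulations explored.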
In the second step, it was assessed whether the sample size that achieved outcome reliability had sufficient statistical power to answer the five main PaRIS research questions (Section 7.2.2). To compare PREMs and PROMs of people with chronic conditions to those of people without chronic conditions and between countries (research questions #1 and #2), it was determined that the Field Trial sample was adequate for identifying three groups of countries based on their PREM/PROM scores: (1) those aligning with the overall average, (2) those scoring above the average, and (3) those scoring below it.
The large patient sample in the survey ensures reliable estimation of outcomes across multiple patient-level variables in a multilevel regression model (research question #3). Likewise, it was estimated that 70 to 100 primary care practices per country would be sufficient for studying the relationships between primary care practice characteristics and PREMs/PROMs (research question #4). This estimate is in line with the rule of thumb, supported by the literature, of ten observations per parameter, adjusted for multilevel models, which suggests that an average of 70 practices per country is required. Finally, given the 19 participating countries, it was determined that there is only enough statistical power to study the effect of one healthcare system characteristic on PREMs/PROMs at a time (research question #5).
Box 7.4. Summary of sample size calculation
Sample size calculation focuses on securing 1) outcome reliability, and 2) statistical power to answer the main research questions.
In a two‑step approach, the first step calculates the optimal sample to achieve outcome reliability. The second step tests the sample size obtained in step one or the field trial data, analytically or conceptually, for sufficient statistical power to answer PaRIS’ main research questions.
Results of sample size calculation
The number of primary care practices to be sampled in a country depends on the number of eligible primary care practices in the country:
For countries with at least 1 000 eligible primary care practices, the minimum number of participating practices is set at 100.
For countries with fewer than 1 000 eligible primary care practices, the target number of participating practices is set at 75, or as many as possible if there are fewer than 75 primary care practices in the country.
A minimum of 75 patients of each participating primary care practice should participate, i.e. complete the patient questionnaire. This number is the same for all countries.
7.4.2. Sampling frame for primary care practices
Sampling frames in each country were defined by the National Project Managers (NPM) based on available information, the sampling goal and the expected response rate. Expected response rates were informed by the experiences with the Field Trial. NPMs identified sampling frames from available data sources and assessed their quality using the following criteria:
Coverage: The sampling frame should contain broad, and if possible complete, information on the country’s population of primary care practices.
Selection bias: Certain groups of primary care practices may not be covered in the data source, hence excluded from the sampling frame. If such “exclusions” were considerable, additional sampling procedures were applied.
Accessibility: Access to required information may depend on legal, administrative, financial or procedural requirements, and whether such data is in digital or paper form.
Completeness: This refers to the comprehensiveness of the information provided to define the sampling frame. It relates to the data source being up-to-date, and whether it provides the required information needed to assess representativeness of samples and response.
The data needed to define the sampling frame contained contact details of primary care practices and variables that allowed ex-post assessments of the representativeness of the sample. This includes, among others, practice size and location (urban or rural).
7.4.3. Sampling frame for patients
The sampling frame for patients was obtained from either centralised data sources or from the patient lists of participating practices. For countries using centralised sources, an identification variable enabling the link between patients and their main primary care practice was used for defining the patient sampling frame. The size of the patient sampling frame per primary care practice was calculated by the NPM by dividing the sampling goal (75) by the expected response rate.
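The frame-size calculation described above (sampling goal divided by expected response rate) can be sketched as follows. The function name is hypothetical, and rounding up to a whole patient is an assumption made here; the text does not say how fractional results were handled.

```python
import math


def patient_frame_size(expected_response_rate: float, sampling_goal: int = 75) -> int:
    """Number of patients to invite per practice to reach the sampling goal."""
    if not 0 < expected_response_rate <= 1:
        raise ValueError("expected_response_rate must be in (0, 1]")
    # Assumption: round up so the expected number of respondents is not below the goal.
    return math.ceil(sampling_goal / expected_response_rate)


print(patient_frame_size(0.25))  # 300 invitations for an expected 25% response
print(patient_frame_size(0.50))  # 150 invitations for an expected 50% response
```

A lower expected response rate therefore translates directly into a larger per-practice sampling frame.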
Criteria that NPMs used to assess potential sampling frames were similar to those for primary care practices: coverage, potential selection bias, accessibility of data and completeness. Sampling files contained the contact details of the sampled patients. In addition, variables that allowed ex-post assessment of the representativeness of the patients responding to the survey (mainly age and gender) were included. Contact information of practices and patients was used solely by the national project management team to invite patients and was never shared with the OECD or the PaRIS Consortium partners.
7.4.4. Sampling method
In most countries, the PaRIS sampling strategy followed a two-stage approach: first, primary care practices were sampled and recruited, followed by the sampling (and later recruitment) of their patients. This two-stage method reflected the reliance on the collaboration of participating practices to recruit patients. The extent of practice involvement varied across countries, from full responsibility for sampling and recruitment of patients, to shared responsibilities with the National Project Manager team, or, in some cases, practices merely consenting to the sampling and recruitment of their patients. In three countries, France, Norway and Wales, practices were not involved in the sampling of patients. In France, for example, practices were sampled first, then patients were sampled from the list of patients declaring one of the participating practices as their treating physician.
National Project Managers were advised and supported to use a probability sampling method in both stages, to ensure that each practice and patient had an equal (or determined) chance of selection through randomisation. The approach aimed to enhance the representativeness of the sample, which could be further refined by stratification or weighting. A census approach (inviting all eligible practices) was a good alternative approach, particularly in smaller countries, as this also ensured an equal chance of selection in the eligible population. For practical reasons and in particular due to differences in the availability and structure of sampling frames, sampling approaches sometimes had to be adapted to local circumstances in participating countries. The OECD and the PaRIS Consortium supported National Project Managers to create ad hoc sampling designs so that minimum standards for comparability were met. The implementation design for each country is detailed in the next section.
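The two probability designs mentioned most often in this chapter, simple random selection and stratified random sampling, can be sketched with the Python standard library. This is an illustration only, not the procedure any country actually ran: the sampling frame, the practice identifiers, the urban/rural strata and the 10% sampling fraction are all hypothetical.

```python
import random
from collections import defaultdict


def simple_random_sample(units, n, seed=0):
    """Draw n units without replacement; every unit has the same chance."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    units = list(units)
    return rng.sample(units, min(n, len(units)))


def stratified_sample(units, stratum_of, fraction, seed=0):
    """Apply the same sampling fraction independently within each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for key in sorted(strata):
        members = strata[key]
        # At least one unit per stratum, never more than the stratum holds.
        k = min(len(members), max(1, round(fraction * len(members))))
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical frame: 80 urban and 20 rural practices.
frame = [(f"P{i:03d}", "urban" if i < 80 else "rural") for i in range(100)]

srs = simple_random_sample(frame, 10)
strat = stratified_sample(frame, stratum_of=lambda p: p[1], fraction=0.10)
print(len(srs), len(strat))                            # 10 10
print(sum(1 for _, area in strat if area == "rural"))  # 2
```

The stratified draw guarantees that rural practices make up the same share of the sample as of the frame, whereas the simple random draw achieves this only on average.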
7.5. Implementation of PaRIS
7.5.1. Sampling design in participating countries
The sampling methods for both patients and practices varied according to the structure and capabilities of each country’s national healthcare system. Table 7.1 provides an overview of the implementation strategies used by participating countries.
A probability sampling approach was used for practices in 11 countries, employing either simple random selection or stratified random sampling from nationwide databases covering the entire eligible population. For sampling patients, 15 countries adopted probability sampling, with four using stratified random sampling and the remainder using simple random selection.
Five countries implemented a census approach for sampling practices, while two applied this method for sampling patients. The census approach involved inviting all eligible practices or patients in the sampling frame that meet the selection criteria to participate. Most countries using this method verified the representativeness of their primary care practice samples by comparing key metrics such as patient panel size, geographical region, and rural/urban distribution with the overall sampling frame.
Two countries opted for a convenience sample of practices. In these instances, National Project Managers invited a broad range of practices to participate, ensuring good representation of the types and geographic distribution of primary care in the assessment areas. The United States followed a distinct approach, using a pre-existing sample of patients from the Medicare Current Beneficiary Survey (MCBS) for the PaRIS sample. The data from the United States do not include practice-level data. The MCBS employs a three-stage cluster sample design, and the sample included in the survey is representative of people aged 65 years and older nationwide.
Australia
Practices: All accredited practices using the national Electronic Health Record system were invited to participate (Census approach). The validity of the final sample was checked comparing practice size and rural/urban status to the sampling frame. Patients: Practices were given the option of inviting all their eligible patients to participate or taking a random sample. Invitations were sent by staff members of the practice or by the National Project Manager team. Random sampling support was provided for practices that preferred not to survey all their patients.
Limitations: The sample of practices is limited to those that were accredited and had an electronic health record system. Moreover, smaller deviations included a higher proportion of females than in the underlying population, a slightly lower share of people in major cities and a higher share in regional/rural/remote areas. The sample of completed surveys also did not include any patients from Western Australia or the Northern Territory.
Belgium
Practices: Practices were identified through the National Institute for Health and Disability Insurance and contacted via the “eHealthbox” platform. All practices with more than 500 services a year were invited (census approach). Patients: Eligible patients were sampled with simple random selection from the medical record systems of participating practices.
Limitations: Numerous primary care practices in densely populated areas declined participation. This resulted in lower-than-expected response rates for practices.
Canada
Practices: Canada used a convenience sampling approach, where a variety of healthcare practices groups were invited by province. Groups corresponded to practice‑based research and learning networks, the College of Family Physicians, academic researchers, and the Canadian Primary Care Sentinel Surveillance Network. This was complemented by invitations to individual family physicians and nurse practitioners. Patients: For patients, Canada applied a census approach of the eligible patients registered in participating practices.
Limitations: A convenience sample of practices was the only feasible approach for Canada. The coverage of the convenience sample is unknown, as practice information is managed at the provincial rather than the federal level.
Czechia
Practices: In Czechia, eligible practices were randomly selected from the national registry of primary care practices, focusing on practices with 900 or more registered patients. Patients: A simple random selection made by the Institute of Health Information and Statistics from the medical record systems of participating practices was utilised to sample eligible patients.
France
Practices: Primary care practices were randomly selected from the national directory of healthcare professionals, focusing on those with 200 or more registered patients. Patients: France applied a random sampling approach, stratified by age group and gender. The sampling frame for patients was drawn from the list of patients that declared one of the participating practices as their “treating physician” (médecin traitant) provided by the Health Insurance System (CNAM). This list was extracted from the patientèle médecin traitant inter-régime, the centralised register that is used for reimbursements. The contact information of sampled eligible patients was later obtained from the national institute of statistics (INSEE).
Limitations: For technical reasons, the eligibility criterion requiring at least one primary care practice visit in the last six months was modified to “at least one visit to the patient’s treating physician in the last six months”. Under the modified criterion, the sampling frame for patients covered the entire underlying population. Under the original criterion, however, an estimated 77% of eligible patients are represented in the sampling frame.
Table 7.1. Implementation of PaRIS in participating countries

| Country | Collection methods (patients) | Languages | Source for sampling frame: primary care practices | Source for sampling frame: patients | Sampling method: primary care practices | Sampling method: patients |
|---|---|---|---|---|---|---|
| Australia | Online, paper | English, Arabic, Chinese¹ | Accredited and with electronic health record system | Registry of patients of participating practices | Census approach | Probability sampling / Census approach |
| Belgium | Online, paper | French, Dutch, Italian, English, Spanish, Arabic, Turkish² | National Institute for Health and Disability Insurance | Registry of patients of participating practices | Census approach | Probability sampling |
| Canada | Online | English, French | Groups of practices by province | Registry of patients of participating practices | Convenience sampling | Census approach |
| Czechia | Online, paper | Czech | National registry of healthcare providers | Registry of patients of participating practices | Probability sampling | Probability sampling |
| France | Online, paper, telephone | French | National directory of healthcare professionals | Health insurance registry | Probability sampling | Probability sampling |
| Greece | Online, paper | Greek | IDIKA SA – healthcare provider dataset | IDIKA SA – patient dataset | Probability sampling | Probability sampling |
| Iceland | Online | Icelandic | National registry of healthcare providers | Registry of patients of participating practices | Census approach | Probability sampling |
| Italy | Online | Italian | Contact list of eligible practices of the Tuscany, Veneto and the AUSL of Bologna | Regional information system of outpatient services | Convenience sampling | Census approach |
| Netherlands | Online, paper | Dutch, English | Calculus healthcare provider dataset | Calculus patient dataset | Census approach | Probability sampling |
| Norway | Online, paper | Norwegian, Nynorsk, English | National municipality healthcare provider registry | National patient registry | Probability sampling | Probability sampling |
| Luxembourg | Online, paper | French, German, English | Primary Care Registration Platform | Registry of patients of participating practices | Probability sampling | Probability sampling |
| Portugal | Online, paper | Portuguese, English | National primary healthcare provider registry | Registry of patients of participating practices | Probability sampling | Probability sampling |
| Romania | Online, paper | Romanian | National and District Health Insurance General Practitioner registry | National and District Health Insurance patient registry | Probability sampling | Probability sampling |
| Saudi Arabia | Telephone | Arabic, English | Registries of the Ministry of Health, Defence and publicly available information | Patient records from the Directorate of Primary Care, MoH, and registries of participating practices | Probability sampling | Consecutive sampling with randomisation |
| Slovenia | Online, paper | Slovenian | Registry of family physicians at the National Health Insurance Institute | Registry of patients of participating practices | Census approach | Probability sampling |
| Spain | Telephone, online | Spanish, Catalan, Galician, Basque | Primary Care Information System | National Health Card Database | Probability sampling | Probability sampling |
| Switzerland | Online, paper | German, Italian, French | National healthcare provider registry | Patients visiting participating practices | Probability sampling + convenience | Continuous sampling |
| United States | Face-to-face, telephone | English, Spanish | None | Medicare Current Beneficiary Survey (>65) | N/A | Probability sampling |
| Wales | Online, paper, telephone | English, Welsh, Polish, Arabic, Bengali, Ukrainian | Administrative data for analysis and performance assessment of primary care practices | Welsh Demographic Service and Master Patient Index | Probability sampling | Probability sampling |

1. Also available in Greek, Italian, Vietnamese, Punjabi.
2. Also available in German.
Source: PaRIS Sampling reports.
Greece
Practices: Greece employed a random sampling approach for practices. The sampling frame was provided by the public company specializing in IT for social security and health services (IDIKA SA). This sampling frame represents only publicly funded primary care practices and patients. Patients: Eligible patients were sampled with simple random selection from the patient registry of the public IT company (IDIKA SA) filtered by participating practices.
Limitations: The sampling frame in Greece covers only the patients and practices of the public healthcare system.
Iceland
Practices: All eligible practices from the national registry of healthcare providers were invited (Census approach). Patients: Eligible patients were drawn following a simple random selection from the medical record systems of participating practices.
Italy
Practices: Healthcare organisations, regions, or supportive organisations supplied contact information for primary care practices. Invited practices were selected based on a stratified approach on functional or territorial aggregation of all eligible practices of the Tuscany, Veneto and the Aziende Unità Sanitarie Locali of Bologna (region of Emilia Romagna). In time, eligible practices were medical practitioners who provide primary care services in functional or territorial aggregations, primary care practices which operate at regional and local levels, and those who provide ambulatory generalist care, including services addressing chronic disease management. Patients: All eligible patients from the regional information system of outpatient specialist services that were linked to a participating practice were invited to participate (Census approach).
Limitations: In Italy, patients were selected and contacted based on the list of ambulatory specialist visits and later linked to their primary care practice. For this reason, they are expected to have higher risk levels than the patient sample of other countries.
The Netherlands
Practices: A census approach was implemented to invite all eligible practices from the registry of primary care practices of a specialised IT management institution (Calculus), managing data for approximately 74% of eligible practices in the country. Patients: Eligible patients were sampled with simple random selection from the patient registry of the same third party (Calculus) filtered by participating practices consenting to sampling their patients.
Limitations: The sampling frame included only 74% of eligible practices in the country. Nevertheless, there are no indications that the group of practices affiliated with Calculus deviates from the practices not affiliated with Calculus. Selection bias is therefore unlikely.
Norway
Practices: A random selection of eligible practices (individual general practitioners in the Norwegian case) was drawn from the national municipality registry, representing all eligible practices in the country. Patients: A simple random sample of eligible patients linked to participating practices was drawn from the national patient registry.
Luxembourg
Practices: All registered general practitioners in the “Primary Care Registration Platform” were invited to participate (census approach). Patients: Simple random selection from the medical record system of participating practices. The sample only considers residents of Luxembourg, excluding daily commuters.
Portugal
Practices: Portugal employed a stratified random sampling approach, stratified by regions and types of primary healthcare practice. A proportional number of units from each region-type of practice stratum was invited to participate. The sampling frame corresponded to the national primary care practice registry. Patients: For patients, Portugal performed stratified random sampling by age group and gender. The sampling frame corresponded to the patient registry of participating practices.
Limitations: The Autonomous Regions of Azores and Madeira are not included.
Romania
Practices: Practices were randomly selected from the Romanian National and District Health Insurance registry of general practitioners. Patients: Random sample drawn from the Romanian National and District Health Insurance registry, filtered by participating practices.
Limitations: Challenges in the recruitment of patients led to a smaller-than-expected sample size.
Saudi Arabia
Practices: Stratified random sampling based on 22 health regions and proportional sectors (Ministry of Health 60%, other public 15%, private 25%). The sample of practices related to the Ministry of Health was drawn from the registry of healthcare providers of the Directorate of Primary Care at the Ministry of Health. Contact information of practices related to the Ministry of Defence (other public) was provided directly by the Ministry of Defence. The sample of private practices was drawn from public contact information. Patients: Patients were consecutively invited by phone, until the required number per practice was reached. The per-practice calling lists were a random sample stratified by age and recent practice visits. The calling order for the consecutive approach was also randomised. The sample was drawn from the patient registry of the Directorate of Primary Care of the Ministry of Health, the patient registry of participating private practices and the patient registry of the healthcare providers of the Ministry of Defence.
Slovenia
Practices: Slovenia invited all primary care practices from the freely available registry of family physicians at the National Health Insurance Institute (Census approach). Patients: Simple random selection from the medical record systems of participating practices.
Spain
Practices: Stratified random sampling with probability proportional to practice size, using the National primary healthcare information system. Patients: Random sampling from the patient registry by region of the National Health Card Database.
Switzerland
Practices: Simple random sampling from the MedReg national register of licensed medical professionals. An additional ten practices were added to the sample with a convenience approach. Patients: Continuous sampling approach inviting all patients who visited the participating practices during the 3‑month data collection period.
Limitations: A subset of physicians who completed medical training before 1984 could be unrepresented in the sampling frame. Nevertheless, more than 90% of eligible professionals are represented in the sampling frame. In addition to the randomly sampled practices, self-selected practices (10) were admitted to the sample upon their request to join. Continuous sampling of patients is different to a census approach in that invitations are subject to primary care visits, instead of a complete list of eligible visitors. This means that a patient’s likelihood to be sampled increases when he or she attends the practice more often.
United States
Practices: There is no data from primary care practices available for the United States. Patients: The sample source for PaRIS was the Medicare Current Beneficiary Survey (MCBS). The MCBS represents the Medicare population and is sponsored by the Centers for Medicare & Medicaid Services. Sampling in the MCBS follows a stratified random selection of participants according to their geographical area, age, sex and ethnicity. For the PaRIS sample, only patients living in a private household and over 65 years old were invited to participate.
Limitations: The sample of the United States is restricted to Medicare beneficiaries over 65 years old. While there are Medicare beneficiaries younger than 65 years old, these beneficiaries have a significant disability that would bias results. Moreover, MCBS uses weights to strengthen its validity. These weights are not used for PaRIS calculations.
Wales
Practices: Random sampling stratified at the Local Health Board Area level and by practice size (small, medium, large). The sample was drawn from the administrative dataset for analysis and performance assessment of primary care practices. Patients: Patient sample drawn from the Welsh Demographic Service. Sampling followed a random approach. A census approach was taken for the patients of small practices. The final sample of respondents was later validated against the sample frame on several characteristics.
7.5.2. Sample size in participating countries
Countries determined their sample size following the PaRIS guidelines, the conditions and characteristics of their health systems and their individual plans for analysis. Table 7.2 presents the final sampling goals and their compliance per participating country.
Several countries deviated from the optimal numbers described in the main survey design. Spain determined that 251 primary care practices were needed to enable cross-regional comparisons (Table 7.2). Iceland has 77 eligible primary care practices in the country (Spring of 2022) and aimed to recruit as many as possible. Luxembourg, Portugal, Slovenia and Wales had fewer than 1 000 total eligible practices in the country, hence their target was set at 75.
In the United States, the decentralised structure of healthcare creates variability and inconsistency across public and private systems that made a national sampling frame of primary care practices unfeasible. For this reason, the United States did not participate in the PaRIS primary care practice questionnaire.
Following the PaRIS design, the sampling goal for patients was set at 75 per participating practice. For the United States, the sampling goal of patients was set to the estimated number of eligible patients that had completed the MCBS Medical Provider Utilisation Questionnaire.
Although it did not achieve 100% of the target, the PaRIS sample stands as one of the largest, if not the largest, international surveys of patient-reported outcomes and experiences to date.
Table 7.2. PaRIS sample size per participating country
Fourteen countries achieved more than 80% of the target for practices and, on average, 88% of the patient sample was reached

| Country | Practices: respondents used in analysis | Practices: target | Practices: % | Patients: respondents used in analysis | Patients: target | Patients: % | Additional respondents not linked to a practice |
|---|---|---|---|---|---|---|---|
| Australia | 56¹ | 100 | 56% | 2 392 | 7 500 | 32% | 0 |
| Belgium | 83 | 100 | 83% | 4 372 | 7 500 | 58% | 205 |
| Canada | 65 | 100 | 65% | 3 883 | 7 500 | 52% | 4 |
| Czechia | 110 | 100 | 110% | 4 136 | 7 500 | 55% | 30 |
| France | 150 | 100 | 150% | 12 242 | 7 500 | 163% | 3 415 |
| Greece | 104 | 100 | 104% | 2 173 | 7 500 | 29% | 45 |
| Iceland | 36 | 34 | 106% | 1 864 | 4 725 | 39% | 833 |
| Italy | 113 | 100 | 113% | 1 817 | 7 500 | 24% | 2 295 |
| Luxembourg | 52 | 75 | 69% | 1 590 | 5 625 | 28% | 0 |
| Netherlands | 60 | 100 | 60% | 4 851 | 7 500 | 65% | 0 |
| Norway | 121 | 100 | 121% | 8 684 | 7 500 | 116% | 0 |
| Portugal | 91 | 75 | 121% | 11 744 | 5 625 | 209% | 595 |
| Romania | 128 | 100 | 128% | 1 277 | 7 500 | 17% | 1 |
| Saudi Arabia | 100 | 100 | 100% | 7 579 | 7 500 | 101% | 79 |
| Slovenia | 81 | 75 | 108% | 3 240 | 5 625 | 58% | 63 |
| Spain | 251 | 251 | 100% | 19 067 | 7 500 | 254% | 0 |
| Switzerland | 140¹ | 100 | 140% | 4 178 | 7 500 | 56% | 0 |
| United States | | | | 4 216 | 5 144 | 82% | 0 |
| Wales | 75 | 75 | 100% | 7 706 | 5 625 | 137% | 0 |
| Total | 1 816 | 1 785 | 102% | 107 011 | 129 869 | 82% | 7 565 |
| Average | 101 | 99 | 102% | 5 632 | 6 835 | 82% | 398 |

1. Some practices did not complete the questionnaire, but their patients are included.
Source: PaRIS Country Roadmaps and OECD PaRIS 2024 Database.
7.5.3. Reliability and power in the main survey
The power of PaRIS
PaRIS data proved to have sufficient power to answer the project’s main research questions (Section 7.2.2). For research questions #1 and #2, PaRIS required sufficient participation from practices and patients to enable the identification of groups of countries where patients’ reported care experiences and/or outcomes were significantly higher or lower than the overall average. The survey was not designed to detect significant differences between all pairs of countries. Three statistically distinct groups of countries were detected for all ten PaRIS key indicators. Annex 7.A presents this analysis.
Research question #3 relates to the statistical power of the sample to estimate the effects of multiple independent patient level variables simultaneously in a multilevel regression model. Considering the high number of patients that participated in PaRIS, the power of the survey in this regard was never in question. Furthermore, since outcome reliability poses a more critical limitation in such analyses, it is assumed that sufficiently reliable outcomes inherently ensure adequate power to address research question #3.
Multilevel analyses were conducted to understand the design effect (clustering of patients in practices and countries) in the power analysis. The stronger the clustering, the larger the sample of patients needed to be, compared with a simple random sample. The formula for the design effect “D”, in its standard form, is:

$$D = 1 + (\bar{n} - 1)\,\rho$$

where $\bar{n}$ is the average number of patients per cluster and $\rho$ is the intraclass correlation (ICC), i.e. the proportion of variance that is accounted for by the cluster level. In PaRIS, clusters are countries and practices. If there is no clustering of the outcome variable within countries ($\rho = 0$), the design effect equals 1, meaning there is no clustering penalty, and the number of patients needed is equal to that in a simple random sample. The higher the ICC, the more extra patients had to be sampled compared with a simple random sample. Table 7.3 contains the results of this analysis.
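The link between cluster size, ICC and required sample size can be illustrated with a short sketch. It assumes the standard (Kish) form of the design effect, D = 1 + (n̄ − 1)ρ, in which D = 1 means no clustering penalty. The function names and the 7 500-patient example are illustrative assumptions, not part of the PaRIS analysis code; the ICC used is the practice-level value reported for physical health in Table 7.3.

```python
def design_effect(avg_cluster_size: float, icc: float) -> float:
    """Standard (Kish) design effect: D = 1 + (n_bar - 1) * rho."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie between 0 and 1")
    if avg_cluster_size < 1:
        raise ValueError("avg_cluster_size must be at least 1")
    return 1.0 + (avg_cluster_size - 1.0) * icc


def effective_sample_size(n_total: float, avg_cluster_size: float, icc: float) -> float:
    """Size of the simple random sample that carries the same information."""
    return n_total / design_effect(avg_cluster_size, icc)


# Illustrative inputs: 75 patients per practice (the PaRIS design target)
# and a practice-level ICC of 3.53% (physical health, Table 7.3).
d = design_effect(avg_cluster_size=75, icc=0.0353)
n_eff = effective_sample_size(n_total=7500, avg_cluster_size=75, icc=0.0353)
print(f"design effect: {d:.2f}, effective sample size: {n_eff:.0f}")
```

Even a small ICC inflates the required sample noticeably when clusters are large, which is why the ICCs matter for the power analysis.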
The ICCs in Table 7.3 show that the proportion of variance that is accounted for by the country level ranges between less than 3% (well-being as assessed by WHO‑5 Well-being scale) and 25% (experienced care co‑ordination). This means that, in particular for the P3CEQ scales “experienced care co‑ordination” and “experienced people‑centred care”, there is substantial clustering of patients’ experiences within countries. This also holds for the extent to which patients work together with their healthcare professionals – or rely on them – in managing their health.
In general, there is little clustering of patient data at practice level. The high variability of the number of participating practices per country, and the number of patients per practice, has an effect on the power of the survey to assess research question #4 (relationships between characteristics of primary care practices and PREMs/PROMs overall).
That said, several practice‑level characteristics were found to be significant in multilevel regressions. For example, having medical records available when patients are seen had a significant effect on PROMIS physical health. Similarly, self-management support through written information, written instructions for care management, and scheduling appointments of more than 15 minutes had a significant effect on patient centredness and co‑ordination (Chapter 4). We analysed the robustness of these effects by simulating different datasets with similar characteristics in terms of the number of countries, the average number of practices per country and the average number of patients per practice. We found that the significant effects found in the main analysis are robust to random samples with similar characteristics. This analysis is presented in Annex 7.A.
At the same time, practice‑level characteristics such as the role of primary care staff have been shown to be significant for care experiences and outcomes in the literature (Davis et al., 2021[21]), but this effect was not found in the analysis of PaRIS. We tested the robustness of the non-significance by simulating a larger average number of practices per country than in PaRIS. The results show that in simulated samples with the optimal average number of practices per country by design (75), the effect of the role of primary care staff in PaRIS was not significant in more than 65% of the simulated runs. This suggests that the lack of a significant effect may be attributed to factors such as the nature of the variable, the model used, or the specification of staff roles, rather than a suboptimal sample size.
Finally, system-level characteristics were not assessed in this report. It was expected that the survey had sufficient power to assess one country level characteristic at a time, given the 19 participating countries.
Reliability of PaRIS instruments
A prerequisite for all analyses is that the underlying constructs are reliably assessed. Because of the multilevel structure of the PaRIS data, the key indicators should not only be reliably assessed at the level of individual patients, but also at the country level. While PaRIS was not specifically designed for comparing practices (de Boer et al., 2022[2]), reliability at the practice level was also assessed.
The reliability of a construct depends on measurement error, the number of items, the number of patients per practice and the variances at practice and country level. The reliability coefficient in a multilevel model is a measure of internal consistency comparable to Cronbach’s alpha in a single level model. Reliability was calculated using the reliability measure for multilevel models (Raudenbush, 2003[18]; Leyland and Groenewegen, 2020[19]). Annex 7.A presents the reliability calculations.
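To illustrate why the number of patients per practice matters for reliability, the sketch below uses the simplified two-level reliability of a cluster mean, λ = τ² / (τ² + σ²/n), rewritten in terms of the ICC. This simplification is an assumption made for illustration: the PaRIS calculations in Annex 7.A also account for the number of items and for the three-level structure, so the sketch is not expected to reproduce the values in Table 7.3. The function name and the 3% ICC are hypothetical.

```python
def cluster_mean_reliability(icc: float, n_per_cluster: float) -> float:
    """Reliability of an observed cluster mean in a two-level model.

    Equivalent to tau2 / (tau2 + sigma2 / n), with icc = tau2 / (tau2 + sigma2),
    where tau2 is the between-cluster and sigma2 the within-cluster variance.
    """
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie between 0 and 1")
    if n_per_cluster <= 0:
        raise ValueError("n_per_cluster must be positive")
    return (n_per_cluster * icc) / (1.0 + (n_per_cluster - 1.0) * icc)


# Hypothetical practice-level ICC of 3% with different numbers of
# participating patients per practice.
for n in (10, 25, 75):
    print(n, round(cluster_mean_reliability(0.03, n), 2))
```

With a fixed ICC, reliability rises with the number of patients per cluster, which is consistent with practice‑level reliability being weaker where few patients per practice participated.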
Table 7.3 contains the results of the multilevel reliability analyses of PaRIS. The table includes the results of five key indicators, i.e. those that have been assessed with multiple items. In addition, the table includes three other multi‑item constructs, which aimed to assess patients’ health and care capabilities.
Table 7.3. Reliability analysis of PaRIS instruments
Column groups, from left to right: Clustering (ICC columns); Reliability (multi-level); Reliability (single level; Cronbach’s alpha).

| Construct | N. of items | ICC country (%) | ICC practice (%) | Country level | Practice level | Patient level | Country level |
|---|---|---|---|---|---|---|---|
| PaRIS key indicators | | | | | | | |
| PROMs | | | | | | | |
| Physical health (PH) | 4 | 5.30 | 3.53 | 0.77 | 0.68 | 0.75 | 0.77 |
| Mental health (MH) | 4 | 10.69 | 2.21 | 0.88 | 0.60 | 0.77 | 0.79 |
| Well-Being (WB) | 5 | 2.91 | 2.06 | 0.66 | 0.59 | 0.88 | 0.89 |
| PREMs | | | | | | | |
| Experienced care co‑ordination | 5 | 19.08 | 4.64 | 0.91 | 0.67 | 0.60 | 0.67 |
| Experienced people‑centred care | 8 | 24.97 | 3.85 | 0.94 | 0.68 | 0.70 | 0.75 |
| Other multiple item constructs | | | | | | | |
| Active engagement (PNS1) | 4 | 4.14 | 1.51 | 0.73 | 0.48 | 0.73 | 0.73 |
| Working together with care professionals (PNS2) | 4 | 20.47 | 2.42 | 0.93 | 0.59 | 0.61 | 0.68 |
| Health literacy (PNS3rev) | 2 | 14.32 | 4.27 | 0.90 | 0.72 | 0.71 | 0.74 |
Note: ICC: Intraclass correlations. The ICC for the reliability analysis is calculated using equal weights for all scale items and using the null model (not standardised) of the whole sample. For these reasons, ICCs in the reliability analysis differ from those presented in Chapter 2.
Source: PaRIS Consortium based on OECD PaRIS 2024 Database.
Table 7.3 shows that almost all constructs are sufficiently reliable (coefficient ≥.70) at country level. The only exception is the WHO‑5 Well-being scale (.66), which is nevertheless substantially more reliably assessed at country level in the main survey than it was in the field trial (.55) (van den Berg et al., 2024[8]), due to the higher number of participating patients in the main survey.
The reliability of constructs at the practice level is insufficient for several countries, likely due to the low number of participating patients per practice in many countries. Annex 7.A illustrates that practice‑level reliability is typically adequate in countries where the average number of participating patients per practice meets the recommended threshold of 75. While higher reliability at the practice level would have been ideal, it is less critical for PaRIS, as the survey’s primary objective is not to analyse differences in patients’ care experiences and outcomes between practices within countries.
7.5.4. Recruitment of patients and primary care practices
National Project Managers collaborated with national stakeholders and the PaRIS Consortium to create a communication strategy for the recruitment process. These strategies were customised to align with the specific needs and preferences of each country, taking into account local administrative structures, resources and communication channels.
Primary care practices were recruited between January 2023 (Luxembourg and Spain) and April 2024 (Switzerland). The recruitment period lasted 11.8 weeks per country on average; it was longest in Portugal (20 weeks) and shortest in Italy (4 weeks).
Recruitment of patients started in July 2023 in Norway and ended in April 2024 in Switzerland. The average recruitment period per country lasted 13 weeks and it was longest in Luxembourg (27 weeks) and shortest in the United States (5 weeks).
The communication strategy among participating countries focused on engaging stakeholders (patients, primary care practices, and other related groups) through a variety of channels to encourage awareness, participation, and support. Common communication channels included informational materials like posters and brochures, digital outreach such as emails and social media, and stakeholder-specific methods like advisory boards and steering groups for direct feedback and involvement (Kendir et al., 2023[9]).
Countries found that direct, dialogue‑based channels, including advisory groups and structured consultations, were most effective in building a sense of involvement and ownership, particularly among professionals. Passive communication, such as distributing informational materials, while essential for raising initial awareness, was less effective at generating active engagement and buy-in (Kendir et al., 2023[9]).
7.6. Response rates and paradata
Countries employed several strategies for maximising response rates among invited patients and primary care practices. The Survey Operations Manual provided detailed information on how National Project Managers and participating practices could invite sampled patients and facilitate access to the survey. Recruitment strategies differed in characteristics such as the content of the invitation letter and reminders, the sender of the invitation, the combination of several communication channels and the provision of a paper-and-pencil alternative. A detailed description of engagement and recruitment strategies was published in a dedicated Health Working Paper (Kendir et al., 2023[9]). Table 7.4 presents response rates for practices and patients of participating countries.
Table 7.4. Practice and patient response rates
| | Australia | Belgium | Canada | Czechia | France | Greece | Iceland | Italy | Luxembourg | Netherlands | Norway | Portugal | Romania | Saudi Arabia | Slovenia | Spain | Switzerland | United States | Wales |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Practices | <5% | <5% | 11% | 56% | 35% | 57% | 14% | <5% | 50% | 61% | 26% | 28% | 9% | 86% | 14% | NA | 37% | | |
| Patients | 8% | 31% | 16% | 42% | 7% | 25% | 25% | 27% | 17% | 44% | 11% | 6% | 24% | 15% | 47% | 19% | 42% | 21% | |
Note: Countries with missing response rates used a convenience sample or were not able to produce reliable data for calculation.
The patient response rates for PaRIS vary significantly but are comparable to those observed in other recent international surveys involving patient-level data. For instance, the People’s Voice Survey (Kruk et al., 2024[22]) recorded response rates from 2% to 84% across 15 participating countries. The Commonwealth Fund’s 2023 International Health Policy Survey (Gumas et al., 2024[23]) had response rates between 6% and 49% among 10 countries. In contrast, the third wave of the European Health Interview Survey (Eurostat, 2024[24]) reported higher response rates, ranging from 25% to 88%. However, that survey focuses on the general population, a less restrictive target group than the primary care service users targeted in PaRIS. The implications of response rates falling below 10% are further explored in Section 7.7.6.
7.6.1. Response time and completion rates
The average survey completion time for patients across countries is approximately 30.2 minutes, with a range from a minimum of 26.0 minutes in the Netherlands to a maximum of 38.4 minutes in the United States. For practices, the average survey completion time is about 25.3 minutes, with times spanning from a minimum of 19.4 minutes in Greece to a maximum of 32.8 minutes in Slovenia (Figure 7.4).
Figure 7.4. The survey took on average 30 minutes for patients and 25 minutes for practices to complete
Average response time in minutes
Note: Only surveys that were completed online recorded response time.
Source: OECD PaRIS 2024 Database.
The completion rate refers to the proportion of started questionnaires in which the respondent completed all questions. The highest completion rates are seen in France, Saudi Arabia, Spain and Wales, with 100% or close to 100% completion for both patients and practices. The lowest patient completion rate is in Italy, at 66%, while practice completion in Italy is higher at 96%. Greece also stands out from other countries, with patient completion at 75% and practice completion at 72%.
Figure 7.5. Once started, most surveys were completed in all countries
Note: It was not possible to compute completion rates in the United States.
Source: OECD PaRIS 2024 Database.
7.6.2. Data collection modes
PaRIS employed four modes of data collection: online, paper-based, telephone, and assisted interviews (CAPI/CATI). Countries predominantly relied on a single mode, with minimal within-country variation in modes used. Analysis revealed that collection mode had a small but significant effect in the models for the calculation of the PaRIS ten key indicators. For example, telephone surveys were associated with higher estimates on seven of the ten key indicators, while paper surveys resulted in lower estimates for most indicators. Despite these findings, collection modes accounted for less than 1% of total outcome variance, and their inclusion in multilevel models had negligible effects on country-level estimates. This suggests that, given the entanglement of country-specific effects and collection modes, the survey’s multilevel structure effectively integrates collection mode into the broader country-specific effects.
In the same line, although including data collection mode improved the model fit, its practical impact on country-level comparisons was minimal. Adjusting for collection modes slightly reduced country-level variance but did not significantly alter estimates for PREMs and PROMs. Further details are provided in the annex (Annex 7.C), including specific mode distributions per country and their effects in country estimates of PREMs and PROMs.
7.7. Data analysis and validation
7.7.1. Sample characteristics
PaRIS provides data from a diverse sample across 19 countries, each exhibiting distinct socio-economic and health profiles. This diversity allows for a comprehensive analysis of patient-reported outcomes and experiences (PROMs and PREMs) within various health system contexts. The guiding principle in analysing PaRIS data is that health systems should adapt to the unique characteristics of their populations to enhance patient care and experience. This approach is supported by the results of a case‑mix analysis, which indicated minimal impact of covariates on PROMs and PREMs estimates, suggesting that simpler, more parsimonious models are feasible and appropriate for reporting (see Section 7.7.4 for more details on the case‑mix analysis).
However, understanding the sample composition remains critical: although these variables might not necessarily alter PROMs and PREMs in the analysis, they are essential for healthcare systems to consider when optimising performance. Moreover, the influence of certain demographic and health variables on health outcomes is well documented, reinforcing the idea that these characteristics should be factored into the explanation of country estimates of PREMs and PROMs. Annex 7.B presents an overview of key demographic and health indicators for the sample across participating countries.
The PaRIS sample highlights distinct demographic and health characteristics across participating countries, illustrating the diversity in population profiles and health challenges. The US sample skews older, reflected in notably higher rates of hypertension and arthritis. In contrast, Saudi Arabia’s sample is younger, with a significant concentration in the 45‑54 age group and a lower representation in the 65‑74 age bracket, alongside an elevated prevalence of diabetes. Educational disparities are apparent, with Czechia showing a significantly higher proportion of respondents with lower education levels and a lower representation in mid-level education, while Norway stands out for its higher proportion of respondents with high education. Italy’s sample has a notably lower proportion in the high education category and a higher diabetes rate, with Romania also showing elevated diabetes rates. Additionally, Romania presents a higher prevalence of cardiovascular conditions, and Australia has higher rates of asthma/COPD. Slovenia’s sample diverges in gender composition, with a lower proportion of male respondents. These varied characteristics across countries underscore the importance of contextualising patient-reported outcomes within each country’s unique demographic and health profile.
7.7.2. Standard population
To facilitate interpretation of the PaRIS results and increase their comparability across countries, all patient-reported indicators have been estimated for a reference population of patients with a predefined age and sex distribution, i.e. the PaRIS standard population.
The PaRIS standard population was constructed using data on the patient population eligible for PaRIS in OECD member countries that participated in PaRIS. To be eligible for PaRIS, patients must be aged 45 or older and have been in contact with a primary care practice for their own health within the last six months. The population eligible for the survey is thus a specific selection of the general population.
Standardisation was based on age and sex (standardisation variables). Note that while the survey asks about self-reported gender (“Which of the following best describes you – Female, Male, LEAVE BLANK, Other, prefer not to say”), the standard population is based on sex, as registered in available information sources.
Given the specificities of the patient target group, population data about the age and sex distribution of eligible patients was not readily available in most countries. Therefore, National Project Managers estimated the age and sex distribution of the eligible patient population in their countries based on available information. Given that such information was often limited, age distribution in four categories was used.
Primarily, country-specific standard populations were defined using a combination of available data sources covering the entire population of eligible patients, such as regional or national population or patient registries. The reliability of these sources was assessed and graded by National Project Managers. Nevertheless, given the specificity of the eligible population, high quality information was not available for all participating countries. To address this challenge, country-specific standard populations were defined by applying the following rules:
Level 1: If reliable and up-to-date data for the total eligible patient population were available for one or both standardisation variables, this data source was used to define the country-specific standard population.
Level 2: If this data was unavailable, outdated, or of low quality, the standard population was based on the age and sex distribution of the patients in the total sampling frame, unless substantial selection bias was identified.
Level 3: If data on the total population or the sampling frame was missing or unusable, the standard population was based on the age and sex distribution of the patients in the total sample, unless substantial selection bias was reported.
Level 4: If reliable, recent data for one or both standardisation variables was unavailable for the total population, sampling frame, or sample, the standard population was based on the self-reported age and gender of survey respondents in the country.
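The four levels above work as a fallback cascade, which can be sketched as a simple decision function. This is one reading of the rules for illustration only: the function and argument names are hypothetical, and the sketch ignores the fact that the rules were applied separately to each standardisation variable.

```python
def standard_population_level(
    population_data_ok: bool,
    frame_data_ok: bool,
    frame_selection_bias: bool,
    sample_data_ok: bool,
    sample_selection_bias: bool,
) -> int:
    """Return the level (1-4) of the data source used for the standard population."""
    if population_data_ok:
        return 1  # reliable, up-to-date data on the total eligible population
    if frame_data_ok and not frame_selection_bias:
        return 2  # age and sex distribution of the total sampling frame
    if sample_data_ok and not sample_selection_bias:
        return 3  # age and sex distribution of the total sample
    return 4      # self-reported age and gender of survey respondents


# Hypothetical country: no usable registry data, sampling frame affected by
# substantial selection bias, but a usable total sample.
level = standard_population_level(False, True, True, True, False)
print(level)
```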
Table 7.5 shows the country-specific and the OECD-PaRIS standard population.
Table 7.5. OECD PaRIS standard population and country-specific standard populations
| Country | Age 45‑54 | Age 55‑64 | Age 65‑74 | Age 75+ | Gender: Women | Gender: Men |
|---|---|---|---|---|---|---|
| OECD – PaRIS standard population¹ | 25% | 28% | 27% | 20% | 55% | 45% |
| Australia | 29% | 28% | 24% | 19% | 53% | 47% |
| Belgium | 24% | 32% | 30% | 14% | 55% | 45% |
| Canada | 22% | 29% | 32% | 17% | 64% | 36% |
| Czechia | 32% | 25% | 25% | 18% | 53% | 47% |
| France | 23% | 27% | 26% | 24% | 56% | 44% |
| Greece | 29% | 27% | 22% | 22% | 53% | 47% |
| Iceland | 20% | 33% | 32% | 15% | 54% | 46% |
| Italy | 18% | 31% | 32% | 19% | 49% | 51% |
| Luxembourg | 29% | 30% | 22% | 19% | 54% | 46% |
| Netherlands | 26% | 27% | 24% | 22% | 52% | 48% |
| Norway | 24% | 27% | 25% | 24% | 55% | 45% |
| Portugal | 28% | 27% | 23% | 22% | 55% | 45% |
| Slovenia | 29% | 28% | 24% | 19% | 52% | 48% |
| Spain | 28% | 27% | 22% | 23% | 59% | 41% |
| Switzerland | 19% | 28% | 29% | 24% | 51% | 49% |
| United States | - | - | 25%² | 22%² | 57%³ | 43%³ |
| Wales | 15% | 27% | 33% | 25% | 55% | 45% |
1. Based on the 17 participating OECD member countries.
2. Estimations based on the assumption that 53% of eligible population is younger than 65 (OECD standard population for the two age categories younger than 65 based on the 16 countries, thus excl. the United States, that have eligible populations aged 45 and older), thus that 47% will be aged 65 and over (within this 47% we used the proportion of the eligible US population provided by the NPM)
3. As provided by the US NPM for the eligible PaRIS population in the US Medicare Beneficiaries survey (aged 65 and older). It has not been corrected for the fact that this proportion was found in the eligible population aged 65 and older only.
Source: PaRIS Consortium based on PaRIS sampling reports.
The OECD PaRIS standard population was constructed based on the country-specific standard populations of the 17 participating OECD member countries, but all 19 participating countries were weighted to this standard population. An equal weight was assigned to each country in the calculation, because the primary focus in PaRIS is on comparing countries and identifying good practices, rather than analysing a broader population or geographical region.
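Equal weighting means that each country's distribution counts once, regardless of population size. The sketch below illustrates this with the age distributions of only three countries from Table 7.5, so its result is not the published standard population; the function name is hypothetical and the special handling of the United States described in the table notes is not modelled.

```python
AGE_BANDS = ("45-54", "55-64", "65-74", "75+")


def equal_weight_standard(country_distributions: dict) -> dict:
    """Average the country-specific shares, giving every country the same weight."""
    n_countries = len(country_distributions)
    if n_countries == 0:
        raise ValueError("at least one country is required")
    return {
        band: sum(dist[band] for dist in country_distributions.values()) / n_countries
        for band in AGE_BANDS
    }


# Age shares (%) for three countries, copied from Table 7.5 for illustration.
subset = {
    "Australia": {"45-54": 29, "55-64": 28, "65-74": 24, "75+": 19},
    "Belgium": {"45-54": 24, "55-64": 32, "65-74": 30, "75+": 14},
    "Canada": {"45-54": 22, "55-64": 29, "65-74": 32, "75+": 17},
}
standard = equal_weight_standard(subset)
print({band: round(share, 1) for band, share in standard.items()})
```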
7.7.3. Standardised estimates
Country-level estimates (PREMs/PROMs) are directly derived from a multilevel regression model, explained in detail in Section 7.7.7. Country estimates are standardised by including a re‑scaled version of age and gender as independent variables in the model. The re‑scaling of the standardisation variables (age and gender) ensures consistent comparisons by producing estimates as if every country had the OECD PaRIS standard population.
As in other regression analyses, the intercept of the multilevel model is the expected value of the outcome when all independent variables are set to zero. As further explained in Section 7.7.7, the intercept of the multilevel model corresponds to the overall average of the estimated outcome in the survey, while accounting for the nested structure of the data. This means that adjustments to the coding (or re‑scaling) of independent variables affect the overall average of the estimated outcome. Re‑scaling the standardisation variables therefore defines the overall average of the model at the point where the standardisation variables take their standard-population values, instead of zero (Groenewegen et al., 2024[7]). The re‑scaling process follows four steps:
Exclusion of Incomplete Cases: Cases without a valid response for the outcome variable are excluded. For those remaining, their distribution across age and gender variables is determined.
Dummy variables for age and gender are created and incorporated as standardisation variables in the multilevel model (see Section 7.7.7).
The original coding of these dummy variables is re‑scaled to correct for any deviations between the observed distribution of age and sex among included cases and the standard distribution as defined by the OECD PaRIS standard population. The re‑scaling follows the formula: original value – (value in standard population).
Estimates for all outcomes are created using the re‑scaled standardisation variables.
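The effect of re‑scaling on the intercept can be seen in a deliberately simplified setting. The sketch below uses a single-level ordinary least squares fit with one standardisation variable, not the PaRIS multilevel model, and the four observations are invented for illustration. It shows that centring the dummy variable on the standard-population share (55% women) moves the intercept from the mean for men to the mean the sample would have under the standard gender distribution.

```python
from statistics import mean


def ols_simple(x, y):
    """Closed-form simple linear regression; returns (intercept, slope)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


STANDARD_SHARE_WOMEN = 0.55  # OECD PaRIS standard population

# Invented sample: 75% women (mean outcome 52) and 25% men (mean outcome 40).
female = [1, 1, 1, 0]
outcome = [50.0, 52.0, 54.0, 40.0]

# Re-scaling: original value minus the value in the standard population.
female_std = [f - STANDARD_SHARE_WOMEN for f in female]

raw_intercept, _ = ols_simple(female, outcome)
std_intercept, slope = ols_simple(female_std, outcome)

# With the original coding the intercept is the mean for men (FEMALE = 0);
# with the re-scaled coding it is 0.55 * mean(women) + 0.45 * mean(men).
print(raw_intercept, std_intercept, slope)
```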
Box 7.5. Example of rescaling the values of the standardisation variable: Gender
Re‑scaling gender
The PaRIS standard population determines that estimates should be calculated as if all countries had 55% of the population women and 45% men. To rescale the original coding for our “MALE” dichotomous variable (0=Female, 1=Male) we calculate: For males (MALE = 1) the value of the variable becomes 1‑0.45= 0.55; for females (MALE=0), it becomes 0‑0.45=‑0.45. Similarly, for the “FEMALE” dichotomous variable values are: For males, 0‑0.55 = ‑0.55; for females, 1‑0.55 = 0.45.
Because of the dummy variable trap (multicollinearity between categorical options) only one of the rescaled variables (“FEMALE_standardised”) is used as a standardisation variable.
Quality checks
The set of six re‑scaled variables (two for Gender, four for age) should add up to zero for each observation. This useful quality check helps in securing a correct calculation:
| Observation | MALE | FEMALE | MALE_std | FEMALE_std |
|---|---|---|---|---|
| Patient1 | 1 | 0 | 0.55 | -0.55 |
| Patient2 | 0 | 1 | -0.45 | 0.45 |
Source: Groenewegen, P. et al. (2024[7]), “Data analysis plan of the OECD PaRIS survey: leveraging a multi-level approach to analyse data collected from people living with chronic conditions and their primary care practices in 20 countries”, https://doi.org/10.1186/s13104-024-06815-7.
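The re‑scaling rule and the quality check described in Box 7.5 can be sketched as follows. The variable names are hypothetical; the standard-population shares are those of the OECD PaRIS standard population (25%, 28%, 27% and 20% for the four age groups, 55% women and 45% men). Because each patient has exactly one age dummy and one gender dummy equal to 1, and the standard shares sum to 1 within each variable, the six re‑scaled values sum to zero.

```python
# Shares in the OECD PaRIS standard population.
STANDARD = {
    "age_45_54": 0.25,
    "age_55_64": 0.28,
    "age_65_74": 0.27,
    "age_75_plus": 0.20,
    "female": 0.55,
    "male": 0.45,
}


def rescale(dummies: dict) -> dict:
    """Apply: original value - (value in standard population)."""
    return {name: dummies[name] - share for name, share in STANDARD.items()}


def passes_quality_check(rescaled: dict, tol: float = 1e-9) -> bool:
    """The six re-scaled variables should add up to zero for each observation."""
    return abs(sum(rescaled.values())) < tol


# A man aged 65-74, coded with the original 0/1 dummy variables.
patient = {"age_45_54": 0, "age_55_64": 0, "age_65_74": 1,
           "age_75_plus": 0, "female": 0, "male": 1}
rescaled = rescale(patient)
print(passes_quality_check(rescaled))

# A miscoded record with two age groups set to 1 is caught by the check.
miscoded = dict(patient, age_45_54=1)
print(passes_quality_check(rescale(miscoded)))
```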
7.7.4. Case mix adjustment
In principle, PaRIS overall estimates are not case‑mix adjusted unless described otherwise. Countries may have more patients with specific characteristics than others, and when these characteristics relate to how countries or practices perform in the eyes of the patients, these differences could be considered independent of the capacities and quality of the health system. On the other hand, it is possible that some countries or practices adapt better and succeed in providing better care to patients with specific characteristics. In this sense, there is a degree of uncertainty over how far the healthcare system can be held accountable for the effect of variables that influence the outcomes of interest. PaRIS takes an ambitious approach to this issue, asserting that health systems should strive to adapt as fully as possible to the needs and characteristics of their population. To reinforce this approach, potential case‑mix adjusters were rigorously evaluated following Groenewegen et al. (2023[25]).
Potential case‑mix adjusters had to meet four criteria:
The distribution of the variable differs substantially between units (countries, primary care practices).
The variable (patient characteristic) is significantly related to the outcome variable (patient-reported care experience or outcome).
The relationship between the potential case‑mix adjuster and the outcome variable is similar for all units (countries, practices).
Data quality for potential case‑mix adjuster was high for all countries.
The first two criteria are straightforward. For a variable to be considered for case‑mix adjustment there need to be differences, and these differences should have an impact on the outcome. The third criterion is set to ensure that the influence of the characteristic is outside of the accountability of the health system. For example, if the relationship of the potential case‑mix adjuster to the outcome differs from country to country, it would imply that patients with the specific characteristic at stake have better care experiences or outcomes in some countries or practices than in others, which might point to potential room for care improvement in weaker healthcare systems. Groenewegen and colleagues (Groenewegen et al., 2023[25]) provide guidance on how to examine whether a patient characteristic meets this criterion, and propose a measure to decide whether observed differences between units are large enough to state that a certain variable does not meet this requirement. The method defines a random slope effect (difference in variance between the categories of the independent variable) lower than 25% of the total variance as the acceptable cut-off, regardless of statistical significance.
To ensure clarity in the final analytical models, potential case‑mix adjusters were assessed only in relation to the ten PaRIS key indicators. Priority was given to patient characteristics that meet the case‑mix adjustment criteria and are relevant to multiple key indicators, promoting a more consistent and harmonised approach.
After exploratory analysis of the first criterion (substantial differences in the distribution of the variable between units), the list of potential case‑mix adjusters included the following. Sociodemographic: education level, income level, born in country of survey, and employment status. Health: self-reported high blood pressure, cardiovascular disease, diabetes, arthritis, breathing condition, depression and cancer.
For the second and third criteria, multilevel regression analyses were carried out. As the criterion set by Groenewegen et al. had been designed for use with continuous dependent variables, testing for this criterion was done for the key indicators that were continuous (five of the ten key indicators).
Results showed that the second and third criteria did not set any further restrictions for including the selected characteristics as case‑mix adjusters. All selected patient characteristics were significantly related to most of the ten key indicators. However, the selected patient characteristics did not always relate to the same key indicators, nor was there a clear distinction between patient characteristics that related more to the selected PROMs than to the PREMs, or vice versa. At the same time, the random slope effects of all selected patient characteristics were smaller than 25% of the total variance, both at country level and practice level, for all five key indicators that were continuous variables.
Nevertheless, except for education level, data on the other sociodemographic characteristics on the list (income, born in country of survey, urbanisation level, employment status) were missing for one or more countries, because the question was not asked in the country or the answering options were merged or deviated from the original patient questionnaire.
Therefore, only education level and self-reported high blood pressure, cardiovascular disease, diabetes, arthritis, breathing condition, depression and cancer were tested as case‑mix adjusters. These variables were re‑scaled before inclusion in multilevel models. The reason behind re‑scaling is to define the model’s intercept at a determined level of the case‑mix variables in the population, instead of at 0, which would be implausible (see Box 7.5 for details on the re‑scaling). Because there is no determined standard population for the potential case‑mix adjusters, the average of the per-country averages was used as the standard for re‑scaling.
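The choice of standard can be illustrated with a minimal Python sketch. It assumes that re‑scaling amounts to centring each case‑mix variable on the standard (the actual procedure is described in Box 7.5), and the function name and data are hypothetical. It shows that the average of per-country averages gives every country the same weight, unlike the pooled mean, which is dominated by countries with larger samples.

```python
from collections import defaultdict

def rescale_to_standard(values, countries):
    """Centre a variable on the average of the per-country averages.

    Assumption: re-scaling is a simple centring on the standard level.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value, country in zip(values, countries):
        sums[country] += value
        counts[country] += 1
    country_means = {c: sums[c] / counts[c] for c in sums}
    # Each country contributes equally, regardless of its sample size.
    standard = sum(country_means.values()) / len(country_means)
    return [v - standard for v in values], standard

# Hypothetical data: 1 = reports diabetes, 0 = does not.
diabetes = [1, 0, 0, 0, 1, 1]
country = ["A", "A", "A", "A", "B", "B"]
rescaled, standard = rescale_to_standard(diabetes, country)
pooled_mean = sum(diabetes) / len(diabetes)
```

In this toy example the standard is 0.625 (the mean of 0.25 and 1.0), while the pooled mean is 0.5, because country A contributes twice as many respondents.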
The effect of including case‑mix variables into the estimation models was further explored in Chapter 2. Results of country estimates of PREMS and PROMS did not vary significantly, supporting the decision of not including case‑mix variables by default, and continuing with more parsimonious models for the rest of the report unless indicated otherwise.
7.7.5. Missing values
In PaRIS, participating patients and primary care practices had the option to skip questions they preferred not to answer. As a result, the number of valid responses for each question in the patient and practice questionnaires may differ and often falls below the total number of respondents targeted by each question.
The term “missing values” refers to unanswered questions that respondents were expected to complete. Missing values are not inherently problematic if their number is low and if they are distributed randomly. However, non-random missing values can introduce bias in the dataset and affect analysis. A key distinction exists between missing values that are unrelated to any other observed characteristic of respondents (random) and those that correlate with specific characteristics, which may contribute to bias.
The PaRIS consortium conducted an initial review of missing values for each question in both the patient and practice questionnaires. Generally, missing values remained low (less than 10%) across all questions and countries, including questions tied to the PaRIS ten key indicators.
However, several countries showed substantial missing values (over 25% in some cases) for questions regarding specific patient characteristics, such as age, gender, income, and sexual orientation. In these instances, National Project Managers were requested to investigate their local databases (for countries using independent data management systems) and/or provide explanations. National Project Managers attributed these missing values to several factors, including the sensitivity of certain questions in their countries and the questionnaire’s length. Two National Project Managers noted that all sociodemographic questions (except age) were placed at the end of the questionnaire, which likely contributed to missing responses when patients did not complete the survey. For example, for Italy, a notably high number of missing values appeared in the gender question.
Given that missing values were minimal for nearly all questions across all countries, any bias in outcomes due to missing responses was expected to be negligible. As a result, missing values were not imputed. Additionally, imputation methods have notable limitations, particularly when applied to non-random missing values. Imputation is suitable only for random missing values, as this enables the construction of a reliable prediction model based on available respondent data. In cases of non-random missing values, reliable prediction is not possible because the specific reason for the missingness remains unknown.
To maximise the total number of cases in the analysis and to ensure sufficient statistical power, observations with missing values were retained in the multilevel estimations. The following protocol was developed to handle these missing values:
Cases with missing values on the dependent variable were excluded from the analysis.
Cases with missing values on the independent variables (standardisation variables, predictor variables and/or case‑mix adjusters) were not excluded. Instead, missing indicators were added as independent variables to the models (See Box 7.6).
Box 7.6. Including missing indicators in the analytical regression models
Missing indicators were constructed for each variable in the estimation models and included as additional independent variables. The missing indicators were coded 0 (valid answer) or 1 (missing answer). In some models, only one missing indicator was constructed, putting together several independent variables, with the values 0 (a valid answer on one or more of the variables) and 1 (missing values for all variables).
Missing indicator variables for all independent variables were included in the multilevel regression models. A significant effect of the missing indicator meant that missingness of the specific patient characteristic was not random. The direction of the regression coefficient of the missing indicator provided an indication of whether the potential bias in the predicted dependent variable was either an under- or overestimation of the outcome.
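The construction of missing indicators can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical, and the value used to fill the missing entries of the original variable (0 here) is an assumption, since the text does not specify it.

```python
def add_missing_indicator(values, fill=0.0):
    """Return the filled variable and its 0/1 missing indicator.

    Assumption: missing answers (None) are replaced by `fill` so that the
    case stays in the model; the indicator absorbs the difference.
    """
    indicator = [1 if v is None else 0 for v in values]
    filled = [fill if v is None else v for v in values]
    return filled, indicator

def joint_missing_indicator(*columns):
    """Single indicator for several variables: 1 only if all are missing."""
    return [1 if all(v is None for v in row) else 0 for row in zip(*columns)]

# Hypothetical answers of three respondents to two questions.
education = [1.0, None, 0.0]
diabetes = [None, None, 1.0]
education_filled, education_missing = add_missing_indicator(education)
any_answer_missing = joint_missing_indicator(education, diabetes)
```

Only the second respondent, who skipped both questions, is flagged by the joint indicator.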
A more comprehensive review of the effects of missing indicators in the analytical models will be conducted at a later stage, along with further considerations for imputing missing values randomly.
7.7.6. Data validation
PaRIS aimed at being representative, in terms of age and sex, of the population 45 years and older of primary care service users living in the community (de Boer et al., 2022[2]). Representativeness of the samples under these parameters was validated using the available characterisation of the eligible population in each participating country.
Deviations from the original survey design were addressed with the standardisation mechanism detailed in Section 7.7.2. The extent to which applying the age and sex standardisation corrects country-level estimates for selection or non-response bias depends on the quality of the information that was used for constructing the country-specific standard populations.
Twelve out of the 19 participating countries provided data for the characterisation of the entirety of the eligible population (Level 1 standardisation) or the defined sampling frame (Level 2 standardisation). One country (Norway) was able to produce a characterisation of the entire sample of patients (Level 3 standardisation), while in the remaining six countries the sample could only be validated for respondents (Level 4 standardisation) (Figure 7.6).
Figure 7.6. Constructing the PaRIS standard population
Levels of reliability in the characterisation of eligible population
*Only for sex data. **While the characterisation in the United States covered the entire eligible population, age‑sex stratification does not fully account for the US PaRIS sample design and thus may not fully correct for non-response or selection bias.
Source: PaRIS sampling reports.
Data collected in 14 out of the 19 countries in PaRIS were validated either by presenting no limitations in the implementation approach or by addressing potential sources of bias with standardisation. Table 7.6 summarises design limitations in participating countries and how they are addressed.
Table 7.6. Summary of limitations, validation mechanisms and their implications for representativeness
| Country | Sources of potential bias | Correction | Implications for (age and sex) representativeness |
|---|---|---|---|
| Australia | The patient sample is drawn from practices that use an electronic health record system. Response rates lower than 10%. | Level 1 standardisation | Representative of the entire population of primary care service users 45 years and older |
| Belgium¹ | Response rates in practices lower than 10%. | | The Belgian patient sample could not be validated against the eligible population. |
| Canada¹ | The patient sample is drawn with a census approach from practices of a convenience sample. | | The Canadian patient sample could not be validated against the eligible population. |
| Czechia | No limitations / sources of potential bias | Level 2 standardisation | Representative of the entire population of primary care service users 45 years and older |
| France | Minor deviations to the eligibility criteria of patients | Level 1 standardisation | Representative of the entire population of primary care service users 45 years and older |
| Greece² | Patients are sampled from practices in the public health system. Response rates for patients lower than 10%. | Level 1 standardisation | Representative of the population of primary care service users 45 years and older in the public system |
| Iceland | Response rate for patients could not be calculated | Level 1 standardisation for sex, Level 4 for age | Representative of the entire population of primary care service users 45 years and older |
| Italy¹,² | The patient sample is drawn with a census approach and a different eligibility criterion from practices of a convenience sample in Veneto, Tuscany and Emilia Romagna regions. | | The Italian patient sample could not be validated against the eligible population and major differences in eligibility criteria of patients should be considered in the analysis. |
| Luxembourg | No limitations / sources of potential bias | Level 2 standardisation | Representative of the entire population of primary care service users 45 years and older |
| Netherlands | The patient sample is drawn from practices in an IT network covering 74% of eligible practices. Response rates lower than 10%. | Level 1 standardisation | Representative of the entire population of primary care service users 45 years and older |
| Norway | No limitations / sources of potential bias | Level 3 standardisation | Representative of the entire population of primary care service users 45 years and older |
| Portugal | Exclusion of Azores and Madeira. Response rates for patients at 10%. | Level 1 standardisation | Representative of the entire population of primary care service users 45 years and older |
| Romania | Challenges in implementation resulted in lower-than-expected number of participating patients. Response rates for patients lower than 10%. | Level 2 standardisation | Representative of the entire population of primary care service users 45 years and older |
| Saudi Arabia³ | No limitations / sources of potential bias | | Representative of the entire population of primary care service users 45 years and older |
| Slovenia | Response rates in practices lower than 10%. | Level 1 standardisation | Representative of the entire population of primary care service users 45 years and older |
| Spain | No limitations / sources of potential bias | Level 2 standardisation | Representative of the entire population of primary care service users 45 years and older |
| Switzerland¹ | The patient sample is drawn with a continuous sampling approach. | | The Swiss patient sample could not be validated against the eligible population. |
| United States² | The patient sample is drawn from a parent sample representative of Medicare participants aged 65 and older | Level 1 standardisation | Data representative of Medicare beneficiaries over 65 years old. Major differences in eligibility criteria of patients should be considered in the analysis |
| Wales³ | No limitations / sources of potential bias | | Representative of the entire population of primary care service users 45 years and older |
1. Validation of the sample was not possible, and the country presents sources of potential bias.
2. Major deviations from eligible population guidelines.
3. Validation of the sample was not possible, however, no basis for potential bias was detected.
Source: PaRIS sampling reports.
Three countries (Belgium, Canada, Switzerland) presented deviations that could not be addressed, and their samples could not be validated. Two more countries (Italy and the United States) present major deviations in the eligibility criteria for patients. These deviations should be considered in the analysis and comparison of estimates for these two countries.
7.7.7. Estimation models
Multilevel analyses are advantageous when working with hierarchical data because they account for the nested structure, unlike traditional methods that assume observations are independent. Multilevel models quantify and account for the variation at the country and practice level. This approach improves the accuracy of estimates by considering variations both within and between clusters (Raudenbush and Bryk, 2002[26]). In PaRIS, clusters correspond to countries and primary care practices. The intraclass correlation coefficient (ICC) is a measure used in multilevel modelling to quantify the proportion of total variance in the outcome that is attributable to clustering within groups (e.g. practices or countries). It helps determine how much of the variability in the outcome is explained by group-level differences versus individual-level differences. With $\sigma^2_v$, $\sigma^2_u$ and $\sigma^2_e$ being the variances at the country, practice and patient levels specified in the random effects of the multilevel model, the formula for the ICC at country level is:
$$ICC_{country} = \frac{\sigma^2_v}{\sigma^2_v + \sigma^2_u + \sigma^2_e}$$
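As a small numerical illustration of the country-level ICC, the Python sketch below divides each variance component by the total variance of a three-level model. The function name and the variance values are hypothetical and are not PaRIS estimates.

```python
def variance_shares(var_country, var_practice, var_patient):
    """Share of the total variance located at each level of the model.

    The country share is the ICC at country level: country variance divided
    by the sum of the country, practice and patient variances.
    """
    total = var_country + var_practice + var_patient
    return {
        "country": var_country / total,
        "practice": var_practice / total,
        "patient": var_patient / total,
    }

# Hypothetical variance components from a fitted random intercept model.
shares = variance_shares(var_country=2.0, var_practice=6.0, var_patient=92.0)
icc_country = shares["country"]
```

With these values, 2% of the total variance in the outcome lies between countries.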
PaRIS data has been analysed using three types of models: (1) to estimate the influence of background characteristics on outcomes and experiences at the patient level; (2) to assess the level and variation of selected outcomes and experiences across countries; and (3) to examine the level and variation of selected outcomes and experiences across countries while evaluating the effect of an independent variable within countries. All these models address the hierarchical structure of patient reported information by estimating the outcomes in a multilevel regression.
The specification and estimation method of each analysis vary according to the type of outcome, the type of analysis and the complexity of the specification. Models were estimated using MLwiN (Rasbash et al., 2005[27]) in Stata (runmlwin) (Leckie and Charlton, 2012[28]) or R (R2MLwiN) (Zhang et al., 2016[29]). Models were estimated with linear restricted iterative generalised least squares (RIGLS) for continuous outcomes, and logistic RIGLS with first-order penalised quasi-likelihood linearisation (PQL) for dichotomous outcomes (Goldstein, 1989[30]). For some models, the primary specification could not be estimated. In these cases, iterative generalised least squares (IGLS) (Goldstein, 1986[31]) or marginal quasi-likelihood linearisation (MQL) was used. Models used the entirety of the sample for the analysis in Chapter 2, and only patients with chronic conditions for the rest of the report (unless indicated otherwise).
Estimation models have been explained in detail in the PaRIS analysis plan (Groenewegen et al., 2024[7]). An overview of the three types of analysis in this report follows.
(1.) To estimate the influence of background characteristics over outcomes and experiences at patient level
This model responds to research questions #3, #4 and #5 (7.2.2). It is used for understanding the effect of sociodemographic characteristics on PROMs/PREMs in Chapter 5 (Inequalities), or of primary care practice level characteristics in Chapter 4 (People‑centredness). Variation in PROMs/PREMs by background characteristics was estimated with a multilevel regression model with random intercepts at the country and practice levels, and covariates (also known as “fixed effects” in multilevel terminology) at patient level. Multilevel models with random intercepts account for data hierarchies by allowing the intercepts to vary across clusters. The model assumes a common (“overall”) measure of the outcome, around which the average measures of each country and practice are normally distributed. Further, the model assumes that the relationship of the independent variables with the outcome is the same across clusters.
The sample size of participating patients is sufficient to allow for the simultaneous estimation of multiple patient-level independent variables. Since the analysis focuses on the coefficients of socio-demographic characteristics, using a standard population is not required. However, to facilitate comparability with other models in the report, standardisation variables are included as covariates in this analysis. Equation 1 describes the multilevel model for this analysis.
Equation 1
$$Y_{ijk} = \beta_0 + \sum_{s} \beta_s S_{sijk} + \sum_{m} \gamma_m X_{mijk} + v_k + u_{jk} + e_{ijk}$$
$Y_{ijk}$ is the outcome for patient i in country k, related to practice j. $\beta_0$ is the fixed intercept. $\beta_s$ are the coefficients for the standardised male variable and age variables and the missingness variables ($S_{sijk}$) explained in Section 7.7.5. $v_k$ is the random intercept for country k. $u_{jk}$ is the random intercept for practice j within country k. $e_{ijk}$ is the residual error term. $\gamma_m$ are the coefficients of the m independent variables ($X_{mijk}$) for patient i, related to practice j, in country k. While background characteristics are specified at patient level, these variables are not restricted to being patient-level variables. Characteristics of the practice or the health system can be included. In practice, these characteristics will be repeated for all the patients linked to that practice or in that country.
The purpose of Model (1.) is to examine the overall effect of background characteristics on PREMs and PROMs. To contextualise the relevance of the background characteristic of interest, measures of model fit are provided. These include the likelihood ratio and the marginal pseudo-$R^2$, which offer insights into how well the model explains the variance in the outcomes and the contribution of the variables included.
Likelihood ratio: Compares the goodness-of-fit between two multilevel models (e.g. with and without a specific variable of interest). It consists of taking the difference between the deviances of the two models and evaluating it against a chi-squared distribution, using half the resulting p-value. A significant likelihood ratio indicates that adding the variable improves the model fit (Snijders and Bosker, 2012[32]).
Marginal pseudo-$R^2$: Quantifies the proportion of variance explained by the covariates (fixed effects) in the model. It consists of a proportional comparison of the total variance of the null model and the model of interest. It helps assess how much of the variability in the outcomes is attributed to the predictors, excluding random effects (Nakagawa and Schielzeth, 2012[33]).
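The two fit measures can be sketched in Python as follows. This is a simplified illustration, not the report's implementation: it assumes the two models differ by a single parameter (one degree of freedom), and it reads the pseudo-$R^2$ as the proportional reduction in total variance from the null model to the model of interest. The function names and numbers are hypothetical.

```python
import math

def likelihood_ratio_test(deviance_without, deviance_with, halve_p=False):
    """Deviance difference and its chi-squared p-value (1 degree of freedom).

    Assumption: the models differ by exactly one parameter.
    """
    lr = deviance_without - deviance_with
    # For 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(max(lr, 0.0) / 2.0))
    return lr, (p_value / 2.0 if halve_p else p_value)

def pseudo_r2(total_variance_null, total_variance_model):
    """Proportional reduction in total variance relative to the null model."""
    return 1.0 - total_variance_model / total_variance_null

# Hypothetical deviances and total variances (sum over all levels).
lr, p_value = likelihood_ratio_test(1003.84, 1000.0)
r2 = pseudo_r2(total_variance_null=100.0, total_variance_model=80.0)
```

A deviance difference of 3.84 sits at the conventional 5% threshold for one degree of freedom. Setting `halve_p=True` applies the halving of the p-value mentioned in the text.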
(2.) To estimate the level and variation of selected outcomes in and between countries
This model responds to research questions #1 and #2 (7.2.2). Model (2.) is similar to Model (1.), with the difference that it is focused on estimating an overall measure of PREMs/PROMs by country for international comparisons, without the effect of background characteristics other than age and sex. As in Model (1.), a multilevel model with random intercepts was used to estimate the level and the variation of outcomes in and between countries. However, in this case, other than the random intercepts, only the standardisation and missingness variables were included as independent variables (except for the standardised case‑mix variables for the analysis of the effect of case‑mix in Chapter 2). Equation 2 describes the multilevel model for this analysis.
Equation 2
$$Y_{ijk} = \beta_0 + \sum_{s} \beta_s S_{sijk} + v_k + u_{jk} + e_{ijk}$$
$Y_{ijk}$ is the outcome for patient i in country k, related to practice j. $\beta_0$ is the fixed intercept. $\beta_s$ are the coefficients for the standardised male variable and age variables and the missingness variables ($S_{sijk}$) explained in Section 7.7.5. $v_k$ is the random intercept for country k. $u_{jk}$ is the random intercept for practice j within country k. $e_{ijk}$ is the residual error term.
(3.) To estimate the level and variation of selected outcomes in and between countries, while understanding the effect of an independent variable over the outcome within countries
Model (3.) is set to understand the outcomes and experiences of a particular group of people across countries, compared with other groups within the country. This is the most common type of analysis throughout the report and can be linked to all research questions. The model uses a multilevel model with random slopes (Groenewegen et al., 2024[7]). The key additional assumption relative to the previous models is that covariates affect the level of the outcome differently across countries. The inclusion of a covariate in the random slope is accompanied by the inclusion of a standardised version of the same covariate as an independent variable. The reason is that the analysis is intended to understand how the covariates affect the outcome differently across countries, while controlling for the different distribution of the covariate in the different countries. The standardisation of the covariate is constructed identically to that of the case‑mix variables, with the average of the per-country averages as the standard level of the variable of interest. Equation 3 describes the multilevel model for this analysis.
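Equation 3
$$Y_{ijk} = \beta_0 + \sum_{s} \beta_s S_{sijk} + \beta_X X^{std}_{ijk} + \sum_{g} v_{gk} X_{gijk} + \sum_{g} u_{gjk} X_{gijk} + e_{ijk}$$
$Y_{ijk}$ is the outcome for patient i in country k, related to practice j. $\beta_0$ is the fixed intercept. $\beta_s$ are the coefficients for the standardised male variable and age variables and the missingness variables ($S_{sijk}$) explained in Section 7.7.5. $\beta_X$ is the coefficient of the standardised version of the covariate ($X^{std}_{ijk}$). $v_{gk}$ is the random slope for country k and the groups defined by $X_g$ in the country. $u_{gjk}$ is the random slope for country k and the groups defined by $X_g$ in the patients of practice j. $e_{ijk}$ is the residual error term. Note that $X_g$ includes both the groups defined by variable X and a missingness indicator for variable X, as explained in Section 7.7.5. A practical example is provided in Box 7.7.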
Box 7.7. Random slope model to estimate WHO‑5 (Well-being index) for people with and without chronic conditions
Specification and codes for R and Stata
As an example, we provide the specification and codes for the analysis with a multilevel random slope of the measure of WHO‑5 for patients with and without chronic conditions by country. Equation 4 below describes the model.
Equation 4
$$WHO5_{ijk} = \beta_0 + \sum_{s} \beta_s S_{sijk} + \beta_C C^{std}_{ijk} + (v_{1k} + u_{1jk})\, C^{with}_{ijk} + (v_{2k} + u_{2jk})\, C^{without}_{ijk} + (v_{3k} + u_{3jk})\, C^{miss}_{ijk} + e_{ijk}$$
$WHO5_{ijk}$ is the measure of the WHO‑5 scale for patient i in country k, related to practice j. $\beta_0$ is the overall intercept. $\beta_s$ are the coefficients for the standardised male variable and age variables and the missingness variables for age and gender ($S_{sijk}$). $v_{1k}$ and $u_{1jk}$ are the random slopes for patients with chronic conditions for country k and practice j (in country k), respectively. $v_{2k}$ and $u_{2jk}$ are the random slopes for patients without chronic conditions for country k and practice j (in country k), respectively. $v_{3k}$ and $u_{3jk}$ are the random slopes for patients with a missing value in the chronic condition question, for country k and practice j (in country k), respectively. $e_{ijk}$ is the residual error term.
The country estimate for people with chronic conditions will be the sum of the overall intercept ($\beta_0$), the coefficient of the chronic (standardised) variable ($\beta_C$) multiplied by the value of the standardised variable for patients with chronic conditions ($C^{std}_{ijk}$), and the country-group specific random effect ($v_{1k}$, for $C^{with}_{ijk} = 1$).
The model is estimated with linear restricted iterative generalised least squares (RIGLS).
Code for estimation in R
> library(R2MLwiN)
> library(data.table)
> library(dplyr)
> # Three-level random slope model; EstM = 0 and Meth = 0 select RIGLS estimation
> model <- runMLwiN(Formula = who5_totcons ~ 1 + MALE_std + Age2_std + Age3_std + Age4_std + Gender_missing + AGE_missing + WithChronic_std + Chronic_missing + (WithCC + WithoutCC + Chronic_missing | Country) + (WithCC + WithoutCC + Chronic_missing | practice_id) + (WithCC + WithoutCC + Chronic_missing | patient_id), data = data, estoptions = list(resi.store = TRUE, EstM = 0, Meth = 0, maxiter = 200))
> # Country-level (level 3) residuals for each group
> country_levels <- levels(data$Country)
> country_estimates <- data.table(Country_v = country_levels, WithCC = model@residual$lev_3_resi_est_WithCC, WithoutCC = model@residual$lev_3_resi_est_WithoutCC)
> # Fixed part: intercept plus the standardised covariate evaluated at each group's value
> overall_intercept_WithCC <- model@FP[1] + model@FP['FP_WithChronic_std'] * mean(model@data$WithChronic_std[model@data$WithCC == 1], na.rm = TRUE)
> overall_intercept_WithoutCC <- model@FP[1] + model@FP['FP_WithChronic_std'] * mean(model@data$WithChronic_std[model@data$WithoutCC == 1], na.rm = TRUE)
> # Country estimate = fixed part + country-group residual
> country_estimates <- country_estimates %>% mutate(Country_estimate_WithCC = overall_intercept_WithCC + WithCC, Country_estimate_WithoutCC = overall_intercept_WithoutCC + WithoutCC)
Code for estimation in Stata
> runmlwin who5_totcons MALE_std Age2_std Age3_std Age4_std Gender_missing AGE_missing WithChronic_std, level3(Country: WithCC WithoutCC Chronic_missing, diagonal residuals(u)) level2(arts: WithCC WithoutCC Chronic_missing, diagonal) level1(patient: WithCC WithoutCC Chronic_missing, diagonal) rigls nopause
> gen Country_group_estimates= _b[cons]*cons + _b[WithChronic_std]* WithChronic_std + u1* WithCC + u2* WithoutCC
> collapse (first) Country_group_estimates, by (Country WithCC)
> sort WithCC Country
> list Country WithCC Country_group_estimates
7.7.8. Comparison of country estimates
Country estimates are created under the assumption of a standard population structure (in terms of age and sex) and with the premise that country measures are distributed around the overall PREM/PROM measure for the entire sample, while accounting for the nested nature of the data.
To ease comparison, an OECD PaRIS average is provided, which corresponds to the average of the 17 OECD member countries participating in PaRIS. This measure is calculated using a simple average that assumes no standard error in the country estimates calculated by the multilevel model. The robustness of this calculation was corroborated with a simulation approach that accounted for the dependency and standard errors of the country estimates calculated by the multilevel model. Moreover, to assess the statistical significance of differences across countries, we provide estimates with a comparative interval. The comparative interval approach involves examining the overlap of intervals to assess statistical significance. Following Goldstein and Healy’s method (Goldstein and Healy, 1995[34]), these intervals are confidence intervals adjusted so that non-overlap between two intervals implies a statistically significant difference at the 5% level. The adjustment makes the intervals narrower than traditional 95% confidence limits, because judging a difference by the overlap of two 95% intervals would be overly conservative. The resulting interval is equivalent to approximately an 84% confidence interval. By doing so, the type I error rate (the probability of incorrectly identifying a difference) averages 5% across all pairwise comparisons. This means that if two comparative intervals overlap, any observed difference is unlikely to be statistically significant.
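The link between the 5% significance level and an interval of roughly 84% coverage can be checked with a short Python sketch. It relies on a simplifying assumption that is not part of the report: the two estimates being compared are independent and have equal standard errors. Goldstein and Healy's method averages over all pairs with their actual standard errors. The function names and numbers are hypothetical.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

Z_95 = 1.959964  # two-sided 5% critical value of the standard normal

# Two independent estimates with equal standard error s differ significantly
# at 5% when |difference| > Z_95 * sqrt(2) * s. Intervals of half-width c * s
# stop overlapping when |difference| > 2 * c * s, so c = Z_95 / sqrt(2).
MULTIPLIER = Z_95 / math.sqrt(2.0)
COVERAGE = 2.0 * normal_cdf(MULTIPLIER) - 1.0

def comparative_interval(estimate, standard_error, multiplier=MULTIPLIER):
    """Interval whose non-overlap with another signals a 5% difference."""
    half_width = multiplier * standard_error
    return estimate - half_width, estimate + half_width

def intervals_overlap(a, b):
    return a[0] <= b[1] and b[0] <= a[1]

# Hypothetical country estimates with standard error 1.
country_a = comparative_interval(50.0, 1.0)
country_b = comparative_interval(53.0, 1.0)
country_c = comparative_interval(52.0, 1.0)
```

Under these assumptions the multiplier is about 1.39 standard errors and the coverage about 83%, consistent with the approximately 84% interval described above. Country A and country B (difference of 3) do not overlap, whereas country A and country C (difference of 2) do.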
7.8. Limitations
Ensuring comparability across countries required addressing potential cultural biases and differences in expectations and response styles. These factors can challenge measurement invariance, a key issue in cross-country studies. To mitigate such risks, the PaRIS questionnaire was developed with intercultural differences in mind, drawing on insights from international guidelines and expert recommendations on cross-cultural validity and adaptability of surveys. The process involved extensive consultation with patient groups and experts, rigorous translation procedures, cognitive testing in each country to refine items and minimise cultural discrepancies, and confirmatory factor analysis on the Field Trial data. Further research is planned to enhance cross-country comparability through techniques such as differential item functioning analyses, response style adjustments, and advanced confirmatory factor analyses.
Selection and non-response bias is also a potential concern due to limitations in data collection and sampling methods in some countries, particularly because of the recruitment of patients through primary care practices. Differences in survey collection methods (e.g. paper-based, online, or telephone), unfavourable sampling approaches (e.g. continuous or convenience sampling), and varying response rates can contribute to this limitation.
The analysis in Section 7.6.1 confirms that collection modes have significant impact on PREMS and PROMS. However, because countries used mostly one collection mode, it is difficult to disentangle the collection mode effect from the country effect. This results in minimal changes to the country estimates when accounting for the collection method (see Section 7.6.1). On the other hand, because of data limitations, it was not possible to directly mitigate the risk of selection bias arising from continuous or convenience sampling.
Practices that agree to participate in a survey may not necessarily be representative of all primary care practices in the country. Practices with sufficient resources or those focused on quality improvement could be more likely to participate, which might result in selection of “high performers” and so better scores on patient-reported measures.
This hypothesis was tested in Norway, where it was possible to survey patients whose practice was not willing to participate, so that results for patients of participating and non-participating practices could be compared (Bjertnaes et al., 2024[35]). While the analysis showed statistically significant differences, their impact on core outcomes is minimal: patients in participating practices differed only slightly in areas like people‑centred care, co‑ordinated care, and mental health outcomes, with small effect sizes. These findings reinforce the idea that, despite certain limitations, the survey effectively captures key aspects of patient outcomes and experiences.
PaRIS’ response rates vary between 5% and 86% for practices and between 6% and 47% for patients. While some response rates might seem low, they are within the range of other recent international surveys with patient-level responses. For example, the People’s Voice Survey (Kruk et al., 2024[22]) presents response rates between 2% and 84% among the 15 participating countries. The Commonwealth Fund’s 2023 International Health Policy Survey had response rates varying from 6% to 49% among 10 participating countries (Gumas et al., 2024[23]). While the third wave of the European Health Interview Survey (Eurostat, 2024[24]) achieves higher response rates (between 25% and 88%), it targets the general population, unlike PaRIS, which focuses on primary care service users.
For countries with low response rates, there is an increased risk of self-selection of respondents because of characteristics that differentiate them from non-respondents, which poses challenges in terms of non-response bias and representativeness. In turn, if these characteristics influence the measured outcomes, non-response bias can also affect comparability. For example, outcomes could be lower in a country where a high response rate reflects a more widespread inclusion of socially disadvantaged groups.
Given the lack of information on non-responders, a characterisation of this group was not possible. Instead, we have mitigated the risk of non-response bias with a two‑step approach. First, regarding comparability, we have conducted a case‑mix analysis (Section 7.7.4), which showed that potential case‑mix covariates have minimal impact on country estimates of PROMs and PREMs. This tells us that, even if there is a risk of over-representation of some characteristics that could drive participation, such as socio-economic status or the prevalence of chronic conditions, these have a negligible impact on the outcomes of interest. These results reinforce the idea of health systems adapting to the characteristics of their population to optimise patient outcomes and experiences. Second, regarding representativeness, the validation and standardisation process in terms of the age and sex distribution of the sample secures representativeness under these parameters (see Section 7.7.6).
Overall, these measures allow for meaningful cross-country comparisons, reinforcing the survey’s contribution to understanding patient experiences across diverse health systems. However, these methods do not address potential representativeness or comparability issues in terms of unobserved variables. While this is a shortcoming of every survey and cross-sectional study (Groves and Lyberg, 2010[36]), it is important for the reader to consider potential unaccounted biases arising from the implementation methods described in Section 7.5.1 when analysing results.
Access bias is an important consideration in PaRIS, as characteristics of primary care service users can vary across countries due to differences in access barriers. For instance, in countries with lower financial coverage of primary care, primary care access can be constrained by out-of-pocket costs, which may not be as significant in countries with more extensive healthcare coverage. This variability means that individuals facing greater access challenges may be underrepresented in some countries, as the survey specifically targets those who have interacted with primary care practices within the six months prior to sampling.
When interpreting the survey findings, it is essential to recognise that PaRIS focuses on primary care users rather than the broader population. Therefore, observed disparities between groups may reflect not only differences in care quality but also varying levels of access to primary care. This does not bias results, given that PaRIS aims to be representative of primary care users rather than of the general population. In addition, the PaRIS eligibility criteria, which require only one primary care contact within six months, likely still capture a significant portion of people with access barriers, as even those facing challenges may achieve at least this minimal level of contact.
To assess the impact of access, PaRIS included questions on access barriers. The proportion of people reporting access barriers in PaRIS aligns closely with Eurostat data on self-reported unmet needs due to access and cost barriers, as well as OECD data on skipped medical treatments due to costs (further explained in Chapter 5). This alignment suggests that PaRIS is representative of a substantial portion of individuals facing access barriers, though some level of access bias cannot be completely ruled out. Overall, analysis indicates that any potential impact of access bias on the survey’s findings is likely minimal.
References
[12] Avvisati, F., N. Le Donné and M. Paccagnella (2019), “A meeting report: cross-cultural comparability of questionnaire measures in large-scale international surveys”, Measurement Instruments for the Social Sciences, Vol. 1/1, https://doi.org/10.1186/s42409-019-0010-z.
[35] Bjertnaes, O. et al. (2024), “International survey of people living with chronic conditions (PaRIS survey): effects of general practitioner non-participation on the representativeness of the Norwegian patient data”, BMC Health Services Research, Vol. 24/1, https://doi.org/10.1186/s12913-024-11751-0.
[5] Bloemeke-Cammin, J. et al. (2024), “International cross-cultural development and field testing of the primary care practice questionnaire for the PaRIS survey (PaRIS-PCPQ)”, BMC Primary Care, Vol. 25/1, https://doi.org/10.1186/s12875-024-02375-8.
[20] Cronbach, L. (1951), “Coefficient alpha and the internal structure of tests”, Psychometrika, Vol. 16/3, pp. 297-334, https://doi.org/10.1007/bf02310555.
[21] Davis, K. et al. (2021), “Effectiveness of nurse–led services for people with chronic disease in achieving an outcome of continuity of care at the primary-secondary healthcare interface: A quantitative systematic review”, International Journal of Nursing Studies, Vol. 121, p. 103986, https://doi.org/10.1016/j.ijnurstu.2021.103986.
[2] de Boer, D. et al. (2022), “Assessing the outcomes and experiences of care from the perspective of people living with chronic conditions, to support countries in developing people-centred policies and practices: study protocol of the International Survey of People Living with Chronic Conditions (PaRIS survey)”, BMJ Open, Vol. 12/9, p. e061424, https://doi.org/10.1136/bmjopen-2022-061424.
[24] Eurostat (2024), European Health Interview Survey (EHIS): Reference Metadata in Euro SDMX Metadata Structure, Eurostat, The Statistical office of the European Union, https://ec.europa.eu/eurostat/cache/metadata/fr/hlth_det_esms.htm.
[16] Fortin, M., J. Almirall and K. Nicholson (2017), “Development of a Research Tool to Document Self-Reported Chronic Conditions in Primary Care”, Journal of Comorbidity, Vol. 7/1, pp. 117-123, https://doi.org/10.15256/joc.2017.7.122.
[30] Goldstein, H. (1989), “Restricted unbiased iterative generalized least-squares estimation”, Biometrika, Vol. 76/3, pp. 622-623, https://doi.org/10.1093/biomet/76.3.622.
[31] Goldstein, H. (1986), “Multilevel mixed linear model analysis using iterative generalized least squares”, Biometrika, Vol. 73/1, pp. 43-56, https://doi.org/10.1093/biomet/73.1.43.
[34] Goldstein, H. and M. Healy (1995), “The Graphical Presentation of a Collection of Means”, Journal of the Royal Statistical Society. Series A (Statistics in Society), Vol. 158/1, p. 175, https://doi.org/10.2307/2983411.
[25] Groenewegen, P. et al. (2023), “Case-mix adjustments for patient reported experience and outcome measures in primary care: an empirical approach to identify patient characteristics as case-mix adjusters based on a secondary analysis of an international survey among patients and their general practitioners in 34 countries”, Journal of Patient-Reported Outcomes, Vol. 7/1, https://doi.org/10.1186/s41687-023-00667-8.
[7] Groenewegen, P. et al. (2024), “Data analysis plan of the OECD PaRIS survey: leveraging a multi-level approach to analyse data collected from people living with chronic conditions and their primary care practices in 20 countries”, BMC Research Notes, Vol. 17/1, https://doi.org/10.1186/s13104-024-06815-7.
[36] Groves, R. and L. Lyberg (2010), “Total Survey Error: Past, Present, and Future”, Public Opinion Quarterly, Vol. 74/5, pp. 849-879, https://doi.org/10.1093/poq/nfq065.
[23] Gumas, E. et al. (2024), Finger on the Pulse: The State of Primary Care in the U.S. and Nine Other Countries, https://doi.org/10.26099/p3y4-5g38.
[14] Harkness, J., F. van de Vijver and T. Johnson (2003), “Questionnaire design in comparative research”, in Cross-Cultural Survey Methods, Wiley.
[17] Justin, J. et al. (2022), “International Guidelines for Hypertension: Resemblance, Divergence and Inconsistencies”, Journal of Clinical Medicine, Vol. 11/7, p. 1975, https://doi.org/10.3390/jcm11071975.
[41] Katz, P., S. Pedro and K. Michaud (2017), “Performance of the Patient‐Reported Outcomes Measurement Information System 29‐Item Profile in Rheumatoid Arthritis, Osteoarthritis, Fibromyalgia, and Systemic Lupus Erythematosus”, Arthritis Care & Research, Vol. 69/9, pp. 1312-1321, https://doi.org/10.1002/acr.23183.
[9] Kendir, C. et al. (2023), “All hands on deck: Co-developing the first international survey of people living with chronic conditions: Stakeholder engagement in the design, development, and field trial implementation of the PaRIS survey”, OECD Health Working Papers, No. 149, OECD Publishing, Paris, https://doi.org/10.1787/8b31022e-en.
[11] Kendir, C. et al. (2024), “Engaging primary care professionals in OECD’s international PaRIS survey: a documentary analysis”, Health Research Policy and Systems, Vol. 22/1, https://doi.org/10.1186/s12961-024-01170-2.
[10] Kendir, C. et al. (2023), “Lessons from early implementation of the OECD’s Patient-Reported Indicator Surveys (PaRIS) in primary care: making the case for co-development and adaptation to national contexts”, IJQHC Communications, Vol. 3/1, https://doi.org/10.1093/ijcoms/lyad003.
[22] Kruk, M. et al. (2024), “Population confidence in the health system in 15 countries: results from the first round of the People’s Voice Survey”, The Lancet Global Health, Vol. 12/1, pp. e100-e111, https://doi.org/10.1016/s2214-109x(23)00499-0.
[28] Leckie, G. and C. Charlton (2012), “runmlwin: A Program to Run the MLwiN Multilevel Modeling Software from within Stata”, Journal of Statistical Software, Vol. 52/11, https://doi.org/10.18637/jss.v052.i11.
[19] Leyland, A. and P. Groenewegen (2020), Multilevel Modelling for Public Health and Health Services Research, Springer International Publishing, Cham, https://doi.org/10.1007/978-3-030-34801-4.
[37] Lloyd, H. et al. (2018), “Validation of the person-centred coordinated care experience questionnaire (P3CEQ)”, International Journal for Quality in Health Care, Vol. 31/7, pp. 506-512, https://doi.org/10.1093/intqhc/mzy212.
[1] OECD (2019), Measuring What Matters: The Patient-Reported Indicator Surveys, OECD Health Policy Studies, OECD Publishing, Paris, https://doi.org/10.1787/2148719d-en.
[33] O’Hara, R. (ed.) (2012), “A general and simple method for obtaining R² from generalized linear mixed‐effects models”, Methods in Ecology and Evolution, Vol. 4/2, pp. 133-142, https://doi.org/10.1111/j.2041-210x.2012.00261.x.
[27] Rasbash, J. et al. (2005), MLwiN Version 2.02, Centre for Multilevel Modelling, University of Bristol.
[18] Raudenbush, S. (2003), “The Quantitative Assessment of Neighborhood Social Environments”, in Neighborhoods and Health, Oxford University Press, New York, https://doi.org/10.1093/acprof:oso/9780195138382.003.0005.
[26] Raudenbush, S. and A. Bryk (2002), Hierarchical linear models: Applications and data analysis methods, Sage Publishing, Thousand Oaks, CA.
[40] Raudenbush, S., B. Rowan and S. Kang (1991), “A Multilevel, Multivariate Model for Studying School Climate with Estimation Via the EM Algorithm and Application to U. S. High-School Data”, Journal of Educational Statistics, Vol. 16/4, p. 295, https://doi.org/10.2307/1165105.
[3] Rijken, M. et al. (2023), “Updates on the study protocol for the OECD international PaRIS survey and on its implementation”, BMJ Open, https://bmjopen.bmj.com/content/12/9/e061424.responses#updates-on-the-study-protocol-for-the-oecd-international-paris-survey-and-on-its-implementation-30-june-2023.
[38] Sischka, P. et al. (2020), “The WHO-5 well-being index – validation based on item response theory and the analysis of measurement invariance across 35 countries”, Journal of Affective Disorders Reports, Vol. 1, p. 100020, https://doi.org/10.1016/j.jadr.2020.100020.
[32] Snijders, T. and R. Bosker (2012), Multilevel Analysis: An Introduction To Basic And Advanced Multilevel Modeling, SAGE Publications.
[15] Tourangeau, R. (1984), “Cognitive science and survey methods: a cognitive perspective”, in Building a Bridge Between Disciplines, National Academy Press, Washington DC.
[6] Valderas, J. et al. (2025), “The International Survey of People Living with Chronic Conditions: development of the PaRIS Patient Questionnaire (PaRIS-PQ)”, Under revision in BMJ Quality & Safety.
[4] Valderas, J. et al. (2024), “Development of the Patient-Reported Indicator Surveys (PaRIS) conceptual framework to monitor and improve the performance of primary care for people living with chronic conditions”, BMJ Quality & Safety, pp. bmjqs-2024-017301, https://doi.org/10.1136/bmjqs-2024-017301.
[13] Van de Vijver, F. et al. (2019), “Invariance analyses in large-scale studies”, OECD Education Working Papers, No. 201, OECD Publishing, Paris, https://doi.org/10.1787/254738dd-en.
[8] van den Berg, M. et al. (2024), “PaRIS Field Trial Report: Technical report on the international PaRIS survey of people living with chronic conditions”, OECD Health Working Papers, No. 166, OECD Publishing, Paris, https://doi.org/10.1787/e5725c75-en.
[29] Zhang, Z. et al. (2016), “R2MLwiN: A Package to Run MLwiN from within R”, Journal of Statistical Software, Vol. 72/10, https://doi.org/10.18637/jss.v072.i10.
[39] Zonjee, V. et al. (2022), “The patient-reported outcomes measurement information systems (PROMIS®) physical function and its derivative measures in adults: a systematic review of content validity”, Quality of Life Research, Vol. 31/12, pp. 3317-3330, https://doi.org/10.1007/s11136-022-03151-w.
Annex 7.A. Instrument reliability and statistical power
Outcome reliability
PaRIS data has three interconnected levels: country, practice, and patient. Four validated measures of patient-reported care experience and outcomes are considered for evaluating outcome reliability: P3CEQ (PREM) (Lloyd et al., 2018[37]), WHO‑5 (Sischka et al., 2020[38]), PROMIS-physical (Zonjee et al., 2022[39]) and PROMIS-mental (PROMs). The reliability of the four selected outcomes depends on measurement error, the number of items,5 the number of patients per practice and the variances at practice and country level.
Following literature (Raudenbush, 2003[18]; Leyland and Groenewegen, 2020[19]), reliability is calculated for the multilevel model with the following formula:
Equation 5

$$\lambda = \frac{\sigma^2_{u}}{\sigma^2_{u} + \dfrac{\sigma^2_{e}}{n} + \dfrac{\sigma^2_{i}}{n\,k}}$$

In Equation 5, $\sigma^2_{u}$ is the higher level variance; $\sigma^2_{e}$ is the individual level variance; $\sigma^2_{i}$ is the item consistency;6 $n$ is the average number of individual respondents in a higher level unit (country); and $k$ is the number of items. The reliability coefficient in a multilevel model is a measure of internal consistency comparable to Cronbach’s alpha in a single‑level model (Raudenbush, Rowan and Kang, 1991[40]).7
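As an illustration of how the quantities listed for Equation 5 combine, the following minimal sketch assumes the standard form used in the multilevel measurement literature, in which the individual level variance is divided by the number of respondents and the item-level variance by the number of respondents times the number of items. The function name, argument names and the variance values in the usage lines are hypothetical and are not taken from the PaRIS analysis.

```python
def multilevel_reliability(var_higher, var_individual, var_item,
                           n_respondents, n_items):
    """Reliability of a higher-level mean under an assumed three-level
    structure: items within respondents within higher-level units.

    Assumed form (illustrative, not confirmed by the source):
        var_higher / (var_higher
                      + var_individual / n
                      + var_item / (n * k))
    """
    error_variance = (var_individual / n_respondents
                      + var_item / (n_respondents * n_items))
    return var_higher / (var_higher + error_variance)


# Hypothetical variance components: reliability rises with the number of
# respondents per higher-level unit, all else being equal.
few_respondents = multilevel_reliability(0.05, 1.0, 2.0, n_respondents=20, n_items=4)
many_respondents = multilevel_reliability(0.05, 1.0, 2.0, n_respondents=75, n_items=4)
print(round(few_respondents, 3), round(many_respondents, 3))
```

The sketch makes the dependence on sample size explicit: with the variance components held fixed, a unit with more respondents yields a higher coefficient, which is the pattern the annex tables describe for countries with many patients per practice.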
Annex Table 7.A.1 to Annex Table 7.A.5 contain the results of the multilevel reliability analyses using the data of the main survey for the scales in the PaRIS ten key indicators. The tables contain the results for the five key indicators that have been assessed with multiple items. In addition, we also evaluated reliability of other scales with multi‑item constructs that were included in the patient questionnaire of the main survey: Active engagement, Working with healthcare professionals and Health literacy.
The tables show that countries varied considerably in the number of participating practices and patients. These differences are reflected in the reliability of the constructs at country level. The reliability at country and practice level is generally higher when the average number of participating patients per practice is high. However, as the tables also show, many countries did not reach the recommended minimum of 75 patients per practice.
Looking at the results for each construct separately, high reliability at country level was found for the scales assessing Physical health, Mental health, Experienced care co‑ordination and Experienced people‑centred care. A similar result was found for Active engagement, Working with healthcare professionals to manage one’s health and Health literacy.
Sufficient reliability (> 0.70) at country level was found on Physical health for all countries except Greece, Italy and Romania (all with a low number of participating patients per practice) and the United States (where patients are not nested in practices), and on Mental health for all countries except Romania. Reliability at country level was sufficient for all countries on Experienced care co‑ordination and Experienced people‑centred care. In addition (not shown in the tables), sufficient reliability was found for the scales on Working with healthcare professionals to manage one’s health and Health literacy. Lower reliability at country level was found for the well-being scale (ten countries with country reliability < 0.70).
Reliability at practice level was insufficient in most countries. While higher reliability at the practice level would have been ideal, it is less critical for PaRIS, as the survey’s primary objective is not to analyse differences in patients’ care experiences and outcomes between practices within countries.
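The annex tables report the single-level Cronbach’s alpha alongside the multilevel coefficients. As a reminder of what that single-level figure measures, the sketch below implements the textbook formula for alpha (Cronbach, 1951[20]) on a small, made-up response matrix; the function name and the data are illustrative assumptions, not PaRIS items or results.

```python
import statistics


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    """
    k = len(responses[0])
    item_variances = [
        statistics.variance([row[i] for row in responses]) for i in range(k)
    ]
    total_variance = statistics.variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Made-up scores for four respondents on a two-item scale.
example = [[1, 2], [2, 1], [3, 3], [4, 5]]
print(round(cronbach_alpha(example), 3))
```

Unlike the multilevel coefficient, this quantity ignores the nesting of patients in practices and countries, which is why the two columns in the annex tables can diverge.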
Annex Table 7.A.1. Instrument reliability: PROMIS Physical health scale
Multilevel reliability coefficient for PROMIS Physical health scale

| Country | Country reliability (multi-level) | Practice reliability (multi-level) | Reliability (single level; Cronbach’s alpha) |
|---|---|---|---|
| Australia | 0.87 | 0.81 | 0.7992 |
| Belgium | 0.84 | 0.73 | 0.7942 |
| Canada | 0.90 | 0.83 | 0.7626 |
| Czechia | 0.77 | 0.63 | 0.8066 |
| France | 0.83 | 0.72 | 0.7853 |
| Greece | 0.67 | 0.51 | 0.8075 |
| Iceland | 0.78 | 0.66 | 0.8150 |
| Italy | 0.60 | 0.43 | 0.7744 |
| Luxembourg | 0.78 | 0.67 | 0.7817 |
| The Netherlands | 0.83 | 0.72 | 0.7975 |
| Norway | 0.81 | 0.69 | 0.8052 |
| Portugal | 0.89 | 0.82 | 0.7868 |
| Romania | 0.48 | 0.32 | 0.8291 |
| Saudi Arabia | 0.81 | 0.69 | 0.7033 |
| Slovenia | 0.77 | 0.64 | 0.7642 |
| Spain | 0.81 | 0.69 | 0.7565 |
| Switzerland | 0.79 | 0.65 | 0.7215 |
| United States¹ | 0.60 | - | 0.7829 |
| Wales | 0.86 | 0.76 | 0.8460 |
1. Patients not nested in practices.
Source: OECD PaRIS 2024 Database.
Annex Table 7.A.2. Instrument reliability: PROMIS Mental health scale
Multilevel reliability coefficient for PROMIS Mental health scale

| Country | Country reliability (multi-level) | Practice reliability (multi-level) | Reliability (single level; Cronbach’s alpha) |
|---|---|---|---|
| Australia | 0.94 | 0.74 | 0.8537 |
| Belgium | 0.92 | 0.65 | 0.8217 |
| Canada | 0.95 | 0.77 | 0.8299 |
| Czechia | 0.88 | 0.54 | 0.7926 |
| France | 0.91 | 0.64 | 0.7631 |
| Greece | 0.81 | 0.42 | 0.7995 |
| Iceland | 0.89 | 0.57 | 0.8193 |
| Italy | 0.76 | 0.34 | 0.7749 |
| Luxembourg | 0.89 | 0.57 | 0.7909 |
| The Netherlands | 0.91 | 0.63 | 0.8202 |
| Norway | 0.90 | 0.60 | 0.8463 |
| Portugal | 0.95 | 0.76 | 0.7718 |
| Romania | 0.66 | 0.24 | 0.7933 |
| Saudi Arabia | 0.90 | 0.60 | 0.7212 |
| Slovenia | 0.88 | 0.54 | 0.7658 |
| Spain | 0.90 | 0.60 | 0.7131 |
| Switzerland | 0.89 | 0.56 | 0.7935 |
| United States¹ | 0.83 | - | 0.8101 |
| Wales | 0.93 | 0.68 | 0.8460 |
1. Patients not nested in practices.
Source: OECD PaRIS 2024 Database.
Annex Table 7.A.3. Instrument reliability: WHO5 Well-being
Multilevel reliability coefficient for well-being index

| Country | Country reliability (multi-level) | Practice reliability (multi-level) | Reliability (single level; Cronbach’s alpha) |
|---|---|---|---|
| Australia | 0.79 | 0.73 | 0.9069 |
| Belgium | 0.73 | 0.64 | 0.9012 |
| Canada | 0.82 | 0.76 | 0.8979 |
| Czechia | 0.64 | 0.53 | 0.9054 |
| France | 0.73 | 0.63 | 0.9014 |
| Greece | 0.51 | 0.40 | 0.9064 |
| Iceland | 0.66 | 0.56 | 0.8935 |
| Italy | 0.43 | 0.32 | 0.8992 |
| Luxembourg | 0.66 | 0.56 | 0.9028 |
| The Netherlands | 0.72 | 0.62 | 0.8959 |
| Norway | 0.70 | 0.59 | 0.8906 |
| Portugal | 0.82 | 0.74 | 0.9110 |
| Romania | 0.32 | 0.23 | 0.9188 |
| Saudi Arabia | 0.69 | 0.59 | 0.8308 |
| Slovenia | 0.64 | 0.53 | 0.9104 |
| Spain | 0.70 | 0.59 | 0.8465 |
| Switzerland | 0.66 | 0.55 | 0.8933 |
| United States¹ | 0.58 | - | 0.8258 |
| Wales | 0.76 | 0.68 | 0.9149 |
1. Patients not nested in practices.
Source: OECD PaRIS 2024 Database.
Annex Table 7.A.4. Instrument reliability: P3CEQ person centredness
Multilevel reliability coefficient for P3CEQ person centredness

| Country | Country reliability (multi-level) | Practice reliability (multi-level) | Reliability (single level; Cronbach’s alpha) |
|---|---|---|---|
| Australia | 0.97 | 0.82 | 0.7621 |
| Belgium | 0.96 | 0.73 | 0.7035 |
| Canada | 0.98 | 0.83 | 0.7593 |
| Czechia | 0.94 | 0.64 | 0.7600 |
| France | 0.96 | 0.72 | 0.7070 |
| Greece | 0.89 | 0.47 | 0.7284 |
| Iceland | 0.94 | 0.66 | 0.7547 |
| Italy | 0.87 | 0.41 | 0.7894 |
| Luxembourg | 0.94 | 0.67 | 0.7364 |
| The Netherlands | 0.96 | 0.71 | 0.6662 |
| Norway | 0.95 | 0.69 | 0.7408 |
| Portugal | 0.97 | 0.81 | 0.8063 |
| Romania | 0.83 | 0.35 | 0.7221 |
| Saudi Arabia | 0.96 | 0.70 | 0.5904 |
| Slovenia | 0.93 | 0.60 | 0.8140 |
| Spain | 0.96 | 0.71 | 0.7561 |
| Switzerland | 0.94 | 0.65 | 0.7128 |
| United States¹ | 0.87 | - | 0.6994 |
| Wales | 0.97 | 0.77 | 0.8008 |
1. Patients not nested in practices.
Source: OECD PaRIS 2024 Database.
Annex Table 7.A.5. Instrument reliability: P3CEQ care co‑ordination
Multilevel reliability coefficient for P3CEQ care co‑ordination

| Country | Country reliability (multi-level) | Practice reliability (multi-level) | Reliability (single level; Cronbach’s alpha) |
|---|---|---|---|
| Australia | 0.96 | 0.81 | 0.6599 |
| Belgium | 0.94 | 0.72 | 0.6109 |
| Canada | 0.97 | 0.83 | 0.6305 |
| Czechia | 0.92 | 0.63 | 0.6540 |
| France | 0.94 | 0.71 | 0.5794 |
| Greece | 0.85 | 0.46 | 0.6988 |
| Iceland | 0.92 | 0.65 | 0.6840 |
| Italy | 0.82 | 0.40 | 0.6217 |
| Luxembourg | 0.92 | 0.66 | 0.6166 |
| The Netherlands | 0.94 | 0.70 | 0.5595 |
| Norway | 0.93 | 0.68 | 0.6434 |
| Portugal | 0.96 | 0.81 | 0.7084 |
| Romania | 0.78 | 0.34 | 0.6990 |
| Saudi Arabia | 0.94 | 0.70 | 0.6644 |
| Slovenia | 0.90 | 0.59 | 0.6269 |
| Spain | 0.94 | 0.70 | 0.5634 |
| Switzerland | 0.92 | 0.64 | 0.5708 |
| United States¹ | 0.80 | - | 0.5571 |
| Wales | 0.95 | 0.77 | 0.6549 |
1. Patients not nested in practices.
Source: OECD PaRIS 2024 Database.
Statistical power to answer PaRIS’ main research questions.
Research questions #1 and #2.
What are the patient-reported outcomes of {the population}8 with chronic conditions, compared to those without chronic conditions, in the areas of symptoms, physical, mental and social functioning, self-reported health, and health-related quality of life? How do these results vary across countries?
What are the experiences of {the population} with chronic conditions, compared to those without chronic conditions, in the areas of access, comprehensiveness, continuity, co‑ordination, safety and people‑centredness of care, self-management support, trust, and overall perceived quality of care? How do these results vary across countries?
The calculations assess whether it is possible to demonstrate significant differences between three groups of countries based on the care experiences (as assessed with the P3CEQ) and outcomes (as assessed with the WHO‑5, PROMIS-physical and PROMIS-mental scales) reported in PaRIS: 1. countries that did not deviate from the overall average on the specific PREM/PROM; 2. countries with a mean score and comparative interval on the PREM/PROM fully above the overall average; 3. countries with a mean score and comparative interval on the PREM/PROM fully below the overall average. This can be inspected visually in Annex Figure 7.A.1, where four countries have the upper bound of their comparative interval below the lower bound of the comparative interval of the sample average (red line), while five other countries have the lower bound of their comparative interval above the upper bound of the comparative interval of the sample average (blue line). This assessment is similar for all ten PaRIS key indicators.
Annex Figure 7.A.1. Assessing the power of PaRIS to capture significant differences in PREMS/PROMS across countries
Proportion of the population with good general health. Country average and comparative interval.
Note: Countries below the red line are statistically significantly below average. Countries above the blue line are statistically significantly above average. Estimates are made over the whole PaRIS population; thus, measures can differ from those on the dashboard in Chapter 2. The “Overall” measure includes all 19 countries in PaRIS and represents the joint adjusted estimate of the indicator.
Source: OECD PaRIS 2024 Database.
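The three-way grouping of countries described for Annex Figure 7.A.1 amounts to comparing each country’s comparative interval with the comparative interval of the overall average. The sketch below shows that comparison in isolation; the function name and all interval values are hypothetical and do not reproduce PaRIS estimates or the way the comparative intervals themselves are derived.

```python
def classify_country(country_interval, overall_interval):
    """Compare a country's comparative interval with the overall one.

    Both arguments are (lower, upper) tuples on the same scale.
    """
    lower, upper = country_interval
    overall_lower, overall_upper = overall_interval
    if upper < overall_lower:
        return "below average"      # interval entirely under the overall interval
    if lower > overall_upper:
        return "above average"      # interval entirely over the overall interval
    return "not different from average"


# Hypothetical intervals for illustration only.
overall = (0.64, 0.68)
countries = {
    "Country A": (0.55, 0.61),
    "Country B": (0.63, 0.70),
    "Country C": (0.71, 0.77),
}
for name, interval in countries.items():
    print(name, classify_country(interval, overall))
```

Only countries whose whole interval clears the overall interval are counted as different, so overlapping intervals are treated conservatively as “not different from average”.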
Research question #4
How do key characteristics of primary care practices relate to the care experiences and outcomes of primary care service users aged 45 and over with chronic conditions?
To assess the power of the survey to detect meaningful effects of primary care practice or country characteristics, we conducted a simulation-based power analysis of the multilevel regression model detailed in Section 7.7.7.
In this approach, we used the original model fitted to the survey data and simulated datasets with fixed sample sizes at each level, following the average number of practices within countries and patients within practices. The main source of variability in the simulations comes from the randomness inherent in the process, akin to a bootstrap method, which tests the robustness of the model. For each simulation, we tested the significance of a specified characteristic and calculated the proportion of times the characteristic’s effect was statistically significant at a 0.05 significance level. Observed power is considered acceptable when it reaches at least 70%, indicating a 70% probability of detecting a true effect of the characteristic, if it exists. This analysis provides insights into whether the current survey design (considering the number of countries, respondents, and observations) is sufficient to robustly estimate the effect of the characteristics while accounting for random variation at multiple levels. Additionally, if available, some parameters (such as the real distribution of the characteristic of interest) can be adjusted to reflect the true data distribution, rather than using the sample distribution. The simulation analysis shows that the significant effects found in PaRIS are robust in all of the simulated datasets.
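The logic of a simulation-based power analysis can be sketched in a few lines. The example below is a deliberately simplified stand-in, not the PaRIS procedure: it assumes a two-level design (patients within practices) with a binary practice characteristic, draws practice means directly from their sampling distribution, and uses a normal-approximation test on practice means instead of refitting the multilevel model of Section 7.7.7. All names and parameter values are hypothetical.

```python
import math
import random
import statistics


def simulate_power(effect, n_practices, n_patients, sd_practice, sd_patient,
                   n_sims=100, z_crit=1.96, seed=0):
    """Share of simulated datasets in which the practice characteristic
    is significant at roughly the 0.05 level (two-sided z-test)."""
    rng = random.Random(seed)
    # Standard deviation of a practice mean: practice-level random effect
    # plus patient-level noise averaged over n_patients.
    sd_mean = math.sqrt(sd_practice ** 2 + sd_patient ** 2 / n_patients)
    significant = 0
    for _ in range(n_sims):
        with_char, without_char = [], []
        for j in range(n_practices):
            has_char = j % 2 == 0  # half of the practices have the characteristic
            mean = (effect if has_char else 0.0) + rng.gauss(0.0, sd_mean)
            (with_char if has_char else without_char).append(mean)
        diff = statistics.fmean(with_char) - statistics.fmean(without_char)
        se = math.sqrt(statistics.variance(with_char) / len(with_char)
                       + statistics.variance(without_char) / len(without_char))
        if abs(diff / se) > z_crit:
            significant += 1
    return significant / n_sims


# Hypothetical design: 100 practices with 75 patients each.
print(simulate_power(effect=0.10, n_practices=100, n_patients=75,
                     sd_practice=0.3, sd_patient=1.0))
```

Changing `n_practices` or `n_patients` while holding the effect size fixed shows how the proportion of significant runs responds to the design, which is the question the annex analysis asks of the actual multilevel model.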
Annex Table 7.A.6. Power analysis for practice‑level characteristics
Percentage of significant results among 100 simulated datasets

| Practice characteristic | PROMIS Physical | PROMIS Mental | Person centredness | Care co‑ordination |
|---|---|---|---|---|
| Datasets equivalent to the PaRIS sample | | | | |
| Medical records available | 100% | 100% | NS | NS |
| Self-management support by providing written information | 100% | NS | NS | NS |
| Scheduling appointment for more than 15 minutes | NS | NS | 100% | 100% |
| Prepared for co‑ordinating care | NS | NS | 100% | 100% |
| Datasets equivalent to the optimal PaRIS sample | | | | |
| Role of primary care staff | 5% | 9% | 17% | 33% |
Note: The optimal PaRIS sample considered 106 practices per country and 75 patients per practice. NS = Not significant in the PaRIS analysis.
Source: OECD PaRIS 2024 Database.
We tested the robustness of the non-significance of the role of primary care staff by simulating a larger average number of practices per country than that in PaRIS. The results show that in simulated samples with the optimal sample size by design (106 practices per country and 75 patients per practice, as noted under Annex Table 7.A.6), the effect size of the role of primary care staff found in PaRIS was not significant in more than 65% of the simulated runs (Annex Table 7.A.6).
Annex 7.B. Characterisation of PaRIS patients
Annex Table 7.B.1. Characterisation of PaRIS patients
Copy link to Annex Table 7.B.1. Characterisation of PaRIS patients|
Country averages and 95% Confidence Intervals |
|||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
Age 45‑54 |
Age 55‑64 |
Age 65‑74 |
Age 75+ |
Gender (male) |
Education low |
Education mid |
Education high |
Hypertension |
Arthritis |
CVD |
Diabetes Mellitus |
Asthma/ COPD |
|
|
Australia |
0.19 (0.17 – 0.2) |
0.29 (0.28 – 0.31) |
0.32 (0.3 – 0.34) |
0.2 (0.18 – 0.22) |
0.4 (0.38 – 0.42) |
0.24 (0.23 – 0.26) |
0.26 (0.24 – 0.28) |
0.5 (0.47 – 0.52) |
0.51 (0.49 – 0.53) |
0.43 (0.4 – 0.45) |
0.21 (0.19 – 0.23) |
0.15 (0.13 – 0.16) |
0.21 (0.2 – 0.23) |
|
Belgium |
0.24 (0.23 – 0.25) |
0.33 (0.31 – 0.34) |
0.29 (0.28 – 0.31) |
0.14 (0.13 – 0.15) |
0.45 (0.43 – 0.46) |
0.27 (0.26 – 0.28) |
0.29 (0.28 – 0.3) |
0.44 (0.42 – 0.45) |
0.38 (0.37 – 0.4) |
0.32 (0.3 – 0.33) |
0.16 (0.15 – 0.17) |
0.12 (0.11 – 0.13) |
0.12 (0.11 – 0.13) |
|
Canada |
0.22 (0.2 – 0.23) |
0.3 (0.28 – 0.31) |
0.32 (0.3 – 0.33) |
0.17 (0.16 – 0.18) |
0.36 (0.34 – 0.37) |
0.17 (0.16 – 0.19) |
0.32 (0.31 – 0.34) |
0.5 (0.49 – 0.52) |
0.4 (0.38 – 0.41) |
0.37 (0.35 – 0.38) |
0.15 (0.14 – 0.16) |
0.17 (0.16 – 0.18) |
0.16 (0.14 – 0.17) |
|
Czechia |
0.3 (0.28 – 0.31) |
0.28 (0.26 – 0.29) |
0.27 (0.25 – 0.28) |
0.16 (0.15 – 0.17) |
0.43 (0.42 – 0.45) |
0.65 (0.64 – 0.67) |
0.06 (0.05 – 0.07) |
0.29 (0.27 – 0.3) |
0.56 (0.55 – 0.58) |
0.31 (0.3 – 0.33) |
0.17 (0.16 – 0.18) |
0.19 (0.18 – 0.2) |
0.14 (0.13 – 0.15) |
|
France |
0.27 (0.26 – 0.27) |
0.29 (0.28 – 0.29) |
0.27 (0.26 – 0.28) |
0.18 (0.17 – 0.19) |
0.42 (0.41 – 0.43) |
0.23 (0.22 – 0.24) |
0.35 (0.34 – 0.36) |
0.42 (0.41 – 0.43) |
0.36 (0.35 – 0.37) |
0.23 (0.23 – 0.24) |
0.18 (0.17 – 0.18) |
0.13 (0.12 – 0.13) |
0.12 (0.11 – 0.12) |
|
Greece |
0.38 (0.36 – 0.4) |
0.34 (0.32 – 0.36) |
0.22 (0.2 – 0.24) |
0.06 (0.05 – 0.07) |
0.44 (0.42 – 0.46) |
0.11 (0.1 – 0.13) |
0.39 (0.36 – 0.41) |
0.5 (0.48 – 0.53) |
0.33 (0.31 – 0.35) |
0.2 (0.18 – 0.22) |
0.16 (0.15 – 0.18) |
0.17 (0.15 – 0.19) |
0.12 (0.11 – 0.14) |
|
Iceland |
0.2 (0.18 – 0.21) |
0.33 (0.31 – 0.35) |
0.32 (0.3 – 0.34) |
0.16 (0.14 – 0.18) |
0.38 (0.36 – 0.41) |
0.2 (0.18 – 0.22) |
0.36 (0.34 – 0.38) |
0.44 (0.41 – 0.46) |
0.5 (0.48 – 0.53) |
0.26 (0.24 – 0.28) |
0.24 (0.22 – 0.26) |
0.13 (0.12 – 0.15) |
0.16 (0.14 – 0.18) |
|
Italy |
0.18 (0.16 – 0.2) |
0.31 (0.29 – 0.34) |
0.32 (0.3 – 0.35) |
0.18 (0.16 – 0.21) |
0.49 (0.46 – 0.52) |
0.38 (0.36 – 0.41) |
0.43 (0.4 – 0.46) |
0.19 (0.17 – 0.21) |
0.49 (0.46 – 0.51) |
0.26 (0.24 – 0.29) |
0.29 (0.27 – 0.32) |
0.33 (0.31 – 0.36) |
0.2 (0.18 – 0.23) |
|
Luxembourg |
0.3 (0.28 – 0.32) |
0.36 (0.34 – 0.39) |
0.22 (0.2 – 0.24) |
0.12 (0.1 – 0.14) |
0.48 (0.46 – 0.51) |
0.3 (0.27 – 0.32) |
0.35 (0.32 – 0.37) |
0.35 (0.33 – 0.38) |
0.42 (0.4 – 0.45) |
0.33 (0.31 – 0.36) |
0.17 (0.15 – 0.19) |
0.11 (0.1 – 0.13) |
0.13 (0.11 – 0.15) |
|
Netherlands |
0.14 (0.13 – 0.15) |
0.24 (0.23 – 0.25) |
0.36 (0.34 – 0.37) |
0.27 (0.25 – 0.28) |
0.47 (0.46 – 0.48) |
0.45 (0.43 – 0.46) |
0.13 (0.12 – 0.14) |
0.42 (0.41 – 0.43) |
0.4 (0.39 – 0.42) |
0.21 (0.2 – 0.22) |
0.2 (0.18 – 0.21) |
0.12 (0.11 – 0.13) |
0.14 (0.13 – 0.15) |
|
Norway |
0.25 (0.24 – 0.26) |
0.3 (0.29 – 0.31) |
0.26 (0.25 – 0.27) |
0.19 (0.18 – 0.2) |
0.43 (0.42 – 0.44) |
0.11 (0.1 – 0.11) |
0.27 (0.26 – 0.28) |
0.62 (0.61 – 0.63) |
0.43 (0.42 – 0.45) |
0.19 (0.18 – 0.19) |
0.15 (0.14 – 0.16) |
0.11 (0.1 – 0.12) |
0.12 (0.12 – 0.13) |
|
Portugal |
0.43 (0.42 – 0.44) |
0.32 (0.32 – 0.33) |
0.18 (0.17 – 0.18) |
0.07 (0.06 – 0.07) |
0.46 (0.45 – 0.47) |
0.34 (0.33 – 0.35) |
0.31 (0.3 – 0.32) |
0.35 (0.34 – 0.36) |
0.42 (0.41 – 0.43) |
0.32 (0.31 – 0.33) |
0.15 (0.14 – 0.16) |
0.15 (0.15 – 0.16) |
0.13 (0.12 – 0.13) |
|
Romania |
0.22 (0.2 – 0.25) |
0.39 (0.37 – 0.42) |
0.31 (0.28 – 0.33) |
0.08 (0.06 – 0.09) |
0.4 (0.38 – 0.43) |
0.59 (0.56 – 0.62) |
0.16 (0.14 – 0.18) |
0.25 (0.23 – 0.28) |
0.56 (0.53 – 0.58) |
0.29 (0.27 – 0.32) |
0.36 (0.33 – 0.38) |
0.19 (0.17 – 0.22) |
0.13 (0.11 – 0.15) |
|
Saudi Arabia |
0.56 (0.54 – 0.57) |
0.31 (0.3 – 0.32) |
0.1 (0.09 – 0.11) |
0.03 (0.03 – 0.04) |
0.5 (0.48 – 0.51) |
0.34 (0.32 – 0.35) |
0.25 (0.24 – 0.26) |
0.42 (0.4 – 0.43) |
0.4 (0.39 – 0.41) |
0.46 (0.45 – 0.48) |
0.13 (0.12 – 0.13) |
0.4 (0.39 – 0.41) |
0.15 (0.14 – 0.16) |
|
Slovenia |
0.32 (0.3 – 0.34) |
0.33 (0.32 – 0.35) |
0.25 (0.23 – 0.26) |
0.1 (0.09 – 0.11) |
0.35 (0.34 – 0.37) |
0.42 (0.4 – 0.43) |
0.17 (0.15 – 0.18) |
0.42 (0.4 – 0.44) |
0.37 (0.36 – 0.39) |
0.18 (0.16 – 0.19) |
0.15 (0.14 – 0.17) |
0.1 (0.09 – 0.11) |
0.11 (0.1 – 0.12) |
|
Spain |
0.29 (0.28 – 0.29) |
0.3 (0.29 – 0.3) |
0.23 (0.22 – 0.23) |
0.19 (0.18 – 0.19) |
0.42 (0.42 – 0.43) |
0.45 (0.44 – 0.45) |
0.12 (0.12 – 0.13) |
0.43 (0.43 – 0.44) |
0.41 (0.4 – 0.41) |
0.42 (0.41 – 0.43) |
0.16 (0.16 – 0.17) |
0.15 (0.15 – 0.16) |
0.14 (0.13 – 0.14) |
|
Switzerland |
0.19 (0.18 – 0.2) |
0.28 (0.27 – 0.29) |
0.29 (0.27 – 0.3) |
0.24 (0.23 – 0.26) |
0.49 (0.47 – 0.5) |
0.27 (0.25 – 0.28) |
0.39 (0.38 – 0.41) |
0.34 (0.32 – 0.35) |
0.44 (0.42 – 0.45) |
0.24 (0.23 – 0.25) |
0.19 (0.18 – 0.2) |
0.12 (0.11 – 0.13) |
0.11 (0.1 – 0.12) |
|
United States |
0.38 (0.36 – 0.39) |
0.62 (0.61 – 0.64) |
0.42 (0.4 – 0.43) |
0.14 (0.13 – 0.15) |
0.34 (0.32 – 0.35) |
0.52 (0.51 – 0.54) |
0.69 (0.67 – 0.7) |
0.63 (0.61 – 0.64) |
0.36 (0.35 – 0.37) |
0.25 (0.23 – 0.26) |
0.23 (0.22 – 0.24) |
||
|
Wales |
0.15 (0.15 – 0.16) |
0.27 (0.26 – 0.28) |
0.33 (0.32 – 0.34) |
0.25 (0.24 – 0.26) |
0.45 (0.44 – 0.46) |
0.43 (0.42 – 0.45) |
0.11 (0.1 – 0.12) |
0.46 (0.45 – 0.47) |
0.4 (0.39 – 0.41) |
0.35 (0.34 – 0.36) |
0.16 (0.15 – 0.16) |
0.15 (0.14 – 0.15) |
0.18 (0.18 – 0.19) |
|
OECD PaRIS |
0.25 (0.25 – 0.27) |
0.3 (0.27 – 0.3) |
0.28 (0.26 – 0.28) |
0.19 (0.18 – 0.2) |
0.43 (0.42 – 0.45) |
0.3 (0.31 – 0.34) |
0.27 (0.23 – 0.26) |
0.42 (0.42 – 0.45) |
0.44 (0.41 – 0.44) |
0.31 (0.31 – 0.33) |
0.19 (0.17 – 0.19) |
0.16 (0.14 – 0.16) |
0.15 (0.13 – 0.15) |
Note: The table presents the unstandardised per-country prevalences in the sample used in the multilevel analysis. The prevalence and 95% confidence interval of each binary variable are calculated using the Wilson score interval method. CVD: Cardiovascular disease. OECD PaRIS is the average of the 17 OECD countries participating in PaRIS.
Source: OECD PaRIS 2024 Database.
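As an illustration of the Wilson score interval named in the table note, the following is a minimal sketch in Python. The function name `wilson_interval` and the counts in the usage line are hypothetical and are not taken from the PaRIS database; the sketch assumes a simple binomial proportion without survey weights, which may differ from how the published intervals were produced.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.959964) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 gives ~95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = successes / n
    z2 = z * z
    denom = 1 + z2 / n
    # The centre is pulled from the observed proportion towards 0.5
    centre = (p_hat + z2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z2 / (4 * n * n)) / denom
    # Clamp to [0, 1] to absorb floating-point noise at the boundaries
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


# Hypothetical counts: 250 respondents out of 1 000 report a condition
low, high = wilson_interval(successes=250, n=1000)
print(f"{250 / 1000:.2f} ({low:.2f} - {high:.2f})")
```

Unlike the simpler normal-approximation interval, the Wilson interval stays within 0 and 1 and behaves reasonably for small prevalences, which matters for the less common conditions in the table.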
Annex 7.C. Data collection modes
Countries used four distinct modes for data collection: online, paper, telephone and assisted surveys. To understand the effect of collection mode on country estimates of PREMs and PROMs, dichotomous variables indicating the collection mode were included in a random intercept model, as described in model (1.) of Section 7.7.7. Compared with online responses, completing the survey by telephone was significantly associated (at the 5% significance level) with higher estimates in seven of the ten PaRIS key indicators, with non-significant differences in P3CEQ person centredness and PROMIS physical, and significantly lower estimates in General Health. Completing the survey on paper was significantly associated with lower estimates in seven of the ten PaRIS key indicators, with non-significant differences in Overall Quality of Care and significantly higher estimates in P3CEQ care co‑ordination. In general, collection mode variables significantly improve model fit. These findings are consistent with the literature (Katz, Pedro and Michaud, 2017[41]). Nevertheless, each country relied predominantly on one mode of data collection, so only a very small portion of the variance in the outcomes (less than 1%) is explained by collection mode variables. Annex Table 7.C.1 summarises the collection modes by country.
Annex Table 7.C.1. Completion modes by country
Proportion of surveys completed by the distinct collection modes
|
Online |
Paper |
Telephone |
CAPI/CATI |
|
|---|---|---|---|---|
|
Australia |
95% |
5% |
||
|
Belgium |
62% |
38% |
||
|
Canada |
94% |
3% |
3% |
|
|
Czechia |
84% |
16% |
||
|
France |
80% |
9% |
11% |
|
|
Greece |
100% |
|||
|
Iceland |
100% |
|||
|
Italy |
100% |
|||
|
Luxembourg |
85% |
15% |
||
|
Netherlands |
90% |
10% |
||
|
Norway |
92% |
8% |
||
|
Portugal |
100% |
|||
|
Romania |
14% |
85% |
1% |
|
|
Saudi Arabia |
100% |
|||
|
Slovenia |
96% |
4% |
||
|
Spain |
8% |
92% |
||
|
Switzerland |
84% |
16% |
||
|
United States |
100% |
|||
|
Wales |
86% |
14% |
Note: CAPI: Computer-Assisted Personal Interviewing; CATI: Computer-Assisted Telephone Interviewing.
The limited variation in collection modes within each country entangles the impact of data collection mode with country-specific effects. This is evidenced by the reduction in the country-level variance (how much of the outcome variation is explained by the country) when data collection modes are included in the model (see Box 7.8). Moreover, adding collection mode to the model does not significantly alter the country-level estimates of PREMs and PROMs. The model’s multilevel structure implicitly accounts for survey collection mode as part of the country-specific effect.
Box 7.8. Adding collection mode variables has a minimal effect on country estimates of PREMs/PROMs
Taking the PROMIS physical scale as an example, collection mode variables significantly improve model fit; however, these variables explain only 0.7% of the total variance in the outcome. Their inclusion has only a small effect on the model intercept, which moves from 46.14 to 46.34 (Annex Table 7.C.2). As further explained in Section 7.7.7, the model intercept corresponds to the standardised overall measure of the outcome of interest. At the same time, including these variables reduces the variance of the country-level random effect by 13%, from 3.27 to 2.84 (Annex Table 7.C.2). This translates into smaller deviations of each country from the overall measure of the outcome.
Annex Table 7.C.2. A small increase in the model intercept is offset by smaller country-specific random effects
Multilevel regression results of PROMIS physical with and without controlling for collection modes

| PROMIS Physical | Not controlling for collection modes | Controlling for collection modes |
|---|---|---|
| Patient-level covariates (Fixed Effects) | Estimate (Standard Error) | |
| Intercept | 46.14** (0.42) | 46.34** (0.4) |
| MALE_std | 2.07** (0.06) | 2.05** (0.06) |
| Age2_std | ‑0.65** (0.07) | ‑0.63** (0.07) |
| Age3_std | ‑0.42** (0.08) | ‑0.36** (0.08) |
| Age4_std | ‑3.5** (0.09) | ‑3.31** (0.09) |
| Gender_missing | ‑1.61** (0.12) | ‑1.58** (0.12) |
| AGE_missing | 0 (0.42) | 0 (0.42) |
| Paper | | ‑1.24** (0.12) |
| Telephone | | ‑0.38 (0.23) |
| CAPI_CATI | | 0.77 (1.69) |
| Random effects | Variance (proportion) | |
| Country level | 3.27 (4.07%) | 2.84 (3.53%) |
| Practice level | 2.23 (2.78%) | 2.18 (2.71%) |
| Patient level | 74.94 (93.15%) | 74.87 (93.07%) |
| Other parameters | | |
| Loglikelihood | ‑375 584.2 | ‑375 529.7 |
| Deviance | 751 168.4 | 751 059.4 |
| N of obs | 104 768 | 104 768 |

Note: Model specifications are detailed in Section 7.7.7, model (1.), equation (1.). ** p-value<0.001, * p-value<0.05. PROMIS® Scale v1.2 – Global Health component for physical health is a T-score metric with a range of 16‑68 and a good-fair cutoff of 42; higher values represent better physical health.
Source: OECD PaRIS 2024 Database.
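The figures quoted in this box can be reproduced from Annex Table 7.C.2 with a few lines of arithmetic. The sketch below is illustrative only. It assumes that both models were fitted by maximum likelihood on the same observations and differ only by the three collection mode indicators, so that the difference in deviance can be compared with a chi-square distribution with 3 degrees of freedom. The source does not state which test underlies the statement that model fit improves significantly, so the likelihood-ratio test is an assumption, and the variable and function names are hypothetical.

```python
import math

# Variance components and deviances copied from Annex Table 7.C.2
baseline = {"country": 3.27, "practice": 2.23, "patient": 74.94, "deviance": 751168.4}
with_mode = {"country": 2.84, "practice": 2.18, "patient": 74.87, "deviance": 751059.4}


def total_variance(model: dict) -> float:
    """Sum of the country-, practice- and patient-level variances."""
    return model["country"] + model["practice"] + model["patient"]


# Share of the baseline total variance absorbed by the collection mode variables
total_base = total_variance(baseline)
explained_by_mode = (total_base - total_variance(with_mode)) / total_base

# Relative reduction of the country-level random-effect variance
country_reduction = (baseline["country"] - with_mode["country"]) / baseline["country"]


def chi2_sf_df3(x: float) -> float:
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


# Likelihood-ratio statistic (assumed test): difference in deviance, 3 extra parameters
lr_stat = baseline["deviance"] - with_mode["deviance"]
p_value = chi2_sf_df3(lr_stat)

print(f"Variance explained by collection mode: {explained_by_mode:.1%}")
print(f"Reduction in country-level variance:   {country_reduction:.0%}")
print(f"LR statistic: {lr_stat:.1f}, p-value: {p_value:.2e}")
```

The variance proportions reported in the second column of the table appear to be expressed relative to the total variance of the model without collection modes, which is why they do not sum to 100%; the remainder is the share attributed to collection mode.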
Country estimates of PROMIS physical scale with and without accounting for collection modes
Collection mode variables have a negligible effect on country estimates. Annex Table 7.C.3 shows that there are no statistically significant differences between the per-country estimates of the PROMIS physical scale obtained with and without collection mode variables, and this result holds for all ten PaRIS key indicators.
Annex Table 7.C.3. No statistical differences between estimates with and without controlling for collection modes
PROMIS physical scale score by country. Estimates include all patients.

| Country | Estimate without collection modes (comparative interval) | Estimate with collection modes (comparative interval) |
|---|---|---|
| Australia | 47.5 (46.7‑48.3) | 47.6 (46.9‑48.3) |
| Belgium | 46.0 (45.4‑46.7) | 46.5 (45.9‑47.2) |
| Canada | 47.8 (47.1‑48.5) | 47.9 (47.2‑48.6) |
| Czechia | 46.5 (45.9‑47.2) | 46.7 (46.1‑47.3) |
| France | 45.8 (45.2‑46.4) | 45.9 (45.3‑46.5) |
| Greece | 46.3 (45.7‑47.0) | 46.4 (45.7‑47.0) |
| Iceland | 44.8 (44.1‑45.6) | 44.9 (44.2‑45.6) |
| Italy | 44.6 (43.9‑45.3) | 44.6 (44.0‑45.3) |
| Luxembourg | 46.4 (45.6‑47.1) | 46.6 (45.9‑47.3) |
| Netherlands | 48.3 (47.6‑48.9) | 48.4 (47.7‑49.0) |
| Norway | 47.9 (47.3‑48.6) | 48.0 (47.4‑48.6) |
| Portugal | 43.4 (42.7‑44.0) | 43.4 (42.8‑44.0) |
| Romania | 41.9 (41.2‑42.6) | 42.9 (42.3‑43.6) |
| Saudi Arabia | 46.6 (45.9‑47.2) | 47.0 (46.3‑47.6) |
| Slovenia | 47.3 (46.6‑47.9) | 47.4 (46.7‑48.0) |
| Spain | 44.6 (43.9‑45.2) | 44.9 (44.3‑45.5) |
| Switzerland | 48.4 (47.7‑49.0) | 48.6 (47.9‑49.2) |
| United States | 48.1 (46.5‑49.8) | 47.6 (45.6‑49.6) |
| Wales | 45.0 (44.4‑45.7) | 45.2 (44.6‑45.8) |

Note: Estimates are calculated over all valid patients (including patients without chronic conditions) and therefore differ from those presented in Chapter 2, Table 2.2. PROMIS® Scale v1.2 – Global Health component for physical health is a T-score metric with a range of 16‑68 and a good-fair cutoff of 42; higher values represent better physical health.
Source: OECD PaRIS 2024 Database.
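The comparison in Annex Table 7.C.3 can be checked mechanically. The sketch below uses a subset of rows copied from the table and treats overlapping comparative intervals as the criterion for "no statistically significant difference". That criterion is an assumption made for illustration, since the formal rule is defined elsewhere in the chapter, and the helper name `intervals_overlap` is hypothetical.

```python
# (estimate, lower, upper) without and with collection mode variables,
# copied from Annex Table 7.C.3 for a subset of countries
rows = {
    "Belgium": ((46.0, 45.4, 46.7), (46.5, 45.9, 47.2)),
    "Romania": ((41.9, 41.2, 42.6), (42.9, 42.3, 43.6)),
    "Spain": ((44.6, 43.9, 45.2), (44.9, 44.3, 45.5)),
    "United States": ((48.1, 46.5, 49.8), (47.6, 45.6, 49.6)),
}


def intervals_overlap(a: tuple, b: tuple) -> bool:
    """True if the (lower, upper) ranges of two estimates share at least one point."""
    return a[1] <= b[2] and b[1] <= a[2]


# Country whose point estimate moves the most once collection mode is controlled for
largest_shift = max(rows, key=lambda c: abs(rows[c][1][0] - rows[c][0][0]))

for country, (without_mode, with_mode) in rows.items():
    shift = with_mode[0] - without_mode[0]
    print(f"{country}: shift {shift:+.1f}, overlap: {intervals_overlap(without_mode, with_mode)}")
```

Among these rows the largest shift is for Romania, where most surveys were not completed online, yet its two intervals still overlap.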
Notes
← 1. With the exception of patients from the United States, who are not linked to a primary care practice.
← 2. While patients from the United States are not linked to a primary care practice, they are still included in the analysis.
← 5. Indicator per outcome. Generally, a larger number of items tends to improve reliability, as it provides a more comprehensive and diverse set of indicators for the construct being measured.
← 6. The sum of the error variances at item level, also known as the measurement error.
← 7. In general, a Cronbach’s alpha value above 0.70 is considered acceptable for most social science research situations.
← 8. Note: The target population are primary care service users aged 45 and over living in the community.