Interventions to transform the delivery of health and social care are being implemented widely, such as those linked to Accountable Care Organizations in the United States,1 or to integrated care systems in the UK.2 Assessing the impact of these health interventions enables healthcare teams to learn and to improve services, and can inform future policy.3 However, some healthcare interventions are implemented without high quality evaluation, are evaluated in ways that require onerous data collection, or are not evaluated at all.4
A range of routinely collected administrative and clinically generated healthcare data could be used to evaluate the impact of interventions to improve care. However, there is a lack of guidance as to where relevant routine data can be found or accessed and how they can be linked to other data. A diverse array of methodological literature can also make it hard to understand which methods to apply to analyse the data. This article provides an introduction to help clinicians, commissioners, and other healthcare professionals wishing to commission, interpret, or perform an impact evaluation of a health intervention. We highlight what to consider and discuss key concepts relating to design, analysis, implementation, and interpretation.
A health intervention is a combination of activities or strategies designed to assess, improve, maintain, promote, or modify health among individuals or an entire population. Interventions can include educational or care programmes, policy changes, environmental improvements, or health promotion campaigns. Interventions that include multiple independent or interacting components are referred to as complex.5 The impact of any intervention is likely to be shaped as much by the context (eg, communities, workplaces, homes, schools, or hospitals) in which it is delivered as by the details of the intervention itself.6,7,8,9
An impact is a positive or negative, direct or indirect, intended or unintended change produced by an intervention. An impact evaluation is a systematic and empirical investigation of the effects of an intervention; it assesses to what extent the outcomes experienced by affected individuals were caused by the intervention in question, and what can be attributed to other factors such as other interventions, socioeconomic trends, and political or environmental conditions. Evaluations can be categorised as formative or summative (table 1).
Approaches such as the Plan, Do, Study, Act cycle,11 which is part of the Model for Improvement (a commonly used tool to test and understand small changes in quality improvement work),12 may be used to undertake formative evaluation.
With either type of evaluation, it is important to be realistic about how long it will take to see the intended effects. Assessment that takes place too soon risks incorrectly concluding that there was no impact. This might lead stakeholders to question the value of the intervention, when later assessment might have shown a different picture. For example, in a small case study of cost savings from proactively managing high risk patients, the costs of healthcare for the eligible intervention population initially increased compared with the comparison population, but after six months were consistently lower.14
This article focuses on impact evaluation, but this can only ever address a fraction of questions.15 Much more can be accomplished if it is supplemented with other qualitative and quantitative methods, including process evaluation. This provides context, assesses how the intervention was implemented, identifies any emerging unintended pathways, and is important for understanding what happened in practice and for identifying areas for improvement.16 The economic evaluation of healthcare interventions is also important for healthcare decision making, especially with ongoing financial pressures on health services.17
An effective impact evaluation begins with the formulation of one or more clear questions driven by the purpose of the evaluation and what you and your stakeholders want to learn. For example, “What is the impact of case management on patients’ experience of care?”
Formulate your evaluation questions using your understanding of the idea behind your intervention, the implementation challenges, and your knowledge of what data are available to measure outcomes. Review your theory of change or logic model21,22 to understand what inputs and activities were planned, and what outcomes were expected and when. Once you have understood the intended causal pathway, consider the practical aspects of implementation, which include the barriers to change, unexpected changes by recipients or providers, and other influences not previously accounted for. Patient and public involvement (PPI) in setting the right question is strongly recommended for additional insights and meaningful results. For example, if evaluating the impact of case management, you could engage patients to understand what outcomes matter most to them. Healthcare leaders may emphasise metrics such as emergency admissions, but other aspects such as the experience of care might matter more to patients.5,23
Randomised control designs, where individuals are randomly allocated to receive either an intervention or a control treatment, are often referred to as the “gold standard” of causal impact evaluation.24 In large enough samples, the process of randomisation ensures a balance in observed and unobserved characteristics between treatment and control groups. However, while often suitable for assessing, for example, the safety and efficacy of medicines, these designs may be impractical, unethical, or irrelevant when assessing the impact of complex changes to health service delivery.
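The balancing property of randomisation can be illustrated with a small simulation. The sketch below is illustrative only: the patient fields (`age`, `frailty`), their distributions, and the function names are assumptions made for this example, not data or methods from any study cited here. `frailty` stands in for a characteristic the evaluator never observes.

```python
import random
from statistics import mean


def randomise(patients, seed=0):
    """Allocate each patient to an arm by an independent coin flip."""
    rng = random.Random(seed)
    arms = {"intervention": [], "control": []}
    for patient in patients:
        arm = "intervention" if rng.random() < 0.5 else "control"
        arms[arm].append(patient)
    return arms


def simulate_imbalance(n, seed=0):
    """Return the absolute between-arm difference in means for each characteristic.

    The simulated cohort is hypothetical: 'age' is treated as observed and
    'frailty' as unobserved. Randomisation uses neither of them.
    """
    rng = random.Random(seed)
    patients = [
        {"age": rng.uniform(20, 90), "frailty": rng.gauss(0, 1)}
        for _ in range(n)
    ]
    arms = randomise(patients, seed=seed + 1)
    return {
        key: abs(
            mean(p[key] for p in arms["intervention"])
            - mean(p[key] for p in arms["control"])
        )
        for key in ("age", "frailty")
    }


if __name__ == "__main__":
    # Differences in means shrink as the sample grows.
    for n in (100, 20000):
        print(n, simulate_imbalance(n))
```

With a large simulated cohort, the two arms end up with similar average values for both characteristics, even though allocation ignored them. With small cohorts, chance imbalances can remain.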
Observational studies are an alternative approach to estimate causal effects. They use the natural, or unplanned, variation in a population in relation to the exposure to an intervention, or the factors that affect its outcomes, to remove the consequences of a non-randomised selection process.25 The idea is to mimic a randomised control design by ensuring treated and control groups are equivalent—at least in terms of observed characteristics. This can be achieved using a variety of well documented methods, including regression control and matching,26 eg, propensity scoring27 or genetic matching.28 If the matching is successful at producing such groups, and there are also no differences in unobserved characteristics, then it can be assumed that the control group outcomes are representative of those that the treated group would have experienced if nothing had changed, ie, the counterfactual. For example, an evaluation of alternative elective surgical interventions for primary total hip replacement on osteoarthritis patients in England and Wales used genetic matching to compare patients across three different prosthesis groups, and reported that the most prevalent type of hip replacement was the least cost effective.29
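The idea of matching can be sketched in a few lines. The example below uses simple nearest-neighbour matching on raw covariates, which is a deliberately simpler technique than the propensity scoring or genetic matching mentioned above. The records, field names (`age`, `prior_adm`, `admissions`), and function names are all hypothetical.

```python
from statistics import mean


def nearest_neighbour_match(treated, controls, covariates):
    """Pair each treated unit with its closest control (with replacement).

    Distance is the squared Euclidean distance on the raw covariates.
    In practice covariates would usually be standardised first so that
    no single variable dominates the distance.
    """
    def distance(a, b):
        return sum((a[c] - b[c]) ** 2 for c in covariates)

    return [(t, min(controls, key=lambda c: distance(t, c))) for t in treated]


def matched_effect(pairs, outcome):
    """Mean treated-minus-control difference in the outcome across matched pairs."""
    return mean(t[outcome] - c[outcome] for t, c in pairs)


# Hypothetical records: age, prior admissions, and admissions during follow-up.
treated = [
    {"age": 80, "prior_adm": 3, "admissions": 2},
    {"age": 65, "prior_adm": 1, "admissions": 1},
]
controls = [
    {"age": 79, "prior_adm": 3, "admissions": 3},
    {"age": 66, "prior_adm": 1, "admissions": 1},
    {"age": 40, "prior_adm": 0, "admissions": 0},
]

pairs = nearest_neighbour_match(treated, controls, ["age", "prior_adm"])
naive = mean(t["admissions"] for t in treated) - mean(c["admissions"] for c in controls)
matched = matched_effect(pairs, "admissions")
print(f"naive difference: {naive:.2f}, matched difference: {matched:.2f}")
```

In this toy dataset the unmatched comparison suggests slightly more admissions in the treated group, because the control pool includes a much younger, healthier patient. Comparing each treated patient only with a similar control reverses the sign. The matched estimate is still only as good as the covariates used for matching.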
Assessing similarity is only possible in relation to observed characteristics, and matching can result in biased estimates if the groups differ in relation to unobserved variables that are predictive of the outcome (confounders). It is rarely possible to eliminate this possibility of bias when conducting observational studies, meaning that the interpretation of the findings must always be sensitive to the possibility that the differences in outcomes were caused by a factor other than the intervention. Methods that can help when selection is on unobserved characteristics include difference-in-difference,30 regression discontinuity,31 instrumental variables,18 or synthetic controls.32 Table 2 gives a summary of selected observational study designs.
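Of the methods named above, difference-in-difference has the simplest arithmetic: the change over time in the comparison group is subtracted from the change over time in the intervention group. The sketch below is a minimal illustration with made-up numbers and a hypothetical function name. It assumes that, without the intervention, both groups would have followed parallel trends, which is the key assumption of the method and cannot be verified from these four averages alone.

```python
from statistics import mean


def difference_in_differences(treated_pre, treated_post, control_pre, control_post):
    """Change in the treated group minus change in the control group.

    Each argument is a list of outcome values (for example, monthly
    emergency admissions per 1000 patients) for one group in one period.
    """
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical outcome values, invented for illustration only.
estimate = difference_in_differences(
    treated_pre=[10, 12],
    treated_post=[9, 11],
    control_pre=[6, 8],
    control_post=[7, 9],
)
print(estimate)
```

Here the intervention group starts at a higher level than the comparison group, so a simple comparison of levels after the intervention would be misleading. The estimate instead compares changes: the intervention group fell by 1 while the comparison group rose by 1, giving a difference of -2.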
Observational study designs for quantitative impact evaluation