An end-to-end guide to impact evaluation
In our schools we invest huge amounts of time, money and energy in a variety of initiatives and interventions to benefit pupils. We train our teachers on the latest strategies, we run curriculum boosters and catch-ups, and we provide targeted support to our pupil premium learners.
Sometimes these activities will work and have a profoundly positive impact on the young people involved. But sometimes they don't, and can even be counterproductive.
The classic example here is the evaluation of an American programme called Scared Straight. The basic idea was to scare young people out of committing crimes by taking them to visit prisons to show them the realities of life in jail. Seems plausible enough. Yet trials found that Scared Straight actually increased the risk of offending, with a cost to society of $203.51 for every dollar spent on it – not to mention the cost to the lives of the young people involved.
Impact evaluation can inform decisions on whether to expand, modify or stop an initiative
Impact evaluation matters because it aims to assess the effects of the things you are doing and help teachers to reliably identify what is working. Unlike large-scale impact evaluation, which aims to find out whether an approach works on average across a range of school settings, in-school evaluation aims to help you find out what is working in your context.
Done well, impact evaluation can:
- inform your decisions on whether to expand, modify, or stop doing a programme or initiative
- improve outcomes for pupils, by feeding into school development plans and helping you prioritise those activities making the biggest difference
- save teachers time, by enabling them to work smarter, not harder, to improve outcomes.
Here is a short guide on how to set up, run and analyse impact evaluations.
Impact evaluation stages: an overview
| Stage | Step | Description |
| --- | --- | --- |
| 1: Preparation | Evaluation question | What are you trying to improve, for whom, and in what timescale? |
| | Decide your measure | What outcome(s) will tell you if you have improved? |
| | Decide your comparison | What can you compare results against to help you isolate impact? |
| 2: Implementation | Baseline and final measures | Most evaluations will have some kind of pre-result and some kind of post-result measure. |
| | Delivery | Actually implement the change, initiative or intervention. |
| 3: Analysis and reporting | Record and report | Record the results, then calculate the effect on the outcome you have chosen. |
Most good evaluations will start with some kind of research question. This might sound a bit off-putting, but really a good research question can be quite simple. Here's a useful structure.
- Choice – what change or intervention are you introducing?
- Outcome – what are you measuring?
- Context – who with, and over what timescale?
‘I would like to know if running maths booster sessions over the Easter term will improve maths attainment for Year 6 pupils.’
‘What impact does our new English mastery curriculum have on Year 7 pupils’ engagement with reading outside the classroom?’
Avoid trying to measure too many things at once, and keep your focus simple initially. You will be coming back to this question throughout the project, so it needs to be something you can stick to.
First, decide what type of evidence you need for the question you are trying to answer. For some initiatives that are relatively easy to implement, informal feedback from teachers may be sufficient evidence for what you are trying to achieve. For more involved projects that are aiming to make a sustained difference to pupil outcomes, you may want to look at more robust measures, potentially against a control group.
You should also be selective about your outcome measures, both intermediate and longer-term. The range of indicators you could look at is huge and could include the following.
- Academic attainment. Consider carefully the validity and reliability of your data – national, moderated exam results and standardised assessments will give different sorts of data to classroom assessments.
- Pastoral and school engagement measures. For example, looking at measures of behaviour, exclusions, attendance. This data will often be readily available and may be high quality.
- Broader skills. Many initiatives will be looking to develop outcomes such as pupils’ levels of motivation, self-efficacy or metacognition. In many cases there are pre-existing questionnaires that can be used to measure these outcomes.
Sense check this all against workload and your existing school processes. You don’t want to create a need to collect lots of new data, or to overhaul all your systems. Evaluation should reduce work by helping you focus, not create more.
One of the major challenges in school-based evaluation is noise. This has a precise statistical meaning, but the gist is that a lot of things affect how pupils progress in school, from home life to the quality of their teaching to interactions with their peers. Isolating the impact of any one change is therefore always going to be challenging.
One key way to do this is by having a control or comparison group. The issue with just using pre/post measures is that they don’t control for anything else that may be happening at the same time. But if, for example, we can compare two classes taught by the same teacher, and one strategy is being trialled in one but not the other, we can get a bit closer to nailing down what might be driving change.
Two major methods for creating comparison groups are random assignment and matching.
Random assignment means that you would randomly assign pupils to intervention and control groups. This allows you to control for differences between groups you don’t know about, as well as those you do. Typically, you would randomly assign pupils to groups, check that they are balanced (in terms of attainment and basic demographics) and re-randomise until the groups are broadly comparable.
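The randomise-then-check-balance loop described above can be sketched in a few lines of Python. This is an illustrative sketch, not a prescribed tool: the pupil names, scores and the two-point balance tolerance are all assumptions for the example.

```python
import random
import statistics

def randomise_until_balanced(pupils, max_tries=100, tolerance=2.0):
    """Randomly split pupils into intervention and control groups,
    re-randomising until mean baseline attainment differs by less
    than `tolerance` marks. `pupils` is a list of (name, score)."""
    for _ in range(max_tries):
        shuffled = random.sample(pupils, len(pupils))
        half = len(shuffled) // 2
        intervention, control = shuffled[:half], shuffled[half:]
        gap = abs(statistics.mean(s for _, s in intervention)
                  - statistics.mean(s for _, s in control))
        if gap < tolerance:
            return intervention, control
    raise RuntimeError("No balanced split found; widen the tolerance")

# Hypothetical baseline maths scores
pupils = [("P1", 72), ("P2", 65), ("P3", 80), ("P4", 58),
          ("P5", 75), ("P6", 69), ("P7", 62), ("P8", 77)]
intervention, control = randomise_until_balanced(pupils)
```

In practice you would also eyeball the demographic balance of the two groups, not just mean attainment, before settling on a split.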
In a school setting, you could use a 'business as usual' approach: control group pupils do not receive any intervention and continue to be taught as usual. Alternatively, you could use a waiting list design: a programme could be used with all pupils but be introduced to some groups earlier than others (using the later groups as control groups for the first).
In many cases, however, random assignment may be logistically impractical or raise ethical issues. Where this is the case, a good alternative is a matched control group of pupils similar to those receiving a new programme. In this case, you would identify a group of pupils not already taking part to act as your control group, with broadly similar characteristics (attainment and demographics being most crucial). You could even compare against previous year groups or similar pupils in other schools. The key is to ensure the groups are as comparable as possible and to have a well-thought-through rationale for your approach, which you could explain to others.
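Matching on baseline attainment can be as simple as pairing each intervention pupil with the nearest-scoring pupil who is not taking part. The sketch below assumes a single attainment score per pupil; a real matching exercise would typically weigh demographics too, and the names and scores here are invented.

```python
def match_controls(intervention, candidates):
    """For each intervention pupil, pick the unused candidate with
    the closest baseline score. Both arguments are lists of
    (name, score) tuples; each candidate is used at most once."""
    available = list(candidates)
    matched = []
    for _, score in intervention:
        best = min(available, key=lambda c: abs(c[1] - score))
        matched.append(best)
        available.remove(best)
    return matched

# Hypothetical pupils: three receiving the programme, five not
intervention = [("A", 70), ("B", 55), ("C", 83)]
candidates = [("X", 52), ("Y", 71), ("Z", 90), ("W", 60), ("V", 84)]
controls = match_controls(intervention, candidates)
# → [("Y", 71), ("X", 52), ("V", 84)]
```

Greedy nearest-score matching like this is crude but transparent, which makes the rationale easy to explain to colleagues.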
Once you have decided on your measures, you will typically want to have both baseline and outcome data – where were young people when you started with an initiative, and where did they end up? Collect this both for those pupils that you are looking to see change in and for any comparison groups.
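With baseline and outcome data for both groups, one simple way to estimate impact is to compare the average gain in each group. The scores below are made up for illustration; this sketch ignores statistical uncertainty, which matters with small groups.

```python
def average_progress(baseline, outcome):
    """Mean gain from baseline to outcome for one group.
    Scores are paired lists in the same pupil order."""
    gains = [post - pre for pre, post in zip(baseline, outcome)]
    return sum(gains) / len(gains)

# Illustrative maths marks out of 100 (invented data)
intervention_gain = average_progress([61, 70, 55, 66], [70, 78, 62, 74])
control_gain = average_progress([60, 72, 54, 65], [64, 75, 57, 69])

# The difference in average gains is a rough estimate of impact
estimated_impact = intervention_gain - control_gain
```

Comparing gains rather than raw final scores helps strip out differences between the groups that were already there at baseline.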
Once you have got your findings, the key is to act on them
Don’t feel that this will necessarily require new data collection or administrative burdens. In many cases, your school will conduct regular assessments that can be triangulated against interventions for this purpose. If staff are trying to improve non-academic outcomes, be clear about that – you may want to consider measures such as pupil questionnaires or pastoral outcomes from your MIS.
Obviously enough, don’t let evaluation get in the way of the actual change you are trying to achieve. Although many evaluations will rely heavily on baseline and post-test data, you could also consider live monitoring information.
For instance, you could look at pupil attendance at sessions, results from tools like exit tickets and RAG ratings, and – just as importantly – ongoing pupil and teacher reflection on how it is going. But be clear about the limitations of this data: it is more for the purpose of helping you assess how implementation is going, rather than allowing you to make summative judgements about impact.
From the beginning you should hopefully have a clear sense of what you want to do with your data. There are various visual presentation tools you can use within Excel and systems like Tableau, Power BI, or ImpactEd.
But ultimately, once you have got your findings, the key is to act on them. If your evaluation finds that what you were doing wasn’t effective, what are you going to do with that? Will you drop the initiative entirely, redesign some components of it, or do some further research?
Evaluation won’t necessarily give you the answers – what it will do is give you some evidence that you can use alongside your professional judgement.
Tying it all together
What’s the end goal of all of this? Clearly the primary purpose of a good evaluation process will be to figure out what is working, and what isn’t, in order to maximise the chances of doing the best possible things in the future.
But there’s also a value in the process itself. Toby Sutherland, headteacher of one of our partner schools, St Clement Danes, describes the importance of evaluation as ‘before we start doing anything, being really rigorous about what we are trying to achieve, how we’ll know if we have achieved it, and what we will do as a result’.
So, while impact evaluation should give you some specific lessons about the value of the different initiatives you might be trialling, it can also pave the way for wider cultural change. When we want to make some improvement in school, our default response shifts from ‘do more’, to ‘figure out what’s working and do more of that’ (and less of everything else!). That change in approach is something that we think can really make a difference.
For further resources and useful links, download our guide to impact evaluation.