In October 2016, New Philanthropy Capital (NPC) published our report bringing together findings from the first three years of evaluations conducted by the Justice Data Lab (JDL). NPC provided the impetus and model for the JDL and has championed it from the outset. The primary purpose of the JDL was to provide an accessible way for organisations that provide interventions in criminal justice to assess their efficacy, particularly in terms of whether or not they reduced reoffending.
Reductions or increases in recidivism rates following any one organisation’s interventions may be of interest to various audiences, including commissioners of services, boards of trustees, service users, other providers, researchers and policy makers. Whether or not a criminal justice intervention is more effective than that which would otherwise be provided (treatment as usual), and whether it is more resource efficient than potential alternatives, are also critical questions under contracts where outcomes are rewarded or incentivised, that is, where payment is by results. The JDL provides a model to assess not just how many individual users have been put through an intervention, but also how the outcomes for that cohort of users compare with those of similar people who have also offended but whose management did not involve the particular intervention being evaluated.
How does the JDL evaluate outcomes?
The organisation submits data on a cohort of service users to the JDL. These data are kept securely for the duration of the analysis and, unless otherwise agreed, destroyed after the analysis has been completed. The analysis uses a “propensity” model, in which the risk of reoffending is expressed as the outcome predicted if the service users were managed as usual within community or custodial settings. Each service user is therefore considered in terms of his or her pre-assessed likelihood of reoffending (a risk propensity). The service users are then compared against multiple other people who have offended and been assessed with similar propensities (the comparison group). Note that no assessment is made of which other programmes or interventions the service users or comparison group may have engaged with; the model assumes that it is testing one intervention against usual management approaches, within which multiple other interventions may have been implemented.
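To make the comparison step concrete, the following is a minimal sketch of matching on a pre-assessed propensity, not the JDL's actual procedure. It assumes a simple nearest-neighbour rule with a tolerance (the `caliper`), matching with replacement, and a handful of made-up people; the names `Person`, `match_comparators` and `compare`, and all the numbers, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Person:
    propensity: float  # pre-assessed likelihood of reoffending, between 0 and 1
    reoffended: bool   # observed outcome over the follow-up period


def match_comparators(user, pool, k=2, caliper=0.05):
    """Return up to k people from the pool whose propensity is within
    `caliper` of the user's, closest first (assumed matching rule)."""
    close = [p for p in pool if abs(p.propensity - user.propensity) <= caliper]
    close.sort(key=lambda p: abs(p.propensity - user.propensity))
    return close[:k]


def reoffending_rate(people):
    """Proportion of the group who reoffended."""
    return sum(p.reoffended for p in people) / len(people)


def compare(cohort, pool, k=2, caliper=0.05):
    """Reoffending rates for matched service users and for their comparators."""
    matched_users, comparators = [], []
    for user in cohort:
        matches = match_comparators(user, pool, k, caliper)
        if matches:  # users with no similar comparator are left out
            matched_users.append(user)
            comparators.extend(matches)
    if not matched_users:
        raise ValueError("no service user could be matched")
    return reoffending_rate(matched_users), reoffending_rate(comparators)


# Made-up illustration data, not taken from any JDL report.
cohort = [Person(0.30, False), Person(0.32, True),
          Person(0.60, False), Person(0.62, False)]
pool = [Person(0.31, True), Person(0.29, False), Person(0.61, True),
        Person(0.59, True), Person(0.90, True)]

treated_rate, comparison_rate = compare(cohort, pool)
print(f"service users: {treated_rate:.2f}, comparison group: {comparison_rate:.2f}")
```

In this toy example the person with a propensity of 0.90 is never used, because no service user has a similar risk. The sketch also shares the limitation described above: it says nothing about which other interventions either group received.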
What did we find?
We considered 97 of the first 132 reports published by the JDL (see the full report for a description of our methodology and how we selected these reports). These 97 papers represent evaluation data from 38 organisations, several of which had submitted data at different times, from more than one year of delivery and/or for more than one intervention. It is hard to find collated sources that indicate how many potential providers of criminal justice services there may be, particularly as both accredited and non-accredited programmes may be run. Within the report, we also drew on a survey conducted by NPC that was sent out to over three hundred potential providers, so it seems fair to state that there was a modest initial uptake of JDL services. However, it should also be noted that each evaluation involves significant labour in preparing, analysing and synthesising data, not least because each service user needs to be correctly identified in the criminal records. After that, multiple comparators are sought, based on the overall propensity (and increasingly on additional characteristics) of each individual service user submitted for comparison.
Fifty-one of the reports related to employment; the next most frequent type of intervention concerned accommodation (14 reports). Nine reports considered mentoring schemes and eight focused on education, with a handful of other types of intervention reported. The size of the sample submitted varied considerably and, predictably, statistically significant results were most likely to be found in the larger samples (those providing data on more than 300 service users). However, the effect sizes (how much change is observed by comparing service users against comparators) did not vary by size of sample. Just over half of the 97 reports showed effect sizes of between 4% increases and 4% decreases in rates of recidivism. Just below 30% of the reports showed that the interventions may have resulted in higher recidivism rates than usual management, but more than two thirds showed decreases in recidivism. In other words, most reports indicated outcomes that were better than what would otherwise have happened.
This would appear to be good news, but it is tempered by the fact that the results in 64 of the 97 reports were not statistically significant and the outcomes were assessed as “inconclusive”. Only five of the increases in recidivism rates were statistically significant, whereas 28 of the decreases in recidivism rates (the improved outcomes) were statistically significant. Statistically significant, positive outcomes were most obvious in samples from the Prisoners’ Education Trust and some employment providers.
What does this mean?
It is positive to see that the JDL team has consistently aimed for excellence in producing rigorous, reliable evidence. However, uptake has been relatively slow and many of the outcomes have been statistically equivocal. The JDL can contribute to policy decisions and, where results are statistically significant, it is clear that commissioning decisions (including payments for results obtained) could be directly informed by the analyses. Service providers do, however, need enough users going through their interventions to enable statistically significant differences to be discerned.
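The link between sample size and statistical significance can be illustrated with a short calculation. This is a sketch only: it assumes a standard two-proportion z-test, which is not necessarily the test the JDL uses, and the reoffending rates and group sizes are hypothetical. The same four percentage point reduction is tested with a small and a large sample.

```python
import math


def two_proportion_p_value(rate_a, n_a, rate_b, n_b):
    """Two-sided p-value for the difference between two proportions,
    using a pooled standard error (standard two-proportion z-test)."""
    pooled = (rate_a * n_a + rate_b * n_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_a - rate_b) / standard_error
    # For a standard normal Z, P(|Z| > |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical rates: 30% reoffending among service users, 34% among comparators.
# Equal group sizes are assumed purely to keep the example simple.
p_small = two_proportion_p_value(0.30, 100, 0.34, 100)
p_large = two_proportion_p_value(0.30, 2000, 0.34, 2000)

print(f"100 per group:  p = {p_small:.3f}")
print(f"2000 per group: p = {p_large:.3f}")
```

With 100 people per group the p-value is well above the conventional 0.05 threshold, so the result would be reported as inconclusive. With 2,000 per group the identical effect size falls below it.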
Why does this matter?
For more than 40 years, attempts have been made to assess what works in rehabilitating adults and young people who have offended. There have been several systematic reviews and there is a growing body of evidence about the types of intervention likely to be effective, some of which we’ve contributed towards. Questions still remain about how interventions should be implemented, who is most credible and appropriate to implement them and which groups are most amenable to change. Providers of services and those involved in their governance and commissioning need to know whether specific interventions are effective and how outcomes can be compared. This is particularly important as we try to assess the impact of the transformed rehabilitative policy landscape and assess how best to implement and evaluate payment by results.