We were looking at some end-point data from one of the spelling interventions we run. (To be sure that each intervention works we measure on the way in and on the way out.) It is a twelve-week programme, targeted at children who have a ‘spelling age’ significantly behind their chronological age.
The basic theory for a successful, well-delivered SEND programme is that it should produce twice the normal rate of progress. So over the twelve weeks we hope / expect / want to see six months' progress in spelling age (or reading age if it's a reading programme); otherwise we start asking other questions: wrong intervention? wrong children? inaccurate measure? other factors? implemented as intended?
What the data has led me to is some personal professional (statistics) learning: I need to better understand ‘outliers’, how to define them, and when it is appropriate to remove them from a set of scores.
It is fairly easy to see how it would be hasty to base a judgement about the success of a programme on the progress or otherwise of a single participant. Likewise a single test might be too narrow to justify a confident statement of progress for the participant. To judge the programme / intervention itself we want to aggregate and average the progress scores from a decent sized group.
And this is what brought me to identifying my own learning need. Test scores (using the same test in and out) showed, as we'd expect, a range of levels of progress. We calculated the number of months gained for each child in the programme, some with disbelief – not in the child but in the scale (and sometimes direction) of the scores. The lowest progress score was minus 8 months, suggesting the child had lost eight months off their 'spelling age' in the three-month period. The greatest was an apparent gain of two years and ten months, or plus 34 months!
The majority were grouped between 1 month and 9 months, so that plus 34 looks extraordinary / unlikely / inexplicable. It matters because it adds over two months of gain per pupil to the mean average. I needed to learn whether and how to discount it (and then what we would report to parents about this child's progress if we disbelieved the test score).
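To see how a single extreme score can drag the mean, here is a toy illustration in Python. The scores below are hypothetical (the real data isn't published here): eleven typical gains between 1 and 9 months, then the same group with the apparent +34 added.

```python
from statistics import mean

# Hypothetical months-gained scores for illustration only:
# eleven typical results, then the same group plus the apparent +34.
typical = [1, 2, 3, 4, 5, 5, 5, 6, 7, 8, 9]
with_outlier = typical + [34]

print(round(mean(typical), 1))       # 5.0
print(round(mean(with_outlier), 1))  # 7.4 – one score shifts the mean by ~2.4 months
```

With these made-up numbers, the one extreme score moves the group mean by well over two months per pupil, which is exactly why it matters for reporting.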
'Mathwords', sort of helpfully, defines an outlier as: 'A data point that is distinctly separate from the rest of the data. One definition of outlier is any data point more than 1.5 interquartile ranges (IQRs) below the first quartile or above the third quartile.'
So now I need to work out how to calculate the quartiles for the set of data, and the interquartile range.
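The 1.5 × IQR rule can be sketched in a few lines of Python. The scores here are made up to be consistent with the range described above (most between 1 and 9, plus the minus 8 and plus 34), not the actual data; `method="inclusive"` is one common way of interpolating quartiles, and other methods give slightly different cut-offs.

```python
import statistics

# Hypothetical months-gained scores, consistent with the range described.
scores = [-8, 1, 2, 3, 4, 5, 6, 7, 8, 9, 34]

# statistics.quantiles with n=4 returns [Q1, median, Q3].
q1, _median, q3 = statistics.quantiles(scores, n=4, method="inclusive")
iqr = q3 - q1

# The 1.5 x IQR rule: anything outside these fences is a candidate outlier.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [s for s in scores if s < lower_fence or s > upper_fence]

print(outliers)  # [-8, 34]
```

On these illustrative numbers both the minus 8 and the plus 34 fall outside the fences, so both would be flagged as candidate outliers, not just the implausibly large gain.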
There is no hard and fast rule on whether to remove (and report) the outlier(s), as they may be the most interesting and significant data in the whole set!
We will continue to question it all, I think.