13/07/2017 07:34 BST | Updated 13/07/2017 07:34 BST

Smarties Only Have Part Of The Answer

As the deluge of complaints about the differential marking of this year's primary SATs papers hits the media, I view the current debates from varied perspectives: as researcher, teacher and grandparent. While inequitable marking of such high-stakes assessments is of course an issue that must be urgently dealt with, it seems to me that the core problem lies at a more fundamental level of the process: a misunderstanding of the relationship between data and assessment.

As a researcher, I am very aware of the value of controlled testing of large samples to discover an average score, and thence to compare this to the average scores of other large samples who have taken the same test but been through a different teaching and learning process. This is, in essence, the way in which the PISA tests for teenagers work. However, such research can only tell us if there is a difference between groups and whether that difference is 'significant'; i.e., whether the gap between the two averages is larger than chance alone would be likely to produce.

The way I have perennially explained this to students is that if you wanted to see if people preferred blue or orange Smarties, you would offer them just one free pack of one or the other. If 55% chose one colour and 45% chose the other, this could be due to chance in terms of random individual preference. On the other hand, if the split were 90%/10%, it would be clear that the population you tested did show a preference for one colour over the other, and this is then likely to be repeated across a wider population. However, a study of this nature cannot tell you why one colour of Smartie is generally preferred over another.
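For readers who want to see the arithmetic behind the Smarties example, here is a minimal sketch in Python. It assumes a hypothetical sample of 100 people (the example above does not specify a sample size) and uses an exact two-sided binomial test against the assumption of no preference; the function name `two_sided_binomial_p` is my own, not a library call.

```python
from math import comb


def two_sided_binomial_p(chose_blue: int, n: int) -> float:
    """Exact two-sided p-value for the null hypothesis of no colour preference.

    Under the null, each person picks blue or orange with probability 0.5.
    The sample size n is an assumption made for illustration only.
    """
    # Size of the larger group, whichever colour it is.
    k = max(chose_blue, n - chose_blue)
    if 2 * k == n:
        # A perfectly even split is as unsurprising as a result can be.
        return 1.0
    # Probability of a split at least this lopsided in one direction...
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    # ...doubled, because the 50/50 null is symmetric between the colours.
    return min(1.0, 2 * tail)


# Hypothetical sample of 100 people.
print(two_sided_binomial_p(55, 100))  # 55/45 split
print(two_sided_binomial_p(90, 100))  # 90/10 split
```

With 100 people, the 55/45 split gives a p-value well above the conventional 0.05 threshold, so chance remains a plausible explanation, while the 90/10 split gives a p-value vanishingly close to zero. The same 55/45 proportion in a much larger sample would tell a different story, which is why sample size matters. Either way, the test says nothing about why one colour is preferred.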

From this premise, then, testing a huge number of five-year-olds in the UK and in Norway (for example) and comparing the results to see if there is a statistically significant difference between each population is potentially informative, because if such a difference were found, it would indicate that these different nations' childhood environments have created a difference in outcomes. But it wouldn't tell you what differences between the environments were responsible. Furthermore, the causes are likely to relate to not just one factor, but a range of different experiences within school, family lives and the surrounding society. It then becomes clear that using very basic statistics drawn from brief assessment procedures in an attempt to definitively rank children, teachers and schools is a practice rooted in gross misunderstanding.

Most importantly, this process is negatively impacting on England's day-to-day teaching and learning practices. Examples include dull coaching towards 'performance' in the statutory Year 1 phonics test, which introduces children to reading as a technical skill rather than as a mode of communication. This is then further compounded by instruction on the nature of the fronted adverbial and the placement of the semi-colon: topics upon which I was never formally instructed at primary or secondary school, yet I nevertheless went on to write a PhD thesis and become a published author.

And even at the nursery stage, where the very first impressions of school are formed, we find mechanical routines. One example I have recently observed is a daily nursery routine where, once the coat has been hung on the peg and the water bottle placed on the tray, three- and four-year-olds are expected to sit at a table and write their name before being invited to enter the main room. But these are very young children with rapidly developing minds full of imaginary narratives in which they can be anyone they want to be; maybe Batman, an astronaut, a pirate, or even Peter Pan flying to Neverland on any given morning, only to be brought crashing down to Earth by the introduction of such non-happy thoughts at the nursery door.

As a grandparent, I understand why my grandsons have the school-based experiences that they do, because if they and their peers are judged, through performance on statutory assessments, to be 'below expectation', their schools and teachers will be held 'accountable' in ways that can lead to ruined careers and mental breakdown. As an experienced teacher, I understand the value of assessment, particularly in the formative sense of indicating suitable 'next steps'. As an experienced social researcher, I understand the value of data, and the ways in which it can be used to inform social policy creation. The problem is that the manner in which data drawn from nationally administered SATs is currently collected, stored and used has given rise to a dysfunctional situation, leading to the 'datafication' of individual pupils, schools and teachers, and thence to much human unhappiness. It is time for all stakeholders in this situation to get together to consider a more positive and humane approach.