Why Is It So Hard to Get School Assessment Right?
The complexity of balancing conflicting needs in school assessment
At first glance, assessing student learning seems straightforward. If we want to determine whether a student understands a concept or topic, we can ask a simple question: Can Helena recall the key facts about the Battle of Hastings? Since attainment in a topic is built from understanding individual concepts, and subject mastery is built from attainment across topics, constructing a picture of student achievement from straightforward questions should, in theory, be simple.
However, this view of assessment is illusory—at least in schools. Teachers do not have the luxury of assessing students one at a time, in isolation, with unlimited time and flexible questioning.
School assessment operates within a large, complex system where scale, efficiency, and consistency become essential considerations. As soon as we move from assessing an individual student to designing assessment systems for entire classes, year groups, or schools, the challenge intensifies.
If assessing a single student is easy, why does assessment in schools become so difficult?
Why is school assessment necessarily an act of compromise?
The Key Challenges of School Assessment
Schools require assessments that do more than simply gauge an individual student’s knowledge. Assessment systems must be scalable, efficient, and consistent, which inevitably introduces trade-offs.
1. The Need for Scale
Schools do not assess students one at a time but in groups—sometimes hundreds or even thousands at once. This makes one-to-one questioning, viva voce assessments, or highly personalised methods impractical. Instead, teachers rely on assessment formats that can be administered and marked en masse, such as written tests or multiple-choice questions. While these approaches allow for scale, they sacrifice some of the rich, diagnostic insight that one-to-one discussions provide. The challenge of scale leads to greater compromises in some subjects (such as drama and French) than in others.
2. The Need for Efficiency
Time is a scarce resource in schools, and every assessment must be weighed against the opportunity cost of lost teaching time. If a single test takes an entire lesson, students may benefit from the act of retrieval, but no new material can be taught. The goal is always to gather the most valuable information in the least time.
3. The Need for Consistency
Assessment in schools is rarely just about individual students; it is also about comparability. Schools use assessments to track progress over time, compare students within a cohort, and sometimes even evaluate performance across different subjects or schools. This requires standardisation to ensure that students are assessed in the same way, regardless of teacher, classroom, or location. However, this standardisation limits adaptability: assessments must be pre-designed rather than adapted in response to student performance. As a result, individual students often sit exactly the same assessment, even if it is not optimal for their stage of learning.
The Disciplinary Demands of Assessment
Our need for efficiency, scalability and consistency means that school-based assessments are always an act of compromise. To make optimal decisions, teachers must draw on expertise in curriculum, pedagogy, cognition and behavioural psychology, as well as technical assessment knowledge.
For example, because we usually cannot observe our students directly as they take an assessment, we need insights from cognitive science to help us infer from their responses what thinking took place during the assessment.
When we substitute the one-on-one conversation with a series of pre-written questions, we necessarily place constraints on what these questions must look like. It is our curriculum expertise, including our understanding of how knowledge is structured in our subject, that helps us select optimal question types, given the constraints of school assessment.
If we need to give students written, rather than oral, feedback, we ideally craft this feedback using our knowledge of behavioural psychology to inform an approach that maximises their future learning. This grounding in motivational theories will also aid us in deciding how much to reveal about an assessment before it happens, with the goal of maximising learning gain.
These disciplinary perspectives, from curriculum, pedagogy, cognition and behavioural psychology, can produce differing answers and even tensions when we ask:
‘What is this assessment for?’ and ‘What is effective assessment practice?’
Assessment in a Complex System
Developing a framework for how assessment systems support learning is inherently complex because school assessments exist within a broader, interconnected schooling system. This complexity arises for several reasons.
First, students and teachers do not have fixed behaviours; they continuously adapt to each other and their environment. A student’s approach to studying is shaped by their peers, past experiences, and perceptions of success. Understanding student beliefs, motivation, and habits is therefore central to effective assessment.
Second, education is an interconnected system where actions in one area influence others. For example, assessment methods in primary schools shape behaviours in secondary schools, just as department-wide policies impact learning across subjects. Whole-school assessment approaches often arise to manage these interdependencies.
Third, learning is influenced by feedback loops—where assessment outcomes shape self-perception and future effort. A student’s success or struggle in one test affects their confidence, study habits, and subsequent performance, reinforcing either positive or negative cycles.
Fourth, schooling consists of nested sub-systems, from individual cognition to classroom dynamics to institutional structures, each with its own disciplinary tradition for understanding learning and assessment.
Finally, assessment is challenging because learning unfolds over months, years, or even decades. While it is easy to measure recent recall, it is far harder to determine how an assessment affects long-term understanding and attainment.
Given this complexity, there is no single research base dictating how assessment should be used in every context. Establishing cause and effect is difficult. Instead, teachers must develop broad frameworks to guide assessment decisions while remaining responsive to whether they achieve their intended learning outcomes.
Trade-offs in the Pathways to Learning Effects
We create assessments to support student learning, and each assessment influences learning through multiple mechanisms. These mechanisms may operate before the assessment (through the expectation that inferences will be drawn), during the assessment itself, or afterwards (as a consequence of the inferences drawn). They may involve students, teachers, leaders, parents, or governors. As a result, there are often trade-offs between making valid inferences to inform instructional and organisational decisions, maximising learning gains, and motivating students toward productive action. Learning is best promoted when we understand these trade-offs and determine which to prioritise.
Teachers and schools make decisions about how often to assess, what to assess, how to assess, and the appropriate level of difficulty. Each decision involves a trade-off. Assessing more frequently may motivate pupils to study but reduces instruction time. Assessing only a sample of the domain can make study more focused and productive but provides less valid inferences about overall mastery. Raising the stakes may push some pupils to work harder but risks disengaging those with lower expectations of success. Designing assessments where most pupils score highly may boost motivation but offers less insight into the achievement of high-attaining pupils.
If teachers and schools view assessment too narrowly—for example, focusing solely on the validity of inferences about students' learning—they may overlook potential learning gains. This is why promoting learning should always be the primary consideration when making assessment decisions.
Trade-offs in assessment always come with opportunity costs—choosing one approach to promote learning means sacrificing another. There are also direct costs, such as increased teacher workload, resource demands, and student anxiety; information does not come for free. Additionally, assessment decisions often have unintended consequences. Teachers navigate this complex landscape, but their ability to do so is limited by what they don’t know. For example, when scheduling a class test, they may be unaware that two other subjects have planned high-stakes assessments for the same day, impacting students’ priorities. Schools should aim to minimise opportunity costs, direct costs, and unintended consequences.
Schools carefully consider the costs and consequences of assessment for teachers and administrators, but the impact on students’ experiences is sometimes given less attention. Discussions often focus on workload, responsive teaching, timetable disruptions, curriculum adjustments, progress measures, data recording, accountability, and reporting deadlines. However, how students allocate their time and approach their studies is just as important. Teenagers' motivation and the quality of their study habits play a crucial role in learning outcomes. To maximise learning, assessment should serve broader considerations about how and why students engage with their education. This requires giving as much thought to the consequences of assessment as to the inferences drawn from it.
Getting school assessment right is not about eliminating complexity but about managing it wisely. The best assessment systems are not those that strive for unattainable perfection but those that carefully balance competing priorities to maximise learning for all students.