When a Grade Isn’t the Same Grade
How cross-subject comparisons can mislead students and parents
Last month, we asked whether all subjects in a school should report attainment using a common scale of percentages, grades, or “expected” judgements. The appeal of consistency is clear: it allows parents to compare subjects, helps school leaders track performance, and gives students a seemingly coherent picture of their progress.
But even when we force all subjects onto the same scale, such as grades A to E or numbers 1 to 9, we can’t assume those grades carry the same meaning.
This post picks up where we left off, but shifts to asking what those reported grades actually mean to students and their parents who are making comparisons across subjects within a report.
Where do grades get their meaning?
Grades carry meaning in different ways. First, there's the across-person, within-subject comparison: how a student ranks against their peers in a single subject. This is the kind of meaning most familiar in schools: Am I top set in maths? Did I beat my friend in English?
Second, there's the within-person, across-subject comparison. This is how students internally judge their own strengths: I'm better at history than science; I find French harder than geography. These comparisons shape where they choose to put effort, where they feel confident or discouraged, and which subjects they continue.
Finally, there's absolute meaning: the benchmarks imposed by external qualifications, such as SATs, GCSEs, or A-level entry requirements. These add real-world consequences to the picture, though students often interpret them through the lens of the first two.
In previous posts, we’ve focused mainly on peer comparison and the technical underpinnings of grades: reliability, discrimination, and reporting. But this post turns to the meaning students build when they look across their own grades in different subjects.
How are grades assigned?
Let’s say we’ve decided to assign students one of five grades: A to E. How do we decide who gets what?
Sometimes we rely on criterion-referencing: a student hits a defined standard that earns them the grade, e.g. 90% shooting accuracy in netball, or minimal grammatical errors in English. But, as we’ve discussed before, most real-world school tasks don’t come with neat, objective thresholds. So, we start using words like “good” or “secure,” which inevitably involve degrees of quality and comparisons.
Other times, we lean on expert projection. A Grade A might go to a student whose work suggests they’re on track for a 7–9 at GCSE. This relies heavily on teacher experience and implicit knowledge of typical learning trajectories.
Or we might quietly apply cohort-referencing, deciding only 20% of the class can be “Above Expected.” Even when we don’t tell students this, it shapes our decisions.
Finally, we often inherit internal calibrations - historic expectations about what a Grade C “should” look like. But where did those come from? Past distributions? Long-forgotten rubrics?
Each approach has logic. But when different subjects use different logics, the result is inconsistency. And this can be damaging when students start comparing their grades across subjects.
When cross-subject meaning fails
If different subjects use different rules to assign grades, then cross-subject comparisons become meaningless. A Grade C in one subject might place a student near the top, while a Grade A elsewhere could be handed out freely.
Even when subjects claim to use the same approach, small differences in how teachers apply it can lead to big differences in reported grades. The result? Students and parents draw the wrong conclusions about where strengths lie and where effort is needed.
This matters because students use their grades to make strategic decisions: where to put in effort, which subjects to continue, when to seek help, and how to see themselves, e.g. as “good at science” or “not a maths person”. When the grades don’t mean the same thing across subjects, those decisions are based on faulty signals. Students may focus their energy in the wrong places, drop subjects they’re actually strong in, or carry a distorted sense of their academic identity.
Why uniform grade distributions aren’t the answer
One tempting solution is to impose fixed grade distributions across all subjects, such as 15% Grade A, 20% Grade B, and so on. This would at least ensure comparability on paper. But it quickly runs into problems. In subjects like Latin, where only a small, self-selecting group take the course, the distribution of prior attainment may be very different from the year group as a whole. If no lower-attaining students take Latin, the ‘weakest’ student in the class might still be well above average overall.
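To make this concrete, here is a minimal sketch in Python of how a fixed-quota allocation plays out for a small, self-selecting class. The marks, student names, and the `assign_quota_grades` helper are all invented for illustration; the quotas echo the 15% / 20% split mentioned above.

```python
import math

def assign_quota_grades(marks, quotas):
    """Rank students by mark and award grades in fixed proportions."""
    ranked = sorted(marks, key=marks.get, reverse=True)
    n = len(ranked)
    grades, cumulative, start = {}, 0.0, 0
    for grade, share in quotas:
        cumulative += share
        end = min(n, math.ceil(cumulative * n))
        for name in ranked[start:end]:
            grades[name] = grade
        start = end
    return grades

quotas = [("A", 0.15), ("B", 0.20), ("C", 0.30), ("D", 0.20), ("E", 0.15)]

# A whole-year-group subject versus a small, self-selecting Latin class
maths = dict(zip("abcdefghijklmno",
                 [92, 85, 81, 77, 74, 70, 66, 63, 58, 51, 47, 40, 33, 28, 21]))
latin = {"Ava": 88, "Ben": 84, "Cal": 80, "Dee": 76, "Eli": 72}

print(assign_quota_grades(maths, quotas))
print(assign_quota_grades(latin, quotas))
# Eli is pushed to the lowest grade awarded in Latin, even though everyone in
# that class may be well above the year-group average: the quota ignores who
# actually takes the course.
```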
Even adjusting for prior attainment, say by giving each subject a quota of grades based on students' Key Stage 2 scores, has its limits. It prevents subjects from showing the impact of excellent teaching or unusually strong effort. And it assumes that all subjects should produce the same shape of grade distribution. But this isn't always natural. In more hierarchical subjects, like maths or physics, attainment tends to spread more cleanly: those at the top are often markedly better than the rest. In the arts or humanities, where strengths can be more uneven across parts of the knowledge domain, the distinctions between students can be subtler.
A single distribution model looks tidy, but it risks distorting both meaning and motivation.
The shape of the distribution still matters
Everything that has been discussed about the distribution of grades also applies to the distribution of marks or percentages, if they are reported. A score of 68% means something very different when the class average is 85% compared to when it's 49%. Furthermore, reporting relative to the class average does not tell the whole story, because the shape of the entire mark distribution – whether it is skewed, bunched, or evenly spread – affects how results should be interpreted.
Some schools standardise raw marks to give scores with a fixed mean and standard deviation (such as a mean of 100 and an SD of 15), or even normalise the data to fit a bell curve. This can help reflect real differences in attainment across subjects and is preferable to reporting ranks or percentile ranks, even though the latter are easier for parents to understand (e.g. Jonny is ranked 12th out of 150 students in the year). The problem with percentile ranks is that middle attainers are likely to fluctuate dramatically up and down the rankings: a difference of just one question can move a student between the 48th and 64th percentiles. Percentile rankings are therefore generally more misleading in the information they convey, and more demotivating for students.
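As a rough illustration of that fluctuation, here is a small Python sketch using only the standard library. The cohort, the marks, and the `standardised_score` and `percentile_rank` helpers are invented assumptions for this example, not a prescribed method; the point is simply that when marks bunch in the middle, a small change in raw score reads as a modest shift on a standardised scale but a dramatic jump in percentile rank.

```python
import statistics

def standardised_score(mark, cohort, target_mean=100, target_sd=15):
    """Place a raw mark on a scale with a fixed mean and standard deviation."""
    mean = statistics.mean(cohort)
    sd = statistics.pstdev(cohort)
    return target_mean + target_sd * (mark - mean) / sd

def percentile_rank(mark, cohort):
    """Percentage of the cohort scoring at or below this mark."""
    return 100 * sum(m <= mark for m in cohort) / len(cohort)

# Invented cohort of 116 students whose marks bunch tightly in the middle.
cohort = [40, 45, 50] + [m for m in range(55, 66) for _ in range(10)] + [80, 85, 90]

for mark in (62, 58):  # one careless slip, worth roughly four marks here
    print(mark,
          round(standardised_score(mark, cohort), 1),
          round(percentile_rank(mark, cohort), 1))
# The standardised scores (roughly 105 and 94) both read as close to average,
# while the percentile rank swings by about 35 points, because so many
# students sit within a few marks of one another.
```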
That said, these types of standardisations are not the norm. I don’t often see schools make much effort to help students or parents interpret marks data in a meaningful way. Perhaps they worry it would cause confusion. Or perhaps, deep down, they’re reluctant to reveal what the numbers really show (an issue we’ll return to in a later post).
What schools should do
Some might argue that none of this matters. Let each subject hand out whatever grades it likes and just add a class average or rank to help parents make sense of it. But this misses the point. Students will compare across subjects. In fact, once subject choice comes into play, those comparisons become even more important. And as students move towards public examinations, such as Year 6 SATs or GCSEs, the feedback we give begins to carry real-world weight.
So, what should schools actually do?
First, bring subjects together. Ask departments to produce provisional grades, then sit down as a group and compare distributions. Are they wildly different? Are they justified? You don’t need every subject to hand out exactly the same proportions of each grade, but you do need to understand why those differences exist. And you need to be confident they won’t mislead students or parents.
Second, use benchmarks, but with care. Looking at prior attainment can help calibrate expectations, especially in subjects taken by small or selective groups. But rigid allocations based on Key Stage 2 data can artificially cap progress and obscure genuine success. If students have worked hard and teaching has been strong, they should be able to outperform expectations.
Third, sample the bundle of grades and marks that individual students, chosen at random, will actually receive. What are they likely to infer from that information, and does it accurately convey the message you wish to send about their relative strengths and weaknesses?
Fourth, make your grading logic transparent. What does “expected” actually mean in science? In art? If grades are based on trajectory towards GCSE, say so. If they’re based on attainment in the curriculum that term or year, without reference to any previous study, be clear. It’s hard to be honest about the fuzziness of assessment, but much worse to pretend it isn’t there.
Fifth, consider reporting standardised scores or cohort averages alongside grades, particularly in year groups where grades carry little external meaning (e.g. Years 3, 4, 7, 8, and 9). These can help explain where a student sits in the distribution without implying a false sense of fixed identity.
Finally, be wary of allowing students and parents to over-interpret small differences. A 65% in geography and 72% in English might feel different, but they often aren’t, especially if assessments differ in format, difficulty, or generosity of marking.
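As a hypothetical illustration with invented class statistics, the two marks can even represent exactly the same relative position once each class's mean and spread are taken into account:

```python
# Invented class statistics: the raw marks differ, the relative positions do not.
geography = {"mark": 65, "class_mean": 60, "class_sd": 10}
english = {"mark": 72, "class_mean": 66, "class_sd": 12}

for subject in (geography, english):
    z = (subject["mark"] - subject["class_mean"]) / subject["class_sd"]
    print(round(z, 2))  # both print 0.5: half a standard deviation above the class mean
```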
Assessment doesn't have to be perfectly aligned across subjects. But it does have to be meaningful. That starts with understanding how grades are used, where they come from, and whether they’re helping or misleading the students we’re trying to support.
Don’t pretend consistency exists. You need to create it
Grades only help students if their meaning is shared and trusted. Inconsistency across subjects isn’t a harmless quirk: it skews choices, undermines effort, and distorts identity. If we want our assessments to guide rather than mislead, we need to align meaning, not just the scale.


