Is there such a thing as being 'good' at my subject?
Unified and fractured notions of attainment
Imagine that you have been asked to design an assessment which will enable inferences to be made about how good students are at your subject. What forms of assessment will you include? How will the assessment be structured? How much time will be needed to have enough data to make valid inferences?
This task becomes notably more complex when dealing with subjects that have what we refer to as fractured notions of attainment. In this post, we will explore what this means and why it causes problems for making judgements about how good students are at these subjects.
Unified and fractured domains
School subjects can generally be classified into three categories based on the relationship between performances in different domains:
Unified Domains: In these subjects, learning in one area tends to bolster competency across the board. Mathematics serves as a prime example: stronger number sense, for instance, tends to support work in algebra, geometry, and statistics alike.
Distinct but Correlated Domains: In this category, mastering one part of the curriculum may not necessarily enhance learning in another. However, students often perform similarly across these domains due to related skills like reading and writing. History, for instance, fits this description. Proficiency in Roman history may not improve one's understanding of World War II, but students generally perform similarly in both topics.
Distinct Domains: Here, attainment in one area of the curriculum is often unrelated to performance in another. Physical Education (PE) illustrates this point well: proficiency in a ball-based team sport may have little to do with one's skills in gymnastics or understanding of sports physiology. In such subjects, notions of attainment are fractured.
Whether your subject has a unified or a fractured notion of attainment significantly shapes the complexities involved in assessment design. It is generally easier to measure attainment when performance is likely to be correlated across different domains within the subject, because performance on the parts you sample then tells you something about the parts you do not. Understanding the nature of attainment in your subject therefore informs what can realistically be inferred from an assessment and whether it is appropriate to summarise performance across various domains.
Coping with fractured notions of attainment
In designing your assessment, the first decision concerns scope: which parts of the curriculum will it sample from? There are many considerations when deciding on an assessment’s scope, but for now we will limit our concerns to the challenges posed by fractured subjects.
In highly unified subjects, it matters less which parts of the curriculum are sampled, since performance in one area will be a good indicator of performance across the board. Rather than sampling broadly, it may be better to explore a limited range of topics in depth. This would increase the validity of inferences about the areas sampled and, because those areas are representative of the whole, improve the validity of inferences about ability across the subject.
In subjects with fractured domains of knowledge, it will be necessary to sample widely across the domains. Decisions will need to be made about what weight to give each domain; for example, the weight given to practical skills over declarative knowledge. In drama, we may wish to weight practical performance over the student’s knowledge of performance techniques, theatrical traditions, or technical aspects of staging. In that case, the practical element of the assessment would need to carry more marks. This decision relates to beliefs about what the purpose of the subject is. In fractured subjects, purpose is more likely to be contested, which makes the task even harder.
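To make the weighting decision concrete, here is a minimal sketch of how such a split might translate into an overall mark. The component names, weights, and marks are invented for illustration; they are not a recommendation for how drama should be weighted.

```python
# Hypothetical weighting for a drama assessment in which practical
# performance counts for more than the written paper (illustrative only).
weights = {"practical_performance": 0.6, "written_paper": 0.4}
raw_marks = {"practical_performance": 72, "written_paper": 55}  # each out of 100

# Weighted total out of 100: 0.6 * 72 + 0.4 * 55 = 65.2
weighted_total = sum(weights[part] * raw_marks[part] for part in weights)
print(f"Weighted total: {weighted_total:.1f} / 100")
```

Changing the weights changes the inference the overall mark supports, which is why the decision about the subject's purpose has to come first.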
The second decision is to choose assessment methods and question types. Subjects with distinct domains are likely to require a wider range of assessment methods, from closed to open questions, from written papers to practical performances. Making a valid inference about how good students are at our subject will require more time, a wider scope, and a greater variety of methods. Designing assessments in fractured subjects therefore demands mastery of a wide repertoire of assessment techniques.
The purpose of an assessment
The debate over unidimensionality, while seemingly esoteric, touches on the core purpose of an assessment: is it to discern knowledge about atoms, about physics, or about science as a whole? In other words, what is the construct we are trying to measure?
The work of educational assessment expert David Andrich has been pivotal in illustrating these tensions. In one of his examples, he describes a physics test comprising topics like heat, light, sound, electricity and magnetism, and mechanics. These topics, while distinct, define the construct of physics within a specific curriculum framework. They are 'relatively thin' on their own but 'relatively thick' when considered as the subject of physics.
When are topics or parts of a subject similar enough that we can treat them as contributing to a unified knowledge domain? Statistical fit models can help, by showing how strongly performance on assessment items correlates across different areas. However, while such models can identify patterns and suggest whether a set of items behaves as if it measures a single construct, determining whether a domain is truly unified goes beyond what the statistics can show. It involves theoretical reasoning about the nature of the construct being measured and the content of the items themselves.
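As a rough sketch of the kind of evidence such a check draws on, the snippet below computes the correlation between students' totals in two domains. The scores are invented, and a real analysis would work at the level of individual items with far more students; it is meant only to illustrate the idea.

```python
import numpy as np

# Invented marks for ten students in two curriculum domains
# (say, Roman history and World War II), purely for illustration.
roman_history = np.array([12, 18, 9, 15, 20, 7, 14, 16, 11, 19])
world_war_two = np.array([14, 17, 10, 13, 19, 8, 15, 18, 9, 20])

# A high correlation is consistent with, but does not prove, the two
# domains behaving as a single unified construct.
r = np.corrcoef(roman_history, world_war_two)[0, 1]
print(f"Correlation between domain totals: {r:.2f}")
```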
For example, in Design and Technology (D&T), the assessment may cover a wide range of skills and knowledge areas, from creative design to practical workshop skills. Statistical techniques, like the Rasch model, could analyse assessment data and suggest these diverse elements fit a single domain if students tend to score consistently across different types of items. However, this does not fully account for the distinct and multidisciplinary nature of D&T, nor for the way we would want to represent attainment within the subject. The unification suggested by statistics does not capture the depth and breadth of D&T as a subject, nor does it align with the educational objectives of teaching these distinct but complementary skills. Therefore, while statistical analysis is a valuable tool, it must be used in conjunction with a deep understanding of the subject matter to determine the true nature of the domain being assessed.
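For readers curious what a Rasch-style analysis involves, here is a minimal sketch, assuming a simple right/wrong response matrix. It simulates responses from a one-parameter logistic (Rasch) model and recovers item difficulties by joint maximum likelihood. The data and sample sizes are invented; in practice you would use dedicated software and examine item fit statistics rather than just point estimates.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

# Simulate a response matrix: rows = students, columns = items (1 = correct).
rng = np.random.default_rng(0)
n_students, n_items = 200, 10
true_theta = rng.normal(0.0, 1.0, n_students)    # student abilities
true_b = np.linspace(-1.5, 1.5, n_items)         # item difficulties
p_correct = expit(true_theta[:, None] - true_b[None, :])
responses = (rng.random((n_students, n_items)) < p_correct).astype(int)

def neg_log_likelihood(params):
    """Joint negative log-likelihood of the Rasch model."""
    theta = params[:n_students]
    b = params[n_students:]
    p = expit(theta[:, None] - b[None, :])
    p = np.clip(p, 1e-9, 1 - 1e-9)  # guard against log(0)
    return -np.sum(responses * np.log(p) + (1 - responses) * np.log(1 - p))

result = minimize(neg_log_likelihood, np.zeros(n_students + n_items),
                  method="L-BFGS-B")
est_b = result.x[n_students:]
est_b -= est_b.mean()  # centre difficulties to fix the model's indeterminacy
print(np.round(est_b, 2))
```

Even where a model like this fits the data well, whether the items genuinely belong to one construct remains a judgement about the subject, as the D&T example above illustrates.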
Determining how good students are at your subject is easier said than done, particularly if your subject spans multiple, distinct domains of knowledge and ability. Perhaps it is unreasonable to expect one assessment to be enough. Multiple assessments over time may be a more appropriate way to assess ability in some subjects. Ask whoever gave you the task to read this post, then see if they still think it is achievable.
Finding out how good students are at your subject is, in my view, not best left to individual teachers & schools. If you design your own test, you will, naturally enough, use it to assess your curriculum. In that case, all you can discover is how good students are at your curriculum, not the subject itself. If, however, we are able to use assessments that sample subject performance across many schools, then we’ll have an idea of whether what we’re teaching shows up in an assessment sampling from a wider domain. Best to design internal assessments to help us know not how good children are at our subjects but how good we are at teaching them. That is to say, the most useful internal assessments assess the curriculum & its implementation.