The Limits of Lesson Observation

May 31, 2019

The following is an excerpt from Education Cargo Cults Must Die by John Hattie and Arran Hamilton, a white paper in the Corwin Educator Series.

In many education systems, every teacher is required to undergo at least one observation a year by their school leader. Heads and principals generally use some form of rubric or scoring sheet and rate their teachers against it.

At our last count, we located 120+ observation forms that had been published with some evidence about their reliability and validity. These observations are often used for performance management purposes, to identify who are the ‘good’ and ‘less good’ teachers, and by national inspectorates to make more holistic judgments about whether a school is outstanding, good, or poor. They are also used for developmental purposes, with teachers peer-reviewing each other’s lessons so they can offer one another advice and harvest good practice to apply back in their own classrooms. Finally, they can be used to sift education cargo cults from education gold by observing the impact of a new education product or teacher development program in the classroom.

But we should ask ourselves an important question: Can you actually see, hear, and sniff a good lesson? Are our five senses any good at measuring outstanding, adequate, and poor? Can we see the impact of a teacher in a class of students? Do we watch teacher performance or do we watch the impact on the students? What if the performance is spectacular, but the impact of little consequence? 

If we phrase the question as a binary yes/no choice, then the answer to whether we can make meaningful and rigorous observations is a resounding yes. And, by binary, we mean questions where there is a clear yes/no answer, like: 

  • Is the teacher in the classroom?
  • Are they talking to the class?
  • Are the children all awake?
  • Has homework been set and marked?

It’s relatively straightforward to establish a sampling plan for each of these and any two observers will have a high degree of consistency in their observations [with minimal training], even if they are not educationalists.
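One way to put a number on that consistency is sketched below. It is a minimal illustration, not a method taken from the white paper: the function names and the sample tallies are hypothetical, and Cohen's kappa is our own choice of agreement statistic. It compares two observers' yes/no answers to a single binary question (say, "Are the children all awake?") across ten sampled moments, reporting raw agreement and agreement corrected for chance.

```python
from collections import Counter


def percent_agreement(ratings_a, ratings_b):
    """Share of sampled moments on which the two observers gave the same answer."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both observers need the same non-zero number of ratings")
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return matches / len(ratings_a)


def cohens_kappa(ratings_a, ratings_b):
    """Agreement corrected for the agreement expected by chance alone."""
    n = len(ratings_a)
    observed = percent_agreement(ratings_a, ratings_b)
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    # Chance agreement: probability both observers pick the same label
    # if each answered independently at their own base rates.
    expected = sum(
        counts_a[label] * counts_b[label] for label in set(counts_a) | set(counts_b)
    ) / (n * n)
    if expected == 1.0:
        # Both observers gave one identical label throughout; treat as full agreement.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical data: True means "yes, all children awake" at that sampled moment.
observer_a = [True, True, False, True, True, False, True, True, True, False]
observer_b = [True, True, False, True, True, False, True, True, False, False]

agreement = percent_agreement(observer_a, observer_b)
kappa = cohens_kappa(observer_a, observer_b)
print(f"raw agreement: {agreement:.2f}, Cohen's kappa: {kappa:.2f}")
```

Raw agreement alone can flatter the observers when one answer dominates (if the class is almost always awake, two observers will mostly agree by default), which is why the sketch also reports the chance-corrected figure.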

So, for these kinds of binary questions about the performance, we can see, hear and sniff reasonably reliably. We could probably stretch from binary to asking questions about frequency—how often something occurred (e.g. were all the students awake, all the time during the lesson?).

But when we want to use observation to determine whether the teacher delivered a high-quality lesson and ask: 

  • Did the teacher deliver a 'good' lesson?
  • Did the students 'achieve' the learning objective?  
  • Were the learning objectives worthwhile, appropriate, and challenging?  
  • Was the classwork a 'good' fit with classroom-based activity? 
  • Did the teacher provide 'good' feedback?
  • Were the education products 'effective'?  
  • Did the teacher-training program deliver 'impact' in the classroom?  

we open a huge can of worms. Who decides what 'good' is, and who decides what 'impact' means?

Observers rely on proxies for learning. A proxy measure is when we use one thing that’s quite easy to get data about to tell us about something else, which is much more difficult to get data about. For example, doctors rely on blood tests, blood pressure and heart rate analyses to tell them whether a patient is fit and well. And, generally, these work relatively well, but it’s possible to have a rare type of illness that does not show up on these types of tests—which means that you might be given a clean bill of health by the doctor, but actually be at death’s door.

It’s the same with lesson observations. It is possible that, when we measure with our eyes, we are looking in the wrong areas. When we see busy, engaged students in a calm and ordered classroom where some students have supplied the correct answers and we conclude that a heck of a lot of learning is going on, it is quite possible that absolutely nothing of any significance is being learned at all.

We know, too, that much of what goes on inside the classroom is completely hidden. The late great Graham Nuthall, in his seminal work The Hidden Lives of Learners (2007), theorizes that there are three separate cultural spheres at play in the classroom: the Public Sphere [in theory controlled by the teacher], the Social Sphere of the students [which the teacher is often unaware of] and the Private Mental Worlds of the students themselves [which both the teacher and the other students are unable to directly access]. In short, most of what goes on in the classroom is inaccessible to the teacher, and even more so to a third-party observer.

Compounding this, the evidence from neuroscience suggests that, of the vast array of data collected by our various senses each second, very little is actively processed by the conscious mind. So, even within the Public Sphere that we have direct access to as observers, it’s likely that we see very little. As we focus narrowly on some aspects of classroom practice, we miss the stooge in a gorilla suit dancing across the room. As observers, we have our own lens, our own theories and beliefs about what we consider 'best' practice, and these can bias the observations, no matter how specific the questions in any observation system. Most observations of other teachers end with us telling the teacher how they can teach more like us!

The challenge with observation is that we often end up seeing what we want to see, guided by our cognitive biases. The process of observing is like interpreting a Rorschach image, one of those inkblot images that psychiatrists show to their patients, where some say they can see their mother and others JFK.

The duck-rabbit image, popularized by the philosopher Ludwig Wittgenstein (1953), provides a similar conundrum. When we undertake lesson observations, do we see a duck or do we see a rabbit?

The data is the same, but we can interpret and re-interpret it in more than one way. 
