Not of General Interest: Student Evaluations and Academic Rigor

Friday, February 11, 2011

Richard Arum, speaking in "A Lack of Rigor Leaves Students 'Adrift'" at npr.org:

According to the study, one possible reason for a decline in academic rigor and, consequentially, in writing and reasoning skills, is that the principal evaluation of faculty performance comes from student evaluations at the end of the semester. Those evaluations, Arum says, tend to coincide with the expected grade that the student thinks he or she will receive from the instructor.

"There's a huge incentive set up in the system [for] asking students very little, grading them easily, entertaining them, and your course evaluations will be high," Arum says.

I'm glad it's a real study, or we'd all be saying, "thank you, Captain Obvious."

Seriously, though, student evaluation numbers are the primary way in which a lot of us have our teaching evaluated. We (or our administrators) have canonized those numbers and granted them a lot more power than they had when student evaluations began back in the 1970s. Isn't it logical to assume that in situations where those numbers have the most power, the temptation will be the greatest to massage the assignments into something that's student-friendly or at least complaint-proof?

I'd like to see some study like the following: take instructors of comparable rank and teaching ability (as measured by observation, etc.) who are teaching similar kinds of content, maybe a large required course where the instructors don't have to use the same materials. Half of them don't have to have student evaluations at all, or maybe they have evaluations that are locked away for the period of the experiment so that administrators can't see them. Follow both groups for 5 years or so, judging teaching in one group solely by observations, self-report, and review of course materials. At the end of that time, see if there's a demonstrable difference in student learning and academic rigor.

I know this probably couldn't be done (and isn't a scientific design, of course), but if we're going to "assess outcomes," shouldn't we also be assessing one of the primary if not the only means by which we evaluate teachers?

7 comments:

Anthea said...: Urgh....oh no. Whilst reading your posting I started to think and remember conversations where my colleagues back in the UK talked about the Research Assessment Exercise that plagues many British academics every year.; 11:09 PM
Anonymous said...: There's a study at the military academy, I think by Scott Carrell and Jim West, that does something similar, looking at math classes for which there is an external exam.

I *think* they found that high evals were correlated positively with grades on the external exam, but negatively with mathematical learning in later classes. Students preferred professors who taught towards the test and penalized those who gave them a deeper or broader understanding.

Of course, their evals could be affected by the fact that they have this general exam.; 4:45 AM
undine said...: Anthea, does the Research Assessment Exercise try to determine the impact of research or of teaching?

nicoleandmaggie--thanks for that information on the math study. I'm not surprised to hear those results.; 8:59 AM
profkm said...: Thanks for pointing me in the direction of this story. I have recently been thinking about the relationship between academic rigor and student evaluations myself: toohottoteach.wordpress.com; 3:56 PM
Carl said...: Without clicking through, because I'm already playing enough hooky here, I'm suspecting the usual gag in pop reports of 'studies', in which carefully qualified correlations are transformed into 'possible reasons'. Another, less flattering correlation is the famous transformation of the academy from an elite enclave into a commodity, and the consequent 'democratic' entry into academe of barbarians from the lower classes and other previously-excluded groups. This includes me; I won't speak for anyone else here.

Anyhoo, some of us identify with the status elite we're trying to infiltrate and become zealously-converted disciplinarians, some of us remain identified with our prole roots and become populist iconoclasts. Some of each group give up on the old standards altogether, in disgust or triumph, once the consumer's voice is duly empowered. Those of us who get good evals find them valuable compendia of popular wisdom, those of us who don't think they're mere demagoguery. So much ideology, so little data.

Another complicating factor is that even where evals are used to assess teaching, the screeners may be colleagues on peer review committees like T&P. I've seen conversations where good evals were actively discounted because they were presumed to be tainted by pandering. Far from simply wolfing down the raw data, impressive hermeneutic machineries were deployed to interpret the slimy entrails. And rightly so, because if the eval system has any kind of narrative dimension it should be possible to distinguish false positives (he made every class fun) from false negatives (she really made us work hard).

There's a solution to all of this. We could teach our students to be the kind of reflective analysts of their own experience who don't simply regurgitate their tastes and preferences every time some official entity gets an assessment bug up its butt.; 1:02 PM
Anonymous said...: "Anthea, does the Research Assessment Exercise try to determine the impact of research or of teaching?"

Let me answer, as best I can, being a native and so on. The RAE *did* (for it is no more, and what shape its successor will take has not got any clearer with the recent change of government) assess research, primarily, though I believe figures for teaching load, student numbers and so on were used to put those figures into context per institution. It also ran only every five years, not every year, which causes very odd peaks and troughs in the UK academic employment market as places try to appropriate high-scoring scholars as close to the RAE mark as they can afford, many of whom will be on temporary contracts and will not be replaced after the deadline has passed. So it's no bad thing that it's gone in some ways, but the new thing, which was going to be called the Research Excellence Framework, may well be worse for us.; 5:03 PM
undine said...: Carl, I wish your solution could be implemented. It's important to read those entrails correctly, although in a non-nuanced system ("what's the numerical score for your teaching?"), I doubt that's going to happen.

Jonathan, thanks for explaining the RAE. It sounds as though it distorted the job market in some especially pernicious ways. I'll be interested to see what the new system (REF) does that is different or better.; 11:02 AM