University responses to industrial action and grade inflation

academia
Author

Chris Hanretty

Published

June 21, 2023

tl;dr: policies adopted by my employer are likely to increase the proportion of students getting a first-class degree by roughly three percentage points

My employer, like many others in the sector, is being seriously affected by industrial action (a “marking and assessment boycott”) called by the University and College Union as part of a pay dispute.

As a response to the marking and assessment boycott (MAB), my employer drew up new emergency regulations which are supposed to mitigate the effect of the strike. My focus in this blog post is on changes to the classification boundaries.

Previously, our classification boundaries were as follows: a student would be given a first-class degree if:

A student with a (weighted) average of 72 would always get a first; a student with a (weighted) average of 69 might get a first, and a student with a weighted average of 67 could never get a first.

According to the emergency regulations, a student can now be given a first class degree if:

This means that a student with a weighted average of 67 can now get a first-class degree, depending on the results of their final-year modules.
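As a rough sketch, the two rules can be written as a pair of R functions. The thresholds are taken from the way the rules are encoded in the simulation later in this post, not from the text of the regulations, and the names `first_old`, `first_new` and `upper_band` are illustrative labels only.

```r
# Sketch of the two classification rules, using the thresholds that the
# simulation later in this post relies on. `upper_band` is TRUE when the
# student meets the additional requirement on final-year modules.
first_old <- function(avg, upper_band) {
    avg > 70 | (avg >= 68 & upper_band)
}

first_new <- function(avg, upper_band) {
    avg > 70 | (avg >= 67 & upper_band)
}

# The example students from the text, plus the newly eligible case.
first_old(72, upper_band = FALSE)  # a first regardless of module results
first_old(69, upper_band = TRUE)   # a first only with the module requirement
first_old(67, upper_band = TRUE)   # never a first under the old rules
first_new(67, upper_band = TRUE)   # can be a first under the new rules
```

The only difference between the two functions is the lower edge of the borderline zone, which moves from 68 to 67.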

To my mind there is no rational connection between this policy change and the industrial action. The industrial action is a marking and assessment boycott, not a boycott of teaching. There was a strike period earlier in the year, and so students might not have been taught some topics, but the college has already adopted mitigation policies in respect of this strike, and topics not taught have been struck out from exams. The marking and assessment boycott in itself has no connection to student performance. Essays are not perishables: an essay which is not marked promptly does not deteriorate over time.

This policy does, I think, make sense if you believe that the university is concerned with student and parent satisfaction, and is seeking to paper over student discomfort at the marking and assessment boycott by giving people a higher class of degree. I have no insights into the intentions of the management team at my employer, but this reading would be consistent with a desire shown elsewhere that graduation must proceed as originally scheduled.

What are the consequences of this new policy, and in particular the consequences for grade inflation?

Over some years I have had access to breakdowns of student marks for my department, which covers politics, international relations and philosophy. I can’t use those marks here, but I can simulate marks which closely approximate them. You’ll have to take it on faith that my simulation is reasonably true to life.

Student grades in the second and third years of their degree can be well approximated by a skew normal distribution. In this distribution, there’s one parameter (\(\xi\)) which controls the location, a second parameter (\(\omega\)) which controls the scale, and a final parameter (\(\alpha\)) which controls the skew.
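For reference, the standard skew normal density with these three parameters, and its mean, can be written as follows, where \(\phi\) and \(\Phi\) are the standard normal density and distribution function. With the parameter values used in the simulation in this post, the mean works out at about 63, roughly ten points below the location parameter, so \(\xi\) should not be read as the average mark.

\[
f(x) = \frac{2}{\omega}\,\phi\!\left(\frac{x-\xi}{\omega}\right)\Phi\!\left(\alpha\,\frac{x-\xi}{\omega}\right),
\qquad
\mathrm{E}[X] = \xi + \omega\,\delta\,\sqrt{2/\pi},
\quad
\delta = \frac{\alpha}{\sqrt{1+\alpha^2}}
\]

\[
\xi = 73,\ \omega = 13,\ \alpha = -3.5:\quad
\delta \approx -0.96,\qquad
\mathrm{E}[X] \approx 73 - 13 \times 0.96 \times 0.80 \approx 63.0
\]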

Here’s a distribution which matches the second-year grade distributions I have access to:

library(sn)
nSims <- 1e5
yr2 <- rsn(nSims, xi = 73, omega = 13, alpha = -3.5)

par(bg = "#0f2537", fg = "white", col.lab = "white", col.axis = "white",
    col.main = "white", col.sub = "white")

hist(yr2, breaks = 33,
     xlab = "Stage average",
     main = "Simulated second year marks for 100,000 students")
mtext(expression(xi == 73 ~ ", " ~ omega == 13 ~ ", " ~ alpha == -3.5),
      side = 3)

To simulate third-year distributions, we can add a systematic component (around 1.7 points) and some noise (SD = 7). This much is suggested by an intercept-only regression of the difference between third- and second-year grades.

yr3 <- yr2 + 1.7 + rnorm(nSims, mean = 0, sd = 7)

If we have simulated second and third year grades, we can work out the distribution of “overall averages”.

overall <- 1/3 * yr2 + 2/3 * yr3
summary(overall)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  6.666  58.327  65.063  64.198  70.934  97.803 

and calculate the percentage of students with an average greater than 67, 68 or 70:

round(100 * mean(overall > 70))
[1] 29
round(100 * mean(overall > 68))
[1] 37
round(100 * mean(overall > 67))
[1] 42

This suggests that reducing the cut-off by one point, from 68 to 67, would increase the proportion awarded a first-class degree by around four percentage points, if we ignored for a moment the requirement to have at least sixty credits in the upper band.
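These proportions can also be cross-checked without simulation, under the same distributional assumptions. The sketch below writes the skew normal density out in base R and integrates over second-year marks; `dskew`, `p_above`, `shift` and `noise_sd` are names introduced here for illustration, and the integration limits are simply a range outside which the density is negligible.

```r
# Skew normal density written out in base R (same form as sn::dsn).
dskew <- function(x, xi, omega, alpha) {
    z <- (x - xi) / omega
    2 / omega * dnorm(z) * pnorm(alpha * z)
}

# overall = 1/3 * yr2 + 2/3 * (yr2 + 1.7 + noise)
#         = yr2 + (2/3) * 1.7 + (2/3) * noise,  noise ~ N(0, 7^2)
shift    <- 2/3 * 1.7
noise_sd <- 2/3 * 7

# P(overall > cutoff): integrate, over second-year marks, the probability
# that the shifted, noisy mark clears the cut-off.
p_above <- function(cutoff) {
    integrand <- function(y) {
        dskew(y, xi = 73, omega = 13, alpha = -3.5) *
            pnorm((y + shift - cutoff) / noise_sd)
    }
    # The second-year density is negligible outside this range.
    integrate(integrand, lower = -30, upper = 130)$value
}

p <- sapply(c(`70` = 70, `68` = 68, `67` = 67), p_above)
round(100 * p, 1)

# Gain from moving the cut-off from 68 to 67, ignoring the credit requirement
round(100 * (p["67"] - p["68"]), 1)
```

Working with unrounded proportions gives the gain between the two cut-offs directly, which is harder to read off from the whole-number percentages above.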

Is this additional requirement to have sixty credits in the upper band ignorable?

It turns out it doesn’t reduce the estimates that much. Here’s a fitted curve which shows the probability of having at least two modules with grades of 70 or greater as a function of overall average (again, estimated from the data I have had access to but can’t share).

inx <- 40:100
outy <- plogis(-28 + 0.43 * inx)


par(bg = "#0f2537", fg = "white", col.lab = "white", col.axis = "white",
    col.main = "white", col.sub = "white")

plot(inx, outy,
     xlab = "Overall average",
     ylab = "Probability of having two first-class grades in the final year",
     type = "b")
abline(v = c(67, 68, 70),
       lty = 1:3,
       col = c("#66c2a5", "#fc8d62", "#8da0cb"))

text(x = c(67, 68, 70),
     y = c(0.3, 0.4, 0.5),
     label = as.character(c(67, 68, 70)),
     col = c("#66c2a5", "#fc8d62", "#8da0cb"),
     pos = c(2, 4, 4),
     srt = 90)

The probability of having two first-class modules, conditional on getting an average of 68, is roughly seven in nine. The probability of having two first-class modules, conditional on getting an overall average of 67, is seven in ten.
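These figures can be read straight off the fitted curve. A minimal check, using the same coefficients as the plot (the names `cutoffs` and `p_modules` are illustrative):

```r
# Fitted probability of meeting the module requirement at each cut-off,
# using the coefficients of the curve plotted above.
cutoffs <- c(67, 68, 70)
p_modules <- plogis(-28 + 0.43 * cutoffs)
names(p_modules) <- cutoffs
round(p_modules, 2)

# The fractions quoted in the text, for comparison
c(seven_in_ten = 7/10, seven_in_nine = 7/9)
```

The fitted values are about 0.69 at an average of 67 and about 0.78 at an average of 68, so both fractions are close approximations.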

We can include these probabilities in a simulation of grade classifications. By repeatedly simulating, calculating grade classifications under both rules, and subtracting the probability of a first under the old rules from the probability of a first under the new rules, we can get an estimate of the effect of the new rules.

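sf <- function(nSims = 1e5) {
    # Second-year stage averages: skew normal, parameters as above
    yr2 <- rsn(nSims, xi = 73, omega = 13, alpha = -3.5)
    # Third-year averages: second year plus a 1.7-point uplift and noise
    yr3 <- yr2 + 1.7 + rnorm(nSims, mean = 0, sd = 7)
    # Overall average weights the third year twice as heavily
    avg <- 1/3 * yr2 + 2/3 * yr3
    # 1 if the student meets the module requirement, drawn from the fitted curve
    higher_category_modules <- rbinom(n = nSims,
                                      size = 1,
                                      prob = plogis(-28 + 0.43 * avg))
    # Proportion awarded a first under each set of rules
    old_rules <- mean(avg > 70 | (avg >= 68 & higher_category_modules))
    new_rules <- mean(avg > 70 | (avg >= 67 & higher_category_modules))
    # Increase in the proportion of firsts attributable to the new rules
    return(new_rules - old_rules)
}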
res <- replicate(1000, sf())
summary(res)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
0.03053 0.03166 0.03200 0.03202 0.03240 0.03355 

If this simulation exercise is reasonably true to life (and you have to accept that much on faith), then the effect of these new rules is likely to increase the proportion with a first class degree by a little over three percentage points, if nothing else changes at the same time.

I have carried out this simulation exercise for the proportion given a first class degree, but it should be easy to adapt this for the proportion given a “good degree” (2:1 or 1st). Equally, it should be possible for academics in other institutions, who have a good picture of the distribution of second and third year marks, to repeat this exercise for their institution. Of course, it will be possible for my employer to estimate exactly the effect of these new rules by calculating degree classes under both the old and the new rules, and I hope to see such an analysis in due course.
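For anyone wanting to repeat the exercise, here is a hedged sketch of a parameterised version of the simulation. The defaults are the values used in this post; the function name `sim_effect` and all of its argument names are illustrative, the skew normal draws use a standard base-R representation in place of `sn::rsn`, and the one-third/two-thirds weighting is kept fixed. Another institution, or another classification boundary, would need its own distribution parameters, cut-offs and fitted `p_upper` curve.

```r
# Parameterised sketch of the simulation. Defaults are the values used in
# this post; every argument name here is illustrative, not an existing API.
sim_effect <- function(n = 1e5,
                       xi = 73, omega = 13, alpha = -3.5,  # second-year marks
                       uplift = 1.7, noise_sd = 7,          # third minus second year
                       cut_auto = 70,                       # automatic award above this
                       cut_old = 68, cut_new = 67,          # borderline lower bounds
                       p_upper = function(avg) plogis(-28 + 0.43 * avg)) {
    # Skew normal draws in base R via the standard representation
    # z = delta * |u0| + sqrt(1 - delta^2) * u1 (stands in for sn::rsn).
    delta <- alpha / sqrt(1 + alpha^2)
    z <- delta * abs(rnorm(n)) + sqrt(1 - delta^2) * rnorm(n)
    yr2 <- xi + omega * z
    yr3 <- yr2 + uplift + rnorm(n, mean = 0, sd = noise_sd)
    avg <- 1/3 * yr2 + 2/3 * yr3
    # TRUE if the student meets the module requirement
    upper <- rbinom(n, size = 1, prob = p_upper(avg)) == 1
    old_rules <- mean(avg > cut_auto | (avg >= cut_old & upper))
    new_rules <- mean(avg > cut_auto | (avg >= cut_new & upper))
    c(old = old_rules, new = new_rules, difference = new_rules - old_rules)
}

set.seed(2023)
sim_effect()
```

Returning the proportions under both rules alongside the difference makes it easier to see the baseline from which any increase is measured.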

In my experience, many academics dismiss concerns about “grade inflation”, because we do not feel that the marks we award have changed over time. It would be better if we distinguished between “grade inflation” and “classification inflation”. Where universities use different algorithms to classify students, students can receive different degree classifications with identical grades. There are good reasons why some classification algorithms should be preferred to others – but the presence of an industrial dispute does not seem to me to be one of them.