suppressPackageStartupMessages(library(tidyverse))
dat <- read.csv("view_election.csv")
dat <- dat |>
  filter(election_type == "parliament") |>
  mutate(election_date = as.Date(election_date)) |>
  filter(election_date > as.Date("1945-12-31")) |>
  mutate(vote_share = vote_share / 100,
         seat_share = seats / seats_total)
gallagher <- function(v, s) {
  sqrt(1/2 * sum((v - s)^2, na.rm = TRUE))
}
sainte_lague <- function(v, s) {
  sum((v - s)^2 / v, na.rm = TRUE)
}
dat <- dat |>
  group_by(country_name_short, election_date, election_id) |>
  summarize(d_gall = gallagher(vote_share, seat_share),
            d_sl = sainte_lague(vote_share, seat_share),
            .groups = "drop")
### Generate all pairs
pairs <- expand.grid(A = unique(dat$election_id),
                     B = unique(dat$election_id)) |>
  filter(A < B) |>
  left_join(dat, by = join_by(A == election_id)) |>
  left_join(dat, by = join_by(B == election_id),
            suffix = c(".A", ".B"))
In a previous post, I suggested that one way of determining the minimal important difference is to establish the minimal detectable change. If you don’t know that a change is different from zero, it surely can’t be that important.
In this post, I want to suggest another way to determine the minimal important difference: working out whether you are indifferent between multiple proposed continuous measures of a concept. If you’re indifferent between two measures, the times when they disagree surely can’t be that important.
But first, a fable…
Why the king’s rooms run hot and cold
A long time ago, in a far away country, there lived two inventors.
These two inventors wished to measure heat.
One inventor measured heat by placing brandy in a thin column of glass.
As the level of the brandy rose, so the heat in the room was judged greater.
The other inventor did the same, but with quicksilver instead of brandy.
The devices made by these men sometimes agreed, but sometimes differed.
People in the mountains preferred the measure based on brandy; people in the plains preferred the measure based on quicksilver.
The two men brought their inventions to the king of that land.
They showed him their inventions, and asked him to judge which was best.
The king, wishing to offend neither man, said he could not decide, but offered to install devices from both men in the rooms in his palace.
Shortly after, the king’s steward noted disagreements between these devices.
The cellar to the west of the palace was judged by brandy to be the coldest room in the palace.
By quicksilver, though, it was the cellar to the east of the palace that was judged coldest.
The two inventors, hearing of the steward’s observations, called on the king again.
“Sire”, they said, “you said you could not decide between us on general grounds. Here you need only decide on particulars. Go into the west cellar, and then into the east cellar, and say which is coldest. Then shall you know which measure is best”.
The king went to the west cellar, and then to the east cellar, and then back to the two inventors. He told them that just as before, he could not judge which of the rooms was colder, and thus could not say which measure was better.
The two inventors thanked the king and left disappointed. The king too was disappointed, for he had no wish to spend any time in the cellars of his palace, and treated his steward rudely thereafter.
Time passed, and the king grew old, and took to complaining of the temperature in his rooms.
He asked his steward to make the temperature in his water closet more equal to the temperature in his bed chamber.
His steward, much put upon, declined.
“No sire, I shall not. For I mark the difference between the measure made in your bed-chamber, and the measure made in your water closet, and these two measures are closer together than the difference between the measures in the west cellar and the measures in the east cellar, which you thought so finely matched. If you tell me now there is a difference, then I must go back to those two men who troubled you earlier, and have them trouble you again”.
The king, not wishing to see the two inventors again, and knowing that his steward had the better of him, accepted the man’s argument. And that is why the king’s rooms blow hot and cold.
A worked example with disproportionality
In this section, I’ll look at how we might establish a minimal important difference for votes-seats disproportionality, given indifference between measures. Votes-seats disproportionality is a good example because most of the time votes and seats are measured without error. As a result we can’t calculate a minimal important difference based on the minimal detectable change. I’ll assume that there is a social scientist who is indifferent between the Gallagher index and the Sainte-Laguë index, two commonly used indices of disproportionality. I’ll calculate the values of these indices for modern parliamentary elections covered by ParlGov, and generate all pairs of elections.
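To make the two indices concrete, here is a minimal sketch that applies the same two formulas to a single made-up election. The vote and seat shares are hypothetical and chosen only so that the arithmetic is easy to follow; they are not taken from ParlGov.

```r
# Hypothetical three-party election (shares are proportions, not percentages)
votes <- c(0.45, 0.35, 0.20)
seats <- c(0.55, 0.35, 0.10)

# Gallagher index: square root of half the summed squared vote-seat gaps
gallagher <- function(v, s) sqrt(1/2 * sum((v - s)^2, na.rm = TRUE))

# Sainte-Lague index: squared gaps, each divided by the party's vote share
sainte_lague <- function(v, s) sum((v - s)^2 / v, na.rm = TRUE)

d_gall <- gallagher(votes, seats)   # approximately 0.1
d_sl <- sainte_lague(votes, seats)  # approximately 0.0722

print(c(gallagher = d_gall, sainte_lague = d_sl))
```

The same ten-point gap counts equally for both parties under Gallagher, but under Sainte-Laguë it weighs more heavily for the smaller party because the squared gap is divided by a smaller vote share. That difference in weighting is what lets the two indices rank elections differently.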
Having generated all pairs of elections, I focus on those pairs where the indices disagree about which election has “more” disproportionality. As might be expected, only a minority of pairings throw up some disagreement, but that minority, at roughly one in eight, isn’t negligible.
nrow(pairs)
[1] 212878
pairs <- pairs |>
  mutate(gall_pref = sign(d_gall.B - d_gall.A),
         sl_pref = sign(d_sl.B - d_sl.A)) |>
  filter(gall_pref != sl_pref)
nrow(pairs)
[1] 26250
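As an illustration of how such a disagreement can arise, the following sketch builds three hypothetical elections and applies the same sign comparison. The elections A, B and C are invented for the example, and the sketch uses base R subsetting rather than the dplyr pipeline used in this post.

```r
gallagher <- function(v, s) sqrt(1/2 * sum((v - s)^2))
sainte_lague <- function(v, s) sum((v - s)^2 / v)

# Hypothetical elections: vote shares (v) and seat shares (s) as proportions
elections <- list(
  A = list(v = c(0.50, 0.50),       s = c(0.56, 0.44)),  # two large parties
  B = list(v = c(0.48, 0.48, 0.04), s = c(0.50, 0.50, 0.00)),  # small party shut out
  C = list(v = c(0.60, 0.40),       s = c(0.60, 0.40))   # perfectly proportional
)

idx <- data.frame(
  id = names(elections),
  d_gall = sapply(elections, function(e) gallagher(e$v, e$s)),
  d_sl = sapply(elections, function(e) sainte_lague(e$v, e$s)),
  stringsAsFactors = FALSE
)

# All pairs with A before B, as in the main analysis
toy_pairs <- expand.grid(A = idx$id, B = idx$id, stringsAsFactors = FALSE)
toy_pairs <- toy_pairs[toy_pairs$A < toy_pairs$B, ]

# Look up each election's index values and compare signs
a <- match(toy_pairs$A, idx$id)
b <- match(toy_pairs$B, idx$id)
toy_pairs$gall_pref <- sign(idx$d_gall[b] - idx$d_gall[a])
toy_pairs$sl_pref <- sign(idx$d_sl[b] - idx$d_sl[a])

disagree <- toy_pairs[toy_pairs$gall_pref != toy_pairs$sl_pref, ]
print(disagree)
```

In this toy data only the pair of A and B is in disagreement: Gallagher ranks A as more disproportional because of its large absolute gaps, while Sainte-Laguë ranks B higher because a small party lost all of its representation.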
Now let’s examine the absolute differences between values of the Sainte-Laguë index.
pairs <- pairs |>
  mutate(delta_sl = d_sl.B - d_sl.A)
summary(abs(pairs$delta_sl))
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.000000 0.005039 0.013049 0.022868 0.028760 0.599469
In this case, we wouldn’t want to force anyone to say that a difference in the Sainte-Laguë index of 0.60 units(!) was not important just because it arose in the context of a disagreement between indices. But it does seem plausible to suggest that the median absolute value of 0.013 units might be a minimal important difference. For what it’s worth, if we compare that value to the standard deviation of values of the index in our original data, we have a minimal important difference of roughly 0.125 standard deviations.
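The conversion into standard deviation units is not shown in the code, so here is a minimal sketch of how it might be done. The helper name mid_from_disagreements is hypothetical, and the numbers are made up; in the real analysis the first argument would be the differences among disagreeing pairs and the second the election-level index values.

```r
# Hypothetical helper: take the median absolute difference among disagreeing
# pairs and express it in standard deviations of the original index.
mid_from_disagreements <- function(delta, index_values) {
  med <- median(abs(delta), na.rm = TRUE)
  c(mid = med, mid_in_sd = med / sd(index_values, na.rm = TRUE))
}

# Made-up numbers purely for illustration
delta <- c(-0.02, 0.01, 0.04, -0.03, 0.05)
index_values <- c(0.1, 0.2, 0.3, 0.4, 0.5)

res <- mid_from_disagreements(delta, index_values)
print(res)
```

Using the median rather than the maximum keeps a handful of extreme disagreements from dictating the threshold, which is the reasoning in the paragraph above.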
We could, of course, have started from the other index. Let’s examine the absolute difference between values of the Gallagher index.
pairs <- pairs |>
  mutate(delta_gall = d_gall.B - d_gall.A)
summary(abs(pairs$delta_gall))
Min. 1st Qu. Median Mean 3rd Qu. Max.
1.600e-07 3.848e-03 9.325e-03 1.339e-02 1.863e-02 1.091e-01
In this case, we get a pretty small median absolute difference of just under 0.01 units – but of course the standard deviation of the Gallagher index is much smaller, at 0.048 units. Expressed in terms of standard deviations, we have a minimal important difference of roughly 0.2 standard deviations.
We therefore have a minimal important difference for one index, and a minimal important difference for another index. This means that some differences will be important when we use one index, and not important when we use another index. That might seem unsatisfactory, but “disagreement over indices” should surely have consequences for what is substantively important.
Back to measures of democracy
We can apply this same logic to democracy. Let’s suppose there is a researcher who is indifferent between two measures of electoral democracy:
- the V-Dem project’s measure, v2x_polyarchy
- Polity V scores
Thankfully these two measures are both available in the V-Dem data.
library(vdemdata)
data("vdem")
### Just select the variables we're interested in
dat <- vdem |>
  dplyr::select(country_text_id, year,
                v2x_polyarchy,
                e_polity2)
### Fix some Polity codes
dat <- dat |>
  mutate(e_polity2 = na_if(e_polity2, -88),
         e_polity2 = na_if(e_polity2, -66))
Generating all possible pairwise comparisons of all country years across all of the modern period is memory-intensive, so I’ll focus on years since 1945 and countries covered by both projects.
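To see why the number of pairings grows so quickly, here is a small sketch with hypothetical country-year labels. It shows that a full grid has n squared rows, that keeping only alphabetically ordered pairs leaves n(n - 1)/2 of them, and that combn builds the same set without creating the full grid first.

```r
# Hypothetical country-year labels, already in alphabetical order
labels <- c("AUS1990", "AUS1991", "BEL1990", "BEL1991", "CAN1990")
n <- length(labels)

# Full grid: every label paired with every label, including itself
grid <- expand.grid(A = labels, B = labels, stringsAsFactors = FALSE)

# Keep each unordered pair once by requiring alphabetical order
unordered <- grid[grid$A < grid$B, ]

# combn() generates the unordered pairs directly, one per row after t()
direct <- t(combn(labels, 2))

print(c(grid = nrow(grid), unordered = nrow(unordered), direct = nrow(direct)))

# The count grows quadratically: 10,000 country years give about 50 million pairs
print(choose(10000, 2))
```

The filtered grid is convenient for a dplyr pipeline, but it materialises all n squared rows before discarding more than half of them, which is where most of the memory goes.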
### Restrict it to the post-war period
dat <- dat |>
  filter(year >= 1945)
### Restrict it to common cases
dat <- dat |>
  filter(!is.na(e_polity2)) |>
  filter(!is.na(v2x_polyarchy))
### Create a unique label
dat <- dat |>
  mutate(label = paste0(country_text_id, year))
### Generate all pairwise combinations of country years.
### Note that there are around 9,900 country years
### so we have 9,900 * (9,900 - 1) ordered pairings
### but we can restrict it to cases where the order is alphabetical
### taking it down to around 49 million
pairs <- expand.grid(A = unique(dat$label),
                     B = unique(dat$label),
                     stringsAsFactors = FALSE) |>
  filter(A < B)
### Start merging on the indicators
pairs <- left_join(pairs,
                   dat |> dplyr::select(label, v2x_polyarchy, e_polity2),
                   by = join_by(A == label))
pairs <- left_join(pairs,
                   dat |> dplyr::select(label, v2x_polyarchy, e_polity2),
                   by = join_by(B == label),
                   suffix = c(".A", ".B"))
As before, we focus just on cases of disagreement. Here, I’ll look at cases of “strict” disagreement.
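To spell out what “strict” changes, here is a minimal sketch with made-up differences. On my reading, strict disagreement means the two measures point in opposite directions, whereas the inequality filter used for disproportionality would also count pairs that are tied on one measure. Ties matter more here because one of the two scales takes integer values.

```r
# Hypothetical pairs: differences (B minus A) on a continuous and an integer scale
delta_cont <- c(0.10, -0.05, 0.02, -0.30, 0.07)
delta_int <- c(2, 1, 0, -4, -1)

cont_pref <- sign(delta_cont)
int_pref <- sign(delta_int)

# Weak disagreement: the signs differ, which includes a tie on one measure
weak <- cont_pref != int_pref

# Strict disagreement: the measures point in opposite directions
strict <- (cont_pref == 1 & int_pref == -1) |
  (cont_pref == -1 & int_pref == 1)

print(data.frame(delta_cont, delta_int, weak, strict))
```

The third pair is tied on the integer scale, so it counts as a weak disagreement but not a strict one.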
nrow(pairs)
[1] 48960460
pairs <- pairs |>
  mutate(vdem_pref = sign(v2x_polyarchy.B - v2x_polyarchy.A),
         polity_pref = sign(e_polity2.B - e_polity2.A)) |>
  filter((vdem_pref == 1 & polity_pref == -1) |
           (vdem_pref == -1 & polity_pref == 1))
nrow(pairs)
[1] 4961321
Here, around 10% of pairings of country years see a disagreement between the two indices. Let’s work out what the median absolute difference in these cases of disagreement is.
First, let’s do v2x_polyarchy:
pairs <- pairs |>
  mutate(delta_vdem = v2x_polyarchy.B - v2x_polyarchy.A)
summary(abs(pairs$delta_vdem))
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.00100 0.02800 0.06500 0.08811 0.11900 0.78000
Our candidate for the minimal important difference is 0.065 units. That’s around 0.25 standard deviations when looking at the V-Dem data as a whole, and bigger than 0.05, the minimal important difference implied by the minimal detectable change.
And now e_polity2:
pairs <- pairs |>
  mutate(delta_polity = e_polity2.B - e_polity2.A)
summary(abs(pairs$delta_polity))
Min. 1st Qu. Median Mean 3rd Qu. Max.
1.000 1.000 2.000 3.295 4.000 20.000
Because the Polity score is a fine-grained ordinal variable rather than a continuous variable, the median absolute difference in cases of disagreement is exactly two units.
Conclusions
In this post I’ve suggested that if you are indifferent between two measures, this can help us establish a minimal important difference. This way of establishing a minimal important difference is valuable in cases where, like electoral disproportionality, we can’t establish a minimal important difference by way of measurement error.
I don’t know how common it is for researchers to be genuinely indifferent between measures. I also don’t know how persistent that indifference is. Indeed, one possible way of reacting to these claims about minimal important differences is to quickly develop a more fine-grained preference between measures based on cases of disagreement. Obviously it’s fanciful to imagine anyone going through 5 million pairs of country years and determining whether they side more with Polity or V-Dem, but sharpening our preferences by looking at cases of disagreement might still be a worthwhile way to spend some time.
Footnotes
There’s a possible de re/de dicto confusion here, but pay no attention to that.