Sunday polls tell us nothing about the impact of the “debates”

March 28, 2015

Polls for the Sunday newspapers will start coming out shortly.

Some will be tempted to interpret any changes in these polls as a consequence of Thursday’s “debates”.

That’s stupid. These polls can tell us almost nothing about the impact of these debates. Here’s why.

First, we know that not many people watched the debate.

Second, we know that relatively few of those who watched the debate were undecided. The ICM poll suggests that 8% of watchers fell into that category. I have no reason to believe that’s an under-estimate. Of those 8%, that same ICM poll said that 56% thought Ed Miliband won the debate, and 30% David Cameron.

Now, those numbers might be wrong — but that doesn’t materially affect the calculations that follow. Suppose, unrealistically, that all of the undecided voters decided on the basis of Thursday’s “debates”. That means Miliband won 8% × 56% ≈ 4.5% of the audience, and Cameron 8% × 30% = 2.4%.

That translates to 120,150 new voters for Miliband, and 64,800 new voters for Cameron (figures which imply a debate audience of roughly 2.7 million).

How big are these numbers as a fraction of the voting population (approximately 30 million people)? 120k Miliband switchers equals 4/10ths of a percentage point; 65k Cameron switchers equals 1/5th of a percentage point.

It’s impossible to detect changes this small accurately unless you have huge, huge samples. A sample of 200,000 people might be enough.

Anyone got a Sunday poll with a sample of 200,000 people?

No?

Thought not.
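A rough power calculation backs up that figure. The sketch below uses only the Python standard library and illustrative baseline shares (a party on 35% gaining four-tenths of a point); it is the textbook two-proportion approximation, not anything specific to these particular polls.

```python
from statistics import NormalDist

def n_per_poll(p_before, p_after, alpha=0.05, power=0.80):
    """Approximate sample size per poll needed to detect a shift from
    p_before to p_after between two independent polls (two-sided test,
    normal approximation for two proportions)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    variance = p_before * (1 - p_before) + p_after * (1 - p_after)
    return (z_alpha + z_power) ** 2 * variance / (p_before - p_after) ** 2

# Illustrative figures: a party on 35% gaining 0.4 percentage points
print(round(n_per_poll(0.350, 0.354)))  # roughly 220,000 respondents per poll
```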

(PS: I’m open to counter-arguments in the style of Lazarsfeld).

Did the debates in 2010 increase political engagement?

March 17, 2015

British politics has reached an impressive level of recursion. Broadcasters and politicians are now having a debate about debates – and, on some shows, debates about the debate about the debates.

These meta-debates feature a lot of cant, and a lot of bullshit. I mean that in the Frankfurtian sense: lots of people are making claims without particularly caring whether they are true or not. One (potentially) bullshit claim is the claim that the 2010 debates mattered for political engagement (see, for example, Adam Boulton’s tweet to this effect).

Whether or not the debates improved engagement is, of course, an empirical question. So I thought I’d dig out the 2010 British Election Study panel data (http://bes2009-10.org/), to see whether the debates did in fact improve turnout.

The wrong way of proceeding is to look at

  1. stated turnout intention amongst people who watched the debates, and
  2. stated turnout intention amongst people who didn’t watch the debates,

and compare the two. People who watched the debates are unusual: they care about politics. So they’re much more likely to turn out and vote.

A better way of proceeding is to look at

  1. stated turnout intention amongst people after the debates, minus
  2. stated turnout intention amongst people before the debates

and compare the differences between

  1. people who watched the debate and
  2. people who didn’t watch the debates

We can do this thanks to the design of the BES. There’s one variable which measures turnout intention in the pre-campaign survey (a 0–10 scale, where higher values indicate the respondent is almost certain to vote; mean value across respondents with values for both waves: 9.82, SD 2.51), and one variable which measures turnout intention in the campaign survey (mean value: 5.17, SD 3.42).

You’ll notice stated turnout intention is absurdly high in the pre-campaign period, because PEOPLE LIE. (Sorry, they respond inaccurately given a prevalent social desirability bias). But then, people lie about turnout all the freaking time, and that doesn’t stop people writing about it.

Again, we can’t just compare the change over time across these two groups. We’ve got to control for people’s pre-existing levels of political interest, and other features. The way I do that is through exact matching – creating matched sets of people who are alike in terms of the characteristics listed below (a minimal sketch of the comparison follows the list):

  • political interest in general
  • interest in this election
  • whether they’d been contacted by parties
  • whether they lived in a safe, ultra-safe, marginal, or ultra-marginal seat
  • whether they read a newspaper every day, sometimes, or not at all
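For concreteness, here is a minimal sketch of that matched comparison in pandas. The file and variable names are hypothetical stand-ins for the real BES variables, and the actual analysis may weight matched sets differently.

```python
import pandas as pd

# One row per panel respondent. Hypothetical column names:
#   turnout_pre, turnout_campaign : 0-10 turnout intention in each wave
#   watched_debate                : 1 if the respondent watched any debate, else 0
#   plus the five matching covariates listed above
df = pd.read_csv("bes_2010_panel.csv")  # hypothetical file name

covariates = ["interest_general", "interest_election",
              "party_contact", "seat_marginality", "newspaper_readership"]

# change in stated turnout intention between waves, for each respondent
df["change"] = df["turnout_campaign"] - df["turnout_pre"]

# within each exactly matched set, compare watchers with non-watchers
within_set = df.groupby(covariates).apply(
    lambda g: g.loc[g["watched_debate"] == 1, "change"].mean()
              - g.loc[g["watched_debate"] == 0, "change"].mean()
)

# average over sets that contain both watchers and non-watchers
print(within_set.dropna().mean())
```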

When I match respondents in this way, I find that (after removing people who responded to the campaign period questionnaire before any of the debates took place) watching any of the debates is associated with an increase of 0.18 points on that 0-10 scale (p value: 0.04).

Is 0.18 points a little or a lot? One way of judging this effect is to compare it to the effects of other media consumption. For example: we can ask what effect “sometimes” reading a daily newspaper has, compared to never reading one. That effect, at 0.19, is slightly bigger than the effect of watching the debates. But the effect of debate-watching stands up well. So although some of the comment surrounding the debate-about-the-debates might have been bullshit, it might also (accidentally) be true.

Replication code is available at GitHub. Please do get in touch if you can improve the analysis — or suggest why turnout is so high in the pre-election wave.

Subjective economic judgements != what actually happened

March 16, 2015

Like most elections, this election will be fought on the basis of the economy. The Conservatives and the Liberal Democrats will argue that the economy is growing. Labour will argue that living standards are stagnant or declining.

As a result, many people will be asked during the course of this election campaign whether the economic position of their household has improved or worsened – or whether they expect it to improve or worsen over the coming twelve months. It matters how people answer these questions. Economic optimism is known to be associated with positive polling for incumbents (but doesn’t have an independent effect on election outcomes).

Unfortunately, the answers that people give to these questions do not always march in lock-step with more objective measures of households’ economic position. To show that, I’m going to compare two things:

  • first, respondents’ answers to the question “How does the financial situation of your household now compare with what it was 12 months ago” (possible answers: a lot worse / a little worse / stayed the same / a little better / a lot better), from Wave 3 of the BES (October 2014)
  • second, changes in respondents’ levels of household income, recorded by YouGov in (a) February 2014 and (b) February 2015 (possible answers: fifteen separate income bands, starting from under £5,000, and going up in £5,000 or £10,000 increments). I’ve taken the mid-point of each bracket, and calculated the difference in household income (a sketch of this calculation follows the list). The median difference is zero, but 25% of respondents had a change larger than £5,000, or one income band.
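Here is a sketch of that calculation. The band labels, column names and file name are hypothetical, and only a few of the fifteen bands are written out.

```python
import pandas as pd

# Hypothetical mapping from income bands to bracket mid-points (in £);
# the real survey uses fifteen bands, only a few are shown here.
midpoints = {
    "Under £5,000": 2_500,
    "£5,000 to £9,999": 7_500,
    "£10,000 to £14,999": 12_500,
    "£15,000 to £19,999": 17_500,
    # ... remaining bands ...
}

panel = pd.read_csv("income_panel.csv")  # hypothetical file name
panel["income_2014"] = panel["band_feb2014"].map(midpoints)
panel["income_2015"] = panel["band_feb2015"].map(midpoints)
panel["income_change"] = panel["income_2015"] - panel["income_2014"]

print(panel["income_change"].median())                 # around zero, per the text
print((panel["income_change"].abs() >= 5_000).mean())  # share moving by at least one band
```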

These two pieces of information are collected at separate points in time: when respondents are asked about the financial situation of their household, they are not simply echoing an answer on household income they gave moments earlier.

There are good reasons to think that people from households which have seen their income rise will say that their household’s financial position has improved. That’s true, but only in the smallest, most grudging way, as Figure 1 shows.

[Figure 1: subjective change in household financial position, by change in recorded household income]

The percentage saying that their household’s financial position got worse (the red area) decreases as we move from households whose income did in fact decline, to households whose income did in fact increase. But the effect is very small, and the absolute figures still indicate that the link between objective and subjective evaluations isn’t that tight. 35% of people whose household income increased said that their household’s financial position got worse.

Now, you might object that

  • I take nominal income instead of real income, and that it’s possible for nominal increases in income to be wiped out by cost increases in certain bundles of goods; or that
  • household financial position includes wealth as well as income, and it’s possible for income to change even as households draw down wealth; or that
  • the timing of the income measures (February 2014 to February 2015) doesn’t match up with the timing of the responses (fieldwork: October 2014)

but still, this suggests strongly that subjective evaluations of economic conditions should be used as indicators of mood rather than as a proxy for what actually happened to respondents.

Google search trends and the #indyref

February 11, 2015

Ronald MacDonald and Xuxin Mao, of the University of Glasgow, have published a working paper looking at Google search activity and the Scottish independence referendum.

The paper has got media attention, in particular because it claims

  1. that search trends can be used to predict election outcomes, and
  2. that the “Vow” had no effect on the vote.

It’s rather unfortunate that this paper has received so much media attention, because it’s a very, very bad paper. It

  • is poorly written (Ipsos Mori features as “Dipso Mori”: clearly a pollster who has had a bit too much to drink)
  • misrepresents the state of the literature on election forecasting using search activity
  • bandies around concepts like “clear information”, “rationality”, and “emotion” with scant regard for the meaning of those words.
  • does not attempt to examine other sources of information like the British Election Study

Let’s take the first main claim made by the paper: that search activity can be used to predict elections. How?

The first thing to note is that the authors are using information on search activity to try to predict a variable which they label “Potential Yes votes”. Those who read the paper will realize that “potential Yes votes” is actually a rolling average of polls. So the authors are using search activity to try to predict polling numbers. Without some polling data, you cannot use search trends to predict elections.

There are situations where using search activity to predict polling numbers is useful. Some countries (Italy, Greece) ban the publication of polls in the run up to elections. I can imagine “predicting” polling would be useful in these contexts.

But any exercise in forecasting will ultimately rely on polling data. If the polling data suffers from a systematic bias, forecasting based on search activity will also suffer from that bias.

The second thing to note is that the authors are attempting to use searches for particular terms to predict polling numbers. In the paper, they try two terms: “Alex Salmond” and “SNP”. Their assumption is that searching for these terms will be correlated with voting yes — or equivalently, weeks in which there are more searches for Alex Salmond will be weeks in which the Yes campaign is doing better in the polls.

Unfortunately, the authors themselves show that in the latter part of the period under study, there is in fact no correlation between the volume of searches for Alex Salmond and the Yes vote. The authors write

“the effect of Google Trends on Potential Yes Votes became insignificant after 15th March 2014. Based on the testing criteria on clear information, voters in the Scottish referendum encountered difficulties in finding enough clear information to justify a decision to vote Yes”.

In other words, because the authors assume that there is a relationship between particular types of searches and Yes voting, the fact that that relationship breaks down becomes evidence not that this was a poor assumption to begin with, but rather that voters faced difficulty in finding information supporting a Yes vote.

I struggle to accept this reasoning. The only justification I can see for assuming that searching for these terms will be correlated with voting yes is the significant correlation during the first period under study. But it seems more likely that this correlation is entirely epiphenomenal. During the early part of the campaign, the Yes campaign’s polling numbers improved. During the early part of the campaign, search activity increased. But the two are not linked. Search activity is hardly likely to fall during this period.

So, Google search trends can be used to forecast elections if you have polling data, and can identify a search term which correlates with the polling data over the entire period — but these authors couldn’t.

Let’s turn to the second main claim of the paper — that the Vow had no effect on the referendum outcome. This claim is supported by a vector autoregressive model of polling averages, with different dummy variables for different days of the campaign. This is a complicated way of saying that the authors tried to see whether the Yes figure in the polls was higher on particular days.

Unless I have misunderstood the authors’ model very badly, in order to make a difference, the Vow had to produce effects on the day it was published. It does not seem to me to make any sense to assume that the effects produced by an event like this must take place on the day of the event itself.

For what it’s worth, I don’t think we have enough information to judge whether the Vow made a difference. I’m not aware of any polling questions which ask specifically about the Vow, though I’m happy to be corrected on this point. But I’m afraid that this paper doesn’t help us forecast elections, or answer substantive questions about the determinants of voting in the independence referendum.

What does a book chapter count in #REF2014?

January 28, 2015

UPDATE: The sub-panel report provides a more useful breakdown of the percentage of work by output type that was assessed as 4*. Thanks to Jane Tinkler for tweeting the link.

HEFCE’s data on REF submissions identifies a number of different submission types.

For politics, four submission types dominate:

  • Authored books
  • Edited books
  • Chapters in books
  • Journal articles

If we just knew the composition of a department’s REF2014 submission, how would we estimate its eventual GPA? Received wisdom suggests that journal articles are the gold standard, and that everything else — particularly chapters in books or edited volumes — is just making up the weight.

We can regress departmental GPAs on the percentage of outputs falling into each of these categories.

Here’s the output of that regression model for Politics and International Relations, with journal articles as the baseline category, and ignoring complications due to the double-counting of books.

Dependent variable: GPA

                      Coefficient   (Std. error)
PropBooks                  0.091      (0.643)
PropEdited                -2.985*     (1.581)
PropChapter               -1.733***   (0.591)
PropOther                 -3.904*     (2.165)
Constant                   2.863***   (0.146)

Observations                  55
R²                            0.306
Adjusted R²                   0.250
Residual Std. Error           0.298 (df = 50)
F Statistic                   5.510*** (df = 4; 50)

Note: *p<0.1; **p<0.05; ***p<0.01

The results suggest that authored books and journal articles achieve parity, but that a submission composed entirely of chapters or edited volumes would achieve a lowly GPA indeed.
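For anyone who wants to reproduce something along these lines, here is a minimal sketch of the regression. The file and column names are hypothetical; they simply mirror the terms in the table above.

```python
import pandas as pd
import statsmodels.formula.api as smf

# One row per Politics & IR submission. Hypothetical columns: GPA, plus the
# proportion of outputs that were authored books, edited books, chapters and
# "other"; journal articles are the omitted baseline category.
ref = pd.read_csv("ref2014_politics.csv")  # hypothetical file name

model = smf.ols("GPA ~ PropBooks + PropEdited + PropChapter + PropOther",
                data=ref).fit()
print(model.summary())
```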

Politics journals and #REF2014 GPAs

January 26, 2015

HEFCE has recently released full details on the submissions for the 2014 REF.

This allows us to start drawing conclusions — sound or ill-founded — about the relationship between each unit of assessment’s grade point average, and the nature of their submissions.

Following the example of Matthias Siems, who has carried out this exercise for law, I list below pseudo-GPAs for each journal. These pseudo-GPAs are taken by matching each journal article with the GPA of the submitting institution, and averaging over all articles from that journal.

I’ve excluded journals which featured fewer than ten times, and journals which featured only in one university’s submissions.
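A minimal sketch of how such pseudo-GPAs could be computed, under hypothetical file and column names:

```python
import pandas as pd

# One row per submitted journal article. Hypothetical columns:
# journal, institution, and the submitting unit's overall GPA.
outputs = pd.read_csv("ref2014_outputs.csv")  # hypothetical file name

stats = outputs.groupby("journal").agg(
    pseudo_gpa=("gpa", "mean"),
    appearances=("gpa", "size"),
    institutions=("institution", "nunique"),
)

# drop journals appearing fewer than ten times, or submitted by only one institution
stats = stats[(stats["appearances"] >= 10) & (stats["institutions"] > 1)]
print(stats.sort_values("pseudo_gpa", ascending=False).round(2))
```

The resulting list, with the number of appearances in square brackets, is below.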

  1. Journal Of Conflict Resolution – 3.22 [11 appearances]
  2. American Political Science Review – 3.16 [21 appearances]
  3. American Journal Of Political Science – 3.08 [22 appearances]
  4. Journal Of Political Philosophy – 3.05 [17 appearances]
  5. World Politics – 3.04 [13 appearances]
  6. British Journal Of Political Science – 3.03 [64 appearances]
  7. Journal Of Politics – 2.99 [33 appearances]
  8. Comparative Political Studies – 2.99 [30 appearances]
  9. International Studies Review – 2.93 [10 appearances]
  10. European Union Politics – 2.93 [10 appearances]
  11. Governance – 2.93 [14 appearances]
  12. Ethics And International Affairs – 2.93 [12 appearances]
  13. Journal Of Elections, Public Opinion And Parties – 2.91 [12 appearances]
  14. Electoral Studies – 2.9 [38 appearances]
  15. Journal Of Peace Research – 2.89 [16 appearances]
  16. European Journal Of Political Research – 2.88 [36 appearances]
  17. Political Research Quarterly – 2.88 [13 appearances]
  18. Journal Of International Relations And Development – 2.87 [14 appearances]
  19. European Journal Of International Relations – 2.86 [41 appearances]
  20. International Theory – 2.86 [11 appearances]
  21. International Studies Quarterly – 2.85 [41 appearances]
  22. Review Of International Political Economy – 2.85 [27 appearances]
  23. Political Studies – 2.84 [109 appearances]
  24. Journal Of Social Philosophy – 2.83 [11 appearances]
  25. West European Politics – 2.82 [37 appearances]
  26. Economy And Society – 2.82 [12 appearances]
  27. Journal Of European Public Policy – 2.82 [51 appearances]
  28. Millennium – 2.82 [25 appearances]
  29. Public Administration – 2.81 [25 appearances]
  30. New Political Economy – 2.8 [41 appearances]
  31. Environmental Politics – 2.78 [17 appearances]
  32. Journal Of Strategic Studies – 2.78 [24 appearances]
  33. Party Politics – 2.78 [31 appearances]
  34. Critical Review Of International Social And Political Philosophy – 2.78 [23 appearances]
  35. European Journal Of Political Theory – 2.77 [24 appearances]
  36. Political Geography – 2.77 [12 appearances]
  37. International Political Sociology – 2.77 [24 appearances]
  38. Millenium – 2.77 [13 appearances]
  39. East European Politics – 2.76 [10 appearances]
  40. Cambridge Review Of International Affairs – 2.76 [16 appearances]
  41. Review Of International Studies – 2.76 [97 appearances]
  42. International History Review – 2.74 [13 appearances]
  43. Journal Of Common Market Studies – 2.73 [33 appearances]
  44. Contemporary Political Theory – 2.73 [13 appearances]
  45. International Affairs – 2.73 [65 appearances]
  46. History Of Political Thought – 2.73 [11 appearances]
  47. Intelligence And National Security – 2.73 [13 appearances]
  48. International Relations – 2.72 [12 appearances]
  49. Political Quarterly – 2.72 [10 appearances]
  50. Government And Opposition – 2.71 [20 appearances]
  51. Policy And Politics – 2.71 [17 appearances]
  52. Globalizations – 2.7 [13 appearances]
  53. Development And Change – 2.7 [10 appearances]
  54. Democratization – 2.7 [20 appearances]
  55. British Politics – 2.69 [25 appearances]
  56. Journal Of Legislative Studies – 2.68 [11 appearances]
  57. International Peacekeeping – 2.67 [12 appearances]
  58. Security Dialogue – 2.67 [27 appearances]
  59. International Politics – 2.67 [24 appearances]
  60. Europe-asia Studies – 2.66 [22 appearances]
  61. Res Publica – 2.66 [11 appearances]
  62. Third World Quarterly – 2.65 [38 appearances]
  63. Studies In Conflict And Terrorism – 2.65 [11 appearances]
  64. Journal Of European Integration – 2.65 [13 appearances]
  65. Cooperation And Conflict – 2.64 [13 appearances]
  66. British Journal Of Politics And International Relations – 2.64 [71 appearances]
  67. Geopolitics – 2.63 [10 appearances]
  68. Journal Of International Political Theory – 2.58 [11 appearances]
  69. Journal Of Military History – 2.56 [10 appearances]
  70. Parliamentary Affairs – 2.55 [38 appearances]
  71. Contemporary Security Policy – 2.55 [14 appearances]
  72. Perspectives On European Politics And Society – 2.54 [10 appearances]
  73. Critical Studies On Terrorism – 2.51 [13 appearances]

Interested in law, courts, and methodology?

January 22, 2015

As part of the ECPR General Conference in Montreal (26 – 29 August 2015), the ECPR Standing Group on Law and Courts is organizing a panel on Data and Methods in Court research.

I’d like to invite papers to be submitted as part of this panel. I’ve pasted the description of the panel below, but let me add that this is an excellent opportunity for all those who are doing research on judicial texts-as-data, particularly in languages other than English, or for researchers dealing with large (several thousand/year) volumes of court decisions.

If you are interested in presenting, please email me, Chris Hanretty, at c.hanretty@uea.ac.uk.

The deadline for panel submission is February 16th. I’d therefore be grateful if you could let me know by February 14th whether you would like to submit a paper.

Panel description:

“The methodology of comparison is a key factor for research into law and courts. We need to carefully explore the various ways of analysing and comparing judicial politics. Beyond the traditional qualitative and quantitative divide we wish to underline the challenge of analysing judicial decisions written in different languages. Data collection and standardisation is an essential condition for successful comparative research. Papers dealing with these issues are invited.”

Election forecasting: some due credit

January 5, 2015

One characteristic which academia shares with rap music (and occasionally with house music) is the importance it places on giving proper credit. The forecasting site that I’ve built with Ben Lauderdale and Nick Vivyan, and which is featured in tonight’s edition of Newsnight, wouldn’t have been possible without lots of previous research. I’ve put some links below for those who want to follow up some of the academic research on polling and election forecasting.

(1) “Past elections tell us that as the election nears, parties which are polling well above the last general election… tend to drop back slightly”.

In the language of statistics, we find a form of regression toward the mean. We’re far from the first people to find this pattern. In the British context, the best statement of this tendency is by Steve Fisher, who has his own forecasting site. Steve’s working paper is useful for more technically minded readers.

(2) “…use all the polling data that’s out there…”

As Twitter constantly reminds us, one poll does not make a trend — we need to aggregate polls.

Most political scientists who aggregate polls are following in the footsteps of Simon Jackman, who published some very helpful code for combining polls fielded at different times with different sample sizes. We’ve had to make a fair few adjustments for the multiparty system, but there’s enough of a link to make it worth a shout out.
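Our actual model follows Jackman’s dynamic approach; the toy sketch below only illustrates the more basic point that polls should be pooled, with larger samples counting for more. File and column names are hypothetical.

```python
import pandas as pd

# One row per published poll. Hypothetical columns: date, sample_size,
# and the share recorded for one party (here "con").
polls = (pd.read_csv("polls.csv", parse_dates=["date"])
           .set_index("date")
           .sort_index())

# sample-size-weighted average of the party's share, pooled by week
weekly = (polls.assign(weighted=polls["con"] * polls["sample_size"])
               .resample("W")[["weighted", "sample_size"]].sum())
weekly["pooled_con"] = weekly["weighted"] / weekly["sample_size"]

print(weekly["pooled_con"].dropna())
```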

(3) “By matching… [subsamples of national polls] with what we know about each local area we can start to identify patterns”

Again, to give this insight its proper statistical name, this is a form of small area estimation. In political science, a lot of small area estimation is done using something called multilevel regression and post-stratification, which can be quite slow and fiddly (these are non-technical terms). Although we’ve used MRP in the past (for example, to generate estimates of how Eurosceptic each constituency is), we’ve found that you get similar results using simpler regression models. See our technical report for the gory details.

On research intensity in #REF2014

December 19, 2014

Times Higher Education has published an alternative ranking of REF results, which is based on “research intensity”.

This measure is calculated by taking the percentage of full-time equivalent eligible staff submitted to the REF, and multiplying this by the university’s grade point average.

It seems to me that “research intensity” is a metric in search of a definition.

One way in which research intensity is being interpreted (at least on Twitter!) is as an alternative way of reading the REF, “correcting for” strategic decisions of departments concerning who to submit.

I think the proper way of responding to this is not to say that decisions regarding submission were not strategic — most of them most likely were strategic.

Rather, I think we have to think about what interpreting “research intensity” in this way means.

Suppose a university has the option of submitting 100% of its 20 eligible staff, but is concerned about the likely GPA of the 20th staff member, ranked in descending order of likely GPA. Suppose that this staff member has a likely GPA (over their REF-able items) of 2. So they decide not to submit this individual. They subsequently receive a GPA of 2.50. Their research intensity score is 0.95 * 2.50 = 2.375. Is this an accurate measure of their GPA, “correcting for” strategic submission? No. By construction, their true GPA is 0.95 * 2.5 + 0.05 * 2 = 2.475. In practice, “research intensity” assumes that the GPA of the non-submitted staff member is equal to zero.

A metric which “corrects” for strategic submission would recognize that, under certain assumptions (those who make decisions regarding submission are unbiased judges of likely GPAs of items submitted; no submitting unit is at the margin where an additional staff member means an additional impact case study) the level of quality of non-submitted staff members is below the GPA actually obtained.

In this instance, we knew, by construction, that the GPA of the 20th member of staff, set at two, was less than 2.5. Generally, however, it is not clear at what level non-submitted staff lie. Given an upper bound (the GPA actually obtained) and a lower bound (zero), we might assume that the level of non-submitted staff is equidistant from each bound. This means taking the arithmetic mean. Or, as Rein Taagepera has argued, we can take the geometric mean, which is generally better for rates, and which is equal to the square root of the upper bound when the lower bound is zero.

Here’s a worked example. Let’s take Anglia Ruskin’s submission for Allied Health Professions (oh, the tyranny of alphabetical order!).

  • For this UoA, Anglia Ruskin got a GPA of 2.92. They submitted 11.3 FTE staff members, out of 122 eligible staff members. Their research intensity score is therefore 0.27.
  • Our best guess about the quality of non-submitted staff is equal to the geometric mean of zero and 2.92. In this instance, that’s just the square root of 2.92, or 1.71.
  • Our best guess about the corrected GPA is therefore 11.3 / 122 * 2.92, plus (122-11.3)/122 * 1.71, or 1.82.
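The same arithmetic, written as a small function to mirror the worked example above:

```python
from math import sqrt

def corrected_gpa(gpa, fte_submitted, fte_eligible):
    """Observed GPA for submitted staff, plus a guess for non-submitted staff
    set at the square root of the observed GPA, as discussed above."""
    share_submitted = fte_submitted / fte_eligible
    guess_unsubmitted = sqrt(gpa)
    return share_submitted * gpa + (1 - share_submitted) * guess_unsubmitted

# Anglia Ruskin, Allied Health Professions (figures from the text)
print(round(corrected_gpa(2.92, 11.3, 122), 2))  # about 1.82
```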

I have a spreadsheet containing all these “corrected” GPAs, but I’m not sufficiently confident of the HEFCE data at the unit of assessment level to release it. There are several instances of units of assessment having submitted more than 100% of their eligible staff, even after accounting for multiple submissions.

For politics, however, the data on submissions all seem to make sense.

This table shows how the rankings change. The GPAs, of course, are very similar — the Pearson correlation between the raw and “corrected” GPA scores is 0.95 or so. But the rank correlation is smaller, because, y’know, ranks are rubbish.

HE provider | Rank on GPA | “Corrected” rank
The University of Essex | 1 | 1
University College London | 5 | 2
The University of Oxford | 4 | 3
London School of Economics and Political Science | 2 | 4
The University of Warwick | 6 | 5
The University of Sheffield | 3 | 6
The University of Strathclyde | 11 | 7
The University of Edinburgh | 10 | 8
The University of Reading | 19 | 9
The University of St Andrews | 17 | 10
Aberystwyth University | 7 | 11
The University of Cambridge | 13 | 12
The University of Southampton | 14 | 13
The University of Sussex | 12 | 14
The University of York | 8 | 15
Royal Holloway and Bedford New College | 27 | 16
The University of Glasgow | 26 | 17
Brunel University London | 28 | 18
University of Nottingham | 28 | 19
Queen Mary University of London | 24 | 20
The University of Bristol | 34 | 21
Birkbeck College | 16 | 22
The University of East Anglia | 24 | 23
The University of Kent | 31 | 24
The University of Manchester | 18 | 25
The University of Exeter | 9 | 26
The University of Birmingham | 30 | 27
The School of Oriental and African Studies | 22 | 28
University of Durham | 20 | 29
The City University | 36 | 30
King’s College London | 14 | 31
University of Ulster | 38 | 32
Goldsmiths College | 40 | 33
The University of Leeds | 22 | 34
University of Newcastle-upon-Tyne | 37 | 35
The University of Leicester | 39 | 36
The University of Keele | 31 | 37
The University of Aberdeen | 41 | 38
Swansea University | 20 | 39
Oxford Brookes University | 43 | 40
The University of Westminster | 35 | 41
The University of Hull | 44 | 42
The University of Dundee | 46 | 43
The University of Bradford | 33 | 44
University of the West of England, Bristol | 48 | 45
The University of Surrey | 47 | 46
The University of Liverpool | 42 | 47
The University of Lincoln | 45 | 48
Coventry University | 49 | 49
Liverpool Hope University | 51 | 50
Canterbury Christ Church University | 52 | 51
London Metropolitan University | 50 | 52
St Mary’s University College | 53 | 53

Big winners are Bristol, Royal Holloway, Brunel. Big losers are Swansea, Exeter and King’s College — sorry, King’s, London. The degree to which institutions win and lose is, however, an entirely misleading impression created by the use of rank information.

I don’t want to imply that these rankings should be taken seriously: these are all parlour games, and I’ve now written far too much on how to construct alternative rankings of universities. Time to enjoy the rest of this train ride north.

#REF2014 spin generating spreadsheet!

December 18, 2014

Update The original HEFCE spreadsheet hid rows 5600 — 7645. When I copied across, I missed these rows. Revised rankings below.

tl;dr I made a spreadsheet which shows twelve different ways to rank your department. You can download it here.

One of the many invidious features of the REF is the way that REF results are often presented as ranks. As Ruth Dixon and Christopher Hood have pointed out, rank information both conceals information (maybe Rank 1 uni was miles ahead of Rank 2 uni), and creates the perception of large differences when the underlying information is quite similar (maybe Ranks 7 through 17 were separated only by two decimal places).

The combination of rank information with multiple assessment and weighting criteria makes this even more invidious. The most commonly seen metrics this morning have been grade point averages, or the average star rating received by each submission. However, I have also seen research power scores (grade point average times number of full-time equivalent staff submitted) and “percentage world-leading” research (that is, percentage of submissions judged 4-star).

Some of these metrics have been calculated on the basis of the overall performance, but some have been calculated on the performance in outputs. It’s also possible to imagine calculating these on the basis of impact, or on environment.

This means that universities can pick and choose between 12 different rankings (some of which don’t really make sense):

  • Rank in impact, measured by GPA
  • Rank in environment, measured by GPA
  • Rank in outputs, measured by GPA
  • Rank overall, measured by GPA
  • Rank in impact, measured by “research power”
  • Rank in environment, measured by “research power”
  • Rank in outputs, measured by “research power”
  • Rank overall, measured by “research power”
  • Rank in impact, measured by percentage world-leading
  • Rank in environment, measured by percentage world-leading
  • Rank in outputs, measured by percentage world-leading
  • Rank overall, measured by percentage world-leading

As I say, not all of these make sense: “7th most world-leading environment” is a difficult stat to make sense of, and so many might be tempted just to abbreviate to “7th best place for research in politics”, or some other bending of the truth.

In order to aid the dark arts of university public relations, I’ve produced a handy spreadsheet which shows, for each university and each unit of assessment, the different ranks. You can download it here.
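For what it’s worth, the twelve ranks are straightforward to compute from the HEFCE data. A minimal sketch, with hypothetical file and column names:

```python
import pandas as pd

# One row per institution x unit of assessment x profile (Outputs, Impact,
# Environment, Overall). Hypothetical columns: institution, uoa, profile,
# gpa, fte_submitted, pct_4star.
hefce = pd.read_csv("ref2014_results.csv")  # hypothetical file name

hefce["research_power"] = hefce["gpa"] * hefce["fte_submitted"]

# one rank per (unit of assessment, profile, metric): 4 profiles x 3 metrics
for metric in ["gpa", "research_power", "pct_4star"]:
    hefce["rank_" + metric] = (hefce.groupby(["uoa", "profile"])[metric]
                                    .rank(ascending=False, method="min"))

# e.g. all twelve ranks for one institution
print(hefce[hefce["institution"] == "University of East Anglia"]
      .set_index(["uoa", "profile"])
      .filter(like="rank_"))
```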

For my own university (University of East Anglia), and my own unit of assessment (Politics and IR), this works as follows.

  • The best rank is our rank on outputs: as judged by the percentage of research that is world-leading, we rank eighth (twelfth before the correction noted in the update above).
  • Our second-worst rank is our rank on the same criterion(!), measured by a different metric: as judged by the research power of our outputs, we rank 30th (31st before the correction). This really is a reflection of the size of our department and the number of staff we submitted for REF.

This goes to show that quantifying research excellence can give you very equivocal conclusions about which are the most excellent universities. It does not show — but I suspect this will be readily granted by the reader — that such equivocality lends itself very easily to misleading or partially truthful claims about universities’ performance.

 