Ronald MacDonald and Xuxin Mao, of the University of Glasgow, have published a working paper looking at Google search activity and the Scottish independence referendum.
The paper has attracted media attention, in particular because it claims
- that search trends can be used to predict election outcomes, and
- that the “Vow” had no effect on the vote.
It’s rather unfortunate that this paper has received so much media attention, because it’s a very, very bad paper. It
- is poorly written (Ipsos Mori features as “Dipso Mori”: clearly a pollster who has had a bit too much to drink)
- misrepresents the state of the literature on election forecasting using search activity
- bandies around concepts like “clear information”, “rationality”, and “emotion” with scant regard for the meaning of those words
- does not attempt to examine other sources of information like the British Election Study.
Let’s take the first main claim made by the paper: that search activity can be used to predict elections. How?
The first thing to note is that the authors use information on search activity to try to predict a variable which they label “Potential Yes votes”. Those who read the paper will realize that “Potential Yes votes” is actually a rolling average of polls. So the authors are using search activity to predict polling numbers. Without some polling data, you cannot use search trends to predict elections.
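For readers who have not met the term, a rolling average just smooths a series of poll readings by averaging each one with its recent predecessors. A minimal sketch follows; the poll figures are invented, and the three-poll trailing window is my assumption, not something taken from the paper.

```python
def rolling_mean(values, window):
    """Trailing mean over the last `window` values (fewer at the start)."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


# Invented Yes shares (per cent) from six successive polls.
yes_share = [38.0, 41.0, 39.0, 43.0, 42.0, 45.0]

# The smoothed series is still made of polls and nothing else.
smoothed = rolling_mean(yes_share, window=3)
print(smoothed)
```

Whatever window is used, the target variable remains a summary of polling data.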
There are situations where using search activity to predict polling numbers is useful. Some countries (Italy, Greece) ban the publication of polls in the run-up to elections. I can imagine that “predicting” polling numbers would be useful in these contexts.
But any such exercise in forecasting will ultimately rely on polling data. If the polling data suffer from a systematic bias, forecasts based on search activity will suffer from the same bias.
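A toy example makes the point concrete. In the sketch below every number is invented, and I make the assumption most generous to the search-based approach: search volume tracks true support perfectly. The polls are assumed to overstate support by three points, and the model is an ordinary least-squares line fitted to those polls.

```python
def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Invented "true" support, and a search index that tracks it exactly.
true_support = [40.0, 41.0, 42.0, 43.0, 44.0, 45.0]
search_index = [2.0 * s - 50.0 for s in true_support]

# Assumption: every poll overstates support by the same three points.
poll_bias = 3.0
polls = [s + poll_bias for s in true_support]

# Train on the (biased) polls, because polls are all we can observe.
slope, intercept = fit_line(search_index, polls)

# Forecast a later week in which true support is 46.
new_true = 46.0
forecast = slope * (2.0 * new_true - 50.0) + intercept
forecast_error = forecast - new_true
print(forecast, forecast_error)
```

Even with a perfect search signal, the forecast misses the truth by the size of the polling bias, because the model was only ever taught to reproduce the polls.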
The second thing to note is that the authors are attempting to use searches for particular terms to predict polling numbers. In the paper, they try two terms: “Alex Salmond” and “SNP”. Their assumption is that searching for these terms will be correlated with voting Yes or, equivalently, that weeks in which there are more searches for Alex Salmond will be weeks in which the Yes campaign is doing better in the polls.
Unfortunately, the authors themselves show that in the latter part of the period under study, there is in fact no correlation between the volume of searches for Alex Salmond and the Yes vote. The authors write:
“the effect of Google Trends on Potential Yes Votes became insignificant after 15th March 2014. Based on the testing criteria on clear information, voters in the Scottish referendum encountered difficulties in finding enough clear information to justify a decision to vote Yes”.
In other words, because the authors assume that there is a relationship between particular types of searches and Yes voting, the fact that that relationship breaks down becomes evidence not that this was a poor assumption to begin with, but rather that voters faced difficulty in finding information supporting a Yes vote.
I struggle to accept this reasoning. The only justification I can see for assuming that searching for these terms will be correlated with voting Yes is the significant correlation during the first period under study. But it seems more likely that this correlation is entirely epiphenomenal. During the early part of the campaign, the Yes campaign’s polling numbers improved. During the early part of the campaign, search activity increased. But the two are not linked. Search activity was hardly likely to fall during this period, whatever was happening in the polls.
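It is easy to show how two unrelated series that both drift upwards can look strongly correlated. The sketch below uses simulated data, not the paper’s: the trends, noise levels, and 200-day length are all my assumptions. It compares the correlation of the levels with the correlation of the day-to-day changes.

```python
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def diffs(series):
    """Day-to-day changes of a series."""
    return [b - a for a, b in zip(series, series[1:])]


rng = random.Random(0)  # fixed seed so the run is repeatable
days = range(200)

# Two series that share nothing except an upward drift (invented numbers).
polls = [35.0 + 0.1 * t + rng.gauss(0, 1.0) for t in days]
searches = [20.0 + 0.2 * t + rng.gauss(0, 2.0) for t in days]

level_corr = pearson(polls, searches)
change_corr = pearson(diffs(polls), diffs(searches))
print(round(level_corr, 2), round(change_corr, 2))
```

The levels correlate strongly because both series trend upwards; the changes, which is where a real link would have to show up, are close to uncorrelated.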
So, Google search trends can be used to forecast elections if you have polling data, and can identify a search term which correlates with the polling data over the entire period — but these authors couldn’t.
Let’s turn to the second main claim of the paper — that the Vow had no effect on the referendum outcome. This claim is supported by a vector autoregressive model of polling averages, with different dummy variables for different days of the campaign. This is a complicated way of saying that the authors tried to see whether the Yes figure in the polls was higher on particular days.
Unless I have misunderstood the authors’ model very badly, in order to make a difference, the Vow had to produce effects on the day it was published. It does not seem to me to make any sense to assume that the effects produced by an event like this must take place on the day of the event itself.
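To see why the timing assumption matters, here is a toy comparison. It is emphatically not the authors’ vector autoregression: it is a simplified before-and-after calculation on an invented, noise-free series, in which a hypothetical event on day 20 moves the Yes share gradually over the following week.

```python
baseline = 45.0   # invented Yes share before the event (per cent)
event_day = 20    # hypothetical day on which the event happens
n_days = 30


def effect(day):
    """Assumed effect: nothing on the day itself, then a shift building
    by half a point per day until it levels off at three points."""
    if day <= event_day:
        return 0.0
    return -min(3.0, 0.5 * (day - event_day))


yes_share = [baseline + effect(day) for day in range(n_days)]

# What a same-day test looks for: a jump on the event day itself.
same_day_jump = yes_share[event_day] - yes_share[event_day - 1]

# A wider view: mean of the week after minus mean of the week before.
week_before = yes_share[event_day - 7:event_day]
week_after = yes_share[event_day + 1:event_day + 8]
window_shift = sum(week_after) / 7 - sum(week_before) / 7

print(same_day_jump, round(window_shift, 2))
```

In this made-up series the same-day test finds nothing at all, while the week-on-week comparison picks up a shift of nearly two points. A test that only looks at the day of publication cannot distinguish “no effect” from “an effect that took a few days to appear”.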
For what it’s worth, I don’t think we have enough information to judge whether the Vow made a difference. I’m not aware of any polling questions which ask specifically about the Vow, though I’m happy to be corrected on this point. But I’m afraid that this paper doesn’t help us forecast elections, or answer substantive questions about the determinants of voting in the independence referendum.