Forecasting multiparty by-elections using Dirichlet regression


Chris Hanretty


October 1, 2021

Figure: Model performance over the next ten by-elections, for models estimated on progressively larger windows of data from 1945 onwards.


By-elections, or special elections, play an important role in many democracies – but whilst there are multiple forecasting models for national elections, there are no such models for multiparty by-elections. Multiparty by-elections present particular analytic problems related to the compositional character of the data and structural zeros where parties fail to stand. I model party vote shares using Dirichlet regression, a technique suited for compositional data analysis. After identifying predictor variables from a broader set of candidate variables, I estimate a Dirichlet regression model using data from all post-war by-elections in the UK (n=468). The cross-validated error of the model is comparable to the error of costly and infrequent by-election polls (MAE: 4.0 compared to 3.6 for polls). The steps taken in the analysis are in principle applicable to any system that uses by-elections to fill legislative vacancies.
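To make the technique concrete, the following is a minimal sketch of a Dirichlet regression fitted by maximum likelihood. It is not the paper's code or its specification. The log link on each concentration parameter, the single made-up predictor, the synthetic three-party data, and the function names (`dirichlet_neg_loglik`, `fit_dirichlet`, `predict_shares`) are all assumptions chosen for illustration. The sketch also sidesteps structural zeros by assuming every party stands in every contest, and it reports an in-sample MAE rather than the cross-validated figure quoted above.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import gammaln


def dirichlet_neg_loglik(beta_flat, X, Y):
    """Negative Dirichlet log-likelihood with a log link on each alpha."""
    k, c = X.shape[1], Y.shape[1]
    beta = beta_flat.reshape(k, c)
    alpha = np.exp(X @ beta)  # log link keeps every alpha positive
    loglik = (
        gammaln(alpha.sum(axis=1))
        - gammaln(alpha).sum(axis=1)
        + ((alpha - 1.0) * np.log(Y)).sum(axis=1)
    )
    return -loglik.sum()


def fit_dirichlet(X, Y):
    """Return a (predictors x parties) coefficient matrix fitted by BFGS."""
    k, c = X.shape[1], Y.shape[1]
    start = np.zeros(k * c)
    result = minimize(dirichlet_neg_loglik, start, args=(X, Y), method="BFGS")
    return result.x.reshape(k, c)


def predict_shares(beta, X):
    """Expected vote shares: each alpha divided by the row total."""
    alpha = np.exp(X @ beta)
    return alpha / alpha.sum(axis=1, keepdims=True)


# Synthetic data (purely illustrative): 200 contests, 3 parties,
# an intercept plus one made-up predictor.
rng = np.random.default_rng(0)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n)])
true_beta = np.array([[1.0, 1.5, 0.5],
                      [0.5, -0.5, 0.0]])
true_alpha = np.exp(X @ true_beta)
Y = np.vstack([rng.dirichlet(a) for a in true_alpha])

# Guard against log(0); real structural zeros need separate treatment.
Y = np.clip(Y, 1e-6, None)
Y = Y / Y.sum(axis=1, keepdims=True)

beta_hat = fit_dirichlet(X, Y)
pred = predict_shares(beta_hat, X)

# In-sample mean absolute error, in percentage points of vote share.
mae = np.abs(pred - Y).mean() * 100
print(f"In-sample MAE: {mae:.2f} percentage points")
```

Because the predicted shares are the normalised concentration parameters, they are positive and sum to one by construction, which is the property that makes the Dirichlet a natural fit for compositional vote-share data.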


The version accepted by the journal is available here. The (gated) version of record is available here.

Replication data

Replication data is available at Harvard Dataverse.


Hanretty, Chris. 2021. “Forecasting Multiparty by-Elections Using Dirichlet Regression.” International Journal of Forecasting 37 (4): 1666–76.