09 June 2020

Comparing predicted to actual votes for the 2019 Federal election

Our predictions for the 2019 Federal race in Toronto were generated by our agent-based model that uses demographic characteristics and results from previous elections. Now that the final results are available, we can see how our predictions performed at the Electoral District level.

For this analysis, we restrict the comparison to just the major parties, as they were the only parties for which we estimated vote share. We also only compare the actual results to the predictions of our base scenario. In the future, our work will focus much more on scenario planning to explain political campaigns.

We start by plotting the difference between the predicted and actual vote shares (predicted minus actual) for each party in each district.
[Figure: Distribution of the difference between the predicted and actual proportion of votes for all parties]
The mean absolute difference from the actual results is 5.3 percentage points. The median difference is 1.28 percentage points, which means that in the typical case we slightly overestimated a party's support. However, as the histogram shows, the difference varies substantially across districts and parties. Our largest overestimate was 15.6 points and our largest underestimate was 18.5 points (a difference of -18.5).
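As a rough illustration of how these summary statistics can be computed, here is a minimal Python sketch. The function name `summarize_differences` and the sample numbers are made up for this example and are not our model's data or code. It assumes each difference is the predicted share minus the actual share, in percentage points.

```python
from statistics import mean, median


def summarize_differences(predicted, actual):
    """Summarize predicted-minus-actual vote shares (percentage points)."""
    diffs = [p - a for p, a in zip(predicted, actual)]
    return {
        "mean_abs": mean(abs(d) for d in diffs),  # typical size of the miss
        "median": median(diffs),                  # sign shows the usual direction
        "max_over": max(diffs),                   # largest overestimate
        "max_under": min(diffs),                  # largest underestimate (most negative)
    }


# Hypothetical vote shares (percent), for illustration only.
predicted = [40.0, 30.0, 20.0, 10.0]
actual = [45.0, 28.0, 19.0, 8.0]
summary = summarize_differences(predicted, actual)
print(summary)
```

Reporting the mean absolute difference alongside the signed median is deliberate: the first measures how large the misses are, while the second shows which direction they tend to go.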

To better understand this variation, we can look at a plot of the geographical distribution of the differences. In this figure, we show each party separately to illuminate the geographical structure of the differences.
[Figure: Geographical distribution of the difference between the predicted and actual proportion of votes by Electoral District and party]

The overall distribution of differences doesn’t have a clear geographical bias. This is good news, as it suggests our agent-based model isn’t systematically biased toward any particular Electoral District.

However, our model does appear to generally overestimate NDP support while underestimating Liberal support. These slight biases are important indicators for us in recalibrating the model.
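A per-party bias like this can be checked by averaging the signed differences for each party. The sketch below is one way to do that in Python. The helper `mean_bias_by_party` and the rows of data are hypothetical, chosen only to show the calculation. A positive value means the party was overestimated on average and a negative value means it was underestimated.

```python
from collections import defaultdict
from statistics import mean


def mean_bias_by_party(rows):
    """Average predicted-minus-actual share per party (percentage points)."""
    grouped = defaultdict(list)
    for party, predicted, actual in rows:
        grouped[party].append(predicted - actual)
    return {party: mean(diffs) for party, diffs in grouped.items()}


# Hypothetical (party, predicted %, actual %) rows, one per district.
rows = [
    ("Liberal", 40.0, 44.0),
    ("Liberal", 35.0, 37.0),
    ("NDP", 25.0, 22.0),
    ("NDP", 20.0, 19.0),
]
bias = mean_bias_by_party(rows)
print(bias)
```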

Overall, we’re very happy with a mean absolute error of around 5 percentage points. As described earlier, our primary objective is to explain political campaigns. Accurate predictions are useful for this objective, but they aren’t the primary concern. Rather, we’re much more interested in using the model that we’ve built to explore different scenarios and help design political campaigns.