25 September 2015

Really preliminary estimates

Given that we are unable to peek into voters’ minds (remember: we are trying to avoid using polls as much as possible), we need data (or proxies) for factors that might influence someone’s vote. We gathered (or created) and joined data for the 2006, 2008, and 2011 Canadian federal elections (as well as the 2015 election, which will be used for predictions) for Toronto ridings.

We’ll be explaining all this in more detail next week, but for now, here are some basics:
  • We’ve assigned leader “likeability” scores to the major party leaders in each election, using polls that ask about leadership characteristics and comparing them, by formula, to party-level polls taken around the same time. This provides a value for (or at least a proxy of) how much influence each party leader was having on their party’s showing in the polls, and should account for much of the party variation that we see from year to year. (We also use party identifiers, to identify a “base”.)
  • For all 366 candidates across the three elections, we identify two things: are they an incumbent, and are they a “star” candidate, by which we mean: would they be generally known outside of their riding? This yields 64 candidate-year incumbents (counted once per election, so an individual could be an incumbent in all three elections) and 29 candidate-year stars.
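To make the two ingredients above concrete, here is a minimal sketch in Python. The exact likeability formula isn’t given in this post, so the ratio of leader polling to party polling used here is purely an assumption, as are the `Candidate` record, its field names, and the made-up rows.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    # Hypothetical record layout; not the actual data structure we use.
    name: str
    year: int
    incumbent: bool
    star: bool


def likeability(leader_share: float, party_share: float) -> float:
    """ASSUMED score: leader polling relative to party polling.

    A value of 1.0 means the leader polls the same as their party.
    """
    if party_share <= 0:
        raise ValueError("party_share must be positive")
    return leader_share / party_share


def count_flags(candidates):
    """Count candidate-year incumbents and stars (one row per election)."""
    incumbents = sum(c.incumbent for c in candidates)
    stars = sum(c.star for c in candidates)
    return incumbents, stars


# Invented rows for illustration only. Candidate "A" runs twice as an
# incumbent, so "A" contributes two candidate-year incumbents.
candidates = [
    Candidate("A", 2006, True, False),
    Candidate("A", 2008, True, False),
    Candidate("B", 2008, False, True),
    Candidate("C", 2011, False, False),
]
incumbents, stars = count_flags(candidates)
```

Counting per candidate-year rather than per person is what lets one individual show up as an incumbent in more than one election.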

Regressing the proportion of votes received across ridings on these data yields some interesting results. First: party, leader likeability, star candidate, and incumbency are all statistically significant (as is the interaction of star candidate and incumbency). This isn’t a surprise, given the literature on what drives voters’ decisions. (Note that we haven’t yet included demographics or party platforms.)
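For readers who want to see the shape of such a model, here is a rough Python sketch of an ordinary least squares fit with a star × incumbent interaction term. It is not our actual code: the two-party simplification, the function names, the toy data, and the coefficients are all invented for illustration and are not our estimates.

```python
from itertools import product


def design_row(party, likeability, star, incumbent):
    """Intercept, Liberal dummy (NDP as baseline), likeability, star,
    incumbent, and the star x incumbent interaction."""
    s, i = float(star), float(incumbent)
    liberal = 1.0 if party == "Liberal" else 0.0
    return [1.0, liberal, likeability, s, i, s * i]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ols(rows, y):
    """Least squares via the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# MADE-UP coefficients and noiseless toy data, used only to check that
# the fit recovers what went in.
TRUE_COEFS = [10.0, 5.0, 12.0, 8.0, 9.0, -6.0]
rows = [
    design_row(p, l, s, i)
    for p, l, s, i in product(
        ["NDP", "Liberal"], [0.8, 1.0, 1.2], [False, True], [False, True]
    )
]
y = [sum(c * x for c, x in zip(TRUE_COEFS, row)) for row in rows]
fitted = ols(rows, y)
```

A negative interaction coefficient, as in this toy setup, is one way a model can say that being both a star and an incumbent adds less than the two effects summed.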


Breaking down the results: Being a star candidate or an incumbent (but not both) adds about 20 points right off the top, so name recognition obviously matters a lot. Likeability matters too; a leader who essentially polls the same as their party is worth about 14 points to their candidates. (As an example of what this means, Stephane Dion lost the average Liberal candidate in Toronto about 9 points relative to Paul Martin. Alternatively, in 2011, Jack Layton added about 16 points more to NDP candidates in Toronto than Michael Ignatieff did for equivalent Liberal candidates.) Finally, party base matters too: for example, being an average Liberal candidate in Toronto adds about 17 points over the equivalent NDP candidate. (We expect some of this will be explained with demographics and party platforms.)

To be clear, these are average results, so we can’t yet use them effectively for predicting individual riding-level races (that will come later). But, if we apply them to all 2015 races in Toronto and aggregate across the city, we would predict voting proportions very similar to the results of a recent poll by Mainstreet (if undecided voters split proportionally).
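The “undecided voters split proportionally” assumption can be sketched in a few lines of Python. The party labels and percentages below are invented for illustration; they are not the Mainstreet numbers or our predictions.

```python
def split_undecided(poll, undecided_key="Undecided"):
    """Reallocate the undecided share to parties in proportion to
    their decided support."""
    undecided = poll.get(undecided_key, 0.0)
    decided = {k: v for k, v in poll.items() if k != undecided_key}
    total = sum(decided.values())
    if total <= 0:
        raise ValueError("poll has no decided support to split against")
    return {k: v + undecided * v / total for k, v in decided.items()}


# Invented poll, in percent.
poll = {"Party A": 40.0, "Party B": 30.0, "Party C": 10.0, "Undecided": 20.0}
adjusted = split_undecided(poll)
```

Here Party A holds half of the decided support, so it picks up half of the 20 undecided points. The ranking of the parties never changes under this rule; only the levels do.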


Given that we haven’t used polls or included localized details or party platforms, this agreement is striking, and gives us a lot of confidence that we’re making real progress in understanding voter behaviour (at least in Toronto).

18 September 2015

Data for federal elections

Analyzing the upcoming federal election requires collecting and integrating new data. This is often the most challenging part of any analysis, and we've committed significant effort to obtaining good data for federal elections in Toronto's electoral districts.

Clearly, the first place to start was with Elections Canada and the results of previous general elections. These are available for download as collections of Excel files, which aren't the most convenient format. So, our toVotes package has been updated to include results from the 2006, 2008, and 2011 federal elections for electoral districts in Toronto. The toFederalVotes data frame provides the candidate's name, party, whether they were an incumbent, and the number of votes they received by electoral district and poll number. Across the three elections, this amounts to 82,314 observations.
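To give a feel for how poll-level rows like these roll up, here is a small Python sketch that sums votes by electoral district and converts them to vote proportions. It does not use the toVotes package; the tuple layout, district name, and vote counts are hypothetical stand-ins for the fields described above.

```python
from collections import defaultdict

# Hypothetical rows:
# (district, poll number, candidate, party, incumbent, votes).
# All values are invented for illustration.
rows = [
    ("District X", 1, "Candidate A", "Party 1", True, 60),
    ("District X", 1, "Candidate B", "Party 2", False, 40),
    ("District X", 2, "Candidate A", "Party 1", True, 30),
    ("District X", 2, "Candidate B", "Party 2", False, 70),
]


def district_shares(rows):
    """Sum poll-level votes per candidate and district, then divide by
    the district total to get each candidate's vote proportion."""
    votes = defaultdict(int)
    totals = defaultdict(int)
    for district, _poll, candidate, _party, _incumbent, n in rows:
        votes[(district, candidate)] += n
        totals[district] += n
    return {key: n / totals[key[0]] for key, n in votes.items()}


shares = district_shares(rows)
```

Keeping the poll number on every row is what makes finer geographic aggregation possible later on.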

Connecting these voting data with other characteristics requires knowing where each electoral district and poll are in Toronto. So, we created spatial joins among datasets to integrate them (e.g., combining demographics from census data with the vote results). Shapefiles for each of the three federal elections are available for download, but the location identifiers aren't a clean match between the Excel files and the shapefiles. Thanks to some help from Elections Canada, we were able to translate the location identifiers and join the voting data to the election shapefiles. This gives us close to 4,000 poll locations across 23 electoral districts in each year. We then used the census shapefiles to aggregate these voting data into 579 census tracts. These tracts are relatively stable and give us a common geographical classification for all of our data.
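As a simplified illustration of the spatial-join idea, the Python sketch below assigns each poll to a census tract with a point-in-polygon test and then sums votes by tract. Treating a poll as a single point is an assumption made for brevity (real poll and tract geometries are polygons read from shapefiles), and the square tracts, coordinates, and vote counts are invented.

```python
from collections import defaultdict


def point_in_polygon(x, y, polygon):
    """Ray-casting test: count how many polygon edges a horizontal ray
    from (x, y) crosses; an odd count means the point is inside."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the ray's height
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Invented geometry: two adjacent unit squares standing in for tracts.
tracts = {
    "T1": [(0, 0), (1, 0), (1, 1), (0, 1)],
    "T2": [(1, 0), (2, 0), (2, 1), (1, 1)],
}

# Invented polls: (x, y, votes).
polls = [(0.5, 0.5, 120), (1.5, 0.5, 80), (0.25, 0.75, 50)]


def votes_by_tract(polls, tracts):
    """Assign each poll to the first tract containing it and sum votes."""
    totals = defaultdict(int)
    for x, y, votes in polls:
        for tract_id, polygon in tracts.items():
            if point_in_polygon(x, y, polygon):
                totals[tract_id] += votes
                break
    return dict(totals)


tract_totals = votes_by_tract(polls, tracts)
```

In practice a GIS library would handle projections, boundary cases, and polls that straddle more than one tract, none of which this sketch attempts.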

This work is currently in the experimental fed-geo branch of the toVotes package and will be pulled into the main branch soon. Now, with votes aggregated into census tracts, we can use the census data for Toronto in our toCensus package to explore how demographics affect voting outcomes.

Getting the data to this point was more work than we expected, but well worth the effort. We're excited to see what we can learn from these data and look forward to sharing the results with you.

17 September 2015

Back? We never left.

A number of people have been asking whether we are going to analyze the upcoming federal election on October 19, like we did for the Toronto mayoral race last year. The truth is, we never stopped working after the mayoral race, but are back with a vengeance for the next five weeks.
We have gathered tonnes of new data and refined our methodology. We have also established a new domain name: psephoanalytics.ca. You can still subscribe to email updates here, or follow us on Twitter @psephoanalytics. Finally, if you’d like to chat directly, please email us at psephoanalytics@gmail.com.

Stay tuned for lots of updates over the coming weeks, culminating in some predictions for Toronto ridings prior to October 19.