Thursday, October 27, 2016

Predictive Modeling of the US Elections

The 2016 US election is a great place to understand some of the intricacies of using predictive modeling in complex systems. Nate Silver recently published a post on the latest shift in the polls and its impact on the overall election model [1]. At the very end of the post, Nate compares Donald Trump's chances of winning to the chance of losing a game of Russian roulette (roughly one in six with a standard six-chamber revolver).

The one thing I really like about the US modeling efforts is that the process is quite transparent. I feel this is the best way to do it. That way no one can claim you did something to fudge the results.

In Nate's models [2], each poll is taken as an input and assigned a credibility score. The credibility score is the one place where everyone can complain, but the scoring is an open process [3]. To earn a high credibility score, a polling organization has to put its data and analysis up for public review. The firm also has to show consistently good agreement between its polls and the eventual outcome. Each poll is assigned a margin of error so that its agreement with real-world outcomes can be evaluated.
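To make the idea concrete, here is a minimal sketch of credibility-weighted poll averaging. This is not the actual model from [2]. The `Poll` fields, the 0-to-1 credibility scale, and the numbers are all hypothetical, and applying the margin of error directly to the lead is a simplification.

```python
from dataclasses import dataclass


@dataclass
class Poll:
    pollster: str       # hypothetical pollster name
    lead: float         # candidate A's lead over B, in percentage points
    credibility: float  # assumed weight between 0 and 1 (not a real rating)
    moe: float          # margin of error, in points, applied to the lead


def weighted_lead(polls):
    """Average the leads, weighting each poll by its credibility score."""
    total_weight = sum(p.credibility for p in polls)
    if total_weight == 0:
        raise ValueError("at least one poll needs a non-zero credibility")
    return sum(p.lead * p.credibility for p in polls) / total_weight


def within_error(poll, actual_lead):
    """Check whether the real outcome fell inside the poll's margin of error."""
    return abs(poll.lead - actual_lead) <= poll.moe


# Made-up numbers, purely for illustration.
polls = [
    Poll("Pollster A", lead=4.0, credibility=1.0, moe=3.0),
    Poll("Pollster B", lead=-2.0, credibility=0.5, moe=4.0),
]
```

With these made-up inputs, the more credible poll pulls the average toward its own result. A rating system could then use something like `within_error` after the election to decide whether a pollster's credibility should go up or down.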

While the details may be too dense for the average reader to parse, the key point to recognize is that the polling agencies and the predictive modelers only gain from matching their predictions to the outcome. This vested interest drives them towards more accuracy. Whatever the biases of the pollsters or the modelers, if their predictions don't match reality, they won't get paid next time. And the rating system ensures that this accuracy evaluation is performed almost continuously. IMO this is the best it can get.

A poll or model can't guarantee any outcome, so there is no reason for any side to lose heart at the sight of one. Still, polls and models are at the core of how a campaign manages its resources internally. If polling data and modeling consistently show a deficit in a particular geographical region, and that region is strategically important, then the campaign has to find ways of reaching out to the people who live there.

The Clinton and Trump campaigns are diametrically opposed in the way they approach polling data and the subsequent modeling.

The Clinton campaign uses a very data-driven approach. The model is most likely an old, aggressively tested one from past elections, and the handling of the polling data (i.e. the credibility scoring system) is very nuanced. Resources are focused where the data says there is some possibility of a significant return. The Clinton campaign behaves like a well-managed mutual fund on Wall Street.

The Donald Trump campaign does not seem to be interested in anything detailed. The polls are seen merely as a publicity vehicle: if they favor Donald Trump, the campaign can't stop talking about them, but if they don't, the entire campaign dismisses them as being "rigged". It is not clear to me if there is even a higher-level data model. Resources are allocated based on Donald's gut feelings. The Trump campaign operates like a small family-managed investment firm.

It makes sense in a way, I suppose. Hillary Clinton is a career politician who is used to having every minute of her existence scrutinized and dissected by everyone. Donald Trump runs a small family firm that leverages risky investments off a guerrilla marketing strategy.

