- First, we use state-by-state polling data to calculate the win/loss probability of each state for a given candidate.
- Then, we use those computed probabilities for each state to create a distribution (a histogram) showing the probability of the given candidate winning each possible number of electoral votes from 0 to 538.

## Evaluation of the Distribution of Electoral College Votes

The procedure below computes the probability distribution of a given candidate’s predicted electoral votes. To make the definitions concrete, we use Obama’s electoral votes in the 2012 election as the running example. (Romney’s count is then 538 minus Obama’s.)

We define the following quantities:

\(k\) is the index for each state.

So \(k = 1, 2, 3, ..., 51\) (50 states plus D.C.; we have not yet accounted for Maine and Nebraska splitting their votes). The states are ordered alphabetically, so \(k = 1\) corresponds to Alabama and \(k = 51\) corresponds to Wyoming.

\(n_k\) is the number of electoral votes for state \(k\).

For example, \(n_1 = 9\) because Alabama has 9 electoral college votes.

\(q_k\) is the probability that Obama will win state \(k\) by getting more than 50% of the popular vote in that state.

This is also the probability that Obama will win \(n_k\) electoral votes from that state.

\(P(j, S)\) is the probability that Obama wins \(j\) electoral votes from the first \(S\) states.

When \(S = 51\), we have included all the states in our calculation.

\(P(j, 51)\), where \(j\) is \(0, 1, 2, ..., 538\), is the distribution of Obama's electoral vote count.

The probability that Obama wins the presidency is thus given by the summation \(\sum_{j=270}^{538} P(j, 51)\).

Similarly, the probability that Romney wins the presidency is given by the summation \(\sum_{j=0}^{268} P(j, 51)\).

And the probability of a tie is given by \(P(269, 51)\).

The consequences of the case where no candidate wins 270 electoral votes are covered in Appendix A at the end of this document.

We assume that the probability that Obama wins in each state is independent of the outcomes in other states. This assumption is plausible only if we condition the state-by-state outcomes on polls that factor in changes of opinion from the most recent events (a change in the European economic crisis, for example, a release of domestic economic data, or even Mitt Romney’s debate performance), since we are then using the most current information to compute each \(q_k\).

Under this independence assumption, the following recursion computes \(P(j, k)\) iteratively once \(P(j, k - 1)\) is known:

\(P(j, k) = q_k P(j - n_k, k - 1) + (1 - q_k)P(j, k -1) \text{ where } j = 0, 1, ..., 538 \text{ and } k = 1, 2, ..., 51 \text{.} \)

The initial conditions for the recursion are as follows:

\(P(0, 0) = 1 \text{ and } P(j, 0) = 0 \text{ for } j \ne 0 \text{.}\)

\(P(j, k) = 0 \text{ for } j < 0 \text{.}\)

This recursion implicitly “aggregates” all the relevant probabilities without having to exhaustively examine all combinations of win/loss scenarios.
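The recursion and its initial conditions can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `electoral_vote_distribution` and the three-state toy inputs are assumptions made up for the example and are not real polling data.

```python
def electoral_vote_distribution(votes, win_probs):
    """Return dist, where dist[j] is the probability of winning exactly j votes.

    votes[k] plays the role of n_k and win_probs[k] the role of q_k.
    """
    if len(votes) != len(win_probs):
        raise ValueError("votes and win_probs must have the same length")
    total = sum(votes)
    # Initial conditions: P(0, 0) = 1 and P(j, 0) = 0 for j != 0.
    dist = [0.0] * (total + 1)
    dist[0] = 1.0
    for n_k, q_k in zip(votes, win_probs):
        new = [0.0] * (total + 1)
        for j in range(total + 1):
            lose = (1.0 - q_k) * dist[j]
            # P(j - n_k, k - 1) is zero whenever j - n_k < 0.
            win = q_k * dist[j - n_k] if j >= n_k else 0.0
            new[j] = win + lose
        dist = new
    return dist


# Toy example with three hypothetical states (illustrative numbers only).
toy_votes = [3, 4, 5]
toy_probs = [0.5, 0.2, 0.9]
toy_dist = electoral_vote_distribution(toy_votes, toy_probs)

majority = sum(toy_votes) // 2 + 1  # 7 of the 12 toy votes
p_win = sum(toy_dist[majority:])    # analogous to summing P(j, 51) for j >= 270
print(p_win)
```

For 51 states and 538 votes the two nested loops do roughly 51 × 539 multiplications, compared with the \(2^{51}\) win/loss combinations an exhaustive enumeration would visit.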

## Calculating a Candidate’s Win/Loss Probability for a State

We will not carry the state index \(k\) in this section, but the analysis shown here is carried out for each state.

We do not know what the outcome of the vote in each state will be, so we do not know how many electoral votes Obama will receive from that state. We therefore model the proportion of the state’s popular vote that goes to Obama with a probability density function. Denote this unknown proportion by \(x\), which lies in the interval \((0, 1)\). We use conditional probability (Bayes’ rule) to update the probability density function for \(x\) based on polling data for a particular state. If we know this density, denoted \(f(x)\), then we can compute the probability that Obama wins the state, i.e. \(q\), by evaluating \(\int_{x = 0.5}^{1} f(x) dx \).

This integral simply calculates the probability that the proportion of the popular vote for Obama is above 50%.

A natural density function to use is the beta distribution, which provides easy updating. A beta distribution is determined by two parameters: \(A\) and \(B\). The parameters \(A\) and \(B\) correspond naturally to the polling data: \(A\) reflects how many respondents in the sample survey favor one candidate, and \(B\) reflects how many respondents in the sample survey favor the other candidate. The form of the beta distribution that we are using is this: \(f(x | A, B) = K(A, B)x^{A-1}(1 - x)^{B-1} \text{ for } 0 \le x \le 1\).

\(K(A, B)\) is a normalizing constant so that the density integrates to one. This is a two-candidate model with \(A\) being the number of respondents in the survey favoring Obama, and \(B\) favoring Romney. Once the parameters \(A\) and \(B\) are determined, we can integrate to obtain \(q\).
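As a rough sketch of the integration step, the following Python computes \(q\) by a midpoint-rule approximation of the beta density between 0.5 and 1. The function name `beta_win_probability`, the `steps` parameter, and the 520/480 poll split are assumptions for illustration. The text does not say which numerical method was actually used, and a library routine for the regularized incomplete beta function would serve equally well.

```python
import math


def beta_win_probability(a, b, steps=20000):
    """Approximate q, the integral of the Beta(a, b) density from 0.5 to 1.

    Uses the midpoint rule; accuracy is best when a >= 1 and b >= 1,
    since the density is then bounded on the integration range.
    """
    if a <= 0 or b <= 0:
        raise ValueError("a and b must be positive")
    # log K(A, B), computed with log-gamma to avoid overflow for large counts.
    log_k = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    width = 0.5 / steps
    total = 0.0
    for i in range(steps):
        # Midpoint of the i-th slice; always strictly between 0.5 and 1.
        x = 0.5 + (i + 0.5) * width
        log_density = (log_k
                       + (a - 1.0) * math.log(x)
                       + (b - 1.0) * math.log(1.0 - x))
        total += math.exp(log_density)
    return total * width


# Hypothetical poll: 520 respondents for Obama, 480 for Romney.
q_example = beta_win_probability(520, 480)
print(q_example)
```

Working in log space keeps the normalizing constant from overflowing when \(A\) and \(B\) are in the hundreds or thousands.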

There still remains the aggregation of many polls of different vintages to determine \(A\) and \(B\). The traditional application of conditional probability using the beta distribution and random sampling simply adds all the poll results. This is a naive approach because the population proportion is not, and should not be, static: it is subject to change as a result of current events. We use a simple heuristic to aggregate all the polling results (separately for each state), applying an exponential decay to discount older polling data; the discount is a function of the age of a poll, relative to the most recent poll. Consider a series of polls in which \(m_t\) is the number of Obama supporters in the poll conducted at time \(t\). Suppose the current time is \(c\); then we compute the parameter \(A(c)\) as follows:

\[A(c) = \sum_{t \le c}e^{\frac{-(c-t)}{h}}m_t\]

The constant \(h > 0\) can be viewed as a decay parameter: it modulates how quickly past poll data are discounted. If \(h\) is large, the decay is slow, meaning that past data are discounted slowly. A small \(h\), on the other hand, discounts older poll results heavily. At the smallest values of \(h\), effectively only the most recent poll is counted. We can similarly compute the \(B(c)\) parameter by keeping track of the number of Romney supporters in the polls.
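The decay-weighted sums for \(A(c)\) and \(B(c)\) can be sketched as follows. The function name `decayed_counts`, the tuple layout of each poll, and the two sample polls are hypothetical. Time is assumed to be measured in days, though any unit works as long as \(t\), \(c\), and \(h\) share it.

```python
import math


def decayed_counts(polls, current_time, h):
    """Return (A(c), B(c)) from polls given as (t, obama_count, romney_count)."""
    if h <= 0:
        raise ValueError("h must be positive")
    a = 0.0
    b = 0.0
    for t, obama, romney in polls:
        if t > current_time:
            continue  # the sum only runs over polls with t <= c
        weight = math.exp(-(current_time - t) / h)
        a += weight * obama
        b += weight * romney
    return a, b


# Two made-up polls: one taken on day 0 and one on day 10 (the current day).
sample_polls = [(0, 480, 520), (10, 510, 490)]
a_ex, b_ex = decayed_counts(sample_polls, current_time=10, h=10)
print(a_ex, b_ex)
```

With `h=10` the ten-day-old poll is weighted by \(e^{-1} \approx 0.37\). Shrinking `h` toward zero leaves essentially only the day-10 poll, while a very large `h` approaches the plain sum of both polls.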

Since using the raw number of supporters in a poll to compute the beta parameters (\(A\) and \(B\)) tends to provide an extreme win/loss probability, we use a scaling factor to modify the beta parameters while keeping their relative proportionality intact.
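A short worked check shows why such a scaling softens the probabilities. Suppose, purely as an assumption (the value and form of the scaling factor are not stated here), that both parameters are multiplied by a common factor \(s\) with \(0 < s < 1\). Using the standard mean and variance of a beta distribution:

\[\text{mean: } \frac{sA}{sA + sB} = \frac{A}{A + B}\]

\[\text{variance: } \frac{(sA)(sB)}{(sA + sB)^2 (sA + sB + 1)} = \frac{AB}{(A + B)^2 \left(s(A + B) + 1\right)}\]

The mean is unchanged, so the relative proportion of support is preserved, while the variance grows as \(s\) shrinks. A wider density places more mass on the other side of 0.5, which generally pulls \(q\) away from 0 or 1.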

## Data Sources

Please contact us if you'd like to obtain the list of polls we used as data sources.

## Appendix A

What happens in case the Electoral College is tied?

The Twelfth Amendment, proposed by Congress in 1803 and ratified by the states the next year (following the 1800 tie in the Electoral College), provides that if no presidential candidate has obtained a simple majority of the votes in the Electoral College, the House of Representatives elects the President according to the following procedure. Only the three presidential candidates with the highest number of votes remain in the race (e.g., in the 2004 race: unless Ralph Nader obtained at least one vote in the Electoral College, this would mean only John Kerry and George W. Bush would remain in the race). Each state representation in the House convenes to decide for whom to cast the single ballot they are given to represent that state. The candidate with a simple majority of all the state ballots (i.e., 26 votes) wins, and at least 34 votes must be cast (the quorum requirement being that 2/3 of the states must be represented).

This procedure has been applied only once, following the 1824 election, not because of a tie but because of the multiplicity of candidates: Andrew Jackson received 99 votes, short of the 131 then required to be elected, John Quincy Adams 84, William Crawford 41 and Henry Clay 37. Clay, being fourth in line, was excluded from the Twelfth Amendment procedure, and Adams subsequently won the House vote with 13 out of the then 24 state ballots.

Another point worth noting is that electors in the College cast two separate ballots: one for the President, one for the Vice-President. A residency requirement prevents an elector from casting both votes for presidential and vice-presidential candidates both residing in the elector’s own state. In practice this residency requirement is not an issue since presidential and vice-presidential candidates on each ticket reside in different states, precisely to avoid this problem. (In the 2000 election, a controversy arose around then vice-presidential candidate Dick Cheney having moved from Texas to Wyoming a few months before the election, so as to avoid a residency conflict with running mate George W. Bush. The case was dismissed by the courts.)

One consequence of having electors cast two separate and in a sense independent votes for President and Vice-President is the need for a separate back-up procedure to elect the Vice-President in case the Electoral College is tied or unable to reach a simple majority for that office. The Twelfth Amendment provides that, similarly to the presidential vote tie-breaking procedure, the Senate then elects the Vice-President, each senator casting one vote. The vice-presidential candidates in the race are limited to those having obtained the two highest numbers of votes (this usually means two candidates, unless there is a tie for the second place). In the 2012 election, this most likely would mean that only Joe Biden and Paul Ryan would be in the race. The vice-presidential candidate with a number of votes equal to or greater than the simple majority of all senators (i.e., with 51 votes or more) wins, with the president of the Senate being denied his or her tie-breaking privilege for that election. (A quorum of 2/3 is here again required, i.e. 67 voting senators.)

Finally, the Twentieth Amendment provides for the case in which the procedure laid out by the Twelfth Amendment is itself deadlocked, which is not unlikely in the House procedure to elect the President, since a number of state representations are currently evenly split between Democrats and Republicans. According to the Twentieth Amendment, if a Vice-President has been elected, he or she shall act as President until a President has been elected. If no Vice-President has been elected, Congress may legislate to fill in both offices.

In 2012, it is likely that a tie in the Electoral College would lead to the election of Mitt Romney by the House, and it is probable although less certain that the Vice-President elected by the Senate would be Paul Ryan. The possibility of having Mitt Romney elected President and Joe Biden elected Vice-President is remote but cannot be entirely ruled out given the current composition of the Senate.