Voters can interpret the true meaning of a poll if they understand the importance of the margin of error, confidence interval and sample size.

A poll’s margin of error can inspire hope or panic among candidates and the voters rooting for them.

Election season turns every political junkie into a statistician. For many, it may be the only time they interpret a statistical metric, and they may find the margin of error unfamiliar. But that doesn’t make it any less crucial.

A margin of error measures how far a result from a sample of voters (or any set of data) may differ from the result for the entire population. Polls include it to warn that the sample may not perfectly represent that population.

For example, a pollster might report that “53% of voters would vote for Candidate A, and 47% would vote for Candidate B, with a +/- 5% margin of error.” That means the actual percentage of people who’d vote for Candidate A could be as low as 48% and as high as 58%, and the actual percentage of people who’d vote for Candidate B could be as low as 42% and as high as 52%.
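
The arithmetic behind those ranges is simple enough to sketch in a few lines of Python. The helper name `poll_range` is hypothetical and exists only for this illustration; it subtracts and adds the margin of error, in percentage points, to each reported share.

```python
def poll_range(share_pct, margin_pct):
    """Return the (low, high) range implied by a reported share and a
    margin of error, both given in percentage points."""
    return share_pct - margin_pct, share_pct + margin_pct

# The hypothetical poll from the text: 53% vs. 47%, +/- 5 points.
for name, share in [("Candidate A", 53), ("Candidate B", 47)]:
    low, high = poll_range(share, 5)
    print(f"{name}: {low}% to {high}%")
```

Because the two ranges overlap between 48% and 52%, the poll alone cannot say which candidate is actually ahead.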

Further, the confidence level the pollster used, often reported as a “confidence interval,” is usually included at the bottom. For example, a “95% confidence interval” might be used for a particular poll.

Even though the confidence interval says the actual percentage of voters for Candidate A has a 95% chance of being between 48% and 58%, there’s still a 5% chance that the actual percentage might be below 48% or above 58%.

A 99% confidence interval would leave only 1% outside its range, but that would make the margin of error about a third wider. A very high confidence level can make the potential range too wide to be useful, while a very low one can leave too much uncertainty. That’s why most polls use a 95% confidence interval.

In the margin of error calculation, a Z-score is used that matches the desired confidence level. It’s the number of standard deviations on either side of the mean needed to capture that share of a normal distribution. The higher the confidence level, the higher the Z-score. A Z-score table shows the value for a given confidence level: a 95% confidence interval has a 1.96 Z-score, while a 99% confidence interval has a 2.58 Z-score.
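
For readers without a Z-score table handy, the same values can be looked up with Python’s standard library. This is a minimal sketch, assuming a two-sided interval on a standard normal distribution; the function name `z_score` is hypothetical.

```python
from statistics import NormalDist

def z_score(confidence):
    """Two-sided Z-score for a confidence level given as a decimal (0.95 for 95%).

    Half of the leftover probability sits in each tail, so the lookup
    point on the standard normal curve is 0.5 + confidence / 2.
    """
    return NormalDist().inv_cdf(0.5 + confidence / 2)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence -> Z = {z_score(level):.2f}")
```

Rounded to two decimals, the 95% and 99% results match the table values quoted above.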

So, a poll’s margin of error requires a Z-score for the desired confidence level (“Z”), the number of voters in the sample (“n”) and the percentage of voters for a particular candidate (“p,” written as a decimal), which are plugged into this formula:

$$\text{Margin of error} = Z \times \sqrt{\frac{p(1 - p)}{n}}$$

For example, if 500 people were polled in a Congressional district and 53% said they’d vote for the Democrat and 47% said they’d vote for the Republican, the margin of error with 95% confidence would be:

$$1.96 \times \sqrt{\frac{0.53 \times 0.47}{500}} \approx 0.044 = 4.4\%$$

Based on a 4.4% margin of error, the Democrat’s actual support has a 95% probability of being between 48.6% and 57.4%, and the Republican’s actual support has a 95% probability of being between 42.6% and 51.4%. A Democratic voter might read those numbers as giving the Republican only a small chance of actually being ahead, and thus be cautiously optimistic.

On the other hand, the numbers show that if the Democrat’s actual support is at the low end of her range and the Republican’s is at the high end of hers, the Republican could pull off an upset. In most cases where voters are closely split, margins of error can give some hope to supporters of both candidates.
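
The district example can be reproduced in a short Python sketch. It assumes the standard margin-of-error formula for a proportion from a simple random sample, Z times the square root of p(1 − p)/n; the function and variable names are hypothetical and chosen only for this illustration.

```python
from math import sqrt

def margin_of_error(p, n, z=1.96):
    """Margin of error for a share p (as a decimal) in a sample of n voters.

    Assumes a simple random sample and the normal approximation;
    z=1.96 corresponds to 95% confidence.
    """
    return z * sqrt(p * (1 - p) / n)

n = 500
dem, rep = 0.53, 0.47
moe = margin_of_error(dem, n)  # identical for rep, because p * (1 - p) is symmetric

dem_low, dem_high = dem - moe, dem + moe
rep_low, rep_high = rep - moe, rep + moe

print(f"Margin of error: {moe:.1%}")
print(f"Democrat:   {dem_low:.1%} to {dem_high:.1%}")
print(f"Republican: {rep_low:.1%} to {rep_high:.1%}")
print("Ranges overlap:", rep_high > dem_low)
```

The final line checks whether the top of the Republican’s range reaches past the bottom of the Democrat’s, which is the upset scenario described above.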

The key to the margin of error is the sample size of the poll. A small sample generates a higher margin of error, and a large sample generates a lower one. The margin of error shrinks in proportion to the square root of the sample size, so quadrupling the sample only cuts the margin of error in half.

For example, with 53% support and 95% confidence, a sample size of 500 has a 4.4% margin of error, a sample size of 1,000 has a 3.09% margin of error and a sample size of 5,000 has a 1.38% margin of error. Each further increase in sample size buys a smaller reduction in the margin of error. The problem for pollsters is that the cost of polling a much larger sample usually outweighs the benefit of the smaller margin of error.
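
The diminishing returns are easy to see in a short Python sketch. It assumes the standard formula for a proportion, Z times the square root of p(1 − p)/n, with p = 0.53 and Z = 1.96 (both assumptions carried over from the district example, so the last digit may differ from figures computed with another p). The helper names `margin_of_error` and `sample_size_for` are hypothetical.

```python
from math import ceil, sqrt

def margin_of_error(p, n, z=1.96):
    """Margin of error for a share p (as a decimal) in a sample of n voters."""
    return z * sqrt(p * (1 - p) / n)

def sample_size_for(target_moe, p=0.53, z=1.96):
    """Smallest sample size whose margin of error is at or below target_moe.

    Obtained by solving the margin-of-error formula for n and rounding up.
    """
    return ceil(z ** 2 * p * (1 - p) / target_moe ** 2)

for n in (500, 1000, 2000, 5000):
    print(f"n = {n:>5,}: margin of error {margin_of_error(0.53, n):.2%}")

# Going from 500 to 2,000 voters (four times the sample) halves the margin.
print(margin_of_error(0.53, 500) / margin_of_error(0.53, 2000))

# Sample needed to get the margin of error down to 3 points.
print(sample_size_for(0.03))
```

Solving for the sample size rather than the margin shows the cost side of the trade-off: each additional point of precision requires a disproportionately larger poll.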

Because the margin of error is so dependent on sample size, it can behoove the underdog to use a smaller sample, and the favorite to use a larger sample. The former can increase the margin of error and make it seem the underdog has a chance at victory, and the latter can reduce it and make it seem the favorite is all but guaranteed a win.

That’s why it’s important for a voter interpreting a poll to look at its confidence interval and sample size—and for politicians to be humble about their chances.

Tom Preston, Luckbox contributing editor, is the purveyor of all things probability-based and the poster boy for a standard normal deviate. @thetompreston