Didn’t the polls and statistical models get everything wrong in 2016? How can I trust them?
Every political data analyst has been asked those same questions over and over since the 2016 election. Prior to Election Day 2016, many people watched the campaign coverage and concluded that the polls ruled out a Donald Trump win and signaled clearly that Hillary Clinton would be president. That perception wasn’t correct: polls can be off in the same direction in multiple key states, and models and analysts who factored in those correlated errors saw the possibility of a Trump upset ahead of time. But the perception stuck nevertheless, and after Election Day many people felt burned and lost faith in the polls.
Perhaps 2018 will help the public trust survey research and predictive modeling a little bit more. This year, the data did a good job of predicting the final election outcome.
Heading into the election, the consensus both in the world of data and the world of reporting was (1) that the most likely outcome in the House was a decent Democratic win, with a blue landslide and a GOP hold both still possible and (2) that Republicans would likely hold the Senate and probably add to their majority.
These general predictions were right, or close to it. Democrats took the House and Republicans padded their margin in the Senate by a bit more than expected.
In fact, at press time, the best estimate for the number of Democratic seats is in the high 220s or low 230s. On election eve, I used the data to estimate that Democrats would end up with 228 seats, and statistical forecast models projected that Democrats would end up with roughly 227 to 231 seats. The Real Clear Politics (RCP) average indicated that Democrats would win the overall House popular vote by seven points, and they seem poised to achieve something very close to that. That all adds up to a solid level of accuracy.
On the Senate side, our TWS Forecast thought that 52 GOP seats was the likeliest outcome. We don’t know the final composition yet—Arizona is still out and Florida is headed for a recount—but it looks like Republicans will probably end up with 53 or 54 seats. That’s not perfect, but it’s well within the plausible range of outcomes the model laid out.
It’s harder to say how good the polls were on a race-by-race basis at this point. Votes are still being counted in some of the most important races (e.g., some California House races won’t be confirmed for a while), and the exact accuracy of the polls is something I intend to revisit in more detail soon. But I will note a few things about what happened on the Senate side.
In some cases, there were real polling errors. RCP’s poll average in Indiana (one of the best aggregates out there) put incumbent Democratic senator Joe Donnelly ahead by about a point; in fact, Mike Braun, the Republican, will likely win by a large margin (he’s ahead by about eight points at press time). In Tennessee, the final average of polls was off by about six points. The polls also seem to have undershot Missouri Republican Josh Hawley, who appears to be headed for a six-point win despite being virtually tied with Sen. Claire McCaskill heading into the election.
Some of the misses were smaller. In Florida, Democratic senator Bill Nelson led Republican governor Rick Scott by 2.4 points heading into the election, but Scott is now ahead as they move toward a recount. So the call was wrong there, but the polls were only off by a couple of points. The polls appeared to be off by a similar margin in West Virginia, where incumbent senator Joe Manchin won by three points despite leading in the polls by about five points. And Texas falls somewhere between this category and the last one—Democratic candidate Beto O’Rourke outperformed his polls by a few points.
In other key races, the polls were better. In New Jersey, they suggested a roughly 10-point win, and that appears to be about what scandal-ridden Democratic senator Bob Menendez got. The polls accurately predicted Democrat Jon Tester’s three-point lead in Montana. In North Dakota, the polls were off by less than two points despite the fact that the race was under-polled in the final stretch. The final Ohio Senate polls were off by a couple of points but were trending in the right direction by Election Day. And the Wisconsin polls got the Leah Vukmir-Tammy Baldwin result almost exactly right. Arizona hasn’t been called yet, but the polls correctly suggested a very close race.
And the “sleeper” races stayed asleep. Democrat Tina Smith won the Minnesota special Senate race for Al Franken’s old seat by low double digits (the average of the last three polls showed a 10-point lead), and Bob Casey won reelection in Pennsylvania by about 13 while the polls showed him up by 14. The polls even caught on to Republican John James’s late (and ultimately unsuccessful) surge in barely talked-about Michigan, showing him behind by eight points before he lost by six.
The data was also decent but not perfect in races for governor. Most of the races that weren’t considered tossups ahead of time stayed off the map. And the polls in the tossup states were helpful even without demonstrating pinpoint accuracy.
In Georgia, the preelection polls were only off by a point and a half. The Florida polls were off by about four points—a real but not unheard-of error. In Wisconsin, the average of the last three polls showed a roughly 2.3-point lead for Democrat Tony Evers, and he barely eked out a one-point win over Republican governor Scott Walker. Republican Kristi Noem, who led the South Dakota polls by two, won by three. Democrat Kate Brown almost exactly hit her poll average in Oregon, and Republican Mike DeWine beat his polls by quite a bit in Ohio (he won by about four points after trailing by mid-single digits in the final polls). Republican Kris Kobach in Kansas underperformed his polls (he lost to Laura Kelly by five points after leading by one in the final three polls).
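The arithmetic behind these comparisons is simple enough to sketch. As a toy illustration (the helper names are my own, not any pollster's; the Wisconsin figures are the ones cited above), a "polling miss" here just means the gap between an average of the final poll margins and the actual result:

```python
def poll_average(margins):
    """Simple average of the final few poll margins (a crude stand-in
    for an aggregate like the RCP average)."""
    return sum(margins) / len(margins)

def signed_error(poll_margin, actual_margin):
    """Signed polling error: positive means the polls overstated
    the named candidate's margin; negative means they understated it."""
    return poll_margin - actual_margin

# Wisconsin governor: the last-three-poll average showed Democrat Tony
# Evers up by about 2.3 points; he won by about 1.
wi_poll_avg = 2.3
wi_actual = 1.0
print(signed_error(wi_poll_avg, wi_actual))  # polls overstated Evers by ~1.3
```

Note that the error is signed, which matters for spotting correlated misses: if the polls overstate the same party in many states at once (as in 2016), the errors all carry the same sign rather than canceling out.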
Once we get full results from every district and state we’ll be able to calculate the exact level of polling error this cycle. But these numbers look pretty good from 30,000 feet. Democrats did about as well as expected in the House. Republicans did better in the Senate than projected, but it was well within the range of plausible outcomes. Republicans did a bit better than some expected in governor races, but the result didn’t seem crazy (e.g., at press time the Democrats had netted six seats and I guessed they would net between six and eight on the morning of Election Day).
There are good reasons to be careful about how you use and examine the polls (there can be low response rates, different methodological choices by various pollsters, complicated sources of error, correlated error, etc.). But this election should prove that they’re not garbage. In fact, they’re arguably the best tool we have for understanding public opinion.