THE LATEST JOURNALIST to fall for the academic pseudo-science of election predicting is the Washington Post’s Robert Kaiser. In a dramatic front-page article on May 26 headlined “To Researchers, Election Is All Over but the Voting,” he writes: “You didn’t realize that Gore has won the election? A technicality. According to half a dozen political scientists who have honed and polished the art of election forecasting, the die is all but cast.”
Kaiser adds that “the academic prognosticators have a startlingly good record” and that most of their forecasting models “have picked the winner correctly in years since 1952 when the winner got 53 percent or more of the vote.”
So should George W. Bush pack it in? Nope. In fact, none of the models predicting a Gore victory has much of a record. The incredible forecasting record that the Post and other newspapers trumpet rests not on actual predictions but on the ability of recently developed models to produce accurate “ex post predictions” of past elections, or “postdictions,” as political scientists call them. Two of the “leading academic forecasters” discussed in Kaiser’s article — Christopher Wlezien of the University of Houston and Thomas Holbrook of the University of Wisconsin at Milwaukee — boast models that have correctly predicted just one election. To be sure, they have “postdicted” many earlier ones. But what does that really prove?
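For readers who want to see the mechanics, here is a toy simulation of the difference (invented numbers and a generic least-squares fit, not any forecaster’s actual equation): with only a dozen elections to fit and several predictors to choose from, a model can “explain” the past almost perfectly and still miss badly on an election it has never seen.

```python
# A toy illustration of "postdiction" vs. prediction, using invented data and
# a generic least-squares fit (not any forecaster's actual model). With only a
# dozen elections and a handful of predictors, the fit to past elections looks
# far better than the model's accuracy on an election it has never seen.
import numpy as np

rng = np.random.default_rng(0)
n_past, n_predictors = 12, 4     # roughly a dozen postwar elections, a few predictors

X = rng.normal(size=(n_past, n_predictors))
true_effect = np.array([3.0, 0.0, 0.0, 0.0])          # only one predictor really matters
vote = 50 + X @ true_effect + rng.normal(scale=2.0, size=n_past)

# "Postdiction": judge the model on the very elections it was fitted to.
design = np.column_stack([np.ones(n_past), X])
coef, *_ = np.linalg.lstsq(design, vote, rcond=None)
in_sample_error = np.abs(design @ coef - vote).mean()
print(f"average postdiction error: {in_sample_error:.2f} points")

# Prediction: apply the same fitted model to fresh elections it never saw.
X_new = rng.normal(size=(1000, n_predictors))
vote_new = 50 + X_new @ true_effect + rng.normal(scale=2.0, size=1000)
out_sample_error = np.abs(coef[0] + X_new @ coef[1:] - vote_new).mean()
print(f"average prediction error:  {out_sample_error:.2f} points")
```

The postdiction errors shrink because the model is tuned to the very elections it is graded on; the prediction errors show what a genuine forecast is up against.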
The true test is calling an election ahead of time. And that’s hard. Kaiser may want to reread the clips at his own paper. In 1992, Washington Post polling director Richard Morin reported on the work of Yale University’s Ray Fair, “perhaps the dean of presidential election forecasters,” and the University of Iowa’s Michael Lewis-Beck, “perhaps the country’s preeminent election forecaster.”
Morin reported that Lewis-Beck’s forecasting model “has successfully picked the winner in 10 of the last 11 elections, missing only in 1960. In the past six presidential elections, Fair (who also picks the president) has not only gotten the winner right, but has come within 1.1 percentage points, on average, of estimating the winner’s share of the popular vote.”
That’s a remarkable record. And how did these experts’ predictions pan out in 1992? They didn’t. Lewis-Beck predicted George Bush would hold the White House with 51.5 percent of the two-party vote. Fair forecast a Bush landslide, with a 55.7 to 44.3 percent margin over Bill Clinton.
Four years later, Yale economist Fair, the “dean of presidential forecasters,” extended his losing streak, predicting that Bob Dole would take a majority of the two-party vote against Clinton. Getting this election wrong was no small feat. Throughout the campaign, Dole never once led Clinton in the polls. Yet Fair’s model predicted Dole would come out on top.
The root problem isn’t that election forecasters are dumb. It’s that they lack the data they need to build reliable models. Most of the models are based on the historical relationship between the presidential vote and factors such as public opinion, the state of the economy, and which party currently holds the presidency.
Forecasters surely are correct to think the economy is crucial. They’re also right to stress the importance of public opinion. But accurate economic data extend back only about a century, while the required polling data go back only to the election of 1948.
With so few elections since then, it’s impossible to build a reliable statistical model. But that hasn’t stopped forecasters from claiming they have. In Before the Vote, a recent book on election forecasting, James E. Campbell of the State University of New York at Buffalo writes that he has developed a “highly confirmed model that produces a very accurate forecast of the national two-party popular vote for president two months before the election.”
How many elections has this “highly confirmed model” correctly called? Two. But Campbell is unabashed. After reviewing several of the major forecasting models and their results, he concludes that while a few issues remain, “the record of accuracy documented above would seem to be sufficient to convince all but an O. J. Simpson jury that the models warrant a high degree of confidence.”
Yet the forecasting models, even when they do call the winner correctly (and there are, after all, only two major-party candidates), often do so with a large margin of error. In 1996, for example, Campbell predicted Clinton would get nearly four percentage points more of the two-party vote than he actually did.
In fact, all the major political science models over-predicted Clinton’s 1996 share of the vote. Why this happened is unclear, but if the errors had been random, some models likely would have under-predicted the Democratic vote while others would have over-predicted it. That this didn’t occur is further reason for caution in accepting forecasts of a Gore victory this fall.
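A back-of-the-envelope calculation shows why the one-sided misses matter. If, purely for the sake of illustration, each model’s error were independent and as likely to fall on either side of the actual result, the chance that half a dozen models would all miss in the same direction is only about 3 percent:

```python
# If each model's miss were an independent coin flip between "too high" and
# "too low," all of them landing on the same side would be a rare event.
# Illustrative only: real models share data, so their errors are correlated.
n_models = 6                          # the half-dozen forecasters Kaiser cites
p_all_same_side = 2 * 0.5 ** n_models
print(f"chance all {n_models} models err in the same direction: {p_all_same_side:.3f}")  # ~0.03
```

That every model missed high suggests their errors move together, which is all the more reason not to treat any one forecast as settled.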
Another problem with the forecasting models lies in what is known as “survivorship bias.” That’s the tendency of failed models to disappear, with only the most accurate ones surviving to be counted in statistics that measure the success of forecasting.
Exactly this sort of thing goes on with mutual funds. Funds that perform poorly are closed down or merged with more successful funds. That makes misleading any statistic showing that most funds outperform this or that benchmark: the funds that fall short disappear and are never counted again.
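Here is a minimal sketch of that effect with invented fund returns; the benchmark, the return distribution, and the assumption that the bottom 30 percent of funds close are all made up simply to make the mechanism visible:

```python
# Survivorship bias in miniature: score a crowd of hypothetical funds against a
# benchmark, let the worst performers "close," and compare the survivors with
# the full field. All numbers are invented for illustration.
import numpy as np

rng = np.random.default_rng(1)
benchmark = 0.08                                          # assumed 8 percent benchmark return
returns = rng.normal(loc=0.07, scale=0.05, size=1000)     # 1,000 funds, most lagging the benchmark

cutoff = np.quantile(returns, 0.30)                       # bottom 30 percent shut down or merge
survivors = returns[returns > cutoff]

print(f"all funds beating the benchmark:       {(returns > benchmark).mean():.0%}")
print(f"surviving funds beating the benchmark: {(survivors > benchmark).mean():.0%}")
```

In this toy run, fewer than half of all funds beat the benchmark, yet a clear majority of the survivors appear to.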
For a good example of survivorship bias in election forecasting, consider the models of that “preeminent election forecaster” Michael Lewis-Beck. He and his coauthors used one model in 1988, another in 1992, and yet another in 1996. To be fair, switching forecasting models from one election to the next isn’t uncommon. It’s the norm; that’s how the models are supposed to be improved over time. But it should send a signal to reporters that election forecasting is an endeavor still in its infancy, not one with a “startlingly good record.”
The fact that social scientists have such a hard time predicting should also make us skeptical about their ability to explain. Predicting and explaining are two sides of the same coin. To predict who’s going to win an election, you have to understand the determinants of the vote. To explain why the winner won, you have to understand the same thing.
The main difference between a prediction and an explanation is that a bad prediction is much harder to hide. And that’s why, in the end, the reporters who get social scientists’ predictions on the record are performing a useful public service.
Ira Carnahan is a freelance writer in Washington, D.C.