Stuart Gabriel is a professor of finance at the Anderson School of Management at UCLA. Seth Stephens-Davidowitz is an economist and a contributing opinion writer to the New York Times. This op-ed appeared in the New York Times.
Are voters misleading pollsters? Are there hidden Donald Trump supporters who could throw the election his way?
Over the past few years, we have both become interested in how data from the internet, particularly Google searches, might be used to predict events. People also tell Google things — a lot of things — that they may not admit to others. So can we use Google searches to predict whom voters will support in this election? It is not as simple as we’d hoped.
One indicator of support might be how frequently people search for a candidate. There is some evidence that if they search for you, they will vote for you. In primary elections, Google search volume for a candidate in a state has predicted electoral outcomes. It is also true that in each of the past three general elections, the candidate with the most Google searches — George W. Bush in 2004 and Barack Obama in 2008 and 2012 — received the most votes.
This time around, nationwide, for every Google search about Hillary Clinton, there are two for Mr. Trump. Does this mean that there is widespread support for Mr. Trump that is being missed by pollsters and will carry him to a commanding victory? Highly doubtful.
Breaking down these searches by state yields a seemingly implausible electoral map. If Mr. Trump wins every state and has his biggest margins of victory in Vermont and California, we can all ignore polls and simply look at Google search volumes to predict future elections.
The high search volume for Mr. Trump is more likely evidence that he elicits strong reactions, which is hardly news. Many who plan to vote for Mrs. Clinton are anxiously searching the web for information about Mr. Trump’s chances.
While voters occasionally tell Google exactly how they feel — searches like “I love Trump” or “I hate Clinton” — these searches are too rare to be used to predict electoral outcomes.
One state that has a huge number of negative Trump searches is Alabama, a deeply red state that will almost certainly support Mr. Trump. We suspect that the high negative search rate may be driven by African-American voters, who may feel particular animosity toward Mr. Trump. But this group is not large enough to tip Alabama into the Democratic column.
There are two indicators that may contain information about the election that polls could be missing.
The first: Google searches related to information about voting. Every October of a presidential election year, hundreds of thousands of Americans enter terms like “how to vote” or “where to vote.” And the volume of these searches predicts where turnout will be high. This is valuable information since research suggests that more than half of nonvoters falsely tell pollsters they are going to vote.
To obtain clues about the number and composition of likely voters, we can compare city-level demographics to city-level search volumes. The main insight from Google searches this year is that African-American turnout may be down in 2016. There are significantly fewer searches for voting information in cities with large black populations than there were in 2012 and 2008.
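The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' actual method: the city figures are invented, the "search index" is a hypothetical stand-in for the volume of voting-information searches, and a simple Pearson correlation stands in for whatever statistical model was really used.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical cities, invented for illustration only:
# (black population share, search index in 2012, search index in 2016)
cities = [
    (0.05, 40, 42),
    (0.15, 50, 49),
    (0.30, 60, 54),
    (0.50, 70, 58),
]

shares = [city[0] for city in cities]
# Relative change in voting-information searches from 2012 to 2016.
changes = [(city[2] - city[1]) / city[1] for city in cities]

r = pearson(shares, changes)
print(f"correlation between black population share and search change: {r:.2f}")
```

A clearly negative correlation in real data would point the same way as the finding above: searches for voting information fell most in cities with the largest black populations.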
This suggests that even though African-Americans overwhelmingly disapprove of Mr. Trump, they will turn out at lower rates to oppose him than they did to support President Obama. This could mean that Mr. Trump will do a bit better than expected in contested states with substantial black populations, like Ohio and Florida.
Google searches for voting information can also be used to test one hypothesis for how Mr. Trump might win. David N. Wasserman, an elections analyst at The Cook Political Report, has found that there were 47 million eligible white voters without a college degree who did not vote in 2012.
Might a significant number of these Americans be excited enough by Mr. Trump’s candidacy to vote in this election? Doubtful. According to Google searches, there is no increased interest in information about voting in cities with low levels of education.
There is another indicator in Google searches that may contain meaningful information. A large percentage of election-related searches include both candidates’ names. Some people search for “Trump Clinton polls.” Others look for highlights from the “Clinton Trump debate.” In fact, 12 percent of search queries with “Trump” also include the word “Clinton.” More than one-quarter of search queries with “Clinton” also include the word “Trump.”
We have found that these seemingly neutral searches may actually give us some clues to which candidate a person supports.
How? The order in which the candidates appear. Our research suggests that a person is significantly more likely to put the candidate they support first in a search that includes both candidates’ names.
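To make the idea concrete, here is a minimal Python sketch of how such an order-based measure could be computed. It is an assumption-laden illustration rather than our actual analysis: the function name `first_named_share` is hypothetical, the sample queries are made up, and real search data is available only in aggregated form, not as a list of individual queries.

```python
import re

def first_named_share(queries, a="trump", b="clinton"):
    """Among queries that name both a and b, return the share naming a first.

    Returns None if no query names both candidates.
    """
    both = 0
    a_first = 0
    for query in queries:
        words = re.findall(r"[a-z]+", query.lower())
        if a in words and b in words:
            both += 1
            # Compare the position of each name's first appearance.
            if words.index(a) < words.index(b):
                a_first += 1
    return a_first / both if both else None

# Made-up queries for illustration only; not real search data.
sample = [
    "Trump Clinton polls",
    "Clinton Trump debate",
    "trump clinton election",
    "clinton emails",
    "trump rally",
]

share = first_named_share(sample)
print(f"share of two-name queries listing Trump first: {share:.2f}")
```

Queries that name only one candidate are ignored, so the measure reflects ordering alone and not the overall search volume for either candidate.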
In the previous three elections, the candidate who appeared first in more searches received the most votes. More interesting, the order in which the candidates were searched predicted which way a particular state would go.
In short, we observed a strong, statistically significant relationship between the order in which the candidates appeared in Google searches in that state and the number of votes the candidate received in that state. This relationship has become stronger as Google search data has gotten richer.
The order in which candidates are searched also seems to contain information that the polls can miss. In 2012, Nate Silver, the founder of the website FiveThirtyEight, accurately predicted the result in all 50 states. However, we found that in states where searches most frequently listed Mitt Romney before Mr. Obama, Mr. Romney actually did better than Mr. Silver predicted. In states where searches most frequently listed Mr. Obama before Mr. Romney, Mr. Obama did better than Mr. Silver predicted.
While further analysis is required, data about search order could prove a meaningful supplement to polling data. This indicator could contain information that polls miss because voters are either lying to themselves or uncomfortable revealing their true preferences to pollsters. Perhaps if they claimed that they were undecided in 2012, but were consistently searching for “Romney Obama polls,” “Romney Obama debate” and “Romney Obama election,” they were really planning to vote for Mr. Romney all along.
So what does search order tell us about 2016? Well, again, we confront the problem of the unusual nature of Mr. Trump’s campaign. Nationwide, there are more searches that include “Trump” before “Clinton.” This could be evidence that Mr. Trump is doing better than the polls project, although it seems likely that Mr. Trump is such a dominant and divisive figure in the American psyche that even some Clinton supporters think of this as the “Trump-Clinton election.”
The data is clearly not perfect. It is what economists call noisy data. But generally speaking, the states in which the highest percentage of searches put “Trump” first are those where Mr. Trump seems to be performing better. Those states in which “Trump” comes first in search queries least often are those that are most supportive of Mrs. Clinton.
And, as in 2012, this indicator may have state-level information that polls are missing. If so, we would suspect that Mr. Trump might outperform his polls in Iowa and Nevada, for example.
Mr. Trump is such an unusual candidate that he makes it difficult to interpret any data, but there are a couple of indicators that Mr. Trump may be doing better than polls suggest in some states. If you’re a Clinton supporter in a swing state worried about the prospect of a Trump presidency, now is not the time to sit back and relax.