- [Analysis] Can Google Trends be Used for Political Polling? - January 26, 2021
Google Trends is believed to help predict everything from market movements to the most popular Halloween costumes, but is it suitable for predicting political election results?
Google Trends is a handy marketing tool, launched over 14 years ago, for measuring the popularity of top search queries in Google Search across different markets. It has aged well and attracted a varied pool of devotees, including marketers, journalists and scientists, who turn to it whenever they struggle to decipher the human psyche – in other words, always when on duty.
Among its die-hard practitioners is Seth Stephens-Davidowitz, a data scientist and New York Times contributor who completed his PhD thesis using data from Google Trends. The work inspired him to investigate the tool more closely as an alternative to traditional sociological research, which, according to him, often fails to correctly quantify human behavior.
He saw the conditions under which people searched on Google (most likely on their private devices and unaware of who might be analyzing them) as the ideal environment for gathering real opinions, concerns, and sentiments. On this basis, he published a famous article stating that racial aversion played a huge role in the 2008 presidential election in the United States and that Barack Obama, who eventually became president, had to overcome far more adversity than anyone expected or any election poll revealed.
He did so by mapping racially charged search rates across the country, using Google searches that included the word “n****r” and measuring how popular these searches were in each area. The resulting scale was then correlated with the share of the vote Barack Obama received in each area and compared with the share John Kerry received in the same area in 2004. Controlling for the strong Democratic support in 2008, Davidowitz estimated that Obama should have received about 57% of the vote in both Denver and Wheeling. In fact, Obama’s result in Wheeling was much lower: he received only 48% of the vote there.
For Davidowitz, this was evidence that racial prejudice in places like Wheeling, which he could see in the Google Trends data, must have been a reason why Obama’s support was lower than the estimated figures. This led him to dismiss popular claims that prejudice was not the primary factor against a black presidential candidate in modern America.
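The comparison described above can be illustrated with a short sketch. Only the Wheeling figures (57% expected, 48% actual) come from the text; the other markets, all search-rate values and the function names are invented purely to show the arithmetic, and this is not Stephens-Davidowitz’s actual data or model.

```python
from math import sqrt

def shortfall(expected_pct, actual_pct):
    """Percentage points by which the actual vote fell short of the estimate."""
    return expected_pct - actual_pct

def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Figures quoted in the article for Wheeling: 57% expected, 48% actual.
wheeling_gap = shortfall(57, 48)

# HYPOTHETICAL markets: (racially charged search rate index, shortfall in points).
# Every search-rate value here is invented, including the one for Wheeling.
hypothetical_markets = {
    "Market A": (20, 0),
    "Market B": (45, 3),
    "Market C": (70, 6),
    "Wheeling": (90, wheeling_gap),
}
rates = [rate for rate, _ in hypothetical_markets.values()]
gaps = [gap for _, gap in hypothetical_markets.values()]
correlation = pearson(rates, gaps)

print(f"Wheeling shortfall: {wheeling_gap} points")
print(f"Correlation on made-up data: {correlation:.2f}")
```

A strong positive correlation on real data would be consistent with the argument, but on its own it would not prove that prejudice caused the shortfall.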
A slightly different use of Google Trends, this time to comment on how people exercise their democratic right to vote, followed the 2016 referendum on EU membership (Brexit) in the UK. In addition to reports of declines in stock prices, the British media discussed the most popular search terms on Google that day, including the question “What’s the EU”, which, for some, signaled indifference and a failure among voters to recognize the seriousness of the situation.
This inherently selective use of Google data to comment on the referendum results, as opposed to Davidowitz’s more detailed research, was perhaps anecdotal and fed into the sensationalism of the time. Nevertheless, it did help to create a narrative about defrauded citizens who did not know what they were voting for when they voted to leave the European Union, a narrative that later became the basis for calls for a second referendum.
If people’s true feelings can be measured with a tool with near-seismographic capabilities, why not use it before the elections rather than after them? Among the pioneers who used Google Trends data to measure a candidate’s public popularity was Tom Cochran, an Obama administration appointee and CEO of 720 Strategies, a public relations firm that ran a Google Trends analysis in response to Pete Buttigieg’s rise in a November 2019 Iowa poll, just before the 2020 Iowa Democratic presidential caucuses. That rise was noteworthy in a country where the caucuses are seen as an indication of how a presidential candidate will later perform in the primaries.
Cochran used Google Trends to see how support for Democratic candidates had changed over time, comparing Google search data for Pete Buttigieg, Elizabeth Warren, Joe Biden, and Bernie Sanders over three different timeframes (the last 12 months, the last 30 days and the last 7 days). He then explained how candidates come to dominate Google by linking apparent spikes in popularity to events, favorable or not. Lastly, Cochran compared his own results with national polls to see whether polls such as the one in Iowa really did matter.
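A comparison across timeframes of this kind can be sketched in a few lines. Google Trends reports interest on a relative 0 to 100 scale, where 100 marks the peak within the chosen comparison, so the sketch first rescales raw counts in that way and then averages each candidate’s index over shorter and shorter windows. The candidate names, the daily counts and the helper functions are all hypothetical; this is not Cochran’s data or method.

```python
def to_trends_scale(series_by_candidate):
    """Rescale raw counts so the highest value across all candidates becomes 100."""
    peak = max(max(values) for values in series_by_candidate.values())
    return {
        name: [round(100 * value / peak) for value in values]
        for name, values in series_by_candidate.items()
    }

def average_interest(scaled, last_n_days):
    """Mean index per candidate over the most recent `last_n_days` data points."""
    return {
        name: sum(values[-last_n_days:]) / len(values[-last_n_days:])
        for name, values in scaled.items()
    }

# HYPOTHETICAL daily search counts, oldest first. Not real data.
raw = {
    "Candidate A": [10, 12, 11, 40, 30, 20, 18, 16],
    "Candidate B": [25, 24, 26, 25, 27, 26, 25, 24],
}

scaled = to_trends_scale(raw)
for window in (8, 4, 2):
    print(f"last {window} days:", average_interest(scaled, window))
```

In this made-up series, Candidate A owns the single highest peak but trails Candidate B over the most recent two days, which is why looking at more than one window matters.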
As of November 2019, the intent demonstrated through Google searches did indeed seem to line up closely with official polling data. The research may not have allowed Cochran to predict who would ultimately become the Democratic candidate in the 2020 U.S. presidential election. However, it allowed him to show that the tool, contrary to an explicit refusal by the company (Google), could be used, if used responsibly, to end the political bundling.
Can interest equal support? The case of the elections in Poland.
Seasoned data scientists like to say that working with tempting but messy data is one of the great sins of data analysis. Incomplete data and weak proxies are often the seeds of far-reaching and wrong conclusions that scientists must avoid.
In a recent interview on using Google Trends to study the spread of Covid-19, Davidowitz referred to one such unsuccessful research project, in which Jeremy Ginsberg (known for his 2008 paper On Detecting Flu Using Google Queries) attempted to re-apply his model to map the US H1N1 swine flu epidemic. While his initial modeling of flu outbreaks was lauded for its accuracy, it did not produce similar results for swine flu: it indicated an extraordinary number of cases among Americans, while actual levels were found to be much lower.
Unlike regular surveys, which can be skewed by biased or untruthful respondents, the predictions from Ginsberg’s modelling were skewed by users’ fears and curiosity, which rose out of proportion to the actual spread of the disease. If Google Trends data were to be used for polling, it would not be free of similar problems. “Donald Trump” could be a breakout search not because people favour him over Joe Biden, but perhaps because they want to learn more about his recent commentary or a tweet (provided Jack Dorsey lifts his ban).
For this reason, traditional researchers may find the intent behind Google searches to be obscured and unclear, which makes the data interesting but unreliable. But some might argue that politics is nothing like science: politics is about momentum. Cochran’s findings, in which survey data were compared with data from Google Trends, are helpful because they suggest that the intent behind Google searches may not always be a worry, at least not on a highly polarized political scene.
The recent presidential election in Poland is a good example. It was widely predicted that the election was on a knife-edge, with the incumbent Andrzej Duda (an ally of the ruling, right-wing PiS) in desperate need to defend his office against the liberal, center-right mayor of Warsaw, Rafał Trzaskowski. After two rounds of voting, the incumbent remained in office, in what the BBC called “the smallest Polish presidential victory since 1989”.
I analysed Google Trends to see if there was anything interesting that would help me understand voter behavior and pick out the campaign moments that were decisive for Duda’s victory.
Just from looking at the chart, one thing was obvious: interest in Trzaskowski (light blue line) was strangely delayed, as if the man had been almost absent from Polish searches until mid-May. Indeed, Trzaskowski joined the race late, taking over on May 15 from the previous PO candidate, Małgorzata Kidawa-Błońska (the only female candidate).
Even though he had been mayor of Warsaw for almost two years at the time, interest in Trzaskowski was relatively low, and certainly not as great as in the other key candidates at the start of the race. This is not a groundbreaking data point, but it matters when we consider that one of his opponents, Szymon Hołownia (light green), a former journalist and independent candidate who was little known to the public before the race, attracted attention and grew significantly in popularity about a week before there was any noticeable interest in Trzaskowski.
From the day Trzaskowski’s candidacy was announced, he followed an almost identical trajectory to Duda’s, with a rather short-lived lead over the incumbent. You may ask, were these two imitating each other and copying each other’s speeches to achieve similar results? Probably not.
In his research using Google Trends election data, Davidowitz made an interesting discovery – a large percentage of electoral searches in polarized countries typically list both candidates.
However, this is not necessarily a sign of support, and certainly not for the person whose name comes second, as people are more likely to put the name of the candidate they support first. For example, during the 2016 election, more than a quarter of searches for “Clinton” (in second place) also included “Trump”. The almost identical trajectories for “Andrzej Duda” and “Rafał Trzaskowski” may suggest that many of these searches included the names of both candidates, and that it took Trzaskowski about a week to be searched for on his own, which is a serious drawback for anyone who is already a week late in running for the presidency.
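The co-mention measure described here is simple to compute once individual queries are available. Google Trends does not expose raw query logs, so the sketch below assumes a hypothetical list of queries obtained some other way; the queries and the helper function are made up for illustration only.

```python
def co_mention_stats(queries, name, other):
    """Among queries mentioning `name`, count those that also mention `other`
    and record which of the two names appears first."""
    name, other = name.lower(), other.lower()
    with_name = [q.lower() for q in queries if name in q.lower()]
    both = [q for q in with_name if other in q]
    name_first = sum(1 for q in both if q.index(name) < q.index(other))
    return {
        "mentions": len(with_name),
        "share_with_other": len(both) / len(with_name) if with_name else 0.0,
        "name_first": name_first,
        "other_first": len(both) - name_first,
    }

# HYPOTHETICAL query log. Not real search data.
queries = [
    "duda trzaskowski sondaż",
    "trzaskowski duda debata",
    "trzaskowski program",
    "duda brzeg",
    "trzaskowski podpisy",
]

stats = co_mention_stats(queries, "trzaskowski", "duda")
print(stats)
```

A high `share_with_other` would support the idea that two candidates’ trend lines move together because they are searched for together, while the `name_first` and `other_first` counts give a rough view of which name people tend to type first.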
The mayor’s lead over the remaining candidates, with the greatest gap in interest between him and the incumbent, was visible around the 4th of July. However, unlike with his competitors, the increase in interest in his candidacy was not driven by a public speech or the announcement of a flagship program, but by the fact that on that day Trzaskowski was collecting the signatures that must be registered in Poland in order to run in elections. Yes, it did mean that support for him was growing, but perhaps it also revealed that a fraction of Poles were ready to vote out of anger.
A similar increase in interest occurred a few days before the polls first opened on June 28, but once again his name appeared next to that of another candidate, the far-right Polish politician Krzysztof Bosak, who had published his version of an economic recovery plan for after the Covid-19 pandemic. From then on, people googled the mayor less frequently than Andrzej Duda, but no less than Krzysztof Bosak, although in the end the far-right politician obtained less than 7% of the vote. Trzaskowski was getting noticed, but not as much or as quickly as was needed to win the election.
From June onwards, interest in his main opponent, as reflected in Google searches, slowly increased. This was due to Duda’s election strategy of remaining visible to national and international media through radical speeches and actions, such as his infamous public appearance in Brzeg, southwestern Poland, where he called the promotion of LGBT rights an “ideology” similar to communism. The speech was widely covered in domestic and foreign media, as planned. At the end of June, Duda once again found himself in the spotlight during his hasty visit to the White House, which the liberal media interpreted as a pretext for additional publicity that led to a defense cooperation agreement unfavourable to Poland.
In the eyes of his supporters, however, it was seen as confirmation of Poland’s strongest ties with Washington in years. Such actions, whether deplorable or not, show that Duda had clearly been advised on how to revive his campaign and strengthen his image as a politician who is willing to cooperate with the West but also to distance himself from the fruits of Western liberalism.
It would now be tempting to assume that in a country with balanced political influence, a right-wing government that openly declares hostility to sexual minorities must infuriate the left, which in turn would translate into greater support for the Polish Left. However, according to Google Trends, this never happened. Searches for the first openly homosexual legislator in Poland, who was the only left-wing candidate in the presidential election, grew only in proportion to interest in the other candidates and remained at a very low level. Except for one spike of interest here and there, the lack of searches for Robert Biedroń confirms speculation that a large part of Polish society is not ready for a left-wing government or a left-wing leader. This is another interesting sociological data point that would be missed in regular polls.
From the very beginning of the presidential election in Poland, it was clear that it would be a tough contest, both for the candidates and for Poles who desperately counted on a change of guard. For this reason, it would be brutal to leave readers with the conclusion that warning signals such as the minimal interest in Robert Biedroń, a man whose opinion could have carried weight in response to Duda’s sermon in Brzeg, or the lack of interest in Trzaskowski earlier in the race, provided more information than any political poll of the time.
Instead, consider the obvious benefits of using this method to measure the effectiveness of political engagement, and to tell whether the public is sympathetic to certain programs, if at all. It can also be of great benefit to the candidates themselves, as they can measure their wins and flops in real time. Would Trzaskowski have been more active if he had known he was losing to the incumbent because of the latter’s radical public appearances? Probably.
The above insights are just an introduction to the vast knowledge that can be learned from Google Trends. Rather than relying on the names of the candidates, we could also look at specific keywords that can tell us much more about the people who perform these searches, such as what part of the country they come from and whether they have any other associations with the candidates. But that may be a topic for another analysis.