- [Analysis] Can Google Trends be Used for Political Polling? - January 26, 2021
Google Trends is believed to help predict everything from market movements to the most popular Halloween costumes, but what happens if we try to predict political elections with it?
Google Trends is a handy marketing tool that was launched over 14 years ago for measuring the popularity of search queries in Google’s search engine across the globe. It has aged well and attracted a varied pool of devotees, including marketers, journalists and scientists, who turn to it whenever they struggle to decipher the human psyche: in other words, whenever they are on duty.
Among its die-hard practitioners is Seth Stephens-Davidowitz, a scientist whose PhD thesis was based on data from Google Trends, and who was among the first to position the tool as an alternative to traditional sociological research, which he thinks often fails to quantify human behaviours with precision. Davidowitz believes the conditions under which people search on Google, most likely alone and on their private devices, are better suited to gathering people’s real opinions, concerns, and sentiments than traditional surveys.
He famously argued his point in a New York Times article, in which he demonstrated that racial aversion played a huge role in the 2008 presidential election in the United States and that Barack Obama, as a future president, had to overcome far more adversity than anyone had expected or any election poll at the time had revealed.
He achieved that by creating a map of racially charged states based on Google searches that included the word “n****r”, measuring how popular such searches were across the country. The resulting scale was then correlated with the final percentage of votes Barack Obama received in each state. Controlling for the strong Democratic support in 2008 and for historical data on support for John Kerry in 2004, Davidowitz also estimated the votes Obama should have received. For example, he estimated that Obama would take about 57% of the vote in Denver and in Wheeling, whereas in reality his result in those two cities was much smaller: he received only 48% of the vote there.
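The logic of that comparison can be sketched in a few lines of Python. This is only an illustration of the idea, not Davidowitz's actual model: the function names are made up for this example, and the list of markets is hypothetical data. Only the 57% and 48% figures come from the text above.

```python
from math import sqrt


def shortfall(expected_share, actual_share):
    """Percentage points by which the actual result fell short of the estimate."""
    return expected_share - actual_share


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Figures quoted in the text: 57% estimated, 48% received.
print(shortfall(57, 48))  # 9

# HYPOTHETICAL markets: (racially charged search index, shortfall in points).
markets = [(10, 0.5), (35, 2.0), (60, 4.5), (90, 8.0)]
r = pearson([m[0] for m in markets], [m[1] for m in markets])
print(round(r, 2))
```

A correlation close to 1 in such a table would mean that the areas with the most racially charged searches were also the areas where the candidate underperformed the most, which is the pattern the article describes.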
For Davidowitz, it was clear that something was missing from the estimates, something that was holding Obama’s share of the vote well below the level he had predicted. He traced it back to racial prejudice, for which evidence was piling up from Google searches made in Colorado and Ohio. Against claims that were popular at the time, Davidowitz went on to show that prejudice was still a thing in modern America. His NYT article made waves and became one of the first uses of Google Trends not so much to predict as to contradict official polling.
A slightly different use of Google Trends to comment on the exercise of the democratic right to vote followed the 2016 referendum on EU membership (Brexit) in the UK. In addition to reports of declines of stock prices, the British media discussed the most popular search terms on Google that day, including the question “what’s the EU” which, for some, signaled indifference and a lack of recognition of the seriousness of the situation among voters.
This inherently selective use of Google data to comment on the referendum results, as opposed to Davidowitz’s more detailed research, was perhaps anecdotal and fed into the sensationalism of the time. Nevertheless, it did help to create a narrative about defrauded citizens who did not know what they were voting for when they voted to leave the European Union, a narrative that later became the basis for calls for a second referendum.
If people’s true feelings can be measured with a tool with near-seismographic capabilities, why not use it before the elections and not after them? Among those who used Google Trends data to measure a candidate’s public popularity before an election was Tom Cochran, CEO of a public relations firm called 720 Strategies, who conducted a Google Trends survey in response to Pete Buttigieg’s rise in popularity in a November 2019 Iowa poll, just before the 2020 Iowa Democratic presidential caucuses. It was a noteworthy event in a country where caucuses are seen as an indication of how a presidential candidate will later perform in the primaries.
Cochran used Google Trends to see how support for Democratic candidates had changed over time, comparing Google search data for Pete Buttigieg, Elizabeth Warren, Joe Biden, and Bernie Sanders over three different timeframes (the last 12 months, the last 30 days and the last 7 days). He then explained how candidates come to dominate Google by linking apparent spikes in popularity to specific events, favorable or not. Lastly, Cochran compared his own results with national polls to see whether polls such as the one in Iowa really did matter.
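A comparison across timeframes like the one described above can be sketched as follows. Google Trends reports relative interest scores rather than raw search counts, so the sketch works with shares of total interest. The function name and all the numbers are hypothetical, and the short windows stand in for the 12-month, 30-day and 7-day views; this is not Cochran's actual method.

```python
def window_shares(daily_interest, days):
    """Each candidate's share (in %) of total interest over the last `days` points."""
    totals = {name: sum(series[-days:]) for name, series in daily_interest.items()}
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {name: 0.0 for name in totals}
    return {name: 100 * total / grand_total for name, total in totals.items()}


# HYPOTHETICAL daily interest scores, oldest first. Not real Google Trends data.
daily_interest = {
    "Buttigieg": [10, 12, 11, 15, 30, 42, 38, 35, 33, 36],
    "Warren":    [40, 38, 41, 39, 35, 30, 31, 29, 30, 28],
    "Biden":     [45, 44, 46, 43, 40, 41, 39, 42, 40, 41],
    "Sanders":   [35, 36, 34, 37, 33, 32, 34, 33, 35, 34],
}

# Short windows used here as stand-ins for longer timeframes.
for days in (10, 7, 3):
    shares = window_shares(daily_interest, days)
    ranking = sorted(shares, key=shares.get, reverse=True)
    print(days, [(name, round(shares[name], 1)) for name in ranking])
```

If a candidate's share grows as the window shrinks, interest in that candidate is recent rather than sustained, which is the kind of shift such a comparison is meant to surface.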
As of November 2019, the intent demonstrated through Google searches did indeed seem to line up closely with official polling data. The research may not have allowed Cochran to predict who would ultimately become the Democratic candidate in the 2020 U.S. presidential election. However, it allowed him to show that the tool, contrary to an explicit refusal by Google, could be used for political polling, provided it is used responsibly.
Can interest equal support? The case of elections in Poland.
Data scientists like to say that working with tempting but messy data is one of the great sins of data analysis. Incomplete data and weak proxies often lead to far-reaching and plainly wrong conclusions that can be ruinous to even the most inventive theories.
In a recent interview on using Google Trends to study the spread of Covid-19, Davidowitz referred to one such unsuccessful project, in which Jeremy Ginsberg (known from his 2008 paper On Detecting Flu Using Google Queries) attempted to re-apply the model to map the US H1N1 swine flu epidemic. While his initial modeling to predict flu outbreaks was lauded for its accuracy, it did not produce similar results for swine flu: it indicated an extraordinary number of cases among Americans, while actual levels were found to be much lower.
The predictions from Ginsberg’s modelling were skewed not by dishonest pollsters, as in traditional surveys, but by users’ fears and curiosity, which manifested themselves in search volumes that rose far beyond the actual spread of the disease.
If Google Trends data were to be used for predicting politics, it wouldn’t be exempt from similar problems. “Donald Trump” could be a record-breaking search not because people favour him over Joe Biden, but perhaps because they want to learn more about his recent commentary or a tweet (provided Jack Dorsey lifts his ban).
For this reason, traditional researchers may find the intent behind Google searches too obscure and unclear, disqualifying it as a reliable data source. But others will argue that politics is nothing like science: politics is about momentum. Cochran’s findings, in which survey data were compared with data from Google Trends, could be helpful here, as they suggest that the intent behind Google searches shouldn’t be much of a problem, at least not on a highly polarized political scene like the one in the US.
The recent presidential election in Poland can be a good example. It was widely predicted that the election was on a knife edge, with the incumbent Andrzej Duda (an ally of the ruling right-wing party PiS) in desperate need to defend his office against the liberal, center-right mayor of Warsaw, Rafał Trzaskowski. After two rounds of voting, the incumbent remained in office, in what the BBC called “the smallest Polish presidential victory since 1989”.
I analysed Google Trends to see if there was anything interesting that would help me understand voters’ behavior and single out the campaign moments that were decisive for Duda’s victory.
From a quick look at the chart one thing is obvious: the interest in Trzaskowski (light blue line) was strangely delayed, as if the man was almost absent from Polish searches until mid-May. Indeed, Trzaskowski got involved late, taking over on May 15 from the former PO candidate Małgorzata Kidawa-Błońska (the only female candidate).
Even though he had been mayor of Warsaw for almost two years at the time, interest in Trzaskowski was relatively low, and certainly not as great as in the other key candidates at the start of the race. This is not a groundbreaking data point, but it is important when we consider that one of his opponents, Szymon Hołownia (light green), a former journalist and independent candidate who was little known to the public before the race, attracted enormous attention and grew in popularity on Google about a week before there was any noticeable interest in Trzaskowski.
From the day his candidacy was announced, Trzaskowski followed almost the same trajectory as Duda, with a rather short-lived lead over the incumbent. You may wonder: were these two imitating each other so well that they achieved similar trending results? Probably not. Most likely their names were searched together.
In his research using Google Trends election data, Davidowitz made an interesting discovery – a large percentage of electoral searches in polarized countries typically list both candidates.
However, this is not necessarily a sign of support, and certainly not for the person whose name comes second, as people are more likely to put the name of the candidate they support first. For example, during the 2016 election, more than a quarter of searches for “Clinton” (second place in a query string) also included “Trump” (first place in a query string). The almost identical trajectory for “Andrzej Duda” and “Rafał Trzaskowski” may suggest that many of these searches included the names of both candidates, and that it took Trzaskowski about a week to be searched for separately on Google, which is a serious drawback for anyone who is already a week late in running for the presidency.
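The two measures mentioned above, how often one name appears alongside another and which name comes first, could be computed along the following lines. Note the assumptions: Google Trends does not expose individual queries, so the query list below is entirely hypothetical, and the function is an illustrative sketch rather than the method Davidowitz used.

```python
def co_mention_stats(queries, first, second):
    """Among queries mentioning `second`, return a tuple of:
    - the share that also mention `first`
    - the share of those co-mentions in which `first` appears before `second`
    """
    a, b = first.lower(), second.lower()
    lowered = [q.lower() for q in queries]
    with_second = [q for q in lowered if b in q]
    if not with_second:
        return 0.0, 0.0
    both = [q for q in with_second if a in q]
    co_share = len(both) / len(with_second)
    if not both:
        return co_share, 0.0
    first_leads = sum(1 for q in both if q.index(a) < q.index(b))
    return co_share, first_leads / len(both)


# HYPOTHETICAL query log, invented for illustration only.
queries = [
    "duda trzaskowski sondaż",
    "trzaskowski program",
    "duda trzaskowski debata",
    "trzaskowski duda",
    "duda",
]

co_share, lead_share = co_mention_stats(queries, "duda", "trzaskowski")
print(co_share)  # 0.75
print(round(lead_share, 2))
```

In this made-up log, three of the four queries that mention Trzaskowski also mention Duda, and Duda's name comes first in two of those three.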
The mayor’s dominance over the remaining candidates, with the greatest gap in interest between him and the incumbent, was visible around the 4th of July. However, unlike with his competitors, the increase in interest in his candidacy was not driven by a public speech or the announcement of a flagship program, but only by the fact that on that day Trzaskowski was collecting the signatures that must be registered in Poland in order to run in elections. Yes, it was a signal of improving support, but perhaps it also revealed that a fraction of Poles were ready to vote out of anger.
Another similar increase in interest occurred a few days before the polls opened for the first round on June 28, but once again his name was found next to that of another participant, the far-right Polish politician Krzysztof Bosak, who published his version of an economic recovery plan for after the Covid-19 pandemic. Since then, people have been googling the mayor less frequently than Andrzej Duda, but no less than Krzysztof Bosak, although in the end the far-right politician obtained less than 7% of the vote. Trzaskowski was getting noticed, but not as much or as quickly as was needed to win the election.
From June onwards, the interest in his main opponent, as reflected in Google searches, slowly increased. This is due to Duda’s election strategy of remaining visible to national and international media through radical speeches and actions, such as his infamous public appearance in Brzeg, southwestern Poland, where he called the promotion of LGBT rights an “ideology” similar to communism. The speech was widely covered in domestic and foreign media, as planned. At the end of June, Duda once again found himself in the spotlight during his hasty visit to the White House, which the liberal media interpreted as a pretext for additional publicity that led to a defense cooperation agreement unfavourable for Poland.
In the eyes of his supporters, however, this was seen as confirmation of Poland’s strongest ties with Washington in years. Such actions, whether deplorable or not, show that Duda was certainly advised on how to revive his campaign and strengthen his image as a politician who is willing to cooperate with the West, but also to distance himself from the fruits of Western liberalism.
It would now be tempting to assume that in a country with balanced political influence, a right-wing government that openly declares hostility to sexual minorities must infuriate the left, which in turn would translate into greater support for the Polish Left. However, according to Google Trends, this never happened. Searches for the first openly homosexual legislator in Poland, who was the only left-wing candidate in the presidential election, grew only in proportion to the interest in other candidates, and remained at a very low level throughout. Except for one spike of interest here and there, the lack of searches for Robert Biedroń confirms speculation that a large part of Polish society is not ready for a left-wing government or a left-wing leader. This is another interesting sociological data point that would be missed in regular polls.
From the very beginning of the presidential election in Poland, it was clear that it would be a tough contest, both for the candidates and for the Poles who desperately counted on a change of guard. For this reason, it would be brutal to leave readers with the conclusion that warning signals such as the minimal interest in Robert Biedroń, a man whose opinion could have mattered in response to Duda’s sermons in Brzeg, or the lack of interest in Trzaskowski earlier in the race, provided more information than any political poll of the time.
Instead, consider the obvious benefits of using this method to measure the effectiveness of political engagement, and to tell whether the public is sympathetic to certain programs, if at all. It can also be of great benefit to the candidates themselves, as they can measure their wins and flops in real time. Would Trzaskowski have been more involved if he had known he was losing to the incumbent because of the latter’s radical public appearances? Probably.
The above insights are just an introduction to the vast knowledge that can be learned from Google Trends. Rather than relying on the names of the candidates, we could also look at specific keywords that can tell us much more about the people who perform these searches, such as what part of the country they come from and whether they have any other associations with the candidates. But that may be a topic for another analysis.