It’s the season for the second or third round of opinion polls in Kerala.
Compared to the countrywide surveys by the national media, the polls in Kerala are a welcome relief as what’s presented here is a constituency-wise assessment and not a conversion of aggregated vote-shares into possible number of seats. But there’s still a critical question: Are they good enough to clearly predict who will win or lose in each constituency?
At the heart of a sample survey, such as the opinion polls, is the integrity of the sample. It’s not about the size, but about a principle called equal probability. The bigger the sample, the better; but a bigger sample doesn’t offer accuracy if the equal probability principle is compromised.
The principle is simple: The sample chosen for each constituency should give every voter an equal probability of being included. In other words, if 500 people are chosen to represent a constituency of 9 lakh voters, the ideal sample is one in which every one of those 9 lakh voters had the same chance of making it into the 500. Without this representative element, raising the sample to 1,000 or even 10,000 people means nothing.
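For the technically inclined, the idea can be shown in a few lines of code. This is a minimal sketch with invented numbers, not how any channel actually sampled: a simple random draw, in which every voter has the same chance of selection, recovers the true vote-share to within sampling error.

```python
import random

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical electorate of 9 lakh voters; 48% prefer party A.
electorate = ["A"] * 432_000 + ["B"] * 468_000

# Equal-probability (simple random) sample of 1,000 voters:
# random.sample gives every voter the same chance of inclusion.
sample = random.sample(electorate, 1_000)

share_a = sample.count("A") / len(sample)
print(f"Estimated share of A: {share_a:.1%}")  # close to the true 48%
```

A non-random selection, say interviewing only in town centres, breaks this guarantee, and no increase in sample size repairs it.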
In the last two surveys - one by Mathrubhumi News and the other by Asianet News - the prediction was a near-landslide victory for the UDF across the state and an unambiguous victory for the BJP in Thiruvananthapuram. Mathrubhumi claimed it had chosen 500 voters each from three Assembly segments in each Parliament seat, while Asianet News said its sample in each constituency comprised 1,000 people from each of the Assembly segments.
However, what the channels didn't specify was how representative these samples were, how the selection was done, and what they did to ensure the samples were as close to equal-probability samples as possible. There was not a word about any of this.
In any sample survey, whether it's academic research or an official enumeration like the NSSO rounds, the standard practice is to state the methodology and its limitations upfront. In a constituency-wise analysis like the ones the Malayalam channels did, it's mostly the sampling that matters, because the rest of the methodology is simple: you choose the sample, ask the questions, enumerate the responses, tabulate them and draw the results. There's hardly any complex methodology here, unlike in aggregated samples, where a lot of mathematical work, including modelling, is applied to convert the aggregated vote-share into seats.
In Kerala, it's just calculating the vote-share that the sample indicates for each party. If you get the sample right, you may get the prediction almost right, barring extenuating circumstances (such as the assassination of a big leader) and response biases (misleading responses, inhibitions in answering some questions, deliberately wrong answers, etc.).
How could the channels have obtained equal-probability samples in each constituency? When choosing only three Assembly segments from each Parliamentary seat, as Mathrubhumi has reportedly done, how did they ensure that the three they chose represented the entire constituency? In Western countries, pollsters have over time identified bellwether constituencies that are supposedly indicative of certain trends. Did Mathrubhumi identify such key Assembly segments? If it did, on what basis?
Moreover, in the absence of a sampling frame (a list from which people can be randomly chosen, which will ensure that all the socio-economic features of the electorate are represented in the right proportion as it exists in the constituency), how did they make the selection?
Choosing samples from all the Assembly segments in each constituency, as Asianet News has reportedly done, makes better sense because it indeed provides for better representation. But the same question of sample-selection exists here too. How did they achieve equal probability?
This is where readers have to be cautious and channels have to be transparent. For instance, Hindus made up about 67 per cent of the Mathrubhumi sample, whereas the Hindu share of the state's population is only about 55 per cent. So clearly, there's a flaw. If Hindu sentiment does play a part in the election, the sample will overstate it because of this disproportionate representation; in this particular survey, that means the BJP's chances get overstated. It's plainly bad selection. Similarly, the same sample of 5,000 people had only 1,000 women, whereas the state has more women voters than men. Such a sample is simply unreliable.
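The arithmetic of such a skew is easy to see. In this hypothetical illustration (the support figures are invented, not taken from any survey), a group that leans towards one party is over-represented exactly as in the 67-versus-55 per cent example, and the raw sample estimate overstates that party's vote-share:

```python
# Hypothetical support levels (invented for illustration):
# 40% of group X backs party P, against 20% of everyone else.
support = {"X": 0.40, "other": 0.20}

population_share = {"X": 0.55, "other": 0.45}  # true composition
sample_share     = {"X": 0.67, "other": 0.33}  # skewed sample

# Raw estimate uses the sample's skewed mix; the weighted estimate
# re-weights responses to the true population proportions.
raw      = sum(sample_share[g] * support[g] for g in support)
weighted = sum(population_share[g] * support[g] for g in support)

print(f"raw sample estimate: {raw:.1%}")       # 33.4%
print(f"population-weighted: {weighted:.1%}")  # 31.0%
```

Pollsters routinely correct such skews by re-weighting responses to known population proportions (post-stratification); if the channels did anything of the sort, they didn't say so.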
In such situations, the results could be as accurate as guesswork or crystal-gazing.
In the case of Asianet News, there was no disclaimer or statement of limitations upfront. The composition of the sample, how the selection was done and so on were not stated. Only the numbers were made public.
It's high time people were made more aware of the technical issues involved in opinion polls, because polls can also be misused to influence voters. A lot of unsuspecting fence-sitters might be swayed by these predictions, and hence the polls could also be used as a vehicle for proxy-rigging.
So the next time the channels make bombastic predictions, do ask these questions: How representative is your sample, and what have you done to ensure it's an equal-probability sample? If there's a vote-share-to-seats conversion, ask how the conversion was made (it could be a mathematical formula or a model). Also ask about the limitations of the study.
Otherwise, it will end up in a strange spectacle, in which the channel’s representatives themselves contradict their surveys as was seen last night.
The author is a former journalist and UNDP Senior Adviser in Asia Pacific who is presently a writer based out of Chennai and Thiruvananthapuram.
Views expressed are the author's own.