Thursday, February 27, 2014

Pew results great for BJP, but what about the methodology?

Following in the footsteps of a few Indian opinion polls ahead of the Lok Sabha elections, the US-based Pew Research Centre has predicted a "crushing defeat" for the Congress and a stupendous victory for the BJP, suggesting that 63 percent of Indians favour the latter. This new American survey is more emphatic in its support for the BJP than the other surveys.

Representational Image. AFP

Expectedly, the BJP is happy and the Congress, dismissive. The observers are happy because they too predict that the BJP will win and the Congress will lose. It looks as if there is no space for the third front, because the pollsters didn't ask their respondents whether they "favour" it or not.

There are two issues to be looked at here: one, the results and two, the methodology and its limitations.

The results are all out in the open, but the methodology is not, except for an overview without details.

First, the results.

The Pew results are comparable to other surveys that predict BJP becoming the single largest party and the Congress and its allies losing big time. The Indian studies were realistic and consistent with the generally observed trends - the pockets where the BJP and its partners will do well, where the Congress will get more seats, and what will be their possible vote-share.

All of them noted that the South will continue to be impenetrable for the BJP, although the party's vote-share will rise substantially. But look at the Pew results. They give the BJP 59 per cent voter support, with the Congress and others scoring 18 and 17 per cent respectively. This sounds crazy because in both Kerala and Karnataka the Congress is projected to do well (by Indian studies which give the BJP a clear edge across the country) and in Tamil Nadu, it is the AIADMK that will lead. The results from the South are reason enough to suspect holes in the Pew methodology.

Moreover, look at the following statements by the study:

"The party's weakest backing (54 percent) is in the western states of Maharashtra, Chhattisgarh and Gujarat (Modi is the chief minister leading a BJP-led Gujarati state government)" - means Gujarat is among the states where the BJP is comparatively weak!

"Congress' strongest regional support (30 percent) is in the eastern states of Odisha, Bihar, West Bengal and Jharkhand, among India's poorest areas and home to 270 million people." - the pollsters don't know about the BJD and the TMC in Odisha and West Bengal respectively, and their surveyors didn't pick up anything on the political dynamics in Bihar!

More laughable is the choice of Anna Hazare as the prime minister. According to Pew, his popularity for the PM job is 69 per cent, while Modi, the man ahead of him, has the support of 72 per cent of Indians.

How do we get these results? And how do we overlook these obvious blunders while relishing the pro-BJP and pro-Modi results?

The answer lies in its methodology. For a complex country with dizzying levels of diverse demographies, how did they select a sampling frame in which all Indians have an equal probability of being included? To make such predictions where the results are directly attributed to Indians, shouldn't they go for equal probability sampling? Of course they should, although it is not easy in a country such as India.

Ideally, one should go in for a multi-stage, stratified sampling to develop a sampling frame from which one can randomly choose households/people for interviews. All Indians should have an equal probability of being represented by these selected samples. Even in such situations, there will be limitations.
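To make the idea concrete, here is a minimal sketch of a two-stage stratified design in which every household ends up with the same chance of selection. It is only an illustration under stated assumptions: the strata, area names, household counts and sampling fractions are all invented, the function name is hypothetical, and nothing here describes what Pew or any Indian pollster actually did.

```python
import random
from fractions import Fraction


def two_stage_equal_probability_sample(strata, area_fraction,
                                       household_fraction, seed=0):
    """Stratify, sample areas, then sample households inside chosen areas.

    strata maps stratum name -> {area name -> list of household ids}.
    Using the same area fraction in every stratum and the same household
    fraction in every area gives each household the same overall
    inclusion probability: area_fraction * household_fraction.
    """
    rng = random.Random(seed)
    sample = []
    inclusion_probability = {}
    for areas in strata.values():
        # Stage 1: simple random sample of areas within the stratum.
        n_areas = Fraction(len(areas)) * area_fraction
        if n_areas.denominator != 1:
            raise ValueError("area_fraction must give a whole number of areas")
        k = int(n_areas)
        chosen = set(rng.sample(sorted(areas), k))
        for area_name, households in areas.items():
            # Stage 2: simple random sample of households within the area.
            n_hh = Fraction(len(households)) * household_fraction
            if n_hh.denominator != 1:
                raise ValueError("household_fraction must give whole households")
            m = int(n_hh)
            # Design probability, recorded for every household in the frame.
            p = Fraction(k, len(areas)) * Fraction(m, len(households))
            for household in households:
                inclusion_probability[household] = p
            if area_name in chosen:
                sample.extend(rng.sample(households, m))
    return sample, inclusion_probability


# Invented toy frame: two strata with areas of very different sizes.
strata = {
    "urban": {f"U{i}": [f"U{i}-{j}" for j in range(size)]
              for i, size in enumerate([20, 40, 60, 80])},
    "rural": {f"R{i}": [f"R{i}-{j}" for j in range(size)]
              for i, size in enumerate([10, 30, 50, 70, 90, 110])},
}
sample, probs = two_stage_equal_probability_sample(
    strata, area_fraction=Fraction(1, 2), household_fraction=Fraction(1, 10))
print(len(sample), set(probs.values()))
```

Even in this toy version the limitation mentioned above shows up: the probabilities are equal only because the frame lists every household in every area, which is exactly the information that is hard to come by in the field.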

The Pew study makes bold statements based on a survey which employed area-probability sampling. Instead of individuals, they depend on areas. They haven't given out any details on how they have done the sampling, but one could assume that they have identified areas as units that are representative of similar units across the country. Did these areas truly represent India or similar units across the country that had an equal probability of being included in the areas selected by them?

In a poll like this, what matters is not the size of the sample. The sample can be 1,000 or a million and can still go wrong if the science is wrong. The sample size can be less than 1,000 and still go right because the science is correct. We won't know the Pew methodology until they spell out the details and, more importantly, clearly state the limitations.
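A small simulation shows why size alone does not rescue a bad design. The electorate below is entirely hypothetical (the group sizes and support levels are invented for illustration, not taken from any survey): a huge sample drawn only from an easy-to-reach group is compared with a small sample drawn at random from everyone.

```python
import random

rng = random.Random(42)

# Hypothetical electorate: 1 = supports party X, 0 = does not.
# Group A (30% of voters) is easy to reach and leans heavily to X;
# group B (70% of voters) is harder to reach and leans away from X.
group_a = [1 if rng.random() < 0.80 else 0 for _ in range(60_000)]
group_b = [1 if rng.random() < 0.40 else 0 for _ in range(140_000)]
population = group_a + group_b
true_share = sum(population) / len(population)

# Large but biased sample: 50,000 respondents, all from group A.
big_biased = rng.sample(group_a, 50_000)
big_estimate = sum(big_biased) / len(big_biased)

# Small but random sample: 500 respondents from the whole electorate.
small_random = rng.sample(population, 500)
small_estimate = sum(small_random) / len(small_random)

print(f"true share of support:      {true_share:.3f}")
print(f"biased sample of 50,000:    {big_estimate:.3f}")
print(f"random sample of 500:       {small_estimate:.3f}")
```

Under these assumptions the true support is a little over half, the 50,000-strong biased sample lands near 80 per cent, and the 500-strong random sample stays within a few points of the truth. More interviews only make the wrong answer more precise.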

It's extremely difficult to base one's survey on representative areas in a country such as India. Except in certain pockets, it's extremely difficult to find areas with demographic uniformity. And if one is selecting areas that would account for such demographic diversity, the public should know how it was done. For instance, how does one include a rich neighbourhood which has a small slum (with more household members) tucked away? How does one randomly choose representative samples from this area?

These are basic, but pertinent questions. It's high time that pre-election pollsters gave the details of their methodology and limitations upfront so that the electorate is not influenced by results which obviously have factual inaccuracies.

