Saturday 29 March 2014

The way our data is organized

This post is specifically for people who are looking at either the detailed survey data or at least the Constituency-level summaries of our data. So if you're not one of those serious data-monkeys, begone!

OK, now that only the true believers are here, let me first talk about the CSV. The CSV is a complete representation of all the data collected on a given survey form (you can see a template in English here). Here's a brief explanation:
  • The data begins with a set of demographics (location type, gender, etc.) and each row has the actual selection of the respondent, as seen on the form.
  • The data then goes on to show information about the voter's preferences in terms of whom s/he will vote for, why, etc.
  • Thereafter comes the most important part: the importance and performance relating to each issue, in two sets of columns. For example, the first issue that someone could respond to is "Agricultural loan availability". In the CSV, you will see a column marked "I: 24 1 Agricultural loan availability", followed by one marked "P: 24 1 Agricultural loan availability". The I: 24... columns carry the original selection of the respondent: Low, Medium or High. The P: 24... columns, again, are the original response and they carry values of Bad, Average or Good.
  • After the selection columns, there are some "control" columns, including a unique ID for each record.
  • After the control columns, to make calculations easier, we've added numerical translations of the selections: Low becomes 1, Medium - 2 and High - 3. So also, on the Performance side, Bad becomes 1, Average - 2 and Good - 3. There are such translation columns for every pair of issue-related Importance and Performance columns.
  • Following the calculation base columns, we have Scores for each Issue. The Score for an issue is calculated as Importance * Performance * 10 / 9, to get all numbers on a "base" of 10 (the highest possible product is 3 * 3 = 9).
  • After the last Score (Traffic congestion), we have an Average Score, calculated using only non-zero values (therefore ignoring issues that the respondent has not responded to).
  • Finally, we have a WealthIndex, a numerical representation of the assets that the respondent owns (taken with points for cattle, TV, motorbike and car, all of which appear in the Demographics section of the record).
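For those who want to reproduce the Score columns from the raw selections, here is a minimal sketch of the translation and scoring steps described above. The function names are hypothetical, and treating an unanswered issue as 0 is an assumption based on the "non-zero values" remark; the actual spreadsheet formulas may differ in detail.

```python
# Numerical translations of the selections, as described in the list above.
IMPORTANCE = {"Low": 1, "Medium": 2, "High": 3}
PERFORMANCE = {"Bad": 1, "Average": 2, "Good": 3}


def issue_score(importance, performance):
    """Score one issue on a base of 10; an unanswered issue scores 0 (assumption)."""
    i = IMPORTANCE.get(importance, 0)
    p = PERFORMANCE.get(performance, 0)
    return i * p * 10 / 9


def average_score(scores):
    """Average over non-zero scores only, ignoring issues with no response."""
    answered = [s for s in scores if s > 0]
    return sum(answered) / len(answered) if answered else 0.0


# One made-up respondent: (importance, performance) per issue; blanks mean no response.
responses = [("High", "Good"), ("Medium", "Bad"), ("", "")]
scores = [issue_score(i, p) for i, p in responses]
print(scores)
print(average_score(scores))
```

In this made-up record the third issue is left blank, so it drops out of the average instead of dragging it down.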
That's about the CSV. The other online chart that you can look at is the Constituency Summaries that you can find here. The chart has four tabs. The first three (surprisingly named Issues, Performance and Scores!) are summaries of the columns from the CSV, but separated into their own sheets for easy assessment. For example, if you want to see what the newspapers and TV channels have been carrying as numbers for each Constituency, you can look at the Performance tab. If you want to see what we at Daksh think is the way the MP has done, look at the Scores tab. And the Issues tab will tell you what the issues are in each Constituency (although you'll have to copy out the issues in the header and their scores and then sort them as you see fit). The last tab is a summary of the Issues, Performance and Scores at a National level, averaging each element. This tab therefore provides a National backdrop to compare local issues, performance and scores with.
Hope this helps. If you have questions, please feel free to comment below and we'll respond as quickly as we can.

Friday 28 March 2014

Jobs, Please

The DAKSH-ADR Survey 2014 reached over 2 lakh respondents across the country from various backgrounds. We asked people to identify and rate the issues that are important to them when they vote in an election. Across geographical, gender, wealth, age, caste and religious divides, the top issue was the need for better employment opportunities. This got an average score of 8 out of 10 in our survey.  The near unanimity on better employment opportunities being an important consideration marks a clear shift from the “bijli, sadak, paani” approach of the 1990’s. While “bijli, sadak, paani” still rank very high because of the poor governance in the country, the ascendancy of employment as an issue in elections (at least in the minds of the people) is an important factor.
Now it is up to the political parties to explain how they are going to create better employment opportunities. It is not good enough to say that they will create better opportunities; we need to hear concrete ideas from them. If Mr. Modi has really generated better employment in Gujarat, then we need to hear from him how he achieved it and how he can replicate the model in other parts of the country. And unlike better roads, a mere reduction in corruption in the labour department will not generate better employment opportunities.
The Congress’ manifesto released earlier this week promises creation of 100 million new jobs- great, but how? There is no answer to this. I am sure Mr. Modi and the BJP will promise more without any details of how they will achieve it.  So, let’s hope we hear some new ideas, soon. Else, we are in for another round of lots of promises, no delivery and consequent disappointment.

Thursday 27 March 2014

The top and the bottom of the survey

With the MP survey results coinciding with my son's exams, I guess it's not surprising that my thoughts turn to how our MPs have scored :). And overall, I must say it's a pretty poor showing across the board. I did some quick analysis on the top and bottom scoring 20 MPs. Here are some interesting things that popped out.

Top 20:

  • Only ONE MP has scored an A (80 per cent +).
  • The top 20 are the only ones that have scored a first class (60 percent +).
  • There was only one MP from North India. The rest are from South, Eastern and Western India.
  • Kerala and Maharashtra share the top spot with 5 MPs each. Karnataka, Tamil Nadu and Odisha have two each. Delhi, Goa, Chhattisgarh and West Bengal round out the top 20 with 1 each.
  • There are 10 MPs from the Congress, 4 from the Shiv Sena, 3 from the BJP, 2 from the Biju Janata Dal (BJD) and one from the All India Trinamool Congress (AITC). 
  • Considering that Shiv Sena has only 10 MPs in the Lok Sabha, it is extremely creditable that they have 4 MPs in the top 20. BJD has 14 MPs and AITC has 18 MPs in the Lok Sabha, but have 2 and 1 MP respectively, in the top 20 list. So the smaller, regional parties definitely seem to be doing a better job at managing their constituencies.
  • Congress has about 1.8 times as many MPs as the BJP (200 Congress MPs to 112 from BJP). But in the top 20, the Congress has about 3.3 times as many MPs as the BJP (10 Congress MPs to 3 from BJP).
  • For me personally the sad thing was that there was not a single female MP in the top 20.
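If you want to check the party arithmetic yourself, here is a quick sketch that uses only the seat counts and top-20 counts quoted in the list above. It simply divides one by the other; it is not how the survey scores themselves were computed.

```python
# Seats in the Lok Sabha and MPs in the top 20, as quoted in the list above.
seats = {"Congress": 200, "BJP": 112, "AITC": 18, "BJD": 14, "Shiv Sena": 10}
top20 = {"Congress": 10, "BJP": 3, "AITC": 1, "BJD": 2, "Shiv Sena": 4}

# Share of each party's MPs that made it into the top 20, in percent.
share = {party: 100 * top20[party] / seats[party] for party in seats}

for party, pct in sorted(share.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{party:10s} {top20[party]:2d} of {seats[party]:3d} MPs = {pct:4.1f}%")

# Congress-to-BJP ratios, overall and within the top 20.
overall_ratio = seats["Congress"] / seats["BJP"]
top20_ratio = top20["Congress"] / top20["BJP"]
print(round(overall_ratio, 2), round(top20_ratio, 2))
```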
Moving on to the bottom of the pile...


Bottom 20:

  • The most astonishing thing is that only two states figure in this list: Punjab (and Chandigarh) and Madhya Pradesh.
  • There are 10 MPs from Punjab, 9 MPs from Madhya Pradesh and one from Chandigarh.
  • Shiromani Akali Dal truly is SAD. With just 4 MPs in the Lok Sabha, 3 of them figure in the bottom 20. It's the only one amongst the regional parties that figures here.
  • The BJP has 4 MPs in the list and the Congress has 13 MPs in the list.
  • And very sadly, 6 female MPs figure in this list.
If my son's class had scored as badly as this class of MPs has done, I'm sure that the evaluators would have had lots to say. I wonder what we the evaluators (voters) will have to say once the 2014 exam season begins.

The methodology behind our Survey

Looking at the results of our survey, people have come back with some interesting comments and questions:
  • "You talked to two-and-a-half lakh people? Are you serious? Why?"
  • "You did this offline? Using paper and pen? OMG! Why not online?"
  • "You used semi-professional investigators? Why? You could've just used students from universities, for free!"
  • "Only 500 people? That's WAY too small!"
  • "Yeah right, you've been to vague-o places in India, sure."
And so on. Well, let me put the confusion to rest, as best I can, in this longish post. I'll describe the process we used in the survey and, in some cases, the reasons for the steps in the process.
To begin with, we decided that this would be a completely offline survey. Why? First of all, a very, very small percentage of India is available online (meaning, online enough to answer a survey, not online enough for Facebook or WhatsApp). Second, there's a strong bias towards certain socioeconomic segments among Indian Internet users anyway, so an online-only survey would not be a sample that is representative of the aam aadmi. Finally, if we did the offline survey as planned, an online version would be unnecessary anyway.
Having laid that beast to rest, we moved on to design the survey itself. We've done this for many years now and the key attribute of the model is that it is self-correcting: each respondent tells us what issues are important for him/her AND how his/her MLA/MP has performed re: that issue. So we as survey designers don't need to decide on what's important and what's not. People will do that for us. And this works very well - you see all kinds of variations, in terms of people's interests, across the country. For example, halfway through the survey, one of the surveyors called us, complaining that none of the respondents in his location in Goa seemed to  be worried about bijli sadak paani - was this not a problem for the survey? Not at all - that only speaks to the quality of design, we said, and patted ourselves on the back.
Once the survey form itself had been designed, we set some targets for ourselves. Survey theory told us that 384 randomized respondents would be enough to show constituency-level trends with 90% accuracy (you can do your own calculations at this website). We decided we would do 500 samples in each MP constituency, to ensure data quality and "believability". That meant 500 * 543 = 2,71,500 responses! We then figured that we did not want more than 40 responses in each "location" (meaning Village or Ward - more about that below). Why 40? Well, why not 40? Honestly: 50 seemed too large for each location, 25 seemed too small. Yes, great decisions are made on trivial factors. Anyway, going ahead with the Math, that means 271500 / 40 = 6,787.5 locations. Given that we would do this manually, we'd actually need a larger set of locations, since some places may be too small (!) for 40 random responses and some places may be too difficult to access. We finally decided on generating just under 10,000 locations.
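For the curious, here is a rough sketch of the arithmetic behind these targets. It assumes the usual large-population sample-size formula (Cochran's), which is an assumption about how the 384 figure was derived: under that formula, 384 corresponds to a 95% confidence level with a 5% margin of error, while a 90% confidence level at the same margin needs only about 271. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def sample_size(confidence, margin, p=0.5):
    """Cochran's formula for a large population: n = z^2 * p * (1 - p) / e^2."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * z * p * (1 - p) / (margin * margin)


n_95 = sample_size(0.95, 0.05)  # roughly 384
n_90 = sample_size(0.90, 0.05)  # roughly 271

# Targets quoted in the post.
constituencies = 543
per_constituency = 500
per_location = 40

total_responses = constituencies * per_constituency
min_locations = math.ceil(total_responses / per_location)

print(round(n_95), round(n_90))
print(total_responses, min_locations)
```

Rounding 6,787.5 up gives 6,788 locations as the bare minimum, which is why padding the list to just under 10,000 leaves comfortable room for places that turn out to be too small or too hard to reach.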
Now for the tough part: we had to next make sure that the respondents were random enough to be representative of the constituency. A standard mechanism to prepare for randomization is to (strangely enough) stratify respondents in some manner. To do that, we first acquired a list of all Census locations in India (8.2 lakh locations: 7.28 lakh Villages and the rest city Wards). Then we stratified the list by two factors: the ratio of Rural to Urban locations within the relevant State and the number of General populace (as against SC/ST/BC/BT) as a percentage of the overall population, again at the State level. This means:
  • We grouped all the locations within a State into two groups - Rural and Urban.
  • Within each such group, we calculated the General populace as a percentage of the total population of the location
  • We then sub-grouped locations by whether they were a High (> 66%), Medium (33% to 66%) or Low (<= 33%) General populace
  • We ran a randomization algorithm and generated numbers for each location
  • We figured out the ratio of Rural Vs Urban and High/Medium/Low General at the State level
  • We extracted locations from each sub-group in the same ratio as the ratio of the State.
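The steps above can be sketched in a few lines of code. This is only an illustration under stated assumptions, not the actual program we ran: the field names (`name`, `type`, `general_share`), the function names and the toy data are all hypothetical, and the State-level ratio is taken here as each sub-group's share of the State's locations.

```python
import random
from collections import defaultdict


def general_band(general_share):
    """Band a location by its General populace share (a fraction between 0 and 1)."""
    if general_share > 0.66:
        return "High"
    if general_share > 0.33:
        return "Medium"
    return "Low"


def pick_locations(locations, target, seed=0):
    """Pick about `target` locations from one State, proportionally per sub-group."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for loc in locations:
        strata[(loc["type"], general_band(loc["general_share"]))].append(loc)

    picked = []
    for members in strata.values():
        # Give every location a random number, then take the lowest-numbered ones.
        numbered = sorted((rng.random(), loc["name"]) for loc in members)
        # Quota in the same ratio as the sub-group's share of the State's locations.
        quota = round(target * len(members) / len(locations))
        picked.extend(name for _, name in numbered[:quota])
    return picked


# Toy State with 100 made-up locations: 75 Rural, 25 Urban.
locations = [
    {"name": f"loc{i}", "type": "Rural" if i % 4 else "Urban", "general_share": (i % 10) / 10}
    for i in range(100)
]
picked = pick_locations(locations, target=20)
print(len(picked), picked[:5])
```

Because each quota is rounded separately, the total can drift slightly from the target on real data; in this toy State the sub-groups happen to divide evenly.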
We used State-level ratios of these factors because:
  • we noticed that the ratio of Rural vs Urban is significantly different among the States; some States (like Tamil Nadu, for example) are significantly more urbanized than others (like Chhattisgarh)
  • the ratio of General vs. SC/ST/BC/BT also varies quite a bit across the country
  • like it or not, caste is an important and acceptable factor to use in survey stratification in India.
Once we had the locations all set, we set up some rules about how the survey would be applied:
  • No surveying in public places like chai shops, etc.
  • All surveys to be conducted inside or just outside a house, surveying only one respondent in each house
  • Once a house is done, the next survey not in the next house but in the house next to that (so alternate houses)
  • Every third respondent to be a female respondent; if not found in the specific house chosen, repeat alternate selections till a female respondent is found.
Divya then went about writing a handbook for the investigators who would actually go to these locations and apply the survey. The handbook had the above methodology, as also other guidelines about how exactly to ask the questions, what to do about "prompting" by others (specifically men in the family when a female respondent is being asked the questions), etc. We had decided along the way that we would use as professional a group of investigators as possible. By "professional", we don't mean people who do surveys for a living - we actually wanted to stay away from that. We mean organizations and people who were not students. We have nothing against students, but the fact that the survey was to be conducted in a specific time-frame and that people would need to travel to some real vague-o places meant that the investigators had to be reasonably familiar with the area they were working in and that they had the time to do this during the work-week. Surveys would need to be conducted late in the evenings, since working people would only be available during those hours, and the investigators would need to travel back home late at night, possibly. All this precluded anyone below 18, and, honestly, anyone who did not do this for a fee of some kind - a small fee, but a fee nonetheless.
Armed with this model, we only needed to find the money and the people to make all this happen. That was the easy part! :)

UPDATE: ADR has a more accessible description of the survey methodology here.

Wednesday 26 March 2014

The Battle for Varanasi

OK, so it's official. Arvind Kejriwal will attempt a Raj Narain against Narendra Modi in Varanasi. Varanasi claims to be one of the oldest living cities in the world. The city's official website goes on to quote Mark Twain on its not just BEING old but LOOKING old - a dubious distinction, at best. In recent times, Varanasi has been in the news for all the wrong reasons, terrible garbage management being one of the newest. And (surprise! surprise!) that's the most important issue for Males in the constituency, per our survey.
That brings me to the reason for this blog: to highlight Issues in the constituency of Varanasi. Whoever wins, doing better than the current MP should not be difficult at all (Dr M M Joshi scored in the 47th percentile nationally). But that will happen only if the debate during the electoral process and the MP's focus subsequently are about things that are important to the people of Varanasi. Here's what our survey brings up, as the most important issues for the voters of Varanasi, compared with the top five issues across the country, broken down by Men and Women:

Group | Varanasi | All India
All | Better roads | Better employment opportunities
All | Drinking water | Better roads
All | Anti-terrorism | Drinking water
All | Accessibility of MP | Better electric supply
All | Reservation for jobs and education | Better hospitals / Primary Healthcare Centres
Male | Better garbage clearance | Better employment opportunities
Male | Better roads | Better roads
Male | Drinking water | Drinking water
Male | Reservation for jobs and education | Better electric supply
Male | Anti-terrorism | Better hospitals / Primary Healthcare Centres
Female | Training for jobs | Better employment opportunities
Female | Environmental issues | Better roads
Female | Better roads | Drinking water
Female | Drinking water | Better electric supply
Female | Anti-terrorism | Better hospitals / Primary Healthcare Centres

First of all, what jumps right out is that basic amenities are seriously lacking, in the minds of the voters. Roads and drinking water are clearly things that the aam aadmi and aurat want. And that's not that different from the rest of the country, since the same issues occur at a National level, too. What's interesting about Varanasi specifically is that jobs, garbage clearance and environmental issues are among the top few issues. And these are the issues that the AAP is talking about (weaver / sewer / river). Obviously, someone's been doing their homework out there - and presented the results in the near-alliterations that we all like.
One big segmentation being discussed in this election is that of The Youth vs. The Rest. Here's a chart that shows all the relevant issues in the Varanasi constituency, detailed by various age-groups. The chart shows the percentage of people in each age-group who marked a specific issue (shown in the left-most column) as being of High importance. I've highlighted the top ten in each age-group.

Importance 18 - 30 30 - 40 40 - 50 50 - 65
Better roads 50.0% 35.8% 22.3% 20.3%
Better schools 41.2% 25.0% 18.9% 23.0%
Better public transport 36.5% 24.3% 17.6% 20.3%
Environmental issues 47.3% 32.5% 16.2% 15.6%
Better employment opportunities 39.2% 24.3% 18.3% 18.3%
Trustworthiness of MP 31.1% 21.0% 14.2% 16.2%
Anti-terrorism 49.4% 33.1% 18.3% 14.9%
Reservation for jobs and education 46.7% 39.2% 20.3% 15.6%
Drinking water 56.1% 36.5% 15.6% 16.2%
Subsidized food distribution 43.3% 25.7% 18.3% 12.2%
Strong Defence/Military 42.6% 32.5% 16.9% 12.8%
Training for jobs 46.7% 29.1% 18.9% 14.9%
Eradication of Corruption 33.8% 27.7% 20.3% 14.2%
Better Law and Order / Policing 39.2% 37.2% 16.9% 13.5%
Security for women 41.9% 20.3% 15.6% 13.5%
Better electric supply 43.3% 29.7% 12.2% 12.8%
Accessibility of MP 51.4% 31.1% 31.1%
Better hospitals / Primary Healthcare Centres 35.2% 30.4% 15.6% 11.5%
Empowerment of Women 44.6% 29.1% 19.6% 9.5%
Better garbage clearance 8.8% 8.1% 5.4% 3.4%
Encroachment of public land / lakes etc 2.0% 0.7% 0.7% 0.7%
Lower food prices for Consumers 4.7% 2.0% 2.7% 1.4%
Traffic congestion 3.4% 4.7% 1.4% 2.0%
Facility for pedestrians and cyclists on roads 3.4% 4.1% 3.4% 0.7%

There is certainly a difference among the age-groups, but the core theme seems to be the same:  basic infrastructure and employment. Yes, the youngest groups seem to worry about terrorism. And accessibility to the MP. But the basic National theme (below) of aspiring for the most basic of necessities is reflected in Varanasi, too, in some ways.

18 - 30 years:
  • Better employment opportunities
  • Better public transport
  • Drinking water
  • Better roads
  • Better electric supply
30 - 40 years:
  • Better employment opportunities
  • Drinking water
  • Better roads
  • Better public transport
  • Better electric supply
40 - 50 years:
  • Better employment opportunities
  • Drinking water
  • Better roads
  • Better electric supply
  • Better hospitals / Primary Healthcare Centres
50 - 65 years:
  • Better employment opportunities
  • Drinking water
  • Better hospitals / Primary Healthcare Centres
  • Better roads
  • Better schools

Can we hope that these will be the currency of discussion in the coming debates, instead of who slept on the street in Delhi and who the candidate's wife is? More importantly, can the people of the oldest living city in India hope that at least their simplest needs are met by the new MP? Time will tell.

Wednesday 19 March 2014

What does an MP actually do? On what criteria should we vote for her and how should we evaluate him?

As General Elections 2014 draw closer there is more than the usual pre-election excitement. The Modi versus Rahul showdown, the unpredictability of Kejriwal, the positioning of the fringe players and the eternal hopefuls from the “Third Front” are dominating the talk in the pre-election season. The mainstream media, however, are not paying any attention to the performance of the MPs themselves apart from lamenting the waste of parliamentary time. This lack of attention to the role of the MPs is not surprising given the focus on personalities in our country; the focus is all on the national or state level leaders and whether they will be able to deliver seats in regions or states. Local problems and issues are swept aside by the need to simplify the issues in elections. The performance of a member of parliament over the last five years and his stance on various issues in and outside parliament are not even discussed. It is only when divisive issues like Telangana come up that there is a focus on MPs of that region and their (un)parliamentary behaviour.
As far as citizens are concerned, mystery surrounds the role of their MPs. They appear to have an arm’s-length role as far as the day to day lives of their constituents are concerned. This belief has been strengthened over the years, although every candidate promises to make the constituency a model or a world class constituency if he is elected.
DAKSH, together with ADR, has conducted a survey across 525 parliamentary constituencies- with about 2,50,000 respondents randomized appropriately, making it the largest political survey ever in India- to assess people’s perceptions about the performance of their MPs. We asked two sets of questions: (a) what are the issues that are important to you when you vote in the elections and (b) how has your MP performed on those issues? Both these are clearly perceptions of the voter, but then voter is king and no political party or candidate will argue with that.
Some of the results have already been published. On a CNN IBN program, representatives of various political parties commended the fact that scorecards are being published but raised the usual bogey: MPs should not be measured on perceptions, but on objective criteria; their performance in the Parliament should be measured, and not necessarily their performance on issues of local governance; x is a great parliamentarian, so how can she get a low score; and so on.
So, I go back to the original question- what does an MP do? An MP has, in my view, a few roles: a) he represents the people of his constituency in the Lok Sabha and by virtue of that participates in policy making on all matters over which the Parliament has powers (and that is pretty much everything of significant importance in the country except those matters that are specifically given to State Legislatures); b) he can ask questions about the performance of the Union Cabinet and the central bureaucracy and hold them accountable and c) as a representative of his constituents he needs to represent their aspirations properly by interacting with them on a regular basis and providing leadership on issues relevant to the constituents. All these roles constitute an essential part of an MP’s role- they are not mutually exclusive. An MP cannot assert, when his performance is questioned, that he should only be evaluated on one criterion. The DAKSH-ADR survey measures performance on item c) directly and items a) and b) indirectly. If an MP does a) and b) properly, the governance of the country will be good! If he does not do so, governance will be bad- it’s a simplistic explanation, yes, but in reality it is actually that simple! When the DAKSH-ADR scores are added to the performance of the MP in parliament, it gives a complete picture of how the MP has performed.
The more important aspect to remember is that an MP’s accountability is to the voter and the voter is the boss in a democracy; as a voter I can only question my MP, MLA and local representative (either panchayat or municipal councillor) on governance issues, and I will do so as long as the governance of our country continues in the mess that it is in currently. We should also not forget that MPs and MLAs are doing everything they can to ensure the emasculation of local governance of villages and cities. Therefore, when it comes to measuring their performance they cannot point the fingers at someone else. It is their job to get that someone else to perform properly and to equip that someone else to perform properly. Only then can they cry hoarse if we seek accountability from an MP for improper garbage collection or bad roads!