Sunday, March 6, 2011

166 - Thoughts on the Unique Identity Frame work for India nee Aadhaar - Gautam Johns Blog

August 9th 2010

I wrote this piece many months ago, when the UIDAI white paper leaked. Some of the commentary might now be dated because the goalposts have shifted since then. I have consciously ignored the privacy and security aspects because I am no expert in those areas.

I also need to credit a friend, let's just call him Vikram, for this piece.

Bottom line:

Trying to do too much. The smallest intervention becomes a massive undertaking in India because of scale and organizational/administrative complexity, so you should scope projects as narrowly as possible. If the main purpose is to bring marginalized people into mainstream economic life, then you should focus on getting them an ID rather than on eliminating redundant verification activity or eliminating fraud, both of which can be happy by-products further down the line. Why not make a national ID number available to anyone who wants it, target it to the people who currently lack any form of ID, and let things evolve from there? It doesn't have to cover everyone or be the only recognized form of ID or be real-time and state-of-the-art to do most of what you want it to do.

Rough scope evaluation:

Aim 1: Getting everyone an ID; Project component: enrollment; Advantages: brings people into economic and social life; Disadvantages: big-brother possibilities (everybody means well in the beginning)

Aim 2: Eliminating redundant verification activity; Project component: on-demand authentication; Advantages: efficiencies over time; Disadvantages: business process disruption across swaths of the economy

Aim 3: Eliminating entitlement fraud; Project component: data de-duplication using biometrics; Advantages: helps balance sheets of government agencies; Disadvantages: alienates those who benefit from current arrangements (customers as well as government employees who abet them)

I don't know the relative costs of the three components, but I suspect that an incremental approach to 1 with a thinned-down version of 3 would be the 80/20 solution here. On the security side, all of this boils down to persons or individuals and what they can do. Allow me to think aloud here...

Identity
- Any kind of marker that defines or demarcates a person or individual. These persons can be real, fictional, fictitious, whatever. Captain James Kirk is an identifiable individual in the world of Star Trek. Avatars on Second Life or gaming sites are identifiable individuals within those universes.

Witness protection and intelligence agencies assign fictitious identities to real individuals. In the serious world of business and government, identity is about each unique existing individual having a unique identity or marker to go along with it that can be used in official business. Most people have many such markers (credit card number, passport number, social security number, tax or voter ID number, combination of name and birthdate) and some countries have one marker that is close to universal (almost everyone in the US has a social security number, for instance). In India, some have many markers and many people have no official markers at all, despite being unique individuals. Having many markers is not really a problem except in an efficiency sense. My bank identifies me by my account number, my university used to have its own 9-digit ID for me, immigration agencies track me by my passport number, etc. etc. Some of these are parasitic on my social security number (which I provided when applying for a bank account or applying to college), but many are not. And the cost (in terms of business process changes, technology investments, confusion, etc.) of getting everyone in the economy to subordinate their own identification numbers to a common national number is going to be prohibitive in any normal decision-making horizon.

Authentication
- When you claim to be some identifiable individual (the owner of some identity marker), authentication is about making sure you really are that individual. First of all, we should decide whether we care more about false negatives (people falsely claiming to be someone else and getting away with it) or false positives (people truthfully claiming to be themselves but not being believed, maybe because they don't have the paperwork to prove it). If you try to solve both, you end up with the biggest of all possible projects and also the least likely to succeed, because the solution to one exacerbates the other and only the all-singing all-dancing perfect solution (in which all real-world difficulties are assumed away) gives the illusion of bridging the tension. If you care more about false negatives, you'll make it harder to get a valid identity marker, and there go the poor and the marginalized. If you make it easier to get one number, you've made it easier to get a second. That's why they came up with biometrics, but for that extra bit of security, they've fingerprinted an entire population (don't tell me that won't be abused) and, I suspect, added a whole lot of processing cycles on the IT side (I imagine it's easier to look for matches of a 9-digit number than for fingerprint matches). The problem of identity theft (rather than the creation of false or duplicate identities) doesn't even require the extra security. A 9-digit random number is pretty secure in the sense that it's virtually impossible to guess and only you and maybe a handful of other people know it. [I won't even get into the problems with biometrics. Fingerprint matches are far from unique at standard levels of detail, so it's no silver bullet, and once fingerprint identification is used for high-value financial transactions, expect a rash of even rhymes with de-duplication!]

Authorization
- Once we know you're you, authorization is about defining what you're allowed to do or what you're entitled to. Here, that whole aspect is (correctly) left to the individual service providers.
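The tension described under authentication above (catching people who falsely claim an identity versus believing people who truthfully claim their own) can be made concrete with a toy acceptance threshold. This is only a sketch: the score lists, the function name `error_rates`, and the thresholds are all invented for illustration and do not come from the UIDAI document or any real system.

```python
# Hypothetical match scores in [0, 1]; higher means "looks more like the claimed person".
# These numbers are invented for illustration, not measurements from any real system.
GENUINE_SCORES = [0.95, 0.90, 0.85, 0.70, 0.55, 0.40]   # real owners, some with weak paperwork
IMPOSTOR_SCORES = [0.60, 0.45, 0.30, 0.20, 0.10, 0.05]  # people claiming someone else's identity


def error_rates(threshold):
    """Return (impostor_accept_rate, genuine_reject_rate) at an acceptance threshold."""
    impostors_accepted = sum(1 for s in IMPOSTOR_SCORES if s >= threshold)
    genuine_rejected = sum(1 for s in GENUINE_SCORES if s < threshold)
    return (impostors_accepted / len(IMPOSTOR_SCORES),
            genuine_rejected / len(GENUINE_SCORES))


if __name__ == "__main__":
    # Sweep from lenient to strict and watch one error fall while the other rises.
    for threshold in (0.25, 0.50, 0.75):
        accept_rate, reject_rate = error_rates(threshold)
        print(f"threshold={threshold:.2f}  impostors accepted={accept_rate:.2f}  "
              f"genuine rejected={reject_rate:.2f}")
```

With these made-up scores, the lenient threshold lets half the impostors through and turns away no genuine claimants, while the strict threshold does the reverse. No setting drives both errors to zero, which is the point about having to choose which one you care about more.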

In general, I think there are more things under heaven and earth than are thought of in any of our philosophies and these people would do well to ponder that. IT projects always take many times more time and money to finish than bargained for at the outset, and that's only counting the ones that kind of reach their goal. Incentive and coordination problems will cripple (or disfigure beyond recognition) any large project in a complex organization, and you're off the scale here in both size and complexity. Politicians and public figures and academics have a built-in preference for ambitious/sexy/grandiose projects, but the efforts that stick are the ones that start small and evolve.

Some of my concerns are listed under project risks, but there's no clue there beyond platitudes as to how they might be addressed. Project risks are side things that can derail the project if you have bad luck. The obstacles we're talking about here are what the project (or at least this document) should be about. It's trivial to collect data and put it in a database and then query the database from a transaction site. It's not trivial to do it for a billion people or through hundreds of overlapping independent agencies and politically antagonistic local governments. Don't show me diagrams of how you're going to approach the trivial problem and then mention by the way that there might be some complications. If you have a solution to the complications, shout it from the rooftops. Otherwise, come back when you have one or let's talk about how we can find one.

- It seems like they think that the trade-off between entitlement fraud and inclusiveness can be broken by this magic technology called biometrics. If only. If there's one thing that technology executives in large organizations agree on, it's that the technology is never the solution. Technology providers are less wise on this point but even they acknowledge it in their less commercial moments. In any case, I don't know how much accuracy biometrics adds beyond what you could get by triangulating the information that's normally used in verification (biographic data as attested by documents plus distinguishing facial features). I bet it's not much, especially once you consider the failure rate of biometrics itself (since nothing is foolproof). It does however add a layer of certain costs for infrastructure, training, etc. And the privacy implications are chilling.
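To get a feel for why the failure rate of biometrics matters at this scale, here is some back-of-the-envelope arithmetic. It is a sketch under loud assumptions: the per-comparison false match rate is a hypothetical placeholder (I have no measured figure), comparisons are treated as independent, and the naive all-pairs count is an upper bound that a real system would try to avoid with indexing.

```python
def pairwise_comparisons(population):
    """Comparisons needed to check every enrolled record against every other one."""
    return population * (population - 1) // 2


def expected_false_matches(population, false_match_rate):
    """Expected wrong 'duplicate' flags when one new enrollee is compared
    against everyone already enrolled, assuming independent comparisons."""
    return population * false_match_rate


if __name__ == "__main__":
    POPULATION = 1_000_000_000        # "a billion people"
    ASSUMED_FALSE_MATCH_RATE = 1e-6   # hypothetical per-comparison rate, not a measured figure

    print("naive all-pairs comparisons:", pairwise_comparisons(POPULATION))
    print("expected false matches per new enrollee:",
          expected_false_matches(POPULATION, ASSUMED_FALSE_MATCH_RATE))
```

Under that assumed rate, each new enrollee would on average be flagged against roughly a thousand strangers, each of which somebody has to resolve. A different rate gives a different number, but the multiplication by a billion does not go away.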

- really? Enrolling agencies are looking at business process disruption, technology investments, and an extra operating burden, and not only in the first ten years, so they will have to be strong-armed.

It may be that the savings in the larger economy from not having to repeat verification procedures will offset the costs over time, but businesses don't make decisions with a ten-year horizon (especially a ten-year horizon contingent on the success of a government project of unprecedented scale and complexity), so I wouldn't expect them to be queueing up to ditch their current procedures. People who were getting duplicate benefits will lose out under this, so don't expect them to rush forward either. And I'm sure some tribes like being out of view of the state. (In the US, the Amish have resisted social security numbers, I believe successfully.) Also, by making more and more services/entitlements/rights dependent on the ID, you're putting an unrealistic reliance on the benevolence and competence of the enrolling agencies.

Ultimately we're talking about hundreds of millions of vulnerable people interacting with hundreds of thousands or millions of petty officials who have been given extra work to do for the benefit of people they very likely regard with distaste. Expect the worst. Also, I don't think "network effects" means what the authors think it means. It's not the case that the more people that have an ID, the more beneficial it is for me to have an ID. It is the case that the more government or other services that become contingent on having an ID, the more beneficial it is for me to have one. That's a very different thing and not so different from what they disapprovingly call a mandate (except that a mandate would be cleaner).

Data quality
- why do you think it's so easy to duplicate identities or, if you prefer, so difficult to create a unique record for each individual? Lots of overlap of names, lots of names that don't follow the Western or North Indian convention of given name followed by family name, lots of people with no clean permanent or even present address, haziness around date of birth (stop the first twenty people you meet in any village and ask them if they know their exact birthday). How exactly are you going to address this? These are problems with the data themselves, not with how the data are collected. Some of these might disappear over time (three generations from now I imagine there won't be anyone left who doesn't know their birthdate), some of them can be nudged out of existence (we could force South Indian names into a given name/surname pattern as many of us have done out of necessity), and some could possibly be combated through mega-projects of their own (if we could somehow make it so that everyone who doesn't have a proper address now has one in twenty years' time, we would have accomplished something much grander and worthier than a national identity scheme). If you launch a national identification number without solving these problems, you're just going to be importing a lot of bad data into an arena where it can do much more damage (because now there's a single point of failure as far as the individual is concerned: earlier, data problems might mess up your gas connection but not your phone application or your ration card, because each agency had its own idiosyncratic way of doing things; now, everything is connected). The document refers to KYR standards for the validity of demographic data and that sounded promising, but when I looked around for information on these standards, all I found was that it is like "know your customer" but for residents rather than customers, which wasn't a big help.
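A small sketch of why this is a problem with the data rather than with the collection process. The records, field names, and both key functions below are invented for illustration; they are not the KYR standard or anything from the UIDAI document. The strict key misses a duplicate of the same person, and the loose key merges two different people.

```python
import re


def exact_key(record):
    """Key on the raw fields exactly as they were typed in."""
    return (record["name"], record["dob"])


def normalized_key(record):
    """Hypothetical looser key: ignore case and name order, keep only the birth year."""
    tokens = sorted(re.findall(r"[a-z]+", record["name"].lower()))
    year = record["dob"][:4]
    return (" ".join(tokens), year)


# Invented records. The first two are meant to be the same person enrolled twice
# (different name order, a guessed birthday); the third is a different person
# who happens to share the name and the birth year.
RECORDS = [
    {"name": "Ramasamy Murugan", "dob": "1970-01-01"},
    {"name": "murugan  ramasamy", "dob": "1970-06-15"},
    {"name": "Ramasamy Murugan", "dob": "1970-03-20"},
]

if __name__ == "__main__":
    print("distinct exact keys:     ", len({exact_key(r) for r in RECORDS}))
    print("distinct normalized keys:", len({normalized_key(r) for r in RECORDS}))
```

Exact matching sees three separate people, so the double enrollment slips through. The normalized key sees one person, so the genuinely different third record gets swallowed. Tuning the key only moves you between those two failures; it does not tell you anyone's real birthday.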


How about this? A chunk of the population is being left out of economic life and social programs because of a lack of an accepted identity marker. Why not provide a unique government-backed ID to everyone, or to anyone who asks for it? It doesn't have to be foolproof, just good enough for the purpose. That way if you have a usable identity marker already, you keep using that, otherwise you apply for the government's random-number ID. Service providers now accept the UID along with what they've always accepted, and they're free to pressure customers to get a UID if they like.

That way, you have an order of magnitude fewer people included in this UID project and the sequence in which people are brought into the system respects your pro-poor agenda much better because it starts with people who most urgently need an identity marker and only then (in a timeline decided by individuals or at most by individual businesses or service providers) gets taken up by people for whom it would be a marginal convenience. [This is roughly how the social security number came to be the de facto national identification number in the US.]

Think of how complicated the census is. And that just involves going door to door and counting people, trying to avoid double-counting. Now you want to catalogue them uniquely and be present in every interaction they have with a service provider? Come on. I don't know of a single large company that has a unique identity for each employee matched to an up-to-date profile of what they can do and a reliable method to ensure that someone fiddling around on the network is who they say they are and isn't doing something they're not supposed to. The best companies have good robust identity, authorization, and authentication for a small group of employees and the bare minimum (including bad data and processes where they can be tolerated) everywhere else, because even something as simple as rolling out a smart card to 50,000 employees can take years because of the logistical and organizational hurdles. It was a big achievement a few years ago when Johnson & Johnson figured out a way to assign unique identifiers to its 150,000 or so employees so that it could keep track of them as they moved through the company. Now maybe these companies are just stupid, but I wouldn't bet on it. I would expect the difficulty to increase exponentially with the number of people covered or at least the number of independent decision points involved, and companies have the advantage of a command-and-control structure that democracies don't and shouldn't have. If the success of your project requires pretty much everyone in the economy to do things differently ("business process change" is easy to say but traumatic for anyone in the middle of it), you can assume you've succumbed to hubris.