OpenID as a common authentication system

This part of our web series discusses the merits of a user-centric system for individual identification on the Internet and the role OpenID and related network protocols have to play in this regard.

 

User-centric, bottom-up vs top-down

One strategy for researcher identification is a top-down approach, whereby each researcher is unilaterally assigned an identifier that is subsequently used wherever information relating to the researcher needs to be tracked or linked. Arguably, however, pushing a ‘single-identifier-everywhere’ solution on to researchers would be difficult to set up and operate, and would meet with considerable pushback due to concerns about liberty and free will (see more below). A more attractive option is a user-centric pull system, in which each individual seeks out the ID(s) they wish to use and establishes their own links to other information as and when needed.

Lessons from social networking

This ‘pull’ situation is highly analogous to recent developments in the online social community. At popular networking websites such as Facebook and Flickr, various Web 2.0 services (such as personal blogging platforms) are increasingly being linked together seamlessly to enhance the user experience. A key component in many of these developments is a relatively new technology called OpenID, a decentralized, open authentication protocol backed by Google, Yahoo, Microsoft and numerous other Internet heavyweights. Among the recent OpenID supporters is Facebook, which already operates its own widely used proprietary authentication system (but see more here on potential synergy between the two systems).

OpenID provides a way for individuals to identify themselves uniquely across the Internet using a single set of credentials held with a provider of their choice, thus avoiding the pain of managing multiple usernames and passwords across a plethora of different websites. OpenID is rapidly gaining ground in the wider online community and, as recently suggested in a publication in PLoS Computational Biology [1], it would be possible to use the same system for researcher identification. This proposal has much merit, though other options need to be considered (in particular organization-based Shibboleth identities) and there may even be a case for devising a completely new system specifically for biomedical researchers. Whichever system(s) come to be used, however, it is important to realize that individual sub-domains of biomedical research (e.g., journal publishing, funding organisations) will very often wish to employ their own sets of IDs. This in no way conflicts with the principle of researchers having a universal OpenID, as this would be matched ‘behind the scenes’ to the alternative IDs used by publishers, funders, etc. (see more on this page).
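As a rough illustration of the ‘behind the scenes’ matching, the following Python sketch keeps a lookup from a researcher's OpenID to the IDs used by individual sub-domains. Everything in it is an assumption made for illustration: the class name, the scheme labels ("publisher", "funder") and all identifiers are invented, and a real service would need persistent storage and access control.

```python
from collections import defaultdict


class IdentifierRegistry:
    """Hypothetical in-memory map from an OpenID to sub-domain specific IDs."""

    def __init__(self):
        # openid -> {scheme name -> local identifier}
        self._links = defaultdict(dict)

    def link(self, openid, scheme, local_id):
        """Record that `openid` corresponds to `local_id` within `scheme`."""
        self._links[openid][scheme] = local_id

    def lookup(self, openid, scheme):
        """Return the local ID for `scheme`, or None if no link exists."""
        return self._links.get(openid, {}).get(scheme)


registry = IdentifierRegistry()
# All identifiers below are made up for the example.
registry.link("https://alice.op.example.org/", "publisher", "AUTH-000123")
registry.link("https://alice.op.example.org/", "funder", "GRANTEE-42")

print(registry.lookup("https://alice.op.example.org/", "publisher"))
```

The researcher only ever presents the OpenID; the publisher or funder resolves it to its own internal ID.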

More generally, whichever ID and authentication system is used, there are many reasons to make it user-centric, so that: a) the individual is able to manage his own online identity, and b) the individual has principal control over where his identifier and online profile(s) are deployed and who has access to which sections of them. At present OpenID and its companion protocols fit these requirements very well, and so for the remainder of this series we will provisionally assume that OpenID is the authentication system of choice. However, bear in mind that the usage scenarios described in the sections to follow merely depend on some common mechanism for identification, and not on the use of the OpenID protocol per se.

How does OpenID work?

In brief, the concept of OpenID is that user authentication (i.e. the user proving that he is who he claims to be) is delegated to a third party, the OpenID Provider (OP), instead of taking place at the originating website (e.g. a blogging website), which is called the Relying Party (RP). Put another way, the originating website doesn't ask the user logging in via OpenID for proof of his identity, but instead asks the provider of that OpenID "is this person who he says he is?". The user can go to any number of other websites which support OpenID and the process is repeated. In fact, if the user is already logged into the OP site in the same web browser session, he is authenticated straight away (this is called single sign-on, or SSO).
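To make the delegation step more concrete, here is a minimal Python sketch of how an RP might build the URL to which it redirects the user's browser at the OP. The parameter names follow the OpenID Authentication 2.0 request format; the endpoint, the OpenID and the RP addresses are made-up examples, and the function name is hypothetical. The sketch leaves out discovery of the OP endpoint and, importantly, verification of the signed response, so a real RP would normally rely on a maintained OpenID library instead.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

OPENID2_NS = "http://specs.openid.net/auth/2.0"


def build_auth_redirect(op_endpoint, claimed_id, return_to, realm):
    """Return the URL the RP sends the browser to so the OP can authenticate the user."""
    params = {
        "openid.ns": OPENID2_NS,
        "openid.mode": "checkid_setup",   # the OP may interact with the user
        "openid.claimed_id": claimed_id,  # the OpenID the user typed in
        "openid.identity": claimed_id,    # assumed equal here (no delegation)
        "openid.return_to": return_to,    # where the OP sends the user back
        "openid.realm": realm,            # the site the user is asked to trust
    }
    separator = "&" if urlsplit(op_endpoint).query else "?"
    return op_endpoint + separator + urlencode(params)


# Hypothetical addresses, for illustration only.
url = build_auth_redirect(
    op_endpoint="https://op.example.org/auth",
    claimed_id="https://alice.op.example.org/",
    return_to="https://blog.example.com/openid/return",
    realm="https://blog.example.com/",
)
sent = parse_qs(urlsplit(url).query)
print(sent["openid.mode"][0])  # prints: checkid_setup
```

After the user has logged in at the OP (or immediately, if an OP session already exists), the OP redirects the browser to the return_to address with a signed assertion, which the RP must verify before treating the user as authenticated.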

The following sections on this page discuss key aspects of OpenID relating to researcher identification. More details on OpenID can be found on the OpenID website and at http://openidexplained.com. OpenID proponent Simon Willison also has an excellent Google Tech Talks presentation on this topic.

Freedom to choose OpenID provider

An important feature of OpenID is that there is no single provider that everyone must use (as was the case several years ago with Microsoft's proprietary Passport service, now rebranded as Live ID). OpenID is essentially decentralized, with a great many OPs to choose from. In fact, millions of Internet users already have an OpenID without knowing it, because Google, Yahoo and other services have recently started providing OpenIDs for their users, who can therefore already log into the tens of thousands of websites which support OpenID.

It is worth noting that, given the wide range of available OPs offering different levels of service (including, crucially, security), control over his online identity is effectively put into the user’s hands. For example, a user may not be satisfied with the traditional username/password credentials offered by, say, Google or Yahoo, which he may already have. The user is then free to choose another provider offering more secure hardware-based authentication (e.g. smart cards, or a one-time pass key sent by mobile phone text message), such as those offered by VeriSign, MyOpenID, Vidoop and others. Conversely, an RP handling sensitive data may wish to accept OpenIDs only from providers with more secure authentication schemes. An example of this is Microsoft’s HealthVault service, which accepts only three OpenID providers (at the time of writing).
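An RP that wants to restrict which providers it accepts could, for instance, compare the OP endpoint it has discovered for a user against an allowlist before starting authentication. The Python sketch below shows one way this might look; the host names and the function name are invented for the example and do not describe how HealthVault or any other service actually implements its policy.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist: hosts of OPs the RP trusts to use strong authentication.
ACCEPTED_OP_HOSTS = {"secure-op.example.org", "smartcard-op.example.net"}


def is_accepted_provider(op_endpoint_url):
    """Return True if the OP endpoint uses HTTPS and its host is on the allowlist."""
    parts = urlsplit(op_endpoint_url)
    host = (parts.hostname or "").lower()
    return parts.scheme == "https" and host in ACCEPTED_OP_HOSTS


print(is_accepted_provider("https://secure-op.example.org/auth"))   # True
print(is_accepted_provider("https://other-op.example.com/auth"))    # False
```

The check is applied to the endpoint found during discovery, not to whatever the user typed, since the typed OpenID may be hosted on a different domain from the OP that vouches for it.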

One identity, multiple personal profiles

When registering for an account on a website, the user is typically asked to provide an E-mail address, a nickname and other personal information. On websites supporting OpenID, the website will often ask the OpenID provider for this information from the user's profile during the authentication request (via the attribute exchange part of the protocol), and the user can approve this request if desired. Most OPs give users the option of creating multiple personal profiles, or personae (e.g. 'work', 'personal'), and then choosing the appropriate one upon registration with a new service. Then again, some people will prefer to have entirely separate OpenIDs for different purposes. There's nothing wrong with this; one can have as many OpenIDs as desired.

Highlighting the different levels of OP service mentioned above, Google does not currently offer a way to manage an OpenID profile and only exposes a user's E-mail address. Users who want to make use of profiles would therefore have to sign up with a different OpenID provider, such as MyOpenID, which does offer this service.
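The profile request described in this section can be sketched as a set of extra parameters that the RP adds to its authentication request. The Python example below follows the parameter names of the OpenID Attribute Exchange 1.0 extension and uses attribute type URIs from the axschema.org schema; the function name and the aliases ("email", "nickname", "fullname") are choices made for this example only. Whether an OP answers such a request, and with which attributes, depends on the provider and on the user's approval.

```python
AX_NS = "http://openid.net/srv/ax/1.0"

# Attribute type URIs from the axschema.org schema; the aliases are arbitrary.
AX_TYPES = {
    "email": "http://axschema.org/contact/email",
    "nickname": "http://axschema.org/namePerson/friendly",
    "fullname": "http://axschema.org/namePerson",
}


def ax_fetch_params(required, optional=()):
    """Build the extra request parameters asking the OP for profile attributes."""
    params = {"openid.ns.ax": AX_NS, "openid.ax.mode": "fetch_request"}
    for alias in list(required) + list(optional):
        params["openid.ax.type." + alias] = AX_TYPES[alias]
    if required:
        params["openid.ax.required"] = ",".join(required)
    if optional:
        params["openid.ax.if_available"] = ",".join(optional)
    return params


# Ask for the E-mail address, and for the nickname if the OP has one.
for key, value in sorted(ax_fetch_params(["email"], ["nickname"]).items()):
    print(key, "=", value)
```

These parameters would be merged into the authentication request sent to the OP; the approved values come back in the OP's response alongside the identity assertion.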

OpenID and security

There have been numerous criticisms regarding security in OpenID. One common concern is that with a single username/password, the user now has all his "eggs in one basket", and if that one set of credentials is compromised then an unscrupulous hacker has access to all of the user's accounts. This is true, but given that most people already use the same username/password for many different websites, existing methods are no better. In fact, since there is a single place where you authenticate, this one place can be made more secure by choosing a reliable OpenID provider, whereas with multiple sites hosting your credentials you have no control over how secure your user accounts are. Additionally, OPs usually have some way for the user to audit the list of websites he has used his OpenID with, and the user can if needed remove an untrustworthy website from this list of trusted sites.

Another concern is that somebody might take over the Internet domain that your OpenID is based on (e.g. me.myprovider.com), and thereafter control your identity and possibly pose as you all over the Internet. Again, this is true, but this can be sidestepped by choosing a quality OpenID provider. And as above, the existing practice of using the same E-mail address when registering on many websites has essentially the same weakness: the E-mail server domain could be hijacked, and the hijackers could then request the password-reset E-mails usually offered by websites and thereby easily gain control of your accounts on those websites.

Thirdly, because the OpenID authentication process involves redirecting the user from the original site to the OP site, it is vulnerable to "phishing" (or man-in-the-middle) attacks. In this scenario, the user is redirected from some less-than-honest website (where he wants to register) to a fake OP website instead of the real provider site. The user types in his username and password, the fake website captures them, and the hackers can then take over the user's identity. This is a serious concern, but it is by no means limited to OpenID. In any case, phishing can be countered by making sure that the web location (URL) box in the web browser actually contains the right domain name (and not, say, fakeprovider.myopenid.biz). There are also browser extensions (such as VeriSign's OpenID SeatBelt) which help to detect phishing attempts, as well as information cards and related tools.
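The manual check of the URL box amounts to comparing the host name of the login page with the provider's real domain. A minimal Python sketch of that comparison is given below; the function name is hypothetical, and the requirement that the login page is served over HTTPS is an assumption of the example, not something stated above. It illustrates the idea only and is not a substitute for anti-phishing tools.

```python
from urllib.parse import urlsplit


def is_expected_provider_page(page_url, provider_domain):
    """Return True if page_url is an HTTPS page on provider_domain or a subdomain of it."""
    parts = urlsplit(page_url)
    host = (parts.hostname or "").lower()
    domain = provider_domain.lower()
    on_domain = host == domain or host.endswith("." + domain)
    return parts.scheme == "https" and on_domain


print(is_expected_provider_page("https://alice.myopenid.com/login", "myopenid.com"))         # True
print(is_expected_provider_page("https://fakeprovider.myopenid.biz/login", "myopenid.com"))  # False
print(is_expected_provider_page("https://myopenid.com.evil.example/login", "myopenid.com"))  # False
```

Note that the comparison is made on the end of the host name: a look-alike address that merely contains the provider's name somewhere in it is rejected.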

Conclusions

As discussed on other pages in this primer, there could be big benefits in adopting a common authentication system for researchers to identify themselves on the Internet. OpenID seems like the perfect candidate for this purpose, striking a good balance between ease of use and security.

1. Bourne et al. I am not a scientist, I am a number. PLoS Comput Biol (2008) vol. 4 (12) doi:10.1371/journal.pcbi.1000247