Doing science in the open

Online networking tools are pervasive, but why have scientists been so slow to adopt many of them? Michael Nielsen explains how we can build a better culture of online collaboration

In your high-school science classes you almost certainly learned Hooke’s law, relating a spring’s length to how hard you pull on it. What your high-school science teacher probably did not tell you is that when Robert Hooke discovered his law in 1676, he published it as an anagram, “ceiiinosssttuv”, which he revealed two years later as the Latin “ut tensio, sic vis”, meaning “as the extension, so the force”. This ensured that if someone else made the same discovery, then Hooke could reveal the anagram and claim priority, thus buying time in which he alone could build upon the discovery.

Hooke’s secretiveness was not unusual. Many great scientists of the age, including Leonardo da Vinci, Galileo Galilei and Christiaan Huygens, used anagrams or ciphers for similar purposes. The Newton–Leibniz controversy over who invented calculus occurred because Isaac Newton claimed to have invented calculus in the 1660s and 1670s, but did not publish his work until 1693. In the meantime, Gottfried Leibniz developed and published his own version of calculus. Imagine modern biology if the human genome had been announced as an anagram, or if publication had been delayed 30 years.

Why were Hooke, Newton and their contemporaries so secretive? In fact, until this time discoveries were routinely kept secret. Alchemists intent on converting lead into gold or finding the secret of eternal youth would often take their discoveries with them to the grave. A secretive culture of discovery was a natural consequence of a society in which the personal gain from sharing discoveries was fraught with uncertainty.

The great scientific advances in the time of Hooke and Newton later motivated wealthy patrons, such as the government, to begin subsidizing science as a profession. Much of the motivation came from the public benefit delivered by scientific discovery, and that benefit was strongest if discoveries were shared. The result was a scientific culture that to this day rewards the sharing of discoveries with jobs and prestige for the discoverer.

This cultural transition was just beginning in the time of Hooke and Newton, but a little over a century later the great physicist Michael Faraday could advise a younger colleague to “Work. Finish. Publish”. The culture of science had changed so that a discovery not published in a scientific journal was not truly complete.

The adoption and growth of scientific journals has created a body of shared knowledge for our civilization, a collective long-term memory that is the basis for much of human progress. This system has changed surprisingly little in the last 300 years. Today, the Internet offers us the first major opportunity to improve this collective long-term memory, and to create a collective short-term working memory — a conversational commons for the rapid collaborative development of ideas.

This change will not come easily. From the outside, scientists currently appear puzzlingly slow to adopt many online tools. As we will see, this is a consequence of some major barriers deeply embedded within the culture of science. Changing this culture will take great effort, but I believe that the process of scientific discovery (how we do science) will change more over the next two decades than in the past 300 years.

How can the Internet improve the way we do science?

There are two useful ways to answer this question. The first is to view online tools as a way of expanding the range of scientific knowledge that can be shared with the world. Many online tools do just this, and some have had a major impact on how scientists work. Two successful examples are the physics preprint server arXiv, which lets physicists share preprints of their papers without the months-long delay typical of a conventional journal, and GenBank, an online database where biologists can deposit and search for DNA sequences. But most online tools of this type remain niche applications, often despite the fact that many scientists believe broad adoption would be valuable. Two examples are Journal of Visualized Experiments, which lets scientists upload videos that show how their experiments work, and “Open Notebook Science”, as practised by scientists like Jean-Claude Bradley and Garrett Lisi, who expose their working notes to the world. In the coming years we will see a proliferation of tools of this type, each geared to sharing different types of knowledge.

[Figure: Map of knowledge]

There is a second and more radical way of thinking about how the Internet can change science, and that is through a change to the process and scale of creative collaboration itself, enabled by social software such as wikis, online forums and their descendants.

There are already many well known but still striking instances of this change in parts of culture outside of science (well documented in Clay Shirky’s excellent book Here Comes Everybody). For example, in 1991 an unknown Finnish student named Linus Torvalds posted a short note in an online forum, asking for help extending a toy operating system he had programmed in his spare time; a volunteer army responded by assembling Linux, one of the most complex engineering artefacts ever constructed. In 2001 another young unknown named Larry Sanger posted a short note asking for help building an online encyclopedia; a volunteer army responded by assembling Wikipedia, the world’s most comprehensive encyclopedia.

Another example of the power of online collaboration comes from chess. In 1999, Garry Kasparov, the greatest chess player of all time, played against the “World Team” — a single team consisting of thousands of chess players, many rank amateurs, which decided on their moves by vote. Kasparov won, but instead of the easy victory he expected, he got the most challenging game of his career, which he called “the greatest game in the history of chess”.

These examples are not curiosities or special cases; they are the leading edge of a great change in the creative process. Science is an example par excellence of creative collaboration, yet scientific collaboration still takes place mainly via small-scale face-to-face meetings. With the exception of e-mail, few of the new social tools have been broadly adopted by scientists, even though it is these tools that have the greatest potential to speed up the rate of scientific discovery.

Why have scientists been so slow to adopt these remarkable tools? Is it simply that they are too conservative in their habits, or that the new tools are no better than what we already have? Both these glib answers are wrong. We can resolve this puzzle by looking in detail at two examples where excellent online tools have failed to be adopted by scientists. It turns out that there are major cultural barriers preventing scientists from getting involved, thereby slowing down the progress of science.

A failure of science online: online comment sites

Like many people, when I am considering buying a book or electronic gadget, I often first browse the reviews at Amazon. Inspired by the success of Amazon, several organizations have created comment sites where scientists can share their opinions of scientific papers. Perhaps the best known was Nature’s 2006 trial of open commentary on papers undergoing peer review at the journal (see Physics World January 2007 pp29–30). The trial was not a success, as Nature’s final report explained: “There was a significant level of expressed interest in open peer review. A small majority of those authors who did participate received comments, but typically very few, despite significant Web traffic. Most comments were not technically substantive. Feedback suggests that there is a marked reluctance among researchers to offer open comments.”

The Nature trial is just one of many attempts at comment sites for scientists. The earliest example I am aware of is the Quick Reviews site, which was launched in 1997 and discontinued in 1998. Physics Comments was developed a few years later, and discontinued in 2006. A more recent site, Science Advisor, is still active, but has more members (1139) than reviews (1008). It seems that people want to read reviews of scientific papers, but not write them. An ongoing experiment that incorporates online commentary and many other innovative features is PLoS ONE, but it is too early to tell how successful its commentary will be.

The problem that all these sites have is that while thoughtful commentary on scientific papers is certainly useful for other scientists, there are few incentives for people to write such comments. Why write a comment when you could be doing something more “useful”, like writing a paper or a grant proposal? Furthermore, if you publicly criticize someone’s paper, then there is a chance that the person may be an anonymous referee in a position to scuttle your next paper or grant application.

To grasp the mindset here, you need to understand the monklike intensity that ambitious young scientists bring to the pursuit of scientific publications and grants. To get a position at a major university the most important thing is an impressive record of scientific papers. These papers will bring in the research grants and letters of recommendation necessary to get hired. Competition for positions is so fierce that 70–80 hour working weeks are common. The pace relaxes after tenure, but continued grant support still requires a strong work ethic. It is no wonder people have little inclination to contribute to online comment sites.

The contrast between the science comment sites and the success of the reviews at Amazon is stark. To pick just one example, you will find approximately 1500 reviews of Pokemon products on Amazon, more than the total number of reviews on all the scientific comment sites I described above. The disincentives facing scientists have led to a ludicrous situation where popular culture is open enough that people feel comfortable writing Pokemon reviews, yet scientific culture is so closed that people will not publicly share their opinions of scientific papers. Some people find this contrast curious or amusing; I believe it signifies something seriously amiss with science, something we need to understand and change.

A failure of science online: Wikipedia

Wikipedia is a second example where scientists have missed an opportunity to innovate online. Wikipedia has a vision statement to warm a scientist’s heart: “Imagine a world in which every single human being can freely share in the sum of all knowledge. That’s our commitment.” You might guess Wikipedia was started by scientists eager to collect all of human knowledge into a single source. In fact, Wikipedia’s founder, Jimmy Wales, had a background in finance and as a Web developer for an “erotic search engine”, not in science. Co-founder Larry Sanger was a philosopher who had left academia. In the early days, few established scientists were involved. Just as for the scientific comment sites, to contribute aroused suspicion from colleagues that you were wasting time that could be better spent writing papers and grants.

Some scientists will object that contributing to Wikipedia is not really science. And, of course, it is not if you take a narrow view of what science is, and take it for granted that science is only about publishing in specialized scientific journals. But if you take a broader view, if you believe science is about not only discovering how the world works, but also about sharing that understanding with the rest of humanity, then the lack of early scientific support for Wikipedia looks like an opportunity lost. Nowadays, Wikipedia’s success has to some extent legitimized contribution within the scientific community. But how strange that the modern-day Library of Alexandria had to come from outside academia.

The challenge: achieving extreme openness in science

These failures of science online are all examples where scientists show a surprising reluctance to share knowledge that could be useful to others. This is ironic, for the value of cultural openness was understood centuries ago by many of the founders of modern science; indeed, the journal system is perhaps the most open system for the transmission of knowledge that could be built with 17th-century media. The adoption of the journal system was achieved by subsidizing scientists who published their discoveries in journals. This same subsidy now inhibits the adoption of more effective technologies, because it continues to incentivize scientists to share their work in conventional journals and not in more modern media.

We should aim to create an open scientific culture where as much information as possible is moved out of people’s heads and labs, onto the network and into tools that can help us structure and filter the information. This means everything — data, scientific opinions, questions, ideas, folk knowledge, workflows and everything else. Information not on the network cannot do any good.

Ideally, we will achieve a kind of extreme openness: making many more types of content available than just scientific papers; allowing creative reuse and modification of existing work through more open licensing and community norms; making all information not just human readable but also machine readable; providing open interfaces to enable the building of additional services on top of the scientific literature, and possibly even multiple layers of increasingly powerful services. Such extreme openness is the ultimate expression of the idea that others may build upon and extend the work of individual scientists in ways that they themselves would never have conceived.

To create an open scientific culture that embraces new online tools, two challenging tasks must be achieved: first, build superb online tools; and second, cause the cultural changes necessary for those tools to be accepted. The necessity of accomplishing both these tasks is obvious, yet projects in online science often focus mostly on building tools, with cultural change an afterthought. This is a mistake, for the tools are only part of the overall picture. It took just a few years for the first scientific journals (a tool) to be developed, but many decades of cultural change before journal publication was accepted as the gold standard for judging scientific contributions.

None of this is to discount the challenge of building superb online tools. To develop such tools requires a rare combination of strong design and technical skills, and a deep understanding of how science works. The difficulty is compounded because the people who best understand how science works are scientists themselves, yet building such tools is not something scientists are typically encouraged or well suited to do. Scientific institutions reward scientists for making discoveries within the existing system of discovery; there is little space for people working to change that system. A technologically challenged head of department is unlikely to look kindly on a scientist who suggests that instead of writing papers they would like to spend their research time developing general-purpose tools to improve how science is done.

What about the second task, achieving cultural change? As any revolutionary can attest, that is a tall order. Let me describe two strategies that have been successful in the past, and that offer a template for future success. The first is a top-down strategy that has been successfully used by the open-access (OA) movement. The goal of the OA movement is to make scientific research freely available online to everyone in the world. It is an inspiring goal, and the OA movement has achieved some amazing successes. Perhaps most notably, in April 2008 the US National Institutes of Health (NIH) mandated that every paper written with the support of their grants must eventually be made open access. The NIH is the world’s largest grant agency; this decision is the scientific equivalent of successfully storming the Bastille.

[Figure: Blog world]

The second strategy is bottom-up. It is for the people building the new online tools to also develop and boldly evangelize ways of measuring the contributions made with the tools. To understand what this means, imagine you are a scientist sitting on a committee that is deciding whether or not to hire a scientist. Their curriculum vitae reports that they have helped build an open-science wiki, and also that they write a blog. Unfortunately, the committee has no easy way of understanding the significance of these contributions, since as yet there are no broadly accepted metrics for assessing such contributions. The natural consequence is that such contributions are typically undervalued.

To make the challenge concrete, ask yourself what it would take for a description of the contribution made through blogging to be reported by a scientist on their curriculum vitae. How could you measure the different sorts of contributions a scientist can make on a blog — outreach, education and research? These are not easy questions to answer. Yet they must be answered before scientific blogging is accepted as a valuable professional scientific contribution.

A success story: arXiv and SPIRES

One example illustrating the bottom-up strategy in action is the well-known physics preprint server arXiv. Since 1991 physicists have been uploading their papers to arXiv, often at about the same time as they submit the material to a journal. The papers are made freely available within hours for anyone to read. arXiv is not refereed, although a quick check is done by moderators to remove crank submissions. arXiv is an excellent and widely used tool, with more than half of all new papers in physics appearing there first. Many physicists start their day by seeing what has appeared on the site overnight. Thus, arXiv exemplifies the first step towards achieving a more open culture: it is a superb tool.

Not long after arXiv began, a citation-tracking service called SPIRES decided it would extend its service to include both arXiv papers and conventional journal articles. SPIRES specializes in particle physics, and as a result it is now possible to search on a particle physicist’s name and see how frequently all their papers, including arXiv preprints, have been cited by other physicists.

SPIRES has been run since 1974 by one of the most respected and highly visible institutions in particle physics, the SLAC National Accelerator Laboratory. The effort that SLAC has put into developing SPIRES means that its metrics of citation impact are both credible and widely used by the particle-physics community. It is now possible for a particle physicist to convincingly demonstrate that their work is having a high impact, even if it has only been submitted to arXiv and has not yet been published in a conventional scientific journal. When hiring committees meet to evaluate candidates in particle physics, people often have their laptops out, examining and comparing the SPIRES citation records of candidates.

SPIRES and arXiv have not stopped particle physicists from publishing in peer-reviewed journals. When you are applying for jobs, or up for tenure, every ounce of ammunition helps, especially when the evaluating committee may contain someone from another field who is reluctant to take the SPIRES citation data seriously. Still, some physicists have become more relaxed about publication, and it is not uncommon to see CVs including preprints that have not been published in conventional journals.

The problem of collaboration

Even Albert Einstein needed help occasionally. In 1912, when Einstein first realized that a new kind of geometry was needed to describe space and time, he had little idea of how to proceed. Fortunately, he shared his difficulties with a mathematician friend, Marcel Grossmann, who knew just what Einstein needed and introduced him to the work of the mathematician Bernhard Riemann. It took Einstein three more years to work out the full theory, but Grossmann was right, and this was a critical point in the development of general relativity.

Einstein’s conundrum is familiar to any scientist. When doing research, subproblems constantly arise in unexpected areas. No-one can be expert in all those areas. Most of us instead stumble along, picking up the skills necessary to make progress towards our larger goals, grateful when the zeitgeist of our research occasionally throws up a subproblem in which we are already truly expert. Like Einstein, we have a small group of trusted collaborators with whom we exchange questions and ideas when we are stuck. Unfortunately, most of the time even our collaborators are not that much help. They may point us in the right direction, but rarely do they have exactly the expertise we need. Is it possible to scale up this conversational model, and build an online collaboration market to exchange questions and ideas, a sort of collective working memory for the scientific community?

It is natural to be sceptical of this idea, but an extremely demanding creative culture already exists that shows that such a collaboration market is feasible — the culture of free and open-source software. Scientists browsing for the first time through the development forums of open-source programming projects are often shocked at the high level of the discussion. They expect amateur hour at the local karaoke bar; instead, they find professional programmers routinely sharing their questions and ideas, helping solve each other’s problems, often exerting great intellectual effort and ingenuity. Rather than hoarding their questions and ideas, as scientists do for fear of being scooped, the programmers revel in swapping them. Some of the world’s best programmers hang out in these forums, swapping tips, answering questions and participating in the conversation.

New possibilities

I will now describe two embryonic examples that suggest that online collaboration markets for science may be valuable. The first is InnoCentive, which allows companies like Eli Lilly and Procter and Gamble to pose “challenges” over the Internet: scientific research problems with associated prizes for their solution, often many thousands of dollars. For example, one of the challenges currently on InnoCentive asks participants to find a biomarker for motor neuron disease, with a $1m prize. If you register for the site, it is possible to obtain a detailed description of the challenge requirements, and attempt to win the prize. More than 140 000 people from 175 countries have registered, and prizes for more than 100 challenges have been awarded.

InnoCentive is an example of how a market in scientific problems and solutions can be established. Of course, it has shortcomings as a model for collaboration in basic research. Only a small number of companies are able to pose challenges, and they may do so only after a lengthy vetting process. InnoCentive’s business model is aimed firmly at industrial rather than basic research, and so the incentives revolve around money and intellectual property, rather than reputation and citation. It is certainly not a rapid-fire conversational tool like the programming forums; one does not wake up in the morning with a problem in mind and post it to InnoCentive, hoping for help with a quick solution.

FriendFeed is a much more fluid tool that is being used by scientists as a conversational medium to discuss research problems. What FriendFeed allows users to do is set up what is called a lifestream. As an example, my lifestream is set up to automatically aggregate pretty much everything I put on the Web, including my blog posts, del.icio.us links, YouTube videos and several other types of content.

I also subscribe to a list of about 100 or so “friends” whose lifestreams I can see aggregated into one giant river of information — all their Flickr photos, blog posts and so on. These people are not necessarily real friends — I am not personally acquainted with my “friend” Barack Obama — but it is a fantastic way of tracking a high volume of activity from a large number of people.

As part of the lifestream, FriendFeed allows messages to be passed back and forth in a lightweight way, so communities can form around common interests and shared friendships. In April 2008 Cameron Neylon, a chemist from the University of Southampton, used FriendFeed messaging to post a request for assistance in building molecular models. Pretty quickly Pawel Szczesny, a biologist at the Max Planck Institute for Developmental Biology in Tübingen, Germany, replied, and said he could help. A scientific collaboration was now under way.

FriendFeed is a great service, but it suffers from many of the same problems that afflict the comment sites and Wikipedia. Lacking widely accepted metrics to measure contribution, scientists are unlikely to adopt FriendFeed en masse as a medium for scientific collaboration. And without widespread adoption, the utility of FriendFeed for scientific collaboration will remain relatively low.

The economics of collaboration

How much is lost due to inefficiencies in the current system of collaboration? To answer this question, imagine a scientist named Alice. Like most scientists, many of Alice’s research projects spontaneously give rise to problems in areas in which she is not an expert. She juggles hundreds or thousands of such problems, re-examining each occasionally and looking to make progress, but knowing that only rarely is she the person best suited to solve any given problem.

Suppose that for a particular problem, Alice estimates that it would take her 4–5 weeks to acquire the required expertise and solve the problem. That is a long time, and so the problem is put on the back burner. Unbeknown to Alice, though, there is another scientist in another part of the world, Bob, who has just the skills required to solve the problem in less than a day. This is not at all uncommon. Quite the contrary; my experience is that this is the usual situation. Consider the example of Grossmann, who saved Einstein what might otherwise have been years of extra work.

Do Alice and Bob exchange questions and ideas, and start collaborating towards a solution to Alice’s problem? Unfortunately, nine times out of 10 they never even meet, or if they do, they just exchange small talk. It is an opportunity lost for a mutually beneficial trade, a loss that may cost weeks of work for Alice. It is also a great loss for the society that bears the cost of doing science. Expert attention, the ultimate scarce resource in science, is very inefficiently allocated under existing practices for collaboration.

An efficient collaboration market would enable Alice and Bob to find this common interest, and exchange their know-how, in much the same way as eBay and craigslist enable people to exchange goods and services. However, in order for this to be possible, a great deal of mutual trust is required. Without such trust, there is no way that Alice will be willing to advertise her questions to the entire community. The danger of free riders who will take advantage for their own benefit (and to Alice’s detriment) is just too high.

In science, we are so used to this situation that we take it for granted. But let us compare it to the apparently very different problem of buying shoes. Alice walks into a shoe shop with some money. Alice wants shoes more than she wants to keep her money, while Bob the shop owner wants the money more than he wants the shoes. As a result, Bob hands over the shoes, Alice hands over the money, and everyone walks away happier after just 10 minutes. This rapid transaction takes place because there is a trust infrastructure of laws and enforcement in place that ensures that if either party cheats, then they are likely to be caught and punished.

If shoe shops operated like scientists trading ideas, first Alice and Bob would need to get to know one another, maybe go for a few beers in a nearby bar. Only then would Alice finally say “You know, I am looking for some shoes”. After a pause, and a few more beers, Bob would say “You know what, I just happen to have some shoes I am looking to sell”. Every working scientist recognizes this dance; I know scientists who worry less about selling their house than they do about exchanging scientific information.

In economics, it has been understood for hundreds of years that wealth is created when we lower barriers to trade, provided there is a trust infrastructure of laws and enforcement to prevent cheating and ensure trade is uncoerced. The basic idea, which goes back to economist David Ricardo in 1817, is to concentrate on areas where we have a comparative advantage, and to avoid areas where we have a comparative disadvantage.

Although Ricardo’s work was in economics, his analysis works equally well for the trade in ideas. Indeed, even were Alice to be far more competent than Bob, Ricardo’s analysis shows that both Alice and Bob benefit if Alice concentrates on areas where she has the greatest comparative advantage, and Bob on areas where he has the least comparative disadvantage. Unfortunately, science currently lacks the trust infrastructure and incentives necessary for such free, unrestricted trade of questions and ideas.
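Ricardo’s argument is easier to see with numbers. The short Python sketch below is only an illustration: the two task types ("theory" and "simulation"), the hours per problem and the 40-hour week are all invented assumptions, not data. Alice is assumed to be faster than Bob at both tasks, yet the pair solve more problems in total when Bob specializes where his disadvantage is smallest.

```python
# Hypothetical hours each scientist needs to solve one problem of each kind.
# Every number here is an illustrative assumption, not a measurement.
HOURS_PER_PROBLEM = {
    "Alice": {"theory": 2.0, "simulation": 4.0},   # Alice is faster at both
    "Bob": {"theory": 10.0, "simulation": 8.0},
}


def solved(person, hours_by_task):
    """Problems solved by one person, given the hours spent on each task."""
    return {
        task: hours / HOURS_PER_PROBLEM[person][task]
        for task, hours in hours_by_task.items()
    }


def total(*results):
    """Add up the problems solved by several people."""
    combined = {"theory": 0.0, "simulation": 0.0}
    for result in results:
        for task, count in result.items():
            combined[task] += count
    return combined


def opportunity_cost(person, task, other_task):
    """Problems of other_task given up in order to solve one problem of task."""
    return HOURS_PER_PROBLEM[person][task] / HOURS_PER_PROBLEM[person][other_task]


# No trade: each splits a 40-hour week evenly between the two tasks.
no_trade = total(
    solved("Alice", {"theory": 20.0, "simulation": 20.0}),
    solved("Bob", {"theory": 20.0, "simulation": 20.0}),
)

# Trade: Bob does only simulation (his least comparative disadvantage),
# which frees Alice to spend 30 of her 40 hours on theory.
with_trade = total(
    solved("Alice", {"theory": 30.0, "simulation": 10.0}),
    solved("Bob", {"theory": 0.0, "simulation": 40.0}),
)

print("No trade:  ", no_trade)
print("With trade:", with_trade)
print("Simulation cost in theory problems, Alice:",
      opportunity_cost("Alice", "simulation", "theory"))
print("Simulation cost in theory problems, Bob:  ",
      opportunity_cost("Bob", "simulation", "theory"))
```

With these assumed numbers the same number of simulation problems gets solved either way, but trade yields three extra theory problems per week. The gain comes from opportunity cost rather than raw speed: a simulation problem costs Alice two theory problems but costs Bob less than one.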

An ideal collaboration market will enable just such an exchange of questions and ideas. It will include metrics of contribution so that participants can demonstrate the impact that their work is having. Contributions will be archived, time-stamped and signed, so it is clear who said what, and when. Combined with high-quality filtering and search tools, the result will be an open culture of trust that gives scientists a real incentive to outsource problems, and contribute in areas where they have a great comparative advantage. This will change science.
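As a rough illustration of what "archived, time-stamped and signed" could mean in practice, here is a minimal Python sketch using only the standard library. It is an assumption-laden toy, not a description of any existing service: the record fields, the function names and the demo key are hypothetical, and a keyed hash (HMAC) stands in for the public-key signatures a real system would more likely use.

```python
import hashlib
import hmac
import json
from datetime import datetime, timezone


def sign_contribution(author, text, secret_key, timestamp=None):
    """Build a record whose tag no longer matches if any field is altered."""
    record = {
        "author": author,
        "text": text,
        # Default to the current UTC time when no timestamp is supplied.
        "timestamp": timestamp or datetime.now(timezone.utc).isoformat(),
    }
    # Serialize with sorted keys so the same record always gives the same bytes.
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    record["tag"] = hmac.new(secret_key, payload, hashlib.sha256).hexdigest()
    return record


def verify_contribution(record, secret_key):
    """Recompute the tag and compare it with the stored one."""
    body = {key: value for key, value in record.items() if key != "tag"}
    payload = json.dumps(body, sort_keys=True).encode("utf-8")
    expected = hmac.new(secret_key, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, record["tag"])


# Hypothetical usage: Alice posts a question, and someone later alters it.
key = b"alice-demo-key"  # illustrative only; never hard-code real keys
note = sign_contribution(
    "Alice",
    "Can anyone help me build a molecular model?",
    key,
    timestamp="2008-04-16T00:00:00+00:00",
)
tampered = dict(note, text="I solved this first.")

print(verify_contribution(note, key))      # the original record checks out
print(verify_contribution(tampered, key))  # the altered record does not
```

The point of the sketch is the property rather than the mechanism: once who said what, and when, can be checked by anyone, credit for a question or idea no longer depends on private trust between the two parties.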

At a Glance: Open science

• The Internet provides an opportunity to create a conversational platform for scientists to develop ideas rapidly and collaboratively
• Scientists, however, have been relatively slow to adopt online tools such as comment sites and Wikipedia
• The Internet can improve the way we do science in two ways. First, online tools are a way of expanding the range of scientific knowledge that can be shared with the world. Second, the Internet can change the process and scale of creative collaboration itself, using social software such as wikis, online forums and their descendants
• Great online applications will not be sufficient to change scientific collaboration. We still require a cultural change that embraces an open scientific culture. This will include new metrics that acknowledge online collaboration as a genuine scientific contribution — something that will act as an incentive for scientists to share their problems online

More about: Open science
Open-access blogs
Author’s blog: michaelnielsen.org/blog
Cameron Neylon’s blog: blog.openwetware.org/scienceintheopen/2008/04/16/the-science-exchange
Open-access news: www.earlham.edu/~peters/fos/fosblog.html
Online communities
Science 2.0: friendfeed.com/rooms/science-2-0
Science commons: sciencecommons.org
Articles
Bill Hooker’s essays: 3quarksdaily.blogs.com/3quarksdaily/2006/10/the_future_of_s_1.html
M Waldrop 2008 Science 2.0: great new tool or great risk? Scientific American May pp68–73
