A wiki approach to Open Access and Open Science

This blog is dedicated to documenting the progress of the Wikimedian in residence on Open Science project – with an initial focus on Open Access – funded by the Open Society Foundations Information Program and hosted by the Open Knowledge Foundation Germany.

Let’s start by introducing the five bolded terms, in reverse order:

The Open Knowledge Foundation Germany (OKFN-DE) here at okfn.de is the German chapter of the globally operating Open Knowledge Foundation (OKFN) headquartered in Cambridge, UK, whose mission is to foster the creation, curation and reuse of Open Knowledge, as defined by its Open Definition. To this end, it supports initiatives like the Panton Principles for the sharing of scientific data, the CKAN repository for Open Data, the Open Literature prototype Open Shakespeare, or the listing of catalogues of open governmental data DataCatalogs.org. Anyone can participate in these global activities, and in addition to doing that, the German chapter works on promoting the open sharing of information and knowledge within Germany and the German-speaking world.

The Open Society Foundations (formerly Open Society Institute – OSI) run a number of programs to advance justice, education, public health, and independent media. One of these is the Information Program, which has three focal areas concerned with increasing public access to knowledge, facilitating civil society communication, and protecting civil liberties as well as the freedom to communicate in the digital environment. Activities under the “public access to knowledge” theme include the Open Access Initiative – built on the Budapest Open Access Initiative, one of the hallmarks of the Open Access movement – as well as the Intellectual Property Reform Initiative, Open Educational Resources, Open Access to Law and Electronic Information for Libraries, whose Open Access Program won the 2011 SPARC Europe Award for Outstanding Achievements in Scholarly Communications.

Open Access is a shorthand for a variety of mechanisms to provide access to scholarly articles at no cost to the reader. These approaches come in two major variants known as the “green” and “gold” Open Access models. In the first one, authors are encouraged to deposit pre- or postprints (manuscripts before and after peer review, respectively) in a public repository, such as arXiv.org or Nature Precedings. In the “gold” model, the formatted publication itself is available to be read by anyone. In 2009, about 20% of scholarly articles were publicly accessible by way of green and/or gold Open Access. While this means that the large majority of the scholarly literature was not accessible to the public, it also means that a significant (and growing) amount of scholarship already is, including that published in the largest scientific journal on the planet. Neither green nor gold Open Access, however, automatically implies a license that allows the content to be re-used elsewhere, an arrangement known as libre Open Access.

Open Science is a movement within the scientific community to take advantage of the opportunities provided by the web to increase the sharing of scientific information beyond the limits imposed by print publishing and associated reuse-restricting notions of copyright, with the ultimate goal of improving collaboration. Open Access is an important first step in this direction, and its constantly rising profile within the scientific community and society at large is paving the way for Open Data and, ultimately, for Open Science in a more participative sense, that of “science carried out and communicated in a manner which allows others to contribute, collaborate and add to the research effort, with all kinds of data, results and protocols made freely available at different stages of the research process”. In the context of this project, science is to be understood to encompass humanities and social sciences.

The Wikimedia Foundation (WMF) is the organization behind projects like Wikipedia, Wikiversity, Wikisource, Wikispecies and Wiktionary, as well as the MediaWiki software they run on. The Wikimedia and Open Science movements share a vision of free global access to knowledge, but the many interactions between the two sides have so far not been systematic in nature. There are two major avenues for such interaction: one is to integrate wikis with scholarly workflows (particularly with scholarly publishing), although Wikimedia projects currently offer only limited prospects in this regard; the other is to use wikis to communicate science beyond the scholarly communities.

Along the second route, the “Wikipedian in residence” (WiR) initiative was started about a year ago under similar circumstances, with the aim of improving Wikipedia coverage of cultural institutions (galleries, libraries, archives, and museums – collectively referred to as GLAM). The WiR concept – as well as the GLAM:Wiki conferences it sparked – has been tested in a number of environments, and has so far generally been regarded as a success from both the wiki and the GLAM sides, as it has helped to keep (or turn, in some cases) debates rational and to focus on the common goals. For instance, tensions between the National Portrait Gallery in London and Wikipedia could be eased significantly by engaging both sides in dialogue.

Now what about taking a similar approach to coverage of scientific topics? In such a context, some of the first questions to arise are whether financial resources should be spent on getting access to non-open knowledge or not, and whether the positive effects on article quality expected from such an investment could also be achieved by using the publicly accessible share of the scholarly literature instead.

While there is no tension between the Wikimedia and Open Access communities, a considerable lack of understanding exists. One important point that needs addressing is the limited awareness in the Open Access community of the importance of licences: unsuitable licences, rather than a lack of interest in sharing, often prevent the spread of scholarly information.

Despite these problems, content from Open Access publishers (e.g. from BioMed Central, PLoS, Hindawi or Copernicus) is already widely used on the Wikipedias. Yet traditional publishers still receive far more citations from Wikipedia articles, as indicated in the following table, which shows estimates (as of May 31) of the number of times articles from Open Access and other resources have been cited on the English Wikipedia (“Citing pages” column) and of the number of times that files originating from these sources have been used on Wikimedia Commons (“Reused files” column).

| Resource | Citing pages | Reused files | Default license | Comment |
|---|---|---|---|---|
| Elsevier | 29643 | 414 | © | |
| JSTOR | 19018 | 134 | © | |
| NPG | 16167 | 121 | © | |
| Springer | 13549 | 149 | © | hybrid |
| PNAS | 12599 | 25298 | © | hybrid |
| Wiley | 11945 | 46 | © | |
| AAAS | 6580 | 54 | © | |
| ACS | 6283 | 448 | © | |
| Taylor & Francis | 4758 | 27 | © | |
| BMC | 2831 | 921 | CC-BY | |
| PLoS | 2666 | 719 | CC-BY | |
| arXiv | 2154 | 58 | arXiv | hybrid |
| Royal Society | 2042 | 35 | © | hybrid |
| SAGE | 1934 | 9 | © | |
| IOP | 1333 | 10 | © | hybrid |
| Encyclopedia of Life | 1097 | 255 | CC-BY-NC | other CC licenses possible but not widely used |
| Scholarpedia | 240 | 0 | © | by DOI: 13/0 |
| Hindawi | 221 | 14 | CC-BY | |
| Citizendium | 125 | 74 | CC-BY-SA | |
| Biophysical Society | 73 | 0 | © | |
| Copernicus | 66 | 1 | CC-BY | |
| Pensoft | 49 | 1034 | CC-BY | Wikispecies partner |
| Frontiers | 31 | 1 | CC-BY | |
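To put a rough number on the gap visible in the table above, the citing-page counts can be summed per default license. The following is only a sketch: the figures are copied from the table, while the grouping (all © rows, including hybrid ones, versus all rows whose default license is CC-BY) and the names `ROWS` and `citing_pages_by_license` are assumptions made for illustration.

```python
from collections import defaultdict

# (resource, citing pages, default license), copied from the table above.
# "copyright" stands for the © rows, including those marked "hybrid".
ROWS = [
    ("Elsevier", 29643, "copyright"),
    ("JSTOR", 19018, "copyright"),
    ("NPG", 16167, "copyright"),
    ("Springer", 13549, "copyright"),
    ("PNAS", 12599, "copyright"),
    ("Wiley", 11945, "copyright"),
    ("AAAS", 6580, "copyright"),
    ("ACS", 6283, "copyright"),
    ("Taylor & Francis", 4758, "copyright"),
    ("BMC", 2831, "CC-BY"),
    ("PLoS", 2666, "CC-BY"),
    ("arXiv", 2154, "arXiv"),
    ("Royal Society", 2042, "copyright"),
    ("SAGE", 1934, "copyright"),
    ("IOP", 1333, "copyright"),
    ("Encyclopedia of Life", 1097, "CC-BY-NC"),
    ("Scholarpedia", 240, "copyright"),
    ("Hindawi", 221, "CC-BY"),
    ("Citizendium", 125, "CC-BY-SA"),
    ("Biophysical Society", 73, "copyright"),
    ("Copernicus", 66, "CC-BY"),
    ("Pensoft", 49, "CC-BY"),
    ("Frontiers", 31, "CC-BY"),
]


def citing_pages_by_license(rows):
    """Sum the 'Citing pages' column per default license."""
    totals = defaultdict(int)
    for _resource, citing_pages, license_name in rows:
        totals[license_name] += citing_pages
    return dict(totals)


totals = citing_pages_by_license(ROWS)
ratio = totals["copyright"] / totals["CC-BY"]
print(f"copyright: {totals['copyright']}, CC-BY: {totals['CC-BY']}, ratio: {ratio:.1f}")
```

With the table's figures, this gives 126164 citing pages for the © group against 5864 for the CC-BY group, a ratio of roughly 21.5 to one. The counts are estimates and a page may cite several resources, so the sums are best read as an order of magnitude.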

There are many reasons for this, including the relative youth of Open Access publishers, but although a search for images on Wikimedia Commons reveals around 28,000 files that contain “doi.org” in their description (and thus are mostly OA content), there is no systematic approach to incorporating Open Access content into Wikimedia projects, albeit some project-specific collaborations exist as part of a wider move for scaling up expert involvement.
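The post does not say how the search for “doi.org” was run. One possible way to repeat such a count is the full-text search module of the MediaWiki API on Wikimedia Commons, restricted to the File namespace. The sketch below only builds the request URL and performs no network call; the function name `commons_search_url` and the exact search term are assumptions made for illustration.

```python
from urllib.parse import urlencode

# Public API endpoint of Wikimedia Commons.
COMMONS_API = "https://commons.wikimedia.org/w/api.php"


def commons_search_url(term, limit=10):
    """Build a MediaWiki API full-text search URL limited to file pages."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": term,
        "srnamespace": 6,  # 6 is the File: namespace
        "srlimit": limit,
        "format": "json",
    }
    return f"{COMMONS_API}?{urlencode(params)}"


# Quoting the term asks the search backend for the exact phrase.
url = commons_search_url('"doi.org"')
print(url)
```

Fetching this URL should return a JSON document whose `query.searchinfo.totalhits` field holds the number of matching file pages. That number will differ from the figure quoted above, since Commons has grown considerably since this post was written.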

One route to facilitate re-use is the “Picture of the Year” contest, in which registered Wikimedia users can vote for the images they like best. The candidates for this contest are taken from the images that have been “Picture of the Day” during the previous year, or from other images with featured status. So far, images originating from Open Access sources form only a tiny minority of such images. A cursory search revealed that there were at least two candidates for Picture of the Year 2007 that originated from PLoS articles. The latter (originally published in PLoS Biology under a CC-BY license and the current header image of this blog) was even a finalist.

In the framework of the project, Open Access publishers will be encouraged to facilitate such re-use more systematically, as a kind of perpetual image donation. To this end, automated tools for uploading Open Access-published non-text media onto Wikimedia Commons will be made available, along with tools that allow information on such re-use (or even usage stats) to be fed back to the original publisher (manually posted example comment).

For individuals from the Open Access community, many of Wikipedia’s policies can hinder potential contributions. Important among these are notability (which some Open Access subjects might struggle to pass), original research (which refers to basically anything that has not been published in a reputable source and would thus potentially exclude numerous details in Open Access articles, particularly about institutions, organizations and persons involved) and conflict of interest (which means that staff of an Open Access journal or repository – or even authors of the original journal article – have to be particularly careful when editing the respective Wikipedia articles).

For existing articles, this is less of a problem, but in order to get an article started, some expertise in the topic and in Wikipedia policies is clearly beneficial. In the framework of this project, new articles will thus be started in a way that allows input from both communities before the draft is placed in the main namespace.

Another aspect of the relationship between the Open Access and Wikimedia communities – and the central one for all WiR projects so far – is the coverage of the WiR target topics on Wikipedia. Given that the high popularity of Wikipedia essentially renders it the “public face” for many kinds of knowledge, almost any community should have an interest in improving the coverage of their respective topics in Wikipedia or compatibly licensed knowledge bases (from where articles and media can be imported into Wikimedia projects), and even some academic communities have taken measures in this direction.

Currently the best resources on the web for information on Open Access are the Open Access Directory (also home to the Open Access Tracking Project), along with the BOAI mailing list (hosted by the Open Society Foundations). Open Access coverage on the English Wikipedia is patchy, albeit more useful than that of most other Open Science topics. There are entries on Open access (publishing) (cf. discussion about article title), Open access journal, Hybrid open access journal, Repositories (publishing), Institutional repositories, Disciplinary repositories, Open Journal Systems and about 50 other articles in the parent Open Access category; a subcategory for Open Access journals hosts almost 300 entries, and subsubcategories exist for BMC journals and institutional repository software. However, many of the categories are inconsistently applied, and most smaller Open Access publishers or repositories do not presently have an entry in the English Wikipedia, nor does the Open Access Directory. Most existing articles related to Open Access are rated as stubs or remain unassessed for quality. The situation is even worse for Open Data and most other aspects of Open Science, with some notable exceptions. Coverage in other language editions is generally much less comprehensive than on the English Wikipedia.

In the pilot phase (July 2011 – July 2012), the goals of the project are to improve coverage of topics related to Open Access in the English Wikipedia, to review and facilitate the reuse of materials from Open Access articles in WMF-run projects, to facilitate the implementation of the WMF’s Open Access policy (currently being drafted), and to explore the potential for the Open Access community to collaborate with WMF projects or wikis in general.

There are three major ways in which the project could be extended if the pilot succeeds:

  1. Expand the range of languages covered beyond English;

  2. Expand the range of topics covered beyond Open Access;

  3. Expand the range of projects covered beyond Wikimedia.

During the pilot phase, sample articles will be worked on for each of these different options. Which of these directions to explore in what way will be determined in part on the basis of community feedback received during the pilot phase, which is hereby invited on any aspect of the project.

I have a decade of experience as a practitioner of Open Access, about half that time as a regular contributor in Open Knowledge environments (including WMF projects, Citizendium, OpenWetWare, WikiEducator, Species ID and recently also GitHub) as well as Open Access initiatives and events (Open Access Day, Open Access Week, Conference of Open Access Scholarly Publishing). I got involved with the Open Knowledge Foundation about two years ago and with the German chapter on the day it was launched last year. Throughout the project, both will act as a contact point for expertise on matters of Open Knowledge, especially Open Data.

The Open Society Foundations Information Program supports Open Science in several ways, e.g. through grants to Michael Nielsen to tour with talks on the topic, and to Cameron Neylon to work on research assessment metrics that go beyond the Journal Impact Factor. The Open Society Foundations have also been sponsoring Wikimedia activities for some time, e.g. the most recent Wikimania conferences. My initial contact with them arose through the World Association of Young Scientists after the World Science Forum 2009 in Budapest, when I was one of the scientists consulted by OSI on strategic funding for Open Science. I am grateful that they are now funding this project on a part-time basis, and I am looking forward to developing it together with both the wiki and Open Science communities.

This blog is hosted at wir.okfn.org not just because WiR stands for “Wikimedian in Residence” but also because “wir” is the German word for “we”, thus highlighting the collaborative nature of the project.


8 Responses to A wiki approach to Open Access and Open Science

  1. Aubrey says:

    Hi Daniel, I’m really, really happy someone came out with a project like this, which is as simple as it is powerful. I think it would be great if you could foster the dialogue between the two movements, and highlight some issues (i.e. the lack of clear licenses in OA). I’m sure that your work on the coverage of OA and OS in Wikipedia will be very useful also for other languages (for example, Italian :-)). Eager to hear for updates.

    • Thanks for the encouragement, Aubrey. Extendability beyond the English Wikipedia is certainly on my mind, and I am confident that we will find ways to have the project interact with Italian as well.

  2. Pingback: Wikimedia blog » Blog Archive » Joining forces with open science

  3. Congratulations, Daniel! This is an amazing blog post, too, I must say. The interesting-hyperlink ratio is through the roof. You list the “default license” of a lot of publishers as CC-BY. This is the default, but a lot of them also allow authors to retain copyright with tighter restrictions, like no-derivatives … is that right?
