Confusion about open science repositories

I recently gave a talk on Open Practices in Visualization Research at the workshop on Methodological Approaches for Visualization (BELIV). Unfortunately, with only a 10-minute talk, I had to leave out many important details, which has resulted in some confusion.

A few people have raised concerns that repositories for open data and materials do not have long-term viability. “What happens if the site shuts down in 5 years?” As an alternative, people have proposed storing data and materials in a paywalled IEEE repository. While it’s good to hear that open access is being discussed, the discussion can only be fruitful if it is well informed. So I’ll highlight some critical information about the Open Science Framework (OSF).

1. 50-year preservation fund

The Center for Open Science (COS) has a fund devoted specifically to preserving and maintaining the repository in case the organization ever shuts down. This fund would make a read-only form of the repository accessible for 50+ years. Here is a quote from the sustainability supplement in the COS’s strategic plan (page 24):

In the event of COS’s closing, the preservation fund guarantees long-term hosting and preservation of all the data and content stored on the OSF (50+ years based on present costs and use)

2. An open license with no paywall

Authors who post content to OSF can choose from a variety of open licenses. Any future work that builds upon the content, incorporates it into a meta-analysis, or scrutinizes it can freely access and link to the material. Openness facilitates research without needing to rely on an expensive subscription to the publisher. Furthermore, an open license means that future work will not require the original author to give permission or even reply to emails.

On the other hand, some people want the content to be stored in IEEE’s digital library. That is exactly the opposite of open science. It would be behind a paywall (that’s not open). Also, IEEE would own the copyright to the data and materials. Either IEEE or an obnoxious original author who fears scrutiny could cite licensing grounds to obstruct any attempt to publish work that reuses the content (that’s not science).

3. No risk of lock-in

The openness of OSF allows people to copy their content elsewhere in the future. So there is little risk of being “stuck” with OSF if you don’t like it. If someone creates a better site, they could even mirror OSF’s content, so future open science systems could start with all of the information already on OSF.

4. Updates and edits to content

As in version control, most open science repositories let authors update content while keeping every previous version accessible. That approach allows later updates, such as adding documentation or fixing typos, without erasing the peer-reviewed version. In contrast, making a change to the IEEE digital library is a nightmare.
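
To make the version-control analogy concrete, here is a minimal sketch in Python of an append-only version history. It is only an illustration of the idea, not how OSF or any other repository actually implements versioning; the names `VersionedFile`, `update`, and `get` are hypothetical, as are the file name and contents.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class VersionedFile:
    """Hypothetical append-only file history; not any repository's real API."""
    name: str
    _versions: List[str] = field(default_factory=list)

    def update(self, content: str) -> int:
        """Store new content and return its 1-based version number."""
        self._versions.append(content)
        return len(self._versions)

    def get(self, version: Optional[int] = None) -> str:
        """Return the requested version, or the latest if none is given."""
        if not self._versions:
            raise LookupError(f"{self.name} has no versions yet")
        if version is None:
            return self._versions[-1]
        if not 1 <= version <= len(self._versions):
            raise LookupError(f"{self.name} has no version {version}")
        return self._versions[version - 1]


# Illustrative usage with made-up contents.
readme = VersionedFile("README.txt")
reviewed = readme.update("Data from Experiment 1.")
readme.update("Data from Experiment 1. Columns are documented in codebook.csv.")

print(readme.get())          # latest version, with the added documentation
print(readme.get(reviewed))  # the originally reviewed version is still retrievable
```

Because updates only ever append, a paper can keep pointing at the version that reviewers saw while the authors continue to improve the material.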

5. Templates for policies and submission forms

There have been some attempts by individuals and organizations such as ACM to “reinvent the wheel” by creating their own policies for open practice requirements and badges. These attempts often fail to consider flexibility and transparency in reporting.

Alternatively, the Transparency and Openness Promotion (TOP) guidelines provide pre-written templates for modular policies with various levels of strictness (from simply reporting whether an artifact is available to mandatory submission) and for various artifacts (materials, collected data, analysis code, etc.). A table (artifact × strictness) summarizing the different policies is available on the last page here.

  1. The full set of modular open policy templates with example implementations by various journals is available here.
  2. An author disclosure form for making submissions that request one of the open science badges is available here.


One final note: I’m not especially attached to OSF. There are alternatives such as Zenodo and figshare, but OSF has the most full-featured set of services and the most well-thought-out policies.