Category Archives: Open Science

DALL-E generated: A series of hurdles with data at the end

Scrutiny for thee but not for me: When open science research isn’t open

A fundamental property of scientific research is that it is scrutinizable. Facilitating that scrutiny by eliminating barriers that delay or prevent access to research data and replication materials is a major goal of transparent-research advocates. So when a paper that actually studies open research practices hides its own data, it should raise eyebrows. A recent paper about openness and transparency in Human-Computer Interaction did exactly that.

The paper is titled “Changes in Research Ethics, Openness, and Transparency in Empirical Studies between CHI 2017 and CHI 2022”. It looked at various open practices of papers sampled from the ACM CHI proceedings in 2017 and 2022. Then it compared how practices like open data, open experiment materials, and open access changed between those years. Sounds like a substantial effort that’s potentially very useful to the field! But it doesn’t live up to the very standards it’s researching and advocating for.

Continue reading
Building Facades. Credit: Zacharie Gaudrillot-Roy

Why open data is critical during review: An example

While I was reviewing a paper a while ago, a single word in the methods section caught my eye: “randomized”. Based on that single word, I strongly suspected that the methods were not accurately reported and that the conclusions were entirely unsupported. But I also knew that I’d never be able to prove it without the data. Here, I’ll explain what happened as one example of why empirical papers without open data (or an explicit reason the data can’t be shared) hamper review and shouldn’t be trusted.

Starting the review

I got a review request for a paper that had been through a couple rounds of review already, but the previous reviewers were unavailable for the revision. I don’t mind these requests, as the previous rounds of review should have caught glaring problems or clarity issues. As I always do, I skimmed the abstract and gave the paper a quick glance to make sure I was qualified. Then I accepted.

While some people review a paper by reading it in order, and others start with the figures, I jump straight to the methods. The abstract and title made the paper’s goals clear, so I had a good sense of the core question. Since I tend to prioritize construct validity in my reviews, I wanted to know whether the experiment and analysis actually provided sufficient evidence to answer the question posed.

The methods

The experiment showed 4 different items, and the task was to select one based on the instruction’s criteria. Over 200 subjects were run. There’s no need to go into more specifics. It was a single-trial 4-alternative-forced-choice (4AFC) experiment that also had an attention check. The items were shown in a vertical column, and the paper noted that the order of the items was “randomized” and that an equal number of subjects were presented with each ordering. The goal was to figure out which item was more likely to be selected.

Did you spot what caught my attention?
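
Whether or not you did, this is exactly the kind of claim that open data lets a reviewer verify rather than merely suspect. As a purely illustrative sketch (the file name and columns below are hypothetical, not from the actual submission), a few lines of code could tabulate how many subjects saw each possible ordering:

```python
# Hypothetical sketch: with open per-subject data, a reviewer could check the
# ordering claims in minutes. The file and column names here are invented.
from itertools import permutations
import pandas as pd

df = pd.read_csv("experiment_data.csv")   # one row per subject (hypothetical)

# Four items can be arranged in 4! = 24 distinct vertical orderings.
n_possible = len(list(permutations("ABCD")))
print("Possible orderings:", n_possible)          # 24

# How many subjects saw each ordering, and were the counts actually equal?
counts = df["item_order"].value_counts()
print("Orderings used:", counts.size, "of", n_possible)
print("Subjects per ordering:\n", counts)
print("Counts equal across orderings:", counts.nunique() == 1)
```

If the paper’s description holds, the counts should come out equal across whichever orderings were used. Without the data, a reviewer can only take that on faith.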

Continue reading

A bare minimum for open empirical data

It should be possible for someone to load and analyze your data without ever speaking to you and without being driven to rage by frustration.
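
One way to apply that test before posting: pretend to be that someone and see whether the files stand on their own. Here’s a minimal sketch, assuming a hypothetical tidy CSV plus a plain-text codebook (the file names are placeholders, not a required convention):

```python
# The "stranger test", sketched: can the released files be loaded and understood
# without contacting the authors? File names below are hypothetical placeholders.
import pandas as pd

data = pd.read_csv("study1_trials.csv")   # tidy data: one row per observation
with open("codebook.txt", encoding="utf-8") as f:
    codebook = f.read()

# Every column should be described somewhere in the codebook.
undocumented = [col for col in data.columns if col not in codebook]
print("Undocumented columns:", undocumented or "none")

# And the basic structure should be self-evident without asking anyone.
print(data.shape)          # rows x columns
print(data.dtypes)         # do the types match what the codebook says?
print(data.isna().sum())   # missingness a reader would otherwise have to ask about
```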

Open

It needs to actually be shared publicly. None of this “available upon request” nonsense. There’s no excuse for hiding the data that supports an article’s claims. If the data is not shared, you’re inviting people to assume that you fabricated the results.

What about privacy? Collecting sensitive data doesn’t in any way diminish the likelihood of a calculation error or the incentive to falsify results. For identifiable or sensitive data, put it in a protected-access repository.

Where to post it? 

Continue reading

Reviewing Tip: The 10-Minute Data Check

Solving a Rubik’s Cube takes skill and time. But checking if at least one face is solved correctly is quick and simple. Science should work the same way.

While it’d be ideal to perform a full computational reproducibility check to detect errors in all submitted manuscripts, journals rarely allocate resources towards it. There are notable exceptions, like Meta-Psychology, which has a designated editor rerun the analyses once a manuscript has passed review. The readers, in turn, can be confident that the reported results accurately represent the data. Most journals, however, have no such process, and reviewers and editors rarely have the time to add an entire reproducibility check to their often overburdened reviewing load.

But even without a full reproducibility check, reviewers can still do quick checks for egregious errors. So here are some quick checks a reviewer can do without a major time commitment.

Note: These checks may seem overly simple, but I’ve spotted each of these issues in at least one submission. And, sadly, about 1 in 4 submissions I review fails at least one of these tests.
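
To give a flavor of what a 10-minute check can look like before getting to the list itself, here’s an illustrative sketch (the file, columns, and “reported” values are hypothetical, not taken from any real submission):

```python
# Illustrative quick check: does the shared data even match the numbers printed
# in the manuscript? Everything named here is a hypothetical stand-in.
import pandas as pd

df = pd.read_csv("supplement/data.csv")

# 1. Does the sample size match what the paper reports?
reported_n = 120                        # value copied from the manuscript
print("N matches:", df["subject"].nunique() == reported_n)

# 2. Does a reported descriptive statistic reproduce (within rounding)?
reported_mean = 4.32                    # e.g., "M = 4.32" in the results section
recomputed = df["response"].mean()
print("Mean matches:", round(recomputed, 2) == reported_mean, "| recomputed:", recomputed)

# 3. Are the values even in the range the methods section describes (e.g., a 1-7 scale)?
print("Response range:", df["response"].min(), "to", df["response"].max())
```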

The Checks 

Continue reading

Open Access VIS 2019 – Part 3 – Who’s Who

This is part 3 of a multi-part post summarizing open practices in visualization research for 2019, as displayed on Open Access Vis. Research openness can rely on either policy or individual behavior. In this part, I’ll look at the individuals. Who in the visualization community is consistently sharing the most research? And who is not?

Related posts: 2017 overview, 2018 overview, 2019 part 1 – Updates and Papers, 2019 part 2 – Research Practices

Whose papers are open?

Many authors are sharing most or even all of their papers on open repositories, which is fantastic progress. But many are not, despite encouragement after acceptance. Easier options, better training, and formal policies will likely be necessary for a field-wide change in behavior.

Continue reading

Open Access VIS 2019 – Part 2 – Research Practices

This is part 2 of a multi-part post summarizing open practices in visualization research for 2019. See Open Access Vis for all open research at VIS 2019.

This post describes the sharing of research artifacts, the components of the research process itself, rather than simply the paper. I refer to sharing both these artifacts and the paper as “open research practices”.

Related posts: 2017 overview, 2018 overview, 2019 part 1 – Updates and Papers, 2019 part 3 – Who’s who?

Open research artifacts for 2019

I’ve broken research transparency into 4 types of artifacts and, for each, counted the number of papers that link to it on an open, persistent repository. I’ve given “partial credit” if the artifact is available but not on a persistent repository.
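
For the curious, the tally amounts to something like the sketch below (the input table, artifact names, and column values are my own stand-ins, not the exact script behind the site):

```python
# Rough sketch of an artifact tally with partial credit. One row per paper,
# one column per artifact type; values are "persistent", "non-persistent",
# or "none". All names here are hypothetical stand-ins.
import pandas as pd

papers = pd.read_csv("vis2019_artifacts.csv")
artifact_columns = ["data", "materials", "analysis", "preregistration"]

credit = {"persistent": 1.0, "non-persistent": 0.5, "none": 0.0}
for artifact in artifact_columns:
    scores = papers[artifact].map(credit)
    print(f"{artifact}: full={int((scores == 1.0).sum())}, "
          f"partial={int((scores == 0.5).sum())}, credit={scores.sum():.1f}")
```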

Continue reading

Open Access VIS 2019 – Part 1 – Updates and Papers

The purpose of Open Access Vis is to highlight open access papers and transparent research practices on persistent repositories outside of a paywall. See the about page and my paper on Open Practices in Visualization Research for more details.

Most visualization research is funded by the public, reviewed and edited by volunteers, and formatted by the authors. So for IEEE to charge $33 for each person who wants to read a paper is… well… (I’ll let you fill in the blank). This paywall, as well as the general opacity of research practices and artifacts, is contrary to the supposed public good of research and to the claim that visualization research helps practitioners who are not on a university campus. And this need for accessibility extends to all research artifacts, for both scientific scrutiny and applicability.

This is part 1 of a multi-part post summarizing open practices in visualization research for 2019.
Related posts: 2017 overview, 2018 overview, 2019 part 2 – Research Practices, 2019 part 3 – Who’s who?

Updates for 2019

Continue reading