Guide to user performance evaluation at InfoVis 2016

Previous years: 2013, 2014, 2015

The goal of this guide is to highlight vis papers that demonstrate evidence of a user performance benefit. I used two criteria:

  1. The paper includes an experiment measuring user performance (e.g., accuracy or speed).
  2. The paper includes a statistical analysis of differences to determine whether the results were reliable.

I did not discriminate beyond those two criteria. However, I am using a gold star (shown as "Explanatory" in the list below) to highlight one property that only a few papers have: a generalizable explanation for why the results occurred. You can read more about explanatory hypotheses here.

[Explanatory] The Attraction Effect in Information Visualization – Evanthia Dimara, Anastasia Bezerianos, and Pierre Dragicevic.

[Explanatory] Colorgorical: Creating discriminable and preferable color palettes for information visualization – Connor Gramazio, David Laidlaw, and Karen Schloss.

The Connected Scatterplot for Presenting Paired Time Series – Steve Haroz, Robert Kosara, Steven Franconeri.

[Explanatory] Evaluating the Impact of Binning 2D Scalar Fields – Lace Padilla, P. Samuel Quinan, Miriah Meyer, and Sarah Creem-Regehr.

An Evaluation of Visual Search Support in Maps – Rudolf Netzel, Marcel Hlawatsch, Michael Burch, Sanjeev Balakrishnan, Hansjörg Schmauder, and Daniel Weiskopf.

Immersive Collaborative Analysis of Network Connectivity: CAVE-style or Head-Mounted Display? – Maxime Cordeil, Tim Dwyer, Karsten Klein, Bireswar Laha, Kim Marriott, and Bruce H. Thomas.

Many-to-Many Geographically-Embedded Flow Visualisation: An Evaluation – Yalong Yang, Tim Dwyer, Sarah Goodwin, and Kim Marriott.

Map LineUps: effects of spatial structure on graphical inference – Roger Beecham, Jason Dykes, Wouter Meulemans, Aidan Slingsby, Cagatay Turkay, and Jo Wood.

Optimizing Hierarchical Visualizations with the Minimum Description Length Principle – Rafael Veras and Christopher Collins.

A Study of Layout, Rendering, and Interaction Methods for Immersive Graph Visualization – Oh-Hyun Kwon, Chris Muelder, Kyungwon Lee, Kwan-Liu Ma.

Towards Unambiguous Edge Bundling: Investigating Confluent Drawings for Network Visualization – Benjamin Bach, Nathalie Henry Riche, Christophe Hurter, Kim Marriott, and Tim Dwyer.

VLAT: Development of a Visualization Literacy Assessment Test – Sukwon Lee, Sung-Hee Kim, and Bum Chul Kwon.

A flat trend

27% of InfoVis conference papers measured user performance – a 10% drop from last year. Overall, there has been little change in the past four years.

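Here’s the Agresti-Coull binomial 84% CI for each proportion, so the proportions can be compared. (Two 84% intervals that do not overlap correspond roughly to a difference at p < 0.05.)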
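
For readers who want to compute this kind of interval themselves, here is a minimal Python sketch of the Agresti-Coull formula. It is not the code behind the charts in this post. The function name `agresti_coull_interval` and the counts in the example (27 papers out of 100) are made up for illustration and are not the actual paper counts.

```python
from math import sqrt
from statistics import NormalDist


def agresti_coull_interval(successes, total, confidence=0.84):
    """Return (low, high) for a binomial proportion using the Agresti-Coull method.

    The name and signature are illustrative, not from any library.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= successes <= total:
        raise ValueError("successes must be between 0 and total")

    # Two-sided critical value, e.g. about 1.405 for an 84% interval.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

    # Agresti-Coull adjustment: add z^2 pseudo-observations, half of them successes.
    n_adj = total + z ** 2
    p_adj = (successes + z ** 2 / 2) / n_adj

    half_width = z * sqrt(p_adj * (1 - p_adj) / n_adj)

    # Clamp to [0, 1], since the raw interval can spill past the ends.
    return max(0.0, p_adj - half_width), min(1.0, p_adj + half_width)


# Hypothetical counts, chosen only so the proportion is 27%.
low, high = agresti_coull_interval(27, 100)
print(f"84% CI: {low:.3f} to {high:.3f}")
```

With small totals the interval gets wide quickly, which is why a proportion based on only a handful of articles comes with such a large error ribbon.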


The Journal Articles

Although TVCG appears to have taken a big dive, the total number of articles is so low that it could very well be noise. Note the large error ribbon for TVCG.

In the chart on the right, I collapsed the past four years of data and recomputed the means and CIs. For TVCG, I only included papers presented in the InfoVis track. TVCG has more papers with performance evaluation, and it can’t simply be explained by random noise.

Little Generalization

There is still a very low proportion of papers with an explanatory hypothesis that can inform generalizability. I try to be very generous with this assessment, but very few papers attempt to explain why, or whether, the results apply outside of the specific conditions of the study. Also, there are still a lot of guesses presented as hypotheses.

Obviously, please let me know if you find a mistake or think I missed something. Also, please hassle any authors who didn’t make their pdf publicly available.


3 thoughts on “Guide to user performance evaluation at InfoVis 2016”

  1. Michael

    This is interesting. Probably you might write a BELIV 2018 paper on “Generalizability aspects in user experiments”. But I do not know if there is already a paper going into this direction…


  2. Steve Haroz (post author)

    Thanks Michael. Generalizability is definitely a topic worth writing about. I wrote a short discussion of explanatory hypotheses. I’m considering where I could send a more detailed discussion, but since it would focus specifically on objective measures of time and error, a venue that is “beyond time and error” is explicitly not appropriate.

    By the way, do you have a link to a public PDF of your visual search paper? I’d like to link to it from this post.

  3. Petra Isenberg

    Stay tuned for next Beliv. We’re most probably going to change the title (not the acronym) to be more inclusive to discussions on time and error and other quantitative measures. It’s important that meta discussions can happen in all areas of evaluation.

