Datasets for Benchmarking Web Accessibility Testing Tools paper

Datasets employed in the paper “Benchmarking Web Accessibility Evaluation Tools: Measuring the Harm of Sole Reliance on Automated Tests”

Updated on May 28, 2014: if you are arriving from the blog post “Measuring the harm of flawed academic papers”, this is our response to that post:

As the first author of this terrible paper, I'm not going to rebut every single accusation of “serious flaws” and “obvious bias”; apparently there are too many. The paper speaks for itself: we do an exercise of deductive reasoning, we make some assumptions, we describe our methods and protocols, we don't overstate, and we were very careful in contextualising the outcomes and limitations of the study.

It is not by chance that we share the websites we used and the data we generated; we do it because in Science we seek the replicability of our results by our peers. This way we do not get tangled in banal arguments about personal preferences or prejudices, and we can focus on the Science of the paper. Since the blog post contains serious accusations of academic misconduct, I encourage people to read the paper, replicate the study and draw their own conclusions. Then we are happy to discuss, debate and rectify if necessary; this is how we make progress in Science. We will update this page with the tool versions used in the study to facilitate this.

I'm glad to see that people demand rigour in the web accessibility realm. There are too many papers, guidelines, best practices, blog posts, recommendations and Slideshare presentations that have little empirical foundation, are based on anecdotal evidence or lack a fundamental scientific approach. I hope that this is not an isolated incident and that from now on there will be a boost in the demand for rigour, open data and replicability. I'm more than happy if our paper triggers this ethos.

Finally, the authors of the paper would like to clarify that we don't have any conflict of interest with any tool vendor (in case the author of the blog is trying to cast doubt on our intentions).

Markel Vigo

Important notice: these datasets can only be reused in other studies or experiments if they are properly cited as follows:

Markel Vigo, Justin Brown and Vivienne Conway (2013) Benchmarking Web Accessibility Evaluation Tools: Measuring the Harm of Sole Reliance on Automated Tests. 10th International Cross-Disciplinary Conference on Web Accessibility, W4A 2013. article 1. ACM Press

Get the paper

Get the BibTeX file

Tested websites (hosted on a web server at ECU)

Vision Australia

Australian Prime Minister

Transperth

Datasets in CSV format

Overall number of true positives, false positives and false negatives across tools
Vision Australia home
Vision Australia make a donation
Vision Australia resources
Prime Minister home
Prime Minister videos
Prime Minister contact
Transperth home
Transperth contact us
Transperth timetable results
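
For anyone replicating the study from these files, the true positive, false positive and false negative counts can be turned into the usual precision and recall figures. The following is only a minimal sketch: the column names (`tool`, `tp`, `fp`, `fn`) and the sample rows are hypothetical placeholders, not the actual layout or values of the CSV files above, and the paper defines its own metrics, which may differ from these standard ones.

```python
import csv
import io


def precision_recall(tp, fp, fn):
    """Return (precision, recall, f1) from raw counts, using 0.0 where undefined."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def scores_by_tool(csv_text):
    """Map each tool name to its (precision, recall, f1) tuple.

    Assumes a header row with the hypothetical columns tool, tp, fp, fn.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return {
        row["tool"]: precision_recall(int(row["tp"]), int(row["fp"]), int(row["fn"]))
        for row in reader
    }


# Made-up rows for illustration only; these are NOT values from the study.
SAMPLE = "tool,tp,fp,fn\ntool_a,8,2,12\ntool_b,5,0,15\n"

if __name__ == "__main__":
    for tool, (p, r, f1) in scores_by_tool(SAMPLE).items():
        print(f"{tool}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

To use it with a downloaded file, read the file contents into a string and adjust the column names to match the real header.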