Running accessibility tests with Pa11y

Context

As part of the Drupal 8 redesign of https://drupal.fr, Pa11y was used to perform an accessibility audit (https://github.com/Drupal-FR/site-drupalfr/issues/51#issuecomment-491448105).

Pa11y is a suite of free tools for auditing website accessibility.

I wasn't familiar with this tool. As I am currently improving my Docker stack for Drupal (https://github.com/FlorentTorregrosa/docker-drupal-project), I wanted to add Pa11y to it.

As this solution has nothing to do with Drupal and can be used for any type of site, I created a separate repository, https://github.com/FlorentTorregrosa/docker-pa11y-ci, to provide a Docker image that makes it easy to run the tests. This image can then be integrated into a project's continuous integration pipeline.

The rest of the article describes the process for obtaining this image.

Search for existing solutions

According to the Pa11y website, the Pa11y-ci tool wraps Pa11y for exactly this kind of large-scale, automated use. You can, for example, point it at an XML sitemap to scan every page listed in it, or list all the URLs to be scanned in a configuration file.
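The configuration-file approach can be sketched as follows. This is only a minimal sketch, assuming Pa11y-ci's JSON configuration format with a "urls" list and a "defaults" object. The helper name build_pa11y_ci_config, the example URLs and the timeout value are hypothetical and not taken from the project.

```python
import json


def build_pa11y_ci_config(urls, timeout_ms=30000):
    """Return a dict shaped like a Pa11y-ci configuration file."""
    # Drop duplicates while keeping the original order of the URLs.
    unique_urls = list(dict.fromkeys(urls))
    return {
        # Options under "defaults" apply to every URL in the run.
        "defaults": {"timeout": timeout_ms},
        # Every entry in "urls" is scanned in turn.
        "urls": unique_urls,
    }


if __name__ == "__main__":
    # Hypothetical URLs, used only for illustration.
    config = build_pa11y_ci_config(
        [
            "https://example.com/",
            "https://example.com/contact",
            "https://example.com/",
        ]
    )
    print(json.dumps(config, indent=2))
```

The printed JSON could be saved to a file and passed to Pa11y-ci with its --config option. For the sitemap approach, Pa11y-ci offers a --sitemap option instead of a URL list.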

I then looked to see if any Docker versions existed.

I came across:

  1. https://hub.docker.com/r/digitalist/pa11y-ci, which looked like a perfect match for what I was looking for, but I could not find the source repository this image was built from,
  2. https://github.com/promet/docker-pa11y-ci, which I tested quickly. It worked, but I wasn't happy with the following points:
    1. it uses Google Chrome: I wanted Chromium if possible,
    2. it uses version 1.* of Pa11y-ci: the 2.* branch is the one currently maintained,
    3. it uses the Debian version of the Node image: an Alpine-based image would be much lighter,
    4. the docker-compose.yml file in the repository is geared towards building the image rather than towards simply cloning the repository and using it.

So I forked https://github.com/promet/docker-pa11y-ci and implemented the improvements listed above.

Difficulties encountered

The mistake I made was trying to make all the improvements at the same time:

  • switching to Alpine Linux: most packages don't have exactly the same name as their Debian equivalents, so finding the matches was a manual job, package by package,
  • switching to the 2.* branch of Pa11y-ci, which means moving from version 4.* of Pa11y to 5.*, which in turn relies on Puppeteer https://github.com/GoogleChrome/puppeteer and brings a lot of changes,
  • switching to Chromium: there I got errors saying that the "Chromium revision" was not present.

So I reviewed my approach and decided to proceed step by step, starting from the forked image, which built and worked:

  1. Reorganising the original image in my own way and, in passing, upgrading the base image to Node 12. I also decided to keep the Debian Stretch image so as to have both a Debian-based image and an Alpine-based one, in case maintaining the Alpine image becomes too costly, and to be able to compare the image sizes.
  2. Switching to Pa11y-ci 2.*. At that point it no longer worked because of Puppeteer, which (rightly) adds security through sandboxing, and the sandbox cannot be used as the root user. To solve the problem, I tried:
    1. creating another user, but I then got an error message saying that the user did not have access to a sandbox, and I found no solution that worked,
    2. using the --no-sandbox option to allow running as the root user. It was at this point that I realised that the config.json configuration file, which passes options to Pa11y-ci and through it to Puppeteer, was no longer being taken into account: the "chromeLaunchConfig" entry, previously at the top level of the file, must now be placed under the "defaults" entry.
  3. Reading through the Puppeteer documentation, I found the section dedicated to Docker https://github.com/GoogleChrome/puppeteer/blob/master/docs/troubleshooting.md#running-puppeteer-in-docker and its tip about the PUPPETEER_SKIP_CHROMIUM_DOWNLOAD environment variable. Setting it to true prevents Chromium from being downloaded again through npm or Yarn when Puppeteer is installed, which is useful if Google Chrome or Chromium is already installed from the distribution's packages. And presto, one more improvement.
  4. Switching to Chromium. This time, with the "executablePath" option correctly taken into account, it was simple to make Puppeteer use the Chromium installed by the distribution's package manager.
  5. Once the Debian Stretch image was ready, creating the Alpine image. Here it was simple: I replaced the list of required packages with the one from the Puppeteer documentation dedicated to Alpine https://github.com/GoogleChrome/puppeteer/blob/master/docs/troubleshooting.md#running-on-alpine and added a symbolic link so that the executablePath is the same as on Debian Stretch, and it worked.
  6. Setting up automated builds on Docker Hub so that pushing to the Git repository rebuilds the images.
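
The configuration change described in steps 2 and 4 can be sketched as follows. This is a minimal sketch under stated assumptions, not the actual content of the repository: the function name migrate_chrome_launch_config is hypothetical, and /usr/bin/chromium is an assumed location for the Chromium binary installed by the distribution.

```python
import copy
import json


def migrate_chrome_launch_config(old_config):
    """Move a top-level "chromeLaunchConfig" entry under "defaults"."""
    # Work on a copy so the caller's dict is left untouched.
    new_config = copy.deepcopy(old_config)
    launch_config = new_config.pop("chromeLaunchConfig", None)
    if launch_config is not None:
        defaults = new_config.setdefault("defaults", {})
        # An entry already present under "defaults" takes precedence.
        defaults.setdefault("chromeLaunchConfig", launch_config)
    return new_config


if __name__ == "__main__":
    old_config = {
        "chromeLaunchConfig": {
            # Assumed path; adjust to where the distribution installs Chromium.
            "executablePath": "/usr/bin/chromium",
            # Needed here because the container runs as the root user.
            "args": ["--no-sandbox"],
        },
        # Hypothetical URL, used only for illustration.
        "urls": ["https://example.com/"],
    }
    print(json.dumps(migrate_chrome_launch_config(old_config), indent=2))
```

Run as a script, it prints the same options nested under "defaults", which is where Pa11y-ci 2.* expects them according to the steps above.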

Conclusion

After battling for two days, I now have two Docker images https://hub.docker.com/r/florenttorregrosa/pa11y-ci (Stretch at 600 MB and Alpine at 150 MB) for running automated accessibility tests.

Conveniently, the tool is unrelated to Drupal (for once :) ) and can therefore be used on any site.

One possible improvement would be to do away with the --no-sandbox option in order to have secure execution. That said, I think the risk is almost zero if you're scanning trusted sites, especially as Puppeteer runs in a container and not directly on your machine.

See the Pa11y and Pa11y-ci documentation for all the possible options.
