How to visually compare two PDF files?

I have an application written in Python, which uses the ReportLab package for exporting PDF files.

Of course, the application needs to be tested. Among other tests, the PDF export function needs to be tested to ensure that the visual rendering of PDF files did not unexpectedly change.

Since it is possible to create and save an expected-results PDF file using fabricated test data, the above implies the need to compare two PDF files. It turns out that two PDF files created from the same data at two different dates – are different, due to embedded timestamps.

Hence, the need to compare visual renderings of the PDF files. ImageMagick’s convert knows how to convert PDF files into PNG. However, one needs to set the background and remove the alpha channel.

convert knows also to perform bitwise XOR on two image files, but it must be told how to compute the bitwise XOR. This is documented in StackOverflow: Searching for a way to do Bitwise XOR on images.

The script in  https://gitlab.com/TDDPirate/compare_pdfs implements all the above.

To get more PDF tips and updates, subscribe to my mailing list

Author: Omer Zak

I am deaf since birth. I played with big computers which eat punched cards and spew out printouts since age 12. Ever since they became available, I work and play with desktop size computers which eat keyboard keypresses and spew out display pixels. Among other things, I developed software which helped the deaf in Israel use the telephone network, by means of home computers equipped with modems. Several years later, I developed Hebrew localizations for some cellular phones, which helped the deaf in Israel utilize the cellular phone networks. I am interested in entrepreneurship, Science Fiction and making the world more accessible to people with disabilities.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.