date 2.Jun.2019

■ Compare PDF documents side-by-side using your plain text Diff tool


As a programmer I use windiff (and TSVN's TortoiseMerge to a lesser extent) all the time to track modifications in the xplorer² code base. Such tools compare 2 plain text files and highlight the changes made (lines added/deleted/modified). This helps track down bugs and, more importantly, forces one to review code changes so as to avoid introducing bugs in the first place. Marvellous!

When it comes to more complex office document types like DOC/XLS, Microsoft Office has plenty of tools in the package to track changes and compare document revisions by various authors, either side by side or highlighting the changes in place. So that's sorted too. But what about PDF documents? Obviously I wasn't the first to think of this requirement, so there are plenty of paid and free tools for PDF comparison.

As it turns out, if you have xplorer² and a plain text diff tool (windiff, examdiff, winmerge etc) you can compare your PDFs and spot differences in plain text mode without any extra or special PDF difference tools. The idea is to extract the plain text out of the PDFs, then compare the resulting plain text files. The comparison is certainly "low fidelity" as it doesn't track changes in formatting or images, but it's quite useful, and you can do it with your existing toolset. Here are the steps:

  1. Extract PDF text. Select the 2 PDF documents in xplorer², and use the Edit > Extract text menu command. This will use the PDF text extraction filter and generate 2 plain text files with the text content (minus formatting).

    If using the ribbon UI, you can run the Extract text command through the command finder.


  2. Invoke the diff tool. Call whichever text difference tool you have on the 2 newly created text files to compare them. You can do so from the xplorer² addressbar: first select the TXT files, then run something like:
        > windiff $A

    The special token $A represents the selected TXT files, which are passed as command line arguments to the windiff tool.


  3. Delete TXT files. After the comparison is over, remove the temporary extracted text files.
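To illustrate what the comparison step does with the extracted text, here is a minimal Python sketch that uses the standard difflib module in place of windiff. It is not part of xplorer²; the function name diff_extracted_text and the sample invoice text are made up for the example, and it assumes the PDF text has already been extracted into plain strings.

```python
import difflib
from typing import List


def diff_extracted_text(old_text: str, new_text: str,
                        old_name: str = "old.txt",
                        new_name: str = "new.txt") -> List[str]:
    """Return unified-diff lines between two bodies of extracted text.

    Hypothetical helper: it stands in for the external diff tool and
    only sees plain text, so formatting and images are ignored.
    """
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    return list(difflib.unified_diff(old_lines, new_lines,
                                     fromfile=old_name, tofile=new_name,
                                     lineterm=""))


if __name__ == "__main__":
    # Made-up sample text standing in for two extracted PDF revisions.
    old = "Invoice 1001\nTotal: 250 EUR\nDue in 30 days\n"
    new = "Invoice 1001\nTotal: 275 EUR\nDue in 30 days\n"
    for line in diff_extracted_text(old, new):
        print(line)
```

Lines starting with - come from the first document only and lines starting with + from the second, which is the same kind of "low fidelity" text-only view the steps above produce in a graphical diff tool.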


For your convenience I wrote a macro that automates most of the steps above. You just need to select the 2 PDF files to be compared (assuming they are in the same folder), then execute the macro to see the differences.

If you don't have windiff, you need to edit the macro and add the path of your own diff tool for it to execute.
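The macro itself is not reproduced here, but the diff-and-cleanup part of the workflow (steps 2 and 3) could look roughly like the following Python sketch. Everything in it is an assumption for illustration: the function name run_diff_and_cleanup, the diff_tool parameter (defaulting to windiff, to be swapped for the path of your own tool), and the runner parameter that exists only so the launch can be replaced during testing. It assumes the 2 text files have already been extracted.

```python
import os
import subprocess
from typing import Callable, List, Sequence


def run_diff_and_cleanup(txt_files: Sequence[str],
                         diff_tool: str = "windiff",
                         runner: Callable = subprocess.run) -> List[str]:
    """Launch a diff tool on 2 extracted text files, then delete them.

    Hypothetical helper, not the xplorer² macro. diff_tool is the
    executable name or full path of whichever diff tool is installed.
    """
    if len(txt_files) != 2:
        raise ValueError("expected exactly 2 extracted text files")

    # Same shape as "windiff $A": the tool followed by the selected files.
    command = [diff_tool, *txt_files]
    try:
        # subprocess.run waits for the launched process to exit.
        runner(command)
    finally:
        # Step 3: remove the temporary text files even if the launch failed.
        for path in txt_files:
            if os.path.exists(path):
                os.remove(path)
    return command
```

Calling run_diff_and_cleanup(["report_v1.txt", "report_v2.txt"]) with two existing files would open them in windiff (if it is on the PATH) and delete both files once the tool is closed; the file names here are placeholders.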

So one may claim, not without some justification, that xplorer² is indeed a Jack of all trades <g>


©2002-2019 ZABKAT LTD, all rights reserved | Privacy policy | Sitemap