Barcodes are everywhere. They make it easy to move between paper and electronic workflows. In Adobe Acrobat, PDF barcode fields translate other field values into a visual pattern that can be scanned, decoded, and then used in electronic business processes. Barcodes are particularly useful when data is entered electronically but the form then requires an ink signature and is delivered on paper or [shudder] by fax.

But many developers receive PDF files that have either been scanned or contain barcodes that were not created by Acrobat. ZXing (“zebra crossing”), an open-source, multi-format barcode image processing library, makes decoding barcodes easy. The only trick is getting an image out of the PDF file so that ZXing can decode it. The Gist referenced below shows you how to do exactly that using the Datalogics PDF Java Toolkit. While Java and PDF can both work with multiple types of images and color models, PDF stores images very differently from Java. To simplify moving data between the two, the ImageManager class in the Datalogics PDF Java Toolkit converts images between standard Java representations and those used by PDF.

You start by locating all of the images in the PDF file using the PDFXObjectImageWithLocationMap and then iterating through them.

Once you have a PDFXObjectImage to work with, converting it to a BufferedImage is just a few simple lines of code…

… and from there, decoding the barcode is simply a matter of using the classes available in the ZXing library.
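As a rough illustration of that decoding step, here is a minimal sketch. It assumes ZXing's `core` and `javase` artifacts are on the classpath and that you already have a `BufferedImage` in hand. The class name `BarcodeDecoder` is made up for this example and is not taken from the Gist.

```java
import com.google.zxing.BinaryBitmap;
import com.google.zxing.MultiFormatReader;
import com.google.zxing.NotFoundException;
import com.google.zxing.Result;
import com.google.zxing.client.j2se.BufferedImageLuminanceSource;
import com.google.zxing.common.HybridBinarizer;

import java.awt.image.BufferedImage;
import java.util.Optional;

// Hypothetical helper for illustration; not part of ZXing or the Datalogics toolkit.
public final class BarcodeDecoder {

    private BarcodeDecoder() {
    }

    /**
     * Tries to decode a single barcode of any format ZXing supports.
     * Returns an empty Optional when the image holds no readable barcode.
     */
    public static Optional<Result> decode(BufferedImage image) {
        // Wrap the Java image as a luminance source, then binarize it for the readers.
        BinaryBitmap bitmap = new BinaryBitmap(
                new HybridBinarizer(new BufferedImageLuminanceSource(image)));
        try {
            return Optional.of(new MultiFormatReader().decode(bitmap));
        } catch (NotFoundException e) {
            // ZXing signals "no barcode found" with this exception.
            return Optional.empty();
        }
    }
}
```

A caller could then print whatever was found with `BarcodeDecoder.decode(image).ifPresent(r -> System.out.println(r.getBarcodeFormat() + ": " + r.getText()));`. Returning an `Optional` makes it straightforward to skip images that are not barcodes.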

Each of the three input files contains a different kind of barcode and another image that isn’t a barcode; the ZXing library can tell the difference. The barcode field properties are set to create comma-delimited text and to include the field names, so you’ll see that output when you run the sample. To get started with reading barcodes in PDF files, download this Gist and request an evaluation copy of the Datalogics PDF Java Toolkit.
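If you want to feed the decoded text back into an electronic process, one possible approach is to split it into name/value pairs. The sketch below uses only the Java standard library. It assumes the decoded text has one line of comma-separated field names followed by one line of comma-separated values, with no quoted or embedded commas. That layout is an assumption, so check it against the output of your own forms. The class name `BarcodeFieldParser` and the sample field names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; the two-line layout is an assumption.
public final class BarcodeFieldParser {

    private BarcodeFieldParser() {
    }

    /**
     * Parses decoded barcode text into field name/value pairs, assuming
     * line 1 holds the field names and line 2 holds the matching values.
     */
    public static Map<String, String> parse(String decodedText) {
        // \R matches any line break (\n, \r, or \r\n).
        String[] lines = decodedText.split("\\R", -1);
        if (lines.length < 2) {
            throw new IllegalArgumentException("Expected a names line and a values line");
        }
        // Limit -1 keeps trailing empty values instead of dropping them.
        String[] names = lines[0].split(",", -1);
        String[] values = lines[1].split(",", -1);
        if (names.length != values.length) {
            throw new IllegalArgumentException("Field name and value counts differ");
        }
        // LinkedHashMap preserves the order the fields appear in the barcode.
        Map<String, String> fields = new LinkedHashMap<>();
        for (int i = 0; i < names.length; i++) {
            fields.put(names[i].trim(), values[i].trim());
        }
        return fields;
    }
}
```

For example, `BarcodeFieldParser.parse("Name,City\r\nAda,London")` yields a map with `Name` mapped to `Ada` and `City` mapped to `London`. Data that can contain commas or quotes would need a proper CSV parser instead.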

