PDF Optimization: More Than One Way To Do It

The PDF specification runs to many hundreds of pages, and it often offers several different ways to accomplish very similar things. Sometimes this is the result of enhancements to the format over time: improved methods have come along, but the existing ones have been kept as valid and usable. In other cases, the best way to accomplish a given aim depends on the circumstances. For example, PDF supports many different raster image formats, and each has specific strengths along with the trade-offs made in service of those strengths.
In working with our PDF OPTIMIZER product, we’ve been using a variety of PDFs that we receive over the course of our daily lives. One set in particular has been financial statements: checking account statements, retirement account statements, and the like. Many of these PDFs seem small already. Still, PDF OPTIMIZER has been able to shrink them by up to 40%. I took a look at what was being optimized in one specific case, and I found a tale of two images being compressed.
The first was a JPEG (DCT-encoded) image that PDF OPTIMIZER was able to recompress with a more aggressive Q factor to save significant file space. DCT compression in PDF, as with all JPEGs, is lossy. It shrinks images significantly by intentionally discarding data that human viewers are unlikely to notice and substituting approximations that compress well. This is how JPEG images often end up around an order of magnitude smaller than the same images in lossless (exact) representations such as PNG. DCT compression carries an important trade-off: higher compression comes at the expense of more approximation in the compressed image. This trade-off is controlled by the quality setting, or “Q factor,” used to compress the image. The more aggressive the compression, the greater the chance that the changes become noticeable to users. In the case at hand, PDF OPTIMIZER was able to recompress this image with the same DCT filter but a more aggressive setting, for a savings of more than 75% with minimal degradation of the image’s appearance. Since I was targeting a smaller file, I was willing to accept some loss of image quality. Of course, in other situations a different balance may make more sense.
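To make the quality trade-off concrete, here is a minimal sketch of the idea behind DCT compression, not of PDF OPTIMIZER's internals. It makes several simplifying assumptions: it transforms a single row of eight made-up grayscale samples rather than an 8x8 block, and it uses one uniform quantization step (the `step` parameter, a stand-in for the Q factor) where a real JPEG encoder uses a per-frequency quantization table scaled by the quality setting.

```python
import math

N = 8  # JPEG works on 8x8 blocks; one 8-sample row keeps this sketch short


def _scale(k):
    """Normalization factor that makes the transform orthonormal."""
    return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)


def dct(samples):
    """Orthonormal 1-D DCT-II of N samples."""
    return [
        _scale(k) * sum(s * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                        for n, s in enumerate(samples))
        for k in range(N)
    ]


def idct(coeffs):
    """Inverse of dct(): rebuilds the N samples from their coefficients."""
    return [
        sum(_scale(k) * c * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
            for k, c in enumerate(coeffs))
        for n in range(N)
    ]


def lossy_roundtrip(samples, step):
    """Quantize the DCT coefficients with a uniform step, then reconstruct.

    Returns (number of nonzero quantized coefficients, largest sample error).
    """
    quantized = [round(c / step) for c in dct(samples)]
    restored = idct([q * step for q in quantized])
    nonzero = sum(1 for q in quantized if q != 0)
    max_error = max(abs(a - b) for a, b in zip(samples, restored))
    return nonzero, max_error


row = [52, 55, 61, 66, 70, 61, 64, 73]  # made-up grayscale samples
for step in (1, 10, 40):
    nonzero, max_error = lossy_roundtrip(row, step)
    print(f"step={step:>2}  nonzero coefficients={nonzero}  max error={max_error:.2f}")
```

A larger step can only zero out more coefficients, never fewer, which is what makes the data cheaper to store. The price is a reconstruction that drifts further from the original samples as the step grows.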
In the second case, PDF OPTIMIZER found an image where additional space savings were achievable without any loss of image quality. Here, a bitonal image was stored in the statement with Flate (ZIP) compression. Further optimization was achieved by converting the image from Flate compression to CCITT Group 4 (G4) fax encoding. CCITT encoding is a specialized compression scheme that can squeeze black-and-white images more compactly than the general-purpose Flate scheme, without any loss of quality. While the difference is not huge (the image is about 9% smaller), the optimized image looks exactly the same as the unoptimized one.
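As a rough illustration of why bitonal images suit specialized coding, the sketch below builds a synthetic black-and-white raster, packs it at 1 bit per pixel, and Flate-compresses it with Python's `zlib`. It is not a CCITT encoder, and the image, its dimensions, and the helper names are all made up for the example. It only shows two things: that Flate round-trips the pixels exactly, and that each row consists of a handful of long same-color runs, which is the property fax-style codings are built around.

```python
import zlib

WIDTH, HEIGHT = 256, 64  # hypothetical dimensions for the synthetic image


def make_bitonal():
    """Rows of pixels, 0 = black and 1 = white: a white page with two black bars."""
    rows = []
    for y in range(HEIGHT):
        row = [1] * WIDTH
        if 10 <= y < 20:
            row[32:200] = [0] * 168
        if 40 <= y < 44:
            row[16:240] = [0] * 224
        rows.append(row)
    return rows


def pack_rows(rows):
    """Pack pixels 8 per byte, most significant bit first, each row byte-aligned."""
    data = bytearray()
    for row in rows:
        for i in range(0, len(row), 8):
            chunk = row[i:i + 8]
            byte = 0
            for bit in chunk:
                byte = (byte << 1) | bit
            data.append(byte << (8 - len(chunk)))  # zero-pad a short final chunk
    return bytes(data)


def count_runs(row):
    """Number of consecutive same-color runs in one row."""
    return 1 + sum(1 for a, b in zip(row, row[1:]) if a != b)


rows = make_bitonal()
raw = pack_rows(rows)
flate = zlib.compress(raw, 9)

assert zlib.decompress(flate) == raw  # Flate is lossless: every pixel survives
print(f"packed: {len(raw)} bytes, Flate: {len(flate)} bytes")
print(f"most runs in any row: {max(count_runs(r) for r in rows)}")
```

A real statement image is far less regular than this one, so the sizes printed here say nothing about the 9% figure above.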
Here, some of my more PDF-savvy readers will be wondering why even more aggressive techniques are not being used. PDF OPTIMIZER could indeed go further: the JPEG image could be converted to JPEG2000, and the bitonal image to JBIG2. The result would be a smaller file still, but at the cost of compatibility with viewers that do not properly process JBIG2 or JPEG2000 images. Because these financial statements are intended for a general audience, we assume readers will be using a wide variety of viewers with different levels of PDF support. The optimization profiles provided with PDF OPTIMIZER, which are close approximations of the Adobe Acrobat “Standard” and “Mobile” optimization profiles, are easily editable by users to achieve their desired level of compression and of client compatibility.
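The compatibility decision can be sketched as a simple lookup. The filter names below are the standard PDF stream filter names, and the versions are the PDF versions that introduced them. Everything else is an assumption made for illustration: the two profiles, the preference order, and the function names are hypothetical and do not reflect PDF OPTIMIZER's actual profile format. Using the PDF version as a stand-in for viewer support is also a simplification, since real viewers do not always track the specification neatly.

```python
# Standard PDF image filters and the PDF version that introduced each one.
FILTER_MIN_VERSION = {
    "DCTDecode": (1, 0),       # JPEG
    "CCITTFaxDecode": (1, 0),  # CCITT Group 3 / Group 4 fax
    "FlateDecode": (1, 2),     # zlib/deflate
    "JBIG2Decode": (1, 4),     # JBIG2
    "JPXDecode": (1, 5),       # JPEG2000
}

# Hypothetical preference order per image kind, most compact candidate first.
PREFERENCE = {
    "bitonal": ["JBIG2Decode", "CCITTFaxDecode", "FlateDecode"],
    "photo": ["JPXDecode", "DCTDecode"],
}


def filters_available(pdf_version):
    """Filters a viewer supporting the given (major, minor) PDF version can decode."""
    return {name for name, minimum in FILTER_MIN_VERSION.items()
            if minimum <= pdf_version}


def choose_filter(kind, allowed):
    """Pick the most preferred filter for this image kind that the profile allows."""
    for name in PREFERENCE[kind]:
        if name in allowed:
            return name
    raise ValueError(f"no allowed filter for {kind!r} images")


# Two made-up profiles: one for the widest audience, one for smallest size.
BROAD_PROFILE = filters_available((1, 3))
AGGRESSIVE_PROFILE = filters_available((1, 5))

for label, profile in (("broad", BROAD_PROFILE), ("aggressive", AGGRESSIVE_PROFILE)):
    print(label, choose_filter("photo", profile), choose_filter("bitonal", profile))
```

With the broad profile the sketch lands on the same pair of filters described above (DCT for the photo, CCITT for the bitonal image), and only the aggressive profile reaches for JPEG2000 and JBIG2.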
If you’ve stayed with me all this way, I encourage you to try our automatable, scriptable PDF optimization for yourself. Learn more and request your free trial of PDF OPTIMIZER, then try your own custom optimization settings to achieve the best blend of compatibility and space savings for your users.
