PDF Optimization: When Size Isn’t the Only Thing That Matters

Published December 1, 2024

As a developer, when you hear a request to “optimize” a PDF, it usually means compressing the document as a whole. But reducing file size is not the only option when using PDF editing software.

Optimization can mean reducing image size or resolution, managing color, embedding fonts, flattening transparencies, or even converting the PDF into a different PDF standard or another document type. The right approach depends on what you need from the document. For example, you may need a PDF to download faster, open more quickly in a web browser or PDF viewer, print more efficiently, or be better suited for long-term preservation. With the right tool, you can determine exactly what type of content to preserve or discard, fine-tune image compression options, and optionally convert to a PDF/A or PDF/X standard. Most of these steps shrink the PDF file, but they are different ways of reaching the ideal document.

However, optimizing can also improve the performance of a PDF document without necessarily making that document smaller. The optimization process can discard objects and features within a PDF document that require excess processing time. A complicated PDF document with a lot of features will tend to take more time to load, regardless of its size. You can also choose the amount of compression you want for a document (low, medium, or high), convert colors to grayscale, and downsample images.


Here are the most common ways developers use PDF SDKs to optimize PDFs:

Images 

When a PDF document is created that includes photographs, diagrams, or drawings, the original graphic files, such as a JPEG photograph or a PNG image, become images in that PDF file. Using a PDF SDK, you can compress these color, grayscale, or black-and-white images.

Downsampling Images 

If you have images in a PDF document that you know do not need a high resolution in the output file, you can reduce the resolution of these images, compress them within the file, or both. Either step will reduce the final size of the PDF document. Reducing the resolution is called downsampling. There are several options with downsampling. In a PDF document, you can choose to downsample color images, grayscale images, or monochrome (black-and-white) images, or all three in the same document. Using PDF editing software, you determine the settings and resolution values for each image type. This allows you, for example, to apply one set of values to color images and different values to grayscale and black-and-white images, depending on your requirements.

Downsampling & Recompression 

While downsampling reduces the size of the image directly by reducing the resolution, recompression decompresses the compressed images in a document and then compresses them again based on the profile you set for images. For example, you can enter one recompression setting to change the compression algorithm, such as JPEG or Flate (also known as ZIP), and another setting to change the final image quality. The image quality setting belongs to the compression method: a lossy method such as JPEG has a quality level, while Flate is lossless. You have full control over the process based on your project requirements.
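To make the decompress-then-compress cycle concrete, here is a minimal sketch using Python's standard `zlib` module, which implements the Flate algorithm. It is not the API of any particular PDF SDK: the `recompress_flate` helper and the sample pixel data are assumptions made for illustration.

```python
import zlib


def recompress_flate(stream: bytes, level: int = 9) -> bytes:
    """Decompress a Flate (zlib) stream and compress it again at `level`."""
    raw = zlib.decompress(stream)
    return zlib.compress(raw, level)


# Hypothetical image data: a repetitive 8-bit grayscale pattern.
pixels = bytes(range(256)) * 64

original = zlib.compress(pixels, 1)             # lightly compressed, as stored
recompressed = recompress_flate(original, 9)    # same pixels, highest level

# Flate is lossless, so the pixels survive the round trip unchanged.
assert zlib.decompress(recompressed) == pixels
print(len(pixels), len(original), len(recompressed))
```

Because Flate is lossless, only the stored size can change here. Recompressing with a lossy method such as JPEG would also involve a quality setting and would alter the pixels themselves.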

Image Resolution 

When we refer to the resolution of an image, we generally refer to the number of pixels in that image. This can be expressed in terms of megapixels, or in Dots per Inch (DPI). With an image in a PDF document, the resolution of the image is expressed as a certain number of pixels wide and pixels high. The downsampling process involves changing the width and height of an image in pixels in order to reach a given target resolution.

Keep in mind that the resolution values used with downsampling are distinct from the image quality settings used for image recompression. You can specify a target resolution to use for downsampling images in a document (target-dpi) and a trigger resolution (trigger-dpi).

If you decide to downsample an image type, both the target and the trigger resolution settings must be included in your profile file. The target resolution defines the goal: the resolution that downsampled images are reduced to. If you set a target resolution of 600 DPI, the PDF SDK will downsample qualifying images in the PDF document to 600 DPI; an image that is already at 600 DPI or less is left alone.

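The trigger resolution acts as the threshold. Any image with a resolution greater than the trigger resolution will be downsampled to the target resolution; any image at or below the trigger resolution will be ignored.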
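The interaction between the trigger and target resolutions can be sketched in a few lines of Python. This is an illustration only: the `PlacedImage` class, the function names, and the 150/225 DPI values are assumptions, not part of any SDK. The sketch takes an image's effective resolution to be its pixel count divided by the size it occupies on the page in inches.

```python
from dataclasses import dataclass


@dataclass
class PlacedImage:
    width_px: int
    height_px: int
    width_in: float   # width the image occupies on the page, in inches
    height_in: float  # height the image occupies on the page, in inches


def effective_dpi(img: PlacedImage) -> float:
    """Resolution of the image as placed on the page."""
    return max(img.width_px / img.width_in, img.height_px / img.height_in)


def downsampled_size(img: PlacedImage, target_dpi: float, trigger_dpi: float):
    """Return the new (width, height) in pixels, or the old size if untouched."""
    dpi = effective_dpi(img)
    if dpi <= trigger_dpi:
        # At or below the trigger: leave the image alone.
        return img.width_px, img.height_px
    scale = target_dpi / dpi
    return max(1, round(img.width_px * scale)), max(1, round(img.height_px * scale))


photo = PlacedImage(3000, 2400, 5.0, 4.0)   # 600 DPI as placed
chart = PlacedImage(1000, 800, 5.0, 4.0)    # 200 DPI as placed

print(downsampled_size(photo, target_dpi=150, trigger_dpi=225))  # (750, 600)
print(downsampled_size(chart, target_dpi=150, trigger_dpi=225))  # (1000, 800)
```

The 600 DPI photo exceeds the trigger and is scaled down to the 150 DPI target. The 200 DPI chart is above the target but below the trigger, so it is skipped, which avoids resampling images for only a small gain.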

Fonts 

PDF documents can travel with the fonts they need to render text properly, because font files can be embedded within the PDF itself. That way, no matter what machine is used to open a PDF file, the PDF looks the same, and the viewing tool does not need to look for substitute fonts installed on the local desktop or laptop. These embedded font files, however, can make the PDF larger, sometimes substantially larger, if the document uses a font with a very large character set, such as a Chinese or Japanese font. You can program the workflow to remove the individual font characters or character sets that the document does not use, a technique known as font subsetting, thus reducing the size of the PDF file.
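The idea behind removing unused characters can be sketched as follows. This is a simplified model, not a real font parser: the font is represented as a hypothetical dictionary mapping each Unicode code point to 100 bytes of outline data, and the function names are assumptions made for illustration.

```python
def used_code_points(text: str) -> set[int]:
    """Collect the Unicode code points a document actually draws."""
    return {ord(ch) for ch in text if not ch.isspace()}


def subset_font(glyphs: dict[int, bytes], used: set[int]) -> dict[int, bytes]:
    """Keep only the glyphs whose code points appear in `used`."""
    return {cp: outline for cp, outline in glyphs.items() if cp in used}


# Hypothetical embedded font: one 100-byte outline per CJK Unified Ideograph.
full_font = {cp: b"\x00" * 100 for cp in range(0x4E00, 0xA000)}

document_text = "日本語 文書"
subset = subset_font(full_font, used_code_points(document_text))

full_bytes = sum(len(outline) for outline in full_font.values())
subset_bytes = sum(len(outline) for outline in subset.values())
print(len(full_font), "glyphs ->", len(subset), "glyphs")
print(full_bytes, "bytes ->", subset_bytes, "bytes")
```

A real font file also carries tables for metrics, kerning, and layout, so actual savings differ, but the principle is the same: a document that draws five characters does not need outlines for thousands.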

Transparencies 

It is possible to stack objects, such as graphics, images, text boxes, and form fields, on top of each other on a PDF page. These objects can be partially or fully transparent, and thus can interact in various ways with objects behind them. If a set of transparent objects is stacked in a PDF file, each one contributes to the final result that appears on the page, such as colors blending together into the final color that appears. To make a PDF document simpler, you can flatten these transparencies. The flattening process combines the layers of content on a PDF page, or a stack of transparent images or colors, and renders the result as a single image, blended color, or set of text.

Flattening also applies to interactive content. For example, if a digital signature is flattened, the digital certificate key and related properties are removed from the signature field. The name of the person who signed the document and related information, such as the date and time stamp and the signer’s email address, appear on the page as text, but the signature field is no longer interactive.
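The color blending that transparency flattening resolves can be illustrated with a short sketch. It assumes the simplest case, a stack of flat, partially transparent colors combined with the ordinary "over" operation on an opaque page. Real PDF transparency also supports other blend modes, groups, and soft masks, which this sketch does not attempt to model, and the function names are assumptions.

```python
def blend_over(top_rgb, top_alpha, bottom_rgb):
    """Composite a partially transparent color over an opaque color."""
    return tuple(
        top_alpha * top + (1.0 - top_alpha) * bottom
        for top, bottom in zip(top_rgb, bottom_rgb)
    )


def flatten(layers, background=(255, 255, 255)):
    """Collapse a stack of (rgb, alpha) layers, bottom-most first, to one color."""
    result = tuple(float(channel) for channel in background)
    for rgb, alpha in layers:
        result = blend_over(rgb, alpha, result)
    # Round half up to whole 8-bit channel values.
    return tuple(int(channel + 0.5) for channel in result)


# A 50% transparent red square under a 50% transparent blue square,
# both drawn on a white page.
layers = [((255, 0, 0), 0.5), ((0, 0, 255), 0.5)]
print(flatten(layers))  # (128, 64, 191)
```

After flattening, a viewer no longer needs to evaluate the stack: the overlapping region is stored as the single opaque color the blend produced.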


Objects 

Besides graphic images and font files, a variety of other objects can be saved within a PDF document that can be removed to reduce size and complexity, making the file easier to use. The objects include blocks of JavaScript code, thumbnail images, bookmarks, tags, and alternate graphics images. 

User Data 

It is possible to edit PDF documents using Adobe Acrobat and other viewing and editing tools. For example, when reviewing the content in a PDF document, a user might want to add a comment. It is also possible to attach external files to a PDF document so that each file is saved as part of the PDF, to embed a hyperlink to a web page, and to add metadata or form fields such as text boxes, check boxes, and radio buttons. All of this user data can be removed to reduce PDF file size.


Color Conversion 

We see thousands of different shades of colors with the naked eye, and high-quality digital cameras and scanners can often detect millions of shades. To manage the broad range of colors for producing graphics images in digital content, imaging professionals have developed models to define these colors, called color spaces.

Multitudes of color spaces have been defined. Many depend on hardware devices and are defined by what a camera detects, a printer prints, or a monitor displays. Others, such as Adobe RGB or sRGB (standard RGB), are defined independently of any one device and thus can be used across many different types of devices. In every case, a color space must be defined for a device or software product to make sure that colors remain the same from one device or system to another.

The Standard RGB color space, sRGB, was developed by Microsoft and Hewlett-Packard to describe colors available on most monitors and displays. This color space is also commonly used for web graphics. Adobe Systems’ own Adobe RGB (Red/Green/Blue) color space is designed to hold most of the colors that are likely to be available on a color CMYK (Cyan/Magenta/Yellow/Black) printer. It is considerably larger than Standard RGB.

Color profiles are standards for managing colors, used to ensure that the colors for text or graphics in a file remain the same regardless of the hardware or software used to display, edit, or print that file. Color profiles are based on the specification created by the International Color Consortium (ICC), founded in 1993, to govern color and color management across all operating systems, platforms, and hardware and software systems. A color profile is usually expressed as a file included in the software or driver for an installed printer, scanner, or other hardware device, or in software used to edit a file that is to be displayed or printed. A profile provides a set of data that describes an input or output device. A color profile file can also be embedded in a PDF document.

How does this fit into reducing PDF file size? PDF editing software can reduce PDF size by converting the colors in a document to a color space specified in the workflow before the document is optimized. Converting a color image to grayscale, for example, stores one value per pixel instead of three (RGB) or four (CMYK). Now you just have to narrow down which color profiles to use!
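A small worked example shows why the choice of color space affects size. The sketch below compares the uncompressed size of the same image in CMYK, RGB, and grayscale at 8 bits per channel, and converts an RGB pixel to gray using the widely used Rec. 601 luma weights. The function names are assumptions, and a production workflow would normally convert through ICC profiles rather than a fixed formula.

```python
def raw_size_bytes(width: int, height: int, channels: int, bits: int = 8) -> int:
    """Uncompressed size of an image with the given number of color channels."""
    return width * height * channels * bits // 8


def rgb_to_gray(r: int, g: int, b: int) -> int:
    """Approximate gray level of an RGB pixel using Rec. 601 luma weights."""
    return round(0.299 * r + 0.587 * g + 0.114 * b)


width, height = 1000, 1000
for name, channels in (("CMYK", 4), ("RGB", 3), ("Gray", 1)):
    print(name, raw_size_bytes(width, height, channels), "bytes")

print(rgb_to_gray(255, 0, 0))  # pure red maps to a fairly dark gray, 76
```

Before any compression is applied, the grayscale version of this image needs one third of the bytes of the RGB version and one quarter of the CMYK version.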

General Document Compression  

PDF SDKs also offer general document compression processes to optimize your files. You can program the workflow to compress the entire PDF document or parts of its content, remove redundant content, select a compression method to use, and make other changes designed to help a PDF document open more quickly. These options affect the PDF document as a whole, rather than individual features like images, fonts, or bookmarks.
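One form of removing redundant content is merging objects that are byte-for-byte identical, such as a logo embedded once per page. The following is a minimal sketch of that idea, assuming a document modeled as a hypothetical dictionary of object IDs to raw bytes. It does not reflect how any specific SDK stores or rewrites objects.

```python
import hashlib


def deduplicate(objects: dict[int, bytes]):
    """Return (kept objects, remap of every old ID to the ID that was kept)."""
    first_seen: dict[str, int] = {}
    kept: dict[int, bytes] = {}
    remap: dict[int, int] = {}
    for obj_id, data in objects.items():
        digest = hashlib.sha256(data).hexdigest()
        if digest in first_seen:
            # Identical content already stored: point references at that copy.
            remap[obj_id] = first_seen[digest]
        else:
            first_seen[digest] = obj_id
            kept[obj_id] = data
            remap[obj_id] = obj_id
    return kept, remap


# Hypothetical document: the same logo stream is stored as objects 1 and 3.
objects = {1: b"logo-image-data", 2: b"page-content", 3: b"logo-image-data"}
kept, remap = deduplicate(objects)
print(sorted(kept))  # [1, 2]
print(remap)         # {1: 1, 2: 2, 3: 1}
```

After this pass, every reference to object 3 would be rewritten to point at object 1, so the duplicate stream is stored only once.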


Document Conversion 

We cover document conversion and PDF file types thoroughly in other articles, but two document types that are good for optimizing purposes are also covered here. 

PDF/A Conversion

PDF/A, the archival version of PDF, is an ISO-standard version of the PDF format that is designed to be stored so that it can be accessed for many years to come. PDF/A documents must be able to be opened and read using viewing tools available in the future, so they are designed to be self-contained. For example, all of the fonts used in a PDF/A document must be embedded within the PDF document itself for the file to be considered PDF/A compliant. This can be programmed in the conversion workflow.


PDF/X Conversion 

PDF/X is an ISO-standard subset of the PDF format that applies to printing workflows. As a file type that supports graphics-heavy documents, PDF/X specifies print requirements that do not apply to standard PDFs or other PDF types. These include requirements on fonts, colors, file identification, and bleed.

PDF/X files provide users with higher quality print output with fewer errors due to the restrictions applied to each document.

