PDF/A has always been an important part of document management, and the Adobe PDF Library now supports creating PDF/A documents that adhere to Part 2 and Part 3 of the standard, in addition to Part 1. This means you can create a Part 1, Part 2, or Part 3 PDF/A-compliant document.
Specifically, we support Levels B and U. As a result, users can create ZUGFeRD 2.0 compliant documents, which are based on Part 3 of the PDF/A standard. This update also includes fixes and improvements to PDF/A conversion in general.
In case you’re not familiar with it already, PDF/A itself is a long-term archival standard (hence the ‘A’) for preservation of documents. The underlying theme of this standard is that a document is self-contained, carrying all of the resources it needs to display its contents, so it can’t rely on external resources for its PDF content. What you get is a consistent, expected presentation of a document: even 100 years from now, when software has changed dramatically, you can count on your document being viewed in a predictable way. The specification for Part 1 was released about 15 years ago, but adoption was fairly slow in those first few years. In recent years, however, adoption has become widespread. Acceptance has been most prevalent in the European Union, so much so that many governments and municipalities now require it in place of the regular PDF format.
A simple Google search will reveal that law firms, government agencies, and court systems have dedicated instructions for how to convert PDF documents to be PDF/A compliant. Instructions typically walk users through the steps in Adobe Acrobat to do the conversion. For a one-off exercise or small-scale conversions this certainly works, but these manual steps are not practical for bulk processing of hundreds, thousands, or potentially millions of documents. As a company, you don’t want to hire people to press buttons and click through the conversion process when you can design software to do this automatically.
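To make the bulk-processing point concrete, here is a minimal Python sketch of a batch driver. It is an illustration under stated assumptions, not Datalogics sample code: the names `find_pdfs` and `convert_tree` are hypothetical, and `convert` is a placeholder for whatever PDF/A conversion call your toolkit actually provides.

```python
from pathlib import Path
from typing import Callable, List


def find_pdfs(root: Path) -> List[Path]:
    """Return every .pdf file under root (case-insensitive), in sorted order."""
    return sorted(p for p in root.rglob("*")
                  if p.is_file() and p.suffix.lower() == ".pdf")


def convert_tree(root: Path, out_dir: Path,
                 convert: Callable[[Path, Path], None]) -> dict:
    """Run convert(src, dst) on each PDF under root, mirroring the folder layout.

    `convert` is a hypothetical stand-in for a real PDF/A conversion call;
    it is expected to raise an exception when a file cannot be converted.
    Returns {"converted": [paths], "failed": [(path, error message), ...]}.
    """
    report = {"converted": [], "failed": []}
    for src in find_pdfs(root):
        dst = out_dir / src.relative_to(root)
        dst.parent.mkdir(parents=True, exist_ok=True)
        try:
            convert(src, dst)
            report["converted"].append(src)
        except Exception as exc:
            # Record the failure and keep going so one bad file
            # does not stop the whole batch.
            report["failed"].append((src, str(exc)))
    return report
```

Collecting failures in a report rather than stopping at the first error matters at this scale, since a handful of problem files in a large batch can then be reviewed separately.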
You may be wondering what’s behind the evolution of the PDF/A standard. Part 1 was based on PDF v1.4, which was already an older version at the time but was widely supported among PDF vendors. Transparency is not allowed in Part 1, and neither are features introduced after v1.4; it’s believed this contributed to the standard’s slow adoption. Part 1 also specifically prohibits attachments, in the ‘spirit’ of being an archival standard, so that the PDF does not depend on external software being used to open the attachment. But many real-world users found this made it impractical for documents that needed associated files in order for the document to make sense and be useful.
Parts 2 and 3 are based on v1.7 of the PDF standard, so features that were not allowed in Part 1 are now legal in Parts 2 and 3, such as JPEG2000 compression, attachments, transparency, and more. A new level of compliance was also added, known as Level U.
As a primer on the different levels, we’ll start with Level A. The ‘A’ stands for ‘Accessibility’ or ‘All,’ and a Level A document meets all requirements of the standard. This includes the accessibility requirements, which call for structure information (tagging). However, a non-structured PDF can’t be converted to have structure information automatically. This has led to confusion among users with little background in structural information and is another suspected reason for the slow adoption of the standard.
Level B stands for ‘Basic’ (Visual) support and only includes requirements for reliable visual reproduction of the document. This has been the most popular choice among PDF/A users. Parts 2 and 3 introduce a new Level U, which stands for ‘Unicode.’ This level is similar to ‘A’ but doesn’t include logical structure information. It requires Unicode equivalents of text to be present and was designed to get past the difficulties of achieving Level A compliance while including more than just the visual representation that you get with Level B.
For Part 2, there is an additional requirement that all attachments must be PDF/A compliant (Part 1 or 2). For Part 3, any type of attachment is allowed, as long as a relationship between the attachment and document content is specified. This loosened requirement for Part 3 has not been without controversy, as it tends to go against the original spirit of being a completely self-contained document that doesn’t rely on anything external. However, it was driven by the real-world desire to include important non-PDF associated files and maintain the originating data formats behind certain PDF documents.
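The part and level rules described above can be condensed into a small lookup. The Python sketch below is only an illustration of the rules as stated in this article, with hypothetical function names; it is not a validator and does not check real files.

```python
# Conformance levels available in each part, as described above:
# Level U exists only in Parts 2 and 3.
LEVELS_BY_PART = {1: {"A", "B"}, 2: {"A", "B", "U"}, 3: {"A", "B", "U"}}


def is_valid_flavor(part: int, level: str) -> bool:
    """Return True if (part, level), e.g. (3, "U"), names an existing PDF/A flavor."""
    return level.upper() in LEVELS_BY_PART.get(part, set())


def attachment_allowed(part: int, attachment_is_pdfa_1_or_2: bool,
                       relationship_specified: bool) -> bool:
    """Apply the per-part attachment rule from the text.

    Part 1: no attachments at all.
    Part 2: only attachments that are themselves PDF/A (Part 1 or 2).
    Part 3: any file type, provided its relationship to the
            document content is specified.
    """
    if part == 1:
        return False
    if part == 2:
        return attachment_is_pdfa_1_or_2
    if part == 3:
        return relationship_specified
    raise ValueError(f"unknown PDF/A part: {part}")
```

For example, an XML invoice is not a PDF/A file, so `attachment_allowed(2, False, True)` is false while `attachment_allowed(3, False, True)` is true, which is why ZUGFeRD builds on Part 3.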
ZUGFeRD (a play on ‘Zugpferd,’ the German word for ‘draft horse’) is a new invoice standard based on PDF/A-3 plus XML data. It’s similarly experiencing its own surge in interest, and there is a big push for governments to use it. In Germany, there will soon be expanded requirements for documents to comply with this standard. This interest is expected to expand to other markets, including the United States. At Datalogics, we know the ability to convert a PDF to a ZUGFeRD document is highly desired, which is why we added dedicated C++, C#, and Java samples to the PDF Library to demonstrate its usage. Our sample program illustrates how to convert a PDF document to PDF/A-3, how to add the ZUGFeRD XML invoice as an attachment to the document, and how to add the metadata entries unique to ZUGFeRD and the required extension schema, which are not part of the PDF/A-3 standard itself.
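The ZUGFeRD-specific metadata entries sit alongside the standard PDF/A identification entries in a document's XMP metadata, where a file declares its part and level through the `pdfaid:part` and `pdfaid:conformance` properties. The Python sketch below builds only that standard identification fragment using the standard library. It is not the Datalogics sample, the function name is hypothetical, and ZUGFeRD's own properties and extension schema are deliberately left out rather than guessed at.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
PDFAID_NS = "http://www.aiim.org/pdfa/ns/id/"  # PDF/A identification schema


def pdfa_identification(part: int, conformance: str) -> str:
    """Return an rdf:Description element declaring the PDF/A part and level.

    Hypothetical helper for illustration; a real XMP packet wraps this
    in rdf:RDF and x:xmpmeta elements and carries other properties too.
    """
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("pdfaid", PDFAID_NS)
    desc = ET.Element(f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": ""})
    ET.SubElement(desc, f"{{{PDFAID_NS}}}part").text = str(part)
    ET.SubElement(desc, f"{{{PDFAID_NS}}}conformance").text = conformance.upper()
    return ET.tostring(desc, encoding="unicode")
```

Calling `pdfa_identification(3, "B")` yields the fragment for a PDF/A-3B document, the kind of base document a ZUGFeRD invoice starts from.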
Below is a chart comparing the PDF/A file types to help you better understand the differences between them:
With all of these changes, you now have much more flexible PDF/A conversion options for all of your PDF document conversion needs. We invite you to download the latest version of the Adobe PDF Library, which includes extended PDF/A support and support for creating ZUGFeRD documents!