There are many reasons you may want to control access to a PDF file. Perhaps the PDF is intended only for internal use within an organization because it contains proprietary information. A PDF may be licensed so that only a specified number of users may have access concurrently. Legal requirements or court rulings might require controlling access by country or region. This article does not discuss the arguments for or against access control; I merely give an introduction to the different access control mechanisms available in PDF files.
Broadly speaking, there are three ways to control access to a PDF: data permissions, passwords, and closed workflows. Each mechanism has specific benefits and drawbacks.
The most basic form of PDF access control comes through the permissions flags built into the PDF file format. These permissions signal to a viewer or processor that a user should be restricted in what they are allowed to do with the PDF file. For example, they can signal that a user should:
- Not be allowed to print a PDF
- Not be allowed to copy content from a PDF
- Not be allowed to convert a PDF to a different format
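As a rough illustration of how these flags are represented, the sketch below decodes the integer permissions value stored in a PDF's encryption dictionary. The bit positions follow my reading of the permissions table for the standard security handler in the PDF specification; the names `PERMISSION_BITS` and `decode_permissions` are hypothetical, not part of any library.

```python
# Bit masks for the permissions integer (bit 1 is the lowest-order bit).
# Positions are taken from the PDF specification's permissions table as an
# assumption; the dictionary and function names are illustrative only.
PERMISSION_BITS = {
    "PRINT": 1 << 2,                       # bit 3
    "MODIFY": 1 << 3,                      # bit 4
    "COPY": 1 << 4,                        # bit 5
    "ANNOTATE": 1 << 5,                    # bit 6
    "FILL_FORMS": 1 << 8,                  # bit 9
    "EXTRACT_FOR_ACCESSIBILITY": 1 << 9,   # bit 10
    "ASSEMBLE": 1 << 10,                   # bit 11
    "PRINT_HIGH_QUALITY": 1 << 11,         # bit 12
}


def decode_permissions(p):
    """Return the set of operations that the permissions value allows."""
    # The value is stored as a signed 32-bit integer, so view it as unsigned.
    bits = p & 0xFFFFFFFF
    return {name for name, mask in PERMISSION_BITS.items() if bits & mask}


if __name__ == "__main__":
    print(sorted(decode_permissions(-3904)))  # a value with every listed bit cleared
    print(sorted(decode_permissions(-4)))     # a value with every listed bit set
```

Note that nothing in this decoding enforces anything: a viewer reads the flags and then decides for itself whether to honor them.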
Data permissions are simple to apply and understand, but they come with major drawbacks:
Unreliable support: many PDF viewers and libraries honor data permissions restrictions, but not all. While Adobe owned the PDF specification, honoring these restrictions was a condition of the specification’s license, so PDF software was required to comply. Once Adobe donated the specification to ISO, that requirement became unenforceable and was reduced to a recommendation for PDF viewers and processors.
Excessive restriction: many PDF writers requested access restrictions that blocked read-aloud and text-to-speech (TTS) applications. While the intent was usually to prevent purchasers of PDF content in “visual” form from re-purposing the content as audiobooks, this had the serious consequence of preventing sight-impaired users from interacting with these PDFs via screen-reading and other assistive software. This proved so problematic that the ability to request this restriction was removed from PDF 2.0.
All in all, data permissions are useful to include in a PDF if needed, but they are not reliable across the PDF ecosystem for access control. It has always been trivial to bypass or ignore these permissions bits. They work essentially on the honor system: they are suited to stating the intentions of the author or publisher of a PDF. They are not suitable as the sole means of access control or as a security mechanism.
Passwords for PDF access control come in a wide variety of strengths and levels of complexity. Passwords are an example of a bearer token security approach: something a user needs to have in order to access a PDF file. Password security has been available in PDF for over twenty years. Best practices in encryption and hash algorithm choices and in key length recommendations have advanced along with developments in the broader computer security field, and the password protection capabilities of the PDF file format have advanced as new PDF format revisions have been published. Today, PDF offers both strong security and backwards compatibility. However, there are some tradeoffs:
Non-revocability: once access is granted to someone (by providing the password for a protected PDF), access cannot be revoked without re-publishing the PDF with a different password. Because users can keep local copies of a PDF, re-publishing only blocks access to the new versions. Once a user has a PDF and its password, there’s no way to remove their access to that copy.
Lack of user verification: once a password has been given to a user, there is little that prevents that user from sharing it with others. Restricting access to specific users becomes very difficult. Some systems work around this by protecting PDFs with a password that a user will not want to share, such as a credit card number. Even this is easily defeated when users supply identifiers that have little significance to them, such as pre-paid credit cards obtained for the purpose.
Security vs. backwards compatibility: PDF security capabilities have been continually extended and refined across PDF format versions. However, once a PDF specification is published, the security mechanisms it specifies can no longer be changed. Maximum compatibility with viewers and programs that support only older PDF revisions can therefore require compromises in the choice of encryption and hash algorithms. PDF files that take advantage of advances in PDF password security can end up unopenable or unusable in workflows, or for users, whose PDF support is frozen at an earlier revision.
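To make the compatibility tradeoff concrete, here is a minimal sketch that picks the strongest password encryption scheme the oldest viewer in a workflow can open. The version-to-algorithm table reflects my understanding of when each scheme entered the PDF format and should be checked against the specification; `ALGORITHMS` and `strongest_algorithm` are hypothetical names.

```python
# (minimum PDF version, scheme) pairs ordered from weakest to strongest.
# The versions are an assumption based on my reading of the PDF revisions;
# AES-256 is listed under PDF 2.0, where it was standardized.
ALGORITHMS = [
    ((1, 1), "RC4-40"),
    ((1, 4), "RC4-128"),
    ((1, 6), "AES-128"),
    ((2, 0), "AES-256"),
]


def strongest_algorithm(oldest_supported_version):
    """Return the strongest scheme usable by a viewer limited to the given
    (major, minor) PDF version, or None if no encryption is available."""
    chosen = None
    for min_version, name in ALGORITHMS:
        # Tuples compare element by element, so (1, 7) >= (1, 6) is True.
        if oldest_supported_version >= min_version:
            chosen = name
    return chosen


if __name__ == "__main__":
    for version in [(1, 3), (1, 4), (1, 7), (2, 0)]:
        print(version, strongest_algorithm(version))
```

The design point is that the weakest viewer in the audience sets the ceiling: a workflow that must support a PDF 1.4 viewer is limited to that era's scheme no matter what newer revisions offer.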
Public Key Security
Public key security handlers extend the password mechanism by creating and encrypting a password via a public/private key pair for a specific user, and placing this encrypted password within a PDF file. A user with a suitable private key may decrypt the password and use this to open a secured PDF. Multiple users may be specified, with each having their own password and access rights.
This is a more secure system than bearer token passwords because it relies on something a user is – their identity – rather than something a user has, e.g. a password. However, it suffers from the same non-revocability problem as PDF password schemes generally. Additionally, support for public key security handlers is limited across the PDF viewer and processor ecosystem.
A closed PDF workflow trades interoperability for access control capability. By restricting the use of PDFs to specific viewers and tools, closed PDF workflows can extend access control of digital documents along several important dimensions. In many situations, using specific software to create or encode PDFs and to view or work with them can actually increase reliability and productivity, because it removes many common problems caused by the varying levels of PDF format support across the broad PDF ecosystem.
Examples of closed workflows include:
- Client/server access control systems such as those provided by Adobe Experience Manager, FileOpen or Datalogics READynamic™. These systems include a server that grants access to a PDF for a specific time period to a specific person or people, and a client that periodically contacts the server to check for revocation or renewal of access. In some cases, these clients are plugins for, or are built into, Adobe Reader. In other cases, they are separate programs.
- Digital rights management (DRM) systems such as Sony DADC URMS and Adobe ADEPT. These systems store and transmit encoded PDF files to users, and separately transmit decoding instructions for use in secure, closed client reading environments, along with entitlement information such as the duration of entitlement and any restrictions on printing or text copying. Files are always stored on disk in protected form, which helps keep them secure from piracy.
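The client/server pattern described above can be sketched as a small in-memory model: the server records time-limited grants, and the client checks in to learn whether access is still valid. This is an illustration only and does not reflect how any of the named products work; `AccessServer` and its methods are hypothetical, and timestamps are passed in explicitly to keep the example deterministic.

```python
from dataclasses import dataclass, field


@dataclass
class AccessServer:
    """In-memory stand-in for the server side of a closed workflow."""

    # Maps (user, document) to the timestamp at which access expires.
    grants: dict = field(default_factory=dict)

    def grant(self, user, doc, now, duration):
        """Give a user access to a document for `duration` seconds."""
        self.grants[(user, doc)] = now + duration

    def revoke(self, user, doc):
        """Withdraw access immediately, before the grant would expire."""
        self.grants.pop((user, doc), None)

    def check_in(self, user, doc, now):
        """Called periodically by the client; True while access is valid."""
        expiry = self.grants.get((user, doc))
        return expiry is not None and now < expiry


if __name__ == "__main__":
    server = AccessServer()
    server.grant("alice", "report.pdf", now=0, duration=3600)
    print(server.check_in("alice", "report.pdf", now=10))
    server.revoke("alice", "report.pdf")
    print(server.check_in("alice", "report.pdf", now=20))
```

Unlike a password, the grant lives on the server, so revocation takes effect at the client's next check-in rather than requiring the PDF to be re-published.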
Closed workflows bring the following advantages over other access control mechanisms:
Greater capabilities: closed systems bring the most flexibility in access control. PDF access may be granted to individuals, to a specified number of concurrent users, to specific groups, or to other combinations of users. In some systems, access may be revoked automatically when a user should no longer have it. In other systems, content may be lent, gifted or sold without compromising its security. Different systems offer different types and levels of flexibility that can be tailored to specific usage needs.
Greater reliability: closed systems can cast aside many compatibility and reliability issues that can arise when working with PDFs across the broad PDF tooling ecosystem. These systems can focus on maintaining maximum reliability, better performance, and a user experience that is more tailored to the needs of specific users.
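One of the capabilities mentioned above, limiting a document to a fixed number of concurrent users, can be sketched as a simple seat pool. This is a minimal model under the assumption that a closed viewer reports when a document is opened and closed; `SeatPool` and its methods are hypothetical names.

```python
class SeatPool:
    """Tracks which users currently hold one of a limited number of seats."""

    def __init__(self, max_seats):
        self.max_seats = max_seats
        self.active = set()

    def open_document(self, user):
        """Return True if the user may open the document right now."""
        if user in self.active:
            return True  # the user already holds a seat
        if len(self.active) >= self.max_seats:
            return False  # every seat is taken
        self.active.add(user)
        return True

    def close_document(self, user):
        """Release the user's seat so someone else can take it."""
        self.active.discard(user)


if __name__ == "__main__":
    pool = SeatPool(max_seats=2)
    print(pool.open_document("alice"))
    print(pool.open_document("bob"))
    print(pool.open_document("carol"))
    pool.close_document("alice")
    print(pool.open_document("carol"))
```

A real system would also need to reclaim seats from clients that stop responding, which is one reason such workflows rely on periodic check-ins.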
There is one noteworthy drawback, of course:
Compatibility: closed workflows restrict compatibility to specific, defined workflows and toolsets. In some cases, viewers or other tools may not exist on all the platforms that users want or need. In other situations, there may be restrictions on offline PDF usage or a requirement to check in with a server periodically for continued usage permission. In some cases, these restrictions may disqualify a closed workflow from consideration.
There are various benefits and drawbacks to the different PDF access control mechanisms. Increased security and protection of content come at the cost of greater complexity and reduced interoperability. Striking the right balance between control and convenience depends heavily on your needs, your users’ needs and the consequences of escaped content.