Redaction using the Datalogics PDF Java Toolkit

Sample of the Week:

Joel Geraci

“Redaction” is a legal term of art that means to obfuscate parts of a document. In legal proceedings, relevant documents must be disclosed between litigants. However, some documents, or even parts of documents, contain references (names, numbers, or other information) that are not subject to disclosure. Trade secrets, social security numbers of non-relevant individuals, the names of minors, and some confidential and non-relevant medical information are all commonly redacted from evidentiary documents.
Redacting paper documents is pretty straightforward; you grab a Sharpie and start crossing out text. When you’re done, you photocopy the paper and you’re good to go.
Redacting electronic documents can be just as easy… if you’re using the right tools… and you use them correctly.
Code Snippet:

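// Excerpt from RedactionSample: reader, inDoc, writer, and outputFilePath
// are declared elsewhere in the full sample.
PDFOpenOptions openOptions = PDFOpenOptions.newInstance();
// Set the font set in the open options
openOptions.setFontSet(SampleFontLoaderUtil.loadSampleFontSet());
// Open the input document, which already contains the redaction marks
inDoc = PDFDocument.newInstance(reader, openOptions);
writer = SampleFileServices.getRAFByteWriter(outputFilePath);
// Create a default set of redaction options
RedactionOptions redactionOptions = new RedactionOptions(null);
// Apply the redaction marks, sending the result to the writer
RedactionService.applyRedaction(inDoc, redactionOptions, writer);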

The Datalogics PDF Java Toolkit applies redaction to PDF files in the same way that Adobe Acrobat and Adobe LiveCycle do. The author of the sample input file used Acrobat’s “Search and Redact” feature to search for and redact the word “Collection”. In one case, the word “Collection” appears on a curve, something that can be a challenge for any search engine; Acrobat was able to find it and add the appropriate redaction marks… on the curve.
The Datalogics PDF Java Toolkit sample “RedactionSample” shows you how to create a default set of RedactionOptions, which control certain aspects of the redaction process. Because the consequences of disclosing confidential or privileged information to opposing counsel can be devastating, the default RedactionOptions were designed to simply “do the right thing”, meaning “the same thing as Acrobat”. After creating the options, you just apply those redaction marks using the applyRedaction method of the RedactionService class. It couldn’t be easier.
By leveraging the Datalogics PDF Java Toolkit, developers can create workflows that integrate the redaction process into their other server-based document workflows. Because redaction is “lossy” (information is permanently removed from the file), it is an activity that can benefit from automated processing. For example, Acrobat can be used to add the redaction marks and codes to a PDF file, which is then checked into a repository that initiates a workflow. The redaction marks are verified by another user familiar with the case. Once the file is approved for redaction, the process archives a copy of the original and a copy with the redaction marks, and then creates a new redacted copy for distribution. Finally, you can use the Adobe PDF Java Toolkit to convert the resulting redacted file to PDF/A for submission to the courts… but that’s another sample for another time.
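The archive-then-redact part of that workflow can be sketched in plain Java. This is only a rough sketch under stated assumptions: the class name, the Redactor interface, the approved flag, and the file-name prefixes are all hypothetical and are not part of the Toolkit. In a real workflow, the Redactor would presumably wrap a call like RedactionService.applyRedaction from the snippet above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical sketch of the archive-and-redact step; not Toolkit code.
final class RedactionWorkflowSketch {

    // Hypothetical hook for the actual redaction engine. A real
    // implementation might open markedInput as a PDFDocument and call
    // RedactionService.applyRedaction with a writer for redactedOutput.
    @FunctionalInterface
    public interface Redactor {
        void redact(Path markedInput, Path redactedOutput) throws IOException;
    }

    // Archives the original and the marked-up copy, then produces the
    // redacted copy for distribution. Returns the path of the redacted file.
    public static Path process(Path original, Path marked, boolean approved,
                               Path archiveDir, Path distributionDir,
                               Redactor redactor) throws IOException {
        // Redaction is permanent, so refuse to run without reviewer approval.
        if (!approved) {
            throw new IllegalStateException("Redaction marks have not been approved");
        }
        Files.createDirectories(archiveDir);
        Files.createDirectories(distributionDir);

        // Keep untouched copies of both inputs before anything is removed.
        Files.copy(original,
                archiveDir.resolve("original-" + original.getFileName()),
                StandardCopyOption.REPLACE_EXISTING);
        Files.copy(marked,
                archiveDir.resolve("marked-" + marked.getFileName()),
                StandardCopyOption.REPLACE_EXISTING);

        // Only the redacted copy ever reaches the distribution directory.
        Path redacted = distributionDir.resolve("redacted-" + marked.getFileName());
        redactor.redact(marked, redacted);
        return redacted;
    }
}
```

Keeping the redaction engine behind a small interface means the approval check and archiving logic can be exercised with a stand-in Redactor before any real case files are involved.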
View and download the “RedactionSample” sample, or get all the samples and documentation by requesting an evaluation of the Datalogics PDF Java Toolkit.
