Redaction using the Datalogics PDF Java Toolkit

Sample of the Week:

Joel Geraci

“Redaction” is a legal term of art that means to obscure parts of a document. In legal proceedings, relevant documents must be disclosed between litigants. However, some documents, or even parts of documents, contain references (names, numbers, or other information) that are not subject to disclosure. Trade secrets, social security numbers of non-relevant individuals, the names of minors, and some confidential and non-relevant medical information are all commonly redacted from evidentiary documents.


Redacting paper documents is pretty straightforward; you grab a Sharpie and start crossing out text. When you’re done, you photocopy the paper and you’re good to go.
Redacting electronic documents can be just as easy… if you’re using the right tools… and you use them correctly.

Code Snippet:

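// Note: reader, inDoc, writer and outputFilePath are declared outside this snippet.
PDFOpenOptions openOptions = PDFOpenOptions.newInstance();
// Open the input PDF (font set configuration on openOptions is not shown in this snippet)
inDoc = PDFDocument.newInstance(reader, openOptions);
// Writer for the redacted output file
writer = SampleFileServices.getRAFByteWriter(outputFilePath);
// Default redaction options
RedactionOptions redactionOptions = new RedactionOptions(null);
// Apply the redaction marks already present in the document and write the result
RedactionService.applyRedaction(inDoc, redactionOptions, writer);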

The Datalogics PDF Java Toolkit applies redaction to PDF files in the same way that Adobe Acrobat and Adobe LiveCycle do. The author of the sample input file used Acrobat’s “Search and Redact” feature to search for and redact the word “Collection”. In one case, the word “Collection” appears on a curve, something that can be a challenge for any search engine; Acrobat was able to find it and add the appropriate redaction marks… on the curve.

The Datalogics PDF Java Toolkit sample “RedactionSample” shows you how to create a default set of RedactionOptions, which control certain aspects of the redaction process. Because the consequences of disclosing confidential or privileged information to opposing counsel can be devastating, the default RedactionOptions were designed to simply “do the right thing”, meaning “the same thing as Acrobat”. After creating the options, you apply the redaction marks already in the document using the applyRedaction method of the RedactionService class. It couldn’t be easier.

By leveraging the Datalogics PDF Java Toolkit, developers can create workflows that integrate the redaction process into their other server-based document workflows. Because redaction is “lossy” (information is permanently removed from the file), it is an activity that can benefit from automated processing. For example, a user adds the redaction marks and codes to a PDF file in Acrobat, then checks the file into a repository, which initiates a workflow. The redaction marks are verified by another user familiar with the case. Once the file is approved for redaction, the process archives a copy of the original and a copy with the redaction marks, and then creates a new redacted copy for distribution. Finally, you can use the Adobe PDF Java Toolkit to convert the resulting redacted file to PDF/A for submission to the courts… but that’s another sample for another time.
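
To make the workflow above a little more concrete, here is a minimal sketch of the “approve, archive, then redact” step in plain Java. Everything in it is an assumption made for illustration: the RedactionWorkflow class, the Redactor interface, the file naming and the directory layout are all hypothetical and are not part of the toolkit or of RedactionSample. In a real workflow, the Redactor would wrap a call such as RedactionService.applyRedaction; here it is a placeholder so the sketch runs on its own.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical workflow step; none of these names come from the toolkit.
final class RedactionWorkflow {

    /** Hypothetical hook for whatever actually applies the redaction marks. */
    @FunctionalInterface
    interface Redactor {
        void redact(Path markedInput, Path redactedOutput) throws IOException;
    }

    /**
     * Archives the original and the marked-up copy, then produces the redacted copy.
     * Refuses to run unless the redaction marks have been approved.
     */
    static Path process(Path original, Path marked, boolean approved,
                        Path archiveDir, Path outputDir, Redactor redactor)
            throws IOException {
        if (!approved) {
            throw new IllegalStateException("Redaction marks have not been approved");
        }
        Files.createDirectories(archiveDir);
        Files.createDirectories(outputDir);

        // Archive first: redaction is lossy, so the untouched files must be kept safe
        // before anything is removed.
        Files.copy(original, archiveDir.resolve("original-" + original.getFileName()),
                StandardCopyOption.REPLACE_EXISTING);
        Files.copy(marked, archiveDir.resolve("marked-" + marked.getFileName()),
                StandardCopyOption.REPLACE_EXISTING);

        // Only now create the redacted copy for distribution.
        Path redacted = outputDir.resolve("redacted-" + marked.getFileName());
        redactor.redact(marked, redacted);
        return redacted;
    }

    public static void main(String[] args) throws IOException {
        Path work = Files.createTempDirectory("redaction-workflow");
        Path original = Files.writeString(work.resolve("exhibit.pdf"), "original bytes");
        Path marked = Files.writeString(work.resolve("exhibit-marked.pdf"), "marked bytes");

        // Placeholder redactor: writes a dummy file instead of calling the toolkit.
        Redactor placeholder = (in, out) -> Files.writeString(out, "redacted bytes");

        Path redacted = process(original, marked, true,
                work.resolve("archive"), work.resolve("distribution"), placeholder);
        System.out.println("Redacted copy written to " + redacted);
    }
}
```

The ordering is the point of the sketch: both archive copies are written before the redactor runs, and an unapproved file never reaches the redaction step at all.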

View and download the “RedactionSample” sample, or get all the samples and documentation by requesting an evaluation of the Datalogics PDF Java Toolkit.
