Creating PDF Portfolios from ZIP files using the Datalogics PDF Java Toolkit

Sample of the Week:

Depending on your point of view, Adobe either nerfed or improved the functionality of PDF Portfolios by going back to basics with the release of Acrobat DC. While I loved how the Flash-based PDF Portfolio templates looked, performance suffered with large portfolios and they never ran well on mobile devices.

With Acrobat DC, you can still view Flash-based PDF Portfolios created in earlier versions; you just can’t create new ones… essentially, we’re back to the same place we were with PDF Packages from the Acrobat 8 days… just with folders. It may feel like we’ve stepped backwards, but I believe having a more consistent experience across desktop and mobile devices and removing the dependency on the Flash Player is, overall, a good thing.

For those of you who are not familiar with PDF Portfolios, a portfolio is simply a PDF file with a bunch of attachments and a sort of database that is used by a conforming viewer to present the attachments in an organized way with files separated into folders and with metadata displaying what they all are and how you might use them. Just about any file type can be attached to a PDF or placed in a PDF Portfolio… actually, any file type really… it’s just that the Reader won’t let you launch certain file types due to security concerns… .exe or .js files for example.

A PDF Portfolio or “Collection” also contains a “Cover Sheet” that all of the attachments are actually attached to. Acrobat and other PDF developer tools like the Datalogics PDF Java Toolkit use FlateDecode (the same Deflate compression that ZIP uses) to store your files inside the PDF, so you can think of a PDF Portfolio as a sort of ZIP file with a PDF wrapper, a better user interface, tighter security and a few other key advantages. For many use cases, you may be better off using a PDF file instead of a .ZIP file. For example, many mail servers won’t allow .ZIP attachments to be sent or received. Additionally, at the time of this writing, .ZIP still requires special software to open or view on mobile devices. Which brings me to this week’s Sample of the Week.

The Datalogics PDF Java Toolkit makes it very easy to simply add attachments to a PDF file, but creating a PDF Portfolio requires a few more dictionaries to be created, and the user experience is better if you add a little metadata to help your audience navigate the Portfolio and easily find what they’re looking for.

This GitHub gist demonstrates how to convert a .ZIP file to a PDF Portfolio but you can optionally use a folder as the input.

We’ll start by opening an existing PDF file that we’ll use as the cover sheet. This will be shown by default unless you set an initial document for the Portfolio.

Then we need to create the Schema, which is where the information that the viewer will use to display the Portfolio is stored. We want the files to be sorted and displayed alphabetically based on their source file name.

We now have enough of the Portfolio structures in place to begin adding the files and creating folders as needed. We can duplicate the structure of the directory pretty easily using a little recursion. If the type of item we are adding is a folder, we need to create the folder in the Portfolio separately, and then add the items inside it.
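To make the folder-then-contents idea concrete, here is a minimal sketch that uses only the JDK's `java.util.zip` classes. It is not the code from the gist and it does not call the Datalogics PDF Java Toolkit at all: `ZipTreeSketch`, `Folder`, `readTree` and `walk` are hypothetical names invented for this illustration, and the `walk` method simply records where the real "create folder" and "attach file" calls would go.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical in-memory model; NOT part of the Datalogics PDF Java Toolkit.
class ZipTreeSketch {

    /** One folder: child folders and file names, both kept in alphabetical order. */
    static final class Folder {
        final Map<String, Folder> folders = new TreeMap<>();
        final Set<String> files = new TreeSet<>();
    }

    /** Reads every entry of a ZIP stream and files it under its parent folders. */
    static Folder readTree(InputStream zipBytes) throws IOException {
        Folder root = new Folder();
        try (ZipInputStream zip = new ZipInputStream(zipBytes)) {
            for (ZipEntry e = zip.getNextEntry(); e != null; e = zip.getNextEntry()) {
                String[] parts = e.getName().split("/");
                // For a directory entry every path segment is a folder;
                // for a file entry the last segment is the file name.
                int folderCount = e.isDirectory() ? parts.length : parts.length - 1;
                Folder current = root;
                for (int i = 0; i < folderCount; i++) {
                    current = current.folders.computeIfAbsent(parts[i], k -> new Folder());
                }
                if (!e.isDirectory()) {
                    current.files.add(parts[parts.length - 1]);
                }
            }
        }
        return root;
    }

    /** Recursively visits the tree: each folder is reported before the items inside it. */
    static void walk(Folder folder, String path, StringBuilder out) {
        for (String file : folder.files) {
            // A real implementation would attach the file to the Portfolio here.
            out.append("file   ").append(path).append(file).append('\n');
        }
        for (Map.Entry<String, Folder> child : folder.folders.entrySet()) {
            String childPath = path + child.getKey() + "/";
            // A real implementation would create the Portfolio folder here,
            // then recurse to add the items inside it.
            out.append("folder ").append(childPath).append('\n');
            walk(child.getValue(), childPath, out);
        }
    }
}
```

Calling `ZipTreeSketch.walk(ZipTreeSketch.readTree(stream), "", new StringBuilder())` produces one line per folder and file. Using `TreeMap` and `TreeSet` is just a convenient way to get the alphabetical ordering for free; the same recursion would work on a directory on disk.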

Finally, for each file we encounter, we want to attach the file as well as generate some metadata about it. For PDF files, the toolkit lets the developer easily reach into the PDF metadata and use it to populate the Portfolio schema. In this example, we create a new Portfolio metadata key for every key-value pair in the PDF file.
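The "one schema key per metadata key" idea can be sketched with plain Java collections. This is only an illustration of the bookkeeping, under the assumption that each file's metadata has already been read into a `Map<String, String>`; `SchemaSketch`, `addFile` and `valueOf` are hypothetical names, not Toolkit classes or methods, and the gist itself may organize this differently.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the Portfolio schema; NOT the Toolkit's own classes.
class SchemaSketch {

    /** Field (column) names, in the order they were first seen. */
    final Set<String> fields = new LinkedHashSet<>();

    /** Per-file field values, keyed by the file's path inside the Portfolio. */
    final Map<String, Map<String, String>> items = new LinkedHashMap<>();

    /** Records one file and adds a schema field for every metadata key it carries. */
    void addFile(String path, Map<String, String> metadata) {
        fields.addAll(metadata.keySet());
        items.put(path, new LinkedHashMap<>(metadata));
    }

    /** Value shown for a file in a given column; empty when the file lacks that key. */
    String valueOf(String path, String field) {
        Map<String, String> values = items.get(path);
        if (values == null) {
            return "";
        }
        return values.getOrDefault(field, "");
    }
}
```

Because the schema is the union of every key seen so far, a file that lacks a given key simply shows an empty cell in that column.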

The new PDF Portfolio is only slightly larger than the original .ZIP input file. If you’re more careful than I was when creating the cover sheet or if you use one of the documents you want to distribute as the cover sheet, the PDF Portfolio and the .ZIP can be nearly identical in size… except the PDF is far better behaved when going through email servers or used on mobile devices.
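The sizes end up close because both containers lean on the same Deflate algorithm. FlateDecode streams are zlib-wrapped Deflate data, which is what the JDK's `java.util.zip.Deflater` produces by default, so a rough feel for the compressed size of an attachment can be had without any PDF library. The sketch below is an illustration only: `FlateSketch` and its two methods are hypothetical helpers, and the Toolkit handles this compression internally when it writes the file.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper for illustration; the Toolkit does this internally.
class FlateSketch {

    /** Compresses bytes into a zlib-wrapped Deflate stream, as used by FlateDecode. */
    static byte[] deflate(byte[] input) {
        Deflater deflater = new Deflater(); // default level, zlib header and checksum
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        while (!deflater.finished()) {
            int n = deflater.deflate(chunk);
            out.write(chunk, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    /** Restores the original bytes from a zlib-wrapped Deflate stream. */
    static byte[] inflate(byte[] compressed) throws DataFormatException {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        while (!inflater.finished()) {
            int n = inflater.inflate(chunk);
            // Guard against looping forever on truncated input.
            if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                inflater.end();
                throw new DataFormatException("truncated or unsupported stream");
            }
            out.write(chunk, 0, n);
        }
        inflater.end();
        return out.toByteArray();
    }
}
```

Comparing `FlateSketch.deflate(bytes).length` with the compressed size of the same entry in the .ZIP gives a sense of how little the compressed data itself differs; the rest of the gap comes from the cover sheet and the Portfolio dictionaries.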

To run this Gist, you’ll need at least an evaluation version of the Datalogics PDF Java Toolkit as well as the input files.
