Back in the early days, PDF shared its imaging model with Adobe’s first big hit, PostScript. PDF files were created in one of two ways: through PDF Writer, an operating-system-level driver, or through Distiller, which converted PostScript to PDF. You could also code PDF files by hand, which I personally witnessed on more than one occasion, but that technique really doesn’t apply to this discussion and makes my brain hurt.
Note: This is the fourth in a series of my articles addressing PDF Optimization. You can read the prior articles at the links below.
So… When is refrying a PDF a good idea?
Never!… well… almost never.
First, some background…
As PDF began to get more popular, other non-Adobe tools became available to produce them. Some of these tools produced very good PDF and others produced very bad PDF… and some of the applications that consumed PDF had some trouble reading the bad ones.
The other important detail is the cross-reference table. It tells the viewer the byte offset at which each object starts, and that is what lets PDF viewers jump straight to any page, even in documents that are tens of thousands of pages long. Back in 1995, email servers tended to do weird things with attachments, and a line separator on one operating system wasn’t necessarily the same as on another, so those byte offsets could end up wrong. Given those two issues, and the fact that getting PDF right is non-trivial, Adobe decided, rightly, that their viewer would need to be somewhat forgiving and built in some “repair” code for malformed PDF files. For PDF to succeed, it had to be perceived as reliable; if that meant Acrobat had to open less-than-perfect PDF files created by third-party software, so be it.
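To make the cross-reference idea concrete, here is a minimal Python sketch. It builds a tiny PDF-like byte string with a classic cross-reference table, reads the offsets back, and then shows how a blanket LF-to-CRLF conversion (the kind of thing a mail gateway might do) invalidates them. All function names are hypothetical, and the rescan in `rebuild_offsets` is only a naive stand-in for the idea of repair; it is not how Acrobat actually repairs files.

```python
import re


def build_minimal_pdf() -> bytes:
    """Assemble a tiny PDF-like file with a classic xref table."""
    bodies = [
        b"1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n",
        b"2 0 obj\n<< /Type /Pages /Kids [] /Count 0 >>\nendobj\n",
    ]
    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for body in bodies:
        offsets.append(len(out))  # byte offset where this object starts
        out += body
    xref_pos = len(out)
    out += b"xref\n0 %d\n" % (len(bodies) + 1)
    out += b"0000000000 65535 f \n"  # entry 0 heads the free list
    for off in offsets:
        out += b"%010d 00000 n \n" % off  # each entry is exactly 20 bytes
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(bodies) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_pos
    return bytes(out)


def read_xref(data: bytes) -> dict[int, int]:
    """Return {object number: byte offset} for the in-use entries."""
    start = int(re.search(rb"startxref\s+(\d+)\s+%%EOF", data).group(1))
    lines = data[start:].splitlines()  # lines[0] is b"xref"
    first, count = map(int, lines[1].split())
    table = {}
    for i in range(count):
        offset, _gen, kind = lines[2 + i].split()
        if kind == b"n":  # "n" marks an in-use object, "f" a free one
            table[first + i] = int(offset)
    return table


def offsets_resolve(data: bytes, table: dict[int, int]) -> bool:
    """True if every offset really points at '<num> 0 obj'."""
    return all(data.startswith(b"%d 0 obj" % num, off)
               for num, off in table.items())


def rebuild_offsets(data: bytes) -> dict[int, int]:
    """Naive repair: rescan for object headers, ignoring the stored table."""
    return {int(m.group(1)): m.start()
            for m in re.finditer(rb"(?m)^(\d+) \d+ obj", data)}


pdf = build_minimal_pdf()
table = read_xref(pdf)
damaged = pdf.replace(b"\n", b"\r\n")  # simulate line-ending translation

print(offsets_resolve(pdf, table))                         # True
print(offsets_resolve(damaged, table))                     # False
print(offsets_resolve(damaged, rebuild_offsets(damaged)))  # True
```

Every extra carriage return shifts the bytes that follow it, so the stored offsets no longer land on an object header. A forgiving viewer can recover by scanning the file; a strict one simply fails.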
But not all PDF tools were so forgiving.
The Emergence of Refried PDF:
I don’t remember which customer introduced the term “refried PDF,” but I do remember laughing when I first heard it; I knew what they meant immediately. I’d even done it myself on occasion. A refried PDF is one that has been converted to PostScript and then converted back to PDF with Distiller.
What was really going on was this: the file was opened in Acrobat, which repaired it and printed it to PostScript, and Adobe’s Distiller then converted that PostScript back to PDF. In other words, Acrobat fixed the file and created a valid print stream, and a really good tool turned that stream back into PDF.
The process fixed a lot of troublesome PDF files.
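For readers who want to see the shape of that round trip, here is a rough Python sketch. It assumes Ghostscript’s `pdf2ps` and `ps2pdf` wrapper scripts as stand-ins for Acrobat and Distiller, which is a substitution on my part and not the toolchain described above. The function names `refry_commands` and `refry` are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def refry_commands(src: Path, dst: Path) -> list[list[str]]:
    """Build the two command lines for a PDF -> PostScript -> PDF round trip."""
    ps = dst.with_suffix(".ps")  # intermediate PostScript file next to dst
    return [
        ["pdf2ps", str(src), str(ps)],  # step 1: PDF out to PostScript
        ["ps2pdf", str(ps), str(dst)],  # step 2: PostScript back to PDF
    ]


def refry(src: Path, dst: Path) -> bool:
    """Run the round trip; return False if the Ghostscript tools are missing."""
    if not (shutil.which("pdf2ps") and shutil.which("ps2pdf")):
        return False
    for cmd in refry_commands(src, dst):
        subprocess.run(cmd, check=True)  # raises if either step fails
    return True
```

Calling `refry(Path("in.pdf"), Path("out.pdf"))` would leave an intermediate `out.ps` beside the result. Anything PostScript cannot represent is dropped in step 1, so treat the output as a lossy copy.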
But then PDF evolved. Features like transparency, new image compression technology, tags and structure, XMP metadata, and interactivity got added. These things didn’t translate to PostScript, so they got lost on the way out of PDF, resulting in a less capable refried PDF.
Also, the tools got better. It became possible to use PDF library technologies to transcode one PDF directly into a new PDF that was free of errors and had the characteristics that the developer needed for that particular application… and PDF Optimization was born.
There are still some reasons why a developer might want to refry a PDF file. I won’t go into those here, but if you are interested, read Refrying PDF files at Prepressure.com. But PDF Optimization will get you a lot farther than refrying ever could, and the process won’t be lossy… unless you want it to be… but that’s an article for the future.