Introducing PDF CHECKER

Hear Ye! Hear Ye! Today we announce the birth of PDF CHECKER. PDF CHECKER is a free tool – download here – that detects a number of different problem conditions in PDF files. Of course, different PDF workflows and users will naturally have different ideas of what constitutes a problematic PDF. We encourage you to download PDF CHECKER and set up your own custom PDF CHECKER profile to detect PDF problems for your specific user needs.

Too Much Information?

Let’s discuss a common scenario – PDFs that are larger than they need to be for readers on the go. Many PDFs come out of content tools generated for multi-purpose use: for print, for storage, and for on-screen viewing. These PDFs can be well suited to general business use while still being much larger than necessary for specific use cases, such as mobile viewing.

For content distributors who want to improve the mobile customer experience, the download size of digital assets is a key concern. Whichever way a PDF was created, a smaller version of the file can lead to faster downloads and a better mobile experience. Together, PDF CHECKER and PDF OPTIMIZER form an easy-to-configure, easy-to-automate combination designed for server-based correction of many different types of PDF problems.

Example Scenario

Let’s set up a PDF CHECKER profile to look for three common size gremlins:

  • Images with excessively high resolutions (DPI)
  • Fully-embedded fonts
  • Application-specific data

We can use the following PDF CHECKER profile to scan for PDF files that contain any of the three conditions above:

{
    "fonts": {
        "uses-fonts-fully-embedded": {
            "check": "on",
            "report-as-error": "off",
            "report-message": "Uses fonts fully embedded in document",
            "abort-remaining-checks": "on"
        }
    },
    "images": {
        "color": {
            "resolution-too-high": {
                "check": "on",
                "report-as-error": "off",
                "report-message": "High resolution color image(s) present",
                "abort-remaining-checks": "on",
                "trigger-dpi": 150
            }
        },
        "grayscale": {
            "resolution-too-high": {
                "check": "on",
                "report-as-error": "off",
                "report-message": "High resolution grayscale image(s) present",
                "abort-remaining-checks": "on",
                "trigger-dpi": 150
            }
        },
        "monochrome": {
            "resolution-too-high": {
                "check": "on",
                "report-as-error": "off",
                "report-message": "High resolution monochrome image(s) present",
                "abort-remaining-checks": "on",
                "trigger-dpi": 150
            }
        }
    },
    "userdata": {
        "contains-private-data": {
            "check": "on",
            "report-as-error": "off",
            "report-message": "Contains private data",
            "abort-remaining-checks": "on"
        }
    }
}
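
The profile above stops at the first condition it finds, because every check sets "abort-remaining-checks" to "on". For a workflow that wants a full report instead, the flag can be flipped in bulk before the profile is handed to PDF CHECKER. Here is a minimal Python sketch. The helper name report_all_conditions is hypothetical, and it assumes "off" is the accepted opposite of "on" for this key, as it is for "report-as-error" in the same profile. It only edits the JSON structure and does not call PDF CHECKER itself.

```python
import copy
import json


def report_all_conditions(profile: dict) -> dict:
    """Return a copy of a checker profile with every abort flag set to "off".

    Assumption: "off" is a valid value for "abort-remaining-checks".
    """
    updated = copy.deepcopy(profile)  # leave the caller's profile untouched

    def walk(node: dict) -> None:
        for key, value in node.items():
            if key == "abort-remaining-checks":
                node[key] = "off"
            elif isinstance(value, dict):
                walk(value)  # descend into nested sections such as "images"

    walk(updated)
    return updated


# A cut-down profile using only keys that appear in the profile above.
sample_profile = {
    "fonts": {
        "uses-fonts-fully-embedded": {
            "check": "on",
            "abort-remaining-checks": "on",
        }
    },
    "images": {
        "color": {
            "resolution-too-high": {
                "check": "on",
                "abort-remaining-checks": "on",
                "trigger-dpi": 150,
            }
        }
    },
}

print(json.dumps(report_all_conditions(sample_profile), indent=4))
```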

Here’s the corresponding PDF OPTIMIZER profile to resolve these conditions:

{
    "images": {
        "color": {
            "downsample": {
                "trigger-dpi": 150,
                "target-dpi": 150
            }
        },
        "grayscale": {
            "downsample": {
                "trigger-dpi": 150,
                "target-dpi": 150
            }
        },
        "monochrome": {
            "downsample": {
                "trigger-dpi": 150,
                "target-dpi": 150
            }
        },
        "optimize-images-only-if-reduction-in-size": "on"
    },
    "fonts": {
        "subset-embedded-fonts": "on"
    },
    "userdata": {
        "discard-private-data": "on"
    }
}
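
The two profiles mirror each other: each enabled check has a matching optimization setting. That mapping can be sketched in Python as below. This is an illustration built only from the keys shown in the two profiles, not a feature of either tool. The function name optimizer_profile_from_checker is hypothetical, and reusing the checker's trigger resolution as the optimizer's target resolution is an assumption taken from this example, where both are 150.

```python
IMAGE_CLASSES = ("color", "grayscale", "monochrome")


def _is_on(check: dict) -> bool:
    """True when a checker entry is enabled."""
    return check.get("check") == "on"


def optimizer_profile_from_checker(checker: dict) -> dict:
    """Build an optimizer profile that addresses each enabled check."""
    optimizer: dict = {}

    images: dict = {}
    for image_class in IMAGE_CLASSES:
        check = (
            checker.get("images", {})
            .get(image_class, {})
            .get("resolution-too-high", {})
        )
        if _is_on(check) and "trigger-dpi" in check:
            dpi = check["trigger-dpi"]
            # Assumption: downsample to the same resolution that triggers the check.
            images[image_class] = {
                "downsample": {"trigger-dpi": dpi, "target-dpi": dpi}
            }
    if images:
        images["optimize-images-only-if-reduction-in-size"] = "on"
        optimizer["images"] = images

    if _is_on(checker.get("fonts", {}).get("uses-fonts-fully-embedded", {})):
        optimizer["fonts"] = {"subset-embedded-fonts": "on"}

    if _is_on(checker.get("userdata", {}).get("contains-private-data", {})):
        optimizer["userdata"] = {"discard-private-data": "on"}

    return optimizer
```

Checks that are switched off in the checker profile produce no optimizer entry, so the generated profile only does work that a check would have flagged.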

There are a few things to note about these check and optimization profiles:

  • In the checking profile, each check is set to abort the remaining checks as soon as its condition is encountered, so PDF CHECKER stops at the first listed issue it detects. For workflows where a flagged PDF is simply run through an optimization process, what usually matters is knowing that at least one trigger condition was encountered. Finding every trigger condition is usually no more than a waste of processing time. However, for workflows that do need every trigger condition reported, simply set the checks to not abort on encounter.
  • In both the check and optimization profiles, there are three classes of image processing: one class for color images, one for grayscale, and one for monochrome (bitonal) images. Different image classes may benefit from different target resolutions, depending on the specific viewing circumstances. In the case of mobile viewing, however, it’s very likely that all images can be reduced to the same resolution.
  • For image resampling with the optimization profile, there are separate target and trigger resolutions. These can be used to avoid resampling images that are already close to an acceptable resolution. For example, the Mobile profile that comes with PDF OPTIMIZER downsamples images that are at or above 144 dpi to 96 dpi. Images below 144 dpi keep the resolution they have in the input PDF, as downsampling these will usually not be worth the processing involved.
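
The trigger and target resolutions in the last note can be read as a simple threshold rule. The Python sketch below illustrates that rule using the Mobile profile figures quoted above (trigger 144 dpi, target 96 dpi). It is not PDF OPTIMIZER's implementation: the function name is hypothetical, and treating an image exactly at the trigger resolution as one that gets downsampled is an assumption.

```python
def resolution_after_downsample(
    image_dpi: float, trigger_dpi: float, target_dpi: float
) -> float:
    """Return the resolution an image would have after the downsample step.

    Images at or above the trigger are reduced to the target; images below
    the trigger are left alone (assumed boundary behavior).
    """
    if target_dpi > trigger_dpi:
        raise ValueError("target-dpi should not exceed trigger-dpi")
    if image_dpi >= trigger_dpi:
        return target_dpi
    return image_dpi


# Mobile-style settings: trigger at 144 dpi, downsample to 96 dpi.
for dpi in (300, 144, 120):
    print(dpi, "->", resolution_after_downsample(dpi, trigger_dpi=144, target_dpi=96))
```

With these settings a 300 dpi image drops to 96 dpi, while a 120 dpi image is left at 120 dpi because the small size saving is not worth the resampling work.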

Digital content distribution workflows can automate the above check and transformation of customer outbound PDFs using their preferred mechanism, to dramatically reduce the download size of many PDFs. Smaller PDFs for mobile users lead to faster downloads, more users reading your content, and a better user experience for your product or brand.

We encourage you to try PDF CHECKER – download the free tool here.
