Creating a Table of Contents from Bookmarks with DLE

A couple of years ago, I put together a sample app for creating a Table of Contents from Bookmarks. I’ve revisited and revised it on occasion based on feedback from customers, so I thought I’d share the latest iteration of the code a bit more broadly. I also came across the perfect input document to demonstrate it this weekend, during my annual Choose-Your-Own-Adventure for Adults weekend, when I dive into this text and come out the other end with either a penalty or a refund. It’s a high-stakes game, or at least that’s what I tell myself to get through it.

Anyway, the Table of Contents for the 1040 general instructions is a little bit parsimonious:

[Image: the original Table of Contents of the 1040 general instructions]

Must be because of the Paperwork Reduction Act. Fortunately, the PDF has plenty of bookmarks so we can generate a rather expanded version of this Table of Contents:

[Image: the first page of the Table of Contents generated from the bookmarks]

And so on for 5 more pages. That brings up one of the first technical challenges of creating a Table of Contents that points to the correct pages: you need to know how many pages the Table of Contents itself will occupy, and insert those pages first. Otherwise every page number shifts by one or more when you discover that the Table of Contents takes up more than one page.

Before that, a preliminary challenge is to flatten the bookmark tree into an in-order list of bookmarks:
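As a rough, library-independent illustration of the flattening step, here is a minimal Python sketch. The `Bookmark` class is a hypothetical stand-in for a bookmark node, not a DLE type, and `flatten` is a name chosen for this example. It walks the tree so that each parent appears before its children, which is the order the entries take in the Table of Contents.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Bookmark:
    """Hypothetical stand-in for a PDF bookmark node (not a DLE class)."""
    title: str
    page_index: int  # zero-based page the bookmark points to
    children: List["Bookmark"] = field(default_factory=list)
    parent: Optional["Bookmark"] = field(default=None, repr=False, compare=False)

    def add(self, child: "Bookmark") -> "Bookmark":
        """Attach a child and remember its parent for later depth counting."""
        child.parent = self
        self.children.append(child)
        return child


def flatten(root: Bookmark) -> List[Bookmark]:
    """Return every bookmark below root, parents before their children."""
    ordered: List[Bookmark] = []
    # An explicit stack avoids recursion limits on very deep trees.
    stack = list(reversed(root.children))
    while stack:
        node = stack.pop()
        ordered.append(node)
        # Push children reversed so the first child is popped first.
        stack.extend(reversed(node.children))
    return ordered
```

The root node itself is left out of the result, on the assumption that it is only a container and has no title or destination of its own.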

Of course, we still need the bookmark tree structure to determine the indent level. We determine the indent level by counting the parent nodes from the bookmark node back up to the root node. This code has a built-in table of indent levels, but the bookmark depth can be arbitrary, so once we run out of predefined indent levels, we increase the indent by the last entry in the indentlevels array for each additional level, ad infinitum (but not really; the algorithm would break if the depth were too great).
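The indent calculation could look something like the sketch below. The values in `INDENT_LEVELS` are made up for illustration, `Node` is a hypothetical minimal node type, and the fallback rule (add the last table entry once per extra level) is one reading of the description above rather than a copy of the original code.

```python
from typing import Optional, Sequence

# Assumed indent table in points; the real values are not given in the post.
INDENT_LEVELS = (0.0, 18.0, 36.0, 54.0)


class Node:
    """Hypothetical bookmark node that only knows its parent."""

    def __init__(self, title: str, parent: Optional["Node"] = None):
        self.title = title
        self.parent = parent


def depth_of(node: Node, root: Node) -> int:
    """Count the ancestors between node and root (top-level bookmarks are 0)."""
    depth = 0
    current = node.parent
    while current is not None and current is not root:
        depth += 1
        current = current.parent
    return depth


def indent_for(depth: int, levels: Sequence[float] = INDENT_LEVELS) -> float:
    """Look up the indent, extending past the table by its last entry."""
    if depth < len(levels):
        return levels[depth]
    extra_levels = depth - (len(levels) - 1)
    return levels[-1] + extra_levels * levels[-1]
```

With these assumed values, a bookmark at depth 4 would be indented 108 points. A very deep tree would eventually push the indent past the right margin, which is the breaking point mentioned above.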

The practiceLayout routine is going to be similar to the layoutTOC routine, but since we won’t be committing anything to the page just yet, it can be simpler.
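To make the idea of a practice pass concrete, here is a simplified sketch that only counts pages. It assumes every entry fits on a single line (no wrapping) and uses made-up defaults for page size, margins and line height. The function names are illustrative and do not come from the original code.

```python
import math


def practice_layout(entry_count: int,
                    page_height: float = 792.0,
                    top_margin: float = 72.0,
                    bottom_margin: float = 72.0,
                    line_height: float = 14.0) -> int:
    """Return how many pages the entries need, without drawing anything."""
    usable_height = page_height - top_margin - bottom_margin
    lines_per_page = int(usable_height // line_height)
    if lines_per_page < 1:
        raise ValueError("margins leave no room for a single line")
    return math.ceil(entry_count / lines_per_page)


def shifted_page_index(original_index: int, toc_pages: int) -> int:
    """Page index of a destination once the TOC pages are inserted in front."""
    return original_index + toc_pages
```

With these defaults a page holds 46 lines, so 250 bookmarks would need 6 pages, and a bookmark that pointed at page index 9 would then point at index 15. The second function assumes the Table of Contents is inserted at the very front of the document.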

Where layoutTOC complicates itself is in adding dots between the end of the bookmark title and the right-justified page label:
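The arithmetic behind the dot leader can be sketched as follows. The widths would come from whatever text-measurement call the PDF library provides; here they are plain numbers passed in by the caller, and `leader_dots` and its `gap` parameter are names invented for this example.

```python
def leader_dots(title_width: float,
                label_width: float,
                line_width: float,
                indent: float,
                dot_width: float,
                gap: float = 2.0) -> str:
    """Return the run of dots that fits between the title and the page label.

    All widths are in the same unit (for example points). A small gap is
    kept on each side of the leader so the dots do not touch the text.
    """
    available = line_width - indent - title_width - label_width - 2 * gap
    if available <= 0 or dot_width <= 0:
        # The title already reaches the label; draw no leader at all.
        return ""
    return "." * int(available // dot_width)
```

For a 468-point line with an 18-point indent, a 100-point title, a 12-point page label and 3-point dots, this leaves 334 points for the leader, or 111 dots.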

Also, for the second pass, we add one Text element per bookmark node so that we can use its boundingBox to add a link to the bookmark destination and so close the loop.

There are at least a few ways that this code could be further extended. One would be to add the Table of Contents after an arbitrary set of pages. Another would be to pass in an array of Rects so that you could do an IRS-style two-column Table of Contents. A third would be to decorate the Table of Contents pages in some way, perhaps by adding a page header that uses the Document Title, or by specifying a special page label range for the Table of Contents and creating a page footer that uses the page label string for the page number. I’m sure that there are other possibilities I haven’t thought of. Let me know if you think of a good one.

The full code is here.
