DeveloperWeek 2018 Trending Topics: Document Quality and Data Extraction

The Datalogics team had a great time at DeveloperWeek 2018. There were a lot of interesting talks and exhibitors, and an excellent crowd overall. Having come all the way from Chicago, our team found the beautiful, warm Oakland weather a welcome change.

We had some great in-depth discussions about PDF documents with attendees who stopped by our booth. Two main topics were on people’s minds at this year’s show:

  • The first topic revolved around accepting user-submitted documents. This was especially the case for document management service companies, which typically receive and process large numbers of user documents.
  • The second topic revolved around AI and data research. Companies that have large amounts of data locked inside PDF documents are experimenting with different tools to get that data into a usable format.

PDF document quality is a trend we have seen before, and it seems to be a recurring theme for companies that work with user-submitted PDF documents. Those documents often present a variety of problems and can sometimes break internal processes. Everyone we talked to was happy to hear that Datalogics would soon be releasing a free tool to verify the integrity of documents and flag issues that need to be fixed within a PDF. Our PDF OPTIMIZER tool can resolve many of these issues automatically.

Data extraction was another interesting topic we discussed at length. I was happy that my presentation at the show, which outlined the issues with extracting data from PDF documents, resonated with a lot of the attendees. Many of them have vast amounts of data stored in PDF documents and are struggling to extract it. Data analysis and AI are hot topics today, and these attendees believe there is a lot of value in extracting this information. Information extraction from PDF documents can be difficult, especially for unstructured content. It is a complex problem, and in some cases the extraction is performed by an AI algorithm specifically designed for the task.

While we were sad to leave sunny Oakland, we really enjoyed our time at DeveloperWeek 2018, and are looking forward to a great year filled with opportunities to continue helping users with our expertise in the PDF file format. For more information, contact us.
