Last month, Datalogics announced the latest release of PDF Alchemist. My column in this blog in January talked about what’s new with PDF Alchemist. Today, I’d like to talk about what makes PDF Alchemist different from other PDF content extraction and conversion options.
Data-based content extraction
PDF Alchemist analyzes the data within PDF files, both visually and through other heuristics, to reassemble from the drawing commands the data structures and elements that humans see on the page. Solutions that apply optical character recognition (OCR) to entire PDF pages discard all of this data, because they rely on a lossy conversion of the drawing commands into raster images. To date, no OCR solution is perfect, so these conversions may result in data loss. Even worse, these “round-trips” through OCR processing can change data and content through misrecognition.
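To make the distinction concrete, here is a minimal Python sketch of why working from drawing commands preserves data. It is not PDF Alchemist's implementation or API. The content stream below is a hypothetical, uncompressed example, and the `extract_shown_text` helper is an illustrative name. Real PDF files usually compress their streams, use font-specific encodings, and split text across many operators, which is where the heuristics come in.

```python
import re
from typing import List

# Hypothetical, uncompressed page content stream (an assumption for
# illustration). "Tj" is the PDF operator that shows a text string.
CONTENT_STREAM = b"""BT
/F1 12 Tf
72 720 Td
(Invoice total: 1,024.50) Tj
0 -14 Td
(Due: 2019-03-01) Tj
ET
"""

# Match a literal string operand "( ... )" followed by the Tj operator.
# Escaped characters such as "\(" are kept inside the string but are not
# decoded here; nested parentheses are not handled in this sketch.
SHOW_TEXT = re.compile(rb"\(((?:\\.|[^\\()])*)\)\s*Tj")


def extract_shown_text(stream: bytes) -> List[str]:
    """Return the string operands of every Tj operator, in stream order."""
    return [m.group(1).decode("latin-1") for m in SHOW_TEXT.finditer(stream)]


if __name__ == "__main__":
    for line in extract_shown_text(CONTENT_STREAM):
        print(line)
```

Because the characters are read straight from the file, a digit such as the "1" in "1,024.50" comes out exactly as it was written. An OCR pipeline would first render these commands to pixels and then have to guess the characters back.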
Complete content retrieval
Unlike solutions focused solely on data tables, PDF Alchemist handles all types of visual content within PDF files. Products that focus on tables alone are great if that’s all your documents consist of. However, almost all documents are a combination of tables, text, graphics, and other materials. PDF Alchemist processes and presents all the material from PDF files, not just a select subset of information.
PDF Alchemist offers those interested in responsive HTML a variety of presentation options. Bookmarks and tables of contents are automatically converted into HTML frames for easy output navigation. Pagination artifacts such as running headers, footers, and page numbers may be removed or kept as desired. The graphical content resolution is adjustable: high-resolution output is supported when quality is paramount, while lower resolutions can be selected when output size and transfer speed are top priorities.
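The resolution trade-off described above comes down to simple arithmetic. The Python sketch below is a rough illustration, not a description of how PDF Alchemist sizes its output: the `raster_size` helper is a hypothetical name, and the byte counts assume uncompressed 24-bit RGB, so real image files will normally be smaller once compressed.

```python
from typing import Tuple


def raster_size(width_in: float, height_in: float, dpi: int) -> Tuple[int, int, int]:
    """Return (width_px, height_px, raw_bytes) for a region rendered at `dpi`.

    raw_bytes assumes 3 bytes per pixel (uncompressed RGB); this is an
    assumption for illustration only.
    """
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px, height_px, width_px * height_px * 3


if __name__ == "__main__":
    # A full US Letter page (8.5 x 11 inches) at a low and a high resolution.
    for dpi in (72, 300):
        w, h, raw = raster_size(8.5, 11, dpi)
        print(f"{dpi} dpi: {w} x {h} px, {raw:,} raw bytes")
```

Pixel count grows with the square of the resolution, so going from 72 to 300 dpi multiplies the raw image data by roughly 17.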
We fully understand that much of your data and information is sensitive and confidential. When you use an external cloud-based solution, your data is subject to that provider’s policies and security. This applies to both the processing and the data storage stages, and your information could be compromised or accessed during either one. In some cases, your data may be kept in perpetuity by these providers, who have full permission and access to mine and analyze your proprietary and sensitive information. With PDF Alchemist, you have peace of mind about your data security, and you choose how to deploy it. When processing time or security is crucial, you can deploy PDF Alchemist within your internal cloud or on your own servers. When the fault tolerance of the cloud is desired, you can license PDF Alchemist for deployment on your choice of IaaS or cloud providers. You choose the right solution for your needs.