PDF OCR Sample Code: Transform Non-Searchable PDFs
Having trouble accessing your PDF text? Optical character recognition (OCR) lets you convert non-searchable PDFs so that text previously locked inside images becomes selectable and searchable within the PDF.
With Adobe PDF Library, you can add OCR to your PDFs to reveal what is hidden. Here's what our OCR code samples can help you with.
Add Text To Document
Adding text to a document places recognized text behind the images on a PDF page that OCR has processed. Because the text sits behind the images, the original appearance of the document is maintained. Design elements, formatting, and intricate layouts stay intact, while the text becomes accessible.
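To make the layering concrete, here is a minimal Python sketch of a page content stream that writes text in PDF text rendering mode 3 (invisible) and then paints the scanned image on top of it. It uses only standard PDF operators and is not Adobe PDF Library's API; the function names, the font resource `/F1`, the image resource name, and the word list are all hypothetical stand-ins for what an OCR engine and a PDF writer would supply.

```python
def escape_pdf_string(text: str) -> str:
    """Escape backslashes and parentheses for a PDF literal string."""
    return text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")


def build_page_content(words, image_name, img_w, img_h):
    """Return a content stream with invisible text first, then the image.

    words: iterable of (text, x, y, font_size) in PDF points. This is an
    assumed shape for OCR output, not a real library's result type.
    image_name: name of an image XObject in the page resources (assumed).
    """
    # "3 Tr" selects text rendering mode 3: neither fill nor stroke.
    ops = ["BT", "3 Tr"]
    for text, x, y, size in words:
        ops.append(f"/F1 {size:g} Tf")          # /F1 is an assumed font resource
        ops.append(f"1 0 0 1 {x:g} {y:g} Tm")   # absolute position of the word
        ops.append(f"({escape_pdf_string(text)}) Tj")
    ops.append("ET")
    # Paint the scanned image after the text, so it is drawn on top of it.
    ops += ["q", f"{img_w:g} 0 0 {img_h:g} 0 0 cm", f"/{image_name} Do", "Q"]
    return "\n".join(ops)
```

Calling `build_page_content([("Invoice", 72, 700, 12)], "Im0", 612, 792)` would yield a stream in which the word can be selected and searched but never shows on screen or in print. A real PDF also needs the page, font, and image objects around this stream, which the sketch leaves out.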
The recognized text also makes the PDF searchable, so users can easily find specific words or phrases without altering the visual presentation of the document. For users with visual impairments who rely on screen readers, the hidden text can be read aloud, making the document more accessible while retaining its original design.
In situations where maintaining the exact look of the original document is crucial (e.g., legal or historical documents), this method allows for digital archiving without losing the ability to interact with the text.
Add Text To Image
Adding text to an image adds an image file to a PDF page, runs OCR on the image, and places the recognized text behind it. This method allows the document to be easily shared, viewed, and interacted with in a digital environment while preserving the exact appearance of the original physical document.
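One detail this workflow has to handle is lining up the recognized text with the image once it is placed on the page. The Python sketch below illustrates the idea under stated assumptions: the OCR engine reports word boxes in image pixels with the origin at the top-left, the image is scaled to fit the page and centered, and PDF coordinates are in points with the origin at the bottom-left. The names `Placement`, `fit_image`, and `pixel_box_to_points` are hypothetical and are not part of Adobe PDF Library.

```python
from typing import NamedTuple


class Placement(NamedTuple):
    scale: float  # PDF points per image pixel
    x0: float     # left edge of the placed image, in points
    y0: float     # bottom edge of the placed image, in points


def fit_image(img_w_px, img_h_px, page_w_pt, page_h_pt) -> Placement:
    """Scale the image to fit the page, keep its aspect ratio, and center it."""
    scale = min(page_w_pt / img_w_px, page_h_pt / img_h_px)
    x0 = (page_w_pt - img_w_px * scale) / 2
    y0 = (page_h_pt - img_h_px * scale) / 2
    return Placement(scale, x0, y0)


def pixel_box_to_points(box, img_h_px, placement):
    """Map an OCR box (left, top, right, bottom) in pixels, top-left origin,
    to (left, bottom, right, top) in PDF points, bottom-left origin."""
    left, top, right, bottom = box
    scale, x0, y0 = placement
    return (
        x0 + left * scale,
        y0 + (img_h_px - bottom) * scale,  # flip the vertical axis
        x0 + right * scale,
        y0 + (img_h_px - top) * scale,
    )
```

The vertical flip is the step that is easiest to get wrong: without it, the hidden text would land mirrored top to bottom relative to the words in the image, and selecting text in a viewer would highlight the wrong region.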
To see our OCR code samples in action, visit our GitHub page. And if you're ready to start OCR'ing those documents right away, go ahead and request a free trial of Adobe PDF Library today!