PDF OCR Sample Code: Transform Non-Searchable PDFs

Published September 6, 2024

Having trouble accessing the text in your PDFs? OCR (optical character recognition) converts non-searchable PDFs so that previously inaccessible image-based text becomes selectable and searchable within the PDF.

With Adobe PDF Library, you can add OCR to your PDFs to reveal the text hidden inside their images. Here is what our OCR code samples can help you do.

Add Text To Document

Check out the code here.  

Adding text to a document runs OCR on the images found on a PDF page and places the recognized text behind them. Because the text sits behind the images, the original appearance of the document is maintained: design elements, formatting, and intricate layouts stay intact while the text becomes accessible.
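
To make "text behind the image" concrete, here is a minimal Python sketch of the idea at the level of a PDF page content stream, where operators that come later paint on top of earlier ones. It is not the Adobe PDF Library sample and uses none of its API: the Word class, the helper functions, the font name F1, and the image name Im0 are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """One hypothetical OCR result: recognized text and its position in PDF points."""
    text: str
    x: float
    y: float
    font_size: float


def escape_pdf_string(text: str) -> str:
    """Escape the characters that are special inside a PDF literal string."""
    return text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")


def build_page_content(words, image_name, width, height):
    """Return content-stream text that paints the words first and the image last."""
    ops = []
    for w in words:
        ops.append(
            f"BT /F1 {w.font_size:g} Tf {w.x:g} {w.y:g} Td "
            f"({escape_pdf_string(w.text)}) Tj ET"
        )
    # Painted last, so the scanned image covers the text drawn above it.
    ops.append(f"q {width:g} 0 0 {height:g} 0 0 cm /{image_name} Do Q")
    return "\n".join(ops)


# Assumed sample data: two recognized words on a US Letter page (612 x 792 points).
words = [Word("Invoice", 72, 720, 12), Word("Total (USD)", 72, 700, 12)]
content = build_page_content(words, "Im0", 612, 792)
print(content)
```

The page looks identical to the scan because the image is drawn over the text, yet the text operators remain in the page for search, selection, and screen readers to pick up.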

The recognized text also makes the PDF searchable, so users can easily find specific words or phrases without altering the visual presentation of the document. For users with visual impairments who rely on screen readers, the hidden text can be read aloud, making the document more accessible while retaining its original design.

In situations where maintaining the exact look of the original document is crucial (e.g., legal or historical documents), this method allows for digital archiving without losing the ability to interact with the text.  


Add Text To Image

Check out the code here.  

Adding text to an image adds an image file to a PDF page, runs OCR on the image, and places the recognized text behind it. This method allows the document to be easily shared, viewed, and interacted with in a digital environment while preserving the exact appearance of the original physical document.
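
The three steps described above (add the image, run OCR, place the text behind it) can be sketched in a few lines of Python. This is an illustration of the order of operations only, not the Adobe PDF Library sample: the Page class, add_text_to_image, and fake_recognize are hypothetical, and the OCR engine is replaced by a stub that always returns the same string.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Page:
    """Hypothetical page model: layers are listed bottom to top."""
    layers: List[str] = field(default_factory=list)


def add_text_to_image(page: Page, image_path: str,
                      recognize: Callable[[str], str]) -> Page:
    """Add the image to the page, OCR it, then insert the text underneath it."""
    image_layer = f"image:{image_path}"
    page.layers.append(image_layer)                   # 1. add the image to the page
    text = recognize(image_path)                      # 2. run OCR on the image
    image_index = page.layers.index(image_layer)
    page.layers.insert(image_index, f"text:{text}")   # 3. place the text behind it
    return page


def fake_recognize(image_path: str) -> str:
    """Stand-in for a real OCR engine (an assumption for this sketch)."""
    return "Hello, scanned world"


page = add_text_to_image(Page(), "scan.png", fake_recognize)
print(page.layers)
```

Passing the recognizer in as a function keeps the sketch independent of any particular OCR engine; a real implementation would swap the stub for an actual OCR call.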


To see our OCR code samples in action, visit our GitHub page. And if you're ready to start OCR'ing those documents right away, go ahead and request a free trial of Adobe PDF Library today!