Optical Character Recognition (OCR) is a technology that recognizes text within images. It allows PDFsam Enhanced to differentiate this text from the rest of the image so you can edit it.
You can identify an image by the red border that surrounds it when you select it in Edit Mode.
When the whole page is one large image, the document is most likely made up of scanned pages. Without OCR, such pages cannot be edited easily.
If the OCR module is not available to you, you can purchase it here.
OCR Auto and OCR Manual
These options are only active when an individual image is selected. Rather than processing an entire document, you can work image by image. These features do not create a new file; they recognize the text of the image within the existing PDF.
The OCR module offers several tools designed for different circumstances.
Recognize Document
If you have a document made up of several scanned pages that need to be recognized and edited, choose the Recognize Document option.
- In the dialog box that appears, you can specify the pages to recognize.
- Click the Recognize button.
A status bar appears, indicating that PDFsam Enhanced is recognizing text. Click Cancel to stop the process.
When the recognition is finished, a new file opens in a separate tab with the text in all your images recognized. Your original file is not changed.
From External Image
To recognize the text of an external image and turn it into a PDF, choose the External Image option.
This will open a Browse window. Choose your file and it will open ready to be edited.
Scan and Recognize
This feature allows you to create a document directly from your scanner. As the new PDF file is created, OCR is applied to the scanned pages as well, making them ready to edit.