image capture with ocr

Question

Level 1

11 points

I am using Image Capture version 8 on an iMac with Ventura. When I scan there is an option to select OCR. The manual just says that it converts text to editable text (which is what I expected and wanted it to do) but it does not say where the text goes. A PDF file is created if that's what I selected, but the text does not go onto the clipboard and no other file, such as RTF, is created. Does anyone have an idea how to use this potentially useful feature?

Posted on Jan 5, 2023 7:50 AM

Answer 1

Top-ranking reply

etresoft

Level 9

51,628 points

Jan 5, 2023 1:23 PM in response to melodeonman

melodeonman wrote:

It's difficult to test this theory.

Maybe not. I did a quick test in Ventura. Preview will definitely use its own built-in OCR on a PDF that does not have embedded text. However, Safari still does not have this capability. Safari will allow you to select text in an old image PDF only if it has had the text embedded within.

You can open the PDF with Safari. If you can select the text, then Image Capture has properly created a hybrid image/text PDF for you.

Answer 2

PRP_53

Level 10

92,587 points

Jan 5, 2023 8:07 AM in response to melodeonman

This is from my limited experience with Optical Character Recognition (OCR) software.

Image Capture is the means of scanning a paper document with a scanner.

Most often, dedicated OCR software saves the document within that same application, to a PDF file, and in some cases to other file formats.

That same application will offer to process the scanned document and recognize its text as text.

In effect, nothing is copied to the Clipboard for later use elsewhere.

Answer 3

melodeonman Author

Level 1

11 points

Jan 5, 2023 8:26 AM in response to PRP_53

On further testing I have found that I can copy text from a pdf file using Preview and paste into Pages, where it retains its format. So that can be used as an OCR utility in association with a scanner. But when I scan with Image Capture and save into a pdf file I have exactly the same functionality whether or not I click the 'ocr' option. So I am still wondering what the ocr function does.

Answer 4

etresoft

Level 9

51,628 points

Jan 5, 2023 8:33 AM in response to melodeonman

melodeonman wrote:

when I scan with Image Capture and save into a pdf file I have exactly the same functionality whether or not I click the 'ocr' option. So I am still wondering what the ocr function does.

I don't have a scanner to test Image Capture with. But it sounds like it is creating a PDF with embedded text. However, because Preview already has OCR built in, you wouldn't ever notice the difference. You would have to open the PDF on some other system, such as Linux, Windows, or an older Mac, that does not have OCR built in. If you can still select text in the PDF there, then that is what it has done.
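For anyone without a second computer handy, the same check can be roughed out in a few lines of Python, which never runs OCR and so only sees what is stored in the file. This is a minimal sketch, not how Image Capture or Preview work internally: the function name `has_text_layer` is made up for illustration, and it assumes the PDF's page content streams are either uncompressed or Flate-compressed (other filters and encrypted files are not handled). It looks for a font-selection operator (`Tf`) together with a text-showing operator (`Tj` or `TJ`), which an image-only scan should not contain.

```python
import re
import sys
import zlib

# Body of each PDF stream object: "stream" EOL ... "endstream".
STREAM_RE = re.compile(rb"stream\r?\n(.*?)\r?\n?endstream", re.DOTALL)
# Font selection inside a content stream, e.g. "/F1 12 Tf".
FONT_OP_RE = re.compile(rb"/[^\s/]+\s+[\d.]+\s+Tf\b")
# Text-showing operators, e.g. "(Hello) Tj" or "[(He) 20 (llo)] TJ".
SHOW_OP_RE = re.compile(rb"[)\]>]\s*T[jJ]\b")


def has_text_layer(pdf_bytes: bytes) -> bool:
    """Heuristic: True if any stream contains text-drawing operators."""
    for match in STREAM_RE.finditer(pdf_bytes):
        raw = match.group(1)
        try:
            # Most content streams are Flate (zlib) compressed.
            data = zlib.decompressobj().decompress(raw)
        except zlib.error:
            # Not Flate data; treat it as an uncompressed stream.
            data = raw
        if FONT_OP_RE.search(data) and SHOW_OP_RE.search(data):
            return True
    return False


if __name__ == "__main__":
    # Usage: python3 check_pdf_text.py scan.pdf [more.pdf ...]
    for path in sys.argv[1:]:
        with open(path, "rb") as fh:
            found = has_text_layer(fh.read())
        print(f"{path}: {'embedded text found' if found else 'no embedded text found'}")
```

Saved under any file name (the `check_pdf_text.py` in the usage comment is only an example), it can be run against one scan made with the OCR option and one made without. A "no embedded text found" result on an unusual PDF is not proof, since this is only a pattern match on the file contents.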

Answer 5

melodeonman Author

Level 1

11 points

Jan 5, 2023 9:48 AM in response to etresoft

It's difficult to test this theory. I do have a Windows 10 PC, but the only software that successfully reads a pdf file is Acrobat Reader. This does not let me select text without using the premium version, which I don't intend to buy. It's good to know that this functionality is built in to Preview (so many things that it does that are not really promoted). Because I scan a fair few documents it's a shame that I cannot directly convert to text from Image Capture and into a text document without having to copy and paste.

Answer 6

Jan 5, 2023 11:12 AM in response to melodeonman

I tend to think that the Image Capture OCR is just changing imaged text to selectable text in the same PDF, and not extracting the text elsewhere.

As for Windows 10, you might alternatively consider the free Foxit PDF Reader.

Answer 7

Jan 5, 2023 11:47 AM in response to melodeonman

Technically, Monterey 12.x added a feature Apple calls Live Text: the ability to capture text from images, a sort of built-in OCR. I've found it tremendously valuable when my customers send me screenshots and I need to capture the text details. It simply works in Photos, Preview, and Safari, and on macOS, iOS, and iPadOS. If an image application doesn't support Live Text, you can move the image into Preview and it will work. You can highlight text with the cursor in most images, copy it to the clipboard, and paste it. Sometimes there is a typo, such as confusing a Q with a zero, so you may need to spellcheck the result or look for typos. But that has been common with OCR technology in general.

This all still works in Ventura as well.

Answer 8

melodeonman Author

Level 1

11 points

Jan 6, 2023 5:34 AM in response to etresoft

Thanks for all these replies. I read details of the pdf container format and now understand what is happening. I scanned a document and did not select the OCR button. Neither Preview nor Safari would allow me to search or edit text. I repeated, but this time clicking the OCR button. I could then edit text in Preview, but not with Safari, showing that it does not look at the embedded text. So all solved - I just need to click the OCR button and the text will be embedded so that I can make use of it later, or extract the text into a separate file.
