Copy and paste from PDF in Preview results in gibberish

When I copy and then paste text from a PDF in Preview into Pages, the resulting text is gibberish symbols. The odd thing is that it happens with only some of the text in the PDF, and other PDFs show the same behavior.

I have a screen recording to demonstrate my problem. http://blip.tv/file/2823667 View it in full screen to see the text.

MacBook 2.2Ghz Core 2 Duo, Mac OS X (10.6.1), iWork '09

Posted on Nov 8, 2009 5:40 PM

Nov 15, 2009 9:21 PM in response to skidogallard

I believe the following topic is related:

http://discussions.apple.com/thread.jspa?threadID=1743102

For me, I've noticed the problem occurs after I make and save a second annotation (a highlight, to be exact). For whatever reason, the first annotation can be saved without breaking the ability to copy text. After the second annotation and save, the ability to copy text is broken.

FWIW, the following line of text:...

Most students appear to like the use of overheads


Will, after annotations, copy out as :...

\\>&.-$ .-52#,-.$ ((#/$ -&$ )+9#$ -"#$ 5.#$ &$ &0#/"#*2.


Each space character appears to be mapped to $ + space character. Lower case s is mapped to dot ("."). Lower case a is mapped to *. Lower case t is mapped to a hyphen ("-"). Is there some kind of pattern? Might this point to an answer/solution?
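One way to test for a pattern is to line up the original and pasted words and check whether every character always maps to the same symbol. The Python sketch below does that with the sample above. It is only an illustration: the names PAIRS and build_mapping are made up for this example, and it uses only the words whose pasted form has the same length as the original ("Most", "appear" and "of" do not line up character for character in the sample as posted, so they are left out).

```python
# Word pairs copied from the sample above. Only words whose pasted form
# has the same length as the original are included (an assumption made
# so that characters can be aligned one-to-one).
PAIRS = [
    ("students", ".-52#,-."),
    ("to", "-&"),
    ("like", ")+9#"),
    ("the", '-"#'),
    ("use", "5.#"),
    ("overheads", '&0#/"#*2.'),
]


def build_mapping(pairs):
    """Return {original_char: pasted_char}.

    Raises ValueError if any character maps to two different symbols,
    or if two different characters share the same symbol.
    """
    forward = {}   # original character -> pasted character
    backward = {}  # pasted character -> original character
    for plain, garbled in pairs:
        if len(plain) != len(garbled):
            raise ValueError(f"length mismatch: {plain!r} vs {garbled!r}")
        for p, g in zip(plain, garbled):
            if forward.setdefault(p, g) != g or backward.setdefault(g, p) != p:
                raise ValueError(f"inconsistent mapping for {p!r} / {g!r}")
    return forward


mapping = build_mapping(PAIRS)

# List the table ordered by the code point of the pasted symbol, which
# makes it easier to look for a numbering pattern.
for plain_char, pasted_char in sorted(mapping.items(), key=lambda kv: ord(kv[1])):
    print(f"{plain_char!r} -> {pasted_char!r} (U+{ord(pasted_char):04X})")
```

If this runs without raising an error, the sample is at least consistent with a fixed one-to-one substitution of character codes rather than random corruption. It does not show where that substitution comes from.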

My source PDF allows me full permissions. There are no security locks.

Nov 8, 2009 6:02 PM in response to skidogallard

The general approach at this time is to ask if you've checked for any problematic fonts (all languages) with Apple's Font Book (look in the Applications folder). Find and remove all duplicates also.

Start there to be sure all fonts that are in play come out with a clean bill of health.

Don't hesitate to perform wholesale deletion of old and/or little-used fonts. Be skeptical of anything that has come from Office 2008, including fonts related to an Equation Editor installation.

Nov 9, 2009 11:44 AM in response to skidogallard

Can you provide a link to a publicly available PDF that produces this problem?

Apple did make changes to PDF copying and pasting in an attempt to extract higher quality text. Sometimes you would get spaces between each letter.

If I had such a PDF I could compare the differences in 10.5.8 and 10.6 and provide a bug report (with offending PDF document) to Apple.

Nov 9, 2009 12:31 PM in response to skidogallard

http://dl.dropbox.com/u/86630/internet%20use%20decreases%20social%20interaction.pdf


It looks to me like that part of the text has been transcoded into the Unicode Private Use Area for some reason. I've seen this in various other PDFs, and I think it is a bug in the app which produced the PDF. PUA code points are undefined, so the OS displays the Last Resort Font symbol for that range.
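For anyone who wants to check pasted text for this, here is a small Python sketch. It relies on the standard unicodedata module, where Private Use characters have the general category "Co"; the function name private_use_chars is just a name chosen for this example.

```python
import unicodedata


def private_use_chars(text):
    """Return (index, code point) pairs for each Private Use character in text."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if unicodedata.category(ch) == "Co"  # "Co" = Other, private use
    ]


# Example with a made-up string: U+F020 lies in the BMP Private Use Area.
sample = "ab\uf020c"
print(private_use_chars(sample))
```

An empty result would mean the pasted text contains no Private Use code points, in which case the gibberish has some other cause.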

Nov 9, 2009 1:38 PM in response to Tom Gewecke

Tom Gewecke wrote:
Perhaps it was done on purpose


The document properties say there are no restrictions on copying, printing, changing, or saving. Adobe Reader's properties indicate that some non-standard fonts are used; presumably that is what happened in the parts that can't be copied.


I was wondering whether they perhaps wanted to prevent just that portion from being copied.

Nov 16, 2009 10:33 AM in response to etresoft

Three files are posted:

http://www.kinasevych.ca/c/200911/16/test.pdf

http://www.kinasevych.ca/c/200911/16/test.2.pdf

http://www.kinasevych.ca/c/200911/16/test.3.pdf

The first file, test.pdf, was opened in Preview (Version 4.2 (469.5)) and annotations were added. The file was saved as test.2.pdf, opened again, annotations added and deleted, then saved.

In Preview, text can be copied out correctly from test.pdf. Not so with test.2.pdf. My Mac OS X version is 10.5.8.

I opened both files in Acrobat Pro 9.2.0. Copying text produces the same result as in Preview: test.pdf is okay, test.2.pdf is broken.

I created the file test.3.pdf by opening test.pdf in Acrobat Pro, saving as test.3.pdf, and annotating the document, saving, then quitting Acrobat. Opened test.3.pdf again, annotated some more, saved and closed. After this process, text can still be copied correctly from test.3.pdf. This would suggest that Preview is generating the error, but only in certain documents.

Note that test.pdf is a single page from a document where I found this problem. When I created a simple text document in TextWrangler or in OpenOffice, then did Print>Save as PDF, neither instance generated a file which became corrupted like the test files given here. There is something about test.pdf that lends itself to the error described in this thread.

Nov 16, 2009 11:36 AM in response to okinasevych

Excellent job!

Now you need to file the bug report with Apple. Be as specific as possible. All you need to give them is the original. Then say:
1) Add highlight annotation to word "color" in sentence "Professors of color have published poignant accounts of harshly negative student evaluations." and save as test2.pdf
2) Add strikethrough annotation to word "color" and save as test3.pdf.
3) Open test3.pdf and try to copy that sentence.
