38 Replies Latest reply: Mar 23, 2010 4:42 PM by Tom Gewecke
  • Tom Gewecke Level 9 (74,985 points)
    I did print that from the web to PDF while at school on a Windows box. I can't remember what the PDF printer was though.


    If you look at the inspector, it says content creator PScript5.dll Version 5.2.2
  • okinasevych Level 1 (0 points)
    I believe the following topic is related:

    http://discussions.apple.com/thread.jspa?threadID=1743102

    For me, I've noticed the problem occurs after I make and save a second annotation (a highlight, to be exact). For whatever reason, the first annotation can be saved without breaking the ability to copy text. After the second annotation and save, the ability to copy text is broken.

    FWIW, the following line of text:...

    Most students appear to like the use of overheads


    Will, after annotations, copy out as:...

    \\>&.-$ .-52#,-.$ ((#/$ -&$ )+9#$ -"#$ 5.#$ &$ &0#/"#*2.


    Each space character appears to be mapped to "$" followed by a space. Lower case s is mapped to a dot ("."). Lower case a is mapped to an asterisk ("*"). Lower case t is mapped to a hyphen ("-"). Is there some kind of pattern? Might this point to an answer/solution?

    My source PDF allows me full permissions. There are no security locks.
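
    One quick way to probe the "is there a pattern?" question is to test whether the substitution is a fixed code point offset (a Caesar-style shift). This is only a sketch: it uses the three letter mappings reported above, and the function names are made up for illustration.

```python
# Mappings reported in the post: plain character -> character seen after copying.
observed = {"s": ".", "a": "*", "t": "-"}


def shifts(mapping):
    """Return the code point difference for each plain -> garbled pair."""
    return {plain: ord(plain) - ord(garbled) for plain, garbled in mapping.items()}


def is_constant_shift(mapping):
    """True only if every pair differs by the same code point offset."""
    return len(set(shifts(mapping).values())) == 1


print(shifts(observed))
print(is_constant_shift(observed))
```

    The three differences are not equal, so a single fixed offset does not explain these mappings. Whatever the substitution is, it is not a simple shift.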
  • skidogallard Level 1 (0 points)
    The part about annotations makes sense, as I had been highlighting my PDFs.
  • etresoft Level 7 (26,600 points)
    Can you provide a source document and a step-by-step sequence of annotations that cause the corruption? And describe what tools you use? I tried a simple example using Preview and can't get it to fail.
  • okinasevych Level 1 (0 points)
    Three files are posted:

    http://www.kinasevych.ca/c/200911/16/test.pdf

    http://www.kinasevych.ca/c/200911/16/test.2.pdf

    http://www.kinasevych.ca/c/200911/16/test.3.pdf

    The first file, test.pdf, was opened in Preview (Version 4.2 (469.5)) and annotations were added. The file was saved as test.2.pdf, opened again, annotations added and deleted, then saved.

    In Preview, text can be copied out correctly from test.pdf. Not so with test.2.pdf. My Mac OS X version is 10.5.8.

    I opened both files in Acrobat Pro 9.2.0. Copying text produces the same result as in Preview: test.pdf is okay, test.2.pdf is broken.

    I created the file test.3.pdf by opening test.pdf in Acrobat Pro, saving as test.3.pdf, and annotating the document, saving, then quitting Acrobat. Opened test.3.pdf again, annotated some more, saved and closed. After this process, text can still be copied correctly from test.3.pdf. This would suggest that Preview is generating the error, but only in certain documents.

    Note that test.pdf is a single page from a document where I found this problem. When I created a simple text document in TextWrangler or in OpenOffice, then did Print > Save as PDF, neither produced a file that became corrupted like the test files given here. There is something about test.pdf that lends itself to the error described in this thread.
  • etresoft Level 7 (26,600 points)
    Excellent job!

    Now you need to file the bug report with Apple. Be as specific as possible. All you need to give them is the original. Then say:
    1) Add highlight annotation to word "color" in sentence "Professors of color have published poignant accounts of harshly negative student evaluations." and save as test2.pdf
    2) Add strikethrough annotation to word "color" and save as test3.pdf.
    3) Open test3.pdf and try to copy that sentence.
  • Bob Spaulding Level 1 (25 points)
    Here's how I worked around the problem. I had the same problem, but didn't do anything myself to annotate the document I downloaded from the internet. In any case, this requires the full version of Adobe Acrobat or scanner software with OCR (optical character recognition).

    IF YOU HAVE THE FULL VERSION OF ADOBE ACROBAT:
    1. Use Preview to save the document as a TIFF file, preferably at 300 dpi. This took about 20 seconds per page on my MacBook, so be prepared to wait for longer documents.
    2. Use Preview to open the resulting TIFF file (if necessary), and save it again as a PDF. This essentially converts the text to an image, discarding the erroneous character information.
    3. Open the resulting document in Acrobat.
    4. Choose the menu item "Document > Recognize Text Using OCR > Start" (at least in my version of Acrobat), and choose "all pages" in the resulting dialogue box.
    5. Wait (again about 20 seconds per page).
    6. Save changes to the document, which now should work exactly as expected.

    IF YOU DON'T HAVE ACROBAT, BUT DO HAVE A SCANNER:
    1. Print out the PDF.
    2. Scan the PDF you just printed out.
    3. Use the OCR software options (usually part of the scanning software) to perform optical character recognition on the document you scan.
    4. Save the resulting document, which now should include selectable text.

    No easy answers, but answers nonetheless.
  • DutchHarris Level 1 (0 points)
    I have encountered a comparable problem. I use (and love) Preview to annotate PDFs while copying text quotes out of the PDF into my quotation database (normal scientific work). I save the PDFs with the annotations. This worked perfectly (OS X 10.6.2) until today.
    I loaded a World Bank document from 2005:
    http://siteresources.worldbank.org/INTURBANDEVELOPMENT/Resources/dynamicsurbanexpansion.pdf
    At 21 MB it is quite large. After I saved my annotation changes, copying text from the document produced the gibberish described in this thread. Before saving, it did not.
    With other documents this did not happen.
    Perhaps it's the size, or the age, of the PDF document.
  • DutchHarris Level 1 (0 points)
    I just tried it with a smaller document from the same family (World Bank 2005), with the same gibberish result. That is, after I made an annotation and saved that change, copying the text delivered the "gibberish" result.
  • DutchHarris Level 1 (0 points)
    Last try... I used the Print > Save as PDF function to see whether that would produce a usable, savable annotated PDF file. Alas, to no avail: this resulted in "gibberish" as well.

    It still seems likely that older PDF files are the cause.
  • Sasha Harris-Cronin Level 1 (0 points)
    I am having the same problem. It also happens specifically after I do two annotations and save the document. Unfortunately, I only just found this out after annotating around 20 documents, which means that I either have to re-download all of them and find the annotated bits to copy or I have to hand type.

    Any solutions yet?
  • MeBeMac Level 1 (15 points)
    I have the same issue. Ugh!
  • William Doane Level 1 (0 points)
    I've been able to replicate the problem with a reasonably straightforward case.

    I used a freshly downloaded ebook from O'Reilly, Head First iPhone Development. That file reports that it was created by Adobe PDF Library 9.0 and Adobe InDesign CS4 (6.0). My steps were:
    * went to logical page 37 (chapter 1 page 3).
    * selected a line of text that read "ported desktop apps"
    * copied; SUCCESSFUL
    * annotated that line with a highlight
    * selected "ported desktop apps" again
    * copied; SUCCESSFUL
    * saved
    * selected "ported desktop apps" again
    * copied; SUCCESSFUL
    * selected the first occurrence of the word "about" on the page
    * annotated with a highlight
    * selected "ported desktop apps" again
    * copied; SUCCESSFUL
    * saved
    * selected "ported desktop apps" again
    * copied; FAILED!

    The results of each copy can be seen in this screenshot showing iClip's contents (last at the top; first at the bottom) after each copy operation: http://dl.dropbox.com/u/1184374/resultsOfCopy.png

    I was NOT able to replicate the failure in simple PDFs that were generated from
    (a) TextEdit using Lorem Ipsum text, (b) Acrobat 9 Pro creating a PDF from a webpage [yahoo.com], or (c) Firefox printing a webpage to PDF.

    -Wil
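
    For anyone repeating the copy test above across many files, a rough check on the pasted text could look like the sketch below. It is only a heuristic under stated assumptions: the function name and the 0.5 threshold are invented for illustration, and it relies on the observation that the broken output posted earlier in this thread contains almost no letters. The two sample strings are copied from this thread.

```python
def looks_garbled(text, threshold=0.5):
    """Heuristic: flag text whose visible characters are mostly non-letters.

    The threshold is an assumed value, not something measured on real files.
    """
    visible = [ch for ch in text if not ch.isspace()]
    if not visible:
        return False  # nothing to judge
    letters = sum(ch.isalpha() for ch in visible)
    return letters / len(visible) < threshold


# A successful copy from the steps above.
good = "ported desktop apps"
# The broken copy posted earlier in the thread (raw string keeps the backslashes).
bad = r'\\>&.-$ .-52#,-.$ ((#/$ -&$ )+9#$ -"#$ 5.#$ &$ &0#/"#*2.'

print(looks_garbled(good))
print(looks_garbled(bad))
```

    Text that is legitimately heavy on digits or symbols (tables, formulas) would trip this check too, so treat a positive result as a hint to look at the file rather than as proof.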
  • William Doane Level 1 (0 points)
    Additionally, I was able to reproduce the problem in the PDF mentioned above, http://siteresources.worldbank.org/INTURBANDEVELOPMENT/Resources/dynamicsurbanexpansion.pdf

    Reported to Apple as bug #7622343

  • acastro Level 1 (0 points)
    I have this issue too, but I'm on 10.5.8 and running Preview.app 4.2 (469.5). I downloaded the Practical symfony PDF [1] and was able to copy and paste successfully until I saved the file back to my hard drive under a different name. I first noticed it when I added a note to a previously downloaded file and could no longer copy and paste without strange characters showing up instead.

    Thoughts?

    1. http://www.symfony-project.org/get/pdf/jobeet-1.4-doctrine-en.pdf