Sierra produces corrupt pdf files

A colleague just bought a new Mac running 10.12. The PDF files he creates, e.g. from Safari, are all invalid.

Running ps2pdf on them says:


**** Warning: considering '0000000000 XXXXX n' as a free entry.

**** Warning: considering '0000000000 XXXXX n' as a free entry.


**** This file had errors that were repaired or ignored.

**** The file was produced by:

**** >>>> Mac OS X 10.12 Quartz PDFContext <<<<

**** Please notify the author of the software that produced this

**** file that it does not conform to Adobe's published PDF

**** specification.

These files actually crash the pdf viewer that I usually use (atril), though the errors are silently ignored in acroread.
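For context, the warning refers to the cross-reference (xref) table, where each classic entry is a 10-digit byte offset, a 5-digit generation number, and `n` (in use) or `f` (free). The following is a minimal sketch in Python, not ghostscript's actual check: it assumes a classic (non-stream) xref table, and the function name is hypothetical. It flags entries marked in use whose offset is zero, which is the pattern the warning text suggests.

```python
import re

# One classic xref entry: 10-digit offset, 5-digit generation, 'n' or 'f'.
XREF_ENTRY = re.compile(rb"(\d{10}) (\d{5}) ([nf])")


def suspicious_xref_entries(pdf_bytes):
    """Return (offset, generation) for in-use entries whose offset is 0.

    Hypothetical helper: it scans the whole file with a regex, so it can
    give false matches inside binary streams. Illustration only.
    """
    flagged = []
    for match in XREF_ENTRY.finditer(pdf_bytes):
        offset, generation, kind = match.groups()
        if kind == b"n" and int(offset) == 0:
            flagged.append((int(offset), int(generation)))
    return flagged


# Made-up xref fragment: the third entry claims to be in use at offset 0.
sample = (
    b"xref\n0 3\n"
    b"0000000000 65535 f \n"
    b"0000000015 00000 n \n"
    b"0000000000 00000 n \n"
)
print(suspicious_xref_entries(sample))
```

On a real file you would pass the bytes read with `open(path, "rb").read()`; an empty list only means this particular pattern was not found.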

macOS Sierra (10.12)

Posted on Nov 22, 2016 10:36 PM

22 replies

Nov 23, 2016 3:51 PM in response to dialabrain

ps2pdf is actually a perfectly good tool for fixing broken pdf files. It uses ghostscript to rerender postscript or pdf into pdf. The original error message was:


**** >>>> Mac OS X 10.12 Quartz PDFContext <<<<

**** Please notify the author of the software that produced this

**** file that it does not conform to Adobe's published PDF

**** specification.


which says that the file does not conform to the PDF specification. So you can choose to ignore ghostscript's claim that the pdf file is corrupt if you like, but it is evidence. The pdf that ps2pdf produces from Sierra's renderer plays nicely everywhere I've tried it.


If you don't like that I'm running ps2pdf, you can get the same error message from just running gs to render the pdf.
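As a rough sketch of that check, the Python below builds a `gs` command that renders to the `nullpage` device (so nothing is written) and pulls out the `****` diagnostic lines. The function names are hypothetical, and it assumes `gs` is on the PATH; `check_pdf` is defined but not called here.

```python
import subprocess


def gs_check_command(pdf_path):
    """Command line that renders a PDF with ghostscript and discards the pages."""
    return ["gs", "-dBATCH", "-dNOPAUSE", "-sDEVICE=nullpage", pdf_path]


def extract_diagnostics(output_text):
    """Keep only the lines that ghostscript prefixes with '****'."""
    return [
        line.strip()
        for line in output_text.splitlines()
        if line.strip().startswith("****")
    ]


def check_pdf(pdf_path):
    """Run ghostscript on the file and return its '****' diagnostics."""
    result = subprocess.run(
        gs_check_command(pdf_path), capture_output=True, text=True
    )
    return extract_diagnostics(result.stdout + result.stderr)


# Demonstrate the filtering on two lines taken from the output quoted above.
sample_output = (
    "Processing pages 1 through 1.\n"
    "   **** This file had errors that were repaired or ignored.\n"
    "   **** >>>> Mac OS X 10.12 Quartz PDFContext <<<<\n"
)
for line in extract_diagnostics(sample_output):
    print(line)
```

An empty list from `check_pdf` would mean ghostscript printed no `****` lines for that file, not that the file is guaranteed to be valid.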

Nov 23, 2016 5:32 PM in response to dialabrain

I don't have much more to say here. The pdfs that I was sent that are broken claim to be pdf 1.3, so I presume that that is the version that ghostscript is comparing to, but that is just a presumption.
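The version a file "claims" is the one in its header line, which looks like `%PDF-1.3`. A minimal Python sketch for reading it follows; the function name is hypothetical, and it assumes the header sits at the very start of the file. From PDF 1.4 onward a `/Version` entry in the document catalog can override the header, which this sketch does not look at.

```python
import re


def claimed_pdf_version(pdf_bytes):
    """Return the version string from the '%PDF-x.y' header, or None."""
    match = re.match(rb"%PDF-(\d+\.\d+)", pdf_bytes)
    return match.group(1).decode("ascii") if match else None


# Made-up header bytes in the shape a PDF 1.3 file would start with.
print(claimed_pdf_version(b"%PDF-1.3\n%\xc4\xe5\xf2\xe5\n"))
```

For a real file, pass in the first few bytes, e.g. `open(path, "rb").read(16)`.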


I don't see documentation for ghostscript that discusses validity checks, but clearly ghostscript's authors included code that produces that error message, and suggests notifying the authors of the application that produced the pdf. Perhaps they didn't feel the need to document it because the error message is self-explanatory.

Nov 24, 2016 8:11 AM in response to camichal

Just to ensure we're talking about the same thing, is this the converter site you're talking about?


You mention that you could convert a PDF, despite what the site implies. So I tried it with a small PDF and got exactly the type of result I expected.



Input file is not a PostScript file.


A similar site is here. But again, it expects a PostScript file to do anything, since like the first, it works on the same idea as the Adobe Distiller. It expects only PostScript data to work with.


So unless you mean something else entirely, I can't imagine how you got either one of these sites to work with a PDF at all, unless you did something like changing the file name from test.pdf to test.ps. That of course does not make the PDF a PostScript file. It's now just a misnamed PDF file.

Nov 24, 2016 8:33 AM in response to Kurt Lang

Er, no. ps2pdf is basically a wrapper script that calls ghostscript to distill post-script or pdf into pdf. If you want to know more about ghostscript, see https://en.wikipedia.org/wiki/Ghostscript or http://www.ghostscript.com/ and http://www.ghostscript.com/doc/9.20/WhatIsGS.htm#GhostPDF


The pdfs that I was sent that are broken claim to be pdf 1.3, so I presume that that is the version that ghostscript is comparing to, but that is just a presumption.


I don't see documentation for ghostscript that discusses validity checks, but clearly ghostscript's authors included code that produces that error message, and suggests notifying the authors of the application that produced the pdf. Perhaps they didn't feel the need to document it because the error message is self-explanatory.

Nov 24, 2016 9:00 AM in response to camichal

Er, no. ps2pdf is basically a wrapper script that calls ghostscript to distill post-script or pdf into pdf.

Which is exactly what the two sites I linked to do (per the underlined part of your post). They tell you that.


What I don't understand at all (and I've been in the digital end of printing for over 35 years), is why anyone would even bother to try and render a PDF to another PDF. Render what? It's already in that format. That's like squeezing an orange and expecting a new unpeeled orange to appear.


Per one of the sites you linked to:


What is GhostPDF?

GhostPDF is an interpreter built on top of Ghostscript that handles PDF files. Currently GhostPDF relies on extensions to the PostScript language/imaging model, and so cannot be used independently of the Ghostscript PostScript interpreter component. As such GhostPDF is an umbrella term used to refer to both these extensions and the interpreter code.

Many people (including the authors) frequently just refer to Ghostscript as supporting PDF and only specifically mention GhostPDF when wanting to make the distinction between the PostScript and PDF support.

GhostPDF is included in the Ghostscript binaries for various systems available from www.ghostscript.com/download. The source can be found in both the Ghostscript and GhostPDL downloads from the same site.


They keep mentioning "interpreter". Interpret what? There's nothing to interpret when the input and output format is the same.

Nov 24, 2016 9:15 AM in response to Kurt Lang

I've encountered many pdf files that had problems: e.g., they were huge, or had transparency issues that didn't render well on some devices, or had corruption issues like these, and ghostscript fixes them. It can also change which version of pdf is used in the document: the version of ghostscript I have installed supports creating pdf 1.2, 1.3 or 1.4.


PostScript is a document description language: the document is essentially a program that contains instructions like "move to these coordinates", "draw a character here", etc. The ghostscript interpreter interprets these commands and renders them in one of many formats for a huge number of devices, going all the way back to old-fashioned 9-pin dot-matrix printers. PDF is closely related to PostScript. I believe most of ghostscript's pdf interpreter is actually written in PostScript.
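To make the "document is a program" idea concrete, here is a toy stack interpreter in Python for a tiny made-up subset of PostScript (numbers, parenthesised strings, `moveto`, `show`). It is only an illustration and bears no relation to how ghostscript is implemented; real PostScript also needs a font selected before `show`, and `show` advances the current point, both of which are ignored here.

```python
import re

# Tokens: a parenthesised string, or any run of non-space characters.
TOKEN = re.compile(r"\(([^)]*)\)|(\S+)")


def run_toy_postscript(program):
    """Interpret the toy subset and return a list of (position, text) pairs."""
    stack = []
    position = (0.0, 0.0)
    drawn = []
    for match in TOKEN.finditer(program):
        string_literal, word = match.groups()
        if string_literal is not None:
            stack.append(string_literal)      # string operand
        elif word == "moveto":
            y = stack.pop()                   # operands come off in reverse
            x = stack.pop()
            position = (x, y)
        elif word == "show":
            drawn.append((position, stack.pop()))
        else:
            stack.append(float(word))         # numeric operand
    return drawn


print(run_toy_postscript("72 720 moveto (Hello) show"))
```

The operands are pushed first and the operator consumes them afterwards, which is why the coordinates come before `moveto` in the program text.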

The ghostscript pdfwrite "device" takes whatever input is given to it (either postscript or pdf) and interprets the document and creates new pdf using its own pdf driver.
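A sketch of what such a rewrite looks like on the command line, built as an argument list in Python. The file names and the function name are hypothetical; `-sDEVICE=pdfwrite`, `-dCompatibilityLevel` and `-sOutputFile` are standard ghostscript options, but which compatibility levels are accepted depends on the installed version.

```python
def pdfwrite_command(src, dst, compatibility_level="1.4"):
    """Build a gs command that re-interprets src and writes a fresh PDF to dst."""
    return [
        "gs",
        "-dBATCH",                                  # exit when the input is done
        "-dNOPAUSE",                                # do not prompt between pages
        "-sDEVICE=pdfwrite",                        # output through the pdf driver
        f"-dCompatibilityLevel={compatibility_level}",
        f"-sOutputFile={dst}",
        src,
    ]


# Hypothetical file names, shown as the command one would type in a shell.
print(" ".join(pdfwrite_command("broken.pdf", "fixed.pdf", "1.3")))
```

To actually run it, the list can be handed to `subprocess.run(...)`, assuming `gs` is installed and on the PATH.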


Ghostscript is really very widely used. It's an important piece of CUPS, the Unix printing system (which Apple bought in 2007, and I thought was used on Macs?).
