Preview "save as" creates bigger file than input. Why?

I have a PDF of a 12 page document, about 600 KB file size. I open it in Preview, and "save as" another file. The new PDF is about 7 MB file size, more than 10 times the original file size.


Additional info:


1. If I just wanted another copy of the PDF I could copy the file. I know that. But I used Preview with "Create Booklet" and the booklet file was about 7 MB. I used "save as" to show that "Create Booklet" has nothing to do with increasing the file size.


2. The PDF was created in Scribus. Scribus outlines fonts it thinks it's not allowed to embed, and the text in the PDF is almost all in outlined fonts.


3. The PDF produced from Scribus is about 1.5 MB file size. Processing it with Ghostscript, using "gs -sDEVICE=pdfwrite ..." reduced the file size to about 600 KB.


4. I have another DTP application that embeds the same fonts. The PDF output is a little over 1 MB, Ghostscript shrinks it to well under 1 MB, and it does not become larger when saved from Preview.
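
For anyone who wants to repeat the Ghostscript step from item 3 and measure the result, here is a minimal Python sketch. The options in the original command are elided, so the flags below (-dNOPAUSE, -dBATCH, -sOutputFile) are only an assumption about a typical pdfwrite pass, and the function names and file names are made up for illustration.

```python
import os
import subprocess


def build_gs_command(src, dst):
    """Return an argument list for a plain Ghostscript pdfwrite pass.

    Assumption: these are commonly used pdfwrite flags, not necessarily
    the ones used in the post (which elides them).
    """
    return [
        "gs",
        "-sDEVICE=pdfwrite",
        "-dNOPAUSE",
        "-dBATCH",
        "-sOutputFile=" + dst,
        src,
    ]


def size_ratio(before_bytes, after_bytes):
    """Return output size divided by input size (below 1 means smaller)."""
    return after_bytes / before_bytes


def rewrite_and_compare(src, dst):
    """Run Ghostscript on src, write dst, and return the size ratio."""
    subprocess.run(build_gs_command(src, dst), check=True)
    return size_ratio(os.path.getsize(src), os.path.getsize(dst))
```

Calling rewrite_and_compare("input.pdf", "output.pdf") needs gs on the PATH. The size_ratio helper works just as well for comparing a copy saved from Preview against its input.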


Bottom line: After Scribus output and processing with Ghostscript, the PDF has outlined fonts in well under 1 MB file size. Open it in Preview and save it, and the file becomes ten times larger. Why?

Posted on Apr 22, 2012 5:17 PM

28 replies

May 3, 2012 7:17 AM in response to Kurt Lang

I just looked over all of the versions I have. The OS 9, Office 2008 and Office 2011 versions have no restrictions at all. However, the Type 1 PS version of Lucida Sans has "Only printing and previewing of the document is allowed (read-only)" set on all four styles. The following are the four choices you can place on a font:


[Screenshot: user uploaded file showing the drop down menu of font embedding choices]


What you can't see in this screen shot is the checkbox behind the drop down menu choices. That's to Allow subsetting, which is turned on.


So if the Type 1 PS version of Lucida Sans is what you used, that's what may be causing the issue. Scribus is seeing the copyright restriction of only printing and previewing being allowed, and not looking past that. Since subset embedding is allowed, it should still be copying the font into the new document as is instead of turning it into Type 3 embedded fonts. Preview is compounding the issue by turning all of those conversions into artwork.

Apr 22, 2012 7:29 PM in response to etresoft

Well, it's not the images, because Preview doesn't make the file bigger if I change the fonts to fonts that Scribus will embed. I don't think it's rasterizing the fonts because at maximum magnification in Preview the glyph outlines look perfectly sharp.


Anyway, a PDF created by Scribus with outline fonts is at http://www.heartsofjersey.org/pdf/apr12.pdf , and a PDF created with embedded fonts is at http://www.heartsofjersey.org/pdf/mar12.pdf . You can download them, open them in Preview and save them from Preview to see what happens.

Apr 23, 2012 6:06 PM in response to etresoft

There are bugs, misfeatures, and missing features in Scribus. If the output from Scribus is buggy, then, strictly speaking, Preview is not required to deal with it at all. But being able to handle buggy output is a market advantage, so it would be better if Preview could handle it.


I'm just asking why Preview handles the output from Scribus the way it does. That might involve asking what's special about the output from Scribus. There are some clues in the font listing that Adobe Reader shows, but the full answer might have to come from somebody who can understand what's in the PDF file itself.

Apr 23, 2012 7:12 PM in response to marty39

marty39 wrote:


There are bugs, misfeatures, and missing features in Scribus. If the output from Scribus is buggy, then, strictly speaking, Preview is not required to deal with it at all. But being able to handle buggy output is a market advantage, so it would be better if Preview could handle it.


For Scribus, that's true. If it were Acrobat, it wouldn't be true. Adobe really pulled a fast one with PDF. Release a "standard" and then keep updating it every year. But I digress. From my point of view, I can select text coming in to Preview and I lose the selectable text going out. I call that a bug.


I'm just asking why Preview handles the output from Scribus the way it does. That might involve asking what's special about the output from Scribus. There are some clues in the font listing that Adobe Reader shows, but the full answer might have to come from somebody who can understand what's in the PDF file itself.


Unfortunately, we don't have anyone with that level of knowledge. (If anyone wants to prove me wrong, now is the time to chime in.) I don't think Scribus is doing anything wrong. I have seen image PDFs that have been re-text-ified (for want of a real word) so that you can select the text again. So, there must be some facility in PDF to associate graphics with text strings. I don't like it that Preview loses that. The file size expansion may be a separate issue. I consider the loss of the text more important than the gain of 7 MB.

Apr 24, 2012 1:05 PM in response to etresoft

I think that settles the discussion, at least until somebody comes along who can analyze a PDF.


But the personal comment at the end, about which issue is more important, is not universally shared. In my application, the gain of 7 MB is more important than the loss of text.


As I noted in the beginning, the output from Preview that I really use is a bookfold version with the aid of "Create Booklet," which I email to a print shop that prints multiple copies on paper without reading it. Pardon me if I seem to be still living in the twentieth century, but I think 7 MB is a very big email attachment. The loss of text, on the other hand, is irrelevant.


Therefore I think both issues are important.

Apr 25, 2012 9:19 AM in response to etresoft

Hmmm. You tried SintraWorks PDF Nomad. I went to their website and they say it requires Lion. This is a Snow Leopard forum. I chose not to upgrade to the Lion.


I tried telling Scribus to export to PostScript (actually, to print PostScript to a file). Preview and Ghostscript both complained about PostScript errors. I told you Scribus is buggy.


Anyway, as to the issue of output PDF file size, I had great success with the BookletCreator app from http://bookletcreator.com/ (versions for Windows as well as Mac). The bookfold output had the same file size as the input PDF.


However, I also post the PDFs on the web (as in the examples I posted links to), and other people might have problems with them. So the solution, for me, is to use only embeddable fonts.


In many cases that means substituting a free font for a bundled font. Lower font quality is not important because the printing process is "low end" and the online PDF looks OK at screen resolution. But in a few cases an exact substitute is not available and I have to settle for a "look" that's not quite what I wanted.

Apr 25, 2012 10:20 AM in response to marty39

marty39 wrote:


Hmmm. You tried SintraWorks PDF Nomad. I went to their website and they say it requires Lion. This is a Snow Leopard forum. I chose not to upgrade to the Lion.


The older PDF Clerk Pro works in Snow Leopard. I hadn't tried PDF Nomad and thought that, if it worked, it might be easier.


I tried telling Scribus to export to PostScript (actually, to print PostScript to a file). Preview and Ghostscript both complained about PostScript errors. I told you Scribus is buggy.


Have you considered Apple Pages? I haven't pushed it for complex layout work, but it handled a 1300 page document pretty easily.



Apr 25, 2012 12:33 PM in response to marty39

Just for some quick testing results. I opened apr12.pdf in Preview under Snow Leopard and simply did a Save As to a new name. As you noted, the same document is suddenly 7.1 MB.


Opening the same PDF in the Adobe Acrobat Reader (version 10.1.3) and doing a Save As resulted in a slightly smaller PDF.


Preview is doing something very wrong. Giving them a quick look over, etresoft's thought that it may be rasterizing the fonts appears to be correct. The original and copy made from the Reader both have selectable text. The copy from Preview can't select anything. Which means the pages have been turned completely into raster images with no fonts being used at all.


Edit: Nope, wrong. I can't select the text in the 7.1 MB copy made by Preview, but I can if I open the PDF in Adobe Illustrator. So the text still is font outlines. The difference in size is being caused by the loss of the embedded fonts. The Reader maintains the embedded fonts in the new copy, so it remains small. When I open the copy from Preview in Illustrator, it notes that the expected fonts are missing.


So what it looks like Preview has done is turn every bit of text into individual outlines. Which means instead of having all text being displayed from the font's outlines (one outline set for everything), the text instead is now all outlines, as if you had drawn each individual letter by hand in Illustrator, Freehand or other vector editing program.
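
To put rough numbers on that, here is a back-of-the-envelope Python sketch of why drawing every letter separately costs so much more than pointing at one shared set of outlines. All of the figures (80 distinct glyphs, 30,000 characters, 200 bytes per outline, 1 byte per character code) are hypothetical values picked for illustration, not measurements taken from apr12.pdf.

```python
def shared_font_bytes(distinct_glyphs, occurrences, outline_bytes, ref_bytes=1):
    """Each outline is stored once; every character on the page is a short code."""
    return distinct_glyphs * outline_bytes + occurrences * ref_bytes


def inlined_outline_bytes(occurrences, outline_bytes):
    """Every character on the page carries its own full copy of the outline."""
    return occurrences * outline_bytes


# Hypothetical document: 80 distinct glyphs, 30,000 characters of text,
# about 200 bytes of path data per glyph outline.
shared = shared_font_bytes(80, 30_000, 200)
inlined = inlined_outline_bytes(30_000, 200)
print(shared, inlined, round(inlined / shared))
```

With these made-up inputs the shared version comes to 46,000 bytes and the inlined version to 6,000,000 bytes. Compression in the real file will change the absolute sizes, but the direction of the effect is the same.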

Apr 25, 2012 1:32 PM in response to Kurt Lang

I thought it was something related to the copyright too, but when Scribus actually embeds the fonts, it works as expected. It is Scribus that is converting the fonts into pure outlines, perhaps to stay legal. Then it is Preview that scrambles that.


I filed a bug report on it. If Apple chooses to fix it, I sincerely doubt any Snow Leopard users will ever see it.

Apr 25, 2012 2:07 PM in response to etresoft

I sincerely doubt any Snow Leopard users will ever see it.

That's just it. Snow Leopard is what caused the PDF to engorge itself to 7.1 MB in my test. And that simply by opening the PDF in Preview and doing a Save As to a new name. Doing the same thing with the Adobe Reader didn't cause that.

It is Scribus that is converting the fonts into pure outlines, perhaps to stay legal. Then it is Preview that scrambles that.

Nope, I found the real problem. I had only been looking at the documents in Illustrator (not the best way to examine it), Preview and the Acrobat Reader. Silly me, I should have been looking at it with Acrobat Pro.


There are no restrictions on the document at all, and no copyrights on the fonts. It uses 5 TrueType fonts. And the problem, 55 Type 3 PostScript fonts.


Acrobat Pro and the Reader have no trouble moving them into a new document intact. Not surprising, Type 3 PS was an Adobe creation and they can handle them any way they want.


Preview, however, cannot seem to handle Type 3 PS fonts. So every glyph using them is turned into a vector object. Examining the PDF copied by Preview in Acrobat Pro, the only embedded fonts left are the five TrueType fonts. All of the Type 3 PS fonts have been stripped and turned into vector art.
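
For those without Acrobat Pro, a rough version of the same check can be sketched in Python by scanning the raw PDF bytes for font dictionaries and tallying their /Subtype entries. This is only a sketch under assumptions: it only sees dictionaries that are stored uncompressed (a PDF that packs its objects into compressed object streams would show nothing), it may count a font twice if the file contains incremental updates, and the function names are made up for illustration.

```python
import re
from collections import Counter

# Font subtype names defined by the PDF format. The trailing \b keeps
# font-program subtypes such as /Type1C from being counted as /Type1.
FONT_SUBTYPE = re.compile(
    rb"/Subtype\s*/(Type0|Type1|MMType1|Type3|TrueType|CIDFontType0|CIDFontType2)\b"
)


def count_font_subtypes(pdf_bytes):
    """Tally font subtypes found in uncompressed dictionaries of a PDF."""
    return Counter(
        match.group(1).decode("ascii")
        for match in FONT_SUBTYPE.finditer(pdf_bytes)
    )


def count_font_subtypes_in_file(path):
    """Read a PDF from disk and tally its visible font subtypes."""
    with open(path, "rb") as handle:
        return count_font_subtypes(handle.read())
```

Running count_font_subtypes_in_file on the original and on the copy saved from Preview, and comparing the Type3 counts, would be one way to see whether those fonts survived the save.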

Apr 25, 2012 2:17 PM in response to Kurt Lang

Kurt Lang wrote:


I sincerely doubt any Snow Leopard users will ever see it.

That's just it. Snow Leopard is what caused the PDF to engorge itself to 7.1 MB in my test.



I meant Snow Leopard as opposed to Lion. The original poster doesn't want to upgrade to Lion. If Apple fixes this, I doubt it would show up until 10.8.3 at the earliest.


Nope, I found the real problem. I had only been looking at the documents in Illustrator (not the best way to examine it), Preview and the Acrobat Reader. Silly me, I should have been looking at it with Acrobat Pro.


There are no restrictions on the document at all, and no copyrights on the fonts. It uses 5 TrueType fonts. And the problem, 55 Type 3 PostScript fonts.


Acrobat Pro and the Reader have no trouble moving them into a new document intact. Not surprising, Type 3 PS was an Adobe creation and they can handle them any way they want.


You are just feeding my Adobe conspiracy theories now. They publish the specs and standards for PostScript and PDF and the world rejoices. However, they still retain their market dominance so they immediately start flooding the market with new versions that no other tools can interpret.


Then people show up here and complain about Apple. Adobe is effectively making more work for Apple to do. Sun did the same thing with Java and we all know how that turned out. To add insult to injury, Apple codes to said standards, which Adobe doesn't actually support themselves. Then Apple has to fix it again to be compliant with Adobe's de facto standards. Tell me again why Apple is getting out of the print market? PDF is dead. I can't stand it anymore.

Apr 25, 2012 5:02 PM in response to etresoft

No, I didn't consider Pages. Pages wants to be a word processor, with free text on the page, but I suppose it could be forced to do DTP with all the text in frames. It can't do text on a curve, like the logo at the top of the front page that I did within Scribus, but it can import a logo created in Inkscape.


But it can't import vector artwork, can it? If it can't import vector artwork, that's another sacrifice. The logo in the PDF output from Scribus is outlined, not rasterized, so it looks sharp at any magnification. If Pages has to rasterize it, that's trading one minor sacrifice for another.


The next question is, do you know how Pages deals with fonts marked "do not embed"? Does it do any better than Scribus?


I already have Scribus working, and I can live with its limitations. I'd rather bear those ills I have than fly to others that I know not of.


[ I really was curious about why Preview did such strange things with PDF output from Scribus containing outlined fonts, but I guess I'll have to live with not knowing. ]
