Automator - Render multiple PDF as JPG and keep PDF file name

Hi folks,


I need to render many 10-page PDF documents to JPGs. This thread was very helpful in demonstrating how to run a shell script and get the JPGs named and ordered the way I want.


https://discussions.apple.com/thread/6349891?answerId=26056331022#26056331022


But I'm wondering if it's possible to run this with multiple PDFs at once, 10 pages per file, and have every set of 10 JPGs have the same title as the original PDF (ordered 01 through 10).


Many many thanks in advance.


Posted on Apr 11, 2019 8:05 AM

11 replies

Apr 16, 2019 8:42 AM in response to raphael198

Using Homebrew, run the following in Terminal.


brew update

brew upgrade

brew install ghostscript

brew cleanup


The ghostscript 9.26_1 executable will be installed in /usr/local/bin as gs.


Life is too short to be dealing with internal Automator action mischief. I am changing the solution to a three-action Automator solution. Once you have Ghostscript 9.26_1 installed, edit your Automator application. You do this by launching Automator, cancelling its file dialog, and choosing Open Recent… and your application name.


Keep only these three actions:

  1. Ask for Finder Items
  2. Filter Finder Items
  3. Run Shell Script


Clear the contents from the Run Shell Script action, and copy/paste the following into it:


DPI=300
PAGEMAX=10
OUTFOLDER="$HOME/Desktop/PDF/"

# create the output folder if it does not exist yet
mkdir -p "$OUTFOLDER"

for f in "$@"
do
	page_cnt=$(mdls -name kMDItemNumberOfPages "$f" | awk '{print $NF}')
	# force ten page limit if higher page count
	[[ page_cnt -ge PAGEMAX ]] && page_cnt=$PAGEMAX

	# strip the path, then only the final extension (dots earlier in the name are kept)
	BASE="$(basename "$f")"
	BASE="${BASE%.*}"

	# Setting the intents to zero requests perceptual rendering
	# The output path is quoted so folder and file names containing spaces survive
	/usr/local/bin/gs -q -dSAFER -sDEVICE=jpeg -r$DPI -dTextAlphaBits=4 -dGraphicsAlphaBits=4 \
    	-dLastPage=$page_cnt -dTextIntent=0 -dRenderIntent=0 -dImageIntent=0 \
    	-dFIXEDMEDIA -dPDFFitPage -o "${OUTFOLDER}${BASE}_%02d.jpeg" "$f"
done



The script limits the page count to 10 if the PDF has more pages, and it numbers the rendered pages sequentially, restarting at 01 for each PDF.
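The two pieces of bookkeeping described above, deriving the output name and capping the page count, can be sketched as standalone helpers. This is a minimal sketch, not the script itself: the names base_name and cap_pages are illustrative, and falling back to the maximum when the page count is not a number (for example when mdls reports "(null)" for a file Spotlight has not indexed) is an assumption.

```bash
#!/bin/bash
# Hypothetical helpers; the names are illustrative and not from the thread.

# Strip the directory and the final extension from a path.
base_name() {
    name=$(basename "$1")
    printf '%s\n' "${name%.*}"
}

# Clamp a page count to a maximum. Non-numeric input falls back to the
# maximum (an assumption made for this sketch).
cap_pages() {
    case "$1" in
        ''|*[!0-9]*) printf '%s\n' "$2" ;;
        *)
            if [ "$1" -gt "$2" ]; then
                printf '%s\n' "$2"
            else
                printf '%s\n' "$1"
            fi
            ;;
    esac
}

base_name "/Users/me/Documents/first document.pdf"   # prints: first document
cap_pages 14 10                                      # prints: 10
cap_pages 7 10                                       # prints: 7
```

Because only the last extension is removed, a file such as report.v2.pdf keeps the name report.v2.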


May 3, 2019 4:42 AM in response to raphael198

In the Bash script, when Ghostscript is interrupted, it leaves .jpeg files in a temporary location. These can impact the next run. You should remove these in the Terminal in this location:


cd $TMPDIR


or


cd $TMPDIR/TemporaryItems


After adding -dJPEGQ=100, I would restructure the Ghostscript content by forcing a line continuation character at the end of that line:
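A minimal sketch of that layout, assuming the flags from the script earlier in the thread with -dJPEGQ=100 on the first line of the command and a backslash closing every line except the last. The function name render_pdf and the GS override variable are placeholders for illustration, not part of the original script:

```bash
#!/bin/bash
# Sketch only: render_pdf and GS are illustrative names, not from the thread.
# GS defaults to the Homebrew install path mentioned earlier in the thread.
GS="${GS:-/usr/local/bin/gs}"

# usage: render_pdf <input.pdf> <output prefix> <last page>
render_pdf() {
    "$GS" -q -dSAFER -sDEVICE=jpeg -r300 -dJPEGQ=100 \
        -dTextAlphaBits=4 -dGraphicsAlphaBits=4 \
        -dLastPage="$3" -dTextIntent=0 -dRenderIntent=0 -dImageIntent=0 \
        -dFIXEDMEDIA -dPDFFitPage -o "${2}_%02d.jpeg" "$1"
}

# Example call (commented out so that nothing is rendered by accident):
# render_pdf "$HOME/Documents/render_images.pdf" "$HOME/Desktop/PDF/render_images" 10
```

A backslash must be the very last character on its line; a trailing space after it ends the command early and the remaining flags are run as a separate, failing command.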



None of the portrait-oriented PDF documents that I tested produced rotated images, so I am curious about that issue. The PNG image format will give you better text clarity and document compression, because JPEG is really intended for photographs, not text.
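Switching formats comes down to changing the Ghostscript device and the file extension together. A small sketch, assuming the png16m device (24-bit RGB PNG) is the one wanted; gs_device_for is an illustrative helper name, not something from the scripts in this thread:

```bash
#!/bin/bash
# Sketch only: gs_device_for is an illustrative helper, not from the thread.
# Prints "<ghostscript device> <file extension>" for a given image format.
gs_device_for() {
    case "$1" in
        png)      echo "png16m png" ;;    # 24-bit RGB PNG
        jpeg|jpg) echo "jpeg jpeg" ;;
        *)        echo "unsupported format: $1" >&2; return 1 ;;
    esac
}

# Split the pair into $1 and $2, then show the resulting flag and suffix.
set -- $(gs_device_for png)
echo "-sDEVICE=$1 -o name_%02d.$2"   # prints: -sDEVICE=png16m -o name_%02d.png
```

The -dJPEGQ setting belongs to the jpeg device, so it can be dropped when rendering PNG.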

Apr 13, 2019 11:16 AM in response to raphael198

The following Automator application workflow allows you to select multiple PDF files and render their pages as 300 dpi JPEGs. Each JPEG page is renamed to append _nn after the name and before the extension. The original PDF filename is retained for each image.


A 10-page PDF named render_images.pdf will result in image files: render_images_01.jpeg ... render_images_10.jpeg.


You set the output folder in the Run Shell Script action. It defaults to a JPEG folder on the Desktop. Save this Automator application to your Desktop, and double-click it to run. Select multiple PDFs by holding down the ⌘ key while clicking.


Here is the content for the Run Shell Script:

OUTDIR="$HOME/Desktop/JPEG/"

# create the output folder if it does not exist yet
mkdir -p "$OUTDIR"

for f in "$@"
do
	# the render step names its files "foo nn_01.jpeg"; Ruby removes the " nn" to give "foo_01.jpeg"
	OUTFILE="$(ruby -e 'puts File.basename(ARGV.first).split(/(?<=\s)\d+(?=_)/).map(&:strip).join;' "$f")"
	mv "$f" "$OUTDIR$OUTFILE"
done
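The same rename can be approximated without Ruby. This is a sketch under the assumption that the rendered files are named like "foo nn_01.jpeg"; clean_name is an illustrative function name and sed stands in for the Ruby one-liner rather than reproducing it exactly:

```bash
#!/bin/bash
# Sketch only: clean_name is an illustrative name, not from the thread.
# Drops the whitespace and digits that sit directly before an underscore,
# so "foo 3_01.jpeg" becomes "foo_01.jpeg". Other spaces are left alone.
clean_name() {
    basename "$1" | sed -E 's/[[:space:]]+[0-9]+_/_/g'
}

clean_name "/tmp/render_images 3_01.jpeg"   # prints: render_images_01.jpeg
```

Inside the loop, the helper would replace the ruby call: OUTFILE="$(clean_name "$f")".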




May 3, 2019 3:22 AM in response to VikingOSX

Again, utmost thanks for this. I ran what you provided here, and it seemed to work at first (the JPEGs were all sideways, but I just added a "Rotate Images" step at the end, no problem). But now when I tried again I get this error. Maybe my computer is just too old.



Another thing: one reason I'm insisting on converting the PDFs this way (aside from the fact that I have to do over 700 files), is that when I do it in other ways the text comes out with some fuzziness around it (some sort of jpeg artifact?). I added "-dJPEGQ=100" at the bottom with the hopes that it would resolve this issue. But maybe this messed up your script?

May 3, 2019 6:52 AM in response to VikingOSX

Yes, for some reason when I clicked your new Run Shell Script action it opened in a new tab and was all garbled, with lots of "&"s everywhere. It looks clean now.


As for the Terminal command, yes I was quite sure to spell it right (just messed up here in the forum). So I'm supposed to "open cd $TMPDIR" or "open cd $TMPDIR/TemporaryItems" (not "rm cd $TMPDIR"), and then manually delete those .jpegs? This is what I'm getting now.



Apr 16, 2019 5:00 AM in response to raphael198

On what operating system version are you getting these results? I tested on High Sierra 10.13.6 (17G6030) and Mojave 10.14.4 (18E226). Same version of Ruby on both.


In Terminal, enter the following commands one line at a time. TMPDIR is the location where Automator writes your JPEG files, named in the format document nn_nn.JPEG. Remove any *.JPEG that you find in that location:


cd $TMPDIR

ls *.JPEG

rm -f *.JPEG


and to return to your home directory:


cd -


Clean out the files from the output folder, and run the Automator application again.
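The cleanup can also be wrapped in a function that never changes the working directory. A minimal sketch, assuming only the top level of the folder should be touched; clean_stale_jpegs is an illustrative name:

```bash
#!/bin/bash
# Sketch only: clean_stale_jpegs is an illustrative name, not from the thread.
# Deletes .jpeg/.jpg files (any letter case) directly inside the given
# folder and prints each path it removes. Subfolders are left untouched.
clean_stale_jpegs() {
    dir="${1:?directory required}"
    find "$dir" -maxdepth 1 -type f \
        \( -iname '*.jpeg' -o -iname '*.jpg' \) -print -delete
}

# Example call (commented out so that nothing is deleted by accident):
# clean_stale_jpegs "$TMPDIR"
```

The ${1:?...} guard makes the function stop with an error when the argument is empty, which catches a misspelled variable name before find runs.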


The sole purpose of the OUTFILE Ruby code is to remove the single whitespace and number that precedes the _nn.JPEG, and I just tested it again using one of your examples from your preceding post and it worked perfectly.


I also have a Ghostscript solution in a Bash script that does the same thing in a single Run Shell Script action.

Apr 16, 2019 6:58 AM in response to VikingOSX

Again, thank you for this and for your patience. I followed your instructions best I could (honestly not sure what I actually did—when I pasted what you provided into Terminal it said "No such file or directory"), but I got a bit closer nonetheless. Now the output JPEGs look like this:


first_document_01.jpeg

[...]

first_document_10.jpeg

second_document_11.jpeg

[...]

second_document_20.jpeg


(I need the second document to go back to "second_document_01.jpeg", if that's possible.) For your information, I'm on El Capitan 10.11.6. Perhaps I need to download Homebrew/Ruby first?

May 3, 2019 6:05 AM in response to VikingOSX

Thanks again, but no dice.


First off, when I tried to open cd $TMDIR, Terminal responded that the file does not exist (or, when cd $TMPDIR/TemporaryItems, "syntax error near unexpected token `cd'"); or when I open $TMPDIR, the Temporary Items folder is empty.


Also, when I plug in your Restructured Ghostscript, Automator responds: The action "Run Shell Script" encountered an error.


-: -c: line 4: syntax error near unexpected token `&'

-: -c: line 4: `for f in &#34;$@&#34;'


Your effort has been truly valiant, but maybe it would be easier for me to just do these one-by-one.

May 3, 2019 6:19 AM in response to raphael198

You were supposed to replace the contents of the Run Shell Script action with what I posted this morning. All I did was introduce a line continuation backslash on the first line of the existing Ghostscript command syntax. Unless the Note tool is somehow garbling what you receive. I couldn't paste into the < > tool which would have preserved accurate syntax.
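If a copied script does arrive with HTML entities in it, such as the &#34; shown in the error message, the common ones can be restored before pasting. This is a sketch that covers only a handful of entities; decode_entities is an illustrative name:

```bash
#!/bin/bash
# Sketch only: decode_entities is an illustrative name, not from the thread.
# Reads text on stdin and restores a few common HTML entities.
# &amp; is handled last so that "&amp;quot;" is not decoded twice.
decode_entities() {
    sed -e 's/&#34;/"/g' -e 's/&quot;/"/g' -e "s/&#39;/'/g" \
        -e 's/&lt;/</g' -e 's/&gt;/>/g' -e 's/&amp;/\&/g'
}

printf '%s\n' 'for f in &#34;$@&#34;' | decode_entities   # prints: for f in "$@"
```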


Spelling counts. That is cd $TMPDIR, not cd $TMDIR. 😉
