Extract first page of a pdf with automator, applescript, pdfkit

Referring to:

https://discussions.apple.com/thread/254650756?login=true


I only need the first page as a PDF in an Automator workflow. What would the AppleScript look like in that case?


(I couldn't reply to this topic directly, because it was closed)


Best, Michael

Posted on Apr 28, 2023 2:32 PM

Question marked as Top-ranking reply

Posted on Apr 29, 2023 2:04 PM

Here is an Automator application that writes the first page of a given PDF to your Desktop under a default filename (e.g. current_firstpage.pdf). Both the name and the location can be changed in the save dialog. The code leaves the other pages in the source PDF untouched, and the first-page PDF is saved to disk so you can use it in later workflow steps.



And the Run AppleScript code:


use framework "Foundation"
use framework "PDFKit"
use AppleScript version "2.4" -- Yosemite or later
use scripting additions

property ca : current application
property suffix : "_firstpage"

on run {input, parameters}
	
	repeat with apdf in input
		set thePDF to (POSIX path of apdf) as text
		set outpdf to POSIX path of (choose file name "Pick a name for the one-page PDF extraction:" default name "current.pdf" default location (path to desktop)) as text
		
		set pdfPageOne to (ca's NSURL's fileURLWithPath:(my add_PDF_suffix(outpdf, suffix)))
		
		set pdf to (ca's PDFDocument's alloc()'s initWithURL:(ca's NSURL's fileURLWithPath:thePDF))
		-- get first page (objective-C uses index 0 for first element)
		set page_one to (pdf's pageAtIndex:0)'s dataRepresentation()
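		-- pdfPageOne is an NSURL, so check its POSIX path with NSFileManager
		-- (coercing an NSURL with "POSIX file ... as alias" always errors, leaving file_exists false)
		set file_exists to (ca's NSFileManager's defaultManager()'s fileExistsAtPath:(pdfPageOne's |path|())) as boolean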
		
		if not file_exists then
			set status to (page_one's writeToURL:pdfPageOne atomically:true) as boolean
		end if
	end repeat
	return input
	
end run

on add_PDF_suffix(apdf, suffix)
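	-- change filename.pdf to filename_firstpage.pdf (appends the suffix argument)
	-- coerce the NSString to text so the comparison with "pdf" is meaningful
	set ext to ((ca's NSString's stringWithString:apdf)'s pathExtension()) as text
	if ext is not equal to "pdf" then set ext to "pdf"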
	set basename to (ca's NSString's stringWithString:apdf)'s stringByDeletingPathExtension()
	set base_with_suffix to (basename's stringByAppendingString:suffix)'s stringByAppendingPathExtension:ext
	return (base_with_suffix) as text
end add_PDF_suffix


11 replies

May 1, 2023 6:57 AM in response to mgasperl

No Apple-provided Automator PDF action processes only the first page of a PDF. They all process the entire PDF. Even the Shortcuts solution provided by TonyT1 processes the entire PDF.


The AppleScript that I provided earlier in this thread gets the first and only the first page of the PDF, writes that to a separate PDF and is done. I thought this was the functionality that you originally requested. I also have an AppleScript that can take that first PDF page and write it out as a 72 dpi image, but at the loss of any text or annotation attributes unless the original PDF was already flattened.
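A minimal sketch of the image step, not the script mentioned above. It assumes that loading the first page's PDF data into an NSImage and asking for its TIFF representation rasterizes the page at its point size (nominally 72 dpi), which flattens text and annotations into pixels. The handler name first_page_to_png and both path parameters are hypothetical.

```applescript
use framework "Foundation"
use framework "AppKit"
use framework "PDFKit"
use AppleScript version "2.4" -- Yosemite or later
use scripting additions

property ca : current application

-- hypothetical handler: pdfPath and pngPath are POSIX path strings
on first_page_to_png(pdfPath, pngPath)
	set pdf to ca's PDFDocument's alloc()'s initWithURL:(ca's NSURL's fileURLWithPath:pdfPath)
	-- give up on files PDFKit cannot open or that have no pages
	if pdf is missing value then return false
	if (pdf's pageCount()) < 1 then return false
	
	-- single-page PDF data for the first page (index 0)
	set pageData to (pdf's pageAtIndex:0)'s dataRepresentation()
	
	-- NSImage can load PDF data; its TIFF representation is a bitmap
	set img to ca's NSImage's alloc()'s initWithData:pageData
	if img is missing value then return false
	set bitmap to ca's NSBitmapImageRep's imageRepWithData:(img's TIFFRepresentation())
	
	-- re-encode the bitmap as PNG and write it out
	set pngData to bitmap's representationUsingType:(ca's NSPNGFileType) |properties|:(missing value)
	return (pngData's writeToFile:pngPath atomically:true) as boolean
end first_page_to_png
```

It could be called as my first_page_to_png(thePDF, "/path/to/cover.png"), where the output path is a placeholder. PNG is an arbitrary choice here; the thread does not say which image format is wanted.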


If you want to pass multiple PDFs into Automator and have only the first page of each written as a separate PDF, then the AppleScript that I wrote can be further adapted to an Automator Run AppleScript action that takes each input PDF and processes its first page. You never responded to my questions, including the one about the first PDF page's output filename, which is why I used a Choose File Name dialog so you could interactively set the name and location.
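One way that adaptation might look, as a sketch under stated assumptions rather than a tested drop-in: it skips the Choose File Name dialog, writes each first page into the user's temporary items folder (an assumed scratch location), and returns the new one-page PDFs instead of the original input so that the next Automator action sees only those. The "_firstpage" suffix is reused from the script above; the other variable names are illustrative.

```applescript
use framework "Foundation"
use framework "PDFKit"
use AppleScript version "2.4" -- Yosemite or later
use scripting additions

property ca : current application
property suffix : "_firstpage"

on run {input, parameters}
	-- assumption: the temporary items folder is an acceptable scratch location
	set tmpDir to ca's NSString's stringWithString:(POSIX path of (path to temporary items))
	set firstPages to {}
	
	repeat with apdf in input
		set srcPath to (POSIX path of apdf) as text
		set pdf to (ca's PDFDocument's alloc()'s initWithURL:(ca's NSURL's fileURLWithPath:srcPath))
		set pageTotal to 0
		if pdf is not missing value then set pageTotal to (pdf's pageCount()) as integer
		
		-- skip items PDFKit cannot open or that have no pages
		if pageTotal > 0 then
			-- build <temp folder>/<source name>_firstpage.pdf
			set baseName to ((ca's NSString's stringWithString:srcPath)'s lastPathComponent())'s stringByDeletingPathExtension()
			set outName to ((baseName's stringByAppendingString:suffix)'s stringByAppendingPathExtension:"pdf")
			set outPath to (tmpDir's stringByAppendingPathComponent:outName) as text
			
			-- page index 0 is the first page
			set pageData to (pdf's pageAtIndex:0)'s dataRepresentation()
			if (pageData's writeToFile:outPath atomically:true) as boolean then
				set end of firstPages to (outPath as POSIX file)
			end if
		end if
	end repeat
	
	-- hand only the one-page PDFs to the next action
	return firstPages
end run
```

Note that two source PDFs with the same filename would overwrite each other in the temporary folder, and nothing here cleans the temporary files up afterwards.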

Apr 29, 2023 5:43 AM in response to mgasperl

I used the following suffix for the extracted first page of each PDF in my previous code:


property SUFFIX_1 : "_1"



Are you okay with that, or do you require a different suffix? As written, the code writes the first-page PDF to the same location as the originally selected PDF. Is that OK?


Will you be processing a designated folder containing the PDFs or do you want to select one or more PDF files?


I just tested the revised code on Ventura 13.3.1 and it quickly processed three selected PDFs as expected.

Apr 29, 2023 6:00 AM in response to VikingOSX

Hi Viking, thanks for your answer!

I need to create covers for pdf files. So I only need the first page to be extracted temporarily to be processed with the next step in Automator.

So the pdf should not be saved in the same folder.

Currently the next step creates images of the whole pdf file ...

I had a Python script before (sips etc.), but it doesn't work anymore. Your AppleScript works, but it seems to do too much :-)

Apr 29, 2023 7:38 AM in response to mgasperl

The original script that I wrote saves the first page of a PDF to another PDF and then removes that first page from the donor PDF. That was a simple fix by omitting the page removal part. I do need to save that first PDF page somewhere so it is accessible later in your workflow.


However, what I gather from your posts so far is somewhat confusing (and potentially more complicated). I believe you want to extract the first page of the PDF (as a PDF) and pass it to the next action, where you want to generate images of each PDF page. Where are these saved and under what filename? At what DPI were you hoping to save each page, and in what image format?


Beginning with macOS Monterey 12.3, Apple removed its bundled Python 2.7 distribution from the operating system. That broke some Apple software, as well as any user Python scripts that expected Apple's Python to be where it had been for years.


May 1, 2023 5:55 AM in response to VikingOSX

Hi,

The script asks twice for the file name (default location = Desktop).

And all pages are still processed. I tried to select only the first page with a filter, but it had no effect. It always processes all PDF pages in addition to the first one.

With the old python script only the first pdf page was processed and there were no save as questions. It was just a temporary file and the first page was saved in the desired folder.


This thread has been closed by the system or the community team.