rename a PDF file using an internal info with Automator

I am following up on a thread from here: rename a PDF file using an internal info … - Apple Community, with VikingOSX.


I have created the AppleScript per your directions, but I am getting this error: "Cannot find document number in PDF."


[Link Edited by Moderator]




Posted on Oct 27, 2022 7:33 AM


9 replies

Oct 27, 2022 8:27 AM in response to jacob_felger

The script in that prior link looks for a document number that you or someone else has included as text on the first page of the PDF. If the PDF is a scanned image, the script will not find the number, because it does not perform optical character recognition (OCR) on that image.


  1. I need to know what number format you are expecting the code to find and where it occurs in the PDF.
  2. I need to see your current AppleScript solution. Click the <> tool at the bottom of this editor, and copy/paste your entire AppleScript into that field. Then press return and click the <> tool again to turn it off.
  3. The original script as written works with a local folder as the drop folder for PDFs that fit item 1 above, and a second folder must exist to receive the renamed PDFs. I have not tested this with folders synced by cloud services.
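
The point about scanned PDFs can be checked before searching for a number. Below is a minimal sketch, not part of the original script: the handler name first_page_has_text and the example path are hypothetical, and it only reports whether PDFKit returns any non-whitespace text for the first page.

```applescript
use framework "Foundation"
use framework "PDFKit"
use scripting additions

on first_page_has_text(posixPath)
	-- posixPath is a POSIX path to a PDF, e.g. "/Users/me/Desktop/DropMe/test.pdf" (hypothetical)
	set pdf_url to current application's NSURL's fileURLWithPath:posixPath
	set pdf to current application's PDFDocument's alloc()'s initWithURL:pdf_url
	if pdf is missing value then return false -- file could not be opened as a PDF
	if (pdf's pageCount()) = 0 then return false
	set page_text to (pdf's pageAtIndex:0)'s |string|()
	if page_text is missing value then return false -- no text layer on the first page
	-- strip spaces and line breaks so an all-whitespace page counts as empty
	set ws to current application's NSCharacterSet's whitespaceAndNewlineCharacterSet()
	set trimmed to page_text's stringByTrimmingCharactersInSet:ws
	return ((trimmed's |length|()) > 0)
end first_page_has_text
```

If this returns false for a dropped file, the PDF is probably an image-only scan and would need OCR before any rename script could read a number from it.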


Oct 27, 2022 10:44 AM in response to jacob_felger

Here is the revision that handles your document number in the form OOO_nnnnn and does the rename without the "Cannot find document number in PDF." dialog. For testing purposes, I am using a drop folder named DropMe and a recipient folder named Renamed. You may want to change the OUTFOLDER property to your preferred recipient folder. This code expects the document number left-aligned on the first line of the first page. Because the match is anchored to the start of the page text, it probably will not find a number placed elsewhere on the page.


Tested on macOS Monterey 12.6.1.


The Run AppleScript content:


use framework "Foundation"
use framework "PDFKit"
use AppleScript version "2.4" -- Yosemite or later
use scripting additions

property NSString : a reference to current application's NSString
property NSURL : a reference to current application's NSURL
property PDFDocument : a reference to current application's PDFDocument
property NSRegularExpression : a reference to current application's NSRegularExpression
property NSRegularExpressionCaseInsensitive : a reference to current application's NSRegularExpressionCaseInsensitive
property NSMatchingAnchored : a reference to current application's NSMatchingAnchored
property OUTFOLDER : ((path to desktop as text) & "Renamed:") as alias
property ext : ".pdf"

on run {input, parameters}
	
	# can handle multiple dropped PDFs with document number
	tell application "Finder"
		repeat with apdf in input
			if name extension of (apdf as alias) is "pdf" then
				set pdf_file to POSIX path of (apdf as alias) as text
				set dropfolderPath to my drop_folder_path(pdf_file) as POSIX file as text
				set pdf_url to (NSURL's fileURLWithPath:pdf_file)
				set pdf to (PDFDocument's alloc()'s initWithURL:pdf_url)
				# assumption: document number is on first page of the PDF
				set page_text to (pdf's pageAtIndex:0)'s |string|()
				set docnum_result to my find_document_number(page_text)
				
				if docnum_result contains "_" then
					set newname to docnum_result & ext
					set name of (apdf as alias) to newname
					move (dropfolderPath & newname) to OUTFOLDER with replacing
					
				else
					display dialog "Cannot find document number in PDF."
				end if
			else
				log "continue" # wasn't a PDF so skip it
			end if
		end repeat
	end tell
	return input
end run


on find_document_number(atxt)
	set tStr to NSString's alloc()'s initWithString:atxt
	set trange to current application's NSMakeRange(0, tStr's |length|())
	# look for document number in format with multiple characters, underline, multiple numbers (e.g. ddt_2345)
	# note: \\s+? requires at least one leading whitespace character; \\s*? would also allow none
	set pattern to "^\\s+?([A-Z0-9_]+)\\s+"
	set regex to NSRegularExpression's regularExpressionWithPattern:pattern options:NSRegularExpressionCaseInsensitive |error|:0
	
	set matches to (regex's firstMatchInString:tStr options:NSMatchingAnchored range:trange)
	
	if not (matches = "" or matches = missing value) is true then
		set matchrange to matches's rangeAtIndex:1
		return (tStr's substringWithRange:matchrange) as text
	else
		return "No match"
	end if
end find_document_number


on drop_folder_path(afile)
	# return the drop folder path associated with the newly dropped PDF file
	return do shell script "/bin/zsh -s <<'EOF' - " & afile & "
	# return the path of the dropfolder
	printf '%s' \"${1:a:h}\"
EOF"
end drop_folder_path


Oct 27, 2022 8:59 AM in response to jacob_felger

The thing with a drop folder is that the Automator Folder Action runs when something is dropped on the folder, and runs again whenever something is written to it. That is why you need a separate peer folder to receive the renamed PDF.


Let me cook up a PDF with your numbering scheme and see if I need to tweak the handler to match your numbering format. I will use your script as written as the base for my tests. It would help if you could tell me where your numbering string occurs in the PDF:


  1. Beginning of a line (e.g. OOO_0001).
  2. Embedded after some preceding text string (e.g. Document ID: OOO_0001).
  3. Embedded with surrounding text (e.g. Document ID: OOO_0001 blah blah).


May have something later today…

Oct 27, 2022 8:37 AM in response to VikingOSX

I'm so glad you saw this! Thanks for replying. I would like to use an OOO_00005 format, but I can change that to something else if it helps. I have a gut feeling that the underscore may be causing a problem.


I am using a local folder as both the drop and destination folders. I created twelve test PDF files with only the text OOO_00001, OOO_00002, etc. on the first page of each PDF. I get the error "Cannot find document number in PDF." twelve times when I put them in the drop folder.


If it helps track down the problem, when I click "Run" in Automator (instead of dropping files into my drop folder), it says "Syntax Error: Can’t make some data into the expected type" and then highlights the part of the script starting with "repeat with apdf in input" and ending with "end repeat". Any help is greatly appreciated!!


The script is here:

use framework "Foundation"
use framework "PDFKit"
use AppleScript version "2.4" -- Yosemite or later
use scripting additions


property NSString : a reference to current application's NSString
property NSURL : a reference to current application's NSURL
property PDFDocument : a reference to current application's PDFDocument
property NSRegularExpression : a reference to current application's NSRegularExpression
property NSRegularExpressionCaseInsensitive : a reference to current application's NSRegularExpressionCaseInsensitive
property NSMatchingAnchored : a reference to current application's NSMatchingAnchored
property OUTFOLDER : ((path to desktop as text) & "Renamed") as alias
property ext : ".pdf"


on run {input, parameters}
	# can handle multiple dropped PDFs with document number
	tell application "Finder"
		repeat with apdf in input
			if name extension of (apdf as alias) is "pdf" then
				set pdf_file to POSIX path of (apdf as alias) as text
				set dropfolderPath to my drop_folder_path(pdf_file) as POSIX file as text
				set pdf_url to (NSURL's fileURLWithPath:pdf_file)
				set pdf to (PDFDocument's alloc()'s initWithURL:pdf_url)
				# assumption: document number is on first page of the PDF
				set page_text to (pdf's pageAtIndex:0)'s |string|()
				set docnum_result to my find_document_number(page_text)
				
				if docnum_result contains "_" then
					set newname to docnum_result & ext
					set name of (apdf as alias) to newname
					move (dropfolderPath & newname) to OUTFOLDER with replacing
					
				else
					display dialog "Cannot find document number in PDF."
				end if
			else
				log "continue" # wasn't a PDF so skip it
			end if
		end repeat
	end tell
	return input
end run


on find_document_number(atxt)
	set tStr to NSString's alloc()'s initWithString:atxt
	set trange to current application's NSMakeRange(0, tStr's |length|())
	# look for document number in format with multiple characters, underline, multiple numbers (e.g. ddt_2345)
	# allow for the possibility of a none, or multiple leading space characters if present
	set pattern to "^\\s*?(^\\w+_\\w+)" # ddt_2345
	set regex to NSRegularExpression's regularExpressionWithPattern:pattern options:NSRegularExpressionCaseInsensitive |error|:0
	
	set matches to (regex's firstMatchInString:tStr options:NSMatchingAnchored range:trange)
	
	if not (matches = "" or matches = missing value) is true then
		set matchrange to matches's rangeAtIndex:1
		return (tStr's substringWithRange:matchrange) as text
	else
		return "No match"
	end if
end find_document_number


on drop_folder_path(afile)
	# return the drop folder path associated with the newly dropped PDF file
	return do shell script "/bin/zsh -s <<'EOF' - " & afile & "
	# return the path of the dropfolder
	printf '%s' \"${1:a:h}\"
EOF"
end drop_folder_path


Oct 27, 2022 9:10 AM in response to VikingOSX

I think I understand what you mean by needing one folder to drop and another to receive. The Automator folder action is being triggered as I drop the files, and that's when the "Cannot find document number in PDF." dialog appears.


The numbering string is at the beginning of a line (your first example). I can move the location of the text anywhere in the PDF as necessary, but it is embedded text and not just an image. I'm generating my test PDFs using Google Docs -> Exporting as PDF (yes, saving locally) and just putting the OOO_00001 in the first line. But eventually I'm going to be generating an Object Report in PowerSchool which will let me put a text string at any location in the PDF. I hope this helps clarify. :)


Thank you again!

Oct 27, 2022 10:34 AM in response to jacob_felger

I resolved this with a locally generated PDF and one downloaded from Google Docs with the same content. It now renames and moves the dropped PDFs where the OOO_nnnnn is at the beginning of the first line in the PDF. Cleaning things up and will repost what I tested on macOS Monterey 12.6.1 in a few minutes. Needed a different regular expression syntax.
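

To illustrate why the pattern mattered, here is a minimal sketch that runs both patterns from this thread against the same sample string. The helper name first_group is hypothetical, and the sample assumes that PDFKit returns the page text with a leading line break, which is an assumption and was not confirmed in the thread.

```applescript
use framework "Foundation"
use scripting additions

property NSString : a reference to current application's NSString
property NSRegularExpression : a reference to current application's NSRegularExpression

on first_group(pattern, atxt)
	-- return capture group 1 of an anchored, case-insensitive match, or "No match"
	set tStr to NSString's stringWithString:atxt
	set trange to current application's NSMakeRange(0, tStr's |length|())
	set regex to NSRegularExpression's regularExpressionWithPattern:pattern options:(current application's NSRegularExpressionCaseInsensitive) |error|:(missing value)
	set aMatch to regex's firstMatchInString:tStr options:(current application's NSMatchingAnchored) range:trange
	if aMatch is missing value then return "No match"
	return (tStr's substringWithRange:(aMatch's rangeAtIndex:1)) as text
end first_group

-- assumed page text: a line break, the document number, another line break
set sample to linefeed & "OOO_00001" & linefeed
set oldResult to first_group("^\\s*?(^\\w+_\\w+)", sample) -- pattern from the earlier script
set newResult to first_group("^\\s+?([A-Z0-9_]+)\\s+", sample) -- pattern from the revision
return {oldResult, newResult}
```

With that assumed sample, this should return {"No match", "OOO_00001"}: the second ^ in the earlier pattern cannot match once the leading line break has been consumed, while the revised pattern skips the leading whitespace and then captures the number.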



Oct 28, 2022 12:00 PM in response to VikingOSX

It worked! I noticed that if my input / drop folder has a space in its name, it doesn't work, but when I used your suggested "DropMe", it worked. Also, if my dropped files have longer file names containing spaces or parentheses, it doesn't work, but if I keep the original filenames short and without special characters, it works as well. Thank you so much!!
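

The failures with spaces and parentheses are most likely caused by the drop_folder_path handler, which splices the POSIX path into the shell command without quoting it, so the shell splits the path at spaces and misreads the parentheses. Below is a minimal sketch of one possible fix, assuming the rest of the script stays unchanged. It uses AppleScript's quoted form so the shell receives the path as a single argument; the example path in the comment is hypothetical.

```applescript
use scripting additions

on drop_folder_path(afile)
	-- afile is a POSIX path, e.g. "/Users/me/Desktop/Drop Me/report (1).pdf" (hypothetical)
	-- quoted form of wraps the path in single quotes for the shell
	return do shell script "/bin/zsh -s <<'EOF' - " & quoted form of afile & "
	# print the parent folder of the path passed as $1
	printf '%s' \"${1:a:h}\"
EOF"
end drop_folder_path
```

Replacing the original handler with this version should let the drop folder and the dropped file names contain spaces or parentheses. I have not tested it against every special character.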
