Using Ruby, Rename PDF with text from input PDF file

I have this script running pretty well thanks to the great support on this site. It currently opens a PDF, searches for a text string, and then saves the PDF with the string as the file name.


For a different project I need to check the string matches the original filename, so my thoughts are to read the string and then append it to the original filename like this: Originalname_Foundstring.pdf - then I can take over and compare the first and second parts of the filename.


This script searches for a string beginning "1000…." then saves the file with a new name based on the string.


How can I grab the input filename and then append the string like this: Originalname_Foundstring.pdf?

(I am only a beginner in AppleScript and even more clueless in Ruby.)



_main()

on _main()
    script o
        property aa : choose file with prompt ("Choose PDF Files.") of type {"com.adobe.pdf"} ¬
            default location (path to desktop) with multiple selections allowed

        set my aa's beginning to choose folder with prompt ("Choose Destination Folder.") ¬
            default location (path to desktop)

        set args to ""
        repeat with a in my aa
            set args to args & a's POSIX path's quoted form & space
        end repeat

        considering numeric strings
            if (system info)'s system version < "10.9" then
                set ruby to "/usr/bin/ruby"
            else
                set ruby to "/System/Library/Frameworks/Ruby.framework/Versions/2.0/usr/bin/ruby"
            end if
        end considering

        do shell script ruby & " <<'EOF' - " & args & "
require 'osx/cocoa'
include OSX
require_framework 'PDFKit'

outdir = ARGV.shift.chomp('/')

ARGV.select {|f| f =~ /\\.pdf$/i }.each do |f|
    url = NSURL.fileURLWithPath(f)
    doc = PDFDocument.alloc.initWithURL(url)
    path = doc.documentURL.path
    pcnt = doc.pageCount
    (0 .. (pcnt - 1)).each do |i|
        page = doc.pageAtIndex(i)
        page.string.to_s =~ /1000\\S*/
        name = $&
        unless name
            puts \"no matching string in page #{i + 1} of #{path}\"
            next # ignore this page
        end
        newname = name[1..-1]

        doc1 = PDFDocument.alloc.initWithData(page.dataRepresentation) # doc for this page
        unless doc1.writeToFile(\"#{outdir}/#{newname}.pdf\")
            puts \"failed to save page #{i + 1} of #{path}\"
        end
    end
end
EOF"
    end script
    tell o to run
end _main

Posted on Apr 14, 2015 7:11 AM

24 replies

Mar 30, 2017 9:17 AM in response to Hiroto

I am not having much luck passing the variables from Esko Automation Engine into Python.


Although this runs, I don't get the output back in Esko, so I think I'm not parsing the variables correctly?


In this script I am trying to find each string beginning with "VM_" on a page, split the PDF into separate pages, and name each page file using that string. I have a working AppleScript version which is manually triggered, and I thought I could make it run via Esko.


Where oh where am I going wrong? 😟



on main(inputs, outputFolder, params)
    (*
        list inputs : list of POSIX path of input files
        string outputFolder : POSIX path of output folder
        list params : optional parameters as list of strings
        return string : "OK" | "Warning" | "Error"
        * to be invoked by Esko Automation Engine
        cf.
        https://docs.esko.com/docs/en-us/automationengine/16/userguide/pdf/ae_ScriptRunner.pdf
    *)
    set args to ""
    repeat with a in inputs
        set args to args & a's quoted form & space
    end repeat

    try
        do shell script "/usr/bin/python <<'EOF' - " & args & "
# coding: utf-8
import sys, re
from Foundation import NSURL
from Quartz.PDFKit import PDFDocument

argv = [ a.decode('utf-8') for a in sys.argv[1:] ]
outdir = argv.pop(0).rstrip('/')

for f in [ a for a in argv if re.search(r'\\.pdf$', a, re.I) ]:
    url = NSURL.fileURLWithPath_(f)
    doc = PDFDocument.alloc().initWithURL_(url)
    path = doc.documentURL().path()
    pcnt = doc.pageCount()

    for i in range(0, pcnt):
        page = doc.pageAtIndex_(i)
        m = re.search(r'VM_\\S*', page.string())
        if not m:
            print 'no matching string in page %d of %s' % (i + 1, path.encode('utf-8'))
            continue # ignore this page
        name = m.group()
        doc1 = PDFDocument.alloc().initWithData_(page.dataRepresentation()) # doc for this page
        if not doc1.writeToFile_('%s/%s.pdf' % (outdir, name)):
            print 'failed to save page %d of %s' % (i + 1, path.encode('utf-8'))
EOF"
    on error errs number errn
        log errs & space & errn
        return "Error"
    end try

    return "OK"
end main

Mar 31, 2017 6:24 AM in response to Phillip Briggs

mmm confused 😕


I'm trying to replace the Ruby script with Python, in case I have to replace my script server with a new OS.

I have this working locally when I run the script manually, but I don't get the output back in Esko when I use the Esko Script Runner. Could it be that the outputFolder variable isn't being populated when this line runs? outdir = argv.pop(0).rstrip('/')



-- # start testing
set inputs to choose file with prompt ("Choose PDF Files.") of type {"com.adobe.pdf"} ¬
    default location (path to desktop) with multiple selections allowed
set outputFolder to choose folder with prompt ("Choose Destination Folder.") ¬
    default location (path to desktop)
repeat with i in inputs
    set i's contents to i's POSIX path
end repeat
set outputFolder to outputFolder's POSIX path
set params to {}
main(inputs, outputFolder, params)
-- # end testing

(*
    for Esko Automation Engine Script Runner
*)

on main(inputs, outputFolder, params)
    (*
        list inputs : list of POSIX path of input files
        string outputFolder : POSIX path of output folder
        list params : optional parameters as list of strings
        return string : "OK" | "Warning" | "Error"
        * to be invoked by Esko Automation Engine
        cf.
        https://docs.esko.com/docs/en-us/automationengine/16/userguide/pdf/ae_ScriptRunner.pdf
    *)

    script o
        property aa : {outputFolder} & inputs
        set args to ""
        repeat with a in my aa
            set args to args & a's quoted form & space
        end repeat

        try
            do shell script "/usr/bin/python <<'EOF' - " & args & "
# coding: utf-8
import sys, re
from Foundation import NSURL
from Quartz.PDFKit import PDFDocument

argv = [ a.decode('utf-8') for a in sys.argv[1:] ]
outdir = argv.pop(0).rstrip('/')

for f in [ a for a in argv if re.search(r'\\.pdf$', a, re.I) ]:
    url = NSURL.fileURLWithPath_(f)
    doc = PDFDocument.alloc().initWithURL_(url)
    path = doc.documentURL().path()
    pcnt = doc.pageCount()

    for i in range(0, pcnt):
        page = doc.pageAtIndex_(i)
        m = re.search(r'VM_\\S*', page.string())
        if not m:
            print 'no matching string in page %d of %s' % (i + 1, path.encode('utf-8'))
            continue # ignore this page
        name = m.group()
        doc1 = PDFDocument.alloc().initWithData_(page.dataRepresentation()) # doc for this page
        if not doc1.writeToFile_('%s/%s.pdf' % (outdir, name)):
            print 'failed to save page %d of %s' % (i + 1, path.encode('utf-8'))
EOF"
            set {r, err} to {result, 0}
        on error errs number errn
            set {r, err} to {errs, errn}
        end try

        if err = 0 then
            return "OK"
        else if err = 1 then
            log r
            return "Warning"
        else
            log r
            return "Error"
        end if
    end script
    tell o to run
end main

Mar 31, 2017 6:13 PM in response to Phillip Briggs

Hello Phillip Briggs,


Sorry for being late in reply. As far as I can tell, your last script, where outdir in python script is properly assigned, should work in Esko's Script Runner as well. Are you sure you're using AppleScript Runner and not Shell Script Runner?


Anyway, you might try the following which is a revision of your last script to log possible errors.



--APPLESCRIPT
-- # start testing
set inputs to choose file with prompt ("Choose PDF Files.") of type {"com.adobe.pdf"} with multiple selections allowed
set outputFolder to choose folder with prompt ("Choose Destination Folder.")
repeat with i in inputs
    set i's contents to i's POSIX path
end repeat
set outputFolder to outputFolder's POSIX path
set params to {}
main(inputs, outputFolder, params)
-- # end testing

(*
    for Esko Automation Engine Script Runner
*)
on main(inputs, outputFolder, params)
    (*
        list inputs : list of POSIX path of input files
        string outputFolder : POSIX path of output folder
        list params : optional parameters as list of strings
        return string : "OK" | "Warning" | "Error"
        * to be invoked by Esko Automation Engine
        cf.
        https://docs.esko.com/docs/en-us/automationengine/16/userguide/pdf/ae_ScriptRunner.pdf
    *)
    script o
        property aa : {outputFolder} & inputs
        set args to ""
        repeat with a in my aa
            set args to args & a's quoted form & space
        end repeat

        try
            do shell script "/usr/bin/python <<'EOF' - " & args & "
# coding: utf-8
import sys, re
from Foundation import NSURL
from Quartz.PDFKit import PDFDocument

argv = [ a.decode('utf-8') for a in sys.argv[1:] ]
outdir = argv.pop(0).rstrip('/')

ret = 0
for f in [ a for a in argv if re.search(r'\\.pdf$', a, re.I) ]:
    url = NSURL.fileURLWithPath_(f)
    doc = PDFDocument.alloc().initWithURL_(url)
    path = doc.documentURL().path()
    pcnt = doc.pageCount()

    for i in range(0, pcnt):
        page = doc.pageAtIndex_(i)
        m = re.search(r'VM_\\S*', page.string())
        if not m:
            ret = max(1, ret)
            print 'no matching string in page %d of %s' % (i + 1, path.encode('utf-8'))
            continue # ignore this page
        name = m.group()
        doc1 = PDFDocument.alloc().initWithData_(page.dataRepresentation()) # doc for this page
        if not doc1.writeToFile_('%s/%s.pdf' % (outdir, name)):
            ret = max(2, ret)
            print 'failed to save page %d of %s' % (i + 1, path.encode('utf-8'))
sys.exit(ret)
EOF"
            set {r, err} to {result, 0}
        on error errs number errn
            set {r, err} to {errs, errn}
        end try

        if err = 0 then
            return "OK"
        else if err = 1 then
            log r
            return "Warning"
        else
            log r
            return "Error"
        end if
    end script
    tell o to run
end main
--END OF APPLESCRIPT



---

And just in case, here's a shell script version using pyobjc, which accepts the search string as an additional parameter in Esko Script Runner.



#!/bin/bash
#
# for Esko Automation Engine Script Runner
#
# $1 : input files separated by :
# $2 : output directory
# $3.. : additional parameters
# exit : 0 = OK, 1 = Warning, 2 = Error
#
# cf.
# https://docs.esko.com/docs/en-us/automationengine/16/userguide/pdf/ae_ScriptRunner.pdf
#
/usr/bin/python <<'EOF' - "$@"
# coding: utf-8
# sys.argv[1] : input files separated by :
# sys.argv[2] : output directory
# sys.argv[3..] : additional parameters
# sys.argv[3] => search string in page
#
import sys, re
from Foundation import NSURL
from Quartz.PDFKit import PDFDocument

uargv = [ a.decode('utf-8') for a in sys.argv ]
outdir = uargv[2].rstrip('/')
re_pattern = re.compile(re.escape(uargv[3]) + '\S*')

ret = 0
for f in [ a for a in uargv[1].split(':') if re.search(r'\.pdf$', a, re.I) ]:
    url = NSURL.fileURLWithPath_(f)
    doc = PDFDocument.alloc().initWithURL_(url)
    path = doc.documentURL().path()
    pcnt = doc.pageCount()

    for i in range(0, pcnt):
        page = doc.pageAtIndex_(i)
        m = re.search(re_pattern, page.string())
        if not m:
            ret = max(1, ret)
            print 'no matching string in page %d of %s' % (i + 1, path.encode('utf-8'))
            continue # ignore this page
        name = m.group()
        doc1 = PDFDocument.alloc().initWithData_(page.dataRepresentation()) # doc for this page
        if not doc1.writeToFile_('%s/%s.pdf' % (outdir, name)):
            ret = max(2, ret)
            print 'failed to save page %d of %s' % (i + 1, path.encode('utf-8'))
sys.exit(ret)
EOF



And here's its AppleScript wrapper.


--APPLESCRIPT
on main(inputs, outputFolder, params)
    (*
        list inputs : list of POSIX path of input files
        string outputFolder : POSIX path of output folder
        list params : optional parameters as list of strings
        return string : "OK" | "Warning" | "Error"
        * to be invoked by Esko Automation Engine
        cf.
        https://docs.esko.com/docs/en-us/automationengine/16/userguide/pdf/ae_ScriptRunner.pdf
    *)
    set a1 to ""
    repeat with a in inputs
        set a1 to a1 & a's quoted form & ":"
    end repeat
    set a1 to a1's text 1 thru -2 -- remove last excessive :
    set a2 to outputFolder's quoted form
    set ar to ""
    repeat with a in params
        set ar to ar & a's quoted form & space
    end repeat
    set args to a1 & space & a2 & space & ar

    try
        do shell script "/usr/bin/python <<'EOF' - " & args & "
# coding: utf-8
# sys.argv[1] : input files separated by :
# sys.argv[2] : output directory
# sys.argv[3..] : additional parameters
# sys.argv[3] => search string in page
#
import sys, re
from Foundation import NSURL
from Quartz.PDFKit import PDFDocument

uargv = [ a.decode('utf-8') for a in sys.argv ]
outdir = uargv[2].rstrip('/')
re_pattern = re.compile(re.escape(uargv[3]) + '\\S*')

ret = 0
for f in [ a for a in uargv[1].split(':') if re.search(r'\\.pdf$', a, re.I) ]:
    url = NSURL.fileURLWithPath_(f)
    doc = PDFDocument.alloc().initWithURL_(url)
    path = doc.documentURL().path()
    pcnt = doc.pageCount()

    for i in range(0, pcnt):
        page = doc.pageAtIndex_(i)
        m = re.search(re_pattern, page.string())
        if not m:
            ret = max(1, ret)
            print 'no matching string in page %d of %s' % (i + 1, path.encode('utf-8'))
            continue # ignore this page
        name = m.group()
        doc1 = PDFDocument.alloc().initWithData_(page.dataRepresentation()) # doc for this page
        if not doc1.writeToFile_('%s/%s.pdf' % (outdir, name)):
            ret = max(2, ret)
            print 'failed to save page %d of %s' % (i + 1, path.encode('utf-8'))
sys.exit(ret)
EOF"
        set {r, err} to {result, 0}
    on error errs number errn
        set {r, err} to {errs, errn}
    end try

    if err = 0 then
        return "OK"
    else if err = 1 then
        log r
        return "Warning"
    else
        log r
        return "Error"
    end if
end main
--END OF APPLESCRIPT



Scripts are briefly tested with pyobjc 2.2b3 and python 2.6.1 under OS X 10.6.8.


Good luck,

Hiroto

Apr 3, 2017 3:00 AM in response to Hiroto

Hello Hiroto, you are correct in your query about shell vs Applescript. I was following the Esko instructions where it tells me to save the script as text:


2. Save this code as an AppleScript text file in the Script Runner’s AppleScript folder (default: /Library/Scripts/Esko/AppleScript) or in the Automation Engine AppleScript folder.

Note: Script Runner supports ‘Text’ format. Therefore it is essential to change the file format to ‘Text’.


But this is incorrect because when I save it as an Applescript it works fine.


Thank you so much for the time and effort you put into your replies, especially the extra examples where you seem to anticipate my next questions!


Many, many thanks 🙂


Apr 20, 2017 8:28 AM in response to Hiroto

Hello!


Unfortunately I have some codes containing a forward slash, like this: WF158/4001867.1


Luckily for me this string is common to all pages: WF158/400 so I have fudged the script to search for the "400" and then add the "WF158/" later.


Can you tell me if it's possible to search for a string (in Python) containing a forward slash? Or maybe split the string on the slash and then join the 2 parts afterwards?


Currently I have fudged it like this, so I have a workaround. 🙂



* Hiroto script modified for WF158/4001867.1.pdf files

_main()

on _main()
    script o
        property aa : choose file with prompt ("Choose PDF Files.") of type {"com.adobe.pdf"} ¬
            default location (path to desktop) with multiple selections allowed

        set my aa's beginning to choose folder with prompt ("Choose Destination Folder.") ¬
            default location (path to desktop)

        set args to ""
        repeat with a in my aa
            set args to args & a's POSIX path's quoted form & space
        end repeat

        do shell script "/usr/bin/python <<'EOF' - " & args & "
# coding: utf-8
import sys, re
from Foundation import NSURL
from Quartz.PDFKit import PDFDocument

argv = [ a.decode('utf-8') for a in sys.argv[1:] ]
outdir = argv.pop(0).rstrip('/')

prefix = 'WF158:'

for f in [ a for a in argv if re.search(r'\\.pdf$', a, re.I) ]:
    url = NSURL.fileURLWithPath_(f)
    doc = PDFDocument.alloc().initWithURL_(url)
    path = doc.documentURL().path()
    pcnt = doc.pageCount()

    for i in range(0, pcnt):
        page = doc.pageAtIndex_(i)
        m = re.search(r'4001\\S*', page.string())
        if not m:
            print 'no matching string in page %d of %s' % (i + 1, path.encode('utf-8'))
            continue # ignore this page
        name = m.group()

        name3 = prefix + name

        doc1 = PDFDocument.alloc().initWithData_(page.dataRepresentation()) # doc for this page
        if not doc1.writeToFile_('%s/%s.pdf' % (outdir, name3)):
            print 'failed to save page %d of %s' % (i + 1, path.encode('utf-8'))
EOF"
    end script
    tell o to run
end _main

Apr 28, 2017 8:52 PM in response to Phillip Briggs

Hello Phillip Briggs,


Sorry for being so late...


As I understand it, the main problem with / (SOLIDUS) here is that / is the node separator in Un*x paths, so care must be taken when using an arbitrary string extracted from text as a file name. Another reserved character is : (COLON), which is the node separator in HFS Plus paths. The VFS (virtual file system) used in OS X translates : in a BSD name into / in the HFS Plus name, and vice versa.


So one solution is to replace / in the extracted string with : so that VFS translates it back into / in the HFS Plus name. By the same token, if the extracted string originally contains :, it will appear as / in the HFS Plus name, so an original : is better replaced by some other character in advance. In the following code I introduced two options, one substituting . (FULL STOP) for : (COLON) and the other substituting ∶ (U+2236 RATIO) for : (COLON), and employed the former, since the latter may be confusing.
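In isolation, the substitution described above might look like the following minimal sketch. The function name hfs_safe_name and its colon_substitute parameter are hypothetical names chosen for illustration; the logic is only the two replacements, done in the order that matters (colons first, then slashes).

```python
def hfs_safe_name(name, colon_substitute="."):
    """Turn an extracted string into a single file name (hypothetical helper)."""
    # Step 1: replace any colon already in the text, because a colon in the
    # BSD-level name would otherwise show up as a slash in the Finder name.
    name = name.replace(":", colon_substitute)
    # Step 2: replace each slash with a colon, so the path is not split and
    # the colon is shown as a slash again in the HFS Plus name.
    return name.replace("/", ":")


print(hfs_safe_name("WF158/4001867.1"))  # WF158:4001867.1
```

Doing step 2 before step 1 would turn the freshly inserted colons into full stops as well, which is why the order is fixed.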



Revised shell script using pyobjc is as follows.



#!/bin/bash
#
# for Esko Automation Engine Script Runner
#
# $1 : input files separated by :
# $2 : output directory
# $3.. : additional parameters
# exit : 0 = OK, 1 = Warning, 2 = Error
#
# cf.
# https://docs.esko.com/docs/en-us/automationengine/16/userguide/pdf/ae_ScriptRunner.pdf
#
/usr/bin/python <<'EOF' - "$@"
# coding: utf-8
# sys.argv[1] : input files separated by :
# sys.argv[2] : output directory
# sys.argv[3..] : additional parameters
# sys.argv[3] => search string in page
#
import sys, re
from Foundation import NSURL
from Quartz.PDFKit import PDFDocument

uargv = [ a.decode('utf-8') for a in sys.argv ]
outdir = uargv[2].rstrip('/')
re_pattern = re.compile(re.escape(uargv[3]) + '\S*')

ret = 0
for f in [ a for a in uargv[1].split(':') if re.search(r'\.pdf$', a, re.I) ]:
    url = NSURL.fileURLWithPath_(f)
    doc = PDFDocument.alloc().initWithURL_(url)
    path = doc.documentURL().path()
    pcnt = doc.pageCount()

    for i in range(0, pcnt):
        page = doc.pageAtIndex_(i)
        m = re.search(re_pattern, page.string())
        if not m:
            ret = max(1, ret)
            print 'no matching string in page %d of %s' % (i + 1, path.encode('utf-8'))
            continue # ignore this page
        name = m.group()
        #
        # treatments for HFS plus name
        # 1) : (U+003A COLON) is reserved as node separator in path in HFS Plus and cannot be used in node name
        # 1.a) replace : (U+003A COLON) in name by . (U+002E FULL STOP)
        # 1.b) replace : (U+003A COLON) in name by ∶ (U+2236 RATIO)
        # 2) : (U+003A COLON) in BSD name is translated into / (U+002F SOLIDUS) in HFS Plus name, and vice versa
        #
        name = name.replace(':', '.') # by 1.a) : (U+003A COLON) => . (U+002E FULL STOP)
        # name = name.replace(':', u'\u2236') # by 1.b) : (U+003A COLON) => ∶ (U+2236 RATIO)
        name = name.replace('/', ':') # by 2)
        doc1 = PDFDocument.alloc().initWithData_(page.dataRepresentation()) # doc for this page
        if not doc1.writeToFile_('%s/%s.pdf' % (outdir, name)):
            ret = max(2, ret)
            print 'failed to save page %d of %s' % (i + 1, path.encode('utf-8'))
sys.exit(ret)
EOF




And its AppleScript wrapper.



--APPLESCRIPT
on main(inputs, outputFolder, params)
    (*
        list inputs : list of POSIX path of input files
        string outputFolder : POSIX path of output folder
        list params : optional parameters as list of strings
        return string : "OK" | "Warning" | "Error"
        * to be invoked by Esko Automation Engine
        cf.
        https://docs.esko.com/docs/en-us/automationengine/16/userguide/pdf/ae_ScriptRunner.pdf
    *)
    set a1 to ""
    repeat with a in inputs
        set a1 to a1 & a's quoted form & ":"
    end repeat
    set a1 to a1's text 1 thru -2 -- remove last excessive :
    set a2 to outputFolder's quoted form
    set ar to ""
    repeat with a in params
        set ar to ar & a's quoted form & space
    end repeat
    set args to a1 & space & a2 & space & ar

    try
        do shell script "/usr/bin/python <<'EOF' - " & args & "
# coding: utf-8
# sys.argv[1] : input files separated by :
# sys.argv[2] : output directory
# sys.argv[3..] : additional parameters
# sys.argv[3] => search string in page
#
import sys, re
from Foundation import NSURL
from Quartz.PDFKit import PDFDocument

uargv = [ a.decode('utf-8') for a in sys.argv ]
outdir = uargv[2].rstrip('/')
re_pattern = re.compile(re.escape(uargv[3]) + '\\S*')

ret = 0
for f in [ a for a in uargv[1].split(':') if re.search(r'\\.pdf$', a, re.I) ]:
    url = NSURL.fileURLWithPath_(f)
    doc = PDFDocument.alloc().initWithURL_(url)
    path = doc.documentURL().path()
    pcnt = doc.pageCount()

    for i in range(0, pcnt):
        page = doc.pageAtIndex_(i)
        m = re.search(re_pattern, page.string())
        if not m:
            ret = max(1, ret)
            print 'no matching string in page %d of %s' % (i + 1, path.encode('utf-8'))
            continue # ignore this page
        name = m.group()
        #
        # treatments for HFS plus name
        # 1) : (U+003A COLON) is reserved as node separator in path in HFS Plus and cannot be used in node name
        # 1.a) replace : (U+003A COLON) in name by . (U+002E FULL STOP)
        # 1.b) replace : (U+003A COLON) in name by ∶ (U+2236 RATIO)
        # 2) : (U+003A COLON) in BSD name is translated into / (U+002F SOLIDUS) in HFS Plus name, and vice versa
        #
        name = name.replace(':', '.') # by 1.a) : (U+003A COLON) => . (U+002E FULL STOP)
        # name = name.replace(':', u'\\u2236') # by 1.b) : (U+003A COLON) => ∶ (U+2236 RATIO)
        name = name.replace('/', ':') # by 2)
        doc1 = PDFDocument.alloc().initWithData_(page.dataRepresentation()) # doc for this page
        if not doc1.writeToFile_('%s/%s.pdf' % (outdir, name)):
            ret = max(2, ret)
            print 'failed to save page %d of %s' % (i + 1, path.encode('utf-8'))
sys.exit(ret)
EOF"
        set {r, err} to {result, 0}
    on error errs number errn
        set {r, err} to {errs, errn}
    end try

    if err = 0 then
        return "OK"
    else if err = 1 then
        log r
        return "Warning"
    else
        log r
        return "Error"
    end if
end main
--END OF APPLESCRIPT




Scripts are briefly tested with pyobjc 2.2b3 and python 2.6.1 under OS X 10.6.8.


All the best,

Hiroto

Apr 14, 2015 8:51 AM in response to Phillip Briggs

Hello


If I understand it correctly, you may try something like this.


_main()

on _main()
    script o
        property aa : choose file with prompt ("Choose PDF Files.") of type {"com.adobe.pdf"} ¬
            default location (path to desktop) with multiple selections allowed

        set my aa's beginning to choose folder with prompt ("Choose Destination Folder.") ¬
            default location (path to desktop)

        set args to ""
        repeat with a in my aa
            set args to args & a's POSIX path's quoted form & space
        end repeat

        considering numeric strings
            if (system info)'s system version < "10.9" then
                set ruby to "/usr/bin/ruby"
            else
                set ruby to "/System/Library/Frameworks/Ruby.framework/Versions/2.0/usr/bin/ruby"
            end if
        end considering

        do shell script ruby & " <<'EOF' - " & args & "
require 'osx/cocoa'
include OSX
require_framework 'PDFKit'

outdir = ARGV.shift.chomp('/')

ARGV.select {|f| f =~ /\\.pdf$/i }.each do |f|
    url = NSURL.fileURLWithPath(f)
    doc = PDFDocument.alloc.initWithURL(url)
    path = doc.documentURL.path
    pcnt = doc.pageCount
    bname = File.basename(f, File.extname(f))
    (0 .. (pcnt - 1)).each do |i|
        page = doc.pageAtIndex(i)
        page.string.to_s =~ /1000\\S*/
        name = $&
        unless name
            puts \"no matching string in page #{i + 1} of #{path}\"
            next # ignore this page
        end
        newname = name[1..-1]
        doc1 = PDFDocument.alloc.initWithData(page.dataRepresentation) # doc for this page
        unless doc1.writeToFile(\"#{outdir}/#{bname}_#{newname}.pdf\")
            puts \"failed to save page #{i + 1} of #{path}\"
        end
    end
end
EOF"
    end script
    tell o to run
end _main
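For the later comparison step (checking that the found string matches the original file name), a rough sketch of the naming and the reverse split could look like this. It is written in Python rather than Ruby, the helper names tagged_name and split_tagged are hypothetical, and it assumes the found string itself contains no underscore, so the last underscore in the name is the separator.

```python
import os


def tagged_name(input_path, found):
    """Build Originalname_Foundstring.pdf from an input path and a found string."""
    base = os.path.splitext(os.path.basename(input_path))[0]
    return "%s_%s.pdf" % (base, found)


def split_tagged(filename):
    """Split Originalname_Foundstring.pdf back into (original, found).

    Assumption: the found string has no underscore, so we split on the last one.
    """
    stem = os.path.splitext(filename)[0]
    original, _, found = stem.rpartition("_")
    return original, found


name = tagged_name("/tmp/job_42.PDF", "000123")
print(name)                # job_42_000123.pdf
print(split_tagged(name))  # ('job_42', '000123')
```

Splitting on the last underscore keeps original names such as job_42 intact; if the found strings can contain underscores too, a different separator would be safer.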




Hope this helps,

H

Apr 14, 2015 8:58 AM in response to Hiroto

That is exactly what I needed - I searched for ages and could find no way of getting the File.basename !


To take this one step further if you will:


This currently asks for user input in the AppleScript part, to select the files. Ideally, I would launch this script from another source (it's actually Esko Automation Engine), which would send PDFs to this script and then wait for the output.


Do you think it's possible to do this? For this to 'receive' files without user input, then process them?

Apr 15, 2015 5:32 AM in response to Phillip Briggs

Hello


I'm not sure but I guess you're trying to use the script via Esko Automation Engine Script Runner.


The rest is based upon an official document I downloaded from the following link:


http://help.esko.com/docs/en-us/automationengine/12/otherdocs/Scripting_in_Automation_Engine.pdf



1) AppleScript for Esko Automation Engine Script Runner would be something like this. (You'd only need the main() handler definition. The rest is for testing purpose.)



-- # start testing
set inputs to choose file with prompt ("Choose PDF Files.") of type {"com.adobe.pdf"} ¬
    default location (path to desktop) with multiple selections allowed
set outputFolder to choose folder with prompt ("Choose Destination Folder.") ¬
    default location (path to desktop)
repeat with i in inputs
    set i's contents to i's POSIX path
end repeat
set outputFolder to outputFolder's POSIX path
set params to {}
main(inputs, outputFolder, params)
-- # end testing

(*
    for Esko Automation Engine Script Runner
*)
on main(inputs, outputFolder, params)
    (*
        list inputs : list of POSIX path of input files
        string outputFolder : POSIX path of output folder
        list params : list of additional parameters
        return string : "OK" or "Warning" or "Error"
    *)
    script o
        property aa : {outputFolder} & inputs
        set args to ""
        repeat with a in my aa
            set args to args & a's quoted form & space
        end repeat

        considering numeric strings
            if (system info)'s system version < "10.9" then
                set ruby to "/usr/bin/ruby"
            else
                set ruby to "/System/Library/Frameworks/Ruby.framework/Versions/2.0/usr/bin/ruby"
            end if
        end considering

        try
            do shell script ruby & " <<'EOF' - " & args & "
require 'osx/cocoa'
include OSX
require_framework 'PDFKit'

ret = 0
outdir = ARGV.shift.chomp('/')

ARGV.select {|f| f =~ /\\.pdf$/i }.each do |f|
    url = NSURL.fileURLWithPath(f)
    doc = PDFDocument.alloc.initWithURL(url)
    path = doc.documentURL.path
    pcnt = doc.pageCount
    bname = File.basename(f, File.extname(f))
    (0 .. (pcnt - 1)).each do |i|
        page = doc.pageAtIndex(i)
        page.string.to_s =~ /1000\\S*/
        name = $&
        unless name
            ret = [ret, 1].max
            $stderr.puts \"No matching string in page #{i + 1} of #{path}.\"
            next # ignore this page
        end
        newname = name[1..-1]
        doc1 = PDFDocument.alloc.initWithData(page.dataRepresentation) # doc for this page
        unless doc1.writeToFile(\"#{outdir}/#{bname}_#{newname}.pdf\")
            ret = [ret, 2].max
            $stderr.puts \"Failed to save page #{i + 1} of #{path}.\"
        end
    end
end
exit ret
EOF"
            set {r, err} to {result, 0}
        on error errs number errn
            set {r, err} to {errs, errn}
        end try

        if err = 0 then
            return "OK"
        else if err = 1 then
            log r
            return "Warning"
        else
            log r
            return "Error"
        end if
    end script
    tell o to run
end main





2) Shell Script for Esko Automation Engine Script Runner would be something like this.



#!/bin/bash
#
# for Esko Automation Engine Script Runner
#
# $1 : input files separated by :
# $2 : output directory
# $3.. : additional parameters
# exit : 0 = OK, 1 = Warning, 2 = Error
#
ruby1=/System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/bin/ruby
ruby2=/System/Library/Frameworks/Ruby.framework/Versions/2.0/usr/bin/ruby
[[ -e $ruby2 ]] && ruby=$ruby2 || ruby=$ruby1

$ruby <<'EOF' - "$@"
#
# ARGV[0] : input files separated by :
# ARGV[1] : output directory
# ARGV[2..] : additional parameters
#
require 'osx/cocoa'
include OSX
require_framework 'PDFKit'

ret = 0
outdir = ARGV[1].chomp('/')

ARGV[0].split(':').select {|f| f =~ /\.pdf$/i }.each do |f|
    url = NSURL.fileURLWithPath(f)
    doc = PDFDocument.alloc.initWithURL(url)
    path = doc.documentURL.path
    pcnt = doc.pageCount
    bname = File.basename(f, File.extname(f))
    (0 .. (pcnt - 1)).each do |i|
        page = doc.pageAtIndex(i)
        page.string.to_s =~ /1000\S*/
        name = $&
        unless name
            ret = [ret, 1].max
            $stderr.puts "No matching string in page #{i + 1} of #{path}."
            next # ignore this page
        end
        newname = name[1..-1]
        doc1 = PDFDocument.alloc.initWithData(page.dataRepresentation) # doc for this page
        unless doc1.writeToFile("#{outdir}/#{bname}_#{newname}.pdf")
            ret = [ret, 2].max
            $stderr.puts "Failed to save page #{i + 1} of #{path}."
        end
    end
end
exit ret
EOF
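The two conventions the shell script version relies on (input files joined by : in the first argument, and an exit status that keeps the worst result seen) can be sketched on their own as below. This is Python rather than Ruby, and the names pdf_inputs, exit_code and the event labels "no_match" and "save_failed" are made up for illustration; only the 0 / 1 / 2 meaning comes from the script header above.

```python
import re


def pdf_inputs(joined):
    """Split the colon-separated input list and keep only *.pdf paths."""
    return [p for p in joined.split(":") if re.search(r"\.pdf$", p, re.I)]


def exit_code(events):
    """Return the worst status seen: 0 = OK, 1 = Warning, 2 = Error.

    The event labels are hypothetical: 'no_match' stands for a page without
    the search string, 'save_failed' for a page that could not be written.
    """
    severity = {"no_match": 1, "save_failed": 2}
    ret = 0
    for event in events:
        ret = max(ret, severity.get(event, 0))
    return ret


print(pdf_inputs("/jobs/one.pdf:/jobs/two.PDF:/jobs/notes.txt"))
print(exit_code(["ok", "no_match", "ok"]))
```

One caveat of this convention is that an input path which itself contains a colon would be split in the wrong place.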




Hope this may help,

H

Dec 6, 2016 5:05 AM in response to Hiroto

Hello,


I am trying to enhance the usability of this script by making the search string a variable as well, so that the search string is passed from Esko into the ARGV[2] which would then let me search for any text I like rather than it being hard coded.


So I have edited the line:


page.string.to_s =~ /1000\S*/


To try and include the additional parameter from Esko like this:


page.string.to_s =~ /ARGV[2]\S*/


but obviously the syntax is incorrect because I get the error: No matching string in page


I've tried various permutations but can't solve it!
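For what it's worth, /ARGV[2]\S*/ in Ruby is taken literally: it looks for the text "ARGV" followed by the character "2", so it never sees the parameter. In Ruby the value has to be interpolated into the pattern with #{...}, ideally passed through Regexp.escape first. The same idea in Python, as a small sketch (build_pattern is a hypothetical helper name and the sample text is invented):

```python
import re


def build_pattern(prefix):
    """Compile 'prefix followed by any run of non-space characters'.

    re.escape makes sure characters such as '.' or '/' in the prefix
    are matched literally instead of being read as regex syntax.
    """
    return re.compile(re.escape(prefix) + r"\S*")


m = build_pattern("VM_").search("Label VM_12345-A printed")
print(m.group())  # VM_12345-A
```

Escaping matters as soon as the search string comes from outside the script, since a prefix like "1.0" would otherwise also match "1x0".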
