
Renaming PDF files with RO


I am being paid to rename 3,563 PDF files using an RO number located in the top right corner. Just as the original author of this post has stated, the number varies in its location from time to time but is however always in the top right corner. My files are indeed PDFs and are supposed to be saved as PDFs as well, just renamed. Based on searches, I see that I will be using Automator, but unfortunately I am completely new to this application and have no idea where to even begin. Is there anybody here who could help me? I imagine it would rescue me from having to perform 9 hours of work.


I have attached a screenshot example of one of the PDFs; the number in the top right corner, circled in blue, is what I will be using to rename each of the files. If this helps at all, the folders contain documents 6560634-6564197 (3,563 unnamed documents). I was told to click on each PDF > copy text > and rename said document with the pasted text. Done individually, one can imagine how arduous this kind of job can be. I would be so grateful to anybody who can offer a valid solution.


cheers,

Gene


[Re-Titled by Moderator]


Posted on Apr 4, 2023 12:41 AM

Question marked as Top-ranking reply

Posted on Apr 5, 2023 9:50 AM

THE_DREAM_FACTORY wrote:

You seem to be on the brink of cracking the code! This will for sure help; it is, however, putting the name twice in certain files where there are two RO numbers. Many of these will contain a second page where the number is listed again with a /2 at the end of it. Is there any way I can make a rule in the code where it only allows the code to be written once per file?


Add -m1 to grep (-m num: stop reading the file after num matches):

echo -n $(grep -m1 -oE '656[0-9]*[/ 0-9]')
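
A minimal sketch of what -m1 changes, using made-up sample text rather than the real PDFs. The two-line input and the helper name first_ro are assumptions for illustration only; the pattern is the one from the reply above.

```shell
# Hypothetical helper: print only the first RO-like match read from standard input.
first_ro() {
  # -m1 stops after the first matching line, -o prints only the matched text,
  # and -E enables extended regular expressions.
  grep -m1 -oE '656[0-9]*[/ 0-9]'
}

# Made-up text standing in for a two-page PDF where the number appears on both pages.
# Without -m1, grep would print one match per page; with it, only one line comes back.
printf 'RO 6561072/1\nRO 6561072/2\n' | first_ro
```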




22 replies

Apr 4, 2023 1:21 PM in response to THE_DREAM_FACTORY

We only see a screenshot, so can't test what you posted.

However, see if the Shortcut that I posted works.

(It worked on the one test file I tried; I converted your screenshot to a PDF to test.)


>can someone please explain to me in layman's terms how I would select those operations?

>IE "Select Folders" , "Get contents of folder" "Repeat with each item" , etc.

>Then how to run the script shown below?


Open Applications->Shortcuts and search for the action in the app.

For info on how to use Shortcuts, see: Shortcuts User Guide for Mac - Apple Support



Apr 4, 2023 12:03 PM in response to THE_DREAM_FACTORY

Firstly, I would like to give a big thank-you to both of you for responding so quickly. I do indeed have Ventura 13.0.1 installed on my M1 Mac mini. The files are all indeed PDFs. As for the titles I will be giving the files, I've included two more screenshot examples below, and the respective titles would be 6561072 and 6561008. However, it would not be a big deal if the /1 were kept at the end. The company I am doing this job for just needs to be able to easily search for these RO numbers in their database, so 6561072/1 and 6561008/1 are totally fine as file names. Very sorry for being such a noob here; my expertise exists solely in the domain of music production. As for all software outside of the typical Apple DAW, I truly know close to nothing. I am now trying to run the command shown in the first reply. Since I don't know my way around Automator, can someone please explain to me in layman's terms how I would select those operations? IE "Select Folders" , "Get contents of folder" "Repeat with each item" , etc. Then how to run the script shown below?


Very sorry for my lack of knowledge; this is my very first run-in with Automator, and I have absolutely NO idea "what time it is" in these parts of town...


with Humility,

Gene

Apr 6, 2023 6:02 AM in response to Tony T1

Still recovering from the dentist visit yesterday, but because these PDFs are generated from an IBM AS/400 report writer, there is no need to use the Get Text from PDF action, as one can get the text directly from the individual PDF. In production, one would omit the egrep -H flag, which lists the files with the result:



The regular expression says capture a first seven-digit number match that may be optionally followed by a forward slash and another number sequence. If the latter exist, they will be captured too.
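
The exact pattern from that post is not shown here, so the following is only a sketch of a regular expression matching the description above. The pattern itself, the variable name ro_pattern, and the sample line are all assumptions, and the sample is plain text piped to grep rather than a real PDF.

```shell
# Assumed pattern: seven digits, optionally followed by a slash and more digits.
ro_pattern='[0-9]{7}(/[0-9]+)?'

# Made-up line standing in for text read from one PDF.
# -m1 keeps only the first matching line; -o prints just the captured number.
printf 'REPAIR ORDER 6561072/1 04/04/23\n' | grep -m1 -oE "$ro_pattern"
# prints 6561072/1
```

The date at the end of the sample line is not matched because it never contains seven digits in a row.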


I also constructed a Zsh array containing the range of valid serial numbers (without the /1), and as I obtained each serial number captured by the regex, I stripped the /1 and then checked whether it was in that valid serial range. I also converted the /1 to _1 for UNIX filesystem purposes when constructing the output filename. One would have to check that the target file name does not already exist before renaming, to avoid name collisions, and choose some fallback naming convention when duplicates are detected. Or, the mv command has a -n option that prevents overwriting an existing file.
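
The script itself was not posted, so here is a rough sketch of the steps described above. It swaps the Zsh array for a plain numeric comparison against the range given in the original question (6560634-6564197); the function name ro_to_filename and the variables in the commented mv line are hypothetical.

```shell
# Hypothetical helper: turn a captured serial such as 6561072/1 into a safe file name.
# Prints nothing and returns 1 if the base number is not numeric or is out of range.
ro_to_filename() {
  local serial base name
  serial=$1
  base=${serial%%/*}                       # strip the /1 suffix, if any
  case $base in
    *[!0-9]*|'') return 1 ;;               # reject anything that is not all digits
  esac
  if [ "$base" -lt 6560634 ] || [ "$base" -gt 6564197 ]; then
    return 1                               # outside the assumed valid range
  fi
  name=$(printf '%s' "$serial" | tr '/' '_')   # a slash cannot appear in a file name
  printf '%s.pdf\n' "$name"
}

ro_to_filename "6561072/1"
# prints 6561072_1.pdf

# Possible use in a rename loop (old_pdf and dir are placeholders):
# new_name=$(ro_to_filename "$serial") && mv -n "$old_pdf" "$dir/$new_name"
```

With mv -n, a file whose target name is already taken is simply left under its old name, so those leftovers still need a second pass.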

Apr 4, 2023 9:03 AM in response to Tony T1

Tony,


The OP does not indicate if they are on Monterey or later, so Shortcuts may not be the drug of choice here.


There are some challenges to be acquainted with in this post, so not being critical of your submission:


If these PDFs were saved from an application, or even OCR'd, your approach may work, but it is more complicated because that Get Text from PDF action is a line scanner, so any numerals that appear on the same line before the desired serial number will appear before it in a single-column text list output. And what does


"the number varies in its location from time to time but is however always in the top right corner"


mean, and how hard is it to get it right without examples of those location variations? Is the nnnnn/n the entire serial number or only digits preceding that slash?


If these are all scanned PDFs then there will be no text to be retrieved from the PDF as it is a PDF wrapper around an image, though you can use the Extract Text From Image action which does work on obtaining the text from a scanned PDF. Tested that.


When I built that solution that you linked to earlier, I had several downloaded PDFs and corresponding images to work with to get the code accurate, and there were several variations in those PDFs/Images that made the work harder and the REGEX uglier.

Apr 4, 2023 4:54 PM in response to Tony T1

So, I built a Shortcut solution that for a few hours was capturing the serial numbers just fine from the two images above converted to PDF that I put in a folder, but under no circumstances would a rename work, though it worked fine in an external shell script. Then, out of the blue, the unchanged Shortcut code stopped capturing the serial number from the OCR stream and ignored print statements in the Run Shell Script that also had been working.


With that, and after a reboot not fixing anything, I simply gave up as my time has a cost.

Apr 5, 2023 9:01 AM in response to Tony T1

You seem to be on the brink of cracking the code! This will for sure help; it is, however, putting the name twice in certain files where there are two RO numbers. Many of these will contain a second page where the number is listed again with a /2 at the end of it. Is there any way I can make a rule in the code where it only allows the code to be written once per file?

Apr 5, 2023 6:19 PM in response to Tony T1

Hey there Tony, this is working. I was actually given more jobs to do. The only issue I am having is that once every 90-100 files or so it will stop the process to tell me "a file already exists with the same name." Is there a way to bypass this in the code so it just keeps both automatically? Or even overwrites it. Preferably the former, but either way is fine. This will save me a lot of time in the future. Thanks.
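
One way the "keep both" behaviour could be sketched, assuming the rename is done from a shell script rather than a Shortcuts rename action. The helper name unique_pdf_path and the -2, -3 suffix convention are assumptions, not something posted in this thread.

```shell
# Hypothetical helper: print a path in the given folder that does not exist yet,
# adding -2, -3, ... before .pdf when the plain name is already taken.
unique_pdf_path() {
  local dir stem candidate n
  dir=$1
  stem=$2
  candidate="$dir/$stem.pdf"
  n=2
  while [ -e "$candidate" ]; do
    candidate="$dir/$stem-$n.pdf"
    n=$((n + 1))
  done
  printf '%s\n' "$candidate"
}

# Possible use (old_pdf, dir and serial are placeholders):
# mv -n "$old_pdf" "$(unique_pdf_path "$dir" "$serial")"
```

Because the name is chosen before the move, nothing is overwritten and the process does not need to stop and ask.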
