Rename PDF based on invoice number

I am trying to find out how I can rename hundreds of invoice PDFs based on the invoice number.

The format of the number is the two-digit year, then a hyphen, then a series of numbers 1-5 digits long.

These are scanned files, so OCR will be necessary. I have seen that Automator and/or Shortcuts can extract the text. I've tried copying some code from other threads online but can't seem to get it to work for my needs. The invoice number is preceded by the text "Invoice #:", but I only need the number itself, not the text.


Examples


Invoice #: 19-2456

Invoice #: 22-18635

Invoice #: 23-123

Invoice #: 23-0


Can anyone help? Thanks!



Posted on Sep 26, 2023 2:59 PM


Sep 26, 2023 5:12 PM in response to CHC-mac

The following Shortcut solution will prompt you for a folder containing your current OCR'd PDFs. It then confirms that they have a .pdf or .PDF extension and enters a repeat loop over the PDFs. In that loop, it sets a variable to the current PDF filename, extracts the OCR text, and appends that text to the just-set variable.


Next, a Run Shell Script action running the Zsh shell accepts the two values in the InFile variable as arguments. These arrive as $1 and $2 and are referred to as PDF and OCR respectively.


Then Perl uses a regular expression to capture the invoice number, which, as you have explained, is two digits, a dash, and then, going by your examples, 1 to 5 additional digits. This capture is assigned to the variable inv_number. The rename (/bin/mv) and some Zsh hocus pocus follow to complete the job. Then the Shortcut processes the next PDF file: rinse and repeat.


I would try this as is with one or two OCR'd PDFs in a folder on your Desktop and view the Show Results output as I have included here. For a production run with more PDFs, you will likely want to remove that Show Results action at the end of the Shortcut.




The original file was named ocr_invoice.pdf and was renamed 19-2456.pdf in the original folder location.

Here is a portion of the original OCR'd PDF:



Tested: macOS 14.0

Sep 27, 2023 3:42 PM in response to CHC-mac

Consider changing the Filter action to Any. It works with either All or Any, but Any seems more appropriate.



You want to enter the following as one contiguous line and let it wrap in the Run Shell Script action. Do not split it with a return.


inv_number="$(perl -lne 'print $1 if $_ =~ /(?<=Invoice #:)\s*(\d{2}-\d{1,5})/' <<<$OCR)"



I just ran my Shortcut again and it renamed a PDF correctly in the designated folder:




Sep 27, 2023 2:57 PM in response to VikingOSX

Thanks for the help. Unfortunately, on my end, when I run it, the file disappears from the selected folder without the renamed file being created. I'm running macOS 14. Not sure what the issue could be. I know that when I copy and paste the script from your screenshot, it doesn't paste exactly right. The "$" came in as "S" and a few other things were off that I had to correct. But the entire script runs with no errors.


Any ideas or anything else I can try? And thanks again for any help you have time to offer. It's very much appreciated!
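One thing that may be worth ruling out, offered as a guess and not a diagnosis: assuming the script builds the new name as the captured number plus .pdf, an empty capture would make the target .pdf, which is a dot-file that Finder hides, so the original would seem to vanish. Running ls -a on the folder in Terminal would show whether a .pdf file is sitting there. A guard along these lines would skip the rename in that case. The helper name rename_checked and the skip messages are hypothetical.

```shell
# rename_checked is a hypothetical helper.
# $1 is the PDF path, $2 is the captured invoice number (possibly empty).
rename_checked() {
    valid=no
    case "$2" in
        *[!0-9-]*) ;;  # anything other than digits and hyphens, newlines included
        *) printf '%s\n' "$2" | grep -Eq '^[0-9]{2}-[0-9]{1,5}$' && valid=yes ;;
    esac
    # An empty or malformed capture leaves the original file untouched.
    if [ "$valid" != yes ]; then
        echo "skipped (no invoice number found): $1" >&2
        return 1
    fi
    target="$(dirname "$1")/$2.pdf"
    # Do not overwrite a file that already carries this invoice number.
    if [ -e "$target" ]; then
        echo "skipped (target exists): $target" >&2
        return 1
    fi
    /bin/mv "$1" "$target"
}
```

With this in place, a PDF whose OCR text yields no number stays under its old name, and the skip message in the Show Results output says which file to look at.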

