Automator - problem downloading PDFs from URLs

Question

Level 1

4 points

I am trying to use Automator to download PDFs from a simple webpage that consists solely of direct links to said PDFs. I was able to accomplish this for a different website without any issue using the following workflow:

“Get Specified Finder Items” to feed in a txt document with the URLs (all of which end in .pdf)

“Combine Text Files”

“Extract URLs from Text”

“Download URLs”

For the present website, however, the output is just each of those links as an html file, not the PDFs themselves.

If I replace the “Download URLs” command with “Display Webpages,” all of the PDFs open properly in my browser and can be downloaded one by one. So it’s as if the “Download URLs” command is adding the .html extension for this particular website, and that's not jibing.

How can I fix this? Thank you.

Posted on Aug 6, 2019 4:00 PM

Answer 1

aleiof Author

Level 1

4 points

Aug 6, 2019 4:56 PM in response to aleiof

Clarification:

The "Download URLs" action is apparently cutting out the web address when it moves to download. For example, if the link to the PDF is http://example.com/song.pdf, the output from "Download URLs" will be "song.pdf.html".

Answer 2

VikingOSX

Level 10

119,027 points

Aug 7, 2019 1:41 PM in response to aleiof

Remove the Download URLs action from your workflow, and drag/drop Utilities Library : Run Shell Script into the bottom of your workflow. Change the Run Shell Script and its content to what you see above. Click the Run button in the upper right to interactively view progress from the individual action results. You will need to provide it a path to an existing folder where you want the PDF saved. I used ~/Desktop/Test.

Answer 3

aleiof Author

Level 1

4 points

Aug 7, 2019 3:42 PM in response to VikingOSX

Thanks for clarifying. I followed those instructions. A file with a PDF extension is appearing in the "Test" folder I created on my desktop. However, the file does not open, and it is only 168 KB (the same size as the HTML files that "Download URLs" spat out before), whereas entering the test link into Safari manually allows the ~100 MB file to be downloaded normally.
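
A matching file size suggests the saved file is still an HTML page with a .pdf name. One way to check is to look at the first bytes of each file, since PDF files normally begin with the signature "%PDF". The following is only a minimal sketch: the helper name is_pdf is hypothetical, and the folder is assumed to be the ~/Desktop/Test folder mentioned in this thread.

```shell
#!/bin/bash
# Hypothetical helper: succeeds if the file starts with the PDF signature "%PDF".
is_pdf() {
    [ "$(head -c 4 "$1" 2>/dev/null)" = "%PDF" ]
}

# Assumed folder from the thread; change it to wherever the files were saved.
for f in "$HOME/Desktop/Test"/*.pdf; do
    [ -e "$f" ] || continue   # skip when the folder is missing or empty
    if is_pdf "$f"; then
        echo "Looks like a PDF: $f"
    else
        echo "Not a PDF (possibly an HTML page): $f"
    fi
done
```

Any file reported as not a PDF can be opened in a text editor to see what the server actually returned.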

Answer 4

VikingOSX

Level 10

119,027 points

Aug 7, 2019 3:56 PM in response to aleiof

Are those URLs current? Some sites use content management systems (e.g. Drupal) that put a variable name in place of the actual filename, so the site does not have to be edited if the filename changes. Alternatively, the location you are attempting to access may be silently redirecting you elsewhere. The curl command should have dealt with this, but try the variation shown below in that Run Shell Script action.

You might try adding -L to the current command to automatically follow redirects to another site if the page has moved. A redirect won't necessarily be obvious from looking at the URL.

/usr/bin/curl -L -s -o "$OUTDIR${f##*/}.pdf" "$f"
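
For context, the full script from the earlier reply is not reproduced in this thread, so the following is only a sketch of how such a Run Shell Script action might look, not the original. It assumes the action's "Pass input" setting is "as arguments", that OUTDIR is an existing folder (~/Desktop/Test here), and it introduces a hypothetical helper, dest_name, that avoids names like song.pdf.pdf when the URL already ends in .pdf. The -f option is an addition that makes curl report HTTP errors instead of saving the error page.

```shell
#!/bin/bash
# Assumed output folder (must already exist); note the trailing slash.
OUTDIR="$HOME/Desktop/Test/"

# Hypothetical helper: build a local filename from a URL.
# Keeps the last path segment, drops any query string,
# and appends .pdf only when the name does not already end in it.
dest_name() {
    local name="${1##*/}"
    name="${name%%[?]*}"
    case "$name" in
        *.pdf|*.PDF) ;;
        *) name="$name.pdf" ;;
    esac
    printf '%s\n' "$name"
}

# With "Pass input: as arguments", each URL arrives as one argument.
for f in "$@"; do
    # -L follows redirects, -s hides the progress meter,
    # -f returns an error on HTTP error statuses, -o names the output file.
    /usr/bin/curl -L -s -f -o "$OUTDIR$(dest_name "$f")" "$f"
done
```

If the server returns an HTML page with a normal success status, curl will still save that page, so this sketch does not by itself guarantee that the result is a PDF.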

Answer 5

VikingOSX

Level 10

119,027 points

Aug 7, 2019 11:16 AM in response to aleiof

Have you considered replacing the Download URLs action with the following:

Answer 6

aleiof Author

Level 1

4 points

Aug 7, 2019 1:08 PM in response to VikingOSX

Thank you for your reply. Unfortunately, I am a complete novice, so that's beyond me.
