Automator file compare problem

Hi All,

I am trying to set up an Automator action that will compare two folders containing identical subfolders. In one folder the subfolders hold RAW files; in the other they hold JPEG files. Some files have been deleted from the JPEG folders. I want the Automator action to compare the JPEG folders with the RAW folders so I can delete the RAW files that no longer have a JPEG version.

I have found an action by Ben Long, made in 2005. Although it finds the files to delete, it will not pass them on to another action to delete them.

Please can someone offer any help or advice?

Many thanks

Darren

Automator-OTHER, Mac OS X (10.7.5), or Newer

Posted on May 13, 2016 1:51 AM

6 replies

May 13, 2016 8:01 PM in response to Darrenelite

Here is a script that prompts you for a folder containing RAW files and then a folder containing JPEG files with the same names (except for the extension), then deletes the RAW files that do not have a correspondingly named JPEG file.


There's no error checking here, so be sure to test on a copy of your folders first to be sure that it's doing what you want. If you choose the wrong JPEG folder it could delete RAW files you want to keep, because it won't find matches on the names in the folder you chose.


  1. Copy-paste into Script Editor (in Applications > Utilities)
  2. Click the triangle 'run' button and answer the prompts.


SG



tell application "Finder"

set fr to choose folder with prompt "Choose subfolder with RAW files"

set fj to choose folder with prompt "Choose subfolder with corresponding JPEG files"

set {rFiles, jFiles} to {fr's files as alias list, fj's files as alias list}

set {rNames, jNames} to {my stripExt(rFiles), my stripExt(jFiles)}

repeat with i from 1 to count rFiles

if rNames's item i is not in jNames then delete rFiles's item i

end repeat

end tell


to stripExt(fileList) -- removes extension so names can be compared

set nn to {}

tell application "Finder"

repeat with f in fileList

set nn's end to ¬
(f's name as text)'s text 1 thru -(((f's name extension as text)'s length) + 2)

end repeat

end tell

return nn

end stripExt
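
For anyone who wants to preview the result before letting a script delete anything, the same name-matching idea can be sketched in Python 3. This is a minimal dry-run sketch, not the AppleScript above: the function names `unmatched_raw` and `dry_run` are made up for illustration, and lower-casing the names before comparing is an assumption about how you want pairs matched.

```python
import os


def unmatched_raw(raw_names, jpeg_names):
    """Return RAW file names whose stem has no JPEG with the same stem."""
    # Stem = file name without its final extension, lower-cased so that
    # IMG_1.CR2 and img_1.jpg still count as a pair (an assumption).
    jpeg_stems = {os.path.splitext(n)[0].lower() for n in jpeg_names}
    return [n for n in sorted(raw_names)
            if not n.startswith(".")  # ignore .DS_Store and other dot files
            and os.path.splitext(n)[0].lower() not in jpeg_stems]


def dry_run(raw_dir, jpeg_dir):
    """List, but do not delete, the RAW files that would be removed."""
    names = unmatched_raw(os.listdir(raw_dir), os.listdir(jpeg_dir))
    return [os.path.join(raw_dir, n) for n in names]


# Name-only demonstration; no files are touched.
print(unmatched_raw(["a.cr2", "b.cr2", "c.cr2"], ["a.jpg", "c.JPG"]))
# prints ['b.cr2']
```

Calling `dry_run` with the two folder paths returns full paths only, so the list can be checked by eye before anything is moved to the Trash.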

May 14, 2016 6:05 PM in response to Darrenelite

Here is a self-contained, interactive AppleScript that can be compiled as an application in Script Editor (or used in an Automator Run AppleScript action). It finds the RAW files that have no counterpart in the JPEG folder, then moves them to the Trash rather than deleting them, so that the Finder's Put Back capability can return them to their original location.


A "black-box" Python handler is used to identify the RAW files for removal. Excludes the .DS_Store files. Quicker than it looks.


For each folder chooser prompt, navigate to the specific folder containing the RAW and JPEG images, respectively.


In addition to the two folder prompts, there are two dialogs:

[Two screenshots of the dialogs]


Copy/paste the following code into the Script Editor (Launchpad : Other : Script Editor). Click the hammer icon to compile the code, and then the triangle button to run it.


-- dif_folder.applescript

-- Removes RAW image files that are no longer found in the JPEG image folder.

-- For RAW ["a.cr", "b.cr", "c.cr", "d.cr"] and JPEG ["a.jpg", "b.jpg", "c.jpg"], it will

-- remove RAW ["d.cr"]. The Python handler uses sets and dictionaries for speed, and

-- returns the RAW items that are to be moved to the Trash. The Trash is not emptied so that a

-- failsafe exists to Put Back the RAW images to their original location.

-- Tested: OS X 10.11.4, default OS X Python interpreter (2.7.10)

-- VikingOSX, May 14, 2016, Apple Support Community.


property loc : ((path to home folder) as text) as alias


set raw_folder to choose folder with prompt "Select sub-folder with RAW files:" default location loc

set jpg_folder to choose folder with prompt "Select sub-folder with JPEG files:" default location loc


set args to raw_folder's POSIX path's quoted form & space & jpg_folder's POSIX path's quoted form

set raw_deletes to paragraphs of folder_diff(args)


if "None" is in raw_deletes then

display dialog "No differences found between RAW and JPEG files"

return

end if


tell application "Finder"

repeat with afile from 1 to count raw_deletes

set z to (item afile of raw_deletes) as POSIX file

move z to trash

end repeat

end tell

display dialog "RAW files with no JPEG equivalent moved to Trash: " & (count raw_deletes) as text

return


on folder_diff(xargs)


return do shell script "python <<EOF - " & xargs & "

#!/usr/bin/python

# coding: utf-8


# black box to return RAW files not found in JPEG folder

from __future__ import print_function

import sys

import os


rawf, jpgf = sys.argv[1:]

raw_files = filter(lambda x: not x.startswith('.'), os.listdir(rawf))

# Filter out the .DS_Store files that will play havoc with dict keys

raw_base = (filter(lambda x: not x.startswith('.'),

[os.path.splitext(os.path.basename(fn))[0].lower() for fn in os.listdir(rawf)]))

jpg_base = (filter(lambda x: not x.startswith('.'),

[os.path.splitext(os.path.basename(fn))[0].lower() for fn in os.listdir(jpgf)]))

raw_dict = dict(zip(raw_base, raw_files))

unmatched = list(set(raw_base) - set(jpg_base))

del_raw = [raw_dict[x] for x in sorted(unmatched)]


if del_raw:

    print(*('{}'.format(os.path.join(rawf, item)) for item in del_raw), sep='\\n')

else:

    print(None)

EOF"


end folder_diff
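
One thing to watch in the dictionary step: `dict(zip(raw_base, raw_files))` keeps only one file per basename, so if a folder held both `d.cr2` and `d.nef` with no matching JPEG, only one of them would be returned. Below is a minimal Python 3 sketch of grouping every file under its basename instead. The function names `group_by_stem` and `raw_without_jpeg` are hypothetical, not part of the script above.

```python
import os
from collections import defaultdict


def group_by_stem(file_names):
    """Map each lower-cased stem to every visible file name that has it."""
    groups = defaultdict(list)
    for name in sorted(file_names):
        if name.startswith("."):  # skip .DS_Store and other dot files
            continue
        stem = os.path.splitext(name)[0].lower()
        groups[stem].append(name)
    return groups


def raw_without_jpeg(raw_names, jpeg_names):
    """Return every RAW name whose stem is missing from the JPEG names."""
    raw_groups = group_by_stem(raw_names)
    jpeg_stems = set(group_by_stem(jpeg_names))
    result = []
    # Set difference on stems, then expand each stem back to all its files.
    for stem in sorted(set(raw_groups) - jpeg_stems):
        result.extend(raw_groups[stem])
    return result


print(raw_without_jpeg(["a.cr2", "d.cr2", "d.nef"], ["a.jpg"]))
# prints ['d.cr2', 'd.nef']
```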

May 14, 2016 7:46 PM in response to Darrenelite

In case there's any confusion, my simple script is self-contained. It moves unmatched files to the Trash and does not empty the Trash. So, if necessary, files can be put back into their original position. It assumes you only have RAW files in the first folder you choose, though that could easily be changed if you have a mixture of files in that subfolder. Actually, it isn't limited to RAW files; it can work with other files too should you ever need that. No shell scripts; just a short, plain vanilla AppleScript.


SG

May 14, 2016 11:22 PM in response to Darrenelite

Hello


If I understand it correctly, you might try the following AppleScript, which is merely a wrapper around a bash script.


It asks you to choose the base directory for the RAW files and the base directory for the JPEG files. It then finds the RAW files in the directory tree rooted at the RAW base directory and, for each one, checks whether a corresponding JPEG file exists in the directory tree rooted at the JPEG base directory. If there is no corresponding JPEG file, the RAW file is removed.


Currently it just prints the RAW files to be deleted to ~/Desktop/raw_removal_log.txt, for testing purposes. Uncomment the line OP='rm -v' to actually remove the files. Removal cannot be undone. It is assumed that RAW files have the name extension .nef or .cr2 and JPEG files have the name extension .jpg or .jpeg.



--APPLESCRIPT
set raw_dir to (choose folder with prompt "Choose RAW's base directory")'s POSIX path
set jpg_dir to (choose folder with prompt "Choose JPEG's base directory")'s POSIX path
set args to ""
repeat with a in {raw_dir, jpg_dir}
    set args to args & a's quoted form & space
end repeat
do shell script "/bin/bash -s <<'EOF' - " & args & " > ~/Desktop/raw_removal_log.txt
#
#   remove raw file without corresponding jpg file
#
#   $1 : raw's base directory
#   $2 : jpg's base directory
#
#   E.g.,
#   Given:
#       /path/to/raw/d/e/f/x.nef
#       /path/to/raw/d/e/f/y.nef
#       /path/to/jpg/d/e/f/x.jpg
#
#       $1 = /path/to/raw
#       $2 = /path/to/jpg
#
#   then this will remove raw file without corresponding jpg file, i.e.,
#       /path/to/raw/d/e/f/y.nef
#
RAW=$1      # base directory of raw files
JPG=$2      # base directory of jpg files
OP='echo'       # for test
# OP='rm -v'
while IFS= read -r -d $'\\0' f
do
    n=${f##*/}      # name.ext
    m=${n%.*}       # name w/o ext
    d=${f%$n}       # dir/
    jpg=${d/#$RAW/$JPG}/$m      # corresponding jpg w/o ext
    [[ -e $jpg.jpg || -e $jpg.jpeg ]] || $OP \"$f\"
done < <(find -E \"$RAW\" -type f -iregex '.*\\.(nef|cr2)$' -print0)
EOF"
--END OF APPLESCRIPT




Briefly tested under OS X 10.6.8, but NO WARRANTIES of any kind. Please make sure you have a complete backup of the original directories before running this sort of script.


Good luck,

H
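
The mirrored-tree check that the bash script performs (take each RAW file's path relative to the RAW base directory and look for the same path under the JPEG base directory) can also be sketched in Python 3. This is only an illustrative dry run under the same extension assumptions (.nef/.cr2 and .jpg/.jpeg). The name `orphaned_raw` is made up, and nothing is deleted.

```python
import os

RAW_EXTS = (".nef", ".cr2")      # same extensions the bash script assumes
JPEG_EXTS = (".jpg", ".jpeg")


def orphaned_raw(raw_root, jpeg_root):
    """Yield RAW paths under raw_root with no JPEG at the mirrored location."""
    for folder, _subdirs, files in os.walk(raw_root):
        # Path of the current folder relative to the RAW base directory.
        rel = os.path.relpath(folder, raw_root)
        for name in sorted(files):
            stem, ext = os.path.splitext(name)
            if ext.lower() not in RAW_EXTS:
                continue
            # Same relative location under the JPEG base, without extension.
            mirror = os.path.normpath(os.path.join(jpeg_root, rel, stem))
            if not any(os.path.exists(mirror + e) for e in JPEG_EXTS):
                yield os.path.join(folder, name)
```

Iterating over `orphaned_raw("/path/to/raw", "/path/to/jpg")` and printing each path gives a list comparable to the log file the script writes in test mode.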

May 15, 2016 5:50 AM in response to VikingOSX

Newer version.


Changes:

  • AppleScript
    • Centralized AppleScript property for valid image extensions, and passed argument reduction
    • Added Finder Empty Trash option (uncomment to use, but permanent removal).
  • Python
    • Simplified file_validate function for use in filter
    • Added comments
    • Pythonic code improvements


Code:


-- dif_folder.applescript

-- Identifies and removes RAW image files that are no longer found in the JPEG image folder.

-- Only *moves* the files to the Trash so that Put Back option can restore them.

--

-- For RAW ["a.cr2", "b.nef", "c.cr2"] and JPEG ["a.jpg", "b.jpeg"], will move ["c.cr2"].

-- Tested: OS X 10.11.4, default OS X Python interpreter (2.7.10)

-- Version 1.2, VikingOSX, May 15, 2016, Apple Support Community. No warranties whatsoever.


property loc : ((path to home folder) as text) as alias

property valid_ext : ".cr2|.nef|.jpg|.jpeg|.jp2"


set raw_folder to choose folder with prompt "Select sub-folder with RAW files:" default location loc

set jpg_folder to choose folder with prompt "Select sub-folder with JPEG files:" default location loc


set args to raw_folder's POSIX path's quoted form & space & jpg_folder's POSIX path's quoted form & ¬
space & valid_ext's quoted form


set raw_deletes to paragraphs of folder_diff(args)


if "None" is in raw_deletes then

display dialog "No differences found between RAW and JPEG folders"

return

end if


tell application "Finder"

repeat with afile from 1 to count raw_deletes

set z to (item afile of raw_deletes) as POSIX file as text

move z to trash

end repeat

-- empty trash

end tell

display dialog "RAW files with no JPEG equivalent moved to Trash: " & (count raw_deletes) as text

return


on folder_diff(xargs)

return do shell script "python <<EOF - " & xargs & "

#!/usr/bin/python

# coding: utf-8

# Blackbox to find RAW files not in JPEG folder


from __future__ import print_function

import sys

import os


# assign passed arguments

rawf, jpgf, vext = sys.argv[1:]

valid_ext = tuple(x for x in vext.split('|') if x)


def file_validate(n):

    # n is file (implicitly passed)

    # validate extensions and exclude dot files

    if n.endswith(valid_ext) and not n.startswith('.'):

        return n

    else:

        return False


raw_files = filter(file_validate, os.listdir(rawf))

jpg_files = filter(file_validate, os.listdir(jpgf))

raw_base = [os.path.splitext(os.path.basename(fn))[0].lower() for fn in raw_files]

jpg_base = [os.path.splitext(os.path.basename(fn))[0].lower() for fn in jpg_files]


# dictionary with basenames as key and files as values

raw_dict = dict(zip(raw_base, raw_files))


# RAW files (basenames) not in JPEG folder (set difference)

unmatched = list(set(raw_base) - set(jpg_base))


# get unmatched filenames (values) from dictionary

del_raw = [raw_dict[x] for x in sorted(unmatched)]


# return full POSIX path of each RAW file to be processed

if del_raw:

    print(*('{}'.format(os.path.join(rawf, item)) for item in del_raw), sep='\\n')

else:

    print(None)

EOF"

end folder_diff
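
Note that `str.endswith` is case-sensitive, so the extension filter above would skip a file named `IMG_1.CR2` when the list contains only `.cr2`. If your camera writes upper-case extensions, a case-insensitive check along these lines may help. This is a Python 3 sketch with a hypothetical helper name, `is_wanted`; the default extension list is copied from the `valid_ext` property.

```python
def is_wanted(name, extensions=(".cr2", ".nef", ".jpg", ".jpeg", ".jp2")):
    """True for visible files whose extension is listed, ignoring case."""
    # Lower-case the name before testing so .CR2 and .cr2 are treated alike.
    return (not name.startswith(".")) and name.lower().endswith(tuple(extensions))


names = ["IMG_1.CR2", "img_2.nef", ".DS_Store", "notes.txt", "._img_3.cr2"]
print([n for n in names if is_wanted(n)])
# prints ['IMG_1.CR2', 'img_2.nef']
```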

May 15, 2016 8:43 AM in response to Darrenelite

Given a folder structure like this ...


[Screenshot of the folder structure]

... the slightly modified script below will send the highlighted files to the Trash. The file extensions are made up; they don't matter. The folder structure and subfolder names and the filenames before the extension do matter.


SG



tell application "Finder"

set fr to choose folder with prompt "Choose TOP folder with RAW files"

set fj to choose folder with prompt "Choose TOP folder with corresponding JPEG files"

set {rFiles, jFiles} to {fr's folders's files as alias list, fj's folders's files as alias list}

set {rNames, jNames} to {my stripExt(rFiles), my stripExt(jFiles)}

repeat with i from 1 to count rFiles

if rNames's item i is not in jNames then delete rFiles's item i

end repeat

end tell


to stripExt(fileList) --> folder | filename w extension removed

set nn to {}

tell application "Finder"

repeat with f in fileList

set nn's end to (f's container's name) & "|" & ¬
(f's name as text)'s text 1 thru -(((f's name extension as text)'s length) + 2)

end repeat

end tell

return nn

end stripExt
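
To see why the subfolder name is part of the comparison key, here is a small Python 3 sketch of the same "folder|filename" idea using plain path strings. The function names `key_for` and `unmatched` and the sample paths are invented for illustration. It only compares names and touches no files.

```python
import os


def key_for(path):
    """Build 'subfolder|stem' from a path such as 'RAW/Day1/a.cr2'."""
    folder = os.path.basename(os.path.dirname(path))
    stem = os.path.splitext(os.path.basename(path))[0]
    return (folder + "|" + stem).lower()


def unmatched(raw_paths, jpeg_paths):
    """Return RAW paths whose 'subfolder|stem' key has no JPEG counterpart."""
    jpeg_keys = {key_for(p) for p in jpeg_paths}
    return [p for p in raw_paths if key_for(p) not in jpeg_keys]


raw = ["RAW/Day1/a.cr2", "RAW/Day1/b.cr2", "RAW/Day2/a.cr2"]
jpg = ["JPEG/Day1/a.jpg", "JPEG/Day2/b.jpg"]
print(unmatched(raw, jpg))
# prints ['RAW/Day1/b.cr2', 'RAW/Day2/a.cr2']
```

Without the folder part of the key, `Day2/a.cr2` would be treated as matched by `Day1/a.jpg`, which is why the subfolder names must be the same on both sides.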
