Q: Find/Replace text in document with automator: any suggestions?
I'm looking to replace a number of unique HTML tags across a number of documents. Is there any way to do this without going through the documents one by one, tag by tag? I imagine I might be able to do something with Automator, but I'm open to any other suggestions. I have some knowledge of HTML, but that's about it.
Automator, OS X Mountain Lion (10.8.4)
Posted on Jul 10, 2013 11:48 AM
So I take it this is something you do regularly, not something that needs to get done once? If it were a one-shot operation, it would be simpler to use TextWrangler.
For repeated use, the simplest approach is to run through the files and apply text item delimiters to each tag:
set indesignTags to {"idtag1", "idtag2", "idtag3"}
set htmlEquivs to {"htmltag1", "htmltag2", "htmltag3"}
set theFiles to choose file with prompt "Choose indesign files" with multiple selections allowed
repeat with aFile in theFiles
    -- get file text
    set fileText to read aFile
    -- swap tags
    repeat with i from 1 to count of indesignTags
        -- swap lead tags
        set fileTextBits to tid({input:fileText, delim:"<" & item i of indesignTags})
        set fileText to tid({input:fileTextBits, delim:"<" & item i of htmlEquivs})
        -- swap trailing tags
        set fileTextBits to tid({input:fileText, delim:"</" & item i of indesignTags})
        set fileText to tid({input:fileTextBits, delim:"</" & item i of htmlEquivs})
    end repeat
    -- make new file path with html extension
    set oldFilePath to POSIX path of aFile
    set filePathBits to tid({input:oldFilePath, delim:"."})
    set last item of filePathBits to "html"
    set newFilePath to tid({input:filePathBits, delim:"."})
    -- save at new file path
    set fp to open for access (POSIX file newFilePath) with write permission
    -- clear any old contents so a shorter result does not leave stale text behind
    set eof of fp to 0
    write fileText to fp
    close access fp
end repeat

on tid({input:input, delim:delim})
    -- handler for text items: joins a list into text, or splits text into a list
    set {oldTID, my text item delimiters} to {my text item delimiters, delim}
    if class of input is list then
        set output to input as text
    else
        set output to text items of input
    end if
    set my text item delimiters to oldTID
    return output
end tid
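If it helps to see what the tid handler is doing before pointing it at real files, here is a minimal sketch that runs it on a short string instead. The sample string is hypothetical and only there for illustration; the handler is the same one used in the script above.

```applescript
-- hypothetical sample text, using the placeholder tag names from the script
set sampleText to "<idtag1>Hello</idtag1>"

-- text in, list out: split wherever the opening tag occurs
set bits to tid({input:sampleText, delim:"<idtag1"})

-- list in, text out: rejoin the pieces with the replacement tag
set swapped to tid({input:bits, delim:"<htmltag1"})

on tid({input:input, delim:delim})
    -- joins a list into text, or splits text into a list, using delim
    set {oldTID, my text item delimiters} to {my text item delimiters, delim}
    if class of input is list then
        set output to input as text
    else
        set output to text items of input
    end if
    set my text item delimiters to oldTID
    return output
end tid
```

Splitting and then rejoining with a different delimiter is what performs the replacement. Only the opening tag is swapped here; the closing tag needs a second pass with "</" in front, which is why the script does each tag in two steps.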
The text item delimiters script will work if the InDesign tags are unique and there is no complex syntax. It might fail if tags have overlapping names (e.g. "<bl>" and "<blue>") or if there's any irregular notation. If you need more sophisticated handling, you'll have to use regular expressions. In that case, download and install the Satimage osax from this page, and use the following (similar) code:
set indesignTags to {"idtag1", "idtag2", "idtag3"}
set htmlEquivs to {"htmltag1", "htmltag2", "htmltag3"}
-- set up regular expressions change lists
set findList to {}
set changeList to {}
repeat with i from 1 to count of indesignTags
    -- match "<tag" or "</tag" only when the name is not followed by another letter or digit
    set end of findList to "(</?)" & item i of indesignTags & "(?![[:alnum:]])"
    set end of changeList to "\\1" & item i of htmlEquivs
end repeat
set theFiles to choose file with prompt "Choose indesign files" with multiple selections allowed
repeat with aFile in theFiles
    -- get file text
    set fileText to read aFile
    -- swap tags - needs Satimage osax
    set fileText to change findList into changeList in fileText with regexp
    -- make new file path with html extension - needs Satimage osax
    set oldFilePath to POSIX path of aFile
    set newFilePath to change "\\.[^.]+$" into ".html" in oldFilePath with regexp
    -- save at new file path
    set fp to open for access (POSIX file newFilePath) with write permission
    -- clear any old contents so a shorter result does not leave stale text behind
    set eof of fp to 0
    write fileText to fp
    close access fp
end repeat
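To see why the overlapping-name case matters, here is a small sketch that reproduces it with the plain text item delimiters approach, so it runs without the osax. The tag names "bl", "blue" and "p" and the swapTag wrapper are hypothetical, made up for this illustration; tid is the handler from the first script.

```applescript
-- hypothetical tags chosen to overlap: "bl" is a prefix of "blue"
set sampleText to "<bl>a</bl><blue>b</blue>"
set mangled to swapTag(sampleText, "bl", "p")

on swapTag(theText, oldTag, newTag)
    -- same split-and-rejoin as the first script: opening tags, then closing tags
    set theText to tid({input:tid({input:theText, delim:"<" & oldTag}), delim:"<" & newTag})
    set theText to tid({input:tid({input:theText, delim:"</" & oldTag}), delim:"</" & newTag})
    return theText
end swapTag

on tid({input:input, delim:delim})
    -- joins a list into text, or splits text into a list, using delim
    set {oldTID, my text item delimiters} to {my text item delimiters, delim}
    if class of input is list then
        set output to input as text
    else
        set output to text items of input
    end if
    set my text item delimiters to oldTID
    return output
end tid
```

Here mangled comes out as "<p>a</p><pue>b</pue>": the "<bl" delimiter also matches the start of "<blue", so the second element is damaged. In the regular expression version, the "(?![[:alnum:]])" lookahead refuses to match when the tag name is followed by another letter or digit, so "<blue>" is left alone.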
Message was edited by: twtwtw - I made an error in the second regular expression. It should be "\\.[^.]+$", not "\\..*$". Fixed in the text.
Posted on Jul 11, 2013 7:21 AM