HT2488: Mac Basics: Automator

HT2488 Find/Replace text in document with automator: any suggestions?

1094 Views 9 Replies Latest reply: Jul 16, 2013 2:27 PM by twtwtw
srz92
Jul 10, 2013 11:48 AM

I'm looking to replace a number of unique HTML tags across a number of documents. Is there any way to do this without going through the documents one by one, tag by tag? I imagine I might be able to do something with Automator, but I'm open to any other suggestions. I have some knowledge of HTML, but that's about it.

Automator, OS X Mountain Lion (10.8.4)
  • twtwtw Level 5 (4,580 points)

    It is scriptable, yes, but refactoring markup can be a bit of a headache, depending on the complexity of the code and what precisely you need done.  There's no one-size-fits-all approach.  Can you give more details?

  • twtwtw Level 5 (4,580 points)

    Well, I don't know what InDesign tags look like.  Are they standard XML?  Does each InDesign tag have an equivalent HTML tag, or does something more complex need to happen?

  • nbar Level 5 (6,490 points)

    Are you referring to the name (or extension) of the files, or are you actually referring to content within the files?

  • twtwtw Level 5 (4,580 points)

    So I take it this is something you do regularly, not something that needs to get done once?  If it were a one-shot operation, it would be simpler to use TextWrangler.

     

    For repeated use, the simplest approach is to run through the files and apply text item delimiters to each tag:

     

    -- InDesign tag names and their HTML replacements, matched by position
    set indesignTags to {"idtag1", "idtag2", "idtag3"}
    set htmlEquivs to {"htmltag1", "htmltag2", "htmltag3"}

    set theFiles to choose file with prompt "Choose indesign files" with multiple selections allowed

    repeat with aFile in theFiles
        -- get file text
        set fileText to read aFile

        -- swap tags
        repeat with i from 1 to count of indesignTags
            -- swap lead tags: split on the old tag, then rejoin with the new one
            set fileTextBits to tid({input:fileText, delim:"<" & item i of indesignTags})
            set fileText to tid({input:fileTextBits, delim:"<" & item i of htmlEquivs})

            -- swap trailing tags
            set fileTextBits to tid({input:fileText, delim:"</" & item i of indesignTags})
            set fileText to tid({input:fileTextBits, delim:"</" & item i of htmlEquivs})
        end repeat

        -- make new file path with html extension
        set oldFilePath to POSIX path of aFile
        set filePathBits to tid({input:oldFilePath, delim:"."})
        set last item of filePathBits to "html"
        set newFilePath to tid({input:filePathBits, delim:"."})

        -- save at new file path
        set fp to open for access (POSIX file newFilePath) with write permission
        set eof of fp to 0 -- clear any existing content before writing
        write fileText to fp
        close access fp
    end repeat

    on tid({input:input, delim:delim})
        -- handler for text items: splits text on delim, or joins a list with delim
        set {oldTID, my text item delimiters} to {my text item delimiters, delim}
        if class of input is list then
            set output to input as text
        else
            set output to text items of input
        end if
        set my text item delimiters to oldTID
        return output
    end tid

     

    This will work if the InDesign tags are unique and there is no complex syntax.  It might fail if tags have overlapping names (e.g. "<bl>" and "<blue>", since "<bl" also matches the start of "<blue") or if there's any irregular notation.  If you need more sophisticated handling, you'll have to use regular expressions.  In that case, download and install the Satimage osax from this page, and use the following (similar) code:

     

    -- InDesign tag names and their HTML replacements, matched by position
    set indesignTags to {"idtag1", "idtag2", "idtag3"}
    set htmlEquivs to {"htmltag1", "htmltag2", "htmltag3"}

    -- set up regular expressions change lists
    set findList to {}
    set changeList to {}
    repeat with i from 1 to count of indesignTags
        -- match "<" or "</" plus the tag name, but only when no letter or digit follows
        set end of findList to "(</?)" & item i of indesignTags & "(?![[:alnum:]])"
        set end of changeList to "\\1" & item i of htmlEquivs
    end repeat

    set theFiles to choose file with prompt "Choose indesign files" with multiple selections allowed

    repeat with aFile in theFiles
        -- get file text
        set fileText to read aFile

        -- swap tags - needs Satimage osax
        set fileText to change findList into changeList in fileText with regexp

        -- make new file path with html extension - needs Satimage osax
        set oldFilePath to POSIX path of aFile
        set newFilePath to change "\\.[^.]+$" into ".html" in oldFilePath with regexp

        -- save at new file path
        set fp to open for access (POSIX file newFilePath) with write permission
        set eof of fp to 0 -- clear any existing content before writing
        write fileText to fp
        close access fp
    end repeat


     

    Message was edited by: twtwtw - I made an error in the second regular expression.  It should be "\\.[^.]+$", not "\\..*$".  Fixed in text.
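
    To make the overlapping-name caveat concrete, here is a minimal sketch using plain text item delimiters only (no osax).  The tag names "bl" and "blue", their replacements "li" and "span", and the handlers swapTags and replaceText are all made up for this illustration; they are not part of the scripts above.  It runs the same swap twice on a sample string, once with the shorter name first and once with the longer name first.

```applescript
-- Hypothetical sample text and tag names, chosen only to show the overlap problem
set sampleText to "<bl>item</bl> <blue>sky</blue>"

-- Short name first: "<bl" also matches the start of "<blue"
set shortFirst to swapTags(sampleText, {"bl", "blue"}, {"li", "span"})
-- shortFirst is "<li>item</li> <liue>sky</liue>" (the blue tags are mangled)

-- Long name first: "<blue" is rewritten before "<bl" is searched for
set longFirst to swapTags(sampleText, {"blue", "bl"}, {"span", "li"})
-- longFirst is "<li>item</li> <span>sky</span>"

return {shortFirst, longFirst}

on swapTags(theText, oldTags, newTags)
	-- swap opening and closing forms of each tag, in list order
	repeat with i from 1 to count of oldTags
		set theText to replaceText(theText, "<" & item i of oldTags, "<" & item i of newTags)
		set theText to replaceText(theText, "</" & item i of oldTags, "</" & item i of newTags)
	end repeat
	return theText
end swapTags

on replaceText(theText, findString, replaceString)
	-- split on findString, rejoin with replaceString, then restore the delimiters
	set oldTID to AppleScript's text item delimiters
	set AppleScript's text item delimiters to findString
	set theBits to text items of theText
	set AppleScript's text item delimiters to replaceString
	set theText to theBits as text
	set AppleScript's text item delimiters to oldTID
	return theText
end replaceText
```

    Listing longer tag names before shorter ones that share a prefix sidesteps this particular collision, assuming none of the replacement names begins with a tag name still waiting to be processed.  It does nothing for irregular notation, which is where the regular-expression version is the safer choice.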

  • twtwtw Level 5 (4,580 points)

    Hah!  Why do you think I learned how to do it?  Lazy people rule.  Or would, if it weren't such a bother.
