9 Replies Latest reply: Jul 16, 2013 2:27 PM by twtwtw
srz92 Level 1 (0 points)

I'm looking to replace a number of unique HTML tags across a number of documents. Is there any way to do this without going through the documents one by one, tag by tag? I imagine I might be able to do something with Automator, but I'm open to any other suggestions. I have some knowledge of HTML, but that's about it.


Automator, OS X Mountain Lion (10.8.4)
  • 1. Re: Find/Replace text in document with automator: any suggestions?
    twtwtw Level 5 (4,690 points)

    It is scriptable, yes, but refactoring markup can be a bit of a headache depending on the complexity of the code and what precisely you need done. There's no one-size-fits-all approach. Can you give more details?

  • 2. Re: Find/Replace text in document with automator: any suggestions?
    srz92 Level 1 (0 points)

    Sure. I'm trying to change a bunch of InDesign tags into HTML tags, so I can put the contents onto a website. Is that helpful? Or what other sorts of details would you need?

  • 3. Re: Find/Replace text in document with automator: any suggestions?
    twtwtw Level 5 (4,690 points)

    Well, I don't know what InDesign tags look like. Are they standard XML? Does each InDesign tag have an equivalent HTML tag, or does something more complex need to happen?

  • 4. Re: Find/Replace text in document with automator: any suggestions?
    nbar Level 5 (6,945 points)

    Are you referring to the name (or extension) of the files, or are you actually referring to content within the files?

  • 5. Re: Find/Replace text in document with automator: any suggestions?
    srz92 Level 1 (0 points)

    They are standard XML, and there is an equivalent HTML tag for each of them. I sometimes do the process in Coda with the find and replace function, but I'm looking to speed things up a bit.

  • 6. Re: Find/Replace text in document with automator: any suggestions?
    srz92 Level 1 (0 points)

    Actual content within the files. I'm trying to replace InDesign tags with HTML tags.

  • 7. Re: Find/Replace text in document with automator: any suggestions?
    twtwtw Level 5 (4,690 points)

    So I take it this is something you do regularly, not something that needs to get done once? If it were a one-shot operation, it would be simpler to use TextWrangler.

     

    For repeated use, the simplest approach is to run through the files and apply text item delimiters to each tag:

     

    set indesignTags to {"idtag1", "idtag2", "idtag3"}
    set htmlEquivs to {"htmltag1", "htmltag2", "htmltag3"}

    set theFiles to choose file with prompt "Choose indesign files" with multiple selections allowed

    repeat with aFile in theFiles
        -- get file text
        set fileText to read aFile

        -- swap tags
        repeat with i from 1 to count of indesignTags
            -- swap lead tags: split on the old tag, rejoin with the new one
            set fileTextBits to tid({input:fileText, delim:"<" & item i of indesignTags})
            set fileText to tid({input:fileTextBits, delim:"<" & item i of htmlEquivs})

            -- swap trailing tags
            set fileTextBits to tid({input:fileText, delim:"</" & item i of indesignTags})
            set fileText to tid({input:fileTextBits, delim:"</" & item i of htmlEquivs})
        end repeat

        -- make new file path with html extension
        set oldFilePath to POSIX path of aFile
        set filePathBits to tid({input:oldFilePath, delim:"."})
        set last item of filePathBits to "html"
        set newFilePath to tid({input:filePathBits, delim:"."})

        -- save at new file path
        set fp to open for access newFilePath with write permission
        -- clear any existing contents so a shorter rewrite leaves no stale text
        set eof fp to 0
        write fileText to fp
        close access fp
    end repeat

    on tid({input:input, delim:delim})
        -- handler for text items: a list is joined with delim, a string is split on delim
        set {oldTID, my text item delimiters} to {my text item delimiters, delim}
        if class of input is list then
            set output to input as text
        else
            set output to text items of input
        end if
        set my text item delimiters to oldTID
        return output
    end tid
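
    To see what the tid handler does on its own, here is a minimal sketch that runs it on a single string instead of a file. The sample text and the tag names "idtag1" and "htmltag1" are placeholders for illustration, not real InDesign tags. Splitting on the old tag and rejoining with the new one is the whole trick.

    ```applescript
    -- Placeholder sample text; the tag names are made up for illustration.
    set sampleText to "<idtag1>Hello</idtag1> and <idtag1>again</idtag1>"

    -- Split on the old opening tag, then rejoin the pieces with the new one.
    set bits to tid({input:sampleText, delim:"<idtag1"})
    set sampleText to tid({input:bits, delim:"<htmltag1"})

    -- Same again for the closing tag.
    set bits to tid({input:sampleText, delim:"</idtag1"})
    set sampleText to tid({input:bits, delim:"</htmltag1"})

    return sampleText
    -- result: "<htmltag1>Hello</htmltag1> and <htmltag1>again</htmltag1>"

    on tid({input:input, delim:delim})
        -- a list is joined with delim, a string is split on delim
        set {oldTID, my text item delimiters} to {my text item delimiters, delim}
        if class of input is list then
            set output to input as text
        else
            set output to text items of input
        end if
        set my text item delimiters to oldTID
        return output
    end tid
    ```

    Running this in the script editor before touching real files is a cheap way to check that a given pair of tags swaps the way you expect.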

     

    This approach will work if the InDesign tags are unique and there is no complex syntax. It might fail if tags have overlapping names (e.g. "<bl>" and "<blue>", since "<bl" also matches the start of "<blue") or if there's any irregular notation. If you need more sophisticated handling, you'll have to use regular expressions. In that case, download and install the Satimage osax from this page, and use the following (similar) code:

     

    set indesignTags to {"idtag1", "idtag2", "idtag3"}
    set htmlEquivs to {"htmltag1", "htmltag2", "htmltag3"}

    -- set up regular expression change lists
    set findList to {}
    set changeList to {}
    repeat with i from 1 to count of indesignTags
        -- match "<tag" or "</tag" only when the name is not followed by another letter or digit
        set end of findList to "(</?)" & item i of indesignTags & "(?![[:alnum:]])"
        set end of changeList to "\\1" & item i of htmlEquivs
    end repeat

    set theFiles to choose file with prompt "Choose indesign files" with multiple selections allowed

    repeat with aFile in theFiles
        -- get file text
        set fileText to read aFile

        -- swap tags - needs Satimage osax
        set fileText to change findList into changeList in fileText with regexp

        -- make new file path with html extension - needs Satimage osax
        set oldFilePath to POSIX path of aFile
        set newFilePath to change "\\.[^.]+$" into ".html" in oldFilePath with regexp

        -- save at new file path
        set fp to open for access newFilePath with write permission
        -- clear any existing contents so a shorter rewrite leaves no stale text
        set eof fp to 0
        write fileText to fp
        close access fp
    end repeat
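
    As a rough illustration of the overlapping-names case, here is a small sketch that applies the same two patterns to sample strings. It assumes the Satimage osax is installed. The tags "bl" and "blue", the replacement "ul", and the file path are all made up for the example.

    ```applescript
    -- Requires the Satimage osax. All tag names and paths below are hypothetical.
    set sampleText to "<bl>one</bl> <blue>two</blue>"

    -- (</?) captures "<" or "</" so the replacement can put it back as \1.
    -- The lookahead (?![[:alnum:]]) rejects a match when the tag name carries on
    -- with a letter or digit, so "<blue" is not mistaken for "<bl".
    set findPattern to "(</?)bl(?![[:alnum:]])"
    set changePattern to "\\1ul"
    set swappedText to change findPattern into changePattern in sampleText with regexp

    -- Extension swap: match the last dot and everything after it, so a name
    -- with several dots only loses its final extension.
    set samplePath to "/Users/me/Documents/chapter.one.xml"
    set newPath to change "\\.[^.]+$" into ".html" in samplePath with regexp

    return {swappedText, newPath}
    ```

    The intent is that only the "bl" tags are rewritten while the "blue" tags pass through untouched, and that only ".xml" is replaced in the path.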


     

    Message was edited by: twtwtw - I made an error in the second regular expression. It should be "\\.[^.]+$", not "\\..*$". Fixed in the text.

  • 8. Re: Find/Replace text in document with automator: any suggestions?
    srz92 Level 1 (0 points)

    Many thanks! This saved many an hour stuck in front of the screen.

  • 9. Re: Find/Replace text in document with automator: any suggestions?
    twtwtw Level 5 (4,690 points)

    Hah! Why do you think I learned how to do it? Lazy people rule. Or would, if it weren't such a bother.