maxbrokenculture

Q: Applescript Find & Replace in Text Document from Excel Sheet

I'm new to Applescript and am trying to create a script that can run through a text document (.html) and find & replace various phrases with a corresponding cell in an Excel document.

I will put placeholders in the text to be replaced as B_1 (first value to replace), B_2 (second), B_3 etc...

They should be replaced in order using the cells B1-B9 in the Excel document. The phrases in the Excel document will be changed each time, hence why it has to be dynamic to capture them rather than a standard find & replace script with static values.

I've been following this thread which all makes sense:https://discussions.apple.com/thread/7008048?start=0&tstart=0

However, I keep running into compile errors. Here is my current code:

tell application "Microsoft Excel" set colB_data to value of every cell of range "B1:B9" of worksheet 1 of document 1 end tell tell application "Sublime Text" tell text of "Users/maxquinn/Desktop/index.html" repeat with index from 1 to count colB_data replace ("B_" & index as text) using (item index of colB_data) options {starting at top:true, match words:true} end repeat end tell end tell 

The first half works fine, but the second half gives me the error "Expected end of line but found identifier." and highlights the 'using' in the 'replace' command (line 8).

Does anyone know why this might be, and are there any other glaring errors in the script?

Thanks!

Max

iMac

Posted on Jul 22, 2016 6:25 AM

Close

Q: Applescript Find & Replace in Text Document from Excel Sheet

  • All replies
  • Helpful answers

Page 1 Next
  • by VikingOSX,Helpful

    VikingOSX VikingOSX Jul 22, 2016 7:06 AM in response to maxbrokenculture
    Level 7 (20,606 points)
    Mac OS X
    Jul 22, 2016 7:06 AM in response to maxbrokenculture

    Restructure your AppleScript in the Apple Script Editor (so we don't have to peer at one continuous line of code), and then paste back here using the advanced editor functionality. Then select all of the posted AppleScript, and from the Style menu, select paragraph. Now, your code is (hopefully) legible to us.

    Screen Shot 2016-07-22 at 9.54.42 AM.jpg

    Second point. Sublime Text does not provide any AppleScript scripting definition, so you cannot manipulate it with AppleScript.


    TextEdit is scriptable, and you likely want it in plain text mode as opposed to Rich Text (RTF).

  • by maxbrokenculture,

    maxbrokenculture maxbrokenculture Jul 22, 2016 7:08 AM in response to VikingOSX
    Level 1 (4 points)
    Mac OS X
    Jul 22, 2016 7:08 AM in response to VikingOSX

    Apologies! Didn't realise that had dropped in as one line. Here you go (I can't edit original post now):

     

    tell application "Microsoft Excel"

      set colB_data to value of every cell of range "B1:B9" of worksheet 1 of document 1

    end tell

     

    tell application "TextEdit"

      tell text of "Users/maxquinn/Desktop/index.html"

      repeat with index from 1 to count colB_data

      replace("B_" & index as text) using (item index of colB_data) options {starting at top:true, match words:true}

      end repeat

      end tell

    end tell

     

    As for the Sublime Text point: thanks, I wasn't aware of that. However, changing to TextEdit gives exactly the same error, does the file need to be rtf rather than html?

     

    Cheers!

     

    Max

  • by VikingOSX,Solvedanswer

    VikingOSX VikingOSX Jul 22, 2016 9:01 AM in response to maxbrokenculture
    Level 7 (20,606 points)
    Mac OS X
    Jul 22, 2016 9:01 AM in response to maxbrokenculture

    Given the following HTML input file, the following AppleScript will replace the B_n values with the contents of the list, and writes a new file out with the changes.

     

        Before                                                                            After

    Screen Shot 2016-07-22 at 11.47.40 AM.jpg    Screen Shot 2016-07-22 at 11.49.47 AM.jpg

    use scripting additions

     

    set orig_html to ((path to desktop) as text) & "test.html"

    set replaced_html to ((path to desktop) as text) & "replaced.html"

     

    -- replace this with the tell block to get the Excel spreadsheet cells into a list

    set colB_data to {"A", "B", "C", "D", "E", "F", "G", "H", "I"}

     

    tell application "TextEdit"

     

      set myHTML to read file orig_html

     

      repeat with index from 1 to count of colB_data

           set findStr to ("B_" & index) as text

           set replaceStr to (item index of colB_data) as text

           -- distinctly quoted command-line arguments for Ruby

           set args to findStr's quoted form & space & replaceStr's quoted form & space & myHTML's quoted form

           set myHTML to my ruby_replace(args) as text

      end repeat

      -- write changes out

      set replaced to open for access replaced_html with write permission

      set eof replaced to 0

      write (myHTML as text) to replaced

      close access replaced

    end tell

    return

     

    on ruby_replace(sargs)

      return do shell script "ruby <<'EOF' - " & sargs & "

    #!/usr/bin/ruby

    # coding: utf-8

     

    sfind, sreplace, txt = ARGV

    print txt.gsub(/#{sfind}/, \"#{sreplace}\")

    EOF"

     

    end ruby_replace

  • by maxbrokenculture,

    maxbrokenculture maxbrokenculture Jul 22, 2016 9:01 AM in response to VikingOSX
    Level 1 (4 points)
    Mac OS X
    Jul 22, 2016 9:01 AM in response to VikingOSX

    That's it! Thank you very much for all your help.

  • by VikingOSX,

    VikingOSX VikingOSX Jul 23, 2016 5:11 AM in response to maxbrokenculture
    Level 7 (20,606 points)
    Mac OS X
    Jul 23, 2016 5:11 AM in response to maxbrokenculture

    Here is an update to my previous AppleScript solution. It provides the same result, and uses AppleScript/Objective-C, instead of the Ruby handler overhead — to more efficiently perform find/replace on your HTML content. Tested with AppleScript 2.5 on El Capitan 10.11.6.

     

    -- replace specified text in HTML with text from list items

    -- uses AppleScript/Objective-C to do the find/replace without function call overhead.

    -- Version 2, VikingOSX, Jul 23, 2016, Apple Support Communities

     

    use scripting additions

    use framework "Foundation"

    use AppleScript version "2.4"

     

    set orig_html to ((path to desktop) as text) & "test.html"

    set replaced_html to ((path to desktop) as text) & "replaced.html"

     

    -- replace this with the tell block to get the Excel spreadsheet cells into a list

    set colB_data to {"A", "B", "C", "D", "E", "F", "G", "H", "I"}

     

    tell application "TextEdit"

     

           set myHTML to read file orig_html

     

           repeat with index from 1 to count of colB_data

                set findStr to ("B_" & index) as text

                set replaceStr to (item index of colB_data) as text

                set srcString to (current application's NSString's stringWithString:myHTML)

                set myHTML to (srcString's stringByReplacingOccurrencesOfString:findStr ¬

                     withString:replaceStr) as text

           end repeat

    end tell

     

    -- write changes out

    set replaced to open for access replaced_html with write permission

    set eof replaced to 0

    write (myHTML as text) to replaced

    close access replaced

    return

  • by maxbrokenculture,

    maxbrokenculture maxbrokenculture Jul 25, 2016 1:49 AM in response to VikingOSX
    Level 1 (4 points)
    Mac OS X
    Jul 25, 2016 1:49 AM in response to VikingOSX

    Thanks pal! Although I'm getting one problem with both of them:

     

    It works absolutely fine for the first 9 values 1-9. However, when it gets into double digits it just seems to loop back round.

     

    B_10 - B_19 takes the value of B_1

    B_20 - B_29 takes the value of B_2
    etc

     

    It's as if it only reads the first number? the tell block to get the Excel values works perfectly up to any value so it doesn't seem like that's the problem. Any ideas?

     

    Thanks again for all your help!

     

    Max

  • by VikingOSX,

    VikingOSX VikingOSX Jul 25, 2016 3:00 AM in response to maxbrokenculture
    Level 7 (20,606 points)
    Mac OS X
    Jul 25, 2016 3:00 AM in response to maxbrokenculture

    Looking at the code again.

  • by VikingOSX,

    VikingOSX VikingOSX Jul 25, 2016 4:41 AM in response to maxbrokenculture
    Level 7 (20,606 points)
    Mac OS X
    Jul 25, 2016 4:41 AM in response to maxbrokenculture

    If for example there are 26 items in the colB_data list from Excel, and there are 26 B_nn entries in the HTML, then each HTML entry now uniquely maps to each successive item in the Excel list.

     

    If there are fewer Excel list items than B_nn fields in the HTML, replacements will occur in the HTML until the Excel list items are used, and the remainder of the B_nn fields are untouched.

     

    If there are more Excel list items than B_nn fields in the HTML, replacements occur from the Excel list until B_nn fields are all replaced.

     

    -- version 2 to uniquely replace B_nn HTML text

    use scripting additions

    use framework "Foundation"

    use AppleScript version "2.4"

     

    set orig_html to ((path to desktop) as text) & "test.html"

    set replaced_html to ((path to desktop) as text) & "replaced.html"

     

    -- replace this with the tell block to get the Excel spreadsheet cells into a list

    set colB_data to {"A", "B", "C", "D", "E", "F", "G", "H", "I"}

     

    tell application "TextEdit"

     

           set myHTML to read file orig_html

     

           repeat with index from 1 to count of colB_data

                set findStr to ("B_" & index) as text

                set replaceStr to (item index of colB_data) as text

                set srcString to (current application's NSString's stringWithString:myHTML)

                -- updated

                set loc to (srcString's rangeOfString:findStr)

                set myHTML to (srcString's stringByReplacingCharactersInRange:loc withString:replaceStr) as text

           end repeat

    end tell

     

    -- write changes out

    set replaced to open for access replaced_html with write permission

    set eof replaced to 0

    write (myHTML as text) to replaced

    close access replaced

    return

     

      se

  • by maxbrokenculture,

    maxbrokenculture maxbrokenculture Jul 25, 2016 5:05 AM in response to VikingOSX
    Level 1 (4 points)
    Mac OS X
    Jul 25, 2016 5:05 AM in response to VikingOSX

    Same thing again I'm afraid. Weirdly, this one goes to B_11 and then stops there. Also on top of that, whereas the other one would cope fine with later instances of B_2 (for example) in the HTML and change those to match B_2, this one only does the first instance.

     

    Would it help to see the html I'm using so you can see the aim? I can't include it here as it's too long but can email if you think it can help? It includes B_1 up to B_38 and many of those are repeated multiple times.

     

    I've also tried changing B_1 through B_9 to read B_01 through B_09 but nothing is replaced at all when I do that.

     

    Thanks!

  • by VikingOSX,

    VikingOSX VikingOSX Jul 25, 2016 6:21 AM in response to maxbrokenculture
    Level 7 (20,606 points)
    Mac OS X
    Jul 25, 2016 6:21 AM in response to maxbrokenculture

    The original specification that you posted, and that I have provided as an updated solution earlier this morning, assumes a sequential, unique run of list item replacements from B_1 to B_nn.

     

    Now, you share that it is not entirely sequential, and that there may occur multiple, out-of-sequence instances of B_nn strings. This eliminates a sequential repeat loop based on indices, and suggests that the application needs to harvest the B_nn HTML text instances into a list, and use their index in that list to select their corresponding index (and replacement string) from the Excel list.

     

    Is this a correct assessment?

     

    If so, then this has just now become more complicated, and time consuming to code and test.

     

    I do not share my personal contact information within the Apple Support Community, partially because it conflicts with Apple's terms of use, and because Google indexes these posts. I do not want my email harvested by spammers.

  • by maxbrokenculture,

    maxbrokenculture maxbrokenculture Jul 25, 2016 6:35 AM in response to VikingOSX
    Level 1 (4 points)
    Mac OS X
    Jul 25, 2016 6:35 AM in response to VikingOSX

    That's correct yes, although due to a misunderstanding/poor wording on my part rather than intentional scope creep. I didn't mean to imply in my original post that the values within the html would be in order, merely that they have values which correspond to the order of the Excel cells (B_2 = B2, B_3 = B3 etc). Apologies I didn't phrase that more clearly.

     

    And that's completely understandable re: contact information. I just thought it may help further clarify my question and wasn't sure if there was a way of PMing or anything

     

    Thanks again,

     

    Max

  • by Hiroto,

    Hiroto Hiroto Jul 25, 2016 7:46 AM in response to maxbrokenculture
    Level 5 (7,281 points)
    Jul 25, 2016 7:46 AM in response to maxbrokenculture

    Hello

     

    You might try something like the following script.

     

    Excel part is currently commented out and values of xx and yy are given manually, for I don't have Excel at hand. Cell name Bn in xx is converted to B_n before used as search string.

     

     

    set infile to (choose file of type {"public.html"})'s POSIX path
    set outfile to (choose file name default name "out.html")'s POSIX path
    
    (*
    tell application "Microsoft Excel"
        tell active sheet's range "B1:B14"
            set {xx, yy} to every cell's {name, value}
        end tell
    end tell
    *)
    set xx to {"B1", "B2", "B3", "B4", "B5", "B6", "B7", "B8", "B9", "B10", "B11", "B12", "B13", "B14"}
    set yy to {"A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N"}
    
    set tsv to ""
    repeat with i from 1 to count xx
        set tsv to tsv & xx's item i & tab & yy's item i & linefeed
    end repeat
    
    set args to ""
    repeat with a in {infile, tsv}
        set args to args & a's quoted form & space
    end repeat
    
    do shell script "/usr/bin/ruby -w <<'EOF' - " & args & " > " & outfile's quoted form & "
    # 
    #   ARGV[0] : input html file
    #   ARGV[1] : TSV text of search and replace strings separated by TAB and terminated by LF
    #           
    #           * Search string is preprocessed so that leading non-numeric part and succeeding numeric part
    #           are separated by U+005F LOW LINE. E.g., AB10 is preprocessed as AB_10.
    # 
    t = ''
    File.open(ARGV.shift, 'r') do |f|
        t = f.read
    end
    ARGV.shift.scan(/(.*?)\\t(.*)/).each do |x, y|
        x.sub!(/^(\\D+?)_*(\\d+)/, '\\1_\\2')           # AB10 => AB_10 (AB_10 => AB_10)
        t.gsub!(/(\\W)#{x}(\\D)/, '\\1%s\\2' % y)
    end
    puts t
    EOF"
    

     

     

     

    Briefly tested under OS X 10.6.8.

     

    Good luck,

    H

  • by Hiroto,

    Hiroto Hiroto Jul 25, 2016 8:07 AM in response to Hiroto
    Level 5 (7,281 points)
    Jul 25, 2016 8:07 AM in response to Hiroto

    Minor revision to process UTF-8 html file in Ruby 2.x correctly. Added encoding option r:utf-8 when opening file.

     

     

    set infile to (choose file of type {"public.html"})'s POSIX path
    set outfile to (choose file name default name "out.html")'s POSIX path
    
    (*
    tell application "Microsoft Excel"
        tell active sheet's range "B1:B14"
            set {xx, yy} to every cell's {name, value}
        end tell
    end tell
    *)
    set xx to {"B1", "B2", "B3", "B4", "B5", "B6", "B7", "B8", "B9", "B10", "B11", "B12", "B13", "B14"}
    set yy to {"A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N"}
    
    set tsv to ""
    repeat with i from 1 to count xx
        set tsv to tsv & xx's item i & tab & yy's item i & linefeed
    end repeat
    
    set args to ""
    repeat with a in {infile, tsv}
        set args to args & a's quoted form & space
    end repeat
    
    do shell script "/usr/bin/ruby -w <<'EOF' - " & args & " > " & outfile's quoted form & "
    # 
    #   ARGV[0] : input html file
    #   ARGV[1] : TSV text of search and replace strings separated by TAB and terminated by LF
    #           
    #           * Search string is preprocessed so that leading non-numeric part and succeeding numeric part
    #           are separated by U+005F LOW LINE. E.g., AB10 is preprocessed as AB_10.
    # 
    t = ''
    File.open(ARGV.shift, 'r:utf-8') do |f|
        t = f.read
    end
    ARGV.shift.scan(/(.*?)\\t(.*)/).each do |x, y|
        x.sub!(/^(\\D+?)_*(\\d+)/, '\\1_\\2')           # AB10 => AB_10 (AB_10 => AB_10)
        t.gsub!(/(\\W)#{x}(\\D)/, '\\1%s\\2' % y)
    end
    puts t
    EOF"
    
  • by maxbrokenculture,

    maxbrokenculture maxbrokenculture Jul 25, 2016 9:48 AM in response to Hiroto
    Level 1 (4 points)
    Mac OS X
    Jul 25, 2016 9:48 AM in response to Hiroto

    Thats it! Thank you VERY much, both of you

     

    I made some slight amendments to include the Excel as you mentioned, final working script here if you're interested.

     

     

     

     

     

    set infile to (choose file of type {"public.html"})'s POSIX path

    set outfile to (choose file name default name "out.html")'s POSIX path

     

    tell application "Microsoft Excel"

      tell active sheet's range "B1:B38"

      set {xx, yy} to every cell's {name, value}

      end tell

    end tell

     

    set xx to {"B1", "B2", "B3", "B4", "B5", "B6", "B7", "B8", "B9", "B10", "B11", "B12", "B13", "B14", "B15", "B16", "B17", "B18", "B19", "B20", "B21", "B22", "B23", "B24", "B25", "B26", "B27", "B28", "B29", "B30", "B31", "B32", "B33", "B34", "B35", "B36", "B37", "B38"}

     

    tell application "Microsoft Excel"

      set yy to value of every cell of range "B1:B38" of worksheet 1 of document 1

    end tell

     

    set tsv to ""

    repeat with i from 1 to count xx

      set tsv to tsv & xx's item i & tab & yy's item i & linefeed

    end repeat

     

    set args to ""

    repeat with a in {infile, tsv}

      set args to args & a's quoted form & space

    end repeat

     

    do shell script "/usr/bin/ruby -w <<'EOF' - " & args & " > " & outfile's quoted form & "

    #

    #   ARGV[0] : input html file

    #   ARGV[1] : TSV text of search and replace strings separated by TAB and terminated by LF

    #          

    #           * Search string is preprocessed so that leading non-numeric part and succeeding numeric part

    #           are separated by U+005F LOW LINE. E.g., AB10 is preprocessed as AB_10.

    #

    t = ''

    File.open(ARGV.shift, 'r:utf-8') do |f|

        t = f.read

    end

    ARGV.shift.scan(/(.*?)\\t(.*)/).each do |x, y|

        x.sub!(/^(\\D+?)_*(\\d+)/, '\\1_\\2')           # AB10 => AB_10 (AB_10 => AB_10)

        t.gsub!(/(\\W)#{x}(\\D)/, '\\1%s\\2' % y)

    end

    puts t

    EOF"

Page 1 Next