APPLESCRIPT AND HTML PARSING.

Hi,

I'm new to AppleScript, so I'm not quite sure if what I want to do is actually called HTML parsing. Basically, I want an AppleScript variable that is linked to a value in a page's HTML, but I don't know how to make AppleScript access data inside the HTML code. To give you a better idea, inside the HTML is something like this:

100

That value "100" changes, but its maximum is 100. I want to create a script that responds when the value starts to drop by loading another link.

Am I making sense? Again, what I'd like to achieve is to have AppleScript use that value inside the HTML as its own variable (and perform the right actions as the value changes).

Any help would be appreciated.

15" Macbook Pro, Mac OS X (10.5.8)

Posted on Mar 5, 2010 3:02 PM

17 replies

Mar 13, 2010 4:56 PM in response to intrikate01

Using the "document.getElementById('someID').innerHTML" statement from the previous example works for me. I don't know enough about JavaScript to debug your problem, especially since I do not have a valid page to test against. If the HTML is invalid or malformed, or the ID is not correct, the JavaScript will fail. Searching the text may also fail in this case, although it doesn't depend on valid HTML (it just looks for the beginning and ending tags). A more complete script using the text handler would be:

<pre style="
font-family: Monaco, 'Courier New', Courier, monospace;
font-size: 10px;
font-weight: normal;
margin: 0px;
padding: 5px;
border: 1px solid #000000;
width: 720px; height: 340px;
color: #000000;
background-color: #DAFFB6;
overflow: auto;"
title="this text can be pasted into the Script Editor">
set theURL to "http://www.example.com" -- this will be the page to load - change as needed

tell application "Safari"
    open location theURL
    if not my pageLoaded(10) then -- did the page load?
        display dialog "page did not load" buttons {"OK"} -- nope
        error number -128 -- cancel
    else -- try to find the element in the page source
        set sourceText to source of document 1
        set currentHP to my getHTMLElement(sourceText, "<span id=\"currentHP\"", "</span>", true)
    end if
end tell

if currentHP is not {} then -- the result is a list, so just show the first item of any contents
    display dialog "currentHP = " & first item of currentHP buttons {"OK"}
    -- if ((first item of currentHP) as integer) is less than 100 then tell application "Safari" to open location "http://www.apple.com" -- open another URL
else
    display dialog "item not found" buttons {"OK"}
end if



on pageLoaded(timeToWait)
    (*
    waits up to the timeToWait for the current Safari page to load
    parameters - timeToWait [integer]: a maximum timeout value in seconds
    returns [boolean]: true if page loaded, false if timeout
    *)
    set interval to 2 -- change interval as desired
    delay interval
    repeat ((timeToWait - interval) div interval) times -- check every interval seconds
        tell application "Safari"
            if (do JavaScript "document.readyState" in document 1) is "complete" then
                return true
            else
                delay interval
            end if
        end tell
    end repeat
    return false
end pageLoaded


to getHTMLElement(someText, openTag, closeTag, contentsOnly)
    (*
    return a list of the specified HTML element in someText
    parameters - someText [mixed]: the text to look at
                 openTag [text]: the opening tag (the ending ">" will be searched for if the tag is incomplete)
                 closeTag [text]: the closing tag (the tag should be complete when returning the element)
                 contentsOnly [boolean]: true returns just the contents, false returns the entire element
    returns [list]: a list of the HTML elements found - {} if none
    *)
    set someText to someText as text
    set currentOffset to 0 -- the current offset in the text buffer
    set elementList to {} -- the list of elements found
    try
        repeat while currentOffset is less than (count someText)
            set currentOffset to currentOffset + 1

            set here to offset of openTag in (text currentOffset thru -1 of someText) -- start of opening tag
            if here is 0 then exit repeat -- not found
            set currentOffset to currentOffset + here
            set currentTag to currentOffset - 1 -- mark the start of the element
            if openTag does not end with ">" then -- find the close of the tag
                set here to offset of ">" in (text (currentOffset - 1) thru -1 of someText) -- end of opening tag
                if here is 0 then exit repeat -- not found
                set currentOffset to currentOffset + here - 1
            else
                set currentOffset to currentOffset + (count openTag) - 1
            end if
            set here to currentOffset

            set there to offset of closeTag in (text currentOffset thru -1 of someText) -- end tag
            if there is 0 then exit repeat -- not found
            set currentOffset to currentOffset + there + (count closeTag) - 2
            set there to currentOffset

            if contentsOnly then -- add the element contents
                set the end of elementList to text here thru (there - (count closeTag)) of someText
            else -- add the complete element (tags and contents)
                set the end of elementList to text currentTag thru there of someText
            end if

        end repeat
    on error errorMessage number errorNumber
        if (errorNumber is -128) or (errorNumber is -1711) then -- nothing (user cancelled)
        else
            activate me
            display alert "Error " & (errorNumber as string) message errorMessage as warning buttons {"OK"} default button "OK"
        end if
    end try
    return elementList
end getHTMLElement
</pre>
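
On the JavaScript side, getElementById returns null when no element has the requested ID, so reading .innerHTML from the result throws instead of returning a value. A minimal sketch of a more defensive lookup follows. The function name readElementText is made up for illustration, and passing the document in as a parameter is only an assumption so the function can be tried outside a browser.

```javascript
// Hypothetical helper (not part of Safari or the scripts in this thread):
// return the trimmed text of the element with the given ID,
// or an empty string when no such element exists.
// `doc` is the page's `document` object, passed in so a stub can be used for testing.
function readElementText(doc, elementId) {
  var element = doc.getElementById(elementId);
  if (element === null || element === undefined) {
    return ""; // ID not found: return an empty string instead of throwing
  }
  return String(element.textContent).trim();
}
```

In Safari this would be called with the real document object, and the AppleScript side can compare the result against the empty string before using it.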

Mar 15, 2010 12:40 AM in response to red_menace

Never mind, guys, I've figured it out. Since I used the "do JavaScript" command to set currentHP, all I needed to do was write it out like this:


set theURL1 to "www.example.com/index.php?"
set theURL2 to "www.example.com/ex2.php?"
set theURL3 to "www.example.com/ex3.php?"
tell application "Safari"
    set the URL of tab 1 of window 1 to theURL1
    set currentHP to do JavaScript "document.getElementById('healthCurrent').innerHTML" in tab 1 of window 1
    if currentHP contains "100" then
        set the URL of tab 1 of window 1 to theURL2
    else
        set the URL of tab 1 of window 1 to theURL3
    end if
end tell


Adding the line 'if currentHP contains "100" then' lets me look for that value inside that span ID. Based on a few websites I found while researching on Google, I had also concluded that JavaScript's getElementByClassName and getElementByTag no longer work (the actual method names are getElementsByClassName and getElementsByTagName), whereas getElementById does, since it can be applied to anything that has been labeled with an ID.
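
Lookups by class or tag return a collection rather than a single element, so they need an index before the contents can be read. The sketch below is only an illustration under stated assumptions: the function name readHealth and the class name "health" are made up, and the document is passed in as a parameter so the function can be tried outside a browser. It also converts the text to a number, since a 'contains "100"' test would match text such as "1000" as well.

```javascript
// Hypothetical helper: read a numeric value from the page.
// Tries the ID first, then falls back to the first element with the given class.
// Returns -1 when nothing is found or the text is not a number.
function readHealth(doc, elementId, className) {
  var element = doc.getElementById(elementId);
  if (!element) {
    // getElementsByClassName returns a collection, not a single element
    var matches = doc.getElementsByClassName(className);
    element = matches.length > 0 ? matches[0] : null;
  }
  if (!element) {
    return -1; // no matching element at all
  }
  var value = parseInt(element.textContent, 10);
  return isNaN(value) ? -1 : value;
}
```

With a numeric result, the AppleScript side could compare against 100 with "is less than" instead of matching text.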

Again, thanks, guys, for your time. It was great getting your feedback; you helped me a lot.
