-
All replies
-
Helpful answers
-
Dec 17, 2014 6:07 PM in response to Orchardjoanby SGIII,The script in the other thread is a pretty good working example of how to have AppleScript fetch html, manipulate it, extract a value, and place the value in a table in a Numbers document. If you experiment with it you will see how it works and can have a go at adapting it to fetch the dividend information. It isn't that easy, though. It usually takes quite a bit of fiddling around to get something like this to work. But it does look as if this approach will work on the Key Statistics page.
SG
-
Dec 18, 2014 4:25 AM in response to SGIIIby Orchardjoan,TThanks so much. I think you have devoted above and beyond to put me on the right path, and it is now up to me, but what is the other thread you mentioned I should look at?
-
Dec 18, 2014 4:42 AM in response to SGIIIby Orchardjoan,CCan you point me to that thread? I will work on it and not bug you anymore. I know you have a lot to do And have spent a lot of time getting me going. I have learned at lot on this thread and hadn't even heard about scraping.
-
Dec 18, 2014 4:46 AM in response to Orchardjoanby SGIII,I was referring to the link in my post up thread.
For an example of that type of approach (used to fetch an FX currency conversion rate) see the script in this thread.
I suggest you try experimenting with the script there to get an idea of how the approach works.
SG
-
Dec 19, 2014 10:55 AM in response to SGIIIby Orchardjoan,I am trying very hard to get data with the URL from yahoo stock, (next challenge will be the edit), but first I have to retrieve it. I thought if I just modified the stock quote for current price it would work, but it says there is a missing format variable, so I an't figure out how to tell Applescript that I want http://finance.yahoo.com/q/ks?s = VZ+key+statics (VZ is the value of cell(1)). I searched with safari to try to find out the format for the request. I tried setting parms to "VZ+key+statistics" just to see if that works but that didn't seem to work either. Sorry to be such a newbie. I have a book on applescript, but it doesn't help for this kind of script. Thanks
(* the rest of this is to get the key statistics and find the forward annual dividend
then put it into the table.*)
set t3 to "http://" & "download.finance.yahoo.com/d/quotes.csv?s="
tell application "Numbers" to tell document trueDoc to tell sheet trueSheet to tell table trueTable
tell row (2)
set parms to get value of cell (1)
end tell
set request to t3 & parms & "key+statistics"
end tell
set texte to download(request)
tell application "Numbers" to tell document trueDoc to tell sheet trueSheet to tell table trueTable
tell row (60)
set value of cell (25) to texte
end tell
end tell
endtel
end tell
end tell
-
Dec 19, 2014 5:05 PM in response to Orchardjoanby SGIII,Where the previous script has:
set yData to do shell script "curl -s " & quoted form of (baseURL & symbStr & propStr)
you'll need to do something like this:
set baseURL to http://finance.yahoo.com/q/ks?
set symbStr to "s=" & qSymb
set propStr to "Key + Statistics"
For Verizon the idea is to construct this string (what you see in the browser address when you view the Key Statistics page):
http://finance.yahoo.com/q/ks?s=VZ+Key+Statistics
To troubleshoot you can temporally put a
return baseURL & symbStr & propStr
in the script. That way you can verify the script is constructing the correct string to feed to curl (which is just a handy command line command to retrieve the html for that web page and place it in yData.
SG
-
Dec 20, 2014 1:02 PM in response to SGIIIby Orchardjoan,I think I may have to give up. I tried your suggestion and stripped it down to a script that is minimal. One stock, the script output on
return (baseURL & symbStr & propStr)
is "http:finance.yahoo.com/q/ks?s=VZ+Key+Statistics"
and the error code is -609 invalid connection.
I have tried multiple variations such as "Key+Statistics" but all get the same.
I have learned a lot with your help, but it appears I can't master this one. Thanks.
-
Dec 21, 2014 4:12 AM in response to Orchardjoanby SGIII,If you want to post your script here, can have a look to try to diagnose what is generating the error.
SG
-
Dec 21, 2014 6:56 AM in response to SGIIIby Orchardjoan,You are amazingly patient. You do not know how much I appreciate this. Maybe you should write an ebook. Sorry this is right justified. Left justified cut it off for some reason.
property theApp : "Numbers"
on run
local texte, rowsCount, trueDoc, trueSheet, trueTable
say "starting"
tell application "Numbers"
activate
set trueDoc to name of front document
set trueSheet to name of first sheet of document trueDoc
set trueTable to name of first table of sheet trueSheet of document trueDoc
end tell
(* get the key statistics and find the forward annual dividend
then put it into the table.*)
tell application "Numbers" to tell document trueDoc to tell sheet trueSheet to tell table trueTable
set qSymb to "VZ"
set baseURL to "http://download.finance.yahoo.com/q/ks?"
set symbStr to "s=" & qSymb
set symProp to "+Key+Statistics"
set yData to do shell script "curl -s" & quoted form of (baseURL & symbStr & symProp)
tell row (8)
set value of cell (3) to yData
end tell
end tell
say "get quotes done"
end run
-
Dec 21, 2014 6:46 PM in response to Orchardjoanby SGIII,You're definitely getting there. You need a space after "curl -s", i.e. you need "curl -s " instead of "curl -s". That's because the shell script that fetches the html source code for the page is fussy about syntax. Unix expects a space before the rest of the command.
I think the reason you're getting a connection error is that yData is a very long string that the script is trying to stuff into one Numbers cell.
You first need to do your extraction of desired value from yData before placing it in a cell, rather than placing the the whole long yData string.
Your problem with AppleScript indentation results from a now months-old bug in the forum software that would appear easy for Apple to correct but they seem to have other priorities. A quick workaround is to go click the HTML at the top of the editor window and wherever you see a pattern like "text-indent: -41.6px; " remove the minus sign. Then click Show Full Editor to get back to your normal view here.
SG
-
Dec 24, 2014 7:39 AM in response to SGIIIby Orchardjoan,Sorry for delay in thanking you. The devil was in the details, i.e. the missing space. Thanks - That was it. ( I have started on the editing of the downloaded data. I merely put something in an unused cell to see it got there. You had an example of how to extract so am working on edit.)
-
Dec 27, 2014 7:14 AM in response to Orchardjoanby Hiroto,Hello
Here you may try an AppleScript handler listed below, that is a wrapper of Perl script to retieve some statisitics from http://finance.yahoo.com/q/ks?s=X. Currently it will return tab separated values of the following fields:
Symbol Current Quote Time 52-Week High Forward Annual Dividend Rate Ex-Dividend Date [Time of Retrieval]
--return retrieve_stock_data({"VZ", "MSFT", "IBM", "AAPL"}) return retrieve_stock_data({"VZ"}) on retrieve_stock_data(argv) (* list argv : list of symbols return string : tab separated values of retrieved data with header row *) set args to "" repeat with a in argv set args to args & space & a's quoted form end repeat do shell script "/usr/bin/perl -CSDA -w <<'EOF' - " & args & " # # retrieve some statistics of stock quotes from http://finance.yahoo.com/q/ks?s=XXXX # @ARGV : symbols # use strict; use XML::LibXML; use POSIX qw(strftime); # fields to be retrieved my @FIELDS = ( '52-Week High', 'Forward Annual Dividend Rate', 'Ex-Dividend Date', ); # max retry count to get real-time quote my $MAXTRY = 2; sub retrieve (@) { # @ : symbol array # return : 2d-array ref of retrieved table my $parser = XML::LibXML->new(); $parser->recover_silently(1); my $baseurl = 'http://finance.yahoo.com/q/ks?s='; my @data = (); for my $sym (@_) { my $i = 0; my ($doc, $rtime, $node, $rtq, $rtqt); while (1) { # it appears real-time quote data node can be missing in DOM tree; thus here's a retry loop $doc = $parser->parse_html_file($baseurl . $sym); $rtime = POSIX::strftime('%FT%T%z', localtime()); $node = ($doc->findnodes(qq(//span[\\@id=\"yfs_l84_\\L$sym\"])))[0]; next if ! $node && $i++ < $MAXTRY; $rtq = $node ? $node->textContent : '-'; $node = ($doc->findnodes(qq(//span[\\@id=\"yfs_t53_\\L$sym\"])))[0]; $rtqt = $node ? $node->textContent : '-'; last; } my @dd = (); for (@FIELDS) { my $td = ($doc->findnodes(qq(//td[contains(text(), \"$_\")]/following-sibling::td)))[0]; push @dd, $td ? $td->textContent : '-'; } push @data, [$sym, $rtq, $rtqt, @dd, $rtime]; } return [['Symbol', 'Current Quote', 'Time', @FIELDS, '[Time of Retrieval]' ], @data ] } sub array2text($;@) { # $ : source 2d-array ref # @ : key value pairs of options ( fs => field_separator, rs => record_separator ) # return : text representation of source 2d-array my $aref = shift; my %opts = ( fs => qq(\\t), rs => qq(\\n), @_ ); my ($fs, $rs) = @opts{'fs', 'rs'}; return (join $rs, map { join $fs, @{$_}; } @{$aref}) . $rs; } my $aref = &retrieve(@ARGV); print &array2text($aref, fs => qq(\\t)); EOF" end retrieve_stock_dataIt is not fast and the bottleneck is the process obtaining the html source from web.
Tested under OS X 10.6.8.
Good luck,
H
-
Dec 27, 2014 9:25 AM in response to Hirotoby Orchardjoan,TThanks! Nice after Christmas present...a script that I really need. I will try it as soon as I can. this support group really is kind and helpful to newbies like me.
-
Dec 27, 2014 11:04 AM in response to Hirotoby Orchardjoan,It worked great and returned exactly the information I want, but now I have a real newbie question. How do I put the result into my numbers spread sheet? I tried a few things, but I can't quite follow your script well enough to know how to take the tabbed data and store it into the appropriate row and column. I get the stock symbol for each row, then get the data and put it into certain cells (so I can then do some comparisons and math on some of the cells.)
That looked like a lot of hard work to create the script and i appreciate it.
-
Dec 28, 2014 8:57 AM in response to Orchardjoanby Hiroto,Hello
I was expecting that question.
Since I don't know your table layout, here's a script that is easily customisable.
Currently it assumes -
a) target table is document 1's sheet 1's table 1; and
a) symbols are stored in column A; and
b) target symbols are in rows 2 thru 6; and
c) retrieved data for a symbol are stored in columns 2, 3, 4, 5, 6, 7 in the same row as the symbol.
These settings are defined in properties |DOCUMENT|, |SHEET|, |TABLE|, |SYMBOL_COLUMN|, |SYMBOL_ROW_RANGE| and |DATA_COLUMNS|. See description in script for the details of how |DATA_COLUMNS| are interpreted.
-- APPLESCRIPT _main() on _main() script o property |DOCUMENT| : 1 -- index or name property |SHEET| : 1 -- index or name property |TABLE| : 1 -- index or name property |SYMBOL_COLUMN| : 1 -- column index of symbols property |SYMBOL_ROW_RANGE| : {2, 6} -- {start, end} row indices of target symbols property |DATA_COLUMNS| : {0, 2, 3, 4, 5, 6, 7} -- column indices for retieved data; 0 denotes to discard corresponding data (* E.g., Given the retriever returns the following fields: Symbol; Current Quote; Time; 52-Week High; Forward Annual Dividend Rate; Ex-Dividend Date; [Time of Retrieval] |DATA_COLUMNS| = {0, 2, 3, 4, 5, 6, 7} means: Symbol => discard Current Quote => store in column 2 Time => store in column 3 52-Week High => store in column 4 Forward Annual Dividend Rate => store in column 5 Ex-Dividend Date => store in column 6 [Time of Retrieval] => store in column 7 *) property wid : missing value -- target window id property tbl : missing value -- target table reference property ss : {} -- symbol list property dd : {} -- data array tell application "Numbers" activate -- get target table reference set wid to id of window 1 where its document = document |DOCUMENT| set tbl to a reference to window id wid's document's sheet |SHEET|'s table |TABLE| tell tbl -- get target symbols set {i0, i1} to |SYMBOL_ROW_RANGE| set ss to column |SYMBOL_COLUMN|'s cells i0 thru i1's value repeat with p from 1 to count my ss -- for each symbol -- retrieve stock data from web set dd to my _text2array(my retrieve_stock_data({my ss's item p}), tab, return) set dd to dd's rest -- remove its header entry -- populate cells set i to i0 + p - 1 -- target row index repeat with q from 1 to count my dd's item 1 -- for each field of retieved data if q ≤ (count |DATA_COLUMNS|) then set j to |DATA_COLUMNS|'s item q -- target column index if j > 0 then --set row i's cell j's value to "'" & my dd's item 1's item q -- force it in text format set row i's cell j's value to my dd's item 1's item q -- store it in current format end if end if end repeat end repeat end tell end tell end script tell o to run end _main on retrieve_stock_data(argv) (* list argv : list of symbols return string : tab separated values of retrieved data with header row *) set args to "" repeat with a in argv set args to args & space & a's quoted form end repeat do shell script "/usr/bin/perl -CSDA -w <<'EOF' - " & args & " # # retrieve some statistics of stock quotes from http://finance.yahoo.com/q/ks?s=XXXX # @ARGV : symbols # use strict; use XML::LibXML; use POSIX qw(strftime); # fields to be retrieved my @FIELDS = ( '52-Week High', 'Forward Annual Dividend Rate', 'Ex-Dividend Date', ); # max retry count to get real-time quote my $MAXTRY = 2; sub retrieve (@) { # @ : symbol array # return : 2d-array ref of retrieved table my $parser = XML::LibXML->new(); $parser->recover_silently(1); my $baseurl = 'http://finance.yahoo.com/q/ks?s='; my @data = (); for my $sym (@_) { my $i = 0; my ($doc, $rtime, $node, $rtq, $rtqt); while (1) { # it appears real-time quote data node can be missing in DOM tree; thus here's a retry loop $doc = $parser->parse_html_file($baseurl . $sym); $rtime = POSIX::strftime('%FT%T%z', localtime()); $node = ($doc->findnodes(qq(//span[\\@id=\"yfs_l84_\\L$sym\"])))[0]; next if ! $node && $i++ < $MAXTRY; $rtq = $node ? $node->textContent : '-'; $node = ($doc->findnodes(qq(//span[\\@id=\"yfs_t53_\\L$sym\"])))[0]; $rtqt = $node ? $node->textContent : '-'; last; } my @dd = (); for (@FIELDS) { my $td = ($doc->findnodes(qq(//td[contains(text(), \"$_\")]/following-sibling::td)))[0]; push @dd, $td ? $td->textContent : '-'; } push @data, [$sym, $rtq, $rtqt, @dd, $rtime]; } return [['Symbol', 'Current Quote', 'Time', @FIELDS, '[Time of Retrieval]' ], @data ] } sub array2text($;@) { # $ : source 2d-array ref # @ : key value pairs of options ( fs => field_separator, rs => record_separator ) # return : text representation of source 2d-array my $aref = shift; my %opts = ( fs => qq(\\t), rs => qq(\\n), @_ ); my ($fs, $rs) = @opts{'fs', 'rs'}; return (join $rs, map { join $fs, @{$_}; } @{$aref}) . $rs; } my $aref = &retrieve(@ARGV); print &array2text($aref, fs => qq(\\t)); EOF" end retrieve_stock_data on _text2array(t, cs, rs) (* string t : source text string cs : column separator (field separator) string rs : row separator (record separator) return list : 2d-array represented by t *) script o property pp : _split(t, rs) property qq : {} repeat with p in my pp set end of my qq to _split(p's contents, cs) end repeat return my qq's contents end script tell o to run end _text2array on _split(t, d) (* string t : source string string or list d : separator(s) return list : t splitted by d *) local astid, astid0, tt set astid to a reference to AppleScript's text item delimiters try set {astid0, astid's contents} to {astid's contents, {} & d} set tt to t's text items set astid's contents to astid0 on error errs number errn set astid's contents to astid0 error errs number errn end try return tt end _split --END OF APPLESCRIPTRun the script with the target document open and it will populate cells like this (e.g.):
* Header row is entered manually.
By the way , you can retieve other statistics by defining field labels of the web table in @FIELDS variable in Perl code in the retrieve_stock_data() handler. Currently they are defined as follows:
my @FIELDS = ( '52-Week High', 'Forward Annual Dividend Rate', 'Ex-Dividend Date', );
Perl script gets the data for each given label string by retrieving data from the cell adjacent to the first cell whose content includes the string.
Tested with Numbers v2.0.5 under OS X 10.6.8.
Hope this may help,
H
PS. As you may have already noticed, AppleScript script is mere external gear for Numbers.app and cannot be integrated with Numbers document in any ways as macro can be with Excel document.
