How to get integer value from a string
I have some strings which contain numbers and other characters, such as "1 920 pixels". What I want is to get 1920 as an integer.
Thanks in advance.
Mark said:
Well, you still have to turn it into an integer.
Good point. Had forgotten that in all the fun...
This should do it, I think:
set s to quoted form of "He's got 3 cats and 1,920 dogs and $20,000"
do shell script "sed s/[a-zA-Z\\']//g <<< " & s
set dx to the result
set numlist to {}
repeat with i from 1 to count of words in dx
set this_item to word i of dx
try
set this_item to this_item as number
set the end of numlist to this_item
end try
end repeat
numlist
set x to "1920"
log class of x
set y to x as number
class of y
Replies:
(*text*)
Result:
integer
As Frank Caggiano writes, what is your preferred programming language for this implementation?
Also and somewhat more subtly, what is the string encoding? If it's ASCII, then most available calls will work. If you're working with Unicode or UTF-8, then you'll probably want to use different routines to parse and retrieve the data.
strtok() and then atoi() in C are primitive, but would work. Using Objective-C, NSString has methods to parse and retrieve data, and to return an integer value from a string. Python and most other languages also have conversions. Common finite state parsers such as Ragel, Lex/Yacc or Flex/Bison can usually return integer values as well; using a parser can sometimes be easier than hand-coding the parsing when dealing with big wads of text data.
That's right because you have to extract the part that you want as a number as a substring first.
e.g.,
set s to "hello 1984"
set substringIndex to offset of "1984" in s
set substring to text substringIndex thru -1 of s as number
and going the other way:
set p to "1,920 pixels"
set substringIndex to offset of "0" in p
set substring to text 1 thru (substringIndex) of p as number
That's my mistake. I posted this thread in a hurry and forgot to describe the problem in detail.
As for the text encoding, it may be UTF-8 by default in AppleScript. One method is to convert the encoding to ASCII and then compare each character against its ASCII code, but that's a little inconvenient. I want to know a faster way to get a number from a string in AppleScript when "as integer" doesn't work (for example, "1 920 pixels" as integer).
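For the narrow case described here (one number per string, with spaces or other characters mixed in), a minimal shell sketch is to delete everything that is not a digit. The function name digits_only is made up for illustration, and the approach assumes the string holds exactly one number; with several numbers the digits would simply run together.

```shell
# Hypothetical helper: keep only the ASCII digits of the first argument.
# Assumes the string contains a single number (e.g. "1 920 pixels").
digits_only() {
    # tr -c complements the set, -d deletes: remove every non-digit character
    printf '%s' "$1" | tr -cd '0-9'
}

digits_only "1 920 pixels"
echo
```

From AppleScript, the same pipeline could presumably be run through do shell script, passing the string with quoted form of and coercing the result with "as integer".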
OK, here's a handler that should do it for you.
--use this line to specify the string and the number
returnNumber("the number 1,920 can be at the start, in the middle or at the end", 1920)
--place this handler at the beginning of your script
on returnNumber(myString, aNumber)
set aNumber to aNumber as text
set myString to myString as text
set substring to ""
set removeLeading to offset of (text 1 of aNumber) in myString
if removeLeading is greater than 1 then
set substring to text (removeLeading) thru -1 of myString
else
set substring to myString
end if
set removeTrailing to offset of (text -1 of aNumber) in substring
set substring to text 1 thru removeTrailing of substring as number
return substring
end returnNumber
Ah, now there's a flaw in that handler, which is that repeated numerals will break it (e.g., "55" will return '5').
This modified handler should solve that problem, but note that you must also specify the number exactly as it appears in the string and wrapped in quotes, i.e. "1,920" not 1920
returnNumber("there are 1,920 apples in my basket", "1,920")
on returnNumber(myString, aNumber)
set aNumber to aNumber as text
set numberCount to (count of aNumber)
set myString to myString as text
set substring to ""
set removeLeading to offset of (text 1 of aNumber) in myString
if removeLeading is greater than 1 then
set substring to text (removeLeading) thru -1 of myString
else
set substring to myString
end if
set substring to text 1 thru numberCount of substring as number
return substring
end returnNumber
I like this. It's more robust than messing around with text offsets.
I put it into an AppleScript for the OP:
set formula to "egrep -o \"(\\d{1,3},?\\d{1,3},?\\d{3}|\\d+)\" <<< \""
set s to quoted form of "your string with 1920 or 55 or any other number in it"
set term to "\""
do shell script formula & s & term
However, note one minor weakness which my revised handler above doesn't have: if the number is comma-separated like "1,920" it'll return
1
920
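One way to keep a comma-separated number together is to let the pattern accept groups of ",ddd" after the leading digits and strip the commas afterwards. This is only a sketch: the function name extract_numbers is invented here, it uses POSIX bracket expressions rather than \d, and it assumes commas are used strictly as thousands separators.

```shell
# Hypothetical helper: print each integer found in the first argument,
# one per line, with thousands separators removed.
# Assumes commas appear only as thousands separators (e.g. "1,920").
extract_numbers() {
    printf '%s\n' "$1" |
        grep -Eo '[0-9]+(,[0-9]{3})*' |  # digits, then optional ",ddd" groups
        tr -d ','                        # drop the commas so the value can be coerced
}

extract_numbers "he has 3 cats and 1,920 dogs and 55 fish"
```

Each output line is then a plain run of digits, which should coerce cleanly with "as integer" back in AppleScript.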
Phil,
Another day, another RE. The following will digest any optionally punctuated integer and/or decimal less than a trillion, and handle decimal fractions (e.g. .000456) too. I made certain it handles 1,920 correctly.
egrep -o "(\d+[.,]?\d+[.,]?\d+[.,]?\d+[.,]?\d?\d?|\d+[.,]?\d+|[.,]\d+)"
Cool, though you're not going to like the output of this version (Sorry!).
set formula to "egrep -o \"(\\d+[.,]?\\d+[.,]?\\d+[.,]?\\d+[.,]?\\d?\\d?|\\d+[.,]?\\d+|[.,]\\d+)\" <<< \""
set s to quoted form of "any number you like, (e.g., 1,920) but don't put a $ sign in front of it like $1,920 ;)"
set term to "\""
do shell script formula & s & term
You are soooo right. Fixed. Still won't handle a space between currency symbol and number. Long enough as it is.
egrep -o "([$]?\d+[.,]?[$]?\d+[.,]?[$]?\d+[.,]?[$]?\d+[.,]?\d?\d?|[$]?\d+[.,]?[$]?\d+|[$ ]?[.,]\d+)"
egrep -o "([$]?\\d+[.,]?[$]?\\d+[.,]?[$]?\\d+[.,]?[$]?\\d+[.,]?\\d?\\d?|[$]?\\d+[.,]?[$] ?\\d+|[$]?[.,]\\d+)"
I couldn't get your dollar version to work, which forced me to dust off my old Sed & Awk book. So I think I've got the whole thing now and much more compact.
Anyone find any flaws in this one?
set s to quoted form of "he has got 3 cats and 1,920 dogs and $20,000"
do shell script "sed s/[a-z]//g <<< " & s
Escape the dollar sign.
[KSH_93u+]:tmp $ sed s/[a-z]//g <<<"he has got 3 cats and 1,920 dogs and \$20,000"
3 1,920 $20,000
If you want to filter a-z and A-Z, then:
[KSH_93u+]:tmp $ sed 's/[[:alpha:]]//g' <<< "He has got 3 cats and 1,920 dogs and \$20,000"
Escaping the dollar sign isn't allowed (I'm assuming), as that's going to be raw input from the user, and in any case it's unnecessary, as it's already taken care of by the AppleScript 'quoted form of' syntax.
You're right, I forgot to account for caps, and we all forgot about apostrophes, so it should now be:
set s to quoted form of "He's got 3 cats and 1,920 dogs and $20,000"
do shell script "sed s/[a-zA-Z\\']//g <<< " & s
Result:
" 3 1,920 $20,000"
What about other punctuation that may appear in your string? The extended RE's in grep are better for handling this. I'm guessing that whatever version of grep you and V... are using must be compiled with pcre.
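A possible way around both concerns is to match the numbers themselves instead of deleting letters, and to write the pattern with plain bracket expressions so it does not rely on \d being supported. The sketch below is an assumption-laden illustration, not a tested drop-in: extract_amounts is a made-up name, and the pattern only allows an optional leading dollar sign plus "." or "," between digit groups.

```shell
# Hypothetical helper: print each number in the first argument, one per line,
# keeping an optional leading "$" and any "," or "." that sits between digits.
# Uses only POSIX ERE syntax, so no PCRE-style \d is needed.
extract_amounts() {
    printf '%s\n' "$1" | grep -Eo '[$]?[0-9]+([.,][0-9]+)*'
}

extract_amounts "He's got 3 cats, 1,920 dogs (really) and \$20,000.50."
```

Because the pattern whitelists what a number looks like, apostrophes, parentheses and a sentence-ending period are ignored without having to list them.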