SPLIT STRING

Is there a quick command for splitting strings into equal increments?

I've been using the code below, but it does not work well at all.


display dialog my splitText_section("one two three four five six seven", 2)

---- NEEDS FIXING: does not work very well

on splitText_section(someText, w) -----(string, sections)
     set prevTIDs to AppleScript's text item delimiters
     set AppleScript's text item delimiters to " "
     set output to text items of someText
     set AppleScript's text item delimiters to prevTIDs
     set output2 to ""
     set x to 0
     set y to 0
     set cnt to count item of output
     set partz to cnt / w as string
     set partz to round partz rounding up
     repeat
          set x to x + 1
          set output2 to output2 & item x of output as string
          set y to y + 1
          if y < partz then set output2 to output2 & " "
          if x = cnt then exit repeat
          if y = partz then
               set output2 to output2 & return
               set y to 0
          end if
     end repeat
     return output2
end splitText_section

imac, Mac OS X (10.5.8)

Posted on Feb 17, 2010 7:38 AM

45 replies

Feb 17, 2010 8:54 AM in response to handellphp

Welcome to Apple Discussions!

This will split the string into two, but then shift the break rightwards until the first instance of a defined delimiter is reached. Note that if the delimiter is a space it is discarded, whereas if it's a letter or other character it's not. There is no error checking - so if the delimiter is not in the second string (stringB) the script will fail.

<pre style="
font-family: Monaco, 'Courier New', Courier, monospace;
font-size: 10px;
margin: 0px;
padding: 5px;
border: 1px solid #000000;
width: 720px;
color: #000000;
background-color: #FFDDFF;
overflow: auto;">
set theString to "one two three four five six seven"
set theDelimiter to " "
set midString to round ((length of theString) / 2)
set stringA to text 1 thru midString of theString
set stringB to text (midString + 1) thru end of theString
set controlList to characters of stringB
repeat with theChar in controlList
if theChar as string is not theDelimiter as string then
set stringA to stringA & theChar
set stringB to text 2 thru end of stringB
else
if theDelimiter as string is " " as string then
set stringB to text 2 thru end of stringB
end if
exit repeat
end if
end repeat
stringA & "| |" & stringB </pre>
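
For comparison, the same idea (cut at the midpoint, then move the break right to the next delimiter) can be sketched in Ruby. This is a minimal sketch, not a port of the script above: the name `split_near_middle` is hypothetical, and Ruby rounds an exact half differently from AppleScript's `round`, so the midpoint can differ by one character.

```ruby
# Split str roughly in half, moving the break right to the next delimiter.
# A space delimiter is discarded; any other delimiter stays at the start
# of the second half. (Hypothetical helper, for illustration only.)
def split_near_middle(str, delimiter = " ")
  mid = (str.length / 2.0).round
  idx = str.index(delimiter, mid)   # first delimiter at or after the midpoint
  return [str, ""] if idx.nil?      # no delimiter in the second half: leave unsplit

  first = str[0...idx]
  second = delimiter == " " ? str[(idx + 1)..-1] : str[idx..-1]
  [first, second]
end

p split_near_middle("one two three four five six seven")
```

Unlike the AppleScript version, this returns the string unsplit instead of failing when the delimiter does not occur in the second half.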

Feb 17, 2010 12:49 PM in response to Arkouda

This script is very nice and works much better for what I need. However, it is limited to halves. Sometimes I will be calling a subroutine that divides into thirds, fourths, and sometimes even fifths.

How do I convert this into a subroutine that can divide a string equally into any number of parts?
It does not necessarily have to be AppleScript, as long as it will run in the script window. It can be JavaScript or a shell script.



[code]
display dialog split_me("Jon Jacob Jingle Hymersmithsonian", 2)


on split_me(theString, parts) -- parts is not used yet: this handler can only split into halves
set theDelimiter to " "
set midString to round ((length of theString) / 2)
set stringA to text 1 thru midString of theString
set stringB to text (midString + 1) thru end of theString
set controlList to characters of stringB
repeat with theChar in controlList
if theChar as string is not theDelimiter as string then
set stringA to stringA & theChar
set stringB to text 2 thru end of stringB
else
if theDelimiter as string is " " as string then
set stringB to text 2 thru end of stringB
end if
exit repeat
end if
end repeat
return stringA & return & stringB
end split_me

Feb 17, 2010 5:27 PM in response to handellphp

Ah, since we've thrown out AppleScript ....

Here's the Ruby way ...


def split_me (str,n)
str.scan(/.{#{str.length/n}}/)
end
p split_me("Jon Jacob Jingle Hymersmithsonian",2)
p split_me("Jon Jacob Jingle Hymersmithsonian",3)
p split_me("Jon Jacob Jingle Hymersmithsonian",4)
p split_me("Jon Jacob Jingle Hymersmithsonian",5)


returns

["Jon Jacob Jingle", " Hymersmithsonia"]
["Jon Jacob J", "ingle Hymer", "smithsonian"]
["Jon Jaco", "b Jingle", " Hymersm", "ithsonia"]
["Jon Ja", "cob Ji", "ngle H", "ymersm", "ithson"]
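
Note that the output above silently drops the leftover characters whenever the length is not a multiple of n (the final "n" for 2 and 4 parts, "ian" for 5). A minimal sketch of a variant that keeps every character, assuming it is acceptable for the first few parts to be one character longer; the name `split_chars_evenly` is hypothetical:

```ruby
# Split str into exactly n character chunks without dropping anything.
# The first (length % n) chunks get one extra character.
def split_chars_evenly(str, n)
  base, extra = str.length.divmod(n)
  pos = 0
  Array.new(n) do |i|
    len = base + (i < extra ? 1 : 0)
    chunk = str[pos, len]
    pos += len
    chunk
  end
end

p split_chars_evenly("Jon Jacob Jingle Hymersmithsonian", 2)
p split_chars_evenly("Jon Jacob Jingle Hymersmithsonian", 5)
```

Like the original, this still splits by character and will cut words in the middle.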

Feb 17, 2010 6:16 PM in response to ericmeyers

And something I threw together in Perl:

sub split_chars {
my $what = shift;
my $size = shift;
my @array = $what =~ m/(.{1,$size})/g;
return (@array);
}


Using it in a script:

#!/usr/bin/perl
use strict;
use Data::Dumper;
my $sentence = qq(Some lovely text information for my program to parse);
my @parts = split_chars($sentence, 4);
print Dumper(@parts);
sub split_chars {
my $what = shift;
my $size = shift;
my @array = $what =~ m/(.{1,$size})/g;
return (@array);
}
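
Note that this Perl version takes a chunk size rather than a number of parts. For readers following the Ruby examples in this thread, a rough equivalent might look like the sketch below; `split_chars` here is simply the same name reused for illustration.

```ruby
# Cut text into pieces of at most `size` characters (the last piece may be shorter).
def split_chars(text, size)
  text.scan(/.{1,#{size}}/m)
end

p split_chars("Some lovely text information for my program to parse", 4)
```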


charlie

Feb 17, 2010 7:00 PM in response to ericmeyers

I still need to run this from within AppleScript or AppleScript Studio and to be able to pass variables back and forth to the AppleScript. I do not know enough about Ruby or Perl to make this work.

This is what I have so far, but it returns nothing. I need the variable 'the_string' to end up holding the divided string.
[code]
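set the_string to "Jon Jacob Jingle Hymersmithsonian"

-- Ruby source passed to "ruby -e"; the text and the part count arrive in ARGV
set this_script to "str, n = ARGV[0], ARGV[1].to_i
puts str.scan(/.{#{str.length / n}}/)"

-- quoted form of protects both the Ruby source and the text from the shell
set the_string to do shell script "ruby -e " & quoted form of this_script & " " & quoted form of the_string & " 2"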

Feb 17, 2010 9:15 PM in response to handellphp

handellphp wrote:
Does this break the words up or the characters?


It breaks the string into characters.

I do not want the words broken, just the lines/paragraphs.


Sorry, I didn't read your original post that closely. I was mostly going off Eric's script, which split the string without regard to words.

Also, it would probably be helpful if you posted an example or two of what you want the text to look like once it got split up with different settings. That would make it easier to figure out how to get your end product.

charlie

Feb 18, 2010 6:21 AM in response to Charles Minow

(*
Bernard's script was the closest thing to what I need. I posted an example below of more specifically what I intend the outcome to be. This seems like it would be an easy task; however, I've been stumped for days.


[code]

I am trying to break up a string/paragraph into as close to equal word segments as possible *)


-----example halves
set the_string to split_word_seg("one two three four five six extraordinarily_longer", 2)

(* DESIRED OUTPUT HERE:
output_string = "one two three four five
six extraordinarily_longer"
display dialog the_string
*)



---example thirds
set the_string to split_word_seg("one two three four five six extraordinarily_longer", 3)

(* DESIRED OUTPUT HERE:
output_string = "one two three
four five six
extraordinarily_longer"
display dialog the_string
*)


---example sixths
set the_string to split_word_seg("one two three four five six extraordinarily_longer", 6)

(* DESIRED OUTPUT HERE:
output_string = "one two
three
four
five
six
extraordinarily_longer"
display dialog the_string
*)



-----the subroutine (not written yet)
on split_word_seg(inputstring, x)

(* TODO: divide the string into x segments without breaking up any words. Any blank space at the end of each new paragraph should be removed as well.
The script should compare the lengths of the 2 words nearest to each split to determine the best place to make the paragraph breaks. *)

return output_string
end split_word_seg

Message was edited by: handellphp
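
One way to get the outputs requested in this post is sketched below in Ruby, the language used earlier in the thread. It is only one possible reading of "as close to equal as possible": it chooses the breaks that minimize the sum of squared line lengths, which is an assumption rather than something stated in the thread. The name `split_balanced` is hypothetical.

```ruby
# Split text into exactly `parts` lines without breaking words, choosing the
# breaks that minimize the sum of squared line lengths (an assumed cost).
def split_balanced(text, parts)
  words = text.split
  raise ArgumentError, "cannot split into #{parts} parts" if parts < 1 || words.size < parts

  # prefix[i] = total length of the first i words (spaces not included)
  prefix = [0]
  words.each { |w| prefix << prefix.last + w.length }
  # Length of words[i...j] joined by single spaces
  line_len = ->(i, j) { prefix[j] - prefix[i] + (j - i - 1) }

  n = words.size
  inf = Float::INFINITY
  # cost[k][j]: best cost of placing the first j words on k lines
  cost = Array.new(parts + 1) { Array.new(n + 1, inf) }
  back = Array.new(parts + 1) { Array.new(n + 1, 0) }
  cost[0][0] = 0

  (1..parts).each do |k|
    (k..n).each do |j|
      ((k - 1)...j).each do |i|
        c = cost[k - 1][i] + line_len.call(i, j)**2
        if c < cost[k][j]
          cost[k][j] = c
          back[k][j] = i
        end
      end
    end
  end

  # Walk the back-pointers from the last line to the first.
  lines = []
  j = n
  parts.downto(1) do |k|
    i = back[k][j]
    lines.unshift(words[i...j].join(" "))
    j = i
  end
  lines
end

puts split_balanced("one two three four five six extraordinarily_longer", 3)
```

Because the lines are rebuilt from the words, no line has leading or trailing spaces, so no clean-up pass is needed afterwards.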

Feb 18, 2010 7:39 AM in response to handellphp

I gave it some more thought:

<pre style="
font-family: Monaco, 'Courier New', Courier, monospace;
font-size: 10px;
margin: 0px;
padding: 5px;
border: 1px solid #000000;
width: 720px; height: 340px;
color: #000000;
background-color: #FFDDFF;
overflow: auto;">
display dialog split_me("one two three four five six extraordinarily_longer", 4)

on split_me(theString, Segments)

set theDelimiter to " "
set stringList to {}
--set theString to "one two three four five six extraordinarily_longer"
set returnString to ""
--set Segments to 2
set stringLength to length of theString
set segSize to round stringLength / Segments rounding up
set Occurrences to 0
set theOffset to offset of theDelimiter in theString
repeat while theOffset is not 0
if stringList is {} then
set end of stringList to {text 1 thru (theOffset - 1) of theString, theOffset}
else
set end of stringList to {text 1 thru (theOffset - 1) of theString, theOffset + (last item of last item of stringList)}
end if
set theString to text (theOffset + 1) thru end of theString
set Occurrences to Occurrences + 1
set theOffset to offset of theDelimiter in theString
end repeat
set end of stringList to {theString, stringLength}

if (Segments - 1) > Occurrences then
display dialog "Cant Split"
else
if length of stringList = Segments then
repeat with aString in stringList
set returnString to returnString & first item of aString & return
end repeat
else
set breakList to {}
repeat with breakNumber from 1 to Segments
repeat with aString in stringList
if (second item of aString < (segSize * breakNumber)) or (second item of aString = (segSize * breakNumber)) then
if length of breakList ≠ breakNumber then
set end of breakList to {first item of aString}
else
set end of item breakNumber of breakList to first item of aString
end if
if length of stringList > 1 then
set stringList to items 2 thru end of stringList
end if
end if
end repeat
end repeat
repeat with aString in breakList
repeat with subString in aString
set returnString to returnString & subString & " "
end repeat
set returnString to returnString & return
end repeat
end if
end if

return returnString
end split_me </pre>

Feb 18, 2010 11:12 AM in response to Arkouda

When I run your script, I get uneven breaks.

Example:

display dialog split_me("The Ohio Valley Country Club, Casino & Resorts", 2)

returns
"The Ohio Valley
Country Club, Casino & Resorts "


when a more logical output would be
"The Ohio Valley Country
Club, Casino & Resorts"



The application is design. I am using Adobe AI to output designs with multiple variables, which is usually a pretty simple task; however, I have one particular job that requires lines to be divided in order to fit a predefined space.

After the lines are divided by a variable character count, they are horizontally scaled to fit a more precise width. Word wrap will not work.

(it's so difficult trying to make the computer think like a human)


The closest I've gotten is this script below, but it doesn't work well for equal division either, and it messes up when a word is longer than the max character width for each line.

[code]

display dialog split_me("The Ohio Valley Country Club, Casino & Resorts", 2)
display dialog split_me("The Ohio Valley Country Club, Casino & Resorts", 3)
display dialog split_me("The Ohio Valley Country Club, Casino & Resorts", 4)
display dialog split_me("The Ohio Valley Country Club, Casino & Resorts", 5)
display dialog split_me("The Ohio Valley Country Club, Casino & Resorts", 6)

on split_me(theString, d)
set theDelimiter to " "
set cnt to length of theString
set midString to round ((length of theString) / d) rounding down
set midString to midString
set hhh to count words of theString
set new_string to ""
set x to 0
set incrementz to midString
repeat
set x to x + 1
set char to character x of theString
if char = " " then
if x > midString then
set midString to midString + incrementz
set new_string to new_string & return
end if
end if
set new_string to new_string & char
if x = cnt then exit repeat
end repeat
return new_string
end split_me

Message was edited by: handellphp
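
The greedy approach described in this post (start a new line at the first space after every N characters) can be sketched in Ruby as follows. This is an illustration, not a translation of the AppleScript: the name `split_greedy` is hypothetical, and unlike the script above it drops the space at each break instead of carrying it to the next line.

```ruby
# Greedy split: start a new line at the first space found after each
# multiple of (length / parts) characters. Lines can come out uneven,
# and the number of lines is not guaranteed to equal `parts`.
def split_greedy(text, parts)
  width = text.length / parts
  lines = [String.new]
  limit = width
  text.each_char.with_index(1) do |ch, pos|
    if ch == " " && pos > limit
      lines << String.new   # break here and discard the space
      limit += width
    else
      lines.last << ch
    end
  end
  lines
end

puts split_greedy("The Ohio Valley Country Club, Casino & Resorts", 3)
```

With 3 parts the middle line ends up more than twice as long as the last one, which shows why a break rule based only on character position gives uneven results.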

Feb 18, 2010 3:12 PM in response to handellphp

You're right. It seems to be something to do with a rounding error, so I added an extra 1 into the segment size that is used in the comparison:

<pre style="
font-family: Monaco, 'Courier New', Courier, monospace;
font-size: 10px;
margin: 0px;
padding: 5px;
border: 1px solid #000000;
width: 720px; height: 340px;
color: #000000;
background-color: #FFDDFF;
overflow: auto;">
display dialog split_me("The Ohio Valley Country Club, Casino & Resorts", 5)

on split_me(theString, Segments)

set theDelimiter to " "
set stringList to {}
set returnString to ""
set stringLength to length of theString
set segSize to 1 + (round stringLength / Segments rounding up)
set Occurrences to 0
set theOffset to offset of theDelimiter in theString
repeat while theOffset is not 0
if stringList is {} then
set end of stringList to {text 1 thru (theOffset - 1) of theString, theOffset}
else
set end of stringList to {text 1 thru (theOffset - 1) of theString, theOffset + (last item of last item of stringList)}
end if
set theString to text (theOffset + 1) thru end of theString
set Occurrences to Occurrences + 1
set theOffset to offset of theDelimiter in theString
end repeat
set end of stringList to {theString, stringLength}

if (Segments - 1) > Occurrences then
return "Cant Split"

else
if length of stringList = Segments then
repeat with aString in stringList
set returnString to returnString & first item of aString & return
end repeat
else
set breakList to {}
repeat with breakNumber from 1 to Segments
repeat with aString in stringList
if (second item of aString < (segSize * breakNumber)) or (second item of aString = (segSize * breakNumber)) then
if length of breakList ≠ breakNumber then
set end of breakList to {first item of aString}
else
set end of item breakNumber of breakList to first item of aString
end if
if length of stringList > 1 then
set stringList to items 2 thru end of stringList
end if
end if
end repeat
end repeat
repeat with aString in breakList
repeat with subString in aString
set returnString to returnString & subString & " "
end repeat
set returnString to returnString & return
end repeat
end if
end if

return returnString
end split_me </pre>

I just checked through the possible segment values and there's something odd happening when you get to 7, but that's my contribution for today!

Message was edited by: Bernard Harte

Feb 18, 2010 4:37 PM in response to Arkouda

Well, I'm gonna have to go with this one. It's not perfect, but it's closer to my goal.

I added some additional script to the bottom in order to remove the blank spaces at the end of each paragraph. Thank you very much for your input, I appreciate it greatly.



[code]

display dialog split_me("The Ohio Valley Resorts & Casinos", 2)


on split_me(theString, Segments)

set theDelimiter to " "
set stringList to {}
set returnString to ""
set stringLength to length of theString
set segSize to 1 + (round stringLength / Segments rounding up)
set Occurrences to 0
set theOffset to offset of theDelimiter in theString
repeat while theOffset is not 0
if stringList is {} then
set end of stringList to {text 1 thru (theOffset - 1) of theString, theOffset}
else
set end of stringList to {text 1 thru (theOffset - 1) of theString, theOffset + (last item of last item of stringList)}
end if
set theString to text (theOffset + 1) thru end of theString
set Occurrences to Occurrences + 1
set theOffset to offset of theDelimiter in theString
end repeat
set end of stringList to {theString, stringLength}

if (Segments - 1) > Occurrences then
return "Cant Split"

else
if length of stringList = Segments then
repeat with aString in stringList
set returnString to returnString & first item of aString & return
end repeat
else
set breakList to {}
repeat with breakNumber from 1 to Segments
repeat with aString in stringList
if (second item of aString < (segSize * breakNumber)) or (second item of aString = (segSize * breakNumber)) then
if length of breakList ≠ breakNumber then
set end of breakList to {first item of aString}
else
set end of item breakNumber of breakList to first item of aString
end if
if length of stringList > 1 then
set stringList to items 2 thru end of stringList
end if
end if
end repeat
end repeat
repeat with aString in breakList
repeat with subString in aString
set returnString to returnString & subString & " "
end repeat
set returnString to returnString & return
end repeat
end if
end if


-------------my add on
set returnString2 to ""
repeat with x from 1 to (count paragraphs of returnString)
set this_par to paragraph x of returnString
set z to count characters of this_par
set z to z - 1
try
if last character of this_par = " " then set returnString2 to returnString2 & (characters 1 through z of this_par)
if last character of this_par ≠ " " then set returnString2 to returnString2 & this_par
if (last character of this_par = " ") or (last character of this_par ≠ " ") then set returnString2 to returnString2 & return
end try
end repeat

if last character of returnString2 = return then
set cnt to count characters of returnString2
set returnString to characters 1 through (cnt - 1) of returnString2 as string
end if
------------- end my add on




return returnString
end split_me

Feb 18, 2010 5:27 PM in response to handellphp

And here's a version you can call from AppleScript ...

Let's call this abc.rb ...


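def split_me(str, n)
  # Count the words (runs of word characters, hyphens, commas and ampersands)
  wordcount = str.scan(/(\w|-|,|&)+/).size
  # Match groups of wordcount/n words, each optionally followed by whitespace
  ms = str.scan(/(([\w,&]+(\s)?){#{wordcount / n}})/)
  ms.each do |m|
    p m[0]
  end
end
split_me(ARGV[0], ARGV[1].to_i)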



set pathtoscript to "/Users/ericmeyers/abc.rb"
set foobar to "'The Ohio Valley Country Club, Casino & Resorts'"
set res to do shell script "ruby " & pathtoscript & " " & foobar & " 2"
display dialog res


!http://img718.imageshack.us/img718/8568/screenshot20100218at826.png!

Eric
