
is it possible to alphabetize the words of a text & give word counts?

I don't know if Pages is capable of doing this, or another Apple application (Numbers?) or an independent application?

For example, taking "the wrong place at the wrong time" and creating a new document reading:

at 1
place 1
the 2
time 1
wrong 2

or something like that.
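
A minimal sketch of the alphabetized word count described above, written in Python rather than in any Apple application. The function name `word_counts` and the rule for what counts as a word (runs of letters and apostrophes, case ignored) are assumptions made for illustration, not something the post specifies.

```python
import re
from collections import Counter

def word_counts(text):
    """Return (word, count) pairs in alphabetical order, ignoring case."""
    # Assumption: a "word" is a run of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    return sorted(Counter(words).items())

for word, count in word_counts("the wrong place at the wrong time"):
    print(word, count)
```

For the sample sentence this prints the five lines shown above, from "at 1" through "wrong 2".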

There's something similar I'd be interested in as well, taking for example:

one two
three four five
six

and alphabetizing it using the same number of words per line:

five four
one six three
two

iMac 24", Mac OS X (10.6), 2.4 GHz Intel Core 2 Duo; 1 GB 667 MHz DDR2 SDRAM

Posted on Sep 13, 2009 11:15 AM


Sep 13, 2009 12:16 PM in response to Christopher Philippo

My guess is that's a job for an AppleScript.

Are you asking about a word-processing document in which all the text is in the main text layer, or do words also need to be grabbed from text boxes?

I'm a bit puzzled by your last example:

and alphabetizing it using the same number of words per line:

five four
one six three
two


As far as I can see, the words aren't alphabetized, and there are
two on row 1,
three on row 2,
one on row 3.

Yvan KOENIG (VALLAURIS, France) Sunday, 13 September 2009 21:16:38

Sep 13, 2009 12:25 PM in response to KOENIG Yvan

You may be right about using AppleScript. I've never created one and am not sure how one would go about doing it for alphabetization and word counts. But yes, the idea is to take all the words in a single text and do this for them.

With regard to the second example:

one two
three four five
six

gets alphabetized as:

five four one six three two

and arranged according to the number of words per line, reading right to left on each line:

five four
one six three
two

And yes, in this case there are two on line one, three on line two, and one on line three. However, I'd like to have something that could preserve the number of words per line regardless of how many words or lines there are; it's not this specific 2/3/1 pattern I want. So for another example:

one two three
four
five six
seven eight

would be

eight five four
one
seven six
three two

Doing this by hand when there are more words and lines is a pain, and it would be nice to be able to automate it.

Sep 13, 2009 12:59 PM in response to Christopher Philippo

I can write (tomorrow) a script that builds an alphabetized list of the words in the text.
I can also have it count the occurrences of each word.

But I really don't understand how the words are supposed to be grouped, so I can't code the last part of the task.

To my small brain, which never reads or counts from right to left,
the list would be:

eight five
four one
seven six
three two

or

eight five four
one seven six
three two

or

eight five four one
seven six three two

I can't code something that doesn't look logical to me.

Yvan KOENIG (VALLAURIS, France) Sunday, 13 September 2009 21:59:12

Sep 13, 2009 1:20 PM in response to KOENIG Yvan

That would be great if you could!

With regard to the second thing I'm trying to do, I mistyped! Sorry to have confused things! I meant "left to right," but wrote "right to left" in error.

To try to describe it, without using an example, as maybe that's somehow confusing things:

The tasks that need to be done for it are:

1) count the number of words per line
2) alphabetize the words
3) redistribute the alphabetized words preserving the number of words per line

If the original text has five lines with word counts per line of 1/2/3/4/5, then the new text will also have 1/2/3/4/5. If the original text has three lines with word counts per line of 5/4/3, then the new text will also have 5/4/3. The number of words per line should always match the original.

So to return to an example:

The original text:

I
like Apple

first step: count the number of words per line:
first line: one word; second line: two words

second step: alphabetize the words (this step need not actually be produced on paper):
Apple I like

third step: rearrange the text, distributing the words from left to right, preserving the number of words per line, resulting in the final text:

Apple
I like
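
The three steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the script posted later in the thread: the name `redistribute` is hypothetical, words are taken to be runs of letters and apostrophes (so punctuation is dropped and a hyphenated word counts as two), and sorting ignores case.

```python
import re

def redistribute(text):
    """Alphabetize all words, keeping each line's original word count."""
    # Step 1: split into lines and find the words on each line.
    per_line = [re.findall(r"[A-Za-z']+", line) for line in text.splitlines()]
    # Step 2: alphabetize every word in the text, ignoring case.
    ordered = sorted((w for words in per_line for w in words), key=str.lower)
    # Step 3: deal the sorted words back out, left to right, line by line.
    result, position = [], 0
    for words in per_line:
        n = len(words)
        result.append(" ".join(ordered[position:position + n]))
        position += n
    return "\n".join(result)

print(redistribute("I\nlike Apple"))
```

For the two-line example this prints "Apple" on the first line and "I like" on the second. A blank line has zero words, so it comes back as a blank line and paragraph breaks survive.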

Sep 13, 2009 1:33 PM in response to Christopher Philippo

OK.
You want to move the words so that they are alphabetized, but
so that every edited line contains the same number of words as the original line.

Does each word occur only once?

Could you attach a sample original file to an email and send it to my mailbox?
Click my blue name to get my address.

Yvan KOENIG (VALLAURIS, France) Sunday, 13 September 2009 22:33:57

Sep 13, 2009 2:04 PM in response to KOENIG Yvan

"You want to move the words so that they are alphabetized, but
so that every edited line contains the same number of words as the original line."

I think you've got it!

"Does each word occur only once?"

No, a word might be used more than once.

Here's a longer example (the original is a poem by H.P. Lovecraft):

The place was dark and dusty and half-lost
In tangles of old alleys near the quays,
Reeking of strange things brought in from the seas,
And with queer curls of fog that west winds tossed,
Small lozenge panes obscured by smoke and frost,
Just shewed the books, in piles like twisted trees,
Rotting from floor to roof-congeries
Of crumbling elder lore at little cost.

I entered, charmed, and from a cobwebbed heap
Took up the nearest tome and thumbed it through,
Trembling at curious words that seemed to keep
Some secret, monstrous if only one knew
Then, looking for some seller old in craft,
I could find nothing but a voice that laughed.

----

So the above, with the words alphabetized and rearranged with the same number of words per line (keeping not just the same line breaks but also the same paragraph break), becomes:

a a alleys and and and and and and
at at books brought but by charmed cobwebbed
congeries cost could craft crumbling curious curls dark dusty
elder entered find floor fog for from from frost from
half heap in I I if in in
in it just keep knew laughed like little looking
lore lost lozenge monstrous near nearest
nothing obscured of of of of old

old one only panes piles place queer quays
roof reeking rotting seas secret seemed seller shewed small
smoke some some strange tangles that that that
the the the the the then things
through thumbed to took tome to tossed trembling
trees twisted up voice was west winds with words

----

If you're wondering why, this is part of something called vocabularyclept poetry. One person selects a poem, then alphabetizes it as above, then gives it to somebody else to create a new poem from. It's described in more detail in the book Palindromes and Anagrams by Howard W. Bergerson pages 20-39. Much of it can be previewed in Google Books, if you're inclined.

The other thing, the alphabetized list of words with word counts, is an unrelated task.

Sep 14, 2009 2:05 AM in response to Christopher Philippo

A problem clearly described is often a solved problem.

--

--[SCRIPT sort_words]
(*
Save the script as a script or an application bundle.
Run it, or drag and drop a text file's icon onto its icon.
The script reads the file,
sorts the words it contains,
and splits the sorted list into lines with the same number of words as the original lines.
The result is stored on the desktop in "sortedWords.txt".
***********
Yvan KOENIG (VALLAURIS, France)
20090914
*)
property nomDuRapport : "sortedWords.txt"
property rapport : "" -- globale
property liste1 : {}
property liste2 : {}
property liste3 : {}
--=====
on run (* lines executed when the script application's icon is double-clicked *)

set fichier to choose file of type {"public.plain-text"} without invisibles
my commun({fichier})
end run
--=====
on open (sel) (* sel contains a list of aliases of the items dropped on the script's icon (the selection) *)
my commun(sel)
end open
--=====
on commun(elems)
my nettoie()
set theDoc to item 1 of elems
set enTexte to read theDoc

set my liste1 to paragraphs of enTexte
set my liste2 to my sort_list(words of enTexte)
set off7 to 0
repeat with l in my liste1
set liste4 to {}

set nbw to count of words of (l as text)
if nbw > 0 then
repeat with i from 1 to nbw
copy item (off7 + i) of my liste2 to end of liste4
end repeat
end if
copy my recolle(liste4, space) to end of my liste3
set off7 to off7 + nbw
end repeat
set enTexte to my recolle(my liste3, return)
set p2d to path to desktop
set p2r to (p2d as Unicode text) & nomDuRapport
tell application "System Events"
if exists (file p2r) then delete (file p2r)
make new file at end of p2d with properties {name:nomDuRapport}
end tell
write enTexte to (p2r as alias)
my nettoie()
end commun
--=====
on nettoie()
set my liste1 to {}
set my liste2 to {}
set my liste3 to {}
end nettoie
--=====
on recolle(l, d)
local t
set AppleScript's text item delimiters to d
set t to l as text
set AppleScript's text item delimiters to ""
return t
end recolle
--=====
on sort_list(unsortedList)
set AppleScript's text item delimiters to (ASCII character 10)
set sortedList to paragraphs of (do shell script "echo " & quoted form of (unsortedList as string) & "| sort -d -f")
set AppleScript's text item delimiters to ""
return sortedList
end sort_list
--=====
--[/SCRIPT]
--


Yvan KOENIG (VALLAURIS, France) Monday, 14 September 2009 11:05:03

Sep 14, 2009 5:38 PM in response to KOENIG Yvan

"A problem clearly described is often a solved problem."

Indeed!

This script appears to work wonderfully, thank you! I didn't get it to work by dragging a file onto it, so I may have done something wrong, but it works when opening a .txt file from within the script. I initially tried a .pages file, but when that didn't work, I just made a .txt file in AppleWorks. There's probably a way to save a .txt file from Pages, but I don't presently know how.

Thanks again, very clever!

Sep 16, 2009 1:45 PM in response to KOENIG Yvan

I wonder whether you or someone else might be willing to tackle the other script? (I'd like to learn how to write such a thing myself; I'm not sure whether one needs to take classes or buy books, or what.)

What I was hoping to also have was something that would provide a word count for a text document and also a word series count.

For example to have a document consisting of this:

"Because I do not hope to turn again
Because I do not hope
Because I do not hope to turn...."

to be converted into something resembling this:

again
because x 3
do x 3
hope x 3
I x 3
not x 3
to x 2, turn x 2

because I do not hope x 3
because I do not hope to turn x 2

If the original text was arranged otherwise:

"Because I do not hope to turn again. Because I do not hope. Because I do not hope to turn...."

It should return the same listing.

The list of words, arranged alphabetically, should include even words that appear only once (like "again" in the above example), but could exclude "and" and "the." In the above, I imagined this script would list all words beginning with "a" in a single paragraph, all words beginning with "b" in a single paragraph, etc.
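
One way the single-word listing described here might look in Python, as a rough sketch only. The function name `letter_paragraphs`, the `STOP_WORDS` set, and the "word x count" output format are assumptions modeled on the example in this post; words are lowercased, so "I" comes out as "i".

```python
import re
from collections import Counter
from itertools import groupby

STOP_WORDS = {"and", "the"}  # the exclusions suggested in the post

def letter_paragraphs(text):
    """List words alphabetically with counts, one paragraph per first letter."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    paragraphs = []
    # sorted(counts) yields the distinct words in alphabetical order,
    # so groupby can collect neighbours that share a first letter.
    for _, group in groupby(sorted(counts), key=lambda w: w[0]):
        entries = [w if counts[w] == 1 else f"{w} x {counts[w]}" for w in group]
        paragraphs.append(", ".join(entries))
    return "\n".join(paragraphs)

sample = ("Because I do not hope to turn again\n"
          "Because I do not hope\n"
          "Because I do not hope to turn....")
print(letter_paragraphs(sample))
```

Because line breaks and punctuation are discarded before counting, the same text laid out as sentences on one line gives the same listing.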

The list of series of words would be anything from two words that repeatedly appear together or whole lines/sentences or stanzas/paragraphs. Possibly these repeated series of words should each be followed by a paragraph break as in the above. Alternatively, it could follow the same arrangement as the list of single words; a paragraph for all series of words beginning with the same letter.

The list of series of words for the above example should not include:

because I do not x 3
because I do x 3
because I x 3
I do x 3
I do not x 3

because they always appear in a longer series of words.

However if the text was:

"Because I do not hope to turn again
Because I do not hope
Because I do not hope to turn....
Because I do not turn"

Then it should include

because I do not x 4

I don't know if I have explained it well?

I think it can also be stated this way:

list all repetitions of single words, and their count
and also
list all repetitions of pairs of words if they are not always preceded or followed by the same third word, and their count
list all repetitions of series of three words if they are not always preceded or followed by the same fourth word, and their count
etc.
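
The rule stated above can be sketched in Python as follows. This is one possible reading, not the approach taken by the script posted later in the thread: a phrase is dropped when extending it by one word on either side gives a phrase that occurs exactly as often, since that means the shorter phrase is always preceded or followed by the same word. The name `repeated_phrases` is hypothetical, the text is treated as one continuous stream of lowercased words, and overlapping repeats of the same word (such as "no no no") are counted naively.

```python
import re
from collections import Counter

def repeated_phrases(text):
    """Map each repeated phrase of two or more words to its count,
    skipping phrases that only ever occur inside one longer phrase."""
    words = re.findall(r"[a-z']+", text.lower())
    repeated = {}  # phrase (tuple of words) -> count
    n = 2
    while True:
        grams = Counter(tuple(words[i:i + n])
                        for i in range(len(words) - n + 1))
        level = {g: c for g, c in grams.items() if c >= 2}
        if not level:
            break  # nothing of this length repeats, so nothing longer can
        for gram, count in level.items():
            # A shorter phrase with the same count never appears
            # without this longer one, so it is removed.
            for shorter in (gram[:-1], gram[1:]):
                if repeated.get(shorter) == count:
                    del repeated[shorter]
        repeated.update(level)
        n += 1
    return {" ".join(g): c for g, c in repeated.items()}

sample = ("Because I do not hope to turn again\n"
          "Because I do not hope\n"
          "Because I do not hope to turn....")
for phrase, count in repeated_phrases(sample).items():
    print(phrase, "x", count)
```

For the three-line sample this leaves only "because i do not hope" (3) and "because i do not hope to turn" (2); adding the fourth line "Because I do not turn" also brings in "because i do not" (4).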

I'm not sure how the results for repeated series of words would best be arranged; I'd be happy with any arrangement. Possibly it could create alphabetically arranged paragraphs of all the phrases appearing the most, with each following paragraph being the ones with successively fewer repetitions. Alternatively, it could begin with the longest phrases and descend to the pairs of words. Whatever's easiest!

As to why I'd like this, this is part of a kind of analytical writing tool (although it could also be used to check one's own writing for excessive repetition). I came across it in Writing Analytically by Rosenwasser and Stephen (not as a program, but something a reader works out manually); I don't believe it is unique to them. The user of such a script would still be left to determine what the importance of any of the repetitions might mean, or how the context of the words changes them.

Sep 18, 2009 12:01 PM in response to Christopher Philippo

Christopher Philippo wrote:
"A problem clearly described is often a solved problem."

Indeed!

"This script appears to work wonderfully, thank you!

I didn't get it to work by dragging a file onto it, so I may have done something wrong, but it works when opening a .txt file from within the script."

Drag and drop requires that the script be saved as an application bundle.
It can't work if the script is saved as a plain ".scpt" file.

"I initially tried a .pages file, but when that didn't work, I just made a .txt file in AppleWorks."

My fault, perhaps: I didn't point out that a plain text file is required.

"There's probably a way to save a .txt file from Pages, but I don't presently know how."


Share > Export > Standard

Thanks for the feedback.

Yvan KOENIG (VALLAURIS, France) Friday, 18 September 2009 21:01:11

Sep 18, 2009 12:22 PM in response to Christopher Philippo

Listing the counted words is child's play.

--

--[SCRIPT sortand_countwords]
(*
Save the script as a script (sortand_countwords.scpt) or an application bundle (sortand_countwords.app).
Run it, or drag and drop a text file's icon (xxx.txt) onto the sortand_countwords.app icon.
The script reads the file,
sorts the words it contains,
and counts the occurrences of each word.
The result is stored on the desktop in "countedWords.txt".
***********
Yvan KOENIG (VALLAURIS, France)
2009/09/18
*)
property nomDuRapport : "countedWords.txt"
property rapport : "" -- globale
property liste1 : {}
property liste2 : {}
property liste3 : {}
--=====
on run (* lines executed when the script application's icon is double-clicked *)

set fichier to choose file of type {"public.plain-text"} without invisibles
my commun({fichier})
end run
--=====
on open (sel) (* sel contains a list of aliases of the items dropped on the script's icon (the selection) *)
my commun(sel)
end open
--=====
on commun(elems)
my nettoie()
set theDoc to item 1 of elems
set enTexte to read theDoc

set my liste1 to paragraphs of enTexte
set my liste2 to my sort_list(words of enTexte)
set oldWord to ""
set cnt to 1
repeat with i from 1 to count of my liste2
set iw2 to (item i of my liste2) as text
if iw2 is not oldWord then
if oldWord is not "" then copy oldWord & "(" & cnt & ")" to end of my liste3
set oldWord to iw2
set cnt to 1
else
set cnt to cnt + 1
end if

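end repeat
(* the loop only writes a word when the next different word arrives, so the last word must be written here *)
if oldWord is not "" then copy oldWord & "(" & cnt & ")" to end of my liste3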
set enTexte to my recolle(my liste3, return)
set p2d to path to desktop
set p2r to (p2d as Unicode text) & nomDuRapport
tell application "System Events"
if exists (file p2r) then delete (file p2r)
make new file at end of p2d with properties {name:nomDuRapport}
end tell
write enTexte to (p2r as alias)
my nettoie()
end commun
--=====
on nettoie()
set my liste1 to {}
set my liste2 to {}
set my liste3 to {}
end nettoie
--=====
on recolle(l, d)
local t
set AppleScript's text item delimiters to d
set t to l as text
set AppleScript's text item delimiters to ""
return t
end recolle
--=====
on sort_list(unsortedList)
set AppleScript's text item delimiters to (ASCII character 10)
set sortedList to paragraphs of (do shell script "echo " & quoted form of (unsortedList as string) & "| sort -d -f")
set AppleScript's text item delimiters to ""
return sortedList
end sort_list
--=====
--[/SCRIPT]
--


At the moment I'm busy and tired, so I don't know when I will be able to look at the other aspects of your needs.

Yvan KOENIG (VALLAURIS, France) Friday, 18 September 2009 21:22:14

Sep 21, 2009 12:29 PM in response to KOENIG Yvan

And now, list and count chunks of words.

--

--[SCRIPT countgroups_ofwords]
(*
Save the script as a script (sortand_countwords.scpt) or an application bundle (sortand_countwords.app).
Run it, or drag and drop a text file's icon (xxx.txt) onto the sortand_countwords.app icon.
The script reads the file
and counts the groups of words in the text.
The result is stored on the desktop in "countedGroupsOfWords.txt".
It can then be processed at leisure in a spreadsheet.
***********
Yvan KOENIG (VALLAURIS, France)
2009/09/21
*)
property nomDuRapport : "countedGroupsOfWords.txt"
property rapport : "" -- globale
property liste1 : {}
property liste2 : {}
property liste3 : {}
--=====
on run (* lines executed when the script application's icon is double-clicked *)

--set fichier to choose file of type {"public.plain-text"} without invisibles
set fichier to "Macintosh HD:Users:yvankoenig:Desktop:sort_words:sortwords:Lovecraft.txt" as alias
my commun({fichier})
end run
--=====
on open (sel) (* sel contains a list of aliases of the items dropped on the script's icon (the selection) *)
my commun(sel)
end open
--=====
on commun(elems)
my nettoie()
set theDoc to item 1 of elems
set enTexte to read theDoc

set enTexte to my supprime(enTexte, ",")
set enTexte to my remplace(enTexte, "-", " ")
set my liste1 to paragraphs of enTexte
(* convert to lowercase *)
repeat with i from 1 to count of my liste1
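(* the double quotes around the Python one-liner must be escaped inside the AppleScript string; the one-liner itself is Python 2 syntax *)
set item i of my liste1 to do shell script "/usr/bin/python -c \"import sys; print unicode(sys.argv[1], 'utf8').lower().encode('utf8')\" " & quoted form of (item i of my liste1)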
end repeat
set enTexte to my recolle(my liste1, return)

set my liste2 to {"string" & tab & "count"}
set my liste3 to {"#" & tab & "string" & tab & "count" & tab & "nb words"}

set l to 0
ignoring case
repeat with i from 1 to count of my liste1
set wordsI to words of item i of my liste1
repeat with j from 1 to count of wordsI
repeat with k from (count of wordsI) to 1 by -1
if not j > k then
set strIJK to my recolle(items j thru k of wordsI, " ")
set nbr to (count of my decoupe(enTexte, strIJK)) - 1
set rec to strIJK & tab & nbr
if rec is not in my liste2 then
set l to l + 1
copy rec to end of my liste2
copy (l as text) & tab & rec & tab & k + 1 - j to end of my liste3
end if
end if
end repeat -- k
end repeat -- j
end repeat -- i
end ignoring

set enTexte to my recolle(my liste3, return)
set p2d to path to desktop
set p2r to (p2d as Unicode text) & nomDuRapport
tell application "System Events"
if exists (file p2r) then delete (file p2r)
make new file at end of p2d with properties {name:nomDuRapport}
end tell
write enTexte to (p2r as alias)

my nettoie()
end commun
--=====
on nettoie()
set my liste1 to {}
set my liste2 to {}
set my liste3 to {}
end nettoie
--=====
on decoupe(t, d)
local l
set AppleScript's text item delimiters to d
set l to text items of t
set AppleScript's text item delimiters to ""
return l
end decoupe
--=====
on remplace(t, d1, d2)
local l
set AppleScript's text item delimiters to d1
set l to text items of t
set AppleScript's text item delimiters to d2
set t to l as text
set AppleScript's text item delimiters to ""
return t
end remplace
--=====
on recolle(l, d)
local t
set AppleScript's text item delimiters to d
set t to l as text
set AppleScript's text item delimiters to ""
return t
end recolle
--=====
on supprime(t, d)
local l
set AppleScript's text item delimiters to d
set l to text items of t
set AppleScript's text item delimiters to ""
return l as text
end supprime
--=====
on sort_list(unsortedList)
set AppleScript's text item delimiters to (ASCII character 10)
set sortedList to paragraphs of (do shell script "echo " & quoted form of (unsortedList as string) & "| sort -d -f")
set AppleScript's text item delimiters to ""
return sortedList
end sort_list
--=====
--[/SCRIPT]
--


Yvan KOENIG (VALLAURIS, France) Monday, 21 September 2009 21:28:46

Sep 22, 2009 5:57 AM in response to Christopher Philippo

Oops,
I forgot to remove the hard-coded path I used for testing (I'm too lazy to go through the dialog each time).

At the moment the script contains:

--

on run (* lines executed when the script application's icon is double-clicked *)

--set fichier to choose file of type {"public.plain-text"} without invisibles
set fichier to "Macintosh HD:Users:yvankoenig:Desktop:sort_words:sortwords:Lovecraft.txt" as alias
my commun({fichier})
end run
--


Edit this handler to read:

--

on run (* lines executed when the script application's icon is double-clicked *)

set fichier to choose file of type {"public.plain-text"} without invisibles
-- set fichier to "Macintosh HD:Users:yvankoenig:Desktop:sort_words:sortwords:Lovecraft.txt" as alias
my commun({fichier})
end run
--


Yvan KOENIG (VALLAURIS, France) Tuesday, 22 September 2009 14:56:15
