15 Replies Latest reply: Sep 22, 2009 8:35 PM by Christopher Philippo
Christopher Philippo Level 3 (505 points)
I don't know whether Pages can do this, or whether it would take another Apple application (Numbers?) or a third-party application.

For example, taking "the wrong place at the wrong time" and creating a new document reading:

at 1
place 1
the 2
time 1
wrong 2

or something like that.
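
For illustration only, the listing above can be reproduced with a few lines of Python rather than Pages or Numbers. This is a minimal sketch: the function name `word_counts` and the rule that a "word" is a run of letters or apostrophes are assumptions made for the example, not anything an Apple application provides.

```python
import re
from collections import Counter


def word_counts(text):
    """Return (word, count) pairs in alphabetical order.

    Assumption: a word is a run of letters or apostrophes, and case is ignored.
    """
    words = re.findall(r"[a-z']+", text.lower())
    return sorted(Counter(words).items())


if __name__ == "__main__":
    for word, count in word_counts("the wrong place at the wrong time"):
        print(word, count)
```

Because the text is lowercased before counting, "The" and "the" are counted as the same word.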

There's something similar I'd be interested in as well, taking for example:

one two
three four five
six

and alphabetizing it using the same number of words per line:

five four
one six three
two

iMac 24", Mac OS X (10.6), 2.4 GHz Intel Core 2 Duo; 1 GB 667 MHz DDR2 SDRAM
  • Level 8 (41,790 points)
    My guess is that's a job for an AppleScript.

    Are you asking about a word-processing document in which all the text is in the main text layer, or do the words also need to be grabbed from text boxes?

    I'm a bit puzzled by your last example:

    and alphabetizing it using the same number of words per line:

    five four
    one six three
    two


    As far as I can see, the words aren't alphabetized, and
    there are two on row 1,
    three on row 2,
    one on row 3.

    Yvan KOENIG (VALLAURIS, France) Sunday 13 September 2009 21:16:38
  • Christopher Philippo Level 3 (505 points)
    You may be right about using AppleScript. I've never created one, and I'm not sure how one would go about doing alphabetization and word counts with it. But yes, it's a matter of taking all the words in a single text and doing this for them.

    With regard to the second example:

    one two
    three four five
    six

    gets alphabetized as:

    five four one six three two

    and arranged according to the number of words per line, reading right to left on each line:

    five four
    one six three
    two

    And yes, in this case there are two on line one, three on line two, and one on line three. However, I'd like something that preserves the number of words per line regardless of how many words or lines there are; it's not this specific 2/3/1 pattern I want. So for another example:

    one two three
    four
    five six
    seven eight

    would be

    eight five four
    one
    seven six
    three two

    Doing this by hand when there are more words and lines is a pain, and it would be nice to be able to automate it.
  • Level 8 (41,790 points)
    I can write (tomorrow) a script that builds an alphabetized list of the embedded words.
    I can also have it count the occurrences of each word.

    But I really don't understand how the words must be grouped, so I can't code the last part of the task.

    For my small brain, which never reads or counts from right to left,
    the list would be:

    eight five
    four one
    seven six
    three two

    or

    eight five four
    one seven six
    three two

    or

    eight five four one
    seven six three two

    I am unable to code what doesn't appear logical.

    Yvan KOENIG (VALLAURIS, France) Sunday 13 September 2009 21:59:12
  • Christopher Philippo Level 3 (505 points)
    That would be great if you could!

    With regard to the second thing I'm trying to do, I mistyped! Sorry to have confused things! I meant "left to right," but wrote "right to left" in error.

    To try to describe it, without using an example, as maybe that's somehow confusing things:

    The tasks that need to be done for it are:

    1) count the number of words per line
    2) alphabetize the words
    3) redistribute the alphabetized words preserving the number of words per line

    If the original text has five lines with word counts per line of 1/2/3/4/5, then the new text will also have 1/2/3/4/5. If the original text has three lines with word counts per line of 5/4/3, then the new text will also have 5/4/3. The number of words per line should always match the original.

    So to return to an example:

    The original text:

    I
    like Apple

    first step: count the number of words per line:
    first line: one word; second line: two words

    second step: alphabetize the words (this step need not actually be produced on paper):
    Apple I like

    third step: rearrange the text, distributing the words from left to right, preserving the number of words per line, resulting in the final text:

    Apple
    I like
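
The three steps above can be sketched in Python, purely as an illustration of the logic and not as a Pages or AppleScript solution. The function name `redistribute` is made up for the example, and the sketch assumes that words are separated by whitespace and that punctuation stays attached to its word.

```python
def redistribute(text):
    """Alphabetize all words, then refill the lines with the original word counts."""
    lines = text.splitlines()
    # Step 1: count the number of words on each line (0 for a blank line).
    counts = [len(line.split()) for line in lines]
    # Step 2: alphabetize every word in the text, ignoring case.
    words = sorted(text.split(), key=str.lower)
    # Step 3: deal the sorted words back out, left to right, line by line.
    result = []
    position = 0
    for n in counts:
        result.append(" ".join(words[position:position + n]))
        position += n
    return "\n".join(result)


if __name__ == "__main__":
    print(redistribute("I\nlike Apple"))
```

A blank line has a word count of zero, so paragraph breaks come out in the same places as in the original.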
  • Level 8 (41,790 points)
    OK
    You want to move the words so that they are alphabetized, but
    so that every edited line contains the number of words in the original line.

    Does each word appear only once?

    Could you attach a sample original file to an email and send it to my mailbox?
    Click my blue name to get my address.

    Yvan KOENIG (VALLAURIS, France) Sunday 13 September 2009 22:33:57
  • Christopher Philippo Level 3 (505 points)
    "You want to move the words so that they are alphabetized, but
    so that every edited line contains the number of words in the original line."

    I think you've got it!

    "Does each word appear only once?"

    No, a word might be used more than once.

    Here's a longer example (the original is a poem by H.P. Lovecraft):

    The place was dark and dusty and half-lost
    In tangles of old alleys near the quays,
    Reeking of strange things brought in from the seas,
    And with queer curls of fog that west winds tossed,
    Small lozenge panes obscured by smoke and frost,
    Just shewed the books, in piles like twisted trees,
    Rotting from floor to roof-congeries
    Of crumbling elder lore at little cost.

    I entered, charmed, and from a cobwebbed heap
    Took up the nearest tome and thumbed it through,
    Trembling at curious words that seemed to keep
    Some secret, monstrous if only one knew
    Then, looking for some seller old in craft,
    I could find nothing but a voice that laughed.

    ----

    So the above, with the words alphabetized and rearranged with the same number of words per line (also not just the same line breaks but also the same paragraph break), becomes:

    a a alleys and and and and and and
    at at books brought but by charmed cobwebbed
    congeries cost could craft crumbling curious curls dark dusty
    elder entered find floor fog for from from frost from
    half heap in I I if in in
    in it just keep knew laughed like little looking
    lore lost lozenge monstrous near nearest
    nothing obscured of of of of old

    old one only panes piles place queer quays
    roof reeking rotting seas secret seemed seller shewed small
    smoke some some strange tangles that that that
    the the the the the then things
    through thumbed to took tome to tossed trembling
    trees twisted up voice was west winds with words

    ----

    If you're wondering why, this is part of something called vocabularyclept poetry. One person selects a poem, then alphabetizes it as above, then gives it to somebody else to create a new poem from. It's described in more detail in the book Palindromes and Anagrams by Howard W. Bergerson pages 20-39. Much of it can be previewed in Google Books, if you're inclined.

    The other thing, the alphabetized list of words with word counts, is an unrelated task.
  • Level 8 (41,790 points)
    A problem clearly described is often a solved problem.

    --

    --[SCRIPT sort_words]
    (*
    Save the script as a script or as an application bundle.
    Run it, or drag and drop a text file's icon onto its icon.
    The script reads the file,
    sorts the embedded words,
    and splits the sorted list into lines with the same number of words as the original lines.
    The result is stored on the desktop in "sortedWords.txt".

    ***********

    Yvan KOENIG (VALLAURIS, France)
    20090914
    *)

    property nomDuRapport : "sortedWords.txt"

    property rapport : "" -- global

    property liste1 : {}
    property liste2 : {}
    property liste3 : {}

    --=====

    on run (* lines executed when the script application's icon is double-clicked *)

    set fichier to choose file of type {"public.plain-text"} without invisibles
    my commun({fichier})
    end run

    --=====

    on open (sel) (* sel contains a list of aliases of the items dropped on the script's icon (the selection) *)
    my commun(sel)
    end open

    --=====
    on commun(elems)
    my nettoie()
    set theDoc to item 1 of elems
    set enTexte to read theDoc

    set my liste1 to paragraphs of enTexte
    set my liste2 to my sort_list(words of enTexte)
    set off7 to 0
    repeat with l in my liste1
    set liste4 to {}

    set nbw to count of words of (l as text)
    if nbw > 0 then
    repeat with i from 1 to nbw
    copy item (off7 + i) of my liste2 to end of liste4
    end repeat
    end if
    copy my recolle(liste4, space) to end of my liste3
    set off7 to off7 + nbw
    end repeat
    set enTexte to my recolle(my liste3, return)
    set p2d to path to desktop
    set p2r to (p2d as Unicode text) & nomDuRapport
    tell application "System Events"
    if exists (file p2r) then delete (file p2r)
    make new file at end of p2d with properties {name:nomDuRapport}
    end tell
    write enTexte to (p2r as alias)
    my nettoie()
    end commun

    --=====

    on nettoie()
    set my liste1 to {}
    set my liste2 to {}
    set my liste3 to {}
    end nettoie

    --=====

    on recolle(l, d)
    local t
    set AppleScript's text item delimiters to d
    set t to l as text
    set AppleScript's text item delimiters to ""
    return t
    end recolle

    --=====

    on sort_list(unsortedList)
    set AppleScript's text item delimiters to (ASCII character 10)
    set sortedList to paragraphs of (do shell script "echo " & quoted form of (unsortedList as string) & "| sort -d -f")
    set AppleScript's text item delimiters to ""
    return sortedList
    end sort_list

    --=====
    --[/SCRIPT]
    --


    Yvan KOENIG (VALLAURIS, France) Monday 14 September 2009 11:05:03
  • Christopher Philippo Level 3 (505 points)
    "A problem clearly described is often a solved problem."

    Indeed!

    This script appears to work wonderfully, thank you! I didn't get it to work by dragging a file onto it, so I may have done something wrong, but it works when opening a .txt file from within the script. I initially tried a .pages file, but when that didn't work, I just made a .txt file in AppleWorks. There's probably a way to save a .txt file from Pages, but I don't presently know how.

    Thanks again, very clever!
  • Christopher Philippo Level 3 (505 points)
    I wonder whether you or someone else might be willing to tackle the other script? (I'd like to learn how to write such a thing myself; I'm not sure whether one needs to take classes or buy books, or what.)

    What I was hoping to also have was something that would provide a word count for a text document and also a word series count.

    For example to have a document consisting of this:

    "Because I do not hope to turn again
    Because I do not hope
    Because I do not hope to turn...."

    to be converted into something resembling this:

    again
    because x 3
    do x 3
    hope x 3
    I x 3
    not x 3
    to x 2, turn x 2

    because I do not hope x 3
    because I do not hope to turn x 2

    If the original text was arranged otherwise:

    "Because I do not hope to turn again. Because I do not hope. Because I do not hope to turn...."

    It should return the same listing.

    The list of words, arranged alphabetically, should include even words that appear only once (like "again" in the above example), but could exclude "and" and "the." In the above, I imagined this script would list all words beginning with "a" in a single paragraph, all words beginning with "b" in a single paragraph, etc.

    The list of series of words would be anything from two words that repeatedly appear together or whole lines/sentences or stanzas/paragraphs. Possibly these repeated series of words should each be followed by a paragraph break as in the above. Alternatively, it could follow the same arrangement as the list of single words; a paragraph for all series of words beginning with the same letter.

    The list of series of words for the above example should not include:

    because I do not x 3
    because I do x 3
    because I x 3
    I do x 3
    I do not x 3

    because they always appear in a longer series of words.

    However if the text was:

    "Because I do not hope to turn again
    Because I do not hope
    Because I do not hope to turn....
    Because I do not turn"

    Then it should include

    because I do not x 4

    I don't know if I have explained it well?

    I think it can also be stated this way:

    list all repetitions of single words, and their count
    and also
    list all repetitions of pairs of words if they are not always preceded or followed by the same third word, and their count
    list all repetitions of series of three words if they are not always preceded or followed by the same fourth word, and their count
    etc.
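
To make the rule above concrete, here is a minimal Python sketch of one way to read it. It is an assumption-laden illustration, not a finished tool: the name `repeated_phrases`, the `max_len` cap, and the definition of a word as a run of letters or apostrophes are all invented for the example. The text is treated as one continuous stream of lowercased words, so line breaks and sentence breaks make no difference, and a phrase is dropped when some phrase one word longer contains it and occurs exactly as often.

```python
import re
from collections import Counter


def repeated_phrases(text, min_len=2, max_len=12):
    """Return (phrase, count) pairs for repeated word series, alphabetically.

    A series is left out when extending it by one word on either side
    gives a series with the same count, i.e. it never occurs on its own.
    """
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for n in range(min_len, min(max_len, len(words)) + 1):
        for i in range(len(words) - n + 1):
            counts[tuple(words[i:i + n])] += 1

    # Mark every series that is always part of the same longer series.
    absorbed = set()
    for gram, count in counts.items():
        for shorter in (gram[:-1], gram[1:]):
            if counts.get(shorter) == count:
                absorbed.add(shorter)

    return sorted(
        (" ".join(gram), count)
        for gram, count in counts.items()
        if count >= 2 and gram not in absorbed
    )


if __name__ == "__main__":
    sample = ("Because I do not hope to turn again\n"
              "Because I do not hope\n"
              "Because I do not hope to turn....")
    for phrase, count in repeated_phrases(sample):
        print(phrase, "x", count)
```

The sketch counts every series up to `max_len` words, which is fine for a poem but would be slow and memory-hungry on a long document. Series longer than `max_len` are not examined at all.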

    I'm not sure how the results for repeated series of words would best be arranged; I'd be happy with any arrangement. Possibly it could create alphabetically arranged paragraphs of all the phrases appearing the most, with each following paragraph being the ones with successively fewer repetitions. Alternatively, it could begin with the longest phrases and descend to the pairs of words. Whatever's easiest!

    As to why I'd like this, this is part of a kind of analytical writing tool (although it could also be used to check one's own writing for excessive repetition). I came across it in Writing Analytically by Rosenwasser and Stephen (not as a program, but something a reader works out manually); I don't believe it is unique to them. The user of such a script would still be left to determine what the importance of any of the repetitions might mean, or how the context of the words changes them.
  • Level 8 (41,790 points)
    Christopher Philippo wrote:
    "A problem clearly described is often a solved problem."

    Indeed!

    This script appears to work wonderfully, thank you!

    I didn't get it to work by dragging a file onto it, so I may have done something wrong, but it works when opening a .txt file from within the script.

    Drag and drop requires that the script be saved as an application bundle.
    It can't work if the script is saved as a ".scpt" file.

    I initially tried a .pages file, but when that didn't work, I just made a .txt file in AppleWorks.


    Maybe my fault: I didn't highlight the fact that a text file is required.

    There's probably a way to save a .txt file from Pages, but I don't presently know how.


    Share > Export > Standard

    Thanks for the feedback.

    Yvan KOENIG (VALLAURIS, France) Friday 18 September 2009 21:01:11
  • Level 8 (41,790 points)
    Listing the counted words is child's play.

    --

    --[SCRIPT sortand_countwords]
    (*
    Save the script as a script (sortand_countwords.scpt) or as an application bundle (sortand_countwords.app).
    Run it, or drag and drop a text file's icon (xxx.txt) onto the icon of sortand_countwords.app.
    The script reads the file,
    sorts the embedded words,
    and counts the occurrences of each word.
    The result is stored on the desktop in "countedWords.txt".

    ***********

    Yvan KOENIG (VALLAURIS, France)
    2009/09/18
    *)

    property nomDuRapport : "countedWords.txt"

    property rapport : "" -- global

    property liste1 : {}
    property liste2 : {}
    property liste3 : {}

    --=====

    on run (* lines executed when the script application's icon is double-clicked *)

    set fichier to choose file of type {"public.plain-text"} without invisibles
    my commun({fichier})
    end run

    --=====

    on open (sel) (* sel contains a list of aliases of the items dropped on the script's icon (the selection) *)
    my commun(sel)
    end open

    --=====
    on commun(elems)
    my nettoie()
    set theDoc to item 1 of elems
    set enTexte to read theDoc

    set my liste1 to paragraphs of enTexte
    set my liste2 to my sort_list(words of enTexte)
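    set oldWord to ""
    set cnt to 1
    repeat with i from 1 to count of my liste2
    set iw2 to (item i of my liste2) as text
    if iw2 is not oldWord then
    -- a new word starts: record the previous word with its count
    if oldWord is not "" then copy oldWord & "(" & cnt & ")" to end of my liste3
    set oldWord to iw2
    set cnt to 1
    else
    set cnt to cnt + 1
    end if
    end repeat
    -- record the last word, which the loop above never writes out
    if oldWord is not "" then copy oldWord & "(" & cnt & ")" to end of my liste3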
    set enTexte to my recolle(my liste3, return)
    set p2d to path to desktop
    set p2r to (p2d as Unicode text) & nomDuRapport
    tell application "System Events"
    if exists (file p2r) then delete (file p2r)
    make new file at end of p2d with properties {name:nomDuRapport}
    end tell
    write enTexte to (p2r as alias)
    my nettoie()
    end commun

    --=====

    on nettoie()
    set my liste1 to {}
    set my liste2 to {}
    set my liste3 to {}
    end nettoie

    --=====

    on recolle(l, d)
    local t
    set AppleScript's text item delimiters to d
    set t to l as text
    set AppleScript's text item delimiters to ""
    return t
    end recolle

    --=====

    on sort_list(unsortedList)
    set AppleScript's text item delimiters to (ASCII character 10)
    set sortedList to paragraphs of (do shell script "echo " & quoted form of (unsortedList as string) & "| sort -d -f")
    set AppleScript's text item delimiters to ""
    return sortedList
    end sort_list

    --=====
    --[/SCRIPT]
    --


    At the moment I'm busy and tired, so I don't know when I'll be able to look at the other aspects of your needs.

    Yvan KOENIG (VALLAURIS, France) Friday 18 September 2009 21:22:14
  • Level 8 (41,790 points)
    And now, list and count chunks of words.

    --

    --[SCRIPT countgroups_ofwords]
    (*
    Save the script as a script (countgroups_ofwords.scpt) or as an application bundle (countgroups_ofwords.app).
    Run it, or drag and drop a text file's icon (xxx.txt) onto the application's icon.
    The script reads the file
    and counts the groups of words in the text.
    The result is stored on the desktop in "countedGroupsOfWords.txt".
    It can then easily be processed in a spreadsheet.

    ***********

    Yvan KOENIG (VALLAURIS, France)
    2009/09/21
    *)

    property nomDuRapport : "countedGroupsOfWords.txt"

    property rapport : "" -- global

    property liste1 : {}
    property liste2 : {}
    property liste3 : {}

    --=====

    on run (* lines executed when the script application's icon is double-clicked *)

    --set fichier to choose file of type {"public.plain-text"} without invisibles
    set fichier to "Macintosh HD:Users:yvankoenig:Desktop:sort_words:sortwords:Lovecraft.txt" as alias
    my commun({fichier})
    end run

    --=====

    on open (sel) (* sel contains a list of aliases of the items dropped on the script's icon (the selection) *)
    my commun(sel)
    end open

    --=====
    on commun(elems)
    my nettoie()
    set theDoc to item 1 of elems
    set enTexte to read theDoc

    set enTexte to my supprime(enTexte, ",")
    set enTexte to my remplace(enTexte, "-", " ")
    set my liste1 to paragraphs of enTexte
    (* convert to lowercase *)
    repeat with i from 1 to count of my liste1
    set item i of my liste1 to do shell script "/usr/bin/python -c \"import sys; print unicode(sys.argv[1], 'utf8').lower().encode('utf8')\" " & quoted form of (item i of my liste1)
    end repeat
    set enTexte to my recolle(my liste1, return)

    set my liste2 to {"string" & tab & "count"}
    set my liste3 to {"#" & tab & "string" & tab & "count" & tab & "nb words"}

    set l to 0
    ignoring case
    repeat with i from 1 to count of my liste1
    set wordsI to words of item i of my liste1
    repeat with j from 1 to count of wordsI
    repeat with k from (count of wordsI) to 1 by -1
    if not j > k then
    set strIJK to my recolle(items j thru k of wordsI, " ")
    set nbr to (count of my decoupe(enTexte, strIJK)) - 1
    set rec to strIJK & tab & nbr
    if rec is not in my liste2 then
    set l to l + 1
    copy rec to end of my liste2
    copy (l as text) & tab & rec & tab & k + 1 - j to end of my liste3
    end if
    end if
    end repeat -- k
    end repeat -- j
    end repeat -- i
    end ignoring

    set enTexte to my recolle(my liste3, return)
    set p2d to path to desktop
    set p2r to (p2d as Unicode text) & nomDuRapport
    tell application "System Events"
    if exists (file p2r) then delete (file p2r)
    make new file at end of p2d with properties {name:nomDuRapport}
    end tell
    write enTexte to (p2r as alias)

    my nettoie()
    end commun

    --=====

    on nettoie()
    set my liste1 to {}
    set my liste2 to {}
    set my liste3 to {}
    end nettoie

    --=====

    on decoupe(t, d)
    local l
    set AppleScript's text item delimiters to d
    set l to text items of t
    set AppleScript's text item delimiters to ""
    return l
    end decoupe

    --=====

    on remplace(t, d1, d2)
    local l
    set AppleScript's text item delimiters to d1
    set l to text items of t
    set AppleScript's text item delimiters to d2
    set t to l as text
    set AppleScript's text item delimiters to ""
    return t
    end remplace

    --=====

    on recolle(l, d)
    local t
    set AppleScript's text item delimiters to d
    set t to l as text
    set AppleScript's text item delimiters to ""
    return t
    end recolle

    --=====

    on supprime(t, d)
    local l
    set AppleScript's text item delimiters to d
    set l to text items of t
    set AppleScript's text item delimiters to ""
    return l as text
    end supprime

    --=====

    on sort_list(unsortedList)
    set AppleScript's text item delimiters to (ASCII character 10)
    set sortedList to paragraphs of (do shell script "echo " & quoted form of (unsortedList as string) & "| sort -d -f")
    set AppleScript's text item delimiters to ""
    return sortedList
    end sort_list

    --=====
    --[/SCRIPT]
    --


    Yvan KOENIG (VALLAURIS, France) Monday 21 September 2009 21:28:46
  • Christopher Philippo Level 3 (505 points)
    I ran into a problem with this script:

    error "File Macintosh HD:Users:yvankoenig:Desktop:sort_words:sortwords:Lovecraft.txt wasn’t found." number -43 from "Macintosh HD:Users:yvankoenig:Desktop:sort_words:sortwords:Lovecraft.txt"

    Thank you very much for your continued work on this!
  • Level 8 (41,790 points)
    Oops,
    I forgot to remove what I used for tests (I'm too lazy to use the dialog).

    At the moment the script contains:

    --

    on run (* lines executed when the script application's icon is double-clicked *)

    --set fichier to choose file of type {"public.plain-text"} without invisibles
    set fichier to "Macintosh HD:Users:yvankoenig:Desktop:sort_words:sortwords:Lovecraft.txt" as alias
    my commun({fichier})
    end run

    --


    Edit this handler to read:

    --

    on run (* lines executed when the script application's icon is double-clicked *)

    set fichier to choose file of type {"public.plain-text"} without invisibles
    -- set fichier to "Macintosh HD:Users:yvankoenig:Desktop:sort_words:sortwords:Lovecraft.txt" as alias
    my commun({fichier})
    end run

    --


    Yvan KOENIG (VALLAURIS, France) Tuesday 22 September 2009 14:56:15