Pages and Terminal output different results of word count

If I pass my .rtf file to the wc -w command, it returns 6909 words, whereas the word counter in Apple's Pages shows 2617. Why?

Mac OS X (10.7.5), MacBook Pro 15.4 mid-2012

Posted on Apr 29, 2016 3:40 PM

Question marked as Best reply

Posted on Apr 29, 2016 7:45 PM

Because you are passing the raw RTF syntax in that document to wc, and Pages is showing only the actual words in your document. RTF control words permeate that .rtf document, and wc -w counts each whitespace-separated run of them as a word. Here is an example:


{\rtf1\ansi\ansicpg1252\cocoartf1265\cocoasubrtf210

{\fonttbl\f0\fnil\fcharset0 Baskerville;}

{\colortbl;\red255\green255\blue255;\red26\green26\blue26;\red164\green8\blue0;}

\margl1440\margr1440\vieww8820\viewh17160\viewkind0

\deftab720

\pard\pardeftab720\sa200\pardirnatural

{\header \pard\ql\b\f0\fs28 [ zig.rtf ] \par}





\f0\fs38 \cf2 Lorem ipsum dolor sit amet, ligula suspendisse nulla pretium, rhoncus tempor placerat fermentum, enim integer ad vestibulum volutpat. Nisl rhoncus turpis est, vel elit, congue wisi enim nunc ultricies sit, magna tincidunt. Maecenas aliquam maecenas \cf0 ligula\cf2 \cf0 nostra\cf2 , accumsan taciti. Sociis mauris in integer, a dolor netus non dui aliquet, sagittis felis sodales, dolor sociis mauris, vel eu libero cras. Interdum at. Eget habitasse elementum est, ipsum purus pede porttitor class, ut adipiscing, aliquet sed auctor, imperdiet arcu per diam dapibus libero duis. Enim eros in vel, volutpat nec pellentesque leo, {\field{\*\fldinst{HYPERLINK "http://www.hp.com"}}{\fldrslt temporibus}} scelerisque \cf3 nec\cf2 .\


and this is the text that Pages '09 shows you:

User uploaded file
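
To see the effect on a small scale, here is a minimal sketch. The RTF fragment is a made-up sample (not the poster's file), and the textutil step assumes a Mac where that tool accepts -stdin and -format rtf; elsewhere that step is simply skipped.

```shell
# Hypothetical miniature RTF document, much shorter than a real one.
rtf='{\rtf1\ansi{\fonttbl\f0\fnil Baskerville;}
\f0\fs38 \cf2 Lorem ipsum dolor sit amet\cf0 .}'

# wc -w counts every whitespace-separated token, control words included.
raw_count=$(printf '%s\n' "$rtf" | wc -w | awk '{print $1}')
echo "raw RTF tokens: $raw_count"

# On a Mac, textutil can convert the RTF to plain text before counting.
if command -v textutil >/dev/null 2>&1; then
    text_count=$(printf '%s\n' "$rtf" |
        textutil -stdin -stdout -format rtf -convert txt |
        wc -w | awk '{print $1}')
    echo "words after conversion: $text_count"
fi
```

The sentence in the sample has five words, yet the raw count is higher because tokens such as \f0\fs38 and \cf2 are counted too.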

32 replies

May 4, 2016 3:40 PM in response to scrutinizer82

With the exception of replacing $f with "${f}" — your Run Shell Script action syntax is identical to mine. However, there may be additional code below your existing content that does not show.


Hover your mouse over the bottom blue border of the Run Shell Script action. When the pointer turns into a horizontal bar with opposing vertical arrows, click and drag to expand your action downwards to see if there is extra code in there that is bombing the action.

May 4, 2016 4:02 PM in response to VikingOSX

Ok, so I changed the output to Variable Path (it set itself to Desktop instantly). Still getting that error (I even renamed the action to something more Unix-friendly, to no avail). My version of Automator is 2.2.4, what's yours? I feel mine's buggy as ****. I checked: no extra code hiding.


How do I run that code from within Terminal - just to check how it would work?

May 4, 2016 4:55 PM in response to scrutinizer82

Type only the blue text in Terminal. Assumption is that you are in the same directory location as the space punctuated PDF file. The first line is going to wrap here — there is no return between the first and last parenthesis.

$ words=$(textutil -stdout -convert txt ./"that long space punctuated name.PDF" | wc -w | awk '$1=$1')

$ osascript -e "display dialog \"Word Count: $words\" as text"
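
A note on the awk '$1=$1' step, as a hedged sketch: on OS X, wc pads its count with leading spaces, and reassigning the first field makes awk rebuild the line without the padding. The padded strings below are simulated so the example does not depend on any particular wc.

```shell
# Simulate the space-padded count that wc prints on OS X.
padded='    2617'

# awk '$1=$1' reassigns the first field, which makes awk rebuild the line
# with the padding removed; the assignment's value doubles as the pattern.
trimmed=$(printf '%s\n' "$padded" | awk '$1=$1')
echo "Word Count: $trimmed"

# Caveat: a count of 0 makes the pattern false, so nothing is printed.
zero_naive=$(printf '%s\n' '       0' | awk '$1=$1')
# Printing the field explicitly handles 0 as well.
zero_safe=$(printf '%s\n' '       0' | awk '{print $1}')
echo "naive: [$zero_naive] safe: [$zero_safe]"
```

For an empty document, awk '{print $1}' may therefore be the safer choice.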

May 5, 2016 4:58 AM in response to VikingOSX

Thank you for your patience trying to help, but I let it go. The action still fails. I still get either "file doesn't exist" or "user interaction not allowed" messages.


Moreover, I couldn't even execute these shell lines in Terminal. They display the same "file doesn't exist" message or produce output that is a mess of creepy symbols (in the latter case I had omitted the -stdout flag). I then figured the problem was the filename and corrected it (it contained ./, which I took to be incorrect, so I deleted it) and re-added -stdout, but that went nowhere: the conversion displayed RTF text-handling data, and I could not find where the converted file went.


What a headache!!

May 5, 2016 5:11 AM in response to scrutinizer82

Well, an oversight on my behalf. Textutil wants the Rich Text document (RTF) that is the result of the PDF to Text action. It does not handle PDF documents as input files. Change that, and the command-line syntax will work as posted. The './' syntax just tells Textutil that the input file is located in the current directory location. I use it in examples because your Bash startup files may not have incorporated the current directory location in their directory search hierarchy.
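
A small sketch of what the ./ prefix does and does not change. As far as I can tell, the shell's search path (PATH) only affects how commands are located; a relative file name given as an argument is resolved against the current directory with or without ./. The temporary directory and file name are invented for the demonstration.

```shell
# Work in a throwaway directory so nothing real is touched.
demo_dir=$(mktemp -d)
printf 'one two three\n' > "$demo_dir/name with spaces.txt"

# Command substitution runs in a subshell, so the cd does not leak out.
# The quotes keep the space-punctuated name together as one argument.
plain=$(cd "$demo_dir" && wc -w < "name with spaces.txt" | awk '{print $1}')
dotted=$(cd "$demo_dir" && wc -w < ./"name with spaces.txt" | awk '{print $1}')

echo "without ./: $plain, with ./: $dotted"
rm -r "$demo_dir"
```

Where ./ does help is with names that begin with a hyphen, which a command could otherwise mistake for an option.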

May 5, 2016 3:58 PM in response to VikingOSX

I'm not that familiar with shell scripting syntax. I don't know, for example, what $f is for, what I have to change, or what it changes or stores. However, simple logic tells me that since the result of Extract text from PDF is passed to that shell script, all the script has to do is count words and pop up a window, and for that the input should be text (not necessarily rtf, it can be txt as well).


I would also like to be able to just drag and drop a PDF file onto my app's icon to trigger the workflow.
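
For readers with the same question about $f, a minimal sketch: when Run Shell Script passes input as arguments, "$@" stands for the whole list of arguments and f is a loop variable that holds one of them on each pass. The function name and file names below are hypothetical.

```shell
# Hypothetical helper: prints each argument it receives, one per line.
list_inputs() {
    for f in "$@"; do      # "$@" expands to all arguments, each kept intact
        echo "input: $f"   # $f holds the current argument on this pass
    done
}

out=$(list_inputs "first file.rtf" "second.rtf")
echo "$out"
```

Writing "${f}" instead of $f keeps a name with spaces from being split into several words.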

May 6, 2016 3:18 PM in response to VikingOSX

Hi again, VikingOSX. I'm back just to report on some progress I made over the last few days investigating these errors. Your code had errors in the osascript part. Here's how it should have looked:


for f in "$@"
do
words=$(textutil -stdout -convert txt  | wc -w | awk '$1=$1')
osascript -e 'tell application "Finder"' -e 'activate' -e 'display dialog "Word Count: $words" as text
end tell'
done

I now get that Finder window popping up but with "Word Count: $words" line instead of "Word count: [number]".


User uploaded file
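
The literal "$words" in that dialog is what shell quoting would predict: inside single quotes the shell does not expand variables. A minimal sketch with a placeholder value (no dialog is shown, the strings are only printed):

```shell
words=2617   # placeholder value

# Single quotes: the shell passes $words through literally.
single='display dialog "Word Count: $words"'
# Double quotes: the shell substitutes the value; inner quotes are escaped.
double="display dialog \"Word Count: $words\""

echo "$single"
echo "$double"
```

Note also that the textutil line in the post above names no input file, so it has nothing to convert.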

May 6, 2016 5:27 PM in response to scrutinizer82

The following lines in the Run Shell Script that is set to pass input: as arguments will produce a dialog with a word count value in it. There is no need to use the Finder tell block.


for f in "$@"

do

words=$(textutil -stdout -convert txt "${f}" | wc -w | awk '$1=$1')

osascript -e "display dialog \"Word Count: $words\" as text & return & current date"

done

User uploaded file

May 7, 2016 4:29 AM in response to VikingOSX

VikingOSX wrote:


The following lines in the Run Shell Script that is set to pass input: as arguments will produce a dialog with a word count value in it. There is no need to use the Finder tell block.


for f in "$@"

do

words=$(textutil -stdout -convert txt "${f}" | wc -w | awk '$1=$1')

osascript -e "display dialog \"Word Count: $words\" as text & return & current date"

done


And yet these lines produce constant error messages. The corresponding log entries (at the bottom of the Automator worksheet) show warnings such as "no user interaction allowed" and syntax errors (like "expected expression but found end of line", "expected end of line but found identifier/expression/parameter", etc.). I dug into the issue and here's what I found:


http://stackoverflow.com/questions/13484482/no-user-interaction-allowed-when-running-applescript-in-python


The quote:

from the command line notice this won't work...

osascript -e 'display dialog "Now it will not work."'

But this will work since we tell the Finder to do it...

osascript -e 'tell application "Finder"' -e 'activate' -e 'display dialog "Now it works!"' -e 'end tell'

Only after I dropped the "\" characters and added the tell block did it stop spitting out those error messages and finally display the dialog, albeit not with the content it was meant to show. Otherwise, I can't for the life of me discover what interrupts it.

May 7, 2016 8:07 AM in response to scrutinizer82

Apple usually makes changes to AppleScript (and Automator) with each new release of OS X. Some of these changes are not backwards compatible to older releases of OS X. The workflow and shell examples that I provided you are accurate and functional on El Capitan, but break on versions of OS X older than Mavericks. It was in Mavericks that one could begin using the display dialog in osascript without a tell block to support it. I believe you may still be on Lion.


This morning, I took my El Capitan workflow over to Snow Leopard, and had to rework the Run Shell Script action contents to get a word count dialog. When I brought this back to El Capitan, and ran the recompiled Automator workflow, it also worked as expected without any modifications. The link to stackoverflow that you provided was from 2012, and for that timeframe, a Finder tell block was required with display dialog.


So here it is. It incorporates a HERE document with osascript to cut back on quote escaping.

User uploaded file

User uploaded file
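
The screenshots are not reproduced here, so the following is only a guess at the general shape of a here-document approach, not VikingOSX's actual script. It assumes osascript reads the script from standard input when no file or -e option is given; the show_dialog helper and the placeholder count are hypothetical.

```shell
words=2617   # placeholder; in the workflow this would come from wc -w

# Unquoted EOF delimiter: the shell expands $words inside the here document,
# and the AppleScript double quotes need no backslashes.
script=$(cat <<EOF
tell application "Finder"
    activate
    display dialog "Word Count: $words"
end tell
EOF
)
printf '%s\n' "$script"

# Hypothetical helper; call it on a Mac to actually show the dialog.
show_dialog() {
    printf '%s\n' "$script" | osascript
}
```

The Finder tell block is kept here because, per the post above, older releases of OS X needed it for display dialog.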

May 7, 2016 3:58 PM in response to scrutinizer82

Hello scrutinizer82,

Before you get too far with this, you should realize that PDF is strictly a print format. It is designed to be printed, on a printer, and read by a human. There is really no other use case. From time to time you may get lucky and have a PDF that appears to have text content. But in no case will you ever extract the complete text from a PDF file.

May 7, 2016 4:53 PM in response to VikingOSX

Shell script seems to be too hard for me as of now. I managed to solve the problem by adding "Run AppleScript" action and executing shell script from there.


I passed a very simple script containing a variable:


set WordCount to do shell script "wc -w /Users/myusername/Desktop/*.txt"

tell application "Finder"

display dialog WordCount

end tell


That's it, no torture anymore trying to painstakingly figure out all the subtleties of a very capricious shell (though I'm learning UNIX and still would like to be able to master shell-scripting one day to the level when I could utilize the full power of OS X which I need for my daily work).


P.S. BTW, why does "Run AppleScript" have that strange layout, "on run, parameters blah-blah-blah (*your text goes here*) end run"? It's utter nonsense: when I inserted my script the first time it went nowhere, so I just deleted all that junk and typed it in as I would in AppleScript Editor, and succeeded!
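
One thing to watch with the wc -w .../*.txt call in the AppleScript above: when the pattern matches several files, wc prints one line per file plus a total, so the dialog shows all of those lines. A minimal sketch with invented files in a temporary directory:

```shell
demo_dir=$(mktemp -d)
printf 'one two three\n' > "$demo_dir/a.txt"
printf 'four five\n' > "$demo_dir/b.txt"

# With several files, wc prints one line per file plus a "total" line.
wc -w "$demo_dir"/*.txt

# Concatenating first gives a single number for all files together.
total=$(cat "$demo_dir"/*.txt | wc -w | awk '{print $1}')
echo "Word Count: $total"
rm -r "$demo_dir"
```

With a single .txt file on the Desktop the difference does not arise, apart from the file name that wc prints after the count.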
