## look up word in sentence?

358 Views 9 Replies Latest reply: Jan 3, 2013 11:31 PM by Barry
Calculating status...
Currently Being Moderated
Jan 3, 2013 4:43 AM

Hello all,

Is the a function to do the following...

Search a range of rows for specific 'key words" and return the total number of time that key word appears in that range...

more specifically i have a column that I have filled with simple sentences, notes about 6 different people. "Bob opens the door." or "Sam and Bob speak."

I would like to know how many times each person is mentioned, and because these notes will be evolving over time and ever expanding, I would like to automate the function so I do not have to count manually...

any help would be apprecatied.

MacBook Pro, Mac OS X (10.6.6), 4gb ram
• Level 6 (12,650 points)
Currently Being Moderated
Jan 3, 2013 7:41 AM (in response to xtrmn8ngangl)

Here is one way:

the first two rows are headers:

B3=IF(IFERROR(SEARCH(B\$2,\$A3,1), 0)>0, 1, 0)

select B3 and fill to right as needed, select B3 through the end of the row, then fill down as needed

B1=SUM(B)

select B1 and fill to the right as needed

Place the names in row 2 as shown.  The main weakness of this solution is it will not count a name if it ocurrs twice in the same sentence.

• Level 7 (29,095 points)
Currently Being Moderated
Jan 3, 2013 11:08 AM (in response to xtrmn8ngangl)

Here's a suggestion similar to Wayne's, using a formula that will count multiple occurrences of a name in the same cell/sentence.

Formula:

B3: =(LEN(\$A3)-LEN(SUBSTITUTE(\$A3,B\$1,"")))/LEN(B\$1)

Fill right and down as directed in Wayne's post.

B2: =SUM(B)

Fill right.

Note the counts of Bob and Sam in rows 6 and 7.

SUBSTITUTE does not distinguish between "Bob" as a word and "Bob" as part of a larger word ( "Bobsled" ), so it counts both occurrences of the string "Bob" in row 6.

SUBSTITUTE is case sensitive, though, and does not count "bob" in "bobsled" (in row 7) as an occurrence of "Bob" or "sam" in "sample" (in rows 6 and 7) as occurrences of "Sam".

SEARCH is insensitive to case. Note the counts for "Bob", "Sam" and "Jack" in the same two rows, using the SEARCH based formula in Wayne's post:

Regards,

Barry

• Level 7 (28,190 points)
Currently Being Moderated
Jan 3, 2013 12:06 PM (in response to xtrmn8ngangl)

Xt,

I'm curious whether the list of "Lines" will be considerably longer than the list of names mentioned. More importantly, I suppose, is whether either list, the lines or the names, will exceed 250.

Jerry

• Level 7 (29,095 points)
Currently Being Moderated
Jan 3, 2013 6:55 PM (in response to xtrmn8ngangl)

Number your characters from 11 to 32 so that each is represented by an unique two character string in the column. Since you want to count only if the character is present in that scene, you need each two character string to appear once in the list cell without regard to the number of times the name or number appears in the line, or you need a counting method that counts only whether the character's string appears in each line of the table.

The sample below uses the second method; the character's two digit identifier appears as many times as the character appears in the scene, but is counted only once.

The character lists were generated randomly, with no control over multiple appearances on the same line (see row 13, for example). The formula counts only the first appearance in the line.

Formula:

E2, and filled down: =COUNTIF(B,"=*"&D2&"*")

Regards,

Barry

• Level 6 (10,760 points)
Currently Being Moderated
Jan 3, 2013 8:11 PM (in response to xtrmn8ngangl)

If the name appears only once in each line, a simple formula to count how many lines contain the specified name is as follows:

=COUNTIF(A,"=*Bob*")

or if the name is specified in a cell, like cell B1,

=COUNTIF(A,"=*"&B1&"*")

Those two formulas do not distiguish between the name "Bob" and words that contain "bob" such as "bobsled".  You can fix this pretty easily though with several countifs added together. One will look for the name at the beginning of a sentence. One will look for the name in the middle of a sentence. And one will look for the name at the end of a sentence. For that last one, the sentences must all end in a period (as assumed in the formula below) or no period (you'll have to modify the formula accordingly).

=COUNTIF(A,"="&B1&" *")+COUNTIF(A,"=* "&B1&" *")+COUNTIF(A,"=* "&B1&".")

This formula will count all instances of the name, including multiple instances on the same line.

• Level 7 (29,095 points)
Currently Being Moderated
Jan 3, 2013 11:31 PM (in response to xtrmn8ngangl)

The "*" is a wild card character.

Each of the three formulas posted in the messages by me and by badunit uses the wildcard in a similar manner:

1. =COUNTIF(A,"=*Bob*")

2. =COUNTIF(B,"=*"&D2&"*")

3. =COUNTIF(A,"="&B1&" *")+COUNTIF(A,"=* "&B1&" *")+COUNTIF(A,"=* "&B1&".")

The first says: Count the cells in column A containing the string "Bob" preceded and/or followed by any other characters.'

The second says: Count the cells in column B containing any characters followed by the contents of cell D2 followed by any characters.

The third says:

Count the cells in column A that start with the contents of B1 followed by a space;

Add the count of all the cells that contain any characters ending with a space followed by the contents of B1 followed by any other characters starting with a space;

Add the count of all the cells that contain any characters ending with a space followed by the contents of B1 followed by a period.

The first two will count each cell meeting the condition only once.

The third will count a cell once for each of the condition it meets, to a maximum of three (one for each of the three conditions).

Formula #3. the first Bob fits the first condition; the second Bob and the third Bob both fit the second condition, but are counted only once; the fourth Bob fits the third condition and is counted for a total of three.

The ampersand in the formulas is the concatenation operator. It's function is to join (concatenate) two text strings into one, as noted by "followed by" in the descriptions. "Any character" includes "no characters," so the first two formulas will count the cell if "Bob" (or the contents of the addressed cell) are found anywhere in the string contained in that cell.

Regards,

Barry

PS: "=*Bob*" would count a cell containing "Jim-Bob", "four-man Bobsled", "Bobby" or "the Bob-tail nag."

#### More Like This

• Retrieving data ...

#### Legend

• This solved my question - 10 points
• This helped me - 5 points
This site contains user submitted content, comments and opinions and is for informational purposes only. Apple disclaims any and all liability for the acts, omissions and conduct of any third parties in connection with or related to your use of the site. All postings and use of the content on this site are subject to the Apple Support Communities Terms of Use.