
coding help

One of the tutorials asks me to compare two files, words and propernames, and list the words that appear in both.


#import <Foundation/Foundation.h>

int main(int argc, const char * argv[])
{
    @autoreleasepool {
        // Read each file in as one huge string (ignoring the possibility of an error)
        NSString *nameString = [NSString stringWithContentsOfFile:@"/usr/share/dict/propernames"
                                                         encoding:NSUTF8StringEncoding
                                                            error:NULL];
        NSString *wordString = [NSString stringWithContentsOfFile:@"/usr/share/dict/words"
                                                         encoding:NSUTF8StringEncoding
                                                            error:NULL];

        // Break each string into an array of lines
        NSArray *names = [nameString componentsSeparatedByString:@"\n"];
        NSArray *words = [wordString componentsSeparatedByString:@"\n"];

        // Go through the names one at a time and compare each against every word
        for (NSString *n in names) {
            for (NSString *w in words) {
                if ([n caseInsensitiveCompare:w] == NSOrderedSame) {
                    NSLog(@"The words %@ and %@ matches", n, w);
                }
            }
        }
    }
    return 0;
}


This is how I approached it, but it didn't work as I planned. Can anyone help with this?

MacBook Air, Mid 2012

Posted on Jun 15, 2013 7:59 PM

2 replies

Jun 16, 2013 5:50 AM in response to ImBoss

Purely by looking at your code, I don't see a *logical* error. So .. how exactly "didn't it work"?


In general, saying "it does not work" is an empty statement. It would not make much sense to post a long piece of code and then say "it works, what do I need to do?"


Do you get a compiler error? Warnings? No output? Other output than you expected?


By the way, as soon as you *do* find a match inside the inner loop, you can stop comparing this 'name' against the rest of the 'words'. It's safe to insert a "break;" statement right after the NSLog line here. You will also find your program runs about twice as fast.(*)


(*) I'm guessing. It actually depends on how many of the names appear in the 'words' list -- if most of them do not, you will not see a difference. If, however, there is a good chance that a name is in the list, its match will on average sit in the middle of the 'words' list (! -- But can you see why it's fair to assume that?). So stopping the comparison loop at that point will make your program faster by half.(**)
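
The "break;" suggestion above could look roughly like this. It is only a sketch: the helper name MatchingNames is made up for illustration, and it collects the matches in an array instead of logging them so the result is easy to inspect.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the original post): returns every name that
// has at least one case-insensitive match in words.
static NSArray *MatchingNames(NSArray *names, NSArray *words)
{
    NSMutableArray *matches = [NSMutableArray array];
    for (NSString *n in names) {
        for (NSString *w in words) {
            if ([n caseInsensitiveCompare:w] == NSOrderedSame) {
                [matches addObject:n];
                break; // first match found: skip the remaining words for this name
            }
        }
    }
    return matches;
}
```

Because the inner loop stops at the first hit, each name is reported at most once, even when the word list holds both a capitalized and a lowercase spelling.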


(**) Just for laughs: if your 'words' list is longer than, say, a few hundred entries, try this. Sort the 'words' list and look into binary searching. You will not be surprised by the speed increase, you will be *astonished*.

Only try if you get this version working "as you planned", though.


Message was edited by: Jongware

Jun 16, 2013 6:58 AM in response to Jongware

(After trying your code)


Okay, I see 2 possible reasons for you to say it does not work as you "planned".


First, since you compare case insensitive, most of the names are shown twice -- once with initial uppercase, once all lowercase.

Second, you always get a final line "The words and matches", with empty text in both places.


1. It's up to you to treat this as an error or "by design". Since you *are* comparing case insensitive, the exact same name *may* occur more than once in the word list.


If you only want to find exact matches, compare case-sensitively instead.
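
To see the difference between the two kinds of comparison, here is a small sketch. CountMatches is a hypothetical helper, and using isEqualToString: for the case-sensitive branch is my choice, not something taken from the original code.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: counts how many entries of words equal name,
// either exactly (caseSensitive == YES) or ignoring case (caseSensitive == NO).
static NSUInteger CountMatches(NSString *name, NSArray *words, BOOL caseSensitive)
{
    NSUInteger count = 0;
    for (NSString *w in words) {
        BOOL same = caseSensitive
            ? [name isEqualToString:w]
            : ([name caseInsensitiveCompare:w] == NSOrderedSame);
        if (same) {
            count++;
        }
    }
    return count;
}
```

With a word list containing both "bob" and "Bob", the name "Bob" matches twice when case is ignored and once when it is not, which is the doubled output described above.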


You should also consider the case where the word list does contain duplicates! The supplied word list does not (at least, I don't expect it to), but then again you might want to re-use this code in a real world situation where it does. Again, it's up to you to decide whether the correct answer is


TRUE, this name occurs at least once in the word list

FALSE, this name does not appear in the word list at all


(and it depends on your application which one you should choose).


2. This is by design and as expected. The command 'componentsSeparatedByString' splits your input on returns regardless of whether there is something 'left' and 'right' of it. A blank line inside the word list, for example, would get treated like this:


"abc\n\ndef\ngh" -> ["abc", "", "def", "gh"]


Now where does this last entry come from? Both lists end with a return, so the last entries look like this:


"..\nZyzomys\nZyzzogeton\n" (END)


and after splitting, this becomes "..", "Zyzomys", "Zyzzogeton", "" -- note the last entry "after" the final hard return.
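
A tiny sketch to check this for yourself. RawLines is a hypothetical wrapper around the same componentsSeparatedByString: call used in the original program.

```objc
#import <Foundation/Foundation.h>

// Hypothetical wrapper: splits text on "\n" exactly as the original program does,
// keeping any empty entries.
static NSArray *RawLines(NSString *text)
{
    return [text componentsSeparatedByString:@"\n"];
}
```

Calling RawLines(@"Zyzomys\nZyzzogeton\n") gives three entries, the last of which is the empty string.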


You can solve this in two ways: cleaning up the lists right after reading (e.g., removing blank lines, including the last one), or simply ignoring names and words with a length less than 1, using this line before comparing a name n against your word list:


if ([n length] < 1) continue;
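
The other option, cleaning up the lists right after reading, might be sketched like this. NonEmptyLines is a made-up helper name, and a plain loop is just one way to do the filtering.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: splits text on "\n" and drops every empty entry,
// including the one produced by a trailing newline.
static NSArray *NonEmptyLines(NSString *text)
{
    NSMutableArray *lines = [NSMutableArray array];
    for (NSString *line in [text componentsSeparatedByString:@"\n"]) {
        if ([line length] > 0) {
            [lines addObject:line];
        }
    }
    return lines;
}
```

Used in place of the two componentsSeparatedByString: calls, this would make the length check inside the loops unnecessary.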


"Sort the words list and look into binary searching. You will not be surprised by the speed increase, you will be *astonished*."


I tried this as well, and yes: the speed increase is marvellous. If you are interested, look into "indexOfObject". (As a side result, this also solves the Duplicate Words problem because it will always answer the question "is this name at least once in the words list".)
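
For anyone curious, a minimal sketch of the sorted-list idea follows. I am assuming the NSArray method indexOfObject:inSortedRange:options:usingComparator: is the "indexOfObject" variant meant here; the helper name MatchingNamesSorted is hypothetical. The list is re-sorted with caseInsensitiveCompare: so that its order agrees with the comparator used for searching.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns every non-empty name that occurs at least once
// in words, ignoring case, using a binary search over a sorted copy of words.
static NSArray *MatchingNamesSorted(NSArray *names, NSArray *words)
{
    // Sort once with the same comparison the search will use.
    NSArray *sorted = [words sortedArrayUsingSelector:@selector(caseInsensitiveCompare:)];
    NSRange whole = NSMakeRange(0, [sorted count]);
    NSMutableArray *matches = [NSMutableArray array];

    for (NSString *n in names) {
        if ([n length] < 1) continue; // skip blank entries from the split

        NSUInteger idx = [sorted indexOfObject:n
                                 inSortedRange:whole
                                       options:NSBinarySearchingFirstEqual
                               usingComparator:^NSComparisonResult(id a, id b) {
            return [a caseInsensitiveCompare:b];
        }];
        if (idx != NSNotFound) {
            [matches addObject:n]; // reported once, however many spellings exist
        }
    }
    return matches;
}
```

Each lookup now takes a number of comparisons proportional to the logarithm of the list length rather than to the length itself, which is where the speed increase comes from.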
