using scanf() to input strings with spaces

Hello again. I've noticed in my little practice program that scanf() will only accept the first word entered by the user, and anything after that is ignored. I know that this is because scanf() interprets a space as the end of an entry, but I would like to know: is there a way that I can get scanf() to accept multi-word entries, or do I need to use another function? If so, can you help me figure out which one and how I would use it? Thank you!

MacBook, Mac OS X (10.5.8)

Posted on Oct 15, 2009 2:53 AM


Oct 15, 2009 2:58 AM in response to Tron55555

I figured out that the gets() function will achieve the above functionality, but I would still like to know out of curiosity if there is a way to get scanf() to do this. Also, when I use gets(), I get a message (on the command-line) right before the program gets to the gets() statement that says: "this program uses gets, which is unsafe" or something like that. Is there yet another function that I could use that will not produce this warning, or a way to get this warning not to appear? Thank you.

Oct 15, 2009 7:05 AM in response to Tron55555

Tron55555 wrote:
I figured out that the gets() function will achieve the above functionality, but I would still like to know out of curiosity...

Because you show the right attitude, I will answer in a little more detail than just saying "RTM" 😉

If you look at the man(1) page of scanf(3) you will find (in the part discussing the conversions):


s Matches a sequence of non-white-space characters; the next pointer
must be a pointer to char, and the array must be large enough to
accept all the sequence and the terminating NUL character. The input
string stops at white space or at the maximum field width, whichever
occurs first.
If an l qualifier is present, the next pointer must be a pointer to
wchar_t, into which the input will be placed after conversion by
mbrtowc(3).


if there is a way to get scanf() to do this.

Yes there is. Looking a bit further down in the scanf(3) manual page it says:


[ Matches a nonempty sequence of characters from the specified set of
accepted characters;


By making a small test program using vsscanf(3) (also in the scanf(3) manual page) you can avoid having to type test strings because vsscanf(3) uses a string passed to it for parsing.
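A minimal sketch of such a test program follows. It uses plain sscanf(3) rather than vsscanf(3) to keep things short, and the helper name read_phrase and the 32-byte buffer are assumptions made for illustration only. The scanset [^\n] accepts every character except newline, so spaces are kept:

```c
#include <stdio.h>

/* Hypothetical helper: copies everything up to the first newline in
   `input` into `out` (at most 31 characters plus the terminating NUL).
   Returns 1 if at least one character was stored, 0 otherwise. */
int read_phrase(const char *input, char out[32])
{
    /* %31[^\n] matches any run of characters that are not a newline,
       spaces included; the 31 caps the length so that a 32-byte
       buffer cannot overflow. */
    return sscanf(input, "%31[^\n]", out) == 1;
}
```

With live input the same conversion should work as scanf("%31[^\n]", name), though the newline itself is left unread in stdin and has to be consumed before the next read.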

Also, when I use gets(), I get a message (on the command-line) right before the program gets to the gets() statement that says: "this program uses gets, which is unsafe"

Yes, it is probably the most unsafe function in the entire C library (although unsafely creating temporary files and directories can also blow huge holes in a system). At the bottom of the gets(3) manual page it says:


SECURITY CONSIDERATIONS
The gets() function cannot be used securely. Because of its lack of bounds
checking, and the inability for the calling program to reliably determine
the length of the next incoming line, the use of this function enables mali-
cious users to arbitrarily change a running program's functionality through
a buffer overflow attack. It is strongly suggested that the fgets() func-
tion be used in all cases. (See the FSA.)


If you look at the OpenBSD (www.openbsd.org) manual page, the wording is even a bit stronger:


BUGS
Since it is usually impossible to ensure that the next input line is less
than some arbitrary length, and because overflowing the input buffer is
almost invariably a security violation, programs should NEVER use gets().
The gets() function exists purely to conform to ANSI X3.159-1989 (``ANSI
C'').


or something like that. Is there yet another function that I could use that will not produce this warning, or a way to get this warning not to appear?

Use either fgets(3) or fgetln(3).
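As a rough sketch of the fgets(3) route: unlike gets(), fgets() is told the buffer size, and it keeps the trailing newline, which usually has to be stripped. The helper name read_line is an assumption for illustration, not anything from the manual page:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line from `fp` into `buf`, which has
   room for `size` bytes, and strips the trailing newline if present.
   Returns 1 on success, 0 on end-of-file or read error. */
int read_line(FILE *fp, char *buf, size_t size)
{
    if (fgets(buf, (int)size, fp) == NULL)
        return 0;
    /* strcspn() returns the index of the first '\n', or the string
       length if there is none, so this is safe either way. */
    buf[strcspn(buf, "\n")] = '\0';
    return 1;
}
```

A call would look like read_line(stdin, userName, sizeof(userName)) with userName declared as a char array. Input longer than the buffer is not lost; the remainder simply stays in the stream for the next read.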

Thank you.

You're welcome.

P.S. For my use of gets(3) and the like see http://en.wikipedia.org/wiki/Man_page for an explanation.

Oct 15, 2009 7:09 AM in response to Tron55555


char str1[1024];
char str2[128];
scanf("%894s %127s", str1, str2); /* field widths keep both reads, and the strcat() calls below, inside the buffers */
strcat(str1, " ");
strcat(str1, str2);

You have read 2 words, then put them into one string.

If you want scanf() to do all that work, then you are really trying to use a hammer to put in a screw. Wrong tool for the job.

If you have more complex parsing rules, then you are going to have to roll your own, and maybe scanf() can be a component, or maybe it is the wrong tool for the job. That all depends on what you are trying to do.

A poor man's quoted-string parse via scanf() might look like

scanf("\"%[^\"]\"", str);

But that suffers from not allowing a " in the middle of the string and not supporting any escape sequence that would allow a " in the middle.

Other approaches are to read the file (fgets()), then use things like sscanf(), strtok(), strchr(), strcspn(), strstr(), strspn(), etc... as well as looping through the string one character at a time.
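A minimal sketch of the read-then-parse approach, using strtok() from the list above. It assumes the line has already been read (for example with fgets()); the helper name split_words is made up for illustration:

```c
#include <string.h>

/* Hypothetical helper: splits `line` in place on spaces, tabs and
   newlines. Stores up to `max` word pointers in `words` and returns
   how many were stored. The pointers point into `line` itself. */
size_t split_words(char *line, char *words[], size_t max)
{
    size_t n = 0;
    char *tok = strtok(line, " \t\n");

    while (tok != NULL && n < max) {
        words[n++] = tok;
        tok = strtok(NULL, " \t\n");   /* continue in the same string */
    }
    return n;
}
```

Note that strtok() overwrites the separators with NUL characters, so the original line is modified, and it keeps hidden state between calls, so it is not suitable for nested or threaded parsing.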

If you want really complex parsing, then that is a job for lex & yacc (the Open Source versions would be flex & bison)

Oct 15, 2009 7:27 AM in response to Tron55555

Text-based user interfaces are just hard to do - even more so in C. Any useable one I've ever written eventually gets rid of formatted input and just reads an entire line at a time and then parses it. I do most of these in Perl because I don't have to worry about the details of allocating and re-sizing buffers and I have good regex abilities to parse the resulting lines.

Both the C and C++ input methods of scanf and fgets are really too cumbersome to use in a real system - yet another argument for using command-line arguments to drive input.

However, printf in C (and Perl) is quite usable. The C++ equivalents are a joke.

Oct 16, 2009 12:31 AM in response to BobHarris

Thanks, guys. I ended up taking your advice, Bob, and using the fgets() function, like so:


fgets(userName, 32, stdin);


I still have a few questions though:

1.) What would be the difference between this and using fgetln() instead? Is it simply that fgetln() reads until a newline character whereas fgets() reads until the given amount of characters or a newline character, whichever comes first? Is there any other difference worth noting here?

2.) Should I use fgets() for all my input from the command-line, or is it better to use scanf() when I know I'm only going to need to input one word?

etresoft wrote:
The C++ equivalents are a joke


By this I assume you are referring to cin and cout. I was interested to hear that. I would assume someone along the line thought they were superior otherwise they could have just stuck to the C I/O functions, right?

hansz -- thanks for your post. I still am trying to get in the habit of checking the man pages when I have a question. So thank you for reminding me to do so, and for the rest of your post.

Oct 16, 2009 6:46 AM in response to Tron55555

I ended up ... using the fgets() function, like so:


fgets(userName, 32, stdin);


Actually I'm a huge fan of sizeof(). So I would have written the code as:

fgets(userName, sizeof(userName), stdin);

Of course this only works if userName is a char array. It does NOT work if userName is a pointer, because sizeof() then gives the size of the pointer rather than the size of the buffer. But wherever I can use sizeof() instead of hard coding a constant, I use it.
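A small sketch of the array-versus-pointer difference. The function names are made up for illustration, and the 32-byte size is borrowed from the fgets() call quoted above:

```c
#include <stdio.h>

/* sizeof applied to the array itself gives its full capacity. */
size_t array_capacity(void)
{
    char userName[32];
    return sizeof(userName);      /* 32 bytes */
}

/* sizeof applied to a pointer gives the size of the pointer, no
   matter how large the buffer it points at happens to be. */
size_t pointer_size(void)
{
    char userName[32];
    char *p = userName;
    return sizeof(p);             /* typically 4 or 8, not 32 */
}
```

The same thing happens when an array is passed to a function: the parameter is a pointer, so the size has to be passed along as a separate argument.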
I still have a few questions though:

1.) What would be the difference between this and using fgetln() instead? Is it simply that fgetln() reads until a newline character whereas fgets() reads until the given amount of characters or a newline character, whichever comes first? Is there any other difference worth noting here?

Never used fgetln() (didn't even know it existed 🙂 ).

Reading "man fgetln()", you do not get a nul character terminating the string. Many C string functions depend on that trailing nul so if you are going to use any of those, you will have to add it yourself.

You must use everything in the returned buffer before calling fgetln() a 2nd time, as it reuses its old buffer. I'm guessing here, but it might point into the Stream buffer as long as the entire line is in the buffer. If the line crosses between one buffer and the next, it most likely malloc()s memory and copies the line from the Stream buffer into the malloc()ed memory. This would explain why there is no terminating nul, as adding a nul would trash the first character of the next line in the Stream buffer (Stream buffer explained at the end of this post).

On the plus side, you never have to worry about buffer overflow. That alone can be a huge win if you are writing commercial software.
2.) Should I use fgets() for all my input from the command-line, or is it better to use scanf() when I know I'm only going to need to input one word?

I rarely use scanf(), but that is just me. I mostly forget it exists.

Using scanf() means it will read stdin until it fills in all of its arguments. If that means consuming multiple input lines, it will do that. This is not always desirable, unless you are doing a simple "get the next blank-separated token" (scanf("%s",token_string);), and the rest of the code will worry about the structure of the data.
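A small sketch of that behavior, using sscanf() on a fixed string so no typing is needed. The helper name and the 16-byte buffers are assumptions for illustration. A newline counts as white space just like a blank, so the second %s is filled from the next line:

```c
#include <stdio.h>

/* Hypothetical helper: returns how many of the two %s conversions
   were filled from `input`. */
int count_tokens_across_lines(const char *input)
{
    char first[16], second[16];

    /* The blank in the format skips any white space, newlines
       included, so the two tokens may sit on different lines. */
    return sscanf(input, "%15s %15s", first, second);
}
```

Given "hit\nme\n" both conversions succeed even though the words are on separate lines. With scanf() on a terminal, the equivalent call would keep waiting for a second word after the user presses Return.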

If you do care about one line vs the next, but want the benefits of scanf(), then you read the line, and use sscanf(). How you want to read the line is up to you, except if you are going to use fgetln(), then you need to add that all important nul terminator.

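size_t len;
char *input_line;
char userName[32];
input_line = fgetln(stdin, &len);
/* fgetln() returns NULL at end-of-file; the last line may lack a newline */
if (input_line != NULL && len > 0 && input_line[len-1] == '\n') {
    input_line[len-1] = '\0'; /* replaces newline with nul */
    sscanf(input_line, "%31s", userName);
}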

Stream Buffer. The Stream I/O calls really read into a Stream buffer which is typically 4096 bytes (sometimes 8192, or a larger power of 2). This allows multiple calls to fread(), getc(), fgets(), scanf(), etc... to avoid making a more expensive system call and, worse, performing a very expensive disk I/O. Of course this only applies to reading files, not reading from the Terminal where the source of input is a user at the other end of a keyboard.

For output, the Stream buffer accumulates up to 4096 bytes (or again maybe 8192 etc...) of output before writing it to disk. Again this is a performance feature, as writing to disk each time someone does a putc() or fputc() gets very expensive very fast. Again if the output is a Terminal screen, the buffering rules are different.

Oct 16, 2009 7:53 AM in response to Tron55555

fgetln() is not a part of C89. I don't know if it is C99, but it might not be included in all implementations.

"By this I assume you are referring to cin and cout. I was interested to hear that. I would assume someone along the line thought they were superior otherwise they could have just stuck to the C I/O functions, right?"

I think they were included to be a demonstration example of a stupid use of operator overloading.

Oct 16, 2009 9:21 AM in response to Tron55555

Tron55555 wrote:
1.) What would be the difference between this and using fgetln() instead?


I had never heard of fgetln either until now. Interesting function.

2.) Should I use fgets() for all my input from the command-line, or is it better to use scanf() when I know I'm only going to need to input one word?


Personally, I just use <> in Perl 🙂

etresoft wrote:
The C++ equivalents are a joke


By this I assume you are referring to cin and cout. I was interested to hear that. I would assume someone along the line thought they were superior otherwise they could have just stuck to the C I/O functions, right?


cin and cout are instances of C++ streams. It is the convoluted and difficult process of formatting streams in C++ that I am talking about. Whereas in C one can do something like:

printf("The value of PI is %10.4f", M_PI);

the C++ equivalent is:

std::cout << "The value of PI is";
std::cout << std::setw(10) << std::setprecision(5) << M_PI;
std::cout << std::endl;

So with C++, supposedly a more "advanced" language, I have to do more work to specify the details of how I want something done.

Oct 17, 2009 3:03 AM in response to etresoft

BobHarris wrote:
Actually I'm a huge fan of sizeof().


Thanks for that tip, Bob -- works great. The rest of the post was very helpful as well.

Keith Barkley wrote:
fgetln() is not a part of C89. I don't know if it is C99, but it might not be included in all implementations.


Good point. For anyone interested, I tried using fgetln() in Xcode with a Standard Tool command-line utility project set to the C99 standard, and it worked fine, so I would assume that means that it is part of C99. Thanks for the post, Keith.

etresoft wrote:
cin and cout are instances of C++ streams. It is the convoluted and difficult process of formatting streams in C++ that I am talking about. Whereas in C one can do something like:


I always liked the logical progression of cout statements, like:


cout << "There are " << numTreeApples << " apples in the tree and " << numGroundApples << " on the ground";


Granted, however, I never dealt with precisions and what not very much like in your sample cout statement, and that does look much more tedious than C's equivalent.

So, there's just one thing left I'd like to confirm on this issue. Do I have the right idea when I say that using fgets() versus scanf() in a situation where I only need to input one word is pretty much up to me? Are there any particular benefits of using one over the other in that basic situation? Bob mentioned some technical details in his post that were helpful, but I just want to confirm, in general, that there's nothing unsafe about either one or anything like that (like with the gets() function) that I should know about. Is it pretty much just whichever one I prefer? Thanks again.

Tron out.

Oct 17, 2009 3:58 AM in response to Tron55555

New problem here. I have the following code:


double chipsPurchased;
printf(" Enter how many dollars in chips you would like to purchase: ");
fgets(chipsPurchased, sizeof(chipsPurchased), stdin);


I get an error that says "incompatible type for argument 1 of 'fgets'". If I change the data type to int then it works fine. Does fgets() not work with floating-point values? Is this a situation where I need to use scanf()? Bob, you mentioned that you rarely use scanf(), so what would you use in a situation like this? Thanks.

Oct 17, 2009 5:53 AM in response to Tron55555

Double is a floating point variable. fgets() wants a char array address. Doubles get passed by value. You gave fgets() an uninitialized random value as an address to store entered characters.

I also rarely use floating point 🙂 It doesn't show up much in kernel code.

As I don't use floating point much, I might read the line, and after some sanity checking, call sscanf() - notice the ss
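A minimal sketch of that read-the-line-then-sscanf() idea, applied to a double. The helper name read_double and the 64-byte line buffer are assumptions for illustration:

```c
#include <stdio.h>

/* Hypothetical helper: reads one line from `fp` and tries to parse a
   double from the start of it. Returns 1 and fills *value on success,
   0 on end-of-file or if the line does not begin with a number. */
int read_double(FILE *fp, double *value)
{
    char line[64];

    if (fgets(line, sizeof(line), fp) == NULL)
        return 0;
    /* %lf is the conversion for double; sscanf() returns the number
       of conversions that succeeded. */
    return sscanf(line, "%lf", value) == 1;
}
```

For the chips example the call would be something like read_double(stdin, &chipsPurchased). Because the whole line is consumed by fgets(), a bad entry such as "abc" does not stay stuck in the input the way it can with a bare scanf("%lf", ...).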

I rarely use scanf(), because my code either needs more control or it gets all its input from command line arguments or I'm in kernel code where scanf() is not available.

For your blackjack program, scanf() may be the right use of that function.
