[SOLVED] fgets() and buffer overflow

atlantis43 · 06-25-2013, 02:09 PM

Wondering if someone can help solve my confusion in the following:

Code:

#include <stdio.h>

int main(void)
{
char sentence[100];

while(fgets(sentence, sizeof(sentence), stdin)!= NULL)
    {
    printf("%s\n", sentence); 
    }

return 0;
}

fgets(), as I understand it, prevents buffer overflow because the buffer size is one of its arguments. However, if I enter a string of more than 99 chars in the above program, the first 99 chars are displayed as one string, and then the remaining chars I entered come back in a following string. If the buffer I declared only holds 100 chars, where are the other chars that I entered being stored? Is a new buffer being created after I exceed 99 chars, or what is happening to store these additional chars?
Thanks for any attempts at explanation.

tronayne · 06-25-2013, 03:19 PM

fgets() will read at most one less than sizeof (buf) characters, and it always terminates the string with a null character ('\0'). It reads up to and including a NL character, or until it has stored one less than the size of the buffer, whichever comes first; if a NL is read, it is included in the buffer ahead of the terminating null.

Your loop reads 99 characters, prints them, then reads whatever is left in the system buffer (not your buffer, the I/O buffer maintained by the system) and prints those.
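
To watch the leftover characters being picked up by later calls, here is a minimal sketch. The helper name fgets_calls_for_line and the 8-byte buffer are made up for illustration; it simply counts how many fgets() calls it takes to consume one line.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, for illustration only: consume one line from `in`
 * through a deliberately tiny buffer and return the number of fgets()
 * calls that were needed. */
int fgets_calls_for_line(FILE *in)
{
    char buf[8];                /* room for 7 characters plus the '\0' */
    int  calls = 0;

    while (fgets(buf, sizeof buf, in) != NULL) {
        calls++;
        printf("call %d stored %zu characters\n", calls, strlen(buf));
        if (strchr(buf, '\n') != NULL)
            break;              /* the newline arrived, so the line is done */
    }
    return calls;
}
```

Calling fgets_calls_for_line(stdin) from main() and typing a 20-character line should take three calls: two full chunks of 7 characters and a final chunk that ends with the newline. The characters not yet read just wait in the stream; buf itself never holds more than 8 bytes.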

A slightly more "standard" way might be something like this (which is reading from a file rather than the keyboard):

Code:

#include <stdio.h>
#include <stdlib.h>

int     main    (void)
{
        char    buf [BUFSIZ];
        FILE    *helpfile;

        if ((helpfile = fopen ("instructions", "r")) == (FILE *) NULL) {
                (void) fprintf (stderr, "can't open instructions\n");
                exit (EXIT_FAILURE);
        }
        while (fgets (buf, BUFSIZ, helpfile) != (char *) NULL)
                (void) fputs (buf, stdout);
        (void) fclose (helpfile);
        exit (EXIT_SUCCESS);
}

The token, BUFSIZ, is a numeric value defined in <stdio.h> (8192 with glibc; the C standard only guarantees that it is at least 256) -- it doesn't hurt anything (and can be beneficial) to use BUFSIZ for I/O buffering as above; 8K isn't a heck of a lot of memory to allocate for this purpose (and you can use the same buffer space again and again throughout a program).

Also, read the manual page for fgets for more information.

Hope this helps some.

[EDIT]
Duh! I typed NULL instead of NL (fixed above), fumble-fingers!

fgets() will read to a NL, EOF or to one less than size.

NL is the ASCII abbreviation for new line. EOF is not really a character: typing Ctrl-D (ASCII EOT) at a Unix/Linux terminal is the usual way to signal end of file.

Code:

	Dec	Hex	Octal	Binary		ASCII
	4	0x04	004	00000100	EOT	(Ctrl-D)
	10	0x0A	012	00001010	NL	(Ctrl-J)

Thanks to http://www.linuxquestions.org/questi...2/#post4978966 for pointing that out.
[/EDIT]
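
Since end of file shows up as fgets() returning NULL rather than as a byte in the buffer, a loop can check afterwards why it stopped. A minimal sketch, assuming a made-up helper called copy_lines (not from any library):

```c
#include <stdio.h>

/* Hypothetical helper: copy `in` to `out` line by line, then report why
 * fgets() stopped. Returns 0 on a clean end of file, -1 on a read error. */
int copy_lines(FILE *in, FILE *out)
{
    char buf[BUFSIZ];

    while (fgets(buf, sizeof buf, in) != NULL)
        fputs(buf, out);

    /* fgets() returned NULL: no EOF byte was ever stored in buf. The
     * stream itself remembers whether it hit end of file or an error. */
    if (ferror(in))
        return -1;
    return feof(in) ? 0 : -1;
}
```

With copy_lines(stdin, stdout), pressing Ctrl-D on an empty line should end the loop and give a return value of 0, because the terminal reports end of input instead of handing the program an EOT character.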

atlantis43 · 06-25-2013, 04:24 PM

Helps a lot! Now I have a better idea of the system buffer vs. my buffer.

mina86 · 06-25-2013, 04:26 PM

There may be a few buffers between your keyboard and your code. In particular, if you are just typing at your terminal, it is very likely set to line-buffering mode, which means that it won't send any data until you press Return.

Then, the TTY device in the kernel will read the data into its own buffer and offer it on the standard input of your program.

If you've typed more than the TTY driver is willing to handle in one go, then it may read only part of the data and then block, in which case the data that has not yet been consumed will still reside in the terminal's buffer.

Of course things get much more complicated if you are connected via SSH.

It is also worth noting that, as tronayne has said, 8K isn't a heck of a lot of memory, and if you are reading data from a file, then the kernel will most likely read in chunks that are multiples of 4K anyway. So even a simple fgetc(), which returns a single byte, may cause the operating system to read a whole 4K page.

NevemTeve · 06-26-2013, 03:58 AM

Or you could use getline(3), which started out as a GNU extension and has been part of POSIX since POSIX.1-2008.
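
A minimal sketch of how that looks, assuming a POSIX.1-2008 system; the helper name longest_line is made up for illustration. getline() allocates and grows the buffer itself, so a long line arrives in one piece instead of in chunks:

```c
#define _POSIX_C_SOURCE 200809L   /* ask for the POSIX.1-2008 getline() */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>            /* ssize_t */

/* Hypothetical helper: return the length of the longest line in `in`
 * (newline included), or 0 if nothing could be read. */
size_t longest_line(FILE *in)
{
    char   *line = NULL;   /* getline() allocates and grows this for us */
    size_t  cap  = 0;
    size_t  best = 0;
    ssize_t len;

    while ((len = getline(&line, &cap, in)) != -1) {
        if ((size_t) len > best)
            best = (size_t) len;
    }
    free(line);            /* the caller owns the buffer, even on failure */
    return best;
}
```

The trade-off is that the memory is heap-allocated and has to be freed, and a hostile input with one enormous line makes the buffer grow accordingly.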

tronayne · 06-26-2013, 07:16 AM

If you're serious about C programming, let me recommend an excellent book you might want to pick up: Stephen G. Kochan, Patrick H. Wood, Topics in C Programming (revised ed.). The ISBN is 0-471-53404-8 and it is available from Amazon and other providers (there may be a newer edition).

It's written to teach C programmers how to program and, in my opinion, is the best single-source guide available detailing advanced C programming for a Unix/Linux environment. There are hundreds of working examples (yeah, really, working usable examples). Kochan and Wood come from Bell Labs and write clearly and concisely -- it's an easy read and well worth your time.

I would urge that you write programs to ANSI/POSIX standards -- if you do, they're going to work on pretty much any platform you may need to support. Avoid "handy" extensions -- they're expedient when programming but will come back and haunt you if you need to port from, say, Linux to Solaris (Solaris' C compiler is not GNU). Stick with the standards and you won't be reinventing the wheel (and getting telephone calls at three in the morning, either). And, if you need to port to Microsoft, heaven help you if you don't stick to standards.

Hope this helps some.

linosaurusroot · 06-26-2013, 07:55 AM

Quote:

Originally Posted by tronayne

it will terminate the string with a null); it will read to a NULL character or to one less than the size of the buffer
Also, read the manual page for fgets for more information.

http://linux.die.net/man/3/fgets

Reading stops after an EOF or a newline. If a newline is read, it is stored into the buffer.
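
Because the newline is stored, printing the buffer with "%s\n" produces a blank line after every complete line. A minimal sketch of stripping it, which also tells you whether the line was cut short; the helper name read_line and its return codes are made up for illustration:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line into buf and strip the trailing
 * newline, if any. Returns 1 when a complete line was read, 0 when the
 * line was cut short (buffer full, or EOF without a newline), and -1
 * when nothing could be read. */
int read_line(char *buf, int size, FILE *in)
{
    size_t len;

    if (fgets(buf, size, in) == NULL)
        return -1;

    len = strcspn(buf, "\n");   /* index of the newline, or of the '\0' */
    if (buf[len] == '\n') {
        buf[len] = '\0';        /* drop the newline fgets() stored */
        return 1;
    }
    return 0;                   /* no newline: more of the line may follow */
}
```

A return value of 0 is the case from the original question: the buffer filled up before the newline, and the rest of the line is still waiting in the stream for the next call.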