I don't know... I tested the code without putting in the null character, and apparently the compiler knows where the string ends without any problems at all.
Your test was weak. It essentially worked because of the Magic of Uninitialized Variables(tm). It's black magic that you should avoid at all costs. Here, try this shell script as a test. It will show you the danger of not putting the NUL character (all bits off) at the end of your string.
Code:
#!/bin/bash
cat > 1.c <<'EOD'
#include <stdio.h>
#include <string.h>
void put_new_25(char *arg_string)
{
arg_string[ 0]='a';
arg_string[ 1]='b';
arg_string[ 2]='c';
arg_string[ 3]='d';
arg_string[ 4]='e';
arg_string[ 5]='f';
arg_string[ 6]='g';
arg_string[ 7]='h';
arg_string[ 8]='i';
arg_string[ 9]='j';
arg_string[10]='k';
arg_string[11]='l';
arg_string[12]='m';
arg_string[13]='n';
arg_string[14]='o';
arg_string[15]='p';
arg_string[16]='q';
arg_string[17]='r';
arg_string[18]='s';
arg_string[19]='t';
arg_string[20]='u';
arg_string[21]='v';
arg_string[22]='w';
arg_string[23]='x';
arg_string[24]='y';
/* We won't put a NUL character at the end, although we should! */
} /* put_new_25() */
int main(void)
{
char the_string[100];
strcpy(the_string,"ABCDEFGHIJKLMNOPQRST");
printf("length is %zu, data is %s\n",strlen(the_string),the_string);
put_new_25(the_string);
printf("length is %zu, data is %s\n",strlen(the_string),the_string);
strcpy(the_string,"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx");
printf("length is %zu, data is %s\n",strlen(the_string),the_string);
put_new_25(the_string);
printf("length is %zu, data is %s\n",strlen(the_string),the_string);
return 0;
}
EOD
gcc -Wall 1.c -o 1
./1
The trick is to see how well function put_new_25() does its job, since it doesn't place a NUL character at the end of the string. It puts the first 25 letters of the alphabet into that array. When you see them, you should hope that that's all that is in the string.
The first time, it works OK, probably because the rest of the character array happened to start out with all bits off (and you should never rely on that).
The second time, not so much. Here is my output, and yours would probably be the same:
Code:
length is 20, data is ABCDEFGHIJKLMNOPQRST
length is 25, data is abcdefghijklmnopqrstuvwxy
length is 50, data is xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
length is 50, data is abcdefghijklmnopqrstuvwxyxxxxxxxxxxxxxxxxxxxxxxxxx
Quote:
I don't know... I tested the code without putting in the null character, and apparently the compiler knows where the string ends without any problems at all.
When something breaks, it's pretty certain it was wrong.
When something doesn't break, that doesn't mean it wasn't wrong.
When you run a simple program, there are lots of places in your data that will happen to be filled with zeroes.
Depending on how the space was allocated, that zero fill might be a reliable feature of the language or it might be an accident. Even if you allocated so the zero fill is a reliable feature of the language, that just covers the first time you use the space within your program.
I got it. So when the string was first initialized, I guess all the remaining characters were set to NUL and not just the last character, as for instance in
Code:
char mystring[MAX_LEN]="whatever";
My program only requires the string to get bigger; it'll never shrink. I'm guessing that's the reason it works: all of the characters after the 'r' have been set to NUL? But yes, I see the potential problem. It will never happen with what I'm doing now, but it's important to keep it in mind for sure.
From a storage standpoint, if you're going to allocate a specific number of bytes, as in either of these:
Code:
char mystring[]="abcde";
char mystring[80];
you must never write beyond the end of the allocated number of characters. In the case of the first example, the number of characters is six, because there's a NUL byte after the 'e'.
If the number of bytes you need in the string is variable, use malloc()/calloc()/realloc().
From a data representation standpoint, if you're going to work with strings, and you're storing a new string into an array, always store that NUL byte at the end. It's a good habit, and it keeps your program from breaking if you modify it later.
Quote:
it will never happen with what I'm doing now, but it's important to keep it in mind for sure.
Sorry, I probably didn't make myself clear. I meant that even though my program was not going to fail because of that, I actually put a null character at the end of the string, because it is good programming practice. I was just wondering if the compiler was doing it for me. But thanks for your response.
Quote:
I was just wondering if the compiler was doing it for me.
For much of this thread the method of declaring the char[] wasn't clear. Whether the language standard specifies the char[] be zero filled depends on how/where you declare the char[].
Code:
int main(void)
{
char the_string[100];
In that declaration, the language standard absolutely does not specify that the array be zero filled.
If you try it in a simple enough program, I think you would find that the array is zero filled (just because RAM pages allocated from the OS are normally zero filled). But significant stack use before the entry to main() is possible, so even with a specific OS and compiler where you already found that char[] to be zero filled, it isn't a safe bet that it would be zero filled in your next program.
Also, the phrase "compiler was doing it for me" was never clear in this thread. At times it seemed to mean that the compiler was generating code to re-terminate the string after it is modified in the course of the code. That is certainly not true.
If you define a global or static char[] and don't initialize it, or give it an incomplete initialization, the uninitialized bytes will be pre-filled with zeroes. But if you declare a local char[] without an initializer, as in the code above, it may not be pre-filled (it will tend to be pre-filled if it represents the deepest stack use up to that point in the program, but even that is unreliable).