What is so special about 34 or more spaces when reading text files with C code?
So basically I've written a little program that uses fgets() in a while loop to read a text file line by line. Nested inside the while loop is a for loop that scans through the array holding each string read by fgets(), looking for the hash symbol, which indicates a comment. When it finds one, the for loop sets my bool flag to "false", because the printf() statement in my while loop only executes if that flag is "true" (which is what I intended).
I will admit that I got probably every error under the sun trying to get this program working, even just to print whatever doesn't have the hash symbol in front of it - I even got a "Bus Error" (don't know how I managed that, but that's a new one on me). Anyhow, I got it working and fixed the problems I was having after hours of sitting there trying to figure out how to get the bloody thing working.
But there's one small problem, as usual... Even if I have the hash symbol in front of something that should be ignored as a comment, if there are more than 33 spaces (not tabs, spaces) between the start of the line and the actual string, the string doesn't get ignored and still gets displayed. As long as there aren't any more than 33 spaces between the start of the line and the string, it's fine, and the string gets ignored (as intended).
I've tried searching for an answer, but either this is a very unusual problem or I don't understand the solution; either way, I have no idea what the problem might even be. Here's my code (I've got the commented-out for loop there because I was checking whether it was actually reading the spaces and the actual string, and on both counts, it was): Code:
// a program to skip to the next line in file if the comment char (#) is encountered Code:
line 1 Code:
james@jamespc: practice> ./skip_line_if_comment Code:
james@jamespc: practice> ./skip_line_if_comment James |
This does not look like it will do anything useful...
Code:
for ( i = 0; i < content[i]; i++ ) {
This also causes it to fail at 32 spaces - do you see why? It is an important clue! I would also recommend using a #define or variable to set the length of content, then use that same value to test the length of the buffer, like: Code:
#define CONTENT_LEN 373 |
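As a rough sketch of that suggestion (one constant used for both the array size and the loop bound), something like the following could work. The helper name first_hash_index is hypothetical and not from the original program; the loop also stops at the terminating '\0' so it never scans stale bytes left in the buffer.

```c
#include <assert.h>
#include <stddef.h>

#define CONTENT_LEN 373   /* one constant for both the array size and the loop bound */

/* Hypothetical helper: returns the index of the first '#' in content,
   or -1 if the string ends (or the buffer runs out) before one is found. */
int first_hash_index(const char content[CONTENT_LEN])
{
    assert(content != NULL);
    for (int i = 0; i < CONTENT_LEN && content[i] != '\0'; i++) {
        if (content[i] == '#')
            return i;   /* found the comment character */
    }
    return -1;          /* hit the terminator or the end of the buffer */
}
```

Note that the loop bound compares i against the buffer length, not against the character stored at content[i].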
Quote:
I'll add a define in there for the array size like you said. Thanks for your help astro! |
Quote:
Once you use a #define to set the buffer length, you will also need to modify your comparison test to check for the octothorpe or the terminating null character ('\0') that fgets() adds. Think carefully about what must happen in each case - they are not the same. |
I think I sorta know what you mean about what happens when the loop gets to 32 based on this. But I'm honestly just not sure what you mean about the second thing you said about modifying the comparison test (I assume you mean the if statement you quoted before?), or why what must happen would be different depending on whether it was the hash sign or the null. The only thing I can think of is that if it encounters the hash, then it's got to go to the next line in the file and scan it. And the same for the null byte, since fgets() puts it on the end of the line, so once again I just don't know.
|
I do not understand why you're putting the string through another loop to search it for the char.
strchr searches the string for the wanted character and returns NULL if not found. Code:
// a program to skip to the next line in file if the comment char (#) is encountered |
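For readers who have not used strchr() before, here is a minimal sketch of the idea. It is not the code from the post above; has_hash is a hypothetical helper name.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: nonzero if line contains a '#' anywhere in it. */
int has_hash(const char *line)
{
    assert(line != NULL);
    /* strchr() returns a pointer to the first '#', or NULL if there is none. */
    return strchr(line, '#') != NULL;
}
```

Be aware that this treats any line containing a '#' as a comment, including one where the '#' comes after the text, such as "a line#".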
Quote:
Similarly, if I were writing some C to process TeX/LaTeX (or PostScript) source files I'd have to deal with percent signs in a similar manner. Or exclamation points in X11 resource files. If you want to allow comments in your data files, you'll need to write code to recognize and deal with them. HTH... |
Quote:
To make the mental connection clear I would suggest drawing it out with little squares, each containing the ASCII value of the byte that was read into each position for a given line. Then run the loop in your head with i being the index (number) of each square... Code:
for ( i = 0; i < content[i]; i++ ) Code:
i content[i]
Make that a regular exercise: every time you begin to type "I sorta know..." in a reply, stop right there and develop the habit of exploring the thing further until you can type, "OK, I understand that! It works like this...". When you are writing real programs, "sorta know" will block your path and lead you down endless blind alleys! Train your brain to trigger on "sorta know" and work out your own example to change that to "OK, I understand" before asking others. Then, if you can't make that work, you have a solid question to ask others! (This is not intended to discourage you from asking for help, but to help you develop the necessary programming skill of changing uncertainty into certainty on your own - it is good exercise.) Quote:
* The character might be '#' - what do you want to do in that case?
* The character might be some other non-space, non-# character - what to do in this case?
* The character might be the terminating null character ('\0') - what to do in that case?
Each case probably requires a different action, so you need to craft your comparison test to detect and correctly handle each possibility. But to define those actions you really need to specify just how you actually want it to work. For example, do you want to define a comment as beginning with the first # in a line, or only lines in which # is the first non-whitespace character? Specify first, then write code to meet the specification. |
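One way those three cases might look in code is sketched below. It assumes the second specification mentioned above (a comment is a line whose first non-whitespace character is '#'); that choice, the enum, and the name classify_line are all illustrative assumptions, not the original poster's code.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical labels for the three cases. */
enum line_kind { LINE_COMMENT, LINE_TEXT, LINE_BLANK };

enum line_kind classify_line(const char *line)
{
    assert(line != NULL);
    size_t i = 0;
    /* Skip leading whitespace. The loop also stops at '\0',
       because isspace('\0') is false. */
    while (isspace((unsigned char)line[i]))
        i++;
    if (line[i] == '#')
        return LINE_COMMENT;   /* '#' is the first non-whitespace character */
    if (line[i] == '\0')
        return LINE_BLANK;     /* reached the terminator: only whitespace */
    return LINE_TEXT;          /* some other character came first */
}
```

Under this specification "a line#" is ordinary text, while "#a line" is a comment no matter how many spaces come before the '#'.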
Thank you BW. While your solution does work, my program shouldn't ignore a line as a comment if the hash is NOT preceding the string. So for example:
Code:
a line# Code:
#a line Quote:
I tried what you said astrogeek, but I'm still not clear on exactly why once the loop gets past 32 it fails. The only thing I can tell is the obvious in that, it fails once it gets past 32. I tried to modify the if statement, I tried adding more if statements to check for anything other than a hash symbol, but I've only made it worse. I just don't know how to express in code what you said about coding a different action depending on whether it sees a hash symbol, or null, or whatever other character. And the more things I try, the more confusing it gets, so I really don't know what to do at this point. Here's my code as it stands now; Code:
// a program to skip to the next line in file if the comment char (#) is encountered Code:
james@jamespc: practice> ./skip_line_if_comment |
Hi jsbjsb001!
You still have not understood what your loop is doing, so making changes inside it will only compound the problems, as you have discovered. Consider my original post about your loop parameters... Quote:
The point of my next post was to have you list out the value of the index, i, and the ASCII value of each array element, content[i], which should make the reason for the behavior after 32 spaces apparent. Your printf("i = %i content[i] = content[%c]\n", i, content[i]) statement for showing those values is not correct and is not even inside the loop, so it is not showing you anything useful. But let's not troubleshoot new code until we get the original working, so please do not try to fix it, just revert to your original code. Here is what you would see if you did it as I suggested: Code:
i content[i]
That answers your original question, "What is so special about 34 or more spaces...". But it also tells you that your loop parameters are just wrong - you should not be comparing to the ASCII value of each location! In your current code you are additionally incrementing each of those values to 33 during the loop - what is the character with ASCII value 33? Do you see why you are now printing all those '!'s? But you should not be comparing i to content[i] in your loop parameters - that is still just wrong. As I said in my first post: Quote:
The important point to get here is that you need to understand why your loop is working incorrectly first, then based on that make the necessary change to make it work correctly, and only then make changes within the loop. In other words, do not start by making code changes - that gets you no points and adds confusion! Start by analyzing how it is working in your original code, and why that is wrong, then make the single specific code change to fix that one thing first - for the win! So, go back to your original code so we do not add one problem on top of another, and see if you can get the loop to not abort after 32 spaces, as I have indicated. This is all good exercise - but only if you work your way through it by understanding! See what you come up with! |
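The behavior discussed above can be reproduced in isolation. The sketch below is not the original program: broken_loop_stop is a hypothetical name, the loop body is left empty on purpose, and it assumes an ASCII system, where a space has the value 32.

```c
#include <assert.h>
#include <stddef.h>

/* Runs the loop header from the original post with an empty body and
   returns the value of i at which the test i < content[i] first fails. */
int broken_loop_stop(const char *content)
{
    assert(content != NULL);
    int i;
    for (i = 0; i < content[i]; i++) {
        /* empty body: only the loop condition is under study */
    }
    /* The loop cannot run past the terminator: '\0' is 0, and i < 0 is false. */
    return i;
}
```

For a line of 40 spaces followed by '#', this returns 32: at i == 32 the test becomes 32 < 32, which is false, so the loop ends before it ever reaches the '#' at index 40. With only 10 leading spaces the index never catches up with the character values, and the loop runs on to the terminating '\0'.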
Ok, I've done what you said about reverting the code back to what it was before, and I've put the printf() statement in the right place this time so it prints out for the whole for loop.
While I can see what you mean about it stopping at 32 if there's just spaces; it's still just not obvious to me why it's stopping at a blank space. I'm sorry, I've looked at the output of the program, the code, ASCII table, but it just isn't obvious to me why it's stopping at a blank space. Here's my code as it stands now; Code:
#define CONTENT_LEN 373 Code:
james@jamespc: practice> ./skip_line_if_comment |
Quote:
I'd open the file, read in the contents, and look for the #. When it's found, look for the end of the line; on the new line, check whether there's a # or not, then do with that line whatever it is you're wanting to do.
Steps:
1. open file
2. read line
3. if # found,
4. find end of line
5. when found, skip that line and go to step 2 (repeat)
6. if no # found, print the line to output and go to step 2 (repeat)
This would eliminate the 32 spaces issue. That would be my first approach to this problem. Code:
if ( true ) { Code:
while (read line) { Code:
while ( fgets(content, CONTENT_LEN, testfile) != NULL ) {
That's without thinking about or working in the case where a comment looks like this: Code:
some code here #now a comment
The end result would be: Code:
some code here some code here |
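One reading of the example above is that text before a '#' should be kept and everything from the '#' to the end of the line dropped. That reading is an assumption, and strip_comment below is a hypothetical helper sketched under it, not code from the post.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: cuts line off at the first '#', in place.
   Returns the length of what is left. */
size_t strip_comment(char *line)
{
    assert(line != NULL);
    char *hash = strchr(line, '#');
    if (hash != NULL)
        *hash = '\0';   /* everything from '#' onward is discarded */
    return strlen(line);
}
```

A caller could skip printing whenever the returned length is 0, which covers lines that start with '#'. Lines with leading spaces before the '#' would still need the whitespace handled separately.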
Remember, characters have a numeric value inside a computer.
Does this help your understanding? Code:
#include <stdlib.h> |
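To illustrate the point about characters having numeric values, here is a small sketch. It is not the code from the post above; describe_char is a hypothetical helper, and the values mentioned afterwards assume ASCII.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes the text 'c' = N into out, where N is the
   numeric value of c. Returns the number of characters snprintf() produced. */
int describe_char(char c, char *out, size_t n)
{
    assert(out != NULL && n > 0);
    return snprintf(out, n, "'%c' = %d", c, (int)(unsigned char)c);
}
```

On an ASCII system a space comes out as 32, '!' as 33 and '#' as 35, which are the numbers that matter in this thread.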
Quote:
HTH... |
Please everyone, let's refrain from offering better ways of checking for comment lines and help jsbjsb001 to understand the behavior of his particular loop code, whether or not we think it is the best way to perform the task.
jsbjsb001, you are getting closer, but you still have not done the thing I asked so the output does not make sense to you. Here is your original loop with your printf() which I have modified to work the way I intended, and one line added - a similar printf() after the loop exits. Code:
for ( i = 0; i < content[i]; i++ ) {
I also use a simple test file: Code:
No comment
Make those specific changes to your original code and see if the output makes sense. Do you see now why the comparison is exiting after 32 spaces? Please compile these changes, and no others, and think about what you are seeing in the output. If you understand what is happening, please try to explain it in the fewest words. If you do not understand what is happening, just ask for more hints. It is very important that you understand and debug this loop methodically and clearly without other distractions. |