[SOLVED] A simple shell written in C exits after using pipe

z3rOR0ne · 06-16-2022, 02:56 AM

I have copied a project from stackoverflow and am simply trying to understand why certain behavior is happening. I am a total beginner at C, so please be kind. (original link: https://stackoverflow.com/questions/...ith-pipes-in-c).

What follows is a simple shell written in C. Its major feature is that it can process the pipe '|' symbol and connect stdin and stdout as expected in a standard shell like bash. What I am failing to understand is why the program exits after the command is executed (and only if a | is used).

According to the person who provided the answer on stackoverflow, another fork() is required for the while(1) loop to continue, but because two execvp() calls are made in the execpipe() function, the while(1) loop currently executes the pipe and then exits. I am very grateful for this post, as I am trying to ascertain how a shell like bash actually implements operators like pipe (|), but I've been racking my brain for days on this and have made no progress in understanding even where to start on solving this issue of the program exiting after a pipe command is executed...

I have extensively documented in the notation what I believe is happening in each process, and again, am very much a beginner and am just trying to understand how to program a shell with pipes for my own understanding.

Could someone please explain to me where this necessary additional fork() call should be placed within the code and how it should be implemented (meaning, are there any additional calls for pid_t, for example)?

Sorry for all the verbosity, but I'm at a loss here and any help would be greatly appreciated.

Code:

/* https://stackoverflow.com/questions/33912024/shell-program-with-pipes-in-c */

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_CMD_LENGTH 100

#define MAX_NUM_PARAMS 10

int parsecmd(char *cmd, char **params) {  // split cmd into array of params
    int i, n = -1;
    for (i = 0; i < MAX_NUM_PARAMS; i++) {
        params[i] = strsep(&cmd, " ");
        n++;
        if (params[i] == NULL) break;
    }
    return (n);
};

int executecmd(char **params) {
    pid_t pid = fork();  // create child process that is a clone of the parent

    if (pid == -1) {  // error
        char *error = strerror(errno);
        printf("error fork!!\n");
        return 1;
    } else if (pid == 0) {          // child process
        execvp(params[0], params);  // exec cmd
        char *error = strerror(errno);
        printf("unknown command\n");
        return 0;
    } else {  // parent process
        int childstatus;
        waitpid(pid, &childstatus,
                0);  // wait for child process to finish before exiting
        return 1;
    }
};

int execpipe(char **argv1, char **argv2) {
    int fds[2];  // an array that will hold two file descriptors
    pipe(fds);   // populates fds with two file descriptors

    pid_t pid = fork();  // create child process that is a clone of the parent

    if (pid == -1) {  // error
        char *error = strerror(errno);
        printf("error fork!!\n");
        return 1;
    }
    if (pid == 0) {       // child process
        close(fds[1]);    // file descriptor unused in child
        dup2(fds[0], 0);  // make stdin (fd 0) refer to the read end of the pipe
        // close(fds[0]);
        // file descriptor no longer needed in child since stdin is a copy
        execvp(argv2[0], argv2);
        // run command AFTER pipe character in userinput

        pid_t pid2 = fork();

        // simple error handling
        char *error = strerror(errno);
        printf("unknown command\n");
        return 0;
    } else {              // parent process
        close(fds[0]);    // file descriptor unused in parent
        dup2(fds[1], 1);  // make stdout (fd 1) refer to the write end of the pipe
        // close(fds[1]);
        execvp(argv1[0], argv1);
        // run command BEFORE pipe character in userinput

        int childstatus;
        waitpid(pid, &childstatus, 0);
        // wait for child process to finish before exiting

        // simple error handling
        char *error = strerror(errno);
        printf("unknown command\n");
        return 0;
    }
};

int main() {
    char cmd[MAX_CMD_LENGTH + 1];
    char *params[MAX_NUM_PARAMS + 1];
    char *argv1[MAX_NUM_PARAMS + 1] = {0};
    char *argv2[MAX_NUM_PARAMS + 1] = {0};
    int k, y, x;
    int f = 1;
    while (1) {
        printf("sh>$ ");  // prompt
        if (fgets(cmd, sizeof(cmd), stdin) == NULL)
            break;                           // read command, ctrl+D exit
        if (cmd[strlen(cmd) - 1] == '\n') {  // remove newline char
            cmd[strlen(cmd) - 1] = '\0';
        }
        int j = parsecmd(cmd, params);  // split cmd into array of params
        if (strcmp(params[0], "exit") == 0) break;  // exit
        for (k = 0; k < j; k++) {
            if (strcmp(params[k], "|") == 0) {  // if pipe is found
                f = 0;  // set f variable (previously 1) to 0
                y = k;  // and whatever y is (undefined) is equal to k (the
                        // count of words up until the "|" was found)
                break;
            }
        }
        if (f == 0) {                  // if pipe was found...
            for (x = 0; x < k; x++) {  // loop through the words
                argv1[x] = params[x];  // and the argument arrays is given a set
                                       // of parameters from our inputted cmd
            }
            int z = 0;
            for (x = k + 1; x < j;
                 x++) {  // one command ahead of k (k + 1) is equal to x, which
                         // is compared to the length of the params array(j),
                         // and looped over
                argv2[z] = params[x];  // whatever is read after the pipe is
                                       // assigned to argv2[z]
                z++;
            }
            if (execpipe(argv1, argv2) == 0) break;
            // and we executepipe() function (exits after executing)
        } else if (f == 1) {  // if pipe was not found
            // simply execute the command (does not exit shell after executing)
            if (executecmd(params) == 0) break;
            // this exits after executing:
            /* execvp(params[0], params);  // exec cmd */
        }
    }  // end while
    return 0;
}

rtmistler · 06-16-2022, 03:56 AM

The way it's coded, once a child exits, the parent exits. That's the answer. The rest is additional follow-up.

That's what those return statements do.

There are two mismatching return values for the parent return cases, by the way. Not a problem, but if you're checking those you won't be able to discern whether things exited with an error or not. Make the parent case return some other value.
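For what it's worth, in execpipe() the parent branch itself calls execvp(), so the shell process is replaced by the first command and is gone once that command finishes. One possible way around this is to fork once per command, so the shell never calls execvp() itself and only waits. The following is a rough sketch under that assumption, not a drop-in fix: the name run_pipeline() and its return convention are made up for illustration.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical replacement for execpipe(): runs "argv1 | argv2" in two
 * children and returns to the caller once both have finished.
 * Returns the exit code of the right-hand command, or -1 on failure. */
int run_pipeline(char **argv1, char **argv2) {
    int fds[2];
    if (pipe(fds) == -1) {
        perror("pipe");
        return -1;
    }

    /* Flush buffered output so it appears before anything the children print. */
    fflush(stdout);

    pid_t left = fork();
    if (left == -1) {
        perror("fork");
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (left == 0) {                   /* first child: writes into the pipe */
        dup2(fds[1], STDOUT_FILENO);
        close(fds[0]);
        close(fds[1]);
        execvp(argv1[0], argv1);
        perror(argv1[0]);              /* only reached if execvp() failed */
        _exit(127);
    }

    pid_t right = fork();
    if (right == -1) {
        perror("fork");
        close(fds[0]);
        close(fds[1]);
        waitpid(left, NULL, 0);
        return -1;
    }
    if (right == 0) {                  /* second child: reads from the pipe */
        dup2(fds[0], STDIN_FILENO);
        close(fds[0]);
        close(fds[1]);
        execvp(argv2[0], argv2);
        perror(argv2[0]);              /* only reached if execvp() failed */
        _exit(127);
    }

    /* The shell closes both ends so the reader can see end-of-file,
     * then waits for both children and keeps running. */
    close(fds[0]);
    close(fds[1]);
    int status = 0;
    waitpid(left, NULL, 0);
    waitpid(right, &status, 0);
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With something like this, main() would call run_pipeline(argv1, argv2) where it currently calls execpipe(), and would not break out of the while(1) loop based on the result.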

z3rOR0ne · 06-16-2022, 04:11 AM

Quote:

Originally Posted by rtmistler

The way it's coded, once a child exits, the parent exits. That's the answer. The rest is additional follow-up.

That's what those return statements do.

There are two mismatching return values for the parent return cases, by the way. Not a problem, but if you're checking those you won't be able to discern whether things exited with an error or not. Make the parent case return some other value.

Good to know. The thing is, I still don't understand what to do, or even where to start looking for additional info on how to make this easier to understand. From my understanding, the child process needs to persist even after the execpipe() function has completed and returned a value. Why is the suggestion on stackoverflow recommending another fork() process and where would this even start?

Again, I'm just breaking this thing down trying to learn from it, but my level of understanding of the basics of C and programming in general is very rudimentary, so I'm still at a loss as to how to resolve this issue...

pan64 · 06-16-2022, 04:27 AM

in such cases you can add logging to this code to see what's going on, or you can try debugging

z3rOR0ne · 06-16-2022, 04:39 AM

Quote:

Originally Posted by pan64

in such cases you can add logging to this code to see what's going on, or you can try debugging

Sorry to sound like such a noob here, but basically I don't even know how to do that. Is logging adding a series of printf() statements to see what the values of certain variables are during the life cycle of the program?

I'm afraid I'm not yet familiar with how to debug in C.

rtmistler · 06-16-2022, 05:48 AM

You picked a difficult first example.

Search for a very simple fork() example and debug that first.

Also read the manual pages for fork() and one of the exec() calls, paying close attention to the return status.

Nothing wrong with adding things like:

Code:

printf("This is the child\n");

Or also outputting the return value or other data. For instance, the child can print out its own pid and the parent can print out the child pid it received from fork().
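As an illustration of that kind of logging, here is a minimal sketch. The helper name trace_fork() is hypothetical; it forks once, prints from both sides, and hands the child's exit code back to the caller. It writes to stderr, which is unbuffered by default, so the messages show up immediately.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork once, log from both processes, and return the
 * child's exit code as seen by the parent (-1 on error). */
int trace_fork(int child_exit_code) {
    pid_t pid = fork();
    if (pid == -1) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        /* Child: fork() returned 0 here. */
        fprintf(stderr, "child:  my pid is %ld, fork() returned 0\n",
                (long)getpid());
        _exit(child_exit_code);
    }

    /* Parent: fork() returned the child's pid. */
    fprintf(stderr, "parent: my pid is %ld, fork() returned %ld\n",
            (long)getpid(), (long)pid);

    int status;
    if (waitpid(pid, &status, 0) == -1) {
        perror("waitpid");
        return -1;
    }
    if (WIFEXITED(status)) {
        fprintf(stderr, "parent: child %ld exited with code %d\n",
                (long)pid, WEXITSTATUS(status));
        return WEXITSTATUS(status);
    }
    return -1;  /* child was terminated by a signal */
}
```

Calling trace_fork(0) or trace_fork(7) from a small main() and reading the three lines it prints shows which process is which and what the parent gets back from waitpid().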

rtmistler · 06-16-2022, 05:50 AM

Like this example https://www.includehelp.com/c-progra...x-example.aspx

Same for searching for a simple exec() example.

z3rOR0ne · 06-16-2022, 06:31 AM

Quote:

Originally Posted by rtmistler

You picked a difficult first example.

Search for a very simple fork() example and debug that first.

Also read the manual pages for fork() and one of the exec() calls, paying close attention to the return status.

Nothing wrong with adding things like:

Code:

printf("This is the child\n");

Or also outputting the return value or other data. For instance, the child can print out its own pid and the parent can print out the child pid it received from fork().

I have looked over the man pages of fork and exec a few times now and am still lost (to be honest, I need a LOT more examples and sample programs to look at, the man pages feel like the briefest of summaries at times to me).

I have gone over some basic tutorials of fork() and execlp as well as execvp, but have had no examples on how to debug super simple examples of that. Here is some simple fork() code I have looked at repeatedly. How would I debug this?

Code:

// Taken from: https://stackoverflow.com/questions/15102328/how-does-fork-work
#include <sys/types.h>
#include <sys/wait.h>
#include <stdio.h>
#include <unistd.h>

#define SIZE 5

int nums[SIZE] = {0,1,2,3,4};

int
main()
{
    int i;
    pid_t pid;
    pid = fork();

    if (pid == 0) {
        for (i = 0; i < SIZE; i++) {
            nums[i] *= i;
            printf("CHILD: %d ", nums[i]); /* LINE X */
        }
    }
    else if (pid > 0) {
        wait(NULL);
        for (i = 0; i < SIZE; i++)
            printf("PARENT: %d ", nums[i]); /* LINE Y */
    }
    return 0;
}

/*
Outputs:
CHILD: 0 CHILD: 1 CHILD: 4 CHILD: 9 CHILD: 16 PARENT: 0 PARENT: 1 PARENT: 2 PARENT: 3 PARENT: 4
*/

/*
Explanation:
fork() duplicates the process, so after calling fork there are actually 2 instances of your program running.

How do you know which process is the original (parent) one, and which is the new (child) one?

In the parent process, the PID of the child process (which will be a positive integer) is returned from fork(). That's why the if (pid > 0) {  PARENT  } code works. In the child process, fork() just returns 0.

Thus, because of the if (pid > 0) check, the parent process and the child process will produce different output, which you can see here (as provided by @jxh in the comments).
*/

What's sad is no matter how often I look at explanations like this I can't seem to retain the information.

Anyways, thanks for your help so far.

z3rOR0ne · 06-16-2022, 06:55 AM

To clarify in relation to the C code in the original post, I am somewhat aware of how execvp does not return a value, but am unsure as to why this breaks a while(1) loop. I have played around with execvp and determined that you cannot, for example, call the ls command more than once whether you call execvp multiple times or put it in a loop of any sort. Can someone explain why this is?

NevemTeve · 06-16-2022, 07:22 AM

TL;DR What is the actual question?

dugan · 06-16-2022, 08:38 AM

It’s too early in the morning to check your code, but BASH creates a process group for the pipeline. Are you doing that? See here:

https://youtu.be/NfHqGv0PlIw

rtmistler · 06-16-2022, 01:22 PM

Quote:

Originally Posted by z3rOR0ne

To clarify in relation to the C code in the original post, I am somewhat aware of how execvp does not return a value, but am unsure as to why this breaks a while(1) loop. I have played around with execvp and determined that you cannot, for example, call the ls command more than once whether you call execvp multiple times or put it in a loop of any sort. Can someone explain why this is?

Because each of those subfunctions you've copied, or modified, or written has return statements and probably exits when you don't expect it to.

Yes, exec functions do not return, but if the code or command they run completes and exits, then the process goes away.

If it's the parent, then it goes away. Its children are not killed automatically; any that are still running get orphaned and re-parented.

If it's a child, then it terminates and a signal (SIGCHLD) is sent to the parent.
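Since a successful exec replaces the calling process, a loop around execvp() alone can only ever run once. To run the same command several times, each run needs its own fork(). A minimal sketch, assuming hypothetical helper names run_once() and run_repeatedly():

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv in a fresh child and return its exit code
 * (-1 if fork()/waitpid() failed or the child was killed by a signal). */
int run_once(char **argv) {
    fflush(stdout);  /* push out pending output before the child writes */
    pid_t pid = fork();
    if (pid == -1) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        execvp(argv[0], argv);  /* on success this never returns */
        perror(argv[0]);        /* only reached if execvp() failed */
        _exit(127);
    }
    int status;
    if (waitpid(pid, &status, 0) == -1) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Run the same command `times` times; returns how many runs exited with 0.
 * The loop survives because only the children call execvp(). */
int run_repeatedly(char **argv, int times) {
    int ok = 0;
    for (int i = 0; i < times; i++) {
        if (run_once(argv) == 0) ok++;
    }
    return ok;
}
```

For example, passing an argv of {"ls", NULL} and a count of 3 to run_repeatedly() would list the directory three times, and the calling process is still there afterwards.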

Now you can put tons of debug in there to tell you "I am <this>", "I received ...", etc.

You copied code which in my opinion is doing something it doesn't need to do. They're parsing a command line, text. No need to fork processes to do this. So it's somebody's for-fun experiment.

I'm afraid I don't think I can help further here, sorry but not invested with debugging second hand code. Already have recommended you start with simpler code, as in write your own from the ground up, starting with the things you do know.

Perspective: I work as an engineer writing code in C. We keep it simple. Period. Otherwise you're asking for bugs. We have to deal with those, but we prefer not to.

In my LQ blogs I've long ago posted stuff about how I wrote a daemon and forked child processes, used pipes, monitored the children, and used the select statement. That's all I got, but those are working examples.

ntubski · 06-16-2022, 05:14 PM

Quote:

Originally Posted by dugan

BASH creates a process group for the pipeline. Are you doing that?

This has practical advantages, but seems like a really unnecessary complication for a newbie writing their first sort-of-shell code.

z3rOR0ne · 06-16-2022, 07:00 PM

Quote:

Originally Posted by rtmistler

Because each of those subfunctions you've copied, or modified, or written has return statements and probably exits when you don't expect it to.

Yes, exec functions do not return, but if the code or command they run completes and exits, then the process goes away.

If it's the parent, then it goes away. Its children are not killed automatically; any that are still running get orphaned and re-parented.

If it's a child, then it terminates and a signal (SIGCHLD) is sent to the parent.

Now you can put tons of debug in there to tell you "I am <this>", "I received ...", etc.

You copied code which in my opinion is doing something it doesn't need to do. They're parsing a command line, text. No need to fork processes to do this. So it's somebody's for-fun experiment.

I'm afraid I don't think I can help further here, sorry but not invested with debugging second hand code. Already have recommended you start with simpler code, as in write your own from the ground up, starting with the things you do know.

Perspective: I work as an engineer writing code in C. We keep it simple. Period. Otherwise you're asking for bugs. We have to deal with those, but we prefer not to.

In my LQ blogs I've long ago posted stuff about how I wrote a daemon and forked child processes, used pipes, monitored the children, and used the select statement. That's all I got, but those are working examples.

Thank you. That is as good an explanation as any I can expect regarding exec functions.

As a beginner I was looking into how to go about writing my own shell and came across a plethora of examples of how to approach the issue. Yes I copied this code, but that is because I wanted to break it down afterwards to understand what each function was doing (hence the verbose notation).

I also wished to understand how to best implement | characters and their corresponding functionality that is normally available in full-featured shells, but examples on how to do such a thing were practically nonexistent. This was the closest one I could find, but I found myself frustrated at its exiting afterwards and I couldn't wrap my head around how to address this. I know it's time to take a step back, but I've been through two books, multiple videos, and many online tutorials on the beginner's aspects of C programming, and wanted to challenge myself to extend an existing shell program with an additional feature. I suppose I won't be figuring this out any time soon.

Again, thank you for your insights.