Questions on efficiency of multiple fork() calls (in C)

POW R TOC H · 07-19-2008, 08:26 AM

Hello.
I am writing a little server program (for fun and experimenting, nothing serious), and I'm using SDL_net to avoid direct socket programming (for now).
I am experimenting with a few things that are totally new to me: embedding Lua in my programs, SDL_net, and forks (and pipes).

My 'server' design is a rather simple one:
A main process listens on a socket. When somebody connects, the server forks, and the child takes the new socket and works with it (the 'work' is defined by Lua scripts) while the parent continues to listen for incoming connections.
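
The fork-per-connection design described above can be sketched roughly as follows. This is only an illustration using plain POSIX calls rather than SDL_net; `serve_in_child`, `reap_children`, and `send_greeting` are hypothetical names, and the handler stands in for whatever the Lua scripts would do. Two details are easy to miss with fork(): the parent should close its own copy of the connected descriptor, and exited children should be collected with waitpid() so they do not linger as zombies.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical per-connection worker: receives the connected descriptor. */
typedef void (*conn_handler)(int fd);

/* Fork a child to serve conn_fd. The parent closes its copy of the
 * descriptor and returns the child's pid, or -1 if fork() failed. */
static pid_t serve_in_child(int conn_fd, conn_handler handler)
{
    assert(handler != NULL);

    pid_t pid = fork();
    if (pid == 0) {              /* child: do the work, then leave */
        handler(conn_fd);
        close(conn_fd);
        _exit(0);                /* do not run the parent's atexit handlers */
    }

    close(conn_fd);              /* parent (also on fork failure) */
    return pid;
}

/* Collect children that have already exited, without blocking.
 * Returns how many were reaped. */
static int reap_children(void)
{
    int reaped = 0;
    while (waitpid(-1, NULL, WNOHANG) > 0)
        reaped++;
    return reaped;
}

/* Example handler (assumption): send one fixed message to the client. */
static void send_greeting(int fd)
{
    const char *msg = "hello\n";
    if (write(fd, msg, strlen(msg)) < 0)
        _exit(1);
}
```

In an accept loop, the parent would call `serve_in_child(new_fd, handler)` for each new connection and call `reap_children()` every so often, for example after each accept.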

This works fine as a concept, but there is a major problem: I use SDL_net for TCP connections, and SDLNet_TCP_Recv() is a blocking function. I can't let my server wait stupidly for something to be received.

So I thought I could fork once more, right after the child is created:
When the main process accepts a connection and forks, the child forks again, with a pipe between it and its own child. The child (now a parent itself) works with Lua and sends data, while the grandchild only receives data, which it passes on through the pipe.

However, this would mean that with N connections I would have
2*N + 1 processes: two for each connection, and one for the main process...

Finally, I get to my point. Is this OK? How fast is fork(), and should I fork this much? Is there another way to do this without forking the child? Is there anything in particular I should be careful about when using fork()?

Thank you

ophirg · 07-19-2008, 08:49 AM

hi

Linux implements fork() using a copy-on-write technique: a memory page is copied for the child process only when one of the two processes writes to it. So fork is fast.
But it won't be fast enough if you're expecting thousands of requests per second. If you are, try keeping a group of child processes waiting for the server (blocked on a read from a stream, or waiting for a signal).
Also, consider using threads if you can and where you can. They are generally cheaper to create and to switch between than processes.
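
To get a rough feel for how fast fork() is on a particular machine, one option is to time it directly. The sketch below is an assumption about how one might measure it, not a benchmark from this thread: `avg_fork_usec` is a hypothetical helper that times a fork(), an immediate _exit() in the child, and a waitpid() in the parent. The result depends heavily on the hardware, the kernel, and how much memory the parent has mapped.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Average wall-clock cost, in microseconds, of one
 * fork() + _exit() + waitpid() round trip.
 * Returns -1.0 if any system call fails. */
static double avg_fork_usec(int iterations)
{
    struct timespec start, end;

    assert(iterations > 0);
    if (clock_gettime(CLOCK_MONOTONIC, &start) != 0)
        return -1.0;

    for (int i = 0; i < iterations; i++) {
        pid_t pid = fork();
        if (pid < 0)
            return -1.0;
        if (pid == 0)
            _exit(0);                    /* child does nothing and leaves */
        if (waitpid(pid, NULL, 0) != pid)
            return -1.0;
    }

    if (clock_gettime(CLOCK_MONOTONIC, &end) != 0)
        return -1.0;

    double elapsed_usec = (double)(end.tv_sec - start.tv_sec) * 1e6
                        + (double)(end.tv_nsec - start.tv_nsec) / 1e3;
    return elapsed_usec / iterations;
}
```

Calling `avg_fork_usec(1000)` and printing the result gives a per-fork figure to compare against the request rate the server is expected to handle.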

POW R TOC H · 07-19-2008, 08:58 AM

Well, my little server is just a dummy server, and its behaviour is defined by 3 different Lua scripts (that's my idea, not implemented yet): a config.lua script (run on startup), a main.lua script, which tells the main process what to do (if anything), and child.lua.
Every child process gets its own interpreter, so every client can get its own session. The idea is to create a server that can be user-configured to do anything (consider it a server-side analogy of netcat, with extensions): be a small web server, mail server, FTP server, etc. The idea is not to compete with other servers, but rather to learn more in the process of creating it. Also, to have an excuse to drink too much coffee.
How many requests it has to handle depends on what it is used for, and what the user programs the Lua scripts to do. So I would like it to be fast. Can you please give me some good references to tutorials and books that deal with threading in Linux? (I probably won't use threads in this project, as I know nothing about them yet, but I'll learn for future use.)
So far my server is able to handle incoming connections and fork, and the child process already has some functions for the Lua scripts to use. I'm about to implement the Lua interpreter today.

Thanks for the answer.

ntubski · 07-19-2008, 10:04 AM

Quote:

When the main process accepts a connection, and forks, the child forks again, forming a pipe between it and its child. The child-parent works with lua, and sends data, while the child-child works only on receiving (which he sends through the pipe).

I'm not sure I understand. Wouldn't you still end up blocking when receiving data from the pipe? Possibly the socket set functions would be useful to you.

POW R TOC H · 07-19-2008, 02:42 PM

Quote:

Originally Posted by ntubski

I'm not sure I understand. Wouldn't you still end up blocking when receiving data from the pipe? Possibly the socket set functions would be useful to you.

Well, if the pipe is empty, there is simply nothing to read. On the other hand, SDLNet_TCP_Recv() won't tell me that: instead, it will wait until data is received, which prevents my server from doing anything else in the meantime...
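
One caveat on the pipe idea: by default, read() on an empty pipe blocks just like a blocking receive does. A read only comes back immediately with "nothing there" if the read end has been put into non-blocking mode. A minimal sketch of that, assuming plain POSIX pipes; `set_nonblocking` and `try_read` are hypothetical helper names:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Switch fd to non-blocking mode. Returns 0 on success, -1 on error. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Try to read from a non-blocking descriptor without waiting.
 * Returns the number of bytes read (> 0), 0 if nothing is available
 * right now, or -1 on end-of-file or a real error. */
static ssize_t try_read(int fd, void *buf, size_t len)
{
    assert(buf != NULL);

    ssize_t n = read(fd, buf, len);
    if (n > 0)
        return n;
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR))
        return 0;                /* empty for now, try again later */
    return -1;                   /* writer closed the pipe, or read failed */
}
```

Calling `try_read()` in a tight loop would burn CPU, so some form of waiting (a timeout, or poll()/select()) is still needed somewhere in the design.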

jtshaw · 07-19-2008, 03:05 PM

I think if you're worried about efficiency, a threading model ultimately makes a lot more sense here, particularly since you can avoid IPC altogether if you are using threads.

Chapters 11 and 12 of Advanced Programming in the UNIX Environment are a good place to start learning about threading. There are also plenty of references online for pthreads such as https://computing.llnl.gov/tutorials/pthreads/ which even contains a table comparing fork() to pthread_create() on different architectures.
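
As a rough illustration of the "no IPC needed" point: threads in one process share memory, so workers can update common state directly as long as access is protected by a mutex. The sketch below is an assumption about how that might look, not code from the tutorial above; `shared_stats`, `worker`, and `run_workers` are hypothetical names, and each worker just pretends to have handled a few bytes. It would typically be compiled with `-pthread`.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>

enum { MAX_WORKERS = 16 };

/* State shared by all workers; no pipe or other IPC is involved. */
struct shared_stats {
    pthread_mutex_t lock;
    long bytes_seen;
};

struct worker_arg {
    struct shared_stats *stats;
    long bytes;                  /* pretend this worker handled this much */
};

static void *worker(void *p)
{
    struct worker_arg *arg = p;

    pthread_mutex_lock(&arg->stats->lock);
    arg->stats->bytes_seen += arg->bytes;
    pthread_mutex_unlock(&arg->stats->lock);
    return NULL;
}

/* Start n workers (worker i reports i + 1 bytes), wait for all of them,
 * and return the total they recorded in the shared structure. */
static long run_workers(int n)
{
    struct shared_stats stats;
    pthread_t tid[MAX_WORKERS];
    struct worker_arg args[MAX_WORKERS];
    int started = 0;

    assert(n > 0 && n <= MAX_WORKERS);
    stats.bytes_seen = 0;
    if (pthread_mutex_init(&stats.lock, NULL) != 0)
        return -1;

    for (int i = 0; i < n; i++) {
        args[i].stats = &stats;
        args[i].bytes = i + 1;
        if (pthread_create(&tid[i], NULL, worker, &args[i]) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);

    pthread_mutex_destroy(&stats.lock);
    return stats.bytes_seen;
}
```

With fork(), the same bookkeeping would need a pipe, shared memory, or some other IPC channel between the processes.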

ntubski · 07-19-2008, 07:44 PM

Quote:

Originally Posted by POW R TOC H

Well, if the pipe is empty, there is simply nothing to read. On the other hand, SDLNet_TCP_Recv() won't tell me that

Which is why I linked to the socket set functions; in particular, I think SDLNet_SocketReady will tell you that. Although then you might have the problem of avoiding a busy loop...

Maybe I'm just dense, but I can't see how the pipe helps. To see if the pipe is empty, wouldn't you have to read from it, and wouldn't you then block until there is data?
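
For plain file descriptors, one way to ask "is there anything to read?" without actually reading, and without a busy loop, is poll() with a timeout. A small sketch under that assumption; `wait_readable` is a hypothetical helper, and it works on POSIX descriptors such as pipes, not on SDL_net socket objects:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Wait up to timeout_ms milliseconds for fd to become readable.
 * timeout_ms == 0 only checks; timeout_ms == -1 waits indefinitely.
 * Returns 1 if a read would not block (data or end-of-file),
 * 0 if the timeout expired, -1 on error. */
static int wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };

    assert(fd >= 0);
    int rc = poll(&pfd, 1, timeout_ms);
    if (rc <= 0)
        return rc;               /* 0: timed out, -1: poll failed */
    return (pfd.revents & (POLLIN | POLLHUP)) ? 1 : -1;
}
```

Because the process sleeps inside poll() until data arrives or the timeout expires, it can do other work between calls without spinning.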

pinniped · 07-19-2008, 09:58 PM

Wow - that's setting up an awful lot of open file descriptors. There must be an easier way to do things.

First of all, as already mentioned, you probably want to use a 'pthread' rather than 'fork'; if you use a fork() you will have to set up IPCs to communicate between processes. Messy, but very portable.

Now if you only ever have a tiny number of connections, threads or fork are just fine. If you have numerous connections, the overhead starts to become a nuisance (context switching, process tables, etc etc). Sequential processing of socket connections would be the best solution even though it may be a little more work to code. You might still have a few threads running to handle other things, especially when handling many unrelated tasks in sequence just grows to be silly and overly complicated - you essentially leave it to the OS scheduler to arrange the sequencing.
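
The "sequential processing" approach can be sketched as a single process that polls all connections and services whichever ones have data. This is a minimal illustration with POSIX poll(), not SDL_net; `service_ready`, `count_bytes`, `total_bytes`, and `MAX_CONNS` are hypothetical names, and a real server would also handle closed connections and partial messages.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

enum { MAX_CONNS = 64 };

/* Hypothetical callback invoked with whatever was read from one connection. */
typedef void (*data_cb)(int fd, const char *data, size_t len);

/* Poll all descriptors once, waiting up to timeout_ms, and handle each
 * readable one in turn. Returns the number of descriptors that delivered
 * data, 0 on timeout, or -1 if poll() failed. */
static int service_ready(const int *fds, int nfds, int timeout_ms, data_cb on_data)
{
    struct pollfd pfds[MAX_CONNS];

    assert(nfds > 0 && nfds <= MAX_CONNS);
    for (int i = 0; i < nfds; i++) {
        pfds[i].fd = fds[i];
        pfds[i].events = POLLIN;
        pfds[i].revents = 0;
    }

    int ready = poll(pfds, (nfds_t)nfds, timeout_ms);
    if (ready <= 0)
        return ready;

    int serviced = 0;
    for (int i = 0; i < nfds; i++) {
        if (!(pfds[i].revents & POLLIN))
            continue;
        char buf[512];
        ssize_t n = read(pfds[i].fd, buf, sizeof buf);
        if (n > 0) {
            on_data(pfds[i].fd, buf, (size_t)n);
            serviced++;
        }
    }
    return serviced;
}

/* Example callback (assumption): just count the bytes received. */
static size_t total_bytes;

static void count_bytes(int fd, const char *data, size_t len)
{
    (void)fd;
    (void)data;
    total_bytes += len;
}
```

A main loop would call `service_ready()` repeatedly, adding newly accepted connections to the array and removing closed ones, so that no extra process or thread is needed per client.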

POW R TOC H · 07-21-2008, 06:22 AM

Quote:

Originally Posted by pinniped

Wow - that's setting up an awful lot of open file descriptors. There must be an easier way to do things.

First of all, as already mentioned, you probably want to use a 'pthread' rather than 'fork'; if you use a fork() you will have to set up IPCs to communicate between processes. Messy, but very portable.

Now if you only ever have a tiny number of connections, threads or fork are just fine. If you have numerous connections, the overhead starts to become a nuisance (context switching, process tables, etc etc). Sequential processing of socket connections would be the best solution even though it may be a little more work to code. You might still have a few threads running to handle other things, especially when handling many unrelated tasks in sequence just grows to be silly and overly complicated - you essentially leave it to the OS scheduler to arrange the sequencing.

Thank you all.

However, I'm going to try all 3 methods with a smaller example, and see what fits my knowledge level (which is not much, mind you).

This is, after all, a project I'm working on just for fun...