fundamental question on threads and critical sections
Hi there,
I am not very experienced with threads and critical sections. It would be great if someone could clarify whether I have understood the following concept correctly.
Scenario:
Two threads access a global resource. To keep the scenario simple, I have used a global integer variable as the resource both threads access. Thread_1 increments the global counter variable; Thread_2 decrements it.
Thread Actions:
Each thread waits for the arrival of a signal from the main thread. When this signal is received, each of them enters a critical section in order to increment/decrement the global variable.
For the moment, assume that dd_crit_enter and dd_crit_leave are the functions that allow entry to and exit from the critical sections of the code. These are the functions I am supposed to be testing, by the way.
Question:
1. My understanding is that the threads will be scheduled for execution by the operating system such that each is allotted a bit of execution time (unless, of course, one of them makes a blocking call, which is not the case in my scenario), right?
2. I am under the impression that when a thread enters the critical section of code, it is not allowed to be interrupted by any other process/thread until the critical section is completed. Right?
Assuming thread1 enters the critical section first, it would not be interrupted until it has incremented CTR to 50, right? Assuming thread2 executes next, it would then decrement CTR to 0 before giving up control. Right?
I hope I have made the scenario clear here. I am unfortunately not able to provide concrete code, since it's proprietary, and I am basically testing whether the implementation of dd_crit_enter and dd_crit_leave meets the fundamental requirements.
Thanks a million for your responses.
Cheers,
Vinay
If a variable (or an object, as the term is used in OO languages) can be concurrently accessed by multiple threads, then yes, one should protect concurrent access to this object.
I typically use a pthread_mutex to protect my data. I lock this mutex before reading/writing the variable/object, then unlock the mutex when I am done.
Code:
pthread_mutex_lock(&mutex);
/* critical section */
pthread_mutex_unlock(&mutex);
There are inherent dangers with this approach, though. One needs to be careful that the mutex is always unlocked after completing the critical section. A common mistake is to return from a function that contains the critical code before the mutex is unlocked. So be aware of this issue; for example:
Code:
void function(void)
{
    pthread_mutex_lock(&mutex);
    do_something();

    if (error) return;  /* BUG: mutex stays locked -- potential application deadlock */

    pthread_mutex_unlock(&mutex);
}
As for your second question, a mutex or semaphore can be used to guard against concurrent access to data or a section of code. Threads ought to share the same mutex or semaphore for this to succeed. Similarly, this applies to the individual processes involved, say, in accessing shared memory.
P.S. I write mostly C++ OO code; thus, if I have shared data variable(s) between two LWPs (threads), I encapsulate the data and the mutex within an object; this makes the use of the mutex transparent to the calling threads.
If you are developing in C, then something similar can be done. In a module, declare a static pthread_mutex and use it to guard critical sections within accessor functions.
Last edited by dwhitney67; 04-09-2009 at 11:15 AM.
Hi,
Thanks a lot for the reply. As far as my second question goes, I think I might not have formulated it appropriately. The dd_crit_enter and dd_crit_leave in my case take care of the mutex locking. In my scenario, the two threads I created then run alternately, depending on which one the underlying OS schedules first.
My question here is: is it possible that the underlying OS interrupts one of the threads while it is in its critical section in order to schedule the second thread? I am asking because I have a situation where, in the beginning, thread1 counts up to 50 and then thread2 counts down to zero, but somewhere during the run the two threads enter a deadlock-like state and go into an infinite alternating (thread1/thread2) loop, which I cannot comprehend.
I must state here that I am using the PowerPC architecture running under QEMU emulation on my laptop. Is it possible that QEMU does not emulate this scenario correctly?
Thanks once again for the help!
Quote:
1. My understanding is that the threads will be scheduled for execution by the operating system such that each is allotted a bit of execution time (unless, of course, one of them makes a blocking call, which is not the case in my scenario), right?
I think your dd_crit_enter() is supposed to represent a blocking call, so I don't get your meaning of "not the case".
Quote:
2. I am under the impression that when a thread enters the critical section of code, it is not allowed to be interrupted by any other process/thread until the critical section is completed. Right?
I don't know what OS or machine primitive is represented by your dd_crit_enter(), but the various forms of "enter critical section" I have used all do allow the thread/process to be interrupted. They just don't allow another thread to enter the same critical section.
Quote:
Assuming thread1 enters the critical section first, it would not be interrupted until it has incremented CTR to 50, right?
I don't think that is exactly correct. But the main effect should be true (assuming the same critical section is used by both your dd_crit_enter() calls).
Quote:
I am basically testing whether the implementation of these dd_crit_enter and leave is done as per the fundamental requirements.
That might require a much stronger test.
Quote:
Originally Posted by yanivwehtam
My question here is: is it possible that the underlying OS interrupts one of the threads while it is in its critical section in order to schedule the second thread?
Yes, but haven't you coded the second thread to reach the dd_crit_enter() call very quickly when scheduled? So when it reaches that call, it should block until the first thread reaches dd_crit_leave().
Quote:
I am asking because I have a situation where, in the beginning, thread1 counts up to 50 and then thread2 counts down to zero, but somewhere during the run the two threads enter a deadlock-like state and go into an infinite alternating (thread1/thread2) loop, which I cannot comprehend.
I don't understand what prevents the situation of one thread getting past the dd_crit_enter() twice in a row without the other one getting in. Do you have first come first served logic in dd_crit_enter() somehow? Usually such things are random.
Would one of them getting in twice in a row set up the problem you are seeing?
A critical section does not prevent a thread from being pre-empted while inside. Rather, it merely prevents another thread from entering the same section at the same time.
Kinda like a stall in a bathroom, or something ...
Quote:
My question here is: is it possible that the underlying OS interrupts one of the threads while it is in its critical section in order to schedule the second thread?
This is certainly possible and even likely unless the first thread blocks the second thread.
You have not specified whether this program runs in userspace or in kernelspace. If the former, then you cannot get it to run in a fashion that is uninterruptible; the kernel just won't let you do that. If you are in kernel space, then you CAN make the code truly uninterruptible, which makes it possible for you to bring the whole system down if you make a mistake.
Because of this ambiguity, your question is unclear.
Your critical section can only be critical to your application; you can use mutexes to establish control and set locks, but you can't prevent the kernel from preempting you unless you are running as part of the kernel.
Quote:
I don't understand what prevents the situation of one thread getting past the dd_crit_enter() twice in a row without the other one getting in. Do you have first come first served logic in dd_crit_enter() somehow? Usually such things are random.
Would one of them getting in twice in a row set up the problem you are seeing?
Thanks a lot for the explanations. I think I realized where I am making a mistake. The two threads don't access a common piece of code; rather, I have just bracketed each increment/decrement in the threads in its own critical section, which, from your explanations, leads to the behaviour I am seeing.
I have changed this so that each thread calls a function called modify_ctr, which does the increment/decrement depending on which thread is currently accessing it. I have bracketed this call in a dd_crit_enter/dd_crit_leave block. This means that both threads access the same piece of code, depending on which one is currently executing.
Quote:
I don't understand what prevents the situation of one thread getting past the dd_crit_enter() twice in a row without the other one getting in. Do you have first come first served logic in dd_crit_enter() somehow? Usually such things are random.
Sorry, but I am testing dd_crit_enter and dd_crit_leave in a black-box way. For this discussion, we can assume that these two functions take care of the synchronisation required to enter the critical sections. I am supposed to find out whether these two functions work as per the fundamental requirements of synchronisation; I therefore cannot really tell what is inside them.
I only know that they are supposed to take care of synchronisation of the critical sections.
Quote:
You have not specified whether this program runs in userspace or in kernelspace. If the former, then you cannot get it to run in a fashion that is uninterruptible; the kernel just won't let you do that. If you are in kernel space, then you CAN make the code truly uninterruptible, which makes it possible for you to bring the whole system down if you make a mistake.
Because of this ambiguity, your question is unclear.
Your critical section must be critical to your application only and then you can use mutexes to establish control and set locks, but you can't prevent the kernel from preempting you unless you are running as part of the kernel.
The two threads I created are kernel-level threads. I understand that I have the possibility of bringing the whole system down, but this is exactly what I am trying to test with the two functions dd_crit_enter and dd_crit_leave.
Now, assuming these two functions synchronise entry into the critical sections correctly, then with the change to the code mentioned above, I would expect the following situation.
Assume thread1 enters the critical section modify_ctr and is interrupted while it is inside. When thread2 is scheduled for execution, I would expect thread2 to block, since it tries to enter the same modify_ctr using dd_crit_enter and dd_crit_leave. Since thread1 has not yet reached dd_crit_leave and was interrupted in between, thread2 would block until thread1 reaches dd_crit_leave, right? This would lead to a deadlock situation, since thread2 blocks and thread1 would never be able to reach the dd_crit_leave call.
I hope I have made my situation clear enough.
Thanks a million guys!
Quote:
Assume thread1 enters the critical section modify_ctr and is interrupted while it is inside. When thread2 is scheduled for execution, I would expect thread2 to block, since it tries to enter the same modify_ctr using dd_crit_enter and dd_crit_leave. Since thread1 has not yet reached dd_crit_leave and was interrupted in between, thread2 would block until thread1 reaches dd_crit_leave, right? This would lead to a deadlock situation, since thread2 blocks and thread1 would never be able to reach the dd_crit_leave call.
Blocking doesn't really mean anything is actually stuck; it just means the process is put in a different queue while the others are executed (threads are processes with common memory). In other words, when thread2 blocks, it will be put in a wait queue and the kernel will switch back to thread1, periodically checking whether the criteria are met for thread2 to unblock. On another note, you seem to assume this will always run on a single-processor, single-core machine.
Kevin Barry
Quote:
Blocking doesn't really mean anything is actually stuck; it just means the process is put in a different queue while the others are executed (threads are processes with common memory). In other words, when thread2 blocks, it will be put in a wait queue and the kernel will switch back to thread1, periodically checking whether the criteria are met for thread2 to unblock. On another note, you seem to assume this will always run on a single-processor, single-core machine.
Thanks Kevin. Well, my machine is a single-processor, single-core machine, and the behaviour I am seeing with my functions dd_crit_enter and dd_crit_leave is that when thread1 is interrupted in its critical section (I do this by simply putting thread1 to sleep after every increment it makes), thread2 is still able to gain access to the same piece of code and count down the counter. As per your explanation, thread2 should find that it cannot enter modify_ctr and should be put in the wait queue instead of decrementing the counter. This would mean that the dd_crit_enter and dd_crit_leave functions are not performing the synchronisation in the expected manner, right?
On another note, what would the difference be on a multiprocessor system? Does the synchronisation mechanism differ there? Assuming thread1 runs on processor1 and thread2 on processor2, they still access the same critical section, so I would expect to see an alternation of thread1 and thread2 depending on which one gains access to the critical section. Or is my understanding wrong as to how critical sections are handled on multiprocessor systems?
From what it sounds like, modify_ctr only locks out access while it performs a single operation. You should probably revert back to dd_crit_enter and dd_crit_leave, and have them call pthread_mutex_lock and pthread_mutex_unlock, respectively, on the same mutex.
Kevin Barry