Hi All,
I have an interesting problem with the VM management routines in Red Hat Enterprise Linux 3 (kernel 2.4.21-37.0.1.ELhugemem) and would greatly appreciate any insights.
The systems we are running here are on the above O/S, supporting Oracle 9.2.0.7 databases. We are using the Oracle Clustered File System (OCFS) version 1.14.
The problem stems from the behaviour of the Linux VM, in that as much free RAM as possible is consumed by the pagecache. This results in the free heap becoming very small in relation to the available RAM. I understand that this is the normal behaviour of the Linux VM sub-system, but it has some serious negative side effects for the configuration we are running.
It would appear that OCFS does some internal checks during its operation based on the amount of Free Memory in the system. It also appears (from watching it in operation) that it is using the amount of Free Memory to determine how much RAM is actually available in the system.
Because the Free Heap (as is shown in either vmstat or /proc/meminfo) is so low, OCFS goes into a CPU spin before it can get the resource it requires.
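To make the "Free Heap" figure concrete, here is a minimal sketch in Python of how the free fraction could be read from /proc/meminfo-style text. The sample values and the function names (parse_meminfo, free_fraction) are hypothetical and only illustrate the idea. It assumes "Name: value kB" lines such as MemTotal and MemFree are present.

```python
def parse_meminfo(text):
    """Return {field name: numeric value} from /proc/meminfo-style text."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def free_fraction(fields):
    """Fraction of total RAM currently reported as free (MemFree / MemTotal)."""
    return fields["MemFree"] / fields["MemTotal"]


# Hypothetical sample: most of the RAM is held by the pagecache.
SAMPLE = """MemTotal:     16000000 kB
MemFree:        160000 kB
Cached:       12000000 kB
"""

sample_fields = parse_meminfo(SAMPLE)
print("free fraction: %.2f" % free_fraction(sample_fields))
# On a live system the text would come from open("/proc/meminfo").read().
```

In this made-up sample only about 1% of RAM shows as free even though most of the rest is reclaimable pagecache, which is the situation described above.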
This can have two very serious negative consequences:
1. The whole RAC cluster will slow down while the affected node is in a CPU spin, or
2. The cluster manager will fail entirely and the RAC will go down.
The solution, as I see it, is to force the Red Hat Linux 3 VM to release more memory into the free heap sooner. Unfortunately, I cannot find a way to do this. I have tried different settings in the /proc/sys/vm files (bdflush, pagecache, kswapd, etc.) but with no success. It appears to me that the Red Hat Linux 3 VM completely ignores these parameters, which is very disappointing.
If all of this seems a bit far-fetched, I had exactly the same issue on our previous system, which used Red Hat Linux 2.1. In that situation, I was able to write a program that automatically sensed the low-RAM situation and forced the VM to return large amounts of RAM to the free heap by setting the pagecache parameters to 10 20 30 for a short while (a few minutes) before resetting them to the default 1 50 90.
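The original program is not shown here, so the following is only a rough sketch in Python of the approach just described, not the actual implementation. The 2% threshold, the 180-second hold time and the function names are assumptions made for illustration. The path /proc/sys/vm/pagecache and the two value triples are the ones mentioned in this post.

```python
import time

PAGECACHE_PATH = "/proc/sys/vm/pagecache"  # tunable named in this post
AGGRESSIVE = "10 20 30"  # temporary values used on Red Hat 2.1
DEFAULT = "1 50 90"      # Red Hat 2.1 default restored afterwards


def write_setting(value, path=PAGECACHE_PATH):
    """Write a pagecache triple to the tunable (needs root on a real system)."""
    with open(path, "w") as f:
        f.write(value + "\n")


def reclaim_if_low(free_kb, total_kb, threshold=0.02, hold_seconds=180,
                   write=write_setting, sleep=time.sleep):
    """If free RAM is below `threshold` (assumed value) of total RAM, apply
    the aggressive setting for `hold_seconds`, then restore the default.
    Returns True if the setting was changed, False otherwise."""
    if free_kb / total_kb >= threshold:
        return False
    write(AGGRESSIVE)
    try:
        sleep(hold_seconds)
    finally:
        write(DEFAULT)  # restore the default even if the wait is interrupted
    return True
```

The `write` and `sleep` parameters exist only so the logic can be exercised without touching /proc. A real run would call reclaim_if_low(free_kb, total_kb) periodically with values taken from /proc/meminfo.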
The new Red Hat V3 Linux pagecache defaults to 1 15 30, and the parameters are used in a completely different way, so that avenue (as far as I can tell) is closed to me. Further, this version of the kernel completely ignores any different values set in 'pagecache'.
I would appreciate any help/guidance with this issue. There must be a solution, and I am more than willing to learn about it from someone with a greater knowledge of the Linux VM than myself.
As an aside, I have tuned VMs for 'big iron' Unix systems, and have never seen this behaviour - so am quite surprised by it.
Many Thanks in Advance,
Adrian.