Red Hat
This forum is for the discussion of Red Hat Linux.
I get complaints from EDAC in /var/log/messages that seem to indicate there is a problem with my memory. They look like this:
****************************
Oct 27 04:05:53 ct-on-db kernel: EDAC MC0: CE page 0x5820a0, offset 0x480, grain 8, syndrome 0x8fd1, row 2, channel 1, label "": k8_edac
Oct 27 04:05:53 ct-on-db kernel: EDAC MC0: CE - no information available: k8_edac Error Overflow set
Oct 27 04:05:53 ct-on-db kernel: EDAC k8 MC0: extended error code: ECC chipkill x4 error
Oct 27 04:07:19 ct-on-db kernel: EDAC k8 MC0: general bus error: participating processor(local node response), time-out(no timeout) memory transaction type(generic read), mem or i/o(mem access), cache level(
*********************************
These messages run continuously, but they only start a few days after the system has been rebooted, even though all the memory gets used up pretty quickly (within hours).
I've left memtest on the machine overnight and haven't had a single error. The machine has had some 20+ hours running memtest with no complaint.
My question is: what is EDAC doing that memtest doesn't? I'd have thought memtest would hit the same motherboard issues as EDAC (assuming it's not a memory problem but possibly a motherboard problem), so I don't see why EDAC is complaining so regularly while memtest is not.
EDAC might simply be better at finding, and in this case correcting, (possible) memory problems.
This from edac.txt (found in the linux kernel Documentation directory):
Quote:
Detecting CE events, then harvesting those events and reporting them, CAN be a predictor of future UE events. With CE events, the system can continue to operate, but with less safety. Preventive maintenance and proactive part replacement of memory DIMMs exhibiting CEs can reduce the likelihood of the dreaded UE events and system 'panics'.
I would keep an eye on this if I were you, especially if these errors have shown up recently and keep pointing to the same DIMM (MC0/row 2/channel 1).
Have you tried using dmidecode -t memory to get more information?
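Alongside dmidecode, the EDAC counters themselves are exposed through sysfs, so you can watch the per-row/per-channel corrected-error counts grow and tie them to a physical slot. A sketch (exact sysfs paths vary by kernel version; the csrow layout shown here is an assumption matching the older k8_edac-era drivers):

```shell
# Per-controller corrected (CE) and uncorrected (UE) error totals.
for mc in /sys/devices/system/edac/mc/mc*; do
    [ -d "$mc" ] || { echo "no EDAC memory controllers found"; break; }
    echo "$mc: CE=$(cat "$mc/ce_count") UE=$(cat "$mc/ue_count")"
    # Per-row, per-channel counts -- row 2 / channel 1 from the log above
    grep -H . "$mc"/csrow*/ch*_ce_count 2>/dev/null
done
# Cross-reference with the physical DIMM inventory (needs root)
command -v dmidecode >/dev/null && dmidecode -t memory | grep -E 'Locator|Size'
```

If the CE count on one row/channel keeps climbing while the others stay at zero, that points at a specific DIMM rather than the board.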
BTW: which memtest are you using (Memtest86 or Memtest86+)? The latter is supposed to be better equipped to deal with modern hardware.
Quote:
My question is: what is EDAC doing that memtest doesn't? I'd have thought memtest would hit the same motherboard issues as EDAC (assuming it's not a memory problem but possibly a motherboard problem), so I don't see why EDAC is complaining so regularly while memtest is not.
Memtest is a fairly simple test: it just writes different patterns to all memory cells and reads them back, so it can't reproduce every real-life access pattern. You simply have to take Memtest results with this in mind: if Memtest finds errors you definitely have a problem; if it doesn't find errors your machine is likely to be error-free, but there is no guarantee.
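One way to get pattern testing closer to real-world conditions is memtester, which locks a block of RAM and runs write/verify patterns from userspace while the machine is under its normal load (a sketch, assuming the memtester package is installed; it isn't part of a default install):

```shell
# Run memtester's pattern suite (stuck address, walking bits, random
# values, etc.) on a small block of RAM for one loop. Scale the size up
# and run as root (so the block can be mlock'ed) for a meaningful test.
if command -v memtester >/dev/null 2>&1; then
    memtester 4M 1
else
    echo "memtester not installed"
fi
```

Since it runs on the live system, it can sometimes catch problems that only appear when the CPU, chipset and disks are all busy at once, which an idle memtest boot never exercises.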