We have a server at work that is having some serious issues. We can't figure out what is causing the problem, but I am absolutely sure it is a hardware issue. I would like to know if there are any programs out there that would give the hardware a good run-through and possibly identify any errors. I am currently running memtest86 to test for memory-related errors, but I would like something more comprehensive. Does anyone know if something like this exists?
This is a link to the VA Linux burn-in test suite, known to mere mortals
as Cerberus. It's used to make sure that new systems are ready to
go out and face the perils of the cold, hard world. It's made up
of a suite of programs that literally pound the system.
*NOTE* It's very easy to destroy your system with this software.
DO NOT INSTALL AND RUN ON A PRODUCTION SYSTEM. Please, for
the love of Pete, I'm dead serious about this folks. The
tests are meant for hardware with nothing on it yet... you
will lose data. Not might. Will.
The Linux™ Test Project is a joint project with SGI™, IBM®, OSDL™, Bull®, and Wipro Technologies with a goal to deliver test suites to the open source community that validate the reliability, robustness, and stability of Linux. The Linux Test Project is a collection of tools for testing the Linux kernel and related features. http://ltp.sourceforge.net/
Well, that is not exactly what I am looking for. I am looking for something that will load at boot without an operating system. The whole problem is that we can't even seem to get the server to boot correctly.
Can you describe how the system behaves in a little more detail? Does it: fail to boot at all; boot part way then stop; boot normally but the OS won't load; boot normally with the OS loading but then dies mysteriously later, etc. At least from my perspective the description that the machine has serious issues is pretty much on the vague side, and could be anything - a dying component, an overheating problem, a poorly seated cable connection, a BIOS problem, a software problem, etc. FWIW, the first thing I'd check would be to make sure the unit isn't overheating and that all connections are seated properly. If you have a diagnostic disk for your HDD, load up the diag program and verify whether the disks are healthy or not. I realize this post probably won't be much help, but maybe more detail about the behavior of the machine will yield better answers. Good luck with solving the issue whatever it may be. -- J.W.
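For the overheating check, here is a minimal sketch, assuming the box stays up long enough to run Python (for example from a live CD) and that the kernel exposes sensors under /sys/class/thermal. The function names and the 80 C limit are just illustrative choices, not anything from a vendor tool, and some boards expose no thermal zones at all.

```python
from pathlib import Path


def read_thermal_zones(base="/sys/class/thermal"):
    """Return {zone_name: degrees_celsius} for every readable thermal zone."""
    readings = {}
    for zone in sorted(Path(base).glob("thermal_zone*")):
        try:
            # The kernel reports these values in millidegrees Celsius.
            millideg = int((zone / "temp").read_text().strip())
        except (OSError, ValueError):
            continue  # no readable sensor in this zone; skip it
        readings[zone.name] = millideg / 1000.0
    return readings


def hot_zones(readings, limit_c=80.0):
    """Return the zones at or above limit_c (80 is an arbitrary example limit)."""
    return {name: temp for name, temp in readings.items() if temp >= limit_c}


if __name__ == "__main__":
    temps = read_thermal_zones()
    for name, temp in temps.items():
        print(f"{name}: {temp:.1f} C")
    for name, temp in hot_zones(temps).items():
        print(f"WARNING: {name} is at {temp:.1f} C")
```

If nothing is printed, the board probably isn't exposing any zones there, so you'd have to fall back on the BIOS hardware monitor or a hand on the heatsink.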
HA! All of the above! It was never really the same. Sometimes it would boot and work fine for a couple hours and then lock up. Other times, it would just start to boot the OS and lock up or restart. It was really weird.
It ended up being a power supply that was going bad! I should have guessed: power supplies and hard drives are the two most common components to fail.
TuffTEST does a nice check.
There is a free "Lite" version, but it limits the hard drive and memory checks.
Thanks, I am going to try that too. Never know when I might need it.