How do I test hardware to check if it is working properly?

Article ID: 1332 
Last Review: May 16, 2007
APPLIES TO:
  • Parallels Virtuozzo Containers for Linux

RESOLUTION

Sometimes a kernel panic, oops, machine check exception (MCE), or other fatal crash is caused by faulty hardware. This article describes how to test your hardware to check that it is in good shape.

Please note that most of the tests described below can harm your machine if something is wrong with it (e.g. it is overclocked or insufficiently cooled). In general, overclocking is not recommended for production servers.


RAM tests


Faulty Random Access Memory (RAM) can cause very strange system crashes. It is highly recommended to test the system RAM before putting a node into production. Several approaches and tools can be used.


Memtest86 and Memtest86+


Memtest86 is a standalone RAM tester. It can be booted either from a CD or floppy disk, or from a normal Linux boot loader (LILO or GRUB).


Memtest86+ is a forked version of Memtest86 with some features added.

Installation:
You can download either program from its site, http://memtest86.com/ or http://memtest.org/, and install it. One of them may already be part of your Linux distribution.

Usage:
To test a server for faulty RAM, install either Memtest86 or Memtest86+ and reboot into it. Run it for at least a few hours (at least 2-3 iterations); running the tests for 1-2 days is better. If even a single error is reported, you have to replace your RAM modules (or, if your system is overclocked, return it to its normal speed).


Memtester


Memtester is a userspace utility for testing the memory subsystem for faults. Its advantage is that you can test memory without rebooting the server, and you can run other programs alongside it. Its disadvantage is that not all of the memory is tested, because memory already in use by the kernel and other processes is not available to it.


Installation:
Memtester is available at http://pyropus.ca/software/memtester/. To build: download, unpack, and type `make`.

Usage:
Invoke memtester as root, giving the amount of memory to test as an argument, e.g.:

# /usr/sbin/memtester 512M

The more memory you specify, the better.
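
One way to choose the amount is to derive it from the free memory the kernel reports. The sketch below is illustrative only: the helper name memtester_size_mb and the 256 MB default headroom are assumptions, not part of memtester. It only prints a size and does not start any test.

```shell
#!/bin/sh
# Sketch: pick a memtester size from the MemFree value in a
# meminfo-style file. The helper name and the default headroom
# are illustrative assumptions.

# memtester_size_mb [MEMINFO_FILE] [HEADROOM_MB]
# Prints MemFree minus the headroom, in megabytes (never below 0).
memtester_size_mb() {
    meminfo="${1:-/proc/meminfo}"
    headroom_mb="${2:-256}"
    awk -v headroom="$headroom_mb" '
        $1 == "MemFree:" { free_mb = int($2 / 1024) }   # value is in kB
        END {
            size = free_mb - headroom
            if (size < 1) size = 0
            print size
        }' "$meminfo"
}
```

As root, the result could then be passed to memtester, with the optional second argument giving the number of loops, for example: memtester "$(memtester_size_mb)M" 3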


CPU cooling tests


These tests check that your CPU works correctly under the highest possible load and temperature.

Cpuburn


Cpuburn (http://pages.sbcglobal.net/redelm/) is a utility that loads your CPU as heavily as possible. It tests system stability by checking how the CPU and the whole system work under high temperatures.

Installation:
Download the tarball from http://pages.sbcglobal.net/redelm/, untar it, and run it.

Usage:
It is recommended to switch the server to single-user mode and remount all partitions read-only, in case the system hangs.
Run this command for at least 15 minutes:

# burnBX || echo $? &

If you have more than one physical CPU, repeat the command for each additional CPU. If nothing is printed within 15-20 minutes (the command prints an exit code only if burnBX fails) and your system is still responding, you can conclude that the test has passed and kill the process(es):

# killall -TERM burnBX
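
Repeating the command for several CPUs can be scripted. The following is a minimal sketch under stated assumptions: count_cpus and start_burners are hypothetical helper names, and the count is taken from the "processor" entries in /proc/cpuinfo, which are logical CPUs and may outnumber the physical CPUs mentioned above. Pass a smaller number if you prefer one process per physical CPU.

```shell
#!/bin/sh
# Sketch: start one burner process per CPU. Helper names are
# illustrative assumptions, not part of cpuburn.

# count_cpus [CPUINFO_FILE]
# Counts "processor" entries (logical CPUs) in a cpuinfo-style file.
count_cpus() {
    grep -c '^processor' "${1:-/proc/cpuinfo}"
}

# start_burners BURNER_CMD COUNT
# Starts COUNT copies of BURNER_CMD in the background and prints
# one PID per line so they can be killed later.
start_burners() {
    burner="$1"
    n="$2"
    i=0
    while [ "$i" -lt "$n" ]; do
        "$burner" &
        echo $!
        i=$((i + 1))
    done
}
```

For example, start_burners burnBX "$(count_cpus)" starts the processes; killall -TERM burnBX stops them as described above.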


You can also use the burnMMX utility:

# burnMMX J || echo $? &

The cpuburn author says burnMMX is not optimal for AMD processors; use burnBX if you have an AMD processor.

Combined tests


It is also a good idea to run cpuburn and memtester in parallel. This increases the chance of detecting additional errors.
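
A parallel run might be scripted roughly as follows. This is a sketch, not a tested procedure: combined_test is a hypothetical helper that starts a burner in the background, runs a memory test in the foreground, and then stops the burner. It assumes the burner keeps running until it is killed unless it hits an error, as in the cpuburn usage above.

```shell
#!/bin/sh
# Sketch only: combined_test is a hypothetical helper, not part of
# cpuburn or memtester.

# combined_test BURN_CMD MEM_CMD [MEM_ARGS...]
# Returns 0 only if the memory test succeeded and the burner was
# still running when it was stopped.
combined_test() {
    burn_cmd="$1"
    shift
    "$burn_cmd" &                 # CPU burner in the background
    burn_pid=$!
    "$@"                          # memory test in the foreground
    mem_status=$?
    kill -TERM "$burn_pid" 2>/dev/null
    wait "$burn_pid" 2>/dev/null
    burn_status=$?
    echo "memory test exit status: $mem_status"
    # 143 = 128 + 15 (SIGTERM): the burner ran until we stopped it.
    if [ "$burn_status" -eq 143 ]; then
        echo "burner ran until stopped"
    else
        echo "burner exited on its own with status $burn_status"
    fi
    [ "$mem_status" -eq 0 ] && [ "$burn_status" -eq 143 ]
}
```

As root, an invocation could look like: combined_test burnBX memtester 512M 3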

Keywords: memory cpu test memtest86 memtest86+ cpuburn hardware
