The purpose of this post is to reaffirm the confidence in RAID technology that I expressed in the earlier post “RAID”. It is occasioned by my recent plans to write a very different piece.

Background: the Warning Signs

Summers can get pretty hot here in Auckland. The average temperature for this time of year is 24 degrees Celsius (that’s 75 degrees Fahrenheit to North Americans) with 99% humidity so it’s no simple matter to keep a computer cool.

Your computer’s resources are not infinite. Each time a new program is launched it uses a portion of the total, and as we chew through the system’s resources we eventually reach a point where they are no longer sufficient for the proper functioning of the system.

At that point many things can happen, ranging from “nothing” at all to a full-on “Blue Screen of Death” and a forced re-boot.

In most cases no error message is produced, or the message that is produced has nothing to do with the actual problem that may be discovered later. What we notice is that the system starts acting “weird”. Commands don’t get executed, we lose icons, a simple file copy takes 3 times as long as it should … and at some point we make the decision to shut everything down and start again fresh with a reboot.

Some time ago (more than a year) I found that I had to re-boot my systems far more often on hot days than on cool days.

Eventually it dawned on me that at least some of these faults might be the result of overheating.

I asked my wife if we could turn her sewing room into an air-conditioned computer room with hardware racks and a raised floor – but she said “No”. Women – go figure.

My fallback plan “B” called for the installation of more internal fans into my boxes, and while this seemed to alleviate the faults, it also made the room a lot noisier.

In an effort to reduce the office noise levels I had a cabinet built to house the four computers I had at the time. I included several ventilation holes that I hoped would be sufficient.

The result was a large desk-shaped oven. The multiple 6-inch vent holes weren’t nearly enough to extract the trapped heat so it just got hotter and hotter. I had to take off the cabinet doors and even the PC covers and direct a large office fan directly into the cavity to bring the operating temperature down to a safe level. In the end, for a time, the office was hotter and noisier than ever. (more about this in our next article)

I mention my heat problems because for a long time I attributed all of my “unattributable” faults to them. In particular, the one that bugged me the most was the loss of my RAID array.

The first time I noticed it was a bit of a shock given that the drive holds all my data and my “backups” discipline is a bit loose, but after some investigation and a few reboots it became clear that the array hadn’t disappeared, it had only failed to mount.

I’ve configured this PC with 2 drives: the boot drive (“C:\”) is a separate physical HDD, so the system boots up fine. It just has no second drive, which is a virtual drive made up of 4 physical drives combined by RAID into a single “D:\” drive.

No error message was produced and in fact, the report from the HighPoint RAID management system told me that the array was “Normal” apart from the number 2 disk running a bit hot.

In order to recover the folders I had to power down the box and reboot at least once, and sometimes more, to get the virtual “D:\” drive back on line.

I sent off a note to the HighPoint group via their support web page and got an email back saying their support guy was on vacation and gave me an alternate address to contact. No reply came back from the alternate email.

I used the support page to request an update on the status of my fault report and shortly thereafter I got an email saying that my trouble ticket had been updated. I logged back in to the support site only to find that the “update” was my own query asking them for an update.

Around this time I’d been posting my problem to my various forums, and one kind reader wrote to point out that if I really wanted to back up a 1.5TB RAID array, I’d need a 1.5TB backup disk to do it. He was right of course, but it was a depressing kind of right – there is a good measure of fault tolerance built into the RAID software, but it is fault “tolerance”, not fault “proof”. If you lose a second disk before you can replace the first failure, you will lose the array.

The lack of progress on this issue and a growing frustration with the supplier drove me to consider an article on “the failure of RAID technology and its suppliers”. Fortunately a lucky-unlucky break intervened.
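To see why one failure is survivable and two are not, here is a minimal sketch of the XOR parity idea behind RAID 5. It is an illustration only, not HighPoint’s implementation: the block contents and the `xor_blocks` helper are made up for the example, and a real controller rotates parity across all the disks rather than keeping it on one.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte position by byte position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three hypothetical data blocks, one per data disk in a 4-disk array.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)  # stored on the fourth disk in this simplified layout

# One disk fails (the one holding data[1]): XOR the survivors with parity.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])  # True

# A second disk fails before the rebuild: only data[0] and parity remain.
# Their XOR is the two missing blocks mixed together, not either one.
leftover = xor_blocks([data[0], parity])
print(leftover in (data[1], data[2]))  # False
```

With a single parity block there is one equation per stripe, so it can solve for one missing block but not two.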

“We had to destroy the village in order to save it”

Have you ever had an intermittent fault on a system that you couldn’t pinpoint, but you knew it was in a particular subsystem, so you just whacked the subsystem with a hammer to get the whole thing replaced?

Fortunately for me, fate held the hammer this time.

As disk failures go, the “head crash” (see Wikipedia) has to be one of the most dramatic. It’s a catastrophic hardware fault that occurs when a read-write head (which works like the needle on a turntable) comes into contact with the surface of the disk’s platter, which is spinning at 7200 RPM.

On February 1st, a noise that sounded very much like a high-speed dentist’s drill came screaming from my PC. Checking the RAID Management page I could see the number 2 drive had indeed failed. (I’ve put a sound file on my web site if you want to hear it: Barracuda Head Crash.wav)

Having secured a replacement drive (a Western Digital), I had a go at getting it integrated into my system, but the first attempt failed miserably (“no available drive found”). After I figured out that the drive had to be formatted first, it took only a minute to install, and then another 8 hours to rebuild the drive back into the array, restoring my system to peak performance.

Since February 1, and with additional system cooling modifications, both servers have run well although I still can’t close the cabinet doors. My confidence in RAID technology is solidified, and I’m very happy to re-recommend a RAID 5 solution for any situation that requires a large logical drive for optimum disk utilization and data protection with a lower cost of ownership profile than simply doubling the number of disks.
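The cost argument comes down to simple arithmetic. The sketch below compares usable space for the same set of disks under mirroring, single parity (RAID 5) and dual parity (RAID 6). The function name and the 500 GB disk size are assumptions for illustration; the article only says the array is 4 drives and 1.5TB.

```python
def usable_capacity(disks, disk_size_gb, level):
    """Return usable space in GB for identical disks at a given RAID level."""
    if level == "mirror":  # every disk has a twin, so half the space is usable
        if disks < 2 or disks % 2:
            raise ValueError("mirroring needs an even number of disks")
        return disks // 2 * disk_size_gb
    if level == "raid5":   # one disk's worth of space goes to parity
        if disks < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        return (disks - 1) * disk_size_gb
    if level == "raid6":   # two disks' worth of space goes to parity
        if disks < 4:
            raise ValueError("RAID 6 needs at least 4 disks")
        return (disks - 2) * disk_size_gb
    raise ValueError(f"unknown level: {level}")

# Assumed example: four 500 GB disks.
print(usable_capacity(4, 500, "raid5"))   # 1500
print(usable_capacity(4, 500, "mirror"))  # 1000
print(usable_capacity(4, 500, "raid6"))   # 1000
print(usable_capacity(6, 500, "mirror"))  # 1500
```

Under these assumptions, mirroring needs six disks to match what RAID 5 delivers with four.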

One down, One to go

Bolstered by the success with the disk failure, I pushed ahead for a solution to the disappearing drive problem and sent another email off to HighPoint. I got a note back directing me to their Chinese web site to download the latest drivers, BIOS, and web management tools. It sounded like a fob-off to get me off their backs, but those basic steps, even if they never seem to work, must be undertaken in order to move on to the next step.

On March 1, I found the driver, BIOS, and application files on their web site and they were, indeed, different from the ones I’d obtained earlier from their US web site (why didn’t they just update the US files themselves?).

I installed the 4 new files and I guess it must have worked — 4 or 5 reboots since the install and not a missing drive in sight!

I won’t claim a final victory however. As with the currently accepted scientific theory: it’s only true until it’s not.



About Deck Hazen

A computer user since 1976, Deck enjoys testing new software and reconfiguring his equipment to squeeze the most out of it. "Computing has come a long way since those early days," Deck recalls. "I get a real kick out of watching the industry grow - getting paid to write about it is just icing on the cake!"

Comments

  • Guest

    FYI – RAID 6 can tolerate 2 drive failures, although you don’t get as much total space out of the array.

  • Steve

    For your computer cabinet, I would suggest cutting out the bottom except for a 2″ “rim” around the perimeter. Take some “expanded metal” (that’s what we call it in the states – see picture – it comes with different coatings, I prefer vinyl coated). Cut it to the size of the bottom (larger than the hole you cut) and place it on the bottom-inside of the cabinet. You can attach it with coax-cable mounting fasteners but you don’t really need to. If you haven’t already, attach some short legs (2″) to the bottom outside of the cabinet so there is good airflow.

    Now, the important part… You want vertical airflow, from the bottom up, with no sideways airflow. Patch any side holes and re-attach the doors.

    And… The bad news, you will have to install fans. I normally go to a used electronics store; mine is called “The Ax Man” :-) Those places have fans that have been removed from larger electronic equipment and are usually priced very reasonably. Those stores also allow you to “test” the fans, so I would pick out some 6″ ones that are the same exact model and make sure that they run on your house current (are you 220v?). Low speed fans will be quieter so look at the RPMs. Test them at the store, run a couple at a time, and pay attention to the sound that TWO or more make together. There should be a consistent harmonic sound – not a varying in-and-out or louder-then-softer sound. If they don’t sound steady together, then one of the fans is beginning to wear out. Swap fans until you have a set of consistent ones.

    Now for the number of fans. As an example, let’s assume that your cabinet is 20″ x 30″… That makes your opening at the bottom 600 sq inches (just ignore the lip). For good ventilation, you should have a 4-to-1 ratio between the larger bottom area and the combined area of the fans. Six-inch fans have an area of about 28 sq inches (pi times radius squared), so you would need six fans to cover one-quarter of the bottom area (150 sq inches).

    Mount those babies on the underside of the top, with accompanying 6″ holes, and connect them directly to the main power cord for your cabinet – WITHOUT A SWITCH – so you never have to worry about accidentally turning them off. It might be wise to buy an extra fan, but for the most part those fans are rated for continuous operation at 90 degrees Celsius and usually last for years, depending on how long they were used in their original capacity.
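Steve’s fan arithmetic can be sketched in a few lines for other cabinet sizes. The function name and parameters are made up for the example, and the 4-to-1 ratio is his rule of thumb, not an engineering standard.

```python
import math

def fans_needed(width_in, depth_in, fan_diameter_in, ratio=4):
    """Count the fans whose combined area is at least 1/ratio of the bottom opening."""
    opening_area = width_in * depth_in                # sq inches, ignoring the rim
    fan_area = math.pi * (fan_diameter_in / 2) ** 2   # pi times radius squared
    target_area = opening_area / ratio
    return math.ceil(target_area / fan_area)

# The example from the comment: a 20" x 30" cabinet with 6" fans.
print(fans_needed(20, 30, 6))  # 6
```

The result is rounded up because a partial fan is not an option: 150 sq inches divided by about 28.3 sq inches per fan is roughly 5.3.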

