The Mudcat Café TM
Thread #75405 Message #1324888
Posted By: JohnInKansas
12-Nov-04 - 03:25 PM
Thread Name: Tech: Ever Lost a Hard Drive?
Subject: RE: Tech: Ever Lost a Hard Drive?
Many years ago, in the "good old days," one could get real tech specs on hard drives. Now, if you ask for "the specs," all the ad-wonks (marketing) will permit anyone to tell you is:
"This * is VERY GOOD! IT WILL IMPROVE YOUR SEX LIFE. YOU NEED TO BUY IT NOW! CLICK HERE TO ORDER SEVERAL!"
It appears that the "standard" now, for the most popular desktop drives, is either 20,000 or 30,000 hours MTBF. MTBF = Mean Time Between Failures, and is a "statistical" indication of how reliable a drive is. A standard "Man Year" (probably should say "Person Year" I suppose, but nobody would recognize the acronym) for 8 hours per day, 5 days per week, for a year is 2088 hours, so the "average" time a drive is expected to last is roughly 10 to 14 years in "office use." Of course, if you use your machine 24-7 (8,760 hours per year), as many of us do, that drops to somewhere between a little over 2 and a little over 3 years.
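For anyone who wants to check the arithmetic, here is a minimal Python sketch. The 2088-hour "office year" and the two MTBF figures come from the post; the function name and the 8,760-hour figure for an always-on machine (24 x 365) are my own additions for illustration.

```python
HOURS_OFFICE_YEAR = 2088      # 8 h/day, 5 days/week, as quoted in the post
HOURS_FULL_YEAR = 24 * 365    # 8760 h for a machine left on 24-7 (assumed)

def mtbf_to_years(mtbf_hours, hours_per_year):
    """Convert an MTBF in hours to years at a given yearly usage."""
    return mtbf_hours / hours_per_year

for mtbf in (20_000, 30_000):
    office = mtbf_to_years(mtbf, HOURS_OFFICE_YEAR)
    always_on = mtbf_to_years(mtbf, HOURS_FULL_YEAR)
    print(f"{mtbf} h MTBF: {office:.1f} years office use, "
          f"{always_on:.1f} years 24-7")
```

This only converts hours to calendar time; it says nothing about when any particular drive will actually quit.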
Because of peculiarities in the way the MTBF is calculated, for well-controlled manufacturing processes, it's not unusual for 80% of devices to exceed the MTBF figure. The number gets knocked down because a few "early failures" are usually "very early." Often, about half the devices may last 2X the MTBF, but this depends on what kinds of failures occur. Note that the "average time a drive will run" is subtly, but significantly, different from the "time the average drive will run."
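A toy illustration of how a handful of very early failures drags the average down, sketched in Python. The batch of 100 drives and their lifetimes are invented purely for illustration, and a plain arithmetic mean stands in for MTBF here; real manufacturers' calculations are more involved.

```python
from statistics import mean, median

# Hypothetical batch of 100 drives: 20 die very early, 80 run a long time.
# All numbers are made up for illustration only.
lifetimes = [100] * 20 + [25_000] * 80   # hours

mtbf = mean(lifetimes)                   # simple average life of the batch
share_above = sum(1 for h in lifetimes if h > mtbf) / len(lifetimes)

print(f"average life of the batch: {mtbf:.0f} h")
print(f"life of the middle drive:  {median(lifetimes):.0f} h")
print(f"share of drives outlasting the average: {share_above:.0%}")
```

In this made-up batch the average comes out near 20,000 hours even though four drives out of five run 25,000, which is the difference between the "average time a drive will run" and the "time the average drive will run."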
Nearly all drive makers have a variety of lines, and a typical MTBF for an "economy" line may be only 10,000 hours, or less. Some makers produce "High Reliability" lines for which (if you can get past Marketing) you may get quoted MTBF numbers as high as 100,000 hours. Somewhat paradoxically, the higher the reliability, the more likely that nearly all the devices will fail at exactly the rated MTBF life, or very near it. The "Deacon's Buggy" phenomenon.
In the "good old days" the largest HD one could get was somewhere around 30 or 40 MB. Most people replaced machines about every 5 years to keep up with the evolution in OS and application software, so only a few "lucky ones" managed to "wear out" a drive before they replaced a machine and got new ones.
Even then, most HD failures had "causes." The old drives were very susceptible to "head crashes" where the read/write head plowed into the disk surface if you moved the machine with the drive running. Those almost never happen in modern drives due to changes in head design. Failures due to congealed grease, then as now, are ALWAYS due to overheating of the drive. Better cooling in cases has generally taken care of this, but you still have to keep the fans running and keep the case clean with unobstructed air flow. (And if you've failed a drive for this reason, there's probably a bunch of other "cooked" components in the machine.) Rude things that people do to their machines can break the hermetic seal on the drive, which almost guarantees failure.
Probably the most common cause of "drive failure" is a failure in the electronics that are part of the drive. While there are probably some random component failures, just based on the number of components, most "electronic" failures also have causes. The accumulated effects of running at too high a temperature come high on the list. Line surges, lightning strikes, and other "external" insults can do it, although modems and video drivers now seem to be the most commonly affected "surge kill" components.
The "two drives are better than one" philosophy is common, but has to be taken with a "grain of salt." With two drives, statistically, it's about TWICE AS LIKELY that one of them will fail. The "cost of failure" is reduced because you only lose half your stuff. You don't get to choose which half. Putting all your data on a separate drive gives no significant improvement in "data security" because the odds that that drive will fail are, for comparable drives, exactly the SAME as the odds that the other one will fail. To actually make a significant improvement this way, ALL of the "safe" stuff must be on BOTH DRIVES. That's why RAID arrays were invented.
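The odds argument can be put in numbers with a short Python sketch. It assumes each drive fails independently with the same probability p over whatever period you care about; the 5% figure and the function names are hypothetical, chosen only to make the comparison concrete.

```python
def p_at_least_one_fails(p, n=2):
    """Chance that one or more of n independent drives fails."""
    return 1 - (1 - p) ** n

def p_all_fail(p, n=2):
    """Chance that all n independent drives fail (data on every drive lost)."""
    return p ** n

p = 0.05  # assumed chance a single drive fails in the period (illustrative)

print(f"single drive fails:           {p:.4f}")
print(f"at least one of two fails:    {p_at_least_one_fails(p):.4f}")
print(f"both drives fail (mirrored):  {p_all_fail(p):.4f}")
```

With these assumed numbers, having two drives nearly doubles the chance of seeing a failure, while keeping the same data on both cuts the chance of losing it to a small fraction of the single-drive figure. Drives sharing one case and one power supply are not truly independent, so treat the last number as optimistic.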
The most practical way to assure data safety is to make frequent backups to reliable permanent media OFF THE MACHINE.