Google publishes "Failure Trends in a Large Disk Drive Population"

Status
Not open for further replies.

Rick

Staff
A pretty interesting read that I found over at Slashdot.

http://labs.google.com/papers/disk_failures.pdf

Google Inc. said:
Overall, we expected to notice a very strong and consistent correlation between high utilization and higher failure rates. However our results appear to paint a more complex picture. First, only very young and very old age groups appear to show the expected behavior. After the first year, the AFR of high utilization drives is at most moderately higher than that of low utilization drives. The three-year group in fact appears to have the opposite of the expected behavior, with low utilization drives having slightly higher failure rates than high utilization ones.
One possible explanation for this behavior is the survival of the fittest theory. It is possible that the failure modes that are associated with higher utilization are more prominent early in the drive’s lifetime. If that is the case, the drives that survive the infant mortality phase are the least susceptible to that failure mode, and result in a population that is more robust with respect to variations in utilization levels.
Another possible explanation is that previous observations of high correlation between utilization and failures has been based on extrapolations from manufacturers’ accelerated life experiments. Those experiments are likely to better model early life failure characteristics, and as such they agree with the trend we observe for the young age groups. It is possible, however, that longer term population studies could uncover a less pronounced effect later in a drive’s lifetime.
 
interesting paper on SMART drives.

Their findings are interesting:
Our results confirm the findings of previous smaller population studies that suggest that some of the SMART parameters are well-correlated with higher failure probabilities. We find, for example, that after their first scan error, drives are 39 times more likely to fail within 60 days than drives with no such errors. First errors in reallocations, offline reallocations, and probational counts are also strongly correlated to higher failure probabilities. Despite those strong correlations, we find that failure prediction models based on SMART parameters alone are likely to be severely limited in their prediction accuracy, given that a large fraction of our failed drives have shown no SMART error signals whatsoever.
Thanks, Rick.
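
To put a rough number on why SMART-only prediction is limited, here is a minimal Python sketch. It assumes a predictor that can only flag a drive once that drive has shown at least one SMART error. The function name and the example fraction are hypothetical and are not figures from the paper.

```python
def max_recall(fraction_failed_without_signal: float) -> float:
    """Upper bound on the share of failures a SMART-only predictor can catch.

    A drive that fails without ever showing a SMART error can never be
    flagged in advance, so those failures are always missed.
    """
    if not 0.0 <= fraction_failed_without_signal <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    return 1.0 - fraction_failed_without_signal


# Hypothetical input: suppose half of the failed drives showed no SMART errors.
print(max_recall(0.5))
```

However strong the correlation is for the drives that do show errors, the bound only depends on how many failed drives stayed silent.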
 
Don't put all your eggs in one basket.
I would not use any drive over 200GB.
I've seen and heard of too many failures for drives over that size.



PS
Got a new Cherry keyboard, man, it's great.
 
Well, supposing 200GB drives fail twice as often as 100GB drives, then it really doesn't matter which ones I use :p
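
One way to read the arithmetic behind that quip, as a small Python sketch: for the same 200GB of storage you need either two 100GB drives or one 200GB drive, and the expected number of failures comes out the same. The failure probability used here is a made-up value for illustration, and the function name is hypothetical.

```python
import math


def expected_failures(num_drives: int, failure_prob_per_drive: float) -> float:
    """Expected number of failed drives over some fixed period."""
    return num_drives * failure_prob_per_drive


p = 0.05  # hypothetical failure probability of one 100GB drive

two_small = expected_failures(2, p)      # two 100GB drives
one_large = expected_failures(1, 2 * p)  # one 200GB drive failing twice as often

print(two_small, one_large)
print(math.isclose(two_small, one_large))
```

Note that this only counts failures. A failed 200GB drive takes twice as much data with it as a failed 100GB drive, so the "eggs in one basket" point still stands.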
 
Just for the record, I am running two 300GB drives and one 400GB, and have a 500GB on order, and I have yet to have one fail. This is over a 4-year period. Of course, all of these are used in a home capacity, meaning not heavily, other than the boot drive. The whole computer is only booted at most once a week, or when installing software or reinstalling the operating system.
 