IBM to build the biggest data drive ever

On August 29, 2011, 2:30 PM

IBM is in the process of building the largest data repository ever constructed, with a combined storage capacity of 120 petabytes. The facility is being developed at the company’s Almaden, California research center.

Technology Review reports that IBM is building the record-breaking storage system for an unnamed client that needs a supercomputer capable of real-world phenomena simulation, such as those used to model weather and climate.

120 petabytes is an enormous amount of storage. To break things down, 1024 gigabytes equal one terabyte, and 1024 terabytes equal one petabyte. By that math, 120 petabytes is equivalent to 125,829,120 gigabytes, or roughly 126 million.
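For anyone who wants to check the conversion, here is a minimal Python sketch. The function name and the `step` parameter are illustrative choices, not anything from IBM. It prints the result for the 1024-based steps described above and, for comparison, for the 1000-based steps that drive makers commonly use.

```python
# Sketch: convert 120 petabytes to gigabytes under two unit conventions.
PETABYTES = 120


def petabytes_to_gigabytes(petabytes, step):
    """Convert petabytes to gigabytes; `step` is the ratio between adjacent units."""
    # Petabytes -> terabytes -> gigabytes is two steps down the scale.
    return petabytes * step ** 2


binary_gb = petabytes_to_gigabytes(PETABYTES, 1024)
decimal_gb = petabytes_to_gigabytes(PETABYTES, 1000)

print(f"1024-based: {binary_gb:,} GB")
print(f"1000-based: {decimal_gb:,} GB")
```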

To put that into perspective, the system could store 24 billion five-megabyte MP3 files. Furthermore, about 60 backup copies of the Internet Archive’s Wayback Machine could be stored, with each copy containing 150 billion web pages. In total, the system is expected to hold roughly 1 trillion files.
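The MP3 figure is simple division, sketched below in Python. This is a rough estimate only: it assumes every file is exactly five megabytes and ignores file system overhead and redundant copies, and the function name is made up for illustration.

```python
# Sketch: how many fixed-size files fit in a given capacity.
CAPACITY_PB = 120
MP3_SIZE_MB = 5


def files_that_fit(capacity_pb, file_mb, step):
    """Return how many files of `file_mb` megabytes fit in `capacity_pb` petabytes."""
    # Petabytes -> terabytes -> gigabytes -> megabytes is three steps.
    capacity_mb = capacity_pb * step ** 3
    return capacity_mb // file_mb


decimal_count = files_that_fit(CAPACITY_PB, MP3_SIZE_MB, 1000)
binary_count = files_that_fit(CAPACITY_PB, MP3_SIZE_MB, 1024)

print(f"1000-based units: {decimal_count:,} files")
print(f"1024-based units: {binary_count:,} files")
```

With 1000-based units the result is exactly 24 billion; with 1024-based units it comes out closer to 25.8 billion.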

IBM will use 200,000 conventional hard drives to create the data container, which will be about 10 times larger than any previous effort. With so many disks in the array, it’s inevitable that drives will fail, perhaps on a semi-regular basis. IBM is preparing for such a scenario by storing multiple copies of data on different disks as well as employing new methods to keep the supercomputer running at almost full speed should multiple drives expire. According to director of storage research and project leader Bruce Hillsberg, the system should not lose any data for a million years.
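To get a feel for the scale of the failure problem, here is a back-of-the-envelope Python sketch. The annual failure rate is a hypothetical placeholder, not a figure from IBM or the article, and the per-drive number assumes the 120 petabytes is the combined capacity of all 200,000 drives, counted in 1000-based units.

```python
# Sketch: rough per-drive capacity and expected drive failures.
TOTAL_PB = 120
DRIVES = 200_000
ASSUMED_AFR = 0.03  # hypothetical annual failure rate per drive (assumption)

# Average capacity per drive if 120 PB is the sum across all drives.
gb_per_drive = TOTAL_PB * 1000 ** 2 / DRIVES

# Expected failures, treating every drive as equally likely to fail.
failures_per_year = DRIVES * ASSUMED_AFR
failures_per_day = failures_per_year / 365

print(f"Average capacity per drive: {gb_per_drive:.0f} GB")
print(f"Expected failures per year: {failures_per_year:.0f}")
print(f"Expected failures per day:  {failures_per_day:.1f}")
```

Under that assumed rate, more than a dozen drives would fail on a typical day, which is why the design has to tolerate failures rather than try to avoid them.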




User Comments: 14

H3llion, TechSpot Paladin, said:

now that's what I call an ultimate machine for p0rn

Win7Dev said:

Imagine how fast a network you would need to handle that much data. A few high-end routers and switches aren't going to cut it for this massive storage center. I would think you'd need a really good stock of RAID cards for this array, and an even larger supply of HDDs.

Guest said:

Wow, replacing failing drives is going to be a full-time position. I wonder if they are hiring.

gwailo247, TechSpot Chancellor, said:

Someone wanted to have the entire internet available in offline mode.

H3llion, TechSpot Paladin, said:

gwailo247 said:

Someone wanted to have the entire internet available in offline mode.

Haha, always handy in case aliens attack us! :P Or whatnot...

This isn't impressive, IBM. Go and do that with SSDs; once you get 120 petabytes' worth of SSDs, then come back and show off... damned

Jokes :P

Guest said:

How can someone do a backup of this???

war59312 said:

Guest said:

How can someone do a backup of this???

No need though.

All the data itself is replicated throughout the system. So basically it's self-sustaining.

ihaveaname said:

artix said:

now that's what I call an ultimate machine for p0rn

This comment made me happy.

aj_the_kidd said:

The maintenance bill for this beast would be gigantic; it must be a government-sponsored project.

artix said:

now that's what I call an ultimate machine for p0rn

I thought that as well

Archean, TechSpot Paladin, said:

artix said:

now that's what I call an ultimate machine for p0rn

+ 2

Someone wanted to have the entire internet available in offline mode.

Especially if artix's reasoning is taken into account

aspleme said:

Close... 1024*1024*120 = 125,829,120 Gigabytes... although I would be curious to find out if that is before or after formatting... or if their storage space considers the fact that drives tend to be sold reporting size based on 1000 bytes to a kilobyte, etc, not the 1024 actually measured, thus selling a drive with a higher number of gigabytes than it actually has.

If the drive sizes are actually based on 1024 for each step, we have

135,107,988,821,114,880 bytes

vs

120,000,000,000,000,000 bytes

if we say 120 petabytes on the 1000 base for each step.

On this scale, that is

15,107,988,821,114,880 bytes missing from the reported size.

or about 13.42 petabytes... missing.

aj_the_kidd said:

aspleme said:

Close... 1024*1024*120 = 125,829,120 Gigabytes... although I would be curious to find out if that is before or after formatting... or if their storage space considers the fact that drives tend to be sold reporting size based on 1000 bytes to a kilobyte, etc, not the 1024 actually measured, thus selling a drive with a higher number of gigabytes than it actually has.

If the drive sizes are actually based on 1024 for each step, we have

135,107,988,821,114,880 bytes

vs

120,000,000,000,000,000 bytes

if we say 120 petabytes on the 1000 base for each step.

On this scale, that is

15,107,988,821,114,880 bytes missing from the reported size.

or about 13.42 petabytes... missing.

OK professor, any which way you do the math, it's still a bucket load. Plus, "the system should not lose any data for a million years": now that's impressive. I'd be interested to see how they came up with that figure.

grvalderrama said:

aj_the_kidd said:

plus "the system should not lose any data for a million years": now that's impressive.

Why would it last a million years, when I'm almost certain that in no more than 100 years someone will invent a new drive that stores more data, faster, in a safer way, and so on...

Guest said:

How many songs will it hold?
