Lost 100GB of data on a Promise SuperTrak SX6000 RAID-5 array!

Status
Not open for further replies.

Theo343

Has anyone experienced this on this IDE RAID controller (128MB cache)? I have a 1TB RAID-5 array (6 x 200GB Seagate Barracuda) and it's been up for about a month.

One time after a reboot it started hanging on initializing the board. I got a bit worried that it was initializing the array, so I pressed Esc and proceeded to boot W2K3.

A week later I set up a weekly synch of the array, as recommended in the maintenance chapter of the manual. It started yesterday evening and took a heap of time.

Today around 12:30 I had to shut it down to retrieve a 250GB SATA HDD that was mounted as a stand-alone disk on the server for moving data, before delivering it for a warranty exchange. Afterwards I started the server, went out and came back 6 hours later.

The synch was at 89%, and a worrisome popup message told me the drive letter that holds the whole RAID-5 array was corrupt and that I needed to run chkdsk on it.

I let the synch (not the same as initializing the array, big difference ;)) complete and then ran chkdsk d: /f /x, and it started correcting a heap of records and recovering a lot of files.

At the end I had lost 100GB of data and I was almost paralyzed... this wasn't supposed to happen on a very expensive RAID controller.

Does anyone have a real clue what happened here and how I can prevent it from happening again? I hadn't gotten around to getting a good backup solution running yet, and it seems I've misplaced my trust in this RAID controller.

I have not yet written any new data to the RAID volume, but I guess chkdsk did while it was running. Any advice on recovering my data and on what might have happened?
 
How many disks do you have in the R5 array?

This may be a difficult fix. The R5 can recover if one of the HDDs goes south; it can rebuild the data. But in this case, chkdsk would have looked for "data errors", which would have ultimately spanned the whole thing. Meaning your fix may be to use data recovery software. Some simple programs such as these come to mind:

Recover My Files - http://www.recovermyfiles.com/

PC Inspector File Recovery - http://www.pcinspector.de/file_recovery/uk/welcome.htm

Easeus Recovery software - http://www.easeus.com/

I can't say off the top of my head how well they handle large volumes like 1TB or RAID setups. But I use them for recovering data on single HDDs all the time.

I'm assuming, though, that the RAID itself is intact? And that it hasn't been broken into individual disks? If it has, it needs to be rebuilt right quick. But I'm no expert on RAID for sure.
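
For anyone wondering how the single-disk rebuild mentioned above works in principle, here is a minimal Python sketch. It assumes plain XOR parity over one stripe of five data blocks plus one parity block, matching a 6-disk array. The block contents and function names are made up for illustration, and the SX6000's real on-disk layout is not documented anywhere in this thread.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equally sized byte blocks together, byte position by byte position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(stripe, missing_index):
    """Recompute the block at missing_index from all surviving blocks."""
    survivors = [blk for i, blk in enumerate(stripe) if i != missing_index]
    return xor_blocks(survivors)


# Five illustrative data blocks plus one parity block (hypothetical contents).
data = [bytes([value] * 8) for value in (0x11, 0x22, 0x33, 0x44, 0x55)]
parity = xor_blocks(data)
stripe = data + [parity]

# Pretend disk 2 failed: its block is recomputed from the other five.
recovered = rebuild_missing(stripe, 2)
print(recovered == data[2])
```

Note what this does and does not cover: parity lets the array recompute a block that is missing. If bad data is written to the array, the parity is computed over that bad data too, so a rebuild cannot undo file system corruption.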

Good luck
 
The RAID array was intact and it consists of 6 disks. It was only the NTFS volume within that was corrupt.

So what you're saying is that I could have done a rebuild stunt if I hadn't run chkdsk? I just wonder what made the NTFS volume corrupt. If it was the synch operation, it seems pretty messed up that it could happen.

What's strange in the RAID controller log is that it says it started a synch on the array yesterday... and today after boot it says it started a synch on one drive in the array? Sounds pretty weird to me, if it's not talking about 1 stripe disk, which isn't the case on RAID-5 I guess.

I'll provide the logs as well.

This is also some of the info that chkdsk provided...
Code:
Recovering orphaned file Setup.ini (44234) into directory file 44226.
Recovering orphaned file setup.inx (44235) into directory file 44226.
Recovering orphaned file setup.iss (44236) into directory file 44226.
Recovering orphaned file Setup16.bmp (44237) into directory file 44226.
Recovering orphaned file Ethernet (44238) into directory file 43844.
Recovering orphaned file bdco1.dll (44239) into directory file 44238.
CHKDSK is verifying security descriptors (stage 3 of 3)...
Security descriptor verification completed.
Inserting data attribute into file 43701.
Correcting errors in the uppercase file.
Correcting errors in the master file table's (MFT) BITMAP attribute.
Correcting errors in the Volume Bitmap.
Windows has made corrections to the file system.

 976559188 KB total disk space.
 110102696 KB in 41508 files.
     33784 KB in 2714 indexes.
         0 KB in bad sectors.
    294816 KB in use by the system.
     65536 KB occupied by the log file.
 866127892 KB available on disk.

      4096 bytes in each allocation unit.
 244139797 total allocation units on disk.
 216531973 allocation units available on disk.
Should I use one or all of the recovery utils?
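
As a sanity check on the chkdsk summary above, the numbers can be cross-checked with a few lines of Python. This is only arithmetic on the figures in the pasted output; the variable names are mine, and the comment about the log file is an inference from the fact that the totals balance without it.

```python
# Figures copied from the chkdsk summary above.
BYTES_PER_UNIT = 4096
total_units = 244_139_797
free_units = 216_531_973

files_kb = 110_102_696
indexes_kb = 33_784
system_kb = 294_816  # the 65536 KB log file appears to be counted in here
reported_total_kb = 976_559_188
reported_free_kb = 866_127_892

kb_per_unit = BYTES_PER_UNIT // 1024
total_kb = total_units * kb_per_unit
free_kb = free_units * kb_per_unit
used_kb = files_kb + indexes_kb + system_kb

# Each check compares a value derived here with what chkdsk reported.
print(total_kb == reported_total_kb)
print(free_kb == reported_free_kb)
print(used_kb + free_kb == reported_total_kb)
print(f"data in files: {files_kb / 1024**2:.1f} GiB")
```

All three checks come out true, so after the repair the volume bookkeeping is internally consistent, with roughly 105 GiB accounted for in files. That only says the file system structures add up; it says nothing about whether the file contents are valid.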
 
I don't know anything about arrays
but if all that was tampered with was the MFT or file system
a good software recovery program may get back most, not all
from what I have read Ext2 Installable File System for raid
 
Seems my Firefox accidentally downloaded a 10MB file to the volume; I quickly changed that. Hope not too much damage was done.

What's the best recovery program to go with? Is there any freeware program that might be good enough? I'm running a demo of the last one mentioned above now, but it takes 22 hours to scan the volume and the demo doesn't recover anything.

But what would be the best program to use for a one-timer, and what program should I buy for future events?

Another clip, from the start of the chkdsk run.
Code:
Event Type:	Information
Event Source:	Chkdsk
Event Category:	None
Event ID:	26180
Date:		23.11.2005
Time:		21:21:26
User:		N/A
Computer:	******
Description:
Checking file system on D:
The type of the file system is NTFS.
Volume dismounted.  All opened handles to this volume are now invalid.
Volume label is DATA.
Deleted corrupt attribute list entry
with type code 128 in file 43701.
Unable to find child frs 0x153ad with sequence number 0x4.
The multi-sector header signature for VCN 0x7 of index $O
in file 0x19 is incorrect.
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Correcting error in index $O for file 25.
The index bitmap $O in file 0x19 is incorrect.
Correcting error in index $O for file 25.
The down pointer of current index entry with length 0x18 is invalid.
00 00 00 00 00 00 00 00 18 00 00 00 03 00 00 00  ................
ff ff ff ff ff ff ff ff 01 02 00 00 00 00 00 00  ÿÿÿÿÿÿÿÿ........
00 00 00 00 f8 c6 6a 57 95 d1 c5 01 ff ff ff ff  ....øÆjW•Ñ..ÿÿÿÿ
Sorting index $O in file 25.
The object id in file 0x3 does not appear in the object
id index in file 0x19.
And what the event log said before running the chkdsk.
Code:
Event Type:	Error
Event Source:	Ntfs
Event Category:	Disk 
Event ID:	55
Date:		23.11.2005
Time:		16:33:48
User:		N/A
Computer:	******
Description:
The file system structure on the disk is corrupt and unusable. Please run the chkdsk utility on the volume DATA.

For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.
Data:
0000: 0c 00 04 00 02 00 52 00   ......R.
0008: 02 00 00 00 37 00 04 c0   ....7..À
0010: 00 00 00 00 02 01 00 c0   .......À
0018: 00 00 00 00 00 00 00 00   ........
0020: 00 00 00 00 00 00 00 00   ........
0028: 17 0b 14 00               ....
 
not sure if there is one program that does it all
I have 3 now
O&O disk recovery
and File Scavenger
the other is for MS Office corruption and recovery, can't remember where I put it.
if the demo you are using finds/sees the files it more than likely will recover them.
the best recovery software out there is from people who only write that type of program or just special disk utilities
like O&O, Acronis
a company called ActionFront recovery is one of the top pros in the business
if you are really serious give 'em a call
 
Seems like the "gods of computer storage and emotional retribution" weren't finished with me. Today I discovered that the 150GB of data that was restored by chkdsk was only "ghost files". Just size and name, no content.

So there went every file I had, right out the window. This makes it kind of hard to trust this so-called expensive RAID controller with any form of data. I've never had such bad luck on any single drive in my entire life, and it's a RAID-5 array that kills my data in the end, hmm.

I found about three working files, and I couldn't be bothered checking every file, so I deleted the whole bunch (emotionally worn out by the whole thing)... and from now on I will never ever again trust my Promise SuperTrak SX6000. It's burned into my memory as my worst computer experience of the last 20 years.

My slogan for Promise will forever be "Promise - don't make _promises_ you can't keep"

I've been working professionally with servers and SCSI RAID for 15 years and I've never encountered an incident like this before. I guess that's why it hit me so hard as well; I didn't expect it to be a possibility on a controller of this quality.
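
Checking every recovered file by hand is hopeless at this scale, but a script can at least flag the obvious "ghost files". Below is a rough Python sketch that assumes a ghost file is one whose content is nothing but zero bytes. That is only a guess at what "size and name, no content" looked like on disk; files filled with other garbage would not be caught. The function names and the D:\ path in the usage note are placeholders.

```python
import os

CHUNK_SIZE = 1024 * 1024  # read 1 MiB at a time so large files fit in memory


def is_zero_filled(path, chunk_size=CHUNK_SIZE):
    """Return True if the file is non-empty and every byte in it is 0x00."""
    if os.path.getsize(path) == 0:
        return False
    with open(path, "rb") as handle:
        while chunk := handle.read(chunk_size):
            if chunk.count(0) != len(chunk):
                return False
    return True


def find_ghost_files(root):
    """Walk a directory tree and return a sorted list of zero-filled files."""
    ghosts = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            try:
                if is_zero_filled(full_path):
                    ghosts.append(full_path)
            except OSError:
                # Unreadable files are skipped here; a real run might log them.
                pass
    return sorted(ghosts)
```

Something like `find_ghost_files("D:\\")` would list the suspects, and whatever is not on the list is worth opening before deleting the whole bunch. The scan only reads, so it does not write anything to the damaged volume.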
 
I'll try to remember that model as one to stay away from.

Your first chkdsk post shows NO bad sectors, that's the first good sign. Though I think I would doubt plain chkdsk's ability to accurately check such a large volume on RAID.

I would think that the RAID utility itself, perhaps Windows based, has tools for checking the RAID and so on. But anyway, guess what's done is done.

Frankly I'm wondering if there was nothing wrong with the RAID, and chkdsk screwed it all up? Or if there was something wrong, chkdsk didn't fix it properly.

Of the top two programs I posted, only one will let you save the files when they're found. But it has a very odd bug. For example, I went to recover data off a floppy disk, which of course is 1.44MB. By the end of the recovery process, it had recovered 36MB of data?!?! I was trying to recover a 15GB HDD, and it had recovered 23GB before I just stopped it. Don't know what that's about.
The 3rd link is a program I've never used, but got pretty high reviews.

It wouldn't hurt to just run the program, just for fun and see if it finds any of your files. That really sucks. Guess even a RAID5 doesn't guarantee safe data. What a shame.
 
I tried running the last data recovery program at the highest scan level on the NTFS volume (basic disk). It didn't find anything relevant to restore :(.

Does anyone know of any large external file storage one can use for backing up data offsite?

EDIT:
What I'm thinking now is that I must have made some horrific mistake to provoke this.

1. I have pressed Esc during the controller BIOS boot many times to stop Promise from initializing the array, because it took about 30 seconds and I was in a hurry. But why allow that if this action would corrupt it?

2. I downed the server while it might have been working on a synch, not sure, to take out a SATA drive (not in the array) connected to the motherboard. But it seems far-fetched that interrupting a synch would corrupt an array?

3. Maybe the array was confused by me doing these things, and NTFS fooled me into thinking the NTFS volume was corrupt, and I rushed into a chkdsk instead of giving myself and the system a couple of controlled reboots and checking around a bit first...
 
I think it's safe to say that none of your physical hard drives are bad, because there are no bad sectors. Which means the problem lies in the logical layer.

Even if the few things you did caused anything, it's also safe to say that it should not have happened. Meaning the internal workings of the array controller were not very robust. Which confirms your thinking that the controller is not very good.

I think the best data backup is two-part. The first part is using a straight mirror. While this costs the most and doesn't increase your storage capacity, it is the safest, I think. Because any physical problem on either drive means you have an identical copy on the other.
The second part is offsite storage, in case a fire or something destroys both drives at once, or a nasty virus gets "mirrored" to both drives. The mirror is constant, but offsite may be once or twice a week, and so it gives you a buffer should you get attacked in software.

There are many offsite (online) storage places you can use. But most all have monthly fees, which could be fairly high for the amount of data you are talking about.
Another way would be to buy some hot-swappable large drives and bays for your PC. That is, if you have 5 1/4" bays left. With these you could just plug one in, do a synch of data, then pull it out and take it to work or somewhere. Then do a backup once a week or whatever.

Then again, if you need one TB of storage, single 1TB HDDs are going to hit the market really soon, if not already. I think I read of one that Hitachi or maybe IBM was already putting out. I don't remember.
 
By using a large drive and doing a synch, I suppose you mean a software copy, a backup with differential or incremental synch, and not a RAID-level synch.

Would it be possible to mirror a RAID-5 array? Like RAID level 51 or 15 or something. I know that's up to the controller really, but in theory? What if I had one 1TB drive and said I wanted to mirror the RAID-5 array with this drive? Then after the synch I could unplug the 1TB drive and plug it in a week later to resynch. And if data became corrupt on the R5 array I could do a trick on the controller (set the R5 offline, insert the 1TB as online and reset the R5; I don't remember the routine right now) and synch back from the 1TB drive?

If not, I could of course use the drive as an ordinary FAT32 or NTFS volume and do a differential WinRAR archive copy backup with checksum to a FAT32 volume on the large drive.

So what I'm thinking right now is using maybe 2 300GB SATA drives in the server for backup purposes only. I could make them hot-swap enabled so that I can store them outside the house. Then I'd use WinRAR with compression and file split at 50MB or 100MB and run an updating differential backup once or twice a week.

If WinRAR is not up to the task I could use NT Backup from win2k3. And if that's not good enough I could buy Backup Exec or BrightStor for a single server and use that with backup to a virtual tape on HDD.
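
To make the "copy with checksum" idea concrete, here is a small Python sketch of the software-level approach. It is not WinRAR or NT Backup: it just copies files that are new or changed to the backup drive and verifies each copy against a SHA-256 checksum. Strictly speaking that is closer to an incremental mirror than a true differential archive, and the function names and paths are hypothetical.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        while chunk := handle.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()


def backup_changed(source, dest):
    """Copy new or changed files from source to dest and verify every copy.

    Returns the relative paths (POSIX style) of the files that were copied.
    """
    source, dest = Path(source), Path(dest)
    copied = []
    for src in sorted(p for p in source.rglob("*") if p.is_file()):
        rel = src.relative_to(source)
        dst = dest / rel
        src_hash = sha256_of(src)
        if dst.is_file() and sha256_of(dst) == src_hash:
            continue  # backup copy is already identical
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)
        if sha256_of(dst) != src_hash:
            raise OSError(f"verification failed for {rel}")
        copied.append(rel.as_posix())
    return copied
```

A weekly run could be as simple as `backup_changed("D:/", "E:/backup")` with the swappable drive mounted as E:. One caveat: a copy like this will happily back up files that are already corrupt on the source, which is one reason to rotate two drives rather than overwrite a single one.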
 
You can also mirror R5 arrays. I suppose that also is a function of the controller. Take 4 HDDs at 1TB R5, then mirror that to another 4-HDD R5. But again, that is the most expensive with the least space, as you have to buy all these extra drives and not gain their space.
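
To put numbers on the cost of mirroring an R5, here is a quick Python sketch of the capacity arithmetic. It uses the textbook rules (RAID-5 gives up one disk's worth of space to parity, a mirror halves what is left) and the 6 x 200GB disks from the first post as the example. The function names are just for illustration, and real controllers may reserve a little extra space.

```python
def raid5_usable_gb(disks, disk_gb):
    """Usable space of a RAID-5 set: one disk's worth goes to parity."""
    if disks < 3:
        raise ValueError("RAID-5 needs at least 3 disks")
    return (disks - 1) * disk_gb


def mirrored_raid5_usable_gb(disks_per_side, disk_gb):
    """Usable space of two identical RAID-5 sets mirroring each other."""
    return raid5_usable_gb(disks_per_side, disk_gb)


single = raid5_usable_gb(6, 200)
mirrored = mirrored_raid5_usable_gb(6, 200)
print(f"6 disks, RAID-5: {single} GB usable of {6 * 200} GB raw")
print(f"12 disks, mirrored RAID-5: {mirrored} GB usable of {12 * 200} GB raw")
```

So the mirrored setup needs twice the drives for the same 1000 GB of usable space, which is the "most expensive with the least space" trade-off described above.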

My thoughts may be to keep all your drives in R5, but perhaps get a new controller. Then add in the mix the hot swap differential backup of your most important stuff, as software-based backup.
1TB external drive:
http://www.cooltechzone.com/reviews/drives/hddnewsstory_001.php

Then again, you could always just buy a new RAID controller, resetup everything, and hope for the best this time around. I suppose that could be the cheapest option.

Also of note, a friend of mine tried using WinRAR to split large files into 1GB chunks, and the process was less than 100% reliable. Not sure I'd trust that.
 