By flea13153
Jun 7, 2012
  1. A 40TB file server in a RAID 5 configuration lost two drives simultaneously. One drive is totally dead and the other goes in and out of being recognized by the server. We replaced the dead drive and the server started to rebuild overnight; however, when we got in this morning the other drive was showing errors and the rebuild process had slowed way down. Previously, when we restarted the machine, the drive seemed to recover and work fine. My question is: if we restart the server during the rebuild, would the RAID start to rebuild from the beginning or pick up where it left off? Also, does anyone have any other recommendations besides letting it rebuild or restarting?
    As long as the RAID is still rebuilding, leave it alone! The last thing you want to do is lose all that data. Unless you have one hell of a backup, my advice is to leave it alone to rebuild. If it's a production file server, try to keep users off of it for as long as possible; don't use it if you can at all help it. Let it rebuild, which may not happen until the weekend.
    Different RAID cards react differently to a reboot. Most will be just fine if you reboot; they will continue rebuilding, since the rebuild is a separate process from the OS itself. Think of the RAID rebuild as an automobile with the engine running: you could change the tire (rebooting Windows) while it's running. Now, I don't advise it, but it would be OK - it might just be painfully slow.
    Another thought: 40TB is HUGE for a file server! I'm surprised that it's not configured as RAID 6 (two drives can fail) or RAID 10 (striped mirror). It would have made more sense to have the double parity of RAID 6 or the mirroring of RAID 10. [I'm just putting in my two cents, no offense intended]
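    To put rough numbers on that trade-off, here is a minimal Python sketch comparing usable capacity and guaranteed fault tolerance for the three levels. The drive count and drive size (20 drives of 2TB each) are made up for illustration, since the thread doesn't say how the array is built, and `raid_summary` is just a helper name I picked.

```python
def raid_summary(level, drives, drive_tb):
    """Return (usable_tb, failures_always_survived) for a RAID level.

    Simplified model: identical drives, no hot spares, no filesystem overhead.
    """
    if level == 5:
        if drives < 3:
            raise ValueError("RAID 5 needs at least 3 drives")
        # One drive's worth of capacity goes to parity.
        return (drives - 1) * drive_tb, 1
    if level == 6:
        if drives < 4:
            raise ValueError("RAID 6 needs at least 4 drives")
        # Two drives' worth of capacity goes to double parity.
        return (drives - 2) * drive_tb, 2
    if level == 10:
        if drives < 4 or drives % 2:
            raise ValueError("RAID 10 needs an even number of drives, at least 4")
        # Half the drives are mirrors. Any single failure is always survivable;
        # more can be survived only if they land in different mirror pairs.
        return (drives // 2) * drive_tb, 1
    raise ValueError(f"unsupported RAID level: {level}")


# Hypothetical layout: 20 drives of 2TB each (not taken from the thread).
for level in (5, 6, 10):
    usable, survives = raid_summary(level, drives=20, drive_tb=2)
    print(f"RAID {level}: {usable}TB usable, always survives {survives} drive failure(s)")
```

    With those made-up numbers, RAID 6 gives up one more drive of capacity than RAID 5 in exchange for surviving a second failure, which is exactly the situation here.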
    Hope this helps, take care!
    That's what we're banking on, just leaving it alone.

    We would have loved for the RAID to be in 6 or 10; however, we inherited the domain and, as of now, we don't have the ability to take the network's file server offline, transfer all the data off, rebuild the array, and transfer all the data back on. Along the lines of the inherited network, they had a terrible backup strategy in place, and the latest full backup we have is from December 2011. We're in the middle of redesigning the entire network infrastructure, and if this had happened in August, we wouldn't be in the bind we are in now.
    Wow... sounds like you got put right smack in the middle of a hornet's nest. I've been in those spots many times myself. Good luck! Yeah, leave it be and let it rebuild. If you don't mind, just out of curiosity, post back and let me know how the rebuild goes over the weekend. If you need further help, post back or message me directly. Take care!

