Sorry for the late response Cookiedude.
With everything I've been reading I was thinking this would be a good idea too. Is it possible to have two hot spares with RAID 5? Given hard drives are relatively cheap, I reckon we're best to cover ourselves as much as possible, where possible.
Your limitation is purely available bandwidth, connections and physical space inside a case. If you want to run two hot spares that's absolutely fine, but it must be supported by the RAID controller to do so.
Alternatively, a RAID 6 array will tolerate two simultaneous physical disk failures without loss of data. This, again, can be run with hot spares.
That said, running more than two hot spares in any one RAID 5/6 array is, in my personal opinion, pointless, as it becomes cost-inefficient to do so. If it were me, I would run a single hot spare and keep two (pre-tested!) replacement disks in a safe location for immediate replacement.
Either way, it is essential you keep new disks on hand to replace any failures. It would be a risk to rely on overnight shipping (or longer at weekends) in the event of a disk failure.
Would this be something I could set the backup software to handle or does it require physically checking the data on the drives? If we were to plump for using RAID 1 in the NAS is there a limit to how many drives can be mirrored? E.g., if we get a 5-bay NAS can we use five 3TB drives all mirroring data? This way we could rotate disks more often and even consider multiple off-site storage.
It depends on the backup software you use, really. Most backup software can verify backup images - some do it automatically - but in either case you *must* ensure every backup, whether full or incremental, is checked for consistency.
Another method is to create an MD5 checksum of the filesystem at backup time, then mount the filesystem of the backup image and re-compute the checksum. They should match on full backups. If they don't, it implies an issue.
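A minimal sketch of that checksum routine (all paths here are made-up examples, and in practice you'd run this against the mounted backup image rather than a plain directory):

```shell
# Create a demo "original" and "backup" copy to check against each other.
mkdir -p /tmp/demo/original /tmp/demo/backup
echo "important data" > /tmp/demo/original/file.txt
cp /tmp/demo/original/file.txt /tmp/demo/backup/file.txt

# Record checksums of the original files...
(cd /tmp/demo/original && md5sum file.txt > /tmp/demo/original.md5)

# ...then re-check them against the backup copy.
# md5sum -c prints "file.txt: OK" on a match, and exits non-zero on a mismatch.
(cd /tmp/demo/backup && md5sum -c /tmp/demo/original.md5)
```

Any corrupted or missing file in the backup shows up immediately as a FAILED line, which is exactly the "check, check and re-check" step below.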
Always check, check and re-check. You can have a thousand backups, but if none of them are functional you might as well not have bothered.
If you get a 5-bay NAS enclosure that supports multiple RAID volumes then you could run 2x RAID 1 arrays with a single hot spare covering both arrays. Again, RAID 5 is another option, as is RAID 6 and 10.
The beauty of NAS solutions is the simplicity -- if they get too small, just add another NAS.
I think the "extra" backup using tape may have to wait due to cost but certainly we will need something along those lines at some point. Out of curiosity, if we had a remote backup (whether tape or drive based) what kind of upload speeds would be required for this to work, based on the NAS backing up the server overnight and the remote solution backing up the NAS during the day? Or would this massively affect our company's broadband bandwidth? Currently we get around 3Mb download and 700Kb upload (which is rubbish!).
You wouldn't need massive network bandwidth to perform backups. The first image would be large, but subsequent images are incremental backups covering only the files changed since the last backup. Your current connection could easily handle that if done daily, provided your ISP doesn't restrict your total upload bandwidth.
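As a back-of-the-envelope check (the 2 GB incremental size is an assumed figure, not something from this thread), here's how long a nightly upload would take on that 700Kb uplink:

```shell
# Rough arithmetic: time to push an incremental over a 700 Kbps uplink.
SIZE_MB=2048            # assumed size of a nightly incremental, in megabytes
UPLINK_KBPS=700         # upload speed in kilobits per second

# Megabytes -> kilobits (x8 x1000), divided by link speed, gives seconds.
SECONDS_NEEDED=$(( SIZE_MB * 8 * 1000 / UPLINK_KBPS ))
echo "$(( SECONDS_NEEDED / 3600 )) hours $(( (SECONDS_NEEDED % 3600) / 60 )) minutes"
# prints "6 hours 30 minutes"
```

So even a fairly large 2 GB incremental fits comfortably in an overnight window, while a typical much smaller daily delta would take minutes.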
The below point is essential:
If your RAID is to hang off the network (i.e. the router) then for heaven's sake get a Gigabit router, and make sure the NAS has Gigabit too.
Make absolutely certain you adhere to this. You *must* use gigabit ethernet (also known as 1,000Mbps or 1Gbps) between all backup equipment to ensure the process is efficient and timely, and doesn't affect the network bandwidth of those using 100Mbps connections.
As per Jobeard's comments though... No one single user on the network *needs* any more than 100Mbps speeds in an office environment.
Jobeard will no doubt have an opinion on how to achieve this, and I'd definitely check it out. My preferred method is this:
Internet > Gigabit router > Gigabit switch (server/NAS) > 100Mbps switches > 100Mbps connections (e.g. printers, computers, other network infrastructure).
On the gigabit switch would be the server and your backup equipment. That way traffic between the router (and the internet), the server, and the NAS runs at 1Gbps with the enormous bandwidth that offers, while the office equipment sits on the slower, 100Mbps side of the network. This prevents the network slowing down from saturation under heavy traffic.
If your workplace is more of a 9-5 than 24/7/365, it is entirely possible to get away with the one internet connection and take advantage of the night time to perform backups off-site by SFTP.
Again, I have no commercial experience, so I'm drawing from self-taught knowledge of my own circumstances and my desire to learn new information. Make sure to verify this solution is adequate for your needs with others.
Got to run, so perhaps Leeky could address the concept of a Logical Volume comprised of multiple physical volumes, and RAID 1 covering the LV.
Multiple spares is a configuration issue and 'usually' supported: see the spec sheet for the vendor's offering.
Certainly, Jobeard.
Logical Volume Managers (or LVMs as they're commonly called) are software-based mass storage managers that enable you to create large virtual volumes from multiple physical disks. They come into their own at scale - huge server farms with thousands, even hundreds of thousands, of disks - letting you add disks, replace disks, and copy and share contents between every single disk allocated to the LVM.
What this essentially grants you is the ability to create, remove and resize Linux partitions on the fly, while the system is live. Another brilliant feature of an LVM is the ability to take snapshots of each volume in its current state. You can also mirror multiple volumes, creating a "RAID 1"-like array, and striping (otherwise known as RAID 0) is also possible. One of the biggest features, however, is the previously noted ability to resize partitions on the fly.
For example: /home is sat at 98% full and you need more space. You add another disk to the LVM and then dynamically grow the /home volume across that new disk -- this removes the need to migrate data from one disk to another, and creates continuity between disks.
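On Linux that growing-/home scenario boils down to a handful of LVM commands. This is only a sketch: the device name (/dev/sdc), volume group (vg0) and logical volume (home) are made-up examples, these need root, real hardware, and the filesystem-resize step assumes ext4:

```shell
pvcreate /dev/sdc                      # prepare the new disk as a physical volume
vgextend vg0 /dev/sdc                  # add it to the existing volume group
lvextend -l +100%FREE /dev/vg0/home    # grow the /home logical volume into the new space
resize2fs /dev/vg0/home                # grow the ext4 filesystem to match, while mounted
```

The whole operation happens with /home mounted and in use, which is exactly the "hot and on the fly" resizing described above.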
You can then use LVM to create mirrored volumes, and even tell it how to behave in the event of a disk failure. By running with extra disks added to the LVM, you can set it to automatically migrate data onto the free capacity of those disks if one drive fails. It's a similar concept to that used by hardware RAID controllers, except this happens at the software level.
Another advantage of managing volumes in software like this is that you remove yourself from the issues a RAID controller failure causes. Replacing a faulty RAID controller doesn't always guarantee a stress-free migration without data loss. By implementing an LVM at the software level this risk is reduced to near zero.
The LVM can be read by any OS that supports LVMs, so with Linux it would function perfectly on different hardware after, say, a motherboard failure. The only real thing that trips up an LVM in my experience is disk-wide failure -- if all the disks fail simultaneously then the data is lost.
You also cannot use an LVM for the /boot partition; that must live on a partition or disk outside of the LVM.
Out of curiosity what operating system are you planning on using with your new server?
If you need further explanations of particular aspects of LVMs, or indeed other aspects of what I've said by all means ask away.
EDIT: Wow, this is a long post! :haha: