Saturday, October 4, 2014

VM Migration to NFS Export hosted by a FreeNAS VM

I mentioned earlier in my Year in Review that I didn't have RAID set up for the VMs (the onboard motherboard RAID wasn't supported with stock ESXi, and I couldn't use my RAID card since it's passed through to some of the VMs); they were just spread across two 2 TB drives. Going into this migration, I was using four older SATA I 250 GB drives for my file shares and an external SAN for non-OS data.

The goal was to have something that would protect against drive failure, without purchasing any additional hardware. My solution was to provide virtual disks to a FreeNAS VM, create a ZFS volume using RAID, and set up an NFS export that gets mounted as a datastore within ESXi (a rough sketch of the commands is after the list below). I know it isn't the most elegant solution, but it does offer some benefits:
 1) I didn't have to purchase any new hardware
 2) I can sleep better at night knowing I can survive a drive failing
 3) I can set up off site replication for my VMs, since my VMs are now on a ZFS volume
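
For anyone curious what that looks like under the hood, here's a rough sketch. FreeNAS does most of this through its web GUI, and the pool, dataset, device names, and IP below are placeholders, not my actual config:

    # On FreeNAS: build the pool from the virtual disks and carve out a dataset for the VMs
    zpool create vmpool mirror da1 da2 mirror da3 da4    # striped mirrors (RAID 10)
    zfs create vmpool/vmstore
    # then share /mnt/vmpool/vmstore over NFS (Sharing -> NFS in the FreeNAS GUI)

    # On the ESXi host: mount the export as a datastore
    esxcli storage nfs add --host=192.168.10.50 --share=/mnt/vmpool/vmstore --volume-name=FreeNAS_VMs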

I did opt for a second FreeNAS VM that I could dedicate to the VMs, while the other FreeNAS VM acts more like a development environment for testing patches and configurations (if I need to reboot the FreeNAS instance hosting the VMs, I have to shut down all of those VMs first). The FreeNAS VMs themselves were kept on an ESXi datastore that's available at boot.

I wanted to set up FreeNAS with a 10 Gb NIC, but I initially ran into driver issues with the VMXNET 3 NIC. So I set up two NICs on the FreeNAS VM and gave each NFS export its own dedicated NIC. Once I figure out the VMXNET 3 drivers on my development FreeNAS, I'll fix it. For what I'm doing, I'm not seeing much of a performance hit over the 1 Gb links, but the extra bandwidth wouldn't hurt.
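
From the ESXi side, splitting the exports across the two NICs just means mounting each datastore against a different FreeNAS IP, something like this (the IPs, share paths, and datastore names are made up):

    esxcli storage nfs add --host=192.168.10.50 --share=/mnt/vmpool/vmstore   --volume-name=FreeNAS_VMs
    esxcli storage nfs add --host=192.168.10.51 --share=/mnt/vmpool2/vmstore2 --volume-name=FreeNAS_VMs2
    esxcli storage nfs list    # confirm both datastores are mounted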

Migrating all the data around was a bit slow (it took about a week). The steps I took are below (with a rough sketch of how a VM disk gets copied after the list):
 1) Migrated my CIFS data off the 4 drive RAID to the external SAN
 2) Created a new ZFS volume on the 4 drives where the CIFS data was (I went with RAID 10)
 3) Migrated some of the VMs to the new RAID 10 ZFS volume
 4) Migrated the remaining VMs to the external SAN
 5) Once the two 2 TB drives were empty, I created a new virtual disk on each, set up a second (RAID 1) ZFS volume, then migrated some of the VMs to it
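
For reference, a cold migration without Storage vMotion boils down to powering the VM off, cloning its disks to the new datastore, and re-registering it from the new location. The disk copy itself looks roughly like this (the datastore, folder, and VM names are placeholders):

    # On the ESXi host, with the VM powered off
    mkdir /vmfs/volumes/FreeNAS_VMs/someVM
    vmkfstools -i /vmfs/volumes/datastore1/someVM/someVM.vmdk /vmfs/volumes/FreeNAS_VMs/someVM/someVM.vmdk
    # copy the .vmx/.nvram files over, remove the old VM from inventory, and register the copy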

To wrap this up, I've been using this setup for 20+ days now, and haven't had any issues. And as I said, it's not that elegant, but it does provide peace of mind against drive failure, and I'll feel really good once I have the data replicated off site. :)
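The replication piece is the part ZFS makes easy. FreeNAS can schedule periodic snapshots and replication from its GUI, but under the hood it's essentially snapshots piped over SSH, along these lines (the dataset, snapshot, and host names are placeholders):

    zfs snapshot vmpool/vmstore@auto-20141004
    zfs send -i vmpool/vmstore@auto-20141003 vmpool/vmstore@auto-20141004 | ssh offsite-box zfs receive -F backup/vmstore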


*** I recently had a hard drive go bad in my 4 bay external eSATA SAN that's way past warranty. The cost to replace the drive at its current size is actually more than buying a bigger drive when the bigger drives are on sale. So I started reading more about FreeNAS and how it could solve some of my problems. While reading, though, I learned that running FreeNAS as a VM isn't recommended. I personally haven't had an issue (it's been wonderful), but apparently there are stories of people losing data. I really don't want to lose everything I've done, so now I plan to build a new server that will be fully dedicated to storage (this was always the ideal setup, but I was trying to cut down on costs). I plan to connect it directly to the ESXi server via a crossover Ethernet cable (I don't want to waste two ports on my gigabit switch; I currently only have one virtual host server, and the future FreeNAS box will have two NICs, with the second one supporting CIFS and off site replication).
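
The direct-connect link would just be a tiny point-to-point storage subnet. On the ESXi side that means a dedicated VMkernel interface with a static IP, roughly like this (the vmk interface name and addresses are placeholders, and the interface would first need its own vSwitch/port group):

    # ESXi side of the crossover link
    esxcli network ip interface ipv4 set --interface-name=vmk1 --ipv4=10.10.10.1 --netmask=255.255.255.0 --type=static
    # FreeNAS side gets 10.10.10.2 on the NIC wired to the crossover cable,
    # and the NFS datastores are mounted against that address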

The other HUGE benefit is that I gain back about 12 GB of memory in my ESXi box. I was using 8 GB for my prod FreeNAS and 4 GB for the CIFS / dev FreeNAS. I'll still need to test FreeNAS updates in a VM, but that's something I could spin up, test with, and then shut back down.

The plan is to serve storage to ESXi via NFS, as it seems to be easier than iSCSI (I don't want to deal with growing iSCSI extents). I'll just keep the data separated into datasets for replication purposes. I've been monitoring the performance of the virtual FreeNAS machine, and bandwidth isn't an issue (I'm mainly concerned with disk IO).
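
Both the dataset layout and the IO monitoring are simple on the ZFS side; something like this, with the pool and dataset names as placeholders:

    # one dataset per group of VMs, so each can be snapshotted and replicated on its own schedule
    zfs create tank/vm-prod
    zfs create tank/vm-lab
    # watch per-vdev IOPS and bandwidth, refreshed every 5 seconds
    zpool iostat -v tank 5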

I've started testing growing volumes by replacing the disks with larger ones, so I feel confident in building something that supports growth for three years (through the hard drive warranty period). Then, as the hard drives fail, I would purchase larger ones to replace them. I'm targeting RAID 10 with four drives (probably the WD Red NAS drives, even though they're 5400 RPM) for the best performance I can get on a small motherboard.
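
The grow-by-replacement process itself is straightforward ZFS; the pool and device names below are placeholders:

    zpool set autoexpand=on tank
    zpool replace tank ada1 ada5     # swap a failed or smaller disk for a larger one
    zpool status tank                # wait for the resilver to finish
    # once every disk in a mirror vdev has been replaced with a larger one,
    # the extra capacity shows up automatically with autoexpand=on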