Tuesday, October 29, 2013

High System Interrupts

I just finished fixing a problem with one of my VMs that uses hardware pass-through. It was one of the machines that gets the second USB controller. When I went into the OS (Windows), it was very slow. Task Manager showed the first CPU floating around 30% while the second CPU sat idle, but none of the processes were actually consuming CPU cycles. Resource Monitor finally showed that the time was going to hardware system interrupts.
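If you'd rather see numbers than eyeball Resource Monitor, the same counters can be pulled from a command prompt with typeperf, which is built into Windows. Just a quick sketch - the counter names should be the stock ones, but double-check them on your build:

    REM Sample interrupt and DPC time every 2 seconds, 5 samples
    typeperf "\Processor(_Total)\% Interrupt Time" "\Processor(_Total)\% DPC Time" -si 2 -sc 5

High interrupt time with no busy processes usually points at a device or driver rather than anything you'll find in the process list.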

I isolated it to the USB controller rather than the video card by removing both from the VM and adding them back one at a time. I tried a couple of different scenarios, like switching the controller to other VMs and whatnot, but had no luck.

To resolve the problem, I deselected all of the hardware for pass-through, rebooted, reconfigured the pass-through, and rebooted a second time. I then checked my VMs to ensure they were still configured with the correct hardware items and powered everything back on. I think the problem crept in when I was adding additional hardware for pass-through and the configuration just got out of sync somewhere.
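For what it's worth, you can sanity-check what the host thinks it has from the ESXi shell before and after reconfiguring pass-through. A rough sketch (the exact output fields vary a bit by ESXi version):

    # List the PCI devices the host sees and find the USB controllers / video card
    esxcli hardware pci list | less
    lspci | grep -i usb

I still did the actual pass-through configuration through the vSphere Client; this is just a quick way to confirm the devices are showing up where you expect.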

Within Linux, I'm fairly certain the sysstat tools will track down culprits with high system interrupts - mpstat shows interrupt time per CPU, and /proc/interrupts breaks the counts down by device. I think that's what I used a few years ago to track down something (it was either the analog phone card or a dying hard drive, from what I can remember).
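A quick sketch of what that looks like, assuming the sysstat package is installed:

    # Per-CPU breakdown; the %irq and %soft columns are hardware/software interrupt time
    mpstat -P ALL 2 5
    # Watch the raw counters to see which IRQ line (device) keeps climbing
    watch -n 1 'cat /proc/interrupts'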

Storage Issues (RAID)

So, it's ideal to have RAID IMO - primarily for redundancy. I've had too many hard drives fail to live without it.

My inventory includes four 2 TB drives (somewhat old and somewhat slow), four 250 GB drives (very old and very slow), and two 2 TB drives (new and fast). I was hoping to set up the larger drives on their own RAID (RAID 1 for the two new drives and RAID 5 for the four older 2 TB drives). I also have an older 4-bay SANS Digital SAN.

I bought a Dell PERC 5/i RAID card with two SAS-to-SATA breakout cables, each supporting four SATA drives. The bad news is that it apparently doesn't work with VT-d enabled, which is the cornerstone of my entire virtual environment. When I installed it, I was able to create the logical drive, but ESX kept losing the connection to it. It would see it until I started moving files to it, then it would disappear from inventory. When I added it back, it wouldn't see that there was already a VMFS partition on it and would create a new one.
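If anyone wants to dig into the same kind of disappearing-datastore behavior, the vmkernel log and the VMFS extent list are the places I'd look from the ESXi shell. A rough sketch (ESXi 5.x paths and syntax, so adjust for your version):

    # Watch for the device dropping off while copying files
    tail -f /var/log/vmkernel.log
    # Confirm whether the VMFS volume and its backing adapter are still visible
    esxcli storage vmfs extent list
    esxcli storage core adapter list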

My options (at least the ones I can think of) are to use the four large drives for a software RAID within my FreeNAS VM, use the drives individually, or go back to the external SAN I was using previously. The only problem with the external SAN is that it's not going to be as fast as the internal RAID card would have been.
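For the software RAID option, FreeNAS would just put the four drives into a ZFS pool. Something along these lines from the FreeNAS shell, assuming the disks show up as da1 through da4 (the device names and the pool name are placeholders for my setup; the GUI is the normal way to do this):

    # Single-parity raidz across the four 2 TB drives - roughly the same
    # redundancy/capacity trade-off as RAID 5
    zpool create tank raidz1 da1 da2 da3 da4
    zpool status tank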

I decided to use the external SAN. I'll have to be extra careful not to let the external SATA cable come unplugged, but it gives me the flexibility to put VMs on the RAID array without having to worry about whether FreeNAS has made the iSCSI / NFS mount point available to ESX before additional machines can power on. And it'll be a little more mobile.

I think I will try to incorporate the Dell RAID card into a future build. I'll keep my eye open for a decently powered server that can run FreeNAS and add some hard drives to the RAID. Then, I can set up an iSCSI target to be used by the ESX host. It would even set me up to migrate VMs between future ESX hosts.
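On the ESX side, wiring up that future iSCSI target is only a couple of commands (or the equivalent clicks in the vSphere Client). A rough sketch, where the adapter name and target address are just placeholder values for my setup:

    # Enable the software iSCSI initiator, point it at the FreeNAS target, and rescan
    esxcli iscsi software set --enabled=true
    esxcli iscsi adapter discovery sendtarget add --adapter=vmhba33 --address=192.168.1.50:3260
    esxcli storage core adapter rescan --adapter=vmhba33

Once the LUN shows up, it can be formatted as a VMFS datastore and shared between hosts, which is what would make VM migration possible down the road.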