In the last post we looked at guest-level backup integration with Microsoft VSS technology, which is required when the virtual machine attaches to a physical disk. In this post we'll explore certain host-level backup limitations in more detail.
The host-level backup approach runs the backup from the host machine, copying the virtual machine's configuration files and virtual disk files to available disk. Newer virtualization software can communicate with a VM’s internal VSS providers to quiesce the system internally (when the disks are virtual). This approach has several appeals. First, it protects virtual systems by protecting their virtual disks and configuration files at the host level, so management comes down to keeping copies of virtual disks. Second, restoring an entire VM can require as little as the virtual hard disks, and the VM can be restored to an alternate VM host. Lastly, it removes (at least in theory) the third-party backup agent that the guest-level method installs, along with the associated hassle of rollouts and upgrades. However, a proper backup of a virtual machine running Windows requires that volumes and applications be quiesced and in a consistent state when the snapshot is taken, and this ultimately requires consideration of the disks used by the VM and communication with internal VSS writers.
Recall from the last post that if the VM attaches to physical disk, the backup must also be able to take a snapshot of the physical disk to protect the files stored on it, and host-level backup does not support this. Hyper-V does not support host-level backups of physical disks attached to virtual machines. Microsoft’s best practice is to run guest-level backups with periodic host-level backups. See Microsoft’s Planning for Backup, updated March 17, 2010 as of this writing, for more information. VMware has issues with Raw Device Mapping in physical mode, which is common in clustered environments. The situation becomes more complicated when a virtual machine’s application files reside across multiple disks, some of them virtual and some physical. An application backup that lacks simultaneous snapshots across all involved disks is prone to logical database corruption. You are back to implementing a guest-level backup. Or, if you are going to go with a host-level backup product, be prepared to reconfigure your existing virtual machines to move from physical to virtual disks, and then take into account any performance issues that may result.
For a backup to be application consistent, the backup must quiesce the application, such as Microsoft SQL Server, commit to disk any transactions that were in memory, then take a snapshot and protect the appropriate application files. For the backup software to be application aware, the application must know it has been backed up, which allows for log truncation. The following table lays out the requirements for proper backup of a virtual machine.
Setting aside those virtual systems with the physical disk limitation, there is some debate over whether a host-level backup of a transactional database succeeds reliably enough to be acceptable. While anecdotal evidence suggests successful backups do take place, would a 1% failure rate be acceptable in your organization? Attempting to restore and recover an Exchange database from a host-level backup and finding that it won’t mount is not the time to discover the limitations of your backup method. See more on the VM backup debate at Scott Waterhouse’s Avamar-centered blog advocating guest-level backups here, Veeam’s post on host-level backups here, and W. Curtis Preston’s posts here and here.
Best practices for backing up virtual machines come down to this: always run guest-level backups when your virtual machine attaches to physical disk, such as a SAN volume, and always run guest-level backups when protecting transactional databases.
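To make the rule above concrete, here is a minimal sketch of it as a decision function. The names `DiskKind`, `VirtualMachine`, and `recommended_backup` are made up for illustration and do not come from any backup product or hypervisor API; the sketch simply assumes you already know each VM's disk types and whether it hosts a transactional database.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class DiskKind(Enum):
    VIRTUAL = "virtual"    # virtual disk file managed by the hypervisor
    PHYSICAL = "physical"  # pass-through disk, physical-mode RDM, SAN volume


@dataclass
class VirtualMachine:
    # Hypothetical inventory record, not a real hypervisor object.
    name: str
    disks: List[DiskKind]
    runs_transactional_db: bool = False


def recommended_backup(vm: VirtualMachine) -> str:
    """Return "guest" or "host" following the rule of thumb in this post."""
    # Host-level backup cannot snapshot physical disks attached to the VM.
    if DiskKind.PHYSICAL in vm.disks:
        return "guest"
    # Transactional databases (Exchange, SQL Server) need application consistency.
    if vm.runs_transactional_db:
        return "guest"
    # Virtual disks only and no transactional database: host level is an option.
    return "host"


if __name__ == "__main__":
    inventory = [
        VirtualMachine("fileserver", [DiskKind.VIRTUAL, DiskKind.PHYSICAL]),
        VirtualMachine("exchange", [DiskKind.VIRTUAL], runs_transactional_db=True),
        VirtualMachine("webserver", [DiskKind.VIRTUAL]),
    ]
    for vm in inventory:
        print(vm.name, "->", recommended_backup(vm))
```

A "host" result here only means the rule does not rule host-level backup out; it is not a guarantee that a host-level backup of that VM will be consistent.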
Now, a final word on the additional benefits of using dataStor's technology to create guest-level backups. dataStor "walks" the file system internally, accounting for and deduplicating each file. Files stored once never have to be stored again. For changed files, only the unique changes within the file are stored. This allows for superior deduplication ratios for VMs as well as a much shorter backup window and less network traffic. Also, there is no agent installed inside the VM, just a Windows scheduled task, so dataStor upgrades only take place on the Archive Manager server.
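The general idea of storing a file once and then storing only what changed can be sketched with a toy content-addressed chunk store. This is not dataStor's actual algorithm, which is not described here; the `ChunkStore` class, the fixed-size chunking, and the SHA-256 digests are all assumptions chosen to keep the illustration short.

```python
import hashlib
from typing import Dict, List, Tuple


class ChunkStore:
    """Toy content-addressed store: each unique chunk is kept exactly once."""

    def __init__(self, chunk_size: int = 4096) -> None:
        self.chunk_size = chunk_size
        self.chunks: Dict[str, bytes] = {}  # SHA-256 hex digest -> chunk bytes

    def add(self, data: bytes) -> Tuple[List[str], int]:
        """Store data; return its chunk recipe and the number of new bytes kept."""
        recipe: List[str] = []
        new_bytes = 0
        for offset in range(0, len(data), self.chunk_size):
            chunk = data[offset:offset + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.chunks:
                # Only chunks never seen before consume storage.
                self.chunks[digest] = chunk
                new_bytes += len(chunk)
            recipe.append(digest)
        return recipe, new_bytes

    def restore(self, recipe: List[str]) -> bytes:
        """Rebuild the original bytes from a recipe of chunk digests."""
        return b"".join(self.chunks[digest] for digest in recipe)


if __name__ == "__main__":
    store = ChunkStore(chunk_size=4)  # tiny chunks so the effect is visible
    _, first = store.add(b"AAAABBBBCCCC")
    _, again = store.add(b"AAAABBBBCCCC")    # identical file backed up again
    _, changed = store.add(b"AAAAXXXXCCCC")  # one chunk modified
    print(first, again, changed)
```

Backing up the same content a second time adds nothing to the store, and a file with one modified chunk adds only that chunk. Fixed-size chunks are the simplest choice but lose their alignment when bytes are inserted mid-file, which is one reason real deduplication engines tend to use more elaborate chunking.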