Theory of operations for vaulting to tape

If you are new to our data protection solutions, you may be asking yourself how to run a simple backup to tape, as you would with legacy tape software.

The DATASTOR software is a disk-to-disk-to-tape solution. It first backs up to disk using source-based deduplication, then lets you create redundant copies of the backup data, still in its deduplicated format, in a vault on tape. The vault provides an off-site, redundant copy to meet business requirements for data recoverability.

Defining the terms 'store', 'protection plan', 'archive', and 'vault'

To define these terms in relation to each other: the store is the location on disk where the backup is stored. The protection plan is the set of parameters the engine uses to back up the data, for example the folders to back up, the store to target, and the archive name. The archive is the result of running the plan, and it appears in the store under the plan name. Restore points are accessed from the archive folder for data recovery. All of this occurs without tape. The vault is the set of archives that have been selected for writing to tape. Nothing is written to tape until the first plan finishes writing to the store and an archive restore point has been generated in the store. Backup operations, then, go from disk to disk to tape.
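The relationships between these terms can be sketched as a small data model. This is an illustration only: the class and field names are hypothetical and do not reflect the DATASTOR software's internal structures or any API it exposes.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ProtectionPlan:
    """Parameters the engine uses for a backup (hypothetical field names)."""
    name: str
    folders: List[str]
    store_path: str


@dataclass
class Archive:
    """Result of running a plan; holds the restore points."""
    name: str
    restore_points: List[str] = field(default_factory=list)


@dataclass
class Store:
    """Location on disk where backups land."""
    path: str
    archives: Dict[str, Archive] = field(default_factory=dict)

    def run_plan(self, plan: ProtectionPlan, label: str) -> Archive:
        # The archive appears in the store under the plan name.
        archive = self.archives.setdefault(plan.name, Archive(plan.name))
        archive.restore_points.append(label)
        return archive


@dataclass
class Vault:
    """Set of archives selected for writing to tape."""
    name: str
    archive_names: List[str]

    def archives_ready_for_tape(self, store: Store) -> List[Archive]:
        # Nothing can go to tape until a restore point exists in the store.
        return [
            store.archives[n]
            for n in self.archive_names
            if n in store.archives and store.archives[n].restore_points
        ]
```

In this sketch, a vault that selects an archive reports nothing ready for tape until the plan has run at least once and produced a restore point in the store.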

A vault as containerized archives

DATASTOR has found that moving many small files is easier when you "containerize" them for shipping. To optimize the process, we load small files into 16 MB containers, also called segments, and stream those segments to tape.
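As a rough illustration of the containerizing idea, the sketch below packs files into 16 MB segments in order. The packing rule is an assumption made for the example (a new segment starts when the next file would not fit, and an oversized file gets a segment to itself); the actual segment layout used by the software is not described here.

```python
from typing import Iterable, List, Tuple

SEGMENT_SIZE = 16 * 1024 * 1024  # 16 MB, as stated in the text


def pack_into_segments(
    files: Iterable[Tuple[str, int]],
    segment_size: int = SEGMENT_SIZE,
) -> List[List[str]]:
    """Group (name, size_in_bytes) pairs into segments, preserving order.

    Assumed rule: start a new segment when the next file would overflow
    the current one. A file larger than segment_size is placed alone.
    """
    segments: List[List[str]] = []
    current: List[str] = []
    used = 0
    for name, size in files:
        if current and used + size > segment_size:
            segments.append(current)
            current, used = [], 0
        current.append(name)
        used += size
    if current:
        segments.append(current)
    return segments
```

Streaming a handful of large segments rather than many individual small files is the point of the technique: the tape drive sees fewer, larger sequential writes.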

The vaulting process

The initial vault execution writes the full content of the selected archives from the store to tape. Each subsequent vault execution writes only the new data generated since the last vault execution. Vaulting is therefore intended to build up a set of tapes over time, and it spans tape media as required.
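The full-then-incremental behavior described above can be sketched as follows. The segment identifiers and the in-memory set used to remember what is already on tape are assumptions for the example; how the software actually tracks vaulted data is not covered in this article.

```python
from typing import Iterable, List, Set


def vault_run(store_segments: Iterable[str], on_tape: Set[str]) -> List[str]:
    """Return the segments this run would write, and record them as vaulted.

    The first run (empty on_tape) writes everything; later runs write
    only segments that were not present at the previous run.
    """
    to_write = [seg for seg in store_segments if seg not in on_tape]
    on_tape.update(to_write)
    return to_write
```

For example, starting with an empty `on_tape` set, a first run over segments `s1` and `s2` writes both; a second run after `s3` has been added to the store writes only `s3`.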

Volume sets and media rotation

On some time boundary, for instance quarterly, you would create a new store vault to vault the restore points generated within the new time frame. This copies to tape all the data required for those restore points, independent of the prior vault. Until that time boundary is reached, simply continue running the current vault task. The software uses an available tape already assigned to the volume set, or pulls a blank one if the existing tapes are full.

To facilitate media rotation so that full tape volume sets are offsite on any given day, consider creating five vaults, Monday through Friday, for your daily rotation. In this scheme, five complete, independent sets of archive restore points are generated, and the volume sets can be stored offsite, with only the latest tape in each vault's volume set brought back onsite on the appropriate day of the week. Then, five new vaults can be created on the desired time boundary.
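One way to picture the rotation scheme is a function that maps a calendar date to the vault that should run that day. The naming convention (year, quarter, weekday) is hypothetical and assumes a quarterly time boundary; vaults are created and scheduled in the product itself, not through code like this.

```python
import datetime
from typing import Optional

WEEKDAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"]


def vault_name_for(day: datetime.date) -> Optional[str]:
    """Pick the vault for a given date, or None on weekends.

    Assumes one vault per weekday and a new set of five vaults each quarter.
    """
    weekday = day.weekday()  # Monday is 0, Sunday is 6
    if weekday >= len(WEEKDAY_NAMES):
        return None
    quarter = (day.month - 1) // 3 + 1
    return f"{day.year}-Q{quarter}-{WEEKDAY_NAMES[weekday]}"
```

Because the quarter is part of the name, the set of five vaults changes automatically when the time boundary is crossed, which mirrors creating five new vaults at that point.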

Recovering data

Data recovery is immediately available, and quickest, when restoring from one of the restore points in the store on disk. To manage growth of the store, set expiration on the store to expire older restore points and maintain free disk space. Vaulting is intended for longer-term storage, for recovering data after restore points have expired from the store on disk. Recovery from a vaulted restore point requires that the restore point be prepared before file recovery. Preparation involves writing the archive restore point data back to disk, to the cache location specified during vault preparation. See the quick start guide for more information.
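The choice between the two recovery paths can be summarized in a short sketch. The function name, the step descriptions, and the use of plain sets to represent what is in the store and in the vault are all assumptions for illustration; the actual procedure is performed in the product as described in the quick start guide.

```python
from typing import List, Set


def recovery_steps(restore_point: str, in_store: Set[str], in_vault: Set[str]) -> List[str]:
    """List the steps needed to recover from a given restore point."""
    if restore_point in in_store:
        # Fastest path: the restore point has not yet expired from disk.
        return ["restore files from the store on disk"]
    if restore_point in in_vault:
        # Vaulted restore points must be prepared (written back to disk) first.
        return [
            "prepare restore point from tape to the cache location",
            "restore files from the prepared restore point",
        ]
    raise LookupError(f"restore point {restore_point!r} not found in store or vault")
```

A restore point that exists both on disk and in the vault is served from disk, since that path needs no preparation step.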

LTFS support

We are working to provide LTFS support in the next release of the software, so that a store could be placed directly on tape and backups would then write to that store on tape directly. Performance considerations suggest that backing up large files would work better in this configuration than backing up small files. LTFS support would also allow data recovery operations to use the LTFS volume as the destination drive.