Global Data Deduplication Overview

Categories: Deduplication, Backup, Storage, XenServer, Hyper-V, Alike v3, Alike v4


Alike differs from many other backup solutions in its global data deduplication. Alike’s backend consists of large, contiguous files called BDBs, which contain globally unique blocks, each representing a 512KB section of backup data. Because each unique block is stored only once, Alike can store any number of copies of an identical VM with negligible overhead, and can store similar VMs inexpensively as well.
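To make the idea concrete, here is a minimal sketch of block-level deduplication in Python. It is not Alike's implementation: the `BlockStore` class, the SHA-256 fingerprint, and the in-memory dictionary are all assumptions chosen for illustration. Only the 512KB block size comes from the description above.

```python
import hashlib

BLOCK_SIZE = 512 * 1024  # 512KB, the block size described above


class BlockStore:
    """Hypothetical stand-in for a BDB: keeps one copy of each distinct block."""

    def __init__(self):
        self.blocks = {}  # fingerprint -> block data

    def put(self, block: bytes) -> str:
        # Assumption: blocks are identified by a SHA-256 fingerprint.
        digest = hashlib.sha256(block).hexdigest()
        # setdefault stores the block only if this fingerprint is new.
        self.blocks.setdefault(digest, block)
        return digest


def store_disk(store: BlockStore, disk: bytes) -> list:
    """Split a disk image into blocks and return its list of fingerprints."""
    return [
        store.put(disk[offset:offset + BLOCK_SIZE])
        for offset in range(0, len(disk), BLOCK_SIZE)
    ]


def restore_disk(store: BlockStore, manifest: list) -> bytes:
    """Rebuild a disk image from its list of fingerprints."""
    return b"".join(store.blocks[digest] for digest in manifest)
```

In this sketch, storing the same disk image a second time adds no new block data to the store; only another list of fingerprints is produced, which is why identical VMs cost almost nothing to keep.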

Deduplication in Backup Jobs

When Alike backs up a VM for the first time, it performs a full scan of the VM’s disk and copies all of the VM’s disk data to the Alike Data Store (ADS). This data is compressed into an efficient format and appears in the “validate” subdirectory of your ADS as AMB files.


Alike then processes these AMBs in the background and deduplicates them. You can follow the progress of this background operation in the commit queue on the dashboard.

Once an initial backup completes and its AMBs are fully committed, the next backup uses information from the previous backup to avoid a full scan. Instead, Alike compares the data on disk with information from the previous backup and sends only the changed data back to the ADS in AMB files.
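The comparison step can be sketched as follows. The article does not say what information Alike keeps from the previous backup, so this example assumes a list of per-block SHA-256 fingerprints; the function names are hypothetical as well.

```python
import hashlib

BLOCK_SIZE = 512 * 1024  # 512KB blocks, as described earlier


def fingerprint(block: bytes) -> str:
    # Assumption: a SHA-256 digest identifies a block's contents.
    return hashlib.sha256(block).hexdigest()


def split_blocks(disk: bytes) -> list:
    """Cut a disk image into fixed-size blocks."""
    return [disk[i:i + BLOCK_SIZE] for i in range(0, len(disk), BLOCK_SIZE)]


def changed_blocks(disk: bytes, previous: list) -> dict:
    """Return {block index: data} for blocks that differ from the last backup.

    `previous` is the list of fingerprints recorded by the prior backup.
    An empty list means no prior backup, so every block counts as changed.
    """
    changes = {}
    for index, block in enumerate(split_blocks(disk)):
        if index >= len(previous) or previous[index] != fingerprint(block):
            changes[index] = block
    return changes
```

With an empty `previous` list, every block is returned, which corresponds to the initial full backup; on later runs only the blocks whose fingerprints differ would be sent.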

As with the initial backup, this data is not globally deduplicated until Alike finishes processing the AMBs and saves the distinct block data in its BDBs.
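The two-stage flow (AMBs are staged first, then committed in the background) might be modeled roughly like this. Everything here is an assumption made for illustration: the queue is a plain `deque`, each staged AMB is a list of blocks, and the BDB is a dictionary keyed by SHA-256 fingerprint.

```python
import hashlib
from collections import deque


def commit_all(queue: deque, store: dict) -> int:
    """Drain the staged AMBs into the store; return how many blocks were new."""
    new_blocks = 0
    while queue:
        amb = queue.popleft()  # oldest staged AMB first
        for block in amb:
            digest = hashlib.sha256(block).hexdigest()
            if digest not in store:  # only distinct data is saved
                store[digest] = block
                new_blocks += 1
    return new_blocks
```

Until the queue in this model is empty, duplicate blocks can still exist inside the staged AMBs, which mirrors the point above: deduplication is complete only after the commit step has run.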

Final Notes

