Please go through this link before reading the tests I describe below.
I have configured a protected site and a recovery site in my home lab: PJ (protected) and SG (recovery). I have included only one VM (PJ-AD01) in my protection group. The aim is to test how much data gets replicated when a file of close to 1GB is added and deleted.
I tried 5 scenarios:
- Just adding a 1GB file
- Adding a 1GB file and then deleting it
- Deleting a 1GB file and adding the same 1GB file again
- Just removing an existing 1GB file from the disk
- Adding a 2GB file, deleting it, and adding another 1GB file
It seems that the data changes tracked by SRM are always based on the files added, not on the files deleted, as shown below.
Although the last sync data is reported as close to 1GB for the first 3 scenarios, when we checked the network utilization, it took only slightly more than 1 minute (to be exact, 1m18s) to replicate the changes (the block-level changes since the last sync) at a maximum of 14000KBps.
Running at 14000KBps for 1 minute transfers around 800MB. The figure above captures the network overhead of replication for the first 3 scenarios, and each of them took less than 2 minutes to complete. As mentioned in the link provided at the beginning of this blog, block-level changes are not equal to the amount of data changed within the RPO interval. In most cases, I believe the block-level changes will be less than the actual data added into the VM. This makes for a more efficient replication technology between the protected and recovery sites.
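The rate-times-duration arithmetic can be sketched in a few lines of Python. This is only a rough illustration: the function name `transferred_mb` is made up for this post, it assumes 1MB = 1000KB, and it assumes the peak rate is sustained for the whole window, so the result is a ceiling rather than a measurement.

```python
def transferred_mb(rate_kbytes_per_s: float, seconds: float) -> float:
    """Upper bound on data moved at a sustained rate, in MB.

    Assumes 1 MB = 1000 KB and that the rate holds for the full duration.
    """
    return rate_kbytes_per_s * seconds / 1000


# 14000KBps sustained for 1 minute
print(transferred_mb(14000, 60))   # 840.0
# 14000KBps sustained for 3 minutes
print(transferred_mb(14000, 180))  # 2520.0
```

These values line up with the rounded 800MB and 2400MB figures used in this post.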
There is actually no simple way to calculate the block-level changes since the last sync image. It is safer to size based on the actual data change expected within the RPO requirement.
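As a rough sketch of that sizing approach, the snippet below turns an expected data change and an RPO into the average rate the link must sustain. The function name `required_kbytes_per_s` and the sample numbers are assumptions for illustration only; it ignores compression, bursts, and traffic from other VMs, and it assumes 1MB = 1000KB.

```python
def required_kbytes_per_s(change_mb: float, rpo_hours: float) -> float:
    """Average rate (KBps) needed to ship change_mb within one RPO window.

    Conservative estimate: treats every MB of data change as a MB to replicate.
    """
    if rpo_hours <= 0:
        raise ValueError("rpo_hours must be positive")
    return change_mb * 1000 / (rpo_hours * 3600)


# Hypothetical example: 1000MB of change with a 1-hour RPO
rate = required_kbytes_per_s(1000, 1)
print(f"{rate:.1f} KBps")  # 277.8 KBps
```

Since block-level changes tend to be smaller than the data added, an estimate like this should err on the high side.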
The 4th scenario I tested was deleting an existing 1GB file from the disk. It replicated less than 1MB of data from the protected site to the recovery site. In most cases, when we delete a 1GB file, we treat that as 1GB of data change. However, Windows does not overwrite the disk in this case; it only deletes the pointer in the file system. Almost nothing is written to the disk, which tells vSR that no block changes happened at this level.
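For the last scenario, when we add a 2GB file, delete it, and then add a 1GB file, SRM keeps track of 2GB of sync data. In a typical environment, we might treat this as 3GB of delta changes. In this scenario, it took less than 3 minutes to complete the replication at a rate of 14000KBps. Running at 14000KBps for 3 minutes can transfer around 2400MB of data across the two sites, against the 2GB of sync data captured. Pretty close match.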
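To make the delete and add-delete-add results easier to picture, here is a toy model of dirty-block tracking in Python. It is not how vSphere Replication or NTFS is actually implemented. The class `ToyDisk` and its allocation policy are assumptions for illustration: new files take the lowest-numbered free blocks, a delete only frees blocks without writing them, and file system metadata writes are ignored.

```python
GB = 1000  # blocks, with 1 block = 1MB in this toy model


class ToyDisk:
    """Toy disk that records which blocks were written since the last sync."""

    def __init__(self, total_blocks: int) -> None:
        self.free = set(range(total_blocks))
        self.files = {}     # file name -> list of block numbers
        self.dirty = set()  # blocks written since the last sync

    def add_file(self, name: str, blocks: int) -> None:
        # Assumed policy: reuse the lowest-numbered free blocks first.
        chosen = sorted(self.free)[:blocks]
        if len(chosen) < blocks:
            raise ValueError("disk full")
        self.free -= set(chosen)
        self.files[name] = chosen
        self.dirty |= set(chosen)

    def delete_file(self, name: str) -> None:
        # Only the pointer goes away; no data block is written.
        self.free |= set(self.files.pop(name))

    def sync(self) -> int:
        """Return the number of dirty blocks and reset the tracking."""
        count = len(self.dirty)
        self.dirty.clear()
        return count


def scenario_4() -> int:
    """Delete an existing, already replicated 1GB file."""
    disk = ToyDisk(10 * GB)
    disk.add_file("old.bin", 1 * GB)
    disk.sync()  # pretend the file was replicated earlier
    disk.delete_file("old.bin")
    return disk.sync()


def scenario_5() -> int:
    """Add a 2GB file, delete it, then add a 1GB file."""
    disk = ToyDisk(10 * GB)
    disk.add_file("big.bin", 2 * GB)
    disk.delete_file("big.bin")
    disk.add_file("small.bin", 1 * GB)  # lands on blocks already dirty
    return disk.sync()


print(scenario_4())  # 0
print(scenario_5())  # 2000
```

Under these assumptions the delete costs nothing, and the last scenario dirties 2000 blocks (2GB) rather than 3000, because the 1GB file reuses blocks the 2GB file had already touched. If the file system placed the new file on untouched blocks instead, the model would report 3GB.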
The most effective way to determine the bandwidth required is to monitor the replication from time to time. It is always good to follow the steps below when it comes to vSphere Replication.
- Always start by turning on vSR for a single VM or a small number of VMs with a higher RPO (e.g. 48 hours)
- Start monitoring the bandwidth utilization with vCenter or a 3rd-party appliance on the WAN link between sites
- Fine-tune the RPO according to the business needs
- Start adding more VMs, preferably a small number at a time, and let them stabilize before adding new VMs
- Add more bandwidth as needed