In this article, we would like to share with you how to deploy System Center Data Protection Manager for large Hyper-V workloads.
System Center Data Protection Manager 2016 (DPM) is the latest release by Microsoft, and it brings many improvements and new features. DPM is well recognized in the industry for protecting Microsoft workloads and environments. With DPM 2016, you can back up the most common workloads in a modern data center. DPM can back up business workloads such as the following, whether they run on physical machines, Hyper-V, VMware, or in Microsoft Azure:
- Exchange Server
- SQL Server
- SharePoint Server
- Microsoft Dynamics
- Windows Server
- Hyper-V VMs
- System States and Active Directory
- Windows clients
Starting with DPM 2012 R2 Update Rollup 11, you can also protect virtual machines running on the VMware platform. The DPM team enabled agentless support which uses VMware’s VADP API to protect VMware VMs remotely without installing agents on vCenter or ESXi servers. VMware support for DPM 2016 is coming very soon.
In an earlier blog post, we covered the latest DPM 2016 features, including a step-by-step installation guide.
You can install SCDPM in different ways. The articles below describe in detail how to install System Center Data Protection Manager 2016; choose the installation method that suits your needs:
- How to install SCDPM 2016 on Windows Server 2016 and SQL Server 2016
- How to Automate the installation of SCDPM 2016 on Windows Server 2016
- How to Deploy SCDPM 2016 using SCVMM 2016 on Windows Server 2016
In summary, here are the latest features and improvements that are included in DPM 2016:
- Protecting data sources in mixed-mode clusters.
- Protecting Windows Server 2016 Hyper-V with Resilient Change Tracking (RCT).
- Protecting Windows Server 2016 Storage Spaces Direct (S2D) and Scale-out File Server (SOFS) over ReFS.
- Protecting Windows Server 2016 Shielded Virtual Machines.
- Storage savings and faster backups with Modern Backup Storage (MBS).
Over the years, we had the opportunity to work on large DPM projects, specifically protecting virtual machines running on top of Windows Server Hyper-V.
In the rest of this blog, we would like to share with you the design and deployment strategy for protecting Hyper-V workloads with System Center Data Protection Manager.
Windows Server Hyper-V and DPM
System Center Data Protection Manager 2016 can protect Windows Server 2016 private cloud deployments efficiently and seamlessly. In Windows Server 2016, Microsoft introduced a new technology called Resilient Change Tracking (RCT). With RCT, backup products including DPM do not need to run a Consistency Check (CC) after a sudden power loss or a VM storage migration: when you move a VM from Storage A to Storage B, DPM can keep track of the changes by using RCT (more on that in a bit).
In Windows Server 2012 R2 (and earlier) with Hyper-V, there was no built-in support for Changed Block Tracking (CBT). So, every backup vendor at that time, including Microsoft with DPM, had to implement a file system filter driver to track the changed blocks on the storage. As a result, backup vendors had to keep updating their filter drivers whenever Microsoft released a new update.
Fortunately, with Windows Server 2016 and RCT, this function moved from the backup products to the hypervisor, because Hyper-V has the information about the changed blocks in a virtual machine (VM). DPM 2016 now relies on RCT to identify the blocks that have changed and reads only those blocks, instead of tracking VM changes with a file system filter. This significantly simplifies the backup and recovery of Hyper-V VMs.
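To make the idea of changed-block tracking more concrete, here is a minimal Python sketch. It is a toy model only, not how Hyper-V or DPM implement RCT internally; the class and method names are made up for illustration. It shows why an incremental backup only has to read the blocks written since the previous backup.

```python
class TrackedDisk:
    """Toy virtual disk that records which blocks changed since the last backup."""

    def __init__(self, block_count: int) -> None:
        self.blocks = [b""] * block_count
        self.changed = set()  # indexes of blocks written since the last backup

    def write(self, index: int, data: bytes) -> None:
        # Every write is recorded by the "hypervisor" side, not by the backup product.
        self.blocks[index] = data
        self.changed.add(index)

    def incremental_backup(self) -> dict:
        """Return only the changed blocks, then reset the tracking state."""
        delta = {i: self.blocks[i] for i in sorted(self.changed)}
        self.changed.clear()
        return delta


disk = TrackedDisk(block_count=8)
disk.write(2, b"alpha")
disk.write(5, b"beta")
first = disk.incremental_backup()   # holds blocks 2 and 5

disk.write(5, b"gamma")
second = disk.incremental_backup()  # holds only block 5
```

The backup side never scans the whole disk; it simply asks the disk which blocks changed.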
Please note that when you deploy the DPM 2016 agent on any Windows Server 2016 Hyper-V host or cluster, DPM still installs the file system filter driver on each node. This gives you the flexibility to migrate VMs from Windows Server 2012 R2 to Windows Server 2016 and vice versa while DPM keeps protecting them seamlessly. However, as soon as you upgrade a virtual machine's configuration version to 8.0, DPM 2016 detects the change and automatically switches to RCT instead of the old method (file system filter).
This operation is NOT reversible: you cannot downgrade the VM configuration version once it is upgraded. If you want to keep the flexibility to move VMs between Windows Server 2012 R2 and Windows Server 2016, do NOT upgrade the VM configuration version, and DPM 2016 will continue to back up those VMs using the file system filter driver.
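The selection rule described above can be summarized in a few lines of Python. This is only a sketch of the decision as the article describes it, not DPM code. The function name is hypothetical, and version 5.0 is used as the example because it is the configuration version of Windows Server 2012 R2 VMs.

```python
def tracking_method(config_version: str) -> str:
    """Return the change-tracking method implied by a VM configuration version."""
    major, minor = (int(part) for part in config_version.split("."))
    if (major, minor) >= (8, 0):
        return "RCT"
    return "file system filter driver"


for version in ("5.0", "8.0"):
    print(version, "->", tracking_method(version))
```

Because the configuration version can only move upward, a VM that reports 8.0 stays on RCT from then on.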
Best of all, the combination of DPM and RCT is transparent and does not need any configuration or management on your side.
The following screenshot shows DPM 2016 or later in action, backing up a Hyper-V virtual machine on Windows Server 2016. The VM is at configuration version 8.0:
As you can see, each virtual hard disk (VHDX and AVHDX) has an MRT (Modified Region Table) file and an RCT (Resilient Change Tracking) file associated with it. These files keep track of all the changed blocks and travel with the disk if the virtual machine is moved to another Hyper-V host or to another storage location.
Do you want to know more about RCT and MRT files in Windows Server 2016 Hyper-V including backup architecture and the difference between different versions? We strongly recommend checking the recently published book Windows Server 2016 Hyper-V Cookbook – Second Edition!
Best Deployment Practices
From a technical perspective, there are some considerations that need to be discussed before you start with the deployment:
- The total amount of data that should be protected
- Untrusted domains/workgroup
- Network limitations between different remote sites
- SQL Server installation
- Virtual or physical deployment of the DPM server
- The need for building up backup scenarios
Before you start provisioning resources for the DPM server that you want to deploy, you must first do the following:
- Gather information about the workload. In this example, we are using Hyper-V VMs.
- Determine the number of DPM servers that will be needed.
- Decide on the backup policy you want to use.
As DPM 2016 comes with Workload-Aware Storage based on Modern Backup Storage, the DPM team released the Backup Storage Capacity Planner to help you provision storage for DPM 2016 while accounting for its storage savings and efficiency. Based on inputs such as the size, kind, and policy of backups, the Planner suggests the amount of storage needed to store the backups on disk and in Azure.
In the following example, we have 250 virtual machines of 200 GB each, which means 50,000 GB (~50 TB) of Hyper-V VMs. The backup policy dictates one recovery point per day, with 3% churn between backups. Churn is the amount of new data every day (that is, written or appended to existing backup files). For the short-term retention policy, we retain the recovery points on disk for 7 days (one week), and for long-term retention, we will leverage Azure Backup (more about online backup in the next section).
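As a rough sanity check of these inputs, the arithmetic can be sketched in Python. This is a naive back-of-the-envelope model (one full copy plus one day of churn per retained recovery point). It is not the formula used by the Backup Storage Capacity Planner and ignores any storage savings, so treat the result as an assumption-laden estimate only.

```python
VM_COUNT = 250
VM_SIZE_GB = 200
DAILY_CHURN_PERCENT = 3
RETENTION_DAYS = 7

# Total size of the protected VMs.
protected_gb = VM_COUNT * VM_SIZE_GB                           # 50,000 GB

# New data written per day at 3% churn.
daily_change_gb = protected_gb * DAILY_CHURN_PERCENT // 100    # 1,500 GB

# Naive estimate: one full replica plus 7 daily increments.
naive_total_gb = protected_gb + daily_change_gb * RETENTION_DAYS

print(f"Protected data : {protected_gb:,} GB")
print(f"Daily change   : {daily_change_gb:,} GB")
print(f"Naive estimate : {naive_total_gb:,} GB")
```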
For this example, the total requirements for on-premises short-term retention are the following:
A common question that I hear a lot is: should we deploy DPM on a virtual machine or a physical machine?
DPM servers can be deployed on either a physical or a virtual machine. However, running DPM in a virtual machine has more benefits, such as:
- Easier to move the DPM server to new hardware if needed (Portability).
- Easier to recover (Protected DPM virtual machine).
- Ability to enable deduplication on the DPM VHDXs. The VHDX files can reside on a Scale-out File Server (SOFS), on a Storage Spaces Direct (S2D) cluster, or on any other type of storage such as NAS or SAN. As of today, Dedup is only supported on the NTFS file system and NOT on ReFS. See how to reduce DPM 2016 storage by enabling deduplication on Modern Backup Storage for more details. Dedup cannot be used for volumes storing backups on physical DPM servers.
Another important point to remember is to have a dedicated backup network. A backup network for Hyper-V is not listed as a requirement by Microsoft, but we strongly recommend isolating the backup traffic from the host Management OS. By leveraging converged networking in Hyper-V, where multiple physical NICs are combined with Switch Embedded Teaming (SET) and QoS, you can isolate each type of network traffic while maintaining resiliency, as shown in the diagram below:
Check how to isolate DPM backup traffic in Hyper-V for more details.
Online Backup Protection
Online protection combined with DPM 2016 gives you a great opportunity to protect your production data in Azure, which means that you can securely maintain an optimized offsite copy of the data behind your datacenter’s hosted services.
Adding online backup with DPM 2016 and Update Rollup 2 provides you with important new security benefits, including:
- The full backup to Azure is taken only once, and it can be done with offline seeding. After that, only incremental changes are transferred. And you don’t pay for egress (outbound) data cost for recovery, unlike other online backup products.
- If a malicious user deletes backups, the backups are retained for 14 days when the enhanced security feature is enabled in Azure Backup. This helps protect your data against malware, ransomware, and intrusion attacks. Security alerts are sent to you when critical operations such as deletion of backup data are performed, which helps you keep a close watch on any intrusive operations by unauthorized personnel. Check more about the new security features for protecting hybrid backups using Azure Backup.
- Azure Backup can now be configured to ask for a Security PIN whenever a critical change, such as modification of the passphrase, is triggered on the on-premises DPM server.
- Safeguarding your backups from on-premises attacks.
- Cost-effective long-term retention that lets you keep your data for a very long time (no limits).
- DPM compresses and encrypts the data before sending it to Azure. The compression rate varies depending on the type of data protected (savings are between 30% and 70%).
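To get a feel for what the 30% to 70% savings range means, the following Python sketch applies it to the 50,000 GB example used earlier. The helper name is hypothetical, and the result is only a range estimate; the actual compression rate depends on the data being protected.

```python
def compressed_range_gb(size_gb: int, min_savings_pct: int = 30, max_savings_pct: int = 70):
    """Return (best_case_gb, worst_case_gb) for data sent to Azure after compression."""
    best = size_gb * (100 - max_savings_pct) // 100   # highest savings
    worst = size_gb * (100 - min_savings_pct) // 100  # lowest savings
    return best, worst


best_gb, worst_gb = compressed_range_gb(50_000)
print(f"Data sent to Azure: between {best_gb:,} GB and {worst_gb:,} GB")
```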
The most common scenario for DPM-BACKUP-AZURE (disk-to-disk-to-cloud) is the ability to act as a primary DPM server within a data center or branch office for short-term backup and then replicate the protected data to Azure for long-term retention and enhanced security protection as shown in Figure 1.
Figure 1. DPM Azure Online Backup
Keep the following considerations and steps in mind before you enable online backup:
- You can protect Hyper-V virtual machines, VMware virtual machines, SQL Server, SharePoint, Exchange, System State, Bare Metal Recovery, and Files and folders with online protection.
- You must have a valid Microsoft Azure subscription.
- You must create a Recovery Services vault and add a Backup vault.
- Download and install Microsoft Azure Recovery Services Agent (MARS Agent). MARS Agent will use the Windows Identity Foundation 3.5 and Microsoft .NET Framework 4.5 features.
- Download vault credentials to register the DPM server to the vault.
- Register DPM server (on-premises) using the vault credentials with Azure Backup.
- DPM storage for cloud backups is called “Scratch Space”. During recovery, backup data from Azure needs to be temporarily downloaded to a local (on-premises) staging area before it is recovered to the destination. The staging folder is automatically cleaned up after recovery.
- The online protection is only available for primary DPM servers and NOT for secondary DPM.
If you have limited network bandwidth, you can define your business hours and the amount of bandwidth that the Microsoft Azure Backup Agent may consume, as shown in Figure 2.
Figure 2. DPM Configure Throttling Settings for Azure Online Backup
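When choosing throttling values, it helps to estimate how long the daily upload would take at a given limit. The Python sketch below does that for the daily change of the earlier example (3% of 50,000 GB, before compression). The bandwidth limits are hypothetical values chosen for illustration; they are not recommendations from the article.

```python
def upload_hours(data_gb: float, bandwidth_mbps: float) -> float:
    """Hours needed to send data_gb (decimal GB) over a link capped at bandwidth_mbps."""
    megabits = data_gb * 8 * 1000  # 1 GB = 8,000 megabits in decimal units
    return megabits / bandwidth_mbps / 3600


DAILY_CHANGE_GB = 1500  # 3% of 50,000 GB, before compression

# Hypothetical throttling limits for work and non-work hours.
for label, mbps in [("work hours", 200), ("non-work hours", 500)]:
    hours = upload_hours(DAILY_CHANGE_GB, mbps)
    print(f"{label}: {hours:.1f} h at {mbps} Mbps")
```

If the estimate exceeds the available backup window, either the limit or the schedule needs to change.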
Based on the example discussed earlier, we have the following requirements for the long-term backup policy:
- 1 daily recovery point, retention for 7 days, and average churn is 3% between the recovery points.
- 1 weekly recovery point, retention for 4 weeks, and average churn is 6% between the recovery points.
- 1 monthly recovery point, retention for 12 months, and average churn is 10% between the recovery points.
- 1 yearly recovery point, retention for 7 years, and average churn is 15% between the recovery points.
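The policy above can be turned into a quick estimate with a few lines of Python. This is a naive model that assumes each retained recovery point adds its churn percentage of the 50,000 GB protected size on top of one initial full copy, before compression. It is not the calculation performed by the Backup Storage Capacity Planner or by Azure Backup, so the figures are illustrative only.

```python
PROTECTED_GB = 50_000

# (frequency, retained recovery points, churn percent between points)
policy = [
    ("daily", 7, 3),
    ("weekly", 4, 6),
    ("monthly", 12, 10),
    ("yearly", 7, 15),
]

# Number of recovery points kept online at steady state.
recovery_points = sum(count for _, count, _ in policy)

# Naive incremental data: points x churn x protected size, per tier.
incremental_gb = sum(count * PROTECTED_GB * churn // 100 for _, count, churn in policy)

# One initial full copy plus all retained increments, before compression.
naive_online_gb = PROTECTED_GB + incremental_gb

print(f"Recovery points retained : {recovery_points}")
print(f"Incremental data         : {incremental_gb:,} GB")
print(f"Naive online estimate    : {naive_online_gb:,} GB")
```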
For this example, the total requirements for on-premises (short-term policy) and online Azure backup (long-term policy) are the following:
Once you have all the requirements in place, you can start creating your protection groups based on short-term goals and specify the online backup schedule and the online retention policy. As shown in Figure 3, you can define your online protection goals using a daily, weekly, monthly, or yearly synchronization frequency.
Figure 3. DPM Create New Protection Group for Online Backup
Protecting Large Hyper-V Clusters
In DPM 2012 SP1 and 2012 R2, the DPM team introduced a feature called “Scale-Out Protection” that makes it possible to protect large, clustered Hyper-V environments. This is especially useful if you have a large cluster with more than 800 VMs, in which case you need multiple DPM servers to protect them. Scale-out protection removes the one-to-one relationship between a Hyper-V cluster and a DPM server: the DPM protection agent that runs on the Hyper-V host can attach itself to multiple DPM servers. You can therefore add a virtual machine to a protection group on any of the recognized DPM servers.
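A quick way to plan for this is to estimate how many DPM servers a cluster needs and how VMs could be spread across them. The Python sketch below uses the 800-VM figure from the paragraph above as a per-server threshold, which is an assumption; real capacity also depends on VM size and churn. The round-robin split is just one possible placement strategy, and both function names are hypothetical.

```python
import math

VMS_PER_DPM_SERVER = 800  # threshold quoted above; treated here as an assumption


def dpm_servers_needed(vm_count: int, per_server: int = VMS_PER_DPM_SERVER) -> int:
    """Minimum number of DPM servers for vm_count virtual machines."""
    return max(1, math.ceil(vm_count / per_server))


def spread_vms(vm_names, servers):
    """Assign each VM to exactly one DPM server, round-robin."""
    plan = {server: [] for server in servers}
    for position, vm in enumerate(vm_names):
        plan[servers[position % len(servers)]].append(vm)
    return plan


vms = [f"VM{n:02d}" for n in range(1, 11)]   # VM01 .. VM10
plan = spread_vms(vms, ["DPM01", "DPM02"])
print(dpm_servers_needed(2000), "DPM servers for 2,000 VMs")
print(plan)
```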
As an example of a Scale-Out Protection deployment, we have two DPM servers, DPM01 and DPM02, which are visible to all nodes of the Hyper-V cluster (nodes HV01, HV02, HV03, HV04) and to a standalone Hyper-V host, HV05. When you create protection groups on DPM01 or DPM02, you can add any of the virtual machines from VM01 to VM10 to either DPM server for protection, as shown in Figure 4.
Please note that if a VM is protected by DPM01, that VM will stay with the same DPM server unless you stop the protection of the VM.
Figure 4. DPM Scaled-Out Protection Design
In this scenario, the DPM agent needs to be installed on the standalone Hyper-V server and on each node in the Hyper-V cluster. Then you need to run the SetDPMServer command with the -Add parameter on each protected Hyper-V host to make multiple DPM servers visible to it, as follows:
SetDPMServer -Add -DPMServerName DPMSERVER-FQDN
The -Add parameter is very important here. If you don’t use it, the previous DPM server is overwritten.
Please note that a virtual machine can be protected by only one DPM server in the Scale-Out Protection configuration. You cannot protect the same virtual machine on multiple DPM members of the “Scale-Out Protection” configuration.
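The ownership rule can be modeled in a few lines of Python. This is a conceptual sketch of the constraint only, not DPM behavior or a DPM API; the class and its methods are hypothetical. A VM belongs to one DPM server at a time and can move only after its protection is stopped.

```python
class ScaleOutAssignments:
    """Toy model: each VM is owned by at most one DPM server."""

    def __init__(self, dpm_servers):
        self.dpm_servers = set(dpm_servers)
        self.owner = {}  # VM name -> DPM server that protects it

    def protect(self, vm: str, dpm_server: str) -> None:
        if dpm_server not in self.dpm_servers:
            raise ValueError(f"{dpm_server} is not part of the scale-out configuration")
        current = self.owner.get(vm)
        if current is not None and current != dpm_server:
            raise ValueError(f"{vm} is already protected by {current}")
        self.owner[vm] = dpm_server

    def stop_protection(self, vm: str) -> None:
        # After protection stops, the VM can be added to another DPM server.
        self.owner.pop(vm, None)


assignments = ScaleOutAssignments(["DPM01", "DPM02"])
assignments.protect("VM01", "DPM01")
assignments.protect("VM02", "DPM02")
```

Calling `assignments.protect("VM01", "DPM02")` at this point raises `ValueError`, because VM01 is still owned by DPM01.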
Secondary and Offsite Protection
A secondary DPM server can be designed and implemented to support backup and recovery and offsite protection scenarios. In one such scenario, a primary DPM server manages and protects the workloads that make up the services in Site A, and the services’ data dependencies are replicated to a secondary Site B.
The secondary DPM server will replicate the primary DPM server recovery points for protected data sources. The secondary DPM server can protect more than one primary DPM server. This is important to remember when implementing the backup recovery scenario. A DPM server could also be enabled for chaining, which means that it is both a primary and a secondary DPM server at the same time.
The most common scenario for DPM-DPM-OFFSITE is the ability for DPM servers to act as a primary in a branch Office A and branch Office B and then replicate the protected data to an offsite (secondary) DPM server in a third location, which is in another part of the country or the world as shown in Figure 5.
Figure 5. Primary and Secondary DPM Backup Design
Another main reason for a secondary DPM server is failover: if the primary DPM server fails, protection can be switched to the secondary DPM server.
If you are limited on network bandwidth between the sites, you can optimize any protection group by offsetting its start time (the Offset <time> start time option) and by enabling the on-the-wire compression feature, as shown in Figure 6.
Figure 6. DPM Optimize Performance
The secondary DPM server queries a VSS writer specific to the primary DPM server, called the DPM writer, which drives the replication process from the primary DPM server to the secondary. In this way, the secondary DPM server creates its own copy of the protected data that resides on the primary DPM server.
The secondary DPM server will start its replication of the primary DPM servers’ protected data sources every midnight by default. Using the protection group optimization performance is a good way to push the replication starting point forward in time.
Please note that the secondary DPM server will show only the agents that are in the same domain as the primary DPM server in each site. In other words, if you have workgroup or untrusted machines connected to the primary DPM server, you cannot protect them from the secondary DPM server.
Backup and recovery have been a natural part of the business continuity plan for many years. Combining Online Backup with System Center Data Protection Manager 2016 gives you a unique opportunity to save on storage costs, increase performance, improve reliability and scalability, and, most importantly, protect your critical data against ransomware or any intrusive operations by unauthorized personnel.
I hope this article gave you a solid foundation for protecting your on-premises investment, combined with online backup for better security and cost-effectiveness.
I encourage you to deploy and evaluate the current release of DPM 2016 including Update Rollup 2 and share your feedback in the comment section below.
Thanks for reading!