Deploying System Center Data Protection Manager 2016 for Large Hyper-V Workloads #HyperV #DPM #SCDPM #WS2016

Introduction

System Center Data Protection Manager 2016 (DPM) is the latest release from Microsoft, and with it comes a long list of improvements and new features. DPM is well recognized in the industry for protecting Microsoft workloads and environments. With DPM 2016, you can back up the most common workloads in a modern data center. DPM can back up business workloads such as the following, whether they run on physical machines, Hyper-V, VMware, or in Microsoft Azure:

  • Exchange Server
  • SQL Server
  • SharePoint Server
  • Microsoft Dynamics
  • Windows Server
  • Hyper-V VMs
  • System States and Active Directory
  • Windows clients

Starting with DPM 2012 R2 Update Rollup 11, you can also protect virtual machines running on the VMware platform. The DPM team enabled agentless support that uses VMware's VADP API to protect VMware VMs remotely without installing agents on vCenter or ESXi servers. VMware support for DPM 2016 is coming very soon.

In an earlier blog post, I covered the latest DPM 2016 features, including a step-by-step installation guide; you can read all about it here.

You can install SCDPM in different ways. Please refer to the articles below, where I describe in detail how to install System Center Data Protection Manager 2016, and choose the installation method that suits your needs:

In summary, here are the latest features and improvements included in DPM 2016:

  • Protecting data sources in mixed-mode clusters.
  • Protecting Windows Server 2016 Hyper-V with Resilient Change Tracking (RCT).
  • Protecting Windows Server 2016 Storage Spaces Direct (S2D) and Scale-out File Server (SOFS) over ReFS.
  • Protecting Windows Server 2016 Shielded Virtual Machines.
  • Storage savings and faster backups with Modern Backup Storage (MBS).

Over the years, I have had the opportunity to work on large DPM projects, specifically protecting virtual machines running on top of Windows Server Hyper-V.

In the rest of this blog, I would like to share with you the design and deployment strategy for protecting Hyper-V workloads with System Center Data Protection Manager 2016.

Windows Server 2016 Hyper-V and DPM 2016

System Center Data Protection Manager 2016 can protect Windows Server 2016 private cloud deployments efficiently and seamlessly. In Windows Server 2016, Microsoft introduced a new technology called Resilient Change Tracking (RCT). With RCT, backup products, including DPM, no longer need to run a Consistency Check (CC) after a sudden power loss or a VM storage migration; when you move a VM from Storage A to Storage B, DPM can keep track of the changes through RCT (more on that in a bit).

In Windows Server 2012 R2 (and earlier) with Hyper-V, there was no built-in support for Changed Block Tracking (CBT). So every backup vendor, including DPM, implemented a file system filter driver to track the changed blocks on the storage. This forced backup vendors to constantly update their file system drivers whenever Microsoft released a new update.

Fortunately, with Windows Server 2016 and RCT, this function moved from the backup products to the hypervisor level, as Hyper-V itself tracks the changed blocks in a virtual machine (VM). DPM 2016 now relies on RCT to identify the blocks that have changed and reads only those blocks, instead of tracking VM changes with a file system filter, which significantly simplifies the backup and recovery of Hyper-V VMs.

Please note that when you deploy the DPM 2016 agent on any Windows Server 2016 Hyper-V host or cluster, DPM still installs the file system filter driver on each node. This gives you the flexibility to migrate VMs between Windows Server 2012 R2 and Windows Server 2016 and back while DPM keeps protecting them seamlessly. However, as soon as you upgrade a virtual machine's configuration version to 8.0, DPM 2016 detects this and automatically switches to RCT instead of the old method (the file system filter). This operation is NOT reversible; you cannot downgrade the VM configuration version once it has been upgraded. So, if you want to keep the flexibility to move VMs between Windows Server 2012 R2 and Windows Server 2016, do NOT upgrade the VM configuration version, and DPM 2016 will continue to back up those VMs using the file system filter driver.

Best of all, the RCT integration in DPM is transparent and does not require any configuration or management on your side.
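
If you want to check which of your VMs are at configuration version 8.0, or upgrade them deliberately, you can do so with the in-box Hyper-V cmdlets on the host. Below is a minimal sketch; the VM name "VM01" is just an example:

# List all VMs on this host with their configuration version
# (version 8.0 is the Windows Server 2016 level that enables RCT)
Get-VM | Select-Object Name, Version, State

# Upgrade a single VM to the host's latest configuration version.
# WARNING: this is one-way; the VM can no longer be moved back to a
# Windows Server 2012 R2 host afterwards.
Update-VMVersion -Name "VM01" -Confirm:$false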

The following screenshot shows DPM 2016 backing up a Hyper-V virtual machine on Windows Server 2016; the VM is at configuration version 8.0:

SCDPM2016-HyperV-Post01

As you can see, each virtual hard disk (VHDX and AVHDX) has an MRT (Modified Region Table) file and an RCT (Resilient Change Tracking) file associated with it. These files keep track of all the changed blocks and travel with the disk if the virtual machine is moved to another Hyper-V host or to different storage.
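
A quick way to see these tracking files is to list them next to the virtual hard disks. The sketch below assumes the VM files live under D:\VMs (an example path, adjust it to your storage location):

# List the RCT and MRT tracking files that sit next to each VHDX/AVHDX
Get-ChildItem -Path "D:\VMs" -Recurse -Include *.rct, *.mrt |
    Select-Object FullName, Length, LastWriteTime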

Do you want to know more about the RCT and MRT files in Windows Server 2016 Hyper-V, including the backup architecture and the differences between versions? I strongly recommend checking out my recently published book, Windows Server 2016 Hyper-V Cookbook – Second Edition!

Best Deployment Practices

From a technical perspective, there are some considerations that need to be discussed before you start the deployment:

  • The total amount of data that should be protected
  • Untrusted domains / workgroup
  • Network limitations between different remote sites
  • SQL Server installation
  • Virtual or physical deployment of the DPM server
  • The need for building up backup scenarios

To start provisioning resources for the DPM servers you want to deploy, you must first know the following:

  • The workload to be protected. In this example, we are using Hyper-V VMs.
  • The number of DPM servers that will be needed.
  • The backup policy you want to use.

Because DPM 2016 comes with workload-aware storage based on Modern Backup Storage, the DPM team released the Backup Storage Capacity Planner to help you provision storage for DPM 2016 while taking advantage of its storage savings and efficiency. Based on inputs such as the size, kind, and policy of the backups, the planner suggests the amount of storage needed to store the backups on disk and in Azure.

In the following example, we have 250 virtual machines of 200 GB each, which means 50,000 GB (~50 TB) of Hyper-V VMs. The backup policy dictates one recovery point per day, with 3% churn between backups. Churn is the amount of new data each day (that is, data written or appended to existing backup files). For the short-term retention policy, we retain the recovery points on disk for 7 days (one week), and for long-term retention we will leverage Azure Backup (more about online backup in the next section).
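
To make the arithmetic behind these numbers concrete, here is a rough back-of-the-envelope calculation for this example. Note that this is only a simplification; the Backup Storage Capacity Planner also factors in Modern Backup Storage savings, growth, and metadata, so its output will differ:

# Rough sizing for the short-term (disk) retention in this example
$vmCount       = 250
$vmSizeGB      = 200
$dailyChurn    = 0.03   # 3% of the protected data changes per day
$retentionDays = 7      # one recovery point per day, kept for a week

$initialReplicaGB = $vmCount * $vmSizeGB                              # 50,000 GB (~50 TB)
$recoveryPointsGB = $initialReplicaGB * $dailyChurn * $retentionDays  # ~10,500 GB of churn

"Initial replica : {0:N0} GB" -f $initialReplicaGB
"Recovery points : {0:N0} GB" -f $recoveryPointsGB
"Estimated total : {0:N0} GB" -f ($initialReplicaGB + $recoveryPointsGB)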

For this example, the total requirements for on-premises short-term retention are the following:

SCDPM2016-HyperV-Post02

A common question I hear a lot is: should we deploy DPM on a virtual machine or a physical machine?

The DPM server can be deployed on either a physical or a virtual machine. However, running DPM in a virtual machine has additional benefits, such as:

  • Easier to move the DPM server to new hardware if needed (portability).
  • Easier to recover (the DPM virtual machine itself can be protected).
  • The ability to enable deduplication on the DPM VHDX files; these VHDX files can reside on a Scale-Out File Server (SOFS), on a Storage Spaces Direct (S2D) cluster, or on any other type of storage such as NAS or SAN. As of today, deduplication is supported only on the NTFS file system and NOT on ReFS. See how to reduce DPM 2016 storage by enabling deduplication on Modern Backup Storage for more details, and see the sketch after this list. Deduplication cannot be used for volumes storing backups on physical DPM servers.
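
As a minimal sketch of the deduplication point above, the commands below are run on the file server (SOFS node) that hosts the DPM VHDX files, not on the DPM server itself. The volume letter is only an example and assumes the Data Deduplication feature is installed and the volume is NTFS:

# Enable deduplication on the NTFS volume that stores the DPM VHDX files;
# "Backup" is the usage type intended for virtualized backup workloads
Enable-DedupVolume -Volume "E:" -UsageType Backup

# Check the savings after the optimization jobs have had time to run
Get-DedupStatus -Volume "E:" | Select-Object Volume, SavedSpace, SavingsRate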

Another important point is to have a dedicated backup network. A backup network for Hyper-V is not listed as a requirement by Microsoft, but I strongly recommend isolating the backup traffic from the host management OS. By leveraging converged networking in Hyper-V, where multiple physical NICs are combined with Switch Embedded Teaming (SET) and QoS, you can isolate each type of network traffic while maintaining resiliency, as shown in the diagram below:

SCDPM2016-HyperV-Post03

See how to isolate DPM backup traffic in Hyper-V for more details.
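
As a minimal sketch of such a converged setup (the physical NIC names, switch name, and VLAN ID are all example values, and the QoS/bandwidth settings shown in the diagram are left out here), you could build the SET switch and a dedicated backup vNIC like this:

# Create a converged virtual switch with Switch Embedded Teaming (SET)
New-VMSwitch -Name "SETswitch" -NetAdapterName "NIC1","NIC2" -EnableEmbeddedTeaming $true -AllowManagementOS $false

# Add dedicated host vNICs for management and backup traffic
Add-VMNetworkAdapter -ManagementOS -Name "Management" -SwitchName "SETswitch"
Add-VMNetworkAdapter -ManagementOS -Name "Backup" -SwitchName "SETswitch"

# Tag the backup vNIC with its own VLAN so the DPM traffic stays isolated
Set-VMNetworkAdapterVlan -ManagementOS -VMNetworkAdapterName "Backup" -Access -VlanId 200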

Online Backup Protection

Online protection combined with DPM 2016 gives you a great opportunity to protect your production data in Azure, which means you can securely maintain an optimized offsite copy of the data behind your datacenter's hosted services.

Adding online backup with DPM 2016 and Update Rollup 2 provides important new security benefits, including:

  • The full backup to Azure happens only once, and it can be done with offline seeding; after that, only the incremental changes are sent. And unlike other online backup products, you don't pay egress (outbound) data costs for recovery.
  • If a malicious user deletes backups, the backup data is retained for 14 days when the enhanced security features are enabled in Azure, which helps protect your data against malware, ransomware, and intrusion attacks. Security alerts are sent to you when critical operations such as deletion of backup data are performed, helping you keep a close watch on any intrusive operations made by unauthorized personnel. Check out the new security features for protecting hybrid backups using Azure Backup for more details.
  • Azure Backup can now be configured to ask for a security PIN whenever a critical change, such as modification of the passphrase, is triggered on the on-premises DPM server.
  • Safeguarding your backups from on-premises attacks.
  • Cost-effective long-term retention to protect your data for very long periods (no limits).
  • DPM compresses and encrypts the data before sending it to Azure. The compression rate varies depending on the type of data protected (savings are typically between 30% and 70%).

The most common scenario for DPM-BACKUP-AZURE (disk-to-disk-to-cloud) is for a DPM server to act as the primary backup target within a datacenter or branch office for short-term backup, and then replicate the protected data to Azure for long-term retention and enhanced security, as shown in Figure 1.

SCDPM2016-HyperV-Post04

Figure 1. DPM Azure Online Backup

There are some considerations and prerequisites to keep in mind before you enable online backup:

  • You can protect Hyper-V virtual machines, VMware virtual machines, SQL Server, SharePoint, Exchange, System State, Bare Metal Recovery, and files and folders with online protection.
  • You must have a valid Microsoft Azure subscription.
  • You must create a Recovery Services vault in Azure.
  • Download and install the Microsoft Azure Recovery Services agent (MARS agent). The MARS agent requires the Windows Identity Foundation 3.5 and Microsoft .NET Framework 4.5 features.
  • Download the vault credentials to register the DPM server with the vault.
  • Register the on-premises DPM server with Azure Backup using the vault credentials (see the sketch after this list).
  • DPM needs local storage for cloud backups called the scratch space (staging area). During recovery, backup data from Azure is temporarily downloaded to this on-premises staging area before it is recovered to the destination. The staging folder is automatically cleaned up after recovery.
  • Online protection is only available for primary DPM servers and NOT for secondary DPM servers.
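
The registration and subscription settings can also be configured from the DPM Management Shell. The following is a hedged sketch only; the server name, vault credentials path, scratch space path, and passphrase are example values, and the exact cmdlet parameters may vary slightly between update rollups:

# Register the on-premises DPM server with the Recovery Services vault
Start-DPMCloudRegistration -DPMServerName "DPM01" -VaultCredentialsFilePath "C:\Temp\MyVault_VaultCredentials.VaultCredentials"

# Configure the scratch space (staging area) and the encryption passphrase
$setting    = Get-DPMCloudSubscriptionSetting -DPMServerName "DPM01"
$passphrase = ConvertTo-SecureString -String "<your-strong-passphrase>" -AsPlainText -Force

Set-DPMCloudSubscriptionSetting -DPMServerName "DPM01" -SubscriptionSetting $setting -StagingAreaPath "D:\DPMScratch"
Set-DPMCloudSubscriptionSetting -DPMServerName "DPM01" -SubscriptionSetting $setting -EncryptionPassphrase $passphrase
Set-DPMCloudSubscriptionSetting -DPMServerName "DPM01" -SubscriptionSetting $setting -Commit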

If you have limited network bandwidth, you can define your business hours and the amount of bandwidth the Microsoft Azure Backup agent is allowed to consume, as shown in Figure 2.

SCDPM2016-HyperV-Post05

Figure 2. DPM Configure Throttling Settings for Azure Online Backup
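
The throttling settings in Figure 2 can also be scripted. The sketch below throttles backup traffic to 512 KB/s during work hours and 2 MB/s outside them (bandwidth values are in bytes per second; the server name, days, and hours are example values, and parameter names may differ slightly between update rollups):

$setting = Get-DPMCloudSubscriptionSetting -DPMServerName "DPM01"

# Define work days/hours and the bandwidth allowed inside and outside them
Set-DPMCloudSubscriptionSetting -DPMServerName "DPM01" -SubscriptionSetting $setting -WorkDay "Mo","Tu","We","Th","Fr" -StartWorkHour "09:00:00" -EndWorkHour "18:00:00" -WorkHourBandwidth 524288 -NonWorkHourBandwidth 2097152

Set-DPMCloudSubscriptionSetting -DPMServerName "DPM01" -SubscriptionSetting $setting -Commit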

Based on the example discussed earlier, we have the following requirements for the long-term backup policy:

  • 1 daily recovery point, retained for 7 days, with an average churn of 3% between recovery points.
  • 1 weekly recovery point, retained for 4 weeks, with an average churn of 6% between recovery points.
  • 1 monthly recovery point, retained for 12 months, with an average churn of 10% between recovery points.
  • 1 yearly recovery point, retained for 7 years, with an average churn of 15% between recovery points.

For this example, the total requirements for on-premises backup (short-term policy) and online Azure backup (long-term policy) are the following:

SCDPM2016-HyperV-Post06

Once you have all the requirements in place, you can start creating your protection groups based on short-term goals and specify the online backup schedule and the online retention policy. As shown in Figure 3, you can define your online protection goals using a daily, weekly, monthly, or yearly synchronization frequency.

SCDPM2016-HyperV-Post07

Figure 3. DPM Create New Protection Group for Online Backup
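
For reference, the same kind of protection group can also be created from the DPM Management Shell. The sketch below covers only the short-term disk goals from the example (the server, host, and VM names are hypothetical, and the exact cmdlets and parameters may differ slightly between DPM versions); the online goals are easier to add afterwards from the console wizard shown in Figure 3:

# Create a new protection group on the DPM server and make it modifiable
$pg  = New-DPMProtectionGroup -DPMServerName "DPM01" -Name "Hyper-V VMs"
$mpg = Get-DPMModifiableProtectionGroup $pg

# Find the Hyper-V host and the VM data sources to protect
$ps = Get-DPMProductionServer -DPMServerName "DPM01" | Where-Object { $_.ServerName -eq "HV01" }
$ds = Get-DPMDatasource -ProductionServer $ps -Inquire | Where-Object { $_.Name -like "*VM01*" }
Add-DPMChildDatasource -ProtectionGroup $mpg -ChildDatasource $ds

# Short-term goals: disk-based protection, 7-day retention, one recovery point per day
Set-DPMProtectionType  -ProtectionGroup $mpg -ShortTerm Disk
Set-DPMPolicyObjective -ProtectionGroup $mpg -RetentionRangeInDays 7 -SynchronizationFrequencyMinutes 1440

# Create the initial replica now and commit the protection group
Set-DPMReplicaCreationMethod -ProtectionGroup $mpg -Now
Set-DPMProtectionGroup -ProtectionGroup $mpg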

Protecting Large Hyper-V Clusters

In DPM 2012 SP1 and 2012 R2, the DPM team introduced a feature called "Scale-Out Protection" that makes it possible to protect large, clustered Hyper-V environments. This is especially useful if you have a large cluster with more than 800 VMs, in which case you need multiple DPM servers to protect them. Scale-Out Protection removes the one-to-one relationship between a Hyper-V cluster and a DPM server: the DPM protection agent that runs on a Hyper-V host can attach itself to multiple DPM servers, so you can add a virtual machine to a protection group on any of the recognized DPM servers.

As an example of deploying Scale-Out Protection, we have two DPM servers, DPM01 and DPM02, which are visible to all nodes of a Hyper-V cluster (HV01, HV02, HV03, HV04) and to a standalone Hyper-V host, HV05. When you create protection groups on DPM01 or DPM02, you can add any of the virtual machines VM01 through VM10 for protection on either DPM server, as shown in Figure 4.

Please note that once a VM is protected by DPM01, it stays with that DPM server unless you stop protection for the VM.

SCDPM2016-HyperV-Post08

Figure 4. DPM Scale-Out Protection Design

In this scenario, the DPM agent needs to be installed on the standalone Hyper-V server and on each node of the Hyper-V cluster. Then you run the SetDPMServer command with the -Add parameter, from the DPM agent installation folder on each protected Hyper-V host, to make the additional DPM servers visible to it, as follows:

SetDPMServer -Add -DPMServerName DPMSERVER-FQDN

The -Add parameter is very important here. If you don't use it, the existing DPM server configuration is overwritten.

Please note that a virtual machine can only be protected by one DPM server that is a member of the Scale-Out Protection configuration at any given time. You cannot protect the same virtual machine on multiple DPM servers within the Scale-Out Protection configuration.

Secondary and Offsite Protection

A secondary DPM server can be designed and implemented to support backup, recovery, and offsite protection scenarios. In one scenario, a primary DPM server manages and protects workloads in Site A and replicates the protected data to a secondary DPM server in Site B.

The secondary DPM server replicates the primary DPM server's recovery points for the protected data sources. A secondary DPM server can protect more than one primary DPM server, which is important to remember when designing the backup and recovery scenario. A DPM server can also be enabled for chaining, which means it acts as both a primary and a secondary DPM server at the same time.

The most common scenario for DPM-DPM-OFFSITE is for DPM servers to act as primaries in branch office A and branch office B and then replicate the protected data to an offsite (secondary) DPM server in a third location, possibly in another part of the country or the world, as shown in Figure 5.

SCDPM2016-HyperV-Post09

Figure 5. Primary and Secondary DPM Backup Design

Another main reason for a secondary DPM server is disaster recovery: if the primary DPM server dies, protection can be switched to the secondary DPM server.

If network bandwidth between the sites is limited, you can optimize any protection group via the Offset <time> start time and by enabling the on-the-wire compression feature, as shown in Figure 6.

SCDPM2016-HyperV-Post10

Figure 6. DPM Optimize Performance
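
On-the-wire compression can also be enabled from the DPM Management Shell; a minimal sketch (the protection group name is an example, and the cmdlet parameters may differ slightly between DPM versions):

# Enable on-the-wire compression on an existing protection group
$pg  = Get-DPMProtectionGroup -DPMServerName "DPM01" | Where-Object { $_.FriendlyName -eq "Hyper-V VMs" }
$mpg = Get-DPMModifiableProtectionGroup $pg

Set-DPMPerformanceOptimization -ProtectionGroup $mpg -EnableCompression
Set-DPMProtectionGroup -ProtectionGroup $mpg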

The secondary DPM server queries the primary DPM server's dedicated VSS writer, called the DPM writer, which drives the replication process from the primary DPM server to the secondary. The secondary DPM server then creates its own copy of the protected data sources that reside on the primary DPM server.

By default, the secondary DPM server starts replicating the primary DPM servers' protected data sources every night at midnight. Using the protection group performance optimization settings is a good way to shift the replication start time.

Please note that the secondary DPM server will only show the agents that are in the same domain as the primary DPM server in each site. In other words, if you have workgroup or untrusted machines connected to the primary DPM server, you cannot protect them from the secondary DPM server.

Conclusion

Backup and recovery have been a natural part of business continuity plans for many years. Combining online backup with System Center Data Protection Manager 2016 gives you a unique opportunity to save on storage costs, increase performance, gain reliability and scalability, and, most important, protect your critical data against ransomware and intrusive operations made by unauthorized personnel.

I hope this article gave you a solid foundation for protecting your on-premises investment, combined with online backup for better security and cost-effectiveness.

I encourage you to deploy and evaluate the current release of DPM 2016 including Update Rollup 2 and share your feedback in the comment section below.

Thanks for reading!
-Ch@rbel-
