(Solution) VM Is Not Booting After Failover To Azure With Azure Site Recovery #ASR

Introduction

Azure Site Recovery (ASR) is Microsoft’s disaster recovery service. It keeps workloads safe and recoverable by replicating on-premises machines, such as physical servers, Hyper-V VMs and VMware VMs, to Azure. You can also replicate Azure VMs between Azure regions. Although ASR is designed for disaster recovery, it can also be used to migrate virtual machines to the cloud, an approach known as “Lift and Shift”.

So what is the difference between the two scenarios?

  • For disaster recovery, you replicate machines on a regular basis to Azure. When an outage occurs on-premises or in one of the Azure datacenters, you can fail the machines over from the primary site to the secondary Azure site and access them from there. When the primary site is available again, you can fail back.
  • For migration, you replicate on-premises machines to Azure. Then you fail the VM over from on-premises to Azure and complete the migration process. There’s no failback involved. The key feature of the migration scenario is that ASR automatically converts your on-premises VHDX disks to VHD and uploads them to Azure Blob Storage.

I was recently migrating a couple of Hyper-V virtual machines from on-premises to Azure, and after I performed a test failover, I noticed that the virtual machine was not booting.

The issue

When I initiated the Test Failover in Azure to make sure that everything would work as expected, I noticed that the virtual machine was not booting. I checked the Boot diagnostics and found the following message:

No operating system was loaded.

This behavior is unexpected for two reasons. First, the on-premises virtual machine is Generation 1. Second, even if the VM were Generation 2, Azure Site Recovery lets you migrate or replicate Generation 2 machines to Azure by converting them to Generation 1 at the time of failover, which makes sure the VM boots successfully. So the “No UEFI-compatible file system was found” message does NOT make sense here.

Finding the cause

I reached out to the ASR team since this was strange behavior. After two days of investigation, we found out that Standard_DC* is a very recent VM size (SKU) that Azure has introduced. It runs on the Intel Xeon platform and isn’t compatible with the format in which the disks get replicated in my scenario. Since I did not explicitly pick a target VM size in this case, Azure Site Recovery picked one from the list of sizes available in my Azure subscription. Behind the scenes, it tries to match the number of cores and the RAM size of the on-premises VM as closely as possible, and unfortunately the algorithm ended up picking this incompatible size.
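To make the failure mode concrete, here is a minimal Python sketch of a "closest match on cores and RAM" selection. It is not Microsoft's actual algorithm, which is not public in this post. The `VmSize` class, the `pick_target_size` function and the size table are hypothetical names and illustrative values I made up for the example. The sketch only shows how a naive closest-match rule can land on a Standard_DC* size, and how excluding that prefix avoids it.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class VmSize:
    """Hypothetical description of an Azure VM size (illustrative values only)."""
    name: str
    cores: int
    memory_gib: float


def pick_target_size(
    sizes: Iterable[VmSize],
    cores: int,
    memory_gib: float,
    excluded_prefixes: Tuple[str, ...] = ("Standard_DC",),
) -> Optional[VmSize]:
    """Return the smallest size that covers the source VM, skipping excluded families.

    This is an assumed matching rule for illustration, not the real ASR logic.
    """
    prefixes = tuple(excluded_prefixes)
    candidates = [
        s for s in sizes
        if not s.name.startswith(prefixes)      # skip incompatible families
        and s.cores >= cores                    # must have enough cores
        and s.memory_gib >= memory_gib          # must have enough RAM
    ]
    if not candidates:
        return None
    # Prefer the least over-provisioned size: fewest extra cores, then least extra RAM.
    return min(candidates, key=lambda s: (s.cores - cores, s.memory_gib - memory_gib))


# Illustrative size table; the order mimics a list where the DC size comes first.
SIZES = [
    VmSize("Standard_DC2s", 2, 8),
    VmSize("Standard_DS2_v2", 2, 7),
    VmSize("Standard_DS3_v2", 4, 14),
    VmSize("Standard_D2s_v3", 2, 8),
]

# On-premises VM with 2 cores and 8 GiB of RAM.
naive = pick_target_size(SIZES, 2, 8, excluded_prefixes=())
safe = pick_target_size(SIZES, 2, 8)
print(naive.name)  # Standard_DC2s
print(safe.name)   # Standard_D2s_v3
```

With no exclusion, the DC size is an exact match on cores and RAM and wins. With the prefix excluded, the next exact match is chosen instead.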

A quick word about the Azure DC-Series VMs. Microsoft has revealed that the new series of confidential computing virtual machines, the DC-Series, which went into public preview earlier this month, is based on Generation 2 Hyper-V virtual machines. So this is the first time that a non-Generation 1 VM is available in Azure. If you are interested in Azure Confidential Computing, I highly recommend reading the article on how to protect your data in use with the public preview of Azure confidential computing.

Fixing the issue

At the time of this writing, if you notice that the virtual machine replicated to Azure ended up with a Standard_DC* target VM size, you need to change the target VM size before you initiate the failover.

Please follow the steps below to change the target VM size:

  1. Go to the Azure Portal, browse to Recovery Services vaults, and select the vault that you are replicating to.
  2. In the vault menu, under the Protected items section, select Replicated items.
  3. Select the VM to open the VM details page.
  4. On the VM details page, select Compute and Network from the left menu.
  5. On the page that opens, click Edit, update the Target VM size (you can try a Standard_DS* series size), and then click Save.
  6. Clean up the test failover and then perform the test failover again.
  7. The test VM created this time should use the new size you specified, and it should boot properly.
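If you are replicating many machines, it can help to check the target sizes in bulk before failing over. The following is a small Python sketch of that check. It assumes you have already collected the VM names and their target sizes into a dictionary by whatever means you prefer. The `needs_resize` and `items_to_fix` helpers and the sample data are hypothetical and are not part of any Azure SDK.

```python
from fnmatch import fnmatchcase
from typing import Dict, List


def needs_resize(target_size: str, pattern: str = "Standard_DC*") -> bool:
    """Return True if the target size matches the problematic size pattern."""
    return fnmatchcase(target_size, pattern)


def items_to_fix(replicated_items: Dict[str, str]) -> List[str]:
    """List the VM names whose target size should be changed before failover."""
    return [name for name, size in replicated_items.items() if needs_resize(size)]


# Hypothetical inventory: VM name -> target VM size shown under Compute and Network.
inventory = {
    "web01": "Standard_DC2s",
    "sql01": "Standard_DS3_v2",
    "app01": "Standard_DC4s",
}

print(items_to_fix(inventory))  # ['web01', 'app01']
```

Any VM returned by `items_to_fix` is a candidate for the manual steps above.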

Last but not least, Microsoft is working on a permanent fix so that the right VM size is picked when replicating to Azure.

Thanks to Bharath and Manish on the ASR team for their help in getting to the bottom of this.

Hope this helps someone out there!

Thank you for reading my blog.

If you have any questions or feedback, please leave a comment.

-Charbel Nemnom-
