(Solution) Azure VM Could Not Start – The BEK Volume For BitLocker Is Missing!

Introduction

Azure Disk Encryption is a capability, announced in May 2016, that lets you encrypt your Windows and Linux IaaS VM disks. It leverages the industry-standard BitLocker feature of Windows and the DM-Crypt feature of Linux to provide OS and data disk encryption that helps protect and safeguard your data. The solution is integrated with Azure Key Vault to help you control and manage the disk encryption keys and secrets in your key vault subscription, while ensuring that all data on the virtual machine disks is encrypted at rest in your Azure storage.

When you apply the Disk Encryption management solution, you can satisfy the following business needs:

  • IaaS VMs are secured at rest by using industry-standard encryption technology to address organizational security and regulatory compliance requirements.
  • IaaS VMs boot under customer-controlled keys and policies. You can audit their usage in your key vault, whether you use a cloud-native key management system or a hybrid (bring your own key) approach. In certain scenarios, there may be regulatory, policy, or technical reasons why you can’t store keys in a key management system provided by a public cloud service, so you can hold your own key on-premises.

If you are not using Azure Disk Encryption today, I highly recommend enabling encryption for maximum security. For more information about Azure Disk Encryption, please check the following article to get started.

I was recently involved in enabling disk encryption for several Azure IaaS Windows virtual machines, and everything went smoothly as expected.

The issue

The other day, I noticed that one of the Windows VMs, a Standard A-series size, would not start. When I checked the Boot diagnostics, I found the following message:

Plug in the USB drive that has the BitLocker key.
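
If you prefer PowerShell over the portal, the same boot diagnostics screenshot can likely be pulled with the Az module. This is a minimal sketch, assuming the Az.Compute module is installed, you are signed in with Connect-AzAccount, and boot diagnostics is enabled on the VM. The resource group, VM name, and local folder are hypothetical placeholders.

```powershell
# Hypothetical placeholders: replace with your own values.
$rg        = 'my-resource-group'
$vmName    = 'my-encrypted-vm'
$localPath = 'C:\Temp\bootdiag'

# Make sure the target folder exists before downloading.
New-Item -ItemType Directory -Path $localPath -Force | Out-Null

# Download the boot diagnostics data (including the console screenshot)
# for a Windows VM into the local folder.
Get-AzVMBootDiagnosticsData -ResourceGroupName $rg -Name $vmName -Windows -LocalPath $localPath
```

The downloaded screenshot should show the same BitLocker prompt as the Boot diagnostics blade in the portal.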

This problem may occur if the VM cannot locate the BitLocker Encryption Key (BEK) files to decrypt the encrypted disk, or if, for some reason, the BitLocker keys were deleted from the “Bek Volume” inside the VM.

The BitLocker encryption keys (BEKs) are used to encrypt the OS boot volume and data volumes. BEKs are safeguarded in a key vault as secrets. Microsoft has a detailed guide on troubleshooting BitLocker boot errors on an Azure VM. If the BEK files are missing, you can simply stop and deallocate the VM, and then start it again. This operation forces the VM to retrieve the BEK file from Azure Key Vault and put it on the encrypted disk. However, none of the solutions mentioned in the troubleshooting guide worked for me…
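
For reference, the stop, deallocate, and start cycle described above can be sketched in PowerShell. This assumes the Az.Compute module and an authenticated session; the resource group and VM name are hypothetical placeholders.

```powershell
# Hypothetical placeholders: replace with your own values.
$rg     = 'my-resource-group'
$vmName = 'my-encrypted-vm'

# Optional check: show whether the OS and data volumes are reported as encrypted.
Get-AzVMDiskEncryptionStatus -ResourceGroupName $rg -VMName $vmName

# Stop AND deallocate the VM. Stop-AzVM deallocates by default;
# do not pass -StayProvisioned, which would only shut the VM down.
Stop-AzVM -ResourceGroupName $rg -Name $vmName -Force

# Start the VM again so the platform can provide the BEK file from Key Vault.
Start-AzVM -ResourceGroupName $rg -Name $vmName
```

In my case this cycle alone did not bring the VM back, but it is the first thing worth trying.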

Finding the cause

After digging through the logs, I found out that this VM had been automatically redeployed at midnight due to an unexpected failure on the host server in the Azure datacenter!

The auto-recovery action was triggered by a hardware issue on the physical node where the virtual machine was hosted. As designed, the VM was automatically moved to a different, healthy physical node to avoid further impact, so any request to services running inside the VM may have failed during this time. This behavior is known as Service Healing (auto-recovery of virtual machines), which is built into the Azure platform. To learn more about Azure automated recovery actions, please read the following article.

To ensure an increased level of protection and redundancy for your application in Azure, it is recommended that you group two or more virtual machines in an availability set. However, this VM is not a mission-critical workload, so it was not part of an availability set.

I reached out to the Azure team since this was odd behavior. After investigating, they found that there is a known issue with encrypted VMs that use a Standard size. The VM in question is a Standard_A2_v2, which does not support Premium storage, and this could be what is causing the issue. However, my Standard VM size is a fully supported scenario for enabling disk encryption. To understand the requirements and limitations for disk encryption, please check the following article.

Fixing the issue

At the time of this writing, if you encounter this issue, you need to resize the VM to a size that supports Premium storage so that it can boot, and then resize it back to its original Standard size afterward.
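
To find candidate sizes from PowerShell rather than from the documentation, one option is to look at the SKU capabilities that Azure reports. The sketch below assumes the Az.Compute module and treats a reported PremiumIO capability of True as meaning the size supports Premium storage; the resource group and VM name are hypothetical placeholders.

```powershell
# Hypothetical placeholders: replace with your own values.
$rg     = 'my-resource-group'
$vmName = 'my-encrypted-vm'

$vm = Get-AzVM -ResourceGroupName $rg -Name $vmName

# Sizes this particular VM can currently be resized to.
$resizable = Get-AzVMSize -ResourceGroupName $rg -VMName $vmName |
    Select-Object -ExpandProperty Name

# VM sizes in the VM's region whose SKU reports the PremiumIO capability.
$premium = Get-AzComputeResourceSku |
    Where-Object { $_.ResourceType -eq 'virtualMachines' -and $_.Locations -contains $vm.Location } |
    Where-Object { ($_.Capabilities | Where-Object { $_.Name -eq 'PremiumIO' }).Value -eq 'True' } |
    Select-Object -ExpandProperty Name

# Premium storage capable sizes that are also valid resize targets for this VM.
$premium | Where-Object { $resizable -contains $_ } | Sort-Object -Unique
```

Any size in the resulting list should work as the temporary target; pick one close to the original size to keep the cost impact small.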

To resolve this issue, please follow the steps below:

  1. Although a backup of this VM already existed, I would rather be safe than sorry, so I created a snapshot of the VM as well. You can use the Azure Portal to create a snapshot of each virtual hard disk attached to the VM, or use PowerShell with the New-AzSnapshot cmdlet.
  2. Next, you need to resize the VM in the Azure Portal to a VM size that supports Premium storage. You can always change this back once you have successfully booted the VM after the resize. Microsoft's documentation lists the general purpose VM sizes and whether they support Premium storage.
  3. Before you resize the VM, navigate to the VM in question and Stop (deallocate) it.
  4. Select “Size” from the navigation bar on the left and select one of the sizes that support Premium storage.
  5. After the VM has been resized, do not make any changes to the disks; leave them as Standard HDD.
  6. Once the resize has completed, start the VM again. When the VM has fully started, try to RDP into it. Please note that the VM may have to be restarted again if the RDP connection can’t be made.
  7. Finally, you can resize the VM back down to the Standard A size without stopping (deallocating) it. The VM will automatically reboot, and you should be able to RDP to the machine successfully! Please note that if you decide to resize the VM back to a Standard A size, make sure to choose the size that was previously assigned to it and not a different A size; in my case, it’s Standard_A2_v2.
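
The steps above can also be sketched in PowerShell. This is not the exact procedure I ran (I used the portal); it is a rough equivalent that assumes the Az.Compute module, an authenticated session, and managed disks. The resource group, VM name, snapshot names, and the temporary size Standard_DS2_v2 are assumptions chosen for illustration; only Standard_A2_v2 comes from my environment.

```powershell
# Hypothetical placeholders: replace with your own values.
$rg            = 'my-resource-group'
$vmName        = 'my-encrypted-vm'
$originalSize  = 'Standard_A2_v2'    # the size the VM had before (from this post)
$temporarySize = 'Standard_DS2_v2'   # assumption: any Premium storage capable size

$vm = Get-AzVM -ResourceGroupName $rg -Name $vmName

# Step 1: snapshot the OS disk and every data disk (managed disks assumed).
$diskIds = @($vm.StorageProfile.OsDisk.ManagedDisk.Id) +
           @($vm.StorageProfile.DataDisks | ForEach-Object { $_.ManagedDisk.Id }) |
           Where-Object { $_ }
foreach ($diskId in $diskIds) {
    $diskName = ($diskId -split '/')[-1]
    $config = New-AzSnapshotConfig -SourceUri $diskId -Location $vm.Location -CreateOption Copy
    New-AzSnapshot -ResourceGroupName $rg -SnapshotName "$diskName-snap" -Snapshot $config
}

# Step 3: stop (deallocate) the VM before resizing.
Stop-AzVM -ResourceGroupName $rg -Name $vmName -Force

# Steps 2 and 4: switch to a size that supports Premium storage.
# Step 5: the disks themselves are left untouched.
$vm = Get-AzVM -ResourceGroupName $rg -Name $vmName
$vm.HardwareProfile.VmSize = $temporarySize
Update-AzVM -ResourceGroupName $rg -VM $vm

# Step 6: start the VM, then verify that you can RDP into it.
Start-AzVM -ResourceGroupName $rg -Name $vmName

# Step 7: once RDP works, resize back to the original size while the VM is running.
# The VM reboots as part of the resize.
$vm = Get-AzVM -ResourceGroupName $rg -Name $vmName
$vm.HardwareProfile.VmSize = $originalSize
Update-AzVM -ResourceGroupName $rg -VM $vm
```

Run the last block only after you have confirmed that the VM boots and accepts RDP connections on the temporary size.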

Last but not least, Microsoft is actively investigating this issue to determine the root cause and find a permanent solution for Standard A size VMs with disk encryption.

Thanks to the Azure team for their help in getting to the bottom of this.

Hope this helps someone out there!

Thank you for reading my blog.

If you have any questions or feedback, please leave a comment.

-Charbel Nemnom-
