Since Hyper-V's inception back in 2008, you have been able to configure a virtual machine to use a dynamic or a static MAC address for any given network adapter.
By default, Hyper-V uses dynamic MAC addresses, which means that Hyper-V generates an initial MAC address for each network adapter, whether it belongs to a VM (vmNIC) or to the host (vNIC), and regenerates the MAC address if it believes that is necessary.
If you use static MAC addresses, you need to specify the MAC address manually, but Hyper-V will never change it.
Each Hyper-V server has a MAC address range that it uses for generating new dynamic MAC addresses. You can also configure this range yourself if you want to.
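As a rough sketch of how that range can be inspected from PowerShell, the snippet below reads it with `Get-VMHost` and counts the addresses it contains. The helper name `Get-MacRangeSize` and the range values in the commented-out `Set-VMHost` line are illustrative assumptions, not values from the cluster described here.

```powershell
# Hypothetical helper: counts the addresses between two MAC strings (inclusive).
function Get-MacRangeSize {
    param(
        [Parameter(Mandatory)][string]$Minimum,
        [Parameter(Mandatory)][string]$Maximum
    )
    # Strip optional separators, then parse the hex digits as a 64-bit integer.
    $min = [Convert]::ToInt64(($Minimum -replace '[-:]', ''), 16)
    $max = [Convert]::ToInt64(($Maximum -replace '[-:]', ''), 16)
    return $max - $min + 1
}

# Only query the host when the Hyper-V PowerShell module is available.
if (Get-Command Get-VMHost -ErrorAction SilentlyContinue) {
    $vmHost = Get-VMHost
    $vmHost | Select-Object ComputerName, MacAddressMinimum, MacAddressMaximum
    Get-MacRangeSize -Minimum $vmHost.MacAddressMinimum -Maximum $vmHost.MacAddressMaximum
}

# To change the range (example values only; the range must not overlap
# with the range of any other Hyper-V host on the same network):
# Set-VMHost -MacAddressMinimum '00155D010000' -MacAddressMaximum '00155D0101FF'
```

Counting the range this way makes it easy to judge whether the pool could plausibly be exhausted by the number of adapters on the host.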
I have a two-node Storage Spaces Direct (S2D) hyper-converged cluster. After creating a new VM on one of the nodes, I could not start that virtual machine. The error message was:
[An error occurred while attempting to start the selected virtual machine(s). Failed to Power on with Error ‘Attempt to access invalid address.’ (0x800701E7). No available MAC address for ‘Network Adapter’].
When I checked the virtual network adapter for that virtual machine, I noticed that the MacAddress was set to zero: 00:00:00:00:00:00.
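A check like this can also be done from PowerShell. The following is a minimal sketch; `VM01` is a placeholder VM name, not the name of the machine in this case.

```powershell
# 'VM01' is a placeholder; replace it with the name of the affected VM.
Get-VMNetworkAdapter -VMName 'VM01' |
    Select-Object VMName, Name, MacAddress, DynamicMacAddressEnabled, SwitchName

# List every VM network adapter on this host whose MAC address is all zeros.
Get-VM | Get-VMNetworkAdapter |
    Where-Object { $_.MacAddress -eq '000000000000' } |
    Select-Object VMName, Name, MacAddress
```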
The only time you would expect to have a virtual machine with the MAC address set to zero is if you have imported the virtual machine and specified that you are copying (and not moving) it. That is not my case; I am creating a new VM.
The only time you would expect to receive No available MAC address for ‘Network Adapter’ is if the host has run out of addresses in its dynamic MAC address range. By default, each Hyper-V server can allocate a maximum of 256 MAC addresses. However, I have only 18 VMs running on that node!
I tried setting a static MAC address for that VM, and I was able to start it. What I found is that the affected node cannot start any VM that has a dynamic MAC address assigned. In other words, the Hyper-V server is not allocating any dynamic MAC addresses to virtual machines.
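The static MAC workaround can be sketched as follows. The VM name and the MAC address are placeholders, and the example assumes the VM has a single network adapter (with `-VMName` alone, the setting is applied to every adapter of the VM).

```powershell
# Placeholder VM name and MAC address; the address must be unique on the network.
# Assumes the VM is turned off and has exactly one network adapter.
Set-VMNetworkAdapter -VMName 'VM01' -StaticMacAddress '00155D0A0B01'
Start-VM -Name 'VM01'

# To switch the adapter back to dynamic assignment later (with the VM off):
# Set-VMNetworkAdapter -VMName 'VM01' -DynamicMacAddress
```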
On the second node, the VM gets a dynamic MAC address and starts without any issue. I was also able to Quick Migrate that VM and start it on the second node. Moreover, if I live migrate a VM from the healthy node to the affected node, the migration succeeds and the VM keeps running. Why? Because the dynamic MAC address moves along with the VM from the healthy node.
So, what is wrong on that particular node?
The basic troubleshooting is as follows:
- Restart the Hyper-V Virtual Machine Management (VMMS) service.
- Remove the VM network adapter, then add it again and try to start the virtual machine.
- Restart the affected node.
- Remove the MaximumMacAddress and MinimumMacAddress registry values, and then restart the VMMS service. Hyper-V will generate new maximum and minimum MAC addresses.
- Expand the MAC address range (minimum and maximum) in Hyper-V.
- Make sure that both nodes are running the same patch level.
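Several of the steps above can be sketched in PowerShell. This is a rough outline rather than a tested script: `VM01`, `NODE1`, and `NODE2` are placeholder names, the VM is assumed to be off and to have a single network adapter, and the registry path is assumed to be the Virtualization key referenced later in this post.

```powershell
$vmName  = 'VM01'   # placeholder VM name
$regPath = 'HKLM:\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Virtualization'

# Restart the Hyper-V Virtual Machine Management service.
Restart-Service -Name vmms -Force

# Remove the VM network adapter and add it again on the same switch.
$nic = Get-VMNetworkAdapter -VMName $vmName
Remove-VMNetworkAdapter -VMName $vmName -Name $nic.Name
Add-VMNetworkAdapter -VMName $vmName -Name $nic.Name -SwitchName $nic.SwitchName

# Remove the stored MAC range values, then restart VMMS so they are regenerated.
Remove-ItemProperty -Path $regPath -Name MinimumMacAddress, MaximumMacAddress
Restart-Service -Name vmms -Force

# Compare the installed updates on both nodes (placeholder node names).
$node1 = (Get-HotFix -ComputerName 'NODE1').HotFixID
$node2 = (Get-HotFix -ComputerName 'NODE2').HotFixID
Compare-Object -ReferenceObject $node1 -DifferenceObject $node2
```

An empty result from `Compare-Object` means both nodes report the same set of hotfix IDs.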
None of the troubleshooting steps above led anywhere. The virtual machine still would not start.
Finding the cause
After digging through the Hyper-V logs, I found a lot of errors related to Hyper-V-Worker, Hyper-V-SynthNic, and Hyper-V-VMMS.
Most of these errors were related to invalid access and permission issues when accessing the registry.
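One way to pull such errors together is sketched below. Instead of hard-coding individual log names, it enumerates every log that matches a `Microsoft-Windows-Hyper-V-*` wildcard; the limit of 20 events per log is an arbitrary choice for illustration.

```powershell
# Collect recent error-level events (Level 2) from all Hyper-V event logs.
Get-WinEvent -ListLog 'Microsoft-Windows-Hyper-V-*' |
    Where-Object { $_.RecordCount -gt 0 } |
    ForEach-Object {
        # Logs without matching events raise an error, so suppress it.
        Get-WinEvent -FilterHashtable @{ LogName = $_.LogName; Level = 2 } `
            -MaxEvents 20 -ErrorAction SilentlyContinue
    } |
    Sort-Object TimeCreated -Descending |
    Select-Object TimeCreated, ProviderName, Id, Message
```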
When I compared the registry permissions on the working node with those on the affected node, I noticed that several access permissions were missing on the affected node.
[Screenshot: registry permissions on the working node]
[Screenshot: registry permissions on the affected node]
As you can see, NT VIRTUAL MACHINE\Virtual Machines and NT SERVICE\TrustedInstaller permissions are missing from the list.
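The same comparison can be made without the registry editor. The sketch below assumes PowerShell remoting is enabled on both nodes and uses `NODE1` (working) and `NODE2` (affected) as placeholder names; the registry path is the Virtualization key that the fix in the next section targets.

```powershell
$regPath = 'HKLM:\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Virtualization'

# Returns the unique identities that have an access rule on the given key.
$getIdentities = {
    param($Path)
    (Get-Acl -Path $Path).Access |
        ForEach-Object { $_.IdentityReference.Value } |
        Sort-Object -Unique
}

# Placeholder node names: NODE1 is the working node, NODE2 the affected one.
[string[]]$working  = Invoke-Command -ComputerName 'NODE1' -ScriptBlock $getIdentities -ArgumentList $regPath
[string[]]$affected = Invoke-Command -ComputerName 'NODE2' -ScriptBlock $getIdentities -ArgumentList $regPath

# A '<=' side indicator marks identities present only on the working node.
Compare-Object -ReferenceObject $working -DifferenceObject $affected
```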
Fixing the issue
The first attempt was to set the correct access permissions on the registry key by running the following PowerShell commands:
# Read the current ACL of the Virtualization registry key.
$Path = "HKLM:\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Virtualization"
$Acl = Get-Acl -Path $Path
# Let the new rules be inherited by all subkeys.
$Inherit = [System.Security.AccessControl.InheritanceFlags]"ContainerInherit, ObjectInherit"
$Propagation = [System.Security.AccessControl.PropagationFlags]::None
# Grant Full Control to NT VIRTUAL MACHINE\Virtual Machines.
$Rule = New-Object System.Security.AccessControl.RegistryAccessRule `
    ("NT VIRTUAL MACHINE\Virtual Machines","FullControl",$Inherit,$Propagation,"Allow")
$Acl.AddAccessRule($Rule)
$Acl | Set-Acl -Path $Path
# Grant Full Control to NT SERVICE\TrustedInstaller.
$Rule = New-Object System.Security.AccessControl.RegistryAccessRule `
    ("NT SERVICE\TrustedInstaller","FullControl",$Inherit,$Propagation,"Allow")
$Acl.AddAccessRule($Rule)
$Acl | Set-Acl -Path $Path
# Restart the VMMS service.
Get-Service VMMS | Restart-Service -Force
After I set the right access on the registry key, I tried to start the virtual machine, and this time I received a different message. The error is less descriptive now.
Unfortunately, the single remaining option in this scenario (which is more like a shotgun approach) is to do the following:
- Pause the affected node and drain its roles. This live migrates all virtual machines, including cluster resources, to the working node and does not cause any downtime.
- Remove any vNIC that you are using on the host.
- Remove the Hyper-V virtual switch, including NIC Teaming (SET or traditional LBFO team).
- Uninstall the Hyper-V role and then reboot the system.
- Delete the registry key HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Virtualization on the affected node by running the following command. This command deletes the Virtualization registry key and all its subkeys and values.
Get-Item -Path "HKLM:\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Virtualization" | Remove-Item -Recurse -Force -Verbose
- Reinstall the Hyper-V role.
- Recreate the Hyper-V virtual switch, including NIC Teaming.
- Resume the node so that it rejoins the cluster.
- Create virtual machines with dynamic MAC addresses, and you are good to go.
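For reference, the whole procedure above can be sketched in PowerShell as a series of phases. It is not meant to be run in one go, because two of the phases end with a reboot. The node name `NODE2`, the switch name `SETswitch`, and the physical adapter names `NIC1` and `NIC2` are placeholders, and the switch options shown are only one possible SET configuration, not necessarily the one used on this cluster.

```powershell
# Phase 1: drain the node, remove host vNICs and the virtual switch, remove Hyper-V.
Suspend-ClusterNode -Name 'NODE2' -Drain -Wait
Get-VMNetworkAdapter -ManagementOS | Remove-VMNetworkAdapter
Remove-VMSwitch -Name 'SETswitch' -Force
Uninstall-WindowsFeature -Name Hyper-V -Restart

# Phase 2 (after the reboot): delete the Virtualization registry key with the
# Remove-Item command shown in the list above, then reinstall the role.
Install-WindowsFeature -Name Hyper-V -IncludeManagementTools -Restart

# Phase 3 (after the reboot): recreate the SET switch and resume the node.
New-VMSwitch -Name 'SETswitch' -NetAdapterName 'NIC1', 'NIC2' `
    -EnableEmbeddedTeaming $true -AllowManagementOS $true
Resume-ClusterNode -Name 'NODE2' -Failback Immediate
```

Any additional host vNICs, VLAN settings, or RDMA configuration that existed before would have to be recreated after the switch is in place.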
Before this issue occurred, the affected node had crashed due to a power loss. When the node came back online, it was not reachable over the network and all virtual NICs (vNICs) on the host were disconnected, so I was forced to remove and recreate the Switch Embedded Teaming (SET) virtual switch. This caused the system to hit a timing issue that resulted in the wrong permissions being applied to the registry key. Microsoft has seen this a handful of times, and a fix is in place for the next release. As of this writing, the only option is to disable Hyper-V on the node, delete the registry key, and then re-enable Hyper-V.
I want to thank Microsoft Program Manager Lars Iwer for supporting me in this case. Much obliged!
Hope this helps someone out there!