Azure virtual machines (VMs) are compute instances that run on demand. You can use them just like on-premises servers to deploy operating systems and applications, or to run containerized workloads. Keeping the operating system of your Azure VMs up to date is a core part of defending against zero-day vulnerabilities and of the overall Azure security strategy.
In this article, we will show you how to patch Azure VMs with the Update Management service, which is backed by Azure Automation and a Log Analytics workspace.
As you probably know, when we start provisioning resources in any public cloud provider, we need to always think about the types of resources we have and the shared responsibility model.
For infrastructure as a service (IaaS) virtual machines, we are responsible for things like the OS, the runtime, the middleware, the application, and the data. For the OS, this includes securing and hardening it, but also patching it, which is a huge part of that responsibility.
You probably already have an on-premises patching solution such as System Center Configuration Manager (SCCM). If it works for you today, you can bring it to the cloud and keep using your existing investment. But there is also a cloud-native technology in Azure called “Update Management”, and that’s what we want to focus on in this article.
The Azure Update Management service is generally available (GA) and is included as part of an Azure Automation account. Update Management allows you to manage updates and patches for your Windows and Linux machines. With Update Management, you can quickly assess the status of available updates, schedule the installation of required updates, and review deployment results to verify that updates were applied successfully. This works whether your machines are Azure VMs, hosted by other cloud providers, or on-premises.
To follow this article, you need to have the following:
1) Azure subscription – If you don’t have an Azure subscription, you can create a free one here.
2) Azure Resource Group (RG).
3) At least one VM running a supported operating system (x64), deployed in the desired RG. Please check the list of supported operating systems for update assessments and patching.
4) Azure Update Management configuration (more on this in the next section).
For this example, we use the following scenario from our production environment. It defines the patch schedule and patch scope for each patch group.
|Patch Group Name|Patch Schedule|Patch Scope|
|---|---|---|
|PATCH GROUP 1|20:00 - 24:00 every second Tuesday of the month|All available|
|PATCH GROUP 2|24:00 - 04:00 every second Wednesday of the month|All available|
|PATCH SPECIAL|20:00 - 24:00 every second Tuesday of the month|Security and critical only|
For the remainder of this article, we will follow this scenario for OS management updates in Azure.
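To make the schedule concrete, here is a minimal Python sketch that computes the maintenance window for PATCH GROUP 1. It is not part of Update Management itself. It assumes that “every second Tuesday of the month” means the second Tuesday of each month, and the function names `nth_weekday` and `patch_window` are our own, chosen for illustration.

```python
import calendar
from datetime import date, datetime, time, timedelta


def nth_weekday(year: int, month: int, weekday: int, n: int) -> date:
    """Return the date of the n-th given weekday (Mon=0 .. Sun=6) in a month."""
    first = date(year, month, 1)
    # Days from the 1st of the month to the first occurrence of the weekday.
    offset = (weekday - first.weekday()) % 7
    result = first + timedelta(days=offset + 7 * (n - 1))
    if result.month != month:
        raise ValueError(f"{year}-{month:02d} has no occurrence number {n} of that weekday")
    return result


def patch_window(year: int, month: int, weekday: int, start_hour: int, hours: int = 4):
    """Return (start, end) of a window that opens on the second given weekday."""
    start = datetime.combine(nth_weekday(year, month, weekday, 2), time(start_hour))
    return start, start + timedelta(hours=hours)


if __name__ == "__main__":
    # PATCH GROUP 1: 20:00 - 24:00 on the second Tuesday (assumed reading).
    start, end = patch_window(2022, 1, calendar.TUESDAY, 20)
    print(start, "->", end)
```

For January 2022, this gives a window that opens on January 11 at 20:00 and closes at midnight.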
Azure Update Management
Azure Update Management will create and leverage the following two Azure services for you. We don’t need to worry about the deployment and the configuration of those components:
1) Log Analytics Workspace – Azure Monitor.
2) Azure Automation Account.
The first part of this solution is a Log Analytics workspace. This is where the agent sends the logs it collects, and where the analysis runs on top of them. Update Management uses this data to learn which patches each operating system has, which patches are missing, and the overall state.
The second part is Azure Automation, which automatically installs the system Hybrid Runbook Worker on Azure IaaS VMs or non-Azure machines that are enabled for Update Management. The Hybrid Runbook Worker enables the machine to talk to Azure Automation.
Two agents are installed on every machine you onboard to Update Management. They are configured automatically for you, and they enable the machine to talk to the Log Analytics workspace and Azure Automation.
On Windows, the default installation location for the Microsoft Monitoring Agent (Log Analytics) is “C:\Program Files\Microsoft Monitoring Agent\Agent”, and the Azure Automation agent is installed under “C:\Program Files\Microsoft Monitoring Agent\Agent\AzureAutomation”.
Virtual Machine Configuration
Let’s see how to onboard a single virtual machine to update management.
Launch the Azure portal and open the VM blade. Scroll down to the Operations section, select “Guest and host updates” as shown in the figure below, and then click “Go to Update management”.
Next, you’ll see the Update Management configuration blade. You can create or choose a certain log analytics workspace and pick an Azure automation account. Then click Enable.
If you have already enabled and configured update management on the virtual machine, within the same blade you’ll see the missing updates, deployment schedules, and history.
This approach manages one virtual machine at a time. In production, you would rather manage multiple machines at once, and this is where the real power of the service comes in.
Update Management Configuration
To manage multiple machines, you can go directly to the automation account you created earlier, or select “manage multiple machines” from any virtual machine under the update management blade.
Once you are at the automation account, click “Update management” as shown in the figure below.
The next thing to do is to onboard virtual machines. Select “+ Add Azure VMs” and/or add non-Azure machines, and the portal will search for all VMs. You can filter by Subscription, Locations, and Resource Groups.
Then you check all the virtual machines that you want to onboard and then click the “Enable” button as shown in the figure below to onboard them to the update management solution.
If you just onboarded the virtual machines, please note that it’s going to take a while for the VMs to report back with their status.
Once the VMs are onboarded, they have to run a compliance scan, which checks which updates are missing based on whatever update source the machine is configured to use. If you have not changed the update source, it is going to use the Microsoft Update catalog for Windows and the public source for Linux machines. If you use group policy or something else to point the machines to WSUS or some private source, the compliance scan runs against that source instead.
Once the compliance scan has run, the portal shows whether each virtual machine is compliant or missing various updates (Non-compliant), as shown in the figure below. You can then see which updates are missing. The next thing to do is to schedule an update deployment.
Select “Schedule update deployment”, enter a descriptive name, and then select whether the deployment targets Windows or Linux.
You can then pick the groups of machines based on two different options as follows:
1) Groups to update: These are dynamic groups that are resolved at deployment time. You can filter by Subscriptions, Resource Groups, Locations, or Tags. For example, you could tag machines by update schedule, business unit, or application criticality, and the group automatically brings in every machine that matches. The collection is dynamic and is based on the filters you define. You can do the same for non-Azure machines.
2) Machines to update: The second option is to select the individual virtual machines you want to update. First change the Type from Saved searches to Machines, and then pick the virtual machines you want to be part of this update group. In this example, that is PATCH GROUP 1.
You typically use one option or the other: either you specify the machines directly, or you use dynamic grouping. You can also use both (groups and machines). If you select both, the deployment targets the union, in other words every machine that is in the groups or in the machine list. In production, however, you typically use just one.
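The following Python sketch illustrates how a dynamic group and an explicit machine list combine into a union. It is only a model of the behavior described above, not the service's implementation. The inventory, the `PatchGroup` tag, the machine names, and the assumption that a machine must match every tag in the filter are all hypothetical.

```python
# Hypothetical inventory; in the real service this is resolved at deployment time.
INVENTORY = [
    {"name": "vm-web-01", "resource_group": "rg-prod", "location": "westeurope",
     "tags": {"PatchGroup": "1"}},
    {"name": "vm-web-02", "resource_group": "rg-prod", "location": "westeurope",
     "tags": {"PatchGroup": "2"}},
    {"name": "vm-sql-01", "resource_group": "rg-data", "location": "northeurope",
     "tags": {"PatchGroup": "1"}},
]


def resolve_group(inventory, resource_groups=None, locations=None, tags=None):
    """Return the names of machines that match every filter that is set."""
    matched = set()
    for machine in inventory:
        if resource_groups and machine["resource_group"] not in resource_groups:
            continue
        if locations and machine["location"] not in locations:
            continue
        # Assumption: a machine must carry all of the requested tag values.
        if tags and any(machine["tags"].get(k) != v for k, v in tags.items()):
            continue
        matched.add(machine["name"])
    return matched


def deployment_targets(inventory, group_filter=None, machines=()):
    """Union of the explicitly listed machines and the dynamic group (if any)."""
    targets = set(machines)
    if group_filter is not None:
        targets |= resolve_group(inventory, **group_filter)
    return targets


if __name__ == "__main__":
    print(sorted(deployment_targets(INVENTORY,
                                    group_filter={"tags": {"PatchGroup": "1"}},
                                    machines=["vm-web-02"])))
```

Because the result is a set, a machine that appears both in the group and in the machine list is targeted only once.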
Next, select which update classifications to include for this group. For PATCH GROUP 1, we select all classifications, and for PATCH SPECIAL, we select security and critical updates only. For Windows, the classifications include critical and security updates, update rollups, feature packs, service packs, definition updates, etc.
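As a rough illustration of how the patch scope narrows the list of updates, the sketch below filters a list of available updates by classification for each patch group in our scenario. The update records and their IDs are placeholders, and the classification labels are our shorthand rather than the exact strings used by the service.

```python
# Patch scope per group from the scenario table; None means "all available".
PATCH_SCOPES = {
    "PATCH GROUP 1": None,
    "PATCH GROUP 2": None,
    "PATCH SPECIAL": {"Critical", "Security"},
}

# Placeholder records; the IDs are not real KB numbers.
AVAILABLE_UPDATES = [
    {"id": "update-a", "classification": "Security"},
    {"id": "update-b", "classification": "Definition"},
    {"id": "update-c", "classification": "Critical"},
    {"id": "update-d", "classification": "Update rollup"},
]


def updates_in_scope(available, group):
    """Return the IDs of the updates whose classification is in the group's scope."""
    allowed = PATCH_SCOPES[group]
    return [u["id"] for u in available
            if allowed is None or u["classification"] in allowed]


if __name__ == "__main__":
    for group in PATCH_SCOPES:
        print(group, updates_in_scope(AVAILABLE_UPDATES, group))
```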
You can then include or exclude specific updates by using the Include/exclude updates blade, where you provide a list of the KB IDs that you want to include or exclude. A great example is excluding the broken updates that Microsoft made available for Windows Server installations as part of the January 11, 2022 updates.
Next, create the schedule based on the OS update schedule defined in your organization. Is this a one-time patch you are rolling out, or are you going to run this every day, every two days, every week, or every month? You can also set the schedule to expire at a certain time. In this example, we set the schedule to start at 8:00 PM on the second Tuesday of every month.
The next option is Pre-scripts and Post-scripts, which are tasks that can be automatically executed before or after an update deployment run. You can configure up to one Pre-script and one Post-script per deployment.
You can also set the amount of time the deployment has to install the patches. In this example, the update schedule (maintenance window) runs from 8:00 PM until midnight (4 hours = 240 minutes). The duration must be a minimum of 30 minutes and less than 6 hours. The last 20 minutes of the maintenance window are dedicated to machine restarts, and any remaining updates will not be started once this interval is reached. In-progress updates will finish being applied.
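The arithmetic of the maintenance window can be sketched in a few lines of Python. This is a minimal sketch that assumes the limits stated above (at least 30 minutes, less than 6 hours, last 20 minutes reserved for restarts). The function name `window_plan` and the returned keys are our own.

```python
from datetime import datetime, timedelta

MIN_DURATION = timedelta(minutes=30)     # minimum maintenance window
MAX_DURATION = timedelta(hours=6)        # the window must be shorter than this
RESTART_RESERVE = timedelta(minutes=20)  # tail of the window kept for restarts


def window_plan(start: datetime, duration_minutes: int) -> dict:
    """Validate the duration and return the key timestamps of the window."""
    duration = timedelta(minutes=duration_minutes)
    if not (MIN_DURATION <= duration < MAX_DURATION):
        raise ValueError("duration must be at least 30 minutes and less than 6 hours")
    end = start + duration
    return {
        "start": start,
        # No new update is started after this point; running ones may finish.
        "last_update_start": end - RESTART_RESERVE,
        "end": end,
    }


if __name__ == "__main__":
    plan = window_plan(datetime(2022, 1, 11, 20, 0), 240)
    for name, moment in plan.items():
        print(f"{name}: {moment}")
```

For our 240-minute window that opens at 8:00 PM, no new update is started after 11:40 PM.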
And finally, set the reboot options (reboot if required, never reboot, always reboot, or only reboot – will not install updates). Then you go ahead and “Create” these schedules for each group.
As mentioned in the update scenario section, we have created three deployment schedules as shown in the figure below (PATCH SPECIAL, PATCH GROUP 1, and PATCH GROUP 2). In this example, we have only Windows deployment, the same will apply to Linux. You can select any of the deployment schedules created, modify them or delete them.
And that’s it. Once you’ve defined these configurations, Azure update management will start and do all that work for you automatically.
Monitor Azure Update Management
Now that we have the Azure Update Management service configured and our servers are being updated, we also need to make sure that the service is monitored. There are a few things that can go wrong and we need to be aware of them.
For example, the Microsoft Monitoring Agent may be out of date or not responding, servers may stop communicating with the Log Analytics workspace, or the maintenance window may be too short to install all the OS updates in time. There are a few ways to monitor Azure Update Management.
Let’s look at the different options that we can use.
View update status in the Azure Portal
We can always look at the Azure Portal to see how many updates are missing or if there are service issues.
Under the Update management blade in the automation account, the Machines tab will show the results of the compliance and agent state, and the History tab will show the results of previous update deployments.
Configure Azure Update Management Alerts
The Azure Automation account offers an alerting mechanism that can send information about each update deployment run, and also notify you if and when something goes wrong so you can respond proactively.
Azure Automation creates two distinct platform metrics related to Update Management that are collected and forwarded to Azure Monitor. These metrics are available for analysis using Metrics Explorer and for alerting using a metrics alert rule. The two metrics emitted are:
> Total Update Deployment Machine Runs: Used to alert on the status of an update deployment targeted at specific machines.
> Total Update Deployment Runs: Used to alert on the overall status of update deployment.
Follow the steps below to set up alerts to let you know the status of the Azure update deployment:
1) To configure the alert, go back to your Automation account, select Alerts under Monitoring, and then click New Alert Rule as shown in the figure below.
2) On the Select a signal page, choose the signal that’s appropriate for your requirements (Total Update Deployment Machine Runs or Total Update Deployment Runs). In this example, we will select “Total Update Deployment Runs” from the list.
3) For a dimension, scroll down and select a valid value from the dimension name list. Add Update Deployment Name and Status as shown in the figure below, and set the dimension values to “All current and future values“. If you don’t choose a value for a dimension, Update Management ignores that dimension.
4) Under Alert logic, set the Time aggregation to Total and enter 1 in the Threshold field. Leave the Evaluated based on period as default. Click Done, then click Next: Actions >
5) On the Actions pane, click + Create action group, and then choose the appropriate subscription and resource group. Next, under the Instance details, enter an action group name (e.g. VM Updates Action group) and the display name (e.g. VM Up). The display name is limited to 12 characters only. Click Next: Notifications >
6) On the Notifications pane, set the Notification type that’s appropriate for your requirements (Email/SMS message/Push/Voice), and then enter the required details and click OK. You also need to provide a unique name for the notification, under 128 characters. Notification and action names must be unique from one another. Click Next: Actions >
7) On the Actions pane, choose which actions are performed when the action group is triggered (this step is optional). As a concrete example of where an action type is useful, you could use a Webhook action and set its URI to an Azure function app that automatically creates a ticket in Jira Service Desk. Click Review + create, and then click Create.
8) Click Next: Details >. Under the Project details, select the subscription and resource group in which to save the alert rule.
9) Under the Alert rule details, set the Severity field to Warning (Sev 2) for a warning deployment update run as shown in the figure below. Fill in the Alert rule name with OS Update Alert for example.
10) Lastly, click Review + create, and then click Create to enable the alert rule.
Note you may want to set up one alert for informational successful deployments and a second alert for failed deployments.
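To show how the alert logic from the steps above fits together, here is a simplified Python model of the rule evaluation: metric samples are filtered by dimension, summed (Total aggregation), and compared to the threshold. This is only a sketch and not how Azure Monitor is implemented. The sample data, the status values, and the choice of a "greater than or equal to" comparison are assumptions made for illustration.

```python
def evaluate_alert(samples, dimensions, threshold=1):
    """Return True when the total of the matching samples reaches the threshold.

    `dimensions` maps a dimension name to the wanted value. A value of None
    means the dimension is ignored, as when no value is chosen for it.
    """
    total = 0
    for sample in samples:
        if all(wanted is None or sample["dimensions"].get(name) == wanted
               for name, wanted in dimensions.items()):
            total += sample["value"]
    # Assumption: the rule fires on "greater than or equal to" the threshold.
    return total >= threshold


# Hypothetical samples of the "Total Update Deployment Runs" metric.
SAMPLES = [
    {"dimensions": {"Update Deployment Name": "PATCH GROUP 1", "Status": "Failed"}, "value": 1},
    {"dimensions": {"Update Deployment Name": "PATCH GROUP 2", "Status": "Succeeded"}, "value": 1},
    {"dimensions": {"Update Deployment Name": "PATCH SPECIAL", "Status": "Succeeded"}, "value": 1},
]

if __name__ == "__main__":
    # An alert for failed runs of any deployment.
    print(evaluate_alert(SAMPLES, {"Update Deployment Name": None, "Status": "Failed"}))
```

Using two rules with different `Status` filters corresponds to the note above about separate alerts for successful and failed deployments.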
How it works
At a high level, the Azure Update Management solution works as follows:
You start by onboarding your virtual machine into the Update Management solution. The machine gets the Log Analytics agent and the Hybrid Runbook Worker, which talk to the Azure Automation account. It then runs compliance scans to see what’s missing based on whatever update source you are pointing to (e.g. Microsoft Update or the Linux public source).
And then through the Azure automation account, you start defining the update schedules based on your environment and then group the virtual machines into different update groups.
You can see the update history for all machines under the History tab, check the status of each deployment, and dig into any failed update. You can also see the details of each job run.
Microsoft also released a new update management solution called Automatic VM guest patching for Azure VMs. This solution also supports Windows and Linux machines, but you have no control over it. Here is what automatic VM guest patching does:
If automatic VM guest patching is enabled on a VM, then the available Critical and Security patches only are downloaded and applied automatically on the VM. This process kicks off automatically every month (30 days) when new patches are released. Patch assessment and installation are automatic, and the process includes rebooting the VM as required.
There is no time control. You can’t set it when it deploys, you can’t set a maintenance window. You can’t stop certain patches from getting deployed. It’s like you set it and forget it!
We see Automatic VM guest patching for Azure VMs as useful for dev and test environments, but not for production.
This new solution does NOT require a log analytics workspace or an automation account. It requires only an extension to be deployed on the machine which will be using the native Azure Resource Graph (data store) capability in the backend.
Automatic VM guest patching is native to the resource itself, which is virtual machines, virtual machine scale sets, or non-Azure machines (Azure Arc).
The patch installation process is orchestrated globally by Azure for all VMs that have automatic VM guest patching enabled. This orchestration follows availability-first principles across the different levels of availability provided by Azure. It will not update paired regions at the same time; it only updates one Azure region at a time. If you have Availability sets, it updates one update domain at a time, which protects you from a mass impact during the deployment. We have no control over this orchestration whatsoever, but it is super simple to configure.
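The availability-first idea can be pictured as splitting the VMs into batches that are patched one after another. The Python sketch below is only an illustration of that ordering, with one region at a time and, within a region, one update domain at a time. It is not Azure's actual orchestration logic, and the VM names, regions, and update domain numbers are hypothetical.

```python
from itertools import groupby

# Hypothetical VMs with automatic VM guest patching enabled.
VMS = [
    {"name": "vm-a", "region": "northeurope", "update_domain": 0},
    {"name": "vm-c", "region": "westeurope", "update_domain": 0},
    {"name": "vm-b", "region": "northeurope", "update_domain": 1},
    {"name": "vm-d", "region": "westeurope", "update_domain": 0},
]


def batch_key(vm):
    """VMs that share a region and an update domain land in the same batch."""
    return vm["region"], vm["update_domain"]


def rollout_batches(vms):
    """Return (region, update_domain, vm_names) batches in rollout order."""
    ordered = sorted(vms, key=batch_key)
    return [(region, domain, [vm["name"] for vm in group])
            for (region, domain), group in groupby(ordered, key=batch_key)]


if __name__ == "__main__":
    for region, domain, names in rollout_batches(VMS):
        print(f"{region} / update domain {domain}: {names}")
```

A failure in one batch would then affect only a single update domain in a single region, which is the kind of protection described above.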
Please note that only VMs created from certain OS platform images are currently supported. Please check the official supported OS images on Microsoft documentation.
Another update feature (in preview) is “Hotpatch”, which is part of the Azure Automanage service for Windows Server. Hotpatching is a new way to install updates that don’t require a reboot after installation, and it is available only on supported Windows Server Azure Edition virtual machines (VMs). Linux machines are not supported. If you are interested in this solution, check how hotpatch works in the Microsoft documentation.
At the time of this writing, the production solution is Azure update management for Windows, Linux, on-premises, Azure, and multi-cloud. We have full control of when to update, and what to include or exclude, and we have all the power.
In this article, we showed you how to patch Azure VMs using the Azure Update Management service backed by an Azure automation account and log analytics workspace.
With Update Management, you enable consistent control and compliance of your virtual machines. This service is included with Azure virtual machines and Azure Arc machines. You only pay for logs stored in Log Analytics. The Azure Update Management solution is completely free even for on-premises or other clouds, there is no cost for this apart from log analytics workspace data.
The Update Management service requires a Log Analytics workspace and an Automation account. You can use your existing workspace and account or let the solution configure the nearest workspace and account for you to use.
> For more information about Azure update management, check the official documentation.
We hope this guide is useful as you patch and update your Azure VMs to protect your organization’s valuable workloads.
> Learn more about hardening Azure VMs – 5 Critical Best Practices.
Thank you for reading my blog.
If you have any questions or feedback, please leave a comment.