In this article, we will show you how to monitor Storage Spaces Direct health with PowerShell.
In Windows Server 2016, Microsoft added a new type of storage called Storage Spaces Direct (S2D). S2D enables building highly available storage systems with locally attached disks, without the need for any external SAS fabric such as shared JBODs or enclosures. With S2D, we can deploy two models: the converged model (also known as disaggregated), where storage and the hypervisor run on separate hardware, and the Hyper-Converged Infrastructure model (known as HCI), where storage and the hypervisor (Hyper-V) run on the same hardware. This feature is included only in Windows Server 2016 Datacenter edition. You can find a lot of articles about S2D on my website.
Storage is the most critical component in any infrastructure, and with S2D it is even more critical since the availability can also affect virtual machines and applications. It’s crucial to monitor S2D in a proactive way to ensure its availability before any service gets affected.
Microsoft has written a management pack for Operations Manager that pulls information from the S2D Health Service API on a regular basis. If you want to know more about how to monitor Storage Spaces Direct with System Center 2016 Operations Manager (SCOM), I advise you to check a great guide written by my fellow MVP Romain Serre.
In the near future, you will also be able to monitor Storage Spaces Direct (Azure Stack HCI) with Windows Admin Center.
But what if you don’t have a System Center 2016 Operations Manager license?
Project Honolulu (the preview name of Windows Admin Center) is not yet ready to manage Storage Spaces Direct based on Windows Server 2016 Version 1607.
Well, in this post, I’ll show you how to monitor S2D with Windows PowerShell, and if something goes wrong you will receive an e-mail alert.
Storage Spaces Direct Health Service
When you enable Storage Spaces Direct, a new cluster role is also enabled by default, known as the Health Service.
The Health Service cluster role gathers metrics and alerts from all the cluster nodes and exposes them through an Application Programming Interface (API). This API is accessible from PowerShell and .NET. However, the Health Service on its own is not enough for day-to-day administration, because it provides only real-time metrics and no historical data. Moreover, there is no built-in GUI for monitoring.
For example, to gather the metrics collected by the Health Service, you can run Get-StorageHealthReport. This gives you consolidated information about the available memory, the IOPS, the capacity, the average CPU usage, and so on.
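Get-StorageHealthReport takes the clustered storage subsystem as pipeline input. Here is a minimal sketch, assuming an elevated session on one of the cluster nodes and the default subsystem name, which starts with "Cluster". The ItemValue.Records property path is how the report is commonly unpacked on Windows Server 2016; treat it as an assumption and inspect the object with Get-Member if your build differs.

```powershell
# Assumption: run elevated on a node of the S2D cluster. 'Cluster*' matches the
# default friendly name of the clustered subsystem
# ("Clustered Windows Storage on <ClusterName>").
$subsystem = Get-StorageSubSystem -FriendlyName 'Cluster*'

# Take one snapshot of the cluster-wide metrics.
$report = $subsystem | Get-StorageHealthReport

# Assumption: the individual metrics are nested under ItemValue.Records.
$report.ItemValue.Records | Format-Table Name, Value, Units -AutoSize
```

Because the Health Service keeps no history, anything you want to trend over time has to be stored by your own tooling.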
You can also check the latest actions performed on the system by running Get-StorageHealthAction.
The most important cmdlet is Debug-StorageSubSystem, which gives you a complete overview of the current alerts in your S2D cluster. In my lab, for example, the Health Service automatically detected that one of the nodes had a disconnected network cable. Since I have a redundant network path, the severity is Minor.
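As a rough sketch, the current faults and the recent automated actions can both be read from the clustered subsystem. This assumes an elevated session on a cluster node. The property names in Format-List are the ones fault objects are expected to expose; check them with Get-Member on your own system.

```powershell
# Assumption: run elevated on a node of the S2D cluster; 'Cluster*' matches
# the default friendly name of the clustered storage subsystem.
$subsystem = Get-StorageSubSystem -FriendlyName 'Cluster*'

# Current faults (alerts) known to the Health Service.
$faults = $subsystem | Debug-StorageSubSystem

if (-not $faults) {
    'No current faults reported by the Health Service.'
}
else {
    # Assumed property names; verify with: $faults | Get-Member
    $faults | Format-List PerceivedSeverity, Reason, FaultingObjectDescription,
                          FaultingObjectLocation, RecommendedActions
}

# Recent actions the Health Service has started or completed.
$subsystem | Get-StorageHealthAction
```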
This is all great but not enough to monitor my S2D cluster on a day-to-day basis. I want to receive an email alert as soon as something goes wrong in order to rectify the issue in a proactive way.
Storage Spaces Direct Health Monitoring
I have been working on a PowerShell script that keeps checking my Storage Spaces Direct health and sends me an email alert for any minor or critical issue that might arise on the cluster. This saves me from checking the Health Service manually, because I have a zillion other tasks to do.
Here is an example of an automated email generated for a Minor issue: one physical disk is damaged, and the email shows all the details, including the action you need to take.
Here is another alert, for Critical and Minor issues: one server is down and a network cable is disconnected.
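To illustrate the idea (this is not the script from my GitHub repository, only a simplified sketch), the core of such a check is turning the fault objects into a message and mailing it. The function name New-S2DAlertMessage, the mail addresses and the SMTP server below are hypothetical, and the fault property names are assumed to match what Debug-StorageSubSystem returns.

```powershell
# Hypothetical helper for illustration only; it is NOT part of Monitor-S2D.ps1.
# It turns Health Service fault objects into a subject line and a text body.
function New-S2DAlertMessage {
    param(
        [object[]] $Fault,
        [string]   $ClusterName = 'S2D cluster'
    )

    # Healthy cluster: nothing to report, so return nothing.
    if ($null -eq $Fault -or $Fault.Count -eq 0) { return $null }

    # Count faults per severity, for example "1 Critical, 2 Minor".
    $summary = ($Fault |
        Group-Object { "$($_.PerceivedSeverity)" } |
        Sort-Object Name |
        ForEach-Object { '{0} {1}' -f $_.Count, $_.Name }) -join ', '

    # Three lines per fault: what happened, where, and what to do about it.
    $lines = foreach ($f in $Fault) {
        '[{0}] {1}' -f $f.PerceivedSeverity, $f.Reason
        '    Object: {0}' -f $f.FaultingObjectDescription
        '    Action: {0}' -f ($f.RecommendedActions -join '; ')
    }

    [pscustomobject]@{
        Subject = "[$ClusterName] $summary"
        Body    = $lines -join [Environment]::NewLine
    }
}

# Usage on a cluster node (addresses and SMTP server are placeholders):
# $faults  = Get-StorageSubSystem -FriendlyName 'Cluster*' | Debug-StorageSubSystem
# $message = New-S2DAlertMessage -Fault $faults -ClusterName 'S2D-CLU01'
# if ($message) {
#     Send-MailMessage -To 'ops@example.com' -From 's2d@example.com' `
#         -SmtpServer 'smtp.example.com' -Subject $message.Subject -Body $message.Body
# }
```

Keeping the message formatting separate from the cluster query and the mail call makes it possible to try the formatting with hand-made objects on a machine that is not part of the cluster.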
You can run the tool once, or you can automate it with a scheduled job that runs every hour or so and sends you an alert if something goes wrong. To do this, you can use the Register-ScheduledJob cmdlet.
Here’s an example that monitors your S2D infrastructure every hour:
# Change these three variables to whatever you want
$jobname = "Recurring S2D Monitoring"
$script = "C:\Path\Monitor-S2D.ps1"
$repeat = (New-TimeSpan -Minutes 60)

$scriptblock = [scriptblock]::Create($script)
$trigger = New-JobTrigger -Once -At (Get-Date).Date -RepeatIndefinitely -RepetitionInterval $repeat
$msg = "Enter the username and password that will run S2D monitoring task"
$credential = $Host.UI.PromptForCredential("Task username and password",$msg,"$env:userdomain\$env:username",$env:userdomain)
$options = New-ScheduledJobOption -RunElevated -ContinueIfGoingOnBattery -StartIfOnBattery
Register-ScheduledJob -Name $jobname -ScriptBlock $scriptblock -Trigger $trigger -ScheduledJobOption $options -Credential $credential
The scheduled job above runs as the specified user (you will be prompted for credentials) and is set to run elevated, with the highest privileges. In addition, the job runs every 60 minutes, or at whatever interval you specified in the $repeat variable.
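Once the job is registered, it is worth confirming that it actually fires. Here is a short sketch using the standard PSScheduledJob cmdlets, assuming the same job name as above and the same user account that registered the job, since scheduled job definitions are stored per user.

```powershell
# Assumption: same job name and same user account as at registration time.
Import-Module PSScheduledJob
$jobname = 'Recurring S2D Monitoring'

# Show the registered job and its trigger.
Get-ScheduledJob -Name $jobname
Get-ScheduledJob -Name $jobname | Get-JobTrigger

# List the five most recent runs with their state and timing.
Get-Job -Name $jobname |
    Sort-Object PSBeginTime -Descending |
    Select-Object -First 5 |
    Format-Table Id, State, PSBeginTime, PSEndTime -AutoSize

# Read the output of the latest run without discarding it.
Get-Job -Name $jobname |
    Sort-Object PSBeginTime -Descending |
    Select-Object -First 1 |
    Receive-Job -Keep

# To remove the job later:
# Unregister-ScheduledJob -Name $jobname
```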
Where can I download this tool?
This monitoring tool is available on my GitHub repository. You can download the documentation and the script from here.
I am planning to improve this monitoring tool in the future. This is still version 1.1. If you have any feedback or changes that everyone should receive, please feel free to update the source and create a pull request.
Happy monitoring and Happy holidays!