How to Replace a Faulty Disk in Windows Server 2016 Storage Spaces? #StorageSpaces #WS2016 #HyperV

Introduction

Physical disks often experience errors of varying severity, from errors that the disk can transparently recover from without interruption or data loss, to catastrophic errors, such as bad sectors, that can cause data loss.

In Windows Server 2016, Microsoft introduced a new storage solution called Storage Spaces Direct (a.k.a. S2D). For more information about Storage Spaces Direct in Windows Server 2016, see the Storage Spaces Direct overview.

The classic Storage Spaces deployment for Standalone and Cluster mode with external JBODs remains the same between Windows Server 2012 R2 and Windows Server 2016.

A while ago, I wrote a blog post on how to replace a Faulty Disk in Windows Server 2012 R2 Storage Spaces.

Replacing a physical disk in Windows Server 2016 is a bit different than in Windows Server 2012 R2 Storage Spaces. In this article, I am going to focus on how to replace a faulty disk in Windows Server 2016 with Storage Tiering using ReFS on a standalone server.

Before you move forward, I highly recommend reading the following post on How to Create a Multi-Resilient Volume with ReFS on Standalone Server in Windows Server 2016.

The following figure illustrates the classic Storage Spaces workflow for a standalone deployment. I want to be crystal clear here: this type of deployment is not highly available. In other words, if you lose that server, all your services will go down. On the other hand, the storage itself is resilient, provided that the space you create is a Mirror or Parity space and NOT a Simple space!

How Does Storage Spaces Respond to a Faulty Disk in Windows Server 2016?

There are several different ways to check the health status of your physical disks.

One option is to use Server Manager.

In the Storage Pools tile of the File and Storage Services role in Server Manager, a health status that requires attention is flagged with a yellow triangle containing an exclamation mark, as illustrated below. In this example, I am using 2 x SSDs and 4 x HDDs (a 1:2 ratio) with two-way mirroring.

The second option is to use PowerShell. You can use the following PowerShell command to identify the physical disk associated with the I/O error:
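A minimal sketch of such a check, assuming the standard Storage module cmdlets that ship with Windows Server 2016. The exact command used on the author's system may differ, and the selected columns are only a suggestion:

```powershell
# List every physical disk with its usage, operational status, and health.
Get-PhysicalDisk |
    Sort-Object -Property FriendlyName |
    Format-Table -Property FriendlyName, SerialNumber, MediaType, Usage, OperationalStatus, HealthStatus, Size -AutoSize

# Narrow the list to disks that are not reporting Healthy.
Get-PhysicalDisk | Where-Object HealthStatus -ne 'Healthy'

# Optionally inspect the error counters reported by each disk.
Get-PhysicalDisk |
    Get-StorageReliabilityCounter |
    Format-Table -Property DeviceId, ReadErrorsTotal, WriteErrorsTotal, Temperature, Wear -AutoSize
```

A disk with I/O errors typically shows a HealthStatus of Warning or Unhealthy, and its serial number helps match it to a physical bay.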

You can also use the following PowerShell command to get the physical disk Event Log:
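One possible way to read those events is sketched below. It assumes the events of interest are written to the Storage Spaces driver operational channel; the log name and the filter on severity are assumptions and not necessarily what the author ran:

```powershell
# Read the most recent events from the Storage Spaces driver log
# (assumed log name) and keep only warnings and errors.
Get-WinEvent -LogName 'Microsoft-Windows-StorageSpaces-Driver/Operational' -MaxEvents 50 |
    Where-Object { $_.LevelDisplayName -in 'Warning', 'Error', 'Critical' } |
    Format-List -Property TimeCreated, Id, LevelDisplayName, Message
```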

One additional tip: you can also use the Physical Identification (drive light) capability in Windows Server 2016 to identify the disk before you remove it. For Physical Identification to work, your server must support SES (SCSI Enclosure Services). You can use the following command to turn the drive lights on or off.
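A minimal sketch of toggling the drive light, assuming the enclosure supports SES. The serial number below is a hypothetical placeholder; substitute the one reported for your faulty disk:

```powershell
# Hypothetical serial number of the faulty disk; replace with your own.
$serial = 'XXXXXXXX'
$disk   = Get-PhysicalDisk | Where-Object SerialNumber -eq $serial

# Turn the identification light on for that disk.
$disk | Enable-PhysicalDiskIdentification

# Turn the light off again once the disk has been located.
$disk | Disable-PhysicalDiskIdentification
```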

The third option is to use Smart Storage Administrator if you are using HPE servers. Other OEM vendors provide similar tools as well.

As you can see, Storage Spaces sustained the failure of a single disk in a two-way mirrored space. The storage pool is healthy, but the virtual disk is in a degraded state.
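The same state can be confirmed from PowerShell. This is a sketch using the standard Storage module cmdlets; the columns shown are a suggestion:

```powershell
# Health of the non-primordial storage pools (the pools you created).
Get-StoragePool -IsPrimordial $false |
    Format-Table -Property FriendlyName, OperationalStatus, HealthStatus -AutoSize

# Health of the virtual disks (spaces) carved out of those pools.
Get-VirtualDisk |
    Format-Table -Property FriendlyName, ResiliencySettingName, OperationalStatus, HealthStatus -AutoSize
```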

A two-way mirror will allow you to suffer the loss of a single disk in each Tier (SSD/HDD) with no problems while a three-way mirror will allow you to lose two disks in each Tier (SSD/HDD).

Replace The Faulty Disk Without Interruption

While the defective disk is still in the system, you need to insert a new disk in any of the empty bays. In my case, the defective disk is in Bay 4 and the new disk is in Bay 7.

I want to mention here that I am not using a hot spare disk. If a hot spare disk is available in the storage pool, Storage Spaces will replace the failed disk with the hot spare and retry the write operation without any user intervention.
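For readers who do want a hot spare, a minimal sketch follows. It is not part of the setup in this article; the pool name 'Pool01' is hypothetical, and the sketch assumes exactly one unpooled disk is available:

```powershell
# Hypothetical pool name; replace with your own.
$poolName = 'Pool01'

# Pick the disk that is eligible to join a pool (assumes there is exactly one).
$spare = Get-PhysicalDisk -CanPool $true

# Add it to the pool and mark it as a hot spare in one step.
Add-PhysicalDisk -StoragePoolFriendlyName $poolName -PhysicalDisks $spare -Usage HotSpare
```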

Since I don't have a hot spare, I need to follow the steps below:

  1. Add a new disk to the system (chassis).
  2. Check whether the system detects the new disk by running the following command:
  3. Add the new disk to the storage pool by running the following command:
  4. Remove the defective disk from the pool by running the following command:
  5. The final step is to celebrate 🙂 because in Windows Server 2016 the virtual disk heals itself. Finally, my wish is fulfilled! In Windows Server 2012 R2, you needed to run the Repair-VirtualDisk command to repair the virtual disk after adding a new physical disk. This step is no longer needed, which will save you a lot of time. Thank you, Microsoft, for making Storage Spaces great and Storage Spaces Direct even greater.
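Steps 2 to 4 can be sketched as follows. This is an illustration under stated assumptions rather than the exact commands from the original post: the pool name 'Pool01' is hypothetical, the new disk is assumed to be the only disk that can be pooled, and the faulty disk is assumed to be the only one not reporting Healthy. Marking the faulty disk as Retired before removing it is an extra precaution added here, not one of the steps listed above.

```powershell
# Hypothetical pool name; replace with your own.
$poolName = 'Pool01'

# Step 2: confirm that the new disk is detected and eligible for pooling.
$newDisk = Get-PhysicalDisk -CanPool $true
$newDisk | Format-Table -Property FriendlyName, SerialNumber, MediaType, Size -AutoSize

# Step 3: add the new disk to the storage pool.
Add-PhysicalDisk -StoragePoolFriendlyName $poolName -PhysicalDisks $newDisk

# Step 4: locate the faulty disk, retire it (added precaution), then remove it.
$failedDisk = Get-PhysicalDisk | Where-Object HealthStatus -ne 'Healthy'
$failedDisk | Set-PhysicalDisk -Usage Retired
Remove-PhysicalDisk -StoragePoolFriendlyName $poolName -PhysicalDisks $failedDisk
```

Remove-PhysicalDisk asks for confirmation before it acts, so check that the disk it lists is the one you intend to pull.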

And here you go… Healthy Storage Spaces without interruption or data loss.
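To watch the automatic repair and confirm the end state, something like the following can be used. It is a sketch; the virtual disk name 'VDisk01' in the final comment is hypothetical:

```powershell
# Show any repair or rebalance jobs that are still in progress.
Get-StorageJob |
    Format-Table -Property Name, JobState, PercentComplete, BytesProcessed, BytesTotal -AutoSize

# Once the jobs have finished, the virtual disk should report Healthy again.
Get-VirtualDisk |
    Format-Table -Property FriendlyName, OperationalStatus, HealthStatus -AutoSize

# If a disk stays degraded, a manual repair can still be started, for example:
# Get-VirtualDisk -FriendlyName 'VDisk01' | Repair-VirtualDisk
```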

Did you experience any issue with Storage Spaces in Windows Server 2016? Please add your comment below and share your experience.

Hope this helps,

Until then… Happy holidays!

Cheers,
-Ch@rbel-

About Charbel Nemnom
Charbel Nemnom is a Microsoft Cloud Consultant and Technical Evangelist, a fan of the latest IT platform solutions, and an accomplished hands-on technical professional with over 15 years of broad IT infrastructure experience serving on and guiding technical teams to optimize the performance of mission-critical enterprise systems. He is an excellent communicator, adept at identifying business needs and bridging the gap between functional groups and technology to foster targeted and innovative IT project development. He is well respected by peers for demonstrating passion for technology and performance improvement, and has extensive practical knowledge of complex system builds, network design, and virtualization.
