In Windows Server 2016, Microsoft introduced a new storage technology called Storage Spaces Direct, which now underpins the Azure Stack HCI program. Storage Spaces Direct enables building highly available storage systems from locally attached disks, without any external SAS fabric such as shared JBODs or enclosures. This was Microsoft's first true Software-Defined Storage (SDS) offering: storage that is provisioned and managed entirely in software rather than by dedicated storage hardware.
In Windows Server 2019, Microsoft added many improvements to the Azure Stack HCI solutions program (formerly known as Windows Server Software-Defined, a.k.a. WSSD).
Fast forward to 2020: Microsoft introduced a new operating system dedicated to the hyper-converged deployment model, where innovation continues at a faster cadence than in Windows Server. That operating system, Azure Stack HCI, is a hyper-converged infrastructure (HCI) operating system delivered as an Azure service and provides the latest security, performance, and feature updates.
Introduction
With Azure Stack HCI, you can deploy and run Windows and Linux virtual machines (VMs) in your datacenter or at the edge using your existing tools, processes, and skillsets. Additionally, you can extend the datacenter to the cloud with Azure Backup, Site Recovery, Azure File Sync, Azure Monitor, Azure Arc, and Azure Security Center.
On September 1st, 2021, Microsoft announced the GA release of Windows Server 2022 with no major improvements for Storage Spaces Direct. As noted earlier, all future innovation goes into Azure Stack HCI for running hyper-converged infrastructure; however, Windows Server will continue to benefit from improvements to existing features. Windows Server 2022 still lacks advanced features such as stretched clusters, but it does gain a new repair option for Storage Spaces Direct (“Adjustable Storage Repair Speed”), which lets system admins control how many resources to allocate to repairing data copies versus serving active workloads.
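As a rough sketch, this repair speed can be inspected and tuned with the storage subsystem cmdlets; the friendly-name filter and the queue-depth value below are illustrative assumptions, not the exact commands used in this deployment:

```powershell
# Check the current repair speed (a higher queue depth gives repair jobs more resources).
Get-StorageSubSystem -FriendlyName "Clustered*" |
    Select-Object FriendlyName, VirtualDiskRepairQueueDepth

# Lower the repair speed to prioritize active workloads, e.g. during business hours.
# The value 1 is only an example of a low setting.
Get-StorageSubSystem -FriendlyName "Clustered*" |
    Set-StorageSubSystem -VirtualDiskRepairQueueDepth 1
```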
I recently completed a 3-node Azure Stack HCI hyper-converged deployment on top of DataON AZS-216 Integrated Systems (all-NVMe flash) and hit over 2.5 million IOPS.

In this article, I would like to share with you my experience and performance results.
3-Node DataON Integrated Systems
For this deployment, I used the following hardware configuration:
- DataON™ AZS-216 Integrated Systems For Azure Stack HCI OS
- Supports Dual Intel Xeon® Scalable™ Gen 2 Processor Series & (24) DDR4 DIMM
- Drive Bay: (16) NVMe U.2 2.5″ Hot-swappable
- PCIe Slot: (7) PCIe 3.0 x8
- Onboard NIC: (2) Built-In 10GbE RJ45
- 1300W (1+1) 110V hot-swappable redundant PSU with NEMA 5-15 Power Cords
- Intel® Remote Management Module 4
- Intel® Xeon® Scalable Gen.2 Gold 5218R 2.1 GHz, 20-Core, 27.5MB Cache
- 384GB (12x32GB) Samsung® DDR4 2933MHz ECC-Register RDIMM
- 2 X Intel® S4510™ 480GB SATA M.2 Boot Drive For OS
- 10 X Intel® DC P5510™ NVMe 3.8TB 2.5″ 144L 3D TLC SSD
- 2 X NVIDIA|Mellanox® ConnectX-4 Lx EN Dual Port SFP+ 10/25GbE RDMA Card
- 2 X NVIDIA|Mellanox® LinkX™ Passive Copper Cable, ETH, up to 25Gb/s, SFP28, 30 AWG
- 2 X Mellanox® Spectrum™ 18-port 10/25GbE X 4-port 100GbE Switch (RDMA/RoCEv2)
DataON AZS-216 Integrated Systems for Azure Stack HCI are pre-configured nodes with certified components, tested and validated by DataON and Microsoft, to help you build Azure Stack HCI clusters with ease.
In this configuration, all NVMe disks are used as capacity (all-flash) as shown in the inventory below.

Resiliency
The cluster shared volumes are configured with a three-way mirror to provide the maximum resiliency within a single site. With a three-way mirror, the cluster can sustain two simultaneous failures (for example, a node and a drive) while your workloads remain online.
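For reference, a three-way mirrored CSV like the ones used here can be created with the New-Volume cmdlet; the pool name, volume name, and size below are placeholders rather than the exact commands from this deployment:

```powershell
# Create a three-way mirrored volume on the Storage Spaces Direct pool.
# With three or more nodes, S2D defaults to a three-way mirror, which tolerates two simultaneous failures.
# On a cluster, a CSVFS_ReFS volume is automatically added to Cluster Shared Volumes.
New-Volume -StoragePoolFriendlyName "S2D*" `
           -FriendlyName "Volume01" `
           -FileSystem CSVFS_ReFS `
           -ResiliencySettingName Mirror `
           -Size 10TB
```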
You can validate this resiliency by testing the following four failure scenarios (a monitoring sketch follows the list):
1) Physical drive pull.
2) Reboot a node (observe failover).
3) Physical power pull of a node.
4) Shut down one node and pull a single drive from one of the remaining nodes that are still up.
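During each scenario, you can watch the pool resync and verify that the volumes stay online. A minimal check, assuming you run it from one of the cluster nodes, looks like this:

```powershell
# Watch the rebuild/resync jobs triggered by a drive pull or node outage.
Get-StorageJob | Select-Object Name, JobState, PercentComplete

# Confirm the virtual disks (volumes) remain online and healthy.
Get-VirtualDisk | Select-Object FriendlyName, HealthStatus, OperationalStatus

# Confirm the pulled drive shows as lost communication/removed.
Get-PhysicalDisk | Sort-Object FriendlyName |
    Select-Object FriendlyName, HealthStatus, OperationalStatus
```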
Software Configuration
- Host: Azure Stack HCI OS, version 20H2 (OS build 17784.1884)
- Single Storage Pool (117 TB)
- 3 x 10.3 TB volumes (three-way mirror)
- ReFS/CSVFS file system
- 60 virtual machines (20 VMs per node)
- 2 virtual processors and 8 GB RAM per VM
- VM: Windows Server 2019 Datacenter Core Edition with August 2021 update
- Jumbo Frame enabled
- CSV cache disabled for benchmarking purposes only; for real-world workloads, the CSV cache is enabled with 16 GB (see the sketch after this list)
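The CSV in-memory read cache is a cluster-wide setting expressed in megabytes. A quick sketch of how it can be toggled for these runs (the 16 GB value matches the list above; the rest is illustrative):

```powershell
# Disable the CSV in-memory read cache for benchmarking (value is in MB).
(Get-Cluster).BlockCacheSize = 0

# Re-enable it with 16 GB for real-world workloads.
(Get-Cluster).BlockCacheSize = 16384

# Verify the current setting.
(Get-Cluster).BlockCacheSize
```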
Workload Configuration
- DISKSPD version 2.0.21a workload generator
- VM Fleet workload orchestrator
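Inside each VM, VM Fleet drives DISKSPD with parameters matching the tests below. As a hedged example, the Test 1 profile (random 4K, 8 threads, 8 outstanding I/Os, 100% read, 10 GB working set) maps roughly to a DISKSPD command like this; the file path and duration are placeholder assumptions:

```powershell
# -b4K  = 4K block size      -t8 = 8 threads        -o8 = 8 outstanding I/Os per thread
# -r    = random I/O         -w0 = 0% writes        -d300 = run for 300 seconds
# -Sh   = disable software and hardware caching     -L = collect latency statistics
# -c10G = create a 10 GB test file (working set)
.\diskspd.exe -b4K -t8 -o8 -r -w0 -d300 -Sh -L -c10G C:\run\testfile.dat

# The other tests change only the thread count and read/write mix,
# e.g. -t4 -w100 for Test 2 and -t4 -w30 for Test 3, while the sequential
# 512K tests drop -r and use -b512K -t1 -o1.
```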
Test 1 – Random 4K, 8 Threads, 8 Outstanding I/O, 100% Read
Total 2.5 Million IOPS – Read/Write Latency @ 0.1/0.6(ms)
Each VM is configured with:
- 4K IO size
- 10GB working set
- 100% read and 0% write
- No Storage QoS
- RDMA Enabled RoCEv2

Please note that the 100% read result is a bit skewed, since the reads are all served locally. However, using the same number of threads for any workload that involves writes would drastically increase latency and reduce the number of IOPS, as shown in the subsequent tests.
Test 2 – Random 4K, 4 Threads, 8 Outstanding I/O, 100% Write
Total 460K IOPS – Read/Write Latency @ 0.02/2.5(ms)
Each VM is configured with:
- 4K IO size
- 10GB working set
- 0% read and 100% write
- No Storage QoS
- RDMA Enabled RoCEv2

Test 3 – Random 4K, 4 Threads, 8 Outstanding I/O, 70% Read / 30% Write
Total 1 Million IOPS – Read/Write Latency @ 0.01/0.4(ms)
Each VM is configured with:
- 4K IO size
- 10GB working set
- 70% read and 30% write
- No Storage QoS
- RDMA Enabled RoCEv2

Test 4 – Random 4K, 4 Threads, 8 Outstanding I/O, 50% Read / 50% Write
Total 785K IOPS – Read/Write Latency @ 0.1/0.7(ms)
Each VM is configured with:
- 4K IO size
- 10GB working set
- 50% read and 50% write
- No Storage QoS
- RDMA Enabled RoCEv2

Test 5 – Sequential 512K, 1 Thread, 1 Outstanding I/O, 100% Read
Total 72K IOPS – Read/Write Latency @ 0.7/0.3(ms)
Each VM is configured with:
- 512K IO size
- 10GB working set
- 100% read and 0% write
- No Storage QoS
- RDMA Enabled RoCEv2

Test 6 – Sequential 512K, 1 Thread, 1 Outstanding I/O, 100% Write
Total 17K IOPS – Read/Write Latency @ 0.00/3.3(ms)
Each VM is configured with:
- 512K IO size
- 10GB working set
- 0% read and 100% write
- No Storage QoS
- RDMA Enabled RoCEv2

DataON and Windows Admin Center integration
DataON MUST is a hybrid-cloud infrastructure monitoring and management tool. It’s designed to integrate seamlessly with Windows Admin Center through a single pane of glass that consolidates local and remote server, cluster, and Azure Stack HCI monitoring and management.

The second integration is DataON MUST Pro, which integrates with Windows Admin Center’s cluster creation and cluster-aware updating (CAU) functionality to simplify deployment and updates of Microsoft Azure Stack HCI with minimal disruption to your infrastructure.
MUST Pro automatically compares your DataON Integrated Systems for Azure Stack HCI against DataON’s latest quarterly validated server component image baseline. It also ensures that servers have the same OS version, drivers, firmware, BIOS, and BMC, and checks the drivers and firmware for network cards, host bus adapters, and SSD and HDD drives.
Summary
In this article, I shared my experience and showed you the performance results with three-way mirror resiliency on a 3-node DataON AZS-216 Integrated Systems cluster. For more information about Azure Stack HCI, please check the Microsoft documentation here.
Always remember that storage is cheap, but downtime is expensive!!!
Let me know what you think in the comment section below.
__
Thank you for reading my blog.
If you have any questions or feedback, please leave a comment.
-Charbel Nemnom-