In July 2018, Microsoft announced the general availability (GA) of Azure File Sync. With Azure File Sync, you can centralize your files in Azure and then install a sync agent on Windows Server, whether it’s on-premises or in Azure (IaaS VM), to provide fast local access to your files. Your server and Azure Files are kept in sync, so you have one centralized location for your files with multi-site access powered by a fast local cache and cloud tiering.
Cloud tiering is an optional feature of Azure File Sync in which frequently accessed files are cached locally on the server while all other files are tiered to Azure Files based on policy settings. When a file is tiered, the Azure File Sync file system filter (StorageSync.sys) replaces the file locally with a pointer, or reparse point. The reparse point represents a URL to the file in Azure Files.
When a user opens a tiered file, Azure File Sync seamlessly recalls the file data from Azure Files without the user needing to know that the file is actually stored in Azure. For more information about the cloud tiering option, please check the official documentation from Microsoft.
A while ago, I blogged about how to control Azure File Sync bandwidth with network QoS. Please have a quick look before you continue with the remainder of this article.
I configured a new server endpoint with network throttling (QoS) enabled, as described in that article.
On this server endpoint, I used the Invoke-StorageSyncFileRecall cmdlet to pull down a large amount of data. The download was slow, and I started wondering whether the network throttling option was the cause.
In this example, recalling 9 GiB of data took around 13 minutes.
During the recall, network throughput was about 153 Mbps and CPU utilization was about 15%.
As confirmed by Microsoft, the network throttling limit is not applied to downloads triggered by a user or application accessing a tiered file, or by the Invoke-StorageSyncFileRecall cmdlet. So if cloud tiering is enabled, most of the throttled traffic is upload. Throttling does apply to a download when the server is pulling down a file change that was made on another server endpoint and the file was not accessed by a user or application.
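To make that rule easier to scan, here is a minimal sketch that encodes it as a small PowerShell function. This is only an illustrative model of the behavior described above, not an Azure File Sync API: the function name Test-SyncTrafficThrottled and the trigger labels are hypothetical names I made up for this example.

```powershell
# Illustrative model only: this is NOT part of the Azure File Sync agent.
function Test-SyncTrafficThrottled {
    param(
        [Parameter(Mandatory)]
        [ValidateSet('Upload', 'Download')]
        [string]$Direction,

        # What triggered the transfer (hypothetical labels for this sketch).
        [ValidateSet('SyncChange', 'UserAccess', 'RecallCmdlet')]
        [string]$Trigger = 'SyncChange'
    )

    # Uploads are subject to the throttling limit.
    if ($Direction -eq 'Upload') { return $true }

    # Downloads are throttled only when they carry a change made on another
    # server endpoint; tiered-file access and recalls are not throttled.
    return ($Trigger -eq 'SyncChange')
}

Test-SyncTrafficThrottled -Direction Upload                          # True
Test-SyncTrafficThrottled -Direction Download -Trigger RecallCmdlet  # False
Test-SyncTrafficThrottled -Direction Download -Trigger UserAccess    # False
Test-SyncTrafficThrottled -Direction Download -Trigger SyncChange    # True
```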
So network throttling is not the issue when you recall or access a tiered file.
After a bit of investigation, I found out that the Invoke-StorageSyncFileRecall PowerShell cmdlet has an additional parameter called -ThreadCount. The thread count determines how many files can be recalled in parallel.
When you run the Invoke-StorageSyncFileRecall cmdlet without the -ThreadCount parameter, 4 files are recalled in parallel by default.
In this example, I increased -ThreadCount to 16 and the recall time dropped from 13 minutes to 8 minutes. This is for a small amount of data (9 GiB). Imagine that you have 10+ TB of data that you want to recall.
If we look at the performance now, we see about 179 Mbps and 20% CPU time, compared to 153 Mbps and 15% with 4 threads. This is expected, as we are now recalling 16 files in parallel.
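To put the two runs and the 10 TB scenario into numbers, here is a rough back-of-the-envelope sketch. Get-RecallEstimate is a hypothetical helper I wrote for this example. It assumes that recall time scales linearly with data size, which is a simplification, and it treats 10 TB as 10 TiB. The durations are the approximate ones quoted above, so take the results as ballpark figures only.

```powershell
function Get-RecallEstimate {
    param(
        [double]$RecalledGiB,      # data recalled in the measured run
        [double]$Minutes,          # wall-clock duration of that run
        [double]$TargetTiB = 10    # data set size to project for
    )

    # 1GB in PowerShell is 1024^3 bytes, i.e. one GiB.
    $bits    = $RecalledGiB * 1GB * 8
    $avgMbps = $bits / ($Minutes * 60) / 1e6

    # Linear extrapolation (assumption): same rate for the whole data set.
    $projectedHours = ($TargetTiB * 1024 / $RecalledGiB) * $Minutes / 60

    [pscustomobject]@{
        AverageMbps    = [math]::Round($avgMbps, 1)
        ProjectedHours = [math]::Round($projectedHours, 1)
    }
}

Get-RecallEstimate -RecalledGiB 9 -Minutes 13   # 4 threads:  99.1 Mbps, 246.5 hours
Get-RecallEstimate -RecalledGiB 9 -Minutes 8    # 16 threads: 161.1 Mbps, 151.7 hours
```

These are averages over the whole run, so they need not match the point-in-time readings quoted above. Under these assumptions, recalling 10 TiB would take roughly 10 days with 4 threads versus a little over 6 days with 16 threads.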
How many files can you recall in parallel? The default is 4, and it can be increased up to 32.
Another parameter you may consider using is -Order CloudTieringPolicy. Specifying -Order CloudTieringPolicy recalls the most recently modified files first.
Invoke-StorageSyncFileRecall -Path D:\Data -Order CloudTieringPolicy -ThreadCount 16 -Verbose
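If you want to compare thread counts yourself, a small wrapper around the command above can validate the value and time the run. This is a minimal sketch and not an official script: Invoke-TimedFileRecall is a hypothetical name, the default of 4 and the limit of 32 come from this article, the lower bound of 1 is my assumption, and it assumes the Azure File Sync server cmdlets are already loaded in your session. D:\Data is just the example path used above.

```powershell
function Invoke-TimedFileRecall {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory)]
        [string]$Path,

        # Default 4, maximum 32 (per this article); minimum 1 is an assumption.
        [ValidateRange(1, 32)]
        [int]$ThreadCount = 4
    )

    # Same parameters as the one-line command, collected for splatting.
    $recallArgs = @{
        Path        = $Path
        Order       = 'CloudTieringPolicy'
        ThreadCount = $ThreadCount
    }

    # Skip the recall if the agent's cmdlets are not loaded in this session.
    if (-not (Get-Command Invoke-StorageSyncFileRecall -ErrorAction SilentlyContinue)) {
        Write-Warning 'Invoke-StorageSyncFileRecall is not available in this session; nothing was recalled.'
        return
    }

    $elapsed = Measure-Command { Invoke-StorageSyncFileRecall @recallArgs }
    '{0} threads: {1:N1} minutes' -f $ThreadCount, $elapsed.TotalMinutes
}

Invoke-TimedFileRecall -Path 'D:\Data' -ThreadCount 16
```

Passing a value outside the allowed range (for example -ThreadCount 64) fails parameter validation before any recall starts.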
Hope this helps!
As described in this article, you can easily increase the thread count by running a simple PowerShell command on the server.
The thread count determines how many files can be recalled in parallel. By default, the Azure File Sync agent recalls 4 files in parallel; you can increase this to as many as 32 threads.
Azure File Sync extends on-premises file servers into Azure by providing cloud benefits while maintaining performance and compatibility. Azure File Sync provides:
- Multi-site access – provide write access to the same data across Windows servers and Azure Files.
- Cloud tiering – store only recently accessed data on the local server(s) and save on local storage capacity.
- Integration with Azure Backup – no need to back up your data on-premises.
- Fast disaster recovery – restore file metadata immediately and recall data as needed.
I hope you find this guide useful. To learn more about Azure File Sync, please check the following guides.
Thank you for reading my blog.
If you have any questions or feedback, please leave a comment.