In July 2018, Microsoft announced the general availability (GA) of Azure File Sync. With Azure File Sync, you can centralize your files in Azure and then install the storage sync agent on a Windows Server, whether it’s on-premises or in Azure, to provide fast local access to your files. Your server and Azure Files are kept constantly in sync, so you have one centralized location for your files with multi-site access powered by fast local caches and cloud tiering. Over time, cloud tiering builds up a heat map of which files on your disks are being read and written. As the disks fill up, files are moved to the cloud and only stubs (the namespace) are kept on the local disks. When a user opens a tiered file, it is downloaded seamlessly from Azure Files instead of being opened straight from the local disk. This is obviously desirable for files that you don’t use very often but still want to keep around.
If you want to know more about Azure File Sync, please check my previous step-by-step article on how to get started with Azure File Sync.
Now that you have enabled Azure File Sync and everything is running well, you will want to monitor its health. In this article, I will show you the monitoring options that are available today for Azure File Sync, which can also help you troubleshoot any issue you may face.
Monitoring Azure File Sync
The following monitoring options are available as of today:
Option #1 – Azure Portal
You can use the Azure Portal to view the Registered Server state and the Server Endpoint Health (sync health).
Registered Server State
- If Registered server state = Online, the server is successfully communicating with the storage sync service.
- If Registered server state = Offline or Appears Offline, verify that the Storage Sync Monitor (AzureStorageSyncMonitor.exe) process is running on the server. If the server is behind a firewall or proxy, refer to the following documentation to configure the firewall and proxy.
Server Endpoint Health
- The server endpoint health in the Azure Portal is based on the sync events that are logged locally on the server in the Telemetry event logs (ID 9102 and 9302) – check Option #2 for more information. If a sync session fails due to a transient error (e.g. error canceled), sync may still show as healthy in the portal as long as the current sync session is making progress (Event ID 9302 is used to determine if files are being applied).
- If the portal shows a sync error due to sync not making progress, then check the following documentation for troubleshooting guidance.
Option #2 – Windows Server Event Logs
You can use the following Telemetry event logs to monitor Azure File Sync locally on the server. In Event Viewer, they are located under Applications and Services Logs\Microsoft\FileSync\Agent.
- Event ID 9102 is logged once a sync session completes. Use this event to determine whether sync sessions are completing successfully (HResult = 0) and whether there are per-item sync errors. Check the following documentation for more information: Sync Health & Per-Item Errors.
- Event ID 9302 is logged every 5 to 10 minutes while there’s an active sync session. Use this event to determine whether the current sync session is making progress (AppliedItemCount > 0). If sync is not making progress, the sync session should eventually fail and an Event ID 9102 will be logged with the error. Check the following documentation for more information about Sync Progress.
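To make the rule above concrete, here is a minimal Python sketch of how the two events could be combined into a single status. It assumes the events have already been exported from the Telemetry log and parsed into dictionaries; the dictionary layout and the function name `summarize_sync` are my own illustration, not something the sync agent provides. Per-item errors are deliberately not covered.

```python
def summarize_sync(events):
    """Return 'healthy', 'failed', 'in progress', 'stalled' or 'unknown'.

    `events` is a hypothetical list of dicts in chronological order, e.g.
    {"EventID": 9102, "HResult": 0} or {"EventID": 9302, "AppliedItemCount": 12}.
    """
    latest = None
    for event in events:
        # Only 9102 (session completed) and 9302 (session progress) matter here.
        if event["EventID"] in (9102, 9302):
            latest = event

    if latest is None:
        return "unknown"
    if latest["EventID"] == 9102:
        # A completed session is successful when HResult is 0.
        return "healthy" if latest["HResult"] == 0 else "failed"
    # Otherwise the latest event is 9302: progress means items are being applied.
    return "in progress" if latest["AppliedItemCount"] > 0 else "stalled"
```

A "stalled" result is only a hint: as described above, a session that makes no progress should eventually fail and log Event ID 9102 with the error.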
Registered Server Health
- Event ID 9301 is logged every 30 seconds when the server queries the service for jobs. If GetNextJob completes with status = 0, the server is able to communicate with the storage sync service. If GetNextJob completes with an error, check the following documentation for troubleshooting guidance.
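Because Event ID 9301 is expected every 30 seconds, both a non-zero status and a long silence are worth noticing. The following Python sketch shows one way to express that check. The function name, its parameters, and the tolerance of four missed intervals are assumptions I chose for illustration; they are not values defined by Azure File Sync.

```python
from datetime import datetime, timedelta

# The article states that Event ID 9301 is logged every 30 seconds.
EXPECTED_INTERVAL = timedelta(seconds=30)


def server_is_communicating(last_event_time, last_status, now, missed_allowed=4):
    """Return True if the latest GetNextJob result looks healthy.

    last_event_time: timestamp of the most recent Event ID 9301
    last_status:     GetNextJob status from that event (0 means success)
    missed_allowed:  hypothetical tolerance, in 30-second intervals
    """
    if last_status != 0:
        return False
    # Treat the server as silent if no event arrived within the tolerance window.
    return now - last_event_time <= EXPECTED_INTERVAL * missed_allowed
```

For example, a status of 0 logged one minute ago passes the check, while the same status logged ten minutes ago does not.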
Cloud Tiering Health
Tiering: To monitor tiering activity and errors on a server, check the following event logs:
- Event ID 9002 provides ghosting statistics for a server endpoint. For example, TotalGhostedFileCount, SpaceReclaimedMB, etc.
- Event ID 9003 provides error distribution for a server endpoint. For example, Total Error Count, ErrorCode, etc. Note, one event is logged per error code.
- Event ID 9016 provides ghosting results for a volume. For example, Free space percent is, Number of files ghosted in session, Number of files failed to ghost, etc.
- Event ID 9029 provides ghosting session information. For example, Number of files attempted in the session, Number of files tiered in the session, Number of files already tiered, etc.
Recall: To monitor recall activity and errors, check the following event logs:
- Event ID 9005 provides recall reliability for a server endpoint. For example, Total unique files accessed, Total unique files with failed access, etc.
- Event ID 9006 provides recall error distribution for a server endpoint. For example, Total Failed Requests, ErrorCode, etc. Note, one event is logged per error code.
- Event ID 9007 provides recall performance for a server endpoint. For example, TotalRecallIOSize, TotalRecallTimeTaken, etc.
Since I just enabled Azure File Sync, I don’t have any event ID logged for recall activity yet.
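If you export these events for review, a small lookup table helps to label them. The Python sketch below only encodes the tiering and recall event IDs listed above; the names `CLOUD_TIERING_EVENTS` and `describe_event` are hypothetical helpers of mine, not part of any Microsoft tooling.

```python
# Event ID -> (category, what the event reports), taken from the lists above.
CLOUD_TIERING_EVENTS = {
    9002: ("tiering", "ghosting statistics for a server endpoint"),
    9003: ("tiering", "error distribution for a server endpoint"),
    9016: ("tiering", "ghosting results for a volume"),
    9029: ("tiering", "ghosting session information"),
    9005: ("recall", "recall reliability for a server endpoint"),
    9006: ("recall", "recall error distribution for a server endpoint"),
    9007: ("recall", "recall performance for a server endpoint"),
}


def describe_event(event_id):
    """Return (category, description), or None for IDs not covered here."""
    return CLOUD_TIERING_EVENTS.get(event_id)


def ids_for(category):
    """Return the sorted event IDs that belong to 'tiering' or 'recall'."""
    return sorted(i for i, (cat, _) in CLOUD_TIERING_EVENTS.items() if cat == category)
```

Calling `ids_for("recall")` gives the three recall IDs, which is handy when building an Event Viewer filter by hand.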
Option #3 – Azure File Sync Performance Counters
You can also use the Azure File Sync built-in performance counters to monitor sync activity locally on the server.
Open Perfmon.msc and add the following performance counters:
AFS Bytes Transferred
- Downloaded Bytes/sec
- Total Bytes/sec
- Uploaded Bytes/sec
AFS Sync Operations
- Downloaded Sync Files/sec
- Total Sync File Operations/sec
- Uploaded Sync Files/sec
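If you prefer the command line to the Perfmon UI, the same counters can be sampled with the built-in Windows `typeperf` tool. The Python sketch below only builds the counter paths and the command line; it does not run anything. The `(*)` instance wildcard is an assumption on my part, so check in Perfmon how the instances are actually named on your server. The helper names are hypothetical.

```python
# Counter objects and counters as listed above.
AFS_COUNTERS = {
    "AFS Bytes Transferred": [
        "Downloaded Bytes/sec",
        "Total Bytes/sec",
        "Uploaded Bytes/sec",
    ],
    "AFS Sync Operations": [
        "Downloaded Sync Files/sec",
        "Total Sync File Operations/sec",
        "Uploaded Sync Files/sec",
    ],
}


def counter_paths(counters=AFS_COUNTERS):
    """Build counter paths such as '\\AFS Sync Operations(*)\\Total Sync File Operations/sec'.

    The '(*)' wildcard (all instances) is an assumption, not confirmed by the article.
    """
    return [f"\\{obj}(*)\\{name}" for obj, names in counters.items() for name in names]


def typeperf_command(interval_seconds=5, samples=12):
    """Return an argument list for typeperf (-si = sample interval, -sc = sample count)."""
    return ["typeperf", *counter_paths(), "-si", str(interval_seconds), "-sc", str(samples)]
```

On the server itself, the list returned by `typeperf_command()` could be passed to `subprocess.run` to print the samples to the console.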
Option #4 – Azure Portal – Azure Monitor
Last but not least, you can use Azure Monitor. The Azure Monitor integration is still a work in progress. At the time of writing, you can view the following metrics for Azure File Sync in Azure Monitor when you select the Storage Sync Service.
Since this option is still in preview, the Azure Monitor charts can be previewed by using the following URL: https://portal.azure.com/?Microsoft_Azure_Kailani_ShowCustomerReports=true
- The Bytes Synced metric shows the total size of file data transferred during sync sessions over the last 24 hours. Aggregated every 15 minutes.
- The Cloud Tiering Recall metric shows the size of data recalled over the last 24 hours.
- The Files Not Syncing metric shows per-item errors for the last 24 hours, grouped every 15 minutes.
- The Files Synced metric shows the count of files uploaded and downloaded over the last 24 hours. The data is aggregated every 15 minutes; each bar on the chart shows the sum for that 15-minute period.
- The Server Heartbeat metric shows a value of 1 which indicates a heartbeat was received from the server.
- The Sync Session Result metric shows how many sync sessions are currently running. In this example, I have 5 sync sessions.
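To illustrate what "aggregated every 15 minutes" means for charts such as Files Synced, here is a small Python sketch that sums raw samples into 15-minute buckets, one bucket per bar. It illustrates the arithmetic only. It does not call Azure Monitor, and the sample data shape (timestamp, count) is an assumption of mine.

```python
from collections import defaultdict
from datetime import datetime


def bucket_start(ts):
    """Round a timestamp down to the start of its 15-minute bucket."""
    return ts.replace(minute=ts.minute - ts.minute % 15, second=0, microsecond=0)


def sum_per_bucket(samples):
    """Sum (timestamp, count) samples per 15-minute bucket, ordered by time."""
    totals = defaultdict(int)
    for ts, count in samples:
        totals[bucket_start(ts)] += count
    return dict(sorted(totals.items()))
```

For instance, samples taken at 10:02 and 10:14 end up in the 10:00 bar, while a sample taken at 10:15 starts the next bar.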
You can expect many improvements and enhancements to Azure Monitor in the near future.
As you can see, we have several options to monitor Azure File Sync health and activity. Personally, I think the integration with Azure Monitor looks really promising. Microsoft has great troubleshooting documentation, so make sure to check it if you encounter any issue.
Azure File Sync extends on-premises file servers into Azure by providing cloud benefits while maintaining performance and compatibility. Azure File Sync provides:
- Multi-site access – provide write access to the same data across Windows servers and Azure Files.
- Cloud tiering – store only recently accessed data on local servers.
- Azure Backup integration – no need to back up your data on-premises.
- Fast disaster recovery – restore file metadata immediately and recall data as needed.
I hope you find this guide useful.
Thank you for reading my blog.
If you have any questions or feedback, please leave a comment.