Automate failback for SQL AlwaysOn Availability Group. Starting with SQL Server 2016 AlwaysOn Availability Group (AG) and later, Microsoft enhanced AlwaysOn AG where you can have multiple automatic failover targets. This will add a new scenario where two synchronous secondary replicas can be configured as automatic failover partners with the primary replica. This will allow the following benefits:
- This increases the chances that high availability can be maintained if one of the automatic failover partners is lost.
- Reduces the need to manually manage failovers or to reconfigure the availability group with another automatic failover partner in the event that one of the automatic failover partners is unavailable.
This new enhancement is useful if you have more than 2 nodes as members of the same cluster where you can configure multiple automatic failover targets.
Table of Contents
I recently came across a scenario where I have 2 SQL nodes configured with AlwaysOn Availability Group (AG) in Azure, and I need to set one of the nodes to be always the primary in case of automatic failover. This due to an application that can only communicate to one of the nodes instead of communicating directly with the SQL Availability Group Listener.
In this article, I will share with you how to automate and failback to the preferred (desired) node if both replicas are synchronized and in a healthy state. This scenario is useful if one of the nodes rebooted and updated for maintenance, and you need to keep maintaining the same node as the primary replica for the database after a failover.
Windows Cluster for SQL AG failover
First, let’s have a quick overview of the Windows Failover Cluster for SQL servers to understand the behavior in conjunction with SQL AlwaysOn Availability Group (AG).
In the case of 2 or more nodes in the cluster, and if both the secondary replicas are synchronized and healthy, which replica the Cluster choose to failover to?
By default, Windows Cluster will attempt to failover to the next replica in the preferred owner list of the availability group, which dictates attempted failover order. The preferred owner list of the availability group role in Cluster will list all the availability group replicas defined for automatic failover.
Here is a screenshot of the availability group role properties from Failover Cluster Manager, you can see the Up and Down buttons for setting the priority of these nodes as the owner of the primary replica.
For example replicas SQL17N1 (primary), SQL17N2 (secondary), and SQL17N3 (secondary replica) are all defined for automatic failover and are in that order in the preferred owner list. In the event that a health issue is detected in SQL17N1, Cluster will attempt to failover to SQL17N2 next. In the event that SQL17N2 is not healthy (for example, an availability database was NOT synchronized), the Cluster will attempt to failover to SQL17N3.
A question that will often come up is, can you set the automatic failover priority, by choosing a particular replica for a preferred failover target in SQL Server or by using Failover Cluster Manager?
The short answer is NO! Unfortunately, there is no setting in SQL Server for configuring the preferred owner list settings of the availability group role. Additionally, resetting the priority of the nodes in the preferred owner list of the availability group role in Failover Cluster Manager is also not recommended by Microsoft, because SQL Server manages the Preferred and Possible Owner Properties for AlwaysOn Availability Group/Role. So when a failover occurs, SQL Server resets the Preferred Owner list, over-writing any change made through Failover Cluster Manager GUI.
Read more about the Availability Groups restrictions by Microsoft: Do not use the Failover Cluster Manager to manipulate availability groups.
As a workaround, Microsoft says that you can configure automatic failover priority by adding all the replicas to the availability group in the preferred automatic failover priority. For example, you have created an availability group with primary replica SQL17N1, and add SQL17N2 and SQL17N3 as automatic failover partners. If you add SQL17N2 and then SQL17N3 using the Add Replica to Availability Group wizard in SQL Server Management Studio (SSMS), then SQL Server will add them in the same order to the preferred owner list.
However, this workaround does not solve the automatic fallback scenario to the preferred owner.
Automate SQL AG Failback
To automate SQL Server Availability Group (AG) failback of your preferred database, take the following steps:
First, you need to create a folder named SQL on drive C:\ and then create the following sub-folder in the SQL folder. You can also have the folder on any drive as well.
Next, copy the following T-SQL script to notepad and then save it as (.sql) extension under C:\SQL\Scripts\01-SQLDB-Auto-Failover.SQL
Make sure to update the following parameters:
- replica_server_name = ‘Your-Preferred-Node-Name-Here‘
- database_name = ‘Your-Preferred-SQLDB-PartofAG‘
- ALTER AVAILABILITY GROUP [Your-Availability-Group-Name-Here]
USE master IF EXISTS ( select * from sys.dm_hadr_availability_replica_cluster_states join sys.dm_hadr_database_replica_cluster_states on sys.dm_hadr_availability_replica_cluster_states.replica_id = sys.dm_hadr_database_replica_cluster_states.replica_id join sys.dm_hadr_availability_replica_states on sys.dm_hadr_availability_replica_states.replica_id = sys.dm_hadr_availability_replica_cluster_states.replica_id where replica_server_name = 'Your-<span style="color: #000000;">Preferred-Node-Name-Here</span>' and join_state = 1 and database_name = '<span style="color: #000000;">Your-Preferred-SQLDB-PartofAG</span>' and is_failover_ready = 1 and is_database_joined = 1 and is_pending_secondary_suspend = 0 and synchronization_health = 2 --HEALTHY and role = 2 --SECONDARY ) BEGIN ALTER AVAILABILITY GROUP [<span style="color: #000000;">Your-Availability-Group-Name-Here</span>] FAILOVER END
Next, you need to create the SQL CMD (command) that will run the script above. Open notepad and then copy the following two lines and save the file in the same folder C:\SQL\Scripts\02-SQLCMD.bat as (.bat) extension.
@Echo off sqlcmd -E -S . -i C:\SQL\Scripts\01-SQLDB-Auto-Failover.SQL -o C:\SQL\Scripts\Log.txt
Lastly, open the command prompt window as administrator on the preferred node that you want to be the primary for your database, and then run the following command to create a recurring scheduler task that will run every 5 minutes.
Schtasks /create /tn "SQLAGAutoFailover" /sc minute /mo 5 /rl highest /ru system /tr "C:\SQL\Scripts\02-sqlcmd.bat"
The task will run indefinitely every 5 minutes, the script will check the database availability if it’s healthy (finished synchronization) and where the replica server is the preferred node. If the database is healthy and the replica server is NOT the preferred node, then failover will kick in and move the Availability Group to be the primary on the preferred node.
You can verify that the script runs successfully by looking at the log file which will be created in the same path C:\SQL\Scripts\Log.txt.
That’s it there you have it!
In this article, I showed you how to automate and failback your database to the preferred (desired) node if both replicas (primary/secondary) are synchronized and healthy. This scenario is useful if you are performing maintenance on one of the nodes, and you need to keep and maintain the same node as the primary for the database after failover. The databases resynchronize automatically when both nodes are back online.
Learn more about SQL Server failover and failover modes for AlwaysOn Availability Groups.
Thank you for reading my blog.
If you have any questions or feedback, please leave a comment.