Warning: call_user_func_array() expects parameter 1 to be a valid callback, no array or string given in /home/kalofero/public_html/blog/wp-includes/class-wp-hook.php on line 286
Clustering the vRO Cluster Database using SQL Server (PoC) (SKKB1027) | Spas Kaloferov's Blog

Clustering the vRO Cluster Database using SQL Server (PoC) (SKKB1027)

In this post we will demonstrate in a PoC how to configure and a Microsoft SQL Server Cluster to provide a highly available database instance and use it in a VMware vRealize Orchestrator (vRO) cluster server.

 

Lab Environment

The logical design of this lab can be seen HERE.

 

Introduction

In this post we will demonstrate in a PoC how to configure and a Microsoft SQL Server Cluster to provide a highly available database instance and use it in a VMware vRealize Orchestrator (vRO) cluster server. I would like to note that at this point clustered instance of SQL Server database for use with vRO is not officially supported by VMware or at least I could not find information supporting this. This is just a PoC and testing done here might turn out to be insufficient for a production deployments.

The results are free for interpretation for anyone to determine if SQL Clustering is a viable option.

The Setup

For the purpose of this PoC we will be using the following MS SQL and vRO versions:

  • 2x Node Microsoft Failover Cluster with Windows Server 2012 R2 SP1 x64
  • 2x Node Microsoft SQL Server 2014 x64 Cluster
  • 2x Node vRO 7.0.1 Appliances in a Cluster (Active/Active & Active/Passive)

Setting up the Windows Cluster

The Microsoft Windows Failover Cluster (WFC) setup is straight out of the box setup with nothing special.
In this PoC I’ve used an iSCSI Luns which are presented to all nodes to hold resources. I’m using 3 LUN’s : one for the Disk Witness, one for the clustered MSDTC service and one for the SQL Server instance files.

In addition to the default setup I’ve:

  • Added the MSDTC Server as a Role and thus clustering it in the WFC. This is not 100% necessary in this use case , but it’s good to have it clustered. I haven’t mapped the MSDTC Clustered service to the clustered SQL Instnace , but that can also be done.
  • Configured the MSDTC service for Network DTC Access on all nodes.

For more information on when to cluster the MSDTC service and when to map it to a clustered SQL server instance, visit:  MSDTC Recommendations on SQL Failover Cluster

For additional reading, visit Not-Supported: AGs With DTC/Cross-Database Transactions

I’m not going to go step-by-step through the setup there are a lot of references from Microsoft on how to do this.
Following screenshots summarize the installation and configuration.

Setting up the SQL Cluster

The Microsoft SQL Cluster setup is again straight out of the box setup with nothing special. In this PoC I’ve clustered one instance of SQL Server.
For more information on the setup prerequisites and process, visit SQL Server Failover Cluster Installation & Before Installing Failover Clustering,

In addition to the default cluster setup I’ve:

  • Created a new empty vRO databse
  • Enabled ALLOW_SNAPSHOT_ISOLATION and READ_COMMITTED_SNAPSHOT 

Following screenshots summarize the installation and configuration.

Setting up the vRO Cluster

The VMware Orchestrator cluster setup is again straight out of the box setup with nothing special.

For more information visit the official product documentation at VMware vRealize Orchestrator Documentation

In addition to the default vRO cluster setup I’ve changed the number of active nodes of the cluster b/n 1 & 2 to test an Active/Passive and Active/Active configurations.

Following screenshots summarize the installation and configuration.

 

The Tests & The Results

These are the main tests I’ve performed:

  • Test planned failover of a SQL Server Cluster Node.
  • Test failure of a SQL Server Cluster Node.
  • Test failure of a vRO Cluster node in Active/Passive setup
  • Test failure of a vRO Cluster node in Active/Active setup
  • Monitor connection from the Orchestrator client during all test.
  • Test using the Locking test WF that is part of the OOB vRO library.

For the purpose of these tests I’ve buid a very simple Cluster DB Test WF that increments a number from 0 to 300. Each time it does this the WF creates a couple of System Logs (stored on vRO Appliance and seen on Logs tab) and a couple of Server Logs (stored on DB and seen on the Events tab). I’ve split the action in multiple scriptable tasks. I’ve done this because each time a scriptable task executes and finishes it reads and writes to the DB. I wanted to have a constant communication to the DB.

Following screenshots summarize the WF and configuration.

I’ve performed multiple tests like

  • vRO Active/Passive: Start the workflow on Node 1 and failover the SQL DB.
  • vRO Active/Passive: Start the workflow on Node 1 and simulate failure of the SQL DB.
  • vRO Active/Passive: Start the workflow on Node 1 and simulate failure of the active vRO Node
  • vRO Active/Active: Start the workflow on Node 1 and failover the SQL DB.
  • vRO Active/Active: Start the workflow on Node 1 and simulate failure of the SQL DB.
  • vRO Active/Active: Start the workflow on Node 1 and simulate failure of the active vRO Node
  • vRO Active/Active: Start the workflow on Node 1 & 2 and failover the SQL DB.
  • vRO Active/Active: Start the workflow on Node 1 & 2 and simulate failure of the SQL DB.
  • vRO Active/Active: Start the workflow on Node 1 & 2 and simulate failure of the active vRO Node

All tests delivered pretty much same results:

  • SQL Cluster node failure/failover: WF resumes in ~20 seconds and continues.
  • vRO Node failure: WF resumes in ~120 seconds.
  • vRO Client does not loose connection to the vRO Server. It needs only refresh (F5) after SQL node failure/failover.

Note: This is a 2 layer nested virtual environment. These results might be a bit faster in a real life deployment.
Couple of settings affect the time which take for the SQL/WFC to failover the database and for the vRO Cluster to resume workflow execution.

From SQL/ WFC Cluster side this is the cluster Heartbeat Delay and Threshold setting values.
For more information about these values , visit Tuning Failover Cluster Network Thresholds

From vRO Cluster side this is the cluster Heartbeat Interval (in seconds) and Number of failover hearbeats setting values. These can be set via the Control Center interface, under the Orchestrator Node Settings tab (as shown on one of the screenshot)

After vCO 5.5.3 , Orchestrator has more resiliency and tolerance against disruptions of the communication to the database.

Following are screenshots form some of the tests performed.
(Failure/Failover of SQL Node – WF resume after ~20 sec. )

(Failure of vRO Node – WF resume after ~120 sec. After the 30th increment , where the failure happens, the WF execution continues on the other node)

 

 

Final Step

If all went well, go grab a beer.

DISCLAIMER; This is a personal blog. Any views or opinions represented in this blog are personal and belong solely to the blog owner and do not represent those of people, institutions or organizations that the owner may or may not be associated with in professional or personal capacity, unless explicitly stated. Any views or opinions are not intended to malign any religion, ethnic group, club, organization, company, or individual.
All content provided on this blog is for informational purposes only. The owner of this blog makes no representations as to the accuracy or completeness of any information on this site or found by following any link on this site. The owner will not be liable for any errors or omissions in this information nor for the availability of this information. The owner will not be liable for any losses, injuries, or damages from the display or use of this information.
Photos
Unless stated, all photos are the work of the blog owner and are licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License. If used with watermark, no need to credit to the blog owner. For any edit to photos, including cropping, please contact me first.
Recipes
Unless stated, all recipes are the work of the blog owner and are licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License. Please credit all recipes to the blog owner and link back to the original blog post.
Downloadable Files
Any downloadable file, including but not limited to pdfs, docs, jpegs, pngs, is provided at the user’s own risk. The owner will not be liable for any losses, injuries, or damages resulting from a corrupted or damaged file.
Comments
Comments are welcome. However, the blog owner reserves the right to edit or delete any comments submitted to this blog without notice due to
– Comments deemed to be spam or questionable spam
– Comments including profanity
– Comments containing language or concepts that could be deemed offensive
– Comments containing hate speech, credible threats, or direct attacks on an individual or group
The blog owner is not responsible for the content in comments.
This policy is subject to change at anytime.

Leave a Reply

Your email address will not be published. Required fields are marked *