Monday, October 8, 2012

Part 3 - 4 NODE MULTI-SITE DISASTER, DYNAMIC QUORUM


In Part 3 of this 4-Node Multi-Site clustering series I will go over various disaster recovery scenarios.







PowerShell cmdlet to verify NODE WEIGHT:
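One way to check the node weights from PowerShell, as a minimal sketch assuming the FailoverClusters module is available on the node you run it from:

```powershell
# Load the failover clustering cmdlets
Import-Module FailoverClusters

# List each node with its state, configured vote (NodeWeight)
# and the vote currently assigned by Dynamic Quorum (DynamicWeight)
Get-ClusterNode | Format-Table Name, State, NodeWeight, DynamicWeight -AutoSize
```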


CLUSTER TO BE ONLINE: 3 OUT OF 5 NODE WEIGHT.

COMBINED NODE WEIGHT = 5 (fooprimary+foosecondary+drprimary+drsecondary+quorum)

That being said, we can lose any 2 of the 5 votes and the cluster stays online; losing a third takes the cluster down.
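The arithmetic behind the 3-out-of-5 rule can be written out as a small sketch. The variable names are only for illustration:

```powershell
# 4 nodes + 1 file share witness, each with one vote
$totalVotes  = 5

# A majority is more than half of the votes
$votesNeeded = [math]::Floor($totalVotes / 2) + 1

# Votes that can be lost while the cluster still has a majority
$canLose     = $totalVotes - $votesNeeded

"Votes needed: $votesNeeded, votes we can lose: $canLose"
```

With 5 votes this gives 3 votes needed and 2 votes that can be lost, which matches the numbers above.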


MULTI-SITE SCENARIO:


  1. You want the cluster to be online on the DR site only when all the servers in the Primary site are down.

This is possible by setting the Preferred owners option in the properties of the cluster name.


So by setting the preferred owners option, as long as we have the required 3 node weight and at least 1 server from the Primary site is online, the SQL resources will reside only on either the FOOPRIMARY or FOOSECONDARY node.
The DRPRIMARY/DRSECONDARY nodes will kick in only when all the nodes in the Primary site are down.
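The preferred owners can also be set from PowerShell instead of the properties dialog. This is only a sketch: the role name "SQL Server (MSSQLSERVER)" is an assumption, so check the real name with Get-ClusterGroup first.

```powershell
Import-Module FailoverClusters

# Hypothetical role name; list the real ones with Get-ClusterGroup
$sqlGroup = "SQL Server (MSSQLSERVER)"

# Set the Primary-site nodes as preferred owners, in order of preference
Set-ClusterOwnerNode -Group $sqlGroup -Owners FOOPRIMARY, FOOSECONDARY

# Confirm the preferred owner list
Get-ClusterOwnerNode -Group $sqlGroup
```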

  2. Fail-back option:


Let's assume a scenario where both nodes in the Primary site fail and the resources are in the Secondary site.
Now if a node in the Primary site comes online, how do you want the resources to be handled?

Options:

  1. Prevent fail-back: This setting requires manual intervention to move the resources to the Primary site.
  2. Allow fail-back: Either immediately or at a scheduled time. When the Primary site comes online, you choose whether to move the resources to the Primary site immediately or at a particular time.

The recommended option in MULTI-SITE is “Prevent fail-back”, because after the Primary site is online you want to test and make sure the site is stable before moving the resources.
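The same fail-back setting is exposed as a property on the clustered role. A minimal sketch, again assuming a hypothetical role name:

```powershell
Import-Module FailoverClusters

# Hypothetical role name; list the real ones with Get-ClusterGroup
$group = Get-ClusterGroup "SQL Server (MSSQLSERVER)"

# 0 = prevent fail-back, 1 = allow fail-back
$group.AutoFailbackType = 0

# Review the fail-back settings (the window start/end hours only matter when fail-back is allowed)
$group | Format-List Name, AutoFailbackType, FailbackWindowStart, FailbackWindowEnd
```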


DISASTER SCENARIO CLUSTER ONLINE


DRSECONDARY NODE DOWN:



Cluster: Total node weight is 4, so the cluster will be online.




Cluster: Total Node weight is 2 so the Cluster is down.



Now, to bring the cluster online, we need to force-start the cluster without quorum, then start the other nodes with the prevent quorum option and recreate the quorum.

This is where DYNAMIC QUORUM, if you have it enabled, will reconfigure the quorum on the fly and keep the cluster working even if only 1 node is online.

HOW TO ENABLE: Please check my Part 1 of this series in the section where I configure quorum.




DYNAMIC QUORUM

Note: I highly recommend going through http://technet.microsoft.com/en-us/library/jj612870.aspx, which explains this in depth.
So Dynamic Quorum will work only if:
  1. The cluster has already achieved quorum, meaning there should already be a quorum configured before a node goes down.
  2. Nodes fail sequentially. If a couple of nodes fail simultaneously, Dynamic Quorum will not recalculate the votes; instead the nodes will regroup with the remaining nodes and re-assess whether quorum can be formed, and then Dynamic Quorum will kick in for any further node failure.


Dynamic Quorum is enabled by default. In short, it recalculates the votes of the online nodes on the fly and configures the quorum accordingly to keep the cluster online.
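To confirm that Dynamic Quorum is on for a given cluster, the cluster object can be queried. A small sketch, assuming it is run on one of the cluster nodes:

```powershell
Import-Module FailoverClusters

# 1 = Dynamic Quorum enabled, 0 = disabled
(Get-Cluster).DynamicQuorum

# Current quorum configuration (quorum type and witness resource)
Get-ClusterQuorum | Format-List *
```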



Let's see DYNAMIC QUORUM in action.


DISASTER SCENARIO 1: DRSECONDARY down




Dynamic weight for DRSECONDARY is 0, so Dynamic Quorum kicks in...
As the total node weight is 4 with the file share, the cluster is up and running.

DISASTER SCENARIO 2: DRPRIMARY, DRSECONDARY down





Total weight is 3 with the file share, so the cluster is working.

DISASTER SCENARIO 3: FOOSECONDARY, DRPRIMARY, DRSECONDARY down

Note: I took the 3 nodes down sequentially.





Note: I have set up a continuous ping to the SQL CLUSTER NAME FOOMULTISQL to see what happens when the 3 nodes go down.



Note: I did not see a single packet lost...
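A continuous ping like the one described above can be run with `ping -t FOOMULTISQL`. As an alternative sketch (not necessarily what was used for this test), a PowerShell loop can log a timestamp with every probe:

```powershell
# Ping the SQL cluster name once per second and log each result with a timestamp.
# Stop with Ctrl+C.
while ($true) {
    # -Quiet returns $true when the name answers, $false otherwise
    $ok = Test-Connection -ComputerName FOOMULTISQL -Count 1 -Quiet
    "{0:HH:mm:ss}  FOOMULTISQL reachable: {1}" -f (Get-Date), $ok
    Start-Sleep -Seconds 1
}
```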







To keep the cluster online we need 3 votes, so Dynamic Quorum kicked in and reconfigured the votes on the fly to keep the cluster working (2 node weight + 1 file share witness).


DISASTER SCENARIO 4: FOOSECONDARY, DRPRIMARY, DRSECONDARY down + FILE SHARE quorum SITE down




During this scenario the cluster fails because there is no dynamic weight for the file share witness, so its vote is not recalculated by Dynamic Quorum; the majority of 2 is 2, therefore the cluster goes down.


For more information, check this very nice article:



Earlier, for the same scenario without dynamic weight, we needed to have 3 node weight online all the time, but with the new dynamic weight configuration we can have either 2 nodes, or 1 node and 1 file share, online and still have the cluster up and running.



STEPS TO BRING CLUSTER ONLINE FOR DISASTER SCENARIO 4 :


  1. Investigate why the quorum share couldn't be brought online before you perform the next step.
  2. Force the cluster to start on the last node online. The cluster will basically use that node's copy of the cluster configuration and replicate it to the other nodes when they come online.
    net start clussvc /fq or Start-ClusterNode -FixQuorum


Note: We had Dynamic Quorum enabled, so when we force-started the cluster, it reconfigured the quorum and brought the cluster online. If Dynamic Quorum were not enabled, the cluster would have started in force quorum mode, and any node we add next would need to be started in prevent quorum mode to prevent the remaining nodes from forming a split cluster.
Now we keep adding back the nodes and the file share.
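The force-start and prevent-quorum sequence described in the note could look like the sketch below. It assumes FOOPRIMARY is the last surviving node, as in scenario 4, and shows the manual prevent quorum start that is needed when Dynamic Quorum is not doing it for you:

```powershell
Import-Module FailoverClusters

# On the last surviving node: force the cluster to start without quorum
Start-ClusterNode -Name FOOPRIMARY -FixQuorum

# For each returning node: start the Cluster service in prevent quorum mode
# so it joins the running cluster instead of forming its own
Start-ClusterNode -Name FOOSECONDARY -PreventQuorum

# Check node state and votes afterwards
Get-ClusterNode | Format-Table Name, State, NodeWeight, DynamicWeight -AutoSize
```

The command-line equivalent of the second step is `net start clussvc /pq`, run on the returning node.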



    1. Let's bring foosecondary online...



As soon as the cluster saw the other node coming online, the Cluster service on that server was started in prevent quorum mode and made to join the existing cluster. Now we see a warning that the Node and File Share Majority is in a failed state.

Note: If a node fails at this point, the whole cluster will go down, because the majority of 2 is 2.

Microsoft doesn't recommend the Node Majority quorum model for multi-site clustering; when I configured it on the 4-node cluster, I found that the cluster stays online even if 3 nodes fail.


I think the Dynamic Quorum configuration is a welcome option in Windows 2012.


Friday, September 14, 2012

Part 2 - 4 NODE MULTI-SITE SQLSERVER 2012 CLUSTER

In Part 1 of this series I went over setting up a Multi-Site cluster; today I will go over setting up a 4-node SQL 2012 cluster, which now supports cross-subnet multi-site clustering.
We already have a 5 GB disk which I will use for the SQL database, and I added a 1 GB iSCSI disk for MSDTC.


Lab:


Site: INDIA Subnet : 10.92.76.0/24
Domain Controller/iSCSI Target Server :  FOODC1 - Windows 2008 R2 SP1
FOOPRIMARY – Windows 2012 RC
FOOSECONDARY – Windows 2012 RC

Site: Hartford,USA Subnet: 172.168.0.0/16
Domain Controller/iSCSI Target Server: DRDC1 - Windows 2008 R2 SP1
DRPRIMARY - Windows 2012 RC
DRSECONDARY - Windows 2012 RC

Site: Singapore Subnet: 100.0.0.0/8
Domain Controller: SGPDC1 - Windows 2008 R2 SP1


CLUSTER NAME: FOOMULTICLS IP ADDRESS: 10.92.76.36/172.168.0.36



INSTALLING MSDTC
MSDTC is still a required component before installing a SQL cluster. If you don't install MSDTC you will see a warning message during setup.

Note: You can log into any node and do the install, but best practice is to do it on the node which owns the disk. So I will be doing the MSDTC install and the SQL 2012 install on FOOPRIMARY.










Provide a name and 2 static IP addresses.





Click Finish



Note: Wait till the FOOMSDTC computer object and DNS record get replicated to all the sites.
Also keep an eye on the DNS update when you fail over, as you will see only 1 IP address and we need to fix the all-IP registration and DNS TTL. For more, please check my Part 1 on this.

Testing the Failover:






So we have successfully tested the failover.
Let's fix the all-IP registration and DNS TTL issue by running the PowerShell cmdlets below.

Get-ClusterResource FOOMSDTC | Set-ClusterParameter RegisterAllProvidersIP 1
Get-ClusterResource FOOMSDTC | Set-ClusterParameter HostRecordTTL 300
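To verify that both parameters were stored, they can be read back from the same resource. New values on a network name resource only take effect after the resource has been taken offline and brought online again, so the sketch below also cycles it. This assumes FOOMSDTC is the network name resource, as in the cmdlets above.

```powershell
Import-Module FailoverClusters

# Read back the two parameters from the network name resource
Get-ClusterResource FOOMSDTC | Get-ClusterParameter RegisterAllProvidersIP, HostRecordTTL

# Cycle the network name so the new values are applied.
# Resources that depend on the name go offline with it and have to be
# brought online again afterwards (for example from Failover Cluster Manager).
Stop-ClusterResource FOOMSDTC
Start-ClusterResource FOOMSDTC
```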



This finishes installing MSDTC, so let's move on to installing SQL 2012 on the first node.







INSTALLING SQL 2012 ON THE FIRST NODE:

You can begin your install from any node, but best practice is to start the install on the node which owns the disks required for SQL. I have them on FOOPRIMARY.

FOOPRIMARY:


Note: Another best practice is to copy the SQL 2012 DVD to the local hard drive and run it from there; I have seen some issues where the install would either take long or fail when run from the DVD.


Run the SETUP.EXE





click on installation


click New SQL failover cluster installation


click ok



As I don't have access to internet on this Node I just click skip scan and click next..


MSCS has a warning; see the cluster validation section in Part 1, which explains why.
The .NET Application Security warning appears because the install needs the Windows Update option enabled, which it is not on this node.

Both warnings can be skipped safely, click next....






I just picked only Database Engine and Management studio... click next..










Please change the resource group name; this is how it will show up in Failover Cluster Manager.
Click next...


Pick all the disks you need for SQL and click next...


Provide a static IP and click next..


Best practice is to create a SQL group, add users to it, and provide the SQL group here; I am just using the domain admin user here. Click Collation...


Change it if you need to and click next...


Pick an authentication mode and users, and click on Data Directories...


Provide the SAN disk path. Click FILESTREAM if you need to enable it,
then click next...









Review, go back and change anything if you need to, and click Install...


Click Close and open Failover Cluster Manager.



So we have successfully installed SQL on the 1st node.







ADDING FOOSECONDARY NODE, which is in the same subnet:


Note: Before you proceed further, make sure your SQL CLUSTER NAME and DNS RECORD are replicated to all the sites.



FOOSECONDARY:

Run Setup and click Installation ..

Click Add node to a SQL Server failover cluster....
The rest of the steps are pretty much the same as above, so I will only include the steps from Cluster Node Configuration onward.


It automatically identified the instance, so just click next...



As this server is in the same Subnet it shows the IP Address for SQL cluster which we assigned when we installed FOOPRIMARY NODE.
Just click next..


Provide password for Service account and click next...






Click Close and test failing over from FOOPRIMARY to FOOSECONDARY



So we have successfully installed and added the FOOSECONDARY .



ADDING DRPRIMARY NODE FROM OTHER SUBNET:

Run Setup and click Installation ..


Click Add node to a SQL Server failover cluster....
The rest of the steps are pretty much the same as above, so I will only include the steps from Cluster Node Configuration onward.






Click IPv4, provide a static address and click next...



read and click Yes.











Before we test by failing over to the DRPRIMARY node, let's KEEP AN EYE ON THE DNS record for the SQL CLUSTER NAME.





Let's move the SQL resource to DRPRIMARY.


So we have successfully failed over the SQL Server to DRPRIMARY on the other subnet.
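The same move can also be done from PowerShell instead of Failover Cluster Manager. A minimal sketch, where the role name is hypothetical (it is whatever resource group name was chosen during setup):

```powershell
Import-Module FailoverClusters

# Hypothetical role name; list the real ones with Get-ClusterGroup
$sqlGroup = "SQL Server (MSSQLSERVER)"

# Move the SQL role to the node in the other subnet
Move-ClusterGroup -Name $sqlGroup -Node DRPRIMARY

# Confirm the new owner node and state
Get-ClusterGroup -Name $sqlGroup | Format-Table Name, OwnerNode, State -AutoSize
```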

LETS CHECK THE DNS RECORD NOW ON BOTH SITES:


Note: During failover, a new DNS record with the new IP address 172.168.0.38 has been added.
So basically the cluster registered all the IP addresses of the SQL CLUSTER NAME without us needing to run the PowerShell cmdlet that sets RegisterAllProvidersIP.

Note: We still need to run the cmdlet for DNS TTL if you need to change the default of 20 minutes.
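Changing the TTL for the SQL cluster name follows the same pattern as for MSDTC. This is a sketch only: the resource name "SQL Network Name (FOOMULTISQL)" is an assumption, so look up the real name first.

```powershell
Import-Module FailoverClusters

# List the network name resources with their current TTL (default 1200 seconds = 20 minutes)
Get-ClusterResource |
    Where-Object { $_.ResourceType.Name -eq "Network Name" } |
    Get-ClusterParameter HostRecordTTL

# Hypothetical resource name; use the one returned above for the SQL cluster name.
# The new TTL applies after the network name is taken offline and brought online again.
Get-ClusterResource "SQL Network Name (FOOMULTISQL)" | Set-ClusterParameter HostRecordTTL 300
```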


ADDING DRSECONDARY NODE FROM OTHER SUBNET:

Run Setup and click Installation ..


Click Add node to a SQL Server failover cluster....
The rest of the steps are pretty much the same as above, so I will only include the steps from Cluster Node Configuration onward.









So we have successfully installed and tested SQL Server 4 Node cluster.


In Part 3 I will go over various disaster recovery scenarios.