Wednesday, July 1, 2015

Smart Link on an HP c3000 Enclosure - Shared uplink set down left half my ESX hosts unreachable

Good day all,

Welcome back! Before I share the outage we had and how a small configuration change would have avoided it, I want to explain how to make sure you are redundant if you ever have to work on a c3000, because all embedded NICs use only Interconnect Module 1 to go out.
Confused? See below.


c3000:


c7000:


So, were you able to spot the difference? If not, let me explain. Assume a blade has 2 embedded Flex NICs (LOM). For redundancy you need 2 Interconnect Modules, so that NIC 1 sends all its traffic through Bay 1 and NIC 2 sends all its traffic through Bay 2. That is how it works on a c7000.

In the same scenario on a c3000, both NIC 1 and NIC 2 use Interconnect Module 1 to go out, so there is no redundancy for the Interconnect Modules, and none for the NICs either. So how do you achieve it?
Well, you take the help of mezzanine cards. You add an additional mezzanine card, which gives you 2 more NIC ports, and its traffic goes through Interconnect Module 2. When you team the NICs, make sure you team 1 on-board embedded NIC with 1 mezzanine card NIC, so that you get both NIC redundancy and Interconnect Module redundancy.
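The teaming rule above can be sketched as a quick check. This is only an illustration of the mapping described in this post; the port names (`LOM1`, `MEZZ1_P1`, and so on) and the function name are made up for the example and do not come from any HP tool.

```python
# Hypothetical model of the c3000 port mapping described above.
# Port names are illustrative only, not HP identifiers.
C3000_BAY_FOR_NIC = {
    "LOM1": 1,      # embedded NIC 1 -> Interconnect Module 1
    "LOM2": 1,      # embedded NIC 2 -> Interconnect Module 1 as well
    "MEZZ1_P1": 2,  # mezzanine card port 1 -> Interconnect Module 2
    "MEZZ1_P2": 2,  # mezzanine card port 2 -> Interconnect Module 2
}


def team_is_redundant(team, bay_for_nic=C3000_BAY_FOR_NIC):
    """Return True if the teamed NICs exit through at least two interconnect bays."""
    bays = {bay_for_nic[nic] for nic in team}
    return len(bays) >= 2


if __name__ == "__main__":
    # Two embedded NICs share Bay 1, so this team has no module redundancy.
    print(team_is_redundant(["LOM1", "LOM2"]))
    # One embedded NIC plus one mezzanine NIC spans Bay 1 and Bay 2.
    print(team_is_redundant(["LOM1", "MEZZ1_P1"]))
```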

The question then would be: why not just buy a c7000? Well, cost is one reason, and for a small location you can buy 2 c3000s and make them redundant for, say, an ESXi farm, rather than relying on 1 c7000, right?


Now let me go over the outage we had. We have 2 c3000 frames with 4 ESXi blades in each frame, and we created the clusters in such a way that 2 ESXi hosts from each frame are part of each cluster, one for Production and one for Non-Production. We also created 4 uplink sets, 2 from each Interconnect Module, configured as Active/Active with the same VLANs going through both uplink sets.

The issue we had was that the uplink from Interconnect Bay 2 lost link on the switch end, so any VM using that uplink could not communicate with the network. The vMotion link was down too, so we were unable to vMotion any VMs.

As part of the initial troubleshooting to bring the VMs back onto the network, we enabled Smart Link in Virtual Connect. With Smart Link enabled, Virtual Connect watches the uplinks of a network, and if all of them go down it drops the link on the server-facing ports for that network, so the NIC team on the host sees the failure and moves the traffic to the other, working uplink set. We did not have this configured on the frames.
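To make the effect of Smart Link concrete, here is a small simulation. It is a simplified sketch, not how Virtual Connect is implemented: it assumes the host team fails over purely on link state and uses the first teamed NIC that reports link up. The function names are invented for the example.

```python
def downlink_up(uplinks_up, smart_link):
    """Link state the blade NIC sees for one Virtual Connect network.

    uplinks_up: list of booleans, one per uplink port of the network.
    """
    if smart_link:
        # With Smart Link, the server-facing port follows the uplinks:
        # it stays up only while at least one uplink is up.
        return any(uplinks_up)
    # Without Smart Link the server-facing port stays up regardless.
    return True


def traffic_flows(paths, smart_link):
    """paths: one uplink-state list per teamed NIC, in failover order."""
    for uplinks_up in paths:
        if downlink_up(uplinks_up, smart_link):
            # The host uses the first NIC that reports link up (assumption).
            # Traffic only gets out if that path really has a live uplink.
            return any(uplinks_up)
    return False


if __name__ == "__main__":
    # The NIC in use sits on the path whose only uplink is down;
    # the other teamed NIC sits on a healthy path.
    paths = [[False], [True]]
    print("Smart Link off:", traffic_flows(paths, smart_link=False))
    print("Smart Link on: ", traffic_flows(paths, smart_link=True))
```

Without Smart Link the host keeps sending into a path that has no way out, which matches what we saw during the outage.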

Steps to enable Smart Link:
1. Log into VC Manager
2. Edit the shared uplink sets, one at a time
3. Edit each of the Associated Networks within the shared uplink set, check the "Smart Link" box and click Apply. Do not click Apply on the main shared uplink edit screen until all Associated Networks have been edited
4. After all of them have been edited, click Apply on the main shared uplink edit screen
5. Repeat for all shared uplink sets on both enclosures.
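If you have many enclosures, it helps to keep track of which networks still need the box checked. The sketch below works on a hand-maintained inventory (a plain dictionary you fill in yourself while clicking through VC Manager); it does not talk to Virtual Connect, and every enclosure, uplink set and network name in it is hypothetical.

```python
def networks_missing_smart_link(inventory):
    """Return sorted (enclosure, uplink_set, network) tuples with Smart Link off.

    inventory: {enclosure: {uplink_set: {network: smart_link_enabled}}}
    """
    missing = []
    for enclosure, uplink_sets in inventory.items():
        for uplink_set, networks in uplink_sets.items():
            for network, smart_link in networks.items():
                if not smart_link:
                    missing.append((enclosure, uplink_set, network))
    return sorted(missing)


if __name__ == "__main__":
    # Hypothetical inventory, filled in by hand from VC Manager.
    inventory = {
        "frame-1": {
            "SUS-Bay1": {"VLAN10": True, "VLAN20": True},
            "SUS-Bay2": {"VLAN10": False, "VLAN20": True},
        },
        "frame-2": {
            "SUS-Bay1": {"VLAN10": True},
            "SUS-Bay2": {"VLAN10": False},
        },
    }
    for item in networks_missing_smart_link(inventory):
        print("Smart Link not enabled:", item)
```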

The uplink went down because of some network changes; once they were reverted, the issue was fixed. But with this outage we at least came to realize that Smart Link could have saved the day, and I hope anyone reading this article will go back and check their settings.
I have a lot more c3000s and have started doing the same on them.


Hope this helps someone. Till next time, have a good day!


