Wednesday, November 22, 2017

CPU 95% spike alert - resolved!!!!

Good day All,

Welcome back! We had a strange incident where the client reported CPU alerts. Our monitoring tool was not showing any high CPU usage, but the client has their own monitoring set up and they were getting CPU spike alerts once or twice every couple of days.

We did a routine health check by reviewing a month of CPU usage logs. CPU usage hardly spiked above 30% for the entire month, but the client was still getting alerts, so the following steps were performed to troubleshoot further:

1. Opened Resource Monitor and kept an eye on the CPU spikes to see which service was causing them.
2. After watching closely for about an hour I was able to tell that svchost.exe was using the CPU.
3. As svchost.exe is a shared process, I had to identify which service was responsible, so I ticked the check box next to the service host process in the CPU tab of Resource Monitor and monitored the services under it.

4. From the above I was able to identify that the Event Log service was spiking the CPU.
5. So I opened Event Viewer and started checking the events. In the Security log, at least 10 to 15 events with Event ID 5156 (Filtering Platform Connection) were being generated every second, filling up the 1 GB Security log file in no time. Once the log file is full it tries to overwrite old events, but the number of events is so high that it cannot keep up, and the CPU spikes.

6. Verified in Local Security Policy that the Audit object access policy was enabled for both Success and Failure, and that it was enforced by GPO.

7. Starting with Windows 2008 the audit policy changed and a lot of subcategories were added. You can list them by typing the following at a command prompt: auditpol /get /category:*

8. Our security policy was to have both Success and Failure for Filtering Platform Connection, so I had to raise an exception, and the policy was changed from Success and Failure to Failure only with the following command on the server:
auditPol /set /Subcategory:"Filtering Platform Connection" /Success:disable /failure:enable

Voila, the issue was resolved and we didn't see any more spikes.
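
If you export the Security log to a list of timestamps and event IDs, the check from step 5 can be scripted. Below is a minimal Python sketch of that idea, not the method we actually used (we just watched Event Viewer). The function name noisiest_event and the sample data are made up for illustration, and it assumes you already have (timestamp, event ID) pairs from whatever export you prefer.

```python
from collections import Counter
from datetime import datetime, timedelta


def noisiest_event(events):
    """Return (event_id, count, events_per_second) for the most frequent ID.

    `events` is an iterable of (timestamp, event_id) pairs. The rate is the
    count of that ID divided by the time span covered by all events.
    """
    events = list(events)
    if not events:
        return None
    counts = Counter(event_id for _, event_id in events)
    event_id, count = counts.most_common(1)[0]
    timestamps = [ts for ts, _ in events]
    span = (max(timestamps) - min(timestamps)).total_seconds()
    rate = count / span if span > 0 else float(count)
    return event_id, count, rate


# Made-up sample: 12 events per second of ID 5156 for 10 seconds,
# plus a single unrelated event (ID 4624) at the end.
start = datetime(2017, 11, 22, 10, 0, 0)
sample = [
    (start + timedelta(seconds=s, milliseconds=50 * i), 5156)
    for s in range(10)
    for i in range(12)
]
sample.append((start + timedelta(seconds=10), 4624))

result = noisiest_event(sample)
print(result)
```

An event ID that dominates the count at a steady per-second rate, like 5156 in our case, is the one to look at first.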

Hopefully this helps someone. Until the next one, you all have a good day!

Preparing to configure windows - Do not turn off your computer

Good day All,

Welcome back! Today I will share an issue we encountered on a Windows 2008 server after patching.

After patching, when we logged in to the console, this is the screen ("Preparing to configure Windows. Do not turn off your computer") we were stuck on for a very long time:

Troubleshooting steps performed

1. Rebooted the server, went into Safe Mode and tried to uninstall the installed patches, but it showed an error and the patches could not be uninstalled.
2. Tried Last Known Good Configuration; that didn't work.
3. dism.exe /online /cleanup-image /scanhealth didn't help.

So to fix the issue we rebooted the server and, while the above screen was showing, used MMC to check whether all services were running fine or whether anything was stuck in a stopping state.

On verification we found that the antivirus scan service was stuck in the stopping state. We tried pskill to kill the service but had no luck, so we changed the service startup type to Manual and then rebooted.
Voila, we were able to log in again. We then ran Windows Update, the patches got installed, and then we started the service again.

I have seen people across the internet suggesting removing the pending.xml file under C:\windows\winsxs, so I am sharing that in case the above steps don't work for someone.

Hopefully this helps someone. Until the next one, you all have a good day!

Friday, November 17, 2017

Linux blade after firmware upgrade: NICs disconnected, or unable to reboot or power off

Good day All,

Welcome back! Recently the Unix team started doing firmware upgrades on the blades, and when rebooting the servers after the upgrade a couple of issues were reported:

1. All the NICs would be in a disconnected state.
2. The blade would not respond, and could not be hard reset or powered off.

The following steps were performed to fix the issue:

1. Reset the blade:

  • Log in to the OA and make a note of the bay the blade is in.
  • Now open PuTTY and connect to the primary OA IP.
  • Type reset server followed by the bay number.

It will ask you to confirm and then show that the reset was done successfully.

2. If resetting the blade fails, log in to Virtual Connect Manager. Go to Profiles, select the blade's profile and click Edit. Near the bottom you will see an option to un-assign the profile from the server; select it and click OK. Now go back to the server, assign the same profile again, click Apply and power on the blade.

Hopefully this helps someone. Until the next one, you all have a good day!

Sunday, October 15, 2017

C7000 - decomm 2 C7000 Frames and creating new Domain

Good day All,

Welcome back! Recently we had a request to decommission 2 C7000 frames which are part of a set of 4 linked frames.
Kindly note that, as the decommission includes the primary frame in the domain, we need to create a new domain.

Pre-tasks to be done:

1. Back up the current VC domain in case we have to revert.
2. Take complete screenshots of all server profiles showing which NIC is configured on each server.
3. Take screenshots of the shared uplink sets showing the ports which are configured to the uplink switches.
4. Record all Ethernet network information associated with the shared uplink sets and VLANs.
5. Decide whether you want to use the old name or a new name for the domain.
6. Decide what the IP for the new domain will be.
7. Decide who will remove the existing cables from the 4 frames and how the new stacking links will be configured for only 2 frames.
8. If direct-attached SAN is present, capture all the information needed to recreate it.
9. All blade servers need to be shut down, and we have to recreate all the shared uplink sets, Ethernet networks and server profiles, so the whole activity needs at least 12-14 hours of downtime. You may need more in case you run into issues, so plan your downtime well.

Note: All our blades use the factory-default MAC addresses, so no additional information needed to be captured. If you use domain-assigned addresses instead, you need to determine which address range will be used for them.


Stacking links were requested to be connected as below:

Frame 1 Interconnect Slot 1 Port 1 to Frame 2 Interconnect Slot 1 Port 1
Frame 1 Interconnect Slot 2 Port 1 to Frame 2 Interconnect Slot 2 Port 1


During the change window:

1. Shut down all the server blades.
2. Logged in to VC and un-assigned all the Ethernet networks from each server in every profile on all 4 frames.
3. Deleted all the Ethernet networks on all the frames.
4. Deleted all the shared uplink sets on all the frames.
5. Clicked on the enclosure and, under Configuration, deleted the 2 frames which will be reused to create the new domain. (Note: the domain can also be deleted at this point.)

Note: You will not be able to delete the domain or an enclosure unless the blades have been shut down, the Ethernet networks have been unassigned in the server profiles and the shared uplink sets have been deleted.

6. Now ask the cabling engineer to connect the stacking links, and confirm the work is done before you proceed.

7. Now log in to the OA for the primary frame, open the enclosure's IPv4 settings and make sure all the VC IPs and the iLO IPs for the blades are intact.

Note: If this is a brand-new domain creation, you will need to configure your OA first with the IPs for all the VC modules and the iLOs for the server blades.

8. Now connect to the IP of the VC1 module. As soon as you authenticate you will notice that the domain configuration wizard starts.
The first step is to configure the domain name. Clicking Next asks you to import the current enclosure; give the password and hit Next, and the enclosure should be imported. At the end, before you click Finish, you will see an option to configure networks; un-check it, as we will do that later.

9. Now click on Configuration, go to the IP Address tab, select the option to use a Virtual Connect domain IP and provide the VC domain IP.

10. You will notice that the current session is logged out and you are redirected to the new VC domain IP.

11. Logged back in and started creating the shared uplink sets. Note that all the uplinks will be in standby mode; don't worry, after you add Ethernet networks and assign them to a server profile the uplinks will come online.

12. Started creating the Ethernet networks.

13. Created new server profiles and assigned the Ethernet networks to the NICs as they were before.

14. Powered on the blades. The servers came back online with no issues and there was no need to reconfigure IPs: because we used the blades' factory defaults, no MAC address changed even though a new server profile was attached.

15. Completed the creation for all the frames and handed it over for testing.

So this is how we successfully decommissioned 2 frames in the 4-frame domain and created a new domain.

How to make sure the stacking links are connected properly:

After logging in to VC, when you click on Stacking Links you will see the connection status and the redundancy status both showing OK.

Also, I have seen people find this difficult to understand, so see below: marked in yellow are the external cable links we asked the engineer to connect. The other connections are just the internal wired connections between 2 VC modules which are adjacent to each other in a frame.
Example: connection from the VC1 to the VC2 module.

Note: Always make sure that VC modules in more than 1 frame are cabled in parallel; they should never be cross-cabled.


So this is how we successfully completed the request, and I am sharing it so that it helps someone.
Until the next one, you all have a good day!

Monday, September 4, 2017

Basic to GPT disk

Good day All,

Welcome back! Recently we had a request to add an additional 1 TB of space to an already existing 1.5 TB basic disk on a virtual machine.
A lot of you may be thinking: what is the issue here, and why do we need a post for this?
Well, I still see a lot of admins make the mistake of simply extending the disk beyond 2 TB and then struggling to understand why the volume will not extend past 2 TB.
If you read my first sentence, the answer is there: this is a basic disk with the MBR partition style, and an MBR disk can't be extended beyond 2 TB. So we need to either convert to a GPT disk, or add a new disk and make the disks dynamic.
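
The 2 TB ceiling comes from the MBR partition table storing sector addresses as 32-bit numbers. Here is a small Python sketch of the arithmetic. It assumes 512-byte logical sectors (the common case, but check your own disk), and the function names are just for illustration.

```python
SECTOR_SIZE_BYTES = 512      # assumption: 512-byte logical sectors
MAX_MBR_SECTORS = 2 ** 32    # MBR stores sector counts in 32-bit fields
TIB = 1024 ** 4


def mbr_limit_bytes(sector_size=SECTOR_SIZE_BYTES):
    """Largest size an MBR partition table can address, in bytes."""
    return MAX_MBR_SECTORS * sector_size


def fits_on_mbr(size_bytes, sector_size=SECTOR_SIZE_BYTES):
    """True if a disk of this size is fully addressable with MBR."""
    return size_bytes <= mbr_limit_bytes(sector_size)


limit = mbr_limit_bytes()
current_disk = int(1.5 * TIB)    # the existing 1.5 TB disk
requested_disk = int(2.5 * TIB)  # after adding 1 TB

print(limit / TIB)                  # 2.0
print(fits_on_mbr(current_disk))    # True
print(fits_on_mbr(requested_disk))  # False
```

So the existing 1.5 TB disk is fine as it is, but the requested 2.5 TB is past what MBR can address, whatever you do in Disk Management.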

Dynamic disk: Even MS doesn't recommend this on newer OSes like 2008/2012, and disk performance is not that great.

GPT disk: This is the way to go for performance and future growth, but we will have to format the drive and restore the data.

After some discussion we decided that, as this is a file share drive, and citing performance and future growth, we would go with GPT. We wanted to come up with a plan so that the downtime would be as minimal as possible.


Pre-tasks we performed:

1. Took screenshots of the file share permissions.
2. Took a registry backup, as all the file share permissions are stored there, in case we had to revert or re-apply them.
3. Created a new 2.5 TB GPT disk and attached it to another server, let's call it Server B, in the same ESXi farm.
4. Started a Robocopy batch script with the details below, copying from the source server disk to the GPT disk on the destination Server B.

ROBOCOPY Source_Path Destination_Path /E /XJ /ZB /R:2 /W:5 /LOG+:"C:\Log.txt" /IT /PURGE /COPYALL

@Echo Copying Complete
Pause 

Syntax:
/E :: copy subdirectories, including Empty ones.
/XJ :: eXclude Junction points. (normally included by default).
/ZB :: use restartable mode; if access denied use Backup mode.
/R:n :: number of Retries on failed copies: default 1 million.
/W:n :: Wait time between retries: default is 30 seconds.
/LOG+:file :: output status to LOG file (append to existing log).
/IT :: Include Tweaked files.
/PURGE :: delete destination files and directories that no longer exist in the source.
/COPYALL :: COPY ALL file info (equivalent to /COPY:DATSOU)- Includes all Security Permissions.
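
If you would rather drive the same copy from a script, the command line above can be assembled in Python. This is only a sketch under my own assumptions: the function names and the sample paths are hypothetical, and the switches are simply the ones from the batch script above.

```python
def build_robocopy_command(source, destination, log_path=r"C:\Log.txt",
                           retries=2, wait_seconds=5):
    """Return the Robocopy argument list with the switches listed above."""
    return [
        "robocopy", source, destination,
        "/E", "/XJ", "/ZB",
        f"/R:{retries}", f"/W:{wait_seconds}",
        f"/LOG+:{log_path}",
        "/IT", "/PURGE", "/COPYALL",
    ]


def robocopy_succeeded(exit_code):
    """Robocopy exit codes below 8 mean no copy failures were reported."""
    return 0 <= exit_code < 8


# Hypothetical paths, for illustration only.
command = build_robocopy_command(r"F:\Shares", r"\\ServerB\G$\Shares")
print(" ".join(command))
```

On a Windows host the list could be handed to subprocess.run(command) and the return code checked with robocopy_succeeded, since Robocopy uses non-zero exit codes even for a good copy.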


5. The copy of 1.5 TB of data took about 15 hours.
6. A day before cut-over we ran one more incremental Robocopy and synced all the new changes; it took about 30 minutes.

Steps performed during the cut-over:

1. Went to the shares and closed all the open sessions for the drive.
2. Initiated a final sync so that we did not miss any new changes; it took about 15 minutes.
3. Removed the 1.5 TB disk in Edit Settings on the source VM's properties.
4. Removed the 2.5 TB disk from the destination VM and noted down its path.
5. On the source server, in Edit Settings, added the 2.5 TB disk using that path.
6. Went to Disk Management on the source server and rescanned for the new drive.
7. It was automatically assigned a new drive letter, E.
8. So we changed the drive letter from E to the original F, and all the share permissions were applied to the drive.

I have seen a lot of people get confused about this: please note Robocopy only carries over the security (NTFS) permissions. The share permissions, if required, have to be assigned to the shares manually.

As the registry still held all the share details, as soon as we changed the drive letter the share permissions were picked up automatically and we didn't have to set anything.

The whole downtime for the cut-over steps was about 45 minutes, and then the server was up.

If anyone has a better way of doing it, please share.
So we are at the end of this article. Hopefully this helps someone. Until the next one, you all have a good day!

Friday, September 1, 2017

Unable to change Audit settings in Local group policy even though the settings are not governed by Group policy

Good day!

Welcome back! As part of a non-compliance finding, our security team asked me to enable the below audit policy settings for Success/Failure in local group policy.

Audit system events
Audit process tracking
Audit Policy change
Audit object access

When we tried to enable Success/Failure it seemed to work, but after closing the settings and rechecking, the settings were unchecked again.

So the first thing we checked was whether the settings are bound by group policy, which they are not. If they were bound by group policy it would not even allow us to change them; we would clearly get an error saying they can't be changed as they are enforced by group policy.

For starters, if you don't know, starting with 2008 MS introduced advanced audit settings which you can enable using the auditpol command.
Below is the list of categories and subcategories for auditing.

Advanced policy subcategories:

Audit system events

Category: System

  Security System Extension            
  System Integrity                    
  IPsec Driver                        
  Other System Events                  
  Security State Change                

Audit process tracking

Category: Detailed Tracking

Sub-Category:

  Process Creation                      
  Process Termination                  
  DPAPI Activity                        
  RPC Events                            
  Plug and Play Events                  

Audit privilege use

Category: Privilege Use

  Non Sensitive Privilege Use          
  Other Privilege Use Events            
  Sensitive Privilege Use              

Audit Policy change

Category: Policy Change

  Authentication Policy Change          
  Authorization Policy Change          
  MPSSVC Rule-Level Policy Change      
  Filtering Platform Policy Change      
  Other Policy Change Events            
  Audit Policy Change                  

Audit object access

Category: Object Access

  File System                          
  Registry                              
  Kernel Object                        
  SAM                                  
  Certification Services                
  Application Generated                
  Handle Manipulation                  
  File Share                            
  Filtering Platform Packet Drop        
  Filtering Platform Connection        
  Other Object Access Events            
  Detailed File Share                  
  Removable Storage                    
  Central Policy Staging                

Audit Logon events

Category: Logon/Logoff

  Logon                                
  Logoff                                
  Account Lockout                      
  IPsec Main Mode                      
  IPsec Quick Mode                      
  IPsec Extended Mode                  
  Special Logon                        
  Other Logon/Logoff Events            
  Network Policy Server                
  User / Device Claims                  


Audit directory service access

Category: DS Access

  Directory Service Changes            
  Directory Service Replication        
  Detailed Directory Service Replication
  Directory Service Access              

Audit account management

Category: Account Management

  User Account Management              
  Computer Account Management          
  Security Group Management            
  Distribution Group Management        
  Application Group Management          
  Other Account Management Events      

Audit account logon events

Category: Account Logon

  Kerberos Service Ticket Operations    
  Other Account Logon Events            
  Kerberos Authentication Service      
  Credential Validation                

How to enable a category:

Example:

Auditpol /set /category:"Account Logon" /Success:enable /failure:enable
Auditpol /set /category:"Logon/Logoff" /Success:enable /failure:enable


If you don't want to enable all the audit settings in a category, you can enable just the subcategory.

Example:

AuditPol /Set /Subcategory:"Credential Validation" /Success:enable /failure:enable
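
If you have many categories or subcategories to set, it is easy to generate the command lines instead of typing them. A minimal Python sketch, assuming you only need the /set form shown in the examples above; the helper name build_auditpol_set is made up, and it only builds the text, it does not run anything.

```python
def build_auditpol_set(name, success=True, failure=True, subcategory=False):
    """Return an `auditpol /set` command line for a category or subcategory."""
    scope = "subcategory" if subcategory else "category"

    def flag(enabled):
        return "enable" if enabled else "disable"

    # Plain double quotes on purpose: curly quotes pasted from a web page
    # are not treated as quotes by the command prompt.
    return (
        f'auditpol /set /{scope}:"{name}" '
        f"/success:{flag(success)} /failure:{flag(failure)}"
    )


category_cmd = build_auditpol_set("Account Logon")
subcategory_cmd = build_auditpol_set("Credential Validation", subcategory=True)
print(category_cmd)
print(subcategory_cmd)
```

The names passed in must match the category and subcategory names from the list above exactly.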

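So if you have to use the audit policy subcategories, you need to enable the security option "Audit: Force audit policy subcategory settings (Windows Vista or later) to override audit policy category settings".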

Coming back to my issue: when I tried to change the settings under Audit Policy it would not let me, because the advanced audit policy was enabled, and from then on any setting has to be enabled using the auditpol command only.

The problem was that our security tool only looked at the basic Audit Policy to see whether a setting was enabled; it had no clue about the advanced audit subcategories. So even though we had the settings enabled there, the tool still reported them as not enabled.

So to fix the problem I had to disable the Force audit policy setting, enable all the required settings in Audit Policy, and then enable the Force setting again.

Hopefully this helps someone. Until the next one, you all have a good day!

Tuesday, August 29, 2017

Blade movement from 1 Frame to another in a linked Frames

Good day All,

Welcome back! We had a recent requirement to move a blade from one frame to another in a linked set of 4 frames, and the following steps were performed.

Pre-plan:
1. Note the iLO IP details.
2. If you have to create a new profile, all the NICs' VLAN information needs to be noted.
3. Verify whether the blade is using VC-assigned MACs and WWNs or the server defaults, and note the MAC addresses and WWNs.
4. Make sure the VLANs used in the current frame are already in place on the new frame, which means verifying that Ethernet networks for the same VLANs exist.
5. If a SAN fabric is present, make sure the same is present in the new frame as well.

Steps performed:
Note:
Our frames are an old setup, that is, SAN connected through an MDS fibre switch to 2 VC modules in bays 3 and 4. For starters, this setup is like a physical server connected to external fabric switches, just that the connection is internal, that's it.
Also, the VC profile was set up to use the blade's own MAC and WWNs.

1. Source server was powered down.
2. Existing profile was unassigned.
3. iLO IP was unchecked in the old frame and assigned in the new frame in the OA.
4. After the move, as the frames are linked, assigned the existing profile to the blade's new location in the new frame.
5. Before powering on, the NICs' VLAN networks were changed so that all the NICs use the new frame's uplinks for connectivity.
6. After the change the server was powered on.
7. The NICs' MAC addresses for the blade didn't change and all the IPs etc. were intact.
8. The blade had a QLogic mezzanine card attached to an MDS fibre switch; the WWNs were intact when we moved to the new frame and no re-zoning was required.
9. Post validation was done.

Hopefully this helps someone and until next one you all have a good day!!!

Saturday, August 5, 2017

Cloud - what i understood- System Admin what we need to look at????

Good day All,

Welcome back! I have started spending time looking at cloud, so I thought I would share what I have understood about it, where we as system admins fit in, and what we need to look at.

This is how I think cloud evolved:

We had our own data centers: in layman's terms, it's like having your own backyard in your independent house.

Adv:
1. Full control
2. Very flexible
3. You own your own equipment

Dis:
1. Needs a lot of resources like money, power, security etc.
2. Lot of investment

Next came third-party data centers:

Adv:
1. Security, resources, space, electricity etc. are no longer my headache
2. You own your own equipment
3. Total privacy for your allocated space

Dis:
1. Away from your site
2. No major control
3. Lot of tenants

Well, for a layman I would say it is an apartment in a big complex with a lot of other tenants living there.

Next came fully furnished third-party data centers:

What that means is: you don't bring anything, we will take care of everything.

Adv:
1. No hassles
2. Nothing to own, and no equipment to keep track of

Dis:
1. Not enough customization. Let's say your requirement is a single server with 1 GB of RAM and a 20 GB hard disk; as they have servers with standard capacities, you need to pay for whatever they offer even though it is over-provisioned.

With virtualization playing a key role, third-party data centers started to think: if we are going to give the client a fully furnished apartment, why not customize it according to their needs?

Let's say I need 1 server with 1 GB RAM: great, here you go. Tomorrow I come and say I need to add some more memory to the same server for some time, and then I don't need it any more: well, that can be done.

So data centers started to stack up a lot of high-end hardware with virtualization and began provisioning any kind of requirement for end users across the globe.

So if any cloud provider is able to fulfil the above criteria, then I would say they are providing services in the cloud.

Types of Cloud:

Examples of each cloud service:
1. SaaS: Hotmail, Gmail, Office 365

2. PaaS: Let's say a developer needs a server for some time, with IIS and a SQL database on Windows 2012. We can bundle all these 3 things, create a VM and give it to the developer. He doesn't need to know how to request a VM, how to get the OS, how to install all the software etc. All he has to do is start working and stop worrying about procurement.

3. IaaS: This is where we as system administrators come into the picture, when users say they need a virtual machine and full control of it.

So as a system admin you will experience cloud as a remote data center. Just as we use tools like the VI Client, vCenter or Hyper-V to connect and provision servers, we will use cloud tools to create virtual machines, and connecting will be like how we reach remote data centers through VPN. Same concept, nothing changes. The only difference is that today we are responsible for hardware monitoring and hardware issues, whereas in the cloud that part is no longer our responsibility. Also, the underlying concept is virtualization, so on a hardware failure the virtual machines migrate seamlessly to another host and you will hardly see any hardware issues.



If you are a system admin, it's very important to understand what cloud is and how it is going to change our support models. I am not saying this will wipe out small data centers. But just as virtualization brought a lot of benefits when it came and people started to implement it, cloud is going the same way: it has benefits and companies will look into it. So we need to start understanding where we fit in and start gathering which cloud providers offer which services and what benefits they bring, so that we are prepared when asked.

I am not a cloud expert, I have just started to play around, so I am sharing what I understood so that this article may be a stepping stone for system admins to do a deep dive.


Hopefully this helps someone. Until the next one, you all have a good day!

Wednesday, August 2, 2017

Schedule Task in Citrix

Good day All,

Welcome back! Recently we had a request to publish a scheduled task in Citrix, with the requirement that a user should see only the scheduled task he is authorized to execute.

The following steps were performed:

1. Scheduled Tasks was published in Citrix.
2. As we didn't want to load balance the scheduled task, we made sure it was published on only 1 server.
3. Went to C:\Windows\System32\Tasks, found the scheduled job file, opened Properties and, under Security, added the user who needs access and gave him Full control.
4. Asked the user to test it.


Well, this was easy. Hopefully this helps someone. Until the next one, you all have a good day!

Tuesday, July 25, 2017

Windows 2012 - NO RDP

Good day All,

Welcome back!!!
Recently we had a Windows 2012 server which was online in the console but which we could not RDP to, and we started seeing the below error in the System log:

Event ID:      1057

Description:
The RD Session Host Server has failed to create a new self signed certificate to be used for RD Session Host Server authentication on SSL connections. The relevant status code was Keyset as registered is invalid.

Troubleshooting steps:

1. Server was rebooted
2. Tried stopping and resetting the NIC
3. Tried adding a new NIC
4. Recreated the RDP-Tcp listener

Solution:

1. Go to the folder C:\ProgramData\Microsoft\Crypto\RSA
2. Rename the folder MachineKeys to something like MachineKeys_old
3. Restart the Remote Desktop service

After the service restart we were able to RDP to the server.

Now, after fixing RDP, we found that the IISADMIN service wouldn't start. We tried different articles suggesting that we fix permissions; nothing worked. On further investigation we found that IISADMIN looks for a key file whose name starts with C23, so we went back to the old MachineKeys folder, copied all the C23 files to the new MachineKeys folder, and the IISADMIN service started working.

Well, we thought that was it. Then the application team, which had SharePoint, complained that a SharePoint application pool, the Security Token Service app pool, would start and then fail.
We did some searching with no luck, so we decided to just put back the old MachineKeys folder. We went ahead and restored it, and everything started working with SharePoint, but as you can guess we lost RDP to the server.
On further investigation we found that the RDP service creates a key file named something like “f686aace6942fb7f7…”, so we deleted this file in the existing (old) MachineKeys folder and copied it over from the new MachineKeys folder which had RDP working.
So now RDP and all the IIS applications were working fine.
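
Copying only the key files with a given name prefix from one folder to another, as we did with the C23 files, can be sketched in Python. Treat it as an illustration only: the function name and the demo file names are made up, the demo runs on throwaway temp folders rather than the real MachineKeys folder, and the sketch does nothing about NTFS permissions on the key files.

```python
import shutil
import tempfile
from pathlib import Path


def copy_keys_by_prefix(source_dir, target_dir, prefix):
    """Copy files whose names start with `prefix` (case-insensitive).

    Returns the names of the copied files in sorted order. NTFS permissions
    are not handled here; that is left to whatever copy tool you use.
    """
    source_dir, target_dir = Path(source_dir), Path(target_dir)
    copied = []
    for src in sorted(source_dir.iterdir()):
        if src.is_file() and src.name.lower().startswith(prefix.lower()):
            shutil.copy2(src, target_dir / src.name)
            copied.append(src.name)
    return copied


# Demo on throwaway folders with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    old_keys = Path(tmp) / "MachineKeys_old"
    new_keys = Path(tmp) / "MachineKeys"
    old_keys.mkdir()
    new_keys.mkdir()
    for name in ("c2319c_iis", "c23aa_other", "f686aa_rdp"):
        (old_keys / name).write_text("dummy")

    copied = copy_keys_by_prefix(old_keys, new_keys, "C23")
    present = sorted(p.name for p in new_keys.iterdir())

print(copied)
```

Only the files matching the prefix land in the target folder; everything else in the old folder is left alone.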

If anyone is looking for what the MachineKeys folder contains, well, Google would be a place to start.

Also, for some reason the below didn't work for me, but maybe it will help someone, so I am sharing it as well:

https://blogs.technet.microsoft.com/askperf/2014/10/22/rdp-fails-with-event-id-1058-event-36870-with-remote-desktop-session-host-certificate-ssl-communication/


Hopefully this helps someone. Until the next one, everyone have a good day!

Tuesday, June 20, 2017

Windows 2016 Fail-over Cluster - what's in the box

Good day All,

Welcome back! I was listening to a Microsoft Ignite recording on the great in-box features coming in Windows 2016 failover clustering, so I captured them for people who don't have time to watch the video but would like to keep apprised of what is coming. Here is the list:

Note: All the points are captured from this video; if anyone is interested I highly encourage you to go over it.

1.      Storage QoS added in Windows 2016

2.      Shared VHDX Integration
Guest Clusters can now resize Shared VHDX without downtime
Guest Clusters can now have Shared VHDX protected by Hyper-V Replica for disaster recovery
Guest Clusters can now have host level backups in addition to guest level backups of Shared VHDX

3.       Evolving CSV Cache

4.       Diagnostic Improvements
Additional Validation tests to catch Active Directory configuration issues
Improved Network Name resource logging
Improved Validation times for both Storage and non-storage tests
Less noise logged to the cluster log to prevent wrapping
Additional data logged to cluster.log, header and mini-dump of log level 5 verbosity

5.       Reducing Dump Sizes
Active Memory Dump captures what is important with smaller file sizes
new alternative to a complete (Full) memory dump
Excludes memory allocated to virtual machines
Simplified debugging of Hyper-V systems with large amounts of RAM

6.      Zero Downtime debugging
Clustering will capture live dumps on failures
Live dumps are a mechanism to generate a memory dump for debugging without crashing the system
Capture debugging data without having to bug check nodes
Debugging data without downtime
Capture dumps across multiple machines in parallel to enable debugging the distributed system
Integrated with Windows Error Reporting to snapshot logs

7.       Thunderbolt Networking - for a 2-node cluster you can now plug in a USB cable and IPv6 is auto-configured; no configuration is required.

8.       VM Compute Resiliency
VMs continue to run even when a node falls out of cluster membership. In the past, if a node failed, all the VMs on the node were drained and restarted on another node in the cluster. In 2016 the cluster is resilient to transient failures like spanning tree protocol events, network hiccups or I/O failures on the SAN, and the node comes back up.

9.       VM Storage Resiliency
VM Stack quickly notified on failure
VM moved to Paused Critical state and will wait for storage to recover
Session state retained on recovery

10.   Quarantine of Flapping Nodes
Unhealthy nodes are quarantined and are no longer allowed to join the cluster
Prevents flapping nodes from negatively affecting the other nodes and the overall cluster
Node is quarantined if it ungracefully leaves the cluster 3 times within an hour
VM's are gracefully live migrated once Node is quarantined
Nodes prevented from joining the cluster for 2 hours

11.   Simplified SMB Multi-Channel

12.   VM Start Ordering Improvements

13.   Domain-less (Workgroup) Servers as Cluster

14.   Multi-domain Cluster

15.   Cloud Witness
Stretched clusters without a 3rd site
Clusters without shared storage

16.   Site Awareness
Groups failover to a node within the same site, before failing to a node in a different site
VMs follow storage and are placed in same site where their associated storage resides
VM's will begin live migrating to the same site as their associated CSV after 1 minute

17.   VM Load Balancing
Earlier part of VMM, not in-box
Identifies idle nodes in a cluster and distributes VM's to utilize them
Utilization determined by VM memory & CPU  pressure

18.   Seamless Upgrades
Rolling upgrade from Windows 2012 R2 to Windows 2016, that is, mixed-mode clusters with Windows 2012 R2 and Windows 2016 nodes in the same cluster, except that a few features will be turned off.
In-place upgrades of cluster nodes now possible

19.   End to End Multi-Site Clusters
End to end Windows Server disaster recovery solution
Volume level software replication between storage of any type workload
Synchronous replication

20.   Clusters without shared storage using Storage Spaces Direct.
DAS storage replicated across all nodes: clusters with no shared storage!
Hyper-converged - VMs on a Storage Spaces Direct cluster
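
To make the quarantine rule from point 10 concrete (3 ungraceful exits within an hour, then blocked from joining for 2 hours), here is a toy Python model of it. This is only my reading of the rule as stated in the talk, not how the cluster service implements it; the constant and function names are invented for the example.

```python
from datetime import datetime, timedelta

FAILURE_THRESHOLD = 3                    # ungraceful exits that trigger quarantine
FAILURE_WINDOW = timedelta(hours=1)      # window in which they must occur
QUARANTINE_DURATION = timedelta(hours=2)


def quarantine_until(failure_times):
    """Return when the quarantine would end, or None if never triggered.

    Looks for the first run of FAILURE_THRESHOLD failures that fits inside
    FAILURE_WINDOW and adds QUARANTINE_DURATION to the last one of that run.
    """
    times = sorted(failure_times)
    for i in range(FAILURE_THRESHOLD - 1, len(times)):
        if times[i] - times[i - FAILURE_THRESHOLD + 1] <= FAILURE_WINDOW:
            return times[i] + QUARANTINE_DURATION
    return None


t0 = datetime(2017, 6, 20, 9, 0)
flapping = [t0, t0 + timedelta(minutes=20), t0 + timedelta(minutes=45)]
occasional = [t0, t0 + timedelta(minutes=40), t0 + timedelta(minutes=90)]

flapping_result = quarantine_until(flapping)
occasional_result = quarantine_until(occasional)
print(flapping_result)    # 2017-06-20 11:45:00
print(occasional_result)  # None
```

The second node also failed three times, but not within one hour, so in this model it is never quarantined.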


I don't take any credit for this; I just captured the details for cluster fans. Hopefully it will be useful for someone. Until the next one, you all have a good day!

Thursday, April 6, 2017

MSTSC - need to see RDP screen and punching username and password

Good day All,

Welcome back!!! recently i was pulled to a little issue for RDP and found it interesting so thought to share to all.

The requirement was user had a JumpServer in TestEU domain and trying to RDP to a Server in TestUS domain.

So after connecting to TestEU domain when they launch mstsc and punch in the username and password for TestUS domain, it will take sometime and through error saying "Directory Logon Failure: bad credentials supplied"

1. Firewall ports were verified, nothing was blocking
2. Telnet on port 3389 from the TestEU domain to a server in the TestUS domain was working
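
The telnet check in step 2 can also be scripted. Below is a minimal Python sketch, not the method used in this incident; the function name port_open is made up for illustration. It only shows whether TCP port 3389 is reachable, it says nothing about whether the credentials will be accepted.

```python
import socket


def port_open(host, port=3389, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        # create_connection resolves the name and performs the TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # refused, timed out, unreachable or name resolution failure
        return False
```

Run from the jump server, something like port_open("server01.testus.example") would do the same job as the telnet test (that host name is hypothetical).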

This is when I was called in to check. It took time to understand why they were doing a double hop, and I was able to tell them that this error comes from the RDP client.

As we all know, with RDP client versions before 7.0 you see the RDP desktop first and are then asked for credentials, but from 7.0 on, as soon as you hit Connect it asks for the username and password, does the authentication, and takes you right into the server.

So what was happening: when the user tried from a jump server in TestEU, even though he explicitly provided the TestUS domain name, username and password, the credentials were checked against the current domain, and that gave the bad credentials error.

As for why it does this, please google it; in short, because of single sign-on and a registry setting, it keeps validating the username and password against the current domain.

Work around:

1. Start, Run, type mstsc, click Options and remove any IP under Computer
2. In mstsc, under Connection settings, click Save As and save it on the desktop (it will get saved as default.rdp), then close mstsc
3. Now open Notepad, click Open, select All Files, pick default.rdp and open it
4. Scroll all the way down, add the following 2 lines in Notepad and save it.
enablecredsspsupport:i:0
authentication level:i:0
5. Now ask them to double-click default.rdp on the desktop, type in the IP and see whether it opens the Windows 2012 RDP screen to punch in the domain name, username and password.
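
If you have to do the Notepad edit from steps 3 and 4 on many machines, it can be scripted. Here is a rough Python sketch under a few assumptions: the helper names ensure_rdp_settings and patch_rdp_file are invented for this example, and the file is assumed to be either UTF-16 with a BOM (which mstsc commonly writes) or plain UTF-8. The two settings are exactly the ones from step 4: the first turns off CredSSP on the client side, and the second lets the connection go ahead even if server authentication fails.

```python
import codecs
from pathlib import Path

# The two lines from step 4 of the workaround
SETTINGS = ["enablecredsspsupport:i:0", "authentication level:i:0"]


def ensure_rdp_settings(lines, settings=SETTINGS):
    """Return a new list of .rdp lines with every 'name:type:value' setting applied.

    A line that already sets the same name is replaced in place; settings
    that are missing are appended at the end, like the manual edit.
    """
    wanted = {s.split(":", 1)[0].lower(): s for s in settings}
    seen = set()
    result = []
    for line in lines:
        name = line.split(":", 1)[0].strip().lower()
        if name in wanted:
            if name not in seen:
                result.append(wanted[name])
                seen.add(name)
            # a repeated occurrence of the same setting is dropped
        else:
            result.append(line)
    result.extend(value for name, value in wanted.items() if name not in seen)
    return result


def patch_rdp_file(path, settings=SETTINGS):
    """Rewrite the .rdp file at path with the settings applied (hypothetical helper)."""
    raw = Path(path).read_bytes()
    # Assumption: a UTF-16 BOM means UTF-16, anything else is treated as UTF-8
    is_utf16 = raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE))
    encoding = "utf-16" if is_utf16 else "utf-8"
    lines = raw.decode(encoding).splitlines()
    patched = ensure_rdp_settings(lines, settings)
    Path(path).write_bytes(("\r\n".join(patched) + "\r\n").encode(encoding))
```

Calling patch_rdp_file on the default.rdp saved in step 2 should leave the rest of the file alone and only touch those two settings. Keep a copy of the original file before trying it.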

Note: If you have an RDP client version before 7.0 you would never have encountered this error, because when you launch mstsc and hit Connect it opens an RDP session where you punch in the username and password.

Hopefully this will help someone, and until the next one you all have a good day!!!!

Blade NIC showing unplugged

Good day All,

Welcome back!!! A couple of weeks ago we had an interesting incident, so I thought of sharing it with all so that it helps someone.
We got an alert ticket: one of the production servers on a blade had lost both production team NICs and was unreachable. The strange thing is that the NICs for Backup and Management were both up and online.

So here are the troubleshooting steps we performed, none of which resolved the issue:

1. Updated to the latest drivers, didn't work
2. Replaced the NICs, didn't work
3. Replaced the motherboard, didn't work
4. Removed the blade and put it in the next slot, didn't work
5. Removed the blade and put it in the second half of the slots (bay 14), that didn't work either
6. Tried creating a new profile, nope, that didn't help


Solution:


We created an internal-only network and assigned 1 NIC to it, then on another blade server we assigned a NIC to the same internal network, gave them the IPs 192.168.1.1 and 192.168.1.2, and pinged between both NICs. It started to work.
Now we knew it was not a NIC issue, since with the internal network configured the two blades could ping each other.

So finally we decided to drop the existing VLAN and just recreate the same VLAN... Voila, the issue got resolved.
We reached out to the Level 2 HPTS guys but they had no idea why something like that would fix the issue.
Hopefully this will help someone, until the next one you all have a good day!!!!




Virtual Connect upgrade - Interactive Mode

Good day All,

Welcome back!!! Quite busy these days, so I'm writing again after a very long time.
We have completed upgrading Virtual Connect to 4.45, so I thought of sharing the process and also the issues we encountered along the way.

Also, I have seen people getting confused about what order to follow. We have been doing it in this order for quite some time and it has been very successful:

1. Apply the iLO firmware for all the blades in the enclosure
2. Apply the firmware update for the physical blades, and if it's Windows OS\ESXi apply all the drivers/firmware as well
3. Apply the Onboard Administrator firmware update
4. Last, apply the Virtual Connect firmware update

Before you do anything, back up the configuration.

Download VCSU Utility:
http://h20564.www2.hpe.com/hpsc/swd/public/detail?swItemId=MTX_5e16cbb76d9e46e891ca04048d

VCSU User Guide:

https://h20566.www2.hpe.com/hpsc/doc/public/display?sp4ts.oid=4144084&docId=emr_na-c04567803&docLocale=en_US

Virtual Connect Firmware download:

http://h20564.www2.hpe.com/hpsc/swd/public/detail?swItemId=MTX_3adcc3c4275f460c8d97cad17e

After you install VCSU, go to the Start menu and open Virtual Connect Support Utility - Interactive.

Note: If any VC module is on a version lower than 4.01, you need to update to 4.01 first and then to the required firmware, in our case 4.45.
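
The rule in the note above is easy to get wrong when an enclosure has modules on mixed versions, so here is a small Python sketch of it. This is just an illustration and not part of VCSU: the function names are made up, and it assumes versions are written as "major.minor" strings like the ones in the health check report.

```python
def parse_version(text):
    """Turn a version string such as '4.01' into a comparable tuple, here (4, 1)."""
    major, minor = text.strip().split(".")
    return int(major), int(minor)


def plan_upgrade(current_versions, target="4.45", baseline="4.01"):
    """Return the firmware versions to apply, in order, for one VC domain.

    If any module is below the baseline, the baseline comes first;
    the target is added whenever at least one module is still below it.
    """
    parsed = [parse_version(v) for v in current_versions]
    steps = []
    if any(v < parse_version(baseline) for v in parsed):
        steps.append(baseline)
    if any(v < parse_version(target) for v in parsed):
        steps.append(target)
    return steps
```

For example, modules reported at 3.70 and 4.10 would give a two-step plan (4.01, then 4.45), while modules already at 4.01 or higher go straight to 4.45.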

First step is to run the Health Check report. Open VCSU in interactive mode:

Please enter action ("help" for list): healthcheck
Please enter Onboard Administrator IP Address: OA Primary IP
Please enter Onboard Administrator Username: Administrator
Please enter Onboard Administrator Password: *************

The target configuration is integrated into a Virtual Connect Domain. Please enter the Virtual Connect Domain administrative user credentials to continue.

User Name: Administrator
Password: *************

A detailed report will be generated with the complete list of current VC firmware versions, and any issues will be reported.
Make sure to go over it before you start the upgrade steps below.

After verifying that all modules are at 4.01, run the steps below to update the firmware from 4.01 to 4.45:
  •  Please enter action ("help" for list): update
  •  Please enter Onboard Administrator IP Address: OA Primary IP
  •  Please enter Onboard Administrator Username: *************
  •  Please enter Onboard Administrator Password: *************
  •  Please enter firmware package location: C:\vc\vcfwall445.bin
  •  Please enter Configuration backup password (Optional):
  •  Please enter Force Update options if any (eg: version,health): health
  •  Please enter VC-Enet module activation order if any (eg: parallel or odd-even or serial or manual. Default: odd-even): (hit Enter)
  •  Please enter VC-FC module activation order if any (eg: parallel or odd-even or serial or manual. Default: serial): (hit Enter)
  •  Please enter the time (in minutes) to wait between activating or rebooting VC-Enet modules (max 60 mins. Default: 0 mins): (hit Enter)
  •  Please enter the time (in minutes) to wait between activating or rebooting VC-FC modules (max 60 mins. Default: 0 mins): (hit Enter)
  •  The target configuration is integrated into a Virtual Connect Domain. Please enter the Virtual Connect Domain administrative user credentials to continue
  •  User Name: ************
  •  Password: *************
After about 40-45 minutes you will see a notification showing that all your VC modules are updated to the version you needed.

Issues encountered:

1. If you have a critical database or application that can't withstand at least 10-15 packet drops, make sure the team is aware. We have seen cases with at least 15 packet drops during the upgrade
2. Linux blades running older versions of NIC drivers have been seen losing network connection and had to be rebooted
3. If you have ESXi hosts with VMs running and the NICs are configured as Active/Standby, you will see packet drops during the upgrade
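
If you want real numbers for the packet drops mentioned above, you can leave a small ping loop running against a blade for the length of the upgrade. The Python sketch below is only one way to do it and is not something we ran during our upgrade. All the function names are invented, and the ping flags differ between operating systems, so treat the command lines as assumptions and check them on your own box first.

```python
import platform
import subprocess
import time


def ping_once(host, timeout_s=1):
    """Send one echo request with the system ping command; True if it exits with code 0."""
    if platform.system() == "Windows":
        # -n count, -w timeout in milliseconds
        cmd = ["ping", "-n", "1", "-w", str(int(timeout_s * 1000)), host]
    else:
        # -c count, -W timeout in seconds on Linux (other systems may differ)
        cmd = ["ping", "-c", "1", "-W", str(int(timeout_s)), host]
    done = subprocess.run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return done.returncode == 0


def summarize_drops(results):
    """Given booleans (True = reply received), return (total lost, longest run of losses)."""
    lost = longest = run = 0
    for ok in results:
        if ok:
            run = 0
        else:
            lost += 1
            run += 1
            longest = max(longest, run)
    return lost, longest


def monitor(probe, count, interval_s=1.0):
    """Call probe() count times, pausing interval_s between calls, then summarize the losses."""
    results = []
    for i in range(count):
        results.append(bool(probe()))
        if i < count - 1:
            time.sleep(interval_s)
    return summarize_drops(results)
```

Something like monitor(lambda: ping_once("10.0.0.10"), count=3600) would cover roughly an hour at one ping per second (the address is hypothetical). The longest run of losses is the number to show to the database or application team.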

Hopefully this will help someone, and until the next one you all have a good day!!!!