Wednesday, November 22, 2017

CPU 95% spike alert - resolved!!!!

Good day All,

Welcome back! We had a strange incident where the client reported CPU alerts. Our monitoring tool was not showing any high CPU usage, but the client has their own monitoring set up and was getting CPU spike alerts once or twice every couple of days.

We did a routine health check by reviewing the CPU usage log for the month. CPU had hardly spiked above 30% for the entire month, but the client was still getting alerts, so the following steps were performed to troubleshoot further:

1. Opened Resource Monitor and kept an eye on the CPU spikes to see which process was causing them.
2. After watching closely for about an hour, I was able to tell that svchost.exe was using the CPU.
3. As svchost is a shared process, I had to identify which service was causing it, so I ticked the check box next to the svchost.exe entry in the CPU tab of Resource Monitor and monitored the services running under it.
4. From this I was able to identify that the Event Log service was spiking the CPU.
5. So I opened Event Viewer and started checking the events. In the Security log, at least 10 to 15 events with Event ID 5156 (Filtering Platform Connection) were being generated every second, filling up the 1 GB Security log file in no time. Once the log file is full it tries to overwrite old events, but the number of events is so high that it cannot keep up, and that spikes the CPU.
6. Verified in Local Security Policy that the Audit Object Access policy was enabled for both Success and Failure, and that it was being enforced through a GPO.
7. Starting with Windows 2008, the audit policy changed and a lot of subcategories were added. You can verify them by typing this in a command prompt: auditpol /get /category:*
8. Our security policy was to have both Success and Failure for Filtering Platform Connection, so we had to raise an exception, and the policy was changed from Success and Failure to Failure only with the following command on the server:
auditpol /set /subcategory:"Filtering Platform Connection" /success:disable /failure:enable
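
If you want to review more than one subcategory at a time, the table that auditpol /get /category:* prints can be parsed with a few lines of Python. This is only a rough sketch: the sample text below is made up to match the layout auditpol prints and is not the output from the server in this post, and the function names are my own.

```python
import re

# Illustrative text in the layout printed by `auditpol /get /category:*`.
# These rows are an assumption for the example, not real server output.
SAMPLE_OUTPUT = """\
System audit policy
Category/Subcategory                      Setting
Object Access
  File System                             No Auditing
  Filtering Platform Packet Drop          No Auditing
  Filtering Platform Connection           Success and Failure
Logon/Logoff
  Logon                                   Success and Failure
  Logoff                                  Success
"""


def parse_auditpol(text):
    """Return {subcategory: setting} from auditpol's table output."""
    settings = {}
    for line in text.splitlines():
        # Subcategory rows are indented; category headers are not.
        if not line.startswith("  "):
            continue
        # Name and setting are separated by a run of two or more spaces.
        parts = re.split(r"\s{2,}", line.strip())
        if len(parts) == 2:
            settings[parts[0]] = parts[1]
    return settings


def success_audited(settings):
    """List subcategories whose setting includes Success auditing."""
    return sorted(name for name, value in settings.items() if "Success" in value)


policy = parse_auditpol(SAMPLE_OUTPUT)
print(policy["Filtering Platform Connection"])
print(success_audited(policy))
```

Subcategories that audit Success on busy network or object activity are the ones worth checking first, since those tend to generate the most events.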

Voila, the issue got resolved and we didn't see any more spikes.

Hopefully this helps someone. Until the next one, you all have a good day!
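
To put a number on the event flood before and after a change like this, you could export a slice of the Security log and count events per second for a given Event ID. The sketch below assumes the export has already been reduced to (timestamp, event ID) pairs; the sample records, the timestamp format and the function names are all made up for illustration.

```python
from collections import Counter
from datetime import datetime

# Hypothetical export: one (timestamp, event_id) pair per Security log record.
SAMPLE_EVENTS = [
    ("2017-11-22 10:15:01", 5156),
    ("2017-11-22 10:15:01", 5156),
    ("2017-11-22 10:15:01", 4624),
    ("2017-11-22 10:15:02", 5156),
    ("2017-11-22 10:15:02", 5156),
    ("2017-11-22 10:15:03", 5156),
]


def events_per_second(events, event_id):
    """Average events per second for one event ID over the sampled window."""
    stamps = [
        datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")
        for ts, eid in events
        if eid == event_id
    ]
    if not stamps:
        return 0.0
    # Count the first and last second inclusively so that a sample
    # covering a single second does not divide by zero.
    window = (max(stamps) - min(stamps)).total_seconds() + 1
    return len(stamps) / window


def busiest_event_ids(events, top=3):
    """Return the most frequent event IDs as (event_id, count) pairs."""
    return Counter(eid for _, eid in events).most_common(top)


print(busiest_event_ids(SAMPLE_EVENTS))
print(round(events_per_second(SAMPLE_EVENTS, 5156), 2))
```

Running the same count on a sample taken after the policy change should show the rate for 5156 dropping sharply if success auditing was the source.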

Preparing to configure Windows - Do not turn off your computer

Good day All,

Welcome back! Today I will share an issue we encountered on a Windows 2008 Server after patching.

After patching, when we logged into the console, the "Preparing to configure Windows - Do not turn off your computer" screen stayed up for a very long time.

Troubleshooting steps performed

1. Rebooted the server, went into Safe Mode and tried to uninstall the installed patches, but it showed an error and they could not be uninstalled.
2. Tried Last Known Good Configuration; that didn't work.
3. dism.exe /online /cleanup-image /scanhealth didn't help.

So to fix the issue we rebooted the server and, while the above screen was showing, opened an MMC to check whether all services were running fine or whether anything was stuck in a stopping state.

On verification we found that the antivirus scan service was in a stopping state. We tried pskill to kill the service but had no luck, so we changed the service startup type to Manual and then did a reboot.
Voila, we were able to log back in. We then ran Windows Update, the patches got installed, and then we started the service again.
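
If there are a lot of services to look through, the text printed by sc query state= all can be scanned for anything sitting in STOP_PENDING. The following is a minimal sketch, assuming you have saved that output to text; the sample block and the service name ExampleAV are made up, and the function name is my own.

```python
# Illustrative text in the layout printed by `sc query state= all`.
# "ExampleAV" is a hypothetical service name used only for this sketch.
SAMPLE_SC_OUTPUT = """\
SERVICE_NAME: ExampleAV
DISPLAY_NAME: Example Antivirus Scan
        TYPE               : 10  WIN32_OWN_PROCESS
        STATE              : 3  STOP_PENDING
                                (NOT_STOPPABLE, NOT_PAUSABLE, IGNORES_SHUTDOWN)
        WIN32_EXIT_CODE    : 0  (0x0)

SERVICE_NAME: EventLog
DISPLAY_NAME: Windows Event Log
        TYPE               : 20  WIN32_SHARE_PROCESS
        STATE              : 4  RUNNING
                                (STOPPABLE, NOT_PAUSABLE, ACCEPTS_SHUTDOWN)
        WIN32_EXIT_CODE    : 0  (0x0)
"""


def services_in_state(text, wanted="STOP_PENDING"):
    """Return the service names whose STATE line ends with `wanted`."""
    found = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("SERVICE_NAME:"):
            current = line.split(":", 1)[1].strip()
        elif line.startswith("STATE") and current is not None:
            # Example line: "STATE              : 3  STOP_PENDING"
            state = line.split(":", 1)[1].split()[-1]
            if state == wanted:
                found.append(current)
    return found


print(services_in_state(SAMPLE_SC_OUTPUT))
```

Any service this turns up is a candidate for the same treatment as above: set it to Manual (for example with sc config and start= demand), reboot, and start it again once patching has finished.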

I have seen people across the internet suggest removing the pending.xml file under C:\windows\winsxs, so I am sharing that in case the above steps don't work for someone.

Hopefully this helps someone. Until the next one, you all have a good day!

Friday, November 17, 2017

Linux blade after firmware upgrade: NICs disconnected, or unable to reboot or power off

Good day All,

Welcome back! Recently the Unix team started doing firmware upgrades on the blades, and when the servers were rebooted after the upgrade a couple of issues were reported:

1. All the NICs would be in a disconnected state.
2. The blade would not respond, and could not be hard reset or powered off.

The following steps were performed to fix the issue:

1. Reset the blade:

  • Log in to the OA and make a note of the bay the blade is in.
  • Now open PuTTY and connect to the primary OA IP.
  • Type reset server followed by the bay number.

It will ask you to confirm and then show that the reset was done successfully.

2. If resetting the blade fails, log in to Virtual Connect Manager. Go to Profiles, select the blade and click Edit. Near the bottom you will see an option to un-assign the profile from the server; select it and click OK. Now go back to the server, assign the same profile again, click Apply, and power on the blade.

Hopefully this helps someone. Until the next one, you all have a good day!