VMware ESXi 5.5 Purple Screen (PSOD) w/ Server 2012 & 2012 R2

I’ve had the displeasure of being greeted first thing in the morning by VMware’s purple screen of death (PSOD), both in my home lab and in production. I noticed a trend: once we had more than two or three 2012 R2 guests on a single ESXi 5.5 host, the host would purple screen at around 40 days of uptime.

After lots of digging and frustration I found a VMware KB article (http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2059053) that ties the issue to the E1000E virtual NIC. What struck me as odd is that the E1000E is the default adapter for 2012 and 2012 R2 guests in the desktop vSphere Client, while the vCenter web client gives you a VMXNET3 adapter by default.
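To get a sense of how many guests are exposed before scheduling downtime, an inventory check along these lines may help. This is only a sketch: it assumes you have already exported your VM inventory into plain records by whatever means you prefer, and the field names `name`, `guest_os`, and `nic_types` are hypothetical, not anything VMware defines.

```python
# Hypothetical inventory records; the field names are assumptions for this sketch.
AFFECTED_GUEST_MARKERS = ("2012",)  # matches "Windows Server 2012" and "2012 R2"


def vms_needing_nic_swap(inventory):
    """Return names of Server 2012 / 2012 R2 guests that still use an E1000E NIC."""
    flagged = []
    for vm in inventory:
        is_2012 = any(marker in vm["guest_os"] for marker in AFFECTED_GUEST_MARKERS)
        has_e1000e = any(nic.lower() == "e1000e" for nic in vm["nic_types"])
        if is_2012 and has_e1000e:
            flagged.append(vm["name"])
    return flagged


if __name__ == "__main__":
    # Made-up sample data, purely for illustration.
    sample = [
        {"name": "dc01", "guest_os": "Windows Server 2012 R2", "nic_types": ["E1000E"]},
        {"name": "sql01", "guest_os": "Windows Server 2012", "nic_types": ["VMXNET3"]},
        {"name": "web01", "guest_os": "Debian 7", "nic_types": ["E1000E"]},
    ]
    print(vms_needing_nic_swap(sample))
```

Anything the function returns is a candidate for the adapter swap described in this post.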

So, long story short, the resolution is to capture your IP configuration, shut down the VM (scheduling downtime like a responsible admin), remove the E1000E adapter, and add a VMXNET3 adapter. Please note that VMXNET3 requires VMware Tools to be installed in Windows Server for the guest to recognize the adapter. Below are the steps to complete this operation:

Right-click your VM and choose Edit Settings.

Once you’ve selected Edit Settings, click your virtual NIC and look at the right-hand side of the window to see what type of adapter it is.

To remove the adapter, select the network adapter and click the Remove button at the top of the window. Then click the Add button to add a new virtual NIC.

Make sure to select VMXNET3 from the dropdown list.

Follow the wizard to victory, then boot the VM, install VMware Tools if not already installed, and reconfigure your IPs. Total downtime for this operation should be 5-10 minutes per 2012 or 2012 R2 VM.
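Reconfiguring the IPs is the step most likely to eat into that downtime, so it can help to turn the settings you captured into ready-to-paste commands beforehand. The following is a minimal sketch, not a tested procedure: the `IPv4Config` class and the connection name `Ethernet0` are assumptions made for illustration, and the new VMXNET3 adapter may show up in Windows under a different connection name, so check it before running anything.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class IPv4Config:
    """Static IPv4 settings captured from the guest before the NIC swap."""
    interface: str   # Windows connection name, e.g. "Ethernet0" (assumed)
    address: str
    netmask: str
    gateway: str
    dns_servers: List[str] = field(default_factory=list)


def netsh_restore_commands(cfg):
    """Build netsh commands that reapply the saved settings to the new adapter."""
    commands = [
        f'netsh interface ipv4 set address name="{cfg.interface}" '
        f"static {cfg.address} {cfg.netmask} {cfg.gateway}"
    ]
    # DNS servers are added one by one, in order of preference.
    for index, server in enumerate(cfg.dns_servers, start=1):
        commands.append(
            f'netsh interface ipv4 add dnsservers name="{cfg.interface}" '
            f"address={server} index={index}"
        )
    return commands


if __name__ == "__main__":
    # Example values only; substitute what you captured from the VM.
    saved = IPv4Config("Ethernet0", "192.168.1.10", "255.255.255.0",
                       "192.168.1.1", ["192.168.1.2"])
    for line in netsh_restore_commands(saved):
        print(line)
```

The printed lines are meant to be run from an elevated command prompt inside the guest once the new adapter is visible.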

After implementing these steps I have successfully avoided another PSOD in both lab and production. Several other colleagues with similar issues report the fix above resolved the problem for them as well.

Updating ESXi From ESX-CLI

Enable SSH or ESXi Shell on the ESXi Host

Method 1: Without the vSphere Client

Use DRAC to open the ESXi server’s console, press F2, and enter your credentials.

Arrow down to Troubleshooting Options and press Enter, then choose to enable the ESXi Shell, SSH, or both.

To access the ESXi Shell press Alt+F1; to return to the management screen press Alt+F2.

Method 2: Using vSphere Client

Log into the vSphere Client and click the host you wish to update. Then select the Configuration tab.

On the left-hand side under Software, click Security Profile, then click Properties in the top right-hand corner.

Choose SSH and start the service. Optionally, set it to start with the host if you wish to leave it enabled.

Check Version and Patch Level

Log in over SSH or at the ESXi Shell and run the following command:

vmware -vl

The following VMware KB article correlates build numbers to product versions and patch levels: http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1014508
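If you check several hosts, pulling the build number out of the command output by hand gets tedious. Here is a small sketch of how that could be scripted. The sample output below is illustrative only and was typed in by hand for this example, so treat the exact layout as an assumption and adjust the pattern if your host prints something different.

```python
import re

# Illustrative output only; not captured from the hosts discussed in this post.
SAMPLE_OUTPUT = """VMware ESXi 5.5.0 build-1623387
VMware ESXi 5.5.0 Update 1
"""

# Matches a line such as "VMware ESXi 5.5.0 build-1623387".
_VERSION_LINE = re.compile(
    r"VMware ESXi (?P<version>\d+(?:\.\d+)*) build-(?P<build>\d+)"
)


def parse_vmware_vl(output):
    """Extract the version string and build number from `vmware -vl` output."""
    match = _VERSION_LINE.search(output)
    if match is None:
        raise ValueError("no 'VMware ESXi <version> build-<n>' line found")
    return {"version": match.group("version"), "build": int(match.group("build"))}


if __name__ == "__main__":
    print(parse_vmware_vl(SAMPLE_OUTPUT))
```

The build number it returns is the value to look up in the KB article above.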

 

Obtaining Updates

To obtain patches for ESXi, download them from the following link: https://my.vmware.com/group/vmware/patch#search

Upload Patches To Datastore

Create a directory called patches on the datastore and upload the zip files you downloaded to it.

Apply Patches

Evacuate any running VMs from the host and place the host in maintenance mode.

Once you have completed the step above, access the ESXi CLI from either DRAC or SSH and run the following command:

esxcli software vib update --depot=/vmfs/volumes/<datastore name>/patches/<name of patch zip>

Depending on the update, you may need to use install instead of update in the command.

Once the CLI reports that the installation was successful, reboot the host to complete the patch installation.
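When patching more than one host, it may be handy to generate the exact commands up front so nothing gets mistyped at the console. The sketch below only builds the command strings; it does not run them. It assumes the datastore is reachable under /vmfs/volumes/<datastore name>, uses the esxcli maintenance mode command as an assumed alternative to the vSphere Client step, and the datastore and zip names in the example are made up.

```python
import shlex


def patch_commands(datastore, patch_zip, use_install=False):
    """Return the ESXi shell commands for applying one patch zip."""
    # "update" is the default; some bundles need "install" instead.
    action = "install" if use_install else "update"
    depot = f"/vmfs/volumes/{datastore}/patches/{patch_zip}"
    return [
        # Assumed CLI equivalent of entering maintenance mode from the client.
        "esxcli system maintenanceMode set --enable true",
        # shlex.quote only adds quotes when the path needs them (e.g. spaces).
        f"esxcli software vib {action} --depot={shlex.quote(depot)}",
    ]


if __name__ == "__main__":
    # Hypothetical datastore and file names, for illustration only.
    for line in patch_commands("datastore1", "example-patch.zip"):
        print(line)
```

Pass `use_install=True` for the bundles that need install rather than update, and reboot the host yourself after the CLI reports success.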

Monitoring Your Servers for Free (Part 4)

It has certainly been a month of monitoring blog posts over here! I have been running a little behind schedule thanks to a chest cold I caught, but I’m on my way to recovery, and with a little coffee and determination we’ll get back to it! This post is the epic conclusion of our Monitoring Your Servers for Free series. For this demonstration we will use the TurnKey Linux virtual appliance for Observium, built on top of Debian Wheezy. If you’d like to install Observium on your RHEL or CentOS server instead, instructions can be found at http://www.observium.org/wiki/RHEL_Installation

Download & Deploy TurnKey Linux Observium Virtual Appliance

To download the appliance, visit http://www.turnkeylinux.org/observium. You will have several choices of download format. If you are using Hyper-V I would recommend the ISO; since I will be using VMware ESXi 5.5, we will download the OVA template. Once it’s downloaded to your local machine, follow the steps below:

  1. Unzip the OVA template
  2. Open the vSphere Client and click File > Deploy OVF Template
  3. In the Deploy OVF Template wizard, click Browse and point to the OVF template you unzipped
  4. Proceed through the remaining steps in the wizard, customizing as appropriate for your environment
  5. Power on the TurnKey VM and open the console
  6. Proceed through the setup wizard for the virtual appliance

 

Prepare Windows Servers for Observium

Before we can begin reporting data to Observium, we must first ensure that SNMP is enabled. This can be done through Server Manager using the Add Roles and Features wizard. In addition, you will need to ensure that the hostnames of the servers you’re monitoring can be resolved by Observium. This may require adding entries to your local DNS server if they do not already exist, or to /etc/hosts on the Observium server.

  1. Install SNMP (see above)
  2. Restart the server (the SNMP options needed aren’t available until after a reboot)
  3. Open Services (run command: services.msc)
  4. Double-click the SNMP Service
  5. Click the Security tab
  6. Add a community name (e.g., Observe)
  7. Add the IP of the Observium server to the “Accept SNMP packets from these hosts” list
  8. Open a browser, navigate to your Observium server’s web page, and log in
  9. Under the Devices menu choose Add Device
  10. Provide the hostname
  11. Choose SNMP v1
  12. Enter the SNMP community name set in step 6 and click Add Device
  13. Wait a few minutes for Observium to begin collecting data on your new server
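Since a device can only be added if its hostname resolves, it may save some head-scratching to check resolution before step 9. The sketch below is one way to do that with Python’s standard library, assuming Python is available on the Observium server. The function names are my own for this example, and the hostnames and IPs in it are made up.

```python
import socket


def unresolved_hosts(hostnames, resolver=socket.gethostbyname):
    """Return the hostnames that the given resolver cannot turn into an IP."""
    missing = []
    for name in hostnames:
        try:
            resolver(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            missing.append(name)
    return missing


def hosts_file_lines(known_ips, hostnames):
    """Build /etc/hosts lines for the names we know an IP for."""
    return [f"{known_ips[name]}\t{name}" for name in hostnames if name in known_ips]


if __name__ == "__main__":
    # Hypothetical servers; replace with the hosts you plan to monitor.
    servers = ["dc01", "sql01"]
    missing = unresolved_hosts(servers)
    for line in hosts_file_lines({"dc01": "10.0.0.5", "sql01": "10.0.0.6"}, missing):
        print(line)
```

Any lines it prints are candidates for /etc/hosts on the Observium server, or a reminder to create the missing DNS records instead.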

 

Monitoring Linux Servers and Cisco Equipment

While I typically use New Relic for Linux monitoring, Observium has published instructions for adding Linux servers, as well as instructions for adding Cisco switches and firewalls. Again, the gotcha here is that A records must exist in local DNS, or entries must be created in /etc/hosts on the Observium server, as all devices are managed by hostname, not by IP.

Linux: http://www.observium.org/wiki/NetSNMPd_Client_Configuration

Cisco: http://www.observium.org/wiki/Cisco_IOS_SNMP_Configuration

 

Video

Coming Soon!