Damn you C-states! (Unexpected XenServer reboot)

Written by Ingmar Verheij on February 21st, 2012. Posted in XenServer

Processors can save energy by entering a low-power mode. Each processor has several power modes called “C-states”. C-states were introduced with the 486DX4 processor and are still present in current processors. Over time, more C-states have been added to lower power consumption further.

Hypervisors (used to virtualize desktops or servers) such as Citrix XenServer or Microsoft Hyper-V can have issues with C-states, causing them to freeze, BSOD or slow down. This happens when C-state 3 “Sleep” or higher is enabled in the BIOS.

C-what?

A processor has multiple power modes, or C-states, it can operate in. The purpose of these modes is to lower power consumption by disabling features when you don’t need them. Each mode shuts down (or lowers) one or more components of the processor, which lowers power consumption but increases the time needed to wake up. The higher the C-state the processor is in, the deeper the sleep.

Processor C-states - Source: Intel

The normal processor operating mode is C0, the operating state, in which the CPU is fully operational. Besides a number, C-states are also known by a name (which describes their function) and can have sub-modes. As you can see in the diagram, the higher the C number, the more circuits and signals are turned off and the deeper the sleep.

While C-state 0 is the basic operating mode, C-states 1 through 3 save power by cutting clock signals, and C-states 4 through 6 work by reducing CPU voltage. The enhanced C-state modes can do both at the same time. Here’s a quick overview of the C-states present in modern processors (like the Intel Nehalem and Westmere):

Mode | Name | What it does
C0 | Operating State | CPU fully turned on
C1 | Halt | Stops CPU main internal clocks via software; bus interface unit and APIC are kept running at full speed
C1E | Enhanced Halt | Stops CPU main internal clocks via software and reduces CPU voltage; bus interface unit and APIC are kept running at full speed
C1E | (no name given) | Stops all CPU internal clocks
C2 | Stop Grant | Stops CPU main internal clocks via hardware; bus interface unit and APIC are kept running at full speed
C2 | Stop Clock | Stops CPU internal and external clocks via hardware
C2E | Extended Stop Grant | Stops CPU main internal clocks via hardware and reduces CPU voltage; bus interface unit and APIC are kept running at full speed
C3 | Sleep | Stops all CPU internal clocks
C3 | Deep Sleep | Stops all CPU internal and external clocks
C3 | AltVID | Stops all CPU internal clocks and reduces CPU voltage
C4 | Deeper Sleep | Reduces CPU voltage
C4E/C5 | Enhanced Deeper Sleep | Reduces CPU voltage even more and turns off the memory cache
C6 | Deep Power Down | Reduces the CPU internal voltage to any value, including 0 V

The Intel Nehalem (or Core i7) has an embedded power control unit that allows the voltage for individual parts of the processor to be reduced or turned off.

If you want to know more about the CPU C-states and power saving modes you should definitely read this article from Gabriel Torres.

 

How do I know which C-states are enabled?

In Citrix XenServer this is fairly easy: connect to the console (either via XenCenter or an SSH-capable client like PuTTY) and enter the command found below; it returns the total number of C-states. If the number of C-states is above 2, the C3 state “Sleep” is enabled and should be disabled in the BIOS.
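
One way to read the C-state count on a Xen host is the xenpm tool (`xenpm get-cpuidle-states`). The following is a minimal sketch rather than a definitive check: it assumes the output contains lines of the form "total C-states : N" for each CPU, and the helper names (`total_cstates`, `deep_cstates_enabled`, `read_xenpm`) are made up for illustration. It applies the "above 2" rule of thumb from this post.

```python
import re
import subprocess

# Assumed line format in `xenpm get-cpuidle-states` output,
# e.g. "total C-states       : 4" (one such line per CPU).
TOTAL_RE = re.compile(r"total C-states\s*:\s*(\d+)")


def total_cstates(xenpm_output):
    """Return every 'total C-states' value found in the given text."""
    return [int(n) for n in TOTAL_RE.findall(xenpm_output)]


def deep_cstates_enabled(xenpm_output, limit=2):
    """True if any CPU reports more than `limit` C-states."""
    return any(count > limit for count in total_cstates(xenpm_output))


def read_xenpm():
    """Run xenpm on the host itself (as root) and return its text output."""
    return subprocess.check_output(["xenpm", "get-cpuidle-states"]).decode()


if __name__ == "__main__":
    # Hypothetical sample text; on a real host use read_xenpm() instead.
    sample = "cpu id               : 0\ntotal C-states       : 4\n"
    print(total_cstates(sample))
    print(deep_cstates_enabled(sample))
```

On a real host you would pass `read_xenpm()` instead of the sample text. If the function returns True, the BIOS settings are worth a look.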

In Microsoft Windows Hyper-V you can check by looking at Start > Shut down, since the C3 state is what enables the machine to enter the Sleep state. If there is an option to enter Sleep mode (or Hibernate), the C3 state is enabled.

 

Unexpected XenServer reboot

Today around 14:45, during a training session, all virtual machines on a Citrix XenServer froze while the students were working on their lab. It didn’t take long to figure out the cause of the freeze: the machine had rebooted unexpectedly. After I connected to the host via a Lantronix Spider, an error message from the RAID controller was shown.

LSI MegaRAID SAS-MFI - Error

After bringing the host and virtual machines back online I started troubleshooting the issue. In /var/log/messages I found the following events:

The event on line 5, “Entering sleep due to inactivity”, stood out because the opposite was true. There were 17 virtual machines active, users were actively using the machines (they were working on a lab) and the Citrix XenServer was pushed to its limits. There was really no reason to enter sleep mode.

This is most likely caused by the C-states being enabled. After a quick check on the host I found that C-states were indeed enabled (total C-states: 4). So I rebooted the machine, entered the BIOS (I’m using a white box with an Asus motherboard), and disabled the following features:

  • Advanced \ CPU Configuration \ Enhanced Intel SpeedStep Technology
  • Advanced \ CPU Configuration \ Turbo Mode
  • Advanced \ CPU Configuration \ CPU C1E
  • Advanced \ CPU Configuration \ CPU C3 Report
  • Advanced \ CPU Configuration \ CPU C6 Report

None of these features is really necessary on a hypervisor host. You don’t want the host to vary the frequency of the processor to reduce power consumption if this negatively impacts the performance (or stability) of your virtual machines. Citrix recommends disabling Turbo mode and C-states in knowledge base article CTX127395.

According to Citrix: “The core problem is faults in the hardware implementation of the new C-state features in this generation of CPUs. Citrix is investigating the software workarounds that can be implemented to avoid these issues, but recommends that on a current affected hardware, C-states should remain disabled in the BIOS until Intel can provide a CPU microcode update that facilitates the behavior of C-states as designed.”

 

How about Hyper-V?

Microsoft Hyper-V has issues with the C-states as well. You can read about it in this blog article.

Ingmar Verheij

At the time Ingmar wrote this article he worked for PepperByte as a Senior Consultant (up to May 2014). His work consisted of designing, migrating and troubleshooting Microsoft and Citrix infrastructures. He was working with technologies like Microsoft RDS, user environment management and (performance) monitoring. Ingmar is User Group leader of the Dutch Citrix User Group (DuCUG). RES Software named Ingmar RES Software Valued Professional in 2014.

Comments (4)

  • 27 February 2012 at 08:29 |

    When I’m building a new hypervisor host, that’s one of the first things I turn off: C-states, and for AMD, AMD Cool’n’Quiet. They cause nothing but issues for hypervisors.

  • Fabian
    13 May 2013 at 15:05 |

    The “xsconsole: Entering sleep due to inactivity – xsconsole is now blocked waiting for a keypress” message indicates that nobody was using the console on the hardware machine itself.
    I don’t remember the exact timeout, but it hasn’t got anything to do with load.

    • Ingmar Verheij
      13 May 2013 at 16:46 |

      Hi Fabian,

      Thanks for clearing that up. After reading it back, that indeed makes sense.

  • fereshteh
    10 August 2016 at 10:32 |

    hi
    Thanks for the explanations you have given. I want to put a XenServer hypervisor into deep sleep mode.
    I changed the BIOS settings as you described, but how can I determine whether sleep mode or deep sleep mode is active?
    Also, I want to write a script for deep sleep mode. How do I do it?
    Thank you for your guidance.
