Virtual machines running on a PCS/PSBM server experience Blue Screens of Death or Kernel Panics during a spike of CPU load.
All virtual machines are assigned a large number of CPUs.
The CPU configuration is not optimal, so the CPU limits for the VMs need to be reconfigured. Timer interrupts cannot be handled properly, which causes the guest operating system to crash.
Possible ways of optimization:
- Avoid CPU overcommitment. Assign each VM only the number of CPUs that it will actually use, with no spare CPUs on top. Either assign a specific number of CPUs (preferred) or set a CPULIMIT for the VM.
- Follow CPU configuration best practices.
- Follow TimeKeeping best practices.
- Do not overcommit the server in terms of the RAM assigned to VMs.
Processor virtualization adds varying amounts of overhead, depending on the percentage of the virtual machine's workload that can be executed on the processor and the cost of virtualizing the remaining workload.
For applications that are processor-bound (that is, most of the application's time is spent executing instructions rather than waiting for external events, such as user interaction, device input, or data retrieval), any processor virtualization overhead translates into a reduction in overall performance.
Applications that are not processor-bound can still deliver comparable performance because processor cycles remain available to absorb the virtualization overhead.
Recommendations for optimal processor performance are the following:
When configuring virtual machines, keep in mind that there is always some virtualization overhead (even an idle CPU in the VM must handle at least timer interrupts). Take care not to excessively overcommit processor resources, in terms of both processor utilization and the total number of virtual processors.
Allocating more processors than will regularly be used results in additional overhead, including queuing for available cores on the physical host. If the guest operating system and applications will regularly use only 2 processors, allocate only 2 processors. Allocating 4 in this case only degrades performance due to queuing and overhead. Do not use virtual symmetric multiprocessing (SMP) if your application is single-threaded and does not benefit from the additional virtual processors. In some guest operating systems, an unused virtual processor still consumes timer interrupts and executes the idle loop of the guest operating system, which translates into real processor consumption.
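The sizing rule above can be checked with a few lines of arithmetic. The following is a minimal sketch in Python, not a PCS/PSBM tool: the function names, VM names, and figures are all hypothetical, and the "regularly used" CPU counts are assumed to come from your own monitoring.

```python
def vcpu_overcommit_ratio(vcpus_per_vm, host_cores):
    """Return the total assigned vCPUs divided by the host's physical cores."""
    if host_cores <= 0:
        raise ValueError("host_cores must be positive")
    return sum(vcpus_per_vm.values()) / host_cores


def oversized_vms(assigned_vcpus, regularly_used_vcpus):
    """Return {vm: spare vCPUs} for VMs assigned more vCPUs than they regularly use."""
    return {
        vm: assigned - regularly_used_vcpus[vm]
        for vm, assigned in assigned_vcpus.items()
        if vm in regularly_used_vcpus and assigned > regularly_used_vcpus[vm]
    }


# Hypothetical inventory: names and numbers are made up for illustration.
assigned = {"web": 4, "db": 8, "mail": 2}
used = {"web": 2, "db": 8, "mail": 1}

print(vcpu_overcommit_ratio(assigned, host_cores=8))  # 1.75
print(oversized_vms(assigned, used))  # {'web': 2, 'mail': 1}
```

A ratio above 1 means virtual processors may have to queue for physical cores under load, and every VM listed by `oversized_vms` is a candidate for a smaller CPU count.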
Make sure that the CPU limit you plan to set for a virtual machine or Container does not exceed the total CPU power of the server. So if a server has 4 CPUs, 1000 MHz each, do not set the CPU limit to more than 4000 MHz.
The processes running in a virtual machine or Container are scheduled for execution on all server CPUs. For example, if a server has 4 CPUs, 1000 MHz each, and you set the CPU limit for a virtual machine or Container to 2000 MHz, the virtual machine or Container may consume the CPU power of several CPUs (more than 2), e.g. 100% of 1 CPU and 33.3% of each of the remaining 3 CPUs.
All running virtual machines and Containers on a server cannot simultaneously consume more CPU power than is physically available on the node. In other words, if the total CPU power of the server is 4000 MHz, the running virtual machines and Containers on this server will not be able to consume more than 4000 MHz, irrespective of their CPU limits. It is, however, perfectly normal for the combined CPU limit of all virtual machines and Containers to exceed the total CPU power of the node, because most of the time virtual machines and Containers consume only part of the CPU power assigned to them.
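The CPU limit arithmetic from the paragraphs above can be sketched in Python as follows. This is an illustration only: the function names and the VM/Container names are hypothetical, and the numbers reuse the 4 x 1000 MHz example server.

```python
def total_cpu_power_mhz(cpu_count, mhz_per_cpu):
    """Total CPU power of the server: number of CPUs times the clock of each."""
    return cpu_count * mhz_per_cpu


def limits_exceeding_server(limits_mhz, total_mhz):
    """Names whose individual CPU limit is higher than the whole server's power."""
    return sorted(name for name, limit in limits_mhz.items() if limit > total_mhz)


def max_simultaneous_usage_mhz(limits_mhz, total_mhz):
    """Upper bound on combined consumption: capped by the physical total."""
    return min(sum(limits_mhz.values()), total_mhz)


def consumed_mhz(cpu_fractions, mhz_per_cpu):
    """CPU power consumed when each CPU is used at the given fraction (0.0 to 1.0)."""
    return sum(fraction * mhz_per_cpu for fraction in cpu_fractions)


total = total_cpu_power_mhz(4, 1000)
# Hypothetical limits; "vm2" is deliberately misconfigured.
limits = {"vm1": 2000, "ct101": 3000, "vm2": 5000}

print(total)                                      # 4000
print(limits_exceeding_server(limits, total))     # ['vm2']
print(max_simultaneous_usage_mhz(limits, total))  # 4000
# 100% of one CPU plus one third of each of the other three is about 2000 MHz.
print(round(consumed_mhz([1.0, 1 / 3, 1 / 3, 1 / 3], 1000)))  # 2000
```

The combined limits here (10000 MHz) exceed the server total, which is acceptable; only the single 5000 MHz limit is flagged, because no one virtual machine or Container can ever use more than the server has.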
64-bit guests and applications can have better performance than corresponding 32-bit versions.
The guest operating system timer rate can have an impact on performance:
Linux guests keep time by counting timer interrupts.
Unpatched 2.4 and earlier kernels program the virtual system timer to request clock interrupts at 100 Hz (100 interrupts per second).
Some 2.6 kernels request interrupts at 1000 Hz, others at 250 Hz, so you should check your kernel to determine the actual rate.
These rates apply to uniprocessor (UP) kernels only; SMP Linux kernels request additional timer interrupts.
Timer rates in Microsoft Windows operating systems are specific to the version of Microsoft Windows and the Windows hardware abstraction layer (HAL) installed.
For most uniprocessor Windows installations, the base timer rate is usually 100 Hz. Virtual machines with Microsoft Windows request 1000 interrupts per second if they run applications that make use of the Microsoft Windows multimedia timer service; such multimedia applications should be avoided if possible.
- If you have a choice, use guest operating systems that require lower timer rates.
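To check which timer rate a Linux guest kernel was built with, one option is to look at the `CONFIG_HZ` setting in the kernel build configuration. The sketch below parses that value from configuration text in Python; the function name is hypothetical, the sample text is made up, and the assumption is that the guest exposes its kernel configuration as a readable file (many distributions ship it as /boot/config-<kernel release>, but this varies).

```python
import re


def parse_config_hz(config_text):
    """Return the CONFIG_HZ value from kernel config text, or None if absent."""
    # Matches "CONFIG_HZ=250" but not helper lines such as "CONFIG_HZ_250=y".
    match = re.search(r"^CONFIG_HZ=(\d+)\s*$", config_text, flags=re.MULTILINE)
    return int(match.group(1)) if match else None


# Made-up excerpt in the format used by Linux kernel configuration files.
sample = """\
# CONFIG_HZ_100 is not set
CONFIG_HZ_250=y
# CONFIG_HZ_1000 is not set
CONFIG_HZ=250
"""

print(parse_config_hz(sample))  # 250
```

On a real guest, the text passed to `parse_config_hz` would be read from the kernel configuration file. A result of 250 or 1000 indicates a higher base interrupt rate than the 100 Hz of older kernels.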