Why Are Virtual Machines So Slow?

Generally, virtual machines are slower than systems running on bare metal servers. They are also likely to be slower than their host servers, but not for the same reasons. Since there are many possible reasons why a virtual machine is slow, you need to identify the specific issue before you can fix it.

1. CPU Overcommitment

CPU overcommitment, or oversubscription, is a common problem in virtualization.

Basically, a virtual machine or a set of VMs shouldn’t be assigned more virtual processors (vCPUs) than the host server’s CPU can handle. One or all virtual machines will likely be slow if the allocation exceeds hardware capacity.

Suppose you have a host server CPU with 10 cores. These 10 cores can have 20 threads if you take into account hyperthreading.

As a baseline, the number of vCPUs assigned to all the virtual machines hosted by this server shouldn’t exceed the logical cores, i.e., the total thread count.

If you opt for 2:1 mapping (two vCPUs per thread), you can allocate a maximum of 40 vCPUs across the virtual machines on this host server.

But the host server should keep at least one core for its own workload. Exhausting all of a host server’s CPU cores and threads will affect the efficiency of both the host and its VMs.

You can opt for 1:1 mapping to create 20 virtual CPUs.

However, the host server may still need at least 1 core (2 threads) for itself, so the ideal approach is to create up to 18 vCPUs with this type of server configuration. 1:1 mapping isn’t necessary unless guest applications are demanding.

A virtual machine that’s too slow due to CPU overcommitment should be moved to another host, or the current server must be given more processor capacity.
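The arithmetic for the 10-core, 20-thread host above can be sketched in a few lines of Python. This is only an illustration: the function name `vcpu_budget` and its parameters are hypothetical, and it assumes that the core reserved for the host takes both of its threads out of the pool.

```python
def vcpu_budget(physical_cores: int, threads_per_core: int = 2,
                ratio: float = 1.0, reserved_cores: int = 1) -> int:
    """Return the number of vCPUs that can be shared out among all VMs.

    ratio is the vCPU-to-thread mapping (1.0 for 1:1, 2.0 for 2:1).
    reserved_cores is the number of physical cores kept for the host (an assumption).
    """
    usable_cores = max(0, physical_cores - reserved_cores)
    usable_threads = usable_cores * threads_per_core
    return int(usable_threads * ratio)


# 10 cores with hyperthreading, as in the example above.
print(vcpu_budget(10, ratio=1.0))                    # 1:1 mapping, one core reserved
print(vcpu_budget(10, ratio=2.0, reserved_cores=0))  # 2:1 mapping, nothing reserved
```

With one core reserved, 1:1 mapping leaves 18 vCPUs, while 2:1 mapping with nothing reserved gives the 40-vCPU ceiling.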

The logical cores limit the parallel requests or simultaneous transactions a host server CPU can process, which affects CPU performance.

2. RAM Overcommitment

RAM or memory overcommitment is somewhat similar to CPU oversubscription.

The logical cores of a host server CPU decide the maximum number of simultaneous requests of vCPUs that will be processed. Likewise, the allocated and available RAM influences a VM’s efficiency.

In theory, RAM overcommitment isn’t a problem because virtual machines don’t necessarily use all the allocated or available memory in real time.

This isn’t very different from a regular desktop or laptop not utilizing the full memory capacity, or the CPU, for that matter.

In practice, however, RAM overcommitment can make virtual machines slower if many use significant memory simultaneously.

Since RAM overcommitment tends to leave little or no memory for the host server itself, the parent machine can also become unusably slow.

Suppose a host server with 16 GB RAM has 10 virtual machines, each with 2 GB of memory.

If half of these VMs are idle, and the other 5 virtual machines use no more than 10 GB, the host server will have sufficient memory, and everything will likely run efficiently.

However, if all 10 virtual machines try to use their full 2 GB of allocated memory at once, the total demand of 20 GB exceeds the 16 GB installed, and the host server will have no RAM left for its own functioning.

Every virtual machine in this setup can also be painfully slow due to RAM or memory overcommitment.
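The two scenarios for the 16 GB host can be checked with a small sketch. The function `memory_pressure` and the 2 GB host reserve are assumptions made for illustration, and idle VMs are treated as using no memory at all, which is a simplification.

```python
def memory_pressure(host_ram_gb: float, vm_allocations_gb: list,
                    vm_usage_gb: list, host_reserve_gb: float = 2.0) -> dict:
    """Summarize overcommitment for one host.

    host_reserve_gb is a hypothetical amount of RAM the host needs for itself.
    """
    allocated = sum(vm_allocations_gb)
    in_use = sum(vm_usage_gb)
    free_for_host = host_ram_gb - in_use
    return {
        "overcommit_ratio": allocated / host_ram_gb,
        "free_for_host_gb": free_for_host,
        "under_pressure": free_for_host < host_reserve_gb,
    }


allocations = [2] * 10  # 10 VMs with 2 GB each on a 16 GB host

# Scenario 1: half the VMs are idle, the other 5 use 2 GB each.
print(memory_pressure(16, allocations, [2] * 5 + [0] * 5))

# Scenario 2: all 10 VMs use their full allocation at the same time.
print(memory_pressure(16, allocations, [2] * 10))
```

In the first scenario the host keeps 6 GB free. In the second, demand exceeds installed RAM by 4 GB, which is when swapping and slowdowns begin.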

3. High Storage Latency

Following CPU and RAM or memory, storage is the third most important hardware resource that can make virtual machines slower than they should be.

All these critical hardware resources are subject to allocation and real-time utilization issues. Consider the following storage factors:

  • IOPS: input/output operations per second.
  • Throughput: data rate, in bits or bytes per second.
  • Storage latency: read/write response time.

Suppose a host server HDD has a disk spindle speed of 7,200 rpm. The available IOPS is roughly 100 for this hard disk drive, so 5 such HDDs will offer about 500 IOPS.

If this host machine runs 50 VMs, each one gets only about 10 IOPS, and every virtual system will be slow. The solution is fewer VMs or better disk drives.

Many guest applications require at least 30 to 50 IOPS to prevent storage latency. So, this set of 5 HDDs can host 10 to 16 virtual machines.

But if some VMs or guest applications require more than 50 IOPS, or around 100 IOPS, due to demanding processes, these HDDs will likely fail to prevent latency.
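The VM counts above follow from dividing the pooled IOPS by the IOPS each guest needs. A minimal sketch, with the hypothetical helper `max_vms_by_iops`, and assuming the IOPS of the drives simply add up (it ignores RAID write penalties and caching):

```python
def max_vms_by_iops(disks: int, iops_per_disk: int, iops_per_vm: int) -> int:
    """Return how many VMs the pooled disks can serve at the given per-VM IOPS."""
    if iops_per_vm <= 0:
        raise ValueError("iops_per_vm must be positive")
    total_iops = disks * iops_per_disk
    return total_iops // iops_per_vm


# 5 HDDs at roughly 100 IOPS each, as in the example above.
for need in (30, 50, 100):
    print(f"{need} IOPS per VM -> {max_vms_by_iops(5, 100, need)} VMs")
```

At 100 IOPS per guest, the same five drives support only 5 virtual machines.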

Additionally, the current state of a storage system influences latency.

A fragmented hard disk drive of the host server can adversely affect the efficiency and performance of virtual machines. Likewise, disk encryption might cause or worsen storage latency.

Disk encryption adds overhead to every read and write, which can impair I/O speed and response time.

Disabling disk encryption can solve the latency issue in such cases unless it is critical for a particular guest application or virtual machine.

4. Internet/Network Issues

Virtual machines generally use the following types of networks:

  • Bridged networking.
  • Host-only networking.
  • Network Address Translation (NAT).

Virtual network issues can make guest machines slower than usual. The same applies to a VM using the internet to connect to a host machine or server.

If a virtual machine uses the host’s internet, that could cause latency if the connectivity isn’t flawless.

Virtual machines in distinct environments have specific networking demands.

Anything that can affect the internal and external interactions due to a glitch in the virtual network and the internet connection may affect the efficiency and overall performance of a VM or a particular application.

Furthermore, antivirus software and firewall configurations of virtual machines can affect a VM’s communication with the host or specific processes of an application.

Both antivirus software and firewall should be configured properly to avoid untoward disruption and other influences.

If a virtual machine uses the host’s internet connection, bandwidth is subject to the same kind of resource contention as the limited memory, processor, and storage capacities.

A host machine must have sufficient bandwidth and internet speed for all the virtual machines sharing the network.

5. Hypervisor Inefficiency

One of the fundamental reasons virtual machines are so slow is the additional hypervisor layer in the virtualization environment. Normal computers have only 2 modes:

  • Kernel.
  • User.

There is no hypervisor mode, so none of the potential overheads in a virtual machine exist in the context of a regular desktop computer or laptop.

The hypervisor mode in a virtual machine likely includes an operating system, which itself is also an overhead for the host.

The hypervisor of a virtual machine is intended to manage the overheads and system utilization, most of which is in real-time.

If this hypervisor isn’t efficient or mismanages something crucial, a virtual machine can be much slower than expected. Consider the example of memory or RAM. 

Most hypervisors are responsible for dynamic RAM or memory management.

When necessary, these programs dedicate more memory to a virtual machine and reallocate RAM based on the real-time needs of other VMs and the host server. 

However, an inefficient hypervisor can cause problems, including poor performance or low speed of a virtual machine. This could be due to one or more of the following issues:

  • Low memory (RAM).
  • Memory paging.
  • Hard page fault.
  • Soft page fault.

Memory paging and page faults are common when a virtual machine, its operating system, or a process doesn’t have sufficient RAM to function optimally.

These issues due to the inefficiency of a hypervisor aren’t necessarily caused or facilitated by standard RAM overcommitment.

6. Resource Contentions

Resource contentions are expected with virtualization because virtual machines share the host hardware.

Most contemporary hypervisors are configured to mitigate various types of resource contentions and even prevent them in some instances. But not all contentions are avoidable.

The most common resource contentions pertain to the following components:

  • CPU.
  • RAM.
  • Storage.
  • Network.

Whenever a host server is optimized to the utmost extent, which is mostly for financial reasons, there is little leeway for the virtual machines to share the hardware resources.

Thus, any excess activity that a host or its virtual environment can’t process goes into a queue.

This queue affects the performance of one or many virtual machines and the host server.

Virtual machines can be tediously slow for a prolonged period if this queue isn’t resolved quickly.

Such problems aren’t always due to demand and supply. At times, virtualization is the problem.

Virtualization facilitates many dynamic interventions, whether automated or manual. An example is creating new virtual machines for a brief period.

If enough resources are available, a new VM or a few won’t be an issue. But an already stressed host may succumb to resource contentions. 

7. Unusual VM Workload

Any change to an existing workload, such as a new demanding process in a VM or new virtual machines on the host, can slow down that specific system. There may also be a ripple effect if the limits set in the hypervisor are too high, which may affect the performance of other virtual machines and the host.

Consider the maximum memory setting in a hypervisor.

If a virtual machine has a very high limit and it initiates some demanding processes, the hypervisor may allocate the maximum memory if sufficient RAM is available.

Otherwise, the demanding VM will not perform as expected.

The same reality applies to a virtual machine that isn’t running a demanding application, but the ripple effect of another VM causes a specific resource contention.

So, you may have a slow VM despite running the regular applications and not demanding anything extra from the host server.

8. Snapshots or Checkpoints

All virtual machines can use snapshots to establish baselines, but having too many of them and keeping one for a long time will affect the performance of a VM.

Checkpoints or snapshots aren’t an alternative to backup. Also, snapshots are advantageous only when they are used sparingly.

You may be able to use a dozen, a score, or more snapshots in a virtual environment, but a couple will suffice for most users.

Besides, snapshots should not be kept or used for a prolonged period. Snapshot files grow larger with every disk-write activity.

An increasingly larger snapshot may exhaust the available storage space.

The simplest remedy for this problem is to take a snapshot as a baseline when you work on something new, complete the task, and delete the checkpoint once it is no longer necessary. Otherwise, your VM may be slow.

9. Subpar Virtualization

Virtualization is a complex process with a lot of dynamic variables. However, one of the primary objectives of virtualization is maximizing the utilization of all available resources.

This objective compels every component of the host infrastructure to be optimized to the utmost extent.

Maxing out any hardware is overcommitment and paves the way for resource contentions. A virtual machine will already be slow due to these issues.

VMs can be even slower if the host has subpar hardware or the virtual machines are poorly configured.

Compare the latency of HDD to that of SSD. A host using SSD storage is unlikely to have more than 3 ms of latency, whereas the lag for HDD can be 10 ms to 20 ms or longer at times.
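To see what these latency figures mean in practice, the sketch below converts latency into the number of operations a drive can complete when requests are issued strictly one at a time. The helper `serial_iops` is hypothetical, and the one-request-at-a-time model is a simplifying assumption: real drives, SSDs in particular, serve many requests in parallel and achieve far more.

```python
def serial_iops(latency_ms: float) -> float:
    """Return IOPS when each request waits for the previous one to finish."""
    if latency_ms <= 0:
        raise ValueError("latency_ms must be positive")
    return 1000.0 / latency_ms  # 1,000 ms in a second


# Latency figures from the comparison above.
for label, latency_ms in [("SSD", 3.0), ("HDD", 10.0), ("HDD, slow case", 20.0)]:
    print(f"{label}: {latency_ms} ms -> about {serial_iops(latency_ms):.0f} IOPS")
```

Under this model, a 10 ms HDD lands at about 100 IOPS, and doubling the latency halves that figure.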

Likewise, every virtual machine should be configured based on its core functional requirements.

A VM running a light software application or a simple CRM is unlikely to be as slow as a virtual machine that works with databases, graphics, etc.

Similarly, a virtual machine with many dependencies isn’t going to be as fast as one with few or none.

There are different approaches to configuring virtual machine dependencies, mostly depending on the correlation and the impact of the action you might take.

While dependencies within a VM environment call for standard protocols, external processes sometimes affect performance.

A virtual machine may use applications or processes that require communications with external systems, such as another virtualized host or individual VMs.

If an external process is a reason for a slow virtual machine, there is little to nothing your host server or hypervisor can do.