2023-04-27

Understand virtio memory balloon

Introduction

Virtio memory ballooning is a technique for adjusting memory allocation in virtualized environments at run time. The hypervisor can take memory away from a virtual machine or give it back, using a balloon driver in the guest operating system. When the host needs memory, the balloon inflates: the driver allocates pages inside the guest and hands them to the host, so the guest has less memory to use. When the guest needs memory, the balloon deflates and the pages return to the guest.

This technique improves memory utilization and reduces the risk of memory exhaustion on the host, which makes it useful in cloud computing environments. It also has trade-offs. Inflating the balloon can hurt performance if the guest operating system can’t release memory quickly enough, and a guest that is already under high memory pressure may be pushed into swapping or out-of-memory conditions. Understanding these limitations is key to making informed decisions about using virtio memory ballooning.

Overview of Virtio Memory Ballooning

According to Wikipedia, memory ballooning is a technique used to eliminate the need to overprovision host memory used by a virtual machine. To implement it, the virtual machine’s kernel implements a “balloon driver” which allocates unused memory within the VM’s address space into a reserved memory pool (the “balloon”) so that it is unavailable to other processes on the VM. However, rather than being reserved for other uses within the VM, the physical memory mapped to those pages within the VM is actually unmapped from the VM by the host operating system’s hypervisor, making it available for other uses by the host machine. Depending on the amount of memory required by the VM, the size of the “balloon” may be increased or decreased dynamically, mapping and unmapping physical memory as required by the VM.

The virtio memory balloon device is defined in the Virtio v1.2 specification, which covers the following:

Feature bits

VIRTIO_BALLOON_F_MUST_TELL_HOST (0): Host must be notified before balloon pages are used.
VIRTIO_BALLOON_F_STATS_VQ (1): A virtqueue is present for reporting guest memory statistics.
VIRTIO_BALLOON_F_DEFLATE_ON_OOM (2): Balloon deflates when guest is out of memory.
VIRTIO_BALLOON_F_FREE_PAGE_HINT (3): The device supports free page hinting. The configuration field free_page_hint_cmd_id is valid.
VIRTIO_BALLOON_F_PAGE_POISON (4): The driver will immediately write poison_val to pages after deflating them. The configuration field poison_val is valid.
VIRTIO_BALLOON_F_PAGE_REPORTING (5): The device supports free page reporting. A virtqueue is present for reporting free guest memory.
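
As a rough illustration of how these bits combine, the sketch below decodes a feature bitmask into the names listed above. The bit positions are taken from the list; the helper name decode_balloon_features and the sample mask are made up for this example.

```python
# Bit positions as listed above (Virtio v1.2, balloon device feature bits).
BALLOON_FEATURES = {
    0: "VIRTIO_BALLOON_F_MUST_TELL_HOST",
    1: "VIRTIO_BALLOON_F_STATS_VQ",
    2: "VIRTIO_BALLOON_F_DEFLATE_ON_OOM",
    3: "VIRTIO_BALLOON_F_FREE_PAGE_HINT",
    4: "VIRTIO_BALLOON_F_PAGE_POISON",
    5: "VIRTIO_BALLOON_F_PAGE_REPORTING",
}


def decode_balloon_features(mask: int) -> list[str]:
    """Return the names of the balloon feature bits set in mask (hypothetical helper)."""
    return [name for bit, name in sorted(BALLOON_FEATURES.items()) if mask & (1 << bit)]


if __name__ == "__main__":
    # Sample mask (an assumption): STATS_VQ (bit 1) and DEFLATE_ON_OOM (bit 2) set.
    for name in decode_balloon_features(0b110):
        print(name)
```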

Memory Statistics Tags

VIRTIO_BALLOON_S_SWAP_IN (0): Amount of memory swapped in (in bytes).
VIRTIO_BALLOON_S_SWAP_OUT (1): Amount of memory swapped out to disk (in bytes).
VIRTIO_BALLOON_S_MAJFLT (2): Number of major page faults that have occurred.
VIRTIO_BALLOON_S_MINFLT (3): Number of minor page faults that have occurred.
VIRTIO_BALLOON_S_MEMFREE (4): Amount of memory not being used (in bytes).
VIRTIO_BALLOON_S_MEMTOT (5): Total amount of memory available (in bytes).
VIRTIO_BALLOON_S_AVAIL (6): Estimate of available memory (in bytes) for starting new applications.
VIRTIO_BALLOON_S_CACHES (7): Amount of memory (in bytes) that can be quickly reclaimed without I/O.
VIRTIO_BALLOON_S_HTLB_PGALLOC (8): Number of successful hugetlb page allocations in the guest.
VIRTIO_BALLOON_S_HTLB_PGFAIL (9): Number of failed hugetlb page allocations in the guest.
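
In the specification, each statistic travels over the stats virtqueue as a packed pair of a 16-bit tag and a 64-bit value, both little-endian. The sketch below decodes such a buffer using the tag numbers listed above. The function name parse_balloon_stats and the sample buffer are illustrative assumptions, not part of any driver.

```python
import struct

# Tag numbers as listed above.
STAT_TAGS = {
    0: "VIRTIO_BALLOON_S_SWAP_IN",
    1: "VIRTIO_BALLOON_S_SWAP_OUT",
    2: "VIRTIO_BALLOON_S_MAJFLT",
    3: "VIRTIO_BALLOON_S_MINFLT",
    4: "VIRTIO_BALLOON_S_MEMFREE",
    5: "VIRTIO_BALLOON_S_MEMTOT",
    6: "VIRTIO_BALLOON_S_AVAIL",
    7: "VIRTIO_BALLOON_S_CACHES",
    8: "VIRTIO_BALLOON_S_HTLB_PGALLOC",
    9: "VIRTIO_BALLOON_S_HTLB_PGFAIL",
}


def parse_balloon_stats(buf: bytes) -> dict[str, int]:
    """Decode packed (le16 tag, le64 val) entries; buf length must be a multiple of 10."""
    stats = {}
    for tag, val in struct.iter_unpack("<HQ", buf):
        # Unknown tags are kept under a placeholder name instead of being dropped.
        stats[STAT_TAGS.get(tag, f"UNKNOWN_TAG_{tag}")] = val
    return stats


if __name__ == "__main__":
    # Made-up sample: 1 GiB free out of 8 GiB total.
    sample = struct.pack("<HQ", 4, 1 << 30) + struct.pack("<HQ", 5, 8 << 30)
    print(parse_balloon_stats(sample))
```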

Free page hinting

Free page hinting is used during migration to determine which pages within the guest are not in use, so that they can be skipped while migrating the guest. The device indicates that it is ready to start hinting by setting free_page_hint_cmd_id to one of the non-reserved values that can be used as a command ID. Two values are reserved and carry a special meaning for the driver:

VIRTIO_BALLOON_CMD_ID_STOP (0): any previously supplied command ID is invalid. The driver should stop hinting free pages until a new command ID is supplied, but should not release any hinted pages for use by the guest.
VIRTIO_BALLOON_CMD_ID_DONE (1): any previously supplied command ID is invalid. The driver should stop hinting free pages and release all hinted pages for use by the guest.

When a hint is provided, it indicates that the data contained in the given page is no longer needed and can be discarded. If the driver writes to the page, this overrides the hint and the data will be retained. Any stale pages that have not been written to since the page was hinted may lose their content. If read, the contents of such pages will be uninitialized memory.
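
The command ID rules above can be summarized as a small decision function. This is only a sketch of the logic described in the text; the HintAction enum and the function name are hypothetical and do not mirror any real driver code.

```python
from enum import Enum

# Reserved command IDs as listed above.
VIRTIO_BALLOON_CMD_ID_STOP = 0
VIRTIO_BALLOON_CMD_ID_DONE = 1


class HintAction(Enum):
    STOP_KEEP_PAGES = "stop hinting, keep hinted pages reserved"
    STOP_RELEASE_PAGES = "stop hinting, release hinted pages to the guest"
    START_HINTING = "hint free pages using this command ID"


def action_for_cmd_id(cmd_id: int) -> HintAction:
    """Map a free_page_hint_cmd_id value to what the driver should do."""
    if cmd_id == VIRTIO_BALLOON_CMD_ID_STOP:
        return HintAction.STOP_KEEP_PAGES
    if cmd_id == VIRTIO_BALLOON_CMD_ID_DONE:
        return HintAction.STOP_RELEASE_PAGES
    # Any non-reserved value is a fresh command ID: start (or restart) hinting.
    return HintAction.START_HINTING


if __name__ == "__main__":
    for cmd_id in (0, 1, 42):
        print(cmd_id, "->", action_for_cmd_id(cmd_id).value)
```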

Page Poison

Page Poison is a feature that lets the host know when the guest is initializing free pages with poison_val. When enabled, the driver immediately writes to pages after deflating and pages reported as free will retain poison_val. If the guest is not initializing freed pages, the driver should reject the VIRTIO_BALLOON_F_PAGE_POISON feature. If the feature has been negotiated, the driver will place the initialization value into the poison_val configuration field data.

Free Page Reporting

Free Page Reporting is a method similar to balloon inflation, but without a deflation queue. Reported free pages can be reused by the driver after the request is acknowledged, without notifying the device.

The driver initiates reporting by gathering free pages into a scatter-gather list, which is then added to the reporting_vq. The exact timing and selection of free pages is determined by the driver.

Once the driver has enough pages available, it sends a reporting request to the device, which acknowledges the request using the reporting_vq descriptor. After acknowledgement, the driver can reuse the reported free pages by returning them to the free page lists in the guest operating system.

The driver can continue to gather and report free pages until it has reached the desired number of pages.
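
The reporting loop can be pictured with a toy model. The sketch below is not driver code: page frame numbers are plain integers, a Python list stands in for the scatter-gather list, and the device_ack callback is a made-up stand-in for the device acknowledging a request on the reporting_vq.

```python
from typing import Callable, List


def report_free_pages(free_pages: List[int],
                      batch_size: int,
                      device_ack: Callable[[List[int]], bool]) -> List[List[int]]:
    """Toy model of free page reporting; returns the batches the device acknowledged."""
    acknowledged = []
    for start in range(0, len(free_pages), batch_size):
        # One batch stands in for one scatter-gather list placed on the reporting_vq.
        batch = free_pages[start:start + batch_size]
        if device_ack(batch):
            # After the acknowledgement the pages simply go back to the guest's
            # free lists; there is no separate deflate step.
            acknowledged.append(batch)
    return acknowledged


if __name__ == "__main__":
    pfns = list(range(100, 110))  # ten made-up page frame numbers
    print(report_free_pages(pfns, 4, lambda batch: True))
```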

Comparison to Other Memory Management Techniques

Virtio memory ballooning is just one of several memory management techniques available in virtualized environments. Here are some other techniques that are commonly used:

Overcommitment

Overcommitment is a technique that allows virtual machines to use more memory than physically available. This is useful when memory demand is highly variable. However, overcommitment can cause performance issues if the host system runs out of memory and needs to swap memory pages to disk.

KVM hypervisor automatically overcommits CPUs and memory. This means that more virtualized CPUs and memory can be allocated to virtual machines than there are physical resources. This saves system resources, resulting in less power, cooling, and investment in server hardware while still allowing under-utilized virtualized servers or desktops to run on fewer hosts.

Memory Compression

Memory compression compresses memory pages to free up memory in high demand situations. However, this technique can lead to performance problems if the compression algorithm is slow or if memory demand is high.

Zram, zcache, and zswap approached in-kernel compression in different ways. Around 2013, zram and zcache both lived in the staging tree and were not considered stable enough for promotion into the core kernel, while zswap was proposed as a simplified, frontswap-only fork of zcache for direct merging into the MM subsystem. Since then, zswap and zram have both been merged into the mainline kernel, and zcache has been removed from the staging tree.

Hypervisor Swapping

Hypervisor swapping is a technique in which the hypervisor itself swaps guest memory pages out to host storage, without the guest’s cooperation. This can be useful when the host is running low on memory and other reclamation techniques are not fast enough. However, the hypervisor cannot tell which guest pages matter most, so it may swap out pages the guest is actively using, and the resulting disk I/O can degrade performance significantly.

Compared to these techniques, virtio memory ballooning has a distinct advantage: the guest operating system itself decides which pages to give up, so it can release free or easily reclaimable memory first. However, it also has trade-offs, such as the potential for performance issues if the guest operating system can’t release memory quickly enough.

How to use Virtio Memory Ballooning on Linux

Environment

On the host side, we use libvirt to set up a VM.

The memory tag means: The maximum allocation of memory for the guest at boot time.

The currentMemory tag means: The actual allocation of memory for the guest.

<maxMemory slots='16' unit='KiB'>1524288</maxMemory>
<memory unit='KiB'>8388608</memory>
<currentMemory unit='KiB'>8388608</currentMemory>

Then add a virtio memballoon device to the VM’s XML:

<memballoon model='virtio'/>
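
To double-check that a domain definition really contains the balloon device, a short script can inspect the XML, for example the output of virsh dumpxml. The sketch below assumes the usual libvirt layout where memballoon sits under the devices element; the function name and the sample XML are made up for illustration.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Minimal made-up domain definition, just enough for the lookup below.
SAMPLE_DOMAIN_XML = """<domain type='kvm'>
  <name>demo</name>
  <devices>
    <memballoon model='virtio'/>
  </devices>
</domain>"""


def balloon_model(domain_xml: str) -> Optional[str]:
    """Return the memballoon model attribute, or None if no balloon device is defined."""
    root = ET.fromstring(domain_xml)
    node = root.find("./devices/memballoon")
    return None if node is None else node.get("model")


if __name__ == "__main__":
    print(balloon_model(SAMPLE_DOMAIN_XML))
```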

To use Virtio Memory Ballooning on Linux guest, you’ll need to ensure that your kernel has support for the virtio_balloon driver. You can check for this by running the following command:

lsmod | grep virtio_balloon

If the virtio_balloon driver is not listed, you may need to load it manually by running the following command:

modprobe virtio_balloon
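
One caveat: lsmod only lists loadable modules, so it prints nothing when the driver is compiled into the kernel. The sketch below checks both places. It assumes the driver registers under /sys/bus/virtio/drivers/virtio_balloon when it is active; the function names are made up for this example.

```python
import os


def module_listed(proc_modules_text: str, name: str = "virtio_balloon") -> bool:
    """True if name appears as a module name in /proc/modules-style text."""
    return any(line.split()[0] == name
               for line in proc_modules_text.splitlines() if line.strip())


def balloon_driver_present(proc_modules: str = "/proc/modules",
                           sysfs_driver: str = "/sys/bus/virtio/drivers/virtio_balloon") -> bool:
    """Check sysfs first (covers built-in drivers), then the loaded module list."""
    if os.path.isdir(sysfs_driver):
        return True
    try:
        with open(proc_modules) as f:
            return module_listed(f.read())
    except OSError:
        # No readable module list (for example, not running on Linux).
        return False


if __name__ == "__main__":
    print("virtio_balloon present:", balloon_driver_present())
```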

We can run a few tests to confirm that the balloon driver is working:

Basic usage

The field explanations below come from the virsh documentation (virsh.rst at master in the libvirt/libvirt repository).

# virsh dommemstat YOUR_VM_NAME
actual 8388608          # Current balloon value (in KiB)
swap_in 7011156         # The amount of data read from swap space (in KiB)
swap_out 664776         # The amount of memory written out to swap space (in KiB)
major_fault 234565      # The number of page faults where disk IO was required
minor_fault 84722778    # The number of other page faults
unused 6291308          # The amount of memory left unused by the system (in KiB)
available 8388044       # The amount of usable memory as seen by the domain (in KiB)
usable 6349618          # The amount of memory which can be reclaimed by balloon without causing host swapping (in KiB)
last_update 1682566755  # Timestamp of the last update of statistics (in seconds)
disk_caches 116620      # The amount of memory that can be reclaimed without additional I/O, typically disk caches (in KiB)
rss 8529188             # Resident Set Size of the running domain's process (in KiB)
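
For scripting, the dommemstat output is easy to turn into a dictionary. The sketch below parses the name and value pairs and ignores trailing comments. The function name parse_dommemstat is made up, and the sample text is a shortened copy of the output above. In practice the text could come from running virsh dommemstat through subprocess.

```python
def parse_dommemstat(text: str) -> dict[str, int]:
    """Parse 'name value' lines from virsh dommemstat output into a dict."""
    stats = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop prompts and trailing comments
        if not line:
            continue
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


SAMPLE = """\
actual 8388608          # Current balloon value (in KiB)
unused 6291308
available 8388044
rss 8529188
"""

if __name__ == "__main__":
    stats = parse_dommemstat(SAMPLE)
    print(stats["actual"] - stats["unused"], "KiB in use by the guest")
```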

With the memory balloon we get detailed guest memory statistics, which match the Memory Statistics Tags mentioned above.

From dominfo we can see the memory allocation directly:

# virsh dominfo YOUR_VM_NAME
Id:             7
Name:           1970b0ef25e44adc834767fe81f155d5
UUID:           1970b0ef-25e4-4adc-8347-67fe81f155d5
OS Type:        hvm
State:          running
CPU(s):         4
CPU time:       214084.1s
Max memory:     8388608 KiB
Used memory:    8388608 KiB
Persistent:     yes
Autostart:      disable
Managed save:   no
Security model: none
Security DOI:   0

Shrinking memory

First, check the unused memory of your guest:

# virsh dommemstat YOUR_VM_NAME | grep unused
unused 2868704

Then we choose the size we want to set. A simple choice is actual minus unused, in KiB:

actual - unused = 8388608 - 2868704 = 5519904
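
The same calculation as a small helper, in case it needs to be scripted. The optional headroom_kib parameter is an assumption added for this sketch: it leaves some free memory in the guest instead of reclaiming all of it, which the simple formula above does not do.

```python
def shrink_target_kib(actual_kib: int, unused_kib: int, headroom_kib: int = 0) -> int:
    """Balloon target in KiB: actual minus unused, plus optional headroom (hypothetical)."""
    if unused_kib > actual_kib:
        raise ValueError("unused cannot exceed actual")
    # Never return a target above the current balloon value; this helper only shrinks.
    return min(actual_kib, actual_kib - unused_kib + headroom_kib)


if __name__ == "__main__":
    # Values from the dommemstat output above.
    print(shrink_target_kib(8388608, 2868704))
    # Same, but keeping 512 MiB of headroom in the guest.
    print(shrink_target_kib(8388608, 2868704, headroom_kib=512 * 1024))
```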

Then we use setmem

# virsh setmem YOUR_VM_NAME --size 5519904KiB --current

Check that the shrink took effect:

# virsh dommemstat YOUR_VM_NAME
actual 5519904
swap_in 0
swap_out 2592
major_fault 6236
minor_fault 181380396
unused 140212
available 5139400
usable 3424496
last_update 1682567978
rss 5583008

actual has changed to 5519904. Now check inside the guest:

# free -hm
              total        used        free      shared  buff/cache   available
Mem:           4.9G        862M        134M        299M        3.9G        3.3G
Swap:          7.9G        3.5M        7.9G

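The guest’s total memory (4.9G) is even smaller than the 5519904 KiB (about 5.26 GiB) we set: about 7% is missing. The total is almost the same as the available value of 5139400 KiB (about 4.90 GiB). The difference of 380504 KiB is most likely memory that the guest kernel reserves at boot and does not count in its total.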

Expanding memory

To increase the memory allocation of a virtual machine using virtio memory ballooning, you can use the virsh setmem command. For example, to increase the memory allocation to 8GB, you would run:

virsh setmem YOUR_VM_NAME --size 8G --current

This will increase the memory allocation of the virtual machine to 8GB. However, it’s important to note that the guest operating system must have support for virtio memory ballooning in order to take advantage of this feature.

In addition, it’s important to monitor the memory usage of virtual machines to ensure that they have enough memory to operate effectively. This can be done using tools like virsh dommemstat to monitor memory usage statistics.

# virsh dommemstat YOUR_VM_NAME
actual 8388608
swap_in 0
swap_out 2592
major_fault 6236
minor_fault 181827159
unused 3008116
available 8008104
usable 6293140
last_update 1682571788
rss 7545844

Inside guest

# free -hm
              total        used        free      shared  buff/cache   available
Mem:           7.6G        862M        2.9G        299M        3.9G        6.0G
Swap:          7.9G        3.5M        7.9G

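With 8 GiB of memory on the QEMU side, the guest sees 7.6G in total, so about 4.5% is still missing. In absolute terms the gap is 8388608 - 8008104 = 380504 KiB, the same as in the shrink test (5519904 - 5139400 = 380504 KiB).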
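
The figures from the two experiments can be cross-checked with a little arithmetic. The sketch below uses the actual and available values from the two dommemstat outputs above; the function name memory_gap is made up for this example.

```python
KIB_PER_GIB = 1024 * 1024


def memory_gap(actual_kib: int, available_kib: int) -> tuple[int, float]:
    """Return (gap in KiB, gap as a fraction of actual)."""
    gap = actual_kib - available_kib
    return gap, gap / actual_kib


# (actual, available) pairs taken from the dommemstat outputs above.
SAMPLES = {
    "after shrinking": (5519904, 5139400),
    "after expanding": (8388608, 8008104),
}

if __name__ == "__main__":
    for label, (actual, available) in SAMPLES.items():
        gap, fraction = memory_gap(actual, available)
        print(f"{label}: guest total {available / KIB_PER_GIB:.2f} GiB, "
              f"gap {gap} KiB ({fraction:.1%})")
```

In both cases the gap is the same number of KiB, so the missing share looks larger when the balloon target is smaller.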

Industry Practices

Proxmox

The Proxmox documentation on dynamic memory management shows that KSM and memory ballooning work for both Windows and Linux guests. You configure a minimum and a maximum memory size, and the guest’s memory is adjusted dynamically within that range to implement memory ballooning.

Google Cloud

Dynamic resource management: memory ballooning is an interface mechanism between host and guest to dynamically adjust the size of the reserved memory for the guest. A virtio memory balloon device is used to implement memory ballooning. Through the virtio memory balloon device, a host can explicitly ask a guest to yield a certain amount of free memory pages (also called memory balloon inflation), and reclaim the memory so that the host can use the free memory for other VMs. Likewise, the virtio memory balloon device can return memory pages back to the guest by deflating the memory balloon.

Compute Engine E2 VM instances that are based on a public image have a virtio memory balloon device, which monitors the guest operating system’s memory use. The guest operating system communicates its available memory to the host system. The host reallocates any unused memory to other processes on demand, thereby using memory more effectively. Compute Engine collects and uses this data to make more accurate rightsizing recommendations.

In Linux kernels before 5.2, the Linux memory system sometimes mistakenly prevents large allocations when the balloon device is present. This is rarely an issue in practice, but we recommend changing the virtual memory overcommit_memory setting to 1 to prevent the issue from occurring. This change is already made by default in all Google-provided images published since February 9, 2021.

To fix the setting, use the following command to change the value from 0 to 1:

sudo /sbin/sysctl -w vm.overcommit_memory=1

To persist this change across reboots, add the following to your /etc/sysctl.conf file:

vm.overcommit_memory=1
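
To verify the current setting from a script, the value can be read back from procfs. This is a minimal sketch that assumes a Linux guest where /proc/sys/vm/overcommit_memory is readable; the function name and the short mode descriptions are written for this example.

```python
from pathlib import Path

# Short descriptions of the three kernel overcommit modes.
OVERCOMMIT_MODES = {
    0: "heuristic overcommit (kernel default)",
    1: "always overcommit",
    2: "strict accounting, no overcommit",
}


def read_overcommit_mode(path: str = "/proc/sys/vm/overcommit_memory") -> int:
    """Read the vm.overcommit_memory value from the given procfs path."""
    return int(Path(path).read_text().strip())


if __name__ == "__main__":
    proc_path = "/proc/sys/vm/overcommit_memory"
    if Path(proc_path).exists():
        mode = read_overcommit_mode(proc_path)
        print(mode, "-", OVERCOMMIT_MODES.get(mode, "unknown"))
    else:
        print("overcommit_memory not available on this system")
```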

Nutanix

Squeeze even more memory of your HCI

Memory overcommit allows more memory to be assigned to VMs than is physically present in the server hardware. Unused memory allocated to a VM can be reclaimed by the hypervisor and made available to other VMs on the host. AHV adjusts memory usage for each VM according to its usage, allowing the host to use excess memory to satisfy the requirements of other VMs. This reduces hardware costs for large deployments or increases the utilization of an existing environment that can’t be immediately expanded with new nodes. VMs without memory overcommit will operate with their pre-assigned memory, and can coexist with overcommit enabled VMs. Nutanix uses a multi-tier approach combining ballooning and hypervisor-level swap to optimize performance. Metrics are presented to the administrator in Prism Central to indicate the gains achieved through overcommit and its impact on VM performance. Memory overcommit may not be appropriate for performance-sensitive workloads due to its dynamic nature.

Limits of Memory Overcommit

Memory overcommit has the following limitations:

You can enable or disable Memory Overcommit only while the VM is powered off.
Power off the VM enabled with memory overcommit before you change the memory allocation for the VM.
For example, you cannot update the memory of a VM that is enabled with memory overcommit when it is still running. The system displays the following alert: InvalidVmState: Cannot complete request in state on.
Memory overcommit is not supported with VMs that use GPU passthrough and vNUMA.
For example, you cannot update a VM to a vNUMA VM when it is enabled with memory overcommit. The system displays the following alert: InvalidArgument: Cannot use memory overcommit feature for a vNUMA VM error.
The memory overcommit feature can reduce both the performance of the VM and the predictability of that performance.
For example, migrating a VM enabled with Memory Overcommit takes longer than migrating a VM not enabled with Memory Overcommit.
There may be a temporary spike in the aggregate memory usage in the cluster during the migration of a VM enabled with Memory Overcommit from one node to another.
For example, when you migrate a VM from Node A to Node B, the total memory used in the cluster during migration is greater than the memory usage before the migration.
The memory usage of the cluster eventually drops back to pre-migration levels when the cluster reclaims the memory for other VM operations.
Using Memory Overcommit heavily can cause a spike in the disk space utilization in the cluster. This spike is caused because the Host Swap uses some of the disk space in the cluster.
If the VMs do not have a swap disk, then in case of memory pressure, AHV uses space from the swap disk created on ADSF to provide memory to the VM. This can lead to an increase in disk space consumption on the cluster.
All DR operations except Cross Cluster Live Migration (CCLM) are supported
On the destination side, if a VM fails when you enable Memory Overcommit, the failed VM fails over (creating the VM on the remote site) as a fixed size VM. You can enable Memory Overcommit on this VM after the failover is complete.

Limitations and Challenges

The guest must support virtio memory ballooning. If the balloon driver is not available in the guest, there is no effective way to balloon its memory.

Distribution	No Balloon Driver	Partially Supported	Fully Supported
CentOS	6.1, 6.2	6.3–6.9, 7.1, 7.2	7.3–7.7, 8.0–8.2
Oracle	7.3	7.4, 7.5	7.6, 7.7
Ubuntu	See note.	12.04	14.04 and newer

Memory ballooning does not suit every workload. If a guest’s memory usage changes rapidly, frequent inflation and deflation of the balloon can hurt performance.

Future Development

The auto-ballooning project (https://www.linux-kvm.org/page/Projects/auto-ballooning) was initiated in 2013. It requires changes to both the hypervisor and the Linux kernel, and it has not been upstreamed yet.

Real-World Implementation Case Study

Lessons Learned Building a Production Memory-Overcommit Solution

Conclusion

Virtualization is important in modern computing for flexible and efficient resource allocation. Memory management is challenging in virtualized environments when multiple virtual machines run on a single physical server. Virtio memory ballooning optimizes memory usage by dynamically adjusting guest memory reservation. It improves performance and reduces the risk of memory exhaustion. This article explains how to use virtio memory ballooning on Linux, compares it to other memory management techniques, and discusses industry practices, limitations, and future developments.
