AHV Basics – Part 3: VM High Availability (HA)



HA is natively built in to AHV to ensure the availability of guest VMs in the event of a node outage.

There are two options for VM HA: Default and Guaranteed.

With Default, no configuration is needed; it is enabled automatically out of the box. When an AHV node becomes unavailable, the VMs that were running on the failed node restart on the remaining hosts in the cluster, as long as there are sufficient resources. If resources are not available, not all VMs will restart.
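To make that best-effort behaviour concrete, here is a minimal Python sketch. The Host and VM structures and the memory figures are hypothetical, and this is only an illustration of the placement idea, not Nutanix's actual scheduler.

```python
from dataclasses import dataclass

@dataclass
class Host:
    name: str
    free_mem_gib: int

@dataclass
class VM:
    name: str
    mem_gib: int

def best_effort_restart(failed_vms, healthy_hosts):
    """Restart as many VMs as possible on the remaining hosts (Default HA).

    VMs that cannot fit anywhere are left powered off, mirroring the
    'not all VMs will restart' behaviour described above.
    """
    restarted, skipped = [], []
    for vm in sorted(failed_vms, key=lambda v: v.mem_gib, reverse=True):
        # Pick the healthy host with the most free memory that can take this VM.
        candidates = [h for h in healthy_hosts if h.free_mem_gib >= vm.mem_gib]
        if not candidates:
            skipped.append(vm.name)
            continue
        target = max(candidates, key=lambda h: h.free_mem_gib)
        target.free_mem_gib -= vm.mem_gib
        restarted.append((vm.name, target.name))
    return restarted, skipped

# Example: two healthy hosts remain after a node failure.
hosts = [Host("node-b", 48), Host("node-c", 32)]
vms = [VM("vm1", 24), VM("vm2", 32), VM("vm3", 40)]
print(best_effort_restart(vms, hosts))  # vm1 cannot fit and stays powered off
```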

Guaranteed needs to be enabled from Prism by selecting the options cog, then Manage VM High Availability.

[Screenshot: Manage VM High Availability option in the Prism settings menu]

This configuration option reserves capacity across the nodes in a cluster to guarantee that all VMs running on a node can restart in the event of a node failure.

The Acropolis Master is responsible for keeping track of node health by monitoring its connections to libvirt on all hosts within the cluster. The Acropolis Master is also responsible for restarting VMs on healthy hosts during an HA event. The following diagram, courtesy of NutanixBible.com, illustrates HA host monitoring.

[Diagram: AHV HA host monitoring, via NutanixBible.com]

As you would expect, if the Acropolis Master is impacted, a new Acropolis Master will be elected on the remaining nodes in the cluster.
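As a rough illustration of that monitoring loop, the Python sketch below polls each host through a health-check callback (for example, a libvirt connection test) and invokes a failover callback for any host that stops responding, which could in turn call a restart routine like the one sketched earlier. The callback names and the polling interval are assumptions for illustration; the real Acropolis health checks and master election are more involved.

```python
import time

def monitor_hosts(hosts, check_alive, on_host_down, interval_s=5):
    """Very simplified host-health loop, in the spirit of the diagram above.

    hosts        -- iterable of host names
    check_alive  -- callable(host) -> bool, e.g. a libvirt connection test
    on_host_down -- callable(host), e.g. fail that host's VMs over elsewhere
    """
    healthy = set(hosts)
    while True:
        for host in list(healthy):
            if not check_alive(host):
                # Host is unreachable: treat this as an HA event.
                healthy.discard(host)
                on_host_down(host)
        time.sleep(interval_s)
```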

Reservations

After reading the above, you are probably wondering how AHV calculates the amount of resources it needs to reserve when Guaranteed is enabled. The amount reserved depends on the Replication Factor that is set: one node's worth of resources is reserved if all containers are set to RF2, and two nodes' worth of resources are reserved if ANY container is set to RF3. If a cluster has nodes with different memory capacities, AHV automatically uses the node with the largest capacity when making its calculations.
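A minimal sketch of that rule, assuming memory is the only resource considered and using hypothetical per-node memory figures:

```python
def reserved_memory_gib(node_mem_gib, any_container_rf3):
    """Estimate the memory reserved for Guaranteed HA.

    Reserve one node's worth of capacity for RF2, two nodes' worth if any
    container is RF3, always sized on the largest node in the cluster.
    """
    nodes_to_reserve = 2 if any_container_rf3 else 1
    return max(node_mem_gib) * nodes_to_reserve

# Example: a mixed cluster of 256 GiB and 512 GiB nodes.
print(reserved_memory_gib([256, 256, 512, 512], any_container_rf3=False))  # 512
print(reserved_memory_gib([256, 256, 512, 512], any_container_rf3=True))   # 1024
```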

Nutanix keeps simplicity at the forefront, and VM HA within AHV is no different.