Mar 06 2017
 

VMware’s VMTools package has some options for synchronizing the in-guest clock with the time of the ESXi host. By default, these are disabled, and under most circumstances the hypervisor honors that setting and everything is fine. However, in several situations the hypervisor will reset the in-guest time even though the VMTools setting explicitly disallows it:

  • A virtual machine is resumed from a suspended state
  • A vMotion occurs
  • A snapshot is taken
  • A snapshot is restored from
  • A virtual disk is shrunk
  • The VMware Tools service is restarted, which includes a guest reboot

Of these tasks, vMotions for load balancing and snapshots for VM-level backups are very common routine operations for VMs.

These time sync actions can move a guest’s time backwards as well as forwards. More details about this conflict of settings are found in VMware KB1189. If the host time is out of sync, such as when a BIOS battery fails, the guest’s time is reset to that incorrect value. This is extremely detrimental to SQL Server high availability features, such as Availability Groups and Failover Cluster Instances, which depend on the in-guest time closely aligning with the Active Directory synchronized time. The behavior must be explicitly disabled to ensure that these maintenance operations do not trigger an unexpected failover of the SQL Server HA solution.

To disable it, first shut down the virtual machine. Then, edit the virtual machine’s advanced configuration parameters under VM Options, and add the following keys and values.

Key Name                          Value
tools.syncTime                    0
time.synchronize.continue         0
time.synchronize.restore          0
time.synchronize.resume.disk      0
time.synchronize.shrink           0
time.synchronize.tools.startup    0
time.synchronize.tools.enable     0
time.synchronize.resume.host      0

Save the configuration, then power the VM back on.
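To double-check the result, the short Python sketch below audits the text of a VM configuration for the eight keys listed above. It is only an illustration: it assumes the settings are stored one per line in the form key = "value" (as in a .vmx file), and the function names and the sample text are hypothetical, not part of any VMware tooling.

```python
import re

# Keys and values from the table above; "0" disables each sync trigger.
REQUIRED = {
    "tools.syncTime": "0",
    "time.synchronize.continue": "0",
    "time.synchronize.restore": "0",
    "time.synchronize.resume.disk": "0",
    "time.synchronize.shrink": "0",
    "time.synchronize.tools.startup": "0",
    "time.synchronize.tools.enable": "0",
    "time.synchronize.resume.host": "0",
}

# Assumed line shape: key = "value"
LINE = re.compile(r'^\s*([^#\s=]+)\s*=\s*"([^"]*)"\s*$')


def parse_settings(text):
    """Collect key/value pairs from lines that match the assumed shape."""
    settings = {}
    for line in text.splitlines():
        match = LINE.match(line)
        if match:
            settings[match.group(1)] = match.group(2)
    return settings


def audit(text):
    """Return {key: current value or None} for each key that is missing or not "0"."""
    found = parse_settings(text)
    return {
        key: found.get(key)
        for key, wanted in REQUIRED.items()
        if found.get(key) != wanted
    }


# Hypothetical configuration text with one correct key and one wrong key.
sample = '''
displayName = "sql01"
tools.syncTime = "0"
time.synchronize.continue = "1"
'''

problems = audit(sample)
for key, value in problems.items():
    state = "missing" if value is None else "set to " + value
    print(f"{key}: {state}")
```

An empty result from audit() means all eight keys are present and set to 0.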

Apr 23 2015
 

If you experience a WSFC failover during a VMware vMotion or Hyper-V Live Migration of virtualized SQL Servers that use Availability Groups, your server or networking hardware might be taking a bit longer than usual to handle the network port failover. This can lead to short, unexpected outages in your application during the failover. The added latency in the infrastructure can exceed the very short thresholds of the cluster monitoring agents, which then trigger the failover.

In this case, you need to handle the cluster monitoring network settings in the same manner that Microsoft recommends for these types of VMs on Azure – relax the monitoring thresholds by a small amount. To learn more about how to adjust these values, read more here. Just remember to adjust these settings very gradually!
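As a rough way to reason about a small adjustment, the sketch below assumes the settings in question are the WSFC heartbeat delay (milliseconds between heartbeats) and threshold (missed heartbeats tolerated), such as the SameSubnetDelay and SameSubnetThreshold cluster properties. The numbers are illustrative only, not recommendations.

```python
def tolerance_ms(delay_ms, threshold):
    """Approximate time without heartbeats before a node is considered down."""
    return delay_ms * threshold


# Illustrative values only (assumption): a 1000 ms heartbeat interval,
# with the missed-heartbeat threshold raised from 5 to 10.
current = tolerance_ms(1000, 5)
relaxed = tolerance_ms(1000, 10)

print(f"current window: {current} ms, relaxed window: {relaxed} ms")
```

Comparing the two windows against the length of the network pause seen during a migration helps keep each change small.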