We’ve been tracking a weird state with SQL Server virtual machines on VMware and possible warnings on database corruption while VM backups are running, largely centered around (but not isolated to) the tempdb database.
TLDR: We’ve now got a VMware KB article on this situation that you and your VM admins should read if you hit the condition and fall into the specifics listed below. Reference VMware KB 88201 for more details.
Here’s the specifics.
VMware vSphere 7.0 Update 2 introduced something called a bloom filter to boost storage performance. When a snapshot is present on a virtual machine and the VM is relatively active, we’ve very occasionally seen 823/824 alerts, largely centered on but not limited to the tempdb database. It’s largely during the time when a VM is getting backed up. It’s very rare, but it does happen and we’ve seen our fair share of these over the last two months.
DBAs across the world have been pinging me on this for a few months now as their VM admins update to the latest VMware patches.
VMware has a short-term fix for this to disable the bloom filter, but it must be re-disabled upon ESXi host reboots at this point. They’re working on an improved resolution path on this issue.
If you encounter this, please let me know. We’re looking for more sites that can repeatedly reproduce this state, as it’s quite rare. I’ll keep you all in the loop as new updates are released.
UPDATE: VMware just released 7.0 Update 3f, which officially disables the bloom filter. I’m not sure if VMware intends to re-enable this in the future, but the root cause of the data integrity issue is still being investigated. Apply this update as soon as you can if you are experiencing the issue detailed above!