Sep 07, 2012
 

Your DBAs need direct access to VMware vCenter. Shocking, eh? At a minimum, they need read-only access to the performance statistics for the virtual machines hosting their database servers, as well as for the hosts that those virtual machines reside upon.

I consider it a requirement for all DBAs to have access to vCenter performance statistics in environments where I have any sort of say in the matter, and here’s why.

Keep in mind that while I reference SQL Server and VMware specifically, these topics apply to ALL business-critical databases and ALL hypervisors.

Fear, Uncertainty, and Doubt

Database Administrators are the sort of people who demand objective numbers about the performance of their systems (myself definitely included). We are numbers and data people, after all. However, adding the virtualization layer, and keeping that layer in a black box away from the DBAs, prevents them from collecting performance metrics from that layer.

VMware and System Administrators of the world – listen up. I have a simple analogy. What if you were a race car driver with no dashboard? You have no way to tell if your car is running hot or oil pressure is getting low. You begin to lose trust in the car at the first sign of a problem, and have a hard time trusting it again, even if it is running optimally.

Now put yourselves in the shoes of the DBA. Not having access to these performance statistics creates uncertainty and questions in the event of a performance issue. Worse yet, it creates doubt, and I frequently see DBAs blame the virtualized platform for performance issues, even though most of the time the issue lies elsewhere. It’s a natural response – the statistics that you have access to all look within standards, so you shift and point the finger at the next layer that you cannot see. It has to be in there, right?

You do NOT need your DBAs doubting virtualization. Granting access to the objective numbers of the overall performance and health on that layer will not only keep the DBAs happy that they have access, but will keep your workload down by letting them answer their own questions on the health of the environment.

DBAs out there – do not doubt virtualization. Overcommitted hosts CAN and sometimes DO induce performance penalties on their virtual machines, but these are due to inappropriate uses of virtualization and not the virtualization layer itself. The performance statistics that you gain through the VMware vCenter interface help you see these overcommitment levels. When a host is overloaded, you can see it and react appropriately. When a host is running normally, these same statistics can show you the negligible overhead that the virtualization layer induces on your system’s performance.

Troubleshoot Performance Issues

These same stats give you a more complete picture of what is happening around your mission-critical systems. Guess what? In the event of reported performance issues, DBAs now have the ability to quickly look at disk, CPU, memory, and network performance statistics. It can help them start the triage process for issues within their systems. Actually, I had one issue this week where a customer’s SQL Server was hammering a tempdb database. Perfmon and the system activity monitor were reporting 50MB/s throughput to the disk that contained only tempdb, but tempdb held just 90MB of data. It took some time to isolate the problem as tempdb contention. Had I had vCenter access in this environment, a very cursory evaluation would have shown abnormally high disk access on this volume, and the troubleshooting could have been steered in this direction much faster.

The same train of thought goes for other items such as CPU overcommitment (in the form of dangerously high CPU Ready times), memory overcommitment (in the form of memory ballooning in the guests), and storage or networking bottlenecks. All of these items are very easy to spot, and troubleshooting can find the first appropriate path much more quickly when these statistics are easily accessible.
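To make "dangerously high CPU Ready" a little more concrete, here is a minimal sketch of how the raw number can be read. It assumes that vCenter reports CPU Ready as a summation in milliseconds per sample interval (20 seconds on the real-time charts), and that the VM-level value is added up across all of the VM's vCPUs. The function name and the example numbers are my own illustration, not anything vCenter provides, and what counts as "too high" is something to agree on with your VMware administrators.

```python
def cpu_ready_percent(ready_ms: float, interval_s: float = 20.0, vcpus: int = 1) -> float:
    """Convert a CPU Ready summation (ms) into a percentage of the sample interval.

    Assumptions (not from vCenter itself, just this sketch):
      - ready_ms is the summation value for one sample interval
      - interval_s is that interval's length (20 s for real-time charts)
      - dividing by vcpus gives an average per-vCPU figure for a VM-level value
    """
    if interval_s <= 0 or vcpus <= 0:
        raise ValueError("interval_s and vcpus must be positive")
    return ready_ms * 100.0 / (interval_s * 1000.0 * vcpus)


# Hypothetical reading: 2,000 ms of ready time in a 20 s sample on a 4-vCPU VM.
pct = cpu_ready_percent(2000, interval_s=20, vcpus=4)
print(f"CPU Ready: {pct:.1f}% per vCPU")
```

The point is only that the raw millisecond figure means little until it is put next to the length of the interval it was collected over.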

In the real world, most of the time an examination like this occurs, the virtualization layer is ruled out almost immediately. Having this level of access allows the DBAs to rule out the virtualization layer without escalating a problem up to the VMware administrators and causing them to drop what they are working on to investigate the issue. It saves everyone time, headaches, and unnecessary stress.

Create Better Infrastructure Baselines

Everyone baselines all major aspects of their environment, right?

Riiiiiight…

Rarely do I find organizations that have full predictive baselines of their environment. For those that do, granting access to vCenter statistics for the DBAs will enhance their ability to build a more complete baseline of their system.

  • It will help better understand the capacity of the hosts their important servers operate on. They can watch CPU usage and Ready times. These statistics help them to determine when the consolidation ratio of VMs on the hosts in the cluster gets to the point where performance of the SQL Servers will begin to deteriorate.
  • It can also help them determine the amount of resources free in case a VM must grow to handle increased workloads, as well as better predict when it is time to add more resources to their VMs.
  • They can begin to correlate the impact of external processes running on the same VMware host against baselined performance of their systems. For example, what happens to a SQL Server’s performance when a second VM on the same host runs an antivirus scan in the middle of the night while a third VM is taking a full system backup?
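As a rough illustration of the baselining idea in the list above, here is a minimal sketch that summarizes a set of historical samples and flags a new reading that strays too far from them. The sample values, the function names, and the three-standard-deviation cutoff are all assumptions made for the example; a real baseline would account for time of day, day of week, and far more history.

```python
from statistics import mean, stdev


def build_baseline(samples):
    """Summarize historical samples as (average, standard deviation)."""
    if len(samples) < 2:
        raise ValueError("need at least two samples to build a baseline")
    return mean(samples), stdev(samples)


def is_outside_baseline(value, baseline, sigmas=3.0):
    """Return True when a reading sits more than `sigmas` deviations from the average."""
    avg, sd = baseline
    return abs(value - avg) > sigmas * sd


# Hypothetical CPU Ready percentages collected on quiet nights.
history = [2.0, 2.5, 3.0, 2.5, 2.0]
baseline = build_baseline(history)

# Hypothetical reading taken while a neighboring VM runs its antivirus scan.
print(is_outside_baseline(9.0, baseline))
```

The same comparison works for any statistic pulled from vCenter or from inside the guest, which is what makes it possible to line up the two views of the same event.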

To me, baselines are so critically important in a large environment that I stress baselining everything possible. I even have a full section in one of my SQL Saturday presentations to help share my urgency!

Pre-Reqs For Granting Access

Now, I am not an advocate of simply handing the keys to the castle to the DBAs. They probably do not need more than read-only access into vCenter. Understand what access the DBAs actually need and grant it to them and nothing more.

The broader issue is education. The responsibility to educate the DBAs lies with the business. Infrastructure topics are generally vastly different from what DBAs usually work with on a day-to-day basis. The DBA should have a general education in a number of different concepts that VMware administrators routinely handle, including:

  • Virtualization myths and misconceptions
  • Host utilization – CPU, memory, disk
  • CPU Ready Time and its interpretation
  • Datastores and multipathing
  • Memory ballooning
  • Resource Pools
  • Virtual Networking
  • VM to VM isolation

I believe that all DBAs should have access to VMware-level statistics, but only after suitable training. Hopefully you do too!

Email me if you have any questions or want to argue with me! I’d love to hear comments on this topic.