Many SharePoint farms probably still run on a Microsoft SQL Server cluster, even now that newer options such as AlwaysOn in SQL Server 2012 are available. But this may be of interest to you even if you don't use SharePoint.
About a year ago, a customer of mine had very poor performance on his SQL Server. It was SQL Server 2008 R2 using the Windows Cluster Service, hosted on a VMware ESX 5.0 private cloud. Nothing unusual, and of course the first suspicion fell on the VMware servers. But let’s take a look at the performance tests we ran on the cluster.
VMware without correct settings
The following diagrams show the results of the performance tests we ran on a VMware ESX cluster connected to a NetApp storage system. The tests were performed with block sizes of 8, 16, 32, 64, 128 and 256 kilobytes.
[Diagrams: Random Read IOPS | Random Write IOPS | Sequential Read IOPS | Sequential Write IOPS | Random Read MB/s | Random Write MB/s | Sequential Read MB/s | Sequential Write MB/s]
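The IOPS and MB/s diagrams are two views of the same measurement: throughput is simply IOPS multiplied by the block size. A minimal Python sketch of that relationship for the block sizes used in the test follows. The IOPS figure in it is a hypothetical placeholder, not a value from our measurements, and the function names are made up for illustration.

```python
# Block sizes used in the test, in kilobytes.
BLOCK_SIZES_KB = [8, 16, 32, 64, 128, 256]


def throughput_mb_per_s(iops: float, block_size_kb: int) -> float:
    """Convert IOPS at a given block size to MB/s (1 MB = 1024 KB)."""
    return iops * block_size_kb / 1024.0


def speedup(before: float, after: float) -> float:
    """Return the improvement factor between two measurements."""
    return after / before


if __name__ == "__main__":
    hypothetical_iops = 1000.0  # placeholder, NOT a measured value
    for size in BLOCK_SIZES_KB:
        mb_s = throughput_mb_per_s(hypothetical_iops, size)
        print(f"{size:>3} KB blocks at {hypothetical_iops:.0f} IOPS = {mb_s:.1f} MB/s")
```

This is also why the IOPS and MB/s curves look so different: the same IOPS value means far more throughput at 256 KB blocks than at 8 KB blocks.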
VMware with correct settings
As soon as the correct VMware settings are applied, performance improves immediately. In some areas it increases by a factor of 10.
[Diagrams: Random Read IOPS | Random Write IOPS | Sequential Read IOPS | Sequential Write IOPS | Random Read MB/s | Random Write MB/s | Sequential Read MB/s | Sequential Write MB/s]
So when do these problems occur?
As we found out by digging through the systems, the ESX 5.0 cluster was using redundant I/O paths to the storage. However, ESX by default is configured to use round-robin across the multiple paths. According to VMware, the Round Robin Path Selection Policy (PSP) is not supported for LUNs mapped by RDMs used with shared storage clustering in vSphere 5.1 and earlier.
So what is the solution? You need to change the Path Selection Policy so that only one path is used instead of round-robining across all of them. Full details on how to change the settings can be found here:
http://kb.vmware.com/selfservice/search.do?cmd=displayKC&docType=kc&docTypeID=DT_KB_1_1&externalId=1036189
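To give an idea of what the change looks like in practice, here is a rough Python sketch. It parses the text output of `esxcli storage nmp device list` and prints an `esxcli storage nmp device set` command for every listed device that still uses `VMW_PSP_RR`. Several things here are assumptions on my part: the sample output is shortened and its device IDs are invented, the helper names are made up, and `VMW_PSP_FIXED` is only one possible single-path policy. Please check the KB article above for the policy that fits your storage array, and only change the RDM LUNs used by the cluster.

```python
from typing import Dict, List

# Shortened, illustrative output of `esxcli storage nmp device list`.
# The device IDs are hypothetical placeholders.
SAMPLE_OUTPUT = """\
naa.0000000000000001
   Device Display Name: NETAPP Fibre Channel Disk (naa.0000000000000001)
   Storage Array Type: VMW_SATP_ALUA
   Path Selection Policy: VMW_PSP_RR
   Path Selection Policy Device Config: policy=rr
naa.0000000000000002
   Device Display Name: NETAPP Fibre Channel Disk (naa.0000000000000002)
   Storage Array Type: VMW_SATP_ALUA
   Path Selection Policy: VMW_PSP_FIXED
"""


def parse_policies(output: str) -> Dict[str, str]:
    """Map each device ID to its current Path Selection Policy."""
    policies: Dict[str, str] = {}
    current = None
    for line in output.splitlines():
        if not line.strip():
            continue
        if not line[0].isspace():
            # Unindented lines start a new device section.
            current = line.strip()
        elif current and line.strip().startswith("Path Selection Policy:"):
            policies[current] = line.split(":", 1)[1].strip()
    return policies


def fixed_path_commands(policies: Dict[str, str], devices: List[str]) -> List[str]:
    """Build esxcli commands for the given devices that still use round-robin."""
    return [
        f"esxcli storage nmp device set --device {dev} --psp VMW_PSP_FIXED"
        for dev in devices
        if policies.get(dev) == "VMW_PSP_RR"
    ]


if __name__ == "__main__":
    found = parse_policies(SAMPLE_OUTPUT)
    for cmd in fixed_path_commands(found, list(found)):
        print(cmd)
```

The script only prints the commands and does not run anything on the host, so you can review the list before changing a production cluster.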
If you are interested in how we measured the server's performance, I wrote an article about this a while ago (in German):
http://blog.greenbrain.de/2013/03/ermitteln-der-io-leistungsfahigkeit.html