Living a SharePoint life

Thursday, November 20, 2014

Remove the performance brakes when using VMWare 5.0 and Windows Cluster services


Many SharePoint farms probably still run on a Microsoft SQL Server cluster, even now that newer options such as AlwaysOn in SQL Server 2012 are available. This post may be of interest to you even if you do not use SharePoint.

About a year ago, a customer of mine had very poor performance on his SQL Server. It was SQL Server 2008 R2 running on the Windows Cluster Service, hosted on a VMWare ESX 5.0 private cloud. Nothing unusual, and of course the first suspicion fell on the VMWare servers. But let’s take a look at the performance tests we ran on the cluster.


VMWare without correct settings

The following diagrams are the result of the performance tests we ran on a VMWare ESX cluster connected to NetApp storage. The tests were performed with block sizes of 8, 16, 32, 64, 128 and 256 kilobytes.

[Charts, one per metric: Random Read IOPS, Random Write IOPS, Sequential Read IOPS, Sequential Write IOPS, Random Read MB/s, Random Write MB/s, Sequential Read MB/s, Sequential Write MB/s]
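
The IOPS and MB/s charts are two views of the same measurement: throughput is simply IOPS multiplied by the block size. The following is a minimal sketch of that arithmetic, assuming 1 MB = 1024 KB (the benchmark tool may use a different convention). The IOPS value is made up purely for illustration and is not taken from the tests.

```python
# Block sizes used in the tests, in kilobytes (taken from the article).
BLOCK_SIZES_KB = [8, 16, 32, 64, 128, 256]


def throughput_mb_per_s(iops: float, block_size_kb: int) -> float:
    """Convert IOPS at a given block size into MB/s (assuming 1 MB = 1024 KB)."""
    return iops * block_size_kb / 1024


# Hypothetical IOPS figure, chosen only to show the arithmetic.
example_iops = 2000

for size in BLOCK_SIZES_KB:
    mb_per_s = throughput_mb_per_s(example_iops, size)
    print(f"{size:>3} KB blocks: {mb_per_s:7.1f} MB/s")
```

This is why a storage path problem can look quite different in the two sets of charts: small blocks stress the number of operations, large blocks stress the bandwidth.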

VMWare with correct settings

As soon as the correct settings for VMWare are selected, performance improves immediately. In some areas the improvement is a factor of 10.

[Charts, one per metric: Random Read IOPS, Random Write IOPS, Sequential Read IOPS, Sequential Write IOPS, Random Read MB/s, Random Write MB/s, Sequential Read MB/s, Sequential Write MB/s]


So when do these problems occur?

As we found out by digging through the systems, the ESX 5.0 cluster was using redundant I/O paths to the storage, and ESX by default is configured to use round-robin across the multiple paths. However, the Round Robin Path Selection Policy (PSP) is not supported for LUNs mapped by raw device mappings (RDMs) used with shared storage clustering in vSphere 5.1 and earlier.

So what is the solution? You need to change the Path Selection Policy so that only one path is used instead of rotating across all of them. Full details on changing the settings can be found here:

http://kb.vmware.com/selfservice/search.do?cmd=displayKC&docType=kc&docTypeID=DT_KB_1_1&externalId=1036189
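
To give an idea of what the change looks like on the command line, here is a small sketch that reads a device listing and prints the `esxcli` commands that would switch round-robin devices to another policy. It rests on several assumptions: the sample text only imitates the layout of `esxcli storage nmp device list`, the device IDs are invented, and `VMW_PSP_FIXED` is just one possible single-path policy (check the storage vendor's recommendation before picking one). The script only prints commands, it does not run them, and it cannot tell which LUNs are the RDMs used by the cluster, so that selection still has to be done by hand.

```python
# Illustrative text in the layout of `esxcli storage nmp device list`.
# The device IDs below are made up for this example.
SAMPLE_LISTING = """\
naa.60a98000aaaaaaaaaaaaaaaaaaaaaaaa
   Device Display Name: NETAPP Fibre Channel Disk (example 1)
   Storage Array Type: VMW_SATP_ALUA
   Path Selection Policy: VMW_PSP_RR
   Path Selection Policy Device Config: (example)
naa.60a98000bbbbbbbbbbbbbbbbbbbbbbbb
   Device Display Name: NETAPP Fibre Channel Disk (example 2)
   Storage Array Type: VMW_SATP_ALUA
   Path Selection Policy: VMW_PSP_FIXED
   Path Selection Policy Device Config: (example)
"""


def devices_using(psp, listing):
    """Return the device IDs whose 'Path Selection Policy' equals psp."""
    devices = []
    current = None
    for line in listing.splitlines():
        if line and not line[0].isspace():
            # An unindented line starts a new device section.
            current = line.strip()
        elif current and line.strip().startswith("Path Selection Policy:"):
            value = line.split(":", 1)[1].strip()
            if value == psp:
                devices.append(current)
    return devices


def set_psp_commands(devices, target_psp="VMW_PSP_FIXED"):
    """Build one esxcli command per device; target_psp is an assumed choice."""
    return [
        f"esxcli storage nmp device set --device {device} --psp {target_psp}"
        for device in devices
    ]


for command in set_psp_commands(devices_using("VMW_PSP_RR", SAMPLE_LISTING)):
    print(command)
```

Treat the VMware article linked above as the authoritative procedure; this sketch is only meant to make the idea concrete.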

If you are interested in how we measured the performance of the server, I wrote an article about this a while ago (in German):

http://blog.greenbrain.de/2013/03/ermitteln-der-io-leistungsfahigkeit.html
