Over the past six months I have increasingly become an evangelist for Performance Testing. It is an area I was always aware of but never got heavily involved in. Recently, though, I've seen it become an increasingly important part of my work, especially on larger-scale projects where you have load-balanced web front ends (for performance, not just redundancy) and you start hitting I/O limits on SQL. I suppose this may have been triggered by the SharePoint Conference 2009, and one of my follow-up blog posts, "
Load Testing SharePoint 2010 with Visual Studio Team Test".
So in this post I first want to look at
why you should do Performance Testing.
It sounds like a bit of a stupid question (with an obvious answer) but it really is surprising how many people don't do it. How many of you have ever asked the following questions on a project?
"How many users can the production system support?"
"What would be the impact of doubling the number of users?"
"What impact will backups have on performance?"
"How fast will the solution perform during peak hours?"
"What is the most cost-effective way of improving performance?"
All of these are questions that you absolutely HAVE to be able to answer. The client (whether it is your organisation, or another organisation who you are running a project for) deserves to know the answers to these, and without them how can you have any idea whether your solution is going to be fit for purpose?
Sure, you can read up on
Estimating Performance and Capacity Planning in SharePoint, but all that gives you is some rough guidelines.. we need to be able to apply some science to the process!
The last question is probably the most compelling. Re-configuring farms and buying new hardware is an expensive process; the consultancy alone can cost thousands of pounds. You don't want your client coming back asking why they just spent tens of thousands of pounds on a new state-of-the-art iSCSI SAN array that had zero impact on performance ("hey .. we thought it would help .. but we didn't really know!") because the bottleneck was actually the CPU on the Web Front End (WFE).
The story often gets even worse when things do start going wrong. If you have ever been in the unfortunate position where you are troubleshooting a system that is performing badly, these kinds of questions are quite common:
"What is causing the poor performance?"
"How can we fix this?"
"Why did you not notice this during development?"
Again, the last question is the killer .. if you don't do any Performance Testing then you won't know that you have a problem until it is too late. The earlier you can get some metrics on this, the faster you will be able to react to performance issues (in some cases finding and fixing them before the client even knows about it!).
Equally, without performance testing you won't know
WHY the problems are occurring. If you don't know why then you can't know
HOW best to fix them!
So the key messages are these:
- Early Warning .. catch problems early on and they will be easier to fix. There is no point waiting until users are hitting the system to find out the solution can't cope with the load!
- Knowledge ... what is causing the problems, and how do you fix them?
- Confidence ... not just that you know what you are doing, but you can prove it. This instils confidence in your sales, confidence in your delivery, and confidence from your clients too!
Performance Testing with Visual Studio 2010
I've been using
Visual Studio 2010 Ultimate edition. It is the only "2010" edition that incorporates
Web Performance Tests and
Load Tests, the two critical pieces that you will use to test the performance on SharePoint 2010 (or any other web based system). It also integrates tightly with Team Foundation Server and provides "Lab Management" capability, but that is out of the scope of this blog post.
In order to do comprehensive testing you really need 4 different software packages:
- Visual Studio 2010 Ultimate: This is where you create your tests and control the execution of them.
- Visual Studio 2010 Test Controller: Part of the Visual Studio Agents 2010 ISO, this allows you to co-ordinate tests executed by several "agents", as well as collecting results and storing all of the test results (and performance counters) in a database. The license for this is included in Visual Studio 2010 Ultimate.
- Visual Studio 2010 Test Agent: Part of the Visual Studio Agents 2010 ISO, this can be installed on machines that will simulate load and execute tests. They are connected to a "Controller" which gives them instructions. The license for this is included in Visual Studio 2010 Ultimate.
- Visual Studio 2010 Virtual User Pack: This is a license that allows you to increase the number of virtual "users" you can simulate by 1,000 (for each pack that you purchase). It is not included with Visual Studio 2010 Ultimate and must be purchased separately (there is no trial version!)
If you need any help installing these and getting them running then there is a great MSDN article which you should read:
Installing and Configuring Visual Studio Agents and Test and Build Controllers or the equally awesome article from Visual Studio Magazine:
Load Testing with Visual Studio 2010.
So what about actually creating the tests?
Well, the interface is pretty simple. You can create your "Web Performance Tests" using a simple Browser Recorder (literally using a Web Browser which records all of your actions, and then click "stop" when you are finished). This works great, but there are a few caveats:
- You might want to use the "Generate Code" option if you are adding documents or list items. This converts your recorded web test into a code file, allowing you to programmatically change document names, or field values .. useful to make sure you are not just overwriting the same document over and over again
- Web Service tests require a bit more "knowledge" of how they work, needing the SOAP envelope (in XML) and the SOAPAction header.
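To make the Web Service caveat a little more concrete, here is a minimal Python sketch of the two pieces such a test needs: the SOAP envelope and the SOAPAction header. It is an illustration only, not what Visual Studio generates. The helper name `build_soap_request` is made up for this example, and the namespace and the `GetListCollection` method are assumptions based on the SharePoint Lists web service, so check them against the WSDL of the service you are actually testing.

```python
import xml.etree.ElementTree as ET

SOAP_ENV_NS = "http://schemas.xmlsoap.org/soap/envelope/"
# Assumption: namespace of the SharePoint Lists web service. Verify against your WSDL.
SHAREPOINT_NS = "http://schemas.microsoft.com/sharepoint/soap/"


def build_soap_request(method, namespace=SHAREPOINT_NS):
    """Return (headers, body) for a SOAP 1.1 call to a method with no parameters."""
    body = (
        '<?xml version="1.0" encoding="utf-8"?>'
        f'<soap:Envelope xmlns:soap="{SOAP_ENV_NS}">'
        "<soap:Body>"
        f'<{method} xmlns="{namespace}" />'
        "</soap:Body>"
        "</soap:Envelope>"
    )
    headers = {
        "Content-Type": "text/xml; charset=utf-8",
        # SOAP 1.1 expects the action URI wrapped in double quotes.
        "SOAPAction": f'"{namespace}{method}"',
    }
    return headers, body


headers, body = build_soap_request("GetListCollection")

# Parse the body back to confirm the envelope is well-formed XML.
envelope = ET.fromstring(body.encode("utf-8"))
print(envelope.tag)
print(headers["SOAPAction"])
```

In a web test you would paste the envelope in as the request body and add the SOAPAction value as a request header; a method that takes parameters needs them as child elements of the method element.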
It is worth noting that there is an excellent CodePlex project available: "SharePoint Performance Tests". Although this was written for Visual Studio 2008 (you can convert it to 2010 if you want) it contains a number of tests, configurable via XML, that allow you to dynamically create tests for generic SharePoint platforms .. well worth a look!
You can then very easily create a "Load Test" which allows you to pick'n'mix tests, and a distribution of which tests you want to run.
My personal favourite distribution is "Tests Per User Per Hour". For this you would sit down with your client and work out "what would a typical user do in an hour of using the system?" One such exercise resulted in this kind of activity distribution:
- Hit the site home page 50 times
- Execute 10 searches
- Upload 5 documents
- Respond to 20 workflow tasks
This kind of valuable information allows you to build your tests and then distribute them using the Load Test. All you do then is plug in how many users you want to simulate and away you go!
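As a rough sanity check before you run anything, it helps to work out what that hourly activity list means as total load. The sketch below is plain Python arithmetic, not anything Visual Studio produces. The per-user numbers come from the list above; the figure of 1,000 users and the function name `load_profile` are assumptions for illustration.

```python
# Per-user hourly activity taken from the list above.
TESTS_PER_USER_PER_HOUR = {
    "home_page": 50,
    "search": 10,
    "upload_document": 5,
    "workflow_task": 20,
}


def load_profile(tests_per_user_per_hour, users):
    """Return total tests/hour, tests/second and each test's share of the mix."""
    per_user_total = sum(tests_per_user_per_hour.values())
    total_per_hour = per_user_total * users
    mix = {
        name: count / per_user_total
        for name, count in tests_per_user_per_hour.items()
    }
    return {
        "tests_per_hour": total_per_hour,
        "tests_per_second": total_per_hour / 3600,
        "mix": mix,
    }


# Hypothetical user count; plug in whatever you agreed with the client.
profile = load_profile(TESTS_PER_USER_PER_HOUR, users=1000)
print(f"{profile['tests_per_hour']} tests/hour")
print(f"{profile['tests_per_second']:.1f} tests/second")
for name, share in profile["mix"].items():
    print(f"{name}: {share:.1%}")
```

Each user runs 85 tests an hour, so 1,000 users means 85,000 tests an hour, of which only about 6% are document uploads. Knowing that up front makes it easier to judge whether the numbers coming out of the Load Test look plausible.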
Counting the Counters?
All of this so far is great stuff, but without the performance counters you really aren't going to get much mileage out of Visual Studio. You might get the
WHAT is going on (i.e. do the tests complete very quickly?) but you certainly won't get the
WHY information which is oh-so important (i.e. is it the CPU, RAM or Disk?)
For this you need to add Performance Counters... thankfully this is ridiculously simple. You have something called "
Counter Sets" which you can configure to collect from the computers that operate in your farm.
There are a bunch of pre-defined counter-sets you can choose from:
- Application
- ASP.Net (I pick this for my WFE Servers)
- .Net Application (I pick this for my Application Servers)
- IIS
- SQL (I pick this for my SQL Servers)
I won't go into any more detail than that. A step-by-step walkthrough of the options (including screenshots) can be found at the
Load Testing with Visual Studio 2010 article at Visual Studio Magazine.
What about the Results?
Well, there isn't a really simple answer to this. You really need to have a good understanding of how the different hardware components interact, and what limits you should be looking for.
The big hardware counters (CPU usage, Available Memory) are the obvious ones. Any server which exceeds 80% CPU usage for any sustained period is going to be in trouble and is close to a bottleneck. Equally any server which starts to run out of memory (or more importantly .. slowly loses memory, suggesting a memory leak!) should be identified.
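If you export the counter samples, both of those checks are easy to script. The following is a minimal Python sketch under some assumptions: the sample values are made up, the function names are mine, and "sustained" is taken to mean six consecutive samples, which you should tune to your own sampling interval. Only the 80% threshold comes from the rule of thumb above.

```python
def sustained_over(samples, threshold=80.0, min_run=6):
    """True if at least `min_run` consecutive samples are above `threshold`."""
    run = 0
    for value in samples:
        run = run + 1 if value > threshold else 0
        if run >= min_run:
            return True
    return False


def slope(samples):
    """Least-squares slope of the samples against their index (units per sample)."""
    n = len(samples)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    num = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(samples))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


# Made-up samples standing in for "% Processor Time" and "Available MBytes".
cpu = [72, 85, 88, 91, 90, 87, 93, 76]
available_mb = [4096, 4010, 3925, 3850, 3760, 3690, 3600, 3515]

print("CPU sustained above 80%:", sustained_over(cpu))
print("Available memory trend (MB per sample):", round(slope(available_mb), 1))
```

A steadily negative slope on available memory over a long run is only a hint of a leak, not proof. Caches warming up look similar at first, so it is worth checking whether the trend flattens out.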
But it's the deeper, more granular analysis that proves most useful. On a recent client project I was looking at a Proof of Concept environment. We knew that we had a bottleneck in our WFE (CPU was averaging around 90%) and it was extremely workflow heavy, but the page performance was far too bad to put down to just the CPU.
On closer inspection we found a direct correlation between the Page Response Time and the Disk Queue Length in SQL Server:
The top-left corner is the Disk Queue Length in SQL Server, and the top right is the Page Response Time for the Document Upload operation (bottom right is the overall Test Response time). Clearly the spikes happened at the same time.
This is the true power of using Visual Studio. All of the tests and performance counters are time-stamped, allowing you to drill into any specific instance and see exactly what was happening at that moment in time!
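Because everything shares a timeline, you can also put a number on "the spikes happened at the same time" once you have two counters sampled at the same moments. Here is a minimal Python sketch using a Pearson correlation coefficient. The two series are invented for illustration (they are not the figures from this project), and the function name `pearson` is my own.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally sampled series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 samples")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Invented samples taken at the same timestamps.
disk_queue_length = [0.4, 0.6, 5.2, 0.5, 7.8, 0.7, 6.1, 0.3]
page_response_s = [1.1, 1.3, 9.4, 1.2, 12.7, 1.5, 10.2, 1.0]

r = pearson(disk_queue_length, page_response_s)
print(f"r = {r:.2f}")
```

A value close to 1 means the two counters rise and fall together. It does not prove that one causes the other, which is why you still need to drill into the individual counters.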
Looking closer at the SQL Disk usage, the Write Time (%) and Read Time (%) show us even more interesting results:
The top of the graph shows the Disk Write Usage (%) and the bottom half shows the Disk Read Usage (%). Clearly, the disk is very busy writing (often at 100%) while it does very little reading. This fits perfectly with our test results, as most of the "read" operations (like viewing the home page, or executing a search) were extremely fast ... but most of the "write" operations (like uploading a document) were much slower.
So the
WHAT is slow write performance (uploading of documents).
The
WHY is now very simple: the disks on the SQL Server need looking at (possibly upgrading to faster disks, or some optimisation in the configuration of the databases).
Conclusion
To be honest I could talk about this subject all day, but hopefully this gives you some indication of just how crucial Performance Testing is .. and how powerful Visual Studio can be as a testing tool.
The ease of creating test scripts, the flexibility and power of the enormous range of performance counters available, and the ability to drill into a single second of activity and see (simultaneously) what was going on in all of the other servers .. it's an awesome combination.
I'll probably be posting more blog posts on this in the future, but for now good luck, and hope you get as much of a kick out of VS2010 as I have :)