So exactly what do we mean by "load testing" when it comes to SharePoint 2010? There are plenty of metrics people tend to point towards, and I've heard "hits/visits per day" and "throughput" bandied about, but at the end of the day it comes down to two things:
- Requests Per Second
Requests per second is exactly what it sounds like: how many requests for information each server can respond to per second. Each page may consist of dozens of artifacts, and the browser has to make a separate "request" for each one, so the more of these "requests" a server can serve, the better.
- Server Response Time
The response time represents the time spent processing on the server side (also known as TTLB, or Time to Last Byte). This doesn't factor in network latency or bandwidth though! (There's a little sketch of both metrics just below.)
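To make those two metrics a bit more concrete, here is a minimal sketch of measuring them by hand against a single page. The URL and sample count are made-up examples, and note that because this measures from the client it includes network time as well, so run it close to the farm if you want something nearer a server-side view:

```python
# Minimal sketch: measure requests per second and response time by hand.
# The URL and sample count are hypothetical examples.
import time
import urllib.request

TEST_URL = "http://sharepoint.example.com/Pages/Default.aspx"  # hypothetical
SAMPLES = 50

latencies = []
start = time.perf_counter()
for _ in range(SAMPLES):
    t0 = time.perf_counter()
    with urllib.request.urlopen(TEST_URL) as resp:
        resp.read()  # drain the whole body, i.e. time to LAST byte
    latencies.append(time.perf_counter() - t0)
elapsed = time.perf_counter() - start

# Measured from the client, so unlike true server-side TTLB this also
# includes network latency; run it on or near the farm to minimise that.
print(f"Requests per second: {SAMPLES / elapsed:.1f}")
print(f"Avg response time: {sum(latencies) / SAMPLES * 1000:.0f} ms")
```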
So the first thing you should think about is: what can influence those metrics? The answer comes down to five different elements of your SharePoint 2010 farm:
- WFE
- Storage
- Network
- App Servers
- SQL
This, as I'm sure you can imagine, can involve a LOT of testing. Simply testing the WFEs on their own is going to be a struggle for your average developer, and if you don't have any industry testing experience you are going to have a hard time. But this is where the new SharePoint 2010 wave continues to make its presence felt...
SharePoint 2010 Load Testing Toolkit
This is a new set of tools being released with the SharePoint 2010 Administration Toolkit, and it represents the easiest possible way of load testing your SharePoint environment. The main objectives are to:
- Standardise load testing and reduce its cost
- Simulate common SharePoint operations
- Be used as reference to create other custom tests (for custom code, for example!)
The whole thing relies on your IIS logs. These logs give pointers on where users are going, what kinds of requests they are making (GET / PUT), as well as the types of files they are typically accessing (ASPX / CSS / JS / JPEG / DOCX / etc.).
The Load Testing Toolkit will analyse your IIS logs and automatically generate a set of load tests to appropriately match your environment, producing automated scripts that can be run in Visual Studio (either Team System or Team Test Edition).
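Just to give a feel for the kind of log mining involved, here is a rough sketch of pulling request methods and file types out of W3C-format IIS logs. This is only an illustration of the idea (the log directory and site ID are hypothetical examples), not the toolkit's actual code:

```python
# Rough sketch of IIS log mining: count request methods and file types.
# The log directory (and site ID W3SVC1) is a hypothetical example.
import collections
import pathlib

LOG_DIR = pathlib.Path(r"C:\inetpub\logs\LogFiles\W3SVC1")

methods = collections.Counter()
extensions = collections.Counter()
fields = []

for logfile in sorted(LOG_DIR.glob("*.log")):
    for line in logfile.open(encoding="ascii", errors="ignore"):
        if line.startswith("#Fields:"):
            fields = line.split()[1:]  # column layout is declared in the log itself
            continue
        if line.startswith("#") or not fields:
            continue
        row = dict(zip(fields, line.split()))
        methods[row.get("cs-method", "?")] += 1
        stem = row.get("cs-uri-stem", "")
        if "." in stem:
            extensions[stem.rsplit(".", 1)[-1].lower()] += 1

print("Request methods:", methods.most_common())
print("Top file types:", extensions.most_common(10))
```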
How hard can it be?
It is really quite simple (well, according to the ridiculously simple explanation at the SharePoint Conference 2009, anyway!). You literally point the tool at your IIS logs and it spits out an entire suite of tests, for WFE, SQL, Storage, etc., including all the metrics you would want (covering CPU, RAM, Network, Disk I/O, and even SQL, ASP.NET and .NET Framework specific performance counters).
Then you just run it and analyse the results!
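If you fancy peeking at those same counter categories by hand while a test runs, the built-in Windows typeperf tool will sample them for you. A quick sketch (the counter paths are standard Windows/ASP.NET ones, but the sampling window and output file are just example values):

```python
# Sketch: sample farm health counters during a test run via typeperf.
# Counter paths are standard Windows/ASP.NET ones; the interval, sample
# count and output file name are example values only.
import subprocess

counters = [
    r"\Processor(_Total)\% Processor Time",            # CPU
    r"\Memory\Available MBytes",                       # RAM
    r"\PhysicalDisk(_Total)\Avg. Disk Queue Length",   # Disk I/O
    r"\ASP.NET\Requests Queued",                       # ASP.NET pressure
    r"\.NET CLR Memory(_Global_)\% Time in GC",        # .NET Framework
]

# Sample every 5 seconds for an hour (720 samples) into a CSV file.
subprocess.run(
    ["typeperf", *counters, "-si", "5", "-sc", "720", "-o", "counters.csv", "-y"],
    check=True,
)
```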
Analyse That!
The analysis couldn't be simpler. With "Requests Per Second" and "Response Time" being two of the metrics generated by the Visual Studio test reports, you really can't go far wrong.
If you do find a problem, then you can delve into the new SharePoint 2010 "Usage Database" (which now runs on SQL Server) in order to identify exactly what was causing your dip in performance (say when someone deletes a large list?).
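You could even query the Usage Database directly from a script. Here is a (heavily hedged) sketch using pyodbc; the server name is made up, and the database, view and column names are assumptions based on a default 2010 install, so do check your own farm's logging database for the actual schema:

```python
# Hedged sketch: find the slowest requests around a performance dip.
# Server name is hypothetical; database/view/column names are assumptions
# based on a default install -- verify against your own logging database.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={SQL Server};SERVER=SQL01;DATABASE=WSS_Logging;"
    "Trusted_Connection=yes;"
)
cursor = conn.cursor()
cursor.execute(
    """
    SELECT TOP 20 LogTime, Duration, RequestType
    FROM dbo.RequestUsage
    WHERE LogTime BETWEEN ? AND ?
    ORDER BY Duration DESC
    """,
    "2010-05-01 14:00", "2010-05-01 15:00",  # the window where RPS dipped
)
for row in cursor.fetchall():
    print(row.LogTime, row.Duration, row.RequestType)
```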
Tips and Tricks
There are a few gotchas. One thing to be careful of is "Validation Rules" in Visual Studio. Typically it will be happy with any page that returns a "200" code, which of course includes Error and Access Denied pages (SharePoint handles those itself and returns a perfectly valid page, hence the 200 code!).
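In other words, a validation rule needs to look at what actually came back, not just the status code. A tiny sketch of that extra check (the marker strings are illustrative assumptions about what a SharePoint error page contains):

```python
# Sketch: a 200 status alone isn't enough -- SharePoint error and
# "Access Denied" pages also come back as 200. Markers are assumptions.
import urllib.request

ERROR_MARKERS = ("Access Denied", "An unexpected error has occurred")

def page_really_ok(url: str) -> bool:
    with urllib.request.urlopen(url) as resp:
        if resp.status != 200:
            return False
        body = resp.read().decode("utf-8", errors="ignore")
    # Fail the check if the "valid" page is actually an error page.
    return not any(marker in body for marker in ERROR_MARKERS)
```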
It is also recommended that you let your test "warm up" for around an hour before you start taking the results seriously. This allows all of the operations, timers and back-end mechanics of SharePoint to properly settle down, and means you are getting a realistic picture of how the environment will behave once it is bedded into production.
Finally, the SharePoint Usage Logging Database is a great place to pull information from, so why not leverage other great aspects of the Office 2010 family? You could pull the Usage DB information into Excel 2010 (perhaps using PowerPivot?) so that you can spin out charts and pivot tables to easily drill down into your data.
Typically load testing tells you WHEN bottlenecks are occurring, but the Usage Database can tell you WHAT is causing the bottlenecks!