I’m indebted to an old colleague for giving me this way of looking at performance tuning. I’m sure he’d recognise the pub too!
How Much Tuning Do We Need?
The better question is: how fast does your system go when configured sensibly, and how fast is it supposed to go?
If you don’t have a simple model of the performance your system requires, and a way of comparing its actual performance against that realistic model, then all you can do is theoretical tuning. You’ll either never be happy to say you’re finished, or your users will never be happy.
There’s a theoretical state we’re aiming for. It’s something like this:
If we can prove that the system, as it is now, can process a whole day’s worth of data in under an hour, we can probably stop tuning it and just go down the pub.
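As a rough illustration, the pub test can be written down as a few lines of arithmetic. This is only a sketch: the function names are made up for this post, and it assumes you have a measured, steady throughput figure (items per second) and a daily volume to plug in.

```python
def hours_to_process(daily_items: int, items_per_second: float) -> float:
    """Hours needed to process one day's items at a measured steady rate."""
    if items_per_second <= 0:
        raise ValueError("items_per_second must be positive")
    return daily_items / items_per_second / 3600.0


def can_go_to_pub(daily_items: int, items_per_second: float,
                  budget_hours: float = 1.0) -> bool:
    """True if a whole day's data fits strictly inside the budget (default: one hour)."""
    return hours_to_process(daily_items, items_per_second) < budget_hours


# Hypothetical numbers: 5 million items a day, measured at 2,000 items per second.
print(can_go_to_pub(5_000_000, 2_000))  # True: 2,500 seconds is under an hour
print(can_go_to_pub(5_000_000, 1_000))  # False: 5,000 seconds is over an hour
```

The numbers themselves matter less than having them written down somewhere everyone can argue about them.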
What Do You Need To Go Down The Pub?
To get to this state, we need a system that works efficiently on the sort of hardware we’re prepared to deploy. The first thing to work out is whether what we’ve built so far ALREADY is that! If it isn’t, then after each increment of tuning we come back and ask the same question.
We will also need:
- Expected daily load (now and in the likely future)
- Expected peak load (if there’s a chance, for instance, that the system does most of its work within a short period of time)
- Expected reasonable budget for running replicas of nodes within the system
- Organization’s desired trade-off between engineering time and machine time
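To show how these inputs fit together, here is a minimal sketch that turns a peak load and a replica budget into a yes/no answer. The names and the growth factor are hypothetical, and it assumes throughput scales linearly with the number of replicas, which real systems rarely manage perfectly.

```python
import math


def replicas_needed(peak_items_per_second: float,
                    per_replica_items_per_second: float,
                    expected_growth: float = 1.0) -> int:
    """Replicas required to carry the peak load, scaled up for likely future growth.

    Assumes linear horizontal scaling (an assumption, not a measurement).
    """
    if per_replica_items_per_second <= 0:
        raise ValueError("per_replica_items_per_second must be positive")
    future_peak = peak_items_per_second * expected_growth
    return math.ceil(future_peak / per_replica_items_per_second)


def fits_budget(peak_items_per_second: float,
                per_replica_items_per_second: float,
                max_replicas: int,
                expected_growth: float = 1.0) -> bool:
    """True if the replicas needed stay within what we are prepared to pay for."""
    needed = replicas_needed(peak_items_per_second,
                             per_replica_items_per_second,
                             expected_growth)
    return needed <= max_replicas


# Hypothetical numbers: 3,000 items/s at peak, 50% growth expected,
# each replica measured at 1,000 items/s.
print(replicas_needed(3_000, 1_000, expected_growth=1.5))              # 5
print(fits_budget(3_000, 1_000, max_replicas=4, expected_growth=1.5))  # False
```

If the answer is False, that is where the trade-off between engineering time and machine time comes in: tune until fewer replicas are needed, or pay for the extra ones.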
If you have a system that can’t horizontally scale, then some of the above might not apply.
Similarly, if the system does all of its business in a 20-minute window (as can be the case for systems which process enquiries after a TV ad), then you need to modify the expectation appropriately. In this case the total load needs to execute within 20 minutes, so if we can execute it in about 2, we can go to the pub!
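One way to compare the two situations is as a headroom factor: the time available divided by the time the work actually takes. The sketch below uses a made-up helper name and simply restates the two examples from this post in those terms.

```python
def headroom_factor(window_seconds: float, processing_seconds: float) -> float:
    """How many times over the work fits into the time available."""
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return window_seconds / processing_seconds


# A whole day's data processed in an hour.
print(headroom_factor(24 * 3600, 3600))  # 24.0

# A 20-minute burst after a TV ad, processed in 2 minutes.
print(headroom_factor(20 * 60, 2 * 60))  # 10.0
```

How much headroom counts as enough is a judgement call for your organisation, which is why it belongs with the inputs above rather than in the code.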
What Does This Prove?
This methodology does not tell you how to tune your system. It doesn’t make it faster or slower. It does not tell you what your non-functional requirements (NFRs) are. However, it is a way to coerce all the data you’re considering into a shape that’s simple and complete. To use it, you must:
- Understand your goal
- Trade off machine time vs tuning time
- Stop when it’s good enough, rather than agonise over “perfection”
- Be able to explain why you think it’s performant enough that you can “go to the pub”
An extra tip: explain the performance in user language, not system language. Then you will get the buy-in of your stakeholders, and perhaps they’ll be down the pub with you, celebrating.