The Guerrilla Manifesto

Hit-and-Run Tactics You Can Use on Your Boss or in a Tiger Team Meeting

Updated on Apr 9, 2009

Management resists, the guerrilla planner retreats.
Management dithers, the guerrilla planner proposes.
Management relents, the guerrilla planner promotes.
Management retreats, the guerrilla planner pursues.
The following mantras are taken from my Guerrilla Capacity Planning (GCaP) classes and
also appear as a pull-out booklet in the rear jacket of my book of the same name.    ---NJG

Contents

1  Weapons of Mass Instruction
    1.1  Why Go Guerrilla?
    1.2  Best Practices
    1.3  Virtualization
    1.4  An Ounce of Prevention
    1.5  Why Capacity Management is Hard
    1.6  Brisk Risk Management
    1.7  Failing On Time
    1.8  Performance Homunculus
    1.9  Self Tuning Applications
    1.10  Squeezing Capacity
    1.11  When Wrong is Right
    1.12  Over-engineering Gotcha
    1.13  Network Performance
    1.14  Can't Beat This!
    1.15  Modeling Errors? Checked Your Data Lately?
    1.16  Data Are Not Divine
    1.17  Busy work
    1.18  Little's Law
    1.19  Bigger is Not Always Better
    1.20  Bottlenecks
    1.21  Benchmarks
    1.22  Spit It Out!
    1.23  Consolidation
    1.24  Control Freaks Unite!
2  Performance Modeling Rules of Thumb
    2.1  What is Performance Modeling?
    2.2  Monitoring vs. Modeling
    2.3  Keep It Simple
    2.4  More Like The Map Than The Metro
    2.5  The Big Picture
    2.6  Look for the Principle
    2.7  Guilt is Golden
    2.8  Where to Start?
    2.9  Inputs and Outputs
    2.10  No Service, No Queues
    2.11  Estimating Service Times
    2.12  Change the Data
    2.13  Closed or Open Queue?
    2.14  Opening a Closed Queue
    2.15  Steady-State Measurements
    2.16  Transcribing Data
    2.17  Workloads Come in Threes
3  Universal Law of Computational Scaling
    3.1  The Universal Model
    3.2  Justification
    3.3  Applicability
    3.4  How to Use It

1  Weapons of Mass Instruction

1.1  Why Go Guerrilla?

The planning horizon is now 3 months, thanks to the gnomes on Wall Street. Only Guerrilla-style tactical planning is crazy enough to be compatible with that kind of insanity.

1.2  Best Practices

Best practices are an admission of failure.
Copying someone else's apparent success is like cheating on a test. You may make the grade, but how far is the bluff going to take you?

1.3  Virtualization

Virtualization is about illusions, and although it is perfectly reasonable to perpetrate such illusions on a user, it is not OK to propagate those same illusions to the performance analyst.
Translation: Vendors, listen up. We need backdoors and peepholes so we can determine how resources are actually being consumed.
Corollary: It's better for business if we can manage it properly.

1.4  An Ounce of Prevention

Capacity management is about prevention. But someone once told me "You can't sell prevention!"; the implication being that an ounce of prevention is worthless.
Then explain to me the multi-billion dollar dietary-supplements industry!
It's not what you sell, but how you sell it.

1.5  Why Capacity Management is Hard

Capacity planning is complicated by your brain thinking linearly about a computer system that operates nonlinearly.
Capacity planning techniques, such as the universal scalability model (in Sect. 3), help us to describe and predict these nonlinearities.

1.6  Brisk Risk Management

BRisk management, isn't.
"I can understand people being worked up about safety and quality with the welds," said Steve Heminger, executive director ... "But we're concerned about being on schedule because we are racing against the next earthquake."
Although this is not an IT manager, the point still applies. It is a quote from an executive manager for the new Bay Bridge currently being constructed between Oakland and San Francisco. Management threw out the independent assessment of the welds in order to stay on schedule.

1.7  Failing On Time

Management will often let a project fail, as long as it fails on time!
Until you read and heed this statement, you will probably have a very frustrating time getting your performance management ideas across to management.

1.8  Performance Homunculus

Capacity management is to systems management as the homunculus (sensory proportion) is to the human body (geometric proportion).
Capacity management can rightly be regarded as just a subset of systems management, but the infrastructure requirements for successful capacity planning (both the tools and knowledgeable humans to use them) are necessarily out of proportion with the requirements for simpler systems management tasks like software distribution, security, backup, etc. It's self-defeating to try doing capacity planning on the cheap.

1.9  Self Tuning Applications

Self-tuning applications are not ready for prime time. How can they be, when human performance experts get it wrong all the time!?
Think about it. Performance analysis is a lot like a medical examination, and medical expert systems were heavily touted in the mid-1980s. You don't hear about them anymore. And you know that if they worked, HMOs would be all over them. It's a laudable goal, but if you lose your job, it won't be because of some expert performance robot.

1.10  Squeezing Capacity

Capacity planning is not just about the future anymore.
Today, there is a serious need to squeeze more out of your current capital equipment.

1.11  When Wrong is Right

Capacity planning is about setting expectations. Even wrong expectations are better than no expectations!
Planning means making predictions. Even a wrong prediction is useful. It means either (i) the understanding behind your prediction is wrong and needs to be corrected, or (ii) the measurement process is broken somewhere and needs to be fixed. Start with a SWAG. Next time, try a G. If you aren't making iterative predictions throughout a project life-cycle, you will only know things are amiss when it's too late!

1.12  Over-engineering Gotcha

Hardware is cheaper today, but a truck-load of PCs won't help one iota if all or part of the application runs single-threaded.
My response to the oft heard platitude: "We don't need no stinkin' capacity planning. We'll just throw more cheap iron at it!" The capacity part is easy. It's the planning part that's subtle.

1.13  Network Performance

It's never the network!
If the network is out of bandwidth or has interminable latencies, fix it! Then we'll talk performance of your application.

1.14  Can't Beat This!

If the measured round-trip time (RTT) of an application produces a concave graph in response to increasing user/client load, SHIP IT!
In case you're wondering, those are REAL data and the axes are correctly labeled. I'll let you ponder why these measurements are so broken, they're not even wrong! Only if you don't understand basic queueing theory would you press on regardless (which the original engineer did).

1.15  Modeling Errors? Checked Your Data Lately?

When I'm asked, "But, how accurate are your performance models?" my canonical response is, "Well, how accurate are your performance DATA!?"
Most people remain blissfully unaware of the fact that ALL measurements come with errors, both systematic and random. An important capacity planning task is to determine and track the magnitude of the errors in your performance data. Every datum should come with a '±' attached (which will then force you to put a number after it).
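As a minimal sketch of what attaching a '±' can look like in practice (the numbers are made up, and the statistics are just the ordinary sample mean and standard error):

    # Hypothetical repeated throughput measurements (requests/sec) of the same test.
    import math

    samples = [398.2, 401.5, 396.8, 403.1, 399.4]

    n = len(samples)
    mean = sum(samples) / n
    # Sample standard deviation and standard error of the mean.
    sdev = math.sqrt(sum((x - mean) ** 2 for x in samples) / (n - 1))
    serr = sdev / math.sqrt(n)

    print("X = %.1f +/- %.1f req/s" % (mean, serr))  # every datum carries its error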

1.16  Data Are Not Divine

Treating performance data as something divine is a sin.
Data come from the Devil; only models come from God.

1.17  Busy work

Busy work does not accrue enlightenment.
Western culture too often glorifies hours clocked as productive work. If you don't take time off to come up for air and reflect on what you're doing, how are you going to know when you're wrong?

1.18  Little's Law

Little's law means a lot! Learn it by heart; not his proof, just the result.
I use it almost daily to cross-check that throughput and delay data are consistent, no matter whether those data come from measurements or models. More details about Little's law can be found in Chap. 2 of Analyzing Computer System Performance with Perl::PDQ. Another use of Little's law is calculating service times, which are notoriously difficult to measure directly. See the Rules of Thumb in Sect. 2.
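Here is the kind of cross-check I mean, as a minimal Python sketch with made-up monitoring numbers (Little's law Q = X R, with X in requests per second and R in seconds):

    # Cross-check a measured queue length against Little's law: Q = X * R.
    X = 120.0          # measured throughput (requests/sec)
    R = 0.250          # measured mean response time (sec)
    Q_measured = 32.0  # mean number of requests in the system, from a monitor

    Q_littles_law = X * R
    discrepancy = abs(Q_measured - Q_littles_law) / Q_littles_law
    print("Little's law: Q = %.1f, monitor: Q = %.1f (%.0f%% off)"
          % (Q_littles_law, Q_measured, 100 * discrepancy))
    # A large discrepancy means the throughput, response-time, and queue-length
    # data are not mutually consistent and should be investigated.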

1.19  Bigger is Not Always Better

Beware the SMP wall!
The bigger the symmetric multiprocessor (SMP) configuration you purchase, the busier you need to run it. But only to the point where the average run-queue begins to grow. Any busier and the user's response time will rapidly start to climb through the roof.

1.20  Bottlenecks

You never remove a bottleneck, you just shuffle it around.

1.21  Benchmarks

All benchmarking is institutionalized cheating.

1.22  Spit It Out!

Spend as much time on developing the presentation of your capacity planning conclusions as you did reaching them.
If your audience does not get the point, or things go into the weeds because you didn't expend enough thought on a visual, you just wasted a lot more than your presentation time-slot.

1.23  Consolidation

Gunther's law of consolidation: Remove it and they will come!

1.24  Control Freaks Unite!

Your own applications are the last refuge of performance engineering.
Control over the performance of hardware resources, e.g., CPUs and disks, is progressively being eroded as these things simply become commodity black boxes, viz., multicore processors and disk arrays. This situation will only be exacerbated with the advent of Internet-based application services. Software developers will therefore have to understand more about the performance and capacity planning implications of their designs running on these black boxes. (See Sect. 3.)

2  Performance Modeling Rules of Thumb

Here are some ideas that might be of help when you're trying to construct your capacity planning or performance analysis models.

2.1  What is Performance Modeling?

All modeling is programming and all programming is debugging.

2.2  Monitoring vs. Modeling

The difference between performance modeling and performance monitoring is like the difference between weather prediction and simply watching a weather-vane twist in the wind.

2.3  Keep It Simple

To paraphrase Einstein:
A performance model should be as simple as possible, but no simpler!
Someone else said:
"A designer knows that he has achieved perfection not when there is nothing left to add, but when there is nothing left to take away." -Antoine de St-Expurey
I now tell people in my Guerrilla classes, despite the fact that I repeat this rule of thumb several times, you will throw the kitchen sink into your performance models; at least, early on as you first learn how to create them. It's almost axiomatic: the more you know about the system architecture, the more detail you will try to throw into the model. The goal, in fact, is the opposite.

2.4  More Like The Map Than The Metro

A performance model is to a computer system as the London Tube map is to the London underground railway.
The Tube map is pure abstraction that has very little to do with the physical railway system. It encodes only sufficient detail to enable transit on the underground from point A to point B. It does not include a lot of irrelevant details such as altitude of the stations, or even their actual geographical proximity. A performance model is a similar kind of abstraction.
Despite several attempts, the original Tube map has hardly been improved upon since its conception in 1933. Apparently, it already met the requirement of being as simple as possible, but no simpler. The fact that it was designed by an electrical draughtsman probably helped.

2.5  The Big Picture

Unlike most aspects of computer technology, performance modeling is about deciding how much detail can be ignored!

2.6  Look for the Principle

When trying to construct the performance representation of a computer system (which may or may not be a queueing model), look for the principle of operation. If you can't describe the principle of operation in 25 words or less, you probably don't understand it yet.
As an example, the principle of operation for a time-share computer system can be stated as: Time-share gives every user the illusion that they are the ONLY user active on the system. All the thousands of lines of code in the operating system, which support time-slicing, priority queues, etc., are there merely to support that illusion.

2.7  Guilt is Golden

Performance modeling is also about spreading the guilt around.
You, as the performance analyst or planner, only have to shine the light in the right place and then stand back while others flock to fix it.

2.8  Where to Start?

Have some fun with blocks; functional blocks!
[Figure: functional blocks mapped to their corresponding queueing subsystems.]
One place to start constructing a PDQ model is by drawing a functional block diagram. The objective is to identify where time is spent at each stage in processing the workload of interest. Ultimately, each functional block is converted to a queueing subsystem like those shown above. This includes the ability to distinguish sequential and parallel processing. Other diagrammatic techniques, e.g., UML diagrams, may also be useful, but I don't understand that stuff and never tried it. See Chap. 6, "Pretty Damn Quick (PDQ) - A Slow Introduction", of Analyzing Computer System Performance with Perl::PDQ.

2.9  Inputs and Outputs

When defining performance models (especially queueing models), it helps to write down a list of INPUTS (measurements or estimates that are used to parameterize the model) and OUTPUTS (numbers that are generated by calculating the model).
Take Little's law, Q = X R, for example. It is a performance model; albeit a simple equation or operational law, but a model nonetheless. All the variables on the RIGHT side of the equation (X and R) are INPUTS, and the single variable on the LEFT (Q) is the OUTPUT. A more detailed discussion of this point is presented in Chap. 6, "Pretty Damn Quick (PDQ) - A Slow Introduction", of Analyzing Computer System Performance with Perl::PDQ.

2.10  No Service, No Queues

You know the restaurant rule: "No shoes, no service!" Well, this is the PDQ rule: no service (time), no queues. In your PDQ models, there is no point creating more queueing nodes than you have measured service times for.
If the measurements of the real system do not include the service time for a queueing node that you think ought to be in your PDQ model, then that PDQ node cannot be defined.

2.11  Estimating Service Times

Service times are notoriously difficult to measure directly. Often, however, the service time can be calculated from other performance metrics that are easier to measure.
Suppose, for example, you have requests coming into an HTTP server, you can measure its CPU utilization with some UNIX tool like vmstat, and you would like to know the service time of the HTTP Gets. UNIX won't tell you, but you can use Little's law (U = X S) to figure it out. If you can measure the arrival rate of requests in Gets/sec (X) and the CPU utilization as a fraction (U), then the average service time (S) for a Get is easily calculated from the quotient U/X.
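The same calculation in a few lines of Python, using hypothetical numbers:

    # Estimate the average CPU service time per Get from U = X * S.
    X = 350.0   # measured arrival rate (Gets/sec)
    U = 0.42    # measured CPU utilization as a fraction (42% busy)

    S = U / X   # average service time per Get (seconds)
    print("S = %.2f ms per Get" % (S * 1000))  # prints: S = 1.20 ms per Get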

2.12  Change the Data

If the measurements don't support your PDQ performance model, change the measurements.

2.13  Closed or Open Queue?

When trying to figure out which queueing model to apply, ask yourself if you have a finite number of requests to service or not. If the answer is yes (as it would be for a load-test platform), then it's a closed queueing model. Otherwise use an open queueing model.

2.14  Opening a Closed Queue

How do I determine when a closed queueing model can be replaced by an open model?
This important question arises, for example, when you want to extrapolate performance predictions for an Internet application (open) that are based on measurements from a load-test platform (closed).
An open queueing model assumes an infinite population of requesters initiating requests at an arrival rate λ (lambda). In a closed model, λ (lambda) is approximated by the ratio N/Z. Treat the think time Z as a free parameter, and choose a value (by trial and error) that keeps N/Z constant as you make N larger in your PDQ model. Eventually, at some value of N, the OUTPUTS of both the closed and open models will agree to some reasonable approximation.
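Here is a minimal Python sketch of that procedure for the simplest possible case: the closed model is a single queueing center with think time, solved exactly by Mean Value Analysis (MVA), and the open comparison is an M/M/1 queue. The service time, arrival rate, and load points are all made up for illustration; a PDQ model would be iterated the same way, with PDQ doing the arithmetic.

    # Closed model: N users, think time Z, one queueing center with service time S,
    # solved by the exact MVA recursion. Open model: M/M/1 with arrival rate lam.
    # All numbers are hypothetical.
    S = 0.010    # service time at the queueing center (sec)
    lam = 80.0   # target open-model arrival rate (requests/sec); note lam * S < 1

    R_open = S / (1.0 - lam * S)   # open (M/M/1) response time

    for N in (5, 10, 20, 50, 100, 200):
        Z = N / lam                  # hold N/Z constant at lam as N grows
        Q = 0.0
        for n in range(1, N + 1):    # exact MVA for a single queueing node
            R = S * (1.0 + Q)        # residence time seen by the n-th user
            X = n / (R + Z)          # system throughput with n users
            Q = X * R                # queue length, via Little's law
        print("N = %4d   closed R = %6.2f ms   open R = %6.2f ms"
              % (N, R * 1000, R_open * 1000))
    # As N grows (with N/Z held fixed), the closed response time approaches the
    # open-model value; that is the point where the open model becomes a
    # reasonable replacement.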

2.15  Steady-State Measurements

The steady-state measurement period should be on the order of 100 times larger than the largest service time.

2.16  Transcribing Data

Use the timebase of your measurement tools. If it reports in seconds, use seconds; if it reports in microseconds, use microseconds. The point being, it's easier to check the digits directly for any transcription errors. Of course, the units of ALL numbers should be normalized before doing any arithmetic.

2.17  Workloads Come in Threes

In a mixed workload model (multi-class streams in PDQ), avoid using more than 3 concurrent workstreams whenever possible.
Apart from making an unwieldy PDQ report to read, generally you are only interested in the interaction of 2 workloads (pairwise comparison). Everything else goes in the third (AKA "the background"). If you can't see how to do this, you're probably not ready to create the PDQ model.

3  Universal Law of Computational Scaling

Some reasons why you should understand this law:
  1. A lot of people use the term "scalability" without clearly defining it, let alone defining it quantitatively. Computer system scalability must be quantified. If you can't quantify it, you can't guarantee it. The universal law of computational scaling provides that quantification.

  2. One of the greatest impediments to applying queueing-theory models (whether analytic or simulation) is the inscrutability of service times within an application. Every queueing facility in a performance model requires a service time as an input parameter. As noted in Sect. 2, no service time, no queue. Without the appropriate queues in the model, system performance metrics like throughput and response time cannot be predicted. The universal law of computational scaling leapfrogs this entire problem by NOT requiring ANY low-level service time measurements as inputs.

3.1  The Universal Model

The universal scalability model combines the following independent effects into a single equation (1) expressed in terms of two parameters α and β.

[Figure: four SDET benchmark throughput curves illustrating, left to right: equal bang for the buck (α = 0, β = 0); cost of sharing resources (α > 0, β = 0); diminishing returns at higher loads (α > 0, β = 0); negative return on investment (α > 0, β > 0).]
The relative capacity C(N) is a normalized throughput given by:
    C(N) = N / [1 + α(N − 1) + βN(N − 1)]        (1)
where N represents either:
  1. (Software Scalability) the number of users or load generators on a fixed hardware configuration. In this case, the number of users acts as the independent variable while the CPU configuration remains constant for the range of user load measurements.

  2. (Hardware Scalability) the number of physical processors or nodes in the hardware configuration. In this case, the number of user processes executing per CPU (say 10) is assumed to be the same for every added CPU. Therefore, on a 4 CPU platform you would run 40 virtual users.

with α (alpha) the contention parameter, and β (beta) the coherency-delay parameter. The latter accounts for the retrograde throughput seen in Fig. 7.3 (above).
NOTE: The objective of using eqn.(1) is NOT to produce a curve that passes through every data point. That's called curve fitting, and that's what graphic artists do with splines. As von Neumann said, "Give me 4 parameters and I'll fit an elephant. Give me 5 and I'll make its trunk wiggle!" (At least I only have 2.)

3.2  Justification

The following theorem in queueing theory provides the justification for applying the same Universal Scalability Law eqn.(1) to both hardware and software systems. The queueing theorem I discovered can be stated thusly:
Amdahl's law for parallel speedup is equivalent to the synchronous queueing bound on throughput in the repairman model of a multiprocessor.
It was first published on arXiv in 2002. Both Amdahl's law and my Universal Scalability Law belong to a class of mathematical functions, called Rational Functions, which I have been able to show mathematically possess intimate connections with a load-dependent repairman model in queueing theory. More recently, this result has also been confirmed "experimentally" using simulations developed by my colleague Jim Holtman. So the whole approach to quantifying scalability is now placed on a fundamentally sound physical footing.
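To see the connection concretely, set β = 0 in eqn.(1):

    C(N) = N / [1 + α(N − 1)]

which is exactly Amdahl's law for parallel speedup with serial fraction α; the βN(N − 1) term adds the coherency delays that Amdahl's law omits.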

3.3  Applicability

This model has widespread applicability; that's why it's called "universal".

3.4  How to Use It

Virtual load testing
The universal model in eqn.(1) allows you to take a sparse set of load measurements (4-6 data points) and determine how your application will scale under larger user loads than you may be able to generate in your test lab. This can all be done in a spreadsheet like Excel.
[Figure: fit of eqn.(1) to SDET benchmark data in an Excel spreadsheet.]
Excerpted from Guerrilla Capacity Planning by Neil J. Gunther, Springer-Verlag (2006).
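If you prefer a script to a spreadsheet, the same regression can be sketched in a few lines of Python using scipy's curve_fit; the load points below are made up purely for illustration:

    # Fit the universal scalability law, eqn.(1), to a handful of measured
    # throughputs and extrapolate. Data points are hypothetical.
    import numpy as np
    from scipy.optimize import curve_fit

    N = np.array([1, 4, 8, 16, 32, 48])                        # user loads
    X = np.array([92.0, 340.0, 610.0, 990.0, 1280.0, 1350.0])  # throughput (tps)

    C = X / X[0]   # relative capacity C(N) = X(N) / X(1)

    def usl(n, alpha, beta):
        return n / (1.0 + alpha * (n - 1) + beta * n * (n - 1))

    (alpha, beta), _ = curve_fit(usl, N, C, p0=(0.01, 0.0001), bounds=(0.0, 1.0))
    print("alpha = %.4f, beta = %.6f" % (alpha, beta))

    for n in (64, 96, 128, 256):   # loads you cannot generate in the test lab
        print("N = %4d   predicted X = %7.1f tps" % (n, X[0] * usl(n, alpha, beta)))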
Detecting measurement problems
Equation (1) is not a crystal ball. It cannot foretell the onset of broken measurements or intrinsic pathologies. When the data diverge from the model, that does not automatically make the model wrong. You need to stop measuring and find out why.
Performance heuristics
The relative sizes of the α and β parameters tell you respectively, whether contention effects or coherency effects are responsible for poor scalability.
Performance diagnostics
What makes (1) easy to apply also limits its diagnostic capability. If the fitted parameter values indicate poor scalability, the model cannot tell you what to fix. All that information is in there alright, but it's compressed into the values of those two little parameters. However, other people, e.g., the application developers (the people who wrote the code) or the systems architect, may easily identify the problem once the universal law has told them they need to look for one.
Copyright © 2006. Performance Dynamics Company. All Rights Reserved.


