Guerrilla Data Analysis Techniques
New Expanded 5-Day Course
with Prof. David Lilja, Mr. Jim Holtman, and Dr. Neil Gunther
Contents
1 Why You Need This Course
2 Certification
3 Course Goals
4 Dates
5 Course Structure
5.1 GDAT Content Day 1
5.2 GDAT Content Day 2
5.3 GDAT Content Day 3
5.4 GDAT Content Day 4
5.5 GDAT Content Day 5
6 Instructors
6.1 David Lilja
6.2 Jim Holtman
6.3 Neil Gunther
7 Terms and Conditions
8 Textbooks
9 Location: Changed
1 Why You Need This Course
Many Guerrilla alumni have asked for this class. Why?
Well, they've collected cubic light years of performance data, but then they
realize that anyone could have pushed the same buttons they did to
collect that data. No job security there. They want to set themselves apart by
transforming that raw performance data into performance information by
applying
Guerrilla techniques.
That's exactly what we teach you in this class.
Moreover, the data analysis techniques we present are general purpose, and therefore
not tied to any particular computing platform or data collection tools.
2 Certification
This class corresponds to Guerrilla Capacity Planner: Level III certification.
The levels are defined as:
- Level I: Entry level, e.g., Guerrilla Boot Camp.
- Level II: Exposure to a wide variety of computer systems capacity planning concepts, methods, and
tools that can be adapted opportunistically to support the needs of
enterprise-level platform-independent performance management.
An example class is Guerrilla Capacity Planning.
- Level III: Detailed study of a particular capacity planning technique or performance analysis tool.
A printed certificate reflecting the level of achievement is awarded to each attendee who completes the course.
Official Purpose
This new 5-day course falls naturally into two parts:
- An easy introduction to both simple and sophisticated statistical concepts.
We begin with a comparison of the three
primary techniques used to measure and evaluate the performance of computer
systems, an in-depth look at the metrics used to characterize performance,
and a survey of the strategies used in the fundamental measurement tools
and techniques. The focus then shifts to provide a gentle introduction to
some of the key statistical tools and techniques needed to interpret noisy
performance measurements and to understand complex simulation results. We
also will examine techniques that can be used to appropriately design
experiments to obtain the maximum amount of information for a given level
of experimental effort. The course then concludes with a discussion of the
key issues related to system simulation.
- Demonstrations of how to apply those concepts. We use tools like
Excel, R, SIMUL8, SimPy and Mathematica applied to actual computer performance data.
3 Course Goals
After completing this course, the participants will be able to:
- Rigorously compare the performance of computer systems in the
presence of measurement noise.
- Determine whether a change made to a system has a statistically
significant impact on performance.
- Use statistical tools to reduce the number of simulations of a
computer system that need to be performed.
- Design a set of experiments to obtain the most information for a
given level of effort.
- Understand the inherent trade-offs involved in using simulation,
measurement, and analytical modeling.
- Apply tools, like R and Excel, to the analysis of large volumes of
computer performance data.
- Discern which visualization techniques are best suited
to assist in converting performance data into information.
- Participate in ongoing Performance Dynamics-sponsored email discussions about using R and
other tools on the job in their shop.
4 Dates
Check the schedule page for the latest information.
Online registration is available. Additional registration details are provided at the end of this page.
Who Should Attend
This class is intended for application scientists and engineers,
computer architects, compiler writers, and software engineers who use or
design high-performance computer systems. The level of the presentation is
appropriate for both practitioners and students. Experts from any
scientific discipline will find this class useful in helping to
understand how to appropriately measure and statistically analyze the
performance of their systems and applications.
Content level: 20% beginner, 60% intermediate, 20% advanced.
5 Course Structure
Class begins at 9am and ends at 5pm each day.
A serviced morning break of half an hour is held around 10:30am.
Seated lunch service is provided from noon until 1pm.
A serviced afternoon break of half an hour is held around 3:00pm.
A number of practical exercises will be given and discussed throughout
the five days. You are encouraged to bring a laptop computer.
5.1 GDAT Content Day 1
- Introduction
  - Measurement
  - Simulation
  - Analytical modeling
- Performance Metrics
  - Characteristics of good metrics
  - Processor and system metrics
  - Speedup and relative change
- Measurement Tools and Techniques
  - Strategies
  - Interval timers
  - Program profiling
  - Tracing
  - Indirect measurement
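To give a flavor of the Day 1 topics, here is a minimal Python sketch of an interval timer together with the speedup and relative-change metrics. It is an illustration only, not course material: the function names are hypothetical, and the definitions assume the common convention that speedup is the ratio of old to new execution time and that relative change is speedup minus one.

```python
import time


def time_call(fn, repeats=5):
    """Time fn() with a wall-clock interval timer; return elapsed seconds per run."""
    elapsed = []
    for _ in range(repeats):
        start = time.perf_counter()  # high-resolution monotonic clock
        fn()
        elapsed.append(time.perf_counter() - start)
    return elapsed


def speedup(t_old, t_new):
    """Speedup of the new system over the old one: T_old / T_new (assumed convention)."""
    return t_old / t_new


def relative_change(t_old, t_new):
    """Relative change in speed, (T_old - T_new) / T_new, which equals speedup - 1."""
    return (t_old - t_new) / t_new


if __name__ == "__main__":
    runs = time_call(lambda: sum(range(100_000)))
    print("fastest of", len(runs), "runs:", min(runs), "s")
    print("speedup:", speedup(10.0, 8.0))                  # 1.25
    print("relative change:", relative_change(10.0, 8.0))  # 0.25
```

Repeating the measurement matters because a single timed run mixes the quantity of interest with timer overhead and system noise, which is where the Day 2 material picks up.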
5.2 GDAT Content Day 2
- Statistical Interpretation of Data
  - What do all of these means mean?
  - Sources of measurement errors
  - Confidence intervals
  - Statistical comparison alternatives
- Design of Experiments: Part 1
  - Terminology
  - One-factor ANOVA (Analysis of Variance)
  - Two-factor ANOVA
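As a rough illustration of the confidence-interval topic, the Python sketch below computes a confidence interval for a mean and uses it for a paired before-and-after comparison. It is a sketch under stated assumptions, not the course's own code: the function names are hypothetical, and it uses the normal approximation, which is only reasonable for roughly 30 or more measurements. For smaller samples a Student's t quantile should replace the z value.

```python
import math
import statistics


def confidence_interval(samples, confidence=0.95):
    """Confidence interval for the mean, using the normal approximation.

    Assumption: n is large (about 30 or more). For small n, use a
    Student's t quantile with n - 1 degrees of freedom instead of z.
    """
    n = len(samples)
    mean = statistics.fmean(samples)
    std_err = statistics.stdev(samples) / math.sqrt(n)  # sample std dev / sqrt(n)
    z = statistics.NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return mean - z * std_err, mean + z * std_err


def significantly_different(before, after, confidence=0.95):
    """Paired comparison: the change is significant if the confidence
    interval of the per-run differences does not contain zero."""
    diffs = [a - b for b, a in zip(before, after)]
    low, high = confidence_interval(diffs, confidence)
    return not (low <= 0.0 <= high)
```

If the interval for the differences straddles zero, the measurements do not support the claim that the change made any difference at the chosen confidence level.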
5.3 GDAT Content Day 3
- Design of Experiments: Part 2
  - Generalized m-factor experiments
  - Fractional factorial designs
  - Multifactorial designs
  - Plackett-Burman design matrix
- Application to Simulations
  - Types of simulations: event-based, workload simulation
  - Random number generation
  - Verification and validation
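For readers who have not met a Plackett-Burman design before, the Python sketch below builds the 8-run design matrix for up to 7 two-level factors and estimates main effects from it. This is an illustrative sketch with hypothetical function names, assuming the commonly tabulated generator row (+ + + - + - -). It is not necessarily the construction presented in class.

```python
def plackett_burman_8():
    """Build an 8-run Plackett-Burman design for up to 7 two-level factors.

    Rows 1-7 are cyclic shifts of the generator row; row 8 sets every
    factor to its low level (-1).
    """
    generator = [1, 1, 1, -1, 1, -1, -1]  # assumed standard N=8 generator
    k = len(generator)
    design = [
        [generator[(col - shift) % k] for col in range(k)]
        for shift in range(k)
    ]
    design.append([-1] * k)
    return design


def main_effects(design, responses):
    """Estimate each factor's main effect: the mean response at the high
    level minus the mean response at the low level."""
    n_runs = len(design)
    n_factors = len(design[0])
    return [
        sum(run[j] * y for run, y in zip(design, responses)) / (n_runs / 2)
        for j in range(n_factors)
    ]


if __name__ == "__main__":
    for run in plackett_burman_8():
        print(" ".join(f"{level:+d}" for level in run))
```

The appeal of the design is economy: 8 runs screen 7 factors, whereas a full two-level factorial over the same factors would need 128 runs, at the cost of not resolving interactions.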
5.4 GDAT Content Day 4
- Introduction to Statistical Analysis Tools
  - Comparison of Excel, R, SIMUL8, SimPy, Mathematica
  - Demonstration of doing statistical analysis with R
  - Handling millions of data items quickly
  - Computing statistics, graphing the results, confidence intervals
- Guided Tour of Techniques
  - ANOVA calculations
  - Plackett-Burman designs in R
5.5 GDAT Content Day 5
- Using R to Analyze Performance Data
  - Detailed examples and case studies
  - Interfaces to SQL databases
  - Advanced R techniques for analyzing data by partitioning and processing subsets
  - Debugging your R scripts
- Advanced Techniques
  - Multivariate analysis case study
  - Data visualization techniques for performance analysis
  - Open discussion and student-specific examples
6 Instructors
6.1 David Lilja
David received the Ph.D. and M.S. degrees, both in Electrical Engineering,
from the University of Illinois at Urbana-Champaign, and a B.S. in Computer
Engineering from Iowa State University in Ames. He is currently a
Professor of Electrical and Computer Engineering,
and a Fellow of the
Minnesota Supercomputing Institute, at the University of Minnesota in
Minneapolis. He also serves as a member of the graduate faculties in
Computer Science and Scientific Computation, and was the founding Director
of Graduate Studies for Computer Engineering. He has been a visiting
senior engineer in the Hardware Performance Analysis group at IBM in
Rochester, Minnesota, and a visiting professor at the University of Western
Australia in Perth supported by a Fulbright award.
Previously, he worked as a research assistant at the Center for
Supercomputing Research and Development at the University of Illinois, and
as a development engineer at Tandem Computers Incorporated (now a division
of HP/Compaq) in Cupertino, California. He has served on the program
committees of numerous conferences; was a distinguished visitor of the IEEE
Computer Society; is a Senior member of the IEEE and a member of the ACM;
and is a registered Professional Engineer. His primary research interests
are in high-performance computer architecture, parallel computing,
nanocomputing, hardware-software interactions, and performance analysis.
6.2 Jim Holtman
Jim has a BSEE from New Mexico State University and an MSEE/Comp Sci from
the University of California at Berkeley. He worked at Bell Labs
developing a real-time operating system for the Safeguard Anti-ballistic
Missile system, which was one of the first multiprocessor systems in the
late 1960s. He worked on the development of operation support systems for
the Bell System and was named a Bell Labs Fellow for his establishment of
the architecture review process at Bell Labs.
He then worked for Convergys consulting with various groups developing
real-time billing systems for mobile carriers on their architecture and
performance issues.
He is currently retired, but is still interested in the analysis and
visualization of computer performance data. He is an advocate of the
R language for analyzing data and has taught courses on R and on systems
architecture/performance. In particular, he has presented on
this subject
at CMG.
6.3 Neil Gunther
Neil holds an M.Sc. in Applied Mathematics and a Ph.D. in theoretical
physics with Dirac Number: 2.
He is an internationally recognized performance expert who founded
Performance Dynamics Company in 1994. Prior to that, Dr. Gunther applied
his training as a theoretical physicist to research and management
positions at San Jose State University, JPL/NASA (Voyager and Galileo
missions), Xerox PARC and Pyramid/Siemens Technology. His computer
performance analysis and capacity planning classes have been given at both
corporate and academic institutions including AOL, Boeing, FedEx, Motorola,
Stanford University, Sun Microsystems (USA and EU), SAGE-Australia and
Thales Group (Holland).
Dr. Gunther is the author of numerous papers on computer performance, as
well as several books,
and is a member of the AMS, APS, ACM, CMG, IEEE, and INFORMS.
7 Terms and Conditions
Tuition Fee
See the class schedule page for current pricing.
Early Bird pricing applies if you register at least 30 days in advance of the course.
Enroll online now!
Discounts
Corporate discounts for THREE (3) or more people from the same company are also
available. Enquire when you book. Once a seat is booked, a penalty of $500 will
be imposed for a one-time transfer of that seat to another session date.
Inability to attend after such a one-time transfer will automatically forfeit
the entire registration fee.
Refunds
Requests for refunds must be received in writing at least 30 days before the
start date of the course. There is a $50 processing
fee for cancellations. Substitutions may be made at any time.
You may do a one-time transfer of your current class enrollment to hold for an alternative class,
but there will be an additional $500 fee for such transfers.
Transportation
Information will be sent upon receipt of enrollment. A packet will include
airport and transportation options.
Reservations
Enrollment is limited to 40 students. All confirmed reservations must be
accompanied by a purchase order number, a check for the tuition, or credit card
information for billing. Courtesy Reservations will be held for up to 30 days in
order for paperwork to be processed so long as there is sufficient time and
adequate space in the course.
8 Textbooks
Copies of the textbooks
Measuring Computer Performance
(Cambridge University Press, 2000),
and
Analyzing Computer System Performance with Perl::PDQ
(Springer-Verlag, 2005),
are included in the price of admission.
9 Location: Changed
Please see the class schedule page for the latest news regarding location.
Accommodations
A block of rooms has been set aside for Performance Dynamics Company students at
a special corporate rate
(See class schedule page).
Students must make their own reservations by
calling the hotel and identifying themselves as a Performance Dynamics
enrollee. You cannot book a room online at this special rate. You must call the
hotel and tell 'em we sent you.
Meals
Breakfast, lunch, morning and afternoon breaks will be catered for by the hotel
each day. See the
Mini Survival Guide,
which explains how to get to the hotel and lists local restaurants to eat at once you
do.
Travel Tips
If you decide to take the
BART (Bay Area Rapid Transit)
train, call the hotel from the train on your cellphone and
they will pick you up at the Pleasanton BART station (not the Castro Valley station)
and shuttle you to the hotel.
People coming from the 'Right' Coast may want to check air fares at JetBlue.
In May 2001, some students flew from New York to Oakland direct for $300 return.
File translated from TeX by TTH, version 3.38, on 13 Feb 2008, 16:55.