Guerrilla Data Analysis Techniques

New Expanded 5-Day Course

with
Prof. David Lilja, Mr. Jim Holtman, and Dr. Neil Gunther

Contents

1  Why You Need This Course
2  Certification
3  Course Goals
4  Dates
5  Course Structure
    5.1  GDAT Content Day 1
    5.2  GDAT Content Day 2
    5.3  GDAT Content Day 3
    5.4  GDAT Content Day 4
    5.5  GDAT Content Day 5
6  Instructors
    6.1  David Lilja
    6.2  Jim Holtman
    6.3  Neil Gunther
7  Terms and Conditions
8  Textbooks
9  Location: Changed

1  Why You Need This Course

Many Guerrilla alumni have asked for this class. Why? Well, they've collected cubic light years of performance data, but then they realize that anyone could have pushed the same buttons they did to collect that data. No job security there. They want to set themselves apart by transforming that raw performance data into performance information by applying Guerrilla techniques. That's exactly what we teach you in this class.
Moreover, the data analysis techniques we present are general purpose, and therefore not tied to any particular computing platform or data collection tools.

2  Certification

This class corresponds to Guerrilla Capacity Planner: Level III certification. The levels are defined as:
  1. Entry level, e.g., Guerrilla Boot Camp.
  2. Exposure to a wide variety of computer systems capacity planning concepts, methods, and tools that can be adapted opportunistically to support the needs of enterprise-level platform-independent performance management. An example class is Guerrilla Capacity Planning.
  3. Detailed study of a particular capacity planning technique or performance analysis tool.

A printed certificate reflecting the level of achievement is awarded to each attendee who completes the course.

    Official Purpose

    This new 5-day course falls naturally into two parts:
    1. An easy introduction to both simple and sophisticated statistical concepts. We begin with a comparison of the three primary techniques used to measure and evaluate the performance of computer systems, an in-depth look at the metrics used to characterize performance, and a survey of the strategies used in the fundamental measurement tools and techniques. The focus then shifts to provide a gentle introduction to some of the key statistical tools and techniques needed to interpret noisy performance measurements and to understand complex simulation results. We will also examine techniques that can be used to appropriately design experiments to obtain the maximum amount of information for a given level of experimental effort. The course then concludes with a discussion of the key issues related to system simulation.

    2. Demonstrations of how to apply those concepts. We use tools like Excel, R, SIMUL8, SimPy and Mathematica applied to actual computer performance data.

    3  Course Goals

    After completing this course, the participants will be able to:

    4  Dates

    Check the schedule page for the latest information.
    Online registration is available. Additional registration details are provided at the end of this page.

    Who Should Attend

    This class is intended for application scientists and engineers, computer architects, compiler writers, and software engineers who use or design high-performance computer systems. The level of the presentation is appropriate for both practitioners and students. Experts from any scientific discipline will find this class useful in helping to understand how to appropriately measure and statistically analyze the performance of their systems and applications.
    Content level: 20% beginner, 60% intermediate, 20% advanced.

    5  Course Structure

    Class begins at 9am and ends at 5pm each day.
    A serviced morning break of half an hour occurs around 10:30am.
    Seated lunch service is provided from noon until 1pm.
    A serviced afternoon break of half an hour occurs around 3:00pm.
    A number of practical exercises will be given and discussed throughout the five days. You are encouraged to bring a laptop computer.

    5.1  GDAT Content Day 1

    Introduction
    • Measurement
    • Simulation
    • Analytical modeling
    Performance Metrics
    • Characteristics of good metrics
    • Processor and system metrics
    • Speedup and relative change
    Measurement Tools and Techniques
    • Strategies
    • Interval timers
    • Program profiling
    • Tracing
    • Indirect measurement

    5.2  GDAT Content Day 2

    Statistical Interpretation of Data
    • What do all of these means mean?
    • Sources of measurement errors
    • Confidence intervals
    • Statistical comparison alternatives
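As a taste of the confidence-interval topic above, here is a minimal sketch rather than course material. It assumes a normal approximation (a Student's t quantile would give a wider interval for small samples), and the function name `mean_confidence_interval` and the response-time values are hypothetical, chosen only for illustration.

```python
import math
import statistics


def mean_confidence_interval(samples, confidence=0.95):
    """Return (mean, low, high) for the sample mean.

    Assumption: uses the normal quantile, which is only an
    approximation when the number of measurements is small.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two measurements")
    mean = statistics.fmean(samples)
    # Standard error of the mean from the sample standard deviation.
    std_err = statistics.stdev(samples) / math.sqrt(n)
    # Two-sided quantile, e.g. about 1.96 for 95% confidence.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2.0)
    half_width = z * std_err
    return mean, mean - half_width, mean + half_width


# Hypothetical response-time measurements in milliseconds.
response_ms = [102.0, 98.5, 101.2, 99.8, 100.4, 103.1, 97.9, 100.9]
mean, low, high = mean_confidence_interval(response_ms)
print(f"mean = {mean:.2f} ms, 95% CI = [{low:.2f}, {high:.2f}] ms")
```

If the intervals of two alternatives overlap heavily, the measured difference between them may be nothing more than noise.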
    Design of Experiments: Part 1
    • Terminology
    • One-factor ANOVA (Analysis of Variance)
    • Two-factor ANOVA
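The one-factor ANOVA calculation listed above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the course's own code: the function name `one_factor_anova` and the timing data are hypothetical, and the resulting F value would still need to be compared against a tabulated critical value of the F distribution.

```python
def one_factor_anova(groups):
    """Return (ssa, sse, df_a, df_e, f_stat) for a list of groups.

    Each group holds the measurements for one alternative.
    Assumes at least two groups and nonzero within-group variation.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Variation between alternatives (sum of squares due to the factor).
    ssa = sum(len(g) * (m - grand_mean) ** 2
              for g, m in zip(groups, group_means))
    # Variation within alternatives (sum of squares due to error).
    sse = sum((x - m) ** 2
              for g, m in zip(groups, group_means) for x in g)

    df_a = k - 1          # degrees of freedom for the factor
    df_e = n_total - k    # degrees of freedom for the error
    f_stat = (ssa / df_a) / (sse / df_e)
    return ssa, sse, df_a, df_e, f_stat


# Hypothetical execution times (seconds) for three system alternatives.
times = [[10.0, 12.0, 11.0], [14.0, 15.0, 16.0], [20.0, 19.0, 21.0]]
ssa, sse, df_a, df_e, f_stat = one_factor_anova(times)
print(f"SSA = {ssa:.2f}, SSE = {sse:.2f}, F({df_a}, {df_e}) = {f_stat:.2f}")
```

A large F value suggests that the differences between alternatives are large relative to the measurement noise within each alternative.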

    5.3  GDAT Content Day 3

    Design of Experiments: Part 2
    • Generalized m-factor experiments
    • Fractional factorial designs
    • Multifactorial designs
    • Plackett-Burman design matrix
    Application to Simulations
    • Types of simulations: event-based, workload simulation
    • Random number generation
    • Verification and validation

    5.4  GDAT Content Day 4

    Introduction to Statistical Analysis Tools
    • Comparison of Excel, R, SIMUL8, SimPy, Mathematica
    • Demonstration of doing statistical analysis with R
    • Handling millions of data items quickly
    • Computing statistics, graphing the results, confidence intervals
    Guided Tour of Techniques
    • ANOVA calculations
    • Plackett-Burman designs in R

    5.5  GDAT Content Day 5

    Using R to Analyze Performance Data
    • Detailed examples and case studies
    • Interfaces to SQL databases
    • Advanced R techniques for analyzing data by partitioning and processing subsets
    • Debugging your R scripts
    Advanced Techniques
    • Multivariate analysis case study
    • Data visualization techniques for performance analysis
    • Open discussion and student-specific examples

    6  Instructors

    6.1  David Lilja

    David received the Ph.D. and M.S. degrees, both in Electrical Engineering, from the University of Illinois at Urbana-Champaign, and a B.S. in Computer Engineering from Iowa State University in Ames. He is currently a Professor of Electrical and Computer Engineering, and a Fellow of the Minnesota Supercomputing Institute, at the University of Minnesota in Minneapolis. He also serves as a member of the graduate faculties in Computer Science and Scientific Computation, and was the founding Director of Graduate Studies for Computer Engineering. He has been a visiting senior engineer in the Hardware Performance Analysis group at IBM in Rochester, Minnesota, and a visiting professor at the University of Western Australia in Perth supported by a Fulbright award.
    Previously, he worked as a research assistant at the Center for Supercomputing Research and Development at the University of Illinois, and as a development engineer at Tandem Computers Incorporated (now a division of HP/Compaq) in Cupertino, California. He has served on the program committees of numerous conferences; was a distinguished visitor of the IEEE Computer Society; is a Senior member of the IEEE and a member of the ACM; and is a registered Professional Engineer. His primary research interests are in high-performance computer architecture, parallel computing, nanocomputing, hardware-software interactions, and performance analysis.

    6.2  Jim Holtman

    Jim has a BSEE from New Mexico State University and an MSEE/Comp Sci from the University of California at Berkeley. He worked at Bell Labs developing a real-time operating system for the Safeguard Anti-ballistic Missile system, which was one of the first multiprocessor systems in the late 1960s. He worked on the development of operation support systems for the Bell System and was named a Bell Labs Fellow for his establishment of the architecture review process at Bell Labs.
    He then worked for Convergys consulting with various groups developing real-time billing systems for mobile carriers on their architecture and performance issues.
    He is currently retired, but is still interested in the analysis and visualization of computer performance data. He is an advocate of the R-language for analyzing data and has taught courses on R and on systems architecture/performance. In particular, he has presented on this subject at CMG.

    6.3  Neil Gunther

    Neil holds an M.Sc. in Applied Mathematics and a Ph.D. in theoretical physics with Dirac Number: 2. He is an internationally recognized performance expert who founded Performance Dynamics Company in 1994. Prior to that, Dr. Gunther applied his training as a theoretical physicist to research and management positions at San Jose State University, JPL/NASA (Voyager and Galileo missions), Xerox PARC and Pyramid/Siemens Technology. His computer performance analysis and capacity planning classes have been given at both corporate and academic institutions including AOL, Boeing, FedEx, Motorola, Stanford University, Sun Microsystems (USA and EU), SAGE-Australia and Thales Group (Holland).
    Dr. Gunther is the author of numerous papers on computer performance, as well as several books, and is a member of the AMS, APS, ACM, CMG, IEEE, and INFORMS.

    7  Terms and Conditions

    Tuition Fee

    See class schedule page for current pricing. Early Bird applies if registered 30 days in advance of the course. Enroll online now!

    Discounts

    Corporate discounts for THREE (3) or more people from the same company are also available. Enquire when you book. Once a seat is booked, a penalty of $500 will be imposed for a one-time transfer of that seat to another session date. Inability to attend after such a one-time transfer will automatically forfeit the entire registration fee.

    Refunds

    Requests for refunds must be received in writing at least 30 days before the start date of the course. There is a $50 processing fee for cancellations. Substitutions may be made at any time.
    You may do a one-time transfer of your current class enrollment to hold for an alternative class, but there will be an additional $500 fee for such transfers.

    Transportation

    Information will be sent upon receipt of enrollment. A packet will include airport and transportation options.

    Reservations

    Enrollment is limited to 40 students. All confirmed reservations must be accompanied by a purchase order number, a check for the tuition, or credit card information for billing. Courtesy Reservations will be held for up to 30 days in order for paperwork to be processed so long as there is sufficient time and adequate space in the course.

    8  Textbooks

    A copy of the textbooks Measuring Computer Performance (Cambridge University Press, 2000), and Analyzing Computer System Performance with Perl::PDQ (Springer-Verlag, 2005), are included in the price of admission.

    9  Location: Changed

    Please see the class schedule page for the latest news regarding location.

    Accommodations

    A block of rooms has been set aside for Performance Dynamics Company students at a special corporate rate (see the class schedule page). Students must make their own reservations by calling the hotel and identifying themselves as a Performance Dynamics enrollee. You cannot book a room online at this special rate. You must call the hotel and tell 'em we sent you.

    Meals

    Breakfast, lunch, and morning and afternoon breaks will be catered by the hotel each day. See the Mini Survival Guide, which explains how to get to the hotel and lists local restaurants to eat at once you do.

    Travel Tips

    If you decide to take the BART (Bay Area Rapid Transit) train, call the hotel from the train on your cellphone and they will pick you up at the Pleasanton BART station (not the Castro Valley station) and shuttle you to the hotel.
    People coming from the 'Right' Coast may want to check air fares at JetBlue. In May 2001, some students flew from New York to Oakland direct for $300 return.


