The Dynamics of Performance Transients in Large-scale Communication Networks

The Dynamics of Performance Transients in Large-scale Communication Networks

Neil J. Gunther

Performance Dynamics Consulting
Castro Valley, California, USA

OTHER TALKS
In order to view the mathematical notations correctly, check here before continuing.

Abstract

The failure of important communication networks over the last fifteen years has become all too familiar. The entire AT&T phone system was brought to its knees in 1990 by a software bug that escaped the QA process. And a similar problem arose with AT&T's frame-relay backbone on April 14, 1998 which shut down banking ATMs and other major business activities in the western USA. These are examples of sudden, and hard failures, due to ''programming'' errors of one kind or another.

A more subtle, and softer failure mode occurs when a network becomes degraded very suddenly even though the mean traffic intensity on the network remains constant. This sudden congestion manifests itself as a spontaneous collapse in performance, seen as orders-of-magnitude drop in packets/second delivered or a concomitant increase in packet delay. Such events have been witnessed on the Internet and in 1986 it led to the implementation of the TCP/IP ''slow start'' congestion avoidance algorithm. Now, the same algorithm that was intended to avoid high packet latency is now responsible for high latencies in the HTTP protocol of the Web. Some internet researchers have suggested we may be facing ''deja vu all over again.'' In other words, these effects are subtle and local quick fixes become less appropriate for the large-scale networks of the future. A more global systems understanding is imperative.

A variety of mathematical modeling techniques (e.g., Catastrophe Theory [7], Large Deviations Theory [1], [10], [9]) have been applied to the stability analysis of communication networks. This talk will elaborate on the Path-Integral [2] or Instanton [4] technique 1, due to the author [3], and summarized in PART III of my book The Practical Performance Analyst The problem of estimating the mean time to spontaneous network degradation, E{T}, is analogous to calculating the tunneling amplitude for the quantum mechanical decay of an atom (or the Wick-rotated 2 version, more accurately). The asymptotic form of the (real-valued stochastic) estimator derived from the Instanton solution is given by the formula:


lim
N Æ  
E{T} ~ K eN S0
(1)
where

  1. N represents the scale of the network
  2. So is the Action or objective function for the instanton solution
  3. K is a prefactor corresponding to oscillations about the instanton solution.

Numerical results [5] have demonstrated the accuracy of the Instanton technique compared with other calculational methods. The advantages of the sum-over-sample-paths or Instanton technique are threefold:

  1. It makes the dynamics of large transients intuitively obvious.
  2. It furnishes an estimator for the mean time to transition that is most appropriate for large networks.
  3. It provides the connection with (and the corrections to) the Large Deviations estimator.

During the presentation, contact will also be made with Catastrophe Theory (which also leads immediately to a Universality Hypothesis for large-scale computer systems), the Theory of Large Deviations (and corrections from the K factor in eqn 1), and applications to ATM network [9] admission control.

References

[1]
Cottrell, M., Fort, J-C., and Malgouyres, G. 1983. ''Large Deviations and Rare Events in the Study of Stochastic Algorithms,'' IEEE Trans. Comms. 28(2): 907
[2]
Feynman, R. P. and A. R. Hibbs. 1965. Quantum Mechanics and Path Integrals. New York: McGraw-Hill.
[3]
Gunther, N. J. 1989. ''Path Integral Methods for Computer Performance Analysis,'' Information Processing Letters. 32(1): 7-13.
[4]
Gunther, N. J. 1990. ''Instanton Techniques for Queueing Models of Large Computer Systems: Getting a Piece of the Action,'' Invited paper SIAM Conf. Appl. Prob. in Science and Eng. New Orleans, Louisiana.
[5]
Gunther, N. J., and Shaw, J. G. 1990. ''Path Integral Evaluation of ALOHA Network Transients,'' Information Processing Letters, 33(6): 289-295.
[6]
Gunther, N. J. 2000. The Practical Performance Analyst, Lincoln, Nebraska: iUniverse.com Inc.
[7]
Nelson, R. 1987. ''Stochastic Catastrophe Theory in Computer Performance Modeling,'' Journal ACM. 34: 661.
[8]
Schwartz, A., and Weiss, A. 1995. Large Deviations for Performance Analysis: Queues, Communications, and Computing. London: Chapman & Hall.
[9]
Walrand, J., and Variaya, P. 1996. High Performance Communication Networks. San Francisco: Morgan Kaufmann.
[10]
Weiss, A. 1986. ''A New Technique for Analyzing Large Traffic Systems,'' Adv. Appl. Prob., 18, 506.


Footnotes:

1 The so-called ''instanton'' (or Euclidean pseudoparticle) was first discussed by the Dutch theoretical physicist Gerard 't Hooft in ''Computation of the quantum effects due to a four-dimensional pseudoparticle,'' Phys.Rev. D14:3432-3450,1976

2 A one page summary of quantum tunneling and its relationship to the instanton can be read here.


File translated from TEX by TTH, version 2.25.
On 23 Sep 2003, 10:28.