Hypernets - Good (G)news for Gnutella

by

Dr. Neil Gunther
Performance Dynamics Consulting

February 15, 2002

Copyright © 2002 Performance Dynamics Company. All Rights Reserved.

This article was posted to the LANL preprint server on Feb 16, 2002 and was slashdotted on Feb 17, 2002. A few good points were raised on slashdot and I intend to progressively address them in a Postscript (section 7) as an update to this online version. I will entertain any sensible discussion via this email form initially and direct email thereafter.

Abstract

Criticism of Gnutella network scalability has rested on the bandwidth attributes of the original interconnection topology: a Cayley tree. Trees, in general, are known to have lower aggregate bandwidth than higher dimensional topologies e.g., hypercubes, meshes and tori. Gnutella was intended to support thousands to millions of peers. Studies of interconnection topologies in the literature, however, have focused on hardware implementations which are limited by cost to a few thousand nodes. Since the Gnutella network is virtual, hyper-topologies are relatively unfettered by such constraints. We present performance models for several plausible hyper-topologies and compare their query throughput up to millions of peers. The virtual hypercube and the virtual hypertorus are shown to offer near linear scalability subject to the number of peer TCP/IP connections that can be simultaneously kept open.

Contents

1  Introduction
2  Tree Topologies
    2.1  Binary Tree
    2.2  Rooted Tree
    2.3  Cayley Tree
3  Hypernet Topologies
    3.1  Hypercube
    3.2  HyperTorus
4  Performance Metrics
    4.1  Network Diameter (δ)
    4.2  Total Nodes (N)
    4.3  Path Length
    4.4  Internal Path Length (P)
    4.5  Average Number of Hops (H)
    4.6  Number of Network Links (L)
    4.7  Network Demand (D_link)
    4.8  Peer Demand (D_peer)
    4.9  Bandwidth (X)
5  Relative Bandwidth
    5.1  Cayley Trees
    5.2  Trees and Cubes
    5.3  Cubes and Tori
    5.4  Ranked Performance
6  Conclusions
7  Postscript
    7.1  Physical bandwidth constraints
    7.2  Routing in software to make a given topology

1  Introduction

The Gnutella network belongs to a class of open-source virtual networks known as Peer-to-Peer, or P2P, networks. Compared with the more ubiquitous client-server distributed architectures, every P2P node (or servant) can act as both a client and a server. Many client-server applications, e.g., commercial databases [DeWitt and Gray 1992], have multiple clients (users) accessing a centralized server (see, e.g., [Gunther 2000] Chap. 8). Conversely, P2P network applications are usually completely decentralized.
Finding applications that can make efficient use of P2P is the current gating factor for their widespread adoption. So far, P2P networks have been employed for such applications as the Napster music file-sharing service, and the SETI@Home project, although those implementations rely on a significant centralized server component.
The initial release of Gnutella in 2000 led to the perception that the intrinsic architecture may not be capable of scaling to meet the sharing demands of millions of anticipated users.[1] Similar concerns about scalability have arisen in the context of hypergrowth traffic impinging on popular e-commerce Web sites [Gunther 2001]. Based on measurements of popular queries, [Sripan 2001] proposed that Gnutella scaling problems could be ameliorated through the implementation of appropriate caching strategies. Measurements by [AdaHub 2000] indicated that there were more readers than writers involved in file sharing. They suggested that such a "free ride" could lead to higher than expected load on the P2P network, thereby degrading its performance as well as increasing its vulnerability to fragmentation.
A mathematical analysis by [Ritter 2001] (one of the original developers of Napster) presented a detailed numerical argument demonstrating that the Gnutella network could not scale to the capacity of the competitor Napster network.[2] Essentially, that model showed that the Gnutella network is severely bandwidth-limited long before the P2P population reaches a million peers. In each of these previous studies, the conclusions have overlooked the intrinsic bandwidth limits of the underlying distributed topology [Minar 2002] in the Gnutella network: a Cayley tree [RaiSlo 1999]. (See Section 2 for the definition.)
Trees are known to have lower aggregate bandwidth than higher-dimensional topologies, e.g., hypercubes and hypertori. Studies of interconnection topologies in the literature have tended to focus on hardware implementations (see, e.g., [Cull et al. 1996], [Buyya 1999], [AlmGot 1994] and [PatHen 1996]), which are generally limited by the cost of the chips and wires to a few thousand nodes. P2P networks, on the other hand, are intended to support hundreds of thousands to millions of simultaneous peers and, since they are implemented in software, hyper-topologies are relatively unfettered[3] by the economics of hardware.
In this paper, we analyze the scalability of several alternative topologies and compare their throughput up to 2-3 million peers. The virtual hypercube and the virtual hypertorus offer near-linear scalable bandwidth subject to the number of peer TCP/IP connections that can be simultaneously kept open. We adopt the abbreviation hypernet for these alternative topologies. The assumptions about the distribution of peer activity are similar to those employed by Ritter. This is appropriate since our purpose is to rank the relative performance of these hypernets rather than to predict their absolute performance.

2  Tree Topologies

In the subsequent discussion, the P2P network is treated as a graph, i.e., a set of nodes or vertices connected by a set of edges or links. The nodes correspond to network peers and the links to network connections.
Because the tree structure of the Gnutella network has been such a hidden determinant underlying the conclusions drawn in previous scalability studies, we commence our performance comparisons by distinguishing clearly among the relevant tree topologies. Topologically, all trees are planar and thus have d = 2 spatial dimensionality.

2.1  Binary Tree

The binary tree is familiar in the computing context by virtue of its ubiquity as a parsing and storage data structure [Wirth 1976]. There is a unique root node which is connected to two child nodes, and each of those children is connected to another pair of children, and so on. At level h in the tree (counting the root as level zero) there are 2^h nodes, so the number of nodes grows as a binary exponential. Because of its relatively sparse nodal density, the binary tree is rarely employed as a bona fide interconnection network.

2.2  Rooted Tree

A rooted tree is simply the generalization of a binary tree in which every node has v children; the root then has vertex degree v, while every other node has vertex degree v + 1. The total number of nodes in a tree of height h is the sum of a geometric series:

    N_tree(h) = (v^h − 1) / (v − 1)        (1)

2.3  Cayley Tree

A Cayley tree [RaiSlo 1999] has no root. Recalling the binary tree, what was the root of the parent binary tree now gains a link to another binary sub-tree of height one less than the parent. All nodes thus become trivalent, with v = 3 at every level. More generally, for a v-valent tree whose peer horizon is h hops, the total number of nodes is:

    N_cay(h) = 1 + v [(v − 1)^(h−1) − 1] / (v − 2)        (2)

which, for a given vertex degree, is denser than (1).
This is the central formula used in the scalability analysis of Ritter. The network he analyzed is thus a Cayley tree with vertex degree (v) corresponding to the number of open network connections per servant. [Ritter 2001] analyzed valences in the range v = 4, ..., 8; the former value being the default setting in the original Gnutella release, and the latter more closely matching the population claimed for the contemporaneous Napster network.
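Since (2) is the central formula in what follows, a short Python sketch (the function name cayley_nodes is ours, purely illustrative) makes it easy to reproduce the peer populations quoted later in Section 5.4:

    # Total peers in a v-valent Cayley tree (v >= 3) whose horizon is h hops,
    # per equation (2). The integer division is exact because
    # (v-1)^(h-1) - 1 is always divisible by v - 2.
    def cayley_nodes(v, h):
        return 1 + v * ((v - 1) ** (h - 1) - 1) // (v - 2)

    print(cayley_nodes(4, 13))   # 1,062,881: the 4-Cayley row of Table 3
    print(cayley_nodes(8, 8))    # 1,098,057: the 8-Cayley row of Table 3
    print(cayley_nodes(20, 6))   # 2,751,221: the 20-Cayley row of Table 3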

3  Hypernet Topologies

An alternative to bandwidth-limited trees is a topology with higher dimensionality. We examine the performance attributes of two hypernets in particular: the binary hypercube and the hypertorus, each in d-dimensions.

3.1  Hypercube

In a boolean or binary hypercube each node forms the vertex of a d-dimensional cube. The number of nodes is simply 2^d and the degree of each vertex (v) is equal to the dimensionality (d) of the network. Hence, each node can be enumerated or addressed using a base-2 (binary) d-digit number.
Moreover, since neighboring nodes differ in address by only one bit, routing a message on the hypercube becomes a simple matter of successively correcting the bits in which the current node's address differs from that of the destination, one dimension per hop.
In d = 3 dimensions the hypercube is simply a cube. Each vertex has degree v = 3, and there are 2^3 = 8 nodes. A 4-dimensional hypercube can be visualized by spatially translating a 3-cube such that the locus of its 8 vertices traces out the additional connections.
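To make the routing rule concrete, here is a minimal Python sketch (ours, not from any Gnutella implementation) of the bit-correction scheme just described:

    # Route a message on a d-dimensional binary hypercube by correcting,
    # one dimension at a time, each bit in which the current node's
    # address differs from the destination's.
    def hypercube_route(src, dst, d):
        path = [src]
        node = src
        for bit in range(d):
            if (node ^ dst) & (1 << bit):   # addresses differ in this dimension
                node ^= 1 << bit            # hop to the neighbour across it
                path.append(node)
        return path

    # On a 4-cube, node 0b0000 reaches 0b1011 in 3 hops (the Hamming distance):
    print(hypercube_route(0b0000, 0b1011, 4))   # [0, 1, 3, 11]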

3.2  HyperTorus

A d-dimensional hypertorus is a d-dimensional grid with each node connected into a ring of nodes in each of the d orthogonal dimensions. The hypertorus reduces to the binary hypercube when there are only 2 nodes in each ring.
The simplest visualization is, once again, in 3 dimensions. A 2-dimensional grid is first wrapped about one axis such that the edges join to form a tube. The tube is then wrapped about the orthogonal axis such that the open ends of the tube become joined. The result is a 3-torus, otherwise known as a donut.
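In software the wrapping is nothing more than modular arithmetic on node coordinates. A minimal sketch (assuming k peers per ring, so that N = k^d):

    # Neighbours of a node in a d-dimensional hypertorus with k nodes per
    # ring: step +1 and -1 along each axis, wrapping modulo k.
    def torus_neighbours(coord, k):
        nbrs = []
        for axis in range(len(coord)):
            for step in (1, -1):
                n = list(coord)
                n[axis] = (n[axis] + step) % k   # the wrap that closes each ring
                nbrs.append(tuple(n))
        return nbrs

    # A node in a 3-torus has 2d = 6 neighbours; note the wrap-around
    # neighbour (0, 0, 0) of the node below:
    print(torus_neighbours((0, 0, 127), k=128))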
All of these topologies fall into a class known as single stage networks and are relatively easy to implement in software. The more exotic topologies, such as cube-connected cycles, butterflies and other multistage [AlmGot 1994] networks are not considered here because they are likely to be more difficult to implement.

4  Performance Metrics

4.1  Network Diameter (δ)

The notion of a network diameter is analogous to the diameter for a circle. There, it is the maximum chordal length between two points on the circumference. For a network, it is the maximum number of communication links that must be traversed to send a message to any node along the shortest path. It represents a lower bound on the latency to propagate messages throughout the entire network.
Topology      δ
Tree          2h
Hypercube     d
Torus         d N^(1/d) / 4

Table 1: Network diameters.
In 1997 the Web was estimated to comprise more than half a million sites [Gray 1997]. By 2001, it was estimated [OCLC 2001] to have grown to 3.1 million publicly accessible sites.
The diameter of the Web has been estimated [Reka et al. 1999] to be about 20 hops. If the Web is modelled as a Cayley tree, its height would be half the diameter, i.e., h = δ/2 = 10 hops. A Cayley tree with vertex degree 5 (connections per node) would then contain just under half a million nodes, while one with vertex degree 6 would contain nearly 3 million (2,929,687) nodes.
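Substituting v = 6 and h = 10 into (2) makes the arithmetic explicit:

    N_cay(10) = 1 + 6 (5^9 − 1) / 4 = 1 + 6 × 488,281 = 2,929,687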

4.2  Total Nodes (N)

The total number of peer nodes in the P2P network. For a binary tree:

    N(h) = Σ_{k=1}^{h} 2^(k−1)        (3)

For a d-dimensional binary hypercube the number of nodes is 2^d.

4.3  Path Length

The path length is the maximal distance between a leaf node and the root. For a tree, it is half the diameter. The path length corresponds to the peer horizon used by [Ritter 2001] in his analysis. A better measure of network latency is the average number of hops (H), which we shall define shortly.

4.4  Internal Path Length (P)

The internal path length is the total number of paths between all nodes. For a binary tree of depth h, the total number of paths is:

    P(h) = Σ_{k=1}^{h} k N(k)        (4)

4.5  Average Number of Hops (H)

Since the network diameter is a maximal distance, it tends to overestimate message latency. A better measure is the average number of hops between source and destination. This quantity is found by dividing the internal path length in (4) by the total number of nodes in (3):

    H = P / N        (5)

It corresponds to the average number of network hops traversed by a P2P query.
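These definitions are easily checked numerically. The following minimal Python sketch (function names ours) transcribes (3)-(5) for the binary tree:

    # Equations (3)-(5) for a binary tree of height h.
    def n_bin(h):                  # total nodes, eq. (3)
        return sum(2 ** (k - 1) for k in range(1, h + 1))

    def p_bin(h):                  # internal path length, eq. (4)
        return sum(k * n_bin(k) for k in range(1, h + 1))

    def h_avg(h):                  # average number of hops, eq. (5)
        return p_bin(h) / n_bin(h)

    print(n_bin(10), round(h_avg(10), 2))   # 1023 nodes, H = 17.97 hops

For a tree of height h = 10, H is about 18 hops, somewhat less than the diameter 2h = 20, which illustrates why δ overestimates message latency.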

4.6  Number of Network Links (L)

This is a measure of the number of physical network links.
Topology      L
Tree          N_tree
Hypercube     d N_cube / 2
Torus         d N_torus

Table 2: Network links.
As shown in Table 2, L scales with the number of physical nodes (N) for the topologies we consider.

4.7  Network Demand (D_link)

The transit frequency across a link, f_link, is a measure of the average query traffic carried per link. Under the assumption of uniform message routing, it can be defined as:

    f_link = H / L        (6)

If the latency across a link is denoted by S_link, then the total service demand [Gunther 2000] is:

    D_link = f_link S_link        (7)

For simplicity, and without loss of generality, we normalize the network demand to unit periods (S_link = 1).

4.8  Peer Demand (D_peer)

Similarly for the peers, with node latency S_peer. Under the assumption of uniform message routing, the transit frequency per peer is:

    f_peer = 1 / N        (8)

and the total peer service demand is:

    D_peer = S_peer / N        (9)

Again, we normalize the peer demand to unit periods (S_peer = 1) in the subsequent discussion.

4.9  Bandwidth (X)

It follows from Little's law, U = X D (see, e.g., [Gunther 2000] p. 44), that when any node in the network reaches saturation (U = 1), the maximum system throughput is determined by:

    X_max = 1 / Max[D_peer, D_link_1, D_link_2, ...]        (10)

The node with the largest service demand, D_max, is the system bottleneck. The service demand at the bottleneck therefore determines the maximum system throughput.
With these metrics defined, we are in a position to compare the asymptotic performance of each of the topologies described in Sections 2 and 3.

5  Relative Bandwidth

Since we are interested in network scalability up to a few million peers, it is sufficient to base the comparison on the asymptotic network throughput defined in (10). In particular, we will rank the above hypernets according to their relative maximal bandwidth,

    X_relative = X_max(N) / N        (11)

where N is the number of peers in the horizon (Table 3 at the end of this section). X_relative = 1.0 corresponds to linear scalability, since then X_max = N in (11).
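For the binary hypercube, the chain of definitions (6)-(11) can be evaluated directly. A minimal Python sketch (our own illustration; it assumes the average hop count of a d-cube is H = d/2, consistent with the 10 hops shown for the 20-cube in Table 3):

    # Relative bandwidth of a d-dimensional binary hypercube, per
    # equations (6)-(11), with S_link = S_peer = 1.
    def cube_relative_bandwidth(d):
        N = 2 ** d                        # total peers (Section 4.2)
        L = d * N / 2                     # number of links (Table 2)
        H = d / 2                         # average hops between random peers
        d_link = H / L                    # eqs. (6) and (7): equals 1/N
        d_peer = 1 / N                    # eqs. (8) and (9)
        x_max = 1 / max(d_peer, d_link)   # eq. (10): bottleneck throughput
        return x_max / N                  # eq. (11)

    print(cube_relative_bandwidth(20))    # 1.0, i.e. the 100% row in Table 3

Because D_link collapses to 1/N, exactly matching D_peer, the cube's bottleneck throughput grows in step with the population; this is the analytic reason for its linear scalability.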
In several respects our approach is similar to that taken by [Cull et al. 1996] for their LogP model of assessing parallel hardware performance. In both approaches, the respective network topology enters into the performance model via the network demand defined in (7) and (9).

5.1  Cayley Trees

First, we consider the relative performance of tree topologies. Fig. 1 shows the normalized bandwidths of a 4th-degree rooted tree, a 4-valent Cayley tree, and an 8-valent Cayley tree.
Figure 1: Relative throughput of binary and Cayley trees.
The 4-valent Cayley tree represents the default peer connectivity in the original release of Gnutella. Similarly, the 8-valent Cayley tree corresponds to Ritter's comparison with Napster scalability. The curves in Fig. 1 terminate at different peer populations because the population can only take discrete values, which are dramatically affected by the vertex degree and the height of the tree.
We see immediately that the 8-valent Cayley tree has the greatest bandwidth up through 2 million peers. The 4-valent Cayley tree has the lowest bandwidth, even lower than the rooted tree. This follows from the fact that at its root the 4-tree has the same connectivity as the 4-Cayley tree, but all of its descendants have vertex degree 5 (four children plus a parent). Even for the 8-Cayley, at 2 million peers the bandwidth is less than one quarter of linear scalability.

5.2  Trees and Cubes

We next consider the relative performance of high degree trees and hypercubes.
Figure 2: Relative throughput of Cayley trees and hypercubes.
In particular, Fig. 2 shows the normalized bandwidths for an 8-Cayley (the best throughput of the trees considered in Fig. 1), a 20-Cayley, and a binary hypercube. The d-dimensional hypercube clearly exhibits superior scalability.

5.3  Cubes and Tori

Of these high-order topologies, the binary hypercube offers linearly scalable bandwidth beyond one million active peers (Fig. 3). The 10-dimensional hypertorus has comparable scalability up to one million peers but degrades beyond that point.
Figure 3: Relative throughput of hypercubes and hypertori.
The 3-dimensional hypertorus is also shown for comparison since that topology has been used in large-scale hardware implementations up to several hundred nodes per cluster (e.g., the Tandem Himalaya).

5.4  Ranked Performance

The main results of our analysis are summarized in Table 3 which shows each of the topologies ranked by their relative bandwidth as defined in (11).
Topology     Connections   Hops to   Peers ×10^6   Relative
             per Peer      Horizon   in Horizon    Bandwidth (%)
20-Cube          20           10         2.1           100
10-Torus         20           11         2.1            93
5-Torus          10           23         2.1            22
20-Cayley        20            6         2.8            16
8-Cayley          8            8         1.1            13
4-Tree            4           11         1.4            12
3-Torus           6           96         2.1            10
4-Cayley          4           13         1.1             8

Table 3: Topologies ranked by maximal relative bandwidth.
The 20-dimensional hypercube outranks all other contenders on the basis of query throughput. For an horizon containing 2 million peers, each servant must maintain 20 open connections, on average. This is well within the capacity limits of most TCP/IP implementations [Stevens 1990].
The 10-dimensional hypertorus is comparable to the 20-hypercube in bandwidth up to an horizon of 1 million peers but falls off by almost 10% at 2 million peers. The 10-torus is also arguably a more difficult topology to implement.
The 20-valent Cayley tree is included since the number of connections per peer is the same as that for the 20-cube and the 10-torus. An horizon of 6 hops was used for comparison because the peer population is only 144,801 nodes at 5 hops. Similarly for the 8-Cayley, a 9-hop horizon would contain 7.7 million peers. These large increments are a direct consequence of the high vertex degree per node.
The 4-Cayley (modeling early Gnutella) and 8-Cayley (modeling the Napster population) show relatively poor scalability at 1 million peers. Even doubling the number of connections per peer produces slightly better than 50% improvement in throughput. This confirms the conclusions reached by [Ritter 2001] and, moreover, supports our proposal to consider hypernet topologies.

6  Conclusions

Previous studies of Gnutella scalability have tended to overlook the intrinsic bandwidth limits of the underlying tree topology. The most thorough and accurate of these studies is that presented in [Ritter 2001]. Unfortunately, his analysis could be accused of straining at a gnat. Our analysis demonstrates that, as a viable candidate for massively scalable bandwidth, trees are dead.
Conversely, by going to higher-dimensional virtual networks (and the hypercube in particular), near-linear scalability can be achieved for populations on the order of several million peers, each with only 20 open connections. According to Section 4.1, this level of scalability would already match the number of nodes present in the entire Web.
The dominant constraint for hardware implementations of high-dimensional networks is the cost of the physical wires on the interconnect backplane. Since the hypernets discussed here would be implemented in software, no such constraints would prevent reaching the desired level of scalability. In this sense, we see hypernets as offering good (g)news for Gnutella scalability.

7  Postscript

  Some open questions that I could've addressed more clearly can be summarized under two main headings (so far):

7.1  Physical bandwidth constraints

I estimate that with 20 connections (e.g., for a 20-cube) over a T1/cable/DSL line, each channel would have the bandwidth equivalent of a 56 Kbps phone line. I didn't do "the math" (e.g., using the transferred byte sizes in the [Ritter 2001] piece) but it seems to me that it would not be a serious constraint for the small queries[4] that Gnutella/Napster peers generate. If that level of bandwidth per channel were a problem (e.g., because most peers only had 56 Kbps to the Internet), then the conclusion would simply be: you can't get to millions of peers yet, not until there's more bandwidth to the Internet. When there is, the hyper-protocol and routing design I'm proposing here ought not to be inhibited from making use of it. In addition, the claim is that it should have better scalability than any tree-like topology. (njg: Tue, Feb 19, 2002)

7.2  Routing in software to make a given topology

It would be interesting to have a packet expert tell me what would be needed at the IP address/routing level to construct a virtual cube or torus. I don't have a good feel for this. I don't even know what is required for the original Gnutella virtual network to make the Cayley-like topology. To me, it should just be a small matter of programming a routing table in software. I know how it's done in hardware (having built a mesh for Siemens/Pyramid) but I don't know about any gotchas with IP on the Internet. Since it's a software implementation, the latencies for a higher-dimensional topology can be expected to be longer than if the network were rendered physically (with wires). In other words, the virtual net and the physical net are disjoint mappings. For Gnutella-type queries, I don't foresee this as a major issue and the routing should be inherently more scalable. (njg: Tue, Feb 19, 2002)
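To sketch what I have in mind (a speculative Python illustration; the names, addresses, and helper structure are purely hypothetical and not from any Gnutella implementation): each servant would keep an application-level table mapping each virtual neighbour's node ID to a real (IP address, port) endpoint, relaying queries over ordinary TCP connections. For a virtual hypercube, the bit-correction rule of Section 3.1 picks the outgoing connection:

    # A servant's view of a virtual 20-cube: one open TCP connection per
    # dimension, recorded in a table from neighbour node ID to endpoint.
    D = 20
    my_id = 0b0101   # this servant's 20-bit virtual address

    # Placeholder endpoints; in practice these would be learned at connect
    # time. 6346 is Gnutella's customary port.
    neighbour_table = {
        my_id ^ (1 << bit): ("10.0.%d.1" % bit, 6346)
        for bit in range(D)
    }

    def next_hop(dst):
        # Endpoint of the neighbour that corrects one differing address bit
        # (assumes dst != my_id).
        diff = my_id ^ dst
        bit = (diff & -diff).bit_length() - 1   # lowest dimension that differs
        return neighbour_table[my_id ^ (1 << bit)]

    print(next_hop(0b1100))   # forwards toward dst by flipping bit 0

The point of the sketch is only that the topology would live entirely in these tables; the underlying IP routes between any two endpoints are whatever the Internet provides, which is the disjoint-mapping point made above. (njg)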

References

[AdaHub 2000]
Adar, E. and Huberman, B. A.
"Free Riding on Gnutella,"
First Monday, 5(10), October 2000.
[AlmGot 1994]
Almasi, G. S., and Gottlieb, A. Highly Parallel Computing, Benjamin-Cummings 1994.
[Buyya 1999]
Buyya, R. (Ed.) High Performance Cluster Computing. Vol. 1, Architectures and Systems, Prentice-Hall 1999.
[Cull et al. 1996]
Culler, D. E., Karp, R. M., Patterson, D., Sahay, A., Santos, E. E., Schauser, K. E., Subramonian, R., and von Eicken, T.
"LogP: A Practical Model of Parallel Computation,"
Comm. ACM, 39(11): 79 - 85, November 1996.
[DeWitt and Gray 1992]
DeWitt D. J., and Gray, J.
"Parallel Database Systems: The Future of High Performance Database Processing,"
Comm. ACM, 35(6): 85 - 98, 1992.
[Gnutella 2002]
Current development projects.
[Gray 1997]
Gray, M. K.
Web Growth Summary, 1997.
[Gunther 2000]
Gunther, N. J.
The Practical Performance Analyst,
iUniverse.com Inc. 2000.
[Gunther 2001]
Gunther, N. J.
"Performance and Scalability Models for a Hypergrowth e-Commerce Web Site,"
in Performance Engineering: State of the Art and Current Trends,
(Eds.) Dumke, R., Rautenstrauch, C., Schmietendorf, A., Scholz, A.,
# 2047. Heidelberg: Springer-Verlag 2001.
[Gunther 2002]
Gunther, N. J.
Scalable Server Performance and Capacity Planning,
UCLA Course 819.328, March 2002.
[Minar 2002]
Minar, N.
Distributed Systems Topologies
January 2002.
[OCLC 2001]
Online Computer Library Center
OCLC report, 2001.
[PatHen 1996]
Hennessy, J. L., and Patterson, D. A.
Computer Architecture: A Quantitative Approach, (2nd Edition),
Morgan Kaufmann 1996.
[RaiSlo 1999]
Rains, E. M. and Sloane, N. J. A.
"On Cayley's Enumeration of Alkanes (or 4-Valent Trees),"
Journal of Integer Sequences, January 1999.
[Reka et al. 1999]
Albert, R., Jeong, H., and Barabási, A.-L.
"Diameter of the World-Wide Web,"
Nature, 401: 130-131, 1999.
[Ritter 2001]
Ritter, J.
"Why Gnutella Can't Scale. No, Really."
Slashdotted January 2002.
[Sripan 2001]
Sripanidkulchai, K.
"The popularity of Gnutella queries and its implications on scalability,"
March 2001.
[Stevens 1990]
Stevens, W. R.
UNIX Network Programming,
Prentice Hall 1990.
[Wirth 1976]
Wirth, N.
Algorithms + Data Structures = Programs,
Prentice Hall 1976.

Footnotes:

[1] In 2001, the size of the Napster network was 160,000 simultaneous users, down from a peak of 1.6 million reported by Webnoize in February 2001.

[2] At the height of the media attention, Napster's legal problems drove some 50,000 users per day over to Gnutella, such that peers connected by 56 Kbps phone lines caused the P2P network to fragment into disconnected "islands" of about 200 peers.

[3] As the SETI@Home project has demonstrated, 2.8 million desktops (and 10 PetaFLOPS) can be harnessed for free.

[4] We're not talking massive table scans here, are we?

