A Quantitative Theory of Bottleneck Structures for Data Networks Technical Report August 2021 New York USA Jordi Ros-Giralt1 Noah Amsel2 Sruthi Yellamraju3 James Ezick3 Richard Lethin3

2025-04-27 0 0 4.43MB 20 页 10玖币
侵权投诉
A Quantitative Theory of Bottleneck Structures for Data Networks
Technical Report, August 2021, New York, USA
Jordi Ros-Giralt1, Noah Amsel2, Sruthi Yellamraju3, James Ezick3, Richard Lethin3
Yuang Jiang4, Aosong Feng4, Leandros Tassiulas4
contact: jros@qti.qualcomm.com,
1Qualcomm Europe, Inc
2Formerly of Qualcomm Technologies, Inc
3Qualcomm Technologies, Inc
4Yale, University
Abstract—The conventional view of the congestion control
problem in data networks is based on the principle that a
flow’s performance is uniquely determined by the state of
its bottleneck link, regardless of the topological properties
of the network. However, recent work has shown that the
behavior of congestion-controlled networks is better explained
by models that account for the interactions between bottleneck
links. These interactions are captured by a latent bottleneck
structure, a model describing the complex ripple effects that
changes in one part of the network exert on the other parts.
In this paper, we present a quantitative theory of bottleneck
structures (QTBS), a mathematical and engineering framework
comprising a family of polynomial-time algorithms that can be
used to reason about a wide variety of network optimization
problems, including routing, capacity planning and flow con-
trol. QTBS can contribute to traffic engineering by making
clear predictions about the relative performance of alternative
flow routes, and by providing numerical recommendations for
the optimal rate settings of traffic shapers. A particularly
novel result in the domain of capacity planning indicates
that previously established rules for the design of folded-Clos
networks are suboptimal when flows are congestion controlled.
We show that QTBS can be used to derive the optimal rules for
this important class of topologies, and empirically demonstrate
the correctness and efficacy of these results using the BBR and
Cubic congestion-control algorithms.
1. Introduction
Most research on the problem of congestion control for
data networks is based on the principle that the performance
of a flow is solely determined by the state of its bottleneck
link. This view was presented in the original congestion
control algorithm by Jacobson [1], which helped the Internet
recover from congestion collapse in 1988, and it persisted
throughout the more than 30 years of research and develop-
ment that followed, including Google’s new BBR algorithm
[2]. While it is certainly true that a flow’s performance
is limited by the state of its bottleneck link, recent work
[3] reveals a deeper view of network behavior, describing
how bottlenecks interact with each other through a latent
structure—called the bottleneck structure—that depends on
the topological, routing and flow control properties of the
network. This latent structure explains how the performance
of one bottleneck can affect other bottlenecks, and provides
a framework to understand how perturbations in the capacity
of a link or the rate of a flow propagate through a network,
affecting other links and flows.
While [3] introduced the concept of bottleneck struc-
ture, the analysis provided was qualitative. In this paper
we present a quantitative theory of bottleneck structures
(QTBS), a mathematical framework that yields a set of
polynomial time algorithms for quantifying the ripple effects
of perturbations in a network. Perturbations can either be
unintentional (such as the effect of a link failure or the
sudden arrival of a large flow in a network) or intentional
(such as the upgrade of a network link to a higher capacity
or the modification of a route with the goal of optimizing
performance). With QTBS, a network operator can quantify
the effect of such perturbations and use this information to
optimize network performance.
The theoretical contributions of this paper are as follows:
A new generalized bottleneck structure called gradient
graph is studied in detail. A key difference with the bottle-
neck structure introduced in [3] is that the gradient graph
allows us to not only qualify the influences that flows and
bottlenecks exert on each other, but also to quantify them.
This leads to the development of a quantitative theory of
bottleneck structures (QTBS), introduced in this paper.
(Section 2.2)
A novel, fast algorithm to compute the gradient graph
is developed. This algorithm constitutes an asymptotic
speed-up compared to those presented in [3], allowing
us to scale our methodology to large production networks
(Sections 2.2.)
The concepts of link and flow gradient are introduced.
These mathematical operators quantify the effects of in-
finitesimally small perturbations in a network, the core
building blocks of QTBS. A new, fast method to effi-
arXiv:2210.03534v1 [cs.NI] 6 Oct 2022
ciently compute the gradients by leveraging the bottleneck
structure is presented. (Section 2.3.)
Applications demonstrating the practical implications of
QTBS are provided in the areas of routing, capacity planning
and flow control. In each of these applications, we show
how QTBS can potentially alter some of the established
conventional best practices. Our practical contributions are
as follows:
In the routing application, we introduce an algorithm to
find maximal-throughput routes by anticipating the effects
of the congestion control algorithm. While in traditional
traffic engineering approaches (e.g., [4]) the problems of
routing and flow control are considered independently, we
show how QTBS can help resolve them jointly, allowing
operators to design routes that are efficient from a con-
gestion control standpoint. (Section 3.1.)
In the capacity planning application, we use QTBS to
optimize the bandwidth allocation between the spine and
leaf links of a fat-tree (also known as folded-Clos [5]).
We demonstrate that, due to the effects of congestion
control, the optimal design differs from the full fat-tree
configuration proposed by Leiserson [6]. (Section 3.2.)
In the flow control application, we show that QTBS can
be used to precisely compute the rate reduction that a
set of traffic shapers must impose on the network’s low
priority flows in order to achieve a quantifiable positive
impact on the high-priority flows. (Section 3.3.)
To demonstrate that networks behave according to QTBS,
we carry out experiments for each application we consider
using production TCP/IP code and the widely adopted
BBR [2] and Cubic [7] congestion control algorithms.
(Section 3.)
2. Quantitative Theory of Bottleneck Struc-
tures (QTBS)
2.1. Network Model
In their simplest form, networks can be modeled using
two kinds of elements: links, which are communication
resources with a limited capacity; and flows, which make
use of these communication resources. We formalize the
definition of a network as follows:
Definition 1. Network. We say that a tuple N=
hL,F,{cl,l∈ L}i is a network if:
Lis a set of links of the form {l1, l2, ..., l|L|},
Fis a set of flows of the form {f1, f2, ..., f|F|}, and
clis the capacity of link l, for all l∈ L.
Each flow ftraverses a subset of links Lf L and,
similarly, each link lis traversed by a subset of flows
Fl⊂ F. Finally, each flow ftransmits data at a rate rf
and the capacity constraint Pf∈Flrfclmust hold for
all l∈ L.
A core concept upon which our mathematical framework
rests is the notion of a bottleneck link. Intuitively, a flow is
bottlenecked at a link if bypassing the link would allow its
transmission rate to increase. A link whose capacity is fully
utilized is always a bottleneck of at least one flow, though
not necessarily of all the flows traversing it. In this work,
we adopt the following formal definition:
Definition 2. Bottleneck link. Let N=hL,F,{cl,l∈ L}i
be a network where each flow f∈ F transmits data at a rate
rfas determined by a congestion control algorithm (e.g.,
TCP’s algorithm [1]). We will say that flow fis bottlenecked
at link l—equivalently, that link lis a bottleneck of flow
f—if and only if:
Flow ftraverses link l.
rf/∂c
l6= 0. That is, the transmission rate of flow f
changes upon small reductions in link ls capacity.1
This characterization of bottlenecks is a generalization
of some of the classic definitions found in the literature.
Unlike previous work, however, it is based on the notion of
aperturbation, mathematically expressed as a derivative of a
flow rate with respect to the capacity of a link (rf/∂cl). As
an example to illustrate that our definition of bottleneck is
relatively loose, in Appendix A we show that it generalizes
the classic max-min definition of Bertsekas and Gallager
[8]. The generality of the definition of bottlenecks used
in this paper suggests that our framework can be applied
to a wide variety of rate allocation schemes—not only to
max-min fairness [8], proportional fairness [9] and specific
algorithms (e.g., BBR [2], Cubic [7], Reno [10], etc.), but to
other classes of congestion control solutions that meet the
conditions of Definition 2. We leave this promising direction
for future work, and focus on the classic max-min setting
considered in [3].
We complete the description of our network model by
defining the concept of a link’s fair share:
Definition 3. Fair share of a link. Let N=hL,F,{cl,l
L}i be a network. The fair share slof a link l∈ L is the
rate of the flows that are bottlenecked at link l.
As we will see throughout this work, the concept of link
fair share is dual to the concept of flow rate, in that many
of the mathematical properties that are applicable to the rate
of a flow are also applicable to the fair share of a link.
2.2. The Gradient Graph
Our objective is to derive a mathematical framework
capable not just of detecting but also of quantifying the
influences that links and flows exert on each other. In [3], the
authors introduced two bottleneck structures, the bottleneck
precedence graph (BPG) and the gradient graph, and demon-
strated that data networks qualitatively operate according to
the BPG structure. The authors briefly described the concept
of the gradient graph, but their work focused mostly on
the mathematical properties of the bottleneck precedence
1. We use the notation y/∂xto denote the left derivative. This
subtlety is necessary because a flow can have multiple bottleneck links.
In this case, decreasing the capacity of only one bottleneck would affect
the rate of the flow, while increasing its capacity would not; thus, the
(two-sided) derivative would not exist.
2
graph. In our paper, we instead focus on a modified version
of the gradient graph structure. Our work stems from the
insight that, as we will show, this structure enables not just
qualitative analysis, as in [3], but also quantitative analysis,
providing a framework to better understand and optimize
network performance.
We start with the definition of the gradient graph:
Definition 4. Gradient graph. Let N=hL,F,{cl,l∈ L}i
be a network. The gradient graph is a directed graph such
that:
1) There exists a vertex for each bottleneck link and each
flow in the network.
2) For every flow f∈ F:
a) If fis bottlenecked at link l∈ L, then there exists
a directed edge from lto f;
b) If ftraverses link l∈ L, then there exists a directed
edge from fto l;
For ease of exposition, in this paper we will use the terms
gradient graph and bottleneck structure interchangeably.
This definition is borrowed from [3], except for a subtle
but relevant modification of 2b. (The rest of the theoretical
developments presented in this work are new contributions.)
Previously, edges were only included from flows to links
that they traverse, but that do not bottleneck them. In this
work, we also include edges from flows to their bottleneck
links. We call these “backward edges”, and we introduce
them because they are required by several of our theorems
and algorithms.
The utility of our definition of gradient graph as a data
structure for understanding network performance is captured
in the following theorem:
Theorem 1. Propagation of network perturbations. Let
x, y ∈ LF be a pair of links or flows in the network. Then
a perturbation in the capacity cx(for x∈ L) or transmission
rate rx(for x∈ F) of xwill affect the fair share sy(for
y∈ L) or transmission rate ry(for y∈ F) of yif only
if there exists a directed path from xto yin the gradient
graph.
Proof. See Appendix B.
Intuitively, the gradient graph of a network describes
how perturbations in link capacities and flow transmission
rates propagate through the network. Imagine that flow f
is bottlenecked at link l. From Definition 2, this necessarily
implies that a perturbation in the capacity of link lwill cause
a change on the transmission rate of flow f,rf/∂cl6= 0.
This is reflected in the gradient graph by the presence of
a directed edge from a link lto a flow f(Condition 2a in
Definition 4). A change in the value of rf, in turn, affects all
the other links traversed by flow f. This is reflected by the
directed edges from fto the links it traverses (Condition 2b).
This basic process of (1) inducing a perturbation in a vertex
(either in a link or a flow vertex) followed by (2) propagating
the effects of the perturbation along the departing edges of
the vertex creates a ripple effect in the bottleneck structure
as described in Theorem 1. Leveraging this result, we can
formally introduce the concept of region of influence:
Definition 5. Regions of influence in a network. The region
of influence of a link or flow x L ∪ F, denoted R(x), is
the set of links and flows ythat are reachable from xin the
gradient graph.
The region of influence is an important concept in
network performance analysis and optimization because it
describes what parts of a network are affected by perturba-
tions in the performance of a link or a flow. In Section 2.3,
we will also see how such influences can be quantified.
We now introduce the GradientGraph() algorithm (Al-
gorithm 1), a procedure that constructs the gradient graph of
a network. The algorithm begins with crude estimates of the
fair share rates of the links, and iteratively refines them until
all the capacity in the network has been allocated and the
rate of each flow reaches its final value. In the process, the
gradient graph is constructed level by level. The algorithm
starts by initializing the available capacity of each link (line
3), estimating its fair share (line 4) and adding all links to
a min-heap by taking their fair share value as the key (line
5). At each iteration, the algorithm picks the unresolved link
with the lowest fair share value from the min-heap (line 8).
Once this link is selected, all unresolved flows remaining in
the network that traverse it are resolved. That is, their rates
are set to the fair share of the link (line 12) and they are
added to the set of vertices of the gradient graph V(line
13). In addition, directed edges are added in the gradient
graph between the link and all the flows bottlenecked at it
(line 10) and from each of these flows to the other links that
they traverse (line 15). Lines 16-17-18 update the available
capacity of the link, its fair share, and the position of the link
in the min-heap according to the new fair share. Finally, the
link itself is also added as a vertex in the gradient graph (line
22). This iterative process is repeated until all flows have
been added as vertices in the gradient graph (line 7). The
algorithm returns the gradient graph G, the fair share of each
link {sl,l∈ L} and the rate of each flow {rf,f∈ F}.
We conclude this section stating the time complexity of
the GradientGraph() algorithm:
Lemma 1. Time complexity of GradientGraph(). The time
complexity of running GradientGraph() is O(|L|log |L|·H),
where His the maximum number of flows that traverse a
single link.
Proof. See Appendix E.
2.3. Link and Flow Gradients
In this section, we focus on the problem of quantifying
the ripple effects created by perturbations in a network.
Because networks are composed of links and flows, there
are two kinds of perturbations: (1) those originating from
changes to the capacity of a link and (2) those originating
from changes to the rate of a flow. When such changes
occur, the congestion control algorithm adjusts its allocation
3
Algorithm 1 GradientGraph(N=hL,F,{cl,l∈ L}i)
1: V=;E=;rf=,f∈ F;
2: for l∈ L do
3: al=cl; # available capacity
4: sl=al/|Fl|; # fair share
5: MinHeapAdd(key = sl,value= l);
6: end for
7: while F 6⊆ V do
8: l= MinHeapPop();
9: for f∈ Flsuch that rfsldo
10: E=E∪ {(l, f),(f, l)};
11: if f6∈ V then
12: rf=sl;
13: V=V ∪ {f};
14: for l0∈ Lfsuch that rf< sl0do
15: E=E∪ {(f, l0)}
16: al0=al0sl
17: sl0=al/|Fl\ V|;
18: MinHeapUpdateKey(value = l0,newKey = sl0);
19: end for
20: end if
21: end for
22: V=V ∪ {l};
23: end while
24: return hG =hV, Ei,{sl,l∈ L},{rf,f∈ F}i;
of bandwidth to the flows so as to maintain two objec-
tives: (1) maximizing network utilization while (2) ensuring
fairness among competing flows. The congestion control
algorithm acts like a function mapping network conditions
(including its topology, link capacities, and flow paths) to
rate allocations. Large changes in any of these inputs can
have complicated ripple effects on the flow rates, but for
sufficiently small changes, the bandwidth allocation function
is linear.2This local linearity property naturally motivates
the concept of link and flow gradients:
Definition 6. Link and flow gradients. Let N=
hL,F,{cl,l∈ L}i be a network. We define:
The gradient of a link l∈ L with respect to another link
l∈ L as l(l) = sl/∂cl;
The gradient of a flow f∈ F with respect to some link
l∈ L as l(f) = rf/∂cl;
The gradient of a link l∈ L with respect to a flow f∈ F
as f(l) = sl/∂rf;
The gradient of a flow f∈ F with respect to another
flow f∈ F as f(f) = rf/∂rf.
Intuitively, the gradient with respect to a link measures
the impact that a small perturbation in its capacity has on
another link or flow. In real networks, this corresponds to the
scenario of physically upgrading a link or, in programmable
networks (e.g., [11]), logically modifying the capacity of
a virtual link. Thus, link gradients can generally be used
to resolve network design and capacity planning problems.
Similarly, the gradient with respect to a flow measures the
impact that a perturbation in its rate has on a link or another
flow. This scenario corresponds, for instance, to the case
of traffic shaping a flow to alter its transmission rate or
changing the route of a flow—which can be seen as dropping
2. Technically, it is piecewise linear, like the absolute value function, so
picking a linear function that locally approximates it requires knowing the
direction of the change.
the rate of that flow down to zero and adding a new flow
with a different path. Thus, flow gradients can generally be
used to resolve traffic engineering problems. In Section 3
we will see applications in real networks that illustrate each
of these scenarios.
We now present an algorithm called ForwardGrad()
(Algorithm 2) for calculating link and flow gradients. The
algorithm takes a set of links and flows, the gradient graph
of the corresponding network, a link or flow xwith respect
to which to compute the gradients, and a direction xof the
perturbation. It outputs the gradients of all links and flows
in the network with respect to x.ForwardGrad() takes inspi-
ration from forward mode automatic differentiation (“For-
ward Prop”) [12], an algorithm that uses directed acyclic
graphs to represent complicated mathematical functions as
compositions of simpler functions, whose derivatives can
be composed by repeatedly applying the chain rule. In the
case of congestion control, we do not have a closed-form
mathematical formula that relates network conditions (the
inputs) to the flow rates and fair share values (the outputs),
but we can use the gradient graph to break down and
optimize this function.
The thrust of the algorithm is as follows. For all l∈ L,
let lbe the change in the fair share rate of link l. For
all f∈ F, let fbe the change in the rate of flow f. We
call these variables the “drifts” caused by a perturbation.
Before the perturbation, l= ∆f= 0 for all links and
flows. To begin the algorithm, we make an infinitesimally
small perturbation in the independent variable (the one in
the “denominator” of the derivative) that can be positive
or negative. If the independent variable xis a flow f, we
set f=δ(line 2). If it is a link l, and Slis the set
of direct successors of node lin the gradient graph, we set
l=δ/|Sl|(line 3). This is done since, by definition of the
gradient graph, |Sl|is the number of flows bottlenecked at l
and the change in ls capacity will be distributed evenly
among these flows. To determine how this perturbation
propagates to the rest of the network, we follow all directed
paths from that vertex and update the drifts according to the
following two invariants:
Invariant 1: Flow Equation. A flow’s drift fequals
the minimum drift of its bottleneck links. That is, f=
minlPfl, where Pfis the set of links visited directly
before flow vertex fon a path from the starting vertex x
(the predecessors in the graph).
Invariant 2: Link Equation. A link’s drift lis the
negative of the flow drifts entering its vertex, divided
by the number of flow drifts leaving it. That is, l=
PfPlf/|Sl|, where Plis the set of flow vertices
visited directly before link vertex land Slis the set of
flow vertices visited directly after link vertex lon a path
from the starting vertex x.
Finally, the derivative of a given variable with respect to the
independent variable that we perturbed can be calculated by
dividing its drift by δ. In particular, assume the capacity
of link lis the independent variable that we perturbed and
let the rate of flow fbe the dependent variable in which
4
摘要:

AQuantitativeTheoryofBottleneckStructuresforDataNetworksTechnicalReport,August2021,NewYork,USAJordiRos-Giralt1,NoahAmsel2,SruthiYellamraju3,JamesEzick3,RichardLethin3YuangJiang4,AosongFeng4,LeandrosTassiulas4contact:jros@qti.qualcomm.com,1QualcommEurope,Inc2FormerlyofQualcommTechnologies,Inc3Qualcom...

展开>> 收起<<
A Quantitative Theory of Bottleneck Structures for Data Networks Technical Report August 2021 New York USA Jordi Ros-Giralt1 Noah Amsel2 Sruthi Yellamraju3 James Ezick3 Richard Lethin3.pdf

共20页,预览4页

还剩页未读, 继续阅读

声明:本站为文档C2C交易模式,即用户上传的文档直接被用户下载,本站只是中间服务平台,本站所有文档下载所得的收益归上传人(含作者)所有。玖贝云文库仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对上载内容本身不做任何修改或编辑。若文档所含内容侵犯了您的版权或隐私,请立即通知玖贝云文库,我们立即给予删除!
分类:图书资源 价格:10玖币 属性:20 页 大小:4.43MB 格式:PDF 时间:2025-04-27

开通VIP享超值会员特权

  • 多端同步记录
  • 高速下载文档
  • 免费文档工具
  • 分享文档赚钱
  • 每日登录抽奖
  • 优质衍生服务
/ 20
客服
关注