What is traffic engineering?
The process of reconfiguring the network in response to changing traffic loads to achieve some operational goal
* For example, an operator might reconfigure the network to maintain traffic ratios in a peering relationship, to relieve congestion on certain links, or to balance load more evenly across the available links in the network
What is a key question that traffic engineering tries to address?
How should routing adapt to traffic?
Intradomain traffic engineering
how to tune protocols to adjust to traffic flows in a single AS
* Intradomain TE attempts to solve an optimization problem, where the input is a Graph G(R, L) where R is the set of routers, and L is the set of unidirectional links.
* Each link l also has a fixed capacity.
* Another input is the traffic matrix or the offered traffic load
* M_ij represents the traffic load from router i to router j.
* The output is the set of link weights, where w_l is the weight on any unidirectional link l in the network topology.
* Ultimately, the setting of these link weights should result in a fraction of the traffic from i to j traversing each link l such that those fractions satisfy the network-wide objective function. Defining an objective function is tricky.
Interdomain traffic engineering
adjusting how traffic flows between ASes
How can an operator affect how traffic flows within an AS?
Configuring/adjusting link weights
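A minimal sketch of why link weights steer traffic, assuming every demand follows its single shortest path under the configured weights. The topology, weights, traffic matrix, and the helper names `shortest_path` and `link_loads` are all hypothetical and chosen only for illustration.

```python
import heapq

def shortest_path(weights, src, dst):
    """Dijkstra over directed links; weights maps (u, v) -> link weight."""
    adj = {}
    for (u, v), w in weights.items():
        adj.setdefault(u, []).append((v, w))
    dist = {src: 0}
    prev = {}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    # Walk predecessors back from dst (assumes dst is reachable).
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]

def link_loads(weights, traffic):
    """Route each demand on its shortest path and sum the load per link."""
    loads = {link: 0.0 for link in weights}
    for (i, j), demand in traffic.items():
        path = shortest_path(weights, i, j)
        for link in zip(path, path[1:]):
            loads[link] += demand
    return loads

# Hypothetical topology with two routes from A to C.
weights = {("A", "B"): 1, ("B", "C"): 1, ("A", "C"): 3}
# Hypothetical traffic matrix: (source, destination) -> offered load.
traffic = {("A", "C"): 10.0, ("A", "B"): 4.0}

before = link_loads(weights, traffic)
# Lower the weight of the direct link so A->C traffic moves onto it.
weights[("A", "C")] = 1
after = link_loads(weights, traffic)
```

With the original weights, the A-to-C demand shares link (A, B) with the A-to-B demand. After the single weight change it moves to the direct link, which relieves (A, B) without touching any other setting.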
3 steps of traffic engineering
“What if” model
predicts what would happen under various changes; the operator then decides which changes to make to the network, and ultimately controls the network's behavior by readjusting link weights
In intradomain routing, some examples of possible objective function goals:
- Evenly splitting traffic loads across links
Link utilization
the amount of traffic on link l (U_l) divided by the link's capacity
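A small sketch of computing utilization and turning it into a per-link cost. The breakpoints and slopes in `SEGMENTS` are assumptions picked for illustration, not values from these notes. The only property that matters is that the cost grows slowly at low utilization and steeply as a link approaches or exceeds capacity.

```python
def utilization(load, capacity):
    """Traffic on the link divided by the link's capacity."""
    return load / capacity

# Assumed (start_utilization, slope) pairs; slopes increase, so the cost is convex.
SEGMENTS = [(0.0, 1.0), (0.5, 4.0), (0.9, 20.0), (1.0, 100.0)]

def link_cost(u):
    """Piecewise-linear cost of one link at utilization u."""
    cost = 0.0
    for k, (start, slope) in enumerate(SEGMENTS):
        end = SEGMENTS[k + 1][0] if k + 1 < len(SEGMENTS) else float("inf")
        if u > start:
            # Add the part of u that falls inside this segment.
            cost += slope * (min(u, end) - start)
    return cost

def network_cost(loads, capacities):
    """Sum of per-link costs; loads and capacities are keyed by link name."""
    return sum(link_cost(utilization(loads[l], capacities[l])) for l in loads)

# Hypothetical two-link example: the same total load, balanced vs. skewed.
capacities = {"a": 10.0, "b": 10.0}
balanced = network_cost({"a": 5.0, "b": 5.0}, capacities)
skewed = network_cost({"a": 10.0, "b": 0.0}, capacities)
```

Because the cost is convex, the balanced assignment scores lower than the skewed one even though both carry the same total traffic.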
* Our objective might be to minimize the sum of this piecewise linear cost function over all the links in the network.
* Unfortunately, solving this optimization is still NP-hard, which means there is no known efficient algorithm to find the optimal setting of link weights, even for simple objective functions.
* Implication: we have to resort to searching through a large set of combinations of link weight settings to ultimately find a good one. This seems sub-optimal, but the graphs turn out to be small enough in practice such that this approach is still reasonably effective.
* In practice: we have other operational realities to worry about:
* Minimizing the # of changes to the network: often, changing just 1 or 2 link weights is enough to achieve a good traffic load balance solution
* Whatever solution we come up with must be resistant to failure.
* Should also be robust to measurement noise.
* Limit the frequency of changes we make to the network.
Which of the following are examples of interdomain routing?
BGP in interdomain traffic engineering
An operator might want to use BGP reconfiguration if:
Effective interdomain routing has what 3 goals?
Multipath routing
operator can establish multiple paths in advance.
ECMP
(in intra-domain routing) - Equal Cost Multipath
set link weights such that multiple paths of equal cost exist between 2 nodes in the graph
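A sketch of the ECMP idea, assuming integer link weights so that path costs can be compared for exact equality. It runs Dijkstra while remembering every predecessor that ties for the shortest distance, then expands those ties into the full set of equal-cost paths. The function name `equal_cost_paths` and the topology are hypothetical.

```python
import heapq

def equal_cost_paths(weights, src, dst):
    """Return every minimum-cost path from src to dst (positive integer weights)."""
    adj = {}
    for (u, v), w in weights.items():
        adj.setdefault(u, []).append((v, w))
    dist = {src: 0}
    preds = {src: []}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                preds[v] = [u]          # strictly better: reset predecessors
                heapq.heappush(heap, (nd, v))
            elif nd == dist[v]:
                preds[v].append(u)      # tie: remember the extra predecessor

    def expand(node):
        if node == src:
            return [[src]]
        return [p + [node] for u in preds[node] for p in expand(u)]

    return expand(dst)

# Hypothetical topology: two equal-cost two-hop routes and one costlier direct link.
weights = {
    ("A", "B"): 1, ("B", "D"): 1,
    ("A", "C"): 1, ("C", "D"): 1,
    ("A", "D"): 3,
}
paths = equal_cost_paths(weights, "A", "D")
# With ECMP, a demand can be split evenly across the equal-cost paths.
share_per_path = 12.0 / len(paths)
```

Raising or lowering a single weight changes which paths tie, which is how an operator controls the amount of splitting.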
How can a source router adjust paths?
- Alternating between forwarding table entries
3 important characteristics of data center networking
Challenges of data center networking
3 layers of a typical data center topology
Solution to scale problem in data centers (due to flat topology)
Cause of scale problem: lots of servers on a flat topology (their MAC/hardware addresses are topology-independent)
* Thus, in the default behavior, every switch in the topology has to store in its forwarding table every single MAC address.
One solution:
- Pods: assign each server a pseudo-MAC address that corresponds to the pod in which it is located in the topology. Each server then has both a real MAC address and a pseudo-MAC address. Switches now only need to maintain entries for reaching other pods in the topology. Once a frame enters a pod, the switches there have entries for the servers inside that pod.
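A sketch of how a pseudo-MAC can encode location so that a switch outside the pod forwards on the pod number alone. The field layout (16-bit pod, 8-bit position, 8-bit port, 16-bit VM id) and the names `make_pmac`, `pod_of`, and `core_table` are assumptions for illustration; the notes above only say that the pseudo-MAC corresponds to the server's pod.

```python
def make_pmac(pod, position, port, vmid):
    """Pack the fields into a 48-bit pseudo-MAC string (assumed layout)."""
    value = (pod << 32) | (position << 24) | (port << 16) | vmid
    # Emit the six octets from most to least significant.
    return ":".join(f"{(value >> shift) & 0xFF:02x}" for shift in range(40, -8, -8))

def pod_of(pmac):
    """Recover the pod number from the first two octets."""
    octets = pmac.split(":")
    return int(octets[0] + octets[1], 16)

# Hypothetical forwarding table of a switch above the pods:
# one entry per pod instead of one entry per server MAC address.
core_table = {1: "port-1", 2: "port-2"}

def forward(pmac):
    """Pick the output port using only the pod prefix of the destination."""
    return core_table[pod_of(pmac)]

server = make_pmac(pod=2, position=1, port=0, vmid=1)
out_port = forward(server)
```

The table size now grows with the number of pods rather than the number of servers, which is the point of the scheme.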
2 main objectives of VL2
Why we need it:
* Existing data center topologies provide extremely limited server-to-server capacity because of the oversubscription of the links at the top of the hierarchy
* As services continue to be migrated to different parts of the data center, resources can become fragmented, significantly lowering utilization (for example, one service may end up spread across VMs in different parts of the data center, which lowers utilization and cost-efficiency). Fixing this requires complicated layer-2/layer-3 routing reconfiguration, but we want the abstraction of just one large layer-2 switch, which is the abstraction that VL2 provides.
Goals of Valiant Load balancing
Jellyfish
Where does data center topology primarily constrain expansion?
Top-level switches