Which of the following are examples of interdomain routing? - Peering between 2 ISPs - Peering between a university network and an ISP - Peering at an internet exchange point - Routing in a data center - Routing across multiple data centers

- Peering between 2 ISPs - Peering between a university network and an ISP - Peering at an internet exchange point - Routing across multiple data centers

operator can establish multiple paths in advance. - This approach applies to both intra- and interdomain routing. - In intra: set link weights such that multiple paths of equal cost exist between 2 nodes in the graph. This is called Equal Cost Multipath (ECMP). Traffic will be split across paths that have equal costs across the network. A source router may also be able to change the fraction of traffic that’s sent along each path. It might even be able to do this based on the level of congestion observed along the paths. It would do so by having multiple forwarding table entries with different next-hops for outgoing packets to the same destination.

Lesson 10 - Traffic Engineering Flashcards by Timothy Hall

What is traffic engineering?

The process of reconfiguring the network in response to changing traffic loads to achieve some operational goal
* Operator might want to reconfigure the network in response to changing loads to, for example, maintain traffic ratios in a peering relationship, to relieve congestion on certain links, or balance load more evenly across available links in the network

How well did you know this?

Not at all

Perfectly

What is a key question that traffic engineering tries to address?

How should routing adapt to traffic?

In general, it tries to:
* Avoid congested links
* Satisfy application requirements such as delay

How well did you know this?

Not at all

Perfectly

Intradomain traffic engineering

how to tune protocols to adjust to traffic flows in a single AS

* Intradomain TE attempts to solve an optimization problem, where the input is a Graph G(R, L) where R is the set of routers, and L is the set of unidirectional links. 
* Each link l also has a fixed capacity.
* Another input is the traffic matrix or the offered traffic load
    * Mij represents the traffic load from router i to router j.
* The output is the set of link weights where wl is the weight on any unidirectional link l in the network topology.
* Ultimately, the setting of these link weights should result in a fraction of the traffic from i to j traversing each link l such that those fractions satisfy the network-wide objective function. Defining an objective function is tricky.

How well did you know this?

Not at all

Perfectly

Interdomain traffic engineering

adjusting how traffic flows between ASes

How well did you know this?

Not at all

Perfectly

How can an operator affect how traffic flows within an AS?

Configuring/adjusting link weights

In practice, operators set these link weights in a variety of ways:
* Set them inversely proportional to capacity
* Proportional to propagation delay
* Network-wide optimization based on traffic

How well did you know this?

Not at all

Perfectly

3 steps of traffic engineering

Measure: figure out the current traffic loads
Model: form a model of how configuration affects the underlying paths in the network
Control: Reconfiguring the network to exert control over how traffic flows through the network
- Each step requires significant heavy lifting to make TE practical and accurate

How well did you know this?

Not at all

Perfectly

“What if” model

predicts what would happen under various changes, decide which changes to affect on the network, and then ultimately control the behavior on the network by readjusting link weights

How well did you know this?

Not at all

Perfectly

In intradomain routing, some examples of possible objective function goals:

Minimizing the maximum congested link in the network

- Evenly splitting traffic loads across links

How well did you know this?

Not at all

Perfectly

Link utilization

the amount of traffic on the link (Ul) divided by the capacity

    * Our objective might be to minimize the sum of this piecewise linear cost function over all the links in the network.
    * Unfortunately, solving this optimization is still NP-complete, which means there’s no efficient algorithm to find the optimal setting of link weights, even for simple objective functions.
        * Implication: we have to resort to searching through a large set of combinations of link weight settings to ultimately find a good one. This seems sub-optimal, but the graphs turn out to be small enough in practice such that this approach is still reasonably effective.
* In practice: we have other operational realities to worry about:
    * Minimizing the # of changes to the network: often, changing just 1 or 2 link weights is enough to achieve a good traffic load balance solution
    * Whatever solution we come up with must be resistant to failure. 
    * Should also be robust to measurement noise.
    * Limit the frequency of changes we make to the network.

How well did you know this?

Not at all

Perfectly

Which of the following are examples of interdomain routing?

Peering between 2 ISPs
Peering between a university network and an ISP
Peering at an internet exchange point
Routing in a data center
Routing across multiple data centers

Peering between 2 ISPs
Peering between a university network and an ISP
Peering at an internet exchange point
Routing across multiple data centers

How well did you know this?

Not at all

Perfectly

BGP in interdomain traffic engineering

Involves the Reconfiguration of the BGP (Border Gateway Protocol) policies or configurations that are running on individual routers in the network
- Changing BGP policies at these edge routers can cause routers inside an AS to direct traffic to or away from certain edge links. We can also change the set of egress links for a particular destination.

How well did you know this?

Not at all

Perfectly

An operator might want to use BGP reconfiguration if:

An edge link is congested
A link is upgraded
If there’s some violation of a peering agreement. (e.g. below if AS1 and AS2 have an agreement that they only send a certain amount of traffic load over that link in a particular time window. If the load exceeds that amount, an operator would need to use BGP to shift traffic from one egress link to another

How well did you know this?

Not at all

Perfectly

Effective interdomain routing has what 3 goals?

Predictability
* It should be possible to predict how traffic flows will change in response to changes in the network configuration
* Suppose a downstream neighbor is trying to reach AS at the top of the figure below.
* Note that when the middle AS makes a change, it’s now taking a longer path. This could cause the bottom AS not to route through the middle one at all, affecting the traffic matrix that the middle AS sees.
* One way to avoid this problem and achieve predictable traffic flow changes, is to avoid making changes like this that are globally visible.
Limit influence of neighboring domains
* An AS might try to make a path look longer with AS path prepending. Forcing consistent advertisements can also help with this.
* Consistent adverts
* Limit AS path
Reduce overhead of routing changes
* Achieve goals with changes to as few IP prefixes as possible
* Group related prefixes. Identify routing choices that group routes that have the same AS paths and move groups of prefixes according to these groups that share an AS path. Allows you to move groups of prefixes by making tweaks to local preference values.
* Can also focus on the small fraction of prefixes that carry the majority of the traffic. 10% of origin ASes are responsible for about 82% of outbound traffic. Therefore we can achieve significant gains by focusing on the heavy hitters.

How well did you know this?

Not at all

Perfectly

Multipath routing

operator can establish multiple paths in advance.

This approach applies to both intra- and interdomain routing.
In intra: set link weights such that multiple paths of equal cost exist between 2 nodes in the graph. This is called Equal Cost Multipath (ECMP).
* Traffic will be split across paths that have equal costs across the network.
* A source router may also be able to change the fraction of traffic that’s sent along each path. It might even be able to do this based on the level of congestion observed along the paths. It would do so by having multiple forwarding table entries with different next-hops for outgoing packets to the same destination.

How well did you know this?

Not at all

Perfectly

ECMP

(in intra-domain routing) - Equal Cost Multipath

set link weights such that multiple paths of equal cost exist between 2 nodes in the graph

How well did you know this?

Not at all

Perfectly

How can a source router adjust paths?

Dropping packets to cause TCP backoff
Alternating between forwarding table entries
Sending alerts to incoming senders

Study These Flashcards

-Alternating between forwarding table entries

3 important characteristics of data center networking

Study These Flashcards

Multi-tenancy: allows a datacenter provider to amortize the cost of shared infrastructure, but because there are multiple independent users, the infrastructure must also provide some level of security and resource isolation.
Elastic resources: as demand for a service fluctuates, the operator can expand and contract these resources. Can also be pay-per use.
Flexible service management: flexible service management, or the ability to move work or workloads to other locations inside the data center, in the form of workload movement or virtual machine migration.
* Creates the need for traffic engineering solutions inside the data center.
* A key enabling technology in data center networking is the ability to virtualize servers — makes it possible to quickly envision/move/migrate servers and services in response to fluctuations in workload. It’s relatively easy, but we must also develop TE solutions that allow the network to reconfigure in response to changing workloads and migrating services.

Challenges of data center networking

Study These Flashcards

Traffic load balance
Support for virtual machine migration in response to changing demands
Adjusting server and traffic placement to save power
Provisioning the network when demands fluctuate
Providing various security guarantees, particularly in scenarios that involve multiple tenants

3 layers of a typical data center topology

Study These Flashcards

Access layer: connects servers themselves
Aggregation layer: connects the access layer
Core: historically connected with layer 3. Increasingly, modern data centers are connected with an entire layer 2 topology.
* This makes it easier to perform migration of services from one part to another, since they can stay on the same layer 2 network, and hence would not need new IP addresses when they moved. It’s also easier to load balance traffic. On the other hand, a monolithic layer 2 topology makes scaling difficult since now we have 10’s of 1000’s of servers on a single flat topology. Layer 2 addresses are not topological, so the forwarding tables in the switches can’t scale, because they can’t take advantage of the natural hierarchy that exists in the topology.
Other problems that exist is that the hierarchy can create single points of failure, and links at the top of the topology (in the core) can become oversubscribed.
* As you move from the bottom to the top of the hierarchy, the links at the top can carry as much as 200x more traffic as the links at the bottom of the hierarchy. This is a serious capacity mismatch.

Solution to scale problem in data centers (due to flat topology)

Study These Flashcards

Cause of scale problem: lots of servers on a flat topology (their MAC/hardware addresses are topology-independent)
* Thus, in the default behavior, every switch in the topology has to store in its forwarding table every single MAC address.

One solution:
-Pods: assign pseudo-MAC addresses to each server corresponding to the Pod in which they’re located in the topology. So, each server has a real and pseudo-MAC address. Now, switches only need to maintain entries for reaching other pods in the topology. Once a frame enters a pod, the switch then of course has entries for the servers inside that pod.

Problem:
* Hosts are unmodified, so they’re still going to respond to ARP queries with their real MAC addresses. Need to deal with that, and need to map pseudo-MAC addresses to real MAC addresses.
Solution:
* When a host such as server A issues an ARP query, it’s intercepted by the switch. Instead of flooding it, the switch forwards it to a fabric manager, which responds with the pseudo-MAC (PMAC) corresponding to that IP address.
* Host A then sends the frame with the destination PMAC address, and switches in the topology can forward that frame to the appropriate pod corresponding to the appropriate PMAC address of the destination server.
* The switch at that pod then maps the PMAC address back to the real MAC address. Then, the destination server receives the ethernet frame with its real ethernet MAC address, so it knows that the ethernet frame was intended for it.
* This allows us to achieve hierarchical forwarding in a large layer-2 topology without having to modify any host software.

2 main objectives of VL2

Study These Flashcards

Achieve layer-2 semantics across entire data center topology: done with a name/location separation and a resolution service that resembles the fabric manager.
To achieve uniform high-capacity in the servers, and balance load, VL2 relies on flow-based random traffic indirection using valiant load balancing.

Why we need it:

* Existing data center topologies provide extremely limited server-to-server capacity because of the oversubscription of the links at the top of the hierarchy
* As services continue to be migrated to different parts of the data center, resources can be fragmented, significantly lowering utilization (for example, one service represented by green below is in different VMs in data center — this lowers utilization and cost-efficiency). Reducing this is a complicated Level-2/Level-3 routing reconfiguration, but we want the abstraction of just 1 large layer-2 switch, which is the abstraction that VL2 provides.

Goals of Valiant Load balancing

Study These Flashcards

Spread traffic evenly across the servers
Ensure traffic load is balanced independently of destinations of the traffic flows
VL2 achieves this by inserting an indirection level into the switching hierarchy. When a switch at the access layer wants to send traffic to a destination, it first selects a switch at the indirection level to send the traffic at random. This intermediate switch then forwards the traffic to the ultimate destination depending on the destination MAC address of the traffic.
Subsequent flows might pick different indirection points at random. This whole idea actually comes from multi-processor architectures and has been rediscovered in the context of data centers.
This achieves better load balance than in traditional fat tree networks.

Jellyfish

Study These Flashcards

Jellyfish is a technique to network data centers randomly. Its goals are:
* Achieve high throughput, to support big data analytics, agile placement of VMs
* Incremental expandability so that network operators can easily add or replace servers and switches.
* For example, large companies like Facebook are adding capacity on a daily basis. Commercial products make it easy to expand or provision servers in response to changing traffic load but not the network.
Unfortunately, the structure of the data center network constrains expansion. Structures such as a hypercube require 2^k switches, where k is the number of servers.
* Even more efficient topologies like a Fat Tree are still quadratic in the number of servers.

Where does data center topology primarily constrain expansion?

Study These Flashcards

Top-level switches

What is Jellyfish's topology?

- Jellyfish’s topology is what is called a random regular graph: * Random because each graph is uniformly selected at random from a set of all regular graphs * Regular: each node has same degree * Graph in Jellyfish means switches are the nodes

With N racks, how many servers can a data center support?

N(ki-ri) servers, where every top-of-rack switch, i, has some total number of ki ports, of which it uses ri to connect to other top-of-rack switches. -The network is a random regular graph, denoted as RRG(N, k, r) -Formerly, RRG’s are sampled uniformly from the space of all r-regular graphs. Achieving such a property is a complex graph theory problem, but there’s a simple procedure that produces a sufficiently uniform random graph that empirically have the desired properties.

Steps to construct a Jellyfish topology

1. Pick a random switch pair with free ports for which the switch pair are not already neighbors 2. Join them with a link 3. Repeat this process until no further links can be added. * If a switch remains with >= 2 free ports, which might happen during the incremental expansion by adding a new switch, switches can be incorporated into the topology by removing a uniform random existing link, and adding links to that switch, * For a particular equipment cost, using identical equipment, the Jellyfish topology can achieve increased capacity by supporting 25% more servers. This higher capacity is achieved because the paths through the topology are shorter than they would be in a Fat Tree topology.

Open questions of Jellyfish - Topology design perspective

* How close are these random graphs to optimal, in terms of the optimal throughput that could be achieved for a particular set of equipment? * What about topologies where switches are heterogeneous, with different numbers of ports or link speeds?

Open questions of Jellyfish - System design perspective

* Random topology model could create problems with physically cabling the data center network * Questions about how to perform routing or congestion control without the structure of a conventional data center network like a Fat Tree.

Link utilization equation

UL/CL (UL = traffic on the link, CL = capacity of the link) Objective goal is to minimize the sum of this piecewise linear cost function over all the links in the network

Lesson 10 - Traffic Engineering Flashcards

(30 cards)