Traffic Engineering defined
Informally, traffic engineering is a fancy way of describing how network operators deal with large amounts of data flowing through their networks.
Traffic engineering is the process of reconfiguring the network in response to changing traffic loads to achieve some operational goal.
Example operational goal accomplished by traffic engineering
A network operator might reconfigure the network in response to changing loads to, for example, maintain traffic ratios in a peering relationship, relieve congestion on certain links, or balance load more evenly across the available links in the network.
Key questions traffic engineering tries to address
how routing should adapt to traffic
intradomain traffic engineering
how to reconfigure the protocols within a single autonomous system to adjust traffic flows
interdomain traffic engineering
how to adjust how traffic flows between autonomous systems
In a single autonomous system with static link weights, routers flood information to one another to learn the network topology, including the weights on the links connecting individual routers.
By adjusting the link weights in an intradomain topology, the operator can affect how traffic flows between different points in the network, thus affecting the load on the network links. In practice, network operators set these link weights in a variety of ways: inversely proportional to capacity, proportional to propagation delay, or according to some network-wide optimization based on measured traffic.
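The first two weight-setting heuristics can be sketched in a few lines. The reference bandwidth and example capacities below are illustrative assumptions (the inverse-capacity style mirrors how OSPF costs are often derived from a reference bandwidth):

```python
REF_BW = 100_000  # reference bandwidth in Mbps (illustrative assumption)

def weight_inverse_capacity(capacity_mbps):
    """Link weight inversely proportional to capacity: faster links get lower weights."""
    return max(1, round(REF_BW / capacity_mbps))

def weight_propagation_delay(delay_ms):
    """Link weight proportional to propagation delay: shorter links are preferred."""
    return max(1, round(delay_ms))

print(weight_inverse_capacity(10_000))  # 10 Gbps link -> weight 10
print(weight_inverse_capacity(100))     # 100 Mbps link -> weight 1000
print(weight_propagation_delay(35.2))   # long-haul link -> weight 35
```

Either heuristic yields a complete weight assignment; the network-wide optimization approach instead searches over weight settings directly, as described below.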
Steps to Traffic engineering
Interdomain application of three steps of traffic engineering
Link Utilization Function
represents the idea that the cost of congestion increases roughly quadratically as the load on a link increases, becoming increasingly expensive as link utilization approaches one
utilization defined
the amount of traffic on the link divided by the capacity
utilization objective
minimize the sum of the piecewise linear cost function over all the links in the network
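A minimal sketch of this objective, using a convex piecewise-linear cost in the style of the Fortz-Thorup function common in the traffic engineering literature (the exact breakpoints and slopes here are an assumption for illustration):

```python
def link_cost(load, capacity):
    """Piecewise-linear approximation of a quadratic-like congestion cost.
    The slope grows steeply as utilization approaches (and exceeds) one."""
    u = load / capacity  # utilization
    if u < 1/3:
        return u
    if u < 2/3:
        return 3*u - 2/3
    if u < 9/10:
        return 10*u - 16/3
    if u < 1:
        return 70*u - 178/3
    if u < 11/10:
        return 500*u - 1468/3
    return 5000*u - 16318/3

def network_cost(links):
    """Objective: the sum of the piecewise-linear cost over all links,
    where each link is a (load, capacity) pair."""
    return sum(link_cost(load, cap) for load, cap in links)
```

A link at 95% utilization contributes far more to the objective than a link at 50%, so the optimizer is pushed toward spreading load away from hot links.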
Unfortunately, solving this optimization problem is NP-complete, which means there is no known efficient algorithm to find the optimal setting of link weights, even for simple objective functions. The practical implication is that we have to search through a large set of combinations of link-weight settings to find a good one. Searching through all possible link-weight settings is clearly expensive, but the graphs turn out to be small enough in practice that this approach is still reasonably effective.
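A hedged sketch of such a search: a simple hill climb that repeatedly perturbs one link weight and keeps the change if it lowers the objective. The `evaluate` callback is a hypothetical black box that would recompute shortest paths and link loads for a weight setting and return the network cost:

```python
import random

def local_search(weights, evaluate, candidates=range(1, 21), iters=200, seed=0):
    """Hill-climb over link-weight settings. `weights` maps link -> weight;
    `evaluate` returns the objective for a candidate setting (lower is better)."""
    rng = random.Random(seed)
    best = dict(weights)
    best_cost = evaluate(best)
    for _ in range(iters):
        trial = dict(best)
        link = rng.choice(list(trial))           # perturb a single link weight
        trial[link] = rng.choice(list(candidates))
        cost = evaluate(trial)
        if cost < best_cost:                     # keep only improving changes
            best, best_cost = trial, cost
    return best, best_cost
```

Because single-weight changes often suffice in practice, this kind of greedy search tends to find good settings long before exhausting the combinatorial space.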
Link Utilization concerns
minimizing the number of changes to the network: often changing just one or two link weights is enough to achieve a good traffic load balance
Requirements for link utilization optimization
resistant to failure
robust to measurement noise.
limit the frequency of changes that we make to the network
examples of interdomain routing
Peering between two ISPs
peering between a university network and its ISP
any peering at an Internet exchange point
Routing between multiple data centers typically involves routing in a wide area, and hence may involve inter-domain routing
but not routing within a single data center, because that is routing within a single domain
BGP in Interdomain Traffic Engineering
Interdomain traffic engineering involves reconfiguring the Border Gateway Protocol (BGP) policies or configurations running on individual routers in the network
cause routers inside an autonomous system to direct traffic to or away from certain edge links
change the set of egress links for a particular destination.
For example, an operator of autonomous system 1 might observe traffic to destination D traversing the green path. By adjusting BGP policies, the operator might balance load across these two edge links, or shift all of the traffic for that destination to the lower path.
An operator might wish to use interdomain traffic engineering if an edge link is congested, if a link is upgraded, or if there is some violation of a peering agreement. For example, suppose AS1 and AS2 have an agreement that they only send a certain amount of traffic load over a link in a particular time window. If the load exceeds that amount, the operator would need to use BGP to shift traffic from one peering link to another.
3 Interdomain Traffic Engineering Goals
how the inter-domain routing choices of a particular autonomous system can wreak havoc on predictability
Let’s suppose that a downstream neighbor is trying to reach the autonomous system at the top of this figure. The AS here might wish to relieve congestion on a particular peering link. To do so, this AS might send traffic to that destination out through a different set of autonomous systems. But once this AS makes that change, note that it is choosing a longer AS path: three hops rather than two. In response, the downstream neighbor might decide not to send its traffic for that destination through this autonomous system at all, thus affecting the traffic matrix that this AS sees. All the work that went into optimizing the traffic load balance for this AS is for naught, because the change it made effectively altered the offered traffic loads, and hence the traffic matrix.
Avoiding interdomain traffic engineering affecting predictability
Need to achieve predictable traffic flow changes
avoid making changes like this that are globally visible. In particular, note that this change increased the AS path length of the advertisement for this destination from two to three. Other neighbors, such as the downstream neighbor here, might decide to use an alternate path as a result of that globally visible routing change. By avoiding these types of globally visible changes, we can achieve predictability.
limit the influence of neighbors. For example, an autonomous system might try to make a path look longer with AS path prepending. If we treat paths that have almost the same AS path length as a common group, we can achieve additional flexibility.
enforce a constraint that our neighbors should advertise consistent BGP route advertisements over multiple peering links, where multiple peering links exist. That gives us additional flexibility to send traffic to the same autonomous system over different egress points.
Reducing the overhead of routing changes
Enforcing consistent advertisements turns out to be difficult in practice, but it is doable, and it helps reduce the overhead of routing changes.
We can group related prefixes. Rather than exploring all combinations of prefixes to move a particular volume of traffic, we can identify routing choices that group routes sharing the same AS path, and move prefixes together according to those groups. This allows us to move groups of prefixes by making tweaks to local preference, matched using regular expressions on the AS path.
We can also focus on the small fraction of prefixes that carry the majority of traffic. Ten percent of origin ASes are responsible for about 82 percent of outbound traffic, so we can achieve significant gains in rebalancing traffic by focusing on the heavy hitters.
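Both ideas can be sketched together: group prefixes by shared AS path, then rank the groups by traffic volume to find the heavy hitters worth moving. The prefixes, AS paths, and volumes below are made-up measurements for illustration:

```python
from collections import defaultdict

# Illustrative (made-up) measurements: prefix -> (AS path, traffic volume)
prefixes = {
    "10.1.0.0/16": (("AS2", "AS7"), 900),
    "10.2.0.0/16": (("AS2", "AS7"), 850),
    "10.3.0.0/16": (("AS3", "AS7"), 40),
    "10.4.0.0/16": (("AS3", "AS9"), 10),
}

# Group prefixes that share an AS path: one local-preference tweak matching
# that path moves the whole group at once.
groups = defaultdict(list)
for prefix, (as_path, volume) in prefixes.items():
    groups[as_path].append((prefix, volume))

# Rank groups by total volume and focus on the heaviest.
ranked = sorted(groups.items(), key=lambda kv: -sum(v for _, v in kv[1]))
top_path, top_group = ranked[0]
print(top_path, sum(v for _, v in top_group))  # ('AS2', 'AS7') 1750
```

Here, moving the single ("AS2", "AS7") group shifts 1750 units of traffic, while the remaining groups together carry only 50: exactly the heavy-hitter effect described above.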
In summary, to achieve predictability in interdomain traffic engineering
effect changes that are not globally visible.
enforce consistent advertisements and limit the influence of AS path length, to limit the influence of neighbors
group prefixes according to those that have a common AS path, to reduce the overhead of routing changes
move traffic in terms of groups of prefixes
Multipath Routing
Another way to perform traffic engineering
an operator can establish multiple paths in advance
This approach applies both to intra-domain routing, and inter-domain routing
equal cost multipath routing
Intra-domain routing
– set link weights such that multiple paths of equal cost exist between two nodes in the graph
Thus traffic will be split across paths that have equal costs through the network.
A source router might also be able to change the fraction of traffic that’s sent along each one of these paths. Sending for example 35% along the top path and 65% along the bottom path.
It might even be able to do this based on the level of congestion that’s observed along these paths.
The router does this by installing multiple forwarding table entries with different next hops for outgoing packets to the same destination.
how can a source router adjust paths to a destination when there are multiple paths to the destination
A source router can adjust traffic over multiple paths by having multiple forwarding table entries for the same destination, and splitting traffic flows across the multiple next hops, depending on, for example, the hash of the IP packet header.
Data Center Networking 3 important characteristics
key enabling technology in data center networking
the ability to virtualize servers. This makes it possible to quickly provision, move, and migrate servers and services in response to fluctuations in workload. But while provisioning servers and moving them is relatively easy, we must also develop traffic engineering solutions that allow the network to reconfigure in response to changing workloads and migrating services
Data Center Networking Challenges
traffic load balancing, and support for migrating virtual machines in response to changing demands
Adjusting server and traffic placement to save power
Provisioning the network when demands fluctuate
providing various security guarantees, particularly in scenarios that involve multiple tenants
To understand these challenges in a bit more detail, let’s take a look at a typical data center topology. A topology typically has three layers: an access layer, which connects the servers themselves; an aggregation layer, which connects the access layer; and the core. Historically, the core of the network has been connected at layer 3, but increasingly, modern data centers are built as an entire layer-2 topology. A layer-2 topology makes it easier to migrate services from one part of the topology to another, since these services can stay on the same layer-2 network and hence do not need new IP addresses when they move. It also becomes easier to load balance traffic.

On the other hand, a monolithic layer-2 topology makes scaling difficult, since we now have tens of thousands of servers on a single flat topology. Layer-2 addresses are not topological, so the forwarding tables in these switches cannot scale as easily, because they cannot take advantage of the natural hierarchy that exists in the topology. Another problem with this type of topology is that the hierarchy can create single points of failure, and links at the top of the topology, in the core, can become oversubscribed. Modern data center operators have observed that, moving from the bottom of the hierarchy up towards the core, the links at the top can carry as much as 200 times the traffic of the links at the bottom. So there is a serious capacity mismatch: the top part of the topology has to carry far more traffic than the bottom.
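The capacity mismatch is easy to see with back-of-the-envelope arithmetic: oversubscription compounds at each layer going up the hierarchy. All the numbers below (servers per rack, NIC and uplink speeds, racks per aggregation switch) are illustrative assumptions, not measurements from any real data center:

```python
# Illustrative 3-tier topology parameters (assumptions).
servers_per_rack = 40
server_nic_gbps = 1
rack_uplink_gbps = 10          # access switch uplink toward aggregation
racks_per_agg = 16
agg_uplink_gbps = 40           # aggregation switch uplink toward core

# Offered load vs. uplink capacity at each layer.
access_oversub = (servers_per_rack * server_nic_gbps) / rack_uplink_gbps
agg_oversub = (racks_per_agg * rack_uplink_gbps) / agg_uplink_gbps

# Ratios multiply going up the hierarchy.
total_oversub = access_oversub * agg_oversub
print(access_oversub, agg_oversub, total_oversub)  # 4.0 4.0 16.0
```

Even with modest 4:1 oversubscription at each of two layers, servers can collectively offer 16 times what the core links can carry, which is how the much larger mismatches observed in practice arise.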