Previous Page
Next Page

2.12. Routing

Later in the text, we will need some rudimentary information about routing protocols, so let's briefly review a few basics here. Our goal is not to examine any particular routing protocol in depth but merely to understand the various types of routing protocols and how they interoperate to provide routers with the information they need to choose the next hop for an incoming IP datagram. See [Comer 2000] for an excellent summary of routing and routing protocols. For detailed discussions of the algorithms, their strengths and weaknesses, and how they work, see [Huitema 2000] and [Perlman 2000].

As we've seen, hosts usually have routing (forwarding) tables that contain at most a few routes to specific networks or hosts, and a default route for all other traffic. Indeed, most hosts probably have only a default route, so that all traffic that is not bound to a destination on the same network is sent to a single router that forwards it on to its destination. Typically, the default route and any others a host might need are static; that is, they are set manually by the system administrator or user of the host.

In a small organization, the routers could be configured statically too, but it's obvious that this solution doesn't scale beyond a very small number of routers. That realization brings us to the central question of this section: How do the routers learn the routes they use to populate their routing tables? The obvious answer is that the routers exchange information about which networks they can reach, but that answer brings its own questions.

For example, if an organization has a router sitting on one of its networks, how many other routers should it exchange routes with? It's obviously impractical for the router to exchange routes with every other router on the Internet, so where do we draw the line? Let's consider an organization, such as an ISP, that manages a large number of networks. Figure 2.40 represents a small part of the total network topology of the organization.

Figure 2.40. Part of the Network Topology for a Large Organization


Suppose that a host on network 4 wants to send an IP datagram to a host on network 3. Notice that the datagram could take two possible paths: R4R2R3 and R4R5. R4 must have enough information to choose the best next hoppresumably, R5 in this case. In order to do this, R4 has to have at least a partial understanding of the topology of the network. Similarly, the other routers must understand the network topology well enough to choose the best path for datagrams. Router R1, for example, must have an entry in its routing table for each of the internal networks so that it can deliver datagrams coming in from the outside. To obtain this information, the internal routers exchange messages telling each other what networks they can reach and perhaps how many hops away each network is.

On the other hand, outside routers don't need to have any understanding of the organization's internal network topology. They need know only that datagrams destined for any of the organization's networks should be sent to R1. Thus, R1 is the only router that needs to exchange routes with outside routers. This leads us to the observation that routing information exchanged within an organization's networks is likely to be different from the routing information exchanged with outside routers. The information exchanged internally must reflect the network topology, whereas the information exchanged with outside routers need merely contain reachability data.

Autonomous Systems

As we've seen, internal and external routers need different types of routing information. The term autonomous system (AS) formalizes the notion of internal. An autonomous system is a collection of networks and routers under the control of a single authority. Each autonomous system is assigned a number by the Internet Assigned Numbers Authority (IANA). Examples of autonomous systems in the United States are Harvard (11), Earthlink (3703), and Novell (3680).

For our purposes, the importance of autonomous systems is that the authority in charge of each system is totally responsible for its internal topology and routing. Each autonomous system chooses one or more routing protocols and any policies that apply to those protocols. To an outsider, an autonomous system appears, in the words of RFC 1772 [Rekhter and Gross 1995], "to have a single coherent interior routing plan and presents a consistent picture of which destinations are reachable through it." As suggested in Figure 2.40, each autonomous system has one or more gateways, called border routers, that exchange traffic with other autonomous systems.

Autonomous systems exchange traffic at network access points (NAPs), which are facilities where several autonomous systems maintain border routers on a common network so that they can exchange traffic with one another. Examples of NAPs are MAE-East (Washington DC), Chicago NAP, LINX (London), MAE-Paris, CATNIX (Barcelona), and several others throughout the United States and the world. In addition, some ISPs have private peering agreements that take place outside of NAPs.

A routing protocol that runs within an autonomous system is called an interior gateway protocol (IGP); one that exchanges routes between autonomous systems is called an exterior gateway protocol (EGP). The next two subsections briefly examine each type of protocol.

Interior Gateway Protocols

Autonomous systems use interior gateway protocols to enable internal routers to optimize their forwarding decisions by learning the best next hop for IP datagrams. In order to make optimum decisions, each router must understand the network topology at least to the extent that it can assign a cost to each next hop. For example, let us assume that the routers in Figure 2.40 measure the cost of a route by the number of hops a datagram must take to reach its final destination. If router R4 needs to forward a datagram to network 3, it would choose the path through R5 with a cost of 2 rather than the path through routers R2 and R3 with a cost of 3. Note that R4 needn't understand the exact route the datagram will take after the next hop, merely that one next hop will yield a lower-cost route than the other.

There are two main classes of IGPs: distance-vector protocols and link-state protocols. In distance-vector protocols, as exemplified by the Routing Information Protocol (RIP) [Hedrick 1988] and RIP2 [Malkin 1994], each router sends its neighboring routers a copy of its routing table, which includes the cost to reach each destination network.

More precisely, each router sends its neighbors certain columns from its routing table. There is no need, for example, to send the link that a certain network is reachable through; the network and cost to reach it are sufficient.

In this way, each router builds up a routing table that contains the next hop and cost to reach each network.

In Figure 2.40, for example, R5 would inform R3 and R4 that it can reach network 5 in one hop. R3 would enter a route to network 5 in its routing table with a next hop of R5 and a cost of 2 and then inform R1 of its routes. Router R1 would make an entry in its routing table for network 5 with a next hop of R2 and a cost of 3. These routers may learn better routes to network 5 as they hear from other routers, so these initial entries may be replaced as the routing protocol converges to a final state. Each time a router updates its forwarding table, it must send the updated table to its neighbors.

Several problems with distance-vector protocols limit their use to relatively small networks. First, because each router sends its neighbors information proportional to the size of its routing table, the traffic generated by the routing protocol itself can become significant.

Second, distance-vector protocols can be slow to converge after a network changesuppose that the direct line between R4 and R5 of Figure 2.40 goes down, for instanceand it's easy for routing loops to develop unless special care is taken to prevent them. Finally, each router depends on its neighbors to calculate part of its routing table. In this sense, the calculation of each router's forwarding table is a distributed process, and if one router makes an error in calculating its routes, it can affect the routing tables of all the routers.

More formally, each router takes part in a distributed Bellman-Ford algorithm calculation [Ford and Fulkerson 1962] to arrive at a final common routing table.

Link-state, or shortest path first, protocols take a different approach. Each router broadcasts a list of currently reachable directly connected routers and networks and the cost of reaching them; this process is called flooding. Note that the amount of information that each router transmits is proportional to the number of interfaces on the router, not the size of its routing table.

Each router then uses this information to build a directed graph representing the network. For example, with the network of Figure 2.40, such a graph would be similar to Figure 2.41.

Figure 2.41. A Graph of the Network in Figure 2.40


Each router can now independently calculate the shortest path to each of the other nodes in the graph, using Dijkstra's algorithm [Dijkstra 1959] or a similar method.

By "shortest path" we mean the path of least cost. Several metrics are possible for measuring cost. We could use hops, as with RIP, or some other measure such as delay, link speed, monetary cost, or reliability.

The most common link-state routing protocol is the Open Shortest Path First (OSPF) protocol [Moy 1998a, Moy 1998b]. Because it scales better than distance-vector protocols and can divide an autonomous system into semi-independent areas that operate within the autonomous system, much like the autonomous systems operate within the greater Internet, OSPF is the IGP of choice for all but the smallest networks. OSPF has other optimizations and capabilities as well. See the preceding references for complete details.

The IS-IS (intermediate system to intermediate system) protocol, another link-state routing protocol, is very similar to OSPF but for historical reasons is used primarily by ISPs, whereas OSPF is used primarily on customer networks. For more information on IS-IS and how it differs from OSPF, see [Perlman 2000].

A third choice for routing within autonomous systems, ix autonomous~system is the Enhanced Interior Gateway Routing Protocol (EIGRP). EIGRP is a Cisco proprietary protocol in the distance-vector family. Although similar to RIP, EIGRP has many improvements that help prevent loops from forming and avoids some of the other problems with distance-vector protocols while maintaining much of their simplicity.

Exterior Gateway Protocols

In the history of the Internet, several exterior gateway protocols have been used. Given the architecture of the Internet today, however, virtually all border routers use the Border Gateway Protocol (BGP) [Rekhter and Li 1995].

In the early days of the Internet, there was a backbone network to which the other networks connected in a manner similar to the way hosts connect to an Ethernet backbone today. With that topology, simpler exterior routing protocols, such as the Exterior Gateway Protocol (EGP), were used. Today's topology of autonomous systems exchanging traffic at a series of NAPs or through private peerings requires a more complicated protocol.

BGP views the Internet as a graph where the nodes are the autonomous systems and the edges are the links between them. The internal topology of an external AS plays no role in BGP. Indeed, an AS is generally ignorant of a neighboring autonomous system's internal structure except for the networks that it contains.

As a first approximation, border routers use BGP to tell a peer border router in a neighboring AS what networks it can reach. These networks may be interior networksthose belonging to the router's ASor they may be networks belonging to another AS that the border router knows how to reach. When a border router receives routes from a peer, it redistributes them to the interior routers, using whatever IGP the AS is running.

Our first-approximation description covers the simplest case of an autonomous system having a single border router. In this case, the border router has no real choices to make. It collects reachability information from its external peers and distributes it to the internal routers. Because it is participating in the IGP of the AS, the border router knows what internal networks are reachable, so it informs its external peers of these networks so that external hosts can reach them.

It may be that the manager of the AS wants to keep some or all of the internal networks private. In this case, the border router can be configured not to advertise them to its external peers. Likewise, it is conceivable that the manager may wish to prohibit internal access to certain external networks. Again, the border router can be configured to ignore these routes and not distribute them to the internal routers. Other than these exceptional cases, the single border router merely exchanges reachability information with its external peers.

Now let's consider a slightly more complicated case. Figure 2.42 shows parts of three autonomous systems, each having two border routers. Notice that from the point of view of a router within AS1, network N3 in AS3 is reachable through both routers B1 and B2. From the figure, it appears that, all links being equal, the route through B2 is preferable. Thus, only router B2's route to N3 should be advertised to the internal routers.

Figure 2.42. Parts of Three Autonomous Systems


We might think that both B1 and B2 could advertise their routes but with different metrics. The problem is that BGP doesn't have metrics in the sense that RIP and OSPF do. It's not difficult to see why: Different autonomous systems can use different metrics within their networks, and it may not make sense to compare them. For example, suppose that AS2 uses hop count as a metric but that AS3 uses link speed. What metric should B3 report to B1 for network N3? Similarly, how could an internal router, such as I1, compare the metrics it got from B1 and B2?

This raises the question of how a border router knows whether to advertise a route it has learned from one of its peers. In our example from Figure 2.42, B1 should not advertise a route to network N3, even though N3 is reachable through B1. The answer is that the internal border routers talk to each other.

Ideally, the BGP instance on each border router establishes a TCP connection with every other border router in the autonomous system so that the connections between the routers form a full mesh. This may not be practical for very large autonomous systems, so sometimes the AS manager will configure the routers to form only a partial mesh. These connections are logical TCP connections, not necessarily direct physical connections. Indeed, the border routers are likely to be on separate networks in different, widely dispersed geographical locations.

BGP is an example of a path-vector protocol. Each BGP route advertisement includes additional information about the route, the most important of which is a list of the autonomous systems that must be traversed to reach the destination. This list of autonomous systems is referred to as a path. When BGP receives path information to a network in an external AS, it chooses the best path to that destination among all the paths that it knows about. This best path is distributed to the other internal BGP routers.

The factors considered in choosing a best path are determined by BGP configuration. RFC 1772 [Rekhter and Gross 1995] lists such possible considerations as

  • AS countthe number of autonomous systems in the path

  • Policy considerationsa decision by the AS manager not to use routes through a certain AS or a preference for one AS to another

  • Path originwhere the route was learned from: BGP, an IGP router, or another EGP

  • AS path subsetsgenerally, preference for a path that is a subset of another

  • Link dynamicspreference for stable paths over unstable paths

The other internal BGP routers do the same thing, of course, so that at the end of this process, each BGP router will have a list of candidate paths to the destination. Each router then calculates the most-preferred path, which is distributed to the IGP routers and the external BGP routers. All internal BGP routers perform the same best-path calculation, so they will agree on the preferred path to the destination, and will advertise the same path.


Previous Page
Next Page