9.0. Introduction

Border Gateway Protocol (BGP) Version 4 is the lifeblood of the Internet. It is responsible for exchanging routing information between all of the major Internet Service Providers (ISPs), as well as between larger client sites and their respective ISPs. And in some large enterprise networks, BGP is used to interconnect different geographical or administrative regions.

Primarily to support the complexity of the public Internet, Cisco has added several clever and useful features to its BGP implementation. Because this book is focused on solutions to real-world problems, we will not try to describe all of these features. And it would take a whole book to describe how to operate BGP in a large ISP network, so we avoid discussing extremely large-scale BGP problems. Instead, we look at two main classes of BGP problems: connecting a network to the public Internet, and interconnecting two or more Interior Gateway Protocols (IGPs) in an Enterprise network.

A detailed discussion of the BGP protocol and its features is out of the scope of this book. For this type of information, we recommend referring instead to IP Routing by Ravi Malhotra (O'Reilly), or BGP by Iljitsch van Beijnum (O'Reilly). The current protocol definition is contained in RFC 4271 (January 2006), which provides several important updates from the original RFC 1771 (March 1995).

We include a brief review of the most critical concepts. BGP is an Exterior Gateway Protocol (EGP), which means that it exchanges routing information between Autonomous Systems (ASs). This is different from pure IGPs, such as RIP, EIGRP, and OSPF, which we discussed in Chapters 6, 7, and 8, respectively. It also uses a different basic algorithm for building a loop-free topology than any of those protocols. RIP is a Distance Vector protocol, OSPF is a Link State protocol, and EIGRP is a Distance Vector protocol that incorporates many of the advantages of a Link State protocol.
BGP, on the other hand, uses a Path Vector algorithm. This means that instead of reducing each route's relative importance in the routing table to a single metric or cost value, BGP keeps a list of every AS that the path passes through. It uses this list to eliminate loops because a router can check whether a route has already passed through a particular AS by simply looking at the path.

Basic Terminology

One of the most critical concepts in BGP is the Autonomous System (AS). RFC 1930 describes what the Internet Engineering Task Force (IETF), which is the official Internet standards organization, considers to be the Best Current Practices (BCP) for creating and numbering ASs. This document defines an AS as "a connected group of one or more IP prefixes run by one or more network operators which has a single and clearly defined routing policy." In practical terms, what appears on the Internet as a single AS may in fact represent an ISP as well as all of the customer networks of this ISP that aren't using BGP to advertise themselves as unique administrative domains.

A consistent routing policy in this context means that if a device on the edge of the AS advertises that it can handle routing for a particular set of prefixes, then all of the routers in the same AS can handle the same prefixes. It doesn't matter whether some of these prefixes refer to internal routes and others refer to external routes. What matters is that the routers inside the AS must agree with one another on how to handle each route, and which internal or external router is the best place to send traffic for this particular network. This is what it means for the AS to behave consistently.

It is important to note that this definition doesn't mean that there has to be one and only one IGP inside of an AS. In fact, there could be many IGPs, and there could even be no IGP. The interior routing inside of the AS could be handled entirely by a combination of BGP and static routes, for example.
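
To make the Path Vector idea from the start of this discussion concrete, here is a minimal Python sketch of the loop check. It is only an illustration and not how any router implements BGP: the function names and the private ASN values (65001 through 65003) are assumptions chosen for the example.

```python
from typing import List


def advertise(local_asn: int, as_path: List[int]) -> List[int]:
    """Prepend our own ASN before passing a route to a neighboring AS."""
    return [local_asn] + as_path


def accept_route(local_asn: int, as_path: List[int]) -> bool:
    """Accept a route only if our own ASN is not already in its path."""
    return local_asn not in as_path


# Hypothetical example: a route originates in AS 65001 and is passed
# along by AS 65002. The ASNs are illustrative private values.
path = advertise(65001, [])
path = advertise(65002, path)

# AS 65003 has not seen this route before, so it can accept it.
print(accept_route(65003, path))

# If the route finds its way back to AS 65001, that AS sees its own
# number in the path and rejects the route as a loop.
print(accept_route(65001, path))
```

The same check explains why two networks that share an ASN cannot exchange routes: each one sees its own number in the other's advertisements and treats them as loops.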
BGP routers talk to one another over a permanent TCP connection on port 179. When BGP operates between two routers that are in the same AS, it is called Interior Border Gateway Protocol (iBGP). And when the peers are in different ASs, they use External Border Gateway Protocol (eBGP). Unless you are using one of the more complex features that were invented to improve scalability, all of the BGP routers in an AS must peer with one another in a complete mesh. This ensures that the AS behaves consistently when advertising routes to other ASs.

Synchronization is a concept that comes up frequently in BGP configurations. Because the AS needs to behave consistently, if you run an IGP and iBGP, they have to agree. Think of a network where the iBGP peers are several hops apart and the intervening network uses an IGP to communicate between them. Synchronization requires that for a BGP route to be usable, the IGP must also contain a route to the same prefix. This ensures that one of these BGP peer routers doesn't try to forward a packet to the other internal BGP peer unless the network connecting them knows what to do with this packet.

Cisco routers allow you to disable synchronization. This is actually necessary in any case when you don't redistribute the BGP routes into the IGP. But then you have to make sure that your network design doesn't require the IGP to have access to the BGP routes in order to communicate between the iBGP peers.

Every discussion of BGP includes frequent references to IP prefixes. A prefix is a Classless Interdomain Routing (CIDR) block of addresses. We previously discussed CIDR in Chapter 5. CIDR is a set of rules for IP subnetting that allows you to summarize groups of IP addresses. For example, you might have four network segments that use the IP addresses 172.25.4.0/24, 172.25.5.0/24, 172.25.6.0/24, and 172.25.7.0/24. Each of these network addresses is a prefix.
If, for example, you wanted to send a packet to the device 172.25.5.5/32, your router only needs to know how to route packets for 172.25.5.0/24. This route prefix includes the specific host address. But you can go one step further than this. If the paths to all of these IP networks pass through the same router, it is often useful to summarize or aggregate the prefixes. The router that leads to all of these networks might simply advertise a single prefix, 172.25.4.0/22, which covers all of the individual networks. Similarly, CIDR allows you to create supernets that summarize several classful networks. For example, you could summarize 172.24.0.0/16 through 172.31.0.0/16 as 172.24.0.0/13.

BGP requires every AS to have a 16-bit Autonomous System Number (ASN). Because it is 16 bits long, the ASN can have any value between 0 and 65,535. The ASN is a globally unique identification number. BGP uses these ASNs to eliminate loops. Suppose two networks are using the same ASN. A router in the first AS will send out its routes normally, but the BGP router for the second network will drop these routes because they already appear to have passed through this AS. So it is important to ensure that you follow the standard rules for ASN selection, which are described in RFC 1930.

RFC 1930 originally divided up the range from 1 through 22,527 among the three major international Internet registry organizations (RIPE, ARIN, and APNIC) to allocate to networks connected to the public Internet. Since publication of that RFC, however, the IANA has distributed further blocks of numbers. At the time of writing, over 35,000 ASNs have been assigned, with some 22,000 being advertised to the public Internet at any given time.

Just as RFC 1918 defines private unregistered ranges of IP addresses for networks that don't connect directly to the public Internet, RFC 1930 defines a series of private unregistered ASN values.
You can use these private ASNs freely as long as they don't leak onto the public Internet. And, just as you can use NAT to hide your private IP addresses when you connect to the Internet, you can also hide private ASNs, as long as the AS that connects directly to the public Internet has a registered ASN and registered IP addresses. All ASN values between 64,512 and 65,534 are designated for private use. This gives 1,023 ASN values that you can freely use in your internal network without registering, and without fear of conflict.

If you use these private ASNs in an enterprise network, you must ensure that each private ASN is unique throughout the network. Enterprise networks that are large enough to require multiple ASs are generally managed by several different groups. So it is critical to coordinate the use of these private ASNs. If there is ever a conflict, with two ASs using the same ASN, it will disrupt routing to both of the conflicting ASs. And, if either of the conflicting ASs is used for transit, it could disrupt routing throughout the entire enterprise network, causing routing loops and unreachable networks. Each individual AS will continue to function normally internally, but traffic between ASs will behave unpredictably.

There are many situations when you can use unregistered ASNs. In fact, the only time you absolutely require a registered ASN is when you need to use BGP to exchange routing information with an ISP.

Note that if you only have a single link to a single ISP, then you really don't require BGP at all. If you only have a single connection to the Internet, then you can get by with a single default route pointing to your ISP's router because everything passes through this one link. If the link goes down, there's nothing you can do anyway. So in this case, running BGP is overkill. A small router with a default route is more than adequate. You should consult your ISP to discuss your options.
They might also be willing to let you use BGP and a private ASN, which they will remove when passing your routes to the rest of the world. Or they may even be willing to let you run a simpler routing protocol such as RIPv2 to provide redundancy among two or more links that all use their network.

In any case, your ISP will probably not pass your routes directly to the Internet anyway. It is more likely, and preferable, that they will allocate addresses to your network that are part of a range that they can summarize. Then the ISP will just pass a single routing entry to the rest of the Internet to represent many customer networks.

You can also do this kind of AS Path filtering internally. If you have several internal ASs, only one of which connects to the public Internet, then you can register the one directly connected ASN, and simply filter the private ASNs out of any path information that you pass to your ISPs. We show an example of this kind of filtering in Recipe 9.9.

Another special ASN value that bears mentioning is 65,535, which the IANA reserves for future requirements. RFC 1930, on the other hand, says that this ASN is part of the range that is freely available for unregistered use. We recommend avoiding this number because the IANA is the ultimate authority. Although there is currently no conflict with this number, the IANA may decide to give it some special significance later, which could break existing private networks that might use it.
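
The private ASN range and the idea of filtering private ASNs out of a path can be illustrated with a short Python sketch. This is a simplified model for explanation only, not the router's actual filtering logic (see Recipe 9.9 for the real configuration). The function names are assumptions, and 64500 is just a placeholder standing in for a registered ASN.

```python
from typing import List

# Private-use range as described in the text. 65535 is deliberately
# left out because the IANA reserves it.
PRIVATE_ASN_FIRST = 64512
PRIVATE_ASN_LAST = 65534


def is_private_asn(asn: int) -> bool:
    """Return True if the ASN falls in the private-use range."""
    return PRIVATE_ASN_FIRST <= asn <= PRIVATE_ASN_LAST


def strip_private_asns(as_path: List[int]) -> List[int]:
    """Drop private ASNs from a path before it is passed to an ISP."""
    return [asn for asn in as_path if not is_private_asn(asn)]


# Size of the private range: 65534 - 64512 + 1 values.
print(PRIVATE_ASN_LAST - PRIVATE_ASN_FIRST + 1)

# Hypothetical path: the registered edge AS (placeholder 64500) followed
# by two internal ASs that use private numbers.
internal_path = [64500, 65010, 65020]
print(strip_private_asns(internal_path))
```

After filtering, only the registered ASN remains in the path, so the internal AS structure stays hidden from the public Internet.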
BGP Attributes

BGP associates several different basic attributes with each route prefix. These attributes include useful pieces of information about the route, where it came from, and how to reach it.

Well-known attributes must be supported by every BGP implementation. Some well-known attributes are mandatory. All of the mandatory attributes must be included with every route entry. A BGP router will generate an error message if it receives a route that is missing one or more well-known mandatory attributes. There are also well-known discretionary attributes, which every BGP router must recognize and support, but that don't have to be present with every route entry.

Whenever a router passes along a route that it has learned via BGP to another BGP peer, it must include all of the well-known attributes that came with this route, including any discretionary attributes. Of course, the router may need to update some of these attributes before passing them along, to include itself in the path, for example.

BGP routes can also include one or more optional attributes. These are not necessarily supported by all BGP implementations. Optional attributes can be either transitive or nontransitive, which is specified by a special flag in the attribute type field. If a router receives a route with a transitive optional attribute, it will pass this information along intact to other BGP routers, even if it doesn't understand the option. The router will mark the Partial bit in the attribute flags to indicate that it was unable to handle this attribute, however. The router will quietly drop any unrecognized nontransitive optional attributes from the route information without taking any action.

We will now describe several of the most common BGP attributes.
Route Selection

Unlike the various interior routing protocols that we discussed in the preceding chapters, BGP doesn't support multipath routing by default. So if there are two or more paths to a destination, BGP will go to great extremes to ensure that only one of them is actually used. BGP decides which route to use by applying a series of tests in order. It is important to understand these tests and the order in which the router applies them, particularly when you are trying to influence which routes are used. Otherwise you might end up wasting a lot of time trying to adjust your routing tables by using one method, while the router is making the actual decision at some earlier step, and never seeing your adjustments.

Note that at each step, there may be several routes to the same destination prefix that all meet the requirement, or are equal after a particular test. In that case, BGP will proceed to the next test to attempt to break the tie.

We should point out that these are the route selection rules on Cisco routers. Several of these rules are not part of the BGP specification. So for non-Cisco equipment, you should consult the vendor's BGP documentation to see what the differences are.
Note that there are subtle variations to these rules for special situations such as AS Confederations, and many individual rules can be disabled if you want the router to skip them.

Cisco has also implemented a BGP Multipath option that changes this route selection process somewhat. If you enable multiple path support, BGP will still perform the first seven tests, evaluating everything up to and including the MED values. But if two or more routes are still equivalent at this point, the router will install some or all of them, depending on how you implement this feature. Please refer to Recipe 9.8 for a discussion of this option.
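
The idea of applying tests in order, and moving to the next test only when routes remain tied, can be sketched in Python. This is an illustration under stated assumptions, not Cisco's implementation. It uses only four commonly cited tests (weight, local preference, AS path length, and MED) rather than the full rule list, it compares MED unconditionally, and the `Route` class, route names, and ASN values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class Route:
    name: str
    weight: int
    local_pref: int
    as_path: Tuple[int, ...]
    med: int


# Each test maps a route to a number where smaller is better, so
# "highest wins" attributes are negated. Illustrative subset only.
TESTS: List[Tuple[str, Callable[[Route], int]]] = [
    ("highest weight", lambda r: -r.weight),
    ("highest local preference", lambda r: -r.local_pref),
    ("shortest AS path", lambda r: len(r.as_path)),
    ("lowest MED", lambda r: r.med),  # simplified: always compared
]


def select_best(routes: Sequence[Route]) -> Tuple[Route, str]:
    """Apply the tests in order and report which one broke the tie."""
    if not routes:
        raise ValueError("no candidate routes")
    candidates = list(routes)
    for label, key in TESTS:
        best = min(key(r) for r in candidates)
        candidates = [r for r in candidates if key(r) == best]
        if len(candidates) == 1:
            return candidates[0], label
    # Still tied after this subset; a real router has further tests.
    return candidates[0], "still tied after the tests modeled here"


a = Route("via-isp-a", weight=0, local_pref=100, as_path=(64500, 64501), med=10)
b = Route("via-isp-b", weight=0, local_pref=200, as_path=(64502, 64503, 64504), med=0)

winner, decided_by = select_best([a, b])
print(winner.name, "chosen by:", decided_by)
```

In this example the second route wins on local preference even though its AS path is longer, because the decision is made before the AS path test is ever reached. This is the situation the text warns about: adjusting an attribute that is examined later has no effect if an earlier test already decides the outcome.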