
23.0. Introduction

Multicast routing differs from unicast routing in several ways. The most important differences are in the ways that multicast routers use source and destination addresses. A multicast packet is addressed to a special IP address representing a group of devices that can be scattered anywhere throughout a network. Since the destinations can be anywhere, the only reliable way to eliminate loops in multicast routing is to look at the reverse path back to the source. So, while unicast routing cares where the packet is going, multicast routing also needs to know where it came from.

For this reason, multicast routing protocols such as Protocol Independent Multicast (PIM) always work with the source address and destination group simultaneously. The usual notation for a multicast route is (Source, Group), as opposed to the unicast case in which routes are defined by the destination address alone. We have already mentioned that this is necessary for avoiding loops, but the router also needs to keep track of both source and group addresses in each multicast routing table entry because there could be several sources for the same group.

For example, in the NTP chapter, we discussed how a central device can be configured to send time synchronization information as a multicast. In that chapter, we also explained why it was important to have more than one NTP server. So even in a simple multicast example like this, it is quite likely that the routers will need to forward packets to the same set of end devices from two sources that may be on different network segments. The group address alone doesn't tell you enough about how to forward packets belonging to this group.

When you look at the multicast routing table with the show ip mroute command, you will see not only (Source, Group) pairs like (192.168.15.35, 239.5.5.55), but also pairs that look like (*, 239.5.5.55). This means that the source is unspecified. Cisco routers organize their multicast routing tables with a parent (*, Group) for each group, and any number of (Source, Group) pairs under it. If there is a (*, Group) but no (Source, Group) entries for a group, then that just means that the router knows of group members but doesn't yet know where to expect this multicast traffic from.
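To make this parent and child arrangement concrete, here is a minimal Python sketch of such a table. It only illustrates the data structure and is not meant to reflect how IOS stores its routes internally. The class and method names (MrouteTable, add_member, add_source, entries) are hypothetical.

```python
from collections import defaultdict


class MrouteTable:
    """Toy multicast routing table: one (*, G) parent per group,
    with any number of (S, G) children beneath it."""

    def __init__(self):
        # group address -> set of known source addresses
        self._groups = defaultdict(set)

    def add_member(self, group):
        """A receiver joined: make sure the (*, G) parent exists."""
        self._groups[group]  # touching the key creates an empty source set

    def add_source(self, source, group):
        """Traffic from a source was seen: add (S, G) under (*, G)."""
        self._groups[group].add(source)

    def entries(self, group):
        """Return the parent entry followed by its (S, G) children."""
        if group not in self._groups:
            return []
        children = [(s, group) for s in sorted(self._groups[group])]
        return [("*", group)] + children


table = MrouteTable()
table.add_member("239.5.5.55")
print(table.entries("239.5.5.55"))  # only the (*, G) parent so far
table.add_source("192.168.15.35", "239.5.5.55")
print(table.entries("239.5.5.55"))  # parent plus one (S, G) child
```

A group with members but no known sources shows up as a lone (*, Group) entry, which matches the case described above.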

Each of these (Source, Group) entries represents a Shortest Path Tree (SPT) that leads to the source of the multicast traffic. In sparse mode multicast routing, the root of the tree could actually be a central Rendezvous Point (RP) router rather than the actual traffic source. Because each router must know about the path back to the source or RP, the term Reverse Path Forwarding (RPF) is often used to describe the process of building the SPT.
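The reverse path check itself is easy to sketch. The following Python fragment is a simplified illustration, assuming a made-up unicast routing table and interface names: a multicast packet passes the check only if it arrived on the interface that the router would use to send unicast traffic back to the source.

```python
import ipaddress

# Hypothetical unicast routing table: prefix -> outgoing interface.
UNICAST_ROUTES = {
    "192.168.15.0/24": "Serial0/0",
    "10.0.0.0/8": "FastEthernet0/1",
    "0.0.0.0/0": "Serial0/1",
}


def rpf_interface(source, routes=UNICAST_ROUTES):
    """Longest-prefix match on the source address; returns the
    interface that leads back toward the source, or None."""
    addr = ipaddress.ip_address(source)
    best = None
    for prefix, interface in routes.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, interface)
    return best[1] if best else None


def rpf_check(source, arrival_interface, routes=UNICAST_ROUTES):
    """Accept the packet only if it came in on the reverse-path interface."""
    return rpf_interface(source, routes) == arrival_interface


print(rpf_check("192.168.15.35", "Serial0/0"))        # arrived on the reverse path
print(rpf_check("192.168.15.35", "FastEthernet0/1"))  # arrived somewhere else
```

Packets that fail the check are simply dropped, which is what prevents loops without the router needing to know where the group members are.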

Two important elements are required for a multicast network to work. The first we've already mentioned: you need a way to route multicast packets from the source to all of the various destinations in the group. The other critical element is that the multicast network has to provide a way for end devices to subscribe to a multicast group so that they can receive the data. The network uses the Internet Group Management Protocol (IGMP) to manage group subscriptions.

IGMP and CGMP

Internet Group Management Protocol (IGMP) functions mainly at Layer 3. Individual end devices use IGMP to announce that they wish to join a particular multicast group. The IGMP request is picked up by a router that then attempts to fulfill the request by forwarding the multicast data stream to the network containing this device. The IGMP protocol is in its third version, which is defined in RFC 3376. However, many devices still use IGMP Version 2 (RFC 2236), and some only support Version 1 (RFC 1112).

What IGMP does is relatively simple in concept. It provides a method for end devices to join and leave multicast groups. Here is the output of tcpdump showing the device 192.168.1.104 joining the group 239.5.5.55:

17:10:16.397055 192.168.1.104 > 239.5.5.55: igmp nreport 239.5.5.55 (DF) [ttl 1]
17:10:19.276998 192.168.1.104 > 239.5.5.55: igmp nreport 239.5.5.55 (DF) [ttl 1]
17:10:21.027002 192.168.1.104 > 239.5.5.55: igmp nreport 239.5.5.55 (DF) [ttl 1]

Note that the device sends three IGMP packets stating its membership, to make sure that it is heard. The router receives the request to join this group and sets a timer to count down for three minutes. As long as some device reasserts its membership with IGMP within this period, the group will remain in the router's multicast routing table. If all of the group members leave, or if they all simply stop sending IGMP updates for more than three minutes, then the router will remove this group from its tables to save memory.
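The timer behavior can be sketched in a few lines of Python. This is only a model of the idea, with a hypothetical GroupTimers class and times passed in as plain seconds rather than read from a clock; the 180-second value is the three-minute period described above.

```python
MEMBERSHIP_TIMEOUT = 180  # seconds; the three-minute timer described above


class GroupTimers:
    """Track when each group's membership lapses on one interface."""

    def __init__(self, timeout=MEMBERSHIP_TIMEOUT):
        self.timeout = timeout
        self._expires = {}  # group -> time at which membership lapses

    def report(self, group, now):
        """Any membership report for the group restarts its timer."""
        self._expires[group] = now + self.timeout

    def expire(self, now):
        """Drop groups whose timer has run out; return the removed groups."""
        dead = sorted(g for g, t in self._expires.items() if t <= now)
        for g in dead:
            del self._expires[g]
        return dead

    def active(self):
        return sorted(self._expires)


timers = GroupTimers()
timers.report("239.5.5.55", now=0)    # first report
timers.report("239.5.5.55", now=120)  # refreshed before the timer ran out
print(timers.expire(now=200))         # nothing expires yet
print(timers.expire(now=300))         # 180 seconds of silence: group removed
```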

When the device wants to stop receiving a multicast group, it sends a single IGMP Leave packet. The router immediately reacts by sending a query to this segment to find out if there are still any other members left in this group. It tries twice before deciding to stop sending traffic for this group to this network segment:

17:16:17.934667 192.168.1.104 > ALL-ROUTERS.MCAST.NET: igmp leave 239.5.5.55 (DF) [ttl 1]
17:16:17.937715 192.168.1.1 > 239.5.5.55: igmp query [gaddr 239.5.5.55] [tos 0xc0] [ttl 1]
17:16:19.050430 192.168.1.1 > 239.5.5.55: igmp query [gaddr 239.5.5.55] [tos 0xc0] [ttl 1]

The important changes to the protocol between Versions 1 and 2 of IGMP have to do with determining when all of the members of a group in a particular network have left. The most important addition in Version 3 is the ability to specify and filter multicast sources. So a device may specify that it is interested in receiving multicast messages from one source, but not from another, even though both sources may be sending to the same group.
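IGMPv3 expresses this source filtering with two filter modes, INCLUDE and EXCLUDE, each paired with a list of sources. The short Python sketch below shows the decision a receiver's filter implies. The function name and the second source address are made up for illustration.

```python
def wants(source, mode, source_list):
    """Return True if a receiver with this filter wants traffic from source.

    mode is "INCLUDE" (only the listed sources) or
    "EXCLUDE" (every source except the listed ones).
    """
    if mode == "INCLUDE":
        return source in source_list
    if mode == "EXCLUDE":
        return source not in source_list
    raise ValueError("mode must be INCLUDE or EXCLUDE")


# Two sources sending to the same group; the receiver wants only the first.
print(wants("192.168.15.35", "INCLUDE", {"192.168.15.35"}))
print(wants("192.168.16.40", "INCLUDE", {"192.168.15.35"}))
```

An EXCLUDE filter with an empty source list means "any source," which corresponds to the behavior of the earlier IGMP versions.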

IGMP Version 3 is the current standard, but many devices do not support its extensions. However, Cisco provides a fully compliant IGMPv3 implementation, and you lose nothing by using IGMPv3 because the protocol is backward-compatible.

In a switched Ethernet LAN (including 100 Mbps, 1,000 Mbps, and higher speed variants), there is an additional benefit to multicast transmission. If the switches are multicast aware, they can forward packets with a particular group address to only those devices that are members of this group. So it is not necessary to "flood" the entire VLAN with multicast packets just because one device is a multicast group member. Naturally, this means that the switch must be able to read and use Layer 3 information, so this sort of functionality is not available on all Ethernet switches.

Many multicast aware switches use IGMP snooping to read IGMP packets from devices as they join and leave particular groups. On the surface, this sounds like a perfect and simple solution, but in practice it can be very complex to implement in the switch. The first problem is that there are several special cases that are difficult to manage. For example, things become quite complex when you have several multicast routers on a segment, or when there are complicated trunk topologies or connections to workgroup hubs. Another important problem with IGMP snooping is that the switch must read the contents of all multicast packets passing through it so that it won't miss any IGMP Join or Leave messages. In effect, the switch acts as if it were a member of every multicast group. If there is a heavy multicast application such as a multimedia application, this can cause serious CPU overhead on the switch.

Cisco has developed a proprietary protocol called Cisco Group Management Protocol (CGMP) to deal with these problems. CGMP is implemented on all Cisco routers and most new switches, even those without Layer 3 capabilities. It is a relatively simple protocol that allows the router to do most of the hard work for the switch. When a device on the LAN segment joins a multicast group by sending an IGMP Join message, the switch simply passes the IGMP packet through to the router as it would with any other packet. The router then sends a CGMP packet to the switch to let it know the MAC addresses of the device and the group. Similarly, when a device leaves a group, the router uses CGMP to tell the switch to stop forwarding this particular multicast group to this device. In this way, the router, which has to keep track of this information anyway, can simply tell the switch what to do.
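The group MAC address mentioned here is not arbitrary. On Ethernet, an IPv4 multicast group maps to a MAC address formed from the fixed prefix 01:00:5e followed by the low-order 23 bits of the group address. The following Python sketch shows the mapping; the function name group_mac is our own.

```python
import ipaddress


def group_mac(group):
    """Map an IPv4 multicast group to its Ethernet MAC address:
    the prefix 01:00:5e plus the low 23 bits of the group address."""
    addr = ipaddress.IPv4Address(group)
    if not addr.is_multicast:
        raise ValueError(f"{group} is not a multicast address")
    low23 = int(addr) & 0x7FFFFF
    mac = 0x01005E000000 | low23
    return ":".join(f"{(mac >> shift) & 0xFF:02x}" for shift in range(40, -1, -8))


print(group_mac("239.5.5.55"))
print(group_mac("224.5.5.55"))  # a different group that shares the same MAC
```

Because only 23 bits are carried over, several groups share each MAC address, so a switch that filters on MAC addresses alone cannot always tell them apart.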

Unfortunately, CGMP doesn't solve all of the problems inherent in the IGMP model. Specifically, a device doesn't need to send an IGMP Leave message when it is no longer interested in receiving packets for that group. If the last group member leaves without sending the appropriate IGMP Leave message, the router will still think that there are devices in the group. It will continue to forward multicast packets to the segment until a timer expires. The router will eventually poll the LAN segment to see if any devices are still interested in receiving this group. If it gets no response, it will finally stop sending the multicast data stream. However, most implementations of IGMP Version 2 do send explicit Leave messages, unless the end devices crash or terminate improperly. In any case, it is usually better to have a device receive multicast data it didn't subscribe to than to lose the data. The only time when this isn't true is when the multicast data stream consumes too much bandwidth and starts to cause congestion for normal unicast traffic, or when processing the unnecessary multicast traffic causes CPU problems on the end devices.

Switches running newer versions of CGMP include a particularly nice feature called Local Leave Processing. They are able to intercept IGMP Leave messages from devices and process them internally. If there are other group members elsewhere on the switch, it can simply stop sending data from this group to the device that no longer wishes to be a member. Then, when the last group member leaves the group, the switch will send a global IGMP Leave packet to the router to tell it to stop sending this multicast group.
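The logic behind Local Leave Processing can be sketched as follows. This is a simplified Python model with hypothetical class, method, and port names, not a description of any particular switch's implementation: the switch remembers which ports have members of each group, and passes a Leave up to the router only when the last member is gone.

```python
class LocalLeaveSwitch:
    """Toy switch that handles IGMP Leave messages internally."""

    def __init__(self):
        self._ports = {}  # group -> set of ports with at least one member

    def join(self, group, port):
        self._ports.setdefault(group, set()).add(port)

    def leave(self, group, port):
        """Stop forwarding the group to this port. Return True only when
        the last member has left and the router must be told."""
        members = self._ports.get(group)
        if not members or port not in members:
            return False
        members.discard(port)
        if members:
            return False  # other members remain; handled locally
        del self._ports[group]
        return True

    def ports(self, group):
        return sorted(self._ports.get(group, ()))


switch = LocalLeaveSwitch()
switch.join("239.5.5.55", "Fa0/1")
switch.join("239.5.5.55", "Fa0/2")
print(switch.leave("239.5.5.55", "Fa0/1"))  # another member remains
print(switch.leave("239.5.5.55", "Fa0/2"))  # last member: notify the router
```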

Multicast Routing Protocols

There are two general types of multicast routing protocols, called dense and sparse mode. Dense mode means that every multicast router receives every multicast packet unless and until it explicitly says that it doesn't want it. As we will discuss shortly, this applies to each group and each interface separately. Sparse mode, on the other hand, means (loosely) that no router will receive a multicast group unless it explicitly requests it. It is important to note that end devices, whether multicast servers or group members, are completely unaware of which mode their network uses, or even which multicast routing protocol. Indeed, it is possible to run a network where the routers use a combination of these modes.

There are many examples of dense-mode protocols, such as Protocol Independent Multicast-Dense Mode (PIM-DM), Distance Vector Multicast Routing Protocol (DVMRP), and Multicast Open Shortest Path First (MOSPF). There are fewer sparse-mode protocols, with the best examples being Protocol Independent Multicast-Sparse Mode (PIM-SM) and Core-Based Trees (CBT).

Not all of these protocols are available in Cisco routers. Like most vendors, Cisco implements PIM-DM and PIM-SM as well as MBGP. But Cisco does not implement MOSPF or CBT, and has a limited version of DVMRP.


There are two other general categories of multicast routing protocols: protocol dependent and protocol independent. The difference has to do with the interaction with an underlying routing protocol, and not with the ability to handle non-IP multicast traffic. All of the multicast protocols mentioned in this book are specific to IP multicast communications.

For example, MOSPF is protocol-dependent because it relies on OSPF and uses a special OSPF LSA type to carry information about multicast routing. PIM and CBT, on the other hand, both use the multicast traffic itself, along with the standard unicast IP routing table and IGMP requests, to build the multicast forwarding trees. Since they don't care how the router got its unicast IP routing table, they are called protocol independent.

For the network engineer, these distinctions are quite important, since they affect flexibility, reliability, and network performance. In general, if you have a large network, particularly with bandwidth-constrained WAN links, and the multicast sources and destinations can be more or less anywhere in your network, you should use a protocol-independent sparse-mode multicast routing protocol. If you're not sure whether this really describes your network, it is generally safer and easier to lean in this direction anyway.

The PIM protocols, and, in particular, PIM-SM, are generally the best choices for implementing new multicast networks. In the past there were problems with interoperability in multi-vendor networks, as different router manufacturers implemented different sets of multicast routing protocols. Since DVMRP was the first widely implemented multicast routing protocol, the rule of thumb used to be that DVMRP was the best way to allow communication between groups of routers from different vendors. However, a quick survey of protocols supported by major router vendors shows that almost all of them now support PIM as well as DVMRP.

PIM-DM, PIM-SM, and Bidirectional PIM

There are three different flavors of PIM. The most current version of the Dense Mode protocol, PIM-DM, is defined in RFC 3973. The Sparse Mode PIM-SM protocol is specified in RFC 4601. At the time of this book's writing, Bidirectional PIM was still in the draft specification phase. You can obtain a copy of the draft specification from the IETF web site in the PIM Working Group's directory: http://www.ietf.org/ids.by.wg/pim.html.

Let's look schematically at how each builds and maintains its multicast forwarding trees to explain how they work. We note from the outset that this is not intended to be a rigorous explanation of the protocols. Instead, we just want to give you a good, basic understanding of what they do and how they do it. For more detailed information, please refer to the standards documents mentioned above, as well as RFC 2715, which details interoperability rules for multicast routing protocols.

When a device wants to join a group, G, the first thing it does is to send an IGMP Join message to its local router. If this is the first group member (and if the IGMP Join message doesn't specify a particular multicast source device, an option that we will discuss later), the router creates an entry in its multicast forwarding table for (*,G). This says that the router will forward to this interface all multicast packets addressed to group G from any source. At this point, if the router receives any packets for this group, it knows at least one place to forward them to.

In PIM-DM, the router will create the group and wait for packets. It will also send a PIM Join request to each of its PIM neighbors to find out if they have this group. If it receives multicast packets for a group that it doesn't care about, then the router will send Prune messages back to where they came from, to ask to be removed from the forwarding tree for this group. This is commonly called a "flood and prune" model, which is common to all dense mode multicast protocols.
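The end result of flooding and pruning can be sketched with a small Python function. It assumes a made-up tree of routers below the source and a set of routers that have local group members; the names are hypothetical. A router stays on the tree only if it has members itself or if some router downstream of it does.

```python
def prune(tree, root, members):
    """Return the set of routers left on the tree after pruning.

    tree maps each router to the routers downstream of it,
    root is the router nearest the source, and members is the
    set of routers with local group members.
    """
    keep = set()

    def visit(router):
        needed = router in members
        for child in tree.get(router, []):
            if visit(child):  # visit every child, even after one is needed
                needed = True
        if needed:
            keep.add(router)
        return needed

    visit(root)
    keep.add(root)  # the router next to the source always sees the traffic
    return keep


tree = {"R1": ["R2", "R3"], "R2": ["R4"], "R3": ["R5"]}
print(sorted(prune(tree, "R1", members={"R4"})))
```

In this example, R3 and R5 receive the initial flood, find that they have no use for it, and prune themselves, which leaves only the branch leading to R4.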

If this router uses PIM-SM, however, it will attempt to join a multicast tree rooted at the Rendezvous Point (RP). An RP is a router somewhere in the network that acts as a central distribution point for one or more multicast groups. Later we will discuss how the other routers come to know about the RP, but for now we'll assume that they know how to find it. When the last-hop router receives an IGMP message from a device asking to join a group, it has to go looking for that group. The best place to start looking is the RP.

So the last-hop router looks at its unicast routing table to figure out which of its neighboring routers is the best path to the RP, and it sends it an explicit PIM-SM Join message for this group. If the neighboring router is already receiving this group, then the problem is solved and the data starts to flow. Otherwise, this neighbor must send another Join to the next hop router in the direction of the RP, and so on until a multicast-forwarding tree is created with its root at the RP.
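This hop-by-hop process can be sketched in Python. The topology, router names, and function name below are assumptions made for illustration: each router knows only its next hop toward the RP, and a Join stops travelling as soon as it reaches a router that is already on the tree for that group.

```python
def send_join(router, group, next_hop_to_rp, on_tree):
    """Propagate a Join from router toward the RP.

    next_hop_to_rp maps each router to its unicast next hop toward the RP
    (the RP itself has no entry). on_tree is the set of (router, group)
    pairs that already have state. Returns the routers that were added.
    """
    added = []
    current = router
    while current is not None and (current, group) not in on_tree:
        on_tree.add((current, group))
        added.append(current)
        current = next_hop_to_rp.get(current)  # None once we pass the RP
    return added


next_hop = {"R5": "R3", "R4": "R3", "R3": "R1", "R1": "RP"}
state = set()
print(send_join("R5", "239.5.5.55", next_hop, state))  # builds the whole branch
print(send_join("R4", "239.5.5.55", next_hop, state))  # R3 is already on the tree
```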

The upstream routers will automatically prune the branches of this multicast tree if they don't receive another explicit Join within the three-minute timeout period. So, by default, the routers all refresh the tree with a new Join for every active group, once per minute. This creates and maintains a stable tree rooted at the RP and extending to all group members in the network that remains active, even if there is no multicast traffic being forwarded.

The only remaining piece of the puzzle is how the packets get from the sender to the RP. When the source device sends its first packet, the first-hop router receives it normally, as it would any other packet. This first-hop router has already learned where the RP is. When it receives a multicast packet from a new source, the router must register this source with the RP. The router encapsulates the multicast packet in a PIM-SM Register message, which it sends by unicast to the RP. The RP then removes the encapsulation and forwards the packet down the tree. The RP also sends an explicit PIM-SM Join message toward the source. The Join message links up a tree from the RP upstream to the source and downstream to the group members. Once the tree is built, there is no need for the first-hop router to continue encapsulating multicast packets to send them to the RP. So the first-hop router can revert to normal multicast forwarding instead, knowing that the RP is somewhere downstream on the SPT.

This process is shown schematically in Figure 23-1. The multicast source device sends out the packet (Step 1). The first hop router encapsulates this packet and sends it by unicast to the RP (Step 2). The RP sends the packet by multicast down the tree to the recipient devices (Step 3), who finally receive it from their own local routers (Step 4).

Figure 23-1. PIM-SM delivery model


Finally, once there is a tree connecting the ultimate source with all of the group members, there is no more need for the RP. So the last-hop routers start to send PIM-SM Join messages to create a new tree that is centered on the source rather than the RP. This is actually controlled by a minimum traffic flow threshold value, which is equal to zero by default in Cisco routers. PIM-SM starts to build the new tree rooted at the source only if the amount of traffic coming down the tree for this group exceeds this threshold.
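The threshold test amounts to a single comparison, sketched below in Python. The function name and the choice of kilobits per second as the unit are assumptions for illustration. With the default threshold of zero, any traffic at all for the group triggers the move to the source-rooted tree.

```python
def should_switch_to_spt(rate_kbps, threshold_kbps=0):
    """Return True when traffic arriving down the RP-rooted tree exceeds
    the threshold, so the last-hop router should join a tree rooted at
    the source instead."""
    return rate_kbps > threshold_kbps


print(should_switch_to_spt(64))                      # default threshold of zero
print(should_switch_to_spt(64, threshold_kbps=128))  # still below the threshold
```

Passing float("inf") as the threshold in this sketch models a router that never leaves the RP-rooted tree.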

Bidirectional PIM offers many of the advantages of PIM-SM, but with a considerably simpler method for setting up the multicast forwarding tree and less operational overhead. From the RP to the destination, Bidirectional PIM functions exactly the same as PIM-SM. The differences are in the way that the source devices forward packets to the RP.

The key to Bidirectional PIM is to remember that the first hop router, the one that is adjacent to the source device, can quickly establish a forwarding path from the RP by using normal PIM-SM methods. In Bidirectional PIM, the routers exploit this same path in reverse to reach the RP. As a result, there is no need for the complicated encapsulation method used to forward the first few packets of the multicast stream to the RP. As a tradeoff, Bidirectional PIM does not have the capability to create a source-rooted forwarding tree, as PIM-SM does.

To allow packets to traverse the multicast forwarding tree backwards to the RP, Bidirectional PIM needs a few additional tricks to help eliminate potential routing loops. The main trick is the use of a Designated Forwarder (DF) router on each network segment that is more than one hop away from the RP. The DF routers natively forward multicast packets to the RP. This has the added advantage that, if there are recipient devices along the path to the RP, they can receive the multicast packets immediately instead of waiting for the packets to reach the RP and come back.

Figure 23-2 shows an example of how this works. As in the PIM-SM example, the source device sends the multicast packet to its local network segment where it is received by the first hop router (Step 1). In this case, because there are two routers on the segment, one of them (Router1) must be the DF, which handles the forwarding of this packet up the multicast tree toward the RP (Step 2). Along the way, because this is a native multicast packet, Router3 realizes that it supports a member of this group, and delivers the packet normally (Step 3), as well as continuing to forward it to the RP (Step 4). The RP then delivers the multicast packet normally, as it did with PIM-SM (Step 5).

Figure 23-2. Bidirectional PIM delivery model


DVMRP

Distance Vector Multicast Routing Protocol (DVMRP) is defined in RFC 1075, and was the first widely implemented multicast routing protocol. This protocol is similar to RIP in many ways. There are a few important differences, though. The maximum diameter of a RIP network is 16 hops, as we mentioned in Chapter 6. DVMRP has a maximum metric of 32, which drastically improves its flexibility. It's not hard to find a network with a diameter greater than 16 hops, but a 32-hop diameter is sufficient for most real-world corporate networks. It is not sufficient for the public Internet, but that is why the multiprotocol extensions to the Border Gateway Protocol, sometimes also called Multicast Border Gateway Protocol (MBGP), were invented.

DVMRP is often a good choice for allowing routers from different manufacturers to exchange multicast routing information. It is a dense mode protocol, however, so it is generally less efficient with network resources. We recommend using DVMRP primarily as a mechanism for exchanging multicast routing information with older non-Cisco devices. In recent years, though, PIM has become the popular choice for multicast routing among most large router vendors, so DVMRP's niche is now mostly in interconnecting with existing non-Cisco multicast networks.

In many ways, DVMRP functions in a similar way to PIM-DM. It uses a dense-mode strategy that forces all routers to prune themselves from any multicast trees that they don't require. And it also uses the unicast routing table to determine the shortest path back to the source device. The main difference, however, is that DVMRP includes its own internal unicast routing protocol that it uses to help make decisions about the best SPT.

DVMRP uses an algorithm called Truncated Reverse Path Broadcasting (TRPB) to allow every router in the network to determine where it is relative to the multicast source, and to calculate the optimal SPT back to the source. Because DVMRP uses its own internal unicast routing protocol, it is not considered protocol-independent.

You must take special measures to force DVMRP to follow the standard unicast routing table and make it protocol independent. Of course, this would break one of the main reasons for using DVMRP in the first place. Because it maintains its own routing tables, DVMRP is able to work in networks where the multicast and unicast topologies are different. This is not uncommon in cases where parts of the unicast network don't support multicast routing, or where traffic engineering leads you to put multicast traffic through different network links.

In fact, Cisco routers do not provide a full DVMRP implementation. They can take part in discovering and exchanging routing information with DVMRP neighbors. But the actual multicast routing is done using PIM while referring to the DVMRP routing tables.

MOSPF

Multicast Open Shortest Path First (MOSPF) is not really a separate protocol, but rather is a set of extensions to the popular Open Shortest Path First (OSPF) unicast routing protocol. OSPF is described in more detail in Chapter 8. To allow OSPF to carry multicast routing information, RFC 1584 added a new Link State Advertisement (LSA) type, called Type 6, or simply the MOSPF LSA.

Cisco routers do not support MOSPF, so we will not discuss this protocol in any detail except to point out that Cisco routers will generate log error messages whenever they encounter Type 6 OSPF LSAs. Recipe 23.10 shows how to configure the router to ignore these packets.

The biggest advantage to MOSPF is that it is tightly integrated with OSPF, which can simplify network administration. Furthermore, because it uses the same Link State algorithm as OSPF, every router in the network can independently deduce the best path back to the source.

However, it is a dense-mode protocol, and is consequently less efficient with network resources, and it requires OSPF to work. This is almost certainly why Cisco has chosen not to implement it.

MBGP

MBGP is based on a small set of extensions to BGP defined in RFC 2858 to allow exchange of any routable protocol information between Autonomous Systems. It does this by simply introducing two new attributes to the BGP protocol: Multiprotocol Reachable NLRI (MP_REACH_NLRI) and Multiprotocol Unreachable NLRI (MP_UNREACH_NLRI), where NLRI stands for Network Layer Reachability Information. These attributes are used to carry information about reachable and unreachable networks.

It's important to understand that MBGP is not really a multicast routing protocol in the same sense as PIM or DVMRP. It doesn't understand or have the ability to Join or Prune SPTs. It doesn't include any functionality for dealing with Rendezvous Points. All it does is forward information about multicast groups and sources, and make this information available to other multicast routing protocols. It needs another protocol to do all of the other work of joining and pruning multicast distribution trees. The two protocols most commonly used for this are PIM and DVMRP.

