Network Working Group                                     Dino Farinacci
Internet Draft                                             Yakov Rekhter
Expiration Date: December 1999                             Eric C. Rosen
                                                     Cisco Systems, Inc.

                                                               June 1999


        Using PIM to Distribute MPLS Labels for Multicast Routes

                 draft-farinacci-mpls-multicast-00.txt

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.
   It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Abstract

   This document specifies a method of distributing MPLS labels for
   multicast routes. The labels are distributed in the same PIM
   messages that are used to create the corresponding routes. The
   method is media-type independent, and therefore works for
   multi-access/multicast capable LANs, point-to-point links, and NBMA
   networks.

Table of Contents

   1      Overview
   2      Proposal
   2.1    Piggybacking
   2.2    Labels for LANs with Multiple Downstream Nodes
   2.3    Labels for Point-to-Point Links
   2.4    Labels for NBMA Networks
   2.5    Corner cases
   2.6    When NOT to Send a Labelled Multicast Packet
   2.7    No Conflict between Unicast and Multicast Labels
   3      Modifications to PIMv2
   4      Label Distribution for dense-mode groups
   5      Security Considerations
   6      Acknowledgments
   7      References

1. Overview

   PIM [2] is used to combine MPLS label distribution with the
   distribution of (*,G) join state, (S,G) join state, or (S,G)RPT-bit
   prune state. Labels and multicast routes are sent together in one
   message.

   The design of this method has been motivated by the following goals:

     o If an interface attaches to a network with data-link broadcast
       capability, an LSR should never have to send more than one copy
       of a given multicast data packet out that interface. However, it
       is NOT a goal for that LSR to be able to send the same packet,
       with the same label, out multiple interfaces.

     o When an interface supports data link multicasting, it must be
       possible to have a single Label Information Base (LIB) for that
       interface. That is, the receiver of a labeled packet should be
       able to interpret the label without knowing who the transmitter
       is.

     o When a LAN contains multiple label distribution peers, it should
       be possible to use data link multicast to distribute the label
       distribution control packets themselves. Other aspects of label
       distribution methodology should remain as consistent with unicast
       label distribution as possible. Multicast label distribution
       procedures should not depend on the media type.

     o Once the label for a particular multicast tree on a given LAN has
       been assigned, unicast routing changes should not cause
       redistribution or reassignment of the label for that group on
       that LAN.

     o When a multicast routing table change requires a label
       distribution change, the latency between the two should be
       minimized, both to improve performance and to minimize the
       possibility of race conditions.

     o The procedures should work with either dense-mode or sparse-mode
       operation.

2. Proposal

2.1. Piggybacking

   An LSR that supports multicast sends PIM Join/Prune messages on
   behalf of hosts that join groups. It sends Join/Prune messages to
   upstream neighboring LSRs toward the RP for the shared-tree (*,G) or
   toward a source for a source-tree (S,G). Labels are distributed by
   being associated with addresses in the join list or the prune list.
   In particular:

      1.
         If an LSR, Rd, joins the shared tree for a group, the
         Join/Prune message it sends upstream will contain the group
         address followed by a join-list. The join-list will contain an
         element which contains the address of the RP. This element
         will also contain a label, and this label can be used by the
         upstream LSR, Ru, when it sends multicast data down the shared
         tree. Intuitively, this label represents the route downstream
         from the current node along the shared tree.

      2. If an LSR, Rd, joins a source tree for a group, the Join/Prune
         message it sends upstream will contain the group address
         followed by a join-list. The join-list will contain an element
         which contains the address of the source. This element will
         also contain a label, and this label can be used by the
         upstream LSR, Ru, when it sends multicast data down the source
         tree. Intuitively, this label represents the route downstream
         from the current node along the specified source tree.

      3. Suppose an LSR, Rd, has (S,G)RPT-bit state with a null output
         interface list. This indicates that all of its downstream
         neighbors on the shared tree for G have pruned source S from
         the shared tree. Rd sends a Join/Prune message upstream (on
         the shared tree), containing the group address followed by a
         prune-list. The prune-list contains an element which contains
         the address of the source. In this case, no label is included
         in the element.

      4. Suppose an LSR, Rd, as the result of receiving, from a
         downstream neighbor on the shared tree, a Join/Prune message
         such as described in 3, creates (S,G)RPT-bit state with a
         non-null output interface list. In this case, it may send a
         Join/Prune message upstream on the shared tree, containing the
         group address followed by a prune-list. An element of the
         prune list will contain the address S and a corresponding
         label.
         However, a special bit (the "don't prune" bit) in the
         element will be set, indicating to the upstream LSR that the
         source S is not really to be pruned from the shared tree. The
         result is that the upstream LSR, Ru, will still send packets
         from S to G to Rd, and will label those packets as specified.
         When Rd receives such packets, it forwards them according to
         the output interface list of the (S,G)RPT-bit entry.
         Intuitively, this label represents a route along the shared
         tree, but only for packets from the specified source.

      5. An LSR which receives a Join/Prune message as described in 4
         may send a corresponding Join/Prune message (with the "don't
         prune" bit set) to its upstream LSR on the shared tree. Again,
         this label represents a route along the shared tree, but only
         for packets from the specified source.

   Rules 3-5 above ensure that if a source is pruned off the shared tree
   at some point, any packets from that source which are sent down the
   shared tree will have a label that implicitly identifies the source.
   Thus if those packets encounter a node with (S,G)RPT-bit state, they
   will be sent according to the output interface list of the
   (S,G)RPT-bit entry, NOT according to the output interface list of the
   (*,G) entry.

2.2. Labels for LANs with Multiple Downstream Nodes

   Since PIM Join/Prune messages are multicast on a LAN, other
   downstream LSRs that are interested in the group will hear the
   message. They must cache the binding of multicast routing table
   state and label state together. Since the upstream LSR is going to
   forward data packets using the advertised label, they must be ready
   to accept the data packet with that advertised label.

   The first downstream LSR that joins a group is the label assigner on
   that LAN for that multicast route.
   All other downstream LSRs that
   send PIM Join/Prune messages will use the same label that the
   assigner selected. An LSR that sends a PIM Join/Prune message with a
   label of 0 indicates that it does not know the label for the
   associated multicast routing table entry. When this occurs, the
   assigner can trigger a PIM Join/Prune message making the label known.

2.3. Labels for Point-to-Point Links

   The procedure of section 2.2 works on point-to-point links because
   there is only one downstream LSR on the link, which always becomes
   the label assigner.

2.4. Labels for NBMA Networks

   On NBMA networks, all PIM routers are known to each other through
   pseudo-broadcast mechanisms provided by the data-link layer. However,
   PIM Join messages are unicast to the upstream LSR. Therefore, other
   downstream LSRs will not hear the label assigner's advertisement.
   Therefore we treat an NBMA network with one upstream and n downstream
   LSRs as n point-to-point links, from the upstream LSR to each of the
   downstream LSRs. Each downstream LSR then assigns its own label, and
   the upstream LSR must replicate the multicast data packets.
   Therefore the procedure of section 2.2 applies.

   Note that this is not incompatible with the use of native
   point-to-multipoint capabilities at the data link layer.

2.5. Corner cases

   Multiple downstream LSRs cannot assign the same label value for any
   multicast route because they partition the label space into
   non-overlapping ranges according to [4]. When an LSR is enabled on
   an interface, it obtains a unique label range for the LAN.

   When the label assigner leaves the group, the label that it assigned
   still remains active. The next highest IP addressed downstream LSR
   becomes the owner of that label and may change it if it sees fit.
   However, it is not required to change it. All downstream LSRs can
   continue to use the assignment in their Join messages.
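
   The ownership rule described above can be sketched in a few lines of
   Python. This is only an illustration, not part of the protocol: the
   class name LanRouteLabel and its methods are hypothetical, "next
   highest IP addressed" is read here as "highest address among the
   LSRs that remain joined", and addresses are compared numerically
   rather than as strings. Timers and PIM message handling are not
   modelled.

```python
from ipaddress import IPv4Address
from typing import Optional, Set


class LanRouteLabel:
    """Label state for one multicast route on one LAN (hypothetical sketch)."""

    def __init__(self) -> None:
        self.members: Set[IPv4Address] = set()     # downstream LSRs currently joined
        self.owner: Optional[IPv4Address] = None   # current label assigner/owner
        self.label: Optional[int] = None           # label in use on this LAN

    def join(self, lsr: str, proposed_label: int) -> int:
        """Record a join and return the label that the LSR must use."""
        addr = IPv4Address(lsr)
        self.members.add(addr)
        if self.owner is None:
            # The first downstream LSR to join becomes the label assigner.
            self.owner = addr
            self.label = proposed_label
        # Every later joiner reuses the assigner's label.
        return self.label

    def leave(self, lsr: str) -> None:
        """Record a leave; the label stays active, only ownership moves."""
        addr = IPv4Address(lsr)
        self.members.discard(addr)
        if addr == self.owner:
            # Assumption: the new owner is the highest remaining address.
            # State expiry when no member remains is not modelled.
            self.owner = max(self.members) if self.members else None


lan = LanRouteLabel()
lan.join("10.0.0.2", 100)    # first joiner assigns label 100
lan.join("10.0.0.10", 200)   # later joiners adopt label 100
lan.join("10.0.0.9", 300)
lan.leave("10.0.0.2")        # ownership passes to 10.0.0.10, label unchanged
```

   Comparing IPv4Address objects avoids the string-ordering trap in
   which "10.0.0.9" would sort above "10.0.0.10".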

   If two systems both join for the first time (they do not have state)
   at the same time, and each chooses a different label value, the
   highest IP addressed downstream LSR's label will be used by the
   upstream LSR. The lower addressed LSR will hear the higher addressed
   LSR's Join too and will also use its label.

   If the label assigner crashes, the highest IP addressed downstream
   LSR assigns a new label to the multicast routes which were assigned
   by the crashing LSR, and triggers a Join message so that all other
   LSRs on the LAN use the new label.

   When a LAN partitions due to a layer-2 switch failure, the same logic
   applies as when an LSR stops joining for a group. When the partition
   heals, there may be an RPF neighbor change in one of the partitions.
   When there is an RPF neighbor change and the downstream routers
   trigger joins to their new RPF neighbor with a different label
   assignment than the other partition is using, one of two resolutions
   occurs:

     1) The LSR which is the allocator in the partition of the new RPF
        neighbor will trigger a join if it has a higher IP address than
        the allocator in the other region. The downstream routers in
        the other partition use the new label assignment immediately.

     2) If the LSR which is the allocator in the partition of the new
        RPF neighbor has a lower IP address, all downstream routers and
        the new RPF neighbor will switch to the label assigned by the
        allocator in the other partition.

   If an RPF change occurs (the topology changed so the upstream LSR is
   different), the PIM protocol spec indicates that a PIM Join may be
   triggered to get on the new distribution tree as soon as possible. In
   this case, if the label assigner becomes the upstream LSR, then the
   new highest IP addressed downstream LSR may become the label
   assigner. It may change the label if it sees fit. Otherwise, the same
   label is used.

2.6.
When NOT to Send a Labelled Multicast Packet

   PIM Hello messages, sent periodically by all PIM-capable routers,
   will indicate if the router is MPLS-capable. An upstream router on a
   LAN will therefore know if all routers on the same LAN are LSRs or
   not. If there are ANY MPLS-incapable routers which are interested in
   a particular group, the upstream router will transmit to the LAN only
   unlabelled multicast data packets for that group.

   If there are any group members on a LAN, only unlabelled multicast
   data for that group will be transmitted onto that LAN.

   Routers that support non-PIM multicast are assumed, for the purposes
   of this procedure, to be MPLS-incapable.

2.7. No Conflict between Unicast and Multicast Labels

   MPLS uses different data-link layer code-points [5] to distinguish
   multicast labeled packets from unicast labeled packets. Therefore,
   the assignment of labels for unicast routes is completely independent
   from the assignment of labels for multicast routes. For example, the
   same label value could be allocated for a unicast route and for a
   multicast route, without any possibility of ambiguity.

3. Modifications to PIMv2

   PIMv2 has a packet format for each address type it may support when
   encoding both multicast and unicast addresses. We will define a new
   address type called "Label Address" for unicast address encoding.
   The label will accompany the source address in the Encoded Source
   Address format as specified in [2]. The label value will be in a
   32-bit quantity following the source address. We also take one bit
   from the PIMv2 reserved field to be the "don't prune" bit (shown
   below as the "D" bit).
   So, for example, an IPv4 Label Address format
   would look like:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
                                   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                                   | Rsrvd |D|S|W|R|   Mask Len    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Source Address                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                             Label                             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                 Current Multicast Route Timer                 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Label
      If the high-order bit is clear, the low-order 20 bits are a label
      value (as described in [5]) assigned by the LSR sending the
      Join/Prune message. All other bits should be set to 0 by the
      sender and should be ignored by the receiver.

      If the high-order bit is set, the low-order 28 bits are a label
      value in the VPI/VCI format (as described in [7]) assigned by
      the LSR sending the Join/Prune message. All other bits should be
      set to 0 by the sender and should be ignored by the receiver.

   Current Multicast Route Timer
      The sender of a Join/Prune message inserts the current time left
      before expiration for the multicast route table entry described by
      the Source Address (either the (S,G) or (*,G) entry). This is
      needed so all routers on a common multi-access subnet can time out
      the entry at close to the same time, without recreating each
      other's state when the source goes inactive.

   Refer to [2] for other field descriptions not specified here.

4. Label Distribution for dense-mode groups

   In dense-mode PIM, there is no downstream Join message traveling
   upstream to perform the binding of multicast routes with labels.
   However, since we don't want a separate algorithm for dense-mode
   groups, we extend this basic design for dense-mode PIM.
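
   Returning briefly to the 32-bit Label field defined in Section 3,
   the two interpretations of that field can be illustrated with a
   small Python sketch. The function and constant names are
   hypothetical, and network byte order is an assumption carried over
   from the other PIM fields; the draft itself only fixes the meaning
   of the high-order bit and of the low-order 20 or 28 bits.

```python
import struct
from typing import Tuple

ATM_FLAG = 0x80000000         # high-order bit: set means VPI/VCI format
MPLS_LABEL_MASK = 0x000FFFFF  # low-order 20 bits (generic MPLS label)
ATM_LABEL_MASK = 0x0FFFFFFF   # low-order 28 bits (VPI/VCI label)


def encode_label(value: int, atm: bool = False) -> bytes:
    """Build the 32-bit Label field; all unused bits are sent as 0."""
    mask = ATM_LABEL_MASK if atm else MPLS_LABEL_MASK
    if not 0 <= value <= mask:
        raise ValueError("label value does not fit in the field")
    word = value | (ATM_FLAG if atm else 0)
    return struct.pack("!I", word)  # assumption: network byte order


def decode_label(field: bytes) -> Tuple[bool, int]:
    """Return (is_atm, label); bits outside the label are ignored."""
    (word,) = struct.unpack("!I", field)
    is_atm = bool(word & ATM_FLAG)
    mask = ATM_LABEL_MASK if is_atm else MPLS_LABEL_MASK
    return is_atm, word & mask


wire = encode_label(17)            # generic 20-bit label
is_atm, label = decode_label(wire)
```

   A receiver written this way tolerates senders that leave stray bits
   set outside the label, as the field description requires.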

   When a downstream LSR creates (S,G) state from the receipt of 1)
   data, or 2) Join/Prune or Graft messages, it will start a periodic
   timer to send Join messages with label assignment information
   present. The messages look no different and are treated on receipt no
   differently than in the sparse-mode case.

   The periodic Join message will be multicast on the LAN with an
   upstream target address of 0.0.0.0. All multicast LSRs on the LAN
   must know the group operates in dense-mode. This is accomplished
   using standard PIM mechanisms.

5. Security Considerations

   Security considerations are not discussed in this memo.

6. Acknowledgments

   The authors would like to thank Fred Baker for his comments. We also
   thank the authors of [6] for their critique of an earlier version.

Authors' Addresses

   Dino Farinacci
   Cisco Systems, Inc.
   170 Tasman Drive
   San Jose, CA, 95134
   Email: dino@cisco.com

   Yakov Rekhter
   Cisco Systems, Inc.
   170 Tasman Drive
   San Jose, CA, 95134
   Email: yakov@cisco.com

   Eric C. Rosen
   Cisco Systems, Inc.
   250 Apollo Drive
   Chelmsford, MA, 01824
   Email: erosen@cisco.com

7. References

   [1] "Multiprotocol Label Switching Architecture",
   draft-ietf-mpls-arch-05.txt, Rosen, Viswanathan, Callon, April 1999.

   [2] "Protocol Independent Multicast-Sparse Mode (PIM-SM): Protocol
   Specification", RFC 2362, Estrin, Farinacci, Helmy, Thaler, Deering,
   Handley, Jacobson, Liu, Sharma, Wei, June 1998.

   [3] "LDP Specification", , Andersson,
   Doolan, Feldman, Fredette, Thomas, June 1999.

   [4] "Partitioning Label Space among Multicast Routers on a Common
   Subnet", , Farinacci,
   October 1998.

   [5] "MPLS Label Stack Encoding", , Rosen, Rekhter, Farinacci,
   Tappan, Fedorkow, Li, Conta, April 1999.

   [6] "Framework for IP Multicast in MPLS", , Ooms, Livens, Sales,
   Ramalho, Acharya, Griffoul, Ansari, May 1999.

   [7] "MPLS using LDP and ATM VC Switching", , Davie, Lawrence,
   McCloghrie, Rekhter, Rosen, Swallow, Doolan, April 1999.