Network Working Group                                      Eric C. Rosen
Internet Draft                                       Cisco Systems, Inc.
Expiration Date: September 1998
                                                        Arun Viswanathan
                                                     Lucent Technologies

                                                             Ross Callon
                                                IronBridge Networks, Inc.

                                                              March 1998


              Multiprotocol Label Switching Architecture

                      draft-ietf-mpls-arch-01.txt


Status of this Memo

   This document is an Internet-Draft.  Internet-Drafts are working
   documents of the Internet Engineering Task Force (IETF), its areas,
   and its working groups.  Note that other groups may also distribute
   working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."
   To view the entire list of current Internet-Drafts, please check the
   "1id-abstracts.txt" listing contained in the Internet-Drafts Shadow
   Directories on ftp.is.co.za (Africa), ftp.nordu.net (Northern
   Europe), ftp.nis.garr.it (Southern Europe), munnari.oz.au (Pacific
   Rim), ftp.ietf.org (US East Coast), or ftp.isi.edu (US West Coast).

Abstract

   This internet draft specifies the architecture for multiprotocol
   label switching (MPLS).  The architecture is based on other label
   switching approaches [2-11] as well as on the MPLS Framework
   document [1].

Table of Contents

    1        Introduction to MPLS ...............................  4
    1.1      Overview ...........................................  4
    1.2      Terminology ........................................  6
    1.3      Acronyms and Abbreviations .........................  9
    1.4      Acknowledgments .................................... 10
    2        Outline of Approach ................................ 11
    2.1      Labels ............................................. 11
    2.2      Upstream and Downstream LSRs ....................... 12
    2.3      Labeled Packet ..................................... 12
    2.4      Label Assignment and Distribution; Attributes ...... 12
    2.5      Label Distribution Protocol (LDP) .................. 13
    2.6      The Label Stack .................................... 13
    2.7      The Next Hop Label Forwarding Entry (NHLFE) ........ 14
    2.8      Incoming Label Map (ILM) ........................... 14
    2.9      Stream-to-NHLFE Map (STN) .......................... 15
    2.10     Label Swapping ..................................... 15
    2.11     Scope and Uniqueness of Labels ..................... 15
    2.12     Label Switched Path (LSP), LSP Ingress, LSP Egress . 16
    2.13     Penultimate Hop Popping ............................ 18
    2.14     LSP Next Hop ....................................... 19
    2.15     Route Selection .................................... 20
    2.16     Time-to-Live (TTL) ................................. 21
    2.17     Loop Control ....................................... 22
    2.17.1   Loop Prevention .................................... 23
    2.17.2   Interworking of Loop Control Options ............... 25
    2.18     Merging and Non-Merging LSRs ....................... 26
    2.18.1   Stream Merge ....................................... 27
    2.18.2   Non-merging LSRs ................................... 27
    2.18.3   Labels for Merging and Non-Merging LSRs ............ 28
    2.18.4   Merge over ATM ..................................... 29
    2.18.4.1 Methods of Eliminating Cell Interleave ............. 29
    2.18.4.2 Interoperation: VC Merge, VP Merge, and Non-Merge .. 29
    2.19     LSP Control: Egress versus Local ................... 30
    2.20     Granularity ........................................ 32
    2.21     Tunnels and Hierarchy .............................. 33
    2.21.1   Hop-by-Hop Routed Tunnel ........................... 33
    2.21.2   Explicitly Routed Tunnel ........................... 33
    2.21.3   LSP Tunnels ........................................ 33
    2.21.4   Hierarchy: LSP Tunnels within LSPs ................. 34
    2.21.5   LDP Peering and Hierarchy .......................... 34
    2.22     LDP Transport ...................................... 36
    2.23     Label Encodings .................................... 36
    2.23.1   MPLS-specific Hardware and/or Software ............. 36
    2.23.2   ATM Switches as LSRs ............................... 37
    2.23.3   Interoperability among Encoding Techniques ......... 38
    2.24     Multicast .......................................... 39
    3        Some Applications of MPLS .......................... 39
    3.1      MPLS and Hop by Hop Routed Traffic ................. 39
    3.1.1    Labels for Address Prefixes ........................ 39
    3.1.2    Distributing Labels for Address Prefixes ........... 39
    3.1.2.1  LDP Peers for a Particular Address Prefix .......... 39
    3.1.2.2  Distributing Labels ................................ 40
    3.1.3    Using the Hop by Hop path as the LSP ............... 41
    3.1.4    LSP Egress and LSP Proxy Egress .................... 41
    3.1.5    The POP Label ...................................... 42
    3.1.6    Option: Egress-Targeted Label Assignment ........... 43
    3.2      MPLS and Explicitly Routed LSPs .................... 44
    3.2.1    Explicitly Routed LSP Tunnels: Traffic Engineering . 44
    3.3      Label Stacks and Implicit Peering .................. 45
    3.4      MPLS and Multi-Path Routing ........................ 46
    3.5      LSP Trees as Multipoint-to-Point Entities .......... 46
    3.6      LSP Tunneling between BGP Border Routers ........... 47
    3.7      Other Uses of Hop-by-Hop Routed LSP Tunnels ........ 49
    3.8      MPLS and Multicast ................................. 49
    4        LDP Procedures for Hop-by-Hop Routed Traffic ....... 50
    4.1      The Procedures for Advertising and Using Labels .... 50
    4.1.1    Downstream LSR: Distribution Procedure ............. 50
    4.1.1.1  PushUnconditional .................................. 51
    4.1.1.2  PushConditional .................................... 51
    4.1.1.3  PulledUnconditional ................................ 52
    4.1.1.4  PulledConditional .................................. 52
    4.1.2    Upstream LSR: Request Procedure .................... 53
    4.1.2.1  RequestNever ....................................... 53
    4.1.2.2  RequestWhenNeeded .................................. 53
    4.1.2.3  RequestOnRequest ................................... 53
    4.1.3    Upstream LSR: NotAvailable Procedure ............... 54
    4.1.3.1  RequestRetry ....................................... 54
    4.1.3.2  RequestNoRetry ..................................... 54
    4.1.4    Upstream LSR: Release Procedure .................... 54
    4.1.4.1  ReleaseOnChange .................................... 54
    4.1.4.2  NoReleaseOnChange .................................. 54
    4.1.5    Upstream LSR: labelUse Procedure ................... 55
    4.1.5.1  UseImmediate ....................................... 55
    4.1.5.2  UseIfLoopFree ...................................... 55
    4.1.5.3  UseIfLoopNotDetected ............................... 55
    4.1.6    Downstream LSR: Withdraw Procedure ................. 56
    4.2      MPLS Schemes: Supported Combinations of Procedures . 56
    4.2.1    TTL-capable LSP Segments ........................... 57
    4.2.2    Using ATM Switches as LSRs ......................... 57
    4.2.2.1  Without Multipoint-to-point Capability ............. 58
    4.2.2.2  With Multipoint-To-Point Capability ................ 58
    4.2.3    Interoperability Considerations .................... 59
    4.2.4    How to do Loop Prevention .......................... 60
    4.2.5    How to do Loop Detection ........................... 60
    4.2.6    Security Considerations ............................ 60
    5        Authors' Addresses ................................. 60
    6        References ......................................... 61

1. Introduction to MPLS

1.1. Overview

   In connectionless network layer protocols, as a packet travels from
   one router hop to the next, an independent forwarding decision is
   made at each hop.  Each router runs a network layer routing
   algorithm.  As a packet travels through the network, each router
   analyzes the packet header.  The choice of next hop for a packet is
   based on the header analysis and the result of running the routing
   algorithm.

   Packet headers contain considerably more information than is needed
   simply to choose the next hop.  Choosing the next hop can therefore
   be thought of as the composition of two functions.  The first
   function partitions the entire set of possible packets into a set of
   "Forwarding Equivalence Classes (FECs)".  The second maps each FEC
   to a next hop.  Insofar as the forwarding decision is concerned,
   different packets which get mapped into the same FEC are
   indistinguishable.
   All packets which belong to a particular FEC and which travel from a
   particular node will follow the same path.  Such a set of packets
   may be called a "stream".

   In conventional IP forwarding, a particular router will typically
   consider two packets to be in the same stream if there is some
   address prefix X in that router's routing tables such that X is the
   "longest match" for each packet's destination address.  As the
   packet traverses the network, each hop in turn reexamines the packet
   and assigns it to a stream.

   In MPLS, the assignment of a particular packet to a particular
   stream is done just once, as the packet enters the network.  The
   stream to which the packet is assigned is encoded with a short fixed
   length value known as a "label".  When a packet is forwarded to its
   next hop, the label is sent along with it; that is, the packets are
   "labeled".

   At subsequent hops, there is no further analysis of the packet's
   network layer header.  Rather, the label is used as an index into a
   table which specifies the next hop, and a new label.  The old label
   is replaced with the new label, and the packet is forwarded to its
   next hop.  If assignment to a stream is based on a "longest match",
   this eliminates the need to perform a longest match computation for
   each packet at each hop; the computation can be performed just once.

   Some routers analyze a packet's network layer header not merely to
   choose the packet's next hop, but also to determine a packet's
   "precedence" or "class of service", in order to apply different
   discard thresholds or scheduling disciplines to different packets.
   MPLS allows the precedence or class of service to be inferred from
   the label, so that no further header analysis is needed; in some
   cases MPLS provides a way to explicitly encode a class of service in
   the "label header".
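The per-hop behavior just described, where the ingress performs one longest-match classification and every subsequent hop only indexes a label table, can be sketched in a few lines.  This is an illustrative model only, not part of the architecture; the prefixes, label values, and hop names are invented.

```python
# Sketch (not from this draft): longest-match FEC assignment once at
# ingress, then forwarding by label lookup at transit hops.
import ipaddress

# Ingress: longest-prefix match, performed once per packet.
prefix_to_label = {
    ipaddress.ip_network("10.0.0.0/8"): 17,
    ipaddress.ip_network("10.1.0.0/16"): 18,   # more specific prefix
}

def ingress_label(dst: str) -> int:
    """Map a destination address to a label via longest-prefix match."""
    addr = ipaddress.ip_address(dst)
    matches = [n for n in prefix_to_label if addr in n]
    best = max(matches, key=lambda n: n.prefixlen)  # longest match wins
    return prefix_to_label[best]

# Transit: no header analysis; the label indexes a table that yields
# the next hop and the replacement label.
label_table = {17: ("hopB", 42), 18: ("hopC", 43)}

def transit_forward(label: int):
    return label_table[label]   # (next_hop, new_label)
```

Note that the expensive longest-match computation appears only in `ingress_label`; `transit_forward` is a single exact-match lookup, which is the point of the scheme.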
   The fact that a packet is assigned to a stream just once, rather
   than at every hop, allows the use of sophisticated forwarding
   paradigms.  A packet that enters the network at a particular router
   can be labeled differently than the same packet entering the network
   at a different router, and as a result forwarding decisions that
   depend on the ingress point ("policy routing") can be easily made.
   In fact, the policy used to assign a packet to a stream need not
   have only the network layer header as input; it may use arbitrary
   information about the packet, and/or arbitrary policy information,
   as input.  Since this decouples forwarding from routing, it allows
   one to use MPLS to support a large variety of routing policies that
   are difficult or impossible to support with just conventional
   network layer forwarding.

   Similarly, MPLS facilitates the use of explicit routing, without
   requiring that each IP packet carry the explicit route.  Explicit
   routes may be useful to support policy routing and traffic
   engineering.

   MPLS makes use of a routing approach whereby the normal mode of
   operation is that L3 routing (e.g., existing IP routing protocols
   and/or new IP routing protocols) is used by all nodes to determine
   the routed path.

   MPLS stands for "Multiprotocol" Label Switching: multiprotocol
   because its techniques are applicable to ANY network layer protocol.
   In this document, however, we focus on the use of IP as the network
   layer protocol.

   A router which supports MPLS is known as a "Label Switching Router",
   or LSR.

   A general discussion of issues related to MPLS is presented in "A
   Framework for Multiprotocol Label Switching" [1].

1.2. Terminology

   This section gives a general conceptual overview of the terms used
   in this document.  Some of these terms are more precisely defined in
   later sections of the document.
   aggregate stream             synonym of "stream"

   DLCI                         a label used in Frame Relay networks to
                                identify frame relay circuits

   flow                         a single instance of an application-to-
                                application flow of data (as in the
                                RSVP and IFMP use of the term "flow")

   forwarding equivalence class a group of IP packets which are
                                forwarded in the same manner (e.g.,
                                over the same path, with the same
                                forwarding treatment)

   frame merge                  stream merge, when it is applied to
                                operation over frame based media, so
                                that the potential problem of cell
                                interleave is not an issue

   label                        a short, fixed length, physically
                                contiguous identifier which is used to
                                identify a stream, usually of local
                                significance

   label information base       the database of information containing
                                label bindings

   label swap                   the basic forwarding operation,
                                consisting of looking up an incoming
                                label to determine the outgoing label,
                                encapsulation, port, and other data
                                handling information

   label swapping               a forwarding paradigm allowing
                                streamlined forwarding of data by using
                                labels to identify streams of data to
                                be forwarded

   label switched hop           the hop between two MPLS nodes, on
                                which forwarding is done using labels

   label switched path          the path created by the concatenation
                                of one or more label switched hops,
                                allowing a packet to be forwarded by
                                swapping labels from one MPLS node to
                                another MPLS node

   layer 2                      the protocol layer under layer 3 (which
                                therefore offers the services used by
                                layer 3).  Forwarding, when done by the
                                swapping of short fixed length labels,
                                occurs at layer 2 regardless of whether
                                the label being examined is an ATM
                                VPI/VCI, a frame relay DLCI, or an MPLS
                                label

   layer 3                      the protocol layer at which IP and its
                                associated routing protocols operate

   link layer                   synonymous with layer 2

   loop detection               a method of dealing with loops in which
                                loops are allowed to be set up, and
                                data may be transmitted over the loop,
                                but the loop is later detected and
                                closed

   loop prevention              a method of dealing with loops in which
                                data is never transmitted over a loop

   label stack                  an ordered set of labels

   loop survival                a method of dealing with loops in which
                                data may be transmitted over a loop,
                                but means are employed to limit the
                                amount of network resources which may
                                be consumed by the looping data

   label switched path          the path through one or more LSRs at
                                one level of the hierarchy followed by
                                a stream

   label switching router       an MPLS node which is capable of
                                forwarding native L3 packets

   merge point                  the node at which multiple streams and
                                switched paths are combined into a
                                single stream sent over a single path

   Mlabel                       abbreviation for MPLS label

   MPLS core standards          the standards which describe the core
                                MPLS technology

   MPLS domain                  a contiguous set of nodes which operate
                                MPLS routing and forwarding and which
                                are also in one Routing or
                                Administrative Domain

   MPLS edge node               an MPLS node that connects an MPLS
                                domain with a node which is outside of
                                the domain, either because it does not
                                run MPLS, and/or because it is in a
                                different domain.  Note that if an LSR
                                has a neighboring host which is not
                                running MPLS, that LSR is an MPLS edge
                                node

   MPLS egress node             an MPLS edge node in its role in
                                handling traffic as it leaves an MPLS
                                domain

   MPLS ingress node            an MPLS edge node in its role in
                                handling traffic as it enters an MPLS
                                domain

   MPLS label                   a label placed in a short MPLS shim
                                header used to identify streams

   MPLS node                    a node which is running MPLS.  An MPLS
                                node will be aware of MPLS control
                                protocols, will operate one or more L3
                                routing protocols, and will be capable
                                of forwarding packets based on labels.
                                An MPLS node may optionally also be
                                capable of forwarding native L3 packets

   MultiProtocol Label          an IETF working group and the effort
   Switching                    associated with the working group

   network layer                synonymous with layer 3

   stack                        synonymous with label stack

   stream                       an aggregate of one or more flows,
                                treated as one aggregate for the
                                purpose of forwarding in L2 and/or L3
                                nodes (e.g., may be described using a
                                single label).  In many cases a stream
                                may be the aggregate of a very large
                                number of flows.  Synonymous with
                                "aggregate stream"

   stream merge                 the merging of several smaller streams
                                into a larger stream, such that for
                                some or all of the path the larger
                                stream can be referred to using a
                                single label

   switched path                synonymous with label switched path

   virtual circuit              a circuit used by a connection-oriented
                                layer 2 technology such as ATM or Frame
                                Relay, requiring the maintenance of
                                state information in layer 2 switches

   VC merge                     stream merge when it is specifically
                                applied to VCs, so as to allow multiple
                                VCs to merge into one single VC

   VP merge                     stream merge when it is applied to VPs,
                                so as to allow multiple VPs to merge
                                into one single VP.  In this case the
                                VCIs need to be unique.  This allows
                                cells from different sources to be
                                distinguished via the VCI

   VPI/VCI                      a label used in ATM networks to
                                identify circuits

1.3. Acronyms and Abbreviations

   ATM      Asynchronous Transfer Mode
   BGP      Border Gateway Protocol
   DLCI     Data Link Circuit Identifier
   FEC      Forwarding Equivalence Class
   STN      Stream to NHLFE Map
   IGP      Interior Gateway Protocol
   ILM      Incoming Label Map
   IP       Internet Protocol
   LIB      Label Information Base
   LDP      Label Distribution Protocol
   L2       Layer 2
   L3       Layer 3
   LSP      Label Switched Path
   LSR      Label Switching Router
   MPLS     MultiProtocol Label Switching
   MPT      Multipoint to Point Tree
   NHLFE    Next Hop Label Forwarding Entry
   SVC      Switched Virtual Circuit
   SVP      Switched Virtual Path
   TTL      Time-To-Live
   VC       Virtual Circuit
   VCI      Virtual Circuit Identifier
   VP       Virtual Path
   VPI      Virtual Path Identifier

1.4. Acknowledgments

   The ideas and text in this document have been collected from a
   number of sources and comments received.  We would like to thank
   Rick Boivie, Paul Doolan, Nancy Feldman, Yakov Rekhter, Vijay
   Srinivasan, and George Swallow for their inputs and ideas.

2. Outline of Approach

   In this section, we introduce some of the basic concepts of MPLS and
   describe the general approach to be used.

2.1. Labels

   A label is a short, fixed length, locally significant identifier
   which is used to identify a stream.  The label is based on the
   stream or Forwarding Equivalence Class that a packet is assigned to.
   The label does not directly encode the network layer address.  The
   choice of label depends on the network layer address only to the
   extent that the Forwarding Equivalence Class depends on that
   address.

   If Ru and Rd are LSRs, and Ru transmits a packet to Rd, they may
   agree to use label L to represent stream S for packets which are
   sent from Ru to Rd.  That is, they can agree to a "mapping" between
   label L and stream S for packets moving from Ru to Rd.
   As a result of such an agreement, L becomes Ru's "outgoing label"
   corresponding to stream S for such packets, and L becomes Rd's
   "incoming label" corresponding to stream S for such packets.

   Note that L does not necessarily correspond to stream S for any
   packets other than those which are being sent from Ru to Rd.  Also,
   L is not an inherently meaningful value and does not have any
   network-wide value; the particular value assigned to L gets its
   meaning solely from the agreement between Ru and Rd.

   Sometimes it may be difficult or even impossible for Rd to tell, of
   an arriving packet carrying label L, that the label L was placed in
   the packet by Ru, rather than by some other LSR.  (This will
   typically be the case when Ru and Rd are not direct neighbors.)  In
   such cases, Rd must make sure that the mapping from label to FEC is
   one-to-one.  That is, in such cases, Rd must not agree with Ru1 to
   use L for one purpose, while also agreeing with some other LSR Ru2
   to use L for a different purpose.

2.2. Upstream and Downstream LSRs

   Suppose Ru and Rd have agreed to map label L to stream S, for
   packets sent from Ru to Rd.  Then with respect to this mapping, Ru
   is the "upstream LSR", and Rd is the "downstream LSR".

   The notions of upstream and downstream relate to agreements between
   nodes on the label values to be assigned for packets belonging to a
   particular stream that might be traveling from an upstream node to a
   downstream node.  This is independent of whether the routing
   protocol will actually cause any packets to be transmitted in that
   particular direction.  Thus, Rd is the downstream LSR for a
   particular mapping for label L if it recognizes L-labeled packets
   from Ru as being in stream S.  This may be true even if routing does
   not actually forward packets for stream S between nodes Rd and Ru,
   or if routing has made Ru downstream of Rd along the path which is
   actually used for packets in stream S.

2.3. Labeled Packet

   A "labeled packet" is a packet into which a label has been encoded.
   The encoding can be done by means of an encapsulation which exists
   specifically for this purpose, or by placing the label in an
   available location in either the data link or the network layer
   header.  Of course, the encoding technique must be agreed to by the
   entity which encodes the label and the entity which decodes the
   label.

2.4. Label Assignment and Distribution; Attributes

   For unicast traffic in the MPLS architecture, the decision to bind a
   particular label L to a particular stream S is made by the LSR which
   is downstream with respect to that mapping.  The downstream LSR then
   informs the upstream LSR of the mapping.  Thus labels are
   "downstream-assigned", and are "distributed upstream".

   A particular mapping of label L to stream S, distributed by Rd to
   Ru, may have associated "attributes".  If Ru, acting as a downstream
   LSR, also distributes a mapping of a label to stream S, then under
   certain conditions it may be required to also distribute the
   corresponding attribute that it received from Rd.

2.5. Label Distribution Protocol (LDP)

   A Label Distribution Protocol (LDP) is a set of procedures by which
   one LSR informs another of the label/stream mappings it has made.
   Two LSRs which use an LDP to exchange label/stream mapping
   information are known as "LDP Peers" with respect to the mapping
   information they exchange; we will speak of there being an "LDP
   Adjacency" between them.

   (N.B.: two LSRs may be LDP Peers with respect to some set of
   mappings, but not with respect to some other set of mappings.)
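The downstream-assigned model of sections 2.4 and 2.5 can be sketched as follows.  This is a conceptual illustration only, not the LDP protocol; the class, method names, and the choice to start allocating at label 16 are all assumptions made for the example.

```python
# Sketch (not LDP itself): the downstream LSR chooses the label for a
# stream and informs its upstream peer of the mapping.
class LSR:
    def __init__(self, name):
        self.name = name
        self.next_free = 16          # assumed starting point for allocation
        self.incoming = {}           # label -> stream (bindings we assigned)
        self.outgoing = {}           # (peer, stream) -> label (learned)

    def bind(self, stream):
        """Downstream role: choose a free label for a stream."""
        label = self.next_free
        self.next_free += 1
        self.incoming[label] = stream
        return label

    def distribute(self, upstream, stream):
        """Inform the upstream peer which label to use for this stream."""
        label = self.bind(stream)
        upstream.outgoing[(self.name, stream)] = label
        return label

rd, ru = LSR("Rd"), LSR("Ru")
label = rd.distribute(ru, "S")
# Ru now labels packets of stream S with Rd's chosen label before
# sending them to Rd; the label is "downstream-assigned" and was
# "distributed upstream".
```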
549 The LDP also encompasses any negotiations in which two LDP Peers need 550 to engage in order to learn of each other's MPLS capabilities. 552 2.6. The Label Stack 554 So far, we have spoken as if a labeled packet carries only a single 555 label. As we shall see, it is useful to have a more general model in 556 which a labeled packet carries a number of labels, organized as a 557 last-in, first-out stack. We refer to this as a "label stack". 559 IN MPLS, EVERY FORWARDING DECISION IS BASED EXCLUSIVELY ON THE LABEL 560 AT THE TOP OF THE STACK. 562 Although, as we shall see, MPLS supports a hierarchy, the processing 563 of a labeled packet is completely independent of the level of 564 hierarchy. The processing is always based on the top label, without 565 regard for the possibility that some number of other labels may have 566 been "above it" in the past, or that some number of other labels may 567 be below it at present. 569 An unlabeled packet can be thought of as a packet whose label stack 570 is empty (i.e., whose label stack has depth 0). 572 If a packet's label stack is of depth m, we refer to the label at the 573 bottom of the stack as the level 1 label, to the label above it (if 574 such exists) as the level 2 label, and to the label at the top of the 575 stack as the level m label. 577 The utility of the label stack will become clear when we introduce 578 the notion of LSP Tunnel and the MPLS Hierarchy (sections 2.21.3 and 579 2.21.4). 581 2.7. The Next Hop Label Forwarding Entry (NHLFE) 583 The "Next Hop Label Forwarding Entry" (NHLFE) is used when forwarding 584 a labeled packet. It contains the following information: 586 1. the packet's next hop 588 2. the data link encapsulation to use when transmitting the packet 590 3. the way to encode the label stack when transmitting the packet 592 4. 
the operation to perform on the packet's label stack; this is one of the following operations:

   a) replace the label at the top of the label stack with a specified new label

   b) pop the label stack

   c) replace the label at the top of the label stack with a specified new label, and then push one or more specified new labels onto the label stack.

Note that at a given LSR, the packet's "next hop" might be that LSR itself. In this case, the LSR would need to pop the top level label, and then "forward" the resulting packet to itself. It would then make another forwarding decision, based on what remains after the label stack is popped. This may still be a labeled packet, or it may be the native IP packet.

This implies that in some cases the LSR may need to operate on the IP header in order to forward the packet.

If the packet's "next hop" is the current LSR, then the label stack operation MUST be to "pop the stack".

2.8. Incoming Label Map (ILM)

The "Incoming Label Map" (ILM) is a mapping from incoming labels to NHLFEs. It is used when forwarding packets that arrive as labeled packets.

2.9. Stream-to-NHLFE Map (STN)

The "Stream-to-NHLFE Map" (STN) is a mapping from streams to NHLFEs. It is used when forwarding packets that arrive unlabeled, but which are to be labeled before being forwarded.

2.10. Label Swapping

Label swapping is the use of the following procedures to forward a packet.

In order to forward a labeled packet, an LSR examines the label at the top of the label stack. It uses the ILM to map this label to an NHLFE. Using the information in the NHLFE, it determines where to forward the packet, and performs an operation on the packet's label stack. It then encodes the new label stack into the packet, and forwards the result.
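The procedures of sections 2.7 through 2.10 can be sketched as follows. The ILM, STN and NHLFE are modeled as plain dicts; all field and operation names here are illustrative, not a prescribed encoding.

```python
def apply_op(stack, op, labels):
    """Apply one of the three NHLFE label stack operations (section 2.7)."""
    stack = list(stack)
    if op == "pop":                  # b) pop the label stack
        stack.pop()
    elif op == "replace":            # a) replace the top label
        stack[-1:] = labels[:1]      #    (on an empty stack this is a push)
    elif op == "replace_push":       # c) replace the top label, then push more
        stack[-1:] = labels
    return stack

def forward(packet, ilm, stn):
    """Return (next hop, new label stack) for a packet."""
    stack = packet["label_stack"]
    if stack:                        # labeled packet: the ILM maps the TOP
        nhlfe = ilm[stack[-1]]       # label, and only the top label (sec. 2.8)
    else:                            # unlabeled packet: the STN maps the
        nhlfe = stn[packet["stream"]]  # stream to an NHLFE (section 2.9)
    return nhlfe["next_hop"], apply_op(stack, nhlfe["op"], nhlfe["labels"])

ilm = {17: {"next_hop": "R2", "op": "replace", "labels": [21]}}
stn = {"S": {"next_hop": "R2", "op": "replace", "labels": [21]}}
```

Note that in both branches the next hop comes from the NHLFE, which is the sense in which the label-swapping next hop may differ from the ordinary routing next hop.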
In order to forward an unlabeled packet, an LSR analyzes the network layer header, to determine the packet's stream. It then uses the STN to map this to an NHLFE. Using the information in the NHLFE, it determines where to forward the packet, and performs an operation on the packet's label stack. (Popping the label stack would, of course, be illegal in this case.) It then encodes the new label stack into the packet, and forwards the result.

IT IS IMPORTANT TO NOTE THAT WHEN LABEL SWAPPING IS IN USE, THE NEXT HOP IS ALWAYS TAKEN FROM THE NHLFE; THIS MAY IN SOME CASES BE DIFFERENT FROM WHAT THE NEXT HOP WOULD BE IF MPLS WERE NOT IN USE.

2.11. Scope and Uniqueness of Labels

A given LSR Rd may map label L1 to stream S, and distribute that mapping to LDP peer Ru1. Rd may also map label L2 to stream S, and distribute that mapping to LDP peer Ru2. Whether or not L1 == L2 is not determined by the architecture; this is a local matter.

A given LSR Rd may map label L to stream S1, and distribute that mapping to LDP peer Ru1. Rd may also map label L to stream S2, and distribute that mapping to LDP peer Ru2. IF (AND ONLY IF) RD CAN TELL, WHEN IT RECEIVES A PACKET WHOSE TOP LABEL IS L, WHETHER THE LABEL WAS PUT THERE BY RU1 OR BY RU2, THEN THE ARCHITECTURE DOES NOT REQUIRE THAT S1 == S2. In general, Rd can only tell whether it was Ru1 or Ru2 that put the particular label value L at the top of the label stack if the following conditions hold:

- Ru1 and Ru2 are the only LDP peers to which Rd distributed a mapping of label value L, and

- Ru1 and Ru2 are each directly connected to Rd via a point-to-point interface.

When these conditions hold, an LSR may use labels that have "per interface" scope, i.e., which are only unique per interface. When these conditions do not hold, the labels must be unique over the LSR which has assigned them.
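The two label scopes above can be contrasted with a small sketch. The table layouts are hypothetical, not mandated by the architecture:

```python
# Per-platform scope: one label space for the whole LSR, so a label value
# identifies a stream regardless of which peer put it on the packet.
per_platform_ilm = {17: "S1", 21: "S2"}

# Per-interface scope: legal only under the conditions in the text (the
# peers are reached over distinct point-to-point links), in which case the
# arrival interface disambiguates the same label value.
per_interface_ilm = {("if-to-Ru1", 17): "S1",
                     ("if-to-Ru2", 17): "S2"}

def stream_for(interface, label):
    """Look up a per-interface-scoped label."""
    return per_interface_ilm[(interface, label)]
```

With per-interface scope the same value 17 safely denotes two different streams; with per-platform scope the value alone must be unique across the LSR.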
If a particular LSR Rd is attached to a particular LSR Ru over two point-to-point interfaces, then Rd may distribute to Ru a mapping of label L to stream S1, as well as a mapping of label L to stream S2, S1 != S2, if and only if each mapping is valid only for packets which Ru sends to Rd over a particular one of the interfaces. In all other cases, Rd MUST NOT distribute to Ru mappings of the same label value to two different streams.

This prohibition holds even if the mappings are regarded as being at different "levels of hierarchy". In MPLS, there is no notion of having a different label space for different levels of the hierarchy.

2.12. Label Switched Path (LSP), LSP Ingress, LSP Egress

A "Label Switched Path (LSP) of level m" for a particular packet P is a sequence of routers,

   <R1, ..., Rn>

with the following properties:

1. R1, the "LSP Ingress", is an LSR which pushes a label onto P's label stack, resulting in a label stack of depth m;

2. for all i, 1<i<n, Ri makes its forwarding decision for P by label switching on a level m label;

3. Rn, the "LSP Egress", makes its forwarding decision either by label switching on a level m-k label (k>0), or by "ordinary", non-MPLS forwarding procedures.

In other words, we can speak of the level m LSP for Packet P as the sequence of routers:

1. which begins with an LSR (an "LSP Ingress") that pushes on a level m label,

2. all of whose intermediate LSRs make their forwarding decision by label switching on a level m label,

3. which ends (at an "LSP Egress") when a forwarding decision is made by label switching on a level m-k label, where k>0, or when a forwarding decision is made by "ordinary", non-MPLS forwarding procedures.

A consequence (or perhaps a presupposition) of this is that whenever an LSR pushes a label onto an already labeled packet, it needs to make sure that the new label corresponds to a FEC whose LSP Egress is the LSR that assigned the label which is now second in the stack.
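The level numbering of sections 2.6 and 2.12 can be illustrated with a minimal stack model (top of stack kept last in the list; the helper names are ours):

```python
def level(stack, m):
    """Return the level m label: level 1 is the bottom of the stack,
    level len(stack) is the top (section 2.6)."""
    return stack[m - 1]

def push(stack, label):
    """An LSP ingress pushes a label, yielding a stack one level deeper."""
    return stack + [label]

# An ingress for a level 2 LSP pushes onto an already labeled packet:
stack = push(push([], "inner"), "outer")   # depth 2
```

Here "outer" is the level 2 (top) label on which every forwarding decision is based, while "inner" rides along as the level 1 label until the outer LSP's egress is reached.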
We will call a sequence of LSRs the "LSP for a particular stream S" if it is an LSP of level m for a particular packet P when P's level m label is a label corresponding to stream S.

Consider the set of nodes which may be LSP ingress nodes for stream S. Then there is an LSP for stream S which begins with each of those nodes. If a number of those LSPs have the same LSP egress, then one can consider the set of such LSPs to be a tree, whose root is the LSP egress. (Since data travels along this tree towards the root, this may be called a multipoint-to-point tree.) We can thus speak of the "LSP tree" for a particular stream S.

2.13. Penultimate Hop Popping

Note that according to the definitions of section 2.12, if <R1, ..., Rn> is a level m LSP for packet P, P may be transmitted from R[n-1] to Rn with a label stack of depth m-1. That is, the label stack may be popped at the penultimate LSR of the LSP, rather than at the LSP Egress.

From an architectural perspective, this is perfectly appropriate. The purpose of the level m label is to get the packet to Rn. Once R[n-1] has decided to send the packet to Rn, the label no longer has any function, and need no longer be carried.

There is also a practical advantage to doing penultimate hop popping. If one does not do this, then when the LSP egress receives a packet, it first looks up the top label, and determines as a result of that lookup that it is indeed the LSP egress. Then it must pop the stack, and examine what remains of the packet. If there is another label on the stack, the egress will look this up and forward the packet based on this lookup. (In this case, the egress for the packet's level m LSP is also an intermediate node for its level m-1 LSP.) If there is no other label on the stack, then the packet is forwarded according to its network layer destination address.
Note that this would require the egress to do TWO lookups, either two label lookups or a label lookup followed by an address lookup.

If, on the other hand, penultimate hop popping is used, then when the penultimate hop looks up the label, it determines:

- that it is the penultimate hop, and

- who the next hop is.

The penultimate node then pops the stack, and forwards the packet based on the information gained by looking up the label that was at the top of the stack. When the LSP egress receives the packet, the label at the top of the stack will be the label which it needs to look up in order to make its own forwarding decision. Or, if the packet was only carrying a single label, the LSP egress will simply see the network layer packet, which is just what it needs to see in order to make its forwarding decision.

This technique allows the egress to do a single lookup, and also requires only a single lookup by the penultimate node.

The creation of the forwarding fastpath in a label switching product may be greatly aided if it is known that only a single lookup is ever required:

- the code may be simplified if it can assume that only a single lookup is ever needed

- the code can be based on a "time budget" that assumes that only a single lookup is ever needed.

In fact, when penultimate hop popping is done, the LSP Egress need not even be an LSR.

However, some hardware switching engines may not be able to pop the label stack, so this cannot be universally required. There may also be some situations in which penultimate hop popping is not desirable. Therefore the penultimate node pops the label stack only if this is specifically requested by the egress node, or if the next node in the LSP does not support MPLS.
(If the next node in the LSP does support MPLS, but does not make such a request, the penultimate node has no way of knowing that it in fact is the penultimate node.)

An LSR which is capable of popping the label stack at all MUST do penultimate hop popping when so requested by its downstream LDP peer.

Initial LDP negotiations must allow each LSR to determine whether its neighboring LSRs are capable of popping the label stack. An LSR will not request an LDP peer to pop the label stack unless it is capable of doing so.

It may be asked whether the egress node can always interpret the top label of a received packet properly if penultimate hop popping is used. As long as the uniqueness and scoping rules of section 2.11 are obeyed, it is always possible to interpret the top label of a received packet unambiguously.

2.14. LSP Next Hop

The LSP Next Hop for a particular labeled packet in a particular LSR is the LSR which is the next hop, as selected by the NHLFE entry used for forwarding that packet.

The LSP Next Hop for a particular stream is the next hop as selected by the NHLFE entry indexed by a label which corresponds to that stream.

Note that the LSP Next Hop may differ from the next hop which would be chosen by the network layer routing algorithm. We will use the term "L3 next hop" when we refer to the latter.

2.15. Route Selection

Route selection refers to the method used for selecting the LSP for a particular stream. The proposed MPLS protocol architecture supports two options for Route Selection: (1) hop by hop routing, and (2) explicit routing.

Hop by hop routing allows each node to independently choose the next hop for the path for a stream. This is the normal mode today in existing datagram IP networks. A hop by hop routed LSP refers to an LSP whose route is selected using hop by hop routing.
An explicitly routed LSP is an LSP where, at a given LSR, the LSP next hop is not chosen by each local node, but rather is chosen by a single node (usually the ingress or egress node of the LSP). The sequence of LSRs followed by an explicitly routed LSP may be chosen by configuration, or may be selected dynamically by a single node (for example, the egress node may make use of the topological information learned from a link state database in order to compute the entire path for the tree ending at that egress node). Explicit routing may be useful for a number of purposes, such as allowing policy routing and/or facilitating traffic engineering. With MPLS the explicit route needs to be specified at the time that labels are assigned, but the explicit route does not have to be specified with each IP packet. This implies that explicit routing with MPLS is relatively efficient (when compared with the efficiency of explicit routing for pure datagrams).

For any one LSP (at any one level of hierarchy), there are two possible options: (i) the entire LSP may be hop by hop routed from ingress to egress; (ii) the entire LSP may be explicitly routed from ingress to egress. Intermediate cases do not make sense: in general, an LSP will be explicitly routed specifically because there is a good reason to use an alternative to the hop by hop routed path. This implies that if some of the nodes along the path follow an explicit route but some of the nodes make use of hop by hop routing, then inconsistent routing will result and loops (or severely inefficient paths) may form.

For this reason, it is important that if an explicit route is specified for an LSP, then that route must be followed. Note that it is relatively simple to *follow* an explicit route which is specified in an LDP setup.
We therefore propose that the LDP specification require that all MPLS nodes implement the ability to follow an explicit route if this is specified.

It is not necessary for a node to be able to create an explicit route. However, in order to ensure interoperability it is necessary to ensure that either (i) every node knows how to use hop by hop routing; or (ii) every node knows how to create and follow an explicit route. We propose that, due to the common use of hop by hop routing in networks today, it is reasonable to make hop by hop routing the default that all nodes need to be able to use.

2.16. Time-to-Live (TTL)

In conventional IP forwarding, each packet carries a "Time To Live" (TTL) value in its header. Whenever a packet passes through a router, its TTL gets decremented by 1; if the TTL reaches 0 before the packet has reached its destination, the packet gets discarded.

This provides some level of protection against forwarding loops that may exist due to misconfigurations, or due to failure or slow convergence of the routing algorithm. TTL is sometimes used for other functions as well, such as multicast scoping, and supporting the "traceroute" command. This implies that there are two TTL-related issues that MPLS needs to deal with: (i) TTL as a way to suppress loops; (ii) TTL as a way to accomplish other functions, such as limiting the scope of a packet.

When a packet travels along an LSP, it should emerge with the same TTL value that it would have had if it had traversed the same sequence of routers without having been label switched. If the packet travels along a hierarchy of LSPs, the total number of LSR-hops traversed should be reflected in its TTL value when it emerges from the hierarchy of LSPs.
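The TTL-preservation requirement can be checked with a small sketch, assuming a shim header whose TTL is loaded from the IP header at LSP ingress, decremented at each LSR-hop, and copied back at egress (the function name is ours):

```python
def traverse_lsp(ip_ttl, lsr_hops):
    """Model shim-header TTL handling across one LSP."""
    shim_ttl = ip_ttl              # copied into the shim at the ingress
    for _ in range(lsr_hops):
        shim_ttl -= 1              # one decrement per LSR-hop
    return shim_ttl                # copied back into the IP header at egress
```

A packet entering with TTL 64 and crossing 3 LSRs emerges with TTL 61, exactly as if it had been forwarded hop by hop through 3 conventional routers.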
The way that TTL is handled may vary depending upon whether the MPLS label values are carried in an MPLS-specific "shim" header, or whether the MPLS labels are carried in an L2 header such as an ATM header or a frame relay header.

If the label values are encoded in a "shim" that sits between the data link and network layer headers, then this shim should have a TTL field that is initially loaded from the network layer header TTL field, is decremented at each LSR-hop, and is copied into the network layer header TTL field when the packet emerges from its LSP.

If the label values are encoded in an L2 header (e.g., the VPI/VCI field in an ATM cell header), and the labeled packets are forwarded by an L2 switch (e.g., an ATM switch), then unless the data link layer itself has a TTL field (unlike ATM), it will not be possible to decrement a packet's TTL at each LSR-hop. An LSP segment which consists of a sequence of LSRs that cannot decrement a packet's TTL will be called a "non-TTL LSP segment".

When a packet emerges from a non-TTL LSP segment, it should nevertheless be given a TTL that reflects the number of LSR-hops it traversed. In the unicast case, this can be achieved by propagating a meaningful LSP length to ingress nodes, enabling the ingress to decrement the TTL value before forwarding packets into a non-TTL LSP segment.

Sometimes it can be determined, upon ingress to a non-TTL LSP segment, that a particular packet's TTL will expire before the packet reaches the egress of that non-TTL LSP segment. In this case, the LSR at the ingress to the non-TTL LSP segment must not label switch the packet. This means that special procedures must be developed to support traceroute functionality; for example, traceroute packets may be forwarded using conventional hop by hop forwarding.

2.17.
Loop Control

On a non-TTL LSP segment, by definition, TTL cannot be used to protect against forwarding loops. The importance of loop control may depend on the particular hardware being used to provide the LSR functions along the non-TTL LSP segment.

Suppose, for instance, that ATM switching hardware is being used to provide MPLS switching functions, with the label being carried in the VPI/VCI field. Since ATM switching hardware cannot decrement TTL, there is no protection against loops. If the ATM hardware is capable of providing fair access to the buffer pool for incoming cells carrying different VPI/VCI values, this looping may not have any deleterious effect on other traffic. If the ATM hardware cannot provide fair buffer access of this sort, however, then even transient loops may cause severe degradation of the LSR's total performance.

Even if fair buffer access can be provided, it is still worthwhile to have some means of detecting loops that last "longer than possible". In addition, even where TTL and/or per-VC fair queuing provides a means for surviving loops, it still may be desirable where practical to avoid setting up LSPs which loop.

The MPLS architecture will therefore provide a technique for ensuring that looping LSP segments can be detected, and a technique for ensuring that looping LSP segments are never created.

All LSRs will be required to support a common technique for loop detection. Support for the loop prevention technique is optional, though it is recommended in ATM-LSRs that have no other way to protect themselves against the effects of looping data packets. Use of the loop prevention technique, when supported, is optional.

2.17.1. Loop Prevention

NOTE: The loop prevention technique described here is being reconsidered, and may be changed.

LSRs maintain for each of their LSPs an LSR id list.
This list is a list of all the LSRs downstream from this LSR on a given LSP. The LSR id list is used to prevent the formation of switched path loops. The LSR id list is propagated upstream from a node to its neighbor nodes. The LSR id list is used to prevent loops as follows:

When a node, R, detects a change in the next hop for a given stream, it asks its new next hop for a label and the associated LSR id list for that stream.

The new next hop responds with a label for the stream and an associated LSR id list.

R looks in the LSR id list. If R determines that it, R, is in the list, then we have a route loop. In this case, we do nothing, and the old LSP will continue to be used until the route protocols break the loop. The means by which the old LSP is replaced by a new LSP after the route protocols break the loop is described below.

If R is not in the LSR id list, R will start a "diffusion" computation [12]. The purpose of the diffusion computation is to prune the tree upstream of R so that we remove all LSRs from the tree that would be on a looping path if R were to switch over to the new LSP. After those LSRs are removed from the tree, it is safe for R to replace the old LSP with the new LSP (and the old LSP can be released).

The diffusion computation works as follows:

R adds its LSR id to the list and sends a query message to each of its "upstream" neighbors (i.e., to each of its neighbors that is not the new "downstream" next hop).

A node S that receives such a query will process the query as follows:

- If node R is not node S's next hop for the given stream, node S will respond to node R with an "OK" message, meaning that as far as node S is concerned it is safe for node R to switch over to the new LSP.
- If node R is node S's next hop for the stream, node S will check to see if it, node S, is in the LSR id list that it received from node R. If it is, we have a route loop, and S will respond with a "LOOP" message. R will unsplice the connection to S, pruning S from the tree. The mechanism by which S will get a new LSP for the stream after the route protocols break the loop is described below.

- If node S is not in the LSR id list, S will add its LSR id to the LSR id list and send a new query message further upstream. The diffusion computation will continue to propagate upstream along each of the paths in the tree upstream of S until either a loop is detected, in which case the node is pruned as described above, or we get to a point where a node gets a response ("OK" or "LOOP") from each of its neighbors, perhaps because none of those neighbors considers the node in question to be its downstream next hop. Once a node has received a response from each of its upstream neighbors, it returns an "OK" message to its downstream neighbor. When the original node, node R, gets a response from each of its neighbors, it is safe to replace the old LSP with the new one, because all the paths that would loop have been pruned from the tree.

There are a couple of details to discuss:

- First, we need to do something about nodes that for one reason or another do not produce a timely response to a query message. If a node Y does not respond to a query from node X because of a failure of some kind, X will not be able to respond to its downstream neighbors (if any) or switch over to a new LSP if X is, like R above, the node that has detected the route change. This problem is handled by timing out the query message.
If a node doesn't receive a response within a "reasonable" period of time, it "unsplices" its VC to the upstream neighbor that is not responding and proceeds as it would if it had received the "LOOP" message.

- We also need to be concerned about multiple concurrent routing updates. What happens, for example, when a node M receives a request for an LSP from an upstream neighbor, N, while M is in the middle of a diffusion computation, i.e., it has sent a query upstream but hasn't received all the responses? Since a downstream node, node R, is about to change from one LSP to another, M needs to pass to N an LSR id list corresponding to the union of the old and new LSPs if it is to avoid loops both before and after the transition. This is easily accomplished, since M already has the LSR id list for the old LSP and it gets the LSR id list for the new LSP in the query message. After R makes the switch from the old LSP to the new one, R sends a new establish message upstream with the LSR id list of (just) the new LSP. At this point, the nodes upstream of R know that R has switched over to the new LSP and that they can return the id list for (just) the new LSP in response to any new requests for LSPs. They can also grow the tree to include additional nodes that would not have been valid for the combined LSR id list.

- We also need to discuss how a node that doesn't have an LSP for a given stream at the end of a diffusion computation (because it would have been on a looping LSP) gets one after the routing protocols break the loop. If node L has been pruned from the tree and its local route protocol processing entity breaks the loop by changing L's next hop, L will request a new LSP from its new downstream neighbor, which it will use once it executes the diffusion computation as described above.
If the loop is broken by a route change at another point in the loop, i.e., at a point "downstream" of L, L will get a new LSP as the new LSP tree grows upstream from the point of the route change, as discussed in the previous paragraph.

- Note that when a node is pruned from the tree, the switched path upstream of that node remains "connected". This is important since it allows the switched path to get "reconnected" to a downstream switched path after a route change with a minimal amount of unsplicing and resplicing once the appropriate diffusion computation(s) have taken place.

The LSR id list can also be used to provide a "loop detection" capability. To use it in this manner, an LSR which sees that it is already in the LSR id list for a particular stream will immediately unsplice itself from the switched path for that stream, and will NOT pass the LSR id list further upstream. The LSR can rejoin a switched path for the stream when it changes its next hop for that stream, or when it receives a new LSR id list from its current next hop, in which it is not contained. The diffusion computation would be omitted.

2.17.2. Interworking of Loop Control Options

The MPLS protocol architecture allows some nodes to be using loop prevention while other nodes are not (i.e., the choice of whether or not to use loop prevention may be a local decision). When this mix is used, it is not possible for a loop to form which includes only nodes which do loop prevention. However, it is possible for loops to form which contain a combination of some nodes which do loop prevention, and some nodes which do not.
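The LSR id list check that underlies both the loop prevention and loop detection behaviors of section 2.17.1 can be sketched as follows. The message names mirror the text; the function itself is our illustration, not a specified procedure:

```python
def process_id_list(my_id, lsr_id_list):
    """A node receiving an LSR id list from its downstream next hop either
    reports a LOOP (it is already on the path), or appends its own id
    before propagating the list further upstream."""
    if my_id in lsr_id_list:
        return "LOOP", None
    return "OK", lsr_id_list + [my_id]
```

In the loop prevention case the "LOOP" result triggers pruning via the diffusion computation; in the loop detection case the node simply unsplices itself and does not propagate the list.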
There are at least four identified cases in which it makes sense to combine nodes which do loop prevention with nodes which do not: (i) for transition, in intermediate states while transitioning from all non-loop-prevention to all loop prevention, or vice versa; (ii) for interoperability, where one vendor implements loop prevention but another vendor does not; (iii) where there is a mixed ATM and datagram media network, and where loop prevention is desired over the ATM portions of the network but not over the datagram portions; (iv) where some of the ATM switches can do fair access to the buffer pool on a per-VC basis, and some cannot, and loop prevention is desired over the ATM portions of the network which cannot.

Note that interworking is straightforward. If an LSR is not doing loop prevention, and it receives from a downstream LSR a label mapping which contains loop prevention information, it (a) accepts the label mapping, (b) does NOT pass the loop prevention information upstream, and (c) informs the downstream neighbor that the path is loop-free.

Similarly, if an LSR R which is doing loop prevention receives from a downstream LSR a label mapping which does not contain any loop prevention information, then R passes the label mapping upstream with loop prevention information included, as if R were the egress for the specified stream.

Optionally, a node is permitted to implement the ability of either doing or not doing loop prevention as options, and is permitted to choose which to use for any one particular LSP based on the information obtained from downstream nodes. When the label mapping arrives from downstream, the node may choose whether to use loop prevention so as to continue to use the same approach as was used in the information passed to it.
Note that regardless of whether loop prevention is used, the egress node (for any particular LSP) always initiates exchange of label mapping information without waiting for other nodes to act.

2.18. Merging and Non-Merging LSRs

Merge allows multiple upstream LSPs to be merged into a single downstream LSP. When implemented by multiple nodes, this results in the traffic going to a particular egress node, based on one particular stream, following a multipoint-to-point tree (MPT), with the MPT rooted at the egress node and associated with the stream. This can have a significant effect on reducing the number of labels that need to be maintained by any one particular node.

If merge were not used at all, it would be necessary for each node to provide its upstream neighbors with a label for each stream for each upstream node which may be forwarding traffic over the link. This implies that the number of labels needed might not in general be known a priori. However, the use of merge allows a single label to be used per stream, therefore allowing label assignment to be done in a common way without regard for the number of upstream nodes which will be using the downstream LSP.

The proposed MPLS protocol architecture supports LSP merge, while allowing nodes which do not support LSP merge. This leads to the issue of ensuring correct interoperation between nodes which implement merge and those which do not. The issue is somewhat different in the case of datagram media versus the case of ATM. The different media types will therefore be discussed separately.

2.18.1. Stream Merge

Let us say that an LSR is capable of Stream Merge if it can receive two packets from different incoming interfaces, and/or with different labels, and send both packets out the same outgoing interface with the same label.
This in effect takes two incoming streams and merges them into one. Once the packets are transmitted, the information that they arrived from different interfaces and/or with different incoming labels is lost.

Let us say that an LSR is not capable of Stream Merge if, for any two packets which arrive from different interfaces, or with different labels, the packets must either be transmitted out different interfaces, or must have different labels.

An LSR which is capable of Stream Merge (a "Merging LSR") needs to maintain only one outgoing label for each FEC. An LSR which is not capable of Stream Merge (a "Non-merging LSR") may need to maintain as many as N outgoing labels per FEC, where N is the number of LSRs in the network. Hence by supporting Stream Merge, an LSR can reduce its number of outgoing labels by a factor of O(N). Since each label in use requires the dedication of some amount of resources, this can be a significant savings.

2.18.2. Non-merging LSRs

The MPLS forwarding procedures are very similar to the forwarding procedures used by such technologies as ATM and Frame Relay. That is, a unit of data arrives, a label (VPI/VCI or DLCI) is looked up in a "cross-connect table", on the basis of that lookup an output port is chosen, and the label value is rewritten. In fact, it is possible to use such technologies for MPLS forwarding; LDP can be used as the "signalling protocol" for setting up the cross-connect tables.

Unfortunately, these technologies do not necessarily support the Stream Merge capability. In ATM, if one attempts to perform Stream Merge, the result may be the interleaving of cells from various packets. If cells from different packets get interleaved, it is impossible to reassemble the packets. Some Frame Relay switches use cell switching on their backplanes.
These switches may also be incapable of supporting Stream Merge, for the same reason -- cells of different packets may get interleaved, and there is then no way to reassemble the packets.

We propose to support two solutions to this problem. First, MPLS will contain procedures which allow the use of non-merging LSRs. Second, MPLS will support procedures which allow certain ATM switches to function as merging LSRs.

Since MPLS supports both merging and non-merging LSRs, MPLS also contains procedures to ensure correct interoperation between them.

2.18.3. Labels for Merging and Non-Merging LSRs

An upstream LSR which supports Stream Merge needs to be sent only one label per FEC. An upstream neighbor which does not support Stream Merge needs to be sent multiple labels per FEC. However, there is no way of knowing a priori how many labels it needs. This will depend on how many LSRs are upstream of it with respect to the FEC in question.

In the MPLS architecture, if a particular upstream neighbor does not support Stream Merge, it is not sent any labels for a particular FEC unless it explicitly asks for a label for that FEC. The upstream neighbor may make multiple such requests, and is given a new label each time. When a downstream neighbor receives such a request from upstream, and the downstream neighbor does not itself support Stream Merge, then it must in turn ask its downstream neighbor for another label for the FEC in question.

It is possible that there may be some nodes which support merge, but have a limited number of upstream streams which may be merged into a single downstream stream. Suppose for example that, due to some hardware limitation, a node is capable of merging four upstream LSPs into a single downstream LSP. Suppose however that this particular node has six upstream LSPs arriving at it for a particular stream.
In this case, this node may merge these into two downstream LSPs (corresponding to two labels that need to be obtained from the downstream neighbor). In this case, the normal operation of LDP implies that the downstream neighbor will supply this node with a single label for the stream. This node can then ask its downstream neighbor for one additional label for the stream, thereby obtaining the required two labels.

The interaction between explicit routing and merge is FFS.

2.18.4. Merge over ATM

2.18.4.1. Methods of Eliminating Cell Interleave

There are several methods that can be used to eliminate the cell interleaving problem in ATM, thereby allowing ATM switches to support stream merge:

   1. VP merge

      When VP merge is used, multiple virtual paths are merged into a
      single virtual path, but packets from different sources are
      distinguished by using different VCs within the VP.

   2. VC merge

      When VC merge is used, switches are required to buffer cells
      from one packet until the entire packet is received (this may
      be determined by looking for the AAL5 end of frame indicator).

VP merge has the advantage that it is compatible with a higher percentage of existing ATM switch implementations. This makes it more likely that VP merge can be used in existing networks. Unlike VC merge, VP merge does not incur any delays at the merge points and also does not impose any buffer requirements. However, it has the disadvantage that it requires coordination of the VCI space within each VP. There are a number of ways that this can be accomplished. Selection of one or more methods is FFS.

This tradeoff between compatibility with existing equipment versus protocol complexity and scalability implies that it is desirable for the MPLS protocol to support both VP merge and VC merge.
In order to do so, each ATM switch participating in MPLS needs to know whether its immediate ATM neighbors perform VP merge, VC merge, or no merge.

2.18.4.2. Interoperation: VC Merge, VP Merge, and Non-Merge

The interoperation of the various forms of merging over ATM is most easily described by first describing the interoperation of VC merge with non-merge.

In the case where VC merge and non-merge nodes are interconnected, the forwarding of cells is based in all cases on a VC (i.e., the concatenation of the VPI and VCI). For each node, if an upstream neighbor is doing VC merge then that upstream neighbor requires only a single VPI/VCI for a particular stream (this is analogous to the requirement for a single label in the case of operation over frame media). If the upstream neighbor is not doing merge, then the neighbor will require a single VPI/VCI per stream for itself, plus enough VPI/VCIs to pass to its upstream neighbors. The number required will be determined by allowing the upstream nodes to request additional VPI/VCIs from their downstream neighbors (this is again analogous to the method used with frame merge).

A similar method is possible to support nodes which perform VP merge. In this case the VP merge node, rather than requesting a single VPI/VCI or a number of VPI/VCIs from its downstream neighbor, instead may request a single VP (identified by a VPI) but several VCIs within the VP. Furthermore, suppose that a non-merge node is downstream from two different VP merge nodes. This node may need to request one VPI/VCI (for traffic originating from itself) plus two VPs (one for each upstream node), each associated with a specified set of VCIs (as requested from the upstream node).
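The request behavior just described can be sketched as follows. This is an illustrative model only; the function name, the capability strings, and the returned field names are invented for the sketch and do not come from any MPLS specification:

```python
# Hypothetical sketch: how many ATM identifiers a node requests from its
# downstream neighbor for one stream, given its own merge capability and
# the number of VPI/VCIs its upstream neighbors have requested from it.

def identifiers_to_request(capability, upstream_requests, originates_traffic=True):
    """capability        -- "vc-merge", "vp-merge", or "non-merge"
    upstream_requests -- number of VPI/VCIs requested by upstream neighbors
    """
    own = 1 if originates_traffic else 0
    if capability == "vc-merge":
        # All upstream traffic is merged into a single VC.
        return {"vcs": 1, "vps": 0}
    if capability == "vp-merge":
        # One VP, with a distinct VCI inside it per upstream request,
        # plus one VCI for locally originated traffic.
        return {"vcs": 0, "vps": 1, "vcis_in_vp": upstream_requests + own}
    # Non-merge: pass every upstream request through, plus one VC of its own.
    return {"vcs": upstream_requests + own, "vps": 0}
```

Under this model a VC merge node always requests one VC regardless of fan-in, while a non-merge node's requests grow with the number of upstream sources, which is the scaling difference the text describes.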
In order to support all of VP merge, VC merge, and non-merge, it is therefore necessary to allow upstream nodes to request a combination of zero or more VC identifiers (consisting of a VPI/VCI), plus zero or more VPs (identified by VPIs) each containing a specified number of VCs (identified by a set of VCIs which are significant within a VP). VP merge nodes would therefore request one VP, with a contained VCI for traffic that they originate (if appropriate) plus a VCI for each VC requested from above (regardless of whether or not the VC is part of a containing VP). VC merge nodes would request only a single VPI/VCI (since they can merge all upstream traffic into a single VC). Non-merge nodes would pass on any requests that they get from above, plus request a VPI/VCI for traffic that they originate (if appropriate).

2.19. LSP Control: Egress versus Local

There is a choice to be made regarding whether the initial setup of LSPs will be initiated by the egress node, or locally by each individual node.

When LSP control is done locally, then each node may at any time pass label bindings to its neighbors for each FEC recognized by that node.

In the normal case that the neighboring nodes recognize the same FECs, nodes may map incoming labels to outgoing labels as part of the normal label swapping forwarding method.

When LSP control is done by the egress, then initially only the egress node passes label bindings to its neighbors, corresponding to any FECs which leave the MPLS network at that egress node. Other nodes wait until they get a label from downstream for a particular FEC before passing a corresponding label for the same FEC to upstream nodes.

With local control, since each LSR is (at least initially) independently assigning labels to FECs, it is possible that different LSRs may make inconsistent decisions.
For example, an upstream LSR may make a coarse decision (map multiple IP address prefixes to a single label) while its downstream neighbor makes a finer grain decision (map each individual IP address prefix to a separate label). With downstream label assignment this can be corrected by having LSRs withdraw the labels that they have assigned which are inconsistent with downstream labels, and replace them with new, consistent label assignments.

Even with egress control it is possible that the choice of egress node may change, or the egress may (based on a change in configuration) change its mind in terms of the granularity which is to be used. This implies that the same mechanism will be necessary to allow changes in granularity to bubble up to upstream nodes. The choice of egress or local control may therefore affect the frequency with which this mechanism is used, but will not affect the need for a mechanism to achieve consistency of label granularity. Generally speaking, the choice of local versus egress control does not appear to have any effect on the LDP mechanisms which need to be defined.

Egress control and local control can interwork in a very straightforward manner (although when both methods exist in the network, the overall behavior of the network is largely that of local control). With either approach (assuming downstream label assignment), the egress node will initially assign labels for particular FECs and will pass these labels to its neighbors. With either approach these label assignments will bubble upstream, with the upstream nodes choosing labels that are consistent with the labels that they receive from downstream.
The difference between the two approaches is therefore primarily an issue of what each node does prior to obtaining a label assignment for a particular FEC from downstream nodes: does it wait, or does it assign a preliminary label under the expectation that it will (probably) be correct?

Regardless of which method is used (local control or egress control), each node needs to know (possibly by configuration) what granularity to use for the labels that it assigns. Where egress control is used, this requires each node to know the granularity only for streams which leave the MPLS network at that node. For local control, in order to avoid the need to withdraw inconsistent labels, each node in the network would need to be configured consistently to know the granularity for each stream. However, in many cases this may be done by using a single level of granularity which applies to all streams (such as "one label per IP prefix in the forwarding table").

This architecture allows the choice between local control and egress control to be a local matter. Since the two methods interwork, a given LSR need support only one or the other.

2.20. Granularity

When forwarding by label swapping, a stream of packets arriving from upstream may be mapped into an equal or coarser grain stream. However, a coarse grain stream (for example, containing packets destined for a short IP address prefix covering many subnets) cannot be mapped directly into a finer grain stream (for example, containing packets destined for a longer IP address prefix covering a single subnet). This implies that there needs to be some mechanism for ensuring consistency between the granularity of LSPs in an MPLS network.

The method used for ensuring compatibility of granularity may depend upon the method used for LSP control.
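The withdraw-and-replace repair discussed in section 2.19 can be sketched as follows. This is a minimal illustration under invented assumptions: strings stand in for address prefixes (a longer string being a finer-grain prefix covered by a shorter one), integers stand in for labels, and the function name is made up for the sketch:

```python
# Illustrative sketch (not from the draft): an LSR under local control
# repairs a granularity mismatch by withdrawing a coarse label it advertised
# upstream and re-advertising finer-grain bindings matching what its
# downstream neighbor advertised.

def reconcile_granularity(advertised, downstream):
    """advertised: {prefix: label} this LSR sent upstream;
    downstream: {prefix: label} received from its downstream neighbor.
    Returns (withdrawals, new_advertisements)."""
    withdraw, announce = [], {}
    next_label = max(advertised.values(), default=0) + 1
    for prefix, label in advertised.items():
        # A coarse prefix is inconsistent if downstream advertised only
        # finer-grain prefixes covered by it, not the prefix itself.
        finer = [d for d in downstream if d != prefix and d.startswith(prefix)]
        if prefix not in downstream and finer:
            withdraw.append((prefix, label))
            for d in finer:                 # re-advertise at the finer grain
                announce[d] = next_label
                next_label += 1
    return withdraw, announce
```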
When LSP control is local, it is possible that a node may pass a coarse grain label to its upstream neighbor(s), and subsequently receive a finer grain label from its downstream neighbor. In this case the node has two options: (i) it may forward the corresponding packets using normal IP datagram forwarding (i.e., by examination of the IP header); or (ii) it may withdraw the label mappings that it has passed to its upstream neighbors, and replace these with finer grain label mappings.

When LSP control is egress based, the label setup originates from the egress node and passes upstream. It is therefore straightforward with this approach to maintain equally-grained mappings along the route.

2.21. Tunnels and Hierarchy

Sometimes a router Ru takes explicit action to cause a particular packet to be delivered to another router Rd, even though Ru and Rd are not consecutive routers on the Hop-by-hop path for that packet, and Rd is not the packet's ultimate destination. For example, this may be done by encapsulating the packet inside a network layer packet whose destination address is the address of Rd itself. This creates a "tunnel" from Ru to Rd. We refer to any packet so handled as a "Tunneled Packet".

2.21.1. Hop-by-Hop Routed Tunnel

If a Tunneled Packet follows the Hop-by-hop path from Ru to Rd, we say that it is in a "Hop-by-Hop Routed Tunnel" whose "transmit endpoint" is Ru and whose "receive endpoint" is Rd.

2.21.2. Explicitly Routed Tunnel

If a Tunneled Packet travels from Ru to Rd over a path other than the Hop-by-hop path, we say that it is in an "Explicitly Routed Tunnel" whose "transmit endpoint" is Ru and whose "receive endpoint" is Rd. For example, we might send a packet through an Explicitly Routed Tunnel by encapsulating it in a packet which is source routed.

2.21.3.
LSP Tunnels

It is possible to implement a tunnel as an LSP, and use label switching rather than network layer encapsulation to cause the packet to travel through the tunnel. The tunnel would be an LSP <R1, ..., Rn>, where R1 is the transmit endpoint of the tunnel, and Rn is the receive endpoint of the tunnel. This is called an "LSP Tunnel".

The set of packets which are to be sent through the LSP tunnel becomes a stream, and each LSR in the tunnel must assign a label to that stream (i.e., must assign a label to the tunnel). The criteria for assigning a particular packet to an LSP tunnel is a local matter at the tunnel's transmit endpoint. To put a packet into an LSP tunnel, the transmit endpoint pushes a label for the tunnel onto the label stack and sends the labeled packet to the next hop in the tunnel.

If it is not necessary for the tunnel's receive endpoint to be able to determine which packets it receives through the tunnel, as discussed earlier, the label stack may be popped at the penultimate LSR in the tunnel.

A "Hop-by-Hop Routed LSP Tunnel" is a Tunnel that is implemented as a hop-by-hop routed LSP between the transmit endpoint and the receive endpoint.

An "Explicitly Routed LSP Tunnel" is an LSP Tunnel that is also an Explicitly Routed LSP.

2.21.4. Hierarchy: LSP Tunnels within LSPs

Consider an LSP <R1, R2, R3, R4>. Let us suppose that R1 receives unlabeled packet P, and pushes on its label stack the label to cause it to follow this path, and that this is in fact the Hop-by-hop path. However, let us further suppose that R2 and R3 are not directly connected, but are "neighbors" by virtue of being the endpoints of an LSP tunnel. So the actual sequence of LSRs traversed by P is <R1, R2, R21, R22, R23, R3, R4>.

When P travels from R1 to R2, it will have a label stack of depth 1. R2, switching on the label, determines that P must enter the tunnel.
R2 first replaces the incoming label with a label that is meaningful to R3. Then it pushes on a new label. This level 2 label has a value which is meaningful to R21. Switching is done on the level 2 label by R21, R22, R23. R23, which is the penultimate hop in the R2-R3 tunnel, pops the label stack before forwarding the packet to R3. When R3 sees packet P, P has only a level 1 label, having now exited the tunnel. Since R3 is the penultimate hop in P's level 1 LSP, it pops the label stack, and R4 receives P unlabeled.

The label stack mechanism allows LSP tunneling to nest to any depth.

2.21.5. LDP Peering and Hierarchy

Suppose that packet P travels along a Level 1 LSP <R1, R2, R3, R4>, and when going from R2 to R3 travels along a Level 2 LSP <R2, R21, R22, R23, R3>. From the perspective of the Level 2 LSP, R2's LDP peer is R21. From the perspective of the Level 1 LSP, R2's LDP peers are R1 and R3. One can have LDP peers at each layer of hierarchy. We will see in sections 3.6 and 3.7 some ways to make use of this hierarchy. Note that in this example, R2 and R21 must be IGP neighbors, but R2 and R3 need not be.

When two LSRs are IGP neighbors, we will refer to them as "Local LDP Peers". When two LSRs may be LDP peers, but are not IGP neighbors, we will refer to them as "Remote LDP Peers". In the above example, R2 and R21 are local LDP peers, but R2 and R3 are remote LDP peers.

The MPLS architecture supports two ways to distribute labels at different layers of the hierarchy: Explicit Peering and Implicit Peering.

One performs label distribution with one's Local LDP Peers by opening LDP connections to them. One can perform label distribution with one's Remote LDP Peers in one of two ways:

   1. Explicit Peering

      In explicit peering, one sets up LDP connections between Remote
      LDP Peers, exactly as one would do for Local LDP Peers.
      This technique is most useful when the number of Remote LDP
      Peers is small, or the number of higher level label mappings is
      large, or the Remote LDP Peers are in distinct routing areas or
      domains. Of course, one needs to know which labels to
      distribute to which peers; this is addressed in section 3.1.2.

      Examples of the use of explicit peering are found in sections
      3.2.1 and 3.6.

   2. Implicit Peering

      In Implicit Peering, one does not have LDP connections to one's
      remote LDP peers, but only to one's local LDP peers. To
      distribute higher level labels to one's remote LDP peers, one
      encodes the higher level labels as an attribute of the lower
      level labels, and distributes the lower level label, along with
      this attribute, to the local LDP peers. The local LDP peers
      then propagate the information to their peers. This process
      continues until the information reaches the remote LDP peers.
      Note that the intermediary nodes may also be remote LDP peers.

      This technique is most useful when the number of Remote LDP
      Peers is large. Implicit peering does not require an n-square
      peering mesh to distribute labels to the remote LDP peers,
      because the information is piggybacked through the local LDP
      peering. However, implicit peering requires the intermediate
      nodes to store information that they might not be directly
      interested in.

      An example of the use of implicit peering is found in section
      3.3.

2.22. LDP Transport

LDP is used between nodes in an MPLS network to establish and maintain the label mappings. In order for LDP to operate correctly, LDP information needs to be transmitted reliably, and the LDP messages pertaining to a particular FEC need to be transmitted in sequence. Flow control is also required, as is the capability to carry multiple LDP messages in a single datagram.
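The transport requirements just listed (reliable, in-sequence delivery, several LDP messages per datagram) can be illustrated with a toy framing scheme. This is not the actual LDP PDU format; it is merely a sketch of length-prefixed messages sharing one buffer, as they would share one ordered byte stream:

```python
import struct

# Illustrative sketch only: several messages packed into one buffer and
# recovered in order, the two properties (multiplexing and sequencing)
# that the LDP transport must provide.

def pack(messages):
    """Concatenate messages, each preceded by a 2-byte big-endian length."""
    return b"".join(struct.pack("!H", len(m)) + m for m in messages)

def unpack(buf):
    """Recover the messages, in their original order, from a packed buffer."""
    out, i = [], 0
    while i < len(buf):
        (n,) = struct.unpack_from("!H", buf, i)
        out.append(buf[i + 2 : i + 2 + n])
        i += 2 + n
    return out
```

A scheme like this only works if the underlying transport delivers the bytes reliably and in order, which is why a stream transport such as TCP satisfies the requirements.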
These goals will be met by using TCP as the underlying transport for LDP.

(The use of multicast techniques to distribute label mappings is FFS.)

2.23. Label Encodings

In order to transmit a label stack along with the packet whose label stack it is, it is necessary to define a concrete encoding of the label stack. The architecture supports several different encoding techniques; the choice of encoding technique depends on the particular kind of device being used to forward labeled packets.

2.23.1. MPLS-specific Hardware and/or Software

If one is using MPLS-specific hardware and/or software to forward labeled packets, the most obvious way to encode the label stack is to define a new protocol to be used as a "shim" between the data link layer and network layer headers. This shim would really be just an encapsulation of the network layer packet; it would be "protocol-independent" such that it could be used to encapsulate any network layer. Hence we will refer to it as the "generic MPLS encapsulation".

The generic MPLS encapsulation would in turn be encapsulated in a data link layer protocol.

The generic MPLS encapsulation should contain the following fields:

   1. the label stack,

   2. a Time-to-Live (TTL) field, and

   3. a Class of Service (CoS) field.

The TTL field permits MPLS to provide a TTL function similar to what is provided by IP.

The CoS field permits LSRs to apply various packet scheduling disciplines to labeled packets, without requiring separate labels for separate disciplines.

2.23.2. ATM Switches as LSRs

It will be noted that MPLS forwarding procedures are similar to those of legacy "label swapping" switches such as ATM switches.
ATM switches use the input port and the incoming VPI/VCI value as the index into a "cross-connect" table, from which they obtain an output port and an outgoing VPI/VCI value. Therefore if one or more labels can be encoded directly into the fields which are accessed by these legacy switches, then the legacy switches can, with suitable software upgrades, be used as LSRs. We will refer to such devices as "ATM-LSRs".

There are three obvious ways to encode labels in the ATM cell header (presuming the use of AAL5):

   1. SVC Encoding

      Use the VPI/VCI field to encode the label which is at the top
      of the label stack. This technique can be used in any network.
      With this encoding technique, each LSP is realized as an ATM
      SVC, and the LDP becomes the ATM "signaling" protocol. With
      this encoding technique, the ATM-LSRs cannot perform "push" or
      "pop" operations on the label stack.

   2. SVP Encoding

      Use the VPI field to encode the label which is at the top of
      the label stack, and the VCI field to encode the second label
      on the stack, if one is present. This technique has some
      advantages over the previous one, in that it permits the use
      of ATM "VP-switching". That is, the LSPs are realized as ATM
      SVPs, with LDP serving as the ATM signaling protocol.

      However, this technique cannot always be used. If the network
      includes an ATM Virtual Path through a non-MPLS ATM network,
      then the VPI field is not necessarily available for use by
      MPLS.

      When this encoding technique is used, the ATM-LSR at the egress
      of the VP effectively does a "pop" operation.

   3. SVP Multipoint Encoding

      Use the VPI field to encode the label which is at the top of
      the label stack, use part of the VCI field to encode the second
      label on the stack, if one is present, and use the remainder of
      the VCI field to identify the LSP ingress.
      If this technique is used, conventional ATM VP-switching
      capabilities can be used to provide multipoint-to-point VPs.
      Cells from different packets will then carry different VCI
      values, so multipoint-to-point VPs can be provided without any
      cell interleaving problems.

      This technique depends on the existence of a capability for
      assigning small unique values to each ATM switch.

If there are more labels on the stack than can be encoded in the ATM header, the ATM encodings must be combined with the generic encapsulation. This does presuppose that it be possible to tell, when reassembling the ATM cells into packets, whether the generic encapsulation is also present.

2.23.3. Interoperability among Encoding Techniques

If <R1, R2, R3> is a segment of an LSP, it is possible that R1 will use one encoding of the label stack when transmitting packet P to R2, but R2 will use a different encoding when transmitting packet P to R3. In general, the MPLS architecture supports LSPs with different label stack encodings used on different hops. Therefore, when we discuss the procedures for processing a labeled packet, we speak in abstract terms of operating on the packet's label stack. When a labeled packet is received, the LSR must decode it to determine the current value of the label stack, then must operate on the label stack to determine the new value of the stack, and then encode the new value appropriately before transmitting the labeled packet to its next hop.

Unfortunately, ATM switches have no capability for translating from one encoding technique to another. The MPLS architecture therefore requires that whenever it is possible for two ATM switches to be successive LSRs along a level m LSP for some packet, those two ATM switches use the same encoding technique.
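The decode / operate / re-encode cycle described above can be sketched for the shim case. The draft does not fix a bit layout for the generic encapsulation, so this sketch assumes, purely for illustration, the 32-bit-per-entry layout later standardized in RFC 3032: a 20-bit label, a 3-bit CoS field, a 1-bit bottom-of-stack flag, and an 8-bit TTL:

```python
import struct

# Illustrative shim encoding (layout borrowed from RFC 3032, not defined
# by this draft): each stack entry is one 32-bit word.

def encode_stack(entries):
    """entries: list of (label, cos, ttl) tuples, top of stack first."""
    out = b""
    for i, (label, cos, ttl) in enumerate(entries):
        bottom = 1 if i == len(entries) - 1 else 0   # mark the last entry
        out += struct.pack("!I", (label << 12) | (cos << 9) | (bottom << 8) | ttl)
    return out

def decode_stack(buf):
    """Inverse of encode_stack: recover (label, cos, ttl) entries."""
    entries, i = [], 0
    while True:
        (w,) = struct.unpack_from("!I", buf, i)
        entries.append((w >> 12, (w >> 9) & 0x7, w & 0xFF))
        i += 4
        if (w >> 8) & 1:          # bottom-of-stack bit set: stack ends here
            return entries
```

An LSR with mixed interface types would decode with whatever scheme the incoming link uses, operate on the abstract (label, cos, ttl) entries, and re-encode for the outgoing link.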
Naturally there will be MPLS networks which contain a combination of ATM switches operating as LSRs, and other LSRs which operate using an MPLS shim header. In such networks there may be some LSRs which have ATM interfaces as well as "MPLS Shim" interfaces. This is one example of an LSR with different label stack encodings on different hops. Such an LSR may swap off an ATM-encoded label stack on an incoming interface and replace it with an MPLS shim header encoded label stack on the outgoing interface.

2.24. Multicast

This section is for further study.

3. Some Applications of MPLS

3.1. MPLS and Hop by Hop Routed Traffic

One use of MPLS is to simplify the process of forwarding packets using hop by hop routing.

3.1.1. Labels for Address Prefixes

In general, router R determines the next hop for packet P by finding the address prefix X in its routing table which is the longest match for P's destination address. That is, the packets in a given stream are just those packets which match a given address prefix in R's routing table. In this case, a stream can be identified with an address prefix.

If packet P must traverse a sequence of routers, and at each router in the sequence P matches the same address prefix, MPLS simplifies the forwarding process by enabling all routers but the first to avoid executing the best match algorithm; they need only look up the label.

3.1.2. Distributing Labels for Address Prefixes

3.1.2.1. LDP Peers for a Particular Address Prefix

LSRs R1 and R2 are considered to be LDP Peers for address prefix X if and only if one of the following conditions holds:

   1. R1's route to X is a route which it learned about via a
      particular instance of a particular IGP, and R2 is a neighbor
      of R1 in that instance of that IGP

   2.
      R1's route to X is a route which it learned about by some
      instance of routing algorithm A1, and that route is
      redistributed into an instance of routing algorithm A2, and R2
      is a neighbor of R1 in that instance of A2

   3. R1 is the receive endpoint of an LSP Tunnel that is within
      another LSP, and R2 is a transmit endpoint of that tunnel, and
      R1 and R2 are participants in a common instance of an IGP, and
      are in the same IGP area (if the IGP in question has areas),
      and R1's route to X was learned via that IGP instance, or is
      redistributed by R1 into that IGP instance

   4. R1's route to X is a route which it learned about via BGP, and
      R2 is a BGP peer of R1

In general, these rules ensure that if the route to a particular address prefix is distributed via an IGP, the LDP peers for that address prefix are the IGP neighbors. If the route to a particular address prefix is distributed via BGP, the LDP peers for that address prefix are the BGP peers. In other cases of LSP tunneling, the tunnel endpoints are LDP peers.

3.1.2.2. Distributing Labels

In order to use MPLS for the forwarding of normally routed traffic, each LSR MUST:

   1. bind one or more labels to each address prefix that appears in
      its routing table;

   2. for each such address prefix X, use an LDP to distribute the
      mapping of a label to X to each of its LDP Peers for X.

There is also one circumstance in which an LSR must distribute a label mapping for an address prefix, even if it is not the LSR which bound that label to that address prefix:

   3. If R1 uses BGP to distribute a route to X, naming some other
      LSR R2 as the BGP Next Hop to X, and if R1 knows that R2 has
      assigned label L to X, then R1 must distribute the mapping
      between L and X to any BGP peer to which it distributes that
      route.
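Rule 3 above can be sketched as follows. The function and its arguments are invented for illustration; it models only the decision of which label mappings R1 must pass along when it re-advertises BGP routes whose next hop is another LSR:

```python
# Hypothetical sketch of distribution rule 3: when R1 advertises a BGP
# route for prefix X naming R2 as BGP Next Hop, and R1 knows the label R2
# assigned to X, R1 distributes that (prefix, label) mapping to each BGP
# peer receiving the route.

def mappings_to_distribute(routes, labels_learned, bgp_peers):
    """routes: {prefix: next_hop} BGP routes R1 re-advertises;
    labels_learned: {(next_hop, prefix): label} bindings R1 has learned;
    returns {peer: [(prefix, label), ...]}."""
    out = {peer: [] for peer in bgp_peers}
    for prefix, nh in routes.items():
        label = labels_learned.get((nh, prefix))
        if label is not None:             # R1 knows R2's label for X
            for peer in bgp_peers:
                out[peer].append((prefix, label))
    return out
```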
These rules ensure that labels corresponding to address prefixes which correspond to BGP routes are distributed to IGP neighbors if and only if the BGP routes are distributed into the IGP. Otherwise, the labels bound to BGP routes are distributed only to the other BGP speakers.

These rules are intended to indicate which label mappings must be distributed by a given LSR to which other LSRs, NOT to indicate the conditions under which the distribution is to be made. That is discussed in section 2.19.

3.1.3. Using the Hop by Hop path as the LSP

If the hop-by-hop path that packet P needs to follow is <R1, ..., Rn>, then <R1, ..., Rn> can be an LSP as long as:

   1. there is a single address prefix X, such that, for all i,
      1<=i

and the Hop-by-hop path for P2 is <R4, R2, R3>. Let's suppose that R3 binds label L3 to X, and distributes this mapping to R2. R2 binds label L2 to X, and distributes this mapping to both R1 and R4. When R2 receives packet P1, its incoming label will be L2. R2 will overwrite L2 with L3, and send P1 to R3. When R2 receives packet P2, its incoming label will also be L2. R2 again overwrites L2 with L3, and sends P2 on to R3.

Note then that when P1 and P2 are traveling from R2 to R3, they carry the same label, and as far as MPLS is concerned, they cannot be distinguished. Thus instead of talking about two distinct LSPs, <R1, R2, R3> and <R4, R2, R3>, we might talk of a single "Multipoint-to-Point LSP Tree", which we might denote as <{R1, R4}, R2, R3>.

This creates a difficulty when we attempt to use conventional ATM switches as LSRs. Since conventional ATM switches do not support multipoint-to-point connections, there must be procedures to ensure that each LSP is realized as a point-to-point VC. However, if ATM switches which do support multipoint-to-point VCs are in use, then the LSPs can be most efficiently realized as multipoint-to-point VCs.
Alternatively, if the SVP Multipoint Encoding (section 2.23) can be used, the LSPs can be realized as multipoint-to-point SVPs.

3.6. LSP Tunneling between BGP Border Routers

Consider the case of an Autonomous System, A, which carries transit traffic between other Autonomous Systems. Autonomous System A will have a number of BGP Border Routers, and a mesh of BGP connections among them, over which BGP routes are distributed. In many such cases, it is desirable to avoid distributing the BGP routes to routers which are not BGP Border Routers. If this can be avoided, the "route distribution load" on those routers is significantly reduced. However, there must be some means of ensuring that the transit traffic will be delivered from Border Router to Border Router by the interior routers.

This can easily be done by means of LSP Tunnels. Suppose that BGP routes are distributed only to BGP Border Routers, and not to the interior routers that lie along the Hop-by-hop path from Border Router to Border Router. LSP Tunnels can then be used as follows:

   1. Each BGP Border Router distributes, to every other BGP Border
      Router in the same Autonomous System, a label for each address
      prefix that it distributes to that router via BGP.

   2. The IGP for the Autonomous System maintains a host route for
      each BGP Border Router. Each interior router distributes its
      labels for these host routes to each of its IGP neighbors.

   3.
Suppose that: 2147 a) BGP Border Router B1 receives an unlabeled packet P, 2149 b) address prefix X in B1's routing table is the longest 2150 match for the destination address of P, 2152 c) the route to X is a BGP route, 2154 d) the BGP Next Hop for X is B2, 2156 e) B2 has bound label L1 to X, and has distributed this 2157 mapping to B1, 2159 f) the IGP next hop for the address of B2 is I1, 2161 g) the address of B2 is in B1's and I1's IGP routing tables 2162 as a host route, and 2164 h) I1 has bound label L2 to the address of B2, and 2165 distributed this mapping to B1. 2167 Then before sending packet P to I1, B1 must create a label 2168 stack for P, then push on label L1, and then push on label L2. 2170 4. Suppose that BGP Border Router B1 receives a labeled Packet P, 2171 where the label on the top of the label stack corresponds to an 2172 address prefix, X, to which the route is a BGP route, and that 2173 conditions 3b, 3c, 3d, and 3e all hold. Then before sending 2174 packet P to I1, B1 must replace the label at the top of the 2175 label stack with L1, and then push on label L2. 2177 With these procedures, a given packet P follows a level 1 LSP all of 2178 whose members are BGP Border Routers, and between each pair of BGP 2179 Border Routers in the level 1 LSP, it follows a level 2 LSP. 2181 These procedures effectively create a Hop-by-Hop Routed LSP Tunnel 2182 between the BGP Border Routers. 2184 Since the BGP border routers are exchanging label mappings for 2185 address prefixes that are not even known to the IGP routing, the BGP 2186 routers should become explicit LDP peers with each other. 2188 3.7. Other Uses of Hop-by-Hop Routed LSP Tunnels 2190 The use of Hop-by-Hop Routed LSP Tunnels is not restricted to tunnels 2191 between BGP Next Hops. Any situation in which one might otherwise 2192 have used an encapsulation tunnel is one in which it is appropriate 2193 to use a Hop-by-Hop Routed LSP Tunnel. 
Instead of encapsulating the 2194 packet with a new header whose destination address is the address of 2195 the tunnel's receive endpoint, the label corresponding to the address 2196 prefix which is the longest match for the address of the tunnel's 2197 receive endpoint is pushed on the packet's label stack. The packet 2198 which is sent into the tunnel may or may not already be labeled. 2200 If the transmit endpoint of the tunnel wishes to put a labeled packet 2201 into the tunnel, it must first replace the label value at the top of 2202 the stack with a label value that was distributed to it by the 2203 tunnel's receive endpoint. Then it must push on the label which 2204 corresponds to the tunnel itself, as distributed to it by the next 2205 hop along the tunnel. To allow this, the tunnel endpoints should be 2206 explicit LDP peers. The label mappings they need to exchange are of 2207 no interest to the LSRs along the tunnel. 2209 3.8. MPLS and Multicast 2211 Multicast routing proceeds by constructing multicast trees. The tree 2212 along which a particular multicast packet must get forwarded depends 2213 in general on the packet's source address and its destination 2214 address. Whenever a particular LSR is a node in a particular 2215 multicast tree, it binds a label to that tree. It then distributes 2216 that mapping to its parent on the multicast tree. (If the node in 2217 question is on a LAN, and has siblings on that LAN, it must also 2218 distribute the mapping to its siblings. This allows the parent to 2219 use a single label value when multicasting to all children on the 2220 LAN.) 2222 When a multicast labeled packet arrives, the NHLFE corresponding to 2223 the label indicates the set of output interfaces for that packet, as 2224 well as the outgoing label. If the same label encoding technique is 2225 used on all the outgoing interfaces, the very same packet can be sent 2226 to all the children. 2228 4. 
LDP Procedures for Hop-by-Hop Routed Traffic 2230 4.1. The Procedures for Advertising and Using Labels 2232 In this section, we consider only label mappings that are used for 2233 traffic to be label switched along its hop-by-hop routed path. In 2234 these cases, the label in question will correspond to an address 2235 prefix in the routing table. 2237 There are a number of different procedures that may be used to 2238 distribute label mappings. Some of these procedures are executed by the 2239 downstream LSR, and others by the upstream LSR. 2241 The downstream LSR must perform: 2243 - The Distribution Procedure, and 2245 - the Withdraw Procedure. 2247 The upstream LSR must perform: 2249 - The Request Procedure, and 2251 - the NotAvailable Procedure, and 2253 - the Release Procedure, and 2255 - the labelUse Procedure. 2257 The MPLS architecture supports several variants of each procedure. 2259 However, the MPLS architecture does not support all possible 2260 combinations of all possible variants. The set of supported 2261 combinations will be described in section 4.2, where the 2262 interoperability between different combinations will also be 2263 discussed. 2265 4.1.1. Downstream LSR: Distribution Procedure 2267 The Distribution Procedure is used by a downstream LSR to determine 2268 when it should distribute a label mapping for a particular address 2269 prefix to its LDP peers. The architecture supports four different 2270 distribution procedures. 2272 Irrespective of the particular procedure that is used, if a label 2273 mapping for a particular address prefix has been distributed by a 2274 downstream LSR Rd to an upstream LSR Ru, and if at any time the 2275 attributes (as defined above) of that mapping change, then Rd must 2276 inform Ru of the new attributes.
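The bookkeeping that this last requirement implies for the downstream LSR can be sketched as follows. This is a minimal illustration only, not part of the architecture; the class, attribute names, and the "hop_count" attribute are invented for the example:

```python
# Hypothetical sketch: a downstream LSR (Rd) tracks which label mapping
# (label plus attributes) it has distributed to each upstream peer (Ru),
# and re-advertises whenever the attributes of a mapping change.

class DownstreamLSR:
    def __init__(self):
        # (peer, prefix) -> (label, attributes) already advertised
        self.advertised = {}

    def distribute(self, peer, prefix, label, attributes, send):
        """Record and send a label mapping to an upstream peer."""
        self.advertised[(peer, prefix)] = (label, attributes)
        send(peer, prefix, label, attributes)

    def attributes_changed(self, prefix, new_attributes, send):
        """When a mapping's attributes change, Rd must inform every
        upstream peer that currently holds the old mapping."""
        for (peer, pfx), (label, _) in list(self.advertised.items()):
            if pfx == prefix:
                self.distribute(peer, pfx, label, new_attributes, send)

sent = []
rd = DownstreamLSR()
rd.distribute("Ru1", "203.0.113.0/24", 17, {"hop_count": 3},
              lambda *m: sent.append(m))
rd.attributes_changed("203.0.113.0/24", {"hop_count": 4},
                      lambda *m: sent.append(m))
```

Note that the re-advertisement reuses the same label; only the attributes carried with the mapping are updated.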
2278 If an LSR is maintaining multiple routes to a particular address 2279 prefix, it is a local matter as to whether that LSR maps multiple 2280 labels to the address prefix (one per route), and hence distributes 2281 multiple mappings. 2283 4.1.1.1. PushUnconditional 2285 Let Rd be an LSR. Suppose that: 2287 1. X is an address prefix in Rd's routing table 2289 2. Ru is an LDP Peer of Rd with respect to X 2291 Whenever these conditions hold, Rd must map a label to X and 2292 distribute that mapping to Ru. It is the responsibility of Rd to 2293 keep track of the mappings which it has distributed to Ru, and to 2294 make sure that Ru always has these mappings. 2296 4.1.1.2. PushConditional 2298 Let Rd be an LSR. Suppose that: 2300 1. X is an address prefix in Rd's routing table 2302 2. Ru is an LDP Peer of Rd with respect to X 2304 3. Rd is either an LSP Egress or an LSP Proxy Egress for X, or 2305 Rd's L3 next hop for X is Rn, where Rn is distinct from Ru, and 2306 Rn has bound a label to X and distributed that mapping to Rd. 2308 Then as soon as these conditions all hold, Rd should map a label to X 2309 and distribute that mapping to Ru. 2311 Whereas PushUnconditional causes the distribution of label mappings 2312 for all address prefixes in the routing table, PushConditional causes 2313 the distribution of label mappings only for those address prefixes 2314 for which one has received label mappings from one's LSP next hop, or 2315 for which one does not have an MPLS-capable L3 next hop. 2317 4.1.1.3. PulledUnconditional 2319 Let Rd be an LSR. Suppose that: 2321 1. X is an address prefix in Rd's routing table 2323 2. Ru is a label distribution peer of Rd with respect to X 2325 3. Ru has explicitly requested that Rd map a label to X and 2326 distribute the mapping to Ru 2328 Then Rd should map a label to X and distribute that mapping to Ru. 
2329 Note that if X is not in Rd's routing table, or if Rd is not an LDP 2330 peer of Ru with respect to X, then Rd must inform Ru that it cannot 2331 provide a mapping at this time. 2333 If Rd has already distributed a mapping for address prefix X to Ru, 2334 and it receives a new request from Ru for a mapping for address 2335 prefix X, it will map a second label, and distribute the new mapping 2336 to Ru. The first label mapping remains in effect. 2338 4.1.1.4. PulledConditional 2340 Let Rd be an LSR. Suppose that: 2342 1. X is an address prefix in Rd's routing table 2344 2. Ru is a label distribution peer of Rd with respect to X 2346 3. Ru has explicitly requested that Rd map a label to X and 2347 distribute the mapping to Ru 2349 4. Rd is either an LSP Egress or an LSP Proxy Egress for X, or 2350 Rd's L3 next hop for X is Rn, where Rn is distinct from Ru, and 2351 Rn has bound a label to X and distributed that mapping to Rd. 2354 Then as soon as these conditions all hold, Rd should map a label to X 2355 and distribute that mapping to Ru. Note that if X is not in Rd's 2356 routing table, or if Rd is not a label distribution peer of Ru with 2357 respect to X, then Rd must inform Ru that it cannot provide a mapping 2358 at this time. 2360 However, if the only condition that fails to hold is that Rn has not 2361 yet provided a label to Rd, then Rd must defer any response to Ru 2362 until such time as it has received a mapping from Rn. 2364 If Rd has distributed a label mapping for address prefix X to Ru, and 2365 at some later time, any attribute of the label mapping changes, then 2366 Rd must redistribute the label mapping to Ru, with the new attribute. 2367 It must do this even though Ru does not issue a new Request. 2369 In section 4.2, we will discuss how to choose the particular 2370 procedure to be used at any given time, and how to ensure 2371 interoperability among LSRs that choose different procedures. 2373 4.1.2.
Upstream LSR: Request Procedure 2375 The Request Procedure is used by the upstream LSR for an address 2376 prefix to determine when to explicitly request that the downstream 2377 LSR map a label to that prefix and distribute the mapping. There are 2378 three possible procedures that can be used. 2380 4.1.2.1. RequestNever 2382 Never make a request. This is useful if the downstream LSR uses the 2383 PushConditional procedure or the PushUnconditional procedure, but is 2384 not useful if the downstream LSR uses the PulledUnconditional 2385 procedure or the PulledConditional procedure. 2387 4.1.2.2. RequestWhenNeeded 2389 Make a request whenever the L3 next hop to the address prefix 2390 changes, and one doesn't already have a label mapping from that next 2391 hop for the given address prefix. 2393 4.1.2.3. RequestOnRequest 2395 Issue a request whenever a request is received, in addition to 2396 issuing a request when needed (as described in section 4.1.2.2). If 2397 Rd receives such a request from Ru, for an address prefix X for which 2398 Rd has already distributed a label to Ru, Rd shall assign a new 2399 (distinct) label, map it to X, and distribute that mapping. (Whether 2400 Rd can distribute this mapping to Ru immediately or not depends on 2401 the Distribution Procedure being used.) 2403 This procedure is useful when the LSRs are implemented on 2404 conventional ATM switching hardware. 2406 4.1.3. Upstream LSR: NotAvailable Procedure 2408 If Ru and Rd are respectively upstream and downstream label 2409 distribution peers for address prefix X, and Rd is Ru's L3 next hop 2410 for X, and Ru requests a mapping for X from Rd, but Rd replies that 2411 it cannot provide a mapping at this time, then the NotAvailable 2412 procedure determines how Ru responds. There are two possible 2413 procedures governing Ru's behavior: 2415 4.1.3.1. RequestRetry 2417 Ru should issue the request again at a later time.
That is, the 2418 requester is responsible for trying again later to obtain the needed 2419 mapping. 2421 4.1.3.2. RequestNoRetry 2423 Ru should never reissue the request, instead assuming that Rd will 2424 provide the mapping automatically when it is available. This is 2425 useful if Rd uses the PushUnconditional procedure or the 2426 PushConditional procedure. 2428 4.1.4. Upstream LSR: Release Procedure 2430 Suppose that Rd is an LSR which has bound a label to address prefix 2431 X, and has distributed that mapping to LSR Ru. If Rd does not happen 2432 to be Ru's L3 next hop for address prefix X, or has ceased to be Ru's 2433 L3 next hop for address prefix X, then Ru will not be using the 2434 label. The Release Procedure determines how Ru acts in this case. 2435 There are two possible procedures governing Ru's behavior: 2437 4.1.4.1. ReleaseOnChange 2439 Ru should release the mapping, and inform Rd that it has done so. 2441 4.1.4.2. NoReleaseOnChange 2443 Ru should maintain the mapping, so that it can use it again 2444 immediately if Rd later becomes Ru's L3 next hop for X. 2446 4.1.5. Upstream LSR: labelUse Procedure 2448 Suppose Ru is an LSR which has received a mapping of label L to 2449 address prefix X from LSR Rd, and that Ru is upstream of Rd with respect to X. 2452 Ru will make use of the mapping if Rd is Ru's L3 next hop for X. If, 2453 at the time the mapping is received by Ru, Rd is NOT Ru's L3 next hop 2454 for X, Ru does not make any use of the mapping at that time. Ru may 2455 however start using the mapping at some later time, if Rd becomes 2456 Ru's L3 next hop for X. 2458 The labelUse Procedure determines just how Ru makes use of Rd's 2459 mapping. 2461 There are three procedures which Ru may use: 2463 4.1.5.1. UseImmediate 2465 Ru may put the mapping into use immediately. At any time when Ru has 2466 a mapping for X from Rd, and Rd is Ru's L3 next hop for X, Rd will 2467 also be Ru's LSP next hop for X.
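The UseImmediate behavior described above can be sketched as follows. This is an illustrative sketch only (the class and method names are invented, and label retention follows NoReleaseOnChange):

```python
# Hypothetical sketch of the labelUse decision at the upstream LSR (Ru),
# UseImmediate variant: a received mapping becomes the LSP next hop
# exactly while the advertising peer is also the L3 next hop.

class UpstreamLSR:
    def __init__(self):
        self.l3_next_hop = {}    # prefix -> router id
        self.mappings = {}       # (peer, prefix) -> label
        self.lsp_next_hop = {}   # prefix -> (router id, outgoing label)

    def receive_mapping(self, peer, prefix, label):
        # Retain the mapping even if it is unusable now; it may become
        # usable later if the L3 next hop changes (NoReleaseOnChange).
        self.mappings[(peer, prefix)] = label
        self._update(prefix)

    def l3_route_change(self, prefix, next_hop):
        self.l3_next_hop[prefix] = next_hop
        self._update(prefix)

    def _update(self, prefix):
        nh = self.l3_next_hop.get(prefix)
        label = self.mappings.get((nh, prefix))
        if label is not None:
            self.lsp_next_hop[prefix] = (nh, label)  # UseImmediate
        else:
            self.lsp_next_hop.pop(prefix, None)      # forward unlabeled

ru = UpstreamLSR()
ru.receive_mapping("Rd", "203.0.113.0/24", 42)  # Rd not yet L3 next hop
ru.l3_route_change("203.0.113.0/24", "Rd")      # mapping now put into use
```

The loop-handling variants below differ only in when `_update` is allowed to install the LSP next hop.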
2469 4.1.5.2. UseIfLoopFree 2471 Ru will use the mapping only if it determines that by doing so, it 2472 will not cause a forwarding loop. 2474 If Ru has a mapping for X from Rd, and Rd is (or becomes) Ru's L3 2475 next hop for X, but Rd is NOT Ru's current LSP next hop for X, Ru 2476 does NOT immediately make Rd its LSP next hop. Rather, it initiates 2477 a loop prevention algorithm. If, upon the completion of this 2478 algorithm, Rd is still the L3 next hop for X, Ru will make Rd the LSP 2479 next hop for X, and use L as the outgoing label. 2481 The loop prevention algorithm to be used is still under 2482 consideration. 2484 4.1.5.3. UseIfLoopNotDetected 2486 This procedure is the same as UseImmediate, unless Ru has detected a 2487 loop in the LSP. If a loop has been detected, Ru will discard 2488 packets that would otherwise have been labeled with L and sent to Rd. 2490 This will continue until the next hop for X changes, or until the 2491 loop is no longer detected. 2493 4.1.6. Downstream LSR: Withdraw Procedure 2495 In this case, there is only a single procedure. 2497 When LSR Rd decides to break the mapping between label L and address 2498 prefix X, then this unmapping must be distributed to all LSRs to 2499 which the mapping was distributed. 2501 It is desirable, though not required, that the unmapping of L from X 2502 be distributed by Rd to an LSR Ru before Rd distributes to Ru any new 2503 mapping of L to any other address prefix Y, where X != Y. If Ru 2504 learns of the new mapping of L to Y before it learns of the unmapping 2505 of L from X, and if packets matching both X and Y are forwarded by Ru 2506 to Rd, then for a period of time, Ru will label both packets matching 2507 X and packets matching Y with label L. 2509 The distribution and withdrawal of label mappings are done via a label 2510 distribution protocol, or LDP. LDP is a two-party protocol.
If LSR R1 2511 has received label mappings from LSR R2 via an instance of an LDP, 2512 and that instance of that protocol is closed by either end (whether 2513 as a result of failure or as a matter of normal operation), then all 2514 mappings learned over that instance of the protocol must be 2515 considered to have been withdrawn. 2517 As long as the relevant LDP connection remains open, label mappings 2518 that are withdrawn must always be withdrawn explicitly. If a second 2519 label is bound to an address prefix, the result is not to implicitly 2520 withdraw the first label, but to map both labels; this is needed to 2521 support multi-path routing. If a second address prefix is bound to a 2522 label, the result is not to implicitly withdraw the mapping of that 2523 label to the first address prefix, but to use that label for both 2524 address prefixes. 2526 4.2. MPLS Schemes: Supported Combinations of Procedures 2528 Consider two LSRs, Ru and Rd, which are label distribution peers with 2529 respect to some set of address prefixes, where Ru is the upstream 2530 peer and Rd is the downstream peer. 2532 The MPLS scheme which governs the interaction of Ru and Rd can be 2533 described as a quintuple of procedures: <Distribution Procedure, 2534 Request Procedure, NotAvailable Procedure, Release Procedure, 2535 labelUse Procedure>. (Since there is only one Withdraw Procedure, it 2536 need not be mentioned.) A "*" appearing in one of the positions is a 2537 wild-card, meaning that any procedure in that category may be 2538 present; an "N/A" appearing in a particular position indicates that 2539 no procedure in that category is needed. 2541 Only the MPLS schemes which are specified below are supported by the 2542 MPLS Architecture. Other schemes may be added in the future, if a 2543 need for them is shown. 2545 4.2.1.
TTL-capable LSP Segments 2547 If Ru and Rd are MPLS peers, and both are capable of decrementing a 2548 TTL field in the MPLS header, then the MPLS scheme in use between Ru 2549 and Rd must be one of the following: 2551 2554 2556 The former, roughly speaking, is "local control with downstream label 2557 assignment". The latter is an egress control scheme. 2559 4.2.2. Using ATM Switches as LSRs 2561 The procedures for using ATM switches as LSRs depend on whether the 2562 ATM switches can realize LSP trees as multipoint-to-point VCs or VPs. 2564 Most ATM switches existing today do NOT have a multipoint-to-point 2565 VC-switching capability. Their cross-connect tables could easily be 2566 programmed to move cells from multiple incoming VCs to a single 2567 outgoing VC, but the result would be that cells from different 2568 packets get interleaved. 2570 Some ATM switches do support a multipoint-to-point VC-switching 2571 capability. These switches will queue up all the incoming cells from 2572 an incoming VC until a packet boundary is reached. Then they will 2573 transmit the entire sequence of cells on the outgoing VC, without 2574 allowing cells from any other packet to be interleaved. 2576 Many ATM switches do support a multipoint-to-point VP-switching 2577 capability, which can be used if the Multipoint SVP label encoding is 2578 used. 2580 4.2.2.1. Without Multipoint-to-point Capability 2582 Suppose that R1, R2, R3, and R4 are ATM switches which do not support 2583 multipoint-to-point capability, but are being used as LSRs. Suppose 2584 further that the L3 hop-by-hop path for address prefix X is <R1, R2, 2585 R3, R4>, and that packets destined for X can enter the network at any 2586 of these LSRs. Since there is no multipoint-to-point capability, the 2587 LSPs must be realized as point-to-point VCs, which means that there 2588 need to be three such VCs for address prefix X: <R1, R2, R3, R4>, 2589 <R2, R3, R4>, and <R3, R4>.
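The per-link VC count in the example above can be computed directly: with no VC-merge capability, every ingress LSR needs its own point-to-point VC to the egress, so each link carries one VC (and hence one label) per upstream ingress point. A minimal sketch (the function name and representation are invented for illustration):

```python
# Hypothetical sketch: count point-to-point VCs per link of an LSR chain
# when multipoint-to-point (VC-merge) capability is absent.

def vcs_per_link(chain, ingresses):
    """Return {(upstream, downstream): VC count} for a chain of LSRs,
    where a separate point-to-point VC starts at every ingress LSR."""
    counts = {}
    for i in range(len(chain) - 1):
        link = (chain[i], chain[i + 1])
        # every ingress at or before chain[i] has a VC crossing this link
        counts[link] = sum(1 for r in ingresses if chain.index(r) <= i)
    return counts

# Path <R1, R2, R3, R4>; packets for X may enter at R1, R2, or R3:
counts = vcs_per_link(["R1", "R2", "R3", "R4"], ["R1", "R2", "R3"])
```

This reproduces the label counts discussed for the RequestOnRequest procedure: three labels on the R3-R4 link, two on R2-R3, and one on R1-R2.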
2591 Therefore, if R1 and R2 are MPLS peers, and either is an LSR which is 2592 implemented using conventional ATM switching hardware (i.e., no cell 2593 interleave suppression), the MPLS scheme in use between R1 and R2 2594 must be one of the following: 2596 2599 2602 The use of the RequestOnRequest procedure will cause R4 to distribute 2603 three labels for X to R3; R3 will distribute two labels for X to R2, 2604 and R2 will distribute one label for X to R1. 2606 The first of these procedures is the "optimistic downstream-on- 2607 demand" variant of local control. The second is the "conservative 2608 downstream-on-demand" variant of local control. 2610 An egress control scheme which works in the absence of multipoint- 2611 to-point capability is for further study. 2613 4.2.2.2. With Multipoint-To-Point Capability 2615 If R1 and R2 are MPLS peers, and either of them is an LSR which is 2616 implemented using ATM switching hardware with cell interleave 2617 suppression, and neither is an LSR which is implemented using ATM 2618 switching hardware that does not have cell interleave suppression, 2619 then the MPLS scheme in use between R1 and R2 must be one of the 2620 following: 2622 2624 2626 2629 The first of these is an egress control scheme. The second is the 2630 "downstream" variant of local control. The third is the 2631 "conservative downstream-on-demand" variant of local control. 2633 4.2.3. Interoperability Considerations 2635 It is easy to see that certain quintuples do NOT yield viable MPLS 2636 schemes. For example: 2638 - <PulledUnconditional, RequestNever, *, *, *> 2639 - <PulledConditional, RequestNever, *, *, *> 2641 In these MPLS schemes, the downstream LSR Rd distributes label 2642 mappings to upstream LSR Ru only upon request from Ru, but Ru 2643 never makes any such requests. Obviously, these schemes are not 2644 viable, since they will not result in the proper distribution of 2645 label mappings.
2647 - <*, RequestNever, *, *, ReleaseOnChange> 2649 In these MPLS schemes, Ru releases mappings when it isn't using 2650 them, but it never asks for them again, even if it later has a 2651 need for them. These schemes thus do not ensure that label 2652 mappings get properly distributed. 2654 In this section, we specify rules to prevent a pair of LDP peers from 2655 adopting procedures which lead to infeasible MPLS Schemes. These 2656 rules require the exchange of information between LDP peers during 2657 the initialization of the LDP connection between them. 2659 1. Each must state whether it is an ATM switch, and if so, whether 2660 it has cell interleave suppression. 2662 2. If Rd is an ATM switch without cell interleave suppression, it 2663 must state whether it intends to use the PulledUnconditional 2664 procedure or the PulledConditional procedure. If the former, 2665 Ru MUST use the RequestRetry procedure; if the latter, Ru MUST 2666 use the RequestNoRetry procedure. 2668 3. If Ru is an ATM switch without cell interleave suppression, it 2669 must state whether it intends to use the RequestRetry or the 2670 RequestNoRetry procedure. If Rd is an ATM switch without cell 2671 interleave suppression, Rd is not bound by this, and in fact Ru 2672 MUST adopt Rd's preferences. However, if Rd is NOT an ATM 2673 switch without cell interleave suppression, then if Ru chooses 2674 RequestRetry, Rd must use PulledUnconditional, and if Ru 2675 chooses RequestNoRetry, Rd MUST use PulledConditional. 2677 4. If Rd is an ATM switch with cell interleave suppression, it 2678 must specify whether it prefers to use PushConditional, 2679 PushUnconditional, or PulledConditional. If Ru is not an ATM 2680 switch without cell interleave suppression, it must then use 2681 RequestWhenNeeded and RequestNoRetry, or else RequestNever and 2682 NoReleaseOnChange, respectively. 2684 5.
If Ru is an ATM switch with cell interleave suppression, it 2685 must specify whether it prefers to use RequestWhenNeeded and 2686 RequestNoRetry, or else RequestNever and NoReleaseOnChange. If 2687 Rd is NOT an ATM switch with cell interleave suppression, it 2688 must then use either PushConditional or PushUnconditional, 2689 respectively. 2691 4.2.4. How to do Loop Prevention 2693 TBD 2695 4.2.5. How to do Loop Detection 2697 TBD. 2699 4.2.6. Security Considerations 2701 Security considerations are not discussed in this version of this 2702 draft. 2704 5. Authors' Addresses 2706 Eric C. Rosen 2707 Cisco Systems, Inc. 2708 250 Apollo Drive 2709 Chelmsford, MA, 01824 2710 E-mail: erosen@cisco.com 2712 Arun Viswanathan 2713 Lucent Technologies 2714 101 Crawford Corner Rd., #4D-537 2715 Holmdel, NJ 07733 2716 732-332-5163 2717 E-mail: arunv@dnrc.bell-labs.com 2719 Ross Callon 2720 IronBridge Networks 2721 55 Hayden Avenue, 2722 Lexington, MA 02173 2723 +1-781-402-8017 2724 E-mail: rcallon@ironbridgenetworks.com 2726 6. References 2728 [1] "A Framework for Multiprotocol Label Switching", R.Callon, 2729 P.Doolan, N.Feldman, A.Fredette, G.Swallow, and A.Viswanathan, work 2730 in progress, Internet Draft , 2731 November 1997. 2733 [2] "ARIS: Aggregate Route-Based IP Switching", A. Viswanathan, N. 2734 Feldman, R. Boivie, R. Woundy, work in progress, Internet Draft 2735 , March 1997. 2737 [3] "ARIS Specification", N. Feldman, A. Viswanathan, work in 2738 progress, Internet Draft , March 2739 1997. 2741 [4] "Tag Switching Architecture - Overview", Rekhter, Davie, Katz, 2742 Rosen, Swallow, Farinacci, work in progress, Internet Draft , January, 1997. 2745 [5] "Tag Distribution Protocol", Doolan, Davie, Katz, Rekhter, Rosen, 2746 work in progress, Internet Draft , May, 2747 1997. 2749 [6] "Use of Tag Switching with ATM", Davie, Doolan, Lawrence, 2750 McCloghrie, Rekhter, Rosen, Swallow, work in progress, Internet Draft 2751 , January, 1997.
2753 [7] "Label Switching: Label Stack Encodings", Rosen, Rekhter, Tappan, 2754 Farinacci, Fedorkow, Li, Conta, work in progress, Internet Draft 2755 , February, 1998. 2757 [8] "Partitioning Tag Space among Multicast Routers on a Common 2758 Subnet", Farinacci, work in progress, Internet Draft , December, 1996. 2761 [9] "Multicast Tag Binding and Distribution using PIM", Farinacci, 2762 Rekhter, work in progress, Internet Draft , December, 1996. 2765 [10] "Toshiba's Router Architecture Extensions for ATM: Overview", 2766 Katsube, Nagami, Esaki, RFC 2098, February, 1997. 2768 [11] "Loop-Free Routing Using Diffusing Computations", J.J. Garcia- 2769 Luna-Aceves, IEEE/ACM Transactions on Networking, Vol. 1, No. 1, 2770 February 1993.