MPLS                                                  C. Villamizar, Ed.
Internet-Draft                                                     OCCNC
Intended status: Informational                               K. Kompella
Expires: August 16, 2014                                Juniper Networks
                                                               S. Amante
                                                              Apple Inc.
                                                                A. Malis
                                                                  Huawei
                                                            C. Pignataro
                                                                   Cisco
                                                       February 12, 2014

        MPLS Forwarding Compliance and Performance Requirements
                     draft-ietf-mpls-forwarding-07

Abstract

   This document provides guidelines for implementers regarding MPLS
   forwarding and a basis for evaluations of forwarding implementations.
   Guidelines cover many aspects of MPLS forwarding.  Topics are
   highlighted where implementers might otherwise overlook practical
   requirements which are unstated or underemphasized, or which are
   optional for conformance to RFCs but are often considered mandatory
   by providers.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on August 16, 2014.

Copyright Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction and Document Scope
     1.1.  Abbreviations
     1.2.  Use of Requirements Language
     1.3.  Apparent Misconceptions
     1.4.  Target Audience
   2.  Forwarding Issues
     2.1.  Forwarding Basics
       2.1.1.  MPLS Special Purpose Labels
       2.1.2.  MPLS Differentiated Services
       2.1.3.  Time Synchronization
       2.1.4.  Uses of Multiple Label Stack Entries
       2.1.5.  MPLS Link Bundling
       2.1.6.  MPLS Hierarchy
       2.1.7.  MPLS Fast Reroute (FRR)
       2.1.8.  Pseudowire Encapsulation
         2.1.8.1.  Pseudowire Sequence Number
       2.1.9.  Layer-2 and Layer-3 VPN
     2.2.  MPLS Multicast
     2.3.  Packet Rates
     2.4.  MPLS Multipath Techniques
       2.4.1.  Pseudowire Control Word
       2.4.2.  Large Microflows
       2.4.3.  Pseudowire Flow Label
       2.4.4.  MPLS Entropy Label
       2.4.5.  Fields Used for Multipath Load Balance
         2.4.5.1.  MPLS Fields in Multipath
         2.4.5.2.  IP Fields in Multipath
         2.4.5.3.  Fields Used in Flow Label
         2.4.5.4.  Fields Used in Entropy Label
     2.5.  MPLS-TP and UHP
     2.6.  Local Delivery of Packets
       2.6.1.  DoS Protection
       2.6.2.  MPLS OAM
       2.6.3.  Pseudowire OAM
       2.6.4.  MPLS-TP OAM
       2.6.5.  MPLS OAM and Layer-2 OAM Interworking
       2.6.6.  Extent of OAM Support by Hardware
     2.7.  Number and Size of Flows
   3.  Questions for Suppliers
     3.1.  Basic Compliance
     3.2.  Basic Performance
     3.3.  Multipath Capabilities and Performance
     3.4.  Pseudowire Capabilities and Performance
     3.5.  Entropy Label Support and Performance
     3.6.  DoS Protection
     3.7.  OAM Capabilities and Performance
   4.  Forwarding Compliance and Performance Testing
     4.1.  Basic Compliance
     4.2.  Basic Performance
     4.3.  Multipath Capabilities and Performance
     4.4.  Pseudowire Capabilities and Performance
     4.5.  Entropy Label Support and Performance
     4.6.  DoS Protection
     4.7.  OAM Capabilities and Performance
   5.  Acknowledgements
   6.  IANA Considerations
   7.  Security Considerations
   8.  References
     8.1.  Normative References
     8.2.  Informative References
   Appendix A.  Organization of References Section
   Authors' Addresses

1.  Introduction and Document Scope

   The initial purpose of this document was to address concerns raised
   on the MPLS WG mailing list about shortcomings in implementations of
   MPLS forwarding.  Documenting existing misconceptions and potential
   pitfalls might help avoid repeating past mistakes.
   The document has grown to address a broad set of forwarding
   requirements.

   The focus of this document is MPLS forwarding, base pseudowire
   forwarding, and MPLS Operations, Administration, and Maintenance
   (OAM).  The use of the pseudowire control word and sequence number
   is discussed.  Specific pseudowire Attachment Circuit (AC) and Native
   Service Processing (NSP) are out of scope.  Specific pseudowire
   applications, such as various forms of Virtual Private Network (VPN),
   are out of scope.

   MPLS support for multipath techniques is considered essential by many
   service providers and is useful for other high capacity networks.  In
   order to obtain sufficient entropy from MPLS traffic, service
   providers and others find it essential for the MPLS implementation to
   interpret the MPLS payload as IPv4 or IPv6 based on the contents of
   the first nibble of payload.  The use of IP addresses, the IP
   protocol field, and UDP and TCP port number fields in multipath load
   balancing is considered within scope.  The use of any other IP
   protocol fields, such as those of tunneling protocols carried within
   IP, is out of scope.

   Implementation details are a local matter and are out of scope.  Most
   interfaces today operate at 1 Gb/s or greater.  It is assumed that
   all forwarding operations are implemented in specialized forwarding
   hardware rather than on a general purpose processor.  These are often
   referred to as "fast path" and "slow path" processing, respectively.
   Some recommendations are made regarding implementing control or
   management plane functionality in specialized hardware or with
   limited assistance from specialized hardware.  This advice is based
   on expected control or management protocol loads and on the need for
   denial of service (DoS) protection.

1.1.  Abbreviations

   The following abbreviations are used.

   AC      Attachment Circuit ([RFC3985])

   ACH     Associated Channel Header (pseudowires)

   ACK     Acknowledgement (TCP flag and type of TCP packet)

   AIS     Alarm Indication Signal (MPLS-TP OAM)

   ATM     Asynchronous Transfer Mode (legacy switched circuits)

   BFD     Bidirectional Forwarding Detection

   BGP     Border Gateway Protocol

   CC-CV   Connectivity Check and Connectivity Verification

   CE      Customer Edge (LDP, RSVP-TE, other protocols)

   CPU     Central Processing Unit (computer or microprocessor)

   CT      Class Type ([RFC4124])

   CW      Control Word ([RFC4385])

   DCCP    Datagram Congestion Control Protocol

   DDoS    Distributed Denial of Service

   DM      Delay Measurement (MPLS-TP OAM)

   DSCP    Differentiated Services Code Point ([RFC2474])

   DWDM    Dense Wavelength Division Multiplexing

   DoS     Denial of Service

   E-LSP   EXP-Inferred-PSC LSP ([RFC3270])

   EBGP    External BGP

   ECMP    Equal Cost Multi-Path

   ECN     Explicit Congestion Notification ([RFC3168] and [RFC5129])

   EL      Entropy Label ([RFC6790])

   ELI     Entropy Label Indicator ([RFC6790])

   EXP     Experimental (field in MPLS renamed to TC in [RFC5462])

   FEC     Forwarding Equivalence Class (LDP); also Forward Error
           Correction in other contexts

   FR      Frame Relay (legacy switched circuits)

   FRR     Fast Reroute ([RFC4090])

   G-ACh   Generic Associated Channel ([RFC5586])

   GAL     Generic Associated Channel Label ([RFC5586])

   GFP     Generic Framing Procedure (used in OTN)

   GMPLS   Generalized MPLS ([RFC3471])

   GTSM    Generalized TTL Security Mechanism ([RFC5082])

   Gb/s    Gigabits per second (billion bits per second)

   IANA    Internet Assigned Numbers Authority

   ILM     Incoming Label Map ([RFC3031])

   IP      Internet Protocol

   IPVPN   Internet Protocol VPN

   IPv4    Internet Protocol version 4

   IPv6    Internet Protocol version 6

   L-LSP   Label-Only-Inferred-PSC LSP ([RFC3270])

   L2VPN   Layer 2 VPN

   LDP     Label Distribution Protocol ([RFC5036])

   LER     Label Edge Router ([RFC3031])

   LM      Loss Measurement (MPLS-TP OAM)

   LSP     Label Switched Path ([RFC3031])

   LSR     Label Switching Router ([RFC3031])

   MP2MP   Multipoint to Multipoint

   MPLS    MultiProtocol Label Switching ([RFC3031])

   MPLS-TP MPLS Transport Profile ([RFC5317])

   Mb/s    Megabits per second (million bits per second)

   NSP     Native Service Processing ([RFC3985])

   NTP     Network Time Protocol

   OAM     Operations, Administration, and Maintenance ([RFC6291])

   OOB     Out-of-band (not carried within a data channel)

   OTN     Optical Transport Network

   P       Provider router (LDP, RSVP-TE, other protocols)

   P2MP    Point to Multipoint

   PE      Provider Edge router (LDP, RSVP-TE, other protocols)

   PHB     Per-Hop-Behavior ([RFC2475])

   PHP     Penultimate Hop Popping ([RFC3443])

   POS     Packet over SONET

   PSC     This abbreviation has multiple interpretations.

           1.  Packet Switch Capable ([RFC3471])

           2.  PHB Scheduling Class ([RFC3270])

           3.  Protection State Coordination ([RFC6378])

   PTP     Precision Time Protocol

   PW      Pseudowire

   QoS     Quality of Service

   RA      Router Alert ([RFC3032])

   RDI     Remote Defect Indication (MPLS-TP OAM)

   RSVP-TE RSVP Traffic Engineering

   RTP     Real-Time Transport Protocol

   SCTP    Stream Control Transmission Protocol

   SDH     Synchronous Digital Hierarchy (European SONET, a form of TDM)

   SONET   Synchronous Optical Network (US SDH, a form of TDM)

   T-LDP   Targeted LDP (LDP sessions over more than one hop)

   TC      Traffic Class ([RFC5462])

   TCP     Transmission Control Protocol

   TDM     Time-Division Multiplexing (legacy encapsulations)

   TOS     Type of Service (see [RFC2474])

   TTL     Time-to-live (a field in IP and MPLS headers)

   UDP     User Datagram Protocol

   UHP     Ultimate Hop Popping (opposite of PHP)

   VCCV    Virtual Circuit Connectivity Verification ([RFC5085])

   VLAN    Virtual Local Area Network (Ethernet)

   VOQ     Virtual Output Queuing (switch fabric design)

   VPN     Virtual Private Network

   WG      Working Group

1.2.  Use of Requirements Language

   This document is informational.
   The upper case [RFC2119] key words are not used in this document,
   except in the following cases.

   1.  RFC 2119 keywords are used where requirements stated in this
       document are called for in referenced RFCs.  In most cases the
       RFC containing the requirement is cited within the statement
       using an RFC 2119 keyword.

   2.  RFC 2119 keywords are used where it is explicitly noted that
       operator experience indicates a requirement for which no RFC
       requirement exists.

   Advice provided by this document may be ignored by implementations.
   Similarly, implementations not claiming conformance to specific RFCs
   may ignore the requirements of those RFCs.  In both cases,
   implementers should consider the risk of doing so.

1.3.  Apparent Misconceptions

   In early generations of forwarding silicon (which might now be behind
   us), there apparently were some misconceptions about MPLS.  The
   following statements provide clarifications.

   1.  There are practical reasons to have more than one or two labels
       in an MPLS label stack.  Under some circumstances the label stack
       can become quite deep.  See Section 2.1.

   2.  The label stack MUST be considered to be arbitrarily deep.
       Section 3.27.4, "Hierarchy: LSP Tunnels within LSPs", of
       [RFC3031] states, "The label stack mechanism allows LSP tunneling
       to nest to any depth."  If the bottom of the label stack cannot
       be found, but a sufficient number of labels exists to forward,
       an LSR MUST forward the packet.  An LSR MUST NOT assume the
       packet is malformed unless the end of the packet is found before
       the bottom of the stack.  See Section 2.1.

   3.  In networks where deep label stacks are encountered, they are not
       rare.  Full packet rate performance is required regardless of
       label stack depth, except where multiple pop operations are
       required.  See Section 2.1.

   4.  Research has shown that long bursts of short packets, with 40
       byte or 44 byte IP payload sizes, are quite common.  This is due
       to TCP ACK compression [ACK-compression].  The following two
       sub-bullets constitute advice that reflects very common
       non-negotiable requirements of providers.  Implementers may
       ignore this advice but should consider the risk of doing so.

       a.  A forwarding engine SHOULD, if practical, be able to sustain
           an arbitrarily long sequence of small packets arriving at
           full interface rate.

       b.  If indefinite full packet rate for small packets is not
           practical, a forwarding engine MUST be able to buffer a long
           sequence of small packets inbound to the on-chip decision
           engine and sustain full interface rate for some reasonable
           average packet rate.  Absent this small on-chip buffering,
           QoS agnostic packet drops can occur.

       See Section 2.3.

   5.  Implementations and system designs MUST support the pseudowire
       control word (CW) if MPLS-TP is supported or if ACH [RFC5586] is
       being used on a pseudowire.  Implementations and system designs
       SHOULD support the pseudowire CW even if MPLS-TP and ACH
       [RFC5586] are not used, using instead CW and VCCV Type 1
       [RFC5085] to allow the use of multipath in the underlying network
       topology without impacting the PW traffic.  [RFC7079] does note
       that there are still some deployments where the CW is not always
       used.  It also notes that many service providers do enable the
       CW.  See Section 2.4.1 for more discussion on why deployments
       SHOULD enable the pseudowire CW.

   The following statements provide clarification regarding more recent
   requirements that are often missed.

   1.  The implementer and system designer SHOULD support adding a
       pseudowire Flow Label [RFC6391].  Deployments MAY enable this
       feature for appropriate pseudowire types.  See Section 2.4.3.

   2.  The implementer and system designer SHOULD support adding an MPLS
       entropy label [RFC6790].  Deployments MAY enable this feature.
       See Section 2.4.4.

1.4.  Target Audience

   This document is intended for multiple audiences: the implementer
   (implementing MPLS forwarding in silicon or in software); the systems
   designer (putting together an MPLS forwarding system); and the
   deployer (running an MPLS network).  These guidelines are intended to
   serve the following purposes:

   1.  Explain what to do and what not to do when a deep label stack is
       encountered.  (audience: implementer)

   2.  Highlight pitfalls to look for when implementing an MPLS
       forwarding chip.  (audience: implementer)

   3.  Provide a checklist of features and performance specifications to
       request.  (audience: systems designer, deployer)

   4.  Provide a set of tests to perform.  (audience: systems designer,
       deployer)

   The implementer, systems designer, and deployer have a transitive
   supplier-customer relationship.  It is in the best interest of the
   supplier to review their product against their customer's checklist
   and, if applicable, their secondary customer's checklist.

   This document identifies and explains many details and potential
   pitfalls of MPLS forwarding.  It is likely that the identified set of
   potential pitfalls will later prove to be incomplete.

2.  Forwarding Issues

   A brief review of forwarding issues is provided in the subsections
   that follow.  This section provides some background on why some of
   these requirements exist.  Questions to ask of suppliers are covered
   in Section 3.  Some guidelines for testing are provided in
   Section 4.

2.1.  Forwarding Basics

   Basic MPLS architecture and MPLS encapsulation, and therefore packet
   forwarding, are defined in [RFC3031] and [RFC3032].  RFC3031 and
   RFC3032 are somewhat LDP centric.
   RSVP-TE supports traffic engineering (TE) and fast reroute, features
   that LDP lacks.  The base document for RSVP-TE based MPLS is
   [RFC3209].

   A few RFCs update RFC3032.  Those with impact on forwarding include
   the following.

   1.  TTL processing is clarified in [RFC3443].

   2.  The use of MPLS Explicit NULL is modified in [RFC4182].

   3.  Differentiated Services is supported by [RFC3270] and [RFC4124].
       The "EXP" field is renamed to "Traffic Class" in [RFC5462],
       removing any misconception that it was available for
       experimentation or could be ignored.

   4.  ECN is supported by [RFC5129].

   5.  The MPLS G-ACh and GAL are defined in [RFC5586].

   6.  [RFC5332] redefines the two data link layer codepoints for MPLS
       packets.

   Tunneling encapsulations carrying MPLS, such as MPLS in IP [RFC4023],
   MPLS in GRE [RFC4023], MPLS in L2TPv3 [RFC4817], or MPLS in UDP
   [I-D.ietf-mpls-in-udp], are out of scope.

   Other RFCs have implications for MPLS forwarding but do not update
   RFC3032 or RFC3209, including:

   1.  The pseudowire (PW) Associated Channel Header (ACH), defined by
       [RFC5085] and later generalized by the MPLS G-ACh [RFC5586].

   2.  The entropy label indicator (ELI) and entropy label (EL), defined
       by [RFC6790].

   A few RFCs update RFC3209.  Those that are listed as updating RFC3209
   generally impact only RSVP-TE signaling.  Forwarding is modified by
   major extensions built upon RFC3209.

   RFCs which impact forwarding are discussed in the following
   subsections.

2.1.1.  MPLS Special Purpose Labels

   [RFC3032] specifies that label values 0-15 are special purpose labels
   with special meanings.  [I-D.ietf-mpls-special-purpose-labels]
   renamed these from the term "reserved labels" used in [RFC3032] to
   "special purpose labels".  Three values of NULL label are defined
   (two of which are later updated by [RFC4182]) and a router-alert
   label is defined.
   The original intent was that special purpose labels, except the NULL
   labels, could be sent to the routing engine CPU rather than be
   processed in forwarding hardware.  Hardware support is required by
   new RFCs such as those defining the entropy label and OAM processed
   as a result of receiving a GAL.  For new special purpose labels, some
   accommodation is needed for LSRs that will send the labels to a
   general purpose CPU or other highly programmable hardware.  For
   example, ELI will only be sent to LSRs which have signaled support
   for [RFC6790], and a high OAM packet rate must be negotiated among
   endpoints.

   [RFC3429] reserves a label for ITU-T Y.1711; however, Y.1711 does not
   work with multipath and its use is strongly discouraged.

   The current list of special purpose labels can be found in the
   "Multiprotocol Label Switching Architecture (MPLS) Label Values"
   registry, reachable on IANA's pages at [1].

   [I-D.ietf-mpls-special-purpose-labels] introduces an IANA "Extended
   Special Purpose MPLS Label Values" registry and makes use of the
   "extension" label, label 15, to indicate that the next label is an
   extended special purpose label and requires special handling.  The
   range of only 16 values for special purpose labels allows a table to
   be used.  The range of extended special purpose labels, with 20 bits
   available for use, may have to be handled in some other way in the
   unlikely event that the range of currently reserved values,
   256-1048575, is used in the future.  If only the standards action
   range, 16-239, and the experimental range, 240-255, are used, then a
   table of 256 entries can be used.

   Unknown special purpose labels and unknown extended special purpose
   labels are handled the same.
   When an unknown special purpose label is encountered, or a special
   purpose label not directly handled in forwarding hardware is
   encountered, the packet should be sent to a general purpose CPU by
   default.  If this capability is supported, there must be an option to
   either drop or rate limit such packets on a per special purpose label
   value basis.

2.1.2.  MPLS Differentiated Services

   [RFC2474] deprecates the IP Type of Service (TOS) and IP Precedence
   (Prec) fields and replaces them with the Differentiated Services
   Field, more commonly known as the Differentiated Services Code Point
   (DSCP) field.  [RFC2475] defines the Differentiated Services
   architecture, which, in other forums, is often called a Quality of
   Service (QoS) architecture.

   MPLS uses the Traffic Class (TC) field to support Differentiated
   Services [RFC5462].  There are two primary documents describing how
   DSCP is mapped into TC.

   1.  [RFC3270] defines E-LSP and L-LSP.  E-LSP uses a static mapping
       of DSCP into TC.  L-LSP uses a per LSP mapping of DSCP into TC,
       with one PHB Scheduling Class (PSC) per L-LSP.  Each PSC can use
       multiple Per-Hop Behavior (PHB) values.  For example, the Assured
       Forwarding service defines four PSC, each with three PHB
       [RFC2597].

   2.  [RFC4124] defines assignment of a class-type (CT) to an LSP,
       where a per CT static mapping of TC to PHB is used.  [RFC4124]
       provides a means to support up to eight E-LSP-like mappings of
       DSCP to TC.

   To meet Differentiated Services requirements specified in [RFC3270],
   the following forwarding requirements must be met.  An ingress LER
   MUST be able to select an LSP and then apply a per LSP map of DSCP
   into TC.  A midpoint LSR MUST be able to apply a per LSP map of TC to
   PHB.  The number of mappings supported will be far less than the
   number of LSP supported.
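
   The [RFC3270] forwarding requirements above can be illustrated with
   a minimal sketch.  It is not taken from any implementation.  The
   table contents, the LSP names, and the identifiers
   (DSCP_TO_TC_MAPS, TC_TO_PHB_MAPS, LSP_TO_MAP_ID, ingress_tc,
   midpoint_phb) are all hypothetical.  The sketch assumes that many
   LSP share a small number of maps, which is why each LSP holds only a
   map index rather than its own table.

```python
# Hypothetical example tables; real mappings are a matter of provider
# policy. Only a few distinct maps exist, and each LSP refers to one.
DSCP_TO_TC_MAPS = {
    0: {0: 0, 10: 1, 12: 2, 46: 5},  # map 0: DSCP -> TC
    1: {0: 0, 46: 7},                # map 1: DSCP -> TC
}
TC_TO_PHB_MAPS = {
    0: {0: "BE", 1: "AF11", 2: "AF12", 5: "EF"},  # map 0: TC -> PHB
    1: {0: "BE", 7: "EF"},                        # map 1: TC -> PHB
}
# Assumption: far fewer maps than LSP, so an LSP stores a map index.
LSP_TO_MAP_ID = {"lsp-a": 0, "lsp-b": 0, "lsp-c": 1}

DEFAULT_TC = 0       # hypothetical default for unmapped DSCP values
DEFAULT_PHB = "BE"   # hypothetical default for unmapped TC values


def ingress_tc(lsp: str, dscp: int) -> int:
    """Ingress LER: after LSP selection, apply the per LSP DSCP-to-TC map."""
    dscp_to_tc = DSCP_TO_TC_MAPS[LSP_TO_MAP_ID[lsp]]
    tc = dscp_to_tc.get(dscp, DEFAULT_TC)
    if not 0 <= tc <= 7:  # TC is a 3-bit field
        raise ValueError("TC out of range")
    return tc


def midpoint_phb(lsp: str, tc: int) -> str:
    """Midpoint LSR: apply the per LSP TC-to-PHB map."""
    tc_to_phb = TC_TO_PHB_MAPS[LSP_TO_MAP_ID[lsp]]
    return tc_to_phb.get(tc, DEFAULT_PHB)
```

   In this sketch the same DSCP value (46) yields TC 5 on "lsp-a" and
   TC 7 on "lsp-c", and each midpoint map resolves its own TC value
   back to the same PHB.  Hardware would typically hold such maps in
   small indexed tables; the dictionaries here only model the lookup
   order.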

   To meet Differentiated Services requirements specified in [RFC4124],
   the following forwarding requirements must be met.  An ingress LER
   MUST be able to select an LSP and then apply a per LSP map of DSCP
   into TC.  A midpoint LSR MUST be able to apply a per LSP to CT map
   and then use the Class Type (CT) to map TC to PHB.  Since there are
   only eight allowed values of CT, only eight maps of TC to PHB need to
   be supported.  The LSP label can be used directly to find the TC to
   PHB mapping, as is needed to support [RFC3270] L-LSP.

   While support for [RFC4124] and not [RFC3270] would allow support for
   only eight mappings of TC to PHB, it is common to support both and
   simply state a limit on the number of unique TC to PHB mappings which
   can be supported.

2.1.3.  Time Synchronization

   PTP or NTP may be carried over MPLS [I-D.ietf-tictoc-1588overmpls].
   Generally NTP will be carried within IP with IP carried in MPLS
   [RFC5905].  Both PTP and NTP benefit from accurate time stamping of
   incoming packets and the ability to insert accurate time stamps in
   outgoing packets.  PTP correction, which occurs when forwarding,
   requires updating a timestamp compensation field based on the
   difference between packet arrival at an LSR and packet transmit time
   at that same LSR.

   Since the label stack depth may vary, hardware should allow a
   timestamp to be placed in an outgoing packet at any specified byte
   position.  It may be necessary to modify layer-2 checksums or frame
   check sequences after insertion.  PTP and NTP timestamp formats
   differ slightly.  If NTP or PTP is carried over UDP/IP or
   UDP/IP/MPLS, the UDP checksum will also have to be updated.

   Accurate time synchronization, in addition to being generally useful,
   is required for MPLS-TP delay measurement (DM) OAM.  See
   Section 2.6.4.

2.1.4.  Uses of Multiple Label Stack Entries

   MPLS deployments in the early part of the prior decade (circa 2000)
   tended to support either LDP or RSVP-TE.  LDP was favored by some for
   its ability to scale to a very large number of PE devices at the edge
   of the network, without adding deployment complexity.  RSVP-TE was
   favored, generally in the network core, where traffic engineering
   and/or fast reroute were considered important.

   Both LDP and RSVP-TE are used simultaneously within major Service
   Provider networks using a technique known as "LDP over RSVP-TE
   Tunneling".  This technique allows service providers to carry LDP
   tunnels inside RSVP-TE tunnels.  This makes it possible to take
   advantage of Traffic Engineering and Fast Reroute on more expensive
   Inter-City and Inter-Continental transport paths.  The ingress
   RSVP-TE PE places many LDP tunnels on a single RSVP-TE LSP and
   carries it to the egress RSVP-TE PE.  The LDP PEs are situated
   further from the core, for example within a metro network.  LDP over
   RSVP-TE tunneling requires a minimum of two MPLS labels: one each for
   LDP and RSVP-TE.

   The use of MPLS FRR [RFC4090] might add one more label to MPLS
   traffic, but only when FRR protection is in use (active).  If LDP
   over RSVP-TE is in use, and FRR protection is in use, then at least
   three MPLS labels are present on the label stack on the links that
   the Bypass LSP traverses.  FRR is covered in Section 2.1.7.

   LDP L2VPN, LDP IPVPN, BGP L2VPN, and BGP IPVPN added support for VPN
   services that are deployed by the vast majority of service providers.
   These VPN services added yet another label, bringing the label stack
   depth (when FRR is active) to four.

   Pseudowires and VPN are discussed in further detail in Section 2.1.8
   and Section 2.1.9.

   MPLS hierarchy as described in [RFC4206] and updated by [RFC7074] can
   in principle add at least one additional label.
   MPLS hierarchy is discussed in Section 2.1.6.

   Other features such as Entropy Label (discussed in Section 2.4.4) and
   Flow Label (discussed in Section 2.4.3) can add additional labels to
   the label stack.

   Although theoretical scenarios can easily result in eight or more
   labels, such cases are rare if they occur at all today.  For the
   purpose of forwarding, only the top label needs to be examined if PHP
   is used, and a few more if UHP is used (see Section 2.5).  For deep
   label stacks, quite a few labels may have to be examined for the
   purpose of load balancing across parallel links (see Section 2.4);
   however, this depth can be bounded by a provider through use of
   Entropy Label.

2.1.5.  MPLS Link Bundling

   MPLS Link Bundling was the first RFC to address the need for multiple
   parallel links between nodes [RFC4201].  MPLS Link Bundling is
   notable in that it tried not to change MPLS forwarding, except in
   specifying the "All-Ones" component link.  MPLS Link Bundling is
   seldom if ever deployed.  Instead, multipath techniques described in
   Section 2.4 are used.

2.1.6.  MPLS Hierarchy

   MPLS hierarchy is defined in [RFC4206] and updated by [RFC7074].
   Although RFC4206 is considered part of GMPLS, the Packet Switching
   Capable (PSC) portion of the MPLS hierarchy is applicable to MPLS
   and may be supported in an otherwise GMPLS-free implementation.  The
   MPLS PSC hierarchy remains the most likely means of providing further
   scaling in an RSVP-TE MPLS network, particularly where the network is
   designed to provide RSVP-TE connectivity to the edges.  This is the
   case for envisioned MPLS-TP networks.  The use of the MPLS PSC
   hierarchy can add at least one additional label to a label stack,
   though it is likely that only one layer of PSC will be used in the
   near future.

2.1.7.  MPLS Fast Reroute (FRR)

   Fast reroute is defined by [RFC4090].
Two significantly different methods are defined in RFC4090, the "One-to-One Backup" method which uses the "Detour LSP" and the "Facility Backup" which uses a "bypass tunnel". These are commonly referred to as the detour and bypass methods respectively.

The detour method makes use of a presignaled LSP. Hardware assistance is needed for detour FRR only if necessary to accomplish local repair of a large number of LSP within the 10s of milliseconds target. For each affected LSP a swap operation must be reprogrammed or otherwise switched over. The use of detour FRR doubles the number of LSP terminating at any given hop and will increase the number of LSP within a network by a factor dependent on the average detour path length.

The bypass method makes use of a tunnel that is unused when no fault exists but may carry many LSP when a local repair is required. There is no presignaling indicating which working LSP will be diverted into any specific bypass LSP. The merge LSR (egress LSR of the bypass LSP) MUST use platform label space (as defined in [RFC3031]) so that an LSP working path on any given interface can be backed up using a bypass LSP terminating on any other interface. Hardware assistance is needed if necessary to accomplish local repair of a large number of LSP within the 10s of milliseconds target. For each affected LSP a swap operation must be reprogrammed or otherwise switched over with an additional push of the bypass LSP label. The use of platform label space impacts the size of the LSR ILM for LSR with a very large number of interfaces.

2.1.8.  Pseudowire Encapsulation

The pseudowire (PW) architecture is defined in [RFC3985]. A pseudowire, when carried over MPLS, adds one or more additional label entries to the MPLS label stack. A PW Control Word is defined in [RFC4385] with motivation for defining the control word in [RFC4928].
The PW Associated Channel defined in [RFC4385] is used for OAM in [RFC5085]. The PW Flow Label is defined in [RFC6391] and is discussed further in this document in Section 2.4.3.

There are numerous pseudowire encapsulations, supporting emulation of services such as Frame Relay, ATM, Ethernet, TDM, and SONET/SDH over packet switched networks (PSNs) using IP or MPLS.

The pseudowire encapsulation is out of scope for this document. Pseudowire impact on MPLS forwarding at midpoint LSR is within scope. The impact on ingress MPLS push and egress MPLS UHP pop is within scope. While pseudowire encapsulation is out of scope, some advice is given on sequence number support.

2.1.8.1.  Pseudowire Sequence Number

Pseudowire (PW) sequence number support is most important for PW payload types with a high expectation of lossless and/or in-order delivery. Identifying lost PW packets and the exact amount of lost payload is critical for PW services which maintain bit timing, such as Time Division Multiplexing (TDM) services, since these services MUST compensate for lost payload on a bit-for-bit basis.

With PW services which maintain bit timing, packets that have been received out of order also MUST be identified and MAY be either reordered or dropped. Resequencing requires, in addition to sequence numbering, a "reorder buffer" in the egress PE, and the ability to reorder is limited by the depth of this buffer. The down side of maintaining a large reorder buffer is added end-to-end service delay.

For PW services which maintain bit timing or any other service where jitter must be bounded, a jitter buffer is always necessary. The jitter buffer is needed regardless of whether reordering is done. In order to be effective, a reorder buffer must often be larger than a jitter buffer needs to be, creating a tradeoff between reducing loss and minimizing delay.
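The tradeoff between reorder buffer depth and loss can be illustrated with a minimal sketch. It is not taken from any PW specification: the class name `ReorderBuffer`, the `depth` parameter, and the policy of giving up on a missing packet once more than `depth` packets are waiting are all assumptions made for illustration. Sequence number wraparound is ignored to keep the example short.

```python
class ReorderBuffer:
    """Hypothetical egress PE reorder buffer (illustrative only)."""

    def __init__(self, depth, first_seq=1):
        self.depth = depth          # assumed limit on packets held back
        self.next_seq = first_seq   # next sequence number to release
        self.pending = {}           # out-of-order packets keyed by sequence
        self.dropped = 0            # late or duplicate packets discarded

    def push(self, seq, payload):
        """Accept one packet and return the payloads released in order."""
        if seq < self.next_seq or seq in self.pending:
            # Too late to resequence (or a duplicate): drop it.
            self.dropped += 1
            return []
        self.pending[seq] = payload
        released = []
        while self.pending:
            if self.next_seq in self.pending:
                released.append(self.pending.pop(self.next_seq))
                self.next_seq += 1
            elif len(self.pending) > self.depth:
                # Buffer depth exceeded: treat the gap as loss and move on.
                self.next_seq = min(self.pending)
            else:
                break               # keep waiting for the missing packet
        return released
```

With `depth=0` a packet that arrives behind a later one is dropped, while with `depth=2` the same arrival order is resequenced, at the cost of holding the early packet until the late one arrives.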
PW services which are not timing critical bit streams in nature are cell oriented or frame oriented. Though resequencing support may be beneficial to PW cell and frame oriented payloads such as ATM, FR and Ethernet, this support is desirable but not required. Requirements to handle out of order packets at all vary among services and deployments. For example, for Ethernet PW, occasional (very rare) reordering is usually acceptable. If the Ethernet PW is carrying MPLS-TP, then this reordering may be unacceptable.

Reducing jitter is best done by an end-system, given that the tradeoff of loss vs delay varies among services. For example, with interactive real time services low delay is preferred, while with non-interactive (one way) real time services low loss is preferred. The same end-site may be receiving both types of traffic. Regardless of this, bounded jitter is sometimes a requirement for specific deployments.

Packet reordering should be rare except in a small number of circumstances, most of which are due to network design or equipment design errors:

1.  The most common case is where reordering is rare, occurring only when a network or equipment fault forces traffic on a new path with different delay. The packet loss that accompanies a network or equipment fault is generally more disruptive than any reordering which may occur.

2.  A path change can be caused by reasons other than a network or equipment fault, such as an administrative routing change. This may result in packet reordering but generally without any packet loss.

3.  If the edge is not using pseudowire control word (CW) and the core is using multipath, reordering will be far more common. If this is occurring, using CW on the edge will solve the problem. Without CW, resequencing is not possible since the sequence number is contained in the CW.

4.  Another avoidable case is where some core equipment has multipath and for some reason insists on periodically installing a new random number as the multipath hash seed. If supporting MPLS-TP, equipment MUST provide a means to disable periodic hash reseeding and deployments MUST disable periodic hash reseeding. Operator experience dictates that even if not supporting MPLS-TP, equipment SHOULD provide a means to disable periodic hash reseeding and deployments SHOULD disable periodic hash reseeding.

In provider networks which use multipath techniques and which may occasionally rebalance traffic or which may change PW paths occasionally for other reasons, reordering may be far more common than loss. Where reordering is more common than loss, resequencing packets is beneficial, rather than dropping packets at egress when out of order arrival occurs. Resequencing is most important for PW payload types with a high expectation of lossless delivery since in such cases out of order delivery within the network results in PW loss.

2.1.9.  Layer-2 and Layer-3 VPN

Layer-2 VPN [RFC4664] and Layer-3 VPN [RFC4110] add one or more label entries to the MPLS label stack. VPN encapsulations are out of scope for this document. Their impact on forwarding at midpoint LSR is within scope.

Any of these services may be used on an MPLS entropy label enabled ingress and egress (see Section 2.4.4 for discussion of entropy label) which would add an additional two labels to the MPLS label stack. The need to provide a useful entropy label value impacts the requirements of the VPN ingress LER but is out of scope for this document.

2.2.  MPLS Multicast

MPLS Multicast encapsulation is clarified in [RFC5332]. MPLS Multicast may be signaled using RSVP-TE [RFC4875] or LDP [RFC6388].
[RFC4875] defines a root initiated RSVP-TE LSP setup rather than the leaf initiated join used in IP multicast. [RFC6388] defines a leaf initiated LDP setup. Both [RFC4875] and [RFC6388] define point to multipoint (P2MP) LSP setup. [RFC6388] also defines multipoint to multipoint (MP2MP) LSP setup.

A P2MP LSP has a single source. An LSR may be a leaf node, an intermediate node, or a "bud" node. A bud serves as both a leaf and an intermediate node. At a leaf an MPLS pop is performed. The payload may be an IP Multicast packet that requires further replication. At an intermediate node an MPLS swap operation is performed. The bud requires that both a pop operation and a swap operation be performed for the same incoming packet.

One strategy to support P2MP functionality is to pop at the LSR interface serving as ingress to the P2MP traffic and then optionally push labels at each LSR interface serving as egress to the P2MP traffic at that same LSR. A given LSR egress chip may support multiple egress interfaces, each of which requires a copy, but each with a different set of added labels and layer-2 encapsulation. Some physical interfaces may have multiple sub-interfaces (such as Ethernet VLAN or channelized interfaces) each requiring a copy.

If packet replication is performed at LSR ingress, then the ingress interface performance may suffer. If the packet replication is performed within an LSR switching fabric and at LSR egress, congestion of egress interfaces cannot make use of backpressure to ingress interfaces using techniques such as virtual output queuing (VOQ). If buffering is primarily supported at egress, then the need for backpressure is minimized. There may be no good solution for high volumes of multicast traffic if VOQ is used.
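The pop-then-push replication strategy described above can be sketched as follows. This is an illustration only: the function name `replicate_p2mp`, the dictionary packet representation, and the `branches` list of (interface, label) pairs are hypothetical and do not reflect any particular forwarding chip.

```python
def replicate_p2mp(packet, branches, is_leaf):
    """Replicate one packet arriving on a P2MP LSP (illustrative only).

    packet   -- dict with "labels" (top of stack first) and "payload"
    branches -- list of (egress_interface, outgoing_label) pairs
    is_leaf  -- True if this LSR also delivers the payload locally
    """
    inner = packet["labels"][1:]      # pop the incoming P2MP label once
    copies = []
    for egress_if, out_label in branches:
        # Each egress (sub-)interface gets its own copy and pushed label.
        copies.append((egress_if, {"labels": [out_label] + inner,
                                   "payload": packet["payload"]}))
    # A leaf or bud node also hands the popped packet to local processing.
    local = None
    if is_leaf:
        local = {"labels": inner, "payload": packet["payload"]}
    return copies, local
```

In this sketch an intermediate node would pass `is_leaf=False`, a leaf would pass an empty `branches` list, and a bud would pass both a non-empty list and `is_leaf=True`.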
Careful consideration should be given to the performance characteristics of high fanout multicast for equipment that is intended to be used in such a role.

MP2MP LSP differ in that any branch may provide an input, including a leaf. Packets must be replicated onto all other branches. This forwarding is often implemented as multiple P2MP forwarding trees, one for each potential input interface at a given LSR.

2.3.  Packet Rates

While average packet size of Internet traffic may be large, long sequences of small packets have both been predicted in theory and observed in practice. Traffic compression and TCP ACK compression can conspire to create long sequences of packets of 40-44 bytes in payload length. If carried over Ethernet, the 64 byte minimum payload applies, yielding a packet rate of approximately 150 Mpps (million packets per second) for the duration of the burst on a nominal 100 Gb/s link. The peak rate for other encapsulations can be as high as 250 Mpps (for example IP or MPLS encapsulated using GFP over OTN ODU4).

It is possible that the packet rates achieved by a specific implementation are acceptable for a minimum payload size, such as 64 byte (64B) payload for Ethernet, but the achieved rate declines to an unacceptable level for other packet sizes, such as 65B payload. There are other packet rates of interest besides TCP ACK. For example, a TCP ACK carried over an Ethernet PW over MPLS over Ethernet may occupy 82B, or 82B plus a multiple of 4B if additional MPLS labels are present.

A graph of packet rate vs. packet size often displays a sawtooth. The sawtooth is commonly due to a memory bottleneck and memory widths, sometimes internal cache, but often a very wide external buffer memory interface. In some cases it may be due to a fabric transfer width.
A fine packing, rounding up to the nearest 8B or 16B, will result in a fine sawtooth with small degradation for 65B, and even less for 82B packets. A coarse packing, rounding up to 64B, can yield a sharper drop in performance for 65B packets, or perhaps more important, a larger drop for 82B packets.

The loss of some TCP ACK packets is not the primary concern when such a burst occurs. When a burst occurs, any other packets, regardless of packet length and packet QoS, are dropped once on-chip input buffers prior to the decision engine are exceeded. Buffers in front of the packet decision engine are often very small or non-existent (less than one packet of buffer), causing significant QoS agnostic packet drop.

Internet service providers and content providers at one time specified full rate forwarding with 40 byte payload packets as a requirement. Today, this requirement often can be waived if the provider can be convinced that when a long sequence of short packets occurs no packets will be dropped.

Many equipment suppliers have pointed out that the extra cost in designing hardware capable of processing the minimum size packets at full line rate is significant for very high speed interfaces. If hardware is not capable of processing the minimum size packets at full line rate, then that hardware MUST be capable of handling large bursts of small packets, a condition which is often observed. This level of performance is necessary to meet Differentiated Services [RFC2475] requirements; without it, packets are lost prior to inspection of the IP DSCP field [RFC2474] or MPLS TC field [RFC5462].

With adequate on-chip buffers before the packet decision engine, an LSR can absorb a long sequence of short packets.
Even if the output is slowed to the point where light congestion occurs, the packets, having cleared the decision process, can make use of larger VOQ or output side buffers and be dealt with according to configured QoS treatment, rather than dropped completely at random.

These on-chip buffers need not contribute significant delay since they are only used when the packet decision engine is unable to keep up, not in response to congestion, plus these buffers are quite small. For example, an on-chip buffer capable of handling 4K packets of 64 bytes in length, or 256KB (4096 x 64 bytes = 2,097,152 bits), corresponds to about 0.2 msec on a 10 Gb/s link and about 20 usec on a 100 Gb/s link. If the packet decision engine is capable of handling packets at 90% of the full rate for small packets, then the maximum added delay is about 20 usec and 2 usec respectively, and this delay only applies if a 4K burst of short packets occurs. When no burst of short packets is being processed, no delay is added.

Packet rate requirements apply regardless of which network tier equipment is deployed in. Whether deployed in the network core or near the network edges, one of the two conditions MUST be met if Differentiated Services requirements are to be met:

1.  Packets must be processed at full line rate with minimum sized packets. -OR-

2.  Packets must be processed at a rate sufficient for packet sizes well under generally accepted average packet sizes, with sufficient buffering prior to the packet decision engine to accommodate long bursts of small packets.

2.4.  MPLS Multipath Techniques

In any large provider, whether service provider or content provider, hash based multipath techniques are used in the core and in the edge. In many of these providers hash based multipath is also used in the larger metro networks.
The Differentiated Services requirements dictate, for good reasons, that packets within a common microflow SHOULD NOT be reordered [RFC2474]. Service providers generally impose stronger requirements, commonly requiring that packets within a microflow MUST NOT be reordered except in rare circumstances such as load balancing across multiple links or path change for load balancing or path change for other reasons.

The most common multipath techniques are ECMP applied at the IP forwarding level, Ethernet LAG with inspection of the IP payload, and multipath on links carrying both IP and MPLS, where the IP header is inspected below the MPLS label stack. In most core networks, the vast majority of traffic is MPLS encapsulated.

In order to support an adequately balanced load distribution across multiple links, IP header information must be used. Common practice today is to reinspect the IP headers at each LSR and use the label stack and IP header information in a hash performed at each LSR. Further details are provided in Section 2.4.5.

The use of this technique is so ubiquitous in provider networks that lack of support for multipath makes any product unsuitable for use in large core networks. This will continue to be the case in the near future, even as deployment of MPLS entropy label begins to relax the core LSR multipath performance requirements, given the existing deployed base of edge equipment without the ability to add an entropy label.

A generation of edge equipment supporting the ability to add an MPLS entropy label is needed before the performance requirements for core LSR can be relaxed. However, it is likely that two generations of deployment in the future will allow core LSR to support full packet rate only when a relatively small number of MPLS labels need to be inspected before hashing. For now, don't count on it.
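As a rough illustration of the hash based multipath described in this section, the sketch below picks a component link from a hash over label values, IP addresses, and a per-LSR seed. The function name `select_component_link`, the use of CRC-32, and the modulo allocation are assumptions for illustration only; real forwarding hardware uses its own hash functions and allocation methods.

```python
import ipaddress
import struct
import zlib

def select_component_link(labels, src_ip, dst_ip, seed, num_links):
    """Map a flow to one of num_links component links (illustrative only)."""
    # Start the hash input with the seed chosen by this LSR.
    key = struct.pack("!I", seed & 0xFFFFFFFF)
    for label in labels:
        # Only the 20-bit label value is used, never TC, S, or TTL.
        key += struct.pack("!I", label & 0xFFFFF)
    # Works for both IPv4 and IPv6 address strings.
    key += ipaddress.ip_address(src_ip).packed
    key += ipaddress.ip_address(dst_ip).packed
    # Simple modulo allocation over the component links.
    return zlib.crc32(key) % num_links
```

For a fixed seed, every packet of a given microflow yields the same hash input and therefore the same component link, which is what keeps the microflow in order. Changing the seed can move existing flows to other links.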
Common practice today is to reinspect the packet at each LSR and use information from the packet combined with a hash seed that is selected by each LSR. Where flow labels or entropy labels are used, a hash seed must be used when creating these labels.

2.4.1.  Pseudowire Control Word

Within the core of a network some form of multipath is almost certain to be used. Multipath techniques deployed today are likely to be looking beneath the label stack for an opportunity to hash on IP addresses.

A pseudowire encapsulated at a network edge must have a means to prevent reordering within the core if the pseudowire will be crossing a network core, or any part of a network topology where multipath is used (see [RFC4385] and [RFC4928]).

Not supporting the ability to encapsulate a pseudowire with a control word may lock a product out from consideration. A pseudowire capability without control word support might be sufficient for applications that are strictly both intra-metro and low bandwidth. However, a provider with other applications will very likely not tolerate having equipment which can only support a subset of their pseudowire needs.

2.4.2.  Large Microflows

Where multipath makes use of a simple hash and a simple load balance such as modulo or other fixed allocation (see Section 2.4), large microflows can be a problem. If each large microflow consumes 10% of the capacity of a component link of a potentially congested composite link, one such microflow can upset the traffic balance, and more than one can in effect reduce the effective capacity of the entire composite link by more than 10%.

When even a very small number of large microflows are present, there is a significant probability that more than one of these large microflows could fall on the same component link.
If the traffic contribution from large microflows is small, the probability for three or more large microflows on the same component link drops significantly. Therefore in a network where a significant number of parallel 10 Gb/s links exist, even a 1 Gb/s pseudowire or other large microflow that could not otherwise be subdivided into smaller flows should carry a flow label or entropy label if possible.

Active management of the hash space to better accommodate large microflows has been implemented and deployed in the past; however, such techniques are out of scope for this document.

2.4.3.  Pseudowire Flow Label

Unlike a pseudowire control word, a pseudowire flow label [RFC6391] is required only for relatively large capacity pseudowires. There are many cases where a pseudowire flow label makes sense. Any service such as a VPN which carries IP traffic within a pseudowire can make use of a pseudowire flow label.

Any pseudowire carried over MPLS which makes use of the pseudowire control word and does not carry a flow label is in effect a single microflow (in [RFC2475] terms) and may result in the types of problems described in Section 2.4.2.

2.4.4.  MPLS Entropy Label

The MPLS entropy label simplifies flow group identification [RFC6790] at midpoint LSRs. Prior to the MPLS entropy label, midpoint LSRs needed to inspect the entire label stack and often the IP headers to provide an adequate distribution of traffic when using multipath techniques (see Section 2.4.5). With the use of MPLS entropy label, a hash can be performed closer to network edges, placed in the label stack, and used by midpoint LSRs without fully reinspecting the label stack and inspecting the payload.
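A minimal sketch of how a midpoint LSR might gather hash input from a label stack when entropy labels are in use is shown below. The ELI value 7 is assigned in [RFC6790]; the function name `hash_labels` and the list-of-integers representation of the label stack are assumptions for illustration. Other special purpose labels (values 0-15) are skipped, and extended special purpose labels are not handled in this sketch.

```python
ELI = 7  # Entropy Label Indicator, special purpose label from RFC 6790

def hash_labels(stack):
    """Return the label values to feed to the multipath hash.

    `stack` is a list of 20-bit label values, top of stack first.
    """
    used = []
    entries = iter(stack)
    for label in entries:
        if label == ELI:
            # The entry after the ELI is the entropy label: use it and
            # stop, ignoring all labels below it and the payload.
            entropy_label = next(entries, None)
            if entropy_label is not None:
                used.append(entropy_label)
            break
        if label > 15:
            used.append(label)   # ordinary label; values 0-15 are skipped
    return used
```

For the stack `[3001, 7, 51234, 4002]` this returns `[3001, 51234]`: the search stops at the first entropy label and label 4002 is never inspected.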
The MPLS entropy label is capable of avoiding full label stack and payload inspection within the core where performance levels are most difficult to achieve (see Section 2.3). The label stack inspection can be terminated as soon as the first entropy label is encountered, which is generally after a small number of labels are inspected.

In order to provide these benefits in the core, LSR closer to the edge must be capable of adding an entropy label. This support may not be required in the access tier, the tier closest to the customer, but is likely to be required in the edge or the border to the network core. LSR peering with external networks will also need to be able to add an entropy label on incoming traffic.

2.4.5.  Fields Used for Multipath Load Balance

The most common multipath techniques are based on a hash over a set of fields. Regardless of whether a hash is used or some other method is used, there is a limited set of fields which can safely be used for multipath.

2.4.5.1.  MPLS Fields in Multipath

If the "outer" or "first" layer of encapsulation is MPLS, then label stack entries are used in the hash. Within a finite amount of time (and for small packets arriving at high speed that time can be quite limited) only a finite number of label entries can be inspected.

Pipelined or parallel architectures improve this, but the limit is still finite.

The following guidelines are provided for use of MPLS fields in multipath load balancing.

1.  Only the 20 bit label field SHOULD be used. The TTL field SHOULD NOT be used. The S bit MUST NOT be used. The TC field (formerly EXP) MUST NOT be used. See text following this list for reasons.

2.  If an ELI label is found, then if the LSR supports entropy label, the EL label field in the next label entry (the EL) SHOULD be used and label entries below that label SHOULD NOT be used and the MPLS payload SHOULD NOT be used. See below this list for reasons.

3.  Special purpose labels (label values 0-15) MUST NOT be used. Extended special purpose labels (any label following label 15) MUST NOT be used. In particular, GAL and RA MUST NOT be used so that OAM traffic follows the same path as payload packets with the same label stack.

4.  If a new special purpose label or extended special purpose label is defined which requires special load balance processing, then, as is the case for the ELI label, a special action may be needed rather than skipping the special purpose label or extended special purpose label.

5.  The most entropy is generally found in the label stack entries near the bottom of the label stack (innermost label, closest to S=1 bit). If the entire label stack cannot be used (or entire stack up to an EL), then it is better to use as many labels as possible closest to the bottom of stack.

6.  If no ELI is encountered, and the first nibble of payload contains a 4 (IPv4) or 6 (IPv6), an implementation SHOULD support the ability to interpret the payload as IPv4 or IPv6 and extract and use appropriate fields from the IP headers. This feature is considered a non-negotiable requirement by many service providers. If supported, there MUST be a way to disable it (if, for example, PW without CW are used). This ability to disable this feature is considered a non-negotiable requirement by many service providers. Therefore an implementation has a very strong incentive to support both options.

7.  A label which is popped at egress (UHP pop) SHOULD NOT be used.
A label which is popped at the penultimate hop (PHP pop) SHOULD be used.

Apparently some chips have made use of the TC (formerly EXP) bits as a source of entropy. This is very harmful since it will reorder Assured Forwarding (AF) traffic [RFC2597] when a subset does not conform to the configured rates and is remarked but not dropped at a prior LSR. Traffic which uses MPLS ECN [RFC5129] can also be reordered if TC is used for entropy. Therefore, as stated in the guidelines above, the TC field (formerly EXP) MUST NOT be used in multipath load balancing as it violates Differentiated Services Ordered Aggregate (OA) requirements in these two instances.

Use of the MPLS label entry S bit would result in putting OAM traffic on a different path if the addition of a GAL at the bottom of stack removed the S bit from the prior label.

If an ELI label is found, then if the LSR supports entropy label, the EL label field in the next label entry (the EL) SHOULD be used and the search for additional entropy within the packet SHOULD be terminated. Failure to terminate the search will impact client MPLS-TP LSP carried within server MPLS LSP. A network operator has the option to use administrative attributes as a means to identify LSR which do not terminate the entropy search at the first EL. Administrative attributes are defined in [RFC3209]. Some configuration is required to support this.

If the label removed by a PHP pop is not used, then for any PW for which CW is used, there is no basis for multipath load split. In some networks it is infeasible to put all PW traffic on one component link. Any PW which does not use CW will be improperly split regardless of whether the label removed by a PHP pop is used. Therefore the PHP pop label SHOULD be used as recommended above.

2.4.5.2.  IP Fields in Multipath

Inspecting the IP payload provides the most entropy in provider networks. The practice of looking past the bottom of stack label for an IP payload is well accepted and documented in [RFC4928] and in other RFCs.

Where IP is mentioned in the document, both IPv4 and IPv6 apply. All LSRs MUST fully support IPv6.

When information in the IP header is used, the following guidelines apply:

1.  Both the IP source address and IP destination address SHOULD be used. There MAY be an option to reverse the order of these addresses, improving the ability to provide symmetric paths in some cases. Many service providers require that both addresses be used.

2.  Implementations SHOULD allow inspection of the IP protocol field and use of the UDP or TCP port numbers. For many service providers this feature is considered mandatory, particularly for enterprise, data center, or edge equipment. If this feature is provided, it SHOULD be possible to disable use of TCP and UDP ports. Many service providers consider it a non-negotiable requirement that use of UDP and TCP ports can be disabled. Therefore there is a strong incentive for implementations to provide both options.

3.  Equipment suppliers MUST NOT assume that because the IP version field is equal to 4 (an IPv4 packet) the IP protocol will be either TCP (IP protocol 6) or UDP (IP protocol 17), and MUST NOT blindly fetch the data at the offset where the TCP or UDP ports would be found. With IPv6, TCP and UDP port numbers are not at fixed offsets. With IPv4 packets carrying IP options, TCP and UDP port numbers are not at fixed offsets.

4.  The IPv6 header flow field SHOULD be used. This is the explicit purpose of the IPv6 flow field; however, observed flow fields rarely contain a non-zero value.
Some uses of the flow field have been defined such as [RFC6438]. In the absence of MPLS encapsulation, the IPv6 flow field can serve a role equivalent to entropy label.

5.  Support for other protocols that share a common Layer-4 header such as RTP [RFC3550], UDP-Lite [RFC3828], SCTP [RFC4960] and DCCP [RFC4340] SHOULD be provided, particularly for edge or access equipment where additional entropy may be needed. Equipment SHOULD also use RTP, UDP-Lite, SCTP and DCCP headers when creating an entropy label.

6.  The following IP header fields should not or must not be used:

    a.  Similar to avoiding TC in MPLS, the IP DSCP and ECN bits MUST NOT be used.

    b.  The IPv4 TTL or IPv6 Hop Limit SHOULD NOT be used.

    c.  Note that the IP TOS field was deprecated ([RFC0791] was updated by [RFC2474]). No part of the IP DSCP field can be used (formerly IP PREC and IP TOS bits).

7.  Some IP encapsulations support tunneling, such as IP-in-IP, GRE, L2TPv3, and IPsec. These provide a greater source of entropy which some provider networks carrying large amounts of tunneled traffic may need, for example as used in [RFC5640] for GRE and L2TPv3. The use of tunneling header information is out of scope for this document.

This document makes the following recommendations. These recommendations are not required to claim compliance with any existing RFC; therefore implementers are free to ignore them, but due to service provider requirements they should consider the risk of doing so. The use of IP addresses MUST be supported and TCP and UDP ports (conditional on the protocol field and properly located) MUST be supported. The ability to disable use of UDP and TCP ports MUST be available. Though potentially very useful in some networks, it is uncommon to support using payloads of tunneling protocols carried over IP.
Though the use of tunneling protocol header information is out of scope for this document, it is not discouraged.

2.4.5.3.  Fields Used in Flow Label

The ingress to a pseudowire (PW) can extract information from the payload being encapsulated to create a flow label. [RFC6391] references IP carried in Ethernet as an example. The Native Service Processing (NSP) function defined in [RFC3985] differs with pseudowire type. It is in the NSP function where information for a specific type of PW can be extracted for use in a flow label. Which fields to use for any given PW NSP is out of scope for this document.

2.4.5.4.  Fields Used in Entropy Label

An entropy label is added at the ingress to an LSP. The payload being encapsulated is most often MPLS, a PW, or IP. The payload type is identified by the layer-2 encapsulation (Ethernet, GFP, POS, etc).

If the payload is MPLS, then the information used to create an entropy label is the same information used for local load balancing (see Section 2.4.5.1). This information MUST be extracted for use in generating an entropy label even if the LSR local egress interface is not a multipath.

Of the non-MPLS payload types, only payloads that are forwarded are of interest. For example, ARP is not forwarded and CLNP (used only for ISIS) is not forwarded.

The non-MPLS payload types of greatest interest are IPv4 and IPv6. The guidelines in Section 2.4.5.2 apply to fields used to create an entropy label.

The IP tunneling protocols mentioned in Section 2.4.5.2 may be more applicable to generation of an entropy label at edge or access where deep packet inspection is practical due to lower interface speeds than in the core where deep packet inspection may be impractical.

2.5.  MPLS-TP and UHP

MPLS-TP introduces forwarding demands that will be extremely difficult to meet in a core network.
   Most troublesome is the
   requirement for Ultimate Hop Popping (UHP, the opposite of
   Penultimate Hop Popping or PHP). Using UHP opens the possibility of
   one or more MPLS pop operations plus an MPLS swap operation for each
   packet. The potential for multiple lookups and multiple counter
   instances per packet exists.

   As networks grow and tunneling of LDP LSPs into RSVP-TE LSPs is used,
   and/or RSVP-TE hierarchy is used, the requirement to perform one or
   two or more MPLS pop operations plus an MPLS swap operation (and
   possibly a push or two) increases. If MPLS-TP LM (loss measurement)
   OAM is enabled at each layer, then a packet and byte count MUST be
   maintained for each pop and swap operation so as to offer OAM for
   each layer.

2.6.  Local Delivery of Packets

   There are a number of situations in which packets are destined to a
   local address or where a return packet must be generated. There is a
   need to mitigate the potential for outage as a result of either
   attacks on network infrastructure, or in some cases unintentional
   misconfiguration resulting in processor overload. Some hardware
   assistance is needed for all traffic destined to the general purpose
   CPU that is used in MPLS control protocol processing or network
   management protocol processing and in most cases to other general
   purpose CPUs residing on an LSR. This is due to the ease of
   overwhelming such a processor with traffic arriving on LSR high speed
   interfaces, whether the traffic is malicious or not.

   Denial of service (DoS) protection is an area requiring hardware
   support that is often overlooked or inadequately considered.
   Hardware assist is also needed for OAM, particularly the more
   demanding MPLS-TP OAM.

2.6.1.  DoS Protection

   Modern equipment supports a number of control plane and management
   plane protocols.
   Generally no single means of protecting network
   equipment from denial of service (DoS) attacks is sufficient,
   particularly for high speed interfaces. This problem is not specific
   to MPLS, but is a topic that cannot be ignored when implementing or
   evaluating MPLS implementations.

   Two types of protections are often cited as primary means of
   protecting against attacks of all kinds.

   Isolated Control/Management Traffic
      Control and Management traffic can be carried out-of-band (OOB),
      meaning not intermixed with payload. For MPLS, use of G-ACh and
      GAL to carry control and management traffic provides a means of
      isolation from potentially malicious payload. Used alone, the
      compromise of a single node, including a small computer at a
      network operations center, could compromise an entire network.
      Implementations which send all G-ACh/GAL traffic directly to a
      routing engine CPU are subject to DoS attack as a result of such
      a compromise.

   Cryptographic Authentication
      Cryptographic authentication can very effectively prevent
      malicious injection of control or management traffic.
      Cryptographic authentication can in some circumstances be subject
      to DoS attack by overwhelming the capacity of the decryption with
      a high volume of malicious traffic. For very low speed
      interfaces, cryptographic authentication can be performed by the
      general purpose CPU used as a routing engine. For all other
      cases, cryptographic hardware may be needed. For very high speed
      interfaces, even cryptographic hardware can be overwhelmed.

   Some control and management protocols are often carried with payload
   traffic. This is commonly the case with BGP, T-LDP, and SNMP. It is
   often the case with RSVP-TE. Even when carried over G-ACh/GAL,
   additional measures can reduce the potential for a minor breach to be
   leveraged to a full network attack.
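
   The concern that even cryptographic hardware can be overwhelmed on
   very high speed interfaces can be illustrated with a rough
   calculation. The following is a minimal sketch and not part of any
   specification. It assumes Ethernet framing with 64-byte minimum
   frames and 20 bytes of per-frame overhead (preamble, start-of-frame
   delimiter, and inter-frame gap). The function names and the
   per-engine verification rate are hypothetical and chosen only for
   illustration.

```python
import math

# Assumed Ethernet per-frame overhead on the wire:
# 7-byte preamble + 1-byte start-of-frame delimiter + 12-byte inter-frame gap.
ETHERNET_OVERHEAD_BYTES = 20


def packets_per_second(link_bps: float, frame_bytes: int) -> float:
    """Worst-case packet rate when a link is filled with frames of one size."""
    bits_on_wire = (frame_bytes + ETHERNET_OVERHEAD_BYTES) * 8
    return link_bps / bits_on_wire


def engines_needed(link_bps: float, frame_bytes: int, engine_pps: float) -> int:
    """Number of parallel engines needed to verify every packet at line rate.

    engine_pps is a hypothetical per-engine verification rate, not a
    figure taken from any real device.
    """
    return math.ceil(packets_per_second(link_bps, frame_bytes) / engine_pps)


if __name__ == "__main__":
    rate = packets_per_second(100e9, 64)
    print(f"100 Gb/s with 64-byte frames: about {rate / 1e6:.1f} Mpps")
    # Hypothetical engine able to verify 5 million packets per second.
    print("engines needed:", engines_needed(100e9, 64, 5e6))
```

   Under these assumptions a single 100 Gb/s interface can deliver on
   the order of 150 million small packets per second, so an attack at
   line rate has to be filtered or rate limited before it reaches any
   authentication stage.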

   Some of the additional protections are supported by hardware packet
   filtering.

   GTSM
      [RFC5082] defines a mechanism that uses the IPv4 TTL or IPv6 Hop
      Limit fields to ensure that control traffic that can only
      originate from an immediate neighbor is not forged and originating
      from a distant source. GTSM can be applied to many control
      protocols which are routable, for example LDP [RFC6720].

   IP Filtering
      At the very minimum, packet filtering plus classification and use
      of multiple queues supporting rate limiting is needed for traffic
      that could potentially be sent to a general purpose CPU used as a
      routing engine. The first level of filtering only allows
      connections to be initiated from specific IP prefixes to specific
      destination ports and then preferably passes traffic directly to
      a cryptographic engine and/or rate limits. The second level of
      filtering passes connected traffic, such as TCP connections
      having received at least one authenticated SYN or having been
      locally initiated. The second level of filtering only passes
      traffic to specific address and port pairs to be checked for
      cryptographic authentication.

   Cryptographic authentication is generally the last resort in DoS
   attack mitigation. If a packet must be first sent to a general
   purpose CPU, then sent to a cryptographic engine, a DoS attack is
   possible on high speed interfaces. Only where hardware can identify
   a signature and the portion of packet covered by the signature is
   cryptographic authentication highly beneficial in protecting against
   DoS attacks.

   For chips supporting multiple 100 Gb/s interfaces, only a very large
   number of parallel cryptographic engines can provide the processing
   capacity to handle a large scale DoS or distributed DoS (DDoS)
   attack.
   For many forwarding chips this much processing power
   requires significant chip real estate and power, and therefore
   reduces system space and power density. For this reason,
   cryptographic authentication is not considered a viable first line of
   defense.

   For some networks the first line of defense is some means of
   supporting OOB control and management traffic. In the past this OOB
   channel might make use of overhead bits in SONET or OTN or a
   dedicated DWDM wavelength. G-ACh and GAL provide an alternative OOB
   mechanism which is independent of underlying layers. In other
   networks, including most IP/MPLS networks, perimeter filtering serves
   a similar purpose, though it is less effective without extreme
   vigilance.

   A second line of defense is filtering, including GTSM. For protocols
   such as EBGP, GTSM and other filtering is often the first line of
   defense. Cryptographic authentication is usually the last line of
   defense and insufficient by itself to mitigate DoS or DDoS attacks.

2.6.2.  MPLS OAM

   [RFC4377] defines requirements for MPLS OAM that predate MPLS-TP.
   [RFC4379] defines what is commonly referred to as LSP Ping and LSP
   Traceroute. [RFC4379] is updated by [RFC6424] supporting MPLS
   tunnels and stitched LSP and P2MP LSP. [RFC4379] is updated by
   [RFC6425] supporting P2MP LSP. [RFC4379] is updated by [RFC6426] to
   support MPLS-TP connectivity verification (CV) and route tracing.

   [RFC4950] extends the ICMP format to support TTL expiration that may
   occur when using IP traceroute within an MPLS tunnel. The ICMP
   message generation can be implemented in forwarding hardware, but if
   sent to a general purpose CPU must be rate limited to avoid a
   potential denial of service (DoS) attack.

   [RFC5880] defines Bidirectional Forwarding Detection (BFD), a
   protocol intended to detect faults in the bidirectional path between
   two forwarding engines.
   [RFC5884] and [RFC5885] define BFD for MPLS.
   BFD can provide failure detection on any kind of path between
   systems, including direct physical links, virtual circuits, tunnels,
   MPLS Label Switched Paths (LSPs), multihop routed paths, and
   unidirectional links as long as there is some return path.

   The processing requirements for BFD are less than for LSP Ping,
   making BFD somewhat better suited for relatively high rate proactive
   monitoring. BFD does not verify that the data plane matches the
   control plane, whereas LSP Ping does. LSP Ping is somewhat better
   suited for on-demand monitoring including relatively low rate
   periodic verification of data plane and as a diagnostic tool.

   Hardware assistance is often provided for BFD response where BFD
   setup or parameter change is not involved and may be necessary for
   relatively high rate proactive monitoring. If both BFD and LSP Ping
   are recognized in filtering prior to passing traffic to a general
   purpose CPU, appropriate DoS protection can be applied (see
   Section 2.6.1). Failure to recognize BFD and LSP Ping and at least
   rate limit creates the potential for misconfiguration to cause
   outages rather than cause errors in the misconfigured OAM.

2.6.3.  Pseudowire OAM

   Pseudowire OAM makes use of the control channel provided by Virtual
   Circuit Connectivity Verification (VCCV) [RFC5085]. VCCV makes use
   of the Pseudowire Control Word. BFD support over VCCV is defined by
   [RFC5885]. [RFC5885] is updated by [RFC6478] in support of static
   pseudowires. [RFC4379] is updated by [RFC6829] supporting LSP Ping
   for Pseudowire FEC advertised over IPv6.

   G-ACh/GAL (defined in [RFC5586]) is the preferred MPLS-TP OAM control
   channel and applies to any MPLS-TP end points, including Pseudowire.
   See Section 2.6.4 for an overview of MPLS-TP OAM.

2.6.4.  MPLS-TP OAM

   [RFC6669] summarizes the MPLS-TP OAM toolset, the set of protocols
   supporting the MPLS-TP OAM requirements specified in [RFC5860] and
   supported by the MPLS-TP OAM framework defined in [RFC6371].

   The MPLS-TP OAM toolset includes:

   CC-CV
      [RFC6428] defines BFD extensions to support proactive
      Connectivity Check and Connectivity Verification (CC-CV)
      applications. [RFC6426] provides LSP ping extensions that are
      used to implement on-demand connectivity verification.

   RDI
      Remote Defect Indication (RDI) is triggered by failure of
      proactive CC-CV, which is BFD based. For fast RDI initiation,
      RDI SHOULD be initiated and handled by hardware if BFD is handled
      in forwarding hardware. [RFC6428] provides an extension for BFD
      that includes the RDI indication in the BFD format and a
      specification of how this indication is to be used.

   Route Tracing
      [RFC6426] specifies that the LSP ping enhancements for MPLS-TP
      on-demand connectivity verification include information on the
      use of LSP ping for route tracing of an MPLS-TP path.

   Alarm Reporting
      [RFC6427] describes the details of a new protocol supporting
      Alarm Indication Signal (AIS), Link Down Indication, and fault
      management. Failure to support this functionality in forwarding
      hardware can potentially result in failure to meet protection
      recovery time requirements; hardware support is therefore
      strongly recommended.

   Lock Instruct
      Lock instruct is initiated on-demand and therefore need not be
      implemented in forwarding hardware. [RFC6435] defines a lock
      instruct protocol.

   Lock Reporting
      [RFC6427] covers lock reporting. Lock reporting need not be
      implemented in forwarding hardware.

   Diagnostic
      [RFC6435] defines protocol support for loopback. Loopback
      initiation is on-demand and therefore need not be implemented in
      forwarding hardware.
      Loopback of packet traffic SHOULD be
      implemented in forwarding hardware on high speed interfaces.

   Packet Loss and Delay Measurement
      [RFC6374] and [RFC6375] define a protocol and profile for packet
      loss measurement (LM) and delay measurement (DM). LM requires a
      very accurate capture and insertion of packet and byte counters
      when a packet is transmitted and capture of packet and byte
      counters when a packet is received. This capture and insertion
      MUST be implemented in forwarding hardware for LM OAM if high
      accuracy is needed. DM requires very accurate capture and
      insertion of a timestamp on transmission and capture of timestamp
      when a packet is received. This timestamp capture and insertion
      MUST be implemented in forwarding hardware for DM OAM if high
      accuracy is needed.

   See Section 2.6.2 for discussion of hardware support necessary for
   BFD and LSP Ping.

   CC-CV and alarm reporting are tied to protection and therefore SHOULD
   be supported in forwarding hardware in order to provide protection
   for a large number of affected LSP within target response intervals.
   Since CC-CV is supported by BFD, for MPLS-TP providing hardware
   assistance for BFD processing helps ensure that protection recovery
   time requirements can be met even for faults affecting a large number
   of LSP.

   MPLS-TP Protection State Coordination (PSC) is defined by [RFC6378]
   and updated by [I-D.ietf-mpls-psc-updates], correcting some errors in
   [RFC6378].

2.6.5.  MPLS OAM and Layer-2 OAM Interworking

   [RFC6670] provides the reasons for selecting a single MPLS-TP OAM
   solution and examines the consequences were ITU-T to develop a second
   OAM solution that is based on Ethernet encodings and mechanisms.

   [RFC6310] and [RFC7023] specify the mapping of defect states
   between many types of hardware Attachment Circuits (ACs) and
   associated Pseudowires (PWs).
   This functionality SHOULD be supported
   in forwarding hardware.

   It is beneficial if an MPLS OAM implementation can interwork with the
   underlying server layer and provide a means to interwork with a
   client layer. For example, [RFC6427] specifies an inter-layer
   propagation of AIS and LDI from MPLS server layer to client MPLS
   layers. Where the server layer is a Layer-2, such as Ethernet, PPP
   over SONET/SDH, or GFP over OTN, interworking among layers is also
   beneficial. For high speed interfaces, supporting this interworking
   in forwarding hardware helps ensure that protection based on this
   interworking can meet recovery time requirements even for faults
   affecting a large number of LSP.

2.6.6.  Extent of OAM Support by Hardware

   Where certain requirements must be met, such as relatively high CC-CV
   rates and a large number of interfaces, or strict protection recovery
   time requirements and a moderate number of affected LSP, some OAM
   functionality must be supported by forwarding hardware. In other
   cases, such as highly accurate LM and DM OAM or strict protection
   recovery time requirements with a large number of affected LSP, OAM
   functionality must be entirely implemented in forwarding hardware.

   Where possible, implementation in forwarding hardware should be in
   programmable hardware such that if standards are later changed or
   extended these changes are likely to be accommodated with hardware
   reprogramming rather than replacement.

   For some functionality there is a strong case for an implementation
   in dedicated forwarding hardware. Examples include packet and byte
   counters needed for LM OAM as well as needed for management
   protocols.
   Similarly the capture and insertion of packet and byte
   counts or timestamps needed for transmitted LM or DM or time
   synchronization packets MUST be implemented in forwarding hardware if
   high accuracy is required.

   For some functions there is a strong case for providing limited
   support in forwarding hardware, while an external general purpose
   processor may be used if performance criteria can be met. For example
   origination of RDI triggered by CC-CV, response to RDI, and
   Protection State Coordination (PSC) functionality may be supported by
   hardware, but expansion to a large number of client LSP and
   transmission of AIS or RDI to the client LSP may occur in a general
   purpose processor. Some forwarding hardware supports one or more
   on-chip general purpose processors which may be well suited for such
   a role. [I-D.ietf-mpls-psc-updates], being a very recent document
   that affects a protection state machine that requires hardware
   support, underscores the importance of having a degree of
   programmability in forwarding hardware.

   The customer (system supplier or provider) should not dictate design,
   but should independently validate target functionality and
   performance. However, it is not uncommon for service providers and
   system implementers to insist on reviewing design details (under NDA)
   due to past experiences with suppliers and to reject suppliers who
   are unwilling to provide details.

2.7.  Number and Size of Flows

   Service provider networks may carry up to hundreds of millions of
   flows on 10 Gb/s links. Most flows are very short lived, many under
   a second. A subset of the flows are low capacity and somewhat long
   lived. When Internet traffic dominates capacity a very small subset
   of flows are high capacity and/or very long lived.

   Two types of limitations with regard to number and size of flows have
   been observed.

   1.  Some hardware cannot handle some high capacity flows because of
       internal paths which are limited, such as per packet backplane
       paths or paths internal or external to chips such as buffer
       memory paths. Such designs can handle aggregates of smaller
       flows. Some hardware with acknowledged limitations has been
       successfully deployed but may be increasingly problematic if the
       capacity of large microflows in deployed networks continues to
       grow.

   2.  Some hardware approaches cannot handle a large number of flows,
       or a large number of large flows due to attempting to count per
       flow, rather than deal with aggregates of flows. Hash techniques
       scale with regard to number of flows due to a fixed hash size
       with many flows falling into the same hash bucket. Techniques
       that identify individual flows have been implemented but have
       never been successfully deployed for Internet traffic.

3.  Questions for Suppliers

   The following questions should be asked of a supplier. These
   questions are grouped into broad categories. The questions
   themselves are intended to be open-ended questions to the supplier.
   The tests in Section 4 are intended to verify whether the supplier
   disclosed any compliance or performance limitations completely and
   accurately.

3.1.  Basic Compliance

   Q#1  Can the implementation forward packets with an arbitrarily
        large stack depth? What limitations exist, and under what
        circumstances do further limitations come into play (such as high
        packet rate or specific features enabled or specific types of
        packet processing)? See Section 2.1.

   Q#2  Is the entire set of basic MPLS functionality described in
        Section 2.1 supported?

   Q#3  Is the set of MPLS special purpose labels handled correctly
        and with adequate performance? Are extended special purpose
        labels handled correctly and with adequate performance? See
        Section 2.1.1.

   Q#4  Are mappings of label value and TC to PHB handled correctly,
        including RFC3270 L-LSP mappings and RFC4124 CT mappings to PHB?
        See Section 2.1.2.

   Q#5  Is time synchronization adequately supported in forwarding
        hardware?

        a.  Are both PTP and NTP formats supported?

        b.  Is the accuracy of timestamp insertion and incoming stamping
            sufficient?

        See Section 2.1.3.

   Q#6  Is link bundling supported?

        a.  Can LSP be pinned to specific components?

        b.  Is the "all-ones" component link supported?

        See Section 2.1.5.

   Q#7  Is MPLS hierarchy supported?

        a.  Are both PHP and UHP supported? What limitations exist on
            the number of pop operations with UHP?

        b.  Are the pipe, short-pipe, and uniform models supported? Are
            TTL and TC values updated correctly at egress where
            applicable?

        See Section 2.1.6 regarding MPLS hierarchy. See [RFC3443]
        regarding PHP, UHP, and pipe, short-pipe, and uniform models.

   Q#8  Are pseudowire sequence numbers handled correctly? See
        Section 2.1.8.1.

   Q#9  Is VPN LER functionality handled correctly and without
        performance issues? See Section 2.1.9.

   Q#10 Is MPLS multicast (P2MP and MP2MP) handled correctly?

        a.  Are packets dropped on uncongested outputs if some outputs
            are congested?

        b.  Is performance limited in high fanout situations?

        See Section 2.2.

3.2.  Basic Performance

   Q#11 Can very small packets be forwarded at full line rate on all
        interfaces indefinitely? What limitations exist, and under what
        circumstances do further limitations come into play (such as
        specific features enabled or specific types of packet
        processing)?

   Q#12 Customers must decide whether to relax the prior requirement and
        to what extent. If the answer to the prior question indicates
        that limitations exist, then:

        a.  What is the smallest packet size where full line rate
            forwarding can be supported?

        b.  What is the longest burst of full rate small packets that can
            be supported?

        Specific circumstances (such as specific features enabled or
        specific types of packet processing) often impact these rates and
        burst sizes.

   Q#13 How many pop operations can be supported along with a swap
        operation at full line rate while maintaining per LSP packet and
        byte counts for each pop and swap? This requirement is
        particularly relevant for MPLS-TP.

   Q#14 How many label push operations can be supported? While this
        limitation is rarely an issue, it applies to both PHP and UHP,
        unlike the pop limit which applies to UHP.

   Q#15 For a worst case where all packets arrive on one LSP, what is
        the counter overflow time? Are any means provided to avoid
        polling all counters at short intervals? This applies to both
        MPLS and MPLS-TP.

3.3.  Multipath Capabilities and Performance

   Multipath capabilities and performance do not apply to MPLS-TP but
   apply to MPLS and apply if MPLS-TP is carried in MPLS.

   Q#16 How are large microflows accommodated? Is there active
        management of the hash space mapping to output ports? See
        Section 2.4.2.

   Q#17 How many MPLS labels can be included in a hash based on the MPLS
        label stack?

   Q#18 Is packet rate performance decreased beyond some number of
        labels?

   Q#19 Can the IP header and payload information below the MPLS stack
        be used in the hash? If so, which IP fields, payload types and
        payload fields are supported?

   Q#20 At what maximum MPLS label stack depth can Bottom of Stack and
        an IP header appear without impacting packet rate performance?

   Q#21 Are special purpose labels excluded from the label stack hash?
        Are extended special purpose labels excluded from the label
        stack hash? See Section 2.4.5.1.

   Q#22 How is multipath performance affected by high capacity flows or
        an extremely large number of flows, or by very short lived flows?
        See Section 2.7.

3.4.  Pseudowire Capabilities and Performance

   Q#23 Is the pseudowire control word supported?

   Q#24 What is the maximum rate of pseudowire encapsulation and
        decapsulation? Apply the same questions as in Basic Performance
        for any packet based pseudowire such as IP VPN or Ethernet.

   Q#25 Does inclusion of a pseudowire control word impact performance?

   Q#26 Are flow labels supported?

   Q#27 If so, what fields are hashed on for the flow label for
        different types of pseudowires?

   Q#28 Does inclusion of a flow label impact performance?

3.5.  Entropy Label Support and Performance

   Q#29 Can an entropy label be added when acting as an ingress LER and
        can it be removed when acting as an egress LER?

   Q#30 If so, what fields are hashed on for the entropy label?

   Q#31 Does adding or removing an entropy label impact packet rate
        performance?

   Q#32 Can an entropy label be detected in the label stack, used in the
        hash, and properly terminate the search for further information
        to hash on?

   Q#33 Does using an entropy label have any negative impact on
        performance? It should have no impact or a positive impact.

3.6.  DoS Protection

   Q#34 For each control and management plane protocol in use, what
        measures are taken to provide DoS attack hardening?

   Q#35 Have DoS attack tests been performed?

   Q#36 Can compromise of an internal computer on a management subnet be
        leveraged for any form of attack including DoS attack?

3.7.  OAM Capabilities and Performance

   Q#37 What OAM proactive and on-demand mechanisms are supported?

   Q#38 What performance limits exist under high proactive monitoring
        rates?

   Q#39 Can excessively high proactive monitoring rates impact control
        plane performance or cause control plane instability?

   Q#40 Ask the prior questions for each of the following.

        a.  MPLS OAM

        b.  Pseudowire OAM

        c.  MPLS-TP OAM

        d.  Layer-2 OAM Interworking

        See Section 2.6.2.

4.  Forwarding Compliance and Performance Testing

   Testing the packet rate performance of equipment supporting a large
   number of 10 Gb/s or 100 Gb/s links is not possible using desktop
   computers or workstations. The use of high end workstations as a
   source of test traffic was barely viable 20 years ago, but is no
   longer at all viable. Though custom microcode has been used on
   specialized router forwarding cards to serve the purpose of
   generating test traffic and measuring it, for the most part
   performance testing will require specialized test equipment. There
   are multiple sources of suitable equipment.

   The set of tests listed here does not correspond one-to-one to the
   set of questions in Section 3. The same categorization is used and
   these tests largely serve to validate answers provided to the prior
   questions, and can also provide answers where a supplier is unwilling
   to disclose compliance or performance.

   Performance testing is the domain of the IETF Benchmark Methodology
   Working Group (BMWG). Below are brief descriptions of conformance
   and performance tests. Some very basic tests are specified in
   [RFC5695] which partially cover only the basic performance test T#3.

   The following tests should be performed by the systems designer, or
   deployer, or performed by the supplier on their behalf if it is not
   practical for the potential customer to perform the tests directly.
   These tests are grouped into broad categories.

   The tests in Section 4.1 should be repeated under various conditions
   to retest basic performance when critical capabilities are enabled.
   Complete repetition of the performance tests enabling each capability
   and combinations of capabilities would be very time intensive,
   therefore a reduced set of performance tests can be used to gauge the
   impact of enabling specific capabilities.

4.1.  Basic Compliance

   T#1  Test forwarding at a high rate for packets with varying number
        of label entries. While packets with more than a dozen label
        entries are unlikely to be used in any practical scenario today,
        it is useful to know if limitations exist.

   T#2  For each of the questions listed under "Basic Compliance" in
        Section 3, verify the claimed compliance. For any functionality
        considered critical to a deployment, where applicable performance
        using each capability under load should be verified in addition
        to basic compliance.

4.2.  Basic Performance

   T#3  Test packet forwarding at full line rate with small packets.
        See [RFC5695]. The most likely case to fail is the smallest
        packet size. Also test with packet sizes in four byte increments
        ranging from payload sizes of 40 to 128 bytes.

   T#4  If the prior tests did not succeed for all packet sizes, then
        perform the following tests.

        a.  Increase the packet size by 4 bytes until a size is found
            that can be forwarded at full rate.

        b.  Inject bursts of consecutive small packets into a stream of
            larger packets. Allow some time for recovery between bursts.
            Increase the number of packets in the burst until packets are
            dropped.

   T#5  Send test traffic where a swap operation is required. Also set
        up multiple LSP carried over other LSP where the device under
        test (DUT) is the egress of these LSP. Create test packets such
        that the swap operation is performed after pop operations,
        increasing the number of pop operations until forwarding of small
        packets at full line rate can no longer be supported.
        Also check
        to see how many pop operations can be supported before the full
        set of counters can no longer be maintained. This requirement is
        particularly relevant for MPLS-TP.

   T#6  Send all traffic on one LSP and see if the counters become
        inaccurate. Often counters on silicon are much smaller than the
        64 bit packet and byte counters in various IETF MIBs. System
        developers should consider what counter polling rate is necessary
        to maintain accurate counters and whether those polling rates are
        practical. Relevant MIBs for MPLS are discussed in [RFC4221] and
        [RFC6639].

4.3.  Multipath Capabilities and Performance

   Multipath capabilities do not apply to MPLS-TP but apply to MPLS and
   apply if MPLS-TP is carried in MPLS.

   T#7  Send traffic at a rate well exceeding the capacity of a single
        multipath component link, and where entropy exists only below the
        top of stack. If only the top label is used this test will fail
        immediately.

   T#8  Move the labels with entropy down in the stack until either the
        full forwarding rate can no longer be supported or most or all
        packets try to use the same component link.

   T#9  Repeat the two tests above with the entropy contained in IP
        headers or IP payload fields below the label stack rather than in
        the label stack. Test with the set of IP headers or IP payload
        fields considered relevant to the deployment or to the target
        market.

   T#10 Determine whether traffic that contains a pseudowire control
        word is interpreted as IP traffic. Information in the payload
        MUST NOT be used in the load balancing if the first nibble of the
        packet is not 4 or 6 (IPv4 or IPv6).

   T#11 Determine whether special purpose labels and extended special
        purpose labels are excluded from the label stack hash. They MUST
        be excluded.

   T#12 Perform testing in the presence of combinations of:

        a.  Very large microflows.

        b.  Relatively short lived high capacity flows.

        c.  Extremely large numbers of flows.

        d.  Very short lived small flows.

4.4.  Pseudowire Capabilities and Performance

   T#13 Ensure that a pseudowire can be set up with a pseudowire label
        and pseudowire control word added at ingress and the pseudowire
        label and pseudowire control word removed at egress.

   T#14 For a pseudowire that contains variable length payload packets,
        repeat performance tests listed under "Basic Performance" for
        pseudowire ingress and egress functions.

   T#15 Repeat pseudowire performance tests with and without a
        pseudowire control word.

   T#16 Determine whether a pseudowire can be set up with a pseudowire
        label, flow label, and pseudowire control word added at ingress
        and the pseudowire label, flow label, and pseudowire control word
        removed at egress.

   T#17 Determine which payload fields are used to create the flow label
        and whether the set of fields and algorithm provide sufficient
        entropy for load balancing.

   T#18 Repeat pseudowire performance tests with flow labels included.

4.5.  Entropy Label Support and Performance

   T#19 Determine whether entropy labels can be added at ingress and
        removed at egress.

   T#20 Determine which fields are used to create an entropy label.
        Labels further down in the stack, including entropy labels
        further down and IP headers or IP payload fields where
        applicable, should be used. Determine whether the set of fields
        and algorithm provide sufficient entropy for load balancing.

   T#21 Repeat performance tests under "Basic Performance" when entropy
        labels are used, where ingress or egress is the device under test
        (DUT).

   T#22 Determine whether an ELI is detected when acting as a midpoint
        LSR and whether the search for further information on which to
        base the load balancing is used.
        Information below the entropy label SHOULD NOT be used.

   T#23 Ensure that the entropy label indicator and entropy label (ELI
        and EL) are removed from the label stack during UHP and PHP
        operations.

   T#24 Ensure that operations on the TC field when adding and removing
        an entropy label are correctly carried out. If TC is changed
        during a swap operation, the ability to transfer that change
        MUST be provided. The ability to suppress the transfer of TC
        MUST also be provided. See "pipe", "short pipe", and "uniform"
        models in [RFC3443].

   T#25 Repeat performance tests for a midpoint LSR with entropy labels
        found at various label stack depths.

4.6. DoS Protection

   T#26 Actively attack the LSR under high protocol churn load and
        determine control plane performance impact or successful DoS
        under test conditions. Specifically test for the following:

        a. TCP SYN attack against control plane and management plane
           protocols using TCP, including CLI access (typically SSH
           protected login), NETCONF, etc.

        b. High traffic volume attack against control plane and
           management plane protocols not using TCP.

        c. Attacks which can be performed from a compromised management
           subnet computer, but not one with authentication keys.

        d. Attacks which can be performed from a compromised peer
           within the control plane (internal domain and external
           domain). Assume that per peering keys and per router ID keys
           rather than network wide keys are in use.

        See Section 2.6.1.

4.7. OAM Capabilities and Performance

   T#27 Determine maximum sustainable rates of BFD traffic. If BFD
        requires CPU intervention, determine both maximum rates and CPU
        loading when multiple interfaces are active.

   T#28 Verify LSP Ping and LSP Traceroute capability.

   T#29 Determine maximum rates of MPLS-TP CC-CV traffic.
        If CC-CV requires CPU intervention, determine both maximum
        rates and CPU loading when multiple interfaces are active.

   T#30 Determine MPLS-TP DM precision.

   T#31 Determine MPLS-TP LM accuracy.

   T#32 Verify MPLS-TP AIS/RDI and Protection State Coordination (PSC)
        functionality, protection speed, and AIS/RDI notification speed
        when a large number of Management Entities (ME) must be
        notified with AIS/RDI.

5. Acknowledgements

   Numerous very useful comments have been received in private email.
   Some of these contributions are acknowledged here, approximately in
   chronological order.

   Paul Doolan provided a brief review resulting in a number of
   clarifications, most notably regarding on-chip vs. system buffering,
   100 Gb/s link speed assumptions in the 150 Mpps figure, and handling
   of large microflows. Pablo Frank reminded us of the sawtooth effect
   in PPS vs. packet size graphs, prompting the addition of a few
   paragraphs on this. Comments from Lou Berger at IETF-85 prompted the
   addition of Section 2.7.

   Valuable comments were received on the BMWG mailing list. Jay
   Karthik pointed out testing methodology hints that, after
   discussion, were deemed out of scope and were removed but may
   benefit later work in BMWG.

   Nabil Bitar pointed out the need to cover QoS (Differentiated
   Services), MPLS multicast (P2MP and MP2MP), and MPLS-TP OAM. Nabil
   also provided a number of clarifications to the questions and tests
   in Section 3 and Section 4.

   Mark Szczesniak provided a thorough review and a number of useful
   comments and suggestions that improved the document.

   Gregory Mirsky and Thomas Beckhaus provided useful comments during
   the MPLS RT review.

   Tal Mizrahi provided comments that prompted clarifications regarding
   timestamp processing, local delivery of packets, and the need for
   hardware assistance in processing OAM traffic.

   Alexander (Sasha) Vainshtein pointed out errors in Section 2.1.8.1
   and suggested new text which, after lengthy discussion, resulted in
   restating the summarization of requirements from PWE3 RFCs and more
   clearly stating the benefits and drawbacks of packet resequencing
   based on PW sequence number.

   Loa Andersson provided useful comments and corrections prior to
   WGLC. Adrian Farrel provided useful comments and corrections as part
   of the AD review.

   Discussion with Steve Kent during SecDir review resulted in
   expansion of Section 7, briefly summarizing security considerations
   related to forwarding in normative references. Tom Petch pointed out
   some editorial errors in private email. Al Morton during OpsDir
   review prompted clarification in the target audience section,
   suggested clearer wording in places, and found numerous editorial
   errors.

6. IANA Considerations

   This memo includes no request to IANA.

7. Security Considerations

   This document reviews forwarding behavior specified elsewhere and
   points out compliance and performance requirements. As such it
   introduces no new security requirements or concerns.

   Discussion of hardware support and other equipment hardening against
   DoS attack can be found in Section 2.6.1. Section 3.6 provides a
   list of questions regarding DoS to be asked of suppliers. Section
   4.6 suggests types of testing that can provide some assurance of the
   effectiveness of supplier DoS hardening claims.

   Knowledge of potential performance shortcomings may serve to help
   new implementations avoid pitfalls. It is unlikely that such
   knowledge could be the basis of new denial of service as these
   pitfalls are already widely known in the service provider community
   and among leading equipment suppliers.
   In practice extreme data and packet rates are needed to affect
   existing equipment and to affect networks that may still be
   vulnerable due to failure to implement adequate protection. The
   extreme data and packet rates make this type of denial of service
   unlikely and make undetectable denial of service of this type
   impossible.

   Each of the normative references contains security considerations.
   A brief summarization of MPLS security considerations applicable to
   forwarding follows:

   1.  MPLS encapsulation does not support an authentication extension.
       This is reflected in the security section of [RFC3032].
       Documents which clarify MPLS header fields such as TTL
       [RFC3443], the explicit null label [RFC4182], renaming EXP to TC
       [RFC5462], ECN for MPLS [RFC5129], and MPLS Ethernet
       encapsulation [RFC5332] make no changes to security
       considerations in [RFC3032].

   2.  Some cited RFCs are related to Diffserv forwarding. [RFC3270]
       refers to MPLS and Diffserv security. [RFC2474] mentions theft
       of service and denial of service due to mismarking. [RFC2474]
       also mentions IPsec interaction, but because MPLS is not carried
       by IP, this type of interaction is not relevant.

   3.  [RFC3209] is cited here only because of make-before-break
       forwarding requirements. This is related to resource sharing,
       and the theft of service and denial of service concerns in
       [RFC2474] apply.

   4.  [RFC4090] defines FRR which provides protection but does not add
       security concerns. [RFC4201] defines link bundling but raises no
       additional security concerns.

   5.  Various OAM control channels are defined in [RFC4385] (PW CW),
       [RFC5085] (VCCV), and [RFC5586] (G-Ach and GAL). These documents
       describe potential abuse of these OAM control channels.

   6.  [RFC4950] defines ICMP extensions when MPLS TTL expires and
       payload is IP.
       This provides MPLS header information which is of no use to an
       IP attacker, but sending this information can be suppressed
       through configuration.

   7.  GTSM [RFC5082] provides a means to improve protection against
       high traffic volume spoofing as a form of DoS attack.

   8.  BFD [RFC5880] [RFC5884] [RFC5885] provides a form of OAM used in
       MPLS and MPLS-TP. The security considerations related to the
       OAM control channel are relevant. The BFD payload supports
       authentication, unlike the MPLS encapsulation or the MPLS or PW
       control channel encapsulation in which it is carried. Where an
       IP return OAM path is used, IPsec is suggested as a means of
       securing the return path.

   9.  Other forms of OAM are supported by [RFC6374] [RFC6375] (Loss
       and Delay Measurement), [RFC6428] (Connectivity
       Check/Verification based on BFD), and [RFC6427] (Fault
       Management). The security considerations related to the OAM
       control channel are relevant. IP return paths, where used, can
       be secured with IPsec.

   10. Linear protection is defined by [RFC6378] and updated by
       [I-D.ietf-mpls-psc-updates]. Security concerns related to MPLS
       encapsulation and OAM control channels apply. Security concerns
       reiterate [RFC5920] as applied to protection switching.

   11. The PW Flow Label [RFC6391] and MPLS Entropy Label [RFC6790]
       affect multipath load balancing. Security concerns reiterate
       [RFC5920]. Security impacts would be limited to load
       distribution.

   MPLS security including data plane security is discussed in greater
   detail in [RFC5920] (MPLS/GMPLS Security Framework). The MPLS-TP
   security framework [RFC6941] builds upon this, focusing largely on
   the MPLS-TP OAM additions and OAM channels with some attention given
   to using network management in place of control plane setup.
   In both security framework documents MPLS is assumed to run within a
   "trusted zone", defined as being where a single service provider
   (SP) has total operational control over that part of the network.

   If control plane security and management plane security are
   sufficiently robust, compromise of a single network element may
   result in chaos in the data plane anywhere in the network through
   denial of service attacks, but not a Byzantine security failure in
   which other network elements are fully compromised.

   MPLS security, or the lack thereof, can affect whether traffic can
   be misrouted and lost, or intercepted, or intercepted and reinserted
   (a man-in-the-middle attack), or spoofed. End user applications,
   including control plane and management plane protocols used by the
   SP, are expected to make use of appropriate end-to-end
   authentication and, where appropriate, end-to-end encryption.

8. References

8.1. Normative References

   [I-D.ietf-mpls-psc-updates]
              Osborne, E., "Updates to PSC",
              draft-ietf-mpls-psc-updates-01 (work in progress),
              January 2014.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3032]  Rosen, E., Tappan, D., Fedorkow, G., Rekhter, Y.,
              Farinacci, D., Li, T., and A. Conta, "MPLS Label Stack
              Encoding", RFC 3032, January 2001.

   [RFC3209]  Awduche, D., Berger, L., Gan, D., Li, T., Srinivasan, V.,
              and G. Swallow, "RSVP-TE: Extensions to RSVP for LSP
              Tunnels", RFC 3209, December 2001.

   [RFC3270]  Le Faucheur, F., Wu, L., Davie, B., Davari, S., Vaananen,
              P., Krishnan, R., Cheval, P., and J. Heinanen,
              "Multi-Protocol Label Switching (MPLS) Support of
              Differentiated Services", RFC 3270, May 2002.

   [RFC3443]  Agarwal, P. and B. Akyol, "Time To Live (TTL) Processing
              in Multi-Protocol Label Switching (MPLS) Networks", RFC
              3443, January 2003.

   [RFC4090]  Pan, P., Swallow, G., and A. Atlas, "Fast Reroute
              Extensions to RSVP-TE for LSP Tunnels", RFC 4090, May
              2005.

   [RFC4182]  Rosen, E., "Removing a Restriction on the use of MPLS
              Explicit NULL", RFC 4182, September 2005.

   [RFC4201]  Kompella, K., Rekhter, Y., and L. Berger, "Link Bundling
              in MPLS Traffic Engineering (TE)", RFC 4201, October 2005.

   [RFC4385]  Bryant, S., Swallow, G., Martini, L., and D. McPherson,
              "Pseudowire Emulation Edge-to-Edge (PWE3) Control Word for
              Use over an MPLS PSN", RFC 4385, February 2006.

   [RFC4950]  Bonica, R., Gan, D., Tappan, D., and C. Pignataro, "ICMP
              Extensions for Multiprotocol Label Switching", RFC 4950,
              August 2007.

   [RFC5082]  Gill, V., Heasley, J., Meyer, D., Savola, P., and C.
              Pignataro, "The Generalized TTL Security Mechanism
              (GTSM)", RFC 5082, October 2007.

   [RFC5085]  Nadeau, T. and C. Pignataro, "Pseudowire Virtual Circuit
              Connectivity Verification (VCCV): A Control Channel for
              Pseudowires", RFC 5085, December 2007.

   [RFC5129]  Davie, B., Briscoe, B., and J. Tay, "Explicit Congestion
              Marking in MPLS", RFC 5129, January 2008.

   [RFC5332]  Eckert, T., Rosen, E., Aggarwal, R., and Y. Rekhter, "MPLS
              Multicast Encapsulations", RFC 5332, August 2008.

   [RFC5586]  Bocci, M., Vigoureux, M., and S. Bryant, "MPLS Generic
              Associated Channel", RFC 5586, June 2009.

   [RFC5880]  Katz, D. and D. Ward, "Bidirectional Forwarding Detection
              (BFD)", RFC 5880, June 2010.

   [RFC5884]  Aggarwal, R., Kompella, K., Nadeau, T., and G. Swallow,
              "Bidirectional Forwarding Detection (BFD) for MPLS Label
              Switched Paths (LSPs)", RFC 5884, June 2010.

   [RFC5885]  Nadeau, T. and C. Pignataro, "Bidirectional Forwarding
              Detection (BFD) for the Pseudowire Virtual Circuit
              Connectivity Verification (VCCV)", RFC 5885, June 2010.

   [RFC6374]  Frost, D. and S.
              Bryant, "Packet Loss and Delay Measurement for MPLS
              Networks", RFC 6374, September 2011.

   [RFC6375]  Frost, D. and S. Bryant, "A Packet Loss and Delay
              Measurement Profile for MPLS-Based Transport Networks",
              RFC 6375, September 2011.

   [RFC6378]  Weingarten, Y., Bryant, S., Osborne, E., Sprecher, N., and
              A. Fulignoli, "MPLS Transport Profile (MPLS-TP) Linear
              Protection", RFC 6378, October 2011.

   [RFC6391]  Bryant, S., Filsfils, C., Drafz, U., Kompella, V., Regan,
              J., and S. Amante, "Flow-Aware Transport of Pseudowires
              over an MPLS Packet Switched Network", RFC 6391, November
              2011.

   [RFC6427]  Swallow, G., Fulignoli, A., Vigoureux, M., Boutros, S.,
              and D. Ward, "MPLS Fault Management Operations,
              Administration, and Maintenance (OAM)", RFC 6427, November
              2011.

   [RFC6428]  Allan, D., Swallow, G., Ed., and J. Drake, Ed., "Proactive
              Connectivity Verification, Continuity Check, and Remote
              Defect Indication for the MPLS Transport Profile", RFC
              6428, November 2011.

   [RFC6790]  Kompella, K., Drake, J., Amante, S., Henderickx, W., and
              L. Yong, "The Use of Entropy Labels in MPLS Forwarding",
              RFC 6790, November 2012.

8.2. Informative References

   [ACK-compression]
              "Observations and Dynamics of a Congestion Control
              Algorithm: The Effects of Two-Way Traffic", Proc. ACM
              SIGCOMM, ACM Computer Communications Review (CCR) Vol 21,
              No 4, pp. 133-147, 1991.

   [I-D.ietf-mpls-in-udp]
              Building, K., Sheth, N., Yong, L., Pignataro, C., and F.
              Yongbing, "Encapsulating MPLS in UDP",
              draft-ietf-mpls-in-udp-05 (work in progress), January
              2014.

   [I-D.ietf-mpls-special-purpose-labels]
              Kompella, K., Andersson, L., and A. Farrel, "Allocating
              and Retiring Special Purpose MPLS Labels",
              draft-ietf-mpls-special-purpose-labels-03 (work in
              progress), July 2013.

   [I-D.ietf-tictoc-1588overmpls]
              Davari, S., Oren, A., Bhatia, M., Roberts, P., and L.
              Montini, "Transporting Timing messages over MPLS
              Networks", draft-ietf-tictoc-1588overmpls-05 (work in
              progress), June 2013.

   [RFC0791]  Postel, J., "Internet Protocol", STD 5, RFC 791, September
              1981.

   [RFC2474]  Nichols, K., Blake, S., Baker, F., and D. Black,
              "Definition of the Differentiated Services Field (DS
              Field) in the IPv4 and IPv6 Headers", RFC 2474, December
              1998.

   [RFC2475]  Blake, S., Black, D., Carlson, M., Davies, E., Wang, Z.,
              and W. Weiss, "An Architecture for Differentiated
              Services", RFC 2475, December 1998.

   [RFC2597]  Heinanen, J., Baker, F., Weiss, W., and J. Wroclawski,
              "Assured Forwarding PHB Group", RFC 2597, June 1999.

   [RFC3031]  Rosen, E., Viswanathan, A., and R. Callon, "Multiprotocol
              Label Switching Architecture", RFC 3031, January 2001.

   [RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
              of Explicit Congestion Notification (ECN) to IP", RFC
              3168, September 2001.

   [RFC3429]  Ohta, H., "Assignment of the 'OAM Alert Label' for
              Multiprotocol Label Switching Architecture (MPLS)
              Operation and Maintenance (OAM) Functions", RFC 3429,
              November 2002.

   [RFC3471]  Berger, L., "Generalized Multi-Protocol Label Switching
              (GMPLS) Signaling Functional Description", RFC 3471,
              January 2003.

   [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
              Jacobson, "RTP: A Transport Protocol for Real-Time
              Applications", STD 64, RFC 3550, July 2003.

   [RFC3828]  Larzon, L-A., Degermark, M., Pink, S., Jonsson, L-E., and
              G. Fairhurst, "The Lightweight User Datagram Protocol
              (UDP-Lite)", RFC 3828, July 2004.

   [RFC3985]  Bryant, S. and P. Pate, "Pseudo Wire Emulation
              Edge-to-Edge (PWE3) Architecture", RFC 3985, March 2005.

   [RFC4023]  Worster, T., Rekhter, Y., and E.
              Rosen, "Encapsulating MPLS in IP or Generic Routing
              Encapsulation (GRE)", RFC 4023, March 2005.

   [RFC4110]  Callon, R. and M. Suzuki, "A Framework for Layer 3
              Provider-Provisioned Virtual Private Networks (PPVPNs)",
              RFC 4110, July 2005.

   [RFC4124]  Le Faucheur, F., "Protocol Extensions for Support of
              Diffserv-aware MPLS Traffic Engineering", RFC 4124, June
              2005.

   [RFC4206]  Kompella, K. and Y. Rekhter, "Label Switched Paths (LSP)
              Hierarchy with Generalized Multi-Protocol Label Switching
              (GMPLS) Traffic Engineering (TE)", RFC 4206, October 2005.

   [RFC4221]  Nadeau, T., Srinivasan, C., and A. Farrel, "Multiprotocol
              Label Switching (MPLS) Management Overview", RFC 4221,
              November 2005.

   [RFC4340]  Kohler, E., Handley, M., and S. Floyd, "Datagram
              Congestion Control Protocol (DCCP)", RFC 4340, March 2006.

   [RFC4377]  Nadeau, T., Morrow, M., Swallow, G., Allan, D., and S.
              Matsushima, "Operations and Management (OAM) Requirements
              for Multi-Protocol Label Switched (MPLS) Networks", RFC
              4377, February 2006.

   [RFC4379]  Kompella, K. and G. Swallow, "Detecting Multi-Protocol
              Label Switched (MPLS) Data Plane Failures", RFC 4379,
              February 2006.

   [RFC4664]  Andersson, L. and E. Rosen, "Framework for Layer 2 Virtual
              Private Networks (L2VPNs)", RFC 4664, September 2006.

   [RFC4817]  Townsley, M., Pignataro, C., Wainner, S., Seely, T., and
              J. Young, "Encapsulation of MPLS over Layer 2 Tunneling
              Protocol Version 3", RFC 4817, March 2007.

   [RFC4875]  Aggarwal, R., Papadimitriou, D., and S. Yasukawa,
              "Extensions to Resource Reservation Protocol - Traffic
              Engineering (RSVP-TE) for Point-to-Multipoint TE Label
              Switched Paths (LSPs)", RFC 4875, May 2007.

   [RFC4928]  Swallow, G., Bryant, S., and L. Andersson, "Avoiding Equal
              Cost Multipath Treatment in MPLS Networks", BCP 128, RFC
              4928, June 2007.

   [RFC4960]  Stewart, R., "Stream Control Transmission Protocol", RFC
              4960, September 2007.

   [RFC5036]  Andersson, L., Minei, I., and B. Thomas, "LDP
              Specification", RFC 5036, October 2007.

   [RFC5317]  Bryant, S. and L. Andersson, "Joint Working Team (JWT)
              Report on MPLS Architectural Considerations for a
              Transport Profile", RFC 5317, February 2009.

   [RFC5462]  Andersson, L. and R. Asati, "Multiprotocol Label Switching
              (MPLS) Label Stack Entry: "EXP" Field Renamed to "Traffic
              Class" Field", RFC 5462, February 2009.

   [RFC5640]  Filsfils, C., Mohapatra, P., and C. Pignataro,
              "Load-Balancing for Mesh Softwires", RFC 5640, August
              2009.

   [RFC5695]  Akhter, A., Asati, R., and C. Pignataro, "MPLS Forwarding
              Benchmarking Methodology for IP Flows", RFC 5695, November
              2009.

   [RFC5860]  Vigoureux, M., Ward, D., and M. Betts, "Requirements for
              Operations, Administration, and Maintenance (OAM) in MPLS
              Transport Networks", RFC 5860, May 2010.

   [RFC5905]  Mills, D., Martin, J., Burbank, J., and W. Kasch, "Network
              Time Protocol Version 4: Protocol and Algorithms
              Specification", RFC 5905, June 2010.

   [RFC5920]  Fang, L., "Security Framework for MPLS and GMPLS
              Networks", RFC 5920, July 2010.

   [RFC6291]  Andersson, L., van Helvoort, H., Bonica, R., Romascanu,
              D., and S. Mansfield, "Guidelines for the Use of the "OAM"
              Acronym in the IETF", BCP 161, RFC 6291, June 2011.

   [RFC6310]  Aissaoui, M., Busschbach, P., Martini, L., Morrow, M.,
              Nadeau, T., and Y(J). Stein, "Pseudowire (PW) Operations,
              Administration, and Maintenance (OAM) Message Mapping",
              RFC 6310, July 2011.

   [RFC6371]  Busi, I. and D. Allan, "Operations, Administration, and
              Maintenance Framework for MPLS-Based Transport Networks",
              RFC 6371, September 2011.

   [RFC6388]  Wijnands, IJ., Minei, I., Kompella, K., and B.
              Thomas, "Label Distribution Protocol Extensions for
              Point-to-Multipoint and Multipoint-to-Multipoint Label
              Switched Paths", RFC 6388, November 2011.

   [RFC6424]  Bahadur, N., Kompella, K., and G. Swallow, "Mechanism for
              Performing Label Switched Path Ping (LSP Ping) over MPLS
              Tunnels", RFC 6424, November 2011.

   [RFC6425]  Saxena, S., Swallow, G., Ali, Z., Farrel, A., Yasukawa,
              S., and T. Nadeau, "Detecting Data-Plane Failures in
              Point-to-Multipoint MPLS - Extensions to LSP Ping", RFC
              6425, November 2011.

   [RFC6426]  Gray, E., Bahadur, N., Boutros, S., and R. Aggarwal, "MPLS
              On-Demand Connectivity Verification and Route Tracing",
              RFC 6426, November 2011.

   [RFC6435]  Boutros, S., Sivabalan, S., Aggarwal, R., Vigoureux, M.,
              and X. Dai, "MPLS Transport Profile Lock Instruct and
              Loopback Functions", RFC 6435, November 2011.

   [RFC6438]  Carpenter, B. and S. Amante, "Using the IPv6 Flow Label
              for Equal Cost Multipath Routing and Link Aggregation in
              Tunnels", RFC 6438, November 2011.

   [RFC6478]  Martini, L., Swallow, G., Heron, G., and M. Bocci,
              "Pseudowire Status for Static Pseudowires", RFC 6478, May
              2012.

   [RFC6639]  King, D. and M. Venkatesan, "Multiprotocol Label Switching
              Transport Profile (MPLS-TP) MIB-Based Management
              Overview", RFC 6639, June 2012.

   [RFC6669]  Sprecher, N. and L. Fang, "An Overview of the Operations,
              Administration, and Maintenance (OAM) Toolset for
              MPLS-Based Transport Networks", RFC 6669, July 2012.

   [RFC6670]  Sprecher, N. and KY. Hong, "The Reasons for Selecting a
              Single Solution for MPLS Transport Profile (MPLS-TP)
              Operations, Administration, and Maintenance (OAM)", RFC
              6670, July 2012.

   [RFC6720]  Pignataro, C. and R. Asati, "The Generalized TTL Security
              Mechanism (GTSM) for the Label Distribution Protocol
              (LDP)", RFC 6720, August 2012.

   [RFC6829]  Chen, M., Pan, P., Pignataro, C., and R. Asati, "Label
              Switched Path (LSP) Ping for Pseudowire Forwarding
              Equivalence Classes (FECs) Advertised over IPv6", RFC
              6829, January 2013.

   [RFC6941]  Fang, L., Niven-Jenkins, B., Mansfield, S., and R.
              Graveman, "MPLS Transport Profile (MPLS-TP) Security
              Framework", RFC 6941, April 2013.

   [RFC7023]  Mohan, D., Bitar, N., Sajassi, A., DeLord, S., Niger, P.,
              and R. Qiu, "MPLS and Ethernet Operations, Administration,
              and Maintenance (OAM) Interworking", RFC 7023, October
              2013.

   [RFC7074]  Berger, L. and J. Meuric, "Revised Definition of the GMPLS
              Switching Capability and Type Fields", RFC 7074, November
              2013.

   [RFC7079]  Del Regno, N. and A. Malis, "The Pseudowire (PW) and
              Virtual Circuit Connectivity Verification (VCCV)
              Implementation Survey Results", RFC 7079, November 2013.

Appendix A. Organization of References Section

   The References section is split into Normative and Informative
   subsections. References that directly specify forwarding
   encapsulations or behaviors are listed as normative. References
   which describe signaling only, though normative with respect to
   signaling, are listed as informative. They are informative with
   respect to MPLS forwarding.

Authors' Addresses

   Curtis Villamizar (editor)
   Outer Cape Cod Network Consulting, LLC

   Email: curtis@occnc.com

   Kireeti Kompella
   Juniper Networks

   Email: kireeti@juniper.net

   Shane Amante
   Apple Inc.
   1 Infinite Loop
   Cupertino, California 95014

   Email: samante@apple.com

   Andrew Malis
   Huawei Technologies

   Email: agmalis@gmail.com

   Carlos Pignataro
   Cisco Systems
   7200-12 Kit Creek Road
   Research Triangle Park, NC 27709
   US

   Email: cpignata@cisco.com