idnits 2.17.1

draft-ietf-trill-oam-req-05.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (January 26, 2013) is 4106 days in the past.  Is this
     intentional?

  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Unused Reference: 'RFC5101' is defined on line 434, but no explicit
     reference was found in the text

  == Unused Reference: '8021Q' is defined on line 450, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC4377' is defined on line 454, but no explicit
     reference was found in the text

  -- Obsolete informational reference (is this intentional?): RFC 5101
     (Obsoleted by RFC 7011)

  -- Obsolete informational reference (is this intentional?): RFC 2680
     (Obsoleted by RFC 7680)

  -- Obsolete informational reference (is this intentional?): RFC 2679
     (Obsoleted by RFC 7679)

     Summary: 0 errors (**), 0 flaws (~~), 4 warnings (==), 4 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.
--------------------------------------------------------------------------------

1    TRILL Working Group                                  Tissa Senevirathne
2    Internet Draft                                                    CISCO
3    Intended status: Informational                               David Bond
4                                                                        IBM
5                                                                 Sam Aldrin
6                                                                  Yizhou Li
7                                                                     Huawei
8                                                                Rohit Watve
9                                                                      CISCO

11                                                          January 26, 2013
12   Expires: July 2013

14      Requirements for Operations, Administration and Maintenance (OAM) in
15            TRILL (Transparent Interconnection of Lots of Links)
16                       draft-ietf-trill-oam-req-05

18   Status of this Memo

20      This Internet-Draft is submitted in full conformance with the
21      provisions of BCP 78 and BCP 79.

23      Internet-Drafts are working documents of the Internet Engineering
24      Task Force (IETF), its areas, and its working groups. Note that
25      other groups may also distribute working documents as
26      Internet-Drafts.

28      Internet-Drafts are draft documents valid for a maximum of six
29      months and may be updated, replaced, or obsoleted by other documents
30      at any time. It is inappropriate to use Internet-Drafts as
31      reference material or to cite them other than as "work in progress."

33      The list of current Internet-Drafts can be accessed at
34      http://www.ietf.org/ietf/1id-abstracts.txt

36      The list of Internet-Draft Shadow Directories can be accessed at
37      http://www.ietf.org/shadow.html

39      This Internet-Draft will expire on July 26, 2013.

41   Copyright Notice

43      Copyright (c) 2013 IETF Trust and the persons identified as the
44      document authors. All rights reserved.

46      This document is subject to BCP 78 and the IETF Trust's Legal
47      Provisions Relating to IETF Documents
48      (http://trustee.ietf.org/license-info) in effect on the date of
49      publication of this document. Please review these documents
50      carefully, as they describe your rights and restrictions with
51      respect to this document. Code Components extracted from this
52      document must include Simplified BSD License text as described in
53      Section 4.e of the Trust Legal Provisions and are provided without
54      warranty as described in the Simplified BSD License.

56   Abstract

58      OAM (Operations, Administration and Maintenance) is a general term
59      used to identify functions and toolsets to troubleshoot and monitor
60      networks. This document presents OAM Requirements applicable to
61      TRILL.

63   Table of Contents

65      1. Introduction
66         1.1. Scope
67      2. Conventions used in this document
68      3. Terminology
69      4. OAM Requirements
70         4.1. Data Plane
71         4.2. Connectivity Verification
72            4.2.1. Unicast
73            4.2.2. Distribution Trees
74         4.3. Continuity Check
75         4.4. Path Tracing
76         4.5. General Requirements
77         4.6. Performance Monitoring
78            4.6.1. Packet Loss
79            4.6.2. Packet Delay
80         4.7. ECMP Utilization
81         4.8. Security and Operational considerations
82         4.9. Fault Indications
83         4.10. Defect Indications
84         4.11. Live Traffic monitoring
85      5. Security Considerations
86      6. IANA Considerations
87      7. References
88         7.1. Normative References
89         7.2. Informative References
91      8. Acknowledgments
92      9. Authors
93      10. Contributors

95   1. Introduction

97      OAM (Operations, Administration and Maintenance) generally covers
98      various production aspects of a network. In this document we use the
99      term OAM as defined in [RFC6291].

101     Success of network operations depends on the ability to proactively
102     monitor the network for faults, performance, etc., as well as the
103     ability to efficiently and quickly troubleshoot defects and failures.
104     A well-defined OAM toolset is a vital requirement for wider adoption
105     of TRILL (Transparent Interconnection of Lots of Links) as the next
106     generation data forwarding technology in larger networks such as
107     data centers.

109     In this document we define the requirements for TRILL OAM. It is
110     assumed that the readers are familiar with the OAM concepts and
111     terminologies defined in other OAM standards such as [8021ag] and
112     [RFC5860]. This document does not attempt to redefine the terms and
113     concepts specified elsewhere.

115  1.1. Scope

117     The scope of this document is OAM between RBridges of a TRILL campus
118     over links selected by TRILL routing.

120  2. Conventions used in this document

122     The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
123     "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
124     document are to be interpreted as described in RFC-2119 [RFC2119].
125     Although this document is not a protocol specification, the use of
126     this language clarifies the instructions to protocol designers
127     producing solutions that satisfy the requirements set out in this
128     document.

130  3. Terminology

132     Section: The term Section refers to a segment of a path between any
133     two given RBridges. As an example, consider the case where RB1 is
134     connected to RBx via RB2, RB3 and RB4. The segment between RB2 and
135     RB4 is referred to as a Section of the path RB1 to RBx. More details
136     of the "section" definition can be found in [RFC5960].

137     Flow: The term Flow indicates a set of packets that share the same
138     path and per-hop behavior (such as priority). A flow is typically
139     identified by a portion of the inner payload that affects the
140     hop-by-hop forwarding decisions. This may contain Layer 2 through
141     Layer 4 information.

143     All Selectable Least Cost Paths: The term "all selectable least cost
144     paths" refers to a subset of all potentially available least cost
145     paths to a specified destination RBridge that are available (and
146     usable) for forwarding of frames. It is important to note that, in
147     practice, due to limitations in implementations, not all available
148     least cost paths may be selectable for forwarding.

150     Connectivity: The term connectivity indicates reachability between
151     an arbitrary RBridge RB1 and any other RBridge RB2. The specific
152     path can be either explicit (i.e., associated with a specific flow)
153     or unspecified. Unspecified means that messages used for
154     connectivity verification take whatever path the RBs happen to
155     select. Please refer to [OAMOVER] for details.

157     Continuity Verification: Continuity Verification refers to proactive
158     verification of liveness between two RBridges at periodic
159     intervals and generation of explicit notification when Connectivity
160     failures occur. Please refer to [OAMOVER] for details.

162     Fault: The term Fault refers to an inability to perform a required
163     action, e.g., an unsuccessful attempt to deliver a packet. Please
164     refer to [TERMTP] for definition.

166     Defect: The term Defect refers to an interruption in the normal
167     operation, such that over a period of time no packets are delivered
168     successfully. Please refer to [TERMTP] for definition.

170     Failure: The term Failure refers to the termination of the required
171     function over a longer period of time. Persistence of a defect for a
172     period of time is interpreted as a failure. Please refer to [TERMTP]
173     for definition.

175     Simulated Flow: The term simulated flow refers to a sequence of OAM
176     generated packets designed to follow a specific path. The fields of
177     the packets in the simulated flow may or may not be identical to the
178     fields of data packets of an actual flow being simulated. However,
179     the purpose of the simulated flow is to have OAM packets of the
180     simulated flow follow a specific path.

182  4. OAM Requirements

184  4.1. Data Plane

186     OAM frames, utilized for connectivity verification, continuity
187     checks, performance measurements, etc., will by default take
188     whatever path TRILL chooses based on the current topology and
189     per-hop equal cost path choices. In some cases, it may be required
190     that the OAM frames utilize specific paths. Thus, it MUST be possible
191     to arrange that OAM frames follow the path taken by a specific flow.

193     RBridges MUST have the ability to identify frames which require OAM
194     processing.

196     TRILL OAM frames MUST remain within a TRILL campus and MUST NOT be
197     egressed from a TRILL network as native frames.

199     OAM MUST have the ability to include all Ethernet traffic types
200     carried by TRILL.

202  4.2. Connectivity Verification

204  4.2.1. Unicast

206     From an arbitrary RBridge RB1, OAM MUST have the ability to verify
207     connectivity to any other RBridge RB2.

209     From an arbitrary RBridge RB1, OAM MUST have the ability to verify
210     connectivity to any other RBridge RB2 for a specific flow via the
211     path associated with the specified flow.

213  4.2.2. Distribution Trees

215     OAM MUST have the ability to verify connectivity, from an arbitrary
216     RBridge RB1, to either a specific set of RBridges or all member
217     RBridges, for a specified distribution tree. This functionality is
218     referred to as verification of the un-pruned distribution tree.

220     OAM MUST have the ability to verify connectivity, from an arbitrary
221     RBridge RB1, to either a specific set of RBridges or all member
222     RBridges, for a specified distribution tree and for a specified
223     flow. This functionality is referred to as verification of the
224     pruned tree.

226  4.3. Continuity Check

228     OAM MUST provide functions that allow any arbitrary RBridge RB1 to
229     perform a Continuity Check to any other RBridge.

231     OAM MUST provide functions that allow any arbitrary RBridge RB1 to
232     perform a Continuity Check to any other RBridge using a path
233     associated with a specified flow.

235     OAM SHOULD provide functions that allow any arbitrary RBridge to
236     perform a Continuity Check to any other RBridge over any section of
237     any selectable least cost path.

239     OAM SHOULD provide the ability to perform a Continuity Check on
240     sections of any selectable path within the network.

242     OAM SHOULD provide the ability to perform a multicast Continuity
243     Check for specified distribution tree(s) as well as specified
244     distribution tree and flow combinations. The former is referred to
245     as an un-pruned multi-destination tree Continuity Check and the
246     latter is referred to as a pruned tree Continuity Check.

248  4.4. Path Tracing

250     OAM MUST provide the ability to trace a path between any two
251     RBridges per specified unicast flow.

253     OAM SHOULD provide the ability to trace all selectable least cost
254     paths between any two RBridges.

256     OAM SHOULD provide functionality to trace all branches of a
257     specified distribution tree (un-pruned tree).

259     OAM SHOULD provide functionality to trace all branches of a
260     specified distribution tree for a specified flow (pruned tree).

262  4.5. General Requirements

264     OAM MUST provide the ability to initiate and maintain multiple
265     concurrent sessions for multiple OAM functions between any arbitrary
266     RBridge RB1 and any other RBridge. In general, multiple OAM
267     operations will run concurrently. For example, proactive continuity
268     checks may take place between RB1 and RB2 at the same time an
269     operator decides to test connectivity between the same two RBs.
270     Multiple OAM functions and instances of those functions MUST be able
271     to run concurrently without interfering with each other.

273     OAM MUST provide a single OAM framework for all TRILL OAM functions
274     within the scope of this document.

276     OAM, as practical and as possible, SHOULD reuse functional,
277     operational and semantic elements of existing OAM standards.

279     OAM MUST maintain related error and operational counters. Such
280     counters MUST be accessible via network management applications
281     (e.g., SNMP).

283     OAM functions related to continuity and connectivity checks MUST be
284     able to be invoked either proactively or on-demand.

286     OAM MAY be required to provide the ability to specify a desired
287     response mode for a specific OAM message. The desired response mode
288     can be either in-band, out-of-band or none.

290     The OAM Framework MUST be extensible to include new functionality.
291     For example, the solution needs to include a Version number to
292     differentiate older and newer implementations and TLV structures for
293     flexibility to include new information elements.

295     OAM MAY provide methods to verify control plane and forwarding plane
296     alignments.

298     OAM SHOULD leverage existing OAM technologies, where practical.

300  4.6. Performance Monitoring

302  4.6.1. Packet Loss

304     In this document, the term loss of a packet is used as defined in
305     [RFC2680] (see Section 2.4 of RFC2680).

307     OAM SHOULD provide the ability to measure packet loss statistics for
308     a flow from any arbitrary RBridge RB1 to any other RBridge.

310     OAM SHOULD provide the ability to measure packet loss statistics
311     over a section, for a flow from any arbitrary RBridge RB1 to any
312     other RBridge.

314     OAM SHOULD provide the ability to measure packet loss statistics
315     between any two RBridges over all least cost paths.

317     An RBridge SHOULD be able to perform the above packet loss
318     measurement functions either proactively or on-demand.

320  4.6.2. Packet Delay

322     There are two types of packet delays -- one-way delay and two-way
323     delay (Round Trip Delay).

325     One-way delay is defined in [RFC2679] as the time elapsed from the
326     start of transmission of the first bit of a packet by an RBridge
327     until the reception of the last bit of the packet by the destination
328     RBridge.

330     Two-way delay is also referred to as Round Trip Delay and is defined
331     similarly to [RFC2681]; i.e., the time elapsed from the start of
332     transmission of the first bit of a packet from RB1, through receipt
333     of the packet at RB2 and RB2 sending a response packet back to RB1,
334     until RB1 receives the last bit of that response packet.

336     OAM SHOULD provide functions to measure two-way delay between two
337     RBridges.

339     OAM MAY provide functions to measure one-way delay between two
340     RBridges for a specified flow.

342     OAM MAY provide functions to measure one-way delay between two
343     RBridges for a specified flow over a specific section.

345  4.7. ECMP Utilization

347     OAM MAY provide functionality to monitor the effectiveness of
348     per-hop ECMP hashing. For example, individual RBridges could maintain
349     counters that show how packets are being distributed across equal
350     cost next hops for a specified destination RBridge or RBridges as a
351     result of ECMP hashing.
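
The per-next-hop counters suggested in Section 4.7 can be pictured with a
minimal Python sketch. It is illustrative only and not part of the
requirements: the class EcmpCounters, the function pick_next_hop, the use
of CRC32 as the hash, and the RBridge names are all assumptions made for
this example. Real RBridges use implementation-specific hashing.

```python
import zlib
from collections import Counter
from typing import Dict, List


def pick_next_hop(flow_key: bytes, next_hops: List[str]) -> str:
    """Pick one equal-cost next hop from a hash of the flow key.

    CRC32 is a stand-in (assumption); it only makes the choice
    deterministic per flow, as per-hop ECMP hashing is expected to be.
    """
    return next_hops[zlib.crc32(flow_key) % len(next_hops)]


class EcmpCounters:
    """Hypothetical per-destination, per-next-hop packet counters."""

    def __init__(self) -> None:
        # destination RBridge -> (next hop -> packets forwarded)
        self._counts: Dict[str, Counter] = {}

    def record(self, destination: str, next_hop: str, packets: int = 1) -> None:
        """Count packets sent toward `destination` via `next_hop`."""
        self._counts.setdefault(destination, Counter())[next_hop] += packets

    def shares(self, destination: str) -> Dict[str, float]:
        """Fraction of packets per next hop for one destination."""
        counts = self._counts.get(destination, Counter())
        total = sum(counts.values())
        if total == 0:
            return {}
        return {hop: n / total for hop, n in counts.items()}

    def imbalance(self, destination: str, next_hops: List[str]) -> float:
        """Busiest hop's share divided by the ideal share (1 / hop count).

        A value of 1.0 means a perfectly even spread; larger values
        mean the hashing is concentrating traffic on one next hop.
        """
        shares = self.shares(destination)
        if not shares:
            return 0.0
        return max(shares.values()) * len(next_hops)
```

An operator-facing tool could read such counters periodically and flag
destinations whose imbalance stays well above 1.0, which is one way to
judge the "effectiveness" of the hashing without inspecting the hash
function itself.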

353  4.8. Security and Operational considerations

355     Methods MUST be provided to protect against exploitation of the OAM
356     framework for security and denial of service attacks.

358     Methods MUST be provided to prevent OAM messages causing congestion
359     in the networks. Periodically generated messages with high
360     frequencies may lead to congestion, hence methods such as shaping or
361     rate limiting SHOULD be utilized.

363     Certain OAM functions may be utilized to gather operational
364     information such as topology of the network. Methods MUST be
365     provided to prevent unauthorized users accessing OAM functions to
366     gather critical and sensitive information of the network.

368     OAM packets MUST be limited to within the TRILL campus and
369     implementations MUST provide methods to prevent leaking of OAM
370     packets out of the TRILL campus. Additionally, methods MUST be
371     provided to prevent accepting OAM packets from outside the TRILL
372     campus.

374  4.9. Fault Indications

376     OAM MUST provide a Fault Indication framework to report faults to
377     the ingress RBridge of the packet or other interested parties (such
378     as syslog servers).

380     OAM MUST provide functions to selectively enable or disable
381     different types of Fault Indications.

383  4.10. Defect Indications

385     OAM SHOULD provide a framework for Defect Detection and Indication.

387     OAM Defect Detection and Indication Framework SHOULD provide methods
388     to selectively enable or disable Defect Detection per defect type.

390     OAM Defect Detection and Indication Framework SHOULD provide methods
391     to configure Defect Detection thresholds per different types of
392     defects.

394     OAM Defect Detection and Indication Framework SHOULD provide methods
395     to log defect indications to a locally defined archive (such as log
396     buffer) or SNMP traps.
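
The three Defect Detection capabilities listed so far in Section 4.10
(per-type enable/disable, per-type thresholds, and logging to a local
buffer) can be sketched in a few lines of Python. This is a hedged
illustration, not a proposed design: DefectPolicy, DefectDetector, the
defect type names, and the choice of "consecutive failed checks" as the
threshold unit are all assumptions introduced for the example.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Dict


@dataclass
class DefectPolicy:
    """Hypothetical per-defect-type settings."""
    enabled: bool = True
    threshold: int = 3  # consecutive failed checks before a defect is declared


class DefectDetector:
    """Tracks consecutive failures per defect type and logs declared defects."""

    def __init__(self, policies: Dict[str, DefectPolicy], log_size: int = 100) -> None:
        self.policies = policies
        self.misses: Dict[str, int] = {name: 0 for name in policies}
        # Bounded local log buffer; the oldest entries are dropped first.
        self.log: Deque[str] = deque(maxlen=log_size)

    def observe(self, defect_type: str, ok: bool) -> bool:
        """Feed one check result; return True when a defect is newly declared."""
        policy = self.policies[defect_type]
        if not policy.enabled:
            return False  # detection disabled for this defect type
        if ok:
            self.misses[defect_type] = 0  # a success clears the streak
            return False
        self.misses[defect_type] += 1
        if self.misses[defect_type] == policy.threshold:
            self.log.append(
                f"defect {defect_type}: {policy.threshold} consecutive failures"
            )
            return True
        return False
```

Counting consecutive failures mirrors the Terminology section, where a
Defect is an interruption that persists over a period of time rather
than a single failed delivery. An SNMP trap could be emitted at the
same point where the sketch appends to its log buffer.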

398     OAM Defect Detection and Indication Framework SHOULD provide a
399     Remote Defect Indication framework that facilitates notifying the
400     originator/owner of the flow experiencing the defect, which is the
401     ingress RBridge.

403     Remote Defect Indication MAY be either in-band or out-of-band.

405  4.11. Live Traffic monitoring

407     OAM implementations MAY provide methods to utilize live traffic for
408     troubleshooting and performance monitoring.

410  5. Security Considerations

412     Security Requirements are specified in Section 4.8. For general
413     TRILL security considerations please refer to [RFC6325].

415  6. IANA Considerations

417     None

419  7. References

421  7.1. Normative References

423     [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
424               Requirement Levels", BCP 14, RFC 2119, March 1997.

426     [RFC6291] Andersson, L., et al., "Guidelines for the Use of the "OAM"
427               Acronym in the IETF", RFC 6291, June 2011.

429  7.2. Informative References

431     [RFC6325] Perlman, R., et al., "Routing Bridges (RBridges): Base
432               Protocol Specification", RFC 6325, July 2011.

434     [RFC5101] Claise, B., "Specification of the IP Flow Information
435               Export (IPFIX) Protocol for the Exchange of IP Traffic
436               Flow Information", RFC 5101, January 2008.

438     [RFC2680] Almes, G., et al., "A One-way Packet Loss Metric for IPPM",
439               RFC 2680, September 1999.

441     [RFC2679] Almes, G., et al., "A One-way Delay Metric for IPPM", RFC
442               2679, September 1999.

444     [RFC2681] Almes, G., et al., "A Round-trip Delay Metric for IPPM",
445               RFC 2681, September 1999.

447     [8021ag]  IEEE, "Virtual Bridged Local Area Networks Amendment 5:
448               Connectivity Fault Management", 802.1ag, 2007.

450     [8021Q]   IEEE, "Media Access Control (MAC) Bridges and Virtual
451               Bridged Local Area Networks", IEEE Std 802.1Q-2011,
452               August 2011.

454     [RFC4377] Nadeau, T., et al., "Operations and Management (OAM)
455               Requirements for Multi-protocol Label Switched
456               (MPLS) Networks", RFC 4377, February 2006.

458     [OAMOVER] Mizrahi, T., et al., "An Overview of Operations,
459               Administration, and Maintenance (OAM) Mechanisms",
460               draft-ietf-opsawg-oam-overview, Work in Progress, March 2012.

462     [RFC5860] Vigoureux, M., et al., "Requirements for Operations,
463               Administration and Maintenance (OAM) in MPLS Transport
464               Networks", RFC 5860, May 2010.

466     [TERMTP]  Helvoort, H., et al., "A Thesaurus for the Terminology used
467               in Multiprotocol Label Switching Transport Profile
468               (MPLS-TP) drafts/RFCs and ITU-T's Transport Network
469               Recommendations", draft-ietf-mpls-tp-rosetta-stone, Work
470               in Progress, July 2012.

472     [RFC5960] Frost, D., et al., "MPLS Transport Profile Data Plane
473               Architecture", RFC 5960, August 2010.

475  8. Acknowledgments

477     Special acknowledgments to the IEEE 802.1 chair, Tony Jeffree, for
478     allowing us to solicit comments from the IEEE 802.1 group. Also
479     recognized are the comments received from the IEEE group, the IESG,
480     Stewart Bryant, Ralph Droms, Adrian Farrel, Benoit Claise, Ayal Lior
481     and others.

483     This document was prepared using 2-Word-v2.0.template.dot.

485  9. Authors

487     Tissa Senevirathne
488     CISCO Systems
489     375 East Tasman Drive
490     San Jose, CA 95134
491     USA

493     Phone: +1-408-853-2291
494     Email: tsenevir@cisco.com

495     David Bond
496     IBM
497     4400 North 1st Street
498     San Jose, CA 95134
499     USA

501     Phone: +1-603-339-7575
502     Email: mokon@mokon.net

504     Sam Aldrin
505     Huawei Technologies
506     2330 Central Express Way
507     Santa Clara, CA 95951
508     USA

510     Email: aldrin.ietf@gmail.com

512     Yizhou Li
513     Huawei Technologies
514     101 Software Avenue,
515     Nanjing 210012
516     China

518     Phone: +86-25-56625375
519     Email: liyizhou@huawei.com

521     Rohit Watve
522     CISCO Systems
523     375 East Tasman Drive
524     San Jose, CA 95134
525     USA

527     Phone: +1-408-424-2091
528     Email: rwatve@cisco.com

530  10. Contributors

532     Thomas Narten
533     IBM Corporation
534     3039 Cornwallis Avenue,
535     PO Box 12195
536     Research Triangle Park, NC 27709
537     USA

539     Email: narten@us.ibm.com

541     Donald Eastlake
542     Huawei Technologies
543     155 Beaver Street,
544     Milford, MA 01757
545     USA

547     Email: d3e3e3@gmail.com

549     Anoop Ghanwani
550     DELL
551     350 Holger Way
552     San Jose, CA 95134
553     USA

555     Phone: +1-408-571-3500
556     Email: Anoop@alumni.duke.edu

558     Jon Hudson
559     Brocade
560     120 Holger Way
561     San Jose, CA 95134
562     USA

564     Email: jon.hudson@gmail.com

565     Naveen Nimmu
566     Broadcom
567     9th Floor, Building no 9, Raheja Mind space
568     Hi-Tec City, Madhapur,
569     Hyderabad - 500 081, INDIA

571     Phone: +1-408-218-8893
572     Email: naveen@broadcom.com

574     Radia Perlman
575     Intel Labs
576     2700 156th Ave NE, Suite 300,
577     Bellevue, WA 98007
578     USA

580     Phone: +1-425-881-4824
581     Email: radia.perlman@intel.com

583     Tal Mizrahi
584     Marvell
585     6 Hamada St.
586     Yokneam, 20692 Israel

588     Email: talmi@marvell.com