idnits 2.17.1 draft-ietf-trill-oam-req-04.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack an Authors' Addresses Section. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (November 15, 2012) is 4174 days in the past. Is this intentional? Checking references for intended status: Informational ---------------------------------------------------------------------------- == Unused Reference: 'RFC4377' is defined on line 461, but no explicit reference was found in the text -- Obsolete informational reference (is this intentional?): RFC 5101 (Obsoleted by RFC 7011) -- Obsolete informational reference (is this intentional?): RFC 2680 (Obsoleted by RFC 7680) -- Obsolete informational reference (is this intentional?): RFC 2679 (Obsoleted by RFC 7679) == Outdated reference: A later version (-16) exists of draft-ietf-opsawg-oam-overview-06 Summary: 1 error (**), 0 flaws (~~), 3 warnings (==), 4 comments (--). Run idnits with the --verbose option for more detailed information about the items above. 
-------------------------------------------------------------------------------- 1 TRILL Working Group Tissa Senevirathne 2 Internet Draft CISCO 3 Intended status: Informational David Bond 4 IBM 5 Sam Aldrin 6 Yizhou Li 7 Huawei 8 Rohit Watve 9 CISCO 11 November 15, 2012 12 Expires: May 2013 14 Requirements for Operations, Administration and Maintenance (OAM) in 15 TRILL (Transparent Interconnection of Lots of Links) 16 draft-ietf-trill-oam-req-04 18 Status of this Memo 20 This Internet-Draft is submitted in full conformance with the 21 provisions of BCP 78 and BCP 79. 23 Internet-Drafts are working documents of the Internet Engineering 24 Task Force (IETF), its areas, and its working groups. Note that 25 other groups may also distribute working documents as Internet- 26 Drafts. 28 Internet-Drafts are draft documents valid for a maximum of six 29 months and may be updated, replaced, or obsoleted by other documents 30 at any time. It is inappropriate to use Internet-Drafts as 31 reference material or to cite them other than as "work in progress." 33 The list of current Internet-Drafts can be accessed at 34 http://www.ietf.org/ietf/1id-abstracts.txt 36 The list of Internet-Draft Shadow Directories can be accessed at 37 http://www.ietf.org/shadow.html 39 This Internet-Draft will expire on May 15,2013. 41 Copyright Notice 43 Copyright (c) 2012 IETF Trust and the persons identified as the 44 document authors. All rights reserved. 46 This document is subject to BCP 78 and the IETF Trust's Legal 47 Provisions Relating to IETF Documents 48 (http://trustee.ietf.org/license-info) in effect on the date of 49 publication of this document. Please review these documents 50 carefully, as they describe your rights and restrictions with 51 respect to this document. 
Code Components extracted from this 52 document must include Simplified BSD License text as described in 53 Section 4.e of the Trust Legal Provisions and are provided without 54 warranty as described in the Simplified BSD License. 56 Abstract 58 OAM (Operations, Administration and Maintenance) is a general term 59 used to identify functions and toolsets to troubleshoot and monitor 60 networks. This document presents OAM Requirements applicable to 61 TRILL. 63 Table of Contents 65 1. Introduction...................................................3 66 1.1. Scope.....................................................3 67 2. Conventions used in this document..............................3 68 3. Terminology....................................................3 69 4. OAM Requirements...............................................5 70 4.1. Data Plane................................................5 71 4.2. Connectivity Verification.................................5 72 4.2.1. Unicast..............................................5 73 4.2.2. Distribution Trees...................................5 74 4.3. Continuity Check..........................................6 75 4.4. Path Tracing..............................................6 76 4.5. General Requirements......................................6 77 4.6. Performance Monitoring....................................7 78 4.6.1. Packet Loss..........................................7 79 4.6.2. Packet Delay.........................................8 80 4.7. ECMP Utilization..........................................8 81 4.8. Security and Operational considerations...................8 82 4.9. Fault Indications.........................................9 83 4.10. Defect Indications.......................................9 84 4.11. Live Traffic monitoring.................................10 85 5. Security Considerations.......................................10 86 6. IANA Considerations...........................................10 87 7. 
References....................................................10 88 7.1. Normative References.....................................10 89 7.2. Informative References...................................10 91 8. Acknowledgments...............................................11 92 9. Contributing Authors..........................................11 94 1. Introduction 96 OAM (Operations, Administration and Maintenance) generally covers 97 various production aspects of a network. In this document we use the 98 term OAM as defined in [RFC6291]. 100 Success of any mission critical network depends on the ability to 101 proactively monitor networks for faults, performance, etc. as well 102 as its ability to efficiently and quickly troubleshoot defects and 103 failures. A well-defined OAM toolset is a vital requirement for 104 wider adoption of TRILL (Transparent Interconnection of Lots of 105 Links) as the next generation data forwarding technology in larger 106 networks such as data centers. 108 In this document we define the Requirements for TRILL OAM. It is 109 assumed that the readers are familiar with the OAM concepts and 110 terminologies defined in other OAM standards such as [8021ag] and 111 [RFC5860]. This document does not attempt to redefine the terms and 112 concepts specified elsewhere. 114 1.1. Scope 116 The scope of this document is OAM between RBridges of a TRILL campus 117 over links selected by TRILL routing. 119 2. Conventions used in this document 121 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 122 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 123 document are to be interpreted as described in RFC-2119 [RFC2119]. 124 Although this document is not a protocol specification, the use of 125 this language clarifies the instructions to protocol designers 126 producing solutions that satisfy the requirements set out in this 127 document. 129 3. 
Terminology 131 Section: The term Section refers to a partial segment of a path 132 between any two given RBridges. As an example, consider the case 133 where RB1 is connected to RBx via RB2, RB3 and RB4. The segment 134 from RB2 to RB4 is referred to as a Section of the path RB1 to 135 RBx. 137 Flow: The term Flow indicates a set of packets that share the same 138 path and per-hop behavior (such as priority). A flow is typically 139 identified by a portion of the inner payload that affects the hop-by- 140 hop forwarding decisions. This may contain Layer 2 through Layer 4 141 information. 143 All Selectable Least Cost Paths: The term "all selectable least cost 144 paths" refers to a subset of all potentially available least cost 145 paths to a specified destination RBridge that are available (and 146 usable) for forwarding of frames. It is important to note that, in 147 practice, due to limitations in implementations, not all available 148 least cost paths may be selectable for forwarding. 150 Connectivity: The term connectivity indicates reachability between 151 an arbitrary RBridge RB1 and any other RBridge RB2. The specific 152 path can be either explicit (i.e., associated with a specific flow) 153 or unspecified. Unspecified means that messages used for 154 connectivity verification take whatever path the RBs happen to 155 select. 157 Continuity Verification: Continuity Verification refers to proactive 158 verification of Connectivity between two RBridges at periodic 159 intervals and generation of explicit notification when Connectivity 160 failures occur. 162 Fault: The term Fault refers to an inability to perform a required 163 action, e.g., an unsuccessful attempt to deliver a packet. 165 Defect: The term Defect refers to an interruption in the normal 166 operation, such that over a period of time no packets are delivered 167 successfully. 169 Failure: The term Failure refers to the termination of the required 170 function over a longer period of time. 
Persistence of a defect for a 171 period of time is interpreted as a failure. 173 Simulated Flow: The term simulated flow refers to a sequence of OAM 174 generated packets designed to follow a specific path. The fields of 175 the packets in the simulated flow may or may not be identical to the 176 fields of data packets of an actual flow being simulated. However, 177 the purpose of the simulated flow is to have OAM packets of the 178 simulated flow follow a specific path. 180 4. OAM Requirements 182 4.1. Data Plane 184 OAM frames, utilized for connectivity verification, continuity 185 checks, performance measurements, etc., will by default take 186 whatever path TRILL chooses based on the current topology and per- 187 hop equal cost path choices. In some cases, it may be required that 188 the OAM frames utilize specific paths. Thus, it MUST be possible to 189 arrange that OAM frames follow the path taken by a specific flow. 191 RBridges MUST have the ability to distinguish OAM frames destined for 192 them, or which require processing by the OAM plane, from normal data 193 frames. 195 TRILL OAM frames MUST remain within a TRILL campus and MUST NOT be 196 egressed from a TRILL network as native frames. 198 OAM MUST have the ability to include all Ethernet traffic types carried 199 by TRILL. 201 4.2. Connectivity Verification 203 4.2.1. Unicast 205 From an arbitrary RBridge RB1, OAM MUST have the ability to verify 206 connectivity to any other RBridge RB2. 208 From an arbitrary RBridge RB1, OAM MUST have the ability to verify 209 connectivity to any other RBridge RB2 for a specific flow via the 210 path associated with the specified flow. 212 An RBridge SHOULD have the ability to perform the above connectivity 213 tests on sections. As an example, assume RB1 is connected to RB5 via 214 RB2, RB3 and RB4. An operator SHOULD be able to verify the RB1 to 215 RB5 connectivity on the section from RB3 to RB5. 
The difference is 216 that the ingress and egress TRILL nicknames in this case are RB1 and 217 RB5 as opposed to RB3 and RB5, even though the message itself may 218 originate at RB3. 220 4.2.2. Distribution Trees 222 OAM MUST have the ability to verify connectivity, from an arbitrary 223 RBridge RB1, to either a specific set of RBridges or all member 224 RBridges, for a specified distribution tree. This functionality is 225 referred to as verification of the un-pruned distribution tree. 227 OAM MUST have the ability to verify connectivity, from an arbitrary 228 RBridge RB1, to either a specific set of RBridges or all member 229 RBridges, for a specified distribution tree and for a specified 230 flow. This functionality is referred to as verification of the 231 pruned tree. 233 4.3. Continuity Check 235 OAM MUST provide functions that allow any arbitrary RBridge RB1 to 236 perform a Continuity Check to any other RBridge. 238 OAM MUST provide functions that allow any arbitrary RBridge RB1 to 239 perform a Continuity Check to any other RBridge using a path 240 associated with a specified flow. 242 OAM SHOULD provide functions that allow any arbitrary RBridge to 243 perform a Continuity Check to any other RBridge over all selectable 244 least cost paths. 246 OAM SHOULD provide the ability to perform a Continuity Check on 247 sections of any selectable path within the network. 249 OAM SHOULD provide the ability to perform a multicast Continuity 250 Check for specified distribution tree(s) as well as specified 251 distribution tree and flow combinations. The former is referred to 252 as an un-pruned multi-destination tree Continuity Check and the 253 latter is referred to as a pruned tree Continuity Check. 255 4.4. Path Tracing 257 OAM MUST provide the ability to trace a path between any two 258 RBridges per specified unicast flow. 260 OAM SHOULD provide the ability to trace all selectable least cost 261 paths between any two RBridges. 
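As a rough illustration of what tracing all selectable least cost paths involves, the following Python sketch enumerates every equal-cost shortest path between two RBridges. The topology, link costs and the function name all_least_cost_paths are assumptions made for this example only; the sketch does not describe TRILL's path computation or any particular OAM solution, and it ignores implementation limits that may make some paths non-selectable.

```python
import heapq
from collections import defaultdict

# Hypothetical campus: RB1 reaches RB4 via RB2 or RB3 at equal cost.
LINKS = [
    ("RB1", "RB2", 1),
    ("RB1", "RB3", 1),
    ("RB2", "RB4", 1),
    ("RB3", "RB4", 1),
    ("RB4", "RB5", 1),
]


def all_least_cost_paths(links, src, dst):
    """Return (cost, paths) covering every equal-cost shortest path.

    links is an iterable of (node_a, node_b, cost) tuples describing
    bidirectional links with positive costs.
    """
    graph = defaultdict(list)
    for a, b, cost in links:
        graph[a].append((b, cost))
        graph[b].append((a, cost))

    dist = {src: 0}
    preds = defaultdict(set)  # node -> predecessors on a least cost path
    heap = [(0, src)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for nbr, cost in graph[node]:
            nd = d + cost
            best = dist.get(nbr, float("inf"))
            if nd < best:
                dist[nbr] = nd
                preds[nbr] = {node}
                heapq.heappush(heap, (nd, nbr))
            elif nd == best:
                preds[nbr].add(node)  # another equal-cost way in

    if dst not in dist:
        return None, []

    def walk(node):
        # Rebuild paths by following predecessor sets back to src.
        if node == src:
            return [[src]]
        return [p + [node] for pred in sorted(preds[node]) for p in walk(pred)]

    return dist[dst], walk(dst)


cost, paths = all_least_cost_paths(LINKS, "RB1", "RB5")
for path in paths:
    print(cost, " -> ".join(path))
```

An OAM path trace over all selectable least cost paths would be expected to report each of these paths, which is why the requirement is stated separately from tracing a single unicast flow.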
263 OAM SHOULD provide functionality to trace all branches of a 264 specified distribution tree (un-pruned tree). 266 OAM SHOULD provide functionality to trace all branches of a 267 specified distribution tree for a specified flow (pruned tree). 269 4.5. General Requirements 271 OAM MUST provide the ability to initiate and maintain multiple 272 concurrent sessions for multiple OAM functions between any arbitrary 273 RBridge RB1 and any other RBridge. In general, multiple OAM 274 operations will run concurrently. For example, proactive continuity 275 checks may take place between RB1 and RB2 at the same time an 276 operator decides to test connectivity between the same two RBs. 277 Multiple OAM functions and instances of those functions MUST be able 278 to run concurrently without interfering with each other. 280 OAM MUST provide a single OAM framework for all TRILL OAM functions 281 within the scope of this document. 283 OAM SHOULD, as far as is practical and possible, provide a single framework 284 between TRILL and other similar standards. 286 OAM MUST maintain related error and operational counters. Such 287 counters MUST be accessible via network management applications 288 (e.g., SNMP). 290 OAM functions related to continuity and connectivity checks MUST be 291 able to be invoked either proactively or on-demand. 293 OAM MAY be required to provide the ability to specify a desired 294 response mode for a specific OAM message. The desired response mode 295 can be either in-band, out-of-band or none. 297 The OAM Framework MUST be extensible to future needs of TRILL and 298 the needs of other standards organizations. 300 OAM MAY provide methods to verify control plane and forwarding plane 301 alignments. 303 OAM SHOULD leverage existing OAM technologies, where practical. 305 4.6. Performance Monitoring 307 4.6.1. Packet Loss 309 In this document, the term loss of a packet is used as defined in 310 [RFC2680] (see Section 2.4 of RFC2680). 
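As a rough illustration of a loss statistic built on that per-packet notion of loss, the Python sketch below derives a loss ratio for a simulated flow from the sequence numbers of the probes sent and of the probes that arrived. The function name loss_ratio and the use of sequence numbers are assumptions for this example; any timeout used to declare a probe lost is assumed to have been applied before the received list is built.

```python
def loss_ratio(sent_seqs, received_seqs):
    """Fraction of sent probes that never arrived (0.0 if none were sent)."""
    sent = set(sent_seqs)
    if not sent:
        return 0.0
    lost = sent - set(received_seqs)  # probes with no matching arrival
    return len(lost) / len(sent)


# Ten probes sent from RB1; probes 3 and 8 never reach the far RBridge.
ratio = loss_ratio(range(10), [0, 1, 2, 4, 5, 6, 7, 9])
print(f"loss ratio: {ratio:.1%}")
```

Working on sets means duplicated or reordered probes do not distort the count, which matters when probes may take different equal-cost paths.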
312 OAM SHOULD provide the ability to measure packet loss statistics for 313 a simulated flow from any arbitrary RBridge RB1 to any other 314 RBridge. 316 OAM SHOULD provide the ability to measure packet loss statistics 317 over a section, for a simulated flow between any arbitrary RBridge 318 RB1 and any other RBridge. 320 OAM SHOULD provide the ability to measure simulated packet loss 321 statistics between any two RBridges over all least cost paths. 323 An RBridge SHOULD be able to perform the above packet loss 324 measurement functions either proactively or on-demand. 326 4.6.2. Packet Delay 328 There are two types of packet delays -- one-way delay and two-way 329 delay (Round Trip Delay). 331 One-way delay is defined in [RFC2679] as the time elapsed from the 332 start of transmission of the first bit of a packet by an RBridge 333 until the reception of the last bit of the packet by the destination 334 RBridge. 336 Two-way delay is also referred to as Round Trip Delay and is defined 337 similarly to [RFC2681]; i.e., the time elapsed from the start of 338 transmission of the first bit of a packet from RB1, through receipt of the 339 packet at RB2 and RB2 sending a response packet back to RB1, until RB1 340 receives the last bit of that response packet. 342 OAM SHOULD provide functions to measure two-way delay between two 343 RBridges. 345 OAM MAY provide functions to measure one-way delay between two 346 RBridges for a specified flow. 348 OAM MAY provide functions to measure one-way delay between two 349 RBridges for a specified flow over a specific section. 351 4.7. ECMP Utilization 353 OAM MAY provide functionality to monitor the effectiveness of per- 354 hop ECMP hashing. For example, individual RBridges could maintain 355 counters that show how packets are being distributed across equal 356 cost next hops for a specified destination RBridge or RBridges as a 357 result of ECMP hashing. 359 4.8. 
Security and Operational considerations 361 Methods MUST be provided to protect against exploitation of the OAM 362 framework for security and denial-of-service attacks. 364 Methods SHOULD be provided to prevent OAM messages from causing 365 congestion in the network. Periodically generated messages with 366 high frequencies may lead to congestion; hence, methods such as 367 shaping or rate limiting SHOULD be utilized. 369 4.9. Fault Indications 371 The term Fault refers to an inability to perform a required action, 372 e.g., an unsuccessful attempt to deliver a packet [OAMOVER]. The 373 unsuccessful attempt may be due to Hop Count expiry, invalid 374 nickname, etc. 376 OAM MUST provide a Fault Indication framework to report faults to 377 the ingress RBridge of the packet or other interested parties (such 378 as syslog servers). 380 OAM MUST provide functions to selectively enable or disable 381 different types of Fault Indications. 383 4.10. Defect Indications 385 [OAMOVER] states: "The term Defect refers to an interruption in the 386 normal operation, such as a consecutive period of time where no 387 packets are delivered successfully." 389 OAM SHOULD provide a framework for Defect Detection and Indication. 391 OAM Defect Detection and Indication Framework SHOULD provide methods 392 to selectively enable or disable Defect Detection per defect type. 394 OAM Defect Detection and Indication Framework SHOULD provide methods 395 to configure Defect Detection thresholds per different types of 396 defects. 398 OAM Defect Detection and Indication Framework SHOULD provide 399 methods to log defect indications to a locally defined archive such 400 as a log buffer or SNMP traps. 402 OAM Defect Detection and Indication Framework SHOULD provide a 403 Remote Defect Indication framework that facilitates notifying the 404 originator/owner of the flow experiencing the defect, which is the 405 ingress RBridge. 407 Remote Defect Indication MAY be either in-band or out-of-band. 409 4.11. 
Live Traffic monitoring 411 OAM implementations MAY provide methods to utilize live traffic for 412 troubleshooting and performance monitoring. 414 Implementations MAY leverage Data Driven CFM [8021Q] or IPFIX 415 [RFC5101] for the purpose of performance monitoring. 417 5. Security Considerations 419 Security Requirements are specified in Section 4.8. For general 420 TRILL security considerations please refer to [RFC6325]. 422 6. IANA Considerations 424 None 426 7. References 428 7.1. Normative References 430 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 431 Requirement Levels", BCP 14, RFC 2119, March 1997. 433 7.2. Informative References 435 [RFC6325] Perlman, R., et al., "Routing Bridges (RBridges): Base 436 Protocol Specification", RFC 6325, July 2011. 438 [RFC5101] Claise, B., "Specification of the IP Flow Information 439 Export (IPFIX) Protocol for the Exchange of IP Traffic 440 Flow Information", RFC 5101, January 2008. 442 [RFC2680] Almes, G., et al., "A One-way Packet Loss Metric for IPPM", 443 RFC 2680, September 1999. 445 [RFC2679] Almes, G., et al., "A One-way Delay Metric for IPPM", RFC 446 2679, September 1999. 448 [RFC2681] Almes, G., et al., "A Round-trip Delay Metric for IPPM", 449 RFC 2681, September 1999. 451 [RFC6291] Andersson, L., et al., "Guidelines for the Use of the "OAM" 452 Acronym in the IETF", RFC 6291, June 2011. 454 [8021ag] IEEE, "Virtual Bridged Local Area Networks Amendment 5: 455 Connectivity Fault Management", 802.1ag, 2007. 457 [8021Q] IEEE, "Media Access Control (MAC) Bridges and Virtual 458 Bridged Local Area Networks", IEEE Std 802.1Q-2011, 459 August 2011. 461 [RFC4377] Nadeau, T., et al., "Operations and Management (OAM) 462 Requirements for Multi-protocol Label Switched 463 (MPLS) Networks", RFC 4377, February 2006. 465 [OAMOVER] Mizrahi, T., et al., "An Overview of Operations, 466 Administration, and Maintenance (OAM) Mechanisms", draft- 467 ietf-opsawg-oam-overview-06, Work in Progress, March 2012. 
469 [RFC5860] Vigoureux, M., et al., "Requirements for Operations, 470 Administration and Maintenance (OAM) in MPLS Transport 471 Networks", RFC 5860, May 2010. 473 8. Acknowledgments 475 Special acknowledgments to the IEEE 802.1 chair, Tony Jeffree, for 476 allowing us to solicit comments from the IEEE 802.1 group. Also 477 recognized are the comments received from the IEEE group, Ayal Lior and 478 others. 480 This document was prepared using 2-Word-v2.0.template.dot. 482 9. Contributing Authors 484 Tissa Senevirathne 485 CISCO Systems 486 375 East Tasman Drive 487 San Jose, CA 95134 488 USA. 490 Phone: +1-408-853-2291 491 Email: tsenevir@cisco.com 492 David Bond 493 IBM 494 2051 Mission College Blvd 495 Santa Clara, CA 95054 496 USA 498 Phone: +1-603-339-7575 499 Email: mokon@mokon.net 501 Sam Aldrin 502 Huawei Technologies 503 2330 Central Express Way 504 Santa Clara, CA 95951 505 USA 507 Email: aldrin.ietf@gmail.com 509 Yizhou Li 510 Huawei Technologies 511 101 Software Avenue, 512 Nanjing 210012 513 China 515 Phone: +86-25-56625375 516 Email: liyizhou@huawei.com 518 Rohit Watve 519 CISCO Systems 520 375 East Tasman Drive 521 San Jose, CA 95134 522 USA. 524 Phone: +1-408-424-2091 525 Email: rwatve@cisco.com 527 Thomas Narten 528 IBM Corporation 529 3039 Cornwallis Avenue, 530 PO Box 12195 531 Research Triangle Park, NC 27709 532 USA 534 Email: narten@us.ibm.com 535 Donald Eastlake 536 Huawei Technologies 537 155 Beaver Street, 538 Milford, MA 01757 539 USA. 541 Email: d3e3e3@gmail.com 543 Anoop Ghanwani 544 DELL 545 350 Holger Way 546 San Jose, CA 95134 547 USA. 549 Phone: +1-408-571-3500 550 Email: Anoop@alumni.duke.edu 552 Jon Hudson 553 Brocade 554 120 Holger Way 555 San Jose, CA 95134 556 USA. 
558 Email: jon.hudson@gmail.com 560 Naveen Nimmu 561 Broadcom 562 9th Floor, Building no 9, Raheja Mind space 563 Hi-Tec City, Madhapur, 564 Hyderabad - 500 081, INDIA 566 Phone: +1-408-218-8893 567 Email: naveen@broadcom.com 568 Radia Perlman 569 Intel Labs 570 2700 156th Ave NE, Suite 300, 571 Bellevue, WA 98007 572 USA. 574 Phone: +1-425-881-4824 575 Email: radia.perlman@intel.com 577 Tal Mizrahi 578 Marvell 579 6 Hamada St. 580 Yokneam, 20692 Israel 582 Email: talmi@marvell.com