TRILL Working Group                                          Samer Salam
INTERNET-DRAFT                                        Tissa Senevirathne
Intended Status: Informational                                     Cisco

                                                              Sam Aldrin
                                                         Donald Eastlake
                                                                  Huawei

Expires: April 22, 2013                                 October 19, 2012


                          TRILL OAM Framework
                   draft-salam-trill-oam-framework-03

Abstract

   This document specifies a reference framework for Operations,
   Administration and Maintenance (OAM) in TRILL networks. The focus of
   the document is on the fault and performance management aspects of
   TRILL OAM.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

Copyright and License Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Introduction
     1.1 Terminology
     1.2 Relationship to Other OAM Work
   2. TRILL OAM Model
     2.1 OAM Layering
       2.1.1 Relationship to CFM
       2.1.2 Relationship to BFD
       2.1.3 Relationship to Link OAM
     2.2 TRILL OAM in the RBridge Port Model
     2.3 Network, Service and Flow OAM
     2.4 Maintenance Domains
     2.5 Maintenance Entity and Maintenance Entity Group
     2.6 MEPs and MIPs
     2.7 Maintenance Point Addressing
   3. OAM Frame Format
     3.1 Motivation
     3.2 Determination of Flow Entropy
       3.2.1 Address Learning and Flow Entropy
     3.3 OAM Message Channel
     3.4 Identification of OAM Messages
   4. Fault Management
     4.1 Proactive Fault Management Functions
       4.1.1 Fault Detection (Continuity Check)
       4.1.2 Defect Indication
         4.1.2.1 Forward Defect Indication
         4.1.2.2 Reverse Defect Indication (RDI)
     4.2 On-Demand Fault Management Functions
       4.2.1 Connectivity Verification
         4.2.1.1 Unicast
         4.2.1.2 Multicast
       4.2.2 Fault Isolation
   5. Performance Management
     5.1 Packet Loss
     5.2 Packet Delay
   6. Security Considerations
   7. IANA Considerations
   8. Acknowledgements
   9. References
     9.1 Normative References
     9.2 Informative References
   Authors' Addresses

1. Introduction

   This document specifies a reference framework for Operations,
   Administration and Maintenance (OAM, [RFC6291]) in TRILL (Transparent
   Interconnection of Lots of Links) networks.

   TRILL [RFC6325] specifies a protocol for shortest-path frame routing
   in multi-hop networks with arbitrary topologies and link
   technologies, using the IS-IS routing protocol. TRILL capable devices
   are referred to as TRILL Switches or RBridges (Routing Bridges).
   RBridges provide an optimized and transparent Layer 2 delivery
   service for Ethernet unicast and multicast traffic. Some
   characteristics of a TRILL network that are different from Ethernet
   bridging are the following:

   - TRILL networks support arbitrary link technology between TRILL
     switches.
     Hence, a TRILL switch port may not have a 48-bit MAC
     Address [802] but might, for example, have an IP address as an
     identifier [TRILL-IP] or no unique identifier (PPP [RFC6361]).

   - TRILL networks do not enforce congruency of unicast and multicast
     paths between a given pair of RBridges.

   - TRILL networks do not impose symmetry of the forward and reverse
     paths between a given pair of RBridges.

   - TRILL supports multipathing of unicast as well as multicast
     traffic.

   In this document, the term OAM is used as defined in [RFC6291].
   The Operations aspect involves finding problems that prevent proper
   functioning of the network. It also includes monitoring of the
   network to identify potential problems before they occur.
   Administration involves keeping track of network resources.
   Maintenance activities are focused on facilitating repairs and
   upgrades as well as corrective and preventive measures. [ISO/IEC
   7498-4] defines 5 functional areas in the OSI model for network
   management, commonly referred to as FCAPS:

   - Fault Management
   - Configuration Management
   - Accounting Management
   - Performance Management
   - Security Management

   The focus of this document is on the first and fourth functional
   areas, namely Fault Management and Performance Management, in TRILL
   networks. These primarily map to the "Operations" and "Maintenance"
   parts of OAM.

   This document provides a generic framework for a comprehensive
   solution that meets the requirements outlined in [TRILL-OAM-REQ].
   However, specific mechanisms to address these requirements are
   considered to be outside the scope of this document.

1.1 Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

   In addition, the following acronyms are used:

   BFD          - Bidirectional Forwarding Detection [RFC5880]
   CFM          - Connectivity Fault Management [802.1Q]
   FGL          - Fine Grained Label(ing) [TRILL-FGL]
   IEEE         - Institute of Electrical and Electronics Engineers
   IP           - Internet Protocol, includes both IPv4 and IPv6
   L2VPN        - Layer 2 Virtual Private Network
   LAN          - Local Area Network
   MEG          - Maintenance Entity Group
   MEP          - Maintenance End Point
   MIP          - Maintenance Intermediate Point
   MP           - Maintenance Point (MEP or MIP)
   OAM          - Operations, Administration, and Maintenance [RFC6291]
   RBridge      - Routing Bridge, a device implementing TRILL [RFC6325]
   TRILL        - Transparent Interconnection of Lots of Links [RFC6325]
   TRILL Switch - an alternate name for an RBridge
   VLAN         - Virtual LAN

1.2 Relationship to Other OAM Work

   OAM is a technology area where a wealth of prior art exists. This
   document leverages concepts and draws upon elements defined and/or
   used in the following documents:

   [TRILL-OAM-REQ] defines the requirements for TRILL OAM which serve as
   the basis for this framework.

   [802.1Q] specifies the Connectivity Fault Management protocol, which
   defines the concepts of Maintenance Domains, Maintenance End Points,
   and Maintenance Intermediate Points.

   [Y.1731] extends Connectivity Fault Management in two areas. It
   defines fault notification and alarm suppression functions for
   Ethernet. It also specifies mechanisms for Ethernet performance
   management, including loss, delay, jitter, and throughput
   measurement.

   [RFC6136] specifies a reference model for OAM as it relates to L2VPN
   services, pseudowires and associated Packet Switched Network tunnels.
   The document also specifies OAM requirements for L2VPN services.

   [RFC6371] describes a framework to support a comprehensive set of OAM
   procedures that fulfill the MPLS-TP OAM requirements for fault,
   performance, and protection-switching management and that do not rely
   on the presence of a control plane.

   [TRILL-BFD] defines a TRILL encapsulation for BFD that enables the
   use of the latter for network fast convergence.

2. TRILL OAM Model

2.1 OAM Layering

   In the TRILL architecture, the TRILL layer is independent of the
   underlying Link Layer technology. Therefore, it is possible to run
   TRILL over any transport layer capable of carrying TRILL frames such
   as Ethernet [RFC6325], PPP [RFC6361], or MPLS. Furthermore, TRILL
   provides a virtual Ethernet connectivity service that is transparent
   to higher layer entities (e.g. Layer 3 and above). This strict
   layering is observed by TRILL OAM.

   Of particular interest is the layering of TRILL OAM with respect to:

   - BFD, which is typically used for fast convergence

   - Ethernet CFM [802.1Q] on paths from an external device, over a
     TRILL campus, to another external device, especially since TRILL
     switches are likely to be deployed where existing 802.1 bridges can
     be such external devices.

   - Link OAM, on links interior to a TRILL campus, which is link
     technology specific.

   Consider the example network depicted in Figure 1 below, where a
   TRILL network is interconnected via Ethernet links:

                          LAN                LAN
          +---+   +---+  ======  +---+  =============  +---+
   +--+   |   |   |   | | +--+ | |   | | +--+   +--+ | |   |   +--+
   |B1|---|RB1|---|RB2|---|B2|---|RB3|---|B3|---|B4|---|RB4|---|B5|
   +--+   |   |   |   | | +--+ | |   | | +--+   +--+ | |   |   +--+
          +---+   +---+  ======  +---+  =============  +---+

   a. Ethernet CFM (Client Layer) on path over the TRILL campus
      >---o------------------------------------------------o---<

   b. TRILL OAM (Network Layer)
              >------o-----------o---------------------<

   c. Ethernet CFM (Transport Layer) on interior Ethernet LANs
                      >---o--o---<   >---o--o---o--o---<

   d. BFD (Media Independent Link Layer)
              #---#   #----------#   #-----------------#

   e. Link OAM (Media Dependent Link Layer)
      *---*   *---*   *---*  *---*   *---*  *---*  *---*   *---*

   Legend: > MEP   o MIP   # BFD Endpoint   * Link OAM Endpoint

                    Figure 1: OAM Layering in TRILL

   Where Bn and RBn (n= 1,2,3, ...) denote IEEE 802.1Q bridges and TRILL
   RBridges, respectively.

2.1.1 Relationship to CFM

   In the context of a TRILL network, CFM can be used as either a client
   layer OAM or a transport layer OAM mechanism.

   When acting as a client layer OAM (see Figure 1a), CFM provides fault
   management capabilities for the user, on an end-to-end basis over the
   TRILL network. Edge ports of the TRILL network may be visible to CFM
   operations through the optional presence of a CFM Maintenance
   Intermediate Point (MIP) in the TRILL switches' edge Ethernet ports.

   When acting as a transport layer OAM (see Figure 1c), CFM provides
   fault management functions for the IEEE 802.1Q bridged LANs that may
   interconnect RBridges. Such bridged LANs can be used as TRILL level
   links between RBridges. RBridges directly connected to the
   intervening 802.1Q bridges may host CFM Down Maintenance End Points
   (MEPs).

2.1.2 Relationship to BFD

   One-hop BFD (see Figure 1d) runs between adjacent RBridges and
   provides fast link as well as node failure detection capability
   [TRILL-BFD]. Note that BFD sits a layer above Link OAM, which is
   media specific. BFD provides fast convergence characteristics to
   TRILL networks. It is worth noting that the requirements for BFD are
   different from those of the TRILL OAM mechanisms that are the prime
   focus of this document. Furthermore, BFD does not use the frame
   format described in section 3.1.

   TRILL BFD differs from TRILL OAM in two significant ways:

   1. A TRILL BFD transmitter is bound to a specific TRILL output port
      as explained below.

   2. TRILL BFD messages can be transmitted by the originator out a port
      to a neighbor RBridge when the adjacency is in the Detect or
      Two-Way states as well as when the adjacency is in the Up state
      [RFC6327].

   In contrast, TRILL OAM messages are initially transmitted by
   appearing to have been received on a virtual TRILL input port (refer
   to Section 2.2 for details). The output ports on which TRILL OAM
   messages are sent are determined by the TRILL routing function, which
   will only send on links that are in the Up state and have been
   incorporated into the local view of the campus topology.

   For example, assume there are five parallel equal cost links between
   RB1 and RB2 that have not been aggregated. (Links that are aggregated
   with [802.1AX] appear to TRILL to be a single link accessible through
   a single TRILL port.) However, RB1 is only capable of doing up to
   4-way ECMP. TRILL OAM messages, as dispatched by the TRILL Routing
   function, will use 4 of the 5 links. But it is desirable to be able
   to monitor the fifth link to be sure it is available for failover.
   TRILL BFD messages sent by RB1 will use the output port to which
   their session is bound. RB1 can easily monitor all 5 links to RB2 by
   using a TRILL BFD session bound to each of the 5 output ports.

2.1.3 Relationship to Link OAM

   Link OAM (see Figure 1e) depends on the nature of the technology used
   in the links interconnecting RBridges. For example, for Ethernet
   links, [802.3] Clause 57 OAM may be used.

2.2 TRILL OAM in the RBridge Port Model

   TRILL OAM processing can be modeled as a shim situated between the
   Extended Internal Sublayer Service (EISS) in [802.1Q] and the RBridge
   Forwarding Engine function, on a virtual port with no physical layer
   (Null PHY).
   TRILL OAM requires services of the RBridge forwarding
   engine and utilizes information from the IS-IS control plane. Figure
   2 below depicts TRILL OAM processing in the context of the RBridge
   port model defined in [RFC6325]. In this figure, double lines
   represent flow of both frames and information whereas single lines
   represent flow of information only.

   While this figure shows a conceptual model, it is to be understood
   that implementations need not mirror this exact model as long as the
   intended OAM requirements and functionality are preserved.

      +-----------------------------------------------+----
      |                    (Flow of OAM Messages)   RBridge
      |                   +----------------------+
      |                   |+-------------------+||  Forwarding Engine,
      |                   ||                    ||  IS-IS, Etc.
      |                   ||                    ||  Processing of native
      |                   V                     V   and TRILL frames
      +-----------------------------------------------+----
           ||             ||                    ||  ...other trunk ports
           ||       +------------+       +------------+
           ||       | TRILL OAM  |       |            |
           ||       | Processing |       | Port VLAN  |
           ||       +------------+       | Processing |
           ||                            |            |
           |+-------------------------+  |            |
           +-------------------------+|  +------------+ <-- ISS
                                     ||  |802.1/802.3 |
  +-------------------------------+  ||  |Low Level   |
  |           MAC Relay           |  ||  |Control     |
  +----||------------------||-----+  ||  |Frame       |
  +----||------+ ... +-----||-----+  ||  |Processing, |
  |            |     |            |  ||  |Port/Link   |
  | Port VLAN  |     | Port VLAN  |  ||  |Control     |
  | Processing |     | Processing |  ||  |Logic       |
  +------------+     +------------+  ||  +------------+
  |802.1/802.3 |     |802.1/802.3 |  ||  | 802.3PHY   |
  |Low Level   |     |Low Level   |  ||  +------------+
  |Control     |     |Control     |  ||
  |Frame       |     |Frame       |  ||
  |Processing, |     |Processing, |  ||
  |Port/Link   |     |Port/Link   |  ||
  |Control     |     |Control     |  ||
  |Logic       |     |Logic       |  ||
  +------------+     +------------+  ||
  | 802.3PHY   | ... | 802.3PHY   |  ||
  +------------+     +------------+  ||
  Access/Shared            ||        ||
  Ports                    |+--------+|
                           +----------+

              Figure 2: TRILL OAM in RBridge Port Model

   Note that there is a single virtual interface, per RBridge, that
   hosts the TRILL OAM shim. The rationale for this model is discussed
   in section 2.6 "MEPs and MIPs".

2.3 Network, Service and Flow OAM

   OAM functions in a TRILL network can be conducted at different levels
   of granularity. This gives rise to 'Network', 'Service' and 'Flow'
   OAM, listed in order of increasing granularity.

   Network OAM mechanisms provide fault and performance management
   functions in the context of a representative 'test' VLAN or fine
   grained label [TRILL-FGL]. The test VLAN can be thought of as a
   management or diagnostics VLAN which extends to all RBridges in a
   TRILL network. In order to account for multipathing, Network OAM
   functions also make use of test flows (both unicast and multicast) to
   provide coverage of the various paths in the network.

   Service OAM mechanisms provide fault and performance management
   functions in the context of the actual VLAN or fine grained label set
   for which end station service is enabled. Test flows are used here,
   as well, to provide coverage in the case of multipathing.

   Flow OAM mechanisms provide the most granular fault and performance
   management capabilities, where OAM functions are performed in the
   context of end station service VLANs or fine grained labels and user
   flows. While Flow OAM provides the most granular control, it clearly
   poses scalability challenges if attempted on large numbers of flows.

2.4 Maintenance Domains

   The concept of Maintenance Domains, or OAM Domains, is well known in
   the industry. IEEE [802.1Q], [RFC6136], [RFC5654], and others all
   define the notion of a Maintenance Domain as a collection of devices
   (e.g. network elements) that are grouped for administrative and/or
   management purposes. Maintenance domains usually delineate trust
   relationships, varying addressing schemes, network infrastructure
   capabilities, etc.

   When mapped to TRILL, a Maintenance Domain is defined as a collection
   of RBridges in a network for which faults in connectivity or
   performance are to be managed by a single operator. All RBridges in a
   given Maintenance Domain are, by definition, managed by a single
   entity (e.g. an enterprise or a data center operator).
   [RFC6325] defines the operation of TRILL in a single IS-IS area, with
   the assumption that the network is managed by a single operator. In
   this context, a single (default) Maintenance Domain is sufficient for
   TRILL OAM.

   However, when considering scenarios where different TRILL networks
   need to be interconnected, for example as discussed in [TRILLML],
   the introduction of multiple Maintenance Domains and Maintenance
   Domain hierarchies becomes useful to map and contain administrative
   boundaries. When considering multi-domain scenarios, the following
   rules must be followed: TRILL OAM domains MUST NOT overlap, but MUST
   either be disjoint or nest to form a hierarchy (i.e. a higher
   Maintenance Domain MAY completely engulf a lower Domain). A
   Maintenance Domain is typically identified by a Domain Name and a
   Maintenance Level (a numeric identifier). The larger the Domain, the
   higher the Level number.
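
   The nesting rules above can be illustrated with a short sketch. It is
   not part of this framework: the class, the function name and the
   RBridge identifiers are hypothetical, and the level comparison
   assumes that "larger Domain, higher Level" is applied as a strict
   check, which the text above states only as the typical case.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class MaintenanceDomain:
    name: str
    level: int            # Maintenance Level (numeric identifier)
    rbridges: frozenset   # hypothetical RBridge identifiers in the domain


def check_domains(domains):
    """Return a list of rule violations (empty if the set is valid)."""
    problems = []
    for a, b in combinations(domains, 2):
        if not (a.rbridges & b.rbridges):
            continue  # disjoint domains are always acceptable
        if a.rbridges < b.rbridges:
            inner, outer = a, b
        elif b.rbridges < a.rbridges:
            inner, outer = b, a
        else:
            # Partial overlap. Identical membership is also treated as
            # an error here, which is an assumption of this sketch.
            problems.append(f"{a.name} and {b.name} overlap")
            continue
        # Assumption: an enclosing domain must carry a higher level.
        if outer.level <= inner.level:
            problems.append(
                f"{outer.name} encloses {inner.name} but has no higher level")
    return problems


site1 = MaintenanceDomain("Site 1", 1, frozenset({"RB1", "RB2"}))
site2 = MaintenanceDomain("Site 2", 1, frozenset({"RB7", "RB8"}))
end_to_end = MaintenanceDomain(
    "End-to-End", 3,
    frozenset({"RB1", "RB2", "RB4", "RB5", "RB7", "RB8"}))

print(check_domains([site1, site2, end_to_end]))  # prints []
```

   In this example the two site domains are disjoint and both nest
   inside the hypothetical end-to-end domain, so no violation is
   reported. A domain that shared only some RBridges with "Site 1"
   would be reported as overlapping.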

   +-------------------+    +---------------+    +-------------------+
   |                   |    |     TRILL     |    |                   |
   |      Site 1       +----+Interconnect   +----+      Site 2       |
   |      TRILL        | RB |   Network     | RB |      TRILL        |
   |    (Level 1)      +----+  (Level 2)    +----+    (Level 1)      |
   |                   |    |               |    |                   |
   +-------------------+    +---------------+    +-------------------+

     <------------------------End-to-End Domain-------------------->

   <----Site Domain---->   <--Interconnect -->   <----Site Domain---->
                                 Domain

               Figure 3: TRILL OAM Maintenance Domains

2.5 Maintenance Entity and Maintenance Entity Group

   TRILL OAM functions are performed in the context of logical endpoint
   pairs referred to as Maintenance Entities (ME). A Maintenance Entity
   defines a relationship between two points in a TRILL network where
   OAM functions (e.g. monitoring operations) are applied. The two
   points which define a Maintenance Entity are known as Maintenance End
   Points (MEPs) - see section 2.6 below. The set of Maintenance
   Entities that belong to the same Maintenance Domain is referred to
   as a Maintenance Entity Group (MEG). On the network path in between
   MEPs, there can be zero or more intermediate points, called
   Maintenance Intermediate Points (MIPs). MEPs and MIPs are associated
   with the MEG and can be part of more than one ME in a given MEG.

2.6 MEPs and MIPs

   OAM capabilities on RBridges can be defined in terms of logical
   groupings of functions that can be categorized into two functional
   objects: Maintenance End Points (MEPs) and Maintenance Intermediate
   Points (MIPs). The two are collectively referred to as Maintenance
   Points (MPs).

   MEPs are the active components of TRILL OAM: MEPs source TRILL OAM
   messages proactively or on-demand based on operator invocation.
   Furthermore, MEPs ensure that TRILL OAM messages do not leak outside
   a given Maintenance Domain, e.g. out of the TRILL network and into
   end stations.
   MIPs, on the other hand, are internal to a Maintenance
   Domain. They are the more passive components of TRILL OAM, primarily
   responsible for forwarding TRILL OAM messages and selectively
   responding to a subset of these messages.

   The following figure shows the MEP and MIP placement for the
   Maintenance Domains depicted in Figure 3 above.

       TRILL Site 1         Interconnect          TRILL Site 2
   +-----------------+ +------------------+ +-----------------+
   |                 | |                  | |                 |
   | +---+  +---+  +---+  +---+  +---+  +---+  +---+  +---+   |
   | |RB1|--|RB2|--|RB3|--|RB4|--|RB5|--|RB6|--|RB7|--|RB8|   |
   | +---+  +---+  +---+  +---+  +---+  +---+  +---+  +---+   |
   |                 | |                  | |                 |
   +-----------------+ +------------------+ +-----------------+

   Legend E: MEP I: MIP

                      Figure 4: MEPs and MIPs

   It is worth noting that a single RBridge may host multiple MEPs of
   different technologies, e.g. TRILL OAM MEP(s) and [802.1Q] MEP(s).
   This does not mean that the protocol operation is necessarily
   consolidated into a single functional entity on those ports. The
   protocol functions for each MEP remain independent and reside in
   different shims in the RBridge Port model of Figure 2: the TRILL OAM
   MEP resides in the "TRILL OAM Processing" block whereas a CFM MEP
   resides in the "802.1Q Port VLAN Processing" block.

   The model of Section 2.2 implies that a single MEP and/or MIP per MEG
   can be instantiated per RBridge. This simplifies implementations and
   enables TRILL OAM to perform management functions on sections, as
   specified in [TRILL-OAM-REQ], while maintaining the simplicity of a
   single TRILL OAM Maintenance Domain. We do not distinguish between Up
   MPs and Down MPs (as defined in [802.1Q]) in this framework. Given
   that the MPs always reside on a special virtual port with no PHY
   layer, MP directionality is irrelevant.

2.7 Maintenance Point Addressing

   TRILL OAM functions must provide the capability to address a specific
   Maintenance Point or a set of one or more Maintenance Points in a
   MEG. To that end, RBridges need to recognize two sets of addresses:

   - Individual MP addresses

   - Group MP addresses

   TRILL OAM will support the Shared MP address model, where all MPs on
   an RBridge share the same Individual MP address. In other words,
   TRILL OAM messages can be addressed to a specific RBridge but not to
   a specific port on an RBridge.

   One cannot discern, from observing the external behavior of an
   RBridge, whether TRILL OAM messages are actually delivered to a
   certain MP or another entity within the RBridge. The Shared MP
   address model takes advantage of this fact by allowing MPs in
   different RBridge ports to share the same Individual MP address. The
   MPs may still be implemented as residing on different RBridge ports
   and for the most part, they have distinct identities.

   The Group MP addresses enable the OAM mechanism to reach all the MPs
   in a given MEG. Certain OAM functions, e.g. pruned tree verification,
   require addressing a subset of the MPs in a MEG. Group MP addresses
   are not defined for such subsets. Rather, the OAM function in
   question must use the Group MP addresses combined with an indication
   of the scope of the MP subset encoded in the OAM Message Channel.
   This prevents an unwieldy set of responses to Group MP addresses.

3. OAM Frame Format

3.1 Motivation

   In order for TRILL OAM messages to accurately test the data-path,
   these messages must be transparent to transit RBridges. That is, a
   TRILL OAM message must be indistinguishable from a TRILL data frame
   through normal transit RBridge processing. Only the target RBridge,
   which needs to process the message, should identify and trap the
   packet as a control message through normal processing.
   Additionally, methods must be provided to prevent OAM packets from
   being transmitted out as native frames.

   The TRILL OAM frame format proposed below provides the necessary
   flexibility to exercise the data path as closely as possible to
   actual data packets.

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                               |
   .    Link Header                . Variable
   |                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                               |
   +    TRILL Header               + 8 bytes
   |                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                               |
   .    Flow Entropy               . Fixed Size
   .                               .
   |                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    OAM EtherType              | 2 bytes
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                               |
   .    OAM Message Channel        . Variable
   .                               .
   |                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                               |
   .    Link Trailer               . Variable
   |                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                    Figure 5: OAM Frame Format

   The TRILL Header is as specified in [RFC6325] and the Link Header and
   Trailer are as specified for the link technology. (Link types
   standardized so far are [RFC6325] for Ethernet and [RFC6361] for
   PPP). These fields need to be as similar as practical to the Link
   Header/Trailer and TRILL Header of the normal TRILL data frame
   corresponding to the traffic that OAM is testing.

   The OAM EtherType demarcates the boundary between the Flow Entropy
   and the OAM Message Channel. The OAM EtherType is expected at a
   deterministic offset from the TRILL Header, thereby allowing
   applications to clearly identify the beginning of the OAM Message
   Channel. Additionally, it facilitates the use of the same OAM frame
   structure by different Ethernet technologies.

   The Link Trailer is usually a checksum, such as the Ethernet Frame
   Check Sequence, which is examined at a low level very early in the
   frame input process and automatically generated as part of the low
   level frame output process. If the checksum fails, the frame is
   normally discarded with no higher level processing.
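
   The role of the deterministic offset can be sketched as follows. This
   is an illustration only, not a wire format defined by this document.
   The Flow Entropy length (96 bytes) and the EtherType value (0x88B5,
   the IEEE 802 Local Experimental EtherType 1) are placeholders chosen
   for the example, and the function names are hypothetical. The Link
   Header and Link Trailer are left out because they depend on the link
   technology.

```python
import struct

TRILL_HEADER_LEN = 8      # size shown in Figure 5
FLOW_ENTROPY_LEN = 96     # ASSUMPTION: the draft only says "Fixed Size"
OAM_ETHERTYPE = 0x88B5    # ASSUMPTION: placeholder, no value is assigned here

# Offset of the OAM EtherType, measured from the start of the TRILL Header.
OAM_ETHERTYPE_OFFSET = TRILL_HEADER_LEN + FLOW_ENTROPY_LEN


def build_oam_payload(trill_header, flow_entropy, oam_message):
    """Assemble TRILL Header, Flow Entropy, OAM EtherType and channel."""
    if len(trill_header) != TRILL_HEADER_LEN:
        raise ValueError("unexpected TRILL Header length")
    if len(flow_entropy) > FLOW_ENTROPY_LEN:
        raise ValueError("flow entropy does not fit the fixed-size field")
    # Pad so that the OAM EtherType always lands at the same offset.
    padded = flow_entropy.ljust(FLOW_ENTROPY_LEN, b"\x00")
    return (trill_header + padded
            + struct.pack("!H", OAM_ETHERTYPE) + oam_message)


def split_oam_payload(payload):
    """Return (flow_entropy, oam_message), or None if not an OAM frame."""
    if len(payload) < OAM_ETHERTYPE_OFFSET + 2:
        raise ValueError("payload too short")
    (ethertype,) = struct.unpack_from("!H", payload, OAM_ETHERTYPE_OFFSET)
    if ethertype != OAM_ETHERTYPE:
        return None
    flow_entropy = payload[TRILL_HEADER_LEN:OAM_ETHERTYPE_OFFSET]
    oam_message = payload[OAM_ETHERTYPE_OFFSET + 2:]
    return flow_entropy, oam_message


payload = build_oam_payload(bytes(8), b"synthetic-flow", b"oam-channel")
entropy, message = split_oam_payload(payload)
```

   Because the Flow Entropy is padded to a fixed size, a receiver can
   locate the OAM Message Channel without parsing the Flow Entropy
   contents, which is the property the OAM EtherType placement is meant
   to provide.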

3.2 Determination of Flow Entropy

   The Flow Entropy is a fixed length field that is populated with
   either real packet data or synthetic data that mimics the intended
   flow.

   For a Layer 2 flow (i.e. non-IP) the Flow Entropy must specify the
   Ethernet header, including the MAC destination and source addresses
   as well as a VLAN tag or fine grained label.

   For a Layer 3 flow, the Flow Entropy must specify the Ethernet
   header, the IP header and UDP or TCP header fields.

   Not all fields in the Flow Entropy field need to be identical to the
   data flow that the OAM message is mimicking. The only requirement is
   for the selected flow entropy to follow the same path as the data
   flow that it is mimicking. In other words, the selected flow entropy
   must result in the same ECMP selection or multicast pruning behavior
   or other applicable forwarding paradigm.

   When performing diagnostics on user flows, the OAM mechanisms must
   allow the network operator to configure the flow entropy parameters
   (e.g. Layer 2 and/or 3) on the RBridge from which the diagnostic
   operations are to be triggered.

   When running OAM functions over Test Flows, TRILL OAM should
   provide a mechanism for discovering the flow entropy parameters by
   querying the RBridges dynamically.

3.2.1 Address Learning and Flow Entropy

   Edge TRILL switches, like traditional 802.1 bridges, are required to
   learn MAC address associations. Learning is accomplished either by
   snooping data packets or through other methods. The flow entropy
   field of TRILL OAM messages mimics real packets and may impact the
   address learning process of the TRILL data plane. TRILL OAM is
   required to provide methods to prevent any learning of addresses from
   the flow entropy field of OAM messages that would interfere with
   normal TRILL operation. This can be done, for example, by
   suppressing/preventing MAC address learning from OAM messages.
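
   The path-equivalence requirement of Section 3.2 can be illustrated
   with a toy ECMP selection function. The hash shown here (CRC-32 over
   the MAC addresses and VLAN) is an assumption made for the example.
   Actual RBridges use implementation-specific hash inputs and
   algorithms, and the field and function names below are hypothetical.

```python
import zlib

# ASSUMPTION: the fields a hypothetical RBridge feeds into its ECMP hash.
HASHED_FIELDS = ("dst_mac", "src_mac", "vlan")


def ecmp_index(flow, n_paths):
    """Pick one of n_paths equal-cost next hops for the given flow."""
    key = "|".join(str(flow[name]) for name in HASHED_FIELDS).encode()
    return zlib.crc32(key) % n_paths


def follows_same_path(data_flow, oam_entropy, n_paths):
    """True if the OAM flow entropy selects the same path as the data."""
    return ecmp_index(data_flow, n_paths) == ecmp_index(oam_entropy, n_paths)


data_flow = {
    "dst_mac": "00:00:5e:00:53:01",
    "src_mac": "00:00:5e:00:53:02",
    "vlan": 100,
    "ip_ttl": 64,
}
# The synthetic entropy copies only the hashed fields. The non-hashed
# field (ip_ttl) differs without changing the selected path.
oam_entropy = dict(data_flow, ip_ttl=1)

same_path = follows_same_path(data_flow, oam_entropy, n_paths=4)
```

   In this sketch same_path is True because every field that feeds the
   hash is identical in both flows. Which fields must match in practice
   depends on the forwarding behavior of the RBridges on the path.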
669 3.3 OAM Message Channel 671 The OAM Message Channel provides methods to communicate OAM specific 672 details between RBridges. Both [802.1Q] CFM and [RFC4379] define 673 OAM message channels. It is desirable to select an appropriate 674 technology and re-use it, instead of designing yet another OAM 675 channel. TRILL is a transport layer that carries Ethernet frames, so 676 the TRILL OAM model specified earlier is based on the [802.1Q] CFM 677 model. The use of the [802.1Q] CFM encoding format for the OAM Message 678 Channel is one possible choice. [TRILL-OAM] presents a proposal on 679 the use of the [802.1Q] CFM payload as the OAM Message Channel. 681 3.4 Identification of OAM Messages 683 RBridges must be able to identify OAM messages that are destined to 684 them, either individually or as a group, so as to properly process 685 those messages. 687 It may be possible to use a combination of one of the unused fields 688 or bits in the TRILL Header and the OAM EtherType to identify TRILL 689 OAM messages. 691 [RFC6325] does not specify any method of identifying OAM messages. 692 Hence, for backwards compatibility reasons, TRILL OAM solutions must 693 provide methods to identify OAM messages through the use of well- 694 known patterns in the Flow Entropy field, for example, by using a 695 reserved MAC address as the inner MAC SA. 697 4. Fault Management 699 Section 4.1 below discusses proactive fault management and Section 700 4.2 discusses on-demand fault management. 702 4.1 Proactive Fault Management Functions 704 Proactive fault management functions are configured by the network 705 operator to run periodically without a time bound, or are configured 706 to trigger certain actions upon the occurrence of specific events. 708 4.1.1 Fault Detection (Continuity Check) 710 Proactive fault detection is performed by periodically monitoring the 711 reachability between service endpoints, i.e., MEPs in a given MEG, 712 through the exchange of Continuity Check messages.
The reachability 713 between any two MEPs may be monitored for a specified path, 714 all paths or any representative path. The fact that TRILL networks do 715 not enforce congruency between unicast and multicast paths means that 716 the proactive fault detection mechanism must provide procedures to 717 monitor the unicast paths independently of the multicast paths. 718 Furthermore, where the network has ECMP, the proactive fault 719 detection mechanism must be capable of exercising the equal-cost 720 paths individually. 722 The set of MEPs exchanging Continuity Check messages in a given 723 domain and for a specific monitored entity (flow, network or service) 724 must use the same transmission period. As long as the fault detection 725 mechanism involves MEPs transmitting periodic heartbeat messages 726 independently, this OAM procedure is not affected by the lack of 727 forward/reverse path symmetry in TRILL. 729 The proactive fault detection function must detect the following 730 types of defects: 732 - Loss of continuity (LoC) to one or more remote MEPs 733 - Unexpected connectivity between isolated VLANs (mismerge) 734 - Unexpected connectivity to one or more remote MEPs 735 - Period mis-configuration 737 4.1.2 Defect Indication 739 TRILL OAM MUST support event-driven defect indication upon the 740 detection of a connectivity defect. Defect indications can be 741 categorized into two types: 743 4.1.2.1 Forward Defect Indication 745 This is used to signal a failure that is detected by a lower-layer 746 OAM mechanism. Forward defect indication is transmitted away from the 747 direction of the failure. For example, consider a simple network 748 comprising four RBridges connected in tandem: RB1, RB2, RB3 and 749 RB4. Both RB1 and RB4 are hosting TRILL OAM MEPs, whereas RB2 and RB3 750 have MIPs.
If the link between RB2 and RB3 fails, then RB2 can send a 751 forward defect indication towards RB1 while RB3 sends a forward 752 defect indication towards RB4. 754 Forward defect indication may be used for alarm suppression and/or 755 for the purpose of inter-working with other layer OAM protocols. Alarm 756 suppression is useful when a transport/network level fault translates 757 to multiple service or flow level faults. In such a scenario, it is 758 enough to alert a network management station (NMS) of the single 759 transport/network level fault in lieu of flooding that NMS with a 760 multitude of Service or Flow granularity alarms. 762 4.1.2.2 Reverse Defect Indication (RDI) 764 RDI is used to signal that the advertising MEP has detected a loss of 765 continuity (LoC) defect. RDI is transmitted in the direction of the 766 failure. For example, consider the same tandem network of the previous 767 section. If RB1 detects that it has lost connectivity to RB4 because 768 it is no longer receiving Continuity Check messages from the MEP on 769 RB4, then RB1 can transmit an RDI towards RB4 to inform the latter of 770 the failure. If the failure is unidirectional (i.e., it is affecting 771 the direction from RB4 to RB1), then the RDI enables RB4 to become 772 aware of the unidirectional connectivity anomaly. 774 RDI allows single-sided management, where the network operator can 775 examine the state of a single MEP and deduce the overall health of a 776 monitored entity (network, flow or service). 778 4.2 On-Demand Fault Management Functions 780 On-demand fault management functions are initiated manually by the 781 network operator and continue for a time-bound period. These 782 functions enable the operator to run diagnostics to investigate a 783 defect condition. 785 4.2.1 Connectivity Verification 787 As specified in [TRILL-OAM-REQ], TRILL OAM must support on-demand 788 connectivity verification for unicast and multicast.
The connectivity 789 verification mechanism must provide a means for specifying and 790 carrying in the messages: 792 - variable-length payload/padding to test MTU-related connectivity 793 problems. 795 - test traffic patterns as defined in [RFC2544]. 797 4.2.1.1 Unicast 799 A unicast connectivity verification operation must be initiated from a 800 MEP and may target either a MIP or another MEP. For unicast, 801 connectivity verification can be performed at either Network or Flow 802 granularity. 804 Connectivity verification at the Network granularity tests 805 connectivity between a MEP on a source RBridge and a MIP or MEP on a 806 target RBridge over a representative test VLAN and for a test flow. 807 The operator must supply the source and target RBridges for the 808 operation, and the test VLAN/flow information uses pre-set values or 809 defaults. 811 Connectivity verification at the Flow granularity tests connectivity 812 between a MEP on a source RBridge and a MIP or MEP on a target 813 RBridge over an operator-specified VLAN or fine-grain label with 814 operator-specified flow parameters. 816 The above functions must be supported on sections, as defined in 817 [TRILL-OAM-REQ]. When connectivity verification is triggered over a 818 section, and the initiating MEP does not coincide with the edge 819 (ingress) RBridge, the MEP must use the edge RBridge nickname instead 820 of the local RBridge nickname on the associated connectivity 821 verification messages. The operator must supply the edge RBridge 822 nickname as part of the operation parameters. 824 4.2.1.2 Multicast 826 For multicast, the connectivity verification function tests all 827 branches and leaf nodes of a multidestination distribution tree for 828 reachability. This function should include mechanisms to prevent 829 reply storms from overwhelming the initiating RBridge. This may be 830 done, for example, by staggering the replies.
To further prevent reply 831 storms, the connectivity verification operation is initiated from a MEP 832 and must target MEPs only. MIPs are transparent to multicast 833 connectivity verification. 835 Per [TRILL-OAM-REQ], multicast connectivity verification must provide 836 the following granularity of operation: 838 A. Un-pruned Tree 840 - Connectivity verification for an un-pruned multidestination 841 distribution tree. The operator in this case supplies the tree 842 identifier (root RBridge nickname) and campus-wide diagnostic VLAN. 844 B. Pruned Tree 846 - Connectivity verification for a VLAN or fine-grain label in a given 847 multidestination distribution tree. The operator in this case 848 supplies the tree identifier and VLAN or fine-grain label. 850 - Connectivity verification for an IP multicast group in a given 851 multidestination distribution tree. The operator in this case 852 supplies the tree identifier, VLAN or fine-grain label and IP (S,G) 853 or (*,G). 855 4.2.2 Fault Isolation 857 TRILL OAM must support an on-demand connectivity fault localization 858 function. This is the capability to trace the path of a Flow on a 859 hop-by-hop (i.e., RBridge by RBridge) basis to isolate failures. This 860 involves the capability to narrow down the locality of a fault to a 861 particular port, link or node. Forward/reverse 862 path asymmetry in TRILL makes fault isolation a direction- 863 sensitive operation. That is, given two RBridges A and B, 864 localization of connectivity faults between them requires running 865 fault isolation procedures from RBridge A to RBridge B as well as 866 from RBridge B to RBridge A. Generally speaking, single-sided fault 867 isolation is not possible in TRILL OAM. 869 5. Performance Management 871 Performance Management functions can be performed both proactively 872 and on-demand.
Proactive management involves a scheduling function, 873 where the performance management probes can be triggered on a 874 recurring basis. Since the basic performance management functions 875 involved are the same, we make no distinction between proactive and 876 on-demand functions in this section. 878 5.1 Packet Loss 880 Because TRILL provides inherent support for multipoint-to- 881 multipoint connectivity, packet loss cannot be accurately 882 measured by means of counting user data packets. This is because user 883 packets can be delivered to more RBridges or more ports than are 884 necessary (e.g., due to broadcast, un-pruned multicast or unknown 885 unicast flooding). As such, a statistical means of approximating 886 packet loss rate is required. This can be achieved by sending 887 "synthetic" (i.e., TRILL OAM) packets that are counted only by those 888 ports (MEPs) that are required to receive them. This provides a 889 statistical approximation of the number of data frames lost, even 890 with multipoint-to-multipoint connectivity. 892 Packet loss probes must be initiated from a MEP and must target a 893 MEP. This function must be supported on sections, as defined in 894 [TRILL-OAM-REQ]. When packet loss is measured over a section, and the 895 initiating MEP does not coincide with the edge (ingress) RBridge, the 896 MEP must use the edge RBridge nickname instead of the local RBridge 897 nickname on the associated loss measurement messages. The operator must 898 supply the edge RBridge nickname as part of the operation parameters. 900 5.2 Packet Delay 902 Packet delay is measured by inserting time-stamps in TRILL OAM 903 packets. In order to ensure high accuracy of measurement, TRILL OAM 904 must specify the time-stamp location at fixed offsets within the OAM 905 packet in order to facilitate hardware-based time-stamping.
Hardware 906 implementations must implement the time-stamping function as close to 907 the wire as practical in order to maintain high accuracy. 909 6. Security Considerations 911 TRILL OAM must provide mechanisms for: 913 - Preventing denial of service attacks caused by exploitation of 914 the OAM message channel. 916 - Optionally authenticating communicating endpoints (MEPs and MIPs). 918 - Preventing TRILL OAM packets from leaking outside of the TRILL 919 network or outside their corresponding Maintenance Domain. This can 920 be done by having MEPs implement a filtering function based on the 921 Maintenance Level associated with received OAM packets. 923 For general TRILL Security Considerations, see [RFC6325]. 925 7. IANA Considerations 927 This document requires no IANA Actions. RFC Editor: Please delete 928 this section before publication. 930 8. Acknowledgements 932 We invite feedback and contributors. 934 9. References 936 9.1 Normative References 938 [TRILL-OAM-REQ] Senevirathne, "Requirements for Operations, 939 Administration and Maintenance (OAM) in TRILL", draft- 940 tissa-trill-oam-req, work in progress. 942 [RFC6325] Perlman, et al., "Routing Bridges (RBridges): Base 943 Protocol Specification", RFC 6325, July 2011. 945 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 946 Requirement Levels", BCP 14, RFC 2119, March 1997. 948 [RFC6136] Sajassi, A., Ed., and D. Mohan, Ed., "Layer 2 Virtual 949 Private Network (L2VPN) Operations, Administration, and 950 Maintenance (OAM) Requirements and Framework", RFC 6136, 951 March 2011. 953 [RFC2544] Bradner, S. and J. McQuaid, "Benchmarking Methodology for 954 Network Interconnect Devices", RFC 2544, March 1999. 956 [RFC6291] Andersson et al., BCP 161 "Guidelines for the Use of the 957 "OAM" Acronym in the IETF", June 2011. 959 [RFC6327] Eastlake 3rd, D., Perlman, R., Ghanwani, A., Dutt, D., and 960 V. Manral, "Routing Bridges (RBridges): Adjacency", RFC 961 6327, July 2011. 963 [TRILL-FGL] D.
Eastlake et al., "TRILL Fine-Grained Labeling", draft- 964 ietf-trill-fine-labeling, work in progress. 966 [802.1Q] "IEEE Standard for Local and metropolitan area networks - 967 Media Access Control (MAC) Bridges and Virtual Bridge 968 Local Area Networks", IEEE Std 802.1Q-2011, 31 August 969 2011. 971 [RFC6371] Busi & Allan, "Operations, Administration, and Maintenance 972 Framework for MPLS-Based Transport Networks", RFC 6371, 973 September 2011. 975 [802] "IEEE Standard for Local and Metropolitan Area Networks - 976 Overview and Architecture", IEEE Std 802-2001, 8 March 977 2002. 979 9.2 Informative References 981 [Y.1731] "ITU-T Recommendation Y.1731 (02/08) - OAM functions and 982 mechanisms for Ethernet based networks", February 2008. 984 [ISO/IEC 7498-4] "Information processing systems -- Open Systems 985 Interconnection -- Basic Reference Model -- Part 4: 986 Management framework", ISO/IEC, 1989. 988 [TRILL-BFD] V. Manral, et al., "TRILL (Transparent Interconnection of 989 Lots of Links): Bidirectional Forwarding Detection (BFD) 990 Support", draft-ietf-trill-rbridge-bfd, work in progress, 991 June 2012. 993 [TRILL-OAM] T. Senevirathne, et al., "Use of 802.1ag for TRILL OAM 994 Messages", draft-tissa-trill-8021ag, work in progress, 995 June 2012. 997 Authors' Addresses 999 Samer Salam 1000 Cisco 1001 595 Burrard Street, Suite 2123 1002 Vancouver, BC V7X 1J1, Canada 1003 Email: ssalam@cisco.com 1005 Tissa Senevirathne 1006 Cisco 1007 375 East Tasman Drive 1008 San Jose, CA 95134, USA 1009 Email: tsenevir@cisco.com 1011 Sam Aldrin 1012 Huawei Technologies 1013 2330 Central Expressway 1014 Santa Clara, CA 95050, USA 1015 Email: sam.aldrin@gmail.com 1017 Donald Eastlake 1018 Huawei Technologies 1019 155 Beaver Street 1020 Milford, MA 01757, USA 1021 Tel: 1-508-333-2270 1022 Email: d3e3e3@gmail.com