idnits 2.17.1 draft-ietf-grow-rift-01.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** Looks like you're using RFC 2026 boilerplate. This must be updated to follow RFC 3978/3979, as updated by RFC 4748. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- == No 'Intended status' indicated for this document; assuming Proposed Standard Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** There are 2 instances of too long lines in the document, the longest one being 4 characters in excess of 72. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not match the current year == Line 226 has weird spacing: '...lies in the t...' == Line 232 has weird spacing: '...ference model...' == Line 960 has weird spacing: '... Let n = n...' == Line 988 has weird spacing: '...ocating a lar...' == Line 1067 has weird spacing: '...es that passi...' == (1 more instance...) == The document seems to lack the recommended RFC 2119 boilerplate, even if it appears to use RFC 2119 keywords. (The document does seem to have the reference to RFC 2119 which the ID-Checklist requires). -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) 
-- The document date (February 2004) is 7375 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Missing Reference: 'RFC 2119' is mentioned on line 34, but not defined == Missing Reference: 'SOFTNOTIFY' is mentioned on line 481, but not defined == Missing Reference: 'L2TPV3' is mentioned on line 1047, but not defined == Missing Reference: 'REFRESH' is mentioned on line 1135, but not defined == Unused Reference: 'EXTCOMM' is defined on line 1259, but no explicit reference was found in the text == Unused Reference: 'L2TPv3' is defined on line 1284, but no explicit reference was found in the text == Unused Reference: 'RFC1958' is defined on line 1329, but no explicit reference was found in the text == Unused Reference: 'RFC3036' is defined on line 1357, but no explicit reference was found in the text == Unused Reference: 'RFC2119' is defined on line 1378, but no explicit reference was found in the text == Unused Reference: 'RFC2026' is defined on line 1382, but no explicit reference was found in the text == Unused Reference: 'RFC2028' is defined on line 1385, but no explicit reference was found in the text -- Possible downref: Non-RFC (?) normative reference: ref. 'AFI' == Outdated reference: A later version (-26) exists of draft-ietf-idr-bgp4-23 == Outdated reference: A later version (-09) exists of draft-ietf-l3vpn-bgpvpn-auto-00 ** Downref: Normative reference to an Informational draft: draft-ietf-l3vpn-bgpvpn-auto (ref. 'BGPVPN') -- Possible downref: Non-RFC (?) normative reference: ref. 'CLARK' == Outdated reference: A later version (-09) exists of draft-ietf-idr-bgp-ext-communities-06 == Outdated reference: A later version (-04) exists of draft-marques-idr-flow-spec-00 -- Possible downref: Normative reference to a draft: ref. 
'FLOW' -- Possible downref: Non-RFC (?) normative reference: ref. 'LABELRANGE' == Outdated reference: A later version (-05) exists of draft-ietf-l2vpn-l2-framework-03 ** Downref: Normative reference to an Informational draft: draft-ietf-l2vpn-l2-framework (ref. 'L2VPN') == Outdated reference: A later version (-08) exists of draft-ietf-l2vpn-signaling-00 == Outdated reference: A later version (-10) exists of draft-kompella-l2vpn-l2vpn-00 ** Downref: Normative reference to an Informational draft: draft-kompella-l2vpn-l2vpn (ref. 'L2VPNT') == Outdated reference: A later version (-15) exists of draft-ietf-l2tpext-l2tp-base-11 == Outdated reference: A later version (-17) exists of draft-ietf-pwe3-control-protocol-05 -- Possible downref: Non-RFC (?) normative reference: ref. 'MULLER1999' -- Possible downref: Normative reference to a draft: ref. 'MULTISESSION' == Outdated reference: A later version (-17) exists of draft-ietf-idr-route-filter-09 -- Possible downref: Normative reference to a draft: ref. 'RTCONST' ** Downref: Normative reference to an Experimental RFC: RFC 1075 ** Obsolete normative reference: RFC 1142 (Obsoleted by RFC 7142) ** Obsolete normative reference: RFC 1771 (Obsoleted by RFC 4271) ** Downref: Normative reference to an Informational RFC: RFC 1958 ** Obsolete normative reference: RFC 2138 (Obsoleted by RFC 2865) ** Obsolete normative reference: RFC 2460 (Obsoleted by RFC 8200) == Outdated reference: A later version (-03) exists of draft-ietf-l3vpn-rfc2547bis-00 ** Obsolete normative reference: RFC 2858 (Obsoleted by RFC 4760) ** Obsolete normative reference: RFC 3036 (Obsoleted by RFC 5036) ** Downref: Normative reference to an Informational RFC: RFC 3439 -- Possible downref: Non-RFC (?) normative reference: ref. 'SAFI' == Outdated reference: A later version (-08) exists of draft-ietf-l2vpn-vpls-bgp-01 -- Unexpected draft version: The latest known version of draft-kompella-ppvpn-l2vpn is -03, but you're referring to -04. 
-- Possible downref: Normative reference to a draft: ref. 'VPWS' -- Obsolete informational reference (is this intentional?): RFC 2028 (Obsoleted by RFC 9281) -- Obsolete informational reference (is this intentional?): RFC 2434 (Obsoleted by RFC 5226) Summary: 14 errors (**), 0 flaws (~~), 32 warnings (==), 14 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 1 INTERNET-DRAFT D. Meyer (Editor) 2 draft-ietf-grow-rift-01.txt 3 Category Informational 4 Expires: August 2004 February 2004 6 Operational Concerns and Considerations for Routing Protocol 7 Design -- Risk, Interference, and Fit (RIFT) 8 10 Status of this Document 12 This document is an Internet-Draft and is in full conformance with 13 all provisions of Section 10 of RFC2026. 15 Internet-Drafts are working documents of the Internet Engineering 16 Task Force (IETF), its areas, and its working groups. Note that 17 other groups may also distribute working documents as Internet- 18 Drafts. 20 Internet-Drafts are draft documents valid for a maximum of six months 21 and may be updated, replaced, or obsoleted by other documents at any 22 time. It is inappropriate to use Internet-Drafts as reference 23 material or to cite them other than as "work in progress." 25 The list of current Internet-Drafts can be accessed at 26 http://www.ietf.org/ietf/1id-abstracts.txt 28 The list of Internet-Draft Shadow Directories can be accessed at 29 http://www.ietf.org/shadow.html. 31 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 32 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 33 document are to be interpreted as described in RFC 2119 [RFC 2119]. 35 This document is a product of the RIFT Design Team. Comments should 36 be addressed to the authors, or the mailing list at 37 grow@lists.uoregon.edu.
39 Copyright Notice 41 Copyright (C) The Internet Society (2004). All Rights Reserved. 43 Abstract 45 The Risk, Interference, and Fit (RIFT) design team was formed to 46 document the concerns and considerations surrounding the use of 47 Internet routing protocols for functions not directly related to 48 routing of IP packets within the Internet and IP networks. This 49 document is the output of that activity. 51 Table of Contents
53 1. Introduction
54 2. Scope of this Work
55 3. Problem Statement
56 3.1. Risk, Interference, and Application Fit (RIFT)
57 3.1.1. Risk: Software Engineering
58 3.1.2. Interference: Protocol Specification/Dynamic Behavior
59 3.1.3. Application Fit: Distribution Topology
60 4. Definitions
61 4.1. Reachability Information
62 4.2. Layer 3 Routing Information
63 4.2.1. Standard Routing Information
64 4.3. Auxiliary (non-routing) Information
65 4.4. Address Family Identifier (AFI)
66 4.5. Subsequent Address Family Identifier (SAFI)
67 4.6. Network Layer Reachability
68 4.7. Application
69 4.8. Routing Protocol
70 4.9. Fate Sharing
71 5. Architectural Models
72 5.1. General Purpose Transport Infrastructure (GPT) Model
73 5.2. Special Purpose Transport Infrastructure (SPT) Model
74 6. Analyzing Risk and Interference
75 6.1.
Risk: Code Impact, and Resource Sharing
76 6.1.1. Code Impact
77 6.1.2. Resource Sharing
78 6.1.2.1. Resource Sharing and Operating System Level Issues
79 6.2. Interference
80 7. GPT and SPT Models: Risk and Interference
81 7.1. Risk
82 7.1.1. Code Impact
83 7.1.2. Resource Sharing
84 7.1.3. Multisession BGP
85 7.2. Interference
86 7.2.1. Multisession BGP
87 8. Application Fit
88 8.1. RFC 2547 Style VPNs
89 8.1.1. RFC 2547 and Label Distribution
90 8.2. VPWS
91 8.2.1. Assertion #1
92 8.2.2. Counter-Assertion #1
93 8.2.3. Assertion #2
94 8.2.4. Counter-Assertion #2
95 8.2.4.1. Assertion #2a
96 8.2.4.2. Counter-Assertion #2a
97 8.2.5. Assertion #3
98 8.2.6. Counter-Assertion #3
99 8.3. VPWS and Per-Wire Attributes
100 8.3.1. Assertion #4
101 8.3.2. Counter-Assertion #4:
102 8.3.3. Assertion #5
103 8.3.4. Counter-Assertion #5
104 8.3.5. Assertion #6
105 8.3.6. Counter-Assertion #6
106 8.3.7. Assertion #7:
107 8.3.8. Counter-Assertion #7:
108 8.4. VPLS
109 8.4.1. Assertion #8
110 8.4.2. Counter-Assertion #8
111 8.4.3. Assertion #9
112 8.4.4. Counter-Assertion #9
113 9. Operational Implications
114 9.1. OAM Functionality
115 9.1.1. Assertion #10:
116 9.1.2. Counter-Assertion #10:
117 9.2. Full-Mesh Issues
118 10. Conclusions and Recommendations
119 11. Intellectual Property
120 12. Design Team
121 13. Acknowledgments
122 14. Security Considerations
123 15. IANA Considerations
124 16. References
125 16.1. Normative References
126 16.2. Informative References
127 17. Editor's Address
128 18. Full Copyright Statement
130 1.
Introduction 132 The stability of the global Internet routing system has been the 133 subject of much research (see e.g., [RVBIB]) and discussion on 134 various IETF mailing lists [IETFOL]. Much of the research into the 135 routing system has centered around the analysis of the dynamics and 136 stability of the Border Gateway Protocol Version 4 [BGP] (hereafter 137 referred to as BGP). 139 However, while the theoretical properties of BGP remain a topic of 140 great interest, a more recent discussion has focused on the effects of 141 the addition of new types of Network Layer Reachability Information, 142 or NLRI, to BGP. In particular, the advent of two BGP attributes, 143 Multiprotocol Reachable NLRI (MP_REACH_NLRI) and Multiprotocol 144 Unreachable NLRI (MP_UNREACH_NLRI) [RFC2858], has made it possible 145 to encode and transport a wide variety of features and their 146 associated signaling using the BGP transport infrastructure. Examples 147 include IPv6 [RFC2460], flow specification rules [FLOW], IP 148 VPNs [RFC2547BIS], Virtual Private LAN services [VPLS], Virtual 149 Private Wire Service [VPWS], and auto-discovery mechanisms for VPNs 150 in general [BGPVPN]. 152 This document outlines the concerns and issues surrounding using the 153 BGP infrastructure as a generic feature and signaling transport. 154 However, similar concerns apply to the Interior Gateway Protocols 155 (IGPs) in common use (e.g., ISIS [RFC1142] or OSPF [RFC2328]). 157 The rest of this document is organized as follows: Section 2 outlines 158 the scope of this work. Section 3 introduces the problem statement 159 which is the focus of this document, section 4 provides definitions, 160 and section 5 outlines the main architectural models that are 161 discussed. The remaining sections discuss the implications of 162 those models. 164 2.
Scope of this Work 166 It is the intention of the RIFT design team that this document serve 167 as a guide for both protocol designers and network operators. The 168 goal is to outline the implications associated with employing 169 existing routing protocols to enable additional feature sets and 170 functionality, as contrasted with designing new mechanisms to carry 171 those feature sets and functionalities. 173 The issues, concerns and considerations discussed in this document 174 focus on the implications for BGP [BGP,RFC1771]. It is important to 175 note that similar issues will arise when considering generalizations 176 to the information that the IGPs carry. 178 3. Problem Statement 180 The advent of the MP_REACH_NLRI and MP_UNREACH_NLRI attributes, 181 combined with the resulting generalization to the BGP infrastructure, 182 has created the opportunity to use BGP to transport a wide variety 183 of data types and their associated signaling. The combination of a 184 BGP data type and its associated signaling is frequently called an 185 "application"; example applications include the IPv4 and IPv6 186 [RFC2460] routing systems, flow specification rules [FLOW], auto- 187 discovery mechanisms for Layer 3 VPNs [BGPVPN], virtual private LAN 188 services [VPLS], and virtual private wire service [VPWS]. 190 More recently, the discussion in the IETF community has focused on 191 the use of BGP as a generalized feature transport infrastructure 192 [IETFOL]. The debate has recently intensified due to the emergence of 193 a new class of application that uses the BGP infrastructure to 194 distribute information that is not directly related to inter-domain 195 routing. Examples of such applications include the use of the BGP 196 transport infrastructure to provide auto-discovery for IP VPNs 197 [RFC2547BIS], the virtual private LAN services mentioned above [VPLS] 198 and VPNs in general [BGPVPN]. 200 3.1.
Risk, Interference, and Application Fit (RIFT) 202 As mentioned above, much of the debate surrounding these new uses of 203 the BGP transport infrastructure has focused on the potential 204 tradeoffs between the stability of the Internet routing system, as 205 affected by the deployment of new applications, and the desire on the 206 part of service providers to rapidly deploy these new applications, 207 and to reduce the operational cost by re-using existing protocols. 209 These tradeoffs have at times been described in terms of risk, 210 interference, and application fit. Risk models the software 211 engineering impact of new applications on a generic implementation, 212 while interference models the impact of new applications on protocol 213 definition and behavior. Finally, application fit models the 214 similarity between an application's data and signaling requirements 215 and a specific distribution algorithm. Each is described below. 217 3.1.1. Risk: Software Engineering 219 Risk attempts to assess the robustness tradeoffs inherent in the 220 addition of new applications to a given implementation. That is, risk 221 models the impact of generic software engineering issues on a given 222 implementation. These issues include the impact of new applications 223 on existing implementations and on the fate sharing properties of 224 those implementations. 226 A second aspect of risk lies in the trade-off of extending an 227 existing protocol versus designing, implementing, and deploying a new 228 protocol. 230 3.1.2. Interference: Protocol Specification/Dynamic Behavior 232 Interference models the potential for a new application to adversely 233 affect the operation of an existing implementation at the protocol 234 level, by inadvertently introducing a detrimental dependency of some 235 kind.
That is, an application is said to "interfere" with an existing 236 application if, by virtue of the application's protocol extension(s), 237 one or more fundamental properties of the protocol's operation are 238 detrimentally altered. For example, could we create a new state which 239 introduces an unanticipated deadlock situation? Or could we 240 destabilize the distributed behavior of the protocol? Or might we 241 simply run out of the attributes or bits available (as happened, for 242 example, with RADIUS [RFC2138])? 244 3.1.3. Application Fit: Distribution Topology 246 Application fit refers to how closely the requirements of the data to 247 be distributed match the underlying capabilities of a distribution 248 mechanism. For example, it is clearly inefficient to broadcast to 249 all peers data that is only required between two peers, just as it is 250 inefficient to unicast (replicate) data that is required by all peers 251 when a single broadcast would do. 253 4. Definitions 255 4.1. Reachability Information 257 Reachability information refers to information describing some part 258 of a network, along with how one can reach it, and perhaps also 259 containing attributes of the implied path to the network locale. 260 Typically, this information pertains to IP routing information; an 261 example of non-IP reachability is VPLS information [VPLS]. 263 4.2. Layer 3 Routing Information 265 Layer 3 routing information represents either link state information 266 or network reachability information. Link state information 267 represents Layer 3 adjacencies and topology. Link state routing 268 protocols, such as OSPF [RFC2328] and ISIS [RFC1142], flood link 269 state information throughout an IGP domain, so that each 270 participating router maintains an identical copy of a database that 271 is computed to reflect the complete Layer 3 topology.
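As an illustration of the flooding behavior described for link state protocols, the following minimal sketch floods one record per router over a small, hypothetical four-router topology and checks that every router ends up holding the same database. It is a simplification under stated assumptions: there are no sequence numbers, aging, acknowledgments, or areas as in OSPF or IS-IS, and the names flood_link_state, topology, and databases are illustrative only and are not taken from any specification.

```python
from collections import deque


def flood_link_state(adjacency):
    """Flood one link-state record per router; return each router's database.

    adjacency maps a router name to the list of its neighbors (assumed to be
    symmetric and connected). This is a toy model, not OSPF or IS-IS.
    """
    databases = {router: {} for router in adjacency}
    for origin, neighbors in adjacency.items():
        # The record describes the originating router's own adjacencies.
        record = frozenset(neighbors)
        databases[origin][origin] = record
        queue = deque([origin])
        while queue:
            sender = queue.popleft()
            for peer in adjacency[sender]:
                # A router stores and re-floods a record only the first
                # time it sees it, which is what terminates the flood.
                if origin not in databases[peer]:
                    databases[peer][origin] = record
                    queue.append(peer)
    return databases


# Hypothetical topology: a square A-B-D-C-A.
topology = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C"],
}

databases = flood_link_state(topology)
identical = all(db == databases["A"] for db in databases.values())
print(identical)  # prints True for this connected topology
```

The point of the sketch is the distribution pattern rather than the route computation: every participant receives every record, which suits topology data that all routers need and is a poor match for data needed by only a few of them.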
273 Layer 3 reachability information expressed as an IP address prefix 274 represents the set of destinations (systems) whose IP addresses are 275 contained in the IP address prefix. Distance/path vector routing 276 protocols, such as BGP, distribute Layer 3 reachability information 277 among routing domains. 279 Routers use both types of Layer 3 routing information (link state and 280 reachability) to produce IP forwarding tables. That is, for purposes 281 of this discussion, "routing information" relates to the Layer 3 282 inter-domain routing data traditionally carried by BGP. 284 Finally, if one defines routing information as "information used to 285 forward packets", combined with the above definition of reachability 286 information, then we can consider information such as described in 287 [FLOW] (for example) to be routing information (since it is 288 attempting to add a level of granularity to how an 'aggregate' is 289 defined). That is, [FLOW] intends to complement the existing 290 routing information, and the flow information is dependent on IPv4 291 unicast reachability advertised by the same neighbor. 293 4.2.1. Standard Routing Information 295 In the most general terms, then, a routing protocol distributes data 296 to accomplish the following three functionalities: 298 (i). To govern the routing decision process (e.g., the 299 standard BGP decision process) 301 (ii). To constrain the flow of information (for example, with 302 BGP communities) 304 (iii). To tell the recipient how to get packets to the next hop 306 We will refer to information that falls into this class as "standard 307 routing information". 309 4.3. Auxiliary (non-routing) Information 311 Auxiliary Information is any information that is exchanged by routers 312 which is neither Layer 3 routing information, nor reachability 313 information. IS-IS hostname TLVs are an example of auxiliary 314 information [RFC1142]. 316 4.4.
Address Family Identifier (AFI) 318 An Address Family contains addresses that share common structure and 319 semantics. An Address Family Identifier (AFI) uniquely identifies 320 each address family. Several routing protocol messages contain a 321 field that represents the AFI. The AFI identifies the address type 322 used by another data item contained in that message. The Routing 323 Information Protocol (RIP) [RFC2453], Distance Vector Multicast 324 Routing Protocol (DVMRP) [RFC1075], and BGP all employ the AFI field. 326 For example, the BGP MP_REACH_NLRI and MP_UNREACH_NLRI attributes 327 contain an AFI field. These BGP attributes also contain an NLRI field 328 that enumerates reachable or unreachable subnetworks corresponding to 329 the associated address family. The AFI field indicates the address 330 type by which reachable subnetworks are identified. When BGP is used 331 to distribute Layer 3 routing information, AFIs can indicate the 332 following address types: IPv4, IPv6, and VPNv4 [RFC2547BIS]. When BGP is 333 used to distribute auxiliary information, AFIs can indicate other 334 address families. 336 4.5. Subsequent Address Family Identifier (SAFI) 338 A Subsequent Address Family Identifier (SAFI) is part of the BGP 339 MP_REACH_NLRI and MP_UNREACH_NLRI attributes. These BGP attributes 340 also contain an NLRI field that enumerates reachable or unreachable 341 subnetworks. The SAFI augments the AFI, carrying additional 342 information regarding networks enumerated in the NLRI field. 344 4.6. Network Layer Reachability 346 Network Layer Reachability Information, or NLRI, is the data described 347 by the AFI/SAFI fields [AFI,SAFI]. While these concepts were 348 originally described for protocols such as DVMRP [RFC1075], the bulk 349 of the generalization of the NLRI described in this document derives 350 from the introduction of the MP_REACH_NLRI and MP_UNREACH_NLRI 351 attributes to BGP [RFC2858]. 353 4.7.
Application 355 The term application is used in this document to refer to the 356 combination of a BGP data type and any signaling data that is carried 357 by BGP in support of the service the data type carries. The data type 358 is typically described in an AFI/SAFI, while the actual data is 359 frequently contained in both NLRI and BGP community attributes 360 [RFC1997]. 362 4.8. Routing Protocol 364 A routing protocol is composed of two basic components: a data 365 distribution algorithm and a decision algorithm. A router typically 366 obtains Layer 3 routing information via its data distribution 367 algorithm, and it uses this information to produce an IP forwarding 368 table (by applying the protocol's decision algorithm to the received 369 routing data). Note that it is the use of BGP's data distribution 370 algorithm that is the focus of this document. However, when judging 371 application fit, one may also consider whether the decision 372 algorithms suit the application. 374 4.9. Fate Sharing 376 The fate sharing principle for end to end network protocols was first 377 enunciated by Dave Clark [CLARK]. As applied to software systems, 378 fate sharing refers to the sharing of common resources among a group 379 of applications. In our case, the particular "fate" of most interest 380 is the ability of one application, call it application A, to cause an 381 application with which it is fate sharing, call it application B, to 382 experience one or more faults due to faults in application A. Fate- 383 sharing can exist at many levels, including between modules on a 384 system, between routing protocols, between sessions of a routing 385 protocol such as BGP, or between applications within a routing 386 protocol. 388 5. Architectural Models 390 In this section, we consider the two architectural models which are 391 motivated by salient questions considered in this document, namely: 393 (i).
Does the BGP distribution protocol suit a particular 394 application (i.e., does an application fit the BGP 395 distribution protocol)? 397 (ii). What are the effects on the global routing system (if 398 any) of carrying that application using the BGP distribution 399 protocol? 401 These questions must be analyzed in terms of the cost of protocol and 402 code development, as well as in terms of the operational expense that 403 may be incurred by utilizing (or not utilizing) the mechanisms 404 already present in BGP. 406 Two models, describing alternate viewpoints, are examined in the 407 following sections. 409 5.1. General Purpose Transport Infrastructure (GPT) Model 411 The GPT model models the BGP data distribution infrastructure as a 412 generic application transport mechanism. As such, it focuses on 413 application fit, and assumes that the tradeoffs, in terms of both 414 risk and interference, can be managed in an efficient manner. As a 415 result, the GPT model frames these issues not in terms of whether the 416 application and signaling data that need to be distributed are part 417 of some particular class (routing, in this case), but rather whether 418 the requirements for the distribution of these attributes are similar 419 enough to the distribution mechanisms of BGP. In those cases when 420 distribution requirements are sufficiently similar, BGP can be a 421 logical candidate for a transport infrastructure. Note that this is 422 not because of the nature of information distributed, but rather due 423 to the similarity in the transport requirements. There are of course 424 other operational considerations that make BGP a logical candidate, 425 including its close to ubiquitous deployment in the Internet (as well 426 as in intra-nets), its policy capabilities, and operator comfort 427 levels with the technology. 429 5.2.
Special Purpose Transport Infrastructure (SPT) Model 431 The SPT model, on the other hand, models the BGP infrastructure as a 432 special purpose transport designed specifically to transport inter- 433 domain routing information. As such, it is more sensitive to risk and 434 interference than to application fit. 436 There are two basic arguments supporting the SPT model: The first is 437 based on the perceived risk profile involved in adding new 438 applications to the BGP transport infrastructure or new features to 439 existing BGP applications. The concern here is that changes to BGP 440 implementations will cause software quality to degrade, and hence 441 destabilize the global routing system. This position is based upon 442 well understood software engineering principles, and is strengthened 443 by long-standing experience that there is a direct correlation 444 between software features and software stability [MULLER1999]. This 445 concern is augmented by the fact that in many cases, the existence of 446 the code for these features, even if unused, can also cause 447 destabilization in the routing system, since in many cases software 448 faults cannot be isolated. 450 A second concern is based on interference arguments, notably that the 451 increase in complexity of BGP due to the number of data types that it 452 carries can also potentially destabilize the global routing system. 454 This concern rests on a wide range of considerations, including the fact 455 that the interaction of BGP dynamics and current deployment practices 456 is poorly understood, and that the addition of non-routing data 457 types may adversely affect convergence and other scaling properties 458 of the global routing system. 460 6. Analyzing Risk and Interference 462 One way to frame the tradeoffs involved in a model's risk profile is 463 in terms of the software engineering issues surrounding where an 464 implementation might demultiplex among applications.
The important 465 point here is that an implementation's choice of demultiplexing point 466 directly affects the implementation's risk profile due to its effects 467 on existing code, and on the system resources it requires to be 468 shared among those applications. 470 6.1. Risk: Code Impact, and Resource Sharing 472 For purposes of this discussion, then, we consider the risk profile 473 of the SPT and GPT models with respect to their application 474 demultiplexing point. The GPT model typically provides a single point 475 for demultiplexing all applications (i.e., the AFI/SAFI). On the 476 other hand, the SPT model provides an application demultiplexing 477 point above BGP (typically at the TCP port level). That is, in the 478 GPT model, applications typically share a common transport session, 479 while the SPT model generally envisions one or more applications per 480 transport session (see section 7.1.3 for a discussion of the impact 481 of multisession BGP [MULTISESSION,SOFTNOTIFY] on this taxonomy). 483 Finally, note that these models can have very different risk profiles 484 with respect to code impact and resource sharing. Some of the 485 questions relating to risk assessment are considered below. 487 6.1.1. Code Impact 489 In this section, we outline the high-level questions one might ask in 490 assessing the difference in risk between the GPT model and the SPT model 491 based on their effect on an existing code base. 493 o Does the code below the demultiplexing point need to be 494 changed when a new application is added? 496 o Does the code in existing applications have to be changed when 497 a new application is added (that is, to what extent are the 498 applications decoupled)? 500 o Can the code in separate applications be developed, tested, 501 released, debugged and packaged independently from other 502 applications? 504 o Is there significant code below the demultiplexing point that 505 can be shared among all applications? 507 6.1.2.
Resource Sharing 509 In this section, we outline the high-level questions one might ask in 510 assessing the difference in risk between the GPT model and the SPT model 511 with respect to the requirements and properties of the system 512 resource sharing they require. In particular: 514 o Do applications have to compete for socket buffers, and hence 515 have the potential to block or starve each other (at the TCP 516 port level)? 518 o Do applications have to compete for possible protocol-level 519 transport-related buffers and queues, and hence have the 520 potential to starve or block each other at the protocol 521 send/receive level? 523 o Do applications have to compete for a possible per-connection 524 processing time budget, and hence have the potential to starve 525 each other at the intra-process scheduling level? 527 6.1.2.1. Resource Sharing and Operating System Level Issues 529 In this section, we outline the high-level questions one might ask in 530 assessing the difference in risk between the GPT model and the SPT model 531 based on the effect on resource sharing at the operating system 532 level. In particular: 534 o Do applications share a common scheduling context? That is, 535 do applications have to compete for per-process scheduling 536 budgets? 538 o What is the degree of fate sharing between applications? 540 6.2. Interference 542 Interference models the potential for an application to affect the 543 behavior of an existing application or applications. For example, in 544 the case of the Internet routing system, one might ask if a certain 545 application "interferes" with IPv4 Unicast routing by affecting some 546 aspect of its protocol operation (e.g., convergence time). 548 Interference in the Internet routing system has its roots in the 549 observation that the routing system itself can be described as highly 550 self-dissimilar, with extremely different scales and levels of 551 abstraction.
Complex systems with this property are susceptible to 552 "coupling", which RFC 3439 [RFC3439] defines as follows: 554 The Coupling Principle states that as things get larger, they 555 often exhibit increased interdependence between components. 557 COROLLARY: The more events that simultaneously occur, the larger 558 the likelihood that two or more will interact. This phenomenon 559 has also been termed "unforeseen feature interaction" 560 [WILLINGER2002]. 562 That is, interference, if and where it occurs, has its roots in 563 complexity and is frequently the result of application coupling. 565 7. GPT and SPT Models: Risk and Interference 567 In this section, we analyze the risk and interference profiles of the 568 SPT and GPT models. 570 7.1. Risk 572 As mentioned above, risk models the robustness tradeoffs around 573 generic software architecture and engineering associated with 574 protocol implementations, including the impact on existing protocol 575 implementations, and on the fate sharing properties of those 576 implementations. In the following sections we consider these 577 components of risk for both the GPT and SPT models. 579 7.1.1. Code Impact 581 In this section, we outline the answers to the questions posed above. 583 o Does the code below the demultiplexing point need to be 584 changed when a new application is added? 586 In theory, such code changes are unlikely to be required in 587 the SPT model, as the SPT model envisions that a new 588 application will have a new demultiplexing point (port). 590 The GPT model does not by definition require new code below 591 the demultiplexing point either. Specifically, it should in 592 theory be possible to isolate code below the demultiplexing 593 point with suitable abstraction and constructs such as 594 AFI/SAFI API registries. 596 o Does the code in existing applications have to be changed when 597 a new application is added (that is, to what extent are the 598 applications decoupled)?
600 The SPT model envisions application independence with respect to 601 the demultiplexing point. As such, it is unlikely to require such 602 changes. However, it is important to note that good software 603 engineering practices encourage code reuse and construction of 604 general purpose libraries. As a result, if applications share 605 libraries and/or other code, the practical independence 606 decreases, and consequently risk increases. The same analysis 607 can be made for the GPT model, since in this case we are already 608 demultiplexing on the AFI/SAFI fields. 610 o Can the code in separate applications be developed, tested, 611 released, debugged and packaged independently from other 612 applications? 614 While this is theoretically possible in the SPT model (and 615 possibly more difficult in the GPT model), practice and 616 experience have shown that achieving this type of independence is 617 difficult in either model. 619 7.1.2. Resource Sharing 621 In this section, we address the questions raised above to assess the 622 difference in risk between the GPT model and the SPT model based on their 623 effect on resource sharing. 625 o Do applications have to compete for socket buffers, and hence 626 have the potential to block or starve each other (at the TCP port 627 level)? 629 The SPT model does not require applications to compete for 630 socket level resources. It should also be possible to achieve 631 this type of application independence in the GPT model with 632 multisession BGP. 634 o Do applications have to compete for possible protocol-level 635 transport-related buffers and queues, and hence have the 636 potential to starve or block each other at the protocol 637 send/receive level? 639 Again, while the SPT model does not require competition for 640 transport-level resources, it should be possible to achieve 641 similar behavior with multisession BGP.
643 o Do applications have to compete for a possible per-connection 644 processing time budget, and hence have the potential to starve 645 each other at the intra-process scheduling level? 647 Applications written to the SPT model should not require 648 this type of resource competition. It should also be possible to 649 reduce this type of resource competition with multisession BGP. 651 o Do applications have to compete for resources within the 652 network (e.g., bandwidth), when the protocol session spans 653 multiple hops? 655 Neither the SPT model nor the GPT model (again, with 656 multisession BGP) should require competition for network 657 resources in this case. 659 7.1.3. Multisession BGP 661 Suppose that one makes the simplifying assumption that a GPT 662 implementation's risk profile is dominated by the probability that an 663 error in one AFI/SAFI stream will cause some subset of the other 664 AFI/SAFI streams to malfunction (e.g., reset). In this case, risk 665 might be characterized as a function of the model and the number of 666 AFI/SAFIs carried. Given this simplification, the risk profile looks 667 loosely like 669 Risk = f(Model, |{AFI,SAFI}|) 671 where 673 f:{GPT, SPT} X |{AFI, SAFI}| -> N 675 Note that we assume that 677 f(SPT,n) = O(f(GPT,n)) 679 where 681 O(f) = {g:N->R | there exist c > 0 and n0 such that g(n) <= c*f(n) for all n >= n0} 683 That is, the SPT risk profile is bounded by the GPT risk 684 profile. Clearly, the existence of such an upper bound is an integral 685 aspect of any argument favoring the SPT model. 687 Note that for the SPT model, we can think of the number of AFI/SAFIs 688 that a single session carries as a small constant, call it k. k will 689 typically be small (close to 1), since by definition the SPT model 690 envisions a small number of AFI/SAFIs per session (e.g., for AFI/SAFI 691 IPv4/unicast and IPv6/unicast, k = 2).
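As a rough illustration only, the simplified risk function above can be sketched by counting how many other AFI/SAFI streams share fate with a stream that suffers an error. The Python sketch below is not part of either model: the function names (blast_radius, worst_case), the AFI/SAFI label strings, and the choice of "number of other streams reset" as the value of f are all assumptions made for this example.

```python
from typing import List


def blast_radius(sessions: List[List[str]], faulty: str) -> int:
    """Count the OTHER AFI/SAFI streams that share a session with `faulty`.

    Assumption for this sketch: an error in one stream resets its whole
    transport session, and therefore every stream carried in that session.
    """
    for streams in sessions:
        if faulty in streams:
            return len(streams) - 1
    raise ValueError(f"unknown AFI/SAFI stream: {faulty}")


def worst_case(sessions: List[List[str]]) -> int:
    """Largest blast radius over all sessions (a stand-in for f)."""
    return max(len(streams) - 1 for streams in sessions)


# Hypothetical stream labels, used only to make the example readable.
afi_safi = ["ipv4/unicast", "ipv6/unicast", "vpnv4/unicast", "l2vpn/vpls"]

# GPT-like layout: every stream shares one transport session (n = 4).
gpt = [afi_safi]

# SPT-like layout: a small number of streams per session (k <= 2).
spt = [["ipv4/unicast", "ipv6/unicast"], ["vpnv4/unicast"], ["l2vpn/vpls"]]

print(worst_case(gpt))  # 3
print(worst_case(spt))  # 1
```

Under these assumptions the GPT-like layout grows with the number of streams n, while the SPT-like layout stays bounded by k - 1, which is the shape of the bound f(SPT,n) = O(f(GPT,n)) assumed in the text.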
693 When formulated in this way, one can see that one objective of 694 multisession BGP is to find a value, call it g, such that 696 f(GPT, g) ~ f(SPT,k), for small values of k (i.e., k close to 1) 698 where 700 A(n) ~ B(k) ==> A(n) = B(k) + h(n), h(n) >= 0 702 That is, A(n) is approximately equal to B(k). 704 In this case, g is the size of the multisession AFI/SAFI grouping, 705 and for small values of g, multisession BGP can have a risk profile 706 that looks very much like the SPT risk profile. In particular, for g 707 = 1, both models would have similar risk profiles. Of course, there 708 are many other components of risk that are not considered by 709 this analysis, such as collateral issues resulting from the existence 710 of faulty shared code, operating system process and memory structure, 711 etc. 713 7.2. Interference 715 Interference concerns stem from the possibility that application 716 coupling can lead to the destabilization of the Internet routing 717 system in unanticipated and unexpected ways. In this section we 718 consider interference properties of the GPT and SPT models. 720 7.2.1. Multisession BGP 722 Multisession BGP also seeks to reduce the interference profile of the 723 GPT model by eliminating one potential source of interference, 724 namely, the potential interference due to the presence of multiple 725 AFI/SAFIs in a single BGP session. Following the analysis presented 726 in section 7.1.3, we can see that for small groupings (described as 727 small values of g in section 7.1.3), the interference profiles of 728 both models converge. 730 8.
Application Fit 732 In the following sub-sections, application fit is examined from the 733 perspective of analyzing the signaling and data distribution needs of 734 three representative applications, namely: 736 RFC 2547 Style VPNs 737 VPWS 738 VPLS 740 However, before investigating how the BGP data distribution mechanism 741 (and its extensions) fit the requirements of these applications, it 742 is useful to briefly review the gross characteristics of the BGP data 743 distribution infrastructure. In particular, we examine which 744 distribution topologies can be naturally built using internal BGP (or 745 iBGP). 747 iBGP has been described loosely as a broadcast mechanism since an 748 iBGP speaker sends information to all its peers. This is typically 749 achieved by means of one or more route reflectors (or RRs); a more 750 direct but less scalable means is for each iBGP speaker to have a BGP 751 session with each iBGP peer. It may, however, be more accurate to 752 characterize iBGP as a constrained broadcast mechanism. This is 753 because the use of communities in conjunction with import and export 754 policies allows an iBGP speaker to effectively limit its 755 communication to a subset of the full set of iBGP peers; the 756 efficiency of constrained broadcast can be improved by techniques 757 such as described in [ORF] and [RTCONST]. 759 8.1. RFC 2547 Style VPNs 761 There are five classes of information that need to be distributed for 762 RFC 2547 style VPNs: 764 (a). Membership (auto-discovery) 765 (b). Prefixes 766 (c). Labels 767 (d). BGP nexthop, and 768 (e). Path selection attributes 770 The first of these, membership or auto-discovery, must be sent to all 771 peers, as a BGP speaker does not know a priori which of its peers are 772 members of a given VPN. Membership of a given VPN is recognized by 773 the use of extended communities called Route Targets. BGP is well- 774 suited for this mode of distribution. 
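The constrained broadcast behavior described above, in which a route is offered to every peer but only VPN members keep it, can be sketched as follows. This is a minimal illustration in Python and not a description of any BGP implementation: the Route and Speaker classes, the broadcast helper, and the Route Target strings are hypothetical names chosen for the example, and import policy is reduced to a simple set intersection.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List, Set


@dataclass(frozen=True)
class Route:
    prefix: str
    route_targets: FrozenSet[str]  # Route Targets attached on export


@dataclass
class Speaker:
    name: str
    import_targets: Set[str]       # Route Targets this speaker imports
    accepted: List[Route] = field(default_factory=list)

    def receive(self, route: Route) -> bool:
        """Keep the route only if it carries at least one imported target."""
        if self.import_targets & route.route_targets:
            self.accepted.append(route)
            return True
        return False


def broadcast(route: Route, peers: List[Speaker]) -> List[str]:
    """Offer the route to every peer; return the names of those that kept it."""
    return [peer.name for peer in peers if peer.receive(route)]


peers = [
    Speaker("PE1", {"target:65000:1"}),
    Speaker("PE2", {"target:65000:2"}),
    Speaker("PE3", {"target:65000:1", "target:65000:2"}),
]
route = Route("10.1.0.0/16", frozenset({"target:65000:1"}))

members = broadcast(route, peers)
print(members)  # ['PE1', 'PE3']
```

In this sketch every peer still receives the route and discards it locally; mechanisms such as those in [ORF] and [RTCONST] aim to avoid sending it to non-members in the first place.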
776 The next three of these constitute the reachability information. 777 They say what part of a given VPN (b) is reachable, and how it is to 778 be reached (c and d). The final piece of information is used for 779 path selection if there are multiple paths to a given prefix of a VPN, as 780 in the case of multi-homing. All of these pieces of information need 781 only be distributed to members of the VPN, i.e., they require a 782 constrained broadcast mechanism. BGP is reasonably well-suited for 783 this mode of distribution using import and export NLRI filtering. 784 The addition of the mechanism in [RTCONST] makes BGP even better 785 suited to this. 787 The encoding of this information as defined in [RFC2547BIS] puts all 788 of this information in a single NLRI. This seems to imply that a 789 broadcast mechanism has to be used for the distribution of RFC 2547 790 VPN information. However, the combination of [RTCONST] and [RFC2918] 791 allows BGP to distribute this information correctly yet efficiently. 793 In summary, there seems to be little argument that the RFC 2547 794 application is a routing application. This is because the information 795 that gets sent via BGP in RFC 2547 is generally considered to be 796 "routing information". That is, the protocol distributes address 797 prefixes, along with their next hops (and of course, some additional 798 attributes). Finally (and perhaps most importantly), there seems to 799 be little argument that the information distributed by the RFC 2547 800 application is standard routing information. 802 8.1.1. RFC 2547 and Label Distribution 804 One issue that is frequently raised with respect to whether or not 805 the RFC 2547 VPN application is a routing application surrounds the 806 fact that, in the 2547 application, BGP distributes MPLS labels along 807 with the routes. The contention, then, is that the RFC 2547 808 application represents more than just a routing application.
However, 809 in this case the MPLS label is just a shorthand way of representing 810 one or more address prefixes. That is, the assertion is that in this 811 case, the label represents "standard routing information". 813 8.2. VPWS 815 The question of whether VPWS is a "good fit" for the BGP transport 816 infrastructure is the source of much discussion (and controversy). In 817 this section, we will review both positions and their supporting 818 arguments as a series of assertions and counter-assertions (we will 819 use this format throughout the rest of this section). 821 The key debate with respect to VPWS centers around what set of 822 services is being defined, and how they are to be signaled. One way 823 to analyze the VPWS application, then, is in terms of two of its more 824 contentious functionalities, namely: 826 (a). Auto-discovery 828 Auto-discovery refers to discovery of the set of nodes 829 that belong in a common L2VPN, and 831 (b). Signaling 833 Signaling refers to the setup and maintenance of the 834 point-to-point pseudo-wires that carry the traffic of 835 the L2VPN. 837 The next sections examine the various assertions and counter- 838 assertions around auto-discovery and signaling for VPWS. 840 8.2.1. Assertion #1 842 Assertion #1 states that VPWS is not a routing application. Those 843 supporting this assertion argue that in the case of VPWS, we are not 844 distributing address prefixes, and (importantly) unlike the case of 845 RFC 2547 style VPNs, the BGP decision process is not used (or at 846 least it is not used in the same way). Further, proponents argue that 847 what we are distributing is state information that corresponds to 848 point-to-point entities, i.e., pseudo-wires, and thus argue 849 that the VPWS application is completely different. 851 8.2.2. Counter-Assertion #1 853 Counter-Assertion #1 states that VPWS is a routing application.
More 854 specifically, this position is outlined in [VPLS] (section 3.4), 855 namely: 857 "It is often desired to multi-home a VPLS site, i.e., to connect 858 it to multiple PEs, perhaps even in different ASes. In such a 859 case, the PEs connected to the same site can either be 860 configured with the same VE ID or with different VE IDs. In the 861 latter case, it is mandatory to run STP on the CE device, and 862 possibly on the PEs, to construct a loop-free VPLS topology. 864 In the case where the PEs connected to the same site are 865 assigned the same VE ID, a loop-free topology is constructed by 866 routing mechanisms, in particular, by BGP path selection. When a 867 BGP speaker receives two equivalent NLRIs (see below for the 868 definition), it applies standard path selection criteria such as 869 Local Preference and AS Path Length to determine which NLRI to 870 choose; it MUST pick only one. 872 If the chosen NLRI is subsequently withdrawn, the BGP speaker 873 applies path selection to the remaining equivalent VPLS NLRIs to 874 pick another; if none remain, the forwarding information 875 associated with that NLRI is removed." 877 8.2.3. Assertion #2 879 Assertion #2 states that auto-discovery for VPWS requires some form 880 of constrained broadcast. There doesn't seem to be much controversy 881 that auto-discovery does require some sort of constrained broadcast 882 mechanism (which we don't want to be limited to a single AS), and we 883 may want to be able to optimize it by using a RP (rendezvous point) 884 like mechanism. BGP route reflectors (RR) provide a convenient and 885 ubiquitously deployed candidate RP. In this case (RR as RP), the fit 886 is good since auto-discovery, like routing, requires an n-party 887 protocol where each party has no a priori knowledge of the existence 888 or identity of the other n-1 parties. 890 8.2.4. 
Counter-Assertion #2 892 There is no real counter-position to Assertion #2, as it simply 893 states that VPWS auto-discovery requires some form of constrained 894 broadcast (about which there is some controversy; see Assertion #2a 895 below). 897 8.2.4.1. Assertion #2a 899 Assertion #2a states that auto-discovery is not needed for VPWS. 900 Specifically, it holds that there is no validated need 901 for VPWS auto-discovery, since auto-discovery is useful only when 902 creating full mesh layer 2 topologies, which are undesirable due to 903 their (well-understood) poor scaling properties; hence auto-discovery 904 for VPWS is not useful. 906 8.2.4.2. Counter-Assertion #2a 908 911 In summary, with the exception of Assertion #2a, the major 912 controversy surrounding VPWS is in the signaling piece of the 913 application. The "VPWS is not a routing application" camp argues that 914 the VPWS signaling requirements do not fit the BGP distribution 915 infrastructure, while the "VPWS is a routing application" camp 916 believes that BGP is a good fit. The next sections examine these 917 assertions. 919 8.2.5. Assertion #3 921 Assertion #3 states that VPWS applications are not a good fit for 922 BGP. This argument is based on the assertion that BGP is poorly 923 suited to the VPWS signaling requirements because pseudo-wires are 924 inherently point-to-point (see, for example, [L2VPNSIG]). Further, the 925 assertion is that VPWS signaling is qualitatively different from 926 routing or auto-discovery, in which each piece of information must be 927 distributed to the n participants. The conclusion here is that BGP's 928 distribution mechanisms are a poor match for VPWS signaling. Another 929 way to think about this is that BGP generally works from a single 930 database, and then applies some filtering on a per-connection basis; 931 this only makes sense if most of the information is going to go to a 932 lot of places.
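The "single database, per-connection filter" observation in Assertion #3 can be made concrete with a small calculation. The Python sketch below is illustrative only: it assumes a deliberately simplified cost model in which a route reflector evaluates one export filter per peer for every item in its database, and the function name useful_fraction and the peer counts are hypothetical.

```python
def useful_fraction(num_peers: int, interested_peers: int) -> float:
    """Fraction of per-peer filter evaluations that end in a useful delivery.

    Assumed cost model (not taken from any implementation): for each item
    in its database, the reflector evaluates one filter per peer, and only
    `interested_peers` of those evaluations result in the item being sent
    to a peer that needs it.
    """
    if num_peers <= 0:
        raise ValueError("num_peers must be positive")
    if not 0 <= interested_peers <= num_peers:
        raise ValueError("interested_peers must be between 0 and num_peers")
    return interested_peers / num_peers


# Routing-like information: every peer needs each item.
routing_like = useful_fraction(num_peers=50, interested_peers=50)

# Pseudo-wire signaling: each item matters to exactly one remote endpoint.
point_to_point = useful_fraction(num_peers=50, interested_peers=1)

print(routing_like)    # 1.0
print(point_to_point)  # 0.02
```

Under this model the useful fraction for point-to-point information falls as 1/N with the number of peers N, which is the intuition behind the claim that the mechanism pays off only when most information goes to many places.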
934 For example, suppose that an RR is used for VPWS signaling, and there 935 is the need to set up n pseudo-wires. In this case, instead of 936 sending n setup messages, one sends one large "meta-setup" message 937 with all the information that would have been in the n setup messages. That 938 is, let 940 n = number of pseudo-wires 941 l = the size of the per-wire label information 942 k = the size of the per-wire information 944 In this case, the meta-setup message will be of size O((l + k) * n). 946 After receiving the setup message, the RR then must send the n 947 messages that could have been sent by the endpoint (note that this is 948 almost true; the endpoint would have to send n messages of size (l + 949 k), but the RR will have to send n copies of the larger setup 950 message). 952 8.2.6. Counter-Assertion #3 954 Counter-Assertion #3 states that the VPWS application is a good fit 955 for BGP (see, for example, [L2VPNT]). In particular, this camp 956 suggests that an RR really only needs to distribute the label-range 957 [LABELRANGE], so the setup message isn't really n times as large, but 958 rather is analyzed as follows: 960 Let n = number of pseudo-wires 961 m = the size of the label-range data 962 k = the size of the per-wire information 964 Then the messages will be of size O(m + (k * n)), and most 965 importantly for the label-range argument: 967 O(m + (k * n)) < O((l + k) * n) 969 That is, the label-range concept reduces the size of the 970 messages that need to be sent to and by the RR. 972 However, some will argue that the label-range concept is efficient if 973 and only if: 975 (a). A large enough label range is preallocated to accommodate 976 all the systems you might ever want to add to the 977 VPLS/VPWS (assuming that service interruption is not 978 acceptable), and 980 (b).
There is no per-wire information other than labels that 981 needs to be distributed. 983 In these cases, the label range approach can reduce the size of the 984 setup messages as analyzed above. However, the counter argument is 985 that any such reduction will become a second-order effect as soon as 986 some other piece of per-wire status or configuration (e.g., MTU) 987 information must be distributed. In addition, the idea of pre- 988 allocating a large enough label range to accommodate future 989 expansion, while saving bits in the setup messages, has other costs 990 which may be large. In particular: 992 (a). Until the future expansion takes place (if it ever does), 993 one may be wasting quite a lot of labels. Note 994 that each label you distribute requires you to allocate a 995 piece of high-speed memory in your forwarding engine; 996 putting some of it aside for possible later use seems 997 very costly. Each one you put aside is, e.g., one less 998 RFC 2547 route you can support. 1000 However, if you don't preallocate enough contiguous label 1001 space for future expansion, then if the expansion occurs 1002 you must start adding additional labels or label ranges, 1003 and your setup messages start getting longer anyway (in 1004 theory, you could just carve a new set of label ranges, 1005 instead of adding new ones; counter-position: if you did 1006 that, you'd have to bring down your whole VPWS (and 1007 possibly VPLS) every time you add a new endpoint). 1009 (b). Fragmentation of the label space, which can result from 1010 this preallocation, has real impact on label switching 1011 implementations (as the MPLS architecture explicitly 1012 leaves it to the implementation to develop its own label 1013 assignment strategies). So if, for example, a hardware 1014 designer thinks s/he can improve performance by using, 1015 say, prime numbered labels first, s/he should have the 1016 ability to design her/his system in this way.
If an 1017 application is going to come along and demand that labels 1018 be assigned in contiguous groups, implementations which 1019 are perfectly conformant to the architecture may not be 1020 able to support that application. 1022 (c). For diagnosis of network problems, the label-range 1023 approach may have the additional issue that the operator 1024 may not know (a priori) which label(s) were assigned to 1025 which endpoint(s). 1027 (d). Finally, one may argue that label-range allocation is 1028 sub-optimal for non-full mesh topologies, since all peers 1029 of the VPN must hear about a label-range withdraw, and 1030 (in a non-full mesh topology), not all peers need to know 1031 about it. 1033 In any event, one may argue that the scaling benefit of using an RR 1034 in routing is that the RR pre-digests all the received information; it runs 1035 the (BGP) decision process, and only forwards the results of the 1036 decision process, rather than forwarding all the raw data. In the 1037 case of VPWS (and possibly VPLS), the argument is that this advantage 1038 is absent (i.e., we don't run BGP path selection), and as a result, 1039 the RR doesn't help with scaling in the same way it does with 1040 routing. (Of course, the counter position is that some form of BGP 1041 path selection is used; see discussion above.) Finally, one may argue 1042 that using the RR will introduce some latency into the label withdraw 1043 procedure. 1045 8.3. VPWS and Per-Wire Attributes 1047 While several per-wire attributes have been defined (see [L2TPV3], 1048 for example), the need for per-wire attributes for VPWS remains 1049 controversial. The following sections examine those controversies. 1051 8.3.1. Assertion #4 1053 Assertion #4 is that VPWS requires various per-wire parameters. These 1054 may include (but are not limited to) MTU, whether to use sequencing 1055 capabilities, bandwidth capabilities, and QoS.
In addition, during 1056 the lifetime of a pseudo-wire, there are per-wire status indications 1057 that may need to be passed to the other endpoint. 1059 8.3.2. Counter-Assertion #4: 1061 Counter-Assertion #4 states that it has not been demonstrated that 1062 VPWS needs per-wire attributes, as few (per-wire attributes) have as 1063 yet been defined (see, e.g., [MARTINI]). 1065 8.3.3. Assertion #5 1067 Assertion #5 states that passing per-wire attributes through an RR 1068 will likely be inefficient. The argument here is that in the event 1069 that per-wire attributes are required, passing these (per-wire) 1070 attributes through an RR will be sub-optimal as the RR will forward 1071 the status to all the VPWS members, not just to the one endpoint that 1072 is interested in it. For attributes like sequence numbers, it may 1073 be even more difficult, as one has to make sure the sequence numbers 1074 resynchronize properly when the pseudo-wire flaps. This seems 1075 somewhat difficult to achieve through a BGP RR. 1077 8.3.4. Counter-Assertion #5 1079 The counter assertion here is that, since few (or no) per-wire 1080 attributes have been defined (counter-assertion #4), the fact that it 1081 is inefficient to use an RR for distribution is irrelevant. 1083 8.3.5. Assertion #6 1085 Assertion #6 states that, while still an open issue, pseudo-wire 1086 congestion control may require regular point-to-point control message 1087 exchanges, something which BGP would seem ill-equipped to handle. 1089 8.3.6. Counter-Assertion #6 1091 In this case, the counter assertion is that since few (or no) per- 1092 wire attributes have been defined (see counter-assertion #4), and 1093 further, since congestion control for pseudo-wires is still an open 1094 issue, arguing fit is premature. 1096 8.3.7.
Assertion #7: 1098 Assertion #7 states that the primary motivation for VPWS is to 1099 deliver existing service models (i.e., Frame Relay and ATM) over a 1100 packet infrastructure (this is as opposed to some new service). In 1101 this case, common deployments involve partial mesh topologies (more 1102 specifically multiple hub and spoke connections, with some hub to hub 1103 connectivity that makes sense for the enterprise traffic profile). In 1104 addition, some of the connections in such deployments require per- 1105 wire characteristics (e.g., guaranteed throughput for voice). 1107 In other words, the argument here is that a VPWS service designed to 1108 support so-called legacy services (Frame Relay and ATM) will require 1109 point-to-point signaling due to existing topologies and the need for 1110 per-wire attributes. Further, for new VPWS services that require 1111 full-mesh auto-provisioning, the "Colored Pools PW Provisioning 1112 Model" [L2VPN] suggests a method to support such provisioning while 1113 retaining the point-to-point signaling required to support per-wire 1114 attributes. 1116 8.3.8. Counter-Assertion #7: 1118 1120 8.4. VPLS 1122 A VPLS service connects a number of sites by an emulated LAN segment. 1123 In the next sections, we examine whether VPLS may be considered 1124 to be a routing application, and hence whether BGP is a good fit for its 1125 distribution requirements. 1127 8.4.1. Assertion #8 1129 Assertion #8 states that VPLS is a routing application, since the 1130 notion of "VPLS site identification" is analogous to a VPN site 1131 identifier for VPWS (which this camp also views as a routing 1132 application). As a result, the analysis of the distribution needs of 1133 these five items is exactly as for RFC 2547 VPNs, and the conclusion 1134 is that BGP is reasonably well-suited for this application, and with 1135 the addition of [RTCONST] and [REFRESH], the fit is even better.
1136 Finally, note that existing BGP path selection mechanisms can be used 1137 as is for VPLS, and can prove useful for multi-homed sites. 1139 8.4.2. Counter-Assertion #8 1141 Counter-Assertion #8 states that VPLS is not a routing application. 1142 In particular, the contention here is that while the VPLS NLRI are 1143 used to identify that a particular PE belongs to a particular VPLS 1144 instance (as described in Assertion #8), the path which data traffic 1145 follows will depend on the route to that PE, and that route is 1146 determined by ordinary IP routing. As a result, it is not 1147 relevant which neighbor a VPLS NLRI was received from, and hence this is 1148 not routing. 1150 8.4.3. Assertion #9 1152 Assertion #9 is that constrained or true broadcast is not valuable 1153 for VPLS, since the same label cannot be used by all peers. In 1154 particular, the same label cannot be used by all peers since MAC 1155 address learning is performed in the data plane. 1157 8.4.4. Counter-Assertion #9 1159 1161 9. Operational Implications 1163 In this section we examine the operational implications of the 1164 various choices in the design spaces described in this document. 1166 9.1. OAM Functionality 1168 A service provider (SP) may want to know exactly where a particular 1169 pseudo-wire leaves its domain, and in addition may want to keep 1170 various counts and bits of status at that point. Further, the SP may 1171 want to be able to do data path testing to that point. That is, an SP 1172 may want point-to-point pseudo-wire state to be maintained at its 1173 border routers. 1175 9.1.1. Assertion #10: 1177 Assertion #10 states that it may be difficult for service providers 1178 to maintain point-to-point pseudo-wire state at their border routers 1179 with the proposed BGP signaling mechanisms.
This is because those 1180 mechanisms provide no way to ensure that a pseudo-wire data path will 1181 leave the network at a node which has state information for that 1182 pseudo-wire. 1184 9.1.2. Counter-Assertion #10: 1186 1188 9.2. Full-Mesh Issues 1190 10. Conclusions and Recommendations 1192 11. Intellectual Property 1194 The IETF takes no position regarding the validity or scope of any 1195 intellectual property or other rights that might be claimed to 1196 pertain to the implementation or use of the technology described in 1197 this document or the extent to which any license under such rights 1198 might or might not be available; neither does it represent that it 1199 has made any effort to identify any such rights. Information on the 1200 IETF's procedures with respect to rights in standards-track and 1201 standards-related documentation can be found in BCP-11 [RFC2028]. 1202 Copies of claims of rights made available for publication and any 1203 assurances of licenses to be made available, or the result of an 1204 attempt made to obtain a general license or permission for the use of 1205 such proprietary rights by implementors or users of this 1206 specification can be obtained from the IETF Secretariat. 1208 The IETF invites any interested party to bring to its attention any 1209 copyrights, patents or patent applications, or other proprietary 1210 rights which may cover technology that may be required to practice 1211 this standard. Please address the information to the IETF Executive 1212 Director. 1214 12. Design Team 1216 The design team that produced this document consisted of Daniel 1217 Awduche (awduche@awduche.com), Ron Bonica (Ronald.P.Bonica@mci.com), 1218 Hank Kilmer (hank@rem.com), Kireeti Kompella (kireeti@juniper.net), 1219 Chris Lewis (chrlewis@cisco.com), Danny McPherson (danny@tcb.net), 1220 David Meyer (dmm@1-4-5.net) and Peter Whiting 1221 (pwhiting@vericenter.com). 1223 13. 
Acknowledgments 1225 David Ball, Peter Gutierrez, Susan Harris, Pedro Marques, Eric Rosen, 1226 Pekka Savola, and Mark Townsley have all made many insightful 1227 comments on earlier versions of this document. 1229 14. Security Considerations 1231 This document specifies neither a protocol nor an operational 1232 practice, and as such, it creates no new security considerations. 1234 15. IANA Considerations 1236 This document creates no new requirements on IANA namespaces 1237 [RFC2434]. 1239 16. References 1241 16.1. Normative References 1243 [AFI] http://www.iana.org/assignments/address-family-numbers 1245 [BGP] Rekhter, Y., T. Li, and S. Hares, "A Border Gateway 1246 Protocol 4 (BGP-4)", draft-ietf-idr-bgp4-23.txt. 1247 Work in progress. 1249 [BGPVPN] Ould-Brahim, H., E. Rosen, and Y. Rekhter, "Using 1250 BGP as an Auto-Discovery Mechanism for 1251 Provider-provisioned VPNs", 1252 draft-ietf-l3vpn-bgpvpn-auto-00.txt. Work in 1253 progress. 1255 [CLARK] Clark, D., "Design Philosophy of the DARPA Internet 1256 Protocols", Computer Communication Review, volume 1257 25, number 1, January 1995. ISSN # 0146-4833. 1259 [EXTCOMM] Sangli, S., D. Tappan, and Y. Rekhter, "BGP 1260 Extended Communities Attribute", 1261 draft-ietf-idr-bgp-ext-communities-06.txt. Work 1262 in progress. 1264 [FLOW] Marques, P., et al., "Dissemination of flow 1265 specification rules", 1266 draft-marques-idr-flow-spec-00.txt. Work in 1267 progress. 1269 [LABELRANGE] What is the cite here? 1271 [L2VPN] Andersson, L. and E. Rosen, "L2VPN Framework", 1272 draft-ietf-l2vpn-l2-framework-03.txt. Work in 1273 progress. 1275 [L2VPNSIG] Rosen, E. and V. Radoaca, "Provisioning Models 1276 and Endpoint Identifiers in L2VPN Signaling", 1277 draft-ietf-l2vpn-signaling-00.txt. Work in 1278 progress. 1280 [L2VPNT] Kompella, K. (Editor), "Layer 2 VPNs Over 1281 Tunnels", draft-kompella-l2vpn-l2vpn-00.txt. 1282 Work in progress. 1284 [L2TPv3] Lau, J., M. Townsley and I. 
Goyret (Editors), 1285 "Layer Two Tunneling Protocol (Version 1286 3)", draft-ietf-l2tpext-l2tp-base-11.txt. Work in 1287 progress. 1289 [MARTINI] Martini, L., E. Rosen, and T. Smith, "Pseudowire 1290 Setup and Maintenance using LDP", 1291 draft-ietf-pwe3-control-protocol-05.txt. Work in 1292 progress. 1294 [MULLER1999] Muller, R., et al., "Control System Reliability 1295 Requires Careful Software Installation 1296 Procedures", International Conference on 1297 Accelerator and Large Experimental 1298 Physics Systems, 1999, Trieste, Italy. 1300 [MULTISESSION] Scudder, J. and C. Appanna, "Multisession BGP", 1301 draft-scudder-bgp-multisession-00.txt. Work in 1302 progress. 1304 [ORF] Chen, E., and Rekhter, Y., "Cooperative Route 1305 Filtering Capability for BGP-4", 1306 draft-ietf-idr-route-filter-09.txt. Work in 1307 progress. 1309 [RTCONST] Bonica, R., et al., "Constrained VPN route 1310 distribution", 1311 draft-marques-ppvpn-rt-constrain-01.txt. Work in 1312 progress. 1314 [SOFTNOTIFY] Nalawade, G., K. Patel, J. Scudder, and D. Ward, 1315 "BGPv4 Soft-Notification Message", 1316 draft-nalawade-bgp-soft-notify-00.txt. Work in 1317 progress. 1319 [RFC1075] Waitzman, D., C. Partridge, and S. Deering, 1320 "Distance Vector Multicast Routing Protocol", RFC 1321 1075, November 1988. 1323 [RFC1142] Oran, D., Editor, "OSI IS-IS Intra-domain Routing 1324 Protocol", RFC 1142, February 1990. 1326 [RFC1771] Rekhter, Y., and T. Li, "A Border Gateway 1327 Protocol 4 (BGP-4)", RFC 1771, March 1995. 1329 [RFC1958] Carpenter, B., Editor, "Architectural Principles of the 1330 Internet", RFC 1958, June 1996. 1332 [RFC1997] Chandra, R., P. Traina, and T. Li, "BGP 1333 Communities Attribute", RFC 1997, August 1996. 1335 [RFC2138] Rigney, C., et al., "Remote Authentication Dial 1336 In User Service (RADIUS)", RFC 2138, April 1997. 1338 [RFC2328] Moy, J., "OSPF Version 2", RFC 2328, April 1998. 1340 [RFC2453] Malkin, G., "RIP Version 2", RFC 2453, November 1341 1998. 
1343 [RFC2460] Deering, S. and R. Hinden, "Internet Protocol, 1344 Version 6 (IPv6) Specification", RFC 2460, 1345 December 1998. 1347 [RFC2547BIS] Rosen, E., et al., "BGP/MPLS IP VPNs", 1348 draft-ietf-l3vpn-rfc2547bis-00.txt. Work in 1349 progress. 1351 [RFC2858] Bates, T., et al., "Multiprotocol Extensions 1352 for BGP-4", RFC 2858, June 2000. 1354 [RFC2918] Chen, E., "Route Refresh Capability for BGP-4", 1355 RFC 2918, September 2000. 1357 [RFC3036] Andersson, L., et al., "LDP Specification", RFC 1358 3036, January 2001. 1360 [RFC3439] Bush, R. and D. Meyer, "Some Internet 1361 Architectural Guidelines and Philosophy", RFC 1362 3439, December 2002. 1364 [SAFI] http://www.iana.org/assignments/safi-namespace 1366 [VPLS] Kompella, K., et al., "Virtual Private LAN 1367 Service", draft-ietf-l2vpn-vpls-bgp-01.txt. 1368 Work in progress. 1370 [VPWS] Kompella, K., et al., "Layer 2 VPNs Over Tunnels", 1371 draft-kompella-ppvpn-l2vpn-04.txt. Work in 1372 progress. 1374 16.2. Informative References 1376 [IETFOL] https://www1.ietf.org/mailman/listinfo/routing-discussion 1378 [RFC2119] Bradner, S., "Key words for use in RFCs to 1379 Indicate Requirement Levels", RFC 2119, March 1380 1997. 1382 [RFC2026] Bradner, S., "The Internet Standards Process -- 1383 Revision 3", RFC 2026/BCP 9, October 1996. 1385 [RFC2028] Hovey, R. and S. Bradner, "The Organizations 1386 Involved in the IETF Standards Process", RFC 1387 2028/BCP 11, October 1996. 1389 [RFC2434] Narten, T., and H. Alvestrand, "Guidelines for 1390 Writing an IANA Considerations Section in RFCs", 1391 RFC 2434/BCP 26, October 1998. 1393 [RVBIB] http://www.routeviews.org/papers 1395 [WILLINGER2002] Willinger, W., and J. Doyle, "Robustness and the 1396 Internet: Design and evolution", 2002. 1398 17. Editor's Address 1400 David Meyer 1401 Email: dmm@1-4-5.net 1403 18. Full Copyright Statement 1405 Copyright (C) The Internet Society (2004). All Rights Reserved. 
1407 This document and translations of it may be copied and furnished to 1408 others, and derivative works that comment on or otherwise explain it 1409 or assist in its implementation may be prepared, copied, published 1410 and distributed, in whole or in part, without restriction of any 1411 kind, provided that the above copyright notice and this paragraph are 1412 included on all such copies and derivative works. However, this 1413 document itself may not be modified in any way, such as by removing 1414 the copyright notice or references to the Internet Society or other 1415 Internet organizations, except as needed for the purpose of 1416 developing Internet standards in which case the procedures for 1417 copyrights defined in the Internet Standards process must be 1418 followed, or as required to translate it into languages other than 1419 English. 1421 The limited permissions granted above are perpetual and will not be 1422 revoked by the Internet Society or its successors or assigns. 1424 This document and the information contained herein is provided on an 1425 "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING 1426 TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING 1427 BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION 1428 HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF 1429 MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.