idnits 2.17.1 draft-wu-softwire-mesh-framework-00.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** It looks like you're using RFC 3978 boilerplate. You should update this to the boilerplate described in the IETF Trust License Policy document (see https://trustee.ietf.org/license-info), which is required now. -- Found old boilerplate from RFC 3978, Section 5.1 on line 22. -- Found old boilerplate from RFC 3978, Section 5.5 on line 1555. -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 1532. -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 1539. -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 1545. ** This document has an original RFC 3978 Section 5.4 Copyright Line, instead of the newer IETF Trust Copyright according to RFC 4748. ** This document has an original RFC 3978 Section 5.5 Disclaimer, instead of the newer disclaimer which includes the IETF Trust according to RFC 4748. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- == There is 1 instance of lines with non-ascii characters in the document. == No 'Intended status' indicated for this document; assuming Proposed Standard Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack an IANA Considerations section. (See Section 2.2 of https://www.ietf.org/id-info/checklist for how to handle the case when there are no actions for IANA.) ** There are 4 instances of too long lines in the document, the longest one being 2 characters in excess of 72. ** There is 1 instance of lines with control characters in the document. ** The abstract seems to contain references ([RFC2119]), which it shouldn't. 
Please replace those with straight textual mentions of the documents in question. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not match the current year == The document doesn't use any RFC 2119 keywords, yet seems to have RFC 2119 boilerplate text. -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (June 17, 2006) is 6523 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Missing Reference: 'RFC2119' is mentioned on line 63, but not defined == Missing Reference: 'STH' is mentioned on line 591, but not defined ** Obsolete normative reference: RFC 1700 (Obsoleted by RFC 3232) ** Obsolete normative reference: RFC 2858 (Obsoleted by RFC 4760) ** Downref: Normative reference to an Informational RFC: RFC 3985 ** Downref: Normative reference to an Informational RFC: RFC 4110 ** Downref: Normative reference to an Informational RFC: RFC 4111 ** Downref: Normative reference to an Informational RFC: RFC 4176 ** Downref: Normative reference to an Informational RFC: RFC 4378 Summary: 14 errors (**), 0 flaws (~~), 6 warnings (==), 7 comments (--). Run idnits with the --verbose option for more detailed information about the items above. 
-------------------------------------------------------------------------------- 2 Softwire Working Group J. Wu 3 Internet-Draft Y. Cui 4 Expires: December 19, 2006 X. Li 5 Tsinghua University 6 C. Metz 7 G. Nalawade 8 S. Barber 9 P. Mohapatra 10 Cisco Systems, Inc. 11 June 17, 2006 13 A Framework for Softwire Mesh Signaling, Routing and Encapsulation 14 across IPv4 and IPv6 Backbone Networks 15 draft-wu-softwire-mesh-framework-00 17 Status of this Memo 19 By submitting this Internet-Draft, each author represents that any 20 applicable patent or other IPR claims of which he or she is aware 21 have been or will be disclosed, and any of which he or she becomes 22 aware will be disclosed, in accordance with Section 6 of BCP 79. 24 Internet-Drafts are working documents of the Internet Engineering 25 Task Force (IETF), its areas, and its working groups. Note that 26 other groups may also distribute working documents as Internet- 27 Drafts. 29 Internet-Drafts are draft documents valid for a maximum of six months 30 and may be updated, replaced, or obsoleted by other documents at any 31 time. It is inappropriate to use Internet-Drafts as reference 32 material or to cite them other than as "work in progress." 34 The list of current Internet-Drafts can be accessed at 35 http://www.ietf.org/ietf/1id-abstracts.txt. 37 The list of Internet-Draft Shadow Directories can be accessed at 38 http://www.ietf.org/shadow.html. 40 This Internet-Draft will expire on December 19, 2006. 42 Copyright Notice 44 Copyright (C) The Internet Society (2006). 46 Abstract 48 The softwires mesh problem identifies a requirement for a generalized, 49 network-based client IP(x)-over-backbone IP(y) solution in support of 50 the transition to IPv6. Connectivity between islands of IPv6, IPv4 51 or IPv6/v4 networks will be enabled across a single IPv4 or IPv6 52 backbone network employing IP (or MPLS) tunnels called softwires.
53 This solution will re-use where appropriate existing multi-address 54 family routing mechanisms such as the Border Gateway Protocol as well 55 as existing IP (and label) tunnel encapsulation schemes. The intent 56 is to encourage multiple, inter-operable vendor implementations in 57 the hope operators will find it easier and more attractive to support 58 the transition to IPv6. 60 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 61 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 62 document are to be interpreted as described in [RFC2119]. 64 Table of Contents 66 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 5 68 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 9 70 3. Requirements . . . . . . . . . . . . . . . . . . . . . . . . . 12 71 3.1. IPv6-over-IPv4 Scenario . . . . . . . . . . . . . . . . . 12 72 3.2. IPv4-over-IPv6 Scenario . . . . . . . . . . . . . . . . . 14 74 4. Reference Models . . . . . . . . . . . . . . . . . . . . . . . 17 75 4.1. Softwire Mesh Reference Model . . . . . . . . . . . . . . 17 76 4.2. Entities of the Softwire Mesh Reference Model . . . . . . 17 77 4.3. AFBR Reference Model . . . . . . . . . . . . . . . . . . . 18 78 4.4. Entities of the AFBR Reference Model . . . . . . . . . . . 19 79 4.5. Comments on Single AF AFBR Reference Models . . . . . . . 20 81 5. Softwire Signaling . . . . . . . . . . . . . . . . . . . . . . 22 82 5.1. SW Encapsulation Sets . . . . . . . . . . . . . . . . . . 22 83 5.2. BGP . . . . . . . . . . . . . . . . . . . . . . . . . . . 23 84 5.3. non-BGP Signaling . . . . . . . . . . . . . . . . . . . . 24 86 6. Softwire Routing and Tunnel Selection . . . . . . . . . . . . 25 87 6.1. Advertising Client AF Access Island Reachability . . . . . 25 88 6.2. Tunnel Selection . . . . . . . . . . . . . . . . . . . . . 26 89 6.2.1. Softwire Next_Hop . . . . . . . . . . . . . . . . . . 26 90 6.2.2. Next_Hop Overlay Addressing . . . . . . . . . . . . .
28 91 6.3. Comments on a BGP-free Core . . . . . . . . . . . . . . . 28 93 7. Softwire Forwarding and Tunnel Encapsulations . . . . . . . . 30 94 7.1. Forwarding . . . . . . . . . . . . . . . . . . . . . . . . 30 95 7.2. Encapsulations . . . . . . . . . . . . . . . . . . . . . . 30 97 8. Softwire OAM and MIBs . . . . . . . . . . . . . . . . . . . . 31 98 8.1. OAM . . . . . . . . . . . . . . . . . . . . . . . . . . . 31 99 8.2. MIBs . . . . . . . . . . . . . . . . . . . . . . . . . . . 31 101 9. Softwire Multicast . . . . . . . . . . . . . . . . . . . . . . 32 103 10. Inter-AS Considerations . . . . . . . . . . . . . . . . . . . 34 104 10.1. Option A: Back-to-Back AFBRs . . . . . . . . . . . . . . . 34 105 10.2. Option B: EBGP redistribution of AF(i) prefixes . . . . . 34 106 10.3. Option C: Multihop EBGP distribution of AF(i) prefixes . . 35 108 11. Security Considerations . . . . . . . . . . . . . . . . . . . 37 110 12. Acknowledgment . . . . . . . . . . . . . . . . . . . . . . . . 38 111 13. References . . . . . . . . . . . . . . . . . . . . . . . . . . 39 112 13.1. Normative References . . . . . . . . . . . . . . . . . . . 39 113 13.2. Informative References . . . . . . . . . . . . . . . . . . 40 115 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 41 116 Intellectual Property and Copyright Statements . . . . . . . . . . 43 118 1. Introduction 120 The versatility of ISP and corporate IP backbone networks has been 121 enhanced over the last few years by their ability to provide routing 122 services to their attached constituent client networks. For example 123 RFC4364 defines a method for providing network-based IPv4 VPN routing 124 and forwarding across an MPLS backbone. 
This involves definition of 125 a new address family (AF) (VPNv4), distribution of the client VPNv4 126 prefixes between the provider's interested edge routers (called 127 provider edge or PE routers) and encapsulation/decapsulation of the 128 client's IPv4 packets in labels for transit across the backbone 129 network. The provider can now offer inter-client site IPv4 routing 130 and forwarding services. 132 The provider operates a uniform backbone network based on MPLS label 133 switching. Their cadre of PE routers performs the tasks of client AF 134 prefix/next-hop distribution and switching client packets to and from 135 backbone label switched paths (LSP). This solution has proven to be 136 quite scalable as the interior network need only store next-hop 137 reachability to the PE routers, while the PE routers employ BGP 138 and its attendant scaling functions (e.g. route reflectors, extended 139 communities) for conveying client AF prefix reachability. 141 At the same time and for a variety of technical, economic, 142 political and operational reasons, some of these same ISP and 143 enterprise backbone providers are being asked to support IPv6 routing 144 and forwarding across their IPv4-based backbone networks. Providers 145 can employ one of the existing IPv6-over-IPv4 transition tunnel 146 schemes that have been defined and in some cases implemented through 147 the years. The drawbacks of these various schemes (of which there 148 are too many to enumerate here) are that some are limited in their 149 functional scope (e.g. may only work in LAN environments), some 150 require extensive manual configuration and thus do not scale, and some 151 require special IPv6 addressing schemes to work effectively. 153 Another option available to providers is to deploy a network-based 154 IPv6 VPN routing and forwarding service and tunnel customer IPv6 VPN 155 packets across an MPLS backbone network.
This works in a manner 156 similar to the aforementioned IPv4 VPN service except now the PE 157 routers are distributing and storing prefixes/next-hops belonging to 158 an IPv6 VPN AF. This solution inherits the scaling properties of the 159 IPv4 VPN solution and the backbone remains MPLS. 161 A drawback of the IP VPN solutions is that they mandate client AF 162 prefixes be stored and distributed as VPN address families (AFI=x/ 163 SAFI=128) in VPN-specific routing tables. For topological, 164 operational or other reasons the provider may not wish to configure 165 VPN routing tables on their PE routers that are necessary to support 166 these solutions. 168 Another drawback is the implicit limitation on the type of forwarding 169 supported in the backbone network. Most implementations require the 170 backbone and PE routers to support MPLS. Some implementations can 171 accommodate IP VPN forwarding across IP-based tunnels (where the 172 backbone is IPv4 only). Interoperability could be an issue and the 173 choices of IP tunnel encapsulation/decapsulation performed by the PE 174 routers may also be limited [draft-ietf-l3vpn-gre-ip-2547]. 176 Up until this point, the assumption is that the provider's backbone 177 is IPv4 or MPLS and the client networks are IPv4 or IPv6. That 178 assumption is no longer valid. Indeed there are operators right now 179 who have deployed (or plan to deploy in the near future) a native 180 IPv6 backbone network and who require the ability to support client 181 IPv4 routing and forwarding across such a backbone network. The 182 dilemma faced by these IPv6 backbone operators is that their options 183 for interoperable IPv4-over-IPv6 tunneling are limited and the 184 operational burden associated with a solution that solves the 185 problem, namely point-to-point tunnel provisioning [RFC2473], is 186 large and costly. 
Moreover it is either too expensive or just not 187 practical to require dual-stacks on the hosts operating in the client 188 networks. Thus the requirement to support a scalable and 189 interoperable IPv4-over-IPv6 routing and forwarding service presents 190 itself. 192 Based on this discussion it is now possible to outline the functions 193 for a generalized, network-based client IP AF(i)-over-backbone IP 194 AF(j) routing and forwarding solution (where (i) and (j) denote 195 different AFs): 197 o Backbone network of IP AF(j) forwards packets with a header or 198 labels based on IP AF(j). 200 o Local PE routers discover a set of tunnel encapsulation parameters 201 and tunnel end-points of IP AF(j) located on remote PE routers. 203 o A mesh of inter-PE tunnels with a tunnel header based on IP AF(j) 204 is dynamically established. Client IP AF(i) packets will be 205 directed into the appropriate tunnel based on destination client 206 IP AF(i) prefixes and a next_hop reachable through the other end 207 of the tunnel. 209 o Client IP AF(i) prefixes, AF(i) next_hops and tunnel identifier/ 210 next-hop addresses are stored on local PE routers and distributed 211 in a scalable fashion to interested remote PE routers. The 212 purpose of the tunnel identifier/next-hop address is to bind the 213 advertised client IP AF(i) prefix/next_hop with an established 214 inter-PE tunnel leading to that prefix that terminates in the 215 tunnel next-hop address. 217 o Packets belonging to client IP AF(i) are encapsulated in backbone 218 AF(j)-based tunnel headers (IP or labels) and forwarded across the 219 backbone AF(j) network. 221 These functions operate with an interchangeable mix of different AF 222 and tunnel encapsulation types. For example client IP AF(i) prefixes 223 could be native IPv4, native IPv6, VPN IPv4 or VPN IPv6.
Native 224 means the prefixes are stored in a global table with a SAF=1 and VPN 225 means prefixes are stored in VPN routing and forwarding tables with a 226 SAF=128. The backbone IP AF(j) could be native IPv6 or native IPv4. 227 The tunnel encapsulation types could be IP-IP [RFC2473], GRE 228 [RFC2784], L2TPv3 [RFC3931] and certainly others are possible. It is 229 even possible that MPLS tunnels could be used to progress client 230 AF(i) packets across the IP AF(j) backbone network consistent with 231 the IP VPN solutions discussed earlier. 233 These functions align with a solution to address the requirements of 234 the softwire mesh problem as set forth in [draft-ietf-softwire-problem- 235 statement]. Providers who operate IPv4 or IPv6 backbone networks 236 will require a generalized, scalable and interoperable set of 237 mechanisms to tunnel, based on client IP AF reachability, packets 238 belonging to one or multiple IPv4 or IPv6 address families across 239 their respective IPv4 or IPv6 backbone networks. 241 A byproduct of this solution is the notion of a BGP-free core. Core 242 routers located in the interior of the network need only to provide 243 reachability to themselves and the peripheral set of PE routers via 244 an interior gateway protocol (IGP); all client AF prefixes including 245 the Internet routing table are maintained on the PE routers. A 246 routing decision for an Internet prefix performed at an ingress PE 247 router resolves a BGP next_hop to a tunnel leading to the egress PE 248 and the packet is transparently tunneled across the core. 250 The tunnel in this case is referred to as a softwire. A set of 251 softwires established between PE routers (termed address family 252 border routers or AFBRs in this document) is referred to as a 253 softwire mesh. Softwire tunnel configuration information, client IP 254 AF reachability and softwire identifier/next_hops are automatically 255 distributed between participating AFBRs.
Client IP AF packets are 256 tunneled across the softwire mesh between local and remote AFBRs 257 based on client IP AF reachability information and the associated 258 softwire. 260 The objective of this document is to describe a framework for 261 softwire mesh signaling, routing and encapsulation across IPv4 or 262 IPv6 backbone networks. It defines a generalized, network-based 263 client IP(x)-over-backbone IP(y) solution in support of the 264 transition to IPv6. Connectivity between islands of IPv6, IPv4 or 265 dual-stack IPv6/v4 networks will be enabled across a single IPv4 or 266 IPv6 backbone network employing IP (or MPLS) tunnels called 267 softwires. This solution will re-use where appropriate existing 268 multi-address family routing mechanisms such as the Border Gateway 269 Protocol as well as existing IP (and label) tunnel encapsulation 270 schemes. The intent is to encourage multiple, inter-operable vendor 271 implementations in the hope operators will find it easier and more 272 attractive to support the transition to IPv6. 274 2. Terminology 276 The following terminology will be used in this document. 278 Address Family (AF) 280 IPv4 or IPv6. Presently defined values for this field are specified 281 in [RFC1700] (see the Address Family Numbers section). 283 AF(i), AF(j) 285 Notation used to indicate that prefixes, a node or a network deal 286 with only a single IP AF. 288 AF(i,j) 290 Notation used to indicate that a node is dual-stack (e.g. runs both 291 IPv4 and IPv6) or that a network is composed (at least partially) of 292 dual-stack nodes. 294 Address Family Border Router (AFBR) 296 A dual-stack router that interconnects two networks that use either 297 the same or different address families.
An AFBR forms peering 298 relationships with other AFBRs, adjacent core routers and attached CE 299 routers, performs softwire discovery and signaling, advertises 300 client AF(i) reachability information and encapsulates/decapsulates 301 customer packets in softwire transport headers. 303 AF Access Island 305 A single IP AF(i) or dual-stack IP AF(i,j) client access network 306 connected to one (single-homed) or more than one (multi-homed) 307 upstream AFBRs. 309 Customer Edge (CE) 311 A router located inside an AF access island that peers with other CE 312 routers within the access island network and with one or more 313 upstream AFBRs. 315 Client AF Prefixes 317 IP AF(i) or AF(j) prefixes originating inside an AF access island. 319 Single AF Transit Core 321 A single AF(j) transit core composed of IPv4 or IPv6 core routers 322 surrounded by a periphery of dual-stack AF(i,j) AFBRs. The transit 323 core forwards packets with IP AF(j) headers or labels derived from an 324 IP AF(j) control plane. 326 Softwire (SW) 328 A "tunnel" that is created on the basis of a control protocol setup 329 between softwire endpoints with shared point-to-point or multipoint- 330 to-point state. Softwires are generally dynamic in nature and when 331 formed over a backbone network in a mesh configuration are considered 332 very long-lived. 334 Softwire Transport Header AF (STH AF) 336 The address family of the outermost IP header of a softwire. This 337 will either be IPv4, IPv6 or labels derived from one or the other. 338 If the softwire employs MPLS label encapsulation then the STH AF is 339 an implicit IPv4 with the assumption that most MPLS deployments 340 currently employ an IPv4 control plane. This could change in the 341 future when native IPv6 backbone networks and their elements 342 implement an MPLS control and forwarding plane based on IPv6. 344 Softwire Payload Header AF (SPH AF) 346 The address family of the IP headers being carried within a softwire.
347 Note that additional "levels" of IP headers may be present if (for 348 example) a tunnel is carried over a softwire - the key attribute of 349 the SPH AF is that it is directly encapsulated within the softwire and 350 the softwire endpoint will base forwarding decisions on the SPH AF 351 when a packet is exiting the softwire. 353 Softwire Encapsulation Set (SW-Encap) 355 A softwire encapsulation set contains tunnel header parameters, order 356 of preference of the tunnel header types and the expected payload 357 types (e.g. IPv4) carried inside the softwire. 359 Softwire Next_Hop (SW-NHOP) 361 This attribute accompanies client AF reachability advertisements and 362 is used to reference a softwire on the ingress AFBR leading to the 363 specific prefixes. It contains a softwire identifier value and a 364 softwire next_hop IP address denoted as . Its 365 existence in the presence of client AF prefixes (in advertisements or 366 as entries in a routing table) implies the use of a softwire to reach 367 that prefix. 369 Subsequent Address Family (SAF) 370 Additional information about the type of the Network Layer 371 Reachability Information (e.g. unicast or multicast). 373 3. Requirements 375 The requirements for addressing the softwire mesh problem are 376 described in [draft-ietf-softwire-problem-statement]. In addition the 377 framework outlined in this document attempts to: 379 o Leverage and reuse existing protocols and practices where 380 appropriate. L3VPN [RFC4110] solutions already provide multi-AF 381 routing across homogeneous IP or MPLS backbones and to that extent 382 we wish to leverage that effort. On the tunnel encapsulation 383 side, nothing new is needed; the existing methods are sufficient 384 for our purposes. 386 o AF and Tunnel Encap Agnostic.
Basically packets belonging to any 387 IP AF, native and including the VPN variants, should be able to be 388 tunneled across an IPv4 or IPv6 backbone using different 389 encapsulation techniques including MPLS labels. 391 o Keep it simple while providing the provider with maximum 392 flexibility. 394 In summary the basic requirement is to support a generalized, 395 network-based solution supporting IPv6-over-IPv4 and IPv4-over-IPv6 396 scenarios employing different IP (or label) tunneling encapsulation 397 types. 399 3.1. IPv6-over-IPv4 Scenario 401 The first category of scenarios that must be addressed by the 402 softwire mesh framework is client IPv6-over-backbone IPv4. Figure 1 403 illustrates a number of IPv6 access island networks connected to an 404 IPv4 transit core. The objective of the softwire mesh is to provide 405 a scalable, network-based IPv6 routing and forwarding service that 406 operates over the IPv4 transit core. 408 It should be noted here that the IPv4 transit core may run MPLS label 409 switching. That is not a problem and one could easily see how the 410 softwire mesh could be composed of a set of point-to-point or 411 multipoint-to-point label switched paths (LSP). In addition an 412 access network does not necessarily have to be IPv6 only; it could be 413 dual-stack or IPv4-based. 415 The general problem of IPv6 connectivity across IPv4 networks has 416 been addressed through the years using a number of different 417 tunneling mechanisms, some provisioned manually, others based on 418 special addressing. More recently MPLS VPNs have been extended to 419 support IPv6 VPN services across an MPLS backbone. 421 The softwire mesh framework will employ elements of those solutions 422 with the key differences being a) tunnels are automatically built; 423 b) any tunnel encapsulation scheme is permitted and; c) client AF 424 prefix distribution and processing is not VPN-centric. 
That is to 425 say that the solution will support existing VPN routing and 426 forwarding capabilities but it is not mandatory. 428 +--------+ +--------+ 429 | IPv6 | | IPv6 | 430 | AF | | AF | 431 | Access | | Access | 432 | Island | | Island | 433 +--------+ +--------+ 434 | \ / | 435 | \ / | 436 | \ / | 437 | X | 438 | / \ | 439 | / \ | 440 | / \ | 441 +--------+ +--------+ 442 | AFBR | | AFBR | 443 +--| IPv4/6 |---| IPv4/6 |--+ 444 | +--------+ +--------+ | 445 +-------+ | | +-------+ 446 | IPv4 | | | | IPv4 | 447 | AF | | | | AF | 448 | Access|-------| IPv4 |-------| Access| 449 | Island| | Transit Core | | Island| 450 +-------+ | | +-------+ 451 | | 452 | +--------+ +--------+ | 453 +--| AFBR |---| AFBR |--+ 454 | IPv4/6 | | IPv4/6 | 455 +--------+ +--------+ 456 | \ / | 457 | \ / | 458 | \ / | 459 | X | 460 | / \ | 461 | / \ | 462 | / \ | 463 +--------+ +--------+ 464 | IPv6 | | IPv6 | 465 | AF | | AF | 466 | Access | | Access | 467 | Island | | Island | 468 +--------+ +--------+ 470 Figure 1 IPv6-over-IPv4 Scenario 472 3.2. IPv4-over-IPv6 Scenario 474 The second category of scenarios that must be addressed by the 475 softwire mesh framework is client IPv4-over-backbone IPv6. Figure 2 476 illustrates a number of IPv4 access island networks connected to an 477 IPv6 transit core. The objective of the softwire mesh is to provide 478 a scalable, network-based IPv4 routing and forwarding service that 479 operates over the IPv6 transit core. 481 This is perhaps the more interesting scenario to look at for several 482 reasons. First it is clearly an emerging and significant 483 requirement. As mentioned native IPv6 backbones are being deployed 484 and there is clearly a large legacy of IPv4 networks and applications 485 that can and should operate across this new transit core. Second the 486 notion of supporting dynamic client IP AF routing and forwarding 487 across native IPv6 networks has frankly not been implemented by 488 vendors. 
A provider would have a tough time finding a BGP-based VPN 489 routing and forwarding solution that operates across a native IPv6 490 backbone. 492 The third area of interest here involves the problem of the common 493 AFI/SAFI shared by network layer reachability information (NLRI) and 494 next_hop address fields contained in BGP prefix advertisements. 495 Simply put [RFC2858] states that the AFI/SAFI of the NLRI and 496 next_hop fields contained in a BGP prefix advertisement must be the 497 same. 499 Clever workarounds (for operation across IPv4 backbone networks only) 500 have been devised through the years to take VPN or IPv6 NLRIs and 501 generate an AFI/SAFI-matching next-hop address that is really an IPv4 502 address in disguise. This was doable because the larger next_hop 503 field was big enough to hold a padded out IPv4 address (the VPN 504 case) or an IPv4-compatible IPv6 (the IPv6 case). A PE router 505 performs the NLRI lookup that yields the next-hop address which then 506 can be converted to an IPv4 address. The router then has all 507 it needs to figure out where to send the packet to. 509 This is not possible across a native IPv6 backbone when BGP is 510 advertising IPv4 NLRIs and the next-hop field must hold an IPv4 511 address. This will not be much good because the IPv4 address is not 512 known to the IPv6 backbone network. One could perhaps generate an 513 IPv4-compatible IPv6 address but this places an addressing and 514 configuration burden on the provider who must now configure their PE 515 routers with this limited addressing scheme. 517 Ideally one would like to relax this constraint and allow BGP to 518 advertise IPv4 prefixes with an IPv6 next-hop address. 520 In addition to its non-VPN-centricity and tunnel agnosticism, the 521 softwire mesh framework must accommodate the suite of client IPv4- 522 over-backbone IPv6 scenarios. 
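As an illustration of the next-hop workaround described above for IPv4 backbones, the following Python sketch shows how an IPv4 address can be carried in a larger next-hop field and recovered by the PE router. It is only a sketch: the function names are hypothetical, the 8-octet zero padding for the VPN case is an assumption about the layout (the text above says only that the address is "padded out"), and the IPv6 case uses the IPv4-compatible form (96 zero bits followed by the IPv4 address) named in the text.

```python
import ipaddress


def vpn_ipv4_next_hop(ipv4: str) -> bytes:
    """VPN case: IPv4 address padded with leading zero octets.

    Assumption: 8 zero octets are prepended, giving a 12-octet next hop.
    """
    return bytes(8) + ipaddress.IPv4Address(ipv4).packed


def ipv4_compatible_next_hop(ipv4: str) -> bytes:
    """IPv6 case: 96 zero bits followed by the 32-bit IPv4 address."""
    return bytes(12) + ipaddress.IPv4Address(ipv4).packed


def recover_ipv4(next_hop: bytes) -> ipaddress.IPv4Address:
    """Convert a padded next hop back into the IPv4 address it carries.

    In both layouts above the IPv4 address is the trailing four octets.
    """
    if len(next_hop) not in (12, 16):
        raise ValueError("unexpected next-hop length: %d" % len(next_hop))
    return ipaddress.IPv4Address(next_hop[-4:])


if __name__ == "__main__":
    egress_pe = "192.0.2.1"  # documentation address, illustration only
    for encode in (vpn_ipv4_next_hop, ipv4_compatible_next_hop):
        next_hop = encode(egress_pe)
        print(encode.__name__, len(next_hop), recover_ipv4(next_hop))
```

The same trick does not help across a native IPv6 backbone, since the recovered IPv4 address would not be reachable in the IPv6 core; that is the motivation for allowing an IPv6 next hop with IPv4 prefixes.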
524 +--------+ +--------+ 525 | IPv4 | | IPv4 | 526 | AF | | AF | 527 | Access | | Access | 528 | Island | | Island | 529 +--------+ +--------+ 530 | \ / | 531 | \ / | 532 | \ / | 533 | X | 534 | / \ | 535 | / \ | 536 | / \ | 537 +--------+ +--------+ 538 | AFBR | | AFBR | 539 +--| IPv4/6 |---| IPv4/6 |--+ 540 | +--------+ +--------+ | 541 +-------+ | | +-------+ 542 | IPv6 | | | | IPv6 | 543 | AF | | IPv6 | | AF | 544 | Access|-------| Transit Core |-------| Access| 545 | Island| | | | Island| 546 +-------+ | | +-------+ 547 | +--------+ +--------+ | 548 +--| AFBR |---| AFBR |--+ 549 | IPv4/6 | | IPv4/6 | 550 +--------+ +--------+ 551 | \ / | 552 | \ / | 553 | \ / | 554 | X | 555 | / \ | 556 | / \ | 557 | / \ | 558 +--------+ +--------+ 559 | IPv4 | | IPv4 | 560 | AF | | AF | 561 | Access | | Access | 562 | Island | | Island | 563 +--------+ +--------+ 565 Figure 2 IPv4-over-IPv6 Scenario 567 4. Reference Models 569 This section illustrates the softwire mesh and AFBR reference 570 models and describes the entities of each. 572 4.1. Softwire Mesh Reference Model 574 The reference model for the softwires mesh framework is illustrated 575 in Figure 3. 577 | | | | 578 | | | | 579 | |<------------>| | 580 | | | | 581 |<---AF(i)---->|<----|<-----AF(i)--->| 582 | Routing | SW_NHOP> | Routing | 583 | | | | 584 +-------+ +-------+ +-------+ +-------+ 585 |AF(i) | | | (Single AF(j)) | | |AF(i) | 586 |Access |------| AFBR |===(Transit Core)===| AFBR |------|Access | 587 |Island | |AF(i,j)| |AF(i,j)| |Island | 588 +-------+ +-------+ +-------+ +-------+ 589 /|\ /|\ 590 | | 591 | [STH] | 592 ---[SPH]---->SW Encap=======[SPH]==========>SW Decap---[SPH]---> 593 [payload] [payload] [payload] 595 Figure 3 Softwire Mesh Reference Model 597 Softwires are established between dual-stack AF(i,j) AFBRs using 598 softwire signaling. AF access island reachability and softwire next- 599 hop information (SW_NHOP) are exchanged between AFBRs.
AFBRs will 600 also peer with routers in the AF access island networks to exchange 601 AF(i) routing information. Packets composed of a payload and an IP 602 header termed the SPH flow across the single AF transit core in 603 softwires encapsulated with an AF(j)-based STH. 605 4.2. Entities of the Softwire Mesh Reference Model 607 The entities of the reference model are: 609 Single AF(j) transit core 611 This is an IPv4 or IPv6 backbone network surrounded by a periphery of 612 dual-stack AFBR routers. The transit core provides inter-access 613 island connectivity across a mesh of softwires. 615 Note that single AF(j) access islands may also be attached to the 616 single AF(j) transit core. Connectivity between single AF(j) access 617 islands across the transit core can be accomplished using softwires 618 or normal default routing functions depending on the wishes of the 619 operator and routing configuration of the system. 621 AF Access Islands 623 Client access island networks can be single AF(i) or dual-stack 624 AF(i,j) in makeup and rely on the transit core for connectivity to 625 remote access island networks of the same AF. Routers inside an AF 626 access island will run a routing protocol and a subset of access island 627 CE routers will peer with upstream AFBRs to exchange client AF(i) or 628 AF(i,j) reachability information. 630 Address Family Border Routers (AFBR) 632 These are dual-stack AF(i,j) routers positioned at the edge of the 633 transit core. They will form a peering relationship with one or more 634 CE routers located inside the AF access island for the purpose of 635 exchanging AF access island reachability information. AFBR nodes 636 will peer with each other (directly or via a route reflector) to 637 exchange SW-encap sets, perform softwire signaling, advertise AF 638 access island reachability information and SW-NHOP information.

   Softwire Signaling

   This involves, first, the local definition of SW-encap sets on each
   AFBR and, second, the dynamic establishment of softwires, in which
   the peering AFBRs exchange their configured SW-encap sets. A
   SW-encap set contains tunnel header parameters, preferences, and the
   expected payload types (e.g. IPv4) carried by the softwire(s).

   Client IP AF payloads originating at an AF access island are
   encapsulated in the STH at the ingress AFBR, forwarded across the
   backbone, de-encapsulated at the egress AFBR, and forwarded on to
   the destination.

4.3. AFBR Reference Model

   The reference model for a dual-stack, softwire-capable AFBR node is
   shown in Figure 4.

                        +-------------------------------->Remote AF(i,j)
                        |                                   AFBR Peers
                        |
                        |    +--------------------------->AF(j) Transit
                        |    |                              Core Peers
                 +------|----|-------------------------+
                 |      |    |                         |
                 |     \|/  \|/                        |
                 |  +--------------+  +-------------+  |
                 |  |              |  |             |  |
   AF(i),AF(i,j) |  |AF(i,j)Access |  |             |  |
   Access     <--|--|AF(j) Transit |  |  SW Tunnel  |--|->Remote AF(i,j)
   Island        |  |     Core     |  |  Signaling  |  |    AFBR Peers
                 |  |    RIB(s)    |  |             |  |
                 |  |              |  |             |  |
                 |  +--------------+  +-------------+  |
                 |      /|\   \             /|\        |
                 |       |     \             |         |
                 |       |      \            |         |
                 |       |       \           |         |
                 |       |        \          |         |
                 |       |         \         |         |
                 |      \|/        _\|      \|/        |
                 |  +----------+     +--------------+  |
                 |  |          |     |              |  |
                 |  |          |     |  SW Tunnel   |  |
     AF Access<--|->|  L3 FIB  |<--->| Encap/Decap  |<-|-->Single AF
        Island   |  |          |     |  Forwarding  |  |   Transit Core
                 |  |          |     |              |  |
                 |  +----------+     +--------------+  |
                 |                                     |
                 +-------------------------------------+

                Figure 4 Softwire AFBR Reference Model

4.4. Entities of the AFBR Reference Model

   The entities of the softwire AFBR reference model are:

   SW Signaling Module

   This module is responsible for exchanging SW-encap set(s) and other
   information with interested AFBR nodes for the purpose of
   establishing and managing inter-AFBR softwires.

   AF Access Island and Transit Core RIB(s)

   This entity represents the one or more routing information bases
   (RIBs) needed to store AF reachability information received over the
   AFBR's multiple peering relationships. An AFBR will peer with one or
   more AF access island CE routers to exchange AF(i), or AF(i) and
   AF(j), prefixes and store those in an AF access island RIB. An AFBR
   will peer with remote AFBR nodes to exchange the same possible
   combination of AF access island prefixes (accompanied by SW-NHOP
   information) and store those in the same AF access island RIB.
   Finally, the AFBR will peer with routers inside the transit core and
   store that information in a transit core RIB.

   AF L3 FIB(s)

   This entity represents the one or more forwarding information bases
   (FIBs) computed from the RIB(s) and needed to forward packets to and
   from the AF access islands and into and out of the softwire tunnels.

   SW Tunnel Encap/Decap and Forwarding

   This entity represents the softwire encapsulation and decapsulation
   processes performed at the ingress and egress AFBR respectively, as
   well as the lookup and forwarding of the packet based on the STH.

   This is NOT how a specific implementation must look; rather, it
   illustrates the basic function blocks that run in the dual-stack,
   softwire-capable AFBR.

4.5. Comments on Single AF AFBR Reference Models

   This document describes a framework employing dual-stack AFBR nodes.
   Noting the cost and perceived complexity of running anything in
   dual-stack, one might ask whether it is possible to solve this
   problem using single-stack AFBR nodes. The answer is yes.

   One technique would be to make the AFBR a single-stack AF(j) node
   similar to the transit core routers.
   It is then up to a CE device located in the AF access island network
   to encapsulate and de-encapsulate packets in a tunnel that can be
   forwarded across the single-stack AF(j) backbone composed of AFBR
   and core routers. This involves moving the dual-stack AF(i,j)
   processing into the AF access island networks. This processing might
   involve manual configuration of inter-CE tunnels or inter-CE BGP
   peering to exchange client AF prefixes and next hops, which may not
   be desirable to the operators of those networks. We call this the
   dual-stack CE model.

   Another technique is to employ pseudowire (PW) control and
   encapsulations [RFC3985] as a means of tunneling AF access island
   packets across the transit core. In this case the AFBR assumes the
   role of an L2 PE and need only peer with transit core routers and
   other remote L2 PE devices. It will forward packets based only on L2
   connection, L2 header, or interface information. A CE router will
   attach to the L2 AFBR and exchange L2-encapsulated AF access island
   packets across L2 connections. We call this the L2VPN model.

   These approaches circumvent the requirement for dual-stack
   functionality on the AFBR, which can be viewed as an advantage. The
   disadvantage of the dual-stack CE model is the added cost and
   complexity placed in the AF access island networks. The disadvantage
   of the L2VPN model is limited scalability.

5. Softwire Signaling

   A mesh of inter-AFBR softwires spanning the single AF transit core
   must be in place before packets can flow between AF access island
   networks. Given N dual-stack AFBRs, it is possible to erect a
   softwire mesh by manually configuring a full mesh of point-to-point
   IP or label switched path (LSP) tunnels. This of course introduces
   the O(N^2) provisioning problem.
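
   To make the O(N^2) growth concrete, the following is a minimal
   sketch that counts the tunnels an operator would have to configure
   by hand. It assumes one tunnel per AFBR pair when tunnels are
   bidirectional and two when they are unidirectional; the function
   name full_mesh_tunnels is illustrative and not part of the
   framework.

```python
def full_mesh_tunnels(n_afbrs: int, bidirectional: bool = True) -> int:
    """Count the point-to-point tunnels needed to fully mesh n_afbrs AFBRs.

    Assumption (not from the framework text): a bidirectional tunnel
    serves both directions of an AFBR pair; otherwise each pair needs two.
    """
    if n_afbrs < 0:
        raise ValueError("n_afbrs must be non-negative")
    pairs = n_afbrs * (n_afbrs - 1) // 2  # unordered AFBR pairs
    return pairs if bidirectional else 2 * pairs


if __name__ == "__main__":
    for n in (4, 10, 100):
        print(n, full_mesh_tunnels(n), full_mesh_tunnels(n, bidirectional=False))
```

   Under these assumptions, going from 10 to 100 AFBRs raises the
   number of bidirectional tunnels from 45 to 4950, which is why manual
   provisioning does not scale.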

   A more scalable and provisioning-friendly approach is to establish
   the softwire mesh dynamically using some form of signaling. Before
   deciding on an approach we make two quick observations:

   o  Reachability to particular AF access island prefixes is always
      through one or more egress AFBRs. In the BGP vernacular, this
      would be the BGP next_hop pointing to an egress PE.

   o  The egress AFBR knows exactly the composition of the SW-encap
      sets it is capable of supporting. It knows what tunnel
      encapsulation type(s) it can handle and the parameters (e.g.
      tunnel header fields) of those tunnel encapsulation types. If the
      egress AFBR supports more than one tunnel encapsulation type,
      then the egress AFBR's preference for using one over the other
      can be expressed. It is also aware of the IP AF payloads these
      tunnel encapsulations will bear, and of course it knows its own
      IP address.

   With this in mind, we can envision a softwire signaling solution
   where each egress AFBR advises all interested ingress AFBRs of the
   following: <SW-encap set, AFBR IP address>. Upon receiving this
   information, the ingress AFBR knows how to reach the egress AFBR and
   what tunnel encapsulation types to apply if it needs to forward
   packets to prefixes emanating from that egress AFBR.

5.1. SW Encapsulation Sets

   A SW-encap set is composed of the following:

   o  a type of payload (e.g. IPv4 or IPv6) that will be transported in
      the softwire. This exists so that the egress AFBR can optimize
      its processing (e.g. lookup) of the payload once it exits the
      softwire.

   o  one or more tunnel encapsulation types and associated parameters
      that can be applied to the payload type. For example, one
      encapsulation type might be GRE [RFC2784], and the parameters
      would be an ethertype and optionally a key value.

   o  a preference expressed by the egress AFBR to the multiple ingress
      AFBRs on which tunnel encapsulation type to apply to the
      specified payload. This applies if the egress AFBR supports, and
      has advertised, multiple tunnel encapsulation types for a
      particular payload.

   In short, the SW-encap set and the IP address of the egress AFBR are
   what need to be conveyed to all interested ingress AFBR nodes so
   that the softwire mesh can be built.

5.2. BGP

   An ideal choice for softwire signaling is MP-BGP [RFC2858]. First,
   it supports the one-to-many signaling paradigm required by the
   egress AFBR to communicate softwire information to multiple ingress
   AFBRs. This can occur across a full mesh of BGP connections, or
   route reflectors can be employed for improved scalability. Second,
   BGP need only operate between softwire-capable AFBR nodes, since
   these are the only devices that maintain softwire tunneling state.
   Finally, BGP has proven to be quite extensible and so can be easily
   extended to carry softwire information between AFBRs.

   A new SAFI, defined in [draft-nalawade-kapoor-tunnel-safi], and a
   new attribute for encoding SW-encap sets, defined in
   [draft-nalawade-softwire-encap-attribute], are introduced to provide
   MP-BGP with the ability to dynamically establish a softwire mesh
   between AFBR nodes. [draft-nalawade-kapoor-tunnel-safi] describes
   the following:

   o  A new SAFI (= 64) encoded in the MP_REACH_NLRI attribute
      indicates that information pertaining to an IPv4 (AFI=1) or IPv6
      (AFI=2) tunnel is encoded. We refer to the MP_REACH_NLRI
      attribute containing the tunnel SAFI as just the tunnel SAFI.

   o  The NLRI of the tunnel SAFI is encoded with a tunnel identifier
      and the IP address of the tunnel end-point, which in this case is
      the IP address of the egress AFBR.
      The identifier and/or the IP address will be referenced by
      subsequent prefix advertisements, coupled with a SW-NHOP
      attribute value, to associate reachability to an advertised
      prefix with that softwire.

   [draft-nalawade-softwire-encap-attribute] describes the following:

   o  Payload AFI and SAFI.

   o  Softwire encapsulation parameters defining the tunnel
      encapsulation types and preferences.

   The egress AFBR will then employ MP-BGP to distribute this
   information as a means of signaling softwire setups to interested
   AFBR nodes. The softwire payload AFI/SAFI and the tunnel
   encapsulation attributes collectively form the SW-encap set, and the
   NLRI of the tunnel SAFI contains a tunnel identifier value and the
   tunnel end-point IP address, which is the IP address of the egress
   AFBR.

   Note that this information should be confined to participating
   autonomous systems, so mechanisms to control the distribution of
   softwire information should be invoked as needed.

   Upon receiving the BGP tunnel SAFI advertisement, the ingress AFBR
   resolves the SW-encap set to exactly one encapsulation type to use
   when sending packets of the specified payload type to a destination
   advertised by the egress AFBR. This is based first on whether the
   ingress AFBR is capable of supporting a particular encapsulation
   type and second on the order of preference.

   With respect to order of preference, it is desirable that the
   ingress AFBR attempt to honor the preference expressed by the egress
   AFBR. However, it is possible to configure a policy on the ingress
   AFBR that overrides the preference received from the egress AFBR in
   the BGP tunnel SAFI update.

5.3. Non-BGP Signaling

   Dynamic and static methods that do not employ MP-BGP and the BGP
   tunnel SAFI can be used to establish the mesh of inter-AFBR
   softwires.
   Existing point-to-point signaling protocols can establish discrete
   softwires between pairs of AFBR nodes. For example, if the transit
   core is based on MPLS, then the operator could configure a mesh of
   traffic-engineered tunnel LSPs using RSVP-TE signaling [RFC3209]
   between the ingress and egress AFBRs. The mesh of TE LSPs would
   constitute the softwire mesh.

   In the case where point-to-point signaling is used, it might be
   necessary to configure each softwire with an identifier that can
   later be referenced by a client AF prefix advertisement received at
   the ingress AFBR to indicate that the softwire leads to the set of
   client AF prefixes.

6. Softwire Routing and Tunnel Selection

   Once the softwire mesh is in place, it becomes possible to forward
   packets over a particular softwire. The ingress AFBR will make this
   decision based on learned client AF access island reachability
   information and the corresponding SW-NHOP value pointing to a
   specific softwire to use.

   So essentially two things need to happen here:

   o  Egress AFBR nodes need to advertise client AF access island
      reachability to the set of interested ingress AFBRs.

   o  Egress AFBR nodes need to identify a softwire to use to reach the
      advertised AF access island prefixes.

   The learned tunnel-to-use information will also include the IP
   address of the egress AFBR.

6.1. Advertising Client AF Access Island Reachability

   AFBR nodes maintain routing tables gleaned from directly connected
   AF access island networks. The prefixes maintained in these AF
   access island routing tables will be from different address
   families, including IPv4, VPNv4, IPv6 and VPNv6. Therefore the
   logical choice for conveying multi-AF reachability information
   between AFBR nodes is MP-BGP.

   Softwire routing will employ existing and proven mechanisms for
   advertising reachability between AFBR nodes.
   Table 1 below lists the possible access island AF/SAF combinations
   maintained on the AFBR nodes and the reference describing the MP-BGP
   implementation of each:

      AF/SAF          Reference
      ==============  ================================
      1/1   (IPv4)    [RFC2858]
      2/1   (IPv6)    [RFC2858]
      1/128 (VPNv4)   [RFC4364]
      2/128 (VPNv6)   [draft-ietf-l3vpn-bgp-ipv6]

                 Table 1 AF Access Island References

   It should be noted that prefixes belonging to different AF/SAFs will
   be stored in different routing tables on the AFBR. In addition, the
   particular AF/SAF combinations described here are completely
   separate from the AF of the transit core.

6.2. Tunnel Selection

   In conjunction with multi-AF reachability achieved through existing
   MP-BGP protocol machinery, some form of tunnel indirection must be
   performed. In other words, the egress AFBR must tell the ingress
   AFBRs that to reach a particular prefix across the transit core, the
   ingress AFBR must direct the packets into a specific softwire. The
   notion of instructing a node to forward packets to a destination
   other than the default next_hop is referred to as tunnel
   indirection.

   Tunnel indirection generally applies in the following cases:

   o  if the characteristics of an existing tunnel (i.e. softwire)
      change, then the ingress AFBR must be advised that reachability
      to prefixes is now only possible through the changed tunnel. For
      example, an egress AFBR employing L2TPv3 encapsulation may decide
      to alter its cookie value in the L2TPv3 header for security
      reasons. The egress AFBR transmits a BGP tunnel SAFI update
      containing a new SW-encap set with the new cookie value. In
      addition, the egress AFBR must quickly inform the ingress AFBRs
      that advertised AF access island reachability is now supported
      through the updated L2TPv3-based softwire.

   o  if there are multiple softwires to choose from.
      An egress AFBR could advertise two or more discrete SW-encap
      sets, each capable of carrying the same payload type(s). In this
      case the egress AFBR must inform the ingress AFBR which prefixes
      are reachable through which softwire.

   Mapping prefixes to specific tunnels can also be done manually, but
   this comes at a cost in operational complexity.

6.2.1. Softwire Next_Hop

   Tunnel indirection can be achieved by accompanying MP-BGP prefix
   updates with a pointer to a softwire leading to the advertised
   prefix. This pointer could be a value that indexes a table (located
   on the ingress AFBR) of referenceable softwires leading to the
   advertising, or egress, AFBR. The pointer might also be complemented
   with an IP address that serves as the softwire end-point address
   configured on the egress AFBR. This IP address would also be used by
   the ingress AFBR in the STH when assembling and imposing the
   encapsulation headers on the client packet to be forwarded through
   the softwire that spans the transit core.

   The BGP Softwire Next Hop (SW-NHOP) attribute
   [draft-nalawade-sw-nhop] is a new attribute that serves this
   purpose. This attribute effectively functions as a softwire next_hop
   (SW_NHOP) in that it points to a previously configured softwire (on
   the ingress AFBR) and provides an IP address that can be used in the
   STH of the softwire encapsulation. An ingress AFBR that receives a
   prefix carrying a SW_NHOP (by virtue of the BGP SW-NHOP attribute)
   is being instructed to forward packets to that prefix through the
   referenced softwire.

   Another use of this attribute is that it can supersede the next_hop
   value contained in the MP_REACH_NLRI. Since the SW_NHOP attribute
   explicitly identifies the AFI/SAFI of its contained IP address, the
   constraint of matching AFI/SAFI between the NLRI and next hop fields
   in an MP_REACH_NLRI can be circumvented.
   This will be quite useful when addressing the IPv4-over-IPv6
   scenario.

   The SW_NHOP attribute links reachability to a softwire and to the IP
   address at the end of the softwire, which in this case is the IP
   address of the egress AFBR. This translates into a softwire
   encapsulation action performed by the ingress AFBR rather than just
   a forwarding action to a next hop router, as is the usual case with
   routing.

   If prefixes carrying the SW_NHOP attribute arrive at an ingress AFBR
   and there is no active softwire, then the prefixes are dropped since
   they are not reachable. If prefixes arrive without the SW_NHOP
   attribute, or if this function has not been negotiated in the
   capabilities exchange, then normal next_hop forwarding should be
   performed.

   A situation may arise where an AF(i) prefix is reachable through
   both a softwire and a normal next_hop address. It is possible that
   some form of load sharing could occur, where some packets are
   directed through the softwire next hop and others through the normal
   next hop. The implementation can choose to support this scenario or
   mandate the use of just a single method for expressing reachability
   and next hop information. In practice it is likely that the provider
   will only support a single form of transport between AFBR nodes.

   The use of the SW_NHOP attribute offers several advantages:

   o  The egress AFBR can assign reachability to different prefix sets
      using different encapsulation schemes. For example, one set of
      prefixes can be reached through a GRE tunnel and another might be
      reachable through an IPsec tunnel.

   o  Decoupling of AF access island reachability and softwire
      signaling leads to more efficient MP-BGP processing. This is
      because MP-BGP prefix updates do not transport the corresponding
      SW-encap set(s). That is performed once per softwire setup by the
      BGP tunnel SAFI.
      The prefix updates only carry a pointer indexing the softwire to
      use.

   o  The constraint that the NLRI and next-hop fields share the same
      AFI/SAFI in the MP_REACH_NLRI attribute is removed by encoding
      overriding next hop information in the SW_NHOP attribute.

   In summary, MP-BGP will advertise <client AF prefix, SW_NHOP> tuples
   between peering AFBR nodes as a means of binding reachability to a
   client AF prefix through a configured softwire.

6.2.2. Next_Hop Overlay Addressing

   Another form of tunnel indirection can be achieved by linking
   advertised MP-BGP prefix next_hop information to a tunnel end-point
   terminating in the egress AFBR. This is achieved by configuring an
   IP overlay address, called IP(overlay), on the egress AFBR that is
   part of the same AF as that of the advertised AF access island
   prefixes.

   The softwire mesh is dynamically built by extending the BGP tunnel
   SAFI to carry the following information: <IP(overlay), SW-encap set,
   AFBR IP address>.

   When MP-BGP prefix updates arrive at the ingress AFBR, the next_hop
   will be resolved to the IP(overlay) address, which in turn is bound
   to an encapsulation action based on the SW-encap set and AFBR IP
   address values.

   In this case the SW_NHOP attribute is not used, and no additional
   information is needed when MP-BGP prefix updates are sent by egress
   AFBR nodes. However, the operator must configure one extra
   IP(overlay) address per softwire and then advertise this information
   in an extended BGP tunnel SAFI update.

6.3. Comments on a BGP-free Core

   The term BGP-free core describes a scenario where the network is an
   autonomous system (AS), the edge routers are EBGP speakers, and the
   strategy is to keep the core routers free of external routes. The
   edge routers must distribute routes to each other, but not to the
   core routers.

   This requirement opens up the possibility of designing the core
   network with the softwire mesh framework. The edge routers take on
   the role of AFBRs that exchange the external routes with each other.
   The external routes may or may not be of the same address family as
   the core network. The edge routers also signal the SW-encap set
   identifying each egress endpoint. Traffic for the external routes is
   forwarded by encapsulating the packets with a tunnel header at the
   ingress.

7. Softwire Forwarding and Tunnel Encapsulations

7.1. Forwarding

   Forwarding of packets sourced from an AF access island onto a
   softwire originating in the ingress AFBR is composed of the
   following:

   o  Lookup of the AF access island IP destination address (SPH) in
      the respective AF access island routing and forwarding table.

   o  Encapsulation of the IP packet in the appropriate softwire
      transport header (STH).

   o  Transmission of the softwire-encapsulated packets across the
      single AF transit core based on the STH.

   When packets arrive at the egress AFBR the following actions are
   performed:

   o  Disposition of the STH.

   o  Lookup of the SPH in the corresponding AF access island routing
      and forwarding table.

   o  Transmission of the native AF access island IP packet toward the
      respective downstream CE router.

7.2. Encapsulations

   The softwire mesh framework is designed to accommodate any form of
   IP tunnel encapsulation. Examples include GRE [RFC2784], L2TPv3
   [RFC3931] and IP-in-IP [RFC2473]. MPLS encapsulation can also be
   accommodated.

   IPsec is a special case that will be covered in a separate document.

8. Softwire OAM and MIBs

8.1. OAM

   Softwires are essentially tunnels connecting routers. If they
   disappear or degrade in performance, then connectivity through those
   tunnels will be impacted.
   There are several techniques available to monitor the status of the
   tunnel end-points (e.g. AFBRs) as well as the tunnels themselves.
   These techniques allow operations such as softwire path tracing,
   remote softwire end-point pinging, and remote softwire end-point
   liveness failure detection.

   Examples of techniques applicable to softwire OAM include:

   o  BGP/TCP timeouts between AFBRs

   o  ICMP or LSP echo request and reply addressed to a particular AFBR

   o  BFD [draft-ietf-bfd-base] packet exchange between AFBR routers

   Another possibility for softwire OAM is to build something similar
   to the MPLS OAM framework [RFC4378], in other words, to create and
   generate softwire echo request/reply packets. The echo request, sent
   to a well-known UDP port, would contain the egress AFBR IP address
   and the softwire identifier as the payload (similar to the MPLS
   forwarding equivalence class contained in the LSP echo request). The
   softwire echo packet would be encapsulated with the STH and
   forwarded across the same path (inband) as that of the softwire
   itself.

   This mechanism can also be automated to periodically verify remote
   softwire end-point reachability, with the loss of reachability being
   signaled to the softwire application on the local AFBR, thus
   enabling suitable actions to be taken. Consideration must be given
   to the trade-offs between the scalability of such automated
   mechanisms and the time to detect loss of end-point reachability.

   In general, a framework for softwire OAM can for a large part be
   based on the L3VPN operations and management framework [RFC4176].

8.2. MIBs

   Specific MIBs do exist to manage elements of the softwire mesh
   framework. However, there will be a need to either extend these MIBs
   or create new ones that reflect the functional elements that can be
   SNMP-managed within the softwire network.

9. Softwire Multicast

   A set of client IP AF access island networks that are connected to a
   provider's single AF transit core network may wish to run IP
   multicast applications. Extending IP multicast connectivity across
   the provider's single AF transit core network can be accomplished
   using a variety of techniques.

   One option is to extend client IP AF multicast up to the provider's
   AFBR and then tunnel the client IP AF multicast packets across the
   unicast softwire mesh. Tunneling IP multicast packets across
   inter-router unicast IP tunnels such as GRE has been performed for
   years. This is sub-optimal from the provider's perspective given
   that no replication is done inside the transit core. A further
   penalty is incurred by the replication processing performed by the
   ingress AFBR, especially if there are many downstream AFBRs on the
   tree. The advantage is that it does work and the softwire mesh
   handles both unicast and multicast traffic.

   A second option is to leverage the multicast VPN work already
   defined and in fact implemented [draft-ietf-l3vpn-2547bis-mcast]. In
   this scenario the transit core implements native IP multicast or
   MPLS multipoint LSPs to establish multipoint distribution trees
   called provider multicast service instances (PMSI). The PMSI might
   use an IP tunnel encapsulation with a provider-only group address as
   one encapsulation or MPLS labels as another. Client AF access island
   multicast packets are encapsulated in PMSI headers at the ingress
   AFBR and then transmitted on the appropriate PMSI for delivery to
   leaf AFBRs. The PMSI headers are removed and the client AF multicast
   packet is sent on its way.

   It should be noted that PMSI establishment and encapsulation operate
   separately from softwire signaling and encapsulation. One could say
   they operate as "ships in the night".
   The advantage of the MVPN approach is that packet replication is
   performed in the transit core. The disadvantage is that enabling
   client AF IP multicast means that client AF prefixes must be stored
   and processed in VPN routing tables on the AFBR, which as noted
   earlier may not be something the provider wishes to do. It should
   also be pointed out that the existing MVPN implementations, and in
   fact the current [draft-ietf-l3vpn-2547bis-mcast] draft, only refer
   to the IPv4 AF for both the client and backbone networks. This
   effort needs to be extended to support the IPv6 AF.

   A third option is to push the client AF-to-backbone AF interface
   down to the CE. A simple way of realizing this would be to establish
   a mesh of point-to-point softwires between participating CE routers.
   This has scaling concerns similar to the aforementioned AFBR-based
   softwire mesh tunneling solution.

   Another technique specific to the IPv4-over-IPv6 scenario is
   outlined in [draft-ietf-softwire-4over6vpns]. An IPv6 group address
   is assigned to each VPN, and the CE routers join the group to
   discover each other and build inter-CE routing adjacencies. IPv4
   multicast packets are encapsulated in IPv6 packets destined to a
   group address derived from the IPv4-based source and group address
   information. The advantage of this approach in particular is that
   the AFBR only runs IPv6 and not a dual stack.

10. Inter-AS Considerations

   [RFC4364] describes three methods for supporting L3VPN functions
   across inter-AS topologies. These methods can be leveraged to
   support softwire signaling, routing and encapsulation across the
   same topologies.

10.1. Option A: Back-to-Back AFBRs

   This option works seamlessly with the softwire mesh framework.
   Referring to Figure 5, the peering AFBRs located in different
   autonomous systems (AFBR2 and AFBR3) have one or more attachment
   circuits that are capable of forwarding AF(i) packets. AFBR1 and
   AFBR2 exchange client AF(i) prefixes and SW-NHOP information with
   each other. From the forwarding perspective, beginning at AFBR1,
   packets are tunneled over softwire 1, which terminates at AFBR2,
   where the packets are de-encapsulated and sent as normal AF(i) IP
   packets over the attachment circuit to AFBR3 in the other AS. The
   packets are then tunneled over softwire 2 to AFBR4, and so on.

                     Softwire 1
             +************************+
    AF(i)    |                        |
         --AFBR1=(AS1 transit core)=AFBR2
    island
                       AF(j)        | | |
                                    | | |       AF(k)
                                                                  AF(i)
                                    AFBR3=(AS2 transit core)=AFBR4--
                                      |                        |  island
                                      +************************+
                                              Softwire 2

                 Figure 5 Option A Back-to-Back AFBRs

   A characteristic of option A is that the AF of the transit core
   network within each AS could be different, i.e. the first AS's core
   network can be AF(j) and the second AS's core network can be AF(k).

10.2. Option B: EBGP redistribution of AF(i) prefixes

   In this procedure, the AFBRs use MP-iBGP and MP-eBGP signaling
   between intra-AS and inter-AS AFBR peers to build a contiguous set
   of softwires that span multiple autonomous systems. The same MP-iBGP
   and MP-eBGP machinery is then used to redistribute AF(i) prefixes
   and SW-NHOP attributes, which in turn builds the forwarding plane
   through this contiguous set of softwires.

   o  The packets at the ingress AFBR enter softwire 1.

   o  Softwire 1 terminates at AFBR2.

   o  Packets enter softwire 2, which is set up between AFBR2 and
      AFBR3.

   o  Softwire 2 terminates on AFBR3.

   o  Packets are then de-encapsulated and forwarded over softwire 3 to
      AFBR4, and so on.
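
   The hop-by-hop behaviour listed above can be sketched as follows.
   This is only an illustration under the assumption that each pair of
   consecutive AFBRs on the path is joined by its own softwire, so that
   every intermediate AFBR de-encapsulates and then re-encapsulates the
   packet. The function name softwire_segments and the list of AFBR
   names are hypothetical.

```python
from typing import List, Tuple


def softwire_segments(afbr_path: List[str]) -> List[Tuple[int, str, str]]:
    """Return (softwire number, ingress AFBR, egress AFBR) for each hop.

    Assumption for illustration: one softwire per consecutive AFBR pair,
    numbered from 1 in path order, as in the Option B walk-through.
    """
    return [
        (index + 1, afbr_path[index], afbr_path[index + 1])
        for index in range(len(afbr_path) - 1)
    ]


if __name__ == "__main__":
    path = ["AFBR1", "AFBR2", "AFBR3", "AFBR4"]
    for number, ingress, egress in softwire_segments(path):
        print(f"softwire {number}: encapsulate at {ingress}, "
              f"de-encapsulate at {egress}")
```

   With four AFBRs on the path this yields three softwire segments,
   matching softwires 1, 2 and 3 in the description of Option B.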

                     Softwire 1
             +************************+
    AF(i)    |                        |
         --AFBR1=(AS1 transit core)=AFBR2---+
    island                            H     | Softwire 2
                                      H     |
                                      H    _|
                                      H   /                       AF(i)
                                    AFBR3=(AS2 transit core)=AFBR4--
                                      |                        |  island
                                      +************************+
                                              Softwire 3

         Figure 6 Option B EBGP Distribution of AF(i) Prefixes

   To set up the softwires, softwire signaling exchanges SW-encap sets
   between each pair of peering AFBRs, including those AFBR peers
   located in separate autonomous systems (e.g. AFBR2 and AFBR3 in
   Figure 6).

   In this option as well, the AFs of the core networks within each AS
   need not be the same, i.e. the AS1 transit core can be AF(j) whereas
   the AS2 transit core can be AF(k). The inter-AS AFBRs of course need
   to be AF(i,j,k) aware, where i, j and k could be the same or
   different.

10.3. Option C: Multihop EBGP distribution of AF(i) prefixes

   In this procedure, AF(i) prefixes are redistributed by some
   designated AFBRs to the other AS with the next hop unchanged (i.e.
   the next hops are not rewritten by the EBGP AFBR speaker). The
   inter-AS peering AFBRs need only exchange host AF(j) routes of the
   AFBRs so that AFBRs in both ASes have reachability to each other.
   The designated AFBRs also signal the SW-encap sets of each end-point
   without rewriting the next hops. This facilitates the creation of an
   end-to-end softwire from the ingress AFBR to the egress AFBR, as
   illustrated in Figure 7.

                                 Softwire
           +***************************************************+
           |                                                   |
           |                                                   |
           |                            +==AFBR3-----+         |
   AF(i)   |                          //     . EBGP & softwire |
       --AFBR1=(AS1 transit core)=AFASBR1    . signaling       |
   island                           H        .                 |
                                    H        .                 |
                                    H   +==AFBR4               |
                                    H //                       |  AF(i)
                                  AFASBR2=(AS2 transit core)=AFBR2--
                                                                  island

     Figure 7 Option C Multihop eBGP distribution of AF(i) prefixes

   This procedure inherently assumes that the transit cores of both
   ASes run the same AF(j), because of the requirement to redistribute
   reachability information of AFBRs across ASes.

11. Security Considerations

   Security for softwire signaling can be achieved using BGP/TCP MD5
   keying. The softwire data plane can employ encryption of the data
   packets using IPsec. This will be explained in a companion document.

   [RFC4111] outlines the L3VPN security framework, which in many cases
   is directly applicable to the softwire mesh framework.

12. Acknowledgment

   David Ward, Eric Rosen, Chris Cassar, Ruchi Kapoor, Pranav Mehta,
   Mingwei Xu and Ke Xu provided useful input into this document.

13. References

13.1. Normative References

   [RFC1700]  Reynolds, J. and J. Postel, "Assigned Numbers", RFC 1700,
              October 1994.

   [RFC2473]  Conta, A. and S. Deering, "Generic Packet Tunneling in
              IPv6 Specification", RFC 2473, December 1998.

   [RFC2784]  Farinacci, D., Li, T., Hanks, S., Meyer, D., and P.
              Traina, "Generic Routing Encapsulation (GRE)", RFC 2784,
              March 2000.

   [RFC2858]  Bates, T., Rekhter, Y., Chandra, R., and D. Katz,
              "Multiprotocol Extensions for BGP-4", RFC 2858,
              June 2000.

   [RFC3209]  Awduche, D., Berger, L., Gan, D., Li, T., Srinivasan, V.,
              and G. Swallow, "RSVP-TE: Extensions to RSVP for LSP
              Tunnels", RFC 3209, December 2001.

   [RFC3931]  Lau, J., Townsley, M., and I. Goyret, "Layer Two
              Tunneling Protocol - Version 3 (L2TPv3)", RFC 3931,
              March 2005.

   [RFC3985]  Bryant, S. and P. Pate, "Pseudo Wire Emulation
              Edge-to-Edge (PWE3) Architecture", RFC 3985, March 2005.

   [RFC4110]  Callon, R. and M.
              Suzuki, "A Framework for Layer 3
              Provider-Provisioned Virtual Private Networks (PPVPNs)",
              RFC 4110, July 2005.

   [RFC4111]  Fang, L., "Security Framework for Provider-Provisioned
              Virtual Private Networks (PPVPNs)", RFC 4111, July 2005.

   [RFC4176]  El Mghazli, Y., Nadeau, T., Boucadair, M., Chan, K., and
              A. Gonguet, "Framework for Layer 3 Virtual Private
              Networks (L3VPN) Operations and Management", RFC 4176,
              October 2005.

   [RFC4364]  Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private
              Networks (VPNs)", RFC 4364, February 2006.

   [RFC4378]  Allan, D. and T. Nadeau, "A Framework for Multi-Protocol
              Label Switching (MPLS) Operations and Management (OAM)",
              RFC 4378, February 2006.

13.2.  Informative References

   [draft-ietf-bfd-base]
              Katz, D. and D. Ward, "Bidirectional Forwarding
              Detection", draft-ietf-bfd-base-04 (work in progress),
              October 2005.

   [draft-ietf-l3vpn-2547bis-mcast]
              Rosen, E. and R. Aggarwal, "Multicast in MPLS/BGP IP
              VPNs", draft-ietf-l3vpn-2547bis-mcast-01 (work in
              progress), December 2005.

   [draft-ietf-l3vpn-bgp-ipv6]
              Clercq, J., "BGP-MPLS IP VPN extension for IPv6 VPN",
              draft-ietf-l3vpn-bgp-ipv6-07 (work in progress),
              August 2005.

   [draft-ietf-l3vpn-gre-ip-2547]
              Rekhter, Y., "Use of PE-PE GRE or IP in BGP/MPLS IP
              Virtual Private Networks",
              draft-ietf-l3vpn-gre-ip-2547-05 (work in progress),
              August 2005.

   [draft-ietf-softwire-problem-statement]
              Li, X., "Softwire Problem Statement",
              draft-ietf-softwire-problem-statement-00 (work in
              progress), December 2005.

   [draft-nalawade-kapoor-tunnel-safi]
              Nalawade, G., "Tunnel SAFI",
              draft-nalawade-kapoor-tunnel-safi-04 (work in progress),
              October 2005.

   [draft-ietf-softwire-4over6vpns]
              Shephard, G., "IPv4 unicast/multicast VPNs over an IPv6
              core", draft-ietf-softwire-4over6vpns-00 (work in
              progress), June 2006.

   [draft-nalawade-softwire-encap-attribute]
              Nalawade, G., "BGP Softwire Encapsulation Attribute",
              draft-nalawade-softwire-encap-attribute-00 (work in
              progress), June 2006.

   [draft-nalawade-sw-nhop]
              Nalawade, G., "BGP Softwire Next Hop Attribute",
              draft-nalawade-sw-nhop-00 (work in progress), June 2006.

Authors' Addresses

   Jianping Wu
   Tsinghua University
   Department of Computer Science, Tsinghua University
   Beijing  100084
   P.R. China

   Phone: +86-10-6278-5983
   Email: jianping@cernet.edu.cn

   Yong Cui
   Tsinghua University
   Department of Computer Science, Tsinghua University
   Beijing  100084
   P.R. China

   Phone: +86-10-6278-5822
   Email: yong@csnet1.cs.tsinghua.edu.cn

   Xing Li
   Tsinghua University
   Department of Electronic Engineering, Tsinghua University
   Beijing  100084
   P.R. China

   Phone: +86-10-6278-5983
   Email: xing@cernet.edu.cn

   Chris Metz
   Cisco Systems, Inc.
   3700 Cisco Way
   San Jose, CA  95134
   USA

   Email: chmetz@cisco.com

   Gargi Nalawade
   Cisco Systems, Inc.
   3700 Cisco Way
   San Jose, CA  95134
   USA

   Email: qarqi@cisco.com

   Simon Barber
   Cisco Systems, Inc.
   250 Longwater Avenue
   Reading, ENGLAND, RG2 6GB
   United Kingdom

   Email: sbarber@cisco.com

   Pradosh Mohapatra
   Cisco Systems, Inc.
   3700 Cisco Way
   San Jose, CA  95134
   USA

   Email: pmohapat@cisco.com

Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.
   Information on the procedures with respect to rights in RFC
   documents can be found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.

Disclaimer of Validity

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Copyright Statement

   Copyright (C) The Internet Society (2006).  This document is subject
   to the rights, licenses and restrictions contained in BCP 78, and
   except as set forth therein, the authors retain all their rights.

Acknowledgment

   Funding for the RFC Editor function is currently provided by the
   Internet Society.