Network Working Group                                              J. Wu
Internet Draft                                                    Y. Cui
Intended Status: Standards Track                     Tsinghua University
Expires: August 16, 2009
                                                                 C. Metz
                                                                E. Rosen
                                                     Cisco Systems, Inc.

                                                       February 16, 2009

                         Softwire Mesh Framework

               draft-ietf-softwire-mesh-framework-06.txt

Status of this Memo

This Internet-Draft is submitted to IETF in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

Copyright and License Notice

Copyright (c) 2009 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents in effect on the date of publication of this document (http://trustee.ietf.org/license-info). Please review these documents carefully, as they describe your rights and restrictions with respect to this document.

This document may contain material from IETF Documents or IETF Contributions published or made publicly available before November 10, 2008. The person(s) controlling the copyright in some of this material may not have granted the IETF Trust the right to allow modifications of such material outside the IETF Standards Process. Without obtaining an adequate license from the person(s) controlling the copyright in such materials, this document may not be modified outside the IETF Standards Process, and derivative works of it may not be created outside the IETF Standards Process, except to format it for publication as an RFC or to translate it into languages other than English.

Abstract

The Internet needs to be able to handle both IPv4 and IPv6 packets. However, it is expected that some constituent networks of the Internet will be "single protocol" networks. One kind of single protocol network can parse only IPv4 packets and can process only IPv4 routing information; another kind can parse only IPv6 packets and can process only IPv6 routing information. It is nevertheless required that either kind of single protocol network be able to provide transit service for the "other" protocol.
This is done by passing the "other kind" of routing information from one edge of the single protocol network to the other, and by tunneling the "other kind" of data packet from one edge to the other. The tunnels are known as "Softwires". This framework document explains how the routing information and the data packets of one protocol are passed through a single protocol network of the other protocol. The document is careful to specify when this can be done with existing technology, and when it requires the development of new or modified technology.

Table of Contents

1       Specification of requirements
2       Introduction
3       Scenarios of Interest
3.1     IPv6-over-IPv4 Scenario
3.2     IPv4-over-IPv6 Scenario
4       General Principles of the Solution
4.1     'E-IP' and 'I-IP'
4.2     Routing
4.3     Tunneled Forwarding
5       Distribution of Inter-AFBR Routing Information
6       Softwire Signaling
7       Choosing to Forward Through a Softwire
8       Selecting a Tunneling Technology
9       Selecting the Softwire for a Given Packet
10      Softwire OAM and MIBs
10.1    Operations and Maintenance (OAM)
10.2    MIBs
11      Softwire Multicast
11.1    One-to-One Mappings
11.1.1  Using PIM in the Core
11.1.2  Using mLDP and Multicast MPLS in the Core
11.2    MVPN-like Schemes
12      Inter-AS Considerations
13      IANA Considerations
14      Security Considerations
14.1    Problem Analysis
14.2    Non-cryptographic techniques
14.3    Cryptographic techniques
15      Contributors
16      Acknowledgments
17      Normative References
18      Informative References

1. Specification of requirements

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

2. Introduction

The routing information in any IP backbone network can be thought of as being in one of two categories: "internal routing information" or "external routing information". The internal routing information consists of routes to the nodes that belong to the backbone, and to the interfaces of those nodes. External routing information consists of routes to destinations beyond the backbone, especially destinations to which the backbone is not directly attached. In general, BGP [RFC4271] is used to distribute external routing information, and an "Interior Gateway Protocol" (IGP) such as OSPF [RFC2328] or IS-IS [RFC1195] is used to distribute internal routing information.

Often an IP backbone will provide transit routing services for packets that originate outside the backbone, and whose destinations are outside the backbone. These packets enter the backbone at one of its "edge routers".
They are routed through the backbone to another edge router, after which they leave the backbone and continue on their way. The edge nodes of the backbone are often known as "Provider Edge" (PE) routers. The term "ingress" (or "ingress PE") refers to the router at which a packet enters the backbone, and the term "egress" (or "egress PE") refers to the router at which it leaves the backbone. Interior nodes are often known as "P routers". Routers which are outside the backbone but directly attached to it are known as "Customer Edge" (CE) routers. (This terminology is taken from [RFC4364].)

When a packet's destination is outside the backbone, the routing information which is needed within the backbone in order to route the packet to the proper egress is, by definition, external routing information.

Traditionally, the external routing information has been distributed by BGP to all the routers in the backbone, not just to the edge routers (i.e., not just to the ingress and egress points). Each of the interior nodes has been expected to look up the packet's destination address and route it towards the egress point. This is known as "native forwarding": the interior nodes look into each packet's header in order to match the information in the header with the external routing information.

It is, however, possible to provide transit services without requiring that all the backbone routers have the external routing information. The routing information which BGP distributes to each ingress router specifies the egress router for each route. The ingress router can therefore "tunnel" the packet directly to the egress router. "Tunneling the packet" means putting on some sort of encapsulation header which will force the interior routers to forward the packet to the egress router. The original packet is known as the "encapsulation payload".
The P routers do not look at the packet header of the payload, but only at the encapsulation header. Since the path to the egress router is part of the internal routing information of the backbone, the interior routers then do not need to know the external routing information. This is known as "tunneled forwarding". Of course, before the packet can leave the egress, it has to be decapsulated.

The scenario where the P routers do not have external routes is sometimes known as a "BGP-free core". That is something of a misnomer, though, since the crucial aspect of this scenario is not that the interior nodes don't run BGP, but that they don't maintain the external routing information.

In recent years, we have seen this scenario deployed to support VPN services, as specified in [RFC4364]. An edge router maintains multiple independent routing/addressing spaces, one for each VPN to which it interfaces. However, the routing information for the VPNs is not maintained by the interior routers. In most of these scenarios, MPLS is used as the encapsulation mechanism for getting the packets from ingress to egress. There are some deployments in which an IP-based encapsulation, such as L2TPv3 (Layer Two Tunneling Protocol version 3) [RFC3931] or GRE (Generic Routing Encapsulation) [RFC2784], is used.

This same technique can also be useful when the external routing information consists not of VPN routes, but of "ordinary" Internet routes. It can be used any time it is desired to keep external routing information out of a backbone's interior nodes, or in fact any time it is desired for any reason to avoid the native forwarding of certain kinds of packets.

This framework focuses on two such scenarios.

1. In this scenario, the backbone's interior nodes support only IPv6. They do not maintain IPv4 routes at all, and are not expected to parse IPv4 packet headers.
Yet it is desired to use such a backbone to provide transit services for IPv4 packets. Therefore tunneled forwarding of IPv4 packets is required. Of course, the edge nodes must have the IPv4 routes, but the ingress must perform an encapsulation in order to get an IPv4 packet forwarded to the egress.

2. This scenario is the reverse of scenario 1, i.e., the backbone's interior nodes support only IPv4, but it is desired to use the backbone for IPv6 transit.

In these scenarios, a backbone whose interior nodes support only one of the two address families is required to provide transit services for the other. The backbone's edge routers must, of course, support both address families. We use the term "Address Family Border Router" (AFBR) to refer to these PE routers. The tunnels that are used for forwarding are referred to as "softwires".

These two scenarios are known as the "Softwire Mesh Problem" [SW-PROB], and the framework specified in this draft is therefore known as the "Softwire Mesh Framework". In this framework, only the AFBRs need to support both address families. The CE routers support only a single address family, and the P routers support only the other address family.

It is possible to address these scenarios via a large variety of tunneling technologies. This framework does not mandate the use of any particular tunneling technology. In any given deployment, the choice of tunneling technology is a matter of policy. The framework accommodates at least the use of MPLS ([RFC3031], [RFC3032]), both LDP-based (Label Distribution Protocol, [RFC5036]) and RSVP-TE-based ([RFC3209]), L2TPv3 [RFC3931], GRE [RFC2784], and IP-in-IP [RFC2003]. The framework will also accommodate the use of IPsec tunneling, when that is necessary in order to meet security requirements.

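To make the tunneled forwarding of scenario 1 (an IPv6-only core providing IPv4 transit) concrete, the sketch below builds a minimal IPv6 encapsulation header around an IPv4 payload. This is only an illustration, not part of the framework: the addresses and payload are invented, and a real AFBR performs this in its forwarding plane. The Next Header value 4 is the IANA protocol number for an encapsulated IPv4 packet.

```python
import socket
import struct

def encapsulate_ipv4_in_ipv6(inner_packet: bytes, src6: str, dst6: str,
                             hop_limit: int = 64) -> bytes:
    """Prepend a fixed 40-byte IPv6 header to an IPv4 packet.

    Next Header = 4 is the IANA protocol number meaning the payload
    is an encapsulated IPv4 packet (IP-in-IP style tunneling).
    """
    version_tc_flow = 6 << 28          # version 6, traffic class 0, flow label 0
    header = struct.pack(
        "!IHBB16s16s",
        version_tc_flow,
        len(inner_packet),             # payload length
        4,                             # next header: encapsulated IPv4
        hop_limit,
        socket.inet_pton(socket.AF_INET6, src6),
        socket.inet_pton(socket.AF_INET6, dst6),
    )
    return header + inner_packet

# Hypothetical ingress/egress AFBR addresses and a dummy 20-byte IPv4 header.
inner = b"\x45" + b"\x00" * 19
packet = encapsulate_ipv4_in_ipv6(inner, "2001:db8::1", "2001:db8::2")
assert len(packet) == 40 + len(inner)
assert packet[0] >> 4 == 6             # outer header is IPv6
```

The P routers route the result purely on the outer IPv6 header; only the egress AFBR ever looks at the IPv4 payload.
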
It is expected that in many deployments, the choice of tunneling technology will be made by a simple expression of policy, such as "always use IP-IP tunnels", or "always use LDP-based MPLS", or "always use L2TPv3".

However, other deployments may have a mixture of routers, some of which support, say, both GRE and L2TPv3, but others of which support only one of those techniques. It is therefore desirable to allow the network administration to create a small set of classes, and to configure each AFBR to be a member of one or more of these classes. Then the routers can advertise their class memberships to each other, and the encapsulation policies can be expressed as, e.g., "use L2TPv3 to tunnel to routers in class X, use GRE to tunnel to routers in class Y". To support such policies, it is necessary for the AFBRs to be able to advertise their class memberships; a standard way of doing this must be developed.

Policy may also require a certain class of traffic to receive a certain quality of service, and this may impact the choice of tunnel and/or tunneling technology used for packets in that class. This needs to be accommodated by the softwires framework.

The use of tunneled forwarding often requires that some sort of signaling protocol be used to set up and/or maintain the tunnels. Many of the tunneling technologies accommodated by this framework already have their own signaling protocols. However, some do not, and in some cases the standard signaling protocol for a particular tunneling technology may not be appropriate, for one or another reason, in the scenarios of interest. In such cases (and in such cases only), new signaling methodologies need to be defined and standardized.

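A minimal sketch of how such a class-based policy might be evaluated at an ingress AFBR. The class names, router names, and data structures here are all invented for illustration; the mechanism for advertising class memberships is, as noted above, left for future standardization.

```python
# Hypothetical class memberships, as they might be learned from
# each AFBR's (yet-to-be-standardized) advertisements.
CLASS_MEMBERSHIP = {
    "afbr1": {"X"},          # supports L2TPv3 only
    "afbr2": {"X", "Y"},     # supports both L2TPv3 and GRE
    "afbr3": {"Y"},          # supports GRE only
}

# Local policy: "use L2TPv3 to tunnel to routers in class X,
# use GRE to tunnel to routers in class Y" (first match wins).
POLICY = [("X", "l2tpv3"), ("Y", "gre")]

def select_encapsulation(egress: str) -> str:
    """Pick an encapsulation for a given egress AFBR from local policy."""
    classes = CLASS_MEMBERSHIP.get(egress, set())
    for cls, encaps in POLICY:
        if cls in classes:
            return encaps
    raise LookupError(f"no usable softwire encapsulation for {egress}")

assert select_encapsulation("afbr1") == "l2tpv3"
assert select_encapsulation("afbr3") == "gre"
```

Note that the ordering of the policy list resolves the case of an egress that belongs to several classes.
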
In this framework, the softwires do not form an overlay topology which is visible to routing; routing adjacencies are not maintained over the softwires, and routing control packets are not sent through the softwires. Routing adjacencies among backbone nodes (including the edge nodes) are maintained via the native technology of the backbone.

There is already a standard routing method for distributing external routing information among AFBRs, namely BGP. However, in the scenarios of interest, we may be using IPv6-based BGP sessions to pass IPv4 routing information, and we may be using IPv4-based BGP sessions to pass IPv6 routing information. Furthermore, when IPv4 traffic is to be tunneled over an IPv6 backbone, it is necessary to encode the "BGP next hop" for an IPv4 route as an IPv6 address, and vice versa. The method for encoding an IPv4 address as the next hop for an IPv6 route is specified in [V6NLRI-V4NH]; the method for encoding an IPv6 address as the next hop for an IPv4 route is specified in [V4NLRI-V6NH].

3. Scenarios of Interest

3.1. IPv6-over-IPv4 Scenario

In this scenario, the client networks run IPv6 but the backbone network runs IPv4. This is illustrated in Figure 1.

              +--------+              +--------+
              |  IPv6  |              |  IPv6  |
              | Client |              | Client |
              | Network|              | Network|
              +--------+              +--------+
                  |  \                 /  |
                  |   \               /   |
                  |    \             /    |
                  |           X           |
                  |    /             \    |
                  |   /               \   |
                  |  /                 \  |
              +--------+              +--------+
              |  AFBR  |              |  AFBR  |
           +--| IPv4/6 |--------------| IPv4/6 |--+
           |  +--------+              +--------+  |
+--------+ |                                      | +--------+
|  IPv4  | |                                      | |  IPv4  |
| Client | |                                      | | Client |
| Network|-|                 IPv4                 |-| Network|
+--------+ |                 only                 | +--------+
           |                                      |
           |  +--------+              +--------+  |
           +--| AFBR   |--------------| AFBR   |--+
              | IPv4/6 |              | IPv4/6 |
              +--------+              +--------+
                  |  \                 /  |
                  |   \               /   |
                  |    \             /    |
                  |           X           |
                  |    /             \    |
                  |   /               \   |
                  |  /                 \  |
              +--------+              +--------+
              |  IPv6  |              |  IPv6  |
              | Client |              | Client |
              | Network|              | Network|
              +--------+              +--------+

                Figure 1  IPv6-over-IPv4 Scenario

The IPv4 transit core may or may not run MPLS. If it does, MPLS may be used as part of the solution.

While Figure 1 does not show any "backdoor" connections among the client networks, this framework assumes that there will be such connections. That is, there is no assumption that the only path between two client networks is via the pictured transit core network. Hence the routing solution must be robust in any kind of topology.

Many mechanisms for providing IPv6 connectivity across IPv4 networks have been devised over the past ten years. A number of different tunneling mechanisms have been used, some provisioned manually, others based on special addressing. More recently, L3VPN (Layer 3 Virtual Private Network) techniques from [RFC4364] have been extended to provide IPv6 connectivity, using MPLS in the AFBRs and optionally in the backbone [V6NLRI-V4NH]. The solution described in this framework can be thought of as a superset of [V6NLRI-V4NH], with a more generalized scheme for choosing the tunneling (softwire) technology.
In this framework, MPLS is allowed, but not required, even at the AFBRs. As in [V6NLRI-V4NH], there is no manual provisioning of tunnels, and no special addressing is required.

3.2. IPv4-over-IPv6 Scenario

In this scenario, the client networks run IPv4 but the backbone network runs IPv6. This is illustrated in Figure 2.

              +--------+              +--------+
              |  IPv4  |              |  IPv4  |
              | Client |              | Client |
              | Network|              | Network|
              +--------+              +--------+
                  |  \                 /  |
                  |   \               /   |
                  |    \             /    |
                  |           X           |
                  |    /             \    |
                  |   /               \   |
                  |  /                 \  |
              +--------+              +--------+
              |  AFBR  |              |  AFBR  |
           +--| IPv4/6 |--------------| IPv4/6 |--+
           |  +--------+              +--------+  |
+--------+ |                                      | +--------+
|  IPv6  | |                                      | |  IPv6  |
| Client | |                                      | | Client |
| Network|-|                 IPv6                 |-| Network|
+--------+ |                 only                 | +--------+
           |                                      |
           |  +--------+              +--------+  |
           +--| AFBR   |--------------| AFBR   |--+
              | IPv4/6 |              | IPv4/6 |
              +--------+              +--------+
                  |  \                 /  |
                  |   \               /   |
                  |    \             /    |
                  |           X           |
                  |    /             \    |
                  |   /               \   |
                  |  /                 \  |
              +--------+              +--------+
              |  IPv4  |              |  IPv4  |
              | Client |              | Client |
              | Network|              | Network|
              +--------+              +--------+

                Figure 2  IPv4-over-IPv6 Scenario

The IPv6 transit core may or may not run MPLS. If it does, MPLS may be used as part of the solution.

While Figure 2 does not show any "backdoor" connections among the client networks, this framework assumes that there will be such connections. That is, there is no assumption that the only path between two client networks is via the pictured transit core network. Hence the routing solution must be robust in any kind of topology.

While the issue of IPv6-over-IPv4 has received considerable attention in the past, the scenario of IPv4-over-IPv6 has not. Yet it is a significant emerging requirement, as a number of service providers are building IPv6 backbone networks and do not wish to provide native IPv4 support in their core routers.
These service providers have a large legacy of IPv4 networks and applications that need to operate across their IPv6 backbone. Solutions for this do not exist yet because it had always been assumed that the backbone networks of the foreseeable future would be dual stack.

4. General Principles of the Solution

This section gives a very brief overview of the procedures. The subsequent sections provide more detail.

4.1. 'E-IP' and 'I-IP'

In the following we use the term "I-IP" ("Internal IP") to refer to the form of IP (i.e., either IPv4 or IPv6) that is supported by the transit network. We use the term "E-IP" ("External IP") to refer to the form of IP that is supported by the client networks. In the scenarios of interest, E-IP is IPv4 if and only if I-IP is IPv6, and E-IP is IPv6 if and only if I-IP is IPv4.

We assume that the P routers support only I-IP. That is, they are expected to have only I-IP routing information, and they are not expected to be able to parse E-IP headers. We similarly assume that the CE routers support only E-IP.

The AFBRs handle both I-IP and E-IP. However, only I-IP is used on an AFBR's core-facing interfaces, and E-IP is used only on its client-facing interfaces.

4.2. Routing

The P routers and the AFBRs of the transit network participate in an IGP, for the purposes of distributing I-IP routing information.

The AFBRs use IBGP to exchange E-IP routing information with each other. Either there is a full mesh of IBGP connections among the AFBRs, or else some or all of the AFBRs are clients of a BGP Route Reflector. Although these IBGP connections are used to pass E-IP routing information (i.e., the NLRI of the BGP updates is in the E-IP address family), the IBGP connections run over I-IP, and the "BGP next hop" for each E-IP NLRI is in the I-IP address family.

4.3. Tunneled Forwarding

When an ingress AFBR receives an E-IP packet from a client-facing interface, it looks up the packet's destination IP address. In the scenarios of interest, the best match for that address will be a BGP-distributed route whose next hop is the I-IP address of another AFBR, the egress AFBR.

The ingress AFBR must forward the packet through a tunnel (i.e., through a "softwire") to the egress AFBR. This is done by encapsulating the packet, using an encapsulation header which the P routers can process, and which will cause the P routers to send the packet to the egress AFBR. The egress AFBR then extracts the payload, i.e., the original E-IP packet, and forwards it further by looking up its IP destination address.

Several kinds of tunneling technologies are supported. Some of those technologies require explicit AFBR-to-AFBR signaling before the tunnel can be used, others do not.

Transmitting a packet through a softwire always requires that an encapsulation header be added to the original packet. The resulting packet is therefore always longer than the encapsulation payload. As an operational matter, the Maximum Transmission Unit (MTU) of the softwire's path SHOULD be large enough so that (a) no packet will need to be fragmented before being encapsulated, and (b) no encapsulated packet will need to be fragmented while it is being forwarded along a softwire. A general discussion of MTU issues in the context of tunneled forwarding may be found in [RFC4459].

5. Distribution of Inter-AFBR Routing Information

AFBRs peer with routers in the client networks to exchange routing information for the E-IP family.

AFBRs use BGP to distribute the E-IP routing information to each other.
This can be done by an AFBR-to-AFBR mesh of IBGP sessions, but more likely is done through a BGP Route Reflector, i.e., where each AFBR has an IBGP session to one or two Route Reflectors, rather than to other AFBRs.

The BGP sessions between the AFBRs, or between the AFBRs and the Route Reflector, will run on top of the I-IP address family. That is, if the transit core supports only IPv6, the IBGP sessions used to distribute IPv4 routing information from the client networks will run over IPv6; if the transit core supports only IPv4, the IBGP sessions used to distribute IPv6 routing information from the client networks will run over IPv4. The BGP sessions thus use the native networking layer of the core; BGP messages are NOT tunneled through softwires or through any other mechanism.

In BGP, a routing update associates an address prefix (or more generally, "Network Layer Reachability Information", or NLRI) with the address of a "BGP Next Hop" (NH). The NLRI is associated with a particular address family. The NH address is also associated with a particular address family, which may be the same as or different from the address family associated with the NLRI. Generally the NH address belongs to the address family that is used to communicate with the BGP speaker to whom the NH address belongs.

Since routing updates which contain information about E-IP address prefixes are carried over BGP sessions that use I-IP transport, and since the BGP messages are not tunneled, a BGP update providing information about an E-IP address prefix will need to specify a next hop address in the I-IP family.

Due to a variety of historical circumstances, when the NLRI and the NH in a given BGP update are of different address families, it is not always obvious how the NH should be encoded. There is a different encoding procedure for each pair of address families.

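As an illustration of the family relationships just described, the sketch below models a BGP update whose NLRI is in one (E-IP) family while its next hop is taken from the family of the session transport (the I-IP family). The class and field names are invented for illustration; they do not correspond to any on-the-wire encoding.

```python
from dataclasses import dataclass

@dataclass
class BgpUpdate:
    nlri_family: str      # address family of the advertised prefix (E-IP)
    prefix: str
    nh_family: str        # address family of the BGP next hop
    next_hop: str

def make_update(prefix: str, nlri_family: str, session_family: str,
                local_address: str) -> BgpUpdate:
    """Advertise ourselves as next hop.

    The NH is taken from the family of the session transport (the
    I-IP family), even though the NLRI is in the other (E-IP) family.
    """
    return BgpUpdate(nlri_family, prefix, session_family, local_address)

# An IPv4 client route advertised over an IPv6-transport IBGP session:
upd = make_update("192.0.2.0/24", "ipv4", "ipv6", "2001:db8::1")
assert upd.nlri_family == "ipv4" and upd.nh_family == "ipv6"
```

How such a mixed-family next hop is actually encoded on the wire is exactly what [V6NLRI-V4NH] and [V4NLRI-V6NH] specify.
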
In the case where the NLRI is in the IPv6 address family, and the NH is in the IPv4 address family, [V6NLRI-V4NH] explains how to encode the NH.

In the case where the NLRI is in the IPv4 address family, and the NH is in the IPv6 address family, [V4NLRI-V6NH] explains how to encode the NH.

If a BGP speaker sends an update for an NLRI in the E-IP family, and the update is being sent over a BGP session that is running on top of the I-IP network layer, and the BGP speaker is advertising itself as the NH for that NLRI, then the BGP speaker MUST, unless explicitly overridden by policy, specify the NH address in the I-IP family. The address family of the NH MUST NOT be changed by a Route Reflector.

In some cases (e.g., when [V4NLRI-V6NH] is used), one cannot follow this rule unless one's BGP peers have advertised a particular BGP capability. This leads to the following softwires deployment restriction: if a BGP Capability is defined for the case in which an E-IP NLRI has an I-IP NH, all the AFBRs in a given transit core MUST advertise that capability.

If an AFBR has multiple IP addresses, the network administrators usually have considerable flexibility in choosing which one the AFBR uses to identify itself as the next hop in a BGP update. However, if the AFBR expects to receive packets through a softwire of a particular tunneling technology, and if the AFBR is known to that tunneling technology via a specific IP address, then that same IP address must be used to identify the AFBR in the next hop field of the BGP updates. For example, if L2TPv3 tunneling is used, then the IP address which the AFBR uses when engaging in L2TPv3 signaling must be the same as the IP address it uses to identify itself in the next hop field of a BGP update.

In [V6NLRI-V4NH], IPv6 routing information is distributed using the labeled IPv6 address family.
This allows the egress AFBR to associate an MPLS label with each IPv6 address prefix. If an ingress AFBR forwards packets through a softwire that can carry MPLS packets, each data packet can carry the MPLS label corresponding to the IPv6 route that it matched. This may be useful at the egress AFBR, for demultiplexing and/or enhanced performance. It is also possible to do the same for the IPv4 address family, i.e., to use the labeled IPv4 address family instead of the IPv4 address family. The use of the labeled IP address families in this manner is OPTIONAL.

6. Softwire Signaling

A mesh of inter-AFBR softwires spanning the transit core must be in place before packets can flow between client networks. Given N dual-stack AFBRs, this requires N^2 "point-to-point IP" or "label switched path" (LSP) tunnels. While in theory these could be configured manually, that would result in a very undesirable O(N^2) provisioning problem. Therefore manual configuration of point-to-point tunnels is not considered part of this framework.

Because the transit core is providing layer 3 transit services, point-to-point tunnels are not required by this framework; multipoint-to-point tunnels are all that is needed. In a multipoint-to-point tunnel, when a packet emerges from the tunnel there is no way to tell which router put the packet into the tunnel. This models the native IP forwarding paradigm, wherein the egress router cannot determine a given packet's ingress router. Of course, point-to-point tunnels might be required for some reason which goes beyond the basic requirements described in this document. E.g., QoS or security considerations might require the use of point-to-point tunnels. So point-to-point tunnels are allowed, but not required, by this framework.

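The provisioning argument can be made concrete with a little arithmetic. The sketch below simply counts the tunnel endpoints to be configured under each model, assuming (for illustration) that the point-to-point case needs one tunnel per ordered pair of AFBRs, while the multipoint-to-point case needs only one decapsulation endpoint per AFBR.

```python
def provisioning_cost(n_afbrs: int, point_to_point: bool) -> int:
    """Number of tunnel endpoints to configure across the mesh.

    A full mesh of point-to-point softwires needs one tunnel per
    ordered AFBR pair (O(N^2)); multipoint-to-point needs only one
    decapsulation endpoint per AFBR (O(N)).
    """
    if point_to_point:
        return n_afbrs * (n_afbrs - 1)
    return n_afbrs

# With 10 AFBRs: 90 point-to-point tunnels vs. 10 endpoints.
assert provisioning_cost(10, point_to_point=True) == 90
assert provisioning_cost(10, point_to_point=False) == 10
```

The gap widens quadratically as AFBRs are added, which is why manual point-to-point provisioning is excluded from this framework.
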
604 If it is desired to use a particular tunneling technology for the 605 softwires, and if that technology has its own "native" signaling 606 methodology, the presumption is that the native signaling will be 607 used. This would certainly apply to MPLS-based softwires, where LDP 608 or RSVP-TE would be used. A softwire based on IPsec would use 609 standard IKEv2 (Internet Key Exchange) [RFC4306] and IPsec [RFC4301] 610 signaling, as that is necessary in order to guarantee the softwire's 611 security properties. 613 A softwire based on GRE might or might not require signaling, 614 depending on whether various optional GRE header fields are to be 615 used. GRE does not have any "native" signaling, so for those cases, 616 a signaling procedure needs to be developed to support softwires. 618 Another possible softwire technology is L2TPv3. While L2TPv3 does 619 have its own native signaling, that signaling sets up point-to-point 620 tunnels. For the purpose of softwires, it is better to use L2TPv3 in 621 a multipoint-to-point mode, and this requires a different kind of 622 signaling. 624 The signaling to be used for GRE and L2TPv3 to cover these scenarios 625 is BGP-based, and is described in [ENCAPS-SAFI]. 627 If IP-IP tunneling is used, or if GRE tunneling is used without 628 options, no signaling is required, as the only information needed by 629 the ingress AFBR to create the encapsulation header is the IP address 630 of the egress AFBR, and that is distributed by BGP. 632 When the encapsulation IP header is constructed, there may be fields 633 in the IP header whose values are determined neither by whatever signaling has 634 been done nor by the distributed routing information. The values of 635 these fields are determined by policy in the ingress AFBR. Examples 636 of such fields are the TTL (Time to Live) field, the DSCP 637 (Differentiated Services Code Point) bits, etc.
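As an illustration of filling in the policy-determined fields, the following sketch packs a minimal IPv4 delivery header (RFC 791 layout, RFC 1071 checksum). The TTL and DSCP values shown are invented policy choices; no real router API is implied:

```python
import struct
import ipaddress

def build_ipv4_encaps_header(src, dst, payload_len, proto, ttl=64, dscp=0):
    """Pack a 20-byte IPv4 delivery header.  TTL and DSCP are supplied
    by ingress-AFBR policy; they come neither from signaling nor from
    the distributed routing information."""
    ver_ihl = (4 << 4) | 5          # IPv4, 5 x 32-bit words, no options
    tos = dscp << 2                 # DSCP occupies the top 6 bits
    total_len = 20 + payload_len
    header = struct.pack('!BBHHHBBH4s4s',
                         ver_ihl, tos, total_len,
                         0, 0,      # identification, flags/fragment offset
                         ttl, proto,
                         0,         # checksum placeholder, filled in below
                         ipaddress.IPv4Address(src).packed,
                         ipaddress.IPv4Address(dst).packed)
    # RFC 1071 one's-complement checksum over the header words
    s = sum(struct.unpack('!10H', header))
    s = (s & 0xffff) + (s >> 16)
    s = (s & 0xffff) + (s >> 16)
    return header[:10] + struct.pack('!H', ~s & 0xffff) + header[12:]

# Hypothetical policy: EF traffic class (DSCP 46) and TTL 255
hdr = build_ipv4_encaps_header('192.0.2.1', '198.51.100.2', 100,
                               proto=47, ttl=255, dscp=46)  # 47 = GRE
```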
639 It is desirable for all necessary softwires to be fully set up before 640 the arrival of any packets which need to go through the softwires. 641 That is, the softwires should be "always on". From the perspective 642 of any particular AFBR, the softwire endpoints are always BGP next 643 hops of routes which the AFBR has installed. This suggests that any 644 necessary softwire signaling should either be done as part of 645 normal system startup (as would happen, e.g., with LDP-based MPLS), 646 or else should be triggered by the reception of BGP routing 647 information (such as is described in [ENCAPS-SAFI]); it is also 648 helpful if distribution of the routing information that serves as the 649 trigger is prioritized. 651 7. Choosing to Forward Through a Softwire 653 The decision to forward through a softwire, instead of forwarding 654 natively, is made by the ingress AFBR. This decision is a matter of 655 policy. 657 In many cases, the policy will be very simple. Some useful policies 658 are: 660 - if routing says that an E-IP packet has to be sent out a "core- 661 facing interface" to an I-IP core, then send the packet through a 662 softwire 664 - if routing says that an E-IP packet has to be sent out an 665 interface that only supports I-IP packets, then send the E-IP 666 packets through a softwire 668 - if routing says that the BGP next hop address for an E-IP packet 669 is an I-IP address, then send the E-IP packets through a softwire 671 - if the route which is the best match for a particular packet's 672 destination address is a BGP-distributed route, then send the 673 packet through a softwire (i.e., tunnel all BGP-routed packets). 675 More complicated policies are also possible, but a consideration of 676 those policies is outside the scope of this document. 678 8. Selecting a Tunneling Technology 680 The choice of tunneling technology is a matter of policy configured 681 at the ingress AFBR.
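Both policy decisions made at the ingress AFBR, whether to tunnel a packet at all (Section 7) and which technology to use (this section), can be sketched together. The route attributes and policy table below are invented for illustration and do not correspond to any real BGP implementation:

```python
# Hypothetical per-remote-AFBR encapsulation policy, plus a default
ENCAPS_BY_PEER = {'192.0.2.2': 'l2tpv3'}
DEFAULT_ENCAPS = 'ldp-mpls'

def ingress_policy(route, i_ip_family='ipv4'):
    """Return None to forward natively, or the tunneling technology
    to use for packets matching this route."""
    # Section 7 policy: tunnel when the BGP next hop is an I-IP address
    if route['next_hop_family'] != i_ip_family:
        return None
    # Section 8 policy: technology chosen per remote endpoint
    return ENCAPS_BY_PEER.get(route['next_hop'], DEFAULT_ENCAPS)

# An IPv6 client route whose BGP next hop is an IPv4 core address
route = {'prefix': '2001:db8::/32',
         'next_hop': '192.0.2.2', 'next_hop_family': 'ipv4'}
```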
683 It is envisioned that in most cases, the policy will be a very simple 684 one, and will be the same at all the AFBRs of a given transit core. 685 E.g., "always use LDP-based MPLS", or "always use L2TPv3". 687 However, other deployments may have a mixture of routers, some of 688 which support, say, both GRE and L2TPv3, but others of which support 689 only one of those techniques. It is desirable therefore to allow the 690 network administration to create a small set of classes, and to 691 configure each AFBR to be a member of one or more of these classes. 692 Then the routers can advertise their class memberships to each other, 693 and the encapsulation policies can be expressed as, e.g., "use L2TPv3 694 to talk to routers in class X, use GRE to talk to routers in class 695 Y". To support such policies, it is necessary for the AFBRs to be 696 able to advertise their class memberships. [ENCAPS-SAFI] specifies a 697 way in which an AFBR may advertise, to other AFBRs, various 698 characteristics which may be relevant to the policy (e.g., "I belong 699 to class Y"). In many cases, these characteristics can be 700 represented by arbitrarily selected communities or extended 701 communities, and the policies at the ingress can be expressed in 702 terms of these classes (i.e., communities). 704 Policy may also require a certain class of traffic to receive a 705 certain quality of service, and this may impact the choice of tunnel 706 and/or tunneling technology used for packets in that class. This 707 framework allows a variety of tunneling technologies to be used for 708 instantiating softwires. The choice of tunneling technology is a 709 matter of policy, as discussed in section 2.
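A class-based encapsulation policy of the kind described above could be sketched as a lookup on the classes (communities) that a remote AFBR has advertised via [ENCAPS-SAFI]. The class names and the policy table are invented for illustration:

```python
# Ingress encapsulation policy, keyed by the class (community) that the
# remote AFBR advertised; earlier entries take precedence.
ENCAPS_POLICY = {
    'class-X': 'l2tpv3',
    'class-Y': 'gre',
}
DEFAULT_ENCAPS = 'ldp-mpls'

def choose_encapsulation(remote_afbr_classes):
    """Pick a tunneling technology for a remote AFBR, given the set of
    classes it advertised; fall back to a network-wide default."""
    for cls, encaps in ENCAPS_POLICY.items():
        if cls in remote_afbr_classes:
            return encaps
    return DEFAULT_ENCAPS
```

For example, a remote AFBR advertising only class Y would be reached over GRE, while one advertising both classes would be reached over L2TPv3, since class X is listed first.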
711 While in many cases the policy will be unconditional, e.g., "always 712 use L2TPv3 for softwires", in other cases the policy may specify that 713 the choice is conditional upon information about the softwire remote 714 endpoint, e.g., "use L2TPv3 to talk to routers in class X, use GRE to 715 talk to routers in class Y". If each such class is represented as a 718 community or extended 719 community, then [ENCAPS-SAFI] specifies a method that AFBRs can use 720 to advertise their class memberships to each other. 722 This framework also allows for policies of arbitrary complexity, 723 which may depend on characteristics or attributes of individual 724 address prefixes, as well as on QoS or security considerations. 725 However, the specification of such policies is not within the scope 726 of this document. 728 9. Selecting the Softwire for a Given Packet 730 Suppose it has been decided to send a given packet through a 731 softwire. Routing provides the address, in the address family of the 732 transport network, of the BGP next hop. The packet MUST be sent 733 through a softwire whose remote endpoint address is the same as the 734 BGP next hop address. 736 Sending a packet through a softwire is a matter of encapsulating the 737 packet with an encapsulation header that can be processed by the 738 transit network, and then transmitting it towards the softwire's remote 739 endpoint address. 741 In many cases, once one knows the remote endpoint address, one has 742 all the information one needs in order to form the encapsulation 743 header. This will be the case if the tunnel technology instantiating 744 the softwire is, e.g., LDP-based MPLS, IP-in-IP, or GRE without 745 optional header fields.
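The rule above, that a packet's softwire is the one whose remote endpoint address equals the BGP next hop, can be sketched as a simple lookup; the softwire table and its entries are hypothetical:

```python
# Softwires already established to remote AFBRs, keyed by the remote
# endpoint address in the transport (I-IP) family.
softwires = {
    '192.0.2.2': {'encaps': 'gre'},
    '192.0.2.3': {'encaps': 'l2tpv3', 'session_id': 42},
}

def softwire_for(bgp_next_hop):
    """The packet MUST be sent through the softwire whose remote
    endpoint address equals the BGP next hop of the matching route."""
    try:
        return softwires[bgp_next_hop]
    except KeyError:
        raise LookupError('no softwire to BGP next hop %s' % bgp_next_hop)
```

A missing entry here corresponds to the "always on" requirement of Section 6 not yet being satisfied for that next hop.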
747 If the tunnel technology being used is L2TPv3 or GRE with optional 748 header fields, additional information from the remote endpoint is 749 needed in order to form the encapsulation header. The procedures for 750 sending and receiving this information are described in [ENCAPS- 751 SAFI]. 753 If the tunnel technology being used is RSVP-TE-based MPLS or IPsec, 754 the native signaling procedures of those technologies will need to be 755 used. 757 If the packet being sent through the softwire matches a route in the 758 labeled IPv4 or labeled IPv6 address families, it should be sent 759 through the softwire as an MPLS packet with the corresponding label. 760 Note that most of the tunneling technologies mentioned in this 761 document are capable of carrying MPLS packets, so this does not 762 presuppose support for MPLS in the core routers. 764 10. Softwire OAM and MIBs 766 10.1. Operations and Maintenance (OAM) 768 Softwires are essentially tunnels connecting routers. If they 769 disappear or degrade in performance, then connectivity through those 770 tunnels will be impacted. There are several techniques available to 771 monitor the status of the tunnel end-points (AFBRs) as well as the 772 tunnels themselves. These techniques allow operations such as 773 softwire path tracing, remote softwire end-point pinging, and remote 774 softwire end-point liveness detection. 776 Examples of techniques applicable to softwire OAM include: 778 o BGP/TCP timeouts between AFBRs 780 o ICMP or LSP echo request and reply addressed to a particular AFBR 782 o BFD (Bidirectional Forwarding Detection) [BFD] packet exchange 783 between AFBR routers 785 Another possibility for softwire OAM is to build something similar to 786 [RFC4378], in other words, to create and generate softwire echo 787 request/reply packets.
The echo request sent to a well-known UDP 788 port would contain the egress AFBR IP address and the softwire 789 identifier as the payload (similar to the MPLS forwarding equivalence 790 class contained in the LSP echo request). The softwire echo packet 791 would be encapsulated with the encapsulation header and forwarded 792 across the same path (inband) as that of the softwire itself. 794 This mechanism can also be automated to periodically verify remote 795 softwire end-point reachability, with the loss of reachability being 796 signaled to the softwires application on the local AFBR, thus enabling 797 suitable actions to be taken. Consideration must be given to the 798 trade-offs between scalability of such mechanisms versus time to 799 detection of loss of endpoint reachability for such automated 800 mechanisms. 802 In general, a framework for softwire OAM can be based in large part 803 on the [RFC4176] framework. 805 10.2. MIBs 807 Specific MIBs do exist to manage elements of the softwire mesh 808 framework. However, there will be a need to either extend these MIBs 809 or create new ones that reflect the functional elements that can be 810 SNMP-managed within the softwire network. 812 11. Softwire Multicast 814 A set of client networks, running E-IP, that are connected to a 815 provider's I-IP transit core, may wish to run IP multicast 816 applications. Extending IP multicast connectivity across the transit 817 core can be done in a number of ways, each with a different set of 818 characteristics. Most (though not all) of the possibilities are 819 slight variations of the procedures defined for L3VPNs in 820 [L3VPN-MCAST]. 822 We will focus on supporting those multicast features and protocols 823 which are typically used across inter-provider boundaries. Support 824 is provided for PIM-SM (PIM Sparse Mode) and PIM-SSM (PIM Source- 825 Specific Mode).
Support for BIDIR-PIM (Bidirectional PIM), BSR 826 (Bootstrap Router Mechanism for PIM), and AutoRP (Automatic Rendezvous 827 Point Determination) is not provided, as these features are not 828 typically used across inter-provider boundaries. 830 11.1. One-to-One Mappings 832 In the "one-to-one mapping" scheme, each client multicast tree is 833 extended through the transit core, so that for each client tree there 834 is exactly one tree through the core. 836 The one-to-one scheme is not used in [L3VPN-MCAST], because it 837 requires an amount of state in the core routers which is proportional 838 to the number of client multicast trees passing through the core. In 839 the VPN context, this is considered undesirable, because the amount 840 of state is unbounded and out of the control of the service provider. 841 However, the one-to-one scheme models the typical "Internet 842 multicast" scenario where the client network and the transit core are 843 both IPv4 or are both IPv6. If it scales satisfactorily for that 844 case, it should also scale satisfactorily for the case where the 845 client network and the transit core support different versions of IP. 847 11.1.1. Using PIM in the Core 849 When an AFBR receives an E-IP PIM control message from one of its 850 CEs, it would translate it from E-IP to I-IP, and forward it towards 851 the source of the tree. Since the routers in the transit core will 852 not generally have a route to the source of the tree, the AFBR must 853 include an "RPF (Reverse Path Forwarding) Vector" [RPF-VECTOR] 854 in the PIM message. 856 Suppose an AFBR A receives an E-IP PIM Join/Prune message from a CE, 857 for either an (S,G) tree or a (*,G) tree. The AFBR would have to 858 "translate" the PIM message into an I-IP PIM message. It would then 859 send it to the neighbor which is the next hop along the route to the 860 root of the (S,G) or (*,G) tree.
In the case of an (S,G) tree the 861 root of the tree is S; in the case of a (*,G) tree the root of the 862 tree is the Rendezvous Point (RP) for the group G. 864 Note that the address of the root of the tree will be an E-IP 865 address. Since the routers within the transit core (other than the 866 AFBRs) do not have routes to E-IP addresses, A must put an "RPF 867 Vector" [RPF-VECTOR] in the PIM Join/Prune message that it sends to 868 its upstream neighbor. The RPF Vector will identify, as an I-IP 869 address, the AFBR B that is the egress point in the transit network 870 along the route to the root of the multicast tree. AFBR B is AFBR 871 A's "BGP next hop" for the route to the root of the tree. The RPF 872 Vector allows the core routers to forward PIM Join/Prune messages 873 upstream towards the root of the tree, even though they do not 874 maintain E-IP routes. 876 In order to "translate" an E-IP PIM message into an I-IP PIM 877 message, the AFBR A must translate the address of S (in the case of 878 an (S,G) group) or the address of G's RP from the E-IP address family 879 to the I-IP address family, and the AFBR B must translate them back. 881 In the case where E-IP is IPv4 and I-IP is IPv6, it may be possible 882 to do this translation algorithmically. A can translate the IPv4 S 883 into the corresponding IPv4-mapped IPv6 address [RFC4291], and then B 884 can translate it back. At the time of this writing, there is no such 885 thing as an IPv4-mapped IPv6 multicast address, but if such a thing 886 were to be standardized, then A could also translate the IPv4 G into 887 IPv6, and B could translate it back. The precise circumstances under 888 which these translations are to be done would be a matter of policy. 890 Obviously, this translation procedure does not generalize to the case 891 where the client multicast is IPv6 but the core is IPv4. To handle 892 that case, one needs additional signaling between the two AFBRs.
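The algorithmic translation described above, mapping an IPv4 source S to an IPv4-mapped IPv6 address [RFC4291] at AFBR A and back at AFBR B, can be sketched with Python's standard ipaddress module:

```python
import ipaddress

def v4_source_to_v6(s_v4):
    """AFBR A: map an IPv4 source address into the IPv4-mapped
    IPv6 range ::ffff:0:0/96 [RFC4291]."""
    return ipaddress.IPv6Address('::ffff:' + s_v4)

def v6_source_to_v4(s_v6):
    """AFBR B: recover the original IPv4 source address."""
    mapped = ipaddress.IPv6Address(s_v6).ipv4_mapped
    if mapped is None:
        raise ValueError('not an IPv4-mapped IPv6 address: %s' % s_v6)
    return mapped
```

As the text notes, no analogous mapping is defined for multicast group addresses, so this sketch covers only the source address S.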
893 Each downstream AFBR needs to signal the upstream AFBR that it needs a 894 multicast tunnel for (S,G). The upstream AFBR must then assign a 895 multicast address G' to the tunnel, and inform the downstream AFBR of 896 the G' value to use. The downstream AFBR then uses PIM/IPv4 to join the 897 (S', G') tree, where S' is the IPv4 address of the upstream AFBR. 900 The (S', G') trees should be SSM trees. 902 This procedure can be used to support client multicasts of either 903 IPv4 or IPv6 over a transit core of the opposite protocol. However, 904 it only works when the client multicasts are SSM, since it provides 905 no method for mapping a client "prune a source off the (*,G) tree" 906 operation into an operation on the (S',G') tree. This method also 907 requires additional signaling. The BGP-based signaling of [L3VPN- 908 MCAST-BGP] is one signaling method that could be used. Other 909 signaling methods could be defined as well. 911 11.1.2. Using mLDP and Multicast MPLS in the Core 913 If the transit core implements mLDP (LDP Extensions for Point-to- 914 Multipoint and Multipoint-to-Multipoint LSPs, [mLDP]) and supports 915 multicast MPLS, then client Source-Specific Multicast (SSM) trees can 916 be mapped one-to-one onto P2MP (Point-to-Multipoint) LSPs. 918 When an AFBR A receives an E-IP PIM Join/Prune message for (S,G) from 919 one of its CEs, where G is an SSM group, it would use mLDP to join a 920 P2MP LSP. The root of the P2MP LSP would be the AFBR B that is A's 921 BGP next hop on the route to S. In mLDP, a P2MP LSP is uniquely 922 identified by a combination of its root and a "FEC (Forwarding 923 Equivalence Class) identifier". The original (S,G) can be 924 algorithmically encoded into the FEC identifier, so that all AFBRs 925 that need to join the P2MP LSP for (S,G) will generate the same FEC 926 identifier.
When the root of the P2MP LSP (AFBR B) receives such an 927 mLDP message, it extracts the original (S,G) from the FEC identifier, 928 creates an "ordinary" E-IP PIM Join/Prune message, and sends it to 929 the CE which is its next hop on the route to S. 931 The method of encoding the (S,G) into the FEC identifier needs to be 932 standardized. The encoding must be self-identifying, so that a node 933 which is the root of a P2MP LSP can determine whether a FEC 934 identifier is the result of having encoded a PIM (S,G). 936 The appropriate state machinery must be standardized so that PIM 937 events at the AFBRs result in the proper mLDP events. For example, 938 if at some point an AFBR determines (via PIM procedures) that it no 939 longer has any downstream receivers for (S,G), the AFBR should invoke 940 the proper mLDP procedures to prune itself off the corresponding P2MP 941 LSP. 943 Note that this method cannot be used when G is a Sparse Mode 944 group. The reason this method cannot be used is that mLDP does not 945 have any function corresponding to the PIM "prune this source off the 946 shared tree" function. So if a (*,G) tree were mapped one-to-one onto 947 a P2MP LSP, duplicate traffic could end up traversing the transit 948 core (i.e., traffic from S might travel down both the shared tree and 949 S's source tree). Alternatively, one could devise an AFBR-to-AFBR 950 protocol to prune sources off the P2MP LSP at the root of the LSP. 951 It is recommended, though, that client SM multicast groups be supported 952 by other methods, such as those discussed below. 954 Client-side bidirectional multicast groups set up by PIM-bidir could 955 be mapped using the above technique to MP2MP (Multipoint-to- 956 Multipoint) LSPs set up by mLDP [mLDP]. We do not consider this 957 further, as inter-provider bidirectional groups are not in use 958 anywhere. 960 11.2.
MVPN-like Schemes 962 The "MVPN-like schemes" are those described in [L3VPN-MCAST] and its 963 companion documents (such as [L3VPN-MCAST-BGP]). To apply those 964 schemes to the softwire environment, it is necessary only to treat 965 all the AFBRs of a given transit core as if they were all, for 966 multicast purposes, PE routers attached to the same VPN. 968 The MVPN-like schemes do not require a one-to-one mapping between 969 client multicast trees and transit core multicast trees. In the MVPN 970 environment, it is a requirement that the number of trees in the core 971 scales less than linearly with the number of client trees. This 972 requirement may not hold in the softwires scenarios. 974 The MVPN-like schemes can support SM, SSM, and Bidir groups. They 975 provide a number of options for the control plane: 977 - LAN-like 979 Use a set of multicast trees in the core to emulate a LAN (Local 980 Area Network), and run the client-side PIM protocol over that 981 "LAN". The "LAN" can consist of a single Bidir tree containing 982 all the AFBRs, or a set of SSM trees, one rooted at each AFBR, 983 and containing all the other AFBRs as receivers. 985 - NBMA (Non-Broadcast Multiple Access), using BGP 987 The client-side PIM signaling can be "translated" into BGP-based 988 signaling, with a BGP route reflector mediating the signaling. 990 These two basic options admit of many variations; a comprehensive 991 discussion is in [L3VPN-MCAST]. 993 For the data plane, there are also a number of options: 995 - All multicast data sent over the emulated LAN. This particular 996 option is not very attractive for the softwires scenarios, though, 997 as every AFBR would have to receive every client multicast 998 packet. 1000 - Every multicast group mapped to a tree which is considered 1001 appropriate for that group, in the sense of not causing the traffic 1002 of that group to go to "too many" AFBRs that don't need to 1003 receive it.
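The data-plane trade-off above, the shared emulated-LAN tree versus a more specific tree, can be sketched as a selection heuristic. The threshold and the rule itself are invented for illustration and are not part of [L3VPN-MCAST]:

```python
def tree_for_group(receiver_afbrs, all_afbrs, waste_threshold=0.5):
    """Map a client multicast group either onto the shared emulated-LAN
    tree or onto a more specific tree, depending on how many AFBRs
    would receive traffic they do not need."""
    wasted = len(all_afbrs - receiver_afbrs) / len(all_afbrs)
    if wasted <= waste_threshold:
        return 'emulated-lan'    # acceptable: few AFBRs get unwanted traffic
    return 'dedicated-tree'      # too many AFBRs would discard the traffic
```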
1005 Again, a comprehensive discussion of the issues can be found in 1006 [L3VPN-MCAST]. 1008 12. Inter-AS Considerations 1010 We have so far only considered the case where a "transit core" 1011 consists of a single Autonomous System (AS). If the transit core 1012 consists of multiple ASes, then it may be necessary to use softwires 1013 whose endpoints are AFBRs attached to different Autonomous Systems. 1014 In this case, the AFBR at the remote endpoint of a softwire is not 1015 the BGP next hop for packets that need to be sent on the softwire. 1016 Since the procedures described above require the address of the remote 1017 softwire endpoint to be the same as the address of the BGP next hop, 1018 those procedures do not work as specified when the transit core 1019 consists of multiple ASes. 1021 There are several ways to deal with this situation. 1023 1. Don't do it; require that there be AFBRs at the edge of each 1024 AS, so that a transit core does not extend beyond one AS. 1026 2. Use multi-hop EBGP to allow AFBRs to send BGP routes to each 1027 other, even if the AFBRs are not in the same or in neighboring 1028 ASes. 1030 3. Ensure that an ASBR which is not an AFBR does not change the 1031 next hop field of the routes for which encapsulation is needed. 1033 In the latter two cases, BGP recursive next hop resolution needs to 1034 be done, and encapsulations may need to be "stacked" (i.e., multiple 1035 layers of encapsulation may need to be used). 1037 For instance, consider packet P with destination IP address D. 1038 Suppose it arrives at ingress AFBR A1, and that the route that is the 1039 best match for D has BGP next hop B1. So A1 will encapsulate the 1040 packet for delivery to B1. If B1 is not within A1's AS, A1 will need 1041 to look up the route to B1 and then find the BGP next hop, call it 1042 B2, of that route.
If the interior routers of A1's AS do not have 1043 routes to B1, then A1 needs to encapsulate the packet a second time, 1044 this time for delivery to B2. 1046 13. IANA Considerations 1048 This document has no actions for IANA. 1050 14. Security Considerations 1052 14.1. Problem Analysis 1054 In the Softwires mesh framework, the data packets that are 1055 encapsulated are E-IP data packets that are traveling through the 1056 Internet. These data packets (the Softwires "payload") may or may 1057 not need such security features as authentication, integrity, 1058 confidentiality, or replay protection. However, the security needs 1059 of the payload packets are independent of whether or not those 1060 packets are traversing softwires. The fact that a particular payload 1061 packet is traveling through a softwire does not in any way affect its 1062 security needs. 1064 Thus the only security issues we need to consider are those which 1065 affect the I-IP encapsulation headers, rather than those which affect 1066 the E-IP payload. 1068 Since the encapsulation headers determine the routing of packets 1069 traveling through softwires, they must appear "in the clear". 1071 In the Softwires mesh framework, for each tunnel receiving endpoint, 1072 there are one or more "valid" transmitting endpoints, where the valid 1073 transmitting endpoints are those which are authorized to tunnel 1074 packets to the receiving endpoint. If the encapsulation header has 1075 no guarantee of authentication or integrity, then it is possible to 1076 have spoofing attacks, in which unauthorized nodes send encapsulated 1077 packets to the receiving endpoint, giving the receiving endpoint the 1078 false impression that the encapsulated packets have really traveled 1079 through the softwire. Replay attacks are also possible. 1081 The effect of such attacks is somewhat limited, though.
The receiving 1082 endpoint of a softwire decapsulates the payload and does further 1083 routing based on the IP destination address of the payload. Since 1084 the payload packets are traveling through the Internet, they have 1085 addresses from the globally unique address space (rather than, e.g., 1086 from a private address space of some sort). Therefore these attacks 1087 cannot cause payload packets to be delivered to an address other than 1088 the one appearing in the destination IP address field of the payload 1089 packet. 1091 However, attacks of this sort can result in policy violations. The 1092 authorized transmitting endpoint(s) of a softwire may be following a 1093 policy according to which only certain payload packets get sent 1094 through the softwire. If unauthorized nodes are able to encapsulate 1095 the payload packets so that they arrive at the receiving endpoint 1096 looking as if they arrived from authorized nodes, then those 1097 policies have been side-stepped. 1099 Attacks of the sort we are considering can also be used in Denial of 1100 Service attacks on the receiving tunnel endpoints. However, such 1101 attacks cannot be prevented by use of cryptographic 1102 authentication/integrity techniques, as the need to do cryptography 1103 on spoofed packets only makes the Denial of Service problem worse. 1104 (The assumption is that the cryptography mechanisms are likely to be 1105 more costly than the decapsulation/forwarding mechanisms. So if one 1106 tries to eliminate a flooding attack on the decapsulation/forwarding 1107 mechanisms by discarding packets that do not pass a cryptographic 1108 integrity test, one ends up just trading one kind of attack for 1109 another.) 1111 This section is largely based on the security considerations section 1112 of RFC 4023, which also deals with encapsulations and tunnels. 1114 14.2.
Non-cryptographic techniques 1116 If a tunnel lies entirely within a single administrative domain, 1117 then, to a certain extent, there are non-cryptographic 1118 techniques one can use to prevent spoofed packets from reaching a 1119 tunnel's receiving endpoint. For example, when the tunnel 1120 encapsulation is IP-based: 1122 - The tunnel receiving endpoints can be given a distinct set of 1123 addresses, and those addresses can be made known to the border 1124 routers. The border routers can then filter out packets, 1125 destined to those addresses, which arrive from outside the 1126 domain. 1128 - The tunnel transmitting endpoints can be given a distinct set of 1129 addresses, and those addresses can be made known to the border 1130 routers and to the tunnel receiving endpoints. The border routers 1131 can filter out all packets arriving from outside the domain with 1132 source addresses that are in this set, and the receiving 1133 endpoints can discard all packets which appear to be part of a 1134 softwire, but whose source addresses are not in this set. 1136 If an MPLS-based encapsulation is used, the border routers can refuse 1137 to accept MPLS packets from outside the domain, or can refuse to 1138 accept such MPLS packets whenever the top label corresponds to the 1139 address of a tunnel receiving endpoint. 1141 These techniques assume that within a domain, the network is secure 1142 enough to prevent the introduction of spoofed packets from within the 1143 domain itself. That may not always be the case. Also, these 1144 techniques can be difficult or impossible to use effectively 1145 for tunnels that are not in the same administrative domain. 1147 A different technique is to have the encapsulation header contain a 1148 cleartext password. The 64-bit "cookie" of L2TPv3 [RFC3931] is 1149 sometimes used in this way.
This can be useful within an 1150 administrative domain if it is regarded as infeasible for an attacker 1151 to spy on packets that originate in the domain and that do not leave 1152 the domain. An attacker would then not be able to discover the 1153 password. An attacker could of course try to guess the password, but 1154 if the password is an arbitrary 64-bit binary sequence, brute force 1155 attacks which run through all the possible passwords would be 1156 infeasible. This technique may be easier to manage than ingress 1157 filtering is, and may be just as effective if the assumptions hold. 1158 Like ingress filtering, though, it may not be applicable for tunnels 1159 that cross domain boundaries. 1161 Therefore it is necessary to also consider the use of cryptographic 1162 techniques for setting up the tunnels and for passing data through 1163 them. 1165 14.3. Cryptographic techniques 1167 If the path between the two endpoints of a tunnel is not adequately 1168 secure, then 1170 - If a control protocol is used to set up the tunnels (e.g., to 1171 inform one tunnel endpoint of the IP address of the other), the 1172 control protocol MUST have an authentication mechanism, and this 1173 MUST be used when the tunnel is set up. If the tunnel is set up 1174 automatically as the result of, for example, information 1175 distributed by BGP, then the use of BGP's MD5-based 1176 authentication mechanism [RFC2385] is satisfactory. 1178 - Data transmission through the tunnel should be secured with 1179 IPsec. In the remainder of this section, we specify the way 1180 IPsec may be used, and the implementation requirements we mention 1181 are meant to be applicable whenever IPsec is being used. 1183 We consider only the case where IPsec is used together with an IP- 1184 based tunneling mechanism. Use of IPsec with an MPLS-based tunneling 1185 mechanism is for further study. 
1187 If it is deemed necessary to use tunnels that are protected by IPsec, 1188 the tunnel type SHOULD be negotiated by the tunnel endpoints using 1189 the procedures specified in [ENCAPS-IPSEC]. That document allows the 1190 use of IPsec tunnel mode, but also allows one to treat the tunnel 1191 head and the tunnel tail as the endpoints of a Security Association, 1192 and to use IPsec transport mode. 1194 In order to use IPsec transport mode, encapsulated packets should be 1195 viewed as originating at the tunnel head and as being destined for 1196 the tunnel tail. A single IP address of the tunnel head will be used 1197 as the source IP address, and a single IP address of the tunnel tail 1198 will be used as the destination IP address. This technique can be 1199 used to carry MPLS packets through an IPsec Security Association, 1200 by first encapsulating the MPLS packets in MPLS-in-IP or MPLS-in-GRE 1201 [RFC4023] and then applying IPsec transport mode. 1203 When IPsec is used to secure softwires, IPsec MUST provide 1204 authentication and integrity. Thus, the implementation MUST support 1205 either ESP (IP Encapsulating Security Payload) with null encryption 1207 [RFC4303] or else AH (IP Authentication Header) [RFC4302]. ESP with 1208 encryption MAY be supported. If ESP is used, the tunnel tail MUST 1209 check that the source IP address of any packet received on a given SA 1210 (IPsec Security Association) is the one expected, as specified in 1211 [RFC4301] section 5.2 step 4. 1213 Since the softwires are set up dynamically as a byproduct of passing 1214 routing information, key distribution MUST be done automatically by 1215 means of IKEv2 [RFC4306]. If a PKI (Public Key Infrastructure) is 1216 not available, the IPsec Tunnel Authenticator sub-TLV described in 1217 [ENCAPS-IPSEC] MUST be used and validated before setting up an SA.
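The per-SA source-address check cited above ([RFC4301] section 5.2 step 4) can be sketched as follows; the SA record layout is invented for illustration:

```python
def accept_on_sa(sa, outer_src):
    """Tunnel-tail check when ESP is used: the source IP address of a
    packet received on a given SA must be the one expected for that SA;
    otherwise the packet is discarded."""
    if outer_src != sa['expected_src']:
        return False     # spoofed or misdirected; drop the packet
    return True

# Hypothetical SA for a softwire from a remote AFBR at 192.0.2.2
sa = {'spi': 0x1000, 'expected_src': '192.0.2.2'}
```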
The selectors associated with the SA are the source and destination addresses of the encapsulation header, along with the IP protocol number representing the encapsulation protocol being used.

15. Contributors

Xing Li
Tsinghua University
Department of Electronic Engineering, Tsinghua University
Beijing 100084
P.R. China

Phone: +86-10-6278-5983
Email: xing@cernet.edu.cn

Simon Barber
Cisco Systems, Inc.
250 Longwater Avenue
Reading, ENGLAND, RG2 6GB
United Kingdom

Email: sbarber@cisco.com

Pradosh Mohapatra
Cisco Systems, Inc.
3700 Cisco Way
San Jose, Ca. 95134
USA

Email: pmohapat@cisco.com

John Scudder
Juniper Networks
1194 North Mathilda Avenue
Sunnyvale, California 94089
USA

Email: jgs@juniper.net

16. Acknowledgments

David Ward, Chris Cassar, Gargi Nalawade, Ruchi Kapoor, Pranav Mehta, Mingwei Xu and Ke Xu provided useful input into this document.

Authors' Addresses

Jianping Wu
Tsinghua University
Department of Computer Science, Tsinghua University
Beijing 100084
P.R. China

Phone: +86-10-6278-5983
Email: jianping@cernet.edu.cn

Yong Cui
Tsinghua University
Department of Computer Science, Tsinghua University
Beijing 100084
P.R. China

Phone: +86-10-6278-5822
Email: yong@csnet1.cs.tsinghua.edu.cn

Chris Metz
Cisco Systems, Inc.
3700 Cisco Way
San Jose, Ca. 95134
USA

Email: chmetz@cisco.com

Eric C. Rosen
Cisco Systems, Inc.
1414 Massachusetts Avenue
Boxborough, MA 01719
USA

Email: erosen@cisco.com

17. Normative References

[ENCAPS-IPSEC] "BGP IPSec Tunnel Encapsulation Attribute", L. Berger, R. White, E. Rosen, draft-ietf-softwire-encaps-ipsec-01.txt, April 2008.

[ENCAPS-SAFI] "BGP Information SAFI and BGP Tunnel Encapsulation Attribute", P.
Mohapatra and E. Rosen, draft-ietf-softwire-encaps-safi-05.txt, February 2009.

[RFC2003] "IP Encapsulation within IP", C. Perkins, RFC 2003, October 1996.

[RFC2119] "Key words for use in RFCs to Indicate Requirement Levels", S. Bradner, RFC 2119, March 1997.

[RFC2784] "Generic Routing Encapsulation (GRE)", D. Farinacci, T. Li, S. Hanks, D. Meyer, P. Traina, RFC 2784, March 2000.

[RFC3031] "Multiprotocol Label Switching Architecture", E. Rosen, A. Viswanathan, R. Callon, RFC 3031, January 2001.

[RFC3032] "MPLS Label Stack Encoding", E. Rosen, D. Tappan, G. Fedorkow, Y. Rekhter, D. Farinacci, T. Li, A. Conta, RFC 3032, January 2001.

[RFC3209] D. Awduche, L. Berger, D. Gan, T. Li, V. Srinivasan, G. Swallow, "RSVP-TE: Extensions to RSVP for LSP Tunnels", RFC 3209, December 2001.

[RFC3931] J. Lau, M. Townsley, I. Goyret, "Layer Two Tunneling Protocol - Version 3 (L2TPv3)", RFC 3931, March 2005.

[RFC4023] "Encapsulating MPLS in IP or Generic Routing Encapsulation (GRE)", T. Worster, Y. Rekhter, E. Rosen, RFC 4023, March 2005.

[V4NLRI-V6NH] F. Le Faucheur, E. Rosen, "Advertising an IPv4 NLRI with an IPv6 Next Hop", draft-ietf-softwire-v4nlri-v6nh-02.txt, January 2009.

[V6NLRI-V4NH] J. De Clercq, D. Ooms, S. Prevost, F. Le Faucheur, "Connecting IPv6 Islands over IPv4 MPLS using IPv6 Provider Edge Routers (6PE)", RFC 4798, February 2007.

18. Informative References

[BFD] D. Katz, D. Ward, "Bidirectional Forwarding Detection", draft-ietf-bfd-base-09.txt, February 2009.

[L3VPN-MCAST] "Multicast in MPLS/BGP IP VPNs", E. Rosen, R. Aggarwal, draft-ietf-l3vpn-2547bis-mcast-07.txt, July 2008.

[L3VPN-MCAST-BGP] "BGP Encodings and Procedures for Multicast in MPLS/BGP IP VPNs", R. Aggarwal, E. Rosen, T. Morin, Y. Rekhter, C. Kodeboniya, draft-ietf-l3vpn-2547bis-mcast-bgp-05.txt, June 2008.
[MLDP] "Label Distribution Protocol Extensions for Point-to-Multipoint and Multipoint-to-Multipoint Label Switched Paths", I. Minei, K. Kompella, IJ. Wijnands, B. Thomas, draft-ietf-mpls-ldp-p2mp-05.txt, June 2008.

[RFC1195] "Use of OSI IS-IS for Routing in TCP/IP and Dual Environments", R. Callon, RFC 1195, December 1990.

[RFC2328] J. Moy, "OSPF Version 2", RFC 2328, April 1998.

[RFC2385] "Protection of BGP Sessions via the TCP MD5 Signature Option", A. Heffernan, RFC 2385, August 1998.

[RFC4176] Y. El Mghazli, T. Nadeau, M. Boucadair, K. Chan, A. Gonguet, "Framework for Layer 3 Virtual Private Networks (L3VPN) Operations and Management", RFC 4176, October 2005.

[RFC4271] Y. Rekhter, T. Li, S. Hares, "A Border Gateway Protocol 4 (BGP-4)", RFC 4271, January 2006.

[RFC4291] "IP Version 6 Addressing Architecture", R. Hinden, S. Deering, RFC 4291, February 2006.

[RFC4301] "Security Architecture for the Internet Protocol", S. Kent, K. Seo, RFC 4301, December 2005.

[RFC4302] "IP Authentication Header", S. Kent, RFC 4302, December 2005.

[RFC4303] "IP Encapsulating Security Payload (ESP)", S. Kent, RFC 4303, December 2005.

[RFC4306] "Internet Key Exchange (IKEv2) Protocol", C. Kaufman, Ed., RFC 4306, December 2005.

[RFC4364] E. Rosen, Y. Rekhter, "BGP/MPLS IP Virtual Private Networks (VPNs)", RFC 4364, February 2006.

[RFC4378] D. Allan, T. Nadeau, "A Framework for Multi-Protocol Label Switching (MPLS) Operations and Management (OAM)", RFC 4378, February 2006.

[RFC4459] P. Savola, "MTU and Fragmentation Issues with In-the-Network Tunneling", RFC 4459, April 2006.

[RFC5036] "LDP Specification", L. Andersson, I. Minei, B. Thomas, RFC 5036, October 2007.

[RPF-VECTOR] "The RPF Vector TLV", IJ. Wijnands, A. Boers, E. Rosen, draft-ietf-pim-rpf-vector-08.txt, January 2009.

[SW-PROB] X. Li, S.
Dawkins, D. Ward, A. Durand, "Softwire Problem Statement", RFC 4925, July 2007.