idnits 2.17.1

draft-tsou-softwire-bfd-ds-lite-05.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  == There are 3 instances of lines with non-RFC6890-compliant IPv4 addresses
     in the document.  If these are example addresses, they should be changed.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (June 17, 2013) is 3964 days in the past.  Is this
     intentional?

  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Unused Reference: 'RFC6887' is defined on line 429, but no explicit
     reference was found in the text

  -- No information found for draft-vinokour-bfd-dhcp - is the name correct?

     Summary: 0 errors (**), 0 flaws (~~), 3 warnings (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

Internet Engineering Task Force                                  T. Tsou
Internet-Draft                                 Huawei Technologies (USA)
Intended status: Informational                                     B. Li
Expires: December 19, 2013                                       C. Zhou
                                                     Huawei Technologies
                                                        J. Schoenwaelder
                                                Jacobs University Bremen
                                                                R. Penno
                                                     Cisco Systems, Inc.
                                                            M. Boucadair
                                                          France Telecom
                                                           June 17, 2013


                 DS-Lite Failure Detection and Failover
                   draft-tsou-softwire-bfd-ds-lite-05

Abstract

   In DS-Lite, the tunnel is stateless (it is not associated with any
   state information), while the CGN function at the AFTR is stateful.
   Currently, there is no failure detection and failover mechanism for
   either the stateless tunnel or the stateful CGN function, which
   makes problems difficult to diagnose and manage.  This draft
   analyzes the applicability of some of the possible solutions.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on December 19, 2013.

Copyright Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  Failover Mechanisms
     3.1.  Anycast Approach
     3.2.  VRRP Approach
   4.  Solutions
     4.1.  Bidirectional Forwarding Detection (BFD)
       4.1.1.  DS-Lite Scenario
       4.1.2.  Parameters for BFD
       4.1.3.  Elements of Procedure
       4.1.4.  BFD for NAT Failure Detection
       4.1.5.  Implementation Considerations
     4.2.  Port Control Protocol (PCP)
     4.3.  ICMP Echo (Request) / Echo Reply (PING)
     4.4.  Comparison of Different Solutions
   5.  State Synchronization and Session Re-establishment
   6.  IANA Considerations
   7.  Security Considerations
   8.  Acknowledgements
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Authors' Addresses

1.  Introduction

   In DS-Lite [RFC6333], the IPv4-in-IPv6 tunnel is stateless: no
   status information about the tunnel is available, and there is no
   keep-alive mechanism.
   It is difficult to know whether the tunnel is up or down, and if
   there is a link problem, the Basic Bridging BroadBand (B4) element
   cannot automatically switch to another Address Family Transition
   Router (AFTR) to continue network service without operator
   involvement.  In addition, the CGN function at the AFTR is stateful,
   and there is no mechanism to detect whether the NAT44 CGN in the
   AFTR is functioning.  These gaps create problems for network
   operation and maintenance.

   Possible solutions for failure detection include Bidirectional
   Forwarding Detection (BFD), the Port Control Protocol (PCP), and
   ICMP Echo (Request) / Echo Reply (PING).  This document discusses
   the properties of these solutions and provides guidelines on how to
   implement failure detection and automatic failover.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

2.  Terminology

   AFTR:  Address Family Transition Router.

   B4:    Basic Bridging BroadBand.

   BBF:   BroadBand Forum.

   BFD:   Bidirectional Forwarding Detection.

   CPE:   Customer Premises Equipment (i.e., the DS-Lite B4).

   FQDN:  Fully Qualified Domain Name.

   PCP:   Port Control Protocol.

3.  Failover Mechanisms

   The FQDN of the AFTR is sent to the B4 element via a DHCPv6 option,
   as defined in [RFC6334].  Multiple IP addresses can be configured
   for the FQDN of an AFTR on the DNS server.  If a B4 element detects
   a failure on the link to the AFTR, the B4 element MUST terminate the
   current DS-Lite tunnel, choose another AFTR address from the list,
   and create a tunnel to the new AFTR.  If necessary, the B4 element
   SHOULD re-configure the connectivity test tool accordingly and
   restart the test procedures.

3.1.  Anycast Approach

   Anycast may also be used for failover.  However, anycast has a
   problem with ICMP error messages: when a packet is sent from the
   AFTR to a B4 element and a router along the path generates an ICMP
   error message, e.g., Packet Too Big (PTB), the error message may be
   delivered to a different AFTR rather than to the AFTR that sent the
   packet.

   Anycast is also problematic for a stateful CGN/AFTR.  If the path
   through the CGNs is asymmetric, return traffic will be dropped
   because the AFTR that receives it has no corresponding state table
   entry.

3.2.  VRRP Approach

   For active/passive HA in NAT gateways, it is quite common to have a
   single virtual address offered by VRRP (or a proprietary equivalent)
   that the upstream routers use as their next hop.  In the event that
   the master CGN fails, the standby takes over the virtual L3 address.
   If a VRRP-based virtual address is used as the tunnel endpoint, the
   clients do not need to be aware of the failover.

4.  Solutions

4.1.  Bidirectional Forwarding Detection (BFD)

   Bidirectional Forwarding Detection (BFD) [RFC5880] is a mechanism
   intended to detect faults in a bidirectional path.  It is usually
   used in conjunction with applications such as OSPF and IS-IS for
   fast fault recovery and fast re-route [RFC5882].  The BroadBand
   Forum (BBF) is making BFD mandatory as the keep-alive mechanism for
   subscriber sessions, including DS-Lite [WT-146].

   BFD can be used in DS-Lite by creating a BFD session between the B4
   element and the AFTR to provide tunnel status information.  If a
   fault is detected, the B4 element can try to create a DS-Lite tunnel
   with another AFTR and terminate the existing one, so as to continue
   network service.  BFD could also be used to detect the state of the
   CGN at the AFTR, but such detection should be performed on a
   per-user basis.
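
   As a rough illustration of the failover behaviour described above
   and in Section 3, the following Python sketch models only the
   B4-side decision logic.  It is not taken from this draft or from any
   BFD implementation: the class name B4Failover, the detect_mult
   threshold of 3 missed intervals, and the example addresses from the
   IPv6 documentation prefix are assumptions made for illustration.
   How a keepalive reply is actually obtained (BFD, PCP, or PING) is
   deliberately left outside the sketch.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class B4Failover:
    """Hypothetical B4-side failover state (illustrative only)."""

    aftr_addresses: List[str]  # addresses resolved from the AFTR FQDN
    detect_mult: int = 3       # assumed: missed intervals before a fault
    active: int = 0            # index of the AFTR currently in use
    missed: int = 0            # consecutive intervals without a reply

    def __post_init__(self) -> None:
        if not self.aftr_addresses:
            raise ValueError("at least one AFTR address is required")

    def current_aftr(self) -> str:
        """Return the address of the AFTR the tunnel currently uses."""
        return self.aftr_addresses[self.active]

    def on_interval(self, reply_received: bool) -> bool:
        """Record one keepalive interval.

        Returns True if this interval triggered a failover.
        """
        if reply_received:
            self.missed = 0
            return False
        self.missed += 1
        if self.missed < self.detect_mult:
            return False
        # Fault declared: the caller should tear down the current tunnel
        # and build a new one to the address now returned by
        # current_aftr().
        self.active = (self.active + 1) % len(self.aftr_addresses)
        self.missed = 0
        return True


if __name__ == "__main__":
    b4 = B4Failover(["2001:db8::1", "2001:db8::2"])
    for reply in (True, False, False, False):
        if b4.on_interval(reply):
            print("failover to", b4.current_aftr())
```

   With two addresses configured, three missed intervals in a row move
   the B4 element from the first address to the second, while a reply
   received before the threshold resets the counter.  The sketch wraps
   around to the first address after the last one; a real B4 element
   might instead resolve the FQDN again.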

   [I-D.vinokour-bfd-dhcp] proposes using a DHCP option to distribute
   BFD parameters to B4 elements.  In the case of DS-Lite, however,
   some of the key BFD parameters are already available (e.g., the
   peer IP address), and the other parameters can be negotiated by BFD
   signaling or statically configured, so no extra DHCP option needs
   to be defined.

4.1.1.  DS-Lite Scenario

   In DS-Lite [RFC6333], BFD packets SHOULD be sent through the
   IPv4-in-IPv6 tunnel, as shown in Figure 1.  The IPv4 addresses of
   the B4 element and the AFTR SHOULD be the endpoints of the BFD
   session.

                 +--------------+                  +--------------+
    +------+     |              |     +------+     |              |
    |      |-----+--------------+-----|      |     |              |
    | CPE  |       IPv6 Tunnel        | AFTR |-----| IPv4 Network |
    | (B4) |-----+--------------+-----|      |     |              |
    +------+     | IPv6 Network |     +------+     |              |
    192.0.0.2    +--------------+    192.0.0.1     +--------------+

                       Figure 1: DS-Lite Scenario

4.1.2.  Parameters for BFD

   In order to set up a BFD session, the following parameters are
   needed, as shown in Section 4.1 of [RFC5880]:

   o  Peer IP address

   o  My Discriminator

   o  Your Discriminator

   o  Desired Min TX Interval

   o  Required Min RX Interval

   o  Required Min Echo RX Interval

   In DS-Lite, the B4 element's WAN-side IPv4 address is the well-known
   address 192.0.0.2, and the AFTR's well-known IPv4 address is
   192.0.0.1, as defined in Section 5.7 of [RFC6333] (192.0.0.0 is the
   reserved subnet address of the 192.0.0.0/29 range).  The B4 element
   needs to create an IPv4-in-IPv6 tunnel to an AFTR to obtain IPv4
   connectivity to the AFTR, and it sends IPv4 BFD packets through the
   tunnel to monitor it.

   The other parameters listed above can be negotiated by BFD signaling,
   and initial values can be configured on B4 elements and AFTRs.

4.1.3.  Elements of Procedure

   When a B4 element comes online, it is assigned an IPv6 prefix or
   address and is given the FQDN of the AFTR, as defined in [RFC6334].
   The B4 element then creates an IPv6 tunnel to the AFTR, over which
   it can initiate a BFD session with the AFTR.  BFD packets are sent
   through the DS-Lite tunnel.  As defined in Section 4 of [RFC5881],
   BFD control packets MUST be sent in UDP packets with destination
   port 3784, and BFD echo packets MUST be sent in UDP packets with
   destination port 3785.

   When sending out the first BFD packet, the B4 element can generate a
   unique local discriminator and set the remote discriminator to zero.
   When the AFTR receives the first BFD packet from a B4 element, the
   AFTR also generates a corresponding local discriminator and puts it
   in the response packet to the B4 element.  This completes the
   discriminator negotiation in the B4-to-AFTR direction without any
   manual configuration.

   When an AFTR receives the first packet from a B4 element, the AFTR
   learns the IPv6 address and discriminator of the B4 element, so
   that the AFTR can initiate the BFD session in the other direction
   and a similar discriminator negotiation can be carried out.

4.1.4.  BFD for NAT Failure Detection

   The B4 element creates a PCP mapping.  The BFD function at the AFTR
   uses an external public interface (or another external mapping) to
   send a BFD packet to the public PCP mapping created by the B4
   element.  In this case, the BFD packet from the AFTR carries the
   public IP address of that interface as its source address and
   passes through the NAT, thereby exercising the NAT function.  The B4
   element replies to the external interface of the AFTR.

4.1.5.  Implementation Considerations

   BFD is usually used for fast fault detection on a very small time
   scale, e.g., milliseconds.  In DS-Lite, however, it may not be
   necessary to detect faults that quickly.  Moreover, an AFTR may need
   to support tens of thousands of B4 elements, which means that it
   will need to support the same number of BFD sessions.
   To meet performance requirements on an AFTR, it may be necessary to
   lengthen the interval between BFD packet transmissions, e.g., to 10
   or 30 seconds.

   Compared to the other solutions, BFD has a simple, fixed packet
   format that is easy to implement in hardware logic (e.g., an ASIC or
   FPGA).  More complicated protocols are usually processed in
   software, which is relatively slow.  An AFTR may need to support
   10,000 to 20,000 users, and handling the protocol in software would
   place extra load on the AFTR.

4.2.  Port Control Protocol (PCP)

   PCP [RFC6887] is a NAT traversal tool.  It can also be used for
   network connectivity tests if PCP is supported in the network.  A
   common use case of PCP is to create a pinhole so that external users
   can reach servers located behind a NAT.  The lifetime of the pinhole
   mapping is usually long, e.g., hours, and the client refreshes it
   periodically before it expires.  For network connectivity tests, a
   B4 element can create a mapping in the CGN via PCP with a short
   lifetime, e.g., tens of seconds, and keep refreshing the mapping
   before it expires.  If a refresh request fails, the B4 element knows
   that something is wrong with the link, the PCP server, or the CGN.

   To test the connectivity of the DS-Lite tunnel, the encapsulation
   mode MUST be used for PCP; that is, PCP packets are sent through the
   DS-Lite tunnel.

   PCP can detect failures in more components of the DS-Lite system:
   besides failures of the link and of routing, it also covers the NAT
   function.

4.3.  ICMP Echo (Request) / Echo Reply (PING)

   PING is commonly implemented using the Echo (Request) and Echo Reply
   messages of the Internet Control Message Protocol (ICMP) [RFC0792]
   [RFC4443].  In the case of DS-Lite, a B4 element can send Echo
   (Request) packets to the AFTR periodically.
   If the B4 element does not receive Echo Reply packets for a certain
   number (e.g., 3) of Echo (Request) packets, it concludes that a
   fault has occurred.

   To test the connectivity of the DS-Lite tunnel, Echo (Request)
   packets MUST be sent using ICMPv4 rather than ICMPv6.

   Since ICMP is an integral part of any IP implementation, using PING
   to detect tunnel failures does not require any special
   implementation effort on the B4 elements.  However, on AFTRs that
   process ICMP messages in software rather than in hardware, the use
   of PING might lead to scalability issues.

4.4.  Comparison of Different Solutions

   +--------+-------------+--------+--------------------+-------------+
   |        |             | Packet |Additional          |Configuration|
   |        |Availability | format |functionality       |/provisioning|
   |        |             |        |on top of keepalives|  overheads  |
   +--------+-------------+--------+--------------------+-------------+
   |  BFD   |Widely used/ |        |                    |             |
   |        |network side,| Simple |Bidirectional       |             |
   |        |less used/   | fixed  |status              |             |
   |        |terminal side|        |synchronization     |             |
   +--------+-------------+--------+--------------------+   Similar   |
   |  PCP   |Less than    |        |No bidirectional    |             |
   |        |BFD/ICMP     |Variable|detection           |             |
   +--------+-------------+        +--------------------+             |
   |  ICMP  |Ubiquitous   |        |Network/CGN         |             |
   |        |             |        |initiated detection |             |
   +--------+-------------+--------+--------------------+-------------+

               Figure 2: Comparison of different solutions

   Figure 2 compares the different solutions.  Compared to the other
   solutions, BFD has a simple, fixed packet format that is easy to
   implement in hardware logic (e.g., an ASIC or FPGA).  More
   complicated protocols are usually processed in software, which is
   relatively slow.  ICMP is more widely used than PCP or BFD, while
   BFD is more widely used on the router and CGN side than on the
   terminal side.
   From the perspective of failure detection, however, BFD has an
   explicit capability for bidirectional status synchronization, which
   guarantees that both sides have a consistent view of the failure
   status.  ICMP can actively initiate status detection from the
   network or CGN side, while PCP cannot; PCP has no capability for
   bidirectional detection.  The configuration and provisioning
   overhead is similar for each approach, since there is normally a
   TR-069 server on the network management side.

   Based on the above comparison, this document selects BFD as the
   failure detection approach.

5.  State Synchronization and Session Re-establishment

   There should be a state synchronization mechanism between the active
   AFTR and the backup AFTR that synchronizes the state of each user
   between the two AFTRs.  This mechanism guarantees that traffic
   returning to the B4 element can be delivered by the backup AFTR if
   service is shifted to that AFTR.  BFD sessions with both the active
   AFTR and the backup AFTR should be set up in the initial state.
   When a failure of the active AFTR is detected, service is shifted to
   the backup AFTR.  When a failure of the backup AFTR is detected, the
   network management server is notified so that the failure can be
   fixed.

   In the hot-standby case, the master AFTR and the backup AFTR
   synchronize and back up session state, so there is no need to
   re-establish TCP sessions in the event of an AFTR failure.  In the
   cold-standby case, however, if there is an active TCP session
   through the CGN function of an AFTR and that AFTR fails, the client
   will need to re-establish the TCP session, because only the
   capability is reserved and the session itself is not backed up.

6.  IANA Considerations

   This memo includes no request to IANA.

7.  Security Considerations

   In the DS-Lite [RFC6333] application, the B4 element may not be
   directly connected to the AFTR; there may be other routers between
   them.  In such a deployment, there are potential spoofing problems,
   as described in [RFC5883].  Hence, cryptographic authentication
   SHOULD be used with BFD, as described in [RFC5880], if security is
   a concern.

8.  Acknowledgements

   The authors would like to thank Ian Farrer for his valuable comments.

9.  References

9.1.  Normative References

   [RFC0792]  Postel, J., "Internet Control Message Protocol", STD 5,
              RFC 792, September 1981.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC4443]  Conta, A., Deering, S., and M. Gupta, "Internet Control
              Message Protocol (ICMPv6) for the Internet Protocol
              Version 6 (IPv6) Specification", RFC 4443, March 2006.

   [RFC5880]  Katz, D. and D. Ward, "Bidirectional Forwarding Detection
              (BFD)", RFC 5880, June 2010.

   [RFC5881]  Katz, D. and D. Ward, "Bidirectional Forwarding Detection
              (BFD) for IPv4 and IPv6 (Single Hop)", RFC 5881, June
              2010.

   [RFC5882]  Katz, D. and D. Ward, "Generic Application of
              Bidirectional Forwarding Detection (BFD)", RFC 5882, June
              2010.

   [RFC5883]  Katz, D. and D. Ward, "Bidirectional Forwarding Detection
              (BFD) for Multihop Paths", RFC 5883, June 2010.

   [RFC6333]  Durand, A., Droms, R., Woodyatt, J., and Y. Lee, "Dual-
              Stack Lite Broadband Deployments Following IPv4
              Exhaustion", RFC 6333, August 2011.

   [RFC6334]  Hankins, D. and T. Mrugalski, "Dynamic Host Configuration
              Protocol for IPv6 (DHCPv6) Option for Dual-Stack Lite",
              RFC 6334, August 2011.

   [RFC6887]  Wing, D., Cheshire, S., Boucadair, M., Penno, R., and P.
              Selkirk, "Port Control Protocol (PCP)", RFC 6887, April
              2013.

   [WT-146]   Kavanagh, A., Klamm, F., Boucadair, W., and R. Dec,
              "WT-146 Subscriber Sessions (work in progress)", Apr 2012.

9.2.  Informative References

   [I-D.vinokour-bfd-dhcp]
              Vinokour, V., "Configuring BFD with DHCP and Other
              Musings", May 2008.

Authors' Addresses

   Tina Tsou
   Huawei Technologies (USA)
   2330 Central Expressway
   Santa Clara CA 95050
   USA

   Phone: +1 408 330 4424
   Email: tina.tsou.zouting@huawei.com


   Brandon Li
   Huawei Technologies
   M6, No. 156, Beiqing Road, Haidian District
   Beijing 100094
   China

   Email: brandon.lijian@huawei.com


   Cathy Zhou
   Huawei Technologies
   China

   Email: cathy.zhou@huawei.com


   Juergen Schoenwaelder
   Jacobs University Bremen
   Campus Ring 1
   Bremen 28759
   Germany

   Email: j.schoenwaelder@jacobs-university.de


   Reinaldo Penno
   Cisco Systems, Inc.
   170 West Tasman Drive
   San Jose, California 95134
   USA

   Email: repenno@cisco.com


   Mohamed Boucadair
   France Telecom
   Rennes 35000
   France

   Email: mohamed.boucadair@orange.com