Internet Engineering Task Force                                  T. Tsou
Internet-Draft                                 Huawei Technologies (USA)
Intended status: Informational                                     B. Li
Expires: August 17, 2014                                         C. Zhou
                                                     Huawei Technologies
                                                        J. Schoenwaelder
                                                Jacobs University Bremen
                                                                R. Penno
                                                     Cisco Systems, Inc.
                                                            M. Boucadair
                                                          France Telecom
                                                       February 13, 2014


                 DS-Lite Failure Detection and Failover
                   draft-tsou-softwire-bfd-ds-lite-06

Abstract

   In DS-Lite, the tunnel is stateless (it is not associated with any
   state information), whereas the CGN function at the AFTR is
   stateful. Currently, there is no failure detection and failover
   mechanism for either the stateless tunnel or the stateful CGN
   function, which makes problems difficult to manage and diagnose.
   This draft analyzes the applicability of some of the possible
   solutions.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on August 17, 2014.

Copyright Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document. Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  Failover Mechanisms
     3.1.  Anycast Approach
     3.2.  VRRP Approach
   4.  Solutions
     4.1.  Bidirectional Forwarding Detection (BFD)
       4.1.1.  DS-Lite Scenario
       4.1.2.  Parameters for BFD
       4.1.3.  Elements of Procedure
       4.1.4.  BFD for NAT failure detection
       4.1.5.  Implementation Considerations
     4.2.  Port Control Protocol (PCP)
     4.3.  ICMP Echo Request / Echo Reply (PING)
     4.4.  Comparison of Different Solutions
   5.  State Synchronization and Session Re-establishment
   6.  IANA Considerations
   7.  Security Considerations
   8.  Acknowledgements
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Authors' Addresses

1.  Introduction

   In DS-Lite [RFC6333], the IPv4-in-IPv6 DS-Lite tunnel is stateless:
   no status information about the tunnel is available, and there is
   no keep-alive mechanism.
   It is difficult to know whether the tunnel is up or down, and if
   there is a link problem, the Basic Bridging BroadBand (B4) element
   cannot automatically switch to another Address Family Transition
   Router (AFTR) to continue the network service without operator
   involvement. In addition, the CGN function at the AFTR is stateful,
   and there is no mechanism to detect whether the NAT44 CGN is
   functioning in the AFTR. These gaps create problems for network
   operation and maintenance.

   Possible solutions for failure detection include the usage of
   Bidirectional Forwarding Detection (BFD), the Port Control Protocol
   (PCP), and ICMP Echo Request / Echo Reply (PING). The properties of
   these solutions are discussed in this document, and guidelines are
   provided on how to implement failure detection and automatic
   failover.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

2.  Terminology

   AFTR:  Address Family Transition Router.

   B4:  Basic Bridging BroadBand.

   BBF:  BroadBand Forum.

   BFD:  Bidirectional Forwarding Detection.

   CPE:  Customer Premises Equipment (i.e., the DS-Lite B4).

   FQDN:  Fully Qualified Domain Name.

   PCP:  Port Control Protocol.

3.  Failover Mechanisms

   The FQDN of the AFTR is sent to the B4 element via a DHCP option, as
   defined in [RFC6334]. Multiple IP addresses can be configured for
   the FQDN of an AFTR on the DNS server. If a B4 element detects a
   failure on the link to the AFTR, the B4 element MUST terminate the
   current DS-Lite tunnel, choose another AFTR address in the list, and
   create a tunnel to the new AFTR. If necessary, the B4 element SHOULD
   re-configure the connectivity test tool accordingly and restart the
   test procedures.

3.1.  Anycast Approach

   Anycast may also be used for failover. However, anycast has an
   ICMP error message problem: when a packet is sent from the AFTR to
   a B4 element and one of the routers along the path generates an
   ICMP error message, e.g., Packet Too Big (PTB), the error message
   may be delivered to a different AFTR rather than to the source AFTR.

   There is also a problem with anycast for a stateful CGN/AFTR. If
   there is an asymmetric path through the CGNs, return path traffic
   will be dropped because there is no corresponding state table entry
   in the AFTR.

3.2.  VRRP Approach

   For active/passive high availability (HA) in NAT gateways, it is
   quite common to have a single virtual address offered by VRRP (or a
   proprietary equivalent) that the upstream routers use as their next
   hop. If the master CGN fails, the standby takes over the virtual L3
   address. If a VRRP-based virtual address is used as the tunnel
   endpoint, the clients do not need to be aware of the failover.

4.  Solutions

4.1.  Bidirectional Forwarding Detection (BFD)

   Bidirectional Forwarding Detection (BFD) [RFC5880] is a mechanism
   intended to detect faults in a bidirectional path. It is usually
   used in conjunction with applications such as OSPF and IS-IS for
   fast fault recovery and fast re-route [RFC5882]. BFD is being made
   mandatory for keep-alive for subscriber sessions, including DS-Lite,
   by the BroadBand Forum (BBF) [WT-146].

   BFD can be used in DS-Lite by creating a BFD session between the B4
   element and the AFTR to provide tunnel status information. If a
   fault is detected, the B4 element can terminate the existing DS-Lite
   tunnel and try to create one with another AFTR, so as to continue
   network service. BFD could also be used to detect the CGN state at
   the AFTR, but the detection should be performed on a per-user basis.
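
   The failover behavior described in Section 3 and in the paragraph
   above can be sketched as follows. This is a minimal illustration in
   Python, not part of any specification. The class name B4Failover
   and the callbacks is_alive, teardown, and setup are hypothetical
   placeholders for the chosen detection mechanism (BFD, PCP, or PING)
   and for the platform's tunnel management. The round-robin choice of
   the next address is also an assumption, since the text only
   requires choosing another AFTR address from the list.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class B4Failover:
    """Round-robin AFTR selection for a B4 element (illustrative only)."""

    aftr_addresses: List[str]        # addresses configured for the AFTR FQDN
    is_alive: Callable[[str], bool]  # hypothetical probe (BFD, PCP, or PING)
    teardown: Callable[[str], None]  # hypothetical: terminate tunnel to AFTR
    setup: Callable[[str], None]     # hypothetical: create tunnel to AFTR
    current: int = 0                 # index of the AFTR currently in use

    def active_aftr(self) -> str:
        return self.aftr_addresses[self.current]

    def check_and_failover(self) -> bool:
        """Probe the active AFTR; on failure, move to the next address.

        Returns True if a failover was performed, False otherwise.
        """
        if self.is_alive(self.active_aftr()):
            return False
        # Terminate the current tunnel before creating the new one.
        self.teardown(self.active_aftr())
        # Assumption: simply take the next address in the list, wrapping.
        self.current = (self.current + 1) % len(self.aftr_addresses)
        self.setup(self.active_aftr())
        return True
```

   A B4 element would call check_and_failover() periodically, at
   whatever interval the deployment chooses for its connectivity test.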

   [I-D.vinokour-bfd-dhcp] proposes using a DHCP option to distribute
   BFD parameters to B4 elements. In the case of DS-Lite, however, some
   of the key BFD parameters are already available (e.g., peer IP
   address), and the other parameters can be negotiated by BFD
   signaling or statically configured, so no extra DHCP options need to
   be defined.

4.1.1.  DS-Lite Scenario

   In DS-Lite [RFC6333], BFD packets SHOULD be sent through the
   IPv4-in-IPv6 tunnel, as shown in Figure 1. The IPv4 addresses of the
   B4 element and the AFTR SHOULD be the endpoints of the BFD session.

                  +--------------+                  +--------------+
     +------+     |              |     +------+     |              |
     |      |-----+--------------+-----|      |     |              |
     | CPE  |       IPv6 Tunnel        | AFTR |-----| IPv4 Network |
     | (B4) |-----+--------------+-----|      |     |              |
     +------+     | IPv6 Network |     +------+     |              |
     192.0.0.2    +--------------+    192.0.0.1     +--------------+

                        Figure 1: DS-Lite Scenario

4.1.2.  Parameters for BFD

   In order to set up a BFD session, the following parameters are
   needed, as shown in Section 4.1 of [RFC5880]:

   o  Peer IP address

   o  My Discriminator

   o  Your Discriminator

   o  Desired Min TX Interval

   o  Required Min RX Interval

   o  Required Min Echo RX Interval

   The B4's WAN-side IPv4 address is the well-known address 192.0.0.2,
   and the AFTR's well-known IPv4 address is 192.0.0.1, as defined in
   Section 5.7 of [RFC6333]. The B4 element needs to create an IPv6
   tunnel to an AFTR to obtain network connectivity to the AFTR, and it
   sends IPv4 BFD packets through the tunnel to manage it.

   The other parameters listed above can be negotiated by BFD
   signaling, and initial values can be configured on B4 elements and
   AFTRs.

4.1.3.  Elements of Procedure

   When a B4 element comes online, it is assigned an IPv6 prefix or
   address, and also the FQDN of the AFTR, as defined in [RFC6334].
   The B4 element creates an IPv6 tunnel to the AFTR, over which it can
   initiate a BFD session to the AFTR. BFD packets are sent through the
   DS-Lite tunnel. As defined in Section 4 of [RFC5881], BFD control
   packets MUST be sent in UDP packets with destination port 3784, and
   BFD echo packets MUST be sent in UDP packets with destination port
   3785.

   When sending out the first BFD packet, the B4 element can generate a
   unique local discriminator and set the remote discriminator to zero.
   When the AFTR receives the first BFD packet from a B4 element, the
   AFTR also generates a corresponding local discriminator and puts it
   in the response packet to the B4 element. This completes the
   discriminator negotiation in the B4-to-AFTR direction without any
   manual configuration.

   When an AFTR receives the first packet from a B4 element, the AFTR
   learns the IPv6 address and discriminator of the B4 element, so the
   AFTR can initiate the BFD session in the other direction, and a
   similar discriminator negotiation can be carried out.

4.1.4.  BFD for NAT failure detection

   The B4 element creates a PCP mapping. BFD at the AFTR then uses an
   external public interface (or another external mapping) to send a
   BFD packet to the public PCP mapping created by the B4. In this
   case, the BFD packet from the AFTR has the public IP address of that
   interface as its source and goes through the NAT, therefore
   exercising the NAT function. The B4 replies to the AFTR's external
   interface.

4.1.5.  Implementation Considerations

   BFD is usually used for quick fault detection on a very small time
   scale, e.g., milliseconds. In DS-Lite, however, it may not be
   necessary to detect faults that quickly. Moreover, an AFTR may need
   to support tens of thousands of B4 elements, which means it will
   need to support the same number of BFD sessions.
   In order to meet performance requirements on an AFTR, it may be
   necessary to extend the interval between BFD packet transmissions,
   e.g., to 10 s or 30 s.

   Compared to other solutions, BFD has a simple and fixed packet
   format, which is easy to implement in logic devices (e.g., ASIC,
   FPGA). Complicated protocols are usually processed by software,
   which is relatively slow. An AFTR may need to support 10000-20000
   users, and if the protocol is handled by software, it will bring
   extra load to the AFTR.

4.2.  Port Control Protocol (PCP)

   PCP [RFC6887] is a NAT traversal tool. It can also be used for
   network connectivity tests if PCP is supported in the network. A
   common use case of PCP is to create a pinhole so that external users
   can reach servers located behind a NAT. The lifetime of the pinhole
   mapping is usually long, e.g., hours, and the client refreshes it
   periodically before it expires. For the purpose of network
   connectivity tests, a B4 element can create a mapping in the CGN via
   PCP with a short lifetime, e.g., tens of seconds, and keep
   refreshing the mapping before it expires. If any refresh request
   fails, the B4 element knows that something is wrong with the link,
   the PCP server, or the CGN.

   In order to detect the network connectivity of the DS-Lite tunnel,
   the encapsulation mode MUST be used for PCP: PCP packets are sent
   through the DS-Lite tunnel.

   PCP can detect the failure of more components of the DS-Lite system.
   Besides failures of the link and the routing, it also covers NAT
   functions.

4.3.  ICMP Echo Request / Echo Reply (PING)

   PING is commonly implemented using the Echo Request and Echo Reply
   messages of the Internet Control Message Protocol (ICMP) [RFC0792]
   [RFC4443]. In the case of DS-Lite, a B4 element can send Echo
   Request packets to the AFTR periodically.
   If the B4 element does not receive Echo Reply packets for a certain
   number (e.g., 3) of Echo Request packets, the B4 element decides
   that a fault has been detected.

   In order to test the connectivity of the DS-Lite tunnel, Echo
   Request packets MUST be sent using ICMPv4, rather than ICMPv6.

   Since ICMP is an integral part of any IP implementation, the usage
   of PING to detect tunnel failures does not require any special
   implementation effort on the B4 elements. However, on AFTRs that
   process ICMP messages in software rather than in hardware, the usage
   of PING might lead to scalability issues.

4.4.  Comparison of Different Solutions

   +--------+-------------+--------+-------------------+-------------+
   |        |             | Packet |Additional         |Configuration|
   |        |Availability | format |functionality on   |/provisioning|
   |        |             |        |top of keepalives  |  overheads  |
   +--------+-------------+--------+-------------------+-------------+
   |  BFD   |Widely used/ |        |                   |             |
   |        |network side,| Simple,|Bidirectional      |             |
   |        |less used/   | fixed  |status             |             |
   |        |terminal side|        |synchronization    |             |
   +--------+-------------+--------+-------------------+   Similar   |
   |  PCP   |Less than    |        |No bidirectional   |             |
   |        |BFD/ICMP     |Variable|detection          |             |
   +--------+-------------+        +-------------------+             |
   |  ICMP  |Ubiquitous   |        |Network/CGN        |             |
   |        |             |        |initiated detection|             |
   +--------+-------------+--------+-------------------+-------------+

               Figure 2: Comparison of different solutions

   Figure 2 gives a direct comparison of the different solutions.
   Compared to other solutions, BFD has a simple and fixed packet
   format, which is easy to implement in logic devices (e.g., ASIC,
   FPGA). Complicated protocols are usually processed by software,
   which is relatively slow. ICMP is more widely used than PCP and BFD,
   while BFD is more widely used on the router and CGN side than on the
   terminal side.
   However, from the aspect of failure detection, BFD has an explicit
   capability for bidirectional status synchronization, which
   guarantees that both sides have a consistent view of the failure
   status. ICMP can actively initiate status detection from the network
   side or CGN side, while PCP cannot. PCP has no capability for
   bidirectional detection. As for configuration and provisioning
   overhead, a TR-069 server is normally available on the network
   management side, so the overhead is similar for each approach.

   Based on the above comparison, BFD is selected as the failure
   detection approach in this document.

5.  State Synchronization and Session Re-establishment

   There should be a state synchronization mechanism between the active
   AFTR and the backup AFTR to synchronize the state of each user
   between the two AFTRs. This mechanism guarantees that traffic
   returning to the B4 can come from the backup AFTR if the service is
   shifted to that AFTR. The BFD links to both the active AFTR and the
   backup AFTR should be set up in the initial state. When a failure of
   the active AFTR is detected, the service is shifted to the backup
   AFTR. If a failure of the backup AFTR is detected, the network
   management server is notified so that the failure can be fixed.

   In the hot-standby case, the master AFTR and the backup AFTR
   synchronize and back up the session, so there is no need to
   re-establish the TCP session in the event of an AFTR failure. In the
   cold-standby case, however, if there is an active TCP session
   through the CGN function of an AFTR and this AFTR fails, the TCP
   session will need to be re-established by the client, because only
   the capability is reserved and the session is not backed up.

6.  IANA Considerations

   This memo includes no request to IANA.

7.  Security Considerations

   In a DS-Lite [RFC6333] deployment, the B4 element may not be
   directly connected to the AFTR; there may be other routers between
   them. In such a deployment, there are potential spoofing problems,
   as described in [RFC5883]. Hence, cryptographic authentication
   SHOULD be used with BFD as described in [RFC5880] if security is a
   concern.

8.  Acknowledgements

   The authors would like to thank Ian Farrer for his valuable
   comments.

9.  References

9.1.  Normative References

   [RFC0792]  Postel, J., "Internet Control Message Protocol", STD 5,
              RFC 792, September 1981.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC4443]  Conta, A., Deering, S., and M. Gupta, "Internet Control
              Message Protocol (ICMPv6) for the Internet Protocol
              Version 6 (IPv6) Specification", RFC 4443, March 2006.

   [RFC5880]  Katz, D. and D. Ward, "Bidirectional Forwarding Detection
              (BFD)", RFC 5880, June 2010.

   [RFC5881]  Katz, D. and D. Ward, "Bidirectional Forwarding Detection
              (BFD) for IPv4 and IPv6 (Single Hop)", RFC 5881,
              June 2010.

   [RFC5882]  Katz, D. and D. Ward, "Generic Application of
              Bidirectional Forwarding Detection (BFD)", RFC 5882,
              June 2010.

   [RFC5883]  Katz, D. and D. Ward, "Bidirectional Forwarding Detection
              (BFD) for Multihop Paths", RFC 5883, June 2010.

   [RFC6333]  Durand, A., Droms, R., Woodyatt, J., and Y. Lee,
              "Dual-Stack Lite Broadband Deployments Following IPv4
              Exhaustion", RFC 6333, August 2011.

   [RFC6334]  Hankins, D. and T. Mrugalski, "Dynamic Host Configuration
              Protocol for IPv6 (DHCPv6) Option for Dual-Stack Lite",
              RFC 6334, August 2011.

   [RFC6887]  Wing, D., Cheshire, S., Boucadair, M., Penno, R., and P.
              Selkirk, "Port Control Protocol (PCP)", RFC 6887,
              April 2013.

   [WT-146]   Kavanagh, A., Klamm, F., Boucadair, W., and R. Dec,
              "WT-146 Subscriber Sessions (work in progress)",
              Apr 2012.

9.2.  Informative References

   [I-D.vinokour-bfd-dhcp]
              Vinokour, V., "Configuring BFD with DHCP and Other
              Musings", May 2008.

Authors' Addresses

   Tina Tsou
   Huawei Technologies (USA)
   2330 Central Expressway
   Santa Clara CA 95050
   USA

   Phone: +1 408 330 4424
   Email: tina.tsou.zouting@huawei.com


   Brandon Li
   Huawei Technologies
   M6, No. 156, Beiqing Road, Haidian District
   Beijing 100094
   China

   Phone:
   Email: brandon.lijian@huawei.com


   Cathy Zhou
   Huawei Technologies
   China

   Phone:
   Email: cathy.zhou@huawei.com


   Juergen Schoenwaelder
   Jacobs University Bremen
   Campus Ring 1
   Bremen 28759
   Germany

   Phone:
   Email: j.schoenwaelder@jacobs-university.de


   Reinaldo Penno
   Cisco Systems, Inc.
   170 West Tasman Drive
   San Jose, California 95134
   USA

   Phone:
   Email: repenno@cisco.com


   Mohamed Boucadair
   France Telecom
   Rennes, 35000
   France

   Phone:
   Email: mohamed.boucadair@orange.com