idnits 2.17.1

draft-ymbk-idr-rs-bfd-01.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  -- The document has an IETF Trust Provisions (28 Dec 2009) Section 6.c(ii)
     Publication Limitation clause. If this document is intended for
     submission to the IESG for publication, this constitutes an error.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack a Security Considerations section.

  ** The document seems to lack an IANA Considerations section. (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == The document seems to lack the recommended RFC 2119 boilerplate, even if
     it appears to use RFC 2119 keywords -- however, there's a paragraph with
     a matching beginning. Boilerplate error?

     (The document does seem to have the reference to RFC 2119 which the
     ID-Checklist requires).

  -- The document date (March 24, 2015) is 3314 days in the past. Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Unused Reference: 'RFC2439' is defined on line 300, but no explicit
     reference was found in the text

  == Outdated reference: A later version (-03) exists of
     draft-ietf-idr-bgp-nh-cost-01

  == Outdated reference: A later version (-12) exists of
     draft-ietf-idr-ix-bgp-route-server-06

     Summary: 2 errors (**), 0 flaws (~~), 5 warnings (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

Network Working Group                                            R. Bush
Internet-Draft                                 Internet Initiative Japan
Intended status: Standards Track                                 J. Haas
Expires: September 25, 2015                                   J. Scudder
                                                  Juniper Networks, Inc.
                                                               A. Nipper
                                                            T. King, Ed.
                                                  DE-CIX Management GmbH
                                                          March 24, 2015

        Making Route Servers Aware of Data Link Failures at IXPs
                        draft-ymbk-idr-rs-bfd-01

Abstract

   When route servers are used, the data plane is not congruent with the
   control plane. Therefore, the peers on the Internet exchange can
   lose data connectivity without the control plane being aware of it,
   and packets are dropped on the floor. This document proposes the use
   of BFD between the two peering routers to detect a data plane
   failure, and then uses BGP next hop cost to signal the state of the
   data link to the route server(s).

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" are to
   be interpreted as described in [RFC2119] only when they appear in all
   upper case. They may also appear in lower or mixed case as English
   words, without normative meaning.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on September 25, 2015.

Copyright Notice

   Copyright (c) 2015 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

   This document may not be modified, and derivative works of it may not
   be created, and it may not be published except as an Internet-Draft.

Table of Contents

   1. Introduction
   2. Operation
     2.1. Mutual Discovery of Route Server Client Routers
     2.2. Tracking Connectivity
   3. Advertising Client Router Connectivity to the Route Server
   4. Utilizing Next Hop Unreachability Information at Client Routers
   5. Recommendations for Using BFD
   6. Bootstrapping
   7. Other Considerations
   8. Normative References
   Authors' Addresses

1. Introduction

   In configurations (typically Internet exchanges) where EBGP routing
   information is exchanged between client routers through the agency of
   a route server [I-D.ietf-idr-ix-bgp-route-server], but traffic is
   exchanged directly, operational issues can arise when partial data
   plane connectivity exists among the route server client routers.
   This is because, as the data plane is not congruent with the control
   plane, the client routers on the Internet exchange can lose data
   connectivity without the control plane - the route server - being
   aware of it, and packets are dropped on the floor.

   To remedy this, two basic problems need to be solved:

   1. Client routers must have a means of verifying connectivity
      amongst themselves, and

   2. Client routers must have a means of communicating the knowledge
      so gained back to the route server.

   The first can be solved by application of Bidirectional Forwarding
   Detection [RFC5880]. The second can be solved by use of BGP NH-SAFI
   [I-D.ietf-idr-bgp-nh-cost]. There is a subsidiary problem that must
   also be solved. Since one of the key value propositions offered by a
   route server is that client routers need not be configured to peer
   with each other:

   3. Client routers must have a means (other than configuration) to
      know of one another's existence.

   This can also be solved by an application of BGP NH-SAFI.
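
   As an informal illustration only, the three requirements above can be
   sketched as a toy Python model. It is not an implementation of BGP,
   NH-SAFI, or BFD: the names Client, broken_links, routes_for, and the
   UNREACHABLE sentinel are hypothetical, the set of broken links stands
   in for what BFD sessions would detect, the fixed cost of 1 is
   arbitrary, and excluding a client's own routes from its view is an
   assumption made to keep the example small.

```python
from dataclasses import dataclass, field

# Hypothetical sentinel meaning "no data plane connectivity to this next hop".
UNREACHABLE = object()


@dataclass
class Client:
    """A route server client router, identified by its next hop address."""
    next_hop: str
    # Next hops this client cannot reach in the data plane. In a real
    # deployment this would come from BFD session state; here it is a fixture.
    broken_links: set = field(default_factory=set)

    def report(self, advertised_next_hops):
        """Requirements 1 and 2: check every next hop learned from the
        route server and report a cost, or UNREACHABLE, for each."""
        return {
            nh: (UNREACHABLE if nh in self.broken_links else 1)
            for nh in advertised_next_hops
            if nh != self.next_hop
        }


def routes_for(client, routes, clients):
    """Route server side: compute the routes to offer to one client."""
    # Requirement 3: the client learns the other next hops from the route
    # server rather than from local configuration.
    known_next_hops = {c.next_hop for c in clients}
    report = client.report(known_next_hops)
    # Keep a route unless the client reported its next hop as unreachable.
    # The client's own routes are left out (an assumption of this sketch).
    return [
        (prefix, nh)
        for prefix, nh in routes
        if nh != client.next_hop and report.get(nh) is not UNREACHABLE
    ]


if __name__ == "__main__":
    a = Client("192.0.2.1")
    b = Client("192.0.2.2", broken_links={"192.0.2.3"})
    c = Client("192.0.2.3")
    routes = [("198.51.100.0/24", "192.0.2.1"),
              ("203.0.113.0/24", "192.0.2.3")]
    # b has lost data plane connectivity to c, so b is only offered
    # the route whose next hop is a.
    print(routes_for(b, routes, [a, b, c]))
```

   In this sketch the route server computes a separate result for each
   client, so a failure between two client routers only affects the
   routes offered to the client that reported it.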

   Throughout this document, we generally assume that the route server
   being discussed is able to represent different RIBs towards different
   clients, as discussed in Section 2.3.2.1 of
   [I-D.ietf-idr-ix-bgp-route-server]. These procedures (other than the
   use of BFD to track next hop reachability) have limited value if this
   is not the case.

2. Operation

   Below, we detail procedures where a route server tells each client
   router about other client routers (by sending it their next hops
   using NH-SAFI), the client router verifies connectivity to those
   other client routers (using BFD) and communicates its findings back
   to the route server (again using NH-SAFI). The route server uses the
   received NH-SAFI routes as input to the route selection process it
   performs on behalf of the client.

2.1. Mutual Discovery of Route Server Client Routers

   Strictly speaking, what is needed is not for a route server client
   router to know of other (control-plane) client routers, but rather to
   know (so that it can validate) all the next hops the route server
   might choose to send the client router, i.e. to know of potential
   forwarding plane relationships.

   In effect, this requirement amounts to knowing the BGP next hops the
   route server is aware of in its Adj-RIBs-In. Fortunately,
   [I-D.ietf-idr-bgp-nh-cost] defines a construct that contains exactly
   this data, the "Next-Hop Information Base", or NHIB, as well as
   procedures for a BGP speaker to communicate its NHIB to its peer.

   Thus, the problem can be solved by the route server advertising its
   NHIB to its client router, following those procedures.

   We observe that (as per NH-SAFI) the cost advertised in the route
   server's Adj-NHIB-Out need not reflect a "real" IGP cost, the only
   requirement being that the advertised costs are commensurate.
   A route server MAY choose to advertise any fixed cost other than
   all-ones (which is a reserved value in NH-SAFI). This specification
   does not suggest semantics be imputed to the NH-SAFI advertised by
   the route server and received by the client, other than "this next
   hop is present in the control plane, you might like to track it".
   The route server is not allowed to advertise a next hop as
   NH_UNREACHABLE.

   A route server client SHOULD use BFD (or other means beyond the scope
   of this document) to track forwarding plane connectivity [RFC5880] to
   each next hop depicted in the received NH-SAFI.

2.2. Tracking Connectivity

   For each next hop in the Adj-NHIB-In received from the route server,
   the client router SHOULD use some means to confirm that data plane
   connectivity does exist to that next hop.

   For each next hop in the Adj-NHIB-In received from the route server,
   the client router SHOULD set up a BFD session to it if one is not
   already available and track the reachability of this next hop.

   For each next hop being tracked, a corresponding NH-SAFI route should
   be placed in the client router's own Adj-NHIB-Out to be advertised to
   the route server. Any next hop for which connectivity has failed
   should have its cost advertised as NH_UNREACHABLE. (This may also be
   done as a result of policy even if connectivity exists.) Any other
   next hop should have some feasible cost advertised. The values
   advertised may be all equal, or may be set according to policy or
   other implementation-specific means.

   If the test of connectivity between one client router and another
   client router has failed, the client router that detected this
   failure should repeat the connectivity test on a regular basis (e.g.,
   every 5 minutes) for a configurable amount of time (preferably 24
   hours).
   If connectivity cannot be restored during this time, no more testing
   is performed and the unreachable client router is advertised as
   NH_UNREACHABLE until this is changed manually or the client router is
   rebooted.

3. Advertising Client Router Connectivity to the Route Server

   As discussed above, a client router will advertise its Adj-NHIB-Out
   to the route server. The route server should use this information as
   input to its own decision process when computing the Adj-RIB-Out for
   this peer. This peer-dependent Adj-RIB-Out is then advertised to
   this peer. In particular, the route server MUST exclude any routes
   whose next hops the client has declared to be NH_UNREACHABLE. The
   route server MAY also consider the advertised cost to be the "IGP
   cost" (Section 9.1 of [RFC4271]) when doing this computation.

4. Utilizing Next Hop Unreachability Information at Client Routers

   A client router detecting an unreachable next hop signals this
   information to the route server as described above. Also, it treats
   the routes as unresolvable as per Section 9.1.2.1 of [RFC4271] and
   proceeds with route selection as normal.

   Changes in next hop reachability via these mechanisms should receive
   some amount of consideration toward avoiding unnecessary route
   flapping. Similar mechanisms exist in IGP implementations and should
   be applied to this scenario.

5. Recommendations for Using BFD

   The RECOMMENDED way for a client router to confirm that data plane
   connectivity to its next hops is available is to use BFD in
   asynchronous mode. Echo mode MAY be used if both client routers
   running a BFD session support it. The use of authentication in BFD
   is OPTIONAL as there is a certain level of trust between the
   operators of the client routers at a particular IXP. If trust cannot
   be assumed, it is recommended to use pair-wise keys (how this can be
   achieved is outside the scope of this document).
   The TTL/hop limit values as described in Section 5 of [RFC5881] MUST
   be obeyed in order to secure BFD sessions from packets coming from
   outside the IXP.

   There is interdependence between the functionality described in this
   document and BFD from an administrative point of view. To streamline
   behaviour of different implementations the following is RECOMMENDED:

   o If BFD is administratively shut down by the administrator of a
     client router then the functionality described in this document
     MUST also be administratively shut down.
   o If the administrator enables the functionality described in this
     document on a client router then BFD MUST be automatically
     enabled.

   The following values of the BFD configuration of client routers (see
   Section 6.8.1 of [RFC5880]) are RECOMMENDED in order to allow a fast
   detection of lost data plane connectivity:

   o DesiredMinTxInterval: 1,000,000 (microseconds)
   o RequiredMinRxInterval: 1,000,000 (microseconds)
   o DetectMult: 3

   With these values, a loss of connectivity is detected after roughly
   three seconds (DetectMult times the one-second interval). The
   configuration values above are a trade-off between fast detection of
   lost data plane connectivity and the load client routers must handle
   to keep up the BFD communication. Selecting smaller
   DesiredMinTxInterval and RequiredMinRxInterval values generates lots
   of BFD packets, especially at larger IXPs with many hundreds of
   client routers.

   The configuration values above are selected in order to tolerate
   brief interruptions on the data plane. With more aggressive values,
   a BFD session that detects a brief data plane interruption to a
   particular client router would cause the route server to be signalled
   to remove routes from that client router and, shortly afterwards, to
   add the routes again. This is disruptive and computationally
   expensive for the route server.

   The configuration values above are also partially impacted by BGP
   advertisement time in reaction to events from BFD.
   If the configuration values are selected so that BFD detects data
   plane interruptions much faster than the BGP advertisement time, data
   plane connectivity flaps could be detected by BFD without the route
   server being informed of them, because BGP is not able to transport
   this information fast enough.

   As discussed, finding good configuration values is hard, so a client
   router administrator MAY select better suited values depending on the
   special needs of the particular deployment.

6. Bootstrapping

   When the route server starts, it does not know anything about the
   connectivity state between client routers. The route server
   therefore optimistically assumes that all client routers are able to
   reach each other unless told otherwise.

7. Other Considerations

   For purposes of routing stability, implementations may wish to apply
   hysteresis ("holddown") to next hops that have transitioned from
   reachable to unreachable and back.

8. Normative References

   [I-D.ietf-idr-bgp-nh-cost]
              Varlashkin, I. and R. Raszuk, "Carrying next-hop cost
              information in BGP", draft-ietf-idr-bgp-nh-cost-01 (work
              in progress), March 2012.

   [I-D.ietf-idr-ix-bgp-route-server]
              Jasinska, E., Hilliard, N., Raszuk, R., and N. Bakker,
              "Internet Exchange Route Server",
              draft-ietf-idr-ix-bgp-route-server-06 (work in progress),
              December 2014.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2439]  Villamizar, C., Chandra, R., and R. Govindan, "BGP Route
              Flap Damping", RFC 2439, November 1998.

   [RFC4271]  Rekhter, Y., Li, T., and S. Hares, "A Border Gateway
              Protocol 4 (BGP-4)", RFC 4271, January 2006.

   [RFC5880]  Katz, D. and D. Ward, "Bidirectional Forwarding Detection
              (BFD)", RFC 5880, June 2010.

   [RFC5881]  Katz, D. and D.
              Ward, "Bidirectional Forwarding Detection (BFD) for IPv4
              and IPv6 (Single Hop)", RFC 5881, June 2010.

Authors' Addresses

   Randy Bush
   Internet Initiative Japan
   5147 Crystal Springs
   Bainbridge Island, Washington 98110
   US

   Email: randy@psg.com

   Jeffrey Haas
   Juniper Networks, Inc.
   1194 N. Mathilda Ave.
   Sunnyvale, CA 94089
   US

   Email: jhaas@juniper.net

   John G. Scudder
   Juniper Networks, Inc.
   1194 N. Mathilda Ave.
   Sunnyvale, CA 94089
   US

   Email: jgs@juniper.net

   Arnold Nipper
   DE-CIX Management GmbH
   Lichtstrasse 43i
   Cologne 50825
   Germany

   Email: arnold.nipper@de-cix.net

   Thomas King (editor)
   DE-CIX Management GmbH
   Lichtstrasse 43i
   Cologne 50825
   Germany

   Email: thomas.king@de-cix.net