Network Working Group                                     R. Raszuk, Ed.
Internet-Draft                                              Bloomberg LP
Intended status: Standards Track                           K. Patel, Ed.
Expires: December 2, 2016                                   B. Pithawala
                                                              A. Sajassi
                                                           Cisco Systems
                                                              E. Osborne
                                                                 Level 3
                                                                L. Jalil
                                                                 VeriZon
                                                               J. Uttaro
                                                                     ATT
                                                            May 31, 2016


                          BGP vector routing.
                draft-patel-raszuk-bgp-vector-routing-07

Abstract

   Network architectures have begun to shift from pure destination based
   routing to service aware routing. Operator requirements in this
   space include forcing traffic through particular service nodes (e.g.
   firewall, NAT) or transit nodes (e.g. segments). This document
   proposes an enhancement to BGP to accommodate these new requirements.

   This document proposes a pure control plane solution which allows
   traffic to be routed via an ordered set of transit points (links,
   nodes or services) on the way to the traffic's destination, with no
   change in the forwarding plane. This approach is in contrast to
   other proposals in this space, which provide similar capabilities via
   modifications to the forwarding plane.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on December 2, 2016.

Copyright Notice

   Copyright (c) 2016 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Protocol Extensions
     2.1.  BGP Vector Node Attribute
   3.  Operation
   4.  Use case example
   5.  Deployment considerations
   6.  IANA Considerations
   7.  Security considerations
   8.  Acknowledgements
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Authors' Addresses

1.  Introduction

   This document addresses two problems. The first is traffic
   engineering: by providing specific paths over which traffic must
   flow, an operator can modify the traffic pattern on their network to
   better address congestion. While this has typically been
   accomplished by constructing MPLS-TE LSPs and mapping traffic onto
   them, the overhead of the MPLS control plane and the requirement to
   use the MPLS data plane can pose an operational issue for some
   network operators such as data center providers or enterprises.

   The second can be thought of as services engineering: by providing
   an ordered list of service nodes that a particular destination's
   traffic must traverse, an operator can add services (e.g. NATs, load
   balancers, firewalls) along the forwarding path towards a specific
   destination. As services such as NATs, firewalls and load balancers
   move to a cloud based model, a need arises to discover, prioritize
   and chain these services. The draft draft-keyupate-bgp-services-02
   [I-D.keyupate-bgp-services] describes extensions to BGP that
   facilitate auto discovery of services within the network. This
   draft proposes an extension to BGP that facilitates prioritizing and
   chaining of services within a network. Since service chaining is
   facilitated using the BGP control plane, it can readily be applied
   to IP-only tunneling encapsulations for network virtualization such
   as VXLAN and NVGRE.

   In either case, this document refers to the use of the proposed BGP
   extension as Service Chaining.

   To facilitate Service Chaining, this document defines a new BGP
   attribute known as the BGP Vector Node attribute. The BGP Vector
   Node attribute consists of an ordered list of IP transit hops that
   need to be traversed before the packet is forwarded to its BGP NEXT
   HOP.
   The information carried in the ordered list of Vector Nodes is used
   to augment the NEXT HOP information for the BGP prefixes as carried
   in the MP_REACH attribute. This draft specifies rules for
   BGP-speaking traffic forwarders (i.e. PEs and midpoint nodes) to
   replace the NEXT HOP information in their RIB/FIB with an
   intermediate node supplied by the BGP Vector Node attribute.

2.  Protocol Extensions

   This document describes a BGP attribute known as the BGP Vector Node
   attribute, along with rules for identifying an intermediate next-hop
   from the BGP Vector Node attribute.

2.1.  BGP Vector Node Attribute

   The BGP Vector Node attribute is a new BGP optional transitive
   attribute. The attribute type code for the Vector Node attribute is
   to be assigned by IANA. The value field of the Vector Node attribute
   is defined as a set of one or more Vector Node TLVs.

   The Vector Node TLVs within a Vector Node attribute are defined as
   follows:

   Type 1 TLV:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |             TYPE              |            LENGTH             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       4 OCTET AS NUMBER                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |              AFI              |     SAFI      |   RESERVED    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ~                  VECTOR/SERVICE NODE ADDRESS                  ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                    Figure 1: Vector Node TLV Type 1

   TYPE: Two octets encoding the Vector Node TLV Type. Type 1 contains
   a vector or service node address which packets should traverse. The
   address is part of the IGP, and the node is part of the BGP mesh.

   LENGTH: Two octets encoding the length in octets of the Vector Node
   TLV, excluding the type and length fields. The Length is encoded as
   an unsigned binary integer.

   4 OCTET AS NUMBER: 4 octet AS number, or zero padded 2 octet AS
   number, of the autonomous system to which the Vector Node Address
   belongs.

   AFI: Address Family Identifier (16 bits).

   SAFI: Subsequent Address Family Identifier (8 bits). Should be set
   to 1 (unicast).

   RESERVED: One octet reserved for special flags.

   VECTOR/SERVICE NODE ADDRESS: The IPv4 or IPv6 unicast (or anycast)
   address of the transit router.

   Type 2 TLV:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |             TYPE              |            LENGTH             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       4 OCTET AS NUMBER                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |              AFI              |     SAFI      |   RESERVED    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ~                      VECTOR NODE ADDRESS                      ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |              AFI              |     SAFI      |   RESERVED    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ~                     SERVICE NODE ADDRESS                      ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                    Figure 2: Vector Node TLV Type 2

   TYPE: Two octets encoding the Vector Node TLV Type. Type 2 contains
   a vector node address and a service node address which packets
   should traverse. The vector node address is part of the IGP, and
   the vector node is part of the BGP mesh. The service node is
   directly attached to a vector node, is reachable from that vector
   node and does not run any BGP sessions.

   LENGTH: Two octets encoding the length in octets of the Vector Node
   TLV, excluding the type and length fields. The Length is encoded as
   an unsigned binary integer.

   4 OCTET AS NUMBER: 4 octet AS number, or zero padded 2 octet AS
   number, of the autonomous system to which the Vector Node Address
   belongs.

   AFI: Address Family Identifier (16 bits).

   SAFI: Subsequent Address Family Identifier (8 bits).
   Should be set to 1 (unicast).

   RESERVED: One octet reserved for special flags.

   VECTOR/SERVICE NODE ADDRESS: The IPv4 or IPv6 unicast (or anycast)
   address of a transit node and of a service appliance, respectively.
   The vector node and the service node may belong to different address
   families.

3.  Operation

   The BGP Vector Node attribute is used to augment a prefix or set of
   prefixes carried in a given BGP UPDATE message with information
   about a set of nodes, which is intended to influence the computation
   of forwarding paths for those destinations. The Vector Node
   attribute can be used within a provider's IBGP network and across
   EBGP networks. The BGP Vector Node attribute is an optional
   transitive attribute that can be applied to any address family
   within BGP where there is a need to route traffic through an ordered
   list of transit nodes.

   The BGP Vector Node attribute consists of one or more Vector Node
   TLVs. The ordered list of Vector Node TLVs indicates an ordered
   list of nodes that need to transit or process the data packets sent
   towards the destination prefix. The creation of the list of Vector
   Nodes is outside the scope of this document, but the list is
   expected to be created through a Command Line Interface (CLI) on a
   router, by an orchestrator system, or by some other automated SDN
   computing engine.

   The Vector Node attribute may be advertised by an egress BGP speaker
   or injected by a non-egress node such as a BGP Route Reflector. It
   must be noted that in the event of non-egress injection (e.g. by a
   route reflector) extra care must be taken to achieve routing
   consistency.

   Each BGP speaker which supports the BGP Vector Node attribute needs
   to process the attribute upon receipt and modify the NEXT HOP that
   the node uses when installing the prefix in its local RIB/FIB.
   The rules to modify the NEXT HOP using the Vector Node attribute are
   as follows:

   1 - Each BGP speaker involved in BGP Vector Routing only examines
   those TLVs which contain its own AS number. In an event where the
   BGP Vector Node attribute is missing, or if no Vector Node TLV with
   an AS number matching the BGP speaker's AS is present (the BGP
   speaker fails the AS check criteria), the BGP speaker MUST use the
   NEXT HOP from the received BGP MP_REACH attribute, or from the BGP
   NEXT HOP attribute in the absence of an MP_REACH attribute.

   2 - In an event where the BGP speaker passes the AS check criteria
   for a given Vector TLV, the BGP speaker MUST use the first Node
   Address from the Vector Node attribute TLVs as the NEXT HOP of the
   prefix when installing the route in its local RIB/FIB, provided that
   it does not find its own IGP node address (typically a loopback
   address) among the Vector Node addresses and that none of the Vector
   Node addresses belongs to any of its connected interface subnets or
   is covered by any of its locally configured static routes.

   3 - In an event where the BGP speaker passes the AS check criteria
   for a given Vector TLV, and if the BGP speaker finds its IGP node
   address (typically a loopback address) as one of the Vector Node
   addresses, or finds one of its connected addresses as one of the
   Vector Node addresses, or if a Vector Node address is covered by any
   of its locally configured static routes, then it MUST use as the
   NEXT HOP the next eligible Vector Node address from the Vector Node
   TLVs when installing the route in the RIB/FIB. In addition,
   depending on the type of Vector Node TLV, it may need to flag such a
   RIB/FIB entry with a local punt or redirection, for example to force
   Service Processing for a type 2 Vector Node TLV.

   4 - In an event where the BGP speaker passes the AS check criteria
   for a given Vector TLV, and if the BGP speaker finds its IGP node
   address (typically a loopback address) as one of the Vector Node
   addresses, or finds one of its connected addresses as one of the
   Vector Node addresses, or if a Vector Node address is covered by any
   of its locally configured static routes, and if the Vector Node
   address found is the last address in the TLV, then the BGP speaker
   MUST use as the NEXT HOP the NEXT HOP address from the received BGP
   MP_REACH attribute, or from the BGP NEXT HOP attribute in the
   absence of an MP_REACH attribute.

4.  Use case example

   As an example, consider the following scenario where VM1 attached to
   NVE1 needs to communicate with H1 attached to PE1. However, packets
   from VM1 to H1 need the services of S1 off of NVE3 and S2 off of
   NVE4, respectively. Therefore, the service chain VM1 -> S1 -> S2 ->
   H1 needs to be formed for packets from VM1 to H1.

                                        +---+  Enterprise Site 1
                                        |PE1|----- H1
                                        +---+
                                         /
                             ,---------.         Enterprise Site 2
                           ,'           `.      +---+
           ,---------.   /(      IP       )---|PE2|----- H2
          '   DCN 3   `./  `.   Core    ,'      +---+
           `-+------+'       `-+------+'
           __/__             / /     \ \
          :NVE4 :      +------+       \ \
          '-----'   ,--|ABR1  |.      +------+
             |    ,'   +------+ `.    |ABR2  |--.
            VM6  (      DCN 1     ) ,'+------+   `.
                  `.            ,' (     DCN 2     )
                    `-+------+'     `.           ,'
                   __/__  __\__       `-+------+'
                  :NVE1 : :NVE2 :    __/__  __\__
                  '-----' '-----'   :NVE3 : :NVE4 :
                   |   |     |      '-----' '-----'
                  VM1 VM2   VM3      |   |     |
                                    S1  VM5    S2

   Let us assume VM1, VM3, S1, S2, and H1 are part of the same VPN and
   the same Autonomous System. PE1 advertises host route H1 with a
   Vector Node attribute of [I1, I2], where I1 and I2 are interface
   subnet addresses corresponding to service nodes S1 and S2
   respectively.
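
   The next-hop selection rules of Section 3 can be sketched in Python
   as follows. This is only an illustrative reading of rules 1 to 4,
   not a normative algorithm. The names VectorNodeTLV and
   select_next_hop, the AS number 64500 and the documentation-range
   addresses standing in for I1, I2 and PE1 are all assumptions made
   for this example. The sketch also assumes that the first locally
   matching Vector Node address decides the outcome, and it does not
   model the punt or redirection flag for type 2 TLVs.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Sequence


@dataclass(frozen=True)
class VectorNodeTLV:
    """Hypothetical, simplified view of a Vector Node TLV (AS number and address only)."""
    asn: int
    address: str


def select_next_hop(tlvs: Sequence[VectorNodeTLV], local_asn: int,
                    local_prefixes: Sequence[str], bgp_next_hop: str) -> str:
    """Return the next hop to install for a prefix, following rules 1-4.

    local_prefixes stands for the speaker's own IGP node address (as a
    host prefix), its connected subnets and its static routes.
    """
    # Rule 1: only TLVs carrying our own AS number are examined; with no
    # such TLV, the received BGP next hop is used unchanged.
    nodes = [ip_address(t.address) for t in tlvs if t.asn == local_asn]
    if not nodes:
        return bgp_next_hop

    local_nets = [ip_network(p) for p in local_prefixes]
    for index, node in enumerate(nodes):
        if any(node in net for net in local_nets):
            # Rule 4: we are the last vector node, so use the BGP next hop.
            if index == len(nodes) - 1:
                return bgp_next_hop
            # Rule 3: we are on the list, so use the next vector node.
            return str(nodes[index + 1])

    # Rule 2: we are not on the list, so steer towards the first vector node.
    return str(nodes[0])


if __name__ == "__main__":
    # Assumed stand-ins: I1 = 192.0.2.1 (behind NVE3), I2 = 198.51.100.1
    # (behind NVE4), PE1 = 203.0.113.1, all in AS 64500.
    chain = [VectorNodeTLV(64500, "192.0.2.1"),
             VectorNodeTLV(64500, "198.51.100.1")]
    pe1 = "203.0.113.1"
    print("NVE1:", select_next_hop(chain, 64500, ["10.0.0.1/32"], pe1))
    print("NVE3:", select_next_hop(chain, 64500, ["192.0.2.0/24"], pe1))
    print("NVE4:", select_next_hop(chain, 64500, ["198.51.100.0/24"], pe1))
```

   Under these assumptions the three calls return I1, I2 and the PE1
   address respectively, which is the behaviour expected of NVE1, NVE3
   and NVE4 in this example.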

   When NVE1 or NVE2 receives this advertisement, it applies rule (2)
   and subsequently sets the next hop address of H1 to I1,
   corresponding to service node S1. Therefore, when it receives
   packets destined to H1, it encapsulates the packets using any
   existing tunneling mechanism and forwards them to the I1 address in
   NVE3.

   When NVE3 receives this advertisement, it applies rule (3) by
   identifying its interface subnet I1 in the Vector Node attribute and
   subsequently setting the next hop address of H1 to I2, corresponding
   to service node S2. Therefore, when it receives packets from the
   network it forwards them to S1, and when it receives packets
   destined to H1 from its attached service node S1, it encapsulates
   the packets using any existing tunneling mechanism and forwards them
   to the I2 address in NVE4.

   When NVE4 receives this advertisement, it applies rule (4) by
   identifying its interface subnet I2 in the Vector Node attribute,
   and since I2 is the last address in the Vector Node attribute list,
   it sets the next hop address of PE1 (received in the BGP
   advertisement) as the next hop for the prefix. Therefore, when it
   receives packets from the network it forwards them to S2, and when
   it receives packets destined to H1 from its attached service node
   S2, it encapsulates the packets using any existing tunneling
   mechanism and forwards them to PE1.

5.  Deployment considerations

   BGP Vector Routing can be deployed in both intra-domain and
   inter-domain networks without any restriction on the version of IP
   address used as a Vector Node.

   When using BGP Vector Routing together with the BGP multipath
   feature, it is mandatory to ensure consistent imposition of the BGP
   Vector Node attribute for a given prefix or group of prefixes from
   any imposition point in the network.
   When a BGP speaker detects inconsistent content of the BGP Vector
   Routing Attribute across paths of the same prefix, it is required to
   ignore the attribute and log a system warning.

   When BGP Vector Routing marking is applied at multiple points within
   the domain, it is mandatory to ensure that the BGP Vector Routing
   Attribute is applied consistently at all injection points.

   Use of mixed TLV types (type 1 and type 2) is allowed.

   Reachability to BGP Vector Routing Nodes is resolved in exactly the
   same manner as reachability to traditional BGP Next Hops, with the
   help of IGP routing. As such, BGP Vector Routing can use IGP
   Segment Routing rules to reach the next BGP Vector Node.

   For deployment simplicity, this specification assumes that BGP
   Vector Routing is used with some form of IGP encapsulation between
   ingress, egress and all transit or service nodes. In particular, IP
   encapsulation, MPLS encapsulation or Segment Routing can be used to
   transit packets within any IGP domain where BGP may not be present
   or where BGP routers have not been upgraded with the new
   functionality.

   When packet handling needs to be more selective than the entire IP
   destination (for example, to separate port 80 HTTP traffic from
   delay-sensitive packets), the BGP Vector Node attribute can be
   attached to a BGP update containing Dissemination of Flow
   Specification Rules, RFC 5575 [RFC5575], where the traffic action is
   defined by a new E bit (Encapsulate).

     40  41  42  43  44  45  46  47
   +---+---+---+---+---+---+---+---+
   |     reserved      | E | S | T |
   +---+---+---+---+---+---+---+---+

   E-bit - defines a new action which results in encapsulation of
   matching packets towards the next vector node as specified in the
   BGP Vector Node attribute.

   The rest of the encoding, as well as the validation rules, remain
   unchanged as defined in RFC 5575 [RFC5575].
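
   As a minimal sketch of the bit layout shown above, the following
   Python fragment builds the 6-octet traffic-action value with the
   proposed E bit. The masks are read directly off the diagram (bit 47
   is the least significant bit of the last octet); the function name
   traffic_action_value and its parameters are hypothetical and are not
   defined by this document or by RFC 5575.

```python
# Masks within the last octet (bits 40-47) of the 6-octet value field,
# taken from the diagram: reserved (40-44), E (45), S (46), T (47).
T_BIT = 0x01  # bit 47, Terminal Action
S_BIT = 0x02  # bit 46, Sample
E_BIT = 0x04  # bit 45, Encapsulate (the new bit proposed here)


def traffic_action_value(encapsulate: bool = False, sample: bool = False,
                         terminal: bool = False) -> bytes:
    """Return a 6-octet traffic-action value with the requested flags set.

    Octets 0-4 and the reserved bits of the last octet are left at zero.
    """
    flags = 0
    if encapsulate:
        flags |= E_BIT
    if sample:
        flags |= S_BIT
    if terminal:
        flags |= T_BIT
    return bytes(5) + bytes([flags])


if __name__ == "__main__":
    # Request encapsulation towards the next vector node only.
    print(traffic_action_value(encapsulate=True).hex())  # prints 000000000004
```

   Keeping the flags as masks on a single octet mirrors the diagram and
   leaves the reserved bits untouched.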

   When more than one node needs to act as a Vector Node at any point
   in the path (for example for service load balancing or multipath),
   it is expected that the Vector Node address will be an anycast
   address, which via the IGP allows traffic to be spread across N
   Vector Nodes.

6.  IANA Considerations

   This document defines a new BGP attribute known as the BGP Vector
   Node attribute. The code point for the new BGP Vector Node
   attribute has to be assigned by IANA from the BGP Path Attributes
   registry.

7.  Security considerations

   No new security issues are introduced to the BGP protocol by this
   specification.

8.  Acknowledgements

   The authors would like to acknowledge Dave Qi, Brian Field, Bruno
   Decraene and Ahmed Bashandy for their valuable input, review and
   comments.

9.  References

9.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC4223]  Savola, P., "Reclassification of RFC 1863 to Historic",
              RFC 4223, DOI 10.17487/RFC4223, October 2005.

   [RFC4271]  Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A
              Border Gateway Protocol 4 (BGP-4)", RFC 4271,
              DOI 10.17487/RFC4271, January 2006.

9.2.  Informative References

   [I-D.keyupate-bgp-services]
              Patel, K., Medved, J., and B. Pithawala, "Service
              Advertisement using BGP", draft-keyupate-bgp-services-02
              (work in progress), April 2013.

   [I-D.previdi-filsfils-isis-segment-routing]
              Previdi, S., Filsfils, C., Bashandy, A., Horneffer, M.,
              Decraene, B., Litkowski, S., Milojevic, I., Shakir, R.,
              Ytti, S., Henderickx, W., and J. Tantsura, "Segment
              Routing with IS-IS Routing Protocol", draft-previdi-
              filsfils-isis-segment-routing-02 (work in progress), March
              2013.

   [RFC5575]  Marques, P., Sheth, N., Raszuk, R., Greene, B., Mauch, J.,
              and D. McPherson, "Dissemination of Flow Specification
              Rules", RFC 5575, DOI 10.17487/RFC5575, August 2009.

Authors' Addresses

   Robert Raszuk (editor)
   Bloomberg LP
   731 Lexington Ave
   New York City, NY  10022
   USA

   Email: robert@raszuk.net


   Keyur Patel (editor)
   Cisco Systems
   170 West Tasman Drive
   San Jose, CA  95134
   US

   Email: keyupate@cisco.com


   Burjiz Pithawala
   Cisco Systems
   170 West Tasman Drive
   San Jose, CA  95134
   US

   Email: bpithaw@cisco.com


   Ali Sajassi
   Cisco Systems
   170 West Tasman Drive
   San Jose, CA  95134
   US

   Email: sajassi@cisco.com


   Eric Osborne
   Level 3

   Email: eric.osborne@level3.com


   Luay Jalil
   VeriZon
   1201 E Arapaho Rd
   Richardson, Texas  75081
   USA

   Email: luay.jalil@verizon.com


   James Uttaro
   ATT
   200 S. Laurel Ave
   Middletown, NJ  07748
   USA

   Email: uttaro@att.com