SPRING                                                       C. Filsfils
Internet-Draft                                         P. Camarillo, Ed.
Intended status: Informational                       Cisco Systems, Inc.
Expires: December 26, 2020                                         Z. Li
                                                     Huawei Technologies
                                                           S. Matsushima
                                                                SoftBank
                                                             B. Decraene
                                                                  Orange
                                                            D. Steinberg
                                           Lapishills Consulting Limited
                                                               D. Lebrun
                                                                  Google
                                                               R. Raszuk
                                                            Bloomberg LP
                                                                J. Leddy
                                                  Individual Contributor
                                                           June 24, 2020


               Illustrations for SRv6 Network Programming
           draft-filsfils-spring-srv6-net-pgm-illustration-02

Abstract

   This document illustrates how SRv6 Network Programming
   [I-D.ietf-spring-srv6-network-programming] can be used to create
   interoperable and protected overlays with underlay optimization and
   service programming.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on December 26, 2020.

Copyright Notice

   Copyright (c) 2020 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Illustration
     2.1.  Simplified SID allocation
     2.2.  Reference diagram
     2.3.  Basic security
     2.4.  SR-L3VPN
     2.5.  SR-Ethernet-VPWS
     2.6.  SR-EVPN-FXC
     2.7.  SR-EVPN
       2.7.1.  EVPN Bridging
       2.7.2.  EVPN Multi-homing with ESI filtering
       2.7.3.  EVPN Layer-3
       2.7.4.  EVPN Integrated Routing Bridging (IRB)
     2.8.  SR TE for Underlay SLA
       2.8.1.  SR policy from the Ingress PE
       2.8.2.  SR policy at a midpoint
     2.9.  End-to-End policy with intermediate BSID
     2.10. TI-LFA
     2.11. SR TE for Service programming
   3.  Benefits
     3.1.  Seamless deployment
     3.2.  Integration
     3.3.  Security
   4.  Acknowledgements
   5.  Contributors
   6.  References
   Authors' Addresses

1.  Introduction

   Segment Routing leverages the source routing paradigm.  An ingress
   node steers a packet through an ordered list of instructions, called
   segments.  Each one of these instructions represents a function to be
   called at a specific location in the network.  A function is locally
   defined on the node where it is executed and may range from simply
   moving forward in the segment list to any complex user-defined
   behavior.  Network programming consists of combining segment routing
   functions, both simple and complex, to achieve a networking objective
   that goes beyond mere packet routing.

   [I-D.ietf-spring-srv6-network-programming] defines the SRv6 Network
   Programming concept and the main segment routing behaviors.

   This document illustrates how these concepts can be used to enable
   the creation of interoperable overlays with underlay optimization and
   service programming.

   The terminology for this document is defined in
   [I-D.ietf-spring-srv6-network-programming].

2.  Illustration

   We first introduce a simplified SID allocation technique to ease the
   reading of the text.  We then document the reference diagram and
   illustrate the network programming concept through different
   use-cases.  These use-cases have been designed so that they can be
   combined with each other in a straightforward way.

2.1.  Simplified SID allocation

   To simplify the illustration, we assume:

      2001:db8::/32 is an IPv6 block allocated by a RIR to the operator

      2001:db8:0::/48 is dedicated to the internal address space

      2001:db8:cccc::/48 is dedicated to the internal SRv6 SID space

      A locator is expressed in 64 bits and a function is expressed in
      16 bits

      Node k has a classic IPv6 loopback address 2001:db8::k/128 which
      is advertised in the IGP

      Node k has 2001:db8:cccc:k::/64 for its local SID space.  Its SIDs
      will be explicitly assigned from that block

      Node k advertises 2001:db8:cccc:k::/64 in its IGP

      Function :1:: (function 1, for short) represents the End function
      with PSP support

      Function :C2:: (function C2, for short) represents the End.X
      function towards neighbor 2

   Each node k has:

      An explicit SID instantiation 2001:db8:cccc:k:1::/128 bound to an
      End function with additional support for PSP

      An explicit SID instantiation 2001:db8:cccc:k:Cj::/128 bound to an
      End.X function to neighbor j with additional support for PSP

2.2.  Reference diagram

   Let us assume the following topology where all the links have IGP
   metric 10 except the link 3-4 which is 100.

   Nodes A, B and 1 to 8 are considered within the network domain while
   nodes CE-A, CE-B and CE-C are outside the domain.

           CE-B
               \
                3------4---5
                |       \ /
                |        6
                |       /
        A--1--- 2------7---8--B
          /                 \
      CE-A                   CE-C
    Tenant100              Tenant100 with
                             IPv4 20/8

                      Figure 1: Reference topology

2.3.  Basic security

   Any edge node such as 1 would be configured with an ACL on each of
   its external interfaces (e.g. from CE-A) which drops any traffic with
   SA or DA in 2001:db8:cccc::/48.  See SEC-1.

   Any core node such as 6 could be configured with an ACL with the
   SEC-2 behavior "IF (DA == LocalSID) && (SA is not in 2001:db8:0::/48
   or 2001:db8:cccc::/48) THEN drop".

   SEC-3 protection is a default property of SRv6.  A SID must be
   explicitly instantiated.  In our illustration, the only available
   SIDs are those explicitly instantiated.

2.4.  SR-L3VPN

   Let us illustrate the SR-L3VPN use-case applied to IPv4.

   Nodes 1 and 8 are each configured with tenant 100 and are connected
   to CE-A and CE-C respectively.

   Node 8 is configured with a locally instantiated End.DT4 SID
   2001:db8:cccc:8:D100:: bound to tenant IPv4 table 100.

   Via BGP signaling or an SDN-based controller, Node 1's tenant-100
   IPv4 table is programmed with an IPv4 SR-VPN route 20/8 via SRv6
   policy <2001:db8:cccc:8:D100::>.

   When 1 receives a packet P from CE-A destined to 20.20.20.20, 1 looks
   up 20.20.20.20 in its tenant-100 IPv4 table and finds an SR-VPN entry
   20/8 via SRv6 policy <2001:db8:cccc:8:D100::>.  As a consequence, 1
   pushes an outer IPv6 header with SA=2001:db8::1,
   DA=2001:db8:cccc:8:D100:: and NH=4.  1 then forwards the resulting
   packet on the shortest path to 2001:db8:cccc:8::/64.

   When 8 receives the packet, 8 matches the DA in its "My SID Table",
   finds the bound function End.DT4(100) and confirms NH=4.
   As a result, 8 decaps the outer header, looks up the inner IPv4 DA in
   its tenant-100 IPv4 table, and forwards the (inner) IPv4 packet
   towards CE-C.

   The reader can easily infer all the other SR-IPVPN instantiations:

   +---------------------------------+----------------------------------+
   | Route at ingress PE(1)          | SR-VPN Egress SID of egress PE(8)|
   +---------------------------------+----------------------------------+
   | IPv4 tenant route with egress   | End.DT4 function bound to        |
   | tenant table lookup             | IPv4-tenant-100 table            |
   +---------------------------------+----------------------------------+
   | IPv4 tenant route without egress| End.DX4 function bound to        |
   | tenant table lookup             | CE-C (IPv4)                      |
   +---------------------------------+----------------------------------+
   | IPv6 tenant route with egress   | End.DT6 function bound to        |
   | tenant table lookup             | IPv6-tenant-100 table            |
   +---------------------------------+----------------------------------+
   | IPv6 tenant route without egress| End.DX6 function bound to        |
   | tenant table lookup             | CE-C (IPv6)                      |
   +---------------------------------+----------------------------------+

2.5.  SR-Ethernet-VPWS

   Let us illustrate the SR-Ethernet-VPWS use-case.

   Node 8 is configured with a locally instantiated End.DX2 SID
   2001:db8:cccc:8:DC2C:: bound to local attachment circuit {ethernet
   CE-C}.

   Via BGP signalling or an SDN controller, node 1 is programmed with an
   Ethernet VPWS service for its local attachment circuit {ethernet
   CE-A} with remote endpoint 2001:db8:cccc:8:DC2C::.

   When 1 receives a frame F from CE-A, node 1 pushes an outer IPv6
   header with SA=2001:db8::1, DA=2001:db8:cccc:8:DC2C:: and NH=59.
   Note that no additional header is pushed.  1 then forwards the
   resulting packet on the shortest path to 2001:db8:cccc:8::/64.
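
   As an illustration only, the encapsulation performed by node 1 can be
   sketched in Python.  The helper name encap_l2 and the hop limit of 64
   are assumptions made for this sketch and do not come from this
   document; the addresses and the next-header value 59 are the ones
   used in the text above.

```python
import ipaddress
import struct

# Next-header value used in the text for an encapsulated Ethernet frame.
NH_NO_NEXT_HEADER = 59


def encap_l2(frame: bytes, sa: str, da: str, hop_limit: int = 64) -> bytes:
    """Prepend a plain 40-byte IPv6 header (no SRH) to an Ethernet frame.

    Hypothetical helper: it only models the header push described above.
    """
    version_tc_flow = 6 << 28  # version 6, traffic class 0, flow label 0
    header = struct.pack(
        "!IHBB16s16s",
        version_tc_flow,
        len(frame),                      # payload length
        NH_NO_NEXT_HEADER,               # next header
        hop_limit,                       # assumed value
        ipaddress.IPv6Address(sa).packed,
        ipaddress.IPv6Address(da).packed,
    )
    return header + frame


# Node 1 encapsulating a 64-byte placeholder frame towards the End.DX2 SID.
packet = encap_l2(bytes(64), "2001:db8::1", "2001:db8:cccc:8:DC2C::")
```

   Because a single SID is enough to reach the egress function, the
   sketch builds no SRH, which matches the remark that no additional
   header is pushed.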

   When 8 receives the packet, 8 matches the DA in its "My SID Table"
   and finds the bound function End.DX2.  After confirming that
   next-header=59, 8 decaps the outer IPv6 header and forwards the inner
   Ethernet frame towards CE-C.

   The reader can easily infer the Ethernet VPWS use-case:

   +------------------------+-----------------------------------+
   | Route at ingress PE(1) | SR-VPN Egress SID of egress PE(8) |
   +------------------------+-----------------------------------+
   | Ethernet VPWS          | End.DX2 function bound to         |
   |                        | CE-C (Ethernet)                   |
   +------------------------+-----------------------------------+

2.6.  SR-EVPN-FXC

   Let us illustrate the SR-EVPN-FXC use-case (Flexible cross-connect
   service).

   Node 8 is configured with a locally instantiated End.DX2V SID
   2001:db8:cccc:8:DC2C:: bound to the L2 table T1.  Node 8 is also
   configured with local attachment circuits {ethernet CE1-C VLAN:100}
   and {ethernet CE2-C VLAN:200} in table T1.

   Via an SDN controller or BGP-based signalling, node 1 is programmed
   with an EVPN-FXC service for its local attachment circuit {ethernet
   CE-A} with remote endpoint 2001:db8:cccc:8:DC2C::.  For this purpose,
   the EVPN Type-1 route is used.

   When node 1 receives a frame F from CE-A, it pushes an outer IPv6
   header with SA=2001:db8::1, DA=2001:db8:cccc:8:DC2C:: and NH=59.
   Note that no additional header is pushed.  Node 1 then forwards the
   resulting packet on the shortest path to 2001:db8:cccc:8::/64.

   When node 8 receives the packet, it matches the IP DA in its "My SID
   Table" and finds the bound function End.DX2V.  After confirming that
   next-header=59, node 8 decaps the outer IPv6 header, performs a VLAN
   lookup in table T1 and forwards the inner Ethernet frame to the
   matching interface: a frame with VLAN 100 is forwarded to CE1-C and a
   frame with VLAN 200 is forwarded to CE2-C.
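
   A minimal sketch of the VLAN lookup just described, assuming that the
   inner frame carries a single IEEE 802.1Q tag (TPID 0x8100).  The
   function name end_dx2v and the dictionary used to model table T1 are
   hypothetical and only serve this illustration.

```python
import struct
from typing import Dict, Optional

# Table T1 of node 8 as described above: VLAN ID -> attachment circuit.
T1: Dict[int, str] = {100: "CE1-C", 200: "CE2-C"}

TPID_8021Q = 0x8100  # assumption: frames carry one 802.1Q tag


def end_dx2v(inner_frame: bytes, next_header: int,
             table: Dict[int, str]) -> Optional[str]:
    """Return the attachment circuit for the inner frame, or None to drop."""
    if next_header != 59:
        return None                      # text requires next-header=59
    if len(inner_frame) < 16:
        return None                      # too short to hold a VLAN tag
    # Bytes 0-11 are the destination and source MAC addresses.
    tpid, tci = struct.unpack("!HH", inner_frame[12:16])
    if tpid != TPID_8021Q:
        return None                      # untagged frame: no VLAN to match
    vlan_id = tci & 0x0FFF               # low 12 bits of the TCI
    return table.get(vlan_id)


def tagged_frame(vlan_id: int) -> bytes:
    """Build a placeholder 802.1Q-tagged frame for the example."""
    return bytes(12) + struct.pack("!HH", TPID_8021Q, vlan_id) + bytes(48)


circuit_100 = end_dx2v(tagged_frame(100), 59, T1)
circuit_200 = end_dx2v(tagged_frame(200), 59, T1)
```

   Modelling T1 as a mapping from VLAN ID to attachment circuit keeps
   the single SID 2001:db8:cccc:8:DC2C:: shared by both circuits, which
   is the point of the flexible cross-connect service.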

   The reader can easily infer the Ethernet FXC use-case:

   +---------------------------------+------------------------------------+
   | Route at ingress PE (1)         | SR-VPN Egress SID of egress PE (8) |
   +---------------------------------+------------------------------------+
   | EVPN-FXC                        | End.DX2V function bound to         |
   |                                 | CE1-C / CE2-C (Ethernet)           |
   +---------------------------------+------------------------------------+

2.7.  SR-EVPN

   This section details some particular use-cases of SR-EVPN: bridging
   (unicast and multicast), multi-homing with ESI filtering, L3 EVPN and
   EVPN-IRB.

2.7.1.  EVPN Bridging

   Let us illustrate the SR-EVPN unicast and multicast bridging.

   Nodes 1, 3 and 8 are configured with an EVPN bridging service (E-LAN
   service).

   Node 1 is configured with a locally instantiated End.DT2U SID
   2001:db8:cccc:1:D2AA:: bound to a local L2 table T1 where EVPN is
   enabled.  This SID will be used to attract unicast traffic.
   Additionally, Node 1 is configured with a locally instantiated
   End.DT2M SID 2001:db8:cccc:1:D2AF:: bound to the same local L2 table
   T1.  This SID will be used to attract multicast traffic.  Node 1 is
   also configured with local attachment circuit {ethernet CE-A
   VLAN:100} associated to table T1.

   A similar instantiation is done at Node 3 and Node 8 resulting in:

   -  Node 1 - My SID table:

      -  End.DT2U SID: 2001:db8:cccc:1:D2AA:: table T1

      -  End.DT2M SID: 2001:db8:cccc:1:D2AF:: table T1

   -  Node 3 - My SID table:

      -  End.DT2U SID: 2001:db8:cccc:3:D2BA:: table T3

      -  End.DT2M SID: 2001:db8:cccc:3:D2BF:: table T3

   -  Node 8 - My SID table:

      -  End.DT2U SID: 2001:db8:cccc:8:D2CA:: table T8

      -  End.DT2M SID: 2001:db8:cccc:8:D2CF:: table T8

   Nodes 1, 3 and 8 exchange the End.DT2M SIDs via BGP-based EVPN Type-3
   routes.
   Upon reception of the EVPN Type-3 routes, each node builds its own
   replication list per L2 table that will be used for ingress BUM
   traffic replication.  The replication lists are the following:

   -  Node 1 - replication list: {2001:db8:cccc:3:D2BF:: and
      2001:db8:cccc:8:D2CF::}

   -  Node 3 - replication list: {2001:db8:cccc:1:D2AF:: and
      2001:db8:cccc:8:D2CF::}

   -  Node 8 - replication list: {2001:db8:cccc:1:D2AF:: and
      2001:db8:cccc:3:D2BF::}

   When node 1 receives a BUM frame F from CE-A, it replicates that
   frame to every node in the replication list.  For node 3, it pushes
   an outer IPv6 header with SA=2001:db8::1, DA=2001:db8:cccc:3:D2BF::
   and NH=59.  For node 8, it performs the same operation but with
   DA=2001:db8:cccc:8:D2CF::.  Note that no additional headers are
   pushed.  Node 1 then forwards the resulting packets on the shortest
   path for each destination.

   When node 3 receives the packet, it matches the DA in its "My SID
   Table" and finds the bound function End.DT2M with its related layer-2
   table T3.  After confirming that next-header=59, node 3 decaps the
   outer IPv6 header and forwards the inner Ethernet frame to all
   layer-2 output interfaces found in table T3.  Similar processing is
   also performed by node 8 upon packet reception.  The same applies to
   any BUM stream coming from CE-B or CE-C.

   Nodes 1, 3 and 8 also perform software MAC learning and exchange MAC
   reachability information (for unicast traffic) among themselves via
   BGP.

   Each learnt MAC is advertised using a BGP-based EVPN Type-2 route.

   When node 1 receives a unicast frame F from CE-A, it learns its
   MAC-SA=CEA in software.  Node 1 transmits that MAC and its associated
   SID 2001:db8:cccc:1:D2AA:: using a BGP-based EVPN Type-2 route to all
   remote nodes.

   When node 3 receives a unicast frame F from CE-B destined to
   MAC-DA=CEA, it performs an L2 lookup on T3 to find the associated
   SID.
   It pushes an outer IPv6 header with SA=2001:db8::3,
   DA=2001:db8:cccc:1:D2AA:: and NH=59.  Node 3 then forwards the
   resulting packet on the shortest path to 2001:db8:cccc:1::/64.
   Similar processing is also performed by node 8.

2.7.2.  EVPN Multi-homing with ESI filtering

   In an L2 network, support for traffic loop avoidance is mandatory.
   An EVPN all-active multi-homing scenario enforces that requirement
   using ESI filtering.  Let us illustrate how it works:

   Nodes 3 and 4 are peering partners of a redundancy group in which the
   access device CE-B is connected to both nodes in an all-active
   multi-homing fashion.  Hence, the topology is the following:

                  CE-B
                 /    \
                3------4---5
                |       \ /
                |        6
                |       /
        A--1--- 2------7---8--B
          /                 \
      CE-A                   CE-C
    Tenant100              Tenant100 with
                             IPv4 20/8

                EVPN ESI filtering - Reference topology

   Nodes 3 and 4 are configured with an EVPN bridging service (E-LAN
   service).

   Node 3 is configured with a locally instantiated End.DT2M SID
   2001:db8:cccc:3:D2BF:: bound to a local L2 table T1 where EVPN is
   enabled.  This SID is also configured with the optional argument
   Arg.FE2 that specifies the attachment circuit.  In particular, node 3
   assigns identifier 0xC1 to {ethernet CE-B}.

   Node 4 is configured with a locally instantiated End.DT2M SID
   2001:db8:cccc:4:D2BF:: bound to a local L2 table T1 where EVPN is
   enabled.  This SID is also configured with the optional argument
   Arg.FE2 that specifies the attachment circuit.  In particular, node 4
   assigns identifier 0xC2 to {ethernet CE-B}.

   Both End.DT2M SIDs are exchanged between nodes via BGP-based EVPN
   Type-3 routes.  Upon reception of EVPN Type-3 routes, each node
   builds its own replication list per L2 table T1.

   On the other hand, the End.DT2M SID arguments (Arg.FE2) are exchanged
   between nodes via the SRv6 VPN SID attached to the BGP-based EVPN
   Type-1 route.
   The BGP ESI-filtering extended community label is set to
   implicit-null [I-D.ietf-bess-srv6-services].

   Upon reception of the EVPN Type-1 and Type-3 routes, node 3 merges
   the End.DT2M SID (2001:db8:cccc:4:D2BF::) with the Arg.FE2
   (0:0:0:0:0:C2::) from node 4 (its peering partner).  This is done by
   a simple bitwise OR operation.  As a result, the replication list on
   node 3 for the PEs 1, 4 and 8 is: {2001:db8:cccc:1:D2AF::;
   2001:db8:cccc:4:D2BF:C2::; 2001:db8:cccc:8:D2CF::}.

   In a similar manner, the replication list on node 4 for the PEs 1, 3
   and 8 is: {2001:db8:cccc:1:D2AF::; 2001:db8:cccc:3:D2BF:C1::;
   2001:db8:cccc:8:D2CF::}.  Note that in this case the SID for PE 3 is
   the bitwise OR of the SID 2001:db8:cccc:3:D2BF:: and the argument
   0:0:0:0:0:C1::.

   When node 3 receives a BUM frame F from CE-B, it replicates that
   frame to remote PEs.  For node 4, it pushes an outer IPv6 header with
   SA=2001:db8::3, DA=2001:db8:cccc:4:D2BF:C2:: and NH=59.  Note that no
   additional header is pushed.  Node 3 then forwards the resulting
   packet on the shortest path to node 4.  Once the packet arrives at
   node 4, the End.DT2M function is executed, forwarding to all L2 OIFs
   except the ones corresponding to identifier 0xC2.

2.7.3.  EVPN Layer-3

   EVPN layer-3 works in exactly the same way as L3VPN.  Please refer to
   Section 2.4.

2.7.4.  EVPN Integrated Routing Bridging (IRB)

   EVPN IRB brings Layer-2 and Layer-3 together.  It uses the BGP-based
   EVPN Type-2 route to achieve Layer-2 intra-subnet and Layer-3
   inter-subnet forwarding.  The EVPN Type-2 route maintains the MAC/IP
   association.

   Node 8 is configured with a locally instantiated End.DT2U SID
   2001:db8:cccc:8:D2C:: used for unicast L2 traffic.  Node 8 is also
   configured with a locally instantiated End.DT4 SID
   2001:db8:cccc:8:D100:: bound to IPv4 tenant table 100.

   Node 1 is configured with the EVPN IRB service.

   Node 8 signals each learned ARP/ND entry to the remote PEs (1, 3) via
   a BGP-based EVPN Type-2 route.  For example, when node 8 receives an
   ARP/ND packet P from a host (20.20.20.20) on CE-C destined to
   10.10.10.10, it learns its MAC-SA=CEC in software.  It also learns
   the ARP/ND entry (IP SA=20.20.20.20) in its cache.  Node 8 transmits
   that MAC/IP and its associated L3 SID (2001:db8:cccc:8:D100::) and L2
   SID (2001:db8:cccc:8:D2C::).

   When node 1 receives from CE-A a packet P sent by a host
   (10.10.10.10) and destined to 20.20.20.20, node 1 looks up its
   tenant-100 IPv4 table and finds an SR-VPN entry for that prefix.  As
   a consequence, node 1 pushes an outer IPv6 header with
   SA=2001:db8::1, DA=2001:db8:cccc:8:D100:: and NH=4.  Node 1 then
   forwards the resulting packet on the shortest path to
   2001:db8:cccc:8::/64.  EVPN inter-subnet forwarding is then achieved.

   When node 1 receives from CE-A a packet P sent by a host
   (10.10.10.11) and destined to 20.20.20.20, node 1 performs a MAC-DA
   lookup in its L2 table T1 to find the associated SID.  It pushes an
   outer IPv6 header with SA=2001:db8::1, DA=2001:db8:cccc:8:D2C:: and
   NH=59.  Note that no additional header is pushed.  Node 1 then
   forwards the resulting packet on the shortest path to
   2001:db8:cccc:8::/64.  EVPN intra-subnet forwarding is then achieved.

2.8.  SR TE for Underlay SLA

2.8.1.  SR policy from the Ingress PE

   Let's assume that node 1's tenant-100 IPv4 route "20/8 via
   2001:db8:cccc:8:D100::" is programmed with a color/community that
   requires low-latency underlay optimization
   [I-D.ietf-spring-segment-routing-policy].

   In such a case, node 1 either computes the low-latency path to the
   egress node itself or delegates the computation to a PCE.

   In either case, the location of the egress PE can easily be found by
   looking for the node that originates the locator comprising the SID
   2001:db8:cccc:8:D100::.
   This can be found in the IGP's LSDB for a single-domain case, and in
   the BGP-LS LSDB for a multi-domain case.

   Let us assume that the TE metric encodes the per-link propagation
   latency.  Let us assume that all the links have a TE metric of 10,
   except link 2-7 which has TE metric 100.

   The low-latency path from 1 to 8 is thus 1-2-3-4-6-7-8.

   This path is encoded in a SID list as: first a hop through
   2001:db8:cccc:3:C4:: and then a hop to 8.

   As a consequence, the SR-VPN entry 20/8 installed in Node 1's
   tenant-100 IPv4 table is: T.Encaps with SRv6 Policy
   <2001:db8:cccc:3:C4::, 2001:db8:cccc:8:D100::>.

   When 1 receives a packet P from CE-A destined to 20.20.20.20, 1 looks
   up 20.20.20.20 in its tenant-100 IPv4 table and finds an SR-VPN entry
   20/8.  As a consequence, 1 pushes an outer header with
   SA=2001:db8::1, DA=2001:db8:cccc:3:C4::, NH=SRH followed by SRH
   (2001:db8:cccc:8:D100::, 2001:db8:cccc:3:C4::; SL=1; NH=4).  1 then
   forwards the resulting packet on the interface to 2.

   2 forwards to 3 along the path to 2001:db8:cccc:3::/64.

   When 3 receives the packet, 3 matches the DA in its "My SID Table"
   and finds the bound function End.X to neighbor 4.  3 notes the PSP
   capability of the SID 2001:db8:cccc:3:C4::.  3 sets the DA to the
   next SID 2001:db8:cccc:8:D100::.  As 3 is the penultimate segment
   hop, it performs PSP and pops the SRH.  3 forwards the resulting
   packet to 4.

   4, 6 and 7 forward along the path to 2001:db8:cccc:8::/64.

   When 8 receives the packet, 8 matches the DA in its "My SID Table"
   and finds the bound function End.DT4(100).  As a result, 8 decaps the
   outer header, looks up the inner IPv4 DA (20.20.20.20) in its
   tenant-100 IPv4 table, and forwards the (inner) IPv4 packet towards
   CE-C.

2.8.2.  SR policy at a midpoint

   Let us analyze a policy applied at a midpoint on a packet without
   SRH.

   Packet P1 is (2001:db8::1, 2001:db8:cccc:8:D100::).

   Let us consider P1 when it is received by node 2 and let us assume
   that node 2 is configured to steer 2001:db8:cccc:8::/64 in a T.Insert
   behavior associated with SR policy <2001:db8:cccc:3:C4::>.

   In such a case, node 2 would send the following modified packet P1 on
   the link to 3:

   (2001:db8::1, 2001:db8:cccc:3:C4::)(2001:db8:cccc:8:D100::,
   2001:db8:cccc:3:C4::; SL=1).

   The rest of the processing is similar to the previous section.

   Let us analyze a policy applied at a midpoint on a packet with an
   SRH.

   Packet P2 is (2001:db8::1,
   2001:db8:cccc:7:1::)(2001:db8:cccc:8:D100::, 2001:db8:cccc:7:1::;
   SL=1).

   Let us consider P2 when it is received by node 2 and let us assume
   that node 2 is configured to steer 2001:db8:cccc:7::/64 in a T.Insert
   behavior associated with SR policy <2001:db8:cccc:3:C4::,
   2001:db8:cccc:5:1::>.

   In such a case, node 2 would send the following modified packet P2 on
   the link to 3:

   (2001:db8::1, 2001:db8:cccc:3:C4::)(2001:db8:cccc:7:1::,
   2001:db8:cccc:5:1::, 2001:db8:cccc:3:C4::;
   SL=2)(2001:db8:cccc:8:D100::, 2001:db8:cccc:7:1::; SL=1)

   Node 3 would send the following packet to 4: (2001:db8::1,
   2001:db8:cccc:5:1::)(2001:db8:cccc:7:1::, 2001:db8:cccc:5:1::,
   2001:db8:cccc:3:C4::; SL=1)(2001:db8:cccc:8:D100::,
   2001:db8:cccc:7:1::; SL=1)

   Node 4 would send the following packet to 5: (2001:db8::1,
   2001:db8:cccc:5:1::)(2001:db8:cccc:7:1::, 2001:db8:cccc:5:1::,
   2001:db8:cccc:3:C4::; SL=1)(2001:db8:cccc:8:D100::,
   2001:db8:cccc:7:1::; SL=1)

   Node 5 would send the following packet to 6: (2001:db8::1,
   2001:db8:cccc:7:1::)(2001:db8:cccc:8:D100::, 2001:db8:cccc:7:1::;
   SL=1)

   Node 6 would send the following packet to 7: (2001:db8::1,
   2001:db8:cccc:7:1::)(2001:db8:cccc:8:D100::, 2001:db8:cccc:7:1::;
   SL=1)

   Node 7 would send the following packet to 8: (2001:db8::1,
   2001:db8:cccc:8:D100::)

2.9.  End-to-End policy with intermediate BSID

   Let us now describe a case where the ingress VPN edge node steers the
   packet destined to 20.20.20.20 towards the egress edge node connected
   to the tenant100 site with 20/8, but via an intermediate SR Policy
   represented by a single routable Binding SID.  Let us illustrate this
   case with an intermediate policy which encodes both underlay
   optimization for low latency and service programming via two
   SR-aware container-based apps.

   Let us assume that the End.B6.Insert SID 2001:db8:cccc:2:B1:: is
   configured at node 2 and is associated with midpoint SR policy
   <2001:db8:cccc:3:C4::, 2001:db8:cccc:9:A1::, 2001:db8:cccc:6:A2::>.

   2001:db8:cccc:3:C4:: realizes the low-latency path from the ingress
   PE to the egress PE.  This is the underlay optimization part of the
   intermediate policy.

   2001:db8:cccc:9:A1:: and 2001:db8:cccc:6:A2:: represent two SR-aware
   NFV applications residing in containers respectively connected to
   nodes 9 and 6.

   Let us assume the following ingress VPN policy for 20/8 in the
   tenant-100 IPv4 table of node 1: T.Encaps with SRv6 Policy
   <2001:db8:cccc:2:B1::, 2001:db8:cccc:8:D100::>.

   This ingress policy will steer the 20/8 tenant-100 traffic towards
   the correct egress PE and via the required intermediate policy that
   realizes the SLA and NFV requirements of this tenant customer.
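
   The header manipulation performed at node 2 when the packet hits the
   Binding SID can be sketched as follows.  This is a toy model written
   for this illustration only: the Packet and SRH classes and the
   function end_b6_insert are hypothetical names, and the rule (insert a
   new SRH holding the policy segments, leave the received SRH
   untouched, set the DA to the first policy segment) is an assumption
   that mirrors the addresses and policies of this section rather than a
   normative definition of End.B6.Insert.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SRH:
    # Segments are stored in reverse order, as in the packet notation of
    # this document: segments[0] is the last segment to be visited.
    segments: List[str]
    sl: int


@dataclass
class Packet:
    sa: str
    da: str
    srhs: List[SRH] = field(default_factory=list)  # outermost SRH first


def end_b6_insert(pkt: Packet, policy: List[str]) -> Packet:
    """Model of the SRH insertion at the Binding SID (assumed behavior)."""
    new_srh = SRH(segments=list(reversed(policy)), sl=len(policy) - 1)
    # The received SRH is kept as is; only the DA and the SRH stack change.
    return Packet(sa=pkt.sa, da=policy[0], srhs=[new_srh] + pkt.srhs)


# Packet sent by node 1 under the ingress policy described above.
from_node_1 = Packet(
    sa="2001:db8::1",
    da="2001:db8:cccc:2:B1::",
    srhs=[SRH(["2001:db8:cccc:8:D100::", "2001:db8:cccc:2:B1::"], sl=1)],
)

# Midpoint policy bound to 2001:db8:cccc:2:B1:: at node 2.
midpoint_policy = [
    "2001:db8:cccc:3:C4::",
    "2001:db8:cccc:9:A1::",
    "2001:db8:cccc:6:A2::",
]

from_node_2 = end_b6_insert(from_node_1, midpoint_policy)
```

   In this model the ingress node only ever handles two SIDs, whatever
   the length of the midpoint policy.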

   Node 1 sends the following packet to 2: (2001:db8::1,
   2001:db8:cccc:2:B1::) (2001:db8:cccc:8:D100::, 2001:db8:cccc:2:B1::;
   SL=1)

   Node 2 sends the following packet to 3: (2001:db8::1,
   2001:db8:cccc:3:C4::) (2001:db8:cccc:6:A2::, 2001:db8:cccc:9:A1::,
   2001:db8:cccc:3:C4::; SL=2)(2001:db8:cccc:8:D100::,
   2001:db8:cccc:2:B1::; SL=1)

   Node 4 sends the following packet to 5: (2001:db8::1,
   2001:db8:cccc:9:A1::) (2001:db8:cccc:6:A2::, 2001:db8:cccc:9:A1::,
   2001:db8:cccc:3:C4::; SL=1)(2001:db8:cccc:8:D100::,
   2001:db8:cccc:2:B1::; SL=1)

   Node 5 sends the following packet to 9: (2001:db8::1,
   2001:db8:cccc:9:A1::) (2001:db8:cccc:6:A2::, 2001:db8:cccc:9:A1::,
   2001:db8:cccc:3:C4::; SL=1)(2001:db8:cccc:8:D100::,
   2001:db8:cccc:2:B1::; SL=1)

   Node 9 sends the following packet to 6: (2001:db8::1,
   2001:db8:cccc:6:A2::) (2001:db8:cccc:8:D100::, 2001:db8:cccc:2:B1::;
   SL=1)

   Node 6 sends the following packet to 7: (2001:db8::1,
   2001:db8:cccc:8:D100::)

   Node 7 sends the following packet to 8: (2001:db8::1,
   2001:db8:cccc:8:D100::) which decaps and forwards to CE-C.

   The benefits of using an intermediate Binding SID are well-known and
   key to the Segment Routing architecture: the ingress edge node needs
   to push fewer SIDs and does not need to change its SR policy upon a
   change of the core topology or re-homing of the container-based apps
   on different servers.  Conversely, the core and service organizations
   do not need to share details on how they realize underlay SLAs or
   where they home their NFV apps.

2.10.  TI-LFA

   Let us assume two packets P1 and P2 received by node 2 exactly when
   the failure of link 2-7 is detected.
   P1: (2001:db8::1, 2001:db8:cccc:7:1::)

   P2: (2001:db8::1, 2001:db8:cccc:7:1::)(2001:db8:cccc:8:D100::,
   2001:db8:cccc:7:1::; SL=1)

   Node 2's pre-computed TI-LFA backup path for the destination
   2001:db8:cccc:7::/64 is <2001:db8:cccc:3:C4::>. It is installed as a
   T.Insert transit behavior.

   Node 2 protects the two packets P1 and P2 according to the
   pre-computed TI-LFA backup path and sends the following modified
   packets on the link to 4:

   P1: (2001:db8::1, 2001:db8:cccc:3:C4::)(2001:db8:cccc:7:1::,
   2001:db8:cccc:3:C4::; SL=1)

   P2: (2001:db8::1, 2001:db8:cccc:3:C4::)(2001:db8:cccc:7:1::,
   2001:db8:cccc:3:C4::; SL=1)(2001:db8:cccc:8:D100::,
   2001:db8:cccc:7:1::; SL=1)

   Node 4 then sends the following modified packets to 5:

   P1: (2001:db8::1, 2001:db8:cccc:7:1::)

   P2: (2001:db8::1, 2001:db8:cccc:7:1::)(2001:db8:cccc:8:D100::,
   2001:db8:cccc:7:1::; SL=1)

   These packets then follow the rest of their post-convergence path
   towards node 7 and go on to node 8 for the VPN decaps.

2.11. SR TE for Service programming

   We have illustrated the service programming through SR-aware apps in
   a previous section.

   We now illustrate the use of the End.AS function
   [I-D.ietf-spring-sr-service-programming] to service-chain an IP flow
   bound to the internet through two SR-unaware applications hosted in
   containers.

   Let us assume that servers 20 and 70 are respectively connected to
   nodes 2 and 7. They are respectively configured with SID spaces
   2001:db8:cccc:20::/64 and 2001:db8:cccc:70::/64. Their connected
   routers advertise the related prefixes in the IGP. Two SR-unaware
   container-based applications App2 and App7 are respectively hosted on
   servers 20 and 70. Server 20 is explicitly configured with End.AS
   SID 2001:db8:cccc:20:2:: for App2, and server 70 with End.AS SID
   2001:db8:cccc:70:7:: for App7.

   Let us assume a broadband customer with a home gateway CE-A connected
   to edge router 1.
   Router 1 is configured with an SR policy which encapsulates all the
   traffic received from CE-A into a T.Encaps policy
   <2001:db8:cccc:20:2::, 2001:db8:cccc:70:7::, 2001:db8:cccc:8:D0::>
   where 2001:db8:cccc:8:D0:: is an End.DT4 SID instantiated at node 8.

   P1 is a packet sent by the broadband customer to 1: (X, Y) where X
   and Y are two IPv4 addresses.

   Node 1 sends the following packet to 2:
   (A1::, 2001:db8:cccc:20:2::)(2001:db8:cccc:8:D0::,
   2001:db8:cccc:70:7::, 2001:db8:cccc:20:2::; SL=2; NH=4)(X, Y).

   Node 2 forwards the packet to server 20.

   Server 20 receives the packet
   (A1::, 2001:db8:cccc:20:2::)(2001:db8:cccc:8:D0::,
   2001:db8:cccc:70:7::, 2001:db8:cccc:20:2::; SL=2; NH=4)(X, Y)
   and forwards the inner IPv4 packet (X, Y) to App2. App2 works on the
   packet and forwards it back to 20. Server 20 then pushes the outer
   IPv6 header with SRH
   (A1::, 2001:db8:cccc:70:7::)(2001:db8:cccc:8:D0::,
   2001:db8:cccc:70:7::, 2001:db8:cccc:20:2::; SL=1; NH=4)
   and sends the whole IPv6 packet, with the encapsulated IPv4 packet,
   back to 2.

   Nodes 2 and 7 forward to server 70.

   Server 70 receives the packet
   (A1::, 2001:db8:cccc:70:7::)(2001:db8:cccc:8:D0::,
   2001:db8:cccc:70:7::, 2001:db8:cccc:20:2::; SL=1; NH=4)(X, Y)
   and forwards the inner IPv4 packet (X, Y) to App7. App7 works on the
   packet and forwards it back to 70. Server 70 then pushes the outer
   IPv6 header with SRH
   (A1::, 2001:db8:cccc:8:D0::)(2001:db8:cccc:8:D0::,
   2001:db8:cccc:70:7::, 2001:db8:cccc:20:2::; SL=0; NH=4)
   and sends the whole IPv6 packet, with the encapsulated IPv4 packet,
   back to 7.

   Node 7 forwards to 8.

   Node 8 receives
   (A1::, 2001:db8:cccc:8:D0::)(2001:db8:cccc:8:D0::,
   2001:db8:cccc:70:7::, 2001:db8:cccc:20:2::; SL=0; NH=4)(X, Y),
   performs the End.DT4 function and sends the IP packet (X, Y) towards
   its internet destination.

3. Benefits

3.1. Seamless deployment

   The VPN use-case can be realized with SRv6 capability deployed solely
   at the ingress and egress PEs.

      All the nodes in between these PEs act as transit routers as per
      [RFC8200]. No software/hardware upgrade is required on any of
      these nodes. They just need to support IPv6 per [RFC8200].

   The SRTE/underlay-SLA use-case can be realized with SRv6 capability
   deployed at a few strategic nodes.

      It is well-known from the experience of deploying SR-MPLS that
      underlay SLA optimization requires only a few SIDs placed at
      strategic locations. This was illustrated in our example with
      the low-latency optimization, which required the operator to
      enable a single core node with SRv6 (node 4), where a single
      End.X SID towards node 5 was instantiated. This single SID is
      sufficient to force the end-to-end traffic via the low-latency
      path.

   The TI-LFA benefits are collected incrementally as SRv6 capabilities
   are deployed.

      It is well-known that TI-LFA is an incremental node-by-node
      deployment. When a node N is enabled for TI-LFA, it computes
      TI-LFA backup paths for each primary path to each IGP
      destination. In more than 50% of the cases, the post-convergence
      path is loop-free and does not depend on the presence of any
      remote SRv6 SID. In the vast majority of cases, a single segment
      is enough to encode the post-convergence path in a loop-free
      manner. If the required segment is available (that node has been
      upgraded) then the related backup path is installed in FIB, else
      the pre-existing situation (no backup) continues. Hence, as the
      SRv6 deployment progresses, the coverage incrementally increases.
      Eventually, when the core network is SRv6 capable, the TI-LFA
      coverage is complete.

   The service programming use-case can be realized with SRv6 capability
   deployed at a few strategic nodes.
      The service-programming deployment is again incremental and does
      not require any pre-deployment of SRv6 in the network. When an
      NFV app A1 needs to be enabled for inclusion in an SRv6 service
      chain, all that is required is to install that app in a container
      or VM on an SRv6-capable server (Linux 4.10 or FD.io 17.04
      release). The app can be SR-aware or SR-unaware; in the latter
      case it leverages the proxy functions.

      By leveraging the various End functions, it can also be used to
      support any current VNF/CNF implementations and their forwarding
      methods (e.g. Layer 2).

      The ability to leverage SR TE policies and BSIDs also permits
      building scalable, hierarchical service-chains.

3.2. Integration

   The SRv6 network programming concept allows integrating all the
   application and service requirements: multi-domain underlay SLA
   optimization with scale, overlay VPN/Tenant, sub-50msec automated
   FRR, security and service programming.

3.3. Security

   The combination of well-known techniques (SEC-1, SEC-2) and carefully
   chosen architectural rules (SEC-3) ensures a secure deployment of
   SRv6 inside a multi-domain network managed by a single organization.

   Inter-domain security will be described in a companion document.

4. Acknowledgements

   The authors would like to acknowledge Stefano Previdi, Dave Barach,
   Mark Townsley, Peter Psenak, Thierry Couture, Kris Michielsen, Paul
   Wells, Robert Hanzl, Dan Ye, Gaurav Dawra, Faisal Iqbal, Jaganbabu
   Rajamanickam, David Toscano, Asif Islam, Jianda Liu, Yunpeng Zhang,
   Jiaoming Li, Narendra A.K, Mike Mc Gourty, Bhupendra Yadav, Sherif
   Toulan, Satish Damodaran, John Bettink, Kishore Nandyala Veera Venk,
   Jisu Bhattacharya and Saleem Hafeez.

5. Contributors

   Daniel Bernier
   Bell Canada
   Canada

   Email: daniel.bernier@bell.ca

   Daniel Voyer
   Bell Canada
   Canada

   Email: daniel.voyer@bell.ca

   Bart Peirens
   Proximus
   Belgium

   Email: bart.peirens@proximus.com

   Hani Elmalky
   Ericsson
   United States of America

   Email: hani.elmalky@gmail.com

   Prem Jonnalagadda
   Barefoot Networks
   United States of America

   Email: prem@barefootnetworks.com

   Milad Sharif
   Barefoot Networks
   United States of America

   Email: msharif@barefootnetworks.com

   Stefano Salsano
   Universita di Roma "Tor Vergata"
   Italy

   Email: stefano.salsano@uniroma2.it

   Ahmed AbdelSalam
   Gran Sasso Science Institute
   Italy

   Email: ahmed.abdelsalam@gssi.it

   Gaurav Naik
   Drexel University
   United States of America

   Email: gn@drexel.edu

   Arthi Ayyangar
   Arista
   United States of America

   Email: arthi@arista.com

   Satish Mynam
   Innovium Inc.
   United States of America

   Email: smynam@innovium.com

   Wim Henderickx
   Nokia
   Belgium

   Email: wim.henderickx@nokia.com

   Shaowen Ma
   Juniper
   Singapore

   Email: mashao@juniper.net

   Ahmed Bashandy
   Individual
   United States of America

   Email: abashandy.ietf@gmail.com

   Francois Clad
   Cisco Systems, Inc.
   France

   Email: fclad@cisco.com

   Kamran Raza
   Cisco Systems, Inc.
   Canada

   Email: skraza@cisco.com

   Darren Dukes
   Cisco Systems, Inc.
   Canada

   Email: ddukes@cisco.com

   Patrice Brissete
   Cisco Systems, Inc.
   Canada

   Email: pbrisset@cisco.com

   Zafar Ali
   Cisco Systems, Inc.
   United States of America

   Email: zali@cisco.com

6. References

   [I-D.ietf-bess-srv6-services]
              Dawra, G., Filsfils, C., Raszuk, R., Decraene, B., Zhuang,
              S., and J. Rabadan, "SRv6 BGP based Overlay services",
              draft-ietf-bess-srv6-services-02 (work in progress),
              February 2020.

   [I-D.ietf-spring-segment-routing-policy]
              Filsfils, C., Sivabalan, S., Voyer, D., Bogdanov, A., and
              P. Mattes, "Segment Routing Policy Architecture",
              draft-ietf-spring-segment-routing-policy-07 (work in
              progress), May 2020.

   [I-D.ietf-spring-sr-service-programming]
              Clad, F., Xu, X., Filsfils, C., Bernier, D., Li, C.,
              Decraene, B., Ma, S., Yadlapalli, C., Henderickx, W., and
              S. Salsano, "Service Programming with Segment Routing",
              draft-ietf-spring-sr-service-programming-02 (work in
              progress), March 2020.

   [I-D.ietf-spring-srv6-network-programming]
              Filsfils, C., Camarillo, P., Leddy, J., Voyer, D.,
              Matsushima, S., and Z. Li, "SRv6 Network Programming",
              draft-ietf-spring-srv6-network-programming-15 (work in
              progress), March 2020.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

   [RFC8200]  Deering, S. and R. Hinden, "Internet Protocol, Version 6
              (IPv6) Specification", STD 86, RFC 8200,
              DOI 10.17487/RFC8200, July 2017.

   [RFC8754]  Filsfils, C., Ed., Dukes, D., Ed., Previdi, S., Leddy, J.,
              Matsushima, S., and D. Voyer, "IPv6 Segment Routing Header
              (SRH)", RFC 8754, DOI 10.17487/RFC8754, March 2020.

Authors' Addresses

   Clarence Filsfils
   Cisco Systems, Inc.
   Belgium

   Email: cf@cisco.com

   Pablo Camarillo Garvia (editor)
   Cisco Systems, Inc.
   Spain

   Email: pcamaril@cisco.com

   Zhenbin Li
   Huawei Technologies
   China

   Email: lizhenbin@huawei.com

   Satoru Matsushima
   SoftBank
   1-9-1,Higashi-Shimbashi,Minato-Ku
   Tokyo 105-7322
   Japan

   Email: satoru.matsushima@g.softbank.co.jp

   Bruno Decraene
   Orange
   France

   Email: bruno.decraene@orange.com

   Dirk Steinberg
   Lapishills Consulting Limited
   Cyprus

   Email: dirk@lapishills.com

   David Lebrun
   Google
   Belgium

   Email: david.lebrun@uclouvain.be

   Robert Raszuk
   Bloomberg LP
   United States of America

   Email: robert@raszuk.net

   John Leddy
   Individual Contributor
   United States of America

   Email: john@leddy.net