SPRING                                                       C. Filsfils
Internet-Draft                                         P. Camarillo, Ed.
Intended status: Informational                       Cisco Systems, Inc.
Expires: October 1, 2021                                           Z. Li
                                                     Huawei Technologies
                                                           S. Matsushima
                                                                SoftBank
                                                             B. Decraene
                                                                  Orange
                                                            D. Steinberg
                                           Lapishills Consulting Limited
                                                               D. Lebrun
                                                                  Google
                                                               R. Raszuk
                                                            Bloomberg LP
                                                                J. Leddy
                                                  Individual Contributor
                                                          March 30, 2021

               Illustrations for SRv6 Network Programming
           draft-filsfils-spring-srv6-net-pgm-illustration-04

Abstract

   This document illustrates how SRv6 Network Programming [RFC8986] can
   be used to create interoperable and protected overlays with underlay
   optimization and service programming.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).
   Note that other groups may also distribute working documents as
   Internet-Drafts.  The list of current Internet-Drafts is at
   https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on October 1, 2021.

Copyright Notice

   Copyright (c) 2021 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Illustration
     2.1.  Simplified SID allocation
     2.2.  Reference diagram
     2.3.  Basic security
     2.4.  SR-L3VPN
     2.5.  SR-Ethernet-VPWS
     2.6.  SR-EVPN-FXC
     2.7.  SR-EVPN
       2.7.1.  EVPN Bridging
       2.7.2.  EVPN Multi-homing with ESI filtering
       2.7.3.  EVPN Layer-3
       2.7.4.  EVPN Integrated Routing Bridging (IRB)
     2.8.  SR TE for Underlay SLA
       2.8.1.  SR policy from the Ingress PE
       2.8.2.  SR policy at a midpoint
     2.9.  End-to-End policy with intermediate BSID
     2.10. TI-LFA
     2.11. SR TE for Service programming
   3.  Benefits
     3.1.  Seamless deployment
     3.2.  Integration
     3.3.  Security
   4.  Acknowledgements
   5.  Contributors
   6.  References
   Authors' Addresses

1.  Introduction

   Segment Routing leverages the source routing paradigm.  An ingress
   node steers a packet through an ordered list of instructions, called
   segments.  Each one of these instructions represents a function to be
   called at a specific location in the network.  A function is locally
   defined on the node where it is executed and may range from simply
   moving forward in the segment list to any complex user-defined
   behavior.  Network programming consists of combining segment
   routing functions, both simple and complex, to achieve a networking
   objective that goes beyond mere packet routing.

   [RFC8986] defines the SRv6 Network Programming concept and the main
   segment routing behaviors.

   This document illustrates how these concepts can be used to enable
   the creation of interoperable overlays with underlay optimization and
   service programming.

   The terminology for this document is defined in [RFC8986].

2.  Illustration

   We introduce a simplified SID allocation technique to ease the
   reading of the text.  We document the reference diagram.  We then
   illustrate the network programming concept through different
   use-cases.  These use-cases are designed so that they can be
   combined with each other in a straightforward manner.

2.1.  Simplified SID allocation

   To simplify the illustration, we assume:

      2001:db8::/32 is an IPv6 block allocated by a RIR to the operator

      2001:db8:0::/48 is dedicated to the internal address space

      2001:db8:cccc::/48 is dedicated to the internal SRv6 SID space

      A locator is expressed in 64 bits and a function is expressed
      in 16 bits

      Node k has a classic IPv6 loopback address 2001:db8::k/128 which
      is advertised in the IGP

      Node k has 2001:db8:cccc:k::/64 for its local SID space.  Its SIDs
      will be explicitly assigned from that block

      Node k advertises 2001:db8:cccc:k::/64 in its IGP

      Function :1:: (function 1, for short) represents the End function
      with PSP support

      Function :C2:: (function C2, for short) represents the End.X
      function towards neighbor 2

   Each node k has:

      An explicit SID instantiation 2001:db8:cccc:k:1::/128 bound to an
      End function with additional support for PSP

      An explicit SID instantiation 2001:db8:cccc:k:Cj::/128 bound to an
      End.X function to neighbor j with additional support for PSP

2.2.  Reference diagram

   Let us assume the following topology where all the links have IGP
   metric 10 except the link 3-4 which is 100.

   Nodes A, B and 1 to 8 are considered within the network domain while
   nodes CE-A, CE-B and CE-C are outside the domain.

                            CE-B
                                \
                                 3------4---5
                                 |       \ /
                                 |        6
                                 |       /
                         A--1--- 2------7---8--B
                           /                 \
                       CE-A                   CE-C
                    Tenant100              Tenant100 with
                                        IPv4 203.0.113.0/24

                      Figure 1: Reference topology

2.3.  Basic security

   Any edge node such as 1 would be configured with an ACL on any of its
   external interfaces (e.g. from CE-A) which drops any traffic with SA
   or DA in 2001:db8:cccc::/48.  See SEC-1.

   Any core node such as 6 could be configured with an ACL with the
   SEC-2 behavior "IF (DA == LocalSID) && (SA is not in 2001:db8:0::/48
   or 2001:db8:cccc::/48) THEN drop".

   SEC-3 protection is a default property of SRv6.  A SID must be
   explicitly instantiated.  In our illustration, the only available
   SIDs are those explicitly instantiated.

2.4.  SR-L3VPN

   Let us illustrate the SR-L3VPN use-case applied to IPv4.

   Nodes 1 and 8 are configured with a tenant 100, and are respectively
   connected to CE-A and CE-C.

   Node 8 is configured with a locally instantiated End.DT4 SID
   2001:db8:cccc:8:D100:: bound to tenant IPv4 table 100.

   Via BGP signaling or an SDN-based controller, Node 1's tenant-100
   IPv4 table is programmed with an IPv4 SR-VPN route 203.0.113.0/24 via
   SRv6 policy <2001:db8:cccc:8:D100::>.

   When 1 receives a packet P from CE-A destined to 203.0.113.20, 1
   looks up 203.0.113.20 in its tenant-100 IPv4 table and finds an
   SR-VPN entry 203.0.113.0/24 via SRv6 policy <2001:db8:cccc:8:D100::>.
   As a consequence, 1 pushes an outer IPv6 header with SA=2001:db8::1,
   DA=2001:db8:cccc:8:D100:: and NH=4.  1 then forwards the resulting
   packet on the shortest path to 2001:db8:cccc:8::/64.

   When 8 receives the packet, 8 matches the DA in its "My SID Table",
   finds the bound function End.DT4(100) and confirms NH=4.  As a
   result, 8 decaps the outer header, looks up the inner IPv4 DA in
   tenant-100 IPv4 table, and forwards the (inner) IPv4 packet towards
   CE-C.
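
   The SID allocation of Section 2.1 and the two lookups above can be
   sketched in a few lines of Python.  This is only an illustrative
   model under our simplified addressing, not an implementation of any
   router: the function names, the dictionary-based tables and the
   packet representation are all assumptions made for this example.

```python
import ipaddress

SID_BLOCK = int(ipaddress.IPv6Address("2001:db8:cccc::"))  # SRv6 SID space
LOOPBACK_BLOCK = int(ipaddress.IPv6Address("2001:db8::"))  # internal space


def sid(node, function):
    """Return 2001:db8:cccc:<node>:<function>:: (64-bit locator, 16-bit function)."""
    return ipaddress.IPv6Address(SID_BLOCK | (node << 64) | (function << 48))


def loopback(node):
    """Return the classic loopback address 2001:db8::<node>."""
    return ipaddress.IPv6Address(LOOPBACK_BLOCK | node)


def longest_match(table, address):
    """Longest-prefix match of an IPv4 address in a {network: value} table."""
    hits = [net for net in table if address in net]
    return table[max(hits, key=lambda net: net.prefixlen)] if hits else None


def ingress_encap(node, vrf, inner_da):
    """Node 1: look up the tenant table and push the outer IPv6 header."""
    egress_sid = longest_match(vrf, ipaddress.IPv4Address(inner_da))
    if egress_sid is None:
        return None  # no SR-VPN route: nothing to encapsulate
    return {"sa": loopback(node), "da": egress_sid, "nh": 4, "inner_da": inner_da}


def egress_end_dt4(packet, my_sid_table, vrfs):
    """Node 8: match the DA in My SID Table, decapsulate, look up the inner DA."""
    entry = my_sid_table.get(packet["da"])
    if entry is None or entry[0] != "End.DT4" or packet["nh"] != 4:
        return None  # not a local End.DT4 SID, or the payload is not IPv4
    return longest_match(vrfs[entry[1]], ipaddress.IPv4Address(packet["inner_da"]))


# Hypothetical table contents mirroring the example in the text.
prefix = ipaddress.IPv4Network("203.0.113.0/24")
vrf100_node1 = {prefix: sid(8, 0xD100)}
my_sid_table_node8 = {sid(8, 0xD100): ("End.DT4", 100)}
vrfs_node8 = {100: {prefix: "CE-C"}}

outer = ingress_encap(1, vrf100_node1, "203.0.113.20")
print(outer["sa"], outer["da"], outer["nh"])
print(egress_end_dt4(outer, my_sid_table_node8, vrfs_node8))
```

   The sketch keeps the SID arithmetic explicit: the node number lands
   in the fourth 16-bit group (the end of the 64-bit locator) and the
   function in the fifth, which is all our simplified allocation needs.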

   The reader can easily infer all the other SR-IPVPN instantiations:

   +---------------------------------+----------------------------------+
   | Route at ingress PE(1)          | SR-VPN Egress SID of egress PE(8)|
   +---------------------------------+----------------------------------+
   | IPv4 tenant route with egress   | End.DT4 function bound to        |
   | tenant table lookup             | IPv4-tenant-100 table            |
   +---------------------------------+----------------------------------+
   | IPv4 tenant route without egress| End.DX4 function bound to        |
   | tenant table lookup             | CE-C (IPv4)                      |
   +---------------------------------+----------------------------------+
   | IPv6 tenant route with egress   | End.DT6 function bound to        |
   | tenant table lookup             | IPv6-tenant-100 table            |
   +---------------------------------+----------------------------------+
   | IPv6 tenant route without egress| End.DX6 function bound to        |
   | tenant table lookup             | CE-C (IPv6)                      |
   +---------------------------------+----------------------------------+

2.5.  SR-Ethernet-VPWS

   Let us illustrate the SR-Ethernet-VPWS use-case.

   Node 8 is configured with a locally instantiated End.DX2 SID
   2001:db8:cccc:8:DC2C:: bound to local attachment circuit {ethernet
   CE-C}.

   Via BGP signaling or an SDN controller, node 1 is programmed with an
   Ethernet VPWS service for its local attachment circuit {ethernet
   CE-A} with remote endpoint 2001:db8:cccc:8:DC2C::.

   When 1 receives a frame F from CE-A, node 1 pushes an outer IPv6
   header with SA=2001:db8::1, DA=2001:db8:cccc:8:DC2C:: and NH=143.
   Note that no additional header is pushed.  1 then forwards the
   resulting packet on the shortest path to 2001:db8:cccc:8::/64.

   When 8 receives the packet, 8 matches the DA in its "My SID Table"
   and finds the bound function End.DX2.  After confirming that
   next-header=143, 8 decaps the outer IPv6 header and forwards the
   inner Ethernet frame towards CE-C.

   The reader can easily infer the Ethernet VPWS use-case:

   +------------------------+-----------------------------------+
   | Route at ingress PE(1) | SR-VPN Egress SID of egress PE(8) |
   +------------------------+-----------------------------------+
   | Ethernet VPWS          | End.DX2 function bound to         |
   |                        | CE-C (Ethernet)                   |
   +------------------------+-----------------------------------+

2.6.  SR-EVPN-FXC

   Let us illustrate the SR-EVPN-FXC use-case (Flexible cross-connect
   service).

   Node 8 is configured with a locally instantiated End.DX2V SID
   2001:db8:cccc:8:DC2C:: bound to the L2 table T1.  Node 8 is also
   configured with local attachment circuits {ethernet CE1-C VLAN:100}
   and {ethernet CE2-C VLAN:200} in table T1.

   Via an SDN controller or BGP-based signaling, node 1 is programmed
   with an EVPN-FXC service for its local attachment circuit {ethernet
   CE-A} with remote endpoint 2001:db8:cccc:8:DC2C::.  For this
   purpose, the EVPN Type-1 route is used.

   When node 1 receives a frame F from CE-A, it pushes an outer IPv6
   header with SA=2001:db8::1, DA=2001:db8:cccc:8:DC2C:: and NH=143.
   Note that no additional header is pushed.  Node 1 then forwards the
   resulting packet on the shortest path to 2001:db8:cccc:8::/64.

   When node 8 receives the packet, it matches the IP DA in its "My SID
   Table" and finds the bound function End.DX2V.  After confirming that
   next-header=143, node 8 decaps the outer IPv6 header, performs a VLAN
   lookup in table T1 and forwards the inner Ethernet frame to the
   matching interface: a frame with VLAN 100 is forwarded to CE1-C and
   a frame with VLAN 200 is forwarded to CE2-C.
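
   The VLAN lookup performed by node 8 can be sketched as follows.
   This is a minimal Python model, assuming the inner frame carries a
   single IEEE 802.1Q tag; the function names and the dictionary that
   stands in for table T1 are hypothetical and chosen for this example
   only.

```python
import struct

NH_ETHERNET = 143     # next-header value used in the text for an Ethernet payload
TPID_8021Q = 0x8100   # EtherType announcing an IEEE 802.1Q VLAN tag


def vlan_id(frame):
    """Return the 802.1Q VLAN ID of an Ethernet frame, or None if untagged."""
    if len(frame) < 16:
        return None
    tpid, tci = struct.unpack_from("!HH", frame, 12)  # skip the two MAC addresses
    return tci & 0x0FFF if tpid == TPID_8021Q else None


def end_dx2v(next_header, frame, l2_table):
    """Decapsulate and return the attachment circuit selected by the VLAN lookup."""
    if next_header != NH_ETHERNET:
        return None  # the behavior only accepts an inner Ethernet frame
    return l2_table.get(vlan_id(frame))


def tagged_frame(vlan, payload=b"payload"):
    """Build a test frame: dst MAC, src MAC, 802.1Q tag, IPv4 EtherType, payload."""
    return bytes(6) + bytes(6) + struct.pack("!HHH", TPID_8021Q, vlan, 0x0800) + payload


# Table T1 of node 8 as configured in the text: VLAN -> attachment circuit.
table_t1 = {100: "CE1-C", 200: "CE2-C"}

print(end_dx2v(NH_ETHERNET, tagged_frame(100), table_t1))
print(end_dx2v(NH_ETHERNET, tagged_frame(200), table_t1))
```

   A frame whose VLAN is not present in T1, or a packet whose
   next-header is not 143, yields no output interface in this model.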

   The reader can easily infer the Ethernet FXC use-case:

   +---------------------------------+------------------------------------+
   | Route at ingress PE (1)         | SR-VPN Egress SID of egress PE (8) |
   +---------------------------------+------------------------------------+
   | EVPN-FXC                        | End.DX2V function bound to         |
   |                                 | CE1-C / CE2-C (Ethernet)           |
   +---------------------------------+------------------------------------+

2.7.  SR-EVPN

   The following section details some of the particular use-cases of
   SR-EVPN: bridging (unicast and multicast), multi-homing ESI
   filtering, L3 EVPN and EVPN-IRB.

2.7.1.  EVPN Bridging

   Let us illustrate the SR-EVPN unicast and multicast bridging.

   Nodes 1, 3 and 8 are configured with an EVPN bridging service (E-LAN
   service).

   Node 1 is configured with a locally instantiated End.DT2U SID
   2001:db8:cccc:1:D2AA:: bound to a local L2 table T1 where EVPN is
   enabled.  This SID will be used to attract unicast traffic.
   Additionally, Node 1 is configured with a locally instantiated
   End.DT2M SID 2001:db8:cccc:1:D2AF:: bound to the same local L2 table
   T1.  This SID will be used to attract multicast traffic.  Node 1 is
   also configured with local attachment circuit {ethernet CE-A
   VLAN:100} associated to table T1.

   A similar instantiation is done at Node 3 and Node 8 resulting in:

   - Node 1 - My SID table:

      - End.DT2U SID: 2001:db8:cccc:1:D2AA:: table T1

      - End.DT2M SID: 2001:db8:cccc:1:D2AF:: table T1

   - Node 3 - My SID table:

      - End.DT2U SID: 2001:db8:cccc:3:D2BA:: table T3

      - End.DT2M SID: 2001:db8:cccc:3:D2BF:: table T3

   - Node 8 - My SID table:

      - End.DT2U SID: 2001:db8:cccc:8:D2CA:: table T8

      - End.DT2M SID: 2001:db8:cccc:8:D2CF:: table T8

   Nodes 1, 3 and 8 exchange their End.DT2M SIDs via BGP-based EVPN
   Type-3 routes.
   Upon reception of the EVPN Type-3 routes, each node builds its own
   replication list per L2 table that will be used for ingress BUM
   traffic replication.  The replication lists are the following:

   - Node 1 - replication list: {2001:db8:cccc:3:D2BF:: and
     2001:db8:cccc:8:D2CF::}

   - Node 3 - replication list: {2001:db8:cccc:1:D2AF:: and
     2001:db8:cccc:8:D2CF::}

   - Node 8 - replication list: {2001:db8:cccc:1:D2AF:: and
     2001:db8:cccc:3:D2BF::}

   When node 1 receives a BUM frame F from CE-A, it replicates that
   frame to every node in the replication list.  For node 3, it pushes
   an outer IPv6 header with SA=2001:db8::1, DA=2001:db8:cccc:3:D2BF::
   and NH=143.  For node 8, it performs the same operation but with
   DA=2001:db8:cccc:8:D2CF::.  Note that no additional headers are
   pushed.  Node 1 then forwards the resulting packets on the shortest
   path for each destination.

   When node 3 receives the packet, it matches the DA in its "My SID
   Table" and finds the bound function End.DT2M with its related layer-2
   table T3.  After confirming that next-header=143, node 3 decaps the
   outer IPv6 header and forwards the inner Ethernet frame to all
   layer-2 output interfaces found in table T3.  Similar processing is
   also performed by node 8 upon packet reception.  This example is the
   same for any BUM stream coming from CE-B or CE-C.

   Nodes 1, 3 and 8 also perform software MAC learning and exchange
   MAC reachability information (unicast traffic) via BGP among
   themselves.

   Each MAC being learnt is exchanged using a BGP-based EVPN Type-2
   route.

   When node 1 receives a unicast frame F from CE-A, it learns its
   MAC-SA=CEA in software.  Node 1 transmits that MAC and its associated
   SID 2001:db8:cccc:1:D2AA:: using a BGP-based EVPN Type-2 route to all
   remote nodes.

   When node 3 receives a unicast frame F from CE-B destined to
   MAC-DA=CEA, it performs an L2 lookup on T3 to find the associated
   SID.
   It pushes an outer IPv6 header with SA=2001:db8::3,
   DA=2001:db8:cccc:1:D2AA:: and NH=143.  Node 3 then forwards the
   resulting packet on the shortest path to 2001:db8:cccc:1::/64.
   Similar processing is also performed by node 8.

2.7.2.  EVPN Multi-homing with ESI filtering

   In an L2 network, support for traffic loop avoidance is mandatory.
   An EVPN all-active multi-homing scenario enforces that requirement
   using ESI filtering.  Let us illustrate how it works:

   Nodes 3 and 4 are peering partners of a redundancy group where the
   access CE-B is connected in an all-active multi-homing way to
   these two nodes.  Hence, the topology is the following:

                                   CE-B
                                  /    \
                                 3------4---5
                                 |       \ /
                                 |        6
                                 |       /
                         A--1--- 2------7---8--B
                           /                 \
                       CE-A                   CE-C
                    Tenant100              Tenant100 with
                                        IPv4 203.0.113.0/24

                EVPN ESI filtering - Reference topology

   Nodes 3 and 4 are configured with an EVPN bridging service (E-LAN
   service).

   Node 3 is configured with a locally instantiated End.DT2M SID
   2001:db8:cccc:3:D2BF:: bound to a local L2 table T1 where EVPN is
   enabled.  This SID is also configured with the optional argument
   Arg.FE2 that specifies the attachment circuit.  In particular, node 3
   assigns identifier 0xC1 to {ethernet CE-B}.

   Node 4 is configured with a locally instantiated End.DT2M SID
   2001:db8:cccc:4:D2BF:: bound to a local L2 table T1 where EVPN is
   enabled.  This SID is also configured with the optional argument
   Arg.FE2 that specifies the attachment circuit.  In particular, node 4
   assigns identifier 0xC2 to {ethernet CE-B}.

   Both End.DT2M SIDs are exchanged between nodes via BGP-based EVPN
   Type-3 routes.  Upon reception of EVPN Type-3 routes, each node
   builds its own replication list per L2 table T1.

   On the other hand, the End.DT2M SID arguments (Arg.FE2) are exchanged
   between nodes via the SRv6 VPN SID attached to the BGP-based EVPN
   Type-1 route.
   The BGP ESI-filtering extended community label is set to
   implicit-null [I-D.ietf-bess-srv6-services].

   Upon reception of the EVPN Type-1 route and Type-3 route, node 3
   merges the End.DT2M SID (2001:db8:cccc:4:D2BF::) with the
   Arg.FE2 (0:0:0:0:0:C2::) from node 4 (its peering partner).  This is
   done by a simple bitwise OR operation.  As a result, the replication
   list on node 3 for the PEs 1, 4 and 8 is: {2001:db8:cccc:1:D2AF::;
   2001:db8:cccc:4:D2BF:C2::; 2001:db8:cccc:8:D2CF::}.

   In a similar manner, the replication list on node 4 for the PEs 1, 3
   and 8 is: {2001:db8:cccc:1:D2AF::; 2001:db8:cccc:3:D2BF:C1::;
   2001:db8:cccc:8:D2CF::}.  Note that in this case the SID for PE 3
   is the bitwise OR of 2001:db8:cccc:3:D2BF:: and 0:0:0:0:0:C1::.

   When node 3 receives a BUM frame F from CE-B, it replicates that
   frame to remote PEs.  For node 4, it pushes an outer IPv6 header with
   SA=2001:db8::3, DA=2001:db8:cccc:4:D2BF:C2:: and NH=143.  Note that
   no additional header is pushed.  Node 3 then forwards the resulting
   packet on the shortest path to node 4.  Once the packet arrives at
   node 4, the End.DT2M function is executed, forwarding to all L2 OIFs
   except the ones corresponding to identifier 0xC2.

2.7.3.  EVPN Layer-3

   EVPN layer-3 works in exactly the same way as L3VPN.  Please refer
   to Section 2.4.

2.7.4.  EVPN Integrated Routing Bridging (IRB)

   EVPN IRB brings Layer-2 and Layer-3 together.  It uses the BGP-based
   EVPN Type-2 route to achieve Layer-2 intra-subnet and Layer-3
   inter-subnet forwarding.  The EVPN Type-2 route maintains the MAC/IP
   association.

   Node 8 is configured with a locally instantiated End.DT2U SID
   2001:db8:cccc:8:D2C:: used for unicast L2 traffic.  Node 8 is also
   configured with a locally instantiated End.DT4 SID
   2001:db8:cccc:8:D100:: bound to IPv4 tenant table 100.

   Node 1 is configured with the EVPN IRB service.

   Node 8 signals each learned ARP/ND entry to the remote PEs (1, 3)
   via a BGP-based EVPN Type-2 route.  For example, when node 8 receives
   an ARP/ND packet P from a host (203.0.113.20) on CE-C destined to
   192.0.2.10, it learns its MAC-SA=CEC in software.  It also learns the
   ARP/ND entry (IP SA=203.0.113.20) in its cache.  Node 8 transmits
   that MAC/IP and its associated L3 SID (2001:db8:cccc:8:D100::) and L2
   SID (2001:db8:cccc:8:D2C::).

   When node 1 receives from CE-A a packet P sent by a host (192.0.2.10)
   and destined to 203.0.113.20, node 1 looks up its tenant-100 IPv4
   table and finds an SR-VPN entry for that prefix.  As a consequence,
   node 1 pushes an outer IPv6 header with SA=2001:db8::1,
   DA=2001:db8:cccc:8:D100:: and NH=4.  Node 1 then forwards the
   resulting packet on the shortest path to 2001:db8:cccc:8::/64.  EVPN
   inter-subnet forwarding is then achieved.

   When node 1 receives from CE-A a packet P sent by a host (192.0.2.11)
   and destined to 203.0.113.20, node 1 performs a MAC-DA lookup in its
   L2 table T1 to find the associated SID.  It pushes an outer IPv6
   header with SA=2001:db8::1, DA=2001:db8:cccc:8:D2C:: and NH=143.
   Note that no additional header is pushed.  Node 1 then forwards the
   resulting packet on the shortest path to 2001:db8:cccc:8::/64.  EVPN
   intra-subnet forwarding is then achieved.

2.8.  SR TE for Underlay SLA

2.8.1.  SR policy from the Ingress PE

   Let's assume that node 1's tenant-100 IPv4 route "203.0.113.0/24 via
   2001:db8:cccc:8:D100::" is programmed with a color/community that
   requires low-latency underlay optimization
   [I-D.ietf-spring-segment-routing-policy].

   In such a case, node 1 either computes the low-latency path to the
   egress node itself or delegates the computation to a PCE.

   In either case, the location of the egress PE can easily be found by
   looking for the node that originates the locator comprising the SID
   2001:db8:cccc:8:D100::.
   This can be found in the IGP's LSDB for a single-domain case, and in
   the BGP-LS LSDB for a multi-domain case.

   Let us assume that the TE metric encodes the per-link propagation
   latency.  Let us assume that all the links have a TE metric of 10,
   except link 27 which has TE metric 100.

   The low-latency path from 1 to 8 is thus 1234678.

   This path is encoded in a SID list as: first a hop through
   2001:db8:cccc:3:C4:: and then a hop to 8.

   As a consequence, the SR-VPN entry 203.0.113.0/24 installed in
   node 1's tenant-100 IPv4 table is: H.Encaps with SRv6 Policy
   <2001:db8:cccc:3:C4::, 2001:db8:cccc:8:D100::>.

   When 1 receives a packet P from CE-A destined to 203.0.113.20, 1
   looks up its tenant-100 IPv4 table and finds an SR-VPN entry
   203.0.113.0/24.  As a consequence, 1 pushes an outer header with
   SA=2001:db8::1, DA=2001:db8:cccc:3:C4::, NH=SRH followed by SRH
   (2001:db8:cccc:8:D100::, 2001:db8:cccc:3:C4::; SL=1; NH=4).  1 then
   forwards the resulting packet on the interface to 2.

   2 forwards to 3 along the path to 2001:db8:cccc:3::/64.

   When 3 receives the packet, 3 matches the DA in its "My SID Table"
   and finds the bound function End.X to neighbor 4.  3 notes the PSP
   capability of the SID 2001:db8:cccc:3:C4::.  3 sets the DA to the
   next SID 2001:db8:cccc:8:D100::.  As 3 is the penultimate segment
   hop, it performs PSP and pops the SRH.  3 forwards the resulting
   packet to 4.

   4, 6 and 7 forward along the path to 2001:db8:cccc:8::/64.

   When 8 receives the packet, 8 matches the DA in its "My SID Table"
   and finds the bound function End.DT4(100).  As a result, 8 decaps the
   outer header, looks up the inner IPv4 DA (203.0.113.20) in tenant-100
   IPv4 table, and forwards the (inner) IPv4 packet towards CE-C.

2.8.2.  SR policy at a midpoint

   Let us analyze a policy applied at a midpoint on a packet without
   SRH.

   Packet P1 is (2001:db8::1, 2001:db8:cccc:8:D100::).

   Let us consider P1 when it is received by node 2 and let us assume
   that node 2 is configured to steer 2001:db8:cccc:8::/64 into an
   H.Insert.Red behavior associated with SR policy
   <2001:db8:cccc:3:C4::>.

   In such a case, node 2 would send the following modified packet P1 on
   the link to 3:

   (2001:db8::1, 2001:db8:cccc:3:C4::)(2001:db8:cccc:8:D100::; SL=1).

   The rest of the processing is similar to the previous section.

   Let us analyze a policy applied at a midpoint on a packet with an
   SRH.

   Packet P2 is (2001:db8::1,
   2001:db8:cccc:7:1::)(2001:db8:cccc:8:D100::; SL=1).

   Let us consider P2 when it is received by node 2 and let us assume
   that node 2 is configured to steer 2001:db8:cccc:7::/64 into an
   H.Insert.Red behavior associated with SR policy
   <2001:db8:cccc:3:C4::, 2001:db8:cccc:5:1::>.

   In such a case, node 2 would send the following modified packet P2 on
   the link to 3:

   (2001:db8::1, 2001:db8:cccc:3:C4::)(2001:db8:cccc:7:1::,
   2001:db8:cccc:5:1::; SL=2)(2001:db8:cccc:8:D100::; SL=1)

   Node 3 would send the following packet to 4: (2001:db8::1,
   2001:db8:cccc:5:1::)(2001:db8:cccc:7:1::, 2001:db8:cccc:5:1::;
   SL=1)(2001:db8:cccc:8:D100::; SL=1)

   Node 4 would send the following packet to 5: (2001:db8::1,
   2001:db8:cccc:5:1::)(2001:db8:cccc:7:1::, 2001:db8:cccc:5:1::;
   SL=1)(2001:db8:cccc:8:D100::; SL=1)

   Node 5 would send the following packet to 6: (2001:db8::1,
   2001:db8:cccc:7:1::)(2001:db8:cccc:8:D100::; SL=1)

   Node 6 would send the following packet to 7: (2001:db8::1,
   2001:db8:cccc:7:1::)(2001:db8:cccc:8:D100::; SL=1)

   Node 7 would send the following packet to 8: (2001:db8::1,
   2001:db8:cccc:8:D100::)

2.9.  End-to-End policy with intermediate BSID

   Let us now describe a case where the ingress VPN edge node steers the
   packet destined to 203.0.113.20 towards the egress edge node
   connected to the tenant100 site with 203.0.113.0/24, but via an
   intermediate SR Policy represented by a single routable Binding SID.
   Let us illustrate this case with an intermediate policy which
   encodes both the underlay optimization for low latency and the
   service programming via two SR-aware container-based apps.

   Let us assume that the End.B6.Insert SID 2001:db8:cccc:2:B1:: is
   configured at node 2 and is associated with midpoint SR policy
   <2001:db8:cccc:3:C4::, 2001:db8:cccc:9:A1::, 2001:db8:cccc:6:A2::>.

   2001:db8:cccc:3:C4:: realizes the low-latency path from the ingress
   PE to the egress PE.  This is the underlay optimization part of the
   intermediate policy.

   2001:db8:cccc:9:A1:: and 2001:db8:cccc:6:A2:: represent two SR-aware
   NFV applications residing in containers respectively connected to
   nodes 9 and 6.

   Let us assume the following ingress VPN policy for 203.0.113.0/24 in
   the tenant-100 IPv4 table of node 1: H.Encaps with SRv6 Policy
   <2001:db8:cccc:2:B1::, 2001:db8:cccc:8:D100::>.

   This ingress policy will steer the 203.0.113.0/24 tenant-100 traffic
   towards the correct egress PE and via the required intermediate
   policy that realizes the SLA and NFV requirements of this tenant
   customer.
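
   The way the two policies above compose can be sketched in Python.
   This is a simplified model, assuming a packet is represented as a
   dictionary holding the outer addresses and a list of SRHs
   (outermost first); the helper names are hypothetical and only the
   header manipulation is modeled, not forwarding.

```python
def encode_srh(sid_list):
    """Encode a SID list as an SRH: segments stored last-first, SL on the first."""
    return {"segments": list(reversed(sid_list)), "sl": len(sid_list) - 1}


def h_encaps(sa, sid_list):
    """Ingress PE: outer header whose DA is the first SID, followed by one SRH."""
    return {"sa": sa, "da": sid_list[0], "srhs": [encode_srh(sid_list)]}


def end_b6_insert(packet, bsid, policy):
    """Midpoint: if the DA is the Binding SID, insert an SRH for the bound policy."""
    if packet["da"] != bsid:
        return packet  # not our Binding SID: leave the packet untouched
    return {
        "sa": packet["sa"],
        "da": policy[0],                               # first SID of the policy
        "srhs": [encode_srh(policy)] + packet["srhs"],  # new SRH goes in front
    }


BSID = "2001:db8:cccc:2:B1::"
MIDPOINT_POLICY = [
    "2001:db8:cccc:3:C4::",   # underlay optimization
    "2001:db8:cccc:9:A1::",   # first SR-aware app
    "2001:db8:cccc:6:A2::",   # second SR-aware app
]

from_node1 = h_encaps("2001:db8::1", [BSID, "2001:db8:cccc:8:D100::"])
from_node2 = end_b6_insert(from_node1, BSID, MIDPOINT_POLICY)

print(from_node1)
print(from_node2)
```

   In this model the ingress PE only ever handles two SIDs, whatever
   the length of the policy bound to the Binding SID at node 2.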

   Node 1 sends the following packet to 2: (2001:db8::1,
   2001:db8:cccc:2:B1::) (2001:db8:cccc:8:D100::, 2001:db8:cccc:2:B1::;
   SL=1)

   Node 2 sends the following packet to 4: (2001:db8::1,
   2001:db8:cccc:3:C4::) (2001:db8:cccc:6:A2::, 2001:db8:cccc:9:A1::,
   2001:db8:cccc:3:C4::; SL=2)(2001:db8:cccc:8:D100::,
   2001:db8:cccc:2:B1::; SL=1)

   Node 4 sends the following packet to 5: (2001:db8::1,
   2001:db8:cccc:9:A1::) (2001:db8:cccc:6:A2::, 2001:db8:cccc:9:A1::,
   2001:db8:cccc:3:C4::; SL=1)(2001:db8:cccc:8:D100::,
   2001:db8:cccc:2:B1::; SL=1)

   Node 5 sends the following packet to 9: (2001:db8::1,
   2001:db8:cccc:9:A1::) (2001:db8:cccc:6:A2::, 2001:db8:cccc:9:A1::,
   2001:db8:cccc:3:C4::; SL=1)(2001:db8:cccc:8:D100::,
   2001:db8:cccc:2:B1::; SL=1)

   Node 9 sends the following packet to 6: (2001:db8::1,
   2001:db8:cccc:6:A2::) (2001:db8:cccc:8:D100::, 2001:db8:cccc:2:B1::;
   SL=1)

   Node 6 sends the following packet to 7: (2001:db8::1,
   2001:db8:cccc:8:D100::)

   Node 7 sends the following packet to 8: (2001:db8::1,
   2001:db8:cccc:8:D100::).  Node 8 decaps and forwards to CE-C.

   The benefits of using an intermediate Binding SID are well known and
   key to the Segment Routing architecture: the ingress edge node needs
   to push fewer SIDs, and the ingress edge node does not need to change
   its SR policy upon a change of the core topology or re-homing of the
   container-based apps on different servers.  Conversely, the core and
   service organizations do not need to share details on how they
   realize underlay SLAs or where they home their NFV apps.

2.10.  TI-LFA

   Let us assume two packets P1 and P2 received by node 2 exactly when
   the failure of link 27 is detected.

      P1: (2001:db8::1, 2001:db8:cccc:7:1::)

      P2: (2001:db8::1, 2001:db8:cccc:7:1::)(2001:db8:cccc:8:D100::;
      SL=1)

   Node 2's pre-computed TI-LFA backup path for the destination
   2001:db8:cccc:7::/64 is <2001:db8:cccc:3:C4::>.
   It is installed as an H.Insert.Red transit behavior.

   Node 2 protects the two packets P1 and P2 according to the
   pre-computed TI-LFA backup path and sends the following modified
   packets on the link to 3:

      P1: (2001:db8::1, 2001:db8:cccc:3:C4::)(2001:db8:cccc:7:1::; SL=1)

      P2: (2001:db8::1, 2001:db8:cccc:3:C4::)(2001:db8:cccc:7:1::; SL=1)
      (2001:db8:cccc:8:D100::; SL=1)

   Node 3 then sends the following modified packets to 4:

      P1: (2001:db8::1, 2001:db8:cccc:7:1::)

      P2: (2001:db8::1, 2001:db8:cccc:7:1::)(2001:db8:cccc:8:D100::;
      SL=1)

   Then these packets follow the rest of their post-convergence path
   towards node 7 and then go to node 8 for the VPN decaps.

2.11.  SR TE for Service programming

   We have illustrated the service programming through SR-aware apps in
   a previous section.

   We now illustrate the use of the End.AS function
   [I-D.ietf-spring-sr-service-programming] to service chain an IP flow
   bound to the internet through two SR-unaware applications hosted in
   containers.

   Let us assume that servers 20 and 70 are respectively connected to
   nodes 2 and 7.  They are respectively configured with SID spaces
   2001:db8:cccc:20::/64 and 2001:db8:cccc:70::/64.  Their connected
   routers advertise the related prefixes in the IGP.  Two SR-unaware
   container-based applications App2 and App7 are respectively hosted on
   servers 20 and 70.  Server 20 (70) is configured explicitly with an
   End.AS SID 2001:db8:cccc:20:2:: for App2 (2001:db8:cccc:70:7:: for
   App7).

   Let us assume a broadband customer with a home gateway CE-A connected
   to edge router 1.  Router 1 is configured with an SR policy which
   encapsulates all the traffic received from CE-A into an H.Encaps
   policy <2001:db8:cccc:20:2::, 2001:db8:cccc:70:7::,
   2001:db8:cccc:8:D0::> where 2001:db8:cccc:8:D0:: is an End.DT4 SID
   instantiated at node 8.
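
   As a rough sketch of how this policy could be encoded at router 1
   and advanced by the two servers, consider the Python model below.
   It assumes that each proxy hands the inner packet to its
   application and then re-applies the outer header with Segments Left
   decremented; the function names, the dictionary-based packet and
   the use of the outer source address A1:: are assumptions made for
   this example, and details of End.AS such as its static
   configuration are not modeled.

```python
def h_encaps(sa, sid_list, inner):
    """Router 1: push an outer header and an SRH carrying the whole SID list."""
    segments = list(reversed(sid_list))   # the last segment is stored first
    sl = len(sid_list) - 1                # Segments Left points at the first SID
    return {"sa": sa, "da": segments[sl], "segments": segments,
            "sl": sl, "nh": 4, "inner": inner}


def end_as(packet, app):
    """Server: give the inner packet to the SR-unaware app, then re-encapsulate."""
    inner = app(packet["inner"])          # the app never sees the IPv6 header or SRH
    sl = packet["sl"] - 1                 # advance to the next segment
    return dict(packet, da=packet["segments"][sl], sl=sl, inner=inner)


def passthrough(inner):
    """Stand-in for App2 and App7: returns the inner packet unchanged."""
    return inner


POLICY = [
    "2001:db8:cccc:20:2::",   # End.AS SID for App2 on server 20
    "2001:db8:cccc:70:7::",   # End.AS SID for App7 on server 70
    "2001:db8:cccc:8:D0::",   # End.DT4 SID at node 8
]

from_router1 = h_encaps("A1::", POLICY, ("X", "Y"))
from_server20 = end_as(from_router1, passthrough)
from_server70 = end_as(from_server20, passthrough)

for packet in (from_router1, from_server20, from_server70):
    print(packet["da"], "SL=%d" % packet["sl"])
```

   The segment list itself never changes along the chain in this
   model; only the destination address and Segments Left move forward.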
746   P1 is a packet sent by the broadband customer to 1: (X, Y) where X
747   and Y are two IPv4 addresses.

749   1 sends the following packet to 2: (A1::,
750   2001:db8:cccc:20:2::)(2001:db8:cccc:8:D0::, 2001:db8:cccc:70:7::,
751   2001:db8:cccc:20:2::; SL=2; NH=4)(X, Y).

753   2 forwards the packet to server 20.

755   20 receives the packet (A1::,
756   2001:db8:cccc:20:2::)(2001:db8:cccc:8:D0::, 2001:db8:cccc:70:7::,
757   2001:db8:cccc:20:2::; SL=2; NH=4)(X, Y) and forwards the inner IPv4
758   packet (X, Y) to App2. App2 works on the packet and forwards it back
759   to 20. 20 pushes the outer IPv6 header with SRH (A1::,
760   2001:db8:cccc:70:7::)(2001:db8:cccc:8:D0::, 2001:db8:cccc:70:7::,
761   2001:db8:cccc:20:2::; SL=1; NH=4) and sends the (whole) IPv6 packet
762   with the encapsulated IPv4 packet back to 2.

764   2 and 7 forward to server 70.

766   70 receives the packet (A1::,
767   2001:db8:cccc:70:7::)(2001:db8:cccc:8:D0::, 2001:db8:cccc:70:7::,
768   2001:db8:cccc:20:2::; SL=1; NH=4)(X, Y) and forwards the inner IPv4
769   packet (X, Y) to App7. App7 works on the packet and forwards it back
770   to 70. 70 pushes the outer IPv6 header with SRH (A1::,
771   2001:db8:cccc:8:D0::)(2001:db8:cccc:8:D0::, 2001:db8:cccc:70:7::,
772   2001:db8:cccc:20:2::; SL=0; NH=4) and sends the (whole) IPv6 packet
773   with the encapsulated IPv4 packet back to 7.

775   7 forwards to 8.

777   8 receives (A1::, 2001:db8:cccc:8:D0::)(2001:db8:cccc:8:D0::,
778   2001:db8:cccc:70:7::, 2001:db8:cccc:20:2::; SL=0; NH=4)(X, Y) and
779   performs the End.DT4 function and sends the IP packet (X, Y) towards
780   its internet destination.

782 3. Benefits

784 3.1. Seamless deployment

786   The VPN use-case can be realized with SRv6 capability deployed solely
787   at the ingress and egress PEs.

789      All the nodes in between these PEs act as transit routers as per
790      [RFC8200]. No software/hardware upgrade is required on any of
791      these nodes. They just need to support IPv6 per [RFC8200].
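
The transit behavior described above can be sketched as follows: a
plain IPv6 node performs a longest-prefix match on the outer
destination address only and never inspects the SRH. This is a minimal
Python sketch. The FIB contents and next-hop labels are hypothetical,
and the /64 prefix for node 8 is an assumption derived from the SID
values used in this document. `transit_lookup` is an illustrative
name, not an existing API.

```python
import ipaddress
from typing import Dict, Optional

# Hypothetical FIB of a transit node: prefix -> next-hop label.
# The prefixes reuse the SID spaces mentioned in this document; the
# next-hop labels are assumptions made for illustration.
FIB: Dict[str, str] = {
    "2001:db8:cccc:20::/64": "toward node 2",
    "2001:db8:cccc:70::/64": "toward node 7",
    "2001:db8:cccc:8::/64": "toward node 8",
}


def transit_lookup(fib: Dict[str, str], destination: str) -> Optional[str]:
    """Longest-prefix match on the IPv6 destination address.

    Any SRH carried by the packet plays no role in this decision.
    Returns None when no prefix covers the destination.
    """
    addr = ipaddress.ip_address(destination)
    best_len = -1
    best_hop: Optional[str] = None
    for prefix, next_hop in fib.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and net.prefixlen > best_len:
            best_len = net.prefixlen
            best_hop = next_hop
    return best_hop


# A SID is just an IPv6 address from the transit node's point of view.
print(transit_lookup(FIB, "2001:db8:cccc:20:2::"))
print(transit_lookup(FIB, "2001:db8:cccc:8:D0::"))
```

Because the lookup uses only the destination address, the same code
path serves SRv6 and non-SRv6 traffic alike.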
793   The SRTE/underlay-SLA use-case can be realized with SRv6 capability
794   deployed at a few strategic nodes.

796      It is well known from the experience deploying SR-MPLS that
797      underlay SLA optimization requires few SIDs placed at strategic
798      locations. This was illustrated in our example with the low-
799      latency optimization which required the operator to enable one
800      single core node with SRv6 (node 4) where one single End.X SID
801      towards node 5 was instantiated. This single SID is sufficient to
802      force the end-to-end traffic via the low-latency path.

804   The TI-LFA benefits are collected incrementally as SRv6 capabilities
805   are deployed.

807      It is well known that TI-LFA is an incremental node-by-node
808      deployment. When a node N is enabled for TI-LFA, it computes TI-
809      LFA backup paths for each primary path to each IGP destination.
810      In more than 50% of the cases, the post-convergence path is loop-
811      free and does not depend on the presence of any remote SRv6 SID.
812      In the vast majority of cases, a single segment is enough to
813      encode the post-convergence path in a loop-free manner. If the
814      required segment is available (that node has been upgraded) then
815      the related back-up path is installed in FIB, else the pre-
816      existing situation (no backup) continues. Hence, as the SRv6
817      deployment progresses, the coverage incrementally increases.
818      Eventually, when the core network is SRv6 capable, the TI-LFA
819      coverage is complete.

821   The service programming use-case can be realized with SRv6 capability
822   deployed at a few strategic nodes.

824      The service-programming deployment is again incremental and does
825      not require any pre-deployment of SRv6 in the network. When an
826      NFV app A1 needs to be enabled for inclusion in an SRv6 service
827      chain, all that is required is to install that app in a container
828      or VM on an SRv6-capable server (Linux 4.10 or FD.io 17.04
829      release).
         The app can either be SR-aware or not, leveraging the
830      proxy functions.

832      By leveraging the various End functions it can also be used to
833      support any current VNF/CNF implementations and their forwarding
834      methods (e.g., Layer 2).

836      The ability to leverage SR TE policies and BSIDs also permits
837      building scalable, hierarchical service-chains.

839 3.2. Integration

841   The SRv6 network programming concept allows integrating all the
842   application and service requirements: multi-domain underlay SLA
843   optimization with scale, overlay VPN/Tenant, sub-50msec automated
844   FRR, security and service programming.

846 3.3. Security

848   The combination of well-known techniques (SEC-1, SEC-2) and carefully
849   chosen architectural rules (SEC-3) ensures a secure deployment of SRv6
850   inside a multi-domain network managed by a single organization.

852   Inter-domain security will be described in a companion document.

854 4. Acknowledgements

856   The authors would like to acknowledge Stefano Previdi, Dave Barach,
857   Mark Townsley, Peter Psenak, Thierry Couture, Kris Michielsen, Paul
858   Wells, Robert Hanzl, Dan Ye, Gaurav Dawra, Faisal Iqbal, Jaganbabu
859   Rajamanickam, David Toscano, Asif Islam, Jianda Liu, Yunpeng Zhang,
860   Jiaoming Li, Narendra A.K, Mike Mc Gourty, Bhupendra Yadav, Sherif
861   Toulan, Satish Damodaran, John Bettink, Kishore Nandyala Veera Venk,
862   Jisu Bhattacharya, Saleem Hafeez and Michael Huang.

864 5.
Contributors

866   Daniel Bernier
867   Bell Canada
868   Canada

870   Email: daniel.bernier@bell.ca

872   Daniel Voyer
873   Bell Canada
874   Canada

876   Email: daniel.voyer@bell.ca

877   Bart Peirens
878   Proximus
879   Belgium

881   Email: bart.peirens@proximus.com

883   Hani Elmalky
884   Ericsson
885   United States of America

887   Email: hani.elmalky@gmail.com

889   Prem Jonnalagadda
890   Barefoot Networks
891   United States of America

893   Email: prem@barefootnetworks.com

895   Milad Sharif
896   Barefoot Networks
897   United States of America

899   Email: msharif@barefootnetworks.com

901   Stefano Salsano
902   Universita di Roma "Tor Vergata"
903   Italy

905   Email: stefano.salsano@uniroma2.it

907   Ahmed AbdelSalam
908   Gran Sasso Science Institute
909   Italy

911   Email: ahmed.abdelsalam@gssi.it

913   Gaurav Naik
914   Drexel University
915   United States of America

917   Email: gn@drexel.edu

919   Arthi Ayyangar
920   Arista
921   United States of America

923   Email: arthi@arista.com

924   Satish Mynam
925   Innovium Inc.
926   United States of America

928   Email: smynam@innovium.com

930   Wim Henderickx
931   Nokia
932   Belgium

934   Email: wim.henderickx@nokia.com

936   Shaowen Ma
937   Juniper
938   Singapore

940   Email: mashao@juniper.net

942   Ahmed Bashandy
943   Individual
944   United States of America

946   Email: abashandy.ietf@gmail.com

948   Francois Clad
949   Cisco Systems, Inc.
950   France

952   Email: fclad@cisco.com

954   Kamran Raza
955   Cisco Systems, Inc.
956   Canada

958   Email: skraza@cisco.com

960   Darren Dukes
961   Cisco Systems, Inc.
962   Canada

964   Email: ddukes@cisco.com

966   Patrice Brissete
967   Cisco Systems, Inc.
968   Canada

970   Email: pbrisset@cisco.com

971   Zafar Ali
972   Cisco Systems, Inc.
973   United States of America

975   Email: zali@cisco.com

977 6. References

979   [I-D.ietf-bess-srv6-services]
980              Dawra, G., Filsfils, C., Talaulikar, K., Raszuk, R.,
981              Decraene, B., Zhuang, S., and J. Rabadan, "SRv6 BGP based
982              Overlay services", draft-ietf-bess-srv6-services-05 (work
983              in progress), November 2020.
985   [I-D.ietf-spring-segment-routing-policy]
986              Filsfils, C., Talaulikar, K., Voyer, D., Bogdanov, A., and
987              P. Mattes, "Segment Routing Policy Architecture", draft-
988              ietf-spring-segment-routing-policy-09 (work in progress),
989              November 2020.

991   [I-D.ietf-spring-sr-service-programming]
992              Clad, F., Xu, X., Filsfils, C., Bernier, D.,
993              Li, C., Decraene, B., Ma, S., Yadlapalli, C.,
994              Henderickx, W., and S. Salsano, "Service Programming with
995              Segment Routing", draft-ietf-spring-sr-service-
996              programming-03 (work in progress), September 2020.

998   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
999              Requirement Levels", BCP 14, RFC 2119,
1000             DOI 10.17487/RFC2119, March 1997.

1003  [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
1004             2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
1005             May 2017.

1007  [RFC8200]  Deering, S. and R. Hinden, "Internet Protocol, Version 6
1008             (IPv6) Specification", STD 86, RFC 8200,
1009             DOI 10.17487/RFC8200, July 2017.

1012  [RFC8754]  Filsfils, C., Ed., Dukes, D., Ed., Previdi, S., Leddy, J.,
1013             Matsushima, S., and D. Voyer, "IPv6 Segment Routing Header
1014             (SRH)", RFC 8754, DOI 10.17487/RFC8754, March 2020.

1017  [RFC8986]  Filsfils, C., Ed., Camarillo, P., Ed., Leddy, J., Voyer,
1018             D., Matsushima, S., and Z. Li, "Segment Routing over IPv6
1019             (SRv6) Network Programming", RFC 8986,
1020             DOI 10.17487/RFC8986, February 2021.

1023 Authors' Addresses

1025  Clarence Filsfils
1026  Cisco Systems, Inc.
1027  Belgium

1029  Email: cf@cisco.com

1031  Pablo Camarillo Garvia (editor)
1032  Cisco Systems, Inc.
1033  Spain

1035  Email: pcamaril@cisco.com

1037  Zhenbin Li
1038  Huawei Technologies
1039  China

1041  Email: lizhenbin@huawei.com

1043  Satoru Matsushima
1044  SoftBank
1045  1-9-1,Higashi-Shimbashi,Minato-Ku
1046  Tokyo 105-7322
1047  Japan

1049  Email: satoru.matsushima@g.softbank.co.jp

1051  Bruno Decraene
1052  Orange
1053  France

1055  Email: bruno.decraene@orange.com

1056  Dirk Steinberg
1057  Lapishills Consulting Limited
1058  Cyprus

1060  Email: dirk@lapishills.com

1062  David Lebrun
1063  Google
1064  Belgium

1066  Email: david.lebrun@uclouvain.be

1068  Robert Raszuk
1069  Bloomberg LP
1070  United States of America

1072  Email: robert@raszuk.net

1074  John Leddy
1075  Individual Contributor
1076  United States of America

1078  Email: john@leddy.net