SPRING Working Group                                         C. Filsfils
Internet-Draft                                        K. Talaulikar, Ed.
Intended status: Informational                       Cisco Systems, Inc.
Expires: December 9, 2018                                        P. Krol
                                                            Google, Inc.
                                                            M. Horneffer
                                                        Deutsche Telekom
                                                               P. Mattes
                                                               Microsoft
                                                            June 7, 2018


         SR Policy Implementation and Deployment Considerations
         draft-filsfils-spring-sr-policy-considerations-01.txt

Abstract

   Segment Routing (SR) allows a headend node to steer a packet flow
   along any path.  Intermediate per-flow states are eliminated thanks
   to source routing.  The SR Policy framework enables the instantiation
   and the management of the necessary state on the headend node for
   flows along source-routed paths using an ordered list of segments
   associated with their specific SR Policies.  This document describes
   some of the implementation and deployment aspects that are useful for
   operationalizing the SR Policy architecture.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.
   It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on December 9, 2018.

Copyright Notice

   Copyright (c) 2018 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  SR Policy Headend Architecture
   3.  Dynamic Path Computation
     3.1.  Optimization Objective
     3.2.  Constraints
     3.3.  SR Native Algorithm
     3.4.  Path to SID
   4.  Candidate Path Selection
   5.  Distributed and/or Centralized Control Plane
     5.1.  Distributed Control Plane within a single Link-State IGP
           area
     5.2.  Distributed Control Plane across several Link-State IGP
           areas
     5.3.  Centralized Control Plane
     5.4.  Distributed and Centralized Control Plane
   6.  Binding SID Aspects
     6.1.  Benefits of Binding SID
     6.2.  Centralized Discovery of available BSID
   7.  Flex-Algorithm Based SR Policies
   8.  Layer 2 and Optical Transport
   9.  Security Considerations
   10. IANA Considerations
   11. Acknowledgement
   12. Contributors
   13. References
     13.1.  Normative References
     13.2.  Informative References
   Authors' Addresses

1.  Introduction

   Segment Routing (SR) allows a headend node to steer a packet flow
   along any path.  Intermediate per-flow states are eliminated with
   source routing [I-D.ietf-spring-segment-routing].

   The headend node steers a flow into a Segment Routing Policy (SR
   Policy) by augmenting packet headers with the ordered list of
   segments associated with that SR Policy.
   [I-D.ietf-spring-segment-routing-policy] defines the SR Policy
   architecture and details the concepts of SR Policy and steering into
   an SR Policy.

   This document describes some of the implementation aspects of the SR
   Policy framework, which should be considered as suggestions.  The
   same behavior, as defined in
   [I-D.ietf-spring-segment-routing-policy], may in fact be realized
   with alternate approaches.  The deployment aspects described in this
   document are also meant to serve only as guidelines.  This document
   describes these aspects and other considerations related to SR Policy
   concepts, as they are important to facilitate multi-vendor
   interoperable deployments for various SR Policy use-cases.

   These considerations apply equally to the MPLS
   [I-D.ietf-spring-segment-routing-mpls] and SRv6
   [I-D.filsfils-spring-srv6-network-programming] instantiations of
   segment routing.

   For reading simplicity, the illustrations are provided for the MPLS
   instantiations.

2.  SR Policy Headend Architecture

      +--------+      +--------+
      |  BGP   |      |  PCEP  |
      +--------+      +--------+
             \          /
      +--------+  +----------+  +--------+
      |        |  | SR Policy|  |        |
      |  CLI   |--|  Module  |--| NETCONF|
      |        |  |  (SRPM)  |  |        |
      +--------+  +----------+  +--------+
                       |
                  +--------+
                  |  FIB   |
                  +--------+

            Figure 1: SR Policy Architecture at a Headend

   The SR Policy functionality at a headend can be implemented in an SR
   Policy Module (SRPM) process as illustrated in Figure 1.

   The SRPM process interacts with other processes to learn candidate
   paths.

   The SRPM process selects the active path of an SR Policy.

   The SRPM process interacts with the RIB/FIB process to install an
   active SR Policy in the dataplane.

   In order to validate explicit candidate paths and compute dynamic
   candidate paths, the SRPM process maintains an SR Database (SR-DB) as
   specified in [I-D.ietf-spring-segment-routing-policy].  The SRPM
   process interacts with other processes as shown in Figure 2 to
   collect the SR-DB information.

      +--------+     +--------+     +--------+
      | BGP SR |     | BGP-LS |     |  IGP   |
      | Policy |     +--------+     +--------+
      +--------+ \        |        /
      +--------+   +-----------+  +--------+
      |  PCEP  |---|   SRPM    |--| NETCONF|
      +--------+   +-----------+  +--------+

         Figure 2: Topology/link-state database architecture

   The SR Policy architecture supports both centralized and distributed
   control planes.

3.  Dynamic Path Computation

   A dynamic candidate path for an SR Policy is specified as an
   optimization objective and a set of constraints, and needs to be
   computed by either the headend or a Path Computation Element (PCE).
   The distributed or centralized computation aspect is described
   further in Section 5.  This section describes the computation aspects
   of a dynamic path.

3.1.  Optimization Objective

   This document describes two optimization objectives:

   o  Min-Metric - requests computation of a solution SID-List optimized
      for a selected metric.

   o  Min-Metric with margin and maximum number of SIDs - Min-Metric
      with two changes: a margin by which two paths with similar metrics
      would be considered equal, and a constraint on the maximum number
      of SIDs in the SID-List.

   The "Min-Metric" optimization objective requests to compute a
   solution SID-List such that packets flowing through the solution SID-
   List use ECMP-aware paths optimized for the selected metric.  The
   "Min-Metric" objective can be instantiated for the IGP metric
   ([RFC1195] [RFC2328] [RFC5340]) xor the TE metric ([RFC5305]
   [RFC3630]) xor the latency extended TE metric ([RFC7810] [RFC7471]).
   This metric is called the O metric (the optimized metric) to
   distinguish it from the IGP metric.  The solution SID-List must be
   computed to minimize the number of SIDs and the number of SID-Lists.

   If the selected O metric is the IGP metric and the headend and
   tailend are in the same IGP domain, then the solution SID-List is
   made of the single prefix-SID of the tailend.

   When the selected O metric is not the IGP metric, then the solution
   SID-List is made of prefix SIDs of intermediate nodes, Adjacency SIDs
   along intermediate links and potentially Binding SIDs (BSIDs) of
   intermediate policies.

   In many deployments there are insignificant metric differences
   between mostly equal paths (e.g., a difference of 100 usec of latency
   between two paths from NYC to SFO would not matter in most cases).
   The "Min-Metric with margin" objective supports such a requirement.
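
   As a rough illustration of the "Min-Metric with margin and maximum
   number of SIDs" objective listed above (its fallback rule is spelled
   out in the paragraphs that follow), the Python sketch below picks one
   SID-List from a set of already-computed candidates.  The Candidate
   type, the select_sid_list function, and the label and latency values
   are hypothetical and exist only for this illustration; how the
   candidates are computed remains a local implementation matter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    sid_list: tuple   # ordered SIDs (hypothetical label values)
    o_metric: int     # worst cumulative O metric of the paths it uses


def select_sid_list(candidates, shortest_o_metric, margin, max_sids):
    """Pick a SID-List under 'Min-Metric with margin and max SIDs'.

    Assumption of this sketch: candidates are precomputed, and among
    candidates inside the margin the one with the fewest SIDs wins.
    """
    # Hard limit: never exceed the maximum number of SIDs.
    fits = [c for c in candidates if len(c.sid_list) <= max_sids]
    if not fits:
        return None
    # Candidates within the margin are treated as equally good.
    within = [c for c in fits if c.o_metric <= shortest_o_metric + margin]
    if within:
        return min(within, key=lambda c: (len(c.sid_list), c.o_metric))
    # Fallback: least O metric while still meeting the SID limit.
    return min(fits, key=lambda c: (c.o_metric, len(c.sid_list)))


# Latencies in usec; the best achievable path has an O metric of 3950.
c1 = Candidate((16004, 16005, 16007, 16003), 3950)
c2 = Candidate((16007, 16003), 4000)
c3 = Candidate((16003,), 4300)

chosen = select_sid_list([c1, c2, c3], shortest_o_metric=3950,
                         margin=100, max_sids=3)
print(chosen.sid_list)  # (16007, 16003)
```

   With a limit of one SID, no candidate in this example stays within
   the margin, so the sketch falls back to the single-SID candidate with
   the least O metric.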

   The "Min-Metric with margin and maximum number of SIDs" optimization
   objective requests to compute a solution SID-List such that packets
   flowing through the solution SID-List do not use a path whose
   cumulative O metric is larger than the shortest-path O metric +
   margin.

   If this is not possible because of the number of SIDs constraint,
   then the solution SID-List minimizes the O metric while meeting the
   maximum number of SIDs constraint (i.e., the path with the least
   value of O metric while using <= the number of SIDs specified).

3.2.  Constraints

   The following constraints can be described:

   o  Inclusion and/or exclusion of TE affinity.

   o  Inclusion and/or exclusion of IP address.

   o  Inclusion and/or exclusion of SRLG.

   o  Inclusion and/or exclusion of admin-tag.

   o  Maximum accumulated metric (IGP, TE and latency).

   o  Maximum number of SIDs in the solution SID-List.

   o  Maximum number of weighted SID-Lists in the solution set.

   o  Diversity to another service instance (e.g., link, node, or SRLG
      disjoint paths originating from different head-ends).

3.3.  SR Native Algorithm

      1----------------2----------------3
      |\                               /
      | \                             /
      |  4-------------5-------------7
      |   \                         /|
      |    +-----------6-----------+ |
      8------------------------------9

      Figure 3: Illustration used to describe SR native algorithm

   Let us assume that all the links have the same IGP metric of 10 and
   let us consider the dynamic path defined as: Min-Metric(from 1, to 3,
   IGP metric, margin 0) with constraint "avoid link 2-to-3".

   A classical circuit implementation would do the following: prune the
   graph, compute the shortest path, pick a single non-ECMP branch of
   the ECMP-aware shortest path and encode it as a SID-List.  The
   solution SID-List would be <4, 5, 7, 3>.
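
   The classical circuit computation just described can be sketched in
   Python as follows.  The link list is a reading of Figure 3 and is an
   assumption of this sketch, as are the helper names build_graph and
   shortest_path.  Ties between equal-cost branches are broken
   arbitrarily (here, on the lexicographically smallest node sequence),
   and each hop is encoded as one node SID, as in the example above.

```python
import heapq

# Links as read from Figure 3 (assumption); every link has IGP metric 10.
LINKS = [(1, 2), (2, 3), (1, 4), (4, 5), (5, 7), (7, 3),
         (4, 6), (6, 7), (1, 8), (8, 9), (9, 7)]


def build_graph(links, metric=10, avoid=()):
    """Build an undirected graph, pruning the links listed in 'avoid'."""
    pruned = {frozenset(link) for link in avoid}
    graph = {}
    for a, b in links:
        if frozenset((a, b)) in pruned:
            continue
        graph.setdefault(a, {})[b] = metric
        graph.setdefault(b, {})[a] = metric
    return graph


def shortest_path(graph, src, dst):
    """Dijkstra returning (cost, path) for one single shortest branch."""
    heap = [(0, [src])]
    done = set()
    while heap:
        cost, path = heapq.heappop(heap)
        node = path[-1]
        if node == dst:
            return cost, path
        if node in done:
            continue
        done.add(node)
        for neighbor, link_metric in graph[node].items():
            if neighbor not in done:
                heapq.heappush(heap, (cost + link_metric, path + [neighbor]))
    return None


graph = build_graph(LINKS, avoid=[(2, 3)])   # constraint: avoid link 2-to-3
cost, path = shortest_path(graph, 1, 3)
sid_list = path[1:]                          # one node SID per hop
print(cost, sid_list)                        # 40 [4, 5, 7, 3]
```

   The result pins the flow to one of the three equal-cost branches
   between nodes 1 and 7 and needs four SIDs to do so.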

   An SR-native algorithm would find a SID-List that minimizes the
   number of SIDs and maximizes the use of all the ECMP branches along
   the ECMP shortest path.  In this illustration, the solution SID-List
   would be <7, 3>.

   In the vast majority of SR use-cases, SR-native algorithms should be
   preferred: they preserve the native ECMP of IP and they minimize the
   dataplane header overhead.

   In some specific use-cases (e.g., TDM migration over IP where the
   circuit notion prevails), one may prefer a classic circuit
   computation followed by an encoding into SIDs (potentially only using
   non-protected Adj SIDs that pin the path to specific links and avoid
   ECMP to reflect the TDM paradigm).

   SR-native algorithms are a local node behavior and are thus outside
   the scope of this document.

3.4.  Path to SID

   Let us assume the below diagram where all the links have an IGP
   metric of 10 and a TE metric of 10 except the link AB which has an
   IGP metric of 20 and the link AD which has a TE metric of 100.  Let
   us consider the min-metric(from A, to D, TE metric, margin 0).

      B---C
      |   |
      A---D

      Figure 4: Illustration used to describe path to SID conversion

   The solution path to this problem is ABCD.

   This path can be expressed in SIDs as <B, D> where B and D are the
   IGP prefix SIDs respectively associated with nodes B and D in the
   diagram.

   Indeed, from A, the IGP path to B is AB (IGP metric 20, better than
   ADCB of IGP metric 30).  From B, the IGP path to D is BCD (IGP metric
   20, better than BAD of IGP metric 30).

   While the details of the algorithm remain a local node behavior, a
   high-level description follows: start at the headend and find an IGP
   prefix SID that leads as far down the desired path as possible
   (without using any link not included in the desired path).  If no
   prefix SID exists, use the Adj SID to the first neighbor along the
   path.  Restart from the node that was reached.

4.  Candidate Path Selection

   An SR Policy may have multiple candidate paths that are provisioned
   or signaled [I-D.ietf-idr-segment-routing-te-policy]
   [I-D.ietf-pce-segment-routing] from one or more sources.  The tie-
   breaker rules defined in [I-D.ietf-spring-segment-routing-policy]
   result in the determination of a single "active path" in a formal
   definition.

   This section describes some examples of candidate path selection
   based on the same rules.

   Example 1:

   Consider headend H where two candidate paths of the same SR Policy
   are signaled via BGP [I-D.ietf-idr-segment-routing-te-policy] and
   whose respective NLRIs have the same route distinguishers:

   NLRI A with distinguisher = RD1, color = C, endpoint = N, preference
   P1.

   NLRI B with distinguisher = RD1, color = C, endpoint = N, preference
   P2.

   o  Because the NLRIs are identical (same distinguisher), BGP will
      perform bestpath selection.  Note that there are no changes to the
      BGP best path selection algorithm.

   o  H installs one advertisement as bestpath into the BGP table.

   o  A single advertisement is passed to the SR Policy instantiation
      process.

   o  The SRPM process does not perform any path selection.

   Note that the candidate path's preference value does not have any
   effect on the BGP bestpath selection process.

   Example 2:

   Consider headend H where two candidate paths of the same SR Policy
   are signaled via BGP and whose respective NLRIs have different route
   distinguishers:

   NLRI A with distinguisher = RD1, color = C, endpoint = N, preference
   P1.

   NLRI B with distinguisher = RD2, color = C, endpoint = N, preference
   P2.

   o  Because the NLRIs are different (different distinguisher), BGP
      will not perform bestpath selection.

   o  H installs both advertisements into the BGP table.

   o  Both advertisements are passed to the SR Policy instantiation
      process.

   o  The SRPM process at H selects the candidate path advertised by
      NLRI B as the active path for the SR policy since P2 is greater
      than P1.

   Note that the recommended approach is to use NLRIs with different
   distinguishers when several candidate paths for the same SR Policy
   (endpoint, color) are signaled via BGP to a headend.

   Example 3:

   Consider that a headend H learns two candidate paths of the same SR
   Policy, one signaled via BGP and another via Local configuration.

   NLRI A with distinguisher = RD1, color = C, endpoint = N, preference
   P1.

   Local "foo" with color = C, endpoint = N, preference P2.

   o  H installs NLRI A into the BGP table.

   o  NLRI A and "foo" are both passed to the SRPM process.

   o  The SRPM process at H selects the candidate path indicated by
      "foo" as the active path for the SR policy since P2 is greater
      than P1.

   Now, let us consider cases where an SR Policy has multiple valid
   candidate paths with the same best preference.  The SRPM process at a
   headend then uses the rules described in Section 2.9 of
   [I-D.ietf-spring-segment-routing-policy] to select the active path.
   This is explained in the following examples:

   Example 4:

   Consider headend H with two candidate paths of the same SR Policy and
   the same preference value, received from the same controller R, and
   where RD2 is higher than RD1.

   o  NLRI A with distinguisher RD1, color C, endpoint N, preference P1
      (selected as active path at time t0).

   o  NLRI B with distinguisher RD2 (RD2 is greater than RD1), color C,
      endpoint N, preference P1 (passed to the SR Policy instantiation
      process at time t1 > t0).

   After t1, the SRPM process at H selects the candidate path associated
   with NLRI B as the active path of the SR policy since RD2 is higher
   than RD1.

   Here the time when the headend receives the candidate path via BGP is
   not a factor in the selection.

   Note that, in such a scenario where there are redundant sessions to
   the same controller, the recommended approach is to use the same RD
   value for conveying the same candidate paths and let the BGP best
   path algorithm pick the best path.

   Example 5:

   Consider headend H with two candidate paths of the same SR Policy and
   the same preference value, both received from the same controller R,
   and where RD2 is higher than RD1.

   Consider also that headend H is configured to override the
   discriminator tiebreaker specified in Section 2.9 of
   [I-D.ietf-spring-segment-routing-policy].

   o  NLRI A with distinguisher RD1, color C, endpoint N, preference P1
      (selected as active path at time t0).

   o  NLRI B with distinguisher RD2, color C, endpoint N, preference P1
      (passed to the SR Policy instantiation process at time t1).

   Even after t1, the SRPM process at H retains the candidate path
   associated with NLRI A as the active path of the SR policy since the
   discriminator tiebreaker is disabled at H.

   Example 6:

   Consider headend H with two candidate paths of the same SR Policy and
   the same preference value.

   o  Local "foo" with color C, endpoint N, preference P1 (selected as
      active path at time t0).

   o  NLRI A with distinguisher RD1, color C, endpoint N, preference P1
      (passed to the SRPM process at time t1).

   Even after t1, the SRPM process at H retains the local candidate path
   "foo" as the active path of the SR policy since the Local protocol is
   preferred over BGP by default based on its higher protocol identifier
   value.
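
   The selection logic exercised by Examples 2 to 6 can be sketched in
   Python as shown below.  This is an illustration under stated
   assumptions, not the normative procedure of
   [I-D.ietf-spring-segment-routing-policy]: the CandidatePath type, the
   select_active function, and the numeric protocol-origin values are
   hypothetical (only the relative order "Local above BGP" comes from
   Example 6), and the BGP route distinguisher is modelled as the
   discriminator.

```python
from dataclasses import dataclass

# Hypothetical protocol-origin values; only their relative order
# (Local preferred over BGP) is taken from the text.
PROTOCOL_ORIGIN = {"BGP": 20, "LOCAL": 30}


@dataclass(frozen=True)
class CandidatePath:
    name: str
    origin: str          # "BGP" or "LOCAL"
    preference: int
    discriminator: int   # for BGP, modelled on the route distinguisher


def select_active(paths, current=None, use_discriminator=True):
    """Return the active path among the valid candidate paths."""
    if not paths:
        return None
    # 1. The highest preference wins.
    best_pref = max(p.preference for p in paths)
    tied = [p for p in paths if p.preference == best_pref]
    # 2. Then the highest protocol-origin value.
    best_origin = max(PROTOCOL_ORIGIN[p.origin] for p in tied)
    tied = [p for p in tied if PROTOCOL_ORIGIN[p.origin] == best_origin]
    # 3. Discriminator tiebreaker, unless overridden by configuration,
    #    in which case the currently active path is retained.
    if not use_discriminator and current in tied:
        return current
    return max(tied, key=lambda p: p.discriminator)


nlri_a = CandidatePath("NLRI A", "BGP", preference=100, discriminator=1)
nlri_b = CandidatePath("NLRI B", "BGP", preference=100, discriminator=2)
foo = CandidatePath("foo", "LOCAL", preference=100, discriminator=0)

print(select_active([nlri_a, nlri_b], current=nlri_a).name)   # NLRI B
print(select_active([nlri_a, nlri_b], current=nlri_a,
                    use_discriminator=False).name)            # NLRI A
print(select_active([foo, nlri_a], current=foo).name)         # foo
```

   The three calls mirror Examples 4, 5 and 6 respectively.  Arrival
   time never appears as an input, which matches the observation that
   the time of reception is not a factor in the selection.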

   Example 7:

   Consider headend H with two candidate paths of the same SR Policy and
   the same preference value, but received via NETCONF from two
   controllers R and S (where S > R).

   o  Path A from R with distinguisher D1, color C, endpoint N,
      preference P1 (selected as active path at time t0).

   o  Path B from S with distinguisher D2, color C, endpoint N,
      preference P1 (passed to the SRPM process at time t1).

   Note that the NETCONF process sends both paths to the SRPM process
   since it does not have any tiebreaker logic.  After t1, the SRPM
   process at H selects the candidate path associated with Path B as the
   active path of the SR policy.

5.  Distributed and/or Centralized Control Plane

5.1.  Distributed Control Plane within a single Link-State IGP area

   Consider a single-area IGP with per-link latency measurement and
   advertisement of the measured latency in the extended-TE IGP TLV.

   A head-end H is configured with a single dynamic candidate path for
   SR policy P with a low-latency optimization objective and endpoint E.

   The SRPM process at H learns the topology (and extended TE latency
   information) from the IGP and computes the solution SID list
   providing the low-latency path to E.

   No centralized controller is involved in such a deployment.

   The SR-DB at H only uses the Link-State DataBase (LSDB) provided by
   the IGP.

5.2.  Distributed Control Plane across several Link-State IGP areas

   Consider a domain D composed of two link-state IGP single-area
   instances (I1 and I2) where each sub-domain benefits from per-link
   latency measurement and advertisement of the measured latency in the
   related IGP.  The link-state information of each IGP is advertised
   via BGP-LS [RFC7752] towards a set of BGP-LS route reflectors (RR).
   H is a headend in the IGP I1 sub-domain and E is an endpoint in the
   IGP I2 sub-domain.

   Using a BGP-LS session to any BGP-LS RR, H's SRPM process may learn
   the link-state information of the remote domain I2.  H can thus
   compute the low-latency path from H to E as a solution SID list that
   spans the two domains I1 and I2.

   The SR-DB at H collects the LSDB from both sub-domains (I1 and I2).

   No centralized controller is required.

5.3.  Centralized Control Plane

   Considering the same domain D as in the previous section, let us now
   assume that H does not have a BGP-LS session to the BGP-LS RRs.
   Instead, let us assume a controller "C" has at least one BGP-LS
   session to the BGP-LS RRs.

   The controller C learns the topology and extended latency information
   from both sub-domains via BGP-LS.  It computes a low-latency path
   from H to E as a SID list and programs H with the related explicit
   candidate path.

   The headend H does not compute the solution SID list (it cannot).
   The headend only validates the received explicit candidate path.
   Most probably, the controller encodes the SIDs of the SID-List with
   Type-1.  In that case, the headend's validation simply consists in
   resolving the first SID on an outgoing interface and next-hop.

   The SR-DB at H only includes the LSDB provided by the IGP I1.

   The SR-DB of the controller collects the LSDB from both sub-domains
   (I1 and I2).

5.4.  Distributed and Centralized Control Plane

   Consider the same domain D as in the previous section.

   H's SRPM process is configured to associate color C1 with a low-
   latency optimization objective.

   H's BGP process is configured to steer a Route R/r of extended-color
   community C1 and of next-hop N via an SR policy (N, C1).

   Upon receiving a first BGP route of color C1 and of next-hop N, H
   recognizes the need for an SR Policy (N, C1) with a low-latency
   objective to N.  As N is outside the SR-DB of H, H requests a
   controller to compute such a SID list (e.g., via PCEP
   [I-D.ietf-pce-segment-routing]).

   This is an example of a hybrid control plane: the BGP distributed
   control plane signals the routes and their TE requirements.  Upon
   receiving these BGP routes, a local headend either computes the
   solution SID list itself (entirely distributed, when the endpoint is
   in the SR-DB of the headend) or delegates the computation to a
   controller (hybrid distributed/centralized control plane).

   The SR-DB at H only includes the LSDB provided by the IGP.

   The SR-DB of the controller collects the LSDB from both sub-domains.

6.  Binding SID Aspects

   The Binding SID (BSID) is fundamental to Segment Routing.  It
   provides scaling, network opacity and service independence.

   This section describes implementation and operational aspects related
   to the Binding SID.

6.1.  Benefits of Binding SID

   A simplified illustration is provided on the basis of Figure 5 where
   it is assumed that S, A, B, Data Center Interconnect DCI1 and DCI2
   share the same IGP-SR instance in the data-center 1 (DC1).  DCI1,
   DCI2, C, D, E, F, G, DCI3 and DCI4 share the same IGP-SR domain in
   the core.  DCI3, DCI4, H, K and Z share the same IGP-SR domain in the
   data-center 2 (DC2).

        A---DCI1----C----D----E----DCI3---H
       /     |                      |      \
      S      |                      |       Z
       \     |                      |      /
        B---DCI2----F---------G----DCI4---K
      <==DC1==><=========Core========><==DC2==>

                Figure 5: A Simple Datacenter Topology

   In this example, it is assumed that there is no redistribution
   between the IGPs and no presence of BGP-LU.  The inter-domain
   communication is only provided by SR through SR Policies.

   The latency from S to DCI1 equals the latency from S to DCI2.  The
   latency from Z to DCI3 equals the latency from Z to DCI4.  All the
   intra-DC links have the same IGP metric 10.

   The path DCI1, C, D, E, DCI3 has a lower latency and lower capacity
   than the path DCI2, F, G, DCI4.
608 The IGP metrics of all the core links are set to 10 except the links 609 D-E which is set to 100. 611 A low-latency multi-domain policy from S to Z may be expressed as 612 where: 614 o DCI1 is the prefix SID of DCI1. 616 o BSID is the Binding SID bound to an SR policy 617 instantiated at DCI1. 619 o Z is the prefix SID of Z. 621 Without the use of an intermediate core SR Policy (efficiently 622 summarized by a single BSID), S would need to steer its low-latency 623 flow into the policy . 625 The use of a BSID (and the intermediate bound SR Policy) decreases 626 the number of segments imposed by the source. 628 A BSID acts as a stable anchor point which isolates one domain from 629 the churn of another domain. Upon topology changes within the core 630 of the network, the low-latency path from DCI1 to DCI3 may change. 631 While the path of an intermediate policy changes, its BSID does not 632 change. Hence the policy used by the source does not change, hence 633 the source is shielded from the churn in another domain. 635 A BSID provides opacity and independence between domains. The 636 administrative authority of the core domain may not want to share 637 information about its topology. The use of a BSID allows keeping the 638 service opaque. S is not aware of the details of how the low-latency 639 service is provided by the core domain. S is not aware of the need 640 of the core authority to temporarily change the intermediate path. 642 6.2. Centralized Discovery of available BSID 644 This section explains how controllers can discover the local SIDs 645 available at a node N so as to pick an explicit BSID for a SR Policy 646 to be instantiated at headend N. 648 Any controller can discover the following properties of a node N 649 (e.g., via BGP-LS , NETCONF etc.): 651 o its local topology [RFC7752]. 653 o its topology-related SIDs (Prefix SIDs, Adj SID and EPE SID 654 [I-D.ietf-idr-bgp-ls-segment-routing-ext] 655 [I-D.ietf-idr-bgpls-segment-routing-epe]). 
657 o its Segment Routing Label Block (SRLB). 659 o its SR Policies and their BSID ([I-D.ietf-pce-segment-routing] 660 [I-D.sivabalan-pce-binding-label-sid] 661 [I-D.ietf-idr-te-lsp-distribution]). 663 Any controller can thus infer the available SIDs in the SRLB of any 664 node. 666 As an example, a controller discovers the following characteristics 667 of N: SRLB (4000, 8000), 3 Adj SIDs (4001, 4002, 4003), 2 EPE SIDs 668 (4004, 4005) and 3 SRTE policies (whose BSIDs are respectively 4006, 669 4007 and 4008). This controller can deduce that the SRLB sub-range 670 (4009, 5000) is free for allocation. 672 A controller is not restricted to use the next numerically available 673 SID in the available SRLB sub-range. It can pick any label in the 674 subset of available labels. This random pick make the chance for a 675 collision unlikely. 677 An operator could also sub-allocate the SRLB between different 678 controllers (e.g. (4000-4499) to controller 1 and (4500-5000) to 679 controller 2). 681 Inter-controller state-synchronization may be used to avoid/detect 682 collision in BSID. 684 All these techniques make the likelihood of a collision between 685 different controllers very unlikely. 687 In the unlikely case of a collision, the controllers will detect it 688 through system alerts, BGP-LS reporting using 689 [I-D.ietf-idr-te-lsp-distribution] or PCEP notification [RFC8231]. 690 They then have the choice to continue the operation of their SR 691 Policy with the dynamically allocated BSID or re-try with another 692 explicit pick. 694 Note: in deployments where PCE Protocol (PCEP) is used between head- 695 end and controller (PCE), a head-end can report BSID as well as 696 policy attributes (e.g., type of disjointness) and operational and 697 administrative states to controller. Similarly, a controller can 698 also assign/update the BSID of a policy via PCEP when instantiating 699 or updating SR Policy. 701 7. 
Flex-Algorithm Based SR Policies 703 SR allows for association of algorithms to Prefix SIDs 704 [I-D.ietf-spring-segment-routing]. [I-D.ietf-lsr-flex-algo] defines 705 the IGP based Flex-Algorithm solution which allows IGPs themselves to 706 compute constraint based paths over the network. Prefix SIDs for the 707 specific flex-algorithm and associated with a node are used in the 708 forwarding plane to steer along the specific constraint path to that 709 node. 711 As specified in [I-D.ietf-spring-segment-routing] these IGP Flex Algo 712 Prefix SIDs can be used as segments within SR Policies thereby 713 leveraging the underlying IGP Flex Algo solution. 715 1--RED--2-------6 716 | | | 717 4-------3--RED--9 719 Figure 6: Illustration for Flex-Alg SID 721 Now let us assume that 723 o 1, 2, 3 and 4 are part of IGP 1. 725 o 2, 6, 9 and 3 are part of IGP 2. 727 o All the IGP link costs are 10. 729 o Links 1to2 and 3to9 are colored with IGP Link Affinity Red. 731 o Flex-Alg1 is defined in both IGPs as: avoid red, minimize IGP 732 metric. 734 o All nodes of each IGP domain are enabled for FlexAlg1 736 o SID(k, 0) represents the PrefixSID of node k according to Alg=0. 738 o SID(k, FlexAlg1) represents the PrefixSID of node k according to 739 Flex-Alg1. 741 A controller can steer a flow from 1 to 9 through an end-to-end path 742 that avoids the RED links of both IGP domains thanks to the explicit 743 SR Policy . 745 8. Layer 2 and Optical Transport 747 1----2----3----4----5 748 I2(lambda L241)\ / I4(lambda L241) 749 Optical 751 Figure 7: SR Policy with integrated DWDM 753 An explicit candidate path can express a path through a transport 754 layer beneath IP (ATM, FR, DWDM). The transport layer could be ATM, 755 FR, DWDM, back-to-back Ethernet etc. The transport path is modelled 756 as a link between two IP nodes with the specific assumption that no 757 distributed IP routing protocol runs over the link. The link may 758 have IP address or be IP unnumbered. 
   Depending on the transport protocol, the link can be a physical DWDM
   interface and a lambda (integrated solution), an Ethernet interface
   and a VLAN, an ATM interface with a VPI/VCI, a FR interface with a
   DLCI etc.

   Using the DWDM integrated use-case of Figure 7 as an illustration,
   let us assume

   o  nodes 1, 2, 3, 4 and 5 are IP routers running an SR-enabled IGP on
      the links 1-2, 2-3, 3-4 and 4-5.

   o  the SRGB is homogeneous (16000, 24000).

   o  node K's prefix SID is 16000+K.

   o  node 2 has an integrated DWDM interface I2 with Lambda L1.

   o  node 4 has an integrated DWDM interface I4 with Lambda L2.

   o  the optical network is provisioned with a circuit from 2 to 4 with
      continuous lambda L241 (details outside the scope of this
      document).

   o  node 2 is provisioned with an SR policy with SID list
      and Binding SID B where I2(L241) is of type 5 (IPv4) or type 7
      (IPv6), see section 4.

   o  node 1 steers a packet P1 towards the prefix SID of node 5
      (16005).

   o  node 1 steers a packet P2 on the SR policy <16002, B, 16005>.

   In such a case, the journey of P1 will be 1-2-3-4-5 while the journey
   of P2 will be 1-2-lambda(L241)-4-5.  P2 skips the IP hop 3 and
   leverages the DWDM circuit from node 2 to node 4.  P1 follows the
   shortest path computed by the distributed routing protocol.  The path
   of P1 is unaltered by the addition, modification or deletion of
   optical bypass circuits.

   The salient point of this example is that the SR Policy architecture
   seamlessly supports explicit candidate paths through any transport
   sub-layer.

   BGP-LS Extensions to describe the sub-IP-layer characteristics of the
   SR Policy are out of scope of this document (e.g. in Figure 7, the
   DWDM characteristics of the SR Policy at node 2 in terms of latency,
   loss, security, domain/country traversed by the circuit etc.).
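
   The journeys of P1 and P2 in the Figure 7 example can be traced with
   a small simulation.  The following is only an illustrative sketch and
   not a description of any implementation: the numeric value chosen for
   the Binding SID B, the names BINDING_SIDS and journey(), and the
   modelling of the policy at node 2 as a single interface segment over
   the circuit with lambda L241 are all assumptions made for this
   example.  Penultimate-hop popping is ignored.

```python
# Sketch of the Figure 7 example. The SRGB base and prefix SIDs follow the
# text; the value of B and all Python names are illustrative assumptions.
SRGB_BASE = 16000
B = 4010  # assumed value; the text only calls the Binding SID "B"

# Binding SIDs installed per node: BSID -> (lambda, far-end node).
# Assumption: the SR policy at node 2 resolves to the interface segment
# I2(L241), i.e. the optical circuit from node 2 to node 4.
BINDING_SIDS = {2: {B: ("L241", 4)}}


def journey(ingress, sid_list):
    """Return the hops visited by a packet pushed with sid_list at ingress.

    The IP topology is the chain 1-2-3-4-5, so following the IGP shortest
    path toward a prefix SID is a single step along the chain per hop.
    """
    node, hops, stack = ingress, [str(ingress)], list(sid_list)
    while stack:
        top = stack[0]
        local_bsids = BINDING_SIDS.get(node, {})
        if top in local_bsids:
            # Binding SID of this node: pop it, forward over the circuit.
            stack.pop(0)
            lam, far_end = local_bsids[top]
            hops += [f"lambda({lam})", str(far_end)]
            node = far_end
        elif top - SRGB_BASE == node:
            # Prefix SID of this node: segment completed, pop it.
            stack.pop(0)
        else:
            # Prefix SID of another node: move one IP hop toward it.
            node += 1 if top - SRGB_BASE > node else -1
            if not 1 <= node <= 5:
                raise ValueError(f"SID {top} is not reachable on the chain")
            hops.append(str(node))
    return "-".join(hops)


print(journey(1, [16005]))            # P1: 1-2-3-4-5
print(journey(1, [16002, B, 16005]))  # P2: 1-2-lambda(L241)-4-5
```

   In this sketch P1 never consults the Binding SID table, which mirrors
   the observation that the path of P1 is unaltered by the addition or
   removal of optical bypass circuits.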

   Further details of the SR Policy use-case for Packet Optical networks
   are specified in [I-D.anand-spring-poi-sr].

9.  Security Considerations

   The security considerations for the Segment Routing architecture are
   described in [I-D.ietf-spring-segment-routing] and those for the SR
   Policy architecture are described in
   [I-D.ietf-spring-segment-routing-policy].  They apply to this
   document as well.

10.  IANA Considerations

   This document has no actions for IANA.

11.  Acknowledgement

   The authors would like to thank Tarek Saad, Dhanendra Jain and
   Muhammad Durrani for their valuable comments and suggestions.

12.  Contributors

   The following people have contributed to this document:

   Siva Sivabalan
   Cisco Systems
   Email: msiva@cisco.com

   Zafar Ali
   Cisco Systems
   Email: zali@cisco.com

   Jose Liste
   Cisco Systems
   Email: jliste@cisco.com

   Francois Clad
   Cisco Systems
   Email: fclad@cisco.com

   Kamran Raza
   Cisco Systems
   Email: skraza@cisco.com

   Shraddha Hegde
   Juniper Networks
   Email: shraddha@juniper.net

   Steven Lin
   Google, Inc.
   Email: stevenlin@google.com

   Alex Bogdanov
   Google, Inc.
   Email: bogdanov@google.com

   Daniel Voyer
   Bell Canada
   Email: daniel.voyer@bell.ca

   Dirk Steinberg
   Steinberg Consulting
   Email: dws@steinbergnet.net

   Bruno Decraene
   Orange Business Services
   Email: bruno.decraene@orange.com

   Stephane Litkowski
   Orange Business Services
   Email: stephane.litkowski@orange.com

   Luay Jalil
   Verizon
   Email: luay.jalil@verizon.com

13.  References

13.1.  Normative References

   [I-D.ietf-spring-segment-routing]
              Filsfils, C., Previdi, S., Ginsberg, L., Decraene, B.,
              Litkowski, S., and R. Shakir, "Segment Routing
              Architecture", draft-ietf-spring-segment-routing-15 (work
              in progress), January 2018.

   [I-D.ietf-spring-segment-routing-policy]
              Filsfils, C., Sivabalan, S., Voyer, D., Bogdanov, A., and
              P. Mattes, "Segment Routing Policy for Traffic
              Engineering",
              draft-ietf-spring-segment-routing-policy-00 (work in
              progress), June 2018.

13.2.  Informative References

   [I-D.anand-spring-poi-sr]
              Anand, M., Bardhan, S., Subrahmaniam, R., Tantsura, J.,
              Mukhopadhyaya, U., and C. Filsfils, "Packet-Optical
              Integration in Segment Routing",
              draft-anand-spring-poi-sr-05 (work in progress),
              February 2018.

   [I-D.filsfils-spring-srv6-network-programming]
              Filsfils, C., Li, Z., Leddy, J., Voyer, D., Bernier, D.,
              Steinberg, D., Raszuk, R., Matsushima, S., Lebrun, D.,
              Decraene, B., Peirens, B., Salsano, S., Naik, G.,
              Elmalky, H., Jonnalagadda, P., and M. Sharif, "SRv6
              Network Programming",
              draft-filsfils-spring-srv6-network-programming-04 (work
              in progress), March 2018.

   [I-D.ietf-idr-bgp-ls-segment-routing-ext]
              Previdi, S., Talaulikar, K., Filsfils, C., Gredler, H.,
              and M. Chen, "BGP Link-State extensions for Segment
              Routing", draft-ietf-idr-bgp-ls-segment-routing-ext-08
              (work in progress), May 2018.

   [I-D.ietf-idr-bgpls-segment-routing-epe]
              Previdi, S., Filsfils, C., Patel, K., Ray, S., and J.
              Dong, "BGP-LS extensions for Segment Routing BGP Egress
              Peer Engineering",
              draft-ietf-idr-bgpls-segment-routing-epe-15 (work in
              progress), March 2018.

   [I-D.ietf-idr-segment-routing-te-policy]
              Previdi, S., Filsfils, C., Jain, D., Mattes, P., Rosen,
              E., and S. Lin, "Advertising Segment Routing Policies in
              BGP", draft-ietf-idr-segment-routing-te-policy-03 (work in
              progress), May 2018.

   [I-D.ietf-idr-te-lsp-distribution]
              Previdi, S., Dong, J., Chen, M., Gredler, H., and J.
              Tantsura, "Distribution of Traffic Engineering (TE)
              Policies and State using BGP-LS",
              draft-ietf-idr-te-lsp-distribution-08 (work in progress),
              December 2017.

   [I-D.ietf-lsr-flex-algo]
              Psenak, P., Hegde, S., Filsfils, C., Talaulikar, K., and
              A. Gulko, "IGP Flexible Algorithm",
              draft-ietf-lsr-flex-algo-00 (work in progress), May 2018.

   [I-D.ietf-pce-segment-routing]
              Sivabalan, S., Filsfils, C., Tantsura, J., Henderickx, W.,
              and J. Hardwick, "PCEP Extensions for Segment Routing",
              draft-ietf-pce-segment-routing-11 (work in progress),
              November 2017.

   [I-D.ietf-spring-segment-routing-mpls]
              Bashandy, A., Filsfils, C., Previdi, S., Decraene, B.,
              Litkowski, S., and R. Shakir, "Segment Routing with MPLS
              data plane", draft-ietf-spring-segment-routing-mpls-13
              (work in progress), April 2018.

   [I-D.sivabalan-pce-binding-label-sid]
              Sivabalan, S., Tantsura, J., Filsfils, C., Previdi, S.,
              Hardwick, J., and D. Dhody, "Carrying Binding Label/
              Segment-ID in PCE-based Networks.",
              draft-sivabalan-pce-binding-label-sid-04 (work in
              progress), March 2018.

   [RFC1195]  Callon, R., "Use of OSI IS-IS for routing in TCP/IP and
              dual environments", RFC 1195, DOI 10.17487/RFC1195,
              December 1990.

   [RFC2328]  Moy, J., "OSPF Version 2", STD 54, RFC 2328,
              DOI 10.17487/RFC2328, April 1998.

   [RFC3630]  Katz, D., Kompella, K., and D. Yeung, "Traffic Engineering
              (TE) Extensions to OSPF Version 2", RFC 3630,
              DOI 10.17487/RFC3630, September 2003.

   [RFC5305]  Li, T. and H. Smit, "IS-IS Extensions for Traffic
              Engineering", RFC 5305, DOI 10.17487/RFC5305, October
              2008.

   [RFC5340]  Coltun, R., Ferguson, D., Moy, J., and A. Lindem, "OSPF
              for IPv6", RFC 5340, DOI 10.17487/RFC5340, July 2008.

   [RFC7471]  Giacalone, S., Ward, D., Drake, J., Atlas, A., and S.
              Previdi, "OSPF Traffic Engineering (TE) Metric
              Extensions", RFC 7471, DOI 10.17487/RFC7471, March 2015.

   [RFC7752]  Gredler, H., Ed., Medved, J., Previdi, S., Farrel, A., and
              S. Ray, "North-Bound Distribution of Link-State and
              Traffic Engineering (TE) Information Using BGP", RFC 7752,
              DOI 10.17487/RFC7752, March 2016.

   [RFC7810]  Previdi, S., Ed., Giacalone, S., Ward, D., Drake, J., and
              Q. Wu, "IS-IS Traffic Engineering (TE) Metric Extensions",
              RFC 7810, DOI 10.17487/RFC7810, May 2016.

   [RFC8231]  Crabbe, E., Minei, I., Medved, J., and R. Varga, "Path
              Computation Element Communication Protocol (PCEP)
              Extensions for Stateful PCE", RFC 8231,
              DOI 10.17487/RFC8231, September 2017.

Authors' Addresses

   Clarence Filsfils
   Cisco Systems, Inc.
   Pegasus Parc
   De kleetlaan 6a, DIEGEM BRABANT 1831
   BELGIUM

   Email: cfilsfil@cisco.com

   Ketan Talaulikar (editor)
   Cisco Systems, Inc.

   Email: ketant@cisco.com

   Przemyslaw Krol
   Google, Inc.

   Email: pkrol@google.com

   Martin Horneffer
   Deutsche Telekom

   Email: martin.horneffer@telekom.de

   Paul Mattes
   Microsoft
   One Microsoft Way
   Redmond, WA 98052-6399
   USA

   Email: pamattes@microsoft.com