2	SPRING Working Group                                         C. Filsfils
3	Internet-Draft                                        K. Talaulikar, Ed.
4	Intended status: Informational                       Cisco Systems, Inc.
5	Expires: October 6, 2021                                         P. Krol
6	                                                            Google, Inc.
7	                                                            M. Horneffer
8	                                                        Deutsche Telekom
9	                                                               P. Mattes
10	                                                               Microsoft
11	                                                           April 4, 2021

13	         SR Policy Implementation and Deployment Considerations
14	           draft-filsfils-spring-sr-policy-considerations-07

16	Abstract

18	   Segment Routing (SR) allows a headend node to steer a packet flow
19	   along any path.  Intermediate per-flow states are eliminated thanks
20	   to source routing.  The SR Policy framework enables the instantiation
21	   and management of the necessary state on the headend node for flows
22	   along source-routed paths, using an ordered list of segments
23	   associated with their specific SR Policies.  This document describes
24	   some of the implementation and deployment aspects that are useful for
25	   operationalizing the SR Policy architecture.

27	Status of This Memo

29	   This Internet-Draft is submitted in full conformance with the
30	   provisions of BCP 78 and BCP 79.

32	   Internet-Drafts are working documents of the Internet Engineering
33	   Task Force (IETF).  Note that other groups may also distribute
34	   working documents as Internet-Drafts.  The list of current Internet-
35	   Drafts is at https://datatracker.ietf.org/drafts/current/.

37	   Internet-Drafts are draft documents valid for a maximum of six months
38	   and may be updated, replaced, or obsoleted by other documents at any
39	   time.  It is inappropriate to use Internet-Drafts as reference
40	   material or to cite them other than as "work in progress."

42	   This Internet-Draft will expire on October 6, 2021.

44	Copyright Notice

46	   Copyright (c) 2021 IETF Trust and the persons identified as the
47	   document authors.  All rights reserved.

49	   This document is subject to BCP 78 and the IETF Trust's Legal
50	   Provisions Relating to IETF Documents
51	   (https://trustee.ietf.org/license-info) in effect on the date of
52	   publication of this document.  Please review these documents
53	   carefully, as they describe your rights and restrictions with respect
54	   to this document.  Code Components extracted from this document must
55	   include Simplified BSD License text as described in Section 4.e of
56	   the Trust Legal Provisions and are provided without warranty as
57	   described in the Simplified BSD License.

59	Table of Contents

61	   1.  Introduction
62	   2.  SR Policy Headend Architecture
63	   3.  Dynamic Path Computation
64	     3.1.  Optimization Objective
65	     3.2.  Constraints
66	     3.3.  SR Native Algorithm
67	     3.4.  Path to SID
68	   4.  Candidate Path Selection
69	   5.  Distributed and/or Centralized Control Plane
70	     5.1.  Distributed Control Plane within a single Link-State IGP
71	           area
72	     5.2.  Distributed Control Plane across several Link-State IGP
73	           areas
74	     5.3.  Centralized Control Plane
75	     5.4.  Distributed and Centralized Control Plane
76	   6.  Binding SID Aspects
77	     6.1.  Benefits of Binding SID
78	     6.2.  Centralized Discovery of available BSID
79	   7.  Flex-Algorithm Based SR Policies
80	   8.  Layer 2 and Optical Transport
81	   9.  Security Considerations
82	   10. IANA Considerations
83	   11. Acknowledgement
84	   12. Contributors
85	   13. References
86	     13.1.  Normative References
87	     13.2.  Informative References
88	   Authors' Addresses

90	1.  Introduction

92	   Segment Routing (SR) allows a headend node to steer a packet flow
93	   along any path.  Intermediate per-flow states are eliminated with
94	   source routing [RFC8402].

96	   The headend node steers a flow into a Segment Routing Policy (SR
97	   Policy) by augmenting packet headers with the ordered list of
98	   segments associated with that SR Policy.
99	   [I-D.ietf-spring-segment-routing-policy] defines the SR Policy
100	   architecture and details the concepts of SR Policy and steering into
101	   an SR Policy.

103	   This document describes some implementation aspects of the SR Policy
104	   framework, which should be considered suggestions.  The same
105	   behavior, as defined in [I-D.ietf-spring-segment-routing-policy], may
106	   in fact be realized with alternative approaches.  The deployment
107	   aspects described in this document are likewise meant to serve only
108	   as guidelines.  This document describes these aspects and other
109	   considerations related to SR Policy concepts because they are
110	   important for facilitating multi-vendor interoperable deployments of
111	   various SR Policy use cases.

113	   These apply equally to the MPLS [RFC8660] and SRv6 [RFC8986]
114	   instantiations of segment routing.

116	   For reading simplicity, the illustrations are provided for the MPLS
117	   instantiation.

119	2.  SR Policy Headend Architecture

121	   This section provides a conceptual overview of components (or
122	   functions) that interact to implement SR Policy on a headend.

124	                   +--------+  +--------+
125	                   |  BGP   |  |  PCEP  |
126	                   +--------+  +--------+
127	                            \ /
128	              +--------+  +----------+  +--------+
129	              |        |  |    SR    |  |        |
130	              |  CLI   |--|  Policy  |--| NETCONF|
131	              |        |  |          |  |        |
132	              +--------+  +----------+  +--------+
133	                              |
134	                           +--------+
135	                           |  FIB   |
136	                           +--------+

138	               Figure 1: SR Policy Architecture at a Headend

140	   The SR Policy functionality at a headend can be implemented in an SR
141	   Policy (SRP) process as illustrated in Figure 1.

143	   The SRP process interacts with other processes to learn candidate
144	   paths.

146	   The SRP process selects the active path of an SR Policy.

148	   The SRP process interacts with the RIB/FIB process to install an
149	   active SR Policy in the dataplane.

151	   In order to validate explicit candidate paths and compute dynamic
152	   candidate paths, the SRP process maintains an SR Database (SR-DB) as
153	   specified in [I-D.ietf-spring-segment-routing-policy].  The SRP
154	   process interacts with other processes as shown in Figure 2 to
155	   collect the SR-DB information.

157	                 +--------+  +--------+  +--------+
158	                 | BGP SR |  | BGP-LS |  |  IGP   |
159	                 | Policy |  +--------+  +--------+
160	                 +--------+ \    |       /
161	               +--------+   +-----------+  +--------+
162	               |   PCEP |---|    SRP    |--| NETCONF|
163	               +--------+   +-----------+  +--------+

165	            Figure 2: Topology/link-state database architecture

167	   The SR Policy architecture supports both centralized and distributed
168	   control planes.

170	3.  Dynamic Path Computation

172	   A dynamic candidate path for SR Policy is specified as an
173	   optimization objective and constraints and needs to be computed by
174	   either the headend or a Path Computation Element (PCE).  The
175	   distributed or centralized computation aspect is described further in
176	   Section 5.  This section describes the computation aspects of a
177	   dynamic path.

179	3.1.  Optimization Objective

181	   This document describes two optimization objectives:

183	   o  Min-Metric - requests computation of a solution Segment-List
184	      optimized for a selected metric.

186	   o  Min-Metric with margin and maximum number of SIDs - Min-Metric
187	      with two changes: a margin by which two paths with similar
188	      metrics are considered equal, and a constraint on the maximum
189	      number of SIDs in the Segment-List.

191	   The "Min-Metric" optimization objective requests to compute a
192	   solution Segment-List such that packets flowing through the solution
193	   Segment-List use ECMP-aware paths optimized for the selected metric.
194	   The "Min-Metric" objective can be instantiated for the IGP metric
195	   ([RFC1195] [RFC2328] [RFC5340]) xor the TE metric ([RFC5305]
196	   [RFC3630]) xor the latency extended TE metric ([RFC8570] [RFC7471]).
197	   This metric is called the O metric (the optimized metric) to
198	   distinguish it from the IGP metric.  The solution Segment-List must
199	   be computed to minimize the number of SIDs and the number of Segment-
200	   Lists.

202	   If the selected O metric is the IGP metric and the headend and
203	   tailend are in the same IGP domain, then the solution Segment-List is
204	   made of the single prefix-SID of the tailend.

206	   When the selected O metric is not the IGP metric, then the solution
207	   Segment-List is made of prefix SIDs of intermediate nodes, Adjacency
208	   SIDs along intermediate links and potentially Binding SIDs (BSIDs) of
209	   intermediate policies.

211	   In many deployments there are insignificant metric differences
212	   between mostly equal paths (e.g. a difference of 100 usec of latency
213	   between two paths from NYC to SFO would not matter in most cases).
214	   The "Min-Metric with margin" objective supports such a requirement.

216	   The "Min-Metric with margin and maximum number of SIDs" optimization
217	   objective requests to compute a solution Segment-List such that
218	   packets flowing through the solution Segment-List do not use a path
219	   whose cumulative O metric is larger than the shortest-path O metric +
220	   margin.

222	   If this is not possible because of the number of SIDs constraint,
223	   then one option is that the solution Segment-List minimizes the O
224	   metric while meeting the maximum number of SIDs constraint (i.e., the
225	   path with the least O metric that uses no more than the specified
226	   number of SIDs).  The other option, which is the default, is to not
227	   come up with a solution unless the desired SLA is guaranteed.
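
   As an illustration only, the choice between these two options can be
   sketched in Python.  The Candidate type, the function name and the
   best_effort flag are hypothetical names introduced for this sketch,
   and the preference for fewer SIDs among acceptable Segment-Lists is
   an assumption; an actual computation remains a local matter.

```python
from typing import NamedTuple, Optional, Sequence


class Candidate(NamedTuple):
    """A hypothetical candidate Segment-List produced by a path computation."""
    name: str
    worst_metric: int  # largest cumulative O metric over its ECMP paths
    num_sids: int      # number of SIDs in the Segment-List


def pick_segment_list(candidates: Sequence[Candidate], shortest_metric: int,
                      margin: int, max_sids: int,
                      best_effort: bool = False) -> Optional[Candidate]:
    # Keep only Segment-Lists that respect the maximum number of SIDs.
    feasible = [c for c in candidates if c.num_sids <= max_sids]
    # Among those, keep the ones whose paths stay within shortest + margin.
    within = [c for c in feasible
              if c.worst_metric <= shortest_metric + margin]
    if within:
        # Assumption: prefer fewer SIDs, then the lower O metric.
        return min(within, key=lambda c: (c.num_sids, c.worst_metric))
    if best_effort and feasible:
        # First option: least O metric while meeting the SID limit.
        return min(feasible, key=lambda c: c.worst_metric)
    # Other option: no solution unless the objective is met.
    return None
```

   With a shortest-path O metric of 100 and a margin of 10, a two-SID
   candidate of metric 105 is accepted, whereas a one-SID candidate of
   metric 130 is returned only when best_effort is set.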

229	   Section 7 describes another approach for computing a solution
230	   Segment-List consisting of a single segment when the O metric is not
231	   the IGP metric by using the Flex Algorithm Prefix-SID of the tailend.

233	3.2.  Constraints

235	   The following constraints can be described:

237	   o  Inclusion and/or exclusion of TE affinity.

239	   o  Inclusion and/or exclusion of IP address.

241	   o  Inclusion and/or exclusion of SRLG.

243	   o  Inclusion and/or exclusion of admin-tag.

245	   o  Maximum accumulated metric (IGP, TE and latency).

247	   o  Maximum number of SIDs in the solution Segment-List.

249	   o  Maximum number of weighted Segment-Lists in the solution set.

251	   o  Diversity to another service instance (e.g., link, node, or SRLG
252	      disjoint paths originating from different head-ends).

254	3.3.  SR Native Algorithm

256	         1----------------2----------------3
257	        |\                               /
258	        | \                             /
259	        |  4-------------5-------------7
260	        |   \                         /|
261	        |    +-----------6-----------+ |
262	        8------------------------------9

264	        Figure 3: Illustration used to describe SR native algorithm

266	   Let us assume that all the links have the same IGP metric of 10 and
267	   let us consider the dynamic path defined as: Min-Metric(from 1, to 3,
268	   IGP metric, margin 0) with constraint "avoid link 2-to-3".

270	   A classical circuit implementation would do: prune the graph, compute
271	   the shortest-path, pick a single non-ECMP branch of the ECMP-aware
272	   shortest-path and encode it as a Segment-List.  The solution Segment-
273	   List would be <4, 5, 7, 3>.

275	   An SR-native algorithm would find a Segment-List that minimizes the
276	   number of SIDs and maximizes the use of all the ECMP branches along
277	   the ECMP shortest path.  In this illustration, the solution Segment-
278	   List would be <7, 3>.

280	   In the vast majority of SR use-cases, SR-native algorithms should be
281	   preferred: they preserve the native ECMP of IP and they minimize the
282	   dataplane header overhead.

284	   In some specific use cases (e.g., TDM migration over IP where the
285	   circuit notion prevails), one may prefer a classic circuit
286	   computation followed by an encoding into SIDs (potentially only using
287	   non-protected Adj SIDs that pin the path to specific links and avoid
288	   ECMP to reflect the TDM paradigm).

290	   SR-native algorithms are a local node behavior and are thus outside
291	   the scope of this document.

293	3.4.  Path to SID

295	   Let us assume the below diagram where all the links have an IGP
296	   metric of 10 and a TE metric of 10 except the link AB which has an
297	   IGP metric of 20 and the link AD which has a TE metric of 100.  Let
298	   us consider the min-metric(from A, to D, TE metric, margin 0).

300	            B---C
301	            |   |
302	            A---D

304	      Figure 4: Illustration used to describe path to SID conversion

306	   The solution path to this problem is ABCD.

308	   This path can be expressed in SIDs as <B, D> where B and D are the
309	   IGP prefix SIDs respectively associated with nodes B and D in the
310	   diagram.

312	   Indeed, from A, the IGP path to B is AB (IGP metric 20 better than
313	   ADCB of IGP metric 30).  From B, the IGP path to D is BCD (IGP metric
314	   20 better than BAD of IGP metric 30).

316	   While the details of the algorithm remain a local node behavior, a
317	   high-level description follows: start at the headend and find an IGP
318	   prefix SID that leads as far down the desired path as possible
319	   (without using any link not included in the desired path).
320	   If no prefix SID exists, use the Adj SID to the first neighbor along
321	   the path.  Restart from the node that was reached.
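
   The high-level description above can be sketched in Python as
   follows.  This is a minimal illustration rather than a reference
   algorithm: the function names and the graph encoding (a dictionary
   of per-node neighbor metrics) are assumptions made for the sketch,
   and every node is assumed to advertise a prefix SID.

```python
import heapq

INF = float("inf")


def dijkstra(graph, src):
    """IGP shortest distances from src; graph[u][v] is the metric of link u-v."""
    dist = {src: 0}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            if d + w < dist.get(v, INF):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


def links_on_shortest_paths(graph, dist, s, t):
    """Directed links used by at least one IGP shortest path from s to t."""
    return {(u, v) for u in graph for v, w in graph[u].items()
            if dist[s].get(u, INF) + w + dist[v].get(t, INF) == dist[s][t]}


def path_to_sids(graph, path):
    """Encode the desired path as prefix SIDs, falling back to Adj SIDs."""
    dist = {n: dijkstra(graph, n) for n in graph}
    sids, i = [], 0
    while i < len(path) - 1:
        # Try the farthest node first, then progressively closer ones.
        for j in range(len(path) - 1, i, -1):
            wanted = set(zip(path[i:j + 1], path[i + 1:j + 1]))
            used = links_on_shortest_paths(graph, dist, path[i], path[j])
            if used <= wanted:  # no link outside the desired path
                sids.append(("prefix", path[j]))
                i = j
                break
        else:
            # No prefix SID works: pin the first link with an Adj SID.
            sids.append(("adj", path[i], path[i + 1]))
            i += 1
    return sids


# IGP metrics of Figure 4: all links 10 except AB, which is 20.
FIG4 = {"A": {"B": 20, "D": 10}, "B": {"A": 20, "C": 10},
        "C": {"B": 10, "D": 10}, "D": {"C": 10, "A": 10}}
print(path_to_sids(FIG4, ["A", "B", "C", "D"]))
```

   For the topology of Figure 4 and the desired path ABCD, the sketch
   yields the prefix SIDs of B and D, matching the <B, D> Segment-List
   discussed above.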

323	4.  Candidate Path Selection

325	   An SR Policy may have multiple candidate paths that are provisioned
326	   or signaled [I-D.ietf-idr-segment-routing-te-policy] [RFC8664] from
327	   one or more sources.  The tie-breaker rules defined in
328	   [I-D.ietf-spring-segment-routing-policy] formally determine a single
329	   "active path".

331	   This section describes some examples of candidate path selection
332	   based on the same rules.

334	   Example 1:

336	   Consider headend H where two candidate paths of the same SR Policy
337	   <color, endpoint> are signaled via BGP
338	   [I-D.ietf-idr-segment-routing-te-policy] and whose respective NLRIs
339	   have the same route distinguishers:

341	   NLRI A with distinguisher = RD1, color = C, endpoint = N, preference
342	   P1.

344	   NLRI B with distinguisher = RD1, color = C, endpoint = N, preference
345	   P2.

347	   o  Because the NLRIs are identical (same distinguisher), BGP will
348	      perform bestpath selection.  Note that there are no changes to the
349	      BGP best path selection algorithm.

351	   o  H installs one advertisement as bestpath into the BGP table.

353	   o  A single advertisement is passed to the SR Policy instantiation
354	      process.

356	   o  The SRP process does not perform any path selection.

358	   Note that the candidate path's preference value does not have any
359	   effect on the BGP bestpath selection process.

361	   Example 2:

363	   Consider headend H where two candidate paths of the same SR Policy
364	   <color, endpoint> are signaled via BGP and whose respective NLRIs
365	   have different route distinguishers:

367	   NLRI A with distinguisher = RD1, color = C, endpoint = N, preference
368	   P1.

370	   NLRI B with distinguisher = RD2, color = C, endpoint = N, preference
371	   P2.

373	   o  Because the NLRIs are different (different distinguisher), BGP
374	      will not perform bestpath selection.

376	   o  H installs both advertisements into the BGP table.

378	   o  Both advertisements are passed to the SR Policy instantiation
379	      process.

381	   o  SRP process at H selects the candidate path advertised by NLRI B
382	      as the active path for the SR policy since P2 is greater than P1.

384	   Note that the recommended approach is to use NLRIs with different
385	   distinguishers when several candidate paths for the same SR Policy
386	   (color, endpoint) are signaled via BGP to a headend.

388	   Example 3:

390	   Consider that a headend H learns two candidate paths of the same SR
391	   Policy <color, endpoint> one signaled via BGP and another via Local
392	   configuration.

394	   NLRI A with distinguisher = RD1, color = C, endpoint = N, preference
395	   P1.

397	   Local "foo" with color = C, endpoint = N, preference P2.

399	   o  H installs NLRI A into the BGP table.

401	   o  NLRI A and "foo" are both passed to the SRP process.

403	   o  SRP process at H selects the candidate path indicated by "foo" as
404	      the active path for the SR policy since P2 is greater than P1.

406	   Now, let us consider cases where an SR Policy has multiple valid
407	   candidate paths with the same best preference.  In such cases, the
408	   SRP process at a headend uses the rules described in Section 2.9 of
409	   [I-D.ietf-spring-segment-routing-policy] to select the active path.
410	   This is explained in the following examples:

412	   Example 4:

414	   Consider headend H with two candidate paths of the same SR Policy
415	   <color, endpoint> and the same preference value received from the
416	   same controller R and where RD2 is higher than RD1.

418	   o  NLRI A with distinguisher RD1, color C, endpoint N, preference P1
419	      (selected as active path at time t0).

421	   o  NLRI B with distinguisher RD2 (RD2 is greater than RD1), color C,
422	      endpoint N, preference P1 (passed to SR Policy instantiation
423	      process at time t1 > t0).

425	   After t1, SRP process at H selects candidate path associated with
426	   NLRI B as active path of the SR policy since RD2 is higher than RD1.
427	   Here the time when the headend receives the candidate path via BGP is
428	   not a factor in the selection.

430	   Note that, in such a scenario where there are redundant sessions to
431	   the same controller, the recommended approach is to use the same RD
432	   value for conveying the same candidate paths and let the BGP best
433	   path algorithm pick the best path.

435	   Example 5:

437	   Consider headend H with two candidate paths of the same SR Policy
438	   <color, endpoint> and the same preference value both received from
439	   the same controller R and where RD2 is higher than RD1.

441	   Consider also that headend H is configured to override the
442	   discriminator tiebreaker specified in Section 2.9 of
443	   [I-D.ietf-spring-segment-routing-policy].

445	   o  NLRI A with distinguisher RD1, color C, endpoint N, preference P1
446	      (selected as active path at time t0).

448	   o  NLRI B with distinguisher RD2, color C, endpoint N, preference P1
449	      (passed to SR Policy instantiation process at time t1).

451	   Even after t1, SRP process at H retains candidate path associated
452	   with NLRI A as active path of the SR policy since the discriminator
453	   tiebreaker is disabled at H.

455	   Example 6:

457	   Consider headend H with two candidate paths of the same SR Policy
458	   <color, endpoint> and the same preference value.

460	   o  Local "foo" with color C, endpoint N, preference P1 (selected as
461	      active path at time t0).

463	   o  NLRI A with distinguisher RD1, color C, endpoint N, preference P1
464	      (passed to SRP process at time t1).

466	   Even after t1, SRP process at H retains candidate path associated
467	   with local candidate path "foo" as active path of the SR policy since
468	   the Local protocol is preferred over BGP by default based on its
469	   higher protocol identifier value.

471	   Example 7:

473	   Consider headend H with two candidate paths of the same SR Policy
474	   <color, endpoint> and the same preference value but received via
475	   NETCONF from two controllers R and S (where S > R).

477	   o  Path A from R with distinguisher D1, color C, endpoint N,
478	      preference P1 (selected as active path at time t0).

480	   o  Path B from S with distinguisher D2, color C, endpoint N,
481	      preference P1 (passed to SRP process at time t1).

483	   Note that the NETCONF process sends both paths to the SRP process
484	   since it does not have any tiebreaker logic.  After t1, SRP process
485	   at H selects candidate path associated with Path B as active path of
486	   the SR policy.
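
   The selections made in Examples 2 to 6 can be reproduced with the
   following Python sketch.  It is a simplification under stated
   assumptions: the CandidatePath type, the select_active function and
   the use_discriminator flag are hypothetical names, the numeric
   protocol identifier values are illustrative (only the ordering of
   Local above BGP is taken from Example 6), and the originator
   comparison used in Example 7 is deliberately left out.

```python
from dataclasses import dataclass

# Illustrative protocol identifier values (assumption); only the ordering
# "Local is higher than BGP" comes from Example 6.
PROTOCOL_ORIGIN = {"bgp": 20, "local": 30}


@dataclass(frozen=True)
class CandidatePath:
    name: str
    protocol: str        # "bgp" or "local"
    preference: int
    discriminator: int = 0


def select_active(paths, current=None, use_discriminator=True):
    """Pick the active path among valid candidate paths of one SR Policy."""
    # 1. The highest preference wins (Examples 2 and 3).
    best_pref = max(p.preference for p in paths)
    tied = [p for p in paths if p.preference == best_pref]
    # 2. The highest protocol identifier wins (Example 6).
    best_origin = max(PROTOCOL_ORIGIN[p.protocol] for p in tied)
    tied = [p for p in tied if PROTOCOL_ORIGIN[p.protocol] == best_origin]
    if len(tied) == 1:
        return tied[0]
    # 3. With the discriminator tiebreaker overridden, keep the path that
    #    is already active (Example 5).
    if not use_discriminator and current in tied:
        return current
    # 4. Otherwise the highest discriminator wins (Example 4).
    return max(tied, key=lambda p: p.discriminator)


a = CandidatePath("NLRI A", "bgp", preference=100, discriminator=1)
b = CandidatePath("NLRI B", "bgp", preference=100, discriminator=2)
print(select_active([a, b], current=a).name)                           # Example 4
print(select_active([a, b], current=a, use_discriminator=False).name)  # Example 5
```

   The first call selects NLRI B and the second retains NLRI A, in line
   with Examples 4 and 5.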

488	5.  Distributed and/or Centralized Control Plane

490	5.1.  Distributed Control Plane within a single Link-State IGP area

492	   Consider a single-area IGP with per-link latency measurement and
493	   advertisement of the measured latency in the extended-TE IGP TLV.

495	   A head-end H is configured with a single dynamic candidate path for
496	   SR policy P with a low-latency optimization objective and endpoint E.

498	   Clearly the SRP process at H learns the topology (and extended TE
499	   latency information) from the IGP and computes the solution Segment-
500	   List providing the low-latency path to E.

502	   No centralized controller is involved in such a deployment.

504	   The SR-DB at H only uses the Link-State Database (LSDB) provided by
505	   the IGP.

507	5.2.  Distributed Control Plane across several Link-State IGP areas

509	   Consider a domain D composed of two link-state IGP single-area
510	   instances (I1 and I2) where each sub-domain benefits from per-link
511	   latency measurement and advertisement of the measured latency in the
512	   related IGP.  The link-state information of each IGP is advertised
513	   via BGP-LS [RFC7752] towards a set of BGP-LS route reflectors (RR).

515	   H is a headend in IGP I1 sub-domain and E is an endpoint in IGP I2
516	   sub-domain.

518	   Using a BGP-LS session to any BGP-LS RR, H's SRP process may learn
519	   the link-state information of the remote domain I2.  H can thus
520	   compute the low-latency path from H to E as a solution Segment-List
521	   that spans the two domains I1 and I2.

523	   The SR-DB at H collects the LSDB from both sub-domains (I1 and I2).

525	   No centralized controller is required.

527	5.3.  Centralized Control Plane

529	   Considering the same domain D as in the previous section, let us now
530	   assume that H does not have a BGP-LS session to the BGP-LS RRs.
531	   Instead, let us assume a controller "C" has at least one BGP-LS
532	   session to the BGP-LS RRs.

534	   The controller C learns the topology and extended latency information
535	   from both sub-domains via BGP-LS.  It computes a low-latency path
536	   from H to E as a Segment-List <S1, S2, S3> and programs H with the
537	   related explicit candidate path.

539	   The headend H does not compute the solution Segment-List (it cannot).
540	   The headend only validates the received explicit candidate path.
541	   Most probably, the controller encodes the SIDs of the Segment-List
542	   with Type-1.  In that case, the headend's validation simply consists
543	   of resolving the first SID on an outgoing interface and next-hop.

545	   The SR-DB at H only includes the LSDB provided by the IGP I1.

547	   The SR-DB of the controller collects the LSDB from both sub-domains
548	   (I1 and I2).

550	5.4.  Distributed and Centralized Control Plane

552	   Consider the same domain D as in the previous section.

554	   H's SRP process is configured to associate color C1 with a low-
555	   latency optimization objective.

557	   H's BGP process is configured to steer a Route R/r of extended-color
558	   community C1 and of next-hop N via an SR policy (N, C1).

560	   Upon receiving a first BGP route of color C1 and of next-hop N, H
561	   recognizes the need for an SR Policy (N, C1) with a low-latency
562	   objective to N.  As N is outside the SR-DB of H, H requests a
563	   controller to compute such a Segment-List (e.g., via PCEP [RFC8664]).

565	   This is an example of a hybrid control plane: the BGP distributed
566	   control plane signals the routes and their TE requirements.  Upon
567	   receiving these BGP routes, a local headend either computes the
568	   solution Segment-List itself (entirely distributed, when the endpoint
569	   is in the SR-DB of the headend) or delegates the computation to a
570	   controller (hybrid distributed/centralized control plane).

572	   The SR-DB at H only includes the LSDB provided by the IGP.

574	   The SR-DB of the controller collects the LSDB from both sub-domains.

576	6.  Binding SID Aspects

578	   The Binding SID (BSID) is fundamental to Segment Routing.  It
579	   provides scaling, network opacity and service independence.

581	   This section describes implementation and operational aspects related
582	   to the Binding SID.

584	6.1.  Benefits of Binding SID

586	   A simplified illustration is provided on the basis of Figure 5 where
587	   it is assumed that S, A, B, Data Center Interconnect DCI1 and DCI2
588	   share the same IGP-SR instance in the data-center 1 (DC1).  DCI1,
589	   DCI2, C, D, E, F, G, DCI3 and DCI4 share the same IGP-SR domain in
590	   the core.  DCI3, DCI4, H, K and Z share the same IGP-SR domain in the
591	   data-center 2 (DC2).

593	             A---DCI1----C----D----E----DCI3---H
594	            /            |         |            \
595	           S             |         |             Z
596	            \            |         |            /
597	             B---DCI2----F---------G----DCI4---K
598	          <==DC1==><=========Core========><==DC2==>

600	                  Figure 5: A Simple Datacenter Topology

602	   In this example, it is assumed that there is no redistribution
603	   between the IGPs and that BGP-LU is not present.  The inter-domain
604	   communication is only provided by SR through SR Policies.

606	   The latency from S to DCI1 equals the latency from S to DCI2.  The
607	   latency from Z to DCI3 equals the latency from Z to DCI4.  All the
608	   intra-DC links have the same IGP metric of 10.

610	   The path DCI1, C, D, E, DCI3 has a lower latency and lower capacity
611	   than the path DCI2, F, G, DCI4.

613	   The IGP metrics of all the core links are set to 10 except the link
614	   D-E, which is set to 100.

616	   A low-latency multi-domain policy from S to Z may be expressed as
617	   <DCI1, BSID, Z> where:

619	   o  DCI1 is the prefix SID of DCI1.

621	   o  BSID is the Binding SID bound to an SR policy <D, D2E, DCI3>
622	      instantiated at DCI1.

624	   o  Z is the prefix SID of Z.

626	   Without the use of an intermediate core SR Policy (efficiently
627	   summarized by a single BSID), S would need to steer its low-latency
628	   flow into the policy <DCI1, D, D2E, DCI3, Z>.

630	   The use of a BSID (and the intermediate bound SR Policy) decreases
631	   the number of segments imposed by the source.

633	   A BSID acts as a stable anchor point which isolates one domain from
634	   the churn of another domain.  Upon topology changes within the core
635	   of the network, the low-latency path from DCI1 to DCI3 may change.
636	   While the path of an intermediate policy changes, its BSID does not
637	   change.  Hence the policy used by the source does not change, hence
638	   the source is shielded from the churn in another domain.

640	   A BSID provides opacity and independence between domains.  The
641	   administrative authority of the core domain may not want to share
642	   information about its topology.  The use of a BSID allows keeping the
643	   service opaque.  S is not aware of the details of how the low-latency
644	   service is provided by the core domain.  S is not aware of the need
645	   of the core authority to temporarily change the intermediate path.

647	6.2.  Centralized Discovery of available BSID

649	   This section explains how controllers can discover the local SIDs
650	   available at a node N so as to pick an explicit BSID for an SR Policy
651	   to be instantiated at headend N.

653	   Any controller can discover the following properties of a node N
654	   (e.g., via BGP-LS, NETCONF, etc.):

656	   o  its local topology [RFC7752].

658	   o  its topology-related SIDs (Prefix SIDs, Adj SIDs and EPE SIDs
659	      [I-D.ietf-idr-bgp-ls-segment-routing-ext]
660	      [I-D.ietf-idr-bgpls-segment-routing-epe]).

662	   o  its Segment Routing Label Block (SRLB).

664	   o  its SR Policies and their BSID ([RFC8664]
665	      [I-D.ietf-pce-binding-label-sid]
666	      [I-D.ietf-idr-te-lsp-distribution]).

668	   Any controller can thus infer the available SIDs in the SRLB of any
669	   node with the assumption that all SIDs allocated from the SRLB on
670	   that node are being advertised by it via some protocols or mechanisms
671	   to the controller.

673	   As an example, a controller discovers the following characteristics
674	   of N: SRLB (4000, 8000), 3 Adj SIDs (4001, 4002, 4003), 2 EPE SIDs
675	   (4004, 4005) and 3 SRTE policies (whose BSIDs are respectively 4006,
676	   4007 and 4008).  This controller can deduce that the SRLB sub-range
677	   (4009, 8000) is free for allocation.
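The deduction in this example can be sketched as follows. This is a minimal model that assumes the SRLB bounds are inclusive and that every allocated SID is known to the controller; the function names are hypothetical.

```python
def free_labels(srlb, allocated):
    """List the SRLB labels that are not reported as allocated.
    Assumption: srlb is an inclusive (low, high) pair."""
    low, high = srlb
    return [label for label in range(low, high + 1) if label not in allocated]


def largest_free_range(srlb, allocated):
    """Return the longest contiguous run of free labels as (first, last)."""
    best = None
    run_start = run_end = None
    for label in free_labels(srlb, allocated):
        if run_end is not None and label == run_end + 1:
            run_end = label          # extend the current run
            continue
        if run_start is not None:
            if best is None or run_end - run_start > best[1] - best[0]:
                best = (run_start, run_end)
        run_start = run_end = label  # start a new run
    if run_start is not None:
        if best is None or run_end - run_start > best[1] - best[0]:
            best = (run_start, run_end)
    return best


srlb = (4000, 8000)
adj_sids = {4001, 4002, 4003}
epe_sids = {4004, 4005}
bsids = {4006, 4007, 4008}
allocated = adj_sids | epe_sids | bsids

free_range = largest_free_range(srlb, allocated)
```

With these inputs `free_range` is the sub-range (4009, 8000) named in the example. Under the inclusive-bounds assumption, label 4000 is also unallocated in this model; it simply does not belong to that contiguous block.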

679	   A controller is not restricted to use the next numerically available
680	   SID in the available SRLB sub-range.  It can pick any label in the
681	   subset of available labels.  Such a random pick makes a collision
682	   unlikely.

684	   An operator could also sub-allocate the SRLB between different
685	   controllers (e.g. (4000-4499) to controller 1 and (4500-5000) to
686	   controller 2).
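A possible sketch of these two techniques combined, random selection within a per-controller SRLB sub-range, is shown below. The helper name and the use of Python's `random` module are assumptions for illustration; the sub-ranges are the ones given in the text and the set of labels in use is taken from the earlier example.

```python
import random


def pick_bsid(sub_range, in_use, rng=random):
    """Pick a random free label inside an inclusive (low, high) sub-range.
    Hypothetical helper: a real controller would also have to handle the
    collision reporting described in this section."""
    low, high = sub_range
    candidates = [label for label in range(low, high + 1)
                  if label not in in_use]
    if not candidates:
        raise ValueError("no free label in sub-range")
    return rng.choice(candidates)


# Labels already known to be allocated on node N.
IN_USE = {4001, 4002, 4003, 4004, 4005, 4006, 4007, 4008}

# Sub-allocation of the SRLB between two controllers.
CONTROLLER_1_RANGE = (4000, 4499)
CONTROLLER_2_RANGE = (4500, 5000)

bsid_1 = pick_bsid(CONTROLLER_1_RANGE, IN_USE)
bsid_2 = pick_bsid(CONTROLLER_2_RANGE, IN_USE)
```

Because the two sub-ranges are disjoint, `bsid_1` and `bsid_2` cannot collide with each other in this model; the random pick only matters when several controllers share the same range.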

688	   Inter-controller state-synchronization may be used to avoid/detect
689	   collision in BSID.

691	   All these techniques make a collision between different controllers
692	   very unlikely.

694	   In the unlikely case of a collision, the controllers will detect it
695	   through system alerts, BGP-LS reporting using
696	   [I-D.ietf-idr-te-lsp-distribution] or PCEP notification [RFC8231].
697	   They then have the choice to continue the operation of their SR
698	   Policy with the dynamically allocated BSID or re-try with another
699	   explicit pick.

701	   Note: in deployments where the PCE Protocol (PCEP) is used between the
702	   head-end and the controller (PCE), a head-end can report the BSID as
703	   well as policy attributes (e.g., type of disjointness) and operational
704	   and administrative states to the controller.  Similarly, a controller
705	   can also assign/update the BSID of a policy via PCEP when instantiating
706	   or updating an SR Policy.

708	7.  Flex-Algorithm Based SR Policies

710	   SR allows for association of algorithms to Prefix SIDs [RFC8402].
711	   [I-D.ietf-lsr-flex-algo] defines the IGP based Flex-Algorithm
712	   solution which allows IGPs themselves to compute constraint based
713	   paths over the network.  Prefix SIDs for the specific flex-algorithm
714	   and associated with a node are used in the forwarding plane to steer
715	   along the specific constraint path to that node.

717	   As specified in [RFC8402] these IGP Flex Algo Prefix SIDs can be used
718	   as segments within SR Policies thereby leveraging the underlying IGP
719	   Flex Algo solution.

721	            1--RED--2-------6
722	            |       |       |
723	            4-------3--RED--9

725	                  Figure 6: Illustration for Flex-Alg SID

727	   Now let us assume that

729	   o  1, 2, 3 and 4 are part of IGP 1.

731	   o  2, 6, 9 and 3 are part of IGP 2.

733	   o  All the IGP link costs are 10.

735	   o  Links 1to2 and 3to9 are colored with IGP Link Affinity Red.

737	   o  Flex-Alg1 is defined in both IGPs as: avoid red, minimize IGP
738	      metric.

740	   o  All nodes of each IGP domain are enabled for FlexAlg1.

742	   o  SID(k, 0) represents the Prefix SID of node k according to Alg=0.

744	   o  SID(k, FlexAlg1) represents the Prefix SID of node k according to
745	      Flex-Alg1.

747	   A controller can steer a flow from 1 to 9 through an end-to-end path
748	   that avoids the RED links of both IGP domains thanks to the explicit
749	   SR Policy <SID(2, FlexAlg1), SID(9, FlexAlg1)>.
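The path followed by this policy can be checked with a small sketch that runs a shortest-path computation per IGP domain while pruning RED links. This only approximates the Flex-Algorithm computation: the link lists are read off Figure 6, placing link 2-3 in both IGP domains is an assumption, and the function name is hypothetical.

```python
import heapq


def shortest_path(links, src, dst, avoid_color=None):
    """Dijkstra over (a, b, cost, color) links, skipping links whose
    color equals avoid_color. Returns (cost, path) or None."""
    adjacency = {}
    for a, b, cost, color in links:
        if avoid_color is not None and color == avoid_color:
            continue
        adjacency.setdefault(a, []).append((b, cost))
        adjacency.setdefault(b, []).append((a, cost))
    heap = [(0, src, [src])]
    visited = set()
    while heap:
        cost, node, path = heapq.heappop(heap)
        if node == dst:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, link_cost in adjacency.get(node, []):
            if neighbor not in visited:
                heapq.heappush(heap, (cost + link_cost, neighbor, path + [neighbor]))
    return None


# All IGP link costs are 10; links 1-2 and 3-9 are colored RED.
IGP1 = [(1, 2, 10, "RED"), (2, 3, 10, None), (3, 4, 10, None), (1, 4, 10, None)]
IGP2 = [(2, 6, 10, None), (6, 9, 10, None), (3, 9, 10, "RED"), (2, 3, 10, None)]

# SID(2, FlexAlg1) is resolved in IGP 1, SID(9, FlexAlg1) in IGP 2.
to_node_2 = shortest_path(IGP1, 1, 2, avoid_color="RED")
to_node_9 = shortest_path(IGP2, 2, 9, avoid_color="RED")
end_to_end = to_node_2[1] + to_node_9[1][1:]

# For comparison, Alg=0 (no constraint) uses the RED link 1-2.
alg0_to_node_2 = shortest_path(IGP1, 1, 2)
```

In this model the two Flex-Alg1 segments yield the path 1-4-3-2-6-9, which avoids both RED links, whereas the Alg=0 path from 1 to 2 takes the direct RED link.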

751	8.  Layer 2 and Optical Transport

753	                  1----2----3----4----5
754	         I2(lambda L241)\       / I4(lambda L241)
755	                         Optical

757	                 Figure 7: SR Policy with integrated DWDM

759	   An explicit candidate path can express a path through a transport
760	   layer beneath IP.  The transport layer could be ATM, FR, DWDM,
761	   back-to-back Ethernet, etc.  The transport path is modelled
762	   as a link between two IP nodes with the specific assumption that no
763	   distributed IP routing protocol runs over the link.  The link may
764	   have an IP address or be IP unnumbered.  Depending on the transport
765	   protocol, the link can be a physical DWDM interface and a lambda
766	   (integrated solution), an Ethernet interface and a VLAN, an ATM
767	   interface with a VPI/VCI, an FR interface with a DLCI, etc.

769	   Using the DWDM integrated use-case of Figure 7 as an illustration,
770	   let us assume

772	   o  nodes 1, 2, 3, 4 and 5 are IP routers running an SR-enabled IGP on
773	      the links 1-2, 2-3, 3-4 and 4-5.

775	   o  The SRGB is homogeneous (16000, 24000).

777	   o  Node K's prefix SID is 16000+K.

779	   o  node 2 has an integrated DWDM interface I2 with lambda L241.

781	   o  node 4 has an integrated DWDM interface I4 with lambda L241.

783	   o  the optical network is provisioned with a circuit from 2 to 4 with
784	      continuous lambda L241 (details outside the scope of this
785	      document).

787	   o  Node 2 is provisioned with an SR policy with Segment-List
788	      <I2(L241)> and Binding SID B where I2(L241) is of type 5 (IPv4) or
789	      type 7 (IPv6), see section 4 of
790	      [I-D.ietf-spring-segment-routing-policy].

792	   o  node 1 steers a packet P1 towards the prefix SID of node 5
793	      (16005).

795	   o  node 1 steers a packet P2 on the SR policy <16002, B, 16005>.

797	   In such a case, the journey of P1 will be 1-2-3-4-5 while the journey
798	   of P2 will be 1-2-lambda(L241)-4-5.  P2 skips the IP hop 3 and
799	   leverages the DWDM circuit from node 2 to node 4.  P1 follows the
800	   shortest-path computed by the distributed routing protocol.  The path
801	   of P1 is unaltered by the addition, modification or deletion of
802	   optical bypass circuits.
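The two journeys can be traced with a simplified forwarding model of Figure 7. It assumes the routers form a chain 1-2-3-4-5 with equal IGP costs, so that following a prefix SID means moving one router at a time towards the target, and it represents the Binding SID B by the string "B". All names in the sketch are hypothetical.

```python
SRGB_BASE = 16000  # node K's prefix SID is 16000 + K

# At node 2, Binding SID "B" steers the packet onto I2(L241), whose far
# end is node 4.
BOUND_POLICIES = {(2, "B"): 4}


def journey(start, segments):
    """Trace the hops of a packet through the IP chain 1-2-3-4-5."""
    hops = [str(start)]
    node = start
    for segment in segments:
        if (node, segment) in BOUND_POLICIES:
            # Binding SID: cross the optical circuit in a single hop.
            node = BOUND_POLICIES[(node, segment)]
            hops.extend(["lambda(L241)", str(node)])
            continue
        # Prefix SID: follow the IGP shortest path along the chain.
        target = segment - SRGB_BASE
        step = 1 if target > node else -1
        while node != target:
            node += step
            hops.append(str(node))
    return "-".join(hops)


p1 = journey(1, [16005])
p2 = journey(1, [16002, "B", 16005])
```

In this model `p1` is "1-2-3-4-5" and `p2` is "1-2-lambda(L241)-4-5". Removing the entry from `BOUND_POLICIES` leaves `p1` unchanged, which mirrors the statement that P1 is unaffected by optical bypass circuits.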

804	   The salient point of this example is that the SR Policy architecture
805	   seamlessly supports explicit candidate paths through any transport
806	   sub-layer.

808	   BGP-LS Extensions to describe the sub-IP-layer characteristics of the
809	   SR Policy are out of scope of this document (e.g. in Figure 7, the
810	   DWDM characteristics of the SR Policy at node 2 in terms of latency,
811	   loss, security, domain/country traversed by the circuit etc.).

813	   Further details of the SR Policy use-case for Packet Optical networks
814	   are specified in [I-D.anand-spring-poi-sr].

816	9.  Security Considerations

818	   The security considerations related to Segment Routing architecture
819	   are described in [RFC8402] and for SR Policy architecture are
820	   described in [I-D.ietf-spring-segment-routing-policy] and they apply
821	   to this document as well.

823	10.  IANA Considerations

825	   This document has no actions for IANA.

827	11.  Acknowledgement

829	   The authors would like to thank Tarek Saad, Dhanendra Jain, Muhammad
830	   Durrani and Rob Shakir for their valuable comments and suggestions.

832	12.  Contributors

834	   The following people have contributed to this document:

836	   Siva Sivabalan
837	   Cisco Systems
838	   Email: msiva@cisco.com

840	   Zafar Ali
841	   Cisco Systems
842	   Email: zali@cisco.com

843	   Jose Liste
844	   Cisco Systems
845	   Email: jliste@cisco.com

847	   Francois Clad
848	   Cisco Systems
849	   Email: fclad@cisco.com

851	   Kamran Raza
852	   Cisco Systems
853	   Email: skraza@cisco.com

855	   Shraddha Hegde
856	   Juniper Networks
857	   Email: shraddha@juniper.net

859	   Steven Lin
860	   Google, Inc.
861	   Email: stevenlin@google.com

863	   Alex Bogdanov
864	   Google, Inc.
865	   Email: bogdanov@google.com

867	   Daniel Voyer
868	   Bell Canada
869	   Email: daniel.voyer@bell.ca

871	   Dirk Steinberg
872	   Steinberg Consulting
873	   Email: dws@steinbergnet.net

875	   Bruno Decraene
876	   Orange Business Services
877	   Email: bruno.decraene@orange.com

879	   Stephane Litkowski
880	   Orange Business Services
881	   Email: stephane.litkowski@orange.com

883	   Luay Jalil
884	   Verizon
885	   Email: luay.jalil@verizon.com

887	13.  References

889	13.1.  Normative References

891	   [I-D.ietf-spring-segment-routing-policy]
892	              Filsfils, C., Talaulikar, K., Voyer, D., Bogdanov, A., and
893	              P. Mattes, "Segment Routing Policy Architecture", draft-
894	              ietf-spring-segment-routing-policy-09 (work in progress),
895	              November 2020.

897	   [RFC8402]  Filsfils, C., Ed., Previdi, S., Ed., Ginsberg, L.,
898	              Decraene, B., Litkowski, S., and R. Shakir, "Segment
899	              Routing Architecture", RFC 8402, DOI 10.17487/RFC8402,
900	              July 2018, <https://www.rfc-editor.org/info/rfc8402>.

902	13.2.  Informative References

904	   [I-D.anand-spring-poi-sr]
905	              Anand, M., Bardhan, S., Subrahmaniam, R., Tantsura, J.,
906	              Mukhopadhyaya, U., and C. Filsfils, "Packet-Optical
907	              Integration in Segment Routing", draft-anand-spring-poi-
908	              sr-08 (work in progress), July 2019.

910	   [I-D.ietf-idr-bgp-ls-segment-routing-ext]
911	              Previdi, S., Talaulikar, K., Filsfils, C., Gredler, H.,
912	              and M. Chen, "BGP Link-State extensions for Segment
913	              Routing", draft-ietf-idr-bgp-ls-segment-routing-ext-16
914	              (work in progress), June 2019.

916	   [I-D.ietf-idr-bgpls-segment-routing-epe]
917	              Previdi, S., Talaulikar, K., Filsfils, C., Patel, K., Ray,
918	              S., and J. Dong, "BGP-LS extensions for Segment Routing
919	              BGP Egress Peer Engineering", draft-ietf-idr-bgpls-
920	              segment-routing-epe-19 (work in progress), May 2019.

922	   [I-D.ietf-idr-segment-routing-te-policy]
923	              Previdi, S., Filsfils, C., Talaulikar, K., Mattes, P.,
924	              Rosen, E., Jain, D., and S. Lin, "Advertising Segment
925	              Routing Policies in BGP", draft-ietf-idr-segment-routing-
926	              te-policy-11 (work in progress), November 2020.

928	   [I-D.ietf-idr-te-lsp-distribution]
929	              Previdi, S., Talaulikar, K., Dong, J., Chen, M., Gredler,
930	              H., and J. Tantsura, "Distribution of Traffic Engineering
931	              (TE) Policies and State using BGP-LS", draft-ietf-idr-te-
932	              lsp-distribution-14 (work in progress), October 2020.

934	   [I-D.ietf-lsr-flex-algo]
935	              Psenak, P., Hegde, S., Filsfils, C., Talaulikar, K., and
936	              A. Gulko, "IGP Flexible Algorithm", draft-ietf-lsr-flex-
937	              algo-13 (work in progress), October 2020.

939	   [I-D.ietf-pce-binding-label-sid]
940	              Sivabalan, S., Filsfils, C., Tantsura, J., Hardwick, J.,
941	              Previdi, S., and C. Li, "Carrying Binding Label/Segment-ID
942	              in PCE-based Networks.", draft-ietf-pce-binding-label-
943	              sid-05 (work in progress), October 2020.

945	   [RFC1195]  Callon, R., "Use of OSI IS-IS for routing in TCP/IP and
946	              dual environments", RFC 1195, DOI 10.17487/RFC1195,
947	              December 1990, <https://www.rfc-editor.org/info/rfc1195>.

949	   [RFC2328]  Moy, J., "OSPF Version 2", STD 54, RFC 2328,
950	              DOI 10.17487/RFC2328, April 1998,
951	              <https://www.rfc-editor.org/info/rfc2328>.

953	   [RFC3630]  Katz, D., Kompella, K., and D. Yeung, "Traffic Engineering
954	              (TE) Extensions to OSPF Version 2", RFC 3630,
955	              DOI 10.17487/RFC3630, September 2003,
956	              <https://www.rfc-editor.org/info/rfc3630>.

958	   [RFC5305]  Li, T. and H. Smit, "IS-IS Extensions for Traffic
959	              Engineering", RFC 5305, DOI 10.17487/RFC5305, October
960	              2008, <https://www.rfc-editor.org/info/rfc5305>.

962	   [RFC5340]  Coltun, R., Ferguson, D., Moy, J., and A. Lindem, "OSPF
963	              for IPv6", RFC 5340, DOI 10.17487/RFC5340, July 2008,
964	              <https://www.rfc-editor.org/info/rfc5340>.

966	   [RFC7471]  Giacalone, S., Ward, D., Drake, J., Atlas, A., and S.
967	              Previdi, "OSPF Traffic Engineering (TE) Metric
968	              Extensions", RFC 7471, DOI 10.17487/RFC7471, March 2015,
969	              <https://www.rfc-editor.org/info/rfc7471>.

971	   [RFC7752]  Gredler, H., Ed., Medved, J., Previdi, S., Farrel, A., and
972	              S. Ray, "North-Bound Distribution of Link-State and
973	              Traffic Engineering (TE) Information Using BGP", RFC 7752,
974	              DOI 10.17487/RFC7752, March 2016,
975	              <https://www.rfc-editor.org/info/rfc7752>.

977	   [RFC8231]  Crabbe, E., Minei, I., Medved, J., and R. Varga, "Path
978	              Computation Element Communication Protocol (PCEP)
979	              Extensions for Stateful PCE", RFC 8231,
980	              DOI 10.17487/RFC8231, September 2017,
981	              <https://www.rfc-editor.org/info/rfc8231>.

983	   [RFC8570]  Ginsberg, L., Ed., Previdi, S., Ed., Giacalone, S., Ward,
984	              D., Drake, J., and Q. Wu, "IS-IS Traffic Engineering (TE)
985	              Metric Extensions", RFC 8570, DOI 10.17487/RFC8570, March
986	              2019, <https://www.rfc-editor.org/info/rfc8570>.

988	   [RFC8660]  Bashandy, A., Ed., Filsfils, C., Ed., Previdi, S.,
989	              Decraene, B., Litkowski, S., and R. Shakir, "Segment
990	              Routing with the MPLS Data Plane", RFC 8660,
991	              DOI 10.17487/RFC8660, December 2019,
992	              <https://www.rfc-editor.org/info/rfc8660>.

994	   [RFC8664]  Sivabalan, S., Filsfils, C., Tantsura, J., Henderickx, W.,
995	              and J. Hardwick, "Path Computation Element Communication
996	              Protocol (PCEP) Extensions for Segment Routing", RFC 8664,
997	              DOI 10.17487/RFC8664, December 2019,
998	              <https://www.rfc-editor.org/info/rfc8664>.

1000	   [RFC8986]  Filsfils, C., Ed., Camarillo, P., Ed., Leddy, J., Voyer,
1001	              D., Matsushima, S., and Z. Li, "Segment Routing over IPv6
1002	              (SRv6) Network Programming", RFC 8986,
1003	              DOI 10.17487/RFC8986, February 2021,
1004	              <https://www.rfc-editor.org/info/rfc8986>.

1006	Authors' Addresses

1008	   Clarence Filsfils
1009	   Cisco Systems, Inc.
1010	   Pegasus Parc
1011	   De kleetlaan 6a, DIEGEM  BRABANT 1831
1012	   BELGIUM

1014	   Email: cfilsfil@cisco.com

1016	   Ketan Talaulikar (editor)
1017	   Cisco Systems, Inc.

1019	   Email: ketant@cisco.com

1021	   Przemyslaw Krol
1022	   Google, Inc.

1024	   Email: pkrol@google.com

1025	   Martin Horneffer
1026	   Deutsche Telekom

1028	   Email: martin.horneffer@telekom.de

1030	   Paul Mattes
1031	   Microsoft
1032	   One Microsoft Way
1033	   Redmond, WA  98052-6399
1034	   USA

1036	   Email: pamattes@microsoft.com