idnits 2.17.1 

draft-filsfils-spring-sr-policy-considerations-06.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (October 12, 2020) is 1292 days in the past.  Is this
     intentional?


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Outdated reference: A later version (-22) exists of
     draft-ietf-spring-segment-routing-policy-08

  == Outdated reference: A later version (-18) exists of
     draft-ietf-idr-bgp-ls-segment-routing-ext-16

  == Outdated reference: A later version (-26) exists of
     draft-ietf-idr-segment-routing-te-policy-09

  == Outdated reference: A later version (-19) exists of
     draft-ietf-idr-te-lsp-distribution-13

  == Outdated reference: A later version (-26) exists of
     draft-ietf-lsr-flex-algo-12

  == Outdated reference: A later version (-16) exists of
     draft-ietf-pce-binding-label-sid-03

  == Outdated reference: A later version (-28) exists of
     draft-ietf-spring-srv6-network-programming-24

  -- Obsolete informational reference (is this intentional?): RFC 7752
     (Obsoleted by RFC 9552)


     Summary: 0 errors (**), 0 flaws (~~), 8 warnings (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.
--------------------------------------------------------------------------------


2	SPRING Working Group                                         C. Filsfils
3	Internet-Draft                                        K. Talaulikar, Ed.
4	Intended status: Informational                       Cisco Systems, Inc.
5	Expires: April 15, 2021                                          P. Krol
6	                                                            Google, Inc.
7	                                                            M. Horneffer
8	                                                        Deutsche Telekom
9	                                                               P. Mattes
10	                                                               Microsoft
11	                                                        October 12, 2020

13	         SR Policy Implementation and Deployment Considerations
14	           draft-filsfils-spring-sr-policy-considerations-06

16	Abstract

18	   Segment Routing (SR) allows a headend node to steer a packet flow
19	   along any path.  Intermediate per-flow states are eliminated thanks
20	   to source routing.  The SR Policy framework enables the instantiation
21	   and management of the necessary state on the headend node for flows
22	   along source-routed paths, using an ordered list of segments
23	   associated with their specific SR Policies.  This document describes
24	   some of the implementation and deployment aspects that are useful
25	   for operationalizing the SR Policy architecture.

27	Status of This Memo

29	   This Internet-Draft is submitted in full conformance with the
30	   provisions of BCP 78 and BCP 79.

32	   Internet-Drafts are working documents of the Internet Engineering
33	   Task Force (IETF).  Note that other groups may also distribute
34	   working documents as Internet-Drafts.  The list of current Internet-
35	   Drafts is at https://datatracker.ietf.org/drafts/current/.

37	   Internet-Drafts are draft documents valid for a maximum of six months
38	   and may be updated, replaced, or obsoleted by other documents at any
39	   time.  It is inappropriate to use Internet-Drafts as reference
40	   material or to cite them other than as "work in progress."

42	   This Internet-Draft will expire on April 15, 2021.

44	Copyright Notice

46	   Copyright (c) 2020 IETF Trust and the persons identified as the
47	   document authors.  All rights reserved.

49	   This document is subject to BCP 78 and the IETF Trust's Legal
50	   Provisions Relating to IETF Documents
51	   (https://trustee.ietf.org/license-info) in effect on the date of
52	   publication of this document.  Please review these documents
53	   carefully, as they describe your rights and restrictions with respect
54	   to this document.  Code Components extracted from this document must
55	   include Simplified BSD License text as described in Section 4.e of
56	   the Trust Legal Provisions and are provided without warranty as
57	   described in the Simplified BSD License.

59	Table of Contents

61	   1.  Introduction
62	   2.  SR Policy Headend Architecture
63	   3.  Dynamic Path Computation
64	     3.1.  Optimization Objective
65	     3.2.  Constraints
66	     3.3.  SR Native Algorithm
67	     3.4.  Path to SID
68	   4.  Candidate Path Selection
69	   5.  Distributed and/or Centralized Control Plane
70	     5.1.  Distributed Control Plane within a single Link-State IGP
71	           area
72	     5.2.  Distributed Control Plane across several Link-State IGP
73	           areas
74	     5.3.  Centralized Control Plane
75	     5.4.  Distributed and Centralized Control Plane
76	   6.  Binding SID Aspects
77	     6.1.  Benefits of Binding SID
78	     6.2.  Centralized Discovery of available BSID
79	   7.  Flex-Algorithm Based SR Policies
80	   8.  Layer 2 and Optical Transport
81	   9.  Security Considerations
82	   10. IANA Considerations
83	   11. Acknowledgement
84	   12. Contributors
85	   13. References
86	     13.1.  Normative References
87	     13.2.  Informative References
88	   Authors' Addresses

90	1.  Introduction

92	   Segment Routing (SR) allows a headend node to steer a packet flow
93	   along any path.  Intermediate per-flow states are eliminated with
94	   source routing [RFC8402].

96	   The headend node steers a flow into a Segment Routing Policy (SR
97	   Policy) by augmenting packet headers with the ordered list of
98	   segments associated with that SR Policy.
99	   [I-D.ietf-spring-segment-routing-policy] defines the SR Policy
100	   architecture and details the concepts of SR Policy and steering into
101	   an SR Policy.

103	   This document describes some implementation aspects of the SR
104	   Policy framework, which should be considered suggestions.  The same
105	   behavior, as defined in [I-D.ietf-spring-segment-routing-policy], may
106	   in fact be realized with alternative approaches.  The deployment
107	   aspects described in this document are also meant to serve only as
108	   guidelines.  This document describes these aspects and other
109	   considerations related to SR Policy concepts as they are important to
110	   facilitate multi-vendor interoperable deployments for various SR
111	   Policy use-cases.

113	   These apply equally to the MPLS [RFC8660] and SRv6
114	   [I-D.ietf-spring-srv6-network-programming] instantiations of segment
115	   routing.

117	   For reading simplicity, the illustrations are provided for the MPLS
118	   instantiations.

120	2.  SR Policy Headend Architecture

122	   This section provides a conceptual overview of components (or
123	   functions) that interact to implement SR Policy on a headend.

125	                   +--------+  +--------+
126	                   |  BGP   |  |  PCEP  |
127	                   +--------+  +--------+
128	                            \ /
129	              +--------+  +----------+  +--------+
130	              |        |  |    SR    |  |        |
131	              |  CLI   |--|  Policy  |--| NETCONF|
132	              |        |  |          |  |        |
133	              +--------+  +----------+  +--------+
134	                              |
135	                           +--------+
136	                           |  FIB   |
137	                           +--------+

139	               Figure 1: SR Policy Architecture at a Headend

141	   The SR Policy functionality at a headend can be implemented in an SR
142	   Policy (SRP) process as illustrated in Figure 1.

144	   The SRP process interacts with other processes to learn candidate
145	   paths.

147	   The SRP process selects the active path of an SR Policy.

149	   The SRP process interacts with the RIB/FIB process to install an
150	   active SR Policy in the dataplane.

152	   In order to validate explicit candidate paths and compute dynamic
153	   candidate paths, the SRP process maintains an SR Database (SR-DB) as
154	   specified in [I-D.ietf-spring-segment-routing-policy].  The SRP
155	   process interacts with other processes as shown in Figure 2 to
156	   collect the SR-DB information.

158	                 +--------+  +--------+  +--------+
159	                 | BGP SR |  | BGP-LS |  |  IGP   |
160	                 | Policy |  +--------+  +--------+
161	                 +--------+ \    |       /
162	               +--------+   +-----------+  +--------+
163	               |   PCEP |---|    SRP    |--| NETCONF|
164	               +--------+   +-----------+  +--------+

166	            Figure 2: Topology/link-state database architecture

168	   The SR Policy architecture supports both centralized and distributed
169	   control planes.

171	3.  Dynamic Path Computation

173	   A dynamic candidate path for SR Policy is specified as an
174	   optimization objective and constraints and needs to be computed by
175	   either the headend or a Path Computation Element (PCE).  The
176	   distributed or centralized computation aspect is described further in
177	   Section 5.  This section describes the computation aspects of a
178	   dynamic path.

180	3.1.  Optimization Objective

182	   This document describes two optimization objectives:

184	   o  Min-Metric - requests computation of a solution Segment-List
185	      optimized for a selected metric.

187	   o  Min-Metric with margin and maximum number of SIDs - Min-Metric
188	      with two changes: a margin by which two paths with similar
189	      metrics would be considered equal, and a constraint on the
190	      maximum number of SIDs in the Segment-List.

192	   The "Min-Metric" optimization objective requests to compute a
193	   solution Segment-List such that packets flowing through the solution
194	   Segment-List use ECMP-aware paths optimized for the selected metric.
195	   The "Min-Metric" objective can be instantiated for the IGP metric
196	   ([RFC1195] [RFC2328] [RFC5340]) xor the TE metric ([RFC5305]
197	   [RFC3630]) xor the latency extended TE metric ([RFC8570] [RFC7471]).
198	   This metric is called the O metric (the optimized metric) to
199	   distinguish it from the IGP metric.  The solution Segment-List must
200	   be computed to minimize the number of SIDs and the number of Segment-
201	   Lists.

203	   If the selected O metric is the IGP metric and the headend and
204	   tailend are in the same IGP domain, then the solution Segment-List is
205	   made of the single prefix-SID of the tailend.

207	   When the selected O metric is not the IGP metric, then the solution
208	   Segment-List is made of prefix SIDs of intermediate nodes, Adjacency
209	   SIDs along intermediate links and potentially Binding SIDs (BSIDs) of
210	   intermediate policies.

212	   In many deployments there are insignificant metric differences
213	   between mostly equal paths (e.g., a difference of 100 usec of latency
214	   between two paths from NYC to SFO would not matter in most cases).
215	   The "Min-Metric with margin" objective supports such a requirement.

217	   The "Min-Metric with margin and maximum number of SIDs" optimization
218	   objective requests to compute a solution Segment-List such that
219	   packets flowing through the solution Segment-List do not use a path
220	   whose cumulative O metric is larger than the shortest-path O metric +
221	   margin.

223	   If this is not possible because of the number of SIDs constraint,
224	   then one option is that the solution Segment-List minimizes the O
225	   metric while meeting the maximum number of SIDs constraint (i.e., the
226	   path with the least O metric that uses no more SIDs than the maximum
227	   specified).  The other option, which is the default, is to not come
228	   up with a solution unless the desired SLA is guaranteed.
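
	   The selection described above can be sketched as follows.  This is
	   only an illustration under stated assumptions, not a prescribed
	   implementation: the Candidate type, the best_effort flag and the
	   idea that candidate Segment-Lists are already enumerated with their
	   cumulative O metric are hypothetical and are not defined by this
	   document.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Candidate:
    """Hypothetical pre-computed solution: a Segment-List and its O metric."""
    segment_list: Tuple[str, ...]   # SID names are illustrative only
    o_metric: int                   # cumulative O metric of the path(s) used


def select_segment_list(candidates: List[Candidate], shortest_o_metric: int,
                        margin: int, max_sids: int,
                        best_effort: bool = False) -> Optional[Candidate]:
    """Sketch of 'Min-Metric with margin and maximum number of SIDs'."""
    # Keep only the Segment-Lists that respect the SID limit.
    fits = [c for c in candidates if len(c.segment_list) <= max_sids]
    # Paths within shortest-path O metric + margin are considered equal.
    within = [c for c in fits if c.o_metric <= shortest_o_metric + margin]
    if within:
        # Among equal paths, prefer the fewest SIDs (then the lower metric).
        return min(within, key=lambda c: (len(c.segment_list), c.o_metric))
    if best_effort and fits:
        # First option: least O metric while meeting the SID limit.
        return min(fits, key=lambda c: c.o_metric)
    # Default option: no solution unless the metric bound is met.
    return None


# Illustrative values only (e.g., latency in usec).
CANDIDATES = [
    Candidate(("X", "Y", "Z"), 100),
    Candidate(("X", "Z"), 105),
    Candidate(("Z",), 130),
]
```

	   With a margin of 10 and a limit of 3 SIDs, the two-SID list is
	   chosen because it is within the margin and shorter.  With a limit
	   of 1 SID, nothing is returned unless best_effort is set.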

230	   Section 7 describes another approach for computing a solution
231	   Segment-List consisting of a single segment when the O metric is not
232	   the IGP metric by using the Flex Algorithm Prefix-SID of the tailend.

234	3.2.  Constraints

236	   The following constraints can be described:

238	   o  Inclusion and/or exclusion of TE affinity.

240	   o  Inclusion and/or exclusion of IP address.

242	   o  Inclusion and/or exclusion of SRLG.

244	   o  Inclusion and/or exclusion of admin-tag.

246	   o  Maximum accumulated metric (IGP, TE and latency).

248	   o  Maximum number of SIDs in the solution Segment-List.

250	   o  Maximum number of weighted Segment-Lists in the solution set.

252	   o  Diversity to another service instance (e.g., link, node, or SRLG
253	      disjoint paths originating from different head-ends).

255	3.3.  SR Native Algorithm

257	         1----------------2----------------3
258	        |\                               /
259	        | \                             /
260	        |  4-------------5-------------7
261	        |   \                         /|
262	        |    +-----------6-----------+ |
263	        8------------------------------9

265	        Figure 3: Illustration used to describe SR native algorithm

267	   Let us assume that all the links have the same IGP metric of 10 and
268	   let us consider the dynamic path defined as: Min-Metric(from 1, to 3,
269	   IGP metric, margin 0) with constraint "avoid link 2-to-3".

271	   A classical circuit implementation would do: prune the graph, compute
272	   the shortest-path, pick a single non-ECMP branch of the ECMP-aware
273	   shortest-path and encode it as a Segment-List.  The solution Segment-
274	   List would be <4, 5, 7, 3>.

276	   An SR-native algorithm would find a Segment-List that minimizes the
277	   number of SIDs and maximizes the use of all the ECMP branches along
278	   the ECMP shortest path.  In this illustration, the solution Segment-
279	   List would be <7, 3>.

281	   In the vast majority of SR use-cases, SR-native algorithms should be
282	   preferred: they preserve the native ECMP of IP and they minimize the
283	   dataplane header overhead.

285	   In some specific use-cases (e.g., TDM migration over IP where the
286	   circuit notion prevails), one may prefer a classic circuit
287	   computation followed by an encoding into SIDs (potentially only using
288	   non-protected Adj SIDs that pin the path to specific links and avoid
289	   ECMP to reflect the TDM paradigm).

291	   SR-native algorithms are a local node behavior and are thus outside
292	   the scope of this document.

294	3.4.  Path to SID

296	   Let us assume the below diagram where all the links have an IGP
297	   metric of 10 and a TE metric of 10 except the link AB which has an
298	   IGP metric of 20 and the link AD which has a TE metric of 100.  Let
299	   us consider Min-Metric(from A, to D, TE metric, margin 0).

301	            B---C
302	            |   |
303	            A---D

305	      Figure 4: Illustration used to describe path to SID conversion

307	   The solution path to this problem is ABCD.

309	   This path can be expressed in SIDs as <B, D> where B and D are the
310	   IGP prefix SIDs respectively associated with nodes B and D in the
311	   diagram.

313	   Indeed, from A, the IGP path to B is AB (IGP metric 20 better than
314	   ADCB of IGP metric 30).  From B, the IGP path to D is BCD (IGP metric
315	   20 better than BAD of IGP metric 30).

317	   While the details of the algorithm remain a local node behavior, a
318	   high-level description follows: start at the headend and find an IGP
319	   prefix SID that leads as far down the desired path as
320	   possible (without using any link not included in the desired path).
321	   If no prefix SID exists, use the Adj SID to the first neighbor along
322	   the path.  Restart from the node that was reached.
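
	   The high-level description above can be sketched as follows.  This
	   is one possible reading and not the algorithm of any particular
	   implementation.  It assumes symmetric IGP metrics, assumes every
	   node has a prefix SID, and treats a prefix SID as usable only if all
	   of its ECMP shortest paths stay on the desired path.  The function
	   names and the IGP table (the IGP metrics of Figure 4) are
	   illustrative.

```python
import heapq

INF = float("inf")

# IGP metrics of Figure 4: all links 10 except A-B, which is 20.
IGP = {
    "A": {"B": 20, "D": 10},
    "B": {"A": 20, "C": 10},
    "C": {"B": 10, "D": 10},
    "D": {"C": 10, "A": 10},
}


def dijkstra(graph, src):
    """IGP shortest distances from src; graph[u][v] is the metric of u-v."""
    dist = {src: 0}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            if d + w < dist.get(v, INF):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


def shortest_path_links(graph, src, dst):
    """All directed links on any IGP shortest path src -> dst (ECMP included)."""
    from_src = dijkstra(graph, src)
    to_dst = dijkstra(graph, dst)  # valid because metrics are assumed symmetric
    return {(u, v) for u in graph for v, w in graph[u].items()
            if from_src.get(u, INF) + w + to_dst.get(v, INF) == from_src[dst]}


def path_to_sids(graph, path):
    """Encode a desired path (list of nodes) as prefix SIDs and Adj SIDs."""
    sids, i = [], 0
    while i < len(path) - 1:
        for j in range(len(path) - 1, i, -1):  # try the farthest node first
            wanted = set(zip(path[i:j], path[i + 1:j + 1]))
            if shortest_path_links(graph, path[i], path[j]) <= wanted:
                sids.append(("prefix", path[j]))
                i = j  # restart from the node that was reached
                break
        else:
            # No prefix SID stays on the path: pin the first link instead.
            sids.append(("adj", path[i], path[i + 1]))
            i += 1
    return sids


print(path_to_sids(IGP, ["A", "B", "C", "D"]))
# [('prefix', 'B'), ('prefix', 'D')]
```

	   For the path ABCD this yields the prefix SIDs of B and D, matching
	   the <B, D> encoding given above.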

324	4.  Candidate Path Selection

326	   An SR Policy may have multiple candidate paths that are provisioned
327	   or signaled [I-D.ietf-idr-segment-routing-te-policy] [RFC8664] from
328	   one or more sources.  The tie-breaker rules defined in
329	   [I-D.ietf-spring-segment-routing-policy] formally determine a single
330	   "active path".

332	   This section describes some examples of candidate path selection
333	   based on these rules.

335	   Example 1:

337	   Consider headend H where two candidate paths of the same SR Policy
338	   <color, endpoint> are signaled via BGP
339	   [I-D.ietf-idr-segment-routing-te-policy] and whose respective NLRIs
340	   have the same route distinguishers:

342	   NLRI A with distinguisher = RD1, color = C, endpoint = N, preference
343	   P1.

345	   NLRI B with distinguisher = RD1, color = C, endpoint = N, preference
346	   P2.

348	   o  Because the NLRIs are identical (same distinguisher), BGP will
349	      perform bestpath selection.  Note that there are no changes to the
350	      BGP best path selection algorithm.

352	   o  H installs one advertisement as bestpath into the BGP table.

354	   o  A single advertisement is passed to the SR Policy instantiation
355	      process.

357	   o  The SRP process does not perform any path selection.

359	   Note that the candidate path's preference value does not have any
360	   effect on the BGP bestpath selection process.

362	   Example 2:

364	   Consider headend H where two candidate paths of the same SR Policy
365	   <color, endpoint> are signaled via BGP and whose respective NLRIs
366	   have different route distinguishers:

368	   NLRI A with distinguisher = RD1, color = C, endpoint = N, preference
369	   P1.

371	   NLRI B with distinguisher = RD2, color = C, endpoint = N, preference
372	   P2.

374	   o  Because the NLRIs are different (different distinguisher), BGP
375	      will not perform bestpath selection.

377	   o  H installs both advertisements into the BGP table.

379	   o  Both advertisements are passed to the SR Policy instantiation
380	      process.

382	   o  SRP process at H selects the candidate path advertised by NLRI B
383	      as the active path for the SR policy since P2 is greater than P1.

385	   Note that the recommended approach is to use NLRIs with different
386	   distinguishers when several candidate paths for the same SR Policy
387	   (color, endpoint) are signaled via BGP to a headend.

389	   Example 3:

391	   Consider that a headend H learns two candidate paths of the same SR
392	   Policy <color, endpoint> one signaled via BGP and another via Local
393	   configuration.

395	   NLRI A with distinguisher = RD1, color = C, endpoint = N, preference
396	   P1.

398	   Local "foo" with color = C, endpoint = N, preference P2.

400	   o  H installs NLRI A into the BGP table.

402	   o  NLRI A and "foo" are both passed to the SRP process.

404	   o  SRP process at H selects the candidate path indicated by "foo" as
405	      the active path for the SR policy since P2 is greater than P1.

407	   Now, let us consider cases where an SR Policy has multiple valid
408	   candidate paths with the same best preference.  The SRP process at a
409	   headend then uses the rules described in Section 2.9 of
410	   [I-D.ietf-spring-segment-routing-policy] to select the active path.
411	   This is explained in the following examples:

413	   Example 4:

415	   Consider headend H with two candidate paths of the same SR Policy
416	   <color, endpoint> and the same preference value received from the
417	   same controller R and where RD2 is higher than RD1.

419	   o  NLRI A with distinguisher RD1, color C, endpoint N, preference
420	      P1 (selected as active path at time t0).

422	   o  NLRI B with distinguisher RD2 (RD2 is greater than RD1), color C,
423	      endpoint N, preference P1 (passed to SR Policy instantiation
424	      process at time t1 > t0).

426	   After t1, the SRP process at H selects the candidate path associated
427	   with NLRI B as the active path of the SR Policy since RD2 is higher
428	   than RD1.  Here the time when the headend receives the candidate path
429	   via BGP is not a factor in the selection.

431	   Note that, in such a scenario where there are redundant sessions to
432	   the same controller, the recommended approach is to use the same RD
433	   value for conveying the same candidate paths and let the BGP best
434	   path algorithm pick the best path.

436	   Example 5:

438	   Consider headend H with two candidate paths of the same SR Policy
439	   <color, endpoint> and the same preference value both received from
440	   the same controller R and where RD2 is higher than RD1.

442	   Consider also that headend H is configured to override the
443	   discriminator tiebreaker specified in
444	   [I-D.ietf-spring-segment-routing-policy] section 2.9.

446	   o  NLRI A with distinguisher RD1, color C, endpoint N, preference P1
447	      (selected as active path at time t0).

449	   o  NLRI B with distinguisher RD2, color C, endpoint N, preference P1
450	      (passed to SR Policy instantiation process at time t1).

452	   Even after t1, the SRP process at H retains the candidate path
453	   associated with NLRI A as the active path of the SR Policy since the
454	   discriminator tiebreaker is disabled at H.

456	   Example 6:

458	   Consider headend H with two candidate paths of the same SR Policy
459	   <color, endpoint> and the same preference value.

461	   o  Local "foo" with color C, endpoint N, preference P1 (selected as
462	      active path at time t0).

464	   o  NLRI A with distinguisher RD1, color C, endpoint N, preference P1
465	      (passed to SRP process at time t1).

467	   Even after t1, the SRP process at H retains the local candidate path
468	   "foo" as the active path of the SR Policy since the Local protocol
469	   is preferred over BGP by default based on its higher protocol
470	   identifier value.

472	   Example 7:

474	   Consider headend H with two candidate paths of the same SR Policy
475	   <color, endpoint> and the same preference value but received via
476	   NETCONF from two controllers R and S (where S > R).

478	   o  Path A from R with distinguisher D1, color C, endpoint N,
479	      preference P1 (selected as active path at time t0).

481	   o  Path B from S with distinguisher D2, color C, endpoint N,
482	      preference P1 (passed to SRP process at time t1).

484	   Note that the NETCONF process sends both paths to the SRP process
485	   since it does not have any tiebreaker logic.  After t1, the SRP
486	   process at H selects the candidate path associated with Path B as the
487	   active path of the SR Policy.
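
	   The selection logic exercised by Examples 2 through 6 can be
	   sketched as follows.  This is a simplified illustration and not the
	   normative rule set of [I-D.ietf-spring-segment-routing-policy]: the
	   originator comparison used in Example 7 is deliberately left out,
	   candidate path validity is not modeled, and the numeric protocol
	   identifier values, the preference values and the names
	   (CandidatePath, select_active, prefer_installed) are assumptions
	   made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed protocol identifier values; only their ordering (Local > BGP)
# matters for this sketch.
BGP, LOCAL = 20, 30


@dataclass(frozen=True)
class CandidatePath:
    name: str
    preference: int
    protocol_origin: int
    discriminator: int


def select_active(paths: List[CandidatePath],
                  installed: Optional[CandidatePath] = None,
                  prefer_installed: bool = False) -> Optional[CandidatePath]:
    """Pick the active path among candidate paths of one SR Policy."""
    if not paths:
        return None
    # 1. Highest preference wins (Examples 2 and 3).
    best = max(p.preference for p in paths)
    tied = [p for p in paths if p.preference == best]
    # 2. Higher protocol identifier wins (Example 6).
    origin = max(p.protocol_origin for p in tied)
    tied = [p for p in tied if p.protocol_origin == origin]
    # 3. Optionally keep the installed path instead of comparing
    #    discriminators (Example 5).
    if prefer_installed and installed in tied:
        return installed
    # 4. Higher discriminator wins (Example 4).
    return max(tied, key=lambda p: p.discriminator)


# Example 4: same preference, RD2 > RD1, NLRI A installed at t0.
nlri_a = CandidatePath("NLRI A", preference=100, protocol_origin=BGP, discriminator=1)
nlri_b = CandidatePath("NLRI B", preference=100, protocol_origin=BGP, discriminator=2)
print(select_active([nlri_a, nlri_b], installed=nlri_a).name)
# NLRI B
```

	   Calling the same function with prefer_installed=True reproduces
	   Example 5, where NLRI A is retained.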

489	5.  Distributed and/or Centralized Control Plane

491	5.1.  Distributed Control Plane within a single Link-State IGP area

493	   Consider a single-area IGP with per-link latency measurement and
494	   advertisement of the measured latency in the extended-TE IGP TLV.

496	   A head-end H is configured with a single dynamic candidate path for
497	   SR policy P with a low-latency optimization objective and endpoint E.

499	   Clearly the SRP process at H learns the topology (and extended TE
500	   latency information) from the IGP and computes the solution Segment-
501	   List providing the low-latency path to E.

503	   No centralized controller is involved in such a deployment.

505	   The SR-DB at H only uses the Link-State DataBase (LSDB) provided by
506	   the IGP.

508	5.2.  Distributed Control Plane across several Link-State IGP areas

510	   Consider a domain D composed of two link-state IGP single-area
511	   instances (I1 and I2) where each sub-domain benefits from per-link
512	   latency measurement and advertisement of the measured latency in the
513	   related IGP.  The link-state information of each IGP is advertised
514	   via BGP-LS [RFC7752] towards a set of BGP-LS route reflectors (RR).

516	   H is a headend in IGP I1 sub-domain and E is an endpoint in IGP I2
517	   sub-domain.

519	   Using a BGP-LS session to any BGP-LS RR, H's SRP process may learn
520	   the link-state information of the remote domain I2.  H can thus
521	   compute the low-latency path from H to E as a solution Segment-List
522	   that spans the two domains I1 and I2.

524	   The SR-DB at H collects the LSDB from both sub-domains (I1 and I2).

526	   No centralized controller is required.

528	5.3.  Centralized Control Plane

530	   Considering the same domain D as in the previous section, let us now
531	   assume that H does not have a BGP-LS session to the BGP-LS RRs.
532	   Instead, let us assume a controller "C" has at least one BGP-LS
533	   session to the BGP-LS RRs.

535	   The controller C learns the topology and extended latency information
536	   from both sub-domains via BGP-LS.  It computes a low-latency path
537	   from H to E as a Segment-List <S1, S2, S3> and programs H with the
538	   related explicit candidate path.

540	   The headend H does not compute the solution Segment-List (it cannot).
541	   The headend only validates the received explicit candidate path.
542	   Most probably, the controller encodes the SIDs of the Segment-List
543	   with Type-1.  In that case, the headend's validation simply consists
544	   of resolving the first SID on an outgoing interface and next-hop.

546	   The SR-DB at H only includes the LSDB provided by the IGP I1.

548	   The SR-DB of the controller collects the LSDB from both sub-domains
549	   (I1 and I2).

551	5.4.  Distributed and Centralized Control Plane

553	   Consider the same domain D as in the previous section.

555	   H's SRP process is configured to associate color C1 with a low-
556	   latency optimization objective.

558	   H's BGP process is configured to steer a Route R/r of extended-color
559	   community C1 and of next-hop N via an SR policy (N, C1).

561	   Upon receiving a first BGP route of color C1 and of next-hop N, H
562	   recognizes the need for an SR Policy (N, C1) with a low-latency
563	   objective to N.  As N is outside the SR-DB of H, H requests that a
564	   controller compute such a Segment-List (e.g., via PCEP [RFC8664]).

566	   This is an example of a hybrid control plane: the BGP distributed
567	   control plane signals the routes and their TE requirements.  Upon
568	   receiving these BGP routes, a local headend either computes the
569	   solution Segment-List (entirely distributed when the endpoint is in
570	   the SR-DB of the headend) or delegates the computation to a
571	   controller (hybrid distributed/centralized control plane).
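
	   The decision made by the headend in this hybrid model can be
	   sketched as follows.  This is only an illustration: the function
	   on_bgp_route, its parameters and the two callbacks (standing in for
	   a local computation and for a request to a controller, e.g. over
	   PCEP) are hypothetical names and not part of any protocol or
	   implementation.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple

Policy = Tuple[str, List[str]]  # (computation mode, solution Segment-List)


def on_bgp_route(color: str, next_hop: str,
                 color_objectives: Dict[str, str],
                 sr_db_nodes: Set[str],
                 policies: Dict[Tuple[str, str], Policy],
                 compute_locally: Callable[[str, str], List[str]],
                 request_controller: Callable[[str, str], List[str]],
                 ) -> Optional[Policy]:
    """Instantiate SR Policy (next_hop, color) when a colored route arrives."""
    if color not in color_objectives:
        return None  # no optimization objective is associated with this color
    key = (next_hop, color)
    if key in policies:
        return policies[key]  # only the first such route triggers computation
    objective = color_objectives[color]
    if next_hop in sr_db_nodes:
        # Endpoint is in the headend's SR-DB: entirely distributed.
        policy = ("distributed", compute_locally(next_hop, objective))
    else:
        # Endpoint is outside the SR-DB: delegate to a controller.
        policy = ("delegated", request_controller(next_hop, objective))
    policies[key] = policy
    return policy
```

	   In the scenario above, N is not in the SR-DB of H, so the call
	   takes the delegated branch.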

573	   The SR-DB at H only includes the LSDB provided by the IGP.

575	   The SR-DB of the controller collects the LSDB from both sub-domains.

577	6.  Binding SID Aspects

579	   The Binding SID (BSID) is fundamental to Segment Routing.  It
580	   provides scaling, network opacity and service independence.

582	   This section describes implementation and operational aspects related
583	   to the Binding SID.

585	6.1.  Benefits of Binding SID

587	   A simplified illustration is provided on the basis of Figure 5 where
588	   it is assumed that S, A, B, Data Center Interconnect DCI1 and DCI2
589	   share the same IGP-SR instance in the data-center 1 (DC1).  DCI1,
590	   DCI2, C, D, E, F, G, DCI3 and DCI4 share the same IGP-SR domain in
591	   the core.  DCI3, DCI4, H, K and Z share the same IGP-SR domain in the
592	   data-center 2 (DC2).

594	             A---DCI1----C----D----E----DCI3---H
595	            /            |         |            \
596	           S             |         |             Z
597	            \            |         |            /
598	             B---DCI2----F---------G----DCI4---K
599	          <==DC1==><=========Core========><==DC2==>

601	                  Figure 5: A Simple Datacenter Topology

603	   In this example, it is assumed that there is no redistribution
604	   between the IGPs and that BGP-LU is not present.  The inter-domain
605	   communication is only provided by SR through SR Policies.

607	   The latency from S to DCI1 equals that from S to DCI2.  The latency
608	   from Z to DCI3 equals that from Z to DCI4.  All the intra-DC links
609	   have the same IGP metric 10.

611	   The path DCI1, C, D, E, DCI3 has a lower latency and lower capacity
612	   than the path DCI2, F, G, DCI4.

614	   The IGP metrics of all the core links are set to 10 except the link
615	   D-E, which is set to 100.

617	   A low-latency multi-domain policy from S to Z may be expressed as
618	   <DCI1, BSID, Z> where:

620	   o  DCI1 is the prefix SID of DCI1.

622	   o  BSID is the Binding SID bound to an SR policy <D, D2E, DCI3>
623	      instantiated at DCI1.

625	   o  Z is the prefix SID of Z.

627	   Without the use of an intermediate core SR Policy (efficiently
628	   summarized by a single BSID), S would need to steer its low-latency
629	   flow into the policy <DCI1, D, D2E, DCI3, Z>.

631	   The use of a BSID (and the intermediate bound SR Policy) decreases
632	   the number of segments imposed by the source.
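
	   The reduction in imposed segments can be illustrated with a small
	   sketch.  It is a simplified model that ignores the dataplane
	   encoding: SIDs are plain strings, and the function expand_at_node
	   and the BSID table are hypothetical names introduced only for this
	   example.

```python
from typing import Dict, List, Tuple

# SR Policies instantiated in the network, keyed by (headend, BSID).
# DCI1 binds "BSID" to the core policy <D, D2E, DCI3>.
BSID_TABLE: Dict[Tuple[str, str], List[str]] = {
    ("DCI1", "BSID"): ["D", "D2E", "DCI3"],
}


def expand_at_node(node: str, segment_list: List[str],
                   bsid_table: Dict[Tuple[str, str], List[str]]) -> List[str]:
    """If the active segment is a BSID local to node, replace it by the
    Segment-List of the bound SR Policy; otherwise leave the list as is."""
    if not segment_list:
        return []
    bound = bsid_table.get((node, segment_list[0]))
    if bound is None:
        return list(segment_list)
    return list(bound) + list(segment_list[1:])


# Segments imposed by S with and without the intermediate core policy.
with_bsid = ["DCI1", "BSID", "Z"]
without_bsid = ["DCI1", "D", "D2E", "DCI3", "Z"]

# At DCI1 the prefix SID of DCI1 has been consumed; the BSID is now active.
at_dci1 = expand_at_node("DCI1", with_bsid[1:], BSID_TABLE)
print(len(with_bsid), len(without_bsid), at_dci1)
# 3 5 ['D', 'D2E', 'DCI3', 'Z']
```

	   S imposes three segments instead of five, and DCI1 restores the
	   same remaining path that S would otherwise have had to encode.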

634	   A BSID acts as a stable anchor point which isolates one domain from
635	   the churn of another domain.  Upon topology changes within the core
636	   of the network, the low-latency path from DCI1 to DCI3 may change.
637	   While the path of an intermediate policy changes, its BSID does not
638	   change.  Hence the policy used by the source does not change, hence
639	   the source is shielded from the churn in another domain.

641	   A BSID provides opacity and independence between domains.  The
642	   administrative authority of the core domain may not want to share
643	   information about its topology.  The use of a BSID allows keeping the
644	   service opaque.  S is not aware of the details of how the low-latency
645	   service is provided by the core domain.  S is not aware of the need
646	   of the core authority to temporarily change the intermediate path.

648	6.2.  Centralized Discovery of available BSID

650	   This section explains how controllers can discover the local SIDs
651	   available at a node N so as to pick an explicit BSID for an SR Policy
652	   to be instantiated at headend N.

654	   Any controller can discover the following properties of a node N
655	   (e.g., via BGP-LS, NETCONF, etc.):

657	   o  its local topology [RFC7752].

659	   o  its topology-related SIDs (Prefix SIDs, Adj SIDs and EPE SIDs
660	      [I-D.ietf-idr-bgp-ls-segment-routing-ext]
661	      [I-D.ietf-idr-bgpls-segment-routing-epe]).

663	   o  its Segment Routing Label Block (SRLB).

665	   o  its SR Policies and their BSIDs ([RFC8664]
666	      [I-D.ietf-pce-binding-label-sid]
667	      [I-D.ietf-idr-te-lsp-distribution]).

669	   Any controller can thus infer the available SIDs in the SRLB of any
670	   node, under the assumption that the node advertises to the
671	   controller, via some protocol or mechanism, every SID that it has
672	   allocated from its SRLB.

674	   As an example, a controller discovers the following characteristics
675	   of N: SRLB (4000, 8000), 3 Adj SIDs (4001, 4002, 4003), 2 EPE SIDs
676	   (4004, 4005) and 3 SR Policies (whose BSIDs are respectively 4006,
677	   4007 and 4008).  This controller can deduce that the SRLB sub-range
678	   (4009, 8000) is free for allocation.
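
   The deduction in this example can be sketched as follows.  This is
   an illustrative Python fragment, not a normative procedure; the
   function name is hypothetical and the SRLB bounds are assumed to be
   inclusive.

```python
def free_ranges(srlb, allocated):
    """Return contiguous (first, last) ranges of unallocated SIDs in the SRLB.

    Assumption: both SRLB bounds are inclusive.
    """
    first, last = srlb
    used = {sid for sid in allocated if first <= sid <= last}
    ranges = []
    start = None
    for sid in range(first, last + 1):
        if sid in used:
            if start is not None:
                ranges.append((start, sid - 1))
                start = None
        elif start is None:
            start = sid
    if start is not None:
        ranges.append((start, last))
    return ranges

# Characteristics of N as discovered by the controller.
srlb = (4000, 8000)
adj_sids = [4001, 4002, 4003]
epe_sids = [4004, 4005]
bsids = [4006, 4007, 4008]

available = free_ranges(srlb, adj_sids + epe_sids + bsids)
print(available)
```

   Under the inclusive-bounds assumption, the result contains the
   sub-range (4009, 8000) named above; it also reports 4000 as
   unallocated, since no advertised SID uses that value.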

680	   A controller is not restricted to using the next numerically available
681	   SID in the available SRLB sub-range.  It can pick any label in the
682	   subset of available labels.  Such a random pick makes a collision
683	   unlikely.

685	   An operator could also sub-allocate the SRLB between different
686	   controllers (e.g., (4000-4499) to controller 1 and (4500-5000) to
687	   controller 2).

689	   Inter-controller state synchronization may be used to avoid or detect
690	   BSID collisions.

692	   All these techniques make a collision between different controllers
693	   very unlikely.
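
   A minimal sketch of two of these techniques combined (SRLB
   sub-allocation per controller and a random pick among the free
   labels) is given below.  The function and variable names are
   hypothetical, and the set of labels in use reuses the values of the
   earlier example for node N.

```python
import random

def pick_bsid(sub_range, in_use, rng=random):
    """Pick a random free SID from a controller's SRLB sub-range (inclusive)."""
    first, last = sub_range
    candidates = [sid for sid in range(first, last + 1) if sid not in in_use]
    if not candidates:
        raise ValueError("no free SID left in this sub-range")
    return rng.choice(candidates)

# SIDs already advertised by node N (Adj SIDs, EPE SIDs and BSIDs).
in_use = set(range(4001, 4009))

# Operator-defined sub-allocation of the SRLB between two controllers.
controller_ranges = {
    "controller 1": (4000, 4499),
    "controller 2": (4500, 5000),
}

bsid_1 = pick_bsid(controller_ranges["controller 1"], in_use)
bsid_2 = pick_bsid(controller_ranges["controller 2"], in_use)
print(bsid_1, bsid_2)
```

   Because the two sub-ranges are disjoint, the two picks cannot
   collide with each other; the random choice only reduces the chance
   of colliding with an allocation the controller has not yet learned.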

695	   In the unlikely case of a collision, the controllers will detect it
696	   through system alerts, BGP-LS reporting using
697	   [I-D.ietf-idr-te-lsp-distribution] or PCEP notification [RFC8231].
698	   They then have the choice to continue the operation of their SR
699	   Policy with the dynamically allocated BSID or re-try with another
700	   explicit pick.

702	   Note: in deployments where the PCE Protocol (PCEP) is used between
703	   the head-end and the controller (PCE), a head-end can report the BSID
704	   as well as policy attributes (e.g., type of disjointness) and
705	   operational and administrative states to the controller.  Similarly,
706	   a controller can assign or update the BSID of a policy via PCEP when
707	   instantiating or updating an SR Policy.

709	7.  Flex-Algorithm Based SR Policies

711	   SR allows for association of algorithms to Prefix SIDs [RFC8402].
712	   [I-D.ietf-lsr-flex-algo] defines the IGP-based Flex-Algorithm
713	   solution, which allows IGPs themselves to compute constraint-based
714	   paths over the network.  The Prefix SIDs that a node advertises for a
715	   specific Flex-Algorithm are used in the forwarding plane to steer
716	   traffic along the corresponding constraint-based path to that node.

718	   As specified in [RFC8402], these IGP Flex-Algo Prefix SIDs can be used
719	   as segments within SR Policies, thereby leveraging the underlying IGP
720	   Flex-Algo solution.

722	            1--RED--2-------6
723	            |       |       |
724	            4-------3--RED--9

726	                  Figure 6: Illustration for Flex-Alg SID

728	   Now let us assume that

730	   o  1, 2, 3 and 4 are part of IGP 1.

732	   o  2, 6, 9 and 3 are part of IGP 2.

734	   o  All the IGP link costs are 10.

736	   o  Links 1-2 and 3-9 are colored with IGP Link Affinity Red.

738	   o  Flex-Alg1 is defined in both IGPs as: avoid red, minimize IGP
739	      metric.

741	   o  All nodes of each IGP domain are enabled for FlexAlg1.

743	   o  SID(k, 0) represents the PrefixSID of node k according to Alg=0.

745	   o  SID(k, FlexAlg1) represents the PrefixSID of node k according to
746	      Flex-Alg1.

748	   A controller can steer a flow from 1 to 9 through an end-to-end path
749	   that avoids the RED links of both IGP domains thanks to the explicit
750	   SR Policy <SID(2, FlexAlg1), SID(9, FlexAlg1)>.
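
   The path followed by each of the two Flex-Alg1 Prefix SIDs (that of
   node 2, then that of node 9) can be sketched with a small Dijkstra
   computation that prunes RED links before minimizing the IGP metric.
   This is an illustration only, not the procedure of
   [I-D.ietf-lsr-flex-algo]; the function name is hypothetical, and
   link 2-3 is assumed to be present in both IGPs since Figure 6 does
   not state which domain it belongs to.

```python
import heapq

def flex_algo_path(links, src, dst, excluded_color="RED"):
    """Dijkstra on IGP metric over the links that do not carry excluded_color."""
    graph = {}
    for a, b, metric, color in links:
        if color == excluded_color:
            continue  # "avoid red": the link is pruned from the topology
        graph.setdefault(a, []).append((b, metric))
        graph.setdefault(b, []).append((a, metric))
    queue = [(0, src, [src])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, metric in graph.get(node, []):
            if neighbor not in visited:
                heapq.heappush(queue, (cost + metric, neighbor, path + [neighbor]))
    return None

# (node, node, IGP metric, affinity) per Figure 6; all metrics are 10.
igp1 = [(1, 2, 10, "RED"), (1, 4, 10, None), (4, 3, 10, None), (2, 3, 10, None)]
igp2 = [(2, 6, 10, None), (6, 9, 10, None), (3, 9, 10, "RED"), (2, 3, 10, None)]

cost1, leg1 = flex_algo_path(igp1, 1, 2)  # followed by SID(2, FlexAlg1)
cost2, leg2 = flex_algo_path(igp2, 2, 9)  # followed by SID(9, FlexAlg1)
end_to_end = leg1 + leg2[1:]
print(end_to_end)
```

   With these assumptions the first segment is carried along 1-4-3-2
   and the second along 2-6-9, so neither RED link is traversed.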

752	8.  Layer 2 and Optical Transport

754	                  1----2----3----4----5
755	         I2(lambda L241)\       / I4(lambda L241)
756	                         Optical

758	                 Figure 7: SR Policy with integrated DWDM

760	   An explicit candidate path can express a path through a transport
761	   layer beneath IP.  The transport layer could be ATM, FR, DWDM,
762	   back-to-back Ethernet, etc.  The transport path is modelled
763	   as a link between two IP nodes with the specific assumption that no
764	   distributed IP routing protocol runs over the link.  The link may
765	   have an IP address or be IP unnumbered.  Depending on the transport
766	   protocol, the link can be a physical DWDM interface and a lambda
767	   (integrated solution), an Ethernet interface and a VLAN, an ATM
768	   interface with a VPI/VCI, an FR interface with a DLCI, etc.

770	   Using the DWDM integrated use-case of Figure 7 as an illustration,
771	   let us assume:

773	   o  Nodes 1, 2, 3, 4 and 5 are IP routers running an SR-enabled IGP on
774	      the links 1-2, 2-3, 3-4 and 4-5.

776	   o  The SRGB is homogeneous (16000, 24000).

778	   o  Node K's prefix SID is 16000+K.

780	   o  Node 2 has an integrated DWDM interface I2 with lambda L241.

782	   o  Node 4 has an integrated DWDM interface I4 with lambda L241.

784	   o  the optical network is provisioned with a circuit from 2 to 4 with
785	      continuous lambda L241 (details outside the scope of this
786	      document).

788	   o  Node 2 is provisioned with an SR policy with Segment-List
789	      <I2(L241)> and Binding SID B where I2(L241) is of type 5 (IPv4) or
790	      type 7 (IPv6); see Section 4 of
791	      [I-D.ietf-spring-segment-routing-policy].

793	   o  node 1 steers a packet P1 towards the prefix SID of node 5
794	      (16005).

796	   o  node 1 steers a packet P2 on the SR policy <16002, B, 16005>.

798	   In such a case, the journey of P1 will be 1-2-3-4-5 while the journey
799	   of P2 will be 1-2-lambda(L241)-4-5.  P2 skips the IP hop 3 and
800	   leverages the DWDM circuit from node 2 to node 4.  P1 follows the
801	   shortest-path computed by the distributed routing protocol.  The path
802	   of P1 is unaltered by the addition, modification or deletion of
803	   optical bypass circuits.
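
   The two journeys can be reproduced with a short simulation.  The
   sketch below is illustrative only: the numeric value chosen for
   Binding SID B is hypothetical, and the IP shortest path is
   hard-coded for the linear topology of Figure 7.

```python
SRGB_BASE = 16000   # homogeneous SRGB; node K's prefix SID is 16000+K
BSID_B = 40001      # hypothetical value for Binding SID B, outside the SRGB

# BSID -> (headend, far end of the optical circuit, label used in the trace)
POLICIES = {BSID_B: (2, 4, "lambda(L241)")}

def next_hop(node, dest):
    """Shortest-path next hop on the linear IP topology 1-2-3-4-5."""
    return node + 1 if dest > node else node - 1

def journey(start, segment_list):
    """Return the hop-by-hop journey of a packet as a dash-separated string."""
    node, trace = start, [str(start)]
    for sid in segment_list:
        if sid in POLICIES:
            headend, far_end, label = POLICIES[sid]
            if node != headend:
                raise ValueError("a BSID is only meaningful at its headend")
            # Segment-List <I2(L241)>: send the packet over the DWDM circuit.
            trace += [label, str(far_end)]
            node = far_end
        else:
            dest = sid - SRGB_BASE
            while node != dest:
                node = next_hop(node, dest)
                trace.append(str(node))
    return "-".join(trace)

p1 = journey(1, [16005])
p2 = journey(1, [16002, BSID_B, 16005])
print(p1)
print(p2)
```

   Removing the entry from POLICIES changes nothing for P1, which
   mirrors the statement that its path is unaffected by optical bypass
   circuits.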

805	   The salient point of this example is that the SR Policy architecture
806	   seamlessly supports explicit candidate paths through any transport
807	   sub-layer.

809	   BGP-LS extensions to describe the sub-IP-layer characteristics of the
810	   SR Policy are out of scope of this document (e.g., in Figure 7, the
811	   DWDM characteristics of the SR Policy at node 2 in terms of latency,
812	   loss, security, domain/country traversed by the circuit etc.).

814	   Further details of the SR Policy use-case for Packet Optical networks
815	   are specified in [I-D.anand-spring-poi-sr].

817	9.  Security Considerations

819	   The security considerations for the Segment Routing architecture are
820	   described in [RFC8402] and those for the SR Policy architecture in
821	   [I-D.ietf-spring-segment-routing-policy].  Both apply to this
822	   document as well.

824	10.  IANA Considerations

826	   This document has no actions for IANA.

828	11.  Acknowledgement

830	   The authors would like to thank Tarek Saad, Dhanendra Jain, Muhammad
831	   Durrani and Rob Shakir for their valuable comments and suggestions.

833	12.  Contributors

835	   The following people have contributed to this document:

837	   Siva Sivabalan
838	   Cisco Systems
839	   Email: msiva@cisco.com

841	   Zafar Ali
842	   Cisco Systems
843	   Email: zali@cisco.com

844	   Jose Liste
845	   Cisco Systems
846	   Email: jliste@cisco.com

848	   Francois Clad
849	   Cisco Systems
850	   Email: fclad@cisco.com

852	   Kamran Raza
853	   Cisco Systems
854	   Email: skraza@cisco.com

856	   Shraddha Hegde
857	   Juniper Networks
858	   Email: shraddha@juniper.net

860	   Steven Lin
861	   Google, Inc.
862	   Email: stevenlin@google.com

864	   Alex Bogdanov
865	   Google, Inc.
866	   Email: bogdanov@google.com

868	   Daniel Voyer
869	   Bell Canada
870	   Email: daniel.voyer@bell.ca

872	   Dirk Steinberg
873	   Steinberg Consulting
874	   Email: dws@steinbergnet.net

876	   Bruno Decraene
877	   Orange Business Services
878	   Email: bruno.decraene@orange.com

880	   Stephane Litkowski
881	   Orange Business Services
882	   Email: stephane.litkowski@orange.com

884	   Luay Jalil
885	   Verizon
886	   Email: luay.jalil@verizon.com

888	13.  References

890	13.1.  Normative References

892	   [I-D.ietf-spring-segment-routing-policy]
893	              Filsfils, C., Talaulikar, K., Voyer, D., Bogdanov, A., and
894	              P. Mattes, "Segment Routing Policy Architecture", draft-
895	              ietf-spring-segment-routing-policy-08 (work in progress),
896	              July 2020.

898	   [RFC8402]  Filsfils, C., Ed., Previdi, S., Ed., Ginsberg, L.,
899	              Decraene, B., Litkowski, S., and R. Shakir, "Segment
900	              Routing Architecture", RFC 8402, DOI 10.17487/RFC8402,
901	              July 2018, <https://www.rfc-editor.org/info/rfc8402>.

903	13.2.  Informative References

905	   [I-D.anand-spring-poi-sr]
906	              Anand, M., Bardhan, S., Subrahmaniam, R., Tantsura, J.,
907	              Mukhopadhyaya, U., and C. Filsfils, "Packet-Optical
908	              Integration in Segment Routing", draft-anand-spring-poi-
909	              sr-08 (work in progress), July 2019.

911	   [I-D.ietf-idr-bgp-ls-segment-routing-ext]
912	              Previdi, S., Talaulikar, K., Filsfils, C., Gredler, H.,
913	              and M. Chen, "BGP Link-State extensions for Segment
914	              Routing", draft-ietf-idr-bgp-ls-segment-routing-ext-16
915	              (work in progress), June 2019.

917	   [I-D.ietf-idr-bgpls-segment-routing-epe]
918	              Previdi, S., Talaulikar, K., Filsfils, C., Patel, K., Ray,
919	              S., and J. Dong, "BGP-LS extensions for Segment Routing
920	              BGP Egress Peer Engineering", draft-ietf-idr-bgpls-
921	              segment-routing-epe-19 (work in progress), May 2019.

923	   [I-D.ietf-idr-segment-routing-te-policy]
924	              Previdi, S., Filsfils, C., Talaulikar, K., Mattes, P.,
925	              Rosen, E., Jain, D., and S. Lin, "Advertising Segment
926	              Routing Policies in BGP", draft-ietf-idr-segment-routing-
927	              te-policy-09 (work in progress), May 2020.

929	   [I-D.ietf-idr-te-lsp-distribution]
930	              Previdi, S., Talaulikar, K., Dong, J., Chen, M., Gredler,
931	              H., and J. Tantsura, "Distribution of Traffic Engineering
932	              (TE) Policies and State using BGP-LS", draft-ietf-idr-te-
933	              lsp-distribution-13 (work in progress), April 2020.

935	   [I-D.ietf-lsr-flex-algo]
936	              Psenak, P., Hegde, S., Filsfils, C., Talaulikar, K., and
937	              A. Gulko, "IGP Flexible Algorithm", draft-ietf-lsr-flex-
938	              algo-12 (work in progress), October 2020.

940	   [I-D.ietf-pce-binding-label-sid]
941	              Filsfils, C., Sivabalan, S., Tantsura, J., Hardwick, J.,
942	              Previdi, S., and C. Li, "Carrying Binding Label/Segment-ID
943	              in PCE-based Networks.", draft-ietf-pce-binding-label-
944	              sid-03 (work in progress), June 2020.

946	   [I-D.ietf-spring-srv6-network-programming]
947	              Filsfils, C., Camarillo, P., Leddy, J., Voyer, D.,
948	              Matsushima, S., and Z. Li, "SRv6 Network Programming",
949	              draft-ietf-spring-srv6-network-programming-24 (work in
950	              progress), October 2020.

952	   [RFC1195]  Callon, R., "Use of OSI IS-IS for routing in TCP/IP and
953	              dual environments", RFC 1195, DOI 10.17487/RFC1195,
954	              December 1990, <https://www.rfc-editor.org/info/rfc1195>.

956	   [RFC2328]  Moy, J., "OSPF Version 2", STD 54, RFC 2328,
957	              DOI 10.17487/RFC2328, April 1998,
958	              <https://www.rfc-editor.org/info/rfc2328>.

960	   [RFC3630]  Katz, D., Kompella, K., and D. Yeung, "Traffic Engineering
961	              (TE) Extensions to OSPF Version 2", RFC 3630,
962	              DOI 10.17487/RFC3630, September 2003,
963	              <https://www.rfc-editor.org/info/rfc3630>.

965	   [RFC5305]  Li, T. and H. Smit, "IS-IS Extensions for Traffic
966	              Engineering", RFC 5305, DOI 10.17487/RFC5305, October
967	              2008, <https://www.rfc-editor.org/info/rfc5305>.

969	   [RFC5340]  Coltun, R., Ferguson, D., Moy, J., and A. Lindem, "OSPF
970	              for IPv6", RFC 5340, DOI 10.17487/RFC5340, July 2008,
971	              <https://www.rfc-editor.org/info/rfc5340>.

973	   [RFC7471]  Giacalone, S., Ward, D., Drake, J., Atlas, A., and S.
974	              Previdi, "OSPF Traffic Engineering (TE) Metric
975	              Extensions", RFC 7471, DOI 10.17487/RFC7471, March 2015,
976	              <https://www.rfc-editor.org/info/rfc7471>.

978	   [RFC7752]  Gredler, H., Ed., Medved, J., Previdi, S., Farrel, A., and
979	              S. Ray, "North-Bound Distribution of Link-State and
980	              Traffic Engineering (TE) Information Using BGP", RFC 7752,
981	              DOI 10.17487/RFC7752, March 2016,
982	              <https://www.rfc-editor.org/info/rfc7752>.

984	   [RFC8231]  Crabbe, E., Minei, I., Medved, J., and R. Varga, "Path
985	              Computation Element Communication Protocol (PCEP)
986	              Extensions for Stateful PCE", RFC 8231,
987	              DOI 10.17487/RFC8231, September 2017,
988	              <https://www.rfc-editor.org/info/rfc8231>.

990	   [RFC8570]  Ginsberg, L., Ed., Previdi, S., Ed., Giacalone, S., Ward,
991	              D., Drake, J., and Q. Wu, "IS-IS Traffic Engineering (TE)
992	              Metric Extensions", RFC 8570, DOI 10.17487/RFC8570, March
993	              2019, <https://www.rfc-editor.org/info/rfc8570>.

995	   [RFC8660]  Bashandy, A., Ed., Filsfils, C., Ed., Previdi, S.,
996	              Decraene, B., Litkowski, S., and R. Shakir, "Segment
997	              Routing with the MPLS Data Plane", RFC 8660,
998	              DOI 10.17487/RFC8660, December 2019,
999	              <https://www.rfc-editor.org/info/rfc8660>.

1001	   [RFC8664]  Sivabalan, S., Filsfils, C., Tantsura, J., Henderickx, W.,
1002	              and J. Hardwick, "Path Computation Element Communication
1003	              Protocol (PCEP) Extensions for Segment Routing", RFC 8664,
1004	              DOI 10.17487/RFC8664, December 2019,
1005	              <https://www.rfc-editor.org/info/rfc8664>.

1007	Authors' Addresses

1009	   Clarence Filsfils
1010	   Cisco Systems, Inc.
1011	   Pegasus Parc
1012	   De kleetlaan 6a, DIEGEM  BRABANT 1831
1013	   BELGIUM

1015	   Email: cfilsfil@cisco.com

1017	   Ketan Talaulikar (editor)
1018	   Cisco Systems, Inc.

1020	   Email: ketant@cisco.com

1022	   Przemyslaw Krol
1023	   Google, Inc.

1025	   Email: pkrol@google.com

1026	   Martin Horneffer
1027	   Deutsche Telekom

1029	   Email: martin.horneffer@telekom.de

1031	   Paul Mattes
1032	   Microsoft
1033	   One Microsoft Way
1034	   Redmond, WA  98052-6399
1035	   USA

1037	   Email: pamattes@microsoft.com