Abstract

A Segment Routing (SR) Policy is an ordered list of segments (i.e., instructions) that represent a source-routed policy. An SR Policy consists of one or more candidate paths, each consisting of one or more segment lists. A headend may be provisioned with candidate paths for an SR Policy via several different mechanisms, e.g., CLI, NETCONF, PCEP, or BGP.

"an ordered list of segments" or "ordered lists of segments" or "ordered lists of ordered segments"? I assume each candidate path can have at least one "ordered list of segments".

This document introduces a BGP subsequent address family (SAFI) for IPv4 and IPv6 address families. In UPDATE messages of those AFI/SAFIs, the NLRI identifies an SR Policy Candidate Path while the attributes encode the segment lists and other details of that SR Policy Candidate Path.

Does a candidate path include the endpoint and color information?

While for simplicity we may write that BGP advertises an SR Policy, it has to be understood that BGP advertises a candidate path of an SR policy and that this SR Policy might have several other candidate paths provided via BGP (via an NLRI with a different distinguisher as defined in Section 2.1), PCEP, NETCONF, or local policy configuration.

Typically, a controller defines the set of policies and advertises them to policy headend routers (typically ingress routers). These policy advertisements use the BGP extensions defined in this document. The policy advertisement is, in most but not all cases, tailored for a specific policy headend. In this case, the advertisement may be sent on a BGP session to that headend and not propagated any further.

"in most cases" and "in this case" - are they the same?

Alternatively, a router (i.e., a BGP egress router) advertises SR Policies representing paths to itself. In this case, it is possible to send the policy to each headend over a BGP session to that headend, without requiring any further propagation of the policy.

What's the difference from the previous one? Whether it is sent from an egress router or a controller should not matter, right?

An SR Policy intended only for the receiver will, in most cases, not traverse any Route Reflector (RR, [RFC4456]).

Is the above paragraph correct/needed? I suppose in most cases they will traverse RR after all - whether it is from a controller or an egress PE.

In some situations, it is undesirable for a controller or BGP egress router to have a BGP session to each policy headend. In these situations, BGP Route Reflectors may be used to propagate the advertisements. In certain other deployments, it may be necessary for the advertisement to propagate through a sequence of one or more ASes within an SR Domain (refer to Section 7 for the associated security considerations). To make this possible, an attribute needs to be attached to the advertisement that enables a BGP speaker to determine whether it is intended to be a headend for the advertised policy. This is done by attaching one or more Route Target Extended Communities to the advertisement [RFC4360].

How is further propagation prevented after the headend is reached?

The BGP extensions for the advertisement of SR Policies include following components:

The BGP extensions are for the advertisement of SR Policy Candidate Paths, not SR Policies themselves, right?

* One or more IPv4 address format route target extended community ([RFC4360]) attached to the SR Policy advertisement and that indicates the intended headend of such an SR Policy advertisement.

and IPv6? s/format/specific/?

The SR Policy SAFI route updates use the Tunnel Encapsulation Attribute to signal an SR Policy - i.e., a tunnel itself.

An SR Policy Candidate Path, not an SR Policy?

Good to see "a tunnel itself" mentioned here :-) I've always thought the "SR Policy" is a convoluted term for tunnel :-)

Its usage of this attribute is hence very different from [RFC9012] where this attribute is associated with a BGP route update (e.g., for Internet or VPN routes) to specify the tunnel which is used for forwarding traffic for that route. This document does not update or change the usage of the Tunnel Encapsulation Attribute as specified in [RFC9012] for existing AFI/SAFIs as specified in that document. The details of processing of the Tunnel Encapsulation Attribute for the SR Policy SAFI are specified in Section 2.2 and Section 2.3.

Good to see the difference is pointed out here. I've always thought Tunnel Encapsulation Attribute (TEA) is shoehorned here but I guess it is too late to change that.

The Color Extended Community (as defined in [RFC9012]) is used to steer traffic into an SR Policy, as described in section 8.8 of [RFC9256]. The Section 3 of this document updates [RFC9012] with modifications to the format of the Flags field of the Color Extended Community by using the two leftmost bits of that field.

* Policy Color: 4-octet value identifying (with the endpoint) the policy. The color is used to match the color of the destination prefixes to steer traffic into the SR Policy as specified in section 8 of [RFC9256].

* Endpoint: value identifies the endpoint of a policy. The Endpoint may represent a single node or a set of nodes (e.g., an anycast address). The Endpoint is an IPv4 (4-octet) address or an IPv6 (16-octet) address according to the AFI of the NLRI. The address can be either a unicast or an unspecified address (0.0.0.0 for IPv4, :: for IPv6) as specified in section 2.1 of [RFC9256].

Can you call it out as "null endpoint" that was used later?
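
To make the Policy Color and Endpoint fields quoted above more concrete, here is a minimal sketch of decoding the body of an SR Policy NLRI. It assumes the layout implied by the 12- or 24-octet rule quoted from Section 4.2.1 later in this review: a 4-octet distinguisher, the 4-octet color, then a 4- or 16-octet endpoint chosen by the AFI. The class and function names are hypothetical, and any NLRI length octet is assumed to have been stripped before the call.

```python
import ipaddress
import struct
from dataclasses import dataclass
from typing import Union

AFI_IPV4 = 1  # IANA Address Family Numbers: IPv4
AFI_IPV6 = 2  # IANA Address Family Numbers: IPv6


@dataclass(frozen=True)
class SrPolicyNlri:
    """Hypothetical container for the three NLRI fields."""
    distinguisher: int
    color: int
    endpoint: Union[ipaddress.IPv4Address, ipaddress.IPv6Address]

    @property
    def has_null_endpoint(self) -> bool:
        # 0.0.0.0 for IPv4, :: for IPv6 (the unspecified address).
        return self.endpoint.is_unspecified


def parse_sr_policy_nlri(afi: int, body: bytes) -> SrPolicyNlri:
    """Decode distinguisher, color, and endpoint from the NLRI body.

    Assumption: 'body' excludes any length octet and is laid out as
    distinguisher (4) | color (4) | endpoint (4 or 16).
    """
    if afi == AFI_IPV4:
        expected, addr_cls = 12, ipaddress.IPv4Address
    elif afi == AFI_IPV6:
        expected, addr_cls = 24, ipaddress.IPv6Address
    else:
        raise ValueError(f"unsupported AFI {afi}")
    if len(body) != expected:
        raise ValueError(f"NLRI body is {len(body)} octets, expected {expected}")
    distinguisher, color = struct.unpack_from("!II", body, 0)
    return SrPolicyNlri(distinguisher, color, addr_cls(body[8:]))
```

A body whose length does not match the AFI is rejected outright, which mirrors the consistency check between NLRI length and address family that the draft asks for.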

It is important to note that any BGP speaker receiving a BGP message with an SR Policy NLRI, will process it only if the NLRI is among the best paths as per the BGP best-path selection algorithm. In other words, this document leverages the existing BGP propagation and best-path selection rules. Details of the procedures are described in Section 4.

There is a lot of "processing" before it is deemed "among the best paths", right? Do you mean the "SRPM" will process it only if the NLRI is among the best paths?

   SR Policy SAFI NLRI:
   Attributes:
      Tunnel Encapsulation Attribute (23)
         Tunnel Type: SR Policy (15)
             Binding SID
             SRv6 Binding SID
             Preference
             Priority
             Policy Name
             Policy Candidate Path Name
             Explicit NULL Label Policy (ENLP)
             Segment List
                 Weight
                 Segment
                 Segment
                 ...
             ...

                  Figure 2: SR Policy Encoding

Policy name seems to be a property for policy not the candidate path. What if the names do not match among different candidate paths of the same policy?

2.3. Applicability of Tunnel Encapsulation Attribute Sub-TLVs

The Tunnel Egress Endpoint and Color sub-TLVs, as defined in [RFC9012], may also be present in the SR Policy encodings.

Why do we say the above given the following paragraph? They seem to be contradictory.

The Tunnel Egress Endpoint and Color Sub-TLVs of the Tunnel Encapsulation Attribute are not used for SR Policy encodings and therefore their value is irrelevant in the context of the SR Policy SAFI NLRI. If present, the Tunnel Egress Endpoint sub-TLV and the Color sub-TLV MUST be ignored by the BGP speaker and MAY be removed from the Tunnel Encapsulation Attribute during propagation. Similarly, any other sub-TLVs (including those defined in [RFC9012]) whose applicability is not specifically defined for the SR Policy SAFI MUST be ignored by the BGP speaker and MAY be removed from the Tunnel Encapsulation Attribute during propagation.

Why don't we say that any sub-TLVs not defined in this document must not be present and must be ignored?
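
The "MUST be ignored" rule quoted from Section 2.3 amounts to partitioning the received sub-TLVs into those the SR Policy tunnel type defines and everything else. The sketch below is only an illustration under simplifying assumptions: sub-TLVs are identified by the names used in Figure 2 rather than by code point, and the function name `split_sub_tlvs` is hypothetical.

```python
from typing import Iterable, List, Tuple

SubTlv = Tuple[str, bytes]  # (name, raw value); names stand in for code points

# Sub-TLVs listed in Figure 2 for the SR Policy tunnel type.
SR_POLICY_SUB_TLVS = frozenset({
    "Binding SID",
    "SRv6 Binding SID",
    "Preference",
    "Priority",
    "Policy Name",
    "Policy Candidate Path Name",
    "Explicit NULL Label Policy",
    "Segment List",
})


def split_sub_tlvs(sub_tlvs: Iterable[SubTlv]) -> Tuple[List[SubTlv], List[SubTlv]]:
    """Return (used, ignored) sub-TLVs, preserving the received order.

    Anything outside SR_POLICY_SUB_TLVS, including Tunnel Egress Endpoint
    and Color, lands in 'ignored'. It is not treated as an error.
    """
    used: List[SubTlv] = []
    ignored: List[SubTlv] = []
    for name, value in sub_tlvs:
        (used if name in SR_POLICY_SUB_TLVS else ignored).append((name, value))
    return used, ignored
```

Keeping the ignored sub-TLVs in a separate list, rather than discarding them, leaves the choice of whether to strip them on propagation to the caller, which matches the "MAY be removed" wording.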

Preference, Binding SID, SRv6 Binding SID, Segment-List, Priority, Policy Name, Policy Candidate Path Name, and Explicit NULL Label Policy are all optional sub-TLVs introduced for the BGP Tunnel Encapsulation Attribute [RFC9012] being defined in this section.

Should the segment-list be mandatory? What does it mean if the segment-list is empty?

When the Binding SID sub-TLV is used to signal an SRv6 SID, the choice of its SRv6 Endpoint Behavior [RFC8986] to be instantiated is left to the headend node. It is RECOMMENDED that the SRv6 Binding SID sub-TLV defined in Section 2.4.3, that enables the specification of the SRv6 Endpoint Behavior, be used for signaling of an SRv6 Binding SID for an SR Policy candidate path.

Is there a choice here? Shouldn't the behavior be that traffic with that Binding SID is steered into this policy? The whole paragraph is hard to parse.

* Binding SID: If the length is 2, then no Binding SID is present. If the length is 6 then the Binding SID is encoded in 4 octets using the format below. Traffic Class (TC), S, and TTL (Total of 12 bits) are RESERVED and MUST be set to zero and MUST be ignored.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                 Label                 | TC  |S|      TTL      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

              Figure 6: Binding SID Label Encoding

If the length is 18 then the Binding SID contains a 16-octet SRv6 SID.

Why do we need a 16-octet Binding SID since we have the following "SRv6 Binding SID Sub-TLV"?

2.4.3. SRv6 Binding SID Sub-TLV

The SRv6 Binding SID sub-TLV is optional. More than one SRv6 Binding SID sub-TLVs MAY be signaled in the same SR Policy encoding to indicate one or more SRv6 SIDs, each with potentially different SRv6 Endpoint Behaviors to be instantiated.

Why would there be more than one signaled, and why would there be different endpoint behaviors? Isn't the behavior simply "steer into the SR policy"?
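
For reference, the three length cases of the Binding SID sub-TLV quoted above (2, 6, and 18) can be sketched as follows. This is a rough illustration, not the draft's procedure: it assumes the value starts with one flags octet and one reserved octet (which is what a length of 2 meaning "no Binding SID" implies), and the function name `decode_binding_sid` is hypothetical.

```python
import ipaddress
from typing import Optional, Union


def decode_binding_sid(
    length: int, value: bytes
) -> Optional[Union[int, ipaddress.IPv6Address]]:
    """Decode the Binding SID carried in a Binding SID sub-TLV.

    'value' is the sequence of octets following the Length field.
    Assumption: its first two octets are flags and reserved.
    Returns None, a 20-bit MPLS label, or an SRv6 SID.
    """
    if len(value) != length:
        raise ValueError("value does not match the advertised length")
    if length == 2:
        return None  # no Binding SID present
    bsid = value[2:]
    if length == 6:
        # Label is the top 20 bits; TC, S, and TTL (12 bits) are ignored.
        return int.from_bytes(bsid, "big") >> 12
    if length == 18:
        return ipaddress.IPv6Address(bsid)  # 16-octet SRv6 SID
    raise ValueError(f"unexpected Binding SID sub-TLV length {length}")
```

Shifting out the low 12 bits rather than checking them for zero follows the "MUST be ignored" half of the rule for the receiver; a sender would still be expected to set them to zero.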

- S-Flag: This flag encodes the "Specified-BSID-only" behavior. It is used by SRPM as described in section 6.2.3 in [RFC9256].

I have trouble understanding this "Specified-BSID-only" behavior.

- I-Flag: This flag encodes the "Drop Upon Invalid" behavior. It is used by SRPM as described in section 8.2 in [RFC9256].

I also have trouble understanding this "Drop Upon Invalid" behavior. I read rfc9256 but still can't put the two together.

2.4.4.2. Segment Sub-TLVs

The Segment sub-TLVs are optional and MAY appear multiple times in the Segment List sub-TLV.

Why are they optional? What is the use case of an empty segment list?

2.4.4.2.2. Segment Type B

The Type B Segment Sub-TLV encodes a single SRv6 SID. The format is as follows:

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     Type      |    Length     |     Flags     |   RESERVED    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
//                    SRv6 SID (16 octets)                     //
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
//     SRv6 Endpoint Behavior and SID Structure (optional)     //
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

* Flags: 1 octet of flags as defined in Section 2.4.4.2.3.

* SRv6 SID: 16 octets of IPv6 address.

* SRv6 Endpoint Behavior and SID Structure: Optional, as defined in Section 2.4.4.2.4.

When this is part of a segment list, what is the significance of the Flags and SRv6 Endpoint Behavior and SID Structure?

The TLV 2 defined for the advertisement of Segment Type B in the earlier versions of this document has been deprecated to avoid backward compatibility issues.

Why would deprecating them avoid backward compatibility issues? If there are implementations/deployments based on earlier versions, deprecating them won't help. If there are no implementations/deployments based on earlier versions, there is no backward compatibility issue. Perhaps just remove "to avoid ..."?

2.4.4.2.3. Segment Flags

The Segment Types sub-TLVs described above may contain the following flags in the "Flags" field defined in Section 6.8:

 0 1 2 3 4 5 6 7
+-+-+-+-+-+-+-+-+
|V|   |B|       |
+-+-+-+-+-+-+-+-+

   Figure 22: Segment Flags

where:

V-Flag: This flag, when set, is used by SRPM for "SID verification" as described in Section 5.1 of [RFC9256].

I have trouble understanding the V-Flag. How is the headend supposed to verify the BSID or any segment in the segment list?

2.4.4.2.4. SRv6 SID Endpoint Behavior and Structure

The Segment Types sub-TLVs described above MAY contain the SRv6 Endpoint Behavior and SID Structure [RFC8986] encoding as described below:

+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       Endpoint Behavior       |           Reserved            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   LB Length   |   LN Length   |  Fun. Length  |  Arg. Length  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

       Figure 23: SRv6 SID Endpoint Behavior and Structure

where:

Endpoint Behavior: 2 octets. It carries the SRv6 Endpoint Behavior code point for this SRv6 SID as defined in section 9.2 of [RFC8986]. When set with the value 0xFFFF (i.e., Opaque), the choice of SRv6 Endpoint Behavior is left to the headend.

Reserved: 2 octets of reserved bits. This field MUST be set to zero on transmission and MUST be ignored on receipt.

Locator Block Length: 1 octet. SRv6 SID Locator Block length in bits.

Locator Node Length: 1 octet. SRv6 SID Locator Node length in bits.

Function Length: 1 octet. SRv6 SID Function length in bits.

Argument Length: 1 octet. SRv6 SID Arguments length in bits.

How is this different from the "SRv6 SID Structure Sub-Sub-TLV" in RFC9252? Why not reuse that one?

2.4.5. Explicit NULL Label Policy Sub-TLV

To steer an unlabeled IP packet into an SR policy, it is necessary to create a label stack for that packet, and push one or more labels onto that stack.

Do you mean SR-mpls policy?
Perhaps remove ", and push one or more labels onto that stack"? Perhaps change "Explicit NULL Label Policy" to "Explicit NULL Label Behavior"? The word "policy" here gets tangled with "SR Policy".

4.2.1. Validation of an SR Policy NLRI

When a BGP speaker receives an SR Policy NLRI from a neighbor it MUST first perform validation based on the following rules in addition to the validation described in Section 5:

* The SR Policy NLRI MUST include a distinguisher, color, and endpoint field which implies that the length of the NLRI MUST be either 12 or 24 octets (depending on the address family of the endpoint).

* The SR Policy update MUST have either the NO_ADVERTISE community or at least one route target extended community in IPv4-address format or both. If a router supporting this specification receives an SR Policy update with no route target extended communities and no NO_ADVERTISE community, the update MUST be considered as malformed.

What about IPv6-address specific RT?

4.2.2. Eligibility for Local Use of an SR Policy NLRI

If one or more route targets are present and none matches the local BGP Identifier, then, while the SR Policy NLRI is valid, it is not usable on the receiver node.

Does the route target have to match the local BGP identifier? As long as the receiver is configured with a local RT that matches one of the advertised RTs, it should be fine, right? That is how VPN RT works and I suppose the same can be used here. When should the BGP update stop being propagated if RT is used? Never? Or should a matching RT be removed by each matching receiver and then the propagation stops when there is no RT left?

By default, a BGP node receiving an SR Policy NLRI SHOULD NOT remove route target extended community before propagation. An implementation MAY provide support for configuration to filter and/or remove route target extended community before propagation.

Isn't the above applicable to any AFI/SAFI? Why do we need to specify that?

5. Error Handling and Fault Management

A BGP Speaker MUST perform the following syntactic validation of the SR Policy NLRI to determine if it is malformed. This includes the validation of the length of each NLRI and the total length of the MP_REACH_NLRI and MP_UNREACH_NLRI attributes. It also includes the validation of the consistency of the NLRI length with the AFI and the endpoint address as specified in Section 2.1.

When the error determined allows for the router to skip the malformed NLRI(s) and continue the processing of the rest of the update message, then it MUST handle such malformed NLRIs as 'Treat-as-withdraw'. In other cases, where the error in the NLRI encoding results in the inability to process the BGP update message (e.g. length related encoding errors), then the router SHOULD handle such malformed NLRIs as 'AFI/SAFI disable' when other AFI/SAFI besides SR Policy are being advertised over the same session. Alternately, the router MUST perform 'session reset' when the session is only being used for SR Policy or when it 'AFI/SAFI disable' action is not possible.

Is the above generic BGP handling?

The validation of the TLVs/sub-TLVs introduced in this document and defined in their respective sub-sections of Section 2.4 MUST be performed to determine if they are malformed or invalid. The validation of the Tunnel Encapsulation Attribute itself and the other TLVs/sub-TLVs specified in Section 13 of [RFC9012] MUST be done as described in that document. In case of any error detected, either at the attribute or its TLV/sub-TLV level, the "treat-as-withdraw" strategy MUST be applied. This is because an SR Policy update without a valid Tunnel Encapsulation Attribute (comprising of all valid TLVs/sub-TLVs) is not usable.

The above says the validation of those in Section 2.4 may lead to "treat-as-withdraw" - I assume this is BGP handling. Does that not conflict with the following paragraph?

The validation of the individual fields of the TLVs/sub-TLVs defined in Section 2.4 are beyond the scope of BGP as they are handled by the SRPM as described in the individual TLV/sub-TLV sub-sections. A BGP implementation MUST NOT perform semantic verification of such fields nor consider the SR Policy update to be invalid or not usable based on such validation.

6. IANA Considerations

This document uses code point allocations from the following existing registries:

* Subsequent Address Family Identifiers (SAFI) Parameters registry

* BGP Tunnel Encapsulation Attribute Tunnel Types registry under the BGP Tunnel Encapsulation registry

* BGP Tunnel Encapsulation Attribute sub-TLVs registry under the BGP Tunnel Encapsulation registry

* Color Extended Community Flags registry under the BGP Tunnel Encapsulation registry

Do we need to mention the above for the already allocated code points? If yes, should we mention the value as well? Actually I see 6.1~6.4 below - so the above is not needed at all.

This document also requests the creation of the following new registries:

* SR Policy Segment List Sub-TLVs under the BGP Tunnel Encapsulation registry

* SR Policy Binding SID Flags under the BGP Tunnel Encapsulation registry

* SR Policy SRv6 Binding SID Flags under the BGP Tunnel Encapsulation registry

* SR Policy Segment Flags under the BGP Tunnel Encapsulation registry

* Color Extended Community Color-Only Types registry under the BGP Tunnel Encapsulation registry

Similarly, we probably don't need the above. Just a nit.