| < draft-ietf-sfc-nsh-ecn-support-08.txt | draft-ietf-sfc-nsh-ecn-support-09.txt > | |||
|---|---|---|---|---|
| skipping to change at page 1, line 13 ¶ | skipping to change at page 1, line 13 ¶ | |||
| INTERNET-DRAFT D. Eastlake | INTERNET-DRAFT D. Eastlake | |||
| Intended status: Proposed Standard Futurewei Technologies | Intended status: Proposed Standard Futurewei Technologies | |||
| B. Briscoe | B. Briscoe | |||
| Independent | Independent | |||
| Y. Li | Y. Li | |||
| Huawei Technologies | Huawei Technologies | |||
| A. Malis | A. Malis | |||
| Malis Consulting | Malis Consulting | |||
| X. Wei | X. Wei | |||
| Huawei Technologies | Huawei Technologies | |||
| Expires: April 20, 2022 October 21, 2021 | Expires: October 16, 2022 April 17, 2022 | |||
| Explicit Congestion Notification (ECN) and Congestion Feedback | Explicit Congestion Notification (ECN) and Congestion Feedback | |||
| Using the Network Service Header (NSH) and IPFIX | Using the Network Service Header (NSH) and IPFIX | |||
| <draft-ietf-sfc-nsh-ecn-support-08.txt> | <draft-ietf-sfc-nsh-ecn-support-09.txt> | |||
| Abstract | Abstract | |||
| Explicit congestion notification (ECN) allows a forwarding element to | Explicit congestion notification (ECN) allows a forwarding element to | |||
| notify downstream devices of the onset of congestion without having | notify downstream devices of the onset of congestion without having | |||
| to drop packets. Coupled with a means to feed information about | to drop packets. Coupled with a means to feed information about | |||
| congestion back to upstream nodes, this can improve network | congestion back to upstream nodes, this can improve network | |||
| efficiency through better congestion control, frequently without | efficiency through better congestion control, frequently without | |||
| packet drops. This document specifies ECN and congestion feedback | packet drops. This document specifies ECN and congestion feedback | |||
| support within a Service Function Chaining (SFC) domain through use | support within a Service Function Chaining (SFC) enabled domain | |||
| of the Network Service Header (NSH, RFC 8300) and IP Flow Information | through use of the Network Service Header (NSH, RFC 8300) and IP Flow | |||
| Export (IPFIX, RFC 7011). | Information Export (IPFIX, RFC 7011). | |||
| Status of This Memo | Status of This Memo | |||
| This Internet-Draft is submitted in full conformance with the | This Internet-Draft is submitted in full conformance with the | |||
| provisions of BCP 78 and BCP 79. | provisions of BCP 78 and BCP 79. | |||
| Distribution of this document is unlimited. Comments should be sent | Distribution of this document is unlimited. Comments should be sent | |||
| to the SFC Working Group mailing list <sfc@ietf.org> or to the | to the SFC Working Group mailing list <sfc@ietf.org> or to the | |||
| authors. | authors. | |||
| skipping to change at page 3, line 29 ¶ | skipping to change at page 3, line 29 ¶ | |||
| 3.2.2 At an SF/Proxy......................................15 | 3.2.2 At an SF/Proxy......................................15 | |||
| 3.2.3 At Other Forwarding Nodes...........................15 | 3.2.3 At Other Forwarding Nodes...........................15 | |||
| 3.3 At Exit/Egress........................................16 | 3.3 At Exit/Egress........................................16 | |||
| 3.4 Congestion Statistics and the Conservation of Packets.16 | 3.4 Congestion Statistics and the Conservation of Packets.16 | |||
| 4. Tunnel Congestion Feedback Support.....................18 | 4. Tunnel Congestion Feedback Support.....................18 | |||
| 4.1 Congestion Level Measurements.........................18 | 4.1 Congestion Level Measurements.........................18 | |||
| 4.2 Congestion Information Delivery.......................19 | 4.2 Congestion Information Delivery.......................19 | |||
| 4.3 IPFIX Extensions......................................21 | 4.3 IPFIX Extensions......................................21 | |||
| 4.3.1 nshServicePathID....................................21 | 4.3.1 nshServicePathID....................................21 | |||
| 4.3.2 tunnelEcnCeCeByteTotalCount.........................21 | 4.3.2 tunnelEcnCeCeByteTotalCount.........................22 | |||
| 4.3.3 tunnelEcnEctNectBytetTotalCount.....................22 | 4.3.3 tunnelEcnEctNectBytetTotalCount.....................22 | |||
| 4.3.4 tunnelEcnCeNectByteTotalCount.......................22 | 4.3.4 tunnelEcnCeNectByteTotalCount.......................22 | |||
| 4.3.5 tunnelEcnCeEctByteTotalCount........................23 | 4.3.5 tunnelEcnCeEctByteTotalCount........................23 | |||
| 4.3.6 tunnelEcnEctEctByteTotalCount.......................23 | 4.3.6 tunnelEcnEctEctByteTotalCount.......................23 | |||
| 4.3.7 tunnelEcnCEMarkedRatio..............................23 | 4.3.7 tunnelEcnCEMarkedRatio..............................24 | |||
| 5. Example of Use.........................................24 | 5. Example of Use.........................................25 | |||
| 6. IANA Considerations....................................27 | 6. IANA Considerations....................................28 | |||
| 6.1 SFC NSH Header ECN Bits...............................27 | 6.1 SFC NSH Header ECN Bits...............................28 | |||
| 6.2 IPFIX Information Element IDs.........................27 | 6.2 IPFIX Information Element IDs.........................28 | |||
| 7. Security Considerations................................29 | 7. Security Considerations................................30 | |||
| 8. Acknowledgements.......................................29 | 8. Acknowledgements.......................................30 | |||
| Normative References......................................30 | Normative References......................................31 | |||
| Informative References....................................31 | Informative References....................................32 | |||
| Authors' Addresses........................................32 | Authors' Addresses........................................33 | |||
| 1. Introduction | 1. Introduction | |||
| Explicit Congestion Notification (ECN [RFC3168]) allows a forwarding | Explicit Congestion Notification (ECN [RFC3168]) allows a forwarding | |||
| element to notify downstream devices of the onset of congestion | element to notify downstream nodes of the onset of congestion without | |||
| without having to drop packets. Coupled with a means to feed | having to drop packets. Coupled with a means to feed information | |||
| information about congestion back to upstream nodes, this can improve | about congestion back to upstream nodes, this can improve network | |||
| network efficiency through better congestion control, frequently | efficiency through better congestion control, frequently without | |||
| without packet drops. This document specifies ECN and congestion | packet drops. This document specifies ECN and congestion feedback | |||
| feedback support within a Service Function Chaining (SFC [RFC7665]) | support within a Service Function Chaining (SFC [RFC7665]) enabled | |||
| domain through use of the Network Service Header (NSH [RFC8300]) and | domain through use of the Network Service Header (NSH [RFC8300]) and | |||
| IP Flow Information Export (IPFIX [RFC7011]). | IP Flow Information Export (IPFIX [RFC7011]). | |||
| It requires that all ingress and egress nodes of the SFC domain | This document requires that all ingress and egress nodes of the SFC | |||
| implement ECN. While congestion management will be the most effective | domain implement ECN. While congestion management will be the most | |||
| if all interior nodes of the SFC domain implement ECN, some benefit | effective if all interior nodes of the SFC enabled domain implement | |||
| is obtained even if some interior nodes do not implement ECN. | ECN, some benefit is obtained even if some interior nodes do not | |||
| Congestion at any interior bottleneck where ECN marking is not | implement ECN. Congestion at any interior bottleneck where ECN | |||
| implemented will be unmanaged. | marking is not implemented will be unmanaged. | |||
| The subsections below in this section provide background information | The following subsections provide background information on NSH, ECN, | |||
| on NSH, ECN, congestion feedback, and terminology used in this | congestion feedback, and terminology used in this document. | |||
| document. | ||||
| 1.1 NSH Background | 1.1 NSH Background | |||
| The Service Function Chaining (SFC [RFC7665]) architecture calls for | The Service Function Chaining (SFC [RFC7665]) architecture calls for | |||
| the encapsulation of traffic within a service function chaining | the encapsulation of traffic within a service function chaining | |||
| domain with a Network Service Header (NSH [RFC8300]) added by the | domain with a Network Service Header (NSH [RFC8300]) added by the | |||
| "Classifier" (ingress node) on entry to the domain and the NSH being | "Classifier" (ingress node) on entry to the domain and the NSH being | |||
| removed on exit from the domain at the egress node. The NSH is used | removed on exit from the domain at the egress node. The NSH is used | |||
| to control the path of a packet in an SFC domain. The NSH is a | to control the path of a packet in an SFC domain. The NSH is a | |||
| natural place, in a domain where traffic is NSH encapsulated, to note | reasonable place, in a domain where traffic is NSH encapsulated, to | |||
| congestion, avoiding possible confusion due, for example, to changes | accumulate congestion information. | |||
| in the outer transport header in different parts of the domain. | ||||
| | | | | |||
| v | v | |||
| +----------+ | +----------+ | |||
| . .|Classifier|. . . . . . . . . . . . . . | . .|Classifier|. . . . . . . . . . . . . . | |||
| . +----------+ . | . +----------+ . | |||
| . | +----+ . | . | +----+ . | |||
| . | --+ SF | Service . | . | --+ SF | Service . | |||
| . | / +----+ Function . | . | / +----+ Function . | |||
| . v --- Chaining . | . v --- Chaining . | |||
| skipping to change at page 5, line 36 ¶ | skipping to change at page 5, line 36 ¶ | |||
| . | / +----+ . | . | / +----+ . | |||
| . v --- . | . v --- . | |||
| . +-----+/ +----+ . | . +-----+/ +----+ . | |||
| . | SFF |--------+ SF | . | . | SFF |--------+ SF | . | |||
| . +-----+\ +----+ . | . +-----+\ +----+ . | |||
| . | --- . | . | --- . | |||
| . | \ +----+ . | . | \ +----+ . | |||
| . | --+ SF | . | . | --+ SF | . | |||
| . v +----+ . | . v +----+ . | |||
| . +------+ . | . +------+ . | |||
| . . .| Exit |. . . . . . . . . . . . . . . | . . .|Egress|. . . . . . . . . . . . . . . | |||
| +------+ | +------+ | |||
| | | | | |||
| v | v | |||
| Figure 1. Example SFC Path Forwarding Nodes | Figure 1. Example SFC Forwarding Nodes Path | |||
| Figure 1 shows an SFC domain for the purpose of illustrating the use | Figure 1 shows an SFC enabled domain for the purpose of illustrating | |||
| of the NSH. Traffic passes through a sequence of Service Function | the use of the NSH. Traffic passes through a sequence of Service | |||
| Forwarders (SFFs) each of which sends the traffic to one or more | Function Forwarders (SFFs) each of which sends the traffic to one or | |||
| Service Functions (SFs). Each SF performs some operation on the | more Service Functions (SFs). Each SF performs some operation on the | |||
| traffic, for example firewall or Network Address Translation (NAT) or | traffic, for example firewalling or Network Address Translation (NAT) | |||
| load balancer, and then returns it to the SFF from which it was | or load balancing, and then returns the traffic to the SFF from which | |||
| received. | it was received. | |||
| Logically, during the transit of each SFF, the outer transport header | Logically, during the transit of each SFF, the outer transport header | |||
| that got the packet to the SFF is stripped (see Figure 3), the SFF | that got the packet to the SFF is stripped (see Figure 3), the SFF | |||
| decides on the next forwarding step, either adding a new transport | decides on the next forwarding step, either adding a new outer | |||
| header or, if the SFF is the exit/egress, removing the NSH header. | transport header or, if the SFF is the exit/egress, removing the NSH | |||
| The transport headers added may be different in different regions of | header. The outer transport headers added may be different in | |||
| the SFC domain. For example, IP could be used for some SFF-to-SFF | different regions of the SFC enabled domain. For example, IP could be | |||
| communication and MPLS used for other such communication. | used for some SFF-to-SFF communication and MPLS used for other SFF- | |||
| to-SFF communication. | ||||
| 1.2 ECN Background | 1.2 ECN Background | |||
| Explicit congestion notification (ECN [RFC3168]) allows a forwarding | Explicit Congestion Notification (ECN [RFC3168]) allows a forwarding | |||
| element (such as a router or a Service Function Forwarder (SFF) or | element (such as a router or a Service Function Forwarder (SFF) or | |||
| Service Function (SF)) to notify downstream devices of the onset of | Service Function (SF)) to notify downstream nodes of the onset of | |||
| congestion without having to drop packets. This can be used as an | congestion without having to drop packets. This can be used as an | |||
| element in active queue management (AQM) [RFC7567] to improve network | element in active queue management (AQM) [RFC7567] to improve network | |||
| efficiency through better traffic control without packet drops. The | efficiency through better traffic control without packet drops. The | |||
| forwarding element can explicitly mark some packets in an ECN field | forwarding element can explicitly mark some packets in an ECN field | |||
| instead of dropping the packet. For example, a two-bit field is | instead of dropping the packet. For example, a two-bit field is | |||
| available for ECN marking in IP headers [RFC3168]. | available for ECN marking in IP headers [RFC3168]. | |||
| 1.3 Tunnel Congestion Feedback Background | 1.3 Tunnel Congestion Feedback Background | |||
| Tunnels are widely deployed in various networks including data center | Tunnels are widely deployed in various networks including data center | |||
| networks, enterprise networks, and the public Internet. A tunnel | networks, enterprise networks, and the public Internet. A tunnel | |||
| consists of ingress, egress, and a set of intermediate nodes | consists of ingress, egress, and a set of intermediate nodes | |||
| including routers. Tunnel Congestion Feedback (Section 4) is a | including routers. Tunnel Congestion Feedback (Section 4) is a | |||
| building block for congestion mitigation methods. It supports | building block for congestion mitigation methods. It supports | |||
| feedback of congestion information from an egress node to an ingress | feedback of congestion information from an egress node to an ingress | |||
| node. This document treats the SFC domain as a tunnel with the | node. This document treats the SFC enabled domain as a tunnel with | |||
| initial Classifier node being the ingress; however, the Tunnel | the initial Classifier node being the ingress; however, the tunnel | |||
| Congestion Feedback facilities specified in this document MAY be used | congestion feedback facilities specified in this document MAY be used | |||
| in other contexts besides SFC domains. | in contexts other than SFC. | |||
| Any action by a tunnel ingress to reduce congestion needs to allow | Any action by a tunnel ingress to reduce congestion needs to allow | |||
| sufficient time for the end-to-end congestion control loop to respond | sufficient time for the end-to-end congestion control loop to respond | |||
| first; otherwise the system could become unstable. For instance, the | first; otherwise the system could become unstable. For instance, the | |||
| ingress can take a smoothed average of the level of congestion signaled | ingress can take a smoothed average of the level of congestion signaled | |||
| by feedback from the tunnel egress, or delay any action for at | by feedback from the tunnel egress, or delay any action for at | |||
| least the worst case end-to-end round trip time (for example 200 | least the worst case end-to-end round-trip time (for example, 200 | |||
| milliseconds). | milliseconds). | |||
| Examples of actions that can be taken by an ingress node when it has | Examples of actions that can be taken by an ingress node when it has | |||
| knowledge of downstream congestion include those listed below. | knowledge of downstream congestion include those listed below. | |||
| Details of implementing these traffic control methods, beyond those | Details of implementing these traffic control methods, beyond those | |||
| given here, are outside the scope of this document. | given here, are outside the scope of this document. | |||
| (1) Traffic throttling (policing), where the downstream traffic | (1) Traffic throttling (policing), where the downstream traffic | |||
| flowing out of the ingress node is limited to reduce or eliminate | flowing out of the ingress node is limited to reduce or eliminate | |||
| congestion. | congestion. | |||
| (2) Upstream congestion feedback, where the ingress node sends | (2) Upstream congestion feedback, where the ingress node sends | |||
| messages upstream to or towards the ultimate traffic source, a | messages upstream to or towards the ultimate traffic source, a | |||
| function that can throttle traffic generation/transmission. | function that can throttle traffic generation/transmission. | |||
| (3) Traffic re-direction, where the ingress node configures the NSH | (3) Traffic re-direction, where the ingress node configures the NSH | |||
| of some future traffic so that it avoids congested paths. Great | of some future traffic so that it avoids congested paths. Great | |||
| care must be taken with this option to avoid (a) significant re- | care must be taken with this option to avoid (a) significant re- | |||
| ordering of traffic in flows that it is desirable to keep in | ordering of traffic in flows that it is desirable to keep in | |||
| order and (b) oscillation/instability in traffic paths due to | order (due to end-to-end requirements or due to a stateful SF) | |||
| alternate congestion of previously idle paths and the idling of | and (b) oscillation/instability in traffic paths due to alternate | |||
| previously congested paths. For example, it is preferable to | congestion of previously idle paths and the idling of previously | |||
| classify traffic into flows of a sufficiently coarse granularity | congested paths. For example, it is preferable to classify | |||
| that the flows are long lived and then use a stable path per | traffic into flows of a sufficiently coarse granularity that the | |||
| flow, sending only newly appearing flows on apparently | flows are long lived and to use a stable path per flow, sending | |||
| uncongested paths. | only newly appearing flows on apparently uncongested paths. | |||
| Figure 2 shows an example path from an original sender to a final | Figure 2 shows an example path from an original sender to a final | |||
| receiver passing through a chain of service functions between the | receiver passing through a chain of service functions between the | |||
| ingress and egress of an SFC domain. The path is also likely to pass | ingress and egress of an SFC enabled domain. The path is also likely | |||
| through other network nodes outside the SFC domain (not shown) before | to pass through other network nodes outside the SFC enabled domain | |||
| entering the SFC domain and after leaving the SFC domain. | (not shown) before entering that domain and after leaving that | |||
| domain. | ||||
| The figure shows typical congestion feedback that would be expected | Figure 2 shows typical congestion feedback that would be expected | |||
| from the final receiver to the origin sender, which controls the load | from the final receiver to the origin sender, which controls the load | |||
| the origin sender directs to all elements on the path. The figure | the origin sender directs to all elements on the path. The figure | |||
| also shows the congestion feedback from the egress to the ingress of | also shows the congestion feedback from the egress to the ingress of | |||
| the SFC domain that is described in this document, to control or | the SFC enabled domain that is described in this document, to control | |||
| balance load within the SFC domain. | or balance load within that domain. | |||
| .:= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = :. | .:= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = :. | |||
| _||_ End-to-End Congestion Feedback || | _||_ End-to-End Congestion Feedback || | |||
| \ / || | \ / || | |||
| \/ || | \/ || | |||
| __ Inner Transport Header and Payload __ | __ Inner Transport Header and Payload __ | |||
| | | ->- - - - - - - - - - - - - - ->- - - - - -- - - - - - ->- | | | | | ->- - - - - - - - - - - - - - ->- - - - - -- - - - - - ->- | | | |||
| | | | | | | | | | | |||
| | | .:= = = = = = = = = = = = = = = = = = = = = =:. | | | | | .:= = = = = = = = = = = = = = = = = = = = = =:. | | | |||
| | | _||_ Tunnel Congestion Feedback || | | | | | _||_ Tunnel Congestion Feedback || | | | |||
| skipping to change at page 8, line 25 ¶ | skipping to change at page 8, line 25 ¶ | |||
| | | \/ || | | | | | \/ || | | | |||
| | | __ NSH __ | | | | | __ NSH __ | | | |||
| | | | |-------------------------->--------------| | | | | | | | |-------------------------->--------------| | | | | |||
| | |. . . | | ___ ___ ___ | |. . .| | | | |. . . | | ___ ___ ___ | |. . .| | | |||
| | | | | OT1 | | OT4 | | . . . | | OTn | | | | | | | | | OT1 | | OT4 | | . . . | | OTn | | | | | |||
| | | | |-->--|SFF|--->---|SFF| |SFF|-->--| | | | | | | | |-->--|SFF|--->---|SFF| |SFF|-->--| | | | | |||
| |__| |__| |___| |___| |___| |__| |__| | |__| |__| |___| |___| |___| |__| |__| | |||
| origin SFC | ^ | ^ SFC final | origin SFC | ^ | ^ SFC final | |||
| sender domain OT2| |OT3 OT6| |OT7 domain rcvr | sender domain OT2| |OT3 OT6| |OT7 domain rcvr | |||
| ingress v | v | egress | ingress v | v | egress | |||
| +---+ +---+ | +---+ +---+ SFF | |||
| |SF | |SF | | |SF | |SF | | |||
| +---+ +---+ | +---+ +---+ | |||
| Figure 2. Congestion Feedback across an SFC Domain | Figure 2. Congestion Feedback across an SFC enabled Domain | |||
| SFC Domain congestion feedback in Figure 2 is shown within the | SFC enabled Domain congestion feedback in Figure 2 is shown within | |||
| context of an end-to-end congestion feedback loop. Also shown is the | the context of an end-to-end congestion feedback loop. Also shown is | |||
| encapsulated layering of NSH headers within a series of outer | the encapsulated layering of NSH headers within a series of outer | |||
| transport headers (OT1, OT2, ... OTn). | transport headers (OT1, OT2, ... OTn). | |||
| 1.4 Conventions Used in This Document | 1.4 Conventions Used in This Document | |||
| The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | |||
| "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and | "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and | |||
| "OPTIONAL" in this document are to be interpreted as described in BCP | "OPTIONAL" in this document are to be interpreted as described in BCP | |||
| 14 [RFC2119] [RFC8174] when, and only when, they appear in all | 14 [RFC2119] [RFC8174] when, and only when, they appear in all | |||
| capitals, as shown here. | capitals, as shown here. | |||
| skipping to change at page 10, line 7 ¶ | skipping to change at page 10, line 7 ¶ | |||
| SFF - Service Function Forwarder [RFC7665] - A type of node that | SFF - Service Function Forwarder [RFC7665] - A type of node that | |||
| forwards based on the NSH. | forwards based on the NSH. | |||
| TLV - Type Length Value | TLV - Type Length Value | |||
| upstream - The direction from egress to ingress | upstream - The direction from egress to ingress | |||
| 2. The NSH ECN Field | 2. The NSH ECN Field | |||
| The NSH header is used to encapsulate traffic and control its | The NSH is used to encapsulate traffic and control its subsequent | |||
| subsequent path (see Section 2 of [RFC8300]). The NSH also provides | path (see Section 2 of [RFC8300]). The NSH also provides for optional | |||
| for optional metadata inclusion, as shown in Figure 3. | metadata inclusion, as shown in Figure 3. | |||
| +-----------------------------------+ | +-----------------------------------+ | |||
| | Outer Transport Header | | | Outer Transport Header | | |||
| +-----------------------------------+ | +-----------------------------------+ | |||
| | Network Service Header (NSH) | | | Network Service Header (NSH) | | |||
| | +------------------------------+ | | | +------------------------------+ | | |||
| | | Base Header | | | | | Base Header | | | |||
| | +------------------------------+ | | | +------------------------------+ | | |||
| | | Service Path Header | | | | | Service Path Header | | | |||
| | +------------------------------+ | | | +------------------------------+ | | |||
| | | Metadata (Context Header(s)) | | | | | Metadata (Context Header(s)) | | | |||
| | +------------------------------+ | | | +------------------------------+ | | |||
| +-----------------------------------+ | +-----------------------------------+ | |||
| | Original Packet / Frame / Payload | | | Original Packet / Frame / Payload | | |||
| +-----------------------------------+ | +-----------------------------------+ | |||
| Figure 3. Data Encapsulation with the NSH | Figure 3. Data Encapsulation with the NSH | |||
| Two currently unused bits (indicated by "U") in the NSH Base Header | This document assigns two currently unused bits (indicated by "U") in | |||
| (Section 2.2 of [RFC8300]) are allocated for ECN indication as shown | the NSH Base Header (Section 2.2 of [RFC8300]) for the purpose of ECN | |||
| in Figure 4. | indication as shown in Figure 4. | |||
| 0 1 2 3 | 0 1 2 3 | |||
| 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| |Ver|O|U| TTL | Length |U|U|U|U|MD Type| Next Protocol | | |Ver|O|U| TTL | Length |U|U|U|U|MD Type| Next Protocol | | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| ^ ^ | ^ ^ | |||
| | | | | | | |||
| +-------+ | +-------+ | |||
| |NSH ECN| | |NSH ECN| | |||
| | field | | | field | | |||
| +-------+ | +-------+ | |||
| Figure 4. NSH Base Header | Figure 4. Updated NSH Base Header | |||
| RFC Editor NOTE: The above figure should be adjusted based on the | RFC Editor NOTE: The above figure should be adjusted based on the | |||
| bits assigned by IANA (see Section 6) and this note deleted. | bits actually assigned by IANA (see Section 6) and this note deleted. | |||
| Table 1 shows the meaning of the code points in the NSH ECN field. | Table 1 shows the meaning of the code points in the NSH ECN field. | |||
| These have the same meaning as the ECN field code points in the IPv4 | These have the same meaning as the ECN field code points in the IPv4 | |||
| or IPv6 header as defined in [RFC3168]. | or IPv6 header as defined in Section 23.1 of [RFC3168]. | |||
| Binary Name Meaning | Binary Name Meaning | |||
| ------ ------- -------------------------------- | ------ ------- -------------------------------- | |||
| 00 Not-ECT Not ECN-Capable Transport | 00 Not-ECT Not ECN-Capable Transport | |||
| 01 ECT(1) ECN-Capable Transport | 01 ECT(1) ECN-Capable Transport | |||
| 10 ECT(0) ECN-Capable Transport | 10 ECT(0) ECN-Capable Transport | |||
| 11 CE Congestion Experienced | 11 CE Congestion Experienced | |||
| Table 1. ECN Field Code Points | Table 1. ECN Field Code Points | |||
| 3. ECN Support in the NSH | 3. ECN Support in the NSH | |||
| This section describes the required behavior to support ECN using the | This section describes the required behavior to support ECN using the | |||
| NSH. There are two aspects to ECN support: | NSH. There are two aspects to ECN support: | |||
| 1. ECN propagation during encapsulation or decapsulation | 1. ECN propagation during encapsulation or decapsulation; | |||
| 2. ECN marking during congestion at bottlenecks. | 2. ECN marking during congestion at bottlenecks. | |||
| While this section covers all combinations of ECN-aware and ECN- | While this section covers all combinations of ECN-aware and ECN- | |||
| unaware, it is expected that in most cases the NSH domain will be | unaware, it is expected that in most cases the NSH domain will be | |||
| uniform so that, if this document is applicable, all SFFs will | uniform so that, if this document is applicable, all SFFs will | |||
| support ECN; however, some legacy SFs might not support ECN. | support ECN; however, some SFs might not support ECN. | |||
| ECN Propagation: | ECN Propagation: | |||
| The specification of ECN tunneling [RFC6040] explains that an | The specification of ECN tunneling [RFC6040] explains that an | |||
| ingress must not propagate ECN support into an encapsulating | ingress must not propagate ECN support into an encapsulating | |||
| header unless the egress supports correct onward propagation of | header unless the egress supports correct onward propagation of | |||
| the ECN field during decapsulation. We define Compliant ECN | the ECN field during decapsulation. We define Compliant ECN | |||
| Decapsulation here as decapsulation compliant with either | Decapsulation here as decapsulation compliant with either | |||
| [RFC6040] or an earlier compatible equivalent ([RFC4301], or the | [RFC6040] or an earlier compatible equivalent ([RFC4301], or the | |||
| full functionality mode of [RFC3168]). | full functionality mode of [RFC3168]). | |||
| The procedures in Section 3.2.1 ensure that each ingress of the | The procedures in Section 3.2.1 ensure that each ingress of the | |||
| large number of possible transport links within the SFC domain | transport links within the SFC enabled domain does not propagate | |||
| does not propagate ECN support into the encapsulating outer | ECN support into the encapsulating outer transport header unless | |||
| transport header unless the corresponding egress of that link | the corresponding egress of that link supports Compliant ECN | |||
| supports Compliant ECN Decapsulation. | Decapsulation. | |||
| Section 3.3 requires that all the egress nodes of the SFC domain | Section 3.3 requires that all the egress nodes of the SFC enabled | |||
| support Compliant ECN Decapsulation in conjunction with tunnel | domain support Compliant ECN Decapsulation in conjunction with | |||
| congestion feedback, otherwise the scheme in this document will | tunnel congestion feedback, otherwise the scheme in this document | |||
| not work. | will not work. | |||
| ECN Marking: | ECN Marking: | |||
| At transit nodes the marking behavior specified in Section 3.2.1 | At transit nodes the marking behavior specified in Section 3.2.1 | |||
| is recommended and if not implemented at such transit nodes, there | is recommended and if not implemented at such transit nodes, there | |||
| may be unmanaged congestion. | may be unmanaged congestion. | |||
| Detection of congestion will be most effective if ECN marking is | Detection of congestion will be most effective if ECN marking is | |||
| supported by all potential bottlenecks inside the domain in which | supported by all potential bottlenecks inside the domain in which | |||
| NSH is being used to route traffic as well as at the ingress and | NSH is being used to route traffic as well as at the ingress and | |||
| egress. Nodes that do not support ECN marking, or that support | egress. Nodes that do not support ECN marking, or that support | |||
| AQM but not ECN, will naturally use drop to relieve congestion. | AQM but not ECN, will naturally use drop to relieve congestion. | |||
| The gap in the end-to-end packet sequence will be detected as | The gap in the end-to-end packet sequence will be detected as | |||
| congestion by the final receiving endpoint, but not by the NSH | congestion by the final receiving endpoint, but not by the NSH | |||
| egress (see Figure 2). | egress (see Figure 2). | |||
| 3.1 At The Ingress | 3.1 At The Ingress | |||
| When the ingress/Classifier encapsulates an incoming IP packet with | When the ingress/Classifier encapsulates an incoming packet with an | |||
| an NSH, it MUST set the NSH ECN field using the "Normal mode" | NSH, it MUST set the NSH ECN field using the "Normal mode" specified | |||
| specified in [RFC6040] (i.e., copied from the incoming IP header). | in [RFC6040] (e.g., copied from the incoming IP header). | |||
| Then, if the resulting NSH ECN field is Not-ECT, the ingress SHOULD | Then, if the resulting NSH ECN field is Not-ECT, the ingress SHOULD | |||
| set it to ECT(0). This indicates that, even though the end-to-end | set it to ECT(0). This indicates that, even though the end-to-end | |||
| transport is not ECN-capable, the egress and ingress of the SFC | transport is not ECN-capable, the egress and ingress of the SFC | |||
| domain are acting as an ECN-capable transport. This approach will | enabled domain are acting as an ECN-capable transport. This approach | |||
| inherently support all known variants of ECN, including the | supports all known variants of ECN, including the experimental L4S | |||
| experimental L4S capability [RFC8311] [ecnL4S]. | capability [RFC8311] [ecnL4S]. | |||
| Packets arriving at the ingress might not use IP. If the protocol of | Packets arriving at the ingress might not use IP. If the protocol of | |||
| arriving packets supports an ECN field similar to IP, the procedures | arriving packets supports an ECN field similar to IP, for example | |||
| for IP packets can be used. If arriving packets do not support an ECN | MPLS [RFC5129], the procedures for IP packets can be used. If | |||
| field similar to IP, they MUST be treated as if they are Not-ECT IP | arriving packets do not support an ECN field similar to IP, they MUST | |||
| packets. | be treated as if they are Not-ECT IP packets. | |||
| Then, as the NSH encapsulated packet is further encapsulated with a | Then, as the NSH encapsulated packet is further encapsulated with a | |||
| transport header, if ECN marking is available for that transport (as | transport header, if ECN marking is available for that transport (as | |||
| it is for IP [RFC3168] and MPLS [RFC5129]), the ECN field of the | it is for IP [RFC3168] and MPLS [RFC5129]), the ECN field of the | |||
| transport header MUST be set using the "Normal mode" specified in | transport header MUST be set using the "Normal mode" specified in | |||
| [RFC6040] (i.e., copied from the NSH ECN field). | [RFC6040] (i.e., copied from the NSH ECN field). | |||
| A summary of these normative steps is given in Table 2. | A summary of these normative steps is given in Table 2. | |||
| +-----------------+---------------+ | +-----------------+---------------+ | |||
| skipping to change at page 13, line 44 ¶ | skipping to change at page 13, line 44 ¶ | |||
| | (also equal to | and Outer | | | (also equal to | and Outer | | |||
| | departing Inner | Headers | | | departing Inner | Headers | | |||
| | Header) | | | | Header) | | | |||
| +-----------------+---------------+ | +-----------------+---------------+ | |||
| | Not-ECT | ECT(0) | | | Not-ECT | ECT(0) | | |||
| | ECT(0) | ECT(0) | | | ECT(0) | ECT(0) | | |||
| | ECT(1) | ECT(1) | | | ECT(1) | ECT(1) | | |||
| | CE | CE | | | CE | CE | | |||
| +-----------------+---------------+ | +-----------------+---------------+ | |||
| Table 2. Setting of ECN fields by an ingress/Classifier | Table 2. Setting of ECN fields by an Ingress/Classifier | |||
| The requirements in this section apply to all ingress nodes for the | The requirements in this section apply to all ingress nodes for the | |||
| domain in which NSH is being used to route traffic. | domain in which an NSH is being used to steer traffic. | |||
| 3.2 At Transit Nodes | 3.2 At Transit Nodes | |||
| This section described behavior at nodes that forward based on the | This section describes the behavior at nodes that forward based on | |||
| NSH such as SFF and other forwarding nodes such as IP routers. Figure | the NSH such as SFF and other forwarding nodes such as IP routers. | |||
| 5 shows a packet on the wire between forwarding nodes. | Figure 5 shows a packet on the wire between forwarding nodes. | |||
| +-----------------+ | +-----------------+ | |||
| | Outer Header | | | Outer Header | | |||
| +-----------------+ | +-----------------+ | |||
| | NSH | | | NSH | | |||
| +-----------------+ | +-----------------+ | |||
| | Inner Header | | | Inner Header | | |||
| +-----------------+ | +-----------------+ | |||
| | Payload | | | Payload | | |||
| +-----------------+ | +-----------------+ | |||
| skipping to change at page 15, line 8 ¶ | skipping to change at page 15, line 8 ¶ | |||
| supports Compliant ECN Decapsulation (see Section 3). If it does, | supports Compliant ECN Decapsulation (see Section 3). If it does, | |||
| then the encapsulating node propagates the NSH ECN field to this | then the encapsulating node propagates the NSH ECN field to this | |||
| outer encapsulation using the "Normal Mode" of ECN encapsulation | outer encapsulation using the "Normal Mode" of ECN encapsulation | |||
| [RFC6040] (the ECN field is copied). If it does not, then the | [RFC6040] (the ECN field is copied). If it does not, then the | |||
| encapsulating node MUST clear ECN in the outer encapsulation to non- | encapsulating node MUST clear ECN in the outer encapsulation to non- | |||
| ECT (the "Compatibility Mode" of [RFC6040]). | ECT (the "Compatibility Mode" of [RFC6040]). | |||
| 3.2.2 At an SF/Proxy | 3.2.2 At an SF/Proxy | |||
| If the SF is NSH and ECN-aware, the processing is essentially the | If the SF is NSH and ECN-aware, the processing is essentially the | |||
| same at the SF as at an SFF as discussed in Section 3.2.1. | same at the SF as at an SFF as discussed in Section 3.2.1 (except in | |||
| the case where the SF terminates the packet's path). | ||||
| If the SF is NSH-aware but ECN-unaware, then the SFF transmitting the | If the SF is NSH-aware but ECN-unaware, then the SFF transmitting the | |||
| packet to the SF will use Compatibility Mode. Congestion encountered | packet to the SF will use Compatibility Mode. Congestion encountered | |||
| in the SFF to SF and SF to SFF paths will be unmanaged. | in the SFF to SF and SF to SFF paths will be unmanaged. | |||
| If the SF is not NSH-aware, then an NSH proxy will be between the SFF | If the SF is not NSH-aware, then an NSH proxy will be between the SFF | |||
| and the SF to avoid exposure of the SF that does not understand NSHs | and the SF to avoid exposure of the SF that does not understand NSHs | |||
| to the NSH as shown in Figure 6. This is described in Section 4.6 of | to the NSH as shown in Figure 6. This is described in Section 4.6 of | |||
| [RFC7665]. The SF and proxy together look to the SFF like an NSH- | [RFC7665]. The SF and proxy together look to the SFF like an NSH- | |||
| aware SF. The behavior at the proxy and SF in this case is as below: | aware SF. The behavior at the proxy and SF in this case is as below: | |||
| skipping to change at page 16, line 9 ¶ | skipping to change at page 16, line 10 ¶ | |||
| 3.2.3 At Other Forwarding Nodes | 3.2.3 At Other Forwarding Nodes | |||
| Other forwarding nodes, that is non-NSH forwarding nodes between NSH | Other forwarding nodes, that is non-NSH forwarding nodes between NSH | |||
| forwarding nodes, such as IP or label switched routers, might also | forwarding nodes, such as IP or label switched routers, might also | |||
| contain potential bottlenecks. If so, they SHOULD implement an AQM | contain potential bottlenecks. If so, they SHOULD implement an AQM | |||
| algorithm to update the ECN marking in the outer transport header as | algorithm to update the ECN marking in the outer transport header as | |||
| specified in [RFC3168]. | specified in [RFC3168]. | |||
| 3.3 At Exit/Egress | 3.3 At Exit/Egress | |||
| At the SFC domain egress node, first any actions are taken based on | At the SFC enabled domain egress node, first any actions are taken | |||
| Congestion Experienced or other values of ECN marking, such as | based on Congestion Experienced or other values of ECN marking, such | |||
| accumulating statistics to send back to the ingress (see Section 4) | as accumulating statistics to send back to the ingress (see Section | |||
| or for other uses. If the packet being carried inside the NSH is IP, | 4) or for other uses. If the packet being carried inside the NSH is | |||
| when the NSH is removed the NSH ECN field MUST be combined with the | IP, when the NSH is removed the NSH ECN field MUST be combined with | |||
| IP ECN field as specified in Table 3 that was extracted from | the IP ECN field as specified in Table 3 that was extracted from | |||
| [RFC6040]. This requirement applies to all egress nodes for the | Section 3.2 of [RFC6040]. This requirement applies to all egress | |||
| domain in which NSH is being used to route traffic. | nodes for the domain in which NSH is being used to route traffic. | |||
| +---------+---------------------------------------------+ | +---------+---------------------------------------------+ | |||
| |Arriving | Arriving Outer Header | | |Arriving | Arriving Outer Header | | |||
| | Inner +---------+-----------+-----------+-----------+ | | Inner +---------+-----------+-----------+-----------+ | |||
| | Header | Not-ECT | ECT(0) | ECT(1) | CE | | | Header | Not-ECT | ECT(0) | ECT(1) | CE | | |||
| +---------+---------+-----------+-----------+-----------+ | +---------+---------+-----------+-----------+-----------+ | |||
| | Not-ECT | Not-ECT | Not-ECT | Not-ECT | <drop> | | | Not-ECT | Not-ECT | Not-ECT | Not-ECT | <drop> | | |||
| | ECT(0) | ECT(0) | ECT(0) | ECT(0) | CE | | | ECT(0) | ECT(0) | ECT(0) | ECT(0) | CE | | |||
| | ECT(1) | ECT(1) | ECT(1) | ECT(1) | CE | | | ECT(1) | ECT(1) | ECT(1) | ECT(1) | CE | | |||
| | CE | CE | CE | CE | CE | | | CE | CE | CE | CE | CE | | |||
| +---------+---------+-----------+-----------+-----------+ | +---------+---------+-----------+-----------+-----------+ | |||
| Table 3. Exit ECN Fields Merger | Table 3. Exit ECN Fields Merger (Source [RFC6040]) | |||
| All the egress nodes of the SFC domain MUST support Compliant ECN | All the egress nodes of the SFC enabled domain MUST support Compliant | |||
| Decapsulation as specified in this section. If this is not the case, | ECN Decapsulation as specified in this section. If this is not the | |||
| the scheme described in this document will not work, and cannot be | case, the scheme described in this document will not work, and cannot | |||
| used. | be used. | |||
| 3.4 Congestion Statistics and the Conservation of Packets | 3.4 Congestion Statistics and the Conservation of Packets | |||
| The SFC specification permits an SF to absorb packets and to generate | The SFC specification permits an SF to absorb packets and to generate | |||
| new packets as well as simply processing and forwarding the packets | new packets as well as simply processing and returning the packets it | |||
| it receives. Such actions might appear to be packet loss due to | receives to an SFF. Such actions might appear to be packet loss due | |||
| congestion or might mask the loss of packets by generating additional | to congestion or might mask the loss of packets by generating | |||
| packets. | additional packets. | |||
| The tunnel congestion feedback approach (Section 4) can detect | The tunnel congestion feedback approach (Section 4) can detect | |||
| congestion in several ways. One way detects traffic loss by counting | congestion in several ways. One way detects traffic loss by counting | |||
| payload packets and bytes in at the ingress and counting them out at | payload packets and bytes in at the ingress and counting them out at | |||
| the egress. This does not work unless nodes conserve the number of | the egress. This does not work unless nodes conserve the number of | |||
| payload packets and/or bytes. Therefore, it will not be possible to | payload packets and/or bytes. Therefore, it will not be possible to | |||
| accurately detect packet loss using this technique if traffic volume | accurately detect packet loss using this technique if traffic volume, | |||
| is not conserved by the service function chain processing that | as measured by the metric in use (packets or bytes), is not | |||
| traffic. | conserved by the service function chain processing that traffic. | |||
| Nonetheless, if a bottleneck supports ECN marking, it will be | Nonetheless, if a bottleneck supports ECN marking, it will be | |||
| possible to detect the high level of CE markings that are associated | possible to detect the high level of CE markings that are associated | |||
| with congestion at that bottleneck by looking at the ratio of CE- | with congestion at that bottleneck by looking at the ratio of CE- | |||
| marked to non-CE-marked packets. However, it will not be possible for | marked to non-CE-marked packets. However, it will not be possible to | |||
| the tunnel congestion feedback approach to detect any congestion, | detect any congestion based on ECN marking, whether slight or severe, | |||
| whether slight or severe, if it occurs at a bottleneck that does not | if it occurs at a bottleneck that does not support ECN marking. | |||
| support ECN marking. | ||||
| 4. Tunnel Congestion Feedback Support | 4. Tunnel Congestion Feedback Support | |||
| The collection and storage of congestion information at the egress | The collection and storage of congestion information at the egress | |||
| may be useful for later analysis but, unless it can be fed back to a | may be useful for later analysis and MAY be used without the feedback | |||
| point which can take action to reduce congestion, it will not be | mechanisms specified in this Section. However, if congestion | |||
| useful in real time. Such congestion feedback to the ingress enables | information is not fed back to a point which can take action to | |||
| it to take actions such as those listed in Section 1.3. | reduce congestion, it will not be useful in real time. Such | |||
| congestion feedback to the ingress enables it to take actions such as | ||||
| those listed in Section 1.3. | ||||
| IP Flow Information Export (IPFIX [RFC7011]) provides a standard for | IP Flow Information Export (IPFIX [RFC7011]) provides a standard for | |||
| communicating traffic flow statistics. As extended by this document, | communicating traffic flow statistics. As extended by this document, | |||
| IPFIX messages from the egress to the ingress are used to communicate | IPFIX messages from the egress to the ingress are used to communicate | |||
| the extent of congestion between an ingress and egress based on ECN | the extent of congestion between an ingress and egress based on ECN | |||
| marking in the NSH. | marking in the NSH. The ingress MUST be configured to know the | |||
| relevant egress for a flow. The egress MUST be able to identify the | ||||
| relevant ingress for a packet based on the SPI, the Ingress Network | ||||
| Node Information Context Header [NSHTLV], or the like. | ||||
| 4.1 Congestion Level Measurements | 4.1 Congestion Level Measurements | |||
| The congestion level measurements are based on ECN marking in the NSH | The congestion level measurements are based on ECN marking in the NSH | |||
| and packet drop. In particular congestion information includes at | and packet drop. In particular congestion information includes at | |||
| least one of cumulative byte counts of packets with each type of | least one of cumulative byte counts of packets with each type of | |||
| outer/inner header ECN marking combination, the ratio of CE-marked | outer/inner header ECN marking combination, the ratio of CE-marked | |||
| packets to all packets, and the ratio of dropped packets to all | packets to all packets, and the ratio of dropped packets to all | |||
| packets. | packets. | |||
| skipping to change at page 18, line 43 ¶ | skipping to change at page 18, line 48 ¶ | |||
| ingress and total packets arriving at egress over the same span of | ingress and total packets arriving at egress over the same span of | |||
| packets. If packet loss is detected for a flow that would preserve | packets. If packet loss is detected for a flow that would preserve | |||
| the number of packets in the absence of congestion, then it can be | the number of packets in the absence of congestion, then it can be | |||
| assumed that severe congestion has occurred in the tunnel. | assumed that severe congestion has occurred in the tunnel. | |||
| The egress calculates the CE-marked packet ratio by counting packets | The egress calculates the CE-marked packet ratio by counting packets | |||
| with different ECN markings. The CE-marked packet ratio will be used | with different ECN markings. The CE-marked packet ratio will be used | |||
| as an indication of tunnel load level. It is assumed that nodes | as an indication of tunnel load level. It is assumed that nodes | |||
| between the ingress and egress will not drop packets biased towards | between the ingress and egress will not drop packets biased towards | |||
| certain ECN codepoints, so calculation of the CE-marked packet ratio is | certain ECN codepoints, so calculation of the CE-marked packet ratio is | |||
| not affect by packet drop. | not affected by packet drop. | |||
| The calculation of the fraction of packets droped is by comparing the | The calculation of the fraction of packets dropped is by comparing | |||
| traffic volumes between ingress and egress. | the traffic volumes between ingress and egress. | |||
| Faked ECN-Capable Transport (ECT) is used at the ingress to defer | Faked ECN-Capable Transport (ECT) is used at the ingress to defer | |||
| packet loss to the egress. The basic idea of faked ECT is that, when | packet loss to the egress. The basic idea of faked ECT is that, when | |||
| encapsulating packets, the ingress first marks the tunnel outer | encapsulating packets, the ingress first marks the tunnel outer | |||
| header (NSH for an SFC domain) according to [RFC6040], and then | header according to [RFC6040], and then remarks the outer header of | |||
| remarks the outer header of Not-ECT packets as ECT. (ECT(0) and | Not-ECT packets as ECT. (ECT(0) and ECT(1) are treated as the same.) | |||
| ECT(1) are treated as the same.) Thus, as transmitted by the ingress | In this case, the NSH is treated as the tunnel outer header because | |||
| it will be present for the entire SFC enabled domain transit while | ||||
| transport headers may change. Thus, as transmitted by the ingress | ||||
| node, there will be one of three combinations of outer header ECN | node, there will be one of three combinations of outer header ECN | |||
| field and inner header ECN field as follows: CE|CE, ECT|N-ECT, and | field and inner header ECN field as follows: CE|CE, ECT|N-ECT, and | |||
| ECT|ECT (in the format of outer-ECN|inner-ECN); when decapsulating | ECT|ECT (in the format of outer-ECN|inner-ECN); when decapsulating | |||
| packets at the egress, [RFC6040] defined decapsulation behavior is | packets at the egress, [RFC6040] defined decapsulation behavior is | |||
| used, and according to [RFC6040], the packets marked as CE|N-ECT will | used, and according to [RFC6040], the packets marked as CE|N-ECT will | |||
| be dropped. Faked-ECT is used to shift some drops to the egress in | be dropped. Faked-ECT is used to shift some drops to the egress in | |||
| order to allow the egress to calculate the CE-marked packet ratio | order to allow the egress to calculate the CE-marked packet ratio | |||
| more precisely. | more precisely. | |||
| The ingress encapsulates packets and marks their outer header | The ingress encapsulates packets and marks their outer header | |||
| skipping to change at page 19, line 42 ¶ | skipping to change at page 19, line 50 ¶ | |||
| the path between the ingress and the egress or at the granularity of | the path between the ingress and the egress or at the granularity of | |||
| individual customer's traffic or a specific set of flows to learn | individual customer's traffic or a specific set of flows to learn | |||
| about their congestion contribution. | about their congestion contribution. | |||
| For example, the tunnelEcnCEMarkedRatio field (specified below) | For example, the tunnelEcnCEMarkedRatio field (specified below) | |||
| indicates the fraction of traffic that has been marked in the ECN | indicates the fraction of traffic that has been marked in the ECN | |||
| field of the NSH as Congestion Experienced (CE). | field of the NSH as Congestion Experienced (CE). | |||
| 4.3 Congestion Information Delivery | 4.3 Congestion Information Delivery | |||
| As described above, the tunnel ingress sends a messages containing | As described above, the tunnel ingress sends a message containing | |||
| cumulative byte counts of packets of each type of ECN marking to the | cumulative byte counts of packets of each type of ECN marking to the | |||
| tunnel egress, and the tunnel egress feeds back messages to the | tunnel egress, and the tunnel egress feeds back messages to the | |||
| ingress with at least one of the following: cumulative byte counts of | ingress with at least one of the following: cumulative byte counts of | |||
| packets of each type of ECN combination, the ratio of CE-marked | packets of each type of ECN combination, the ratio of CE-marked | |||
| packets to all packets, and the ratio of dropped packets to all | packets to all packets, and the ratio of dropped packets to all | |||
| packets. This section specifies how the messages are conveyed. | packets. (It is possible for these messages to contribute to | |||
| congestion.) This section specifies how the messages are conveyed. | ||||
| IPFIX recommends, but does not require, use of SCTP [RFC4960] in | IPFIX recommends, but does not require, use of SCTP [RFC4960] in | |||
| partial reliability mode [RFC3758] for the transport of its messages. | partial reliability mode [RFC3758] for the transport of its messages. | |||
| This mode allows loss of some packets, which is tolerable because | This mode allows loss of some packets, which is tolerable because | |||
| IPFIX communicates cumulative statistics. IPFIX over SCTP over IP | IPFIX communicates cumulative statistics. IPFIX over SCTP over IP | |||
| SHOULD be used directly where there is IP connectivity between the | SHOULD be used directly where there is IP connectivity between the | |||
| ingress and egress; however, there might be different transport | ingress and egress; however, there might be different transport | |||
| protocols or address spaces used in different regions of an SFC | protocols or address spaces used in different regions of an SFC | |||
| domain that block such direct IP connectivity. The NSH provides the | enabled domain that block such direct IP connectivity. The NSH | |||
| general method of routing traffic within an SFC domain so the | provides the general method of routing traffic within an SFC enabled | |||
| encapsulation of the required IPFIX traffic in NSH MUST be | domain so the encapsulation of the required IPFIX traffic in NSH MUST | |||
| implemented and, when IP connectivity is not available, IPFIX over | be implemented and, when IP connectivity is not available, IPFIX over | |||
| NSH SHOULD be used along with configuration of appropriate SFC paths | NSH SHOULD be used along with configuration of appropriate SFC paths | |||
| for the IPFIX over NSH traffic. | for the IPFIX over NSH traffic. | |||
| IPFIX messages could travel along the same path as network data | IPFIX messages could travel along the same path as network data | |||
| traffic. In any case, an IPFIX message packet may get lost in case of | traffic. In any case, an IPFIX message packet may get lost in case of | |||
| network congestion. Even though the missing information could be | network congestion. Even though the missing information could be | |||
| recovered because of the use of cumulative counts, the message SHOULD | recovered because of the use of cumulative counts, the message SHOULD | |||
| be transmitted at a higher priority than users' traffic flows to | be transmitted at a higher priority than users' traffic flows to | |||
| improve the promptness of congestion information feedback. | improve the promptness of congestion information feedback. | |||
| skipping to change at page 20, line 49 ¶ | skipping to change at page 21, line 6 ¶ | |||
| The combination of congestion level measurement and congestion | The combination of congestion level measurement and congestion | |||
| information delivery procedures is as follows: | information delivery procedures is as follows: | |||
| o The ingress node determines the IPFIX template record to be used. | o The ingress node determines the IPFIX template record to be used. | |||
| The template record can be pre-configured or determined at | The template record can be pre-configured or determined at | |||
| runtime, the content of the template record will be determined | runtime, the content of the template record will be determined | |||
| according to the granularity of congestion management; if the | according to the granularity of congestion management; if the | |||
| ingress wants to limit congestion volume contributed by specific | ingress wants to limit congestion volume contributed by specific | |||
| traffic flows then the elements such as source IP address, | traffic flows then the elements such as source IP address, | |||
| destination IP address, flow ID and CE-marked packet volume of the | destination IP address, flow ID, and CE-marked packet volume of | |||
| flows, etc., will be included in the template record. | the flows, etc., will be included in the template record. | |||
| o Metering at the ingress measures traffic volume according to the | o Metering at the ingress measures traffic volume according to the | |||
| template record chosen and then the measurement records are sent | template record chosen and then the measurement records are sent | |||
| to the egress. | to the egress. | |||
| o Metering on the egress measures congestion level information | o Metering on the egress measures congestion level information | |||
| according to template record which SHOULD be the same as the | according to template record which SHOULD be the same as the | |||
| template record sent by the ingress. | template record sent by the ingress. | |||
| o The egress sends its measurement records together with the | o The egress sends its measurement records together with the | |||
| skipping to change at page 23, line 43 ¶ | skipping to change at page 24, line 7 ¶ | |||
| Data Type Semantics: totalCounter | Data Type Semantics: totalCounter | |||
| ElementId: TBD5 | ElementId: TBD5 | |||
| Status: current | Status: current | |||
| Units: bytes | Units: bytes | |||
| 4.3.7 tunnelEcnCEMarkedRatio | 4.3.7 tunnelEcnCEMarkedRatio | |||
| Description: The ratio of CE-marked packets at the Observation | Description: The ratio of packets that are CE-marked to packets | |||
| Point. | that are not CE-marked at the Observation Point. | |||
| Abstract Data Type: float32 | Abstract Data Type: float32 | |||
| ElementId: TBD6 | ElementId: TBD6 | |||
| Status: current | Status: current | |||
| 5. Example of Use | 5. Example of Use | |||
| This section provides an example of the solution described in this | This section provides an example of the solution described in this | |||
| skipping to change at page 25, line 19 ¶ | skipping to change at page 26, line 19 ¶ | |||
| |---------------------------------|----------------------| | |---------------------------------|----------------------| | |||
| |tunnelEcnCeCeByteTotalCount Field Length=8 | | |tunnelEcnCeCeByteTotalCount Field Length=8 | | |||
| |---------------------------------|----------------------| | |---------------------------------|----------------------| | |||
| |tunnelEcnEctNectByteTotalCount Field Length=8 | | |tunnelEcnEctNectByteTotalCount Field Length=8 | | |||
| |---------------------------------|----------------------| | |---------------------------------|----------------------| | |||
| |tunnelEcnEctEctByteTotalCount Field Length=8 | | |tunnelEcnEctEctByteTotalCount Field Length=8 | | |||
| |---------------------------------+----------------------| | |---------------------------------+----------------------| | |||
| Figure 8. Template Record Sent From Ingress to Egress | Figure 8. Template Record Sent From Ingress to Egress | |||
| +-------+ +-+ +-+ +-+ +-+ +-+ +-+ +-+ +-------+ | +-------+ +-------+ | |||
| | | |M| |P| |P| |P| |M| |P| |P| | | | | | +-+ +-+ +-+ +-+ +-+ +-+ +-+ | | | |||
| | | +-+ +-+ +-+ +-+ +-+ +-+ +-+ | | | | | |P| |P| |M| |P| |P| |P| |M| | | | |||
| | |<---------------------------------------| | | | | +-+ +-+ +-+ +-+ +-+ +-+ +-+ | | | |||
| | | | | | ||||
| | | | | | ||||
| |egress | +-+ +-+ |ingress| | ||||
| | | |M| |M| | | | ||||
| | | +-+ +-+ | | | ||||
| | |--------------------------------------->| | | | |--------------------------------------->| | | |||
| | | | | | | | | | | |||
| |ingress| |egress | | ||||
| | | +-+ +-+ | | | ||||
| | | |M| |M| | | | ||||
| | | +-+ +-+ | | | ||||
| | |<---------------------------------------| | | ||||
| | | | | | | | | | | |||
| +-------+ +-------+ | +-------+ +-------+ | |||
| +-+ | +-+ | |||
| |M| : Message Packet | |M| : IPFIX Message Packet | |||
| +-+ | +-+ | |||
| +-+ | +-+ | |||
| |P| : User Packet | |P| : User Data Packet | |||
| +-+ | +-+ | |||
| Figure 9. Traffic Flow Between Ingress and Egress | Figure 9. Traffic Flow Between Ingress and Egress | |||
| Set ID=257, Length=28 | Set ID=257, Length=28 | |||
| +------+ A1 +-------+ | +-------+ A1 +-------+ | |||
| | | B1 | | | | | B1 | | | |||
| | | C1 | | | | | C1 | | | |||
| | | <----------------------------- | | | | | -----------------------------> | | | |||
| | | | | | | | | | | |||
| | | | | | | | | | | |||
| | | SetID=256, Length=72 | | | | | SetID=256, Length=72 | | | |||
| | | A1 | | | | | A1 | | | |||
| | | B1 | | | | | B1 | | | |||
| |egress| C1 |ingress| | |ingress| C1 |egress | | |||
| | | A2 | | | | | A2 | | | |||
| | | B2 | | | | | B2 | | | |||
| | | C2 | | | | | C2 | | | |||
| | | D | | | | | D | | | |||
| | | E | | | | | E | | | |||
| | | R | | | | | R | | | |||
| | | ----------------------------> | | | | | <---------------------------- | | | |||
| | | | | | | | | | | |||
| +------+ +-------+ | +-------+ +-------+ | |||
| Figure 10. Messages Between Ingress and Egress | Figure 10. Messages Between Ingress and Egress | |||
| The following provides an example of how the tunnel congestion level | The following provides an example of how the tunnel congestion level | |||
| can be calculated (see Figure 10): | can be calculated (see Figure 10): | |||
| The congestion level could be divided into two categories: (1) | The congestion level could be divided into two categories: (1) | |||
| slight congestion (no packets dropped); (2) serious congestion | slight congestion (no packets dropped); (2) serious congestion | |||
| (packets are being dropped). | (packets are being dropped). | |||
| For slight congestion, the congestion level is indicated by the | For slight congestion, the congestion level is indicated by the | |||
| ratio of CE-marked packets: | ratio of CE-marked packets: | |||
| ce_marked = R; | R = ce_marked_ratio = ce-marked / total_egress ; | |||
| For serious congestion, the congestion level is indicated as the | For serious congestion, the congestion level is indicated as the | |||
| volume of traffic loss: | volume of traffic loss: | |||
| total_ingress = (A1 + B1 + C1) | total_ingress = (A1 + B1 + C1) | |||
| total_egress = (A2 + B2 + C2 + D + E) | total_egress = (A2 + B2 + C2 + D + E) | |||
| volume_loss = (total_ingress - total_egress) | volume_loss = (total_ingress - total_egress) | |||
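The calculation above can be sketched in Python as follows. This is a minimal illustration and not part of the draft: the class and function names are hypothetical, the counters are simply named after the labels in Figure 10, and ce_marked is taken as a given input because this excerpt does not show how it is derived from the counters. The sketch follows the -09 wording, where R is the ratio ce-marked / total_egress.

```python
from dataclasses import dataclass


@dataclass
class IngressCounts:
    """Counters reported by the ingress (labels A1, B1, C1 in Figure 10)."""
    a1: int
    b1: int
    c1: int


@dataclass
class EgressCounts:
    """Counters held by the egress (labels A2, B2, C2, D, E in Figure 10)."""
    a2: int
    b2: int
    c2: int
    d: int
    e: int


def total_ingress(ing: IngressCounts) -> int:
    # total_ingress = (A1 + B1 + C1)
    return ing.a1 + ing.b1 + ing.c1


def total_egress(eg: EgressCounts) -> int:
    # total_egress = (A2 + B2 + C2 + D + E)
    return eg.a2 + eg.b2 + eg.c2 + eg.d + eg.e


def ce_marked_ratio(ce_marked: int, eg: EgressCounts) -> float:
    # Slight congestion: R = ce_marked / total_egress.
    # Assumption: return 0.0 when nothing has reached the egress,
    # since the draft text does not define that case.
    total = total_egress(eg)
    return ce_marked / total if total else 0.0


def volume_loss(ing: IngressCounts, eg: EgressCounts) -> int:
    # Serious congestion: volume_loss = total_ingress - total_egress
    return total_ingress(ing) - total_egress(eg)


if __name__ == "__main__":
    ing = IngressCounts(a1=100, b1=200, c1=300)
    eg = EgressCounts(a2=90, b2=180, c2=270, d=30, e=20)
    print("R =", ce_marked_ratio(59, eg))
    print("volume_loss =", volume_loss(ing, eg))
```

A non-zero volume_loss would place the tunnel in the "serious congestion" category; otherwise R alone indicates the level of slight congestion.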
| skipping to change at page 27, line 21 ¶ | skipping to change at page 28, line 21 ¶ | |||
| IANA is requested to assign two contiguous bits in the NSH Base | IANA is requested to assign two contiguous bits in the NSH Base | |||
| Header Bits registry for ECN (bits 16 and 17 suggested) and note this | Header Bits registry for ECN (bits 16 and 17 suggested) and note this | |||
| assignment as follows: | assignment as follows: | |||
| Bit Description Reference | Bit Description Reference | |||
| ---------- ----------- ----------------- | ---------- ----------- ----------------- | |||
| tbd(16-17) NSH ECN [this document] | tbd(16-17) NSH ECN [this document] | |||
| 6.2 IPFIX Information Element IDs | 6.2 IPFIX Information Element IDs | |||
| IANA is requested to assign IPFIX Information Element IDs as follows: | IANA is requested to assign seven IPFIX Information Element IDs as | |||
| follows: | ||||
| ElementID: TBD0 | ElementID: TBD0 | |||
| Name: nshServicePathID | Name: nshServicePathID | |||
| Data Type: unsigned32 | Data Type: unsigned32 | |||
| Data Type Semantics: identifier | Data Type Semantics: identifier | |||
| Status: current | Status: current | |||
| Description: The Network Service Header [RFC8300] Service Path | Description: The Network Service Header [RFC8300] Service Path | |||
| Identifier. | Identifier. | |||
| ElementID: TBD1 | ElementID: TBD1 | |||
| skipping to change at page 29, line 17 ¶ | skipping to change at page 30, line 17 ¶ | |||
| For general NSH security considerations, see [RFC8300]. | For general NSH security considerations, see [RFC8300]. | |||
| For security considerations concerning ECN signaling tampering, see | For security considerations concerning ECN signaling tampering, see | |||
| [RFC3168]. For security considerations concerning ECN and | [RFC3168]. For security considerations concerning ECN and | |||
| encapsulation, see [RFC6040]. | encapsulation, see [RFC6040]. | |||
| For general IPFIX security considerations, see [RFC7011]. If deployed | For general IPFIX security considerations, see [RFC7011]. If deployed | |||
| in an untrusted environment, the signaling traffic between ingress | in an untrusted environment, the signaling traffic between ingress | |||
| and egress can be protected utilizing the security mechanisms | and egress can be protected utilizing the security mechanisms | |||
| provided by IPFIX (see Section 11 in [RFC7011]). The tunnel | provided by IPFIX (see Section 11 in [RFC7011]). The tunnel | |||
| endpoints (the ingress and egress for an SFC domain) are assumed to | endpoints (the ingress and egress for an SFC enabled domain) are | |||
| be in the same administrative domain, so they will trust each other. | assumed to be in the same administrative domain, so they will trust | |||
| each other. | ||||
| The solution in this document does not introduce any greater | The solution in this document does not introduce any greater | |||
| potential to invade privacy than would have been available without | potential to invade privacy than would have been available without | |||
| the solution. | the solution. | |||
| 8. Acknowledgements | 8. Acknowledgements | |||
| Most of the material on Tunnel Congestion Feedback was originally in | Most of the material on Tunnel Congestion Feedback was originally in | |||
| draft-ietf-tsvwg-tunnel-congestion-feedback. After discussion with | draft-ietf-tsvwg-tunnel-congestion-feedback. After discussion with | |||
| the authors of that draft, the authors of this draft, and the Chairs | the authors of that draft, the authors of this draft, and the Chairs | |||
| of the TSVWG and SFC Working Groups, the Tunnel Congestion Feedback | of the TSVWG and SFC Working Groups, the Tunnel Congestion Feedback | |||
| draft was merged into this draft. | draft was merged into this draft. | |||
| The authors wish to thank the following for their comments, | The authors wish to thank the following for their comments, | |||
| suggestions, and reviews: | suggestions, and reviews: | |||
| David Black, Sami Boutros, Anthony Chan, Lingli Deng, Liang Geng, | David Black, Mohamed Boucadair, Sami Boutros, Anthony Chan, | |||
| Joel Halpern, Jake Holland, John Kaippallimalil, Tal Mizrahi, | Lingli Deng, Liang Geng, Joel Halpern, Jake Holland, | |||
| Vincent Roca, Lei Zhu | John Kaippallimalil, Tal Mizrahi, Vincent Roca, Lei Zhu | |||
| Normative References | Normative References | |||
| [RFC2119] - Bradner, S., "Key words for use in RFCs to Indicate | [RFC2119] - Bradner, S., "Key words for use in RFCs to Indicate | |||
| Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, | Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, | |||
| March 1997, <http://www.rfc-editor.org/info/rfc2119>. | March 1997, <http://www.rfc-editor.org/info/rfc2119>. | |||
| [RFC3168] - Ramakrishnan, K., Floyd, S., and D. Black, "The Addition | [RFC3168] - Ramakrishnan, K., Floyd, S., and D. Black, "The Addition | |||
| of Explicit Congestion Notification (ECN) to IP", RFC 3168, DOI | of Explicit Congestion Notification (ECN) to IP", RFC 3168, DOI | |||
| 10.17487/RFC3168, September 2001, <http://www.rfc- | 10.17487/RFC3168, September 2001, <http://www.rfc- | |||
| skipping to change at page 32, line 5 ¶ | skipping to change at page 32, line 35 ¶ | |||
| [RFC8311] - Black, D., "Relaxing Restrictions on Explicit Congestion | [RFC8311] - Black, D., "Relaxing Restrictions on Explicit Congestion | |||
| Notification (ECN) Experimentation", RFC 8311, DOI | Notification (ECN) Experimentation", RFC 8311, DOI | |||
| 10.17487/RFC8311, January 2018, <https://www.rfc- | 10.17487/RFC8311, January 2018, <https://www.rfc- | |||
| editor.org/info/rfc8311>. | editor.org/info/rfc8311>. | |||
| [ecnL4S] - De Schepper, K., and B. Briscoe, "Identifying Modified | [ecnL4S] - De Schepper, K., and B. Briscoe, "Identifying Modified | |||
| Explicit Congestion Notification (ECN) Semantics for Ultra-Low | Explicit Congestion Notification (ECN) Semantics for Ultra-Low | |||
| Queuing Delay (L4S)", draft-ietf-tsvwg-ecn-l4s-id, work in | Queuing Delay (L4S)", draft-ietf-tsvwg-ecn-l4s-id, work in | |||
| progress. | progress. | |||
| [NSHTLV] - Wei, Y., U. Elzur, S. Majee, C. Pignataro, and D. | ||||
| Eastlake, "Network Service Header Metadata Type 2 Variable- | ||||
| Length Context Headers" , draft-ietf-sfc-nsh-tlv, work in | ||||
| progress. | ||||
| Authors' Addresses | Authors' Addresses | |||
| Donald E. Eastlake, 3rd | Donald E. Eastlake, 3rd | |||
| Futurewei Technologies | Futurewei Technologies | |||
| 2386 Panoramic Circle | 2386 Panoramic Circle | |||
| Apopka, FL 32703 USA | Apopka, FL 32703 USA | |||
| Tel: +1-508-333-2270 | Tel: +1-508-333-2270 | |||
| Email: d3e3e3@gmail.com | Email: d3e3e3@gmail.com | |||
| skipping to change at page 33, line 7 ¶ | skipping to change at page 34, line 7 ¶ | |||
| Xinpeng Wei | Xinpeng Wei | |||
| Huawei Technologies | Huawei Technologies | |||
| Beiqing Rd. Z-park No.156, Haidian District, | Beiqing Rd. Z-park No.156, Haidian District, | |||
| Beijing, 100095, P. R. China | Beijing, 100095, P. R. China | |||
| EMail: weixinpeng@huawei.com | EMail: weixinpeng@huawei.com | |||
| Copyright and IPR Provisions | Copyright and IPR Provisions | |||
| Copyright (c) 2021 IETF Trust and the persons identified as the | Copyright (c) 2022 IETF Trust and the persons identified as the | |||
| document authors. All rights reserved. | document authors. All rights reserved. | |||
| This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
| Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
| (http://trustee.ietf.org/license-info) in effect on the date of | (http://trustee.ietf.org/license-info) in effect on the date of | |||
| publication of this document. Please review these documents | publication of this document. Please review these documents | |||
| carefully, as they describe your rights and restrictions with respect | carefully, as they describe your rights and restrictions with respect | |||
| to this document. Code Components extracted from this document must | to this document. Code Components extracted from this document must | |||
| include Simplified BSD License text as described in Section 4.e of | include Revised BSD License text as described in Section 4.e of the | |||
| the Trust Legal Provisions and are provided without warranty as | Trust Legal Provisions and are provided without warranty as described | |||
| described in the Simplified BSD License. The definitive version of | in the Revised BSD License. | |||
| an IETF Document is that published by, or under the auspices of, the | ||||
| IETF. Versions of IETF Documents that are published by third parties, | ||||
| including those that are translated into other languages, should not | ||||
| be considered to be definitive versions of IETF Documents. The | ||||
| definitive version of these Legal Provisions is that published by, or | ||||
| under the auspices of, the IETF. Versions of these Legal Provisions | ||||
| that are published by third parties, including those that are | ||||
| translated into other languages, should not be considered to be | ||||
| definitive versions of these Legal Provisions. For the avoidance of | ||||
| doubt, each Contributor to the IETF Standards Process licenses each | ||||
| Contribution that he or she makes as part of the IETF Standards | ||||
| Process to the IETF Trust pursuant to the provisions of RFC 5378. No | ||||
| language to the contrary, or terms, conditions or rights that differ | ||||
| from or are inconsistent with the rights and licenses granted under | ||||
| RFC 5378, shall have any effect and shall be null and void, whether | ||||
| published or posted by such Contributor, or included with or in such | ||||
| Contribution. | ||||
| End of changes. 75 change blocks. | ||||
| 194 lines changed or deleted | 208 lines changed or added | |||