< draft-ietf-sfc-nsh-ecn-support-07.txt   draft-ietf-sfc-nsh-ecn-support-08.txt >
skipping to change at page 1, line 13 skipping to change at page 1, line 13
INTERNET-DRAFT D. Eastlake INTERNET-DRAFT D. Eastlake
Intended status: Proposed Standard Futurewei Technologies Intended status: Proposed Standard Futurewei Technologies
B. Briscoe B. Briscoe
Independent Independent
Y. Li Y. Li
Huawei Technologies Huawei Technologies
A. Malis A. Malis
Malis Consulting Malis Consulting
X. Wei X. Wei
Huawei Technologies Huawei Technologies
Expires: April 5, 2022 October 6, 2021 Expires: April 20, 2022 October 21, 2021
Explicit Congestion Notification (ECN) and Congestion Feedback Explicit Congestion Notification (ECN) and Congestion Feedback
Using the Network Service Header (NSH) and IPFIX Using the Network Service Header (NSH) and IPFIX
<draft-ietf-sfc-nsh-ecn-support-07.txt> <draft-ietf-sfc-nsh-ecn-support-08.txt>
Abstract Abstract
Explicit congestion notification (ECN) allows a forwarding element to Explicit congestion notification (ECN) allows a forwarding element to
notify downstream devices of the onset of congestion without having notify downstream devices of the onset of congestion without having
to drop packets. Coupled with a means to feed information about to drop packets. Coupled with a means to feed information about
congestion back to upstream nodes, this can improve network congestion back to upstream nodes, this can improve network
efficiency through better congestion control, frequently without efficiency through better congestion control, frequently without
packet drops. This document specifies ECN and congestion feedback packet drops. This document specifies ECN and congestion feedback
support within a Service Function Chaining (SFC) architecture domain support within a Service Function Chaining (SFC) domain through use
through use of the Network Service Header (NSH, RFC 8300) and IP Flow of the Network Service Header (NSH, RFC 8300) and IP Flow Information
Information Export (IPFIX, RFC 7011). Export (IPFIX, RFC 7011).
Status of This Memo Status of This Memo
This Internet-Draft is submitted in full conformance with the This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79. provisions of BCP 78 and BCP 79.
Distribution of this document is unlimited. Comments should be sent Distribution of this document is unlimited. Comments should be sent
to the SFC Working Group mailing list <sfc@ietf.org> or to the to the SFC Working Group mailing list <sfc@ietf.org> or to the
authors. authors.
skipping to change at page 3, line 32 skipping to change at page 3, line 32
3.4 Congestion Statistics and the Conservation of Packets.16 3.4 Congestion Statistics and the Conservation of Packets.16
4. Tunnel Congestion Feedback Support.....................18 4. Tunnel Congestion Feedback Support.....................18
4.1 Congestion Level Measurements.........................18 4.1 Congestion Level Measurements.........................18
4.2 Congestion Information Delivery.......................19 4.2 Congestion Information Delivery.......................19
4.3 IPFIX Extensions......................................21 4.3 IPFIX Extensions......................................21
4.3.1 nshServicePathID....................................21 4.3.1 nshServicePathID....................................21
4.3.2 tunnelEcnCeCeByteTotalCount.........................21 4.3.2 tunnelEcnCeCeByteTotalCount.........................21
4.3.3 tunnelEcnEctNectBytetTotalCount.....................22 4.3.3 tunnelEcnEctNectBytetTotalCount.....................22
4.3.4 tunnelEcnCeNectByteTotalCount.......................22 4.3.4 tunnelEcnCeNectByteTotalCount.......................22
4.3.5 tunnelEcnCeEctByteTotalCount........................22 4.3.5 tunnelEcnCeEctByteTotalCount........................23
4.3.6 tunnelEcnEctEctByteTotalCount.......................23 4.3.6 tunnelEcnEctEctByteTotalCount.......................23
4.3.7 tunnelEcnCEMarkedRatio..............................23 4.3.7 tunnelEcnCEMarkedRatio..............................23
5. Example of Use.........................................24 5. Example of Use.........................................24
6. IANA Considerations....................................27 6. IANA Considerations....................................27
6.1 SFC NSH Header ECN Bits...............................27 6.1 SFC NSH Header ECN Bits...............................27
6.2 IPFIX Information Element IDs.........................27 6.2 IPFIX Information Element IDs.........................27
7. Security Considerations................................29 7. Security Considerations................................29
skipping to change at page 4, line 14 skipping to change at page 4, line 14
1. Introduction 1. Introduction
Explicit Congestion Notification (ECN [RFC3168]) allows a forwarding Explicit Congestion Notification (ECN [RFC3168]) allows a forwarding
element to notify downstream devices of the onset of congestion element to notify downstream devices of the onset of congestion
without having to drop packets. Coupled with a means to feed without having to drop packets. Coupled with a means to feed
information about congestion back to upstream nodes, this can improve information about congestion back to upstream nodes, this can improve
network efficiency through better congestion control, frequently network efficiency through better congestion control, frequently
without packet drops. This document specifies ECN and congestion without packet drops. This document specifies ECN and congestion
feedback support within a Service Function Chaining (SFC [RFC7665]) feedback support within a Service Function Chaining (SFC [RFC7665])
architecture domain through use of the Network Service Header (NSH domain through use of the Network Service Header (NSH [RFC8300]) and
[RFC8300]) and IP Flow Information Export (IPFIX [RFC7011]). IP Flow Information Export (IPFIX [RFC7011]).
It requires that all ingress and egress nodes of the SFC domain It requires that all ingress and egress nodes of the SFC domain
implement ECN. While congestion management will be the most effective implement ECN. While congestion management will be the most effective
if all interior nodes of the SFC domain implement ECN, some benefit if all interior nodes of the SFC domain implement ECN, some benefit
is obtained even if some interior nodes do not implement ECN. is obtained even if some interior nodes do not implement ECN.
Congestion at any interior bottleneck where ECN marking is not Congestion at any interior bottleneck where ECN marking is not
implemented will be unmanaged. implemented will be unmanaged.
The subsections below in this section provide background information The subsections below in this section provide background information
on NSH, ECN, congestion feedback, and terminology used in this on NSH, ECN, congestion feedback, and terminology used in this
skipping to change at page 6, line 34 skipping to change at page 6, line 34
networks, enterprise networks, and the public Internet. A tunnel networks, enterprise networks, and the public Internet. A tunnel
consists of ingress, egress, and a set of intermediate nodes consists of ingress, egress, and a set of intermediate nodes
including routers. Tunnel Congestion Feedback (Section 4) is a including routers. Tunnel Congestion Feedback (Section 4) is a
building block for congestion mitigation methods. It supports building block for congestion mitigation methods. It supports
feedback of congestion information from an egress node to an ingress feedback of congestion information from an egress node to an ingress
node. This document treats the SFC domain as a tunnel with the node. This document treats the SFC domain as a tunnel with the
initial Classifier node being the ingress; however, the Tunnel initial Classifier node being the ingress; however, the Tunnel
Congestion Feedback facilities specified in this document MAY be used Congestion Feedback facilities specified in this document MAY be used
in other contexts besides SFC domains. in other contexts besides SFC domains.
Examples of actions that can be taken by an ingress node when it has
knowledge of downstream congestion include those listed below.
Details of implementing these traffic control methods, beyond those
given here, are outside the scope of this document.
Any action by a tunnel ingress to reduce congestion needs to allow Any action by a tunnel ingress to reduce congestion needs to allow
sufficient time for the end-to-end congestion control loop to respond sufficient time for the end-to-end congestion control loop to respond
first; otherwise the system could become unstable. For instance, the first; otherwise the system could become unstable. For instance, the
ingress can take a smoothed average of the level of congestion signaled ingress can take a smoothed average of the level of congestion signaled
by feedback from the tunnel egress or delay any action for at by feedback from the tunnel egress or delay any action for at
least the worst case global round trip time (for example 100 least the worst case end-to-end round trip time (for example 200
milliseconds). milliseconds).
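The smoothing and hold-off behavior described above could be sketched as follows. This is only an illustration under stated assumptions: the class name, the exponentially weighted moving average, the smoothing weight, and the congestion threshold are all hypothetical and are not specified by either draft version; only the 200 millisecond hold-off echoes the example value in the -08 text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SmoothedCongestion:
    """Smooths egress feedback and withholds action for a hold-off period.

    Every name and default value here is an illustrative assumption.
    """
    alpha: float = 0.125      # EWMA weight given to each new sample (assumed)
    threshold: float = 0.05   # smoothed CE ratio treated as congestion (assumed)
    hold_off_s: float = 0.2   # worst-case end-to-end RTT, for example 200 ms
    smoothed: float = 0.0
    congested_since: Optional[float] = None

    def update(self, ce_ratio: float, now_s: float) -> bool:
        """Feed one feedback sample; return True when the ingress may act."""
        # Exponentially weighted moving average of the reported CE ratio.
        self.smoothed += self.alpha * (ce_ratio - self.smoothed)
        if self.smoothed < self.threshold:
            # Congestion cleared (or never built up): reset the hold-off timer.
            self.congested_since = None
            return False
        if self.congested_since is None:
            self.congested_since = now_s
        # Act only after congestion has persisted for the whole hold-off,
        # giving the end-to-end control loop time to respond first.
        return now_s - self.congested_since >= self.hold_off_s
```

With the default weight, a single isolated spike in the reported ratio does not trigger any ingress action, which is the stability property the paragraph above asks for.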
Examples of actions that can be taken by an ingress node when it has
knowledge of downstream congestion include those listed below.
Details of implementing these traffic control methods, beyond those
given here, are outside the scope of this document.
(1) Traffic throttling (policing), where the downstream traffic (1) Traffic throttling (policing), where the downstream traffic
flowing out of the ingress node is limited to reduce or eliminate flowing out of the ingress node is limited to reduce or eliminate
congestion. congestion.
(2) Upstream congestion feedback, where the ingress node sends (2) Upstream congestion feedback, where the ingress node sends
messages upstream to or towards the ultimate traffic source, a messages upstream to or towards the ultimate traffic source, a
function that can throttle traffic generation/transmission. function that can throttle traffic generation/transmission.
(3) Traffic re-direction, where the ingress node configures the NSH (3) Traffic re-direction, where the ingress node configures the NSH
of some future traffic so that it avoids congested paths. Great of some future traffic so that it avoids congested paths. Great
skipping to change at page 7, line 22 skipping to change at page 7, line 22
ordering of traffic in flows that it is desirable to keep in ordering of traffic in flows that it is desirable to keep in
order and (b) oscillation/instability in traffic paths due to order and (b) oscillation/instability in traffic paths due to
alternate congestion of previously idle paths and the idling of alternate congestion of previously idle paths and the idling of
previously congested paths. For example, it is preferable to previously congested paths. For example, it is preferable to
classify traffic into flows of a sufficiently coarse granularity classify traffic into flows of a sufficiently coarse granularity
that the flows are long lived and then use a stable path per that the flows are long lived and then use a stable path per
flow, sending only newly appearing flows on apparently flow, sending only newly appearing flows on apparently
uncongested paths. uncongested paths.
Figure 2 shows an example path from an original sender to a final Figure 2 shows an example path from an original sender to a final
receiver passing through an example chain of service functions receiver passing through a chain of service functions between the
between the ingress and egress of an SFC domain. The path is also ingress and egress of an SFC domain. The path is also likely to pass
likely to pass through other network nodes outside the SFC domain through other network nodes outside the SFC domain (not shown) before
(not shown) before entering the SFC domain and after leaving the SFC entering the SFC domain and after leaving the SFC domain.
domain.
The figure shows typical congestion feedback that would be expected The figure shows typical congestion feedback that would be expected
from the final receiver to the origin sender, which controls the load from the final receiver to the origin sender, which controls the load
the origin sender applies to all elements on the path. The figure the origin sender directs to all elements on the path. The figure
also shows the congestion feedback from the egress to the ingress of also shows the congestion feedback from the egress to the ingress of
the SFC domain that is described in this document, to control or the SFC domain that is described in this document, to control or
balance load within the SFC domain. balance load within the SFC domain.
.:= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = :. .:= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = :.
_||_ End-to-End Congestion Feedback || _||_ End-to-End Congestion Feedback ||
\ / || \ / ||
\/ || \/ ||
__ Inner Transport Header and Payload __ __ Inner Transport Header and Payload __
| | ->- - - - - - - - - - - - - - ->- - - - - -- - - - - - ->- | | | | ->- - - - - - - - - - - - - - ->- - - - - -- - - - - - ->- | |
skipping to change at page 17, line 4 skipping to change at page 17, line 4
new packets as well as simply processing and forwarding the packets new packets as well as simply processing and forwarding the packets
it receives. Such actions might appear to be packet loss due to it receives. Such actions might appear to be packet loss due to
congestion or might mask the loss of packets by generating additional congestion or might mask the loss of packets by generating additional
packets. packets.
The tunnel congestion feedback approach (Section 4) can detect The tunnel congestion feedback approach (Section 4) can detect
congestion in several ways. One way detects traffic loss by counting congestion in several ways. One way detects traffic loss by counting
payload packets and bytes in at the ingress and counting them out at payload packets and bytes in at the ingress and counting them out at
the egress. This does not work unless nodes conserve the number of the egress. This does not work unless nodes conserve the number of
payload packets and/or bytes. Therefore, it will not be possible to payload packets and/or bytes. Therefore, it will not be possible to
detect loss using this technique if traffic volume is not conserved accurately detect packet loss using this technique if traffic volume
by the service function chain processing that traffic. is not conserved by the service function chain processing that
traffic.
Nonetheless, if a bottleneck supports ECN marking, it will be Nonetheless, if a bottleneck supports ECN marking, it will be
possible to detect the high level of CE markings that are associated possible to detect the high level of CE markings that are associated
with congestion at that bottleneck by looking at the ratio of CE- with congestion at that bottleneck by looking at the ratio of CE-
marked to non-CE-marked packets. However, it will not be possible for marked to non-CE-marked packets. However, it will not be possible for
the tunnel congestion feedback approach to detect any congestion, the tunnel congestion feedback approach to detect any congestion,
whether slight or severe, if it occurs at a bottleneck that does not whether slight or severe, if it occurs at a bottleneck that does not
support ECN marking. support ECN marking.
4. Tunnel Congestion Feedback Support 4. Tunnel Congestion Feedback Support
skipping to change at page 18, line 22 skipping to change at page 18, line 22
IP Flow Information Export (IPFIX [RFC7011]) provides a standard for IP Flow Information Export (IPFIX [RFC7011]) provides a standard for
communicating traffic flow statistics. As extended by this document, communicating traffic flow statistics. As extended by this document,
IPFIX messages from the egress to the ingress are used to communicate IPFIX messages from the egress to the ingress are used to communicate
the extent of congestion between an ingress and egress based on ECN the extent of congestion between an ingress and egress based on ECN
marking in the NSH. marking in the NSH.
4.1 Congestion Level Measurements 4.1 Congestion Level Measurements
The congestion level measurements are based on ECN marking in the NSH The congestion level measurements are based on ECN marking in the NSH
and packet drop. In particular the congestion information includes and packet drop. In particular congestion information includes at
the ratio of CE-marked packets to all packets and the ratio of least one of cumulative bytes counts of packets with each type of
dropped packets to all packets. outer/inner header ECN marking combination, the ratio of CE-marked
packets to all packets, and the ratio of dropped packets to all
packets.
If the congestion level is low enough, the packets are marked as CE If the congestion level is low enough, the packets are marked as CE
instead of being dropped, and then it is easy to calculate congestion instead of being dropped, and then it is easy to calculate congestion
level according to the ratio of CE-marked packets. If the congestion level according to the ratio of CE-marked packets. If the congestion
level is so high that ECT packets will be dropped, then the packet level is so high that ECT packets will be dropped, then the packet
loss ratio could be calculated by comparing total packets entering loss ratio could be calculated by comparing total packets entering
ingress and total packets arriving at egress over the same span of ingress and total packets arriving at egress over the same span of
packets. If packet loss is detected for a flow that would preserve packets. If packet loss is detected for a flow that would preserve
the number of packets in the absence of congestion, then it can be the number of packets in the absence of congestion, then it can be
assumed that severe congestion has occurred in the tunnel. assumed that severe congestion has occurred in the tunnel.
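As a rough sketch of the loss calculation described above, the ratio can be computed from the ingress and egress totals taken over the same span of packets. The function name is hypothetical, and the sketch assumes the flow conserves packets in the absence of congestion; clamping a negative difference to zero is an added assumption to cover service functions that generate packets.

```python
def loss_ratio(ingress_count: int, egress_count: int) -> float:
    """Fraction of packets (or bytes) lost between ingress and egress.

    Both counts must cover the same span of traffic. Assumes the
    service function chain conserves packets when uncongested.
    """
    if ingress_count <= 0:
        return 0.0  # nothing was sent, so no loss can be inferred
    # An egress count above the ingress count means packets were created
    # inside the domain; treat that as zero loss rather than negative loss.
    lost = max(ingress_count - egress_count, 0)
    return lost / ingress_count
```

For instance, 1000 packets counted at the ingress and 950 at the egress give a loss ratio of 0.05.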
skipping to change at page 19, line 18 skipping to change at page 19, line 20
be dropped. Faked-ECT is used to shift some drops to the egress in be dropped. Faked-ECT is used to shift some drops to the egress in
order to allow the egress to calculate the CE-marked packet ratio order to allow the egress to calculate the CE-marked packet ratio
more precisely. more precisely.
The ingress encapsulates packets and marks their outer header The ingress encapsulates packets and marks their outer header
according to faked ECT as described above. The ingress cumulatively according to faked ECT as described above. The ingress cumulatively
counts packet bytes for three types of ECN combination (CE|CE, ECT|N- counts packet bytes for three types of ECN combination (CE|CE, ECT|N-
ECT, and ECT|ECT) and then the ingress regularly sends cumulative ECT, and ECT|ECT) and then the ingress regularly sends cumulative
byte-count messages for each type of ECN combination to the egress. byte-count messages for each type of ECN combination to the egress.
When each message arrives at the egress, (1) the egress calculates When each message arrives at the egress, the following two steps
the ratio of CE-marked packets; (2) the egress cumulatively counts occur: (1) the egress calculates the ratio of CE-marked packets; (2)
packet bytes coming from the ingress and adds its own bytes counts of the egress cumulatively counts packet bytes coming from the ingress
each type of ECN combination (CE|CE, ECT|N-ECT, CE|N-ECT, CE|ECT, and and adds its own bytes counts of each type of ECN combination (CE|CE,
ECT|ECT) to the message for ingress to calculate packet loss. The ECT|N-ECT, CE|N-ECT, CE|ECT, and ECT|ECT) to the message for the
egress feeds back the CE-marked packet ratio, packet loss ratio, ingress to calculate packet loss. The egress feeds back the CE-marked
bytes counts information, and the like to the ingress as requested packet ratio, packet loss ratio, bytes counts information, and the
for evaluating congestion level in the tunnel. like to the ingress as requested for evaluating congestion level in
the tunnel.
The statistics can be at the granularity of all traffic from the The statistics can be at the granularity of all traffic from the
ingress to the egress to learn about the overall congestion status of ingress to the egress to learn about the overall congestion status of
the path between the ingress and the egress or at the granularity of the path between the ingress and the egress or at the granularity of
individual customer's traffic or a specific set of flows to learn individual customer's traffic or a specific set of flows to learn
about their congestion contribution. about their congestion contribution.
For example, the tunnelEcnCEMarkedRatio field (specified below) For example, the tunnelEcnCEMarkedRatio field (specified below)
indicates the fraction of traffic that has been marked in the ECN indicates the fraction of traffic that has been marked in the ECN
field of the NSH as Congestion Experienced (CE). field of the NSH as Congestion Experienced (CE).
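One possible way to derive a CE-marked ratio from the per-combination byte counts is sketched below. The definition used is an assumption for illustration, not the drafts' normative formula: combinations are read as outer|inner, CE|CE traffic is treated as already marked before entering the tunnel and excluded, and the ratio is taken over bytes rather than packets. The function name and dictionary keys are hypothetical.

```python
from typing import Mapping


def ce_marked_ratio(byte_counts: Mapping[str, int]) -> float:
    """Share of bytes that were CE-marked inside the tunnel.

    Keys are "outer|inner" ECN combinations (an assumed convention).
    """
    # Outer CE with a non-CE inner header: marked somewhere in the tunnel.
    marked = byte_counts.get("CE|N-ECT", 0) + byte_counts.get("CE|ECT", 0)
    # Traffic that entered without CE and therefore could have been marked.
    eligible = (
        marked
        + byte_counts.get("ECT|N-ECT", 0)
        + byte_counts.get("ECT|ECT", 0)
    )
    if eligible == 0:
        return 0.0  # no eligible traffic observed in this interval
    return marked / eligible
```

A value computed this way could populate a float32 field such as tunnelEcnCEMarkedRatio, though the exact numerator and denominator would need to follow whatever the specification finally defines.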
4.2 Congestion Information Delivery 4.2 Congestion Information Delivery
As described above, the tunnel ingress needs to send messages As described above, the tunnel ingress sends messages containing
containing cumulative bytes counts of packets of each type of ECN cumulative byte counts of packets of each type of ECN marking to the
combination to the tunnel egress, and the tunnel egress also needs to tunnel egress, and the tunnel egress feeds back messages to the
feed back messages with cumulative bytes counts of packets of each ingress with at least one of the following: cumulative byte counts of
type of ECN combination and the CE-marked packet ratio to the packets of each type of ECN combination, the ratio of CE-marked
ingress. This section specifies how the messages are conveyed. packets to all packets, and the ratio of dropped packets to all
packets. This section specifies how the messages are conveyed.
IPFIX recommends, but does not require, use of SCTP [RFC4960] in IPFIX recommends, but does not require, use of SCTP [RFC4960] in
partial reliability mode [RFC3758] for the transport of its messages. partial reliability mode [RFC3758] for the transport of its messages.
This mode allows loss of some packets, which is tolerable because This mode allows loss of some packets, which is tolerable because
IPFIX communicates cumulative statistics. IPFIX over SCTP over IP IPFIX communicates cumulative statistics. IPFIX over SCTP over IP
SHOULD be used directly where there is IP connectivity between the SHOULD be used directly where there is IP connectivity between the
ingress and egress; however, there might be different transport ingress and egress; however, there might be different transport
protocols or address spaces used in different regions of an SFC protocols or address spaces used in different regions of an SFC
domain that make such direct IP connectivity problematic. The NSH domain that block such direct IP connectivity. The NSH provides the
provides the general method of routing traffic within an SFC domain general method of routing traffic within an SFC domain so the
so the encapsulation of the required IPFIX traffic in NSH MUST be encapsulation of the required IPFIX traffic in NSH MUST be
implemented and, when IP connectivity is not available, IPFIX over implemented and, when IP connectivity is not available, IPFIX over
NSH SHOULD be used along with configuration of appropriate SFC paths NSH SHOULD be used along with configuration of appropriate SFC paths
for the IPFIX over NSH traffic. for the IPFIX over NSH traffic.
IPFIX messages could travel along the same path as network data IPFIX messages could travel along the same path as network data
traffic. In any case, an IPFIX message packet may get lost in case of traffic. In any case, an IPFIX message packet may get lost in case of
network congestion. Even though the missing information could be network congestion. Even though the missing information could be
recovered because of the use of cumulative counts, the message SHOULD recovered because of the use of cumulative counts, the message SHOULD
be transmitted at a higher priority than users' traffic flows to be transmitted at a higher priority than users' traffic flows to
improve the promptness of congestion information feedback. improve the promptness of congestion information feedback.
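The reason cumulative counts tolerate lost IPFIX messages can be illustrated with a small sketch. The class and method names are hypothetical, and counter wrap-around is ignored for simplicity: a receiver that keeps only the last total it saw recovers the full traffic volume from the next report that does arrive.

```python
class CumulativeCounterReader:
    """Turns cumulative totals into per-interval deltas (illustrative only)."""

    def __init__(self) -> None:
        self.last_total = 0

    def on_report(self, total: int) -> int:
        """Return the bytes counted since the last report that arrived."""
        # Assumes the counter never wraps or resets (a simplification).
        delta = total - self.last_total
        self.last_total = total
        return delta


# Four reports are sent; in the lossy case the middle two never arrive.
sent_totals = [100, 250, 400, 700]
received_totals = [100, 700]

full_reader = CumulativeCounterReader()
lossy_reader = CumulativeCounterReader()
full_sum = sum(full_reader.on_report(t) for t in sent_totals)
lossy_sum = sum(lossy_reader.on_report(t) for t in received_totals)
```

Both readers end up with the same overall total; the lossy one only loses time resolution, which is why higher-priority transmission improves promptness rather than correctness.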
skipping to change at page 29, line 9 skipping to change at page 29, line 9
Name: tunnelEcnCEMarkedRatio Name: tunnelEcnCEMarkedRatio
Data Type: float32 Data Type: float32
Status: current Status: current
Description: The ratio of CE-marked packets at the Observation Description: The ratio of CE-marked packets at the Observation
Point. Point.
7. Security Considerations 7. Security Considerations
For general NSH security considerations, see [RFC8300]. For general NSH security considerations, see [RFC8300].
For security considerations concerning tampering with ECN signaling, For security considerations concerning ECN signaling tampering, see
see [RFC3168]. For security considerations concerning ECN and [RFC3168]. For security considerations concerning ECN and
encapsulation, see [RFC6040]. encapsulation, see [RFC6040].
For general IPFIX security considerations, see [RFC7011]. If deployed For general IPFIX security considerations, see [RFC7011]. If deployed
in an untrusted environment, the signaling traffic between ingress in an untrusted environment, the signaling traffic between ingress
and egress can be protected utilizing the security mechanisms and egress can be protected utilizing the security mechanisms
provided by IPFIX (see Section 11 in [RFC7011]). The tunnel provided by IPFIX (see Section 11 in [RFC7011]). The tunnel
endpoints (the ingress and egress for an SFC domain) are assumed to endpoints (the ingress and egress for an SFC domain) are assumed to
be in the same administrative domain, so they will trust each other. be in the same administrative domain, so they will trust each other.
The solution in this document does not introduce any greater The solution in this document does not introduce any greater
 End of changes. 17 change blocks. 
44 lines changed or deleted 49 lines changed or added
