| < draft-ietf-sfc-nsh-ecn-support-06.txt | draft-ietf-sfc-nsh-ecn-support-07.txt > | |||
|---|---|---|---|---|
| skipping to change at page 1, line 13 ¶ | skipping to change at page 1, line 13 ¶ | |||
| INTERNET-DRAFT D. Eastlake | INTERNET-DRAFT D. Eastlake | |||
| Intended status: Proposed Standard Futurewei Technologies | Intended status: Proposed Standard Futurewei Technologies | |||
| B. Briscoe | B. Briscoe | |||
| Independent | Independent | |||
| Y. Li | Y. Li | |||
| Huawei Technologies | Huawei Technologies | |||
| A. Malis | A. Malis | |||
| Malis Consulting | Malis Consulting | |||
| X. Wei | X. Wei | |||
| Huawei Technologies | Huawei Technologies | |||
| Expires: March 12, 2022 September 13, 2021 | Expires: April 5, 2022 October 6, 2021 | |||
| Explicit Congestion Notification (ECN) and Congestion Feedback | Explicit Congestion Notification (ECN) and Congestion Feedback | |||
| Using the Network Service Header (NSH) and IPFIX | Using the Network Service Header (NSH) and IPFIX | |||
| <draft-ietf-sfc-nsh-ecn-support-06.txt> | <draft-ietf-sfc-nsh-ecn-support-07.txt> | |||
| Abstract | Abstract | |||
| Explicit congestion notification (ECN) allows a forwarding element to | Explicit congestion notification (ECN) allows a forwarding element to | |||
| notify downstream devices of the onset of congestion without having | notify downstream devices of the onset of congestion without having | |||
| to drop packets. Coupled with a means to feed information about | to drop packets. Coupled with a means to feed information about | |||
| congestion back to upstream nodes, this can improve network | congestion back to upstream nodes, this can improve network | |||
| efficiency through better congestion control, frequently without | efficiency through better congestion control, frequently without | |||
| packet drops. This document specifies ECN and congestion feedback | packet drops. This document specifies ECN and congestion feedback | |||
| support within a Service Function Chaining (SFC) architecture domain | support within a Service Function Chaining (SFC) architecture domain | |||
| skipping to change at page 3, line 22 ¶ | skipping to change at page 3, line 22 ¶ | |||
| 2. The NSH ECN Field......................................10 | 2. The NSH ECN Field......................................10 | |||
| 3. ECN Support in the NSH.................................12 | 3. ECN Support in the NSH.................................12 | |||
| 3.1 At The Ingress........................................13 | 3.1 At The Ingress........................................13 | |||
| 3.2 At Transit Nodes......................................14 | 3.2 At Transit Nodes......................................14 | |||
| 3.2.1 At NSH Transit Nodes................................14 | 3.2.1 At NSH Transit Nodes................................14 | |||
| 3.2.2 At an SF/Proxy......................................15 | 3.2.2 At an SF/Proxy......................................15 | |||
| 3.2.3 At Other Forwarding Nodes...........................15 | 3.2.3 At Other Forwarding Nodes...........................15 | |||
| 3.3 At Exit/Egress........................................16 | 3.3 At Exit/Egress........................................16 | |||
| 3.4 Conservation of Packets...............................16 | 3.4 Congestion Statistics and the Conservation of Packets.16 | |||
| 4. Tunnel Congestion Feedback Support.....................18 | 4. Tunnel Congestion Feedback Support.....................18 | |||
| 4.1 Congestion Level Measurement..........................18 | 4.1 Congestion Level Measurements.........................18 | |||
| 4.3 Congestion Information Delivery.......................19 | 4.3 Congestion Information Delivery.......................19 | |||
| 4.3 IPFIX Extensions......................................21 | 4.3 IPFIX Extensions......................................21 | |||
| 4.3.1 nshServicePathID....................................21 | 4.3.1 nshServicePathID....................................21 | |||
| 4.3.2 tunnelEcnCeCeByteTotalCount.........................21 | 4.3.2 tunnelEcnCeCeByteTotalCount.........................21 | |||
| 4.3.3 tunnelEcnEctNectBytetTotalCount.....................22 | 4.3.3 tunnelEcnEctNectBytetTotalCount.....................22 | |||
| 4.3.4 tunnelEcnCeNectByteTotalCount.......................22 | 4.3.4 tunnelEcnCeNectByteTotalCount.......................22 | |||
| 4.3.5 tunnelEcnCeEctByteTotalCount........................22 | 4.3.5 tunnelEcnCeEctByteTotalCount........................22 | |||
| 4.3.6 tunnelEcnEctEctByteTotalCount.......................23 | 4.3.6 tunnelEcnEctEctByteTotalCount.......................23 | |||
| 4.3.7 tunnelEcnCEMarkedRatio..............................23 | 4.3.7 tunnelEcnCEMarkedRatio..............................23 | |||
| skipping to change at page 10, line 46 ¶ | skipping to change at page 10, line 46 ¶ | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| ^ ^ | ^ ^ | |||
| | | | | | | |||
| +-------+ | +-------+ | |||
| |NSH ECN| | |NSH ECN| | |||
| | field | | | field | | |||
| +-------+ | +-------+ | |||
| Figure 4. NSH Base Header | Figure 4. NSH Base Header | |||
| Note to RFC Editor: The above figure should be adjusted based on the | RFC Editor NOTE: The above figure should be adjusted based on the | |||
| bits assigned by IANA (see Section 5) and this note deleted. | bits assigned by IANA (see Section 5) and this note deleted. | |||
| Table 1 shows the meaning of the code points in the NSH ECN field. | Table 1 shows the meaning of the code points in the NSH ECN field. | |||
| These have the same meaning as the ECN field code points in the IPv4 | These have the same meaning as the ECN field code points in the IPv4 | |||
| or IPv6 header as defined in [RFC3168]. | or IPv6 header as defined in [RFC3168]. | |||
| Binary Name Meaning | Binary Name Meaning | |||
| ------ ------- -------------------------------- | ------ ------- -------------------------------- | |||
| 00 Not-ECT Not ECN-Capable Transport | 00 Not-ECT Not ECN-Capable Transport | |||
| 01 ECT(1) ECN-Capable Transport | 01 ECT(1) ECN-Capable Transport | |||
| skipping to change at page 14, line 29 ¶ | skipping to change at page 14, line 29 ¶ | |||
| +-----------------+ | +-----------------+ | |||
| Figure 5. Packet in Transit | Figure 5. Packet in Transit | |||
| 3.2.1 At NSH Transit Nodes | 3.2.1 At NSH Transit Nodes | |||
| When a packet is received at an NSH based forwarding node such as an | When a packet is received at an NSH based forwarding node such as an | |||
| SFF, say N1, the outer transport encapsulation is removed and its ECN | SFF, say N1, the outer transport encapsulation is removed and its ECN | |||
| marking SHOULD be combined into the NSH ECN marking as specified in | marking SHOULD be combined into the NSH ECN marking as specified in | |||
| [RFC6040]. If this is not done, any congestion encountered at non-NSH | [RFC6040]. If this is not done, any congestion encountered at non-NSH | |||
| transit nodes between N1 and the next upstream NSH based forwarding | transit nodes between N1 and the previous upstream NSH based | |||
| node will be lost and not transmitted downstream. | forwarding node will be lost and not transmitted downstream. | |||
| The NSH forwarding node SHOULD use a recognized AQM algorithm | The NSH forwarding node SHOULD use a recognized AQM algorithm | |||
| [RFC7567] to detect congestion. If the NSH ECN field indicates ECT, | [RFC7567] to detect congestion. If the NSH ECN field indicates ECT, | |||
| it will probabilistically set the NSH ECN field to the Congestion | it will probabilistically set the NSH ECN field to the Congestion | |||
| Experienced (CE) value or, in cases of extreme congestion, drop the | Experienced (CE) value or, in cases of extreme congestion, drop the | |||
| packet. | packet. | |||
| When the NSH encapsulated packet is further encapsulated for | When the NSH encapsulated packet is further encapsulated for | |||
| transmission to the next SFF or SF, ECN marking behavior depends on | transmission to the next SFF or SF, ECN marking behavior depends on | |||
| whether or not the node that will decapsulate the outer header | whether or not the node that will decapsulate the outer header | |||
| skipping to change at page 15, line 15 ¶ | skipping to change at page 15, line 15 ¶ | |||
| 3.2.2 At an SF/Proxy | 3.2.2 At an SF/Proxy | |||
| If the SF is NSH and ECN-aware, the processing is essentially the | If the SF is NSH and ECN-aware, the processing is essentially the | |||
| same at the SF as at an SFF as discussed in Section 3.2.1. | same at the SF as at an SFF as discussed in Section 3.2.1. | |||
| If the SF is NSH-aware but ECN-unaware, then the SFF transmitting the | If the SF is NSH-aware but ECN-unaware, then the SFF transmitting the | |||
| packet to the SF will use Compatibility Mode. Congestion encountered | packet to the SF will use Compatibility Mode. Congestion encountered | |||
| in the SFF to SF and SF to SFF paths will be unmanaged. | in the SFF to SF and SF to SFF paths will be unmanaged. | |||
| If the SF is not NSH-aware, then an NSH proxy will be between the SFF | If the SF is not NSH-aware, then an NSH proxy will be between the SFF | |||
| and the SF to avoid exposure of the NSH to the SF that does not | and the SF to avoid exposure of the SF that does not understand NSHs | |||
| understand NSHs as shown in Figure 6. This is described in Section | to the NSH as shown in Figure 6. This is described in Section 4.6 of | |||
| 4.6 of [RFC7665]. The SF and proxy together look to the SFF like an | [RFC7665]. The SF and proxy together look to the SFF like an NSH- | |||
| NSH-aware SF. The behavior at the proxy and SF in this case is as | aware SF. The behavior at the proxy and SF in this case is as below: | |||
| below: | ||||
| If such a proxy is not ECN-aware then congestion in the entire | If such a proxy is not ECN-aware then congestion in the entire | |||
| path from SFF to proxy to SF back to proxy to SFF will be | path from SFF to proxy to SF back to proxy to SFF will be | |||
| unmanaged. | unmanaged. | |||
| | | | | |||
| v | v | |||
| +----------+ +---------+ | +----------+ +---------+ | |||
| | | +-------+ | NSH | | | | +-------+ | NSH | | |||
| | SFF +---->| NSH +---->|un-aware | | | SFF +---->| NSH +---->|un-aware | | |||
| skipping to change at page 15, line 44 ¶ | skipping to change at page 15, line 43 ¶ | |||
| | | | | |||
| v | v | |||
| Figure 6. Proxy for NSH Un-aware SF | Figure 6. Proxy for NSH Un-aware SF | |||
| If the proxy is ECN-aware, the proxy uses an AQM to indicate | If the proxy is ECN-aware, the proxy uses an AQM to indicate | |||
| congestion within the proxy in the NSH that it returns to the SFF. | congestion within the proxy in the NSH that it returns to the SFF. | |||
| The outer header used for the proxy-to-SF path uses Normal Mode. | The outer header used for the proxy-to-SF path uses Normal Mode. | |||
| The outer header used for the proxy-to-SFF path uses Normal Mode | The outer header used for the proxy-to-SFF path uses Normal Mode | |||
| based copying of the NSH ECN field to the outer header. Thus | based copying of the NSH ECN field to the outer header. Thus | |||
| congestion in the proxy will be managed. Congestion in the SF will | congestion in the proxy will be managed. | |||
| be managed only if the SF is ECN-aware and implements an AQM. | ||||
| Congestion in the SF will be managed only if the SF is ECN-aware | ||||
| and implements an AQM. | ||||
| 3.2.3 At Other Forwarding Nodes | 3.2.3 At Other Forwarding Nodes | |||
| Other forwarding nodes, that is non-NSH forwarding nodes between NSH | Other forwarding nodes, that is non-NSH forwarding nodes between NSH | |||
| forwarding nodes, such as IP or label switched routers, might also | forwarding nodes, such as IP or label switched routers, might also | |||
| contain potential bottlenecks. If so, they SHOULD implement an AQM | contain potential bottlenecks. If so, they SHOULD implement an AQM | |||
| algorithm to update the ECN marking in the outer transport header as | algorithm to update the ECN marking in the outer transport header as | |||
| specified in [RFC3168]. | specified in [RFC3168]. | |||
| 3.3 At Exit/Egress | 3.3 At Exit/Egress | |||
| skipping to change at page 16, line 20 ¶ | skipping to change at page 16, line 21 ¶ | |||
| accumulating statistics to send back to the ingress (see Section 4) | accumulating statistics to send back to the ingress (see Section 4) | |||
| or for other uses. If the packet being carried inside the NSH is IP, | or for other uses. If the packet being carried inside the NSH is IP, | |||
| when the NSH is removed the NSH ECN field MUST be combined with the | when the NSH is removed the NSH ECN field MUST be combined with the | |||
| IP ECN field as specified in Table 3 that was extracted from | IP ECN field as specified in Table 3 that was extracted from | |||
| [RFC6040]. This requirement applies to all egress nodes for the | [RFC6040]. This requirement applies to all egress nodes for the | |||
| domain in which NSH is being used to route traffic. | domain in which NSH is being used to route traffic. | |||
| +---------+---------------------------------------------+ | +---------+---------------------------------------------+ | |||
| |Arriving | Arriving Outer Header | | |Arriving | Arriving Outer Header | | |||
| | Inner +---------+-----------+-----------+-----------+ | | Inner +---------+-----------+-----------+-----------+ | |||
| | Header | Not-ECT | ECT(0) | ECT(1) | CE | | | Header | Not-ECT | ECT(0) | ECT(1) | CE | | |||
| +---------+---------+-----------+-----------+-----------+ | +---------+---------+-----------+-----------+-----------+ | |||
| | Not-ECT | Not-ECT |Not-ECT |Not-ECT | <drop> | | | Not-ECT | Not-ECT | Not-ECT | Not-ECT | <drop> | | |||
| | ECT(0) | ECT(0) | ECT(0) | ECT(0) | CE | | | ECT(0) | ECT(0) | ECT(0) | ECT(0) | CE | | |||
| | ECT(1) | ECT(1) | ECT(1) | ECT(1) | CE | | | ECT(1) | ECT(1) | ECT(1) | ECT(1) | CE | | |||
| | CE | CE | CE | CE | CE | | | CE | CE | CE | CE | CE | | |||
| +---------+---------+-----------+-----------+-----------+ | +---------+---------+-----------+-----------+-----------+ | |||
| Table 3. Exit ECN Fields Merger | Table 3. Exit ECN Fields Merger | |||
| All the egress nodes of the SFC domain MUST support Compliant ECN | All the egress nodes of the SFC domain MUST support Compliant ECN | |||
| Decapsulation as specified in this section. If this is not the case, | Decapsulation as specified in this section. If this is not the case, | |||
| the scheme described in this document will not work, and cannot be | the scheme described in this document will not work, and cannot be | |||
| used. | used. | |||
| 3.4 Conservation of Packets | 3.4 Congestion Statistics and the Conservation of Packets | |||
| The SFC specification permits an SF to absorb packets and to generate | The SFC specification permits an SF to absorb packets and to generate | |||
| new packets as well as simply processing and forwarding the packets | new packets as well as simply processing and forwarding the packets | |||
| it receives. Such actions might appear to be packet loss due to | it receives. Such actions might appear to be packet loss due to | |||
| congestion or might mask the loss of packets by generating additional | congestion or might mask the loss of packets by generating additional | |||
| packets. | packets. | |||
| The tunnel congestion feedback approach (Section 4) detects loss by | The tunnel congestion feedback approach (Section 4) can detect | |||
| counting payload bytes in at the ingress and counting them out at the | congestion in several ways. One way detects traffic loss by counting | |||
| egress. This does not work unless nodes conserve the amount of | payload packets and bytes in at the ingress and counting them out at | |||
| payload bytes. Therefore, it will not be possible to detect loss | the egress. This does not work unless nodes conserve the number of | |||
| using this technique if they are not conserved. | payload packets and/or bytes. Therefore, it will not be possible to | |||
| detect loss using this technique if traffic volume is not conserved | ||||
| by the service function chain processing that traffic. | ||||
| Nonetheless, if a bottleneck supports ECN marking, it will be | Nonetheless, if a bottleneck supports ECN marking, it will be | |||
| possible to detect the very high level of CE markings that are | possible to detect the high level of CE markings that are associated | |||
| associated with congestion that is so excessive that it leads to | with congestion at that bottleneck by looking at the ratio of CE- | |||
| loss. However, it will not be possible for the tunnel congestion | marked to non-CE-marked packets. However, it will not be possible for | |||
| feedback approach to detect any congestion, whether slight or severe, | the tunnel congestion feedback approach to detect any congestion, | |||
| if it occurs at a bottleneck that does not support ECN marking. | whether slight or severe, if it occurs at a bottleneck that does not | |||
| support ECN marking. | ||||
| 4. Tunnel Congestion Feedback Support | 4. Tunnel Congestion Feedback Support | |||
| The collection and storage of congestion information at the egress | The collection and storage of congestion information at the egress | |||
| may be useful for later analysis but, unless it can be fed back to a | may be useful for later analysis but, unless it can be fed back to a | |||
| point which can take action to reduce congestion, it will not be | point which can take action to reduce congestion, it will not be | |||
| useful in real time. Such congestion feedback to the ingress enables | useful in real time. Such congestion feedback to the ingress enables | |||
| it to take actions such as those listed in Section 1.3. | it to take actions such as those listed in Section 1.3. | |||
| IP Flow Information Export (IPFIX [RFC7011]) provides a standard for | IP Flow Information Export (IPFIX [RFC7011]) provides a standard for | |||
| communicating traffic flow statistics. As extended by this document, | communicating traffic flow statistics. As extended by this document, | |||
| IPFIX messages from the egress to the ingress are used to communicate | IPFIX messages from the egress to the ingress are used to communicate | |||
| the extent of congestion between an ingress and egress based on ECN | the extent of congestion between an ingress and egress based on ECN | |||
| marking in the NSH. | marking in the NSH. | |||
| 4.1 Congestion Level Measurement | 4.1 Congestion Level Measurements | |||
| The congestion level measurement is based on ECN marking in the NSH | The congestion level measurements are based on ECN marking in the NSH | |||
| and packet drop. In particular the congestion information includes | and packet drop. In particular the congestion information includes | |||
| the ratio of CE-marked packets to all packets and the ratio of | the ratio of CE-marked packets to all packets and the ratio of | |||
| dropped packets to all packets. | dropped packets to all packets. | |||
| If the congestion level is not high enough, the packets are marked as | If the congestion level is low enough, the packets are marked as CE | |||
| CE instead of being dropped, and then it is easy to calculate | instead of being dropped, and then it is easy to calculate congestion | |||
| congestion level according to the ratio of CE-marked packets. If the | level according to the ratio of CE-marked packets. If the congestion | |||
| congestion level is so high that ECT packets will be dropped, then | level is so high that ECT packets will be dropped, then the packet | |||
| the packet loss ratio could be calculated by comparing total packets | loss ratio could be calculated by comparing total packets entering | |||
| entering ingress and total packets arriving at egress over the same | ingress and total packets arriving at egress over the same span of | |||
| span of packets. If packet loss is detected, it can be assumed that | packets. If packet loss is detected for a flow that would preserve | |||
| severe congestion has occurred in the tunnel. | the number of packets in the absence of congestion, then it can be | |||
| assumed that severe congestion has occurred in the tunnel. | ||||
| The egress calculates the CE-marked packet ratio by counting packets | The egress calculates the CE-marked packet ratio by counting packets | |||
| with different ECN markings. The CE-marked packet ratio will be used | with different ECN markings. The CE-marked packet ratio will be used | |||
| as an indication of tunnel load level. It is assumed that nodes | as an indication of tunnel load level. It is assumed that nodes | |||
| between the ingress and egress will not drop packets biased towards | between the ingress and egress will not drop packets biased towards | |||
| certain ECN codepoints, so calculation of the CE-marked packet ratio is | certain ECN codepoints, so calculation of the CE-marked packet ratio is | |||
| not affected by packet drop. | not affected by packet drop. | |||
| The calculation of volumes of packet drop is by comparing the traffic | The calculation of the fraction of packets dropped is by comparing the | |||
| volumes between ingress and egress. | traffic volumes between ingress and egress. | |||
| Faked ECN-Capable Transport (ECT) is used at the ingress to defer | Faked ECN-Capable Transport (ECT) is used at the ingress to defer | |||
| packet loss to the egress. The basic idea of faked ECT is that, when | packet loss to the egress. The basic idea of faked ECT is that, when | |||
| encapsulating packets, the ingress first marks the tunnel outer | encapsulating packets, the ingress first marks the tunnel outer | |||
| header (NSH for an SFC domain) according to [RFC6040], and then | header (NSH for an SFC domain) according to [RFC6040], and then | |||
| remarks the outer header of Not-ECT packets as ECT. (ECT(0) and | remarks the outer header of Not-ECT packets as ECT. (ECT(0) and | |||
| ECT(1) are treated as the same.) Thus, as transmitted by the ingress | ECT(1) are treated as the same.) Thus, as transmitted by the ingress | |||
| node, there will be one of three combinations of outer header ECN | node, there will be one of three combinations of outer header ECN | |||
| field and inner header ECN field as follows: CE\|CE, ECT\|N-ECT, and | field and inner header ECN field as follows: CE\|CE, ECT\|N-ECT, and | |||
| ECT\|ECT (in the format of outer-ECN\|inner-ECN); when decapsulating | ECT\|ECT (in the format of outer-ECN\|inner-ECN); when decapsulating | |||
| skipping to change at page 19, line 22 ¶ | skipping to change at page 19, line 23 ¶ | |||
| according to faked ECT as described above. The ingress cumulatively | according to faked ECT as described above. The ingress cumulatively | |||
| counts packet bytes for three types of ECN combination (CE\|CE, ECT\|N- | counts packet bytes for three types of ECN combination (CE\|CE, ECT\|N- | |||
| ECT, and ECT\|ECT) and then the ingress regularly sends cumulative | ECT, and ECT\|ECT) and then the ingress regularly sends cumulative | |||
| byte-count messages for each type of ECN combination to the egress. | byte-count messages for each type of ECN combination to the egress. | |||
| When each message arrives at the egress, (1) the egress calculates | When each message arrives at the egress, (1) the egress calculates | |||
| the ratio of CE-marked packets; (2) the egress cumulatively counts | the ratio of CE-marked packets; (2) the egress cumulatively counts | |||
| packet bytes coming from the ingress and adds its own bytes counts of | packet bytes coming from the ingress and adds its own bytes counts of | |||
| each type of ECN combination (CE\|CE, ECT\|N-ECT, CE\|N-ECT, CE\|ECT, and | each type of ECN combination (CE\|CE, ECT\|N-ECT, CE\|N-ECT, CE\|ECT, and | |||
| ECT\|ECT) to the message for ingress to calculate packet loss. The | ECT\|ECT) to the message for ingress to calculate packet loss. The | |||
| egress feeds back the CE-marked packet ratio and bytes counts | egress feeds back the CE-marked packet ratio, packet loss ratio, | |||
| information to the ingress for evaluating congestion level in the | bytes counts information, and the like to the ingress as requested | |||
| tunnel. | for evaluating congestion level in the tunnel. | |||
| The counting of bytes can be at the granularity of all traffic from | The statistics can be at the granularity of all traffic from the | |||
| the ingress to the egress to learn about the overall congestion | ingress to the egress to learn about the overall congestion status of | |||
| status of the path between the ingress and the egress. The counting | the path between the ingress and the egress or at the granularity of | |||
| can also be at the granularity of individual customer's traffic or a | individual customer's traffic or a specific set of flows to learn | |||
| specific set of flows to learn about their congestion contribution. | about their congestion contribution. | |||
| For example, the tunnelEcnCEMarkedRatio field (specified below) | For example, the tunnelEcnCEMarkedRatio field (specified below) | |||
| indicates the fraction of traffic that has been marked in the ECN | indicates the fraction of traffic that has been marked in the ECN | |||
| field of the NSH as Congestion Experienced (CE). | field of the NSH as Congestion Experienced (CE). | |||
| 4.3 Congestion Information Delivery | 4.3 Congestion Information Delivery | |||
| As described above, the tunnel ingress needs to send messages | As described above, the tunnel ingress needs to send messages | |||
| containing cumulative bytes counts of packets of each type of ECN | containing cumulative bytes counts of packets of each type of ECN | |||
| combination to the tunnel egress, and the tunnel egress also needs to | combination to the tunnel egress, and the tunnel egress also needs to | |||
| feed back messages with cumulative bytes counts of packets of each | feed back messages with cumulative bytes counts of packets of each | |||
| type of ECN combination and the CE-marked packet ratio to the | type of ECN combination and the CE-marked packet ratio to the | |||
| ingress. This section specifies how the messages should be conveyed. | ingress. This section specifies how the messages are conveyed. | |||
| IPFIX recommends, but does not require, use of SCTP [RFC4960] in | IPFIX recommends, but does not require, use of SCTP [RFC4960] in | |||
| partial reliability mode [RFC3758] for the transport of its messages. | partial reliability mode [RFC3758] for the transport of its messages. | |||
| This mode allows loss of some packets, which is tolerable because | This mode allows loss of some packets, which is tolerable because | |||
| IPFIX communicates cumulative statistics. IPFIX over SCTP over IP | IPFIX communicates cumulative statistics. IPFIX over SCTP over IP | |||
| SHOULD be used directly where there is IP connectivity between the | SHOULD be used directly where there is IP connectivity between the | |||
| ingress and egress; however, there might be different transport | ingress and egress; however, there might be different transport | |||
| protocols or address spaces used in different regions of an SFC | protocols or address spaces used in different regions of an SFC | |||
| domain that make such direct IP connectivity problematic. The NSH | domain that make such direct IP connectivity problematic. The NSH | |||
| provides the general method of routing traffic within an SFC domain | provides the general method of routing traffic within an SFC domain | |||
| so the encapsulation of the required IPFIX traffic in NSH MUST be | so the encapsulation of the required IPFIX traffic in NSH MUST be | |||
| implemented and, when IP connectivity is not available, IPFIX over | implemented and, when IP connectivity is not available, IPFIX over | |||
| NSH SHOULD be used along with configuration of appropriate SFC paths | NSH SHOULD be used along with configuration of appropriate SFC paths | |||
| for the IPFIX over NSH traffic. | for the IPFIX over NSH traffic. | |||
| IPFIX messages could travel along the same path as network data | IPFIX messages could travel along the same path as network data | |||
| traffic. In any case, an IPFIX message packet may get lost in case of | traffic. In any case, an IPFIX message packet may get lost in case of | |||
| network congestion. Even though the missing information could be | network congestion. Even though the missing information could be | |||
| recovered because of the use of cumulative counts, the message SHOULD | recovered because of the use of cumulative counts, the message SHOULD | |||
| be transmitted at a higher priority than users' traffic flows. | be transmitted at a higher priority than users' traffic flows to | |||
| improve the promptness of congestion information feedback. | ||||
| The ingress node can do congestion management at different | The ingress node can do congestion management at different | |||
| granularity which means both the overall aggregated inner tunnel | granularity which means both the overall aggregated inner tunnel | |||
| congestion level and congestion level contributed by certain traffic | congestion level and congestion level contributed by certain traffic | |||
| flows could be measured for different congestion management purpose. | flows could be measured for different congestion management purposes. | |||
| For example, if the ingress only wants to limit congestion volume | For example, if the ingress only wants to limit congestion volume | |||
| caused by certain traffic flows, such as UDP-based traffic, then | caused by certain traffic flows, such as UDP-based traffic, then | |||
| congestion volume for that traffic will be fed back; or if the | congestion volume for that traffic can be fed back; or if the ingress | |||
| ingress is doing overall congestion management, the aggregated | is doing overall congestion management, the aggregated congestion | |||
| congestion volume will be fed back. | volume can be fed back. | |||
| When sending IPFIX messages from ingress to egress, the ingress acts | When sending IPFIX messages from ingress to egress, the ingress acts | |||
| as IPFIX exporter and egress acts as IPFIX collector; when feedback | as IPFIX exporter and the egress acts as IPFIX collector; when | |||
| congestion level information from egress to ingress, then the egress | feeding back congestion level information from egress to ingress, | |||
| acts as IPFIX exporter and ingress acts as IPFIX collector. | then the egress acts as IPFIX exporter and ingress acts as IPFIX | |||
| collector. | ||||
| The combination of congestion level measurement and congestion | The combination of congestion level measurement and congestion | |||
| information delivery procedures should be as follows: | information delivery procedures is as follows: | |||
| o The ingress node determines the IPFIX template record to be used. | o The ingress node determines the IPFIX template record to be used. | |||
| The template record can be pre-configured or determined at | The template record can be pre-configured or determined at | |||
| runtime, the content of the template record will be determined | runtime, the content of the template record will be determined | |||
| according to the granularity of congestion management; if the | according to the granularity of congestion management; if the | |||
| ingress wants to limit congestion volume contributed by specific | ingress wants to limit congestion volume contributed by specific | |||
| traffic flows then the elements such as source IP address, | traffic flows then the elements such as source IP address, | |||
| destination IP address, flow ID and CE-marked packet volume of the | destination IP address, flow ID and CE-marked packet volume of the | |||
| flows, etc., will be included in the template record. | flows, etc., will be included in the template record. | |||
| skipping to change at page 21, line 29 ¶ | skipping to change at page 21, line 32 ¶ | |||
| Description: Network Service Header [RFC8300] Service Path | Description: Network Service Header [RFC8300] Service Path | |||
| Identifier. This is a 24-bit value which is left justified in | Identifier. This is a 24-bit value which is left justified in | |||
| the Information Element. The low order byte MUST be sent as | the Information Element. The low order byte MUST be sent as | |||
| zero and ignored on receipt. | zero and ignored on receipt. | |||
| Abstract Data Type: unsigned32 | Abstract Data Type: unsigned32 | |||
| Data Type Semantics: identifier | Data Type Semantics: identifier | |||
| ElementId: tbd0 | ElementId: TBD0 | |||
| Status: current | Status: current | |||
| 4.3.2 tunnelEcnCeCeByteTotalCount | 4.3.2 tunnelEcnCeCeByteTotalCount | |||
| Description: The total number of bytes of incoming packets withthe | Description: The total number of bytes of incoming packets with | |||
| CE|CE ECN marking combination at the Observation Point since | the CE|CE ECN marking combination at the Observation Point | |||
| the Metering Process (re-)initialization for this Observation | since the Metering Process (re-)initialization for this | |||
| Point. | Observation Point. | |||
| Abstract Data Type: unsigned64 | Abstract Data Type: unsigned64 | |||
| Data Type Semantics: totalCounter | Data Type Semantics: totalCounter | |||
| ElementId: TBD1 | ElementId: TBD1 | |||
| Statues: current | Statues: current | |||
| Units: bytes | Units: bytes | |||
| 4.3.3 tunnelEcnEctNectBytetTotalCount | 4.3.3 tunnelEcnEctNectBytetTotalCount | |||
| Description: The total number of bytes of incoming packets with | Description: The total number of bytes of incoming packets with | |||
| the ECT|N-ECT ECN marking combination (ECT(0) and ECT(1) are | the ECT|N-ECT ECN marking combination (ECT(0) and ECT(1) are | |||
| treated the same as each other) at the Observation Point since | treated the same as each other) at the Observation Point since | |||
| the Metering Process (re-)initialization for this Observation | the Metering Process (re-)initialization for this Observation | |||
| skipping to change at page 24, line 7 ¶ | skipping to change at page 24, line 7 ¶ | |||
| Point. | Point. | |||
| Abstract Data Type: float32 | Abstract Data Type: float32 | |||
| ElementId: TBD6 | ElementId: TBD6 | |||
| Statues: current | Statues: current | |||
| 5. Example of Use | 5. Example of Use | |||
| This subsection provides an example of how the solution described in | This section provides an example of the solution described in this | |||
| this document could work. | document. | |||
| First of all, IPFIX template records are exchanged between ingress | First, IPFIX template records are exchanged between ingress and | |||
| and egress to negotiate the format of data records. The example here | egress to negotiate the format of the data records to be exchanged. | |||
| is to measure the congestion level for the overall tunnel caused by | The example here is to measure the congestion level for the overall | |||
| all the traffic. After the negotiation is finished, the ingress sends | tunnel caused by all the traffic. After the negotiation is finished, | |||
| in-band messages to the egress containing the number of each kind of | the ingress sends in-band messages to the egress containing the | |||
| ECN-marked packets (i.e., CE|CE, ECT|N-ECT and ECT|ECT) received | number of each kind of ECN-marked packets (i.e., CE|CE, ECT|N-ECT and | |||
| before it sent the message. | ECT|ECT) received before it sent the message. | |||
| After the egress receives the message, the egress calculates the CE- | After the egress receives the message, the egress calculates the CE- | |||
| marked packet ratio and counts the number of different kinds of ECN- | marked packet ratio and counts the number of different kinds of ECN- | |||
| marking packets received before it received the message. Then the | marking packets received before it received the message. Then the | |||
| egress sends a feedback message containing the counts together with | egress sends a feedback message containing the counts together with | |||
| the information in the ingress's message back to the ingress. | the information in the ingress's message back to the ingress. | |||
| Figures 7 to 10 below show the example procedure between ingress and | Figures 7 to 10 below illustrate the example procedure between | |||
| egress. | ingress and egress. | |||
| +---------------------------------+----------------------+ | +---------------------------------+----------------------+ | |||
| |Set ID=2 Length=40 | | |Set ID=2 Length=40 | | |||
| |---------------------------------|----------------------| | |---------------------------------|----------------------| | |||
| |Template ID=256 Field Count=8 | | |Template ID=256 Field Count=8 | | |||
| |---------------------------------|----------------------| | |---------------------------------|----------------------| | |||
| |tunnelEcnCeCeByteTotalCount Field Length=8 | | |tunnelEcnCeCeByteTotalCount Field Length=8 | | |||
| |---------------------------------|----------------------| | |---------------------------------|----------------------| | |||
| |tunnelEcnEctNectByteTotalCount Field Length=8 | | |tunnelEcnEctNectByteTotalCount Field Length=8 | | |||
| |---------------------------------|----------------------| | |---------------------------------|----------------------| | |||
| skipping to change at page 26, line 5 ¶ | skipping to change at page 26, line 5 ¶ | |||
| +-+ | +-+ | |||
| |M| : Message Packet | |M| : Message Packet | |||
| +-+ | +-+ | |||
| +-+ | +-+ | |||
| |P| : User Packet | |P| : User Packet | |||
| +-+ | +-+ | |||
| Figure 9. Traffic flow Between Ingress and Egress | Figure 9. Traffic flow Between Ingress and Egress | |||
| Set ID=257, Length=28 | Set ID=257, Length=28 | |||
| +------+ A1 +------+ | +------+ A1 +-------+ | |||
| | | B1 | | | | | B1 | | | |||
| | | C1 | | | | | C1 | | | |||
| | | <----------------------------- | | | | | <----------------------------- | | | |||
| | | | | | | | | | | |||
| | | | | | | | | | | |||
| | | SetID=256, Length=72 | | | | | SetID=256, Length=72 | | | |||
| | | A1 | | | | | A1 | | | |||
| | | B1 | | | | | B1 | | | |||
| |egress| C1 ingress| | |egress| C1 |ingress| | |||
| | | A2 | | | | | A2 | | | |||
| | | B2 | | | | | B2 | | | |||
| | | C2 | | | | | C2 | | | |||
| | | D | | | | | D | | | |||
| | | E | | | | | E | | | |||
| | | R | | | | | R | | | |||
| | | ----------------------------> | | | | | ----------------------------> | | | |||
| | | | | | | | | | | |||
| +------+ +------+ | +------+ +-------+ | |||
| Figure 10. Messages Between Ingress and Egress | Figure 10. Messages Between Ingress and Egress | |||
| The following provides an example of how the tunnel congestion level | The following provides an example of how the tunnel congestion level | |||
| can be calculated (see Figure 10): | can be calculated (see Figure 10): | |||
| The congestion Level could be divided into two categories: (1) | The congestion Level could be divided into two categories: (1) | |||
| slight congestion (no packets dropped); (2) serious congestion | slight congestion (no packets dropped); (2) serious congestion | |||
| (packets are being dropped). | (packets are being dropped). | |||
| For slight congestion, the congestion level is indicated by the | For slight congestion, the congestion level is indicated by the | |||
| ratio of CE-marked packets: | ratio of CE-marked packets: | |||
| ce_marked = R; | ce_marked = R; | |||
| For serious congestion, the congestion level is indicated as the | For serious congestion, the congestion level is indicated as the | |||
| number of volume loss: | volume of traffic loss: | |||
| total_ingress = (A1 + B1 + C1) | total_ingress = (A1 + B1 + C1) | |||
| total_egress = (A2 + B2 + C2 + D + E) | total_egress = (A2 + B2 + C2 + D + E) | |||
| volume_loss = (total_ingress - total_egress) | volume_loss = (total_ingress - total_egress) | |||
| 6. IANA Considerations | 6. IANA Considerations | |||
| The following subsections provide IANA assignment considerations. | The following subsections provide IANA assignment considerations. | |||
| skipping to change at page 27, line 23 ¶ | skipping to change at page 27, line 23 ¶ | |||
| assignment as follows: | assignment as follows: | |||
| Bit Description Reference | Bit Description Reference | |||
| ---------- ----------- ----------------- | ---------- ----------- ----------------- | |||
| tbd(16-17) NSH ECN [this document] | tbd(16-17) NSH ECN [this document] | |||
| 6.2 IPFIX Information Element IDs | 6.2 IPFIX Information Element IDs | |||
| IANA is requested to assign IPFIX Information Element IDs as follows: | IANA is requested to assign IPFIX Information Element IDs as follows: | |||
| ElementID: tbd0 | ElementID: TBD0 | |||
| Name: nshServicePathID | Name: nshServicePathID | |||
| Data Type: unsigned32 | Data Type: unsigned32 | |||
| Data Type Semantics: identifier | Data Type Semantics: identifier | |||
| Status: current | Status: current | |||
| Description: The Network Service Header [RFC8300] Service Path | Description: The Network Service Header [RFC8300] Service Path | |||
| Identifier. | Identifier. | |||
| ElementID: TBD1 | ElementID: TBD1 | |||
| Name: tunnelEcnCeCePacketTotalCount | Name: tunnelEcnCeCePacketTotalCount | |||
| Data Type: unsigned64 | Data Type: unsigned64 | |||
| End of changes. 34 change blocks. | ||||
| 98 lines changed or deleted | 104 lines changed or added | |||
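The nshServicePathID description in Section 4.3.1 says the 24-bit Service Path Identifier is left justified in an unsigned32 Information Element, with the low order byte sent as zero and ignored on receipt. A minimal sketch of that packing follows. The function names `encode_spi` and `decode_spi` are hypothetical, and the sketch assumes the usual IPFIX network (big-endian) byte order for the 4-byte field.

```python
import struct


def encode_spi(spi: int) -> bytes:
    """Pack a 24-bit SPI left justified into a 4-byte field (hypothetical helper)."""
    if not 0 <= spi <= 0xFFFFFF:
        raise ValueError("SPI must fit in 24 bits")
    # Shifting left by 8 leaves the low order byte zero, as the text requires.
    return struct.pack("!I", spi << 8)


def decode_spi(field: bytes) -> int:
    """Recover the SPI from a 4-byte field, ignoring the low order byte."""
    (value,) = struct.unpack("!I", field)
    # Dropping the low 8 bits ignores whatever the sender put there.
    return value >> 8
```

For instance, `encode_spi(0x000102)` yields the four bytes `00 01 02 00`, and `decode_spi` returns the same SPI even if a sender left the last byte non-zero.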
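The congestion-level calculation at the end of Section 5 (see Figure 10) can be sketched in a few lines. This is only an illustration: the class and function names are hypothetical, the counters keep the letters used in Figure 10 (A1, B1, C1 reported by the ingress; A2, B2, C2, D, E and the ratio R reported by the egress), and treating any positive volume loss as "serious" congestion is an assumption that the draft does not state as a rule.

```python
from dataclasses import dataclass


@dataclass
class IngressCounts:
    # Counters carried in the ingress message (letters as in Figure 10).
    a1: int
    b1: int
    c1: int


@dataclass
class EgressCounts:
    # Counters and CE-marked ratio carried in the egress feedback message.
    a2: int
    b2: int
    c2: int
    d: int
    e: int
    r: float


def congestion_level(ingress: IngressCounts, egress: EgressCounts):
    """Return ("serious", volume_loss) or ("slight", ce_marked)."""
    total_ingress = ingress.a1 + ingress.b1 + ingress.c1
    total_egress = egress.a2 + egress.b2 + egress.c2 + egress.d + egress.e
    volume_loss = total_ingress - total_egress
    # Assumption: traffic missing at the egress means packets were dropped.
    if volume_loss > 0:
        return ("serious", volume_loss)
    # Otherwise the level is indicated by the CE-marked ratio R.
    return ("slight", egress.r)
```

With ingress counters of 1000, 5000 and 4000 and egress counters of 900, 4500, 3600, 300 and 200, the totals are 10000 and 9500, so the sketch reports serious congestion with a volume loss of 500.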