| < draft-ietf-bier-path-mtu-discovery-03.txt | draft-ietf-bier-path-mtu-discovery-04.txt > | |||
|---|---|---|---|---|
| BIER Working Group G. Mirsky | BIER Working Group G. Mirsky | |||
| Internet-Draft ZTE Corp. | Internet-Draft ZTE Corp. | |||
| Intended status: Standards Track T. Przygienda | Intended status: Standards Track T. Przygienda | |||
| Expires: July 21, 2018 Juniper Networks | Expires: December 21, 2018 Juniper Networks | |||
| A. Dolganow | A. Dolganow | |||
| Nokia | Nokia | |||
| January 17, 2018 | June 19, 2018 | |||
| Path Maximum Transmission Unit Discovery (PMTUD) for Bit Index Explicit | Path Maximum Transmission Unit Discovery (PMTUD) for Bit Index Explicit | |||
| Replication (BIER) Layer | Replication (BIER) Layer | |||
| draft-ietf-bier-path-mtu-discovery-03 | draft-ietf-bier-path-mtu-discovery-04 | |||
| Abstract | Abstract | |||
| This document describes Path Maximum Transmission Unit Discovery | This document describes Path Maximum Transmission Unit Discovery | |||
| (PMTUD) in Bit Indexed Explicit Replication (BIER) layer. | (PMTUD) in Bit Indexed Explicit Replication (BIER) layer. | |||
| Status of This Memo | Status of This Memo | |||
| This Internet-Draft is submitted in full conformance with the | This Internet-Draft is submitted in full conformance with the | |||
| provisions of BCP 78 and BCP 79. | provisions of BCP 78 and BCP 79. | |||
| skipping to change at page 1, line 35 ¶ | skipping to change at page 1, line 35 ¶ | |||
| Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
| Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
| working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
| Drafts is at https://datatracker.ietf.org/drafts/current/. | Drafts is at https://datatracker.ietf.org/drafts/current/. | |||
| Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
| and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
| time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
| material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
| This Internet-Draft will expire on July 21, 2018. | This Internet-Draft will expire on December 21, 2018. | |||
| Copyright Notice | Copyright Notice | |||
| Copyright (c) 2018 IETF Trust and the persons identified as the | Copyright (c) 2018 IETF Trust and the persons identified as the | |||
| document authors. All rights reserved. | document authors. All rights reserved. | |||
| This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
| Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
| (https://trustee.ietf.org/license-info) in effect on the date of | (https://trustee.ietf.org/license-info) in effect on the date of | |||
| publication of this document. Please review these documents | publication of this document. Please review these documents | |||
| skipping to change at page 2, line 18 ¶ | skipping to change at page 2, line 18 ¶ | |||
| 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 | |||
| 1.1. Conventions used in this document . . . . . . . . . . . . 3 | 1.1. Conventions used in this document . . . . . . . . . . . . 3 | |||
| 1.1.1. Terminology . . . . . . . . . . . . . . . . . . . . . 3 | 1.1.1. Terminology . . . . . . . . . . . . . . . . . . . . . 3 | |||
| 1.1.2. Requirements Language . . . . . . . . . . . . . . . . 3 | 1.1.2. Requirements Language . . . . . . . . . . . . . . . . 3 | |||
| 2. Problem Statement . . . . . . . . . . . . . . . . . . . . . . 3 | 2. Problem Statement . . . . . . . . . . . . . . . . . . . . . . 3 | |||
| 3. PMTUD Mechanism for BIER . . . . . . . . . . . . . . . . . . 4 | 3. PMTUD Mechanism for BIER . . . . . . . . . . . . . . . . . . 4 | |||
| 3.1. Data TLV for BIER Ping . . . . . . . . . . . . . . . . . 6 | 3.1. Data TLV for BIER Ping . . . . . . . . . . . . . . . . . 6 | |||
| 4. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 7 | 4. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 7 | |||
| 5. Security Considerations . . . . . . . . . . . . . . . . . . . 7 | 5. Security Considerations . . . . . . . . . . . . . . . . . . . 7 | |||
| 6. Acknowledgement . . . . . . . . . . . . . . . . . . . . . . . 7 | 6. Acknowledgment . . . . . . . . . . . . . . . . . . . . . . . 7 | |||
| 7. References . . . . . . . . . . . . . . . . . . . . . . . . . 7 | 7. References . . . . . . . . . . . . . . . . . . . . . . . . . 7 | |||
| 7.1. Normative References . . . . . . . . . . . . . . . . . . 7 | 7.1. Normative References . . . . . . . . . . . . . . . . . . 7 | |||
| 7.2. Informative References . . . . . . . . . . . . . . . . . 8 | 7.2. Informative References . . . . . . . . . . . . . . . . . 8 | |||
| Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 8 | Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 8 | |||
| 1. Introduction | 1. Introduction | |||
| In packet switched networks, when a host seeks to transmit data to a | In packet switched networks, when a host seeks to transmit data to a | |||
| target destination, the data is transmitted as a set of packets. In | target destination, the data is transmitted as a set of packets. In | |||
| many cases it is more efficient to use the largest size packets that | many cases, it is more efficient to use the largest size packets that | |||
| are less than or equal to the least Maximum Transmission Unit (MTU) | are less than or equal to the least Maximum Transmission Unit (MTU) | |||
| for any forwarding device along the routed path to the IP destination | for any forwarding device along the routed path to the IP destination | |||
| for these packets. Such "least MTU" is known as Path MTU (PMTU). | for these packets. Such "least MTU" is known as Path MTU (PMTU). | |||
| Fragmentation or packet drop, silent or not, may occur on hops along | Fragmentation or packet drop, silent or not, may occur on hops along | |||
| the route where a MTU is smaller than the size of the datagram. To | the route where an MTU is smaller than the size of the datagram. To | |||
| avoid any of the listed above behaviors, the packet source must find | avoid any of the listed above behaviors, the packet source must find | |||
| the value of the least MTU, i.e. PMTU, that will be encountered along | the value of the least MTU, i.e. PMTU, that will be encountered along | |||
| the route that a set of packets will follow to reach the given set of | the route that a set of packets will follow to reach the given set of | |||
| destinations. Such MTU determination along a specific path is | destinations. Such MTU determination along a specific path is | |||
| referred to as path MTU discovery (PMTUD). | referred to as path MTU discovery (PMTUD). | |||
| [RFC8279] introduces and explains Bit Index Explicit Replication | [RFC8279] introduces and explains Bit Index Explicit Replication | |||
| (BIER) architecture and how it supports forwarding of multicast data | (BIER) architecture and how it supports forwarding of multicast data | |||
| packets. A BIER domain consists of Bit-Forwarding Routers (BFRs) | packets. A BIER domain consists of Bit-Forwarding Routers (BFRs) | |||
| that are uniquely identified by their respective BFR-ids. An ingress | that are uniquely identified by their respective BFR-ids. An ingress | |||
| border router (acting as a Bit Forwarding Ingress Router (BFIR)) | border router (acting as a Bit Forwarding Ingress Router (BFIR)) | |||
| inserts a Forwarding Bit Mask (F-BM) into a packet. Each targeted | inserts a Forwarding Bit Mask (F-BM) into a packet. Each targeted | |||
| egress node (referred to as a Bit Forwarding Egress Router (BFER)) is | egress node (referred to as a Bit Forwarding Egress Router (BFER)) is | |||
| represented by Bit Mask Position (BMP) in the BMS. A transit or | represented by Bit Mask Position (BMP) in the BMS. A transit or | |||
| intermediate BIER node, referred as BFR, forwards BIER encapsulated | intermediate BIER node, referred to as BFR, forwards BIER | |||
| packets to BFERs, identified by respective BMPs, according to a Bit | encapsulated packets to BFERs, identified by respective BMPs, | |||
| Index Forwarding Table (BIFT). | according to a Bit Index Forwarding Table (BIFT). | |||
| 1.1. Conventions used in this document | 1.1. Conventions used in this document | |||
| 1.1.1. Terminology | 1.1.1. Terminology | |||
| BFR: Bit-Forwarding Router | BFR: Bit-Forwarding Router | |||
| BFER: Bit-Forwarding Egress Router | BFER: Bit-Forwarding Egress Router | |||
| BFIR: Bit-Forwarding Ingress Router | BFIR: Bit-Forwarding Ingress Router | |||
| skipping to change at page 3, line 47 ¶ | skipping to change at page 3, line 47 ¶ | |||
| [I-D.ietf-bier-oam-requirements] sets forth the requirement to define | [I-D.ietf-bier-oam-requirements] sets forth the requirement to define | |||
| PMTUD protocol for BIER domain. This document describes the | PMTUD protocol for BIER domain. This document describes the | |||
| extension to [I-D.ietf-bier-ping] for use in BIER PMTUD solution. | extension to [I-D.ietf-bier-ping] for use in BIER PMTUD solution. | |||
| Current PMTUD mechanisms ([RFC1191], [RFC8201], and [RFC4821]) are | Current PMTUD mechanisms ([RFC1191], [RFC8201], and [RFC4821]) are | |||
| primarily targeted to work on point-to-point, i.e. unicast paths. | primarily targeted to work on point-to-point, i.e. unicast paths. | |||
| These mechanisms use packet fragmentation control by disabling | These mechanisms use packet fragmentation control by disabling | |||
| fragmentation of the probe packet. As a result, a transient node | fragmentation of the probe packet. As a result, a transient node | |||
| that cannot forward a probe packet that is bigger than its link MTU | that cannot forward a probe packet that is bigger than its link MTU | |||
| sends to the packet source an error notification, otherwise the | sends to the packet source an error notification, otherwise the | |||
| packet destination may respond with a positive acknowledgement. | packet destination may respond with a positive acknowledgment. Thus, | |||
| Thus, possibly through a series of iterations, varying the size of | possibly through a series of iterations, varying the size of the | |||
| the probe packet, the packet source discovers the PMTU of the | probe packet, the packet source discovers the PMTU of the particular | |||
| particular path. | path. | |||
| Thus applied such existing PMTUD solutions are inefficient for point- | Thus applied such existing PMTUD solutions are inefficient for point- | |||
| to-multipoint paths constructed for multicast traffic. Probe packets | to-multipoint paths constructed for multicast traffic. Probe packets | |||
| must be flooded through the whole set of multicast distribution paths | must be flooded through the whole set of multicast distribution paths | |||
| over and over again until the very last egress responds with a | over and over again until the very last egress responds with a | |||
| positive acknowledgement. Consider without loss of generality an | positive acknowledgment. Consider without loss of generality an | |||
| example multicast network presented in Figure 1, where MTU on all | example multicast network presented in Figure 1, where MTU on all | |||
| links but one (B,D) is the same. If MTU on link (B,D) is smaller | links but one (B, D) is the same. If MTU on the link (B, D) is | |||
| than the MTU on the other links, using existing PMTUD mechanism | smaller than the MTU on the other links, using existing PMTUD | |||
| probes will unnecessary flood to leaf nodes E, F, and G for the | mechanism probes will unnecessary flood to leaf nodes E, F, and G for | |||
| second and consecutive times and positive responses will be generated | the second and consecutive times and positive responses will be | |||
| and received by root A repeatedly. | generated and received by root A repeatedly. | |||
| ----- | ----- | |||
| --| D | | --| D | | |||
| ----- / ----- | ----- / ----- | |||
| --| B |-- | --| B |-- | |||
| / ----- \ ----- | / ----- \ ----- | |||
| / --| E | | / --| E | | |||
| ----- / ----- | ----- / ----- | |||
| | A |--- ----- | | A |--- ----- | |||
| ----- \ --| F | | ----- \ --| F | | |||
| skipping to change at page 4, line 37 ¶ | skipping to change at page 4, line 37 ¶ | |||
| --| C |-- | --| C |-- | |||
| ----- \ ----- | ----- \ ----- | |||
| --| G | | --| G | | |||
| ----- | ----- | |||
| Figure 1: Multicast network | Figure 1: Multicast network | |||
| 3. PMTUD Mechanism for BIER | 3. PMTUD Mechanism for BIER | |||
| A BFIR selects a set of BFERs for the specific multicast | A BFIR selects a set of BFERs for the specific multicast | |||
| distribution. Such a BFIR determines, by explicitly controlling | distribution. Such a BFIR determines, by explicitly controlling a | |||
| subset of targeted BFERs and transmitting series of probe packets, | subset of targeted BFERs and transmitting series of probe packets, | |||
| the MTU of that multicast distribution tree. The critical step is | the MTU of that multicast distribution tree. In case of ECMP, BFIR | |||
| that in case of failure at an intermediate BFR to forward towards the | MAY test each path by variating the value in Entropy field. The | |||
| subset of targeted downstream BFERs, the BFR responds with a partial | critical step is that in case of failure at an intermediate BFR to | |||
| (compared to the one it received in the request) bitmask towards the | forward towards the subset of targeted downstream BFERs, the BFR | |||
| originating BFIR in error notification. That allows for | responds with a partial (compared to the one it received in the | |||
| retransmission of the next probe with smaller MTU address only | request) bitmask towards the originating BFIR in error notification. | |||
| towards the failed downstream BFERs instead of all BFERs addressed in | That allows for retransmission of the next probe with smaller MTU | |||
| the previous probe. In the scenario discussed in Section 2 the | address only towards the failed downstream BFERs instead of all BFERs | |||
| second and all following (if needed) probes will be sent only to the | addressed in the previous probe. In the scenario discussed in | |||
| node D since MTU discovery of E, F, and G has been completed already | Section 2 the second and all following (if needed) probes will be | |||
| by the first probe successfully. | sent only to the node D since MTU discovery of E, F, and G has been | |||
| completed already by the first probe successfully. | ||||
| [I-D.ietf-bier-ping] introduced BIER Ping as a transport-independent | [I-D.ietf-bier-ping] introduced BIER Ping as a transport-independent | |||
| OAM mechanism to detect and localize failures in the BIER data plane. | OAM mechanism to detect and localize failures in the BIER data plane. | |||
| This document specifies how BIER Ping can be used to perform | This document specifies how BIER Ping can be used to perform | |||
| efficient PMTUD in the BIER domain. | efficient PMTUD in the BIER domain. | |||
| Consider the network displayed in Figure 1 to be presentation of a | Consider the network displayed in Figure 1 to be a presentation of a | |||
| BIER domain and all nodes to be BFRs. To discover MTU over BIER | BIER domain and all nodes to be BFRs. To discover MTU over BIER | |||
| domain to BFERs D, F, E, and G BFIR A will use BIER Ping with Data | domain to BFERs D, F, E, and G BFIR A will use BIER Ping with Data | |||
| TLV, defined in Section 3.1. Size of the first probe set to M_max | TLV, defined in Section 3.1. Size of the first probe set to M_max | |||
| determined as minimal MTU value of BFIR's links to BIER domain. As | determined as minimal MTU value of BFIR's links to BIER domain. As | |||
| has been assumed in Section 2, MTUs of all links but link (B,D) are | has been assumed in Section 2, MTUs of all links but the link (B, D) | |||
| the same. Thus BFERs E. F, and G would receive BIER Echo Request | are the same. Thus BFERs E, F, and G would receive BIER Echo Request | |||
| and will send their respective replies to BFIR A. BFR B may pass the | and will send their respective replies to BFIR A. BFR B may pass the | |||
| packet which is too large to forward over egress link (B, D) to the | packet which is too large to forward over egress link (B, D) to the | |||
| appropriate network layer for error processing where it would be | appropriate network layer for error processing where it would be | |||
| recognized as a BIER Echo Request packet. BFR B MUST send BIER Echo | recognized as a BIER Echo Request packet. BFR B MUST send BIER Echo | |||
| Reply to BFIR A and MUST include Downstream Mapping TLV, defined in | Reply to BFIR A and MUST include Downstream Mapping TLV, defined in | |||
| [I-D.ietf-bier-ping] setting its fields in the following fashion: | [I-D.ietf-bier-ping] setting its fields in the following fashion: | |||
| o MTU SHOULD be set to the minimal MTU value among all egress BIER | o MTU SHOULD be set to the minimal MTU value among all egress BIER | |||
| links, logical links between this and downstream BFRs, that could | links, logical links between this and downstream BFRs, that could | |||
| be used to reach B's downstream BFERs; | be used to reach B's downstream BFERs; | |||
| skipping to change at page 5, line 42 ¶ | skipping to change at page 5, line 42 ¶ | |||
| o I flag MUST be cleared; | o I flag MUST be cleared; | |||
| o Downstream Interface Address field (4 octets) MUST be zeroed and | o Downstream Interface Address field (4 octets) MUST be zeroed and | |||
| MUST include in the Egress Bitstring sub-TLV the list of all BFERs | MUST include in the Egress Bitstring sub-TLV the list of all BFERs | |||
| that cannot be reached because the attempted MTU turned out to be | that cannot be reached because the attempted MTU turned out to be | |||
| too small. | too small. | |||
| The BFIR will receive either of the two types of packets: | The BFIR will receive either of the two types of packets: | |||
| o a positive Echo Reply from one of BFERs to which the probe has | o a positive Echo Reply from one of BFERs to which the probe has | |||
| been sent. In this case the bit corresponding to the BFER MUST be | been sent. In this case, the bit corresponding to the BFER MUST | |||
| cleared from the BMS; | be cleared from the BMS; | |||
| o a negative Echo Reply with bit string listing unreached BFERs and | o a negative Echo Reply with bit string listing unreached BFERs and | |||
| recommended MTU value MTU'. The BFIR MUST add the bit string to | recommended MTU value MTU'. The BFIR MUST add the bit string to | |||
| its BMS and set size of the next probe as min(MTU, MTU') | its BMS and set the size of the next probe as min(MTU, MTU') | |||
| If upon expiration of the Echo Request timer BFIR didn't receive any | If upon expiration of the Echo Request timer BFIR didn't receive any | |||
| Echo Replies, then the size of the probe SHOULD be decreased. There | Echo Replies, then the size of the probe SHOULD be decreased. There | |||
| are scenarios when an implementation of the PMTUD would not decrease | are scenarios when an implementation of the PMTUD would not decrease | |||
| the size of the probe. For example, if upon expiration of the Echo | the size of the probe. For example, if upon expiration of the Echo | |||
| Request timer BFIR didn't receive any Echo Reply, then BFIR MAY | Request timer BFIR didn't receive any Echo Reply, then BFIR MAY | |||
| continue to retransmit the probe using the initial size and MAY apply | continue to retransmit the probe using the initial size and MAY apply | |||
| probe delay retransmission procedures. The algorithm used to delay | probe delay retransmission procedures. The algorithm used to delay | |||
| retransmission procedures on BFIR is outside the scope of this | retransmission procedures on BFIR is outside the scope of this | |||
| specification. The BFIR sends probes using BMS and locally defined | specification. The BFIR sends probes using BMS and locally defined | |||
| skipping to change at page 7, line 23 ¶ | skipping to change at page 7, line 23 ¶ | |||
| | TBA1 | Data | This document | | | TBA1 | Data | This document | | |||
| +-------+-------------+---------------+ | +-------+-------------+---------------+ | |||
| Table 1: Data TLV Type | Table 1: Data TLV Type | |||
| 5. Security Considerations | 5. Security Considerations | |||
| Routers that support PMTUD based on this document are subject to the | Routers that support PMTUD based on this document are subject to the | |||
| same security considerations as defined in [I-D.ietf-bier-ping] | same security considerations as defined in [I-D.ietf-bier-ping] | |||
| 6. Acknowledgement | 6. Acknowledgment | |||
| Authors greatly appreciate thorough review and the most detailed | Authors greatly appreciate thorough review and the most detailed | |||
| comments by Eric Gray. | comments by Eric Gray. | |||
| 7. References | 7. References | |||
| 7.1. Normative References | 7.1. Normative References | |||
| [I-D.ietf-bier-ping] | [I-D.ietf-bier-ping] | |||
| Kumar, N., Pignataro, C., Akiya, N., Zheng, L., Chen, M., | Kumar, N., Pignataro, C., Akiya, N., Zheng, L., Chen, M., | |||
| and G. Mirsky, "BIER Ping and Trace", draft-ietf-bier- | and G. Mirsky, "BIER Ping and Trace", draft-ietf-bier- | |||
| ping-02 (work in progress), July 2017. | ping-03 (work in progress), January 2018. | |||
| [RFC1191] Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191, | [RFC1191] Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191, | |||
| DOI 10.17487/RFC1191, November 1990, | DOI 10.17487/RFC1191, November 1990, | |||
| <https://www.rfc-editor.org/info/rfc1191>. | <https://www.rfc-editor.org/info/rfc1191>. | |||
| [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | |||
| Requirement Levels", BCP 14, RFC 2119, | Requirement Levels", BCP 14, RFC 2119, | |||
| DOI 10.17487/RFC2119, March 1997, | DOI 10.17487/RFC2119, March 1997, | |||
| <https://www.rfc-editor.org/info/rfc2119>. | <https://www.rfc-editor.org/info/rfc2119>. | |||
| skipping to change at page 8, line 17 ¶ | skipping to change at page 8, line 17 ¶ | |||
| DOI 10.17487/RFC8201, July 2017, | DOI 10.17487/RFC8201, July 2017, | |||
| <https://www.rfc-editor.org/info/rfc8201>. | <https://www.rfc-editor.org/info/rfc8201>. | |||
| 7.2. Informative References | 7.2. Informative References | |||
| [I-D.ietf-bier-oam-requirements] | [I-D.ietf-bier-oam-requirements] | |||
| Mirsky, G., Nordmark, E., Pignataro, C., Kumar, N., | Mirsky, G., Nordmark, E., Pignataro, C., Kumar, N., | |||
| Aldrin, S., Zheng, L., Chen, M., Akiya, N., and S. | Aldrin, S., Zheng, L., Chen, M., Akiya, N., and S. | |||
| Pallagatti, "Operations, Administration and Maintenance | Pallagatti, "Operations, Administration and Maintenance | |||
| (OAM) Requirements for Bit Index Explicit Replication | (OAM) Requirements for Bit Index Explicit Replication | |||
| (BIER) Layer", draft-ietf-bier-oam-requirements-04 (work | (BIER) Layer", draft-ietf-bier-oam-requirements-05 (work | |||
| in progress), July 2017. | in progress), January 2018. | |||
| [RFC8279] Wijnands, IJ., Ed., Rosen, E., Ed., Dolganow, A., | [RFC8279] Wijnands, IJ., Ed., Rosen, E., Ed., Dolganow, A., | |||
| Przygienda, T., and S. Aldrin, "Multicast Using Bit Index | Przygienda, T., and S. Aldrin, "Multicast Using Bit Index | |||
| Explicit Replication (BIER)", RFC 8279, | Explicit Replication (BIER)", RFC 8279, | |||
| DOI 10.17487/RFC8279, November 2017, | DOI 10.17487/RFC8279, November 2017, | |||
| <https://www.rfc-editor.org/info/rfc8279>. | <https://www.rfc-editor.org/info/rfc8279>. | |||
| Authors' Addresses | Authors' Addresses | |||
| Greg Mirsky | Greg Mirsky | |||
| End of changes. 20 change blocks. | ||||
| 42 lines changed or deleted | 43 lines changed or added | |||
This html diff was produced by rfcdiff 1.48. The latest version is available from http://tools.ietf.org/tools/rfcdiff/
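
The probing procedure described in Section 3 of the compared drafts can be illustrated with a small simulation. The sketch below is not part of either draft revision and is only one possible reading of the text: the topology mirrors Figure 1, the MTU values (1500 everywhere, 1400 on link (B, D)) are assumed for illustration, and the names `LINKS`, `leaves_below`, `forward_probe`, and `discover_pmtu` are hypothetical. Echo Request timers, ECMP Entropy variation, and the actual BIER Ping encoding are omitted.

```python
# Hypothetical topology mirroring Figure 1: BFR -> {downstream BFR: link MTU}.
# The MTU values are assumptions; only link (B, D) is smaller than the rest.
LINKS = {
    "A": {"B": 1500, "C": 1500},
    "B": {"D": 1400, "E": 1500},
    "C": {"F": 1500, "G": 1500},
}


def leaves_below(node):
    """Return the set of BFERs (leaf nodes) reachable through node."""
    children = LINKS.get(node)
    if not children:
        return {node}
    result = set()
    for child in children:
        result |= leaves_below(child)
    return result


def forward_probe(node, targets, size):
    """Simulate one probe; return (reached BFERs, negative replies).

    Each negative reply is (unreached BFERs, recommended MTU), loosely
    modeled on the Egress Bitstring and MTU fields described in Section 3.
    """
    children = LINKS.get(node)
    if not children:
        return ({node} if node in targets else set()), []
    reached, negatives = set(), []
    failed, used_mtus = set(), []
    for child, mtu in children.items():
        subset = targets & leaves_below(child)
        if not subset:
            continue  # no targeted BFER behind this egress link
        used_mtus.append(mtu)
        if size > mtu:
            failed |= subset  # probe too large for this link
            continue
        sub_reached, sub_negatives = forward_probe(child, subset, size)
        reached |= sub_reached
        negatives.extend(sub_negatives)
    if failed:
        # One negative Echo Reply per BFR: unreached BFERs plus the minimal
        # MTU among the egress links toward the targeted BFERs.
        negatives.append((failed, min(used_mtus)))
    return reached, negatives


def discover_pmtu(bfir, bfers, first_size):
    """Return (discovered MTU, list of (probe size, BFERs addressed))."""
    pending = set(bfers)  # plays the role of the BMS kept by the BFIR
    size = first_size
    rounds = []
    while pending:
        rounds.append((size, frozenset(pending)))
        reached, negatives = forward_probe(bfir, pending, size)
        if not reached and not negatives:
            raise RuntimeError("no Echo Reply received; timer handling omitted")
        pending -= reached  # positive Echo Reply clears the BFER's bit
        for unreached, mtu in negatives:
            pending |= unreached  # negative Echo Reply keeps these bits set
            size = min(size, mtu)  # next probe size: min(MTU, MTU')
    return size, rounds


pmtu, rounds = discover_pmtu("A", {"D", "E", "F", "G"}, 1500)
for probe_size, addressed in rounds:
    print(probe_size, sorted(addressed))
print("discovered MTU:", pmtu)
```

Under these assumed values the first probe addresses D, E, F, and G, and the second addresses only D, which matches the behavior the drafts describe for the Figure 1 scenario.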