| < draft-ietf-raw-oam-support-00.txt | draft-ietf-raw-oam-support-01.txt > | |||
|---|---|---|---|---|
| RAW F. Theoleyre | RAW F. Theoleyre | |||
| Internet-Draft CNRS | Internet-Draft CNRS | |||
| Updates: draft-theoleyre-raw-oam- G. Papadopoulos | Updates: draft-ietf-raw-oam-support-00 G. Papadopoulos | |||
| support-04 (if approved) IMT Atlantique | (if approved) IMT Atlantique | |||
| Intended status: Informational G. Mirsky | Intended status: Informational G. Mirsky | |||
| Expires: October 21, 2021 ZTE Corp. | Expires: November 25, 2021 ZTE Corp. | |||
| April 19, 2021 | CJ. Bernardos | |||
| UC3M | ||||
| May 24, 2021 | ||||
| Operations, Administration and Maintenance (OAM) features for RAW | Operations, Administration and Maintenance (OAM) features for RAW | |||
| draft-ietf-raw-oam-support-00 | draft-ietf-raw-oam-support-01 | |||
| Abstract | Abstract | |||
| Some critical applications may use a wireless infrastructure. | Some critical applications may use a wireless infrastructure. | |||
| However, wireless networks exhibit a bandwidth of several orders of | However, wireless networks exhibit a bandwidth of several orders of | |||
| magnitude lower than wired networks. Besides, wireless transmissions | magnitude lower than wired networks. Besides, wireless transmissions | |||
| are lossy by nature; the probability that a packet cannot be decoded | are lossy by nature; the probability that a packet cannot be decoded | |||
| correctly by the receiver may be quite high. In these conditions, | correctly by the receiver may be quite high. In these conditions, | |||
| guaranteeing the network infrastructure works properly is | guaranteeing that the network infrastructure works properly is | |||
| particularly challenging, since we need to address some issues | particularly challenging, since we need to address some issues | |||
| specific to wireless networks. This document lists the requirements | specific to wireless networks. This document lists the requirements | |||
| of the Operation, Administration, and Maintenance (OAM) features | of the Operation, Administration, and Maintenance (OAM) features | |||
| recommended to construct a predictable communication infrastructure | recommended to construct a predictable communication infrastructure | |||
| on top of a collection of wireless segments. This document describes | on top of a collection of wireless segments. This document describes | |||
| the benefits, problems, and trade-offs for using OAM in wireless | the benefits, problems, and trade-offs for using OAM in wireless | |||
| networks to achieve Service Level Objectives (SLO). | networks to achieve Service Level Objectives (SLO). | |||
| Status of This Memo | Status of This Memo | |||
| skipping to change at page 1, line 45 ¶ | skipping to change at page 1, line 47 ¶ | |||
| Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
| Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
| working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
| Drafts is at https://datatracker.ietf.org/drafts/current/. | Drafts is at https://datatracker.ietf.org/drafts/current/. | |||
| Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
| and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
| time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
| material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
| This Internet-Draft will expire on October 21, 2021. | This Internet-Draft will expire on November 25, 2021. | |||
| Copyright Notice | Copyright Notice | |||
| Copyright (c) 2021 IETF Trust and the persons identified as the | Copyright (c) 2021 IETF Trust and the persons identified as the | |||
| document authors. All rights reserved. | document authors. All rights reserved. | |||
| This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
| Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
| (https://trustee.ietf.org/license-info) in effect on the date of | (https://trustee.ietf.org/license-info) in effect on the date of | |||
| publication of this document. Please review these documents | publication of this document. Please review these documents | |||
| skipping to change at page 2, line 25 ¶ | skipping to change at page 2, line 25 ¶ | |||
| to this document. Code Components extracted from this document must | to this document. Code Components extracted from this document must | |||
| include Simplified BSD License text as described in Section 4.e of | include Simplified BSD License text as described in Section 4.e of | |||
| the Trust Legal Provisions and are provided without warranty as | the Trust Legal Provisions and are provided without warranty as | |||
| described in the Simplified BSD License. | described in the Simplified BSD License. | |||
| Table of Contents | Table of Contents | |||
| 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 | |||
| 1.1. Terminology . . . . . . . . . . . . . . . . . . . . . . . 4 | 1.1. Terminology . . . . . . . . . . . . . . . . . . . . . . . 4 | |||
| 1.2. Acronyms . . . . . . . . . . . . . . . . . . . . . . . . 5 | 1.2. Acronyms . . . . . . . . . . . . . . . . . . . . . . . . 5 | |||
| 1.3. Requirements Language . . . . . . . . . . . . . . . . . . 5 | 1.3. Requirements Language . . . . . . . . . . . . . . . . . . 6 | |||
| 2. Role of OAM in RAW . . . . . . . . . . . . . . . . . . . . . 5 | 2. Role of OAM in RAW . . . . . . . . . . . . . . . . . . . . . 6 | |||
| 2.1. Link concept and quality . . . . . . . . . . . . . . . . 6 | 2.1. Link concept and quality . . . . . . . . . . . . . . . . 7 | |||
| 2.2. Broadcast Transmissions . . . . . . . . . . . . . . . . . 6 | 2.2. Broadcast Transmissions . . . . . . . . . . . . . . . . . 7 | |||
| 2.3. Complex Layer 2 Forwarding . . . . . . . . . . . . . . . 7 | 2.3. Complex Layer 2 Forwarding . . . . . . . . . . . . . . . 8 | |||
| 3. Operation . . . . . . . . . . . . . . . . . . . . . . . . . . 7 | 2.4. End-to-end delay . . . . . . . . . . . . . . . . . . . . 8 | |||
| 3.1. Information Collection . . . . . . . . . . . . . . . . . 7 | 3. Operation . . . . . . . . . . . . . . . . . . . . . . . . . . 8 | |||
| 3.2. Continuity Check . . . . . . . . . . . . . . . . . . . . 7 | 3.1. Information Collection . . . . . . . . . . . . . . . . . 8 | |||
| 3.3. Connectivity Verification . . . . . . . . . . . . . . . . 7 | 3.2. Continuity Check . . . . . . . . . . . . . . . . . . . . 9 | |||
| 3.4. Route Tracing . . . . . . . . . . . . . . . . . . . . . . 8 | 3.3. Connectivity Verification . . . . . . . . . . . . . . . . 9 | |||
| 3.5. Fault Verification/detection . . . . . . . . . . . . . . 8 | 3.4. Route Tracing . . . . . . . . . . . . . . . . . . . . . . 9 | |||
| 3.6. Fault Isolation/identification . . . . . . . . . . . . . 8 | 3.5. Fault Verification/detection . . . . . . . . . . . . . . 9 | |||
| 4. Administration . . . . . . . . . . . . . . . . . . . . . . . 9 | 3.6. Fault Isolation/identification . . . . . . . . . . . . . 10 | |||
| 4.1. Worst-case metrics . . . . . . . . . . . . . . . . . . . 9 | 4. Administration . . . . . . . . . . . . . . . . . . . . . . . 10 | |||
| 4.2. Efficient data retrieval . . . . . . . . . . . . . . . . 10 | 4.1. Worst-case metrics . . . . . . . . . . . . . . . . . . . 11 | |||
| 5. Maintenance . . . . . . . . . . . . . . . . . . . . . . . . . 10 | 4.2. Efficient data retrieval . . . . . . . . . . . . . . . . 11 | |||
| 5.1. Dynamic Resource Reservation . . . . . . . . . . . . . . 11 | 4.3. Reporting OAM packets to the source . . . . . . . . . . . 12 | |||
| 5.2. Reliable Reconfiguration . . . . . . . . . . . . . . . . 11 | 5. Maintenance . . . . . . . . . . . . . . . . . . . . . . . . . 12 | |||
| 6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 11 | 5.1. Soft transition after reconfiguration . . . . . . . . . . 12 | |||
| 7. Security Considerations . . . . . . . . . . . . . . . . . . . 11 | 5.2. Predictive maintenance . . . . . . . . . . . . . . . . . 12 | |||
| 8. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 11 | 6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 13 | |||
| 9. Informative References . . . . . . . . . . . . . . . . . . . 11 | 7. Security Considerations . . . . . . . . . . . . . . . . . . . 13 | |||
| Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 13 | 8. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 13 | |||
| 9. Informative References . . . . . . . . . . . . . . . . . . . 13 | ||||
| Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 14 | ||||
| 1. Introduction | 1. Introduction | |||
| Reliable and Available Wireless (RAW) is an effort that extends | Reliable and Available Wireless (RAW) is an effort that extends | |||
| DetNet to approach end-to-end deterministic performances over a | DetNet to approach end-to-end deterministic performances over a | |||
| network that includes scheduled wireless segments. In wired | network that includes scheduled wireless segments. In wired | |||
| networks, many approaches try to enable Quality of Service (QoS) by | networks, many approaches try to enable Quality of Service (QoS) by | |||
| implementing traffic differentiation so that routers handle each type | implementing traffic differentiation so that routers handle each type | |||
| of packets differently. However, this differentiated treatment was | of packets differently. However, this differentiated treatment was | |||
| expensive for most applications. | expensive for most applications. | |||
| Deterministic Networking (DetNet) [RFC8655] has proposed to provide a | Deterministic Networking (DetNet) [RFC8655] has proposed to provide a | |||
| bounded end-to-end latency on top of the network infrastructure, | bounded end-to-end latency on top of the network infrastructure, | |||
| comprising both Layer 2 bridged and Layer 3 routed segments. Their | comprising both Layer 2 bridged and Layer 3 routed segments. Their | |||
| work encompasses the data plane, OAM, time synchronization, | work encompasses the data plane, OAM, time synchronization, | |||
| management, control, and security aspects. | management, control, and security aspects. | |||
| However, wireless networks create specific challenges. First of all, | However, wireless networks create specific challenges. First of all, | |||
| radio bandwidth is significantly lower than for wired networks. In | radio bandwidth is significantly lower than for wired networks. In | |||
| these conditions, the volume of signaling messages has to be very | these conditions, the volume of signaling messages has to be very | |||
| limited. Even worse, wireless links are lossy: a layer 2 | limited. Even worse, wireless links are lossy: a Layer 2 | |||
| transmission may or may not be decoded correctly by the receiver, | transmission may or may not be decoded correctly by the receiver, | |||
| depending on a broad set of parameters. Thus, providing high | depending on a broad set of parameters. Thus, providing high | |||
| reliability through wireless segments is particularly challenging. | reliability through wireless segments is particularly challenging. | |||
| Wired networks rely on the concept of _links_. All the devices | Wired networks rely on the concept of _links_. All the devices | |||
| attached to a link receive any transmission. The concept of a link | attached to a link receive any transmission. The concept of a link | |||
| in wireless networks is somewhat different from what many are used to | in wireless networks is somewhat different from what many are used to | |||
| in wireline networks. A receiver may or may not receive a | in wireline networks. A receiver may or may not receive a | |||
| transmission, depending on the presence of a colliding transmission, | transmission, depending on the presence of a colliding transmission, | |||
| the radio channel's quality, and the external interference. Besides, | the radio channel's quality, and the external interference. Besides, | |||
| skipping to change at page 4, line 9 ¶ | skipping to change at page 4, line 9 ¶ | |||
| primary importance for IP networks [RFC7276]. It defines a toolset | primary importance for IP networks [RFC7276]. It defines a toolset | |||
| for fault detection, isolation, and performance measurement. | for fault detection, isolation, and performance measurement. | |||
| The primary purpose of this document is to detail the specific | The primary purpose of this document is to detail the specific | |||
| requirements of the OAM features recommended to construct a | requirements of the OAM features recommended to construct a | |||
| predictable communication infrastructure on top of a collection of | predictable communication infrastructure on top of a collection of | |||
| wireless segments. This document describes the benefits, problems, | wireless segments. This document describes the benefits, problems, | |||
| and trade-offs for using OAM in wireless networks to provide | and trade-offs for using OAM in wireless networks to provide | |||
| availability and predictability. | availability and predictability. | |||
| 1.1. Terminology | ||||
| In this document, the term OAM will be used according to its | In this document, the term OAM will be used according to its | |||
| definition specified in [RFC6291]. We expect to implement an OAM | definition specified in [RFC6291]. We expect to implement an OAM | |||
| framework in RAW networks to maintain a real-time view of the network | framework in RAW networks to maintain a real-time view of the network | |||
| infrastructure, and its ability to respect the Service Level | infrastructure, and its ability to respect the Service Level | |||
| Objectives (SLO), such as delay and reliability, assigned to each | Objectives (SLO), such as delay and reliability, assigned to each | |||
| data flow. | data flow. | |||
| 1.1. Terminology | ||||
| We re-use here the same terminology as [detnet-oam]: | We re-use here the same terminology as [detnet-oam]: | |||
| o OAM entity: a data flow to be controlled; | o OAM entity: a data flow to be monitored for defects and/or its | |||
| performance metrics measured.; | ||||
| o Maintenance End Point (MEP): OAM devices crossed when entering/ | o Maintenance End Point (MEP): OAM devices crossed when entering/ | |||
| exiting the network. In RAW, it corresponds mostly to the source | exiting the network. In RAW, it corresponds mostly to the source | |||
| or destination of a data flow. OAM message can be exchanges | or destination of a data flow. OAM message can be exchanged | |||
| between two MEPs; | between two MEPs; | |||
| o Maintenance Intermediate endPoint (MIP): OAM devices along the | o Maintenance Intermediate endPoint (MIP): an OAM system along the | |||
| flow; OAM messages can be exchanged between a MEP and a MIP; | flow; a MIP MAY respond to an OAM message generated by the MEP; | |||
| o control/data plane: while the control plane expects to configure | ||||
| and control the network (long-term), the data plane takes the | ||||
| individual decision; | ||||
| o passive / active methods (as defined in [RFC7799]): active methods | o control/management/data plane: the control and management planes | |||
| send additionnal control information (inserting novel fields, | are used to configure and control the network (long-term). The | |||
| generating novel control packets). Passive methods infer | data plane takes the individual decision. Relative to a data | |||
| information just by observing unmodified existing flows. | flow, the control and/or management plane can be out-of-band; | |||
| o active methods may implement one of these two strategies: | o Active measurement methods (as defined in [RFC7799]) modify a | |||
| normal data flow by inserting novel fields, injecting specially | ||||
| constructed test packets [RFC2544]). It is critical for the | ||||
| quality of information obtained using an active method that | ||||
| generated test packets are in-band with the monitored data flow. | ||||
| In other words, a test packet is required to cross the same | ||||
| network nodes and links and receive the same Quality of Service | ||||
| (QoS) treatment as a data packet. Active methods may implement | ||||
| one of these two strategies: | ||||
| * In-band: control information follows the same path as the data | * In-band: control information follows the same path as the data | |||
| packets. In other words, a failure in the data plane may | packets. In other words, a failure in the data plane may | |||
| prevent the control information to reach the destination (e.g., | prevent the control information to reach the destination (e.g., | |||
| end-device or controller). | end-device or controller). | |||
| * out-of-band: control information is sent separately from the | * out-of-band: control information is sent separately from the | |||
| data packets. Thus, the behavior of control vs. data packets | data packets. Thus, the behavior of control vs. data packets | |||
| may differ; | may differ; | |||
| o Passive measurement methods [RFC7799] infer information by | ||||
| observing unmodified existing flows. | ||||
| We also adopt the following terminology, which is particularly | We also adopt the following terminology, which is particularly | |||
| relevant for RAW segments. | relevant for RAW segments. | |||
| o piggybacking vs. dedicated control packets: control information | o piggybacking vs. dedicated control packets: control information | |||
| may be encapsulated in specific (dedicated) control packets. | may be encapsulated in specific (dedicated) control packets. | |||
| Alternatively, it may be piggybacked in existing data packets, | Alternatively, it may be piggybacked in existing data packets, | |||
| when the MTU is larger than the actual packet length. | when the MTU is larger than the actual packet length. | |||
| Piggybacking makes specifically sense in wireless networks: the | Piggybacking makes specifically sense in wireless networks, as the | |||
| cost (bandwidth and energy) is not linear with the packet size. | cost (bandwidth and energy) is not linear with the packet size. | |||
| o router-over vs. mesh under: a control packet is either forwarded | o router-over vs. mesh under: a control packet is either forwarded | |||
| directly to the layer-3 next hop (mesh under) or handled hop-by- | directly to the layer-3 next hop (mesh under) or handled hop-by- | |||
| hop by each router. While the latter option consumes more | hop by each router. While the latter option consumes more | |||
| resource, it allows to collect additionnal intermediary | resources, it allows to collect additionnal intermediary | |||
| information, particularly relevant in wireless networks. | information, particularly relevant in wireless networks. | |||
| o Defect: a temporary change in the network (e.g., a radio link | o Defect: a temporary change in the network (e.g., a radio link | |||
| which is broken due to a mobile obstacle); | which is broken due to a mobile obstacle); | |||
| o Fault: a definite change which may affect the network performance, | o Fault: a definite change which may affect the network performance, | |||
| e.g., a node runs out of energy. | e.g., a node runs out of energy. | |||
| o End-to-end delay: the time between the packet generation and its | ||||
| reception by the destination. | ||||
| 1.2. Acronyms | 1.2. Acronyms | |||
| OAM Operations, Administration, and Maintenance | OAM Operations, Administration, and Maintenance | |||
| DetNet Deterministic Networking | DetNet Deterministic Networking | |||
| SLO Service Level Objective | PSE Path Selection Engine [I-D.pthubert-raw-architecture] | |||
| QoS Quality of Service | QoS Quality of Service | |||
| SNMP Simple Network Management Protocol | RAW Reliable and Available Wireless | |||
| SLO Service Level Objective | ||||
| SNMP Simple Network Management Protocol | ||||
| SDN Software-Defined Network | SDN Software-Defined Network | |||
| 1.3. Requirements Language | 1.3. Requirements Language | |||
| The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | |||
| "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and | "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and | |||
| "OPTIONAL" in this document are to be interpreted as described in BCP | "OPTIONAL" in this document are to be interpreted as described in BCP | |||
| 14 [RFC2119] [RFC8174] when, and only when, they appear in all | 14 [RFC2119] [RFC8174] when, and only when, they appear in all | |||
| capitals, as shown here. | capitals, as shown here. | |||
| 2. Role of OAM in RAW | 2. Role of OAM in RAW | |||
| RAW networks expect to make the communications reliable and | RAW networks expect to make the communications reliable and | |||
| predictable on top of a wireless network infrastructure. Most | predictable on top of a wireless network infrastructure. Most | |||
| critical applications will define an SLO to be required for the data | critical applications will define an SLO to be required for the data | |||
| flows it generates. RAW considers network plane protocol elements | flows it generates. RAW considers network plane protocol elements | |||
| such as OAM to improve the RAW operation at the service and the | such as OAM to improve the RAW operation at the service and the | |||
| forwarding sub-layers. | forwarding sub-layers. | |||
| To respect strict guarantees, RAW relies on an orchestrator able to | To respect strict guarantees, RAW relies on the Path Selection Engine | |||
| monitor and maintain the network. Typically, a Software-Defined | (PSE) (as defined in [I-D.pthubert-raw-architecture] to monitor and | |||
| Network (SDN) controller is in charge of scheduling the transmissions | maintain the network. As an example, a Software-Defined Network | |||
| in the deployed network, based on the radio link characteristics, SLO | (SDN) controller may be used to schedule the transmissions in the | |||
| of the flows, the number of packets to forward. Thus, resources have | deployed network, based on the radio link characteristics, SLO of the | |||
| to be provisioned a priori to handle any defect. OAM represents the | flows, the number of packets to forward. Thus, resources have to be | |||
| core of the pre-provisioning process and maintains the network | provisioned a priori to handle any defect. OAM represents the core | |||
| operational by updating the schedule dynamically. | of the pre-provisioning process and maintains the network operational | |||
| by updating the schedule dynamically. | ||||
| Fault-tolerance also assumes that multiple paths have to be | Fault-tolerance also assumes that multiple paths have to be | |||
| provisioned so that an end-to-end circuit keeps on existing whatever | provisioned so that an end-to-end circuit keeps on existing whatever | |||
| the conditions. The Packet Replication and Elimination Function | the conditions. The Packet Replication and Elimination Function | |||
| ([PREF-draft]) on a node is typically controlled by a central | ([PREF-draft]) on a node is typically controlled by the PSE. OAM | |||
| controller/orchestrator. OAM mechanisms can be used to monitor that | mechanisms can be used to monitor that PREOF is working correctly on | |||
| PREOF is working correctly on a node and within the domain. | a node and within the domain. | |||
| To be energy-efficient, reserving some dedicated out-of-band | To be energy-efficient, reserving some dedicated out-of-band | |||
| resources for OAM seems idealistic, and only in-band solutions are | resources for OAM seems idealistic, and only in-band solutions are | |||
| considered here. | considered here. | |||
| RAW supports both proactive and on-demand troubleshooting. | RAW supports both proactive and on-demand troubleshooting. | |||
| The specific characteristics of RAW are discussed below. | The specific characteristics of RAW are discussed below. | |||
| 2.1. Link concept and quality | 2.1. Link concept and quality | |||
| In wireless networks, a _link_ does not exist physically. A common | In wireless networks, a _link_ does not exist physically. A device | |||
| convention is to define a wireless link as a pair of devices that | has a set of *neighbors* that correspond to all the devices that have | |||
| have a non-null probability of exchanging a packet that the receiver | a non null probability of receiving correctly its packets. We make a | |||
| can decode. Similarly, we designate as *neighbor* any device with a | distinction between: | |||
| radio link with a specific transmitter. | ||||
| o point-to-point (p2p) link with one transmitter and one receiver. | ||||
| These links are used to transmit unicast packets. | ||||
| o point-to-multipoint (p2m) link associates one transmitter and a | ||||
| collection of receivers. For instance, broadcast packets assume | ||||
| the existence of p2m links to avoid duplicating a broadcast packet | ||||
| to reach each possible radio neighbor. | ||||
| In scheduled radio networks, p2m and p2p links are commonly not | ||||
| scheduled simultaneously to save energy. More precisely, only one | ||||
| part of the neighbors may wake-up at a given instant. | ||||
| Anycast are used in p2m links to improve the reliability. A | ||||
| collection of receivers are scheduled to wake-up simutaneously, so | ||||
| that the transmission fails only if none of the receivers is able to | ||||
| decode the packet. | ||||
| Each wireless link is associated with a link quality, often measured | Each wireless link is associated with a link quality, often measured | |||
| as the Packet Delivery Ratio (PDR), i.e., the probability that the | as the Packet Delivery Ratio (PDR), i.e., the probability that the | |||
| receiver can decode the packet correctly. It is worth noting that | receiver can decode the packet correctly. It is worth noting that | |||
| this link quality depends on many criteria, such as the level of | this link quality depends on many criteria, such as the level of | |||
| external interference, the presence of concurrent transmissions, or | external interference, the presence of concurrent transmissions, or | |||
| the radio channel state. This link quality is even time-variant. | the radio channel state. This link quality is even time-variant. | |||
| For p2m links, we have consequently a collection of PDR (one value | ||||
| per receiver). Other more sophisticated, aggregated metrics exist | ||||
| for these p2m links, such as [anycast-property] | ||||
| 2.2. Broadcast Transmissions | 2.2. Broadcast Transmissions | |||
| In modern switching networks, the unicast transmission is delivered | In modern switching networks, the unicast transmission is delivered | |||
| uniquely to the destination. Wireless networks are much closer to | uniquely to the destination. Wireless networks are much closer to | |||
| the ancient *shared access* networks. Practically, unicast and | the ancient *shared access* networks. Practically, unicast and | |||
| broadcast frames are handled similarly at the physical layer. The | broadcast frames are handled similarly at the physical layer. The | |||
| link layer is just in charge of filtering the frames to discard | link layer is just in charge of filtering the frames to discard | |||
| irrelevant receptions (e.g., different unicast MAC address). | irrelevant receptions (e.g., different unicast MAC address). | |||
| However, contrary to wired networks, we cannot be sure that a packet | However, contrary to wired networks, we cannot be sure that a packet | |||
| is received by *all* the devices attached to the layer-2 segment. It | is received by *all* the devices attached to the Layer 2 segment. It | |||
| depends on the radio channel state between the transmitter(s) and the | depends on the radio channel state between the transmitter(s) and the | |||
| receiver(s). In particular, concurrent transmissions may be possible | receiver(s). In particular, concurrent transmissions may be possible | |||
| or not, depending on the radio conditions (e.g., do the different | or not, depending on the radio conditions (e.g., do the different | |||
| transmitters use a different radio channel or are they sufficiently | transmitters use a different radio channel or are they sufficiently | |||
| spatially separated?) | spatially separated?) | |||
| 2.3. Complex Layer 2 Forwarding | 2.3. Complex Layer 2 Forwarding | |||
| Multiple neighbors may receive a transmission. Thus, anycast layer-2 | Multiple neighbors may receive a transmission. Thus, anycast Layer 2 | |||
| forwarding helps to maximize the reliability by assigning multiple | forwarding helps to maximize the reliability by assigning multiple | |||
| receivers to a single transmission. That way, the packet is lost | receivers to a single transmission. That way, the packet is lost | |||
| only if *none* of the receivers decode it. Practically, it has been | only if *none* of the receivers decode it. Practically, it has been | |||
| proven that different neighbors may exhibit very different radio | proven that different neighbors may exhibit very different radio | |||
| conditions, and that reception independency may hold for some of them | conditions, and that reception independency may hold for some of them | |||
| [anycast-property]. | [anycast-property]. | |||
| 2.4. End-to-end delay | ||||
| In a wireless network, additionnal transmissions opportunities are | ||||
| provisionned to accomodate packet losses. Thus, the end-to-end delay | ||||
| consists of: | ||||
| o Transmission delay, which is fixed and depends mainly on the data | ||||
| rate, and the presence or absence of an acknowledgement. | ||||
| o Residence time, corresponds to the buffering delay and depends on | ||||
| the schedule. To account for retransmisisons, the residence time | ||||
| is equal to the difference between the time of last reception from | ||||
| the previous hop (among all the retransmisions) and the time of | ||||
| emission of the last retransmission. | ||||
| 3. Operation | 3. Operation | |||
| OAM features will enable RAW with robust operation both for | OAM features will enable RAW with robust operation both for | |||
| forwarding and routing purposes. | forwarding and routing purposes. | |||
| 3.1. Information Collection | 3.1. Information Collection | |||
| The model to exchange information should be the same as for detnet | The model to exchange information should be the same as for DetNet | |||
| network, for the sake of inter-operability. YANG may typically | network, for the sake of inter-operability. YANG may typically | |||
| fulfill this objective. | fulfill this objective. | |||
| However, RAW networks imply specific constraints (e.g., low | However, RAW networks imply specific constraints (e.g., low | |||
| bandwidth, packet losses, cost of medium access) that may require to | bandwidth, packet losses, cost of medium access) that may require to | |||
| minimize the volume of information to collect. Thus, we discuss in | minimize the volume of information to collect. Thus, we discuss in | |||
| Section 4.2 the different ways to collect information, i.e., transfer | Section 4.2 different ways to collect information, i.e., transfer | |||
| physically the OAM information from the emitter to the receiver. | physically the OAM information from the emitter to the receiver. | |||
| 3.2. Continuity Check | 3.2. Continuity Check | |||
| Similarly to detnet, we need to verify that the source and the | Similarly to DetNet, we need to verify that the source and the | |||
| destination are connected (at least one valid path exists) | destination are connected (at least one valid path exists) | |||
| 3.3. Connectivity Verification | 3.3. Connectivity Verification | |||
| As in detnet, we have to verify the absence of misconnection. We | As in DetNet, we have to verify the absence of misconnection. We | |||
| will focus here on the RAW specificities. | focus here on the RAW specificities. | |||
| Because of radio transmissions' broadcast nature, several receivers | Because of radio transmissions' broadcast nature, several receivers | |||
| may be active at the same time to enable anycast Layer 2 forwarding. | may be active at the same time to enable anycast Layer 2 forwarding. | |||
| Thus, the connectivity verification must test any combination. We | Thus, the connectivity verification must test any combination. We | |||
| also consider priority-based mechanisms for anycast forwarding, i.e., | also consider priority-based mechanisms for anycast forwarding, i.e., | |||
| all the receivers have different probabilities of forwarding a | all the receivers have different probabilities of forwarding a | |||
| packet. To verify a delay SLO for a given flow, we must also | packet. To verify a delay SLO for a given flow, we must also | |||
| consider all the possible combinations, leading to a probability | consider all the possible combinations, leading to a probability | |||
| distribution function for end-to-end transmissions. If this | distribution function for end-to-end transmissions. If this | |||
| verification is implemented naively, the number of combinations to | verification is implemented naively, the number of combinations to | |||
| skipping to change at page 8, line 28 ¶ | skipping to change at page 9, line 38 ¶ | |||
| Wireless networks are meshed by nature: we have many redundant radio | Wireless networks are meshed by nature: we have many redundant radio | |||
| links. These meshed networks are both an asset and a drawback: while | links. These meshed networks are both an asset and a drawback: while | |||
| several paths exist between two endpoints, and we should choose the | several paths exist between two endpoints, and we should choose the | |||
| most efficient one(s), concerning specifically the reliability, and | most efficient one(s), concerning specifically the reliability, and | |||
| the delay. | the delay. | |||
| Thus, multipath routing can be considered to make the network fault- | Thus, multipath routing can be considered to make the network fault- | |||
| tolerant. Even better, we can exploit the broadcast nature of | tolerant. Even better, we can exploit the broadcast nature of | |||
| wireless networks to exploit meshed multipath routing: we may have | wireless networks to exploit meshed multipath routing: we may have | |||
| multiple Maintenance Intermediate Endpoints (MIE) for each hop in the | multiple Maintenance Intermediate Endpoints (MIP) for each hop in the | |||
| path. In that way, each Maintenance Intermediate Endpoint has | path. In that way, each Maintenance Intermediate Endpoint has | |||
| several possible next hops in the forwarding plane. Thus, all the | several possible next hops in the forwarding plane. Thus, all the | |||
| possible paths between two maintenance endpoints should be retrieved, | possible paths between two maintenance endpoints should be retrieved, | |||
| which may quickly become untractable if we apply a naive approach. | which may quickly become untractable if we apply a naive approach. | |||
| 3.5. Fault Verification/detection | 3.5. Fault Verification/detection | |||
| Wired networks tend to present stable performances. On the contrary, | Wired networks tend to present stable performances. On the contrary, | |||
| wireless networks are time-variant. We must consequently make a | wireless networks are time-variant. We must consequently make a | |||
| distinction between _normal_ evolutions and malfunction. | distinction between _normal_ evolutions and malfunction. | |||
| 3.6. Fault Isolation/identification | 3.6. Fault Isolation/identification | |||
| The network has isolated and identified the cause of the fault. | The network has isolated and identified the cause of the fault. | |||
| While detnet already expects to identify malfunctions, some problems | While DetNet already expects to identify malfunctions, some problems | |||
| are specific to wireless networks. We must consequently collect | are specific to wireless networks. We must consequently collect | |||
| metrics and implement algorithms tailored for wireless networking. | metrics and implement algorithms tailored for wireless networking. | |||
| For instance, the decrease in the link quality may be caused by | For instance, the decrease in the link quality may be caused by | |||
| several factors: external interference, obstacles, multipath fading, | several factors: external interference, obstacles, multipath fading, | |||
| mobility. It it fundamental to be able to discriminate the different | mobility. It it fundamental to be able to discriminate the different | |||
| causes to make the right decision. | causes to make the right decision. | |||
| 4. Administration | 4. Administration | |||
| skipping to change at page 9, line 26 ¶ | skipping to change at page 10, line 38 ¶ | |||
| in wireless to denote the link quality. The radio chipset is in | in wireless to denote the link quality. The radio chipset is in | |||
| charge of translating a received signal strength into a normalized | charge of translating a received signal strength into a normalized | |||
| quality indicator; | quality indicator; | |||
| o Delay: the time elapsed between a packet generation / enqueuing | o Delay: the time elapsed between a packet generation / enqueuing | |||
| and its reception by the next hop; | and its reception by the next hop; | |||
| o Buffer occupancy: the number of packets present in the buffer, for | o Buffer occupancy: the number of packets present in the buffer, for | |||
| each of the existing flows. | each of the existing flows. | |||
| o Battery lifetime: the expected remaining battery lifetime of the | ||||
| device. Since many RAW devices might be battery powered, this is | ||||
| an important metric for an operator to take proper decisions. | ||||
| o Mobility: if a device is known to be mobile, this might be | ||||
| considered by an operator to take proper decisions. | ||||
| These metrics should be collected per device, virtual circuit, and | These metrics should be collected per device, virtual circuit, and | |||
| path, as detnet already does. However, we have to face in RAW to a | path, as detnet already does. However, we have to face in RAW to a | |||
| finer granularity: | finer granularity: | |||
| o per radio channel to measure, e.g., the level of external | o per radio channel to measure, e.g., the level of external | |||
| interference, and to be able to apply counter-measures (e.g., | interference, and to be able to apply counter-measures (e.g., | |||
| blacklisting). | blacklisting). | |||
| o per link to detect misbehaving link (assymetrical link, | o per link to detect misbehaving link (assymetrical link, | |||
| fluctuating quality). | fluctuating quality). | |||
| o per resource block: a collision in the schedule is particularly | o per resource block: a collision in the schedule is particularly | |||
| challenging to identify in radio networks with spectrum reuse. In | challenging to identify in radio networks with spectrum reuse. In | |||
| particular, a collision may not be systematic (depending on the | particular, a collision may not be systematic (depending on the | |||
| radio characteristics and the traffic profile) | radio characteristics and the traffic profile) | |||
| 4.1. Worst-case metrics | 4.1. Worst-case metrics | |||
| RAW inherits the same requirements as detnet: we need to know the | RAW inherits the same requirements as DetNet: we need to know the | |||
| distribution of a collection of metrics. However, wireless networks | distribution of a collection of metrics. However, wireless networks | |||
| are know to be highly variable. Changes may be frequent, and may | are known to be highly variable. Changes may be frequent, and may | |||
| exhibit a periodical pattern. Collecting and analyzing this amount | exhibit a periodical pattern. Collecting and analyzing this amount | |||
| of measurements is challenging. | of measurements is challenging. | |||
| Wireless networks are known to be lossy, and RAW has to implement | Wireless networks are known to be lossy, and RAW has to implement | |||
| strategies to improve reliability on top of unreliable links. Hybrid | strategies to improve reliability on top of unreliable links. Hybrid | |||
| Automatic Repeat reQuest (ARQ) has typically to enable | Automatic Repeat reQuest (ARQ) has typically to enable | |||
| retransmissions based on the end-to-end reliability and latency | retransmissions based on the end-to-end reliability and latency | |||
| requirements. | requirements. | |||
| 4.2. Efficient data retrieval | 4.2. Efficient data retrieval | |||
| skipping to change at page 10, line 44 ¶ | skipping to change at page 12, line 15 ¶ | |||
| the headers to identify the path followed by a packet a | the headers to identify the path followed by a packet a | |||
| posteriori. | posteriori. | |||
| hierarchical monitoring; localized and centralized mechanisms have | hierarchical monitoring; localized and centralized mechanisms have | |||
| to be combined together. Typically, a local mechanism should | to be combined together. Typically, a local mechanism should | |||
| contiuously monitor a set of metrics and trigger distant OAM | contiuously monitor a set of metrics and trigger distant OAM | |||
| exchances only when a fault is detected (but possibly not | exchances only when a fault is detected (but possibly not | |||
| identified). For instance, local temporary defects must not | identified). For instance, local temporary defects must not | |||
| trigger expensive OAM transmissions. | trigger expensive OAM transmissions. | |||
| 4.3. Reporting OAM packets to the source | ||||
| TODO: statistics are collected when a packet goes from the source to | ||||
| the destination. However, it has to be also reported by the source. | ||||
| Problem: resource may not be reserved bidirectionnaly. Even worse: | ||||
| the inverse path may not exist. | ||||
| 5. Maintenance | 5. Maintenance | |||
| RAW needs to implement a self-healing and self-optimization approach. | Maintenance needs to facilitate the maintenance (repairs and | |||
| upgrades). In wireless networks, repairs are expected to occur much | ||||
| more frequently, since the link quality may be highly time-variant. | ||||
| Thus, maintenance represents a key feature for RAW. | ||||
| 5.1. Soft transition after reconfiguration | ||||
| Because of the wireless medium, the link quality may fluctuate, and | ||||
| the network needs to reconfigure itself continuously. During this | ||||
| transient state, flows may begin to be gradually re-forwarded, | ||||
| consuming resources in different parts of the network. OAM has to | ||||
| make a distinction between a metric that changed because of a legal | ||||
| network change (e.g., flow redirection) and an unexpected event | ||||
| (e.g., a fault). | ||||
| 5.2. Predictive maintenance | ||||
| RAW needs to implement self-optimization features. While the network | ||||
| is configured to be fault-tolerant, a reconfiguration may be required | ||||
| to keep on respecting long-term objectives. Obviously, the network | ||||
| keeps on respecting the SLO after a node's crash, but a | ||||
| reconfiguration is required to handle the future faults. In other | ||||
| words, the reconfiguration delay MUST be strictly smaller than the | ||||
| inter-fault time. | ||||
| The network must continuously retrieve the state of the network, to | The network must continuously retrieve the state of the network, to | |||
| judge about the relevance of a reconfiguration, quantifying: | judge about the relevance of a reconfiguration, quantifying: | |||
| the cost of the sub-optimality: resources may not be used | the cost of the sub-optimality: resources may not be used | |||
| optimally (e.g., a better path exists); | optimally (e.g., a better path exists); | |||
| the reconfiguration cost: the controller needs to trigger some | the reconfiguration cost: the controller needs to trigger some | |||
| reconfigurations. For this transient period, resources may be | reconfigurations. For this transient period, resources may be | |||
| twice reserved, and control packets have to be transmitted. | twice reserved, and control packets have to be transmitted. | |||
| Thus, reconfiguration may only be triggered if the gain is | Thus, reconfiguration may only be triggered if the gain is | |||
| significant. | significant. | |||
| 5.1. Dynamic Resource Reservation | ||||
| Wireless networks exhibit time-variant characteristics. Thus, the | ||||
| network has to provide additional resources along the path to fit the | ||||
| worst-case performance. This time-variant characteristics make the | ||||
| resource reservation very challenging: over-reaction waste radio and | ||||
| energy resources. Inversely, under-reaction jeopardize the network | ||||
| operations, and some SLO may be violated. | ||||
| 5.2. Reliable Reconfiguration | ||||
| Wireless networks are known to be lossy. Thus, commands may be | ||||
| received or not by the node to reconfigure. Unfortunately, | ||||
| inconsistent states may create critical misconfigurations, where | ||||
| packets may be lost along a path because it has not been properly | ||||
| configured. | ||||
| We have to propose mechanisms to guarantee that the network state is | ||||
| always consistent, even if some control packets are lost. Timeouts | ||||
| and retransmissions are not sufficient since the reconfiguration | ||||
| duration would be, in that case, unbounded. | ||||
| 6. IANA Considerations | 6. IANA Considerations | |||
| This document has no actionable requirements for IANA. This section | This document has no actionable requirements for IANA. This section | |||
| can be removed before the publication. | can be removed before the publication. | |||
| 7. Security Considerations | 7. Security Considerations | |||
| This section will be expanded in future versions of the draft. | This section will be expanded in future versions of the draft. | |||
| 8. Acknowledgments | 8. Acknowledgments | |||
| skipping to change at page 12, line 18 ¶ | skipping to change at page 13, line 43 ¶ | |||
| 802.15.4-TSCH Networks?", 2019, | 802.15.4-TSCH Networks?", 2019, | |||
| <https://doi.org/10.1109/LCNSymposium47956.2019.9000679>. | <https://doi.org/10.1109/LCNSymposium47956.2019.9000679>. | |||
| [detnet-oam] | [detnet-oam] | |||
| Theoleyre, F., Papadopoulos, G. Z., Mirsky, G., and C. J. | Theoleyre, F., Papadopoulos, G. Z., Mirsky, G., and C. J. | |||
| Bernardos, "Operations, Administration and Maintenance | Bernardos, "Operations, Administration and Maintenance | |||
| (OAM) features for detnet", 2020, | (OAM) features for detnet", 2020, | |||
| <https://tools.ietf.org/html/draft-theoleyre-detnet-oam- | <https://tools.ietf.org/html/draft-theoleyre-detnet-oam- | |||
| support>. | support>. | |||
| [I-D.pthubert-raw-architecture] | ||||
| Thubert, P., Papadopoulos, G. Z., and R. Buddenberg, | ||||
| "Reliable and Available Wireless Architecture/Framework", | ||||
| draft-pthubert-raw-architecture-05 (work in progress), | ||||
| November 2020. | ||||
| [ipath] Gao, Y., Dong, W., Chen, C., Bu, J., Wu, W., and X. Liu, | [ipath] Gao, Y., Dong, W., Chen, C., Bu, J., Wu, W., and X. Liu, | |||
| "iPath: path inference in wireless sensor networks.", | "iPath: path inference in wireless sensor networks.", | |||
| 2016, <https://doi.org/10.1109/TNET.2014.2371459>. | 2016, <https://doi.org/10.1109/TNET.2014.2371459>. | |||
| [PREF-draft] | [PREF-draft] | |||
| Thubert, P., Eckert, T., Brodard, Z., and H. Jiang, "BIER- | Thubert, P., Eckert, T., Brodard, Z., and H. Jiang, "BIER- | |||
| TE extensions for Packet Replication and Elimination | TE extensions for Packet Replication and Elimination | |||
| Function (PREF) and OAM", 2018, | Function (PREF) and OAM", 2018, | |||
| <https://tools.ietf.org/html/draft-thubert-bier- | <https://tools.ietf.org/html/draft-thubert-bier- | |||
| replication-elimination>. | replication-elimination>. | |||
| [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | |||
| Requirement Levels", BCP 14, RFC 2119, | Requirement Levels", BCP 14, RFC 2119, | |||
| DOI 10.17487/RFC2119, March 1997, | DOI 10.17487/RFC2119, March 1997, | |||
| <https://www.rfc-editor.org/info/rfc2119>. | <https://www.rfc-editor.org/info/rfc2119>. | |||
| [RFC2544] Bradner, S. and J. McQuaid, "Benchmarking Methodology for | ||||
| Network Interconnect Devices", RFC 2544, | ||||
| DOI 10.17487/RFC2544, March 1999, | ||||
| <https://www.rfc-editor.org/info/rfc2544>. | ||||
| [RFC6291] Andersson, L., van Helvoort, H., Bonica, R., Romascanu, | [RFC6291] Andersson, L., van Helvoort, H., Bonica, R., Romascanu, | |||
| D., and S. Mansfield, "Guidelines for the Use of the "OAM" | D., and S. Mansfield, "Guidelines for the Use of the "OAM" | |||
| Acronym in the IETF", BCP 161, RFC 6291, | Acronym in the IETF", BCP 161, RFC 6291, | |||
| DOI 10.17487/RFC6291, June 2011, | DOI 10.17487/RFC6291, June 2011, | |||
| <https://www.rfc-editor.org/info/rfc6291>. | <https://www.rfc-editor.org/info/rfc6291>. | |||
| [RFC7276] Mizrahi, T., Sprecher, N., Bellagamba, E., and Y. | [RFC7276] Mizrahi, T., Sprecher, N., Bellagamba, E., and Y. | |||
| Weingarten, "An Overview of Operations, Administration, | Weingarten, "An Overview of Operations, Administration, | |||
| and Maintenance (OAM) Tools", RFC 7276, | and Maintenance (OAM) Tools", RFC 7276, | |||
| DOI 10.17487/RFC7276, June 2014, | DOI 10.17487/RFC7276, June 2014, | |||
| skipping to change at line 598 ¶ | skipping to change at page 15, line 29 ¶ | |||
| Cesson-Sevigne - Rennes 35510 | Cesson-Sevigne - Rennes 35510 | |||
| FRANCE | FRANCE | |||
| Phone: +33 299 12 70 04 | Phone: +33 299 12 70 04 | |||
| Email: georgios.papadopoulos@imt-atlantique.fr | Email: georgios.papadopoulos@imt-atlantique.fr | |||
| Greg Mirsky | Greg Mirsky | |||
| ZTE Corp. | ZTE Corp. | |||
| Email: gregory.mirsky@ztetx.com | Email: gregory.mirsky@ztetx.com | |||
| Carlos J. Bernardos | ||||
| Universidad Carlos III de Madrid | ||||
| Av. Universidad, 30 | ||||
| Leganes, Madrid 28911 | ||||
| Spain | ||||
| Phone: +34 91624 6236 | ||||
| Email: cjbc@it.uc3m.es | ||||
| URI: http://www.it.uc3m.es/cjbc/ | ||||
| End of changes. 44 change blocks. | ||||
| 100 lines changed or deleted | 181 lines changed or added | |||
This html diff was produced by rfcdiff 1.48. The latest version is available from http://tools.ietf.org/tools/rfcdiff/ | ||||