| < draft-theoleyre-raw-oam-support-03.txt | draft-theoleyre-raw-oam-support-04.txt > | |||
|---|---|---|---|---|
| RAW F. Theoleyre | RAW F. Theoleyre | |||
| Internet-Draft CNRS | Internet-Draft CNRS | |||
| Intended status: Standards Track G. Papadopoulos | Intended status: Standards Track G. Papadopoulos | |||
| Expires: January 11, 2021 IMT Atlantique | Expires: April 28, 2021 IMT Atlantique | |||
| G. Mirsky | G. Mirsky | |||
| ZTE Corp. | ZTE Corp. | |||
| July 10, 2020 | October 25, 2020 | |||
| Operations, Administration and Maintenance (OAM) features for RAW | Operations, Administration and Maintenance (OAM) features for RAW | |||
| draft-theoleyre-raw-oam-support-03 | draft-theoleyre-raw-oam-support-04 | |||
| Abstract | Abstract | |||
| Some critical applications may use a wireless infrastructure. | Some critical applications may use a wireless infrastructure. | |||
| However, wireless networks exhibit a bandwidth of several orders of | However, wireless networks exhibit a bandwidth of several orders of | |||
| magnitude lower than wired networks. Besides, wireless transmissions | magnitude lower than wired networks. Besides, wireless transmissions | |||
| are lossy by nature; the probability that a packet cannot be decoded | are lossy by nature; the probability that a packet cannot be decoded | |||
| correctly by the receiver may be quite high. In these conditions, | correctly by the receiver may be quite high. In these conditions, | |||
| guaranteeing the network infrastructure works properly is | guaranteeing the network infrastructure works properly is | |||
| particularly challenging, since we need to address some issues | particularly challenging, since we need to address some issues | |||
| skipping to change at page 1, line 45 ¶ | skipping to change at page 1, line 45 ¶ | |||
| Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
| Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
| working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
| Drafts is at https://datatracker.ietf.org/drafts/current/. | Drafts is at https://datatracker.ietf.org/drafts/current/. | |||
| Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
| and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
| time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
| material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
| This Internet-Draft will expire on January 11, 2021. | This Internet-Draft will expire on April 28, 2021. | |||
| Copyright Notice | Copyright Notice | |||
| Copyright (c) 2020 IETF Trust and the persons identified as the | Copyright (c) 2020 IETF Trust and the persons identified as the | |||
| document authors. All rights reserved. | document authors. All rights reserved. | |||
| This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
| Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
| (https://trustee.ietf.org/license-info) in effect on the date of | (https://trustee.ietf.org/license-info) in effect on the date of | |||
| publication of this document. Please review these documents | publication of this document. Please review these documents | |||
| carefully, as they describe your rights and restrictions with respect | carefully, as they describe your rights and restrictions with respect | |||
| to this document. Code Components extracted from this document must | to this document. Code Components extracted from this document must | |||
| include Simplified BSD License text as described in Section 4.e of | include Simplified BSD License text as described in Section 4.e of | |||
| the Trust Legal Provisions and are provided without warranty as | the Trust Legal Provisions and are provided without warranty as | |||
| described in the Simplified BSD License. | described in the Simplified BSD License. | |||
| Table of Contents | Table of Contents | |||
| 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 | |||
| 1.1. Terminology . . . . . . . . . . . . . . . . . . . . . . . 4 | 1.1. Terminology . . . . . . . . . . . . . . . . . . . . . . . 4 | |||
| 1.2. Acronyms . . . . . . . . . . . . . . . . . . . . . . . . 4 | 1.2. Acronyms . . . . . . . . . . . . . . . . . . . . . . . . 5 | |||
| 1.3. Requirements Language . . . . . . . . . . . . . . . . . . 5 | 1.3. Requirements Language . . . . . . . . . . . . . . . . . . 5 | |||
| 2. Role of OAM in RAW . . . . . . . . . . . . . . . . . . . . . 5 | 2. Role of OAM in RAW . . . . . . . . . . . . . . . . . . . . . 5 | |||
| 2.1. Link concept and quality . . . . . . . . . . . . . . . . 5 | 2.1. Link concept and quality . . . . . . . . . . . . . . . . 6 | |||
| 2.2. Broadcast Transmissions . . . . . . . . . . . . . . . . . 6 | 2.2. Broadcast Transmissions . . . . . . . . . . . . . . . . . 6 | |||
| 2.3. Complex Layer 2 Forwarding . . . . . . . . . . . . . . . 6 | 2.3. Complex Layer 2 Forwarding . . . . . . . . . . . . . . . 7 | |||
| 3. Operation . . . . . . . . . . . . . . . . . . . . . . . . . . 6 | 3. Operation . . . . . . . . . . . . . . . . . . . . . . . . . . 7 | |||
| 3.1. Information Collection . . . . . . . . . . . . . . . . . 6 | 3.1. Information Collection . . . . . . . . . . . . . . . . . 7 | |||
| 3.2. Continuity Check . . . . . . . . . . . . . . . . . . . . 6 | 3.2. Continuity Check . . . . . . . . . . . . . . . . . . . . 7 | |||
| 3.3. Connectivity Verification . . . . . . . . . . . . . . . . 7 | 3.3. Connectivity Verification . . . . . . . . . . . . . . . . 7 | |||
| 3.4. Route Tracing . . . . . . . . . . . . . . . . . . . . . . 7 | 3.4. Route Tracing . . . . . . . . . . . . . . . . . . . . . . 8 | |||
| 3.5. Fault Verification/detection . . . . . . . . . . . . . . 8 | 3.5. Fault Verification/detection . . . . . . . . . . . . . . 8 | |||
| 3.6. Fault Isolation/identification . . . . . . . . . . . . . 8 | 3.6. Fault Isolation/identification . . . . . . . . . . . . . 8 | |||
| 4. Administration . . . . . . . . . . . . . . . . . . . . . . . 8 | 4. Administration . . . . . . . . . . . . . . . . . . . . . . . 9 | |||
| 4.1. Collection of metrics . . . . . . . . . . . . . . . . . . 9 | 4.1. Worst-case metrics . . . . . . . . . . . . . . . . . . . 9 | |||
| 4.2. Worst-case metrics . . . . . . . . . . . . . . . . . . . 9 | 4.2. Efficient data retrieval . . . . . . . . . . . . . . . . 10 | |||
| 4.3. Energy efficiency constraint . . . . . . . . . . . . . . 10 | ||||
| 5. Maintenance . . . . . . . . . . . . . . . . . . . . . . . . . 10 | 5. Maintenance . . . . . . . . . . . . . . . . . . . . . . . . . 10 | |||
| 5.1. Replication / Elimination . . . . . . . . . . . . . . . . 10 | 5.1. Dynamic Resource Reservation . . . . . . . . . . . . . . 11 | |||
| 5.2. Dynamic Resource Reservation . . . . . . . . . . . . . . 11 | 5.2. Reliable Reconfiguration . . . . . . . . . . . . . . . . 11 | |||
| 5.3. Reliable Reconfiguration . . . . . . . . . . . . . . . . 11 | ||||
| 6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 11 | 6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 11 | |||
| 7. Security Considerations . . . . . . . . . . . . . . . . . . . 11 | 7. Security Considerations . . . . . . . . . . . . . . . . . . . 11 | |||
| 8. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 11 | 8. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 11 | |||
| 9. Informative References . . . . . . . . . . . . . . . . . . . 12 | 9. Informative References . . . . . . . . . . . . . . . . . . . 11 | |||
| Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 13 | Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 13 | |||
| 1. Introduction | 1. Introduction | |||
| Reliable and Available Wireless (RAW) is an effort that extends | Reliable and Available Wireless (RAW) is an effort that extends | |||
| DetNet to approach end-to-end deterministic performances over a | DetNet to approach end-to-end deterministic performances over a | |||
| network that includes scheduled wireless segments. In wired | network that includes scheduled wireless segments. In wired | |||
| networks, many approaches try to enable Quality of Service (QoS) by | networks, many approaches try to enable Quality of Service (QoS) by | |||
| implementing traffic differentiation so that routers handle each type | implementing traffic differentiation so that routers handle each type | |||
| of packets differently. However, this differentiated treatment was | of packets differently. However, this differentiated treatment was | |||
| skipping to change at page 4, line 18 ¶ | skipping to change at page 4, line 18 ¶ | |||
| In this document, the term OAM will be used according to its | In this document, the term OAM will be used according to its | |||
| definition specified in [RFC6291]. We expect to implement an OAM | definition specified in [RFC6291]. We expect to implement an OAM | |||
| framework in RAW networks to maintain a real-time view of the network | framework in RAW networks to maintain a real-time view of the network | |||
| infrastructure, and its ability to respect the Service Level | infrastructure, and its ability to respect the Service Level | |||
| Objectives (SLO), such as delay and reliability, assigned to each | Objectives (SLO), such as delay and reliability, assigned to each | |||
| data flow. | data flow. | |||
| 1.1. Terminology | 1.1. Terminology | |||
| We re-use here the same terminology as [detnet-oam]: | ||||
| o OAM entity: a data flow to be controlled; | o OAM entity: a data flow to be controlled; | |||
| o Maintenance End Point (MEP): OAM devices crossed when entering/ | o Maintenance End Point (MEP): OAM devices crossed when entering/ | |||
| exiting the network. In RAW, it corresponds mostly to the source | exiting the network. In RAW, it corresponds mostly to the source | |||
| or destination of a data flow. OAM messages can be exchanged | or destination of a data flow. OAM messages can be exchanged | |||
| between two MEPs; | between two MEPs; | |||
| o Maintenance Intermediate endPoint (MIP): OAM devices along the | o Maintenance Intermediate endPoint (MIP): OAM devices along the | |||
| flow; OAM messages can be exchanged between a MEP and a MIP; | flow; OAM messages can be exchanged between a MEP and a MIP; | |||
| o control/data plane: while the control plane expects to configure | ||||
| and control the network (long-term), the data plane takes the | ||||
| individual decision; | ||||
| o passive / active methods (as defined in [RFC7799]): active methods | ||||
| send additional control information (inserting novel fields, | ||||
| generating novel control packets). Passive methods infer | ||||
| information just by observing unmodified existing flows. | ||||
| o active methods may implement one of these two strategies: | ||||
| * In-band: control information follows the same path as the data | ||||
| packets. In other words, a failure in the data plane may | ||||
| prevent the control information from reaching the destination (e.g., | ||||
| end-device or controller). | ||||
| * out-of-band: control information is sent separately from the | ||||
| data packets. Thus, the behavior of control vs. data packets | ||||
| may differ; | ||||
| We also adopt the following terminology, which is particularly | ||||
| relevant for RAW segments. | ||||
| o piggybacking vs. dedicated control packets: control information | ||||
| may be encapsulated in specific (dedicated) control packets. | ||||
| Alternatively, it may be piggybacked in existing data packets, | ||||
| when the MTU is larger than the actual packet length. | ||||
| Piggybacking makes particular sense in wireless networks: the | ||||
| cost (bandwidth and energy) is not linear with the packet size. | ||||
| o route-over vs. mesh-under: a control packet is either forwarded | ||||
| directly to the layer-3 next hop (mesh-under) or handled hop-by- | ||||
| hop by each router (route-over). While the latter option consumes more | ||||
| resources, it allows collecting additional intermediate | ||||
| information, particularly relevant in wireless networks. | ||||
| o Defect: a temporary change in the network (e.g., a radio link | o Defect: a temporary change in the network (e.g., a radio link | |||
| which is broken due to a mobile obstacle); | which is broken due to a mobile obstacle); | |||
| o Fault: a definite change which may affect the network performance, | o Fault: a definite change which may affect the network performance, | |||
| e.g., a node runs out of energy. | e.g., a node runs out of energy. | |||
| 1.2. Acronyms | 1.2. Acronyms | |||
| OAM Operations, Administration, and Maintenance | OAM Operations, Administration, and Maintenance | |||
| skipping to change at page 5, line 48 ¶ | skipping to change at page 6, line 33 ¶ | |||
| To be energy-efficient, reserving some dedicated out-of-band | To be energy-efficient, reserving some dedicated out-of-band | |||
| resources for OAM seems idealistic, and only in-band solutions are | resources for OAM seems idealistic, and only in-band solutions are | |||
| considered here. | considered here. | |||
| RAW supports both proactive and on-demand troubleshooting. | RAW supports both proactive and on-demand troubleshooting. | |||
| The specific characteristics of RAW are discussed below. | The specific characteristics of RAW are discussed below. | |||
| 2.1. Link concept and quality | 2.1. Link concept and quality | |||
| In wireless networks, a _link_ does not exist. A common convention | In wireless networks, a _link_ does not exist physically. A common | |||
| is to define a wireless link as a pair of devices that have a non- | convention is to define a wireless link as a pair of devices that | |||
| null probability of transmitting and decoding a packet. Similarly, | have a non-null probability of exchanging a packet that the receiver | |||
| we designate as *neighbor* any device which as a link with a specific | can decode. Similarly, we designate as *neighbor* any device with a | |||
| transmitter. | radio link with a specific transmitter. | |||
| Each wireless link is associated with a link quality, often measured | Each wireless link is associated with a link quality, often measured | |||
| as the Packet Delivery Ratio (PDR), i.e., the probability that the | as the Packet Delivery Ratio (PDR), i.e., the probability that the | |||
| receiver can decode the packet correctly. It is worth noting that | receiver can decode the packet correctly. It is worth noting that | |||
| this link quality depends on many criteria, such as the level of | this link quality depends on many criteria, such as the level of | |||
| external interference, the presence of concurrent transmissions, or | external interference, the presence of concurrent transmissions, or | |||
| the radio channel state. This link quality is even time-variant. | the radio channel state. This link quality is even time-variant. | |||
| 2.2. Broadcast Transmissions | 2.2. Broadcast Transmissions | |||
| In modern switching networks, the unicast transmission is delivered | In modern switching networks, the unicast transmission is delivered | |||
| uniquely to the destination. Wireless networks are much closer to | uniquely to the destination. Wireless networks are much closer to | |||
| the ancient shared access wireless networks. Unicast transmission is | the ancient *shared access* networks. Practically, unicast and | |||
| similar to a broadcast one and can be received by any neighbor. | broadcast frames are handled similarly at the physical layer. The | |||
| link layer is just in charge of filtering the frames to discard | ||||
| irrelevant receptions (e.g., different unicast MAC address). | ||||
| However, contrary to wired networks, we cannot be sure that a packet | However, contrary to wired networks, we cannot be sure that a packet | |||
| is received by *all* the devices attached to the network. It depends | is received by *all* the devices attached to the layer-2 segment. It | |||
| on the radio channel state between the transmitter(s) and the | depends on the radio channel state between the transmitter(s) and the | |||
| receiver(s). In particular, concurrent transmissions may be possible | receiver(s). In particular, concurrent transmissions may be possible | |||
| or not, depending on the radio conditions. | or not, depending on the radio conditions (e.g., do the different | |||
| transmitters use a different radio channel or are they sufficiently | ||||
| spatially separated?) | ||||
| 2.3. Complex Layer 2 Forwarding | 2.3. Complex Layer 2 Forwarding | |||
| Multiple neighbors may receive a transmission. Thus, anycast layer-2 | Multiple neighbors may receive a transmission. Thus, anycast layer-2 | |||
| forwarding helps to maximize the reliability by assigning multiple | forwarding helps to maximize the reliability by assigning multiple | |||
| receivers to a single transmission. That way, the packet is lost | receivers to a single transmission. That way, the packet is lost | |||
| only if none of the receivers decode it. Practically, it has been | only if *none* of the receivers decode it. Practically, it has been | |||
| proven that different neighbors may exhibit very different radio | proven that different neighbors may exhibit very different radio | |||
| conditions, and that reception independence may hold for some of them | conditions, and that reception independence may hold for some of them | |||
| [anycast-property]. | [anycast-property]. | |||
| 3. Operation | 3. Operation | |||
| OAM features will enable RAW with robust operation both for | OAM features will enable RAW with robust operation both for | |||
| forwarding and routing purposes. | forwarding and routing purposes. | |||
| 3.1. Information Collection | 3.1. Information Collection | |||
| Several solutions (e.g., Simple Network Management Protocol (SNMP), | The model to exchange information should be the same as for a detnet | |||
| YANG-based data models) are already in charge of collecting the | network, for the sake of inter-operability. YANG may typically | |||
| statistics. That way, we can encapsulate these statistics in | fulfill this objective. | |||
| specific monitoring packets, to send them to the controller. | ||||
| However, RAW networks imply specific constraints (e.g., low | ||||
| bandwidth, packet losses, cost of medium access) that may require to | ||||
| minimize the volume of information to collect. Thus, we discuss in | ||||
| Section 4.2 the different ways to collect information, i.e., transfer | ||||
| physically the OAM information from the emitter to the receiver. | ||||
| 3.2. Continuity Check | 3.2. Continuity Check | |||
| We need to verify that two endpoints are connected. In other words, | Similarly to detnet, we need to verify that the source and the | |||
| there exists "one" way to deliver the packets between two endpoints A | destination are connected (at least one valid path exists). | |||
| and B. The solution here may not differ from that of detnet. | ||||
| 3.3. Connectivity Verification | 3.3. Connectivity Verification | |||
| In addition to the Continuity Check, we have to verify the | As in detnet, we have to verify the absence of misconnection. We | |||
| connectivity. This verification considers additional constraints, | will focus here on the RAW specificities. | |||
| i.e., the absence of misconnection. | ||||
| In particular, the resources have to be reserved by a given flow, and | ||||
| no packets from other flows steal the corresponding resources. | ||||
| Similarly, the destination does not receive packets from different | ||||
| flows through its interface. | ||||
| Because of radio transmissions' broadcast nature, several receivers | Because of radio transmissions' broadcast nature, several receivers | |||
| may be active at the same time to enable anycast Layer 2 forwarding. | may be active at the same time to enable anycast Layer 2 forwarding. | |||
| Thus, the connectivity verification must test any combination. We | Thus, the connectivity verification must test any combination. We | |||
| also consider priority-based mechanisms for anycast forwarding, i.e., | also consider priority-based mechanisms for anycast forwarding, i.e., | |||
| all the receivers have different probabilities of forwarding a | all the receivers have different probabilities of forwarding a | |||
| packet. To verify a delay SLO for a given flow, we must also | packet. To verify a delay SLO for a given flow, we must also | |||
| consider all the possible combinations, leading to a probability | consider all the possible combinations, leading to a probability | |||
| distribution function for end-to-end transmissions. If this | distribution function for end-to-end transmissions. If this | |||
| verification is implemented naively, the number of combinations to | verification is implemented naively, the number of combinations to | |||
| test may be exponential and too costly for wireless networks with low | test may be exponential and too costly for wireless networks with low | |||
| bandwidth. | bandwidth. | |||
| It is worth noting that the control and data packets may not follow | ||||
| the same path. The connectivity verification has to be conducted in- | ||||
| band without impacting the data traffic. Test packets MUST share the | ||||
| fate with the monitored data traffic without introducing congestion | ||||
| in normal network conditions. | ||||
| 3.4. Route Tracing | 3.4. Route Tracing | |||
| ICMP tools are comprehensive tools for diagnostics. They help to | ||||
| identify a subset of the list of routers in the route. To ensure | ||||
| predictable performance, resources are reserved per flow in RAW. | ||||
| Thus, we need to define route tracing tools able to track the route | ||||
| for a specific flow. | ||||
| Wireless networks are meshed by nature: we have many redundant radio | Wireless networks are meshed by nature: we have many redundant radio | |||
| links. These meshed networks are both an asset and a drawback: | links. These meshed networks are both an asset and a drawback: | |||
| several paths exist between two endpoints, but we should choose the | several paths exist between two endpoints, but we should choose the | |||
| most efficient one(s), specifically with respect to reliability and | most efficient one(s), specifically with respect to reliability and | |||
| delay. | delay. | |||
| Thus, multipath routing can be considered to make the network fault- | Thus, multipath routing can be considered to make the network fault- | |||
| tolerant. Even better, we can exploit the broadcast nature of | tolerant. Even better, we can exploit the broadcast nature of | |||
| wireless networks to enable meshed multipath routing: we may have | wireless networks to enable meshed multipath routing: we may have | |||
| multiple Maintenance Intermediate endPoints (MIP) for each hop in the | multiple Maintenance Intermediate endPoints (MIP) for each hop in the | |||
| path. In that way, each Maintenance Intermediate endPoint has | path. In that way, each Maintenance Intermediate endPoint has | |||
| several possible next hops in the forwarding plane. Thus, all the | several possible next hops in the forwarding plane. Thus, all the | |||
| possible paths between two maintenance endpoints should be retrieved, | possible paths between two maintenance endpoints should be retrieved, | |||
| which may quickly become intractable if we apply a naive approach. | which may quickly become intractable if we apply a naive approach. | |||
| 3.5. Fault Verification/detection | 3.5. Fault Verification/detection | |||
| RAW expects to operate fault-tolerant networks. Thus, we need | ||||
| mechanisms able to detect faults, before they impact the network | ||||
| performance. | ||||
| Wired networks tend to present stable performances. On the contrary, | Wired networks tend to present stable performances. On the contrary, | |||
| wireless networks are time-variant. We must consequently make a | wireless networks are time-variant. We must consequently make a | |||
| distinction between _normal_ evolutions and malfunction. | distinction between _normal_ evolutions and malfunction. | |||
| The network has to detect when a fault occurred, i.e., the network | ||||
| has deviated from its expected behavior. While the network must | ||||
| report an alarm, the cause may not be identified precisely. For | ||||
| instance, the end-to-end reliability has decreased significantly, or | ||||
| a buffer overflow occurs. | ||||
| 3.6. Fault Isolation/identification | 3.6. Fault Isolation/identification | |||
| The network has isolated and identified the cause of the fault. | The network has isolated and identified the cause of the fault. | |||
| While detnet already expects to identify malfunctions, some problems | While detnet already expects to identify malfunctions, some problems | |||
| are specific to wireless networks. We must consequently collect | are specific to wireless networks. We must consequently collect | |||
| metrics and implement algorithms tailored for wireless networking. | metrics and implement algorithms tailored for wireless networking. | |||
| For instance, the quality of a specific link has decreased, requiring | ||||
| more retransmissions, or the level of external interference has | For instance, the decrease in the link quality may be caused by | |||
| locally increased. | several factors: external interference, obstacles, multipath fading, | |||
| mobility. It is fundamental to be able to discriminate the different | ||||
| causes to make the right decision. | ||||
| 4. Administration | 4. Administration | |||
| The network has to expose a collection of metrics to support an | The RAW network has to expose a collection of metrics to support an | |||
| operator making proper decisions, including: | operator making proper decisions, including: | |||
| o Packet losses: the time-window average and maximum values of the | o Packet losses: the time-window average and maximum values of the | |||
| number of packet losses have to be measured. Many critical | number of packet losses have to be measured. Many critical | |||
| applications stop working if a few consecutive packets are | applications stop working if a few consecutive packets are | |||
| dropped; | dropped; | |||
| o Received Signal Strength Indicator (RSSI) is a very common metric | o Received Signal Strength Indicator (RSSI) is a very common metric | |||
| in wireless to denote the link quality. The radio chipset is in | in wireless to denote the link quality. The radio chipset is in | |||
| charge of translating a received signal strength into a normalized | charge of translating a received signal strength into a normalized | |||
| quality indicator; | quality indicator; | |||
| o Delay: the time elapsed between a packet generation / enqueuing | o Delay: the time elapsed between a packet generation / enqueuing | |||
| and its reception by the next hop; | and its reception by the next hop; | |||
| o Buffer occupancy: the number of packets present in the buffer, for | o Buffer occupancy: the number of packets present in the buffer, for | |||
| each of the existing flows. | each of the existing flows. | |||
| These metrics should be collected: | These metrics should be collected per device, virtual circuit, and | |||
| path, as detnet already does. However, in RAW we have to handle a | ||||
| o per virtual circuit to measure the end-to-end performance for a | finer granularity: | |||
| given flow. Each of the paths has to be isolated in multipath | ||||
| routing strategies; | ||||
| o per radio channel to measure, e.g., the level of external | o per radio channel to measure, e.g., the level of external | |||
| interference, and to be able to apply counter-measures (e.g., | interference, and to be able to apply counter-measures (e.g., | |||
| blacklisting). | blacklisting). | |||
| o per device to detect a misbehaving node, when it relays the packets | o per link to detect a misbehaving link (asymmetrical link, | |||
| of several flows. | fluctuating quality). | |||
| 4.1. Collection of metrics | o per resource block: a collision in the schedule is particularly | |||
| challenging to identify in radio networks with spectrum reuse. In | ||||
| particular, a collision may not be systematic (depending on the | ||||
| radio characteristics and the traffic profile) | ||||
| 4.1. Worst-case metrics | ||||
| RAW inherits the same requirements as detnet: we need to know the | ||||
| distribution of a collection of metrics. However, wireless networks | ||||
| are known to be highly variable. Changes may be frequent, and may | ||||
| exhibit a periodic pattern. Collecting and analyzing this amount | ||||
| of measurements is challenging. | ||||
| Wireless networks are known to be lossy, and RAW has to implement | ||||
| strategies to improve reliability on top of unreliable links. Hybrid | ||||
| Automatic Repeat reQuest (ARQ) typically has to enable | ||||
| retransmissions based on the end-to-end reliability and latency | ||||
| requirements. | ||||
| 4.2. Efficient data retrieval | ||||
| We have to minimize the number of statistics / measurements to | We have to minimize the number of statistics / measurements to | |||
| exchange: | exchange: | |||
| o energy efficiency: low-power devices have to limit the volume of | o energy efficiency: low-power devices have to limit the volume of | |||
| monitoring information since every bit consumes energy. | monitoring information since every bit consumes energy. | |||
| o bandwidth: wireless networks exhibit a bandwidth significantly | o bandwidth: wireless networks exhibit a bandwidth significantly | |||
| lower than wired, best-effort networks. | lower than wired, best-effort networks. | |||
| o per-packet cost: it is often more expensive to send several | o per-packet cost: it is often more expensive to send several | |||
| packets instead of combining them in a single link-layer frame. | packets instead of combining them in a single link-layer frame. | |||
| Thus, localized and centralized mechanisms have to be combined | In conclusion, we have to take care of power and bandwidth | |||
| together, and additional control packets have to be triggered only | ||||
| after a fault detection. | ||||
| 4.2. Worst-case metrics | ||||
| RAW aims to enable real-time communications on top of a heterogeneous | ||||
| architecture. Wireless networks are known to be lossy, and RAW has | ||||
| to implement strategies to improve reliability on top of unreliable | ||||
| links. Hybrid Automatic Repeat reQuest (ARQ) typically has to enable | ||||
| retransmissions based on the end-to-end reliability and latency | ||||
| requirements. | ||||
| To make correct decisions, the controller needs to know the | ||||
| distribution of packet losses for each flow, and each hop of the | ||||
| paths. In other words, the average end-to-end statistics are not | ||||
| enough. They must allow the controller to predict the worst-case. | ||||
| 4.3. Energy efficiency constraint | ||||
| RAW also targets low-power wireless networks, where energy represents | ||||
| a key constraint. Thus, we have to take care of power and bandwidth | ||||
| consumption. The following techniques aim to reduce the cost of such | consumption. The following techniques aim to reduce the cost of such | |||
| maintenance: | maintenance: | |||
| on-path collection: some control information is inserted in the | on-path collection: some control information is inserted in the | |||
| data packets if they do not fragment the packet (i.e., the MTU is | data packets if they do not fragment the packet (i.e., the MTU is | |||
| not exceeded). Information Elements represent a standardized way | not exceeded). Information Elements represent a standardized way | |||
| to handle such information; | to handle such information; | |||
| flags/fields: we have to set up flags in the packets to be able | flags/fields: we have to set up flags in the packets to be able | |||
| to monitor the forwarding process accurately. A sequence | to monitor the forwarding process accurately. A sequence | |||
| number field may help to detect packet losses. Similarly, path | number field may help to detect packet losses. Similarly, path | |||
| inference tools such as [ipath] insert additional information in | inference tools such as [ipath] insert additional information in | |||
| the headers to identify the path followed by a packet a | the headers to identify the path followed by a packet a | |||
| posteriori. | posteriori. | |||
| hierarchical monitoring: localized and centralized mechanisms have | ||||
| to be combined. Typically, a local mechanism should | ||||
| continuously monitor a set of metrics and trigger distant OAM | ||||
| exchanges only when a fault is detected (but possibly not | ||||
| identified). For instance, local temporary defects must not | ||||
| trigger expensive OAM transmissions. | ||||
| 5. Maintenance | 5. Maintenance | |||
| RAW needs to implement a self-healing and self-optimization approach. | RAW needs to implement a self-healing and self-optimization approach. | |||
| The network must continuously retrieve the state of the network, to | The network must continuously retrieve the state of the network, to | |||
| judge the relevance of a reconfiguration, quantifying: | judge the relevance of a reconfiguration, quantifying: | |||
| the cost of the sub-optimality: resources may not be used | the cost of the sub-optimality: resources may not be used | |||
| optimally (e.g., a better path exists); | optimally (e.g., a better path exists); | |||
| the reconfiguration cost: the controller needs to trigger some | the reconfiguration cost: the controller needs to trigger some | |||
| reconfigurations. During this transient period, resources may be | reconfigurations. During this transient period, resources may be | |||
| reserved twice, and control packets have to be transmitted. | reserved twice, and control packets have to be transmitted. | |||
| Thus, reconfiguration may only be triggered if the gain is | Thus, reconfiguration may only be triggered if the gain is | |||
| significant. | significant. | |||
| 5.1. Replication / Elimination | 5.1. Dynamic Resource Reservation | |||
| When multiple paths are reserved between two maintenance endpoints, | ||||
| they may decide to replicate the packets to introduce redundancy, and | ||||
| thus to alleviate transmission errors and collisions. For instance, | ||||
| in Figure 1, the source node S is transmitting the packet to both | ||||
| parents, nodes A and B. Each maintenance endpoint will decide to | ||||
| trigger the replication/elimination process when a set of metrics | ||||
| crosses a threshold value. | ||||
| ===> (A) => (C) => (E) === | ||||
| // \\// \\// \\ | ||||
| source (S) //\\ //\\ (R) (root) | ||||
| \\ // \\ // \\ // | ||||
| ===> (B) => (D) => (F) === | ||||
| Figure 1: Packet Replication: S transmits the same data packet twice, | ||||
| to its DP (A) and to its AP (B). | ||||
| 5.2. Dynamic Resource Reservation | ||||
| Wireless networks exhibit time-variant characteristics. Thus, the | Wireless networks exhibit time-variant characteristics. Thus, the | |||
| network has to provide additional resources along the path to fit the | network has to provide additional resources along the path to fit the | |||
| worst-case performance. These time-variant characteristics make | worst-case performance. These time-variant characteristics make | |||
| resource reservation very challenging: over-reaction wastes radio and | resource reservation very challenging: over-reaction wastes radio and | |||
| energy resources. Conversely, under-reaction jeopardizes network | energy resources. Conversely, under-reaction jeopardizes network | |||
| operations, and some SLOs may be violated. | operations, and some SLOs may be violated. | |||
| 5.3. Reliable Reconfiguration | 5.2. Reliable Reconfiguration | |||
| Wireless networks are known to be lossy. Thus, commands may or may | Wireless networks are known to be lossy. Thus, commands may or may | |||
| not be received by the node to be reconfigured. Unfortunately, | not be received by the node to be reconfigured. Unfortunately, | |||
| inconsistent states may create critical misconfigurations, where | inconsistent states may create critical misconfigurations, where | |||
| packets may be lost along a path because it has not been properly | packets may be lost along a path because it has not been properly | |||
| configured. | configured. | |||
| We have to propose mechanisms to guarantee that the network state is | We have to propose mechanisms to guarantee that the network state is | |||
| always consistent, even if some control packets are lost. Timeouts | always consistent, even if some control packets are lost. Timeouts | |||
| and retransmissions are not sufficient since the reconfiguration | and retransmissions are not sufficient since the reconfiguration | |||
| skipping to change at page 12, line 13 ¶ | skipping to change at page 12, line 11 ¶ | |||
| TBD | TBD | |||
| 9. Informative References | 9. Informative References | |||
| [anycast-property] | [anycast-property] | |||
| Teles Hermeto, R., Gallais, A., and F. Theoleyre, "Is | Teles Hermeto, R., Gallais, A., and F. Theoleyre, "Is | |||
| Link-Layer Anycast Scheduling Relevant for IEEE | Link-Layer Anycast Scheduling Relevant for IEEE | |||
| 802.15.4-TSCH Networks?", 2019, | 802.15.4-TSCH Networks?", 2019, | |||
| <https://doi.org/10.1109/LCNSymposium47956.2019.9000679>. | <https://doi.org/10.1109/LCNSymposium47956.2019.9000679>. | |||
| [detnet-oam] | ||||
| Theoleyre, F., Papadopoulos, G. Z., Mirsky, G., and C. J. | ||||
| Bernardos, "Operations, Administration and Maintenance | ||||
| (OAM) features for detnet", 2020, | ||||
| <https://tools.ietf.org/html/draft-theoleyre-detnet-oam- | ||||
| support>. | ||||
| [ipath] Gao, Y., Dong, W., Chen, C., Bu, J., Wu, W., and X. Liu, | [ipath] Gao, Y., Dong, W., Chen, C., Bu, J., Wu, W., and X. Liu, | |||
| "iPath: path inference in wireless sensor networks.", | "iPath: path inference in wireless sensor networks.", | |||
| 2016, <https://doi.org/10.1109/TNET.2014.2371459>. | 2016, <https://doi.org/10.1109/TNET.2014.2371459>. | |||
| [PREF-draft] | [PREF-draft] | |||
| Thubert, P., Eckert, T., Brodard, Z., and H. Jiang, "BIER- | Thubert, P., Eckert, T., Brodard, Z., and H. Jiang, "BIER- | |||
| TE extensions for Packet Replication and Elimination | TE extensions for Packet Replication and Elimination | |||
| Function (PREF) and OAM", 2018, | Function (PREF) and OAM", 2018, | |||
| <https://tools.ietf.org/html/draft-thubert-bier- | <https://tools.ietf.org/html/draft-thubert-bier- | |||
| replication-elimination>. | replication-elimination>. | |||
| skipping to change at page 12, line 41 ¶ | skipping to change at page 12, line 46 ¶ | |||
| Acronym in the IETF", BCP 161, RFC 6291, | Acronym in the IETF", BCP 161, RFC 6291, | |||
| DOI 10.17487/RFC6291, June 2011, | DOI 10.17487/RFC6291, June 2011, | |||
| <https://www.rfc-editor.org/info/rfc6291>. | <https://www.rfc-editor.org/info/rfc6291>. | |||
| [RFC7276] Mizrahi, T., Sprecher, N., Bellagamba, E., and Y. | [RFC7276] Mizrahi, T., Sprecher, N., Bellagamba, E., and Y. | |||
| Weingarten, "An Overview of Operations, Administration, | Weingarten, "An Overview of Operations, Administration, | |||
| and Maintenance (OAM) Tools", RFC 7276, | and Maintenance (OAM) Tools", RFC 7276, | |||
| DOI 10.17487/RFC7276, June 2014, | DOI 10.17487/RFC7276, June 2014, | |||
| <https://www.rfc-editor.org/info/rfc7276>. | <https://www.rfc-editor.org/info/rfc7276>. | |||
| [RFC7799] Morton, A., "Active and Passive Metrics and Methods (with | ||||
| Hybrid Types In-Between)", RFC 7799, DOI 10.17487/RFC7799, | ||||
| May 2016, <https://www.rfc-editor.org/info/rfc7799>. | ||||
| [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC | [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC | |||
| 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, | 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, | |||
| May 2017, <https://www.rfc-editor.org/info/rfc8174>. | May 2017, <https://www.rfc-editor.org/info/rfc8174>. | |||
| [RFC8655] Finn, N., Thubert, P., Varga, B., and J. Farkas, | [RFC8655] Finn, N., Thubert, P., Varga, B., and J. Farkas, | |||
| "Deterministic Networking Architecture", RFC 8655, | "Deterministic Networking Architecture", RFC 8655, | |||
| DOI 10.17487/RFC8655, October 2019, | DOI 10.17487/RFC8655, October 2019, | |||
| <https://www.rfc-editor.org/info/rfc8655>. | <https://www.rfc-editor.org/info/rfc8655>. | |||
| Authors' Addresses | Authors' Addresses | |||
| End of changes. 37 change blocks. | ||||
| 123 lines changed or deleted | 135 lines changed or added | |||