< draft-ietf-raw-oam-support-00.txt   draft-ietf-raw-oam-support-01.txt >
RAW F. Theoleyre RAW F. Theoleyre
Internet-Draft CNRS Internet-Draft CNRS
Updates: draft-theoleyre-raw-oam- G. Papadopoulos Updates: draft-ietf-raw-oam-support-00 G. Papadopoulos
support-04 (if approved) IMT Atlantique (if approved) IMT Atlantique
Intended status: Informational G. Mirsky Intended status: Informational G. Mirsky
Expires: October 21, 2021 ZTE Corp. Expires: November 25, 2021 ZTE Corp.
April 19, 2021 CJ. Bernardos
UC3M
May 24, 2021
Operations, Administration and Maintenance (OAM) features for RAW Operations, Administration and Maintenance (OAM) features for RAW
draft-ietf-raw-oam-support-00 draft-ietf-raw-oam-support-01
Abstract Abstract
Some critical applications may use a wireless infrastructure. Some critical applications may use a wireless infrastructure.
However, wireless networks exhibit a bandwidth of several orders of However, wireless networks exhibit a bandwidth of several orders of
magnitude lower than wired networks. Besides, wireless transmissions magnitude lower than wired networks. Besides, wireless transmissions
are lossy by nature; the probability that a packet cannot be decoded are lossy by nature; the probability that a packet cannot be decoded
correctly by the receiver may be quite high. In these conditions, correctly by the receiver may be quite high. In these conditions,
guaranteeing the network infrastructure works properly is guaranteeing that the network infrastructure works properly is
particularly challenging, since we need to address some issues particularly challenging, since we need to address some issues
specific to wireless networks. This document lists the requirements specific to wireless networks. This document lists the requirements
of the Operation, Administration, and Maintenance (OAM) features of the Operation, Administration, and Maintenance (OAM) features
recommended to construct a predictable communication infrastructure recommended to construct a predictable communication infrastructure
on top of a collection of wireless segments. This document describes on top of a collection of wireless segments. This document describes
the benefits, problems, and trade-offs for using OAM in wireless the benefits, problems, and trade-offs for using OAM in wireless
networks to achieve Service Level Objectives (SLO). networks to achieve Service Level Objectives (SLO).
Status of This Memo Status of This Memo
skipping to change at page 1, line 45 skipping to change at page 1, line 47
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/. Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on October 21, 2021. This Internet-Draft will expire on November 25, 2021.
Copyright Notice Copyright Notice
Copyright (c) 2021 IETF Trust and the persons identified as the Copyright (c) 2021 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(https://trustee.ietf.org/license-info) in effect on the date of (https://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
skipping to change at page 2, line 25 skipping to change at page 2, line 25
to this document. Code Components extracted from this document must to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License. described in the Simplified BSD License.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3
1.1. Terminology . . . . . . . . . . . . . . . . . . . . . . . 4 1.1. Terminology . . . . . . . . . . . . . . . . . . . . . . . 4
1.2. Acronyms . . . . . . . . . . . . . . . . . . . . . . . . 5 1.2. Acronyms . . . . . . . . . . . . . . . . . . . . . . . . 5
1.3. Requirements Language . . . . . . . . . . . . . . . . . . 5 1.3. Requirements Language . . . . . . . . . . . . . . . . . . 6
2. Role of OAM in RAW . . . . . . . . . . . . . . . . . . . . . 5 2. Role of OAM in RAW . . . . . . . . . . . . . . . . . . . . . 6
2.1. Link concept and quality . . . . . . . . . . . . . . . . 6 2.1. Link concept and quality . . . . . . . . . . . . . . . . 7
2.2. Broadcast Transmissions . . . . . . . . . . . . . . . . . 6 2.2. Broadcast Transmissions . . . . . . . . . . . . . . . . . 7
2.3. Complex Layer 2 Forwarding . . . . . . . . . . . . . . . 7 2.3. Complex Layer 2 Forwarding . . . . . . . . . . . . . . . 8
3. Operation . . . . . . . . . . . . . . . . . . . . . . . . . . 7 2.4. End-to-end delay . . . . . . . . . . . . . . . . . . . . 8
3.1. Information Collection . . . . . . . . . . . . . . . . . 7 3. Operation . . . . . . . . . . . . . . . . . . . . . . . . . . 8
3.2. Continuity Check . . . . . . . . . . . . . . . . . . . . 7 3.1. Information Collection . . . . . . . . . . . . . . . . . 8
3.3. Connectivity Verification . . . . . . . . . . . . . . . . 7 3.2. Continuity Check . . . . . . . . . . . . . . . . . . . . 9
3.4. Route Tracing . . . . . . . . . . . . . . . . . . . . . . 8 3.3. Connectivity Verification . . . . . . . . . . . . . . . . 9
3.5. Fault Verification/detection . . . . . . . . . . . . . . 8 3.4. Route Tracing . . . . . . . . . . . . . . . . . . . . . . 9
3.6. Fault Isolation/identification . . . . . . . . . . . . . 8 3.5. Fault Verification/detection . . . . . . . . . . . . . . 9
4. Administration . . . . . . . . . . . . . . . . . . . . . . . 9 3.6. Fault Isolation/identification . . . . . . . . . . . . . 10
4.1. Worst-case metrics . . . . . . . . . . . . . . . . . . . 9 4. Administration . . . . . . . . . . . . . . . . . . . . . . . 10
4.2. Efficient data retrieval . . . . . . . . . . . . . . . . 10 4.1. Worst-case metrics . . . . . . . . . . . . . . . . . . . 11
5. Maintenance . . . . . . . . . . . . . . . . . . . . . . . . . 10 4.2. Efficient data retrieval . . . . . . . . . . . . . . . . 11
5.1. Dynamic Resource Reservation . . . . . . . . . . . . . . 11 4.3. Reporting OAM packets to the source . . . . . . . . . . . 12
5.2. Reliable Reconfiguration . . . . . . . . . . . . . . . . 11 5. Maintenance . . . . . . . . . . . . . . . . . . . . . . . . . 12
6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 11 5.1. Soft transition after reconfiguration . . . . . . . . . . 12
7. Security Considerations . . . . . . . . . . . . . . . . . . . 11 5.2. Predictive maintenance . . . . . . . . . . . . . . . . . 12
8. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 11 6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 13
9. Informative References . . . . . . . . . . . . . . . . . . . 11 7. Security Considerations . . . . . . . . . . . . . . . . . . . 13
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 13 8. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 13
9. Informative References . . . . . . . . . . . . . . . . . . . 13
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 14
1. Introduction 1. Introduction
Reliable and Available Wireless (RAW) is an effort that extends Reliable and Available Wireless (RAW) is an effort that extends
DetNet to approach end-to-end deterministic performances over a DetNet to approach end-to-end deterministic performances over a
network that includes scheduled wireless segments. In wired network that includes scheduled wireless segments. In wired
networks, many approaches try to enable Quality of Service (QoS) by networks, many approaches try to enable Quality of Service (QoS) by
implementing traffic differentiation so that routers handle each type implementing traffic differentiation so that routers handle each type
of packets differently. However, this differentiated treatment was of packets differently. However, this differentiated treatment was
expensive for most applications. expensive for most applications.
Deterministic Networking (DetNet) [RFC8655] has proposed to provide a Deterministic Networking (DetNet) [RFC8655] has proposed to provide a
bounded end-to-end latency on top of the network infrastructure, bounded end-to-end latency on top of the network infrastructure,
comprising both Layer 2 bridged and Layer 3 routed segments. Their comprising both Layer 2 bridged and Layer 3 routed segments. Their
work encompasses the data plane, OAM, time synchronization, work encompasses the data plane, OAM, time synchronization,
management, control, and security aspects. management, control, and security aspects.
However, wireless networks create specific challenges. First of all, However, wireless networks create specific challenges. First of all,
radio bandwidth is significantly lower than for wired networks. In radio bandwidth is significantly lower than for wired networks. In
these conditions, the volume of signaling messages has to be very these conditions, the volume of signaling messages has to be very
limited. Even worse, wireless links are lossy: a layer 2 limited. Even worse, wireless links are lossy: a Layer 2
transmission may or may not be decoded correctly by the receiver, transmission may or may not be decoded correctly by the receiver,
depending on a broad set of parameters. Thus, providing high depending on a broad set of parameters. Thus, providing high
reliability through wireless segments is particularly challenging. reliability through wireless segments is particularly challenging.
Wired networks rely on the concept of _links_. All the devices Wired networks rely on the concept of _links_. All the devices
attached to a link receive any transmission. The concept of a link attached to a link receive any transmission. The concept of a link
in wireless networks is somewhat different from what many are used to in wireless networks is somewhat different from what many are used to
in wireline networks. A receiver may or may not receive a in wireline networks. A receiver may or may not receive a
transmission, depending on the presence of a colliding transmission, transmission, depending on the presence of a colliding transmission,
the radio channel's quality, and the external interference. Besides, the radio channel's quality, and the external interference. Besides,
skipping to change at page 4, line 9 skipping to change at page 4, line 9
primary importance for IP networks [RFC7276]. It defines a toolset primary importance for IP networks [RFC7276]. It defines a toolset
for fault detection, isolation, and performance measurement. for fault detection, isolation, and performance measurement.
The primary purpose of this document is to detail the specific The primary purpose of this document is to detail the specific
requirements of the OAM features recommended to construct a requirements of the OAM features recommended to construct a
predictable communication infrastructure on top of a collection of predictable communication infrastructure on top of a collection of
wireless segments. This document describes the benefits, problems, wireless segments. This document describes the benefits, problems,
and trade-offs for using OAM in wireless networks to provide and trade-offs for using OAM in wireless networks to provide
availability and predictability. availability and predictability.
1.1. Terminology
In this document, the term OAM will be used according to its In this document, the term OAM will be used according to its
definition specified in [RFC6291]. We expect to implement an OAM definition specified in [RFC6291]. We expect to implement an OAM
framework in RAW networks to maintain a real-time view of the network framework in RAW networks to maintain a real-time view of the network
infrastructure, and its ability to respect the Service Level infrastructure, and its ability to respect the Service Level
Objectives (SLO), such as delay and reliability, assigned to each Objectives (SLO), such as delay and reliability, assigned to each
data flow. data flow.
1.1. Terminology
We re-use here the same terminology as [detnet-oam]: We re-use here the same terminology as [detnet-oam]:
o OAM entity: a data flow to be controlled; o OAM entity: a data flow to be monitored for defects and/or its
performance metrics measured;
o Maintenance End Point (MEP): OAM devices crossed when entering/ o Maintenance End Point (MEP): OAM devices crossed when entering/
exiting the network. In RAW, it corresponds mostly to the source exiting the network. In RAW, it corresponds mostly to the source
or destination of a data flow. OAM message can be exchanges or destination of a data flow. OAM message can be exchanged
between two MEPs; between two MEPs;
o Maintenance Intermediate endPoint (MIP): OAM devices along the o Maintenance Intermediate endPoint (MIP): an OAM system along the
flow; OAM messages can be exchanged between a MEP and a MIP; flow; a MIP MAY respond to an OAM message generated by the MEP;
o control/data plane: while the control plane expects to configure
and control the network (long-term), the data plane takes the
individual decision;
o passive / active methods (as defined in [RFC7799]): active methods o control/management/data plane: the control and management planes
send additionnal control information (inserting novel fields, are used to configure and control the network (long-term). The
generating novel control packets). Passive methods infer data plane takes the individual decision. Relative to a data
information just by observing unmodified existing flows. flow, the control and/or management plane can be out-of-band;
o active methods may implement one of these two strategies: o Active measurement methods (as defined in [RFC7799]) modify a
normal data flow by inserting novel fields or by injecting specially
constructed test packets [RFC2544]. It is critical for the
quality of information obtained using an active method that
generated test packets are in-band with the monitored data flow.
In other words, a test packet is required to cross the same
network nodes and links and receive the same Quality of Service
(QoS) treatment as a data packet. Active methods may implement
one of these two strategies:
* In-band: control information follows the same path as the data * In-band: control information follows the same path as the data
packets. In other words, a failure in the data plane may packets. In other words, a failure in the data plane may
prevent the control information to reach the destination (e.g., prevent the control information to reach the destination (e.g.,
end-device or controller). end-device or controller).
* out-of-band: control information is sent separately from the * out-of-band: control information is sent separately from the
data packets. Thus, the behavior of control vs. data packets data packets. Thus, the behavior of control vs. data packets
may differ; may differ;
o Passive measurement methods [RFC7799] infer information by
observing unmodified existing flows.
We also adopt the following terminology, which is particularly We also adopt the following terminology, which is particularly
relevant for RAW segments. relevant for RAW segments.
o piggybacking vs. dedicated control packets: control information o piggybacking vs. dedicated control packets: control information
may be encapsulated in specific (dedicated) control packets. may be encapsulated in specific (dedicated) control packets.
Alternatively, it may be piggybacked in existing data packets, Alternatively, it may be piggybacked in existing data packets,
when the MTU is larger than the actual packet length. when the MTU is larger than the actual packet length.
Piggybacking makes specifically sense in wireless networks: the Piggybacking makes specifically sense in wireless networks, as the
cost (bandwidth and energy) is not linear with the packet size. cost (bandwidth and energy) is not linear with the packet size.
o router-over vs. mesh under: a control packet is either forwarded o router-over vs. mesh under: a control packet is either forwarded
directly to the layer-3 next hop (mesh under) or handled hop-by- directly to the layer-3 next hop (mesh under) or handled hop-by-
hop by each router. While the latter option consumes more hop by each router. While the latter option consumes more
resource, it allows to collect additionnal intermediary resources, it allows to collect additionnal intermediary
information, particularly relevant in wireless networks. information, particularly relevant in wireless networks.
o Defect: a temporary change in the network (e.g., a radio link o Defect: a temporary change in the network (e.g., a radio link
which is broken due to a mobile obstacle); which is broken due to a mobile obstacle);
o Fault: a definite change which may affect the network performance, o Fault: a definite change which may affect the network performance,
e.g., a node runs out of energy. e.g., a node runs out of energy.
o End-to-end delay: the time between the packet generation and its
reception by the destination.
1.2. Acronyms 1.2. Acronyms
OAM Operations, Administration, and Maintenance OAM Operations, Administration, and Maintenance
DetNet Deterministic Networking DetNet Deterministic Networking
SLO Service Level Objective PSE Path Selection Engine [I-D.pthubert-raw-architecture]
QoS Quality of Service QoS Quality of Service
SNMP Simple Network Management Protocol RAW Reliable and Available Wireless
SLO Service Level Objective
SNMP Simple Network Management Protocol
SDN Software-Defined Network SDN Software-Defined Network
1.3. Requirements Language 1.3. Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in BCP "OPTIONAL" in this document are to be interpreted as described in BCP
14 [RFC2119] [RFC8174] when, and only when, they appear in all 14 [RFC2119] [RFC8174] when, and only when, they appear in all
capitals, as shown here. capitals, as shown here.
2. Role of OAM in RAW 2. Role of OAM in RAW
RAW networks expect to make the communications reliable and RAW networks expect to make the communications reliable and
predictable on top of a wireless network infrastructure. Most predictable on top of a wireless network infrastructure. Most
critical applications will define an SLO to be required for the data critical applications will define an SLO to be required for the data
flows it generates. RAW considers network plane protocol elements flows it generates. RAW considers network plane protocol elements
such as OAM to improve the RAW operation at the service and the such as OAM to improve the RAW operation at the service and the
forwarding sub-layers. forwarding sub-layers.
To respect strict guarantees, RAW relies on an orchestrator able to To respect strict guarantees, RAW relies on the Path Selection Engine
monitor and maintain the network. Typically, a Software-Defined (PSE) (as defined in [I-D.pthubert-raw-architecture]) to monitor and
Network (SDN) controller is in charge of scheduling the transmissions maintain the network. As an example, a Software-Defined Network
in the deployed network, based on the radio link characteristics, SLO (SDN) controller may be used to schedule the transmissions in the
of the flows, the number of packets to forward. Thus, resources have deployed network, based on the radio link characteristics, SLO of the
to be provisioned a priori to handle any defect. OAM represents the flows, the number of packets to forward. Thus, resources have to be
core of the pre-provisioning process and maintains the network provisioned a priori to handle any defect. OAM represents the core
operational by updating the schedule dynamically. of the pre-provisioning process and maintains the network operational
by updating the schedule dynamically.
Fault-tolerance also assumes that multiple paths have to be Fault-tolerance also assumes that multiple paths have to be
provisioned so that an end-to-end circuit keeps on existing whatever provisioned so that an end-to-end circuit keeps on existing whatever
the conditions. The Packet Replication and Elimination Function the conditions. The Packet Replication and Elimination Function
([PREF-draft]) on a node is typically controlled by a central ([PREF-draft]) on a node is typically controlled by the PSE. OAM
controller/orchestrator. OAM mechanisms can be used to monitor that mechanisms can be used to monitor that PREOF is working correctly on
PREOF is working correctly on a node and within the domain. a node and within the domain.
To be energy-efficient, reserving some dedicated out-of-band To be energy-efficient, reserving some dedicated out-of-band
resources for OAM seems idealistic, and only in-band solutions are resources for OAM seems idealistic, and only in-band solutions are
considered here. considered here.
RAW supports both proactive and on-demand troubleshooting. RAW supports both proactive and on-demand troubleshooting.
The specific characteristics of RAW are discussed below. The specific characteristics of RAW are discussed below.
2.1. Link concept and quality 2.1. Link concept and quality
In wireless networks, a _link_ does not exist physically. A common In wireless networks, a _link_ does not exist physically. A device
convention is to define a wireless link as a pair of devices that has a set of *neighbors* that correspond to all the devices that have
have a non-null probability of exchanging a packet that the receiver a non null probability of receiving correctly its packets. We make a
can decode. Similarly, we designate as *neighbor* any device with a distinction between:
radio link with a specific transmitter.
o point-to-point (p2p) link with one transmitter and one receiver.
These links are used to transmit unicast packets.
o point-to-multipoint (p2m) link associates one transmitter and a
collection of receivers. For instance, broadcast packets assume
the existence of p2m links to avoid duplicating a broadcast packet
to reach each possible radio neighbor.
In scheduled radio networks, p2m and p2p links are commonly not
scheduled simultaneously to save energy. More precisely, only a
subset of the neighbors may wake up at a given instant.
Anycast is used on p2m links to improve reliability. A
collection of receivers is scheduled to wake up simultaneously, so
that the transmission fails only if none of the receivers is able to
decode the packet.
Each wireless link is associated with a link quality, often measured Each wireless link is associated with a link quality, often measured
as the Packet Delivery Ratio (PDR), i.e., the probability that the as the Packet Delivery Ratio (PDR), i.e., the probability that the
receiver can decode the packet correctly. It is worth noting that receiver can decode the packet correctly. It is worth noting that
this link quality depends on many criteria, such as the level of this link quality depends on many criteria, such as the level of
external interference, the presence of concurrent transmissions, or external interference, the presence of concurrent transmissions, or
the radio channel state. This link quality is even time-variant. the radio channel state. This link quality is even time-variant.
For p2m links, we consequently have a collection of PDRs (one value
per receiver). More sophisticated, aggregated metrics also exist
for these p2m links, such as those in [anycast-property].
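As an illustration of the anycast property described above, the following minimal Python sketch computes the probability that a p2m transmission succeeds, i.e., that at least one scheduled receiver decodes the packet. It assumes that receptions are statistically independent across receivers, which may hold only for some neighbors in practice. The function name and the input format (a list of per-receiver PDR values) are hypothetical and are not defined by this document.

```python
from math import prod


def anycast_delivery_probability(pdrs):
    """Probability that at least one receiver decodes the packet.

    `pdrs` holds one Packet Delivery Ratio per scheduled receiver.
    ASSUMPTION: receptions are independent across receivers.
    """
    if any(not 0.0 <= p <= 1.0 for p in pdrs):
        raise ValueError("each PDR must lie in [0, 1]")
    # The transmission fails only if every receiver fails to decode.
    failure = prod(1.0 - p for p in pdrs)
    return 1.0 - failure


if __name__ == "__main__":
    # Hypothetical p2m link with three receivers of unequal quality.
    print(anycast_delivery_probability([0.6, 0.5, 0.3]))
```

If receptions are correlated (e.g., all receivers suffer from the same interferer), this value overestimates the actual reliability, which is one reason why measured, aggregated metrics may be preferable.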
2.2. Broadcast Transmissions 2.2. Broadcast Transmissions
In modern switching networks, the unicast transmission is delivered In modern switching networks, the unicast transmission is delivered
uniquely to the destination. Wireless networks are much closer to uniquely to the destination. Wireless networks are much closer to
the ancient *shared access* networks. Practically, unicast and the ancient *shared access* networks. Practically, unicast and
broadcast frames are handled similarly at the physical layer. The broadcast frames are handled similarly at the physical layer. The
link layer is just in charge of filtering the frames to discard link layer is just in charge of filtering the frames to discard
irrelevant receptions (e.g., different unicast MAC address). irrelevant receptions (e.g., different unicast MAC address).
However, contrary to wired networks, we cannot be sure that a packet However, contrary to wired networks, we cannot be sure that a packet
is received by *all* the devices attached to the layer-2 segment. It is received by *all* the devices attached to the Layer 2 segment. It
depends on the radio channel state between the transmitter(s) and the depends on the radio channel state between the transmitter(s) and the
receiver(s). In particular, concurrent transmissions may be possible receiver(s). In particular, concurrent transmissions may be possible
or not, depending on the radio conditions (e.g., do the different or not, depending on the radio conditions (e.g., do the different
transmitters use a different radio channel or are they sufficiently transmitters use a different radio channel or are they sufficiently
spatially separated?) spatially separated?)
2.3. Complex Layer 2 Forwarding 2.3. Complex Layer 2 Forwarding
Multiple neighbors may receive a transmission. Thus, anycast layer-2 Multiple neighbors may receive a transmission. Thus, anycast Layer 2
forwarding helps to maximize the reliability by assigning multiple forwarding helps to maximize the reliability by assigning multiple
receivers to a single transmission. That way, the packet is lost receivers to a single transmission. That way, the packet is lost
only if *none* of the receivers decode it. Practically, it has been only if *none* of the receivers decode it. Practically, it has been
proven that different neighbors may exhibit very different radio proven that different neighbors may exhibit very different radio
conditions, and that reception independency may hold for some of them conditions, and that reception independency may hold for some of them
[anycast-property]. [anycast-property].
2.4. End-to-end delay
In a wireless network, additional transmission opportunities are
provisioned to accommodate packet losses. Thus, the end-to-end delay
consists of:
o Transmission delay, which is fixed and depends mainly on the data
rate, and the presence or absence of an acknowledgement.
o Residence time, which corresponds to the buffering delay and depends
on the schedule. To account for retransmissions, the residence time
is equal to the difference between the time of last reception from
the previous hop (among all the retransmissions) and the time of
emission of the last retransmission.
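The decomposition above can be sketched in Python as follows. This is only an illustrative model under stated assumptions: each hop reports two timestamps and a fixed transmission delay, clocks are synchronized, and the end-to-end delay is taken as the sum of the per-hop residence times and transmission delays. The `HopRecord` type and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class HopRecord:
    """Per-hop timing information (hypothetical structure, in seconds)."""
    last_reception: float      # last reception from the previous hop
    last_emission: float       # emission of the last retransmission
    transmission_delay: float  # fixed delay of one transmission


def residence_time(hop):
    """Buffering delay at one hop, following the definition above."""
    return hop.last_emission - hop.last_reception


def end_to_end_delay(hops):
    """ASSUMPTION: the delay is the sum of per-hop residence times
    and transmission delays along the path."""
    return sum(residence_time(h) + h.transmission_delay for h in hops)


if __name__ == "__main__":
    # Two-hop path; for the source, `last_reception` is taken to be
    # the packet generation time (an assumption of this sketch).
    path = [
        HopRecord(last_reception=0.000, last_emission=0.030,
                  transmission_delay=0.004),
        HopRecord(last_reception=0.034, last_emission=0.050,
                  transmission_delay=0.004),
    ]
    print(end_to_end_delay(path))
```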
3. Operation 3. Operation
OAM features will enable RAW with robust operation both for OAM features will enable RAW with robust operation both for
forwarding and routing purposes. forwarding and routing purposes.
3.1. Information Collection 3.1. Information Collection
The model to exchange information should be the same as for detnet The model to exchange information should be the same as for DetNet
network, for the sake of inter-operability. YANG may typically network, for the sake of inter-operability. YANG may typically
fulfill this objective. fulfill this objective.
However, RAW networks imply specific constraints (e.g., low However, RAW networks imply specific constraints (e.g., low
bandwidth, packet losses, cost of medium access) that may require to bandwidth, packet losses, cost of medium access) that may require to
minimize the volume of information to collect. Thus, we discuss in minimize the volume of information to collect. Thus, we discuss in
Section 4.2 the different ways to collect information, i.e., transfer Section 4.2 different ways to collect information, i.e., transfer
physically the OAM information from the emitter to the receiver. physically the OAM information from the emitter to the receiver.
3.2. Continuity Check 3.2. Continuity Check
Similarly to detnet, we need to verify that the source and the Similarly to DetNet, we need to verify that the source and the
destination are connected (at least one valid path exists). destination are connected (at least one valid path exists).
3.3. Connectivity Verification 3.3. Connectivity Verification
As in detnet, we have to verify the absence of misconnection. We As in DetNet, we have to verify the absence of misconnection. We
will focus here on the RAW specificities. focus here on the RAW specificities.
Because of radio transmissions' broadcast nature, several receivers Because of radio transmissions' broadcast nature, several receivers
may be active at the same time to enable anycast Layer 2 forwarding. may be active at the same time to enable anycast Layer 2 forwarding.
Thus, the connectivity verification must test any combination. We Thus, the connectivity verification must test any combination. We
also consider priority-based mechanisms for anycast forwarding, i.e., also consider priority-based mechanisms for anycast forwarding, i.e.,
all the receivers have different probabilities of forwarding a all the receivers have different probabilities of forwarding a
packet. To verify a delay SLO for a given flow, we must also packet. To verify a delay SLO for a given flow, we must also
consider all the possible combinations, leading to a probability consider all the possible combinations, leading to a probability
distribution function for end-to-end transmissions. If this distribution function for end-to-end transmissions. If this
verification is implemented naively, the number of combinations to verification is implemented naively, the number of combinations to
skipping to change at page 8, line 28 skipping to change at page 9, line 38
Wireless networks are meshed by nature: we have many redundant radio Wireless networks are meshed by nature: we have many redundant radio
links. These meshed networks are both an asset and a drawback: while links. These meshed networks are both an asset and a drawback: while
several paths exist between two endpoints, and we should choose the several paths exist between two endpoints, and we should choose the
most efficient one(s), concerning specifically the reliability, and most efficient one(s), concerning specifically the reliability, and
the delay. the delay.
Thus, multipath routing can be considered to make the network fault- Thus, multipath routing can be considered to make the network fault-
tolerant. Even better, we can exploit the broadcast nature of tolerant. Even better, we can exploit the broadcast nature of
wireless networks to enable meshed multipath routing: we may have wireless networks to enable meshed multipath routing: we may have
multiple Maintenance Intermediate Endpoints (MIE) for each hop in the multiple Maintenance Intermediate Endpoints (MIP) for each hop in the
path. In that way, each Maintenance Intermediate Endpoint has path. In that way, each Maintenance Intermediate Endpoint has
several possible next hops in the forwarding plane. Thus, all the several possible next hops in the forwarding plane. Thus, all the
possible paths between two maintenance endpoints should be retrieved, possible paths between two maintenance endpoints should be retrieved,
which may quickly become intractable if we apply a naive approach. which may quickly become intractable if we apply a naive approach.
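The growth in the number of paths can be illustrated with a minimal sketch. It assumes a hypothetical layered mesh in which every hop offers the same number of candidate relays and each relay can reach every relay of the next hop; the names `layered_mesh`, `count_paths_naive`, and `count_paths_memo` are illustrative and do not come from the draft. Naive enumeration visits every combination, whereas a memoized walk counts the same paths while visiting each node only once.

```python
from functools import lru_cache
from itertools import product


def layered_mesh(hops, width):
    """Build a hypothetical mesh: 'S' -> hops layers of `width` relays -> 'D'."""
    layers = [[(h, i) for i in range(width)] for h in range(hops)]
    next_hops = {"S": tuple(layers[0])}
    for current, following in zip(layers, layers[1:]):
        for node in current:
            next_hops[node] = tuple(following)
    for node in layers[-1]:
        next_hops[node] = ("D",)
    return next_hops, layers


def count_paths_naive(layers):
    """Enumerate one relay per hop: cost grows as width ** hops."""
    return sum(1 for _ in product(*layers))


def count_paths_memo(next_hops, source="S", destination="D"):
    """Count the same paths, visiting each node once thanks to memoization."""
    @lru_cache(maxsize=None)
    def walk(node):
        if node == destination:
            return 1
        return sum(walk(n) for n in next_hops.get(node, ()))
    return walk(source)


graph, layers = layered_mesh(hops=10, width=3)
print(count_paths_naive(layers), count_paths_memo(graph))
```

With 10 hops and 3 candidate relays per hop, both functions report 3^10 paths, but only the naive one has to iterate over each of them.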
3.5. Fault Verification/detection 3.5. Fault Verification/detection
Wired networks tend to present stable performances. On the contrary, Wired networks tend to present stable performances. On the contrary,
wireless networks are time-variant. We must consequently make a wireless networks are time-variant. We must consequently make a
distinction between _normal_ evolutions and malfunction. distinction between _normal_ evolutions and malfunction.
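One possible way to separate normal fluctuations from a malfunction is sketched below. It is only an illustration, not a mechanism specified by the draft: the class name `DriftDetector`, the smoothing factor `alpha`, the tolerance `k`, and the `warmup` length are all assumptions. The detector tracks an exponentially weighted mean and mean deviation of a metric (for instance a packet delivery ratio) and flags a sample that deviates by more than `k` times the usual deviation.

```python
class DriftDetector:
    """Flag samples that deviate far more than the usual fluctuation."""

    def __init__(self, alpha=0.1, k=3.0, warmup=10):
        self.alpha = alpha      # smoothing factor (assumed value)
        self.k = k              # tolerated multiple of the usual deviation
        self.warmup = warmup    # samples to observe before flagging anything
        self.mean = None
        self.dev = 0.0
        self.samples = 0

    def update(self, value):
        """Return True when `value` looks like a malfunction."""
        if self.mean is None:
            self.mean = value
            self.samples = 1
            return False
        error = abs(value - self.mean)
        suspicious = (self.samples >= self.warmup
                      and error > self.k * max(self.dev, 1e-9))
        if not suspicious:
            # Only normal samples update the baseline, so an outlier
            # does not contaminate the estimate of "normal" behaviour.
            self.mean += self.alpha * (value - self.mean)
            self.dev += self.alpha * (error - self.dev)
        self.samples += 1
        return suspicious
```

A limitation of this sketch is that a lasting but legitimate change keeps being flagged, because flagged samples never update the baseline; a real deployment would need a rule to re-learn the baseline.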
3.6. Fault Isolation/identification 3.6. Fault Isolation/identification
The network has isolated and identified the cause of the fault. The network has isolated and identified the cause of the fault.
While detnet already expects to identify malfunctions, some problems While DetNet already expects to identify malfunctions, some problems
are specific to wireless networks. We must consequently collect are specific to wireless networks. We must consequently collect
metrics and implement algorithms tailored for wireless networking. metrics and implement algorithms tailored for wireless networking.
For instance, the decrease in the link quality may be caused by For instance, the decrease in the link quality may be caused by
several factors: external interference, obstacles, multipath fading, several factors: external interference, obstacles, multipath fading,
mobility. It is fundamental to be able to discriminate the different mobility. It is fundamental to be able to discriminate the different
causes to make the right decision. causes to make the right decision.
4. Administration 4. Administration
skipping to change at page 9, line 26 skipping to change at page 10, line 38
in wireless to denote the link quality. The radio chipset is in in wireless to denote the link quality. The radio chipset is in
charge of translating a received signal strength into a normalized charge of translating a received signal strength into a normalized
quality indicator; quality indicator;
o Delay: the time elapsed between a packet generation / enqueuing o Delay: the time elapsed between a packet generation / enqueuing
and its reception by the next hop; and its reception by the next hop;
o Buffer occupancy: the number of packets present in the buffer, for o Buffer occupancy: the number of packets present in the buffer, for
each of the existing flows. each of the existing flows.
o Battery lifetime: the expected remaining battery lifetime of the
device. Since many RAW devices might be battery powered, this is
an important metric for an operator to make proper decisions.
o Mobility: if a device is known to be mobile, this might be
considered by an operator to make proper decisions.
These metrics should be collected per device, virtual circuit, and These metrics should be collected per device, virtual circuit, and
path, as detnet already does. However, RAW has to deal with a path, as detnet already does. However, RAW has to deal with a
finer granularity: finer granularity:
o per radio channel to measure, e.g., the level of external o per radio channel to measure, e.g., the level of external
interference, and to be able to apply counter-measures (e.g., interference, and to be able to apply counter-measures (e.g.,
blacklisting). blacklisting).
o per link to detect misbehaving link (asymmetrical link, o per link to detect misbehaving link (asymmetrical link,
fluctuating quality). fluctuating quality).
o per resource block: a collision in the schedule is particularly o per resource block: a collision in the schedule is particularly
challenging to identify in radio networks with spectrum reuse. In challenging to identify in radio networks with spectrum reuse. In
particular, a collision may not be systematic (depending on the particular, a collision may not be systematic (depending on the
radio characteristics and the traffic profile) radio characteristics and the traffic profile)
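The finer granularity listed above could be represented by keying each sample on its scope. The following is a minimal sketch under assumptions: the class `MetricStore`, the scope names (`"channel"`, `"link"`, `"resource_block"`), and the metric names are hypothetical and are not defined by the draft.

```python
from collections import defaultdict
from statistics import mean


class MetricStore:
    """Keep samples per (scope, key, metric), e.g. ("channel", 11, "rssi")."""

    def __init__(self):
        self._samples = defaultdict(list)

    def record(self, scope, key, metric, value):
        self._samples[(scope, key, metric)].append(value)

    def average(self, scope, key, metric):
        values = self._samples.get((scope, key, metric))
        return mean(values) if values else None

    def keys_below(self, scope, metric, threshold):
        """List the keys of one scope whose average is under a threshold."""
        return sorted(
            key for (s, key, m), values in self._samples.items()
            if s == scope and m == metric and mean(values) < threshold
        )


store = MetricStore()
store.record("channel", 11, "pdr", 0.95)
store.record("channel", 26, "pdr", 0.40)   # e.g. external interference
store.record("link", ("A", "B"), "pdr", 0.90)
print(store.keys_below("channel", "pdr", 0.5))
```

In this example, only channel 26 falls below the threshold, which makes it a candidate for a counter-measure such as blacklisting.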
4.1. Worst-case metrics 4.1. Worst-case metrics
RAW inherits the same requirements as detnet: we need to know the RAW inherits the same requirements as DetNet: we need to know the
distribution of a collection of metrics. However, wireless networks distribution of a collection of metrics. However, wireless networks
are know to be highly variable. Changes may be frequent, and may are known to be highly variable. Changes may be frequent, and may
exhibit a periodical pattern. Collecting and analyzing this amount exhibit a periodical pattern. Collecting and analyzing this amount
of measurements is challenging. of measurements is challenging.
Wireless networks are known to be lossy, and RAW has to implement Wireless networks are known to be lossy, and RAW has to implement
strategies to improve reliability on top of unreliable links. Hybrid strategies to improve reliability on top of unreliable links. Hybrid
Automatic Repeat reQuest (ARQ) typically has to enable Automatic Repeat reQuest (ARQ) typically has to enable
retransmissions based on the end-to-end reliability and latency retransmissions based on the end-to-end reliability and latency
requirements. requirements.
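The interplay between reliability and latency can be made concrete with a small worked sketch. It assumes that transmission attempts fail independently with the same probability, which is a simplification for real radio links; the function names and parameters are illustrative. After n attempts on a link with packet delivery ratio p, the delivery probability is 1 - (1 - p)^n, so the retransmission budget is the smallest n that reaches the target, and it is usable only if n attempts fit within the deadline.

```python
def attempts_needed(pdr, target, max_attempts=64):
    """Smallest number of attempts reaching `target` delivery probability.

    Assumes independent losses with a constant packet delivery ratio `pdr`.
    Returns None when the target is not reachable within `max_attempts`.
    """
    failure = 1.0
    for n in range(1, max_attempts + 1):
        failure *= (1.0 - pdr)          # probability that all n attempts fail
        if 1.0 - failure >= target:
            return n
    return None


def fits_deadline(pdr, target, attempt_duration, deadline):
    """Check whether the retransmission budget respects a latency bound."""
    n = attempts_needed(pdr, target)
    return n is not None and n * attempt_duration <= deadline


print(attempts_needed(0.7, 0.99))            # attempts for 99% on a 70% link
print(fits_deadline(0.7, 0.99, 10.0, 30.0))  # 10 ms per attempt, 30 ms bound
```

For a 70% link and a 99% target, four attempts are needed (the residual failure probability is 0.3^4 = 0.0081), so a 30 ms bound with 10 ms attempts cannot be met on that link alone.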
4.2. Efficient data retrieval 4.2. Efficient data retrieval
skipping to change at page 10, line 44 skipping to change at page 12, line 15
the headers to identify the path followed by a packet a the headers to identify the path followed by a packet a
posteriori. posteriori.
hierarchical monitoring; localized and centralized mechanisms have hierarchical monitoring; localized and centralized mechanisms have
to be combined together. Typically, a local mechanism should to be combined together. Typically, a local mechanism should
continuously monitor a set of metrics and trigger distant OAM continuously monitor a set of metrics and trigger distant OAM
exchanges only when a fault is detected (but possibly not exchanges only when a fault is detected (but possibly not
identified). For instance, local temporary defects must not identified). For instance, local temporary defects must not
trigger expensive OAM transmissions. trigger expensive OAM transmissions.
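The local part of such a hierarchical scheme might look like the following sketch. It is an assumption-laden illustration: the class `LocalMonitor`, the `threshold` and `persistence` parameters, and the `report` callback standing in for a distant OAM exchange are all hypothetical. A defect is reported only once it has persisted for several consecutive samples, so temporary local defects stay local.

```python
class LocalMonitor:
    """Report a defect upstream only when it persists locally."""

    def __init__(self, threshold, persistence, report):
        self.threshold = threshold      # metric value below which a sample is bad
        self.persistence = persistence  # consecutive bad samples before reporting
        self.report = report            # callback standing in for an OAM exchange
        self.bad_streak = 0
        self.reported = False

    def observe(self, value):
        if value < self.threshold:
            self.bad_streak += 1
        else:
            # The link recovered: forget the streak and re-arm the report.
            self.bad_streak = 0
            self.reported = False
        if self.bad_streak >= self.persistence and not self.reported:
            self.report(value)
            self.reported = True


reports = []
monitor = LocalMonitor(threshold=0.5, persistence=3, report=reports.append)
for sample in [0.9, 0.4, 0.9, 0.4, 0.4, 0.4, 0.4, 0.9, 0.4, 0.4, 0.4]:
    monitor.observe(sample)
print(len(reports))
```

In this trace, the isolated bad sample triggers nothing, and each of the two sustained degradations triggers exactly one report.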
4.3. Reporting OAM packets to the source
TODO: statistics are collected when a packet goes from the source to
the destination. However, it has to be also reported by the source.
Problem: resource may not be reserved bidirectionally. Even worse:
the inverse path may not exist.
5. Maintenance 5. Maintenance
RAW needs to implement a self-healing and self-optimization approach. Maintenance needs to facilitate the maintenance (repairs and
upgrades). In wireless networks, repairs are expected to occur much
more frequently, since the link quality may be highly time-variant.
Thus, maintenance represents a key feature for RAW.
5.1. Soft transition after reconfiguration
Because of the wireless medium, the link quality may fluctuate, and
the network needs to reconfigure itself continuously. During this
transient state, flows may begin to be gradually re-forwarded,
consuming resources in different parts of the network. OAM has to
make a distinction between a metric that changed because of a legal
network change (e.g., flow redirection) and an unexpected event
(e.g., a fault).
5.2. Predictive maintenance
RAW needs to implement self-optimization features. While the network
is configured to be fault-tolerant, a reconfiguration may be required
to keep on respecting long-term objectives. Obviously, the network
keeps on respecting the SLO after a node's crash, but a
reconfiguration is required to handle the future faults. In other
words, the reconfiguration delay MUST be strictly smaller than the
inter-fault time.
The network must continuously retrieve the state of the network, to The network must continuously retrieve the state of the network, to
judge the relevance of a reconfiguration, quantifying: judge the relevance of a reconfiguration, quantifying:
the cost of the sub-optimality: resources may not be used the cost of the sub-optimality: resources may not be used
optimally (e.g., a better path exists); optimally (e.g., a better path exists);
the reconfiguration cost: the controller needs to trigger some the reconfiguration cost: the controller needs to trigger some
reconfigurations. For this transient period, resources may be reconfigurations. For this transient period, resources may be
twice reserved, and control packets have to be transmitted. twice reserved, and control packets have to be transmitted.
Thus, reconfiguration may only be triggered if the gain is Thus, reconfiguration may only be triggered if the gain is
significant. significant.
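The trade-off between the two costs can be sketched as a simple decision rule. The draft does not define how the costs are measured, so everything here is an assumption: the function name, the idea of expressing the sub-optimality as a cost rate over a planning horizon, and the `margin` that encodes what "significant" means.

```python
def should_reconfigure(suboptimality_cost_rate, horizon,
                       reconfiguration_cost, margin=1.5):
    """Decide whether a reconfiguration is worth triggering.

    suboptimality_cost_rate: resources wasted per unit of time by the
        current configuration (hypothetical unit, e.g. cells per second).
    horizon: time over which the new configuration is expected to hold.
    reconfiguration_cost: one-off cost of the transient period (doubly
        reserved resources plus control packets), in the same unit.
    margin: how much the gain must exceed the cost to count as significant.
    """
    expected_gain = suboptimality_cost_rate * horizon
    return expected_gain > margin * reconfiguration_cost


print(should_reconfigure(2.0, 60.0, 50.0))   # gain 120 vs. 75 required
print(should_reconfigure(0.5, 60.0, 50.0))   # gain 30 vs. 75 required
```

A margin above 1 avoids oscillating between configurations whose gains barely cover the transient cost.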
5.1. Dynamic Resource Reservation
Wireless networks exhibit time-variant characteristics. Thus, the
network has to provide additional resources along the path to fit the
worst-case performance. These time-variant characteristics make the
resource reservation very challenging: over-reaction wastes radio and
energy resources. Inversely, under-reaction jeopardizes the network
operations, and some SLO may be violated.
5.2. Reliable Reconfiguration
Wireless networks are known to be lossy. Thus, commands may be
received or not by the node to reconfigure. Unfortunately,
inconsistent states may create critical misconfigurations, where
packets may be lost along a path because it has not been properly
configured.
We have to propose mechanisms to guarantee that the network state is
always consistent, even if some control packets are lost. Timeouts
and retransmissions are not sufficient since the reconfiguration
duration would be, in that case, unbounded.
6. IANA Considerations 6. IANA Considerations
This document has no actionable requirements for IANA. This section This document has no actionable requirements for IANA. This section
can be removed before the publication. can be removed before the publication.
7. Security Considerations 7. Security Considerations
This section will be expanded in future versions of the draft. This section will be expanded in future versions of the draft.
8. Acknowledgments 8. Acknowledgments
skipping to change at page 12, line 18 skipping to change at page 13, line 43
802.15.4-TSCH Networks?", 2019, 802.15.4-TSCH Networks?", 2019,
<https://doi.org/10.1109/LCNSymposium47956.2019.9000679>. <https://doi.org/10.1109/LCNSymposium47956.2019.9000679>.
[detnet-oam] [detnet-oam]
Theoleyre, F., Papadopoulos, G. Z., Mirsky, G., and C. J. Theoleyre, F., Papadopoulos, G. Z., Mirsky, G., and C. J.
Bernardos, "Operations, Administration and Maintenance Bernardos, "Operations, Administration and Maintenance
(OAM) features for detnet", 2020, (OAM) features for detnet", 2020,
<https://tools.ietf.org/html/draft-theoleyre-detnet-oam- <https://tools.ietf.org/html/draft-theoleyre-detnet-oam-
support>. support>.
[I-D.pthubert-raw-architecture]
Thubert, P., Papadopoulos, G. Z., and R. Buddenberg,
"Reliable and Available Wireless Architecture/Framework",
draft-pthubert-raw-architecture-05 (work in progress),
November 2020.
[ipath] Gao, Y., Dong, W., Chen, C., Bu, J., Wu, W., and X. Liu, [ipath] Gao, Y., Dong, W., Chen, C., Bu, J., Wu, W., and X. Liu,
"iPath: path inference in wireless sensor networks.", "iPath: path inference in wireless sensor networks.",
2016, <https://doi.org/10.1109/TNET.2014.2371459>. 2016, <https://doi.org/10.1109/TNET.2014.2371459>.
[PREF-draft] [PREF-draft]
Thubert, P., Eckert, T., Brodard, Z., and H. Jiang, "BIER- Thubert, P., Eckert, T., Brodard, Z., and H. Jiang, "BIER-
TE extensions for Packet Replication and Elimination TE extensions for Packet Replication and Elimination
Function (PREF) and OAM", 2018, Function (PREF) and OAM", 2018,
<https://tools.ietf.org/html/draft-thubert-bier- <https://tools.ietf.org/html/draft-thubert-bier-
replication-elimination>. replication-elimination>.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997, DOI 10.17487/RFC2119, March 1997,
<https://www.rfc-editor.org/info/rfc2119>. <https://www.rfc-editor.org/info/rfc2119>.
[RFC2544] Bradner, S. and J. McQuaid, "Benchmarking Methodology for
Network Interconnect Devices", RFC 2544,
DOI 10.17487/RFC2544, March 1999,
<https://www.rfc-editor.org/info/rfc2544>.
[RFC6291] Andersson, L., van Helvoort, H., Bonica, R., Romascanu, [RFC6291] Andersson, L., van Helvoort, H., Bonica, R., Romascanu,
D., and S. Mansfield, "Guidelines for the Use of the "OAM" D., and S. Mansfield, "Guidelines for the Use of the "OAM"
Acronym in the IETF", BCP 161, RFC 6291, Acronym in the IETF", BCP 161, RFC 6291,
DOI 10.17487/RFC6291, June 2011, DOI 10.17487/RFC6291, June 2011,
<https://www.rfc-editor.org/info/rfc6291>. <https://www.rfc-editor.org/info/rfc6291>.
[RFC7276] Mizrahi, T., Sprecher, N., Bellagamba, E., and Y. [RFC7276] Mizrahi, T., Sprecher, N., Bellagamba, E., and Y.
Weingarten, "An Overview of Operations, Administration, Weingarten, "An Overview of Operations, Administration,
and Maintenance (OAM) Tools", RFC 7276, and Maintenance (OAM) Tools", RFC 7276,
DOI 10.17487/RFC7276, June 2014, DOI 10.17487/RFC7276, June 2014,
skipping to change at line 598 skipping to change at page 15, line 29
Cesson-Sevigne - Rennes 35510 Cesson-Sevigne - Rennes 35510
FRANCE FRANCE
Phone: +33 299 12 70 04 Phone: +33 299 12 70 04
Email: georgios.papadopoulos@imt-atlantique.fr Email: georgios.papadopoulos@imt-atlantique.fr
Greg Mirsky Greg Mirsky
ZTE Corp. ZTE Corp.
Email: gregory.mirsky@ztetx.com Email: gregory.mirsky@ztetx.com
Carlos J. Bernardos
Universidad Carlos III de Madrid
Av. Universidad, 30
Leganes, Madrid 28911
Spain
Phone: +34 91624 6236
Email: cjbc@it.uc3m.es
URI: http://www.it.uc3m.es/cjbc/
 End of changes. 44 change blocks. 
100 lines changed or deleted 181 lines changed or added
