Deterministic Networking Architecture
Cisco Systems
170 W Tasman Dr., San Jose, California 95134, USA; +1 408 526 4495; nfinn@cisco.com
Cisco Systems
Village d'Entreprises Green Side, 400, Avenue de Roumanille, Batiment T3, Biot - Sophia Antipolis 06410, FRANCE; +33 4 97 23 26 34; pthubert@cisco.com
Broadcom Corp.
3151 Zanker Rd., San Jose, California 95134, USA; +1 831 824 4228; MikeJT@broadcom.com
Internet
DetNet
Deterministic Networking (DetNet) provides a capability to carry specified unicast or multicast
data flows for real-time applications with extremely low data loss rates and bounded
latency. Techniques used include: 1) reserving data plane resources for individual
(or aggregated) DetNet flows in some or all of the relay nodes (bridges or routers) along
the path of the flow; 2) providing fixed paths for DetNet flows that do not
rapidly change with the network topology; and 3) sequentializing, replicating, tracing and eliminating
duplicate packets at various points to ensure the availability of at least one path. The
capabilities can be managed by configuration, or by manual or automatic network management.
Deterministic Networking (DetNet) is a service that can be offered by a
network to data flows
(DetNet flows) that are limited, at their source, to a maximum data
rate specified by that source. DetNet provides these flows with extremely
low packet loss rates and assured maximum end-to-end delivery latency.
This is accomplished by dedicating network resources such as link bandwidth
and buffer space to DetNet flows and/or
classes of DetNet flows. Unused reserved resources are available to non-DetNet
packets.
The Deterministic Networking
Problem Statement introduces Deterministic Networking, and
Deterministic Networking
Use Cases summarizes the need for it.
A goal of DetNet is a converged network in all respects. That is, the presence
of DetNet flows does not preclude non-DetNet flows, and the benefits offered to
DetNet flows should not, except in extreme cases, prevent existing QoS
mechanisms from operating in a
normal fashion, subject to the bandwidth required for the DetNet flows. A
single source-destination pair can trade both DetNet and non-DetNet flows.
End systems and applications need not instantiate special interfaces for DetNet flows.
Networks are not restricted to certain topologies; connectivity is not restricted.
Any application that generates a data flow that can be usefully
characterized as having a maximum bandwidth should be able to take advantage
of DetNet, as long as the necessary resources can be reserved. Reservations
can be made by the application itself, via network management, by an
applications controller, or by other means.
Many applications of interest to Deterministic Networking require the ability
to synchronize the clocks in end systems to a sub-microsecond accuracy. Some
of the queue control techniques defined in also
require time synchronization among relay nodes. The means used to achieve
time synchronization are not addressed in this document. DetNet should accommodate
various synchronization techniques and profiles that are defined elsewhere to
solve the exchange of time in different market segments.
The present document is an individual contribution, but it is intended by the authors
for adoption by the DetNet working group.
The following special terms are used in this document in order to avoid the
assumption that a given element in the architecture does or does not have
an Internet Protocol stack, functions as a router, bridge, firewall, or otherwise
plays a particular role at Layer-2 or higher.
An end system capable of sinking a DetNet flow.
The portion of a network that is DetNet aware. It includes end
systems and other DetNet nodes.
A DetNet flow is a sequence of packets from a single source, through some
number of relay nodes to one or more destinations, that is
limited by the source in its maximum packet size and transmission rate,
and can thus be ensured the DetNet Quality of Service (QoS) from the network.
A DetNet compound flow is a DetNet flow that has been separated into
multiple duplicate DetNet member flows, which are eventually merged back
into a single DetNet compound flow. "Compound" and "member" are strictly
relative to each other, not absolutes; a DetNet compound flow comprising
multiple DetNet member flows can, in turn, be a member of a higher-order compound.
A DetNet aware end system or relay node.
"DetNet" may be omitted in some text.
An instance of a DetNet node that includes a proxy function
for one or more source end systems, analogous to a Label Edge Router (LER).
Commonly called a "host" or "node" in IETF documents, and an "end
station" in IEEE 802 documents. End systems of interest to
this document are either sources or destinations of L2 and/or
L3 DetNet streams.
A connection between two DetNet nodes. It may be composed of a
physical link
or a sub-network technology that can provide appropriate traffic
delivery for DetNet
flows.
A router, transit node, bridge, Label Switch Router (LSR), firewall, or any other system
that forwards packets from one interface to another.
A trail of configuration from source to destination(s) through relay nodes
associated with a DetNet flow, required to deliver the benefits of DetNet.
An end system capable of sourcing a DetNet flow.
This section also serves as a
dictionary for translating from the terms used by the IEEE 802 Time-Sensitive
Networking (TSN) Task Group to those of the DetNet WG.
The IEEE 802 term for a destination of a DetNet flow.
The IEEE 802 term for a DetNet node.
The IEEE 802 term for a DetNet flow.
The IEEE 802 term for the source of a DetNet flow.
The DetNet Quality of Service can be expressed in terms of:
Minimum and maximum end-to-end latency from source to destination;
timely delivery and jitter avoidance derive from these constraints.
Probability of loss of a packet, under various assumptions as to
the operational states of the relay nodes and links.
A derived property is whether it is acceptable to deliver a duplicate
packet, which is an inherent risk in highly reliable and/or broadcast
transmissions.
It is a distinction of DetNet that it is concerned solely with worst-case values
for the end-to-end latency. Average, mean, or typical values are of no interest,
because they do not affect the ability of a real-time system to perform its
tasks. In general, a trivial priority-based queuing scheme will
give better average latency to a data flow than DetNet, but of course, the worst-case
latency can be essentially unbounded.
Three techniques are used by DetNet to provide these qualities of service:
Bandwidth reservation and enforcement.
Pinned paths.
Packet replication and elimination.
The DetNet techniques are meant to address both of the DetNet QoS
requirements (latency and packet loss). Given that relay nodes
have a finite amount of buffer space, zero congestion loss
necessarily results in a maximum end-to-end latency. Zero congestion loss also addresses the largest
contribution to packet loss, which is buffer congestion. Packet replication and
elimination mitigates the second most important contributions to packet loss, namely
random media errors and equipment failure.
These three techniques can be applied independently, giving eight possible combinations,
including none (no DetNet), although some combinations are of wider utility than others.
This separation keeps the protocol stack coherent and maximizes interoperability with
existing and developing standards in this (IETF) and other
Standards Development Organizations. Some examples of typical expected combinations:
Pinned paths plus packet replication and elimination are exactly the techniques
employed by . Pinned paths are achieved by limiting
the physical topology of the network, and the sequentialization, replication, and
duplicate elimination are facilitated by packet tags added at the front or the end
of Ethernet frames.
Zero congestion loss alone is offered by IEEE 802.1 Audio Video Bridging
. As long as the network suffers no failures,
zero congestion loss can be achieved through the use of
a reservation protocol (MSRP), shapers in every relay node (bridge), and a
bit of network calculus.
Using all three together gives maximum protection.
There are, of course, simpler methods available (and employed, today) to achieve
levels of latency and packet loss that are satisfactory for many applications.
Prioritization and over-provisioning is one such technique. However, these
methods generally work best in the absence of any significant amount of non-critical
traffic in the network (if, indeed, such traffic is supported at all), or work only if
the critical traffic constitutes only a small portion of the network's theoretical
capacity, or work only if all systems are functioning properly, or in the absence of
actions by end systems that disrupt the network's operations.
There are any number of methods in use, defined, or in progress for accomplishing each
of the above techniques. It is expected that this DetNet Architecture will assist
various vendors, users, and/or "vertical"
Standards Development Organizations (dedicated to a single industry) to make selections
among the available means of implementing DetNet networks.
The primary means by which DetNet achieves its QoS assurances is to completely
eliminate congestion at an output port as a cause of packet loss. Given that a
DetNet flow cannot be throttled, this can be achieved only by the provision of
sufficient buffer storage at each hop through the network to ensure that no
packets are dropped due to a lack of buffer storage.
Ensuring adequate buffering requires, in turn, that the source, and every relay
node along the path to the destination (or nearly every relay node -- see
) be careful to regulate its output to not exceed the
data rate for any DetNet flow, except for brief periods when making up for
interfering traffic. Any packet sent ahead of its time potentially adds to the
number of buffers required by the next hop, and may thus exceed the resources
allocated for a particular DetNet flow.
The low-level mechanisms described in provide
the necessary
regulation of transmissions by an edge system or relay node to ensure
zero congestion loss. The reservation of the bandwidth and
buffers for a DetNet flow requires the provisioning described in
.
A DetNet node may have other resources requiring allocation and/or scheduling,
that might otherwise be over-subscribed and trigger the rejection of a reservation.
In networks controlled by typical peer-to-peer protocols such as IEEE 802.1 ISIS bridged
networks or IETF OSPF routed networks, a network topology event in one part of the network
can impact, at least briefly, the delivery of data in parts of the network remote from the
failure or recovery event. Thus, even redundant paths through a network, if controlled by
the typical peer-to-peer protocols, do not eliminate the chances of brief losses of contact.
Many real-time networks rely on physical rings or chains of two-port devices, with
a relatively simple ring control protocol. This supports redundant paths with a minimum
of wiring. As an additional benefit, ring topologies can often
utilize different topology management protocols than those used for a mesh network, with
a consequent reduction in the response time to topology changes. Of course, this comes
at some cost in terms of increased hop count, and thus latency, for the typical path.
In order to get the advantages of low hop count and still ensure against even very brief
losses
of connectivity, DetNet employs pinned paths, where the path taken by a given DetNet flow
does not change, at least immediately, and likely not at all, in response to network
topology events. When combined with packet replication and elimination,
this results in a high likelihood of continuous connectivity.
Pinned paths are commonly used in MPLS TE LSPs.
A core objective of DetNet is to enable the convergence of non-IP networks
onto a common network infrastructure. This requires the accurate emulation
of currently deployed mission-specific networks, which
typically rely on point-to-point analog (e.g. 4-20mA modulation) and
serial-digital cables (or buses) for highly reliable, synchronized and
jitter-free communications. While the latency of analog transmissions is
basically the speed of light, legacy serial links are usually slow (in the
order of Kbps) compared to, say, GigE, and some latency is usually
acceptable. What is not acceptable is the introduction of excessive jitter,
which may, for instance, affect the stability of control systems.
Applications that are designed to operate on serial links usually do
not provide services to recover from jitter, because jitter simply does not
exist there. Streams of information are expected to be delivered in-order
and the precise time of reception influences the processes. In order to
converge such existing applications,
there is a desire to emulate all properties of the serial cable, such
as clock transportation, perfect flow isolation and fixed latency. While minimal
jitter (in the form of specifying minimum, as well as maximum, end-to-end latency)
is supported by DetNet, there are practical limitations on packet-based networks
in this regard. In general, instead of a "do this when you get the packet"
model, users are encouraged to use a
combination of:
Sub-microsecond time synchronization among all source and destination
end systems, and
Time-of-execution fields in the application packets.
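The second item can be illustrated with a minimal sketch. It assumes a hypothetical Command record whose execute_at field carries the time-of-execution, and a receiver whose clock is already synchronized with the source (the means of synchronization are outside this document). All names are illustrative and are not taken from any DetNet specification.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Command:
    execute_at: float                   # time-of-execution on the synchronized clock
    action: str = field(compare=False)  # opaque application payload (hypothetical)


class ExecutionQueue:
    """Holds received commands until the synchronized clock reaches execute_at."""

    def __init__(self):
        self._heap = []

    def receive(self, cmd):
        # The arrival time is deliberately ignored: network jitter only changes
        # how long a command waits here, not when it is executed.
        heapq.heappush(self._heap, cmd)

    def due(self, now):
        """Return, in execution order, the commands whose time has come."""
        ready = []
        while self._heap and self._heap[0].execute_at <= now:
            ready.append(heapq.heappop(self._heap))
        return ready


q = ExecutionQueue()
q.receive(Command(2.0, "close valve"))  # delivered first
q.receive(Command(1.0, "open valve"))   # delivered second, but due earlier
print([c.action for c in q.due(1.5)])
```

The design point is that the order and spacing of actions depend only on the execute_at values and the shared clock, so bounded delivery latency is sufficient and residual jitter is absorbed by the queue.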
After congestion loss has been eliminated, the most important causes of packet
loss are random media and/or memory faults, and equipment failures.
Both causes of packet loss can be greatly reduced by sending the same packets over multiple paths.
Packet replication and elimination, also known as seamless redundancy,
or 1+1 hitless protection, involves three capabilities:
Providing sequencing information, once, at or near the source, to the
packets of a DetNet compound flow. This may
be done by adding a sequence number or time stamp as part of DetNet, or may be
inherent in the packet, e.g. in a transport protocol, or associated with other physical
properties such as the precise time (and radio channel) of reception of the packet.
Replicating these packets into multiple DetNet member flows and, typically,
sending them along at least two
different paths to the destination(s), e.g. over pinned paths.
Eliminating duplicated packets. This may be done at any step along
the path to save network resources further down, in particular if
multiple replication points exist.
However, the most common case is to perform this operation at
the very edge of the DetNet network, preferably in or near the receiver.
This function is a "hitless" version of, e.g., the 1+1
linear protection in .
That is, instead of switching from one flow to the other when a failure
of a flow is detected, DetNet combines both flows, and performs a
packet-by-packet selection of which to discard, based on sequence number.
In the simplest case, this amounts to replicating each packet in a source that
has two interfaces, and conveying them through the network, along separate paths,
to the
similarly dual-homed destinations, that discard the extras. This ensures that one
path (with zero congestion loss) remains, even if some relay node fails.
The sequence numbers can also be used for loss detection and for re-ordering.
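The packet-by-packet selection described above can be sketched as follows. This is a minimal illustration, not a specified algorithm: the class name, the bounded history of remembered sequence numbers, and the absence of sequence-number wraparound handling are all assumptions made for brevity.

```python
from collections import deque


class DuplicateEliminator:
    """Forward the first copy of each sequence number; discard later copies."""

    def __init__(self, history=1024):
        self._history = history  # how many recent numbers to remember (assumed)
        self._recent = deque()   # remembered numbers, oldest first
        self._seen = set()       # same numbers, for fast membership tests

    def accept(self, seq):
        """Return True if the packet should be forwarded, False if duplicate."""
        if seq in self._seen:
            return False
        self._seen.add(seq)
        self._recent.append(seq)
        if len(self._recent) > self._history:
            # Forget the oldest number so that state stays bounded.
            self._seen.discard(self._recent.popleft())
        return True


# Two member flows, "A" and "B"; path A has lost the packet with sequence 2.
arrivals = [("A", 1), ("B", 1), ("B", 2), ("A", 3), ("B", 3)]
elim = DuplicateEliminator()
delivered = [seq for _, seq in arrivals if elim.accept(seq)]
print(delivered)  # [1, 2, 3]
```

No failure detection or switchover takes place: the loss on path A is masked simply because the copy from path B is the first one seen.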
Alternatively, relay nodes in the network can provide replication and elimination
facilities at various points in the network, so that multiple failures can be
accommodated.
This is shown in the following figure, where the two relay nodes
each replicate (R) the DetNet flow on input, sending the DetNet member flows to both the other
relay node and to the end system, and eliminate duplicates (E) on the output
interface to the right-hand end system. Any one link in the network can
fail, and the DetNet compound flow can still get through. Furthermore, two links can
fail, as long as they are in different segments of the network.
Note that packet replication and elimination does not react to and correct failures; it is
entirely passive. Thus, intermittent failures, mistakenly created packet filters,
or misrouted data are handled just the same as the equipment failures
that are detected and handled by typical routing and bridging protocols.
When combining member flows that take different-length paths through the network,
and which are also guaranteed a worst-case latency by packet shaping, a merge
point may require extra buffering to equalize the delays over the different paths. This
equalization ensures that the resultant compound flow will not exceed its
contracted bandwidth even after one or the other of the paths is restored
after a failure.
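As a rough illustration of this equalization, the sketch below computes how long a merge point would hold packets from each member path so that all paths appear as long as the slowest one. It assumes that a worst-case latency from the replication point to the merge point is known for every path; the function name and the use of integer microseconds are choices made here, not part of DetNet.

```python
def equalization_delays(worst_case_latency_us):
    """Map each member path to the extra hold time needed at the merge point.

    worst_case_latency_us: dict of path name -> worst-case latency in
    microseconds from the replication point to the merge point.
    """
    if not worst_case_latency_us:
        return {}
    slowest = max(worst_case_latency_us.values())
    # Packets on faster paths are held back by the difference to the slowest path.
    return {path: slowest - lat for path, lat in worst_case_latency_us.items()}


print(equalization_delays({"short": 200, "long": 500}))
# {'short': 300, 'long': 0}
```

With these hold times, a packet leaves the merge point at about the same time whichever path delivered it, so restoring the short path after a failure does not produce a burst of early packets.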
Traffic Engineering Architecture and Signaling (TEAS)
defines traffic-engineering architectures for generic applicability
across packet and non-packet networks.
From the TEAS perspective, Traffic Engineering (TE) refers to techniques
that enable operators to control how specific traffic flows are treated
within their networks.
Because of its very nature of establishing pinned optimized paths,
Deterministic Networking can be seen as a new, specialized branch of
Traffic Engineering, and inherits its architecture with a separation
into planes.
The Deterministic Networking architecture is thus composed
of three planes, a (User) Application Plane, a Controller Plane, and a
Network Plane, which echoes that of Figure 1 of
Software-Defined Networking (SDN):
Layers and Architecture Terminology:
Per ,
the Application Plane includes both applications and services. In particular,
the Application Plane incorporates the User Agent, a specialized application
that interacts with the end user / operator and performs requests for
Deterministic Networking services via an abstract Flow Management Entity
(FME), which may or may not be collocated with (one of) the end systems.
At the Application Plane, a management interface enables the negotiation of flows between end
systems. An abstraction of the flow called a Traffic Specification (TSpec) provides the
representation. This abstraction is used to place a reservation over the (Northbound) Service
Interface and within the Application plane.
It is associated with an abstraction of location, such as IP addresses and DNS
names, to identify the end systems and eventually specify intermediate relay nodes.
The Controller Plane corresponds to the aggregation of the Control and
Management Planes in , though
Common Control and Measurement Plane (CCAMP)
makes an additional distinction between management and measurement.
When the logical separation of the Control, Measurement and other
Management entities is not relevant, the term Controller Plane is used
for simplicity to represent them all, and the term controller refers to
any device operating in that plane, whether it is a Path Computation
entity or a Network Management entity (NME).
The Path Computation Element (PCE) is a core
element of a controller, in charge of computing Deterministic paths
to be applied in the Network Plane.
A (Northbound) Service Interface enables applications in the Application
Plane to communicate with the entities in the Controller Plane.
One or more PCE(s) collaborate to implement the requests from the FME
as Per-Flow Per-Hop Behaviors installed in the relay nodes for
each individual flow. The PCEs
place each flow along a deterministic sequence of relay nodes so as
to respect per-flow constraints such as security and
latency, and optimize the overall result for metrics such as an
abstract aggregated cost. The deterministic sequence can typically be
more complex than a direct sequence and include redundant paths, with
one or more packet replication and elimination points.
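The placement of a flow under a latency constraint while optimizing an abstract cost can be sketched as a small constrained search. This is only an illustration under stated assumptions: the links table, its (neighbor, latency, cost) tuples, and the exhaustive depth-first search are hypothetical, and an actual PCE would use far more scalable algorithms and richer constraints.

```python
def best_path(links, src, dst, max_latency):
    """Return (cost, latency, path) for the cheapest loop-free path whose
    summed per-hop latency does not exceed max_latency, or None."""
    best = None

    def walk(node, path, latency, cost):
        nonlocal best
        if latency > max_latency:
            return  # prune: the latency bound is already violated
        if node == dst:
            if best is None or cost < best[0]:
                best = (cost, latency, path)
            return
        for nxt, hop_latency, hop_cost in links.get(node, []):
            if nxt not in path:  # keep the path loop-free
                walk(nxt, path + [nxt], latency + hop_latency, cost + hop_cost)

    walk(src, [src], 0, 0)
    return best


# node -> [(neighbor, worst-case hop latency, abstract cost)], all values assumed
links = {
    "S": [("A", 10, 1), ("B", 2, 5)],
    "A": [("D", 10, 1)],
    "B": [("D", 2, 5)],
}
print(best_path(links, "S", "D", 30))  # (2, 20, ['S', 'A', 'D'])
print(best_path(links, "S", "D", 10))  # (10, 4, ['S', 'B', 'D'])
```

Tightening the latency bound from 30 to 10 forces the flow onto the more expensive path, which is the kind of trade-off between per-flow constraints and overall cost described above.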
The Network Plane represents the network devices and protocols as a
whole, regardless of the Layer at which the network devices operate.
It includes Forwarding Plane (data plane), Application, and
Operational Plane (control plane) aspects.
The Network Plane comprises the Network Interface Cards (NIC) in the
end systems, which are typically IP hosts,
and relay nodes, which are typically IP routers and switches.
Network-to-Network Interfaces such as used for Traffic Engineering
path reservation in ,
as well as User-to-Network Interfaces (UNI) such as provided by
the Local Management Interface (LMI) between network and end systems,
are both part of the Network Plane, both in the control plane and
the data plane.
A Southbound (Network) Interface enables the entities in the Controller
Plane to communicate with devices in the Network Plane. This interface
leverages and extends TEAS to describe the physical topology and
resources in the Network Plane.
The relay nodes (and eventually the end systems' NICs) expose their capabilities and physical
resources to the controller (the PCE), and update the PCE with their dynamic perception of the
topology, across the Southbound Interface. In return, the PCE(s) set the per-flow
paths up, providing a Flow Characterization that is more tightly coupled to the relay node
operation than a TSpec.
At the Network Plane, relay nodes may exchange information regarding the state of the paths,
between adjacent systems and eventually with the end systems, and forward packets within
constraints associated to each flow, or, when unable to do so, perform a last resort
operation such as drop or declassify.
This specification focuses on the Southbound interface and the operation of the Network Plane.
DetNet flows can be synchronous or asynchronous.
In synchronous DetNet flows, at least the relay nodes (and possibly
the end systems) are closely time
synchronized, typically to better than 1 microsecond. By transmitting
packets from different DetNet flows or classes of DetNet flows at different times,
using repeating schedules synchronized among the relay nodes, resources
such as buffers and link bandwidth can be shared over the time domain
among different DetNet flows. There is a tradeoff among techniques for
synchronous DetNet flows between the burden of fine-grained scheduling and the
benefit of reducing the required resources, especially buffer space.
In contrast, asynchronous DetNet flows are not coordinated with a fine-grained
schedule, so relay nodes and end systems must assume worst-case interference
among DetNet flows contending for buffer resources.
Asynchronous DetNet flows are characterized by:
A maximum packet size;
An observation interval; and
A maximum number of transmissions during that observation interval.
These parameters, together with knowledge of the protocol stack used (and thus the
size of the various headers added to a packet), limit the number of bit times per
observation interval that the DetNet flow can occupy the physical medium.
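The limit described above is simple arithmetic, sketched here under stated assumptions. The record and function names are hypothetical, the per-packet header overhead is supplied by the caller because it depends on the protocol stack in use, and the numbers in the example are arbitrary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AsyncFlowSpec:
    max_packet_octets: int         # maximum packet size
    interval_us: int               # observation interval, in microseconds
    max_packets_per_interval: int  # maximum transmissions per interval


def max_bits_per_interval(spec, overhead_octets):
    """Upper bound on the bit times the flow can occupy in one interval."""
    octets = spec.max_packet_octets + overhead_octets
    return spec.max_packets_per_interval * octets * 8


def reserved_rate_bps(spec, overhead_octets):
    """The same bound expressed as a rate in bits per second."""
    return max_bits_per_interval(spec, overhead_octets) * 1_000_000 / spec.interval_us


spec = AsyncFlowSpec(max_packet_octets=256, interval_us=125, max_packets_per_interval=1)
print(max_bits_per_interval(spec, overhead_octets=44))  # 2400
print(reserved_rate_bps(spec, overhead_octets=44))      # 19200000.0
```

Here one packet of at most 256 octets plus an assumed 44 octets of headers every 125 microseconds occupies at most 2400 bit times per interval, or 19.2 Mb/s.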
The source promises that these limits will not be exceeded. If the source
transmits less data than this limit allows, the unused resources such as link
bandwidth can be made available by the system to non-DetNet packets. However,
making those resources available to DetNet packets in other DetNet flows would serve
no purpose. Those other DetNet flows have their own dedicated resources, on the
assumption that all DetNet flows can use all of their resources over a long
period of time.
Note that there is no provision in DetNet for throttling DetNet flows
(reducing the transmission rate via feedback); the assumption
is that a DetNet flow, to be useful, must be delivered in its entirety. That
is, while any useful application is written to expect a certain number of lost
packets, the real-time applications of interest to DetNet demand that the loss of
data due to the network is extraordinarily infrequent.
Although DetNet strives to minimize the changes required of an application to
allow it to shift from a special-purpose digital network to an Internet Protocol
network, one fundamental shift in
the behavior of network applications is impossible to avoid: the reservation
of resources before the application starts.
In the first place, a network cannot deliver finite latency and practically zero
packet loss to an arbitrarily high offered load. Secondly, achieving
practically zero packet loss for unthrottled (though bandwidth limited) DetNet flows
means that bridges and routers have to dedicate buffer resources to specific
DetNet flows or to classes of DetNet flows. The requirements of each reservation have to be
translated into the parameters that control each system's
queuing, shaping, and scheduling functions and delivered to the hosts, bridges,
and routers.
The presence in the network of relay nodes that are not fully capable of offering
DetNet services complicates the ability of the relay nodes and/or controller to
allocate resources, as extra buffering, and thus extra latency, must be allocated
at points downstream from the non-DetNet relay node for a DetNet flow.
As described above, DetNet achieves its aims
by reserving bandwidth and buffer resources at every hop along
the path of the DetNet flow.
The reservation itself is not sufficient, however. Implementors and users of a
number of
proprietary and standard real-time networks have found that standards for
specific data plane techniques are required to enable these assurances to be
made in a multi-vendor
network. The fundamental reason is that latency variation in one system results
in the need for extra buffer space in the next-hop system(s), which in turn,
increases the worst-case per-hop latency.
Standard queuing and transmission selection algorithms allow a central controller
to compute the latency contribution
of each relay node to the end-to-end latency, to compute the amount of buffer space
required in each relay node for each incremental DetNet flow, and most importantly, to
translate from a flow specification to a set of values for the managed objects that
control each relay or end system. IEEE 802 has specified (and is
specifying) a set of queuing, shaping, and scheduling algorithms
that enable each relay node (bridge or router), and/or a central controller, to
compute these values. These algorithms include:
A credit-based shaper Clause 34.
Time-gated queues governed by a rotating time schedule, synchronized among all
relay nodes.
Synchronized double (or triple) buffers driven by synchronized time ticks.
Preemption of an Ethernet packet in transmission by a packet with a more stringent
latency requirement, followed by the resumption of the preempted packet.
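To make the idea of time-gated queues concrete, the following sketch looks up which queues may transmit at a given instant of a repeating schedule. It is an illustration of the concept only, not the IEEE 802 algorithm: the schedule representation, the queue numbering, and the function name are assumptions, and times are integer microseconds measured from a cycle boundary shared by all relay nodes.

```python
def open_gates(schedule, t_us):
    """Return the set of queue indices whose gates are open at time t_us.

    schedule: list of (duration_us, open_queues) entries that repeats forever.
    """
    cycle_us = sum(duration for duration, _ in schedule)
    if cycle_us <= 0:
        raise ValueError("schedule must cover a positive duration")
    offset = t_us % cycle_us  # position inside the current cycle
    for duration, gates in schedule:
        if offset < duration:
            return set(gates)
        offset -= duration
    raise AssertionError("unreachable: offset always falls inside the cycle")


# A 100 us cycle: 30 us reserved for queue 7, then 70 us for queues 0 to 2.
schedule = [(30, {7}), (70, {0, 1, 2})]
print(open_gates(schedule, 10))   # {7}
print(open_gates(schedule, 105))  # {7}, i.e. 5 us into the second cycle
```

Because every relay node evaluates the same schedule against a synchronized clock, traffic in queue 7 never contends with the other queues during its window.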
While these techniques are currently embedded in Ethernet and bridging standards,
we can note that they are all, except perhaps for packet preemption, equally applicable
to other media than Ethernet, and to routers as well as bridges.
A DetNet network supports the dedication of a high proportion (e.g. 75%) of the
network bandwidth
to DetNet flows. But, no matter how much is dedicated for DetNet flows, it is
a goal of DetNet to coexist with existing Class of Service schemes (e.g., DiffServ).
It is also
important that non-DetNet traffic not disrupt the DetNet flow, of course (see
and ).
For these reasons:
Bandwidth (transmission opportunities) not utilized by a DetNet flow is available
to non-DetNet packets (though not to other DetNet flows).
DetNet flows can be shaped or scheduled, in order to ensure that the
highest-priority non-DetNet
packet is also ensured a worst-case latency (at any given hop).
When transmission opportunities for DetNet flows are scheduled in detail, then
the algorithm constructing the schedule should leave sufficient opportunities for
non-DetNet packets to satisfy the needs of the users of the network. Detailed
scheduling can also permit the time-shared use of buffer resources by different
DetNet flows.
Ideally, the net effect of the presence of DetNet flows in a network on the non-DetNet
packets is primarily a reduction in the available bandwidth.
One key to building robust real-time systems is to reduce the infinite variety of
possible failures to a number that can be analyzed with reasonable confidence. DetNet
aids in the process by providing filters and policers to detect DetNet packets received
on the wrong interface, or at the wrong time, or in too great a volume, and to then take
actions such as discarding the offending packet, shutting down the offending DetNet flow,
or shutting down the offending interface.
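The "too great a volume" check can be sketched as a per-flow policer. This is a minimal illustration that assumes fixed observation windows aligned to multiples of the interval and a simple discard action; the class and parameter names are hypothetical, and real policing functions may use different window or token-bucket semantics.

```python
class FlowPolicer:
    """Flags packets that exceed a flow's size or per-interval count limits."""

    def __init__(self, max_packet_octets, interval_us, max_packets):
        self.max_packet_octets = max_packet_octets
        self.interval_us = interval_us
        self.max_packets = max_packets
        self._window = None  # index of the current observation window
        self._count = 0      # conforming packets seen in that window

    def conforms(self, arrival_us, size_octets):
        """Return True to forward the packet, False to discard it."""
        if size_octets > self.max_packet_octets:
            return False  # oversized packet
        window = arrival_us // self.interval_us
        if window != self._window:
            self._window, self._count = window, 0  # a new interval has begun
        if self._count >= self.max_packets:
            return False  # too many packets in this interval
        self._count += 1
        return True


policer = FlowPolicer(max_packet_octets=256, interval_us=125, max_packets=2)
arrivals = [(0, 100), (10, 100), (20, 100), (130, 300), (130, 256)]
print([policer.conforms(t, size) for t, size in arrivals])
# [True, True, False, False, True]
```

The third packet is rejected for exceeding the count within one interval and the fourth for exceeding the maximum size; a deployment could instead shut down the flow or the interface, as noted above.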
It is also essential that filters and service remarking be employed at the network edge
to prevent non-DetNet
packets from being mistaken for DetNet packets, and thus impinging on the resources
allocated to DetNet packets.
There exist techniques, at present and/or in various stages of standardization, that can
perform these fault mitigation tasks and that deliver a high probability that misbehaving
systems will have zero impact on well-behaved DetNet flows, except of course, for
the receiving interface(s) immediately downstream of the misbehaving device.
Examples of such techniques include traffic policing functions (e.g.
) and separating flows into per-flow rate-limited queues.
illustrates the DetNet data plane layering model. One may
compare it to that in , Annex C, a work in progress.
Not all layers are required for any given application, or even for any
given network. The layers are, from top to bottom:
Shown as "source" and "destination" in the diagram.
Operations, Administration, and Maintenance leverages in-band and
out-of-band signaling that validates whether the service is effectively
obtained within QoS constraints. It is shown in parallel with the
user's application; OAM makes use of the same DetNet services. OAM
can involve specific tagging added in the packets for tracing implementation
or network configuration errors; traceability makes it possible to determine whether a
packet is a replica, which node performed the replication, and which
segment was intended for the replica.
Supplies the sequence number for packet replication and elimination
for packets going down the stack. Peers
with packet elimination. This layer is not needed if a higher-layer transport protocol
is expected to perform any packet elimination required by the DetNet flow duplication.
Based on the sequence number supplied by its peer, packet sequencing,
packet elimination discards any duplicate packets generated by DetNet
flow duplication.
The duplication may also be inferred from other
information such as the precise time of reception in a scheduled network.
The duplicate elimination layer may also perform resequencing of packets
to restore packet order in a
flow that was disrupted by the loss of packets on one or another of
the multiple paths taken.
In many DetNet applications, and particularly those in which multiple applications
(e.g. different machine tools) are sharing the same network infrastructure, or even
the same physical links, it is critical that a misbehaving DetNet flow does not
interfere with the timely delivery of packets belonging to other DetNet flows. The
DetNet flow monitoring layer monitors DetNet flows entering a DetNet node and
enforces bandwidth and/or sequencing restrictions, taking appropriate action if
a misbehaving flow is detected. See . This function
is shown in the stack at the point where it can operate on individual DetNet member
flows before they are merged into a DetNet compound flow, but in fact, it may be
present in different forms in multiple places in the stack to ensure against
interference errors.
Replicates packets going down the stack, that belong to a DetNet compound flow,
into two or more DetNet member flows. Note
that this function is separate from packet sequencing. Flow duplication
can be an explicit duplication and remarking of packets, or can be performed by,
for example, techniques similar to ordinary multicast replication.
Peers with DetNet flow merging.
Merges
DetNet member flows together for packets coming up the stack belonging to a
specific DetNet compound flow.
Peers with DetNet flow duplication.
DetNet flow merging, together with packet sequencing, duplicate elimination,
and DetNet flow duplication, performs
packet replication and elimination ().
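The interplay of flow duplication and flow merging can be illustrated with a small simulation in which each member flow loses different packets. This is a sketch under simplifying assumptions: packets are (sequence number, payload) tuples, whole flows are processed as lists rather than as streams, and the merge step folds duplicate elimination into itself. The function names are hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple

Packet = Tuple[int, bytes]   # (sequence number, payload)


def duplicate(compound: Iterable[Packet], n_members: int) -> List[List[Packet]]:
    """Replicate every packet of a compound flow into n member flows."""
    packets = list(compound)
    return [list(packets) for _ in range(n_members)]


def drop(member: List[Packet], lost: Set[int]) -> List[Packet]:
    """Simulate a path that loses the packets whose sequence numbers are in `lost`."""
    return [p for p in member if p[0] not in lost]


def merge(members: Iterable[List[Packet]]) -> List[Packet]:
    """Merge member flows back into one compound flow, keeping the first
    copy of each sequence number and returning packets in sequence order."""
    seen: Dict[int, bytes] = {}
    for member in members:
        for seq, payload in member:
            seen.setdefault(seq, payload)
    return sorted(seen.items())
```

With two member flows, a packet is lost from the compound flow only if both paths lose the same sequence number.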
Encodes the sequence number into packets going down the stack. This
function may or may not be a null transformation of the packet, and for
some protocols, is not explicitly present, being included in the DetNet flow
encoding layer, below.
Peers with sequence decoding.
Extracts the sequence number from packets coming up the stack for use by the
duplicate elimination layer. This
function may or may not be a null transformation of the packet, and for
some protocols, is not explicitly present, being included in the DetNet flow
decoding layer, below.
Peers with sequence encoding.
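As an illustration of sequence encoding and decoding as explicit (non-null) transformations, the sketch below prepends and strips a sequence field. The 16-bit width, the network byte order, and the placement at the front of the packet are assumptions made for the example. No particular encapsulation is implied.

```python
import struct
from typing import Tuple

# Hypothetical 16-bit sequence field in network byte order.
_SEQ_HEADER = struct.Struct("!H")


def encode_sequence(seq: int, payload: bytes) -> bytes:
    """Going down the stack: prepend the sequence number (modulo 2**16)."""
    return _SEQ_HEADER.pack(seq % 65536) + payload


def decode_sequence(packet: bytes) -> Tuple[int, bytes]:
    """Coming up the stack: split the packet into (sequence number, payload)."""
    (seq,) = _SEQ_HEADER.unpack_from(packet)
    return seq, packet[_SEQ_HEADER.size:]
```

Where the underlying protocol already carries a usable sequence number, both functions would reduce to null transformations.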
Encapsulates packets going down the stack, based on the packet's
locally-significant DetNet flow identifier, in order to identify to which
DetNet flow the packet belongs. This may be a null transformation
or might be an explicit encapsulation (e.g., altering the VLAN
and destination MAC address).
DetNet flow identification is the basis for packet replication and elimination, for
assigning per-flow resources (if any) to packets and for defense
against misbehaving systems ().
When DetNet flows are assigned to pinned paths, this layer can be
indistinguishable from the data forwarding layer(s).
Peers with DetNet flow decoding. See
for an explanation of why DetNet flow encoding is not necessarily
a part of normal packet transport.
Extracts a locally-significant DetNet flow identifier from
packets coming up the stack, in order to identify to which
DetNet flow the packet belongs. This may be a null transformation
or might be an explicit decapsulation (e.g., altering the VLAN
and destination MAC address). Peers with DetNet flow encoding.
See also .
This layer provides the latency and congestion loss parts of the DetNet
QoS. See .
Note that additional shaping elements may be provided for DetNet
edge nodes in order to precondition potentially malformed DetNet flows from a
source end system.
The reader is likely to notice that does not
specify the relationship between the DetNet layers, the IP layers, and
the link layers. This is intentional, because they can usefully be placed
in different places in the stack, and even in multiple places, depending on
where their peers are placed.
An interesting feature of DetNet, and one that invites implementations that
can be accused of "layering violations", is the need for lower layers to be
aware of specific flows at higher layers, in order to provide specific
queuing and shaping services for specific flows. For example:
A non-IP, strictly L2 source end system X may be sending multiple flows
to the same L2 destination end system Y. Those flows may include
DetNet flows with different QoS requirements, and may include non-DetNet
flows.
A router may be sending any number of flows to another router.
Again, those flows may include
DetNet flows with different QoS requirements, and may include non-DetNet
flows.
Two routers may be separated by bridges. For these bridges to perform
any required per-flow queuing and shaping, they must be able to identify
the individual flows.
A Label Edge Router (LER) may have a Label Switched
Path (LSP) set up for handling traffic
destined for a particular IP address carrying only non-DetNet flows. If
a DetNet flow to that same address is requested, a separate LSP may be
needed, in order that all of the Label Switch Routers (LSRs) along the
path to the destination give that flow special queuing and shaping.
The need for a lower-level DetNet node to be aware of individual higher-layer
flows is not unique to DetNet. But, given the endless complexity of layering
and relayering over tunnels that is available to network designers, DetNet
needs to provide a model for flow identification that is at least somewhat
better than deep packet inspection. That is not to say that deep inspection
will not be used, or the capability standardized; but, there are alternatives.
The main alternative is the sequence encode/decode and, particularly, the
DetNet flow encoding/decoding layers shown in .
In this model, at the time a DetNet flow is established and the resources
for it reserved, an alternate encapsulation of the DetNet flow at the lower
layer is requested and established. For example:
A single unicast DetNet flow passing from router A through a bridged network
to router B may be assigned a {VLAN, multicast destination MAC address} pair that is
unique within that bridged network. The bridges can then identify the
flow without accessing higher-layer headers. Of course, the receiving router
must recognize and accept that multicast MAC address.
A DetNet flow passing from LSR A to LSR B may be assigned a different
label than that used for other flows to the same IP destination.
The DetNet flow encoding/decoding layers shown in
perform the required alternate encapsulations. For example, one could place
a DetNet flow encoding shim between the Address Resolution Protocol (ARP) layer
and the MAC layer, which alters the {VLAN, MAC address} pair to identify
particular streams going up and down the stack, so that the layers above the
shim need no alteration to service DetNet flows.
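A shim of this kind amounts to a pair of lookup tables populated when the flow is established. The sketch below is one hypothetical shape for it. The class and method names, the string flow identifiers, and the representation of MAC addresses as caller-normalized strings are assumptions made for the example.

```python
from typing import Dict, NamedTuple, Optional


class L2Identity(NamedTuple):
    vlan: int
    dst_mac: str   # assumed to be normalized (e.g. lower case) by the caller


class FlowEncodingShim:
    """Maps a locally significant flow identifier to a {VLAN, MAC} pair and back."""

    def __init__(self) -> None:
        self._encode: Dict[str, L2Identity] = {}
        self._decode: Dict[L2Identity, str] = {}

    def assign(self, flow_id: str, identity: L2Identity) -> None:
        """Record the alternate encapsulation agreed when the flow was established."""
        if flow_id in self._encode or identity in self._decode:
            raise ValueError("flow or {VLAN, MAC} pair is already assigned")
        self._encode[flow_id] = identity
        self._decode[identity] = flow_id

    def encode(self, flow_id: str, default: L2Identity) -> L2Identity:
        """Going down the stack: unknown (non-DetNet) flows keep their default addressing."""
        return self._encode.get(flow_id, default)

    def decode(self, identity: L2Identity) -> Optional[str]:
        """Coming up the stack: None means the frame is not a known DetNet flow."""
        return self._decode.get(identity)
```

Because the mapping is applied below the layers that generate the traffic, those layers can remain unaware that a given flow is receiving DetNet treatment.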
In any of the above cases, it is possible that an existing DetNet flow can
be used as a carrier for multiple DetNet sub-flows. (Not to be confused
with DetNet compound vs. member flows.) Of course, this requires that the
aggregate DetNet flow be provisioned properly to carry the sub-flows.
Thus, rather than deep packet inspection, there is the option to export
higher-layer information to the lower layer. The requirement to support
one or the other method for flow identification (or both) is the essential
complexity that DetNet brings to existing control plane models.
There are three classes of information that a central controller
or decentralized control plane needs to
know that can only be obtained from the end systems and/or relay nodes
in the network. When using a peer-to-peer control plane, some of this
information may be required by a system's neighbors in the network.
Details of the system's capabilities that are required in order to
accurately allocate that system's resources, as well as other systems'
resources. This includes, for example, which specific queuing and
shaping algorithms are implemented (),
the number of buffers dedicated for DetNet allocation, and the worst-case
forwarding delay.
The dynamic state of an end or relay node's DetNet resources.
The identity of the system's neighbors, and the characteristics of the
link(s) between the systems, including the length (in nanoseconds) of
the link(s).
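To make the use of this information concrete, the sketch below shows how a controller might combine reported forwarding delays, buffer counts, and link lengths into a coarse admission check for one path. It is deliberately simplistic: a real latency bound depends on the queuing and shaping algorithms in use, which are not modeled here. All type names, field names, and the per-node buffer requirement are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RelayNode:
    name: str
    worst_case_forwarding_delay_ns: int   # reported system capability
    free_detnet_buffers: int              # reported dynamic state


@dataclass(frozen=True)
class Link:
    length_ns: int                        # link length expressed as a delay


def worst_case_path_latency_ns(nodes: List[RelayNode], links: List[Link]) -> int:
    """Sum the per-node worst-case delays and the per-link lengths."""
    return (sum(n.worst_case_forwarding_delay_ns for n in nodes)
            + sum(link.length_ns for link in links))


def path_is_admissible(nodes: List[RelayNode], links: List[Link],
                       latency_bound_ns: int, buffers_needed_per_node: int) -> bool:
    """Coarse check: enough buffers at every node and total delay within the bound."""
    if any(n.free_detnet_buffers < buffers_needed_per_node for n in nodes):
        return False
    return worst_case_path_latency_ns(nodes, links) <= latency_bound_ns
```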
A centralized routing model, such as provided with a PCE (RFC 4655), enables global and
per-flow optimizations. (See .)
The model is attractive but a number of issues are
left to be solved.
In particular:
Whether and how the path computation can
be installed by 1) an end device or 2) a Network Management entity.
How
the path is set up, either by installing state at each hop with a direct
interaction between the forwarding device and the PCE, or along a path by
injecting a source-routed request at one end of the path.
Whether a distributed alternative without a PCE can be valuable should
be studied as well. Such an alternative could for instance inherit from the
Resource ReSerVation Protocol (RSVP-TE) flows.
In a Layer-2 only environment, or as part of a layered approach to a
mixed environment, IEEE 802.1 also has work, either completed
or in progress. Clause 35 describes
SRP, a peer-to-peer protocol for Layer-2 roughly analogous to
RSVP. Almost
complete is , which defines how ISIS can
provide multiple disjoint paths or distribution trees. Also in progress
is , which expands the capabilities
of SRP.
The integration/interaction of the DetNet control layer with an underlying
IEEE 802.1 sub-network control layer will need to be defined.
Reservations for individual DetNet flows require considerable state information in
each relay node, especially when adequate fault mitigation
() is required. The DetNet data plane, in order to
support larger numbers of DetNet flows, must support the aggregation of DetNet flows
into tunnels, which themselves can be viewed by the relay nodes' data planes
largely as individual DetNet flows. Without such aggregation, the state
required in each relay system may limit the scale of DetNet networks.
Given that users have deployed examples of the IEEE 802.1 TSN TG standards, which
provide capabilities similar to DetNet, it is natural to ask whether the IETF
DetNet effort can be limited to providing Layer-2 connections (VPNs) between islands of
bridged TSN networks. While this capability is certainly useful to some
applications, and must not be precluded by DetNet, tunneling alone is not a
sufficient goal for the DetNet WG. As shown in the
Deterministic
Networking Use Cases draft,
there are already deployments of Layer-2 TSN networks that are encountering
the well-known problems of over-large broadcast domains. Routed solutions, and
combined routed/bridged solutions, are both required.
Standards providing similar
capabilities for bridged networks (only) have been and are being generated in the
IEEE 802 LAN/MAN Standards Committee. The present architecture
describes an abstract model that can be applicable both at Layer-2
and Layer-3, and over links not defined by IEEE 802. It is the intention
of the authors (and hopefully, as this draft progresses, of the DetNet
Working Group) that IETF and IEEE 802 will coordinate their work, via
the participation of common individuals, liaisons, and other means,
to maximize the compatibility of their outputs.
DetNet-enabled systems and nodes can be interconnected by
sub-networks, i.e., Layer-2 technologies.
These sub-networks will
provide DetNet-compatible service for support of DetNet traffic.
Examples of sub-networks include IEEE 802.1 TSN and a point-to-point OTN link.
Of course, multi-layer DetNet systems may be possible too, where one
DetNet appears as a sub-network to, and provides service to, a higher-layer
DetNet system.
There are a number of architectural questions that will have to be resolved
before this document can be submitted for publication. Aside from the obvious
fact that this present draft is subject to change, there are specific questions
to which the authors wish to direct the readers' attention.
A number of techniques have been defined and are being defined by IEEE 802
for queuing, shaping, and scheduling transmissions on Ethernet media, most
of which are directly applicable to any other medium. Specific selections
of supported techniques are required, because minimizing, and even
eliminating, congestion losses depends strongly on the details of the per-hop
behavior of sources and relay nodes.
The present authors expect that, at least, the IEEE 802 mechanisms will be
supported.
The techniques to be used for DetNet flow identification must be settled.
The following paragraphs provide a snapshot of the authors' opinions at
the time of writing. These authors anticipate the submission of drafts
on this subject. See also
IEEE 802.1 TSN streams are identified by giving each stream (DetNet flow) a
{VLAN identifier, destination MAC address} pair that is unique in the
bridged network; the MAC address must be a multicast address.
If a source is generating, for example, two unicast UDP
flows to the same destination, one DetNet and one not, the DetNet flow's
packets must be transformed at some point to have a multicast
destination MAC address, and perhaps, a different VLAN than the non-DetNet
flow's packets.
A similar provision would apply to DetNet packets that are identified by
MPLS labels; any bridges between the LSRs need a {VLAN identifier, destination MAC address} pair
uniquely identifying the DetNet flow in the bridged network.
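The two constraints stated above (a pair that is unique within the bridged network, and a multicast destination address) are easy to check mechanically. The sketch below is an illustration only. The function names and the colon-separated textual MAC format are assumptions. The multicast test relies on the group bit, which is the least-significant bit of the first octet of an IEEE 802 MAC address.

```python
from typing import Dict, List, Tuple


def is_multicast_mac(mac: str) -> bool:
    """True if the group bit (least-significant bit of the first octet) is set."""
    return bool(int(mac.split(":")[0], 16) & 0x01)


def check_stream_identifiers(streams: Dict[str, Tuple[int, str]]) -> List[str]:
    """Return a list of problems found in a {name: (vlan, dst_mac)} assignment."""
    problems: List[str] = []
    owners: Dict[Tuple[int, str], str] = {}
    for name, (vlan, mac) in streams.items():
        key = (vlan, mac.lower())
        if not is_multicast_mac(mac):
            problems.append(f"{name}: destination MAC {mac} is not multicast")
        if key in owners:
            problems.append(f"{name}: pair already used by {owners[key]}")
        else:
            owners[key] = name
    return problems
```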
Provision is made in the current draft of to
make these transformations either in a Layer-2 shim in the source end system,
on the output side of a router or LSR, or in a proxy function in the
first-hop bridge. It remains to be seen whether this provision is
adequate and/or acceptable to the IETF DetNet WG.
There are also questions regarding the sequentialization of packets for use
with packet replication and elimination ().
defines an Ethernet tag carrying a sequence number. If MPLS Pseudowires
are used with a control word containing a sequence number, the relationship
and interworking between these two formats must be defined.
Boxes that are solely routers or solely bridges are rare in today's market.
In a multi-tenant data center, multiple users' virtual Layer-2/Layer-3 topologies
exist simultaneously, implemented on a network whose physical topology bears
only accidental resemblance to the virtual topologies.
While the forwarding topology (the bridges and routers) is an important
consideration for a DetNet Flow Management Entity (),
so is the purely physical topology. Ultimately, the model used by the
management entities is based on boxes, queues, and links. The authors
hope that the work of the TEAS WG will help to clarify exactly what model
parameters need to be traded between the relay nodes and the controller(s).
As described in , the DetNet WG needs to decide whether
to support a peer-to-peer protocol for a source and a destination
to reserve resources for a DetNet stream. Assuming that enabling the
involvement of the source and/or destination is desirable (see
Deterministic Networking Use Cases),
it remains to decide whether the DetNet WG will make it possible to deploy at least some
DetNet capabilities in a network using only a peer-to-peer protocol, without
a central controller.
(Note that a UNI (see ) between an end system and an
edge relay node, for sources and/or listeners to
request DetNet services, can be either the first hop of a peer-to-peer reservation
protocol, or can be deflected by the edge relay node to a central controller
for resolution. Similarly, a decision by a central controller can be effected
by the controller instructing the end system or edge relay node to initiate
a peer-to-peer protocol activity.)
Deterministic Networking Use Cases
illustrates cases where wireless media are needed in a DetNet network. Some wireless
media in general use, such as IEEE 802.11 ,
have significantly higher packet loss rates than typical wired media, such as
Ethernet. IEEE 802.11 includes support for
such features as MAC-layer acknowledgements and retransmissions.
The techniques described in are likely to improve
the ability of a mixed wired/wireless network to offer the DetNet QoS features.
The interaction of these techniques with the features of specific wireless
media, although it may be significant, cannot be addressed in this document.
It remains to be decided to what extent the DetNet WG will address them, and to
what extent other WGs, e.g. 6TiSCH, will do so.
Security in the context of Deterministic Networking has an added
dimension; the time of delivery of a packet can be just as important
as the contents of the packet itself. A man-in-the-middle attack,
for example, can impose, and then systematically adjust, additional
delays into a link, and thus disrupt or subvert a real-time
application without having to crack any encryption methods employed.
See for an
exploration of this issue in a related context.
Furthermore, in a control system where millions of dollars of equipment, or even
human lives, can be lost if the DetNet QoS is not delivered, one must consider
not only simple equipment failures, where the box or wire instantly becomes
perfectly silent, but bizarre errors such as can be caused by software failures.
Because there is essentially no limit to the kinds of failures that can occur,
protecting against realistic equipment failures is indistinguishable, in most
cases, from protecting against malicious behavior, whether accidental or intentional.
See also .
Security must cover:
the protection of the signaling protocol
the authentication and authorization of the controlling systems
the identification and shaping of the DetNet flows
DetNet provides a Quality of Service (QoS), and as such, does not directly
raise any new privacy considerations.
However, the requirement for every (or almost every) node along the path of
a DetNet flow to identify DetNet flows may present an additional attack
surface for privacy, should the DetNet paradigm be found useful in broader
environments.
This document does not require an action from IANA.
The authors wish to thank Jouni Korhonen, Erik Nordmark, George Swallow,
Rudy Klecka, Anca Zamfir, David Black, Thomas Watteyne, Shitanshu Shah,
Craig Gunther, Rodney Cummings, Balasz Varga, Wilfried Steiner, Marcel Kiessling, Karl Weber,
Ethan Grossman, Pat Thaler, and Lou Berger
for their various contributions to this work.
To access password protected IEEE 802.1 drafts, see the
IETF IEEE 802.1 information page at
https://www.ietf.org/proceedings/52/slides/bridge-0/tsld003.htm.
Frame Replication and Elimination for Reliability (IEEE Draft P802.1CB). IEEE.
Path Control and Reservation. IEEE.
Stream Reservation Protocol (SRP) Enhancements and Performance Improvements. IEEE.
Frame Preemption. IEEE.
Enhancements for Scheduled Traffic. IEEE.
Timing and Synchronizations (IEEE 802.1AS-2011). IEEE.
AVB Systems (IEEE 802.1BA-2011). IEEE.
MAC Bridges and VLANs (IEEE 802.1Q-2014). IEEE.
Cyclic Queuing and Forwarding. IEEE.
IEEE Standard for Ethernet. IEEE.
Interspersed Express Traffic. IEEE.
Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications. IEEE.
Enterprise-Control System Integration Part 1: Models and Terminology. ANSI/ISA.
ISA100.11a, Wireless Systems for Automation, also IEC 62734. ISA/IEC.
IEEE 802.1 Time-Sensitive Networks Task Group. IEEE Standards Association.
IEEE std. 802.15.4e, Part. 15.4: Low-Rate Wireless Personal Area Networks (LR-WPANs) Amendment 1: MAC sublayer. IEEE standard for Information Technology.
IEEE std. 802.15.4, Part. 15.4: Wireless Medium Access Control (MAC) and Physical Layer (PHY) Specifications for Low-Rate Wireless Personal Area Networks. IEEE standard for Information Technology.
Industrial Communication Networks - Wireless Communication Network and Communication Profiles - WirelessHART - IEC 62591. www.hartcomm.org
Highway Addressable Remote Transducer, a group of specifications for industrial process and control devices administered by the HART Foundation. www.hartcomm.org
The organization that supports network technologies built on the Common Industrial Protocol (CIP) including EtherNet/IP. http://www.odva.org/
The AVnu Alliance tests and certifies devices for interoperability, providing a simple and reliable networking solution for AV network implementation based on the Audio Video Bridging (AVB) standards. http://www.avnu.org/
PROFINET is a standard for industrial networking in automation. http://us.profinet.com/technology/profinet/
High availability seamless redundancy (HSR) is a further development of the PRP approach, although HSR functions primarily as a protocol for creating media redundancy while PRP, as described in the previous section, creates network redundancy. PRP and HSR are both described in the IEC 62439-3 standard. IEC.
Traffic Engineering Architecture and Signaling. IETF.
Path Computation Element. IETF.
Common Control and Measurement Plane. IETF.