2	Network Working Group                                       F. Brockners
3	Internet-Draft                                               S. Bhandari
4	Intended status: Informational                                   S. Dara
5	Expires: May 3, 2017                                        C. Pignataro
6	                                                                   Cisco
7	                                                              H. Gredler
8	                                                            RtBrick Inc.
9	                                                                J. Leddy
10	                                                                 Comcast
11	                                                               S. Youell
12	                                                                    JMPC
13	                                                                D. Mozes
14	                                              Mellanox Technologies Ltd.
15	                                                              T. Mizrahi
16	                                                                 Marvell
17	                                                             P. Lapukhov
18	                                                                Facebook
19	                                                                R. Chang
20	                                                       Barefoot Networks
21	                                                        October 30, 2016

23	                      Requirements for In-situ OAM
24	               draft-brockners-inband-oam-requirements-02

26	Abstract

28	   This document discusses the motivation and requirements for including
29	   specific operational and telemetry information into data packets
30	   while the data packet traverses a path between two points in the
31	   network.  This method is referred to as "in-situ" Operations,
32	   Administration, and Maintenance (OAM), given that the OAM information
33	   is carried with the data packets as opposed to in "out-of-band"
34	   packets dedicated to OAM.  In-situ OAM complements other OAM
35	   mechanisms which use dedicated probe packets to convey OAM
36	   information.

38	Status of This Memo

40	   This Internet-Draft is submitted in full conformance with the
41	   provisions of BCP 78 and BCP 79.

43	   Internet-Drafts are working documents of the Internet Engineering
44	   Task Force (IETF).  Note that other groups may also distribute
45	   working documents as Internet-Drafts.  The list of current Internet-
46	   Drafts is at http://datatracker.ietf.org/drafts/current/.

48	   Internet-Drafts are draft documents valid for a maximum of six months
49	   and may be updated, replaced, or obsoleted by other documents at any
50	   time.  It is inappropriate to use Internet-Drafts as reference
51	   material or to cite them other than as "work in progress."

53	   This Internet-Draft will expire on May 3, 2017.

55	Copyright Notice

57	   Copyright (c) 2016 IETF Trust and the persons identified as the
58	   document authors.  All rights reserved.

60	   This document is subject to BCP 78 and the IETF Trust's Legal
61	   Provisions Relating to IETF Documents
62	   (http://trustee.ietf.org/license-info) in effect on the date of
63	   publication of this document.  Please review these documents
64	   carefully, as they describe your rights and restrictions with respect
65	   to this document.  Code Components extracted from this document must
66	   include Simplified BSD License text as described in Section 4.e of
67	   the Trust Legal Provisions and are provided without warranty as
68	   described in the Simplified BSD License.

70	Table of Contents

72	   1.  Introduction
73	   2.  Conventions
74	   3.  Motivation for in-situ OAM
75	     3.1.  Path Congruency Issues with Dedicated OAM Packets
76	     3.2.  Results Sent to a System Other Than the Sender
77	     3.3.  Overlay and Underlay Correlation
78	     3.4.  SLA Verification
79	     3.5.  Analytics and Diagnostics
80	     3.6.  Frame Replication/Elimination Decision for Bi-casting
81	           /Active-active Networks
82	     3.7.  Proof of Transit
83	     3.8.  Use Cases
84	   4.  Considerations for In-situ OAM
85	     4.1.  Type of Information to Be Recorded
86	     4.2.  MTU and Packet Size
87	     4.3.  Administrative Boundaries
88	     4.4.  Selective Enablement
89	     4.5.  Optimization of Node and Interface Identifiers
90	     4.6.  Loop Communication Path (IPv6-specifics)
91	   5.  Requirements for In-situ OAM Data Types
92	     5.1.  Generic Requirements
93	     5.2.  In-situ OAM Data with Per-hop Scope
94	     5.3.  In-situ OAM with Selected Hop Scope
95	     5.4.  In-situ OAM with End-to-end Scope

97	   6.  Security Considerations and Requirements
98	     6.1.  General considerations
99	     6.2.  Proof of Transit
100	   7.  IANA Considerations
101	   8.  Acknowledgements
102	   9.  References
103	     9.1.  Normative References
104	     9.2.  Informative References
105	   Authors' Addresses

107	1.  Introduction

109	   This document discusses requirements for "in-situ" Operations,
110	   Administration, and Maintenance (OAM) mechanisms.  In this context,
111	   "in-situ OAM" refers to the concept of directly encoding telemetry
112	   information within the data packet as it traverses the network or
113	   telemetry domain.  Mechanisms which add tracing or other types of
114	   telemetry information to the regular data traffic, sometimes also
115	   referred to as "in-band" OAM, can complement active, probe-based
116	   mechanisms such as ping or traceroute, which are sometimes considered
117	   "out-of-band" because the messages are transported independently
118	   from regular data traffic.  In terms of "active" or "passive" OAM,
119	   "in-situ" OAM can be considered a hybrid OAM type.  While no extra
120	   packets are sent, in-situ OAM adds information to the packets and
121	   therefore cannot be considered passive.  In terms of the
122	   classification given in [RFC7799], in-situ OAM could be portrayed as
123	   "hybrid OAM, type 1".  "In-situ" mechanisms do not require extra
124	   packets to be sent and hence do not change the packet traffic mix
125	   within the network.  Traceroute and ping, for example, use ICMP
126	   messages: new packets are injected to get tracing information.  Those
127	   add to the number of messages in a network that might already be
128	   highly loaded or suffering performance issues for a particular path
129	   or traffic type.

131	   A number of in-situ as well as in-band OAM mechanisms have been
132	   discussed, such as the INT spec for the P4 programming language [P4]
133	   or the SPUD prototype [I-D.hildebrand-spud-prototype].  The SPUD
134	   prototype uses a similar logic that allows network devices on the
135	   path between endpoints to participate explicitly in the tube outside
136	   the end-to-end context.  Even the IPv4 route-record option defined in
137	   [RFC0791] can be considered an in-situ OAM mechanism.  As stated
138	   above, in-situ OAM complements "out-of-band" mechanisms such
139	   as ping or traceroute, or more recent active probing mechanisms, as
140	   described in [I-D.lapukhov-dataplane-probe].  In-situ OAM mechanisms
141	   can be leveraged where current out-of-band mechanisms do not apply or
142	   do not offer the desired characteristics.  Examples include
143	   proving that a certain set of traffic takes a pre-defined path,
144	   verifying strict congruency between overlay and underlay transports,
145	   checking service level agreements for the live data traffic,
146	   gathering detailed statistics on, or verifying, path selections
147	   within a domain, and scenarios where probe traffic is potentially
148	   handled differently from regular data traffic by the network devices.
149	   [RFC7276] presents an overview of OAM tools.

151	   Compared to IPv4 route recording [RFC0791], probably the most basic
152	   example of "in-situ OAM", an in-situ OAM approach has the
153	   following capabilities:

155	   a.  A flexible data format to allow different types of information to
156	       be captured as part of an in-situ OAM operation, including but
157	       not limited to path tracing information, operational and
158	       telemetry information such as timestamps, sequence numbers, or
159	       even generic data such as queue size, geo-location of the node
160	       that forwarded the packet, etc.

162	   b.  A data format to express node as well as link identifiers to
163	       record the path a packet takes with a fixed amount of added data.

165	   c.  The ability to determine whether any nodes were skipped while
166	       recording in-situ OAM information (i.e., in-situ OAM is not
167	       supported or not enabled on those nodes).

169	   d.  The ability to actively process information in the packet, for
170	       example to prove in a cryptographically secure way that a packet
171	       really took a pre-defined path using some traffic steering method
172	       such as service chaining or traffic engineering.

174	   e.  The ability to include OAM data beyond simple path information,
175	       such as timestamps or even generic data of a particular use case.

177	   f.  The ability to carry in-situ OAM data in various different
178	       transport protocols.

180	2.  Conventions

182	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
183	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
184	   document are to be interpreted as described in [RFC2119].

186	   Abbreviations used in this document:

188	   ECMP:      Equal Cost Multi-Path

190	   LISP:      Locator/ID Separation Protocol

192	   MTU:       Maximum Transmission Unit
193	   NSH:       Network Service Header

195	   NFV:       Network Function Virtualization

197	   OAM:       Operations, Administration, and Maintenance

199	   PMTU:      Path MTU

201	   SFC:       Service Function Chain

203	   SLA:       Service Level Agreement

205	   SR:        Segment Routing

207	   This document defines in-situ Operations, Administration, and
208	   Maintenance (in-situ OAM) as the subset of OAM in which OAM
209	   information is carried along with data packets.  This is as opposed
210	   to "out-of-band OAM", where specific packets are dedicated to
211	   carrying OAM information.

213	3.  Motivation for in-situ OAM

215	   In several scenarios it is beneficial to make information about the
216	   path a packet took through the network or through a network device as
217	   well as associated telemetry information available to the operator.
218	   This includes not only tasks like debugging, troubleshooting,
219	   network planning, and network optimization, but also policy or
220	   service level agreement compliance checks.  This section discusses
221	   the motivation to introduce new methods for enhanced in-situ network
222	   diagnostics.

224	3.1.  Path Congruency Issues with Dedicated OAM Packets

226	   Packet scheduling algorithms, especially for balancing traffic across
227	   equal cost paths or links, often leverage information contained
228	   within the packet, such as protocol number, IP-address or MAC-
229	   address.  Probe packets would thus either need to be sent from the
230	   exact same endpoints with the exact same parameters, or probe packets
231	   would need to be artificially constructed as "fake" packets and
232	   inserted along the path.  Both approaches are often not feasible from
233	   an operational perspective, be it that access to the end-system is
234	   not feasible, or that the diversity of parameters and associated
235	   probe packets to be created is simply too large.  An in-situ
236	   mechanism is an alternative in those cases.

238	   In-situ mechanisms are not affected by cases where a router handles
239	   (and potentially forwards) probe traffic differently from regular
240	   data traffic, because the OAM information travels in the data
241	   packets themselves.  This assumes that the addition of in-situ
242	   information does not change the forwarding behavior of the packet.
243	   Note that in certain implementations, the addition of information
244	   to a transport protocol does change the forwarding behavior.  IPv6
245	   extension header processing is one example: some implementations
246	   process IPv6 packets with extension headers in the "slow" path of a
247	   router, as opposed to the "fast" path.
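
   The path congruency problem described above can be illustrated with
   a minimal sketch.  It assumes a load balancer that hashes a 5-tuple
   to select among equal cost next hops; the hash function, the field
   selection and all names are hypothetical and do not describe any
   particular router.  A probe whose header fields differ from those of
   the data flow is hashed independently and may take another path,
   whereas in-situ OAM data travels in the data packet itself and so
   sees the next hop chosen for that flow.

```python
import hashlib


def ecmp_next_hop(five_tuple, next_hops):
    """Select a next hop by hashing the flow's 5-tuple.

    Hypothetical hash: real devices use vendor-specific functions and
    field selections.
    """
    key = "|".join(str(field) for field in five_tuple).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(next_hops)
    return next_hops[index]


# Illustrative values only: (src, dst, protocol, src_port, dst_port).
next_hops = ["spine-1", "spine-2", "spine-3", "spine-4"]
data_flow = ("10.0.0.1", "10.0.1.1", 6, 49152, 443)  # TCP data flow
probe_flow = ("10.0.0.1", "10.0.1.1", 1, 0, 0)       # ICMP probe

# The data flow always maps to the same next hop; the probe is hashed
# on different field values and is not guaranteed to follow it.
print("data :", ecmp_next_hop(data_flow, next_hops))
print("probe:", ecmp_next_hop(probe_flow, next_hops))
```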

249	3.2.  Results Sent to a System Other Than the Sender

251	   Traditional ping and traceroute tools return the OAM results to the
252	   sender of the probe.  Even when the ICMP messages that are used with
253	   these tools are enhanced, and additional telemetry is collected
254	   (e.g., ICMP Multi-Part [RFC4884] supporting MPLS information
255	   [RFC4950], Interface and Next-Hop Identification [RFC5837], etc.), it
256	   would be advantageous to separate the sending of an OAM probe from
257	   the receiving of the telemetry data.  In this context, it is helpful
258	   to eliminate the requirement that there be a working bidirectional
259	   path.

261	3.3.  Overlay and Underlay Correlation

263	   Several network deployments leverage tunneling mechanisms to create
264	   overlay or service-layer networks.  Examples include VXLAN-GPE, GRE,
265	   or LISP.  One often observed attribute of overlay networks is that
266	   they do not offer the user of the overlay any insight into the
267	   underlay network.  This means that neither the path that a particular
268	   tunneled packet takes nor other operational details, such as the
269	   per-hop delay/jitter in the underlay, are visible to the user of the
270	   overlay network, giving rise to diagnosis and debugging challenges in
271	   case of connectivity or performance issues.  The scope of OAM tools
272	   like ping or traceroute is limited to either the overlay or the
273	   underlay, which means that the user of the overlay typically has no
274	   access to OAM in the underlay unless specific operational procedures
275	   are put in place.  With in-situ OAM, the operator of the underlay can
276	   offer details of the connectivity in the underlay to the user of the
277	   overlay.  This could include the ability to find out which underlay
278	   elements are shared by overlays and which overlays are mapped to the
279	   same underlay elements.  Depending on the deployment, underlay
280	   transit nodes can be configured to update OAM information in the
281	   overlay transport encapsulation.  The operator of the egress
282	   tunnel router could choose to share the recorded information about
283	   the path with the user of the overlay.

285	   Combined with mechanisms such as Segment Routing (SR)
286	   [I-D.ietf-spring-segment-routing], overlay network and underlay
287	   network can be more tightly coupled: the user of the overlay has
288	   detailed diagnostic information available in case of failure
289	   conditions.  The user of the overlay can also use the path recording
290	   information as input to traffic steering or traffic engineering
291	   mechanisms, for example to achieve path symmetry for the traffic
292	   between two endpoints.  [I-D.brockners-lisp-sr] is an example of how
293	   these methods can be applied to LISP.

295	3.4.  SLA Verification

297	   In-situ OAM can help users of an overlay service to verify that
298	   negotiated SLAs for the real traffic are met by the underlay network
299	   provider.  Unlike solutions which rely on active probes to
300	   test an SLA, in-situ OAM based mechanisms avoid wrong interpretations
301	   and "cheating", which can happen if the probe traffic that is used to
302	   perform the SLA check is prioritized by the network provider of the
303	   underlay.  In active/standby deployments, in-situ OAM would only
304	   allow for SLA verification of the active path.

306	3.5.  Analytics and Diagnostics

308	   Network planners and operators benefit from knowledge of the actual
309	   traffic distribution in the network.  When deriving an overall
310	   network connectivity traffic matrix one typically needs to correlate
311	   data gathered from each individual device in the network.  If the
312	   path of a packet is recorded while the packet is forwarded, the
313	   entire path that a packet took through the network is available to
314	   the egress system.  This obviates the need to retrieve individual
315	   traffic statistics from every device in the network and correlate
316	   those statistics, or employ other mechanisms such as leveraging
317	   traffic engineering with null-bandwidth tunnels just to retrieve the
318	   appropriate statistics to generate the traffic matrix.

320	   In addition, with individual path tracing, information is available
321	   at packet level granularity, rather than only at aggregate level - as
322	   is usually the case with IPFIX-style methods which employ flow-
323	   filters at the network elements.  Data-center networks which use
324	   equal-cost multipath (ECMP) forwarding are one example where detailed
325	   statistics on flow distribution in the network are highly desired.
326	   If a network supports ECMP, one can create detailed statistics for
327	   the different paths packets take through the network at the egress
328	   system, without a need to correlate/aggregate statistics from every
329	   router in the system.  Transit devices are off-loaded from the task
330	   of gathering packet statistics.
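
   As a minimal sketch of the egress-side analysis described above,
   assume each packet arrives at the egress system carrying the ordered
   list of node identifiers it traversed.  The record layout and the
   node names are hypothetical.  Per-path and per-link packet counts
   can then be derived at the egress alone, without querying transit
   devices.

```python
from collections import Counter


def path_statistics(recorded_paths):
    """Count packets per recorded path and per link, as seen at egress."""
    per_path = Counter(tuple(path) for path in recorded_paths)
    per_link = Counter()
    for path in recorded_paths:
        # Each pair of consecutive nodes in a path is one traversed link.
        for upstream, downstream in zip(path, path[1:]):
            per_link[(upstream, downstream)] += 1
    return per_path, per_link


# Illustrative input: three packets between leaf A and leaf B that were
# balanced across two spines.
packets = [
    ["A", "S1", "B"],
    ["A", "S2", "B"],
    ["A", "S1", "B"],
]
per_path, per_link = path_statistics(packets)
for path, count in per_path.items():
    print(" -> ".join(path), count)
```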

332	   In high-speed networks one can benefit from packet-accurate
333	   measurements, for example with hardware-accurate timestamping
334	   (i.e., nanosecond-level verification), to support optimized packet
335	   scheduling and queuing mechanisms.

337	3.6.  Frame Replication/Elimination Decision for Bi-casting/Active-
338	      active Networks

340	   Bandwidth- and power-constrained, time-sensitive, or loss-intolerant
341	   networks (e.g., networks for industry automation/control, health
342	   care) require efficient OAM methods to decide when to replicate
343	   packets to a secondary path in order to keep the loss/error-rate for
344	   the receiver at a tolerable level - and also when to stop replication
345	   and eliminate the redundant flow.  Many Internet of Things (IoT)
346	   networks are time sensitive and cannot leverage automatic
347	   retransmission requests (ARQ) to cope with transmission errors or
348	   lost packets.  Transmitting the data over multiple disparate paths
349	   (often called bi-casting or live-live) is a method used to reduce the
350	   error rate observed by the receiver.  Time-sensitive networks (TSN)
351	   receive a lot of attention from the manufacturing industry, as shown
352	   by various standardization activities and industry forums being
353	   formed (see, e.g., IETF 6TiSCH, IEEE P802.1CB, AVnu).

355	3.7.  Proof of Transit

357	   Several deployments use traffic engineering, policy routing, segment
358	   routing or Service Function Chaining (SFC) [RFC7665] to steer packets
359	   through a specific set of nodes.  In certain cases, regulatory
360	   obligations or a compliance policy require proof that all packets
361	   that are supposed to follow a specific path are indeed being
362	   forwarded across the exact set of nodes specified.  If a packet flow
363	   is supposed to go through a series of service functions or network
364	   nodes, it has to be proven that all packets of the flow actually went
365	   through the service chain or collection of nodes specified by the
366	   policy.  If the packets of a flow were not appropriately
367	   processed, a verification device would be required to identify the
368	   policy violation and take the actions the policy prescribes (e.g.,
369	   drop or redirect the packet, send an alert, etc.).
370	   In today's deployments, the proof that a packet traversed a
371	   particular service chain is typically delivered in an indirect way:
372	   Service appliances and network forwarding are in different trust
373	   domains.  Physical hand-off-points are defined between these trust
374	   domains (i.e., physical interfaces).  Or in other terms, in the
375	   "network forwarding domain" things are wired up in a way that traffic
376	   is delivered to the ingress interface of a service appliance and
377	   received back from an egress interface of a service appliance.  This
378	   "wiring" is verified and trusted.  The evolution to Network Function
379	   Virtualization (NFV) and modern service chaining concepts (using
380	   technologies such as Locator/ID Separation Protocol (LISP), Network
381	   Service Header (NSH), Segment Routing (SR), etc.) blurs the line
382	   between the different trust domains, because the hand-off-points are
383	   no longer clearly defined physical interfaces, but are virtual
384	   interfaces.  For that very reason, network operators require
385	   that different trust layers not be mixed in the same device.  For
386	   an NFV scenario a different proof is required.  Offering a proof that
387	   a packet traversed a specific set of service functions would allow
388	   network operators to move away from the above described indirect
389	   methods of proving that a service chain is in place for a particular
390	   application.

392	   Deployed service chains without a "proof of transit"
393	   mechanism are typically operated as fail-open systems: the packets
394	   that arrive at the end of a service chain are processed.  Adding
395	   "proof of transit" capabilities to a service chain allows an operator
396	   to turn a fail-open system into a fail-closed system, i.e., packets
397	   that did not properly traverse the service chain can be blocked.

399	   A solution approach could be based on OAM data which is added to
400	   every packet for achieving Proof of Transit (POT).  The OAM data is
401	   updated at every hop and is used to verify whether a packet traversed
402	   all required nodes.  When the verifier receives a packet, it can
403	   validate whether the packet traversed the service chain correctly.
404	   The detailed mechanisms used for path verification along with the
405	   procedures applied to the OAM data carried in the packet for path
406	   verification are beyond the scope of this document.  Details are
407	   addressed in [I-D.brockners-proof-of-transit].  In this document the
408	   term "proof" refers to a discrete set of bits that represents an
409	   integer or string carried as OAM data.  The OAM data is used to
410	   verify whether a packet traversed the nodes it is supposed to
411	   traverse.
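
   The general idea of a proof that is updated at every hop and checked
   by a verifier can be sketched as follows.  This is NOT the mechanism
   specified in [I-D.brockners-proof-of-transit]; it is a simplified
   illustration that assumes each node holds a secret key also known to
   the verifier and folds that key into the proof with an HMAC.  All
   names, keys and the seed value are hypothetical.

```python
import hashlib
import hmac


def update_proof(proof: bytes, node_key: bytes) -> bytes:
    """Transit node: fold this node's secret key into the carried proof."""
    return hmac.new(node_key, proof, hashlib.sha256).digest()


def expected_proof(seed: bytes, node_keys) -> bytes:
    """Verifier: recompute the proof for the required, ordered node set."""
    proof = seed
    for key in node_keys:
        proof = update_proof(proof, key)
    return proof


def verify(seed: bytes, carried_proof: bytes, node_keys) -> bool:
    """Verifier: constant-time comparison of carried and expected proof."""
    return hmac.compare_digest(carried_proof, expected_proof(seed, node_keys))


# Illustrative service chain: firewall, IDS, load balancer.
keys = [b"key-firewall", b"key-ids", b"key-lb"]
seed = b"per-packet-value"  # assumed to be set by the ingress node

# A packet that visited every node in order.
honest = seed
for key in keys:
    honest = update_proof(honest, key)

# A packet that bypassed the second node.
skipped = update_proof(update_proof(seed, keys[0]), keys[2])

print(verify(seed, honest, keys), verify(seed, skipped, keys))
```

   A verifier built this way supports the fail-closed behavior discussed
   above: packets whose proof does not verify can be dropped.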

413	3.8.  Use Cases

415	   In-situ OAM could be leveraged for several use cases, including:

417	   o  Traffic Matrix: Derive the network traffic matrix, i.e., the
418	      traffic for a given time interval between any two edge nodes of a
419	      given domain.  This could be performed for all traffic or per
420	      Quality of Service (QoS) class.

422	   o  Flow Debugging: Discover which path(s) a particular set of traffic
423	      (identified by an n-tuple) takes in the network.  Such a procedure
424	      is particularly useful in case traffic is balanced across multiple
425	      paths, like with link aggregation (LACP) or equal cost multi-
426	      pathing (ECMP).

428	   o  Loss Statistics per Path: Retrieve loss statistics per flow and
429	      path in the network.

431	   o  Path Heat Maps: Discover highly utilized links in the network.

433	   o  Trend Analysis on Traffic Patterns: Analyze if (and if so, how)
434	      the forwarding path for a specific set of traffic changes over
435	      time.  This can hint at routing issues, unstable links, etc.

437	   o  Network Delay Distribution: Show the delay distribution across
438	      the network by node or link.  If enabled per application or for a
439	      specific flow, display the path taken along with the delay
440	      incurred at every hop.

442	   o  SLA Verification: Verify that the actual traffic conforms to a
443	      negotiated service level agreement (SLA), e.g., for packet drop
444	      rates or delay/jitter.

446	   o  Low-power Networks: Include application-level OAM information
447	      (e.g., battery charge level, cache or buffer fill level) in data
448	      traffic to avoid sending extra OAM traffic, which incurs an extra
449	      cost on the devices.  For example, one could avoid sending extra
450	      OAM packets just to communicate battery health, and thus save
451	      battery on sensors.

453	   o  Path Verification or Service Function Path Verification: Proof and
454	      verification of packets traversing check points in the network,
455	      where check points can be nodes in the network or service
456	      functions.

458	   o  Geo-location Policy: Network policy implemented based on which
459	      path packets took.  Example: Access to specific applications or
460	      servers is granted only if packets originated and stayed within
461	      the trading-floor department.

463	   o  Device-level Troubleshooting and Optimization: In many cases,
464	      network operators could benefit from information specific to a
465	      single device.  A non-exhaustive list of useful information
466	      includes: queue-depths, buffer utilization (either shared or per-
467	      port), packet latency measured from a known starting point, packet
468	      latency introduced by a single device, and resource utilization
469	      (CPU, memory, link bandwidth) of a given device or link.  In some
470	      cases, this information changes over per-packet timescales (i.e.,
471	      nanoseconds) and as such it is extremely challenging to collect
472	      and report this info in an accurate and scalable manner.  By
473	      encoding the information from the forwarding element directly
474	      within a data packet (i.e., within the 'fast-path') this
475	      information can be added to some or all data packets and then
476	      collected and analyzed by human or machine tools.  This type of
477	      information is particularly valuable for troubleshooting low-level
478	      device errors as well as providing a knowledge feedback loop for
479	      network and device optimization.

481	   o  Custom Network Probing: Active network probing and in-situ OAM can
482	      be combined for customized and efficient network probing.  This
483	      could for example be a customized traceroute.
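
   The "Network Delay Distribution" use case above can be sketched as
   follows, assuming each hop records a (node identifier, timestamp)
   pair and that node clocks are synchronized closely enough for the
   differences to be meaningful.  The record layout, node names and
   timestamp values are hypothetical.

```python
def per_hop_delays(records):
    """Return (from_node, to_node, delay_ns) for each consecutive hop pair.

    records: list of (node_id, timestamp_ns) in the order traversed.
    """
    return [
        (prev_node, node, timestamp - prev_timestamp)
        for (prev_node, prev_timestamp), (node, timestamp)
        in zip(records, records[1:])
    ]


# Illustrative trace recovered from one packet at the egress.
trace = [("ingress", 1_000), ("core-1", 1_450),
         ("core-2", 2_100), ("egress", 2_300)]
delays = per_hop_delays(trace)
for from_node, to_node, delay_ns in delays:
    print(f"{from_node} -> {to_node}: {delay_ns} ns")
```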

485	4.  Considerations for In-situ OAM

487	   The implementation of an in-situ OAM mechanism needs to take several
488	   considerations into account, including administrative boundaries, how
489	   information is recorded, Maximum Transmission Unit (MTU), Path MTU
490	   Discovery (PMTUD) and packet size, etc.

492	4.1.  Type of Information to Be Recorded

494	   The information gathered for in-situ OAM can be categorized into
495	   three main categories: Information with a per-hop scope, such as path
496	   tracing; information which applies to a specific set of hops, such as
497	   path or service chain verification; information which only applies to
498	   the edges of a domain, such as sequence numbers.  Note that a single
499	   network device could comprise several in-situ OAM hops, for example
500	   in case one wants to trace the path of a packet through that device.

502	   o  "edge to edge": Information that needs to be shared between
503	      network edges (the "edge" of a network could either be a host or a
504	      domain edge device).  For example, packet and octet counts of
505	      data entering and leaving a well-defined domain are helpful in
506	      building a traffic matrix, and sequence numbers (also called
507	      "path packet counters") allow packet loss to be detected for a
508	      flow.

510	   o  "selected hops": Information that applies to a specific set of
511	      nodes only.  In case of path verification, only the nodes which
512	      are "check points" are required to interpret and update the
513	      information in the packet.

515	   o  "per hop": Information that is gathered at every hop along the
516	      path a packet traverses within an administrative domain:

518	      *  Hop-by-hop information, e.g., nodes visited for path tracing,
519	         or timestamps at each hop to find delays along the path.

521	      *  Statistics collected at each hop to optimize communication in
522	         resource-constrained networks, e.g., the battery, CPU, or
523	         memory status of each node piggybacked on a data packet is
524	         useful in low-power lossy networks where network nodes are
525	         mostly asleep and communication is expensive.
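
   For the "edge to edge" category above, a minimal sketch of how an
   egress node might use sequence numbers to detect packet loss is
   given below.  It assumes the ingress node stamps consecutive
   sequence numbers on the packets of a flow and, for simplicity,
   ignores counter wrap-around.  The function name is hypothetical.

```python
def detect_loss(received_seq_numbers):
    """Return sequence numbers missing between the lowest and highest seen.

    Reordered arrivals are tolerated; wrap-around is not handled.
    """
    seen = set(received_seq_numbers)
    if not seen:
        return []
    return [n for n in range(min(seen), max(seen) + 1) if n not in seen]


# Illustrative arrival order at the egress node.
missing = detect_loss([100, 101, 103, 104, 107])
print(missing)
```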

527	4.2.  MTU and Packet Size

529	   The data recorded at every hop might cause the packet size to exceed
530	   the Maximum Transmission Unit (MTU).  A detailed discussion of the
531	   implications of oversized IPv6 header chains is found in [RFC7112].
532	   The Path MTU restricts the amount of data that can be recorded for
533	   the purpose of OAM within a data packet.

535	   If in-situ OAM data is inserted at the edge of the domain (e.g., by
536	   intermediate routers), then the MTU on all interfaces within the
537	   domain (MTU_INT) MUST be >= the maximum MTU on any "external"
538	   facing interface (MTU_EXT), and the total size of in-situ OAM data
539	   to be recorded MUST be <= (MTU_INT - MTU_EXT).
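
   The two MTU conditions above can be sketched as a small check.  This
   is an illustration only: the function names are hypothetical, and
   MTU_INT is assumed to be the MTU that every interface within the
   domain supports (i.e., the smallest internal MTU).

```python
def oam_budget(mtu_int: int, mtu_ext: int) -> int:
    """Return the largest in-situ OAM size (bytes) the domain can absorb.

    mtu_int: MTU supported by every interface within the domain (MTU_INT).
    mtu_ext: maximum MTU of any external-facing interface (MTU_EXT).
    """
    if mtu_int < mtu_ext:
        # First condition from the text: MTU_INT MUST be >= MTU_EXT.
        raise ValueError("MTU_INT must be >= MTU_EXT")
    return mtu_int - mtu_ext


def oam_data_fits(oam_bytes: int, mtu_int: int, mtu_ext: int) -> bool:
    """Second condition: total OAM data size MUST be <= (MTU_INT - MTU_EXT)."""
    return 0 <= oam_bytes <= oam_budget(mtu_int, mtu_ext)
```

   For example, with MTU_INT = 9000 and MTU_EXT = 1500, up to 7500
   bytes of in-situ OAM data could be recorded without exceeding the
   internal MTU.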

541	   In-situ OAM comprises two approaches to insert OAM data-records in
542	   the packets:

544	   o  Pre-allocated: In this case, the encapsulating node inserts empty
545	      data records into the packet to cover the entire domain.  The data
546	      records will be incrementally updated/filled as the packet
547	      progresses through the network.  With pre-allocation the packet
548	      size is only changed at the encapsulating node and is kept
549	      constant throughout the domain.  The pre-allocated approach is
550	      beneficial for software data-plane implementations, where
551	      allocating the required space only once and indexing into the
552	      array to populate the data during transit avoids copy operations
553	      at every hop.

555	   o  Incremental: Every node that desires to include in-situ OAM
556	      information extends the packet as needed.  The incremental
557	      approach is beneficial for hardware data-plane implementations as
558	      it eliminates the need for the transit nodes to read the full
559	      array and look up the pointer in the option prior to updating the
560	      data record contents.

562	   The "incremental" and the "pre-allocated" approaches could even be
563	   combined in the same deployment - in which case two in-situ OAM
564	   headers would be present in the packet: One for the incremental
565	   approach and one for the pre-allocated approach.  In such a case one
566	   would expect that nodes with a hardware data-plane would update the
567	   incremental header, whereas nodes with a software data-plane would
568	   process the pre-allocated header.
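
   The difference between the two approaches can be illustrated with a
   minimal sketch.  The class names, the use of a node ID as the only
   recorded value, and the "next_index" pointer are assumptions made
   for illustration; they are not taken from any in-situ OAM
   encapsulation.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PreallocatedTrace:
    """Fixed-size record array created once by the encapsulating node."""
    records: List[Optional[int]]
    next_index: int = 0  # hypothetical pointer to the next free slot

    @classmethod
    def allocate(cls, max_hops: int) -> "PreallocatedTrace":
        # Empty records covering the entire domain are inserted up front.
        return cls(records=[None] * max_hops)

    def record(self, node_id: int) -> bool:
        """Fill the next empty slot; the packet size never changes."""
        if self.next_index >= len(self.records):
            return False  # no space left, nothing is recorded
        self.records[self.next_index] = node_id
        self.next_index += 1
        return True


@dataclass
class IncrementalTrace:
    """Record list that each participating node extends as needed."""
    records: List[int] = field(default_factory=list)

    def record(self, node_id: int) -> None:
        # The packet grows at every node that adds a record.
        self.records.append(node_id)
```

   With the pre-allocated variant the number of slots is constant after
   encapsulation, whereas the incremental variant grows by one record
   per participating node.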

570	4.3.  Administrative Boundaries

572	   There are several challenges in enabling in-situ OAM in the public
573	   Internet as well as in corporate/enterprise networks across
574	   administrative domains, which include but are not limited to:

576	   o  Depending on the deployment, the data fields that in-situ OAM
577	      requires as part of a specific transport protocol may not be
578	      supported across administrative boundaries.

580	   o  Current OAM implementations are often done in the slow path, i.e.,
581	      OAM packets are punted to the router's CPU for processing.  This
582	      leads to performance and scaling issues and opens up routers to
583	      attacks such as Denial of Service (DoS) attacks.

585	   o  Discovery of network topology and details of the network devices
586	      across administrative boundaries may open up attack vectors
587	      compromising network security.

589	   o  Specifically on IPv6: At the administrative boundaries IPv6
590	      packets with extension headers are dropped for several reasons
591	      described in [RFC7872].

593	   The following considerations will be discussed in a future version of
594	   this document: whether a packet is dropped due to the presence of
595	   in-situ OAM data; and, if a policy failure is treated as feature
596	   disablement so that further recording stops but the packet itself
597	   is not dropped, whether every node on the path would then have to
598	   make this policy decision.

600	4.4.  Selective Enablement

602	   The ability to selectively enable in-situ OAM is valuable.  While it
603	   may be desirable to enable data collection on all traffic or devices,
604	   this may not always be feasible.  In-situ OAM collection may also
605	   come with a performance impact to forwarding rates or feature
606	   capabilities, which may be acceptable in only some locations.  For
607	   example, the SPUD prototype uses the notion of "pipes" to describe
608	   the portion of the traffic that could be subject to in-path
609	   inspection.  Mechanisms to decide which traffic would be subject to
610	   in-situ OAM are outside the scope of this document.

612	4.5.  Optimization of Node and Interface Identifiers

614	   Since packets have a finite maximum size, the data recording or
615	   carrying capacity of one packet in which the in-situ OAM metadata is
616	   present is limited.  In-situ OAM should use its own dedicated
617	   namespace (confined to the domain in-situ OAM operates in) to
618	   represent node and interface IDs to save space in the header.
619	   Generic representations of node and interface identifiers which are
620	   globally unique (such as a UUID) would consume significantly more
621	   bits of in-situ OAM data.
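
   The space saving can be made concrete with a small comparison.  The
   field widths below (16-bit node ID, 8-bit ingress and egress
   interface IDs) are hypothetical values chosen for illustration; this
   document does not define any field sizes.

```python
import struct
import uuid

# Hypothetical domain-local record: 16-bit node ID plus 8-bit ingress and
# 8-bit egress interface IDs, in network byte order.
COMPACT_FORMAT = "!HBB"


def pack_compact(node_id: int, ingress_if: int, egress_if: int) -> bytes:
    """Encode one hop using short IDs from a domain-local namespace."""
    return struct.pack(COMPACT_FORMAT, node_id, ingress_if, egress_if)


def pack_global(node: uuid.UUID, ingress: uuid.UUID, egress: uuid.UUID) -> bytes:
    """Encode one hop using globally unique 128-bit UUIDs."""
    return node.bytes + ingress.bytes + egress.bytes
```

   Under these assumptions one hop costs 4 bytes with domain-local IDs
   and 48 bytes with UUIDs, so a ten-hop path would need 40 bytes
   instead of 480 bytes.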

623	4.6.  Loop Communication Path (IPv6-specifics)

625	   When recorded data is required to be analyzed on a source node that
626	   issues a packet and inserts in-situ OAM data, the recorded data needs
627	   to be carried back to the source node.

629	   One way to carry the in-situ OAM data back to the source is to
630	   utilize an ICMP Echo Request/Reply (ping) or ICMPv6 Echo Request/
631	   Reply (ping6) mechanism.  In order to run the in-situ OAM mechanism
632	   appropriately on the ping/ping6 mechanism, the following two
633	   operations should be implemented by the ping/ping6 target node:

635	   1.  All of the in-situ OAM fields would be copied from an Echo
636	       Request message to an Echo Reply message.

638	   2.  The Hop Limit field of the IPv6 header of these messages would be
639	       copied as a continuous sequence.  Further considerations will be
640	       addressed in a future version of this document.

642	5.  Requirements for In-situ OAM Data Types

644	   The above discussed use cases require different types of in-situ OAM
645	   data.  This section details requirements for in-situ OAM derived from
646	   the discussion above.

648	5.1.  Generic Requirements

650	   REQ-G1:  Classification: It should be possible to enable in-situ OAM
651	            on a selected set of traffic (e.g., per interface, based on
652	            an access control list specifying a specific set of traffic,
653	            etc.)  The selected set of traffic can also be all traffic.

655	   REQ-G2:  Scope: If in-situ OAM is used only within a specific domain,
656	            provisions need to be put in place to ensure that in-situ
657	            OAM data stays within the specific domain only.

659	   REQ-G3:  Transport independence: Data formats for in-situ OAM shall
660	            be defined in a transport independent way.  In-situ OAM
661	            applies to a variety of transport protocols.  Encapsulations
662	            should define how the generic data formats are carried by a
663	            specific protocol.

665	   REQ-G4:  Layering: It should be possible to have in-situ OAM
666	            information for different transport protocol layers be
667	            present in several fields within a single packet.  This
668	            could for example be the case when tunnels are employed and
669	            in-situ OAM information is to be gathered for both the
670	            underlay as well as the overlay network.  Layering support
671	            should not be limited to just underlay and overlay, but
672	            include more than two layers.

674	   REQ-G5:  MTU size: With in-situ OAM information added, packets MUST
675	            NOT become larger than the path MTU.

677	            REQ-G5.1:  If, for some reason, a packet which contains
678	                       in-situ OAM data records cannot be forwarded
679	                       due to the presence of those data records, the
680	                       node SHOULD remove the in-situ OAM data records
681	                       and forward the packet, rather than drop the
682	                       entire packet.

684	            REQ-G5.2:  If the encapsulating router is unable to insert
685	                       in-situ OAM data records into a packet, e.g., due
686	                       to MTU issues, even though it is configured to do
687	                       so, it should use some operational means to
688	                       inform the operator (e.g., syslog) about the
689	                       inability to add in-situ OAM data records.  Even
690	                       if the in-situ OAM encapsulating node fails to
691	                       add in-situ OAM data records, it should forward
692	                       the packet normally.

694	            REQ-G5.3:  MTU size consideration for in-situ OAM MUST take
695	                       domain specifics into account, e.g., changes of
696	                       the domain topology due to path protection
697	                       mechanisms might extend the hop count of a path
698	                       etc.
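
   The behavior asked for in REQ-G5.2 and the sizing concern in
   REQ-G5.3 could look roughly like the sketch below.  All names are
   hypothetical, the OAM data is simply prepended to the payload as a
   stand-in for a real encapsulation, and Python's logging module
   stands in for the operational notification (e.g., syslog).

```python
import logging
from typing import Sequence

log = logging.getLogger("insitu_oam")


def worst_case_hops(primary_hops: int, backup_hops: Sequence[int]) -> int:
    """Longest hop count over the primary path and any protection paths."""
    return max([primary_hops, *backup_hops])


def required_oam_bytes(max_hops: int, bytes_per_hop: int) -> int:
    """Space needed if every hop on the longest path adds one record."""
    return max_hops * bytes_per_hop


def encapsulate(payload: bytes, oam: bytes, path_mtu: int) -> bytes:
    """Add OAM data if it fits; otherwise notify and forward unchanged."""
    if len(payload) + len(oam) > path_mtu:
        log.warning(
            "cannot add %d bytes of in-situ OAM data: packet would exceed "
            "path MTU %d", len(oam), path_mtu)
        return payload  # the packet is still forwarded normally
    return oam + payload
```

   Sizing the OAM data for worst_case_hops() rather than for the
   primary path alone keeps a reroute onto a longer protection path
   from pushing packets over the path MTU.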

700	   REQ-G6:  Data structure reuse: The data types and data formats
701	            defined and used for in-situ OAM ought to be reusable for
702	            out-of-band OAM telemetry as well.

704	   REQ-G7:  Data records format: It is desirable that the format of in-
705	            situ OAM data-records leverages already defined data formats
706	            for OAM as much as feasible.

708	   REQ-G8:  Combination with active OAM mechanisms: In-situ OAM should
709	            be useable for active network probing, for example with a
710	            customized version of traceroute.  Decapsulating in-situ OAM
711	            nodes may have an ability to send the in-situ OAM
712	            information retrieved from the packet back to the source
713	            address of the packet or to the encapsulating node.

715	5.2.  In-situ OAM Data with Per-hop Scope

717	   REQ-H1:  Missing nodes detection: Data shall be present that allows a
718	            node to detect whether all nodes that might participate in
719	            in-situ OAM operations have indeed participated.

721	   REQ-H2:  Node, instance or device identifier: Data shall be present
722	            that allows retrieving the identity of the entity reporting
723	            telemetry information.  The entity can be a device, or a
724	            subsystem/component within a device.  The latter will allow
725	            for packet tracing within a device in much the same way as
726	            between devices.

728	   REQ-H3:  Ingress interface identifier: Data shall be present that
729	            allows the identification of the interface a particular
730	            packet was received from.  The interface can be a logical
731	            and/or physical entity.

733	   REQ-H4:  Egress interface identifier: Data shall be present that
734	            allows the identification of the interface a particular
735	            packet was forwarded to.  The interface can be a logical or
736	            physical entity.

738	   REQ-H5:  Time-related requirements

740	            REQ-H5.1:  Delay: Data shall be present that allows one to
741	                       retrieve the delay between two or more points of
742	                       interest within the system.  Those points can be
743	                       within the same device or on different devices.

745	            REQ-H5.2:  Jitter: Data shall be present that allows one to
746	                       retrieve the jitter between two or more points of
747	                       interest within the system.  Those points can be
748	                       within the same device or on different devices.
749	                       Jitter can be derived from the different
750	                       timestamps gathered and does not necessarily need
751	                       to be an explicit data record.

753	            REQ-H5.3:  Wall-clock time: Data shall be present that
754	                       allows one to retrieve the wall-clock time when a
755	                       packet visited a point of interest in the system.

757	            REQ-H5.4:  Time precision: Time with different precision
758	                       should be supported.  Use-case dependent, the
759	                       required precision could e.g., be nanoseconds,
760	                       microseconds, milliseconds, or seconds.
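
   As noted in REQ-H5.2, delay and jitter can be derived from the
   timestamps gathered along the path.  The sketch below assumes
   nanosecond timestamps taken from synchronized clocks, and it takes
   jitter to mean the absolute difference in delay that two successive
   packets experience between the same two points, which is only one
   of several common definitions.  The function names are hypothetical.

```python
from typing import List, Sequence


def hop_delays(timestamps_ns: Sequence[int]) -> List[int]:
    """Delay between each pair of consecutive points of interest."""
    return [later - earlier
            for earlier, later in zip(timestamps_ns, timestamps_ns[1:])]


def delay_variation(delays_a: Sequence[int],
                    delays_b: Sequence[int]) -> List[int]:
    """Per-segment jitter between two successive packets of a flow."""
    if len(delays_a) != len(delays_b):
        raise ValueError("packets must traverse the same points of interest")
    return [abs(b - a) for a, b in zip(delays_a, delays_b)]
```

   For two packets with timestamps [1000, 1500, 2300] and
   [5000, 5450, 6350], the per-segment delays are [500, 800] and
   [450, 900], giving a per-segment jitter of [50, 100] nanoseconds.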

762	   REQ-H6:  Generic data records (e.g., GPS/Geo-location
763	            information): It should be possible to add user-defined OAM
764	            data at select hops to the packet.  The semantics of the
765	            data are defined by the user.

767	5.3.  In-situ OAM with Selected Hop Scope

769	   REQ-S1:  Proof of transit: Data shall be present which allows one to
770	            securely prove that a packet has visited one or several
771	            particular points of interest (i.e., a particular set of
772	            nodes).

774	            REQ-S1.1:  In case "Shamir's secret sharing scheme" is used
775	                       for proof of transit, two data records, "random"
776	                       and "cumulative" shall be present.  The number of
777	                       bits used for "random" and "cumulative" data
778	                       records can vary between deployments and should
779	                       thus be configurable.

781	            REQ-S1.2:  Enable a fail-open service chaining system to be
782	                       converted into a fail-closed service chaining
783	                       system.
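
   How the "random" and "cumulative" data records of REQ-S1.1 could
   interact is sketched below.  The sketch is loosely modeled on the
   polynomial-based approach of [I-D.brockners-proof-of-transit]; the
   field size, the function names, and the way shares and coefficients
   are distributed are assumptions made for illustration and are not
   normative.

```python
from typing import Sequence

PRIME = 2**61 - 1  # Mersenne prime; the field size is an illustrative choice


def eval_poly(coeffs: Sequence[int], x: int, p: int = PRIME) -> int:
    """Evaluate coeffs[0] + coeffs[1]*x + ... modulo p (Horner's rule)."""
    result = 0
    for c in reversed(coeffs):
        result = (result * x + c) % p
    return result


def lagrange_constant(i: int, xs: Sequence[int], p: int = PRIME) -> int:
    """Lagrange basis polynomial of xs[i], evaluated at x = 0, modulo p."""
    num, den = 1, 1
    for j, xj in enumerate(xs):
        if j != i:
            num = (num * -xj) % p
            den = (den * (xs[i] - xj)) % p
    return (num * pow(den, p - 2, p)) % p  # modular inverse via Fermat


def node_update(cumulative: int, rnd: int, x: int, secret_share: int,
                public_coeffs: Sequence[int], lpc: int,
                p: int = PRIME) -> int:
    """Update the "cumulative" record at one check point.

    secret_share is the node's point on the secret polynomial; the public
    polynomial gets the per-packet "random" value as its constant term.
    """
    public_value = eval_poly([rnd, *public_coeffs], x, p)
    return (cumulative + (secret_share + public_value) * lpc) % p


def verify(cumulative: int, rnd: int, secret: int, p: int = PRIME) -> bool:
    """The verifier accepts only if all shares were combined."""
    return cumulative == (secret + rnd) % p
```

   With k check points and polynomials of degree k-1, the per-node
   contributions interpolate the constant term of the combined
   polynomial, i.e., the secret plus the per-packet random value.  If a
   check point is bypassed, its contribution is missing and the
   verification is expected to fail.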

785	5.4.  In-situ OAM with End-to-end Scope

787	   REQ-E1:  Sequence numbering:

789	            REQ-E1.1:  Reordering detection: It should be possible to
790	                       detect whether packets have been reordered while
791	                       traversing an in situ OAM domain.

793	            REQ-E1.2:  Duplicates detection: It should be possible to
794	                       detect whether packets have been duplicated while
795	                       traversing an in situ OAM domain.

797	            REQ-E1.3:  Detection of packet drops: It should be possible
798	                       to detect whether packets have been dropped while
799	                       traversing an in-situ OAM domain.
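
   A decapsulating node could derive all three conditions of REQ-E1
   from a per-flow sequence number, as in the following sketch.  The
   function name and the returned dictionary are hypothetical, and
   sequence number wrap-around is ignored for simplicity.

```python
from typing import Dict, Iterable, List


def analyze_sequence(received: Iterable[int], first: int,
                     last: int) -> Dict[str, List[int]]:
    """Classify sequence numbers seen at the egress edge of the domain.

    received: sequence numbers in arrival order.
    first, last: range of sequence numbers the ingress edge sent.
    """
    seen = set()
    duplicates: List[int] = []
    reordered: List[int] = []
    highest = None
    for seq in received:
        if seq in seen:
            duplicates.append(seq)  # same number arrived more than once
            continue
        if highest is not None and seq < highest:
            reordered.append(seq)   # arrived after a later-sent packet
        seen.add(seq)
        highest = seq if highest is None else max(highest, seq)
    dropped = [s for s in range(first, last + 1) if s not in seen]
    return {"duplicates": duplicates, "reordered": reordered,
            "dropped": dropped}
```

   For the arrival order 1, 2, 4, 3, 3, 6 with sequence numbers 1
   through 6 sent, this sketch reports packet 3 as reordered, packet 3
   as duplicated, and packet 5 as dropped.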

801	6.  Security Considerations and Requirements

803	6.1.  General considerations

805	   General Security considerations will be expanded on in a later
806	   version of this document.

808	   In-situ OAM is considered a "per domain" feature, where one or
809	   several operators decide on leveraging and configuring in-situ OAM
810	   according to their needs.  Still, operators need to properly secure
811	   the in-situ OAM domain to avoid malicious configuration and use,
812	   which could include injecting malicious in-situ OAM packets into a
813	   domain.

815	6.2.  Proof of Transit

817	   Threat Model: Attacks on the deployments could be due to malicious
818	   administrators or accidental misconfiguration resulting in bypassing
819	   of certain nodes.  The solution approach should meet the following
820	   requirements:

822	   REQ-SEC1:  Sound Proof of Transit: A valid and verifiable proof that
823	              the packet definitively traversed through all the nodes as
824	              expected.  Probabilistic methods to achieve this should be
825	              avoided, as the same could be exploited by an attacker.

827	   REQ-SEC2:  Tampering of meta data: An active attacker should not be
828	              able to insert, modify, or delete meta data in whole or
829	              in part and bypass a few (or all) nodes.  Any deviation
830	              from the expected path should be accurately determined.

832	   REQ-SEC3:  Replay Attacks: An attacker (active/passive) should not be
833	              able to reuse the POT bits in the packet by observing the
834	              OAM data in the packet, packet characteristics (like IP
835	              addresses, octets transferred, timestamps) or even the
836	              proof bits themselves.  The solution approach should
837	              consider usage of these parameters for deriving any
838	              secrets cautiously.  Mitigating replay attacks beyond a
839	              window of longer duration could be intractable to achieve
840	              with a fixed number of bits allocated for proof.

842	   REQ-SEC4:  Pre-play Attacks: An active attacker should not be able
843	              to generate or reuse valid POT bits from legitimate
844	              packets in order to pass crafted packets off as valid to
845	              the verifier.  This is a slight variant of replay
846	              attacks: the attacker extracts POT bits from legitimate
847	              packets, ensures that they do not reach the verifier, and
848	              subsequently reuses those POT bits in crafted packets.

850	   REQ-SEC5:  Recycle Secrets: Any configuration of the secrets (like
851	              cryptographic keys, initialization vectors etc.) either in
852	              the controller or service functions should be re-
853	              configurable.  Solution approach should enable controls,
854	              API calls etc. needed in order to perform such recycling.
855	              It is desirable to provide recommendations on the duration
856	              of rotation cycles needed for the secure functioning of
857	              the overall system.

859	   REQ-SEC6:  Secret storage and distribution: Secrets should be shared
860	              with the devices over secure channels.  Methods should be
861	              put in place so that secrets cannot be retrieved by non-
862	              authorized personnel from the devices.

864	7.  IANA Considerations

866	   [RFC Editor: please remove this section prior to publication.]

868	   This document has no IANA actions.

870	8.  Acknowledgements

872	   The authors would like to thank Jen Linkova, LJ Wobker, Eric Vyncke,
873	   Nalini Elkins, Srihari Raghavan, Ranganathan T S, Karthik Babu
874	   Harichandra Babu, Akshaya Nadahalli, Ignas Bagdonas, Erik Nordmark,
875	   and Andrew Yourtchenko for their comments and advice.  This
876	   document leverages and builds on top of several concepts described in
877	   [I-D.kitamura-ipv6-record-route].  The authors would like to
878	   acknowledge the work done by the author Hiroshi Kitamura and people
879	   involved in writing it.

881	9.  References

883	9.1.  Normative References

885	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
886	              Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/
887	              RFC2119, March 1997,
888	              <http://www.rfc-editor.org/info/rfc2119>.

890	9.2.  Informative References

892	   [I-D.brockners-lisp-sr]
893	              Brockners, F., Bhandari, S., Maino, F., and D. Lewis,
894	              "LISP Extensions for Segment Routing", draft-brockners-
895	              lisp-sr-01 (work in progress), February 2014.

897	   [I-D.brockners-proof-of-transit]
898	              Brockners, F., Bhandari, S., Dara, S., Pignataro, C.,
899	              Leddy, J., and S. Youell, "Proof of Transit", draft-
900	              brockners-proof-of-transit-01 (work in progress), July
901	              2016.

903	   [I-D.hildebrand-spud-prototype]
904	              Hildebrand, J. and B. Trammell, "Substrate Protocol for
905	              User Datagrams (SPUD) Prototype", draft-hildebrand-spud-
906	              prototype-03 (work in progress), March 2015.

908	   [I-D.ietf-spring-segment-routing]
909	              Filsfils, C., Previdi, S., Decraene, B., Litkowski, S.,
910	              and R. Shakir, "Segment Routing Architecture", draft-ietf-
911	              spring-segment-routing-09 (work in progress), July 2016.

913	   [I-D.kitamura-ipv6-record-route]
914	              Kitamura, H., "Record Route for IPv6 (PR6) Hop-by-Hop
915	              Option Extension", draft-kitamura-ipv6-record-route-00
916	              (work in progress), November 2000.

918	   [I-D.lapukhov-dataplane-probe]
919	              Lapukhov, P. and R. Chang, "Data-plane probe for in-band
920	              telemetry collection", draft-lapukhov-dataplane-probe-01
921	              (work in progress), June 2016.

923	   [P4]       Kim, , "P4: In-band Network Telemetry (INT)", September
924	              2015.

926	   [RFC0791]  Postel, J., "Internet Protocol", STD 5, RFC 791, DOI
927	              10.17487/RFC0791, September 1981,
928	              <http://www.rfc-editor.org/info/rfc791>.

930	   [RFC4884]  Bonica, R., Gan, D., Tappan, D., and C. Pignataro,
931	              "Extended ICMP to Support Multi-Part Messages", RFC 4884,
932	              DOI 10.17487/RFC4884, April 2007,
933	              <http://www.rfc-editor.org/info/rfc4884>.

935	   [RFC4950]  Bonica, R., Gan, D., Tappan, D., and C. Pignataro, "ICMP
936	              Extensions for Multiprotocol Label Switching", RFC 4950,
937	              DOI 10.17487/RFC4950, August 2007,
938	              <http://www.rfc-editor.org/info/rfc4950>.

940	   [RFC5837]  Atlas, A., Ed., Bonica, R., Ed., Pignataro, C., Ed., Shen,
941	              N., and JR. Rivers, "Extending ICMP for Interface and
942	              Next-Hop Identification", RFC 5837, DOI 10.17487/RFC5837,
943	              April 2010, <http://www.rfc-editor.org/info/rfc5837>.

945	   [RFC7112]  Gont, F., Manral, V., and R. Bonica, "Implications of
946	              Oversized IPv6 Header Chains", RFC 7112, DOI 10.17487/
947	              RFC7112, January 2014,
948	              <http://www.rfc-editor.org/info/rfc7112>.

950	   [RFC7276]  Mizrahi, T., Sprecher, N., Bellagamba, E., and Y.
951	              Weingarten, "An Overview of Operations, Administration,
952	              and Maintenance (OAM) Tools", RFC 7276, DOI 10.17487/
953	              RFC7276, June 2014,
954	              <http://www.rfc-editor.org/info/rfc7276>.

956	   [RFC7665]  Halpern, J., Ed. and C. Pignataro, Ed., "Service Function
957	              Chaining (SFC) Architecture", RFC 7665, DOI 10.17487/
958	              RFC7665, October 2015,
959	              <http://www.rfc-editor.org/info/rfc7665>.

961	   [RFC7799]  Morton, A., "Active and Passive Metrics and Methods (with
962	              Hybrid Types In-Between)", RFC 7799, DOI 10.17487/RFC7799,
963	              May 2016, <http://www.rfc-editor.org/info/rfc7799>.

965	   [RFC7872]  Gont, F., Linkova, J., Chown, T., and W. Liu,
966	              "Observations on the Dropping of Packets with IPv6
967	              Extension Headers in the Real World", RFC 7872, DOI
968	              10.17487/RFC7872, June 2016,
969	              <http://www.rfc-editor.org/info/rfc7872>.

971	Authors' Addresses

973	   Frank Brockners
974	   Cisco Systems, Inc.
975	   Hansaallee 249, 3rd Floor
976	   DUESSELDORF, NORDRHEIN-WESTFALEN  40549
977	   Germany

979	   Email: fbrockne@cisco.com

981	   Shwetha Bhandari
982	   Cisco Systems, Inc.
983	   Cessna Business Park, Sarjapura Marathalli Outer Ring Road
984	   Bangalore, KARNATAKA 560 087
985	   India

987	   Email: shwethab@cisco.com

989	   Sashank Dara
990	   Cisco Systems, Inc.
991	   Cessna Business Park, Sarjapura Marathalli Outer Ring Road
992	   Bangalore, KARNATAKA 560 087
993	   India

995	   Email: sadara@cisco.com

996	   Carlos Pignataro
997	   Cisco Systems, Inc.
998	   7200-11 Kit Creek Road
999	   Research Triangle Park, NC  27709
1000	   United States

1002	   Email: cpignata@cisco.com

1004	   Hannes Gredler
1005	   RtBrick Inc.

1007	   Email: hannes@rtbrick.com

1009	   John Leddy
1010	   Comcast

1012	   Email: John_Leddy@cable.comcast.com

1014	   Stephen Youell
1015	   JP Morgan Chase
1016	   25 Bank Street
1017	   London  E14 5JP
1018	   United Kingdom

1020	   Email: stephen.youell@jpmorgan.com

1022	   David Mozes
1023	   Mellanox Technologies Ltd.

1025	   Email: davidm@mellanox.com

1027	   Tal Mizrahi
1028	   Marvell
1029	   6 Hamada St.
1030	   Yokneam  20692
1031	   Israel

1033	   Email: talmi@marvell.com

1034	   Petr Lapukhov
1035	   Facebook
1036	   1 Hacker Way
1037	   Menlo Park, CA  94025
1038	   USA

1040	   Email: petr@fb.com

1042	   Remy Chang
1043	   Barefoot Networks

1045	   Email: remy@barefootnetworks.com