idnits 2.17.1 

draft-bryant-rtgwg-enhanced-vpn-02.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == Line 424 has weird spacing: '... Tenant  conne...'

  -- The document date (March 05, 2018) is 2237 days in the past.  Is this
     intentional?


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Unused Reference: 'NETCALC' is defined on line 997, but no explicit
     reference was found in the text

  == Outdated reference: A later version (-13) exists of
     draft-ietf-detnet-architecture-04

  == Outdated reference: A later version (-04) exists of
     draft-ietf-detnet-dp-sol-01


     Summary: 0 errors (**), 0 flaws (~~), 5 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Routing Area Working Group                                     S. Bryant
3	Internet-Draft                                                   J. Dong
4	Intended status: Informational                                    Huawei
5	Expires: September 6, 2018                                         Z. Li
6	                                                            China Mobile
7	                                                             T. Miyasaka
8	                                                        KDDI Corporation
9	                                                          March 05, 2018

11	                Enhanced Virtual Private Networks (VPN+)
12	                   draft-bryant-rtgwg-enhanced-vpn-02

14	Abstract

16	   This draft describes a number of enhancements that need to be made to
17	   virtual private networks (VPNs) to support the needs of new
18	   applications, particularly applications that are associated with 5G
19	   services.  A network enhanced with these properties may form the
20	   underpinning of network slicing, but will also be of use in its own
21	   right.

23	Status of This Memo

25	   This Internet-Draft is submitted in full conformance with the
26	   provisions of BCP 78 and BCP 79.

28	   Internet-Drafts are working documents of the Internet Engineering
29	   Task Force (IETF).  Note that other groups may also distribute
30	   working documents as Internet-Drafts.  The list of current Internet-
31	   Drafts is at https://datatracker.ietf.org/drafts/current/.

33	   Internet-Drafts are draft documents valid for a maximum of six months
34	   and may be updated, replaced, or obsoleted by other documents at any
35	   time.  It is inappropriate to use Internet-Drafts as reference
36	   material or to cite them other than as "work in progress."

38	   This Internet-Draft will expire on September 6, 2018.

40	Copyright Notice

42	   Copyright (c) 2018 IETF Trust and the persons identified as the
43	   document authors.  All rights reserved.

45	   This document is subject to BCP 78 and the IETF Trust's Legal
46	   Provisions Relating to IETF Documents
47	   (https://trustee.ietf.org/license-info) in effect on the date of
48	   publication of this document.  Please review these documents
49	   carefully, as they describe your rights and restrictions with respect
50	   to this document.  Code Components extracted from this document must
51	   include Simplified BSD License text as described in Section 4.e of
52	   the Trust Legal Provisions and are provided without warranty as
53	   described in the Simplified BSD License.

55	Table of Contents

57	   1.  Introduction
58	   2.  Requirements Language
59	   3.  Overview of the Requirements
60	     3.1.  Isolation between Virtual Networks
61	     3.2.  Diverse Performance Guarantees
62	     3.3.  A Pragmatic Approach to Isolation
63	     3.4.  Integration
64	     3.5.  Dynamic Configuration
65	     3.6.  Customized Control Plane
66	   4.  Architecture and Components of VPN+
67	     4.1.  Communications Layering
68	     4.2.  Multi-Point to Multi-point
69	     4.3.  Candidate Underlay Technologies
70	       4.3.1.  FlexE
71	       4.3.2.  Dedicated Queues
72	       4.3.3.  Time Sensitive Networking
73	       4.3.4.  Deterministic Networking
74	       4.3.5.  MPLS Traffic Engineering (MPLS-TE)
75	       4.3.6.  Segment Routing
76	     4.4.  Control Plane Considerations
77	     4.5.  Application Specific Network Types
78	     4.6.  Integration with Service Functions
79	   5.  Scalability Considerations
80	     5.1.  Maximum Stack Depth
81	     5.2.  RSVP scalability
82	   6.  OAM and Instrumentation
83	   7.  Enhanced Resiliency
84	   8.  Security Considerations
85	   9.  IANA Considerations
86	   10. References
87	     10.1.  Normative References
88	     10.2.  Informative References
89	   Authors' Addresses

91	1.  Introduction

93	   Virtual networks, often referred to as virtual private networks
94	   (VPNs), have served the industry well as a means of providing
95	   different groups of users with logically isolated access to a common
96	   network.  The common or base network that is used to provide the VPNs
97	   is often referred to as the underlay, and the VPN is often called an
98	   overlay.

100	   Driven largely by needs surfacing from 5G, the concept of network
101	   slicing has gained traction.  There is a need to create a VPN with
102	   enhanced characteristics.  Specifically there is a need for a
103	   transport network supporting a set of virtual networks each of which
104	   provides the client with dedicated (private) networking, computing
105	   and storage resources drawn from a shared pool.
106	   The tenant of such a network can require a degree of isolation and
107	   performance that previously could only be satisfied by dedicated
108	   networks.  Additionally, the tenant may ask for some level of control
109	   over their virtual network, e.g., to customize the service paths in
110	   the network slice.

112	   These properties cannot be met with pure overlay networks, as they
113	   require tighter coordination and integration between the underlay and
114	   the overlay network.  This document introduces a new network service
115	   called enhanced VPN (VPN+).  VPN+ refers to a virtual network which
116	   has dedicated network resources allocated from the underlay network.
117	   Unlike a traditional VPN, an enhanced VPN can achieve greater
118	   isolation and guaranteed performance.

120	   These new network layer properties, which have general applicability,
121	   may also be of interest as part of a network slicing solution.

123	   This document specifies a framework for using existing, modified,
124	   and potential new networking technologies as components to provide an
125	   enhanced VPN (VPN+) service.  Specifically we are concerned with:

127	   o  The design of the enhanced VPN data-plane

129	   o  The necessary protocols in both the underlay and the overlay of
130	      the enhanced VPN

132	   o  The mechanisms to achieve integration between overlay and underlay

134	   o  The necessary method of monitoring an enhanced VPN

136	   o  The methods of instrumenting an enhanced VPN to ensure that the
137	      required tenant Service Level Agreement (SLA) is maintained

139	   The layer structure necessary to achieve this is shown in
140	   Section 4.1.

142	   One use for enhanced VPNs is to create network slices with different
143	   isolation requirements.  Such slices may be used to provide different
144	   tenants of vertical industrial markets with their own virtual network
145	   with the explicit characteristics required.  These slices may be
146	   "hard" slices providing a high degree of confidence that the VPN+
147	   characteristics will be maintained over the slice life cycle, or they
148	   may be "soft" slices in which case some degree of interaction may be
149	   experienced.

151	2.  Requirements Language

153	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
154	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
155	   "OPTIONAL" in this document are to be interpreted as described in
156	   [RFC2119].

158	3.  Overview of the Requirements

160	   In this section we provide an overview of the requirements of an
161	   enhanced VPN.

163	3.1.  Isolation between Virtual Networks

165	   The requirement is to provide both hard and soft isolation between
166	   the tenants/applications using one enhanced VPN and the tenants/
167	   applications using another enhanced VPN.  Hard isolation is needed so
168	   that applications with exacting requirements can function correctly
169	   despite a flash demand being created on another VPN competing for the
170	   underlying resources.  An example might be a network supporting both
171	   emergency services and public broadband multi-media services.

173	   During a major incident the VPNs supporting these services would both
174	   be expected to experience high data volumes, and it is important that
175	   both make progress in the transmission of their data.  In these
176	   circumstances the VPNs would require an appropriate degree of
177	   isolation to be able to continue to operate acceptably.

179	   We introduce the terms hard (static) and soft (dynamic) isolation to
180	   cover cases such as the above.  A VPN has soft isolation if the
181	   traffic of one VPN cannot be inspected by the traffic of another.
182	   Both IP and MPLS VPNs are examples of soft isolated VPNs because the
183	   network delivers the traffic only to the required VPN endpoints.
184	   However the traffic from one or more VPNs and regular network traffic
185	   may congest the network resulting in delays for other VPNs operating
186	   normally.  The ability for a VPN to be sheltered from this effect is
187	   called hard isolation, and this property is required by some critical
188	   applications.  Although these isolation requirements are triggered by
189	   the needs of 5G networks, they have general utility.  In the
190	   remainder of this section we explore how isolation may be achieved in
191	   packet networks.

193	   It is of course possible to achieve high degrees of isolation in the
194	   optical layer.  However this is done at the cost of allocating
195	   resources on a long term and end-to-end basis.  Such an
196	   arrangement means that the full cost of the resources must be borne
197	   by the service that is allocated the resources.  On the other hand,
198	   isolation at the packet layer allows the resources to be shared
199	   amongst many services and only dedicated to a service on a temporary
200	   basis.  This allows greater statistical multiplexing of network
201	   resources and amortizes the cost over many services, leading to
202	   better economy.  However, the degree of isolation required by network
203	   slicing cannot easily be met with MPLS-TE packet LSPs as they
204	   guarantee long-term bandwidth, but not latency.

206	   Thus some trade-off between the two approaches needs to be considered
207	   to provide the required isolation between virtual networks while
208	   still allowing reasonable sharing inside each VPN.

210	   The work of the IEEE project on Time Sensitive Networking is
211	   introducing the concept of packet scheduling where a high priority
212	   packet stream may be given a scheduled time slot thereby guaranteeing
213	   that it experiences no queuing delay and hence a reduced latency.
214	   However where no scheduled packet arrives its reserved time-slot is
215	   handed over to best effort traffic, thereby improving the economics
216	   of the network.  Such a scheduling mechanism may be usable directly,
217	   or with extension, to achieve isolation between multiple VPNs.
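
   As an illustration only, the hand-over of an unused reserved time
   slot to best effort traffic can be sketched in Python as follows.
   The function name transmit_slot and the simple two-queue model are
   assumptions made for this sketch; they are not taken from the IEEE
   TSN specifications, which define considerably richer scheduling
   mechanisms.

```python
from collections import deque


def transmit_slot(scheduled_queue, best_effort_queue):
    """Choose the packet to send in one reserved time slot.

    The scheduled stream owns the slot.  If it has nothing to send, the
    slot is handed to best effort traffic instead of being left idle.
    """
    if scheduled_queue:
        return scheduled_queue.popleft()
    if best_effort_queue:
        return best_effort_queue.popleft()
    return None  # nothing to send, the slot goes unused


# Hypothetical traffic: one scheduled packet, two best effort packets.
scheduled = deque(["hp-1"])
best_effort = deque(["be-1", "be-2"])
sent = [transmit_slot(scheduled, best_effort) for _ in range(4)]
print(sent)
```

   In this sketch the scheduled packet never waits behind best effort
   traffic, while the slots it leaves empty are still used.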

219	   One of the key areas in which isolation needs to be provided is at
220	   the interfaces.  If nothing is done the system falls back to the
221	   router queuing system in which the ingress places each packet on a
222	   selected output queue.  Modern routers have quite sophisticated
223	   output queuing systems, but traditionally these have not provided
224	   the type of scheduling needed to support the levels of isolation
225	   required by the applications that are the target of VPN+ networks.
226	   However some of the more modern approaches to queuing allow the
227	   construction of logical virtual channelized sub-interfaces (VCSI).
228	   With VCSIs there is only one physical interface, and routing sees a
229	   single adjacency, but the queuing system is used to provide virtual
230	   interfaces at various priorities.  Sophisticated queuing systems of
231	   this type may be used to provide end-to-end virtual isolation between
232	   tenants' traffic in an otherwise homogeneous network.

234	   [FLEXE] provides the ability to multiplex multiple channels over an
235	   Ethernet link in a way that provides hard isolation.  However it is
236	   only a link technology.  When packets are received by the downstream
237	   node they need to be processed in a way that preserves that
238	   isolation.  This in turn requires a queuing and forwarding
239	   implementation that preserves the isolation, such as a sliced
240	   hardware system, or a VCSI system of the type described above.

242	3.2.  Diverse Performance Guarantees

244	   There are several aspects to guaranteed performance: guaranteed
245	   maximum packet loss, guaranteed maximum delay, and guaranteed delay
246	   variation.

248	   Guaranteed maximum packet loss is a common parameter, and is usually
249	   addressed by setting the packet priorities, queue size and discard
250	   policy.  However this becomes more difficult when the requirement is
251	   combined with the latency requirement.  The limiting case is zero
252	   congestion loss, and that is the goal of the Deterministic Networking
253	   work that the IETF and IEEE are pursuing.  In modern optical networks
254	   loss due to transmission errors is already asymptotic to zero,
255	   but there is always the possibility of failure of the interface and
256	   the fiber itself.  This can only be addressed by some form of packet
257	   duplication and transmission over diverse paths.

259	   Guaranteed maximum latency is required in a number of applications,
260	   particularly real-time control applications and some types of virtual
261	   reality applications.  The work of the IETF Deterministic Networking
262	   (DetNet) Working Group is relevant; however, the scope needs to be
263	   extended to methods of enhancing the underlay to better support the
264	   delay guarantee, and to integrate these enhancements with the overall
265	   service provision.

267	   Guaranteed maximum delay variation is a service that may also be
268	   needed.  Time transfer is one example of a service that needs this,
269	   although the fungible nature of time means that it might be delivered
270	   by the underlay as a shared service and not provided through
271	   different virtual networks.  Alternatively a dedicated virtual
272	   network may be used to provide this as a shared service.  The need
273	   for guaranteed maximum delay variation as a general requirement is
274	   for further study.

276	   This leads to the concept that there is a spectrum of grades of
277	   service guarantee that need to be considered when deploying an
278	   enhanced VPN.  As a guide to understanding the design requirements we
279	   can consider four types:

281	   o  Guaranteed latency

283	   o  Enhanced delivery

285	   o  Assured bandwidth

287	   o  Best effort

288	   In Section 3.1 we considered the work of the IEEE Time Sensitive
289	   Networking (TSN) project and the work of the IETF DetNet Working
290	   Group in the context of isolation.  However this work is of greater
291	   relevance in assuring end-to-end packet latency.  It is also of
292	   importance in considering enhanced delivery.

294	   A service with guaranteed latency has a latency upper bound
295	   provided by the network.  It is important to note that assuring the
296	   upper bound is more important than achieving the minimum latency.
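
   To make the distinction concrete, the following Python sketch selects
   a path by its worst-case latency rather than by its typical latency.
   The function select_path, the path names, and all latency figures are
   hypothetical values chosen for illustration; this document does not
   define a path selection algorithm.

```python
def select_path(paths, bound_ms):
    """Return the name of an admissible path, or None if there is none.

    A path is admissible when its worst-case latency is within the
    bound.  The typical latency is deliberately ignored.
    """
    admissible = [p for p in paths if p["worst_case_ms"] <= bound_ms]
    best = min(admissible, key=lambda p: p["worst_case_ms"], default=None)
    return best["name"] if best else None


# Hypothetical candidates: "short" is usually faster but is not bounded.
candidates = [
    {"name": "short", "typical_ms": 2, "worst_case_ms": 15},
    {"name": "engineered", "typical_ms": 5, "worst_case_ms": 8},
]
choice = select_path(candidates, bound_ms=10)
print(choice)
```

   With a 10 ms bound the sketch picks the "engineered" path even though
   the "short" path has the lower typical latency.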

298	   A service that is offered enhanced delivery is one in which the
299	   network (at layer 3) attempts to deliver the packet through multiple
300	   paths in the hope of avoiding transient congestion
301	   [I-D.ietf-detnet-dp-sol].

303	   A useful mechanism to provide these guarantees is to use Flex
304	   Ethernet [FLEXE] as the underlay.  This is a method of bonding
305	   Ethernets together and of providing time-slot based channelization
306	   over an Ethernet bearer.  Such channels are fully isolated from other
307	   channels running over the same Ethernet bearer.  As noted elsewhere
308	   this produces hard isolation but at the cost of making the
309	   reclamation of unused bandwidth harder.

311	   These approaches can usefully be used in tandem.  It is possible to
312	   use FlexE to provide tenant isolation, and then to use the TSN
313	   approach over FlexE to provide service performance guarantees inside
314	   a slice/tenant VPN.

316	3.3.  A Pragmatic Approach to Isolation

318	   A key question to consider is whether it is possible to
319	   achieve hard isolation in packet networks.  Packet networks were
320	   never designed to support hard isolation; quite the opposite, they
321	   were designed to provide a high degree of statistical multiplexing
322	   and hence a significant economic advantage when compared to a
323	   dedicated, or a Time Division Multiplexing (TDM) network.  However
324	   the key thing to bear in mind is that the concept of hard isolation
325	   needs to be viewed from the perspective of the application, and there
326	   is no need to provide any harder isolation than is required by the
327	   application.  From a historical perspective it is good to think about
328	   pseudowires [RFC3985], which emulate services that in many cases
329	   would have had hard isolation in their native form.  However
330	   experience has shown that in most cases an approximation to this
331	   requirement is sufficient.

333	   Thus, for example, using FlexE or channelized sub-interfaces together
334	   with packet scheduling as interface slicing, and optionally, also
335	   together with the slicing of node resources (Network Processor Unit
336	   (NPU), etc.), it may be possible to provide a type of hard isolation
337	   that is adequate for many applications.  Other applications may be
338	   satisfied with a classical VPN and reserved bandwidth, but yet others
339	   may require dedicated point to point fiber.  The requirement is thus
340	   to qualify the needs of each application and provide an economic
341	   solution that satisfies those needs without over-engineering.

343	3.4.  Integration

345	   A solution to the enhanced VPN problem will need to provide seamless
346	   integration of both the overlay VPN and the underlay network
347	   resources.  This needs to be done in a flexible and scalable way so
348	   that it can be widely deployed in operator networks.  Given the
349	   targeting of both this technology and service function chaining at
350	   mobile networks, and in particular 5G, the co-integration of service
351	   functions is a likely requirement.

353	3.5.  Dynamic Configuration

355	   It is necessary that new enhanced VPNs can be introduced to the
356	   network, modified, and removed from the network according to service
357	   demand.  In doing so due regard must be given to the impact on other
358	   enhanced VPNs that are operational.  An enhanced VPN that requires
359	   hard isolation must not be disrupted by the installation or
360	   modification of another enhanced VPN.

362	   Whether modification of an enhanced VPN can be disruptive to that
363	   VPN, and in particular to the traffic in flight, is yet to be
364	   determined, but is likely to be a difficult problem to address.

366	   The data-plane aspects of this are discussed further in Section 4.3.

368	   The control-plane and management-plane aspects of this, particularly
369	   the garbage collection, are likely to be challenging and are for
370	   further study.

372	   As well as managing dynamic changes to the VPN in a seamless way,
373	   dynamic changes to the underlay and its transport network need to be
374	   managed in order to avoid disruption to sensitive services.

376	   In addition to non-disruptively managing the network as a result of
377	   gross change such as the inclusion of a new VPN endpoint or a change
378	   to a link, consideration has to be given to the need to move VPN
379	   traffic as a result of traffic volume changes.

381	3.6.  Customized Control Plane

383	   In some cases it is desirable that an enhanced VPN has a custom
384	   control-plane, so that the tenant of the enhanced VPN can have some
385	   control over the resources and functions partitioned for this VPN.
386	   Each enhanced VPN may have its own dedicated controller, it may be
387	   provided with an interface to a control-plane that is shared with a
388	   set of other tenants, or it may be provided with an interface to the
389	   control-plane of the underlay provided by the underlay network
390	   operator.

392	   Further detail on this requirement will be provided in a future
393	   version of the draft.

395	4.  Architecture and Components of VPN+

397	   Normally a number of enhanced VPN services will be provided by a
398	   common network infrastructure.  Each enhanced VPN consists of both
399	   the overlay and a specific set of dedicated network resources and
400	   functions allocated in the underlay to satisfy the needs of the VPN
401	   tenant.  The integration between overlay and underlay ensures the
402	   isolation between different enhanced VPNs, and facilitates the
403	   guaranteed performance for different services.

405	   An enhanced VPN needs to be designed with consideration given to:

407	   o  Isolation of enhanced VPN data plane.

409	   o  A scalable control plane to match the data plane isolation.

411	   o  The amount of state in the packet vs the amount of state in the
412	      control plane.

414	   o  Mechanism for diverse performance guarantee within an enhanced VPN

416	   o  Support of the required integration between network functions and
417	      service functions.

419	4.1.  Communications Layering

421	   The communications layering model used to build an enhanced VPN is
422	   shown in Figure 1.

424	   Tenant          Tenant  connection             Tenant
425	   CE1    ----------------------------------------CE2
426	     \                                            /
427	   AC \   OP        Provider VPN           OP    /AC
428	       +- PE1------------------------------PE2 -+
429	                    Enhanced Path
430	             ==============================
431	                       Underlay
432	             ++++++++++++++++++++++++++++++

434	                     Figure 1: Communication Layering

436	   The network operator is required to provide a tenant connection
437	   between the tenant's Customer Equipment (CE) (CE1 and CE2).  These
438	   CEs attach to the Operator's Provider Edge Equipments (PE) (PE1 and
439	   PE2 respectively).  The attachment circuits (AC) are outside the
440	   scope of this document other than to note that they obviously need to
441	   provide a connection of sufficient quality in terms of isolation,
442	   latency, etc., so as to satisfy the needs of the user.  The subtlety
443	   to be aware of is that the ACs are often provided by a network rather
444	   than a fixed point to point connection and thus the considerations in
445	   this document may apply to the network that provides the AC.

447	   A provider VPN is constructed between PE1 and PE2 to carry tenant
448	   traffic.  This is a normal VPN, and provides one stage of isolation
449	   between tenants.

451	   An enhanced path is constructed to carry the provider VPN using
452	   dedicated resources drawn from the underlay.

454	4.2.  Multi-Point to Multi-point

456	   At a VPN level connections are frequently multi-point-to-multi-point
457	   (MP2MP).  As far as such services are concerned the underlay is also
458	   an abstract MP2MP medium.  However when service guarantees are
459	   provided, such as with an enhanced VPN, each point to point path
460	   through the underlay needs to be specifically engineered to meet the
461	   required performance guarantees.
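
   As a rough illustration of how this engineering effort grows with the
   number of endpoints, the Python sketch below enumerates the
   point-to-point paths of an MP2MP service.  It assumes a full mesh in
   which each direction is engineered separately; that assumption, the
   function name paths_to_engineer, and the PE names are ours and are
   not requirements of this document.

```python
from itertools import permutations


def paths_to_engineer(endpoints):
    """List every ordered (ingress, egress) pair of distinct endpoints."""
    return list(permutations(endpoints, 2))


# Hypothetical enhanced VPN with four provider edge nodes.
paths = paths_to_engineer(["PE1", "PE2", "PE3", "PE4"])
print(len(paths))
```

   Under these assumptions n endpoints give n * (n - 1) unidirectional
   paths, so four endpoints already require twelve engineered paths.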

463	4.3.  Candidate Underlay Technologies

465	   A VPN is a network created by applying a multiplexing technique to
466	   the underlying network (the underlay) in order to distinguish the
467	   traffic of one VPN from that of another.  A VPN path that travels by
468	   other than the shortest path through the underlay normally requires
469	   state in the underlay to specify that path.  State is normally
470	   applied to the underlay through the use of the RSVP Signaling
471	   protocol, or directly through the use of an SDN controller, although
472	   other techniques may emerge as this problem is studied.  This state
473	   gets harder to manage as the number of VPN paths increases.
474	   Furthermore, as we increase the coupling between the underlay and the
475	   overlay to support VPNs that require enhanced VPN service, this
476	   state will increase further.

478	   In an enhanced VPN different subsets of the underlay resources are
479	   dedicated to different VPNs.  Any enhanced VPN solution thus needs
480	   tighter coupling with the underlay than is the case with classical
481	   VPNs.  We cannot, for example, share the tunnel between enhanced VPNs
482	   which require hard isolation.

484	   In the following sections we consider a number of candidate underlay
485	   solutions for providing the required VPN separation.

487	   o  FlexE

489	   o  Time Sensitive Networking

491	   o  Deterministic Networking

493	   o  Dedicated Queues

495	   We then consider the problem of slice differentiation and resource
496	   representation.  Candidate technologies are:

498	   o  MPLS

500	   o  MPLS-SR

502	   o  Segment Routing over IPv6 (SRv6)

504	4.3.1.  FlexE

506	   FlexE [FLEXE] is a method of creating a point-to-point Ethernet with
507	   a specific fixed bandwidth.  FlexE supports the bonding of multiple
508	   links, which supports creating larger links out of multiple slower
509	   links in a more efficient way than traditional link aggregation.
510	   FlexE also supports the sub-rating of links, which allows an operator
511	   to only use a portion of a link.  FlexE also supports the
512	   channelization of links, which allows one link to carry several
513	   lower-speed or sub-rated links from different sources.
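
   The three capabilities can be pictured with a simple slot calendar.
   The Python sketch below is illustrative only.  It assumes that a
   100 Gb/s PHY is divided into 20 calendar slots of 5 Gb/s each; the
   function build_calendar, the first-fit allocation, and the tenant
   names and rates are hypothetical and are not taken from [FLEXE].

```python
SLOT_GBPS = 5        # assumed calendar slot size
SLOTS_PER_PHY = 20   # assumed: one 100 Gb/s PHY carries 20 such slots


def build_calendar(num_phys, clients):
    """Assign contiguous calendar slots to each (name, gbps) client.

    Bonding: the calendar spans all PHYs in the group.
    Channelization: several clients share the same group.
    Sub-rating: slots left as None carry no client traffic.
    """
    calendar = [None] * (num_phys * SLOTS_PER_PHY)
    next_free = 0
    for name, gbps in clients:
        if gbps % SLOT_GBPS != 0:
            raise ValueError(f"{name}: rate is not a multiple of a slot")
        needed = gbps // SLOT_GBPS
        if next_free + needed > len(calendar):
            raise ValueError(f"{name}: not enough free slots")
        calendar[next_free:next_free + needed] = [name] * needed
        next_free += needed
    return calendar


# Two bonded PHYs carrying a 150 Gb/s client and a 25 Gb/s client.
calendar = build_calendar(2, [("tenant-a", 150), ("tenant-b", 25)])
print(calendar.count("tenant-a"), calendar.count("tenant-b"),
      calendar.count(None))
```

   Here the 150 Gb/s client is larger than a single PHY and so relies on
   bonding, both clients are channelized onto the same group, and the
   five unassigned slots correspond to a sub-rated group.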

515	   If different FlexE channels are used for different services, then no
516	   sharing is possible between the services.  This in turn means that it
517	   is not possible to dynamically re-distribute unused bandwidth to
518	   lower priority services, increasing the cost of operation of the
519	   network.  FlexE can on the other hand be used to provide hard
520	   isolation between different tenants by providing hard isolation on an
521	   interface.  The tenant can then use other methods to manage the
522	   relative priority of their own traffic.

524	   Methods of dynamically re-sizing FlexE channels and the implication
525	   for enhanced VPN are under study.

527	4.3.2.  Dedicated Queues

529	   In an enhanced VPN providing multiple isolated virtual networks the
530	   conventional Diff-Serv based queuing system is insufficient for our
531	   purposes due to the limited number of queues which cannot
532	   differentiate between traffic of different VPNs and the range of
533	   service classes that each need to provide their tenants.  This
534	   problem is particularly acute with an MPLS underlay due to the small
535	   number of traffic class services available.  In order to address this
536	   problem and thus reduce the interference between VPNs, it is likely
537	   to be necessary to steer traffic of VPNs to dedicated input and
538	   output queues.
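
   The difference between the two queuing models can be sketched in
   Python as follows.  The sketch relies on the MPLS Traffic Class field
   being 3 bits wide, which allows eight values; the class
   DedicatedQueues, the function shared_queue, and the VPN names are
   hypothetical and do not describe any particular router
   implementation.

```python
MPLS_TC_VALUES = 8  # the MPLS Traffic Class field is 3 bits wide


def shared_queue(vpn_id, traffic_class):
    """Conventional model: the queue depends only on the traffic class."""
    return traffic_class % MPLS_TC_VALUES


class DedicatedQueues:
    """Allocate one queue per (VPN, traffic class) pair on first use."""

    def __init__(self):
        self._index = {}

    def queue_for(self, vpn_id, traffic_class):
        key = (vpn_id, traffic_class)
        # A new pair gets the next unused queue number.
        return self._index.setdefault(key, len(self._index))


dedicated = DedicatedQueues()
same_shared = shared_queue("vpn-red", 5) == shared_queue("vpn-blue", 5)
same_dedicated = (dedicated.queue_for("vpn-red", 5)
                  == dedicated.queue_for("vpn-blue", 5))
print(same_shared, same_dedicated)
```

   In the conventional model two VPNs using the same traffic class
   compete in one queue, whereas the dedicated model keeps them apart at
   the cost of more queues and more state.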

540	4.3.3.  Time Sensitive Networking

542	   Time Sensitive Networking (TSN) is an IEEE project that is designing
543	   a method of carrying time sensitive information over Ethernet.  As
544	   Ethernet this can obviously be tunneled over a Layer 3 network in a
545	   pseudowire.  However the TSN payload would be opaque to the underlay
546	   and thus not treated specifically as time sensitive data.  The
547	   preferred method of carrying TSN over a layer 3 network is through
548	   the use of deterministic networking as explained in the following
549	   section of this document.

551	   The mechanisms defined in TSN can be used to meet the requirements of
552	   time sensitive services of an enhanced VPN.

554	4.3.4.  Deterministic Networking

556	   Deterministic Networking (DetNet) [I-D.ietf-detnet-architecture] is a
557	   technique being developed in the IETF to enhance the ability of layer
558	   3 networks to deliver packets more reliably and with greater control
559	   over the delay.  The design cannot use classical re-transmission
560	   techniques such as TCP since these can add delay above the maximum
561	   tolerated by the applications.  Even the delay improvements that are
562	   achieved with SCTP-PR are outside the bounds set by application
563	   demands.  The approach is to pre-emptively send copies of the packet
564	   over various paths in the expectation that this minimizes the chance
565	   of all packets being lost, but to trim duplicate packets to prevent
566	   excessive flooding of the network and to prevent multiple packets
567	   being delivered to the destination.  It also seeks to set an upper
568	   bound on latency.  Note that it is not the goal to minimize latency,
569	   and the optimum upper bound paths may not be the minimum latency
570	   paths.
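
	   The replicate-and-trim behaviour described above can be sketched as
	   follows.  This is a minimal illustration, not the DetNet data plane:
	   the names replicate, Eliminator and deliver are hypothetical, the
	   per-path loss probabilities are invented inputs, and the unbounded
	   set of seen sequence numbers is an assumption made for brevity.

```python
import random


def replicate(seq, payload, paths):
    """Create one copy of the packet for every path (replication)."""
    return [(path, seq, payload) for path in paths]


class Eliminator:
    """Forward the first copy of each sequence number and drop the rest."""

    def __init__(self):
        # Assumption: an unbounded set; a real node would bound this history.
        self.seen = set()

    def accept(self, seq):
        if seq in self.seen:
            return False  # duplicate copy: trim it
        self.seen.add(seq)
        return True


def deliver(packets, paths, loss, rng):
    """Return the sequence numbers handed to the destination, in order.

    packets: iterable of (seq, payload)
    loss:    hypothetical per-path loss probability, keyed by path name
    rng:     a random.Random instance, passed in so runs are repeatable
    """
    eliminator = Eliminator()
    delivered = []
    for seq, payload in packets:
        for path, copy_seq, _ in replicate(seq, payload, paths):
            if rng.random() < loss[path]:
                continue  # this copy was lost on its path
            if eliminator.accept(copy_seq):
                delivered.append(copy_seq)
    return delivered
```

	   With one path losing every copy and the other losing none, each
	   packet is still delivered exactly once, which is the property the
	   text relies on.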

572	   DetNet is based on flows.  It currently makes no comment on the
573	   underlay, and so at this stage must be assumed to use the base
574	   topology.  To be of use in this application, there needs to be a
575	   description of how DetNet deals with the concept of flows within an
576	   enhanced VPN.

578	   How to use DetNet in a multi-tenant (VPN) network, and how to improve
579	   the scalability of DetNet in such a network, are for further
580	   study.

582	4.3.5.  MPLS Traffic Engineering (MPLS-TE)

584	   Normal MPLS runs on the base topology and has the concepts of
585	   reserving end to end bandwidth for an LSP, and of creating VPNs.  VPN
586	   traffic can be run over RSVP-TE tunnels to provide reserved bandwidth
587	   for a specific VPN connection.  This is rarely deployed in practice
588	   due to scaling and management overhead concerns.

590	4.3.6.  Segment Routing

592	   Segment Routing [I-D.ietf-spring-segment-routing] is a method that
593	   prepends instructions to packets at entry and sometimes at various
594	   points as they pass through the network.  These instructions allow
595	   packets to be routed on paths other than the shortest path for
596	   various traffic engineering reasons.  These paths can be strict or
597	   loose paths, depending on the compactness required of the instruction
598	   list and the degree of autonomy granted to the network (for example
599	   to support ECMP).

601	   With SR, a path can be dynamically created through a set of
602	   resources simply by specifying the Segment IDs (SIDs), i.e.
603	   instructions rooted at a particular point in the network.  Thus if a
604	   path is to be provisioned from some ingress point A to some egress
605	   point B in the underlay, A is provided with the A..B SID list and
606	   instructions on how to identify the packets to which the SID list is
607	   to be prepended.

609	   By encoding the state in the packet, as is done in Segment Routing,
610	   state is transitioned out of the network.

612	   A-------B-----E
613	   |       |     |
614	   |       |     |
615	   C-------D-----+

617	                     Figure 2: An SR Network Fragment

619	   Consider the network fragment shown in Figure 2.  To send a packet
620	   from A to E via B and D, node A prepends the ordered list of SIDs
621	   {D, E} to the packet and pushes the packet to B.  SID list {B, D, E}
622	   can be used as a VPN path.  Thus, to create a VPN, a set of SID Lists
623	   is created and provided to each ingress node of the VPN together with
624	   packet selection criteria.  In this way it is possible to create a
625	   VPN with no state in the core.  However this is at the expense of
626	   creating a larger packet with possible MTU and hardware restriction
627	   limits that need to be overcome.

629	   Note that in the above, if A and E support multiple VPNs, an
630	   additional VPN identifier will need to be added to the packet, but
631	   this is omitted from this text for simplicity.

633	   A---P---B---S---E
634	   |       |       |
635	   |       Q       |
636	   |       |       |
637	   C---R---D-------+

639	                   Figure 3: Another SR Network Fragment

641	   Consider a further network fragment shown in Figure 3, and further
642	   consider VPN A+D+E.

644	   A has lists: {P, B, Q, D}, {P, B, S, E}
645	   D has lists: {Q, B, P, A}, {E}
646	   E has lists: {S, B, P, A}, {D}

648	   To create a new VPN C+D+B the following lists are introduced:

650	   C lists: {R, D}, {A, P, B}
651	   D lists: {R, C}, {Q, B}
652	   B lists: {Q, D}, {P, A, C}

654	   Thus VPN C+D+B was created without touching the settings of the core
655	   routers.  Indeed, it is possible to add endpoints to the VPNs, and
656	   move the paths around, simply by providing new lists to the affected
657	   endpoints.
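
	   As an illustration, the SID lists above can be checked against the
	   topology of Figure 3.  The sketch below assumes that every SID is a
	   node SID used as a strict hop, and that the links are those read
	   from the figure (including D-E via the lower right corner).  The
	   names LINKS, is_strict_path and check_vpn are hypothetical.

```python
# Links read from Figure 3 (assumption: the '+' corner joins D and E).
LINKS = [
    ("A", "P"), ("P", "B"), ("B", "S"), ("S", "E"),
    ("A", "C"), ("B", "Q"), ("Q", "D"),
    ("C", "R"), ("R", "D"), ("D", "E"),
]


def neighbours(links):
    """Build an undirected adjacency map from a list of links."""
    adj = {}
    for a, b in links:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def is_strict_path(ingress, sid_list, adj):
    """True if each SID names a node adjacent to the previous node."""
    here = ingress
    for sid in sid_list:
        if sid not in adj.get(here, set()):
            return False
        here = sid
    return True


def check_vpn(vpn, adj):
    """Check every SID list held by every ingress node of a VPN.

    A list is valid if it is a strict path that ends at another
    member of the same VPN.
    """
    for ingress, sid_lists in vpn.items():
        for sid_list in sid_lists:
            if not is_strict_path(ingress, sid_list, adj):
                return False
            if sid_list[-1] not in vpn or sid_list[-1] == ingress:
                return False
    return True


# SID lists as given in the text; only ingress nodes hold them.
VPN_ADE = {
    "A": [["P", "B", "Q", "D"], ["P", "B", "S", "E"]],
    "D": [["Q", "B", "P", "A"], ["E"]],
    "E": [["S", "B", "P", "A"], ["D"]],
}
VPN_CDB = {
    "C": [["R", "D"], ["A", "P", "B"]],
    "D": [["R", "C"], ["Q", "B"]],
    "B": [["Q", "D"], ["P", "A", "C"]],
}

ADJ = neighbours(LINKS)
```

	   Adding VPN C+D+B only adds entries keyed by C, D and B.  Nothing
	   is configured on the transit nodes P, Q, R and S, although B and D
	   are themselves transit points for VPN A+D+E.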

659	   There are a number of limitations in SR as it is currently defined
660	   that limit its applicability to enhanced VPNs:

662	   o  Segments are shared between different VPNs,

664	   o  There is no reservation of bandwidth,

666	   o  There is limited differentiation in the data plane.

668	   Thus some extensions to SR are needed to provide isolation between
669	   different enhanced VPNs.  This can be achieved by including a finer
670	   granularity of state in the core in anticipation of its future use by
671	   authorized services.  We therefore need to evaluate the balance
672	   between this additional state and the performance delivered by the
673	   network.

675	   Both MPLS Segment Routing and SRv6 Segment Routing are candidate
676	   technologies for enhanced VPN.

678	   With current segment routing, the instructions are used to specify
679	   the nodes and links to be traversed.  However, in order to achieve
680	   the required isolation between different services, new instructions
681	   can be created which can be prepended to a packet to steer it through
682	   specific dedicated network resources and functions, e.g. links,
683	   queues, processors, services etc.

685	   Clearly we can use traditional constructs to create a VPN, but there
686	   are advantages to the use of other constructs such as Segment Routing
687	   (SR) in the creation of virtual networks with enhanced properties.

689	   Traditionally a traffic engineered path operates with a granularity
690	   of a link with hints about priority provided through the use of the
691	   traffic class field in the header.  However to achieve the latency
692	   and isolation characteristics that are sought by VPN+ users, steering
693	   packets through specific queue resources will likely be required.
694	   The extent to which these needs can be satisfied through existing QoS
695	   mechanisms is to be determined.  What is clear is that fine control
696	   of which services wait for which, with a fine granularity of queue
697	   management policy, is needed.  Note that the concept of a queue is a
698	   useful abstraction for many types of underlay mechanism that may be
699	   used to provide enhanced latency support.  From the perspective of
700	   the control plane and from the perspective of segment routing, the
701	   method of steering a packet to a queue that provides the required
702	   properties is a universal construct.  How the queue satisfies the
703	   requirement is outside the scope of these aspects of the enhanced VPN
704	   system.  Thus, for example, a FlexE channel and a time sensitive
705	   networking packet scheduling slot are abstracted to the same concept
706	   and bound to the data plane in a common manner.

708	   We can introduce the specification of finer, deterministic,
709	   granularity to path selection through extensions to traditional path
710	   construction techniques such as RSVP-TE and MPLS-TP.

712	   We can also introduce it by specifying the queue through an SR
713	   instruction list.  Thus new SR instructions may be created to specify
714	   not only which resources are traversed, but in some cases how they
715	   are traversed.  For example, it may be possible to specify not only
716	   the queue to be used but the policy to be applied when enqueuing and
717	   dequeuing.

719	   This concept can be further generalized, since as well as queuing to
720	   the output port of a router, it is possible to queue to any resource,
721	   for example:

723	   o  A network processor unit (NPU)

725	   o  A Central Processing Unit (CPU) Core

727	   o  A look-up engine such as a TCAM

729	4.4.  Control Plane Considerations

731	   It is expected that VPN+ would be based on a hybrid control
732	   mechanism, which takes advantage of the logically centralized
733	   controller for on-demand provisioning and global optimization, whilst
734	   still relying on a distributed control plane to provide scalability,
735	   high reliability, fast reaction, automatic failure recovery etc.
736	   Extension and optimization of the distributed control plane are
737	   needed to support the enhanced properties of VPN+.

739	   Where SR is used as the data-plane construct it needs to be noted
740	   that it does not have the capability of reserving resources along the
741	   path, nor does its currently specified distributed control plane (the
742	   link state routing protocols).  An SDN controller can clearly do
743	   this from the controller's point of view, and no resource reservation
744	   is done on the device.  Thus if a distributed control plane is needed
745	   either in place of an SDN controller or as an assistant to it, the
746	   design of the control system needs to ensure that resources are
747	   uniquely allocated to the correct service, and not allocated to
748	   multiple services causing unintended resource conflict.  This needs
749	   further study.
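
	   The uniqueness requirement can be illustrated with a simple
	   allocation ledger.  This is a sketch of the invariant only, under
	   the assumption of a single logical ledger.  How such state would be
	   kept consistent across a distributed control plane is the open
	   question noted above.  The ResourceLedger class and the resource
	   and service names used with it are hypothetical.

```python
class ResourceLedger:
    """Track which service owns each underlay resource.

    The invariant enforced is that a resource is allocated to at
    most one service at a time.
    """

    def __init__(self):
        self._owner = {}

    def allocate(self, resource, service):
        """Allocate a resource, refusing a conflicting allocation."""
        current = self._owner.get(resource)
        if current is not None and current != service:
            raise ValueError(
                f"{resource} is already allocated to {current}"
            )
        self._owner[resource] = service

    def release(self, resource, service):
        """Release a resource, but only on behalf of its owner."""
        if self._owner.get(resource) == service:
            del self._owner[resource]

    def owner(self, resource):
        """Return the owning service, or None if unallocated."""
        return self._owner.get(resource)
```

	   A second service asking for a queue that is already owned is
	   rejected rather than silently sharing it, which is the behaviour
	   needed to avoid unintended resource conflict.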

751	   On the other hand an advantage of using an SR approach is that it
752	   provides a way of efficiently binding the network underlay and the
753	   enhanced VPN overlay.  With a technology such as RSVP-TE LSPs, each
754	   virtual path in the VPN is bound to the underlay with a dedicated TE-
755	   LSP.

757	   RSVP-TE could be enhanced to bind the VPN to specific resources
758	   within the underlay, but as noted elsewhere in this document there
759	   are concerns as to the scalability of this approach.  With an SR-
760	   based approach to resource reservation (per-slice reservation), it is
761	   straightforward to create dedicated SR network slices, and the VPN
762	   can be bound to a particular SR network slice.

764	4.5.  Application Specific Network Types

766	   Although a lot of the traffic that will be carried over the enhanced
767	   VPN will likely be IPv4 or IPv6, the design has to be capable of
768	   carrying other traffic types.  In particular the design SHOULD be
769	   capable of carrying Ethernet traffic.  This is easily accomplished
770	   through the various pseudowire (PW) techniques [RFC3985].  Where the
771	   underlay is MPLS, Ethernet can be carried over the enhanced VPN
772	   encapsulated according to the method specified in [RFC4448].  Where
773	   the underlay is IP, Layer Two Tunneling Protocol - Version 3 (L2TPv3)
774	   [RFC3931] can be used with Ethernet traffic carried according to
775	   [RFC4719].  Encapsulations have been defined for most of the common
776	   layer two types for both PW over MPLS and for L2TPv3.

778	4.6.  Integration with Service Functions

780	   There is a significant overlap between the problem of routing a
781	   packet through a set of network resources and the problem of routing a
782	   packet through a set of compute resources.  Service Function Chain
783	   technology is designed to forward a packet through a set of compute
784	   resources.

786	   A future version of this document will discuss this further.

788	5.  Scalability Considerations

790	   For a packet to transit a network, other than on a best effort,
791	   shortest path basis, it is necessary to introduce additional state,
792	   either in the packet, or in the network, or some combination of both.

794	   There are at least three ways of doing this:

796	   o  Introduce the complete state into the packet.  That is how SR does
797	      this, and this allows the controller to specify the precise series
798	      of forwarding and processing instructions that will happen to the
799	      packet as it transits the network.  The cost of this is an
800	      increase in the packet header size.  The cost is also that systems
801	      will have capabilities enabled in case they are called upon by a
802	      service.  This is a type of latent state, and increases as we more
803	      precisely specify the path and resources that need to be
804	      exclusively available to a VPN.

806	   o  Introduce the state to the network.  This is normally done by
807	      creating a path using RSVP-TE, which can be extended to introduce
808	      any element that needs to be specified along the path, for example
809	      explicitly specifying queuing policy.  It is of course possible to
810	      use other methods to introduce path state, such as via a Software
811	      Defined Network (SDN) controller, or possibly by modifying a
812	      routing protocol.  With this approach there is state per path per
813	      path characteristic that needs to be maintained over its life-
814	      cycle.  This is more state than is needed using SR, but the packets
815	      are shorter.

817	   o  Provide a hybrid approach based on using binding SIDs to create
818	      path fragments, and bind them together with SR.

820	   Dynamic creation of a VPN path using SR requires less state
821	   maintenance in the network core at the expense of larger VPN headers
822	   on the packet.  The scaling properties will reduce roughly from a
823	   function of (N/2)^2 to a function of N, where N is the VPN path
824	   length in intervention points (hops plus network functions).
825	   Reducing the state in the network is important to VPN+, as VPN+
826	   requires the overlay to be more closely integrated with the underlay
827	   than with traditional VPNs.  This tighter coupling would normally
828	   mean that significant state needed to be created and maintained in
829	   the core.  However, a segment routed approach allows much of this
830	   state to be spread amongst the network ingress nodes, and transiently
831	   carried in the packets as SIDs.

833	   These approaches are for further study.

835	5.1.  Maximum Stack Depth

837	   One of the challenges with SR is the stack depth that nodes are able
838	   to impose on packets.  This leads to a difficult balance between
839	   adding state to the network and minimizing stack depth, or minimizing
840	   state and increasing the stack depth.
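
	   The trade-off can be made concrete by computing the per-packet
	   overhead of a SID list and comparing its length with the depth a
	   node can impose.  The sketch below assumes 4 octets per MPLS label
	   stack entry and, for SRv6, an 8 octet fixed routing header plus 16
	   octets per segment with no optional TLVs and excluding the outer
	   IPv6 header.  The function names and the max_depth parameter are
	   hypothetical.

```python
MPLS_LABEL_BYTES = 4   # one MPLS label stack entry
SRH_FIXED_BYTES = 8    # fixed part of the IPv6 Segment Routing Header
SRV6_SID_BYTES = 16    # one IPv6 address per segment


def sr_mpls_overhead(num_sids):
    """Octets added by an SR-MPLS label stack of num_sids labels."""
    return num_sids * MPLS_LABEL_BYTES


def srv6_overhead(num_sids):
    """Octets added by a routing header carrying num_sids segments.

    Assumption: no optional TLVs; the outer IPv6 header is not counted.
    """
    return SRH_FIXED_BYTES + num_sids * SRV6_SID_BYTES


def fits(num_sids, max_depth):
    """True if the ingress node can impose a SID list of this length."""
    return num_sids <= max_depth


def sids_to_replace(num_sids, max_depth):
    """Number of SIDs that do not fit and must be covered by network state.

    This is the excess over what the ingress can impose, for example
    SIDs that would have to be represented by binding SIDs.
    """
    return max(0, num_sids - max_depth)
```

	   For example, a ten-SID path on a node that can impose only six
	   labels leaves four SIDs that have to be covered by state in the
	   network, which is the balance described above.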

842	5.2.  RSVP scalability

844	   The traditional method of creating a resource allocated path through
845	   an MPLS network is to use the RSVP protocol.  However there have been
846	   concerns that this requires significant continuous state maintenance
847	   in the network.  There is ongoing work to improve the scalability
848	   of RSVP-TE LSPs in the control plane
849	   [I-D.ietf-teas-rsvp-te-scaling-rec].  This will be considered further
850	   in a future version of this document.

852	   There is also concern about the scalability of the forwarder footprint
853	   of RSVP as the number of paths through an LSR grows.

855	   [I-D.sitaraman-mpls-rsvp-shared-labels] proposes to address this by
856	   employing SR within a tunnel established by RSVP-TE.  This work will
857	   be considered in a future version of this document.

859	6.  OAM and Instrumentation

861	   A study of OAM in SR networks has been documented in
862	   [I-D.ietf-spring-oam-usecase].

864	   The enhanced VPN OAM design needs to consider the following
865	   requirements:

867	   o  Instrumentation of the underlay so that the network operator can
868	      be sure that the resources committed to a tenant are operating
869	      correctly and delivering the required performance.

871	   o  Instrumentation of the overlay by the tenant.  This is likely to
872	      be transparent to the network operator and to use existing
873	      methods.  Particular consideration needs to be given to the need
874	      to verify the isolation and the various committed performance
875	      characteristics.

877	   o  Instrumentation of the overlay by the network provider to
878	      proactively demonstrate that the committed performance is being
879	      delivered.  This needs to be done in a non-intrusive manner,
880	      particularly when the tenant is deploying a performance sensitive
881	      application.

883	   o  Verification of the conformity of the path to the service
884	      requirement.  This may need to be done as part of a commissioning
885	      test.

887	   These issues will be discussed in a future version of this document.

889	7.  Enhanced Resiliency

891	   Each enhanced VPN, of necessity, has a life-cycle, and needs
892	   modification during deployment as the needs of its user change.
893	   Additionally as the network as a whole evolves there will need to be
894	   garbage collection performed to consolidate resources into usable
895	   quanta.

897	   Systems in which the path is imposed, such as SR or some form of
898	   explicit routing, tend to do well in these applications because it is
899	   possible to perform an atomic transition from one path to another.
900	   However implementations and the monitoring protocols need to make
901	   sure that the new path is up before traffic is transitioned to it.

903	   There are however two manifestations of the latency problem that are
904	   for further study in any of these approaches:

906	   o  The problem of packets overtaking one another if a path latency
907	      reduces during a transition.

909	   o  The problem of the latency transient in either direction as a path
910	      migrates.

912	   There is also the matter of what happens during failure in the
913	   underlay infrastructure.  Fast reroute is one approach, but that
914	   still produces a transient loss with a normal goal of rectifying this
915	   within 50ms.  An alternative is some form of N+1 delivery such as has
916	   been used for many years to support protection from service
917	   disruption.  This may be taken to a different level using the
918	   techniques proposed by the IETF deterministic networking work with
919	   multiple in-network replication and the culling of later packets.

921	   In addition to the approach used to protect high priority packets,
922	   consideration has to be given to the impact of best effort traffic on
923	   the high priority packets during a transient.  Specifically if a
924	   conventional re-convergence process is used there will inevitably be
925	   micro-loops and whilst some form of explicit routing will protect the
926	   high priority traffic, lower priority traffic on best effort shortest
927	   paths will micro-loop without the use of a loop prevention
928	   technology.  To provide the highest quality of service to high
929	   priority traffic, either this traffic must be shielded from the
930	   micro-loops, or micro-loops must be prevented.

932	8.  Security Considerations

934	   All types of virtual network require special consideration to be
935	   given to the isolation between the tenants.  However, in an enhanced
936	   virtual network service, hard isolation needs to be considered.  If a
937	   service requires a specific latency then it can be damaged by simply
938	   delaying the packet through the activities of another tenant.  In a
939	   network with virtual functions, depriving a function used by another
940	   tenant of compute resources can be just as damaging as delaying
941	   transmission of a packet in the network.

943	9.  IANA Considerations

945	   There are no requested IANA actions.

947	10.  References

949	10.1.  Normative References

951	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
952	              Requirement Levels", BCP 14, RFC 2119,
953	              DOI 10.17487/RFC2119, March 1997,
954	              <https://www.rfc-editor.org/info/rfc2119>.

956	10.2.  Informative References

958	   [FLEXE]    "Flex Ethernet Implementation Agreement", March 2016,
959	              <http://www.oiforum.com/wp-content/uploads/
960	              OIF-FLEXE-01.0.pdf>.

962	   [I-D.ietf-detnet-architecture]
963	              Finn, N., Thubert, P., Varga, B., and J. Farkas,
964	              "Deterministic Networking Architecture", draft-ietf-
965	              detnet-architecture-04 (work in progress), October 2017.

967	   [I-D.ietf-detnet-dp-sol]
968	              Korhonen, J., Andersson, L., Jiang, Y., Finn, N., Varga,
969	              B., Farkas, J., Bernardos, C., Mizrahi, T., and L. Berger,
970	              "DetNet Data Plane Encapsulation", draft-ietf-detnet-dp-
971	              sol-01 (work in progress), January 2018.

973	   [I-D.ietf-spring-oam-usecase]
974	              Geib, R., Filsfils, C., Pignataro, C., and N. Kumar, "A
975	              Scalable and Topology-Aware MPLS Dataplane Monitoring
976	              System", draft-ietf-spring-oam-usecase-10 (work in
977	              progress), December 2017.

979	   [I-D.ietf-spring-segment-routing]
980	              Filsfils, C., Previdi, S., Ginsberg, L., Decraene, B.,
981	              Litkowski, S., and R. Shakir, "Segment Routing
982	              Architecture", draft-ietf-spring-segment-routing-15 (work
983	              in progress), January 2018.

985	   [I-D.ietf-teas-rsvp-te-scaling-rec]
986	              Beeram, V., Minei, I., Shakir, R., Pacella, D., and T.
987	              Saad, "Techniques to Improve the Scalability of RSVP
988	              Traffic Engineering Deployments", draft-ietf-teas-rsvp-te-
989	              scaling-rec-09 (work in progress), February 2018.

991	   [I-D.sitaraman-mpls-rsvp-shared-labels]
992	              Sitaraman, H., Beeram, V., Parikh, T., and T. Saad,
993	              "Signaling RSVP-TE tunnels on a shared MPLS forwarding
994	              plane", draft-sitaraman-mpls-rsvp-shared-labels-03 (work
995	              in progress), December 2017.

997	   [NETCALC]  "Applicability of Network Calculus to DetNet", November
998	              2017, <https://datatracker.ietf.org/meeting/100/materials/
999	              slides-100-detnet-applicability-of-network-calculus-to-
1000	              detnet>.

1002	   [RFC3931]  Lau, J., Ed., Townsley, M., Ed., and I. Goyret, Ed.,
1003	              "Layer Two Tunneling Protocol - Version 3 (L2TPv3)",
1004	              RFC 3931, DOI 10.17487/RFC3931, March 2005,
1005	              <https://www.rfc-editor.org/info/rfc3931>.

1007	   [RFC3985]  Bryant, S., Ed. and P. Pate, Ed., "Pseudo Wire Emulation
1008	              Edge-to-Edge (PWE3) Architecture", RFC 3985,
1009	              DOI 10.17487/RFC3985, March 2005,
1010	              <https://www.rfc-editor.org/info/rfc3985>.

1012	   [RFC4448]  Martini, L., Ed., Rosen, E., El-Aawar, N., and G. Heron,
1013	              "Encapsulation Methods for Transport of Ethernet over MPLS
1014	              Networks", RFC 4448, DOI 10.17487/RFC4448, April 2006,
1015	              <https://www.rfc-editor.org/info/rfc4448>.

1017	   [RFC4719]  Aggarwal, R., Ed., Townsley, M., Ed., and M. Dos Santos,
1018	              Ed., "Transport of Ethernet Frames over Layer 2 Tunneling
1019	              Protocol Version 3 (L2TPv3)", RFC 4719,
1020	              DOI 10.17487/RFC4719, November 2006,
1021	              <https://www.rfc-editor.org/info/rfc4719>.

1023	Authors' Addresses

1025	   Stewart Bryant
1026	   Huawei

1028	   Email: stewart.bryant@gmail.com

1030	   Jie Dong
1031	   Huawei

1033	   Email: jie.dong@huawei.com
1034	   Zhenqiang Li
1035	   China Mobile

1037	   Email: lizhenqiang@chinamobile.com

1039	   Takuya Miyasaka
1040	   KDDI Corporation

1042	   Email: ta-miyasaka@kddi.com