2 Network Working Group J. Wu 3 Internet Draft Y. Cui 4 Expiration Date: January 2008 X. Li 5 Tsinghua University 7 C. Metz 8 E. Rosen (Editor) 9 S. Barber 10 P. Mohapatra 11 Cisco Systems, Inc. 13 J. Scudder 14 Juniper Networks, Inc. 16 July 2007 18 Softwire Mesh Framework 20 draft-ietf-softwire-mesh-framework-02.txt 22 Status of this Memo 24 By submitting this Internet-Draft, each author represents that any 25 applicable patent or other IPR claims of which he or she is aware 26 have been or will be disclosed, and any of which he or she becomes 27 aware will be disclosed, in accordance with Section 6 of BCP 79. 29 Internet-Drafts are working documents of the Internet Engineering 30 Task Force (IETF), its areas, and its working groups. Note that 31 other groups may also distribute working documents as Internet- 32 Drafts. 
34 Internet-Drafts are draft documents valid for a maximum of six months 35 and may be updated, replaced, or obsoleted by other documents at any 36 time. It is inappropriate to use Internet-Drafts as reference 37 material or to cite them other than as "work in progress." 39 The list of current Internet-Drafts can be accessed at 40 http://www.ietf.org/ietf/1id-abstracts.txt. 42 The list of Internet-Draft Shadow Directories can be accessed at 43 http://www.ietf.org/shadow.html. 45 Abstract 47 The Internet needs to be able to handle both IPv4 and IPv6 packets. 48 However, it is expected that some constituent networks of the 49 Internet will be "single protocol" networks. One kind of single 50 protocol network can parse only IPv4 packets and can process only 51 IPv4 routing information; another kind can parse only IPv6 packets 52 and can process only IPv6 routing information. It is nevertheless 53 required that either kind of single protocol network be able to 54 provide transit service for the "other" protocol. This is done by 55 passing the "other kind" of routing information from one edge of the 56 single protocol network to the other, and by tunneling the "other 57 kind" of data packet from one edge to the other. The tunnels are 58 known as "Softwires". This framework document explains how the 59 routing information and the data packets of one protocol are passed 60 through a single protocol network of the other protocol. The 61 document is careful to specify when this can be done with existing 62 technology, and when it requires the development of new or modified 63 technology. 65 Table of Contents 67 1 Specification of requirements ......................... 4 68 2 Introduction .......................................... 4 69 3 Scenarios of Interest ................................. 7 70 3.1 IPv6-over-IPv4 Scenario ............................... 7 71 3.2 IPv4-over-IPv6 Scenario ............................... 
9 72 4 General Principles of the Solution .................... 11 73 4.1 'E-IP' and 'I-IP' ..................................... 11 74 4.2 Routing ............................................... 11 75 4.3 Tunneled Forwarding ................................... 12 76 5 Distribution of Inter-AFBR Routing Information ........ 12 77 6 Softwire Signaling .................................... 14 78 7 Choosing to Forward Through a Softwire ................ 16 79 8 Selecting a Tunneling Technology ...................... 16 80 9 Selecting the Softwire for a Given Packet ............. 17 81 10 Softwire OAM and MIBs ................................. 18 82 10.1 Operations and Maintenance (OAM) ...................... 18 83 10.2 MIBs .................................................. 19 84 11 Softwire Multicast .................................... 19 85 11.1 One-to-One Mappings ................................... 20 86 11.1.1 Using PIM in the Core ................................. 20 87 11.1.2 Using mLDP and Multicast MPLS in the Core ............. 21 88 11.2 MVPN-like Schemes ..................................... 22 89 12 Inter-AS Considerations ............................... 23 90 13 IANA Considerations ................................... 24 91 14 Security Considerations ............................... 24 92 14.1 Problem Analysis ...................................... 24 93 14.2 Non-cryptographic techniques .......................... 25 94 14.3 Cryptographic techniques .............................. 27 95 15 Acknowledgments ....................................... 28 96 16 Normative References .................................. 28 97 17 Informative References ................................ 29 98 18 Full Copyright Statement .............................. 32 99 19 Intellectual Property ................................. 32 100 1. 
Specification of requirements 102 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 103 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 104 document are to be interpreted as described in [RFC2119]. 106 2. Introduction 108 The routing information in any IP backbone network can be thought of 109 as being in one of two categories: "internal routing information" or 110 "external routing information". The internal routing information 111 consists of routes to the nodes that belong to the backbone, and to 112 the interfaces of those nodes. External routing information consists 113 of routes to destinations beyond the backbone, especially 114 destinations to which the backbone is not directly attached. In 115 general, BGP [RFC4271] is used to distribute external routing 116 information, and an "Interior Gateway Protocol" (IGP) such as OSPF 117 [RFC2328] or IS-IS [RFC1195] is used to distribute internal routing 118 information. 120 Often an IP backbone will provide transit routing services for 121 packets that originate outside the backbone, and whose destinations 122 are outside the backbone. These packets enter the backbone at one of 123 its "edge routers". They are routed through the backbone to another 124 edge router, after which they leave the backbone and continue on 125 their way. The edge nodes of the backbone are often known as 126 "Provider Edge" (PE) routers. The term "ingress" (or "ingress PE") 127 refers to the router at which a packet enters the backbone, and the 128 term "egress" (or "egress PE") refers to the router at which it 129 leaves the backbone. Interior nodes are often known as "P routers". 130 Routers which are outside the backbone but directly attached to it 131 are known as "Customer Edge" (CE) routers. (This terminology is 132 taken from [RFC4364].) 
134 When a packet's destination is outside the backbone, the routing 135 information which is needed within the backbone in order to route the 136 packet to the proper egress is, by definition, external routing 137 information. 139 Traditionally, the external routing information has been distributed 140 by BGP to all the routers in the backbone, not just to the edge 141 routers (i.e., not just to the ingress and egress points). Each of 142 the interior nodes has been expected to look up the packet's 143 destination address and route it towards the egress point. This is 144 known as "native forwarding": the interior nodes look into each 145 packet's header in order to match the information in the header with 146 the external routing information. 148 It is, however, possible to provide transit services without 149 requiring that all the backbone routers have the external routing 150 information. The routing information which BGP distributes to each 151 ingress router specifies the egress router for each route. The 152 ingress router can therefore "tunnel" the packet directly to the 153 egress router. "Tunneling the packet" means putting on some sort of 154 encapsulation header which will force the interior routers to forward 155 the packet to the egress router. The original packet is known as the 156 "encapsulation payload". The P routers do not look at the packet 157 header of the payload, but only at the encapsulation header. Since 158 the path to the egress router is part of the internal routing 159 information of the backbone, the interior routers then do not need to 160 know the external routing information. This is known as "tunneled 161 forwarding". Of course, before the packet can leave the egress, it 162 has to be decapsulated. 164 The scenario where the P routers do not have external routes is 165 sometimes known as a "BGP-free core". 
That is something of a 166 misnomer, though, since the crucial aspect of this scenario is not 167 that the interior nodes don't run BGP, but that they don't maintain 168 the external routing information. 170 In recent years, we have seen this scenario deployed to support VPN 171 services, as specified in [RFC4364]. An edge router maintains 172 multiple independent routing/addressing spaces, one for each VPN to 173 which it interfaces. However, the routing information for the VPNs 174 is not maintained by the interior routers. In most of these 175 scenarios, MPLS is used as the encapsulation mechanism for getting 176 the packets from ingress to egress. There are some deployments in 177 which an IP-based encapsulation, such as L2TPv3 (Layer 2 Transport 178 Protocol) [RFC3931] or GRE (Generic Routing Encapsulation) [RFC2784] 179 is used. 181 This same technique can also be useful when the external routing 182 information consists not of VPN routes, but of "ordinary" Internet 183 routes. It can be used any time it is desired to keep external 184 routing information out of a backbone's interior nodes, or in fact 185 any time it is desired for any reason to avoid the native forwarding 186 of certain kinds of packets. 188 This framework focuses on two such scenarios. 190 1. In this scenario, the backbone's interior nodes support only 191 IPv6. They do not maintain IPv4 routes at all, and are not 192 expected to parse IPv4 packet headers. Yet it is desired to 193 use such a backbone to provide transit services for IPv4 194 packets. Therefore tunneled forwarding of IPv4 packets is 195 required. Of course, the edge nodes must have the IPv4 routes, 196 but the ingress must perform an encapsulation in order to get 197 an IPv4 packet forwarded to the egress. 199 2. This scenario is the reverse of scenario 1, i.e., the 200 backbone's interior nodes support only IPv4, but it is desired 201 to use the backbone for IPv6 transit. 
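The two encapsulations implied by these scenarios correspond to the IANA-assigned IP protocol numbers 41 (IPv6 carried inside IPv4) and 4 (IPv4 carried inside IPv6). The sketch below is purely illustrative and is not part of any specification: it builds only the outer header that an ingress edge router would prepend in each scenario, leaving the IPv4 header checksum at zero and using placeholder values for the policy-determined fields (TTL, DSCP).

```python
import socket
import struct

# IANA-assigned IP protocol numbers for plain IP-in-IP encapsulation:
#   41 = IPv6 payload inside an IPv4 outer header (scenario 1)
#    4 = IPv4 payload inside an IPv6 outer header (scenario 2)

def encap_6over4(payload_v6: bytes, src_v4: str, dst_v4: str, ttl: int = 64) -> bytes:
    """Prepend a minimal outer IPv4 header (no options) to an IPv6 packet.
    The header checksum is left at zero in this sketch."""
    outer = struct.pack(
        "!BBHHHBBH4s4s",
        (4 << 4) | 5,              # version 4, IHL 5 (20-byte header)
        0,                         # DSCP/ECN: set by ingress policy
        20 + len(payload_v6),      # total length
        0, 0,                      # identification, flags/fragment offset
        ttl,                       # TTL: set by ingress policy
        41,                        # protocol 41: encapsulated IPv6
        0,                         # header checksum (omitted here)
        socket.inet_pton(socket.AF_INET, src_v4),
        socket.inet_pton(socket.AF_INET, dst_v4),
    )
    return outer + payload_v6

def encap_4over6(payload_v4: bytes, src_v6: str, dst_v6: str, hlim: int = 64) -> bytes:
    """Prepend a minimal outer IPv6 fixed header to an IPv4 packet."""
    outer = struct.pack(
        "!IHBB16s16s",
        6 << 28,                   # version 6; traffic class and flow label 0
        len(payload_v4),           # payload length
        4,                         # next header 4: encapsulated IPv4
        hlim,                      # hop limit: set by ingress policy
        socket.inet_pton(socket.AF_INET6, src_v6),
        socket.inet_pton(socket.AF_INET6, dst_v6),
    )
    return outer + payload_v4
```

At the far edge of the backbone, decapsulation is simply the removal of this outer header, after which the payload is forwarded by an ordinary lookup on its own destination address.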
203 In these scenarios, a backbone whose interior nodes support only one 204 of the two address families is required to provide transit services 205 for the other. The backbone's edge routers must, of course, support 206 both address families. We use the term "Address Family Border 207 Router" (AFBR) to refer to these PE routers. The tunnels that are 208 used for forwarding are referred to as "softwires". 210 These two scenarios are known as the "Softwire Mesh Problem" [SW- 211 PROB], and the framework specified in this draft is therefore known 212 as the "Softwire Mesh Framework". In this framework, only the AFBRs 213 need to support both address families. The CE routers support only a 214 single address family, and the P routers support only the other 215 address family. 217 It is possible to address these scenarios via a large variety of 218 tunneling technologies. This framework does not mandate the use of 219 any particular tunneling technology. In any given deployment, the 220 choice of tunneling technology is a matter of policy. The framework 221 accommodates at least the use of MPLS ([RFC3031], [RFC3032]), both 222 LDP-based (Label Distribution Protocol, [RFC3036]) and RSVP-TE-based 223 ([RFC3209]), L2TPv3 [RFC3931], GRE [RFC2784], and IP-in-IP [RFC2003]. 224 The framework will also accommodate the use of IPsec tunneling, when 225 that is necessary in order to meet security requirements. 227 It is expected that in many deployments, the choice of tunneling 228 technology will be made by a simple expression of policy, such as 229 "always use IP-IP tunnels", or "always use LDP-based MPLS", or 230 "always use L2TPv3". 232 However, other deployments may have a mixture of routers, some of 233 which support, say, both GRE and L2TPv3, but others of which support 234 only one of those techniques. 
It is desirable therefore to allow the 235 network administration to create a small set of classes, and to 236 configure each AFBR to be a member of one or more of these classes. 237 Then the routers can advertise their class memberships to each other, 238 and the encapsulation policies can be expressed as, e.g., "use L2TPv3 239 to tunnel to routers in class X, use GRE to tunnel to routers in 240 class Y". To support such policies, it is necessary for the AFBRs to 241 be able to advertise their class memberships; a standard way of doing 242 this must be developed. 244 Policy may also require a certain class of traffic to receive a 245 certain quality of service, and this may impact the choice of tunnel 246 and/or tunneling technology used for packets in that class. This 247 needs to be accommodated by the softwires framework. 249 The use of tunneled forwarding often requires that some sort of 250 signaling protocol be used to set up and/or maintain the tunnels. 251 Many of the tunneling technologies accommodated by this framework 252 already have their own signaling protocols. However, some do not, 253 and in some cases the standard signaling protocol for a particular 254 tunneling technology may not be appropriate, for one or another 255 reason, in the scenarios of interest. In such cases (and in such 256 cases only), new signaling methodologies need to be defined and 257 standardized. 259 In this framework, the softwires do not form an overlay topology 260 which is visible to routing; routing adjacencies are not maintained 261 over the softwires, and routing control packets are not sent through 262 the softwires. Routing adjacencies among backbone nodes (including 263 the edge nodes) are maintained via the native technology of the 264 backbone. 266 There is already a standard routing method for distributing external 267 routing information among AFBRs, namely BGP. 
However, in the 268 scenarios of interest, we may be using IPv6-based BGP sessions to 269 pass IPv4 routing information, and we may be using IPv4-based BGP 270 sessions to pass IPv6 routing information. Furthermore, when IPv4 271 traffic is to be tunneled over an IPv6 backbone, it is necessary to 272 encode the "BGP next hop" for an IPv4 route as an IPv6 address, and 273 vice versa. The method for encoding an IPv4 address as the next hop 274 for an IPv6 route is specified in [V6NLRI-V4NH]; the method for 275 encoding an IPv6 address as the next hop for an IPv4 route is 276 specified in [V4NLRI-V6NH]. 278 3. Scenarios of Interest 280 3.1. IPv6-over-IPv4 Scenario 282 In this scenario, the client networks run IPv6 but the backbone 283 network runs IPv4. This is illustrated in Figure 1. 285 +--------+ +--------+ 286 | IPv6 | | IPv6 | 287 | Client | | Client | 288 | Network| | Network| 289 +--------+ +--------+ 290 | \ / | 291 | \ / | 292 | \ / | 293 | X | 294 | / \ | 295 | / \ | 296 | / \ | 297 +--------+ +--------+ 298 | AFBR | | AFBR | 299 +--| IPv4/6 |---| IPv4/6 |--+ 300 | +--------+ +--------+ | 301 +--------+ | | +--------+ 302 | IPv4 | | | | IPv4 | 303 | Client | | | | Client | 304 | Network|------| IPv4 |-------| Network| 305 +--------+ | only | +--------+ 306 | | 307 | +--------+ +--------+ | 308 +--| AFBR |---| AFBR |--+ 309 | IPv4/6 | | IPv4/6 | 310 +--------+ +--------+ 311 | \ / | 312 | \ / | 313 | \ / | 314 | X | 315 | / \ | 316 | / \ | 317 | / \ | 318 +--------+ +--------+ 319 | IPv6 | | IPv6 | 320 | Client | | Client | 321 | Network| | Network| 322 +--------+ +--------+ 324 Figure 1 IPv6-over-IPv4 Scenario 326 The IPv4 transit core may or may not run MPLS. If it does, MPLS may be 327 used as part of the solution. 329 While Figure 1 does not show any "backdoor" connections among the client 330 networks, this framework assumes that there will be such connections. 
332 That is, there is no assumption that the only path between two client 333 networks is via the pictured transit core network. Hence the routing 334 solution must be robust in any kind of topology. 336 Many mechanisms for providing IPv6 connectivity across IPv4 networks 337 have been devised over the past ten years. A number of different 338 tunneling mechanisms have been used, some provisioned manually, others 339 based on special addressing. More recently, L3VPN (Layer 3 Virtual 340 Private Network) techniques from [RFC4364] have been extended to provide 341 IPv6 connectivity, using MPLS in the AFBRs and optionally in the 342 backbone [V6NLRI-V4NH]. The solution described in this framework can be 343 thought of as a superset of [V6NLRI-V4NH], with a more generalized 344 scheme for choosing the tunneling (softwire) technology. In this 345 framework, MPLS is allowed, but not required, even at the AFBRs. As in 346 [V6NLRI-V4NH], there is no manual provisioning of tunnels, and no 347 special addressing is required. 349 3.2. IPv4-over-IPv6 Scenario 351 In this scenario, the client networks run IPv4 but the backbone 352 network runs IPv6. This is illustrated in Figure 2. 
354 +--------+ +--------+ 355 | IPv4 | | IPv4 | 356 | Client | | Client | 357 | Network| | Network| 358 +--------+ +--------+ 359 | \ / | 360 | \ / | 361 | \ / | 362 | X | 363 | / \ | 364 | / \ | 365 | / \ | 366 +--------+ +--------+ 367 | AFBR | | AFBR | 368 +--| IPv4/6 |---| IPv4/6 |--+ 369 | +--------+ +--------+ | 370 +--------+ | | +--------+ 371 | IPv6 | | | | IPv6 | 372 | Client | | | | Client | 373 | Network|------| IPv6 |-------| Network| 374 +--------+ | only | +--------+ 375 | | 376 | +--------+ +--------+ | 377 +--| AFBR |---| AFBR |--+ 378 | IPv4/6 | | IPv4/6 | 379 +--------+ +--------+ 380 | \ / | 381 | \ / | 382 | \ / | 383 | X | 384 | / \ | 385 | / \ | 386 | / \ | 387 +--------+ +--------+ 388 | IPv4 | | IPv4 | 389 | Client | | Client | 390 | Network| | Network| 391 +--------+ +--------+ 393 Figure 2 IPv4-over-IPv6 Scenario 395 The IPv6 transit core may or may not run MPLS. If it does, MPLS may be 396 used as part of the solution. 398 While Figure 2 does not show any "backdoor" connections among the client 399 networks, this framework assumes that there will be such connections. 401 That is, there is no assumption that the only path between two client 402 networks is via the pictured transit core network. Hence the routing 403 solution must be robust in any kind of topology. 405 While the issue of IPv6-over-IPv4 has received considerable attention in 406 the past, the scenario of IPv4-over-IPv6 has not. Yet it is a 407 significant emerging requirement, as a number of service providers are 408 building IPv6 backbone networks and do not wish to provide native IPv4 409 support in their core routers. These service providers have a large 410 legacy of IPv4 networks and applications that need to operate across 411 their IPv6 backbone. Solutions for this do not exist yet because it had 412 always been assumed that the backbone networks of the foreseeable future 413 would be dual stack. 415 4. 
General Principles of the Solution 417 This section gives a very brief overview of the procedures. The 418 subsequent sections provide more detail. 420 4.1. 'E-IP' and 'I-IP' 422 In the following we use the term "I-IP" ("Internal IP") to refer to 423 the form of IP (i.e., either IPv4 or IPv6) that is supported by the 424 transit network. We use the term "E-IP" ("External IP") to refer to 425 the form of IP that is supported by the client networks. In the 426 scenarios of interest, E-IP is IPv4 if and only if I-IP is IPv6, and 427 E-IP is IPv6 if and only if I-IP is IPv4. 429 We assume that the P routers support only I-IP. That is, they are 430 expected to have only I-IP routing information, and they are not 431 expected to be able to parse E-IP headers. We similarly assume that 432 the CE routers support only E-IP. 434 The AFBRs handle both I-IP and E-IP. However, only I-IP is used on 435 an AFBR's "core-facing interfaces", and E-IP is only used on its client- 436 facing interfaces. 438 4.2. Routing 440 The P routers and the AFBRs of the transit network participate in an 441 IGP, for the purposes of distributing I-IP routing information. 443 The AFBRs use IBGP to exchange E-IP routing information with each 444 other. Either there is a full mesh of IBGP connections among the 445 AFBRs, or else some or all of the AFBRs are clients of a BGP Route 446 Reflector. Although these IBGP connections are used to pass E-IP 447 routing information (i.e., the NLRI of the BGP updates is in the E-IP 448 address family), the IBGP connections run over I-IP, and the "BGP 449 next hop" for each E-IP NLRI is in the I-IP address family. 451 4.3. Tunneled Forwarding 453 When an ingress AFBR receives an E-IP packet from a client-facing 454 interface, it looks up the packet's destination IP address. In the 455 scenarios of interest, the best match for that address will be a BGP- 456 distributed route whose next hop is the I-IP address of another AFBR, 457 the egress AFBR. 
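The lookup just described can be sketched as a longest-prefix match whose result is an I-IP softwire endpoint rather than an E-IP next hop. The prefixes and addresses below are hypothetical (drawn from documentation ranges), and the flat dictionary is a stand-in for a real forwarding table:

```python
import ipaddress

# Hypothetical E-IP RIB at an ingress AFBR in the IPv4-over-IPv6 scenario:
# IPv4 (E-IP) prefixes learned via IBGP, each associated with a BGP next hop
# that is an IPv6 (I-IP) address identifying an egress AFBR, i.e., a
# softwire endpoint.  All prefixes/addresses are documentation ranges.
E_IP_RIB = {
    ipaddress.ip_network("198.51.100.0/24"): ipaddress.ip_address("2001:db8::e1"),
    ipaddress.ip_network("203.0.113.0/25"):  ipaddress.ip_address("2001:db8::e2"),
}

def lookup_egress(dst: str):
    """Longest-prefix match on the E-IP destination; the result is not an
    E-IP next hop but the I-IP address of the egress AFBR to tunnel to."""
    addr = ipaddress.ip_address(dst)
    matches = [net for net in E_IP_RIB if addr in net]
    if not matches:
        return None                                # no route for this E-IP address
    best = max(matches, key=lambda net: net.prefixlen)
    return E_IP_RIB[best]
```

The I-IP address returned is then used as the destination of the encapsulation header, as described in the following paragraphs.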
459 The ingress AFBR must forward the packet through a tunnel (i.e., 460 through a "softwire") to the egress AFBR. This is done by 461 encapsulating the packet, using an encapsulation header which the P 462 routers can process, and which will cause the P routers to send the 463 packet to the egress AFBR. The egress AFBR then extracts the 464 payload, i.e., the original E-IP packet, and forwards it further by 465 looking up its IP destination address. 467 Several kinds of tunneling technologies are supported. Some of those 468 technologies require explicit AFBR-to-AFBR signaling before the 469 tunnel can be used; others do not. 471 5. Distribution of Inter-AFBR Routing Information 473 AFBRs peer with routers in the client networks to exchange routing 474 information for the E-IP family. 476 AFBRs use BGP to distribute the E-IP routing information to each 477 other. This can be done by an AFBR-AFBR mesh of IBGP sessions, but 478 more likely is done through a BGP Route Reflector, i.e., where each 479 AFBR has an IBGP session to one or two Route Reflectors, rather than 480 to other AFBRs. 482 The BGP sessions between the AFBRs, or between the AFBRs and the 483 Route Reflector, will run on top of the I-IP address family. That 484 is, if the transit core supports only IPv6, the IBGP sessions used to 485 distribute IPv4 routing information from the client networks will run 486 over IPv6; if the transit core supports only IPv4, the IBGP sessions 487 used to distribute IPv6 routing information from the client networks 488 will run over IPv4. The BGP sessions thus use the native networking 489 layer of the core; BGP messages are NOT tunneled through softwires or 490 through any other mechanism. 492 In BGP, a routing update associates an address prefix (or more 493 generally, "Network Layer Reachability Information", or NLRI) with 494 the address of a "BGP Next Hop" (NH). The NLRI is associated with a 
The NH address is also associated with a 496 particular address family, which may be the same as or different from 497 the address family associated with the NLRI. Generally the NH 498 address belongs to the address family that is used to communicate 499 with the BGP speaker to whom the NH address belongs. 501 Since routing updates which contain information about E-IP address 502 prefixes are carried over BGP sessions that use I-IP transport, and 503 since the BGP messages are not tunneled, a BGP update providing 504 information about an E-IP address prefix will need to specify a next 505 hop address in the I-IP family. 507 Due to a variety of historical circumstances, when the NLRI and the 508 NH in a given BGP update are of different address families, it is not 509 always obvious how the NH should be encoded. There is a different 510 encoding procedure for each pair of address families. 512 In the case where the NLRI is in the IPv6 address family, and the NH 513 is in the IPv4 address family, [V6NLRI-V4NH] explains how to encode 514 the NH. 516 In the case where the NLRI is in the IPv4 address family, and the NH 517 is in the IPv6 address family, [V4NLRI-V6NH] explains how to encode 518 the NH. 520 If a BGP speaker sends an update for an NLRI in the E-IP family, and 521 the update is being sent over a BGP session that is running on top of 522 the I-IP network layer, and the BGP speaker is advertising itself as 523 the NH for that NLRI, then the BGP speaker MUST, unless explicitly 524 overridden by policy, specify the NH address in the I-IP family. The 525 address family of the NH MUST NOT be changed by a Route Reflector. 527 In some cases (e.g., when [V4NLRI-V6NH] is used), one cannot follow 528 this rule unless one's BGP peers have advertised a particular BGP 529 capability. 
This leads to the following softwires deployment 530 restriction: if a BGP Capability is defined for the case in which an 531 E-IP NLRI has an I-IP NH, all the AFBRs in a given transit core MUST 532 advertise that capability. 534 If an AFBR has multiple IP addresses, the network administrators 535 usually have considerable flexibility in choosing which one the AFBR 536 uses to identify itself as the next hop in a BGP update. However, if 537 the AFBR expects to receive packets through a softwire of a 538 particular tunneling technology, and if the AFBR is known to that 539 tunneling technology via a specific IP address, then that same IP 540 address must be used to identify the AFBR in the next hop field of 541 the BGP updates. For example, if L2TPv3 tunneling is used, then the 542 IP address which the AFBR uses when engaging in L2TPv3 signaling must 543 be the same as the IP address it uses to identify itself in the next 544 hop field of a BGP update. 546 In [V6NLRI-V4NH], IPv6 routing information is distributed using the 547 labeled IPv6 address family. This allows the egress AFBR to 548 associate an MPLS label with each IPv6 address prefix. If an ingress 549 AFBR forwards packets through a softwire that can carry MPLS packets, 550 each data packet can carry the MPLS label corresponding to the IPv6 551 route that it matched. This may be useful at the egress AFBR, for 552 demultiplexing and/or enhanced performance. It is also possible to 553 do the same for the IPv4 address family, i.e., to use the labeled IPv4 554 address family instead of the IPv4 address family. The use of the 555 labeled IP address families in this manner is OPTIONAL. 557 6. Softwire Signaling 559 A mesh of inter-AFBR softwires spanning the transit core must be in 560 place before packets can flow between client networks. Given N dual- 561 stack AFBRs, this requires N^2 "point-to-point IP" or "label switched 562 path" (LSP) tunnels. 
While in theory these could be configured 563 manually, that would result in a very undesirable O(N^2) provisioning 564 problem. Therefore manual configuration of point-to-point tunnels is 565 not considered part of this framework. 567 Because the transit core is providing layer 3 transit services, 568 point-to-point tunnels are not required by this framework; 569 multipoint-to-point tunnels are all that is needed. In a multipoint- 570 to-point tunnel, when a packet emerges from the tunnel there is no 571 way to tell which router put the packet into the tunnel. This models 572 the native IP forwarding paradigm, wherein the egress router cannot 573 determine a given packet's ingress router. Of course, point-to-point 574 tunnels might be required for some reason which goes beyond the basic 575 requirements described in this document. E.g., QoS or security 576 considerations might require the use of point-to-point tunnels. So 577 point-to-point tunnels are allowed, but not required, by this 578 framework. 580 If it is desired to use a particular tunneling technology for the 581 softwires, and if that technology has its own "native" signaling 582 methodology, the presumption is that the native signaling will be 583 used. This would certainly apply to MPLS-based softwires, where LDP 584 or RSVP-TE would be used. A softwire based on IPsec would use 585 standard IKE (Internet Key Exchange) [RFC4306] and IPsec [RFC4301] 586 signaling, as that is necessary in order to guarantee the softwire's 587 security properties. 589 A Softwire based on GRE might or might not require signaling, 590 depending on whether various optional GRE header fields are to be 591 used. GRE does not have any "native" signaling, so for those cases, 592 a signaling procedure needs to be developed to support Softwires. 594 Another possible softwire technology is L2TPv3. While L2TPv3 does 595 have its own native signaling, that signaling sets up point-to-point 596 tunnels. 
For the purpose of softwires, it is better to use L2TPv3 in 597 a multipoint-to-point mode, and this requires a different kind of 598 signaling. 600 The signaling to be used for GRE and L2TPv3 to cover these scenarios 601 is BGP-based, and is described in [ENCAPS-SAFI]. 603 If IP-IP tunneling is used, or if GRE tunneling is used without 604 options, no signaling is required, as the only information needed by 605 the ingress AFBR to create the encapsulation header is the IP address 606 of the egress AFBR, and that is distributed by BGP. 608 When the encapsulation IP header is constructed, there may be fields 609 in the IP header whose value is determined neither by whatever signaling has 610 been done nor by the distributed routing information. The values of 611 these fields are determined by policy in the ingress AFBR. Examples 612 of such fields may be the TTL (Time to Live) field, the DSCP 613 (Differentiated Services Code Point) bits, etc. 615 It is desirable for all necessary softwires to be fully set up before 616 the arrival of any packets which need to go through the softwires. 617 That is, the softwires should be "always on". From the perspective 618 of any particular AFBR, the softwire endpoints are always BGP next 619 hops of routes which the AFBR has installed. This suggests that any 620 necessary softwire signaling should either be done as part of 621 normal system startup (as would happen, e.g., with LDP-based MPLS), 622 or else should be triggered by the reception of BGP routing 623 information (such as is described in [ENCAPS-SAFI]); it is also 624 helpful if distribution of the routing information that serves as the 625 trigger is prioritized. 627 7. Choosing to Forward Through a Softwire 629 The decision to forward through a softwire, instead of forwarding 630 natively, is made by the ingress AFBR. This decision is a matter of 631 policy. 633 In many cases, the policy will be very simple.
Some useful policies 634 are: 636 - if routing says that an E-IP packet has to be sent out a "core- 637 facing interface" to an I-IP core, send the packet through a 638 softwire 640 - if routing says that an E-IP packet has to be sent out an 641 interface that only supports I-IP packets, then send the E-IP 642 packets through a softwire 644 - if routing says that the BGP next hop address for an E-IP packet 645 is an I-IP address, then send the E-IP packets through a softwire 647 - if the route which is the best match for a particular packet's 648 destination address is a BGP-distributed route, then send the 649 packet through a softwire (i.e., tunnel all BGP-routed packets). 651 More complicated policies are also possible, but a consideration of 652 those policies is outside the scope of this document. 654 8. Selecting a Tunneling Technology 656 The choice of tunneling technology is a matter of policy configured 657 at the ingress AFBR. 659 It is envisioned that in most cases, the policy will be a very simple 660 one, and will be the same at all the AFBRs of a given transit core. 661 E.g., "always use LDP-based MPLS", or "always use L2TPv3". 663 However, other deployments may have a mixture of routers, some of 664 which support, say, both GRE and L2TPv3, but others of which support 665 only one of those techniques. It is desirable therefore to allow the 666 network administration to create a small set of classes, and to 667 configure each AFBR to be a member of one or more of these classes. 668 Then the routers can advertise their class memberships to each other, 669 and the encapsulation policies can be expressed as, e.g., "use L2TPv3 670 to talk to routers in class X, use GRE to talk to routers in class 671 Y". To support such policies, it is necessary for the AFBRs to be 672 able to advertise their class memberships. 
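A class-based policy of the form "use L2TPv3 to talk to routers in class X, use GRE to talk to routers in class Y" can be sketched as an ordered lookup over the classes a remote AFBR has advertised. The class names, preference order, and function below are illustrative assumptions, not part of the framework itself:

```python
# Hypothetical policy table, ordered by local preference: use L2TPv3
# for routers advertising membership in class X, GRE for class Y.
ENCAPS_POLICY = [
    ("class-X", "L2TPv3"),
    ("class-Y", "GRE"),
]

def select_encaps(advertised_classes, default="GRE"):
    """Pick a tunneling technology for a remote AFBR, given the set of
    classes (e.g., communities) it has advertised."""
    for cls, encaps in ENCAPS_POLICY:
        if cls in advertised_classes:
            return encaps
    return default

print(select_encaps({"class-X", "class-Y"}))  # class-X preferred: L2TPv3
print(select_encaps({"class-Y"}))             # GRE
```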
[ENCAPS-SAFI] specifies a 673 way in which an AFBR may advertise, to other AFBRs, various 674 characteristics which may be relevant to the policy (e.g., "I belong 675 to class Y"). In many cases, these characteristics can be 676 represented by arbitrarily selected communities or extended 677 communities, and the policies at the ingress can be expressed in 678 terms of these classes (i.e., communities). 680 Policy may also require a certain class of traffic to receive a 681 certain quality of service, and this may impact the choice of tunnel 682 and/or tunneling technology used for packets in that class. This 683 framework allows a variety of tunneling technologies to be used for 684 instantiating softwires. The choice of tunneling technology is a 685 matter of policy, as discussed in section 2. 687 While in many cases the policy will be unconditional, e.g., "always 688 use L2TPv3 for softwires", in other cases the policy may specify that 689 the choice is conditional upon information about the softwire remote 690 endpoint, e.g., "use L2TPv3 to talk to routers in class X, use GRE to 691 talk to routers in class Y". It is desirable therefore to allow the 692 network administration to create a small set of classes, and to 693 configure each AFBR to be a member of one or more of these classes. 694 If each such class is represented as a community or extended 695 community, then [ENCAPS-SAFI] specifies a method that AFBRs can use 696 to advertise their class memberships to each other. 698 This framework also allows for policies of arbitrary complexity, 699 which may depend on characteristics or attributes of individual 700 address prefixes, as well as on QoS or security considerations. 701 However, the specification of such policies is not within the scope 702 of this document. 704 9. Selecting the Softwire for a Given Packet 706 Suppose it has been decided to send a given packet through a 707 softwire.
Routing provides the address, in the address family of the 708 transport network, of the BGP next hop. The packet MUST be sent 709 through a softwire whose remote endpoint address is the same as the 710 BGP next hop address. 712 Sending a packet through a softwire is a matter of encapsulating the 713 packet with an encapsulation header that can be processed by the 714 transit network, and then transmitting it towards the softwire's remote 715 endpoint address. 717 In many cases, once one knows the remote endpoint address, one has 718 all the information one needs in order to form the encapsulation 719 header. This will be the case if the tunnel technology instantiating 720 the softwire is, e.g., LDP-based MPLS, IP-in-IP, or GRE without 721 optional header fields. 723 If the tunnel technology being used is L2TPv3 or GRE with optional 724 header fields, additional information from the remote endpoint is 725 needed in order to form the encapsulation header. The procedures for 726 sending and receiving this information are described in [ENCAPS- 727 SAFI]. 729 If the tunnel technology being used is RSVP-TE-based MPLS or IPsec, 730 the native signaling procedures of those technologies will need to be 731 used. 733 IPsec procedures will be discussed further in a subsequent revision 734 of this document. 736 RSVP-TE procedures will be discussed in companion documents. 738 If the packet being sent through the softwire matches a route in the 739 labeled IPv4 or labeled IPv6 address families, it should be sent 740 through the softwire as an MPLS packet with the corresponding label. 741 Note that most of the tunneling technologies mentioned in this 742 document are capable of carrying MPLS packets, so this does not 743 presuppose support for MPLS in the core routers. 745 10. Softwire OAM and MIBs 747 10.1. Operations and Maintenance (OAM) 749 Softwires are essentially tunnels connecting routers.
If they 750 disappear or degrade in performance, then connectivity through those 751 tunnels will be impacted. There are several techniques available to 752 monitor the status of the tunnel end-points (AFBRs) as well as the 753 tunnels themselves. These techniques allow operations such as 754 softwire path tracing, remote softwire end-point pinging, and remote 755 softwire end-point liveness failure detection. 757 Examples of techniques applicable to softwire OAM include: 759 o BGP/TCP timeouts between AFBRs 761 o ICMP or LSP echo request and reply addressed to a particular AFBR 763 o BFD (Bidirectional Forwarding Detection) [BFD] packet exchange 764 between AFBR routers 766 Another possibility for softwire OAM is to build something similar to 767 [RFC4378], in other words, to create and generate softwire echo 768 request/reply packets. The echo request sent to a well-known UDP 769 port would contain the egress AFBR IP address and the softwire 770 identifier as the payload (similar to the MPLS forwarding equivalence 771 class contained in the LSP echo request). The softwire echo packet 772 would be encapsulated with the encapsulation header and forwarded 773 across the same path (inband) as that of the softwire itself. 775 This mechanism can also be automated to periodically verify remote 776 softwire end-point reachability, with the loss of reachability being 777 signaled to the softwires application on the local AFBR, thus enabling 778 suitable actions to be taken. Consideration must be given to the 779 trade-offs between scalability of such mechanisms versus time to 780 detection of loss of endpoint reachability for such automated 781 mechanisms. 783 In general, a framework for softwire OAM can in large part be based 784 on the [RFC4176] framework. 786 10.2. MIBs 788 Specific MIBs do exist to manage elements of the softwire mesh 789 framework.
However, there will be a need to either extend these MIBs 790 or create new ones that reflect the functional elements that can be 791 SNMP-managed within the softwire network. 793 11. Softwire Multicast 795 A set of client networks, running E-IP, that are connected to a 796 provider's I-IP transit core, may wish to run IP multicast 797 applications. Extending IP multicast connectivity across the transit 798 core can be done in a number of ways, each with a different set of 799 characteristics. Most (though not all) of the possibilities are 800 slight variations of the procedures defined for L3VPNs in 801 [L3VPN-MCAST]. 803 We will focus on supporting those multicast features and protocols 804 which are typically used across inter-provider boundaries. Support 805 is provided for PIM-SM (PIM Sparse Mode) and PIM-SSM (PIM Source- 806 Specific Multicast). Support for BIDIR-PIM (Bidirectional PIM), BSR 807 (Bootstrap Router Mechanism for PIM), and AutoRP (Automatic Rendezvous 808 Point Determination) is not provided, as these features are not 809 typically used across inter-provider boundaries. 811 11.1. One-to-One Mappings 813 In the "one-to-one mapping" scheme, each client multicast tree is 814 extended through the transit core, so that for each client tree there 815 is exactly one tree through the core. 817 The one-to-one scheme is not used in [L3VPN-MCAST], because it 818 requires an amount of state in the core routers which is proportional 819 to the number of client multicast trees passing through the core. In 820 the VPN context, this is considered undesirable, because the amount 821 of state is unbounded and out of the control of the service provider. 822 However, the one-to-one scheme models the typical "Internet 823 multicast" scenario where the client network and the transit core are 824 both IPv4 or are both IPv6.
If it scales satisfactorily for that 825 case, it should also scale satisfactorily for the case where the 826 client network and the transit core support different versions of IP. 828 11.1.1. Using PIM in the Core 830 When an AFBR receives an E-IP PIM control message from one of its 831 CEs, it would translate it from E-IP to I-IP, and forward it towards 832 the source of the tree. Since the routers in the transit core will 833 not generally have a route to the source of the tree, the AFBR must 834 include an "RPF Vector" in the PIM message. 836 Suppose an AFBR A receives an E-IP PIM Join/Prune message from a CE, 837 for either an (S,G) tree or a (*,G) tree. The AFBR would have to 838 "translate" the PIM message into an I-IP PIM message. It would then 839 send it to the neighbor which is the next hop along the route to the 840 root of the (S,G) or (*,G) tree. In the case of an (S,G) tree the 841 root of the tree is S; in the case of a (*,G) tree the root of the 842 tree is the Rendezvous Point (RP) for the group G. 844 Note that the address of the root of the tree will be an E-IP 845 address. Since the routers within the transit core (other than the 846 AFBRs) do not have routes to E-IP addresses, A must put an "RPF 847 Vector" [RPF-VECTOR] in the PIM Join/Prune message that it sends to 848 its upstream neighbor. The RPF Vector will identify, as an I-IP 849 address, the AFBR B that is the egress point in the transit network 850 along the route to the root of the multicast tree. AFBR B is AFBR 851 A's "BGP next hop" for the route to the root of the tree. The RPF 852 Vector allows the core routers to forward PIM Join/Prune messages 853 upstream towards the root of the tree, even though they do not 854 maintain E-IP routes.
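The choice of RPF Vector value just described, i.e. finding the I-IP BGP next hop (AFBR B) of the best-matching route to the tree root, can be sketched as a longest-prefix-match lookup. The routing table and addresses below are hypothetical:

```python
import ipaddress

# Hypothetical BGP table at AFBR A: E-IP (IPv4) prefixes mapped to the
# I-IP (IPv6) address of the egress AFBR that is the BGP next hop.
BGP_ROUTES = {
    ipaddress.ip_network("192.0.2.0/24"): "2001:db8::b",     # via AFBR B
    ipaddress.ip_network("198.51.100.0/24"): "2001:db8::c",  # via AFBR C
}

def rpf_vector(tree_root):
    """Return the I-IP address to carry in the PIM RPF Vector: the BGP
    next hop of the longest-match route to the tree root (S or the RP)."""
    root = ipaddress.ip_address(tree_root)
    best = max((pfx for pfx in BGP_ROUTES if root in pfx),
               key=lambda pfx: pfx.prefixlen, default=None)
    return BGP_ROUTES[best] if best is not None else None

print(rpf_vector("192.0.2.1"))  # 2001:db8::b
```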
856 In order to "translate" an E-IP PIM message into an I-IP PIM 857 message, the AFBR A must translate the address of S (in the case of 858 an (S,G) group) or the address of G's RP from the E-IP address family 859 to the I-IP address family, and the AFBR B must translate them back. 861 In the case where E-IP is IPv4 and I-IP is IPv6, it is possible to do 862 this translation algorithmically. A can translate the IPv4 S and G 863 into the corresponding IPv4-mapped IPv6 addresses [RFC4291], and then 864 B can translate them back. The precise circumstances under which 865 these translations are done would be a matter of policy. 867 Obviously, this translation procedure does not generalize to the case 868 where the client multicast is IPv6 but the core is IPv4. To handle 869 that case, one needs additional signaling between the two AFBRs. 870 Each downstream AFBR needs to signal the upstream AFBR that it needs a 871 multicast tunnel for (S,G). The upstream AFBR must then assign a 872 multicast address G' to the tunnel, and inform the downstream AFBR of the 873 G' value to use. The downstream AFBR then uses PIM/IPv4 to join the 874 (S', G') tree, where S' is the IPv4 address of the upstream AFBR. 875 877 The (S', G') trees should be SSM trees. 879 This procedure can be used to support client multicasts of either 880 IPv4 or IPv6 over a transit core of the opposite protocol. However, 881 it only works when the client multicasts are SSM, since it provides 882 no method for mapping a client "prune a source off the (*,G) tree" 883 operation into an operation on the (S',G') tree. This method also 884 requires additional signaling. The BGP-based signaling of [L3VPN- 885 MCAST-BGP] is one signaling method that could be used. Other 886 signaling methods could be defined as well. 888 11.1.2.
Using mLDP and Multicast MPLS in the Core 890 If the transit core implements mLDP [mLDP] and supports multicast 891 MPLS, then client Source-Specific Multicast (SSM) trees can be mapped 892 one-to-one onto P2MP LSPs. 894 When an AFBR A receives an E-IP PIM Join/Prune message for (S,G) from 895 one of its CEs, where G is an SSM group, it would use mLDP to join a 896 P2MP LSP. The root of the P2MP LSP would be the AFBR B that is A's 897 BGP next hop on the route to S. In mLDP, a P2MP LSP is uniquely 898 identified by a combination of its root and a "FEC (Forwarding 899 Equivalence Class) identifier". The original (S,G) can be 900 algorithmically encoded into the FEC identifier, so that all AFBRs 901 that need to join the P2MP LSP for (S,G) will generate the same FEC 902 identifier. When the root of the P2MP LSP (AFBR B) receives such an 903 mLDP message, it extracts the original (S,G) from the FEC identifier, 904 creates an "ordinary" E-IP PIM Join/Prune message, and sends it to 905 the CE which is its next hop on the route to S. 907 The method of encoding the (S,G) into the FEC identifier needs to be 908 standardized. The encoding must be self-identifying, so that a node 909 which is the root of a P2MP LSP can determine whether a FEC 910 identifier is the result of having encoded a PIM (S,G). 912 The appropriate state machinery must be standardized so that PIM 913 events at the AFBRs result in the proper mLDP events. For example, 914 if at some point an AFBR determines (via PIM procedures) that it no 915 longer has any downstream receivers for (S,G), the AFBR should invoke 916 the proper mLDP procedures to prune itself off the corresponding P2MP 917 LSP. 919 Note that this method cannot be used when G is a Sparse Mode 920 group. The reason this method cannot be used is that mLDP does not 921 have any function corresponding to the PIM "prune this source off the 922 shared tree" function.
So if a shared tree were mapped one-to-one onto 923 a P2MP LSP, duplicate traffic could end up traversing the transit 924 core (i.e., traffic from S might travel down both the shared tree and 925 S's source tree). Alternatively, one could devise an AFBR-to-AFBR 926 protocol to prune sources off the P2MP LSP at the root of the LSP. 927 It is recommended though that client SM multicast groups be supported 928 by other methods, such as those discussed below. 930 Client-side bidirectional multicast groups set up by PIM-bidir could 931 be mapped using the above technique to MP2MP (Multipoint-to- 932 Multipoint) LSPs set up by mLDP [MLDP]. We do not consider this 933 further as inter-provider bidirectional groups are not in use 934 anywhere. 936 11.2. MVPN-like Schemes 938 The "MVPN-like schemes" are those described in [L3VPN-MCAST] and its 939 companion documents (such as [L3VPN-MCAST-BGP]). To apply those 940 schemes to the softwire environment, it is necessary only to treat 941 all the AFBRs of a given transit core as if they were all, for 942 multicast purposes, PE routers attached to the same VPN. 944 The MVPN-like schemes do not require a one-to-one mapping between 945 client multicast trees and transit core multicast trees. In the MVPN 946 environment, it is a requirement that the number of trees in the core 947 scales less than linearly with the number of client trees. This 948 requirement may not hold in the softwires scenarios. 950 The MVPN-like schemes can support SM, SSM, and Bidir groups. They 951 provide a number of options for the control plane: 953 - LAN-Like 955 Use a set of multicast trees in the core to emulate a LAN (Local 956 Area Network), and run the client-side PIM protocol over that 957 "LAN". The "LAN" can consist of a single Bidir tree containing 958 all the AFBRs, or a set of SSM trees, one rooted at each AFBR, 959 and containing all the other AFBRs as receivers.
961 - NBMA (Non-Broadcast Multiple Access), using BGP 963 The client-side PIM signaling can be "translated" into BGP-based 964 signaling, with a BGP route reflector mediating the signaling. 966 These two basic options admit of many variations; a comprehensive 967 discussion is in [L3VPN-MCAST]. 969 For the data plane, there are also a number of options: 971 - All multicast data sent over the emulated LAN. This particular 972 option is not very attractive though for the softwires scenarios, 973 as every AFBR would have to receive every client multicast 974 packet. 976 - Every multicast group mapped to a tree which is considered 977 appropriate for that group, in the sense of not causing the traffic 978 of that group to go to "too many" AFBRs that don't need to 979 receive it. 981 Again, a comprehensive discussion of the issues can be found in 982 [L3VPN-MCAST]. 984 12. Inter-AS Considerations 986 We have so far only considered the case where a "transit core" 987 consists of a single Autonomous System (AS). If the transit core 988 consists of multiple ASes, then it may be necessary to use softwires 989 whose endpoints are AFBRs attached to different Autonomous Systems. 990 In this case, the AFBR at the remote endpoint of a softwire is not 991 the BGP next hop for packets that need to be sent on the softwire. 992 Since the procedures described above require the address of the remote 993 softwire endpoint to be the same as the address of the BGP next hop, 994 those procedures do not work as specified when the transit core 995 consists of multiple ASes. 997 There are several ways to deal with this situation. 999 1. Don't do it; require that there be AFBRs at the edge of each 1000 AS, so that a transit core does not extend beyond one AS. 1002 2. Use multi-hop EBGP to allow AFBRs to send BGP routes to each 1003 other, even if the AFBRs are not in the same or in neighboring 1004 ASes. 1006 3.
Ensure that an ASBR which is not an AFBR does not change the 1007 next hop field of the routes for which encapsulation is needed. 1009 In the latter two cases, BGP recursive next hop resolution needs to 1010 be done, and encapsulations may need to be stacked. 1012 For instance, consider packet P with destination IP address D. 1013 Suppose it arrives at ingress AFBR A1, and that the route that is the 1014 best match for D has BGP next hop B1. So A1 will encapsulate the 1015 packet for delivery to B1. If B1 is not within A1's AS, A1 will need 1016 to look up the route to B1 and then find the BGP next hop, call it 1017 B2, of that route. If the interior routers of A1's AS do not have 1018 routes to B1, then A1 needs to encapsulate the packet a second time, 1019 this time for delivery to B2. 1021 13. IANA Considerations 1023 This document has no actions for IANA. 1025 14. Security Considerations 1027 14.1. Problem Analysis 1029 In the Softwires mesh framework, the data packets that are 1030 encapsulated are E-IP data packets that are traveling through the 1031 Internet. These data packets (the Softwires "payload") may or may 1032 not need such security features as authentication, integrity, 1033 confidentiality, or replay protection. However, the security needs 1034 of the payload packets are independent of whether or not those 1035 packets are traversing softwires. The fact that a particular payload 1036 packet is traveling through a softwire does not in any way affect its 1037 security needs. 1039 Thus the only security issues we need to consider are those which 1040 affect the I-IP encapsulation headers, rather than those which affect 1041 the E-IP payload. 1043 Since the encapsulation headers determine the routing of packets 1044 traveling through softwires, they must appear "in the clear", i.e., 1045 they do not have any confidentiality requirement.
1047 In the Softwires mesh framework, for each tunnel receiving endpoint, 1048 there are one or more "valid" transmitting endpoints, where the valid 1049 transmitting endpoints are those which are authorized to tunnel 1050 packets to the receiving endpoint. If the encapsulation header has 1051 no guarantee of authentication or integrity, then it is possible to 1052 have spoofing attacks, in which unauthorized nodes send encapsulated 1053 packets to the receiving endpoint, giving the receiving endpoint the 1054 false impression that the encapsulated packets have really traveled 1055 through the softwire. Replay attacks are also possible. 1057 The effect of such attacks is somewhat limited though. The receiving 1058 endpoint of a softwire decapsulates the payload and does further 1059 routing based on the IP destination address of the payload. Since 1060 the payload packets are traveling through the Internet, they have 1061 addresses from the globally unique address space (rather than, e.g., 1062 from a private address space of some sort). Therefore these attacks 1063 cannot cause payload packets to be delivered to an address other than 1064 the one intended. 1066 However, attacks of this sort can result in policy violations. The 1067 authorized transmitting endpoint(s) of a softwire may be following a 1068 policy according to which only certain payload packets get sent 1069 through the softwire. If unauthorized nodes are able to encapsulate 1070 the payload packets so that they arrive at the receiving endpoint 1071 looking as if they arrived from authorized nodes, then the properly 1072 authorized policies have been side-stepped. 1074 Attacks of the sort we are considering can also be used in Denial of 1075 Service attacks on the receiving tunnel endpoints. However, such 1076 attacks cannot be prevented by use of cryptographic 1077 authentication/integrity techniques, as the need to do cryptography 1078 on spoofed packets only makes the Denial of Service problem worse.
1080 This section is largely based on the security considerations section 1081 of RFC 4023, which also deals with encapsulations and tunnels. 1083 14.2. Non-cryptographic techniques 1085 If a tunnel lies entirely within a single administrative domain, then, 1086 to a certain extent, there are non-cryptographic 1087 techniques one can use to prevent spoofed packets from reaching a 1088 tunnel's receiving endpoint. For example, when the tunnel 1089 encapsulation is IP-based: 1091 - The tunnel receiving endpoints can be given a distinct set of 1092 addresses, and those addresses can be made known to the border 1093 routers. The border routers can then filter out packets, 1094 destined to those addresses, which arrive from outside the 1095 domain. 1097 - The tunnel transmitting endpoints can be given a distinct set of 1098 addresses, and those addresses can be made known to the border 1099 routers and to the tunnel receiving endpoints. The border routers 1100 can filter out all packets arriving from outside the domain with 1101 source addresses that are in this set, and the receiving 1102 endpoints can discard all packets which appear to be part of a 1103 softwire, but whose source addresses are not in this set. 1105 If an MPLS-based encapsulation is used, the border routers can refuse 1106 to accept MPLS packets from outside the domain, or can refuse to 1107 accept such MPLS packets whenever the top label corresponds to the 1108 address of a tunnel receiving endpoint. 1110 These techniques assume that within a domain, the network is secure 1111 enough to prevent the introduction of spoofed packets from within the 1112 domain itself. That may not always be the case. Also, these 1113 techniques can be difficult or impossible to use effectively 1114 for tunnels that are not in the same administrative domain. 1116 A different technique is to have the encapsulation header contain a 1117 cleartext password.
The 64-bit "cookie" of L2TPv3 [RFC3931] is 1118 sometimes used in this way. This can be useful within an 1119 administrative domain if it is regarded as infeasible for an attacker 1120 to spy on packets that originate in the domain and that do not leave 1121 the domain. An attacker would then not be able to discover the 1122 password. An attacker could of course try to guess the password, but 1123 if the password is an arbitrary 64-bit binary sequence, brute force 1124 attacks which run through all the possible passwords would be 1125 infeasible. This technique may be easier to manage than ingress 1126 filtering is, and may be just as effective if the assumptions hold. 1127 Like ingress filtering, though, it may not be applicable for tunnels 1128 that cross domain boundaries. 1130 Therefore it is necessary to consider the use of cryptographic 1131 techniques for setting up the tunnels and for passing data through 1132 them. 1134 14.3. Cryptographic techniques 1136 If the path between the two endpoints of a tunnel is not adequately 1137 secure, then 1139 - If a control protocol is used to set up the tunnels (e.g., to 1140 inform one tunnel endpoint of the IP address of the other), the 1141 control protocol MUST have an authentication mechanism, and this 1142 MUST be used when the tunnel is set up. If the tunnel is set up 1143 automatically as the result of, for example, information 1144 distributed by BGP, then the use of BGP's MD5-based 1145 authentication mechanism [RFC2385] is satisfactory. 1147 - Data transmission through the tunnel should be secured with 1148 IPsec. In the remainder of this section, we specify the way 1149 IPsec may be used, and the implementation requirements we mention 1150 are meant to be applicable whenever IPsec is being used. 1152 We consider only the case where IPsec is used together with an IP- 1153 based tunneling mechanism. Use of IPsec with an MPLS-based tunneling 1154 mechanism is for further study.
In the case where the encapsulation 1155 being used is MPLS-in-IP or MPLS-in-GRE, please see RFC 4023 for the 1156 details. 1158 When IPsec is used, the tunnel head and the tunnel tail should be 1159 treated as the endpoints of a Security Association. For this 1160 purpose, a single IP address of the tunnel head will be used as the 1161 source IP address, and a single IP address of the tunnel tail will be 1162 used as the destination IP address. 1164 The encapsulated packets should be viewed as originating at the 1165 tunnel head and as being destined for the tunnel tail; IPsec 1166 transport mode SHOULD thus be used. 1168 The IP header of the encapsulated packet becomes the outer IP header 1169 of the resulting packet. That IP header is followed by an IPsec 1170 header, which in turn is followed by the payload. 1172 When IPsec is used to secure softwires, IPsec MUST provide 1173 authentication and integrity. Thus, the implementation MUST support 1174 ESP (IP Encapsulating Security Payload) with null encryption 1175 [RFC4303]. ESP with encryption MAY be supported. If ESP is used, 1176 the tunnel tail MUST check that the source IP address of any packet 1177 received on a given SA is the one expected. 1179 Since the softwires are set up dynamically as a byproduct of passing 1180 routing information, key distribution MUST be done automatically by 1181 means of IKE [RFC4306], operating in main mode with preshared keys. 1183 The selectors associated with the SA are the source and destination 1184 addresses of the encapsulation header, along with the IP protocol 1185 number representing the encapsulation protocol being used. 1187 It should be noted that the implementation of IPsec with automatic 1188 keying is generally not considered to be an attractive option.
The 1189 combination of cryptography with encapsulation/decapsulation at high 1190 speeds is rarely offered by vendors, and the management overhead of 1191 supporting an automated keying infrastructure is rarely desired by 1192 service providers. 1194 15. Acknowledgments 1196 David Ward, Chris Cassar, Gargi Nalawade, Ruchi Kapoor, Pranav Mehta, 1197 Mingwei Xu and Ke Xu provided useful input into this document. 1199 16. Normative References 1201 [ENCAPS-SAFI] "BGP Information SAFI and BGP Tunnel Encapsulation 1202 Attribute", P. Mohapatra and E. Rosen, draft-pmohapat-idr-info- 1203 safi-01.txt, February 2007. 1205 [RFC2003] "IP Encapsulation within IP", C. Perkins, October 1996. 1207 [RFC2119] "Key words for use in RFCs to Indicate Requirement Levels", 1208 S. Bradner, March 1997. 1210 [RFC2784] "Generic Routing Encapsulation (GRE)", D. Farinacci, T. Li, 1211 S. Hanks, D. Meyer, P. Traina, RFC 2784, March 2000. 1213 [RFC3031] "Multiprotocol Label Switching Architecture", E. Rosen, A. 1214 Viswanathan, R. Callon, RFC 3031, January 2001. 1216 [RFC3032] "MPLS Label Stack Encoding", E. Rosen, D. Tappan, G. 1217 Fedorkow, Y. Rekhter, D. Farinacci, T. Li, A. Conta, RFC 3032, 1218 January 2001. 1220 [RFC3209] D. Awduche, L. Berger, D. Gan, T. Li, V. Srinivasan, and G. 1221 Swallow, "RSVP-TE: Extensions to RSVP for LSP Tunnels", RFC 3209, 1222 December 2001. 1224 [RFC3931] J. Lau, M. Townsley, I. Goyret, "Layer Two Tunneling 1225 Protocol - Version 3 (L2TPv3)", RFC 3931, March 2005. 1227 [V4NLRI-V6NH] F. Le Faucheur, E. Rosen, "Advertising an IPv4 NLRI 1228 with an IPv6 Next Hop", draft-ietf-idr-v4nlri-v6nh-00.txt, October 1229 2006. 1231 [V6NLRI-V4NH] J. De Clercq, D. Ooms, S. Prevost, F. Le Faucheur, 1232 "Connecting IPv6 Islands over IPv4 MPLS using IPv6 Provider Edge 1233 Routers (6PE)", RFC 4798, February 2007. 1235 17. Informative References 1237 [BFD] D. Katz and D. Ward, "Bidirectional Forwarding Detection", 1238 draft-ietf-bfd-base-06.txt, March 2007. 
   [L3VPN-MCAST]  Rosen, E. and R. Aggarwal, "Multicast in MPLS/BGP IP VPNs", draft-ietf-l3vpn-2547bis-mcast-04.txt, October 2006.

   [L3VPN-MCAST-BGP]  Aggarwal, R., Rosen, E., Morin, T., Rekhter, Y., and C. Kodeboniya, "BGP Encodings and Procedures for Multicast in MPLS/BGP IP VPNs", draft-ietf-l3vpn-2547bis-mcast-bgp-02.txt, March 2007.

   [MLDP]  Minei, I., Kompella, K., Wijnands, IJ., and B. Thomas, "Label Distribution Protocol Extensions for Point-to-Multipoint and Multipoint-to-Multipoint Label Switched Paths", draft-ietf-mpls-ldp-p2mp-02, June 2006.

   [RFC1195]  Callon, R., "Use of OSI IS-IS for Routing in TCP/IP and Dual Environments", RFC 1195, December 1990.

   [RFC2328]  Moy, J., "OSPF Version 2", RFC 2328, April 1998.

   [RFC2385]  Heffernan, A., "Protection of BGP Sessions via the TCP MD5 Signature Option", RFC 2385, August 1998.

   [RFC3036]  Andersson, L., Doolan, P., Feldman, N., Fredette, A., and B. Thomas, "LDP Specification", RFC 3036, January 2001.

   [RFC4176]  El Mghazli, Y., Nadeau, T., Boucadair, M., Chan, K., and A. Gonguet, "Framework for Layer 3 Virtual Private Networks (L3VPN) Operations and Management", RFC 4176, October 2005.

   [RFC4271]  Rekhter, Y., Li, T., and S. Hares, "A Border Gateway Protocol 4 (BGP-4)", RFC 4271, January 2006.

   [RFC4291]  Hinden, R. and S. Deering, "IP Version 6 Addressing Architecture", RFC 4291, February 2006.

   [RFC4301]  Kent, S. and K. Seo, "Security Architecture for the Internet Protocol", RFC 4301, December 2005.

   [RFC4303]  Kent, S., "IP Encapsulating Security Payload (ESP)", RFC 4303, December 2005.

   [RFC4306]  Kaufman, C., Ed., "Internet Key Exchange (IKEv2) Protocol", RFC 4306, December 2005.

   [RFC4364]  Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private Networks (VPNs)", RFC 4364, February 2006.

   [RFC4378]  Allan, D. and T.
Nadeau, "A Framework for Multi-Protocol Label Switching (MPLS) Operations and Management (OAM)", RFC 4378, February 2006.

   [RPF-VECTOR]  Wijnands, IJ., "The RPF Vector TLV", draft-ietf-pim-rpf-vector-03.txt, October 2006.

   [SW-PROB]  Li, X., "Softwire Problem Statement", draft-ietf-softwire-problem-statement-03.txt, March 2007.

Authors' Addresses

   Jianping Wu
   Department of Computer Science, Tsinghua University
   Beijing  100084
   P.R. China

   Phone: +86-10-6278-5983
   Email: jianping@cernet.edu.cn

   Yong Cui
   Department of Computer Science, Tsinghua University
   Beijing  100084
   P.R. China

   Phone: +86-10-6278-5822
   Email: yong@csnet1.cs.tsinghua.edu.cn

   Xing Li
   Department of Electronic Engineering, Tsinghua University
   Beijing  100084
   P.R. China

   Phone: +86-10-6278-5983
   Email: xing@cernet.edu.cn

   Chris Metz
   Cisco Systems, Inc.
   3700 Cisco Way
   San Jose, CA  95134
   USA

   Email: chmetz@cisco.com

   Eric C. Rosen
   Cisco Systems, Inc.
   1414 Massachusetts Avenue
   Boxborough, MA  01719
   USA

   Email: erosen@cisco.com

   Simon Barber
   Cisco Systems, Inc.
   250 Longwater Avenue
   Reading  RG2 6GB
   United Kingdom

   Email: sbarber@cisco.com

   Pradosh Mohapatra
   Cisco Systems, Inc.
   3700 Cisco Way
   San Jose, CA  95134
   USA

   Email: pmohapat@cisco.com

   John Scudder
   Juniper Networks
   1194 North Mathilda Avenue
   Sunnyvale, CA  94089
   USA

   Email: jgs@juniper.net

18. Full Copyright Statement

Copyright (C) The IETF Trust (2007).

This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights.
This document and the information contained herein are provided on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

19. Intellectual Property

The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights.  Information on the procedures with respect to rights in RFC documents can be found in BCP 78 and BCP 79.

Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at http://www.ietf.org/ipr.

The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard.  Please address the information to the IETF at ietf-ipr@ietf.org.