idnits 2.17.1
draft-marques-l3vpn-end-system-00.txt:
Checking boilerplate required by RFC 5378 and the IETF Trust (see
https://trustee.ietf.org/license-info):
----------------------------------------------------------------------------
No issues found here.
Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
----------------------------------------------------------------------------
== No 'Intended status' indicated for this document; assuming Proposed
Standard
Checking nits according to https://www.ietf.org/id-info/checklist :
----------------------------------------------------------------------------
** The document seems to lack an IANA Considerations section. (See Section
2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
when there are no actions for IANA.)
** The document seems to lack separate sections for Informative/Normative
References. All references will be assumed normative when checking for
downward references.
** There are 2 instances of too long lines in the document, the longest one
being 7 characters in excess of 72.
** The abstract seems to contain references ([RFC4364]), which it
shouldn't. Please replace those with straight textual mentions of the
documents in question.
== There are 6 instances of lines with private range IPv4 addresses in the
document. If these are generic example addresses, they should be changed
to use any of the ranges defined in RFC 6890 (or successor): 192.0.2.x,
198.51.100.x or 203.0.113.x.
** The document seems to lack a both a reference to RFC 2119 and the
recommended RFC 2119 boilerplate, even if it appears to use RFC 2119
keywords.
RFC 2119 keyword, line 144: '...ficance only and SHOULD be allocated b...'
RFC 2119 keyword, line 208: '...peering sessions SHALL be support the ...'
RFC 2119 keyword, line 218: '... Network devices MAY have direct BGP s...'
RFC 2119 keyword, line 318: '...nt XMPP sessions. These sessions MUST...'
RFC 2119 keyword, line 322: '... An End-system MAY connect to multip...'
(8 more instances...)
Miscellaneous warnings:
----------------------------------------------------------------------------
== The copyright year in the IETF Trust and authors Copyright Line does not
match the current year
-- The document date (October 6, 2011) is 4579 days in the past. Is this
intentional?
Checking references for intended status: Proposed Standard
----------------------------------------------------------------------------
(See RFCs 3967 and 4897 for information about using normative references
to lower-maturity documents in RFCs)
No issues found here.
Summary: 5 errors (**), 0 flaws (~~), 3 warnings (==), 1 comment (--).
Run idnits with the --verbose option for more detailed information about
the items above.
--------------------------------------------------------------------------------
2 Network Working Group P. Marques
3 Internet-Draft
4 Expires: April 8, 2012 L. Fang
5 Cisco Systems
6 P. Pan
7 Infinera Corp
8 October 6, 2011
10 End-system support for BGP-signaled IP/VPNs.
11 draft-marques-l3vpn-end-system-00
13 Abstract
15 Network Service Providers often use BGP/MPLS IP VPNs [RFC4364] as the
16 control plane for overlay networks. That solution has proven to
17 scale to a large number of VPNs and attachment points and is
18 familiar to network equipment software.
20 There is a significant interest in the industry in building overlay
21 networks in which end-systems are themselves the direct participants,
22 along with network equipment such as service appliances.
24 This document proposes an extension of the BGP IP VPN model to serve
25 as the signaling protocol for host-based overlay networks along with
26 an XMPP interface that provides a bridge between the software
27 concepts familiar to end-points and those familiar to network
28 equipment.
30 Status of this Memo
32 This Internet-Draft is submitted in full conformance with the
33 provisions of BCP 78 and BCP 79.
35 Internet-Drafts are working documents of the Internet Engineering
36 Task Force (IETF). Note that other groups may also distribute
37 working documents as Internet-Drafts. The list of current Internet-
38 Drafts is at http://datatracker.ietf.org/drafts/current/.
40 Internet-Drafts are draft documents valid for a maximum of six months
41 and may be updated, replaced, or obsoleted by other documents at any
42 time. It is inappropriate to use Internet-Drafts as reference
43 material or to cite them other than as "work in progress."
45 This Internet-Draft will expire on April 8, 2012.
47 Copyright Notice
48 Copyright (c) 2011 IETF Trust and the persons identified as the
49 document authors. All rights reserved.
51 This document is subject to BCP 78 and the IETF Trust's Legal
52 Provisions Relating to IETF Documents
53 (http://trustee.ietf.org/license-info) in effect on the date of
54 publication of this document. Please review these documents
55 carefully, as they describe your rights and restrictions with respect
56 to this document. Code Components extracted from this document must
57 include Simplified BSD License text as described in Section 4.e of
58 the Trust Legal Provisions and are provided without warranty as
59 described in the Simplified BSD License.
61 Table of Contents
63 1. Introduction
64 2. End-system functionality
65 3. Operational Model
66 4. XMPP client interface
67 5. VPN NLRI
68 6. Security Considerations
69 7. References
70 Authors' Addresses
72 1. Introduction
74 Data center applications require private networks connecting multiple
75 "Virtual Machines" belonging to the same administrative "user" and
76 between them and network elements and appliances.
78 In this context, it is a common goal for the data-center forwarding
79 infrastructure to be isolated from the knowledge of the private
80 network. The set of routers and switches that interconnects physical
81 machines in the data-center is assumed to provide an IP service (with
82 or without the use of IEEE 802.1 technologies).
84 The Virtual Private Networks (VPNs) associated with each individual
85 administrative domain can be built without the knowledge of the data-
86 center connectivity layer as an overlay network. This proposal
87 leverages the technology used in the Service Provider managed VPN
88 space and extends it to address the problem of interconnecting
89 virtual interfaces on end-systems. In both applications there is the
90 need to be able to manage at scale a very large number of VPNs and
91 attachment points. And in both cases there is the need to support
92 the interchange of traffic between different VPNs.
94 This document defines how BGP-signaled IP/VPNs can be used to
95 interconnect end-systems and network elements. It assumes that the
96 forwarding layer uses IP over GRE as defined by [RFC4023]. Other
97 transport layers such as native MPLS or 802.1ah can also be used with
98 the same signaling approach.
100 When this document uses the term 'Infrastructure IP' addresses, it
101 refers to the addresses used in the outer header of GRE packets. In
102 the case of a transport other than IP over GRE, this would be the
103 Subnetwork Point of Attachment (SNPA) address corresponding to the
104 multi-access network providing connectivity to the end-systems.
106 BGP is not an interface that application software is familiar with.
107 In order to bridge the gap between concepts familiar to network
108 devices and those familiar to end-system developers, this document
109 defines an XMPP client interface to be used by end-systems. It
110 defines the procedures to interchange data between XMPP and BGP IP
111 VPN sessions along with the corresponding data schemas. Networking
112 devices may opt to receive the signaling information directly via
113 BGP.
115 2. End-system functionality
117 For the purposes of this document we assume that each end-system
118 executes a 'Host Operating System' with the ability to:
120 Create virtual interfaces (typically ethernet interfaces).
122 Associate a given virtual interface with a specific "Virtual
123 Routing and Forwarding (VRF)" table.
125 Store entries in the VRF table that map a VRF-specific IP prefix
126 into a next-hop which contains a destination IP address and a 20-
127 bit label.
129 Encapsulate outgoing packets according to [RFC4023] using the result
130 of the VRF lookup.
132 Associate incoming packets with a VRF according to the 20-bit
133 label contained immediately after the GRE header.
135 Expose a programmatic interface to create, update and delete VRF
136 table entries.
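The encapsulation and label demultiplexing steps in the list above can be sketched in Python. This is a minimal sketch, assuming the MPLS-in-GRE format of [RFC4023]: a 4-byte GRE header with no optional fields and protocol type 0x8847 (MPLS unicast), followed by a single 4-byte label stack entry. The outer IP header is omitted, and the function names and the default TTL are illustrative assumptions, not part of this document.

```python
import struct
from typing import Tuple

GRE_PROTO_MPLS_UNICAST = 0x8847  # EtherType for MPLS unicast, per RFC 4023


def encap(label: int, inner_packet: bytes, ttl: int = 64) -> bytes:
    """Prefix an inner IP packet with a GRE header and one MPLS label."""
    if not 0 <= label < (1 << 20):
        raise ValueError("label must fit in 20 bits")
    # GRE header: 16 bits of flags/version (all zero), 16-bit protocol type.
    gre = struct.pack("!HH", 0, GRE_PROTO_MPLS_UNICAST)
    # Label stack entry: label(20) | traffic class(3) | bottom-of-stack(1) | TTL(8).
    entry = (label << 12) | (1 << 8) | (ttl & 0xFF)
    return gre + struct.pack("!I", entry) + inner_packet


def decap(payload: bytes) -> Tuple[int, bytes]:
    """Return (label, inner packet) from the payload of the outer IP header."""
    flags_version, proto = struct.unpack("!HH", payload[:4])
    if flags_version != 0 or proto != GRE_PROTO_MPLS_UNICAST:
        raise ValueError("not a plain MPLS-in-GRE packet")
    (entry,) = struct.unpack("!I", payload[4:8])
    # The label occupies the top 20 bits of the stack entry.
    return entry >> 12, payload[8:]
```

The label returned by decap() is what a receiving host would use to select the VRF, and hence the virtual interface, for the inner packet.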
138 The 'Host Operating System' may choose to associate the virtual
139 interfaces with specific 'Virtual Machines' or use other policies to
140 manage the application access to these interfaces.
142 Hosts should support the ability to associate multiple virtual
143 interfaces with the same VRF. The 20-bit label which is associated
144 with a VRF is of local significance only and SHOULD be allocated by
145 the end-system.
147 The procedures that determine that a VRF should be configured on a
148 particular end-system, as well as which IP addresses are associated
149 with each interface, are outside the scope of this document. We
150 assume that statically assigned IP addresses are used.
152 The VRFs support IP unicast traffic only. Multicast support is
153 a subject for further study and will be detailed in a separate
154 document. Both IPv4 and IPv6 are supported and the term 'IP' can
155 refer to either version of the Internet Protocol.
157 The VRF table is populated by the signaling mechanisms described
158 below and may contain both host-length prefixes (i.e. /32 for IPv4
159 and /128 for IPv6) and subnet prefixes. As an example, a VPN with
160 access to the external networks would probably contain a default
161 route plus a set of host length entries for all the VMs in the same
162 VPN.
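The VRF table described above, with host-length entries and a default route, can be modeled as a longest-prefix-match lookup. The following is a minimal Python sketch; the class name, method names and the linear scan are illustrative assumptions, and a real forwarding plane would use a more efficient structure.

```python
import ipaddress
from typing import Dict, Optional, Tuple, Union

NextHop = Tuple[str, int]  # (infrastructure IP address, 20-bit label)
Network = Union[ipaddress.IPv4Network, ipaddress.IPv6Network]


class Vrf:
    """A per-VPN table mapping IP prefixes to (next-hop address, label)."""

    def __init__(self) -> None:
        self._routes: Dict[Network, NextHop] = {}

    def update(self, prefix: str, next_hop: str, label: int) -> None:
        """Create or replace the entry for a prefix."""
        self._routes[ipaddress.ip_network(prefix)] = (next_hop, label)

    def delete(self, prefix: str) -> None:
        """Remove the entry for a prefix, if present."""
        self._routes.pop(ipaddress.ip_network(prefix), None)

    def lookup(self, destination: str) -> Optional[NextHop]:
        """Return the next-hop of the most specific matching prefix."""
        addr = ipaddress.ip_address(destination)
        matches = [
            net for net in self._routes
            if net.version == addr.version and addr in net
        ]
        if not matches:
            return None
        best = max(matches, key=lambda net: net.prefixlen)
        return self._routes[best]
```

With a default route and a /32 entry installed, a lookup for the host address returns the /32 entry, and any other destination falls through to the default route.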
164 In the terminology used in the BGP-signaled IP/VPN standard
165 [RFC4364], an end-system acts as a 'Provider Edge' (PE) device in
166 terms of its forwarding capabilities, with the virtual interfaces
167 that it exposes (for instance to virtual machines) as the 'Customer
168 Edge' (CE) interfaces.
170 3. Operational Model
172 In the simplest case, a VPN is a collection of systems that are
173 allowed to exchange traffic with each other and where all the VRFs in
174 the VPN contain all the routing entries for the VPN. Only members of
175 the VPN are allowed to exchange traffic with each other. We can
176 refer to these as symmetrical VPNs since all VRFs contain the same
177 routing information.
179 When end-systems join a given VPN they advertise their membership by
180 advertising the VPN-specific IP address associated with a particular
181 virtual interface as well as its binding to the infrastructure IP
182 address associated with the host.
184 Infrastructure addresses are routable in the underlying transport
185 network (e.g. the data-center network), while VPN addresses are
186 routable on the VPN network only.
188 End-systems subscribe to the contents of the VPN routing tables in
189 which they have members. This information is then
190 used to populate the host's routing tables. It may contain both host
191 routes (i.e. IPv4 32-bit prefixes or IPv6 128-bit prefixes) and
192 routes to gateways that interconnect other networks.
194 The signaling network delivers the membership advertisements
195 generated by the end-systems to other members of the same VPN, subject
196 to policy controls.
198 When a particular VM "moves" from one physical end-system to another,
199 its respective VPN address will be advertised by the new system and
200 that notification propagated to all attachment points of that VPN.
202 This document assumes two types of applications that perform network
203 signaling functions: BGP Route Reflectors (RRs) and BGP/XMPP
204 signaling gateways. Both functions may be collocated in the same
205 physical device.
207 The BGP Route Reflectors accept connections from gateways or native
208 BGP devices. These BGP peering sessions SHALL support the address
209 families: VPN-IPv4 (1, 128), VPN-IPv6 (2, 128) and RT-Constraint (1,
210 132) [RFC4684].
212 The XMPP signaling gateways maintain persistent connections to a
213 subset of the end-systems of the domain and provide a 'pubsub' API to
214 the contents of each specific VPN routing table. These systems are
215 not in the forwarding plane and do not need to be collocated with a
216 network device.
218 Network devices MAY have direct BGP sessions to the BGP Route
219 Reflectors. For instance, a router or security appliance that
220 supports BGP/MPLS IP VPNs over GRE may use its existing functionality
221 to inter-operate directly with a collection of Virtual Machines.
223 The BGP/XMPP gateways implement the VRF policy functionality that is
224 associated with PE routers in the pure BGP IP/VPN case. In these
225 signaling gateways, the 'publish-subscribe' messages from the end-
226 systems are associated with a VRF-specific signaling table. Each of
227 these routing tables contains import and export policies which
228 provide fine grain control over the table contents.
230 An export policy associates VPN routing information with one or more
231 6 byte values known as 'Route Targets'. These 'Route Targets' are
232 associated with the routes as they are advertised out to other BGP
233 speakers.
235 Import policies, on the other hand, select via 'Route Targets', from
236 all the available routing information which routes should be imported
237 into a VPN-specific routing table.
239 A symmetrical VPN uses the same configuration for both import and
240 export. By controlling these policies it is possible to selectively
241 allow direct traffic exchanges between members of different VPNs,
242 assuming their respective IP addresses are non-overlapping.
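The import and export policies described above can be illustrated with a short Python sketch. It assumes that a route is imported when it carries at least one of the table's import Route Targets; the class names and the "target:ASN:number" string notation are illustrative assumptions, not defined by this document.

```python
from dataclasses import dataclass
from typing import FrozenSet, Set


@dataclass(frozen=True)
class VpnRoute:
    prefix: str
    next_hop: str
    label: int
    route_targets: FrozenSet[str] = frozenset()


@dataclass
class VrfPolicy:
    import_targets: Set[str]
    export_targets: Set[str]

    def export(self, prefix: str, next_hop: str, label: int) -> VpnRoute:
        """Attach this table's export Route Targets to a local route."""
        return VpnRoute(prefix, next_hop, label, frozenset(self.export_targets))

    def accepts(self, route: VpnRoute) -> bool:
        """Import a route if it carries at least one of our import targets."""
        return bool(self.import_targets & route.route_targets)
```

A symmetrical VPN would use the same set for import_targets and export_targets. Adding a second VPN's target to the import set of one table is one way to allow that table to receive routes from the other VPN.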
244                +--------+                 +--------+
245 VM1 -- veth0 --| host 1 |=== [network] ===| host 2 |-- veth0 -- VM2
246                +--------+                 +--------+
248 IP pkt ===> GRE encap ===> [IP net] ===> GRE decap ===> IP pkt
249             [192.168.2.1, 20]            map 20 to veth0
251 Figure 1
253 +----------------+--------------+-------+
254 | VPN IP address | Host address | label |
255 +----------------+--------------+-------+
256 | 10.1.1.1/32    | localhost    | 10    |
257 |                |              |       |
258 | 10.1.1.2/32    | 192.168.2.1  | 20    |
259 +----------------+--------------+-------+
261 VRF table on host1
263 Table 1
265 The figure and table above contain an example in which IP packets are
266 transmitted from one VPN interface (with address 10.1.1.1) to another
267 VPN interface (with address 10.1.1.2). As previously mentioned, the
268 virtual ethernet interfaces function as CE interfaces in a
269 traditional BGP-signaled IP VPN, while the end-systems provide the
270 forwarding functionality equivalent to a PE device.
272 +--------+       +-----------+       +--------+
273 | host 1 | <===> | signaling | <===> | BGP RR |
274 +--------+       | gateway   |       +--------+
275                  +-----------+
277 Figure 2
279 +----------------+-------------+-------+-----------+
280 | VPN IP address | SNPA        | label | Known via |
281 +----------------+-------------+-------+-----------+
282 | 10.1.1.1/32    | 192.168.1.1 | 10    | XMPP      |
283 |                |             |       |           |
284 | 10.1.1.2/32    | 192.168.2.1 | 20    | BGP       |
285 +----------------+-------------+-------+-----------+
287 VPN Routing table on signaling gateway
289 Table 2
291 The signaling network corresponding to the same example is depicted
292 above. The signaling gateway is an out-of-band system which speaks
293 both XMPP to the host and BGP to the BGP RRs. The table above
294 represents the routing table on the gateway that corresponds to the
295 VPN of the example. Host 2 would be connected to another signaling
296 gateway which would be in turn connected to the BGP RR mesh.
298 The gateway is configured via an external mechanism with the
299 parameters that correspond to the VPNs in use by its clients along
300 with their respective VRF import and export policies.
302 XMPP publish requests are translated into routing entries in this
303 table, which are then advertised via BGP, using standard BGP-signaled
304 IP VPN mechanisms. BGP-learned routes are also imported into this
305 routing table. Any changes to its content are advertised to local
306 XMPP clients.
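The behaviour of the gateway routing table described above can be sketched as follows. This is a minimal Python sketch under stated assumptions: the class and method names are hypothetical, BGP advertisement is modeled as appending to a list, and XMPP notification is modeled as calling a listener function.

```python
from typing import Callable, Dict, List, Tuple

Entry = Tuple[str, int, str]  # (SNPA, label, known via "XMPP" or "BGP")
Listener = Callable[[str, Entry], None]  # called with (prefix, entry)


class GatewayTable:
    """One VPN routing table on a signaling gateway."""

    def __init__(self) -> None:
        self.routes: Dict[str, Entry] = {}
        self.xmpp_subscribers: List[Listener] = []
        self.bgp_advertisements: List[Tuple[str, Entry]] = []

    def _store(self, prefix: str, entry: Entry) -> bool:
        """Install an entry; notify local XMPP clients only on a change."""
        if self.routes.get(prefix) == entry:
            return False
        self.routes[prefix] = entry
        for notify in self.xmpp_subscribers:
            notify(prefix, entry)
        return True

    def on_xmpp_publish(self, prefix: str, snpa: str, label: int) -> None:
        """A local end-system published a route: store it, advertise via BGP."""
        entry = (snpa, label, "XMPP")
        if self._store(prefix, entry):
            self.bgp_advertisements.append((prefix, entry))

    def on_bgp_route(self, prefix: str, snpa: str, label: int) -> None:
        """A route was learned from a BGP RR: store it, do not re-advertise."""
        self._store(prefix, (snpa, label, "BGP"))
```

Feeding the two routes of the example through on_xmpp_publish() and on_bgp_route() yields the contents shown in Table 2.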
308 In comparison with traditional IP VPNs, the signaling gateway is
309 performing the PE functionality, while XMPP is used as a PE-CE
310 routing protocol.
312 4. XMPP client interface
314 The communication between end-systems and the signaling gateway uses
315 the XMPP protocol with the PubSub Collection Nodes [pubsub] extension
316 in order to exchange VPN route information.
318 End-systems establish persistent XMPP sessions. These sessions MUST
319 use the XMPP Ping [xmpp-ping] extension in order to detect end-system
320 failures.
322 An End-system MAY connect to multiple VPN-signaling gateways for
323 reliability. In this case it SHOULD publish its information to each
324 of the gateways. It MAY choose to subscribe to VPN routing
325 information only once, from one of the available gateways.
327 The information advertised by an end-system SHOULD be deleted after a
328 configurable timeout, when the session closes. This timeout should
329 default to 60 seconds.
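The deletion timeout can be sketched as below. This is a minimal Python sketch, not a definitive implementation: the class and method names are hypothetical, the caller supplies the current time explicitly, and the rule that a reconnect before the timeout cancels the pending deletion is an assumption that this document does not state.

```python
from typing import Dict, List, Set

DEFAULT_TIMEOUT = 60.0  # seconds, the default given in the text


class SessionRoutes:
    """Tracks routes per end-system and expires them after a session closes."""

    def __init__(self, timeout: float = DEFAULT_TIMEOUT) -> None:
        self.timeout = timeout
        self.routes: Dict[str, Set[str]] = {}   # jid -> published prefixes
        self.closed_at: Dict[str, float] = {}   # jid -> time the session closed

    def publish(self, jid: str, prefix: str) -> None:
        self.routes.setdefault(jid, set()).add(prefix)

    def session_opened(self, jid: str) -> None:
        # Assumption: a reconnect before the timeout cancels the deletion.
        self.closed_at.pop(jid, None)

    def session_closed(self, jid: str, now: float) -> None:
        self.closed_at[jid] = now

    def expire(self, now: float) -> List[str]:
        """Delete and return the routes of sessions closed for >= timeout."""
        removed: List[str] = []
        for jid, closed in list(self.closed_at.items()):
            if now - closed >= self.timeout:
                removed.extend(sorted(self.routes.pop(jid, set())))
                del self.closed_at[jid]
        return removed
```

A gateway would call expire() periodically and withdraw the returned prefixes from BGP and from its local XMPP clients.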
331       +---------+             +--------+
332       | gateway | ----------- | BGP RR |
333       +---------+             +--------+
334        //         \          /
335     XMPP           \        /
336    //               \      /
337 +------------+       \    /
338 | end-system |        \  /
339 +------------+         \/
340    \\                  /\
341     XMPP              /  \
342        \\            /    \
343       +---------+   /      \  +--------+
344       | gateway | ----------- | BGP RR |
345       +---------+             +--------+
347 The figure above represents a typical configuration in which an end-
348 system is homed to multiple gateways, which are in turn connected to
349 multiple BGP route reflectors. In a deployment there would be a
350 number of gateways corresponding to the number of end-systems divided
351 by the gateway capacity in terms of number of XMPP sessions, while
352 the BGP RRs scale in terms of the number of gateways attached to them.
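The sizing rule above amounts to a ceiling division. The Python sketch below is illustrative only: the function and parameter names are hypothetical, and the multiplier for end-systems that are homed to more than one gateway is an assumption added here, since each additional gateway consumes one more XMPP session.

```python
def gateways_needed(end_systems: int, sessions_per_gateway: int,
                    gateways_per_end_system: int = 1) -> int:
    """Estimate the number of gateways from the XMPP session capacity."""
    if sessions_per_gateway <= 0:
        raise ValueError("gateway capacity must be positive")
    total_sessions = end_systems * gateways_per_end_system
    # Ceiling division without floating point.
    return -(-total_sessions // sessions_per_gateway)
```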
354 The XMPP "jid" used by the end-system shall be a 6-byte value that
355 uniquely identifies the host in the domain. This specification
356 recommends the use of the 802 MAC address of one of the physical
357 ethernet interfaces of the end-system, when present.
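Deriving the 6-byte identifier from a MAC address could look like the following Python sketch. The rendering of the identifier as lowercase hexadecimal in the local part of the jid, and the function names, are assumptions made for illustration; this document only requires a 6-byte value that is unique in the domain.

```python
def mac_to_system_id(mac: str) -> bytes:
    """Parse an 802 MAC address such as '01:02:03:04:ab:cd' into 6 bytes."""
    raw = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(raw) != 6:
        raise ValueError("an 802 MAC address is exactly 6 bytes")
    return raw


def system_id_to_jid(system_id: bytes, domain: str) -> str:
    """Render the 6-byte identifier as lowercase hex in the jid local part."""
    return f"{system_id.hex()}@{domain}"
```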
359 Each VPN shall be identified by a 64 ASCII character string.
361 The guest system software on an end-system SHALL establish an XMPP
362 session with its configured signaling gateways before creating
363 virtual interfaces.
365 When a virtual interface is created, for instance as a result of a
366 Virtual Machine being instantiated on an end-system, the host
367 operating-system software shall generate an XMPP Publish message to
368 the VPN-signaling gateway.
370 Publish request from end-system to gateway:
372
374 to='network-control.domain.org'
375 id='request1'>
376
377
378
379
380 'vpn-ip-address>/32'
381 'infrastructure-ip-address'
382
383
384
385
386
387
388
390
394
395
396
397
398
399
401 In the request above the node 'vpn-customer-name' is assumed to be a
402 collection which is implicitly created by the VPN-signaling gateway.
404 The VPN-signaling gateway will convert the information received in
405 the 'publish' request into the corresponding BGP route information
406 such that:
408 It associates the specific request with a local VRF with the name
409 specified in the collection 'node' attribute.
411 It creates a route with a 'Route Distinguisher' (RD)
412 containing the end-system's 'system-id' and the specified 'label'
413 and NLRI prefix.
415 It associates this route with the specified SNPA address.
417 It associates the route with an extended community (type TBD) containing
418 the version number.
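The conversion steps above can be summarized as a field-by-field mapping. The Python sketch below is hedged accordingly: all class and field names are hypothetical stand-ins for the elements of the publish request, and the RD is represented simply as the end-system's identifier rather than as an encoded 8-byte value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PublishRequest:
    node: str        # collection node, i.e. the VPN name
    system_id: str   # identifier of the publishing end-system
    nlri: str        # VPN IP prefix, e.g. "10.1.1.1/32"
    next_hop: str    # infrastructure IP (SNPA) address
    label: int
    version: int


@dataclass(frozen=True)
class BgpVpnRoute:
    vrf: str
    route_distinguisher: str
    prefix: str
    label: int
    snpa: str
    version_community: int


def publish_to_route(req: PublishRequest) -> BgpVpnRoute:
    """Map a publish request onto the BGP route information listed above."""
    return BgpVpnRoute(
        vrf=req.node,                       # local VRF named by the node
        route_distinguisher=req.system_id,  # RD carries the system-id
        prefix=req.nlri,
        label=req.label,
        snpa=req.next_hop,
        version_community=req.version,      # extended community, type TBD
    )
```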
420 Subscription request from end-system to gateway:
422
426
427
428
429
431 Update notification from gateway to end-system:
433
434
435
436
437
438 'vpn-ip-address>/32'
439 'infrastructure-ip-address'
440
441
442
443
444
445 ...
446
447
448
449
451 Notifications should be generated whenever a VPN route is added,
452 modified or deleted.
454 Note that the Update from the signaling gateway to the end-point does
455 not contain the system-id of the destination end-point. When
456 multiple possible routes exist for a given VPN IP address, for
457 instance because the VM may be in the process of moving location, it
458 is the responsibility of the gateway to select the best path to
459 advertise to the end-system.
461 In situations where an automated system is controlling the
462 instantiation of VMs it may be possible to have that system assign a
463 non-decreasing version number for each instantiation of that
464 particular VM. In that case, this number, carried in the 'version'
465 field may be used to help gateways select the most recent
466 instantiation of a VM during the interval of time where multiple
467 routes are present.
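Using the version number for path selection can be sketched as follows. This Python sketch assumes that a higher version means a more recent instantiation and that, on a tie, the first route seen is kept; both the tie-break and the names used are illustrative assumptions.

```python
from typing import Iterable, NamedTuple, Optional


class Candidate(NamedTuple):
    next_hop: str
    label: int
    version: int


def select_best(candidates: Iterable[Candidate]) -> Optional[Candidate]:
    """Prefer the route with the highest version number."""
    best: Optional[Candidate] = None
    for route in candidates:
        if best is None or route.version > best.version:
            best = route
    return best
```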
469 5. VPN NLRI
471 When a VPN-signaling gateway receives a request to create or modify a
472 VPN route it SHALL generate a BGP VPN route advertisement with the
473 corresponding information using the BGP address family corresponding
474 to the address family specified by the end-system.
476 It is assumed that the VPN-signaling gateways contain information
477 regarding the mapping between 'vpn-customer-names' and BGP Route
478 Targets used to import and export information from the associated
479 VRFs. This mapping is known via an out-of-band mechanism not
480 specified in this document.
482 Whenever a VRF in the gateway contains local routing information, the
483 gateway shall advertise the corresponding RT-Constraint route target
484 routes in BGP, which perform a parallel function to the subscription
485 requests in XMPP.
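The rule above can be sketched as a small function that computes which Route Targets the gateway should signal interest in. This is a minimal Python sketch, assuming that the RT-Constraint routes [RFC4684] carry the import targets of each VRF that currently holds local routes; the function name, the dictionary layout and the Route Target notation are hypothetical.

```python
from typing import Dict, List, Set


def rt_constraint_targets(
    import_targets: Dict[str, Set[str]],
    local_routes: Dict[str, List[str]],
) -> Set[str]:
    """Return the Route Targets to advertise as RT-Constraint routes.

    A VRF contributes its import targets only while it holds at least
    one locally learned route.
    """
    wanted: Set[str] = set()
    for vrf, targets in import_targets.items():
        if local_routes.get(vrf):
            wanted |= targets
    return wanted
```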
487 The 32-bit route version number defined in the XML schema is
488 advertised into BGP as an Extended Community with type TBD.
490 Signaling gateways SHOULD automatically assign a BGP route
491 distinguisher per VPN routing table.
493 6. Security Considerations
495 The signaling protocol defines the access control policies for each
496 virtual interface and any VM associated with it. It is important to
497 secure the end-system access to signaling gateways and the BGP
498 infrastructure itself.
500 The XMPP session between end-systems and the XMPP gateways MUST use
501 mutual authentication. One possible strategy is to distribute pre-
502 signed certificates to end-systems which are presented as proof of
503 authorization to the signaling gateway.
505 BGP sessions MUST be authenticated using a shared secret. This
506 document recommends that BGP speaking systems filter traffic on port
507 179 such that only IP addresses which are known to participate in the
508 BGP signaling protocol are allowed.
510 7. References
512 [RFC4023] Worster, T., Rekhter, Y., and E. Rosen, "Encapsulating
513 MPLS in IP or Generic Routing Encapsulation (GRE)",
514 RFC 4023, March 2005.
516 [RFC4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private
517 Networks (VPNs)", RFC 4364, February 2006.
519 [RFC4684] Marques, P., Bonica, R., Fang, L., Martini, L., Raszuk,
520 R., Patel, K., and J. Guichard, "Constrained Route
521 Distribution for Border Gateway Protocol/MultiProtocol
522 Label Switching (BGP/MPLS) Internet Protocol (IP) Virtual
523 Private Networks (VPNs)", RFC 4684, November 2006.
525 [xmpp-ping]
526 "XMPP Ping", XEP 0199, June 2009.
528 [pubsub] "PubSub Collection Nodes", XEP 0248, September 2010.
530 Authors' Addresses
532 Pedro Marques
534 Email: pedro.r.marques@gmail.com
536 Luyuan Fang
537 Cisco Systems
538 111 Wood Avenue South
539 Iselin, NJ 08830
541 Email: lufang@cisco.com
543 Ping Pan
544 Infinera Corp
545 140 Caspian Ct.
546 Sunnyvale, CA 94089
548 Email: ppan@infinera.com