Network working group X. Xu Internet Draft Huawei Category: Standard Track K. Lee China Telecom Expires: January 2013 July 16, 2012 BGP Tunnel Address Prefix Attribute and Tunnel Address Prefix Extended Community draft-xu-idr-tunnel-address-prefix-01 Status of this Memo This Internet-Draft is submitted to IETF in full conformance with the provisions of BCP 78 and BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet- Drafts. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt. The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html. This Internet-Draft will expire on January 16, 2013. Copyright Notice Copyright (c) 2009 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Xu Expires January 16, 2013 [Page 1] Internet-Draft Tunnel Address Prefix Attribute July 2012 Abstract This document describes a new BGP attribute referred to as Tunnel Address Prefix Attribute and a new BGP address specific extended community referred to as Tunnel Address Prefix Extended Community, both of which are intended for facilitating the load-balancing of IP/GRE tunneled traffic (e.g., L3VPN-over-GRE traffic) in the core of IP-enabled Packet Switch Networks (PSN). Conventions used in this document The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC-2119 [RFC2119]. Table of Contents 1. Introduction ................................................ 3 2. Terminology ................................................. 4 3. Tunnel Address Prefix Attribute ............................. 4 4. Tunnel Address Prefix Extended Community .................... 5 5. Functionality Description ................................... 6 5.1. Egress Routers ......................................... 6 5.2. Ingress Routers......................................... 6 5.3. Intermediate Routers ................................... 6 6. Applicability ............................................... 6 7. Security Considerations ..................................... 7 8. IANA Considerations ......................................... 7 9. Acknowledgements ............................................ 7 10. References ................................................. 7 10.1. Normative References .................................. 7 10.2. Informative References ................................ 7 Authors' Addresses ............................................. 8 Xu Expires January 16, 2013 [Page 2] Internet-Draft Tunnel Address Prefix Attribute July 2012 1. Introduction Equal Cost Multi-Path (ECMP) and Link Aggregation Group (LAG) are widely used in the core of IP-enabled Packet Switch Networks (PSN) for load-balancing purposes. Most core routers in the IP-enabled PSN are capable of load-balancing IP traffic flows across ECMP paths and/or LAG based on the hash of the five-tuple of UDP/TCP packets (i.e., source IP address, destination IP address, source port, destination port, and protocol) or some fields in the IP header of non-UDP/TCP packets (e.g., source IP address, destination IP address). However, in the L3VPN [RFC4364], L2VPN and Softwire mesh [RFC5565] scenarios, distinct customer traffic flows between a given tunnel endpoint pair (e.g., the PE pair in the L3VPN context) would be encapsulated with the same IP/GRE tunnel header prior to traversing the core of IP PSN. In addition, since the IP/GRE encapsulated traffic is neither TCP nor UDP, core routers therefore could only perform hash calculation on the fields in the IP header of IP/GRE tunnels. As a result, core routers could not achieve an effective load-balancing for these IP/GRE tunneled traffic flows in the core network due to the lack of adequate entropy information. [RFC5640] describes a method for improving the load-balancing in Softwire mesh networks [RFC5565]. However, this method requires core routers to be able to perform hash calculation on the fields including the specific "load-balancing" field contained in the L2TPv3 or GRE tunnel header. [Entropy-Label] proposes to use the "entropy labels" for achieving a better load-balancing for MPLS traffic flows in the core of MPLS-enabled PSN. Although the entropy label could be inserted in the "Key" field of the GRE header by ingress PE routers in the case where the PSN is IP enabled rather than MPLS enabled, it still requires core routers to be capable of performing hash calculation on the "entropy label" contained in the GRE tunnel header. Any of the above two load-balancing methods requires a change to the date plane of core routers. This document describes an alternative load-balancing method suitable for the above scenarios, which is backwards compatible to those already-deployed core routers that could only perform hash calculation on the fields in the IP header in the case of IP/GRE tunneled traffic flows. The basic idea of this method is: a given (tunnel) egress router signals to (tunnel) ingress routers a special prefix called "tunnel address prefix" via BGP and any addresses beginning with that prefix would be used by those ingress routers as tunnel destination addresses when tunneling traffic towards that egress router. Therefore distinct traffic flows between that tunnel endpoint pair could be encapsulated with as many different tunnel Xu Expires January 16, 2013 [Page 3] Internet-Draft Tunnel Address Prefix Attribute July 2012 destination addresses as possible. In this way, core routers could achieve a better load-balancing for those IP/GRE tunneled traffic through performing hash calculation just on the fields in the IP header. 2. Terminology This memo makes use of the terms defined in [RFC4364] and [RFC5565]. 3. Tunnel Address Prefix Attribute For a given BGP router to tell remote BGP routers what tunnel destination addresses could be used when tunneling traffic flows to it, a new BGP attribute contained in the Encapsulation SAFI [RFC5512], referred to as "Tunnel Address Prefix Attribute", could be used to indicate the available tunnel destination addresses to be used. The Tunnel Address Prefix attribute is an optional transitive attribute. The TLV is structured as follows: 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Tunnel Address Prefix Type | Length | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | | Value | | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ - Tunnel Address Prefix Type (2 octets): indicates the Value field of such TLV contains the tunnel address prefix information in the form of IP address and subnet mask pair. - Length (2 octets): indicates the total number of octets of the value field. If the AFI of the Encapsulation SAFI is IPv4, the length value is set to 64; otherwise if the AFI is IPv6, the length value is set to 256. - Value (variable): contains the tunnel address prefix information in the form of IP address and subnet mask pair. Xu Expires January 16, 2013 [Page 4] Internet-Draft Tunnel Address Prefix Attribute July 2012 4. Tunnel Address Prefix Extended Community Here, we also define an address specific extended community referred to as Tunnel Address Prefix extended community that can be attached to BGP UPDATE messages to indicate the available tunnel destination addresses to be used when tunneling traffic flows from an ingress router to an egress router. This extended community is useful in the case where the Encapsulation SAFI capability is not supported between BGP routers or one really wants to specify different tunnel destination address prefixes for distinct sets of traffic flows. For example, one wants to assign two different tunnel address prefixes for traffic flows destined for prefix X and Y respectively. IPv4 Tunnel Address Prefix extended community is as follows: 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | 0x01 | Sub-Type | Global Administrator | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Global Administrator (cont.) | Local Administrator | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ The value of the high-order octet of the extended type field is 0x01, which indicates it is transitive across ASes. The value of the low-order octet of the extended type field is to be defined. The Global Administrator sub-field contains an IPv4 unicast address while the Local Administrator sub-field contains the corresponding Prefix Length. IPv6 Tunnel Address Prefix extended community is as follows: 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | 0x00 | Sub-Type | Global Administrator | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Global Administrator (cont.) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Global Administrator (cont.) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Global Administrator (cont.) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Global Administrator (cont.) | Local Administrator | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Xu Expires January 16, 2013 [Page 5] Internet-Draft Tunnel Address Prefix Attribute July 2012 The value of the high-order octet of the extended type field is 0x00, which indicates it is transitive across ASes. The value of the low-order octet of the extended type field is to be defined. The Global Administrator sub-field contains an IPv6 unicast address while the Local Administrator sub-field contains the corresponding Prefix Length. 5. Functionality Description 5.1. Egress Routers An egress router could attach the above BGP Tunnel Address Prefix attribute or extended community to BGP UPDATE messages in order to indicate the available tunnel destination addresses to be used by any ingress routers when tunneling traffic flows to it. In addition, it SHOULD create the corresponding loopback interface for each IP address within that prefix and accordingly advertise a route for that prefix via IGP. As such, it could receive and process those IP/GRE tunneled traffic flows destined for any of those addresses beginning with that prefix. 5.2. Ingress Routers For an ingress router receiving the above BGP Tunnel Address Prefix attribute or extended community announced by a given egress router, it could use any addresses beginning with the tunnel address prefix, in addition to the BGP next-hop address contained in the MP_REACH_NLRI attribute, as tunnel destination addresses when tunneling traffic flows towards that egress router. 5.3. Intermediate Routers There is no special requirement on Intermediate Routers (i.e., core routers). In other words, they could perform load-balancing of the IP/GRE tunneled traffic on basis of the hash of the fields in the IP headers as normal. 6. Applicability The load-balancing approach described in this document is suitable for many scenarios including but not limited to L3VPN [RFC4364], 6PE [RFC4798], Softwire mesh [RFC5565], BGP free core and L2VPN including VPLS [RFC4761, RFC4762] and E-VPN [E-VPN]. In the existing VPLS case where BGP is used for auto-discovery, the above BGP Tunnel Address Prefix attribute or extended community would be attached to the BGP update messages as well. Once a customer MAC address is learnt against a given BGP next-hop address, any addresses beginning Xu Expires January 16, 2013 [Page 6] Internet-Draft Tunnel Address Prefix Attribute July 2012 with the Tunnel Address Prefix which is associated with that BGP next-hop address could be used as tunnel destination addresses when tunneling MAC frames destined for that MAC address. 7. Security Considerations TBD. 8. IANA Considerations The type code of the Tunnel Address Prefix Attribute needs to be allocated by IANA. Meanwhile, a new Sub-Type of the Address Specific BGP Extended Communities of IPv4 and IPv6 respectively SHOULD also be assigned by IANA. 9. Acknowledgements Thanks to Robert Raszuk for his valuable comments on this document. 10. References 10.1. Normative References [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997. 10.2. Informative References [RFC5640] Filsfils, C., Mohapatra, P and C. Pignataro, "Load- Balancing for Mesh Softwires", RFC 5640, August 2009. [RFC4360] Sangli, S., Tappan, D., and Y. Rekhter, "BGP Extended Communities Attribute", RFC 4360, February 2006. [RFC5701] Rekhter, Y., "IPv6 Address Specific BGP Extended Communities Attribute", RFC5701, November 2009. [RFC5512] Mohapatra, P. and E. Rosen, "The BGP Encapsulation Subsequent Address Family Identifier (SAFI) and the BGP Tunnel Encapsulation Attribute", RFC 5512, April 2009. [RFC5565] Wu, J., Cui, Y., Metz, C., and E. Rosen, "Softwire Mesh Framework", RFC 5565, June 2009. [RFC4364] "BGP/MPLS IP VPNs", Rosen, Rekhter, et. al., February 2006 Xu Expires January 16, 2013 [Page 7] Internet-Draft Tunnel Address Prefix Attribute July 2012 [Entropy-Label] Kompella, K., Drake, J., Amante, S., Henderickx, W., and L. Yong, "The Use of Entropy Labels in MPLS Forwarding", draft-ietf-mpls-entropy-label-01, work in progress, October, 2011. [RFC4761] Kompella, K. and Y. Rekhter, "Virtual Private LAN Service (VPLS) Using BGP for Auto-Discovery and Signaling", RFC 4761, January 2007. [RFC4762] Lasserre, M. and V. Kompella, "Virtual Private LAN Service (VPLS) Using Label Distribution Protocol (LDP) Signaling", RFC 4762, January 2007. [E-VPN] Aggarwal et al., "BGP MPLS Based Ethernet VPN", draft-ietf- l2vpn-evpn-00.txt, work in progress, February, 2012. Authors' Addresses Xiaohu Xu Huawei Technologies, Beijing, China Phone: +86-10-60610041 Email: xuxiaohu@huawei.com Kai Lee China Telecom, Beijing, China. Email: Leekai@ctbri.com.cn