In an email dated 26.11.2009 02:35:49 Western European Standard Time, dino at cisco.com writes:
Did anyone indicate if multicast was a fundamental requirement or fundamental component?
Dino
Multicast: It strikes me that multicast has never been a topic for RRG discussion, although it is mainly a routing issue.
I would like to know if the RRG is going to recommend an architecture that explicitly says it won't include multicast, or it will implicitly include it but lack any reference to how it will scale. But I do think "multicast" is "routing".
Stateless multicast would be a valuable topic, I thought three years ago when I tried to launch a corresponding EU-ICT project, which was rated "very poor" (not merely "poor", as NIRA was). However, stateless multicast can be done by changing the multicast philosophy: multicast must adjust, not the routing architecture. That is to say, it can be accomplished with the current BGP as well. Nevertheless, the combination "stateless multicast + location-based routing" might be even more beneficial, e.g. by inspiring future research on how to minimize the disadvantages/costs accepted along the way.
What the RRG could recommend is that we don't support a native multicast architecture in the core, while the edges run multicast as defined today and perform "replicated unicast".
I am attaching references to "replicated unicast" at the network layer as well as to core-based native replication as in LISP-Multicast.
Dino
Network Working Group D. Thaler
Internet-Draft M. Talwar
Intended status: Standards Track A. Aggarwal
Expires: December 29, 2008 Microsoft Corporation
L. Vicisano
Cisco Systems
T. Pusateri
!j
June 27, 2008
Automatic IP Multicast Without Explicit Tunnels (AMT)
draft-ietf-mboned-auto-multicast-09
Status of this Memo
By submitting this Internet-Draft, each author represents that any
applicable patent or other IPR claims of which he or she is aware
have been or will be disclosed, and any of which he or she becomes
aware will be disclosed, in accordance with Section 6 of BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
This Internet-Draft will expire on December 29, 2008.
Thaler, et al. Expires December 29, 2008 [Page 1]
Internet-Draft AMT June 2008
Abstract
Automatic Multicast Tunneling (AMT) allows multicast communication
amongst isolated multicast-enabled sites or hosts, attached to a
network which has no native multicast support. It also enables them
to exchange multicast traffic with the native multicast
infrastructure and does not require any manual configuration. AMT
uses an encapsulation interface so that no changes to a host stack or
applications are required, all protocols (not just UDP) are handled,
and there is no additional overhead in core routers.
Table of Contents
1. Introduction
2. Applicability
3. Requirements notation
4. Definitions
4.1. AMT Pseudo-Interface
4.2. AMT Gateway
4.3. AMT Site
4.4. AMT Relay Router
4.5. AMT Relay Anycast Prefix
4.6. AMT Relay Anycast Address
4.7. AMT Subnet Anycast Prefix
4.8. AMT Gateway Anycast Address
5. Overview
5.1. Receiving Multicast in an AMT Site
5.1.1. Scalability Considerations
5.1.2. Spoofing Considerations
5.1.3. Protocol Sequence for a Gateway Joining SSM Receivers to a Relay
5.2. Sourcing Multicast from an AMT site
5.2.1. Supporting Site-MBone Multicast
5.2.2. Supporting Site-Site Multicast
6. Message Formats
6.1. AMT Relay Discovery
6.1.1. Type
6.1.2. Reserved
6.1.3. Discovery Nonce
6.2. AMT Relay Advertisement
6.2.1. Type
6.2.2. Reserved
6.2.3. Discovery Nonce
6.2.4. Relay Address
6.3. AMT Request
6.3.1. Type
6.3.2. Reserved
6.3.3. Request Nonce
6.4. AMT Membership Query
6.4.1. Type
6.4.2. Reserved
6.4.3. Response MAC
6.4.4. Request Nonce
6.4.5. IGMP/MLD Query (including IP Header)
6.5. AMT Membership Update
6.5.1. Type
6.5.2. Reserved
6.5.3. Response MAC
6.5.4. Request Nonce
6.5.5. IGMP/MLD Message (including IP Header)
6.6. AMT IP Multicast Data
6.6.1. Type
6.6.2. Reserved
6.6.3. IP Multicast Data
7. AMT Gateway Details
7.1. At Startup Time
7.2. Gateway Group and Source Addresses
7.2.1. IPv4
7.2.2. IPv6
7.3. Joining Groups with MBone Sources
7.4. Responding to Relay Changes
7.5. Joining SSM Groups with AMT Gateway Sources
7.6. Receiving AMT Membership Updates by the Gateway
7.7. Sending data to SSM groups
8. Relay Router Details
8.1. At Startup time
8.2. Receiving Relay Discovery messages sent to the Anycast Address
8.3. Receiving Membership Updates from AMT Gateways
8.4. Receiving (S,G) Joins from the Native Side, for AMT Sources
9. IANA Considerations
9.1. IPv4 and IPv6 Anycast Prefix Allocation
9.1.1. IPv4
9.1.2. IPv6
9.2. IPv4 and IPv6 AMT Subnet Prefix Allocation
9.2.1. IPv4
9.2.2. IPv6
9.3. UDP Port number
10. Security Considerations
11. Contributors
12. Acknowledgments
13. References
13.1. Normative References
13.2. Informative References
Authors' Addresses
Intellectual Property and Copyright Statements
1. Introduction
The primary goal of this document is to foster the deployment of
native IP multicast by enabling a potentially large number of nodes
to connect to the already present multicast infrastructure.
Therefore, the techniques discussed here should be viewed as an
interim solution to help in the various stages of the transition to a
native multicast network.
To allow fast deployment, the solution presented here only requires
small and concentrated changes to the network infrastructure, and no
changes at all to user applications or to the socket API of end-
nodes' operating systems. The protocol introduced in this
specification can be deployed in a few strategically-placed network
nodes and in user-installable software modules (pseudo device drivers
and/or user-mode daemons) that reside underneath the socket API of
end-nodes' operating systems. This mechanism is very similar to that
used by "6to4" [RFC3056], [RFC3068] to get automatic IPv6
connectivity.
Effectively, AMT treats the unicast-only inter-network as a large
non-broadcast multi-access (NBMA) link layer, over which we require
the ability to multicast. To do this, multicast packets being sent
to or from a site must be encapsulated in unicast packets. If the
group has members in multiple sites, AMT encapsulation of the same
multicast packet will take place multiple times by necessity.
2. Applicability
AMT is not a substitute for native multicast or a statically
configured multicast tunnel for high traffic flow. Unicast
replication is required to reach multiple receivers that are not part
of the native multicast infrastructure. Unicast replication is also
required by non-native sources to different parts of the native
multicast infrastructure. However, this is no worse than regular
unicast distribution of streams and in most cases much better.
The following problems are addressed:
1. Allowing isolated sites/hosts to receive the SSM flavor of
multicast ([RFC4607]).
2. Allowing isolated non-NAT sites/hosts to transmit the SSM flavor
of multicast.
3. Allowing isolated sites/hosts to receive general multicast (ASM
[RFC1112]).
This document does not address allowing isolated sites/hosts to
transmit general multicast. We expect that other solutions (e.g.,
Tunnel Brokers, a la [RFC3053]) will be used for sites that desire
this capability.
Implementers should be aware that site administrators may have
configured administratively scoped multicast boundaries and a remote
gateway may provide a means to circumvent administrative boundaries.
Therefore, implementations should allow for the configuration of such
boundaries on relays and gateways and perform filtering as needed.
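As an illustration of the filtering recommended above, the following is a minimal sketch of a boundary check that a relay or gateway might apply before forwarding a join across the AMT pseudo-interface. The names SCOPED_BOUNDARIES, crosses_boundary, and filter_joins are hypothetical, and the configured ranges are only an example of what a site administrator might choose.

```python
import ipaddress

# Hypothetical operator configuration: group ranges that must not cross
# the AMT pseudo-interface. 239.0.0.0/8 is the IPv4 administratively
# scoped block; ff05::/16 is IPv6 site-local multicast scope.
SCOPED_BOUNDARIES = [
    ipaddress.ip_network("239.0.0.0/8"),
    ipaddress.ip_network("ff05::/16"),
]


def crosses_boundary(group: str) -> bool:
    """Return True if the group falls inside a configured boundary."""
    addr = ipaddress.ip_address(group)
    return any(addr.version == net.version and addr in net
               for net in SCOPED_BOUNDARIES)


def filter_joins(groups):
    """Keep only the groups that may be joined across the tunnel."""
    return [g for g in groups if not crosses_boundary(g)]
```

A join for 239.1.2.3 would be dropped by this sketch, while a join for an SSM group such as 232.1.1.1 would pass.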
3. Requirements notation
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119].
4. Definitions
+---------------+ Internet +---------------+
| AMT Site | | Native MCast |
| | | |
| +------+----+ AMT +----+----+ AMT |
| |AMT Gateway| Anycast |AMT Relay| Subnet |
| | +-----+-+ Prefix +-+-----+ | Prefix |
| | |AMT IF | <------------|AMT IF | |--------> |
| | +-----+-+ +-+-----+ | |
| +------+----+ +----+----+ |
| | | |
+---------------+ +---------------+
4.1. AMT Pseudo-Interface
AMT encapsulation of multicast packets inside unicast packets occurs
at a point that is logically equivalent to an interface, with the
link layer being the unicast-only network. This point is referred to
as a pseudo-interface. Some implementations may treat it exactly
like any other interface and others may treat it like a tunnel end-
point.
4.2. AMT Gateway
A host, or a site gateway router, supporting an AMT Pseudo-
Interface. It does not have native multicast connectivity to the
native multicast backbone infrastructure. It is simply referred to
in this document as a "gateway".
4.3. AMT Site
A multicast-enabled network not connected to the multicast backbone
served by an AMT Gateway. It could also be a stand-alone AMT
Gateway.
4.4. AMT Relay Router
A multicast router configured to support transit routing between AMT
Sites and the native multicast backbone infrastructure. The relay
router has one or more interfaces connected to the native multicast
infrastructure, zero or more interfaces connected to the non-
multicast capable inter-network, and an AMT pseudo-interface. It is
simply referred to in this document as a "relay".
As with [RFC3056], we assume that normal multicast routers do not
want to be tunnel endpoints (especially if this results in high fan
out), and similarly that service providers do not want encapsulation
to arbitrary routers. Instead, we assume that special-purpose
routers will be deployed that are suitable for serving as relays.
4.5. AMT Relay Anycast Prefix
A well-known address prefix used to advertise (into the unicast
routing infrastructure) a route to an available AMT Relay Router.
This could also be private (i.e., not well-known) for a private
relay.
Prefixes for both IPv4 and IPv6 will be assigned in a future version
of this draft.
4.6. AMT Relay Anycast Address
An anycast address which is used to reach the nearest AMT Relay
Router.
This address corresponds to setting the low-order octet of the
AMT Relay Anycast Prefix to 1 (for both IPv4 and IPv6).
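The derivation of the anycast address from the prefix can be sketched as follows. Because no prefix has been assigned yet, the example uses the documentation prefixes 192.0.2.0/24 and 2001:db8::/32 purely as stand-ins, and relay_anycast_address is a hypothetical helper name.

```python
import ipaddress


def relay_anycast_address(prefix: str):
    """Set the low-order octet of the prefix's network address to 1."""
    net = ipaddress.ip_network(prefix)
    base = int(net.network_address)
    # Clear the low-order octet, then set it to 1. Reusing the address
    # class of the prefix keeps the result in the same family.
    return type(net.network_address)((base & ~0xFF) | 0x01)


# Stand-in prefixes (documentation ranges), not assigned AMT prefixes.
print(relay_anycast_address("192.0.2.0/24"))   # 192.0.2.1
print(relay_anycast_address("2001:db8::/32"))  # 2001:db8::1
```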
4.7. AMT Subnet Anycast Prefix
A well-known address prefix used to advertise (into the M-RIB of the
native multicast-enabled infrastructure) a route to AMT Sites. This
prefix will be used to enable sourcing SSM traffic from an AMT
Gateway.
4.8. AMT Gateway Anycast Address
An anycast address in the AMT Subnet Anycast Prefix range, which is
used by an AMT Gateway to enable sourcing SSM traffic from local
applications.
5. Overview
5.1. Receiving Multicast in an AMT Site
Internet
+---------------+ +---------------+
| AMT Site | 2. 3-way Membership | MBone |
| | Handshake | |
| 1. Join +---+---+ =================> +---+---+ |
| +---->|Gateway| | Relay | |
| | +---+---+ <================= +---+---+ |
| R-+ | 3. Receive Data | |
+---------------+ +---------------+
Receiving Multicast in an AMT Site
AMT relays and gateways cooperate to transmit multicast traffic
sourced within the native multicast infrastructure to AMT sites:
relays receive the traffic natively and unicast-encapsulate it to
gateways; gateways decapsulate the traffic and possibly forward it
into the AMT site.
Each gateway has an AMT pseudo-interface that serves as a default
multicast route. Requests to join a multicast session are sent to
this interface and encapsulated to a particular relay reachable
across the unicast-only infrastructure.
Each relay has an AMT pseudo-interface too. Multicast traffic sent
on this interface is encapsulated to zero or more gateways that have
joined to the relay. The AMT recipient-list is determined for each
multicast session. This requires the relay to keep state for each
gateway which has joined a particular group or (source, group) pair.
Multicast packets from the native infrastructure behind the relay
will be sent to each gateway which has requested them.
All multicast packets (data and control) are encapsulated in unicast
packets. UDP encapsulation is used for all AMT control and data
packets using the IANA reserved UDP port number for AMT.
Each relay, plus the set of all gateways using the relay, together
are thought of as being on a separate logical NBMA link. This
implies that the AMT recipient-list is a list of "link layer"
addresses which are (IP address, UDP port) pairs.
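One possible shape for the per-session recipient list is sketched below. It is an illustration only: the class RelayState and its methods are hypothetical names, and a source of None is used here as an assumed way to represent a group-only (ASM) join.

```python
from collections import defaultdict


class RelayState:
    """Per-session AMT recipient lists kept by a relay (sketch)."""

    def __init__(self):
        # (source, group) -> set of "link layer" addresses, each an
        # (IP address, UDP port) pair identifying one gateway.
        self.recipients = defaultdict(set)

    def join(self, source, group, gw_ip, gw_port):
        self.recipients[(source, group)].add((gw_ip, gw_port))

    def leave(self, source, group, gw_ip, gw_port):
        key = (source, group)
        self.recipients[key].discard((gw_ip, gw_port))
        if not self.recipients[key]:
            # No gateway left for this session: drop the state entirely.
            del self.recipients[key]

    def targets(self, source, group):
        """Gateways that must receive an encapsulated copy of a packet."""
        return sorted(self.recipients.get((source, group), ()))
```

Note that no state exists for a gateway until it joins a session, which matches the scalability goal of not tracking idle gateways.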
Since the number of gateways using a relay can be quite large, and we
expect that most sites will not want to receive most groups, an
explicit-joining protocol is required for gateways to communicate
group membership information to a relay. The two most likely
candidates are the IGMP/MLD protocol [RFC3376], [RFC3810], and the
PIM-Sparse Mode protocol [RFC4601]. Since an AMT gateway may be a
host, and hosts typically do not implement routing protocols,
gateways will use IGMP/MLD as described in Section 7 below. This
allows a host kernel (or a pseudo device driver) to easily implement
AMT gateway behavior, and spares the relay from needing to know
whether a given gateway is a host or a router. From the relay's
perspective, all gateways are indistinguishable from hosts on an NBMA
leaf network.
5.1.1. Scalability Considerations
It is possible that millions of hosts will enable AMT gateway
functionality and so an important design goal is not to create
gateway state in each relay until the gateway joins a multicast
group. But even the requirement that a relay keep group state per
gateway that has joined a group introduces potential scalability
concerns.
Scalability of AMT can be achieved by adding more relays, and using
an appropriate relay discovery mechanism for gateways to discover
relays. The solution we adopt is to assign addresses in anycast
fashion to relays [RFC1546], [RFC4291]. However, simply sending
periodic membership reports to an anycast address can cause
duplicates. Specifically, if routing changes such that a different
relay receives a periodic membership report, both the new and old
relays will encapsulate data to the AMT site until the old relay's
state times out. This is obviously undesirable. Instead, we use the
anycast address merely to find the unicast address of a relay to
which membership reports are sent.
Since adding another relay has the result of adding another
independent NBMA link, this allows the gateways to be spread out
among more relays so as to keep the number of gateways per relay at a
reasonable level.
5.1.2. Spoofing Considerations
An attacker could affect the group state in the relay or gateway by
spoofing the source address in the join or leave reports. This can
be used to launch reflection or denial of service attacks on the
target. Such attacks can be mitigated by using a three way handshake
between the gateway and the relay for each multicast membership
report or leave.
When a gateway or relay wants to send a membership report, it first
sends an AMT Request with a request nonce in it. The receiving side
(the respondent) can calculate a message authentication code (MAC)
based on (for example) the source IP address of the Request, the
source UDP port, the request nonce, and a secret key known only to
the respondent. The algorithm and the input used to calculate the
MAC do not have to be standardized since the respondent generates
and verifies the MAC and the originator simply echoes it.
An AMT Membership Query is sent back including the request nonce and
the MAC to the originator of the Request. The originator then sends
the IGMP/MLD Membership/Listener Report or Leave/Done (including the
IP Header) along with the request nonce and the received MAC back to
the respondent finalizing the 3-way handshake.
Upon reception, the respondent can recalculate the MAC based on the
source IP address, the source UDP port, the request nonce, and the
local secret. The IGMP/MLD message is only accepted if the received
MAC matches the calculated MAC.
The local secret never has to be shared with the other side. It is
only used to verify return routability of the originator.
Since the same Request Nonce and source IP address can be re-used,
the receiver SHOULD change its secret key at least once per hour.
However, AMT Membership updates received with the previous secret
MUST be accepted for up to the IGMP/MLD Query Interval.
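Since the MAC algorithm is left to the respondent, the following is only one way it could be done. The sketch assumes HMAC-SHA256 truncated to 6 bytes over the source address, source port, and request nonce; the class name Respondent, the truncation length, and the 16-byte secret are all assumptions of this example. The previous secret is kept so that updates issued just before a key change still verify; discarding it once the Query Interval has elapsed is left to the caller.

```python
import hashlib
import hmac
import ipaddress
import os
import struct


class Respondent:
    """Generates and verifies the MAC without per-originator state."""

    def __init__(self):
        self.current = os.urandom(16)   # local secret, never shared
        self.previous = None            # secret in use before the last rotation

    def rotate(self):
        """Change the secret, e.g. once per hour."""
        self.previous, self.current = self.current, os.urandom(16)

    @staticmethod
    def _mac(secret, src_ip, src_port, nonce):
        msg = ipaddress.ip_address(src_ip).packed
        msg += struct.pack("!HI", src_port, nonce)
        # Truncation to 6 bytes is an assumption of this sketch.
        return hmac.new(secret, msg, hashlib.sha256).digest()[:6]

    def make_mac(self, src_ip, src_port, nonce):
        return self._mac(self.current, src_ip, src_port, nonce)

    def verify(self, src_ip, src_port, nonce, mac):
        secrets = [s for s in (self.current, self.previous) if s is not None]
        return any(
            hmac.compare_digest(self._mac(s, src_ip, src_port, nonce), mac)
            for s in secrets
        )
```

Because the MAC is recomputed from the packet's own source address and port, an update sent from a spoofed address fails verification unless the attacker also saw the Query sent to that address.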
5.1.3. Protocol Sequence for a Gateway Joining SSM Receivers to a Relay
This description assumes the Gateway can be a host joining as a
receiver or a network device acting as a Gateway when a directly
connected host joins as a receiver.
o Receiver at AMT site sends IGMPv3/MLDv2 report joining (S1,G1).
o Gateway receives report. If it has no tunnel state with a Relay,
it originates an AMT Relay Discovery message addressed to the
Anycast Relay IP address. The AMT Relay Discovery message can be
sent on demand if no relay is known at this time or at startup and
be periodically refreshed.
o The topologically closest Relay receives the AMT Relay Discovery
message and returns the nonce from the Discovery in an AMT Relay
Advertisement message so the Gateway can learn of the Relay's
unique IP address.
o When the Gateway receives the AMT Relay Advertisement message, it
now has an address to use for all subsequent (S,G) entries it will
join on behalf of attached receivers (or itself).
o If the gateway has a valid Response MAC from a previous AMT Query
message, it can send an AMT Membership Update message as described
below. Otherwise, the Gateway sends an AMT Request message to the
Relay's unique IP address to begin the process of joining the
(S,G). The gateway also SHOULD initialize a timer used to send
periodic Requests to a random value from the interval [0, [Query
Interval]] before sending the first periodic report, in order to
prevent startup synchronization.
o The Relay responds to the AMT Request message by returning the
nonce from the Request in an AMT Query message. The Query message
contains an IGMP/MLD QUERY indicating how often the Gateway should
repeat AMT Request messages so the (S,G) state can stay refreshed
in the Relay. The Query message also includes an opaque security
code which is generated locally (with no external coordination).
o When the Gateway receives the AMT Query message it responds by
copying the security code from the AMT Query message into an AMT
Membership Update message. The Update message contains (S1,G1) in
an IGMPv3/MLDv2 formatted packet with an IP header. The nonce
from the AMT Request is also included in the AMT Membership Update
message.
o When the Relay receives the AMT Membership Update, it will add the
tunnel to the Gateway in its outgoing interface list for its
(S1,G1) entry stored in the multicast routing table. If the
(S1,G1) entry was created due to this interaction, the multicast
routing protocol running on the Relay will trigger a Join message
towards source S1 to build a native multicast tree in the native
multicast infrastructure.
o As packets are sent from the host S1, they will travel natively
down the multicast tree associated with (S1,G1) in the native
multicast infrastructure to the Relay. The Relay will replicate
to all interfaces in its outgoing interface list as well as the
tunnel outgoing interface, which is encapsulated in a unicast AMT
Multicast Data message.
o When the Gateway receives the AMT Multicast Data message, it will
accept the packet since it was received over the pseudo-interface
associated with the tunnel to the Relay it had attached to, and
forward the packet to the outgoing interfaces joined by any
attached receiver hosts (or deliver the packet to the application
when the Gateway is the receiver).
o If later (S2,G2) is joined by a receiver, a 3-way handshake of
Request/Query/Update occurs for this entry. The Discovery/
Advertisement exchange is not required.
o To keep the state for (S1,G1) and (S2,G2) alive in the Relay, the
Gateway will send periodic AMT Membership Updates. The Membership
Update can be sent directly if the sender has a valid nonce from a
previous Request. If not, an AMT Request message should be sent
to solicit a Query Message. When sending a periodic state
refresh, all joined state in the Gateway is packed in the fewest
number of AMT Membership Update messages.
o When the Gateway leaves all (S,G) entries, the Relay can free
resources associated with the tunnel. It is assumed that when the
Gateway would want to join an (S,G) again, it would start the
Discovery/Advertisement tunnel establishment process over again.
This same procedure would be used for receivers who operate in Any-
Source Multicast (ASM) mode.
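The message sequence above can be traced with a small in-memory simulation. This is a sketch under several assumptions: messages are plain dictionaries rather than UDP packets, the gateway is handed the relay object directly instead of reaching it through the anycast address, and the opaque security code is an HMAC chosen for this example only. The class and method names are hypothetical.

```python
import hashlib
import hmac
import os


class Relay:
    """Toy relay: answers Discovery, Request, and Membership Update."""

    def __init__(self, unicast_addr):
        self.unicast_addr = unicast_addr
        self._secret = os.urandom(16)   # local secret, never shared
        self.oif = {}                   # (S, G) -> set of gateway addresses

    def _code(self, gateway, nonce):
        # Opaque security code derived without per-gateway state.
        msg = f"{gateway}|{nonce}".encode()
        return hmac.new(self._secret, msg, hashlib.sha256).digest()[:6]

    def on_discovery(self, nonce):
        # Relay Advertisement: echo the nonce, reveal the unicast address.
        return {"nonce": nonce, "relay": self.unicast_addr}

    def on_request(self, gateway, nonce):
        # Membership Query: echo the nonce, attach the security code.
        return {"nonce": nonce, "mac": self._code(gateway, nonce)}

    def on_update(self, gateway, nonce, mac, sg):
        # Membership Update: accept only if the echoed code verifies.
        if not hmac.compare_digest(mac, self._code(gateway, nonce)):
            return False
        self.oif.setdefault(sg, set()).add(gateway)
        return True


class Gateway:
    def __init__(self, addr):
        self.addr = addr
        self.relay_addr = None          # learned from the Advertisement

    @staticmethod
    def _nonce():
        return int.from_bytes(os.urandom(4), "big")

    def join(self, relay, sg):
        if self.relay_addr is None:     # Discovery/Advertisement, once
            nonce = self._nonce()
            adv = relay.on_discovery(nonce)
            if adv["nonce"] != nonce:
                return False
            self.relay_addr = adv["relay"]
        nonce = self._nonce()           # Request/Query/Update, per join
        query = relay.on_request(self.addr, nonce)
        if query["nonce"] != nonce:
            return False
        return relay.on_update(self.addr, nonce, query["mac"], sg)
```

Joining a second (S,G) through the same Gateway object skips the Discovery/Advertisement exchange and repeats only the 3-way handshake, as in the sequence described above.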
5.2. Sourcing Multicast from an AMT site
Two cases are discussed below: multicast traffic sourced in an AMT
site and received in the MBone, and multicast traffic sourced in an
AMT site and received in another AMT site.
In both cases only SSM sources are supported. Furthermore this
specification only deals with the source residing directly in the
gateway. To enable a generic node in an AMT site to source
multicast, additional coordination between the gateway and the
source-node is required.
The gateway SHOULD allow for filtering link-local and site-local
traffic.
The general mechanism used to join towards AMT sources is based on
the following:
1. Applications residing in the gateway use addresses in the AMT
Subnet Anycast Prefix to send multicast, as a result of sourcing
traffic on the AMT pseudo-interface.
2. The AMT Subnet Anycast Prefix is advertised for RPF reachability
in the M-RIB by relays and gateways.
3. Relays or gateways that receive a join for a source/group pair
use information encoded in the address pair to rebuild the
address of the gateway (source) to which to encapsulate the join
(see Section 7.2 for more details). The membership reports use
the same three way handshake as outlined in Section 5.1.2.
5.2.1. Supporting Site-MBone Multicast
Internet
+---------------+ +---------------+
| AMT Site | 2. 3-way Membership | MBone |
| | Handshake | |
| +---+---+ <================= +---+---+ 1. Join |
| |Gateway| | Relay |<-----+ |
| +---+---+ =================> +---+---+ | |
| | 3. Receive Data | +-R |
+---------------+ +---------------+
Site-MBone Multicast
If a relay receives an explicit join from the native infrastructure,
for a given (source, group) pair where the source address belongs to
the AMT Subnet Anycast Prefix, then the relay will periodically
(using the rules specified in Section 5.1.2) encapsulate membership
updates for the group to the gateway. The gateway must keep state
per relay from which membership reports have been sent, and forward
multicast traffic from the site to all relays from which membership
reports have been received. The choice of whether this state and
replication is done at the link-layer (i.e., by the tunnel interface)
or at the network-layer is implementation dependent.
If there are multiple relays present, this ensures that data from the
AMT site is received via the closest relay to the receiver. This is
necessary when the routers in the native multicast infrastructure
employ Reverse-Path Forwarding (RPF) checks against the source
address, such as occurs when PIM Sparse-Mode [RFC4601] is used by the
multicast infrastructure.
The solution above will scale to an arbitrary number of relays, as
long as the number of relays requiring multicast traffic from a given
AMT site remains reasonable enough to not overly burden the site's
gateway.
A source at or behind an AMT gateway requires the gateway to do the
replication to one or more relays and receiving gateways. If this
places too much of a burden on the sourcing gateway, the source
should join the native multicast infrastructure through a permanent
tunnel so that replication occurs within the native multicast
infrastructure.
5.2.2. Supporting Site-Site Multicast
Internet
+---------------+ +---------------+
| AMT Site | 2. 3-way Membership | AMT Site |
| | Handshake | |
| +---+---+ <================= +---+---+ 1. Join |
| |Gateway| |Gateway|<-----+ |
| +---+---+ =================> +---+---+ | |
| | 3. Receive Data | +-R |
+---------------+ +---------------+
Site-Site Multicast
Since we require gateways to accept membership reports, as described
above, it is also possible to support multicast among AMT sites,
without requiring assistance from any relays.
When a gateway wants to join a given (source, group) pair, where the
source address belongs to the AMT Subnet Anycast Prefix, then the
gateway will periodically unicast encapsulate an IGMPv3/MLDv2 Report
[RFC3376], [RFC3810] (including IP Header) directly to the site
gateway for the source.
We note that this can result in a significant amount of state at a
site gateway sourcing multicast to a large number of other AMT sites.
However, it is expected that this is not unreasonable for two
reasons. First, the gateway does not have native multicast
connectivity, and as a result is likely doing unicast replication at
present. The amount of state is thus the same as what such a site
already deals with. Secondly, any site expecting to source traffic
to a large number of sites could get a point-to-point tunnel to the
native multicast infrastructure, and use that instead of AMT.
6. Message Formats
6.1. AMT Relay Discovery
The AMT Relay Discovery message is a UDP packet sent from the AMT
gateway unicast address to the AMT relay anycast address to discover
the unicast address of an AMT relay.
The UDP source port is uniquely selected by the local host operating
system. The UDP destination port is the IANA reserved AMT port
number. The UDP checksum MUST be valid in AMT control messages.
The payload of the UDP packet contains the following fields.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   Type=0x1    |                   Reserved                    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Discovery Nonce                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
AMT Relay Discovery
6.1.1. Type
The type of the message.
6.1.2. Reserved
A 24-bit reserved field. Sent as 0, ignored on receipt.
6.1.3. Discovery Nonce
A 32-bit random value generated by the gateway and replayed by the
relay.
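The Relay Discovery payload described above is eight octets and can be built as in the following sketch. The function name build_relay_discovery and the constant name are hypothetical; the field layout follows the format shown in this section.

```python
import os
import struct

AMT_RELAY_DISCOVERY = 0x1


def build_relay_discovery(nonce=None):
    """Return the 8-octet UDP payload of an AMT Relay Discovery."""
    if nonce is None:
        # 32-bit random value; the relay is expected to echo it back.
        nonce = int.from_bytes(os.urandom(4), "big")
    # First 32-bit word: 8-bit Type in the high octet, followed by the
    # 24-bit Reserved field, which is sent as 0.
    return struct.pack("!II", AMT_RELAY_DISCOVERY << 24, nonce)
```

The resulting bytes would be sent over UDP from the gateway's unicast address to the relay anycast address on the AMT port.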
6.2. AMT Relay Advertisement
The AMT Relay Advertisement message is a UDP packet sent from the AMT
relay anycast address to the source of the discovery message.
The UDP source port is the IANA reserved AMT port number and the UDP
destination port is the source port received in the Discovery
message. The UDP checksum MUST be valid in AMT control messages.
The payload of the UDP packet contains the following fields.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   Type=0x2    |                   Reserved                    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Discovery Nonce                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         Relay Address                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
AMT Relay Advertisement
6.2.1. Type
The type of the message.
6.2.2. Reserved
A 24-bit reserved field. Sent as 0, ignored on receipt.
6.2.3. Discovery Nonce
A 32-bit random value generated by the gateway and replayed by the
relay.
6.2.4. Relay Address
The unicast IPv4 or IPv6 address of the AMT relay. The family can be
determined by the length of the Advertisement.
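The length-based family detection can be sketched as follows. This is a minimal Python illustration, assuming the payload consists of exactly the three fields in the figure (so 12 bytes for IPv4 and 24 bytes for IPv6); the function name is hypothetical.

```python
import ipaddress
import struct

AMT_RELAY_ADVERTISEMENT = 0x2  # Type value shown in the Relay Advertisement figure


def parse_relay_advertisement(payload: bytes, sent_nonce: bytes):
    """Return the relay's unicast address from a Relay Advertisement payload."""
    # 4 bytes of Type/Reserved and 4 bytes of nonce, followed by 4 (IPv4) or
    # 16 (IPv6) bytes of Relay Address: the total length selects the family.
    if len(payload) == 12:
        address_class = ipaddress.IPv4Address
    elif len(payload) == 24:
        address_class = ipaddress.IPv6Address
    else:
        raise ValueError("unexpected Relay Advertisement length")
    msg_type, nonce = struct.unpack("!B3x4s", payload[:8])
    if msg_type != AMT_RELAY_ADVERTISEMENT:
        raise ValueError("not a Relay Advertisement message")
    if nonce != sent_nonce:
        raise ValueError("nonce does not match the Discovery message")
    return address_class(payload[8:])
```

Rejecting a mismatched nonce is a design choice of this sketch; the draft only states that the relay replays the value generated by the gateway.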
6.3. AMT Request
A Request packet is sent to begin a 3-way handshake for sending an
IGMP/MLD Membership/Listener Report or Leave/Done. It can be sent
from a gateway to a relay, from a gateway to another gateway, or from
a relay to a gateway.
It is sent from the originator's unique unicast address to the
respondent's unique unicast address.
The UDP source port is uniquely selected by the local host operating
system. It can be different for each Request and different from the
source port used in Discovery messages but does not have to be. The
UDP destination port is the IANA reserved AMT port number. The UDP
checksum MUST be valid in AMT control messages.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   Type=0x3    |                   Reserved                    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         Request Nonce                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
AMT Request
6.3.1. Type
The type of the message.
6.3.2. Reserved
A 24-bit reserved field. Sent as 0, ignored on receipt.
6.3.3. Request Nonce
A 32-bit identifier used to distinguish this request.
6.4. AMT Membership Query
An AMT Membership Query packet is sent from the respondent back to
the originator to solicit an AMT Membership Update while confirming
the source of the original request. It contains a relay Message
Authentication Code (MAC) that is a cryptographic hash of a private
secret, the originator's address, and the request nonce.
It is sent from the destination address received in the Request to
the source address received in the Request, which is the same address
used in the Relay Advertisement.
The UDP source port is the IANA reserved AMT port number and the UDP
destination port is the source port received in the Request message.
The UDP checksum MUST be valid in AMT control messages.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   Type=0x4    |   Reserved    |         Response MAC          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                   Response MAC (continued)                    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         Request Nonce                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          IGMP Membership Query or MLD Listener Query          |
|                     (including IP Header)                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                              ...                              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
AMT Membership Query
6.4.1. Type
The type of the message.
6.4.2. Reserved
An 8-bit reserved field. Sent as 0, ignored on receipt.
6.4.3. Response MAC
A 48-bit hash generated by the respondent and sent to the originator
for inclusion in the AMT Membership Update. The algorithm used for
this is chosen by the respondent but an algorithm such as HMAC-MD5-48
[RFC2104] SHOULD be used at a minimum.
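One possible way for a respondent to compute and check such a MAC is sketched below in Python, using HMAC-MD5 truncated to 48 bits. The input encoding (packed originator address followed by the Request Nonce) and the function names are assumptions of this sketch; the draft leaves the algorithm to the respondent.

```python
import hashlib
import hmac
import ipaddress


def response_mac(secret: bytes, originator: str, request_nonce: bytes) -> bytes:
    """Compute a 48-bit Response MAC in the style of HMAC-MD5-48."""
    # Assumption: the MAC input is the packed originator address followed by
    # the 32-bit Request Nonce; the draft does not fix an input encoding.
    data = ipaddress.ip_address(originator).packed + request_nonce
    return hmac.new(secret, data, hashlib.md5).digest()[:6]  # keep 48 bits


def mac_is_valid(secret: bytes, originator: str, request_nonce: bytes,
                 received_mac: bytes) -> bool:
    """Recompute the MAC for an incoming Membership Update and compare."""
    expected = response_mac(secret, originator, request_nonce)
    return hmac.compare_digest(expected, received_mac)
```

Because the MAC can be recomputed from the private secret and the fields echoed in the Membership Update, a respondent following this approach does not need to store per-request state between the Query and the Update.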
6.4.4. Request Nonce
A 32-bit identifier used to distinguish this request, echoed back to
the originator.
6.4.5. IGMP/MLD Query (including IP Header)
The message contains either an IGMP Query or an MLD Multicast
Listener Query. The IGMP or MLD version sent should default to
IGMPv3 or MLDv2 unless explicitly configured to use IGMPv2 or MLDv1.
The IGMP/MLD Query includes a full IP Header. The IP source address
of the query would match the anycast address on the pseudo interface.
The TTL of the outer header should be sufficient to reach the tunnel
endpoint and not mimic the inner header TTL which is typically 1 for
IGMP/MLD messages.
6.5. AMT Membership Update
An AMT Membership Update is sent to report a membership after a valid
Response MAC has been received. It contains the original IGMP/MLD
Membership/Listener Report or Leave/Done received over the AMT
pseudo-interface including the original IP header. It echoes the
Response MAC received in the AMT Membership Query so the respondent
can verify return routability to the originator.
It is sent from the destination address received in the Query to the
source address received in the Query which should both be the same as
the original Request.
The UDP source and destination port numbers should be the same ones
sent in the original Request.
The relay is not required to use the IP source address of the IGMP
Membership Report for any particular purpose.
The same Request Nonce and Response MAC can be used across multiple
AMT Membership Update messages without having to send individual AMT
Membership Query messages.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   Type=0x5    |   Reserved    |         Response MAC          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                   Response MAC (continued)                    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         Request Nonce                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           IGMP or MLD Message (including IP header)           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                              ...                              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
AMT Membership Update
6.5.1. Type
The type of the message.
6.5.2. Reserved
An 8-bit reserved field. Sent as 0, ignored on receipt.
6.5.3. Response MAC
The 48-bit MAC received in the Membership Query and echoed back in
the Membership Update.
6.5.4. Request Nonce
A 32-bit identifier used to distinguish this request.
6.5.5. IGMP/MLD Message (including IP Header)
The message contains either an IGMP Membership Report, an IGMP
Membership Leave, an MLD Multicast Listener Report, or an MLD
Listener Done. The IGMP or MLD version sent should be in response to
the version of the query received in the AMT Membership Query. The
IGMP/MLD Message includes a full IP Header.
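The Membership Update layout of Section 6.5 (a 12-byte header followed by the encapsulated IGMP/MLD packet) can be sketched in Python as below. The names are hypothetical, and the encapsulated packet is treated as opaque bytes rather than being built or validated.

```python
import struct

AMT_MEMBERSHIP_UPDATE = 0x5  # Type value shown in the Membership Update figure
# Type (8 bits), Reserved (8 bits), Response MAC (48 bits), Request Nonce (32 bits)
_HEADER = struct.Struct("!BB6s4s")


def build_membership_update(response_mac: bytes, request_nonce: bytes,
                            report: bytes) -> bytes:
    """Prefix an IGMP/MLD packet (with its IP header) with the AMT header."""
    if len(response_mac) != 6 or len(request_nonce) != 4:
        raise ValueError("MAC must be 48 bits and nonce 32 bits")
    header = _HEADER.pack(AMT_MEMBERSHIP_UPDATE, 0, response_mac, request_nonce)
    return header + report


def parse_membership_update(payload: bytes):
    """Return (response_mac, request_nonce, encapsulated_packet)."""
    if len(payload) < _HEADER.size:
        raise ValueError("Membership Update is too short")
    msg_type, _reserved, mac, nonce = _HEADER.unpack_from(payload)
    if msg_type != AMT_MEMBERSHIP_UPDATE:
        raise ValueError("not a Membership Update message")
    return mac, nonce, payload[_HEADER.size:]
```

The MAC and nonce passed to the builder are the values echoed from the preceding AMT Membership Query.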
6.6. AMT IP Multicast Data
The AMT Data message is a UDP packet encapsulating the IP Multicast
data requested by the originator based on a previous AMT Membership
Update message.
It is sent from the unicast destination address of the Membership
Update to the source address of the Membership Update.
The UDP source and destination port numbers should be the same ones
sent in the original Query. The UDP checksum SHOULD be 0 in the AMT
IP Multicast Data message.
The payload of the UDP packet contains the following fields.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   Type=0x6    |   Reserved    |     IP Multicast Data ...     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                              ...                              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
AMT IP Multicast Data
6.6.1. Type
The type of the message.
6.6.2. Reserved
An 8-bit reserved field. Sent as 0, ignored on receipt.
6.6.3. IP Multicast Data
The original IP Multicast data packet that is being replicated by the
relay to the gateways including the original IP header.
7. AMT Gateway Details
This section details the behavior of an AMT Gateway, which may be a
router serving an AMT site or a single host serving as its own
gateway.
7.1. At Startup Time
At startup time, the AMT gateway will bring up an AMT pseudo-
interface to be used for encapsulation. The gateway needs to
discover an AMT Relay to send Membership Requests. It can send an
AMT Relay Discovery at startup time or wait until it has a group
membership to report. The AMT Relay Discovery message is sent to the
AMT Relay Anycast Address. A unicast address (which is treated as a
link-layer address to the encapsulation interface) is received in the
AMT Relay Advertisement message. The discovery process SHOULD be
done periodically (e.g., once a day) to re-resolve the unicast
address of a close relay. To prevent startup synchronization, the
timer SHOULD use at least 10 percent jitter.
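A periodic rediscovery timer with jitter might be computed as in the following Python sketch. The one-day base interval and the 10 percent spread come from the text above; the function name and the choice of a uniform distribution are assumptions.

```python
import random

DAY_SECONDS = 24 * 60 * 60  # "once a day" example interval from the text


def next_discovery_delay(base: float = DAY_SECONDS, jitter: float = 0.10,
                         rng=random) -> float:
    """Seconds until the next Relay Discovery, spread by +/- `jitter` of base."""
    spread = base * jitter
    # A uniform draw keeps gateways that started together from staying in step.
    return base + rng.uniform(-spread, spread)
```

Passing a seeded `random.Random` instance as `rng` makes the schedule reproducible in tests.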
If the gateway is serving as a local router, it SHOULD also function
as an IGMP/MLD Proxy, as described in [RFC4605], with its IGMP/MLD
host-mode interface being the AMT pseudo-interface. This enables it
to translate group memberships on its downstream interfaces into
IGMP/MLD Reports. Hosts receiving multicast packets through an AMT
gateway acting as a proxy should ensure that their M-RIB accepts
multicast packets from the AMT gateway for the sources it is joining.
7.2. Gateway Group and Source Addresses
To support sourcing traffic to SSM groups by a gateway with a global
unicast address, the AMT Subnet Anycast Prefix is treated as the
subnet prefix of the AMT pseudo-interface, and an anycast address is
added on the interface. This anycast address is formed by
concatenating the AMT Subnet Anycast Prefix with the high bits of the
gateway's global unicast address.
The remaining bits of its global unicast address are appended to the
SSM prefix to create the group address and any spare bits may be
allocated using local policy.
If a gateway wants to source multicast traffic, it must select the
gateway source address and SSM group address in such a way that the
AMT relay can have enough information to reconstruct the gateway's
unicast address when it receives an SSM join for the source.
Note that multiple gateways might end up with the same anycast
address assigned to their pseudo-interfaces.
7.2.1. IPv4
For example, if IANA assigns the IPv4 prefix x.y/16 as the AMT Subnet
Anycast Prefix, and the gateway has global unicast address a.b.c.d,
then the AMT Gateway's Anycast Source Address will be x.y.a.b. Since
the IPv4 SSM group range is 232/8, it MUST allocate IPv4 SSM groups
in the range 232.c.d/24.
Group:
      8                  16                  8
+------------+------------------------+-------------+
| SSM prefix |     Low 16 bits of     |    Local    |
|   232/8    |  real source address   |   Policy    |
+------------+------------------------+-------------+
Source:
+-------------------------+-------------------------+
|16-bit AMT unicast prefix| high 16 bits of real src|
+-------------------------+-------------------------+
IPv4 format
This allows for 2^8 (256) IPv4 group addresses for use by each AMT
gateway.
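The IPv4 mapping can be worked through in Python as follows. The prefix 10.99.0.0/16 used in the usage note is a hypothetical stand-in for the x.y/16 prefix that IANA would assign, and the function names are illustrative only.

```python
import ipaddress

SSM_IPV4_BASE = int(ipaddress.IPv4Address("232.0.0.0"))  # IPv4 SSM range 232/8


def amt_ipv4_addresses(subnet_anycast_prefix: str, gateway: str):
    """Return (anycast source address, /24 of usable SSM groups) for a gateway."""
    prefix = ipaddress.IPv4Network(subnet_anycast_prefix)
    if prefix.prefixlen != 16:
        raise ValueError("this sketch assumes a /16 AMT Subnet Anycast Prefix")
    gw = int(ipaddress.IPv4Address(gateway))
    high16, low16 = gw >> 16, gw & 0xFFFF
    # Source x.y.a.b: 16-bit AMT prefix followed by the high 16 bits of a.b.c.d.
    source = ipaddress.IPv4Address(int(prefix.network_address) | high16)
    # Groups 232.c.d.0/24: SSM prefix, low 16 bits of a.b.c.d, 8 local bits.
    groups = ipaddress.IPv4Network((SSM_IPV4_BASE | (low16 << 8), 24))
    return source, groups


def recover_gateway(source: str, group: str) -> ipaddress.IPv4Address:
    """Rebuild a.b.c.d from an (S, G) pair formed as above."""
    high16 = int(ipaddress.IPv4Address(source)) & 0xFFFF
    low16 = (int(ipaddress.IPv4Address(group)) >> 8) & 0xFFFF
    return ipaddress.IPv4Address((high16 << 16) | low16)
```

With the hypothetical prefix 10.99.0.0/16 and gateway 198.51.100.7, `amt_ipv4_addresses` yields the source 10.99.198.51 and the group range 232.100.7.0/24, and `recover_gateway` reverses the mapping for any group in that range.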
7.2.2. IPv6
Similarly for IPv6, this is illustrated in the following figure.
Group:
      32                 64                  32
+------------+------------------------+-------------+
| SSM prefix |     Low 64 bits of     |    Local    |
| FF3x::/32  |  real source address   |   Policy    |
+------------+------------------------+-------------+
Source:
+-------------------------+-------------------------+
|64-bit AMT unicast prefix| high 64 bits of real src|
+-------------------------+-------------------------+
IPv6 format
This allows for 2^32 (over 4 billion) IPv6 group addresses for use by
each AMT gateway.
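The IPv6 layout in the figure can be sketched the same way. This Python example assumes a 64-bit AMT prefix and a concrete /32 SSM prefix (the scope nibble "x" in FF3x must be chosen by the caller); all prefixes and names used here are hypothetical.

```python
import ipaddress

MASK64 = (1 << 64) - 1


def amt_ipv6_addresses(amt_prefix64: str, ssm_prefix32: str, gateway: str,
                       local_id: int = 0):
    """Return (anycast source, SSM group) laid out as in the IPv6 figure."""
    prefix = ipaddress.IPv6Network(amt_prefix64)
    ssm = ipaddress.IPv6Network(ssm_prefix32)
    if prefix.prefixlen != 64 or ssm.prefixlen != 32:
        raise ValueError("expected a /64 AMT prefix and a /32 SSM prefix")
    if not 0 <= local_id < (1 << 32):
        raise ValueError("local policy value must fit in 32 bits")
    gw = int(ipaddress.IPv6Address(gateway))
    # Source: 64-bit AMT prefix followed by the high 64 bits of the gateway.
    source = ipaddress.IPv6Address(int(prefix.network_address) | (gw >> 64))
    # Group: 32-bit SSM prefix, low 64 bits of the gateway, 32 local bits.
    group = ipaddress.IPv6Address(
        int(ssm.network_address) | ((gw & MASK64) << 32) | local_id)
    return source, group
```

The 32 local-policy bits are what give each gateway its 2^32 group addresses.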
7.3. Joining Groups with MBone Sources
The IGMP/MLD protocol usually operates by having the Querier
multicast an IGMP/MLD Query message on the link. This behavior does
not work on NBMA links which do not support multicast. Since the set
of gateways is typically unknown to the relay (and potentially quite
large), unicasting the queries is also impractical. The following
behavior is used instead.
Applications residing in a gateway should join groups on the AMT
pseudo-interface, causing IGMP/MLD Membership/Listener Reports to be
sent over that interface. When UDP encapsulating the membership
reports (and in fact any other messages, unless specified otherwise
in this document), the destination address in the outer IP header is
the relay's unicast address. Robustness is provided by the
underlying IGMP/MLD protocol messages sent on the AMT pseudo-
interface. In other words, the gateway does not need to retransmit
IGMP/MLD Membership/Listener Reports and Leave/Done messages received
on the pseudo-interface since IGMP/MLD will already do this. The
gateway simply needs to encapsulate each IGMP/MLD Membership/Listener
Report and Leave/Done message it receives.
However, since periodic IGMP/MLD Membership/Listener Reports are sent
in response to IGMP/MLD Queries, a mechanism to trigger periodic
Membership/Listener Reports and Leave/Done messages is necessary.
The gateway should use a timer to trigger periodic AMT Membership
Updates.
If the gateway is behind a firewall device, the firewall may require
the gateway to periodically refresh the UDP state in the firewall at
a shorter interval than the standard IGMP/MLD Query interval. AMT
Requests can be sent periodically to solicit IGMP/MLD Queries. The
interval at which the AMT Requests are sent should be configurable to
ensure the firewall does not revert to blocking the UDP encapsulated
IP Multicast data packets. When the AMT Query is received, it can be
ignored unless it is time for a periodic AMT Membership Update.
The relay can use the Querier's Robustness Variable (QRV) defined in
[RFC3376] and [RFC3810] to adjust the number of Membership/Listener
Reports that are sent by the host joining the group.
7.4. Responding to Relay Changes
When a gateway determines that its current relay is unreachable
(e.g., upon receipt of an ICMP Unreachable message [RFC0792] for the
relay's unicast address), it may need to repeat relay address
discovery. However, care should be taken not to abandon the current
relay too quickly due to transient network conditions.
7.5. Joining SSM Groups with AMT Gateway Sources
An IGMPv3/MLDv2 Report for a given (source, group) pair MAY be
encapsulated directly to the source, when the source address belongs
to the AMT Subnet Anycast Prefix.
The "link-layer" address to use as the destination address in the
outer IP header is obtained as follows. The source address in the
inclusion list of the IGMPv3/MLDv2 report will be an AMT Gateway
Anycast Address, which carries the high bits of the gateway's unicast
address; the remaining bits will be in the middle of the group
address. Section 7.2 describes this format, from which the gateway
source address can be recovered.
7.6. Receiving AMT Membership Updates by the Gateway
When an AMT Request is received by the gateway from another gateway
or relay, it follows the same 3-way handshake procedure a relay would
follow if it received the AMT Request. It generates a MAC and
responds with an AMT Membership Query. When the AMT Membership
Update is received, it verifies the MAC and then processes the IGMP/
MLD Membership/Listener Report or Leave/Done.
At the gateway, the IGMP/MLD packet should be an IGMPv3/MLDv2 source
specific (S,G) join or leave.
If S is not the AMT Gateway Anycast Address, the packet is silently
discarded. If G does not contain the low bits of the global unicast
address (as described above), the packet is also silently discarded.
The gateway adds the source address (from the outer IP header) and
UDP port of the report to a membership list for G. Maintaining this
membership list may be done in any implementation-dependent manner.
For example, it might be maintained by the "link-layer" inside the
AMT pseudo-interface, making it invisible to the normal IGMP/MLD
module.
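The two discard checks and the membership list of Section 7.6 might look like the following Python sketch. It covers IPv4 only, assumes the /16 layout of Section 7.2.1, and assumes the Response MAC has already been verified; the class and method names are hypothetical.

```python
import ipaddress
from collections import defaultdict


class GatewayMembership:
    """Per-group list of remote (address, UDP port) pairs, IPv4 only."""

    def __init__(self, anycast_source: str, gateway_unicast: str):
        self.anycast_source = ipaddress.IPv4Address(anycast_source)
        # Low 16 bits of our own unicast address, expected inside G.
        self.low16 = int(ipaddress.IPv4Address(gateway_unicast)) & 0xFFFF
        self.members = defaultdict(set)

    def handle_join(self, s: str, g: str, outer_src: str, udp_port: int) -> bool:
        """Record an (S,G) join; return False if it is silently discarded."""
        source, group = ipaddress.IPv4Address(s), ipaddress.IPv4Address(g)
        if source != self.anycast_source:
            return False  # S is not the AMT Gateway Anycast Address
        if ((int(group) >> 8) & 0xFFFF) != self.low16:
            return False  # G lacks the low bits of our unicast address
        self.members[group].add((ipaddress.IPv4Address(outer_src), udp_port))
        return True
```

Keeping the list in a plain dictionary is only one option; as the text notes, the list may be maintained in any implementation-dependent manner.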
7.7. Sending Data to SSM Groups
When multicast packets are sent on the AMT pseudo-interface, they are
encapsulated as follows. If the group address is not an SSM group,
then the packet is silently discarded (this memo does not currently
provide a way to send to non-SSM groups).
If the group address is an SSM group, then the packet is unicast
encapsulated to each remote node from which the gateway has received
an IGMPv3/MLDv2 report for the packet's (source, group) pair.
8. Relay Router Details
8.1. At Startup Time
At startup time, the relay router will bring up an NBMA-style AMT
pseudo-interface. It shall also add the AMT Relay Anycast Address on
some interface.
The relay router shall then advertise the AMT Relay Anycast Prefix
into the unicast-only Internet, as if it were a connection to an
external network. When the advertisement is done using BGP, the AS
path leading to the AMT Relay Anycast Prefix shall include the
identifier of the local AS.
The relay router shall also enable IGMPv3/MLDv2 on the AMT pseudo-
interface, except that it shall not multicast Queries (this might be
done, for example, by having the AMT pseudo-device drop them, or by
having the IGMP/MLD module not send them in the first place).
Finally, to support sourcing SSM traffic from AMT sites, the AMT
Subnet Anycast Prefix is assigned to the AMT pseudo-interface, and
the AMT Subnet Anycast Prefix is injected by the AMT Relay into the
M-RIB of MBGP.
8.2. Receiving Relay Discovery Messages Sent to the Anycast Address
When a relay receives an AMT Relay Discovery message directed to the
AMT Relay Anycast Address, it should respond with an AMT Relay
Advertisement containing its unicast address. The source and
destination addresses of the advertisement should be the same as the
destination and source addresses of the discovery message
respectively. Further, the nonce in the discovery message MUST be
copied into the advertisement message.
8.3. Receiving Membership Updates from AMT Gateways
The relay operates passively, sending no periodic IGMP/MLD Queries
but simply tracking membership information according to AMT Request/
Query/Membership Update tuples received. In addition, the relay must
perform explicit membership tracking of which gateways on the AMT
pseudo-interface have joined which groups. Once an AMT Membership
Update has been successfully received, it updates the forwarding
state for the appropriate group and source (if provided). When data
arrives for that group, the traffic must be encapsulated to each
gateway which has joined that group or (S,G).
The explicit membership tracking and unicast replication may be done
in any implementation-specific manner. Some examples are:
1. The AMT pseudo-device driver might track the group information
and perform the replication at the "link-layer", with no changes
to a pre-existing IGMP/MLD module.
2. The IGMP/MLD module might have native support for explicit
membership tracking, especially if it supports other NBMA-style
interfaces.
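As one more illustration of explicit membership tracking with unicast replication, the following Python sketch keeps a table keyed by (source, group), with a source of None standing for a join without a source. The class name and the use of plain strings for addresses are assumptions; real state would also need timeouts.

```python
from collections import defaultdict

AMT_DATA_HEADER = b"\x06\x00"  # Type=0x6 followed by the 8-bit Reserved field


class RelayTracker:
    """Tracks which gateways joined which group or (S,G)."""

    def __init__(self):
        # (source or None, group) -> set of (gateway address, UDP port)
        self.state = defaultdict(set)

    def update(self, gateway, group, source=None, leave=False):
        """Apply a verified Membership Update from `gateway`."""
        key = (source, group)
        if leave:
            self.state[key].discard(gateway)
            if not self.state[key]:
                del self.state[key]
        else:
            self.state[key].add(gateway)

    def replicate(self, packet: bytes, group, source):
        """Return one (gateway, AMT Data payload) pair per joined gateway."""
        targets = (self.state.get((source, group), set())
                   | self.state.get((None, group), set()))
        return [(gw, AMT_DATA_HEADER + packet) for gw in sorted(targets)]
```

Each returned pair corresponds to one UDP datagram the relay would send, which is the unicast replication cost discussed earlier in the draft.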
If a relay wants to affect the rate at which the AMT Requests are
originated from a gateway, it can tune the membership timeout by
adjusting the Querier's Query Interval Code (QQIC) field in the IGMP/
MLD Query contained within the AMT Membership Query message. The
QQIC field is defined in [RFC3376] and [RFC3810]. However, since the
gateway may need to send AMT Requests frequently enough to prevent
firewall state from timing out, the relay may be limited in its
ability to spread out Requests coming from a gateway by adjusting the
QQIC field.
8.4. Receiving (S,G) Joins from the Native Side, for AMT Sources
The relay sends an IGMPv3/MLDv2 report to the AMT source as described
above in Section 5.1.2.
9. IANA Considerations
9.1. IPv4 and IPv6 Anycast Prefix Allocation
The IANA should allocate an IPv4 prefix and an IPv6 prefix dedicated
to the public AMT Relays to advertise to the native multicast
backbone. The prefix length should be determined by the IANA; the
prefix should be large enough to guarantee advertisement in the
default-free BGP networks.
9.1.1. IPv4
A prefix length of 16 will meet this requirement.
9.1.2. IPv6
A prefix length of 32 will meet this requirement. IANA has
previously set aside the range 2001::/16 for allocating prefixes for
this purpose.
9.2. IPv4 and IPv6 AMT Subnet Prefix Allocation
The length of the AMT Subnet Anycast Prefix directly affects the
number of groups available to be created by each AMT gateway: in the
IPv4 case, a prefix length of 16 gives 256 groups, and a prefix
length of 8 gives 65536 groups.
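The arithmetic behind these figures follows from the group layout in Section 7.2.1 and can be checked with a short Python sketch (the function name is hypothetical).

```python
def ipv4_groups_per_gateway(amt_prefix_len: int) -> int:
    """Number of SSM groups a gateway can form for a given AMT prefix length."""
    # The 32-bit group address holds the 8-bit SSM prefix (232/8), then the
    # low bits of the real source that do not fit in the source address
    # (as many bits as the AMT prefix occupies), then the local-policy bits.
    local_bits = 32 - 8 - amt_prefix_len
    if local_bits < 0:
        raise ValueError("prefix too long to leave room in the group address")
    return 2 ** local_bits
```

A /16 leaves 8 local bits (256 groups) and a /8 leaves 16 local bits (65536 groups).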
All allocations are a one time effort and there will be no need for
any recurring assignment after this stage.
9.2.1. IPv4
As described above in Section 7.2.1 an IPv4 prefix with a length of
16 is requested for this purpose.
9.2.2. IPv6
As described above in Section 7.2.2 an IPv6 prefix with a length of
32 is requested.
9.3. UDP Port Number
IANA has previously allocated UDP reserved port number 2268 for AMT
encapsulation.
10. Security Considerations
The anycast technique introduces a risk that a rogue router or a
rogue AS could introduce a bogus route to the AMT Relay Anycast
prefix, and thus divert the traffic. Network managers have to
guarantee the integrity of their routing to the AMT Relay Anycast
prefix in much the same way that they guarantee the integrity of all
other routes.
Within the native MBGP infrastructure, there is a risk that a rogue
router or a rogue AS could inject a false route to the AMT Subnet
Anycast Prefix, and thus divert joins and cause RPF failures of
multicast traffic. As the AMT Subnet Anycast Prefix will be
advertised by multiple entities, guaranteeing the integrity of this
shared MBGP prefix is much more challenging than verifying the
correctness of a regular unicast advertisement. To mitigate this
threat, routing operators should configure the BGP sessions to filter
out any more specific advertisements for the AMT Subnet Anycast
Prefix.
Gateways and relays will accept and decapsulate multicast traffic
from any source from which regular unicast traffic is accepted. If
this is for any reason felt to be a security risk, then additional
source address based packet filtering MUST be applied:
1. To prevent a rogue sender (that can't do traditional spoofing
because of, e.g., access lists deployed by its ISP) from making use
of AMT to send packets to an SSM tree, a relay that receives an
encapsulated multicast packet MUST discard the multicast packet
if the IP source address in the outer header does not match the
source address that would be extracted using the rules of
Section 7.2.
2. A gateway MUST discard encapsulated multicast packets if the
source address in the outer header is not the address to which
the encapsulated join message was sent. An AMT Gateway that
receives an encapsulated IGMPv3/MLDv2 (S,G)-Join MUST discard the
message if the IP destination address in the outer header does
not match the source address that would be extracted using the
rules of Section 7.2.
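The relay-side check in item 1 might be sketched in Python as follows, for IPv4 and assuming the /16 layout of Section 7.2.1. The function name is hypothetical, and the sketch does not verify that the inner source actually falls within the AMT Subnet Anycast Prefix.

```python
import ipaddress


def relay_accepts_data(outer_src: str, inner_src: str, inner_group: str) -> bool:
    """Accept an encapsulated SSM packet only if the outer source matches (S,G)."""
    s = int(ipaddress.IPv4Address(inner_src))
    g = int(ipaddress.IPv4Address(inner_group))
    # Section 7.2.1 layout, assuming a /16 AMT Subnet Anycast Prefix: the high
    # 16 bits of the sender sit in S, the low 16 bits in the middle of G.
    expected = ipaddress.IPv4Address(((s & 0xFFFF) << 16) | ((g >> 8) & 0xFFFF))
    return ipaddress.IPv4Address(outer_src) == expected
```

With the hypothetical inner pair (10.99.198.51, 232.100.7.1), only an outer source of 198.51.100.7 passes the check.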
11. Contributors
The following people provided significant contributions to earlier
versions of this draft.
Dirk Ooms
OneSparrow
Belegstraat 13; 2018 Antwerp; Belgium
EMail: dirk at onesparrow.com
12. Acknowledgments
Most of the mechanisms described in this document are based on
similar work done by the NGTrans WG for obtaining automatic IPv6
connectivity without explicit tunnels ("6to4"). Tony Ballardie
provided helpful discussion that inspired this document.
In addition, extensive comments were received from Pekka Savola, Greg
Shepherd, Dino Farinacci, Toerless Eckert, Marshall Eubanks, John
Zwiebel, and Lenny Giuliano.
Juniper Networks was instrumental in funding several versions of this
draft as well as an open source implementation.
13. References
13.1. Normative References
[RFC0792] Postel, J., "Internet Control Message Protocol", STD 5,
RFC 792, September 1981.
[RFC3376] Cain, B., Deering, S., Kouvelas, I., Fenner, B., and A.
Thyagarajan, "Internet Group Management Protocol, Version
3", RFC 3376, October 2002.
[RFC3810] Vida, R. and L. Costa, "Multicast Listener Discovery
Version 2 (MLDv2) for IPv6", RFC 3810, June 2004.
[RFC4605] Fenner, B., He, H., Haberman, B., and H. Sandick,
"Internet Group Management Protocol (IGMP) / Multicast
Listener Discovery (MLD)-Based Multicast Forwarding
("IGMP/MLD Proxying")", RFC 4605, August 2006.
[RFC4607] Holbrook, H. and B. Cain, "Source-Specific Multicast for
IP", RFC 4607, August 2006.
13.2. Informative References
[RFC1112] Deering, S., "Host extensions for IP multicasting", STD 5,
RFC 1112, August 1989.
[RFC1546] Partridge, C., Mendez, T., and W. Milliken, "Host
Anycasting Service", RFC 1546, November 1993.
[RFC2104] Krawczyk, H., Bellare, M., and R. Canetti, "HMAC: Keyed-
Hashing for Message Authentication", RFC 2104,
February 1997.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC3053] Durand, A., Fasano, P., Guardini, I., and D. Lento, "IPv6
Tunnel Broker", RFC 3053, January 2001.
[RFC3056] Carpenter, B. and K. Moore, "Connection of IPv6 Domains
via IPv4 Clouds", RFC 3056, February 2001.
[RFC3068] Huitema, C., "An Anycast Prefix for 6to4 Relay Routers",
RFC 3068, June 2001.
[RFC4291] Hinden, R. and S. Deering, "IP Version 6 Addressing
Architecture", RFC 4291, February 2006.
[RFC4601] Fenner, B., Handley, M., Holbrook, H., and I. Kouvelas,
"Protocol Independent Multicast - Sparse Mode (PIM-SM):
Protocol Specification (Revised)", RFC 4601, August 2006.
Authors' Addresses
Dave Thaler
Microsoft Corporation
One Microsoft Way
Redmond, WA 98052-6399
USA
Phone: +1 425 703 8835
Email: dthaler at microsoft.com
Mohit Talwar
Microsoft Corporation
One Microsoft Way
Redmond, WA 98052-6399
USA
Phone: +1 425 705 3131
Email: mohitt at microsoft.com
Amit Aggarwal
Microsoft Corporation
One Microsoft Way
Redmond, WA 98052-6399
USA
Phone: +1 425 706 0593
Email: amitag at microsoft.com
Lorenzo Vicisano
Cisco Systems
170 West Tasman Dr.
San Jose, CA 95134
USA
Phone: +1 408 525 2530
Email: lorenzo at cisco.com
Tom Pusateri
!j
222 E. Jones Ave.
Wake Forest, NC 27587
USA
Email: pusateri at bangj.com
Full Copyright Statement
Copyright (C) The IETF Trust (2008).
This document is subject to the rights, licenses and restrictions
contained in BCP 78, and except as set forth therein, the authors
retain all their rights.
This document and the information contained herein are provided on an
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Intellectual Property
The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed to
pertain to the implementation or use of the technology described in
this document or the extent to which any license under such rights
might or might not be available; nor does it represent that it has
made any independent effort to identify any such rights. Information
on the procedures with respect to rights in RFC documents can be
found in BCP 78 and BCP 79.
Copies of IPR disclosures made to the IETF Secretariat and any
assurances of licenses to be made available, or the result of an
attempt made to obtain a general license or permission for the use of
such proprietary rights by implementers or users of this
specification can be obtained from the IETF on-line IPR repository at
http://www.ietf.org/ipr.
The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary
rights that may cover technology that may be required to implement
this standard. Please address the information to the IETF at
ietf-ipr at ietf.org.
Network Working Group D. Farinacci
Internet-Draft D. Meyer
Intended status: Experimental J. Zwiebel
Expires: April 2, 2010 S. Venaas
cisco Systems
September 29, 2009
LISP for Multicast Environments
draft-ietf-lisp-multicast-02.txt
Status of this Memo
This Internet-Draft is submitted to IETF in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
This Internet-Draft will expire on April 2, 2010.
Copyright Notice
Copyright (c) 2009 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents in effect on the date of
publication of this document (http://trustee.ietf.org/license-info).
Please review these documents carefully, as they describe your rights
and restrictions with respect to this document.
Farinacci, et al. Expires April 2, 2010 [Page 1]
Internet-Draft LISP for Multicast Environments September 2009
Abstract
This draft describes how inter-domain multicast routing will function
in an environment where Locator/ID Separation is deployed using the
LISP architecture.
Table of Contents
1. Requirements Notation
2. Introduction
3. Definition of Terms
4. Basic Overview
5. Source Addresses versus Group Addresses
6. Locator Reachability Implications on LISP-Multicast
7. Multicast Protocol Changes
8. LISP-Multicast Data-Plane Architecture
8.1. ITR Forwarding Procedure
8.1.1. Multiple RLOCs for an ITR
8.2. ETR Forwarding Procedure
8.3. Replication Locations
9. LISP-Multicast Interworking
9.1. LISP and non-LISP Mixed Sites
9.1.1. LISP Source Site to non-LISP Receiver Sites
9.1.2. Non-LISP Source Site to non-LISP Receiver Sites
9.1.3. Non-LISP Source Site to Any Receiver Site
9.1.4. Unicast LISP Source Site to Any Receiver Sites
9.1.5. LISP Source Site to Any Receiver Sites
9.2. LISP Sites with Mixed Address Families
9.3. Making a Multicast Interworking Decision
10. Considerations when RP Addresses are Embedded in Group Addresses
11. Taking Advantage of Upgrades in the Core
12. Mtrace Considerations
13. Security Considerations
14. Acknowledgments
15. References
15.1. Normative References
15.2. Informative References
Appendix A. Document Change Log
A.1. Changes to draft-ietf-lisp-multicast-02.txt
A.2. Changes to draft-ietf-lisp-multicast-01.txt
A.3. Changes to draft-ietf-lisp-multicast-00.txt
Authors' Addresses
1. Requirements Notation
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119].
2. Introduction
The Locator/ID Separation Architecture [LISP] provides a mechanism to
separate out Identification and Location semantics from the current
definition of an IP address. By creating two namespaces, an EID
namespace used by sites and a Locator (RLOC) namespace used by core
routing, the core routing infrastructure can scale by doing
topological aggregation of routing information.
Since LISP creates a new namespace, a mapping function must exist to
map a site's EID prefixes to its associated locators. For unicast
packets, both the source address and destination address must be
mapped. For multicast packets, only the source address needs to be
mapped. The destination group address doesn't need to be mapped
because the semantics of an IPv4 or IPv6 group address are logical in
nature and not topology-dependent. Therefore, this specification
focuses on how to map the source EID address of a multicast flow during
distribution tree setup and packet delivery.
This specification will address the following scenarios:
1. How a multicast source host in a LISP site sends multicast
packets to receivers inside of its site as well as to receivers
in other sites that are LISP enabled.
2. How inter-domain (or between LISP sites) multicast distribution
trees are built and how forwarding of multicast packets leaving a
source site toward receiver sites is performed.
3. What protocols are affected and what changes are required to such
multicast protocols.
4. How ASM-mode, SSM-mode, and Bidir-mode service models will
operate.
5. How multicast packet flow will occur for multiple combinations of
LISP and non-LISP capable source and receiver sites, for example:
A. How multicast packets from a source host in a LISP site are
sent to receivers in other sites when they are all non-LISP
sites.
B. How multicast packets from a source host in a LISP site are
sent to receivers in both LISP-enabled sites and non-LISP
sites.
C. How multicast packets from a source host in a non-LISP site
are sent to receivers in other sites when they are all
LISP-enabled sites.
D. How multicast packets from a source host in a non-LISP site
are sent to receivers in both LISP-enabled sites and non-LISP
sites.
This specification focuses on what changes are needed to the
multicast routing protocols to support LISP-Multicast as well as
other protocols used for inter-domain multicast, such as
Multiprotocol BGP (MBGP) [RFC4760]. The approach proposed in this
specification requires no changes to the multicast infrastructure
inside of a site when all sources and receivers reside in that site,
even when the site is LISP enabled. That is, internal operation of
multicast is unchanged regardless of whether or not the site is LISP
enabled or whether or not receivers exist in other sites which are
LISP-enabled.
Therefore, we see changes only to PIM-ASM [RFC4601], MSDP [RFC3618],
and PIM-SSM [RFC4607]. Bidir-PIM [RFC5015], which typically does not
run in an inter-domain environment, is not addressed in depth in this
version of the specification.
Also, the current version of this specification does not describe
multicast-based Traffic Engineering relative to the TE-ITR and TE-ETR
descriptions in [LISP].
3. Definition of Terms
The terminology in this section is consistent with the definitions in
[LISP] but is extended specifically to deal with the application of
the terminology to multicast routing.
LISP-Multicast: a reference to the design in this specification.
That is, when any site that is participating in multicast
communication has been upgraded to be a LISP site, the operation
of control-plane and data-plane protocols is considered part of
the LISP-Multicast architecture.
Endpoint ID (EID): a 32-bit (for IPv4) or 128-bit (for IPv6) value
used in the source address field of the first (innermost) LISP
header of a multicast packet. The host obtains a destination
group address the same way it obtains one today, as it would when
it is a non-LISP site. The source EID is obtained via existing
mechanisms used to set a host's "local" IP address. An EID is
allocated to a host from an EID prefix block associated with the
site the host is located in. An EID can be used by a host to
refer to another host, as when it joins an SSM (S-EID,G) route
using IGMP version 3 [RFC4604]. LISP uses Provider Independent
(PI) blocks for EIDs; such EIDs MUST NOT be used as LISP RLOCs.
Note that EID blocks may be assigned in a hierarchical manner,
independent of the network topology, to facilitate scaling of the
mapping database. In addition, an EID block assigned to a site
may have site-local structure (subnetting) for routing within the
site; this structure is not visible to the global routing system.
Routing Locator (RLOC): the IPv4 or IPv6 address of an ingress
tunnel router (ITR), the router in the multicast source host's
site that encapsulates multicast packets. It is the output of an
EID-to-RLOC mapping lookup. An EID maps to one or more RLOCs.
Typically, RLOCs are numbered from topologically-aggregatable
blocks that are assigned to a site at each point to which it
attaches to the global Internet; where the topology is defined by
the connectivity of provider networks, RLOCs can be thought of as
Provider Assigned (PA) addresses. Multiple RLOCs can be assigned
to the same ITR device or to multiple ITR devices at a site.
Ingress Tunnel Router (ITR): a router which accepts an IP multicast
packet with a single IP header (more precisely, an IP packet that
does not contain a LISP header). The router treats this "inner"
IP destination multicast address opaquely so it doesn't need to
perform a map lookup on the group address because it is
topologically insignificant. The router then prepends an "outer"
IP header with one of its globally-routable RLOCs as the source
address field. This RLOC is known to other multicast receiver
sites which have used the mapping database to join a multicast
tree for which the ITR is the root. In general, an ITR receives
IP packets from site end systems on one side and sends LISP-
encapsulated multicast IP packets out all external interfaces
which have been joined.
An ITR would receive a multicast packet from a source inside of
its site when 1) it is on the path from the multicast source to
internally joined receivers, or 2) when it is on the path from the
multicast source to externally joined receivers.
Egress Tunnel Router (ETR): a router that is on the path from a
multicast source host in another site to a multicast receiver in
its own site. An ETR accepts a PIM Join/Prune message from a site
internal PIM router destined for the source's EID in the multicast
source site. The ETR maps the source EID in the Join/Prune
message to an RLOC address based on the EID-to-RLOC mapping. This
sets up the ETR to accept multicast encapsulated packets from the
ITR in the source multicast site. A multicast ETR decapsulates
multicast encapsulated packets and replicates them on interfaces
leading to internal receivers.
xTR: is a reference to an ITR or ETR when direction of data flow is
not part of the context description. xTR refers to the router that
is the tunnel endpoint. Used synonymously with the term "Tunnel
Router". For example, "An xTR can be located at the Customer Edge
(CE) router", meaning both ITR and ETR functionality can be at the
CE router.
LISP Header: a term used in this document to refer to the outer
IPv4 or IPv6 header, a UDP header, and a LISP header. An ITR
prepends headers and an ETR strips headers. A LISP encapsulated
multicast packet will have an "inner" header with the source EID
in the source field; an "outer" header with the source RLOC in the
source field; and the same globally unique group address in the
destination field of both the inner and outer header.
(S,G) State: the formal definition is in the PIM Sparse Mode
[RFC4601] specification. For this specification, the term is used
generally to refer to multicast state. Based on its topological
location, the (S,G) state that resides in routers can be either
(S-EID,G) state (at a site) or
(S-RLOC,G) state (in the Internet core).
(S-EID,G) State: refers to multicast state in multicast source and
receiver sites where S-EID is the IP address of the multicast
source host (its EID). An S-EID can appear in an IGMPv3 report,
an MSDP SA message or a PIM Join/Prune message that travels inside
of a site.
(S-RLOC,G) State: refers to multicast state in the core where S is
a source locator (the IP address of a multicast ITR) of a site
with a multicast source. The (S-RLOC,G) entry is mapped from the
(S-EID,G) entry by doing a mapping database lookup for the EID prefix that
S-EID maps to. An S-RLOC can appear in a PIM Join/Prune message
when it travels from an ETR to an ITR over the Internet core.
uLISP Site: a unicast-only LISP site according to [LISP] which has
not deployed the procedures of this specification and therefore,
for multicast purposes, follows the procedures from Section 9.
mPTR: this is a multicast PTR that is responsible for advertising a
very coarse EID prefix which non-LISP and uLISP sites can target
their (S-EID,G) PIM Join/Prune message to. mPTRs are used so LISP
source multicast sites can send multicast packets using source
addresses from the EID namespace. mPTRs act as Proxy ETRs for
supporting multicast routing in a LISP infrastructure.
Mixed Locator-Sets: this is a locator-set for a LISP database
mapping entry where the RLOC addresses in the locator-set are in
both IPv4 and IPv6 format.
Unicast Encapsulated PIM Join/Prune Message: this is a standard PIM
Join/Prune message (encapsulated in a LISP Encapsulated Control
Message with destination UDP port 4342) which is sent by ETRs at
multicast receiver sites to an ITR at a multicast source site.
This message is sent periodically as long as there are interfaces
in the oif-list for the (S-EID,G) entry the ETR is joining for.
4. Basic Overview
LISP, when used for unicast routing, increases the site's ability to
control ingress traffic flows. Egress traffic flows are controlled
by the IGP in the source site. For multicast, the IGP coupled with
PIM can decide which path multicast packets ingress. By using the
traffic engineering features of LISP, a multicast source site can
control the egress of its multicast traffic. By controlling the
priorities of locators from a mapping database entry, a source
multicast site can control which way multicast receiver sites join to
the source site.
At this point in time, we don't see a requirement for different
locator-sets, priority, and weight policies for multicast than we
have for unicast.
The fundamental multicast forwarding model is to encapsulate a
multicast packet into another multicast packet. An ITR will
encapsulate multicast packets received from sources that it serves in
another LISP multicast header. The destination group address from
the inner header is copied to the destination address of the outer
header. The inner source address is the EID of the multicast source
host and the outer source address is the RLOC of the encapsulating
ITR.
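The encapsulation rule in the paragraph above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class and function names (IPHeader, LispMulticastPacket, encapsulate) are hypothetical, the addresses are documentation examples, and the UDP and LISP shim headers that sit between the outer and inner IP headers are omitted.

```python
from dataclasses import dataclass
from ipaddress import ip_address


@dataclass(frozen=True)
class IPHeader:
    src: str
    dst: str


@dataclass(frozen=True)
class LispMulticastPacket:
    outer: IPHeader   # source RLOC -> group address
    inner: IPHeader   # source EID  -> group address
    payload: bytes


def encapsulate(inner: IPHeader, payload: bytes, itr_rloc: str) -> LispMulticastPacket:
    """Hypothetical ITR helper: wrap a multicast packet in an outer header."""
    if not ip_address(inner.dst).is_multicast:
        raise ValueError("inner destination is not a multicast group address")
    # The group address is copied unchanged; only the source is replaced
    # by the encapsulating ITR's RLOC. No mapping lookup is needed.
    outer = IPHeader(src=itr_rloc, dst=inner.dst)
    return LispMulticastPacket(outer=outer, inner=inner, payload=payload)
```

Note that no mapping database lookup appears anywhere in the function, which matches the text: the group address is carried through as-is.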
The LISP-Multicast architecture will follow this high-level protocol
and operational sequence:
1. Receiver hosts in multicast sites will join multicast content the
way they do today: they use IGMP. When they use IGMPv3 where
they specify source addresses, they use source EIDs; that is, they
join (S-EID,G). The S-EID may be a local multicast source host.
If the multicast source is external to this receiver site, the
PIM Join/Prune message flows toward the ETRs, finding the
shortest exit (that is the closest exit for the Join/Prune
message but it is the closest entrance for the multicast packet
to the receiver).
2. The ETR does a mapping database lookup for S-EID. If the mapping
is cached from a previous lookup (from either a previous Join/
Prune for the source multicast site or a unicast packet that went
to the site), it will use the RLOC information from the mapping.
The ETR will use the same priority and weighting mechanism as for
unicast. So the source site can decide which way multicast
packets egress.
3. The ETR will build two PIM Join/Prune messages, one that contains
a (S-EID,G) entry that is unicast to the ITR that matches the
RLOC the ETR selects, and the other which contains a (S-RLOC,G)
entry so the core network can create multicast state from this
ETR to the ITR.
4. When the ITR gets the unicast Join/Prune message (see Section 3
for formal definition), it will process (S-EID,G) entries in the
message and propagate them inside of the site where it has
explicit routing information for EIDs via the IGP. When the ITR
receives the (S-RLOC,G) PIM Join/Prune message it will process it
like any other join it would get in today's Internet. The S-RLOC
address is the IP address of this ITR.
5. At this point there is (S-EID,G) state from the joining host in
the receiver multicast site to the ETR of the receiver multicast
site. There is (S-RLOC,G) state across the core network from the
ETR of the multicast receiver site to the ITR in the multicast
source site and (S-EID,G) state in the source multicast site.
Note, the (S-EID,G) state is the same S-EID in each multicast
site. As other ETRs join the same multicast tree, they can join
through the same ITR (in which case the packet replication is
done in the core) or a different ITR (in which case the packet
replication is done at the source site).
6. When a packet is originated by the multicast host in the source
site, it will flow to one or more ITRs which will prepend a LISP
header by copying the group address to the outer destination
address field and insert its own locator address in the outer
source address field. The ITR will look at its (S-RLOC,G) state,
where S-RLOC is its own locator address, and replicate the packet
on each interface on which an (S-RLOC,G) join was received. The core
has (S-RLOC,G) state, so where fanout occurs to multiple sites, a core
router will do packet replication.
7. When either the source site or the core replicates the packet,
the ETR will receive a LISP packet with a destination group
address. It will decapsulate packets because it has
receivers for the group; otherwise, it would not have received
the packets because it would not have joined. The ETR
decapsulates and does an (S-EID,G) lookup in its multicast FIB to
forward the packet out one or more interfaces toward internal
receivers.
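Steps 2 and 3 of the sequence above (the ETR's mapping lookup and the two Join/Prune messages it builds) can be sketched as follows. This is a minimal sketch, not the draft's procedure: the map-cache layout, the Locator class, and the dictionaries standing in for PIM messages are all hypothetical. Locator selection assumes that a lower priority value is preferred, and it uses the highest weight as a simple deterministic tie-break rather than real load splitting.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Locator:
    rloc: str
    priority: int   # assumption: lower value is preferred
    weight: int     # assumption: used here only as a tie-break


def select_rloc(map_cache, s_eid):
    """Longest-prefix match on the EID, then pick one locator."""
    addr = ip_address(s_eid)
    covering = [(ip_network(prefix), locators)
                for prefix, locators in map_cache.items()
                if addr in ip_network(prefix)]
    if not covering:
        raise LookupError("no cached mapping for " + s_eid)
    _, locators = max(covering, key=lambda item: item[0].prefixlen)
    return min(locators, key=lambda loc: (loc.priority, -loc.weight)).rloc


def build_joins(map_cache, s_eid, group):
    """Return the two joins an ETR would send for one (S-EID,G)."""
    s_rloc = select_rloc(map_cache, s_eid)
    # Unicast-encapsulated join, carried in a LISP Encapsulated Control
    # Message to the selected ITR (destination UDP port 4342).
    unicast_join = {"to": s_rloc, "udp_port": 4342, "entry": (s_eid, group)}
    # Native join sent hop by hop toward S-RLOC; builds core state.
    native_join = {"entry": (s_rloc, group)}
    return unicast_join, native_join
```

Because the source site controls the priorities in its mapping entry, changing them changes which ITR the function selects, and therefore which way multicast packets leave the source site.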
This architecture is consistent with the architecture presented in
[LISP] and scales in the same way: multicast state in the core operates on
locators and multicast state at the sites operates on EIDs.
Alternatively, [LISP] does present a mechanism where (S-EID,G) state
can reside in the core through the use of RPF-vectors [RPFV] in PIM
Join/Prune messages. However, this will require EID state in the core as
well as the use of RPF-vector formatted Join/Prune messages which are
not the default implementation choice. So we choose a design that
can allow the separation of namespaces as unicast LISP provides. It
will be at the expense of creating new (S-RLOC,G) state when ITRs go
unreachable. See Section 6 for details.
However, we have some observations on the algorithm above. We can
scale the control plane but at the expense of sending data to sites
which may not have joined the distribution tree where the
encapsulated data is being delivered. For example, one site joins
(S-EID1,G) and another site joins (S-EID2,G). Both EIDs are in the
same multicast source site. Both multicast receiver sites join to
the same ITR with state (S-RLOC,G) where S-RLOC is the RLOC for the
ITR. The ITR joins both (S-EID1,G) and (S-EID2,G) inside of the
site. The ITR receives (S-RLOC,G) joins and populates the oif-list
state for it. Since both (S-EID1,G) and (S-EID2,G) map to the one
(S-RLOC,G), packets will be delivered by the core to both multicast
receiver sites even though each has joined a single source-based
distribution tree. This behavior is a consequence of the many-to-one
mapping between S-EIDs and an S-RLOC.
There is a possible solution to this problem which reduces the number
of many-to-one occurrences of (S-EID,G) entries aggregating into a
single (S-RLOC,G) entry. If a physical ITR can be assigned multiple
RLOC addresses and these addresses are advertised in mapping database
entries, then ETRs at receiver sites have more RLOC address options
and therefore can join different (RLOC,G) entries for each (S-EID,G)
entry joined at the receiver site. It would not scale to have a
one-to-one relationship between the number of S-EID sources at a source
site and the number of RLOCs assigned to all ITRs at the site, but we
can reduce the "n" to a smaller number in the "n-to-1" relationship
and, in turn, reduce the opportunity for data packets to be delivered
to sites for groups not joined.
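The many-to-one effect and its mitigation can be illustrated with a toy model. The function name core_deliveries and the site, EID, and RLOC labels are hypothetical. The model assumes each receiver site joins exactly one (S-EID,G) and that the core forwards purely on (S-RLOC,G).

```python
from collections import defaultdict


def core_deliveries(site_joins, eid_to_rloc):
    """Return, per receiver site, the (S-EID,G) flows the core delivers.

    site_joins:  site name -> (S-EID, G) the site joined
    eid_to_rloc: S-EID -> S-RLOC chosen by the receiver sites' ETRs
    """
    # Core state is keyed on (S-RLOC,G); the core cannot see S-EIDs.
    core_oifs = defaultdict(set)
    for site, (s_eid, group) in site_joins.items():
        core_oifs[(eid_to_rloc[s_eid], group)].add(site)

    delivered = defaultdict(set)
    for s_eid, group in set(site_joins.values()):
        # Every flow encapsulated with this RLOC follows the same tree.
        for site in core_oifs[(eid_to_rloc[s_eid], group)]:
            delivered[site].add((s_eid, group))
    return dict(delivered)
```

With both S-EIDs mapped to a single RLOC, each site receives both flows. If the ITR advertises two RLOCs and each S-EID maps to a different one, each site receives only the flow it joined.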
5. Source Addresses versus Group Addresses
Multicast group addresses don't have to be associated with either the
EID or RLOC namespace. They actually are a namespace of their own
that can be treated as logical with relatively opaque allocation.
So, by their nature, they don't detract from an incremental
deployment of LISP-Multicast.
As for source addresses, as in the unicast LISP scenario, there is a
decoupling of identification from location. In a LISP site, packets
are originated from hosts using their allocated EIDs; those addresses
are used to identify the host as well as where in the site's topology
the host resides but not how and where it is attached to the
Internet.
Therefore, when multicast distribution tree state is created anywhere
in the network on the path from any multicast receiver to a multicast
source, EID state is maintained at the source and receiver multicast
sites, and RLOC state is maintained in the core. That is, a
multicast distribution tree will be represented as a 3-tuple of
{(S-EID,G) (S-RLOC,G) (S-EID,G)} where the first element of the
3-tuple is the state stored in routers from the source to one or more
ITRs in the source multicast site, the second element of the 3-tuple
is the state stored in routers downstream of the ITR, in the core, to
all LISP receiver multicast sites, and the third element in the
3-tuple is the state stored in the routers downstream of each ETR, in
each receiver multicast site, reaching each receiver. Note that
(S-EID,G) is the same in both the source and receiver multicast
sites.
The concatenation/mapping from the first element to the second
element of the 3-tuples is done by the ITR and from the second
element to the third element is done at the ETRs.
6. Locator Reachability Implications on LISP-Multicast
Multicast state as it is stored in the core is always (S,G) state as
it exists today or (S-RLOC,G) state as it will exist when LISP sites
are deployed. The core routers cannot distinguish one from the
other. They don't need to because it is state that RPFs against the
core routing tables in the RLOC namespace. The difference is where
the root of the distribution tree for a particular source is. In the
traditional multicast core, the source S is the source host's IP
address. For LISP-Multicast the source S is a single ITR of the
multicast source site.
An ITR is selected based on the LISP EID-to-RLOC mapping used when an
ETR propagates a PIM Join/Prune message out of a receiver multicast
site. The selection is based on the same algorithm an ITR would use
to select an ETR when sending a unicast packet to the site. In the
unicast case, the ITR can change on a per-packet basis depending on
the reachability of the ETR. So an ITR can change relatively easily
using local reachability state. However, in the multicast case, when
an ITR goes unreachable, new distribution tree state must be built
because the encapsulating root has changed. This is more significant
than an RPF-change event, where any router would typically locally
change its RPF-interface for its existing tree state. But when an
encapsulating LISP-Multicast ITR goes unreachable, new distribution
state must be rebuilt and reflect the new encapsulator. Therefore,
when an ITR goes unreachable, all ETRs that are currently joined to
that ITR will have to trigger a new Join/Prune message for (S-RLOC,G)
to the new ITR as well as send a unicast encapsulated Join/Prune
message telling the new ITR which (S-EID,G) is being joined.
This issue can be mitigated by using anycast addressing for the ITRs
so the problem does reduce to an RPF change in the core, but still
requires a unicast encapsulated Join/Prune message to tell the new
ITR about (S-EID,G). The problem with this approach is that the ETR
really doesn't know when the ITR has changed so the new anycast ITR
will get the (S-EID,G) state only when the ETR sends it the next time
during its periodic sending procedures.
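The non-anycast rejoin behavior described earlier in this section can be sketched as follows. This is an assumption-laden illustration: the function name, the shape of the joined table, and the pick_rloc callback (which stands in for re-running locator selection without the failed RLOC) are hypothetical. The text does not say whether state toward the failed ITR is explicitly pruned, so the sketch sends no prune.

```python
def rejoin_after_itr_failure(joined, failed_rloc, pick_rloc):
    """Re-home every (S-EID,G) that was joined through a failed ITR.

    joined:    dict mapping (S-EID, G) -> S-RLOC currently joined
    pick_rloc: callable (s_eid, failed_rloc) -> replacement S-RLOC
    Returns the messages the ETR would trigger; updates joined in place.
    """
    messages = []
    for (s_eid, group), rloc in list(joined.items()):
        if rloc != failed_rloc:
            continue  # trees rooted at other ITRs are unaffected
        new_rloc = pick_rloc(s_eid, failed_rloc)
        joined[(s_eid, group)] = new_rloc
        # New core tree state rooted at the replacement ITR.
        native = {"type": "native-join", "entry": (new_rloc, group)}
        if native not in messages:
            messages.append(native)
        # Tell the replacement ITR which (S-EID,G) is being joined.
        messages.append({"type": "unicast-encapsulated-join",
                         "to": new_rloc, "entry": (s_eid, group)})
    return messages
```

Two S-EIDs in the same group that move to the same replacement ITR share one native (S-RLOC,G) join but each needs its own unicast encapsulated join.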
7. Multicast Protocol Changes
A number of protocols are used today for inter-domain multicast
routing:
IGMPv1-v3, MLDv1-v2: These protocols do not require any changes for
LISP-Multicast for two reasons. First, they are link-local and not
used across site boundaries. Second, they advertise
group addresses that don't need translation. Where source
addresses are supplied in IGMPv3 and MLDv2 messages, they are
semantically regarded as EIDs and don't need to be converted to
RLOCs until the multicast tree-building protocol, such as PIM, is
received by the ETR at the site boundary. Addresses used for IGMP
and MLD come out of the source site's allocated addresses which
are therefore from the EID namespace.
MBGP: Even though MBGP is not a multicast routing protocol, it is
used to find multicast sources when the unicast BGP peering
topology and the multicast MBGP peering topology are not
congruent. When MBGP is used in a LISP-Multicast environment, the
prefixes which are advertised are from the RLOC namespace. This
allows receiver multicast sites to find a path to the source
multicast site's ITRs. MBGP peering addresses will be from the
RLOC namespace.
MSDP: MSDP is used to announce active multicast sources to other
routing domains (or LISP sites). The announcements come from the
PIM Rendezvous Points (RPs) from sites where there are active
multicast sources sending to various groups. In the context of
LISP-Multicast, the source addresses advertised in MSDP will
semantically be from the EID namespace since they describe the
identity of a source multicast host. It will be true that the
state stored in MSDP caches from core routers will be from the EID
namespace. An RP address inside of a site will be from the EID
namespace so it can be advertised and reached by internal unicast
routing mechanisms. However, for MSDP peer-RPF checking to work
properly across sites, the RP addresses must be converted or
mapped into a routable address that is advertised and maintained
in the BGP routing tables in the core. MSDP peering addresses can
come out of either the EID or a routable address namespace. And
the choice can be made unilaterally because the ITR at the site
will determine which namespace the destination peer address is out
of by looking in the mapping database service.
PIM-SSM: In the simplest form of distribution tree building, when
PIM operates in SSM mode, a source distribution tree is built and
maintained across site boundaries. In this case, there is a small
modification to the operation of the PIM protocol (but not to any
message format) to support taking a Join/Prune message originated
inside of a LISP site with embedded addresses from the EID
namespace and converting them to addresses from the RLOC namespace
when the Join/Prune message crosses a site boundary. This is
similar to the requirements documented in [MNAT].
PIM-Bidir: Bidirectional PIM is typically run inside of a routing
domain, but if deployed in an inter-domain environment, one would
have to decide if the RP address of the shared-tree would be from
the EID namespace or the RLOC namespace. If the RP resides in a
site-based router, then the RP address is from the EID namespace.
If the RP resides in the core where RLOC addresses are routed,
then the RP address is from the RLOC namespace. This could be
easily distinguishable if the EID address were in a well-known address
allocation block from the RLOC namespace. Also, when using
Embedded-RP for RP determination [RFC3956], the format of the
group address could indicate the namespace the RP address is from.
However, refer to Section 10 for considerations core routers need
to make when using Embedded-RP IPv6 group addresses. When using
Bidir-PIM for inter-domain multicast routing, it is recommended to
use statically configured RPs so core routers think the Bidir group
is associated with an ITR's RLOC as the RP address and site
routers think the Bidir group is associated with the site resident
RP with an EID address. With respect to DF-election in Bidir PIM,
no changes are required since all messaging and addressing is
link-local.
PIM-ASM: The ASM mode of PIM, the most popular form of PIM, is
deployed in the Internet today by having shared-trees within a
site and using source-trees across sites. By the use of MSDP and
PIM-SSM techniques described above, we can get multicast
connectivity across LISP sites. This means
there are no special actions required for processing (*,G) or
(S,G,R) Join/Prune messages since they all operate against the
shared-tree which is site resident. Just like with ASM, there is
no (*,G) in the core when LISP-Multicast is in use. This is also
true for the RP-mapping mechanisms Auto-RP and BSR.
Based on the protocol description above, the conclusion is that there
are no protocol message format changes, just a translation function
performed at the control-plane. This will make for an easier and
faster transition for LISP since fewer components in the network have
to change.
As stated in [LISP], no host changes whatsoever are required for a
multicast source host to send multicast packets or for a multicast
receiver host to receive multicast packets.
8. LISP-Multicast Data-Plane Architecture
The LISP-Multicast data-plane operation conforms to the operation and
packet formats specified in [LISP]. However, encapsulating a
multicast packet from an ITR is a much simpler process. The process
is simply to copy the inner group address to the outer destination
address and to have the ITR use its own IP address (its RLOC)
as the outer source address. The process is simpler for multicast because
there is no EID-to-RLOC mapping lookup performed during packet
forwarding.
In the decapsulation case, the ETR simply removes the outer header
and performs a multicast routing table lookup on the inner header
(S-EID,G) addresses. Then the oif-list for the (S-EID,G) entry is
used to replicate the packet on site-facing interfaces leading to
multicast receiver hosts.
There is no Data-Probe logic for ETRs as there can be in the unicast
forwarding case.
8.1. ITR Forwarding Procedure
The following procedure is used by an ITR, when it receives a
multicast packet from a source inside of its site:
1. A multicast data packet sent by a host in a LISP site will have
the source address equal to the host's EID and the destination
address equal to the group address of the multicast group. It is
assumed the group information is obtained by current methods.
The same is true for a multicast receiver to obtain the source
and group address of a multicast flow.
2. When the ITR receives a multicast packet, it will have both S-EID
state and S-RLOC state stored. Since the packet was received on
a site-facing interface, the RPF lookup is based on the S-EID
state. If the RPF check succeeds, then the oif-list contains
interfaces that are site-facing and external-facing. For the
site-facing interfaces, no LISP header is prepended. For the
external-facing interfaces a LISP header is prepended. When the
ITR prepends a LISP header, it uses its own RLOC address as the
source address and copies the group address supplied by the IP
header the host built as the outer destination address.
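The ITR procedure above can be sketched as a single forwarding function. The names (Oif, itr_forward), the dictionary-based packet, and the layout of the multicast routing table are assumptions made for illustration; a real ITR would also prepend UDP and LISP headers, which are omitted here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Oif:
    name: str
    external: bool   # True for external-facing, False for site-facing


def itr_forward(packet, in_iface, mroute):
    """Replicate one multicast packet received on a site-facing interface.

    packet: dict with "src" (S-EID), "dst" (group) and "payload"
    mroute: (S-EID, G) -> {"rpf_iface": str, "rloc": str, "oifs": [Oif]}
    Returns a list of (interface name, packet to send) pairs.
    """
    entry = mroute.get((packet["src"], packet["dst"]))
    # The RPF check is made against the S-EID state.
    if entry is None or entry["rpf_iface"] != in_iface:
        return []
    copies = []
    for oif in entry["oifs"]:
        if oif.external:
            # External-facing: prepend an outer header with the ITR's own
            # RLOC as source and the copied group address as destination.
            copies.append((oif.name, {"outer_src": entry["rloc"],
                                      "outer_dst": packet["dst"],
                                      "inner": packet}))
        else:
            # Site-facing: forward natively, with no LISP header.
            copies.append((oif.name, packet))
    return copies
```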
8.1.1. Multiple RLOCs for an ITR
Typically, an ITR will have a single RLOC address but in some cases
there could be multiple RLOC addresses assigned from either the same
or different service providers. In this case, when (S-RLOC,G)
Join/Prune messages are received for each RLOC, there is an oif-list
merging action that must take place. Therefore, when a packet is
received from a site-facing interface that matches on a (S-EID,G)
entry, the interfaces of the oif-list from all (RLOC,G) entries
joined to the ITR as well as the site-facing oif-list joined for
(S-EID,G) must be included in packet replication. In
addition to replicating for all types of oif-lists, each oif entry
must be tagged with the RLOC address, so encapsulation uses the outer
source address for the RLOC joined.
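The oif-list merge can be sketched as building one replication plan in which every external entry carries the RLOC it was joined through. The function name and data shapes are hypothetical.

```python
def merged_replication_list(site_oifs, rloc_joins):
    """Merge site-facing and per-RLOC external oif-lists for one (S-EID,G).

    site_oifs:  site-facing interface names joined for (S-EID,G)
    rloc_joins: RLOC -> external interface names where a join for
                (RLOC,G) was received
    Returns (interface, outer source RLOC or None) pairs; None means the
    copy is sent natively, without encapsulation.
    """
    plan = [(iface, None) for iface in site_oifs]
    for rloc, ifaces in rloc_joins.items():
        # Tag each external oif with the RLOC that was joined so the
        # outer source address matches the receivers' (RLOC,G) state.
        plan.extend((iface, rloc) for iface in ifaces)
    return plan
```

The same physical interface may appear under more than one RLOC. In that case the plan contains one entry per RLOC, since downstream state differs for each outer source address.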
8.2. ETR Forwarding Procedure
The following procedure is used by an ETR, when it receives a
multicast packet from a source outside of its site:
1. When a multicast data packet is received by an ETR on an
external-facing interface, it will do an RPF lookup on the S-RLOC
state it has stored. If the RPF check succeeds, the interfaces
from the oif-list are used for replication to interfaces that are
site-facing as well as interfaces that are external-facing (this
ETR can also be a transit multicast router for receivers outside
of its site). When the packet is to be replicated for an
external-facing interface, the LISP encapsulation header is not
stripped. When the packet is replicated for a site-facing
interface, the encapsulation header is stripped.
2. The packet without a LISP header is now forwarded down the
(S-EID,G) distribution tree in the receiver multicast site.
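The ETR side can be sketched the same way. Again this is an illustration under assumptions: `etr_forward` is a hypothetical name, the RPF lookup on S-RLOC state is reduced to an interface comparison, and headers are modeled as (source, destination) pairs.

```python
from typing import List, Optional, Tuple

Header = Tuple[str, str]                      # (source, destination)
Copy = Tuple[str, Optional[Header], Header]   # (interface, outer or None, inner)


def etr_forward(outer: Header, inner: Header, in_if: str, rpf_if: str,
                oif_list: List[Tuple[str, bool]]) -> List[Copy]:
    """Replicate an encapsulated packet received on an external-facing interface.

    outer is (S-RLOC, G), inner is (S-EID, G); oif_list holds
    (interface name, is_site_facing) pairs from the (S-RLOC,G) state.
    """
    if in_if != rpf_if:
        return []                             # RPF check on S-RLOC state failed
    copies: List[Copy] = []
    for name, site_facing in oif_list:
        if site_facing:
            copies.append((name, None, inner))    # LISP header stripped
        else:
            copies.append((name, outer, inner))   # transit: header kept
    return copies
```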
8.3. Replication Locations
Multicast packet replication can happen in the following topological
locations:
o In an IGP multicast router inside a site which operates on S-EIDs.
o In a transit multicast router inside of the core which operates on
S-RLOCs.
o At one or more ETR routers depending on the path a Join/Prune
message exits a receiver multicast site.
o At one or more ITR routers in a source multicast site depending on
what priorities are returned in a Map-Reply to receiver multicast
sites.
In the last case the source multicast site can do replication rather
than having a single exit from the site. But this only can occur
when the priorities in the Map-Reply are modified for different
receiver multicast sites so that the PIM Join/Prune messages arrive at
different ITRs.
This policy technique, also used in [ALT] for unicast, is useful for
multicast to mitigate the problems of changing distribution tree
state as discussed in Section 6.
9. LISP-Multicast Interworking
This section describes the multicast corollary to [INTWORK], that
is, the interworking of multicast routing among LISP and non-LISP
sites.
9.1. LISP and non-LISP Mixed Sites
Since multicast communication can involve more than two entities,
the combinations of interworking scenarios are
more involved. However, the state maintained for distribution trees
at the sites is the same regardless of whether the site is
LISP enabled. So most of the implications are in the core
with respect to storing routable EID prefixes from either PA or PI
blocks.
Before we enumerate the multicast interworking scenarios, we must
define 3 deployment states of a site:
o A non-LISP site which will run PIM-SSM or PIM-ASM with MSDP as it
does today. The addresses for the site are globally routable.
o A site that deploys LISP for unicast routing. The addresses for
the site are not globally routable. Let's define the name for
this type of site as a uLISP site.
o A site that deploys LISP for both unicast and multicast routing.
The addresses for the site are not globally routable. Let's
define the name for this type of site as a LISP-Multicast site.
We will not consider a LISP site enabled for multicast purposes only
but do consider a uLISP site as documented in [INTWORK]. In this
section we don't discuss how a LISP site sends multicast packets when
all receiver sites are LISP-Multicast enabled; that has been
discussed in previous sections.
The following scenarios exist to make LISP-Multicast sites interwork
with non-LISP-Multicast sites:
1. A LISP site must be able to send multicast packets to receiver
sites which are a mix of non-LISP sites and uLISP sites.
2. A non-LISP site must be able to send multicast packets to
receiver sites which are a mix of non-LISP sites and uLISP sites.
3. A non-LISP site must be able to send multicast packets to
receiver sites which are a mix of LISP sites, uLISP sites, and
non-LISP sites.
4. A uLISP site must be able to send multicast packets to receiver
sites which are a mix of LISP sites, uLISP sites, and non-LISP
sites.
5. A LISP site must be able to send multicast packets to receiver
sites which are a mix of LISP sites, uLISP sites, and non-LISP
sites.
9.1.1. LISP Source Site to non-LISP Receiver Sites
In the first scenario, a site is LISP capable for both unicast and
multicast traffic and as such operates on EIDs. Therefore there is a
possibility that the EID prefix block is not routable in the core.
For LISP receiver multicast sites this isn't a problem but for non-
LISP or uLISP receiver multicast sites, when a PIM Join/Prune message
is received by the edge router, it has no route to propagate the
Join/Prune message out of the site. This is no different than the
unicast case that LISP-NAT in [INTWORK] solves.
LISP-NAT allows a unicast packet that exits a LISP site to get its
source address mapped to a globally routable address before the ITR
realizes that it should not encapsulate the packet destined to a non-
LISP site. For a multicast packet to leave a LISP site, distribution
tree state needs to be built so the ITR can know where to send the
packet. So the receiver multicast sites need to know about the
multicast source host by its routable address and not its EID
address. When this is the case, the routable address is the
(S-RLOC,G) state that is stored and maintained in the core routers.
It is important to note that the routable address for the host cannot
be the same as an RLOC for the site, because we want the ITRs to
process a PIM Join/Prune message received on an external-facing
interface and propagate it inside of the site so the site-part of the
distribution tree is built.
Using a globally routable source address allows non-LISP and uLISP
multicast receivers to join, create, and maintain a multicast
distribution tree. However, the LISP multicast receiver site will
want to perform an EID-to-RLOC mapping table lookup when a PIM Join/
Prune message is received on a site-facing interface. It does this
because it wants to find a (S-RLOC,G) entry to Join in the core. So
we have a conflict of behavior between the two types of sites.
The solution to this problem is the same as when an ITR wants to send
a unicast packet to a destination site but needs to determine if the
site is LISP capable or not. When it is not LISP capable, the ITR
does not encapsulate the packet. So for the multicast case, when an ETR
receives a PIM Join/Prune message for (S-EID,G) state, it will do a
mapping table lookup on S-EID. In this case, S-EID is not in the
mapping database because the source multicast site is using a
routable address and not an EID prefix address. So the ETR knows to
simply propagate the PIM Join/Prune message to an external-facing
interface without converting the (S-EID,G) because it is an (S,G)
where S is routable and reachable via core routing tables.
Now that the multicast distribution tree is built and maintained from
any non-LISP or uLISP receiver multicast site, the packet
forwarding model can be explained.
Since the ITR in the source multicast site has never received a
unicast encapsulated PIM Join/Prune message from any ETR in a
receiver multicast site, it knows there are no LISP-Multicast
receiver sites. Therefore, there is no need for the ITR to
encapsulate data. Since it will know a priori (via configuration)
that its site's EIDs are not routable, it assumes that the multicast
packets from the source host are sent by a routable address. That
is, it is the responsibility of the multicast source host's system
administrator to ensure that the source host sends multicast traffic
using a routable source address. When this happens, the ITR acts
simply as a router and forwards the multicast packet like an ordinary
multicast router.
Just as for unicast forwarding [INTWORK], there is an alternative to
using a LISP-NAT scheme: Proxy Tunnel Routers
(PTRs). This can work the same way for multicast routing as well,
but the difference is that non-LISP and uLISP sites will send PIM
Join/Prune messages for (S-EID,G) which make their way in the core to
PTRs. Let's call a PTR used this way a "Multicast PTR" (or mPTR).
Since the PTRs advertise very coarse EID prefixes, they draw the PIM
Join/Prune control traffic making them the target of the distribution
tree. To get multicast packets from the LISP source multicast sites,
the tree needs to be built on the path from the mPTR to the LISP
source multicast site. To make this happen the mPTR acts as a "Proxy
ETR" (where in unicast it acts as a "Proxy ITR").
The existence of mPTRs in the core allows LISP source multicast site
ITRs to encapsulate multicast packets so the state built between the
ITRs and the mPTRs is (S-RLOC,G) state. Then the mPTRs can
decapsulate packets and forward natively to the non-LISP and uLISP
receiver multicast sites.
9.1.2. Non-LISP Source Site to non-LISP Receiver Sites
Clearly non-LISP multicast sites can send multicast packets to non-
LISP receiver multicast sites. That is what they do today. However,
discussion is required to show how non-LISP multicast sites send
multicast packets to uLISP receiver multicast sites.
Since uLISP receiver multicast sites are not targets of any (S,G)
state, they simply send (S,G) PIM Join/Prune messages toward the
non-LISP source multicast site. Since the source multicast site, in
this case, has not been upgraded to LISP, all multicast source host
addresses are routable. So this case reduces to one where a uLISP
receiver multicast site looks, to the source multicast site, like a
non-LISP receiver multicast site.
9.1.3. Non-LISP Source Site to Any Receiver Site
When a non-LISP source multicast site has receivers in either a non-
LISP/uLISP site or a LISP site, one needs to decide how the LISP
receiver multicast site will attach to the distribution tree. We
know from Section 9.1.2 that non-LISP and uLISP receiver multicast
sites can join the distribution tree, but a LISP receiver multicast
site ETR will need to know if the source address of the multicast
source host is routable or not. We showed in Section 9.1.1 that an
ETR, before it sends a PIM Join/Prune message on an external-facing
interface, does an EID-to-RLOC mapping lookup to determine if it
should convert the (S,G) state from a PIM Join/Prune message received
on a site-facing interface to a (S-RLOC,G). If the lookup fails, the
ETR can conclude the source multicast site is a non-LISP site so it
simply forwards the Join/Prune message (it also doesn't need to send
a unicast encapsulated Join/Prune message because there is no ITR in
a non-LISP site and there is namespace continuity between the ETR and
source).
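The ETR decision described in Sections 9.1.1 and 9.1.3 might be sketched as follows. The mapping database is modeled as a hypothetical dictionary from a source address to a single ITR RLOC; a real lookup would match on EID prefixes and return a locator-set. The action names are illustrative, not protocol message names.

```python
from typing import Dict, List, Optional, Tuple

# (kind, source, group, unicast destination or None)
Action = Tuple[str, str, str, Optional[str]]


def etr_join_actions(s: str, g: str,
                     mapping_db: Dict[str, str]) -> List[Action]:
    """Decide what an ETR sends after receiving a site-facing (S,G) Join."""
    rloc = mapping_db.get(s)
    if rloc is None:
        # Lookup failed: S is a routable address, so the source site is
        # treated as non-LISP. Forward the Join unchanged; no unicast
        # encapsulated Join is needed because there is no ITR to reach.
        return [("pim_join", s, g, None)]
    return [
        ("pim_join", rloc, g, None),          # (S-RLOC,G) Join into the core
        ("unicast_pim_join", s, g, rloc),     # (S-EID,G) Join sent to the ITR
    ]
```

With `mapping_db = {"10.1.1.1": "192.0.2.1"}`, a Join for (10.1.1.1, 232.1.1.1) produces both actions, while a Join for a source that is absent from the database produces only the unchanged native Join.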
9.1.4. Unicast LISP Source Site to Any Receiver Sites
In the last section, it was explained how an ETR in a multicast
receiver site can determine if a source multicast site is
LISP-enabled by looking into the mapping database. When the source
multicast site is a uLISP site, it is LISP enabled but the ITR, by
definition, is not capable of doing multicast encapsulation. So for
the purposes of multicast routing, the uLISP source multicast site is
treated as a non-LISP source multicast site.
Non-LISP receiver multicast sites can join distribution trees to a
uLISP source multicast site since the source site behaves, from a
forwarding perspective, as a non-LISP source site. This is also the
case for a uLISP receiver multicast site since the ETR does not have
multicast functionality built-in or enabled.
Special considerations are required for LISP receiver multicast
sites. Since the mapping lookup tells them only that the source
multicast site is LISP capable, the ETR cannot know if the ITR is
LISP-Multicast capable. To solve this problem,
each mapping database entry will have a multicast 2-tuple (Mpriority,
Mweight) per RLOC. When the Mpriority is set to 255, the site is
considered not multicast capable. So an ETR in a LISP receiver
multicast site can tell whether a LISP source multicast site
is a LISP-Multicast site or a uLISP site.
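A small sketch of how a receiver-site ETR could classify a source site from a mapping lookup. The record layout and the name `classify_source_site` are hypothetical. The text does not say how to treat a site where only some RLOCs carry Mpriority 255, so the sketch assumes a site is uLISP only when every RLOC does.

```python
from typing import List, Optional, Tuple

NOT_MULTICAST_CAPABLE = 255                  # Mpriority value from the text

# (RLOC, Mpriority, Mweight)
Locator = Tuple[str, int, int]


def classify_source_site(locators: Optional[List[Locator]]) -> str:
    """Classify a source site as seen by a LISP receiver-site ETR.

    locators is the locator-set from the mapping database, or None when
    the lookup on the source address failed.
    """
    if locators is None:
        return "non-LISP"                    # source address is routable
    # Assumption: treat the site as uLISP only if no RLOC is usable for
    # multicast, i.e. every Mpriority is 255.
    if all(mprio == NOT_MULTICAST_CAPABLE for _, mprio, _ in locators):
        return "uLISP"
    return "LISP-Multicast"
```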
9.1.5. LISP Source Site to Any Receiver Sites
When a LISP source multicast site has receivers in LISP, non-LISP,
and uLISP receiver multicast sites, it has a conflict about how it
sends multicast packets. The ITR can either encapsulate or natively
forward multicast packets. Since the receiver multicast sites are
heterogeneous in their behavior, one packet forwarding mechanism
cannot satisfy both. However, if a LISP receiver multicast site acts
like a uLISP site then it could receive packets like a non-LISP
receiver multicast site making all receiver multicast sites have
homogeneous behavior. However, this poses the following issues:
o LISP-NAT techniques with routable addresses would be required in
all cases.
o Or alternatively, mPTR deployment would be required forcing coarse
EID prefix advertisement in the core.
o But what is most disturbing is that when all sites that
participate are LISP-Multicast sites but then a non-LISP or uLISP
site joins the distribution tree, then the existing joined LISP
receiver multicast sites would have to change their behavior.
This would create too much dynamic tree-building churn to be a
viable alternative.
So the solution space options are:
1. Make the LISP ITR in the source multicast site send two packets,
one that is encapsulated with (S-RLOC,G) to reach LISP receiver
multicast sites and another that is not encapsulated with
(S-EID,G) to reach non-LISP and uLISP receiver multicast sites.
2. Make the LISP ITR always encapsulate packets with (S-RLOC,G) to
reach LISP-Multicast sites and to reach mPTRs that can
decapsulate and forward (S-EID,G) packets to non-LISP and uLISP
receiver multicast sites.
9.2. LISP Sites with Mixed Address Families
A LISP database mapping entry that describes the locator-set,
Mpriority and Mweight per locator address (RLOC), for an EID prefix
associated with a site could have RLOC addresses in either IPv4 or
IPv6 format. When a mapping entry has a mix of RLOC formatted
addresses, it is an implicit advertisement by the site that it is a
dual-stack site. That is, the site can receive IPv4 or IPv6 unicast
packets.
To distinguish if the site can receive dual-stack unicast packets as
well as dual-stack multicast packets, the Mpriority value setting
will be relative to an IPv4 or IPv6 RLOC. See [LISP] for packet format
details.
If you consider the combinations of LISP, non-LISP, and uLISP sites
sharing the same distribution tree, together with the capabilities
of supporting IPv4, IPv6, or dual-stack, the number of total
combinations grows beyond comprehension.
Using some combinatorial math, we have the following profiles of a
site and the combinations that can occur:
1. LISP-Multicast IPv4 Site
2. LISP-Multicast IPv6 Site
3. LISP-Multicast Dual-Stack Site
4. uLISP IPv4 Site
5. uLISP IPv6 Site
6. uLISP Dual-Stack Site
7. non-LISP IPv4 Site
8. non-LISP IPv6 Site
9. non-LISP Dual-Stack Site
Let's define (m n) = m!/(n!*(m-n)!), pronounced "m choose n", to
illustrate some combinatorial math below.
When 1 site talks to another site, the combinatorial is (9 2); when 1
site talks to another 2 sites, the combinatorial is (9 3). If we sum
this up to (9 9), we have:
(9 2) + (9 3) + (9 4) + (9 5) + (9 6) + (9 7) + (9 8) + (9 9) =
36 + 84 + 126 + 126 + 84 + 36 + 9 + 1
This brings the total number of cases to be considered to 502.
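The sum can be checked mechanically. The short sketch below recomputes each term with Python's `math.comb` and compares the total against the closed form 2^9 - 1 - 9, which counts all subsets of the 9 profiles except the empty set and the 9 single-profile subsets. The variable names are illustrative.

```python
from math import comb

SITE_PROFILES = 9

# (9 2) through (9 9), in the order they are listed in the text.
terms = [comb(SITE_PROFILES, k) for k in range(2, SITE_PROFILES + 1)]
total = sum(terms)

# All subsets, minus the empty set, minus the single-profile subsets.
closed_form = 2 ** SITE_PROFILES - 1 - SITE_PROFILES

print(terms)   # [36, 84, 126, 126, 84, 36, 9, 1]
print(total)   # 502
```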
This combinatorial gets even worse when you consider a site using one
address family inside of the site while the xTRs use the other address
family (as in using IPv4 EIDs with IPv6 RLOCs or IPv6 EIDs with IPv4
RLOCs).
To rationalize this combinatorial nightmare, there are some
guidelines which need to be put in place:
o Each distribution tree shared among sites will either be an IPv4
distribution tree or an IPv6 distribution tree. Therefore, we can
avoid head-end replication by building and sending packets on each
address family based distribution tree. Even though there might
be an urge to do multicast packet translation from one address
family format to the other, it is a non-viable over-complicated
urge.
o All LISP sites on a multicast distribution tree must share a
common address family which is determined by the source site's
locator-set in its LISP database mapping entry. All receiver
multicast sites will use the best RLOC priority controlled by the
source multicast site. This is true when the source site is
either LISP-Multicast or uLISP capable. This means that priority-
based policy modification is prohibited.
o When the source site is not LISP capable, the address family
depends on how receivers find the source and group information for
a multicast flow. That mechanism decides the address family for
the flow.
9.3. Making a Multicast Interworking Decision
This Multicast Interworking section has shown all combinations of
multicast connectivity that could occur. As you might have already
concluded, this can be quite complicated and if the design is too
ambitious, the dynamics of the protocol could cause a lot of
instability.
The trade-off decisions are hard to make and we want the same single
solution to work for both IPv4 and IPv6 multicast. It is imperative
to have an incrementally deployable solution for all of IPv4 unicast
and multicast and IPv6 unicast and multicast while minimizing (or
eliminating) both unicast and multicast EID namespace state.
Therefore the design decision to go with PTRs for unicast routing and
mPTRs for multicast routing seems to be the sweet spot in the
solution space so we can optimize state requirements and avoid head-
end data replication at ITRs.
10. Considerations when RP Addresses are Embedded in Group Addresses
When ASM and PIM-Bidir are used in an IPv6 inter-domain environment, a
technique exists to embed the unicast address of an RP in an IPv6
group address [RFC3956]. When routers in end sites process a PIM
Join/Prune message which contains an embedded-RP group address, they
extract the RP address from the group address and treat it as an
address from the EID namespace. However, core routers do not have
state for the EID namespace, so they need to extract an RP address
from the RLOC namespace.
Therefore, it is the responsibility of ETRs in multicast receiver
sites to map the group address into a group address where the
embedded-RP address is from the RLOC namespace. The mapped
RP-address is obtained from an EID-to-RLOC mapping database lookup. The
ETR will also send a unicast (*,G) Join/Prune message to the ITR so
the branch of the distribution tree from the source site resident RP
to the ITR is created.
This technique is no different than the techniques described in this
specification for translating (S,G) state and propagating Join/Prune
messages into the core. The only difference is that the (*,G) state
in Join/Prune messages is mapped because it contains unicast
addresses encoded in an Embedded-RP group address.
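To make the group-address mapping concrete, the sketch below extracts the RP address from an embedded-RP group address using the [RFC3956] field layout (flags, scope, reserved, RIID, plen, 64-bit network prefix, 32-bit group ID) and re-embeds a replacement RP. Everything beyond that layout is an assumption made for illustration: the function names are hypothetical, the mapping database is a plain dictionary from an RP address in the EID namespace to one in the RLOC namespace, and the replacement RP is assumed to already have the embeddable form (prefix, zeros, 4-bit RIID).

```python
import ipaddress
from typing import Dict

LOW64 = (1 << 64) - 1
IPv6 = ipaddress.IPv6Address


def extract_rp(group: str) -> IPv6:
    """Return the RP address embedded in an RFC 3956 group address."""
    g = int(IPv6(group))
    flags = (g >> 116) & 0xF
    if (g >> 120) != 0xFF or (flags & 0x7) != 0x7:   # need R, P, T all set
        raise ValueError("not an embedded-RP group address")
    riid = (g >> 104) & 0xF
    plen = (g >> 96) & 0xFF
    if not 1 <= plen <= 64:
        raise ValueError("invalid RP prefix length")
    prefix = (g >> 32) & LOW64
    prefix &= ~((1 << (64 - plen)) - 1) & LOW64      # keep first plen bits
    return IPv6((prefix << 64) | riid)


def embed_rp(group: str, rp: IPv6, plen: int = 64) -> IPv6:
    """Rewrite the RIID, plen, and prefix fields of group to embed rp."""
    g, r = int(IPv6(group)), int(rp)
    riid, prefix = r & 0xF, r >> 64
    if not 1 <= plen <= 64:
        raise ValueError("invalid RP prefix length")
    if riid == 0 or (r & LOW64) != riid or prefix & ((1 << (64 - plen)) - 1):
        raise ValueError("RP address cannot be embedded")
    keep = (0xFFFFF << 108) | 0xFFFFFFFF   # ff, flags, scope, rsvd, group ID
    return IPv6((g & keep) | (riid << 104) | (plen << 96) | (prefix << 32))


def map_group(group: str, mapping_db: Dict[IPv6, IPv6]) -> IPv6:
    """Map an embedded-RP group from the EID to the RLOC namespace."""
    rp_eid = extract_rp(group)
    rp_rloc = mapping_db[rp_eid]           # hypothetical mapping lookup
    return embed_rp(group, rp_rloc)
```

The mapped group address keeps the original flags, scope, and group ID; only the fields that encode the RP change.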
11. Taking Advantage of Upgrades in the Core
If the core routers are upgraded to support [RPFV] and [RFC5496],
then we can pass EID specific data through the core without,
possibly, having to store the state in the core.
By doing this we can eliminate the need for the ETR to unicast
encapsulate PIM Join/Prune messages to the source site's ITR.
However, this solution is restricted to a small set of workable cases,
which would not be good for general use of LISP-Multicast. Because of
this, and because of its slow convergence properties, it is not
recommended for LISP-Multicast.
12. Mtrace Considerations
Mtrace functionality must be consistent with unicast traceroute
functionality where all hops from multicast receiver to multicast
source are visible.
The design for mtrace for use in LISP-Multicast environments is to be
determined but should build upon the mtrace version 2 specified in
[MTRACE].
13. Security Considerations
Refer to the [LISP] specification.
14. Acknowledgments
The authors would like to gratefully acknowledge the people who have
contributed discussion, ideas, and commentary to the making of this
proposal and specification. People who provided expert review were
Scott Brim, Greg Shepherd, and Dave Oran. Others who provided
commentary in discussions at the Summer 2008 Dublin IETF were
Toerless Eckert and Ijsbrand Wijnands.
We would also like to thank the MBONED working group for constructive
and civil verbal feedback when this draft was presented at the Fall
2008 IETF in Minneapolis. In particular, good commentary came from
Tom Pusateri, Steve Casner, Marshall Eubanks, Dimitri Papadimitriou,
Ron Bonica, and Lenny Guardino.
An expert review of this specification was done by Yiqun Cai and
Liming Wei. We thank them for their detailed comments.
This work originated in the Routing Research Group (RRG) of the IRTF.
The individual submission [MLISP] was converted into this IETF LISP
working group draft.
15. References
15.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC3618] Fenner, B. and D. Meyer, "Multicast Source Discovery
Protocol (MSDP)", RFC 3618, October 2003.
[RFC3956] Savola, P. and B. Haberman, "Embedding the Rendezvous
Point (RP) Address in an IPv6 Multicast Address",
RFC 3956, November 2004.
[RFC4601] Fenner, B., Handley, M., Holbrook, H., and I. Kouvelas,
"Protocol Independent Multicast - Sparse Mode (PIM-SM):
Protocol Specification (Revised)", RFC 4601, August 2006.
[RFC4604] Holbrook, H., Cain, B., and B. Haberman, "Using Internet
Group Management Protocol Version 3 (IGMPv3) and Multicast
Listener Discovery Protocol Version 2 (MLDv2) for Source-
Specific Multicast", RFC 4604, August 2006.
[RFC4607] Holbrook, H. and B. Cain, "Source-Specific Multicast for
IP", RFC 4607, August 2006.
[RFC4760] Bates, T., Chandra, R., Katz, D., and Y. Rekhter,
"Multiprotocol Extensions for BGP-4", RFC 4760,
January 2007.
[RFC5015] Handley, M., Kouvelas, I., Speakman, T., and L. Vicisano,
"Bidirectional Protocol Independent Multicast (BIDIR-
PIM)", RFC 5015, October 2007.
[RFC5496] Wijnands, IJ., Boers, A., and E. Rosen, "The Reverse Path
Forwarding (RPF) Vector TLV", RFC 5496, March 2009.
15.2. Informative References
[ALT] Farinacci, D., Fuller, V., and D. Meyer, "LISP Alternative
Topology (LISP-ALT)", draft-ietf-lisp-alt-01.txt (work in
progress), May 2009.
[INTWORK] Lewis, D., Meyer, D., and D. Farinacci, "Interworking LISP
with IPv4 and IPv6", draft-ietf-lisp-interworking-00.txt
(work in progress), May 2009.
[LISP] Farinacci, D., Fuller, V., Meyer, D., and D. Lewis,
"Locator/ID Separation Protocol (LISP)",
draft-ietf-lisp-05.txt (work in progress), September 2009.
[MLISP] Farinacci, D., Meyer, D., Zwiebel, J., and S. Venaas,
"LISP for Multicast Environments",
draft-farinacci-lisp-multicast-01.txt (work in progress),
November 2008.
[MNAT] Wing, D. and T. Eckert, "Multicast Requirements for a
Network Address (and port) Translator (NAT)",
draft-ietf-behave-multicast-07.txt (work in progress),
June 2007.
[MTRACE] Asaeda, H., Jinmei, T., Fenner, W., and S. Casner, "Mtrace
Version 2: Traceroute Facility for IP Multicast",
draft-ietf-mboned-mtrace-v2-03.txt (work in progress),
March 2009.
[RPFV] Wijnands, IJ., Boers, A., and E. Rosen, "The RPF Vector
TLV", draft-ietf-pim-rpf-vector-06.txt (work in progress),
February 2008.
Appendix A. Document Change Log
A.1. Changes to draft-ietf-lisp-multicast-02.txt
o Posted September 2009.
o Added Document Change Log appendix.
o Specify that the LISP Encapsulated Control Message be used for
unicasting PIM Join/Prune messages from ETRs to ITRs.
A.2. Changes to draft-ietf-lisp-multicast-01.txt
o Posted November 2008.
o Specified that PIM Join/Prune unicast messages that get sent from
ETRs to ITRs of a source multicast site get LISP encapsulated in
destination UDP port 4342.
o Add multiple RLOCs per ITR per Yiqun's comments.
o Indicate how static RPs can be used when LISP is run using Bidir-
PIM in the core.
o Editorial changes per Liming's comments.
o Add Mtrace Considerations section.
A.3. Changes to draft-ietf-lisp-multicast-00.txt
o Posted April 2008.
o Renamed from draft-farinacci-lisp-multicast-01.txt.
Authors' Addresses
Dino Farinacci
cisco Systems
Tasman Drive
San Jose, CA
USA
Email: dino at cisco.com
Dave Meyer
cisco Systems
Tasman Drive
San Jose, CA
USA
Email: dmm at cisco.com
John Zwiebel
cisco Systems
Tasman Drive
San Jose, CA
USA
Email: jzwiebel at cisco.com
Stig Venaas
cisco Systems
Tasman Drive
San Jose, CA
USA
Email: stig at cisco.com