idnits 2.17.1

draft-ietf-mboned-auto-multicast-02.txt:
  ** The Abstract section seems to be numbered

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Looks like you're using RFC 2026 boilerplate. This must be updated to
     follow RFC 3978/3979, as updated by RFC 4748.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** The document is more than 15 pages and seems to lack a Table of Contents.

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** There are 8 instances of too long lines in the document, the longest one
     being 2 characters in excess of 72.

  ** The document seems to lack a both a reference to RFC 2119 and the
     recommended RFC 2119 boilerplate, even if it appears to use RFC 2119
     keywords.

     RFC 2119 keyword, line 512: '... local router, it SHOULD also function...'
     RFC 2119 keyword, line 516: '...ts. The gateway MUST also advertise i...'
     RFC 2119 keyword, line 578: '...he remaining 24 bits MUST be generated...'
     RFC 2119 keyword, line 590: '...a.b.c.d, then it MUST allocate SSM gro...'
     RFC 2119 keyword, line 595: '...or a given (source, group) pair MAY be...'
     (8 more instances...)

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not
     match the current year

  == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD',
     or 'RECOMMENDED' is not an accepted usage according to RFC 2119. Please
     use uppercase 'NOT' together with RFC 2119 keywords (if that is what you
     mean).

     Found 'MUST not' in this paragraph:

     4. A gateway or relay MUST not update its group state on receiving a
     membership report. Instead, it MUST generate a Membership Confirmation
     message to the source of the report. On receiving a Membership
     Acknowledgement, the group state should be updated only if the nonce in
     the acknowledgement matches the one in the confirmation message. This
     prevents an attacker from spoofing the source address of a membership
     report and causing a denial of service or reflection attack on the
     target.

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008. If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment. If not, you may need to add the pre-RFC5378 disclaimer.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- Couldn't find a document date in the document -- date freshness check
     skipped.

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: 'RFC1112' is mentioned on line 86, but not defined

  == Missing Reference: 'TBD IANA' is mentioned on line 240, but not defined

  -- Unexpected draft version: The latest known version of
     draft-holbrook-ssm-arch is -03, but you're referring to -04.

  -- Possible downref: Normative reference to a draft: ref. 'SSM'

  -- Obsolete informational reference (is this intentional?): RFC 3068 (ref.
     'ANYCAST') (Obsoleted by RFC 7526)

  -- Obsolete informational reference (is this intentional?): RFC 2362 (ref.
     'PIMSM') (Obsoleted by RFC 4601, RFC 5059)

     Summary: 5 errors (**), 0 flaws (~~), 5 warnings (==), 6 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.
--------------------------------------------------------------------------------
  2  MBoneD Working Group                                         Dave Thaler
  3  Internet Draft                                              Mohit Talwar
  4  Document: draft-ietf-mboned-auto-multicast-02.txt          Amit Aggarwal
  5  February 9, 2004                                               Microsoft
  6                                                          Lorenzo Vicisano
  7                                                                     Cisco
  8                                                                 Dirk Ooms
  9                                                                   Alcatel

 11          IPv4 Automatic Multicast Without Explicit Tunnels (AMT)

 13  Status of this Memo

 15     This document is an Internet-Draft and is in full conformance with
 16     all provisions of Section 10 of RFC2026.

 18     Internet-Drafts are working documents of the Internet Engineering
 19     Task Force (IETF), its areas, and its working groups. Note that
 20     other groups may also distribute working documents as Internet-
 21     Drafts.

 23     Internet-Drafts are draft documents valid for a maximum of six
 24     months and may be updated, replaced, or obsoleted by other documents
 25     at any time. It is inappropriate to use Internet-Drafts as
 26     reference material or to cite them other than as "work in progress."

 28     The list of current Internet-Drafts can be accessed at
 29          http://www.ietf.org/ietf/1id-abstracts.txt

 31     The list of Internet-Draft Shadow Directories can be accessed at
 32          http://www.ietf.org/shadow.html.

 34  Copyright Notice

 36     Copyright (C) The Internet Society (2004). All Rights Reserved.

 38  1. Abstract

 40     Automatic Multicast Tunneling (AMT) allows multicast communication
 41     amongst isolated multicast-enabled sites or hosts, attached to a
 42     network which has no native multicast support. It also enables them
 43     to exchange multicast traffic with the native multicast
 44     infrastructure (MBone) and does not require any manual
 45     configuration. AMT uses an encapsulation interface so that no
 46     changes to a host stack or applications are required, all protocols
 47     (not just UDP) are handled, and there is no additional overhead in
 48     core routers.

 50  2. Introduction

 52     The primary goal of this document is to foster the deployment of
 53     native IP multicast by enabling a potentially large number of nodes
 54     to connect to the already present multicast infrastructure.
 55     Therefore, the techniques discussed here should be viewed as an
 56     interim solution to help in the various stages of the transition to
 57     a native multicast network.

 59     To allow fast deployment, the solution presented here only requires
 60     small and concentrated changes to the network infrastructure, and no
 61     changes at all to user applications or to the socket API of end-
 62     nodes' operating systems. The protocols introduced in this
 63     specification are implemented in a few strategically-placed network
 64     nodes and in user-installable software modules (pseudo device
 65     drivers and/or user-mode daemons) that reside underneath the socket
 66     API of end-nodes' operating systems. This mechanism is very similar
 67     to that used by "6to4" [6TO4, ANYCAST] to get automatic IPv6
 68     connectivity.

 70     Effectively, AMT treats the unicast-only internetwork as a large
 71     non-broadcast multi-access (NBMA) link layer, over which we require
 72     the ability to multicast. To do this, multicast packets being sent
 73     to or from a site must be encapsulated in unicast packets. If the
 74     group has members in multiple sites, AMT encapsulation of the same
 75     multicast packet will take place multiple times by necessity.

 77     The following problems are addressed:

 79     1. Allowing isolated sites/hosts to receive the SSM flavor of
 80        multicast ([SSM]).

 82     2. Allowing isolated sites/hosts to transmit the SSM flavor of
 83        multicast.

 85     3. Allowing isolated sites/hosts to receive general multicast (ISM
 86        [RFC1112]).

 88     This document does not address allowing isolated sites/hosts to
 89     transmit general multicast. We expect that other solutions (e.g.,
 90     Tunnel Brokers, a la [BROKER]) will be used for sites that desire
 91     this capability.

 93  3. Definitions

 95     +---------------+      Internet      +---------------+
 96     |   AMT Site    |                    |     MBone     |
 97     |               |        AMT         |               |
 98     |        +------+----+  Relay   +----+----+   AMT    |
 99     |        |AMT Gateway| Anycast  |AMT Relay|  Subnet  |
100     |        |   +-----+-+  Prefix  +-+-----+ |  Prefix  |
101     |        |   |AMT IF | <--------|AMT IF | |--------> |
102     |        |   +-----+-+          +-+-----+ |          |
103     |        +------+----+          +----+----+          |
104     |               |                    |               |
105     +---------------+                    +---------------+

107           Figure 1: Automatic Multicast Definitions.

109     AMT Pseudo-Interface

111        AMT encapsulation of multicast packets inside unicast packets
112        occurs at a point that is logically equivalent to an interface,
113        with the link layer being the unicast-only network. This point
114        is referred to as a pseudo-interface. Some implementations may
115        treat it exactly like any other interface and others may treat
116        it like a tunnel end-point.

118     AMT Gateway

120        A host, or a site gateway router, supporting an AMT Pseudo-
121        Interface. It does not have native multicast connectivity to
122        the native multicast backbone infrastructure. It is simply
123        referred to in this document as a "gateway".

125     AMT Site

127        A multicast-enabled network not connected to the multicast
128        backbone served by an AMT Gateway. It could also be a stand-
129        alone AMT Gateway.

131     AMT Relay Router

133        A multicast router configured to support transit routing
134        between AMT Sites and the native multicast backbone
135        infrastructure. The relay router has one or more interfaces
136        connected to the native multicast infrastructure, zero or more
137        interfaces connected to the non-multicast capable internetwork,
138        and an AMT pseudo-interface. It is simply referred to in this
139        document as a "relay".

141        As with [6TO4], we assume that normal multicast routers do not
142        want to be tunnel endpoints (especially if this results in high
143        fanout), and similarly that service providers do not want
144        encapsulation to arbitrary routers. Instead, we assume that
145        special-purpose routers will be deployed that are suitable for
146        serving as relays.

148     AMT Relay Anycast Prefix

150        A well-known address prefix used to advertise (into the unicast
151        routing infrastructure) a route to an available AMT Relay
152        Router. This could also be private (i.e. not well-known) for a
153        private relay.

155        The value of this prefix is x.x.x.0/nn [length and value TBD
156        IANA].

158     AMT Relay Anycast Address

160        An anycast address which is used to reach the nearest AMT Relay
161        Router.

163        This address corresponds to host number 1 in the AMT Relay
164        Anycast Prefix, x.x.x.1.

166     AMT Unicast Autonomous System ID

168        A 16-bit Autonomous System ID, for use in BGP in accordance with
169        this memo. AS 10888 might be usable for this, but for now
170        we'll assume it's different, to avoid confusion. This number
171        represents a "pseudo-AS" common to all AMT relays using the
172        well-known AMT Relay Anycast Prefix (private relays use their
173        own ID).

175        To protect themselves from erroneous advertisements, managers
176        of border routers often use databases to check the relation
177        between the advertised network and the last hop in the AS path.
178        Associating a specific AS number with the AMT Relay Anycast
179        Address allows us to enter this relationship in the databases
180        used to check inter-domain routing [ANYCAST].

182     AMT Subnet Prefix

184        A well-known address prefix used to advertise (into the M-RIB
185        of the native multicast-enabled infrastructure) a route to AMT
186        Sites. This prefix will be used to enable sourcing SSM traffic
187        from an AMT Gateway.

189     AMT Gateway Anycast Address

191        An anycast address in the AMT Subnet Prefix range, which is
192        used by an AMT Gateway to enable sourcing SSM traffic from
193        local applications.

195     AMT Multicast Autonomous System ID

197        A 16-bit Autonomous System ID, for use in MBGP in accordance with
198        this memo. This number represents a "pseudo-AS" common to all
199        AMT relays using the well-known AMT Subnet Prefix (private
200        relays use their own ID). We assume that the existing AS 10888
201        is suitable for this purpose. (Note: if this is a problem, a
202        different one would be fine.)

204  4. Overview

206  4.1. Receiving Multicast in an AMT Site

208     +---------------+          Internet          +---------------+
209     |   AMT Site    |                            |     MBone     |
210     |               |       2. IGMP Report       |               |
211     | 1. Join   +---+---+ =================> +---+---+           |
212     |     +---->|Gateway|                    | Relay |           |
213     |     |     +---+---+ <================= +---+---+           |
214     |   R-+         |          3. Data           |               |
215     +---------------+                            +---------------+

217           Figure 2: Receiving Multicast in an AMT Site.

219     AMT relays and gateways cooperate to transmit multicast traffic
220     sourced within the native multicast infrastructure to AMT sites:
221     relays receive the traffic natively and unicast-encapsulate it to
222     gateways; gateways decapsulate the traffic and possibly forward it
223     into the AMT site.

225     Each gateway has an AMT pseudo-interface that serves as a default
226     multicast route. Requests to join a multicast session are sent to
227     this interface and encapsulated to a particular relay reachable
228     across the unicast-only infrastructure.

230     Each relay has an AMT pseudo-interface too. Multicast traffic sent
231     on this interface is encapsulated to zero or more gateways that have
232     joined to the relay. The AMT recipient-list is determined for each
233     multicast session. This requires the relay to keep state for each
234     gateway which has joined a particular group or (source, group)
235     pair. Multicast packets from the native infrastructure behind the
236     relay will be sent to each gateway which has requested them.

238     All multicast packets (data and control) are encapsulated in unicast
239     packets. To work across NATs, the encapsulation is done over UDP
240     using a well-known port number [TBD IANA].
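
   Illustrative sketch (not part of the specification): the per-session
   recipient list kept by a relay could be modelled roughly as below.
   The class and method names are hypothetical. Each gateway is
   represented here by an (IP address, UDP port) tuple, and a source of
   None is an assumed convention standing for a group-only join.

```python
from collections import defaultdict


class RecipientList:
    """Hypothetical relay-side state: gateway endpoints per multicast session."""

    def __init__(self):
        # Key: (source, group). source is None for a group-only join.
        # Value: set of (ip, udp_port) tuples identifying gateways.
        self._members = defaultdict(set)

    def join(self, group, endpoint, source=None):
        """Record that the gateway at `endpoint` wants this session."""
        self._members[(source, group)].add(endpoint)

    def leave(self, group, endpoint, source=None):
        """Remove the gateway and drop the session entry once it is empty."""
        key = (source, group)
        self._members[key].discard(endpoint)
        if not self._members[key]:
            del self._members[key]

    def recipients(self, source, group):
        """Gateways that should receive a packet sent by `source` to `group`."""
        specific = self._members.get((source, group), set())
        any_source = self._members.get((None, group), set())
        return specific | any_source
```

   A relay forwarding a native packet would call recipients() once per
   packet and unicast-encapsulate one copy to each returned endpoint.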
242     Each relay, plus the set of all gateways (perhaps unknown to the
243     relay) using the relay, together can be thought of as being on a
244     separate logical NBMA link. This implies that the AMT recipient-
245     list is a list of "link layer" addresses which are (IP address, UDP
246     port) pairs.

248     Since the number of gateways using a relay can be quite large, and
249     we expect that most sites will not want to receive most groups, an
250     explicit-joining protocol is required for gateways to communicate
251     group membership information to a relay. The two most likely
252     candidates are the IGMP [IGMPv3] protocol, and the PIM-Sparse Mode
253     [PIMSM] protocol. Since an AMT gateway may be a host, and hosts
254     typically do not implement routing protocols, gateways will use IGMP
255     as described in Section 6 below. This allows a host kernel (or a
256     pseudo device driver) to easily implement AMT gateway behavior, and
257     relieves the relay of the need to know whether a given gateway is
258     a host or a router. From the relay's perspective, all gateways are
259     indistinguishable from hosts on an NBMA leaf network.

261  4.1.1. Scalability Considerations

263     The requirement that a relay keep group state per gateway that has
264     joined the group introduces potential scalability concerns.

266     However, scalability of AMT can be achieved by adding more relays,
267     and using an appropriate relay discovery mechanism for gateways to
268     discover relays. The solution we adopt is to assign an anycast
269     address to relays. However, simply sending periodic IGMP Reports to
270     the anycast address can cause duplicates. Specifically, if routing
271     changes such that a different relay receives a periodic IGMP Report,
272     both the new and old relays will encapsulate data to the AMT site
273     until the old relay's state times out. This is obviously
274     undesirable. Instead, we use the anycast address merely to find a
275     unicast address which can then be used.
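
   Illustrative sketch (not part of the specification): the anycast-to-
   unicast resolution step uses the Relay Discovery and Relay
   Advertisement payloads of Sections 5.1 and 5.2 (8-bit type, 24-bit
   nonce, and for the advertisement a 32-bit relay address). The
   function names are hypothetical, and network byte order is an
   assumption, since this memo does not state the byte order explicitly.

```python
import ipaddress
import secrets
import struct

TYPE_DISCOVERY = 0x1      # Section 5.1
TYPE_ADVERTISEMENT = 0x2  # Section 5.2


def build_discovery(nonce=None):
    """Return (payload, nonce) for a Relay Discovery message."""
    if nonce is None:
        nonce = secrets.randbits(24)  # 24-bit random value
    word = (TYPE_DISCOVERY << 24) | nonce
    return struct.pack("!I", word), nonce


def build_advertisement(nonce, relay_addr):
    """Return the payload a relay would send back, echoing the nonce."""
    word = (TYPE_ADVERTISEMENT << 24) | nonce
    return struct.pack("!I4s", word, ipaddress.IPv4Address(relay_addr).packed)


def parse_advertisement(payload, expected_nonce):
    """Return the relay's unicast address, or None if the reply is not ours."""
    if len(payload) != 8:
        return None
    word, addr = struct.unpack("!I4s", payload)
    if word >> 24 != TYPE_ADVERTISEMENT:
        return None
    if word & 0xFFFFFF != expected_nonce:
        return None  # nonce mismatch: ignore the advertisement
    return str(ipaddress.IPv4Address(addr))
```

   A gateway would send the discovery payload over UDP to the AMT Relay
   Anycast Address and then direct all later messages to the address
   returned by parse_advertisement().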
277     Since adding another relay has the result of adding another
278     independent NBMA link, this allows the gateways to be spread out
279     among more relays so as to keep the number of gateways per relay at
280     a reasonable level.

282  4.1.2. Spoofing Considerations

284     An attacker could affect the group state in the relay by spoofing
285     the source address in the join or leave reports. This can be used to
286     launch reflection or denial of service attacks on the target. Such
287     attacks can be mitigated by using a three-way handshake between the
288     gateway and the relay for each multicast membership report. On
289     receiving an IGMP report, the relay sends a message to the source of
290     the report with the original report as well as a nonce. The state in
291     the relay is updated only on receiving a confirmation for the report
292     with the nonce in it.

294  4.2. Sourcing Multicast from an AMT site

296     Two cases are discussed below: multicast traffic sourced in an AMT
297     site and received in the MBone, and multicast traffic sourced in an
298     AMT site and received in another AMT site.

300     In both cases only SSM sources are supported. Furthermore this
301     specification only deals with the source residing directly in the
302     gateway. To enable a generic node in an AMT site to source
303     multicast, additional coordination between the gateway and the
304     source-node is required.

306     The general mechanism used to join towards AMT sources is based on
307     the following:

309     1. Applications residing in the gateway use addresses in the AMT
310     Subnet Prefix to send multicast, as a result of sourcing traffic on
311     the AMT pseudo-interface.

313     2. The AMT Subnet Prefix is advertised for RPF reachability in the
314     M-RIB by relays and gateways.

316     3. Relays or gateways that receive a join for a source/group pair
317     use information encoded in the address pair to rebuild the address
318     of the gateway (source) to which to encapsulate the join (see
319     Section 6 for more details). The membership reports use the same
320     three-way handshake as outlined in Section 4.1.2.

322  4.2.1. Supporting Site-MBone Multicast

324     +---------------+          Internet          +---------------+
325     |   AMT Site    |                            |     MBone     |
326     |               |       2. IGMP Report       |               |
327     |           +---+---+ <================= +---+---+ 1. Join   |
328     |           |Gateway|                    | Relay |<-----+    |
329     |           +---+---+ =================> +---+---+      |    |
330     |               |          3. Data           |          +-R  |
331     +---------------+                            +---------------+

333           Figure 3: Site-MBone Multicast.

335     If a relay receives an explicit join from the native infrastructure,
336     for a given (source, group) pair where the source address belongs to
337     the AMT Subnet Prefix, then the relay will periodically (using the
338     rules specified in Section 6) UDP encapsulate an IGMP Report for the
339     group to the gateway. The gateway must keep state per relay from
340     which an IGMP Report has been received, and forward multicast traffic
341     from the site to all relays from which IGMP Reports have been
342     received. The choice of whether this state and replication is done
343     at the link-layer (i.e., by the tunnel interface) or at the network-
344     layer is implementation-dependent.

346     If there are multiple relays present, this ensures that data from
347     the AMT site is received via the closest relay to the receiver. This
348     is necessary when the routers in the native multicast infrastructure
349     employ Reverse-Path Forwarding (RPF) checks against the source
350     address, such as occurs when [PIMSM] is used by the multicast
351     infrastructure.

353     The solution above will scale to an arbitrary number of relays, as
354     long as the number of relays requiring multicast traffic from a
355     given AMT site remains reasonable enough to not overly burden the
356     site's gateway.

358  4.2.2. Supporting Site-Site Multicast

360     +---------------+          Internet          +---------------+
361     |   AMT Site    |                            |   AMT Site    |
362     |               |       2. IGMP Report       |               |
363     |           +---+---+ <================= +---+---+ 1. Join   |
364     |           |Gateway|                    |Gateway|<-----+    |
365     |           +---+---+ =================> +---+---+      |    |
366     |               |          3. Data           |          +-R  |
367     +---------------+                            +---------------+

369           Figure 4: Site-Site Multicast.

371     Since we require gateways to accept IGMP Reports, as described
372     above, it is also possible to support multicast among AMT sites,
373     without requiring assistance from any relays.

375     When a gateway wants to join a given (source, group) pair, where the
376     source address belongs to the AMT Subnet Prefix, then the gateway
377     will periodically unicast encapsulate an IGMPv3 [IGMPv3] Report
378     directly to the site gateway for the source.

380     We note that this can result in a significant amount of state at a
381     site gateway sourcing multicast to a large number of other AMT
382     sites. However, it is expected that this is not unreasonable for
383     two reasons. First, the gateway does not have native multicast
384     connectivity, and as a result is likely doing unicast replication at
385     present. The amount of state is thus the same as what such a site
386     already deals with. Secondly, any site expecting to source traffic
387     to a large number of sites could get a point-to-point tunnel to the
388     native multicast infrastructure, and use that instead of AMT.

390  5. Message Formats

392  5.1. AMT Relay Discovery

394     The AMT Relay Discovery message is a UDP packet sent from the AMT
395     gateway unicast address to the AMT relay anycast address to discover
396     the unicast address of an AMT relay. The payload of the UDP packet
397     contains the following fields.

399      0                   1                   2                   3
400      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
401     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
402     |   Type=0x1    |                     Nonce                     |
403     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

405     Fields:

407     Type
408        The type of the message.

410     Nonce
411        A 24-bit random value generated by the gateway and
412        replayed by the relay.

414  5.2. AMT Relay Advertisement

416     The AMT Relay Advertisement message is a UDP packet sent from the
417     AMT relay anycast address to the source of the discovery message.
418     The payload of the UDP packet contains the following fields.

420      0                   1                   2                   3
421      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
422     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
423     |   Type=0x2    |                     Nonce                     |
424     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
425     |                         Relay Address                         |
426     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

428     Fields:

430     Type
431        The type of the message.

433     Nonce
434        A 24-bit random value replayed from the discovery
435        message.

437     Relay Address
438        The unicast IP address of the AMT relay.

440  5.3. Membership Report Confirmation

442     The membership report confirmation is a UDP packet sent by the
443     gateway or relay to the source of a multicast membership report.

445      0                   1                   2                   3
446      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
447     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
448     |   Type=0x3    |                     Nonce                     |
449     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
450     |                       Multicast Report                        |
451     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

453     Fields:

455     Type
456        The type of the message.

458     Nonce
459        A 24-bit random value generated by the relay or gateway
460        on receiving a multicast report.

462     Multicast Report
463        The complete multicast report that the relay or gateway
464        is trying to confirm.

466  5.4. Membership Report Acknowledgement

468     The membership report acknowledgement is a UDP packet sent by the
469     source of a membership report to a gateway or relay.

471      0                   1                   2                   3
472      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
473     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
474     |   Type=0x3    |                     Nonce                     |
475     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
476     |                       Multicast Report                        |
477     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

479     Fields:

481     Type
482        The type of the message.

484     Nonce
485        A 24-bit random value replayed from the confirmation
486        message.

488     Multicast Report
489        The complete multicast report that the relay or gateway
490        is trying to confirm.

492  6. AMT Gateway Details

494     This section details the behavior of an AMT Gateway, which may be a
495     router serving an AMT site, or the site may consist of a single
496     host, serving as its own gateway.

498  6.1. At Startup Time

500     At startup time, the AMT gateway will bring up an AMT pseudo-
501     interface, to be used for encapsulation. The gateway will then send
502     an AMT Relay Discovery message to the AMT Relay Anycast Address, and
503     note the unicast address (which is treated as a link-layer address
504     to the encapsulation interface) from the AMT Relay Advertisement
505     message. This discovery should be done periodically (e.g., once a
506     day) to re-resolve the unicast address of a close relay. The
507     gateway also initializes a timer used to send periodic IGMP Reports
508     to a random value from the interval [0, [Query Interval]] before
509     sending the first periodic report, in order to prevent startup
510     synchronization (e.g., after a power outage).

512     If the gateway is serving as a local router, it SHOULD also function
513     as an IGMP Proxy, as described in [IGMPPROXY], with its IGMP host-
514     mode interface being the AMT pseudo-interface. This enables it to
515     translate group memberships on its downstream interfaces into IGMP
516     Reports. The gateway MUST also advertise itself as the default
517     route for multicast in the M-RIB (or it must be the default unicast
518     router if unicast and multicast topologies are congruent). Also, if
519     a shared tree routing protocol is used inside the AMT site, each
520     tree-root must be a gateway, e.g., in PIM-SM each RP must be a
521     gateway.

523     Finally, to support sourcing traffic to SSM groups by a gateway with
524     a global unicast address, the AMT Subnet Prefix is treated as the
525     subnet prefix of the AMT pseudo-interface, and an anycast address is
526     added on the interface. This anycast address is formed by
527     concatenating the AMT Subnet Prefix followed by the high bits of the
528     gateway's global unicast address. For example, if IANA assigns the
529     prefix x.y/16 as the AMT Subnet Prefix, and the gateway has global
530     unicast address a.b.c.d, then the AMT Gateway's Anycast Address will
531     be x.y.a.b. Note that multiple gateways might end up with the same
532     anycast address assigned to their pseudo-interfaces.

534  6.2. Joining Groups with MBone Sources

536     The IGMP protocol usually operates by having the Querier multicast
537     an IGMP Query message on the link. This behavior does not work on
538     NBMA links which do not support multicast. Since the set of
539     gateways is typically unknown to the relay (and potentially quite
540     large), unicasting the queries is also impractical. The following
541     behavior is used instead.

543     Applications residing in a gateway should join groups on the AMT
544     pseudo-interface, causing IGMP Membership Reports to be sent over
545     that interface. When UDP encapsulating the IGMP Reports (and in
546     fact any other messages, unless specified otherwise in this
547     document), the destination address in the outer IP header is the
548     relay's unicast address. To provide robustness, gateways unicast
549     IGMP Reports to the relay every [Query Interval] seconds (defined
550     as 125 in [IGMPv3]). The gateway also needs to respond to Membership
551     Confirmation messages sent by the relay with a Membership
552     Acknowledgement message.

554     Generating periodic reports can be done in any implementation-
555     specific manner. Some possibilities include:

557     1. The AMT pseudo-interface might periodically manufacture IGMPv3
558     Queries as if they had been received from an IGMP Querier, and
559     deliver them to the IP layer, after which normal IGMP behavior will
560     cause the appropriate reports to be sent.

562     2. The IGMP module itself might provide an option to operate in
563     periodic mode on specific interfaces.

565  6.3. Responding to Relay Changes

567     When a gateway determines that its current relay is unreachable
568     (e.g., upon receipt of an ICMP Unreachable message for the relay's
569     unicast address), it immediately repeats the unicast address
570     resolution step by sending a UDP encapsulated ICMP Echo Request to
571     the AMT Relay Anycast Address, and storing the source address of the
572     UDP encapsulated ICMP Echo Response as the new unicast address to
573     use as a "link-layer" destination.

575  6.4. Creating SSM groups

577     When a gateway wants to create an SSM group (i.e., in 232/8) for
578     which it can source traffic, the remaining 24 bits MUST be generated
579     as described below. ([SSM] states that "the policy for allocating
580     these bits is strictly locally determined at the sender's host.")

582     When the gateway determined its AMT Gateway Anycast Address as
583     described above, it used the high bits of its global unicast
584     address. The remaining bits of its global unicast address are
585     appended to the 232/8 prefix, and any spare bits may be allocated
586     using any policy (again, strictly locally determined at the sender's
587     host).
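
   Illustrative sketch (not part of the specification): the split of a
   gateway's global unicast address between its AMT Gateway Anycast
   Address and its SSM group range can be written out as follows. The
   function name is hypothetical, and 10.99.0.0/16 is only a placeholder
   because the AMT Subnet Prefix has not yet been assigned. The sketch
   assumes a prefix length of at most 24, so that the remaining unicast
   bits fit after 232/8, and leaves the spare bits at the low end of the
   group address.

```python
import ipaddress


def amt_gateway_addresses(subnet_prefix, unicast):
    """Return (anycast_address, ssm_group_range) for a gateway.

    subnet_prefix: the AMT Subnet Prefix, e.g. "10.99.0.0/16" (placeholder).
    unicast:       the gateway's global unicast IPv4 address.
    """
    net = ipaddress.IPv4Network(subnet_prefix)
    uni = int(ipaddress.IPv4Address(unicast))
    p = net.prefixlen
    if not 0 < p <= 24:
        raise ValueError("sketch assumes a prefix length between 1 and 24")

    high = uni >> p               # top (32 - p) bits of the unicast address
    low = uni & ((1 << p) - 1)    # remaining p bits of the unicast address

    # Anycast address: subnet prefix followed by the high unicast bits.
    anycast = ipaddress.IPv4Address(int(net.network_address) | high)

    # Group range: 232/8, then the low unicast bits, then the spare bits.
    spare = 24 - p
    base = (232 << 24) | (low << spare)
    groups = ipaddress.IPv4Network((base, 32 - spare))
    return anycast, groups
```

   With a /16 prefix the spare field is 8 bits wide, so a gateway can
   create 256 groups; with a /8 prefix it is 16 bits wide.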
589 For example, if the AMT Subnet Prefix is x.y/16, and the device has 590 global unicast address a.b.c.d, then it MUST allocate SSM groups in 591 the range 232.c.d/24. 593 6.5. Joining SSM Groups with AMT Sources 595 An IGMPv3 Report for a given (source, group) pair MAY be 596 encapsulated directly to the source, when the source address belongs 597 to the AMT Subnet Prefix. 599 The "link-layer" address to use as the destination address in the 600 outer IP header is obtained as follows. The source address in the 601 inclusion list of the IGMPv3 report will be an AMT Gateway Anycast 602 Address with the high bits of the address, and the remaining bits 603 will be in the middle of the group address. 605 For example, if the AMT Subnet Prefix is x.y/16, and the IGMPv3 606 Report is for (x.y.a.b, 232.c.d.e), then the "link layer" 607 destination address used for encapsulation is a.b.c.d. 609 6.6. Receiving IGMPv3 Reports on the AMT Interface 611 When an IGMPv3 report is received on the AMT pseudo-interface, and 612 the report is a request to join a given (S, G) pair, then the 613 following actions are taken. 615 If S is not the AMT Gateway Anycast Address of the gateway, the 616 packet is dropped. If G does not contain the low bits of the global 617 unicast address (as described above), the packet is also dropped. 619 Otherwise, the gateway sends a Membership Confirmation message to 620 the source of the IGMPv3 report. The message contains a random 621 nonce. On receiving a Membership Acknowledgement message, the 622 gateway verifies that the nonce in the acknowledgement is the same 623 as the one in the confirmation message. If the two differ, the 624 message is dropped without any change to the gateway state. If the 625 two nonces are the same, the gateway adds the source address (from 626 the outer IP header) and UDP port of the report to a membership list 627 for G. Maintaining this membership list may be done in any 628 implementation-dependent manner. 
For example, it might be 629 maintained by the "link-layer" inside the AMT pseudo-interface, 630 making it invisible to the normal IGMP module. 632 6.7. Sending data to SSM groups 634 When multicast packets are sent on the AMT pseudo-interface, they 635 are encapsulated as follows. If the group address is not an SSM 636 group, then the packet is dropped (this memo does not currently 637 provide a way to send to non-SSM groups). 639 If the group address is an SSM group, then the packet is unicast 640 encapsulated to each remote node from which the gateway has received 641 an IGMPv3 report for the packet's (source, group) pair. 643 7. Relay Router Details 645 7.1. At startup time 647 At startup time, the relay router will bring up an NBMA-style AMT 648 pseudo-interface. It shall also add the AMT Relay Anycast Address 649 on some interface. 651 The relay router shall then advertise the AMT Relay Anycast Prefix 652 into the unicast-only Internet, as if it were a connection to an 653 external network. When the advertisement is done using BGP, the AS 654 path leading to the AMT Relay Anycast Prefix shall include the 655 identifier of the local AS and the AMT Unicast Autonomous System ID. 657 The relay router shall also enable IGMPv3 on the AMT pseudo- 658 interface, except that it shall not multicast Queries (this might be 659 done, for example, by having the AMT pseudo-device drop them, or by 660 having the IGMP module not send them in the first place). 662 Finally, to support sourcing SSM traffic from AMT sites, the AMT 663 Subnet Prefix is assigned to the AMT pseudo-interface, and the AMT 664 Subnet Prefix is injected into the M-RIB of MBGP. 666 7.2. Receiving Echo Requests to the Anycast Address 668 When a relay receives a AMT Relay Discovery message directed to the 669 AMT Relay Anycast Address, it should respond with a AMT Relay 670 Advertisement containing its unicast address. 
The source and 671 destination addresses of the advertisement should be the same as the 672 destination and source addresses of the discovery message, 673 respectively. Further, the nonce in the discovery message MUST be 674 copied into the advertisement message. 676 7.3. Receiving Joins from AMT Gateways 678 The relay operates passively, sending no Queries but simply tracking 679 membership information according to Reports and Leave messages, as a 680 router normally would. In addition, the relay must do explicit 681 membership tracking of which gateways on the AMT pseudo- 682 interface have joined which groups. On receiving a membership 683 report, the relay generates a Membership Confirmation message with 684 a random nonce in it. On receiving a Membership Acknowledgement, it 685 updates the group state if the nonce in the reply matches the one in 686 the confirmation message. When data arrives for that group, the 687 traffic must be encapsulated to each gateway that has joined that 688 group. 690 The explicit membership tracking and unicast replication may be done 691 in any implementation-specific manner. Some examples are: 693 1. The AMT pseudo-device driver might track the group information 694 and perform the replication at the "link-layer", with no changes to 695 a pre-existing IGMP module. 697 2. The IGMP module might have native support for explicit membership 698 tracking, especially if it supports other NBMA-style interfaces. 700 7.4. Receiving (S,G) Joins from the Native Side, for AMT 701 Sources 703 The relay encapsulates an IGMPv3 report to the AMT source as 704 described above in Section 6.5. 706 8. IANA Considerations 708 The IANA should allocate a prefix dedicated to the public AMT Relays 709 to the native multicast backbone. The prefix length should be 710 determined by the IANA; the prefix should be large enough to 711 guarantee advertisement in the default-free BGP networks; a length 712 of 16 will meet this requirement.
This is a one-time effort; there 713 is no need for any recurring assignment after this stage. 715 The IANA should also allocate an Autonomous System ID which can be 716 used as a pseudo-AS when advertising routes to the above prefix. 717 Furthermore, to support sourcing SSM traffic from AMT gateways, the 718 IANA should allocate a subnet prefix dedicated to the AMT link. The 719 prefix length should be determined by the IANA; the prefix should be 720 large enough to guarantee advertisement in the default-free BGP 721 networks; a length of 16 will meet this requirement. This is a 722 one-time effort; there is no need for any recurring assignment after 723 this stage. Note that this prefix length 724 directly affects the number of groups available to be created by the 725 AMT gateway: a length of 16 leaves 8 free bits in the group address and gives 256 groups, and a length of 8 leaves 16 free bits and 726 gives 65536 groups. For diagnostic purposes, it is helpful to have 727 a prefix length which is a multiple of 8, although this is not 728 required. 730 An autonomous system number dedicated to a pseudo-AS for multicast 731 is already in use today (AS 10888), and so it is expected that no 732 additional AS number is required for this prefix. 734 Finally, IANA should reserve a well-known UDP port number for AMT 735 encapsulation. 737 9. Security Considerations 738 The anycast technique introduces a risk that a rogue router or a 739 rogue AS could introduce a bogus route to the AMT Relay Anycast 740 Prefix, and thus divert the traffic. Network managers have to 741 guarantee the integrity of their routing to the AMT Relay Anycast 742 Prefix in much the same way that they guarantee the integrity of all 743 other routes. 745 Within the native MBGP infrastructure, there is a risk that a rogue 746 router or a rogue AS could introduce a bogus route to the AMT Subnet 747 Prefix, and thus divert joins and cause RPF failures of multicast 748 traffic.
Again, network managers have to guarantee the integrity of 749 the MBGP routing to the AMT Subnet Prefix in much the same way that 750 they guarantee the integrity of all other routes in the M-RIB. 752 Gateways and relays will accept and decapsulate multicast traffic 753 from any source from which regular unicast traffic is accepted. If 754 this is for any reason felt to be a security risk, then additional 755 source-address-based packet filtering MUST be applied: 757 1. To prevent a rogue sender (one that cannot do traditional spoofing 758 because of, e.g., access lists deployed by its ISP) from using AMT 759 to send packets to an SSM tree, a relay that receives an 760 encapsulated multicast packet MUST discard the multicast packet if 761 the IPv4 source address in the outer header is not composed of the 762 last 2 bytes of the source address and the 2 middle bytes of the 763 destination address of the inner header (i.e., a.b.c.d must be 764 composed of the a.b of x.y.a.b and the c.d of 232.c.d.e). 766 2. A gateway MUST discard encapsulated multicast packets if the 767 source address in the outer header is not the address to which the 768 encapsulated join message was sent. An AMT Gateway that receives an 769 encapsulated IGMPv3 (S,G)-Join MUST discard the message if the IPv4 770 destination address in the outer header is not composed of the last 771 2 bytes of S and the 2 middle bytes of G (i.e., the destination 772 address a.b.c.d must be composed of the a.b of the multicast source 773 x.y.a.b and the c.d of the multicast group 232.c.d.e). 775 3. A gateway MUST drop an AMT Relay Advertisement if the nonce in 776 the advertisement does not match the nonce in the discovery packet 777 sent by the gateway. This prevents an attacker from acting as an AMT 778 anycast relay even without publishing a route to the AMT Relay 779 Anycast Prefix. 781 4. A gateway or relay MUST NOT update its group state on receiving a 782 membership report.
Instead, it MUST generate a Membership 783 Confirmation message to the source of the report. On receiving a 784 Membership Acknowledgement, the group state should be updated only 785 if the nonce in the acknowledgement matches the one in the 786 confirmation message. This prevents an attacker from spoofing the 787 source address of a membership report and causing a denial of 788 service or reflection attack on the target. 790 10. Acknowledgements 792 Most of the mechanisms described in this document are based on 793 similar work done by the NGTrans WG for obtaining automatic IPv6 794 connectivity without explicit tunnels ("6to4"). Tony Ballardie 795 provided helpful discussion that inspired this document. 797 11. Appendix A: Open Issues 799 Under the proposed mechanism, a gateway sends its IGMPv3 Reports for 800 MBone sources to the relay closest to itself (discovered using the 801 UDP encapsulated "ping"). This ensures that, as far as possible, 802 multicast traffic flows through the native multicast infrastructure 803 and the automatic multicast encapsulation is short. 805 However, there might be reasons to create automatic tunnels to the 806 relay closest to the MBone source instead. An ISP, for example, 807 might be willing to provide a relay for only its own customers, 808 those wishing to multicast their transmission to a much wider 809 audience. A mechanism, complementary to the one described in this 810 document, might be used to provide this facility. It uses UDP 811 encapsulated ICMP Redirect messages as described below. 813 While injecting routes for its sources into the M-RIB, such an ISP 814 might, for example, use a new BGP attribute to convey the address of 815 the preferred relay. This would let other relays redirect any IGMP 816 Reports to the preferred relay by sending a UDP encapsulated ICMP 817 Redirect. 
819 An IGMP Report sent by a gateway to the relay closest to it would 820 consist of the following packet: 822 OuterIP [UDP [InnerIP [IGMP Report]]] 824 The relay would respond with: 826 OuterIP' [UDP' [InnerIP' [ICMP Redirect [InnerIP [IGMP Report]]]]] 828 An ICMP Redirect contains the first 64 bits of the original packet 829 [ICMP]. Hence the gateway would get 44 bytes (64 - sizeof(Inner 830 IP)) of the IGMP Report, enough to easily extract the (source, 831 group) pair, and redirect its report to the preferred relay. 833 Certainly additional complexity is undesirable, so it is an open 834 issue as to whether redirects are needed at all. 836 12. Authors' Addresses 838 Dave Thaler 839 Microsoft Corporation 840 One Microsoft Way 841 Redmond, WA 98052-6399 842 Phone: +1 425 703 8835 843 EMail: dthaler@microsoft.com 845 Mohit Talwar 846 Microsoft Corporation 847 One Microsoft Way 848 Redmond, WA 98052-6399 849 Phone: +1 425 705 3131 850 EMail: mohitt@microsoft.com 852 Amit Aggarwal 853 Microsoft Corporation 854 One Microsoft Way 855 Redmond, WA 98052-6399 856 Phone: +1 425 706 0593 857 EMail: amitag@microsoft.com 859 Lorenzo Vicisano 860 Cisco Systems 861 170 West Tasman Dr. 862 San Jose, CA 95134 863 Phone: +1 408 525 2530 864 EMail: lorenzo@cisco.com 866 Dirk Ooms 867 Alcatel 868 F. Wellesplein 1, 2018 Antwerp, Belgium 869 Phone: +32 3 2404732 870 EMail: dirk.ooms@alcatel.be 872 13. Normative References 874 [ICMP] Postel, J., "Internet Control Message Protocol", RFC 792, 875 September 1981. 877 [IGMPPROXY] Fenner, W., "IGMP-based Multicast Forwarding ('IGMP 878 Proxying')", Work in progress, draft-fenner-igmp-proxy- 879 03.txt, July 2000. 881 [IGMPv3] Cain, B., Deering, S., Fenner, B., Kouvelas, I., and A. 882 Thyagarajan, "Internet Group Management Protocol, Version 883 3", RFC 3376, October 2002. 885 [SSM] Holbrook, H., and B. Cain, "Source-Specific Multicast for IP", 886 Work in progress, draft-holbrook-ssm-arch-04.txt, October 887 2003. 889 14.
Informative References 891 [6TO4] Carpenter, B., and K. Moore, "Connection of IPv6 Domains via 892 IPv4 Clouds", RFC 3056, February 2001. 894 [BROKER] Durand, A., Fasano, P., Guardini, I., and D. Lento, "IPv6 895 Tunnel Broker", RFC 3053, January 2001. 897 [ANYCAST] Huitema, C., "An Anycast Prefix for 6to4 Relay Routers", 898 RFC 3068, June 2001. 900 [PIMSM] Estrin, D., Farinacci, D., Helmy, A., Thaler, D., Deering, 901 S., Handley, M., Jacobson, V., Liu, C., Sharma, P., and L. 902 Wei, "Protocol Independent Multicast-Sparse Mode (PIM-SM): 903 Protocol Specification", RFC 2362, June 1998. 905 15. Full Copyright Statement 907 Copyright (C) The Internet Society (2004). All Rights Reserved. 909 This document and translations of it may be copied and furnished to 910 others, and derivative works that comment on or otherwise explain it 911 or assist in its implementation may be prepared, copied, published 912 and distributed, in whole or in part, without restriction of any 913 kind, provided that the above copyright notice and this paragraph 914 are included on all such copies and derivative works. However, this 915 document itself may not be modified in any way, such as by removing 916 the copyright notice or references to the Internet Society or other 917 Internet organizations, except as needed for the purpose of 918 developing Internet standards in which case the procedures for 919 copyrights defined in the Internet Standards process must be 920 followed, or as required to translate it into languages other than 921 English. 923 The limited permissions granted above are perpetual and will not be 924 revoked by the Internet Society or its successors or assigns.
926 This document and the information contained herein is provided on an 927 "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING 928 TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING 929 BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION 930 HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF 931 MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.