Internet-Draft                                K. Calvert, J. Griffioen
                                                University of Kentucky
Expires May 2001                                         November 2000

                       Internet Concast Service
                   draft-calvert-concast-svc-02.txt

Status of this Memo

This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026. Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its areas,
and its working groups.
Note that other groups may also distribute
working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.

The distribution of this memo is unlimited. It is filed as and
expires January 24, 2002. Please send comments to the authors.

Abstract

Concast is a many-to-one best-effort network service that allows a
receiver to treat a group of senders as a single entity, in much the
same way that IP multicast allows a sender to treat a group of
receivers as one. Each concast datagram delivered to a receiver is
derived from (possibly many) datagrams sent by different members of
the concast group to that receiver; the relationship between the
delivered datagram and the sent datagrams is defined by a "merge
specification". Concast provides a framework that allows the
semantics of this merging operation to vary to suit the needs of
different applications. Concast is incrementally deployable and
backward compatible with IPv4 and IPv6. It can be implemented
entirely in end systems, but offers the most benefits in terms of
scalability when it is supported by routers in the network. This
document describes the concast service and its framework for defining
merge semantics, including safety properties required of the merge
framework implementation.

1. Introduction

Multicast has been an Internet service for many years now [1].
Its semantics are simple: when a host sends a packet to a multicast
address, the network makes its best effort to deliver a copy to all
hosts in the group. The network keeps track of receiving hosts'
locations, and duplicates datagrams as needed while forwarding them
toward all receivers. The power of multicast is in its abstraction
mechanism, which enables a sender to treat an arbitrary number of
receivers as a single entity.

Concast is intended to provide a similar abstraction in the reverse
direction: it enables a receiver to treat an arbitrary number of
senders as a single entity. When multiple senders transmit concast
datagrams to the same receiver, the network makes its best effort to
"merge" them into a single message for delivery to that receiver.
The utility of such a service depends on the semantics of the merging
operation performed by the network layer. It seems unlikely that any
single (necessarily application-independent) semantics would have
sufficiently broad applicability to justify implementation of the
concast service. Therefore concast is designed to allow for a broad
range of merge semantics, all fitting within a certain framework.
The following examples illustrate a range of possible merge
semantics:

o  Inverse multicast/duplicate suppression: at most one copy of
   any datagram is delivered to the receiver within a particular
   window of time.

o  Voting: each datagram contains a value chosen by its sender.
   When some threshold number of datagrams has been sent, a single
   datagram containing the value that occurred most often in the
   sent datagrams is delivered.

o  Applying an associative and commutative operator: each datagram
   contains a value. The maximum (minimum, sum, product,
   conjunction, disjunction, bitwise conjunction, bitwise
   disjunction) of the values in all sent datagrams is placed in the
   datagram delivered to the receiver.
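
The third example works in the network because an associative and commutative operator gives the same result no matter where the partial merges happen. The following Python fragment is only an illustrative sketch: the two-router topology, the sender values, and the helper name merge_values are assumptions made for the example and do not appear in this memo.

```python
from functools import reduce
import operator


def merge_values(op, values):
    """Fold a non-empty list of sender values with a binary operator."""
    return reduce(op, values)


# Hypothetical topology: two routers each merge their own senders,
# then the receiver merges the two partial results.
senders_at_router_a = [7, 3, 9]
senders_at_router_b = [4, 12]

partial_a = merge_values(max, senders_at_router_a)
partial_b = merge_values(max, senders_at_router_b)
in_network = merge_values(max, [partial_a, partial_b])

# Merging every value at the receiver yields the same answer.
at_receiver = merge_values(max, senders_at_router_a + senders_at_router_b)

# The same holds for any other associative, commutative operator.
sum_in_network = merge_values(
    operator.add,
    [merge_values(operator.add, senders_at_router_a),
     merge_values(operator.add, senders_at_router_b)],
)
sum_at_receiver = merge_values(
    operator.add, senders_at_router_a + senders_at_router_b
)
```

An operator without these two properties (subtraction, for instance) would make the delivered value depend on the shape of the concast tree, which is why the example is restricted to this class.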

It is envisioned that certain simple merge functions like these will
be "hardwired" into the network. The merge framework defined later
allows for new merge semantics to be specified simply by defining
certain functions that make up the framework. For maximum
flexibility, authorized receivers would supply such definitions using
an encoding interpreted by all concast-capable nodes. The nature of
this encoding determines the power of the computations permitted for
merge specifications.

Concast can be used alone, for example to collect and distill
telemetry from a group of remote sensors. It is also especially
useful in conjunction with multicast. Many multicast applications
require some form of feedback from the receiver set. For such
applications, implosion at the multicast source or at internal
network nodes is a real problem as group sizes grow large, because
(in the absence of concast) the only way to convey feedback is via
unicast datagrams. This fundamentally breaks the multicast
abstraction by forcing the sender to deal with individuals instead of
the group as a whole. Moreover, in most cases, the feedback
recipient is not interested in the individuals' information, but
rather some function -- for example the maximum or minimum -- of the
group's information. Support for the concast abstraction allows such
"summary information" to be provided in a scalable way, by computing
it at strategic points along the way.

Support for concast requires modifications to those hosts and routers
that support it. It does not, however, require any modification to
other parts of the infrastructure, nor does it require additional
routing or forwarding capabilities beyond those required for unicast.
In particular, concast does not depend on multicast in any way.
Concast service can be provided on an end-system-only basis, though
router support is necessary for scalability (in terms of the group
size supportable without implosion). Partial deployment among
routers is beneficial, and indeed most of the scalability and
implosion-prevention benefits are likely to be attainable by
deploying concast at selected routers at domain boundaries. This
document describes extensions to Version 4 of the Internet Protocol.
Similar extensions can be defined for IPv6.

The next section provides an overview of the service and its use.
Section 3 defines the semantic framework for merging datagrams, and
gives an example of its use. Section 4 describes the processing of
concast datagrams by the IP implementations of concast-capable
nodes, in terms of the semantic framework. Security considerations
are discussed in Section 5.

2. Service Overview

The unit of concast service is the "flow". Concast flows are
unidirectional: data travels only from the senders to the (single)
receiver. Each concast flow is identified by a pair (R,G), where R
is the (unicast) IP address of the receiver and G is a concast group
identifier. Concast group IDs are 32-bit numbers chosen by the
receiver. Note that different receiving applications on the same
host need to use different group IDs so their flows can be
distinguished.

Each concast flow has an associated Merge Specification, which is
chosen by the receiver and specified at flow creation time. The
Merge Specification defines the relationship between datagrams
delivered to the receiver application and those transmitted by the
senders.

Thus to use concast, receiver and senders must agree (through some
out-of-band means) on two things: the concast group ID and the Merge
Specification. Senders must transmit datagrams containing
information in the format expected by the Merge Specification.
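
As a rough illustration of flow identification, the sketch below keys a flow table on the pair (R,G) and rejects group IDs that do not fit in 32 bits. The class name FlowId, the table layout, the example address, and the merge specification names are hypothetical; this memo does not prescribe any data structure.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address


@dataclass(frozen=True)
class FlowId:
    """Hypothetical key for a concast flow: the pair (R, G)."""
    receiver: IPv4Address  # R, the receiver's unicast address
    group_id: int          # G, chosen by the receiver

    def __post_init__(self):
        if not 0 <= self.group_id < 2**32:
            raise ValueError("concast group IDs are 32-bit numbers")


# Two receiving applications on the same host use different group IDs,
# so their flows stay distinct even though R is the same.
r = IPv4Address("192.0.2.1")
flows = {
    FlowId(r, 1): "max-merge",
    FlowId(r, 2): "duplicate-suppression",
}
```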

A concast-capable node N maintains state information for each concast
flow (R,G) passing through it (i.e., for which N is on the path to R
from some sender participating in the flow). Responsibility for
establishment and maintenance of this per-flow information belongs
to the Concast Signaling Protocol (CSP), which is described in a
separate document [1]. CSP uses soft-state techniques to ensure that
the concast service is robust in the face of route changes. The
per-flow information includes the identities of all concast-capable
nodes "upstream" of N on the flow, and state relevant to the ongoing
merge processing of messages sent on the flow.

In contrast to multicast -- as multicast is currently specified and
implemented in the Internet [2] -- both senders and receiver are
required to signal the network before using the concast service.
(Multicast only requires receivers to signal.) A benefit of this
"uniform" signaling requirement is that it provides an opportunity
for authentication and authorization checks on users of the service.
This is likely to be important since router support for concast
requires the maintenance of per-flow state.

2.1 Concast Datagram Format

Concast datagrams are distinguished from ordinary IP datagrams by the
presence of a "Concast ID" option in the IP header, which contains
the concast group number. Concast-oblivious routers do not recognize
the Concast ID option, and simply forward concast datagrams as if
they were regular unicast datagrams.

Concast-capable routers MUST check every forwarded datagram for the
presence of the Concast ID option. One means of achieving this is to
inspect each datagram's IP options for this particular option.
However, other protocols (e.g.
RSVP) also require processing at non-destination nodes, and various
methods of diverting datagrams from the forwarding path for special
processing are available, including the Router Alert option [4] and
the Waypoint mechanism [5].

The source address field of a concast datagram's IP header contains
the IP address of the last concast-capable node to process the
datagram. This enables a node processing an incoming datagram to
check that the datagram was forwarded by one of that node's known
upstream neighbors. (This check is of course not secure. See Section
5.)

The concast service does not specify any other headers of its own.
Instead, each individual merge specification defines the information
it expects to be carried in packets.

2.2 Concast Flow Lifecycle

The normal sequence of events for establishing and using a concast
flow is as follows:

1. The receiver creates the flow by supplying its local IP concast
   module with the concast group ID G, the preferred receiver address
   R, and the Merge Specification (described below).

2. Nodes wishing to join the group and participate as senders do so
   by supplying (R,G) to their local IP concast module. This invokes
   the signaling protocol, which causes flow state (including the
   Merge Specification) to be established along the paths from the
   senders to the receiver.

3. Senders transmit packets as usual. Each sender's IP concast
   implementation ensures that each sent concast datagram carries a
   concast group ID option with value G.

4. As concast datagrams travel hop-by-hop toward the receiver, at
   each concast-capable node (including at least the receiving host)
   they are diverted for concast processing. This involves retrieving
   the state for the (R,G) flow and carrying out the computation
   defined by the merge specification for that flow.
   Packets are forwarded (toward R) after processing only under the
   conditions defined by the merge specification.

5. When the receiving application receives data from the flow via
   its network API, concast messages appear to have been sent by the
   concast group; they contain the result of merging the messages sent
   by group members.

6. Senders may leave the concast group at any time. When a sender
   leaves the group, the signaling protocol is invoked to inform the
   sender's downstream neighbor that one of its upstream neighbors is
   going away. When a node has no remaining upstream neighbors, it
   recursively informs its downstream neighbor that it is leaving the
   flow. In this way, the concast "tree" grows and shrinks as senders
   join and leave the group.

7. The receiving application may tear down the flow at any time.
   The signaling protocol then notifies all upstream neighbors that
   the flow has gone away. Those nodes inform their neighbors, and so
   on, until all state for the flow has been removed from the system.

The Concast Signaling Protocol, including the soft-state techniques
that are used to detect route changes and connectivity problems, is
defined elsewhere [1].

3. Merge Semantics

The nature of the merge function determines the utility of the
concast service for any particular application; different
applications in general need different semantics. On the other hand,
the processing burden placed on the intermediate nodes by the merge
computation, along with the potential for misbehavior by the merge
function, must be limited. We therefore constrain the general form
of the merge computation by defining certain steps to be common to
all merge functions, and by defining the "shape" of the variable
parts.

The merge semantics determine which datagrams within a flow are to be
merged together, and define the computation that takes as input the
(possibly many) messages sent, and produces as output the message
ultimately delivered to the receiver.

In the context of a particular concast flow, a Datagram Equivalence
Class (DEC) is defined as a set of concast datagrams to be merged
together. A flow may have datagrams belonging to multiple DECs in
the network at the same time. For example, in the inverse multicast
(duplicate suppression) service mentioned in the first section, two
packets could be in the same DEC if they have the same destination IP
address, concast group ID, and IP payload.

Because packets are processed one at a time as they arrive, each
concast node maintains a "merge state block" (MSB) for each active
DEC of a flow. To limit the amount of per-flow state, the size and
number of active MSBs will in general be limited by the concast
implementation. When a merge computation is "finished", a concast
datagram is constructed using information in the MSB, and forwarded
toward the destination R.

Based on the foregoing, we can outline the steps in "merge"
processing, given the merge specification for flow (R,G):

1. Determine the datagram equivalence class to which the datagram
   belongs, according to the merge specification.

2. Retrieve the Merge State Block for that DEC.

3. Update the contents of the MSB using the old contents and the
   datagram according to the merge specification.

4. If the computation is finished according to the merge
   specification, construct and forward an IP datagram with
   destination address equal to R, source address equal to the IP
   address of the interface that leads toward R (or the concast group
   ID if this node is R), the concast group ID option with value G,
   and payload constructed from the MSB according to the merge
   specification.

3.1 Variable Components of the Merge Specification

The semantics of merge are defined by giving definitions for certain
types and methods. In defining those methods, the following types
are considered to be given:

   The type DECTag, of tags that identify Datagram Equivalence
   Classes.

A Merge Specification consists of precise definitions of the
following types and functions:

   The type MergeStateBlock, which defines the state information to be
   stored for in-progress merges, including information that will be
   carried in any forwarded datagram. The maximum size of a
   MergeStateBlock must be fixed at the time of the definition. Every
   MergeStateBlock contains a one-byte field used for computing TTL
   values of forwarded datagrams, as described below. The definition
   of the MergeStateBlock type MUST include initial values for all
   fields.

   The function getTag(), which takes a concast datagram as input and
   returns a DECTag. This function determines the Datagram
   Equivalence Class to which a given packet belongs. Typically this
   function will extract a value from a particular location or
   locations in the datagram (header and/or payload). Alternatively,
   it might compute a digest of the datagram's payload.

   The function merge(), which takes a MergeStateBlock, a concast
   datagram, and per-flow information and returns an updated
   MergeStateBlock.
   This function does the real work of merging,
   combining information from an incoming datagram with information
   derived from previously processed datagrams.

   The predicate done() on MergeStateBlocks, which returns "true" when
   a datagram needs to be constructed and forwarded to R.

   The function buildDatagram(), which takes a MergeStateBlock and
   returns a datagram containing a valid IP header, transport header,
   and payload.

The method of specification of the above functions is beyond the
scope of this memo. Note that some merge specifications are expected
to be well-known and commonly supported, so that receivers can invoke
them by name.

3.2 Generic Portion of the Merge Specification

The fixed component of the merge semantics defines that portion of
the merge computation that is the same on every node, for every flow.
It is specified by the pseudocode below.

    ProcessDatagram(IPAddr R, ConcastGroupID G, IPDatagram m)
    // Generic concast merge processing
    {
      FlowStateBlock fsb;   // flow state for flow (R,G)
      DECTag t;             // tag for m's DEC
      MergeStateBlock s;

      fsb = LOOKUP_FLOW(R,G);          // get relevant flow state
      if (fsb != NULL) {
        t = fsb.getTag(m);             // get the DEC
        s = GET_MERGE_STATE(fsb,t);    // get state of in-progress merge
        s = UPDATE_TTL(s,m);           // fixed function
        s = fsb.merge(s,m,fsb);        // merge computation
        if (fsb.done(s)) {             // time to send something on?
          (s,m) = fsb.buildDatagram(s);
          FORWARD_DG(fsb,s,m);         // on toward R
        }
        PUT_MERGE_STATE(fsb,s,t);      // replace old state
      }
    }

The data type "FlowStateBlock" encapsulates flow-specific information
that might be useful to the merge computation, for example the list
of upstream neighbors of the current node in the concast tree.
Methods whose names are given in CAPITALS are built-in, fixed parts
of the merging framework; their semantics cannot be modified by the
user.
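
To make the control flow of the generic processing easier to follow, here is a small runnable Python sketch of the same steps. It is not part of the specification. The MaxSpec merge specification (forward the maximum once a threshold number of datagrams has arrived), the dictionary-based state, the initial TTL value of 255, and all Python names are assumptions chosen for illustration. Eviction of merge state is left out.

```python
from dataclasses import dataclass, field


@dataclass
class Datagram:
    ttl: int
    payload: int


@dataclass
class MaxSpec:
    """Hypothetical merge spec: emit the max after `threshold` inputs."""
    threshold: int

    def new_state(self):
        # Initial values for every field; 255 for the TTL is an assumption.
        return {"ttl": 255, "count": 0, "best": None}

    def get_tag(self, m):
        return 0  # one DEC for the whole flow

    def merge(self, s, m):
        s["count"] += 1
        s["best"] = m.payload if s["best"] is None else max(s["best"], m.payload)
        return s

    def done(self, s):
        return s["count"] >= self.threshold

    def build_datagram(self, s):
        # Return fresh state plus the payload to send on.
        return self.new_state(), s["best"]


@dataclass
class FlowState:
    spec: MaxSpec
    msbs: dict = field(default_factory=dict)  # DEC tag -> merge state


def process_datagram(flows, r, g, m, forward):
    fsb = flows.get((r, g))               # LOOKUP_FLOW
    if fsb is None:
        return                            # unknown flow: silently drop
    t = fsb.spec.get_tag(m)
    s = fsb.msbs.get(t)                   # GET_MERGE_STATE
    if s is None:
        s = fsb.spec.new_state()
    s["ttl"] = min(s["ttl"], m.ttl)       # UPDATE_TTL
    s = fsb.spec.merge(s, m)
    if fsb.spec.done(s):
        ttl = s["ttl"]
        s, payload = fsb.spec.build_datagram(s)
        forward(Datagram(ttl=ttl - 1, payload=payload))  # FORWARD_DG
    fsb.msbs[t] = s                       # PUT_MERGE_STATE


# Three senders report values on flow (R, G) = ("192.0.2.1", 7).
flows = {("192.0.2.1", 7): FlowState(MaxSpec(threshold=3))}
delivered = []
for ttl, value in [(64, 5), (60, 11), (62, 8)]:
    process_datagram(flows, "192.0.2.1", 7, Datagram(ttl, value), delivered.append)
```

In this run nothing is forwarded until the third datagram arrives, at which point a single datagram carrying the maximum value and a TTL one less than the smallest input TTL is passed on.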

The method LOOKUP_FLOW takes a flow specifier and returns the flow
state block belonging to that flow, if any. If no such flow state
block exists, the datagram is silently dropped.

The method GET_MERGE_STATE takes a tag identifying a Datagram
Equivalence Class, and a flow state block, and returns a
MergeStateBlock associated with the given tag in the flow. In order
to bound the amount of per-flow state kept at a node, a limit
(MAX_ACTIVE_DECS) is placed on the number of distinct tag values that
may have MergeStateBlocks bound to them at any instant. However,
GET_MERGE_STATE always returns a valid MergeStateBlock instance.
These two facts imply that when the limit on extant MSBs is reached,
calls to PUT_MERGE_STATE with a new DECTag value will result in some
(tag, MSB) pair being evicted from the state store. It is the
application's responsibility to ensure that the limit on extant
MergeStateBlocks is not violated, by limiting the rate at which
concast datagrams arrive at intermediate routers. The value of
MAX_ACTIVE_DECS should be globally defined and published, so that
applications can limit their sending rate accordingly. If the
MergeStateBlock returned by GET_MERGE_STATE is a new instance, it has
been initialized according to the specification.

The method UPDATE_TTL replaces the Time-to-live value stored in the
MergeStateBlock with the minimum of that value and the TTL value
from the IP header of the given datagram, and returns the modified
MergeStateBlock.

The method FORWARD_DG takes an IP datagram and does the following
to the IP header:

1. Overwrites the IP destination address with the given destination
   R.

2. Writes the IP address of the outgoing interface (toward R) into
   the IP source address.

3. Writes the Time-to-live value from the given MergeStateBlock,
   minus one, into the TTL field.

4. Adds a Concast Group ID option containing the group ID G, and
   adjusts the header and datagram length fields accordingly.

5. Resets the TOS bits to 0.

6. Recomputes the IP header checksum.

Note that any higher-level headers above IP are entirely the
responsibility of the merge spec-defined functions. In particular,
UDP and TCP checksums are not recomputed by the FORWARD_DG procedure;
thus if UDP or TCP is used, the buildDatagram function must compute
them properly, using R and the outgoing interface's address in the IP
pseudo-header that is included in the checksum computation. Thus the
IP address of the outgoing interface toward R MUST be available to
the buildDatagram function.

The PUT_MERGE_STATE method associates the given MergeStateBlock with
the given DECTag value in the given flow, replacing the old
MergeStateBlock value. (The foregoing discussion of GET_MERGE_STATE
implies that an old value exists for the given tag.)

3.3 Example Merge Function

To illustrate the use of the Merge Specification framework, we
present a definition of the "inverse multicast" (duplicate
suppression) service mentioned in the Introduction. To implement
this service, the network must "remember" each datagram that is
delivered to the receiver, and suppress subsequent copies without
forwarding them. For the purposes of this service, two datagrams
belonging to the same flow are considered "identical" if they have
the same payload. In other words, datagrams with the same payload
belong to the same DEC.

In principle, the only state needed for a DEC is the fact that a
datagram in that DEC has already been forwarded by a node. However,
because of the way the generic part of the computation is structured,
on the first arrival of a datagram from a DEC, the merge state must
record the fact that no merge state was found for that DEC
originally.
Therefore we define:

    typedef MergeStateBlock {
      boolean forwarded;
      IPPayload pendingDG;
    }

The DECTag value is computed by taking the MD5 hash [3] of the
payload of the given datagram:

    DECTag getTag(IPDatagram m)
    {
      return (DECTag) MD5hash(m.payload);
    }

The merge() function simply records the datagram for forwarding.

    MergeStateBlock merge(MergeStateBlock s, IPDatagram m, FSB f)
    {
      if (s==NULL) {                  // first datagram seen in this DEC
        create a new MergeStateBlock newState;
        newState.forwarded := false;
        newState.pendingDG := m.payload;
        return newState;
      } else
        return s;                     // duplicate: state is unchanged
    }

The done() function simply checks whether the packet has already been
forwarded:

    boolean done(MergeStateBlock s)
    {
      return NOT(s.forwarded);
    }

The buildDatagram() function updates the state to indicate that the
datagram has been forwarded, and returns the updated state along with
the protocol number and payload of the datagram:

    (MergeStateBlock,ProtocolNumber,IPPayload)
    buildDatagram(MergeStateBlock s)
    {
      s.forwarded := true;
      return (s,s.pendingDG);
    }

3.4 Discussion

The definition of particular merge functions using code like that of
Section 3.3 does not imply that the actual processing should be
accomplished in software. For the example above, the computation can
be very efficiently implemented in hardware, and it is expected that
routers supporting this merge function would do so.

However, for flexibility it may also be useful to support
user-supplied merge functions coded in some restricted-but-high-level
language, for example a limited subset of Java. Obviously some
constructs (e.g. recursion, dynamic storage allocation, unbounded
iteration) should be restricted or prohibited in such code. On the
other hand, certain functionality may be needed by such code.

For example, in some cases it is useful to initiate merge processing
via the passage of time, rather than datagram arrival. This
capability can be offered through a method by which user-supplied
code can arrange for the last portion of the merge processing
(beginning with the done() test) to be executed a specified amount
of time in the future. To limit overhead, each flow is permitted at
most one pending timeout-callback per MSB at any time.

3.5 Fragmentation

The notion of a datagram equivalence class is well-defined only for
complete (unfragmented) IP datagrams. Therefore it is necessary for
applications using concast to send datagrams that will not be
fragmented in the network. This can be achieved either by performing
path MTU discovery for the path between each sender and the receiver,
or by sending datagrams no larger than the minimum MTU that every IP
link is required to support.

4. Levels of Support

A node participating in a concast flow as a sender or receiver MUST
implement some part of the Concast Signaling Protocol (CSP) [1].
The parts of CSP that must be implemented depend on the level of
concast support provided by the node. Some nodes may only be able to
originate concast datagrams and thus do not need to implement the
receiving or merging components of CSP. Other nodes may
only be able to receive concast messages. Some nodes will support
both sending and receiving. Internal network nodes need only
support the merge processing described earlier. Legacy nodes that do
not support concast at all simply need to forward concast packets as
if they were unicast.

4.1 Sending Host Processing

In order for a host to participate as a sender in a concast group, it
needs to support the portion of CSP that signals the
node's intent to join (or leave) as a sender.
Once CSP has
established the necessary state information to link the sender into
the concast flow, the sender can begin transmitting concast
datagrams. Specifically, senders must mark outgoing packets as
concast packets requiring hop-by-hop processing. This is achieved
simply by inserting in the IP header a "Concast ID" option containing
the group ID G of the flow to which the packet belongs. The packet is
then routed and transmitted using the standard IP mechanism.

4.2 Receiving Host Processing

Applications join the group G as a receiver by identifying (via a
system call) the concast flow to be joined (R,G) and providing the
Merge Specification for the flow. Concast receivers must support the
parts of CSP that respond to requests for Merge
Specifications and Join requests. Once CSP establishes
the flow and distributes the Merge Specification, concast datagrams
will begin arriving at the receiver. The receiver's IP module must
recognize the "Concast ID" option and divert the incoming packet for
merge processing as specified in Section 3. Following merge
processing, the FORWARD_DG function passes any resulting datagram to
the local IP module, which recognizes the destination address as its
own and demultiplexes it as usual to the higher-level protocol
indicated in the IP header.

4.3 Per-flow State Considerations

Because sender-only nodes simply mark outgoing packets as concast
packets, the state information maintained at such nodes will be
minimal. However, because merge state accumulates at internal
concast-capable nodes and at concast receivers, state size could
potentially grow without bound. Consequently the PUT_MERGE_STATE
function should limit the amount of state information any particular
flow can consume.
   The language used to construct the Merge Specification may also
   impose limits on the amount of state information that can be saved.
   Assuming the merge state is bounded, the state needed to maintain
   information about flows is similar to the state used by
   shortest-path multicast routing protocols.

5 Security Considerations

   Security considerations for concast fall into two main categories.
   First and foremost, concast implementations must ensure the
   stability of individual nodes as well as the network as a whole.
   For example, concast should not enable new classes of denial of
   service, or other forms of attack. Second, the protocol should be
   designed so that access to the concast service can be controlled by
   network (or provider) policy.

5.1 Node Safety

   Implementations of the concast merge framework must ensure the
   safety of individual nodes. Nodes that accept user-supplied Merge
   Specifications SHOULD take steps to ensure that the
   interpretation/execution of such specifications is safe; in
   particular, that it consumes only acceptable amounts of compute
   bandwidth and storage and does not modify the router state in
   unacceptable ways. One possible approach is to require that the
   Merge Specification be supplied in an encoding that allows a priori
   verification of desired safety properties. Another is to impose
   limits on per-message processing and storage at run time. The
   limits of "acceptable amounts of compute bandwidth and storage" will
   depend on the capacity available at each node; presumably
   concast-capable nodes will be equipped with sufficient processing
   and storage capacity to enable them to handle some amount of concast
   processing without harm to their other required functions.

   Note that the signaling requirements of concast provide a means for
   nodes to limit the amount of processing for which they are
   obligated.
   Nodes that are fully booked can simply refuse requests to establish
   new flows.

   If the risk of executing user-supplied Merge Specifications is
   considered to be excessive, the service can still be supported, but
   with only built-in (i.e. user-selectable but not user-customizable)
   merge functions.

5.2 Network Safety

   The construction and emission of datagrams at routers under user
   control -- however limited -- should always be handled with care.

   As an example, in some earlier descriptions of this service, concast
   datagrams were identified by the presence of a group address (e.g. a
   Class E or multicast address) in the Source field of the IP header.
   While this approach is nicely symmetric with multicast in that no
   individual address is associated with the source of the datagram, it
   conflicts with anti-spoofing source checks applied in some parts of
   the current Internet. Unfortunately, modification of such checks to
   allow packets with concast source addresses to pass opens up the
   possibility of untraceable denial-of-service attacks on
   concast-capable hosts that reside in domains with no concast-capable
   routers, and therefore the method of marking concast datagrams was
   changed.

   The concast merge framework MUST be implemented in such a way that
   the "construction and emission" process for concast has certain
   properties:

   o  Fixed Destination: The only destination to which datagrams can be
      forwarded as a result of concast processing is R, the flow's
      receiver.

   o  Conservation of packets: At most one datagram is emitted per
      incoming concast datagram processed.

   o  TTL Monotonicity: For each flow passing through a node, the TTL
      values of the datagrams belonging to that flow decrease as they
      are processed. That is, the TTL value in each emitted datagram
      is smaller than the TTL values of all the packets that were
      merged to form that datagram.
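
   The three properties listed above can be pictured as a guard that
   the fixed part of the framework applies before any datagram leaves
   the node. The following is only a minimal sketch in Python, not
   part of this specification: the names Datagram, FlowGuard,
   on_arrival, may_emit and emit are hypothetical, and the sketch
   assumes that the set of merged TTLs is reset after each emission.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Datagram:
    dst: str  # destination IP address
    ttl: int  # IP time-to-live


@dataclass
class FlowGuard:
    """Hypothetical per-flow bookkeeping kept by the fixed framework."""

    receiver: str       # R, the flow's receiver
    processed: int = 0  # incoming concast datagrams processed so far
    emitted: int = 0    # datagrams emitted so far
    merged_ttls: List[int] = field(default_factory=list)

    def on_arrival(self, dg: Datagram) -> None:
        """Record an incoming datagram that is being merged."""
        self.processed += 1
        self.merged_ttls.append(dg.ttl)

    def may_emit(self, out: Datagram) -> bool:
        """Return True only if emitting `out` keeps all three properties."""
        # Fixed Destination: only R may be addressed.
        fixed_destination = out.dst == self.receiver
        # Conservation of packets: never more emissions than arrivals.
        conservation = self.emitted < self.processed
        # TTL Monotonicity: strictly smaller than every merged TTL.
        # Assumption: with nothing merged since the last emission,
        # the guard refuses to emit.
        ttl_monotonic = bool(self.merged_ttls) and out.ttl < min(self.merged_ttls)
        return fixed_destination and conservation and ttl_monotonic

    def emit(self, out: Datagram) -> bool:
        """Emit `out` if permitted; a refused datagram is simply dropped."""
        if not self.may_emit(out):
            return False
        self.emitted += 1
        self.merged_ttls.clear()
        return True
```

   In this sketch the user-supplied merge code never transmits
   directly; it hands a candidate datagram to emit(), so the checks
   cannot be bypassed by the variable part of the framework.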

   These properties can be ensured by judicious implementation of the
   fixed part of the merge framework, and by restricting the operations
   permitted by the variable part. For example, if user-supplied merge
   functions are permitted to set timeouts, the merge framework MUST
   ensure that at most one timeout per MergeStateBlock can be pending.
   In order to guarantee the "Conservation of packets" property,
   invocation of the FORWARD_DG function for a MergeStateBlock should
   cancel any timeout pending for that MergeStateBlock.

5.3 Controlling Access to Concast Services

   In the absence of strong authentication applied to each packet at
   each concast-capable node, packets can be inserted into a concast
   flow by nodes that have not joined the flow. This may corrupt the
   information delivered to the receiver. However, the same threat
   exists when unicast is used to deliver the information.

   It is straightforward to add an authentication check to the generic
   merge processing of Section 3.2. Moreover, the signaling phase
   provides an opportunity for establishment of the necessary security
   associations between neighbors in the concast tree (indeed this is
   one motivation for requiring signaling by all parties). However,
   the security provided is necessarily hop-by-hop.

   [Discussion of extension of concast tree to be added.]

6 Acknowledgements

   The contributions of Billy Mullins, Leon Poutievsky, Amit Sehgal,
   and Su Wen to this service specification are acknowledged with
   thanks.

   The support of the Defense Advanced Research Projects Agency (DARPA)
   and Air Force Research Laboratory, Air Force Materiel Command, USAF,
   under agreement number F30602-99-1-0514, is gratefully acknowledged.
   The views and conclusions contained herein are those of the authors
   and should not be interpreted as necessarily representing the
   official policies or endorsements, either expressed or implied, of
   the Defense Advanced Research Projects Agency (DARPA), the Air Force
   Research Laboratory, or the U.S. Government.

References

   [1] Calvert, K. L. and Griffioen, J. N., "Concast Signaling
       Protocol", Internet-Draft, in preparation.

   [2] Deering, S., "Host Requirements for IP Multicasting", RFC 1112,
       August 1989.

   [3] Rivest, R., "The MD5 Message-Digest Algorithm", RFC 1321, April
       1992.

   [4] Katz, D., "IP Router Alert Option", RFC 2113, February 1997.

   [5] Lindell, B. and Braden, B., "Waypoint - A Path Oriented Delivery
       Mechanism for IP based Control, Measurement, and Signaling
       Protocols", Internet Draft (work in progress), November 2000,
       draft-lindell-waypoint-00.txt.

Authors' Address:

   Kenneth L. Calvert (calvert@netlab.uky.edu)
   James N. Griffioen (griff@netlab.uky.edu)
   Lab for Advanced Networking
   University of Kentucky
   Hardymon Building, 2nd Floor
   301 Rose Street
   Lexington, KY 40506-0495