idnits 2.17.1 draft-ietf-sigtran-mdtp-06.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** Looks like you're using RFC 2026 boilerplate. This must be updated to follow RFC 3978/3979, as updated by RFC 4748. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- ** Missing expiration date. The document expiration date should appear on the first and last page. ** The document seems to lack a 1id_guidelines paragraph about Internet-Drafts being working documents. ** The document seems to lack a 1id_guidelines paragraph about 6 months document validity. == No 'Intended status' indicated for this document; assuming Proposed Standard == The page length should not exceed 58 lines per page, but there was 15 longer pages, the longest (page 10) being 698 lines == It seems as if not all pages are separated by form feeds - found 0 form feeds but 36 pages Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack a Security Considerations section. ** The document seems to lack an IANA Considerations section. (See Section 2.2 of https://www.ietf.org/id-info/checklist for how to handle the case when there are no actions for IANA.) ** The document seems to lack separate sections for Informative/Normative References. All references will be assumed normative when checking for downward references. ** There are 22 instances of too long lines in the document, the longest one being 3 characters in excess of 72. ** The abstract seems to contain references ([1]), which it shouldn't. Please replace those with straight textual mentions of the documents in question. 
** The document seems to lack a both a reference to RFC 2119 and the recommended RFC 2119 boilerplate, even if it appears to use RFC 2119 keywords. RFC 2119 keyword, line 368: '...Data Size in the common header MUST be...' RFC 2119 keyword, line 372: '...the control part MUST be processed fir...' RFC 2119 keyword, line 428: '...o 0xff - reserved and MUST NOT be used...' RFC 2119 keyword, line 446: '...n the current datagram, it MUST be set...' RFC 2119 keyword, line 456: '...l parameter part MUST be transmitted i...' (9 more instances...) Miscellaneous warnings: ---------------------------------------------------------------------------- -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- Couldn't find a document date in the document -- date freshness check skipped. 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) -- Looks like a reference, but probably isn't: 'Tag-X' on line 1676 -- Looks like a reference, but probably isn't: 'Tag-Y' on line 1679 -- Looks like a reference, but probably isn't: 'Tag-Z' on line 1682 -- Looks like a reference, but probably isn't: 'Tag-A' on line 1731 == Unused Reference: '2' is defined on line 2193, but no explicit reference was found in the text == Unused Reference: '3' is defined on line 2196, but no explicit reference was found in the text == Unused Reference: '5' is defined on line 2202, but no explicit reference was found in the text == Unused Reference: '6' is defined on line 2205, but no explicit reference was found in the text == Unused Reference: '7' is defined on line 2208, but no explicit reference was found in the text == Unused Reference: '8' is defined on line 2210, but no explicit reference was found in the text ** Obsolete normative reference: RFC 793 (ref. '3') (Obsoleted by RFC 9293) -- Possible downref: Non-RFC (?) normative reference: ref. '4' -- Possible downref: Non-RFC (?) normative reference: ref. '5' -- Possible downref: Normative reference to a draft: ref. '6' ** Downref: Normative reference to an Informational RFC: RFC 2100 (ref. '7') ** Obsolete normative reference: RFC 1750 (ref. '9') (Obsoleted by RFC 4086) ** Obsolete normative reference: RFC 1948 (ref. '10') (Obsoleted by RFC 6528) -- Possible downref: Non-RFC (?) normative reference: ref. '11' Summary: 14 errors (**), 0 flaws (~~), 9 warnings (==), 10 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 1 Network Working Group R. R. Stewart 2 INTERNET-DRAFT Q. Xie 3 Motorola 4 K. 
Morneau 5 C. Sharp 6 Cisco 7 H. J. Schwarzbauer 8 Siemens 9 T. Taylor 10 Nortel Networks 11 I. Rytina 12 Ericsson 14 expires in six months June 25,1999 16 MULTI_NETWORK DATAGRAM TRANSMISSION PROTOCOL 17 19 Status of This Memo 21 This document is an Internet-Draft and is in full conformance with 22 all provisions of Section 10 of RFC2026. Internet-Drafts are working 23 documents of the Internet Engineering Task Force (IETF), its areas, 24 and its working groups. Note that other groups may also distribute 25 working documents as Internet-Drafts. 27 The list of current Internet-Drafts can be accessed at 28 http://www.ietf.org/ietf/1id-abstracts.txt 30 The list of Internet-Draft Shadow Directories can be accessed at 31 http://www.ietf.org/shadow.html. 33 Abstract 35 This Internet Draft discusses a new protocol, namely the Multi-network 36 Datagram Transmission Protocol (MDTP), that is intended to provide 37 fault-tolerant reliable data transfer between communicating entities 38 over IP networks [1]. 40 MDTP is proposed as an application-level protocol that is designed to 41 support redundant networks and transparent fault management. MDTP also 42 provides timing control and configuration flexibilities to meet the 43 stringent timing requirements often found in telephony signaling 44 protocols. The motivation of developing MDTP is to support 45 Internet-based high reliability applications such as signaling and 46 call control for Internet telephony. 48 Stewart, et al [Page 1] 49 TABLE OF CONTENTS 51 1. Introduction.......................................................3 52 1.1 Terminology......................................................3 53 1.2 Design Requirements of MDTP......................................4 54 1.3 Interface to MDTP................................................5 55 2. 
MDTP Datagram Format...............................................5 56 2.1 MDTP Common Header Field Descriptions............................6 57 2.2 MDTP Control Parameter Part Definitions..........................7 58 2.3 MDTP Data Part Definitions......................................11 59 3. Endpoint Association Initialization...............................12 60 3.1 Initiation Message and Tag Lock.................................12 61 3.1.1 Passing Initiation Parameters ................................12 62 3.2 Tag Unlock and TSN Initialization...............................13 63 3.3 Datagram Processing during Tag Lock ............................14 64 3.4 An Example of Association Initialization .......................14 65 3.5 Other Initiation Issues.........................................15 66 3.5.1 Selection of Tag Value......................................15 67 3.5.2 Initiation from behind a NAT................................15 68 3.5.3 Initialization Collision....................................16 69 3.5.4 Association Re-initialization...............................16 70 4. 
Transfer User Datagram............................................16 71 4.1 Timer Management Rules..........................................17 72 4.1.1 T3-send Timer Adjustment with RTT...........................18 73 4.2 Multihoming Rotation............................................18 74 4.2.1 Remote Multihoming Rotation.................................18 75 4.2.2 Local Multihoming Rotation..................................19 76 4.3 Stream Sequence Number..........................................19 77 4.4 Ordered and Un-ordered Delivery.................................19 78 4.5 Report Missing Datagrams........................................20 79 4.6 Range Check on TSN .............................................21 80 4.7 Advisory Ack Request............................................21 81 4.8 CRC utilization.................................................21 82 5 Congestion Controls...............................................22 83 5.1 Send with Window Control........................................22 84 5.1.1 Window Length Adjustment....................................23 85 5.2 Send Timer Back-off at Re-transmission..........................24 86 6. Network Management................................................25 87 6.1 Failure Detection in Redundant Networks.........................25 88 6.2 RTT Measurement.................................................26 89 6.3 Network Heart Beat .............................................26 90 7. Termination of Association........................................27 91 7.1 Graceful Shutdown of an Association.............................28 92 8. Stream Operations.................................................29 93 8.1 Stream Initiation...............................................29 94 8.2 Stream Termination..............................................29 95 8.3 Other Issues with Stream Operations.............................30 96 9. Interface with Upper Layer........................................30 97 10. 
Suggested MDTP Timer and Protocol Parameter Values................34 98 11. Abbreviations.....................................................34 99 12. Acknowledgments...................................................34 100 13. Authors' Addresses................................................34 101 14. References........................................................35 103 Stewart, et al [Page 2] 104 1. Introduction 106 This Internet Draft discusses a new protocol, namely the Multi-network 107 Datagram Transmission Protocol (MDTP). The intention of developing 108 MDTP is to provide a fault-tolerant, real-time reliable data transfer 109 mechanism between communicating endpoints over IP networks [1]. 111 MDTP is proposed as an application-level protocol that is designed to 112 support redundant networks and transparent fault management. MDTP also 113 provides timing control and configuration flexibilities to meet the 114 stringent timing requirements often found in telephony signaling 115 protocols. The motivation of developing MDTP is to support 116 Internet-based high reliability applications such as signaling and 117 call control for Internet telephony. 119 MDTP is also designed to be scalable in order to support different 120 signaling transport requirements for different interfaces to a 121 telephony network. 123 For example, the transportation of signaling protocols such as ISDN 124 PRI may not require redundant networks, and hence only a subset of 125 MDTP will need to be implemented. On the other hand, redundant 126 networks may be mandated when transporting SS7 signaling messages 127 amongst different components in a carrier-grade telephony core 128 network. In such cases, the transparent support for redundant 129 networks, load sharing, and fault management defined in MDTP become 130 essential. 
132 Many of the fundamental concepts that have made TCP such a useful 133 protocol are reused in MDTP, and some of the advantages of UDP are 134 also merged into the design. 136 1.1 Terminology 138 The following terms are defined and used in this document: 140 - Redundant networks: 142 An endpoint may be able to transmit or receive on more than one IP 143 address/UDP port. RFC 1122 refers to this as multi-homing. This 144 constitutes a redundant local network (for MDTP) relative to the 145 endpoint. MDTP makes no attempt to ensure routing diversity within 146 the Internet connecting two endpoints. Each endpoint attempts to 147 send to its peer endpoint using all the IP addresses and UDP ports 148 its peer has open (within the constraints of any application-specified 149 restrictions). The choice of which local socket to send 150 upon is an implementation detail (it is possible that only one socket is 151 available and bound to all of the local networks to which the machine is 152 connected). The operating system will also play a role in multi-homing and redundancy. 153 MDTP makes a best effort to spread the traffic across a 156 destination's available interfaces. It is assumed by MDTP that the 157 network (if fault tolerance is desired) is engineered for diversity 158 and that MDTP's best effort will play only a small role in that diversity. 160 - Endpoint: 162 Representation of the logical point where MDTP datagrams can be sent 163 to or received from. Moreover, an MDTP endpoint shall be defined as 164 a set of IP address/port combinations in order to support redundant 165 networks. For example, an endpoint on a multi-homed host connected 166 to N IP networks can be represented as: 168 [IP addr1/port1, 169 ... 170 IP addrN/portN] 172 where the port numbers or IP addresses need not be unique individually, but their 173 combinations shall be guaranteed unique by the underlying IP 174 networks.
176 - Association: 178 Representation of an ongoing logical communication channel between 179 two MDTP endpoints. 181 - Sub-layering: 183 Conceptually MDTP is subdivided into two sub-layers, as shown below: 185 +--------------------------+ 186 | Sequencing Sub-layer | 187 +--------------------------+ 188 | Reliability Sub-layer | 189 +--------------------------+ 191 This is introduced to achieve a clear separation between: 192 1) the reliable transport on a per association basis, and 193 2) the in-sequence delivery on a per stream basis to avoid blocking 194 between independent streams. 196 - Reliability Sub-layer: 198 This sub-layer handles only the functions that guarantee the 199 delivery of a datagram to its peer. At this sub-layer there 200 is no subdivision into different streams. 202 - Transmission Sequence Number (TSN): 204 A TSN is assigned to every datagram sent that transports user 205 data. The TSN is used by the peer Reliability Sub-layer to detect any 206 missing or duplicate user data. The TSN is processed by the 207 Reliability Sub-layer only. Its value and presence are not known to 208 the Sequencing Sub-layer. 210 - Sequencing Sub-layer: 212 This sub-layer handles only the ordered delivery of datagrams 213 belonging to a certain stream. It relies on 214 the Reliability Sub-layer having guaranteed the delivery 215 of datagrams. 217 - Stream: 219 Defined as a unidirectional logical sub-channel within an existing 220 association (see the example below). 222 Each stream shall be identified by a stream ID that is unique 223 within the association and with regard to the endpoint that opens 224 the stream.
226 Endpoint "A" Endpoint "Z" 228 ------- association ------- 229 |===========================| 230 Stream ID | | 231 0 ----------------------------> | 232 1 ----------------------------> | 233 2 ----------------------------> | 234 | | Stream ID 235 | <---------------------------- 0 236 | <---------------------------- 1 237 | <---------------------------- 2 238 | <---------------------------- 3 239 | | 240 |===========================| 241 ------- ------- 243 Datagrams sent through a stream shall be reliably transmitted and 244 delivered independently of datagrams from other streams. 246 As an implementation consideration, both the sender and receiver 247 sides may need to dedicate resources, e.g., data queues, for each 248 existing stream. 250 - Stream Sequence Number (SSN): 252 A Stream Sequence Number is associated with every datagram 253 having a TSN. The SSN is valid only within the stream to which the 254 datagram belongs. The SSN is processed by the Sequencing 255 Sub-layer on a per stream basis. 257 Stream 0xffff is reserved and shall not be used. Stream 0x0 is 258 open by default upon initiating an association and is not to be 259 terminated. 261 - Sequence-number Attack: 263 As defined in RFC 1948 [10]. 265 - CRC Usage Policy: 267 The minimum level of data integrity is provided using the checksum 268 mechanism of the underlying transport protocol. It is therefore 269 required that this mechanism always be enabled when transferring 270 MDTP datagrams. 272 To meet higher data integrity requirements, such as those for transporting 273 certain SCN signaling protocols, an additional 16 bit CRC value 274 can optionally be carried in an MDTP datagram. 276 See ITU-T Recommendation Q.703 [11] for details of how to calculate 277 a 16 bit CRC.
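As an illustration of the optional 16 bit CRC mentioned above, the following is a minimal sketch. It assumes that the Q.703 check-bit procedure corresponds to the commonly used CRC-16/X-25 parameters (generator x^16 + x^12 + x^5 + 1, all-ones preset, complemented result, least significant bit first). The function name crc16_x25 is hypothetical, and this section does not say which octets of an MDTP datagram the CRC covers, so the input here is simply an arbitrary byte string.

```python
def crc16_x25(data: bytes) -> int:
    """Bitwise CRC-16 with X-25 parameters (assumed to match Q.703)."""
    crc = 0xFFFF  # all-ones preset
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # 0x8408 is the bit-reversed form of the generator 0x1021
            if crc & 1:
                crc = (crc >> 1) ^ 0x8408
            else:
                crc >>= 1
    return crc ^ 0xFFFF  # ones complement of the remainder


if __name__ == "__main__":
    print(hex(crc16_x25(b"123456789")))
```

A table-driven variant would be faster; the bitwise form is used here only because it is easy to check against the polynomial.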
279 1.2 Design Requirements of MDTP 281 The following are some of the design requirements of MDTP to 282 make MDTP capable of supporting real-time call control environments 283 that may employ redundant networks: 285 A) High communication fan-out: an endpoint may need to be in 286 simultaneous communication with hundreds or thousands of endpoints 287 performing various call processing functions. These endpoints may 288 be codec converters, SS7 to IP translation applications, or, in the 289 case of mobile networks, data selector and combiner applications. 291 B) Stringent timer control: an endpoint needs to have very fine 292 control over the timing for delivering a datagram. The timing 293 should be easily adjusted depending on the message type and the 294 destination. For example, after a few seconds of non-delivery the 295 call that the message concerns may no longer exist. 298 C) Support multiple network paths: an endpoint communicating with a peer 299 should be able to take advantage of the multiple network paths and 300 multi-homing in a transparent way. Therefore, the protocol must 301 be able to take advantage of local multi-homed hosts and remote 302 multi-homed hosts to provide resilient data delivery. This means 303 that the application or upper layer protocols need not be involved 304 in the network fault management. Instead, when a network failure occurs 305 MDTP should be able to automatically transmit out-bound datagrams to an 306 alternate destination network interface (if one exists) without 307 intervention from the application. 309 D) Reliable transport: datagrams might be lost or discarded while 310 traveling in the IP network towards the destination. The protocol 311 must handle the re-transmission of lost messages in an autonomous 312 way without any intervention from the upper layer.
Also, datagrams 313 may sometimes arrive in duplicate copies; in such cases MDTP must 314 be able to detect and remove the duplicates automatically. 316 E) Support both ordered and unordered delivery: MDTP must support 317 both ordered and unordered delivery. In the case of ordered 318 delivery, the receiver shall detect out-of-order datagrams and 319 re-order them before dispatching them to the upper layer. In the 320 unordered case, received datagrams shall be dispatched without any 321 attempt at re-ordering. 323 F) Support stream sequencing: at the request of the upper layer 324 protocols or applications, MDTP should be able to support sequenced 325 delivery with regard to each individual stream, i.e., the delay caused 326 by the loss and retransmission of a datagram should be isolated to 327 only the stream to which the datagram belongs. This is particularly 328 important in some call control applications, where the loss of a 329 message should only affect the call to which the message belongs. 331 1.3 Interface to MDTP 333 The application programs or upper layer protocols interface with MDTP 334 through a set of primitives (see section 9). 336 Towards the IP networks, it is assumed that UDP is used for the 337 transport layer. No special interfaces or changes are assumed within 338 UDP or at the UDP/MDTP interface. MDTP maintains its own queuing and 339 endpoint association. When MDTP runs on a router or on a 340 gateway-enabled host, it will place no special constraints on the 341 lower layer protocol implementations other than those described in the 342 Router Requirements and Host Requirements RFCs. 344 2. MDTP Datagram Format 346 An MDTP datagram consists of a common header and possibly a control 347 parameter part, a data part, or both.
350 MDTP Datagram Format 352 0 1 2 3 353 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 354 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 355 | CRC-16/MDTP Protocol Identifier | Vers | 356 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 357 | Msg Type | Reserved |C| Data Size | 358 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 359 \ \ 360 / Control Parameter Part / 361 \ \ 362 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 363 \ \ 364 / Data Part / 365 \ \ 366 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 368 Note: Message Type and Data Size in the common header MUST be 369 transmitted in network byte-order. 371 Note: when both the control part and data part are present in an MDTP 372 datagram, the control part MUST be processed first. 374 2.1 MDTP Common Header Field Descriptions 376 CRC-16/MDTP Protocol Identifier: 28 bits 378 When the C Bit is NOT set, this field shall contain the 28 bit 379 MDTP Protocol Identifier with a fixed value of 0xf787307. The 380 receiver shall verify this Protocol Identifier before it 381 considers the received datagram a valid MDTP datagram. 383 When the C Bit is set, the most significant 16 bits of this 384 field shall contain a CRC-16 value, and the other 12 bits shall 385 be filled with '0' by the sender and ignored by the receiver, as 386 illustrated below: 388 0 1 2 389 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 390 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 391 | CRC-16 |0 0 0 0 0 0 0 0 0 0 0 0| 392 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 394 Version: 4 bits 396 This field represents the version number of the MDTP protocol, 397 and shall be set to 0x3. 399 Message Type: 8 bits 401 When the value is non-zero, this shall indicate the type of 402 control message present in the current MDTP datagram.
A value of 403 0x0 indicates the control part is NOT present in the current 404 datagram. 407 The value of Message Type is defined as follows: 409 0x0 - indicating control part is NOT present 411 0x1 - Initiation 412 0x2 - Initiation Ack 413 0x3 - Extended Data Ack 414 0x4 - Advisory Ack Request 415 0x5 - Window-up 416 0x6 - Window-up Ack 417 0x7 - RTT-request 418 0x8 - RTT-ack 419 0x9 - Abort 420 0xa - Graceful Shutdown 421 0xb - Graceful Shutdown Ack 422 0xc - Stream Initiation 423 0xd - Stream Initiation Ack 424 0x10 - Stream Initiation Nack 425 0xe - Stream Termination 426 0xf - Stream Termination Ack 428 0x11 to 0xff - reserved and MUST NOT be used 430 Reserved: 7 bits 432 These bits are reserved for future use. The sender shall always 433 set these bits to '0', and the receiver shall ignore their 434 values. 436 C Bit: 1 bit 438 The CRC flag to indicate whether a CRC-16 value or the MDTP 439 protocol identifier is present in the header, as described 440 above. 442 Data Size: 16 bits 444 This value represents, in number of octets, the size of the user 445 data present in the Data Part of the current datagram. If the 446 Data Part is not present in the current datagram, it MUST be set 447 to 0x0. This implies that a Data Part carrying zero-size user data 448 is not allowed. 450 2.2 MDTP Control Parameter Part Definitions 452 This section defines whether a control parameter part is present for 453 each message type, and its format if a control parameter part is 454 present. 456 Note: integers in the control parameter part MUST be transmitted in 457 network byte-order.
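The common header described in section 2.1 can be sketched as two 32-bit words. The following is a minimal illustration. It assumes the whole 8-octet header is sent in network byte-order (the text only states this explicitly for Message Type and Data Size) and that bit 0 in the diagram is the most significant bit. The function names pack_common_header and parse_common_header and the dictionary keys are hypothetical.

```python
import struct

MDTP_PROTOCOL_ID = 0xF787307  # 28-bit identifier from section 2.1
MDTP_VERSION = 0x3            # version value from section 2.1


def pack_common_header(msg_type, data_size, crc16=None):
    """Build the 8-octet common header; crc16=None leaves the C Bit clear."""
    if not 0 <= msg_type <= 0xFF:
        raise ValueError("Message Type is 8 bits")
    if not 0 <= data_size <= 0xFFFF:
        raise ValueError("Data Size is 16 bits")
    if crc16 is None:
        field28, c_bit = MDTP_PROTOCOL_ID, 0
    else:
        # CRC-16 occupies the top 16 of the 28 bits; the low 12 bits are zero
        field28, c_bit = (crc16 & 0xFFFF) << 12, 1
    word1 = (field28 << 4) | MDTP_VERSION
    # Reserved (7 bits) stays zero; the C Bit sits just above Data Size
    word2 = (msg_type << 24) | (c_bit << 16) | data_size
    return struct.pack("!II", word1, word2)


def parse_common_header(raw):
    """Decode the common header and check the Protocol Identifier."""
    word1, word2 = struct.unpack("!II", raw[:8])
    field28 = word1 >> 4
    c_bit = (word2 >> 16) & 0x1
    if not c_bit and field28 != MDTP_PROTOCOL_ID:
        raise ValueError("not an MDTP datagram")
    return {
        "version": word1 & 0xF,
        "msg_type": word2 >> 24,
        "c_bit": c_bit,
        "data_size": word2 & 0xFFFF,
        "crc16": (field28 >> 12) if c_bit else None,
    }
```

For instance, pack_common_header(0x1, 0) yields the header of an Initiation message that carries no Data Part.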
459 2.2.1 Initiation (0x1) and Initiation Ack (0x2): 461 The parameter field of the Initiation and Initiation Ack messages 462 shall carry two initiation Tags, the maximal window length of the 463 sender, the sender's T2-Receive timer value in microseconds, the 464 number of pre-open outbound streams (P), the number of maximal 465 inbound streams (M), and the sender's local network 466 information. The network information informs the receiver the 467 addresses that may be the source of datagrams for this association 468 and are valid addresses that the receiver can use as a destination 469 address. Note that the endpoint MAY be multi-homed. 471 Stewart, et al [Page 7] 472 The following defines the parameter format for carrying N IPv4 473 Network addresses (other network address formats can be carried by 474 setting the size and type fields accordingly): 476 0 1 2 3 477 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 478 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 479 | Tag Value 1 (Seen) | 480 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 481 | Tag Value 2 (Send) | 482 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 483 | Max Window Length | 484 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 485 | My T2-Recv Timer value in microseconds | 486 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 487 | Number of Pre-open Streams (P) | 488 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 489 | Number of Max Streams (M) | 490 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 491 | Number of Networks = N | 492 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 493 | Size of address=8 | Type of Address=2 | 494 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 495 | IP Address of Network 1 | 496 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 497 | Port # 1 | Padding = 0 | 498 
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 499 / / 500 \ ... \ 501 / / 502 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 503 | Size of address=8 | Type of Address=2 | 504 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 505 | IP Address of Network N | 506 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 507 | Port # N | Padding = 0 | 508 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 510 If there is any implementation-specific data needed to be 511 exchanged at the setup of the association, it should be appended 512 to the end of the above data structure. The format of the 513 implementation-specific data should follow "Size/Type/Data Field" 514 format as defined above. In case an endpoint does not support the 515 implementation-specific data received, it shall ignore the 516 additional fields. 518 2.2.2 Extended Data Ack (0x3): 520 The parameter field contains 0 or more segment reports and the 521 highest consecutive TSN received. 523 Stewart, et al [Page 8] 524 0 1 2 3 525 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 526 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 527 | Number of Segments = N | 528 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 529 | Segment #1 Start TSN | 530 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 531 | Segment #1 End TSN | 532 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 533 / / 534 \ ... 
\ 535 / / 536 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 537 | Segment #N Start TSN | 538 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 539 | Segment #N End TSN | 540 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 541 | Highest Consecutive TSN Seen | 542 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 544 For example, assume the receiver has the following datagrams newly 545 arrived at the time when it decides to send an Extended Data Ack, 547 ---------- 548 | TSN=17 | 549 ---------- 550 | | <- still missing 551 ---------- 552 | TSN=15 | 553 ---------- 554 | TSN=14 | 555 ---------- 556 | | <- still missing 557 ---------- 558 | TSN=12 | 559 ---------- 560 | TSN=11 | 561 ---------- 562 | TSN=10 | 563 ---------- 565 the control parameter part of the Extended Data Ack shall be 566 constructed as follows: 568 -------------------------------- 569 | number of seg = 2 | 570 -------------------------------- 571 | seg #1 start = 17 | 572 -------------------------------- 573 | seg #1 end = 17 | 574 -------------------------------- 575 | seg #2 start = 14 | 576 -------------------------------- 577 | seg #2 end = 15 | 578 -------------------------------- 579 | highest consecutive TSN = 12 | 580 -------------------------------- 582 Note: when multiple segments are reported in a single Extended 583 Data Ack, the order of the segments in the Extended Data Ack is 584 not specified. 586 2.2.3 Advisory Ack Request (0x4): 588 No parameter field. 590 2.2.4 Window-up (0x5): 592 No parameter field. 594 2.2.5 Window-up Ack (0x6): 596 Same as that of Extended Data Ack. 598 2.2.6 RTT-request (0x7) and RTT-ack (0x8): 600 The parameter field shall contain the time value that is used for 601 RTT calculation (see section 6.2), and optionally an 602 acknowledgment Seen value. 
604 0 1 2 3 605 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 606 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 607 | Time Value 1 | 608 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 609 | Time Value 2 | 610 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 611 | 0x0 or TSN Seen | 612 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 614 2.2.7 Abort (0x9): 616 The Abort message shall carry the initiation Tag of the 617 destination endpoint as a measure of security. 619 Stewart, et al [Page 9] 620 0 1 2 3 621 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 622 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 623 | Init-Tag | 624 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 626 2.2.8 Graceful Shutdown (0xa): 628 The destination endpoint initiation Tag shall be carried as a 629 measure of security. 631 0 1 2 3 632 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 633 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 634 | Init-Tag | 635 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 636 | TSN Seen | 637 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 639 2.2.9 Graceful Shutdown Ack (0xb): 641 No parameter field. 643 2.2.10 Stream Initiation (0xc): 645 The parameter field shall contain the initiation Tag of the 646 destination endpoint (see section 3.1) and the Stream Identifier. 647 Also, there shall be a "Size of Stream Info" and "Stream 648 Information" fields that may contain an opaque user data structure 649 specific to the stream being opened. The "Stream Information" 650 field should be padded with '0's to 32 bit word boundary before 651 transmission. 
653 0 1 2 3 654 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 655 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 656 | Init-Tag | 657 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 658 | Stream Identifier | Reserved (set to 0) | 659 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 660 | Size of Stream Info = N | 661 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 662 / / 663 \ Stream Information (N octets) \ 664 / / 665 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 667 2.2.11 Stream Initiation Ack (0xd): 669 The parameter field shall contain the Stream Identifier. 671 0 1 2 3 672 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 673 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 674 | Stream Identifier | Reserved (set to 0) | 675 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 677 2.2.12 Stream Initiation Nack (0x10): 679 Same as that of Stream Initiation Ack. 681 2.2.13 Stream Termination (0xe): 683 The parameter field shall contain the initiation Tag value (see 684 section 3.1) and the Stream Identification 686 0 1 2 3 687 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 688 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 689 | Init-Tag | 690 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 691 | Stream Identifier | Reserved (set to 0) | 692 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 694 2.2.14 Stream Termination Ack (0xf): 696 Same as that of Stream Initiation Ack. 
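To make the Extended Data Ack format of section 2.2.2 more concrete, the following minimal sketch derives the segment reports and the highest consecutive TSN from a set of received TSNs, then lays them out as 32-bit integers in network byte-order. The function names and the last_consecutive argument (the highest consecutive TSN before the new arrivals) are hypothetical, and TSN wrap-around at 2^32 is deliberately ignored. Segments are emitted in ascending order, which is one permitted choice since the order is not specified.

```python
import struct


def build_extended_data_ack(last_consecutive, received):
    """Return (highest consecutive TSN, list of (start, end) segments)."""
    pending = set(received)
    highest = last_consecutive
    # Advance the cumulative point over TSNs that arrived without gaps
    while highest + 1 in pending:
        highest += 1
    segments = []
    for tsn in sorted(t for t in pending if t > highest):
        if segments and tsn == segments[-1][1] + 1:
            segments[-1] = (segments[-1][0], tsn)  # extend current segment
        else:
            segments.append((tsn, tsn))            # start a new segment
    return highest, segments


def pack_extended_data_ack(highest, segments):
    """Serialize: segment count, start/end pairs, highest consecutive TSN."""
    words = [len(segments)]
    for start, end in segments:
        words += [start, end]
    words.append(highest)
    return struct.pack("!%dI" % len(words), *words)


if __name__ == "__main__":
    # The example from section 2.2.2, assuming TSN 9 was already acknowledged
    print(build_extended_data_ack(9, [17, 15, 14, 12, 11, 10]))
```

With the arrivals from the example in section 2.2.2, this reports segments 14-15 and 17-17 with a highest consecutive TSN of 12.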
2.3 MDTP Data Part Definitions

   The following format shall be used for MDTP datagram Data Part:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           TSN Seen                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           TSN Send                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      Stream Identifier S      |       Sequence Number n       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   \                                                               \
   /                 User Data (seq n of Stream S)                 /
   \                                                               \
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Note: TSN Seen, TSN Send, Stream Identifier, and Sequence Number MUST
   be transmitted in network byte-order.

   TSN Seen: 32 bits

      This is a piggy-backed acknowledgment, indicating the reception
      of datagrams up to this TSN.

   TSN Send: 32 bits

      This value represents the TSN of the user data carried in this
      datagram.

   Stream Identifier S: 16 bits

      Identifies the stream to which the following user data belongs.

   Sequence Number n: 16 bits

      This value represents the sequence number of the following user
      data within the stream.

      Sequence number 0x0 indicates that the following user data shall
      be treated as unordered, and shall be dispatched to the upper
      layer by the receiver without any attempt at re-ordering.

   User Data: variable length

      This is the payload user data. The size of the user data shall
      be specified in the Data Size field. The implementation may
      optionally pad the end of the User Data field with '0's.

3. Endpoint Association Initialization

Before the first data transmission can take place from one endpoint
("A") to another endpoint ("Z"), the two endpoints must complete an
initialization process in order to set up an association between them.

The upper layer may explicitly request MDTP to initialize an
association to an endpoint, or implicitly open the association by
sending the first datagram to that endpoint on stream 0.

Once the association is established, stream 0 is automatically opened
and ready for datagram transmission in both directions. Moreover, if
there are any pre-open streams specified by either side, they shall
also be opened and ready for transmission from that side.

Other streams must be explicitly opened before data transmission can
occur.

A tag-and-lock mechanism must be employed during the initialization
in order to guard against security attacks as well as erroneous
datagrams.

3.1 Initiation Message and Tag Lock

The initialization process consists of the following steps (assuming
that MDTP endpoint "A" tries to set up an association with MDTP
endpoint "Z"):

A) "A" shall first send an Initiation message to "Z", with Tag Seen
   field set to 0x0 and Tag Send field set to Tag_A, where Tag_A shall
   be a random number in the range of 0x80000000 to 0xffffffff (see
   3.1.4 for Tag value selection), and enter the Tag-lock mode.

B) "Z" shall respond immediately with an Initiation Ack message, with
   Seen set to Tag_A and Send set to Tag_Z (same range as Tag_A), and
   enter the Tag-lock-new mode.

   At this point, "Z" is ready to send user datagrams to "A" in stream
   0. And upon the reception of the above Initiation Ack from "Z", "A"
   also becomes ready to send user datagrams to "Z" in stream 0.

   Note: user data in other streams cannot be sent until the
   respective streams are opened.

C) "Z" shall leave Tag-lock-new mode and enter Tag-lock mode only if a
   user datagram has been sent out from "Z" to "A".

Note: to guard against "man in the middle" attacks, an endpoint
should impose a limit on the number of associations allowed to be
in the Tag-lock-new mode; whenever this limit is reached, any
further association Initiations received by the endpoint shall be
silently discarded. Also, a timer shall be used on each association
that is in the Tag-lock-new state; at the expiration of that timer,
that association shall be shut down by the endpoint by sending an
Abort to the peer of that association.

Note: no user data shall be carried in either the Initiation or the
Initiation Ack message, i.e. the Data Size field in the MDTP common
header must be set to 0x0.

Note: if an endpoint receives an Initiation but decides not to
establish the new association due to lack of resources, etc.,
it shall respond to the Initiation with an Abort message.

3.1.1 Passing Initiation Parameters

In addition to the Tags, both sides must exchange their local network
information, maximal window length, the sender's T2-Receive timer
value in microseconds, number of pre-open outbound streams (P), and
number of maximal inbound streams (M), in the Initiation and
Initiation Ack messages. The receiver shall process and store
these initiation parameters.

The maximal window length from the peer will be used to validate the
TSN range of the received datagrams (see section 4.6).

The sender's T2-receive timer will be used to adjust the T3-send timer
(see section 4.1.1).

The number of maximal inbound streams (M) shall indicate the maximal
number of concurrent streams the sender will accept from its peer
(excluding stream 0). The sender will reject any new Stream Initiation
request from its peer if this number is reached, unless some of the
currently open streams are closed first by the peer.

The sender shall use the number of pre-open outbound streams (P) to
indicate to its peer that, in addition to stream 0, the sender
wants to have that many more streams (from stream 1 to stream P)
implicitly opened from the sender's side at the onset of the
association. This allows the receiver to allocate and initialize
necessary resources for the additional P inbound streams.

However, if the sender's P is greater than, or equal to, the
receiver's M, the receiver shall replace the sender's P with M, and
then only pre-open M inbound streams (from stream 1 to stream M). At
the same time, the sender also must either settle for M, instead of
P, pre-open outbound streams, or abort the association and report the
resource shortage.

3.2 Tag Unlock and TSN Initialization

The first user datagram transmitted by "A" to "Z" shall have the TSN
Seen value set to Tag_Z in the Data Part (see 2.3).

Similarly, the first user datagram transmitted by "Z" to "A" shall
have the TSN Seen value set to Tag_A.

The reception of this first datagram with user data and with the
correct Tag value in the TSN Seen field from its peer shall unlock the
Tag and cause the endpoint to leave the Tag-lock or Tag-lock-new mode.

The receiver shall immediately send back an Extended Data Ack to
acknowledge the reception of this first user datagram.

The TSN Send value carried in this first datagram with user data shall
be used to establish the initial TSN of this peer, i.e., the sender of
this datagram.

To strengthen the security, this initial TSN shall be randomly
selected from the range between 0x1 and 0x7fffffff by the sender, by
means such as those suggested in RFC 1750 [9].

Note: When an endpoint receives the first user datagram that causes it
to leave the Tag-lock or Tag-lock-new mode, it shall immediately
send an Extended Data Ack to acknowledge the reception of this user
datagram and shall NOT start a T2-recv timer. For all the subsequent
user datagram receptions, the receiver shall follow the normal timer
rules.

3.3 Datagram Processing during Tag Lock

In Tag-lock or Tag-lock-new mode, an endpoint shall silently discard
any user datagrams from the peer endpoint that do not carry the
correct Tag value.

However, if there is a control part present in a discarded user
datagram, the endpoint shall always process the
control part even when the data part is being discarded.

If another Initiation from "A" is received by "Z" after "Z" sent out
its Initiation Ack, "Z" shall respond to this second Initiation by
re-sending the Initiation Ack if the Tag Send field of this second
Initiation has the same value as that of the original Initiation.
Otherwise, "Z" shall respond by sending an Initiation of its own, with
Tag Send field set to Tag_Z, so as to elicit an Initiation Ack from
"A".
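
The handling described in section 3.3 can be sketched in Python as
follows. This is an illustration only: the class and function names
are hypothetical, messages are modelled as plain tuples, and the
Tag Seen value of 0x0 in "Z"'s own Initiation is an assumption taken
from step A of section 3.1.

```python
from dataclasses import dataclass


@dataclass
class PendingAssociation:
    """State "Z" keeps while tag-locked (illustrative names only)."""
    local_tag: int  # Tag_Z, chosen by this endpoint
    peer_tag: int   # Tag Send value of the original Initiation (Tag_A)


def reply_to_repeated_initiation(assoc, tag_send):
    """Return (message, tag_seen, tag_send) for a further Initiation."""
    if tag_send == assoc.peer_tag:
        # Same Tag as the original Initiation: re-send the Initiation Ack.
        return ("Initiation Ack", assoc.peer_tag, assoc.local_tag)
    # Different Tag: send an Initiation of our own to elicit an Ack.
    # Tag Seen = 0x0 is assumed here, as in step A of section 3.1.
    return ("Initiation", 0x0, assoc.local_tag)


def accept_data_part(expected_tag, tsn_seen):
    """While tag-locked, keep a data part only if it carries the Tag.

    A control part in the same datagram is processed either way.
    """
    return tsn_seen == expected_tag
```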

3.4 An Example of Association Initialization

In the following example, "A" initiates the association first and then
sends a user datagram to "Z", then "Z" sends two user datagrams
some time later:

Endpoint A                                      Endpoint Z

{app sets association with Z}
Initiation
[Tag Seen=0,Tag Send=Tag_A
 & net addr info]    --------\
(Start T1-init timer)         \
(Enter Tag_A-lock mode)        \---->Initiation Ack
                                     [Tag Seen=Tag_A,Tag Send=Tag_Z
                                /---- & net addr info]
                               /     (Enter Tag_Z-lock-new mode)
(Cancel T1-init timer)<-------/

{app sends 1st user data; strm 0}
U-Data
[Seen=Tag_Z,Send=init TSN-A
 Strm=0,Seq=1,
 & user data]         -------\
(Start T3-send timer)         \
                               \---->(Leave Tag_Z-lock-new mode)
                               ------Ext Data Ack
                              /      [Seg=0,TSN Seen=init TSN-A]
(Cancel T3-send timer) <-----/
..

..
                                    {app sends 2 datagrams;strm 0}
                              /---- U-data
                             /      [Seen=Tag_A,Send=init TSN-Z
(Leave Tag_A-lock mode) <----/       Strm=0,Seq=1,
Ext Data Ack                         & user data 1]
[Seg=0,TSN Seen=init TSN-Z]    /---- U-data
                    --------\ /     [Seen=init TSN-A,
                             \/      Send=init TSN-Z +1,
(Start T2-receive timer)<---/\       Strm=0,Seq=1, & user data 2]
                              \
                               \------>

If T1-init timer expires at "A" after the Initiation is sent, the same
Initiation message with the same Tag_A value shall be retransmitted and
the timer restarted. This shall be repeated Max.Init.Retransmit times
before "A" considers "Z" unreachable and optionally reports the
failure.

3.5 Other Initiation Issues

3.5.1 Selection of Tag Value

Tag values should be selected from the range of 0x80000000 to
0xffffffff. It is very important that the Tag value be randomized to
guard against "man in the middle" and "sequence number" attacks. It is
suggested that RFC 1750 [9] be used for the Tag randomization.
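
As an illustration of the value ranges given in sections 3.2 and
3.5.1, the following Python sketch draws a Tag and an initial TSN
from the operating system's random source through the standard
"secrets" module. Treating that source as adequate in the sense of
RFC 1750 is an assumption, and the function names are hypothetical.

```python
import secrets


def new_tag():
    """Random Tag in the range 0x80000000 to 0xffffffff inclusive."""
    # randbelow(n) returns an integer in the range [0, n).
    return 0x80000000 + secrets.randbelow(0x80000000)


def new_initial_tsn():
    """Random initial TSN in the range 0x1 to 0x7fffffff inclusive."""
    return 1 + secrets.randbelow(0x7FFFFFFF)
```

Keeping the Tag range and the TSN range disjoint means a Tag value
seen in the TSN Seen field cannot be mistaken for an ordinary TSN.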

3.5.2 Initiation from behind a NAT

When a NAT is present between two endpoints, the endpoint that is
behind the NAT, i.e., one that does not have a publicly available
network address, shall take one of the following options:

A) Indicate that it has only one network by setting the 'Number of
networks' field in the Initiation message to 0. This causes the
endpoint that receives this Initiation message to consider the sender
as only having that one address. This method can be used for a dynamic
NAT, but any multi-homing configuration at the endpoint that is behind
the NAT will not be visible to its peer, and thus cannot be taken
advantage of.

B) Indicate all of its networks in the Initiation by specifying all
the actual IP addresses and ports that the NAT will substitute for the
endpoint. This method requires that the endpoint behind the NAT
have prior knowledge of all the IP addresses and ports that the NAT
will assign.

3.5.3 Initialization Collision

If two endpoints attempt to initialize an association with each other
at about the same instant, a collision will occur. As a result, each
side will receive an Initiation datagram from the other side after it
transmitted its own. In such a case, both sides shall send an
Initiation Ack datagram to the other side using the procedure
described above.

3.5.4 Association Re-initialization

An endpoint shall be allowed to re-initialize an established
association with the other endpoint.

Once an endpoint has left the Tag-lock or Tag-lock-new mode of the
previous association initialization process, it shall treat any new
Initiation message from its peer as a re-initialization event.

During a re-initialization, both endpoints shall follow the same
procedure as defined in section 3.1.
A new Init-Tag must be used
by the endpoint that receives the Initiation message, if it has already
left the previous Tag-lock or Tag-lock-new mode.

Association re-initialization affects ongoing transmissions and
their resources. The receiver of the new Initiation may need to
perform garbage-collection on its resources, including:

A) automatically terminating all existing streams within the current
   association and releasing the resources,

B) cancelling any running timers,

C) removing all outstanding datagrams of the current association
   from its retransmission queue, and

D) optionally, notifying the upper layer about the re-initialization.

4. Transfer User Datagram

The receiver of a user datagram shall always acknowledge the reception
to the sender of the datagram. Normally, delayed acknowledgment shall
be used. The delay shall be controlled by a T2-receive timer.

At the expiration of T2-receive timer, if there is out-bound user data,
the ack should be piggy-backed on the data part of the out-bound user
datagram, occupying the TSN Seen field (see section 2.3). Otherwise, a
stand-alone Extended Data Ack shall be used to carry the
acknowledgment.

When Extended Data Ack is used, the sender shall fill the Highest
Consecutive TSN Seen field to indicate the highest TSN Send number it
has received from the peer. Any received segments must also be
reported (see sections 2.2.2 and 4.5).

The following example illustrates both stand-alone and piggy-backed
acknowledgments:

Endpoint A                                      Endpoint Z
{App sends 3 messages in strm 0}
U-Data
[Seen=5,Send=7,Strm=0,Seq=3]--------> (Start T2-receive timer)
(Start T3-send timer)

U-Data
[Seen=5,Send=8,Strm=0,Seq=4]-------->

U-Data
[Seen=5,Send=9,Strm=0,Seq=5]-------->
...
                                      {Timer T2 expires}
                            /--------- Extended Data Ack
                           /          [Seg=0,Seen=9]
(cancel T3-send timer) <----/
...
...
{App sends 1 message; strm 0}
U-Data
[Seen=5,Send=10,Strm=0,Seq=6]-------> (Start T2-receive timer)
(Start T3-send timer)
...
                                      {App sends 1 message; strm 1}
                                      (cancel T2-receive timer)
                               /------ U-Data
                              /       [Seen=10,Send=6,Strm=1,Seq=2]
                             /        (Start T3-send timer)
(cancel T3-send timer) <------/
(Start T2-receive timer)
..
{Timer T2 Expires}
Extended Data Ack
[Seg=0,Seen=6]----------------------> (cancel T3-send timer)

4.1 Timer Management Rules

The following rules shall be used to manage the timers during
normal datagram transfer, unless otherwise stated for some special
cases:

A) When a user datagram is received, the endpoint shall start a
   T2-receive timer if no T2-receive timer is currently running. Upon
   the expiration of the T2-receive timer, the endpoint shall
   acknowledge to the sender all the un-acked user datagrams it has
   received.

B) When a user datagram is sent out, the sending endpoint shall start
   a T3-send timer if no T3-send timer is currently running.

   If the T2-receive timer is running, the endpoint shall first stop
   the T2 timer, piggy-back an ack (or Extended Data Ack) onto the
   out-bound datagram, and then start a T3-send timer.

   If the T3-send timer expires, the endpoint shall follow the rules
   described in 4.6 for possible re-transmission of the un-acked
   datagrams.

   Moreover, whenever the T3-send timer is started the RTT estimate
   last calculated for that remote network address should be added to
   the base T3-send timer value (see sections 6.2 and 6.3 for RTT).

C) When all outstanding datagrams are acknowledged, the T3-send timer
   shall be stopped if one is still running.

D) If an endpoint has a T3-send timer running and receives a partial
   acknowledgment (one that acknowledges some of the outstanding
   datagrams), the endpoint shall restart the T3-send timer.

The following example shows the use of various timers.

Endpoint A                                      Endpoint Z
{App sends 2 messages; strm 0}
U-Data
[Seen=5,Send=7,Strm=0,Seq=3] ---------> (Start T2-receive timer)
(Start T3-send timer)

U-Data                                {App sends 1 message; strm 1}
[Seen=5,Send=8,Strm=0,Seq=4] -\   /-- (cancel T2-receive timer)
                               \ /    U-Data
                                \ /   [Seen=7,Send=6,Strm=1,Seq=2]
                                 \    (Start T3-send timer)
                                 / \
(Re-start T3-send timer) <-------/   \
(Start T2-receive timer)              \
...                                    -> (Start T2-receive timer)
...
{T2-receive timer expires}
Extended Data Ack
[Seg=0,Seen=6] -----------------------> (Cancel T3-send timer)
..
                                        {T2-receive timer expires}
(Cancel T3-send timer) <---------------- Extended Data Ack
                                        [Seg=0,Seen=8]

4.1.1 T3-send Timer Adjustment with RTT

The sender shall keep track of the latest RTT measurement for the
destination IP address (or addresses if the remote host is
multi-homed) of its peer. Three procedures for obtaining RTT
measurements are defined in sections 4.7, 6.2, and 6.3,
respectively. The calculation of RTT should follow the method
described in [4].

Every time a new datagram is sent for the first time (i.e., not
for re-transmission), the following procedure shall be applied to
determine the T3-send timer value:

   1. TL3-value = 'TL3-default'

   2. if TL3-value <= Receiver's T2-Recv + highest-RTT,
         TL3-value = TL3-value + highest-RTT
      end-if

   3.
      T3-send = TL3-value + network-RTT

where 'TL3-default' is a protocol parameter configurable by the
endpoint, the receiver's T2-Recv timer value is known from the
association initiation (see section 3.1.1), the highest-RTT is the
current highest RTT measurement across all the destination IP
addresses available for transmission, and the network-RTT is the
current RTT measurement of the destination IP address to which this
transmission is to take place (see section 4.2.1 for how the
destination IP address is determined).

However, if the previous T3-send timer expired and is being re-started
for a re-transmission, the timer back-off rules defined in section 5.2
shall be used instead.

4.2 Multihoming Rotation

4.2.1 Remote Multihoming Rotation

When an endpoint is transmitting to a remote multi-homed endpoint, the
transmitting endpoint shall rotate between destination IP addresses.
Every time the application transmits a datagram, MDTP MUST keep track
of the remote IP address to which it sent the datagram in the MDTP
protocol variable 'last.sent.intf'. MDTP should rotate each send in a
round robin fashion amongst all available destination IP addresses on
the remote multi-homed host and should update the protocol variable
'last.sent.intf' to indicate which destination IP address it last
used.

If possible, acks should be transmitted to the same IP address from
which the acked messages were received. When acknowledging multiple
messages, this may not be possible. In the latter case, MDTP SHOULD
rotate the transmission of acknowledgments to all remote IP addresses.

The MDTP implementation MUST allow an application to override this
rotation by specifying the destination IP address to which to send a
datagram. The implementation must also provide an interface to add
and remove a remote IP address from rotation eligibility.
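
The rotation rules of section 4.2.1 can be sketched in Python as
shown below. This is a minimal illustration rather than a required
design: the class and method names are hypothetical, and restarting
at the first eligible address when 'last.sent.intf' is no longer
eligible is an assumption.

```python
class RemoteRotation:
    """Round-robin choice of destination address (illustrative only)."""

    def __init__(self, addresses):
        self.eligible = list(addresses)
        self.last_sent_intf = None  # mirrors 'last.sent.intf'

    def add(self, addr):
        """Make a remote address eligible for rotation."""
        if addr not in self.eligible:
            self.eligible.append(addr)

    def remove(self, addr):
        """Withdraw a remote address from rotation eligibility."""
        self.eligible.remove(addr)

    def next_destination(self, override=None):
        """Pick the destination for the next send and record it."""
        if override is not None:
            dest = override  # application-specified destination
        elif not self.eligible:
            raise RuntimeError("no eligible destination address")
        elif self.last_sent_intf in self.eligible:
            i = self.eligible.index(self.last_sent_intf)
            dest = self.eligible[(i + 1) % len(self.eligible)]
        else:
            dest = self.eligible[0]  # assumed restart point
        self.last_sent_intf = dest
        return dest
```

An application override still updates 'last.sent.intf' in this
sketch, so rotation continues from the overridden address.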

4.2.2 Local Multihoming Rotation

As discussed in section 3.3.4 of RFC 1122, an endpoint MAY rotate
transmitted messages amongst all local network interfaces by
specifying the local IP address and UDP port, or it may allow the
networking protocol to decide which local IP address (and network
interface) to use to transmit a datagram.

If possible, acks should be transmitted from the same IP address over
which the acked messages were received. When acknowledging multiple
messages, this may not be possible. In the latter case, MDTP SHOULD
rotate the transmission of acknowledgments from all configured IP
address/port pairs.

4.3 Stream Sequence Number

The datagram stream sequence number shall always be set to 1 when the
stream is opened.

Also, when the stream sequence number reaches the value 0xffff the
next sequence number shall be set to 1. Sequence number '0' has
special meaning (see section 4.4) and shall not be used in normal
sequence number rotation.

4.4 Ordered and Un-ordered Delivery

Normally, the receiver shall ensure that the user datagrams within any
given stream are delivered to the upper layer in the order of
their stream sequence numbers. If datagrams arrive out of
order, the receiver must hold the
received datagrams from delivery until they are re-ordered.

However, a sender can set the stream sequence number of a user
datagram to 0, to indicate that no ordering shall be performed on that
datagram within that stream. Upon the reception of the datagram, the
receiver must by-pass the ordering mechanism and immediately deliver
the datagram to the upper layer.

This provides an effective way to transmit "out-of-band" data in any
given stream.
Also, a stream can be used as an "un-ordered" stream by
simply setting the stream sequence number of each out-bound user
datagram to 0.

4.5 Report Missing Datagrams

MDTP uses a receiver-based retransmission policy, where the sender
attempts to elicit from the receiver information on the missing
datagrams before the retransmission.

If a receiver detects holes in the received user datagram sequence (by
examining TSN Send numbers), an Extended Data Ack with segment reports
shall be sent back to inform the sender so that the sender can
calculate and re-transmit the missing datagrams.

Multiple segments can be indicated in one single Extended Data Ack
(see section 2.2.2).

If there is outbound user data, the endpoint shall piggy-back the
Extended Data Ack with the user data in the same MDTP datagram, and
the TSN Seen field in the data part shall not be used, i.e., the
sender shall set the field to 0x0 and the receiver shall ignore it.

The following example shows the use of segment report in an Extended
Data Ack.

Endpoint A                                      Endpoint Z
{App sends 3 messages; strm 0}
U-Data
[Seen=3,Send=6,Strm=0,Seq=2]-------> (Start T2-receive timer)
(Start T3-send timer)

U-Data
[Seen=3,Send=7,Strm=0,Seq=3]-----X (lost)

U-Data
[Seen=3,Send=8,Strm=0,Seq=4]-------> (A seg detected in data)
..
                                     {T2-receive timer expires}
                              /------ Extended Data Ack
                             /       [Seg=1,Strt=8,End=8,Seen=6]
(Prepare retransmission) <----/

In this example, when "Z" receives the third datagram from "A" it
realizes that a gap exists in the received data. At the expiration of
T2-receive timer, "Z" sends an Extended Data Ack with a segment report
to "A" to indicate the missing datagram.
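
The way a receiver might derive the values shown in the example above
can be sketched in Python. The sketch assumes that Strt and End are
absolute TSN values, as they appear in the example, and it ignores
TSN wrap-around; the function name and the tuple representation are
hypothetical and do not reproduce the wire format of section 2.2.2.

```python
def build_ack_report(highest_consecutive, received):
    """Return (seen, segments) for an Extended Data Ack.

    highest_consecutive: highest TSN already received without a gap.
    received: TSN Send values received beyond that point.
    TSN wrap-around is deliberately not handled in this sketch.
    """
    pending = set(received)
    seen = highest_consecutive
    # Advance the cumulative point over every consecutive TSN.
    while seen + 1 in pending:
        seen += 1
    # Group what lies beyond the first hole into (start, end) runs.
    segments = []
    for tsn in sorted(t for t in pending if t > seen):
        if segments and tsn == segments[-1][1] + 1:
            segments[-1] = (segments[-1][0], tsn)
        else:
            segments.append((tsn, tsn))
    return seen, segments
```

With TSN 6 and 8 received after TSN 5, the result is Seen=6 with one
segment from 8 to 8, which matches the Extended Data Ack in the
example.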

When the peer endpoint is multi-homed, the Extended Data Ack should be
sent out to the destination IP address specified in the MDTP protocol
variable 'last.good.intf'. The value of 'last.good.intf' is always
updated to point to the source IP address from which the last datagram
from the peer endpoint arrived.

4.6 Range Check on TSN

For security reasons, the receiver must check the range of the TSN
Send value in each received user datagram.

Assume that the highest TSN received from a peer is T and the maximal
window length of the same peer is W (exchanged during association
initiation, see section 3.1). When the next user datagram arrives from
this peer, the receiver shall silently discard the datagram if the TSN
Send value carried in the datagram is greater than T+W (the
calculation wraps around from 0x7fffffff to 0x1).

4.7 Advisory Ack Request

An endpoint may use Advisory Ack Requests to improve bandwidth
utilization, in combination with the window control (see section 5.1).

An Advisory Ack Request shall always be piggy-backed on an outbound
user datagram.

The endpoint should send an Advisory Ack Request to its peer when:

A) it reaches half of its window length with the sending of the
   current user datagram, or

B) it detects that the next send will reach the full window length
   with the sending of the current user datagram.

After the receiver detects the Advisory Ack Request in the control
part of the datagram, it should handle it with the following rules:

A) The receiver may choose to ignore the peer's Advisory Ack Request
   for any reason, such as flow control, etc., and move on to
   process the data part.

B) If the receiver chooses to respond, it should, at the end of
   processing the data part, immediately send an Extended Data Ack
   to acknowledge all the un-acked datagrams (including the one it
   just processed), and cancel its T2-receive timer if one is still
   running.

The following diagram shows an example of using Advisory Ack Request:

Endpoint A                                       Endpoint Z
{App sends 3 messages; strm 0}
U-Data
[Seen=5,Send=7,Strm=0,Seq=3]-------------> (Start T2-recv timer)
(Start T3-send timer)

U-Data
[Seen=5,Send=8,Strm=0,Seq=4]----------->

{detects window half full, use Advisory Ack Req}
Adv Ack Request/U-data
[Seen=5,Send=9,Strm=0,Seq=5]------\
                                   \
                                    \----> (cancel T2-receive timer)
                   <---------------- Extended Data Ack
                                     [Seg=0,Seen=9]

An endpoint sending an Advisory Ack Request may also use this request
for its RTT calculation. The sending endpoint may note the time mark
when sending the datagram with the Advisory Ack Request. When the
peer endpoint responds with an Extended Data Ack, the sender of the
Advisory Ack Request may use the time mark of the arriving Extended
Data Ack and the stored time mark to calculate the RTT as defined in
[4]. However, the sender of the Advisory Ack Request shall abandon the
RTT calculation if more datagrams are sent to its peer and no Extended
Data Ack is received.

4.8 CRC-16 Utilization

When sending a datagram, the sender can choose to strengthen the data
integrity of the transmission by including a CRC-16 value of the
datagram.

After the datagram is constructed, the sender shall:

1) set the C Bit to '1' and fill the 28 bit CRC-16/MDTP Protocol
   Identifier field with '0', and the 4 bit Version field to the
   current MDTP version number (binary 0011).

2) calculate a CRC-16 value of the whole datagram, including the
   MDTP common header, the Control Parameter Part if present, and
   the Data Part if present,

3) put the resultant CRC-16 value into the most significant 16 bits
   of the CRC-16/MDTP Protocol Identifier, and leave the rest of the
   bits unchanged.

When a datagram is received, the receiver must first check the C
Bit. If the C Bit is set, the receiver shall:

1) store the received CRC-16 value (the most significant 16 bits of
   the first word of the datagram),

2) replace the 16 bit CRC-16/MDTP Protocol Identifier field with '0'
   and calculate a CRC-16 value of the whole received datagram,

3) verify that the calculated CRC-16 value is the same as the
   received CRC-16 value, and

4) handle the datagram as an invalid MDTP datagram if the CRC-16
   values mismatch.

If the C Bit is not set, the receiver shall check the MDTP Protocol
Identifier instead, and handle the datagram as an invalid MDTP
datagram if the check fails.

The default procedure of handling invalid MDTP datagrams is to
silently discard them.

5. Congestion Controls

Several different mechanisms shall be used jointly to achieve
congestion control in MDTP. These mechanisms always apply to
the association as a whole, not to an individual stream.

5.1 Send with Window Control

The sending endpoint shall use a transmission window to control the
number of outstanding datagrams, i.e., datagrams that have been sent
but have yet to be acknowledged. The length of the window is defined
as the maximal number of outstanding datagrams a sending endpoint can
allow. This length is adjusted dynamically, depending on the current
number of successful transmissions as well as the number of lost
datagrams or retransmissions.

When the number of outstanding datagrams reaches the current window
length, the endpoint shall still accept send requests from its upper
layer, but shall transmit no more datagrams until some or all of the
outstanding datagrams are acknowledged. The endpoint may also elect
to queue only a specified number of datagrams when the window is full.
When this maximal number of queued datagrams is reached the endpoint
shall return an error to its upper layer.

Moreover, when the window length is reached, the next send request
from the upper layer will trigger a Window-up message to be
transmitted. Upon receiving this Window-up the receiver must respond
with a Window-up Ack, as illustrated by the following example
(assuming current window length is 3):

Endpoint A                                      Endpoint Z
{App sends 3 messages, strm 0}
U-Data
[Seen=5,Send=7,Strm=0,Seq=3]--------> (Start T2-receive timer)
(Start T3-send timer)

U-Data
[Seen=5,Send=8,Strm=0,Seq=4]-------->

U-Data
[Seen=5,Send=9,Strm=0,Seq=5]-------->

{App sends a new message, strm 1}
(queue new message and send Win-up)
Window-up            ---------------> (cancel T2-recv timer)
                                /---- Window-up Ack
                               /      [Seg=0,Seen=9]
(Cancel T3-send timer) <--------/
U-Data
[Seen=5,Send=10,Strm=1,Seq=2]-------> (Start T2-receive timer)
(Start T3-send timer)

In the above example, after the transmission of the first three
datagrams, "A" reached its window length. The next message from the
user triggered a Window-up that was sent to "Z". The Window-up shall
contain no user data. In response, "Z" cancelled timer T2 and
immediately sent a Window-up Ack. The arrival of this Window-up Ack
effectively resolved all the outstanding datagrams at "A", thus
allowing "A" to send out the next datagram.
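
The sending behaviour described in section 5.1 can be sketched in
Python as follows. The class, its method names, and the tuple
messages are hypothetical. The sketch assumes that only the first
send request made while the window is full triggers a Window-up, and
it leaves out timers and window length adjustment.

```python
from collections import deque


class WindowedSender:
    """Gate sends on the window length (illustrative sketch only)."""

    def __init__(self, window_length=2, max_queued=16):
        self.window_length = window_length
        self.max_queued = max_queued  # hypothetical queue limit
        self.outstanding = 0          # sent but not yet acknowledged
        self.queue = deque()

    def send(self, datagram):
        """Return the messages put on the wire for this send request."""
        if self.outstanding < self.window_length:
            self.outstanding += 1
            return [("U-Data", datagram)]
        if len(self.queue) >= self.max_queued:
            raise BufferError("send queue full")  # error to upper layer
        first_queued = not self.queue
        self.queue.append(datagram)
        # Assumption: one Window-up per window-full episode.
        return [("Window-up", None)] if first_queued else []

    def on_ack(self, acked_count):
        """Release acknowledged datagrams and drain the queue."""
        self.outstanding = max(0, self.outstanding - acked_count)
        sent = []
        while self.queue and self.outstanding < self.window_length:
            self.outstanding += 1
            sent.append(("U-Data", self.queue.popleft()))
        return sent
```

With a window length of 3, the fourth send request is queued and
produces a Window-up, and the queued datagram goes out once the
Window-up Ack has acknowledged the first three.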

5.1.1 Window Length Adjustment

The window length shall be initially set to 2, and shall then be
dynamically adjusted based on datagram loss and acknowledgment.

If the current window length is less than or equal to 4, every time
the number of consecutive outstanding datagrams acknowledged in a
single ack is equal to or greater than half the current window length,
the sender's window length shall be raised by 1, until it reaches
'Max.Outstanding.dg' (which should be a user configurable parameter).

If the current window length is greater than 4, every time the number
of consecutive outstanding datagrams acknowledged in a single ack is
equal to or greater than 4, the sender's window length shall be raised
by 1, until it reaches 'Max.Outstanding.dg'.

In the following circumstances, the sender's window length shall be
decreased. However, when the window length reaches 2 it shall not be
decreased any further.

Firstly, if the sender receives a stand-alone Extended Data Ack with a
Seen TSN that equals the highest consecutive acked TSN, the sender
should consider this as a duplicate ack and lower its window size
by 4.

The peer endpoint may report reception gaps which may correspond to
multiple datagram losses (indicated by an Extended Data Ack or
Window-up Ack). If between 1 and 3 datagrams are lost, the window
length shall be decreased by 1. If between 4 and 7 datagrams are lost,
the window length shall be decreased by 2. If 8 or more datagrams are
lost, the window length shall be decreased by 4.

Any time a Window-up Ack is received by an endpoint, as a response to
a previous Window-up it sent, the endpoint shall decrease its window
by 1 if the window has not advanced from the time at which the
Window-up was sent out.
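
The adjustment rules of this section, including the halving on a
timeout-forced retransmission, can be sketched in Python as shown
below. The function names are hypothetical. Rounding the halved
window down, and treating "half the window length" as a comparison
of twice the acked count against the window, are assumptions.

```python
MIN_WINDOW = 2  # the window length is never decreased below 2


def raise_on_ack(window, acked, max_outstanding):
    """Window after one ack covering `acked` consecutive datagrams."""
    if window <= 4:
        threshold_met = 2 * acked >= window  # at least half the window
    else:
        threshold_met = acked >= 4
    if threshold_met and window < max_outstanding:
        return window + 1
    return window


def _lower(window, amount):
    return max(MIN_WINDOW, window - amount)


def on_duplicate_ack(window):
    return _lower(window, 4)


def on_loss_report(window, lost):
    """Window after a gap report covering `lost` datagrams."""
    if lost >= 8:
        return _lower(window, 4)
    if lost >= 4:
        return _lower(window, 2)
    if lost >= 1:
        return _lower(window, 1)
    return window


def on_stalled_window_up_ack(window):
    """Window-up Ack received while the window has not advanced."""
    return _lower(window, 1)


def on_timeout_retransmission(window):
    # Assumption: integer halving, rounded down, with a floor of 2.
    return max(MIN_WINDOW, window // 2)
```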

Also, if a timeout forces a retransmission the sender's window length
shall be reduced to half of its current value.

The following table summarizes these rules:

-----------------------------------------------------------------
duplicate ack received by sender  | Adjust down by 4
-----------------------------------------------------------------
8 or more datagrams lost          | Adjust down by 4
-----------------------------------------------------------------
4 to 7 datagrams lost             | Adjust down by 2
-----------------------------------------------------------------
1 to 3 datagrams lost             | Adjust down by 1
-----------------------------------------------------------------
Timeout forced retransmission     | Adjust down by 1/2 of the
                                  | current window.
-----------------------------------------------------------------
Window-up Ack received and the    | Adjust down by 1
window has not advanced.          |
-----------------------------------------------------------------
4 or more consecutive datagrams   | Adjust up by 1
acknowledged (window length > 4)  |
-----------------------------------------------------------------
1/2 Window length or more acked   | Adjust up by 1
(window length <=4)               |
-----------------------------------------------------------------

5.2 Send Timer Back-off at Re-transmission

Whenever a T3-send timer expires, the endpoint shall re-transmit the
un-acked datagram that has the highest TSN Send value and re-start the
T3-send timer, unless:

A) If the current window length is reached, a Window-up message shall
   be sent out (see section 5.1), or

B) If the current window length is not reached and there is still user
   data pending for transmission, a new datagram with user data shall
   be sent out and T3-send timer shall be restarted.
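
The choice made when T3-send expires can be sketched in Python as a
simple decision function. The function name, its parameters and the
returned action strings are hypothetical; the back-off of the timer
value itself is not modelled here.

```python
def on_t3_send_expiry(outstanding, window_len, pending_user_data):
    """Pick the action to take when the T3-send timer expires."""
    if outstanding >= window_len:
        # Case A: the window is full, so probe the peer with a Window-up.
        return "send Window-up"
    if pending_user_data:
        # Case B: the window has room and the upper layer has data waiting.
        return "send new U-Data and restart T3-send"
    # Default: re-transmit the un-acked datagram with the highest TSN Send.
    return "retransmit highest TSN Send and restart T3-send"
```

The order of the tests matters: a re-transmission happens only when
the window is not full and no new user data is waiting.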

When the T3-send timer is re-started at a retransmission, the
following back-off rules shall be applied to determine the value of
the new timer:

   1. TL3-value = TL3-value * 2

   2. T3-send = TL3-value + network-RTT

where, TL3-value is the protocol variable keeping the previous and
current T3-send timer base value, and the network-RTT is the current
RTT measurement of the destination IP address the re-transmission is
to be sent to.

Note: the T3-send timer base value shall be restored to its default
value 'TL3-default' when a datagram is received from the peer
endpoint.

The total number of consecutive re-transmissions to all destination IP
addresses in an association shall be recorded. If this value exceeds
the limit defined in 'Max.Retransmit', the sending endpoint shall
consider the peer endpoint unreachable and shall stop transmitting any
more data to it. The sending endpoint MAY report the failure to the
upper layer, including all datagrams in its out-bound buffer which
have not been acknowledged. Whenever a datagram is received from a
peer endpoint the total number of retransmissions shall be cleared.

6. Network Management

6.1 Failure Detection in Redundant Networks

When the peer endpoint is multi-homed, the re-transmission of a
datagram should be attempted to the destination IP address specified
in the MDTP protocol variable 'last.good.intf'. The value of
'last.good.intf' is always updated to point to the source IP address
from which the last datagram from the peer endpoint arrived.

The number of consecutive T3-send timeout events is also recorded in
a variable 'retran.count' for each destination IP address. This count
should be incremented when a T3-send time-out event occurs for that
destination IP address.
Every time a datagram is received from a peer
endpoint, the receiving endpoint shall reset to 0 the 'retran.count'
corresponding to the source IP address.

If the value in 'retran.count' exceeds half of the value of the
protocol parameter 'Max.Retransmit', the destination IP address shall
be reported to the upper layer as out-of-service and shall be removed
from eligibility for rotation. When re-transmitting a datagram, the
re-transmission should use 'last.good.intf' as the preferred
destination IP address to which to re-transmit, unless 'last.good.intf'
points to the destination IP address on which the original T3-send
time-out event occurred.

In the event that a datagram is received from an IP address that has
been reported as out-of-service, the 'retran.count' shall be cleared
as specified above, the destination IP address shall be reported as
in-service to the upper layer, and the destination IP address shall be
considered valid for rotation.

6.2 RTT Measurement

On occasion an endpoint of an association may need to perform an RTT
measurement of the network (or one of the redundant networks) between
itself and its peer.

RTT-request and RTT-ack messages shall be used to perform the RTT
measurement. In the messages, two 32-bit opaque integers are used
in the control parameter field to carry the time value.

At the request of its upper layer, an endpoint shall initiate an RTT
measurement by sending an RTT-request (to a specific network if
redundant networks exist). The sender shall also place in Time value 1
and Time value 2 the value of the current time mark.

Upon the reception of this RTT-request message, the recipient shall
immediately respond with an RTT-ack to the sender (over the same
network on which the RTT-request arrives if the recipient is
multi-homed), with the time mark carried in the original RTT-request
copied into its own Time value fields.

Upon the reception of this reply, the sender shall use the time mark
in the reply RTT-ack to calculate the RTT (to the specific destination
IP address if redundant networks exist) as defined in [4].

Endpoint A                                   Endpoint Z
{RTT - Request Now=x.y}
RTT-request
[Time-value1=x,
 Time-value2=y,
 Seen=81] ----------------------->
                                /-------     RTT-ack
                               /             [Time-value1=x,
                              /               Time-value2=y,
                             /                Seen=3]
(Endpoint A uses <----------/
 x.y to calculate RTT)

6.3 Network Heart Beat

At the request of its upper layer, an endpoint shall enable heart beat
to a specific peer with which it has an established association.

The RTT-request message defined in section 2.2 shall be used as
the heart beat while the RTT-ack shall be used as the heart beat
response.

After having heart beat enabled, the endpoint shall transmit a heart
beat to that specific peer and start a T5-heartBeat timer. The peer
shall immediately respond to the heart beat in the same manner as the
RTT measurement procedure described in section 6.2. This response, as
well as the new RTT measurement, shall be stored by the endpoint.

When the T5-heartBeat timer expires, the endpoint shall first check if
the previous heart beat has been responded to (on the same network it
was sent in the case of multi-homed hosts). If not, the destination IP
address to which the last heart beat was sent shall have the
'retran.count' incremented and checked following the rules described
in section 6.1.
Then, the endpoint shall send another heart beat and
re-start the T5-heartBeat timer.

In the case where one or both endpoints are multi-homed, the sending
of heart beats shall follow the network rotation rules outlined in
section 4.2.

If, before the expiration of T5-heartBeat timer, a datagram is
received by the endpoint, the T5-heartBeat timer shall be stopped and
restarted.

The suggested interval for T5-heartBeat timer is 4000 ms, and may be
dynamically adjusted by adding the current RTT measurement if it is
available.

7. Termination of Association

Before an endpoint terminates itself, it shall send an Abort message
to each of its peer endpoints in all existing associations. The Abort
shall be sent without requiring an acknowledgment from the peer
endpoint. However, the sender of the Abort message MUST fill in the
peer's Init-Tag.

When the peer endpoint receives the Abort, after verifying the Tag,
the peer shall remove the sender from its record, and optionally
report the termination of the sender to its upper layer. However if
the Tag sent with the Abort message is incorrect, the peer must
silently discard the Abort message.

The following shows an example of the termination of Endpoint A:

Endpoint A
{App indicates termination}
Abort
[Tag-X] --------------------------------> to Endpoint X

Abort
[Tag-Y] --------------------------------> to Endpoint Y

Abort
[Tag-Z] --------------------------------> to Endpoint Z

7.1 Graceful Shutdown of an Association

An endpoint in an association may decide to gracefully shut down the
association without completely closing it down. With graceful
shutdown, both endpoints shall remove any record and pending datagrams
associated with the association.
Further communications between the
two endpoints can be resumed by going through a re-initialization
procedure (see section 3.5.4).

A Graceful Shutdown message shall be sent to the peer endpoint of the
association, and the peer shall send back an acknowledgment. Note
that it shall be the responsibility of the endpoint that sends the
Graceful Shutdown message to ensure that all the outstanding datagrams
from its side have been resolved before it initiates the graceful
shutdown procedure.

In the Graceful Shutdown message, the sender shall indicate the
highest TSN Seen it has received from the peer, as well as the
Init-Tag of the peer.

Upon the reception of the Graceful Shutdown, the peer shall first
verify that the Tag value contained in the Graceful Shutdown message
is valid. If the Tag is invalid, the message must be silently
discarded.

The peer then shall verify, by checking the Seen numbers from the
Graceful Shutdown message, that all the out-bound datagrams have
reached the destination. Otherwise, the peer shall re-transmit all
lost datagrams.

After sending the Graceful Shutdown, if the endpoint receives any new
user datagram it shall immediately respond with an Extended Data Ack
and re-start its T3-send timer.

The peer shall send a Graceful Shutdown Ack when all the outstanding
datagrams are acknowledged, then start a T4-shutdown timer. The
endpoint, after receiving the Graceful Shutdown Ack, must also
validate the Tag value contained in the message. If it does not match
the Tag value that unlocked the association, the message should be
silently discarded.
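
The peer-side handling of a received Graceful Shutdown can be sketched
in Python as follows. The function name, its parameters and the action
strings are hypothetical. The sketch also assumes that Seen is a
cumulative value that can be compared numerically with the TSNs the
peer has sent, and it ignores sequence number wrap-around.

```python
def on_graceful_shutdown(msg_tag, msg_seen, my_tag, unacked_tsns):
    """Return the actions a peer takes on receiving a Graceful Shutdown.

    msg_tag      -- Tag carried in the message (should be this peer's Init-Tag)
    msg_seen     -- highest TSN Seen reported by the sender of the message
    my_tag       -- this peer's own Init-Tag
    unacked_tsns -- TSNs this peer has sent that are not yet acknowledged
    """
    if msg_tag != my_tag:
        return []  # invalid Tag: silently discard the message
    # Anything above the reported Seen value has not reached the sender.
    lost = sorted(t for t in unacked_tsns if t > msg_seen)
    if lost:
        return [f"retransmit TSN {t}" for t in lost]
    return ["send Graceful Shutdown Ack", "start T4-shutdown timer"]
```

An empty action list stands for a silent discard. The Graceful
Shutdown Ack is produced only once nothing sent by the peer remains
above the reported Seen value.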

The following sequence shows an example of Graceful Shutdown:

Endpoint A                                   Endpoint X
{App indicates graceful shutdown}
Graceful Shutdown
[Tag-X, Seen=10] --------------------->      (all datagrams resolved)
(start T3-send timer)            /--------   Graceful Shutdown Ack
                                /            [Tag-A]
                               /             (start T4-shutdown timer)
(cancel T3-send timer) <------/              ...
(clean-up the association)                   (T4-shutdown expires)
                                             (clean-up the association)

Both endpoints shall reject any new data request from their upper layers
while the graceful shutdown procedure is in progress.

8. Stream Operations

8.1 Stream Initiation

An MDTP association between the two endpoints must be established
before any stream operation.

Except for the global stream (i.e., stream 0) and the pre-opened
streams (see section 3.1.1), a stream shall be initiated (opened) by
the sender before datagrams can be passed in that stream. When a
stream is no longer used, it shall be terminated (closed) by the
endpoint that opened the stream. Moreover, both sides of the
association shall be able to initiate or terminate streams
independently. Streams are unidirectional.

The sender initiates a stream by sending a Stream Initiation. In
addition to specifying the Stream Identifier, the sender must set the
Init-Tag field of the Stream Initiation to the Tag value of the peer
endpoint.

The sender shall also attach the stream-specific data if any (usually
provided by the upper layer), with the Stream Initiation. Otherwise,
the Size of Stream Info field shall be set to 0x0.

After sending out the Stream Initiation, the sender shall start a
T6-streamInit timer. If this timer expires, the sender shall
re-transmit the Stream Initiation. The value and adjustment rules of
the T6-streamInit timer are the same as those of the T3-send timer
(see sections 4.1.1 and 5.2).

Upon the reception of the Stream Initiation, the peer must first
verify that the correct Tag value is carried in the Init-Tag field of
the Stream Initiation. The peer must silently discard the Stream
Initiation if the tag value is found incorrect.

Then, the peer shall respond immediately with either a Stream
Initiation Ack if it chooses to establish the requested stream, or a
Stream Initiation Nack if it chooses to reject the request for reasons
such as lack of resources.

The arrival of the Stream Initiation Ack or Nack shall cause the
sender to cancel its T6-streamInit timer.

The following example shows the opening of stream 5 by "A":

Endpoint A                                      Endpoint Z
{App Initiates stream 5}
Stream Initiation
[Tag=Tag-Z,Strm=5] -------------\
(Start T6-streamInit timer)      \
                                  \------>
(Cancel T6-streamInit timer) <----------------- Stream Initiation Ack
                                                [Strm=5]

8.2 Stream Termination

An endpoint shall be allowed to terminate any one of the streams it
opened, by sending a Stream Termination to its peer. However,
stream 0 is not allowed to be terminated, and if an endpoint receives a
termination message for stream 0 it must silently discard the message.

The same Tag verification process and timer rules used for stream
initiation shall be applied to stream termination.

The peer shall immediately send a Stream Termination Ack in response
to the Stream Termination.

The following example shows the termination of stream 5 by "A":

Endpoint A                                       Endpoint Z
{App closes stream 5}
Stream Termination
[Tag=Tag-Z,Strm=5] ---------------\
(Start T6-streamInit timer)        \
                                    \------>
(Cancel T6-streamInit timer) <------------------ Stream Termination Ack
                                                 [Strm=5]

Received datagrams associated with a terminated stream shall be
silently discarded.
It is up to the endpoint to ensure that all
outstanding user datagrams in the stream are acknowledged before the
stream termination.

8.3 Other Issues with Stream Operations

When an association is re-initialized (see section 3.5.4), all existing
streams within that association will be automatically terminated.

The receiver shall silently discard any datagrams associated with a
stream which has not yet been opened or has already been terminated.

9. Interface with Upper Layer

The upper layer protocols (ULP) shall request services by passing
primitives to MDTP and shall receive notifications from MDTP for
various events.

The primitives and notifications described in this section should be
used as a guideline for implementing MDTP.

A) Init.MDTP primitive

This primitive allows MDTP to initialize its internal data structures
and allocate necessary resources for setting up its operation
environment. Note that once MDTP is initialized, ULP can communicate
directly with any other endpoints without re-invoking this primitive.

Mandatory attributes:

None.

Optional attributes:

The following types of attributes may be passed along with
the primitive:

o Timer selection and its operation syntax -- to indicate to MDTP
  an alternative timer the MDTP should use for its operation.
o Initial MDTP operation mode;
o IP port number, if ULP wants it to be specified;

B) Init.Association

This primitive allows the upper layer to initiate an association to a
specific peer endpoint. The peer endpoint shall be specified by one of
the IP address/port pairs which define the endpoint (see section 1.1).

Mandatory attributes:

o associationID - specified as one of the IP address/port pairs of
  the peer endpoint with which the association is to be established.

Optional attributes:

o eligibleNetList - a list of destination IP address/port pairs that
  the peer endpoint is allowed to use in its network rotation. By
  default, all destination IP address/port pairs on the peer are
  available.

C) Term.Association

Terminating an association.

Mandatory attributes:

o associationID - specified as one of the IP address/port pairs of
  the peer endpoint with which the association is to be terminated.

Optional attributes:

None.

D) Send.Data primitive

This is the main method to send datagrams via MDTP.

Mandatory attributes:

o data - This is the payload ULP wants to transmit;
o size - The size of the payload in number of octets;
o associationID - One of the IP address/port pairs of the peer
  endpoint. Note that the actual destination address sent to will be
  determined by MDTP due to the network rotation, unless the current
  mode prohibits MDTP network rotation; in such a case the datagram
  will be sent to the IP address/port specified by associationID.

Optional attributes:

o mode-flags - This indicates a new MDTP operation mode, taking effect
  immediately including the current datagram send;

o context - optional information that will be carried in the
  Send.Failure notification to the ULP if the transportation of
  this datagram fails.

o streamID - to indicate which stream to send the data on. By
  default, the global stream will be used.

E) Receive.Data primitive

This primitive shall return the first datagram in the MDTP in-queue to
ULP, if there is one available. It may, depending on the specific
implementation, also return other information such as the sender's
address, whether there are more datagrams available for retrieval,
etc. The behavior is undefined if no datagram is available when this
primitive is invoked.

Mandatory attributes:

o buffer - the memory location indicated by the ULP to store the
  received datagram.

Optional attributes:

o associationID - the storage to be filled with one of the IP
  address/port pairs of the peer endpoint that sent this datagram.

F) Data.Arrive notification

MDTP shall invoke this notification on the ULP when a datagram is
successfully received and ready for retrieval.

G) Send.Failure notification

If a datagram cannot be delivered MDTP shall invoke this notification
on the ULP.

The following may optionally be passed with the notification:

o data - the location ULP can find the un-delivered datagram.
o context - optional information associated with this datagram (see
  D).
o associationID - One of the IP address/port pairs of the peer this
  datagram was attempted to be sent to.

H) Network.Status.Change notification

When an endpoint-id is marked down (e.g., when MDTP detects a failure),
or marked up (e.g., when MDTP detects a recovery), MDTP shall
invoke this notification on the ULP.

The following shall be passed with the notification:

o endpoint-id - This indicates the IP address/port of the
  peer endpoint affected by the change;
o new-status - This indicates the new status.

I) Communication.Up notification

This notification is used when MDTP becomes ready to send or receive
datagrams, or when a lost communication to an endpoint is restored.

The following shall be passed with the notification:

o status - This indicates what type of event has occurred;
o associationID - An IP address/port to identify the peer endpoint;

J) Communication.Lost notification

When MDTP loses communication to an endpoint completely or detects
that the endpoint has performed an abort or graceful shutdown
operation, it shall invoke this notification on the ULP.

The following shall be passed with the notification:

o status - This indicates what type of event has occurred;
o associationID - An IP address/port number to identify the peer
  endpoint;

The following may be optionally passed with the notification:

o packets-enqueue - The number and location of un-sent datagrams
  still held by MDTP;
o last-acked - the sequence number last acked by that peer endpoint;
o last-sent - the sequence number last sent to that peer endpoint;

K) Change.Network.Rotation primitive

When the upper layer wants to inform MDTP to make a specific network
eligible or ineligible for network rotation, the upper layer will send
this primitive to MDTP.

Mandatory attributes:

o action - This indicates if the network is to be made eligible or
  ineligible for network rotation.
o network-id - This is the IP address/port of the peer endpoint to
  be added or removed from network rotation consideration.

L) Open.Stream primitive

This should be used by the upper layer to open a new outbound stream.

Mandatory attributes:

o associationID - One of the IP address/port pairs identifying the
  peer endpoint of the association to which the stream is to be
  opened. An association must exist at the time the stream is opened.

Optional attributes:

o streamInfo - The upper layer should use this field to pass any
  stream-specific data to the other endpoint of the association.

M) Open.Stream.Succeed notification

This should be used to report the successful opening of a new outbound
stream.

Mandatory attributes:

o associationID - One of the IP address/port pairs identifying the
  peer endpoint of the association to which the outbound stream has
  been successfully opened.

o streamID - The stream number of the outbound stream assigned by
  MDTP.

Optional attributes:

o streamInfo - The streamInfo used for opening this outbound stream.

N) Open.Stream.Rejected notification

This reports to the ULP that the open of an outbound stream is
rejected by the peer endpoint.

Mandatory attributes:

o associationID - One of the IP address/port pairs identifying the
  peer endpoint of the association by which the stream open is
  rejected.

Optional attributes:

o streamInfo - The info used in the failed attempt of the stream
  open.

O) Close.Stream notification

This should be used to report the successful closing of an outbound
stream.

Mandatory attributes:

o associationID - One of the IP address/port pairs identifying the
  peer endpoint of the association with which the stream is closed.
o streamID - The stream number of the closed stream.

P) Peer.Open.Stream notification

This notifies the ULP that a new inbound stream is opened by a peer
endpoint.

Mandatory attributes:

o associationID - One of the IP address/port pairs identifying the
  peer endpoint of the association to which the stream is opened.
o streamID - The stream number of the new inbound stream assigned
  by the peer.

Optional attributes:

o streamInfo - The stream-specific information passed from the peer
  endpoint.

Q) Peer.Close.Stream notification

This reports to the ULP the closing by a remote peer of an inbound
stream.

Mandatory attributes:

o associationID - One of the IP address/port pairs identifying the
  peer endpoint of the association by which the inbound stream is
  closed.
o streamID - The stream number of the closed inbound stream.

R) Close.Stream primitive

This shall be used by the upper layer to close an outbound stream.

Mandatory attributes:

o associationID - One of the IP address/port pairs identifying the
  peer endpoint of the association to which the outbound stream is to
  be closed.
o streamID - The stream identifier to identify the stream to be
  closed (this should be the number returned by the Open.Stream
  primitive on this stream).

10. Suggested MDTP Timer and Protocol Parameter Values

The following are suggested timer values for MDTP:

   T1-init Timer       - 160 ms
   T2-receive Timer    - 20 ms
   T3-send Timer       - 160 ms (TL3-default)
   T4-shutdown Timer   - 300 ms
   T5-heartBeat timer  - 4000 ms (TL5-default)
   T6-streamInit timer - same as T3-send

The following protocol parameters are recommended:

   Max.Outstanding.dg  - 20 messages
   Max.Retransmit      - 10 attempts
   Max.Init.Retransmit - 8 attempts

11. Abbreviations

MDTP - Multi-network Datagram Transmission Protocol

NAT  - Network Address Translation

RTT  - Round Trip Time

TSN  - Transport Sequence Number

ULP  - Upper Layer Protocol

12. Acknowledgments

The authors wish to thank Brian Wyld, A. Sankar, Henry Houh, Gary
Lehecka, Lyndon Ong, Greg Sidebottom, Lixia Zhang, Jarno Rajahalme,
Heinz Prantner, Matt Holdrege, Kelvin Porter, Richard Band, and many
others for their invaluable comments.

13. Authors' Addresses

Randall R. Stewart                  Tel: +1-847-632-7438
Cellular Infrastructure Group       EMail: stewrtrs@cig.mot.com
Motorola, Inc.
1475 W. Shure Drive, #2C-6
Arlington Heights, IL 60004
USA

Qiaobing Xie                        Tel: +1-847-632-3028
Cellular Infrastructure Group       EMail: xieqb@cig.mot.com
Motorola, Inc.
1501 W. Shure Drive, #2309
Arlington Heights, IL 60004
USA

Ken Morneau                         Tel: +1-703-484-3323
Cisco Systems Inc.                  EMail: kmorneau@cisco.com
13615 Dulles Technology Drive
Herndon, VA. 20171

Chip Sharp                          Tel: +1-919-851-2085
Cisco Systems Inc.                  EMail: chsharp@cisco.com
7025 Kit Creek Road
Research Triangle Park, NC 27709

Hanns Juergen Schwarzbauer          Tel: +49-89-722-24236
SIEMENS AG
Hofmannstr. 51
81359 Munich, Germany
EMail: HannsJuergen.Schwarzbauer@icn.siemens.de

Tom Taylor                          Tel: +1-613-736-0961
Nortel Networks                     EMail: taylor@nortelnetworks.com
1852 Lorraine Ave.
Ottawa Ontario Canada
K1H6Z8

Ian Rytina                          Tel:
Ericsson Australia                  EMail: ian.rytina@ericsson.com
37/360 Elizabeth Street
Melbourne, Victoria 3000, Australia

14. References

[1] Postel, J. (ed.), "Internet Protocol - DARPA Internet Program
    Protocol Specification", RFC 791, USC/Information Sciences
    Institute, September 1981.

[2] Postel, J., "User Datagram Protocol", RFC 768, USC/Information
    Sciences Institute, August 1980.

[3] Postel, J. (ed.), "Transmission Control Protocol", RFC 793, USC/
    Information Sciences Institute, September 1981.

[4] Jacobson V., "Congestion Avoidance and Control", Proceedings of
    SIGCOMM '88, pp 314-329, August 1988.

[5] Seth, T., etc. "Performance Requirements for Signaling in Internet
    Telephony", Internet-Draft , May 1999.

[6] Rytina, I., "Framework for Generic Common Signaling Transport
    Protocol", <draft-rytina-sigtran-generic-framework-00.txt>,
    Feb. 1999.

[7] Ashworth, J., "The Naming of Hosts", RFC 2100, April 1997.

[8] Braden, R., "Requirements for Internet hosts - Application and
    Support", RFC 1122, October 1989.

[9] Eastlake 3rd, D., Crocker, S., and Schiller, J., "Randomness
    Recommendations for Security", RFC 1750, December 1994.

[10] Bellovin, S., "Defending Against Sequence Number Attacks",
     RFC 1948, May 1996.

[11] ITU-T Recommendation Q.703 "Q.703 - Signaling link", July 1996.

This Internet Draft expires in 6 months from June 1999.