idnits 2.17.1 draft-ietf-rddp-rdmap-06.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** It looks like you're using RFC 3978 boilerplate. You should update this to the boilerplate described in the IETF Trust License Policy document (see https://trustee.ietf.org/license-info), which is required now. -- Found old boilerplate from RFC 3978, Section 5.1 on line 22. -- Found old boilerplate from RFC 3978, Section 5.5 on line 2915. -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 2885. -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 2892. -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 2898. ** This document has an original RFC 3978 Section 5.4 Copyright Line, instead of the newer IETF Trust Copyright according to RFC 4748. ** This document has an original RFC 3978 Section 5.5 Disclaimer, instead of the newer disclaimer which includes the IETF Trust according to RFC 4748. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- == No 'Intended status' indicated for this document; assuming Proposed Standard Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack an Introduction section. (A line matching the expected section header was found, but with an unexpected indentation: ' scope of this specification. The mechanism presumably entails' ) ** The document seems to lack an Authors' Addresses Section. ** The abstract seems to contain references ([DDP], [RFC2119]), which it shouldn't. Please replace those with straight textual mentions of the documents in question. 
Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not match the current year == The document seems to lack the recommended RFC 2119 boilerplate, even if it appears to use RFC 2119 keywords -- however, there's a paragraph with a matching beginning. Boilerplate error? (The document does seem to have the reference to RFC 2119 which the ID-Checklist requires). -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (June 1, 2006) is 6532 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Unused Reference: 'VERBS' is defined on line 2405, but no explicit reference was found in the text ** Obsolete normative reference: RFC 2406 (Obsoleted by RFC 4303, RFC 4305) ** Obsolete normative reference: RFC 2407 (Obsoleted by RFC 4306) ** Obsolete normative reference: RFC 2409 (Obsoleted by RFC 4306) -- No information found for draft-hilland-iwarp-verbs-v1 - is the name correct? -- Possible downref: Normative reference to a draft: ref. 'VERBS' == Outdated reference: A later version (-07) exists of draft-ietf-rddp-ddp-05 == Outdated reference: A later version (-08) exists of draft-ietf-rddp-mpa-04 ** Obsolete normative reference: RFC 2960 (ref. 
'SCTP') (Obsoleted by RFC 4960) ** Obsolete normative reference: RFC 793 (ref. 'TCP') (Obsoleted by RFC 9293) == Outdated reference: A later version (-10) exists of draft-ietf-rddp-security-09 -- Obsolete informational reference (is this intentional?): RFC 2401 (Obsoleted by RFC 4301) -- Obsolete informational reference (is this intentional?): RFC 2246 (Obsoleted by RFC 4346) Summary: 11 errors (**), 0 flaws (~~), 7 warnings (==), 11 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Remote Direct Data Placement Work Group R. Recio 3 INTERNET DRAFT IBM Corporation 4 draft-ietf-rddp-rdmap-06.txt P. Culley 5 Hewlett-Packard Company 6 D. Garcia 7 Hewlett-Packard Company 8 J. Hilland 9 Hewlett-Packard Company 10 B. Metzler 11 IBM Corporation 13 Expires: January, 2007 June 1, 2006 15 A Remote Direct Memory Access Protocol Specification 17 Status of this Memo 19 By submitting this Internet-Draft, each author represents that any 20 applicable patent or other IPR claims of which he or she is aware 21 have been or will be disclosed, and any of which he or she becomes 22 aware will be disclosed, in accordance with Section 6 of BCP 79. 24 Internet-Drafts are working documents of the Internet Engineering 25 Task Force (IETF), its areas, and its working groups. Note that 26 other groups may also distribute working documents as Internet- 27 Drafts. 29 Internet-Drafts are draft documents valid for a maximum of six 30 months and may be updated, replaced, or obsoleted by other 31 documents at any time. It is inappropriate to use Internet-Drafts 32 as reference material or to cite them other than as "work in 33 progress." 35 The list of current Internet-Drafts can be accessed at 36 http://www.ietf.org/1id-abstracts.html The list of Internet-Draft 37 Shadow Directories can be accessed at 38 http://www.ietf.org/shadow.html. 
40 Abstract 42 This document defines a Remote Direct Memory Access Protocol 43 (RDMAP) that operates over the Direct Data Placement Protocol (DDP 44 protocol). RDMAP provides read and write services directly to 45 applications and enables data to be transferred directly into Upper 46 Layer Protocol (ULP) Buffers without intermediate data copies. It 47 also enables a kernel bypass implementation. 49 Table of Contents 51 1 Introduction...............................................6 52 1.1 Architectural Goals........................................6 53 1.2 Protocol Overview..........................................7 54 1.3 RDMAP Layering............................................10 55 1.4 Specification Changes from the Last Version...............11 56 2 Glossary..................................................14 57 2.1 General...................................................14 58 2.2 LLP.......................................................16 59 2.3 Direct Data Placement (DDP)...............................17 60 2.4 Remote Direct Memory Access (RDMA)........................19 61 3 ULP and Transport Attributes..............................22 62 3.1 Transport Requirements & Assumptions......................22 63 3.2 RDMAP Interactions with the ULP...........................23 64 4 Header Format.............................................27 65 4.1 RDMAP Control and Invalidate STag Field...................27 66 4.2 RDMA Message Definitions..................................30 67 4.3 RDMA Write Header.........................................31 68 4.4 RDMA Read Request Header..................................32 69 4.5 RDMA Read Response Header.................................34 70 4.6 Send Header and Send with Solicited Event Header..........34 71 4.7 Send with Invalidate Header and Send with SE and Invalidate 72 Header..........................................................34 73 4.8 Terminate Header..........................................34 74 5 Data 
Transfer.............................................41 75 5.1 RDMA Write Message........................................41 76 5.2 RDMA Read Operation.......................................42 77 5.2.1 RDMA Read Request Message................................42 78 5.2.2 RDMA Read Response Message...............................43 79 5.3 Send Message Type.........................................44 80 5.4 Terminate Message.........................................46 81 5.5 Ordering and Completions..................................47 82 6 RDMAP Stream Management...................................51 83 6.1 Stream Initialization.....................................51 84 6.2 Stream Teardown...........................................52 85 6.2.1 RDMAP Abortive Termination...............................52 86 7 RDMAP Error Management....................................54 87 7.1 RDMAP Error Surfacing.....................................54 88 7.2 Errors Detected at the Remote Peer on Incoming RDMA Messages 89 55 90 8 Security..................................................57 91 8.1 Summary of RDMAP specific Security Requirements...........57 92 8.1.1 RDMAP (RNIC) Requirements................................57 93 8.1.2 Privileged Resource Manager Requirements.................59 94 8.2 Security Services for RDMAP...............................60 95 8.2.1 Available Security Services..............................60 96 8.2.2 Requirements for IPsec Services for RDMAP................61 97 9 IANA......................................................64 98 10 References................................................65 99 10.1 Normative References.....................................65 100 10.2 Informative References...................................65 101 11 Appendix..................................................67 102 11.1 DDP Segment Formats for RDMA Messages....................67 103 11.1.1 DDP Segment for RDMA Write.............................67 104 11.1.2 DDP Segment for RDMA 
Read Request......................67 105 11.1.3 DDP Segment for RDMA Read Response.....................69 106 11.1.4 DDP Segment for Send and Send with Solicited Event.....69 107 11.1.5 DDP Segment for Send with Invalidate and Send with SE and 108 Invalidate......................................................70 109 11.1.6 DDP Segment for Terminate..............................71 110 11.2 Ordering and Completion Table............................71 111 12 Author's Address..........................................75 112 13 Contributors..............................................76 113 14 Intellectual Property Statement...........................80 114 15 IPR Disclosure Acknowledgement 115 16 Full Copyright Statement..................................81 117 Table of Figures 119 Figure 1 RDMAP Layering.........................................10 120 Figure 2 Example of MPA, DDP, and RDMAP Header Alignment over TCP 11 121 Figure 3 DDP Control, RDMAP Control, and Invalidate STag Fields.28 122 Figure 4 RDMA Usage of DDP Fields...............................29 123 Figure 5 RDMA Message Definitions...............................31 124 Figure 6 RDMA Read Request Header Format........................32 125 Figure 7 Terminate Header Format................................35 126 Figure 8 Terminate Control Field................................35 127 Figure 9 Terminate Control Field Values.........................38 128 Figure 10 Error Type to RDMA Message Mapping....................40 129 Figure 11 RDMA Write, DDP Segment format........................67 130 Figure 12 RDMA Read Request, DDP Segment format.................68 131 Figure 13 RDMA Read Response, DDP Segment format................69 132 Figure 14 Send and Send with Solicited Event, DDP Segment format 70 133 Figure 15 Send with Invalidate and Send with SE and Invalidate, DDP 134 Segment.........................................................70 135 Figure 16 Terminate, DDP Segment 
format.........................71 136 Figure 17 Operation Ordering....................................74 138 1 Introduction 140 Today, communications over TCP/IP typically require copy 141 operations, which add latency and consume significant CPU and 142 memory resources. The Remote Direct Memory Access Protocol (RDMAP) 143 removes these data copy operations and reduces 144 latency by allowing a local application to read or write data in 145 a remote computer's memory with minimal demands on memory bus 146 bandwidth and CPU processing overhead, while preserving memory 147 protection semantics. 149 RDMAP is layered on top of Direct Data Placement (DDP) and uses the 150 two Buffer Models available from DDP. DDP-related terminology is 151 discussed in Section 2.3. Because RDMAP builds on DDP, the reader is 152 advised to become familiar with [DDP]. 154 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 155 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in 156 this document are to be interpreted as described in [RFC2119]. 158 1.1 Architectural Goals 160 RDMAP has been designed with the following high-level architectural 161 goals: 163 * Provide a data transfer operation that allows a Local Peer to 164 transfer up to 2^32 - 1 octets directly into a previously 165 advertised buffer (i.e., a Tagged Buffer) located at a Remote Peer 166 without requiring a copy operation. This is referred to as the 167 RDMA Write data transfer operation. 169 * Provide a data transfer operation that allows a Local Peer to 170 retrieve up to 2^32 - 1 octets directly from a previously 171 advertised buffer (i.e., a Tagged Buffer) located at a Remote Peer 172 without requiring a copy operation. This is referred to as the 173 RDMA Read data transfer operation. 175 * Provide a data transfer operation that allows a Local Peer to 176 send up to 2^32 - 1 octets directly into a buffer located at a 177 Remote Peer that has not been explicitly advertised. 
This is 178 referred to as the Send (Send with Invalidate, Send with 179 Solicited Event, and Send with Solicited Event and Invalidate) 180 data transfer operation. 182 * Enable the local ULP to use the Send Operation Type (includes 183 Send, Send with Invalidate, Send with Solicited Event, and Send 184 with Solicited Event and Invalidate) to signal to the remote ULP 185 the Completion of all previous Messages initiated by the local 186 ULP. 188 * Provide for all Operations on a single RDMAP Stream to be 189 reliably transmitted in the order that they were submitted. 191 * Provide RDMAP capabilities independently for each Stream when 192 the LLP supports multiple data Streams within an LLP connection. 194 1.2 Protocol Overview 196 RDMAP provides seven data transfer operations. Except for the RDMA 197 Read operation, each operation generates exactly one RDMA Message. 198 Following is a brief overview of the RDMA Operations and RDMA 199 Messages: 201 1. Send - A Send operation uses a Send Message to transfer data 202 from the Data Source into a buffer that has not been explicitly 203 Advertised by the Data Sink. The Send Message uses the DDP 204 Untagged Buffer Model to transfer the ULP Message into the Data 205 Sink's Untagged Buffer. 207 2. Send with Invalidate - A Send with Invalidate operation uses a 208 Send with Invalidate Message to transfer data from the Data 209 Source into a buffer that has not been explicitly Advertised by 210 the Data Sink. The Send with Invalidate Message includes all 211 functionality of the Send Message, with one addition: an STag 212 field is included in the Send with Invalidate Message, and after 213 the message has been Placed and Delivered at the Data Sink, the 214 remote peer's buffer identified by the STag can no longer be 215 accessed remotely until the remote peer's ULP re-enables access 216 and Advertises the buffer. 218 3. 
Send with Solicited Event (Send with SE) - A Send with 219 Solicited Event operation uses a Send with Solicited Event 220 Message to transfer data from the Data Source into an Untagged 221 Buffer at the Data Sink. The Send with Solicited Event Message 222 is similar to the Send Message, with one addition: when the 223 Send with Solicited Event Message has been Placed and 224 Delivered, an Event may be generated at the recipient, if the 225 recipient is configured to generate such an Event. 227 4. Send with Solicited Event and Invalidate (Send with SE and 228 Invalidate) - A Send with Solicited Event and Invalidate 229 operation uses a Send with Solicited Event and Invalidate 230 Message to transfer data from the Data Source into a buffer 231 that has not been explicitly Advertised by the Data Sink. The 232 Send with Solicited Event and Invalidate Message is similar to 233 the Send with Invalidate Message, with one addition: when the 234 Send with Solicited Event and Invalidate Message has been 235 Placed and Delivered, an Event may be generated at the 236 recipient, if the recipient is configured to generate such an 237 Event. 239 5. Remote Direct Memory Access Write - An RDMA Write operation 240 uses an RDMA Write Message to transfer data from the Data 241 Source to a previously advertised buffer at the Data Sink. 243 The ULP at the Remote Peer, which in this case is the Data 244 Sink, enables the Data Sink Tagged Buffer for access and 245 Advertises the buffer's size (length), location (Tagged 246 Offset), and Steering Tag (STag) to the Data Source through a 247 ULP specific mechanism. The ULP at the Local Peer, which in 248 this case is the Data Source, initiates the RDMA Write 249 operation. The RDMA Write Message uses the DDP Tagged Buffer 250 Model to transfer the ULP Message into the Data Sink's Tagged 251 Buffer. 
Note: the STag associated with the Tagged Buffer 252 remains valid until the ULP at the Remote Peer invalidates it 253 or the ULP at the Local Peer invalidates it through a Send with 254 Invalidate or Send with Solicited Event and Invalidate. 256 6. Remote Direct Memory Access Read - The RDMA Read operation 257 transfers data to a Tagged Buffer at the Local Peer, which in 258 this case is the Data Sink, from a Tagged Buffer at the Remote 259 Peer, which in this case is the Data Source. The ULP at the 260 Data Source enables the Data Source Tagged Buffer for access 261 and Advertises the buffer's size (length), location (Tagged 262 Offset), and Steering Tag (STag) to the Data Sink through a ULP 263 specific mechanism. The ULP at the Data Sink enables the Data 264 Sink Tagged Buffer for access and initiates the RDMA Read 265 operation. The RDMA Read operation consists of a single RDMA 266 Read Request Message and a single RDMA Read Response Message, 267 and the latter may be segmented into multiple DDP Segments. 269 The RDMA Read Request Message uses the DDP Untagged Buffer 270 Model to Deliver the STag, starting Tagged Offset and length 271 for both the Data Source and Data Sink Tagged Buffers to the 272 remote peer's RDMA Read Request Queue. 274 The RDMA Read Response Message uses the DDP Tagged Buffer Model 275 to Deliver the Data Source's Tagged Buffer to the Data Sink, 276 without any involvement from the ULP at the Data Source. 278 Note: the Data Source STag associated with the Tagged Buffer 279 remains valid until the ULP at the Data Source invalidates it 280 or the ULP at the Data Sink invalidates it through a Send with 281 Invalidate or Send with Solicited Event and Invalidate. The 282 Data Sink STag associated with the Tagged Buffer remains valid 283 until the ULP at the Data Sink invalidates it. 285 7. 
Terminate - A Terminate operation uses a Terminate Message to 286 transfer to the Remote Peer information associated with an 287 error that occurred at the Local Peer. The Terminate Message 288 uses the DDP Untagged Buffer Model to transfer the Message into 289 the Data Sink's Untagged Buffer. 291 1.3 RDMAP Layering 293 RDMAP is dependent on DDP, subject to the requirements defined in 294 section 3.1 Transport Requirements & Assumptions. Figure 1 RDMAP 295 Layering depicts the relationship between Upper Layer Protocols 296 (ULPs), RDMAP, DDP protocol, the framing layer, and the transport. 297 For LLP protocol definitions of each LLP, see [MPA], [TCP], and 298 [SCTP]. 300 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 301 | | 302 | Upper Layer Protocol (ULP) | 303 | | 304 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 305 | | 306 | RDMAP | 307 | | 308 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 309 | | 310 | DDP protocol | 311 | | 312 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 313 | | | 314 | MPA | | 315 | | | 316 +-+-+-+-+-+-+-+-+-+ SCTP | 317 | | | 318 | TCP | | 319 | | | 320 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 321 Figure 1 RDMAP Layering 323 If RDMAP is layered over DDP/MPA/TCP, then the respective headers 324 and ULP Payload are arranged as follows (Note: For clarity, MPA 325 header and CRC fields are included but MPA markers are not shown): 327 0 1 2 3 328 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 329 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 330 | | 331 // TCP Header // 332 | | 333 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 334 | MPA Header | | 335 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + 336 | | 337 // DDP Header // 338 | | 339 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 340 | | 341 // RDMA Header // 342 | | 343 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 344 | | 345 // ULP Payload // 346 | (shown with no pad bytes) | 347 
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 348 | MPA CRC | 349 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 350 Figure 2 Example of MPA, DDP, and RDMAP Header Alignment over TCP 352 1.4 Specification Changes from the Last Version 354 This section is to be removed before RFC publication. 356 The following major changes (vs typos) were made to the -05 357 version: 359 * To pass the IETF checklist tool, modified heading of Security 360 Section 8 to "Security" and added "Security Considerations" 361 below it. 363 * Added IANA Section 9 and, to pass the IETF checklist tool, added 364 "IANA Considerations" line below Section 9 header. 366 * Added Intellectual Property Statement Section 14 and IPR 367 Disclosure Acknowledgement Section 15. 369 * Added Disclaimer Section 16. 371 * Section 6.8 - Acknowledged that the Reserved field size for the 372 Terminate Message is 13 bits. The fix was made to the -04 373 version, but was not listed in this section. 375 * Rewrite of the "Security" section to refer to the Security document 376 rather than summarize it. 378 * Update to the "Contributors" section. 380 * Changed boilerplate reference from 3667 to 3979. 382 * Removed references to company names in the disclaimer section. 384 * Added "Key Words" Disclaimer to the Introduction. 386 The following major changes (vs typos) were made to the -04 387 version: 389 * Section 10 - Expanded IPsec requirements sentence in section 390 10.3.2 to say what is required in addition to cross-referencing 391 RFC 3723. 393 * Section 6.8 - Fixed text after Figure 9 to reflect the correct 394 size (13 bits) of the Reserved field in the Terminate Message. 396 The following major changes (vs typos) were made to the -03 397 version: 399 * Section 6.1 - Added normative text describing downward 400 compatibility with version 0. 402 * Section 6.8 - Changed the description of the reserved field size 403 to match the size in the figure, which is 13 bits. 
405 * Section 10 - Aligned security section closely to [RDMASEC] and 406 added normative text for security requirements. 408 The following major changes (vs typos) were made to the -02 409 version: 411 * Section 6.8 - Explicitly defined the bit numbers for the three 412 header control bits. 414 * Section 8.1 - Stated the typical Stream initialization to be: 415 RDMA mode is entered some time after the LLP Stream is 416 initialized. 418 * Section 10 - Update reference to security document. 420 * Section 10 - Fixed Send with Solicited Event and Invalidate 421 reference. 423 * Section 12.1 - MPA and DDP references were changed to reflect 424 the released specifications and accurate titles. 426 * Section 12.1 - Reference for RDMA Protocol Verbs was changed to 427 reflect the released specification and accurate title. 429 2 Glossary 431 2.1 General 433 Advertisement (Advertised, Advertise, Advertisements, Advertises) - 434 the act of informing a Remote Peer that a local RDMA Buffer is 435 available to it. A Node makes available an RDMA Buffer for 436 incoming RDMA Read or RDMA Write access by informing its 437 RDMA/DDP peer of the Tagged Buffer identifiers (STag, base 438 address, and buffer length). This Advertisement of Tagged 439 Buffer information is not defined by RDMA/DDP and is left to 440 the ULP. A typical method would be for the Local Peer to embed 441 the Tagged Buffer's Steering Tag, base address, and length in a 442 Send Message destined for the Remote Peer. 444 Completion - Refer to "RDMA Completion" in Section 2.4. 446 Completed - See "RDMA Completion" in Section 2.4. 448 Complete - See "RDMA Completion" in Section 2.4. 450 Completes - See "RDMA Completion" in Section 2.4. 452 Data Sink - The peer receiving a data payload. Note that the Data 453 Sink can be required to both send and receive RDMA/DDP Messages 454 to transfer a data payload. 456 Data Source - The peer sending a data payload. 
Note that the Data 457 Source can be required to both send and receive RDMA/DDP 458 Messages to transfer a data payload. 460 Data Delivery (Delivery, Delivered, Delivers) - Delivery is defined 461 as the process of informing the ULP or consumer that a 462 particular Message is available for use. This is specifically 463 different from "Placement", which may generally occur in any 464 order, while the order of "Delivery" is strictly defined. See 465 "Data Placement" in Section 2.3. 467 Delivery - See Data Delivery in Section 2.1. 469 Delivered - See Data Delivery in Section 2.1. 471 Delivers - See Data Delivery in Section 2.1. 473 Fabric - The collection of links, switches, and routers that 474 connect a set of Nodes with RDMA/DDP protocol implementations. 476 Fence (Fenced, Fences) - To block the current RDMA Operation from 477 executing until prior RDMA Operations have Completed. 479 iWARP - A suite of wire protocols comprised of RDMAP, DDP, and MPA. 480 The iWARP protocol suite may be layered above TCP, SCTP, or 481 other transport protocols. 483 Local Peer - The RDMA/DDP protocol implementation on the local end 484 of the connection. Used to refer to the local entity when 485 describing a protocol exchange or other interaction between two 486 Nodes. 488 Node - A computing device attached to one or more links of a Fabric 489 (network). A Node in this context does not refer to a specific 490 application or protocol instantiation running on the computer. 491 A Node may consist of one or more RNICs installed in a host 492 computer. 494 Placement - See "Data Placement" in Section 2.3 496 Placed - See "Data Placement" in Section 2.3 498 Places - See "Data Placement" in Section 2.3 500 Remote Peer - The RDMA/DDP protocol implementation on the opposite 501 end of the connection. Used to refer to the remote entity when 502 describing protocol exchanges or other interactions between two 503 Nodes. 505 RNIC - RDMA Network Interface Controller. 
In this context, this 506 would be a network I/O adapter or embedded controller with 507 iWARP and Verbs functionality. 509 RNIC Interface (RI) - The presentation of the RNIC to the Verbs 510 Consumer as implemented through the combination of the RNIC and 511 the RNIC driver. 513 Termination - See "RDMAP Abortive Termination" in Section 2.4. 515 Terminated - See "RDMAP Abortive Termination" in Section 2.4. 517 Terminate - See "RDMAP Abortive Termination" in Section 2.4 519 Terminates - See "RDMAP Abortive Termination" in Section 2.4 521 ULP - Upper Layer Protocol. The protocol layer above the protocol 522 layer currently being referenced. The ULP for RDMA/DDP is 523 expected to be an OS, Application, adaptation layer, or 524 proprietary device. The RDMA/DDP documents do not specify a 525 ULP - they provide a set of semantics that allow a ULP to be 526 designed to utilize RDMA/DDP. 528 ULP Payload - The ULP data that is contained within a single 529 protocol segment or packet (e.g. a DDP Segment). 531 Verbs - An abstract description of the functionality of a RNIC 532 Interface. The OS may expose some or all of this functionality 533 via one or more APIs to applications. The OS will also use some 534 of the functionality to manage the RNIC Interface. 536 2.2 LLP 538 LLP - Lower Layer Protocol. The protocol layer beneath the protocol 539 layer currently being referenced. For example, for DDP the LLP 540 is SCTP, MPA, or other transport protocols. For RDMA, the LLP 541 is DDP. 543 LLP Connection - Corresponds to an LLP transport-level connection 544 between the peer LLP layers on two nodes. 546 LLP Stream - Corresponds to a single LLP transport-level Stream 547 between the peer LLP layers on two Nodes. One or more LLP 548 Streams may map to a single transport-level LLP connection. For 549 transport protocols that support multiple Streams per 550 connection (e.g. SCTP), a LLP Stream corresponds to one 551 transport-level Stream. 553 MULPDU - Maximum ULPDU. 
The current maximum size of the record that 554 is acceptable for DDP to pass to the LLP for transmission. 556 ULPDU - Upper Layer Protocol Data Unit. The data record defined by 557 the layer above MPA. 559 2.3 Direct Data Placement (DDP) 561 Data Placement (Placement, Placed, Places) - For DDP, this term is 562 specifically used to indicate the process of writing to a data 563 buffer by a DDP implementation. DDP Segments carry Placement 564 information, which may be used by the receiving DDP 565 implementation to perform Data Placement of the DDP Segment ULP 566 Payload. See "Data Delivery". 568 DDP Abortive Teardown - The act of closing a DDP Stream without 569 attempting to Complete in-progress and pending DDP Messages. 571 DDP Graceful Teardown - The act of closing a DDP Stream such that 572 all in-progress and pending DDP Messages are allowed to 573 Complete successfully. 575 DDP Control Field - a fixed 16-bit field in the DDP Header. The DDP 576 Control Field contains an 8-bit field whose contents are 577 reserved for use by the ULP. 579 DDP Header - The header present in all DDP segments. The DDP Header 580 contains control and Placement fields that are used to define 581 the final Placement location for the ULP payload carried in a 582 DDP Segment. 584 DDP Message - A ULP defined unit of data interchange, which is 585 subdivided into one or more DDP segments. This segmentation may 586 occur for a variety of reasons, including segmentation to 587 respect the maximum segment size of the underlying transport 588 protocol. 590 DDP Segment - The smallest unit of data transfer for the DDP 591 protocol. It includes a DDP Header and ULP Payload (if 592 present). A DDP Segment should be sized to fit within the 593 underlying transport protocol MULPDU. 595 DDP Stream - a sequence of DDP Messages whose ordering is defined 596 by the LLP. For SCTP, a DDP Stream maps directly to an SCTP 597 Stream. 
For MPA, a DDP Stream maps directly to a TCP connection 598 and a single DDP Stream is supported. Note that DDP has no 599 ordering guarantees between DDP Streams. 601 Direct Data Placement - A mechanism whereby ULP data contained 602 within DDP Segments may be Placed directly into its final 603 destination in memory without processing of the ULP. This may 604 occur even when the DDP Segments arrive out of order. Out of 605 order Placement support may require the Data Sink to implement 606 the LLP and DDP as one functional block. 608 Direct Data Placement Protocol (DDP) - Also, a wire protocol that 609 supports Direct Data Placement by associating explicit memory 610 buffer placement information with the LLP payload units. 612 Message Offset (MO) - For the DDP Untagged Buffer Model, specifies 613 the offset, in bytes, from the start of a DDP Message. 615 Message Sequence Number (MSN) - For the DDP Untagged Buffer Model, 616 specifies a sequence number that is increasing with each DDP 617 Message. 619 Queue Number (QN) - For the DDP Untagged Buffer Model, identifies a 620 destination Data Sink queue for a DDP Segment. 622 Steering Tag - An identifier of a Tagged Buffer on a Node, valid as 623 defined within a protocol specification. 625 STag - Steering Tag 627 Tagged Buffer - A buffer that is explicitly Advertised to the 628 Remote Peer through exchange of an STag, Tagged Offset, and 629 length. 631 Tagged Buffer Model - A DDP data transfer model used to transfer 632 Tagged Buffers from the Local Peer to the Remote Peer. 634 Tagged DDP Message - A DDP Message that targets a Tagged Buffer. 636 Tagged Offset (TO) - The offset within a Tagged Buffer on a Node. 638 Untagged Buffer - A buffer that is not explicitly Advertised to the 639 Remote Peer. Untagged buffers support one of the two available 640 data transfer mechanisms called the Untagged Buffer Model. 
    An untagged buffer is used to send asynchronous control messages
    to the Remote Peer for RDMA Read, Send, and Terminate requests.
    Untagged Buffers handle Untagged DDP Messages.

Untagged Buffer Model - A DDP data transfer model used to transfer
    Untagged Buffers from the Local Peer to the Remote Peer.

Untagged DDP Message - A DDP Message that targets an Untagged
    Buffer.

2.4 Remote Direct Memory Access (RDMA)

Event - An indication provided by the RDMAP Layer to the ULP to
    indicate a Completion or other condition requiring immediate
    attention.

Invalidate STag - A mechanism used to prevent the Remote Peer from
    reusing a previously explicitly Advertised STag, until the
    Local Peer makes it available through a subsequent explicit
    Advertisement. The STag cannot be accessed remotely until it is
    explicitly Advertised again.

RDMA Completion (Completion, Completed, Complete, Completes) - For
    RDMA, Completion is defined as the process of informing the ULP
    that a particular RDMA Operation has performed all functions
    specified for the RDMA Operations, including Placement and
    Delivery. The Completion semantic of each RDMA Operation is
    distinctly defined.

RDMA Message - A data transfer mechanism used to fulfill an RDMA
    Operation.

RDMA Operation - A sequence of RDMA Messages, including control
    Messages, to transfer data from a Data Source to a Data Sink.
    The following RDMA Operations are defined - RDMA Writes, RDMA
    Read, Send, Send with Invalidate, Send with Solicited Event,
    Send with Solicited Event and Invalidate, and Terminate.

RDMA Protocol (RDMAP) - A wire protocol that supports RDMA
    Operations to transfer ULP data between a Local Peer and the
    Remote Peer.

RDMAP Abortive Termination (Termination, Terminated, Terminate,
    Terminates) - The act of closing an RDMAP Stream without
    attempting to Complete in-progress and pending RDMA Operations.

RDMAP Graceful Termination - The act of closing an RDMAP Stream
    such that all in-progress and pending RDMA Operations are
    allowed to Complete successfully.

RDMA Read - An RDMA Operation used by the Data Sink to transfer the
    contents of a source RDMA buffer from the Remote Peer to the
    Local Peer. An RDMA Read operation consists of a single RDMA
    Read Request Message and a single RDMA Read Response Message.

RDMA Read Request - An RDMA Message used by the Data Sink to
    request the Data Source to transfer the contents of an RDMA
    buffer. The RDMA Read Request Message describes both the Data
    Source and Data Sink RDMA buffers.

RDMA Read Request Queue - The queue used for processing RDMA Read
    Requests. The RDMA Read Request Queue has a DDP Queue Number of
    1.

RDMA Read Response - An RDMA Message used by the Data Source to
    transfer the contents of an RDMA buffer to the Data Sink, in
    response to an RDMA Read Request. The RDMA Read Response
    Message only describes the Data Sink RDMA buffer.

RDMAP Stream - An association between a pair of RDMAP
    implementations, possibly on different Nodes, which transfer
    ULP data using RDMA Operations. There may be multiple RDMAP
    Streams on a single Node. An RDMAP Stream maps directly to a
    single DDP Stream.

RDMA Write - An RDMA Operation that transfers the contents of a
    source RDMA Buffer from the Local Peer to a destination RDMA
    Buffer at the Remote Peer using RDMA. The RDMA Write Message
    only describes the Data Sink RDMA buffer.

Remote Direct Memory Access (RDMA) - A method of accessing memory
    on a remote system in which the local system specifies the
    remote location of the data to be transferred. Employing an
    RNIC in the remote system allows the access to take place
    without interrupting the processing of the CPU(s) on the
    system.

Send - An RDMA Operation that transfers the contents of a ULP
    Buffer from the Local Peer to an Untagged Buffer at the Remote
    Peer.

Send Message Type - A Send Message, Send with Invalidate Message,
    Send with Solicited Event Message, or Send with Solicited Event
    and Invalidate Message.

Send Operation Type - A Send Operation, Send with Invalidate
    Operation, Send with Solicited Event Operation, or Send with
    Solicited Event and Invalidate Operation.

Solicited Event (SE) - A facility by which an RDMA Operation sender
    may cause an Event to be generated at the recipient, if the
    recipient is configured to generate such an Event, when a Send
    with Solicited Event or Send with Solicited Event and
    Invalidate Message is received. Note: The Local Peer's ULP can
    use the Solicited Event mechanism to ensure that Messages
    designated as important to the ULP are handled in an
    expeditious manner by the Remote Peer's ULP. The ULP at the
    Local Peer can indicate a given Send Message Type is important
    by using the Send with Solicited Event Message or Send with
    Solicited Event and Invalidate Message. The ULP at the Remote
    Peer can choose to only be notified when valid Send with
    Solicited Event Messages and/or Send with Solicited Event and
    Invalidate Messages arrive and handle other valid incoming Send
    Messages or Send with Invalidate Messages at its leisure.

Terminate - An RDMA Message used by a Node to pass an error
    indication to the peer Node on an RDMAP Stream. This operation
    is for RDMAP use only.

ULP Buffer - A buffer owned above the RDMAP Layer and advertised to
    the RDMAP Layer either as a Tagged Buffer or an Untagged ULP
    Buffer.

ULP Message - The ULP data that is handed to a specific protocol
    layer for transmission. Data boundaries are preserved as they
    are transmitted through iWARP.

3 ULP and Transport Attributes

3.1 Transport Requirements & Assumptions

RDMAP MUST be layered on top of the Direct Data Placement Protocol
[DDP].

RDMAP requires the following DDP support:

*   RDMAP uses three queues for Untagged Buffers:

    *   Queue Number 0 (used by RDMAP for Send, Send with
        Invalidate, Send with Solicited Event, and Send with
        Solicited Event and Invalidate operations).

    *   Queue Number 1 (used by RDMAP for RDMA Read operations).

    *   Queue Number 2 (used by RDMAP for Terminate operations).

*   DDP maps a single RDMA Message to a single DDP Message.

*   DDP uses the STag and Tagged Offset provided by the RDMAP for
    Tagged Buffer Messages (i.e. RDMA Write and RDMA Read Response).

*   When the DDP layer Delivers an Untagged DDP Message to the RDMAP
    layer, DDP provides the length of the DDP Message. This ensures
    that RDMAP does not have to carry a length field in its header.

*   When the RDMAP layer provides an RDMA Message to the DDP Layer,
    DDP must insert the RsvdULP field value provided by the RDMAP
    Layer into the associated DDP Message.

*   When the DDP layer Delivers a DDP Message to the RDMAP layer,
    DDP provides the RsvdULP field.

*   The RsvdULP field must be 1 octet for DDP Tagged Messages and 5
    octets for DDP Untagged Messages.

*   DDP propagates to RDMAP all operation or protection errors (used
    by RDMAP Terminate) and, when appropriate, the DDP Header fields
    of the DDP Segment that encountered the error.

*   If an RDMA Operation is aborted by DDP or a lower layer, the
    contents of the Data Sink buffers associated with the operation
    are considered indeterminate.

*   DDP in conjunction with the lower layers provide reliable,
    in-order Delivery.
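
The queue assignments and RsvdULP sizes listed above can be written
down as constants. The following is a minimal sketch only; the names
UntaggedQueue and rsvd_ulp_octets are hypothetical and do not come
from this specification or from any real library.

```python
from enum import IntEnum


class UntaggedQueue(IntEnum):
    """DDP Queue Numbers that RDMAP uses for Untagged Buffers (hypothetical names)."""
    SEND = 0       # Send, Send with Invalidate, Send with SE, Send with SE and Invalidate
    RDMA_READ = 1  # RDMA Read operations
    TERMINATE = 2  # Terminate operations


def rsvd_ulp_octets(tagged: bool) -> int:
    """Size of the DDP RsvdULP field: 1 octet if Tagged, 5 octets if Untagged."""
    return 1 if tagged else 5
```

An Untagged Message has four more RsvdULP octets than a Tagged one,
which is the room RDMAP later uses for the Invalidate STag.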

3.2 RDMAP Interactions with the ULP

RDMAP provides the ULP with access to the following RDMA Operations
as defined in this specification:

*   Send

*   Send with Solicited Event

*   Send with Invalidate

*   Send with Solicited Event and Invalidate

*   RDMA Write

*   RDMA Read

For Send Operation Types, the following are the interactions
between the RDMAP Layer and the ULP:

*   At the Data Source:

    *   The ULP passes to the RDMAP Layer the following:

        *   ULP Message Length

        *   ULP Message

        *   An indication of the Send Operation Type, where the
            valid types are: Send, Send with Solicited Event, Send
            with Invalidate, or Send with Solicited Event and
            Invalidate.

        *   An Invalidate STag, if the Send Operation Type was Send
            with Invalidate or Send with Solicited Event and
            Invalidate.

    *   When the Send Operation Type Completes, an indication of
        the Completion results.

*   At the Data Sink:

    *   If the Send Operation Type Completed successfully, the
        RDMAP Layer passes the following information to the ULP
        Layer:

        *   ULP Message Length

        *   ULP Message

        *   An Event, if the Data Sink is configured to generate an
            Event.

        *   An Invalidated STag, if the Send Operation Type was
            Send with Invalidate or Send with Solicited Event and
            Invalidate.

    *   If the Send Operation Type Completed in error, the Data
        Sink RDMAP Layer will pass up the corresponding error
        information to the Data Sink ULP and send a Terminate
        Message to the Data Source RDMAP Layer. The Data Source
        RDMAP Layer will then pass up the Terminate Message to the
        ULP.

For RDMA Write Operations, the following are the interactions
between the RDMAP Layer and the ULP:

*   At the Data Source:

    *   The ULP passes to the RDMAP Layer the following:

        *   ULP Message Length

        *   ULP Message

        *   Data Sink STag

        *   Data Sink Tagged Offset

    *   When the RDMA Write Operation Completes, an indication of
        the Completion results.

*   At the Data Sink:

    *   If the RDMA Write completed successfully, the RDMAP Layer
        does not Deliver the RDMA Write to the ULP. It does Place
        the ULP Message transferred through the RDMA Write Message
        into the ULP Buffer.

    *   If the RDMA Write completed in error, the Data Sink RDMAP
        Layer will pass up the corresponding error information to
        the Data Sink ULP and send a Terminate Message to the Data
        Source RDMAP Layer. The Data Source RDMAP Layer will then
        pass up the Terminate Message to the ULP.

For RDMA Read Operations, the following are the interactions
between the RDMAP Layer and the ULP:

*   At the Data Sink:

    *   The ULP passes to the RDMAP Layer the following:

        *   ULP Message Length

        *   Data Source STag

        *   Data Sink STag

        *   Data Source Tagged Offset

        *   Data Sink Tagged Offset

    *   When the RDMA Read Operation Completes, an indication of
        the Completion results.

*   At the Data Source:

    *   If no error occurred while processing the RDMA Read
        Request, the Data Source will not pass up any information
        to the ULP.

    *   If an error occurred while processing the RDMA Read
        Request, the Data Source RDMAP Layer will pass up the
        corresponding error information to the Data Source ULP and
        send a Terminate Message to the Data Sink RDMAP Layer. The
        Data Sink RDMAP Layer will then pass up the Terminate
        Message to the ULP.

For STags made available to the RDMAP Layer, the following are the
interactions between the RDMAP Layer and the ULP:

*   If the ULP enables an STag, the ULP passes to the RDMAP Layer
    the:

    *   STag;

    *   range of Tagged Offsets that are associated with a given
        STag;

    *   remote access rights (read, write, or read and write)
        associated with a given, valid STag; and

    *   association between a given STag and a given RDMAP Stream.

*   If the ULP disables an STag, the ULP passes to the RDMAP Layer
    the STag.

If an error occurs at the RDMAP Layer, the RDMAP Layer may pass
back error information (e.g. the content of a Terminate Message) to
the ULP.

4 Header Format

The control information of RDMA Messages is included in DDP
protocol defined header fields, with the following exceptions:

*   The first octet reserved for ULP usage on all DDP Messages in
    the DDP Protocol (i.e. the RsvdULP Field) is used by RDMAP to
    carry the RDMA Message Opcode and the RDMAP version. This octet
    is known as the RDMAP Control Field in this specification. For
    Send with Invalidate and Send with Solicited Event and
    Invalidate, RDMAP uses the second through fifth octets provided
    by DDP on Untagged DDP Messages to carry the STag that will be
    Invalidated.

*   The RDMA Message length is passed by the RDMAP layer to the DDP
    layer on all outbound transfers.

*   For RDMA Read Request Messages, the RDMA Read Message Size is
    included in the RDMA Read Request Header.

*   The RDMA Message length is passed to the RDMAP Layer by the DDP
    layer on inbound Untagged Buffer transfers.

*   Two RDMA Messages carry additional RDMAP headers. The RDMA Read
    Request carries the Data Sink and Data Source buffer
    descriptions, including buffer length. The Terminate carries
    additional information associated with the error that caused the
    Terminate.

4.1 RDMAP Control and Invalidate STag Field

The version of RDMAP defined by this specification uses all 8 bits
of the RDMAP Control Field. The first octet reserved for ULP use in
the DDP Protocol MUST be used by the RDMAP to carry the RDMAP
Control Field. The ordering of the bits in the first octet MUST be
as defined in Figure 3 DDP Control, RDMAP Control, and Invalidate
STag Field. For Send with Invalidate and Send with Solicited Event
and Invalidate, the second through fifth octets of the DDP RsvdULP
field MUST be used by RDMAP to carry the Invalidate STag. Figure 3
DDP Control, RDMAP Control, and Invalidate STag Field depicts the
format of the DDP Control and RDMAP Control fields. (Note: In
Figure 3 DDP Control, RDMAP Control, and Invalidate STag Field, the
DDP Header is offset by 16 bits to accommodate the MPA header
defined in [MPA]. The MPA header is only present if DDP is layered
on top of MPA.)

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
                                +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                                |T|L| Resrv | DV| RV|Rsv| Opcode|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Invalidate STag                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   Figure 3 DDP Control, RDMAP Control, and Invalidate STag Fields

All RDMA Messages handed by the RDMAP Layer to the DDP layer MUST
define the value of the Tagged flag in the DDP Header. Figure 4
RDMA Usage of DDP Fields MUST be used to define the value of the
Tagged flag that is handed to the DDP Layer for each RDMA Message.

Figure 4 RDMA Usage of DDP Fields defines the value of the RDMA
Opcode field that MUST be used for each RDMA Message.

Figure 4 RDMA Usage of DDP Fields defines when the STag, Queue
Number, and Tagged Offset fields MUST be provided for each RDMA
Message.
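
As an illustration of the RDMAP Control Field layout in Figure 3,
the octet can be packed and unpacked as sketched below. Bit 0 in
the figure is the most significant bit, so the RDMA Version field
occupies the top two bits of the octet, the reserved bits come
next, and the Opcode takes the low four bits. The function and
constant names are hypothetical; the opcode values are those listed
in Figure 4 RDMA Usage of DDP Fields.

```python
RDMAP_VERSION_RDMAC = 0b00  # RDMAC RNICs
RDMAP_VERSION_IETF = 0b01   # IETF RNICs

# Opcode values taken from Figure 4; 1000b through 1111b are reserved.
OPCODES = {
    "RDMA Write": 0b0000,
    "RDMA Read Request": 0b0001,
    "RDMA Read Response": 0b0010,
    "Send": 0b0011,
    "Send with Invalidate": 0b0100,
    "Send with SE": 0b0101,
    "Send with SE and Invalidate": 0b0110,
    "Terminate": 0b0111,
}


def pack_rdmap_control(opcode: int, version: int = RDMAP_VERSION_IETF) -> int:
    """Build the 8-bit RDMAP Control Field: RV (2 bits), Rsv (2 bits), Opcode (4 bits)."""
    if not 0 <= opcode <= 0b0111:
        raise ValueError("opcode is reserved or out of range")
    if version not in (RDMAP_VERSION_RDMAC, RDMAP_VERSION_IETF):
        raise ValueError("unknown RDMAP version")
    # The reserved bits are left at zero, as required of the sender.
    return (version << 6) | opcode


def unpack_rdmap_control(octet: int) -> tuple:
    """Return (version, opcode); the reserved bits are ignored on receive."""
    return (octet >> 6) & 0x3, octet & 0xF
```

For example, a Send with Invalidate from an IETF RNIC packs to
version 01b, reserved 00b, opcode 0100b.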

For this version of the RDMAP, all RDMA Messages MUST have:

*   Bits 24-25; RDMA Version field: 01b for IETF RNICs, and 00b for
    RDMAC RNICs. Both version numbers are valid. Interoperability is
    dependent on MPA protocol version negotiation (e.g. MPA marker
    and MPA CRC).

*   Bits 26-27; Reserved. MUST be set to zero by sender, ignored by
    the receiver.

*   Bits 28-31; OpCode field: see Figure 4 RDMA Usage of DDP Fields.

*   Bits 32-63; Invalidate STag. However, this field is only valid
    for Send with Invalidate and Send with Solicited Event and
    Invalidate Messages (see Figure 4 RDMA Usage of DDP Fields).
    For Send, Send with Solicited Event, RDMA Read Request, and
    Terminate, the Invalidate STag field MUST be set to zero on
    transmit and ignored by the receiver.

-------+-----------+-------+------+-------+-----------+--------------
RDMA   | Message   | Tagged| STag | Queue | Invalidate| Message
Message| Type      | Flag  | and  | Number| STag      | Length
OpCode |           |       | TO   |       |           | Communicated
       |           |       |      |       |           | between DDP
       |           |       |      |       |           | and RDMAP
-------+-----------+-------+------+-------+-----------+--------------
0000b  | RDMA Write| 1     | Valid| N/A   | N/A       | Yes
       |           |       |      |       |           |
-------+-----------+-------+------+-------+-----------+--------------
0001b  | RDMA Read | 0     | N/A  | 1     | N/A       | Yes
       | Request   |       |      |       |           |
-------+-----------+-------+------+-------+-----------+--------------
0010b  | RDMA Read | 1     | Valid| N/A   | N/A       | Yes
       | Response  |       |      |       |           |
-------+-----------+-------+------+-------+-----------+--------------
0011b  | Send      | 0     | N/A  | 0     | N/A       | Yes
       |           |       |      |       |           |
-------+-----------+-------+------+-------+-----------+--------------
0100b  | Send with | 0     | N/A  | 0     | Valid     | Yes
       | Invalidate|       |      |       |           |
-------+-----------+-------+------+-------+-----------+--------------
0101b  | Send with | 0     | N/A  | 0     | N/A       | Yes
       | SE        |       |      |       |           |
-------+-----------+-------+------+-------+-----------+--------------
0110b  | Send with | 0     | N/A  | 0     | Valid     | Yes
       | SE and    |       |      |       |           |
       | Invalidate|       |      |       |           |
-------+-----------+-------+------+-------+-----------+--------------
0111b  | Terminate | 0     | N/A  | 2     | N/A       | Yes
       |           |       |      |       |           |
-------+-----------+-------+------+-------+-----------+--------------
1000b  |           |
to     | Reserved  | Not Specified
1111b  |           |
-------+-----------+-------------------------------------------------
                  Figure 4 RDMA Usage of DDP Fields

Note: N/A means Not Applicable.

4.2 RDMA Message Definitions

The following figure defines which RDMA Headers MUST be used on
each RDMA Message and which RDMA Messages are allowed to carry ULP
payload:

-------+-----------+-------------------+-------------------------
RDMA   | Message   | RDMA Header Used  | ULP Message allowed in
Message| Type      |                   | the RDMA Message
OpCode |           |                   |
       |           |                   |
-------+-----------+-------------------+-------------------------
0000b  | RDMA Write| None              | Yes
       |           |                   |
-------+-----------+-------------------+-------------------------
0001b  | RDMA Read | RDMA Read Request | No
       | Request   | Header            |
-------+-----------+-------------------+-------------------------
0010b  | RDMA Read | None              | Yes
       | Response  |                   |
-------+-----------+-------------------+-------------------------
0011b  | Send      | None              | Yes
       |           |                   |
-------+-----------+-------------------+-------------------------
0100b  | Send with | None              | Yes
       | Invalidate|                   |
-------+-----------+-------------------+-------------------------
0101b  | Send with | None              | Yes
       | SE        |                   |
-------+-----------+-------------------+-------------------------
0110b  | Send with | None              | Yes
       | SE and    |                   |
       | Invalidate|                   |
-------+-----------+-------------------+-------------------------
0111b  | Terminate | Terminate Header  | No
       |           |                   |
-------+-----------+-------------------+-------------------------
1000b  |           |
to     | Reserved  | Not Specified
1111b  |           |
-------+-----------+-------------------+-------------------------
                Figure 5 RDMA Message Definitions

4.3 RDMA Write Header

The RDMA Write Message does not include an RDMAP header. The RDMAP
layer passes to the DDP layer an RDMAP Control Field. The RDMA
Write Message is fully described by the DDP Headers of the DDP
Segments associated with the Message.

See section 11 Appendix for a description of the DDP Segment format
associated with RDMA Write Messages.

4.4 RDMA Read Request Header

The RDMA Read Request Message carries an RDMA Read Request Header
that describes the Data Sink and Data Source Buffers used by the
RDMA Read operation. The RDMA Read Request Header immediately
follows the DDP header. The RDMAP layer passes to the DDP layer an
RDMAP Control Field. The following figure depicts the RDMA Read
Request Header that MUST be used for all RDMA Read Request
Messages:

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                   Data Sink STag (SinkSTag)                   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+               Data Sink Tagged Offset (SinkTO)                +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|               RDMA Read Message Size (RDMARDSZ)               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                  Data Source STag (SrcSTag)                   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+               Data Source Tagged Offset (SrcTO)               +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
             Figure 6 RDMA Read Request Header Format

Data Sink Steering Tag: 32 bits.

    The Data Sink Steering Tag identifies the Data Sink's Tagged
    Buffer. This field MUST be copied, without interpretation,
    from the RDMA Read Request into the corresponding RDMA Read
    Response and allows the Data Sink to place the returning data.
    The STag is associated with the RDMAP Stream through a
    mechanism that is outside the scope of the RDMAP
    specification.

Data Sink Tagged Offset: 64 bits.

    The Data Sink Tagged Offset specifies the starting offset, in
    octets, from the base of the Data Sink's Tagged Buffer, where
    the data is to be written by the Data Source. This field is
    copied from the RDMA Read Request into the corresponding RDMA
    Read Response and allows the Data Sink to place the returning
    data. The Data Sink Tagged Offset MAY start at an arbitrary
    offset.

    The Data Sink STag and Data Sink Tagged Offset fields describe
    the buffer to which the RDMA Read data is written.

    Note: the DDP Layer protects against a wrap of the Data Sink
    Tagged Offset.

RDMA Read Message Size: 32 bits.

    The RDMA Read Message Size is the amount of data, in octets,
    read from the Data Source. A single RDMA Read Request Message
    can retrieve from 0 to 2^32-1 data octets from the Data
    Source.

Data Source Steering Tag: 32 bits.

    The Data Source Steering Tag identifies the Data Source's
    Tagged Buffer. The STag is associated with the RDMAP Stream
    through a mechanism that is outside the scope of the RDMAP
    specification.

Data Source Tagged Offset: 64 bits.

    The Tagged Offset specifies the starting offset, in octets,
    that is to be read from the Data Source's Tagged Buffer. The
    Data Source Tagged Offset MAY start at an arbitrary offset.

    The Data Source STag and Data Source Tagged Offset fields
    describe the buffer from which the RDMA Read data is read.
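
The five fields described above add up to 32 + 64 + 32 + 32 + 64 =
224 bits, or 28 octets. A minimal sketch of packing and parsing the
RDMA Read Request Header follows. It assumes big-endian (network)
byte order, which this section does not state explicitly, and the
class and field names are hypothetical.

```python
import struct
from typing import NamedTuple

# SinkSTag (32), SinkTO (64), RDMARDSZ (32), SrcSTag (32), SrcTO (64).
# ">" selects big-endian with no padding; byte order is an assumption here.
_READ_REQUEST_FORMAT = ">IQIIQ"


class ReadRequestHeader(NamedTuple):
    sink_stag: int
    sink_to: int
    size: int
    src_stag: int
    src_to: int

    def pack(self) -> bytes:
        """Serialize the header into its 28-octet wire form."""
        return struct.pack(_READ_REQUEST_FORMAT, *self)

    @classmethod
    def unpack(cls, data: bytes) -> "ReadRequestHeader":
        """Parse the first 28 octets of data as a Read Request Header."""
        size = struct.calcsize(_READ_REQUEST_FORMAT)
        return cls(*struct.unpack(_READ_REQUEST_FORMAT, data[:size]))
```

Because struct.pack rejects values that do not fit the field width,
a read size above 2^32-1 raises struct.error instead of being
silently truncated.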

See Section 7.2 Errors Detected at the Remote Peer on Incoming RDMA
Messages for a description of error checking required upon
processing of an RDMA Read Request at the Data Source.

4.5 RDMA Read Response Header

The RDMA Read Response Message does not include an RDMAP header.
The RDMAP layer passes to the DDP layer an RDMAP Control Field. The
RDMA Read Response Message is fully described by the DDP Headers of
the DDP Segments associated with the Message.

See Section 11 Appendix for a description of the DDP Segment format
associated with RDMA Read Response Messages.

4.6 Send Header and Send with Solicited Event Header

The Send and Send with Solicited Event Messages do not include an
RDMAP header. The RDMAP layer passes to the DDP layer an RDMAP
Control Field. The Send and Send with Solicited Event Messages are
fully described by the DDP Headers of the DDP Segments associated
with the Message.

See Section 11 Appendix for a description of the DDP Segment format
associated with Send and Send with Solicited Event Messages.

4.7 Send with Invalidate Header and Send with SE and Invalidate Header

The Send with Invalidate and Send with Solicited Event and
Invalidate Messages do not include an RDMAP header. The RDMAP layer
passes to the DDP layer an RDMAP Control Field and the Invalidate
STag field (see section 4.1 RDMAP Control and Invalidate STag
Field). The Send with Invalidate and Send with Solicited Event and
Invalidate Messages are fully described by the DDP Headers of the
DDP Segments associated with the Message.

See Section 11 Appendix for a description of the DDP Segment format
associated with Send with Invalidate and Send with Solicited Event
and Invalidate Messages.

4.8 Terminate Header

The Terminate Message carries a Terminate Header that contains
additional information associated with the cause of the Terminate.
The Terminate Header immediately follows the DDP header. The RDMAP
layer passes to the DDP layer an RDMAP Control Field. The following
figure depicts a Terminate Header that MUST be used for the
Terminate Message:

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Terminate Control          |        Reserved         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  DDP Segment Length (if any)  |                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               +
|                                                               |
//                                                             //
|                Terminated DDP Header (if any)                 |
+                                                               +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
//                                                             //
|                Terminated RDMA Header (if any)                |
+                                                               +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                 Figure 7 Terminate Header Format

Terminate Control: 19 bits.

    The Terminate Control field MUST have the format defined in
    Figure 8 Terminate Control Field.

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    | Layer | EType |  Error Code   |HdrCt|
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
             Figure 8 Terminate Control Field

    *   Figure 9 Terminate Control Field Values defines the valid
        values that MUST be used for this field.

    *   Layer: 4 bits.

        Identifies the layer that encountered the error.

    *   EType (RDMA Error Type): 4 bits.

        Identifies the type of error that caused the Terminate.
        When the error is detected at the RDMAP Layer, the
        RDMAP Layer inserts the Error Type into this field.
        When the error is detected at a LLP layer, a LLP layer
        creates the Error Type and the DDP layer passes it up
        to the RDMAP Layer, and the RDMAP Layer inserts it into
        this field.

    *   Error Code: 8 bits.

        This field identifies the specific error that caused
        the Terminate. When the error is detected at the RDMAP
        Layer, the RDMAP Layer creates the Error Code. When the
        error is detected at a LLP layer, a LLP layer creates
        the Error Code and the DDP layer passes it up to the
        RDMAP Layer, and the RDMAP Layer inserts it into this
        field.

    *   HdrCt: 3 bits.

        Header control bits:

        *   M: bit 16. DDP Segment Length valid. See Figure 10
            for when this bit SHOULD be set.

        *   D: bit 17. DDP Header Included. See Figure 10 for
            when this bit SHOULD be set.

        *   R: bit 18. RDMAP Header Included. See Figure 10 for
            when this bit SHOULD be set.

-------+----------+-------+-------------+------+--------------------
Layer  | Layer    | Error | Error Type  | Error| Error Code Name
       | Name     | Type  | Name        | Code |
-------+----------+-------+-------------+------+--------------------
       |          | 0000b | Local       | None | None
       |          |       | Catastrophic|      |
       |          |       | Error       |      |
       |          +-------+-------------+------+--------------------
       |          |       |             | 00X  | Invalid STag
       |          |       |             +------+--------------------
       |          |       |             | 01X  | Base or bounds
       |          |       |             |      | violation
       |          |       | Remote      +------+--------------------
       |          | 0001b | Protection  | 02X  | Access rights
       |          |       | Error       |      | violation
       |          |       |             +------+--------------------
0000b  | RDMA     |       |             | 03X  | STag not associated
       |          |       |             |      | with RDMAP Stream
       |          |       |             +------+--------------------
       |          |       |             | 04X  | TO wrap
       |          |       |             +------+--------------------
       |          |       |             | 09X  | STag cannot be
       |          |       |             |      | Invalidated
       |          |       |             +------+--------------------
       |          |       |             | FFX  | Unspecified Error
       |          +-------+-------------+------+--------------------
       |          |       |             | 05X  | Invalid RDMAP
       |          |       |             |      | version
       |          |       |             +------+--------------------
       |          |       |             | 06X  | Unexpected OpCode
       |          |       | Remote      +------+--------------------
       |          | 0010b | Operation   | 07X  | Catastrophic error,
       |          |       | Error       |      | localized to RDMAP
       |          |       |             |      | Stream
       |          |       |             +------+--------------------
       |          |       |             | 08X  | Catastrophic error,
       |          |       |             |      | global
       |          |       |             +------+--------------------
       |          |       |             | 09X  | STag cannot be
       |          |       |             |      | Invalidated
       |          |       |             +------+--------------------
       |          |       |             | FFX  | Unspecified Error
-------+----------+-------+-------------+------+--------------------
0001b  | DDP      | See DDP Specification [DDP] for a description of
       |          | the values and names.
-------+----------+-------+-----------------------------------------
0010b  | LLP      | For MPA, see MPA Specification [MPA] for a
       | (eg MPA) | description of the values and names.
-------+----------+-------+-----------------------------------------
              Figure 9 Terminate Control Field Values

Reserved: 13 bits. This field MUST be set to zero on transmit,
    ignored on receive.

DDP Segment Length: 16 bits

    The length handed up by the DDP Layer when the error was
    detected. It MUST be valid if the M bit is set. It MUST be
    present when the D bit is set.

Terminated DDP Header: 112 bits for Tagged Messages and 144 bits
    for Untagged Messages.

    The DDP Header of the incoming Message that is associated with
    the Terminate. The DDP Header is not present if the Terminate
    Error Type is a Local Catastrophic Error. It MUST be present
    if the D bit is set.

Terminated RDMA Header: 224 bits.

    The Terminated RDMA Header is only sent back if the terminate
    is associated with an RDMA Read Request Message. It MUST be
    present if the R bit is set.

    If the terminate occurs before the first RDMA Read Request
    byte is processed, the original RDMA Read Request Header is
    sent back.

    If the terminate occurs after the first RDMA Read Request byte
    is processed, the RDMA Read Request Header is updated to
    reflect the current location of the RDMA Read operation that
    is in process:

    *   Data Sink STag = Data Sink STag originally sent in the
        RDMA Read Request.

    *   Data Sink Tagged Offset = Current offset into the Data
        Sink Tagged Buffer. For example if the RDMA Read
        Request was terminated after 2048 octets were sent,
        then the Data Sink Tagged Offset = the original Data
        Sink Tagged Offset + 2048.

    *   Data Message size = Number of bytes left to transfer.

    *   Data Source STag = Data Source STag in the RDMA Read
        Request.

    *   Data Source Tagged Offset = Current offset into the
        Data Source Tagged Buffer. For example if the RDMA Read
        Request was terminated after 2048 octets were sent,
        then the Data Source Tagged Offset = the original Data
        Source Tagged Offset + 2048.

Note: if a given LLP does not define any termination codes for the
RDMAP Termination message to use, then none would be used for that
LLP.
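
The Terminate Control layout and the header update rule above can
be sketched as follows. This is an illustration under stated
assumptions, not a definitive implementation: the function names
are hypothetical, and the first function returns the whole first
32-bit word of the Terminate Header (Terminate Control followed by
the 13 reserved bits), with bit 0 taken as the most significant
bit as in the figures.

```python
def pack_terminate_control(layer: int, etype: int, error_code: int,
                           m: bool = False, d: bool = False,
                           r: bool = False) -> int:
    """Return the first 32-bit word of a Terminate Header as an integer."""
    if not (0 <= layer <= 0xF and 0 <= etype <= 0xF and 0 <= error_code <= 0xFF):
        raise ValueError("field value out of range")
    word = (layer << 28) | (etype << 24) | (error_code << 16)
    # HdrCt: M is bit 16, D is bit 17, R is bit 18 (counting from the MSB).
    word |= (int(m) << 15) | (int(d) << 14) | (int(r) << 13)
    return word  # the low 13 bits are Reserved and stay zero


def advance_read_request(sink_to: int, size: int, src_to: int, sent: int) -> tuple:
    """Update (SinkTO, size, SrcTO) after `sent` octets were transferred.

    Both STags are unchanged by the update, so they are not handled here.
    """
    if not 0 <= sent <= size:
        raise ValueError("sent must be between 0 and the requested size")
    return sink_to + sent, size - sent, src_to + sent
```

With the 2048-octet example from the text, a request for 8192
octets that started at Data Sink offset 4096 and Data Source offset
0 is reported back with offsets 6144 and 2048 and 6144 octets left
to transfer.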

Figure 10 Error Type to RDMA Message Mapping maps layer name and
error types to each RDMA Message type:

---------+-------------+------------+------------+-----------------
Layer    | Error Type  | Terminate  | Terminate  | What type of
Name     | Name        | Includes   | Includes   | RDMA Message can
         |             | DDP Header | RDMA Header| cause the error
         |             | and DDP    |            |
         |             | Segment    |            |
         |             | Length     |            |
---------+-------------+------------+------------+-----------------
         | Local       | No         | No         | Any
         | Catastrophic|            |            |
         | Error       |            |            |
         +-------------+------------+------------+-----------------
         | Remote      | Yes, if    | Yes        | Only RDMA Read
RDMA     | Protection  | possible   |            | Request, Send
         | Error       |            |            | with Invalidate,
         |             |            |            | and Send with SE
         |             |            |            | and Invalidate
         +-------------+------------+------------+-----------------
         | Remote      | Yes, if    | No         | Any
         | Operation   | possible   |            |
         | Error       |            |            |
---------+-------------+------------+------------+-----------------
DDP      | See DDP Spec| Yes        | No         | Any
         | [DDP]       |            |            |
---------+-------------+------------+------------+-----------------
LLP      | See LLP Spec| No         | No         | Any
         | [e.g. MPA]  |            |            |
          Figure 10 Error Type to RDMA Message Mapping

5 Data Transfer

5.1 RDMA Write Message

An RDMA Write is used by the Data Source to transfer data to a
previously Advertised Tagged Buffer at the Data Sink. The RDMA
Write Message has the following semantics:

*   An RDMA Write Message MUST reference a Tagged Buffer. That is,
    the Data Source RDMAP Layer MUST request that the DDP layer mark
    the Message as Tagged.

*   A valid RDMA Write Message MUST NOT be delivered to the Data
    Sink's ULP (i.e. it is placed by the DDP layer).

*   At the Remote Peer, when an invalid RDMA Write Message is
    delivered to the Remote Peer's RDMAP Layer, an error is surfaced
    (see section 7.1 RDMAP Error Surfacing).
1502 * The Tagged Offset of a Tagged Buffer MAY start at a non-zero 1503 value. 1505 * An RDMA Write Message MAY target all or part of a previously 1506 Advertised buffer. 1508 * The RDMAP does not define how the buffer(s) used by an outbound 1509 RDMA Write is defined and how it is addressed. For example, an 1510 implementation of RDMA may choose to allow a gather-list of non- 1511 contiguous data blocks to be the source of an RDMA Write. In 1512 this case, the data blocks would be combined by the Data Source 1513 and sent as a single RDMA Write Message to the Data Sink. 1515 * The Data Source RDMAP Layer MUST issue RDMA Write Messages to 1516 the DDP layer in the order they were submitted by the ULP. 1518 * At the Data Source, a subsequent Send (Send with Invalidate, 1519 Send with Solicited Event, or Send with Solicited Event and 1520 Invalidate) Message MAY be used to signal Delivery of previous 1521 RDMA Write Messages to the Data Sink, if desired by the ULP. 1523 * If the Local Peer wishes to write to multiple Tagged Buffers on 1524 the Remote Peer, the Local Peer MUST use multiple RDMA Write 1525 Messages. That is, a single RDMA Write Message can only write to 1526 one remote Tagged Buffer. 1528 * The Data Source MAY issue a zero length RDMA Write Message. 1530 5.2 RDMA Read Operation 1532 The RDMA Read operation MUST consist of a single RDMA Read Request 1533 Message and a single RDMA Read Response Message. 1535 5.2.1 RDMA Read Request Message 1537 An RDMA Read Request is used by the Data Sink to transfer data from 1538 a previously Advertised Tagged Buffer at the Data Source to a 1539 Tagged Buffer at the Data Sink. The RDMA Read Request Message has 1540 the following semantics: 1542 * An RDMA Read Request Message MUST reference an Untagged Buffer. 1543 That is, the Local Peer's RDMAP Layer MUST request that the DDP 1544 mark the Message as Untagged. 1546 * One RDMA Read Request Message MUST consume one Untagged Buffer. 
1548 * The Remote Peer's RDMAP Layer MUST process an RDMA Read Request 1549 Message. A valid RDMA Read Request Message MUST NOT be delivered 1550 to the Data Sink's ULP (i.e. it is processed by the RDMAP 1551 layer). 1553 * At the Remote Peer, when an invalid RDMA Read Request Message is 1554 delivered to the Remote Peer's RDMAP Layer, an error is surfaced 1555 (see section 7.1 RDMAP Error Surfacing). 1557 * An RDMA Read Request Message MUST reference the RDMA Read 1558 Request Queue. That is, the Local Peer's RDMAP Layer MUST 1559 request that the DDP layer set the Queue Number field to one. 1561 * The Local Peer MUST pass to the DDP Layer RDMA Read Request 1562 Messages in the order they were submitted by the ULP. 1564 * The Remote Peer MUST process the RDMA Read Request Messages in 1565 the order they were sent. 1567 * If the Local Peer wishes to read from multiple Tagged Buffers on 1568 the Remote Peer, the Local Peer MUST use multiple RDMA Read 1569 Request Messages. That is, a single RDMA Read Request Message 1570 MUST only read from one remote Tagged Buffer. 1572 * An RDMA Read Request Message MAY target all or part of a 1573 previously Advertised buffer. 1575 * If the Data Source receives a valid RDMA Read Request Message, it 1576 MUST respond with a valid RDMA Read Response Message. 1578 * The Data Sink MAY issue a zero length RDMA Read Request Message, 1579 by setting the RDMA Read Message Size field to zero in the RDMA 1580 Read Request Header. 1582 * If the Data Source receives a non-zero length RDMA Read Message 1583 Size, the Data Source RDMAP MUST validate the Data Source STag 1584 and Data Source Tagged Offset contained in the RDMA Read Request 1585 Header.
1587 * If the Data Source receives an RDMA Read Request Header with the 1588 RDMA Read Message Size set to zero, the Data Source RDMAP: 1590 * MUST NOT validate the Data Source STag and Data Source 1591 Tagged Offset contained in the RDMA Read Request Header, 1592 and 1594 * MUST respond with a zero length RDMA Read Response Message. 1596 5.2.2 RDMA Read Response Message 1598 The RDMA Read Response Message uses the DDP Tagged Buffer Model to 1599 Deliver the contents of a previously requested Data Source Tagged 1600 Buffer to the Data Sink, without any involvement from the ULP at 1601 the Remote Peer. The RDMA Read Response Message has the following 1602 semantics: 1604 * The RDMA Read Response Message for the associated RDMA Read 1605 Request Message travels in the opposite direction. 1607 * An RDMA Read Response Message MUST reference a Tagged Buffer. 1608 That is, the Data Source RDMAP Layer MUST request that the DDP 1609 mark the Message as Tagged. 1611 * The Data Source MUST ensure that a sufficient number of Untagged 1612 Buffers are available on the RDMA Read Request Queue (Queue with 1613 DDP Queue Number 1) to support the maximum number of RDMA Read 1614 Requests negotiated by the ULP. 1616 * The RDMAP Layer MUST Deliver the RDMA Read Response Message to 1617 the ULP. 1619 * At the Remote Peer, when an invalid RDMA Read Response Message 1620 is delivered to the Remote Peer's RDMAP Layer, an error is 1621 surfaced (see section 7.1 RDMAP Error Surfacing). 1623 * The Tagged Offset of a Tagged Buffer MAY start at a non-zero 1624 value. 1626 * The Data Source RDMAP Layer MUST pass RDMA Read Response 1627 Messages to the DDP layer in the order that the RDMA Read 1628 Request Messages were received by the RDMAP Layer at the Data 1629 Source. 
1631 * The Data Sink MAY validate that the STag, Tagged Offset, and 1632 length of the RDMA Read Response Message are the same as the 1633 STag, Tagged Offset, and length included in the corresponding 1634 RDMA Read Request Message. 1636 * A single RDMA Read Response Message MUST write to one remote 1637 Tagged Buffer. If the Data Sink wishes to Read multiple Tagged 1638 Buffers, the Data Sink can use multiple RDMA Read Request 1639 Messages. 1641 5.3 Send Message Type 1643 The Send Message Type uses the DDP Untagged Buffer Model to 1644 transfer data from the Data Source into an Untagged Buffer at the 1645 Data Sink. 1647 * A Send Message Type MUST reference an Untagged Buffer. That is, 1648 the Local Peer's RDMAP Layer MUST request that the DDP layer 1649 mark the Message as Untagged. 1651 * One Send Message Type MUST consume one Untagged Buffer. 1653 * The ULP Message sent using a Send Message Type MAY be less 1654 than or equal to the size of the consumed Untagged Buffer. 1655 The RDMAP Layer communicates to the ULP the size of the 1656 data written into the Untagged Buffer. 1658 * If the ULP Message sent via Send Message Type is larger 1659 than the Data Sink's Untagged Buffer, it is an error (see 1660 section 7.1 RDMAP Error Surfacing). 1662 * At the Remote Peer, the Send Message Type MUST be Delivered to 1663 the Remote Peer's ULP in the order they were sent. 1665 * After the Send with Solicited Event or Send with Solicited Event 1666 and Invalidate Message is Delivered to the ULP, the RDMAP MAY 1667 generate an Event, if the Data Sink is configured to generate 1668 such an Event. 1670 * At the Remote Peer, when an invalid Send Message Type is 1671 Delivered to the Remote Peer's RDMAP Layer, an error is surfaced 1672 (see section 7.1 RDMAP Error Surfacing). 1674 * The RDMAP does not define how the buffer(s) used by an outbound 1675 Send Message Type is defined and how it is addressed.
For 1676 example, an implementation of RDMA may choose to allow a gather- 1677 list of non-contiguous data blocks to be the source of a Send 1678 Message Type. In this case, the data blocks would be combined by 1679 the Data Source and sent as a single Send Message Type to the 1680 Data Sink. 1682 * For a Send Message Type, the Local Peer's RDMAP Layer MUST 1683 request that the DDP layer set the Queue Number field to zero. 1685 * The Local Peer MUST issue Send Message Type Messages in the 1686 order they were submitted by the ULP. 1688 * The Data Source MAY pass a zero length Send Message Type. A zero 1689 length Send Message Type MUST consume an Untagged Buffer at the 1690 Data Sink. A Send with Invalidate or Send with Solicited Event 1691 and Invalidate Message MUST reference an STag. That is, the 1692 Local Peer's RDMAP Layer MUST pass the RDMA control field and 1693 the STag that will be Invalidated to the DDP layer. 1695 * When the Send with Invalidate and Send with Solicited Event and 1696 Invalidate Message are Delivered to the Remote Peer's RDMAP 1697 Layer, the RDMAP Layer MUST: 1699 * Verify the STag that is associated with the RDMAP Stream; 1700 and 1702 * Invalidate the STag if it is associated with the RDMAP 1703 Stream; or Issue a Terminate Message with the STag Cannot 1704 be Invalidated Terminate Error Code, if the STag is not 1705 associated with the RDMAP Stream. 1707 5.4 Terminate Message 1709 The Terminate Message uses the DDP Untagged Buffer Model to 1710 transfer error related information from the Data Source into an 1711 Untagged Buffer at the Data Sink and then ceases all further 1712 communications on the underlying DDP Stream. The Terminate Message 1713 has the following semantics: 1715 * A Terminate Message MUST reference an Untagged Buffer. That is, 1716 the Local Peer's RDMAP Layer MUST request that the DDP layer 1717 mark the Message as Untagged. 1719 * A Terminate Message references the Terminate Queue. 
That is, the 1720 Local Peer's RDMAP Layer MUST request that the DDP layer set the 1721 Queue Number field to two. 1723 * One Terminate Message MUST consume one Untagged Buffer. 1725 * On a single RDMAP Stream, the RDMAP layer MUST guarantee 1726 placement of a single Terminate Message. 1728 * A Terminate Message MUST be Delivered to the Remote Peer's RDMAP 1729 Layer. The RDMAP Layer MUST Deliver the Terminate Message to the 1730 ULP. 1732 * At the Remote Peer, when an invalid Terminate Message is 1733 delivered to the Remote Peer's RDMAP Layer, an error is surfaced 1734 (see section 7.1 RDMAP Error Surfacing). 1736 * The RDMAP Layer Completes in error all ULP Operations that have 1737 not been provided to the DDP layer. 1739 * After sending a Terminate Message on an RDMAP Stream, the Local 1740 Peer MUST NOT send any more Messages on that specific RDMAP 1741 Stream. 1743 * After receiving a Terminate Message on an RDMAP Stream, the 1744 Remote Peer MAY stop sending Messages on that specific RDMAP 1745 Stream. 1747 5.5 Ordering and Completions 1749 It is important to understand the difference between Placement and 1750 Delivery ordering since RDMAP provides quite different semantics 1751 for the two. 1753 Note that many current protocols, both as used in the Internet and 1754 elsewhere, assume that data is both Placed and Delivered in order. 1755 This allowed applications to take a variety of shortcuts by taking 1756 advantage of this fact. For RDMAP, many of these shortcuts are no 1757 longer safe to use, and could cause application failure. 1759 The following rules apply to implementations of the RDMAP protocol. 1760 Note, in these rules Send includes Send, Send with Invalidate, Send 1761 with Solicited Event, and Send with Solicited Event and Invalidate: 1763 1. RDMAP does not provide ordering among Messages on different 1764 RDMAP Streams. 1766 2. RDMAP does not provide ordering between operations that are 1767 generated from the two ends of an RDMAP Stream. 
1769 3. RDMA Messages that use Tagged and Untagged Buffers MAY be 1770 Placed in any order. If an application uses overlapping 1771 buffers (points different Messages or portions of a single 1772 Message at the same buffer), then it is possible that the last 1773 incoming write to the Data Sink buffer will not be the last 1774 outgoing data sent from the Data Source. 1776 4. For a Send operation, the contents of an Untagged Buffer at the 1777 Data Sink MAY be indeterminate until the Send is Delivered to 1778 the ULP at the Data Sink. 1780 5. For an RDMA Write operation, the contents of the Tagged Buffer 1781 at the Data Sink MAY be indeterminate until a subsequent Send 1782 is Delivered to the ULP at the Data Sink. 1784 6. For an RDMA Read operation, the contents of the Tagged Buffer 1785 at the Data Sink MAY be indeterminate until the RDMA Read 1786 Response Message has been Delivered at the Local Peer. 1788 Statements 4, 5, and 6 imply "no peeking" at the data to see 1789 if it is done. It is possible for some data to arrive before 1790 logically earlier data does, and peeking may cause 1791 unpredictable application failure. 1793 7. If the ULP or Application modifies the contents of Tagged or 1794 Untagged Buffers being modified by an RDMA Operation while the 1795 RDMAP is processing the RDMA Operation, the state of the 1796 Buffers is indeterminate. 1798 8. If the ULP or Application modifies the contents of Tagged or 1799 Untagged Buffers read by an RDMA Operation while the RDMAP is 1800 processing the RDMA Operation, the results of the read are 1801 indeterminate. 1803 9. The Completion of an RDMA Write or Send Operation at the Local 1804 Peer does not guarantee that the ULP Message has yet reached 1805 the Remote Peer ULP Buffer or been examined by the Remote ULP. 1807 10. Send Messages MUST be Delivered to the ULP at the Remote Peer 1808 after they are Delivered to RDMAP by DDP and in the order that 1809 they were Delivered to RDMAP.
1811 Note that DDP ordering rules ensure that this will be the same 1812 order that they were submitted at the Local Peer and that any 1813 prior RDMA Writes have been submitted for ordered Placement at 1814 the Remote Peer. This means that when the ULP sees the Delivery 1815 of the Send, the memory buffers targeted by any preceding RDMA 1816 Writes and Sends are available to be accessed locally or 1817 remotely as authorized. If the ULP overlaps its buffers for 1818 different operations, the data from the RDMA Write or Send may 1819 be overwritten by subsequent RDMA Operations before the ULP 1820 receives and processes the Delivery. 1822 11. RDMA Read Response Messages MUST be Delivered to the ULP at the 1823 Remote Peer after they are Delivered to RDMAP by DDP and in the 1824 order that they were Delivered to RDMAP. 1826 DDP ordering rules ensure that this will be the same order that 1827 they were submitted at the Local Peer. This means that when the 1828 ULP sees the Delivery of the RDMA Read Response, the memory 1829 buffers targeted by the RDMA Read Response are available to be 1830 accessed locally or remotely as authorized. If the ULP overlaps 1831 its buffers for different operations, the data from the RDMA 1832 Read Response may be overwritten by subsequent RDMA Operations 1833 before the ULP receives and processes the Delivery. 1835 12. RDMA Read Request Messages, including zero-length RDMA Read 1836 Requests, MUST NOT start processing at the Remote Peer until 1837 they have been Delivered to RDMAP by DDP. 1839 Note: the ULP is assured that data written can be read back. 1840 For example, if an RDMA Read Request is issued by the local 1841 peer, targeting the same ULP Buffer as a preceding Send or RDMA 1842 Write (in the same direction as the RDMA Read Request), and 1843 there are no other sources of update for the ULP Buffer, then 1844 the remote peer will send back the data written by the Send or 1845 RDMA Write.
That is, for this example the ULP Buffer: is 1846 Advertised for use on a series of RDMA Messages, is only valid 1847 on the RDMAP Stream for which it is advertised, and is not 1848 locally updated while the series of RDMAP Messages are 1849 performed. For this example, order rule (12) assures that 1850 subsequent local or remote accesses to the ULP Buffer contain 1851 the data written by the Send or RDMA Write. 1853 RDMA Read Response Messages MAY be generated at the Remote Peer 1854 after subsequent RDMA Write Messages or Send Messages have been 1855 Placed or Delivered. Therefore, when an application does an 1856 RDMA Read Request followed by an RDMA Write (or Send) to the 1857 same buffer, it may get the data from the later RDMA Write (or 1858 Send) in the RDMA Read Response Message, even though the 1859 operations completed in order at the Local Peer. If this 1860 behavior is not desired, the Local Peer ULP must Fence the 1861 later RDMA Write (or Send) by withholding the RDMA Write 1862 Message until all outstanding RDMA Read Responses have been 1863 Delivered. 1865 13. The RDMAP Layer MUST submit RDMA Messages to the DDP layer in 1866 the order the RDMA Operations are submitted to the RDMAP Layer 1867 by the ULP. 1869 14. A Send or RDMA Write Message MUST NOT be considered Complete at 1870 the Local Peer (Data Source) until it has been successfully 1871 completed at the DDP layer. 1873 15. RDMA Operations MUST be Completed at the Local Peer in the 1874 order that they were submitted by the ULP. 1876 16. At the Data Sink, an incoming Send Message MUST be Delivered to 1877 the ULP only after the DDP Message has been Delivered to the 1878 RDMAP Layer by the DDP layer. 1880 17.
RDMA Read Response Message processing at the Remote Peer 1881 (reading the specified Tagged Buffer) MUST be started only 1882 after the RDMA Read Request Message has been Delivered by the 1883 DDP layer (thus all previous RDMA Messages have been properly 1884 submitted for ordered Placement). 1886 18. Send Messages MAY be Completed at the Remote Peer (Data Sink) 1887 before prior incoming RDMA Read Request Messages have completed 1888 their response processing. 1890 19. An RDMA Read operation MUST NOT be Completed at the Local Peer 1891 until the DDP layer Delivers the associated incoming RDMA Read 1892 Response Message. 1894 20. If more than one outstanding RDMA Read Request Message is 1895 supported by both peers, the RDMA Read Response Messages MUST 1896 be submitted to the DDP layer on the Remote Peer in the order 1897 the RDMA Read Request Messages were Delivered by DDP, but the 1898 actual read of the buffer contents MAY take place in any order 1899 at the Remote Peer. 1901 This simplifies Local Peer Completion processing for RDMA 1902 Reads in that a Delivered RDMA Read Response MUST be 1903 sufficient to Complete the RDMA Read Operation. 1905 6 RDMAP Stream Management 1907 RDMAP Stream management consists of RDMAP Stream Initialization and 1908 RDMAP Stream Termination. 1910 6.1 Stream Initialization 1912 RDMAP Stream initialization occurs after the LLP Stream has been 1913 created (e.g. for DDP/MPA over TCP the first TCP Segment after the 1914 SYN, SYN/ACK exchange). The ULP is responsible for transitioning 1915 the LLP Stream into RDMA enabled mode. The switch to RDMA mode 1916 typically occurs sometime after LLP Stream setup. Once in RDMA 1917 enabled mode, an implementation MUST send only RDMA Messages across 1918 the transport Stream until the RDMAP Stream is torn down. 1920 For each direction of an RDMAP Stream: 1922 * For a given RDMAP Stream, the number of outstanding RDMA Read 1923 Requests is limited per RDMAP Stream direction. 
1925 * It is the ULP's responsibility to set the maximum number of 1926 outstanding, inbound RDMA Read Requests per RDMAP Stream 1927 direction. 1929 * The RDMAP Layer MUST provide the maximum number of outstanding, 1930 inbound RDMA Read Requests per RDMAP Stream direction that were 1931 negotiated between the ULP and the Local Peer's RDMAP Layer. The 1932 negotiation mechanism is outside the scope of this 1933 specification. 1935 * It is the ULP's responsibility to set the maximum number of 1936 outstanding, outbound RDMA Read Requests per RDMAP Stream 1937 direction. 1939 * The RDMAP Layer MUST provide the maximum number of outstanding, 1940 outbound RDMA Read Requests for the RDMAP Stream direction that 1941 were negotiated between the ULP and the Local Peer's RDMAP 1942 Layer. The negotiation mechanism is outside the scope of this 1943 specification. 1945 * The Local Peer's ULP is responsible for negotiating with the 1946 Remote Peer's ULP the maximum number of outstanding RDMA Read 1947 Requests for the RDMAP Stream direction. It is recommended that 1948 the ULP set the maximum number of outstanding, inbound RDMA Read 1949 Requests equal to the maximum number of outstanding, outbound 1950 RDMA Read Requests for a given RDMAP Stream direction. 1952 * For outbound RDMA Read Requests, the RDMAP Layer MUST NOT exceed 1953 the maximum number of outstanding, outbound RDMA Read Requests 1954 that were negotiated between the ULP and the Local Peer's RDMAP 1955 Layer. 1957 * For inbound RDMA Read Requests, the RDMAP Layer MUST NOT exceed 1958 the maximum number of outstanding, inbound RDMA Read Requests 1959 that were negotiated between the ULP and the Local Peer's RDMAP 1960 Layer. 1962 6.2 Stream Teardown 1964 There are three methods for terminating an RDMAP Stream: ULP 1965 Graceful Termination, RDMAP Abortive Termination, and LLP Abortive 1966 Termination. 1968 The ULP is responsible for performing ULP Graceful Termination. 
1969 After a ULP Graceful Termination, either side of the Stream can 1970 initiate LLP Graceful Termination, using the graceful termination 1971 mechanism provided by the LLP. 1973 RDMAP Abortive Termination allows the RDMAP to issue a Terminate 1974 Message describing the reason the RDMAP Stream was terminated. The 1975 next section (6.2.1 RDMAP Abortive Termination) describes the RDMAP 1976 Abortive Termination in detail. 1978 LLP Abortive Termination results from an LLP error and causes the 1979 RDMAP Stream to be torn down midstream, without an RDMAP Terminate 1980 Message. While this last method is highly undesirable, it is 1981 possible and the ULP should take this into consideration. 1983 6.2.1 RDMAP Abortive Termination 1985 RDMAP defines a Terminate operation that SHOULD be invoked when 1986 either an RDMAP error is encountered or an LLP error is surfaced to 1987 the RDMAP layer by the LLP. 1989 It is not always possible to send the Terminate Message. For 1990 example, certain LLP errors may occur that cause the LLP Stream to 1991 be torn down a) before RDMAP is aware of the error, b) before RDMAP 1992 is able to send the Terminate Message, or c) after RDMAP has posted 1993 the Terminate Message to the LLP, but before the LLP has 1994 transmitted it. 1996 Note that an RDMAP Abortive Termination may entail loss of data. In 1997 general, when a Terminate Message is received it is impossible to 1998 tell for sure what unacknowledged RDMA Messages were Completed 1999 successfully at the Remote Peer. Thus the state of all outstanding 2000 RDMA Messages is indeterminate and the Messages SHOULD be 2001 considered Completed in error. 2003 When a peer sends or receives a Terminate Message, it MAY 2004 immediately tear down the LLP Stream. The peer SHOULD perform a 2005 graceful LLP teardown to ensure the Terminate Message is 2006 successfully Delivered. 2008 See section 4.8 Terminate Header for a description of the Terminate 2009 Message and its contents.
See section 5.4 Terminate Message for a 2010 description of the Terminate Message semantics. 2012 7 RDMAP Error Management 2014 The RDMAP protocol does not have RDMAP or DDP layer error recovery 2015 operations built in. If everything is working, the LLP guarantees 2016 will ensure that the Messages are arriving at the destination. 2018 If errors are detected at the RDMAP or DDP layer, then the RDMAP, 2019 DDP and LLP Streams are Abortively Terminated (see section 4.8 2020 Terminate Header on page 34). 2022 In general, poor implementations or improper ULP programming cause 2023 the errors detected at the RDMAP and DDP layers. In these cases, 2024 returning a diagnostic termination error Message and closing the 2025 RDMAP Stream is far simpler than attempting to maintain the RDMAP 2026 Stream, particularly when the cause of the error is not known. 2028 If an LLP does not support teardown of a Stream independent of 2029 other Streams and an RDMAP error results in the Termination of a 2030 specific Stream, then the LLP MUST label the Stream as an erroneous 2031 Stream and MUST NOT allow any further data transfer on that Stream 2032 after RDMAP requests the Stream to be torn down. 2034 For a specific LLP connection, when all Streams are either 2035 gracefully torn down or are labeled as erroneous Streams, the LLP 2036 connection MUST be torn down. 2038 Since errors are detected at the Remote Peer (possibly long) after 2039 RDMA Messages are passed to DDP and the LLP at the Local Peer and 2040 Completed, the sender cannot easily determine which of its Messages 2041 have been received. (RDMA Reads are an exception to this rule). 2043 For a list of errors returned to the Remote Peer as a result of an 2044 Abortive Termination, see section 4.8 Terminate Header on page 34. 2046 7.1 RDMAP Error Surfacing 2048 If an error occurs at the Local Peer, the RDMAP layer MUST attempt 2049 to inform the local ULP that the error has occurred.
2051 The Local Peer MUST send a Terminate Message for each of the 2052 following cases: 2054 1. For errors detected while creating RDMA Write, Send, Send with 2055 Invalidate, Send with Solicited Event, Send with Solicited 2056 Event and Invalidate, or RDMA Read Requests, or other reasons 2057 not directly associated with an incoming Message, the Terminate 2058 Message and Error code are sent instead of the request. In 2059 this case, the Error Type and Error Code fields are included in 2060 the Terminate Message, but the Terminated DDP Header and 2061 Terminated RDMA Header fields are set to zero. 2063 2. For errors detected on an incoming RDMA Write, Send, Send with 2064 Invalidate, Send with Solicited Event, Send with Solicited 2065 Event and Invalidate, or Read Response Message (after the 2066 Message has been Delivered by DDP), the Terminate Message is 2067 sent at the earliest possible opportunity, preferably in the 2068 next outgoing RDMA Message. In this case, the Error Type, Error 2069 Code, ULP PDU Length, and Terminated DDP Header fields are 2070 included in the Terminate Message, but the Terminated RDMA 2071 Header field is set to zero. 2073 3. For errors detected on an incoming RDMA Read Request Message 2074 (after the Message has been Delivered by DDP), the Terminate 2075 Message is sent at the earliest possible opportunity, 2076 preferably in the next outgoing RDMA Message. In this case, the 2077 Error Type, Error Code, ULP PDU Length, Terminated DDP Header, 2078 and Terminated RDMA Header fields are included in the Terminate 2079 Message. 2081 4. If more than one error is detected on incoming RDMA Messages, 2082 before the Terminate Message can be sent, then the first RDMA 2083 Message (and its associated DDP Segment) that experienced an 2084 error MUST be captured by the Terminate Message in accordance 2085 with rules 2 and 3 above.
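The first three cases above differ mainly in which Terminate Message fields carry information. The following Python sketch tabulates that difference. It is illustrative only: the names `ErrorCase` and `included_fields`, and the lower-case field labels, are assumptions made for this example. The first case does not mention the ULP PDU Length field, so the sketch treats it as not included there, which is also an assumption.

```python
from enum import Enum


class ErrorCase(Enum):
    """Hypothetical labels for the first three cases listed above."""
    LOCAL_OR_OUTBOUND = "error not associated with an incoming Message"
    INCOMING_OTHER = "incoming Write, Send type, or Read Response"
    INCOMING_READ_REQUEST = "incoming RDMA Read Request"


# Fields that carry information in the Terminate Message for each case.
# Header fields that are absent from a set are set to zero.
_INCLUDED = {
    ErrorCase.LOCAL_OR_OUTBOUND: frozenset(
        {"error_type", "error_code"}),
    ErrorCase.INCOMING_OTHER: frozenset(
        {"error_type", "error_code", "ulp_pdu_length",
         "terminated_ddp_header"}),
    ErrorCase.INCOMING_READ_REQUEST: frozenset(
        {"error_type", "error_code", "ulp_pdu_length",
         "terminated_ddp_header", "terminated_rdma_header"}),
}


def included_fields(case: ErrorCase) -> frozenset:
    """Return the Terminate Message fields populated for this case."""
    return _INCLUDED[case]


read_request_fields = included_fields(ErrorCase.INCOMING_READ_REQUEST)
local_fields = included_fields(ErrorCase.LOCAL_OR_OUTBOUND)
```

Only an error on an incoming RDMA Read Request populates the Terminated RDMA Header, which agrees with the "Terminate Includes RDMA Header" column of Figure 10.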
2087 7.2 Errors Detected at the Remote Peer on Incoming RDMA Messages 2089 On incoming RDMA Writes, RDMA Read Response, Sends, Send with 2090 Invalidate, Send with Solicited Event, Send with Solicited Event 2091 and Invalidate, and Terminate Messages, the following must be 2092 validated: 2094 1. The DDP Layer MUST validate all DDP Segment fields. 2096 2. The RDMA OpCode MUST be valid. 2098 3. The RDMA Version MUST be valid. 2100 Additionally, on incoming Send with Invalidate and Send with 2101 Solicited Event and Invalidate Messages, the following must 2102 also be validated: 2104 4. The Invalidate STag MUST be valid. 2106 5. The STag MUST be associated to this RDMAP Stream. 2108 On incoming RDMA Read Request Messages, the following must be validated: 2110 1. The DDP Layer MUST validate all Untagged DDP Segment fields. 2112 2. The RDMA OpCode MUST be valid. 2114 3. The RDMA Version MUST be valid. 2116 4. For non-zero length RDMA Read Request Messages: 2118 a. The Data Source STag MUST be valid. 2120 b. The Data Source STag MUST be associated to this RDMAP 2121 Stream. 2123 c. The Data Source Tagged Offset MUST fall in the range of 2124 legal offsets associated with the Data Source STag. 2126 d. The sum of the Data Source Tagged Offset and the RDMA Read 2127 Message Size MUST fall in the range of legal offsets 2128 associated with the Data Source STag. 2130 e. The sum of the Data Source Tagged Offset and the RDMA Read 2131 Message Size MUST NOT cause the Data Source Tagged Offset 2132 to wrap. 2134 8 Security 2136 Security Considerations 2138 This section references the resources that discuss protocol- 2139 specific security considerations and implications of using RDMAP 2140 with existing security services. A detailed analysis of the 2141 security issues around implementation and use of the RDMAP can be 2142 found in [RDMASEC].
2144 [RDMASEC] introduces the RDMA reference model and discusses how the 2145 resources of this model are vulnerable to attacks and the types of 2146 attack these vulnerabilities are subject to. It also details the 2147 levels of Trust available in this peer-to-peer model and how this 2148 defines the nature of resource sharing. 2150 8.1 Summary of RDMAP specific Security Requirements 2152 [RDMASEC] defines the security requirements for the implementation 2153 of the components of the RDMA reference model, namely the RDMA 2154 enabled NIC (RNIC) and the Privileged Resource Manager. An RDMAP 2155 implementation conforming to this specification MUST conform to 2156 these requirements. 2158 8.1.1 RDMAP (RNIC) Requirements 2160 RDMAP provides several countermeasures for all types of attacks as 2161 introduced in [RDMASEC]. In the following, this specification lists 2162 all security requirements which MUST be implemented by the RNIC. A 2163 more detailed discussion of RNIC security requirements can be found 2164 in Section 5 of [RDMASEC]. 2166 1. An RNIC MUST ensure that a specific Stream in a specific 2167 Protection Domain cannot access an STag in a different 2168 Protection Domain. 2170 2. An RNIC MUST ensure that if an STag is limited in scope to a 2171 single Stream, no other Stream can use the STag. 2173 3. An RNIC MUST ensure that a Remote Peer is not able to access 2174 memory outside of the buffer specified when the STag was 2175 enabled for remote access. 2177 4. An RNIC MUST provide a mechanism for the ULP to establish and 2178 revoke the association of a ULP Buffer to an STag and TO range. 2180 5. An RNIC MUST provide a mechanism for the ULP to establish and 2181 revoke read, write, or read and write access to the ULP Buffer 2182 referenced by an STag. 2184 6. An RNIC MUST ensure that the network interface can no longer 2185 modify an advertised buffer after the ULP revokes remote access 2186 rights for an STag. 2188 7. 
An RNIC MUST ensure that a Remote Peer is not able to 2189 invalidate an STag enabled for remote access, if the STag is 2190 shared on multiple streams. 2192 8. An RNIC MUST choose the value of STags in a way difficult to 2193 predict. It is RECOMMENDED to sparsely populate them over the 2194 full range available. 2196 9. An RNIC MUST NOT enable sharing a CQ across ULPs that do not 2197 share partial mutual trust. 2199 10. An RNIC MUST ensure that if a CQ overflows, any Streams which 2200 do not use the CQ MUST remain unaffected. 2202 11. An RNIC implementation SHOULD provide a mechanism to cap the 2203 number of outstanding RDMA Read Requests. 2205 12. An RNIC MUST NOT enable firmware to be loaded on the RNIC 2206 directly from an untrusted Local Peer or Remote Peer, unless 2207 the Peer is properly authenticated (by a mechanism outside the 2208 scope of this specification. The mechanism presumably entails 2209 authenticating that the remote ULP has the right to perform the 2210 update), and the update is done via a secure protocol, such as 2211 IPsec. 2213 8.1.2 Privileged Resource Manager Requirements 2215 With RDMAP, all reservations of local resources are initiated from 2216 local ULPs. To protect from local attacks including unfair 2217 resource distribution and gaining unauthorized access to RNIC 2218 resources, a Privileged Resource Manager (PRM) must be 2219 implemented, which manages all local resource allocation. Note 2220 that the PRM need not be provided as an independent component; its 2221 functionality can also be implemented as part of the privileged 2222 ULP or as part of the RNIC itself. 2224 A PRM implementation must meet the following security 2225 requirements (a more detailed discussion of PRM security 2226 requirements can be found in Section 5 of [RDMASEC]): 2228 1. All Non-Privileged ULP interactions with the RNIC Engine that 2229 could affect other ULPs MUST be done using the Resource Manager 2230 as a proxy. 2232 2.
All ULP resource allocation requests for scarce resources MUST 2233 also be done using a Privileged Resource Manager. 2235 3. The Privileged Resource Manager MUST NOT assume different ULPs 2236 share Partial Mutual Trust unless there is a mechanism to 2237 ensure that the ULPs do indeed share Partial Mutual Trust. 2239 4. If Non-Privileged ULPs are supported, the Privileged Resource 2240 Manager MUST verify that the Non-Privileged ULP has the right 2241 to access a specific Data Buffer before allowing an STag for 2242 which the ULP has access rights to be associated with a 2243 specific Data Buffer. 2245 5. The Privileged Resource Manager MUST control the allocation of 2246 CQ entries. 2248 6. The Privileged Resource Manager SHOULD prevent a Local Peer 2249 from allocating more than its fair share of resources. 2251 7. RDMA Read Request Queue resource consumption MUST be controlled 2252 by the Privileged Resource Manager such that RDMAP/DDP Streams 2253 which do not share Partial Mutual Trust do not share RDMA Read 2254 Request Queue resources. 2256 8. If an RNIC provides the ability to share receive buffers across 2257 multiple Streams, the combination of the RNIC and the 2258 Privileged Resource Manager MUST be able to detect if the 2259 Remote Peer is attempting to consume more than its fair share 2260 of resources so that the Local Peer can apply countermeasures 2261 to detect and prevent the attack. 2263 8.2 Security Services for RDMAP 2265 RDMAP uses IP-based network services to control, read, and 2266 write data buffers over the network. Therefore, all exchanged 2267 control and data packets are vulnerable to spoofing, tampering, and 2268 information disclosure attacks. 2270 RDMAP Streams that are subject to impersonation attacks, or Stream 2271 hijacking attacks, can be authenticated, have their integrity 2272 protected, and be protected from replay attacks. Furthermore, 2273 confidentiality protection can be used to protect from 2274 eavesdropping.
2276 8.2.1 Available Security Services 2278 The IPsec protocol suite [RFC2401] defines strong countermeasures 2279 to protect an IP stream from those attacks. Several levels of 2280 protection can guarantee session confidentiality, per-packet source 2281 authentication, per-packet integrity, and correct packet sequencing. 2283 RDMAP security may also benefit from SSL or TLS security services 2284 provided for TCP-based ULPs [RFC2246]. Used underneath RDMAP, these 2285 security services also provide stream authentication, data 2286 integrity, and confidentiality. As discussed in [RDMASEC], 2287 limitations on the maximum packet length to be carried over the 2288 network and potentially inefficient out-of-order packet processing 2289 at the data sink make SSL and TLS less appropriate for RDMAP than 2290 IPsec. 2292 If SSL is layered on top of RDMAP, SSL does not protect the RDMAP 2293 headers. Thus, a man-in-the-middle attack can still occur by 2294 modifying the RDMAP header to place the data into the 2295 wrong buffer, thus effectively corrupting the data stream. 2297 By remaining independent of ULP and LLP security protocols, RDMAP 2298 will benefit from continuing improvements at those layers. Users 2299 are provided flexibility to adapt to their specific security 2300 requirements and the ability to adapt to future security 2301 challenges. Given this, the vulnerabilities of RDMAP to active 2302 third-party interference are no greater than those of any other protocol 2303 running over an LLP such as TCP or SCTP. 2305 8.2.2 Requirements for IPsec Services for RDMAP 2307 Because IPsec is designed to secure arbitrary IP packet streams, 2308 including streams where packets are lost, RDMAP can run on top of 2309 IPsec without any change.
IPsec packets are processed (e.g., 2310 integrity checked and possibly decrypted) in the order they are 2311 received, and an RDMAP Data Sink will process the decrypted RDMA 2312 Messages contained in these packets in the same manner as RDMA 2313 Messages contained in unsecured IP packets. 2315 The IP Storage working group has defined the normative IPsec 2316 requirements for IP Storage [RFC3723]. Portions of this 2317 specification are applicable to the RDMAP. In particular, a 2318 compliant implementation of IPsec services for RDMAP MUST meet the 2319 requirements as outlined in Section 2.3 of [RFC3723]. Without 2320 replicating the detailed discussion in [RFC3723], this includes 2321 the following requirements: 2323 1. The implementation MUST support IPsec ESP [RFC2406], as well as 2324 the replay protection mechanisms of IPsec. When ESP is 2325 utilized, per-packet data origin authentication, integrity, and 2326 replay protection MUST be used. 2328 2. It MUST support ESP in tunnel mode and MAY implement ESP in 2329 transport mode. 2331 3. It MUST support IKE [RFC2409] for peer authentication, 2332 negotiation of security associations, and key management, using 2333 the IPsec DOI [RFC2407]. 2335 4. It MUST NOT interpret the receipt of an IKE Phase 2 delete 2336 message as a reason for tearing down the RDMAP stream. Since 2337 IPsec acceleration hardware may only be able to handle a 2338 limited number of active IKE Phase 2 SAs, idle SAs may be 2339 dynamically brought down and a new SA brought up again if 2340 activity resumes. 2342 5. It MUST support peer authentication using a pre-shared key, and 2343 MAY support certificate-based peer authentication using digital 2344 signatures. Peer authentication using the public key 2345 encryption methods [RFC2409] SHOULD NOT be used. 2347 6. It MUST support IKE Main Mode and SHOULD support Aggressive 2348 Mode.
IKE Main Mode with pre-shared key authentication SHOULD 2349 NOT be used when either of the peers uses a dynamically 2350 assigned IP address. 2352 7. When digital signatures are used to achieve authentication, 2353 either IKE Main Mode or IKE Aggressive Mode MAY be used. In 2354 these cases, an IKE negotiator SHOULD use IKE Certificate 2355 Request Payload(s) to specify the certificate authority (or 2356 authorities) that are trusted in accordance with its local 2357 policy. IKE negotiators SHOULD check the pertinent Certificate 2358 Revocation List (CRL) before accepting a PKI certificate for 2359 use in IKE's authentication procedures. 2361 8. Access to locally stored secret information (pre-shared or 2362 private key for digital signing) must be suitably restricted, 2363 since compromise of the secret information nullifies the 2364 security properties of the IKE/IPsec protocols. 2366 9. It MUST follow the guidelines of Section 2.3.4 of [RFC3723] on 2367 the setting of IKE parameters to achieve a high level of 2368 interoperability without requiring extensive configuration. 2370 Furthermore, implementation and deployment of the IPsec services 2371 for RDDP should follow the Security Considerations outlined in 2372 Section 5 of [RFC3723]. 2374 9 IANA 2376 IANA Considerations 2378 This document requests no direct action from IANA. The following 2379 consideration is listed here as commentary. 2381 If RDMAP was enabled a priori for a ULP by connecting to a well- 2382 known port, this well-known port would be registered for the RDMAP 2383 with IANA. The registration of the well-known port will be the 2384 responsibility of the ULP specification. 2386 10 References 2388 10.1 Normative References 2390 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 2391 Requirement Levels", BCP 14, RFC 2119, March 1997. 2393 [RFC2406] Kent, S. and R. Atkinson, "IP Encapsulating Security 2394 Payload (ESP)", RFC 2406, November 1998. 
2396 [RFC2407] Piper, D., "The Internet IP Security Domain of 2397 Interpretation of ISAKMP", RFC 2407, November 1998. 2399 [RFC2409] Harkins, D. and D. Carrel, "The Internet Key Exchange 2400 (IKE)", RFC 2409, November 1998. 2402 [RFC3723] Aboba B. et al., "Secure Block Storage Protocols over 2403 IP", RFC 3723, April 2004. 2405 [VERBS] J. Hilland, "RDMA Protocol Verbs Specification", draft- 2406 hilland-iwarp-verbs-v1.0 RDMA Consortium, April 2003. 2408 [DDP] H. Shah et al., "Direct Data Placement over Reliable 2409 Transports", draft-ietf-rddp-ddp-05.txt, February 2005. 2411 [MPA] P. Culley et al., "Marker PDU Aligned Framing for TCP 2412 Specification", draft-ietf-rddp-mpa-04.txt, January 2005. 2414 [SCTP] R. Stewart et al., "Stream Control Transmission Protocol", 2415 RFC 2960, October 2000. 2417 [TCP] Postel, J., "Transmission Control Protocol", STD 7, RFC 793, 2418 September 1981. 2420 [RDMASEC] J. Pinkerton et al., "DDP/RDMAP Security", draft-ietf- 2421 rddp-security-09.txt, March 2005. 2423 10.2 Informative References 2425 [RFC2401] Atkinson, R., Kent, S., "Security Architecture for the 2426 Internet Protocol", RFC 2401, November 1998. 2428 [RFC2246] Dierks, T. and C. Allen, "The TLS Protocol Version 1.0", 2429 RFC 2246, November 1998. 2431 11 Appendix 2433 11.1 DDP Segment Formats for RDMA Messages 2435 This appendix is for information only and is NOT part of the 2436 standard. It simply depicts the DDP Segment format for the various 2437 RDMA Messages. 
2439 11.1.1 DDP Segment for RDMA Write 2441 The following figure depicts an RDMA Write, DDP Segment: 2443 0 1 2 3 2444 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2445 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2446 | DDP Control | RDMA Control | 2447 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2448 | Data Sink STag | 2449 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2450 | Data Sink Tagged Offset | 2451 + + 2452 | | 2453 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2454 | RDMA Write ULP Payload | 2455 // // 2456 | | 2457 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2458 Figure 11 RDMA Write, DDP Segment format 2460 11.1.2 DDP Segment for RDMA Read Request 2462 The following figure depicts an RDMA Read Request, DDP Segment: 2464 0 1 2 3 2465 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2466 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2467 | DDP Control | RDMA Control | 2468 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2469 | Reserved (Not Used) | 2470 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2471 | DDP (RDMA Read Request) Queue Number | 2472 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2473 | DDP (RDMA Read Request) Message Sequence Number | 2474 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2475 | DDP (RDMA Read Request) Message Offset | 2476 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2477 | Data Sink STag (SinkSTag) | 2478 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2479 | | 2480 + Data Sink Tagged Offset (SinkTO) + 2481 | | 2482 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2483 | RDMA Read Message Size (RDMARDSZ) | 2484 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2485 | Data Source STag (SrcSTag) | 2486 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2487 | | 2488 + 
Data Source Tagged Offset (SrcTO) + 2489 | | 2490 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2491 Figure 12 RDMA Read Request, DDP Segment format 2493 11.1.3 DDP Segment for RDMA Read Response 2495 The following figure depicts an RDMA Read Response, DDP Segment: 2497 0 1 2 3 2498 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2499 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2500 | DDP Control | RDMA Control | 2501 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2502 | Data Sink STag | 2503 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2504 | Data Sink Tagged Offset | 2505 + + 2506 | | 2507 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2508 | RDMA Read Response ULP Payload | 2509 // // 2510 | | 2511 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2512 Figure 13 RDMA Read Response, DDP Segment format 2514 11.1.4 DDP Segment for Send and Send with Solicited Event 2516 The following figure depicts a Send and Send with Solicited 2517 Request, DDP Segment: 2519 0 1 2 3 2520 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2521 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2522 | DDP Control | RDMA Control | 2523 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2524 | Reserved (Not Used) | 2525 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2526 | (Send) Queue Number | 2527 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2528 | (Send) Message Sequence Number | 2529 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2530 | (Send) Message Offset | 2531 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2532 | Send ULP Payload | 2533 // // 2534 | | 2535 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2537 Figure 14 Send and Send with Solicited Event, DDP Segment format 2539 11.1.5 DDP Segment for Send with Invalidate and Send with SE and 2540 Invalidate 2542 The 
following figure depicts a Send with invalidate and Send with 2543 Solicited and Invalidate Request, DDP Segment: 2545 0 1 2 3 2546 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2547 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2548 | DDP Control | RDMA Control | 2549 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2550 | Invalidate STag | 2551 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2552 | (Send) Queue Number | 2553 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2554 | (Send) Message Sequence Number | 2555 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2556 | (Send) Message Offset | 2557 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2558 | Send ULP Payload | 2559 // // 2560 | | 2561 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2562 Figure 15 Send with Invalidate and Send with SE and Invalidate, DDP 2563 Segment 2565 11.1.6 DDP Segment for Terminate 2567 The following figure depicts a Terminate, DDP Segment: 2569 0 1 2 3 2570 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2571 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2572 | DDP Control | RDMA Control | 2573 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2574 | Reserved (Not Used) | 2575 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2576 | DDP (Terminate) Queue Number | 2577 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2578 | DDP (Terminate) Message Sequence Number | 2579 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2580 | DDP (Terminate) Message Offset | 2581 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2582 | Terminate Control | Reserved | 2583 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2584 | DDP Segment Length (if any) | | 2585 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + 2586 | | 2587 + + 2588 | Terminated DDP Header (if any) | 2589 + + 2590 | | 2591 
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2592 | | 2593 // // 2594 | Terminated RDMA Header (if any) | 2595 + + 2596 | | 2597 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2598 Figure 16 Terminate, DDP Segment format 2600 11.2 Ordering and Completion Table 2602 The following table summarizes the ordering relationships that are 2603 defined in section 5.5 Ordering and Completions from the standpoint 2604 of the local peer issuing the two Operations. Note, in the table 2605 that follows Send includes Send, Send with Invalidate, Send with 2606 Solicited Event, and Send with Solicited Event and Invalidate 2608 ------+-------+----------------+----------------+---------------- 2609 First | Later | Placement | Placement | Ordering 2610 Op | Op | guarantee at | guarantee | guarantee at 2611 | | Remote Peer | Local Peer | Remote Peer 2612 | | | | 2613 ------+-------+----------------+----------------+---------------- 2614 Send | Send | No placement | Not applicable | Completed in 2615 | | guarantee. If | | order. 2616 | | guarantee is | | 2617 | | necessary, see | | 2618 | | footnote 1. | | 2619 ------+-------+----------------+----------------+---------------- 2620 Send | RDMA | No placement | Not applicable | Not applicable 2621 | Write | guarantee. If | | 2622 | | guarantee is | | 2623 | | necessary, see | | 2624 | | footnote 1. 
| | 2625 ------+-------+----------------+----------------+---------------- 2626 Send | RDMA | No placement | RDMA Read | RDMA Read 2627 | Read | guarantee | Response | Response 2628 | | between Send | Payload will | Message will 2629 | | Payload and | not be placed | not be 2630 | | RDMA Read | at the local | generated until 2631 | | Request Header | peer until the | Send has been 2632 | | | Send Payload is| Completed 2633 | | | placed at the | 2634 | | | remote peer | 2635 ------+-------+----------------+----------------+---------------- 2636 RDMA | Send | No placement | Not applicable | Not applicable 2637 Write | | guarantee. If | | 2638 | | guarantee is | | 2639 | | necessary, see | | 2640 | | footnote 1. | | 2641 ------+-------+----------------+----------------+---------------- 2642 RDMA | RDMA | No placement | Not applicable | Not applicable 2643 Write | Write | guarantee. If | | 2644 | | guarantee is | | 2645 | | necessary, see | | 2646 | | footnote 1. | | 2647 ------+-------+----------------+----------------+---------------- 2648 RDMA | RDMA | No placement | RDMA Read | Not applicable 2649 Write | Read | guarantee | Response | 2650 | | between RDMA | Payload will | 2651 | | Write Payload | not be placed | 2652 | | and RDMA Read | at the local | 2653 | | Request Header | peer until the | 2654 | | | RDMA Write | 2655 | | | Payload is | 2656 | | | placed at the | 2657 | | | remote peer | 2658 ------+-------+----------------+----------------+---------------- 2659 RDMA | Send | No placement | Send Payload | Not applicable 2660 Read | | guarantee | may be placed | 2661 | | between RDMA | at the remote | 2662 | | Read Request | peer before the| 2663 | | Header and Send| RDMA Read | 2664 | | payload | Response is | 2665 | | | generated. | 2666 | | | If guarantee is| 2667 | | | necessary, see | 2668 | | | footnote 2. 
| 2669 ------+-------+----------------+----------------+---------------- 2670 RDMA | RDMA | No placement | RDMA Write | Not applicable 2671 Read | Write | guarantee | Payload may be | 2672 | | between RDMA | placed at the | 2673 | | Read Request | remote peer | 2674 | | Header and RDMA| before the RDMA| 2675 | | Write payload | Read Response | 2676 | | | is generated. | 2677 | | | If guarantee is| 2678 | | | necessary, see | 2679 | | | footnote 2. | 2680 ------+-------+----------------+----------------+---------------- 2681 RDMA | RDMA | No placement | No placement | Second RDMA 2682 Read | Read | guarantee of | guarantee of | Read Response 2683 | | the two RDMA | the two RDMA | will not be 2684 | | Read Request | Read Response | generated until 2685 | | Headers | Payloads. | first RDMA Read 2686 | | Additionally, | | Response is 2687 | | there is no | | generated. 2688 | | guarantee that | | 2689 | | the Tagged | | 2690 | | Buffers | | 2691 | | referenced in | | 2692 | | the RDMA Read | | 2693 | | will be read in| | 2694 | | order | | 2695 Figure 17 Operation Ordering 2697 Footnote 1: If the guarantee is necessary, a ULP may insert an 2698 RDMA Read Operation and wait for it to complete to act as a Fence. 2700 Footnote 2: If the guarantee is necessary, a ULP may wait for the 2701 RDMA Read Operation to complete before performing the Send. 2703 12 Author's Address 2705 Paul R. Culley 2706 Hewlett-Packard Company 2707 20555 SH 249 2708 Houston, Tx. USA 77070-2698 2709 Phone: 281-514-5543 2710 Email: paul.culley@hp.com 2712 Dave Garcia 2713 Hewlett-Packard Company 2714 19333 Vallco Parkway 2715 Cupertino, Ca. USA 95014 2716 Phone: 408.285.6116 2717 Email: dave.garcia@hp.com 2719 Jeff Hilland 2720 Hewlett-Packard Company 2721 20555 SH 249 2722 Houston, Tx. 
USA 77070-2698 2723 Phone: 281-514-9489 2724 Email: jeff.hilland@hp.com 2726 Bernard Metzler 2727 IBM Research GmbH 2728 Zurich Research Laboratory 2729 Saeumerstrasse 4 2730 CH-8803 Rueschlikon, Switzerland 2731 Phone: +41 44 724 8605 2732 Email: bmt@zurich.ibm.com 2734 Renato J. Recio 2735 IBM Corp. 2736 11501 Burnett Road 2737 Austin, Tx. USA 78758 2738 Phone: 512-838-3685 2739 Email: recio@us.ibm.com 2740 13 Contributors 2742 Dwight Barron 2743 Hewlett-Packard Company 2744 20555 SH 249 2745 Houston, Tx. USA 77070-2698 2746 Phone: 281-514-2769 2747 Email: dwight.barron@hp.com 2749 Caitlin Bestler 2750 Broadcom Corporation 2751 16215 Alton Parkway 2752 Irvine, CA. USA 92619-7013 2753 Phone: 949-926-6383 2754 Email: caitlinb@broadcom.com 2756 John Carrier 2757 Cray, Inc. 2758 411 First Avenue S, Suite 600 2759 Seattle, WA 98104-2860 USA 2760 Phone: 206-701-2090 2761 Email: carrier@cray.com 2763 Ted Compton 2764 EMC Corporation 2765 Research Triangle Park, NC 27709, USA 2766 Phone: 919-248-6075 2767 Email: compton_ted@emc.com 2769 Uri Elzur 2770 Broadcom Corporation 2771 16215 Alton Parkway 2772 Irvine, California 92619-7013 USA 2773 Phone: +1 (949) 585-6432 2774 Email: Uri@Broadcom.com 2776 Hari Ghadia 2777 Adaptec, Inc. 2778 691 S. Milpitas Blvd., 2779 Milpitas, CA 95035 USA 2780 Phone: +1 (408) 957-5608 2781 Email: hari_ghadia@adaptec.com 2783 Howard C. Herbert 2784 Intel Corporation 2785 MS CH7-404 2786 5000 West Chandler Blvd. 2787 Chandler, Arizona 85226 2788 Phone: 480-554-3116 2789 Email: howard.c.herbert@intel.com 2791 Mike Ko 2792 IBM 2793 650 Harry Rd. 
2794 San Jose, CA 95120 2795 Phone: (408) 927-2085 2796 Email: mako@us.ibm.com 2798 Mike Krause 2799 Hewlett-Packard Company 2800 43LN 2801 19410 Homestead Road 2802 Cupertino, CA 95014 USA 2803 Phone: 408-447-3191 2804 Email: krause@cup.hp.com 2806 Dave Minturn 2807 Intel Corporation 2808 MS JF1-210 2809 5200 North East Elam Young Parkway 2810 Hillsboro, Oregon 97124 2811 Phone: 503-712-4106 2812 Email: dave.b.minturn@intel.com 2814 Mike Penna 2815 Broadcom Corporation 2816 16215 Alton Parkway 2817 Irvine, California 92619-7013 USA 2818 Phone: +1 (949) 926-7149 2819 Email: MPenna@Broadcom.com 2821 Jim Pinkerton 2822 Microsoft, Inc. 2823 One Microsoft Way 2824 Redmond, WA, USA 98052 2825 Email: jpink@microsoft.com 2827 Hemal Shah 2828 Broadcom Corporation 2829 16215 Alton Parkway 2830 Irvine, CA. USA 92619-7013 2831 Phone: 949-926-6941 2832 Email: 2834 Allyn Romanow 2835 Cisco Systems 2836 170 W Tasman Drive 2837 San Jose, CA 95134 USA 2838 Phone: +1 408 525 8836 2839 Email: allyn@cisco.com 2841 Tom Talpey 2842 Network Appliance 2843 375 Totten Pond Road 2844 Waltham, MA 02451 USA 2845 Phone: +1 (781) 768-5329 2846 EMail: thomas.talpey@netapp.com 2848 Patricia Thaler 2849 Broadcom Corporation 2850 16215 Alton Parkway 2851 Irvine, CA. USA 92619-7013 2852 Phone: +1-916-570-2707 2853 email: pthaler@broadcom.com 2855 Jim Wendt 2856 Hewlett-Packard Company 2857 8000 Foothills Boulevard MS 5668 2858 Roseville, CA 95747-5668 USA 2859 Phone: +1 916 785 5198 2860 Email: jim_wendt@hp.com 2862 Madeline Vega 2863 IBM 2864 11400 Burnet Rd. Bld.45-2L-007 2865 Austin, TX. USA 78758 2866 Phone: 512-838-7739 2867 Email: mvega1@us.ibm.com 2869 Claudia Salzberg 2870 IBM 2871 11501 Burnet Rd. Bld.902-5B-014 2872 Austin, TX. 
USA 78758 2873 Phone: 512-838-5156 2874 Email: salzberg@us.ibm.com 2876 14 Intellectual Property Statement 2878 The IETF takes no position regarding the validity or scope of any 2879 Intellectual Property Rights or other rights that might be claimed 2880 to pertain to the implementation or use of the technology described 2881 in this document or the extent to which any license under such 2882 rights might or might not be available; nor does it represent that 2883 it has made any independent effort to identify any such rights. 2884 Information on the procedures with respect to rights in RFC 2885 documents can be found in BCP 78 and BCP 79. 2887 Copies of IPR disclosures made to the IETF Secretariat and any 2888 assurances of licenses to be made available, or the result of an 2889 attempt made to obtain a general license or permission for the use 2890 of such proprietary rights by implementers or users of this 2891 specification can be obtained from the IETF on-line IPR repository 2892 at http://www.ietf.org/ipr. 2894 The IETF invites any interested party to bring to its attention any 2895 copyrights, patents or patent applications, or other proprietary 2896 rights that may cover technology that may be required to implement 2897 this standard. Please address the information to the IETF at ietf- 2898 ipr@ietf.org. 2900 15 Full Copyright Statement 2902 Copyright (C) The Internet Society (2006). 2904 This document is subject to the rights, licenses and restrictions 2905 contained in BCP 78, and except as set forth therein, the authors 2906 retain all their rights. 
2908 This document and the information contained herein are provided on 2909 an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE 2910 REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND 2911 THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, 2912 EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT 2913 THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR 2914 ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A 2915 PARTICULAR PURPOSE.