idnits 2.17.1 draft-ietf-rddp-rdmap-07.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** It looks like you're using RFC 3978 boilerplate. You should update this to the boilerplate described in the IETF Trust License Policy document (see https://trustee.ietf.org/license-info), which is required now. -- Found old boilerplate from RFC 3978, Section 5.1 on line 22. -- Found old boilerplate from RFC 3978, Section 5.5 on line 2964. -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 2934. -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 2941. -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 2947. ** This document has an original RFC 3978 Section 5.4 Copyright Line, instead of the newer IETF Trust Copyright according to RFC 4748. ** This document has an original RFC 3978 Section 5.5 Disclaimer, instead of the newer disclaimer which includes the IETF Trust according to RFC 4748. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- == No 'Intended status' indicated for this document; assuming Proposed Standard Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack an Introduction section. (A line matching the expected section header was found, but with an unexpected indentation: ' scope of this specification. The mechanism presumably entails' ) ** The document seems to lack a Security Considerations section. (A line matching the expected section header was found, but with an unexpected indentation: ' 8 Security Considerations' ) ** There are 31 instances of too long lines in the document, the longest one being 1 character in excess of 72. 
** The abstract seems to contain references ([DDP]), which it shouldn't. Please replace those with straight textual mentions of the documents in question. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not match the current year == The document seems to lack the recommended RFC 2119 boilerplate, even if it appears to use RFC 2119 keywords -- however, there's a paragraph with a matching beginning. Boilerplate error? (The document does seem to have the reference to RFC 2119 which the ID-Checklist requires). -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (September 8, 2006) is 6438 days in the past. Is this intentional? 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Missing Reference: 'RFC 2401' is mentioned on line 2178, but not defined ** Obsolete undefined reference: RFC 2401 (Obsoleted by RFC 4301) == Missing Reference: 'RFC 3723' is mentioned on line 2179, but not defined == Unused Reference: 'RFC2119' is defined on line 2431, but no explicit reference was found in the text == Unused Reference: 'VERBS' is defined on line 2449, but no explicit reference was found in the text ** Obsolete normative reference: RFC 2406 (Obsoleted by RFC 4303, RFC 4305) ** Obsolete normative reference: RFC 2407 (Obsoleted by RFC 4306) ** Obsolete normative reference: RFC 2409 (Obsoleted by RFC 4306) -- No information found for draft-hilland-iwarp-verbs-v1 - is the name correct? -- Possible downref: Normative reference to a draft: ref. 'VERBS' == Outdated reference: A later version (-08) exists of draft-ietf-rddp-mpa-06 ** Obsolete normative reference: RFC 2960 (ref. 'SCTP') (Obsoleted by RFC 4960) ** Obsolete normative reference: RFC 793 (ref. 'TCP') (Obsoleted by RFC 9293) == Outdated reference: A later version (-10) exists of draft-ietf-rddp-security-09 -- Obsolete informational reference (is this intentional?): RFC 2401 (Obsoleted by RFC 4301) -- Obsolete informational reference (is this intentional?): RFC 4346 (Obsoleted by RFC 5246) Summary: 13 errors (**), 0 flaws (~~), 9 warnings (==), 11 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Remote Direct Data Placement Work Group R. Recio 3 INTERNET DRAFT IBM Corporation 4 draft-ietf-rddp-rdmap-07.txt P. Culley 5 Hewlett-Packard Company 6 D. Garcia 7 Hewlett-Packard Company 8 J. 
Hilland 9 Hewlett-Packard Company 10 B. Metzler 11 IBM Corporation 13 Expires: February, 2007 September 8, 2006 15 A Remote Direct Memory Access Protocol Specification 17 Status of this Memo 19 By submitting this Internet-Draft, each author represents that any 20 applicable patent or other IPR claims of which he or she is aware 21 have been or will be disclosed, and any of which he or she becomes 22 aware will be disclosed, in accordance with Section 6 of BCP 79. 24 Internet-Drafts are working documents of the Internet Engineering 25 Task Force (IETF), its areas, and its working groups. Note that 26 other groups may also distribute working documents as Internet- 27 Drafts. 29 Internet-Drafts are draft documents valid for a maximum of six 30 months and may be updated, replaced, or obsoleted by other 31 documents at any time. It is inappropriate to use Internet-Drafts 32 as reference material or to cite them other than as "work in 33 progress." 35 The list of current Internet-Drafts can be accessed at 36 http://www.ietf.org/1id-abstracts.html The list of Internet-Draft 37 Shadow Directories can be accessed at 38 http://www.ietf.org/shadow.html. 40 Abstract 42 This document defines a Remote Direct Memory Access Protocol 43 (RDMAP) that operates over the Direct Data Placement Protocol (DDP 44 protocol). RDMAP provides read and write services directly to 45 applications and enables data to be transferred directly into 46 Upper Layer Protocol (ULP) buffers without intermediate data 47 copies. It also enables a kernel bypass implementation. 
49  Table of Contents

51  1 Introduction
52    1.1 Architectural Goals
53    1.2 Protocol Overview
54    1.3 RDMAP Layering
55    1.4 Specification Changes from the Last Version
56  2 Glossary
57    2.1 General
58    2.2 LLP
59    2.3 Direct Data Placement (DDP)
60    2.4 Remote Direct Memory Access (RDMA)
61  3 ULP and Transport Attributes
62    3.1 Transport Requirements & Assumptions
63    3.2 RDMAP Interactions with the ULP
64  4 Header Format
65    4.1 RDMAP Control and Invalidate STag Field
66    4.2 RDMA Message Definitions
67    4.3 RDMA Write Header
68    4.4 RDMA Read Request Header
69    4.5 RDMA Read Response Header
70    4.6 Send Header and Send with Solicited Event Header
71    4.7 Send with Invalidate Header and Send with SE and Invalidate
72        Header
73    4.8 Terminate Header
74  5 Data Transfer
75    5.1 RDMA Write Message
76    5.2 RDMA Read Operation
77      5.2.1 RDMA Read Request Message
78      5.2.2 RDMA Read Response Message
79    5.3 Send Message Type
80    5.4 Terminate Message
81    5.5 Ordering and Completions
82  6 RDMAP Stream Management
83    6.1 Stream Initialization
84    6.2 Stream Teardown
85      6.2.1 RDMAP Abortive Termination
86  7 RDMAP Error Management
87    7.1 RDMAP Error Surfacing
88    7.2 Errors Detected at the Remote Peer on Incoming RDMA Messages
89  8 Security Considerations
90    8.1 Summary of RDMAP specific Security Requirements
91      8.1.1 RDMAP (RNIC) Requirements
92      8.1.2 Privileged Resource Manager Requirements
93    8.2 Security Services for RDMAP
94      8.2.1 Available Security Services
95      8.2.2 Requirements for IPsec Services for RDMAP
96  9 IANA
97  10 References
98    10.1 Normative References
99    10.2 Informative References
100  11 Appendix
101    11.1 DDP Segment Formats for RDMA Messages
102      11.1.1 DDP Segment for RDMA Write
103      11.1.2 DDP Segment for RDMA Read Request
104      11.1.3 DDP Segment for RDMA Read Response
105      11.1.4 DDP Segment for Send and Send with Solicited Event
106      11.1.5 DDP Segment for Send with Invalidate and Send with SE and
107             Invalidate
108      11.1.6 DDP Segment for Terminate
109    11.2 Ordering and Completion Table
110  12 Author's Address
111  13 Contributors
112  14 Intellectual Property Statement
113  15 Full Copyright Statement

115  Table of Figures

117  Figure 1 RDMAP Layering
118  Figure 2 Example of MPA, DDP, and RDMAP Header Alignment over TCP
119  Figure 3 DDP Control, RDMAP Control, and Invalidate STag Fields
120  Figure 4 RDMA Usage of DDP Fields
121  Figure 5 RDMA Message Definitions
122  Figure 6 RDMA Read Request Header Format
123  Figure 7 Terminate Header Format
124  Figure 8 Terminate Control Field
125  Figure 9 Terminate Control Field Values
126  Figure 10 Error Type to RDMA Message Mapping
127  Figure 11 RDMA Write, DDP Segment format
128  Figure 12 RDMA Read Request, DDP Segment format
129  Figure 13 RDMA Read Response, DDP Segment format
130  Figure 14 Send and Send with Solicited Event, DDP Segment format
131  Figure 15 Send with Invalidate and Send with SE and Invalidate,
132            DDP Segment
133  Figure 16 Terminate, DDP Segment format
134  Figure 17 Operation Ordering

136  1 Introduction

138  Today, communications over TCP/IP typically require copy
139  operations, which add latency and consume significant CPU and
140  memory resources.
The Remote Direct Memory Access Protocol
141  (RDMAP) enables removal of data copy operations and enables
142  reduction in latencies by allowing a local application to read or
143  write data on a remote computer's memory with minimal demands on
144  memory bus bandwidth and CPU processing overhead, while preserving
145  memory protection semantics.

147  RDMAP is layered on top of Direct Data Placement (DDP) and uses
148  the two Buffer Models available from DDP. DDP-related terminology
149  is discussed in Section 2.3. As RDMAP builds on DDP, the reader is
150  advised to become familiar with [DDP].

152  The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
153  NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL"
154  in this document are to be interpreted as described in RFC 2119.

156  1.1 Architectural Goals

158  RDMAP has been designed with the following high-level
159  architectural goals:

161  * Provide a data transfer operation that allows a Local Peer to
162    transfer up to 2^32 - 1 octets directly into a previously
163    advertised buffer (i.e., Tagged buffer) located at a Remote
164    Peer without requiring a copy operation. This is referred to as
165    the RDMA Write data transfer operation.

167  * Provide a data transfer operation that allows a Local Peer to
168    retrieve up to 2^32 - 1 octets directly from a previously
169    advertised buffer (i.e., Tagged buffer) located at a Remote
170    Peer without requiring a copy operation. This is referred to as
171    the RDMA Read data transfer operation.

173  * Provide a data transfer operation that allows a Local Peer to
174    send up to 2^32 - 1 octets directly into a buffer located at a
175    Remote Peer that has not been explicitly advertised. This is
176    referred to as the Send (Send with Invalidate, Send with
177    Solicited Event, and Send with Solicited Event and Invalidate)
178    data transfer operation.
180 * Enable the local ULP to use the Send Operation Type (includes 181 Send, Send with Invalidate, Send with Solicited Event, and Send 182 with Solicited Event and Invalidate) to signal to the remote 183 ULP the Completion of all previous Messages initiated by the 184 local ULP. 186 * Provide for all Operations on a single RDMAP Stream to be 187 reliably transmitted in the order that they were submitted. 189 * Provide RDMAP capabilities independently for each Stream when 190 the LLP supports multiple data Streams within an LLP 191 connection. 193 1.2 Protocol Overview 195 RDMAP provides seven data transfer operations. Except for the RDMA 196 Read operation, each operation generates exactly one RDMA Message. 197 Following is a brief overview of the RDMA Operations and RDMA 198 Messages: 200 1. Send - A Send operation uses a Send Message to transfer data 201 from the Data Source into a buffer that has not been 202 explicitly Advertised by the Data Sink. The Send Message uses 203 the DDP Untagged Buffer Model to transfer the ULP Message into 204 the Data Sink's Untagged Buffer. 206 2. Send with Invalidate - A Send with Invalidate operation uses a 207 Send with Invalidate Message to transfer data from the Data 208 Source into a buffer that has not been explicitly Advertised 209 by the Data Sink. The Send with Invalidate Message includes 210 all functionality of the Send Message, with one addition: an 211 STag field is included in the Send with Invalidate Message and 212 after the message has been Placed and Delivered at the Data 213 Sink the remote peer's buffer identified by the STag can no 214 longer be accessed remotely until the remote peer's ULP re- 215 enables access and Advertises the buffer. 217 3. Send with Solicited Event (Send with SE) - A Send with 218 Solicited Event operation uses a Send with Solicited Event 219 Message to transfer data from the Data Source into an Untagged 220 Buffer at the Data Sink. 
The Send with Solicited Event Message 221 is similar to the Send Message, with one addition: when the 222 Send with Solicited Event Message has been Placed and 223 Delivered, an Event may be generated at the recipient, if the 224 recipient is configured to generate such an Event. 226 4. Send with Solicited Event and Invalidate (Send with SE and 227 Invalidate) - A Send with Solicited Event and Invalidate 228 operation uses a Send with Solicited Event and Invalidate 229 Message to transfer data from the Data Source into a buffer 230 that has not been explicitly Advertised by the Data Sink. The 231 Send with Solicited Event and Invalidate Message is similar to 232 the Send with Invalidate Message, with one addition: when the 233 Send with Solicited Event and Invalidate Message has been 234 Placed and Delivered, an Event may be generated at the 235 recipient, if the recipient is configured to generate such an 236 Event. 238 5. Remote Direct Memory Access Write - An RDMA Write operation 239 uses an RDMA Write Message to transfer data from the Data 240 Source to a previously advertised buffer at the Data Sink. 242 The ULP at the Remote Peer, which in this case is the Data 243 Sink, enables the Data Sink Tagged Buffer for access and 244 Advertises the buffer's size (length), location (Tagged 245 Offset), and Steering Tag (STag) to the Data Source through a 246 ULP specific mechanism. The ULP at the Local Peer, which in 247 this case is the Data Source, initiates the RDMA Write 248 operation. The RDMA Write Message uses the DDP Tagged Buffer 249 Model to transfer the ULP Message into the Data Sink's Tagged 250 Buffer. Note: the STag associated with the Tagged Buffer 251 remains valid until the ULP at the Remote Peer invalidates it 252 or the ULP at the Local Peer invalidates it through a Send 253 with Invalidate or Send with Solicited Event and Invalidate. 255 6. 
Remote Direct Memory Access Read - The RDMA Read operation 256 transfers data to a Tagged Buffer at the Local Peer, which in 257 this case is the Data Sink, from a Tagged Buffer at the Remote 258 Peer, which in this case is the Data Source. The ULP at the 259 Data Source enables the Data Source Tagged Buffer for access 260 and Advertises the buffer's size (length), location (Tagged 261 Offset), and Steering Tag (STag) to the Data Sink through a 262 ULP specific mechanism. The ULP at the Data Sink enables the 263 Data Sink Tagged Buffer for access and initiates the RDMA Read 264 operation. The RDMA Read operation consists of a single RDMA 265 Read Request Message and a single RDMA Read Response Message, 266 and the latter may be segmented into multiple DDP Segments. 268 The RDMA Read Request Message uses the DDP Untagged Buffer 269 Model to Deliver the STag, starting Tagged Offset and length 270 for both the Data Source and Data Sink Tagged Buffers to the 271 remote peer's RDMA Read Request Queue. 273 The RDMA Read Response Message uses the DDP Tagged Buffer 274 Model to Deliver the Data Source's Tagged Buffer to the Data 275 Sink, without any involvement from the ULP at the Data Source. 277 Note: the Data Source STag associated with the Tagged Buffer 278 remains valid until the ULP at the Data Source invalidates it 279 or the ULP at the Data Sink invalidates it through a Send with 280 Invalidate or Send with Solicited Event and Invalidate. The 281 Data Sink STag associated with the Tagged Buffer remains valid 282 until the ULP at the Data Sink invalidates it. 284 7. Terminate - A Terminate operation uses a Terminate Message to 285 transfer to the Remote Peer information associated with an 286 error that occurred at the Local Peer. The Terminate Message 287 uses the DDP Untagged Buffer Model to transfer the Message 288 into the Data Sink's Untagged Buffer. 
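The seven operations listed above can be summarized as a small lookup table. The following Python sketch is illustrative only and is not part of the specification: the names BufferModel, RDMA_OPERATIONS, messages_for, and multi_message_operations are hypothetical. The buffer model shown for Send with Invalidate and Send with Solicited Event and Invalidate is inferred from the statements that those Messages extend the Send Message and target a buffer that has not been explicitly Advertised.

```python
from enum import Enum


class BufferModel(Enum):
    """The two DDP Buffer Models that RDMAP builds on."""
    TAGGED = "Tagged"
    UNTAGGED = "Untagged"


# RDMA Operation -> ordered list of (RDMA Message, DDP Buffer Model).
# Hypothetical table; it only restates the overview text of Section 1.2.
RDMA_OPERATIONS = {
    "Send": [("Send", BufferModel.UNTAGGED)],
    "Send with Invalidate": [
        ("Send with Invalidate", BufferModel.UNTAGGED)],
    "Send with Solicited Event": [
        ("Send with Solicited Event", BufferModel.UNTAGGED)],
    "Send with Solicited Event and Invalidate": [
        ("Send with Solicited Event and Invalidate", BufferModel.UNTAGGED)],
    "RDMA Write": [("RDMA Write", BufferModel.TAGGED)],
    # The only operation that generates more than one RDMA Message.
    "RDMA Read": [
        ("RDMA Read Request", BufferModel.UNTAGGED),
        ("RDMA Read Response", BufferModel.TAGGED)],
    "Terminate": [("Terminate", BufferModel.UNTAGGED)],
}


def messages_for(operation):
    """Return the RDMA Message names generated by one RDMA Operation."""
    return [name for name, _model in RDMA_OPERATIONS[operation]]


def multi_message_operations():
    """Return the operations that generate more than one RDMA Message."""
    return [op for op, msgs in RDMA_OPERATIONS.items() if len(msgs) > 1]


if __name__ == "__main__":
    for op, msgs in RDMA_OPERATIONS.items():
        summary = ", ".join(f"{name} ({model.value})" for name, model in msgs)
        print(f"{op}: {summary}")
```

Keeping the Messages of an operation in an ordered list makes the one exception visible: RDMA Read is the only entry with two Messages, a Request carried in the Untagged Buffer Model followed by a Response carried in the Tagged Buffer Model.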
290  1.3 RDMAP Layering

292  RDMAP is dependent on DDP, subject to the requirements defined in
293  section 3.1 Transport Requirements & Assumptions. Figure 1 RDMAP
294  Layering depicts the relationship between Upper Layer Protocols
295  (ULPs), RDMAP, DDP protocol, the framing layer, and the transport.
296  For LLP protocol definitions of each LLP, see [MPA], [TCP], and
297  [SCTP].

299     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
300     |                                   |
301     |    Upper Layer Protocol (ULP)     |
302     |                                   |
303     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
304     |                                   |
305     |               RDMAP               |
306     |                                   |
307     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
308     |                                   |
309     |           DDP protocol            |
310     |                                   |
311     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
312     |                 |                 |
313     |       MPA       |                 |
314     |                 |                 |
315     +-+-+-+-+-+-+-+-+-+      SCTP       |
316     |                 |                 |
317     |       TCP       |                 |
318     |                 |                 |
319     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
320     Figure 1 RDMAP Layering

322  If RDMAP is layered over DDP/MPA/TCP, then the respective headers
323  and ULP Payload are arranged as follows (Note: For clarity, MPA
324  header and CRC fields are included but MPA markers are not shown):

326      0                   1                   2                   3
327      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
328     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
329     |                                                               |
330     //                         TCP Header                          //
331     |                                                               |
332     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
333     |          MPA Header           |                               |
334     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               +
335     |                                                               |
336     //                         DDP Header                          //
337     |                                                               |
338     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
339     |                                                               |
340     //                         RDMA Header                         //
341     |                                                               |
342     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
343     |                                                               |
344     //                         ULP Payload                         //
345     //                  (shown with no pad bytes)                  //
346     //                                                             //
347     |                                                               |
348     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
349     |                            MPA CRC                            |
350     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
351     Figure 2 Example of MPA, DDP, and RDMAP Header Alignment over TCP

353  1.4 Specification Changes from the Last Version

355  This section is to be removed before RFC
publication.

357  The following major changes (vs typos) were made to the -06 and
358  -07 versions:

360  * Incorporated comments from Transport Area Directors and the
361    Remote Direct Data Placement Working Group chair.

363  The following major changes (vs typos) were made to the -05
364  version:

366  * To pass the IETF checklist tool, modified heading of Security
367    Section 8 to "Security" and added "Security Considerations"
368    below it.

370  * Added IANA Section 9 and, to pass the IETF checklist tool, added
371    "IANA Considerations" line below Section 9 header.

373  * Added Intellectual Property Statement Section 14 and IPR
374    Disclosure Acknowledgement Section 15.

376  * Added Disclaimer Section 16.

378  * Section 6.8 - Acknowledged that the Reserved field size for the
379    Terminate Message is 13 bits. The fix was made to the -04
380    version, but was not listed in this section.

382  * Rewrite of the "Security" section to refer to Security document
383    rather than summarize.

385  * Update to the "Contributors" section.

387  * Changed boilerplate reference from 3667 to 3979.

389  * Removed references to company names in the disclaimer section.

391  * Added "Key Words" Disclaimer to the Introduction.

393  The following major changes (vs typos) were made to the -04
394  version:

396  * Section 10 - Expanded IPsec requirements sentence in section
397    10.3.2 to say what is required in addition to cross-referencing
398    RFC 3723.

400  * Section 6.8 - Fixed text after Figure 9 to reflect the correct
401    size (13 bits) of the Reserved field in the Terminate Message.

403  The following major changes (vs typos) were made to the -03
404  version:

406  * Section 6.1 - Added normative text describing downward
407    compatibility with version 0.

409  * Section 6.8 - Changed the description of the reserved field
410    size to match the size in the figure, which is 13 bits.

412  * Section 10 - Aligned security section closely to [RDMASEC] and
413    added normative text for security requirements.
415 The following major changes (vs typos) were made to the -02 416 version: 418 * Section 6.8 - Explicitly defined the bit numbers for the three 419 header control bits. 421 * Section 8.1 - Stated the typical Stream initialization to be: 422 RDMA mode is entered some time after the LLP Stream is 423 initialized. 425 * Section 10 - Update reference to security document. 427 * Section 10 - Fixed Send with Solicited Event and Invalidate 428 reference. 430 * Section 12.1 - MPA and DDP references were changed to reflect 431 the released specifications and accurate titles. 433 * Section 12.1 - Reference for RDMA Protocol Verbs was changed to 434 reflect the released specification and accurate title. 436 2 Glossary 438 2.1 General 440 Advertisement (Advertised, Advertise, Advertisements, Advertises) 441 - the act of informing a Remote Peer that a local RDMA Buffer 442 is available to it. A Node makes available an RDMA Buffer for 443 incoming RDMA Read or RDMA Write access by informing its 444 RDMA/DDP peer of the Tagged Buffer identifiers (STag, base 445 address, and buffer length). This Advertisement of Tagged 446 Buffer information is not defined by RDMA/DDP and is left to 447 the ULP. A typical method would be for the Local Peer to embed 448 the Tagged Buffer's Steering Tag, base address, and length in 449 a Send Message destined for the Remote Peer. 451 Completion - Refer to "RDMA Completion" in Section 2.4. 453 Completed - See "RDMA Completion" in Section 2.4. 455 Complete - See "RDMA Completion" in Section 2.4. 457 Completes - See "RDMA Completion" in Section 2.4. 459 Data Sink - The peer receiving a data payload. Note that the Data 460 Sink can be required to both send and receive RDMA/DDP 461 Messages to transfer a data payload. 463 Data Source - The peer sending a data payload. Note that the Data 464 Source can be required to both send and receive RDMA/DDP 465 Messages to transfer a data payload. 
467 Data Delivery (Delivery, Delivered, Delivers) - Delivery is 468 defined as the process of informing the ULP or consumer that a 469 particular Message is available for use. This is specifically 470 different from "Placement", which may generally occur in any 471 order, while the order of "Delivery" is strictly defined. See 472 "Data Placement" in Section 2.3. 474 Delivery - See Data Delivery in Section 2.1. 476 Delivered - See Data Delivery in Section 2.1. 478 Delivers - See Data Delivery in Section 2.1. 480 Fabric - The collection of links, switches, and routers that 481 connect a set of Nodes with RDMA/DDP protocol implementations. 483 Fence (Fenced, Fences) - To block the current RDMA Operation from 484 executing until prior RDMA Operations have Completed. 486 iWARP - A suite of wire protocols comprised of RDMAP, DDP, and 487 MPA. The iWARP protocol suite may be layered above TCP, SCTP, 488 or other transport protocols. 490 Local Peer - The RDMA/DDP protocol implementation on the local end 491 of the connection. Used to refer to the local entity when 492 describing a protocol exchange or other interaction between 493 two Nodes. 495 Node - A computing device attached to one or more links of a 496 Fabric (network). A Node in this context does not refer to a 497 specific application or protocol instantiation running on the 498 computer. A Node may consist of one or more RNICs installed in 499 a host computer. 501 Placement - See "Data Placement" in Section 2.3 503 Placed - See "Data Placement" in Section 2.3 505 Places - See "Data Placement" in Section 2.3 507 Remote Peer - The RDMA/DDP protocol implementation on the opposite 508 end of the connection. Used to refer to the remote entity when 509 describing protocol exchanges or other interactions between 510 two Nodes. 512 RNIC - RDMA Network Interface Controller. In this context, this 513 would be a network I/O adapter or embedded controller with 514 iWARP and Verbs functionality. 
516  RNIC Interface (RI) - The presentation of the RNIC to the Verbs
517     Consumer as implemented through the combination of the RNIC
518     and the RNIC driver.

520  Termination - See "RDMAP Abortive Termination" in Section 2.4.

522  Terminated - See "RDMAP Abortive Termination" in Section 2.4.

524  Terminate - See "RDMAP Abortive Termination" in Section 2.4.

526  Terminates - See "RDMAP Abortive Termination" in Section 2.4.

528  ULP - Upper Layer Protocol. The protocol layer above the protocol
529     layer currently being referenced. The ULP for RDMA/DDP is
530     expected to be an OS, Application, adaptation layer, or
531     proprietary device. The RDMA/DDP documents do not specify a
532     ULP - they provide a set of semantics that allow a ULP to be
533     designed to utilize RDMA/DDP.

535  ULP Payload - The ULP data that is contained within a single
536     protocol segment or packet (e.g., a DDP Segment).

538  Verbs - An abstract description of the functionality of an RNIC
539     Interface. The OS may expose some or all of this functionality
540     via one or more APIs to applications. The OS will also use
541     some of the functionality to manage the RNIC Interface.

543  2.2 LLP

545  LLP - Lower Layer Protocol. The protocol layer beneath the
546     protocol layer currently being referenced. For example, for
547     DDP the LLP is SCTP, MPA, or other transport protocols. For
548     RDMA, the LLP is DDP.

550  LLP Connection - Corresponds to an LLP transport-level connection
551     between the peer LLP layers on two Nodes.

553  LLP Stream - Corresponds to a single LLP transport-level Stream
554     between the peer LLP layers on two Nodes. One or more LLP
555     Streams may map to a single transport-level LLP connection.
556     For transport protocols that support multiple Streams per
557     connection (e.g., SCTP), an LLP Stream corresponds to one
558     transport-level Stream.

560  MULPDU - Maximum ULPDU. The current maximum size of the record
561     that is acceptable for DDP to pass to the LLP for
562     transmission.
564 ULPDU - Upper Layer Protocol Data Unit. The data record defined 565 by the layer above MPA. 567 2.3 Direct Data Placement (DDP) 569 Data Placement (Placement, Placed, Places) - For DDP, this term is 570 specifically used to indicate the process of writing to a data 571 buffer by a DDP implementation. DDP Segments carry Placement 572 information, which may be used by the receiving DDP 573 implementation to perform Data Placement of the DDP Segment 574 ULP Payload. See "Data Delivery". 576 DDP Abortive Teardown - The act of closing a DDP Stream without 577 attempting to Complete in-progress and pending DDP Messages. 579 DDP Graceful Teardown - The act of closing a DDP Stream such that 580 all in-progress and pending DDP Messages are allowed to 581 Complete successfully. 583 DDP Control Field - a fixed 16-bit field in the DDP Header. The 584 DDP Control Field contains an 8-bit field whose contents are 585 reserved for use by the ULP. 587 DDP Header - The header present in all DDP segments. The DDP 588 Header contains control and Placement fields that are used to 589 define the final Placement location for the ULP payload 590 carried in a DDP Segment. 592 DDP Message - A ULP defined unit of data interchange, which is 593 subdivided into one or more DDP segments. This segmentation 594 may occur for a variety of reasons, including segmentation to 595 respect the maximum segment size of the underlying transport 596 protocol. 598 DDP Segment - The smallest unit of data transfer for the DDP 599 protocol. It includes a DDP Header and ULP Payload (if 600 present). A DDP Segment should be sized to fit within the 601 underlying transport protocol MULPDU. 603 DDP Stream - a sequence of DDP Messages whose ordering is defined 604 by the LLP. For SCTP, a DDP Stream maps directly to an SCTP 605 Stream. For MPA, a DDP Stream maps directly to a TCP 606 connection and a single DDP Stream is supported. Note that 607 DDP has no ordering guarantees between DDP Streams. 
609 Direct Data Placement - A mechanism whereby ULP data contained 610 within DDP Segments may be Placed directly into its final 611 destination in memory without processing of the ULP. This may 612 occur even when the DDP Segments arrive out of order. Out of 613 order Placement support may require the Data Sink to implement 614 the LLP and DDP as one functional block. 616 Direct Data Placement Protocol (DDP) - Also, a wire protocol that 617 supports Direct Data Placement by associating explicit memory 618 buffer placement information with the LLP payload units. 620 Message Offset (MO) - For the DDP Untagged Buffer Model, specifies 621 the offset, in bytes, from the start of a DDP Message. 623 Message Sequence Number (MSN) - For the DDP Untagged Buffer Model, 624 specifies a sequence number that is increasing with each DDP 625 Message. 627 Queue Number (QN) - For the DDP Untagged Buffer Model, identifies 628 a destination Data Sink queue for a DDP Segment. 630 Steering Tag - An identifier of a Tagged Buffer on a Node, valid 631 as defined within a protocol specification. 633 STag - Steering Tag 635 Tagged Buffer - A buffer that is explicitly Advertised to the 636 Remote Peer through exchange of an STag, Tagged Offset, and 637 length. 639 Tagged Buffer Model - A DDP data transfer model used to transfer 640 Tagged Buffers from the Local Peer to the Remote Peer. 642 Tagged DDP Message - A DDP Message that targets a Tagged Buffer. 644 Tagged Offset (TO) - The offset within a Tagged Buffer on a Node. 646 Untagged Buffer - A buffer that is not explicitly Advertised to 647 the Remote Peer. Untagged buffers support one of the two 648 available data transfer mechanisms called the Untagged Buffer 649 Model. An untagged buffer is used to send asynchronous control 650 messages to the Remote Peer for RDMA Read, Send, and Terminate 651 requests. Untagged Buffers handle Untagged DDP Messages. 
   Untagged Buffer Model - A DDP data transfer model used to transfer
      Untagged Buffers from the Local Peer to the Remote Peer.

   Untagged DDP Message - A DDP Message that targets an Untagged
      Buffer.

2.4 Remote Direct Memory Access (RDMA)

   Event - An indication provided by the RDMAP Layer to the ULP to
      indicate a Completion or other condition requiring immediate
      attention.

   Invalidate STag - A mechanism used to prevent the Remote Peer from
      reusing a previously explicitly Advertised STag until the Local
      Peer makes it available through a subsequent explicit
      Advertisement. The STag cannot be accessed remotely until it
      is explicitly Advertised again.

   RDMA Completion (Completion, Completed, Complete, Completes) - For
      RDMA, Completion is defined as the process of informing the
      ULP that a particular RDMA Operation has performed all
      functions specified for that RDMA Operation, including
      Placement and Delivery. The Completion semantic of each RDMA
      Operation is distinctly defined.

   RDMA Message - A data transfer mechanism used to fulfill an RDMA
      Operation.

   RDMA Operation - A sequence of RDMA Messages, including control
      Messages, to transfer data from a Data Source to a Data Sink.
      The following RDMA Operations are defined: RDMA Write, RDMA
      Read, Send, Send with Invalidate, Send with Solicited Event,
      Send with Solicited Event and Invalidate, and Terminate.

   RDMA Protocol (RDMAP) - A wire protocol that supports RDMA
      Operations to transfer ULP data between a Local Peer and the
      Remote Peer.

   RDMAP Abortive Termination (Termination, Terminated, Terminate,
      Terminates) - The act of closing an RDMAP Stream without
      attempting to Complete in-progress and pending RDMA
      Operations.

   RDMAP Graceful Termination - The act of closing an RDMAP Stream
      such that all in-progress and pending RDMA Operations are
      allowed to Complete successfully.
700 RDMA Read - An RDMA Operation used by the Data Sink to transfer 701 the contents of a source RDMA buffer from the Remote Peer to 702 the Local Peer. An RDMA Read operation consists of a single 703 RDMA Read Request Message and a single RDMA Read Response 704 Message. 706 RDMA Read Request - An RDMA Message used by the Data Sink to 707 request the Data Source to transfer the contents of an RDMA 708 buffer. The RDMA Read Request Message describes both the Data 709 Source and Data Sink RDMA buffers. 711 RDMA Read Request Queue - The queue used for processing RDMA Read 712 Requests. The RDMA Read Request Queue has a DDP Queue Number 713 of 1. 715 RDMA Read Response - An RDMA Message used by the Data Source to 716 transfer the contents of an RDMA buffer to the Data Sink, in 717 response to an RDMA Read Request. The RDMA Read Response 718 Message only describes the data sink RDMA buffer. 720 RDMAP Stream - An association between a pair of RDMAP 721 implementations, possibly on different Nodes, which transfer 722 ULP data using RDMA Operations. There may be multiple RDMAP 723 Streams on a single Node. An RDMAP Stream maps directly to a 724 single DDP Stream. 726 RDMA Write - An RDMA Operation that transfers the contents of a 727 source RDMA Buffer from the Local Peer to a destination RDMA 728 Buffer at the Remote Peer using RDMA. The RDMA Write Message 729 only describes the Data Sink RDMA buffer. 731 Remote Direct Memory Access (RDMA) - A method of accessing memory 732 on a remote system in which the local system specifies the 733 remote location of the data to be transferred. Employing a 734 RNIC in the remote system allows the access to take place 735 without interrupting the processing of the CPU(s) on the 736 system. 738 Send - An RDMA Operation that transfers the contents of a ULP 739 Buffer from the Local Peer to an Untagged Buffer at the Remote 740 Peer. 
742 Send Message Type - A Send Message, Send with Invalidate Message, 743 Send with Solicited Event Message, or Send with Solicited 744 Event and Invalidate Message. 746 Send Operation Type - A Send Operation, Send with Invalidate 747 Operation, Send with Solicited Event Operation, or Send with 748 Solicited Event and Invalidate Operation. 750 Solicited Event (SE) - A facility by which an RDMA Operation 751 sender may cause an Event to be generated at the recipient, if 752 the recipient is configured to generate such an Event, when a 753 Send with Solicited Event or Send with Solicited Event and 754 Invalidate Message is received. Note: The Local Peer's ULP 755 can use the Solicited Event mechanism to ensure that Messages 756 designated as important to the ULP are handled in an 757 expeditious manner by the Remote Peer's ULP. The ULP at the 758 Local Peer can indicate a given Send Message Type is important 759 by using the Send with Solicited Event Message or Send with 760 Solicited Event and Invalidate Message. The ULP at the Remote 761 Peer can choose to only be notified when valid Send with 762 Solicited Event Messages and/or Send with Solicited Event and 763 Invalidate Messages arrive and handle other valid incoming 764 Send Messages or Send with Invalidate Messages at its leisure. 766 Terminate - An RDMA Message used by a Node to pass an error 767 indication to the peer Node on an RDMAP Stream. This operation 768 is for RDMAP use only. 770 ULP Buffer - A buffer owned above the RDMAP Layer and advertised 771 to the RDMAP Layer either as a Tagged Buffer or an Untagged 772 ULP Buffer. 774 ULP Message - The ULP data that is handed to a specific protocol 775 layer for transmission. Data boundaries are preserved as they 776 are transmitted through iWARP. 778 3 ULP and Transport Attributes 780 3.1 Transport Requirements & Assumptions 782 RDMAP MUST be layered on top of the Direct Data Placement Protocol 783 [DDP]. 
   RDMAP requires the following DDP support:

   *  RDMAP uses three queues for Untagged Buffers:

      *  Queue Number 0 (used by RDMAP for Send, Send with
         Invalidate, Send with Solicited Event, and Send with
         Solicited Event and Invalidate operations).

      *  Queue Number 1 (used by RDMAP for RDMA Read operations).

      *  Queue Number 2 (used by RDMAP for Terminate operations).

   *  DDP maps a single RDMA Message to a single DDP Message.

   *  DDP uses the STag and Tagged Offset provided by the RDMAP for
      Tagged Buffer Messages (i.e., RDMA Write and RDMA Read
      Response).

   *  When the DDP layer Delivers an Untagged DDP Message to the
      RDMAP layer, DDP provides the length of the DDP Message. This
      ensures that RDMAP does not have to carry a length field in its
      header.

   *  When the RDMAP layer provides an RDMA Message to the DDP Layer,
      DDP must insert the RsvdULP field value provided by the RDMAP
      Layer into the associated DDP Message.

   *  When the DDP layer Delivers a DDP Message to the RDMAP layer,
      DDP provides the RsvdULP field.

   *  The RsvdULP field must be 1 octet for DDP Tagged Messages and 5
      octets for DDP Untagged Messages.

   *  DDP propagates to RDMAP all operation or protection errors
      (used by RDMAP Terminate) and, when appropriate, the DDP Header
      fields of the DDP Segment that encountered the error.

   *  If an RDMA Operation is aborted by DDP or a lower layer, the
      contents of the Data Sink buffers associated with the operation
      are considered indeterminate.

   *  DDP, in conjunction with the lower layers, provides reliable,
      in-order Delivery.
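   The queue assignments and RsvdULP sizes listed above can be
   captured as constants. The following Python fragment is only an
   illustrative sketch; the names UntaggedQueue and rsvd_ulp_octets
   are hypothetical and are not defined by this specification.

```python
from enum import IntEnum


class UntaggedQueue(IntEnum):
    """DDP Queue Numbers that RDMAP uses for Untagged Buffers (hypothetical names)."""
    SEND = 0       # Send, Send with Invalidate, Send with SE, Send with SE and Invalidate
    RDMA_READ = 1  # RDMA Read operations
    TERMINATE = 2  # Terminate operations


def rsvd_ulp_octets(tagged: bool) -> int:
    """Size of the DDP RsvdULP field: 1 octet if Tagged, 5 octets if Untagged."""
    return 1 if tagged else 5
```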
829 3.2 RDMAP Interactions with the ULP 831 RDMAP provides the ULP with access to the following RDMA 832 Operations as defined in this specification: 834 * Send 836 * Send with Solicited Event 838 * Send with Invalidate 840 * Send with Solicited Event and Invalidate 842 * RDMA Write 844 * RDMA Read 846 For Send Operation Types, the following are the interactions 847 between the RDMAP Layer and the ULP: 849 * At the Data Source: 851 * The ULP passes to the RDMAP Layer the following: 853 * ULP Message Length 855 * ULP Message 857 * An indication of the Send Operation Type, where the 858 valid types are: Send, Send with Solicited Event, Send 859 with Invalidate, or Send with Solicited Event and 860 Invalidate. 862 * An Invalidate STag, if the Send Operation Type was 863 Send with Invalidate or Send with Solicited Event and 864 Invalidate. 866 * When the Send Operation Type Completes, an indication of 867 the Completion results. 869 * At the Data Sink: 871 * If the Send Operation Type Completed successfully, the 872 RDMAP Layer passes the following information to the ULP 873 Layer: 875 * ULP Message Length 877 * ULP Message 879 * An Event, if the Data Sink is configured to generate 880 an Event. 882 * An Invalidated STag, if the Send Operation Type was 883 Send with Invalidate or Send with Solicited Event and 884 Invalidate. 886 * If the Send Operation Type Completed in error, the Data 887 Sink RDMAP Layer will pass up the corresponding error 888 information to the Data Sink ULP and send a Terminate 889 Message to the Data Source RDMAP Layer. The Data Source 890 RDMAP Layer will then pass up the Terminate Message to the 891 ULP. 
893 For RDMA Write Operations, the following are the interactions 894 between the RDMAP Layer and the ULP: 896 * At the Data Source: 898 * The ULP passes to the RDMAP Layer the following: 900 * ULP Message Length 902 * ULP Message 904 * Data Sink STag 906 * Data Sink Tagged Offset 908 * When the RDMA Write Operation Completes, an indication of 909 the Completion results. 911 * At the Data Sink: 913 * If the RDMA Write completed successfully, the RDMAP Layer 914 does not Deliver the RDMA Write to the ULP. It does Place 915 the ULP Message transferred through the RDMA Write Message 916 into the ULP Buffer. 918 * If the RDMA Write completed in error, the Data Sink RDMAP 919 Layer will pass up the corresponding error information to 920 the Data Sink ULP and send a Terminate Message to the Data 921 Source RDMAP Layer. The Data Source RDMAP Layer will then 922 pass up the Terminate Message to the ULP. 924 For RDMA Read Operations, the following are the interactions 925 between the RDMAP Layer and the ULP: 927 * At the Data Sink: 929 * The ULP passes to the RDMAP Layer the following: 931 * ULP Message Length 933 * Data Source STag 935 * Data Sink STag 937 * Data Source Tagged Offset 939 * Data Sink Tagged Offset 941 * When the RDMA Read Operation Completes, an indication of 942 the Completion results. 944 * At the Data Source: 946 * If no error occurred while processing the RDMA Read 947 Request, the Data Source will not pass up any information 948 to the ULP. 950 * If an error occurred while processing the RDMA Read 951 Request, the Data Source RDMAP Layer will pass up the 952 corresponding error information to the Data Source ULP and 953 send a Terminate Message to the Data Sink RDMAP Layer. The 954 Data Sink RDMAP Layer will then pass up the Terminate 955 Message to the ULP. 
957 For STags made available to the RDMAP Layer, following are the 958 interactions between the RDMAP Layer and the ULP: 960 * If the ULP enables an STag, the ULP passes to the RDMAP Layer 961 the: 963 * STag; 965 * range of Tagged Offsets that are associated with a given 966 STag; 968 * remote access rights (read, write, or read and write) 969 associated with a given, valid STag; and 971 * association between a given STag and a given RDMAP Stream. 973 * If the ULP disables an STag, the ULP passes to the RDMAP Layer 974 the STag. 976 If an error occurs at the RDMAP Layer, the RDMAP Layer may pass 977 back error information (e.g., the content of a Terminate Message) 978 to the ULP. 980 4 Header Format 982 The control information of RDMA Messages is included in DDP 983 protocol defined header fields, with the following exceptions: 985 * The first octet reserved for ULP usage on all DDP Messages in 986 the DDP Protocol (i.e., the RsvdULP Field) is used by RDMAP to 987 carry the RDMA Message Opcode and the RDMAP version. This octet 988 is known as the RDMAP Control Field in this specification. For 989 Send with Invalidate and Send with Solicited Event and 990 Invalidate, RDMAP uses the second through fifth octets provided 991 by DDP on Untagged DDP Messages to carry the STag that will be 992 Invalidated. 994 * The RDMA Message length is passed by the RDMAP layer to the DDP 995 layer on all outbound transfers. 997 * For RDMA Read Request Messages, the RDMA Read Message Size is 998 included in the RDMA Read Request Header. 1000 * The RDMA Message length is passed to the RDMAP Layer by the DDP 1001 layer on inbound Untagged Buffer transfers. 1003 * Two RDMA Messages carry additional RDMAP headers. The RDMA Read 1004 Request carries the Data Sink and Data Source buffer 1005 descriptions, including buffer length. The Terminate carries 1006 additional information associated with the error that caused 1007 the Terminate. 
4.1 RDMAP Control and Invalidate STag Field

   The version of RDMAP defined by this specification uses all 8 bits
   of the RDMAP Control Field. The first octet reserved for ULP use
   in the DDP Protocol MUST be used by the RDMAP to carry the RDMAP
   Control Field. The ordering of the bits in the first octet MUST be
   as defined in Figure 3 DDP Control, RDMAP Control, and Invalidate
   STag Field. For Send with Invalidate and Send with Solicited Event
   and Invalidate, the second through fifth octets of the DDP RsvdULP
   field MUST be used by RDMAP to carry the Invalidate STag. Figure 3
   DDP Control, RDMAP Control, and Invalidate STag Field depicts the
   format of the DDP Control and RDMAP Control fields. (Note: In
   Figure 3 DDP Control, RDMAP Control, and Invalidate STag Field,
   the DDP Header is offset by 16 bits to accommodate the MPA header
   defined in [MPA]. The MPA header is only present if DDP is layered
   on top of MPA.)

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
                                   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                                   |T|L| Resrv | DV| RV|Rsv| Opcode|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Invalidate STag                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   Figure 3 DDP Control, RDMAP Control, and Invalidate STag Fields

   All RDMA Messages handed by the RDMAP Layer to the DDP layer MUST
   define the value of the Tagged flag in the DDP Header. Figure 4
   RDMA Usage of DDP Fields MUST be used to define the value of the
   Tagged flag that is handed to the DDP Layer for each RDMA Message.

   Figure 4 RDMA Usage of DDP Fields defines the value of the RDMA
   Opcode field that MUST be used for each RDMA Message.

   Figure 4 RDMA Usage of DDP Fields defines when the STag, Queue
   Number, and Tagged Offset fields MUST be provided for each RDMA
   Message.
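   As an informal illustration of the RDMAP Control Field octet in
   Figure 3 (RV in the two high-order bits, two reserved bits, and a
   4-bit Opcode), the following Python sketch packs and parses that
   octet and builds the 5-octet RsvdULP value for Untagged Messages.
   The function and constant names are hypothetical. The OpCode
   values are those of Figure 4 RDMA Usage of DDP Fields, and the RV
   values 01b and 00b are those this specification gives for the
   RDMA Version field. Encoding the Invalidate STag in network byte
   order is an assumption of this sketch.

```python
from enum import IntEnum


class Opcode(IntEnum):
    """RDMA Message OpCodes, as listed in Figure 4 (names are hypothetical)."""
    RDMA_WRITE = 0b0000
    RDMA_READ_REQUEST = 0b0001
    RDMA_READ_RESPONSE = 0b0010
    SEND = 0b0011
    SEND_INVALIDATE = 0b0100
    SEND_SE = 0b0101
    SEND_SE_INVALIDATE = 0b0110
    TERMINATE = 0b0111


RV_RDMA_CONSORTIUM = 0b00  # RNIC following the RDMA Consortium specification
RV_THIS_SPEC = 0b01        # RNIC following this specification

# Only these two Messages carry a valid Invalidate STag.
INVALIDATING = (Opcode.SEND_INVALIDATE, Opcode.SEND_SE_INVALIDATE)


def pack_rdmap_control(opcode: Opcode, version: int = RV_THIS_SPEC) -> int:
    """Build the RDMAP Control octet: RV (2 bits), Rsv (2 bits, zero), Opcode (4 bits)."""
    if version not in (RV_RDMA_CONSORTIUM, RV_THIS_SPEC):
        raise ValueError("unknown RDMA Version value")
    return (version << 6) | int(Opcode(opcode))


def unpack_rdmap_control(octet: int) -> tuple:
    """Return (version, opcode) as raw integers; the reserved bits are ignored."""
    return (octet >> 6) & 0b11, octet & 0b1111


def rsvd_ulp_untagged(opcode: Opcode, invalidate_stag: int = 0,
                      version: int = RV_THIS_SPEC) -> bytes:
    """Build the 5-octet RsvdULP value: control octet, then the Invalidate STag.

    The STag is forced to zero unless the Message is one of the two
    invalidating Send types. Big-endian encoding is an assumption.
    """
    stag = invalidate_stag if opcode in INVALIDATING else 0
    return bytes([pack_rdmap_control(opcode, version)]) + stag.to_bytes(4, "big")
```

   A receiver written along these lines would compare the raw opcode
   returned by unpack_rdmap_control against the defined values and
   treat 1000b through 1111b as reserved.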
   For this version of the RDMAP, all RDMA Messages MUST have:

   *  Bits 24-25; RDMA Version field: 01b for an RNIC that complies
      with this RDMA protocol specification. 00b for an RNIC that
      complies with the RDMA Consortium's RDMA protocol
      specification. Both version numbers are valid.
      Interoperability is dependent on MPA protocol version
      negotiation (e.g., MPA marker and MPA CRC).

   *  Bits 26-27; Reserved. MUST be set to zero by sender, ignored by
      the receiver.

   *  Bits 28-31; OpCode field: see Figure 4 RDMA Usage of DDP
      Fields.

   *  Bits 32-63; Invalidate STag. However, this field is only valid
      for Send with Invalidate and Send with Solicited Event and
      Invalidate Messages (see Figure 4 RDMA Usage of DDP Fields).

      For Send, Send with Solicited Event, RDMA Read Request, and
      Terminate, the Invalidate STag field MUST be set to zero on
      transmit and ignored by the receiver.

   -------+-----------+-------+------+-------+-----------+--------------
   RDMA   | Message   | Tagged| STag | Queue | Invalidate| Message
   Message| Type      | Flag  | and  | Number| STag      | Length
   OpCode |           |       | TO   |       |           | Communicated
          |           |       |      |       |           | between DDP
          |           |       |      |       |           | and RDMAP
   -------+-----------+-------+------+-------+-----------+--------------
   0000b  | RDMA Write| 1     | Valid| N/A   | N/A       | Yes
          |           |       |      |       |           |
   -------+-----------+-------+------+-------+-----------+--------------
   0001b  | RDMA Read | 0     | N/A  | 1     | N/A       | Yes
          | Request   |       |      |       |           |
   -------+-----------+-------+------+-------+-----------+--------------
   0010b  | RDMA Read | 1     | Valid| N/A   | N/A       | Yes
          | Response  |       |      |       |           |
   -------+-----------+-------+------+-------+-----------+--------------
   0011b  | Send      | 0     | N/A  | 0     | N/A       | Yes
          |           |       |      |       |           |
   -------+-----------+-------+------+-------+-----------+--------------
   0100b  | Send with | 0     | N/A  | 0     | Valid     | Yes
          | Invalidate|       |      |       |           |
   -------+-----------+-------+------+-------+-----------+--------------
   0101b  | Send with | 0     | N/A  | 0     | N/A       | Yes
          | SE        |       |      |       |           |
   -------+-----------+-------+------+-------+-----------+--------------
   0110b  | Send with | 0     | N/A  | 0     | Valid     | Yes
          | SE and    |       |      |       |           |
          | Invalidate|       |      |       |           |
   -------+-----------+-------+------+-------+-----------+--------------
   0111b  | Terminate | 0     | N/A  | 2     | N/A       | Yes
          |           |       |      |       |           |
   -------+-----------+-------+------+-------+-----------+--------------
   1000b  |           |
   to     | Reserved  |              Not Specified
   1111b  |           |
   -------+-----------+-------------------------------------------------
   Figure 4 RDMA Usage of DDP Fields

   Note: N/A means Not Applicable.

4.2 RDMA Message Definitions

   The following figure defines which RDMA Headers MUST be used on
   each RDMA Message and which RDMA Messages are allowed to carry ULP
   payload:

   -------+-----------+-------------------+-------------------------
   RDMA   | Message   | RDMA Header Used  | ULP Message allowed in
   Message| Type      |                   | the RDMA Message
   OpCode |           |                   |
          |           |                   |
   -------+-----------+-------------------+-------------------------
   0000b  | RDMA Write| None              | Yes
          |           |                   |
   -------+-----------+-------------------+-------------------------
   0001b  | RDMA Read | RDMA Read Request | No
          | Request   | Header            |
   -------+-----------+-------------------+-------------------------
   0010b  | RDMA Read | None              | Yes
          | Response  |                   |
   -------+-----------+-------------------+-------------------------
   0011b  | Send      | None              | Yes
          |           |                   |
   -------+-----------+-------------------+-------------------------
   0100b  | Send with | None              | Yes
          | Invalidate|                   |
   -------+-----------+-------------------+-------------------------
   0101b  | Send with | None              | Yes
          | SE        |                   |
   -------+-----------+-------------------+-------------------------
   0110b  | Send with | None              | Yes
          | SE and    |                   |
          | Invalidate|                   |
   -------+-----------+-------------------+-------------------------
   0111b  | Terminate | Terminate Header  | No
          |           |                   |
   -------+-----------+-------------------+-------------------------
   1000b  |           |
   to     | Reserved  |            Not Specified
   1111b  |           |
   -------+-----------+-------------------+-------------------------
   Figure 5 RDMA Message Definitions

4.3 RDMA Write Header

   The RDMA Write Message does not include an RDMAP header. The RDMAP
   layer passes to the DDP layer an RDMAP Control Field. The RDMA
   Write Message is fully described by the DDP Headers of the DDP
   Segments associated with the Message.

   See section 11 Appendix for a description of the DDP Segment
   format associated with RDMA Write Messages.

4.4 RDMA Read Request Header

   The RDMA Read Request Message carries an RDMA Read Request Header
   that describes the Data Sink and Data Source Buffers used by the
   RDMA Read operation. The RDMA Read Request Header immediately
   follows the DDP header. The RDMAP layer passes to the DDP layer an
   RDMAP Control Field.
   The following figure depicts the RDMA Read Request Header that
   MUST be used for all RDMA Read Request Messages:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   Data Sink STag (SinkSTag)                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +               Data Sink Tagged Offset (SinkTO)                +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |               RDMA Read Message Size (RDMARDSZ)               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  Data Source STag (SrcSTag)                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +               Data Source Tagged Offset (SrcTO)               +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   Figure 6 RDMA Read Request Header Format

   Data Sink Steering Tag: 32 bits.

      The Data Sink Steering Tag identifies the Data Sink's Tagged
      Buffer. This field MUST be copied, without interpretation,
      from the RDMA Read Request into the corresponding RDMA Read
      Response and allows the Data Sink to place the returning
      data. The STag is associated with the RDMAP Stream through a
      mechanism that is outside the scope of the RDMAP
      specification.

   Data Sink Tagged Offset: 64 bits.

      The Data Sink Tagged Offset specifies the starting offset, in
      octets, from the base of the Data Sink's Tagged Buffer, where
      the data is to be written by the Data Source. This field is
      copied from the RDMA Read Request into the corresponding RDMA
      Read Response and allows the Data Sink to place the returning
      data. The Data Sink Tagged Offset MAY start at an arbitrary
      offset.

      The Data Sink STag and Data Sink Tagged Offset fields
      describe the buffer to which the RDMA Read data is written.
1215 Note: the DDP Layer protects against a wrap of the Data Sink 1216 Tagged Offset. 1218 RDMA Read Message Size: 32 bits. 1220 The RDMA Read Message Size is the amount of data, in octets, 1221 read from the Data Source. A single RDMA Read Request Message 1222 can retrieve from 0 to 2^32-1 data octets from the Data 1223 Source. 1225 Data Source Steering Tag: 32 bits. 1227 The Data Source Steering Tag identifies the Data Source's 1228 Tagged Buffer. The STag is associated with the RDMAP Stream 1229 through a mechanism that is outside the scope of the RDMAP 1230 specification. 1232 Data Source Tagged Offset: 64 bits. 1234 The Tagged Offset specifies the starting offset, in octets, 1235 that is to be read from the Data Source's Tagged Buffer. The 1236 Data Source Tagged Offset MAY start at an arbitrary offset. 1238 The Data Source STag and Data Source Tagged Offset fields 1239 describe the buffer from which the RDMA Read data is read. 1241 See Section 7.2 Errors Detected at the Remote Peer on Incoming 1242 RDMA Messages for a description of error checking required upon 1243 processing of an RDMA Read Request at the Data Source. 1245 4.5 RDMA Read Response Header 1247 The RDMA Read Response Message does not include an RDMAP header. 1248 The RDMAP layer passes to the DDP layer an RDMAP Control Field. 1249 The RDMA Read Response Message is fully described by the DDP 1250 Headers of the DDP Segments associated with the Message. 1252 See Section 11 Appendix for a description of the DDP Segment 1253 format associated with RDMA Read Response Messages. 1255 4.6 Send Header and Send with Solicited Event Header 1257 The Send and Send with Solicited Event Message do not include an 1258 RDMAP header. The RDMAP layer passes to the DDP layer an RDMAP 1259 Control Field. The Send and Send with Solicited Event Message are 1260 fully described by the DDP Headers of the DDP Segments associated 1261 with the Message. 
   See Section 11 Appendix for a description of the DDP Segment
   format associated with Send and Send with Solicited Event
   Messages.

4.7 Send with Invalidate Header and Send with SE and Invalidate
    Header

   The Send with Invalidate and Send with Solicited Event and
   Invalidate Message do not include an RDMAP header. The RDMAP layer
   passes to the DDP layer an RDMAP Control Field and the Invalidate
   STag field (see section 4.1 RDMAP Control and Invalidate STag
   Field). The Send with Invalidate and Send with Solicited Event and
   Invalidate Message are fully described by the DDP Headers of the
   DDP Segments associated with the Message.

   See Section 11 Appendix for a description of the DDP Segment
   format associated with Send and Send with Solicited Event
   Messages.

4.8 Terminate Header

   The Terminate Message carries a Terminate Header that contains
   additional information associated with the cause of the Terminate.
   The Terminate Header immediately follows the DDP header. The RDMAP
   layer passes to the DDP layer an RDMAP Control Field. The
   following figure depicts a Terminate Header that MUST be used for
   the Terminate Message:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          Terminate Control          |        Reserved         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  DDP Segment Length (if any)  |                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               +
   |                                                               |
   //                                                             //
   |                Terminated DDP Header (if any)                 |
   +                                                               +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   //                                                             //
   |                Terminated RDMA Header (if any)                |
   +                                                               +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   Figure 7 Terminate Header Format

   Terminate Control: 19 bits.
      The Terminate Control field MUST have the format defined in
      Figure 8 Terminate Control Field.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Layer | EType |  Error Code   |HdrCt|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   Figure 8 Terminate Control Field

      *  Figure 9 Terminate Control Field Values defines the valid
         values that MUST be used for this field.

      *  Layer: 4 bits.

         Identifies the layer that encountered the error.

      *  EType (RDMA Error Type): 4 bits.

         Identifies the type of error that caused the
         Terminate. When the error is detected at the RDMAP
         Layer, the RDMAP Layer inserts the Error Type into
         this field. When the error is detected at an LLP layer,
         the LLP layer creates the Error Type, the DDP layer
         passes it up to the RDMAP Layer, and the RDMAP Layer
         inserts it into this field.

      *  Error Code: 8 bits.

         This field identifies the specific error that caused
         the Terminate. When the error is detected at the RDMAP
         Layer, the RDMAP Layer creates the Error Code. When
         the error is detected at an LLP layer, the LLP layer
         creates the Error Code, the DDP layer passes it up
         to the RDMAP Layer, and the RDMAP Layer inserts it
         into this field.

      *  HdrCt: 3 bits.

         Header control bits:

         *  M: bit 16. DDP Segment Length valid. See Figure 10
            for when this bit SHOULD be set.

         *  D: bit 17. DDP Header Included. See Figure 10 for
            when this bit SHOULD be set.

         *  R: bit 18. RDMAP Header Included. See Figure 10
            for when this bit SHOULD be set.
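   The field widths and bit positions above (Layer in bits 0-3,
   EType in bits 4-7, Error Code in bits 8-15, and the M, D, and R
   bits in positions 16, 17, and 18) can be illustrated with a short
   Python sketch that encodes and decodes the first 32 bits of the
   Terminate Header. The function names are hypothetical, and the
   sketch assumes that the 19-bit Terminate Control field and the
   13-bit Reserved field that follows it in Figure 7 are carried as
   one 32-bit word in network byte order.

```python
import struct


def pack_terminate_control(layer: int, etype: int, error_code: int,
                           m: bool = False, d: bool = False,
                           r: bool = False) -> bytes:
    """Encode Terminate Control (19 bits) plus 13 zeroed Reserved bits."""
    if not (0 <= layer <= 0xF and 0 <= etype <= 0xF and 0 <= error_code <= 0xFF):
        raise ValueError("Terminate Control field out of range")
    # Bit 0 is the most significant bit of the 32-bit word.
    word = (layer << 28) | (etype << 24) | (error_code << 16)
    word |= (int(m) << 15) | (int(d) << 14) | (int(r) << 13)
    return struct.pack("!I", word)


def unpack_terminate_control(data: bytes) -> dict:
    """Decode the first four octets of a Terminate Header; Reserved bits are ignored."""
    (word,) = struct.unpack("!I", data[:4])
    return {
        "layer": (word >> 28) & 0xF,
        "etype": (word >> 24) & 0xF,
        "error_code": (word >> 16) & 0xFF,
        "m": bool((word >> 15) & 1),  # DDP Segment Length valid
        "d": bool((word >> 14) & 1),  # DDP Header Included
        "r": bool((word >> 13) & 1),  # RDMAP Header Included
    }
```

   For example, an RDMA-layer (0000b) Remote Protection Error (0001b)
   with an Access rights violation code (02X) and all three header
   control bits set would encode as the octets 01 02 E0 00 under
   these assumptions.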
   -------+----------+-------+-------------+------+--------------------
   Layer  | Layer    | Error | Error Type  | Error| Error Code Name
          | Name     | Type  | Name        | Code |
   -------+----------+-------+-------------+------+--------------------
          |          | 0000b | Local       | None | None - This error
          |          |       | Catastrophic|      | type does not have
          |          |       | Error       |      | an error code. Any
          |          |       |             |      | value in this field
          |          |       |             |      | is acceptable.
          |          +-------+-------------+------+--------------------
          |          |       |             | 00X  | Invalid STag
          |          |       |             +------+--------------------
          |          |       |             | 01X  | Base or bounds
          |          |       |             |      | violation
          |          |       | Remote      +------+--------------------
          |          | 0001b | Protection  | 02X  | Access rights
          |          |       | Error       |      | violation
          |          |       |             +------+--------------------
   0000b  | RDMA     |       |             | 03X  | STag not associated
          |          |       |             |      | with RDMAP Stream
          |          |       |             +------+--------------------
          |          |       |             | 04X  | TO wrap
          |          |       |             +------+--------------------
          |          |       |             | 09X  | STag cannot be
          |          |       |             |      | Invalidated
          |          |       |             +------+--------------------
          |          |       |             | FFX  | Unspecified Error
          |          +-------+-------------+------+--------------------
          |          |       |             | 05X  | Invalid RDMAP
          |          |       |             |      | version
          |          |       |             +------+--------------------
          |          |       |             | 06X  | Unexpected OpCode
          |          |       | Remote      +------+--------------------
          |          | 0010b | Operation   | 07X  | Catastrophic error,
          |          |       | Error       |      | localized to RDMAP
          |          |       |             |      | Stream
          |          |       |             +------+--------------------
          |          |       |             | 08X  | Catastrophic error,
          |          |       |             |      | global
          |          |       |             +------+--------------------
          |          |       |             | 09X  | STag cannot be
          |          |       |             |      | Invalidated
          |          |       |             +------+--------------------
          |          |       |             | FFX  | Unspecified Error
   -------+----------+-------+-------------+------+--------------------
   0001b  | DDP      | See DDP Specification [DDP] for a description of
          |          | the values and names.
   -------+----------+-------+-----------------------------------------
   0010b  | LLP      | For MPA, see MPA Specification [MPA] for a
          | (eg MPA) | description of the values and names.
   -------+----------+-------+-----------------------------------------
   Figure 9 Terminate Control Field Values

   Reserved: 13 bits. This field MUST be set to zero on transmit,
      ignored on receive.

   DDP Segment Length: 16 bits

      The length handed up by the DDP Layer when the error was
      detected. It MUST be valid if the M bit is set. It MUST be
      present when the D bit is set.

   Terminated DDP Header: 112 bits for Tagged Messages and 144 bits
      for Untagged Messages.

      The DDP Header of the incoming Message that is associated
      with the Terminate. The DDP Header is not present if the
      Terminate Error Type is a Local Catastrophic Error. It MUST
      be present if the D bit is set.

   Terminated RDMA Header: 224 bits.

      The Terminated RDMA Header is only sent back if the terminate
      is associated with an RDMA Read Request Message. It MUST be
      present if the R bit is set.

      If the terminate occurs before the first RDMA Read Request
      byte is processed, the original RDMA Read Request Header is
      sent back.

      If the terminate occurs after the first RDMA Read Request
      byte is processed, the RDMA Read Request Header is updated to
      reflect the current location of the RDMA Read operation that
      is in process:

         *  Data Sink STag = Data Sink STag originally sent in the
            RDMA Read Request.

         *  Data Sink Tagged Offset = Current offset into the Data
            Sink Tagged Buffer. For example if the RDMA Read
            Request was terminated after 2048 octets were sent,
            then the Data Sink Tagged Offset = the original Data
            Sink Tagged Offset + 2048.

         *  Data Message size = Number of bytes left to transfer.

         *  Data Source STag = Data Source STag in the RDMA Read
            Request.

         *   Data Source Tagged Offset = Current offset into the
             Data Source Tagged Buffer.  For example, if the RDMA
             Read Request was terminated after 2048 octets were
             sent, then the Data Source Tagged Offset = the
             original Data Source Tagged Offset + 2048.

   Note: if a given LLP does not define any termination codes for the
   RDMAP Termination message to use, then none would be used for that
   LLP.

   Figure 10, Error Type to RDMA Message Mapping, maps layer names
   and error types to each RDMA Message type:

   ---------+-------------+------------+------------+-----------------
   Layer    | Error Type  | Terminate  | Terminate  | What type of
   Name     | Name        | Includes   | Includes   | RDMA Message can
            |             | DDP Header | RDMA Header| cause the error
            |             | and DDP    |            |
            |             | Segment    |            |
            |             | Length     |            |
   ---------+-------------+------------+------------+-----------------
            | Local       | No         | No         | Any
            | Catastrophic|            |            |
            | Error       |            |            |
            +-------------+------------+------------+-----------------
            | Remote      | Yes, if    | Yes        | Only RDMA Read
   RDMA     | Protection  | possible   |            | Request, Send
            | Error       |            |            | with Invalidate,
            |             |            |            | and Send with SE
            |             |            |            | and Invalidate
            +-------------+------------+------------+-----------------
            | Remote      | Yes, if    | No         | Any
            | Operation   | possible   |            |
            | Error       |            |            |
   ---------+-------------+------------+------------+-----------------
   DDP      | See DDP Spec| Yes        | No         | Any
            | [DDP]       |            |            |
   ---------+-------------+------------+------------+-----------------
   LLP      | See LLP Spec| No         | No         | Any
            | [e.g., MPA] |            |            |
            Figure 10 Error Type to RDMA Message Mapping

5 Data Transfer

5.1 RDMA Write Message

   An RDMA Write is used by the Data Source to transfer data to a
   previously Advertised Tagged Buffer at the Data Sink.
   The RDMA Write Message has the following semantics:

   *   An RDMA Write Message MUST reference a Tagged Buffer.  That is,
       the Data Source RDMAP Layer MUST request that the DDP layer
       mark the Message as Tagged.

   *   A valid RDMA Write Message MUST NOT be delivered to the Data
       Sink's ULP (i.e., it is placed by the DDP layer).

   *   At the Remote Peer, when an invalid RDMA Write Message is
       delivered to the Remote Peer's RDMAP Layer, an error is
       surfaced (see section 7.1 RDMAP Error Surfacing).

   *   The Tagged Offset of a Tagged Buffer MAY start at a non-zero
       value.

   *   An RDMA Write Message MAY target all or part of a previously
       Advertised buffer.

   *   The RDMAP does not define how the buffer(s) used by an outbound
       RDMA Write are defined or how they are addressed.  For example,
       an implementation of RDMA may choose to allow a gather-list of
       non-contiguous data blocks to be the source of an RDMA Write.
       In this case, the data blocks would be combined by the Data
       Source and sent as a single RDMA Write Message to the Data
       Sink.

   *   The Data Source RDMAP Layer MUST issue RDMA Write Messages to
       the DDP layer in the order they were submitted by the ULP.

   *   At the Data Source, a subsequent Send (Send with Invalidate,
       Send with Solicited Event, or Send with Solicited Event and
       Invalidate) Message MAY be used to signal Delivery of previous
       RDMA Write Messages to the Data Sink, if desired by the ULP.

   *   If the Local Peer wishes to write to multiple Tagged Buffers on
       the Remote Peer, the Local Peer MUST use multiple RDMA Write
       Messages.  That is, a single RDMA Write Message can only write
       to one remote Tagged Buffer.

   *   The Data Source MAY issue a zero length RDMA Write Message.

5.2 RDMA Read Operation

   The RDMA Read operation MUST consist of a single RDMA Read Request
   Message and a single RDMA Read Response Message.
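
   Because an RDMA Read operation is exactly one request paired with
   one response, and sections 5.2.2 and 5.5 require responses to be
   passed to DDP in the order the requests were received, a Data Sink
   could match responses against a FIFO of outstanding requests.  The
   Python sketch below is illustrative only: the names ReadRequest
   and ReadTracker are hypothetical and not defined by this
   specification, and the comparison corresponds to the optional
   (MAY) check described in section 5.2.2.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadRequest:
    """Hypothetical record of one outstanding RDMA Read Request."""
    sink_stag: int   # Data Sink STag placed in the request
    sink_to: int     # Data Sink Tagged Offset placed in the request
    length: int      # RDMA Read Message Size (zero is permitted)


class ReadTracker:
    """Pairs each RDMA Read Response with the oldest pending request."""

    def __init__(self) -> None:
        self._pending = deque()

    def post(self, req: ReadRequest) -> None:
        # Requests are queued in the order the ULP submitted them.
        self._pending.append(req)

    def on_response(self, stag: int, to: int, length: int) -> ReadRequest:
        # Responses arrive in request order, so the oldest request matches.
        if not self._pending:
            raise RuntimeError("RDMA Read Response without a pending request")
        req = self._pending.popleft()
        # Optional consistency check of STag, Tagged Offset, and length.
        if (stag, to, length) != (req.sink_stag, req.sink_to, req.length):
            raise ValueError("response does not match the oldest request")
        return req
```

   A caller would invoke post() when it submits a Read Request and
   on_response() when DDP Delivers the Read Response; an exception
   here stands in for whatever error surfacing the implementation
   uses.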

5.2.1 RDMA Read Request Message

   An RDMA Read Request is used by the Data Sink to transfer data
   from a previously Advertised Tagged Buffer at the Data Source to a
   Tagged Buffer at the Data Sink.  The RDMA Read Request Message has
   the following semantics:

   *   An RDMA Read Request Message MUST reference an Untagged Buffer.
       That is, the Local Peer's RDMAP Layer MUST request that the DDP
       mark the Message as Untagged.

   *   One RDMA Read Request Message MUST consume one Untagged Buffer.

   *   The Remote Peer's RDMAP Layer MUST process an RDMA Read Request
       Message.  A valid RDMA Read Request Message MUST NOT be
       delivered to the Data Source's ULP (i.e., it is processed by
       the RDMAP layer).

   *   At the Remote Peer, when an invalid RDMA Read Request Message
       is delivered to the Remote Peer's RDMAP Layer, an error is
       surfaced (see section 7.1 RDMAP Error Surfacing).

   *   An RDMA Read Request Message MUST reference the RDMA Read
       Request Queue.  That is, the Local Peer's RDMAP Layer MUST
       request that the DDP layer set the Queue Number field to one.

   *   The Local Peer MUST pass to the DDP Layer RDMA Read Request
       Messages in the order they were submitted by the ULP.

   *   The Remote Peer MUST process the RDMA Read Request Messages in
       the order they were sent.

   *   If the Local Peer wishes to read from multiple Tagged Buffers
       on the Remote Peer, the Local Peer MUST use multiple RDMA Read
       Request Messages.  That is, a single RDMA Read Request Message
       MUST only read from one remote Tagged Buffer.

   *   An RDMA Read Request Message MAY target all or part of a
       previously Advertised buffer.

   *   If the Data Source receives a valid RDMA Read Request Message
       it MUST respond with a valid RDMA Read Response Message.

   *   The Data Sink MAY issue a zero length RDMA Read Request
       Message, by setting the RDMA Read Message Size field to zero in
       the RDMA Read Request Header.

   *   If the Data Source receives a non-zero length RDMA Read Message
       Size, the Data Source RDMAP MUST validate the Data Source STag
       and Data Source Tagged Offset contained in the RDMA Read
       Request Header.

   *   If the Data Source receives an RDMA Read Request Header with
       the RDMA Read Message Size set to zero, the Data Source RDMAP:

       *   MUST NOT validate the Data Source STag and Data Source
           Tagged Offset contained in the RDMA Read Request Header,
           and

       *   MUST respond with a zero length RDMA Read Response
           Message.

5.2.2 RDMA Read Response Message

   The RDMA Read Response Message uses the DDP Tagged Buffer Model to
   Deliver the contents of a previously requested Data Source Tagged
   Buffer to the Data Sink, without any involvement from the ULP at
   the Remote Peer.  The RDMA Read Response Message has the following
   semantics:

   *   The RDMA Read Response Message for the associated RDMA Read
       Request Message travels in the opposite direction.

   *   An RDMA Read Response Message MUST reference a Tagged Buffer.
       That is, the Data Source RDMAP Layer MUST request that the DDP
       mark the Message as Tagged.

   *   The Data Source MUST ensure that a sufficient number of
       Untagged Buffers are available on the RDMA Read Request Queue
       (Queue with DDP Queue Number 1) to support the maximum number
       of RDMA Read Requests negotiated by the ULP.

   *   The RDMAP Layer MUST Deliver the RDMA Read Response Message to
       the ULP.

   *   At the Remote Peer, when an invalid RDMA Read Response Message
       is delivered to the Remote Peer's RDMAP Layer, an error is
       surfaced (see section 7.1 RDMAP Error Surfacing).

   *   The Tagged Offset of a Tagged Buffer MAY start at a non-zero
       value.

   *   The Data Source RDMAP Layer MUST pass RDMA Read Response
       Messages to the DDP layer in the order that the RDMA Read
       Request Messages were received by the RDMAP Layer at the Data
       Source.

   *   The Data Sink MAY validate that the STag, Tagged Offset, and
       length of the RDMA Read Response Message are the same as the
       STag, Tagged Offset, and length included in the corresponding
       RDMA Read Request Message.

   *   A single RDMA Read Response Message MUST write to one remote
       Tagged Buffer.  If the Data Sink wishes to Read multiple Tagged
       Buffers, the Data Sink can use multiple RDMA Read Request
       Messages.

5.3 Send Message Type

   The Send Message Type uses the DDP Untagged Buffer Model to
   transfer data from the Data Source into an Untagged Buffer at the
   Data Sink.

   *   A Send Message Type MUST reference an Untagged Buffer.  That is,
       the Local Peer's RDMAP Layer MUST request that the DDP layer
       mark the Message as Untagged.

   *   One Send Message Type MUST consume one Untagged Buffer.

       *   The ULP Message sent using a Send Message Type MAY be less
           than or equal to the size of the consumed Untagged Buffer.
           The RDMAP Layer communicates to the ULP the size of the
           data written into the Untagged Buffer.

       *   If the ULP Message sent via Send Message Type is larger
           than the Data Sink's Untagged Buffer, it is an error (see
           section 7.1 RDMAP Error Surfacing).

   *   At the Remote Peer, Send Message Type Messages MUST be
       Delivered to the Remote Peer's ULP in the order they were sent.

   *   After the Send with Solicited Event or Send with Solicited
       Event and Invalidate Message is Delivered to the ULP, the RDMAP
       MAY generate an Event, if the Data Sink is configured to
       generate such an Event.

   *   At the Remote Peer, when an invalid Send Message Type is
       Delivered to the Remote Peer's RDMAP Layer, an error is
       surfaced (see section 7.1 RDMAP Error Surfacing).

   *   The RDMAP does not define how the buffer(s) used by an outbound
       Send Message Type are defined or how they are addressed.  For
       example, an implementation of RDMA may choose to allow a
       gather-list of non-contiguous data blocks to be the source of a
       Send Message Type.  In this case, the data blocks would be
       combined by the Data Source and sent as a single Send Message
       Type to the Data Sink.

   *   For a Send Message Type, the Local Peer's RDMAP Layer MUST
       request that the DDP layer set the Queue Number field to zero.

   *   The Local Peer MUST issue Send Message Type Messages in the
       order they were submitted by the ULP.

   *   The Data Source MAY pass a zero length Send Message Type.  A
       zero length Send Message Type MUST consume an Untagged Buffer
       at the Data Sink.

   *   A Send with Invalidate or Send with Solicited Event and
       Invalidate Message MUST reference an STag.  That is, the Local
       Peer's RDMAP Layer MUST pass the RDMA control field and the
       STag that will be Invalidated to the DDP layer.

   *   When the Send with Invalidate and Send with Solicited Event and
       Invalidate Message are Delivered to the Remote Peer's RDMAP
       Layer, the RDMAP Layer MUST:

       *   Verify that the STag is associated with the RDMAP Stream;
           and

       *   Invalidate the STag if it is associated with the RDMAP
           Stream; or Issue a Terminate Message with the STag Cannot
           be Invalidated Terminate Error Code, if the STag is not
           associated with the RDMAP Stream.

5.4 Terminate Message

   The Terminate Message uses the DDP Untagged Buffer Model to
   transfer error related information from the Data Source into an
   Untagged Buffer at the Data Sink and then ceases all further
   communications on the underlying DDP Stream.
   The Terminate Message has the following semantics:

   *   A Terminate Message MUST reference an Untagged Buffer.  That is,
       the Local Peer's RDMAP Layer MUST request that the DDP layer
       mark the Message as Untagged.

   *   A Terminate Message references the Terminate Queue.  That is,
       the Local Peer's RDMAP Layer MUST request that the DDP layer
       set the Queue Number field to two.

   *   One Terminate Message MUST consume one Untagged Buffer.

   *   On a single RDMAP Stream, the RDMAP layer MUST guarantee
       placement of a single Terminate Message.

   *   A Terminate Message MUST be Delivered to the Remote Peer's
       RDMAP Layer.  The RDMAP Layer MUST Deliver the Terminate Message
       to the ULP.

   *   At the Remote Peer, when an invalid Terminate Message is
       delivered to the Remote Peer's RDMAP Layer, an error is
       surfaced (see section 7.1 RDMAP Error Surfacing).

   *   The RDMAP Layer Completes in error all ULP Operations that have
       not been provided to the DDP layer.

   *   After sending a Terminate Message on an RDMAP Stream, the Local
       Peer MUST NOT send any more Messages on that specific RDMAP
       Stream.

   *   After receiving a Terminate Message on an RDMAP Stream, the
       Remote Peer MAY stop sending Messages on that specific RDMAP
       Stream.

5.5 Ordering and Completions

   It is important to understand the difference between Placement and
   Delivery ordering since RDMAP provides quite different semantics
   for the two.

   Note that many current protocols, both as used in the Internet and
   elsewhere, assume that data is both Placed and Delivered in order.
   This allowed applications to take a variety of shortcuts by taking
   advantage of this fact.  For RDMAP, many of these shortcuts are no
   longer safe to use, and could cause application failure.

   The following rules apply to implementations of the RDMAP
   protocol.
   Note, in these rules Send includes Send, Send with
   Invalidate, Send with Solicited Event, and Send with Solicited
   Event and Invalidate:

   1.  RDMAP does not provide ordering among Messages on different
       RDMAP Streams.

   2.  RDMAP does not provide ordering between operations that are
       generated from the two ends of an RDMAP Stream.

   3.  RDMA Messages that use Tagged and Untagged Buffers MAY be
       Placed in any order.  If an application uses overlapping
       buffers (points different Messages or portions of a single
       Message at the same buffer), then it is possible that the last
       incoming write to the Data Sink buffer will not be the last
       outgoing data sent from the Data Source.

   4.  For a Send operation, the contents of an Untagged Buffer at
       the Data Sink MAY be indeterminate until the Send is Delivered
       to the ULP at the Data Sink.

   5.  For an RDMA Write operation, the contents of the Tagged Buffer
       at the Data Sink MAY be indeterminate until a subsequent Send
       is Delivered to the ULP at the Data Sink.

   6.  For an RDMA Read operation, the contents of the Tagged Buffer
       at the Data Sink MAY be indeterminate until the RDMA Read
       Response Message has been Delivered at the Local Peer.

       Statements 4, 5, and 6 imply "no peeking" at the data to see
       if it is done.  It is possible for some data to arrive before
       logically earlier data does, and peeking may cause
       unpredictable application failure.

   7.  If the ULP or Application modifies the contents of Tagged or
       Untagged Buffers being modified by an RDMA Operation while the
       RDMAP is processing the RDMA Operation, the state of the
       Buffers is indeterminate.

   8.  If the ULP or Application modifies the contents of Tagged or
       Untagged Buffers read by an RDMA Operation while the RDMAP is
       processing the RDMA Operation, the results of the read are
       indeterminate.

   9.  The Completion of an RDMA Write or Send Operation at the Local
       Peer does not guarantee that the ULP Message has yet reached
       the Remote Peer ULP Buffer or been examined by the Remote ULP.

   10. Send Messages MUST be Delivered to the ULP at the Remote Peer
       after they are Delivered to RDMAP by DDP and in the order that
       they were Delivered to RDMAP.

       Note that DDP ordering rules ensure that this will be the same
       order that they were submitted at the Local Peer and that any
       prior RDMA Writes have been submitted for ordered Placement at
       the Remote Peer.  This means that when the ULP sees the
       Delivery of the Send, the memory buffers targeted by any
       preceding RDMA Writes and Sends are available to be accessed
       locally or remotely as authorized.  If the ULP overlaps its
       buffers for different operations, the data from the RDMA Write
       or Send may be overwritten by subsequent RDMA Operations
       before the ULP receives and processes the Delivery.

   11. RDMA Read Response Messages MUST be Delivered to the ULP at
       the Remote Peer after they are Delivered to RDMAP by DDP and
       in the order that they were Delivered to RDMAP.

       DDP ordering rules ensure that this will be the same order
       that they were submitted at the Local Peer.  This means that
       when the ULP sees the Delivery of the RDMA Read Response, the
       memory buffers targeted by the RDMA Read Response are
       available to be accessed locally or remotely as authorized.  If
       the ULP overlaps its buffers for different operations, the
       data from the RDMA Read Response may be overwritten by
       subsequent RDMA Operations before the ULP receives and
       processes the Delivery.

   12. RDMA Read Request Messages, including zero-length RDMA Read
       Requests, MUST NOT start processing at the Remote Peer until
       they have been Delivered to RDMAP by DDP.

       Note: the ULP is assured that data written can be read back.
       For example, if an RDMA Read Request is issued by the local
       peer, targeting the same ULP Buffer as a preceding Send or
       RDMA Write (in the same direction as the RDMA Read Request),
       and there are no other sources of update for the ULP Buffer,
       then the remote peer will send back the data written by the
       Send or RDMA Write.  That is, for this example the ULP Buffer:
       is Advertised for use on a series of RDMA Messages, is only
       valid on the RDMAP Stream for which it is advertised, and is
       not locally updated while the series of RDMAP Messages are
       performed.  For this example, order rule (12) assures that
       subsequent local or remote accesses to the ULP Buffer contain
       the data written by the Send or RDMA Write.

       RDMA Read Response Messages MAY be generated at the Remote
       Peer after subsequent RDMA Write Messages or Send Messages
       have been Placed or Delivered.  Therefore, when an application
       does an RDMA Read Request followed by an RDMA Write (or Send)
       to the same buffer, it may get the data from the later RDMA
       Write (or Send) in the RDMA Read Response Message, even though
       the operations completed in order at the Local Peer.  If this
       behavior is not desired, the Local Peer ULP must Fence the
       later RDMA Write (or Send) by withholding the RDMA Write
       Message until all outstanding RDMA Read Responses have been
       Delivered.

   13. The RDMAP Layer MUST submit RDMA Messages to the DDP layer in
       the order the RDMA Operations are submitted to the RDMAP Layer
       by the ULP.

   14. A Send or RDMA Write Message MUST NOT be considered Complete
       at the Local Peer (Data Source) until it has been successfully
       completed at the DDP layer.

   15. RDMA Operations MUST be Completed at the Local Peer in the
       order that they were submitted by the ULP.

   16. At the Data Sink, an incoming Send Message MUST be Delivered
       to the ULP only after the DDP Message has been Delivered to
       the RDMAP Layer by the DDP layer.

   17. RDMA Read Response Message processing at the Remote Peer
       (reading the specified Tagged Buffer) MUST be started only
       after the RDMA Read Request Message has been Delivered by the
       DDP layer (thus all previous RDMA Messages have been properly
       submitted for ordered Placement).

   18. Send Messages MAY be Completed at the Remote Peer (Data Sink)
       before prior incoming RDMA Read Request Messages have
       completed their response processing.

   19. An RDMA Read operation MUST NOT be Completed at the Local Peer
       until the DDP layer Delivers the associated incoming RDMA Read
       Response Message.

   20. If more than one outstanding RDMA Read Request Message is
       supported by both peers, the RDMA Read Response Messages MUST
       be submitted to the DDP layer on the Remote Peer in the order
       the RDMA Read Request Messages were Delivered by DDP, but the
       actual read of the buffer contents MAY take place in any order
       at the Remote Peer.

       This simplifies Local Peer Completion processing for RDMA
       Reads in that a Delivered RDMA Read Response MUST be
       sufficient to Complete the RDMA Read Operation.

6 RDMAP Stream Management

   RDMAP Stream management consists of RDMAP Stream Initialization
   and RDMAP Stream Termination.

6.1 Stream Initialization

   RDMAP Stream initialization occurs after the LLP Stream has been
   created (e.g., for DDP/MPA over TCP the first TCP Segment after
   the SYN, SYN/ACK exchange).  The ULP is responsible for
   transitioning the LLP Stream into RDMA enabled mode.  The switch to
   RDMA mode typically occurs sometime after LLP Stream setup.  Once
   in RDMA enabled mode, an implementation MUST send only RDMA
   Messages across the transport Stream until the RDMAP Stream is
   torn down.

   For each direction of an RDMAP Stream:

   *   For a given RDMAP Stream, the number of outstanding RDMA Read
       Requests is limited per RDMAP Stream direction.

   *   It is the ULP's responsibility to set the maximum number of
       outstanding, inbound RDMA Read Requests per RDMAP Stream
       direction.

   *   The RDMAP Layer MUST provide the maximum number of outstanding,
       inbound RDMA Read Requests per RDMAP Stream direction that were
       negotiated between the ULP and the Local Peer's RDMAP Layer.
       The negotiation mechanism is outside the scope of this
       specification.

   *   It is the ULP's responsibility to set the maximum number of
       outstanding, outbound RDMA Read Requests per RDMAP Stream
       direction.

   *   The RDMAP Layer MUST provide the maximum number of outstanding,
       outbound RDMA Read Requests for the RDMAP Stream direction that
       were negotiated between the ULP and the Local Peer's RDMAP
       Layer.  The negotiation mechanism is outside the scope of this
       specification.

   *   The Local Peer's ULP is responsible for negotiating with the
       Remote Peer's ULP the maximum number of outstanding RDMA Read
       Requests for the RDMAP Stream direction.  It is recommended that
       the ULP set the maximum number of outstanding, inbound RDMA
       Read Requests equal to the maximum number of outstanding,
       outbound RDMA Read Requests for a given RDMAP Stream direction.

   *   For outbound RDMA Read Requests, the RDMAP Layer MUST NOT
       exceed the maximum number of outstanding, outbound RDMA Read
       Requests that were negotiated between the ULP and the Local
       Peer's RDMAP Layer.

   *   For inbound RDMA Read Requests, the RDMAP Layer MUST NOT exceed
       the maximum number of outstanding, inbound RDMA Read Requests
       that were negotiated between the ULP and the Local Peer's RDMAP
       Layer.
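
   The inbound and outbound limits above amount to two counters per
   RDMAP Stream.  The Python sketch below shows one way an
   implementation might track them; the class and method names are
   hypothetical, and the specification does not say what an
   implementation does when a limit would be exceeded, so the sketch
   simply reports that to the caller.

```python
class ReadRequestLimits:
    """Tracks outstanding RDMA Read Requests for one RDMAP Stream.

    The two limits are assumed to have been negotiated by the ULP by
    some mechanism outside this sketch.
    """

    def __init__(self, max_outbound: int, max_inbound: int) -> None:
        if max_outbound < 0 or max_inbound < 0:
            raise ValueError("limits must be non-negative")
        self.max_outbound = max_outbound
        self.max_inbound = max_inbound
        self.outbound = 0   # Read Requests sent, response not yet Delivered
        self.inbound = 0    # Read Requests received, response not yet sent

    def try_issue_read(self) -> bool:
        """Reserve an outbound slot; False means the caller must wait."""
        if self.outbound >= self.max_outbound:
            return False
        self.outbound += 1
        return True

    def read_completed(self) -> None:
        """Release an outbound slot once the Read Response is Delivered."""
        if self.outbound == 0:
            raise RuntimeError("no outbound RDMA Read Request outstanding")
        self.outbound -= 1

    def accept_inbound_read(self) -> bool:
        """Reserve an inbound slot; False means the limit was exceeded."""
        if self.inbound >= self.max_inbound:
            return False
        self.inbound += 1
        return True

    def response_sent(self) -> None:
        """Release an inbound slot once the Read Response was passed to DDP."""
        if self.inbound == 0:
            raise RuntimeError("no inbound RDMA Read Request outstanding")
        self.inbound -= 1
```

   Keeping the two counters separate mirrors the text, which treats
   the inbound and outbound maximums as independent values even
   though it recommends setting them equal.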

6.2 Stream Teardown

   There are three methods for terminating an RDMAP Stream: ULP
   Graceful Termination, RDMAP Abortive Termination, and LLP Abortive
   Termination.

   The ULP is responsible for performing ULP Graceful Termination.
   After a ULP Graceful Termination, either side of the Stream can
   initiate LLP Graceful Termination, using the graceful termination
   mechanism provided by the LLP.

   RDMAP Abortive Termination allows the RDMAP to issue a Terminate
   Message describing the reason the RDMAP Stream was terminated.  The
   next section (6.2.1 RDMAP Abortive Termination) describes the
   RDMAP Abortive Termination in detail.

   LLP Abortive Termination results from an LLP error and causes the
   RDMAP Stream to be torn down midstream, without an RDMAP Terminate
   Message.  While this last method is highly undesirable, it is
   possible and the ULP should take this into consideration.

6.2.1 RDMAP Abortive Termination

   RDMAP defines a Terminate operation that SHOULD be invoked when
   either an RDMAP error is encountered or an LLP error is surfaced
   to the RDMAP layer by the LLP.

   It is not always possible to send the Terminate Message.  For
   example, certain LLP errors may occur that cause the LLP Stream to
   be torn down a) before RDMAP is aware of the error, b) before
   RDMAP is able to send the Terminate Message, or c) after RDMAP has
   posted the Terminate Message to the LLP, but before it has been
   transmitted by the LLP.

   Note that an RDMAP Abortive Termination may entail loss of data.
   In general, when a Terminate Message is received it is impossible
   to tell for sure what unacknowledged RDMA Messages were Completed
   successfully at the Remote Peer.  Thus the state of all outstanding
   RDMA Messages is indeterminate and the Messages SHOULD be
   considered Completed in error.
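
   The recommendation to treat every outstanding RDMA Message as
   Completed in error can be pictured as a single pass over the
   pending work.  The Python sketch below is only an illustration
   under assumed names (Status, WorkRequest, flush_outstanding); the
   specification does not define a work request structure.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, List


class Status(Enum):
    PENDING = "pending"
    COMPLETED = "completed"
    COMPLETED_IN_ERROR = "completed_in_error"


@dataclass
class WorkRequest:
    """Hypothetical record of one RDMA Operation submitted by the ULP."""
    name: str
    status: Status = Status.PENDING


def flush_outstanding(work_requests: Iterable[WorkRequest]) -> List[WorkRequest]:
    """Mark every still-pending request as Completed in error.

    Requests that already Completed are left untouched; their state
    was reported to the ULP before the Terminate Message was seen.
    """
    flushed = []
    for wr in work_requests:
        if wr.status is Status.PENDING:
            wr.status = Status.COMPLETED_IN_ERROR
            flushed.append(wr)
    return flushed
```

   The returned list preserves submission order, so a caller could
   report the error Completions to the ULP in the order the
   operations were submitted.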

   When a peer sends or receives a Terminate Message, it MAY
   immediately tear down the LLP Stream.  The peer SHOULD perform a
   graceful LLP teardown to ensure the Terminate Message is
   successfully Delivered.

   See section 4.8 Terminate Header for a description of the
   Terminate Message and its contents.  See section 5.4 Terminate
   Message for a description of the Terminate Message semantics.

7 RDMAP Error Management

   The RDMAP protocol does not have RDMAP or DDP layer error recovery
   operations built in.  If everything is working, the LLP guarantees
   will ensure that the Messages are arriving at the destination.

   If errors are detected at the RDMAP or DDP layer, then the RDMAP,
   DDP and LLP Streams are Abortively Terminated (see section 4.8
   Terminate Header).

   In general, poor implementations or improper ULP programming cause
   the errors detected at the RDMAP and DDP layers.  In these cases,
   returning a diagnostic termination error Message and closing the
   RDMAP Stream is far simpler than attempting to maintain the RDMAP
   Stream, particularly when the cause of the error is not known.

   If an LLP does not support teardown of a Stream independent of
   other Streams and an RDMAP error results in the Termination of a
   specific Stream, then the LLP MUST label the Stream as an
   erroneous Stream and MUST NOT allow any further data transfer on
   that Stream after RDMAP requests the Stream to be torn down.

   For a specific LLP connection, when all Streams are either
   gracefully torn down or are labeled as erroneous Streams, the LLP
   connection MUST be torn down.

   Since errors are detected at the Remote Peer (possibly long) after
   RDMA Messages are passed to DDP and the LLP at the Local Peer and
   Completed, the sender cannot easily determine which of its
   Messages have been received.  (RDMA Reads are an exception to this
   rule).

   For a list of errors returned to the Remote Peer as a result of an
   Abortive Termination, see section 4.8 Terminate Header.

7.1 RDMAP Error Surfacing

   If an error occurs at the Local Peer, the RDMAP layer MUST attempt
   to inform the local ULP that the error has occurred.

   The Local Peer MUST send a Terminate Message for each of the
   following cases:

   1.  For errors detected while creating RDMA Write, Send, Send with
       Invalidate, Send with Solicited Event, Send with Solicited
       Event and Invalidate, or RDMA Read Requests, or other reasons
       not directly associated with an incoming Message, the
       Terminate Message and Error code are sent instead of the
       request.  In this case, the Error Type and Error Code fields
       are included in the Terminate Message, but the Terminated DDP
       Header and Terminated RDMA Header fields are set to zero.

   2.  For errors detected on an incoming RDMA Write, Send, Send with
       Invalidate, Send with Solicited Event, Send with Solicited
       Event and Invalidate, or Read Response Message (after the
       Message has been Delivered by DDP), the Terminate Message is
       sent at the earliest possible opportunity, preferably in the
       next outgoing RDMA Message.  In this case, the Error Type,
       Error Code, ULP PDU Length, and Terminated DDP Header fields
       are included in the Terminate Message, but the Terminated RDMA
       Header field is set to zero.

   3.  For errors detected on an incoming RDMA Read Request Message
       (after the Message has been Delivered by DDP), the Terminate
       Message is sent at the earliest possible opportunity,
       preferably in the next outgoing RDMA Message.  In this case,
       the Error Type, Error Code, ULP PDU Length, Terminated DDP
       Header, and Terminated RDMA Header fields are included in the
       Terminate Message.

   4.  If more than one error is detected on incoming RDMA Messages,
       before the Terminate Message can be sent, then the first RDMA
       Message (and its associated DDP Segment) that experienced an
       error MUST be captured by the Terminate Message in accordance
       with rules 2 and 3 above.

7.2 Errors Detected at the Remote Peer on Incoming RDMA Messages

   On incoming RDMA Writes, RDMA Read Response, Sends, Send with
   Invalidate, Send with Solicited Event, Send with Solicited Event
   and Invalidate, and Terminate Messages, the following must be
   validated:

   1.  The DDP Layer MUST validate all DDP Segment fields.

   2.  The RDMA OpCode MUST be valid.

   3.  The RDMA Version MUST be valid.

       Additionally, on incoming Send with Invalidate and Send with
       Solicited Event and Invalidate Messages, the following must
       also be validated:

   4.  The Invalidate STag MUST be valid.

   5.  The STag MUST be associated with this RDMAP Stream.

   On incoming RDMA Read Request Messages, the following must be
   validated:

   1.  The DDP Layer MUST validate all Untagged DDP Segment fields.

   2.  The RDMA OpCode MUST be valid.

   3.  The RDMA Version MUST be valid.

   4.  For non-zero length RDMA Read Request Messages:

       a.  The Data Source STag MUST be valid.

       b.  The Data Source STag MUST be associated with this RDMAP
           Stream.

       c.  The Data Source Tagged Offset MUST fall in the range of
           legal offsets associated with the Data Source STag.

       d.  The sum of the Data Source Tagged Offset and the RDMA Read
           Message Size MUST fall in the range of legal offsets
           associated with the Data Source STag.

       e.  The sum of the Data Source Tagged Offset and the RDMA Read
           Message Size MUST NOT cause the Data Source Tagged Offset
           to wrap.
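
   Checks 4a through 4e for an incoming RDMA Read Request can be
   sketched in Python as follows.  This is an illustration under
   stated assumptions, not a normative procedure: the Region record
   and function name are hypothetical, the Tagged Offset is assumed
   to be a 64-bit unsigned value, and the pairing of each check with
   a Remote Protection Error code from Figure 9 is a plausible
   reading rather than something this section spells out.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Remote Protection Error codes as listed in Figure 9.
ERR_INVALID_STAG = 0x00
ERR_BASE_OR_BOUNDS = 0x01
ERR_STAG_NOT_ASSOCIATED = 0x03
ERR_TO_WRAP = 0x04

TO_MODULUS = 1 << 64  # assumption: Tagged Offset is a 64-bit field


@dataclass(frozen=True)
class Region:
    """Hypothetical description of the buffer behind a Data Source STag."""
    stream_id: int   # RDMAP Stream the STag is associated with
    base_to: int     # lowest legal Tagged Offset
    length: int      # number of legal octets starting at base_to


def validate_read_request(regions: Dict[int, Region], stream_id: int,
                          src_stag: int, src_to: int,
                          size: int) -> Optional[int]:
    """Return None if the request passes, else an assumed error code."""
    if size == 0:
        # Zero length: the STag and Tagged Offset MUST NOT be validated.
        return None
    region = regions.get(src_stag)
    if region is None:
        return ERR_INVALID_STAG                      # check 4a
    if region.stream_id != stream_id:
        return ERR_STAG_NOT_ASSOCIATED               # check 4b
    if src_to + size > TO_MODULUS:
        return ERR_TO_WRAP                           # check 4e
    end = region.base_to + region.length
    if src_to < region.base_to or src_to + size > end:
        return ERR_BASE_OR_BOUNDS                    # checks 4c and 4d
    return None
```

   The wrap test is done before the bounds test so that an offset
   near the top of the assumed 64-bit range is reported as a wrap
   rather than as a bounds violation; an implementation may order
   the checks differently.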

8 Security Considerations

   This section references the resources that discuss protocol-
   specific security considerations and implications of using RDMAP
   with existing security services.  A detailed analysis of the
   security issues around implementation and use of the RDMAP can be
   found in [RDMASEC].

   [RDMASEC] introduces the RDMA reference model and discusses how
   the resources of this model are vulnerable to attacks and the
   types of attack these vulnerabilities are subject to.  It also
   details the levels of Trust available in this peer-to-peer model
   and how this defines the nature of resource sharing.

   The IPsec requirements for RDDP are based on the version of IPsec
   specified in RFC 2401 [RFC 2401] and related RFCs, as profiled by
   RFC 3723 [RFC 3723], despite the existence of a newer version of
   IPsec specified in RFC 4301 [RFC 4301] and related RFCs.  One of
   the important early applications of the RDDP protocols is their
   use with iSCSI [iSER]; RDDP's IPsec requirements follow those of
   iSCSI in order to facilitate that usage by allowing a common
   profile of IPsec to be used with iSCSI and the RDDP protocols.  In
   the future, RFC 3723 may be updated to the newer version of IPsec;
   the IPsec security requirements of any such update should apply
   uniformly to iSCSI and the RDDP protocols.

8.1 Summary of RDMAP specific Security Requirements

   [RDMASEC] defines the security requirements for the implementation
   of the components of the RDMA reference model, namely the RDMA
   enabled NIC (RNIC) and the Privileged Resource Manager.  An RDMAP
   implementation conforming to this specification MUST conform to
   these requirements.

8.1.1 RDMAP (RNIC) Requirements

   RDMAP provides several countermeasures for all types of attacks as
   introduced in [RDMASEC].
In the following, this specification lists all security
requirements that MUST be implemented by the RNIC. A more detailed
discussion of RNIC security requirements can be found in Section 5
of [RDMASEC].

1.  An RNIC MUST ensure that a specific Stream in a specific
    Protection Domain cannot access an STag in a different
    Protection Domain.

2.  An RNIC MUST ensure that if an STag is limited in scope to a
    single Stream, no other Stream can use the STag.

3.  An RNIC MUST ensure that a Remote Peer is not able to access
    memory outside of the buffer specified when the STag was
    enabled for remote access.

4.  An RNIC MUST provide a mechanism for the ULP to establish and
    revoke the association of a ULP Buffer to an STag and TO
    range.

5.  An RNIC MUST provide a mechanism for the ULP to establish and
    revoke read, write, or read and write access to the ULP Buffer
    referenced by an STag.

6.  An RNIC MUST ensure that the network interface can no longer
    modify an advertised buffer after the ULP revokes remote
    access rights for an STag.

7.  An RNIC MUST ensure that a Remote Peer is not able to
    invalidate an STag enabled for remote access if the STag is
    shared on multiple Streams.

8.  An RNIC MUST choose the value of STags in a way that is
    difficult to predict. It is RECOMMENDED to sparsely populate
    them over the full available range.

9.  An RNIC MUST NOT enable sharing a CQ across ULPs that do not
    share Partial Mutual Trust.

10. An RNIC MUST ensure that if a CQ overflows, any Streams that
    do not use the CQ remain unaffected.

11. An RNIC implementation SHOULD provide a mechanism to cap the
    number of outstanding RDMA Read Requests.

12. An RNIC MUST NOT enable firmware to be loaded on the RNIC
    directly from an untrusted Local Peer or Remote Peer, unless
    the Peer is properly authenticated and the update is done via
    a secure protocol, such as IPsec. The authentication mechanism
    is outside the scope of this specification; it presumably
    entails authenticating that the remote ULP has the right to
    perform the update.

8.1.2 Privileged Resource Manager Requirements

With RDMAP, all reservations of local resources are initiated from
local ULPs. To protect against local attacks, including unfair
resource distribution and unauthorized access to RNIC resources, a
Privileged Resource Manager (PRM) that manages all local resource
allocation must be implemented. Note that the PRM need not be
provided as an independent component; its functionality can also
be implemented as part of the privileged ULP or as part of the
RNIC itself.

A PRM implementation must meet the following security
requirements (a more detailed discussion of PRM security
requirements can be found in Section 5 of [RDMASEC]):

1. All Non-Privileged ULP interactions with the RNIC Engine that
   could affect other ULPs MUST be done using the Privileged
   Resource Manager as a proxy.

2. All ULP resource allocation requests for scarce resources MUST
   also be done using a Privileged Resource Manager.

3. The Privileged Resource Manager MUST NOT assume different ULPs
   share Partial Mutual Trust unless there is a mechanism to
   ensure that the ULPs do indeed share Partial Mutual Trust.

4. If Non-Privileged ULPs are supported, the Privileged Resource
   Manager MUST verify that the Non-Privileged ULP has the right
   to access a specific Data Buffer before allowing an STag for
   which the ULP has access rights to be associated with a
   specific Data Buffer.

5. The Privileged Resource Manager MUST control the allocation of
   CQ entries.

6. The Privileged Resource Manager SHOULD prevent a Local Peer
   from allocating more than its fair share of resources.

7. RDMA Read Request Queue resource consumption MUST be
   controlled by the Privileged Resource Manager such that
   RDMAP/DDP Streams that do not share Partial Mutual Trust do
   not share RDMA Read Request Queue resources.

8. If an RNIC provides the ability to share receive buffers
   across multiple Streams, the combination of the RNIC and the
   Privileged Resource Manager MUST be able to detect if the
   Remote Peer is attempting to consume more than its fair share
   of resources so that the Local Peer can apply countermeasures
   to detect and prevent the attack.

8.2 Security Services for RDMAP

RDMAP uses IP-based network services to control, read, and write
data buffers over the network. Therefore, all exchanged control
and data packets are vulnerable to spoofing, tampering, and
information disclosure attacks.

RDMAP Streams that are subject to impersonation attacks or
Stream hijacking attacks can be authenticated, have their
integrity protected, and be protected from replay attacks.
Furthermore, confidentiality protection can be used to protect
from eavesdropping.

8.2.1 Available Security Services

The IPsec protocol suite [RFC2401] defines strong countermeasures
to protect an IP stream from those attacks. Several levels of
protection can guarantee session confidentiality, per-packet
source authentication, per-packet integrity, and correct packet
sequencing.

RDMAP security may also profit from SSL or TLS security services
provided for TCP-based ULPs [RFC4346]. Used underneath RDMAP,
these security services also provide stream authentication,
data integrity, and confidentiality.
As discussed in [RDMASEC], limitations on the maximum packet
length to be carried over the network and potentially inefficient
out-of-order packet processing at the data sink make SSL and TLS
less appropriate for RDMAP than IPsec.

If SSL is layered on top of RDMAP, SSL does not protect the RDMAP
headers. Thus, a man-in-the-middle attacker can still modify the
RDMAP header so that data is placed into the wrong buffer,
effectively corrupting the data stream.

By remaining independent of ULP and LLP security protocols, RDMAP
will benefit from continuing improvements at those layers. Users
are provided flexibility to adapt to their specific security
requirements and the ability to adapt to future security
challenges. Given this, the vulnerabilities of RDMAP to active
third-party interference are no greater than those of any other
protocol running over an LLP such as TCP or SCTP.

8.2.2 Requirements for IPsec Services for RDMAP

Because IPsec is designed to secure arbitrary IP packet streams,
including streams where packets are lost, RDMAP can run on top of
IPsec without any change. IPsec packets are processed (e.g.,
integrity checked and possibly decrypted) in the order they are
received, and an RDMAP Data Sink will process the decrypted RDMA
Messages contained in these packets in the same manner as RDMA
Messages contained in unsecured IP packets.

The IP Storage working group has defined the normative IPsec
requirements for IP Storage [RFC3723]. Portions of this
specification are applicable to the RDMAP. In particular, a
compliant implementation of IPsec services for RDMAP MUST meet
the requirements as outlined in Section 2.3 of [RFC3723]. Without
replicating the detailed discussion in [RFC3723], this includes
the following requirements:

1. The implementation MUST support IPsec ESP [RFC2406], as well
   as the replay protection mechanisms of IPsec. When ESP is
   utilized, per-packet data origin authentication, integrity, and
   replay protection MUST be used.

2. It MUST support ESP in tunnel mode and MAY implement ESP in
   transport mode.

3. It MUST support IKE [RFC2409] for peer authentication,
   negotiation of security associations, and key management,
   using the IPsec DOI [RFC2407].

4. It MUST NOT interpret the receipt of an IKE Phase 2 delete
   message as a reason for tearing down the RDMAP Stream. Since
   IPsec acceleration hardware may only be able to handle a
   limited number of active IKE Phase 2 SAs, idle SAs may be
   dynamically brought down and a new SA brought up again if
   activity resumes.

5. It MUST support peer authentication using a pre-shared key,
   and MAY support certificate-based peer authentication using
   digital signatures. Peer authentication using the public key
   encryption methods [RFC2409] SHOULD NOT be used.

6. It MUST support IKE Main Mode and SHOULD support Aggressive
   Mode. IKE Main Mode with pre-shared key authentication SHOULD
   NOT be used when either of the peers uses a dynamically
   assigned IP address.

7. When digital signatures are used to achieve authentication,
   either IKE Main Mode or IKE Aggressive Mode MAY be used. In
   these cases, an IKE negotiator SHOULD use IKE Certificate
   Request Payload(s) to specify the certificate authority (or
   authorities) that are trusted in accordance with its local
   policy. IKE negotiators SHOULD check the pertinent Certificate
   Revocation List (CRL) before accepting a PKI certificate for
   use in IKE's authentication procedures.

8. Access to locally stored secret information (pre-shared or
   private key for digital signing) must be suitably restricted,
   since compromise of the secret information nullifies the
   security properties of the IKE/IPsec protocols.

9. It MUST follow the guidelines of Section 2.3.4 of [RFC3723] on
   the setting of IKE parameters to achieve a high level of
   interoperability without requiring extensive configuration.

Furthermore, implementation and deployment of the IPsec services
for RDDP should follow the Security Considerations outlined in
Section 5 of [RFC3723].

9 IANA Considerations

This document requests no direct action from IANA. The following
consideration is listed here as commentary.

If RDMAP was enabled a priori for a ULP by connecting to a
well-known port, this well-known port would be registered for the
RDMAP with IANA. The registration of the well-known port will be
the responsibility of the ULP specification.

10 References

10.1 Normative References

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
    Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC2406] Kent, S. and R. Atkinson, "IP Encapsulating Security
    Payload (ESP)", RFC 2406, November 1998.

[RFC2407] Piper, D., "The Internet IP Security Domain of
    Interpretation of ISAKMP", RFC 2407, November 1998.

[RFC2409] Harkins, D. and D. Carrel, "The Internet Key Exchange
    (IKE)", RFC 2409, November 1998.

[RFC3723] Aboba, B. et al., "Secure Block Storage Protocols over
    IP", RFC 3723, April 2004.

[RFC 4301] Kent, S. and K. Seo, "Security Architecture for the
    Internet Protocol", RFC 4301, December 2005.

[VERBS] J. Hilland, "RDMA Protocol Verbs Specification",
    draft-hilland-iwarp-verbs-v1.0, RDMA Consortium, April 2003.

[DDP] H. Shah et al., "Direct Data Placement over Reliable
    Transports", draft-ietf-rddp-ddp-07.txt, September 2006.

[MPA] P. Culley et al., "Marker PDU Aligned Framing for TCP
    Specification", draft-ietf-rddp-mpa-06.txt, September 2006.

[SCTP] R. Stewart et al., "Stream Control Transmission Protocol",
    RFC 2960, October 2000.

[TCP] Postel, J., "Transmission Control Protocol", STD 7, RFC 793,
    September 1981.

[RDMASEC] J. Pinkerton et al., "DDP/RDMAP Security",
    draft-ietf-rddp-security-09.txt, March 2005.

[iSER] M. Ko et al., "iSCSI Extensions for RDMA Specification",
    Internet-Draft, draft-ietf-ips-iser-05.txt, Work in Progress,
    October 2005.

10.2 Informative References

[RFC2401] Atkinson, R. and S. Kent, "Security Architecture for the
    Internet Protocol", RFC 2401, November 1998.

[RFC4346] Dierks, T. and E. Rescorla, "The TLS Protocol Version
    1.1", RFC 4346, April 2006.

11 Appendix

11.1 DDP Segment Formats for RDMA Messages

This appendix is for information only and is NOT part of the
standard. It simply depicts the DDP Segment format for the various
RDMA Messages.
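As a reading aid for the figures in this appendix, the two header
shapes they depict can be sketched with Python's struct module.
This is an informal sketch, not part of the specification: the
function and key names are hypothetical, and it assumes that the
fields are carried in network byte order with no padding. The
tagged shape (RDMA Write, RDMA Read Response) is 8 bits of DDP
Control, 8 bits of RDMA Control, a 32-bit STag, and a 64-bit
Tagged Offset. The untagged shape (Send variants, RDMA Read
Request, Terminate) is the same two control octets followed by
four 32-bit words: Reserved or Invalidate STag, Queue Number,
Message Sequence Number, and Message Offset.

```python
import struct

# Assumption: network byte order ("!"), fields packed without padding.
TAGGED = struct.Struct("!BBIQ")      # DDP Ctrl, RDMA Ctrl, STag, TO
UNTAGGED = struct.Struct("!BBIIII")  # DDP Ctrl, RDMA Ctrl,
                                     # Reserved/Invalidate STag,
                                     # QN, MSN, MO


def parse_tagged(segment: bytes) -> dict:
    """Split a tagged DDP Segment into header fields and payload."""
    ddp_ctrl, rdma_ctrl, stag, to = TAGGED.unpack_from(segment)
    return {"ddp_control": ddp_ctrl, "rdma_control": rdma_ctrl,
            "stag": stag, "tagged_offset": to,
            "payload": segment[TAGGED.size:]}


def parse_untagged(segment: bytes) -> dict:
    """Split an untagged DDP Segment into header fields and payload."""
    ddp_ctrl, rdma_ctrl, word, qn, msn, mo = UNTAGGED.unpack_from(segment)
    return {"ddp_control": ddp_ctrl, "rdma_control": rdma_ctrl,
            "reserved_or_invalidate_stag": word,
            "queue_number": qn, "msn": msn, "message_offset": mo,
            "payload": segment[UNTAGGED.size:]}
```

The sketch does not interpret the bits inside the DDP Control and
RDMA Control octets, and it leaves the message-specific fields
that follow the untagged header (for example, the RDMA Read
Request fields or the Terminate Control field) inside the returned
payload.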
11.1.1 DDP Segment for RDMA Write

The following figure depicts an RDMA Write, DDP Segment:

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
                                +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                                |  DDP Control  | RDMA Control  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Data Sink STag                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Data Sink Tagged Offset                    |
+                                                               +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    RDMA Write ULP Payload                     |
//                                                             //
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 11 RDMA Write, DDP Segment format

11.1.2 DDP Segment for RDMA Read Request

The following figure depicts an RDMA Read Request, DDP Segment:

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
                                +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                                |  DDP Control  | RDMA Control  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      Reserved (Not Used)                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|             DDP (RDMA Read Request) Queue Number              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|        DDP (RDMA Read Request) Message Sequence Number        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|            DDP (RDMA Read Request) Message Offset             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                   Data Sink STag (SinkSTag)                   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+               Data Sink Tagged Offset (SinkTO)                +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|               RDMA Read Message Size (RDMARDSZ)               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                  Data Source STag (SrcSTag)                   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+               Data Source Tagged Offset (SrcTO)               +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 12 RDMA Read Request, DDP Segment format

11.1.3 DDP Segment for RDMA Read Response

The following figure depicts an RDMA Read Response, DDP Segment:

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
                                +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                                |  DDP Control  | RDMA Control  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Data Sink STag                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Data Sink Tagged Offset                    |
+                                                               +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                RDMA Read Response ULP Payload                 |
//                                                             //
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 13 RDMA Read Response, DDP Segment format

11.1.4 DDP Segment for Send and Send with Solicited Event

The following figure depicts a Send and Send with Solicited
Event, DDP Segment:

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
                                +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                                |  DDP Control  | RDMA Control  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      Reserved (Not Used)                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      (Send) Queue Number                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                (Send) Message Sequence Number                 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                     (Send) Message Offset                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       Send ULP Payload                        |
//                                                             //
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 14 Send and Send with Solicited Event, DDP Segment format

11.1.5 DDP Segment for Send with Invalidate and Send with SE and
Invalidate

The following figure depicts a Send with Invalidate and Send with
Solicited Event and Invalidate, DDP Segment:

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
                                +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                                |  DDP Control  | RDMA Control  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Invalidate STag                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      (Send) Queue Number                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                (Send) Message Sequence Number                 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                     (Send) Message Offset                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       Send ULP Payload                        |
//                                                             //
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 15 Send with Invalidate and Send with SE and Invalidate,
DDP Segment

11.1.6 DDP Segment for Terminate

The following figure depicts a Terminate, DDP Segment:

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
                                +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                                |  DDP Control  | RDMA Control  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      Reserved (Not Used)                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                 DDP (Terminate) Queue Number                  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|            DDP (Terminate) Message Sequence Number            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                DDP (Terminate) Message Offset                 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Terminate Control          |        Reserved         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  DDP Segment Length (if any)  |                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               +
|                                                               |
+                                                               +
|                Terminated DDP Header (if any)                 |
+                                                               +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
//                                                             //
|                Terminated RDMA Header (if any)                |
+                                                               +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 16 Terminate, DDP Segment format

11.2 Ordering and Completion Table

The following table summarizes the ordering relationships that are
defined in Section 5.5, Ordering and Completions, from the
standpoint of the local peer issuing the two Operations. Note that
in the table that follows, Send includes Send, Send with
Invalidate, Send with Solicited Event, and Send with Solicited
Event and Invalidate.

------+-------+----------------+----------------+----------------
First | Later | Placement      | Placement      | Ordering
Op    | Op    | guarantee at   | guarantee at   | guarantee at
      |       | Remote Peer    | Local Peer     | Remote Peer
      |       |                |                |
------+-------+----------------+----------------+----------------
Send  | Send  | No placement   | Not applicable | Completed in
      |       | guarantee. If  |                | order.
      |       | guarantee is   |                |
      |       | necessary, see |                |
      |       | footnote 1.    |                |
------+-------+----------------+----------------+----------------
Send  | RDMA  | No placement   | Not applicable | Not applicable
      | Write | guarantee. If  |                |
      |       | guarantee is   |                |
      |       | necessary, see |                |
      |       | footnote 1.    |                |
------+-------+----------------+----------------+----------------
Send  | RDMA  | No placement   | RDMA Read      | RDMA Read
      | Read  | guarantee      | Response       | Response
      |       | between Send   | Payload will   | Message will
      |       | Payload and    | not be placed  | not be
      |       | RDMA Read      | at the local   | generated until
      |       | Request Header | peer until the | Send has been
      |       |                | Send Payload is| Completed
      |       |                | placed at the  |
      |       |                | remote peer    |
------+-------+----------------+----------------+----------------
RDMA  | Send  | No placement   | Not applicable | Not applicable
Write |       | guarantee. If  |                |
      |       | guarantee is   |                |
      |       | necessary, see |                |
      |       | footnote 1.    |                |
------+-------+----------------+----------------+----------------
RDMA  | RDMA  | No placement   | Not applicable | Not applicable
Write | Write | guarantee. If  |                |
      |       | guarantee is   |                |
      |       | necessary, see |                |
      |       | footnote 1.    |                |
------+-------+----------------+----------------+----------------
RDMA  | RDMA  | No placement   | RDMA Read      | Not applicable
Write | Read  | guarantee      | Response       |
      |       | between RDMA   | Payload will   |
      |       | Write Payload  | not be placed  |
      |       | and RDMA Read  | at the local   |
      |       | Request Header | peer until the |
      |       |                | RDMA Write     |
      |       |                | Payload is     |
      |       |                | placed at the  |
      |       |                | remote peer    |
------+-------+----------------+----------------+----------------
RDMA  | Send  | No placement   | Send Payload   | Not applicable
Read  |       | guarantee      | may be placed  |
      |       | between RDMA   | at the remote  |
      |       | Read Request   | peer before the|
      |       | Header and Send| RDMA Read      |
      |       | payload        | Response is    |
      |       |                | generated.     |
      |       |                | If guarantee is|
      |       |                | necessary, see |
      |       |                | footnote 2.    |
------+-------+----------------+----------------+----------------
RDMA  | RDMA  | No placement   | RDMA Write     | Not applicable
Read  | Write | guarantee      | Payload may be |
      |       | between RDMA   | placed at the  |
      |       | Read Request   | remote peer    |
      |       | Header and RDMA| before the RDMA|
      |       | Write payload  | Read Response  |
      |       |                | is generated.  |
      |       |                | If guarantee is|
      |       |                | necessary, see |
      |       |                | footnote 2.    |
------+-------+----------------+----------------+----------------
RDMA  | RDMA  | No placement   | No placement   | Second RDMA
Read  | Read  | guarantee of   | guarantee of   | Read Response
      |       | the two RDMA   | the two RDMA   | will not be
      |       | Read Request   | Read Response  | generated until
      |       | Headers.       | Payloads.      | first RDMA Read
      |       | Additionally,  |                | Response is
      |       | there is no    |                | generated.
      |       | guarantee that |                |
      |       | the Tagged     |                |
      |       | Buffers        |                |
      |       | referenced in  |                |
      |       | the RDMA Read  |                |
      |       | will be read in|                |
      |       | order.         |                |
------+-------+----------------+----------------+----------------
Figure 17 Operation Ordering

Footnote 1: If the guarantee is necessary, a ULP may insert an
RDMA Read Operation and wait for it to complete to act as a Fence.

Footnote 2: If the guarantee is necessary, a ULP may wait for the
RDMA Read Operation to complete before performing the Send.

12 Author's Address

Paul R. Culley
Hewlett-Packard Company
20555 SH 249
Houston, Tx. USA 77070-2698
Phone: 281-514-5543
Email: paul.culley@hp.com

Dave Garcia
Hewlett-Packard Company
19333 Vallco Parkway
Cupertino, Ca. USA 95014
Phone: 408.285.6116
Email: dave.garcia@hp.com

Jeff Hilland
Hewlett-Packard Company
20555 SH 249
Houston, Tx.
USA 77070-2698
Phone: 281-514-9489
Email: jeff.hilland@hp.com

Bernard Metzler
IBM Research GmbH
Zurich Research Laboratory
Saeumerstrasse 4
CH-8803 Rueschlikon, Switzerland
Phone: +41 44 724 8605
Email: bmt@zurich.ibm.com

Renato J. Recio
IBM Corp.
11501 Burnett Road
Austin, Tx. USA 78758
Phone: 512-838-3685
Email: recio@us.ibm.com

13 Contributors

Dwight Barron
Hewlett-Packard Company
20555 SH 249
Houston, Tx. USA 77070-2698
Phone: 281-514-2769
Email: dwight.barron@hp.com

Caitlin Bestler
Broadcom Corporation
16215 Alton Parkway
Irvine, CA. USA 92619-7013
Phone: 949-926-6383
Email: caitlinb@broadcom.com

John Carrier
Cray, Inc.
411 First Avenue S, Suite 600
Seattle, WA 98104-2860 USA
Phone: 206-701-2090
Email: carrier@cray.com

Ted Compton
EMC Corporation
Research Triangle Park, NC 27709, USA
Phone: 919-248-6075
Email: compton_ted@emc.com

Uri Elzur
Broadcom Corporation
16215 Alton Parkway
Irvine, California 92619-7013 USA
Phone: +1 (949) 585-6432
Email: Uri@Broadcom.com

Hari Ghadia
Adaptec, Inc.
691 S. Milpitas Blvd.,
Milpitas, CA 95035 USA
Phone: +1 (408) 957-5608
Email: hari_ghadia@adaptec.com

Howard C. Herbert
Intel Corporation
MS CH7-404
5000 West Chandler Blvd.
Chandler, Arizona 85226
Phone: 480-554-3116
Email: howard.c.herbert@intel.com

Mike Ko
IBM
650 Harry Rd.
San Jose, CA 95120
Phone: (408) 927-2085
Email: mako@us.ibm.com

Mike Krause
Hewlett-Packard Company
43LN
19410 Homestead Road
Cupertino, CA 95014 USA
Phone: 408-447-3191
Email: krause@cup.hp.com

Dave Minturn
Intel Corporation
MS JF1-210
5200 North East Elam Young Parkway
Hillsboro, Oregon 97124
Phone: 503-712-4106
Email: dave.b.minturn@intel.com

Mike Penna
Broadcom Corporation
16215 Alton Parkway
Irvine, California 92619-7013 USA
Phone: +1 (949) 926-7149
Email: MPenna@Broadcom.com

Jim Pinkerton
Microsoft, Inc.
One Microsoft Way
Redmond, WA, USA 98052
Email: jpink@microsoft.com

Hemal Shah
Broadcom Corporation
16215 Alton Parkway
Irvine, CA. USA 92619-7013
Phone: 949-926-6941
Email:

Allyn Romanow
Cisco Systems
170 W Tasman Drive
San Jose, CA 95134 USA
Phone: +1 408 525 8836
Email: allyn@cisco.com

Tom Talpey
Network Appliance
1601 Trapelo Road #16
Waltham, MA 02451 USA
Phone: +1 (781) 768-5329
Email: thomas.talpey@netapp.com

Patricia Thaler
Broadcom Corporation
16215 Alton Parkway
Irvine, CA. USA 92619-7013
Phone: +1-916-570-2707
Email: pthaler@broadcom.com

Jim Wendt
Hewlett-Packard Company
8000 Foothills Boulevard MS 5668
Roseville, CA 95747-5668 USA
Phone: +1 916 785 5198
Email: jim_wendt@hp.com

Madeline Vega
IBM
11400 Burnet Rd. Bld.45-2L-007
Austin, TX. USA 78758
Phone: 512-838-7739
Email: mvega1@us.ibm.com

Claudia Salzberg
IBM
11501 Burnet Rd. Bld.902-5B-014
Austin, TX. USA 78758
Phone: 512-838-5156
Email: salzberg@us.ibm.com

14 Intellectual Property Statement

The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed
to pertain to the implementation or use of the technology described
in this document or the extent to which any license under such
rights might or might not be available; nor does it represent that
it has made any independent effort to identify any such rights.
Information on the procedures with respect to rights in RFC
documents can be found in BCP 78 and BCP 79.

Copies of IPR disclosures made to the IETF Secretariat and any
assurances of licenses to be made available, or the result of an
attempt made to obtain a general license or permission for the use
of such proprietary rights by implementers or users of this
specification can be obtained from the IETF on-line IPR repository
at http://www.ietf.org/ipr.

The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary
rights that may cover technology that may be required to implement
this standard. Please address the information to the IETF at
ietf-ipr@ietf.org.

15 Full Copyright Statement

Copyright (C) The Internet Society (2006).

This document is subject to the rights, licenses and restrictions
contained in BCP 78, and except as set forth therein, the authors
retain all their rights.
This document and the information contained herein are provided on
an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND
THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT
THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR
ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A
PARTICULAR PURPOSE.