idnits 2.17.1 draft-ietf-rddp-rdmap-05.txt: -(361): Line appears to be too long, but this could be caused by non-ascii characters in UTF-8 encoding -(2424): Line appears to be too long, but this could be caused by non-ascii characters in UTF-8 encoding -(2949): Line appears to be too long, but this could be caused by non-ascii characters in UTF-8 encoding Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** It looks like you're using RFC 3978 boilerplate. You should update this to the boilerplate described in the IETF Trust License Policy document (see https://trustee.ietf.org/license-info), which is required now. -- Found old boilerplate from RFC 3978, Section 5.1.a on line 24. -- Found old boilerplate from RFC 3978, Section 5.1 on line 2918. -- Found old boilerplate from RFC 3978, Section 5.5 on line 2946. -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 2898. -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 2905. -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 2911. ** This document has an original RFC 3978 Section 5.4 Copyright Line, instead of the newer IETF Trust Copyright according to RFC 4748. ** This document has an original RFC 3978 Section 5.5 Disclaimer, instead of the newer disclaimer which includes the IETF Trust according to RFC 4748. ** The document uses RFC 3667 boilerplate or RFC 3978-like boilerplate instead of verbatim RFC 3978 boilerplate. After 6 May 2005, submission of drafts without verbatim RFC 3978 boilerplate is not accepted. The following non-3978 patterns matched text found in the document. That text should be removed or replaced: This document is an Internet-Draft and is subject to all provisions of Section 3 of RFC 3667. 
Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- ** An RFC 3978, Section 5.1 paragraph was found, but not on the first page, as required. == There are 4 instances of lines with non-ascii characters in the document. == No 'Intended status' indicated for this document; assuming Proposed Standard Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** There are 39 instances of too long lines in the document, the longest one being 12 characters in excess of 72. Miscellaneous warnings: ---------------------------------------------------------------------------- == In addition to RFC 3978, Section 5.5 boilerplate, a section with a similar start was also found: This document and the information contained herein is provided on an "AS IS" basis and ADAPTEC INC., AGILENT TECHNOLOGIES INC., BROADCOM CORPORATION, CISCO SYSTEMS INC., EMC CORPORATION, HEWLETT-PACKARD COMPANY, INTERNATIONAL BUSINESS MACHINES CORPORATION, INTEL CORPORATION, MICROSOFT CORPORATION, NETWORK APPLIANCE INC., THE INTERNET SOCIETY, AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not match the current year == The "Author's Address" (or "Authors' Addresses") section title is misspelled. == The document seems to lack the recommended RFC 2119 boilerplate, even if it appears to use RFC 2119 keywords. (The document does seem to have the reference to RFC 2119 which the ID-Checklist requires). 
-- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (July 17, 2005) is 6857 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Missing Reference: 'RFC2246' is mentioned on line 2315, but not defined ** Obsolete undefined reference: RFC 2246 (Obsoleted by RFC 4346) == Unused Reference: 'RFC2119' is defined on line 2409, but no explicit reference was found in the text == Unused Reference: 'VERBS' is defined on line 2424, but no explicit reference was found in the text == Unused Reference: 'RFC 2246' is defined on line 2447, but no explicit reference was found in the text ** Obsolete normative reference: RFC 2406 (Obsoleted by RFC 4303, RFC 4305) ** Obsolete normative reference: RFC 2407 (Obsoleted by RFC 4306) ** Obsolete normative reference: RFC 2409 (Obsoleted by RFC 4306) -- Possible downref: Normative reference to a draft: ref. 'VERBS' == Outdated reference: A later version (-07) exists of draft-ietf-rddp-ddp-03 == Outdated reference: A later version (-08) exists of draft-ietf-rddp-mpa-01 ** Obsolete normative reference: RFC 2960 (ref. 'SCTP') (Obsoleted by RFC 4960) ** Obsolete normative reference: RFC 793 (ref. 
'TCP') (Obsoleted by RFC 9293) == Outdated reference: A later version (-10) exists of draft-ietf-rddp-security-05 -- Obsolete informational reference (is this intentional?): RFC 2401 (Obsoleted by RFC 4301) -- Obsolete informational reference (is this intentional?): RFC 2246 (Obsoleted by RFC 4346) Summary: 12 errors (**), 0 flaws (~~), 13 warnings (==), 11 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Remote Direct Data Placement Work Group R. Recio 3 INTERNET DRAFT IBM Corporation 4 draft-ietf-rddp-rdmap-05.txt P. Culley 5 Hewlett-Packard Company 6 D. Garcia 7 Hewlett-Packard Company 8 J. Hilland 9 Hewlett-Packard Company 10 B. Metzler 11 IBM Corporation 13 Expires: January, 2006 July 17, 2005 15 An RDMA Protocol Specification 17 Status of this Memo 19 This document is an Internet-Draft and is subject to all 20 provisions of Section 3 of RFC 3667. By submitting this Internet- 21 Draft, each author represents that any applicable patent or other 22 IPR claims of which he or she is aware have been or will be 23 disclosed, and any of which he or she becomes aware will be 24 disclosed, in accordance with Section 6 of BCP 79. 26 Internet-Drafts are working documents of the Internet Engineering 27 Task Force (IETF), its areas, and its working groups. Note that 28 other groups may also distribute working documents as Internet- 29 Drafts. 31 Internet-Drafts are draft documents valid for a maximum of six 32 months and may be updated, replaced, or obsoleted by other 33 documents at any time. It is inappropriate to use Internet-Drafts 34 as reference material or to cite them other than as "work in 35 progress." 37 The list of current Internet-Drafts can be accessed at 38 http://www.ietf.org/1id-abstracts.html The list of Internet-Draft 39 Shadow Directories can be accessed at 40 http://www.ietf.org/shadow.html. 
42 Abstract 44 This document defines a Remote Direct Memory Access Protocol 45 (RDMAP) that operates over the Direct Data Placement Protocol (DDP 46 protocol). RDMAP provides read and write services directly to 47 applications and enables data to be transferred directly into ULP 48 Buffers without intermediate data copies. It also enables a kernel 49 bypass implementation.
51   Table of Contents
53   1 Introduction
54   1.1 Architectural Goals
55   1.2 Protocol Overview
56   1.3 RDMAP Layering
57   1.4 Specification Changes from the Last Version
58   2 Glossary
59   2.1 General
60   2.2 LLP
61   2.3 Direct Data Placement (DDP)
62   2.4 Remote Direct Memory Access (RDMA)
63   3 ULP and Transport Attributes
64   3.1 Transport Requirements & Assumptions
65   3.2 RDMAP Interactions with the ULP
66   4 Header Format
67   4.1 RDMAP Control and Invalidate STag Field
68   4.2 RDMA Message Definitions
69   4.3 RDMA Write Header
70   4.4 RDMA Read Request Header
71   4.5 RDMA Read Response Header
72   4.6 Send Header and Send with Solicited Event Header
73   4.7 Send with Invalidate Header and Send with SE and Invalidate
74       Header
75   4.8 Terminate Header
76   5 Data Transfer
77   5.1 RDMA Write Message
78   5.2 RDMA Read Operation
79   5.2.1 RDMA Read Request Message
80   5.2.2 RDMA Read Response Message
81   5.3 Send Message Type
82   5.4 Terminate Message
83   5.5 Ordering and Completions
84   6 RDMAP Stream Management
85   6.1 Stream Initialization
86   6.2 Stream Teardown
87   6.2.1 RDMAP Abortive Termination
88   7 RDMAP Error Management
89   7.1 RDMAP Error Surfacing
90   7.2 Errors Detected at the Remote Peer on Incoming RDMA Messages
91   8 Security
92   8.1 Security Model and general Assumptions
93   8.1.1 Attackable Resources
94   8.1.2 Types of Attackers and Types of Attacks
95   8.1.3 Trust and Resource Sharing
96   8.2 Summary of RDMAP specific Security Requirements
97   8.2.1 RDMAP (RNIC) Requirements
98   8.2.2 Privileged Resource Manager Requirements
99   8.3 Security Services for RDMAP
100   8.3.1 Available Security Services
101   8.3.2 Requirements for IPsec Services for RDMAP
102   9 IANA
103   10 References
104   10.1 Normative References
105   10.2 Informative References
106   11 Appendix
107   11.1 DDP Segment Formats for RDMA Messages
108   11.1.1 DDP Segment for RDMA Write
109   11.1.2 DDP Segment for RDMA Read Request
110   11.1.3 DDP Segment for RDMA Read Response
111   11.1.4 DDP Segment for Send and Send with Solicited Event
112   11.1.5 DDP Segment for Send with Invalidate and Send with SE and
113       Invalidate
114   11.1.6 DDP Segment for Terminate
115   11.2 Ordering and Completion Table
116   12 Authors Addresses
117   13 Acknowledgments
118   14 Intellectual Property Statement
119   15 IPR Disclosure Acknowledgement
120   16 Disclaimer
121   17 Full Copyright Statement
123   Table of Figures
125   Figure 1 RDMAP Layering
126   Figure 2 Example of MPA, DDP, and RDMAP Header Alignment over TCP
127   Figure 3 DDP Control, RDMAP Control, and Invalidate STag Fields
128   Figure 4 RDMA Usage of DDP Fields
129   Figure 5 RDMA Message Definitions
130   Figure 6 RDMA Read Request Header Format
131   Figure 7 Terminate Header Format
132   Figure 8 Terminate Control Field
133   Figure 9 Terminate Control Field Values
134   Figure 10 Error Type to RDMA Message Mapping
135   Figure 11 RDMA Write, DDP Segment format
136   Figure 12 RDMA Read Request, DDP Segment format
137   Figure 13 RDMA Read Response, DDP Segment format
138   Figure 14 Send and Send with Solicited Event, DDP Segment format
139   Figure 15 Send with Invalidate and Send with SE and Invalidate,
140       DDP Segment
141   Figure 16 Terminate, DDP Segment format
142   Figure 17 Operation Ordering
144 1 Introduction 146 Today, communications over TCP/IP typically require copy 147 operations, which add latency and consume significant CPU and 148 memory resources. The Remote Direct Memory Access Protocol 149 (RDMAP) enables removal of data copy operations and enables 150 reduction in latencies by allowing a local application to read or 151 write data on a remote computer's memory with minimal demands on 152 memory bus bandwidth and CPU processing overhead, while preserving 153 memory protection semantics. 155 RDMAP is layered on top of Direct Data Placement (DDP) and uses 156 the two Buffer Models available from DDP [DDP]. 158 1.1 Architectural Goals 160 RDMAP has been designed with the following high-level 161 architectural goals: 163 * Provide a data transfer operation that allows a Local Peer to 164 transfer up to 2^32 - 1 octets directly into a previously 165 advertised buffer (i.e. Tagged buffer) located at a Remote Peer 166 without requiring a copy operation. This is referred to as the 167 RDMA Write data transfer operation. 169 * Provide a data transfer operation that allows a Local Peer to 170 retrieve up to 2^32 - 1 octets directly from a previously 171 advertised buffer (i.e. Tagged buffer) located at a Remote Peer 172 without requiring a copy operation. This is referred to as the 173 RDMA Read data transfer operation. 
175 * Provide a data transfer operation that allows a Local Peer to 176 send up to 2^32 - 1 octets directly into a buffer located at a 177 Remote Peer that has not been explicitly advertised. This is 178 referred to as the Send (Send with Invalidate, Send with 179 Solicited Event, and Send with Solicited Event and Invalidate) 180 data transfer operation. 182 * Enable the local ULP to use the Send Operation Type (includes 183 Send, Send with Invalidate, Send with Solicited Event, and Send 184 with Solicited Event and Invalidate) to signal to the remote 185 ULP the Completion of all previous Messages initiated by the 186 local ULP. 188 * Provide for all Operations on a single RDMAP Stream to be 189 reliably transmitted in the order that they were submitted. 191 * Provide RDMAP capabilities independently for each Stream when 192 the LLP supports multiple data Streams within an LLP 193 connection. 195 1.2 Protocol Overview 197 RDMAP provides seven data transfer operations. Except for the RDMA 198 Read operation, each operation generates exactly one RDMA Message. 199 Following is a brief overview of the RDMA Operations and RDMA 200 Messages: 202 1. Send - A Send operation uses a Send Message to transfer data 203 from the Data Source into a buffer that has not been 204 explicitly Advertised by the Data Sink. The Send Message uses 205 the DDP Untagged Buffer Model to transfer the ULP Message into 206 the Data Sink's Untagged Buffer. 208 2. Send with Invalidate - A Send with Invalidate operation uses a 209 Send with Invalidate Message to transfer data from the Data 210 Source into a buffer that has not been explicitly Advertised 211 by the Data Sink. 
The Send with Invalidate Message includes 212 all functionality of the Send Message, with one addition: an 213 STag field is included in the Send With Invalidate Message and 214 after the message has been Placed and Delivered at the Data 215 Sink the remote peer's buffer identified by the STag can no 216 longer be accessed remotely until the remote peer's ULP re- 217 enables access and Advertises the buffer. 219 3. Send with Solicited Event (Send with SE) - A Send with 220 Solicited Event operation uses a Send with Solicited Event 221 Message to transfer data from the Data Source into an Untagged 222 Buffer at the Data Sink. The Send with Solicited Event Message 223 is similar to the Send Message, with one addition: when the 224 Send with Solicited Event Message has been Placed and 225 Delivered, an Event may be generated at the recipient, if the 226 recipient is configured to generate such an Event. 228 4. Send with Solicited Event and Invalidate (Send with SE and 229 Invalidate) - A Send with Solicited Event and Invalidate 230 operation uses a Send with Solicited Event and Invalidate 231 Message to transfer data from the Data Source into a buffer 232 that has not been explicitly Advertised by the Data Sink. The 233 Send with Solicited Event and Invalidate Message is similar to 234 the Send with Invalidate Message, with one addition: when the 235 Send with Solicited Event and Invalidate Message has been 236 Placed and Delivered, an Event may be generated at the 237 recipient, if the recipient is configured to generate such an 238 Event. 240 5. Remote Direct Memory Access Write - An RDMA Write operation 241 uses an RDMA Write Message to transfer data from the Data 242 Source to a previously advertised buffer at the Data Sink. 
244 The ULP at the Remote Peer, which in this case is the Data 245 Sink, enables the Data Sink Tagged Buffer for access and 246 Advertises the buffer's size (length), location (Tagged 247 Offset), and Steering Tag (STag) to the Data Source through a 248 ULP specific mechanism. The ULP at the Local Peer, which in 249 this case is the Data Source, initiates the RDMA Write 250 operation. The RDMA Write Message uses the DDP Tagged Buffer 251 Model to transfer the ULP Message into the Data Sink's Tagged 252 Buffer. Note: the STag associated with the Tagged Buffer 253 remains valid until the ULP at the Remote Peer invalidates it 254 or the ULP at the Local Peer invalidates it through a Send 255 with Invalidate or Send with Solicited Event and Invalidate. 257 6. Remote Direct Memory Access Read - The RDMA Read operation 258 transfers data to a Tagged Buffer at the Local Peer, which in 259 this case is the Data Sink, from a Tagged Buffer at the Remote 260 Peer, which in this case is the Data Source. The ULP at the 261 Data Source enables the Data Source Tagged Buffer for access 262 and Advertises the buffer's size (length), location (Tagged 263 Offset), and Steering Tag (STag) to the Data Sink through a 264 ULP specific mechanism. The ULP at the Data Sink enables the 265 Data Sink Tagged Buffer for access and initiates the RDMA Read 266 operation. The RDMA Read operation consists of a single RDMA 267 Read Request Message and a single RDMA Read Response Message, 268 and the latter may be segmented into multiple DDP Segments. 270 The RDMA Read Request Message uses the DDP Untagged Buffer 271 Model to Deliver the STag, starting Tagged Offset and length 272 for both the Data Source and Data Sink Tagged Buffers to the 273 remote peer's RDMA Read Request Queue. 275 The RDMA Read Response Message uses the DDP Tagged Buffer 276 Model to Deliver the Data Source's Tagged Buffer to the Data 277 Sink, without any involvement from the ULP at the Data Source. 
279 Note: the Data Source STag associated with the Tagged Buffer 280 remains valid until the ULP at the Data Source invalidates it 281 or the ULP at the Data Sink invalidates it through a Send with 282 Invalidate or Send with Solicited Event and Invalidate. The 283 Data Sink STag associated with the Tagged Buffer remains valid 284 until the ULP at the Data Sink invalidates it. 286 7. Terminate - A Terminate operation uses a Terminate Message to 287 transfer to the Remote Peer information associated with an 288 error that occurred at the Local Peer. The Terminate Message 289 uses the DDP Untagged Buffer Model to transfer the Message 290 into the Data Sink's Untagged Buffer. 292 1.3 RDMAP Layering 294 RDMAP is dependent on DDP, subject to the requirements defined in 295 section 3.1 Transport Requirements & Assumptions. Figure 1 RDMAP 296 Layering depicts the relationship between Upper Layer Protocols 297 (ULPs), RDMAP, DDP protocol, the framing layer, and the transport. 298 For LLP protocol definitions of each LLP, see [MPA], [TCP], and 299 [SCTP]. 
301   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
302   |                                   |
303   |    Upper Layer Protocol (ULP)     |
304   |                                   |
305   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
306   |                                   |
307   |               RDMAP               |
308   |                                   |
309   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
310   |                                   |
311   |           DDP protocol            |
312   |                                   |
313   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
314   |                 |                 |
315   |       MPA       |                 |
316   |                 |                 |
317   +-+-+-+-+-+-+-+-+-+      SCTP       |
318   |                 |                 |
319   |       TCP       |                 |
320   |                 |                 |
321   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
322   Figure 1 RDMAP Layering
324 If RDMAP is layered over DDP/MPA/TCP, then the respective headers 325 and ULP Payload are arranged as follows (Note: For clarity, MPA 326 header and CRC fields are included but MPA markers are not shown):
328    0                   1                   2                   3
329    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
330   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
331   |                                                               |
332   //                         TCP Header                          //
333   |                                                               |
334   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
335   |          MPA Header           |                               |
336   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               +
337   |                                                               |
338   //                         DDP Header                          //
339   |                                                               |
340   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
341   |                                                               |
342   //                         RDMA Header                         //
343   |                                                               |
344   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
345   |                                                               |
346   //                         ULP Payload                         //
347   |                   (shown with no pad bytes)                   |
348   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
349   |                            MPA CRC                            |
350   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
351   Figure 2 Example of MPA, DDP, and RDMAP Header Alignment over TCP
353 1.4 Specification Changes from the Last Version 355 This section is to be removed before RFC publication. 357 The following major changes (vs typos) were made to the -05 358 version: 360 * To pass the IETF checklist tool, modified heading of Security 361 Section 8 to "Security" and added "Security Considerations" 362 below it. 364 * Added IANA Section 9 and to pass the IETF checklist tool added 365 "IANA Considerations" line below Section 9 header. 
367 * Added Intellectual Property Statement Section 14 and IPR 368 Disclosure Acknowledgement Section 15. 370 * Added Disclaimer Section 16. 372 * Section 6.8 - 373 - Acknowledged that the Reserved field size for the 374 Terminate Message is 13 bits. The fix was made to the -04 375 version, but was not listed in this section. 377 The following major changes (vs typos) were made to the -04 378 version: 380 * Section 10 - 381 - Expanded IPsec requirements sentence in section 382 10.3.2 to say what is required in addition to cross-referencing 383 RFC 3723. 385 * Section 6.8 - Fixed text after Figure 9 to reflect the correct 386 size (13 bits) of the Reserved field in the Terminate Message. 388 The following major changes (vs typos) were made to the -03 389 version: 391 * Section 6.1 - Added normative text describing downward 392 compatibility with version 0. 394 * Section 6.8 - Changed the description of the reserved field 395 size to match the size in the figure, which is 13 bits. 397 * Section 10 - 398 - Aligned security section closely to [RDMASEC] and 399 added normative text for security requirements. 401 The following major changes (vs typos) were made to the -02 402 version: 404 * Section 6.8 - 405 - Explicitly defined the bit numbers for the three 406 header control bits. 408 * Section 8.1 - 409 - Stated the typical Stream initialization to be: 410 RDMA mode is entered some time after the LLP Stream is 411 initialized. 413 * Section 10 - 414 - Update reference to security document. 416 * Section 10 - 417 - Fixed Send with Solicited Event and Invalidate 418 reference. 420 * Section 12.1 - 421 - MPA and DDP references were changed to reflect 422 the released specifications and accurate titles. 424 * Section 12.1 - 425 - Reference for RDMA Protocol Verbs was changed to 426 reflect the released specification and accurate title. 
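The Protocol Overview in Section 1.2 pairs each of the seven operations with the RDMA Message(s) it generates and the DDP Buffer Model each Message uses. The following Python sketch restates that pairing as a lookup table. It is illustrative only: the names BufferModel, OPERATIONS, INVALIDATING, MAX_TRANSFER_OCTETS, and messages_for are assumptions made for this example and are not defined by this specification, and no wire encodings are implied.

```python
from enum import Enum

# Upper bound on a single transfer, from the goals in Section 1.1.
MAX_TRANSFER_OCTETS = 2**32 - 1


class BufferModel(Enum):
    """The two DDP Buffer Models that RDMAP uses."""
    TAGGED = "Tagged"
    UNTAGGED = "Untagged"


# Operation name -> tuple of (RDMA Message name, Buffer Model), per Section 1.2.
# Only RDMA Read generates more than one Message.
OPERATIONS = {
    "Send": (("Send", BufferModel.UNTAGGED),),
    "Send with Invalidate": (("Send with Invalidate", BufferModel.UNTAGGED),),
    "Send with Solicited Event": (
        ("Send with Solicited Event", BufferModel.UNTAGGED),
    ),
    "Send with Solicited Event and Invalidate": (
        ("Send with Solicited Event and Invalidate", BufferModel.UNTAGGED),
    ),
    "RDMA Write": (("RDMA Write", BufferModel.TAGGED),),
    "RDMA Read": (
        ("RDMA Read Request", BufferModel.UNTAGGED),
        ("RDMA Read Response", BufferModel.TAGGED),
    ),
    "Terminate": (("Terminate", BufferModel.UNTAGGED),),
}

# Operations whose Message carries an STag to be invalidated at the Data Sink.
INVALIDATING = {
    "Send with Invalidate",
    "Send with Solicited Event and Invalidate",
}


def messages_for(operation):
    """Return the (Message, Buffer Model) pairs generated by an operation."""
    return OPERATIONS[operation]


if __name__ == "__main__":
    for name, messages in OPERATIONS.items():
        parts = ", ".join(f"{msg} [{model.value}]" for msg, model in messages)
        print(f"{name}: {parts}")
```

A table of this kind makes the one exception in the overview easy to see: every operation maps to a single Message except RDMA Read, which maps to an Untagged request followed by a Tagged response.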
428 2 Glossary 430 2.1 General 432 Advertisement (Advertised, Advertise, Advertisements, Advertises) 433 - the act of informing a Remote Peer that a local RDMA Buffer 434 is available to it. A Node makes available an RDMA Buffer for 435 incoming RDMA Read or RDMA Write access by informing its 436 RDMA/DDP peer of the Tagged Buffer identifiers (STag, base 437 address, and buffer length). This advertisement of Tagged 438 Buffer information is not defined by RDMA/DDP and is left to 439 the ULP. A typical method would be for the Local Peer to embed 440 the Tagged Buffer's Steering Tag, base address, and length in 441 a Send Message destined for the Remote Peer. 443 Data Sink - The peer receiving a data payload. Note that the Data 444 Sink can be required to both send and receive RDMA/DDP 445 Messages to transfer a data payload. 447 Data Source - The peer sending a data payload. Note that the Data 448 Source can be required to both send and receive RDMA/DDP 449 Messages to transfer a data payload. 451 Data Delivery (Delivery, Delivered, Delivers) - Delivery is 452 defined as the process of informing the ULP or consumer that a 453 particular Message is available for use. This is specifically 454 different from "Placement", which may generally occur in any 455 order, while the order of "Delivery" is strictly defined. See 456 "Data Placement". 458 Fabric - The collection of links, switches, and routers that 459 connect a set of Nodes with RDMA/DDP protocol implementations. 461 Fence (Fenced, Fences) - To block the current RDMA Operation from 462 executing until prior RDMA Operations have Completed. 464 iWARP - A suite of wire protocols comprised of RDMAP, DDP, and 465 MPA. The iWARP protocol suite may be layered above TCP, SCTP, 466 or other transport protocols. 468 Local Peer - The RDMA/DDP protocol implementation on the local end 469 of the connection. Used to refer to the local entity when 470 describing a protocol exchange or other interaction between 471 two Nodes. 
473 Node - A computing device attached to one or more links of a 474 Fabric (network). A Node in this context does not refer to a 475 specific application or protocol instantiation running on the 476 computer. A Node may consist of one or more RNICs installed in 477 a host computer. 479 Remote Peer - The RDMA/DDP protocol implementation on the opposite 480 end of the connection. Used to refer to the remote entity when 481 describing protocol exchanges or other interactions between 482 two Nodes. 484 RNIC - RDMA Network Interface Controller. In this context, this 485 would be a network I/O adapter or embedded controller with 486 iWARP and verbs functionality. 488 RNIC Interface (RI) - The presentation of the RNIC to the verbs 489 Consumer as implemented through the combination of the RNIC 490 and the RNIC driver. 492 ULP - Upper Layer Protocol. The protocol layer above the protocol 493 layer currently being referenced. The ULP for RDMA/DDP is 494 expected to be an OS, Application, adaptation layer, or 495 proprietary device. The RDMA/DDP documents do not specify a 496 ULP - they provide a set of semantics that allow a ULP to be 497 designed to utilize RDMA/DDP. 499 ULP Payload - The ULP data that is contained within a single 500 protocol segment or packet (e.g. a DDP Segment). 502 Verbs - An abstract description of the functionality of a RNIC 503 Interface. The OS may expose some or all of this functionality 504 via one or more APIs to applications. The OS will also use 505 some of the functionality to manage the RNIC Interface. 507 2.2 LLP 509 LLP - Lower Layer Protocol. The protocol layer beneath the 510 protocol layer currently being referenced. For example, for 511 DDP the LLP is SCTP, MPA, or other transport protocols. For 512 RDMA, the LLP is DDP. 514 LLP Connection - Corresponds to an LLP transport-level connection 515 between the peer LLP layers on two nodes. 
517 LLP Stream - Corresponds to a single LLP transport-level Stream 518 between the peer LLP layers on two Nodes. One or more LLP 519 Streams may map to a single transport-level LLP connection. 520 For transport protocols that support multiple Streams per 521 connection (e.g. SCTP), a LLP Stream corresponds to one 522 transport-level Stream. 524 MULPDU - Maximum ULPDU. The current maximum size of the record 525 that is acceptable for DDP to pass to the LLP for 526 transmission. 528 ULPDU - Upper Layer Protocol Data Unit. The data record defined 529 by the layer above MPA. 531 2.3 Direct Data Placement (DDP) 533 Data Placement (Placement, Placed, Places) - For DDP, this term is 534 specifically used to indicate the process of writing to a data 535 buffer by a DDP implementation. DDP Segments carry Placement 536 information, which may be used by the receiving DDP 537 implementation to perform Data Placement of the DDP Segment 538 ULP Payload. See "Data Delivery". 540 DDP Abortive Teardown - The act of closing a DDP Stream without 541 attempting to Complete in-progress and pending DDP Messages. 543 DDP Graceful Teardown - The act of closing a DDP Stream such that 544 all in-progress and pending DDP Messages are allowed to 545 Complete successfully. 547 DDP Control Field - a fixed 16-bit field in the DDP Header. The 548 DDP Control Field contains an 8-bit field whose contents are 549 reserved for use by the ULP. 551 DDP Header - The header present in all DDP segments. The DDP 552 Header contains control and Placement fields that are used to 553 define the final Placement location for the ULP payload 554 carried in a DDP Segment. 556 DDP Message - A ULP defined unit of data interchange, which is 557 subdivided into one or more DDP segments. This segmentation 558 may occur for a variety of reasons, including segmentation to 559 respect the maximum segment size of the underlying transport 560 protocol. 
562 DDP Segment - The smallest unit of data transfer for the DDP 563 protocol. It includes a DDP Header and ULP Payload (if 564 present). A DDP Segment should be sized to fit within the 565 underlying transport protocol MULPDU. 567 DDP Stream - a sequence of DDP Messages whose ordering is defined 568 by the LLP. For SCTP, a DDP Stream maps directly to an SCTP 569 Stream. For MPA, a DDP Stream maps directly to a TCP 570 connection and a single DDP Stream is supported. Note that 571 DDP has no ordering guarantees between DDP Streams. 573 Direct Data Placement - A mechanism whereby ULP data contained 574 within DDP Segments may be Placed directly into its final 575 destination in memory without processing of the ULP. This may 576 occur even when the DDP Segments arrive out of order. Out of 577 order Placement support may require the Data Sink to implement 578 the LLP and DDP as one functional block. 580 Direct Data Placement Protocol (DDP) - Also, a wire protocol that 581 supports Direct Data Placement by associating explicit memory 582 buffer placement information with the LLP payload units. 584 Message Offset (MO) - For the DDP Untagged Buffer Model, specifies 585 the offset, in bytes, from the start of a DDP Message. 587 Message Sequence Number (MSN) - For the DDP Untagged Buffer Model, 588 specifies a sequence number that is increasing with each DDP 589 Message. 591 Queue Number (QN) - For the DDP Untagged Buffer Model, identifies 592 a destination Data Sink queue for a DDP Segment. 594 Steering Tag - An identifier of a Tagged Buffer on a Node, valid 595 as defined within a protocol specification. 597 STag - Steering Tag 598 Tagged Buffer - A buffer that is explicitly Advertised to the 599 Remote Peer through exchange of an STag, Tagged Offset, and 600 length. 602 Tagged Buffer Model - A DDP data transfer model used to transfer 603 Tagged Buffers from the Local Peer to the Remote Peer. 605 Tagged DDP Message - A DDP Message that targets a Tagged Buffer. 
   Tagged Offset (TO) - The offset within a Tagged Buffer on a Node.

   Untagged Buffer - A buffer that is not explicitly Advertised to
       the Remote Peer.

   Untagged Buffer Model - A DDP data transfer model used to transfer
       Untagged Buffers from the Local Peer to the Remote Peer.

   Untagged DDP Message - A DDP Message that targets an Untagged
       Buffer.

2.4 Remote Direct Memory Access (RDMA)

   Event - An indication provided by the RDMAP Layer to the ULP to
       indicate a Completion or other condition requiring immediate
       attention.

   Invalidate STag - A mechanism used to prevent the Remote Peer from
       reusing a previously Advertised STag until the Local Peer
       makes it available through a subsequent explicit
       Advertisement. The STag cannot be accessed remotely until it
       is explicitly Advertised again.

   RDMA Completion (Completion, Completed, Complete, Completes) - For
       RDMA, Completion is defined as the process of informing the
       ULP that a particular RDMA Operation has performed all
       functions specified for the RDMA Operation, including
       Placement and Delivery. The Completion semantic of each RDMA
       Operation is distinctly defined.

   RDMA Message - A data transfer mechanism used to fulfill an RDMA
       Operation.

   RDMA Operation - A sequence of RDMA Messages, including control
       Messages, to transfer data from a Data Source to a Data Sink.
       The following RDMA Operations are defined: RDMA Write, RDMA
       Read, Send, Send with Invalidate, Send with Solicited Event,
       Send with Solicited Event and Invalidate, and Terminate.

   RDMA Protocol (RDMAP) - A wire protocol that supports RDMA
       Operations to transfer ULP data between a Local Peer and the
       Remote Peer.

   RDMAP Abortive Termination (Termination, Terminated, Terminate,
       Terminates) - The act of closing an RDMAP Stream without
       attempting to Complete in-progress and pending RDMA
       Operations.
   RDMAP Graceful Termination - The act of closing an RDMAP Stream
       such that all in-progress and pending RDMA Operations are
       allowed to Complete successfully.

   RDMA Read - An RDMA Operation used by the Data Sink to transfer
       the contents of a source RDMA buffer from the Remote Peer to
       the Local Peer. An RDMA Read operation consists of a single
       RDMA Read Request Message and a single RDMA Read Response
       Message.

   RDMA Read Request - An RDMA Message used by the Data Sink to
       request the Data Source to transfer the contents of an RDMA
       buffer. The RDMA Read Request Message describes both the Data
       Source and Data Sink RDMA buffers.

   RDMA Read Request Queue - The queue used for processing RDMA Read
       Requests. The RDMA Read Request Queue has a DDP Queue Number
       of 1.

   RDMA Read Response - An RDMA Message used by the Data Source to
       transfer the contents of an RDMA buffer to the Data Sink, in
       response to an RDMA Read Request. The RDMA Read Response
       Message describes only the Data Sink RDMA buffer.

   RDMAP Stream - An association between a pair of RDMAP
       implementations, possibly on different Nodes, which transfer
       ULP data using RDMA Operations. There may be multiple RDMAP
       Streams on a single Node. An RDMAP Stream maps directly to a
       single DDP Stream.

   RDMA Write - An RDMA Operation that transfers the contents of a
       source RDMA Buffer from the Local Peer to a destination RDMA
       Buffer at the Remote Peer using RDMA. The RDMA Write Message
       describes only the Data Sink RDMA buffer.

   Remote Direct Memory Access (RDMA) - A method of accessing memory
       on a remote system in which the local system specifies the
       remote location of the data to be transferred. Employing an
       RNIC in the remote system allows the access to take place
       without interrupting the processing of the CPU(s) on the
       system.
   Send - An RDMA Operation that transfers the contents of a ULP
       Buffer from the Local Peer to an Untagged Buffer at the Remote
       Peer.

   Send Message Type - A Send Message, Send with Invalidate Message,
       Send with Solicited Event Message, or Send with Solicited
       Event and Invalidate Message.

   Send Operation Type - A Send Operation, Send with Invalidate
       Operation, Send with Solicited Event Operation, or Send with
       Solicited Event and Invalidate Operation.

   Solicited Event (SE) - A facility by which an RDMA Operation
       sender may cause an Event to be generated at the recipient, if
       the recipient is configured to generate such an Event, when a
       Send with Solicited Event or Send with Solicited Event and
       Invalidate Message is received. Note: The Local Peer's ULP
       can use the Solicited Event mechanism to ensure that Messages
       designated as important to the ULP are handled in an
       expeditious manner by the Remote Peer's ULP. The ULP at the
       Local Peer can indicate a given Send Message Type is important
       by using the Send with Solicited Event Message or Send with
       Solicited Event and Invalidate Message. The ULP at the Remote
       Peer can choose to only be notified when valid Send with
       Solicited Event Messages and/or Send with Solicited Event and
       Invalidate Messages arrive and handle other valid incoming
       Send Messages or Send with Invalidate Messages at its leisure.

   Terminate - An RDMA Message used by a Node to pass an error
       indication to the peer Node on an RDMAP Stream. This operation
       is for RDMAP use only.

   ULP Buffer - A buffer owned above the RDMAP Layer and advertised
       to the RDMAP Layer either as a Tagged Buffer or an Untagged
       ULP Buffer.

   ULP Message - The ULP data that is handed to a specific protocol
       layer for transmission. Data boundaries are preserved as they
       are transmitted through iWARP.
3 ULP and Transport Attributes

3.1 Transport Requirements & Assumptions

   RDMAP MUST be layered on top of the Direct Data Placement Protocol
   [DDP].

   RDMAP requires the following DDP support:

   *  RDMAP uses three queues for Untagged Buffers:

      *  Queue Number 0 (used by RDMAP for Send, Send with
         Invalidate, Send with Solicited Event, and Send with
         Solicited Event and Invalidate operations).

      *  Queue Number 1 (used by RDMAP for RDMA Read operations).

      *  Queue Number 2 (used by RDMAP for Terminate operations).

   *  DDP maps a single RDMA Message to a single DDP Message.

   *  DDP uses the STag and Tagged Offset provided by the RDMAP for
      Tagged Buffer Messages (i.e. RDMA Write and RDMA Read
      Response).

   *  When the DDP layer Delivers an Untagged DDP Message to the
      RDMAP layer, DDP provides the length of the DDP Message. This
      ensures that RDMAP does not have to carry a length field in its
      header.

   *  When the RDMAP layer provides an RDMA Message to the DDP Layer,
      DDP must insert the RsvdULP field value provided by the RDMAP
      Layer into the associated DDP Message.

   *  When the DDP layer Delivers a DDP Message to the RDMAP layer,
      DDP provides the RsvdULP field.

   *  The RsvdULP field must be 1 octet for DDP Tagged Messages and 5
      octets for DDP Untagged Messages.

   *  DDP propagates to RDMAP all operation or protection errors
      (used by RDMAP Terminate) and, when appropriate, the DDP Header
      fields of the DDP Segment that encountered the error.

   *  If an RDMA Operation is aborted by DDP or a lower layer, the
      contents of the Data Sink buffers associated with the operation
      are considered indeterminate.

   *  DDP in conjunction with the lower layers provides reliable,
      in-order Delivery.
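   The queue and RsvdULP requirements listed in Section 3.1 can be
   sketched as a small lookup. This is an illustration only: the
   message-type strings and the function name ddp_parameters are
   hypothetical and are not defined by this specification; the queue
   numbers and RsvdULP sizes are the ones stated above.

```python
# Hypothetical message-type names; queue numbers and RsvdULP sizes are
# the values listed in Section 3.1.
SEND_QUEUE = 0          # Send, Send with Invalidate, Send with SE, ...
READ_REQUEST_QUEUE = 1  # RDMA Read (Request)
TERMINATE_QUEUE = 2     # Terminate

UNTAGGED_QUEUE = {
    "send": SEND_QUEUE,
    "send_with_invalidate": SEND_QUEUE,
    "send_with_se": SEND_QUEUE,
    "send_with_se_and_invalidate": SEND_QUEUE,
    "rdma_read_request": READ_REQUEST_QUEUE,
    "terminate": TERMINATE_QUEUE,
}

# Tagged Buffer Messages named in Section 3.1.
TAGGED_MESSAGES = {"rdma_write", "rdma_read_response"}


def ddp_parameters(message_type):
    """Return (tagged, queue_number, rsvdulp_octets) for an RDMA Message.

    queue_number is None for Tagged Messages, which do not use a queue.
    """
    if message_type in TAGGED_MESSAGES:
        return True, None, 1   # RsvdULP is 1 octet on Tagged Messages
    if message_type in UNTAGGED_QUEUE:
        return False, UNTAGGED_QUEUE[message_type], 5  # 5 octets Untagged
    raise ValueError(f"unknown RDMA Message type: {message_type}")
```

   An implementation would more likely key such a table on the RDMA
   Message OpCode than on strings; strings are used here only to keep
   the sketch readable.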
3.2 RDMAP Interactions with the ULP

   RDMAP provides the ULP with access to the following RDMA
   Operations as defined in this specification:

   *  Send

   *  Send with Solicited Event

   *  Send with Invalidate

   *  Send with Solicited Event and Invalidate

   *  RDMA Write

   *  RDMA Read

   For Send Operation Types, the following are the interactions
   between the RDMAP Layer and the ULP:

   *  At the Data Source:

      *  The ULP passes to the RDMAP Layer the following:

         *  ULP Message Length

         *  ULP Message

         *  An indication of the Send Operation Type, where the
            valid types are: Send, Send with Solicited Event, Send
            with Invalidate, or Send with Solicited Event and
            Invalidate.

         *  An Invalidate STag, if the Send Operation Type was
            Send with Invalidate or Send with Solicited Event and
            Invalidate.

      *  When the Send Operation Type Completes, an indication of
         the Completion results.

   *  At the Data Sink:

      *  If the Send Operation Type Completed successfully, the
         RDMAP Layer passes the following information to the ULP
         Layer:

         *  ULP Message Length

         *  ULP Message

         *  An Event, if the Data Sink is configured to generate
            an Event.

         *  An Invalidated STag, if the Send Operation Type was
            Send with Invalidate or Send with Solicited Event and
            Invalidate.

      *  If the Send Operation Type Completed in error, the Data
         Sink RDMAP Layer will pass up the corresponding error
         information to the Data Sink ULP and send a Terminate
         Message to the Data Source RDMAP Layer. The Data Source
         RDMAP Layer will then pass up the Terminate Message to the
         ULP.
   For RDMA Write Operations, the following are the interactions
   between the RDMAP Layer and the ULP:

   *  At the Data Source:

      *  The ULP passes to the RDMAP Layer the following:

         *  ULP Message Length

         *  ULP Message

         *  Data Sink STag

         *  Data Sink Tagged Offset

      *  When the RDMA Write Operation Completes, an indication of
         the Completion results.

   *  At the Data Sink:

      *  If the RDMA Write completed successfully, the RDMAP Layer
         does not Deliver the RDMA Write to the ULP. It does Place
         the ULP Message transferred through the RDMA Write Message
         into the ULP Buffer.

      *  If the RDMA Write completed in error, the Data Sink RDMAP
         Layer will pass up the corresponding error information to
         the Data Sink ULP and send a Terminate Message to the Data
         Source RDMAP Layer. The Data Source RDMAP Layer will then
         pass up the Terminate Message to the ULP.

   For RDMA Read Operations, the following are the interactions
   between the RDMAP Layer and the ULP:

   *  At the Data Sink:

      *  The ULP passes to the RDMAP Layer the following:

         *  ULP Message Length

         *  Data Source STag

         *  Data Sink STag

         *  Data Source Tagged Offset

         *  Data Sink Tagged Offset

      *  When the RDMA Read Operation Completes, an indication of
         the Completion results.

   *  At the Data Source:

      *  If no error occurred while processing the RDMA Read
         Request, the Data Source will not pass up any information
         to the ULP.

      *  If an error occurred while processing the RDMA Read
         Request, the Data Source RDMAP Layer will pass up the
         corresponding error information to the Data Source ULP and
         send a Terminate Message to the Data Sink RDMAP Layer. The
         Data Sink RDMAP Layer will then pass up the Terminate
         Message to the ULP.
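   The information that the ULP hands to the RDMAP Layer for Send,
   RDMA Write, and RDMA Read Operations can be pictured as three
   request records. The sketch below is illustrative only: this
   specification lists the information exchanged, not a programming
   interface, so every class, field, and type-string name here is an
   assumption.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical names for the four Send Operation Types.
INVALIDATE_TYPES = {"send_with_invalidate", "send_with_se_and_invalidate"}
SEND_TYPES = {"send", "send_with_se"} | INVALIDATE_TYPES


@dataclass
class SendRequest:
    """What the Data Source ULP passes for a Send Operation Type."""
    op_type: str
    ulp_message: bytes
    invalidate_stag: Optional[int] = None

    def __post_init__(self):
        if self.op_type not in SEND_TYPES:
            raise ValueError(f"not a Send Operation Type: {self.op_type}")
        needs_stag = self.op_type in INVALIDATE_TYPES
        if needs_stag != (self.invalidate_stag is not None):
            # An Invalidate STag accompanies only the Invalidate types.
            raise ValueError("Invalidate STag does not match operation type")

    @property
    def ulp_message_length(self) -> int:
        return len(self.ulp_message)


@dataclass
class RdmaWriteRequest:
    """What the Data Source ULP passes for an RDMA Write."""
    ulp_message: bytes
    sink_stag: int
    sink_to: int


@dataclass
class RdmaReadRequest:
    """What the Data Sink ULP passes for an RDMA Read."""
    ulp_message_length: int
    src_stag: int
    sink_stag: int
    src_to: int
    sink_to: int
```

   Note that the RDMA Read record carries a length but no ULP Message,
   since the data flows from the Data Source back to the Data Sink.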
   For STags made available to the RDMAP Layer, the following are the
   interactions between the RDMAP Layer and the ULP:

   *  If the ULP enables an STag, the ULP passes to the RDMAP Layer
      the:

      *  STag;

      *  range of Tagged Offsets that are associated with a given
         STag;

      *  remote access rights (read, write, or read and write)
         associated with a given, valid STag; and

      *  association between a given STag and a given RDMAP Stream.

   *  If the ULP disables an STag, the ULP passes to the RDMAP Layer
      the STag.

   If an error occurs at the RDMAP Layer, the RDMAP Layer may pass
   back error information (e.g. the content of a Terminate Message)
   to the ULP.

4 Header Format

   The control information of RDMA Messages is included in DDP
   protocol defined header fields, with the following exceptions:

   *  The first octet reserved for ULP usage on all DDP Messages in
      the DDP Protocol (i.e. the RsvdULP Field) is used by RDMAP to
      carry the RDMA Message Opcode and the RDMAP version. This octet
      is known as the RDMAP Control Field in this specification. For
      Send with Invalidate and Send with Solicited Event and
      Invalidate, RDMAP uses the second through fifth octets provided
      by DDP on Untagged DDP Messages to carry the STag that will be
      Invalidated.

   *  The RDMA Message length is passed by the RDMAP layer to the DDP
      layer on all outbound transfers.

   *  For RDMA Read Request Messages, the RDMA Read Message Size is
      included in the RDMA Read Request Header.

   *  The RDMA Message length is passed to the RDMAP Layer by the DDP
      layer on inbound Untagged Buffer transfers.

   *  Two RDMA Messages carry additional RDMAP headers. The RDMA Read
      Request carries the Data Sink and Data Source buffer
      descriptions, including buffer length. The Terminate carries
      additional information associated with the error that caused
      the Terminate.
4.1 RDMAP Control and Invalidate STag Field

   The version of RDMAP defined by this specification uses all 8 bits
   of the RDMAP Control Field. The first octet reserved for ULP use
   in the DDP Protocol MUST be used by the RDMAP to carry the RDMAP
   Control Field. The ordering of the bits in the first octet MUST be
   as defined in Figure 3 DDP Control, RDMAP Control, and Invalidate
   STag Field. For Send with Invalidate and Send with Solicited Event
   and Invalidate, the second through fifth octets of the DDP RsvdULP
   field MUST be used by RDMAP to carry the Invalidate STag. Figure 3
   DDP Control, RDMAP Control, and Invalidate STag Field depicts the
   format of the DDP Control and RDMAP Control fields. (Note: In
   Figure 3 DDP Control, RDMAP Control, and Invalidate STag Field,
   the DDP Header is offset by 16 bits to accommodate the MPA header
   defined in [MPA]. The MPA header is only present if DDP is layered
   on top of MPA.)

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
                                   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                                   |T|L| Resrv | DV| RV|Rsv| Opcode|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     Invalidate STag                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     Figure 3 DDP Control, RDMAP Control, and Invalidate STag Fields

   All RDMA Messages handed by the RDMAP Layer to the DDP layer MUST
   define the value of the Tagged flag in the DDP Header. Figure 4
   RDMA Usage of DDP Fields MUST be used to define the value of the
   Tagged flag that is handed to the DDP Layer for each RDMA Message.

   Figure 4 RDMA Usage of DDP Fields defines the value of the RDMA
   Opcode field that MUST be used for each RDMA Message.

   Figure 4 RDMA Usage of DDP Fields defines when the STag, Queue
   Number, and Tagged Offset fields MUST be provided for each RDMA
   Message.
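   The RDMAP Control Field octet and the five-octet RsvdULP field
   depicted in Figure 3 can be sketched as follows. This is a minimal
   illustration, not a normative encoding routine: the function and
   constant names are hypothetical, bit 0 is taken as the most
   significant bit as in the figure, and network byte order for the
   Invalidate STag is an assumption. The version and OpCode values are
   those given in Section 4.1 and Figure 4.

```python
import struct

RDMAP_VERSION_IETF = 0b01    # RDMA Version field for IETF RNICs
RDMAP_VERSION_RDMAC = 0b00   # RDMA Version field for RDMAC RNICs

OPCODE_SEND = 0b0011
OPCODE_SEND_WITH_INVALIDATE = 0b0100


def pack_control(version, opcode):
    """Build the RDMAP Control octet: RV (2 bits), Rsv (2), Opcode (4)."""
    if not 0 <= version <= 0b11:
        raise ValueError("RDMA Version is a 2-bit field")
    if not 0 <= opcode <= 0b1111:
        raise ValueError("OpCode is a 4-bit field")
    # Reserved bits (mask 0x30) are set to zero by the sender.
    return (version << 6) | opcode


def unpack_control(octet):
    """Return (version, opcode); reserved bits are ignored on receive."""
    return (octet >> 6) & 0b11, octet & 0b1111


def pack_rsvdulp_untagged(version, opcode, invalidate_stag=0):
    """Build the 5-octet RsvdULP field carried on Untagged DDP Messages.

    Octet 1 is the RDMAP Control Field; octets 2-5 carry the Invalidate
    STag, left at zero when the message type does not use it.
    """
    return struct.pack(">BI", pack_control(version, opcode), invalidate_stag)
```

   For example, a Send with Invalidate from an IETF RNIC that
   invalidates STag 0x12345678 yields the octets 44 12 34 56 78.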
   For this version of the RDMAP, all RDMA Messages MUST have:

   *  Bits 24-25; RDMA Version field: 01b for IETF RNICs, and 00b for
      RDMAC RNICs. Both version numbers are valid. Interoperability
      is dependent on MPA protocol version negotiation (e.g. MPA
      marker and MPA CRC), see [RNIC Interoperability] for details.

   *  Bits 26-27; Reserved. MUST be set to zero by sender, ignored by
      the receiver.

   *  Bits 28-31; OpCode field: see Figure 4 RDMA Usage of DDP
      Fields.

   *  Bits 32-63; Invalidate STag. However, this field is only valid
      for Send with Invalidate and Send with Solicited Event and
      Invalidate Messages (see Figure 4 RDMA Usage of DDP Fields).
      For Send, Send with Solicited Event, RDMA Read Request, and
      Terminate, the Invalidate STag field MUST be set to zero on
      transmit and ignored by the receiver.

   -------+-----------+-------+------+-------+-----------+--------------
   RDMA   | Message   | Tagged| STag | Queue | Invalidate| Message
   Message| Type      | Flag  | and  | Number| STag      | Length
   OpCode |           |       | TO   |       |           | Communicated
          |           |       |      |       |           | between DDP
          |           |       |      |       |           | and RDMAP
   -------+-----------+-------+------+-------+-----------+--------------
   0000b  | RDMA Write| 1     | Valid| N/A   | N/A       | Yes
          |           |       |      |       |           |
   -------+-----------+-------+------+-------+-----------+--------------
   0001b  | RDMA Read | 0     | N/A  | 1     | N/A       | Yes
          | Request   |       |      |       |           |
   -------+-----------+-------+------+-------+-----------+--------------
   0010b  | RDMA Read | 1     | Valid| N/A   | N/A       | Yes
          | Response  |       |      |       |           |
   -------+-----------+-------+------+-------+-----------+--------------
   0011b  | Send      | 0     | N/A  | 0     | N/A       | Yes
          |           |       |      |       |           |
   -------+-----------+-------+------+-------+-----------+--------------
   0100b  | Send with | 0     | N/A  | 0     | Valid     | Yes
          | Invalidate|       |      |       |           |
   -------+-----------+-------+------+-------+-----------+--------------
   0101b  | Send with | 0     | N/A  | 0     | N/A       | Yes
          | SE        |       |      |       |           |
   -------+-----------+-------+------+-------+-----------+--------------
   0110b  | Send with | 0     | N/A  | 0     | Valid     | Yes
          | SE and    |       |      |       |           |
          | Invalidate|       |      |       |           |
   -------+-----------+-------+------+-------+-----------+--------------
   0111b  | Terminate | 0     | N/A  | 2     | N/A       | Yes
          |           |       |      |       |           |
   -------+-----------+-------+------+-------+-----------+--------------
   1000b  |           |
   to     | Reserved  |            Not Specified
   1111b  |           |
   -------+-----------+-------------------------------------------------
                  Figure 4 RDMA Usage of DDP Fields

   Note: N/A means Not Applicable.

4.2 RDMA Message Definitions

   The following figure defines which RDMA Headers MUST be used on
   each RDMA Message and which RDMA Messages are allowed to carry ULP
   payload:

   -------+-----------+-------------------+-------------------------
   RDMA   | Message   | RDMA Header Used  | ULP Message allowed in
   Message| Type      |                   | the RDMA Message
   OpCode |           |                   |
          |           |                   |
   -------+-----------+-------------------+-------------------------
   0000b  | RDMA Write| None              | Yes
          |           |                   |
   -------+-----------+-------------------+-------------------------
   0001b  | RDMA Read | RDMA Read Request | No
          | Request   | Header            |
   -------+-----------+-------------------+-------------------------
   0010b  | RDMA Read | None              | Yes
          | Response  |                   |
   -------+-----------+-------------------+-------------------------
   0011b  | Send      | None              | Yes
          |           |                   |
   -------+-----------+-------------------+-------------------------
   0100b  | Send with | None              | Yes
          | Invalidate|                   |
   -------+-----------+-------------------+-------------------------
   0101b  | Send with | None              | Yes
          | SE        |                   |
   -------+-----------+-------------------+-------------------------
   0110b  | Send with | None              | Yes
          | SE and    |                   |
          | Invalidate|                   |
   -------+-----------+-------------------+-------------------------
   0111b  | Terminate | Terminate Header  | No
          |           |                   |
   -------+-----------+-------------------+-------------------------
   1000b  |           |
   to     | Reserved  |            Not Specified
   1111b  |           |
   -------+-----------+-------------------+-------------------------
                  Figure 5 RDMA Message Definitions

4.3 RDMA Write Header

   The RDMA Write Message does not include an RDMAP header. The RDMAP
   layer passes to the DDP layer an RDMAP Control Field. The RDMA
   Write Message is fully described by the DDP Headers of the DDP
   Segments associated with the Message.

   See section 11 Appendix for a description of the DDP Segment
   format associated with RDMA Write Messages.

4.4 RDMA Read Request Header

   The RDMA Read Request Message carries an RDMA Read Request Header
   that describes the Data Sink and Data Source Buffers used by the
   RDMA Read operation. The RDMA Read Request Header immediately
   follows the DDP header. The RDMAP layer passes to the DDP layer an
   RDMAP Control Field.
   The following figure depicts the RDMA Read Request Header that
   MUST be used for all RDMA Read Request Messages:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     Data Sink STag (SinkSTag)                 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +                  Data Sink Tagged Offset (SinkTO)             +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  RDMA Read Message Size (RDMARDSZ)            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     Data Source STag (SrcSTag)                |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +                 Data Source Tagged Offset (SrcTO)             +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                Figure 6 RDMA Read Request Header Format

   Data Sink Steering Tag: 32 bits.

       The Data Sink Steering Tag identifies the Data Sink's Tagged
       Buffer. This field MUST be copied, without interpretation,
       from the RDMA Read Request into the corresponding RDMA Read
       Response and allows the Data Sink to place the returning
       data. The STag is associated with the RDMAP Stream through a
       mechanism that is outside the scope of the RDMAP
       specification.

   Data Sink Tagged Offset: 64 bits.

       The Data Sink Tagged Offset specifies the starting offset, in
       octets, from the base of the Data Sink's Tagged Buffer, where
       the data is to be written by the Data Source. This field is
       copied from the RDMA Read Request into the corresponding RDMA
       Read Response and allows the Data Sink to place the returning
       data. The Data Sink Tagged Offset MAY start at an arbitrary
       offset.

       The Data Sink STag and Data Sink Tagged Offset fields
       describe the buffer to which the RDMA Read data is written.
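   The layout in Figure 6 can be sketched with Python's struct
   module. This is an illustration under stated assumptions: the names
   ReadRequestHeader, pack_read_request, and unpack_read_request are
   hypothetical, and network byte order is assumed. The field widths
   (32, 64, 32, 32, and 64 bits) are those shown in the figure and add
   up to 28 octets.

```python
import struct
from collections import namedtuple

# SinkSTag (32), SinkTO (64), RDMARDSZ (32), SrcSTag (32), SrcTO (64),
# big-endian with no padding.
READ_REQUEST_FORMAT = ">IQIIQ"

ReadRequestHeader = namedtuple(
    "ReadRequestHeader",
    ["sink_stag", "sink_to", "rdmardsz", "src_stag", "src_to"],
)


def pack_read_request(header):
    """Serialize a ReadRequestHeader into its 28-octet wire form."""
    return struct.pack(READ_REQUEST_FORMAT, *header)


def unpack_read_request(data):
    """Parse the first 28 octets of data as an RDMA Read Request Header."""
    size = struct.calcsize(READ_REQUEST_FORMAT)
    if len(data) < size:
        raise ValueError("RDMA Read Request Header is truncated")
    return ReadRequestHeader(*struct.unpack(READ_REQUEST_FORMAT, data[:size]))
```

   Because the Data Sink STag and Data Sink Tagged Offset are copied
   unchanged into the RDMA Read Response, a Data Source can take them
   directly from the parsed tuple.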
       Note: the DDP Layer protects against a wrap of the Data Sink
       Tagged Offset.

   RDMA Read Message Size: 32 bits.

       The RDMA Read Message Size is the amount of data, in octets,
       read from the Data Source. A single RDMA Read Request Message
       can retrieve from 0 to 2^32-1 data octets from the Data
       Source.

   Data Source Steering Tag: 32 bits.

       The Data Source Steering Tag identifies the Data Source's
       Tagged Buffer. The STag is associated with the RDMAP Stream
       through a mechanism that is outside the scope of the RDMAP
       specification.

   Data Source Tagged Offset: 64 bits.

       The Tagged Offset specifies the starting offset, in octets,
       that is to be read from the Data Source's Tagged Buffer. The
       Data Source Tagged Offset MAY start at an arbitrary offset.

       The Data Source STag and Data Source Tagged Offset fields
       describe the buffer from which the RDMA Read data is read.

   See Section 7.2 Errors Detected at the Remote Peer on Incoming
   RDMA Messages for a description of error checking required upon
   processing of an RDMA Read Request at the Data Source.

4.5 RDMA Read Response Header

   The RDMA Read Response Message does not include an RDMAP header.
   The RDMAP layer passes to the DDP layer an RDMAP Control Field.
   The RDMA Read Response Message is fully described by the DDP
   Headers of the DDP Segments associated with the Message.

   See Section 11 Appendix for a description of the DDP Segment
   format associated with RDMA Read Response Messages.

4.6 Send Header and Send with Solicited Event Header

   The Send and Send with Solicited Event Message do not include an
   RDMAP header. The RDMAP layer passes to the DDP layer an RDMAP
   Control Field. The Send and Send with Solicited Event Message are
   fully described by the DDP Headers of the DDP Segments associated
   with the Message.
   See Section 11 Appendix for a description of the DDP Segment
   format associated with Send and Send with Solicited Event
   Messages.

4.7 Send with Invalidate Header and Send with SE and Invalidate
    Header

   The Send with Invalidate and Send with Solicited Event and
   Invalidate Message do not include an RDMAP header. The RDMAP layer
   passes to the DDP layer an RDMAP Control Field and the Invalidate
   STag field (see section 4.1 RDMAP Control and Invalidate STag
   Field). The Send with Invalidate and Send with Solicited Event and
   Invalidate Message are fully described by the DDP Headers of the
   DDP Segments associated with the Message.

   See Section 11 Appendix for a description of the DDP Segment
   format associated with Send with Invalidate and Send with
   Solicited Event and Invalidate Messages.

4.8 Terminate Header

   The Terminate Message carries a Terminate Header that contains
   additional information associated with the cause of the Terminate.
   The Terminate Header immediately follows the DDP header. The RDMAP
   layer passes to the DDP layer an RDMAP Control Field. The
   following figure depicts a Terminate Header that MUST be used for
   the Terminate Message:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |       Terminate Control             |      Reserved           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  DDP Segment Length  (if any) |                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               +
   |                                                               |
   //                                                             //
   |                Terminated DDP Header (if any)                 |
   +                                                               +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   //                                                             //
   |                 Terminated RDMA Header (if any)               |
   +                                                               +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                    Figure 7 Terminate Header Format

   Terminate Control: 19 bits.
       The Terminate Control field MUST have the format defined in
       Figure 8 Terminate Control Field.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Layer | EType |  Error Code   |HdrCt|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                  Figure 8 Terminate Control Field

       *  Figure 9 Terminate Control Field Values defines the valid
          values that MUST be used for this field.

       *  Layer: 4 bits.

              Identifies the layer that encountered the error.

       *  EType (RDMA Error Type): 4 bits.

              Identifies the type of error that caused the
              Terminate. When the error is detected at the RDMAP
              Layer, the RDMAP Layer inserts the Error Type into
              this field. When the error is detected at an LLP layer,
              an LLP layer creates the Error Type and the DDP layer
              passes it up to the RDMAP Layer, and the RDMAP Layer
              inserts it into this field.

       *  Error Code: 8 bits.

              This field identifies the specific error that caused
              the Terminate. When the error is detected at the RDMAP
              Layer, the RDMAP Layer creates the Error Code. When
              the error is detected at an LLP layer, an LLP layer
              creates the Error Code and the DDP layer passes it up
              to the RDMAP Layer, and the RDMAP Layer inserts it
              into this field.

       *  HdrCt: 3 bits.

              Header control bits:

              *  M: bit 16. DDP Segment Length valid. See Figure 10
                 for when this bit SHOULD be set.

              *  D: bit 17. DDP Header Included. See Figure 10 for
                 when this bit SHOULD be set.

              *  R: bit 18. RDMAP Header Included. See Figure 10
                 for when this bit SHOULD be set.
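   The Terminate Control field and the Reserved bits that share its
   32-bit word can be sketched as below. This is an illustration, not
   a normative encoder: the function names are hypothetical, bit 0 is
   taken as the most significant bit as in Figures 7 and 8, and network
   byte order is assumed. The field widths and the M, D, and R bit
   positions are those listed above.

```python
import struct


def pack_terminate_control(layer, etype, error_code,
                           m=False, d=False, r=False):
    """Build the first 32-bit word of the Terminate Header.

    Layer occupies bits 0-3, EType bits 4-7, Error Code bits 8-15,
    M/D/R bits 16-18; bits 19-31 are Reserved and sent as zero.
    """
    if not (0 <= layer <= 0xF and 0 <= etype <= 0xF
            and 0 <= error_code <= 0xFF):
        raise ValueError("Terminate Control field value out of range")
    word = ((layer << 28) | (etype << 24) | (error_code << 16)
            | (int(m) << 15) | (int(d) << 14) | (int(r) << 13))
    return struct.pack(">I", word)


def unpack_terminate_control(data):
    """Return (layer, etype, error_code, m, d, r); Reserved is ignored."""
    (word,) = struct.unpack(">I", data[:4])
    return (
        word >> 28,
        (word >> 24) & 0xF,
        (word >> 16) & 0xFF,
        bool(word & 0x8000),
        bool(word & 0x4000),
        bool(word & 0x2000),
    )
```

   For example, Layer 0000b (RDMA), EType 0001b (Remote Protection
   Error), Error Code 01X (Base or bounds violation) with all three
   header control bits set encodes as the octets 01 01 E0 00.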
   -------+----------+-------+-------------+------+--------------------
   Layer  | Layer    | Error | Error Type  | Error| Error Code Name
          | Name     | Type  | Name        | Code |
   -------+----------+-------+-------------+------+--------------------
          |          | 0000b | Local       | None | None
          |          |       | Catastrophic|      |
          |          |       | Error       |      |
          |          +-------+-------------+------+--------------------
          |          |       |             | 00X  | Invalid STag
          |          |       |             +------+--------------------
          |          |       |             | 01X  | Base or bounds
          |          |       |             |      | violation
          |          |       | Remote      +------+--------------------
          |          | 0001b | Protection  | 02X  | Access rights
          |          |       | Error       |      | violation
          |          |       |             +------+--------------------
   0000b  | RDMA     |       |             | 03X  | STag not associated
          |          |       |             |      | with RDMAP Stream
          |          |       |             +------+--------------------
          |          |       |             | 04X  | TO wrap
          |          |       |             +------+--------------------
          |          |       |             | 09X  | STag cannot be
          |          |       |             |      | Invalidated
          |          |       |             +------+--------------------
          |          |       |             | FFX  | Unspecified Error
          |          +-------+-------------+------+--------------------
          |          |       |             | 05X  | Invalid RDMAP
          |          |       |             |      | version
          |          |       |             +------+--------------------
          |          |       |             | 06X  | Unexpected OpCode
          |          |       | Remote      +------+--------------------
          |          | 0010b | Operation   | 07X  | Catastrophic error,
          |          |       | Error       |      | localized to RDMAP
          |          |       |             |      | Stream
          |          |       |             +------+--------------------
          |          |       |             | 08X  | Catastrophic error,
          |          |       |             |      | global
          |          |       |             +------+--------------------
          |          |       |             | 09X  | STag cannot be
          |          |       |             |      | Invalidated
          |          |       |             +------+--------------------
          |          |       |             | FFX  | Unspecified Error
   -------+----------+-------+-------------+------+--------------------
   0001b  | DDP      | See DDP Specification [DDP] for a description of
          |          | the values and names.
   -------+----------+-------+-----------------------------------------
   0010b  | LLP      | For MPA, see MPA Specification [MPA] for a
          | (eg MPA) | description of the values and names.
   -------+----------+-------+-----------------------------------------
                Figure 9 Terminate Control Field Values

   Reserved: 13 bits. This field MUST be set to zero on transmit,
   ignored on receive.

   DDP Segment Length: 16 bits

       The length handed up by the DDP Layer when the error was
       detected. It MUST be valid if the M bit is set. It MUST be
       present when the D bit is set.

   Terminated DDP Header: 112 bits for Tagged Messages and 144 bits
   for Untagged Messages.

       The DDP Header of the incoming Message that is associated
       with the Terminate. The DDP Header is not present if the
       Terminate Error Type is a Local Catastrophic Error. It MUST
       be present if the D bit is set.

   Terminated RDMA Header: 224 bits.

       The Terminated RDMA Header is only sent back if the terminate
       is associated with an RDMA Read Request Message. It MUST be
       present if the R bit is set.

       If the terminate occurs before the first RDMA Read Request
       byte is processed, the original RDMA Read Request Header is
       sent back.

       If the terminate occurs after the first RDMA Read Request
       byte is processed, the RDMA Read Request Header is updated to
       reflect the current location of the RDMA Read operation that
       is in process:

       *  Data Sink STag = Data Sink STag originally sent in the
          RDMA Read Request.

       *  Data Sink Tagged Offset = Current offset into the Data
          Sink Tagged Buffer. For example if the RDMA Read
          Request was terminated after 2048 octets were sent,
          then the Data Sink Tagged Offset = the original Data
          Sink Tagged Offset + 2048.

       *  Data Message size = Number of bytes left to transfer.

       *  Data Source STag = Data Source STag in the RDMA Read
          Request.
       *  Data Source Tagged Offset = Current offset into the
          Data Source Tagged Buffer. For example if the RDMA
          Read Request was terminated after 2048 octets were
          sent, then the Data Source Tagged Offset = the
          original Data Source Tagged Offset + 2048.

   Note: if a given LLP does not define any termination codes for the
   RDMAP Termination message to use, then none would be used for that
   LLP.

   Figure 10 Error Type to RDMA Message Mapping maps layer name and
   error types to each RDMA Message type:

   ---------+-------------+------------+------------+-----------------
   Layer    | Error Type  | Terminate  | Terminate  | What type of
   Name     | Name        | Includes   | Includes   | RDMA Message can
            |             | DDP Header | RDMA Header| cause the error
            |             | and DDP    |            |
            |             | Segment    |            |
            |             | Length     |            |
   ---------+-------------+------------+------------+-----------------
            | Local       | No         | No         | Any
            | Catastrophic|            |            |
            | Error       |            |            |
            +-------------+------------+------------+-----------------
            | Remote      | Yes, if    | Yes        | Only RDMA Read
   RDMA     | Protection  | possible   |            | Request, Send
            | Error       |            |            | with Invalidate,
            |             |            |            | and Send with SE
            |             |            |            | and Invalidate
            +-------------+------------+------------+-----------------
            | Remote      | Yes, if    | No         | Any
            | Operation   | possible   |            |
            | Error       |            |            |
   ---------+-------------+------------+------------+-----------------
   DDP      | See DDP Spec| Yes        | No         | Any
            | [DDP]       |            |            |
   ---------+-------------+------------+------------+-----------------
   LLP      | See LLP Spec| No         | No         | Any
            | [e.g. MPA]  |            |            |
             Figure 10 Error Type to RDMA Message Mapping

5 Data Transfer

5.1 RDMA Write Message

   An RDMA Write is used by the Data Source to transfer data to a
   previously Advertised Tagged Buffer at the Data Sink.
   The RDMA Write Message has the following semantics:

   * An RDMA Write Message MUST reference a Tagged Buffer. That is,
     the Data Source RDMAP Layer MUST request that the DDP layer
     mark the Message as Tagged.

   * A valid RDMA Write Message MUST NOT be delivered to the Data
     Sink's ULP (i.e., it is placed by the DDP layer).

   * At the Remote Peer, when an invalid RDMA Write Message is
     delivered to the Remote Peer's RDMAP Layer, an error is
     surfaced (see section 7.1 RDMAP Error Surfacing).

   * The Tagged Offset of a Tagged Buffer MAY start at a non-zero
     value.

   * An RDMA Write Message MAY target all or part of a previously
     Advertised buffer.

   * The RDMAP does not define how the buffer(s) used by an outbound
     RDMA Write are defined or addressed. For example, an
     implementation of RDMA may choose to allow a gather-list of
     non-contiguous data blocks to be the source of an RDMA Write.
     In this case, the data blocks would be combined by the Data
     Source and sent as a single RDMA Write Message to the Data
     Sink.

   * The Data Source RDMAP Layer MUST issue RDMA Write Messages to
     the DDP layer in the order they were submitted by the ULP.

   * At the Data Source, a subsequent Send (Send with Invalidate,
     Send with Solicited Event, or Send with Solicited Event and
     Invalidate) Message MAY be used to signal Delivery of previous
     RDMA Write Messages to the Data Sink, if desired by the ULP.

   * If the Local Peer wishes to write to multiple Tagged Buffers on
     the Remote Peer, the Local Peer MUST use multiple RDMA Write
     Messages. That is, a single RDMA Write Message can only write
     to one remote Tagged Buffer.

   * The Data Source MAY issue a zero-length RDMA Write Message.

5.2 RDMA Read Operation

   The RDMA Read operation MUST consist of a single RDMA Read Request
   Message and a single RDMA Read Response Message.

5.2.1 RDMA Read Request Message

   An RDMA Read Request is used by the Data Sink to transfer data
   from a previously Advertised Tagged Buffer at the Data Source to a
   Tagged Buffer at the Data Sink. The RDMA Read Request Message has
   the following semantics:

   * An RDMA Read Request Message MUST reference an Untagged Buffer.
     That is, the Local Peer's RDMAP Layer MUST request that the DDP
     layer mark the Message as Untagged.

   * One RDMA Read Request Message MUST consume one Untagged Buffer.

   * The Remote Peer's RDMAP Layer MUST process an RDMA Read Request
     Message. A valid RDMA Read Request Message MUST NOT be
     delivered to the Data Sink's ULP (i.e., it is processed by the
     RDMAP layer).

   * At the Remote Peer, when an invalid RDMA Read Request Message
     is delivered to the Remote Peer's RDMAP Layer, an error is
     surfaced (see section 7.1 RDMAP Error Surfacing).

   * An RDMA Read Request Message MUST reference the RDMA Read
     Request Queue. That is, the Local Peer's RDMAP Layer MUST
     request that the DDP layer set the Queue Number field to one.

   * The Local Peer MUST pass to the DDP Layer RDMA Read Request
     Messages in the order they were submitted by the ULP.

   * The Remote Peer MUST process the RDMA Read Request Messages in
     the order they were sent.

   * If the Local Peer wishes to read from multiple Tagged Buffers
     on the Remote Peer, the Local Peer MUST use multiple RDMA Read
     Request Messages. That is, a single RDMA Read Request Message
     MUST only read from one remote Tagged Buffer.

   * An RDMA Read Request Message MAY target all or part of a
     previously Advertised buffer.

   * If the Data Source receives a valid RDMA Read Request Message
     it MUST respond with a valid RDMA Read Response Message.

   * The Data Sink MAY issue a zero length RDMA Read Request
     Message, by setting the RDMA Read Message Size field to zero in
     the RDMA Read Request Header.

   * If the Data Source receives a non-zero length RDMA Read Message
     Size, the Data Source RDMAP MUST validate the Data Source STag
     and Data Source Tagged Offset contained in the RDMA Read
     Request Header.

   * If the Data Source receives an RDMA Read Request Header with
     the RDMA Read Message Size set to zero, the Data Source RDMAP:

       * MUST NOT validate the Data Source STag and Data Source
         Tagged Offset contained in the RDMA Read Request Header,
         and

       * MUST respond with a zero length RDMA Read Response
         Message.

5.2.2 RDMA Read Response Message

   The RDMA Read Response Message uses the DDP Tagged Buffer Model to
   Deliver the contents of a previously requested Data Source Tagged
   Buffer to the Data Sink, without any involvement from the ULP at
   the Remote Peer. The RDMA Read Response Message has the following
   semantics:

   * The RDMA Read Response Message for the associated RDMA Read
     Request Message travels in the opposite direction.

   * An RDMA Read Response Message MUST reference a Tagged Buffer.
     That is, the Data Source RDMAP Layer MUST request that the DDP
     mark the Message as Tagged.

   * The Data Source MUST ensure that a sufficient number of
     Untagged Buffers are available on the RDMA Read Request Queue
     (Queue with DDP Queue Number 1) to support the maximum number
     of RDMA Read Requests negotiated by the ULP.

   * The RDMAP Layer MUST Deliver the RDMA Read Response Message to
     the ULP.

   * At the Remote Peer, when an invalid RDMA Read Response Message
     is delivered to the Remote Peer's RDMAP Layer, an error is
     surfaced (see section 7.1 RDMAP Error Surfacing).

   * The Tagged Offset of a Tagged Buffer MAY start at a non-zero
     value.

   * The Data Source RDMAP Layer MUST pass RDMA Read Response
     Messages to the DDP layer in the order that the RDMA Read
     Request Messages were received by the RDMAP Layer at the Data
     Source.

   * The Data Sink MAY validate that the STag, Tagged Offset, and
     length of the RDMA Read Response Message are the same as the
     STag, Tagged Offset, and length included in the corresponding
     RDMA Read Request Message.

   * A single RDMA Read Response Message MUST write to one remote
     Tagged Buffer. If the Data Sink wishes to Read multiple Tagged
     Buffers, the Data Sink can use multiple RDMA Read Request
     Messages.

5.3 Send Message Type

   The Send Message Type uses the DDP Untagged Buffer Model to
   transfer data from the Data Source into an Untagged Buffer at the
   Data Sink.

   * A Send Message Type MUST reference an Untagged Buffer. That is,
     the Local Peer's RDMAP Layer MUST request that the DDP layer
     mark the Message as Untagged.

   * One Send Message Type MUST consume one Untagged Buffer.

       * The ULP Message sent using a Send Message Type MAY be less
         than or equal to the size of the consumed Untagged Buffer.
         The RDMAP Layer communicates to the ULP the size of the
         data written into the Untagged Buffer.

       * If the ULP Message sent via Send Message Type is larger
         than the Data Sink's Untagged Buffer, it is an error (see
         section 7.1 RDMAP Error Surfacing).

   * At the Remote Peer, Send Message Types MUST be Delivered to
     the Remote Peer's ULP in the order they were sent.

   * After the Send with Solicited Event or Send with Solicited
     Event and Invalidate Message is Delivered to the ULP, the RDMAP
     MAY generate an Event, if the Data Sink is configured to
     generate such an Event.

   * At the Remote Peer, when an invalid Send Message Type is
     Delivered to the Remote Peer's RDMAP Layer, an error is
     surfaced (see section 7.1 RDMAP Error Surfacing).

   * The RDMAP does not define how the buffer(s) used by an outbound
     Send Message Type are defined or addressed. For example, an
     implementation of RDMA may choose to allow a gather-list of
     non-contiguous data blocks to be the source of a Send Message
     Type. In this case, the data blocks would be combined by the
     Data Source and sent as a single Send Message Type to the Data
     Sink.

   * For a Send Message Type, the Local Peer's RDMAP Layer MUST
     request that the DDP layer set the Queue Number field to zero.

   * The Local Peer MUST issue Send Message Type Messages in the
     order they were submitted by the ULP.

   * The Data Source MAY pass a zero-length Send Message Type. A
     zero-length Send Message Type MUST consume an Untagged Buffer
     at the Data Sink.

   * A Send with Invalidate or Send with Solicited Event and
     Invalidate Message MUST reference an STag. That is, the Local
     Peer's RDMAP Layer MUST pass the RDMA control field and the
     STag that will be Invalidated to the DDP layer.

   * When a Send with Invalidate or Send with Solicited Event and
     Invalidate Message is Delivered to the Remote Peer's RDMAP
     Layer, the RDMAP Layer MUST:

       * Verify that the STag is associated with the RDMAP Stream;
         and

       * Invalidate the STag if it is associated with the RDMAP
         Stream; or issue a Terminate Message with the STag Cannot
         be Invalidated Terminate Error Code, if the STag is not
         associated with the RDMAP Stream.

5.4 Terminate Message

   The Terminate Message uses the DDP Untagged Buffer Model to
   transfer error-related information from the Data Source into an
   Untagged Buffer at the Data Sink and then ceases all further
   communications on the underlying DDP Stream.
   The Terminate Message has the following semantics:

   * A Terminate Message MUST reference an Untagged Buffer. That is,
     the Local Peer's RDMAP Layer MUST request that the DDP layer
     mark the Message as Untagged.

   * A Terminate Message references the Terminate Queue. That is,
     the Local Peer's RDMAP Layer MUST request that the DDP layer
     set the Queue Number field to two.

   * One Terminate Message MUST consume one Untagged Buffer.

   * On a single RDMAP Stream, the RDMAP layer MUST guarantee
     placement of a single Terminate Message.

   * A Terminate Message MUST be Delivered to the Remote Peer's
     RDMAP Layer. The RDMAP Layer MUST Deliver the Terminate Message
     to the ULP.

   * At the Remote Peer, when an invalid Terminate Message is
     delivered to the Remote Peer's RDMAP Layer, an error is
     surfaced (see section 7.1 RDMAP Error Surfacing).

   * The RDMAP Layer Completes in error all ULP Operations that have
     not been provided to the DDP layer.

   * After sending a Terminate Message on an RDMAP Stream, the Local
     Peer MUST NOT send any more Messages on that specific RDMAP
     Stream.

   * After receiving a Terminate Message on an RDMAP Stream, the
     Remote Peer MAY stop sending Messages on that specific RDMAP
     Stream.

5.5 Ordering and Completions

   It is important to understand the difference between Placement and
   Delivery ordering since RDMAP provides quite different semantics
   for the two.

   Note that many current protocols, both as used in the Internet and
   elsewhere, assume that data is both Placed and Delivered in order.
   This assumption allows applications to take a variety of
   shortcuts. For RDMAP, many of these shortcuts are no longer safe
   to use, and could cause application failure.

   The following rules apply to implementations of the RDMAP
   protocol.
   Note that in these rules, Send includes Send, Send with
   Invalidate, Send with Solicited Event, and Send with Solicited
   Event and Invalidate:

   1. RDMAP does not provide ordering among Messages on different
      RDMAP Streams.

   2. RDMAP does not provide ordering between operations that are
      generated from the two ends of an RDMAP Stream.

   3. RDMA Messages that use Tagged and Untagged Buffers MAY be
      Placed in any order. If an application uses overlapping
      buffers (points different Messages or portions of a single
      Message at the same buffer), then it is possible that the last
      incoming write to the Data Sink buffer will not be the last
      outgoing data sent from the Data Source.

   4. For a Send operation, the contents of an Untagged Buffer at
      the Data Sink MAY be indeterminate until the Send is Delivered
      to the ULP at the Data Sink.

   5. For an RDMA Write operation, the contents of the Tagged Buffer
      at the Data Sink MAY be indeterminate until a subsequent Send
      is Delivered to the ULP at the Data Sink.

   6. For an RDMA Read operation, the contents of the Tagged Buffer
      at the Data Sink MAY be indeterminate until the RDMA Read
      Response Message has been Delivered at the Local Peer.

      Statements 4, 5, and 6 imply "no peeking" at the data to see
      if it is done. It is possible for some data to arrive before
      logically earlier data does, and peeking may cause
      unpredictable application failure.

   7. If the ULP or Application modifies the contents of Tagged or
      Untagged Buffers being modified by an RDMA Operation while the
      RDMAP is processing the RDMA Operation, the state of the
      Buffers is indeterminate.

   8. If the ULP or Application modifies the contents of Tagged or
      Untagged Buffers read by an RDMA Operation while the RDMAP is
      processing the RDMA Operation, the results of the read are
      indeterminate.

   9.
      The Completion of an RDMA Write or Send Operation at the Local
      Peer does not guarantee that the ULP Message has yet reached
      the Remote Peer ULP Buffer or been examined by the Remote ULP.

   10. Send Messages MUST be Delivered to the ULP at the Remote Peer
       after they are Delivered to RDMAP by DDP and in the order that
       they were Delivered to RDMAP.

       Note that DDP ordering rules ensure that this will be the same
       order that they were submitted at the Local Peer and that any
       prior RDMA Writes have been submitted for ordered Placement at
       the Remote Peer. This means that when the ULP sees the
       Delivery of the Send, the memory buffers targeted by any
       preceding RDMA Writes and Sends are available to be accessed
       locally or remotely as authorized. If the ULP overlaps its
       buffers for different operations, the data from the RDMA Write
       or Send may be overwritten by subsequent RDMA Operations
       before the ULP receives and processes the Delivery.

   11. RDMA Read Response Messages MUST be Delivered to the ULP at
       the Remote Peer after they are Delivered to RDMAP by DDP and
       in the order that they were Delivered to RDMAP.

       DDP ordering rules ensure that this will be the same order
       that they were submitted at the Local Peer. This means that
       when the ULP sees the Delivery of the RDMA Read Response, the
       memory buffers targeted by the RDMA Read Response are
       available to be accessed locally or remotely as authorized. If
       the ULP overlaps its buffers for different operations, the
       data from the RDMA Read Response may be overwritten by
       subsequent RDMA Operations before the ULP receives and
       processes the Delivery.

   12. RDMA Read Request Messages, including zero-length RDMA Read
       Requests, MUST NOT start processing at the Remote Peer until
       they have been Delivered to RDMAP by DDP.

       Note: the ULP is assured that data written can be read back.
       For example, if an RDMA Read Request is issued by the Local
       Peer, targeting the same ULP Buffer as a preceding Send or
       RDMA Write (in the same direction as the RDMA Read Request),
       and there are no other sources of update for the ULP Buffer,
       then the Remote Peer will send back the data written by the
       Send or RDMA Write. That is, for this example the ULP Buffer
       is Advertised for use on a series of RDMA Messages, is only
       valid on the RDMAP Stream for which it is advertised, and is
       not locally updated while the series of RDMAP Messages is
       performed. For this example, order rule (12) assures that
       subsequent local or remote accesses to the ULP Buffer contain
       the data written by the Send or RDMA Write.

       RDMA Read Response Messages MAY be generated at the Remote
       Peer after subsequent RDMA Write Messages or Send Messages
       have been Placed or Delivered. Therefore, when an application
       does an RDMA Read Request followed by an RDMA Write (or Send)
       to the same buffer, it may get the data from the later RDMA
       Write (or Send) in the RDMA Read Response Message, even though
       the operations completed in order at the Local Peer. If this
       behavior is not desired, the Local Peer ULP must Fence the
       later RDMA Write (or Send) by withholding the RDMA Write
       Message until all outstanding RDMA Read Responses have been
       Delivered.

   13. The RDMAP Layer MUST submit RDMA Messages to the DDP layer in
       the order the RDMA Operations are submitted to the RDMAP Layer
       by the ULP.

   14. A Send or RDMA Write Message MUST NOT be considered Complete
       at the Local Peer (Data Source) until it has been successfully
       completed at the DDP layer.

   15. RDMA Operations MUST be Completed at the Local Peer in the
       order that they were submitted by the ULP.

   16.
       At the Data Sink, an incoming Send Message MUST be Delivered
       to the ULP only after the DDP Message has been Delivered to
       the RDMAP Layer by the DDP layer.

   17. RDMA Read Response Message processing at the Remote Peer
       (reading the specified Tagged Buffer) MUST be started only
       after the RDMA Read Request Message has been Delivered by the
       DDP layer (thus all previous RDMA Messages have been properly
       submitted for ordered Placement).

   18. Send Messages MAY be Completed at the Remote Peer (Data Sink)
       before prior incoming RDMA Read Request Messages have
       completed their response processing.

   19. An RDMA Read operation MUST NOT be Completed at the Local Peer
       until the DDP layer Delivers the associated incoming RDMA Read
       Response Message.

   20. If more than one outstanding RDMA Read Request Message is
       supported by both peers, the RDMA Read Response Messages MUST
       be submitted to the DDP layer on the Remote Peer in the order
       the RDMA Read Request Messages were Delivered by DDP, but the
       actual read of the buffer contents MAY take place in any order
       at the Remote Peer.

       This simplifies Local Peer Completion processing for RDMA
       Reads in that a Delivered RDMA Read Response MUST be
       sufficient to Complete the RDMA Read Operation.

6 RDMAP Stream Management

   RDMAP Stream management consists of RDMAP Stream Initialization
   and RDMAP Stream Termination.

6.1 Stream Initialization

   RDMAP Stream initialization occurs after the LLP Stream has been
   created (e.g., for DDP/MPA over TCP the first TCP Segment after
   the SYN, SYN/ACK exchange). The ULP is responsible for
   transitioning the LLP Stream into RDMA enabled mode. The switch to
   RDMA mode typically occurs sometime after LLP Stream setup. Once
   in RDMA enabled mode, an implementation MUST send only RDMA
   Messages across the transport Stream until the RDMAP Stream is
   torn down.

   For each direction of an RDMAP Stream:

   * For a given RDMAP Stream, the number of outstanding RDMA Read
     Requests is limited per RDMAP Stream direction.

   * It is the ULP's responsibility to set the maximum number of
     outstanding, inbound RDMA Read Requests per RDMAP Stream
     direction.

   * The RDMAP Layer MUST provide the maximum number of outstanding,
     inbound RDMA Read Requests per RDMAP Stream direction that were
     negotiated between the ULP and the Local Peer's RDMAP Layer.
     The negotiation mechanism is outside the scope of this
     specification.

   * It is the ULP's responsibility to set the maximum number of
     outstanding, outbound RDMA Read Requests per RDMAP Stream
     direction.

   * The RDMAP Layer MUST provide the maximum number of outstanding,
     outbound RDMA Read Requests for the RDMAP Stream direction that
     were negotiated between the ULP and the Local Peer's RDMAP
     Layer. The negotiation mechanism is outside the scope of this
     specification.

   * The Local Peer's ULP is responsible for negotiating with the
     Remote Peer's ULP the maximum number of outstanding RDMA Read
     Requests for the RDMAP Stream direction. It is recommended that
     the ULP set the maximum number of outstanding, inbound RDMA
     Read Requests equal to the maximum number of outstanding,
     outbound RDMA Read Requests for a given RDMAP Stream direction.

   * For outbound RDMA Read Requests, the RDMAP Layer MUST NOT
     exceed the maximum number of outstanding, outbound RDMA Read
     Requests that were negotiated between the ULP and the Local
     Peer's RDMAP Layer.

   * For inbound RDMA Read Requests, the RDMAP Layer MUST NOT exceed
     the maximum number of outstanding, inbound RDMA Read Requests
     that were negotiated between the ULP and the Local Peer's RDMAP
     Layer.
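
   As an informal illustration of the outbound limit described in the
   list above, the following Python sketch tracks outstanding RDMA
   Read Requests for one Stream direction. It is a sketch under
   stated assumptions, not part of the protocol: the class name
   ReadRequestWindow and its methods are hypothetical, and the
   negotiated maximum is simply passed in, since the negotiation
   mechanism is outside the scope of this specification.

```python
class ReadRequestWindow:
    """Hypothetical tracker for outstanding outbound RDMA Read Requests
    on one RDMAP Stream direction (illustration only)."""

    def __init__(self, max_outstanding: int) -> None:
        if max_outstanding < 0:
            raise ValueError("max_outstanding must be non-negative")
        # Value agreed between the ULP and the local RDMAP Layer.
        self.max_outstanding = max_outstanding
        self.outstanding = 0

    def try_issue(self) -> bool:
        """Return True if one more RDMA Read Request may be issued now.

        The limit MUST NOT be exceeded, so a request that would go over
        it is held back by the caller until a response is Delivered.
        """
        if self.outstanding >= self.max_outstanding:
            return False
        self.outstanding += 1
        return True

    def on_response_delivered(self) -> None:
        """Account for a Delivered RDMA Read Response Message."""
        if self.outstanding == 0:
            raise RuntimeError("no RDMA Read Request is outstanding")
        self.outstanding -= 1
```

   A caller would invoke try_issue() before handing each RDMA Read
   Request to the DDP layer and on_response_delivered() when the
   matching RDMA Read Response is Delivered. Requests that are held
   back still have to be issued later in ULP submission order.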

6.2 Stream Teardown

   There are three methods for terminating an RDMAP Stream: ULP
   Graceful Termination, RDMAP Abortive Termination, and LLP Abortive
   Termination.

   The ULP is responsible for performing ULP Graceful Termination.
   After a ULP Graceful Termination, either side of the Stream can
   initiate LLP Graceful Termination, using the graceful termination
   mechanism provided by the LLP.

   RDMAP Abortive Termination allows the RDMAP to issue a Terminate
   Message describing the reason the RDMAP Stream was terminated. The
   next section (6.2.1 RDMAP Abortive Termination) describes the
   RDMAP Abortive Termination in detail.

   LLP Abortive Termination results from an LLP error and causes the
   RDMAP Stream to be torn down midstream, without an RDMAP Terminate
   Message. While this last method is highly undesirable, it is
   possible and the ULP should take this into consideration.

6.2.1 RDMAP Abortive Termination

   RDMAP defines a Terminate operation that SHOULD be invoked when
   either an RDMAP error is encountered or an LLP error is surfaced
   to the RDMAP layer by the LLP.

   It is not always possible to send the Terminate Message. For
   example, certain LLP errors may occur that cause the LLP Stream to
   be torn down a) before RDMAP is aware of the error, b) before
   RDMAP is able to send the Terminate Message, or c) after RDMAP has
   posted the Terminate Message to the LLP, but before it has been
   transmitted by the LLP.

   Note that an RDMAP Abortive Termination may entail loss of data.
   In general, when a Terminate Message is received it is impossible
   to tell for sure what unacknowledged RDMA Messages were Completed
   successfully at the Remote Peer. Thus the state of all outstanding
   RDMA Messages is indeterminate and the Messages SHOULD be
   considered Completed in error.
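
   The last point can be pictured with a small Python sketch. It is
   illustrative only and assumes a simple in-memory queue of
   outstanding operations; the names Status, WorkRequest, and
   StreamState are hypothetical and do not correspond to any
   interface defined by this specification.

```python
from collections import deque
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    PENDING = "pending"
    ERROR = "error"


@dataclass
class WorkRequest:
    wr_id: int                      # hypothetical ULP-visible identifier
    status: Status = Status.PENDING


class StreamState:
    """Hypothetical per-Stream bookkeeping (illustration only)."""

    def __init__(self) -> None:
        self.outstanding: deque = deque()
        self.terminated = False

    def submit(self, wr_id: int) -> WorkRequest:
        # No further Messages are sent once the Stream is terminated.
        if self.terminated:
            raise RuntimeError("RDMAP Stream has been terminated")
        wr = WorkRequest(wr_id)
        self.outstanding.append(wr)
        return wr

    def on_terminate(self) -> list:
        """Treat every outstanding operation as Completed in error.

        Operations are flushed in submission order, matching the rule
        that Completions occur in the order the ULP submitted them.
        """
        self.terminated = True
        flushed = []
        while self.outstanding:
            wr = self.outstanding.popleft()
            wr.status = Status.ERROR
            flushed.append(wr)
        return flushed
```

   The sketch deliberately reports every outstanding operation as
   failed, including ones that may in fact have succeeded at the
   Remote Peer, because the local side cannot tell the two cases
   apart.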

   When a peer sends or receives a Terminate Message, it MAY
   immediately tear down the LLP Stream. The peer SHOULD perform a
   graceful LLP teardown to ensure the Terminate Message is
   successfully Delivered.

   See section 4.8 Terminate Header for a description of the
   Terminate Message and its contents. See section 5.4 Terminate
   Message for a description of the Terminate Message semantics.

7 RDMAP Error Management

   The RDMAP protocol does not have RDMAP or DDP layer error recovery
   operations built in. If everything is working, the LLP guarantees
   will ensure that the Messages are arriving at the destination.

   If errors are detected at the RDMAP or DDP layer, then the RDMAP,
   DDP and LLP Streams are Abortively Terminated (see section 4.8
   Terminate Header).

   In general, poor implementations or improper ULP programming cause
   the errors detected at the RDMAP and DDP layers. In these cases,
   returning a diagnostic termination error Message and closing the
   RDMAP Stream is far simpler than attempting to maintain the RDMAP
   Stream, particularly when the cause of the error is not known.

   If an LLP does not support teardown of a Stream independent of
   other Streams and an RDMAP error results in the Termination of a
   specific Stream, then the LLP MUST label the Stream as an
   erroneous Stream and MUST NOT allow any further data transfer on
   that Stream after RDMAP requests the Stream to be torn down.

   For a specific LLP connection, when all Streams are either
   gracefully torn down or are labeled as erroneous Streams, the LLP
   connection MUST be torn down.

   Since errors are detected at the Remote Peer (possibly long) after
   RDMA Messages are passed to DDP and the LLP at the Local Peer and
   Completed, the sender cannot easily determine which of its
   Messages have been received. (RDMA Reads are an exception to this
   rule).

   For a list of errors returned to the Remote Peer as a result of an
   Abortive Termination, see section 4.8 Terminate Header.

7.1 RDMAP Error Surfacing

   If an error occurs at the Local Peer, the RDMAP layer MUST attempt
   to inform the local ULP that the error has occurred.

   The Local Peer MUST send a Terminate Message for each of the
   following cases:

   1. For errors detected while creating RDMA Write, Send, Send with
      Invalidate, Send with Solicited Event, Send with Solicited
      Event and Invalidate, or RDMA Read Requests, or other reasons
      not directly associated with an incoming Message, the
      Terminate Message and Error Code are sent instead of the
      request. In this case, the Error Type and Error Code fields
      are included in the Terminate Message, but the Terminated DDP
      Header and Terminated RDMA Header fields are set to zero.

   2. For errors detected on an incoming RDMA Write, Send, Send with
      Invalidate, Send with Solicited Event, Send with Solicited
      Event and Invalidate, or Read Response Message (after the
      Message has been Delivered by DDP), the Terminate Message is
      sent at the earliest possible opportunity, preferably in the
      next outgoing RDMA Message. In this case, the Error Type,
      Error Code, DDP Segment Length, and Terminated DDP Header
      fields are included in the Terminate Message, but the
      Terminated RDMA Header field is set to zero.

   3. For errors detected on an incoming RDMA Read Request Message
      (after the Message has been Delivered by DDP), the Terminate
      Message is sent at the earliest possible opportunity,
      preferably in the next outgoing RDMA Message. In this case,
      the Error Type, Error Code, DDP Segment Length, Terminated DDP
      Header, and Terminated RDMA Header fields are included in the
      Terminate Message.

   4.
      If more than one error is detected on incoming RDMA Messages,
      before the Terminate Message can be sent, then the first RDMA
      Message (and its associated DDP Segment) that experienced an
      error MUST be captured by the Terminate Message in accordance
      with rules 2 and 3 above.

7.2 Errors Detected at the Remote Peer on Incoming RDMA Messages

   On incoming RDMA Writes, RDMA Read Response, Sends, Send with
   Invalidate, Send with Solicited Event, Send with Solicited Event
   and Invalidate, and Terminate Messages, the following must be
   validated:

   1. The DDP Layer MUST validate all DDP Segment fields.

   2. The RDMA OpCode MUST be valid.

   3. The RDMA Version MUST be valid.

   Additionally, on incoming Send with Invalidate and Send with
   Solicited Event and Invalidate Messages, the following must
   also be validated:

   4. The Invalidate STag MUST be valid.

   5. The STag MUST be associated with this RDMAP Stream.

   On incoming RDMA Read Request Messages, the following must be
   validated:

   1. The DDP Layer MUST validate all Untagged DDP Segment fields.

   2. The RDMA OpCode MUST be valid.

   3. The RDMA Version MUST be valid.

   4. For non-zero length RDMA Read Request Messages:

      a. The Data Source STag MUST be valid.

      b. The Data Source STag MUST be associated with this RDMAP
         Stream.

      c. The Data Source Tagged Offset MUST fall in the range of
         legal offsets associated with the Data Source STag.

      d. The sum of the Data Source Tagged Offset and the RDMA Read
         Message Size MUST fall in the range of legal offsets
         associated with the Data Source STag.

      e. The sum of the Data Source Tagged Offset and the RDMA Read
         Message Size MUST NOT cause the Data Source Tagged Offset
         to wrap.
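
   Checks 4a through 4e for RDMA Read Request Messages can be
   sketched in Python as follows. This is an illustration under
   stated assumptions rather than a reference implementation: the
   StagEntry record, the stags table, and the returned error strings
   are hypothetical (they are not the Terminate Error Codes of
   section 4.8), and the Tagged Offset is assumed to be a 64-bit
   unsigned field. Access-rights checks are not shown.

```python
from dataclasses import dataclass
from typing import Dict, Optional

TO_LIMIT = 2 ** 64  # assumed: Tagged Offset is a 64-bit unsigned field


@dataclass
class StagEntry:
    stream_id: int  # RDMAP Stream the STag is associated with
    base_to: int    # first legal Tagged Offset for this STag
    length: int     # number of octets covered by the STag


def validate_read_request(stags: Dict[int, StagEntry], stream_id: int,
                          src_stag: int, src_to: int,
                          size: int) -> Optional[str]:
    """Return None if the request passes, else a descriptive string."""
    if size == 0:
        # Zero-length request: STag and Tagged Offset are not validated.
        return None
    entry = stags.get(src_stag)
    if entry is None:
        return "invalid STag"                            # check a
    if entry.stream_id != stream_id:
        return "STag not associated with Stream"         # check b
    end_to = entry.base_to + entry.length
    if not (entry.base_to <= src_to < end_to):
        return "Tagged Offset out of range"              # check c
    if src_to + size > TO_LIMIT:
        return "Tagged Offset wrap"                      # check e
    if src_to + size > end_to:
        return "Tagged Offset plus size out of range"    # check d
    return None
```

   The wrap check is placed before the upper-bound check so that a
   wrapping request is reported as such. Python integers do not
   overflow, so the wrap is detected by comparing against the assumed
   field limit instead of relying on modular arithmetic.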

8 Security Considerations

   This section discusses both protocol-specific security
   considerations and implications of using RDMAP with existing
   security services. A detailed analysis of the security issues
   around implementation and use of the RDMAP can be found in
   [RDMASEC]. Note that it is not the intention of this section to
   replicate the RDMAP-relevant content of [RDMASEC], but rather to
   give an overview of RDMAP-related security issues.

8.1 Security Model and General Assumptions

   This section of the specification follows the RDMA architectural
   reference model as defined in [RDMASEC]. It further uses the
   definition of attackable resources, types of attacks and possible
   countermeasures introduced therein.

8.1.1 Attackable Resources

   According to [RDMASEC], all resources of the RDMA reference model
   are attackable using the RDMAP. Thus, Stream Context Memory, Data
   Buffers, Page Translation Tables, STag Namespace, Completion
   Queues, Asynchronous Event Queues, and RDMA Read Request Queues
   are vulnerable to attacks.

8.1.2 Types of Attackers and Types of Attacks

   Possible types of attackers are a non-trusted remote peer, a
   network-based attacker, or a hostile local application in a
   multi-user system. Generally, while a remote or network-based
   attacker uses the RDMAP communication channel to mount the attack,
   a local attacker uses the host RDMA infrastructure to gain access
   to local or remote resources.

   [RDMASEC] defines the following possible categories of attacks:
   Spoofing, Tampering, Information Disclosure, Denial of Service and
   Elevation of Privilege. See [RDMASEC] for a detailed discussion of
   all known attacks falling into these categories.

8.1.3 Trust and Resource Sharing

   [RDMASEC] establishes a peer-to-peer trust model based on Local
   Partial Trust and Remote Partial Trust.
   Based on the level of trust in an RDMAP-based communication, the
   ULP itself must take appropriate actions to protect exposed
   resources from attacks from a non-trusted Remote Peer or
   non-trusted Local Peer.

   The correct evaluation of current Local and Remote Partial Trust
   is of particular importance for the protection of communication
   resources shared among multiple RDMAP Streams, such as multiple
   RDMAP Streams sharing receive buffers or associated with a common
   Shared Receive Queue. The sharing of resources across Streams
   should be under the control of the ULP, both in terms of the trust
   model the ULP wishes to operate under, as well as the level of
   resource sharing the ULP wishes to give Local Peer processes (see
   [RDMASEC] for further details on resource sharing).

8.2 Summary of RDMAP-specific Security Requirements

   An RDMAP implementation conforming to this specification MUST
   provide the following two components: an RDMA enabled NIC (RNIC)
   and a Privileged Resource Manager (PRM). A PRM is the component
   responsible for managing and allocating resources associated with
   the RNIC Engine [RDMASEC].

   The RNIC MUST implement the RDMA wire Protocol and MUST perform
   the security semantics described in this section. The PRM MUST
   implement the security semantics described in this section.

8.2.1 RDMAP (RNIC) Requirements

   RDMAP provides several countermeasures for attacks as introduced
   in 8.1.2. In the following, this specification lists all security
   requirements which MUST be implemented by the RNIC. A more
   detailed discussion of these requirements can be found in Section
   7 of [RDMASEC].

   1. An RNIC MUST ensure that a specific Stream in a specific
      Protection Domain cannot access an STag in a different
      Protection Domain.

   2.
An RNIC MUST ensure that if an STag is limited in scope to a 2200 single Stream, no other Stream can use the STag. 2202 3. An RNIC MUST ensure that a Remote Peer is not able to access 2203 memory outside of the buffer specified when the STag was 2204 enabled for remote access. 2206 4. An RNIC MUST provide a mechanism for the ULP to establish and 2207 revoke the association of a ULP Buffer to an STag and TO 2208 range. 2210 5. An RNIC MUST provide a mechanism for the ULP to establish and 2211 revoke read, write, or read and write access to the ULP Buffer 2212 referenced by an STag. 2214 6. An RNIC MUST ensure that the network interface can no longer 2215 modify an advertised buffer after the ULP revokes remote 2216 access rights for an STag. 2218 7. An RNIC MUST ensure that a Remote Peer is not able to 2219 invalidate an STag enabled for remote access, if the STag is 2220 shared on multiple streams. 2222 8. An RNIC MUST choose the value of STags in a way difficult to 2223 predict. It is RECOMMENDED to sparsely populate them over the 2224 full range available. 2226 9. An RNIC MUST NOT enable sharing a CQ across ULPs that do not 2227 share partial mutual trust. 2229 10. An RNIC MUST ensure that if a CQ overflows, any Streams which 2230 do not use the CQ MUST remain unaffected. 2232 11. An RNIC implementation SHOULD provide a mechanism to cap the 2233 number of outstanding RDMA Read Requests. 2235 12. An RNIC MUST NOT enable firmware to be loaded on the RNIC 2236 directly from an untrusted Local Peer or Remote Peer, unless 2237 the Peer is properly authenticated (by a mechanism outside the 2238 scope of this specification. The mechanism presumably entails 2239 authenticating that the remote ULP has the right to perform 2240 the update), and the update is done via a secure protocol, 2241 such as IPsec. 2243 8.2.2 Privileged Resource Manager Requirements 2245 With RDMAP, all reservations of local resources are initiated from 2246 local ULPs. 
To protect from local attacks including unfair
2247  resource distribution and gaining unauthorized access to RNIC
2248  resources, a Privileged Resource Manager (PRM) must be
2249  implemented, which manages all local resource allocation. Note
2250  that the PRM need not be provided as an independent component; its
2251  functionality can also be implemented as part of the privileged
2252  ULP or as part of the RNIC itself.

2254  A PRM implementation must meet the following security
2255  requirements (a more detailed discussion of these requirements can
2256  be found in Section 7 of [RDMASEC]):

2258  1. All Non-Privileged ULP interactions with the RNIC Engine that
2259     could affect other ULPs MUST be done using the Privileged
2260     Resource Manager as a proxy.

2262  2. All ULP resource allocation requests for scarce resources MUST
2263     also be done using a Privileged Resource Manager.

2265  3. The Privileged Resource Manager MUST NOT assume different ULPs
2266     share Partial Mutual Trust unless there is a mechanism to
2267     ensure that the ULPs do indeed share Partial Mutual Trust.

2269  4. If Non-Privileged ULPs are supported, the Privileged Resource
2270     Manager MUST verify that the Non-Privileged ULP has the right
2271     to access a specific Data Buffer before allowing an STag for
2272     which the ULP has access rights to be associated with a
2273     specific Data Buffer.

2275  5. The Privileged Resource Manager MUST control the allocation of
2276     CQ entries.

2278  6. The Privileged Resource Manager SHOULD prevent a Local Peer
2279     from allocating more than its fair share of resources.

2281  7. RDMA Read Request Queue resource consumption MUST be
2282     controlled by the Privileged Resource Manager such that
2283     RDMAP/DDP Streams which do not share Partial Mutual Trust do
2284     not share RDMA Read Request Queue resources.

2286  8. If an RNIC provides the ability to share receive buffers
2287     across multiple Streams, the combination of the RNIC and the
2288     Privileged Resource Manager MUST be able to detect if the
2289     Remote Peer is attempting to consume more than its fair share
2290     of resources so that the Local Peer can apply countermeasures
2291     to detect and prevent the attack.

2293  8.3 Security Services for RDMAP

2295  RDMAP uses IP based network services to control, read and
2296  write data buffers over the network. Therefore, all exchanged
2297  control and data packets are vulnerable to spoofing, tampering and
2298  information disclosure attacks.

2300  If an RDMAP Stream may be subject to impersonation attacks, or
2301  Stream hijacking attacks, it is highly RECOMMENDED that the Stream
2302  be authenticated, integrity protected, and protected from replay
2303  attacks; it MAY use confidentiality protection to protect from
2304  eavesdropping.

2306  8.3.1 Available Security Services

2308  The IPsec protocol suite [RFC2401] defines strong countermeasures
2309  to protect an IP stream from those attacks. Several levels of
2310  protection can guarantee session confidentiality, per-packet
2311  source authentication, per-packet integrity and correct packet
2312  sequencing.

2314  RDMAP security may also profit from SSL or TLS security services
2315  provided for TCP based ULPs [RFC2246]. Used underneath RDMAP,
2316  these security services also provide for stream authentication,
2317  data integrity and confidentiality. As discussed in [RDMASEC],
2318  limitations on the maximum packet length to be carried over the
2319  network and potentially inefficient out-of-order packet processing
2320  at the data sink make SSL and TLS less appropriate for RDMAP than
2321  IPsec.

2323  If SSL is layered on top of RDMAP, SSL does not protect the RDMAP
2324  headers.
Thus, a man-in-the-middle attack can still occur by
2325  modifying the RDMAP header to place the data into the
2326  wrong buffer, thus effectively corrupting the data stream.

2328  By remaining independent of ULP and LLP security protocols, RDMAP
2329  will benefit from continuing improvements at those layers. Users
2330  are provided flexibility to adapt to their specific security
2331  requirements and the ability to adapt to future security
2332  challenges. Given this, the vulnerabilities of RDMAP to active
2333  third-party interference are no greater than those of any other
2334  protocol running over an LLP such as TCP or SCTP.

2336  8.3.2 Requirements for IPsec Services for RDMAP

2338  Because IPsec is designed to secure arbitrary IP packet streams,
2339  including streams where packets are lost, RDMAP can run on top of
2340  IPsec without any change. IPsec packets are processed (e.g.,
2341  integrity checked and possibly decrypted) in the order they are
2342  received, and an RDMAP Data Sink will process the decrypted RDMA
2343  Messages contained in these packets in the same manner as RDMA
2344  Messages contained in unsecured IP packets.

2346  The IP Storage working group has defined the normative IPsec
2347  requirements for IP Storage [RFC3723]. Portions of this
2348  specification are applicable to the RDMAP. In particular, a
2349  compliant implementation of IPsec services MUST meet the
2350  requirements as outlined in Section 2.3 of [RFC3723]. Without
2351  replicating the detailed discussion in [RFC3723], this includes
2352  the following requirements:

2354  1. The implementation MUST support IPsec ESP [RFC2406], as well
2355     as the replay protection mechanisms of IPsec. When ESP is
2356     utilized, per-packet data origin authentication, integrity and
2357     replay protection MUST be used.

2359  2. It MUST support ESP in tunnel mode and MAY implement ESP in
2360     transport mode.

2362  3. It MUST support IKE [RFC2409] for peer authentication,
2363     negotiation of security associations, and key management,
2364     using the IPsec DOI [RFC2407].

2366  4. It MUST NOT interpret the receipt of an IKE Phase 2 delete
2367     message as a reason for tearing down the RDMAP stream. Since
2368     IPsec acceleration hardware may only be able to handle a
2369     limited number of active IKE Phase 2 SAs, idle SAs may be
2370     dynamically brought down and a new SA brought up again if
2371     activity resumes.

2373  5. It MUST support peer authentication using a pre-shared key,
2374     and MAY support certificate-based peer authentication using
2375     digital signatures. Peer authentication using the public key
2376     encryption methods [RFC2409] SHOULD NOT be used.

2378  6. It MUST support IKE Main Mode and SHOULD support Aggressive
2379     Mode. IKE Main Mode with pre-shared key authentication SHOULD
2380     NOT be used when either of the peers uses a dynamically
2381     assigned IP address.

2383  7. Access to locally stored secret information (pre-shared or
2384     private key for digital signing) must be suitably restricted,
2385     since compromise of the secret information nullifies the
2386     security properties of the IKE/IPsec protocols.

2388  8. It MUST follow the guidelines of Section 2.3.4 of [RFC3723] on the setting
2389     of IKE parameters to achieve a high level of interoperability without
2390     requiring extensive configuration.

2392  Furthermore, implementation and deployment of the IPsec services
2393  for RDDP should follow the Security Considerations outlined in
2394  Section 5 of [RFC3723].

2396  9  IANA

2398  IANA Considerations

2400  If RDMAP was enabled a priori for a ULP by connecting to a well-
2401  known port, this well-known port would be registered for the RDMAP
2402  with IANA. The registration of the well-known port will be the
2403  responsibility of the ULP specification.
2405  10 References

2407  10.1 Normative References

2409  [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
2410     Requirement Levels", BCP 14, RFC 2119, March 1997.

2412  [RFC2406] Kent, S. and R. Atkinson, "IP Encapsulating Security Payload
2413     (ESP)", RFC 2406, November 1998.

2415  [RFC2407] Piper, D., "The Internet IP Security Domain of Interpretation
2416     for ISAKMP", RFC 2407, November 1998.

2418  [RFC2409] Harkins, D. and D. Carrel, "The Internet Key Exchange (IKE)", RFC
2419     2409, November 1998.

2421  [RFC3723] Aboba, B. et al., "Securing Block Storage Protocols over
2422     IP", RFC 3723, April 2004.

2424  [VERBS] J. Hilland, "RDMA Protocol Verbs Specification", draft-
2425     hilland-rddp-verbs-00.

2427  [DDP] H. Shah et al., "Direct Data Placement over Reliable
2428     Transports", draft-ietf-rddp-ddp-03.txt, February 2005.

2430  [MPA] P. Culley et al., "Marker PDU Aligned Framing for TCP
2431     Specification", draft-ietf-rddp-mpa-01.txt, January 2005.

2433  [SCTP] R. Stewart et al., "Stream Control Transmission Protocol",
2434     RFC 2960, October 2000.

2436  [TCP] Postel, J., "Transmission Control Protocol", STD 7, RFC 793,
2437     September 1981.

2439  [RDMASEC] J. Pinkerton et al., "DDP/RDMAP Security", draft-ietf-
2440     rddp-security-05.txt, March 2005.

2442  10.2 Informative References

2444  [RFC2401] Kent, S. and R. Atkinson, "Security Architecture for the
2445     Internet Protocol", RFC 2401, November 1998.

2447  [RFC2246] Dierks, T. and C. Allen, "The TLS Protocol Version
2448     1.0", RFC 2246, January 1999.

2450  11 Appendix

2452  11.1 DDP Segment Formats for RDMA Messages

2454  This appendix is for information only and is NOT part of the
2455  standard. It simply depicts the DDP Segment format for the various
2456  RDMA Messages.
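
As an informal illustration of the layouts depicted in this appendix, the following Python sketch packs and unpacks the fixed header fields of a tagged DDP Segment (as used by RDMA Write and RDMA Read Response) and of an RDMA Read Request. It is a sketch only and, like the rest of this appendix, not part of the standard: all function and constant names are hypothetical, the two control octets are treated as opaque values, and network byte order is assumed for every multi-octet field.

```python
import struct

# Hypothetical names; field widths are read off the figures in this appendix.
# Network byte order ("!") is an assumption of this sketch.

# Tagged header: DDP Control (8 bits), RDMA Control (8 bits),
# Data Sink STag (32 bits), Data Sink Tagged Offset (64 bits).
TAGGED_HDR = struct.Struct("!BBIQ")

# Untagged header: DDP Control (8), RDMA Control (8),
# Reserved or Invalidate STag (32), Queue Number (32),
# Message Sequence Number (32), Message Offset (32).
UNTAGGED_HDR = struct.Struct("!BBIIII")

# RDMA Read Request fields that follow the untagged header: SinkSTag (32),
# SinkTO (64), RDMARDSZ (32), SrcSTag (32), SrcTO (64).
READ_REQUEST = struct.Struct("!IQIIQ")


def pack_tagged(ddp_ctrl, rdma_ctrl, sink_stag, sink_to, payload=b""):
    """Return a tagged DDP Segment: 14-octet header plus ULP payload."""
    return TAGGED_HDR.pack(ddp_ctrl, rdma_ctrl, sink_stag, sink_to) + payload


def unpack_tagged(segment):
    """Split a tagged DDP Segment into header fields and ULP payload."""
    ddp_ctrl, rdma_ctrl, sink_stag, sink_to = TAGGED_HDR.unpack_from(segment)
    return ddp_ctrl, rdma_ctrl, sink_stag, sink_to, segment[TAGGED_HDR.size:]


def pack_read_request(ddp_ctrl, rdma_ctrl, qn, msn, mo,
                      sink_stag, sink_to, size, src_stag, src_to):
    """Return an RDMA Read Request DDP Segment (46 octets, no payload)."""
    head = UNTAGGED_HDR.pack(ddp_ctrl, rdma_ctrl, 0, qn, msn, mo)
    return head + READ_REQUEST.pack(sink_stag, sink_to, size, src_stag, src_to)


def unpack_read_request(segment):
    """Return the RDMA Read Request fields as a dict."""
    _, _, _, qn, msn, mo = UNTAGGED_HDR.unpack_from(segment)
    fields = READ_REQUEST.unpack_from(segment, UNTAGGED_HDR.size)
    names = ("sink_stag", "sink_to", "size", "src_stag", "src_to")
    return dict(zip(names, fields), qn=qn, msn=msn, mo=mo)
```

Using fixed `struct` formats keeps the octet counts visible: 14 octets for the tagged header, 18 for the untagged header, and 28 for the RDMA Read Request fields. An implementation would additionally need to encode and validate the control octets, which this sketch deliberately leaves out.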
2458  11.1.1 DDP Segment for RDMA Write

2460  The following figure depicts an RDMA Write, DDP Segment:

2462   0                   1                   2                   3
2463   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
2464                                  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2465                                  |  DDP Control  | RDMA Control  |
2466  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2467  |                        Data Sink STag                         |
2468  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2469  |                    Data Sink Tagged Offset                    |
2470  +                                                               +
2471  |                                                               |
2472  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2473  |                    RDMA Write ULP Payload                     |
2474  //                                                             //
2475  |                                                               |
2476  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2477  Figure 11 RDMA Write, DDP Segment format

2479  11.1.2 DDP Segment for RDMA Read Request

2481  The following figure depicts an RDMA Read Request, DDP Segment:

2483   0                   1                   2                   3
2484   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
2485                                  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2486                                  |  DDP Control  | RDMA Control  |
2487  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2488  |                      Reserved (Not Used)                      |
2489  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2490  |             DDP (RDMA Read Request) Queue Number              |
2491  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2492  |        DDP (RDMA Read Request) Message Sequence Number        |
2493  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2494  |            DDP (RDMA Read Request) Message Offset             |
2495  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2496  |                   Data Sink STag (SinkSTag)                   |
2497  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2498  |                                                               |
2499  +               Data Sink Tagged Offset (SinkTO)                +
2500  |                                                               |
2501  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2502  |               RDMA Read Message Size (RDMARDSZ)               |
2503  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2504  |                  Data Source STag (SrcSTag)                   |
2505  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2506  |                                                               |
2507  +               Data Source Tagged Offset (SrcTO)               +
2508  |                                                               |
2509  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2510  Figure 12 RDMA Read Request, DDP Segment format

2512  11.1.3 DDP Segment for RDMA Read Response

2514  The following figure depicts an RDMA Read Response, DDP Segment:

2516   0                   1                   2                   3
2517   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
2518                                  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2519                                  |  DDP Control  | RDMA Control  |
2520  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2521  |                        Data Sink STag                         |
2522  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2523  |                    Data Sink Tagged Offset                    |
2524  +                                                               +
2525  |                                                               |
2526  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2527  |                RDMA Read Response ULP Payload                 |
2528  //                                                             //
2529  |                                                               |
2530  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2531  Figure 13 RDMA Read Response, DDP Segment format

2533  11.1.4 DDP Segment for Send and Send with Solicited Event

2535  The following figure depicts a Send and Send with Solicited
2536  Event, DDP Segment:

2538   0                   1                   2                   3
2539   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
2540                                  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2541                                  |  DDP Control  | RDMA Control  |
2542  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2543  |                      Reserved (Not Used)                      |
2544  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2545  |                      (Send) Queue Number                      |
2546  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2547  |                (Send) Message Sequence Number                 |
2548  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2549  |                     (Send) Message Offset                     |
2550  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2551  |                       Send ULP Payload                        |
2552  //                                                             //
2553  |                                                               |
2554  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

2556  Figure 14 Send and Send with Solicited Event, DDP Segment format

2558  11.1.5 DDP Segment for Send with Invalidate and Send with SE and
2559  Invalidate

2561  The following figure depicts a Send with Invalidate and Send with
2562  Solicited Event and Invalidate, DDP Segment:

2564   0                   1                   2                   3
2565   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
2566                                  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2567                                  |  DDP Control  | RDMA Control  |
2568  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2569  |                        Invalidate STag                        |
2570  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2571  |                      (Send) Queue Number                      |
2572  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2573  |                (Send) Message Sequence Number                 |
2574  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2575  |                     (Send) Message Offset                     |
2576  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2577  |                       Send ULP Payload                        |
2578  //                                                             //
2579  |                                                               |
2580  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2581  Figure 15 Send with Invalidate and Send with SE and Invalidate,
2582  DDP Segment

2584  11.1.6 DDP Segment for Terminate

2586  The following figure depicts a Terminate, DDP Segment:

2588   0                   1                   2                   3
2589   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
2590                                  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2591                                  |  DDP Control  | RDMA Control  |
2592  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2593  |                      Reserved (Not Used)                      |
2594  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2595  |                 DDP (Terminate) Queue Number                  |
2596  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2597  |            DDP (Terminate) Message Sequence Number            |
2598  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2599  |                DDP (Terminate) Message Offset                 |
2600  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2601  |          Terminate Control          |        Reserved         |
2602  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2603  |  DDP Segment Length (if any)  |                               |
2604  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               +
2605  |                                                               |
2606  +                                                               +
2607  |                Terminated DDP Header (if any)                 |
2608  +                                                               +
2609  |                                                               |
2610  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2611  |                                                               |
2612  //                                                             //
2613  |                Terminated RDMA Header (if any)                |
2614  +                                                               +
2615  |                                                               |
2616  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2617  Figure 16 Terminate, DDP Segment format

2619  11.2 Ordering and Completion Table

2621  The following table summarizes the ordering relationships that are
2622  defined in section 5.5 Ordering and Completions from the
2623  standpoint of the local peer issuing the two Operations. Note that, in
2624  the table that follows, Send includes Send, Send with Invalidate,
2625  Send with Solicited Event, and Send with Solicited Event and
2626  Invalidate.

2628  ------+-------+----------------+----------------+----------------
2629  First | Later | Placement      | Placement      | Ordering
2630  Op    | Op    | guarantee at   | guarantee at   | guarantee at
2631        |       | Remote Peer    | Local Peer     | Remote Peer
2632        |       |                |                |
2633  ------+-------+----------------+----------------+----------------
2634  Send  | Send  | No placement   | Not applicable | Completed in
2635        |       | guarantee. If  |                | order.
2636        |       | guarantee is   |                |
2637        |       | necessary, see |                |
2638        |       | footnote 1.    |                |
2639  ------+-------+----------------+----------------+----------------
2640  Send  | RDMA  | No placement   | Not applicable | Not applicable
2641        | Write | guarantee. If  |                |
2642        |       | guarantee is   |                |
2643        |       | necessary, see |                |
2644        |       | footnote 1.    |                |
2645  ------+-------+----------------+----------------+----------------
2646  Send  | RDMA  | No placement   | RDMA Read      | RDMA Read
2647        | Read  | guarantee      | Response       | Response
2648        |       | between Send   | Payload will   | Message will
2649        |       | Payload and    | not be placed  | not be
2650        |       | RDMA Read      | at the local   | generated until
2651        |       | Request Header | peer until the | Send has been
2652        |       |                | Send Payload is| Completed
2653        |       |                | placed at the  |
2654        |       |                | remote peer    |
2655  ------+-------+----------------+----------------+----------------
2656  RDMA  | Send  | No placement   | Not applicable | Not applicable
2657  Write |       | guarantee. If  |                |
2658        |       | guarantee is   |                |
2659        |       | necessary, see |                |
2660        |       | footnote 1.    |                |
2661  ------+-------+----------------+----------------+----------------
2662  RDMA  | RDMA  | No placement   | Not applicable | Not applicable
2663  Write | Write | guarantee. If  |                |
2664        |       | guarantee is   |                |
2665        |       | necessary, see |                |
2666        |       | footnote 1.    |                |
2667  ------+-------+----------------+----------------+----------------
2668  RDMA  | RDMA  | No placement   | RDMA Read      | Not applicable
2669  Write | Read  | guarantee      | Response       |
2670        |       | between RDMA   | Payload will   |
2671        |       | Write Payload  | not be placed  |
2672        |       | and RDMA Read  | at the local   |
2673        |       | Request Header | peer until the |
2674        |       |                | RDMA Write     |
2675        |       |                | Payload is     |
2676        |       |                | placed at the  |
2677        |       |                | remote peer    |
2678  ------+-------+----------------+----------------+----------------
2679  RDMA  | Send  | No placement   | Send Payload   | Not applicable
2680  Read  |       | guarantee      | may be placed  |
2681        |       | between RDMA   | at the remote  |
2682        |       | Read Request   | peer before the|
2683        |       | Header and Send| RDMA Read      |
2684        |       | payload        | Response is    |
2685        |       |                | generated.     |
2686        |       |                | If guarantee is|
2687        |       |                | necessary, see |
2688        |       |                | footnote 2.    |
2689  ------+-------+----------------+----------------+----------------
2690  RDMA  | RDMA  | No placement   | RDMA Write     | Not applicable
2691  Read  | Write | guarantee      | Payload may be |
2692        |       | between RDMA   | placed at the  |
2693        |       | Read Request   | remote peer    |
2694        |       | Header and RDMA| before the RDMA|
2695        |       | Write payload  | Read Response  |
2696        |       |                | is generated.  |
2697        |       |                | If guarantee is|
2698        |       |                | necessary, see |
2699        |       |                | footnote 2.    |
2700  ------+-------+----------------+----------------+----------------
2701  RDMA  | RDMA  | No placement   | No placement   | Second RDMA
2702  Read  | Read  | guarantee of   | guarantee of   | Read Response
2703        |       | the two RDMA   | the two RDMA   | will not be
2704        |       | Read Request   | Read Response  | generated until
2705        |       | Headers        | Payloads.      | first RDMA Read
2706        |       | Additionally,  |                | Response is
2707        |       | there is no    |                | generated.
2708        |       | guarantee that |                |
2709        |       | the Tagged     |                |
2710        |       | Buffers        |                |
2711        |       | referenced in  |                |
2712        |       | the RDMA Read  |                |
2713        |       | will be read in|                |
2714        |       | order          |                |
2715  Figure 17 Operation Ordering

2717  Footnote 1: If the guarantee is necessary, a ULP may insert an
2718  RDMA Read Operation and wait for it to complete to act as a Fence.

2720  Footnote 2: If the guarantee is necessary, a ULP may wait for the
2721  RDMA Read Operation to complete before performing the Send.

2723  12 Authors' Addresses

2725  Paul R. Culley
2726  Hewlett-Packard Company
2727  20555 SH 249
2728  Houston, Tx. USA 77070-2698
2729  Phone: 281-514-5543
2730  Email: paul.culley@hp.com

2732  Dave Garcia
2733  Hewlett-Packard Company
2734  19333 Vallco Parkway
2735  Cupertino, Ca. USA 95014
2736  Phone: 408.285.6116
2737  Email: dave.garcia@hp.com

2739  Jeff Hilland
2740  Hewlett-Packard Company
2741  20555 SH 249
2742  Houston, Tx. USA 77070-2698
2743  Phone: 281-514-9489
2744  Email: jeff.hilland@hp.com

2746  Bernard Metzler
2747  IBM Research GmbH
2748  Zurich Research Laboratory
2749  Saeumerstrasse 4
2750  CH-8803 Rueschlikon, Switzerland
2751  Phone: +41 44 724 8605
2752  Email: bmt@zurich.ibm.com

2754  Renato J. Recio
2755  IBM Corp.
2756  11501 Burnett Road
2757  Austin, Tx. USA 78758
2758  Phone: 512-838-3685
2759  Email: recio@us.ibm.com
2760  13 Acknowledgments

2762  Dwight Barron
2763  Hewlett-Packard Company
2764  20555 SH 249
2765  Houston, Tx. USA 77070-2698
2766  Phone: 281-514-2769
2767  Email: dwight.barron@compaq.com

2769  Caitlin Bestler
2770  Email: cait@asomi.com

2772  John Carrier
2773  Adaptec, Inc.
2774  691 S. Milpitas Blvd.
2775  Milpitas, CA 95035 USA
2776  Phone: +1 (360) 378-8526
2777  Email: john_carrier@adaptec.com

2779  Ted Compton
2780  EMC Corporation
2781  Research Triangle Park, NC 27709, USA
2782  Phone: 919-248-6075
2783  Email: compton_ted@emc.com

2785  Uri Elzur
2786  Broadcom Corporation
2787  16215 Alton Parkway
2788  Irvine, California 92619-7013 USA
2789  Phone: +1 (949) 585-6432
2790  Email: Uri@Broadcom.com

2792  Hari Ghadia
2793  Adaptec, Inc.
2794  691 S. Milpitas Blvd.,
2795  Milpitas, CA 95035 USA
2796  Phone: +1 (408) 957-5608
2797  Email: hari_ghadia@adaptec.com

2799  Howard C. Herbert
2800  Intel Corporation
2801  MS CH7-404
2802  5000 West Chandler Blvd.
2804  Chandler, Arizona 85226
2805  Phone: 480-554-3116
2806  Email: howard.c.herbert@intel.com

2808  Mike Ko
2809  IBM
2810  650 Harry Rd.
2811  San Jose, CA 95120
2812  Phone: (408) 927-2085
2813  Email: mako@us.ibm.com

2815  Mike Krause
2816  Hewlett-Packard Company
2817  43LN
2818  19410 Homestead Road
2819  Cupertino, CA 95014 USA
2820  Phone: 408-447-3191
2821  Email: krause@cup.hp.com

2823  Dave Minturn
2824  Intel Corporation
2825  MS JF1-210
2826  5200 North East Elam Young Parkway
2827  Hillsboro, Oregon 97124
2828  Phone: 503-712-4106
2829  Email: dave.b.minturn@intel.com

2831  Mike Penna
2832  Broadcom Corporation
2833  16215 Alton Parkway
2834  Irvine, California 92619-7013 USA
2835  Phone: +1 (949) 926-7149
2836  Email: MPenna@Broadcom.com

2838  Jim Pinkerton
2839  Microsoft, Inc.
2840  One Microsoft Way
2841  Redmond, WA, USA 98052
2842  Email: jpink@microsoft.com

2844  Hemal Shah
2845  Intel Corporation
2846  MS PTL1
2847  1501 South Mopac Expressway, #400
2848  Austin, Texas 78746
2849  Phone: 512-732-3963
2850  Email: hemal.shah@intel.com

2852  Allyn Romanow
2853  Cisco Systems
2854  170 W Tasman Drive
2855  San Jose, CA 95134 USA
2856  Phone: +1 408 525 8836
2857  Email: allyn@cisco.com

2859  Tom Talpey
2860  Network Appliance
2861  375 Totten Pond Road
2862  Waltham, MA 02451 USA
2863  Phone: +1 (781) 768-5329
2864  Email: thomas.talpey@netapp.com

2866  Patricia Thaler
2867  Agilent Technologies, Inc.
2868  1101 Creekside Ridge Drive, #100
2869  M/S-RG10
2870  Roseville, CA 95678
2871  Phone: +1-916-788-5662
2872  Email: pat_thaler@agilent.com

2874  Jim Wendt
2875  Hewlett-Packard Company
2876  8000 Foothills Boulevard MS 5668
2877  Roseville, CA 95747-5668 USA
2878  Phone: +1 916 785 5198
2879  Email: jim_wendt@hp.com

2881  Madeline Vega
2882  IBM
2883  11400 Burnet Rd. Bld.45-
2884  2L-007
2885  Austin, TX 78758
2886  Phone: (512) 838-7739
2887  Email: mvega1@us.ibm.com

2889  14 Intellectual Property Statement

2891  The IETF takes no position regarding the validity or scope of any
2892  Intellectual Property Rights or other rights that might be claimed
2893  to pertain to the implementation or use of the technology
2894  described in this document or the extent to which any license
2895  under such rights might or might not be available; nor does it
2896  represent that it has made any independent effort to identify any
2897  such rights. Information on the procedures with respect to rights
2898  in RFC documents can be found in BCP 78 and BCP 79.

2900  Copies of IPR disclosures made to the IETF Secretariat and any
2901  assurances of licenses to be made available, or the result of an
2902  attempt made to obtain a general license or permission for the use
2903  of such proprietary rights by implementers or users of this
2904  specification can be obtained from the IETF on-line IPR repository
2905  at http://www.ietf.org/ipr.

2907  The IETF invites any interested party to bring to its attention
2908  any copyrights, patents or patent applications, or other
2909  proprietary rights that may cover technology that may be required
2910  to implement this standard. Please address the information to the
2911  IETF at ietf-ipr@ietf.org.

2913  15 IPR Disclosure Acknowledgement

2915  By submitting this Internet-Draft, each author represents that any
2916  applicable patent or other IPR claims of which he or she is aware
2917  have been or will be disclosed, and any of which he or she becomes
2918  aware will be disclosed, in accordance with Section 6 of BCP 79.
2920  16 Disclaimer

2922  This document and the information contained herein are provided on
2923  an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
2924  REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND
2925  THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES,
2926  EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT
2927  THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR
2928  ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A
2929  PARTICULAR PURPOSE.

2931  17 Full Copyright Statement

2933  Copyright (C) The Internet Society (2005).

2935  This document is subject to the rights, licenses and restrictions
2936  contained in BCP 78, and except as set forth therein, the authors
2937  retain all their rights.

2939  This document and the information contained herein are provided on
2940  an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
2941  REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND
2942  THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES,
2943  EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT
2944  THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR
2945  ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A
2946  PARTICULAR PURPOSE.

2948  This document and the information contained herein is provided on
2949  an "AS IS" basis and ADAPTEC INC., AGILENT TECHNOLOGIES INC.,
2950  BROADCOM CORPORATION, CISCO SYSTEMS INC., EMC CORPORATION,
2951  HEWLETT-PACKARD COMPANY, INTERNATIONAL BUSINESS MACHINES
2952  CORPORATION, INTEL CORPORATION, MICROSOFT CORPORATION, NETWORK
2953  APPLIANCE INC., THE INTERNET SOCIETY, AND THE INTERNET ENGINEERING
2954  TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
2955  BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
2956  HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
2957  MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.