Storage Maintenance (storm) Working Group                    Hemal Shah
Internet Draft                                     Broadcom Corporation
Intended status: Standards Track                            Felix Marti
Expires: March 2014                                     Wael Noureddine
                                                       Asgeir Eiriksson
                                           Chelsio Communications, Inc.
                                                           Robert Sharp
                                                      Intel Corporation
                                                     September 10, 2013

                      RDMA Protocol Extensions
                 draft-ietf-storm-rdmap-ext-06.txt

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with
   the provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   This Internet-Draft will expire on March 10, 2014.

Copyright Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document. Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Abstract

   This document specifies extensions to the IETF Remote Direct Memory
   Access Protocol (RDMAP) specified in RFC 5040. RDMAP provides read
   and write services directly to applications and enables data to be
   transferred directly into Upper Layer Protocol (ULP) Buffers without
   intermediate data copies. The extensions specified in this document
   provide the following capabilities and/or improvements: Atomic
   Operations and Immediate Data.

Table of Contents

   1. Introduction
      1.1. Discovery of RDMAP Extensions
   2. Requirements Language
   3. Glossary
   4. Header Format Extensions
      4.1. RDMAP Control and Invalidate STag Fields
      4.2. RDMA Message Definitions
   5. Atomic Operations
      5.1. Atomic Operation Details
         5.1.1. FetchAdd
         5.1.2. CmpSwap
      5.2. Atomic Operations
         5.2.1. Atomic Operation Request Message
         5.2.2. Atomic Operation Response Message
      5.3. Atomicity Guarantees
      5.4. Atomic Operations Ordering and Completion Rules
   6. Immediate Data
      6.1. RDMAP Interactions with ULP for Immediate Data
      6.2. Immediate Data Header Format
      6.3. Immediate Data or Immediate Data with SE Message
      6.4. Ordering and Completions
   7. Ordering and Completions Table
   8. Error Processing
      8.1. Errors Detected at the Local Peer
      8.2. Errors Detected at the Remote Peer
   9. Security Considerations
   10. IANA Considerations
      10.1. RDMAP Message Atomic Operation Subcodes
      10.2. RDMAP Queue Numbers
   11. References
      11.1. Normative References
      11.2. Informative References
   12. Acknowledgments
   Appendix A. DDP Segment Formats for RDMA Messages
      A.1. DDP Segment for Atomic Operation Request
      A.2. DDP Segment for Atomic Response
      A.3. DDP Segment for Immediate Data and Immediate Data with SE

1. Introduction

   The RDMA Protocol [RFC5040] provides capabilities for zero copy data
   communications that preserve memory protection semantics, enabling
   more efficient network protocol implementations. This document
   specifies the following extensions to the RDMA Protocol (RDMAP):

   o Atomic operations on remote memory locations. Support for Atomic
      Operations enhances the usability of RDMAP in distributed shared
      memory environments.

   o Immediate Data messages allow the ULP at the sender to provide a
      small amount of data. When an Immediate Data message is sent
      following an RDMA Write Message, the combination of the two
      messages is an implementation of the RDMA Write with Immediate
      message found in other RDMA transport protocols.

   Other RDMA transport protocols define the functionality added by
   these extensions, leading to differences in RDMA applications and/or
   Upper Layer Protocols. Removing these differences in the transport
   protocols simplifies these applications and ULPs, and that is the
   main motivation for the extensions specified in this document.

   [RSOCKETS] is an example of RDMA enabled middleware that provides a
   socket interface as the upper edge interface and utilizes RDMA to
   provide more efficient networking for sockets based applications.
   [RSOCKETS] is aware of Immediate Data support in [IB]. [RSOCKETS]
   cannot utilize the RDMA Write with Immediate Data operation from
   [IB] on iWARP. The addition of the Immediate Data operation
   specified in this draft will alleviate this difference in [RSOCKETS]
   when running on [IB] and iWARP.

   [DAT_ATOMICS] is an example of RDMA enabled middleware that provides
   a portable RDMA programming interface for various RDMA transport
   protocols. [DAT_ATOMICS] includes a primitive for [IB] that is not
   supported by iWARP RNICs. The addition of Atomic Operations as
   specified in this draft will allow atomic operations in
   [DAT_ATOMICS] to work for both [IB] and iWARP interchangeably.

1.1. Discovery of RDMAP Extensions

   Today there are RDMA applications and/or ULPs that are aware of the
   existence of Atomic and Immediate Data operations for RDMA
   transports such as [IB] and application programming interfaces such
   as [OFA Verbs]. Today, these applications need to be aware that
   iWARP RNICs do not support these operations. Typically the
   availability of these capabilities is exposed to the applications
   through adapter query interfaces in software. Applications then
   have to decide whether to use Immediate Data or Atomic Operations
   based on the results of the query interfaces. Negotiation of Atomic
   Operations is typically used to determine the scope of atomicity
   guarantees, not the individual Atomic Operations supported.
   Therefore, this specification requires all Atomic Operations
   defined within to be supported if an RNIC supports any Atomic
   Operations.

   In cases where heterogeneous hardware, with differing support for
   Atomic Operations and Immediate Data Operations, is deployed for
   usage by RDMA applications and/or ULPs, applications are either
   statically configured to use or not use optional features or use
   application specific negotiation mechanisms. For the extensions
   covered by this document, it is RECOMMENDED that RDMA applications
   and/or ULPs negotiate at the application or ULP level the usage of
   these extensions. The definition of such application specific
   mechanisms is outside the scope of this specification. For backward
   compatibility, existing applications and/or ULPs should assume that
   iWARP RNICs do not support these extensions.

   In the absence of application specific negotiation of the features
   defined within this specification, the new operations can be
   attempted and reported errors can be used to determine a remote
   peer's capabilities. In the case of Atomics, a FetchAdd operation
   with Add Data set to 0 can safely be used to determine the existence
   of Atomic Operations without modifying the content of a remote
   peer's memory. If the remote peer does not support an Immediate
   Data or Atomic Operation, it will report a Remote Operation Error /
   Unexpected OpCode error.

2. Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC-2119 [RFC2119].

3. Glossary

   This document is an extension of [RFC5040] and key words are defined
   in the glossary of the referenced document.

   Atomic Operation - is an operation that results in an execution of a
   memory operation at a specific ULP Buffer address on a remote node
   using the Tagged Buffer data transfer model. The consumer can use
   Atomic Operations to read, modify and write memory at the
   destination ULP Buffer address while at the same time guaranteeing
   that no other Atomic Operation read or write accesses to the ULP
   Buffer address targeted by the Atomic Operation will occur across
   any other RDMAP Streams on an RNIC at the Responder.

   Atomic Operation Request - An RDMA Message used by the Data Source
   to perform an Atomic Operation at the Responder.

   Atomic Operation Response - An RDMA Message used by the Responder to
   describe the completion of an Atomic Operation at the Responder.

   CmpSwap - is an Atomic Operation that is used to compare and swap a
   value at a specific address on a remote node.

   FetchAdd - is an Atomic Operation that is used to atomically
   increment a value at a specific ULP Buffer address on a remote node.

   Immediate Data - a small fixed size portion of data sent from the
   Data Source to a Data Sink.

   Immediate Data Message - An RDMA Message used by the Data Source to
   send Immediate Data to the Data Sink.

   Immediate Data with Solicited Event (SE) Message - An RDMA Message
   used by the Data Source to send Immediate Data with Solicited Event
   to the Data Sink.

   Requester - the sender of an RDMA Atomic Operation request.

   Responder - the receiver of an RDMA Atomic Operation request.

   ULP - Upper Layer Protocol. The protocol layer above the one
   currently being referenced. The ULP for RDMAP[RFC5040]/DDP[RFC5041]
   is expected to be an OS, Application, adaptation layer, or
   proprietary device. The RDMAP[RFC5040]/DDP[RFC5041] documents do not
   specify a ULP -- they provide a set of semantics that allow a ULP to
   be designed to utilize RDMAP[RFC5040]/DDP[RFC5041].

4. Header Format Extensions

   The control information of RDMA Messages is included in DDP protocol
   [RFC5041] defined header fields. [RFC5040] defines the RDMAP header
   formats layered on the [RFC5041] DDP header definition. This
   specification extends [RFC5040] with the following new formats:

   o Four new RDMA Messages carry additional RDMAP headers. The
      Immediate Data operation and Immediate Data with Solicited Event
      operation include 8 bytes of data following the RDMAP header.
      Atomic Operations include Atomic Request or Atomic Response
      headers following the RDMAP header. The RDMAP header for Atomic
      Request Messages is 52 bytes long as specified in Figure 4. The
      RDMAP header for Atomic Response Messages is 12 bytes long as
      specified in Figure 6.

   o Introduction of a new queue for untagged buffers (QN=3) used for
      Atomic Response tracking.

4.1. RDMAP Control and Invalidate STag Fields

   For reference, Figure 1 depicts the format of the DDP Control and
   RDMAP Control fields, in the style and convention of [RFC5040]:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
                                   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                                   |T|L| Resrv | DV| RV|Rsv| Opcode|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Invalidate STag                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

             Figure 1 DDP Control and RDMAP Control Fields

   The DDP Control Field consists of the T, L, Resrv, and DV fields
   [RFC5041]. The RDMAP Control Field consists of the RV, Rsv, and
   Opcode fields [RFC5040].

   This specification adds additional values for the RDMA Opcode field
   to those specified in [RFC5040]. Figure 2 defines the new values of
   the RDMA Opcode field that MUST be used for the RDMA Messages
   defined in this specification.

   Figure 2 also defines when the STag, Tagged Offset, and Queue Number
   fields MUST be provided for the RDMA Messages defined in this
   specification.

   All RDMA Messages defined in this specification MUST have:

   The RDMA Version (RV) field: 01b.

   Opcode field: See Figure 2.

   Invalidate STag: MUST be set to zero by the sender, ignored by the
   receiver.
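
   As an illustration of the field requirements listed above, the
   following Python sketch packs the 8-bit RDMAP Control field (RV,
   Rsv, Opcode) for the new messages. The bit layout follows Figure 1
   and the opcode values are the ones listed in Figure 2. The names
   RV_RDMAP, OPCODES, and rdmap_control_byte are hypothetical and are
   not defined by this specification, and sending the Rsv bits as zero
   is an assumption based on the usual handling of reserved fields.

```python
# Hypothetical helper, not part of the specification.
RV_RDMAP = 0b01  # RDMA Version (RV) value required for these messages

# Opcode values for the new RDMA Messages, as listed in Figure 2.
OPCODES = {
    "immediate_data": 0b1000,
    "immediate_data_se": 0b1001,
    "atomic_request": 0b1010,
    "atomic_response": 0b1011,
}

# Invalidate STag is set to zero by the sender for all four messages.
INVALIDATE_STAG = 0


def rdmap_control_byte(message_type: str) -> int:
    """Pack RV (2 bits), Rsv (2 bits), and Opcode (4 bits) into one octet."""
    opcode = OPCODES[message_type]
    rsv = 0  # assumption: reserved bits are transmitted as zero
    return (RV_RDMAP << 6) | (rsv << 4) | opcode
```

   For example, rdmap_control_byte("atomic_request") yields 0x4A: the
   two version bits 01b, two zero reserved bits, and opcode 1010b.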

   -------+-----------+-------+------+-------+-----------+--------------
   RDMA   | Message   | Tagged| STag | Queue | Invalidate| Message
   Opcode | Type      | Flag  | and  | Number| STag      | Length
          |           |       | TO   |       |           | Communicated
          |           |       |      |       |           | between DDP
          |           |       |      |       |           | and RDMAP
   -------+-----------+-------+------+-------+-----------+--------------
   1000b  | Immediate | 0     | N/A  | 0     | N/A       | Yes
          | Data      |       |      |       |           |
   -------+-----------+-------------------------------------------------
   1001b  | Immediate | 0     | N/A  | 0     | N/A       | Yes
          | Data with |       |      |       |           |
          | SE        |       |      |       |           |
   -------+-----------+-------------------------------------------------
   1010b  | Atomic    | 0     | N/A  | 1     | N/A       | Yes
          | Request   |       |      |       |           |
   -------+-----------+-------------------------------------------------
   1011b  | Atomic    | 0     | N/A  | 3     | N/A       | Yes
          | Response  |       |      |       |           |
   -------+-----------+-------------------------------------------------

             Figure 2 Additional RDMA Usage of DDP Fields

   Note: N/A means Not Applicable.

   This extension defines RDMAP use of Queue Number 3 for Untagged
   Buffers for Atomic Responses. This queue is used for tracking
   outstanding Atomic Requests.

   All other DDP and RDMAP control fields MUST be set as described in
   [RFC5040].

4.2. RDMA Message Definitions

   The following figure defines which RDMA Headers MUST be used on each
   new RDMA Message and which new RDMA Messages are allowed to carry
   ULP payload:

   -------+-----------+-------------------+-------------------------
   RDMA   | Message   | RDMA Header Used  | ULP Message allowed in
   Message| Type      |                   | the RDMA Message
   OpCode |           |                   |
   -------+-----------+-------------------+-------------------------
   1000b  | Immediate | Immediate Data    | No
          | Data      | Header            |
   -------+-----------+-------------------+-------------------------
   1001b  | Immediate | Immediate Data    | No
          | Data with | Header            |
          | SE        |                   |
   -------+-----------+-------------------+-------------------------
   1010b  | Atomic    | Atomic Request    | No
          | Request   | Header            |
   -------+-----------+-------------------+-------------------------
   1011b  | Atomic    | Atomic Response   | No
          | Response  | Header            |
   -------+-----------+-------------------+-------------------------

                  Figure 3 RDMA Message Definitions

5. Atomic Operations

   The RDMA Protocol Specification in [RFC5040] does not include
   support for Atomic Operations which are an important building block
   for implementing distributed shared memory.

   This document extends the RDMA Protocol specification with a set of
   basic Atomic Operations, and specifies their resource and ordering
   rules. The Atomic Operations specified in this document provide
   equivalent functionality to the [IB] RDMA transport as well as
   extended Atomic Operations defined in [OFA Verbs], to allow
   applications that use these primitives to work interchangeably over
   iWARP. Other operations are left for future consideration.

   Atomic operations as specified in this document execute a 64-bit
   memory operation at a specified destination ULP Buffer address on a
   Responder node using the Tagged Buffer data transfer model.
   The operations atomically read, modify and write back the contents
   of the destination ULP Buffer address and guarantee that Atomic
   Operations on this ULP Buffer address by other RDMAP Streams on the
   same RNIC do not occur between the read and the write caused by the
   Atomic Operation. Therefore, the Responder RNIC MUST implement
   mechanisms to prevent other Atomic Operations on memory registered
   for Atomic Operations while an Atomic Operation targeting that
   memory is in progress. Atomicity guarantees between multiple RNICs
   or between RNICs and software running independent of the RNIC are
   outside the scope of this specification. An RNIC that supports
   Atomic Operations as specified in this document MUST implement all
   Atomic Operation Codes defined in Figure 5. The advertisement of
   Tagged Buffer information for Atomic Operations is outside the scope
   of this specification and must be handled by the ULPs.

   Implementation note: It is recommended that the applications do not
   use the ULP Buffer addresses used for Atomic Operations for other
   RDMA operations.

   Implementation note: Errors related to the alignment in the
   following sections cover Atomic Operations targeted at a ULP Buffer
   address that is not aligned to a 64-bit boundary.

   Atomic Operation Request Messages use the same remote addressing
   mechanism as RDMA Reads and Writes. The ULP Buffer address specified
   in the request is in the address space of the Remote Peer to which
   the Atomic Operation is targeted.

   Atomic Operation Response Messages MUST use the Untagged Buffer
   model with QN=3. Queue number 3 MUST be used to track outstanding
   Atomic Operation Request messages at the Requester. When the Atomic
   Operation Response message is received, the MSN MUST be used to
   locate the corresponding Atomic Operation request in order to
   complete the Atomic Operation request.

5.1. Atomic Operation Details

   The following sub-sections describe the Atomic Operations in more
   detail.

5.1.1. FetchAdd

   The FetchAdd Atomic Operation requests the Responder to read a
   64-bit Original Remote Data Value at a 64-bit aligned ULP Buffer
   address in the Responder's memory, to perform the FetchAdd operation
   on multiple fields of selectable length specified by the 64-bit "Add
   Mask", and to write the result back to the same ULP Buffer address.
   The Atomic addition is performed independently on each one of these
   fields. Each bit set in the Add Mask field marks the most
   significant bit position of a field; any carry out of that bit
   position is discarded when the addition is performed.

   FetchAdd Atomic Operations MUST target ULP Buffer addresses that are
   64-bit aligned. FetchAdd Atomic Operations that target ULP Buffer
   addresses that are not 64-bit aligned MUST be surfaced as errors,
   the Responder's memory MUST NOT be modified in such cases, and a
   terminate message MUST be generated. Setting the "Add Mask" field to
   0x0000000000000000 results in an Atomic Add of the 64-bit Original
   Remote Data Value and the 64-bit "Add Data".

   The pseudo code below describes the masked FetchAdd Atomic
   Operation.

      bit_location = 1
      carry = 0
      Remote Data Value = 0

      for bit = 0 to 63
      {
         if (bit != 0 ) bit_location = bit_location << 1

         val1 = (Original Remote Data Value & bit_location) >> bit
         val2 = (Add Data & bit_location) >> bit
         sum = carry + val1 + val2
         carry = !(!(sum & 2))
         sum = sum & 1

         if (sum)
            Remote Data Value |= bit_location

         carry = ((carry) && (!(Add Mask & bit_location)))
      }

   The FetchAdd operation is performed in the endian format of the
   target memory.
   The "Original Remote Data Value" is converted from the endian format
   of the target memory and returned to the Requester. The fields are
   in big-endian format on the wire.

   The Requester specifies:

   o Remote STag

   o Remote Tagged Offset

   o Add Data

   o Add Mask

   The Responder returns:

   o Original Remote Data Value

5.1.2. CmpSwap

   The CmpSwap Atomic Operation requires the Responder to read a 64-bit
   value at a 64-bit aligned ULP Buffer address in the Responder's
   memory, to perform a logical AND operation using the 64-bit
   "Compare Mask" field in the Atomic Operation Request header, then to
   compare it with the result of a logical AND operation of the
   "Compare Mask" and the "Compare Data" fields in the header, and, if
   the two values are equal, to swap masked bits in the same ULP Buffer
   address with the masked Swap Data. If the two masked compare values
   are not equal, the contents of the Responder's memory are not
   changed. In either case, the original value read from the ULP Buffer
   address is converted from the endian format of the target memory and
   returned to the Requester. The fields are in big-endian format on
   the wire.

   The Requester specifies:

   o Remote STag

   o Remote Tagged Offset

   o Swap Data

   o Swap Mask

   o Compare Data

   o Compare Mask

   The Responder returns:

   o Original Remote Data Value

   The following pseudo code describes the masked CmpSwap operation
   result.

      if (!((Compare Data ^ Original Remote Data Value) &
            Compare Mask))
      then
         Remote Data Value =
            (Original Remote Data Value & ~(Swap Mask))
            | (Swap Data & Swap Mask)
      else
         Remote Data Value = Original Remote Data Value

   After the operation, the remote data buffer MUST contain the
   "Original Remote Data Value" (if comparison did not match) or the
   masked "Swap Data" (if the comparison did match).
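
   The masked FetchAdd and CmpSwap pseudo code of Sections 5.1.1 and
   5.1.2 can be restated as the following runnable Python sketch. It
   only computes the new Remote Data Value from its inputs; atomicity,
   alignment checks, and endian conversion are deliberately left out.
   The function names masked_fetch_add and masked_cmp_swap and the
   constant MASK64 are hypothetical and are not defined by this
   specification.

```python
MASK64 = (1 << 64) - 1  # all values are treated as unsigned 64-bit


def masked_fetch_add(original: int, add_data: int, add_mask: int) -> int:
    """Return the new Remote Data Value for a masked FetchAdd."""
    result = 0
    carry = 0
    for bit in range(64):
        bit_location = 1 << bit
        val1 = (original >> bit) & 1
        val2 = (add_data >> bit) & 1
        total = carry + val1 + val2
        if total & 1:
            result |= bit_location
        # A set Add Mask bit marks the top of a field, so the carry
        # out of that bit position is discarded.
        carry = 0 if (add_mask & bit_location) else (total >> 1)
    return result


def masked_cmp_swap(original: int, compare_data: int, compare_mask: int,
                    swap_data: int, swap_mask: int) -> int:
    """Return the new Remote Data Value for a masked CmpSwap."""
    if ((compare_data ^ original) & compare_mask) == 0:
        # Match: replace only the bits selected by the Swap Mask.
        return (original & ~swap_mask & MASK64) | (swap_data & swap_mask)
    return original  # no match: memory is left unchanged
```

   With an Add Mask of zero, masked_fetch_add behaves as a plain 64-bit
   addition that wraps on overflow, and with Add Data set to zero it
   returns the original value unchanged. In both operations the value
   returned to the Requester is the original value, not the result
   computed here.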
   CmpSwap Atomic Operations MUST target ULP Buffer addresses that are
   64-bit aligned. If a CmpSwap Atomic Operation is attempted on a
   target ULP Buffer address that is not 64-bit aligned:

   o The operation MUST NOT be performed,

   o The Responder's memory MUST NOT be modified,

   o The result MUST be surfaced as an error, and

   o A terminate message MUST be generated (see Section 8.2 for the
      terminate message contents).

5.2. Atomic Operations

   The Atomic Operation Request and Response are RDMA Messages. An
   Atomic Operation makes use of the DDP Untagged Buffer Model. Atomic
   Operation Request messages MUST use the same Queue Number as RDMA
   Read Requests (QN=1). Reusing the same Queue Number for Atomic
   Request messages allows the Atomic Operations to reuse the same
   infrastructure (e.g. ORD/IRD flow control) as defined for RDMA Read
   Requests. Atomic Operation Response messages MUST set Queue Number
   (QN) to 3 in the DDP header.

   The RDMA Message OpCode for an Atomic Request Message is 1010b. The
   RDMA Message OpCode for an Atomic Response Message is 1011b.

5.2.1. Atomic Operation Request Message

   The Atomic Operation Request Message carries an Atomic Operation
   Header that describes the ULP Buffer address in the Responder's
   memory. The Atomic Operation Request header immediately follows the
   DDP header. The RDMAP layer passes to the DDP layer an RDMAP Control
   Field.
   The following figure depicts the Atomic Operation Request Header
   that MUST be used for all Atomic Operation Request Messages:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  Un-used (Not Used)                   |AOpCode|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Request Identifier                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          Remote STag                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     Remote Tagged Offset                      |
   +                                                               +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       Add or Swap Data                        |
   +                                                               +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       Add or Swap Mask                        |
   +                                                               +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         Compare Data                          |
   +                                                               +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         Compare Mask                          |
   +                                                               +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

               Figure 4 Atomic Operation Request Header

   Un-used (Not Used): 28 bits.

      This field MUST be set to zero on transmit, ignored on
      receive.

   Atomic Operation Code (AOpCode): 4 bits.

      See Figure 5. All Atomic Operation Codes from Figure 5 MUST
      be implemented by an RNIC that supports Atomic Operations.

   Request Identifier: 32 bits.

      The Request Identifier specifies a number that is used to
      identify the Atomic Operation Request Message. The value used
      in this field is selected by the RNIC that sends the message,
      and is reflected back to the Local Peer in the Atomic Operation
      Response message.

   Remote STag: 32 bits.

      The Remote STag identifies the Remote Peer's Tagged Buffer
      targeted by the Atomic Operation. The Remote STag is
      associated with the RDMAP Stream through a mechanism that is
      outside the scope of the RDMAP specification.

   Remote Tagged Offset: 64 bits.

      The Remote Tagged Offset specifies the starting offset, in
      octets, from the base of the Remote Peer's Tagged Buffer
      targeted by the Atomic Operation. The Remote Tagged Offset MAY
      start at an arbitrary offset.

   Add or Swap Data: 64 bits.

      The Add or Swap Data field specifies the 64-bit "Add Data"
      value in an Atomic FetchAdd Operation or the 64-bit "Swap
      Data" value in an Atomic Swap or CmpSwap Operation.

   Add or Swap Mask: 64 bits.

      This field is used in masked Atomic Operations (FetchAdd and
      CmpSwap) to perform a bitwise logical AND operation as
      specified in the definition of these operations. For non-masked
      Atomic Operations (Swap), this field MUST be set to
      ffffffffffffffffh on transmit and ignored by the receiver.

   Compare Data: 64 bits.

      The Compare Data field specifies the 64-bit "Compare Data"
      value in an Atomic CmpSwap Operation. For Atomic FetchAdd and
      Atomic Swap operations, the Compare Data field MUST be set to
      zero on transmit and ignored by the receiver.

   Compare Mask: 64 bits.

      This field is used in the masked Atomic Operation CmpSwap to
      perform a bitwise logical AND operation as specified in the
      definition of this operation. For Atomic Operations FetchAdd
      and Swap, this field MUST be set to ffffffffffffffffh on
      transmit and ignored by the receiver.
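
   The field layout described above can be exercised with the
   following Python sketch, which serializes an Atomic Operation
   Request Header in network (big-endian) byte order. It is a minimal
   illustration under stated assumptions, not a reference encoder: the
   function pack_atomic_request and the constant names are
   hypothetical, the AOpCode values are the ones listed in Figure 5,
   and the defaults for Compare Data and Compare Mask follow the rules
   given above for FetchAdd.

```python
import struct

# One 32-bit word (28 unused bits + 4-bit AOpCode), Request Identifier,
# Remote STag, then five 64-bit fields, all big-endian with no padding.
ATOMIC_REQUEST_FORMAT = "!IIIQQQQQ"

AOPCODE_FETCH_ADD = 0b0000  # value from Figure 5
AOPCODE_CMP_SWAP = 0b0010   # value from Figure 5
ALL_ONES = 0xFFFFFFFFFFFFFFFF


def pack_atomic_request(aopcode: int, request_id: int, remote_stag: int,
                        remote_tagged_offset: int, data: int, mask: int,
                        compare_data: int = 0,
                        compare_mask: int = ALL_ONES) -> bytes:
    """Serialize the 52-byte Atomic Operation Request Header."""
    first_word = aopcode & 0xF  # the upper 28 unused bits are sent as zero
    return struct.pack(ATOMIC_REQUEST_FORMAT, first_word, request_id,
                       remote_stag, remote_tagged_offset, data, mask,
                       compare_data, compare_mask)
```

   The packed header is 52 bytes long, which matches the sum of the
   field widths given in this section (4 + 4 + 4 + 5 x 8 octets).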

   ---------+-----------+----------+----------+---------+---------
   Atomic   | Atomic    | Add or   | Add or   | Compare | Compare
   Operation| Operation | Swap     | Swap     | Data    | Mask
   Code     |           | Data     | Mask     |         |
   ---------+-----------+----------+----------+---------+---------
   0000b    | FetchAdd  | Add Data | Add Mask | N/A     | N/A
   ---------+-----------+----------+----------+---------+---------
   0010b    | CmpSwap   | Swap Data| Swap Mask| Valid   | Valid
   ---------+-----------+-----------------------------------------

            Figure 5 Atomic Operation Message Definitions

   The Atomic Operation Request Message has the following semantics:

   1. An Atomic Operation Request Message MUST reference an Untagged
      Buffer. That is, the Local Peer's RDMAP layer MUST request that
      the DDP mark the Message as Untagged.

   2. One Atomic Operation Request Message MUST consume one Untagged
      Buffer.

   3. The Responder's RDMAP layer MUST process an Atomic Operation
      Request Message. A valid Atomic Operation Request Message MUST
      NOT be delivered to the Responder's ULP (i.e., it is processed by
      the RDMAP layer).

   4. At the Responder, an error MUST be surfaced in response to
      delivery to the Remote Peer's RDMAP layer of an Atomic Operation
      Request Message with an Atomic Operation Code that the RNIC does
      not support.

   5. An Atomic Operation Request Message MUST reference the RDMA Read
      Request Queue. That is, the Requester's RDMAP layer MUST request
      that the DDP layer set the Queue Number field to one.

   6. The Requester MUST pass to the DDP layer Atomic Operation Request
      Messages in the order they were submitted by the ULP.

   7. The Responder MUST process the Atomic Operation Request Messages
      in the order they were sent.

   8. If the Responder receives a valid Atomic Operation Request
      Message, it MUST respond with a valid Atomic Operation Response
      Message.

5.2.2. Atomic Operation Response Message

   The Atomic Operation Response Message carries an Atomic Operation
   Response Header that contains the "Original Request Identifier" and
   "Original Remote Data Value". The Atomic Operation Response Header
   immediately follows the DDP header. The RDMAP layer passes to the
   DDP layer an RDMAP Control Field. The following figure depicts the
   Atomic Operation Response header that MUST be used for all Atomic
   Operation Response Messages:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  Original Request Identifier                  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  Original Remote Data Value                   |
   +                                                               +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

               Figure 6 Atomic Operation Response Header

   Original Request Identifier: 32 bits.

      The Original Request Identifier MUST be set to the value
      specified in the Request Identifier field that was originally
      provided in the corresponding Atomic Operation Request
      Message.

   Original Remote Data Value: 64 bits.

      The Original Remote Data Value specifies the original 64-bit
      value stored at the ULP Buffer address targeted by the Atomic
      Operation.

   The Atomic Operation Response Message has the following semantics:

   1. The Atomic Operation Response Message for the associated Atomic
      Operation Request Message travels in the opposite direction.

   2. An Atomic Operation Response Message MUST consume an Untagged
      Buffer. That is, the Responder RDMAP layer MUST request that the
      DDP mark the Message as Untagged.

   3. An Atomic Operation Response Message MUST reference the Queue
      Number 3. That is, the Responder's RDMAP layer MUST request that
      the DDP layer set the Queue Number field to 3.

   4. The Responder MUST ensure that a sufficient number of Untagged
      Buffers are available on the RDMA Read Request Queue (Queue with
      DDP Queue Number 1) to support the maximum number of Atomic
      Operation Requests negotiated by the ULP in addition to the
      maximum number of RDMA Read Requests negotiated by the ULP.

   5. The Requester MUST ensure that a sufficient number of Untagged
      Buffers are available on the RDMA Atomic Response Queue (Queue
      with DDP Queue Number 3) to support the maximum number of Atomic
      Operation Requests negotiated by the ULP.

   6. The RDMAP layer MUST Deliver the Atomic Operation Response
      Message to the ULP.

   7. At the Requester, when an invalid Atomic Operation Response
      Message is delivered to the Remote Peer's RDMAP layer, an error
      is surfaced.

   8. When the Responder receives Atomic Operation Request messages,
      the Responder RDMAP layer MUST pass Atomic Operation Response
      Messages to the DDP layer, in the order that the Atomic Operation
      Request Messages were received by the RDMAP layer, at the
      Responder.

5.3. Atomicity Guarantees

   Atomicity of the Read-Modify-Write (RMW) on the Responder's node by
   the Atomic Operation MUST be assured in the context of concurrent
   atomic accesses by other RDMAP Streams on the same RNIC.

5.4. Atomic Operations Ordering and Completion Rules

   In addition to the ordering and completion rules described in
   [RFC5040], the following rules apply to implementations of the
   Atomic operations.

   1. For an Atomic operation, the Requester MUST NOT consider the
      contents of the Tagged Buffer at the Responder to be modified by
      that specific Atomic Operation until the Atomic Operation
      Response Message has been Delivered to RDMAP at the Requester.

   2. Atomicity guarantees MUST be within the scope of a single RNIC.

 792        Implementation Note: Implementations may not guarantee Atomicity
 793        if the Tagged Buffer is accessed by any method other than
 794        an Atomic Operation within the scope of a single RNIC.

 796     3. Atomic Operation Request Messages MUST NOT start processing at
 797        the Responder until they have been Delivered to RDMAP by DDP.

 799     4. Atomic Operation Response Messages MAY be generated at the
 800        Responder after subsequent RDMA Write Messages or Send Messages
 801        have been Placed or Delivered.

 803     5. Atomic Operation Response Message processing at the Responder
 804        MUST be started only after the Atomic Operation Request Message
 805        has been Delivered by the DDP layer (thus, all previous RDMA
 806        Messages on that DDP Stream have been Delivered).

 808     6. Send Messages MAY be Completed at the Responder before prior
 809        incoming Atomic Operation Request Messages have completed their
 810        response processing.

 812     7. An Atomic Operation MUST NOT be Completed at the Requester until
 813        the DDP layer Delivers the associated incoming Atomic Operation
 814        Response Message.

 816     8. If more than one outstanding Atomic Request Message is
 817        supported by both peers, the Atomic Operation Request Messages
 818        MUST be processed in the order they were Delivered by the DDP
 819        layer on the Responder. Atomic Operation Response Messages MUST
 820        be submitted to the DDP layer on the Responder in the order the
 821        Atomic Operation Request Messages were Delivered by DDP.

 823  6. Immediate Data

 825     The Immediate Data operation is typically used in conjunction with an
 826     RDMA Write Operation to improve ULP processing efficiency. The
 827     efficiency is gained by causing an RDMA Completion to be generated
 828     immediately following the RDMA Write operation. This RDMA Completion
 829     delivers 8 bytes of immediate data at the Remote Peer. The
 830     combination of an RDMA Write Message followed by an Immediate Data
 831     Operation has the same behavior as the RDMA Write with Immediate Data
 832     operation found in [IB]. An Immediate Data operation that is not
 833     preceded by an RDMA Write operation causes an RDMA Completion.

 835  6.1. RDMAP Interactions with ULP for Immediate Data

 837     For Immediate Data operations, the following are the interactions
 838     between the RDMAP Layer and the ULP:
 839     o  At the Data Source:

 841        o  The ULP passes to the RDMAP Layer the following:

 843           o  Eight bytes of ULP Immediate Data

 845        o  When the Immediate Data operation Completes, an indication
 846           of the Completion results.

 848     o  At the Data Sink:

 850        o  If the Immediate Data operation is Completed successfully,
 851           the RDMAP Layer passes the following information to the ULP
 852           Layer:

 854           o  Eight bytes of Immediate Data

 856           o  An Event, if the Data Sink is configured to generate an
 857              Event.

 859        o  If the Immediate Data operation is Completed in error, the
 860           Data Sink RDMAP Layer will pass up the corresponding error
 861           information to the Data Sink ULP and send a Terminate
 862           Message to the Data Source RDMAP Layer. The Data Source
 863           RDMAP Layer will then pass up the Terminate Message to the
 864           ULP.

 866  6.2. Immediate Data Header Format

 868     The Immediate Data and Immediate Data with SE Messages carry
 869     immediate data as shown in Figure 7. The RDMAP layer passes to the
 870     DDP layer an RDMAP Control Field and 8 bytes of Immediate Data. The
 871     first 8 bytes of the data following the DDP header contain the
 872     Immediate Data. See Section A.3 for the DDP segment format of an
 873     Immediate Data or Immediate Data with SE Message.
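
   Non-normative illustration: the handling of the 8 bytes described
   above might be sketched in C as follows. The sketch treats the
   Immediate Data as an opaque 8-byte field located directly after the
   DDP header. The helper names imm_data_put and imm_data_get, and the
   idea that the caller has already located the first byte after the
   DDP header, are assumptions of this example and are not defined by
   this document. The receive-side length check mirrors the Section 6.3
   requirement that exactly 8 bytes be present.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define IMM_DATA_LEN 8 /* Immediate Data is always 8 bytes */

/* Hypothetical helper: copy the 8 bytes of ULP Immediate Data into the
 * payload area that follows the DDP header.
 * Returns 0 on success, -1 if the payload area is too small. */
static int imm_data_put(uint8_t *payload, size_t payload_cap,
                        const uint8_t imm[IMM_DATA_LEN])
{
    if (payload_cap < IMM_DATA_LEN)
        return -1;
    memcpy(payload, imm, IMM_DATA_LEN);
    return 0;
}

/* Hypothetical helper: extract the Immediate Data at the receiver.
 * payload_len is the length reported by the DDP layer.
 * Returns 0 on success, -1 if the length is not exactly 8 bytes. */
static int imm_data_get(const uint8_t *payload, size_t payload_len,
                        uint8_t imm_out[IMM_DATA_LEN])
{
    if (payload_len != IMM_DATA_LEN)
        return -1;
    memcpy(imm_out, payload, IMM_DATA_LEN);
    return 0;
}
```

   The sketch copies bytes without interpreting them, since the text
   above describes the Immediate Data only as 8 bytes and this example
   assumes no particular byte order.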

 875      0                   1                   2                   3
 876      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
 877     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 878     |                        Immediate Data                         |
 879     +                                                               +
 880     |                                                               |
 881     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

 883     Figure 7 Immediate Data or Immediate Data with SE Message Header

 885     Immediate Data: 64 bits.
 886        Eight bytes of data transferred from the Requester to an
 887        Untagged Buffer at the Responder.

 889  6.3. Immediate Data or Immediate Data with SE Message

 891     The Immediate Data or Immediate Data with SE Message uses the DDP
 892     Untagged Buffer Model to transfer Immediate Data from the Data
 893     Source to the Data Sink.
 894     o  An Immediate Data or Immediate Data with SE Message MUST
 895        reference an Untagged Buffer. That is, the Local Peer's RDMAP
 896        Layer MUST request that the DDP layer mark the Message as
 897        Untagged.

 899     o  One Immediate Data or Immediate Data with SE Message MUST consume
 900        one Untagged Buffer.

 902     o  At the Remote Peer, Immediate Data and Immediate Data with SE
 903        Messages MUST be Delivered to the Remote Peer's ULP in the order
 904        they were sent.

 906     o  For an Immediate Data or Immediate Data with SE Message, the
 907        Local Peer's RDMAP Layer MUST request that the DDP layer set the
 908        Queue Number field to zero.

 910     o  For an Immediate Data or Immediate Data with SE Message, the
 911        Local Peer's RDMAP Layer MUST request that the DDP layer transmit
 912        8 bytes of data.

 914     o  The Local Peer MUST issue Immediate Data and Immediate Data with
 915        SE Messages in the order they were submitted by the ULP.

 917     o  The Remote Peer MUST check that Immediate Data and Immediate Data
 918        with SE Messages include exactly 8 bytes of data from the DDP
 919        layer. The DDP header carries the length field that is reported
 920        by the DDP layer.

 922  6.4. Ordering and Completions

 924     Ordering and completion rules for Immediate Data are the same as
 925     those for a Send operation as described in Section 5.5 of [RFC5040].

 927  7. Ordering and Completions Table

 929     The following table summarizes the ordering relationships for Atomic
 930     and Immediate Data operations from the standpoint of the Local Peer
 931     issuing the Operations. Note that in the table that follows, Send
 932     includes Send, Send with Invalidate, Send with Solicited Event, and
 933     Send with Solicited Event and Invalidate. Also note that in the table
 934     below, Immediate Data includes Immediate Data and Immediate Data with
 935     Solicited Event.

 937  ----------+------------+-------------+-------------+-------------------
 938  First     | Second     | Placement   | Placement   | Ordering
 939  Operation | Operation  | Guarantee at| Guarantee at| Guarantee at
 940            |            | Remote Peer | Local Peer  | Remote Peer
 941  ----------+------------+-------------+-------------+-------------------
 942  Immediate | Send       | No Placement| Not         | Completed in
 943  Data      |            | Guarantee   | Applicable  | Order
 944            |            | between Send|             |
 945            |            | Payload and |             |
 946            |            | Immediate   |             |
 947            |            | Data        |             |
 948  ----------+------------+-------------+-------------+-------------------
 949  Immediate | RDMA       | No Placement| Not         | Not
 950  Data      | Write      | Guarantee   | Applicable  | Applicable
 951            |            | between RDMA|             |
 952            |            | Write       |             |
 953            |            | Payload and |             |
 954            |            | Immediate   |             |
 955            |            | Data        |             |
 956  ----------+------------+-------------+-------------+-------------------
 957  Immediate | RDMA       | No Placement| RDMA Read   | RDMA Read
 958  Data      | Read       | Guarantee   | Response    | Response
 959            |            | between     | will not be | Message will
 960            |            | Immediate   | Placed until| not be
 961            |            | Data and    | Immediate   | generated
 962            |            | RDMA Read   | Data is     | until
 963            |            | Request     | Placed at   | Immediate Data
 964            |            |             | Remote Peer | has been
 965            |            |             |             | Completed
 966  ----------+------------+-------------+-------------+-------------------
 967  Immediate | Atomic     | No Placement| Atomic      | Atomic
 968  Data      |            | Guarantee   | Response    | Response
 969            |            | between     | will not be | Message will
 970            |            | Immediate   | Placed until| not be
 971            |            | Data and    | Immediate   | generated
 972            |            | Atomic      | Data is     | until
 973            |            | Request     | Placed at   | Immediate Data
 974            |            |             | Remote Peer | has been
 975            |            |             |             | Completed
 976  ----------+------------+-------------+-------------+-------------------
 977  Immediate | Immediate  | No Placement| Not         | Completed in
 978  Data or   | Data       | Guarantee   | Applicable  | Order
 979  Send      |            |             |             |
 980  ----------+------------+-------------+-------------+-------------------
 981  RDMA Write| Immediate  | No Placement| Not         | Immediate Data
 982            | Data       | Guarantee   | Applicable  | is Completed
 983            |            |             |             | after RDMA
 984            |            |             |             | Write is Placed
 985            |            |             |             | and Delivered
 986  ----------+------------+-------------+-------------+-------------------
 987  RDMA Read | Immediate  | No Placement| Immediate   | Not Applicable
 988            | Data       | Guarantee   | Data MAY be |
 989            |            | between     | Placed      |
 990            |            | Immediate   | before      |
 991            |            | Data and    | RDMA Read   |
 992            |            | RDMA Read   | Response is |
 993            |            | Request     | generated   |
 995  ----------+------------+-------------+-------------+-------------------
 996  Atomic    | Immediate  | No Placement| Immediate   | Not Applicable
 997            | Data       | Guarantee   | Data MAY be |
 998            |            | between     | Placed      |
 999            |            | Immediate   | before      |
1000            |            | Data and    | Atomic      |
1001            |            | Atomic      | Response is |
1002            |            | Request     | generated   |
1003  ----------+------------+-------------+-------------+-------------------
1004  Atomic    | Send       | No Placement| Send Payload| Not Applicable
1005            |            | Guarantee   | MAY be      |
1006            |            | between Send| Placed      |
1007            |            | Payload and | before      |
1008            |            | Atomic      | Atomic      |
1009            |            | Request     | Response is |
1010            |            |             | generated   |
1011  ----------+------------+-------------+-------------+-------------------
1012  Atomic    | RDMA       | No Placement| RDMA Write  | Not
1013            | Write      | Guarantee   | Payload MAY | Applicable
1014            |            | between RDMA| be Placed   |
1015            |            | Write       | before      |
1016            |            | Payload and | Atomic      |
1017            |            | Atomic      | Response is |
1018            |            | Request     | generated   |
1019  ----------+------------+-------------+-------------+-------------------
1020  Atomic    | RDMA       | No Placement| No Placement| RDMA Read
1021            | Read       | Guarantee   | Guarantee   | Response
1022            |            | between     | between     | Message will
1023            |            | Atomic      | Atomic      | not be
1024            |            | Request and | Response    | generated
1025            |            | RDMA Read   | and RDMA    | until Atomic
1026            |            | Request     | Read        | Response Message
1027            |            |             | Response    | has been
1028            |            |             |             | generated
1029  ----------+------------+-------------+-------------+-------------------
1030  Atomic    | Atomic     | Placed in   | No Placement| Second Atomic
1031            |            | order       | Guarantee   | Request
1032            |            |             | between two | Message will
1033            |            |             | Atomic      | not be
1034            |            |             | Responses   | processed
1035            |            |             |             | until first
1036            |            |             |             | Atomic Response
1037            |            |             |             | has been
1038            |            |             |             | generated
1039  ----------+------------+-------------+-------------+-------------------
1040  Send      | Atomic     | No Placement| Atomic      | Atomic Response
1041            |            | Guarantee   | Response    | Message will not
1042            |            | between Send| will not be | be generated until
1043            |            | Payload and | Placed at   | Send has been
1044            |            | Atomic      | the Local   | Completed
1045            |            | Request     | Peer until  |
1046            |            |             | Send Payload|
1047            |            |             | is Placed   |
1048            |            |             | at the      |
1049            |            |             | Remote Peer |
1050  ----------+------------+-------------+-------------+-------------------
1051  RDMA      | Atomic     | No Placement| Atomic      | Not
1052  Write     |            | Guarantee   | Response    | Applicable
1053            |            | between RDMA| will not be |
1054            |            | Write       | Placed at   |
1055            |            | Payload and | the Local   |
1056            |            | Atomic      | Peer until  |
1057            |            | Request     | RDMA Write  |
1058            |            |             | Payload     |
1059            |            |             | is Placed   |
1060            |            |             | at the      |
1061            |            |             | Remote Peer |
1062  ----------+------------+-------------+-------------+-------------------
1063  RDMA      | Atomic     | No Placement| No Placement| Atomic Response
1064  Read      |            | Guarantee   | Guarantee   | Message will
1065            |            | between     | between     | not be generated
1066            |            | Atomic      | Atomic      | until RDMA
1067            |            | Request and | Response    | Read Response
1068            |            | RDMA Read   | and RDMA    | has been
1069            |            | Request     | Read        | generated
1070            |            |             | Response    |
1071  ----------+------------+-------------+-------------+-------------------

1073  8. Error Processing

1075     In addition to error processing described in Section 7 of [RFC5040],
1076     the following rules apply for the new RDMA Messages defined in this
1077     specification.

1079  8.1. Errors Detected at the Local Peer

1081     The Local Peer MUST send a Terminate Message for each of the
1082     following cases:

1084     1. For errors detected while creating an Atomic Request, Atomic
1085        Response, Immediate Data, or Immediate Data with SE Message, or
1086        for other reasons not directly associated with an incoming Message,
1087        the Terminate Message and Error Code are sent instead of the
1088        Message. In this case, the Error Type and Error Code fields are
1089        included in the Terminate Message, but the Terminated DDP Header
1090        and Terminated RDMA Header fields are set to zero.

1092     2. For errors detected on an incoming Atomic Request, Atomic
1093        Response, Immediate Data, or Immediate Data with Solicited Event
1094        (after the Message has been Delivered by DDP), the Terminate
1095        Message is sent at the earliest possible opportunity, preferably
1096        in the next outgoing RDMA Message. In this case, the Error Type,
1097        Error Code, and Terminated DDP Header fields are included in the
1098        Terminate Message, but the Terminated RDMA Header field is set to
1099        zero.

1101  8.2. Errors Detected at the Remote Peer

1103     On incoming Atomic Requests, Atomic Responses, Immediate Data, and
1104     Immediate Data with Solicited Event, the following MUST be
1105     validated:

1107     o  The DDP layer MUST validate all DDP Segment fields.

1109     o  The RDMA OpCode MUST be valid.

1111     o  The RDMA Version MUST be valid.
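
   Non-normative illustration: the two RDMAP-level checks in the list
   above might be sketched in C as follows. The sketch assumes the
   8-bit RDMAP Control Field layout of [RFC5040] (a 2-bit RDMA Version
   in the high bits, 2 reserved bits, and a 4-bit RDMA OpCode in the
   low bits), assumes an RDMA Version value of 1, and uses the OpCode
   values requested in Section 10. All function and constant names are
   hypothetical. The sketch accepts only the four Messages defined in
   this document; a complete receiver would also accept the [RFC5040]
   OpCodes. DDP Segment validation belongs to the DDP layer and is not
   shown.

```c
#include <stdbool.h>
#include <stdint.h>

/* RDMA OpCode values requested in the IANA Considerations section. */
enum rdmap_ext_opcode {
    RDMAP_OP_IMM_DATA        = 0x8, /* Immediate Data */
    RDMAP_OP_IMM_DATA_SE     = 0x9, /* Immediate Data with Solicited Event */
    RDMAP_OP_ATOMIC_REQUEST  = 0xA, /* Atomic Request */
    RDMAP_OP_ATOMIC_RESPONSE = 0xB  /* Atomic Response */
};

/* Assumption of this sketch: RDMA Version value expected from the peer. */
#define RDMAP_VERSION_ASSUMED 1u

/* Assumed layout: version in bits 7..6, OpCode in bits 3..0. */
static unsigned rdmap_ctrl_version(uint8_t ctrl)
{
    return (ctrl >> 6) & 0x3u;
}

static unsigned rdmap_ctrl_opcode(uint8_t ctrl)
{
    return ctrl & 0xFu;
}

/* Returns true if the control byte carries the assumed RDMA Version
 * and one of the four OpCodes defined by this document. */
static bool rdmap_ext_ctrl_is_valid(uint8_t ctrl)
{
    if (rdmap_ctrl_version(ctrl) != RDMAP_VERSION_ASSUMED)
        return false;

    switch (rdmap_ctrl_opcode(ctrl)) {
    case RDMAP_OP_IMM_DATA:
    case RDMAP_OP_IMM_DATA_SE:
    case RDMAP_OP_ATOMIC_REQUEST:
    case RDMAP_OP_ATOMIC_RESPONSE:
        return true;
    default:
        return false;
    }
}
```

   A caller that sees rdmap_ext_ctrl_is_valid return false would then
   follow the Terminate Message handling described in Section 8.1.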

1113     On incoming Atomic Requests, the following additional validation MUST
1114     be performed:

1116     o  The RDMAP layer MUST validate that the Remote Peer's Tagged ULP
1117        Buffer address references a 64-bit aligned ULP Buffer address. In
1118        the case of an error, the RDMAP layer MUST generate a Terminate
1119        Message indicating RDMA Layer Remote Operation Error with Error
1120        Code Name "Catastrophic Error, Localized to RDMAP Stream" as
1121        described in Section 4.8 of [RFC5040]. Implementation Note: A ULP
1122        implementation can avoid this error by aligning the target ULP
1123        Buffer of an Atomic Operation on a 64-bit boundary.

1125  9. Security Considerations

1127     This document specifies extensions to the RDMA Protocol
1128     specification in [RFC5040], and as such the Security Considerations
1129     discussed in Section 8 of [RFC5040] apply. In particular, Atomic
1130     Operations use ULP Buffer addresses for Remote Peer buffer
1131     addressing, as in [RFC5040], which is used to satisfy the [RFC5042]
1132     security model. No additional Security Considerations are required
1133     for the extensions specified in this document.

1135  10. IANA Considerations

1137     IANA is requested to add the following entries to the "RDMAP Message
1138     Operation Codes" registry of "RDDP Registries":

1140     0x8, Immediate Data, [RFCXXXX]

1142     0x9, Immediate Data with Solicited Event, [RFCXXXX]

1144     0xA, Atomic Request, [RFCXXXX]

1146     0xB, Atomic Response, [RFCXXXX]

1148     In addition, the following registry is requested to be added to
1149     "RDDP Registries". The following section specifies the registry, its
1150     initial contents, and the administration policy in more detail.

1152  10.1. RDMAP Message Atomic Operation Subcodes

1154     Name of the registry: "RDMAP Message Atomic Operation Subcodes"

1156     Namespace details: RDMAP Message Atomic Operation Subcodes are 4-bit
1157     values [RFCXXXX].

1159     Information that must be provided to assign a new value: An IESG-
1160     approved standards-track specification defining the semantics and
1161     interoperability requirements of the proposed new value and the
1162     fields to be recorded in the registry.

1164     Fields to record in the registry: RDMAP Message Atomic Operation
1165     Subcode, Atomic Operation, RFC Reference.

1167     Initial registry contents:

1169     0x0, FetchAdd, [RFCXXXX]
1170     0x2, CmpSwap, [RFCXXXX]

1172     Note: An experimental RDMAP Message Operation Code has already been
1173     allocated; hence there is no need for an experimental RDMAP Message
1174     Atomic Operation Subcode.

1176     All other values are Unassigned and available to IANA for
1177     assignment.

1179     Allocation Policy: Standards Action ([RFC5226])

1181     RFC Editor: Please replace XXXX in all instances of [RFCXXXX] above
1182     with the RFC number of this document and remove this note.

1184  10.2. RDMAP Queue Numbers

1186     Name of the registry: "RDMAP DDP Untagged Queue Numbers"

1188     Namespace details: RDMAP DDP Untagged Queue Numbers are 32-bit
1189     values [RFCXXXX].

1191     Information that must be provided to assign a new value: An IESG-
1192     approved standards-track specification defining the semantics and
1193     interoperability requirements of the proposed new value and the
1194     fields to be recorded in the registry.

1196     Fields to record in the registry: RDMAP DDP Untagged Queue Number,
1197     Queue Usage Description, RFC Reference.

1199     Initial registry contents:

1201     0x00000000, Queue 0 (Send operation Variants), [RFC5040]

1203     0x00000001, Queue 1 (RDMA Read Request operations), [RFC5040]

1205     0x00000002, Queue 2 (Terminate operations), [RFC5040]

1207     0x00000003, Queue 3 (Atomic Response operations), [RFCXXXX]

1209     Note: An experimental RDMAP Message Operation Code has already been
1210     allocated; hence there is no need for an experimental RDMAP DDP
1211     Untagged Queue Number.

1213     All other values are Unassigned and available to IANA for
1214     assignment.

1216     Allocation Policy: Standards Action ([RFC5226])

1218     RFC Editor: Please replace XXXX in all instances of [RFCXXXX] above
1219     with the RFC number of this document and remove this note.

1221  11. References

1223  11.1. Normative References

1225     [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
1226               Requirement Levels", BCP 14, RFC 2119, March 1997.

1228     [RFC5040] Recio, R. et al., "A Remote Direct Memory Access Protocol
1229               Specification", RFC 5040, October 2007.

1231     [RFC5041] Shah, H. et al., "Direct Data Placement over Reliable
1232               Transports", RFC 5041, October 2007.

1234     [RFC5042] Pinkerton, J. and E. Deleganes, "Direct Data Placement
1235               Protocol (DDP) / Remote Direct Memory Access Protocol
1236               (RDMAP) Security", RFC 5042, October 2007.

1238  11.2. Informative References

1240     [IB]      InfiniBand Trade Association, "InfiniBand Architecture
1241               Specification Volumes 1 and 2", Release 1.1, November
1242               2002, available from http://www.infinibandta.org/specs.

1244     [RSOCKETS] RSockets, RDMA enabled Sockets library for Open Fabrics,
1245               available from
1246               http://git.openfabrics.org/git?p=~shefty/librdmacm.git;a=b
1247               lob;f=src/rsocket.c;h=d544dd097cda228de114173c8fe569dc1881
1248               f057;hb=HEAD.

1250     [OFA Verbs] Open Fabrics Alliance Verbs Enhanced Atomic Operations,
1251               "[PATCH 0/2] Add support for enhanced atomic operations",
1252               available from
1253               http://comments.gmane.org/gmane.linux.drivers.rdma/2397.

1255     [DAT_ATOMICS] DAT Collaborative, User Direct Access Programming
1256               Library, "Ratified DAT IB extension spec", available from
1257               http://www.datcollaborative.org/DAT_IB_Extensions.pdf.

1259  12. Acknowledgments

1261     The authors would like to acknowledge the following contributors who
1262     provided valuable comments and suggestions.

1264        o  David Black

1266        o  Arkady Kanevsky

1268        o  Bernard Metzler

1270        o  Jim Pinkerton

1272        o  Tom Talpey

1274        o  Steve Wise

1276        o  Don Wood

1278     This document was prepared using 2-Word-v2.0.template.dot.

1280  Appendix A. DDP Segment Formats for RDMA Messages

1282     This appendix is for information only and is NOT part of the
1283     standard. It simply depicts the DDP Segment format for the various
1284     RDMA Messages.

1286  A.1. DDP Segment for Atomic Operation Request

1288     The following figure depicts an Atomic Operation Request DDP
1289     Segment:

1291      0                   1                   2                   3
1292      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
1293                                     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1294                                     |  DDP Control  | RDMA Control  |
1295     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1296     |                      Un-used (Not Used)                       |
1297     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1298     |          DDP (Atomic Operation Request) Queue Number          |
1299     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1300     |    DDP (Atomic Operation Request) Message Sequence Number     |
1301     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1302     |         DDP (Atomic Operation Request) Message Offset         |
1303     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1304     |                  Un-used (Not Used)                   |AOpCode|
1305     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1306     |                      Request Identifier                       |
1307     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1308     |                          Remote STag                          |
1309     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1310     |                     Remote Tagged Offset                      |
1311     +                                                               +
1312     |                                                               |
1313     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1314     |                       Add or Swap Data                        |
1315     +                                                               +
1316     |                                                               |
1317     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1318     |                       Add or Swap Mask                        |
1319     +                                                               +
1320     |                                                               |
1321     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1322     |                         Compare Data                          |
1323     +                                                               +
1324     |                                                               |
1325     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1326     |                         Compare Mask                          |
1327     +                                                               +
1328     |                                                               |
1329     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

1331  A.2. DDP Segment for Atomic Response

1333     The following figure depicts an Atomic Operation Response DDP
1334     Segment:

1336      0                   1                   2                   3
1337      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
1338                                     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1339                                     |  DDP Control  | RDMA Control  |
1340     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1341     |                      Un-used (Not Used)                       |
1342     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1343     |         DDP (Atomic Operation Response) Queue Number          |
1344     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1345     |    DDP (Atomic Operation Response) Message Sequence Number    |
1346     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1347     |        DDP (Atomic Operation Response) Message Offset         |
1348     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1349     |                  Original Request Identifier                  |
1350     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1351     |                  Original Remote Data Value                   |
1352     +                                                               +
1353     |                                                               |
1354     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

1356  A.3. DDP Segment for Immediate Data and Immediate Data with SE

1358     The following figure depicts an Immediate Data or Immediate Data
1359     with SE DDP Segment:

1361      0                   1                   2                   3
1362      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
1363                                     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1364                                     |  DDP Control  | RDMA Control  |
1365     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1366     |                      Un-used (Not Used)                       |
1367     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1368     |                    DDP (Send) Queue Number                    |
1369     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1370     |              DDP (Send) Message Sequence Number               |
1371     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1372     |                      DDP Message Offset                       |
1373     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1374     |                        Immediate Data                         |
1375     +                                                               +
1376     |                                                               |
1377     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

1379  Authors' Addresses

1381     Hemal Shah
1382     Broadcom Corporation
1383     5300 California Avenue
1384     Irvine, CA 92617
1385     Phone: 1-949-926-6941
1386     Email: hemal@broadcom.com

1388     Felix Marti
1389     Chelsio Communications, Inc.
1390     370 San Aleso Ave.
1391     Sunnyvale, CA 94085
1392     Phone: 1-408-962-3600
1393     Email: felix@chelsio.com

1395     Asgeir Eiriksson
1396     Chelsio Communications, Inc.
1397     370 San Aleso Ave.
1398     Sunnyvale, CA 94085
1399     Phone: 1-408-962-3600
1400     Email: asgeir@chelsio.com

1402     Wael Noureddine
1403     Chelsio Communications, Inc.
1404     370 San Aleso Ave.
1405     Sunnyvale, CA 94085
1406     Phone: 1-408-962-3600
1407     Email: wael@chelsio.com

1409     Robert Sharp
1410     Intel Corporation
1411     1300 South Mopac Expy, Mailstop: AN4-4B
1412     Austin, TX 78746
1413     Phone: 1-512-362-1407
1414     Email: robert.o.sharp@intel.com