Storage Maintenance (storm) Working Group                   Hemal Shah
Internet Draft                                    Broadcom Corporation
Intended status: Standards Track                           Felix Marti
Expires: September 2014                                Wael Noureddine
                                                      Asgeir Eiriksson
                                          Chelsio Communications, Inc.
                                                          Robert Sharp
                                                     Intel Corporation
                                                         March 3, 2014

                       RDMA Protocol Extensions
                  draft-ietf-storm-rdmap-ext-09.txt

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with
   the provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   This Internet-Draft will expire on August 3, 2014.

Copyright Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document. Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Abstract

   This document specifies extensions to the IETF Remote Direct Memory
   Access Protocol (RDMAP) [RFC5040]. RDMAP provides read and write
   services directly to applications and enables data to be transferred
   directly into Upper Layer Protocol (ULP) Buffers without
   intermediate data copies. The extensions specified in this document
   provide the following capabilities and/or improvements: Atomic
   Operations and Immediate Data.

Table of Contents

   1. Introduction
      1.1. Discovery of RDMAP Extensions
   2. Requirements Language
   3. Glossary
   4. Header Format Extensions
      4.1. RDMAP Control and Invalidate STag Fields
      4.2. RDMA Message Definitions
   5. Atomic Operations
      5.1. Atomic Operation Details
         5.1.1. FetchAdd
         5.1.2. CmpSwap
      5.2. Atomic Operations
         5.2.1. Atomic Operation Request Message
         5.2.2. Atomic Operation Response Message
      5.3. Atomicity Guarantees
      5.4. Atomic Operations Ordering and Completion Rules
   6. Immediate Data
      6.1. RDMAP Interactions with ULP for Immediate Data
      6.2. Immediate Data Header Format
      6.3. Immediate Data or Immediate Data with SE Message
      6.4. Ordering and Completions
   7. Ordering and Completions Table
   8. Error Processing
      8.1. Errors Detected at the Local Peer
      8.2. Errors Detected at the Remote Peer
   9. Security Considerations
   10. IANA Considerations
      10.1. RDMAP Message Atomic Operation Subcodes
      10.2. RDMAP Queue Numbers
   11. References
      11.1. Normative References
      11.2. Informative References
   12. Acknowledgments
   Appendix A. DDP Segment Formats for RDMA Messages
      A.1. DDP Segment for Atomic Operation Request
      A.2. DDP Segment for Atomic Response
      A.3. DDP Segment for Immediate Data and Immediate Data with SE

1. Introduction

   The RDMA Protocol [RFC5040] provides capabilities for zero copy data
   communications that preserve memory protection semantics, enabling
   more efficient network protocol implementations. This document
   specifies the following extensions to the RDMA Protocol (RDMAP):

   o  Atomic operations on remote memory locations. Support for atomic
      operations enhances the usability of RDMAP in distributed shared
      memory environments.

   o  Immediate Data messages allow the ULP at the sender to provide a
      small amount of data. When an Immediate Data message is sent
      following an RDMA Write Message, the combination of the two
      messages implements the RDMA Write with Immediate message that
      is found in other RDMA transport protocols.

   Other RDMA transport protocols already define the functionality
   added by these extensions, which leads to differences in RDMA
   applications and/or Upper Layer Protocols (ULPs). Removing these
   differences in the transport protocols simplifies these
   applications and ULPs, and that is the main motivation for the
   extensions specified in this document.

   RSockets [RSOCKETS] is an example of RDMA-enabled middleware that
   provides a socket interface as the upper edge interface and utilizes
   RDMA to provide more efficient networking for sockets-based
   applications. RSockets [RSOCKETS] is aware of Immediate Data
   support in [IB]. [RSOCKETS] cannot utilize the RDMA Write with
   Immediate Data operation from [IB] on iWARP.
   The addition of the Immediate Data operation specified in this
   draft will alleviate this difference in [RSOCKETS] when running on
   [IB] and iWARP.

   DAT Atomics [DAT_ATOMICS] is an example of RDMA-enabled middleware
   that provides a portable RDMA programming interface for various RDMA
   transport protocols. [DAT_ATOMICS] includes a primitive for [IB]
   that is not supported by iWARP RNICs. The addition of Atomic
   Operations as specified in this draft will allow atomic operations
   in [DAT_ATOMICS] to work for both [IB] and iWARP interchangeably.

   For more background on RDMA Protocol applicability, see
   "Applicability of Remote Direct Memory Access Protocol (RDMA) and
   Direct Data Placement Protocol (DDP)" [RFC5045].

1.1. Discovery of RDMAP Extensions

   Today there are RDMA applications and/or ULPs that are aware of the
   existence of Atomic and Immediate Data operations for RDMA
   transports such as [IB] and application programming interfaces such
   as [OFAVERBS]. Today, these applications need to be aware that
   iWARP RNICs do not support these operations. Typically, the
   availability of these capabilities is exposed to the applications
   through adapter query interfaces in software. Applications then
   have to decide whether or not to use Immediate Data or Atomic
   Operations based on the results of the query interfaces.
   Negotiation of Atomic Operations typically determines the scope of
   atomicity guarantees, not the individual Atomic Operations
   supported. Therefore, this specification requires all Atomic
   Operations defined within to be supported if an RNIC supports any
   Atomic Operations.

   In cases where heterogeneous hardware, with differing support for
   Atomic Operations and Immediate Data Operations, is deployed for
   usage by RDMA applications and/or ULPs, applications are either
   statically configured to use or not use optional features or use
   application-specific negotiation mechanisms. For the extensions
   covered by this document, it is RECOMMENDED that RDMA applications
   and/or ULPs negotiate at the application or ULP level the usage of
   these extensions. The definition of such application-specific
   mechanisms is outside the scope of this specification. For backward
   compatibility, existing applications and/or ULPs should assume that
   iWARP RNICs do not support these extensions.

   In the absence of application-specific negotiation of the features
   defined within this specification, the new operations can be
   attempted and reported errors can be used to determine a remote
   peer's capabilities. In the case of Atomics, a FetchAdd operation
   with Add Data set to 0 can safely be used to determine the existence
   of Atomic Operations without modifying the content of a remote
   peer's memory. If the remote peer does not support an Immediate
   Data or Atomic Operation, it will report a Remote Operation Error /
   Unexpected OpCode error.

2. Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC-2119 [RFC2119].

3. Glossary

   This document is an extension of [RFC5040] and key words are defined
   in the glossary of the referenced document.

   Atomic Operation - is an operation that results in an execution of a
   memory operation at a specific ULP Buffer address on a remote node
   using the Tagged Buffer data transfer model.
   The consumer can use Atomic Operations to read, modify and write
   memory at the destination ULP Buffer address while at the same time
   guaranteeing that no other Atomic Operation read or write accesses
   to the ULP Buffer address targeted by the Atomic Operation will
   occur across any other RDMAP Streams on an RNIC at the Responder.

   Atomic Operation Request - An RDMA Message used by the Data Source
   to perform an Atomic Operation at the Responder.

   Atomic Operation Response - An RDMA Message used by the Responder to
   describe the completion of an Atomic Operation at the Responder.

   CmpSwap - is an Atomic Operation that is used to compare and swap a
   value at a specific address on a remote node.

   FetchAdd - is an Atomic Operation that is used to atomically
   increment a value at a specific ULP Buffer address on a remote node.

   Immediate Data - a small fixed size portion of data sent from the
   Data Source to a Data Sink.

   Immediate Data Message - An RDMA Message used by the Data Source to
   send Immediate Data to the Data Sink.

   Immediate Data with Solicited Event (SE) Message - An RDMA Message
   used by the Data Source to send Immediate Data with Solicited Event
   to the Data Sink.

   Requester - the sender of an RDMA Atomic Operation request.

   Responder - the receiver of an RDMA Atomic Operation request.

   ULP - Upper Layer Protocol. The protocol layer above the one
   currently being referenced. The ULP for RDMAP [RFC5040] / DDP
   [RFC5041] is expected to be an OS, Application, adaptation layer, or
   proprietary device. The RDMAP [RFC5040] / DDP [RFC5041] documents do
   not specify a ULP -- they provide a set of semantics that allow a
   ULP to be designed to utilize RDMAP [RFC5040] / DDP [RFC5041].

4. Header Format Extensions

   The control information of RDMA Messages is included in DDP protocol
   [RFC5041] defined header fields.
   [RFC5040] defines the RDMAP header formats layered on the [RFC5041]
   DDP header definition. This specification extends [RFC5040] with
   the following new formats:

   o  Four new RDMA Messages carry additional RDMAP headers. The
      Immediate Data operation and Immediate Data with Solicited Event
      operation include 8 bytes of data following the RDMAP header.
      Atomic Operations include Atomic Request or Atomic Response
      headers following the RDMAP header. The RDMAP header for Atomic
      Request Messages is 52 bytes long as specified in Figure 4. The
      RDMAP header for Atomic Response Messages is 12 bytes long (a
      32-bit identifier followed by a 64-bit value) as specified in
      Figure 6.

   o  Introduction of a new queue for untagged buffers (QN=3) used for
      Atomic Response tracking.

4.1. RDMAP Control and Invalidate STag Fields

   For reference, Figure 1 depicts the format of the DDP Control and
   RDMAP Control fields, in the style and convention of [RFC5040]:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
                                   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                                   |T|L| Resrv | DV| RV|Rsv| Opcode|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Invalidate STag                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

           Figure 1 DDP Control and RDMAP Control Fields

   The DDP Control Field consists of the T, L, Resrv and DV fields
   [RFC5041]. The RDMAP Control Field consists of the RV, Rsv and
   Opcode fields [RFC5040].

   This specification adds additional values for the RDMA Opcode field
   to those specified in [RFC5040]. Figure 2 defines the new values of
   the RDMA Opcode field that MUST be used for the RDMA Messages
   defined in this specification.

   Figure 2 also defines when the STag, Tagged Offset, and Queue Number
   fields MUST be provided for the RDMA Messages defined in this
   specification.

   All RDMA Messages defined in this specification MUST have:

      The RDMA Version (RV) field: 01b.

      Opcode field: See Figure 2.

      Invalidate STag: MUST be set to zero by the sender, ignored by
      the receiver.

   -------+-----------+-------+------+-------+---------+-------------
   RDMA   | Message   | Tagged| STag | Queue | In-     | Message
   Opcode | Type      | Flag  | and  | Number| validate| Length
          |           |       | TO   |       | STag    | Communicated
          |           |       |      |       |         | between DDP
          |           |       |      |       |         | and RDMAP
   -------+-----------+-------+------+-------+---------+-------------
   1000b  | Immediate | 0     | N/A  | 0     | N/A     | Yes
          | Data      |       |      |       |         |
   -------+-----------+-------+------+-------+---------+-------------
   1001b  | Immediate | 0     | N/A  | 0     | N/A     | Yes
          | Data with |       |      |       |         |
          | SE        |       |      |       |         |
   -------+-----------+-------+------+-------+---------+-------------
   1010b  | Atomic    | 0     | N/A  | 1     | N/A     | Yes
          | Request   |       |      |       |         |
   -------+-----------+-------+------+-------+---------+-------------
   1011b  | Atomic    | 0     | N/A  | 3     | N/A     | Yes
          | Response  |       |      |       |         |
   -------+-----------+-------+------+-------+---------+-------------

           Figure 2 Additional RDMA Usage of DDP Fields

   Note: N/A means Not Applicable.

   This extension defines RDMAP use of Queue Number 3 for Untagged
   Buffers for Atomic Responses. This queue is used for tracking
   outstanding Atomic Requests.

   All other DDP and RDMAP control fields MUST be set as described in
   [RFC5040].

4.2. RDMA Message Definitions

   The following figure defines which RDMA Headers MUST be used on each
   new RDMA Message and which new RDMA Messages are allowed to carry
   ULP payload:

   -------+-----------+-------------------+-------------------------
   RDMA   | Message   | RDMA Header Used  | ULP Message allowed in
   Message| Type      |                   | the RDMA Message
   OpCode |           |                   |
   -------+-----------+-------------------+-------------------------
   1000b  | Immediate | Immediate Data    | No
          | Data      | Header            |
   -------+-----------+-------------------+-------------------------
   1001b  | Immediate | Immediate Data    | No
          | Data with | Header            |
          | SE        |                   |
   -------+-----------+-------------------+-------------------------
   1010b  | Atomic    | Atomic Request    | No
          | Request   | Header            |
   -------+-----------+-------------------+-------------------------
   1011b  | Atomic    | Atomic Response   | No
          | Response  | Header            |
   -------+-----------+-------------------+-------------------------

                Figure 3 RDMA Message Definitions

5. Atomic Operations

   The RDMA Protocol Specification in [RFC5040] does not include
   support for Atomic Operations which are an important building block
   for implementing distributed shared memory.

   This document extends the RDMA Protocol specification with a set of
   basic Atomic Operations, and specifies their resource and ordering
   rules. The Atomic Operations specified in this document provide
   equivalent functionality to the [IB] RDMA transport as well as
   extended Atomic Operations defined in [OFAVERBS], to allow
   applications that use these primitives to work interchangeably over
   iWARP. Other operations are left for future consideration.

   Atomic operations as specified in this document execute a 64-bit
   memory operation at a specified destination ULP Buffer address on a
   Responder node using the Tagged Buffer data transfer model.
   The operations atomically read, modify and write back the contents
   of the destination ULP Buffer address and guarantee that Atomic
   Operations on this ULP Buffer address by other RDMAP Streams on the
   same RNIC do not occur between the read and the write caused by the
   Atomic Operation. Therefore, the Responder RNIC MUST implement
   mechanisms to prevent other Atomic Operations on memory registered
   for Atomic Operations while an Atomic Operation targeting that
   memory is in progress. The Requester of an atomic operation cannot
   rely on atomic operation behavior at the Responder across multiple
   RNICs or with respect to other applications/ULPs running at the
   Responder that can access the ULP Buffer. Some RNIC implementations
   may provide such atomic behavior, but it is OPTIONAL for the atomic
   operations specified in this document. An RNIC that supports Atomic
   Operations as specified in this document MUST implement all Atomic
   Operation Codes defined in Figure 5. The advertisement of Tagged
   Buffer information for Atomic Operations is outside the scope of
   this specification and must be handled by the ULPs.

   Implementation note: It is recommended that applications do not
   use the ULP Buffer addresses used for Atomic Operations for other
   RDMA operations.

   Implementation note: Errors related to alignment in the following
   sections cover Atomic Operations targeted at a ULP Buffer address
   that is not aligned to a 64-bit boundary.

   Atomic Operation Request Messages use the same remote addressing
   mechanism as RDMA Reads and Writes. The ULP Buffer address specified
   in the request is in the address space of the Remote Peer to which
   the Atomic Operation is targeted.

   Atomic Operation Response Messages MUST use the Untagged Buffer
   model with QN=3. Queue number 3 MUST be used to track outstanding
   Atomic Operation Request messages at the Requester.
   When the Atomic Operation Response message is received, the MSN
   MUST be used to locate the corresponding Atomic Operation request
   in order to complete the Atomic Operation request.

5.1. Atomic Operation Details

   The following sub-sections describe the Atomic Operations in more
   detail.

5.1.1. FetchAdd

   The FetchAdd Atomic Operation requests the Responder to read a
   64-bit Original Remote Data Value at a 64-bit aligned ULP Buffer
   address in the Responder's memory, to perform a FetchAdd operation
   on multiple fields of selectable length specified by the 64-bit
   "Add Mask", and to write the result back to the same ULP Buffer
   address. The atomic addition is performed independently on each of
   these fields. A bit set in the Add Mask field specifies a field
   boundary: for each field, a bit is set at the most significant bit
   position of that field, causing any carry out of that bit position
   to be discarded when the addition is performed.

   FetchAdd Atomic Operations MUST target ULP Buffer addresses that are
   64-bit aligned. FetchAdd Atomic Operations that target ULP Buffer
   addresses that are not 64-bit aligned MUST be surfaced as errors,
   and the Responder's memory MUST NOT be modified in such cases.
   Additionally, a terminate message MUST be generated. Setting the
   "Add Mask" field to 0x0000000000000000 results in an Atomic Add of
   the 64-bit Original Remote Data Value and the 64-bit "Add Data".

   The pseudo code below describes the masked FetchAdd Atomic
   Operation.

      bit_location = 1
      carry = 0
      Remote Data Value = 0

      for bit = 0 to 63
      {
         if (bit != 0) bit_location = bit_location << 1

         val1 = (Original Remote Data Value & bit_location) >> bit
         val2 = (Add Data & bit_location) >> bit
         sum = carry + val1 + val2
         carry = (sum & 2) >> 1
         sum = sum & 1

         if (sum)
            Remote Data Value |= bit_location

         carry = ((carry) && (!(Add Mask & bit_location)))
      }

   The FetchAdd operation is performed in the endian format of the
   target memory. The "Original Remote Data Value" is converted from
   the endian format of the target memory for return and returned to
   the Requester. The fields are in big-endian format on the wire.

   The Requester specifies:

   o  Remote STag

   o  Remote Tagged Offset

   o  Add Data

   o  Add Mask

   The Responder returns:

   o  Original Remote Data

5.1.2. CmpSwap

   The CmpSwap Atomic Operation requires the Responder to read a 64-bit
   value at a 64-bit aligned ULP Buffer address in the Responder's
   memory, to perform a logical AND operation on it using the 64-bit
   "Compare Mask" field in the Atomic Operation Request header, then to
   compare it with the result of a logical AND operation of the
   "Compare Mask" and the "Compare Data" fields in the header, and, if
   the two values are equal, to swap masked bits in the same ULP Buffer
   address with the masked Swap Data. If the two masked compare values
   are not equal, the contents of the Responder's memory are not
   changed. In either case, the original value read from the ULP Buffer
   address is converted from the endian format of the target memory for
   return and returned to the Requester. The fields are in big-endian
   format on the wire.
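
   As a non-normative illustration, the masked FetchAdd pseudo code of
   Section 5.1.1 and the masked compare-and-swap described above can
   be sketched in Python. The function names and the MASK64 constant
   are assumptions made for this sketch and are not defined by this
   specification; the sketch models only the arithmetic on 64-bit
   values, not atomicity, alignment checks, or endian conversion.

```python
MASK64 = (1 << 64) - 1  # hypothetical helper: keeps values within 64 bits


def masked_fetch_add(original, add_data, add_mask):
    """Return the new Remote Data Value for a masked FetchAdd.

    A bit set in add_mask marks the most significant bit of a field;
    any carry out of that bit position is discarded.
    """
    result = 0
    carry = 0
    for bit in range(64):
        bit_location = 1 << bit
        val1 = (original >> bit) & 1
        val2 = (add_data >> bit) & 1
        total = carry + val1 + val2
        if total & 1:
            result |= bit_location
        carry = total >> 1
        # Drop the carry at a field boundary marked in the Add Mask.
        if add_mask & bit_location:
            carry = 0
    return result


def masked_cmp_swap(original, swap_data, swap_mask,
                    compare_data, compare_mask):
    """Return the new Remote Data Value for a masked CmpSwap.

    Memory changes only if the masked original equals the masked
    Compare Data; then only the bits selected by swap_mask are replaced.
    """
    if (original & compare_mask) == (compare_data & compare_mask):
        return (original & ~swap_mask & MASK64) | (swap_data & swap_mask)
    return original
```

   In both cases the Responder would return the value passed here as
   "original" to the Requester. With an Add Mask of zero the first
   function reduces to a plain 64-bit addition that wraps on overflow.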

   The Requester specifies:

   o  Remote STag

   o  Remote Tagged Offset

   o  Swap Data

   o  Swap Mask

   o  Compare Data

   o  Compare Mask

   The Responder returns:

   o  Original Remote Data Value

   The following pseudo code describes the masked CmpSwap operation
   result.

      if (!((Compare Data ^ Original Remote Data Value) &
            Compare Mask))
      then
         Remote Data Value =
            (Original Remote Data Value & ~(Swap Mask))
            | (Swap Data & Swap Mask)
      else
         Remote Data Value = Original Remote Data Value

   After the operation, the remote data buffer MUST contain the
   "Original Remote Data Value" (if the comparison did not match) or
   the masked "Swap Data" (if the comparison did match). CmpSwap Atomic
   Operations MUST target buffer addresses that are 64-bit aligned. If
   a CmpSwap Atomic Operation is attempted on a target ULP Buffer
   address that is not 64-bit aligned:

   o  The operation MUST NOT be performed,

   o  The Responder's memory MUST NOT be modified,

   o  The result MUST be surfaced as an error, and

   o  A terminate message MUST be generated (see Section 8.2 for the
      terminate message contents).

5.2. Atomic Operations

   The Atomic Operation Request and Response are RDMA Messages. An
   Atomic Operation makes use of the DDP Untagged Buffer Model. Atomic
   Operation Request messages MUST use the same Queue Number as RDMA
   Read Requests (QN=1). Reusing the same Queue Number for Atomic
   Request messages allows the Atomic Operations to reuse the same
   infrastructure (e.g. ORD/IRD flow control) as defined for RDMA Read
   Requests. Atomic Operation Response messages MUST set Queue Number
   (QN) to 3 in the DDP header.

   The RDMA Message OpCode for an Atomic Request Message is 1010b. The
   RDMA Message OpCode for an Atomic Response Message is 1011b.

5.2.1. Atomic Operation Request Message

   The Atomic Operation Request Message carries an Atomic Operation
   Header that describes the ULP Buffer address in the Responder's
   memory. The Atomic Operation Request header immediately follows the
   DDP header. The RDMAP layer passes to the DDP layer an RDMAP Control
   Field. The following figure depicts the Atomic Operation Request
   Header that MUST be used for all Atomic Operation Request Messages:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  Reserved (Not Used)                  |AOpCode|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Request Identifier                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          Remote STag                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     Remote Tagged Offset                      |
   +                                                               +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       Add or Swap Data                        |
   +                                                               +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       Add or Swap Mask                        |
   +                                                               +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         Compare Data                          |
   +                                                               +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         Compare Mask                          |
   +                                                               +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

             Figure 4 Atomic Operation Request Header

   Reserved (Not Used): 28 bits.

      This field MUST be set to zero on transmit, ignored on
      receive.

   Atomic Operation Code (AOpCode): 4 bits.

      See Figure 5. All Atomic Operation Codes from Figure 5 MUST
      be implemented by an RNIC that supports Atomic Operations.

   Request Identifier: 32 bits.

      The Request Identifier specifies a number that is used to
      identify the Atomic Operation Request Message.
      The value used in this field is selected by the RNIC that sends
      the message, and is reflected back to the Local Peer in the
      Atomic Operation Response message.

   Remote STag: 32 bits.

      The Remote STag identifies the Remote Peer's Tagged Buffer
      targeted by the Atomic Operation. The Remote STag is
      associated with the RDMAP Stream through a mechanism that is
      outside the scope of the RDMAP specification.

   Remote Tagged Offset: 64 bits.

      The Remote Tagged Offset specifies the starting offset, in
      octets, from the base of the Remote Peer's Tagged Buffer
      targeted by the Atomic Operation. The Remote Tagged Offset MAY
      start at an arbitrary offset.

   Add or Swap Data: 64 bits.

      The Add or Swap Data field specifies the 64-bit "Add Data"
      value in an Atomic FetchAdd Operation or the 64-bit "Swap
      Data" value in an Atomic Swap or CmpSwap Operation.

   Add or Swap Mask: 64 bits.

      This field is used in masked Atomic Operations (FetchAdd and
      CmpSwap) to perform a bitwise logical AND operation as
      specified in the definition of these operations. For
      non-masked Atomic Operations (Swap), this field MUST be set to
      ffffffffffffffffh on transmit and ignored by the receiver.

   Compare Data: 64 bits.

      The Compare Data field specifies the 64-bit "Compare Data"
      value in an Atomic CmpSwap Operation. For Atomic FetchAdd and
      Atomic Swap operations, the Compare Data field MUST be set to
      zero on transmit and ignored by the receiver.

   Compare Mask: 64 bits.

      This field is used in the masked Atomic Operation CmpSwap to
      perform a bitwise logical AND operation as specified in the
      definition of this operation. For Atomic Operations FetchAdd
      and Swap, this field MUST be set to ffffffffffffffffh on
      transmit and ignored by the receiver.
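
   As a non-normative illustration, the field layout of Figure 4 can
   be sketched with Python's struct module. The function and constant
   names below are assumptions made for this sketch; the AOpCode
   values are those listed in Figure 5, and the defaults for Compare
   Data and Compare Mask follow the FetchAdd rules given in the field
   descriptions above. The sketch builds only the 52-byte Atomic
   Operation Request header, not the DDP header that precedes it.

```python
import struct

# AOpCode values as listed in Figure 5.
AOPCODE_FETCH_ADD = 0b0000
AOPCODE_CMP_SWAP = 0b0010

ALL_ONES = 0xFFFFFFFFFFFFFFFF

# Three 32-bit words (Reserved/AOpCode, Request Identifier, Remote STag)
# followed by five 64-bit fields, big-endian on the wire: 52 bytes.
ATOMIC_REQUEST = struct.Struct(">IIIQQQQQ")


def pack_atomic_request(aopcode, request_id, stag, tagged_offset,
                        data, mask, compare_data=0, compare_mask=ALL_ONES):
    """Pack an Atomic Operation Request header (hypothetical helper)."""
    if not 0 <= aopcode <= 0xF:
        raise ValueError("AOpCode is a 4-bit field")
    # The 28 Reserved bits are zero, so the first word equals the AOpCode.
    return ATOMIC_REQUEST.pack(aopcode, request_id, stag, tagged_offset,
                               data, mask, compare_data, compare_mask)


def pack_fetch_add(request_id, stag, tagged_offset, add_data, add_mask=0):
    """FetchAdd: Compare Data is zero and Compare Mask is all ones."""
    return pack_atomic_request(AOPCODE_FETCH_ADD, request_id, stag,
                               tagged_offset, add_data, add_mask)
```

   For example, pack_fetch_add(7, 0x1234, 64, 0) would build the
   header for a FetchAdd with Add Data set to 0, the probe suggested
   in Section 1.1 for detecting Atomic Operation support.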

   ---------+-----------+----------+----------+---------+---------
   Atomic   | Atomic    | Add or   | Add or   | Compare | Compare
   Operation| Operation | Swap     | Swap     | Data    | Mask
   Code     |           | Data     | Mask     |         |
   ---------+-----------+----------+----------+---------+---------
   0000b    | FetchAdd  | Add Data | Add Mask | N/A     | N/A
   ---------+-----------+----------+----------+---------+---------
   0010b    | CmpSwap   | Swap Data| Swap Mask| Valid   | Valid
   ---------+-----------+----------+----------+---------+---------

          Figure 5 Atomic Operation Message Definitions

   The Atomic Operation Request Message has the following semantics:

   1. An Atomic Operation Request Message MUST reference an Untagged
      Buffer. That is, the Local Peer's RDMAP layer MUST request that
      the DDP mark the Message as Untagged.

   2. One Atomic Operation Request Message MUST consume one Untagged
      Buffer.

   3. The Responder's RDMAP layer MUST process an Atomic Operation
      Request Message. A valid Atomic Operation Request Message MUST
      NOT be delivered to the Responder's ULP (i.e., it is processed by
      the RDMAP layer).

   4. At the Responder, an error MUST be surfaced in response to
      delivery to the Remote Peer's RDMAP layer of an Atomic Operation
      Request Message with an Atomic Operation Code that the RNIC does
      not support.

   5. An Atomic Operation Request Message MUST reference the RDMA Read
      Request Queue. That is, the Requester's RDMAP layer MUST request
      that the DDP layer set the Queue Number field to one.

   6. The Requester MUST pass to the DDP layer Atomic Operation Request
      Messages in the order they were submitted by the ULP.

   7. The Responder MUST process the Atomic Operation Request Messages
      in the order they were sent.

   8. If the Responder receives a valid Atomic Operation Request
      Message, it MUST respond with a valid Atomic Operation Response
      Message.

5.2.2. Atomic Operation Response Message

   The Atomic Operation Response Message carries an Atomic Operation
   Response Header that contains the "Original Request Identifier" and
   "Original Remote Data Value". The Atomic Operation Response Header
   immediately follows the DDP header. The RDMAP layer passes to the
   DDP layer an RDMAP Control Field. The following figure depicts the
   Atomic Operation Response header that MUST be used for all Atomic
   Operation Response Messages:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  Original Request Identifier                  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  Original Remote Data Value                   |
   +                                                               +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

             Figure 6 Atomic Operation Response Header

   Original Request Identifier: 32 bits.

      The Original Request Identifier MUST be set to the value
      specified in the Request Identifier field that was originally
      provided in the corresponding Atomic Operation Request
      Message.

   Original Remote Data Value: 64 bits.

      The Original Remote Data Value specifies the original 64-bit
      value stored at the ULP Buffer address targeted by the Atomic
      Operation.

   The Atomic Operation Response Message has the following semantics:

   1. The Atomic Operation Response Message for the associated Atomic
      Operation Request Message travels in the opposite direction.

   2. An Atomic Operation Response Message MUST consume an Untagged
      Buffer. That is, the Responder RDMAP layer MUST request that the
      DDP mark the Message as Untagged.

   3. An Atomic Operation Response Message MUST reference the Queue
      Number 3. That is, the Responder's RDMAP layer MUST request that
      the DDP layer set the Queue Number field to 3.

   4. The Responder MUST ensure that a sufficient number of Untagged
      Buffers are available on the RDMA Read Request Queue (Queue with
      DDP Queue Number 1) to support the maximum number of Atomic
      Operation Requests negotiated by the ULP in addition to the
      maximum number of RDMA Read Requests negotiated by the ULP.

   5. The Requester MUST ensure that a sufficient number of Untagged
      Buffers are available on the RDMA Atomic Response Queue (Queue
      with DDP Queue Number 3) to support the maximum number of Atomic
      Operation Requests negotiated by the ULP.

   6. The RDMAP layer MUST Deliver the Atomic Operation Response
      Message to the ULP.

   7. At the Requester, when an invalid Atomic Operation Response
      Message is delivered to the Remote Peer's RDMAP layer, an error
      is surfaced.

   8. When the Responder receives Atomic Operation Request messages,
      the Responder RDMAP layer MUST pass Atomic Operation Response
      Messages to the DDP layer, in the order that the Atomic Operation
      Request Messages were received by the RDMAP layer, at the
      Responder.

5.3. Atomicity Guarantees

   Atomicity of the Read-Modify-Write (RMW) on the Responder's node by
   the Atomic Operation MUST be assured in the context of concurrent
   atomic accesses by other RDMAP Streams on the same RNIC.

5.4. Atomic Operations Ordering and Completion Rules

   In addition to the ordering and completion rules described in
   [RFC5040], the following rules apply to implementations of the
   Atomic operations.

   1. For an Atomic operation, the Requester MUST NOT consider the
      contents of the Tagged Buffer at the Responder to be modified by
      that specific Atomic Operation until the Atomic Operation
      Response Message has been Delivered to RDMAP at the Requester.

   2. Atomicity guarantees MUST be within the scope of a single RNIC.

      Implementation Note: Implementations may not guarantee Atomicity
      if the Tagged Buffer is accessed by any method other than an
      Atomic Operation within the scope of a single RNIC.

   3. Atomic Operation Request Messages MUST NOT start processing at
      the Responder until they have been Delivered to RDMAP by DDP.

   4. Atomic Operation Response Messages MAY be generated at the
      Responder after subsequent RDMA Write Messages or Send Messages
      have been Placed or Delivered.

   5. Atomic Operation Response Message processing at the Responder
      MUST be started only after the Atomic Operation Request Message
      has been Delivered by the DDP layer (thus, all previous RDMA
      Messages on that DDP Stream have been Delivered).

   6. Send Messages MAY be Completed at the Responder before prior
      incoming Atomic Operation Request Messages have completed their
      response processing.

   7. An Atomic Operation MUST NOT be Completed at the Requester until
      the DDP layer Delivers the associated incoming Atomic Operation
      Response Message.

   8. If more than one outstanding Atomic Request Message is supported
      by both peers, the Atomic Operation Request Messages MUST be
      processed in the order they were Delivered by the DDP layer on
      the Responder. Atomic Operation Response Messages MUST be
      submitted to the DDP layer on the Responder in the order the
      Atomic Operation Request Messages were Delivered by DDP.

6. Immediate Data

   The Immediate Data operation is typically used in conjunction with
   an RDMA Write Operation to improve ULP processing efficiency. The
   efficiency is gained by causing an RDMA Completion to be generated
   immediately following the RDMA Write operation. This RDMA Completion
   delivers 8 bytes of immediate data at the Remote Peer.
   The combination of an RDMA Write Message followed by an Immediate
   Data Operation has the same behavior as the RDMA Write with
   Immediate Data operation found in [IB]. An Immediate Data operation
   that is not preceded by an RDMA Write operation causes an RDMA
   Completion.

6.1. RDMAP Interactions with ULP for Immediate Data

   For Immediate Data operations, the following are the interactions
   between the RDMAP Layer and the ULP:

   o At the Data Source:

      o The ULP passes to the RDMAP Layer the following:

         o Eight bytes of ULP Immediate Data

      o When the Immediate Data operation Completes, an indication
        of the Completion results.

   o At the Data Sink:

      o If the Immediate Data operation is Completed successfully,
        the RDMAP Layer passes the following information to the ULP
        Layer:

         o Eight bytes of Immediate Data

         o An Event, if the Data Sink is configured to generate an
           Event.

      o If the Immediate Data operation is Completed in error, the
        Data Sink RDMAP Layer will pass up the corresponding error
        information to the Data Sink ULP and send a Terminate
        Message to the Data Source RDMAP Layer. The Data Source
        RDMAP Layer will then pass up the Terminate Message to the
        ULP.

6.2. Immediate Data Header Format

   The Immediate Data and Immediate Data with SE Messages carry
   immediate data as shown in Figure 7. The RDMAP layer passes to the
   DDP layer an RDMAP Control Field and 8 bytes of Immediate Data. The
   first 8 bytes of the data following the DDP header contain the
   Immediate Data. See Section A.3 for the DDP segment format of an
   Immediate Data or Immediate Data with SE Message.
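
   The 8-byte payload described above can be illustrated with a short
   Python sketch. It is illustrative only and not part of this
   specification: the helper names build_immediate_payload and
   parse_immediate_payload are hypothetical, the Immediate Data is
   treated as 8 opaque bytes supplied by the ULP, and the DDP header
   itself is not modeled. The receive-side length check reflects the
   requirement that the message carry exactly 8 bytes of data.

```python
IMMEDIATE_DATA_LEN = 8  # Immediate Data is always 64 bits (8 bytes)


def build_immediate_payload(immediate_data: bytes) -> bytes:
    """Return the bytes that follow the DDP header (hypothetical helper)."""
    if len(immediate_data) != IMMEDIATE_DATA_LEN:
        raise ValueError("Immediate Data must be exactly 8 bytes")
    return bytes(immediate_data)


def parse_immediate_payload(ddp_payload: bytes) -> bytes:
    """Check the received length and return the 8 bytes of Immediate Data."""
    if len(ddp_payload) != IMMEDIATE_DATA_LEN:
        # Assumption: a real RDMAP layer would surface an error here
        # instead of raising a Python exception.
        raise ValueError("expected exactly 8 bytes of Immediate Data")
    return bytes(ddp_payload)


payload = build_immediate_payload(b"\x00\x01\x02\x03\x04\x05\x06\x07")
print(parse_immediate_payload(payload).hex())  # 0001020304050607
```

   The sketch keeps the data opaque on purpose; how a ULP interprets
   the 8 bytes is outside its scope.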

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Immediate Data                         |
   +                                                               +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Figure 7 Immediate Data or Immediate Data with SE Message Header

   Immediate Data: 64 bits.

      Eight bytes of data transferred from the Data Source to an
      Untagged Buffer at the Data Sink.

6.3. Immediate Data or Immediate Data with SE Message

   The Immediate Data or Immediate Data with SE Message uses the DDP
   Untagged Buffer Model to transfer Immediate Data from the Data
   Source to the Data Sink.

   o An Immediate Data or Immediate Data with SE Message MUST
     reference an Untagged Buffer. That is, the Local Peer's RDMAP
     Layer MUST request that the DDP layer mark the Message as
     Untagged.

   o One Immediate Data or Immediate Data with SE Message MUST consume
     one Untagged Buffer.

   o At the Remote Peer, the Immediate Data or Immediate Data with SE
     Message MUST be Delivered to the Remote Peer's ULP in the order
     they were sent.

   o For an Immediate Data or Immediate Data with SE Message, the
     Local Peer's RDMAP Layer MUST request that the DDP layer set the
     Queue Number field to zero.

   o For an Immediate Data or Immediate Data with SE Message, the
     Local Peer's RDMAP Layer MUST request that the DDP layer transmit
     8 bytes of data.

   o The Local Peer MUST issue Immediate Data and Immediate Data with
     SE Messages in the order they were submitted by the ULP.

   o The Remote Peer MUST check that Immediate Data and Immediate Data
     with SE Messages include exactly 8 bytes of data from the DDP
     layer. The DDP header carries the length field that is reported
     by the DDP layer.

6.4. Ordering and Completions

   Ordering and completion rules for Immediate Data are the same as
   those for a Send operation as described in Section 5.5 of
   [RFC5040].

7. Ordering and Completions Table

   The following table summarizes the ordering relationships for Atomic
   and Immediate Data operations from the standpoint of the Local Peer
   issuing the Operations. Note that in the table that follows, Send
   includes Send, Send with Invalidate, Send with Solicited Event, and
   Send with Solicited Event and Invalidate. Also note that in the
   table below, Immediate Data includes Immediate Data and Immediate
   Data with Solicited Event.

   ---------+----------+-------------+-------------+------------------
   First    | Second   | Placement   | Placement   | Ordering
   Operation| Operation| Guarantee at| Guarantee at| Guarantee at
            |          | Remote Peer | Local Peer  | Remote Peer
   ---------+----------+-------------+-------------+------------------
   Immediate| Send     | No Placement| Not         | Completed in
   Data     |          | Guarantee   | Applicable  | Order
            |          | between Send|             |
            |          | Payload and |             |
            |          | Immediate   |             |
            |          | Data        |             |
   ---------+----------+-------------+-------------+------------------
   Immediate| RDMA     | No Placement| Not         | Not
   Data     | Write    | Guarantee   | Applicable  | Applicable
            |          | between RDMA|             |
            |          | Write       |             |
            |          | Payload and |             |
            |          | Immediate   |             |
            |          | Data        |             |
   ---------+----------+-------------+-------------+------------------
   Immediate| RDMA     | No Placement| RDMA Read   | RDMA Read
   Data     | Read     | Guarantee   | Response    | Response
            |          | between     | will not be | Message will
            |          | Immediate   | Placed until| not be
            |          | Data and    | Immediate   | generated
            |          | RDMA Read   | Data is     | until
            |          | Request     | Placed at   | Immediate Data
            |          |             | Remote Peer | has been
            |          |             |             | Completed
   ---------+----------+-------------+-------------+------------------
   Immediate| Atomic   | No Placement| Atomic      | Atomic
   Data     |          | Guarantee   | Response    | Response
            |          | between     | will not be | Message will
            |          | Immediate   | Placed until| not be
            |          | Data and    | Immediate   | generated
            |          | Atomic      | Data is     | until
            |          | Request     | Placed at   | Immediate Data
            |          |             | Remote Peer | has been
            |          |             |             | Completed
   ---------+----------+-------------+-------------+------------------
   Immediate| Immediate| No Placement| Not         | Completed in
   Data or  | Data     | Guarantee   | Applicable  | Order
   Send     |          |             |             |
   ---------+----------+-------------+-------------+------------------
   RDMA     | Immediate| No Placement| Not         | Immediate Data
   Write    | Data     | Guarantee   | Applicable  | is Completed
            |          |             |             | after RDMA
            |          |             |             | Write is Placed
            |          |             |             | and Delivered
   ---------+----------+-------------+-------------+------------------
   RDMA Read| Immediate| No Placement| Immediate   | Not Applicable
            | Data     | Guarantee   | Data MAY be |
            |          | between     | Placed      |
            |          | Immediate   | before      |
            |          | Data and    | RDMA Read   |
            |          | RDMA Read   | Response is |
            |          | Request     | generated   |
   ---------+----------+-------------+-------------+------------------
   Atomic   | Immediate| No Placement| Immediate   | Not Applicable
            | Data     | Guarantee   | Data MAY be |
            |          | between     | Placed      |
            |          | Immediate   | before      |
            |          | Data and    | Atomic      |
            |          | Atomic      | Response is |
            |          | Request     | generated   |
   ---------+----------+-------------+-------------+------------------
   Atomic   | Send     | No Placement| Send Payload| Not Applicable
            |          | Guarantee   | MAY be      |
            |          | between Send| Placed      |
            |          | Payload and | before      |
            |          | Atomic      | Atomic      |
            |          | Request     | Response is |
            |          |             | generated   |
   ---------+----------+-------------+-------------+------------------
   Atomic   | RDMA     | No Placement| RDMA Write  | Not
            | Write    | Guarantee   | Payload MAY | Applicable
            |          | between RDMA| be Placed   |
            |          | Write       | before      |
            |          | Payload and | Atomic      |
            |          | Atomic      | Response is |
            |          | Request     | generated   |
   ---------+----------+-------------+-------------+------------------
   Atomic   | RDMA     | No Placement| No Placement| RDMA Read
            | Read     | Guarantee   | Guarantee   | Response
            |          | between     | between     | Message will
            |          | Atomic      | Atomic      | not be
            |          | Request and | Response    | generated
            |          | RDMA Read   | and RDMA    | until Atomic
            |          | Request     | Read        | Response Message
            |          |             | Response    | has been
            |          |             |             | generated
   ---------+----------+-------------+-------------+------------------
   Atomic   | Atomic   | Placed in   | No Placement| Second Atomic
            |          | order       | Guarantee   | Request
            |          |             | between two | Message will
            |          |             | Atomic      | not be
            |          |             | Responses   | processed
            |          |             |             | until first
            |          |             |             | Atomic Response
            |          |             |             | has been
            |          |             |             | generated
   ---------+----------+-------------+-------------+------------------
   Send     | Atomic   | No Placement| Atomic      | Atomic Response
            |          | Guarantee   | Response    | Message will not
            |          | between Send| will not be | be generated
            |          | Payload and | Placed at   | until Send has
            |          | Atomic      | the Local   | been Completed
            |          | Request     | Peer until  |
            |          |             | Send Payload|
            |          |             | is Placed   |
            |          |             | at the      |
            |          |             | Remote Peer |
   ---------+----------+-------------+-------------+------------------
   RDMA     | Atomic   | No Placement| Atomic      | Not
   Write    |          | Guarantee   | Response    | Applicable
            |          | between RDMA| will not be |
            |          | Write       | Placed at   |
            |          | Payload and | the Local   |
            |          | Atomic      | Peer until  |
            |          | Request     | RDMA Write  |
            |          |             | Payload     |
            |          |             | is Placed   |
            |          |             | at the      |
            |          |             | Remote Peer |
   ---------+----------+-------------+-------------+------------------
   RDMA     | Atomic   | No Placement| No Placement| Atomic Response
   Read     |          | Guarantee   | Guarantee   | Message will
            |          | between     | between     | not be generated
            |          | Atomic      | Atomic      | until RDMA
            |          | Request and | Response    | Read Response
            |          | RDMA Read   | and RDMA    | has been
            |          | Request     | Read        | generated
            |          |             | Response    |
   ---------+----------+-------------+-------------+------------------

8. Error Processing

   In addition to error processing described in Section 7 of [RFC5040],
   the following rules apply for the new RDMA Messages defined in this
   specification.

8.1. Errors Detected at the Local Peer

   The Local Peer MUST send a Terminate Message for each of the
   following cases:

   1. For errors detected while creating an Atomic Request, Atomic
      Response, Immediate Data, or Immediate Data with SE Message, or
      other reasons not directly associated with an incoming Message,
      the Terminate Message and Error Code are sent instead of the
      Message. In this case, the Error Type and Error Code fields are
      included in the Terminate Message, but the Terminated DDP Header
      and Terminated RDMA Header fields are set to zero.

   2. For errors detected on an incoming Atomic Request, Atomic
      Response, Immediate Data, or Immediate Data with Solicited Event
      (after the Message has been Delivered by DDP), the Terminate
      Message is sent at the earliest possible opportunity, preferably
      in the next outgoing RDMA Message. In this case, the Error Type,
      Error Code, and Terminated DDP Header fields are included in the
      Terminate Message, but the Terminated RDMA Header field is set to
      zero.

8.2. Errors Detected at the Remote Peer

   On incoming Atomic Requests, Atomic Responses, Immediate Data, and
   Immediate Data with Solicited Event, the following MUST be
   validated:

   o The DDP layer MUST validate all DDP Segment fields.

   o The RDMA OpCode MUST be valid.

   o The RDMA Version MUST be valid.

   On incoming Atomic Requests, the following additional validation
   MUST be performed:

   o The RDMAP layer MUST validate that the Remote Peer's Tagged ULP
     Buffer address references a 64-bit aligned ULP Buffer address. In
     the case of an error, the RDMAP layer MUST generate a Terminate
     Message indicating RDMA Layer Remote Operation Error with Error
     Code Name "Catastrophic Error, Localized to RDMAP Stream" as
     described in Section 4.8 of [RFC5040]. Implementation Note: A ULP
     implementation can avoid this error by having the target ULP
     Buffer of an Atomic Operation 64-bit aligned.

9. Security Considerations

   This document specifies extensions to the RDMA Protocol
   specification in [RFC5040], and as such the Security Considerations
   discussed in Section 8 of [RFC5040] apply. In particular, Atomic
   Operations use ULP Buffer addresses for the Remote Peer buffer
   addressing used in [RFC5040], which is used to satisfy the [RFC5042]
   security model. No additional Security Considerations are required
   for the extensions specified in this document.

10. IANA Considerations

   IANA is requested to add the following entries to the "RDMAP Message
   Operation Codes" registry of "RDDP Registries":

      0x8, Immediate Data, [RFCXXXX]

      0x9, Immediate Data with Solicited Event, [RFCXXXX]

      0xA, Atomic Request, [RFCXXXX]

      0xB, Atomic Response, [RFCXXXX]

   In addition, the following registry is requested to be added to
   "RDDP Registries". The following section specifies the registry, its
   initial contents and the administration policy in more detail.

   RFC Editor: Please replace XXXX in all instances of [RFCXXXX] above
   with the RFC number of this document and remove this note.

10.1. RDMAP Message Atomic Operation Subcodes

   Name of the registry: "RDMAP Message Atomic Operation Subcodes"

   Namespace details: RDMAP Message Atomic Operation Subcodes are 4-bit
   values [RFCXXXX].

   Information that must be provided to assign a new value: An IESG-
   approved standards-track specification defining the semantics and
   interoperability requirements of the proposed new value and the
   fields to be recorded in the registry.

   Fields to record in the registry: RDMAP Message Atomic Operation
   Subcode, Atomic Operation, RFC Reference.

   Initial registry contents:

      0x0, FetchAdd, [RFCXXXX]

      0x1, Reserved

      0x2, CmpSwap, [RFCXXXX]

   Note: An experimental RDMAP Message Operation Code has already been
   allocated; hence there is no need for an experimental RDMAP Message
   Atomic Operation Subcode.

   All other values are Unassigned and available to IANA for
   assignment. New RDMAP Message Atomic Operation Subcodes should be
   assigned sequentially in order to better support implementations
   that process RDMAP Message Atomic Operations in hardware.

   Allocation Policy: Standards Action ([RFC5226])

   RFC Editor: Please replace XXXX in all instances of [RFCXXXX] above
   with the RFC number of this document and remove this note.

10.2. RDMAP Queue Numbers

   Name of the registry: "RDMAP DDP Untagged Queue Numbers"

   Namespace details: RDMAP DDP Untagged Queue numbers are 32-bit
   values [RFCXXXX].

   Information that must be provided to assign a new value: An IESG-
   approved standards-track specification defining the semantics and
   interoperability requirements of the proposed new value and the
   fields to be recorded in the registry.

   Fields to record in the registry: RDMAP DDP Untagged Queue Numbers,
   Queue Usage Description, RFC Reference.

   Initial registry contents:

      0x00000000, Queue 0 (Send operation Variants), [RFC5040]

      0x00000001, Queue 1 (RDMA Read Request operations), [RFC5040]

      0x00000002, Queue 2 (Terminate operations), [RFC5040]

      0x00000003, Queue 3 (Atomic Response operations), [RFCXXXX]

   Note: An experimental RDMAP Message Operation Code has already been
   allocated; hence there is no need for an experimental RDMAP DDP
   Untagged Queue Number.

   All other values are Unassigned and available to IANA for
   assignment. New RDMAP queue numbers should be assigned sequentially
   in order to better support implementations that perform RDMAP queue
   selection in hardware.

   Allocation Policy: Standards Action ([RFC5226])

   RFC Editor: Please replace XXXX in all instances of [RFCXXXX] above
   with the RFC number of this document and remove this note.

11. References

11.1. Normative References

   [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
             Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC5040] Recio, R. et al., "A Remote Direct Memory Access Protocol
             Specification", RFC 5040, October 2007.

   [RFC5041] Shah, H. et al., "Direct Data Placement over Reliable
             Transports", RFC 5041, October 2007.

   [RFC5042] Pinkerton, J. and E. Deleganes, "Direct Data Placement
             Protocol (DDP) / Remote Direct Memory Access Protocol
             (RDMAP) Security", RFC 5042, October 2007.

   [RFC5226] Narten, T. and H. Alvestrand, "Guidelines for Writing an
             IANA Considerations Section in RFCs", BCP 26, RFC 5226,
             May 2008.

   RFC Editor: Please remove reference to RFC5226 if the associated
   IANA Considerations reference is also removed before publication.

11.2. Informative References

   [IB]      Infiniband Trade Association, "Infiniband Architecture
             Specification Volumes 1 and 2", Release 1.1, November
             2002, available from http://www.infinibandta.org/specs.

   [RSOCKETS] RSockets, RDMA enabled Sockets library for Open Fabrics,
             available from
             http://git.openfabrics.org/git?p=~shefty/librdmacm.git;a=b
             lob;f=src/rsocket.c;h=d544dd097cda228de114173c8fe569dc1881
             f057;hb=HEAD.

   [RFC5045] Bestler, C. and L. Coene, "Applicability of Remote Direct
             Memory Access Protocol (RDMA) and Direct Data Placement
             Protocol (DDP)", RFC 5045, October 2007.

   [OFAVERBS] Open Fabrics Alliance Verbs Enhanced Atomic Operations,
             "[PATCH 0/2] Add support for enhanced atomic operations",
             available from
             http://comments.gmane.org/gmane.linux.drivers.rdma/2397.

   [DAT_ATOMICS] DAT Collaborative, User Direct Access Programming
             Library, "Ratified DAT IB extension spec", available from
             http://www.datcollaborative.org/DAT_IB_Extensions.pdf.

12. Acknowledgments

   The authors would like to acknowledge the following contributors who
   provided valuable comments and suggestions.

      o David Black

      o Arkady Kanevsky

      o Bernard Metzler

      o Jim Pinkerton

      o Tom Talpey

      o Steve Wise

      o Don Wood

   This document was prepared using 2-Word-v2.0.template.dot.

Appendix A. DDP Segment Formats for RDMA Messages

   This appendix is for information only and is NOT part of the
   standard. It simply depicts the DDP Segment format for the various
   RDMA Messages.

A.1. DDP Segment for Atomic Operation Request

   The following figure depicts an Atomic Operation Request DDP
   Segment:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
                                   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                                   |  DDP Control  | RDMA Control  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Reserved (Not Used)                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          DDP (Atomic Operation Request) Queue Number          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    DDP (Atomic Operation Request) Message Sequence Number     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |         DDP (Atomic Operation Request) Message Offset         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  Reserved (Not Used)                  |AOpCode|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Request Identifier                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          Remote STag                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     Remote Tagged Offset                      |
   +                                                               +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       Add or Swap Data                        |
   +                                                               +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       Add or Swap Mask                        |
   +                                                               +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         Compare Data                          |
   +                                                               +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         Compare Mask                          |
   +                                                               +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

A.2. DDP Segment for Atomic Response

   The following figure depicts an Atomic Operation Response DDP
   Segment:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
                                   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                                   |  DDP Control  | RDMA Control  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Reserved (Not Used)                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |              DDP (Atomic Response) Queue Number               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |         DDP (Atomic Response) Message Sequence Number         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |             DDP (Atomic Response) Message Offset              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  Original Request Identifier                  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  Original Remote Data Value                   |
   +                                                               +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

A.3. DDP Segment for Immediate Data and Immediate Data with SE

   The following figure depicts an Immediate Data or Immediate Data
   with SE DDP Segment:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
                                   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                                   |  DDP Control  | RDMA Control  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Reserved (Not Used)                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    DDP (Send) Queue Number                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |              DDP (Send) Message Sequence Number               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      DDP Message Offset                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Immediate Data                         |
   +                                                               +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Authors' Addresses

   Hemal Shah
   Broadcom Corporation
   5300 California Avenue
   Irvine, CA 92617
   Phone: 1-949-926-6941
   Email: hemal@broadcom.com

   Felix Marti
   Chelsio Communications, Inc.
   370 San Aleso Ave.
   Sunnyvale, CA 94085
   Phone: 1-408-962-3600
   Email: felix@chelsio.com

   Asgeir Eiriksson
   Chelsio Communications, Inc.
   370 San Aleso Ave.
   Sunnyvale, CA 94085
   Phone: 1-408-962-3600
   Email: asgeir@chelsio.com

   Wael Noureddine
   Chelsio Communications, Inc.
   370 San Aleso Ave.
   Sunnyvale, CA 94085
   Phone: 1-408-962-3600
   Email: wael@chelsio.com

   Robert Sharp
   Intel Corporation
   1300 South Mopac Expy, Mailstop: AN4-4B
   Austin, TX 78746
   Phone: 1-512-362-1407
   Email: robert.o.sharp@intel.com