< draft-ietf-rddp-rdmap-06.txt   draft-ietf-rddp-rdmap-07.txt >
Remote Direct Data Placement Work Group R. Recio Remote Direct Data Placement Work Group R. Recio
INTERNET DRAFT IBM Corporation INTERNET DRAFT IBM Corporation
draft-ietf-rddp-rdmap-06.txt P. Culley draft-ietf-rddp-rdmap-07.txt P. Culley
Hewlett-Packard Company Hewlett-Packard Company
D. Garcia D. Garcia
Hewlett-Packard Company Hewlett-Packard Company
J. Hilland J. Hilland
Hewlett-Packard Company Hewlett-Packard Company
B. Metzler B. Metzler
IBM Corporation IBM Corporation
Expires: January, 2007 June 1, 2006 Expires: February, 2007 September 8, 2006
A Remote Direct Memory Access Protocol Specification A Remote Direct Memory Access Protocol Specification
Status of this Memo Status of this Memo
By submitting this Internet-Draft, each author represents that any By submitting this Internet-Draft, each author represents that any
applicable patent or other IPR claims of which he or she is aware applicable patent or other IPR claims of which he or she is aware
have been or will be disclosed, and any of which he or she becomes have been or will be disclosed, and any of which he or she becomes
aware will be disclosed, in accordance with Section 6 of BCP 79. aware will be disclosed, in accordance with Section 6 of BCP 79.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet- other groups may also distribute working documents as Internet-
Drafts. Drafts.
Internet-Drafts are draft documents valid for a maximum of six Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other months and may be updated, replaced, or obsoleted by other
documents at any time. It is inappropriate to use Internet-Drafts documents at any time. It is inappropriate to use Internet-Drafts
as reference material or to cite them other than as "work in as reference material or to cite them other than as "work in
progress." progress."
The list of current Internet-Drafts can be accessed at The list of current Internet-Drafts can be accessed at
http://www.ietf.org/1id-abstracts.html The list of Internet-Draft http://www.ietf.org/1id-abstracts.html The list of Internet-Draft
Shadow Directories can be accessed at Shadow Directories can be accessed at
http://www.ietf.org/shadow.html. http://www.ietf.org/shadow.html.
Abstract Abstract
This document defines a Remote Direct Memory Access Protocol This document defines a Remote Direct Memory Access Protocol
(RDMAP) that operates over the Direct Data Placement Protocol (DDP (RDMAP) that operates over the Direct Data Placement Protocol (DDP
protocol). RDMAP provides read and write services directly to protocol). RDMAP provides read and write services directly to
applications and enables data to be transferred directly into Upper applications and enables data to be transferred directly into
Layer Protocol (ULP) Buffers without intermediate data copies. It Upper Layer Protocol (ULP) buffers without intermediate data
also enables a kernel bypass implementation. copies. It also enables a kernel bypass implementation.
Table of Contents Table of Contents
1 Introduction...............................................6 1 Introduction................................................6
1.1 Architectural Goals........................................6 1.1 Architectural Goals.........................................6
1.2 Protocol Overview..........................................7 1.2 Protocol Overview...........................................7
1.3 RDMAP Layering............................................10 1.3 RDMAP Layering.............................................10
1.4 Specification Changes from the Last Version...............11 1.4 Specification Changes from the Last Version................11
2 Glossary..................................................14 2 Glossary...................................................14
2.1 General...................................................14 2.1 General....................................................14
2.2 LLP.......................................................16 2.2 LLP........................................................16
2.3 Direct Data Placement (DDP)...............................17 2.3 Direct Data Placement (DDP)................................17
2.4 Remote Direct Memory Access (RDMA)........................19 2.4 Remote Direct Memory Access (RDMA).........................19
3 ULP and Transport Attributes..............................22 3 ULP and Transport Attributes...............................22
3.1 Transport Requirements & Assumptions......................22 3.1 Transport Requirements & Assumptions.......................22
3.2 RDMAP Interactions with the ULP...........................23 3.2 RDMAP Interactions with the ULP............................23
4 Header Format.............................................27 4 Header Format..............................................27
4.1 RDMAP Control and Invalidate STag Field...................27 4.1 RDMAP Control and Invalidate STag Field....................27
4.2 RDMA Message Definitions..................................30 4.2 RDMA Message Definitions...................................30
4.3 RDMA Write Header.........................................31 4.3 RDMA Write Header..........................................31
4.4 RDMA Read Request Header..................................32 4.4 RDMA Read Request Header...................................32
4.5 RDMA Read Response Header.................................34 4.5 RDMA Read Response Header..................................34
4.6 Send Header and Send with Solicited Event Header..........34 4.6 Send Header and Send with Solicited Event Header...........34
4.7 Send with Invalidate Header and Send with SE and Invalidate 4.7 Send with Invalidate Header and Send with SE and Invalidate
Header..........................................................34 Header...........................................................34
4.8 Terminate Header..........................................34 4.8 Terminate Header...........................................34
5 Data Transfer.............................................41 5 Data Transfer..............................................41
5.1 RDMA Write Message........................................41 5.1 RDMA Write Message.........................................41
5.2 RDMA Read Operation.......................................42 5.2 RDMA Read Operation........................................42
5.2.1 RDMA Read Request Message................................42 5.2.1 RDMA Read Request Message.................................42
5.2.2 RDMA Read Response Message...............................43 5.2.2 RDMA Read Response Message................................43
5.3 Send Message Type.........................................44 5.3 Send Message Type..........................................44
5.4 Terminate Message.........................................46 5.4 Terminate Message..........................................46
5.5 Ordering and Completions..................................47 5.5 Ordering and Completions...................................47
6 RDMAP Stream Management...................................51 6 RDMAP Stream Management....................................51
6.1 Stream Initialization.....................................51 6.1 Stream Initialization......................................51
6.2 Stream Teardown...........................................52 6.2 Stream Teardown............................................52
6.2.1 RDMAP Abortive Termination...............................52 6.2.1 RDMAP Abortive Termination................................52
7 RDMAP Error Management....................................54 7 RDMAP Error Management.....................................54
7.1 RDMAP Error Surfacing.....................................54 7.1 RDMAP Error Surfacing......................................54
7.2 Errors Detected at the Remote Peer on Incoming RDMA Messages 7.2 Errors Detected at the Remote Peer on Incoming RDMA Messages55
55 8 Security Considerations....................................57
8 Security..................................................57 8.1 Summary of RDMAP specific Security Requirements............57
8.1 Summary of RDMAP specific Security Requirements...........57 8.1.1 RDMAP (RNIC) Requirements.................................57
8.1.1 RDMAP (RNIC) Requirements................................57 8.1.2 Privileged Resource Manager Requirements..................59
8.1.2 Privileged Resource Manager Requirements.................59 8.2 Security Services for RDMAP................................60
8.2 Security Services for RDMAP...............................60 8.2.1 Available Security Services...............................60
8.2.1 Available Security Services..............................60 8.2.2 Requirements for IPsec Services for RDMAP.................61
8.2.2 Requirements for IPsec Services for RDMAP................61 9 IANA.......................................................64
9 IANA......................................................64 10 References.................................................65
10 References................................................65 10.1 Normative References......................................65
10.1 Normative References.....................................65 10.2 Informative References....................................66
10.2 Informative References...................................65 11 Appendix...................................................67
11 Appendix..................................................67 11.1 DDP Segment Formats for RDMA Messages.....................67
11.1 DDP Segment Formats for RDMA Messages....................67 11.1.1 DDP Segment for RDMA Write..............................67
11.1.1 DDP Segment for RDMA Write.............................67 11.1.2 DDP Segment for RDMA Read Request.......................67
11.1.2 DDP Segment for RDMA Read Request......................67 11.1.3 DDP Segment for RDMA Read Response......................69
11.1.3 DDP Segment for RDMA Read Response.....................69 11.1.4 DDP Segment for Send and Send with Solicited Event......69
11.1.4 DDP Segment for Send and Send with Solicited Event.....69 11.1.5 DDP Segment for Send with Invalidate and Send with SE and
11.1.5 DDP Segment for Send with Invalidate and Send with SE and Invalidate.......................................................70
Invalidate......................................................70 11.1.6 DDP Segment for Terminate...............................71
11.1.6 DDP Segment for Terminate..............................71 11.2 Ordering and Completion Table.............................71
11.2 Ordering and Completion Table............................71 12 Author's Address...........................................75
12 Author's Address..........................................75 13 Contributors...............................................76
13 Contributors..............................................76 14 Intellectual Property Statement............................80
14 Intellectual Property Statement...........................80 15 Full Copyright Statement...................................81
15 IPR Disclosure Acknowledgement
16 Full Copyright Statement..................................81
Table of Figures Table of Figures
Figure 1 RDMAP Layering.........................................10 Figure 1 RDMAP Layering..........................................10
Figure 2 Example of MPA, DDP, and RDMAP Header Alignment over TCP11 Figure 2 Example of MPA, DDP, and RDMAP Header Alignment over TCP11
Figure 3 DDP Control, RDMAP Control, and Invalidate STag Fields.28 Figure 3 DDP Control, RDMAP Control, and Invalidate STag Fields..28
Figure 4 RDMA Usage of DDP Fields...............................29 Figure 4 RDMA Usage of DDP Fields................................29
Figure 5 RDMA Message Definitions...............................31 Figure 5 RDMA Message Definitions................................31
Figure 6 RDMA Read Request Header Format........................32 Figure 6 RDMA Read Request Header Format.........................32
Figure 7 Terminate Header Format................................35 Figure 7 Terminate Header Format.................................35
Figure 8 Terminate Control Field................................35 Figure 8 Terminate Control Field.................................35
Figure 9 Terminate Control Field Values.........................38 Figure 9 Terminate Control Field Values..........................38
Figure 10 Error Type to RDMA Message Mapping....................40 Figure 10 Error Type to RDMA Message Mapping.....................40
Figure 11 RDMA Write, DDP Segment format........................67 Figure 11 RDMA Write, DDP Segment format.........................67
Figure 12 RDMA Read Request, DDP Segment format.................68 Figure 12 RDMA Read Request, DDP Segment format..................68
Figure 13 RDMA Read Response, DDP Segment format................69 Figure 13 RDMA Read Response, DDP Segment format.................69
Figure 14 Send and Send with Solicited Event, DDP Segment format 70 Figure 14 Send and Send with Solicited Event, DDP Segment format.70
Figure 15 Send with Invalidate and Send with SE and Invalidate, DDP Figure 15 Send with Invalidate and Send with SE and Invalidate,
Segment.........................................................70 DDP Segment......................................................70
Figure 16 Terminate, DDP Segment format.........................71 Figure 16 Terminate, DDP Segment format..........................71
Figure 17 Operation Ordering....................................74 Figure 17 Operation Ordering.....................................74
1 Introduction 1 Introduction
Today, communications over TCP/IP typically require copy Today, communications over TCP/IP typically require copy
operations, which add latency and consume significant CPU and operations, which add latency and consume significant CPU and
memory resources. The Remote Direct Memory Access Protocol (RDMAP) memory resources. The Remote Direct Memory Access Protocol
enables removal of data copy operations and enables reduction in (RDMAP) enables removal of data copy operations and enables
latencies by allowing a local application to read or write data on reduction in latencies by allowing a local application to read or
a remote computer's memory with minimal demands on memory bus write data on a remote computer's memory with minimal demands on
bandwidth and CPU processing overhead, while preserving memory memory bus bandwidth and CPU processing overhead, while preserving
protection semantics. memory protection semantics.
RDMAP is layered on top of Direct Data Placement (DDP) and uses the RDMAP is layered on top of Direct Data Placement (DDP) and uses
two Buffer Models available from DDP. DDP-related terminology is the two Buffer Models available from DDP. DDP-related terminology
discussed in Section 2.3. As RDMAP builds on DDP, the reader is is discussed in Section 2.3. As RDMAP builds on DDP, the reader is
advised to become familiar with [DDP]. advised to become familiar with [DDP].
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL"
this document are to be interpreted as described in [RFC2119]. in this document are to be interpreted as described in RFC 2119.
1.1 Architectural Goals 1.1 Architectural Goals
RDMAP has been designed with the following high-level architectural RDMAP has been designed with the following high-level
goals: architectural goals:
* Provide a data transfer operation that allows a Local Peer to * Provide a data transfer operation that allows a Local Peer to
transfer up to 2^32 - 1 octets directly into a previously transfer up to 2^32 - 1 octets directly into a previously
advertised buffer (i.e. Tagged buffer) located at a Remote Peer advertised buffer (i.e., Tagged buffer) located at a Remote
without requiring a copy operation. This is referred to as the Peer without requiring a copy operation. This is referred to as
RDMA Write data transfer operation. the RDMA Write data transfer operation.
* Provide a data transfer operation that allows a Local Peer to * Provide a data transfer operation that allows a Local Peer to
retrieve up to 2^32 - 1 octets directly from a previously retrieve up to 2^32 - 1 octets directly from a previously
advertised buffer (i.e. Tagged buffer) located at a Remote Peer advertised buffer (i.e., Tagged buffer) located at a Remote
without requiring a copy operation. This is referred to as the Peer without requiring a copy operation. This is referred to as
RDMA Read data transfer operation. the RDMA Read data transfer operation.
* Provide a data transfer operation that allows a Local Peer to * Provide a data transfer operation that allows a Local Peer to
send up to 2^32 - 1 octets directly into a buffer located at a send up to 2^32 - 1 octets directly into a buffer located at a
Remote Peer that has not been explicitly advertised. This is Remote Peer that has not been explicitly advertised. This is
referred to as the Send (Send with Invalidate, Send with referred to as the Send (Send with Invalidate, Send with
Solicited Event, and Send with Solicited Event and Invalidate) Solicited Event, and Send with Solicited Event and Invalidate)
data transfer operation. data transfer operation.
* Enable the local ULP to use the Send Operation Type (includes * Enable the local ULP to use the Send Operation Type (includes
Send, Send with Invalidate, Send with Solicited Event, and Send Send, Send with Invalidate, Send with Solicited Event, and Send
with Solicited Event and Invalidate) to signal to the remote ULP with Solicited Event and Invalidate) to signal to the remote
the Completion of all previous Messages initiated by the local ULP the Completion of all previous Messages initiated by the
ULP. local ULP.
* Provide for all Operations on a single RDMAP Stream to be * Provide for all Operations on a single RDMAP Stream to be
reliably transmitted in the order that they were submitted. reliably transmitted in the order that they were submitted.
* Provide RDMAP capabilities independently for each Stream when * Provide RDMAP capabilities independently for each Stream when
the LLP supports multiple data Streams within an LLP connection. the LLP supports multiple data Streams within an LLP
connection.
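The size limit shared by the first three goals can be illustrated with a minimal sketch. The names MAX_TRANSFER_OCTETS and check_transfer_length are hypothetical and do not come from the specification. Accepting a zero-length transfer is an assumption made here for illustration only.

```python
# Upper bound stated in the architectural goals: 2^32 - 1 octets per
# RDMA Write, RDMA Read, or Send type data transfer operation.
MAX_TRANSFER_OCTETS = 2**32 - 1


def check_transfer_length(length: int) -> int:
    """Return length if it fits the stated limit, else raise ValueError.

    Assumption: a zero-length transfer is accepted; the goals only
    state the upper bound.
    """
    if not 0 <= length <= MAX_TRANSFER_OCTETS:
        raise ValueError(
            f"length {length} is outside 0..{MAX_TRANSFER_OCTETS} octets"
        )
    return length
```

A caller would run such a check before submitting an operation, so that an oversized request is rejected locally rather than handed to the RDMAP layer.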
1.2 Protocol Overview 1.2 Protocol Overview
RDMAP provides seven data transfer operations. Except for the RDMA RDMAP provides seven data transfer operations. Except for the RDMA
Read operation, each operation generates exactly one RDMA Message. Read operation, each operation generates exactly one RDMA Message.
Following is a brief overview of the RDMA Operations and RDMA Following is a brief overview of the RDMA Operations and RDMA
Messages: Messages:
1. Send - A Send operation uses a Send Message to transfer data 1. Send - A Send operation uses a Send Message to transfer data
from the Data Source into a buffer that has not been explicitly from the Data Source into a buffer that has not been
Advertised by the Data Sink. The Send Message uses the DDP explicitly Advertised by the Data Sink. The Send Message uses
Untagged Buffer Model to transfer the ULP Message into the Data the DDP Untagged Buffer Model to transfer the ULP Message into
Sink's Untagged Buffer. the Data Sink's Untagged Buffer.
2. Send with Invalidate - A Send with Invalidate operation uses a 2. Send with Invalidate - A Send with Invalidate operation uses a
Send with Invalidate Message to transfer data from the Data Send with Invalidate Message to transfer data from the Data
Source into a buffer that has not been explicitly Advertised by Source into a buffer that has not been explicitly Advertised
the Data Sink. The Send with Invalidate Message includes all by the Data Sink. The Send with Invalidate Message includes
functionality of the Send Message, with one addition: an STag all functionality of the Send Message, with one addition: an
field is included in the Send With Invalidate Message and after STag field is included in the Send with Invalidate Message and
the message has been Placed and Delivered at the Data Sink the after the message has been Placed and Delivered at the Data
remote peer's buffer identified by the STag can no longer be Sink the remote peer's buffer identified by the STag can no
accessed remotely until the remote peer's ULP re-enables access longer be accessed remotely until the remote peer's ULP re-
and Advertises the buffer. enables access and Advertises the buffer.
3. Send with Solicited Event (Send with SE) - A Send with 3. Send with Solicited Event (Send with SE) - A Send with
Solicited Event operation uses a Send with Solicited Event Solicited Event operation uses a Send with Solicited Event
Message to transfer data from the Data Source into an Untagged Message to transfer data from the Data Source into an Untagged
Buffer at the Data Sink. The Send with Solicited Event Message Buffer at the Data Sink. The Send with Solicited Event Message
is similar to the Send Message, with one addition: when the is similar to the Send Message, with one addition: when the
Send with Solicited Event Message has been Placed and Send with Solicited Event Message has been Placed and
Delivered, an Event may be generated at the recipient, if the Delivered, an Event may be generated at the recipient, if the
recipient is configured to generate such an Event. recipient is configured to generate such an Event.
4. Send with Solicited Event and Invalidate (Send with SE and 4. Send with Solicited Event and Invalidate (Send with SE and
Invalidate) - A Send with Solicited Event and Invalidate Invalidate) - A Send with Solicited Event and Invalidate
operation uses a Send with Solicited Event and Invalidate operation uses a Send with Solicited Event and Invalidate
Message to transfer data from the Data Source into a buffer Message to transfer data from the Data Source into a buffer
that has not been explicitly Advertised by the Data Sink. The that has not been explicitly Advertised by the Data Sink. The
Send with Solicited Event and Invalidate Message is similar to Send with Solicited Event and Invalidate Message is similar to
the Send with Invalidate Message, with one addition: when the the Send with Invalidate Message, with one addition: when the
Send with Solicited Event and Invalidate Message has been Send with Solicited Event and Invalidate Message has been
Placed and Delivered, an Event may be generated at the Placed and Delivered, an Event may be generated at the
recipient, if the recipient is configured to generate such an recipient, if the recipient is configured to generate such an
Event. Event.
5. Remote Direct Memory Access Write - An RDMA Write operation 5. Remote Direct Memory Access Write - An RDMA Write operation
uses an RDMA Write Message to transfer data from the Data uses an RDMA Write Message to transfer data from the Data
Source to a previously advertised buffer at the Data Sink. Source to a previously advertised buffer at the Data Sink.
The ULP at the Remote Peer, which in this case is the Data The ULP at the Remote Peer, which in this case is the Data
Sink, enables the Data Sink Tagged Buffer for access and Sink, enables the Data Sink Tagged Buffer for access and
Advertises the buffer's size (length), location (Tagged Advertises the buffer's size (length), location (Tagged
Offset), and Steering Tag (STag) to the Data Source through a Offset), and Steering Tag (STag) to the Data Source through a
ULP specific mechanism. The ULP at the Local Peer, which in ULP specific mechanism. The ULP at the Local Peer, which in
this case is the Data Source, initiates the RDMA Write this case is the Data Source, initiates the RDMA Write
operation. The RDMA Write Message uses the DDP Tagged Buffer operation. The RDMA Write Message uses the DDP Tagged Buffer
Model to transfer the ULP Message into the Data Sink's Tagged Model to transfer the ULP Message into the Data Sink's Tagged
Buffer. Note: the STag associated with the Tagged Buffer Buffer. Note: the STag associated with the Tagged Buffer
remains valid until the ULP at the Remote Peer invalidates it remains valid until the ULP at the Remote Peer invalidates it
or the ULP at the Local Peer invalidates it through a Send with or the ULP at the Local Peer invalidates it through a Send
Invalidate or Send with Solicited Event and Invalidate. with Invalidate or Send with Solicited Event and Invalidate.
6. Remote Direct Memory Access Read - The RDMA Read operation 6. Remote Direct Memory Access Read - The RDMA Read operation
transfers data to a Tagged Buffer at the Local Peer, which in transfers data to a Tagged Buffer at the Local Peer, which in
this case is the Data Sink, from a Tagged Buffer at the Remote this case is the Data Sink, from a Tagged Buffer at the Remote
Peer, which in this case is the Data Source. The ULP at the Peer, which in this case is the Data Source. The ULP at the
Data Source enables the Data Source Tagged Buffer for access Data Source enables the Data Source Tagged Buffer for access
and Advertises the buffer's size (length), location (Tagged and Advertises the buffer's size (length), location (Tagged
Offset), and Steering Tag (STag) to the Data Sink through a ULP Offset), and Steering Tag (STag) to the Data Sink through a
specific mechanism. The ULP at the Data Sink enables the Data ULP specific mechanism. The ULP at the Data Sink enables the
Sink Tagged Buffer for access and initiates the RDMA Read Data Sink Tagged Buffer for access and initiates the RDMA Read
operation. The RDMA Read operation consists of a single RDMA operation. The RDMA Read operation consists of a single RDMA
Read Request Message and a single RDMA Read Response Message, Read Request Message and a single RDMA Read Response Message,
and the latter may be segmented into multiple DDP Segments. and the latter may be segmented into multiple DDP Segments.
The RDMA Read Request Message uses the DDP Untagged Buffer The RDMA Read Request Message uses the DDP Untagged Buffer
Model to Deliver the STag, starting Tagged Offset and length Model to Deliver the STag, starting Tagged Offset and length
for both the Data Source and Data Sink Tagged Buffers to the for both the Data Source and Data Sink Tagged Buffers to the
remote peer's RDMA Read Request Queue. remote peer's RDMA Read Request Queue.
The RDMA Read Response Message uses the DDP Tagged Buffer Model The RDMA Read Response Message uses the DDP Tagged Buffer
to Deliver the Data Source's Tagged Buffer to the Data Sink, Model to Deliver the Data Source's Tagged Buffer to the Data
without any involvement from the ULP at the Data Source. Sink, without any involvement from the ULP at the Data Source.
Note: the Data Source STag associated with the Tagged Buffer Note: the Data Source STag associated with the Tagged Buffer
remains valid until the ULP at the Data Source invalidates it remains valid until the ULP at the Data Source invalidates it
or the ULP at the Data Sink invalidates it through a Send with or the ULP at the Data Sink invalidates it through a Send with
Invalidate or Send with Solicited Event and Invalidate. The Invalidate or Send with Solicited Event and Invalidate. The
Data Sink STag associated with the Tagged Buffer remains valid Data Sink STag associated with the Tagged Buffer remains valid
until the ULP at the Data Sink invalidates it. until the ULP at the Data Sink invalidates it.
7. Terminate - A Terminate operation uses a Terminate Message to 7. Terminate - A Terminate operation uses a Terminate Message to
transfer to the Remote Peer information associated with an transfer to the Remote Peer information associated with an
error that occurred at the Local Peer. The Terminate Message error that occurred at the Local Peer. The Terminate Message
uses the DDP Untagged Buffer Model to transfer the Message into uses the DDP Untagged Buffer Model to transfer the Message
the Data Sink's Untagged Buffer. into the Data Sink's Untagged Buffer.
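The seven operations above can be summarized as a lookup from each operation to the RDMA Message(s) it generates and the DDP Buffer Model each Message uses. The following is a rough sketch, not part of the specification: the names BufferModel, Message, OPERATIONS, and carries_invalidate_stag are hypothetical. The Untagged entries for the Invalidate variants follow from the text stating that those Messages include the functionality of the Send Message.

```python
from dataclasses import dataclass
from enum import Enum


class BufferModel(Enum):
    TAGGED = "tagged"
    UNTAGGED = "untagged"


@dataclass(frozen=True)
class Message:
    name: str
    buffer_model: BufferModel


T, U = BufferModel.TAGGED, BufferModel.UNTAGGED

# Operation -> Messages generated, in the order listed in the overview.
OPERATIONS = {
    "Send": (Message("Send", U),),
    "Send with Invalidate": (Message("Send with Invalidate", U),),
    "Send with SE": (Message("Send with Solicited Event", U),),
    "Send with SE and Invalidate": (
        Message("Send with Solicited Event and Invalidate", U),
    ),
    "RDMA Write": (Message("RDMA Write", T),),
    # The only operation that generates more than one Message.
    "RDMA Read": (
        Message("RDMA Read Request", U),
        Message("RDMA Read Response", T),
    ),
    "Terminate": (Message("Terminate", U),),
}


def carries_invalidate_stag(operation: str) -> bool:
    """True for the Send variants that include an STag to invalidate."""
    return operation in ("Send with Invalidate", "Send with SE and Invalidate")
```

Looking up OPERATIONS["RDMA Read"] returns two Messages, which matches the statement that every operation except RDMA Read generates exactly one RDMA Message.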
1.3 RDMAP Layering 1.3 RDMAP Layering
RDMAP is dependent on DDP, subject to the requirements defined in RDMAP is dependent on DDP, subject to the requirements defined in
section 3.1 Transport Requirements & Assumptions. Figure 1 RDMAP section 3.1 Transport Requirements & Assumptions. Figure 1 RDMAP
Layering depicts the relationship between Upper Layer Protocols Layering depicts the relationship between Upper Layer Protocols
(ULPs), RDMAP, DDP protocol, the framing layer, and the transport. (ULPs), RDMAP, DDP protocol, the framing layer, and the transport.
For LLP protocol definitions of each LLP, see [MPA], [TCP], and For LLP protocol definitions of each LLP, see [MPA], [TCP], and
[SCTP]. [SCTP].
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| | | |
| Upper Layer Protocol (ULP) | | Upper Layer Protocol (ULP) |
| | | |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| | | |
| RDMAP | | RDMAP |
| | | |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| | | |
| DDP protocol | | DDP protocol |
| | | |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| | | | | |
| MPA | | | MPA | |
| | | | | |
+-+-+-+-+-+-+-+-+-+ SCTP | +-+-+-+-+-+-+-+-+-+ SCTP |
| | | | | |
| TCP | | | TCP | |
| | | | | |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 1 RDMAP Layering Figure 1 RDMAP Layering
If RDMAP is layered over DDP/MPA/TCP, then the respective headers If RDMAP is layered over DDP/MPA/TCP, then the respective headers
and ULP Payload are arranged as follows (Note: For clarity, MPA and ULP Payload are arranged as follows (Note: For clarity, MPA
header and CRC fields are included but MPA markers are not shown): header and CRC fields are included but MPA markers are not shown):
0 1 2 3 0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| | | |
// TCP Header // // TCP Header //
| | | |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| MPA Header | | | MPA Header | |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +
| | | |
// DDP Header // // DDP Header //
| | | |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| | | |
// RDMA Header // // RDMA Header //
| | | |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| | | |
// ULP Payload // // ULP Payload //
| (shown with no pad bytes) | // (shown with no pad bytes) //
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ // //
| MPA CRC | | |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 2 Example of MPA, DDP, and RDMAP Header Alignment over TCP | MPA CRC |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 2 Example of MPA, DDP, and RDMAP Header Alignment over TCP
1.4 Specification Changes from the Last Version 1.4 Specification Changes from the Last Version
This section is to be removed before RFC publication. This section is to be removed before RFC publication.
The following major changes (vs typos) were made to the -05 The following major changes (vs typos) were made to the -06 and -
version: 07 version:
* To pass the IETF checklist tool, modified heading of Security * Incorporated comments from Transport Area Directors and the
Section 8 to "Security" and added "Security Considerations" Remote Direct Data Placement Working Group chair.
below it.
* Added IANA Section 9 and to pass the IETF checklist tool added The following major changes (vs typos) were made to the -05
"IANA Considerations" line below Section 9 header. version:
* Added Intellectual Property Statement Section 14 and IPR * To pass the IETF checklist tool, modified heading of Security
Disclosure Acknowledgement Section 15. Section 8 to "Security" and added "Security Considerations"
below it.
* Added Disclaimer Section 16. * Added IANA Section 9 and to pass the IETF checklist tool added
"IANA Considerations" line below Section 9 header.
* Section 6.8 - Acknowledged that the Reserved field size for the * Added Intellectual Property Statement Section 14 and IPR
Terminate Message is 13 bits. The fix was made to the -04 Disclosure Acknowledgement Section 15.
version, but was not listed in this section.
* Rewrite of the "Security" section to refer to Security document * Added Disclaimer Section 16.
rather than summarize.
* Update to the "Contributors" section. * Section 6.8 - Acknowledged that the Reserved field size for the
Terminate Message is 13 bits. The fix was made to the -04
version, but was not listed in this section.
* Changed boilerplate reference from 3667 to 3979. * Rewrite of the "Security" section to refer to Security document
rather than summarize.
* Removed references to company names in the disclaimer section. * Update to the "Contributors" section.
* Added "Key Words" Disclaimer to the Introduction. * Changed boilerplate reference from 3667 to 3979.
The following major changes (vs typos) were made to the -04 * Removed references to company names in the disclaimer section.
version:
* Section 10 - Expanded IPsec requirements sentence in section * Added "Key Words" Disclaimer to the Introduction.
10.3.2 to say what is required in addition to cross-referencing
RFC 3723.
* Section 6.8 - Fixed text after Figure 9 to reflect the correct The following major changes (vs typos) were made to the -04
size (13 bits) of the Reserved field in the Terminate Message. version:
The following major changes (vs typos) were made to the -03 * Section 10 - Expanded IPsec requirements sentence in section
version: 10.3.2 to say what is required in addition to cross-referencing
RFC 3723.
* Section 6.1 - Added normative text describing downward * Section 6.8 - Fixed text after Figure 9 to reflect the correct
compatibility with version 0. size (13 bits) of the Reserved field in the Terminate Message.
* Section 6.8 - Changed the description of the reserved field size The following major changes (vs typos) were made to the -03
to match the size in the figure, which is 13 bits. version:
* Section 10 - Aligned security section closely to [RDMASEC] and * Section 6.1 - Added normative text describing downward
added normative text for security requirements. compatibility with version 0.
The following major changes (vs typos) were made to the -02 * Section 6.8 - Changed the description of the reserved field
version: size to match the size in the figure, which is 13 bits.
* Section 6.8 - Explicitly defined the bit numbers for the three * Section 10 - Aligned security section closely to [RDMASEC] and
header control bits. added normative text for security requirements.
* Section 8.1 - Stated the typical Stream initialization to be: The following major changes (vs typos) were made to the -02
RDMA mode is entered some time after the LLP Stream is version:
initialized.
* Section 10 - Update reference to security document. * Section 6.8 - Explicitly defined the bit numbers for the three
header control bits.
* Section 10 - Fixed Send with Solicited Event and Invalidate * Section 8.1 - Stated the typical Stream initialization to be:
reference. RDMA mode is entered some time after the LLP Stream is
initialized.
* Section 12.1 - MPA and DDP references were changed to reflect * Section 10 - Update reference to security document.
the released specifications and accurate titles.
* Section 12.1 - Reference for RDMA Protocol Verbs was changed to * Section 10 - Fixed Send with Solicited Event and Invalidate
reflect the released specification and accurate title. reference.
2 Glossary * Section 12.1 - MPA and DDP references were changed to reflect
the released specifications and accurate titles.
2.1 General * Section 12.1 - Reference for RDMA Protocol Verbs was changed to
reflect the released specification and accurate title.
Advertisement (Advertised, Advertise, Advertisements, Advertises) - 2 Glossary
the act of informing a Remote Peer that a local RDMA Buffer is
available to it. A Node makes available an RDMA Buffer for
incoming RDMA Read or RDMA Write access by informing its
RDMA/DDP peer of the Tagged Buffer identifiers (STag, base
address, and buffer length). This Advertisement of Tagged
Buffer information is not defined by RDMA/DDP and is left to
the ULP. A typical method would be for the Local Peer to embed
the Tagged Buffer's Steering Tag, base address, and length in a
Send Message destined for the Remote Peer.
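Because the Advertisement format is left to the ULP, any concrete encoding is a ULP design choice. As a minimal sketch, a ULP could carry the three Tagged Buffer identifiers in the payload of a Send Message as shown below. The field widths (32-bit STag, 64-bit base address, 32-bit length), the network byte order, and the BufferAdvertisement name are assumptions made for illustration and are not defined by RDMA/DDP.

```python
import struct
from dataclasses import dataclass

# Hypothetical ULP wire format: STag (32 bits), base address (64 bits),
# length (32 bits), in network byte order with no padding.
_ADVERT_FORMAT = "!IQI"


@dataclass(frozen=True)
class BufferAdvertisement:
    stag: int          # Steering Tag identifying the Tagged Buffer
    base_address: int  # base address of the Tagged Buffer
    length: int        # buffer length in bytes

    def pack(self) -> bytes:
        """Encode the Advertisement for embedding in a Send Message payload."""
        return struct.pack(_ADVERT_FORMAT, self.stag, self.base_address, self.length)

    @classmethod
    def unpack(cls, data: bytes) -> "BufferAdvertisement":
        """Decode an Advertisement received from the Remote Peer."""
        return cls(*struct.unpack(_ADVERT_FORMAT, data))
```

The Remote Peer would decode the payload with BufferAdvertisement.unpack() and use the resulting values to target the buffer in a later RDMA Read or RDMA Write.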
Completion - Refer to "RDMA Completion" in Section 2.4. 2.1 General
Completed - See "RDMA Completion" in Section 2.4. Advertisement (Advertised, Advertise, Advertisements, Advertises)
- the act of informing a Remote Peer that a local RDMA Buffer
is available to it. A Node makes available an RDMA Buffer for
incoming RDMA Read or RDMA Write access by informing its
RDMA/DDP peer of the Tagged Buffer identifiers (STag, base
address, and buffer length). This Advertisement of Tagged
Buffer information is not defined by RDMA/DDP and is left to
the ULP. A typical method would be for the Local Peer to embed
the Tagged Buffer's Steering Tag, base address, and length in
a Send Message destined for the Remote Peer.
Complete - See "RDMA Completion" in Section 2.4. Completion - Refer to "RDMA Completion" in Section 2.4.
Completes - See "RDMA Completion" in Section 2.4. Completed - See "RDMA Completion" in Section 2.4.
Data Sink - The peer receiving a data payload. Note that the Data Complete - See "RDMA Completion" in Section 2.4.
Sink can be required to both send and receive RDMA/DDP Messages
to transfer a data payload.
Data Source - The peer sending a data payload. Note that the Data Completes - See "RDMA Completion" in Section 2.4.
Source can be required to both send and receive RDMA/DDP
Messages to transfer a data payload.
Data Delivery (Delivery, Delivered, Delivers) - Delivery is defined Data Sink - The peer receiving a data payload. Note that the Data
as the process of informing the ULP or consumer that a Sink can be required to both send and receive RDMA/DDP
particular Message is available for use. This is specifically Messages to transfer a data payload.
different from "Placement", which may generally occur in any
order, while the order of "Delivery" is strictly defined. See
"Data Placement" in Section 2.3.
Delivery - See Data Delivery in Section 2.1. Data Source - The peer sending a data payload. Note that the Data
Source can be required to both send and receive RDMA/DDP
Messages to transfer a data payload.
Delivered - See Data Delivery in Section 2.1. Data Delivery (Delivery, Delivered, Delivers) - Delivery is
defined as the process of informing the ULP or consumer that a
particular Message is available for use. This is specifically
different from "Placement", which may generally occur in any
order, while the order of "Delivery" is strictly defined. See
"Data Placement" in Section 2.3.
Delivers - See Data Delivery in Section 2.1. Delivery - See Data Delivery in Section 2.1.
Fabric - The collection of links, switches, and routers that Delivered - See Data Delivery in Section 2.1.
connect a set of Nodes with RDMA/DDP protocol implementations.
Fence (Fenced, Fences) - To block the current RDMA Operation from Delivers - See Data Delivery in Section 2.1.
executing until prior RDMA Operations have Completed.
iWARP - A suite of wire protocols comprised of RDMAP, DDP, and MPA. Fabric - The collection of links, switches, and routers that
The iWARP protocol suite may be layered above TCP, SCTP, or connect a set of Nodes with RDMA/DDP protocol implementations.
other transport protocols.
Local Peer - The RDMA/DDP protocol implementation on the local end Fence (Fenced, Fences) - To block the current RDMA Operation from
of the connection. Used to refer to the local entity when executing until prior RDMA Operations have Completed.
describing a protocol exchange or other interaction between two
Nodes.
Node - A computing device attached to one or more links of a Fabric iWARP - A suite of wire protocols comprised of RDMAP, DDP, and
(network). A Node in this context does not refer to a specific MPA. The iWARP protocol suite may be layered above TCP, SCTP,
application or protocol instantiation running on the computer. or other transport protocols.
A Node may consist of one or more RNICs installed in a host
computer.
Placement - See "Data Placement" in Section 2.3. Local Peer - The RDMA/DDP protocol implementation on the local end
of the connection. Used to refer to the local entity when
describing a protocol exchange or other interaction between
two Nodes.
Placed - See "Data Placement" in Section 2.3. Node - A computing device attached to one or more links of a
Fabric (network). A Node in this context does not refer to a
specific application or protocol instantiation running on the
computer. A Node may consist of one or more RNICs installed in
a host computer.
Places - See "Data Placement" in Section 2.3. Placement - See "Data Placement" in Section 2.3.
Remote Peer - The RDMA/DDP protocol implementation on the opposite Placed - See "Data Placement" in Section 2.3.
end of the connection. Used to refer to the remote entity when
describing protocol exchanges or other interactions between two
Nodes.
RNIC - RDMA Network Interface Controller. In this context, this Places - See "Data Placement" in Section 2.3.
would be a network I/O adapter or embedded controller with
iWARP and Verbs functionality.
RNIC Interface (RI) - The presentation of the RNIC to the Verbs Remote Peer - The RDMA/DDP protocol implementation on the opposite
Consumer as implemented through the combination of the RNIC and end of the connection. Used to refer to the remote entity when
the RNIC driver. describing protocol exchanges or other interactions between
two Nodes.
Termination - See "RDMAP Abortive Termination" in Section 2.4. RNIC - RDMA Network Interface Controller. In this context, this
would be a network I/O adapter or embedded controller with
iWARP and Verbs functionality.
Terminated - See "RDMAP Abortive Termination" in Section 2.4. RNIC Interface (RI) - The presentation of the RNIC to the Verbs
Consumer as implemented through the combination of the RNIC
and the RNIC driver.
Terminate - See "RDMAP Abortive Termination" in Section 2.4. Termination - See "RDMAP Abortive Termination" in Section 2.4.
Terminates - See "RDMAP Abortive Termination" in Section 2.4. Terminated - See "RDMAP Abortive Termination" in Section 2.4.
ULP - Upper Layer Protocol. The protocol layer above the protocol Terminate - See "RDMAP Abortive Termination" in Section 2.4.
layer currently being referenced. The ULP for RDMA/DDP is
expected to be an OS, Application, adaptation layer, or
proprietary device. The RDMA/DDP documents do not specify a
ULP - they provide a set of semantics that allow a ULP to be
designed to utilize RDMA/DDP.
ULP Payload - The ULP data that is contained within a single Terminates - See "RDMAP Abortive Termination" in Section 2.4.
protocol segment or packet (e.g. a DDP Segment).
Verbs - An abstract description of the functionality of a RNIC ULP - Upper Layer Protocol. The protocol layer above the protocol
Interface. The OS may expose some or all of this functionality layer currently being referenced. The ULP for RDMA/DDP is
via one or more APIs to applications. The OS will also use some expected to be an OS, Application, adaptation layer, or
of the functionality to manage the RNIC Interface. proprietary device. The RDMA/DDP documents do not specify a
ULP - they provide a set of semantics that allow a ULP to be
designed to utilize RDMA/DDP.
2.2 LLP ULP Payload - The ULP data that is contained within a single
protocol segment or packet (e.g., a DDP Segment).
LLP - Lower Layer Protocol. The protocol layer beneath the protocol Verbs - An abstract description of the functionality of a RNIC
layer currently being referenced. For example, for DDP the LLP Interface. The OS may expose some or all of this functionality
is SCTP, MPA, or other transport protocols. For RDMA, the LLP via one or more APIs to applications. The OS will also use
is DDP. some of the functionality to manage the RNIC Interface.
LLP Connection - Corresponds to an LLP transport-level connection 2.2 LLP
between the peer LLP layers on two nodes.
LLP Stream - Corresponds to a single LLP transport-level Stream LLP - Lower Layer Protocol. The protocol layer beneath the
between the peer LLP layers on two Nodes. One or more LLP protocol layer currently being referenced. For example, for
Streams may map to a single transport-level LLP connection. For DDP the LLP is SCTP, MPA, or other transport protocols. For
transport protocols that support multiple Streams per RDMA, the LLP is DDP.
connection (e.g. SCTP), a LLP Stream corresponds to one
transport-level Stream.
MULPDU - Maximum ULPDU. The current maximum size of the record that LLP Connection - Corresponds to an LLP transport-level connection
is acceptable for DDP to pass to the LLP for transmission. between the peer LLP layers on two nodes.
ULPDU - Upper Layer Protocol Data Unit. The data record defined by LLP Stream - Corresponds to a single LLP transport-level Stream
the layer above MPA. between the peer LLP layers on two Nodes. One or more LLP
Streams may map to a single transport-level LLP connection.
For transport protocols that support multiple Streams per
connection (e.g., SCTP), a LLP Stream corresponds to one
transport-level Stream.
2.3 Direct Data Placement (DDP) MULPDU - Maximum ULPDU. The current maximum size of the record
that is acceptable for DDP to pass to the LLP for
transmission.
Data Placement (Placement, Placed, Places) - For DDP, this term is ULPDU - Upper Layer Protocol Data Unit. The data record defined
specifically used to indicate the process of writing to a data by the layer above MPA.
buffer by a DDP implementation. DDP Segments carry Placement
information, which may be used by the receiving DDP
implementation to perform Data Placement of the DDP Segment ULP
Payload. See "Data Delivery".
DDP Abortive Teardown - The act of closing a DDP Stream without 2.3 Direct Data Placement (DDP)
attempting to Complete in-progress and pending DDP Messages.
DDP Graceful Teardown - The act of closing a DDP Stream such that Data Placement (Placement, Placed, Places) - For DDP, this term is
all in-progress and pending DDP Messages are allowed to specifically used to indicate the process of writing to a data
Complete successfully. buffer by a DDP implementation. DDP Segments carry Placement
information, which may be used by the receiving DDP
implementation to perform Data Placement of the DDP Segment
ULP Payload. See "Data Delivery".
DDP Control Field - a fixed 16-bit field in the DDP Header. The DDP DDP Abortive Teardown - The act of closing a DDP Stream without
Control Field contains an 8-bit field whose contents are attempting to Complete in-progress and pending DDP Messages.
reserved for use by the ULP.
DDP Header - The header present in all DDP segments. The DDP Header DDP Graceful Teardown - The act of closing a DDP Stream such that
contains control and Placement fields that are used to define all in-progress and pending DDP Messages are allowed to
the final Placement location for the ULP payload carried in a Complete successfully.
DDP Segment.
DDP Message - A ULP defined unit of data interchange, which is DDP Control Field - a fixed 16-bit field in the DDP Header. The
subdivided into one or more DDP segments. This segmentation may DDP Control Field contains an 8-bit field whose contents are
occur for a variety of reasons, including segmentation to reserved for use by the ULP.
respect the maximum segment size of the underlying transport
protocol.
DDP Segment - The smallest unit of data transfer for the DDP DDP Header - The header present in all DDP segments. The DDP
protocol. It includes a DDP Header and ULP Payload (if Header contains control and Placement fields that are used to
present). A DDP Segment should be sized to fit within the define the final Placement location for the ULP payload
underlying transport protocol MULPDU. carried in a DDP Segment.
DDP Stream - a sequence of DDP Messages whose ordering is defined DDP Message - A ULP defined unit of data interchange, which is
by the LLP. For SCTP, a DDP Stream maps directly to an SCTP subdivided into one or more DDP segments. This segmentation
Stream. For MPA, a DDP Stream maps directly to a TCP connection may occur for a variety of reasons, including segmentation to
and a single DDP Stream is supported. Note that DDP has no respect the maximum segment size of the underlying transport
ordering guarantees between DDP Streams. protocol.
Direct Data Placement - A mechanism whereby ULP data contained DDP Segment - The smallest unit of data transfer for the DDP
within DDP Segments may be Placed directly into its final protocol. It includes a DDP Header and ULP Payload (if
destination in memory without processing of the ULP. This may present). A DDP Segment should be sized to fit within the
occur even when the DDP Segments arrive out of order. Out of underlying transport protocol MULPDU.
order Placement support may require the Data Sink to implement
the LLP and DDP as one functional block.
Direct Data Placement Protocol (DDP) - Also, a wire protocol that DDP Stream - a sequence of DDP Messages whose ordering is defined
supports Direct Data Placement by associating explicit memory by the LLP. For SCTP, a DDP Stream maps directly to an SCTP
buffer placement information with the LLP payload units. Stream. For MPA, a DDP Stream maps directly to a TCP
connection and a single DDP Stream is supported. Note that
DDP has no ordering guarantees between DDP Streams.
Message Offset (MO) - For the DDP Untagged Buffer Model, specifies Direct Data Placement - A mechanism whereby ULP data contained
the offset, in bytes, from the start of a DDP Message. within DDP Segments may be Placed directly into its final
destination in memory without processing of the ULP. This may
occur even when the DDP Segments arrive out of order. Out of
order Placement support may require the Data Sink to implement
the LLP and DDP as one functional block.
Message Sequence Number (MSN) - For the DDP Untagged Buffer Model, Direct Data Placement Protocol (DDP) - Also, a wire protocol that
specifies a sequence number that is increasing with each DDP supports Direct Data Placement by associating explicit memory
Message. buffer placement information with the LLP payload units.
Queue Number (QN) - For the DDP Untagged Buffer Model, identifies a Message Offset (MO) - For the DDP Untagged Buffer Model, specifies
destination Data Sink queue for a DDP Segment. the offset, in bytes, from the start of a DDP Message.
Steering Tag - An identifier of a Tagged Buffer on a Node, valid as Message Sequence Number (MSN) - For the DDP Untagged Buffer Model,
defined within a protocol specification. specifies a sequence number that is increasing with each DDP
Message.
STag - Steering Tag Queue Number (QN) - For the DDP Untagged Buffer Model, identifies
a destination Data Sink queue for a DDP Segment.
Tagged Buffer - A buffer that is explicitly Advertised to the Steering Tag - An identifier of a Tagged Buffer on a Node, valid
Remote Peer through exchange of an STag, Tagged Offset, and as defined within a protocol specification.
length.
Tagged Buffer Model - A DDP data transfer model used to transfer STag - Steering Tag
Tagged Buffers from the Local Peer to the Remote Peer.
Tagged DDP Message - A DDP Message that targets a Tagged Buffer. Tagged Buffer - A buffer that is explicitly Advertised to the
Remote Peer through exchange of an STag, Tagged Offset, and
length.
Tagged Offset (TO) - The offset within a Tagged Buffer on a Node. Tagged Buffer Model - A DDP data transfer model used to transfer
Tagged Buffers from the Local Peer to the Remote Peer.
Untagged Buffer - A buffer that is not explicitly Advertised to the Tagged DDP Message - A DDP Message that targets a Tagged Buffer.
Remote Peer. Untagged buffers support one of the two available
data transfer mechanisms called the Untagged Buffer Model. An
untagged buffer is used to send asynchronous control messages
to the Remote Peer for RDMA Read, Send, and Terminate requests.
Untagged Buffers handle Untagged DDP Messages.
Untagged Buffer Model - A DDP data transfer model used to transfer Tagged Offset (TO) - The offset within a Tagged Buffer on a Node.
Untagged Buffers from the Local Peer to the Remote Peer.
Untagged DDP Message - A DDP Message that targets an Untagged Untagged Buffer - A buffer that is not explicitly Advertised to
Buffer. the Remote Peer. Untagged buffers support one of the two
available data transfer mechanisms called the Untagged Buffer
Model. An untagged buffer is used to send asynchronous control
messages to the Remote Peer for RDMA Read, Send, and Terminate
requests. Untagged Buffers handle Untagged DDP Messages.
2.4 Remote Direct Memory Access (RDMA) Untagged Buffer Model - A DDP data transfer model used to transfer
Untagged Buffers from the Local Peer to the Remote Peer.
Event - An indication provided by the RDMAP Layer to the ULP to Untagged DDP Message - A DDP Message that targets an Untagged
indicate a Completion or other condition requiring immediate Buffer.
attention.
Invalidate STag - A mechanism used to prevent the Remote Peer from 2.4 Remote Direct Memory Access (RDMA)
reusing a previous explicitly Advertised STag, until the Local
Peer makes it available through a subsequent explicit
Advertisement. The STag cannot be accessed remotely until it is
explicitly Advertised again.
RDMA Completion (Completion, Completed, Complete, Completes) - For Event - An indication provided by the RDMAP Layer to the ULP to
RDMA, Completion is defined as the process of informing the ULP indicate a Completion or other condition requiring immediate
that a particular RDMA Operation has performed all functions attention.
specified for the RDMA Operations, including Placement and
Delivery. The Completion semantic of each RDMA Operation is
distinctly defined.
RDMA Message - A data transfer mechanism used to fulfill an RDMA Invalidate STag - A mechanism used to prevent the Remote Peer from
Operation. reusing a previous explicitly Advertised STag, until the Local
Peer makes it available through a subsequent explicit
Advertisement. The STag cannot be accessed remotely until it
is explicitly Advertised again.
RDMA Operation - A sequence of RDMA Messages, including control RDMA Completion (Completion, Completed, Complete, Completes) - For
Messages, to transfer data from a Data Source to a Data Sink. RDMA, Completion is defined as the process of informing the
The following RDMA Operations are defined - RDMA Writes, RDMA ULP that a particular RDMA Operation has performed all
Read, Send, Send with Invalidate, Send with Solicited Event, functions specified for the RDMA Operations, including
Send with Solicited Event and Invalidate, and Terminate. Placement and Delivery. The Completion semantic of each RDMA
Operation is distinctly defined.
RDMA Protocol (RDMAP) - A wire protocol that supports RDMA RDMA Message - A data transfer mechanism used to fulfill an RDMA
Operations to transfer ULP data between a Local Peer and the Operation.
Remote Peer.
RDMAP Abortive Termination (Termination, Terminated, Terminate, RDMA Operation - A sequence of RDMA Messages, including control
Terminates) - The act of closing an RDMAP Stream without Messages, to transfer data from a Data Source to a Data Sink.
attempting to Complete in-progress and pending RDMA Operations. The following RDMA Operations are defined - RDMA Writes, RDMA
Read, Send, Send with Invalidate, Send with Solicited Event,
Send with Solicited Event and Invalidate, and Terminate.
RDMAP Graceful Termination - The act of closing an RDMAP Stream RDMA Protocol (RDMAP) - A wire protocol that supports RDMA
such that all in-progress and pending RDMA Operations are Operations to transfer ULP data between a Local Peer and the
allowed to Complete successfully. Remote Peer.
RDMA Read - An RDMA Operation used by the Data Sink to transfer the RDMAP Abortive Termination (Termination, Terminated, Terminate,
contents of a source RDMA buffer from the Remote Peer to the Terminates) - The act of closing an RDMAP Stream without
Local Peer. An RDMA Read operation consists of a single RDMA attempting to Complete in-progress and pending RDMA
Read Request Message and a single RDMA Read Response Message. Operations.
RDMA Read Request - An RDMA Message used by the Data Sink to RDMAP Graceful Termination - The act of closing an RDMAP Stream
request the Data Source to transfer the contents of an RDMA such that all in-progress and pending RDMA Operations are
buffer. The RDMA Read Request Message describes both the Data allowed to Complete successfully.
Source and Data Sink RDMA buffers.
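As a rough illustration of why the RDMA Read Request describes both buffers while the RDMA Read Response describes only the Data Sink buffer, the sketch below models a request as five fields. The field names, their order, and their widths (32-bit STags and size, 64-bit Tagged Offsets) are assumptions made for this example; the normative header layout is given in the header format section of this specification.

```python
import struct
from dataclasses import dataclass

# Assumed layout: Sink STag, Sink Tagged Offset, read size,
# Source STag, Source Tagged Offset (network byte order, no padding).
_READ_REQUEST_FORMAT = "!IQIIQ"


@dataclass(frozen=True)
class ReadRequest:
    sink_stag: int      # Tagged Buffer at the Data Sink (the requester)
    sink_offset: int    # Tagged Offset within the Data Sink buffer
    read_size: int      # number of bytes to transfer
    source_stag: int    # Tagged Buffer at the Data Source (the responder)
    source_offset: int  # Tagged Offset within the Data Source buffer

    def pack(self) -> bytes:
        return struct.pack(_READ_REQUEST_FORMAT, self.sink_stag, self.sink_offset,
                           self.read_size, self.source_stag, self.source_offset)

    @classmethod
    def unpack(cls, data: bytes) -> "ReadRequest":
        return cls(*struct.unpack(_READ_REQUEST_FORMAT, data))


def response_target(request: ReadRequest) -> tuple:
    """(STag, Tagged Offset, length) that the Read Response must carry.

    The Data Source copies the Data Sink description out of the request,
    so the response needs no description of the Data Source buffer.
    """
    return (request.sink_stag, request.sink_offset, request.read_size)
```

In this model the Data Source reads read_size bytes starting at source_offset in the buffer named by source_stag, then sends them as a Tagged Message aimed at the location returned by response_target().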
RDMA Read Request Queue - The queue used for processing RDMA Read RDMA Read - An RDMA Operation used by the Data Sink to transfer
Requests. The RDMA Read Request Queue has a DDP Queue Number of the contents of a source RDMA buffer from the Remote Peer to
1. the Local Peer. An RDMA Read operation consists of a single
RDMA Read Request Message and a single RDMA Read Response
Message.
RDMA Read Response - An RDMA Message used by the Data Source to RDMA Read Request - An RDMA Message used by the Data Sink to
transfer the contents of an RDMA buffer to the Data Sink, in request the Data Source to transfer the contents of an RDMA
response to an RDMA Read Request. The RDMA Read Response buffer. The RDMA Read Request Message describes both the Data
Message only describes the data sink RDMA buffer. Source and Data Sink RDMA buffers.
RDMAP Stream - An association between a pair of RDMAP RDMA Read Request Queue - The queue used for processing RDMA Read
implementations, possibly on different Nodes, which transfer Requests. The RDMA Read Request Queue has a DDP Queue Number
ULP data using RDMA Operations. There may be multiple RDMAP of 1.
Streams on a single Node. An RDMAP Stream maps directly to a
single DDP Stream.
RDMA Write - An RDMA Operation that transfers the contents of a RDMA Read Response - An RDMA Message used by the Data Source to
source RDMA Buffer from the Local Peer to a destination RDMA transfer the contents of an RDMA buffer to the Data Sink, in
Buffer at the Remote Peer using RDMA. The RDMA Write Message response to an RDMA Read Request. The RDMA Read Response
only describes the Data Sink RDMA buffer. Message only describes the data sink RDMA buffer.
Remote Direct Memory Access (RDMA) - A method of accessing memory RDMAP Stream - An association between a pair of RDMAP
on a remote system in which the local system specifies the implementations, possibly on different Nodes, which transfer
remote location of the data to be transferred. Employing a RNIC ULP data using RDMA Operations. There may be multiple RDMAP
in the remote system allows the access to take place without Streams on a single Node. An RDMAP Stream maps directly to a
interrupting the processing of the CPU(s) on the system. single DDP Stream.
Send - An RDMA Operation that transfers the contents of a ULP RDMA Write - An RDMA Operation that transfers the contents of a
Buffer from the Local Peer to an Untagged Buffer at the Remote source RDMA Buffer from the Local Peer to a destination RDMA
Peer. Buffer at the Remote Peer using RDMA. The RDMA Write Message
only describes the Data Sink RDMA buffer.
Send Message Type - A Send Message, Send with Invalidate Message, Remote Direct Memory Access (RDMA) - A method of accessing memory
Send with Solicited Event Message, or Send with Solicited Event on a remote system in which the local system specifies the
and Invalidate Message. remote location of the data to be transferred. Employing a
RNIC in the remote system allows the access to take place
without interrupting the processing of the CPU(s) on the
system.
Send Operation Type - A Send Operation, Send with Invalidate Send - An RDMA Operation that transfers the contents of a ULP
Operation, Send with Solicited Event Operation, or Send with Buffer from the Local Peer to an Untagged Buffer at the Remote
Solicited Event and Invalidate Operation. Peer.
Solicited Event (SE) - A facility by which an RDMA Operation sender Send Message Type - A Send Message, Send with Invalidate Message,
may cause an Event to be generated at the recipient, if the Send with Solicited Event Message, or Send with Solicited
recipient is configured to generate such an Event, when a Send Event and Invalidate Message.
with Solicited Event or Send with Solicited Event and
Invalidate Message is received. Note: The Local Peer's ULP can
use the Solicited Event mechanism to ensure that Messages
designated as important to the ULP are handled in an
expeditious manner by the Remote Peer's ULP. The ULP at the
Local Peer can indicate a given Send Message Type is important
by using the Send with Solicited Event Message or Send with
Solicited Event and Invalidate Message. The ULP at the Remote
Peer can choose to only be notified when valid Send with
Solicited Event Messages and/or Send with Solicited Event and
Invalidate Messages arrive and handle other valid incoming Send
Messages or Send with Invalidate Messages at its leisure.
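The recipient-side decision described above can be sketched as a small predicate. This is only a model of the semantics: the NotifyMode setting and the generates_event() function are hypothetical names, and how a recipient is actually configured to generate Events is outside the scope of this definition.

```python
from enum import Enum


class SendType(Enum):
    SEND = "Send"
    SEND_INVALIDATE = "Send with Invalidate"
    SEND_SE = "Send with Solicited Event"
    SEND_SE_INVALIDATE = "Send with Solicited Event and Invalidate"


class NotifyMode(Enum):
    """Hypothetical recipient-side configuration."""
    NONE = 0            # never generate an Event on arrival
    SOLICITED_ONLY = 1  # generate an Event only for Solicited Event Messages
    ALL = 2             # generate an Event for every Send Message Type


# The two Send Message Types that carry the Solicited Event indication.
SOLICITED_TYPES = frozenset({SendType.SEND_SE, SendType.SEND_SE_INVALIDATE})


def generates_event(msg_type: SendType, mode: NotifyMode) -> bool:
    """Return True if a valid incoming Message of msg_type raises an Event."""
    if mode is NotifyMode.ALL:
        return True
    if mode is NotifyMode.SOLICITED_ONLY:
        return msg_type in SOLICITED_TYPES
    return False
```

With SOLICITED_ONLY, plain Send and Send with Invalidate Messages still Complete, but the Remote Peer's ULP can process them at its leisure instead of being notified immediately.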
Terminate - An RDMA Message used by a Node to pass an error Send Operation Type - A Send Operation, Send with Invalidate
indication to the peer Node on an RDMAP Stream. This operation Operation, Send with Solicited Event Operation, or Send with
is for RDMAP use only. Solicited Event and Invalidate Operation.
ULP Buffer - A buffer owned above the RDMAP Layer and advertised to Solicited Event (SE) - A facility by which an RDMA Operation
the RDMAP Layer either as a Tagged Buffer or an Untagged ULP sender may cause an Event to be generated at the recipient, if
Buffer. the recipient is configured to generate such an Event, when a
Send with Solicited Event or Send with Solicited Event and
Invalidate Message is received. Note: The Local Peer's ULP
can use the Solicited Event mechanism to ensure that Messages
designated as important to the ULP are handled in an
expeditious manner by the Remote Peer's ULP. The ULP at the
Local Peer can indicate a given Send Message Type is important
by using the Send with Solicited Event Message or Send with
Solicited Event and Invalidate Message. The ULP at the Remote
Peer can choose to only be notified when valid Send with
Solicited Event Messages and/or Send with Solicited Event and
Invalidate Messages arrive and handle other valid incoming
Send Messages or Send with Invalidate Messages at its leisure.
ULP Message - The ULP data that is handed to a specific protocol Terminate - An RDMA Message used by a Node to pass an error
layer for transmission. Data boundaries are preserved as they indication to the peer Node on an RDMAP Stream. This operation
are transmitted through iWARP. is for RDMAP use only.
3 ULP and Transport Attributes ULP Buffer - A buffer owned above the RDMAP Layer and advertised
to the RDMAP Layer either as a Tagged Buffer or an Untagged
ULP Buffer.
3.1 Transport Requirements & Assumptions ULP Message - The ULP data that is handed to a specific protocol
layer for transmission. Data boundaries are preserved as they
are transmitted through iWARP.
RDMAP MUST be layered on top of the Direct Data Placement Protocol 3 ULP and Transport Attributes
[DDP].
RDMAP requires the following DDP support: 3.1 Transport Requirements & Assumptions
* RDMAP uses three queues for Untagged Buffers: RDMAP MUST be layered on top of the Direct Data Placement Protocol
[DDP].
* Queue Number 0 (used by RDMAP for Send, Send with RDMAP requires the following DDP support:
Invalidate, Send with Solicited Event, and Send with
Solicited Event and Invalidate operations).
* Queue Number 1 (used by RDMAP for RDMA Read operations). * RDMAP uses three queues for Untagged Buffers:
* Queue Number 2 (used by RDMAP for Terminate operations). * Queue Number 0 (used by RDMAP for Send, Send with
Invalidate, Send with Solicited Event, and Send with
Solicited Event and Invalidate operations).
* DDP maps a single RDMA Message to a single DDP Message. * Queue Number 1 (used by RDMAP for RDMA Read operations).
* DDP uses the STag and Tagged Offset provided by the RDMAP for * Queue Number 2 (used by RDMAP for Terminate operations).
Tagged Buffer Messages (i.e. RDMA Write and RDMA Read Response).
* When the DDP layer Delivers an Untagged DDP Message to the RDMAP * DDP maps a single RDMA Message to a single DDP Message.
layer, DDP provides the length of the DDP Message. This ensures
that RDMAP does not have to carry a length field in its header.
* When the RDMAP layer provides an RDMA Message to the DDP Layer, * DDP uses the STag and Tagged Offset provided by the RDMAP for
DDP must insert the RsvdULP field value provided by the RDMAP Tagged Buffer Messages (i.e., RDMA Write and RDMA Read
Layer into the associated DDP Message. Response).
* When the DDP layer Delivers a DDP Message to the RDMAP layer, * When the DDP layer Delivers an Untagged DDP Message to the
DDP provides the RsvdULP field. RDMAP layer, DDP provides the length of the DDP Message. This
ensures that RDMAP does not have to carry a length field in its
header.
* The RsvdULP field must be 1 octet for DDP Tagged Messages and 5 * When the RDMAP layer provides an RDMA Message to the DDP Layer,
octets for DDP Untagged Messages. DDP must insert the RsvdULP field value provided by the RDMAP
Layer into the associated DDP Message.
* DDP propagates to RDMAP all operation or protection errors (used * When the DDP layer Delivers a DDP Message to the RDMAP layer,
by RDMAP Terminate) and, when appropriate, the DDP Header fields DDP provides the RsvdULP field.
of the DDP Segment that encountered the error.
* If an RDMA Operation is aborted by DDP or a lower layer, the * The RsvdULP field must be 1 octet for DDP Tagged Messages and 5
contents of the Data Sink buffers associated with the operation octets for DDP Untagged Messages.
are considered indeterminate.
* DDP in conjunction with the lower layers provide reliable, in- * DDP propagates to RDMAP all operation or protection errors
order Delivery. (used by RDMAP Terminate) and, when appropriate, the DDP Header
fields of the DDP Segment that encountered the error.
3.2 RDMAP Interactions with the ULP * If an RDMA Operation is aborted by DDP or a lower layer, the
contents of the Data Sink buffers associated with the operation
are considered indeterminate.
RDMAP provides the ULP with access to the following RDMA Operations * DDP in conjunction with the lower layers provide reliable, in-
as defined in this specification: order Delivery.
* Send 3.2 RDMAP Interactions with the ULP
* Send with Solicited Event RDMAP provides the ULP with access to the following RDMA
Operations as defined in this specification:
* Send with Invalidate * Send
* Send with Solicited Event and Invalidate * Send with Solicited Event
* RDMA Write * Send with Invalidate
* RDMA Read * Send with Solicited Event and Invalidate
For Send Operation Types, the following are the interactions * RDMA Write
between the RDMAP Layer and the ULP:
* At the Data Source: * RDMA Read
* The ULP passes to the RDMAP Layer the following: For Send Operation Types, the following are the interactions
between the RDMAP Layer and the ULP:
* ULP Message Length * At the Data Source:
* ULP Message * The ULP passes to the RDMAP Layer the following:
* An indication of the Send Operation Type, where the * ULP Message Length
valid types are: Send, Send with Solicited Event, Send
with Invalidate, or Send with Solicited Event and
Invalidate.
* An Invalidate STag, if the Send Operation Type was Send * ULP Message
with Invalidate or Send with Solicited Event and
Invalidate.
* When the Send Operation Type Completes, an indication of * An indication of the Send Operation Type, where the
the Completion results. valid types are: Send, Send with Solicited Event, Send
with Invalidate, or Send with Solicited Event and
Invalidate.
* At the Data Sink: * An Invalidate STag, if the Send Operation Type was
Send with Invalidate or Send with Solicited Event and
Invalidate.
* If the Send Operation Type Completed successfully, the * When the Send Operation Type Completes, an indication of
RDMAP Layer passes the following information to the ULP the Completion results.
Layer:
* ULP Message Length * At the Data Sink:
* ULP Message * If the Send Operation Type Completed successfully, the
RDMAP Layer passes the following information to the ULP
Layer:
* An Event, if the Data Sink is configured to generate an * ULP Message Length
Event.
* An Invalidated STag, if the Send Operation Type was * ULP Message
Send with Invalidate or Send with Solicited Event and
Invalidate.
* If the Send Operation Type Completed in error, the Data * An Event, if the Data Sink is configured to generate
Sink RDMAP Layer will pass up the corresponding error an Event.
information to the Data Sink ULP and send a Terminate
Message to the Data Source RDMAP Layer. The Data Source
RDMAP Layer will then pass up the Terminate Message to the
ULP.
For RDMA Write Operations, the following are the interactions * An Invalidated STag, if the Send Operation Type was
between the RDMAP Layer and the ULP: Send with Invalidate or Send with Solicited Event and
Invalidate.
* At the Data Source: * If the Send Operation Type Completed in error, the Data
Sink RDMAP Layer will pass up the corresponding error
information to the Data Sink ULP and send a Terminate
Message to the Data Source RDMAP Layer. The Data Source
RDMAP Layer will then pass up the Terminate Message to the
ULP.
* The ULP passes to the RDMAP Layer the following: For RDMA Write Operations, the following are the interactions
between the RDMAP Layer and the ULP:
* ULP Message Length * At the Data Source:
* ULP Message * The ULP passes to the RDMAP Layer the following:
* Data Sink STag * ULP Message Length
* Data Sink Tagged Offset * ULP Message
* When the RDMA Write Operation Completes, an indication of * Data Sink STag
the Completion results.
* At the Data Sink: * Data Sink Tagged Offset
* If the RDMA Write completed successfully, the RDMAP Layer * When the RDMA Write Operation Completes, an indication of
does not Deliver the RDMA Write to the ULP. It does Place the Completion results.
the ULP Message transferred through the RDMA Write Message
into the ULP Buffer.
* If the RDMA Write completed in error, the Data Sink RDMAP * At the Data Sink:
Layer will pass up the corresponding error information to
the Data Sink ULP and send a Terminate Message to the Data
Source RDMAP Layer. The Data Source RDMAP Layer will then
pass up the Terminate Message to the ULP.
For RDMA Read Operations, the following are the interactions * If the RDMA Write completed successfully, the RDMAP Layer
between the RDMAP Layer and the ULP: does not Deliver the RDMA Write to the ULP. It does Place
the ULP Message transferred through the RDMA Write Message
into the ULP Buffer.
* At the Data Sink: * If the RDMA Write completed in error, the Data Sink RDMAP
Layer will pass up the corresponding error information to
the Data Sink ULP and send a Terminate Message to the Data
Source RDMAP Layer. The Data Source RDMAP Layer will then
pass up the Terminate Message to the ULP.
* The ULP passes to the RDMAP Layer the following: For RDMA Read Operations, the following are the interactions
between the RDMAP Layer and the ULP:
* ULP Message Length * At the Data Sink:
* Data Source STag * The ULP passes to the RDMAP Layer the following:
* Data Sink STag * ULP Message Length
* Data Source Tagged Offset * Data Source STag
* Data Sink Tagged Offset * Data Sink STag
* When the RDMA Read Operation Completes, an indication of * Data Source Tagged Offset
the Completion results.
* At the Data Source: * Data Sink Tagged Offset
* If no error occurred while processing the RDMA Read * When the RDMA Read Operation Completes, an indication of
Request, the Data Source will not pass up any information the Completion results.
to the ULP.
* If an error occurred while processing the RDMA Read * At the Data Source:
Request, the Data Source RDMAP Layer will pass up the
corresponding error information to the Data Source ULP and
send a Terminate Message to the Data Sink RDMAP Layer. The
Data Sink RDMAP Layer will then pass up the Terminate
Message to the ULP.
For STags made available to the RDMAP Layer, following are the * If no error occurred while processing the RDMA Read
interactions between the RDMAP Layer and the ULP: Request, the Data Source will not pass up any information
to the ULP.
* If the ULP enables an STag, the ULP passes to the RDMAP Layer * If an error occurred while processing the RDMA Read
the: Request, the Data Source RDMAP Layer will pass up the
corresponding error information to the Data Source ULP and
send a Terminate Message to the Data Sink RDMAP Layer. The
Data Sink RDMAP Layer will then pass up the Terminate
Message to the ULP.
* STag; For STags made available to the RDMAP Layer, following are the
interactions between the RDMAP Layer and the ULP:
* range of Tagged Offsets that are associated with a given * If the ULP enables an STag, the ULP passes to the RDMAP Layer
STag; the:
* remote access rights (read, write, or read and write) * STag;
associated with a given, valid STag; and
* association between a given STag and a given RDMAP Stream. * range of Tagged Offsets that are associated with a given
STag;
* If the ULP disables an STag, the ULP passes to the RDMAP Layer * remote access rights (read, write, or read and write)
the STag. associated with a given, valid STag; and
If an error occurs at the RDMAP Layer, the RDMAP Layer may pass * association between a given STag and a given RDMAP Stream.
back error information (e.g. the content of a Terminate Message) to
the ULP.
4 Header Format * If the ULP disables an STag, the ULP passes to the RDMAP Layer
the STag.
The control information of RDMA Messages is included in DDP If an error occurs at the RDMAP Layer, the RDMAP Layer may pass
protocol defined header fields, with the following exceptions: back error information (e.g., the content of a Terminate Message)
to the ULP.
* The first octet reserved for ULP usage on all DDP Messages in 4 Header Format
the DDP Protocol (i.e. the RsvdULP Field) is used by RDMAP to
carry the RDMA Message Opcode and the RDMAP version. This octet
is known as the RDMAP Control Field in this specification. For
Send with Invalidate and Send with Solicited Event and
Invalidate, RDMAP uses the second through fifth octets provided
by DDP on Untagged DDP Messages to carry the STag that will be
Invalidated.
* The RDMA Message length is passed by the RDMAP layer to the DDP The control information of RDMA Messages is included in DDP
layer on all outbound transfers. protocol defined header fields, with the following exceptions:
* For RDMA Read Request Messages, the RDMA Read Message Size is * The first octet reserved for ULP usage on all DDP Messages in
included in the RDMA Read Request Header. the DDP Protocol (i.e., the RsvdULP Field) is used by RDMAP to
carry the RDMA Message Opcode and the RDMAP version. This octet
is known as the RDMAP Control Field in this specification. For
Send with Invalidate and Send with Solicited Event and
Invalidate, RDMAP uses the second through fifth octets provided
by DDP on Untagged DDP Messages to carry the STag that will be
Invalidated.
* The RDMA Message length is passed to the RDMAP Layer by the DDP * The RDMA Message length is passed by the RDMAP layer to the DDP
layer on inbound Untagged Buffer transfers. layer on all outbound transfers.
* Two RDMA Messages carry additional RDMAP headers. The RDMA Read * For RDMA Read Request Messages, the RDMA Read Message Size is
Request carries the Data Sink and Data Source buffer included in the RDMA Read Request Header.
descriptions, including buffer length. The Terminate carries
additional information associated with the error that caused the
Terminate.
4.1 RDMAP Control and Invalidate STag Field * The RDMA Message length is passed to the RDMAP Layer by the DDP
layer on inbound Untagged Buffer transfers.
The version of RDMAP defined by this specification uses all 8 bits * Two RDMA Messages carry additional RDMAP headers. The RDMA Read
of the RDMAP Control Field. The first octet reserved for ULP use in Request carries the Data Sink and Data Source buffer
the DDP Protocol MUST be used by the RDMAP to carry the RDMAP descriptions, including buffer length. The Terminate carries
Control Field. The ordering of the bits in the first octet MUST be additional information associated with the error that caused
as defined in Figure 3 DDP Control, RDMAP Control, and Invalidate the Terminate.
STag Field. For Send with Invalidate and Send with Solicited Event
and Invalidate, the second through fifth octets of the DDP RsvdULP
field MUST be used by RDMAP to carry the Invalidate STag. Figure 3
DDP Control, RDMAP Control, and Invalidate STag Field depicts the
format of the DDP Control and RDMAP Control fields. (Note: In
Figure 3 DDP Control, RDMAP Control, and Invalidate STag Field, the
DDP Header is offset by 16 bits to accommodate the MPA header
defined in [MPA]. The MPA header is only present if DDP is layered
on top of MPA.)
0 1 2 3 4.1 RDMAP Control and Invalidate STag Field
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|T|L| Resrv | DV| RV|Rsv| Opcode|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Invalidate STag |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 3 DDP Control, RDMAP Control, and Invalidate STag Fields
All RDMA Messages handed by the RDMAP Layer to the DDP layer MUST The version of RDMAP defined by this specification uses all 8 bits
define the value of the Tagged flag in the DDP Header. Figure 4 of the RDMAP Control Field. The first octet reserved for ULP use
RDMA Usage of DDP Fields MUST be used to define the value of the in the DDP Protocol MUST be used by the RDMAP to carry the RDMAP
Tagged flag that is handed to the DDP Layer for each RDMA Message. Control Field. The ordering of the bits in the first octet MUST be
as defined in Figure 3 DDP Control, RDMAP Control, and Invalidate
STag Field. For Send with Invalidate and Send with Solicited Event
and Invalidate, the second through fifth octets of the DDP RsvdULP
field MUST be used by RDMAP to carry the Invalidate STag. Figure 3
DDP Control, RDMAP Control, and Invalidate STag Field depicts the
format of the DDP Control and RDMAP Control fields. (Note: In
Figure 3 DDP Control, RDMAP Control, and Invalidate STag Field,
the DDP Header is offset by 16 bits to accommodate the MPA header
defined in [MPA]. The MPA header is only present if DDP is layered
on top of MPA.)
Figure 4 RDMA Usage of DDP Fields defines the value of the RDMA 0 1 2 3
Opcode field that MUST be used for each RDMA Message. 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|T|L| Resrv | DV| RV|Rsv| Opcode|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Invalidate STag |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 3 DDP Control, RDMAP Control, and Invalidate STag Fields
Figure 4 RDMA Usage of DDP Fields defines when the STag, Queue All RDMA Messages handed by the RDMAP Layer to the DDP layer MUST
Number, and Tagged Offset fields MUST be provided for each RDMA define the value of the Tagged flag in the DDP Header. Figure 4
Message. RDMA Usage of DDP Fields MUST be used to define the value of the
Tagged flag that is handed to the DDP Layer for each RDMA Message.
For this version of the RDMAP, all RDMA Messages MUST have: Figure 4 RDMA Usage of DDP Fields defines the value of the RDMA
Opcode field that MUST be used for each RDMA Message.
* Bits 24-25; RDMA Version field: 01b for IETF RNICs, and 00b for Figure 4 RDMA Usage of DDP Fields defines when the STag, Queue
RDMAC RNICs. Both version numbers are valid. Interoperability is Number, and Tagged Offset fields MUST be provided for each RDMA
dependent on MPA protocol version negotiation (e.g. MPA marker Message.
and MPA CRC).
* Bits 26-27; Reserved. MUST be set to zero by sender, ignored by For this version of the RDMAP, all RDMA Messages MUST have:
the receiver.
* Bits 28-31; OpCode field: see Figure 4 RDMA Usage of DDP Fields. * Bits 24-25; RDMA Version field: 01b for an RNIC that complies
with this RDMA protocol specification. 00b for an RNIC that
complies with the RDMA Consortium's RDMA protocol
specification. Both version numbers are valid.
Interoperability is dependent on MPA protocol version
negotiation (e.g., MPA marker and MPA CRC).
* Bits 32-63; Invalidate STag. However, this field is only valid * Bits 26-27; Reserved. MUST be set to zero by sender, ignored by
for Send with Invalidate and Send with Solicited Event and the receiver.
Invalidate Messages (see Figure 4 RDMA Usage of DDP Fields).
For Send, Send with Solicited Event, RDMA Read Request, and
Terminate, the Invalidate STag field MUST be set to zero on
transmit and ignored by the receiver.
-------+-----------+-------+------+-------+-----------+-------------- * Bits 28-31; OpCode field: see Figure 4 RDMA Usage of DDP
RDMA | Message | Tagged| STag | Queue | Invalidate| Message Fields.
Message| Type | Flag | and | Number| STag | Length
OpCode | | | TO | | | Communicated
| | | | | | between DDP
| | | | | | and RDMAP
-------+-----------+-------+------+-------+-----------+--------------
0000b | RDMA Write| 1 | Valid| N/A | N/A | Yes
| | | | | |
-------+-----------+-------+------+-------+-----------+--------------
0001b | RDMA Read | 0 | N/A | 1 | N/A | Yes
| Request | | | | |
-------+-----------+-------+------+-------+-----------+--------------
0010b | RDMA Read | 1 | Valid| N/A | N/A | Yes
| Response | | | | |
-------+-----------+-------+------+-------+-----------+--------------
0011b | Send | 0 | N/A | 0 | N/A | Yes
| | | | | |
-------+-----------+-------+------+-------+-----------+--------------
0100b | Send with | 0 | N/A | 0 | Valid | Yes
| Invalidate| | | | |
-------+-----------+-------+------+-------+-----------+--------------
0101b | Send with | 0 | N/A | 0 | N/A | Yes
| SE | | | | |
-------+-----------+-------+------+-------+-----------+--------------
0110b | Send with | 0 | N/A | 0 | Valid | Yes
| SE and | | | | |
| Invalidate| | | | |
-------+-----------+-------+------+-------+-----------+--------------
0111b | Terminate | 0 | N/A | 2 | N/A | Yes
| | | | | |
-------+-----------+-------+------+-------+-----------+--------------
1000b | |
to | Reserved | Not Specified
1111b | |
-------+-----------+-------------------------------------------------
Figure 4 RDMA Usage of DDP Fields
Note: N/A means Not Applicable. * Bits 32-63; Invalidate STag. However, this field is only valid
for Send with Invalidate and Send with Solicited Event and
Invalidate Messages (see Figure 4 RDMA Usage of DDP Fields).
4.2 RDMA Message Definitions For Send, Send with Solicited Event, RDMA Read Request, and
Terminate, the Invalidate STag field MUST be set to zero on
transmit and ignored by the receiver.
The following figure defines which RDMA Headers MUST be used on -------+-----------+-------+------+-------+-----------+--------------
each RDMA Message and which RDMA Messages are allowed to carry ULP RDMA | Message | Tagged| STag | Queue | Invalidate| Message
payload: Message| Type | Flag | and | Number| STag | Length
OpCode | | | TO | | | Communicated
| | | | | | between DDP
| | | | | | and RDMAP
-------+-----------+-------+------+-------+-----------+--------------
0000b | RDMA Write| 1 | Valid| N/A | N/A | Yes
| | | | | |
-------+-----------+-------+------+-------+-----------+--------------
0001b | RDMA Read | 0 | N/A | 1 | N/A | Yes
| Request | | | | |
-------+-----------+-------+------+-------+-----------+--------------
0010b | RDMA Read | 1 | Valid| N/A | N/A | Yes
| Response | | | | |
-------+-----------+-------+------+-------+-----------+--------------
0011b | Send | 0 | N/A | 0 | N/A | Yes
| | | | | |
-------+-----------+-------+------+-------+-----------+--------------
0100b | Send with | 0 | N/A | 0 | Valid | Yes
| Invalidate| | | | |
-------+-----------+-------+------+-------+-----------+--------------
0101b | Send with | 0 | N/A | 0 | N/A | Yes
| SE | | | | |
-------+-----------+-------+------+-------+-----------+--------------
0110b | Send with | 0 | N/A | 0 | Valid | Yes
| SE and | | | | |
| Invalidate| | | | |
-------+-----------+-------+------+-------+-----------+--------------
0111b | Terminate | 0 | N/A | 2 | N/A | Yes
| | | | | |
-------+-----------+-------+------+-------+-----------+--------------
1000b | |
to | Reserved | Not Specified
1111b | |
-------+-----------+-------------------------------------------------
Figure 4 RDMA Usage of DDP Fields
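The bit layout given for the RDMAP Control Field (bits 24-25 RDMA
Version, bits 26-27 Reserved, bits 28-31 OpCode) and the Invalidate
STag (bits 32-63) can be illustrated with a short Python sketch. It
is not part of the specification. The function and constant names
are hypothetical, and it assumes that bit 24 is the most significant
bit of the control octet and that the STag is sent in network byte
order, as is conventional for figures of this kind.

```python
import struct

# RDMA Version field values (bits 24-25).
RDMAP_VERSION_RDMAC = 0b00  # RDMA Consortium specification
RDMAP_VERSION_IETF = 0b01   # this specification

# OpCodes for which the Invalidate STag field is valid (Figure 4):
# Send with Invalidate, Send with SE and Invalidate.
_INVALIDATE_OPCODES = {0b0100, 0b0110}


def pack_control(version, opcode):
    """Build the one-octet RDMAP Control Field (hypothetical helper)."""
    if version not in (RDMAP_VERSION_RDMAC, RDMAP_VERSION_IETF):
        raise ValueError("unknown RDMA Version value")
    if not 0 <= opcode <= 0b0111:
        raise ValueError("OpCodes 1000b to 1111b are reserved")
    # Version in the two high bits, reserved bits left at zero,
    # OpCode in the low four bits.
    return (version << 6) | opcode


def unpack_control(octet):
    """Return (version, opcode); reserved bits are ignored on receipt."""
    return (octet >> 6) & 0b11, octet & 0b1111


def pack_untagged_rsvd_ulp(version, opcode, invalidate_stag=0):
    """Build the 5-octet RsvdULP value for an Untagged DDP Message."""
    if opcode not in _INVALIDATE_OPCODES:
        # The field MUST be zero on transmit for the other Untagged types.
        invalidate_stag = 0
    return struct.pack("!BI", pack_control(version, opcode), invalidate_stag)
```

For example, pack_control(RDMAP_VERSION_IETF, 0b0011) yields the
control octet for a Send Message. For Tagged DDP Messages only the
single control octet would be handed to DDP.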
-------+-----------+-------------------+------------------------- Note: N/A means Not Applicable.
RDMA | Message | RDMA Header Used | ULP Message allowed in
Message| Type | | the RDMA Message
OpCode | | |
| | |
-------+-----------+-------------------+-------------------------
0000b | RDMA Write| None | Yes
| | |
-------+-----------+-------------------+-------------------------
0001b | RDMA Read | RDMA Read Request | No
| Request | Header |
-------+-----------+-------------------+-------------------------
0010b | RDMA Read | None | Yes
| Response | |
-------+-----------+-------------------+-------------------------
0011b | Send | None | Yes
| | |
-------+-----------+-------------------+-------------------------
0100b | Send with | None | Yes
| Invalidate| |
-------+-----------+-------------------+-------------------------
0101b | Send with | None | Yes
| SE | |
-------+-----------+-------------------+-------------------------
0110b | Send with | None | Yes
| SE and | |
| Invalidate| |
-------+-----------+-------------------+-------------------------
0111b | Terminate | Terminate Header | No
| | |
-------+-----------+-------------------+-------------------------
1000b | |
to | Reserved | Not Specified
1111b | |
-------+-----------+-------------------+-------------------------
Figure 5 RDMA Message Definitions
4.3 RDMA Write Header 4.2 RDMA Message Definitions
The RDMA Write Message does not include an RDMAP header. The RDMAP The following figure defines which RDMA Headers MUST be used on
layer passes to the DDP layer an RDMAP Control Field. The RDMA each RDMA Message and which RDMA Messages are allowed to carry ULP
Write Message is fully described by the DDP Headers of the DDP payload:
Segments associated with the Message.
See section 11 Appendix for a description of the DDP Segment format -------+-----------+-------------------+-------------------------
associated with RDMA Write Messages. RDMA | Message | RDMA Header Used | ULP Message allowed in
Message| Type | | the RDMA Message
OpCode | | |
| | |
-------+-----------+-------------------+-------------------------
0000b | RDMA Write| None | Yes
| | |
-------+-----------+-------------------+-------------------------
0001b | RDMA Read | RDMA Read Request | No
| Request | Header |
-------+-----------+-------------------+-------------------------
0010b | RDMA Read | None | Yes
| Response | |
-------+-----------+-------------------+-------------------------
0011b | Send | None | Yes
| | |
-------+-----------+-------------------+-------------------------
0100b | Send with | None | Yes
| Invalidate| |
-------+-----------+-------------------+-------------------------
0101b | Send with | None | Yes
| SE | |
-------+-----------+-------------------+-------------------------
0110b | Send with | None | Yes
| SE and | |
| Invalidate| |
-------+-----------+-------------------+-------------------------
0111b | Terminate | Terminate Header | No
| | |
-------+-----------+-------------------+-------------------------
1000b | |
to | Reserved | Not Specified
1111b | |
-------+-----------+-------------------+-------------------------
Figure 5 RDMA Message Definitions
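As an informal aid, the per-OpCode rules of Figure 4 and Figure 5
can be collected into one lookup table. The Python sketch below is
one possible encoding, not a normative one. The MessageInfo type,
its field names, and the lookup function are hypothetical. The values
are transcribed from the two figures.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MessageInfo:
    name: str
    tagged: bool                 # Tagged flag handed to DDP
    queue: Optional[int]         # Queue Number, None where N/A
    invalidate_stag: bool        # Invalidate STag field valid
    rdmap_header: Optional[str]  # additional RDMAP header, if any
    ulp_payload: bool            # ULP Message allowed in the RDMA Message


MESSAGES = {
    0b0000: MessageInfo("RDMA Write", True, None, False, None, True),
    0b0001: MessageInfo("RDMA Read Request", False, 1, False,
                        "RDMA Read Request Header", False),
    0b0010: MessageInfo("RDMA Read Response", True, None, False, None, True),
    0b0011: MessageInfo("Send", False, 0, False, None, True),
    0b0100: MessageInfo("Send with Invalidate", False, 0, True, None, True),
    0b0101: MessageInfo("Send with SE", False, 0, False, None, True),
    0b0110: MessageInfo("Send with SE and Invalidate", False, 0, True,
                        None, True),
    0b0111: MessageInfo("Terminate", False, 2, False,
                        "Terminate Header", False),
}


def lookup(opcode):
    """Return the table entry for an OpCode; reserved values are rejected."""
    try:
        return MESSAGES[opcode]
    except KeyError:
        raise ValueError("reserved RDMA Message OpCode") from None
```

A receiver built this way could, for instance, check that an incoming
Untagged Message arrived on the Queue Number listed for its OpCode.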
4.4 RDMA Read Request Header 4.3 RDMA Write Header
The RDMA Read Request Message carries an RDMA Read Request Header The RDMA Write Message does not include an RDMAP header. The RDMAP
that describes the Data Sink and Data Source Buffers used by the layer passes to the DDP layer an RDMAP Control Field. The RDMA
RDMA Read operation. The RDMA Read Request Header immediately Write Message is fully described by the DDP Headers of the DDP
follows the DDP header. The RDMAP layer passes to the DDP layer an Segments associated with the Message.
RDMAP Control Field. The following figure depicts the RDMA Read
Request Header that MUST be used for all RDMA Read Request
Messages:
0 1 2 3 See section 11 Appendix for a description of the DDP Segment
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 format associated with RDMA Write Messages.
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data Sink STag (SinkSTag) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| |
+ Data Sink Tagged Offset (SinkTO) +
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| RDMA Read Message Size (RDMARDSZ) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data Source STag (SrcSTag) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| |
+ Data Source Tagged Offset (SrcTO) +
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 6 RDMA Read Request Header Format
Data Sink Steering Tag: 32 bits. 4.4 RDMA Read Request Header
The Data Sink Steering Tag identifies the Data Sink's Tagged The RDMA Read Request Message carries an RDMA Read Request Header
Buffer. This field MUST be copied, without interpretation, that describes the Data Sink and Data Source Buffers used by the
from the RDMA Read Request into the corresponding RDMA Read RDMA Read operation. The RDMA Read Request Header immediately
Response and allows the Data Sink to place the returning data. follows the DDP header. The RDMAP layer passes to the DDP layer an
The STag is associated with the RDMAP Stream through a RDMAP Control Field. The following figure depicts the RDMA Read
mechanism that is outside the scope of the RDMAP Request Header that MUST be used for all RDMA Read Request
specification. Messages:
Data Sink Tagged Offset: 64 bits. 0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data Sink STag (SinkSTag) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| |
+ Data Sink Tagged Offset (SinkTO) +
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| RDMA Read Message Size (RDMARDSZ) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data Source STag (SrcSTag) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| |
+ Data Source Tagged Offset (SrcTO) +
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 6 RDMA Read Request Header Format
The Data Sink Tagged Offset specifies the starting offset, in Data Sink Steering Tag: 32 bits.
octets, from the base of the Data Sink's Tagged Buffer, where
the data is to be written by the Data Source. This field is
copied from the RDMA Read Request into the corresponding RDMA
Read Response and allows the Data Sink to place the returning
data. The Data Sink Tagged Offset MAY start at an arbitrary
offset.
The Data Sink STag and Data Sink Tagged Offset fields describe The Data Sink Steering Tag identifies the Data Sink's Tagged
the buffer to which the RDMA Read data is written. Buffer. This field MUST be copied, without interpretation,
from the RDMA Read Request into the corresponding RDMA Read
Response and allows the Data Sink to place the returning
data. The STag is associated with the RDMAP Stream through a
mechanism that is outside the scope of the RDMAP
specification.
Note: the DDP Layer protects against a wrap of the Data Sink Data Sink Tagged Offset: 64 bits.
Tagged Offset.
RDMA Read Message Size: 32 bits. The Data Sink Tagged Offset specifies the starting offset, in
octets, from the base of the Data Sink's Tagged Buffer, where
the data is to be written by the Data Source. This field is
copied from the RDMA Read Request into the corresponding RDMA
Read Response and allows the Data Sink to place the returning
data. The Data Sink Tagged Offset MAY start at an arbitrary
offset.
The RDMA Read Message Size is the amount of data, in octets, The Data Sink STag and Data Sink Tagged Offset fields
read from the Data Source. A single RDMA Read Request Message describe the buffer to which the RDMA Read data is written.
can retrieve from 0 to 2^32-1 data octets from the Data
Source.
Data Source Steering Tag: 32 bits. Note: the DDP Layer protects against a wrap of the Data Sink
Tagged Offset.
The Data Source Steering Tag identifies the Data Source's RDMA Read Message Size: 32 bits.
Tagged Buffer. The STag is associated with the RDMAP Stream
through a mechanism that is outside the scope of the RDMAP
specification.
Data Source Tagged Offset: 64 bits. The RDMA Read Message Size is the amount of data, in octets,
read from the Data Source. A single RDMA Read Request Message
can retrieve from 0 to 2^32-1 data octets from the Data
Source.
The Tagged Offset specifies the starting offset, in octets, Data Source Steering Tag: 32 bits.
that is to be read from the Data Source's Tagged Buffer. The
Data Source Tagged Offset MAY start at an arbitrary offset.
The Data Source STag and Data Source Tagged Offset fields The Data Source Steering Tag identifies the Data Source's
describe the buffer from which the RDMA Read data is read. Tagged Buffer. The STag is associated with the RDMAP Stream
through a mechanism that is outside the scope of the RDMAP
specification.
See Section 7.2 Errors Detected at the Remote Peer on Incoming RDMA Data Source Tagged Offset: 64 bits.
Messages for a description of error checking required upon
processing of an RDMA Read Request at the Data Source.
4.5 RDMA Read Response Header The Tagged Offset specifies the starting offset, in octets,
that is to be read from the Data Source's Tagged Buffer. The
Data Source Tagged Offset MAY start at an arbitrary offset.
The RDMA Read Response Message does not include an RDMAP header. The Data Source STag and Data Source Tagged Offset fields
The RDMAP layer passes to the DDP layer an RDMAP Control Field. The describe the buffer from which the RDMA Read data is read.
RDMA Read Response Message is fully described by the DDP Headers of
the DDP Segments associated with the Message.
See Section 11 Appendix for a description of the DDP Segment format See Section 7.2 Errors Detected at the Remote Peer on Incoming
associated with RDMA Read Response Messages. RDMA Messages for a description of error checking required upon
processing of an RDMA Read Request at the Data Source.
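The RDMA Read Request Header of Figure 6 is 28 octets long: a 32-bit
Data Sink STag, a 64-bit Data Sink Tagged Offset, a 32-bit RDMA Read
Message Size, a 32-bit Data Source STag, and a 64-bit Data Source
Tagged Offset. The following Python sketch shows one way to
serialize it. The type and function names are hypothetical, and
network byte order is assumed rather than taken from this section.

```python
import struct
from typing import NamedTuple


class ReadRequestHeader(NamedTuple):
    sink_stag: int   # Data Sink STag (SinkSTag), 32 bits
    sink_to: int     # Data Sink Tagged Offset (SinkTO), 64 bits
    rdmardsz: int    # RDMA Read Message Size (RDMARDSZ), 32 bits
    src_stag: int    # Data Source STag (SrcSTag), 32 bits
    src_to: int      # Data Source Tagged Offset (SrcTO), 64 bits


# "!" selects network byte order with no padding: 4 + 8 + 4 + 4 + 8 octets.
_LAYOUT = struct.Struct("!IQIIQ")


def pack_read_request(hdr):
    """Serialize the header in the field order shown in Figure 6."""
    return _LAYOUT.pack(*hdr)


def unpack_read_request(data):
    """Parse the header from the start of a buffer."""
    if len(data) < _LAYOUT.size:
        raise ValueError("buffer shorter than an RDMA Read Request Header")
    return ReadRequestHeader(*_LAYOUT.unpack_from(data))
```

The Data Source would copy sink_stag and sink_to, without
interpretation, into the DDP header of the corresponding RDMA Read
Response.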
4.6 Send Header and Send with Solicited Event Header 4.5 RDMA Read Response Header
The Send and Send with Solicited Event Message do not include an The RDMA Read Response Message does not include an RDMAP header.
RDMAP header. The RDMAP layer passes to the DDP layer an RDMAP The RDMAP layer passes to the DDP layer an RDMAP Control Field.
Control Field. The Send and Send with Solicited Event Message are The RDMA Read Response Message is fully described by the DDP
fully described by the DDP Headers of the DDP Segments associated Headers of the DDP Segments associated with the Message.
with the Message.
See Section 11 Appendix for a description of the DDP Segment format See Section 11 Appendix for a description of the DDP Segment
associated with Send and Send with Solicited Event Messages. format associated with RDMA Read Response Messages.
4.7 Send with Invalidate Header and Send with SE and Invalidate Header 4.6 Send Header and Send with Solicited Event Header
The Send with Invalidate and Send with Solicited Event and The Send and Send with Solicited Event Message do not include an
Invalidate Message do not include an RDMAP header. The RDMAP layer RDMAP header. The RDMAP layer passes to the DDP layer an RDMAP
passes to the DDP layer an RDMAP Control Field and the Invalidate Control Field. The Send and Send with Solicited Event Message are
STag field (see section 4.1 RDMAP Control and Invalidate STag fully described by the DDP Headers of the DDP Segments associated
Field). The Send with Invalidate and Send with Solicited Event and with the Message.
Invalidate Message are fully described by the DDP Headers of the
DDP Segments associated with the Message.
See Section 11 Appendix for a description of the DDP Segment format See Section 11 Appendix for a description of the DDP Segment
associated with Send and Send with Solicited Event Messages. format associated with Send and Send with Solicited Event
Messages.
4.8 Terminate Header 4.7 Send with Invalidate Header and Send with SE and Invalidate
Header
The Terminate Message carries a Terminate Header that contains The Send with Invalidate and Send with Solicited Event and
additional information associated with the cause of the Terminate. Invalidate Message do not include an RDMAP header. The RDMAP layer
The Terminate Header immediately follows the DDP header. The RDMAP passes to the DDP layer an RDMAP Control Field and the Invalidate
layer passes to the DDP layer an RDMAP Control Field. The following STag field (see section 4.1 RDMAP Control and Invalidate STag
figure depicts a Terminate Header that MUST be used for the Field). The Send with Invalidate and Send with Solicited Event and
Terminate Message: Invalidate Message are fully described by the DDP Headers of the
DDP Segments associated with the Message.
0 1 2 3 See Section 11 Appendix for a description of the DDP Segment
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 format associated with Send and Send with Solicited Event
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Messages.
| Terminate Control | Reserved |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| DDP Segment Length (if any) | |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +
| |
// //
| Terminated DDP Header (if any) |
+ +
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| |
// //
| Terminated RDMA Header (if any) |
+ +
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 7 Terminate Header Format
Terminate Control: 19 bits. 4.8 Terminate Header
The Terminate Control field MUST have the format defined in The Terminate Message carries a Terminate Header that contains
Figure 8 Terminate Control Field. additional information associated with the cause of the Terminate.
The Terminate Header immediately follows the DDP header. The RDMAP
layer passes to the DDP layer an RDMAP Control Field. The
following figure depicts a Terminate Header that MUST be used for
the Terminate Message:
0 1 2 3 0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Layer | EType | Error Code |HdrCt| | Terminate Control | Reserved |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 8 Terminate Control Field | DDP Segment Length (if any) | |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +
| |
// //
| Terminated DDP Header (if any) |
+ +
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| |
// //
| Terminated RDMA Header (if any) |
+ +
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 7 Terminate Header Format
* Figure 9 Terminate Control Field Values defines the valid Terminate Control: 19 bits.
values that MUST be used for this field.
* Layer: 4 bits. The Terminate Control field MUST have the format defined in
Figure 8 Terminate Control Field.
Identifies the layer that encountered the error. 0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Layer | EType | Error Code |HdrCt|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 8 Terminate Control Field
* EType (RDMA Error Type): 4 bits. * Figure 9 Terminate Control Field Values defines the valid
values that MUST be used for this field.
Identifies the type of error that caused the Terminate. * Layer: 4 bits.
When the error is detected at the RDMAP Layer, the
RDMAP Layer inserts the Error Type into this field.
When the error is detected at a LLP layer, a LLP layer
creates the Error Type and the DDP layer passes it up
to the RDMAP Layer, and the RDMAP Layer inserts it into
this field.
* Error Code: 8 bits. Identifies the layer that encountered the error.
This field identifies the specific error that caused * EType (RDMA Error Type): 4 bits.
the Terminate. When the error is detected at the RDMAP
Layer, the RDMAP Layer creates the Error Code. When the
error is detected at a LLP layer, a LLP layer creates
the Error Code and the DDP layer passes it up to the
RDMAP Layer, and the RDMAP Layer inserts it into this
field.
* HdrCt: 3 bits. Identifies the type of error that caused the
Terminate. When the error is detected at the RDMAP
Layer, the RDMAP Layer inserts the Error Type into
this field. When the error is detected at a LLP layer,
a LLP layer creates the Error Type and the DDP layer
passes it up to the RDMAP Layer, and the RDMAP Layer
inserts it into this field.
Header control bits: * Error Code: 8 bits.
* M: bit 16. DDP Segment Length valid. See Figure 10 This field identifies the specific error that caused
for when this bit SHOULD be set. the Terminate. When the error is detected at the RDMAP
Layer, the RDMAP Layer creates the Error Code. When
the error is detected at a LLP layer, a LLP layer
creates the Error Code and the DDP layer passes it up
to the RDMAP Layer, and the RDMAP Layer inserts it
into this field.
* D: bit 17. DDP Header Included. See Figure 10 for * HdrCt: 3 bits.
when this bit SHOULD be set.
* R: bit 18. RDMAP Header Included. See Figure 10 for Header control bits:
when this bit SHOULD be set.
-------+----------+-------+-------------+------+-------------------- * M: bit 16. DDP Segment Length valid. See Figure 10
Layer | Layer | Error | Error Type | Error| Error Code Name for when this bit SHOULD be set.
| Name | Type | Name | Code |
-------+----------+-------+-------------+------+--------------------
| | 0000b | Local | None | None
| | | Catastrophic| |
| | | Error | |
| +-------+-------------+------+--------------------
| | | | 00X | Invalid STag
| | | +------+--------------------
| | | | 01X | Base or bounds
| | | | | violation
| | | Remote +------+--------------------
| | 0001b | Protection | 02X | Access rights
| | | Error | | violation
| | | +------+--------------------
0000b | RDMA | | | 03X | STag not associated
| | | | | with RDMAP Stream
| | | +------+--------------------
| | | | 04X | TO wrap
| | | +------+--------------------
| | | | 09X | STag cannot be
| | | | | Invalidated
| | | +------+--------------------
| | | | FFX | Unspecified Error
| +-------+-------------+------+--------------------
| | | | 05X | Invalid RDMAP
| | | | | version
| | | +------+--------------------
| | | | 06X | Unexpected OpCode
| | | Remote +------+--------------------
| | 0010b | Operation | 07X | Catastrophic error,
| | | Error | | localized to RDMAP
| | | | | Stream
| | | +------+--------------------
| | | | 08X | Catastrophic error,
| | | | | global
| | | +------+--------------------
| | | | 09X | STag cannot be
| | | | | Invalidated
| | | +------+--------------------
| | | | FFX | Unspecified Error
-------+----------+-------+-------------+------+--------------------
0001b | DDP | See DDP Specification [DDP] for a description of
| | the values and names.
-------+----------+-------+-----------------------------------------
0010b | LLP | For MPA, see MPA Specification [MPA] for a
| (eg MPA) | description of the values and names.
-------+----------+-------+-----------------------------------------
Figure 9 Terminate Control Field Values
Reserved: 13 bits. This field MUST be set to zero on transmit, * D: bit 17. DDP Header Included. See Figure 10 for
ignored on receive. when this bit SHOULD be set.
DDP Segment Length: 16 bits * R: bit 18. RDMAP Header Included. See Figure 10
for when this bit SHOULD be set.
The length handed up by the DDP Layer when the error was -------+----------+-------+-------------+------+--------------------
detected. It MUST be valid if the M bit is set. It MUST be Layer | Layer | Error | Error Type | Error| Error Code Name
present when the D bit is set. | Name | Type | Name | Code |
-------+----------+-------+-------------+------+--------------------
| | 0000b | Local | None | None - This error
| | | Catastrophic| | type does not have
| | | Error | | an error code. Any
| | | | | value in this field
| | | | | is acceptable.
| +-------+-------------+------+--------------------
| | | | 00X | Invalid STag
| | | +------+--------------------
| | | | 01X | Base or bounds
| | | | | violation
| | | Remote +------+--------------------
| | 0001b | Protection | 02X | Access rights
| | | Error | | violation
| | | +------+--------------------
0000b | RDMA | | | 03X | STag not associated
| | | | | with RDMAP Stream
| | | +------+--------------------
| | | | 04X | TO wrap
| | | +------+--------------------
| | | | 09X | STag cannot be
| | | | | Invalidated
| | | +------+--------------------
| | | | FFX | Unspecified Error
| +-------+-------------+------+--------------------
| | | | 05X | Invalid RDMAP
| | | | | version
| | | +------+--------------------
| | | | 06X | Unexpected OpCode
| | | Remote +------+--------------------
| | 0010b | Operation | 07X | Catastrophic error,
| | | Error | | localized to RDMAP
| | | | | Stream
| | | +------+--------------------
| | | | 08X | Catastrophic error,
| | | | | global
| | | +------+--------------------
| | | | 09X | STag cannot be
| | | | | Invalidated
| | | +------+--------------------
| | | | FFX | Unspecified Error
-------+----------+-------+-------------+------+--------------------
0001b | DDP | See DDP Specification [DDP] for a description of
| | the values and names.
-------+----------+-------+-----------------------------------------
0010b | LLP | For MPA, see MPA Specification [MPA] for a
| (eg MPA) | description of the values and names.
-------+----------+-------+-----------------------------------------
Figure 9 Terminate Control Field Values
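The Terminate Control layout described above (Layer, EType, Error Code, then the M, D and R header control bits at bit positions 16, 17 and 18) can be sketched as a small decoder. This is a minimal illustration rather than part of either draft: the names `TerminateControl`, `parse_terminate_control` and `build_terminate_control` are hypothetical, and the code assumes the usual IETF figure convention that bit 0 is the most significant bit of a 32-bit word carried in network byte order.

```python
import struct
from typing import NamedTuple


class TerminateControl(NamedTuple):
    """Hypothetical container for the 19-bit Terminate Control field."""
    layer: int        # 4 bits: layer that encountered the error
    etype: int        # 4 bits: error type
    error_code: int   # 8 bits: specific error
    m: bool           # bit 16: DDP Segment Length valid
    d: bool           # bit 17: DDP Header Included
    r: bool           # bit 18: RDMAP Header Included


def parse_terminate_control(first_word: bytes) -> TerminateControl:
    """Decode the first 32-bit word of a Terminate Header.

    Assumes bit 0 in the figures is the most significant bit, so the
    13 Reserved bits occupy the low-order end and are ignored on receive.
    """
    if len(first_word) != 4:
        raise ValueError("expected exactly 4 octets")
    (word,) = struct.unpack("!I", first_word)
    return TerminateControl(
        layer=(word >> 28) & 0xF,
        etype=(word >> 24) & 0xF,
        error_code=(word >> 16) & 0xFF,
        m=bool((word >> 15) & 1),
        d=bool((word >> 14) & 1),
        r=bool((word >> 13) & 1),
    )


def build_terminate_control(tc: TerminateControl) -> bytes:
    """Encode the field; the Reserved bits are set to zero on transmit."""
    word = (
        (tc.layer & 0xF) << 28
        | (tc.etype & 0xF) << 24
        | (tc.error_code & 0xFF) << 16
        | int(tc.m) << 15
        | int(tc.d) << 14
        | int(tc.r) << 13
    )
    return struct.pack("!I", word)
```

For instance, an RDMA-layer (0000b) Remote Protection Error (0001b) with Error Code 01X (Base or bounds violation) and all three header control bits set would, under these assumptions, encode as the octets 01 01 E0 00.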
Terminated DDP Header: 112 bits for Tagged Messages and 144 bits Reserved: 13 bits. This field MUST be set to zero on transmit,
for Untagged Messages. ignored on receive.
The DDP Header of the incoming Message that is associated with DDP Segment Length: 16 bits
the Terminate. The DDP Header is not present if the Terminate
Error Type is a Local Catastrophic Error. It MUST be present
if the D bit is set.
Terminated RDMA Header: 224 bits. The length handed up by the DDP Layer when the error was
detected. It MUST be valid if the M bit is set. It MUST be
present when the D bit is set.
The Terminated RDMA Header is only sent back if the terminate Terminated DDP Header: 112 bits for Tagged Messages and 144 bits
is associated with an RDMA Read Request Message. It MUST be for Untagged Messages.
present if the R bit is set.
If the terminate occurs before the first RDMA Read Request The DDP Header of the incoming Message that is associated
byte is processed, the original RDMA Read Request Header is with the Terminate. The DDP Header is not present if the
sent back. Terminate Error Type is a Local Catastrophic Error. It MUST
be present if the D bit is set.
If the terminate occurs after the first RDMA Read Request byte Terminated RDMA Header: 224 bits.
is processed, the RDMA Read Request Header is updated to
reflect the current location of the RDMA Read operation that
is in process:
* Data Sink STag = Data Sink STag originally sent in the The Terminated RDMA Header is only sent back if the terminate
RDMA Read Request. is associated with an RDMA Read Request Message. It MUST be
present if the R bit is set.
* Data Sink Tagged Offset = Current offset into the Data If the terminate occurs before the first RDMA Read Request
Sink Tagged Buffer. For example if the RDMA Read byte is processed, the original RDMA Read Request Header is
Request was terminated after 2048 octets were sent, sent back.
then the Data Sink Tagged Offset = the original Data
Sink Tagged Offset + 2048.
* Data Message size = Number of bytes left to transfer. If the terminate occurs after the first RDMA Read Request
byte is processed, the RDMA Read Request Header is updated to
reflect the current location of the RDMA Read operation that
is in process:
* Data Source STag = Data Source STag in the RDMA Read * Data Sink STag = Data Sink STag originally sent in the
Request. RDMA Read Request.
* Data Source Tagged Offset = Current offset into the * Data Sink Tagged Offset = Current offset into the Data
Data Source Tagged Buffer. For example if the RDMA Read Sink Tagged Buffer. For example if the RDMA Read
Request was terminated after 2048 octets were sent, Request was terminated after 2048 octets were sent,
then the Data Source Tagged Offset = the original Data then the Data Sink Tagged Offset = the original Data
Source Tagged Offset + 2048. Sink Tagged Offset + 2048.
Note: if a given LLP does not define any termination codes for the * Data Message size = Number of bytes left to transfer.
RDMAP Termination message to use, then none would be used for that
LLP.
Figure 10 Error Type to RDMA Message Mapping maps layer name and * Data Source STag = Data Source STag in the RDMA Read
error types to each RDMA Message type: Request.
---------+-------------+------------+------------+----------------- * Data Source Tagged Offset = Current offset into the
Layer | Error Type | Terminate | Terminate | What type of Data Source Tagged Buffer. For example if the RDMA
Name | Name | Includes | Includes | RDMA Message can Read Request was terminated after 2048 octets were
| | DDP Header | RDMA Header| cause the error sent, then the Data Source Tagged Offset = the
| | and DDP | | original Data Source Tagged Offset + 2048.
| | Segment | |
| | Length | |
---------+-------------+------------+------------+-----------------
| Local | No | No | Any
| Catastrophic| | |
| Error | | |
+-------------+------------+------------+-----------------
| Remote | Yes, if | Yes | Only RDMA Read
RDMA | Protection | possible | | Request, Send
| Error | | | with Invalidate,
| | | | and Send with SE
| | | | and Invalidate
+-------------+------------+------------+-----------------
| Remote | Yes, if | No | Any
| Operation | possible | |
| Error | | |
---------+-------------+------------+------------+-----------------
DDP | See DDP Spec| Yes | No | Any
| [DDP] | | |
---------+-------------+------------+------------+-----------------
LLP | See LLP Spec| No | No | Any
| [e.g. MPA] | | |
Figure 10 Error Type to RDMA Message Mapping
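The Terminated RDMA Header update rules for a partially processed RDMA Read Request can be illustrated with a short sketch. It is not taken from either draft: `ReadRequestHeader`, `terminated_rdma_header` and `pack_header` are hypothetical names, and the on-the-wire field order (Data Sink STag, Data Sink Tagged Offset, RDMA Read Message Size, Data Source STag, Data Source Tagged Offset) is assumed from the order of the list above. The field widths of 32, 64, 32, 32 and 64 bits sum to the stated 224 bits.

```python
import struct
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ReadRequestHeader:
    """Hypothetical model of the five RDMA Read Request Header fields."""
    sink_stag: int      # Data Sink STag, 32 bits
    sink_to: int        # Data Sink Tagged Offset, 64 bits
    message_size: int   # RDMA Read Message Size, 32 bits
    source_stag: int    # Data Source STag, 32 bits
    source_to: int      # Data Source Tagged Offset, 64 bits


def terminated_rdma_header(orig: ReadRequestHeader,
                           octets_done: int) -> ReadRequestHeader:
    """Return the header to echo back in a Terminate Message.

    With octets_done == 0 the original header is returned unchanged;
    otherwise both Tagged Offsets advance and the size shrinks, while
    both STags stay as they were in the original request.
    """
    if not 0 <= octets_done <= orig.message_size:
        raise ValueError("octets_done must lie within the requested size")
    return replace(
        orig,
        sink_to=orig.sink_to + octets_done,
        message_size=orig.message_size - octets_done,
        source_to=orig.source_to + octets_done,
    )


def pack_header(h: ReadRequestHeader) -> bytes:
    """Serialize in network byte order; the field order is an assumption."""
    return struct.pack("!IQIIQ", h.sink_stag, h.sink_to,
                       h.message_size, h.source_stag, h.source_to)
```

Using the 2048-octet example from the text: a request for 8192 octets with a Data Sink Tagged Offset of 4096 and a Data Source Tagged Offset of 0 would be echoed back with offsets 6144 and 2048 and a remaining size of 6144.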
5 Data Transfer Note: if a given LLP does not define any termination codes for the
RDMAP Termination message to use, then none would be used for that
LLP.
5.1 RDMA Write Message Figure 10 Error Type to RDMA Message Mapping maps layer name and
error types to each RDMA Message type:
An RDMA Write is used by the Data Source to transfer data to a ---------+-------------+------------+------------+-----------------
previously Advertised Tagged Buffer at the Data Sink. The RDMA Layer | Error Type | Terminate | Terminate | What type of
Write Message has the following semantics: Name | Name | Includes | Includes | RDMA Message can
| | DDP Header | RDMA Header| cause the error
| | and DDP | |
| | Segment | |
| | Length | |
---------+-------------+------------+------------+-----------------
| Local | No | No | Any
| Catastrophic| | |
| Error | | |
+-------------+------------+------------+-----------------
| Remote | Yes, if | Yes | Only RDMA Read
RDMA | Protection | possible | | Request, Send
| Error | | | with Invalidate,
| | | | and Send with SE
| | | | and Invalidate
+-------------+------------+------------+-----------------
| Remote | Yes, if | No | Any
| Operation | possible | |
| Error | | |
---------+-------------+------------+------------+-----------------
DDP | See DDP Spec| Yes | No | Any
| [DDP] | | |
---------+-------------+------------+------------+-----------------
LLP | See LLP Spec| No | No | Any
| [e.g., MPA] | | |
Figure 10 Error Type to RDMA Message Mapping
* An RDMA Write Message MUST reference a Tagged Buffer. That is, 5 Data Transfer
the Data Source RDMAP Layer MUST request that the DDP layer mark
the Message as Tagged.
* A valid RDMA Write Message MUST NOT be delivered to the Data 5.1 RDMA Write Message
Sink's ULP (i.e. it is placed by the DDP layer).
* At the Remote Peer, when an invalid RDMA Write Message is An RDMA Write is used by the Data Source to transfer data to a
delivered to the Remote Peer's RDMAP Layer, an error is surfaced previously Advertised Tagged Buffer at the Data Sink. The RDMA
(see section 7.1 RDMAP Error Surfacing). Write Message has the following semantics:
* The Tagged Offset of a Tagged Buffer MAY start at a non-zero * An RDMA Write Message MUST reference a Tagged Buffer. That is,
value. the Data Source RDMAP Layer MUST request that the DDP layer
mark the Message as Tagged.
* An RDMA Write Message MAY target all or part of a previously * A valid RDMA Write Message MUST NOT be delivered to the Data
Advertised buffer. Sink's ULP (i.e., it is placed by the DDP layer).
* The RDMAP does not define how the buffer(s) used by an outbound * At the Remote Peer, when an invalid RDMA Write Message is
RDMA Write is defined and how it is addressed. For example, an delivered to the Remote Peer's RDMAP Layer, an error is
implementation of RDMA may choose to allow a gather-list of non- surfaced (see section 7.1 RDMAP Error Surfacing).
contiguous data blocks to be the source of an RDMA Write. In
this case, the data blocks would be combined by the Data Source
and sent as a single RDMA Write Message to the Data Sink.
* The Data Source RDMAP Layer MUST issue RDMA Write Messages to * The Tagged Offset of a Tagged Buffer MAY start at a non-zero
the DDP layer in the order they were submitted by the ULP. value.
* At the Data Source, a subsequent Send (Send with Invalidate, * An RDMA Write Message MAY target all or part of a previously
Send with Solicited Event, or Send with Solicited Event and Advertised buffer.
Invalidate) Message MAY be used to signal Delivery of previous
RDMA Write Messages to the Data Sink, if desired by the ULP.
* If the Local Peer wishes to write to multiple Tagged Buffers on * The RDMAP does not define how the buffer(s) used by an outbound
the Remote Peer, the Local Peer MUST use multiple RDMA Write RDMA Write is defined and how it is addressed. For example, an
Messages. That is, a single RDMA Write Message can only write to implementation of RDMA may choose to allow a gather-list of
one remote Tagged Buffer. non-contiguous data blocks to be the source of an RDMA Write.
In this case, the data blocks would be combined by the Data
Source and sent as a single RDMA Write Message to the Data
Sink.
* The Data Source MAY issue a zero length RDMA Write Message. * The Data Source RDMAP Layer MUST issue RDMA Write Messages to
the DDP layer in the order they were submitted by the ULP.
5.2 RDMA Read Operation * At the Data Source, a subsequent Send (Send with Invalidate,
Send with Solicited Event, or Send with Solicited Event and
Invalidate) Message MAY be used to signal Delivery of previous
RDMA Write Messages to the Data Sink, if desired by the ULP.
The RDMA Read operation MUST consist of a single RDMA Read Request * If the Local Peer wishes to write to multiple Tagged Buffers on
Message and a single RDMA Read Response Message. the Remote Peer, the Local Peer MUST use multiple RDMA Write
Messages. That is, a single RDMA Write Message can only write
to one remote Tagged Buffer.
5.2.1 RDMA Read Request Message * The Data Source MAY issue a zero length RDMA Write Message.
An RDMA Read Request is used by the Data Sink to transfer data from 5.2 RDMA Read Operation
a previously Advertised Tagged Buffer at the Data Source to a
Tagged Buffer at the Data Sink. The RDMA Read Request Message has
the following semantics:
* An RDMA Read Request Message MUST reference an Untagged Buffer. The RDMA Read operation MUST consist of a single RDMA Read Request
That is, the Local Peer's RDMAP Layer MUST request that the DDP Message and a single RDMA Read Response Message.
mark the Message as Untagged.
* One RDMA Read Request Message MUST consume one Untagged Buffer. 5.2.1 RDMA Read Request Message
* The Remote Peer's RDMAP Layer MUST process an RDMA Read Request An RDMA Read Request is used by the Data Sink to transfer data
Message. A valid RDMA Read Request Message MUST NOT be delivered from a previously Advertised Tagged Buffer at the Data Source to a
to the Data Sink's ULP (i.e. it is processed by the RDMAP Tagged Buffer at the Data Sink. The RDMA Read Request Message has
layer). the following semantics:
* At the Remote Peer, when an invalid RDMA Read Request Message is * An RDMA Read Request Message MUST reference an Untagged Buffer.
delivered to the Remote Peer's RDMAP Layer, an error is surfaced That is, the Local Peer's RDMAP Layer MUST request that the DDP
(see section 7.1 RDMAP Error Surfacing). mark the Message as Untagged.
* An RDMA Read Request Message MUST reference the RDMA Read * One RDMA Read Request Message MUST consume one Untagged Buffer.
Request Queue. That is, the Local Peer's RDMAP Layer MUST
request that the DDP layer set the Queue Number field to one.
* The Local Peer MUST pass to the DDP Layer RDMA Read Request * The Remote Peer's RDMAP Layer MUST process an RDMA Read Request
Messages in the order they were submitted by the ULP. Message. A valid RDMA Read Request Message MUST NOT be
delivered to the Data Sink's ULP (i.e., it is processed by the
RDMAP layer).
* The Remote Peer MUST process the RDMA Read Request Messages in * At the Remote Peer, when an invalid RDMA Read Request Message
the order they were sent. is delivered to the Remote Peer's RDMAP Layer, an error is
surfaced (see section 7.1 RDMAP Error Surfacing).
* If the Local Peer wishes to read from multiple Tagged Buffers on * An RDMA Read Request Message MUST reference the RDMA Read
the Remote Peer, the Local Peer MUST use multiple RDMA Read Request Queue. That is, the Local Peer's RDMAP Layer MUST
Request Messages. That is, a single RDMA Read Request Message request that the DDP layer set the Queue Number field to one.
MUST only read from one remote Tagged Buffer.
* An RDMA Read Request Message MAY target all or part of a * The Local Peer MUST pass to the DDP Layer RDMA Read Request
previously Advertised buffer. Messages in the order they were submitted by the ULP.
* If the Data Source receives a valid RDMA Read Request Message it * The Remote Peer MUST process the RDMA Read Request Messages in
MUST respond with a valid RDMA Read Response Message. the order they were sent.
* The Data Sink MAY issue a zero length RDMA Read Request Message, * If the Local Peer wishes to read from multiple Tagged Buffers
by setting the RDMA Read Message Size field to zero in the RDMA on the Remote Peer, the Local Peer MUST use multiple RDMA Read
Read Request Header. Request Messages. That is, a single RDMA Read Request Message
MUST only read from one remote Tagged Buffer.
* If the Data Source receives a non-zero length RDMA Read Message * An RDMA Read Request Message MAY target all or part of a
Size, the Data Source RDMAP MUST validate the Data Source STag previously Advertised buffer.
and Data Source Tagged Offset contained in the RDMA Read Request
Header.
* If the Data Source receives an RDMA Read Request Header with the * If the Data Source receives a valid RDMA Read Request Message
RDMA Read Message Size set to zero, the Data Source RDMAP: it MUST respond with a valid RDMA Read Response Message.
* MUST NOT validate the Data Source STag and Data Source * The Data Sink MAY issue a zero length RDMA Read Request
Tagged Offset contained in the RDMA Read Request Header, Message, by setting the RDMA Read Message Size field to zero in
and the RDMA Read Request Header.
* MUST respond with a zero length RDMA Read Response Message. * If the Data Source receives a non-zero length RDMA Read Message
Size, the Data Source RDMAP MUST validate the Data Source STag
and Data Source Tagged Offset contained in the RDMA Read
Request Header.
5.2.2 RDMA Read Response Message * If the Data Source receives an RDMA Read Request Header with
the RDMA Read Message Size set to zero, the Data Source RDMAP:
The RDMA Read Response Message uses the DDP Tagged Buffer Model to * MUST NOT validate the Data Source STag and Data Source
Deliver the contents of a previously requested Data Source Tagged Tagged Offset contained in the RDMA Read Request Header,
Buffer to the Data Sink, without any involvement from the ULP at and
the Remote Peer. The RDMA Read Response Message has the following
semantics:
* The RDMA Read Response Message for the associated RDMA Read * MUST respond with a zero length RDMA Read Response
Request Message travels in the opposite direction. Message.
* An RDMA Read Response Message MUST reference a Tagged Buffer. 5.2.2 RDMA Read Response Message
That is, the Data Source RDMAP Layer MUST request that the DDP
mark the Message as Tagged.
* The Data Source MUST ensure that a sufficient number of Untagged The RDMA Read Response Message uses the DDP Tagged Buffer Model to
Buffers are available on the RDMA Read Request Queue (Queue with Deliver the contents of a previously requested Data Source Tagged
DDP Queue Number 1) to support the maximum number of RDMA Read Buffer to the Data Sink, without any involvement from the ULP at
Requests negotiated by the ULP. the Remote Peer. The RDMA Read Response Message has the following
semantics:
* The RDMAP Layer MUST Deliver the RDMA Read Response Message to * The RDMA Read Response Message for the associated RDMA Read
the ULP. Request Message travels in the opposite direction.
* At the Remote Peer, when an invalid RDMA Read Response Message * An RDMA Read Response Message MUST reference a Tagged Buffer.
is delivered to the Remote Peer's RDMAP Layer, an error is That is, the Data Source RDMAP Layer MUST request that the DDP
surfaced (see section 7.1 RDMAP Error Surfacing). mark the Message as Tagged.
* The Tagged Offset of a Tagged Buffer MAY start at a non-zero * The Data Source MUST ensure that a sufficient number of
value. Untagged Buffers are available on the RDMA Read Request Queue
(Queue with DDP Queue Number 1) to support the maximum number
of RDMA Read Requests negotiated by the ULP.
* The Data Source RDMAP Layer MUST pass RDMA Read Response * The RDMAP Layer MUST Deliver the RDMA Read Response Message to
Messages to the DDP layer in the order that the RDMA Read the ULP.
Request Messages were received by the RDMAP Layer at the Data
Source.
* The Data Sink MAY validate that the STag, Tagged Offset, and * At the Remote Peer, when an invalid RDMA Read Response Message
length of the RDMA Read Response Message are the same as the is delivered to the Remote Peer's RDMAP Layer, an error is
STag, Tagged Offset, and length included in the corresponding surfaced (see section 7.1 RDMAP Error Surfacing).
RDMA Read Request Message.
* A single RDMA Read Response Message MUST write to one remote * The Tagged Offset of a Tagged Buffer MAY start at a non-zero
Tagged Buffer. If the Data Sink wishes to Read multiple Tagged value.
Buffers, the Data Sink can use multiple RDMA Read Request
Messages.
5.3 Send Message Type * The Data Source RDMAP Layer MUST pass RDMA Read Response
Messages to the DDP layer in the order that the RDMA Read
Request Messages were received by the RDMAP Layer at the Data
Source.
The Send Message Type uses the DDP Untagged Buffer Model to * The Data Sink MAY validate that the STag, Tagged Offset, and
transfer data from the Data Source into an Untagged Buffer at the length of the RDMA Read Response Message are the same as the
Data Sink. STag, Tagged Offset, and length included in the corresponding
RDMA Read Request Message.
* A Send Message Type MUST reference an Untagged Buffer. That is, * A single RDMA Read Response Message MUST write to one remote
the Local Peer's RDMAP Layer MUST request that the DDP layer Tagged Buffer. If the Data Sink wishes to Read multiple Tagged
mark the Message as Untagged. Buffers, the Data Sink can use multiple RDMA Read Request
Messages.
* One Send Message Type MUST consume one Untagged Buffer. 5.3 Send Message Type
* The ULP Message sent using a Send Message Type MAY be less The Send Message Type uses the DDP Untagged Buffer Model to
than or equal to the size of the consumed Untagged Buffer. transfer data from the Data Source into an Untagged Buffer at the
The RDMAP Layer communicates to the ULP the size of the Data Sink.
data written into the Untagged Buffer.
* If the ULP Message sent via Send Message Type is larger * A Send Message Type MUST reference an Untagged Buffer. That is,
than the Data Sink's Untagged Buffer, it is an error (see the Local Peer's RDMAP Layer MUST request that the DDP layer
section 7.1 RDMAP Error Surfacing). mark the Message as Untagged.
* At the Remote Peer, the Send Message Type MUST be Delivered to * One Send Message Type MUST consume one Untagged Buffer.
the Remote Peer's ULP in the order they were sent.
* After the Send with Solicited Event or Send with Solicited Event * The ULP Message sent using a Send Message Type MAY be less
and Invalidate Message is Delivered to the ULP, the RDMAP MAY than or equal to the size of the consumed Untagged Buffer.
generate an Event, if the Data Sink is configured to generate The RDMAP Layer communicates to the ULP the size of the
such an Event. data written into the Untagged Buffer.
* At the Remote Peer, when an invalid Send Message Type is * If the ULP Message sent via Send Message Type is larger
Delivered to the Remote Peer's RDMAP Layer, an error is surfaced than the Data Sink's Untagged Buffer, it is an error (see
(see section 7.1 RDMAP Error Surfacing). section 7.1 RDMAP Error Surfacing).
* The RDMAP does not define how the buffer(s) used by an outbound * At the Remote Peer, the Send Message Type MUST be Delivered to
Send Message Type is defined and how it is addressed. For the Remote Peer's ULP in the order they were sent.
example, an implementation of RDMA may choose to allow a gather-
list of non-contiguous data blocks to be the source of a Send
Message Type. In this case, the data blocks would be combined by
the Data Source and sent as a single Send Message Type to the
Data Sink.
* For a Send Message Type, the Local Peer's RDMAP Layer MUST * After the Send with Solicited Event or Send with Solicited
request that the DDP layer set the Queue Number field to zero. Event and Invalidate Message is Delivered to the ULP, the RDMAP
MAY generate an Event, if the Data Sink is configured to
generate such an Event.
* The Local Peer MUST issue Send Message Type Messages in the * At the Remote Peer, when an invalid Send Message Type is
order they were submitted by the ULP. Delivered to the Remote Peer's RDMAP Layer, an error is
surfaced (see section 7.1 RDMAP Error Surfacing).
* The Data Source MAY pass a zero length Send Message Type. A zero * The RDMAP does not define how the buffer(s) used by an outbound
length Send Message Type MUST consume an Untagged Buffer at the Send Message Type is defined and how it is addressed. For
Data Sink. A Send with Invalidate or Send with Solicited Event example, an implementation of RDMA may choose to allow a
and Invalidate Message MUST reference an STag. That is, the gather-list of non-contiguous data blocks to be the source of a
Local Peer's RDMAP Layer MUST pass the RDMA control field and Send Message Type. In this case, the data blocks would be
the STag that will be Invalidated to the DDP layer. combined by the Data Source and sent as a single Send Message
Type to the Data Sink.
* When the Send with Invalidate and Send with Solicited Event and * For a Send Message Type, the Local Peer's RDMAP Layer MUST
Invalidate Message are Delivered to the Remote Peer's RDMAP request that the DDP layer set the Queue Number field to zero.
Layer, the RDMAP Layer MUST:
* Verify the STag that is associated with the RDMAP Stream; * The Local Peer MUST issue Send Message Type Messages in the
and order they were submitted by the ULP.
* Invalidate the STag if it is associated with the RDMAP * The Data Source MAY pass a zero length Send Message Type. A
Stream; or Issue a Terminate Message with the STag Cannot zero length Send Message Type MUST consume an Untagged Buffer
be Invalidated Terminate Error Code, if the STag is not at the Data Sink. A Send with Invalidate or Send with Solicited
associated with the RDMAP Stream. Event and Invalidate Message MUST reference an STag. That is,
the Local Peer's RDMAP Layer MUST pass the RDMA control field
and the STag that will be Invalidated to the DDP layer.
5.4 Terminate Message * When the Send with Invalidate and Send with Solicited Event and
Invalidate Message are Delivered to the Remote Peer's RDMAP
Layer, the RDMAP Layer MUST:
The Terminate Message uses the DDP Untagged Buffer Model to * Verify the STag that is associated with the RDMAP Stream;
transfer error related information from the Data Source into an and
Untagged Buffer at the Data Sink and then ceases all further
communications on the underlying DDP Stream. The Terminate Message
has the following semantics:
* A Terminate Message MUST reference an Untagged Buffer. That is, * Invalidate the STag if it is associated with the RDMAP
the Local Peer's RDMAP Layer MUST request that the DDP layer Stream; or Issue a Terminate Message with the STag Cannot
mark the Message as Untagged. be Invalidated Terminate Error Code, if the STag is not
associated with the RDMAP Stream.
* A Terminate Message references the Terminate Queue. That is, the 5.4 Terminate Message
Local Peer's RDMAP Layer MUST request that the DDP layer set the
Queue Number field to two.
* One Terminate Message MUST consume one Untagged Buffer. The Terminate Message uses the DDP Untagged Buffer Model to
transfer error related information from the Data Source into an
Untagged Buffer at the Data Sink and then ceases all further
communications on the underlying DDP Stream. The Terminate Message
has the following semantics:
* On a single RDMAP Stream, the RDMAP layer MUST guarantee * A Terminate Message MUST reference an Untagged Buffer. That is,
placement of a single Terminate Message. the Local Peer's RDMAP Layer MUST request that the DDP layer
mark the Message as Untagged.
* A Terminate Message MUST be Delivered to the Remote Peer's RDMAP * A Terminate Message references the Terminate Queue. That is,
Layer. The RDMAP Layer MUST Deliver the Terminate Message to the the Local Peer's RDMAP Layer MUST request that the DDP layer
ULP. set the Queue Number field to two.
* At the Remote Peer, when an invalid Terminate Message is * One Terminate Message MUST consume one Untagged Buffer.
delivered to the Remote Peer's RDMAP Layer, an error is surfaced
(see section 7.1 RDMAP Error Surfacing).
* The RDMAP Layer Completes in error all ULP Operations that have * On a single RDMAP Stream, the RDMAP layer MUST guarantee
not been provided to the DDP layer. placement of a single Terminate Message.
* After sending a Terminate Message on an RDMAP Stream, the Local * A Terminate Message MUST be Delivered to the Remote Peer's
Peer MUST NOT send any more Messages on that specific RDMAP RDMAP Layer. The RDMAP Layer MUST Deliver the Terminate Message
Stream. to the ULP.
* After receiving a Terminate Message on an RDMAP Stream, the * At the Remote Peer, when an invalid Terminate Message is
Remote Peer MAY stop sending Messages on that specific RDMAP delivered to the Remote Peer's RDMAP Layer, an error is
Stream. surfaced (see section 7.1 RDMAP Error Surfacing).
5.5 Ordering and Completions * The RDMAP Layer Completes in error all ULP Operations that have
not been provided to the DDP layer.
It is important to understand the difference between Placement and * After sending a Terminate Message on an RDMAP Stream, the Local
Delivery ordering since RDMAP provides quite different semantics Peer MUST NOT send any more Messages on that specific RDMAP
for the two. Stream.
Note that many current protocols, both as used in the Internet and * After receiving a Terminate Message on an RDMAP Stream, the
elsewhere, assume that data is both Placed and Delivered in order. Remote Peer MAY stop sending Messages on that specific RDMAP
This allowed applications to take a variety of shortcuts by taking Stream.
advantage of this fact. For RDMAP, many of these shortcuts are no
longer safe to use, and could cause application failure.
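As an informal illustration of the Terminate Message semantics in section 5.4, the following is a minimal sketch, not a normative implementation. The class and field names (RdmapStream, DdpRequest, pending, failed) are hypothetical and are not defined by this specification. Only the Queue Number values (zero for Send Message Types, two for the Terminate Message), the Untagged marking, and the rule that nothing is sent after a Terminate Message come from the text.

```python
from dataclasses import dataclass, field

SEND_QUEUE = 0       # Queue Number requested for Send Message Types
TERMINATE_QUEUE = 2  # Queue Number requested for the Terminate Message


@dataclass
class DdpRequest:
    """What this toy RDMAP layer hands to a hypothetical DDP layer."""
    tagged: bool
    queue_number: int
    payload: bytes


@dataclass
class RdmapStream:
    submitted: list = field(default_factory=list)  # requests passed to DDP
    pending: list = field(default_factory=list)    # ULP operations not yet passed to DDP
    failed: list = field(default_factory=list)     # operations Completed in error
    terminated: bool = False

    def queue_send(self, payload: bytes) -> None:
        # After a Terminate Message, no more Messages may be sent.
        if self.terminated:
            raise RuntimeError("RDMAP Stream already terminated")
        self.pending.append(payload)

    def flush(self) -> None:
        # Hand pending Sends to DDP in ULP submission order, Untagged, queue 0.
        while self.pending:
            self.submitted.append(
                DdpRequest(False, SEND_QUEUE, self.pending.pop(0)))

    def terminate(self, error_info: bytes) -> None:
        # At most one Terminate Message per RDMAP Stream.
        if self.terminated:
            raise RuntimeError("Terminate Message already sent")
        # Operations not yet provided to DDP are Completed in error.
        self.failed.extend(self.pending)
        self.pending.clear()
        self.submitted.append(DdpRequest(False, TERMINATE_QUEUE, error_info))
        self.terminated = True
```

In this sketch, a Send queued but not yet flushed when terminate() is called ends up in failed rather than on the wire, which mirrors the rule that such operations are Completed in error.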
The following rules apply to implementations of the RDMAP protocol. 5.5 Ordering and Completions
Note, in these rules Send includes Send, Send with Invalidate, Send
with Solicited Event, and Send with Solicited Event and Invalidate:
1. RDMAP does not provide ordering among Messages on different It is important to understand the difference between Placement and
RDMAP Streams. Delivery ordering since RDMAP provides quite different semantics
for the two.
2. RDMAP does not provide ordering between operations that are Note that many current protocols, both as used in the Internet and
generated from the two ends of an RDMAP Stream. elsewhere, assume that data is both Placed and Delivered in order.
This allowed applications to take a variety of shortcuts by taking
advantage of this fact. For RDMAP, many of these shortcuts are no
longer safe to use, and could cause application failure.
3. RDMA Messages that use Tagged and Untagged Buffers MAY be The following rules apply to implementations of the RDMAP
Placed in any order. If an application uses overlapping protocol. Note, in these rules Send includes Send, Send with
buffers (points different Messages or portions of a single Invalidate, Send with Solicited Event, and Send with Solicited
Message at the same buffer), then it is possible that the last Event and Invalidate:
incoming write to the Data Sink buffer will not be the last
outgoing data sent from the Data Source.
4. For a Send operation, the contents of an Untagged Buffer at the 1. RDMAP does not provide ordering among Messages on different
Data Sink MAY be indeterminate until the Send is Delivered to RDMAP Streams.
the ULP at the Data Sink.
5. For an RDMA Write operation, the contents of the Tagged Buffer 2. RDMAP does not provide ordering between operations that are
at the Data Sink MAY be indeterminate until a subsequent Send generated from the two ends of an RDMAP Stream.
is Delivered to the ULP at the Data Sink.
6. For an RDMA Read operation, the contents of the Tagged Buffer 3. RDMA Messages that use Tagged and Untagged Buffers MAY be
at the Data Sink MAY be indeterminate until the RDMA Read Placed in any order. If an application uses overlapping
Response Message has been Delivered at the Local Peer. buffers (points different Messages or portions of a single
Message at the same buffer), then it is possible that the last
incoming write to the Data Sink buffer will not be the last
outgoing data sent from the Data Source.
Statements 4, 5, and 6 imply "no peeking" at the data to see 4. For a Send operation, the contents of an Untagged Buffer at
if it is done. It is possible for some data to arrive before the Data Sink MAY be indeterminate until the Send is Delivered
logically earlier data does, and peeking may cause to the ULP at the Data Sink.
unpredictable application failure.
7. If the ULP or Application modifies the contents of Tagged or 5. For an RDMA Write operation, the contents of the Tagged Buffer
Untagged Buffers being modified by an RDMA Operation while the at the Data Sink MAY be indeterminate until a subsequent Send
RDMAP is processing the RDMA Operation, the state of the is Delivered to the ULP at the Data Sink.
Buffers is indeterminate.
8. If the ULP or Application modifies the contents of Tagged or 6. For an RDMA Read operation, the contents of the Tagged Buffer
Untagged Buffers read by an RDMA Operation while the RDMAP is at the Data Sink MAY be indeterminate until the RDMA Read
processing the RDMA Operation, the results of the read are Response Message has been Delivered at the Local Peer.
indeterminate.
9. The Completion of an RDMA Write or Send Operation at the Local Statements 4, 5, and 6 imply "no peeking" at the data to see
Peer does not guarantee that the ULP Message has yet reached if it is done. It is possible for some data to arrive before
the Remote Peer ULP Buffer or been examined by the Remote ULP. logically earlier data does, and peeking may cause
unpredictable application failure.
10. Send Messages MUST be Delivered to the ULP at the Remote Peer 7. If the ULP or Application modifies the contents of Tagged or
after they are Delivered to RDMAP by DDP and in the order that Untagged Buffers being modified by an RDMA Operation while the
they were Delivered to RDMAP. RDMAP is processing the RDMA Operation, the state of the
Buffers is indeterminate.
Note that DDP ordering rules ensure that this will be the same 8. If the ULP or Application modifies the contents of Tagged or
order that they were submitted at the Local Peer and that any Untagged Buffers read by an RDMA Operation while the RDMAP is
prior RDMA Writes have been submitted for ordered Placement at processing the RDMA Operation, the results of the read are
the Remote Peer. This means that when the ULP sees the Delivery indeterminate.
of the Send, the memory buffers targeted by any preceding RDMA
Writes and Sends are available to be accessed locally or
remotely as authorized. If the ULP overlaps its buffers for
different operations, the data from the RDMA Write or Send may
be overwritten by subsequent RDMA Operations before the ULP
receives and processes the Delivery.
11. RDMA Read Response Messages MUST be Delivered to the ULP at the 9. The Completion of an RDMA Write or Send Operation at the Local
Remote Peer after they are Delivered to RDMAP by DDP and in the Peer does not guarantee that the ULP Message has yet reached
order that they were Delivered to RDMAP. the Remote Peer ULP Buffer or been examined by the Remote ULP.
DDP ordering rules ensure that this will be the same order that 10. Send Messages MUST be Delivered to the ULP at the Remote Peer
they were submitted at the Local Peer. This means that when the after they are Delivered to RDMAP by DDP and in the order that
ULP sees the Delivery of the RDMA Read Response, the memory they were Delivered to RDMAP.
buffers targeted by the RDMA Read Response are available to be
accessed locally or remotely as authorized. If the ULP overlaps
its buffers for different operations, the data from the RDMA
Read Response may be overwritten by subsequent RDMA Operations
before the ULP receives and processes the Delivery.
12. RDMA Read Request Messages, including zero-length RDMA Read Note that DDP ordering rules ensure that this will be the same
Requests, MUST NOT start processing at the Remote Peer until order that they were submitted at the Local Peer and that any
they have been Delivered to RDMAP by DDP. prior RDMA Writes have been submitted for ordered Placement at
the Remote Peer. This means that when the ULP sees the
Delivery of the Send, the memory buffers targeted by any
preceding RDMA Writes and Sends are available to be accessed
locally or remotely as authorized. If the ULP overlaps its
buffers for different operations, the data from the RDMA Write
or Send may be overwritten by subsequent RDMA Operations
before the ULP receives and processes the Delivery.
Note: the ULP is assured that data written can be read back. 11. RDMA Read Response Messages MUST be Delivered to the ULP at
For example, if an RDMA Read Request is issued by the local the Remote Peer after they are Delivered to RDMAP by DDP and
peer, targeting the same ULP Buffer as a preceding Send or RDMA in the order that they were Delivered to RDMAP.
Write (in the same direction as the RDMA Read Request), and
there are no other sources of update for the ULP Buffer, then
the remote peer will send back the data written by the Send or
RDMA Write. That is, for this example the ULP Buffer: is
Advertised for use on a series of RDMA Messages, is only valid
on the RDMAP Stream for which it is advertised, and is not
locally updated while the series of RDMAP Messages are
performed. For this example, order rule (12) assures that
subsequent local or remote accesses to the ULP Buffer contain
the data written by the Send or RDMA Write.
RDMA Read Response Messages MAY be generated at the Remote Peer DDP ordering rules ensure that this will be the same order
after subsequent RDMA Write Messages or Send Messages have been that they were submitted at the Local Peer. This means that
Placed or Delivered. Therefore, when an application does an when the ULP sees the Delivery of the RDMA Read Response, the
RDMA Read Request followed by an RDMA Write (or Send) to the memory buffers targeted by the RDMA Read Response are
same buffer, it may get the data from the later RDMA Write (or available to be accessed locally or remotely as authorized. If
Send) in the RDMA Read Response Message, even though the the ULP overlaps its buffers for different operations, the
operations completed in order at the Local Peer. If this data from the RDMA Read Response may be overwritten by
behavior is not desired, the Local Peer ULP must Fence the subsequent RDMA Operations before the ULP receives and
later RDMA write (or Send) by withholding the RDMA Write processes the Delivery.
Message until all outstanding RDMA Read Responses have been
Delivered.
13. The RDMAP Layer MUST submit RDMA Messages to the DDP layer in 12. RDMA Read Request Messages, including zero-length RDMA Read
the order the RDMA Operations are submitted to the RDMAP Layer Requests, MUST NOT start processing at the Remote Peer until
by the ULP. they have been Delivered to RDMAP by DDP.
14. A Send or RDMA Write Message MUST NOT be considered Complete at Note: the ULP is assured that data written can be read back.
the Local Peer (Data Source) until it has been successfully For example, if an RDMA Read Request is issued by the local
completed at the DDP layer. peer, targeting the same ULP Buffer as a preceding Send or
RDMA Write (in the same direction as the RDMA Read Request),
and there are no other sources of update for the ULP Buffer,
then the remote peer will send back the data written by the
Send or RDMA Write. That is, for this example the ULP Buffer:
is Advertised for use on a series of RDMA Messages, is only
valid on the RDMAP Stream for which it is advertised, and is
not locally updated while the series of RDMAP Messages are
performed. For this example, order rule (12) assures that
subsequent local or remote accesses to the ULP Buffer contain
the data written by the Send or RDMA Write.
15. RDMA Operations MUST be Completed at the Local Peer in the RDMA Read Response Messages MAY be generated at the Remote
order that they were submitted by the ULP. Peer after subsequent RDMA Write Messages or Send Messages
have been Placed or Delivered. Therefore, when an application
does an RDMA Read Request followed by an RDMA Write (or Send)
to the same buffer, it may get the data from the later RDMA
Write (or Send) in the RDMA Read Response Message, even though
the operations completed in order at the Local Peer. If this
behavior is not desired, the Local Peer ULP must Fence the
later RDMA write (or Send) by withholding the RDMA Write
Message until all outstanding RDMA Read Responses have been
Delivered.
16. At the Data Sink, an incoming Send Message MUST be Delivered to 13. The RDMAP Layer MUST submit RDMA Messages to the DDP layer in
the ULP only after the DDP Message has been Delivered to the the order the RDMA Operations are submitted to the RDMAP Layer
RDMAP Layer by the DDP layer. by the ULP.
17. RDMA Read Response Message processing at the Remote Peer 14. A Send or RDMA Write Message MUST NOT be considered Complete
(reading the specified Tagged Buffer) MUST be started only at the Local Peer (Data Source) until it has been successfully
after the RDMA Read Request Message has been Delivered by the completed at the DDP layer.
DDP layer (thus all previous RDMA Messages have been properly
submitted for ordered Placement).
18. Send Messages MAY be Completed at the Remote Peer (Data Sink) 15. RDMA Operations MUST be Completed at the Local Peer in the
before prior incoming RDMA Read Request Messages have completed order that they were submitted by the ULP.
their response processing.
19. An RDMA Read operation MUST NOT be Completed at the Local Peer 16. At the Data Sink, an incoming Send Message MUST be Delivered
until the DDP layer Delivers the associated incoming RDMA Read to the ULP only after the DDP Message has been Delivered to
Response Message. the RDMAP Layer by the DDP layer.
20. If more than one outstanding RDMA Read Request Message is 17. RDMA Read Response Message processing at the Remote Peer
supported by both peers, the RDMA Read Response Messages MUST (reading the specified Tagged Buffer) MUST be started only
be submitted to the DDP layer on the Remote Peer in the order after the RDMA Read Request Message has been Delivered by the
the RDMA Read Request Messages were Delivered by DDP, but the DDP layer (thus all previous RDMA Messages have been properly
actual read of the buffer contents MAY take place in any order submitted for ordered Placement).
at the Remote Peer.
This simplifies Local Peer Completion processing for RDMA 18. Send Messages MAY be Completed at the Remote Peer (Data Sink)
Reads in that a Delivered RDMA Read Response MUST be before prior incoming RDMA Read Request Messages have
sufficient to Complete the RDMA Read Operation. completed their response processing.
6 RDMAP Stream Management 19. An RDMA Read operation MUST NOT be Completed at the Local Peer
until the DDP layer Delivers the associated incoming RDMA Read
Response Message.
RDMAP Stream management consists of RDMAP Stream Initialization and 20. If more than one outstanding RDMA Read Request Messages are
RDMAP Stream Termination. supported by both peers, the RDMA Read Response Messages MUST
be submitted to the DDP layer on the Remote Peer in the order
the RDMA Read Request Messages were Delivered by DDP, but the
actual read of the buffer contents MAY take place in any order
at the Remote Peer.
6.1 Stream Initialization This simplifies Local Peer Completion processing for RDMA
Reads in that a Delivered RDMA Read Response MUST be
sufficient to Complete the RDMA Read Operation.
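The note under ordering rule 12 says that a ULP can Fence a later RDMA Write (or Send) by withholding it until all outstanding RDMA Read Responses have been Delivered. The following is a minimal sketch of such ULP-side fencing, under stated assumptions: the class FencedSubmitter, its methods, and the (kind, label) tuples are hypothetical illustration names, and the wire list merely stands in for submission to the RDMAP Layer.

```python
from collections import deque


class FencedSubmitter:
    """Toy ULP-side queue that can Fence operations behind RDMA Reads."""

    def __init__(self):
        self.outstanding_reads = 0  # RDMA Reads whose Response is not yet Delivered
        self.wire = []              # operations handed to RDMAP, in order
        self.held = deque()         # operations withheld behind a fence

    def submit(self, kind, label, fence=False):
        # Once anything is held, later operations queue behind it so that
        # ULP submission order is preserved.
        if self.held or (fence and self.outstanding_reads > 0):
            self.held.append((kind, label, fence))
            return
        self._issue(kind, label)

    def _issue(self, kind, label):
        self.wire.append((kind, label))
        if kind == "read":
            self.outstanding_reads += 1

    def read_response_delivered(self):
        # Called when an RDMA Read Response has been Delivered locally.
        self.outstanding_reads -= 1
        while self.held:
            kind, label, fence = self.held[0]
            if fence and self.outstanding_reads > 0:
                break  # still fenced behind an outstanding RDMA Read
            self.held.popleft()
            self._issue(kind, label)
```

A fenced RDMA Write submitted after an RDMA Read stays in held until read_response_delivered() drops the outstanding count to zero, so the Read Response cannot observe the later Write's data. Unfenced operations submitted while nothing is held pass straight through.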
RDMAP Stream initialization occurs after the LLP Stream has been 6 RDMAP Stream Management
created (e.g. for DDP/MPA over TCP the first TCP Segment after the
SYN, SYN/ACK exchange). The ULP is responsible for transitioning
the LLP Stream into RDMA enabled mode. The switch to RDMA mode
typically occurs sometime after LLP Stream setup. Once in RDMA
enabled mode, an implementation MUST send only RDMA Messages across
the transport Stream until the RDMAP Stream is torn down.
For each direction of an RDMAP Stream: RDMAP Stream management consists of RDMAP Stream Initialization
and RDMAP Stream Termination.
* For a given RDMAP Stream, the number of outstanding RDMA Read 6.1 Stream Initialization
Requests is limited per RDMAP Stream direction.
* It is the ULP's responsibility to set the maximum number of RDMAP Stream initialization occurs after the LLP Stream has been
outstanding, inbound RDMA Read Requests per RDMAP Stream created (e.g., for DDP/MPA over TCP the first TCP Segment after
direction. the SYN, SYN/ACK exchange). The ULP is responsible for
transitioning the LLP Stream into RDMA enabled mode. The switch to
RDMA mode typically occurs sometime after LLP Stream setup. Once
in RDMA enabled mode, an implementation MUST send only RDMA
Messages across the transport Stream until the RDMAP Stream is
torn down.
* The RDMAP Layer MUST provide the maximum number of outstanding, For each direction of an RDMAP Stream:
inbound RDMA Read Requests per RDMAP Stream direction that were
negotiated between the ULP and the Local Peer's RDMAP Layer. The
negotiation mechanism is outside the scope of this
specification.
* It is the ULP's responsibility to set the maximum number of * For a given RDMAP Stream, the number of outstanding RDMA Read
outstanding, outbound RDMA Read Requests per RDMAP Stream Requests is limited per RDMAP Stream direction.
direction.
* The RDMAP Layer MUST provide the maximum number of outstanding, * It is the ULP's responsibility to set the maximum number of
outbound RDMA Read Requests for the RDMAP Stream direction that outstanding, inbound RDMA Read Requests per RDMAP Stream
were negotiated between the ULP and the Local Peer's RDMAP direction.
Layer. The negotiation mechanism is outside the scope of this
specification.
* The Local Peer's ULP is responsible for negotiating with the * The RDMAP Layer MUST provide the maximum number of outstanding,
Remote Peer's ULP the maximum number of outstanding RDMA Read inbound RDMA Read Requests per RDMAP Stream direction that were
Requests for the RDMAP Stream direction. It is recommended that negotiated between the ULP and the Local Peer's RDMAP Layer.
the ULP set the maximum number of outstanding, inbound RDMA Read The negotiation mechanism is outside the scope of this
Requests equal to the maximum number of outstanding, outbound specification.
RDMA Read Requests for a given RDMAP Stream direction.
* For outbound RDMA Read Requests, the RDMAP Layer MUST NOT exceed * It is the ULP's responsibility to set the maximum number of
the maximum number of outstanding, outbound RDMA Read Requests outstanding, outbound RDMA Read Requests per RDMAP Stream
that were negotiated between the ULP and the Local Peer's RDMAP direction.
Layer.
* For inbound RDMA Read Requests, the RDMAP Layer MUST NOT exceed * The RDMAP Layer MUST provide the maximum number of outstanding,
the maximum number of outstanding, inbound RDMA Read Requests outbound RDMA Read Requests for the RDMAP Stream direction that
that were negotiated between the ULP and the Local Peer's RDMAP were negotiated between the ULP and the Local Peer's RDMAP
Layer. Layer. The negotiation mechanism is outside the scope of this
specification.
6.2 Stream Teardown * The Local Peer's ULP is responsible for negotiating with the
Remote Peer's ULP the maximum number of outstanding RDMA Read
Requests for the RDMAP Stream direction. It is recommended that
the ULP set the maximum number of outstanding, inbound RDMA
Read Requests equal to the maximum number of outstanding,
outbound RDMA Read Requests for a given RDMAP Stream direction.
There are three methods for terminating an RDMAP Stream: ULP * For outbound RDMA Read Requests, the RDMAP Layer MUST NOT
Graceful Termination, RDMAP Abortive Termination, and LLP Abortive exceed the maximum number of outstanding, outbound RDMA Read
Termination. Requests that were negotiated between the ULP and the Local
Peer's RDMAP Layer.
The ULP is responsible for performing ULP Graceful Termination. * For inbound RDMA Read Requests, the RDMAP Layer MUST NOT exceed
After a ULP Graceful Termination, either side of the Stream can the maximum number of outstanding, inbound RDMA Read Requests
initiate LLP Graceful Termination, using the graceful termination that were negotiated between the ULP and the Local Peer's RDMAP
mechanism provided by the LLP. Layer.
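The Stream Initialization rules in section 6.1 require that the RDMAP Layer not exceed the negotiated maximum number of outstanding, outbound RDMA Read Requests. One way this could look is sketched below. This is an assumption-laden illustration only: ReadRequestLimiter, submit_read, and read_completed are hypothetical names, and the negotiated limit is simply passed in, since the negotiation mechanism is outside the scope of this specification.

```python
from collections import deque


class ReadRequestLimiter:
    """Toy limiter for outstanding, outbound RDMA Read Requests."""

    def __init__(self, max_outbound):
        if max_outbound < 0:
            raise ValueError("limit must be non-negative")
        self.max_outbound = max_outbound  # value negotiated by the ULP
        self.outstanding = 0              # Read Requests awaiting a Response
        self.waiting = deque()            # Read Requests held back by the limit
        self.issued = []                  # Read Requests handed to DDP, in order

    def submit_read(self, request_id):
        # Issue immediately only while below the negotiated limit.
        if self.outstanding < self.max_outbound:
            self._issue(request_id)
        else:
            self.waiting.append(request_id)

    def _issue(self, request_id):
        self.outstanding += 1
        self.issued.append(request_id)

    def read_completed(self):
        # Called when an RDMA Read Response has been Delivered locally.
        if self.outstanding == 0:
            raise RuntimeError("no outstanding RDMA Read Request")
        self.outstanding -= 1
        if self.waiting:
            self._issue(self.waiting.popleft())
```

Held requests are released in submission order as Responses are Delivered, so the limit is respected without reordering the ULP's RDMA Read Requests.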
RDMAP Abortive Termination allows the RDMAP to issue a Terminate 6.2 Stream Teardown
Message describing the reason the RDMAP Stream was terminated. The
next section (6.2.1 RDMAP Abortive Termination) describes the RDMAP
Abortive Termination in detail.
LLP Abortive Termination results from an LLP error and causes the There are three methods for terminating an RDMAP Stream: ULP
RDMAP Stream to be torn down midstream, without an RDMAP Terminate Graceful Termination, RDMAP Abortive Termination, and LLP Abortive
Message. While this last method is highly undesirable, it is Termination.
possible and the ULP should take this into consideration.
6.2.1 RDMAP Abortive Termination The ULP is responsible for performing ULP Graceful Termination.
After a ULP Graceful Termination, either side of the Stream can
initiate LLP Graceful Termination, using the graceful termination
mechanism provided by the LLP.
RDMAP defines a Terminate operation that SHOULD be invoked when RDMAP Abortive Termination allows the RDMAP to issue a Terminate
either an RDMAP error is encountered or an LLP error is surfaced to Message describing the reason the RDMAP Stream was terminated. The
the RDMAP layer by the LLP. next section (6.2.1 RDMAP Abortive Termination) describes the
RDMAP Abortive Termination in detail.
It is not always possible to send the Terminate Message. For LLP Abortive Termination results from an LLP error and causes the
example, certain LLP errors may occur that cause the LLP Stream to RDMAP Stream to be torn down midstream, without an RDMAP Terminate
be torn down before a) RDMAP is aware of the error, b) before RDMAP Message. While this last method is highly undesirable, it is
is able to send the Terminate Message, or c) after RDMAP has posted possible and the ULP should take this into consideration.
the Terminate Message to the LLP, but it has not yet been
transmitted by the LLP.
Note that an RDMAP Abortive Termination may entail loss of data. In 6.2.1 RDMAP Abortive Termination
general, when a Terminate Message is received it is impossible to
tell for sure what unacknowledged RDMA Messages were Completed
successfully at the Remote Peer. Thus the state of all outstanding
RDMA Messages is indeterminate and the Messages SHOULD be
considered Completed in error.
When a peer sends or receives a Terminate Message, it MAY RDMAP defines a Terminate operation that SHOULD be invoked when
immediately tear down the LLP Stream. The peer SHOULD perform a either an RDMAP error is encountered or an LLP error is surfaced to
graceful LLP teardown to ensure the Terminate Message is the RDMAP layer by the LLP.
successfully Delivered.
See section 4.8 Terminate Header for a description of the Terminate It is not always possible to send the Terminate Message. For
Message and its contents. See section 5.4 Terminate Message for a example, certain LLP errors may occur that cause the LLP Stream to
description of the Terminate Message semantics. be torn down before a) RDMAP is aware of the error, b) before
RDMAP is able to send the Terminate Message, or c) after RDMAP has
posted the Terminate Message to the LLP, but it has not yet been
transmitted by the LLP.
7 RDMAP Error Management Note that an RDMAP Abortive Termination may entail loss of data.
In general, when a Terminate Message is received it is impossible
to tell for sure what unacknowledged RDMA Messages were Completed
successfully at the Remote Peer. Thus the state of all outstanding
RDMA Messages is indeterminate and the Messages SHOULD be
considered Completed in error.
The RDMAP protocol does not have RDMAP or DDP layer error recovery When a peer sends or receives a Terminate Message, it MAY
operations built in. If everything is working, the LLP guarantees immediately tear down the LLP Stream. The peer SHOULD perform a
will ensure that the Messages are arriving at the destination. graceful LLP teardown to ensure the Terminate Message is
successfully Delivered.
If errors are detected at the RDMAP or DDP layer, then the RDMAP, See section 4.8 Terminate Header for a description of the
DDP and LLP Streams are Abortively Terminated (see section 4.8 Terminate Message and its contents. See section 5.4 Terminate
Terminate Header on page 34). Message for a description of the Terminate Message semantics.
In general, poor implementations or improper ULP programming cause 7 RDMAP Error Management
the errors detected at the RDMAP and DDP layers. In these cases,
returning a diagnostic termination error Message and closing the
RDMAP Stream is far simpler than attempting to maintain the RDMAP
Stream, particularly when the cause of the error is not known.
If an LLP does not support teardown of a Stream independent of The RDMAP protocol does not have RDMAP or DDP layer error recovery
other Streams and an RDMAP error results in the Termination of a operations built in. If everything is working, the LLP guarantees
specific Stream, then the LLP MUST label the Stream as an erroneous will ensure that the Messages are arriving at the destination.
Stream and MUST NOT allow any further data transfer on that Stream
after RDMAP requests the Stream to be torn down.
For a specific LLP connection, when all Streams are either If errors are detected at the RDMAP or DDP layer, then the RDMAP,
gracefully torn down or are labeled as erroneous Streams, the LLP DDP and LLP Streams are Abortively Terminated (see section 4.8
connection MUST be torn down. Terminate Header on page 34).
Since errors are detected at the Remote Peer (possibly long) after In general, poor implementations or improper ULP programming cause
RDMA Messages are passed to DDP and the LLP at the Local Peer and the errors detected at the RDMAP and DDP layers. In these cases,
Completed, the sender cannot easily determine which of its Messages returning a diagnostic termination error Message and closing the
have been received. (RDMA Reads are an exception to this rule). RDMAP Stream is far simpler than attempting to maintain the RDMAP
Stream, particularly when the cause of the error is not known.
For a list of errors returned to the Remote Peer as a result of an If an LLP does not support teardown of a Stream independent of
Abortive Termination, see section 4.8 Terminate Header on page 34. other Streams and an RDMAP error results in the Termination of a
specific Stream, then the LLP MUST label the Stream as an
erroneous Stream and MUST NOT allow any further data transfer on
that Stream after RDMAP requests the Stream to be torn down.
7.1 RDMAP Error Surfacing For a specific LLP connection, when all Streams are either
gracefully torn down or are labeled as erroneous Streams, the LLP
connection MUST be torn down.
If an error occurs at the Local Peer, the RDMAP layer MUST attempt Since errors are detected at the Remote Peer (possibly long) after
to inform the local ULP that the error has occurred. RDMA Messages are passed to DDP and the LLP at the Local Peer and
Completed, the sender cannot easily determine which of its
Messages have been received. (RDMA Reads are an exception to this
rule).
The Local Peer MUST send a Terminate Message for each of the For a list of errors returned to the Remote Peer as a result of an
following cases: Abortive Termination, see section 4.8 Terminate Header on page 34.
21. For Errors detected while creating RDMA Write, Send, Send with 7.1 RDMAP Error Surfacing
Invalidate, Send with Solicited Event, Send with Solicited
Event and Invalidate, or RDMA Read Requests, or other reasons
not directly associated with an incoming Message, the Terminate
Message and Error code are sent instead of the request. In
this case, the Error Type and Error Code fields are included in
the Terminate Message, but the Terminated DDP Header and
Terminated RDMA Header fields are set to zero.
22. For errors detected on an incoming RDMA Write, Send, Send with If an error occurs at the Local Peer, the RDMAP layer MUST attempt
Invalidate, Send with Solicited Event, Send with Solicited to inform the local ULP that the error has occurred.
Event and Invalidate, or Read Response Message (after the
Message has been Delivered by DDP), the Terminate Message is
sent at the earliest possible opportunity, preferably in the
next outgoing RDMA Message. In this case, the Error Type, Error
Code, ULP PDU Length, and Terminated DDP Header fields are
included in the Terminate Message, but the Terminated RDMA
Header field is set to zero.
23. For errors detected on an incoming RDMA Read Request Message The Local Peer MUST send a Terminate Message for each of the
(after the Message has been Delivered by DDP), the Terminate following cases:
Message is sent at the earliest possible opportunity,
preferably in the next outgoing RDMA Message. In this case, the
Error Type, Error Code, ULP PDU Length, Terminated DDP Header,
and Terminated RDMA Header fields are included in the Terminate
Message.
24. If more than one error is detected on incoming RDMA Messages, 1. For errors detected while creating RDMA Write, Send, Send with
before the Terminate Message can be sent, then the first RDMA Invalidate, Send with Solicited Event, Send with Solicited
Message (and its associated DDP Segment) that experienced an Event and Invalidate, or RDMA Read Requests, or other reasons
error MUST be captured by the Terminate Message in accordance not directly associated with an incoming Message, the
with rules 2 and 3 above. Terminate Message and Error code are sent instead of the
request. In this case, the Error Type and Error Code fields
are included in the Terminate Message, but the Terminated DDP
Header and Terminated RDMA Header fields are set to zero.
7.2 Errors Detected at the Remote Peer on Incoming RDMA Messages 2. For errors detected on an incoming RDMA Write, Send, Send with
Invalidate, Send with Solicited Event, Send with Solicited
Event and Invalidate, or Read Response Message (after the
Message has been Delivered by DDP), the Terminate Message is
sent at the earliest possible opportunity, preferably in the
next outgoing RDMA Message. In this case, the Error Type,
Error Code, ULP PDU Length, and Terminated DDP Header fields
are included in the Terminate Message, but the Terminated RDMA
Header field is set to zero.
On incoming RDMA Writes, RDMA Read Response, Sends, Send with 3. For errors detected on an incoming RDMA Read Request Message
Invalidate, Send with Solicited Event, Send with Solicited Event (after the Message has been Delivered by DDP), the Terminate
and Invalidate, and Terminate Messages, the following must be Message is sent at the earliest possible opportunity,
validated: preferably in the next outgoing RDMA Message. In this case,
the Error Type, Error Code, ULP PDU Length, Terminated DDP
Header, and Terminated RDMA Header fields are included in the
Terminate Message.
1. The DDP Layer MUST validate all DDP Segment fields. 4. If more than one error is detected on incoming RDMA Messages,
before the Terminate Message can be sent, then the first RDMA
Message (and its associated DDP Segment) that experienced an
error MUST be captured by the Terminate Message in accordance
with rules 2 and 3 above.
2. The RDMA OpCode MUST be valid. 7.2 Errors Detected at the Remote Peer on Incoming RDMA Messages
3. The RDMA Version MUST be valid. On incoming RDMA Writes, RDMA Read Response, Sends, Send with
Invalidate, Send with Solicited Event, Send with Solicited Event
and Invalidate, and Terminate Messages, the following must be
validated:
Additionally, on incoming Send with Invalidate and Send with 1. The DDP Layer MUST validate all DDP Segment fields.
Solicited Event and Invalidate Messages, the following must
also be validated:
4. The Invalidate STag MUST be valid. 2. The RDMA OpCode MUST be valid.
5. The STag MUST be associated to this RDMAP Stream. 3. The RDMA Version MUST be valid.
On incoming RDMA Request Messages, the following must be validated: Additionally, on incoming Send with Invalidate and Send with
Solicited Event and Invalidate Messages, the following must
also be validated:
1. The DDP Layer MUST validate all Untagged DDP Segment fields. 4. The Invalidate STag MUST be valid.
2. The RDMA OpCode MUST be valid. 5. The STag MUST be associated to this RDMAP Stream.
3. The RDMA Version MUST be valid. On incoming RDMA Request Messages, the following must be
validated:
4. For non-zero length RDMA Read Request Messages: 1. The DDP Layer MUST validate all Untagged DDP Segment fields.
a. The Data Source STag MUST be valid. 2. The RDMA OpCode MUST be valid.
b. The Data Source STag MUST be associated to this RDMAP 3. The RDMA Version MUST be valid.
Stream.
c. The Data Source Tagged Offset MUST fall in the range of 4. For non-zero length RDMA Read Request Messages:
legal offsets associated with the Data Source STag.
d. The sum of the Data Source Tagged Offset and the RDMA Read a. The Data Source STag MUST be valid.
Message Size MUST fall in the range of legal offsets
associated with the Data Source STag.
e. The sum of the Data Source Tagged Offset and the RDMA Read b. The Data Source STag MUST be associated to this RDMAP
Message Size MUST NOT cause the Data Source Tagged Offset Stream.
to wrap.
8 Security c. The Data Source Tagged Offset MUST fall in the range of
legal offsets associated with the Data Source STag.
Security Considerations d. The sum of the Data Source Tagged Offset and the RDMA Read
Message Size MUST fall in the range of legal offsets
associated with the Data Source STag.
This section references the resources that discuss protocol- e. The sum of the Data Source Tagged Offset and the RDMA Read
specific security considerations and implications of using RDMAP Message Size MUST NOT cause the Data Source Tagged Offset
with existing security services. A detailed analysis of the to wrap.
security issues around implementation and use of the RDMAP can be
found in [RDMASEC].
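The checks listed in section 7.2 for non-zero length RDMA Read Request Messages (rules 4a through 4e) can be illustrated with a minimal sketch. It is not taken from the specification: the StagEntry record, the stag_table dictionary, the function name, and the returned reason strings are all hypothetical, and the sketch assumes a 64-bit Tagged Offset and treats any sum beyond 2^64 as a wrap.

```python
from dataclasses import dataclass

TO_MODULUS = 1 << 64  # assumption: Tagged Offset is a 64-bit field


@dataclass(frozen=True)
class StagEntry:
    """Hypothetical local record of a buffer registered under an STag."""
    stream_id: int   # RDMAP Stream the STag is associated with
    base_to: int     # first legal Tagged Offset
    length: int      # number of bytes covered by the STag
    valid: bool = True


def validate_read_request(stag_table, stream_id, src_stag, src_to, rdmardsz):
    """Return None if rules 4a-4e pass, otherwise a short reason string."""
    if rdmardsz == 0:
        return None  # rule 4 only covers non-zero length requests
    entry = stag_table.get(src_stag)
    if entry is None or not entry.valid:
        return "invalid STag"                      # rule 4a
    if entry.stream_id != stream_id:
        return "STag not associated with Stream"   # rule 4b
    if src_to + rdmardsz > TO_MODULUS:
        return "Tagged Offset wrap"                # rule 4e
    end = entry.base_to + entry.length
    if not (entry.base_to <= src_to < end):
        return "base bounds violation"             # rule 4c
    if src_to + rdmardsz > end:
        return "bounds violation"                  # rule 4d
    return None
```

In this sketch a non-None result would correspond to the point at which the Remote Peer surfaces the error and sends a Terminate Message; the mapping from reason strings to Terminate Error Codes is deliberately left out.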
[RDMASEC] introduces the RDMA reference model and discusses how the 8 Security Considerations
resources of this model are vulnerable to attacks and the types of
attack these vulnerabilities are subject to. It also details the
levels of Trust available in this peer-to-peer model and how this
defines the nature of resource sharing.
8.1 Summary of RDMAP specific Security Requirements This section references the resources that discuss protocol-
specific security considerations and implications of using RDMAP
with existing security services. A detailed analysis of the
security issues around implementation and use of the RDMAP can be
found in [RDMASEC].
[RDMASEC] defines the security requirements for the implementation [RDMASEC] introduces the RDMA reference model and discusses how
of the components of the RDMA reference model, namely the RDMA the resources of this model are vulnerable to attacks and the
enabled NIC (RNIC) and the Privileged Resource Manager. An RDMAP types of attack these vulnerabilities are subject to. It also
implementation conforming to this specification MUST conform to details the levels of Trust available in this peer-to-peer model
these requirements. and how this defines the nature of resource sharing.
8.1.1 RDMAP (RNIC) Requirements The IPsec requirements for RDDP are based on the version of IPsec
specified in RFC 2401 [RFC 2401] and related RFCs, as profiled by
RFC 3723 [RFC 3723], despite the existence of a newer version of
IPsec specified in RFC 4301 [RFC 4301] and related RFCs. One of
the important early applications of the RDDP protocols is their
use with iSCSI [iSER]; RDDP's IPsec requirements follow those of
IPsec in order to facilitate that usage by allowing a common
profile of IPsec to be used with iSCSI and the RDDP protocols. In
the future, RFC 3723 may be updated to the newer version of IPsec;
the IPsec security requirements of any such update should apply
uniformly to iSCSI and the RDDP protocols.
RDMAP provides several countermeasures for all types of attacks as 8.1 Summary of RDMAP specific Security Requirements
introduced in [RDMASEC]. In the following, this specification lists
all security requirements which MUST be implemented by the RNIC. A
more detailed discussion of RNIC security requirements can be found
in Section 5 of [RDMASEC].
1. An RNIC MUST ensure that a specific Stream in a specific [RDMASEC] defines the security requirements for the implementation
Protection Domain cannot access an STag in a different of the components of the RDMA reference model, namely the RDMA
Protection Domain. enabled NIC (RNIC) and the Privileged Resource Manager. An RDMAP
implementation conforming to this specification MUST conform to
these requirements.
2. An RNIC MUST ensure that if an STag is limited in scope to a 8.1.1 RDMAP (RNIC) Requirements
single Stream, no other Stream can use the STag.
3. An RNIC MUST ensure that a Remote Peer is not able to access RDMAP provides several countermeasures for all types of attacks as
memory outside of the buffer specified when the STag was introduced in [RDMASEC]. In the following, this specification
enabled for remote access. lists all security requirements which MUST be implemented by the
RNIC. A more detailed discussion of RNIC security requirements can
be found in Section 5 of [RDMASEC].
4. An RNIC MUST provide a mechanism for the ULP to establish and 1. An RNIC MUST ensure that a specific Stream in a specific
revoke the association of a ULP Buffer to an STag and TO range. Protection Domain cannot access an STag in a different
Protection Domain.
5. An RNIC MUST provide a mechanism for the ULP to establish and 2. An RNIC MUST ensure that if an STag is limited in scope to a
revoke read, write, or read and write access to the ULP Buffer single Stream, no other Stream can use the STag.
referenced by an STag.
6. An RNIC MUST ensure that the network interface can no longer 3. An RNIC MUST ensure that a Remote Peer is not able to access
modify an advertised buffer after the ULP revokes remote access memory outside of the buffer specified when the STag was
rights for an STag. enabled for remote access.
7. An RNIC MUST ensure that a Remote Peer is not able to 4. An RNIC MUST provide a mechanism for the ULP to establish and
invalidate an STag enabled for remote access, if the STag is revoke the association of a ULP Buffer to an STag and TO
shared on multiple streams. range.
8. An RNIC MUST choose the value of STags in a way difficult to 5. An RNIC MUST provide a mechanism for the ULP to establish and
predict. It is RECOMMENDED to sparsely populate them over the revoke read, write, or read and write access to the ULP Buffer
full range available. referenced by an STag.
9. An RNIC MUST NOT enable sharing a CQ across ULPs that do not 6. An RNIC MUST ensure that the network interface can no longer
share partial mutual trust. modify an advertised buffer after the ULP revokes remote
access rights for an STag.
10. An RNIC MUST ensure that if a CQ overflows, any Streams which 7. An RNIC MUST ensure that a Remote Peer is not able to
do not use the CQ MUST remain unaffected. invalidate an STag enabled for remote access, if the STag is
shared on multiple streams.
11. An RNIC implementation SHOULD provide a mechanism to cap the 8. An RNIC MUST choose the value of STags in a way difficult to
number of outstanding RDMA Read Requests. predict. It is RECOMMENDED to sparsely populate them over the
full available range.
12. An RNIC MUST NOT enable firmware to be loaded on the RNIC 9. An RNIC MUST NOT enable sharing a CQ across ULPs that do not
directly from an untrusted Local Peer or Remote Peer, unless share partial mutual trust.
the Peer is properly authenticated (by a mechanism outside the
scope of this specification. The mechanism presumably entails
authenticating that the remote ULP has the right to perform the
update), and the update is done via a secure protocol, such as
IPsec.
8.1.2 Privileged Resource Manager Requirements 10. An RNIC MUST ensure that if a CQ overflows, any Streams which
do not use the CQ MUST remain unaffected.
With RDMAP, all reservations of local resources are initiated from 11. An RNIC implementation SHOULD provide a mechanism to cap the
local ULPs. To protect from local attacks including unfair number of outstanding RDMA Read Requests.
resource distribution and gaining unauthorized access to RNIC
resources, a Privileged Resource Manager (PRM) must be
implemented, which manages all local resource allocation. Note
that the PRM need not be provided as an independent component; its
functionality can also be implemented as part of the privileged
ULP or as part of the RNIC itself.
A PRM implementation must meet the following security 12. An RNIC MUST NOT enable firmware to be loaded on the RNIC
requirements (a more detailed discussion of PRM security directly from an untrusted Local Peer or Remote Peer, unless
requirements can be found in Section 5 of [RDMASEC]): the Peer is properly authenticated (by a mechanism outside the
scope of this specification. The mechanism presumably entails
authenticating that the remote ULP has the right to perform
the update), and the update is done via a secure protocol,
such as IPsec.
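Requirement 8 above (STag values that are difficult to predict and sparsely populated over the available range) could be approached as in the following sketch. It is an illustration only: the allocate_stag function and the in_use set are hypothetical, the STag is assumed to be a 32-bit value, and a real RNIC may reserve particular values or structure the STag differently.

```python
import secrets

STAG_BITS = 32  # assumption: STag is carried as a 32-bit field


def allocate_stag(in_use):
    """Pick an unused STag uniformly at random from the 32-bit space.

    `in_use` is a set of STags already handed out; the chosen value is
    added to it before returning.
    """
    while True:
        candidate = secrets.randbits(STAG_BITS)  # CSPRNG-backed draw
        if candidate not in in_use:
            in_use.add(candidate)
            return candidate
```

Drawing from a cryptographically strong source rather than a counter means that knowing one STag gives a Remote Peer no practical help in guessing another.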
1. All Non-Privileged ULP interactions with the RNIC Engine that 8.1.2 Privileged Resource Manager Requirements
could affect other ULPs MUST be done using the Resource Manager
as a proxy.
2. All ULP resource allocation requests for scarce resources MUST With RDMAP, all reservations of local resources are initiated from
also be done using a Privileged Resource Manager. local ULPs. To protect from local attacks including unfair
resource distribution and gaining unauthorized access to RNIC
resources, a Privileged Resource Manager (PRM) must be
implemented, which manages all local resource allocation. Note
that the PRM need not be provided as an independent component; its
functionality can also be implemented as part of the privileged
ULP or as part of the RNIC itself.
3. The Privileged Resource Manager MUST NOT assume different ULPs A PRM implementation must meet the following security
share Partial Mutual Trust unless there is a mechanism to requirements (a more detailed discussion of PRM security
ensure that the ULPs do indeed share partial mutual trust. requirements can be found in Section 5 of [RDMASEC]):
4. If Non-Privileged ULPs are supported, the Privileged Resource 1. All Non-Privileged ULP interactions with the RNIC Engine that
Manager MUST verify that the Non-Privileged ULP has the right could affect other ULPs MUST be done using the Resource
to access a specific Data Buffer before allowing an STag for Manager as a proxy.
which the ULP has access rights to be associated with a
specific Data Buffer.
5. The Privileged Resource Manager MUST control the allocation of 2. All ULP resource allocation requests for scarce resources MUST
CQ entries. also be done using a Privileged Resource Manager.
6. The Privileged Resource Manager SHOULD prevent a Local Peer 3. The Privileged Resource Manager MUST NOT assume different ULPs
from allocating more than its fair share of resources. share Partial Mutual Trust unless there is a mechanism to
ensure that the ULPs do indeed share partial mutual trust.
7. RDMA Read Request Queue resource consumption MUST be controlled 4. If Non-Privileged ULPs are supported, the Privileged Resource
by the Privileged Resource Manager such that RDMAP/DDP Streams Manager MUST verify that the Non-Privileged ULP has the right
which do not share Partial Mutual Trust do not share RDMA Read to access a specific Data Buffer before allowing an STag for
Request Queue resources. which the ULP has access rights to be associated with a
specific Data Buffer.
8. If an RNIC provides the ability to share receive buffers across 5. The Privileged Resource Manager MUST control the allocation of
multiple Streams, the combination of the RNIC and the CQ entries.
Privileged Resource Manager MUST be able to detect if the
Remote Peer is attempting to consume more than its fair share
of resources so that the Local Peer can apply countermeasures
to detect and prevent the attack.
8.2 Security Services for RDMAP 6. The Privileged Resource Manager SHOULD prevent a Local Peer
from allocating more than its fair share of resources.
RDMAP uses IP-based network services to control, read and 7. RDMA Read Request Queue resource consumption MUST be
write data buffers over the network. Therefore, all exchanged controlled by the Privileged Resource Manager such that
control and data packets are vulnerable to spoofing, tampering and RDMAP/DDP Streams which do not share Partial Mutual Trust do
information disclosure attacks. not share RDMA Read Request Queue resources.
RDMAP Streams that are subject to impersonation attacks, or Stream 8. If an RNIC provides the ability to share receive buffers
hijacking attacks, can be authenticated, have their integrity across multiple Streams, the combination of the RNIC and the
protected, and be protected from replay attacks. Furthermore, Privileged Resource Manager MUST be able to detect if the
confidentiality protection can be used to protect from Remote Peer is attempting to consume more than its fair share
eavesdropping. of resources so that the Local Peer can apply countermeasures
to detect and prevent the attack.
8.2.1 Available Security Services 8.2 Security Services for RDMAP
The IPsec protocol suite [RFC2401] defines strong countermeasures RDMAP uses IP-based network services to control, read and
to protect an IP stream from those attacks. Several levels of write data buffers over the network. Therefore, all exchanged
protection can guarantee session confidentiality, per-packet source control and data packets are vulnerable to spoofing, tampering
authentication, per-packet integrity and correct packet sequencing. and information disclosure attacks.
RDMAP security may also profit from SSL or TLS security services RDMAP Streams that are subject to impersonation attacks, or
provided for TCP based ULPs [RFC2246]. Used underneath RDMAP, these Stream hijacking attacks, can be authenticated, have their
security services also provide for stream authentication, data integrity protected, and be protected from replay attacks.
integrity and confidentiality. As discussed in [RDMASEC], Furthermore, confidentiality protection can be used to protect
limitations on the maximum packet length to be carried over the from eavesdropping.
network and potentially inefficient out-of-order packet processing
at the data sink make SSL and TLS less appropriate for RDMAP than
IPsec.
If SSL is layered on top of RDMAP, SSL does not protect the RDMAP 8.2.1 Available Security Services
headers. Thus, a man-in-the-middle attack can still occur by
modifying the RDMAP header to incorrectly place the data into the
wrong buffer, thus effectively corrupting the data stream.
By remaining independent of ULP and LLP security protocols, RDMAP The IPsec protocol suite [RFC2401] defines strong countermeasures
will benefit from continuing improvements at those layers. Users to protect an IP stream from those attacks. Several levels of
are provided flexibility to adapt to their specific security protection can guarantee session confidentiality, per-packet
requirements and the ability to adapt to future security source authentication, per-packet integrity and correct packet
challenges. Given this, the vulnerabilities of RDMAP to active sequencing.
third-party interference are no greater than any other protocol
running over an LLP such as TCP or SCTP.
8.2.2 Requirements for IPsec Services for RDMAP RDMAP security may also profit from SSL or TLS security services
provided for TCP based ULPs [RFC4346]. Used underneath RDMAP,
these security services also provide for stream authentication,
data integrity and confidentiality. As discussed in [RDMASEC],
limitations on the maximum packet length to be carried over the
network and potentially inefficient out-of-order packet processing
at the data sink make SSL and TLS less appropriate for RDMAP than
IPsec.
Because IPsec is designed to secure arbitrary IP packet streams, If SSL is layered on top of RDMAP, SSL does not protect the RDMAP
including streams where packets are lost, RDMAP can run on top of headers. Thus, a man-in-the-middle attack can still occur by
IPsec without any change. IPsec packets are processed (e.g., modifying the RDMAP header to incorrectly place the data into the
integrity checked and possibly decrypted) in the order they are wrong buffer, thus effectively corrupting the data stream.
received, and an RDMAP Data Sink will process the decrypted RDMA
Messages contained in these packets in the same manner as RDMA
Messages contained in unsecured IP packets.
The IP Storage working group has defined the normative IPsec By remaining independent of ULP and LLP security protocols, RDMAP
requirements for IP Storage [RFC3723]. Portions of this will benefit from continuing improvements at those layers. Users
specification are applicable to the RDMAP. In particular, a are provided flexibility to adapt to their specific security
compliant implementation of IPsec services for RDMAP MUST meet the requirements and the ability to adapt to future security
requirements as outlined in Section 2.3 of [RFC3723]. Without challenges. Given this, the vulnerabilities of RDMAP to active
replicating the detailed discussion in [RFC3723], this includes third-party interference are no greater than any other protocol
the following requirements: running over an LLP such as TCP or SCTP.
1. The implementation MUST support IPsec ESP [RFC2406], as well as 8.2.2 Requirements for IPsec Services for RDMAP
the replay protection mechanisms of IPsec. When ESP is
utilized, per-packet data origin authentication, integrity and
replay protection MUST be used.
2. It MUST support ESP in tunnel mode and MAY implement ESP in Because IPsec is designed to secure arbitrary IP packet streams,
transport mode. including streams where packets are lost, RDMAP can run on top of
IPsec without any change. IPsec packets are processed (e.g.,
integrity checked and possibly decrypted) in the order they are
received, and an RDMAP Data Sink will process the decrypted RDMA
Messages contained in these packets in the same manner as RDMA
Messages contained in unsecured IP packets.
3. It MUST support IKE [RFC2409] for peer authentication, The IP Storage working group has defined the normative IPsec
negotiation of security associations, and key management, using requirements for IP Storage [RFC3723]. Portions of this
the IPsec DOI [RFC2407]. specification are applicable to the RDMAP. In particular, a
compliant implementation of IPsec services for RDMAP MUST meet
the requirements as outlined in Section 2.3 of [RFC3723]. Without
replicating the detailed discussion in [RFC3723], this includes
the following requirements:
4. It MUST NOT interpret the receipt of an IKE Phase 2 delete 1. The implementation MUST support IPsec ESP [RFC2406], as well
message as a reason for tearing down the RDMAP stream. Since as the replay protection mechanisms of IPsec. When ESP is
IPsec acceleration hardware may only be able to handle a utilized, per-packet data origin authentication, integrity and
limited number of active IKE Phase 2 SAs, idle SAs may be replay protection MUST be used.
dynamically brought down and a new SA be brought up again, if
activity resumes.
5. It MUST support peer authentication using a pre-shared key, and 2. It MUST support ESP in tunnel mode and MAY implement ESP in
MAY support certificate-based peer authentication using digital transport mode.
signatures. Peer authentication using the public key
encryption methods [RFC2409] SHOULD NOT be used.
6. It MUST support IKE Main Mode and SHOULD support Aggressive 3. It MUST support IKE [RFC2409] for peer authentication,
Mode. IKE Main Mode with pre-shared key authentication SHOULD negotiation of security associations, and key management,
NOT be used when either of the peers uses a dynamically using the IPsec DOI [RFC2407].
assigned IP address.
7. When digital signatures are used to achieve authentication, 4. It MUST NOT interpret the receipt of an IKE Phase 2 delete
either IKE Main Mode or IKE Aggressive Mode MAY be used. In message as a reason for tearing down the RDMAP stream. Since
these cases, an IKE negotiator SHOULD use IKE Certificate IPsec acceleration hardware may only be able to handle a
Request Payload(s) to specify the certificate authority (or limited number of active IKE Phase 2 SAs, idle SAs may be
authorities) that are trusted in accordance with its local dynamically brought down and a new SA be brought up again, if
policy. IKE negotiators SHOULD check the pertinent Certificate activity resumes.
Revocation List (CRL) before accepting a PKI certificate for
use in IKE's authentication procedures.
8. Access to locally stored secret information (pre-shared or 5. It MUST support peer authentication using a pre-shared key,
private key for digital signing) must be suitably restricted, and MAY support certificate-based peer authentication using
since compromise of the secret information nullifies the digital signatures. Peer authentication using the public key
security properties of the IKE/IPsec protocols. encryption methods [RFC2409] SHOULD NOT be used.
9. It MUST follow the guidelines of Section 2.3.4 of [RFC3723] on 6. It MUST support IKE Main Mode and SHOULD support Aggressive
the setting of IKE parameters to achieve a high level of Mode. IKE Main Mode with pre-shared key authentication SHOULD
interoperability without requiring extensive configuration. NOT be used when either of the peers uses a dynamically
assigned IP address.
Furthermore, implementation and deployment of the IPsec services 7. When digital signatures are used to achieve authentication,
for RDDP should follow the Security Considerations outlined in either IKE Main Mode or IKE Aggressive Mode MAY be used. In
Section 5 of [RFC3723]. these cases, an IKE negotiator SHOULD use IKE Certificate
Request Payload(s) to specify the certificate authority (or
authorities) that are trusted in accordance with its local
policy. IKE negotiators SHOULD check the pertinent Certificate
Revocation List (CRL) before accepting a PKI certificate for
use in IKE's authentication procedures.
9 IANA 8. Access to locally stored secret information (pre-shared or
private key for digital signing) must be suitably restricted,
since compromise of the secret information nullifies the
security properties of the IKE/IPsec protocols.
IANA Considerations 9. It MUST follow the guidelines of Section 2.3.4 of [RFC3723] on
the setting of IKE parameters to achieve a high level of
interoperability without requiring extensive configuration.
This document requests no direct action from IANA. The following Furthermore, implementation and deployment of the IPsec services
consideration is listed here as commentary. for RDDP should follow the Security Considerations outlined in
Section 5 of [RFC3723].
If RDMAP was enabled a priori for a ULP by connecting to a well- 9 IANA
known port, this well-known port would be registered for the RDMAP
with IANA. The registration of the well-known port will be the
responsibility of the ULP specification.
10 References IANA Considerations
10.1 Normative References This document requests no direct action from IANA. The following
consideration is listed here as commentary.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate If RDMAP was enabled a priori for a ULP by connecting to a well-
Requirement Levels", BCP 14, RFC 2119, March 1997. known port, this well-known port would be registered for the RDMAP
with IANA. The registration of the well-known port will be the
responsibility of the ULP specification.
[RFC2406] Kent, S. and R. Atkinson, "IP Encapsulating Security 10 References
Payload (ESP)", RFC 2406, November 1998.
[RFC2407] Piper, D., "The Internet IP Security Domain of 10.1 Normative References
Interpretation of ISAKMP", RFC 2407, November 1998.
[RFC2409] Harkins, D. and D. Carrel, "The Internet Key Exchange [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
(IKE)", RFC 2409, November 1998. Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC3723] Aboba B. et al., "Secure Block Storage Protocols over [RFC2406] Kent, S. and R. Atkinson, "IP Encapsulating Security
IP", RFC 3723, April 2004. Payload (ESP)", RFC 2406, November 1998.
[VERBS] J. Hilland, "RDMA Protocol Verbs Specification", draft- [RFC2407] Piper, D., "The Internet IP Security Domain of
hilland-iwarp-verbs-v1.0 RDMA Consortium, April 2003. Interpretation of ISAKMP", RFC 2407, November 1998.
[DDP] H. Shah et al., "Direct Data Placement over Reliable [RFC2409] Harkins, D. and D. Carrel, "The Internet Key Exchange
Transports", draft-ietf-rddp-ddp-05.txt, February 2005. (IKE)", RFC 2409, November 1998.
[MPA] P. Culley et al., "Marker PDU Aligned Framing for TCP [RFC3723] Aboba B. et al., "Secure Block Storage Protocols over
Specification", draft-ietf-rddp-mpa-04.txt, January 2005. IP", RFC 3723, April 2004.
[SCTP] R. Stewart et al., "Stream Control Transmission Protocol", [RFC 4301] S. Kent and K. Seo, "Security Architecture for the
RFC 2960, October 2000. Internet Protocol", RFC 4301, December 2005.
[TCP] Postel, J., "Transmission Control Protocol", STD 7, RFC 793, [VERBS] J. Hilland, "RDMA Protocol Verbs Specification", draft-
September 1981. hilland-iwarp-verbs-v1.0 RDMA Consortium, April 2003.
[RDMASEC] J. Pinkerton et al., "DDP/RDMAP Security", draft-ietf- [DDP] H. Shah et al., "Direct Data Placement over Reliable
rddp-security-09.txt, March 2005. Transports", draft-ietf-rddp-ddp-07.txt, September 2006.
10.2 Informative References [MPA] P. Culley et al., "Marker PDU Aligned Framing for TCP
Specification", draft-ietf-rddp-mpa-06.txt, September 2006.
[RFC2401] Atkinson, R., Kent, S., "Security Architecture for the [SCTP] R. Stewart et al., "Stream Control Transmission Protocol",
Internet Protocol", RFC 2401, November 1998. RFC 2960, October 2000.
[RFC2246] Dierks, T. and C. Allen, "The TLS Protocol Version 1.0", [TCP] Postel, J., "Transmission Control Protocol", STD 7, RFC 793,
RFC 2246, November 1998. September 1981.
11 Appendix [RDMASEC] J. Pinkerton et al., "DDP/RDMAP Security", draft-ietf-
rddp-security-09.txt, March 2005.
11.1 DDP Segment Formats for RDMA Messages [iSER] M. Ko et al., "iSCSI Extensions for RDMA Specification",
Internet-Draft, draft-ietf-ips-iser-05.txt, Work in Progress,
October 2005.
This appendix is for information only and is NOT part of the 10.2 Informative References
standard. It simply depicts the DDP Segment format for the various
RDMA Messages.
11.1.1 DDP Segment for RDMA Write [RFC2401] Atkinson, R., Kent, S., "Security Architecture for the
Internet Protocol", RFC 2401, November 1998.
The following figure depicts an RDMA Write, DDP Segment: [RFC4346] Dierks, T. and C. Allen, "The TLS Protocol Version 1.1",
RFC 4346, April 2006.
0 1 2 3 11 Appendix
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| DDP Control | RDMA Control |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data Sink STag |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data Sink Tagged Offset |
+ +
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| RDMA Write ULP Payload |
// //
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 11 RDMA Write, DDP Segment format
11.1.2 DDP Segment for RDMA Read Request 11.1 DDP Segment Formats for RDMA Messages
The following figure depicts an RDMA Read Request, DDP Segment: This appendix is for information only and is NOT part of the
standard. It simply depicts the DDP Segment format for the various
RDMA Messages.
0 1 2 3 11.1.1 DDP Segment for RDMA Write
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| DDP Control | RDMA Control |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Reserved (Not Used) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| DDP (RDMA Read Request) Queue Number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| DDP (RDMA Read Request) Message Sequence Number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| DDP (RDMA Read Request) Message Offset |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data Sink STag (SinkSTag) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| |
+ Data Sink Tagged Offset (SinkTO) +
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| RDMA Read Message Size (RDMARDSZ) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data Source STag (SrcSTag) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| |
+ Data Source Tagged Offset (SrcTO) +
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 12 RDMA Read Request, DDP Segment format
11.1.3 DDP Segment for RDMA Read Response The following figure depicts an RDMA Write, DDP Segment:
The following figure depicts an RDMA Read Response, DDP Segment: 0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| DDP Control | RDMA Control |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data Sink STag |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data Sink Tagged Offset |
+ +
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| RDMA Write ULP Payload |
// //
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 11 RDMA Write, DDP Segment format
0 1 2 3 11.1.2 DDP Segment for RDMA Read Request
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| DDP Control | RDMA Control |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data Sink STag |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data Sink Tagged Offset |
+ +
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| RDMA Read Response ULP Payload |
// //
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 13 RDMA Read Response, DDP Segment format
11.1.4 DDP Segment for Send and Send with Solicited Event The following figure depicts an RDMA Read Request, DDP Segment:
The following figure depicts a Send and Send with Solicited 0 1 2 3
Request, DDP Segment: 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| DDP Control | RDMA Control |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Reserved (Not Used) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| DDP (RDMA Read Request) Queue Number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| DDP (RDMA Read Request) Message Sequence Number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| DDP (RDMA Read Request) Message Offset |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data Sink STag (SinkSTag) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| |
+ Data Sink Tagged Offset (SinkTO) +
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| RDMA Read Message Size (RDMARDSZ) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data Source STag (SrcSTag) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| |
+ Data Source Tagged Offset (SrcTO) +
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 12 RDMA Read Request, DDP Segment format
0 1 2 3 11.1.3 DDP Segment for RDMA Read Response
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| DDP Control | RDMA Control |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Reserved (Not Used) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| (Send) Queue Number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| (Send) Message Sequence Number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| (Send) Message Offset |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Send ULP Payload |
// //
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 14 Send and Send with Solicited Event, DDP Segment format The following figure depicts an RDMA Read Response, DDP Segment:
11.1.5 DDP Segment for Send with Invalidate and Send with SE and 0 1 2 3
Invalidate 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| DDP Control | RDMA Control |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data Sink STag |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data Sink Tagged Offset |
+ +
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| RDMA Read Response ULP Payload |
// //
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 13 RDMA Read Response, DDP Segment format
The following figure depicts a Send with invalidate and Send with 11.1.4 DDP Segment for Send and Send with Solicited Event
Solicited and Invalidate Request, DDP Segment:
0 1 2 3 The following figure depicts a Send and Send with Solicited
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 Request, DDP Segment:
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| DDP Control | RDMA Control |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Invalidate STag |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| (Send) Queue Number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| (Send) Message Sequence Number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| (Send) Message Offset |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Send ULP Payload |
// //
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 15 Send with Invalidate and Send with SE and Invalidate, DDP
Segment
11.1.6 DDP Segment for Terminate 0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| DDP Control | RDMA Control |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Reserved (Not Used) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| (Send) Queue Number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| (Send) Message Sequence Number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| (Send) Message Offset |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Send ULP Payload |
// //
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The following figure depicts a Terminate, DDP Segment: Figure 14 Send and Send with Solicited Event, DDP Segment format
0 1 2 3 11.1.5 DDP Segment for Send with Invalidate and Send with SE and
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 Invalidate
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| DDP Control | RDMA Control |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Reserved (Not Used) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| DDP (Terminate) Queue Number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| DDP (Terminate) Message Sequence Number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| DDP (Terminate) Message Offset |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Terminate Control | Reserved |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| DDP Segment Length (if any) | |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +
| |
+ +
| Terminated DDP Header (if any) |
+ +
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| |
// //
| Terminated RDMA Header (if any) |
+ +
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 16 Terminate, DDP Segment format
11.2 Ordering and Completion Table The following figure depicts a Send with invalidate and Send with
Solicited and Invalidate Request, DDP Segment:
The following table summarizes the ordering relationships that are 0 1 2 3
defined in section 5.5 Ordering and Completions from the standpoint 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
of the local peer issuing the two Operations. Note, in the table +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
that follows Send includes Send, Send with Invalidate, Send with | DDP Control | RDMA Control |
Solicited Event, and Send with Solicited Event and Invalidate +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Invalidate STag |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| (Send) Queue Number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| (Send) Message Sequence Number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| (Send) Message Offset |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Send ULP Payload |
// //
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 15 Send with Invalidate and Send with SE and Invalidate,
DDP Segment
------+-------+----------------+----------------+---------------- 11.1.6 DDP Segment for Terminate
First | Later | Placement | Placement | Ordering
Op | Op | guarantee at | guarantee | guarantee at
| | Remote Peer | Local Peer | Remote Peer
| | | |
------+-------+----------------+----------------+----------------
Send | Send | No placement | Not applicable | Completed in
| | guarantee. If | | order.
| | guarantee is | |
| | necessary, see | |
| | footnote 1. | |
------+-------+----------------+----------------+----------------
Send | RDMA | No placement | Not applicable | Not applicable
| Write | guarantee. If | |
| | guarantee is | |
| | necessary, see | |
| | footnote 1. | |
------+-------+----------------+----------------+----------------
Send | RDMA | No placement | RDMA Read | RDMA Read
| Read | guarantee | Response | Response
| | between Send | Payload will | Message will
| | Payload and | not be placed | not be
| | RDMA Read | at the local | generated until
| | Request Header | peer until the | Send has been
| | | Send Payload is| Completed
| | | placed at the |
| | | remote peer |
------+-------+----------------+----------------+----------------
RDMA | Send | No placement | Not applicable | Not applicable
Write | | guarantee. If | |
| | guarantee is | |
| | necessary, see | |
| | footnote 1. | |
------+-------+----------------+----------------+----------------
RDMA | RDMA | No placement | Not applicable | Not applicable
Write | Write | guarantee. If | |
| | guarantee is | |
| | necessary, see | |
| | footnote 1. | |
------+-------+----------------+----------------+----------------
RDMA | RDMA | No placement | RDMA Read | Not applicable
Write | Read | guarantee | Response |
| | between RDMA | Payload will |
| | Write Payload | not be placed |
| | and RDMA Read | at the local |
| | Request Header | peer until the |
| | | RDMA Write |
| | | Payload is |
| | | placed at the |
| | | remote peer |
------+-------+----------------+----------------+----------------
RDMA | Send | No placement | Send Payload | Not applicable
Read | | guarantee | may be placed |
| | between RDMA | at the remote |
| | Read Request | peer before the|
| | Header and Send| RDMA Read |
| | payload | Response is |
| | | generated. |
| | | If guarantee is|
| | | necessary, see |
| | | footnote 2. |
------+-------+----------------+----------------+----------------
RDMA | RDMA | No placement | RDMA Write | Not applicable
Read | Write | guarantee | Payload may be |
| | between RDMA | placed at the |
| | Read Request | remote peer |
| | Header and RDMA| before the RDMA|
| | Write payload | Read Response |
| | | is generated. |
| | | If guarantee is|
| | | necessary, see |
| | | footnote 2. |
------+-------+----------------+----------------+----------------
RDMA | RDMA | No placement | No placement | Second RDMA
Read | Read | guarantee of | guarantee of | Read Response
| | the two RDMA | the two RDMA | will not be
| | Read Request | Read Response | generated until
| | Headers | Payloads. | first RDMA Read
| | Additionally, | | Response is
| | there is no | | generated.
| | guarantee that | |
| | the Tagged | |
| | Buffers | |
| | referenced in | |
| | the RDMA Read | |
| | will be read in| |
| | order | |
Figure 17 Operation Ordering
Footnote 1: If the guarantee is necessary, a ULP may insert an The following figure depicts a Terminate, DDP Segment:
RDMA Read Operation and wait for it to complete to act as a Fence.
Footnote 2: If the guarantee is necessary, a ULP may wait for the 0 1 2 3
RDMA Read Operation to complete before performing the Send. 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| DDP Control | RDMA Control |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Reserved (Not Used) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| DDP (Terminate) Queue Number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| DDP (Terminate) Message Sequence Number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| DDP (Terminate) Message Offset |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Terminate Control | Reserved |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| DDP Segment Length (if any) | |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +
| |
+ +
| Terminated DDP Header (if any) |
+ +
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| |
// //
| Terminated RDMA Header (if any) |
+ +
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 16 Terminate, DDP Segment format
11.2 Ordering and Completion Table
The following table summarizes the ordering relationships that are
defined in section 5.5 Ordering and Completions from the
standpoint of the local peer issuing the two Operations. Note that,
in the table that follows, Send includes Send, Send with Invalidate,
Send with Solicited Event, and Send with Solicited Event and
Invalidate.
------+-------+----------------+----------------+----------------
First | Later | Placement | Placement | Ordering
Op | Op | guarantee at | guarantee | guarantee at
| | Remote Peer | Local Peer | Remote Peer
| | | |
------+-------+----------------+----------------+----------------
Send | Send | No placement | Not applicable | Completed in
| | guarantee. If | | order.
| | guarantee is | |
| | necessary, see | |
| | footnote 1. | |
------+-------+----------------+----------------+----------------
Send | RDMA | No placement | Not applicable | Not applicable
| Write | guarantee. If | |
| | guarantee is | |
| | necessary, see | |
| | footnote 1. | |
------+-------+----------------+----------------+----------------
Send | RDMA | No placement | RDMA Read | RDMA Read
| Read | guarantee | Response | Response
| | between Send | Payload will | Message will
| | Payload and | not be placed | not be
| | RDMA Read | at the local | generated until
| | Request Header | peer until the | Send has been
| | | Send Payload is| Completed
| | | placed at the |
| | | remote peer |
------+-------+----------------+----------------+----------------
RDMA | Send | No placement | Not applicable | Not applicable
Write | | guarantee. If | |
| | guarantee is | |
| | necessary, see | |
| | footnote 1. | |
------+-------+----------------+----------------+----------------
RDMA | RDMA | No placement | Not applicable | Not applicable
Write | Write | guarantee. If | |
| | guarantee is | |
| | necessary, see | |
| | footnote 1. | |
------+-------+----------------+----------------+----------------
RDMA | RDMA | No placement | RDMA Read | Not applicable
Write | Read | guarantee | Response |
| | between RDMA | Payload will |
| | Write Payload | not be placed |
| | and RDMA Read | at the local |
| | Request Header | peer until the |
| | | RDMA Write |
| | | Payload is |
| | | placed at the |
| | | remote peer |
------+-------+----------------+----------------+----------------
RDMA | Send | No placement | Send Payload | Not applicable
Read | | guarantee | may be placed |
| | between RDMA | at the remote |
| | Read Request | peer before the|
| | Header and Send| RDMA Read |
| | payload | Response is |
| | | generated. |
| | | If guarantee is|
| | | necessary, see |
| | | footnote 2. |
------+-------+----------------+----------------+----------------
RDMA | RDMA | No placement | RDMA Write | Not applicable
Read | Write | guarantee | Payload may be |
| | between RDMA | placed at the |
| | Read Request | remote peer |
| | Header and RDMA| before the RDMA|
| | Write payload | Read Response |
| | | is generated. |
| | | If guarantee is|
| | | necessary, see |
| | | footnote 2. |
------+-------+----------------+----------------+----------------
RDMA | RDMA | No placement | No placement | Second RDMA
Read | Read | guarantee of | guarantee of | Read Response
| | the two RDMA | the two RDMA | will not be
| | Read Request | Read Response | generated until
| | Headers | Payloads. | first RDMA Read
| | Additionally, | | Response is
| | there is no | | generated.
| | guarantee that | |
| | the Tagged | |
| | Buffers | |
| | referenced in | |
| | the RDMA Read | |
| | will be read in| |
| | order | |
Figure 17 Operation Ordering
Footnote 1: If the guarantee is necessary, a ULP may insert an
RDMA Read Operation and wait for it to complete to act as a Fence.
Footnote 2: If the guarantee is necessary, a ULP may wait for the
RDMA Read Operation to complete before performing the Send or RDMA
Write.
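A ULP author may find it convenient to encode which footnote, if any, Figure 17 cites for each pair of Operations. The following sketch does only that. The names FOOTNOTE, ADVICE and ordering_advice are hypothetical, and the advice strings paraphrase the footnotes rather than quote them.

```python
# Footnote cited by Figure 17 for each (first op, later op) pair.
# None means the table cites no footnote for that pair.
FOOTNOTE = {
    ("Send", "Send"): 1,
    ("Send", "RDMA Write"): 1,
    ("Send", "RDMA Read"): None,
    ("RDMA Write", "Send"): 1,
    ("RDMA Write", "RDMA Write"): 1,
    ("RDMA Write", "RDMA Read"): None,
    ("RDMA Read", "Send"): 2,
    ("RDMA Read", "RDMA Write"): 2,
    ("RDMA Read", "RDMA Read"): None,
}

# Paraphrased guidance; the footnotes themselves remain authoritative.
ADVICE = {
    1: "insert an RDMA Read and wait for it to complete, as a fence",
    2: "wait for the RDMA Read to complete before the later operation",
    None: "no footnote cited; see the table row for the guarantees given",
}


def ordering_advice(first: str, later: str) -> str:
    """Return the paraphrased footnote guidance for an operation pair."""
    return ADVICE[FOOTNOTE[(first, later)]]


print(ordering_advice("RDMA Write", "Send"))
```

Keeping the pairs in a single table mirrors Figure 17 row by row, so a discrepancy with the specification is easy to spot during review.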
12 Author's Address 12 Author's Address
Paul R. Culley Paul R. Culley
Hewlett-Packard Company Hewlett-Packard Company
20555 SH 249 20555 SH 249
Houston, Tx. USA 77070-2698 Houston, Tx. USA 77070-2698
Phone: 281-514-5543 Phone: 281-514-5543
Email: paul.culley@hp.com Email: paul.culley@hp.com
skipping to change at page 78, line 21 skipping to change at page 78, line 21
Allyn Romanow Allyn Romanow
Cisco Systems Cisco Systems
170 W Tasman Drive 170 W Tasman Drive
San Jose, CA 95134 USA San Jose, CA 95134 USA
Phone: +1 408 525 8836 Phone: +1 408 525 8836
Email: allyn@cisco.com Email: allyn@cisco.com
Tom Talpey Tom Talpey
Network Appliance Network Appliance
375 Totten Pond Road 1601 Trapelo Road #16
Waltham, MA 02451 USA Waltham, MA 02451 USA
Phone: +1 (781) 768-5329 Phone: +1 (781) 768-5329
EMail: thomas.talpey@netapp.com EMail: thomas.talpey@netapp.com
Patricia Thaler Patricia Thaler
Broadcom Corporation Broadcom Corporation
16215 Alton Parkway 16215 Alton Parkway
Irvine, CA. USA 92619-7013 Irvine, CA. USA 92619-7013
Phone: +1-916-570-2707 Phone: +1-916-570-2707
email: pthaler@broadcom.com email: pthaler@broadcom.com
 End of changes. 570 change blocks. 
2133 lines changed or deleted 2182 lines changed or added
