Network File System Version 4                                   C. Lever
Internet-Draft                                                    Oracle
Intended status: Standards Track                           April 8, 2016
Expires: October 10, 2016

   Size-Limited Bi-directional Remote Procedure Call On Remote Direct
                        Memory Access Transports
                draft-ietf-nfsv4-rpcrdma-bidirection-02

Abstract

Recent minor versions of NFSv4 work best when ONC RPC transports can
send ONC RPC transactions in both directions.
This document describes conventions that enable RPC-over-RDMA transport
endpoints to interoperate when operation in both directions is
necessary.

Status of This Memo

This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current
Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."

This Internet-Draft will expire on October 10, 2016.

Copyright Notice

Copyright (c) 2016 IETF Trust and the persons identified as the
document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.

Table of Contents

1. Introduction
   1.1. Understanding RPC Direction
   1.2. Rationale For RPC-over-RDMA Bi-Direction
   1.3. Design Considerations
   1.4. Requirements Language
2. Conventions For Backward Operation
   2.1. Flow Control
   2.2. Managing Receive Buffers
   2.3. Backward Direction Retransmission
   2.4. Backward Direction Message Size
   2.5. Sending A Backward Direction Call
   2.6. Sending A Backward Direction Reply
3. Backward Direction Upper Layer Binding
4. Limits To This Approach
   4.1. Payload Size
   4.2. Preparedness To Handle Backward Requests
   4.3. Long Term
5. Security Considerations
6. IANA Considerations
7. Acknowledgements
8. Normative References
Author's Address

1. Introduction

The purpose of this document is to enable bi-directional RPC
transactions on RPC-over-RDMA transports that do not already support
backward direction operation. The conventions described herein can
be used with the RPC-over-RDMA Version One protocol without changes.
Therefore this document does not update [I-D.ietf-nfsv4-rfc5666bis].

Backward direction transactions enable the operation of NFSv4.1, and
in particular pNFS, on RPC-over-RDMA. Providing an Upper Layer
Binding for NFSv4.x callback operations is outside the scope of this
document.

1.1. Understanding RPC Direction

The ONC RPC protocol as described in [RFC5531] is fundamentally a
message-passing protocol between one server and one or more clients.
ONC RPC transactions are made up of two types of messages.

A CALL message, or "Call", requests work. A Call is designated by
the value CALL in the message's msg_type field. An arbitrary unique
value is placed in the message's xid field. A host that originates a
Call is referred to in this document as a "Requester."

A REPLY message, or "Reply", reports the results of work requested by
a Call. A Reply is designated by the value REPLY in the message's
msg_type field. The value contained in the message's xid field is
copied from the Call whose results are being returned. A host that
emits a Reply is referred to as a "Responder."

RPC-over-RDMA is a connection-oriented RPC transport. When a
connection-oriented transport is used, ONC RPC client endpoints are
responsible for initiating transport connections, while ONC RPC
service endpoints wait passively for incoming connection requests.

RPC direction on connectionless RPC transports is not considered in
this document.

1.1.1. Forward Direction

A traditional ONC RPC client is always a Requester. A traditional
ONC RPC service is always a Responder. This traditional form of ONC
RPC message passing is referred to as operation in the "forward
direction."

During forward direction operation, the ONC RPC client is responsible
for establishing transport connections.

1.1.2. Backward Direction

The ONC RPC specification [RFC5531] does not forbid passing messages
in the other direction. An ONC RPC service endpoint can act as a
Requester, in which case an ONC RPC client endpoint acts as a
Responder. This form of message passing is referred to as operation
in the "backward direction."

During backward direction operation, the ONC RPC client is
responsible for establishing transport connections, even though ONC
RPC Calls come from the ONC RPC server.
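
The forward and backward definitions above reduce to a small lookup on
which endpoint sent a message and what its msg_type is. The following
Python sketch is illustrative only: the Role and MsgType names and the
direction() helper are hypothetical and do not come from any
specification. The numeric msg_type values are those of [RFC5531].

```python
from enum import Enum


class MsgType(Enum):
    CALL = 0    # msg_type value for a Call
    REPLY = 1   # msg_type value for a Reply


class Role(Enum):
    CLIENT = "client"   # endpoint that established the connection
    SERVER = "server"   # endpoint that accepted the connection


def direction(sender: Role, msg_type: MsgType) -> str:
    """Classify a message as "forward" or "backward".

    Forward:  the client is the Requester, the server is the Responder.
    Backward: the server is the Requester, the client is the Responder.
    """
    if msg_type is MsgType.CALL:
        # The sender of a Call is the Requester.
        return "forward" if sender is Role.CLIENT else "backward"
    # The sender of a Reply is the Responder.
    return "forward" if sender is Role.SERVER else "backward"
```

In both directions the client remains the endpoint that established
the connection; only the Requester and Responder roles are exchanged.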

ONC RPC clients and services are optimized to perform and scale well
while handling traffic in the forward direction, and may not be
prepared to handle operation in the backward direction. Not until
recently has there been a need to handle backward direction
operation.

1.1.3. Bi-directional Operation

A pair of connected RPC endpoints may choose to use only forward or
only backward direction operations on a particular transport. Or,
these endpoints may send Calls in both directions concurrently on the
same transport.

"Bi-directional operation" occurs when both transport endpoints act
as a Requester and a Responder at the same time. As above, the ONC
RPC client is always responsible for establishing transport
connections.

1.1.4. XID Values

Section 9 of [RFC5531] introduces the ONC RPC transaction identifier,
or "xid" for short. The value of an xid is interpreted in the
context of the message's msg_type field.

o  The xid of a Call is arbitrary but is unique among outstanding
   Calls from that Requester.

o  The xid of a Reply always matches that of the initiating Call.

When receiving a Reply, a Requester matches the xid value in the
Reply with a Call it previously sent.

1.1.4.1. XIDs with Bi-direction

During bi-directional operation, the forward and backward directions
use independent xid spaces.

In other words, a forward direction Requester MAY use the same xid
value at the same time as a backward direction Requester on the same
transport connection. Though such concurrent requests use the same
xid value, they represent distinct ONC RPC transactions.

1.2. Rationale For RPC-over-RDMA Bi-Direction

1.2.1. NFSv4.0 Callback Operation

An NFSv4.0 client employs a traditional ONC RPC client to send NFS
requests to an NFSv4.0 server's traditional ONC RPC service
[RFC7530]. NFSv4.0 requests flow in the forward direction on a
connection established by the client. This connection is referred to
as a "forechannel" connection.

An NFSv4 "delegation" is simply a promise made by a server that it
will notify a client when another client requests access to a file.
With this guarantee, that client can operate as sole accessor of this
file, and manage the file's data and metadata caches aggressively.

To manage file delegation, NFSv4.0 introduces the use of callback
operations, or "callbacks", in Section 10.2 of [RFC7530]. An NFSv4.0
server sets up a traditional ONC RPC client, and an NFSv4.0 client
sets up a traditional ONC RPC service. Callbacks flow in the forward
direction on a connection established between the server's client,
and the client's server. This connection is distinct from
connections being used as forechannels, and is referred to as a
"backchannel connection."

When an RDMA transport is used as a forechannel, an NFSv4.0 client
typically provides a TCP callback service. The client's SETCLIENTID
operation advertises the callback service endpoint with a "tcp" or
"tcp6" netid. The server then connects to this service using a TCP
socket.

NFSv4.0 implementations are fully functional without a backchannel in
place. In this case, the server does not grant file delegations.
This might result in a negative performance effect, but functional
correctness is unaffected.

1.2.2. NFSv4.1 Callback Operation

NFSv4.1 supports file delegation in a similar fashion to NFSv4.0, and
extends the repertoire of callbacks to manage pNFS layouts, as
discussed in Chapter 12 of [RFC5661].

For various reasons, NFSv4.1 requires that all transport connections
be initiated by NFSv4.1 clients. Therefore, NFSv4.1 servers send
callbacks to clients in the backward direction on connections
established by NFSv4.1 clients.

NFSv4.1 clients and servers indicate to their peers that a
backchannel capability is available on a given transport in the
arguments and results of a CREATE_SESSION or BIND_CONN_TO_SESSION
operation.

NFSv4.1 clients may establish distinct transport connections for
forechannel and backchannel operation, or they may combine
forechannel and backchannel operation on one transport connection
using bi-directional operation.

Without a backward direction RPC-over-RDMA capability, an NFSv4.1
client must additionally connect using a transport with backward
direction capability to use as a backchannel. TCP is the only choice
at present for an NFSv4.1 backchannel connection.

Some implementations find it more convenient to use a single combined
transport (i.e., a transport that is capable of bi-directional
operation). This simplifies connection establishment, and recovery
during network partitions or when one endpoint restarts.

As with NFSv4.0, if a backchannel is not in use, an NFSv4.1 server
does not grant delegations. But because of its reliance on callbacks
to manage pNFS layout state, pNFS operation is not possible without a
backchannel.

1.3. Design Considerations

As of this writing, the only use case for backward direction ONC RPC
messages is the NFSv4.1 backchannel. The conventions described in
this document take advantage of certain characteristics of NFSv4.1
callbacks, namely:

o  NFSv4.1 callbacks typically bear small arguments and results

o  NFSv4.1 callback arguments and results are insensitive to
   alignment relative to system pages

o  NFSv4.1 callbacks are infrequent relative to forechannel
   operations

1.3.1. Backward Compatibility

Existing clients that implement RPC-over-RDMA Version One should
interoperate correctly with servers that implement RPC-over-RDMA with
backward direction support, and vice versa.

The approach taken here avoids altering the RPC-over-RDMA XDR
specification. Keeping the XDR the same enables existing
RPC-over-RDMA Version One implementations to interoperate with
implementations that support operation in the backward direction.

1.3.2. Performance Impact

Support for operation in the backward direction should never impact
the performance or scalability of forward direction operation, where
the bulk of ONC RPC transport activity typically occurs.

1.3.3. Server Memory Security

RDMA transfers involve one endpoint exposing a section of its memory
to the other endpoint, which then drives RDMA Read and Write
operations to access or modify the exposed memory. RPC-over-RDMA
client endpoints expose their memory, and RPC-over-RDMA server
endpoints initiate RDMA data transfer operations.

If RDMA transfers are not used for backward direction operations,
there is no need for servers to expose their memory to clients.

Further, this avoids the client complexity required to drive RDMA
transfers.

1.3.4. Payload Size

Small RPC-over-RDMA messages are conveyed using only RDMA Send
operations. Send is used to transmit both ONC RPC Calls and Replies.

To send a large payload, an RPC-over-RDMA client endpoint registers a
region of memory (known as a "chunk") and transmits its coordinates
to an RPC-over-RDMA server endpoint, which uses an RDMA transfer to
move data to or from the client. See Section 4.4 of
[I-D.ietf-nfsv4-rfc5666bis].

To transmit RPC-over-RDMA messages larger than the receive buffer
size (1024 bytes on an RPC-over-RDMA Version One transport), a chunk
must be used. For example, in an RDMA_NOMSG type message, the entire
RPC header and Upper Layer payload are contained in one or more
chunks. See Section 4.5 of [I-D.ietf-nfsv4-rfc5666bis] for further
details.
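
As a rough illustration of the size constraint above, the following
Python sketch computes how much of a receive buffer is left for the
RPC message when no chunks are used. The 28-byte header length is an
assumption taken from the RPC-over-RDMA Version One XDR: four fixed
32-bit fields followed by three empty chunk lists, each encoded as a
single XDR word. All constant and function names are hypothetical.

```python
XDR_UNIT = 4                  # every XDR item is a multiple of 4 bytes
RECEIVE_BUFFER_SIZE = 1024    # Version One default cited in the text

# Assumed minimal header: rdma_xid, rdma_vers, rdma_credit, rdma_proc,
# then an empty read list, write list, and reply chunk (one word each).
MIN_HEADER_LEN = 7 * XDR_UNIT


def max_inline_rpc_len(buffer_size: int = RECEIVE_BUFFER_SIZE) -> int:
    """Largest RPC message that fits in one receive buffer."""
    return buffer_size - MIN_HEADER_LEN


def needs_chunk(rpc_len: int, buffer_size: int = RECEIVE_BUFFER_SIZE) -> bool:
    """True when the message cannot be conveyed with a single Send."""
    return MIN_HEADER_LEN + rpc_len > buffer_size
```

Under these assumptions, an RPC message of up to 996 bytes fits in a
1024-byte receive buffer; anything larger would require a chunk.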

If chunks are not allowed to be used for conveying backward direction
messages, an RDMA_NOMSG type message cannot be used to convey a
backward direction message using the conventions described in this
document. Therefore, backward direction messages sent using the
conventions in this document can be no larger than a single receive
buffer.

Stipulating such a limit on backward direction message size assumes
that either Upper Layer Protocol consumers of backward direction
messages can advertise this limit to peers, or that ULP consumers can
agree by convention on a maximum size of their backchannel payloads.

In addition, using only inline forms of RPC-over-RDMA messages and
never populating the RPC-over-RDMA chunk lists means that the RPC
header's msg_type field is always at a fixed location in messages
flowing in the backward direction, allowing efficient detection of
the direction of an RPC-over-RDMA message.

With few exceptions, NFSv4.1 servers can break down callback requests
so they fit within this limit. There are potentially large NFSv4.1
callback operations, such as a CB_GETATTR operation where a large ACL
must be conveyed. Although we are not aware of any NFSv4.1
implementation that uses CB_GETATTR, this state of affairs is not
guaranteed in perpetuity.

1.4. Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119].

2. Conventions For Backward Operation

Performing backward direction ONC RPC operations over an
RPC-over-RDMA transport can be accomplished within limits by
observing the conventions described in the following subsections.
For reference, the XDR description of RPC-over-RDMA Version One is
contained in Section 5.1 of [I-D.ietf-nfsv4-rfc5666bis].

2.1. Flow Control

For an RDMA Send operation to work, the receiving peer must have
posted an RDMA Receive Work Request (WR) to provide a receive buffer
in which to capture the incoming message. If a receiver hasn't
posted enough Receive WRs to catch incoming Send operations, the RDMA
provider is allowed to drop the RDMA connection.

RPC-over-RDMA protocols provide built-in send flow control to prevent
overrunning the number of pre-posted receive buffers on a
connection's receive endpoint. This is fully discussed in
Section 4.3 of [I-D.ietf-nfsv4-rfc5666bis].

2.1.1. Forward Credits

An RPC-over-RDMA credit is the capability to handle one RPC-over-RDMA
transaction. Each forward direction RPC-over-RDMA Call requests a
number of credits from the Responder. Each forward direction Reply
informs the Requester how many credits the Responder is prepared to
handle in total. The values of the request and the grant are carried
in each RPC-over-RDMA message's rdma_credit field.

Practically speaking, the critical value is the value of the
rdma_credit field in RPC-over-RDMA Replies. When a Requester is
operating correctly, it never has more requests outstanding than the
Responder's advertised forward direction credit value.

The credit value is a guaranteed minimum. However, a receiver can
post more receive buffers than its credit value. There is no
requirement in the RPC-over-RDMA protocol for a receiver to indicate
a credit overrun. Operation continues as long as there are enough
receive buffers to handle incoming messages.

2.1.2. Backward Credits

Credits work the same way in the backward direction as they do in the
forward direction. However, forward direction credits and backward
direction credits are accounted separately.
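
One way to picture the separate accounting is a pair of independent
credit windows on each transport connection. The Python sketch below
is illustrative rather than normative: the CreditWindow and Transport
names are hypothetical, and the initial grant of one credit is an
assumption made only so the example can run.

```python
from dataclasses import dataclass, field


@dataclass
class CreditWindow:
    """Requester-side view of credits for one direction."""
    granted: int = 1       # assumed starting grant before any Reply
    outstanding: int = 0   # Calls sent whose Replies have not arrived

    def can_send(self) -> bool:
        return self.outstanding < self.granted

    def call_sent(self) -> None:
        if not self.can_send():
            raise RuntimeError("credit limit reached")
        self.outstanding += 1

    def reply_received(self, rdma_credit: int) -> None:
        # Each Reply carries the Responder's current credit grant.
        self.outstanding -= 1
        self.granted = rdma_credit


@dataclass
class Transport:
    """Forward and backward windows never share state."""
    forward: CreditWindow = field(default_factory=CreditWindow)
    backward: CreditWindow = field(default_factory=CreditWindow)
```

Exhausting the forward window in this model has no effect on whether
a backward direction Call may be sent, and a new forward grant leaves
the backward grant unchanged.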

In other words, the forward direction credit value is the same
whether or not there are backward direction resources associated with
an RPC-over-RDMA transport connection. The backward direction credit
value MAY be different from the forward direction credit value. The
rdma_credit field in a backward direction RPC-over-RDMA message MUST
NOT contain the value zero.

A backward direction Requester (i.e., an RPC-over-RDMA service
endpoint) requests credits from the Responder (i.e., an RPC-over-RDMA
client endpoint). The Responder reports how many credits it can
grant. This is the number of backward direction Calls the Responder
is prepared to handle at once.

When an RPC-over-RDMA server endpoint is operating correctly, it
never has more requests outstanding than the client endpoint's
advertised backward direction credit value.

2.2. Managing Receive Buffers

An RPC-over-RDMA transport endpoint must pre-post receive buffers
before it can receive and process incoming RPC-over-RDMA messages.
If a sender transmits a message for a receiver which has no prepared
receive buffer, the RDMA provider is allowed to drop the RDMA
connection.

2.2.1. Client Receive Buffers

Typically an RPC-over-RDMA Requester posts only as many receive
buffers as there are outstanding RPC Calls. A client endpoint
without backward direction support might therefore at times have no
pre-posted receive buffers.

To receive incoming backward direction Calls, an RPC-over-RDMA client
endpoint must pre-post enough additional receive buffers to match its
advertised backward direction credit value. Each outstanding forward
direction RPC requires an additional receive buffer above this
minimum.

When an RDMA transport connection is lost, all active receive buffers
are flushed and are no longer available to receive incoming messages.
When a fresh transport connection is established, a client endpoint
must re-post a receive buffer to handle the Reply for each
retransmitted forward direction Call, and a full set of receive
buffers to handle backward direction Calls.

2.2.2. Server Receive Buffers

A forward direction RPC-over-RDMA service endpoint posts as many
receive buffers as it expects incoming forward direction Calls. That
is, it posts no fewer buffers than the number of RPC-over-RDMA
credits it advertises in the rdma_credit field of forward direction
RPC Replies.

To receive incoming backward direction Replies, an RPC-over-RDMA
server endpoint must pre-post a receive buffer for each backward
direction Call it sends.

When the existing transport connection is lost, all active receive
buffers are flushed and are no longer available to receive incoming
messages. When a fresh transport connection is established, a server
endpoint must re-post a receive buffer to handle the Reply for each
retransmitted backward direction Call, and a full set of receive
buffers for receiving forward direction Calls.

2.2.3. In the Absence of Backward Direction Support

An RPC-over-RDMA transport endpoint might not support backward
direction operation. There might be no mechanism in the transport
implementation to do so. Or the Upper Layer Protocol consumer might
not yet have configured the transport to handle backward direction
traffic.

A loss of the RDMA connection may result if the receiver is not
prepared to receive an incoming message. Thus a denial-of-service
could result if a sender continues to send backchannel messages after
every transport reconnect to an endpoint that is not prepared to
receive them.

Generally, for RPC-over-RDMA Version One transports, the Upper Layer
Protocol consumer is responsible for informing its peer when it has
support for the backward direction. Otherwise even a simple backward
direction NULL probe from a peer could result in a lost connection.

An NFSv4.1 server does not send backchannel messages to an NFSv4.1
client before the NFSv4.1 client has sent a CREATE_SESSION or a
BIND_CONN_TO_SESSION operation. As long as an NFSv4.1 client has
prepared appropriate backchannel resources before sending one of
these operations, denial-of-service is avoided. Legacy versions of
NFS never send backchannel operations.

Therefore, an Upper Layer Protocol consumer MUST NOT perform backward
direction ONC RPC operations unless the peer consumer has indicated
it is prepared to handle them. A description of Upper Layer Protocol
mechanisms used for this indication is outside the scope of this
document.

2.3. Backward Direction Retransmission

In rare cases, an ONC RPC transaction cannot be completed within a
certain time. This can be because the transport connection was lost,
the Call or Reply message was dropped, or because the Upper Layer
consumer delayed or dropped the ONC RPC request. Typically, the
Requester sends the transaction again, reusing the same RPC XID.
This is known as an "RPC retransmission".

In the forward direction, the Requester is the ONC RPC client. The
client is always responsible for establishing a transport connection
before sending again.

In the backward direction, the Requester is the ONC RPC server.
Because an ONC RPC server does not establish transport connections
with clients, it cannot send a retransmission if there is no
transport connection. It must wait for the ONC RPC client to
re-establish the transport connection before it can retransmit ONC
RPC transactions in the backward direction.

If an ONC RPC client has no work to do, it may be some time before it
re-establishes a transport connection. Backward direction Requesters
must be prepared to wait indefinitely for a connection to be
established before a pending backward direction ONC RPC Call can be
retransmitted.

2.4. Backward Direction Message Size

RPC-over-RDMA backward direction messages are transmitted and
received using the same buffers as messages in the forward direction.
Therefore they are constrained to be no larger than receive buffers
posted for forward messages. The default Receive buffer size in
RPC-over-RDMA Version One implementations is 1024 bytes.

It is expected that the Upper Layer Protocol consumer establishes an
appropriate payload size limit for backward direction operations,
either by advertising that size limit to its peers, or by convention.
If that is done, backward direction messages will not exceed the size
of receive buffers at either endpoint.

If a sender transmits a backward direction message that is larger
than the receiver is prepared for, the RDMA provider drops the
message and the RDMA connection.

If a sender transmits an RDMA message that is too small to convey a
complete and valid RPC-over-RDMA and RPC message in either direction,
the receiver MUST NOT use any value in the fields that were
transmitted. Namely, the rdma_credit field MUST be ignored, and the
message silently discarded.

2.5. Sending A Backward Direction Call

To form a backward direction RPC-over-RDMA Call message, an ONC RPC
service endpoint constructs an RPC-over-RDMA header containing a
fresh RPC XID in the rdma_xid field (see Section 1.1.4 for full
requirements).

The number of requested backward direction credits is placed in the
rdma_credit field (see Section 2.1).

The rdma_proc field in the RPC-over-RDMA header MUST contain the
value RDMA_MSG. All three chunk lists MUST be empty.
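
The header fields described so far can be sketched in Python as
follows. This is a minimal illustration, not a reference encoder. It
assumes the field order of the RPC-over-RDMA Version One XDR
(rdma_xid, rdma_vers, rdma_credit, rdma_proc, then the read list,
write list, and reply chunk), with RDMA_MSG encoded as 0 and each
empty chunk list encoded as a single zero word. The function name
encode_backward_header() is hypothetical.

```python
import struct

RDMA_VERSION_ONE = 1   # rdma_vers value (assumed from the Version One XDR)
RDMA_MSG = 0           # rdma_proc value for an inline message


def encode_backward_header(xid: int, credits: int) -> bytes:
    """Build the RPC-over-RDMA header for a backward direction message.

    The same layout serves a Call (credits requested) and a Reply
    (credits granted); only the meaning of rdma_credit differs.
    """
    if credits < 1:
        # Section 2.1.2: rdma_credit MUST NOT be zero in the
        # backward direction.
        raise ValueError("rdma_credit must be non-zero")
    return struct.pack(
        ">7I",               # seven big-endian 32-bit XDR words
        xid & 0xFFFFFFFF,    # rdma_xid
        RDMA_VERSION_ONE,    # rdma_vers
        credits,             # rdma_credit
        RDMA_MSG,            # rdma_proc
        0,                   # empty read list
        0,                   # empty write list
        0,                   # no reply chunk
    )
```

Under these assumptions the header is always 28 bytes long, so the
ONC RPC message that follows begins at a fixed offset.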

The ONC RPC Call header MUST follow immediately, starting with the
same XID value that is present in the RPC-over-RDMA header. The Call
header's msg_type field MUST contain the value CALL.

2.6. Sending A Backward Direction Reply

To form a backward direction RPC-over-RDMA Reply message, an ONC RPC
client endpoint constructs an RPC-over-RDMA header containing a copy
of the matching ONC RPC Call's RPC XID in the rdma_xid field (see
Section 1.1.4 for full requirements).

The number of granted backward direction credits is placed in the
rdma_credit field (see Section 2.1).

The rdma_proc field in the RPC-over-RDMA header MUST contain the
value RDMA_MSG. All three chunk lists MUST be empty.

The ONC RPC Reply header MUST follow immediately, starting with the
same XID value that is present in the RPC-over-RDMA header. The
Reply header's msg_type field MUST contain the value REPLY.

3. Backward Direction Upper Layer Binding

RPC programs that operate on RPC-over-RDMA transports using the
conventions described in this document do not require an Upper Layer
Binding specification. Because backward direction operation using
these conventions cannot transfer data via RDMA Read or Write, there
can be no RDMA-eligible data items in the Upper Layer Program on this
transport.

In addition, since backward direction operation occurs on an
already-established connection, there is no need to specify RPC bind
parameters.

4. Limits To This Approach

4.1. Payload Size

The major drawback to the approach described in this document is the
limit on payload size in backward direction requests.

o  Some NFSv4.1 callback operations can have potentially large
   arguments or results. For example, CB_GETATTR on a file with a
   large ACL; or CB_NOTIFY, which can provide a large, complex
   argument.

o  Any backward direction operation protected by RPCSEC_GSS might
   have additional header information that makes it difficult to send
   backward direction operations with large arguments or results.

o  Larger payloads could potentially require the use of RDMA data
   transfers, which are complex and make it more difficult to detect
   backward direction requests. The msg_type field in the ONC RPC
   header would no longer be at a fixed location in backward
   direction requests.

4.2. Preparedness To Handle Backward Requests

A second drawback is the exposure of the client transport endpoint to
backward direction Calls before it has posted receive buffers to
handle them.

Clients that do not support backward direction operation typically
drop messages they do not recognize. However, this does not allow
bi-direction-capable servers to quickly identify clients that cannot
handle backward direction requests.

The conventions in this document rely on Upper Layer Protocol
consumers to decide when backward direction transport operation is
appropriate.

4.3. Long Term

To address the limitations described in this section in the long run,
two approaches are available:

o  Larger inline thresholds would make the transport capable of
   conveying larger backward direction requests

o  The capability to move chunks in the backward direction would lift
   the size limit even further by enabling backward direction Long
   Call and Reply messages to be formed

The latter approach would benefit from changes to the XDR definition
of the RPC-over-RDMA protocol, and would require significant changes
to implementations.

The use of the conventions described in this document to enable
backward direction operation should be considered a transitional
approach that is appropriate while the predominantly deployed
versions of the RPC-over-RDMA protocol do not have native support for
large backward direction messages.

5. Security Considerations

When RPCSEC_GSS integrity and confidentiality services (described in
[I-D.ietf-nfsv4-rpcsec-gssv3]) are in use, additional RPC header
information is included in each message. This increases the size of
each message, further limiting the size of backward direction
operations.

6. IANA Considerations

This document does not require actions by IANA.

7. Acknowledgements

Tom Talpey was an indispensable resource, in addition to creating the
foundation upon which this work is based. Our warmest regards go to
him for his help and support.

Dave Noveck provided excellent review, constructive suggestions, and
navigational guidance throughout the process of drafting this
document.

Dai Ngo was a solid partner and collaborator. Together we
constructed and tested independent prototypes of the conventions
described in this document.

The author wishes to thank Bill Baker for his unwavering support of
this work. In addition, the author gratefully acknowledges the
expert contributions of Karen Deitke, Chunli Zhang, Mahesh
Siddheshwar, Steve Wise, and Tom Tucker.

Special thanks go to the nfsv4 Working Group Chair Spencer Shepler
and the nfsv4 Working Group Secretary Tom Haynes for their support.

8. Normative References

[I-D.ietf-nfsv4-rfc5666bis]
           Lever, C., Simpson, W., and T. Talpey, "Remote Direct
           Memory Access Transport for Remote Procedure Call",
           draft-ietf-nfsv4-rfc5666bis-04 (work in progress),
           March 2016.

[I-D.ietf-nfsv4-rpcsec-gssv3]
           Adamson, A. and N. Williams, "Remote Procedure Call (RPC)
           Security Version 3", draft-ietf-nfsv4-rpcsec-gssv3-17
           (work in progress), January 2016.

[RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
           Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC5531]  Thurlow, R., "RPC: Remote Procedure Call Protocol
           Specification Version 2", RFC 5531, May 2009.

[RFC5661]  Shepler, S., Eisler, M., and D. Noveck, "Network File
           System (NFS) Version 4 Minor Version 1 Protocol", RFC
           5661, January 2010.

[RFC7530]  Haynes, T. and D. Noveck, "Network File System (NFS)
           Version 4 Protocol", RFC 7530, March 2015.

Author's Address

Charles Lever
Oracle Corporation
1015 Granger Avenue
Ann Arbor, MI 48104
USA

Phone: +1 734 274 2396
Email: chuck.lever@oracle.com