| < draft-ietf-nfsv4-rfc5667bis-01.txt | draft-ietf-nfsv4-rfc5667bis-02.txt > | |||
|---|---|---|---|---|
| Network File System Version 4 C. Lever, Ed. | Network File System Version 4 C. Lever, Ed. | |||
| Internet-Draft Oracle | Internet-Draft Oracle | |||
| Obsoletes: 5667 (if approved) June 30, 2016 | Obsoletes: 5667 (if approved) August 25, 2016 | |||
| Intended status: Standards Track | Intended status: Standards Track | |||
| Expires: January 1, 2017 | Expires: February 26, 2017 | |||
| Network File System (NFS) Upper Layer Binding To RPC-Over-RDMA | Network File System (NFS) Upper Layer Binding To RPC-Over-RDMA | |||
| draft-ietf-nfsv4-rfc5667bis-01 | draft-ietf-nfsv4-rfc5667bis-02 | |||
| Abstract | Abstract | |||
| This document specifies the Upper Layer Bindings of Network File | This document specifies Upper Layer Bindings of Network File System | |||
| System (NFS) protocol versions to RPC-over-RDMA transports. Such | (NFS) protocol versions to RPC-over-RDMA transports. These bindings | |||
| Upper Layer Bindings are required to enable RPC-based protocols to | are required to enable RPC-based protocols to use direct data | |||
| use direct data placement when conveying large data payloads on RPC- | placement on RPC-over-RDMA transports. This document obsoletes RFC | |||
| over-RDMA transports. This document obsoletes RFC 5667. | 5667. | |||
| Status of This Memo | Status of This Memo | |||
| This Internet-Draft is submitted in full conformance with the | This Internet-Draft is submitted in full conformance with the | |||
| provisions of BCP 78 and BCP 79. | provisions of BCP 78 and BCP 79. | |||
| Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
| Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
| working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
| Drafts is at http://datatracker.ietf.org/drafts/current/. | Drafts is at http://datatracker.ietf.org/drafts/current/. | |||
| Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
| and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
| time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
| material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
| This Internet-Draft will expire on January 1, 2017. | This Internet-Draft will expire on February 26, 2017. | |||
| Copyright Notice | Copyright Notice | |||
| Copyright (c) 2016 IETF Trust and the persons identified as the | Copyright (c) 2016 IETF Trust and the persons identified as the | |||
| document authors. All rights reserved. | document authors. All rights reserved. | |||
| This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
| Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
| (http://trustee.ietf.org/license-info) in effect on the date of | (http://trustee.ietf.org/license-info) in effect on the date of | |||
| publication of this document. Please review these documents | publication of this document. Please review these documents | |||
| carefully, as they describe your rights and restrictions with respect | carefully, as they describe your rights and restrictions with respect | |||
| to this document. Code Components extracted from this document must | to this document. Code Components extracted from this document must | |||
| include Simplified BSD License text as described in Section 4.e of | include Simplified BSD License text as described in Section 4.e of | |||
| the Trust Legal Provisions and are provided without warranty as | the Trust Legal Provisions and are provided without warranty as | |||
| described in the Simplified BSD License. | described in the Simplified BSD License. | |||
| Table of Contents | Table of Contents | |||
| 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 | |||
| 1.1. Requirements Language . . . . . . . . . . . . . . . . . . 3 | 1.1. Changes Since RFC 5667 . . . . . . . . . . . . . . . . . 3 | |||
| 1.2. Changes Since RFC 5667 . . . . . . . . . . . . . . . . . 3 | 1.2. Extending This Upper Layer Binding . . . . . . . . . . . 4 | |||
| 1.3. Planned Changes To This Document . . . . . . . . . . . . 4 | 1.3. Requirements Language . . . . . . . . . . . . . . . . . . 4 | |||
| 2. Conveying NFS Operations On RPC-Over-RDMA Transports . . . . 4 | 2. Conveying NFS Operations On RPC-Over-RDMA Transports . . . . 4 | |||
| 2.1. Use Of The Read List . . . . . . . . . . . . . . . . . . 4 | 2.1. Use Of The Read List . . . . . . . . . . . . . . . . . . 4 | |||
| 2.2. Use Of The Write List . . . . . . . . . . . . . . . . . . 5 | 2.2. Use Of The Write List . . . . . . . . . . . . . . . . . . 4 | |||
| 2.3. Construction Of Individual Chunks . . . . . . . . . . . . 5 | 2.3. Construction Of Individual Chunks . . . . . . . . . . . . 5 | |||
| 2.4. Use Of Long Calls And Replies . . . . . . . . . . . . . . 5 | 2.4. Use Of Long Calls And Replies . . . . . . . . . . . . . . 5 | |||
| 3. NFS Versions 2 And 3 Upper Layer Binding . . . . . . . . . . 5 | 3. NFS Versions 2 And 3 Upper Layer Binding . . . . . . . . . . 5 | |||
| 4. NFS Version 4 Upper Layer Binding . . . . . . . . . . . . . . 6 | 4. NFS Version 4 Upper Layer Binding . . . . . . . . . . . . . . 6 | |||
| 4.1. NFS Version 4 COMPOUND Considerations . . . . . . . . . . 7 | 4.1. DDP-Eligibility . . . . . . . . . . . . . . . . . . . . . 6 | |||
| 4.2. NFS Version 4 Callbacks . . . . . . . . . . . . . . . . . 8 | 4.2. Reply Size Estimation . . . . . . . . . . . . . . . . . . 7 | |||
| 5. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 8 | 4.3. NFS Version 4 COMPOUND Considerations . . . . . . . . . . 7 | |||
| 6. Security Considerations . . . . . . . . . . . . . . . . . . . 9 | 4.4. NFS Version 4 Callback . . . . . . . . . . . . . . . . . 9 | |||
| 7. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 9 | 5. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 9 | |||
| 8. References . . . . . . . . . . . . . . . . . . . . . . . . . 9 | 6. Security Considerations . . . . . . . . . . . . . . . . . . . 10 | |||
| 8.1. Normative References . . . . . . . . . . . . . . . . . . 9 | 7. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 10 | |||
| 8.2. Informative References . . . . . . . . . . . . . . . . . 10 | 8. References . . . . . . . . . . . . . . . . . . . . . . . . . 10 | |||
| Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 11 | 8.1. Normative References . . . . . . . . . . . . . . . . . . 10 | |||
| 8.2. Informative References . . . . . . . . . . . . . . . . . 11 | ||||
| Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 12 | ||||
| 1. Introduction | 1. Introduction | |||
| Remote Direct Memory Access Transport for Remote Procedure Call, | An RPC-over-RDMA transport, such as defined in | |||
| Version One [I-D.ietf-nfsv4-rfc5666bis] (RPC-over-RDMA) enables the | [I-D.ietf-nfsv4-rfc5666bis], may employ direct data placement to | |||
| use of direct data placement to accelerate the transmission of large | transmit large data payloads associated with RPC transactions. Each | |||
| data payloads associated with RPC transactions. | RPC-over-RDMA transport header conveys lists of memory locations | |||
| corresponding to XDR data items defined in an Upper Layer Protocol | ||||
| Each RPC-over-RDMA transport header can convey lists of memory | (such as NFS). | |||
| locations involved in direct transfers of data payloads. These | ||||
| memory locations correspond to XDR data items defined in an Upper | ||||
| Layer Protocol (such as NFS). | ||||
| To facilitate interoperation, RPC client and server implementations | To facilitate interoperation, RPC client and server implementations | |||
| must agree on what XDR data items in which RPC procedures are | must agree in advance on what XDR data items in which RPC procedures | |||
| eligible for direct data placement (DDP). | are eligible for direct data placement (DDP). This document contains | |||
| material required of Upper Layer Bindings, as specified in | ||||
| This document specifies the set of XDR data items in each of the | [I-D.ietf-nfsv4-rfc5666bis], for the following NFS protocol versions: | |||
| following NFS protocol versions that are eligible for DDP. It also | ||||
| contains additional material required of Upper Layer Bindings as | ||||
| specified in [I-D.ietf-nfsv4-rfc5666bis]. | ||||
| o NFS Version 2 [RFC1094] | o NFS Version 2 [RFC1094] | |||
| o NFS Version 3 [RFC1813] | o NFS Version 3 [RFC1813] | |||
| o NFS Version 4.0 [RFC7530] | o NFS Version 4.0 [RFC7530] | |||
| o NFS Version 4.1 [RFC5661] | o NFS Version 4.1 [RFC5661] | |||
| o NFS Version 4.2 [I-D.ietf-nfsv4-minorversion2] | o NFS Version 4.2 [I-D.ietf-nfsv4-minorversion2] | |||
| The Upper Layer Binding specified in this document can be extended to | 1.1. Changes Since RFC 5667 | |||
| cover the addition of new DDP-eligible XDR data items defined by | ||||
| versions of the NFS version 4 protocol specified after this document | ||||
| has been ratified. | ||||
| 1.1. Requirements Language | ||||
| The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | ||||
| "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | ||||
| document are to be interpreted as described in [RFC2119]. | ||||
| 1.2. Changes Since RFC 5667 | ||||
| Corrections and updates made necessary by new language in | Corrections and updates made necessary by new language in | |||
| [I-D.ietf-nfsv4-rfc5666bis] has been introduced. For example, | [I-D.ietf-nfsv4-rfc5666bis] have been introduced. For example, | |||
| references to deprecated features of RPC-over-RDMA Version One, such | references to deprecated features of RPC-over-RDMA Version One, such | |||
| as RDMA_MSGP, and the use of the Read list for handling RPC replies, | as RDMA_MSGP, and the use of the Read list for handling RPC replies, | |||
| have been removed. The term "mapping" has been replaced with the term | have been removed. The term "mapping" has been replaced with the term | |||
| "binding" or "Upper Layer Binding" throughout the document. Material | "binding" or "Upper Layer Binding" throughout the document. Material | |||
| that duplicates what is in [I-D.ietf-nfsv4-rfc5666bis] has been | that duplicates what is in [I-D.ietf-nfsv4-rfc5666bis] has been | |||
| deleted. | deleted. | |||
| Material required by [I-D.ietf-nfsv4-rfc5666bis] for Upper Layer | Material required by [I-D.ietf-nfsv4-rfc5666bis] for Upper Layer | |||
| Bindings that was not present in [RFC5667] has been added, including | Bindings that was not present in [RFC5667] has been added, including | |||
| discussion of how each NFS version properly estimates the maximum | discussion of how each NFS version properly estimates the maximum | |||
| skipping to change at page 3, line 51 | skipping to change at page 3, line 36 | |||
| o Ambiguous or erroneous uses of RFC2119 terms have been corrected. | o Ambiguous or erroneous uses of RFC2119 terms have been corrected. | |||
| o References to specific data movement mechanisms have been made | o References to specific data movement mechanisms have been made | |||
| generic or removed. | generic or removed. | |||
| o References to obsolete RFCs have been replaced. | o References to obsolete RFCs have been replaced. | |||
| o Technical corrections have been made. For example, the mention of | o Technical corrections have been made. For example, the mention of | |||
| 12KB and 36KB inline thresholds has been removed. The reference | 12KB and 36KB inline thresholds has been removed. The reference | |||
| to a nonexistent NFS version 4 SYMLINK operation has been | to a nonexistent NFS version 4 SYMLINK operation has been | |||
| replaced with NFS version 4 CREATE(NF4LNK). | replaced with NFS version 4 CREATE(NF4LNK). The discussion of NFS | |||
| version 4 COMPOUND handling has been completed. | ||||
| o An IANA Considerations Section has replaced the "Port Usage | o An IANA Considerations Section has replaced the "Port Usage | |||
| Considerations" Section. | Considerations" Section. | |||
| o Code excerpts have been removed, and figures have been modernized. | o Code excerpts have been removed, and figures have been modernized. | |||
| o Language inconsistent with or contradictory to | o Language inconsistent with or contradictory to | |||
| [I-D.ietf-nfsv4-rfc5666bis] has been removed from Sections 2 and | [I-D.ietf-nfsv4-rfc5666bis] has been removed from Sections 2 and | |||
| 3, and both Sections have been combined into Section 2 in the | 3, and both Sections have been combined into Section 2 in the | |||
| present document. | present document. | |||
| o An explicit discussion of NFSv4.0 and NFSv4.1 backchannel | o An explicit discussion of NFSv4.0 and NFSv4.1 backchannel | |||
| operation will replace the previous treatment of callback | operation will replace the previous treatment of callback | |||
| operations. No NFSv4.x callback operation is DDP-eligible. | operations. No NFSv4.x callback operation is DDP-eligible. | |||
| o The binding for NFSv4.1 has been completed. No additional DDP- | o The binding for NFSv4.1 has been completed. No DDP-eligible | |||
| eligible operations exist in NFSv4.1. | operations exist in NFSv4.1 that did not exist in NFSv4.0. | |||
| o A binding for NFSv4.2 has been added that includes discussion of | o A binding for NFSv4.2 has been added that includes discussion of | |||
| new data-bearing operations like READ_PLUS. | new data-bearing operations like READ_PLUS. | |||
| 1.3. Planned Changes To This Document | 1.2. Extending This Upper Layer Binding | |||
| The following changes are planned, relative to [RFC5667]: | ||||
| o The discussion of NFS version 4 COMPOUND handling will be | As stated earlier, RPC programs such as NFS are required to have an | |||
| completed. | Upper Layer Binding specification to interoperate on RPC-over-RDMA | |||
| transports [I-D.ietf-nfsv4-rfc5666bis]. The Upper Layer Binding | ||||
| specified in this document can be extended to cover versions of the | ||||
| NFS version 4 protocol specified after NFS version 4 minor version 2 | ||||
| via standards action. This includes NFSv4 extensions that are | ||||
| documented separately from a new minor version. | ||||
| o Remarks about handling DDP-eligibility violations will be | 1.3. Requirements Language | |||
| introduced. | ||||
| o A discussion of how the NFS binding to RPC-over-RDMA is extended | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | |||
| by standards action will be added. | "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | |||
| document are to be interpreted as described in [RFC2119]. | ||||
| 2. Conveying NFS Operations On RPC-Over-RDMA Transports | 2. Conveying NFS Operations On RPC-Over-RDMA Transports | |||
| Definitions of terminology and a general discussion of how RPC-over- | Definitions of terminology and a general discussion of how RPC-over- | |||
| RDMA is used to convey RPC transactions can be found in | RDMA is used to convey RPC transactions can be found in | |||
| [I-D.ietf-nfsv4-rfc5666bis]. In this section, these general | [I-D.ietf-nfsv4-rfc5666bis]. In this section, these general | |||
| principles are applied to the specifics of the NFS protocol. | principles are applied to the specifics of the NFS protocol. | |||
| 2.1. Use Of The Read List | 2.1. Use Of The Read List | |||
| The Read list in each RPC-over-RDMA transport header represents a set | The Read list in each RPC-over-RDMA transport header represents a set | |||
| of memory regions containing DDP-eligible NFS argument data. Large | of memory regions containing DDP-eligible NFS argument data. Large | |||
| data items, such as the file data payload of an NFS WRITE request, | data items, such as the data payload of an NFS WRITE request, are | |||
| are referenced by the Read list and placed directly into server | referenced by the Read list. The server places these directly into | |||
| memory. | its memory. | |||
| XDR unmarshaling code on the NFS server identifies the correspondence | XDR unmarshaling code on the NFS server identifies the correspondence | |||
| between Read chunks and particular NFS arguments via the chunk | between Read chunks and particular NFS arguments via the chunk | |||
| Position value encoded in each Read chunk. | Position value encoded in each Read chunk. | |||
| 2.2. Use Of The Write List | 2.2. Use Of The Write List | |||
| The Write list in each RPC-over-RDMA transport header represents a | The Write list in each RPC-over-RDMA transport header represents a | |||
| set of memory regions that can receive DDP-eligible NFS result data. | set of memory regions that can receive DDP-eligible NFS result data. | |||
| Large data items such as the payload of an NFS READ request are | Large data items such as the payload of an NFS READ request are | |||
| referenced by the Write list and placed directly into client memory. | referenced by the Write list. The server places these directly into | |||
| client memory. | ||||
| Each Write chunk corresponds to a specific XDR data item in an NFS | Each Write chunk corresponds to a specific XDR data item in an NFS | |||
| reply. This document specifies how NFS client and server | reply. This document specifies how NFS client and server | |||
| implementations identify the correspondence between Write chunks and | implementations identify the correspondence between Write chunks and | |||
| each XDR result. | XDR results. | |||
| 2.3. Construction Of Individual Chunks | 2.3. Construction Of Individual Chunks | |||
| Each Read chunk is represented as a list of segments at the same XDR | Each Read chunk is represented as a list of segments at the same XDR | |||
| Position, and each Write chunk is represented as an array of | Position, and each Write chunk is represented as an array of | |||
| segments. An NFS client thus has the flexibility to advertise a set | segments. An NFS client thus has the flexibility to advertise a set | |||
| of discontiguous memory regions in which to send or receive a single | of discontiguous memory regions in which to send or receive a single | |||
| DDP-eligible data item. | DDP-eligible data item. | |||
| 2.4. Use Of Long Calls And Replies | 2.4. Use Of Long Calls And Replies | |||
| Small RPC messages are conveyed using RDMA Send operations which are | Small RPC messages are conveyed using RDMA Send operations which are | |||
| of limited size. If an NFS request is too large to be conveyed via | of limited size. If an NFS request is too large to be conveyed via | |||
| an RDMA Send, and there are no DDP-eligible data items that can be | an RDMA Send, and there are no DDP-eligible data items that can be | |||
| removed, an NFS client must send the request using a Long Call. The | removed, an NFS client must send the request using a Long Call. The | |||
| entire NFS request is sent in a special Read chunk. | entire NFS request is sent in a special Read chunk called a Position- | |||
| Zero Read chunk. | ||||
| If a client expects that an NFS reply will be too large to be | If a client predicts that the maximum size of an NFS reply is too | |||
| conveyed via an RDMA Send, it provides a Reply chunk in the RPC-over- | large to be conveyed via an RDMA Send, it provides a Reply chunk in | |||
| RDMA transport header conveying the NFS request. The server can | the RPC-over-RDMA transport header conveying the NFS request. The | |||
| place the entire NFS reply in the Reply chunk. | server can place the entire NFS reply in the Reply chunk. | |||
| These are described in more detail in [I-D.ietf-nfsv4-rfc5666bis]. | These special chunks are described in more detail in | |||
| [I-D.ietf-nfsv4-rfc5666bis]. | ||||
| 3. NFS Versions 2 And 3 Upper Layer Binding | 3. NFS Versions 2 And 3 Upper Layer Binding | |||
| An NFS client MAY send a single Read chunk to supply opaque file data | An NFS client MAY send a single Read chunk to supply opaque file data | |||
| for an NFS WRITE procedure, or the pathname for an NFS SYMLINK | for an NFS WRITE procedure, or the pathname for an NFS SYMLINK | |||
| procedure. For all other NFS procedures, the server MUST ignore Read | procedure. For all other NFS procedures, NFS servers MUST ignore | |||
| chunks that have a non-zero value in their Position fields, and Read | Read chunks that have a non-zero value in their Position fields, and | |||
| chunks beyond the first in the Read list. | Read chunks beyond the first in the Read list. | |||
| Similarly, an NFS client MAY provide a single Write chunk to receive | Similarly, an NFS client MAY provide a single Write chunk to receive | |||
| either opaque file data from an NFS READ procedure, or the pathname | either opaque file data from an NFS READ procedure, or the pathname | |||
| from an NFS READLINK procedure. The server MUST ignore the Write | from an NFS READLINK procedure. NFS servers MUST ignore the Write | |||
| list for any other NFS procedure, and any Write chunks beyond the | list for any other NFS procedure, and any Write chunks beyond the | |||
| first in the Write list. | first in the Write list. | |||
| There are no NFS version 2 or 3 procedures that have DDP-eligible | There are no NFS version 2 or 3 procedures that have DDP-eligible | |||
| data items in both their Call and Reply. However, if an NFS client | data items in both their Call and Reply. However, when an NFS client | |||
| is sending a Long Call or Reply, it MAY provide a combination of Read | sends a Long Call or Reply, it MAY provide a combination of Read | |||
| list, Write list, and/or a Reply chunk in the same transaction. | list, Write list, and/or a Reply chunk in the same RPC-over-RDMA | |||
| header. | ||||
| If an NFS client has not provided enough bytes in a Read list to | ||||
| match the size of a DDP-eligible NFS argument data item, or if an NFS | ||||
| client has not provided enough Write list resources to handle an NFS | ||||
| | READ or READLINK reply, or if the client has not provided a large | |||
| enough Reply chunk to convey an NFS reply, the server MUST return one | ||||
| of: | ||||
| o An RPC-over-RDMA message of type RDMA_ERROR, with the rdma_xid | ||||
| field set to the XID of the matching NFS Call, and the rdma_error | ||||
| field set to ERR_CHUNK; or | ||||
| o An RPC message with the mtype field set to REPLY, the stat field | ||||
| set to MSG_ACCEPTED, and the accept_stat field set to | ||||
| GARBAGE_ARGS. | ||||
| NFS clients already successfully estimate the maximum reply size of | NFS clients already successfully estimate the maximum reply size of | |||
| each operation in order to provide an adequate set of buffers to | each operation in order to provide an adequate set of buffers to | |||
| receive each NFS reply. An NFS client provides a Reply chunk when | receive each NFS reply. An NFS client provides a Reply chunk when | |||
| the maximum possible reply size is larger than the client's responder | the maximum possible reply size is larger than the client's responder | |||
| inline threshold. | inline threshold. | |||
| How does the server respond if the client has not provided enough | ||||
| Write list resources to handle an NFS READ or READLINK reply? How | ||||
| does the server respond if the client has not provided enough Reply | ||||
| chunk resources to handle an NFS reply? | ||||
| 4. NFS Version 4 Upper Layer Binding | 4. NFS Version 4 Upper Layer Binding | |||
| This specification applies to NFS Version 4.0 [RFC7530], NFS Version | This specification applies to NFS Version 4.0 [RFC7530], NFS Version | |||
| 4.1 [RFC5661], and NFS Version 4.2 [I-D.ietf-nfsv4-minorversion2]. | 4.1 [RFC5661], and NFS Version 4.2 [I-D.ietf-nfsv4-minorversion2]. | |||
| It also applies to the callback protocols associated with each of | It also applies to the callback protocols associated with each of | |||
| these minor versions. | these minor versions. | |||
| 4.1. DDP-Eligibility | ||||
| An NFS client MAY send a Read chunk to supply opaque file data for a | An NFS client MAY send a Read chunk to supply opaque file data for a | |||
| WRITE operation or the pathname for a CREATE(NF4LNK) operation in an | WRITE operation or the pathname for a CREATE(NF4LNK) operation in an | |||
| NFS version 4 COMPOUND procedure. An NFS client MUST NOT send a Read | NFS version 4 COMPOUND procedure. An NFS client MUST NOT send a Read | |||
| chunk that corresponds with any other XDR data item in any other NFS | chunk that corresponds with any other XDR data item in any other NFS | |||
| version 4 operation. | version 4 operation in an NFS version 4 COMPOUND procedure, or in an | |||
| NFS version 4 NULL procedure. | ||||
| Similarly, an NFS client MAY provide a Write chunk to receive either | Similarly, an NFS client MAY provide a Write chunk to receive either | |||
| opaque file data from a READ operation, NFS4_CONTENT_DATA from a | opaque file data from a READ operation, NFS4_CONTENT_DATA from a | |||
| READ_PLUS operation, or the pathname from a READLINK operation in an | READ_PLUS operation, or the pathname from a READLINK operation in an | |||
| NFS version 4 COMPOUND procedure. An NFS client MUST NOT provide a | NFS version 4 COMPOUND procedure. An NFS client MUST NOT provide a | |||
| Write chunk that corresponds with any other XDR data item in any | Write chunk that corresponds with any other XDR data item in any | |||
| other NFS version 4 operation. | other NFS version 4 operation in an NFS version 4 COMPOUND procedure, | |||
| or in an NFS version 4 NULL procedure. | ||||
| There is no prohibition against an NFS version 4 COMPOUND procedure | There is no prohibition against an NFS version 4 COMPOUND procedure | |||
| constructed with both a READ and WRITE operation, say. Thus it is | constructed with both a READ and WRITE operation, say. Thus it is | |||
| possible for NFS version 4 COMPOUND procedures to use both the Read | possible for NFS version 4 COMPOUND procedures to use both the Read | |||
| list and Write list simultaneously. An NFS client MAY provide a Read | list and Write list simultaneously. An NFS client MAY provide a Read | |||
| list and a Write list in the same transaction if it is sending a Long | list and a Write list in the same transaction if it is sending a Long | |||
| Call or Reply. | Call or Reply. | |||
| Some remarks need to be made about how NFS version 4 clients estimate | If an NFS client has not provided enough bytes in a Read list to | |||
| reply size, and how DDP-eligibility violations are reported. | match the size of a DDP-eligible NFS argument data item, or if an NFS | |||
| | client has not provided enough Write list resources to handle a READ | |||
| or READLINK operation, or if the client has not provided a large | ||||
| enough Reply chunk to convey an NFS reply, the server MUST return one | ||||
| of: | ||||
| 4.1. NFS Version 4 COMPOUND Considerations | o An RPC-over-RDMA message of type RDMA_ERROR, with the rdma_xid | |||
| field set to the XID of the matching NFS Call, and the rdma_error | ||||
| field set to ERR_CHUNK; or | ||||
| o An RPC message with the mtype field set to REPLY, the stat field | ||||
| set to MSG_ACCEPTED, and the accept_stat field set to | ||||
| GARBAGE_ARGS. | ||||
| 4.2. Reply Size Estimation | ||||
| An NFS client provides a Reply chunk when the maximum possible reply | ||||
| size is larger than the client's responder inline threshold. NFS | ||||
| clients successfully estimate the maximum reply size of most | ||||
| operations in order to provide an adequate set of buffers to receive | ||||
| each NFS reply. | ||||
| There are certain NFSv4 data items whose size cannot be reliably | ||||
| estimated by clients, however, because there is no protocol-specified | ||||
| size limit on these structures. These include but are not limited to | ||||
| opaque types such as the attrlist4 field; fields containing ACLs such | ||||
| as fattr4_acl, fattr4_dacl, fattr4_sacl; fields in the fs_locations4 | ||||
| and fs_locations_info4 data structures; and opaque fields loc_body, | ||||
| loh_body, da_addr_body, lou_body, lrf_body, fattr_layout_types and | ||||
| fs_layout_types, which pertain to pNFS layout metadata. | ||||
| 4.3. NFS Version 4 COMPOUND Considerations | ||||
| An NFS version 4 COMPOUND procedure supplies arguments for a sequence | An NFS version 4 COMPOUND procedure supplies arguments for a sequence | |||
| of operations, and returns results from that sequence. A client MAY | of operations, and returns results from that sequence. A client MAY | |||
| construct an NFS version 4 COMPOUND procedure that uses more than one | construct an NFS version 4 COMPOUND procedure that uses more than one | |||
| chunk in either the Read list or Write list. The NFS client provides | chunk in either the Read list or Write list. The NFS client provides | |||
| XDR Position values in each Read chunk to disambiguate which chunk is | XDR Position values in each Read chunk to disambiguate which chunk is | |||
| associated with which XDR data item. | associated with which XDR data item. | |||
| However NFS server and client implementations must agree in advance | However NFS server and client implementations must agree in advance | |||
| on how to pair Write chunks with returned result data items. The | on how to pair Write chunks with returned result data items. The | |||
| skipping to change at page 8, line 19 | skipping to change at page 9, line 7 | |||
| Unlike NFS versions 2 and 3, the maximum size of an NFS version 4 | Unlike NFS versions 2 and 3, the maximum size of an NFS version 4 | |||
| COMPOUND is not bounded. However, typical NFS version 4 clients | COMPOUND is not bounded. However, typical NFS version 4 clients | |||
| rarely issue such problematic requests. In practice, NFS version 4 | rarely issue such problematic requests. In practice, NFS version 4 | |||
| clients behave in much more predictable ways. Rsize and wsize apply | clients behave in much more predictable ways. Rsize and wsize apply | |||
| to COMPOUND operations by capping the total amount of data payload | to COMPOUND operations by capping the total amount of data payload | |||
| allowed in each COMPOUND. An extension to NFS version 4 supporting a | allowed in each COMPOUND. An extension to NFS version 4 supporting a | |||
| comprehensive exchange of upper-layer message size parameters is part | comprehensive exchange of upper-layer message size parameters is part | |||
| of [RFC5661]. | of [RFC5661]. | |||
| 4.2. NFS Version 4 Callbacks | 4.4. NFS Version 4 Callback | |||
| The NFS version 4 protocols support server-initiated callbacks to | The NFS version 4 protocols support server-initiated callbacks to | |||
| notify clients of events such as recalled delegations. There are no | notify clients of events such as recalled delegations. There are no | |||
| DDP-eligible data items in callback protocols associated with | DDP-eligible data items in callback protocols associated with | |||
| NFSv4.0, NFSv4.1, or NFSv4.2. | NFSv4.0, NFSv4.1, or NFSv4.2. | |||
| In NFS version 4.1 and 4.2, callback operations may appear on the | In NFS version 4.1 and 4.2, callback operations may appear on the | |||
| same connection as one used for NFS version 4 client requests. To | same connection as one used for NFS version 4 client requests. NFS | |||
| operate on RPC-over-RDMA transports, NFS version 4 clients and | version 4 clients and servers MUST use the mechanism described in | |||
| servers MUST use the mechanism described in | [I-D.ietf-nfsv4-rpcrdma-bidirection] when backchannel operations are | |||
| [I-D.ietf-nfsv4-rpcrdma-bidirection]. | conveyed on RPC-over-RDMA transports. | |||
| 5. IANA Considerations | 5. IANA Considerations | |||
| NFS use of direct data placement introduces a need for an additional | NFS use of direct data placement introduces a need for an additional | |||
| NFS port number assignment for networks that share traditional UDP | NFS port number assignment for networks that share traditional UDP | |||
| and TCP port spaces with RDMA services. The iWARP [RFC5041] | and TCP port spaces with RDMA services. The iWARP [RFC5041] | |||
| [RFC5040] protocol is such an example (InfiniBand is not). | [RFC5040] protocol is such an example (InfiniBand is not). | |||
| NFS servers for versions 2 and 3 [RFC1094] [RFC1813] traditionally | NFS servers for versions 2 and 3 [RFC1094] [RFC1813] traditionally | |||
| listen for clients on UDP and TCP port 2049, and additionally, they | listen for clients on UDP and TCP port 2049, and additionally, they | |||
| End of changes. 37 change blocks. | ||||
| 97 lines changed or deleted | 133 lines changed or added | |||
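
The DDP-eligibility rules quoted in Section 3 and in Section 4.1 of the newer draft can be read as a small lookup table. The Python sketch below is only an illustration of that reading. The names `READ_CHUNK_ELIGIBLE`, `WRITE_CHUNK_ELIGIBLE`, `may_use_read_chunk`, and `may_use_write_chunk`, as well as the `"v2v3"` and `"v4"` keys, are hypothetical and do not appear in either draft.

```python
# Hypothetical sketch of the DDP-eligibility rules; all names are illustrative.

# Argument data items that a client MAY supply through a Read chunk.
READ_CHUNK_ELIGIBLE = {
    "v2v3": {"WRITE", "SYMLINK"},        # NFS version 2 and 3 procedures
    "v4": {"WRITE", "CREATE(NF4LNK)"},   # operations inside an NFSv4 COMPOUND
}

# Result data items that a client MAY receive through a Write chunk.
WRITE_CHUNK_ELIGIBLE = {
    "v2v3": {"READ", "READLINK"},
    # READ_PLUS exists only in NFS version 4.2.
    "v4": {"READ", "READ_PLUS", "READLINK"},
}


def may_use_read_chunk(version: str, op: str) -> bool:
    """Return True if `op` carries a DDP-eligible argument data item."""
    return op in READ_CHUNK_ELIGIBLE[version]


def may_use_write_chunk(version: str, op: str) -> bool:
    """Return True if `op` returns a DDP-eligible result data item."""
    return op in WRITE_CHUNK_ELIGIBLE[version]
```

Under this reading, the NFS version 4 NULL procedure and every callback operation fall outside both sets, which matches the statement that no NFSv4.x callback operation is DDP-eligible.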
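
The reply-size and resource checks described in Sections 3, 4.1, and 4.2 of the newer draft can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not an implementation from either draft: the `ClientResources` type, its fields, and both function names are hypothetical, and sizes are treated as plain byte counts.

```python
from dataclasses import dataclass


@dataclass
class ClientResources:
    """Hypothetical summary of what a client advertised in one request."""
    read_list_bytes: int    # total bytes available through the Read list
    write_list_bytes: int   # total bytes available through the Write list
    reply_chunk_bytes: int  # size of the Reply chunk, 0 if none was provided


def needs_reply_chunk(max_reply_size: int, responder_inline_threshold: int) -> bool:
    """Client side: provide a Reply chunk when the maximum possible reply
    is larger than the client's responder inline threshold."""
    return max_reply_size > responder_inline_threshold


def resources_insufficient(res: ClientResources, arg_item_bytes: int,
                           result_item_bytes: int, long_reply_bytes: int) -> bool:
    """Server side: True if any advertised resource is too small.

    Pass 0 for a requirement that does not apply to the request.
    """
    return (res.read_list_bytes < arg_item_bytes
            or res.write_list_bytes < result_item_bytes
            or res.reply_chunk_bytes < long_reply_bytes)
```

When `resources_insufficient` returns True, the text requires the server to return either an RDMA_ERROR message with rdma_error set to ERR_CHUNK or an RPC reply with accept_stat set to GARBAGE_ARGS. The sketch deliberately does not choose between the two, because the draft leaves that choice to the server.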