| < draft-ietf-nfsv4-rfc5667bis-07.txt | draft-ietf-nfsv4-rfc5667bis-08.txt > | |||
|---|---|---|---|---|
| Network File System Version 4 C. Lever, Ed. | Network File System Version 4 C. Lever, Ed. | |||
| Internet-Draft Oracle | Internet-Draft Oracle | |||
| Obsoletes: 5667 (if approved) March 9, 2017 | Obsoletes: 5667 (if approved) April 4, 2017 | |||
| Intended status: Standards Track | Intended status: Standards Track | |||
| Expires: September 10, 2017 | Expires: October 6, 2017 | |||
| Network File System (NFS) Upper Layer Binding To RPC-Over-RDMA Version | Network File System (NFS) Upper Layer Binding To RPC-Over-RDMA Version | |||
| One | One | |||
| draft-ietf-nfsv4-rfc5667bis-07 | draft-ietf-nfsv4-rfc5667bis-08 | |||
| Abstract | Abstract | |||
| This document specifies Upper Layer Bindings of Network File System | This document specifies Upper Layer Bindings of Network File System | |||
| (NFS) protocol versions to RPC-over-RDMA Version One, enabling the | (NFS) protocol versions to RPC-over-RDMA Version One, enabling the | |||
| use of Direct Data Placement. This document obsoletes RFC 5667. | use of Direct Data Placement. This document obsoletes RFC 5667. | |||
| Requirements Language | Requirements Language | |||
| The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | |||
| skipping to change at page 1, line 40 ¶ | skipping to change at page 1, line 40 ¶ | |||
| Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
| Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
| working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
| Drafts is at http://datatracker.ietf.org/drafts/current/. | Drafts is at http://datatracker.ietf.org/drafts/current/. | |||
| Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
| and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
| time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
| material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
| This Internet-Draft will expire on September 10, 2017. | This Internet-Draft will expire on October 6, 2017. | |||
| Copyright Notice | Copyright Notice | |||
| Copyright (c) 2017 IETF Trust and the persons identified as the | Copyright (c) 2017 IETF Trust and the persons identified as the | |||
| document authors. All rights reserved. | document authors. All rights reserved. | |||
| This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
| Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
| (http://trustee.ietf.org/license-info) in effect on the date of | (http://trustee.ietf.org/license-info) in effect on the date of | |||
| publication of this document. Please review these documents | publication of this document. Please review these documents | |||
| skipping to change at page 2, line 37 ¶ | skipping to change at page 2, line 37 ¶ | |||
| 2.1. Short Reply Chunk Retry . . . . . . . . . . . . . . . . . 4 | 2.1. Short Reply Chunk Retry . . . . . . . . . . . . . . . . . 4 | |||
| 3. Upper Layer Binding for NFS Versions 2 and 3 . . . . . . . . 5 | 3. Upper Layer Binding for NFS Versions 2 and 3 . . . . . . . . 5 | |||
| 3.1. Reply Size Estimation . . . . . . . . . . . . . . . . . . 5 | 3.1. Reply Size Estimation . . . . . . . . . . . . . . . . . . 5 | |||
| 3.2. RPC Binding Considerations . . . . . . . . . . . . . . . 5 | 3.2. RPC Binding Considerations . . . . . . . . . . . . . . . 5 | |||
| 4. Upper Layer Bindings for NFS Version 2 and 3 Auxiliary | 4. Upper Layer Bindings for NFS Version 2 and 3 Auxiliary | |||
| Protocols . . . . . . . . . . . . . . . . . . . . . . . . . . 6 | Protocols . . . . . . . . . . . . . . . . . . . . . . . . . . 6 | |||
| 4.1. MOUNT, NLM, and NSM Protocols . . . . . . . . . . . . . . 6 | 4.1. MOUNT, NLM, and NSM Protocols . . . . . . . . . . . . . . 6 | |||
| 4.2. NFSACL Protocol . . . . . . . . . . . . . . . . . . . . . 6 | 4.2. NFSACL Protocol . . . . . . . . . . . . . . . . . . . . . 6 | |||
| 5. Upper Layer Binding For NFS Version 4 . . . . . . . . . . . . 7 | 5. Upper Layer Binding For NFS Version 4 . . . . . . . . . . . . 7 | |||
| 5.1. DDP-Eligibility . . . . . . . . . . . . . . . . . . . . . 7 | 5.1. DDP-Eligibility . . . . . . . . . . . . . . . . . . . . . 7 | |||
| 5.2. Reply Size Estimation . . . . . . . . . . . . . . . . . . 8 | 5.2. Reply Size Estimation . . . . . . . . . . . . . . . . . . 7 | |||
| 5.3. RPC Binding Considerations . . . . . . . . . . . . . . . 9 | 5.3. RPC Binding Considerations . . . . . . . . . . . . . . . 8 | |||
| 5.4. NFS COMPOUND Requests . . . . . . . . . . . . . . . . . . 10 | 5.4. NFS COMPOUND Requests . . . . . . . . . . . . . . . . . . 8 | |||
| 5.5. NFS Callback Requests . . . . . . . . . . . . . . . . . . 11 | 5.5. NFS Callback Requests . . . . . . . . . . . . . . . . . . 10 | |||
| 5.6. Session-Related Considerations . . . . . . . . . . . . . 12 | 5.6. Session-Related Considerations . . . . . . . . . . . . . 11 | |||
| 5.7. Transport Considerations . . . . . . . . . . . . . . . . 13 | 5.7. Transport Considerations . . . . . . . . . . . . . . . . 12 | |||
| 6. Extending NFS Upper Layer Bindings . . . . . . . . . . . . . 14 | 6. Extending NFS Upper Layer Bindings . . . . . . . . . . . . . 13 | |||
| 7. Security Considerations . . . . . . . . . . . . . . . . . . . 14 | 7. Security Considerations . . . . . . . . . . . . . . . . . . . 13 | |||
| 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 14 | 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 13 | |||
| 9. References . . . . . . . . . . . . . . . . . . . . . . . . . 15 | 9. References . . . . . . . . . . . . . . . . . . . . . . . . . 14 | |||
| 9.1. Normative References . . . . . . . . . . . . . . . . . . 15 | 9.1. Normative References . . . . . . . . . . . . . . . . . . 14 | |||
| 9.2. Informative References . . . . . . . . . . . . . . . . . 16 | 9.2. Informative References . . . . . . . . . . . . . . . . . 15 | |||
| Appendix A. Changes Since RFC 5667 . . . . . . . . . . . . . . . 17 | Appendix A. Changes Since RFC 5667 . . . . . . . . . . . . . . . 16 | |||
| Appendix B. Acknowledgments . . . . . . . . . . . . . . . . . . 18 | Appendix B. Acknowledgments . . . . . . . . . . . . . . . . . . 17 | |||
| Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 18 | Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 17 | |||
| 1. Introduction | 1. Introduction | |||
| The RPC-over-RDMA Version One transport may employ direct data | The RPC-over-RDMA Version One transport may employ direct data | |||
| placement to convey data payloads associated with RPC transactions | placement to convey data payloads associated with RPC transactions | |||
| [I-D.ietf-nfsv4-rfc5666bis]. To enable successful interoperation, | [I-D.ietf-nfsv4-rfc5666bis]. To enable successful interoperation, | |||
| RPC client and server implementations using RPC-over-RDMA Version One | RPC client and server implementations using RPC-over-RDMA Version One | |||
| must agree which XDR data items and RPC procedures are eligible to | must agree which XDR data items and RPC procedures are eligible to | |||
| use direct data placement (DDP). | use direct data placement (DDP). | |||
| skipping to change at page 6, line 27 ¶ | skipping to change at page 6, line 27 ¶ | |||
| o Versions 1, 3, and 4 of the NLM protocol [RFC1813] | o Versions 1, 3, and 4 of the NLM protocol [RFC1813] | |||
| o Version 1 of the NSM protocol, described in Chapter 11 of [XNFS] | o Version 1 of the NSM protocol, described in Chapter 11 of [XNFS] | |||
| o Version 1 of the NFSACL protocol, which does not have a public | o Version 1 of the NFSACL protocol, which does not have a public | |||
| definition. NFSACL is treated in this document as a de facto | definition. NFSACL is treated in this document as a de facto | |||
| standard, as there are several interoperating implementations. | standard, as there are several interoperating implementations. | |||
| 4.1. MOUNT, NLM, and NSM Protocols | 4.1. MOUNT, NLM, and NSM Protocols | |||
| Typically MOUNT, NLM, and NSM are conveyed via TCP, even in | Historically, NFS/RDMA implementations have chosen to convey the | |||
| deployments where the NFS RPC Program operates on RPC-over-RDMA | MOUNT, NLM, and NSM protocols via TCP. To enable interoperation of | |||
| Version One. | these protocols when NFS/RDMA is in use, a legacy NFS server MUST | |||
| provide TCP-based MOUNT, NLM, and NSM services. | ||||
| No XDR data item in these protocols is DDP-eligible, therefore a | ||||
| special port assignment for operation on RPC-over-RDMA is not | ||||
| necessary. When a Legacy server supports these RPC Programs on RPC- | ||||
| over-RDMA Version One, it advertises an arbitrarily-chosen service | ||||
| port address via the rpcbind service [RFC1833]. | ||||
| The largest variable-length XDR data items in these protocols are | ||||
| defined in [XNFS]: LM_MAXSTRLEN is 1024 bytes, LM_MAXNAMELEN is | ||||
| LM_MAXSTRLEN + 1, and MAXNETOBJ_SZ is 1024 bytes. Reply size | ||||
| estimation for these protocols uses the criteria outlined in | ||||
| Section 2. There are no operations in these protocols that benefit | ||||
| from short Reply chunk retry. | ||||
| 4.2. NFSACL Protocol | 4.2. NFSACL Protocol | |||
| Legacy clients and servers that support the NFSACL RPC Program | Legacy clients and servers that support the NFSACL RPC Program | |||
| typically convey NFSACL procedures on the same connection as NFS RPC | typically convey NFSACL procedures on the same connection as NFS RPC | |||
| Programs. This obviates the need for separate rpcbind queries to | Programs. This obviates the need for separate rpcbind queries to | |||
| discover server support for this RPC Program. | discover server support for this RPC Program. | |||
| ACLs are typically small, but even large ACLs must be encoded and | ACLs are typically small, but even large ACLs must be encoded and | |||
| decoded to some degree. Thus no data item in this Upper Layer | decoded to some degree. Thus no data item in this Upper Layer | |||
| skipping to change at page 7, line 42 ¶ | skipping to change at page 7, line 28 ¶ | |||
| NFS version 4 minor versions are DDP-eligible: | NFS version 4 minor versions are DDP-eligible: | |||
| o The opaque data field in the WRITE4args structure | o The opaque data field in the WRITE4args structure | |||
| o The linkdata field of the NF4LNK arm in the createtype4 union | o The linkdata field of the NF4LNK arm in the createtype4 union | |||
| o The opaque data field in the READ4resok structure | o The opaque data field in the READ4resok structure | |||
| o The linkdata field in the READLINK4resok structure | o The linkdata field in the READLINK4resok structure | |||
| o In minor version 2 and newer, the rpc_data field of the | ||||
| read_plus_content union (further restrictions on the use of this | ||||
| data item follow below). | ||||
| 5.1.1. READ_PLUS Replies | ||||
| The NFS version 4.2 READ_PLUS operation returns a complex data type | ||||
| [RFC7862]. The rpr_contents field in the result of this operation is | ||||
| an array of read_plus_content unions, one arm of which contains an | ||||
| opaque byte stream (d_data). | ||||
| The size of d_data is limited to the value of the rpa_count field, | ||||
| but the protocol does not bound the number of elements which can be | ||||
| returned in the rpr_contents array. In order to make the size of | ||||
| READ_PLUS replies predictable by NFS version 4.2 clients, the | ||||
| following restrictions are placed on the use of the READ_PLUS | ||||
| operation on an RPC-over-RDMA Version One transport: | ||||
| o An NFS version 4.2 client MUST NOT provide more than one Write | ||||
| chunk for any READ_PLUS operation. When providing a Write chunk | ||||
| for a READ_PLUS operation, an NFS version 4.2 client MUST provide | ||||
| a Write chunk that is either empty (which forces all result data | ||||
| items for this operation to be returned inline) or large enough to | ||||
| receive rpa_count bytes in a single element of the rpr_contents | ||||
| array. | ||||
| o If the Write chunk provided for a READ_PLUS operation by an NFS | ||||
| version 4.2 client is not empty, an NFS version 4.2 server MUST | ||||
| use that chunk for the first element of the rpr_contents array | ||||
| that has an rpc_data arm. | ||||
| o An NFS version 4.2 server MUST NOT return more than two elements | ||||
| in the rpr_contents array of any READ_PLUS operation. It returns | ||||
| as much of the requested byte range as it can fit within these two | ||||
| elements. If the NFS version 4.2 server has not asserted rpr_eof | ||||
| in the reply, the NFS version 4.2 client SHOULD send additional | ||||
| READ_PLUS requests for any remaining bytes. | ||||
| 5.2. Reply Size Estimation | 5.2. Reply Size Estimation | |||
| Within NFS version 4, there are certain variable-length result data | Within NFS version 4, there are certain variable-length result data | |||
| items whose maximum size cannot be estimated by clients reliably | items whose maximum size cannot be estimated by clients reliably | |||
| because there is no protocol-specified size limit on these arrays. | because there is no protocol-specified size limit on these arrays. | |||
| These include: | These include: | |||
| o The attrlist4 field | o The attrlist4 field | |||
| o Fields containing ACLs such as fattr4_acl, fattr4_dacl, | o Fields containing ACLs such as fattr4_acl, fattr4_dacl, | |||
| skipping to change at page 10, line 16 ¶ | skipping to change at page 9, line 8 ¶ | |||
| 5.4.1. Multiple DDP-eligible Data Items | 5.4.1. Multiple DDP-eligible Data Items | |||
| An NFS version 4 COMPOUND procedure can contain more than one | An NFS version 4 COMPOUND procedure can contain more than one | |||
| operation that carries a DDP-eligible data item. An NFS version 4 | operation that carries a DDP-eligible data item. An NFS version 4 | |||
| client provides XDR Position values in each Read chunk to | client provides XDR Position values in each Read chunk to | |||
| disambiguate which chunk is associated with which argument data item. | disambiguate which chunk is associated with which argument data item. | |||
| However, NFS version 4 server and client implementations must agree in | However, NFS version 4 server and client implementations must agree in | |||
| advance on how to pair Write chunks with returned result data items. | advance on how to pair Write chunks with returned result data items. | |||
| In the following list, an "NFS Read" operation refers to any NFS | In the following list, a "READ operation" refers to any NFS Version 4 | |||
| Version 4 operation which has a DDP-eligible result data item (i.e., | operation which has a DDP-eligible result data item. The mechanism | |||
| either a READ, READ_PLUS, or READLINK operation). The mechanism | ||||
| specified in Section 4.3.2 of [I-D.ietf-nfsv4-rfc5666bis] is applied | specified in Section 4.3.2 of [I-D.ietf-nfsv4-rfc5666bis] is applied | |||
| to this class of operations: | to this class of operations: | |||
| o If an NFS version 4 client wishes all DDP-eligible items in an NFS | o If an NFS version 4 client wishes all DDP-eligible items in an NFS | |||
| reply to be conveyed inline, it leaves the Write list empty. | reply to be conveyed inline, it leaves the Write list empty. | |||
| o The first chunk in the Write list MUST be used by the first READ | o The first chunk in the Write list MUST be used by the first READ | |||
| operation in an NFS version 4 COMPOUND procedure. The next Write | operation in an NFS version 4 COMPOUND procedure. The next Write | |||
| chunk is used by the next READ operation, and so on. | chunk is used by the next READ operation, and so on. | |||
| skipping to change at page 15, line 26 ¶ | skipping to change at page 14, line 26 ¶ | |||
| This document should be listed as the reference for the nfsrdma port | This document should be listed as the reference for the nfsrdma port | |||
| assignments. This document does not alter these assignments. | assignments. This document does not alter these assignments. | |||
| 9. References | 9. References | |||
| 9.1. Normative References | 9.1. Normative References | |||
| [I-D.ietf-nfsv4-rfc5666bis] | [I-D.ietf-nfsv4-rfc5666bis] | |||
| Lever, C., Simpson, W., and T. Talpey, "Remote Direct | Lever, C., Simpson, W., and T. Talpey, "Remote Direct | |||
| Memory Access Transport for Remote Procedure Call, Version | Memory Access Transport for Remote Procedure Call, Version | |||
| One", draft-ietf-nfsv4-rfc5666bis-10 (work in progress), | One", draft-ietf-nfsv4-rfc5666bis-11 (work in progress), | |||
| February 2017. | March 2017. | |||
| [I-D.ietf-nfsv4-rpcrdma-bidirection] | [I-D.ietf-nfsv4-rpcrdma-bidirection] | |||
| Lever, C., "Bi-directional Remote Procedure Call On RPC- | Lever, C., "Bi-directional Remote Procedure Call On RPC- | |||
| over-RDMA Transports", draft-ietf-nfsv4-rpcrdma- | over-RDMA Transports", draft-ietf-nfsv4-rpcrdma- | |||
| bidirection-08 (work in progress), March 2017. | bidirection-08 (work in progress), March 2017. | |||
| [RFC1833] Srinivasan, R., "Binding Protocols for ONC RPC Version 2", | [RFC1833] Srinivasan, R., "Binding Protocols for ONC RPC Version 2", | |||
| RFC 1833, DOI 10.17487/RFC1833, August 1995, | RFC 1833, DOI 10.17487/RFC1833, August 1995, | |||
| <http://www.rfc-editor.org/info/rfc1833>. | <http://www.rfc-editor.org/info/rfc1833>. | |||
| skipping to change at page 18, line 9 ¶ | skipping to change at page 17, line 9 ¶ | |||
| A section discussing NFS version 4 retransmission and connection loss | A section discussing NFS version 4 retransmission and connection loss | |||
| has been added. | has been added. | |||
| The following additional improvements have been made, relative to | The following additional improvements have been made, relative to | |||
| [RFC5667]: | [RFC5667]: | |||
| o An explicit discussion of NFS version 4.0 and NFS version 4.1 | o An explicit discussion of NFS version 4.0 and NFS version 4.1 | |||
| backchannel operation has replaced the previous treatment of | backchannel operation has replaced the previous treatment of | |||
| callback operations. | callback operations. | |||
| o A binding for NFS version 4.2 has been added that includes | o A binding for NFS version 4.2 has been added. | |||
| discussion of new data-bearing operations like READ_PLUS. | ||||
| o A section suggesting a mechanism for periodically assessing | o A section suggesting a mechanism for periodically assessing | |||
| connection health has been introduced. | connection health has been introduced. | |||
| o Ambiguous or erroneous uses of RFC2119 terms have been corrected. | o Ambiguous or erroneous uses of RFC2119 terms have been corrected. | |||
| o References to obsolete RFCs have been updated. | o References to obsolete RFCs have been updated. | |||
| o An IANA Considerations Section has been added, which specifies the | o An IANA Considerations Section has been added, which specifies the | |||
| port assignments for NFS/RDMA. This replaces the example | port assignments for NFS/RDMA. This replaces the example | |||
| End of changes. 10 change blocks. | ||||
| 80 lines changed or deleted | 28 lines changed or added | |||
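
The Write chunk pairing rule quoted from Section 5.4.1 can be sketched as follows. This is a minimal illustration and not part of either draft. The function name `pair_write_chunks`, the chunk labels, and the use of operation-name strings are hypothetical. The set of READ-class operations is taken from the -08 DDP-eligibility list (READ and READLINK). Treating READ-class operations beyond the supplied chunks as inline is an assumption, because the text that would cover that case is elided from this diff.

```python
from typing import List, Optional, Sequence, Tuple

# Operations with a DDP-eligible result data item in the -08 text:
# the data field of READ4resok and the linkdata field of READLINK4resok.
DDP_RESULT_OPS = frozenset({"READ", "READLINK"})


def pair_write_chunks(
    ops: Sequence[str], write_list: Sequence[str]
) -> List[Tuple[str, Optional[str]]]:
    """Pair each COMPOUND operation with a Write chunk label, or None for inline.

    Hypothetical helper: chunks are consumed in Write list order by
    READ-class operations only. An empty Write list leaves every result inline.
    """
    chunks = iter(write_list)
    pairing = []
    for op in ops:
        if op in DDP_RESULT_OPS:
            # First chunk goes to the first READ-class operation, and so on.
            # Assumption: once chunks run out, the result is returned inline.
            chunk = next(chunks, None)
        else:
            chunk = None
        pairing.append((op, chunk))
    return pairing


if __name__ == "__main__":
    compound = ["PUTFH", "READ", "GETATTR", "READLINK"]
    for op, chunk in pair_write_chunks(compound, ["chunk0", "chunk1"]):
        print(op, "->", chunk if chunk is not None else "inline")
```

Pairing by position in the Write list, rather than by an explicit identifier, matches the quoted text: the client and server agree in advance on the order, so no per-chunk tag is needed on the wire.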