| < draft-ietf-nfsv4-nfsdirect-03.txt | draft-ietf-nfsv4-nfsdirect-04.txt > | |||
|---|---|---|---|---|
| Internet-Draft Tom Talpey | Internet-Draft Tom Talpey | |||
| Expires: December 2006 Brent Callaghan | Expires: April 2007 Brent Callaghan | |||
| Document: draft-ietf-nfsv4-nfsdirect-03 June, 2006 | Document: draft-ietf-nfsv4-nfsdirect-04 October, 2007 | |||
| NFS Direct Data Placement | NFS Direct Data Placement | |||
| Status of this Memo | Status of this Memo | |||
| By submitting this Internet-Draft, each author represents that any | By submitting this Internet-Draft, each author represents that any | |||
| applicable patent or other IPR claims of which he or she is aware | applicable patent or other IPR claims of which he or she is aware | |||
| have been or will be disclosed, and any of which he or she becomes | have been or will be disclosed, and any of which he or she becomes | |||
| aware will be disclosed, in accordance with Section 6 of BCP 79. | aware will be disclosed, in accordance with Section 6 of BCP 79. | |||
| skipping to change at page 2, line 9 ¶ | skipping to change at page 2, line 9 ¶ | |||
| movement over the network to be implemented in RDMA hardware. This | movement over the network to be implemented in RDMA hardware. This | |||
| draft describes the use of direct data placement by means of server- | draft describes the use of direct data placement by means of server- | |||
| initiated RDMA operations into client-supplied buffers in a Chunk | initiated RDMA operations into client-supplied buffers in a Chunk | |||
| list for implementations of NFS versions 2, 3, and 4 over an RDMA | list for implementations of NFS versions 2, 3, and 4 over an RDMA | |||
| transport. | transport. | |||
| Table of Contents | Table of Contents | |||
| 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 | |||
| 2. Transfers from NFS Client to NFS Server . . . . . . . . . . 2 | 2. Transfers from NFS Client to NFS Server . . . . . . . . . . 2 | |||
| 3. Transfers from NFS Server to NFS Client . . . . . . . . . . 2 | 3. Transfers from NFS Server to NFS Client . . . . . . . . . . 3 | |||
| 4. NFS Versions 2 and 3 Mapping . . . . . . . . . . . . . . . . 4 | 4. NFS Versions 2 and 3 Mapping . . . . . . . . . . . . . . . . 4 | |||
| 5. NFS Version 4 Mapping . . . . . . . . . . . . . . . . . . . 5 | 5. NFS Version 4 Mapping . . . . . . . . . . . . . . . . . . . 5 | |||
| 6. Security . . . . . . . . . . . . . . . . . . . . . . . . . . 7 | 6. Security . . . . . . . . . . . . . . . . . . . . . . . . . . 7 | |||
| 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . 7 | 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . 7 | |||
| 8. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 8 | 8. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 8 | |||
| 9. Normative References . . . . . . . . . . . . . . . . . . . . 8 | 9. Normative References . . . . . . . . . . . . . . . . . . . . 8 | |||
| 10. Informative References . . . . . . . . . . . . . . . . . . 8 | 10. Informative References . . . . . . . . . . . . . . . . . . 9 | |||
| 11. Authors' Addresses . . . . . . . . . . . . . . . . . . . . 9 | 11. Authors' Addresses . . . . . . . . . . . . . . . . . . . . 9 | |||
| 12. Intellectual Property and Copyright Statements . . . . . . 9 | 12. Intellectual Property and Copyright Statements . . . . . 10 | |||
| Acknowledgement . . . . . . . . . . . . . . . . . . . . . . . 10 | Acknowledgement . . . . . . . . . . . . . . . . . . . . . . . 10 | |||
| Requirements Language | ||||
| The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | ||||
| "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | ||||
| document are to be interpreted as described in [RFC2119]. | ||||
| 1. Introduction | 1. Introduction | |||
| The RDMA Transport for ONC RPC [RPCRDMA] allows an RPC client | The RDMA Transport for ONC RPC [RPCRDMA] allows an RPC client | |||
| application to post buffers in a Chunk list for specific arguments | application to post buffers in a Chunk list for specific arguments | |||
| and results from an RPC call. The RDMA transport header conveys this | and results from an RPC call. The RDMA transport header conveys this | |||
| list of client buffer addresses to the server where the application | list of client buffer addresses to the server where the application | |||
| can associate them with client data and use RDMA operations to | can associate them with client data and use RDMA operations to | |||
| transfer the results directly to and from the posted buffers on the | transfer the results directly to and from the posted buffers on the | |||
| client. The client and server must agree on a consistent mapping of | client. The client and server must agree on a consistent mapping of | |||
| posted buffers to RPC. This document details the mapping for each | posted buffers to RPC. This document details the mapping for each | |||
| version of the NFS protocol [RFC1831] [RFC1832] [RFC1094] [RFC1813] | version of the NFS protocol [RFC1831] [RFC1832] [RFC1094] [RFC1813] | |||
| [RFC3530] [NFSv4.1]. | [RFC3530] [NFSv4.1]. | |||
| 2. Transfers from NFS Client to NFS Server | 2. Transfers from NFS Client to NFS Server | |||
| The RDMA Read list, in the RDMA transport header, allows an RPC | The RDMA Read list, in the RDMA transport header, allows an RPC | |||
| client to marshal RPC call data selectively. Large chunks of data, | client to marshal RPC call data selectively. Large chunks of data, | |||
| such as the file data of an NFS WRITE request, may be referenced by | such as the file data of an NFS WRITE request, MAY be referenced by | |||
| an RDMA Read list and be moved efficiently and directly-placed by an | an RDMA Read list and be moved efficiently and directly-placed by an | |||
| RDMA READ operation initiated by the server. | RDMA READ operation initiated by the server. | |||
| The process of identifying these chunks for the RDMA Read list can be | The process of identifying these chunks for the RDMA Read list can be | |||
| implemented entirely within the RPC layer. It is transparent to the | implemented entirely within the RPC layer. It is transparent to the | |||
| upper-level protocol, such as NFS. For instance, the file data | upper-level protocol, such as NFS. For instance, the file data | |||
| portion of an NFS WRITE request can be selected as an RDMA "chunk" | portion of an NFS WRITE request can be selected as an RDMA "chunk" | |||
| within the XDR marshalling code of RPC based on a size criterion, | within the XDR marshaling code of RPC based on a size criterion, | |||
| independently of the NFS protocol layer. The XDR unmarshalling on the | independently of the NFS protocol layer. The XDR unmarshaling on the | |||
| receiving system can identify the correspondence between Read chunks | receiving system can identify the correspondence between Read chunks | |||
| and protocol elements via the XDR position value encoded in the Read | and protocol elements via the XDR position value encoded in the Read | |||
| chunk entry. | chunk entry. | |||
| RPC RDMA Read chunks are employed by this NFS mapping to convey | RPC RDMA Read chunks are employed by this NFS mapping to convey | |||
| specific NFS data to the server in a manner which may be directly | specific NFS data to the server in a manner which may be directly | |||
| placed. The following sections describe this mapping for versions of | placed. The following sections describe this mapping for versions of | |||
| the NFS protocol. | the NFS protocol. | |||
| 3. Transfers from NFS Server to NFS Client | 3. Transfers from NFS Server to NFS Client | |||
| skipping to change at page 3, line 42 ¶ | skipping to change at page 3, line 47 ¶ | |||
| struct xdr_write_chunk { | struct xdr_write_chunk { | |||
| struct xdr_rdma_segment target<>; | struct xdr_rdma_segment target<>; | |||
| }; | }; | |||
| struct xdr_write_list { | struct xdr_write_list { | |||
| struct xdr_write_chunk entry; | struct xdr_write_chunk entry; | |||
| struct xdr_write_list *next; | struct xdr_write_list *next; | |||
| }; | }; | |||
| The sum of the segment lengths yields the total size of the buffer, | The sum of the segment lengths yields the total size of the buffer, | |||
| which must be large enough to accept the result. If the buffer is | which MUST be large enough to accept the result. If the buffer is | |||
| too small, the server must return an XDR encode error. The server | too small, the server MUST return an XDR encode error. The server | |||
| must return the result data for a posted buffer by progressively | MUST return the result data for a posted buffer by progressively | |||
| filling its segments, perhaps leaving some trailing segments unfilled | filling its segments, perhaps leaving some trailing segments unfilled | |||
| or partially full if the size of the result is less than the total | or partially full if the size of the result is less than the total | |||
| size of the buffer segments. | size of the buffer segments. | |||
| The server returns the RDMA Write list to the client with the segment | The server returns the RDMA Write list to the client with the segment | |||
| length fields overwritten to indicate the amount of data RDMA Written | length fields overwritten to indicate the amount of data RDMA Written | |||
| to each segment. Results returned by direct placement must not be | to each segment. Results returned by direct placement MUST NOT be | |||
| returned by other methods, e.g. by read chunk list or inline. If no | returned by other methods, e.g. by read chunk list or inline. If no | |||
| result data at all is returned for the element, the server places no | result data at all is returned for the element, the server places no | |||
| data in the buffer(s), but does return zeroes in the segment length | data in the buffer(s), but does return zeroes in the segment length | |||
| fields corresponding to the result. | fields corresponding to the result. | |||
| The RDMA Write list allows the client to provide multiple result | The RDMA Write list allows the client to provide multiple result | |||
| buffers - each buffer must map to a specific result in the reply. The | buffers - each buffer maps to a specific result in the reply. The NFS | |||
| NFS client and server implementations must agree on the mapping of | client and server implementations agree by specifying the mapping of | |||
| results to buffers for each RPC procedure. The following sections | results to buffers for each RPC procedure. The following sections | |||
| describe this mapping for versions of the NFS protocol. | describe this mapping for versions of the NFS protocol. | |||
| Through the use of RDMA Write lists in NFS requests, it is not | Through the use of RDMA Write lists in NFS requests, it is not | |||
| necessary to employ the RDMA Read lists in the NFS replies, as | necessary to employ the RDMA Read lists in the NFS replies, as | |||
| described in the RPC/RDMA protocol. This enables more efficient | described in the RPC/RDMA protocol. This enables more efficient | |||
| operation, by avoiding the need for the server to expose buffers for | operation, by avoiding the need for the server to expose buffers for | |||
| RDMA, and also avoiding "RDMA_DONE" exchanges. Clients may | RDMA, and also avoiding "RDMA_DONE" exchanges. Clients MAY | |||
| additionally employ RDMA Reply chunks to receive entire messages, as | additionally employ RDMA Reply chunks to receive entire messages, as | |||
| described in [RPCRDMA]. | described in [RPCRDMA]. | |||
| 4. NFS Versions 2 and 3 Mapping | 4. NFS Versions 2 and 3 Mapping | |||
| A single RDMA Write list entry may be posted by the client to receive | A single RDMA Write list entry MAY be posted by the client to receive | |||
| either the opaque file data from a READ request or the pathname from | either the opaque file data from a READ request or the pathname from | |||
| a READLINK request. The server will ignore a Write list for any | a READLINK request. The server MUST ignore a Write list for any | |||
| other NFS procedure, as well as any Write list entries beyond the | other NFS procedure, as well as any Write list entries beyond the | |||
| first in the list. | first in the list. | |||
| Similarly, a single RDMA Read list entry may be posted by the client | Similarly, a single RDMA Read list entry MAY be posted by the client | |||
| to supply the opaque file data for a WRITE request or the pathname | to supply the opaque file data for a WRITE request or the pathname | |||
| for a SYMLINK request. The server will ignore any Read list for | for a SYMLINK request. The server MUST ignore any Read list for | |||
| other NFS procedures, as well as additional Read list entries beyond | other NFS procedures, as well as additional Read list entries beyond | |||
| the first in the list. | the first in the list. | |||
| Because there are no NFS version 2 or 3 requests that transfer bulk | Because there are no NFS version 2 or 3 requests that transfer bulk | |||
| data in both directions, it is not necessary to post requests | data in both directions, it is not necessary to post requests | |||
| containing both Write and Read lists. Any unneeded Read or Write | containing both Write and Read lists. Any unneeded Read or Write | |||
| lists are ignored by the server. | lists are ignored by the server. | |||
| In the case where the outgoing request or expected incoming reply is | In the case where the outgoing request or expected incoming reply is | |||
| larger than the maximum size supported on the connection, it is | larger than the maximum size supported on the connection, it is | |||
| possible for the RPC layer to post the entire message or result in a | possible for the RPC layer to post the entire message or result in a | |||
| special "RDMA_NOMSG" message type which is transferred entirely by | special "RDMA_NOMSG" message type which is transferred entirely by | |||
| RDMA. This is implemented in RPC, below NFS and therefore has no | RDMA. This is implemented in RPC, below NFS and therefore has no | |||
| effect on the message contents. | effect on the message contents. | |||
| Non-RDMA (inline) WRITE transfers may optionally employ the | Non-RDMA (inline) WRITE transfers MAY OPTIONALLY employ the | |||
| "RDMA_MSGP" padding method described in the RPC/RDMA protocol, if the | "RDMA_MSGP" padding method described in the RPC/RDMA protocol, if the | |||
| appropriate value for the server is known to the client. Padding | appropriate value for the server is known to the client. Padding | |||
| allows the opaque file data to arrive at the server in an aligned | allows the opaque file data to arrive at the server in an aligned | |||
| fashion, which may improve server performance. | fashion, which may improve server performance. | |||
| The NFS version 2 and 3 protocols are frequently limited in practice | The NFS version 2 and 3 protocols are frequently limited in practice | |||
| to requests containing less than or equal to 8 kilobytes and 32 | to requests containing less than or equal to 8 kilobytes and 32 | |||
| kilobytes of data, respectively. In these cases, it is often | kilobytes of data, respectively. In these cases, it is often | |||
| practical to support basic operation without employing a | practical to support basic operation without employing a | |||
| configuration exchange as discussed in [RPCRDMA]. The server can | configuration exchange as discussed in [RPCRDMA]. The server MUST | |||
| post buffers large enough to receive the largest possible incoming | post buffers large enough to receive the largest possible incoming | |||
| message (approximately 12KB/36KB would be vastly sufficient in the | message (approximately 12KB for NFS version 2, or 36KB for NFS | |||
| above cases), and the client can post buffers large enough to receive | version 3, would be vastly sufficient), and the client can post | |||
| replies based on the "rsize" it is using to the server. Because the | buffers large enough to receive replies based on the "rsize" it is | |||
| server will never return data in excess of this size, the client can | using to the server, plus a fixed overhead for the RPC and NFS | |||
| be assured of the adequacy of its posted buffer sizes. | headers. Because the server MUST NOT return data in excess of this | |||
| size, the client can be assured of the adequacy of its posted buffer | ||||
| sizes. | ||||
| Flow control is handled dynamically by the RPC RDMA protocol, and | Flow control is handled dynamically by the RPC RDMA protocol, and | |||
| write padding is optional and therefore may remain unused. | write padding is OPTIONAL and therefore MAY remain unused. | |||
| Alternatively, if the server is administratively configured to values | Alternatively, if the server is administratively configured to values | |||
| appropriate for all its clients, the same assurance of | appropriate for all its clients, the same assurance of | |||
| interoperability within the domain can be made. | interoperability within the domain can be made. | |||
| The use of a configuration protocol with NFS v2 and v3 is therefore | The use of a configuration protocol with NFS v2 and v3 is therefore | |||
| optional. Employing a configuration exchange may allow some advantage | OPTIONAL. Employing a configuration exchange may allow some advantage | |||
| to server resource management through accurately sizing buffers, | to server resource management through accurately sizing buffers, | |||
| enabling the server to know exactly how many RDMA Reads may be in | enabling the server to know exactly how many RDMA Reads may be in | |||
| progress at once on the client connection, and enabling client write | progress at once on the client connection, and enabling client write | |||
| padding which may be desirable for certain servers when RDMA Read is | padding which may be desirable for certain servers when RDMA Read is | |||
| impractical. | impractical. | |||
| 5. NFS Version 4 Mapping | 5. NFS Version 4 Mapping | |||
| This specification applies to the first minor version of NFS version | This specification applies to the first minor version of NFS version | |||
| 4 (NFSv4.0) and any subsequent minor versions that do not override | 4 (NFSv4.0) and any subsequent minor versions that do not override | |||
| this mapping. | this mapping. | |||
| The Write list will be considered only for the COMPOUND procedure. | The Write list MUST be considered only for the COMPOUND procedure. | |||
| This procedure returns results from a sequence of operations. Only | This procedure returns results from a sequence of operations. Only | |||
| the opaque file data from an NFS READ operation, and the pathname | the opaque file data from an NFS READ operation, and the pathname | |||
| from a READLINK operation will utilize entries from the Write list. | from a READLINK operation MUST utilize entries from the Write list. | |||
| If there is no Write list, i.e. the list is null, then any READ or | If there is no Write list, i.e. the list is null, then any READ or | |||
| READLINK operations in the COMPOUND must return their data inline. | READLINK operations in the COMPOUND MUST return their data inline. | |||
| The NFSv4.0 client must ensure that any result of its READ and | The NFSv4.0 client MUST ensure that any result of its READ and | |||
| READLINK requests must fit within its receive buffers, or an RDMA | READLINK requests fits within its receive buffers, lest an RDMA | |||
| transport error may occur. | transport error result upon transfer. | |||
| The first entry in the Write list must be used by the first READ or | The first entry in the Write list MUST be used by the first READ or | |||
| READLINK in the COMPOUND request. The next Write list entry is used | READLINK in the COMPOUND request. The next Write list entry is used | |||
| by the next READ or READLINK, and so on. If there are more READ or | by the next READ or READLINK, and so on. If there are more READ or | |||
| READLINK operations than Write list entries, then any remaining | READLINK operations than Write list entries, then any remaining | |||
| operations must return their results inline. | operations MUST return their results inline. | |||
| If a Write list entry is presented, then the corresponding READ or | If a Write list entry is presented, then the corresponding READ or | |||
| READLINK must return its data via an RDMA WRITE to the buffer | READLINK MUST return its data via an RDMA WRITE to the buffer | |||
| indicated by the Write list entry. If the Write list entry has zero | indicated by the Write list entry. If the Write list entry has zero | |||
| RDMA segments, or if the total size of the segments is zero, then the | RDMA segments, or if the total size of the segments is zero, then the | |||
| corresponding READ or READLINK operation must return its result | corresponding READ or READLINK operation MUST return its result | |||
| inline. | inline. | |||
| The following example shows an RDMA Write list with three posted | The following example shows an RDMA Write list with three posted | |||
| buffers A, B, and C. The designated operations in the compound | buffers A, B, and C. The designated operations in the compound | |||
| request, READ and READLINK, consume the posted buffers by writing | request, READ and READLINK, consume the posted buffers by writing | |||
| their results back to each buffer. | their results back to each buffer. | |||
| RDMA Write list: | RDMA Write list: | |||
| A --> B --> C | A --> B --> C | |||
| skipping to change at page 6, line 37 ¶ | skipping to change at page 6, line 45 ¶ | |||
| Compound request: | Compound request: | |||
| PUTFH LOOKUP READ PUTFH LOOKUP READLINK PUTFH LOOKUP READ | PUTFH LOOKUP READ PUTFH LOOKUP READLINK PUTFH LOOKUP READ | |||
| | | | | | | | | |||
| v v v | v v v | |||
| A B C | A B C | |||
| If the client does not want to have the READLINK result returned | If the client does not want to have the READLINK result returned | |||
| directly, then it provides a zero length array of segment triplets | directly, then it provides a zero length array of segment triplets | |||
| for buffer B or sets the values in the segment triplet for buffer B | for buffer B or sets the values in the segment triplet for buffer B | |||
| to zeros so that the READLINK result will be returned inline. | to zeros so that the READLINK result MUST be returned inline. | |||
| The situation is similar for RDMA Read lists sent by the client and | The situation is similar for RDMA Read lists sent by the client and | |||
| applies to the NFSv4.0 WRITE and SYMLINK procedures as for v3. | applies to the NFSv4.0 WRITE and SYMLINK procedures as for v3. | |||
| Additionally, inline segments too large to fit in posted buffers may | ||||
| Additionally, inline segments too large to fit in posted buffers MAY | ||||
| be transferred in special "RDMA_NOMSG" messages. | be transferred in special "RDMA_NOMSG" messages. | |||
| Non-RDMA (inline) WRITE transfers may optionally employ the | Non-RDMA (inline) WRITE transfers MAY OPTIONALLY employ the | |||
| "RDMA_MSGP" padding method described in the RPC/RDMA protocol, if the | "RDMA_MSGP" padding method described in the RPC/RDMA protocol, if the | |||
| appropriate value for the server is known to the client. Padding | appropriate value for the server is known to the client. Padding | |||
| allows the opaque file data to arrive at the server in an aligned | allows the opaque file data to arrive at the server in an aligned | |||
| fashion, which may improve server performance. In order to ensure | fashion, which may improve server performance. In order to ensure | |||
| accurate alignment for all data, it is likely that the client will | accurate alignment for all data, it is likely that the client will | |||
| restrict its use of optional padding to COMPOUND requests containing | restrict its use of OPTIONAL padding to COMPOUND requests containing | |||
| only a single WRITE operation. | only a single WRITE operation. | |||
| Unlike NFS versions 2 and 3, the maximum size of an NFS version 4 | Unlike NFS versions 2 and 3, the maximum size of an NFS version 4 | |||
| COMPOUND is unbounded, even when RDMA chunks are in use. While it | COMPOUND is unbounded, even when RDMA chunks are in use. While it | |||
| might appear that a configuration protocol exchange (such as the one | might appear that a configuration protocol exchange (such as the one | |||
| described in [RPCRDMA]) would help, in fact the layering issues | described in [RPCRDMA]) would help, in fact the layering issues | |||
| involved in building COMPOUNDs by NFS make such a mechanism | involved in building COMPOUNDs by NFS make such a mechanism | |||
| unworkable. Instead, an extension to NFS version 4 supporting a more | unworkable. | |||
| comprehensive exchange of upper layer (NFSv4) parameters is proposed | ||||
| in [NFSv4.1]. This proposal also addresses other use of the sizes, | However, typical NFS version 4 clients rarely issue such problematic | |||
| such as in the server's response cache. | requests. In practice, they behave in much more predictable ways, in | |||
| fact most still support the traditional rsize/wsize mount parameters. | ||||
| Therefore, most NFS version 4 clients function over RPC/RDMA in the | ||||
| same way as NFS versions 2 and 3, operationally. | ||||
| There are however advantages to allowing both client and server to | ||||
| operate with prearranged sie constraints, for example use of the | ||||
| sizes to better manage the server's response cache. An extension to | ||||
| NFS version 4 supporting a more comprehensive exchange of upper layer | ||||
| parameters is part of [NFSv4.1]. | ||||
| 6. Security | 6. Security | |||
| The RDMA transport for ONC RPC supports RPCSEC_GSS security as well | The RDMA transport for ONC RPC supports RPCSEC_GSS security as well | |||
| as link-level security. The use of RDMA Write to return RPC results | as link-level security. The use of RDMA Write to return RPC results | |||
| does not affect ONC RPC security. | does not affect ONC RPC security. | |||
| 7. IANA Considerations | 7. IANA Considerations | |||
| NFS use of direct data placement may introduce a need for an | NFS use of direct data placement introduces a need for an additional | |||
| additional NFS port number assignment for networks which share | NFS port number assignment for networks which share traditional UDP | |||
| traditional UDP and TCP port spaces with RDMA services. The iWARP | and TCP port spaces with RDMA services. The iWARP [DDP] [RDMAP] | |||
| [DDP] [RDMAP] protocol is such an example (Infiniband is not). | protocol is such an example (Infiniband is not). | |||
| NFS servers for versions 2 and 3 [RFC1094] [RFC1813] traditionally | NFS servers for versions 2 and 3 [RFC1094] [RFC1813] traditionally | |||
| listen for clients on UDP and TCP port 2049, and additionally, they | listen for clients on UDP and TCP port 2049, and additionally, they | |||
| register these with the portmapper. NFS servers for version 4 | register these with the portmapper and/or rpcbind [RFC1833] service. | |||
| [RFC3050] are required to listen on TCP port 2049, and are not | However, NFS servers for version 4 [RFC3530] are required by that | |||
| required to register. | specification to listen on TCP port 2049, and are not required to | |||
| register. | ||||
| An NFS version 2 or version 3 server supporting RPC/RDMA on such a | An NFS version 2 or version 3 server supporting RPC/RDMA on such a | |||
| network and registering itself with the RPC portmapper may choose an | network and registering itself with the RPC portmapper MAY choose an | |||
| arbitrary port, or may be assigned an alternative well-known port | arbitrary port, or MAY use the alternative well-known port number for | |||
| number for its RPC/RDMA service by IANA. The chosen port must be | its RPC/RDMA service by IANA. The chosen port MAY be registered with | |||
| registered with the RPC portmapper under the netid assigned by the | the RPC portmapper under the netid assigned by the requirement in | |||
| requirement in [RPCRDMA]. | [RPCRDMA]. | |||
| An NFS version 4 server supporting RPC/RDMA on such a network must be | ||||
| assigned an alternative well-known port number for its RPC/RDMA | ||||
| service by IANA. Clients will connect to this well-known port | ||||
| without consulting the RPC portmapper (as for NFSv4/TCP). | ||||
| Any subsequent NFS version 4 minor version's [NFSv4.1] server may | An NFS version 4 server supporting RPC/RDMA on such a network | |||
| reuse port 2049, by requiring the client to perform the RDMA session | MUST use the alternative well-known port number for its RPC/RDMA | |||
| negotiation supported by this protocol. If it does not require the | service by IANA. Clients SHOULD connect to this well-known port | |||
| client to negotiate an RDMA-enabled session, it must use the | without consulting the RPC portmapper (as for NFSv4/TCP). The | |||
| alternative port for RPC/RDMA, as for version 4. | following port is assigned to an NFS service over an RPC/RDMA | |||
| transport: | ||||
| This is not an issue on non-IP transports such as native Infiniband, | nfs-rdma 2050 | |||
| where a non-colliding port translation scheme is used [IBPORT]. On | ||||
| such interfaces, the server can simply listen on the port mapped from | ||||
| the IANA-assigned NFS 2049, or any other port as assigned by the | ||||
| native transport. Such assignments are out of the scope of IANA, and | ||||
| of this document. | ||||
| 8. Acknowledgements | 8. Acknowledgements | |||
| The authors would like to thank Dave Noveck and Chet Juszczak for | The authors would like to thank Dave Noveck and Chet Juszczak for | |||
| their contributions to this document. | their contributions to this document. | |||
| 9. Normative References | 9. Normative References | |||
| [RFC2119] | ||||
| S. Bradner, "Key words for use in RFCs to Indicate Requirement | ||||
| Levels", | ||||
| Best Current Practice, | ||||
| BCP 14, RFC 2119, March 1997. | ||||
| [RFC1831] | [RFC1831] | |||
| R. Srinivasan, "RPC: Remote Procedure Call Protocol Specification | R. Srinivasan, "RPC: Remote Procedure Call Protocol Specification | |||
| Version 2", | Version 2", | |||
| Standards Track RFC, | Standards Track RFC, | |||
| http://www.ietf.org/rfc/rfc1831.txt | http://www.ietf.org/rfc/rfc1831.txt | |||
| [RFC1832] | [RFC1832] | |||
| R. Srinivasan, "XDR: External Data Representation Standard", | R. Srinivasan, "XDR: External Data Representation Standard", | |||
| Standards Track RFC, | Standards Track RFC, | |||
| http://www.ietf.org/rfc/rfc1832.txt | http://www.ietf.org/rfc/rfc1832.txt | |||
| skipping to change at page 8, line 42 ¶ | skipping to change at page 9, line 11 ¶ | |||
| "NFS: Network File System Protocol Specification", | "NFS: Network File System Protocol Specification", | |||
| (NFS version 2) Informational RFC, | (NFS version 2) Informational RFC, | |||
| http://www.ietf.org/rfc/rfc1094.txt | http://www.ietf.org/rfc/rfc1094.txt | |||
| [RFC1813] | [RFC1813] | |||
| B. Callaghan, B. Pawlowski, P. Staubach, "NFS Version 3 Protocol | B. Callaghan, B. Pawlowski, P. Staubach, "NFS Version 3 Protocol | |||
| Specification", | Specification", | |||
| Informational RFC, | Informational RFC, | |||
| http://www.ietf.org/rfc/rfc1813.txt | http://www.ietf.org/rfc/rfc1813.txt | |||
| [RFC1833] | ||||
| R. Srinivasan, "Binding Protocols for ONC RPC Version 2", | ||||
| Standards Track RFC, | ||||
| http://www.ietf.org/rfc/rfc1833.txt | ||||
| [RFC3530] | [RFC3530] | |||
| S. Shepler, B. Callaghan, D. Robinson, R. Thurlow, C. Beame, M. | S. Shepler, B. Callaghan, D. Robinson, R. Thurlow, C. Beame, M. | |||
| Eisler, D. Noveck, "NFS version 4 Protocol", | Eisler, D. Noveck, "NFS version 4 Protocol", | |||
| Standards Track RFC, | Standards Track RFC, | |||
| http://www.ietf.org/rfc/rfc3530.txt | http://www.ietf.org/rfc/rfc3530.txt | |||
| 10. Informative References | 10. Informative References | |||
| [RPCRDMA] | [RPCRDMA] | |||
| T. Talpey, B. Callaghan, "RDMA Transport for ONC RPC" | T. Talpey, B. Callaghan, "RDMA Transport for ONC RPC" | |||
| Internet Draft Work in Progress, | Internet Draft Work in Progress, | |||
| draft-ietf-nfsv4-rpcrdma | draft-ietf-nfsv4-rpcrdma | |||
| [NFSv4.1] | [NFSv4.1] | |||
| S. Shepler, ed., "NFSv4 Minor Version 1" | S. Shepler et al., ed., "NFSv4 Minor Version 1" | |||
| Internet Draft Work in Progress, | Internet Draft Work in Progress, | |||
| draft-ietf-nfsv4-minorversion1 | draft-ietf-nfsv4-minorversion1 | |||
| [DDP] | [DDP] | |||
| H. Shah et al, "Direct Data Placement over Reliable Transports", | H. Shah et al, "Direct Data Placement over Reliable Transports", | |||
| Internet Draft Work in Progress, | Standards Track RFC, | |||
| draft-ietf-rddp-ddp | draft-ietf-rddp-ddp | |||
| [RDMAP] | [RDMAP] | |||
| R. Recio et al, "An RDMA Protocol Specification", | R. Recio et al, "An RDMA Protocol Specification", | |||
| Internet Draft Work in Progress, | Standards Track RFC, | |||
| draft-ietf-rddp-rdmap | draft-ietf-rddp-rdmap | |||
| [IBPORT] | ||||
| Infiniband Trade Association, "IP Addressing Annex", | ||||
| available from www.infinibandta.org | ||||
| 11. Authors' Addresses | 11. Authors' Addresses | |||
| Tom Talpey | Tom Talpey | |||
| Network Appliance, Inc. | Network Appliance, Inc. | |||
| 375 Totten Pond Road | 375 Totten Pond Road | |||
| Waltham, MA 02451 USA | Waltham, MA 02451 USA | |||
| Phone: +1 781 768 5329 | Phone: +1 781 768 5329 | |||
| EMail: thomas.talpey@netapp.com | EMail: thomas.talpey@netapp.com | |||
| Brent Callaghan | Brent Callaghan | |||
| Apple Computer, Inc. | Apple Computer, Inc. | |||
| End of changes. 45 change blocks. | ||||
| 82 lines changed or deleted | 98 lines changed or added | |||
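
The Write list rules in Sections 3 and 5 of the draft can be sketched in a few lines of Python. This is only an illustrative model, not code from either draft revision: the function names `fill_segments` and `assign_write_list`, the representation of a Write list entry as a plain list of segment lengths, and the use of `ValueError` to stand in for the XDR encode error are all assumptions made for the example.

```python
# Illustrative model only; names and data shapes are assumptions.
DIRECT_OPS = {"READ", "READLINK"}  # operations that may use Write list entries


def fill_segments(segment_lengths, result_size):
    """Return the bytes written to each segment of one posted buffer.

    Segments are filled progressively; trailing segments may end up
    partially full or empty. The returned list models the segment length
    fields the server overwrites in its reply.
    """
    if result_size > sum(segment_lengths):
        # Stands in for the XDR encode error the server returns.
        raise ValueError("result does not fit in the posted buffer")
    written = []
    remaining = result_size
    for length in segment_lengths:
        n = min(length, remaining)
        written.append(n)
        remaining -= n
    return written


def assign_write_list(compound_ops, write_list):
    """Map each READ/READLINK position in a COMPOUND to a Write list index.

    A value of None means the result is returned inline. Each READ or
    READLINK consumes the next entry in order; an entry with no segments
    or a total length of zero is consumed but forces an inline result.
    """
    placement = {}
    next_entry = 0
    for i, op in enumerate(compound_ops):
        if op not in DIRECT_OPS:
            continue
        if next_entry >= len(write_list):
            placement[i] = None  # more operations than entries
            continue
        segments = write_list[next_entry]
        placement[i] = next_entry if sum(segments) > 0 else None
        next_entry += 1
    return placement


ops = ["PUTFH", "LOOKUP", "READ", "PUTFH", "LOOKUP", "READLINK",
       "PUTFH", "LOOKUP", "READ"]
# Buffer B is posted with zero segments, so READLINK goes inline.
print(assign_write_list(ops, [[4096], [], [4096, 4096]]))
print(fill_segments([4096, 4096, 4096], 5000))
```

With the compound request from the draft's example and an empty entry for buffer B, the first READ uses entry A, the READLINK result is returned inline, and the final READ still uses entry C.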