Internet Engineering Task Force                           M. Eisler, Ed.
Internet-Draft                                                    NetApp
Intended status: Informational                          October 18, 2009
Expires: April 21, 2010

                        Requirements for NFSv4.2
            draft-eisler-nfsv4-minorversion-2-requirements-00

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on April 21, 2010.

Copyright Notice

   Copyright (c) 2009 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents in effect on the date of
   publication of this document (http://trustee.ietf.org/license-info).
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.

Abstract

   This document proposes requirements for NFSv4.2, the next minor
   version of the NFS version 4 protocol.

Table of Contents

   1.  Introduction
     1.1.  Requirements Language
   2.  Efficiency and Utilization Requirements
     2.1.  Capacity
     2.2.  Network Bandwidth and Processing
     2.3.  Flash Memory Requirements
   3.  Compliance
   4.  Incremental Improvements
   5.  IANA Considerations
   6.  Security Considerations
   7.  References
     7.1.  Normative References
     7.2.  Informative References
   Author's Address

1.  Introduction

   NFSv4.1 [I-D.ietf-nfsv4-minorversion1] is an approved specification.
   The NFSv4 [RFC3530] community has indicated a desire to continue
   innovating NFS, specifically via a new minor version of NFSv4,
   namely NFSv4.2.  The desire for further innovation is primarily
   driven by two trends in the storage industry:

   o  High efficiency and utilization of resources such as capacity,
      network bandwidth, and processors.

   o  Solid state flash storage, which promises faster throughput and
      lower latency than magnetic disk drives, at lower cost than
      dynamic random access memory.

   Secondarily, innovation is being driven by the trend toward stronger
   compliance with information management regulations.  In addition, as
   might be expected with a complex protocol like NFSv4.1,
   implementation experience has shown that minor changes to the
   protocol would be useful to improve the end user experience.

   This document proposes requirements along these four themes, and
   attempts to strike a balance between stating the problem and
   proposing solutions.  With respect to the latter, some thinking
   among the NFS community has taken place, and a future revision of
   this document will reference embodiments of such thinking.

1.1.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

2.  Efficiency and Utilization Requirements

2.1.  Capacity

   Although the capacity of magnetic disk continues to increase at an
   exponential rate, the storage industry is under pressure to make the
   storage of data ever more efficient, so that more data can be
   stored.  The driver for this counter-intuitive demand is that disk
   access times are not improving anywhere near as quickly as
   capacities.  The industry has responded by increasing effective data
   density: limiting the number of times a unique pattern of data is
   stored on a storage device.  For example, some storage devices
   support de-duplication.  When storing two files, such a device might
   compare them for shared patterns of data, store each shared pattern
   just once, and set the reference count on the blocks of the unique
   pattern to two.  With de-duplication, a storage device needs to read
   a particular pattern only once, thus improving average access time.

   For a file access protocol like NFS, there are several implied
   requirements for addressing this capacity efficiency trend:

   o  The "space_used" attribute of NFSv4 does not report meaningful
      information when blocks are shared.  Removing a file with a
      "space_used" value of X bytes does not mean that the file system
      will gain X available bytes.  Providing more meaningful
      information is a requirement.

   o  Because it is probable, especially for applications such as
      hypervisors, that an NFSv4 client is accessing multiple files
      with shared blocks of data, it is in the interest of both the
      client and the server for the client to know which blocks are
      shared, so that they are not read multiple times, and not cached
      multiple times.
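The caching benefit of knowing which blocks are shared can be sketched as follows.  This is an illustrative model only: the idea of a server-exported block identity is an assumption for the sake of the sketch, not an existing NFSv4 attribute, and the class and method names are hypothetical.

```python
# Sketch: a client-side cache keyed by a (hypothetical) server-provided
# block identity, so blocks shared between files are fetched and cached
# only once.

class BlockCache:
    def __init__(self):
        self.by_block_id = {}        # block identity -> cached data
        self.reads_from_server = 0   # how many RPCs actually went out

    def read(self, server, filehandle, offset, block_id):
        """Return block data, going to the server only on a cache miss."""
        if block_id not in self.by_block_id:
            self.by_block_id[block_id] = server.read(filehandle, offset)
            self.reads_from_server += 1
        return self.by_block_id[block_id]

class FakeServer:
    """Stand-in server: two files whose first blocks alias one pattern."""
    def __init__(self):
        self.blocks = {("A", 0): b"shared", ("B", 0): b"shared"}
    def read(self, fh, offset):
        return self.blocks[(fh, offset)]

cache = BlockCache()
server = FakeServer()
# Both files report the same block identity for their shared block,
# so the second read is satisfied from the cache.
cache.read(server, "A", 0, block_id=17)
cache.read(server, "B", 0, block_id=17)
assert cache.reads_from_server == 1
```

Without the shared-block information, the client above would have issued (and cached) two reads for the same pattern.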
      Providing a block map of shared blocks is a requirement.

   o  If an NFSv4 client is aware of which patterns exist in which
      files, then when it wants to write pattern X to offset J of file
      B, and it knows that X already exists at offset I of file A, it
      can advise the server of its intent, and the server can arrange
      for pattern X to appear in file B via a zero-copy operation.
      Even if the server does not support de-duplication, it can at
      least perform a local copy that saves network bandwidth and
      processor overhead on both the client and the server.

   o  File holes are patterns of zeroes that in some file systems are
      represented as unallocated blocks.  In a sense, holes are the
      ultimate de-duplicated pattern.  While proposals to extend NFS
      to support hole punching have been around since the 1980s, until
      recently there have not been NFS clients that could make use of
      hole punching.  The Information Technology (IT) trend toward
      virtualizing operating environments via hypervisors has resulted
      in a need for hypervisors to translate a (virtual) disk command
      to free a block into an NFS request to free that block.  On the
      read side, if a file contains holes, then again, as the ultimate
      in de-duplication, it would be better for the client to be told
      that the region it wants to read contains a hole, instead of
      being returned long arrays of zero bytes.  Even if a server does
      not support holes on write or read, avoiding the transmission of
      zeroes will save network bandwidth and reduce processor overhead.

2.2.  Network Bandwidth and Processing

   The computational capability of processors continues to grow at an
   exponential rate.  However, as noted previously, disk access times
   are not tracking this growth.  In addition, while network bandwidth
   is increasing exponentially, unlike disk capacities and processor
   bandwidth, the improvement is not seen on a 1-2 year cycle, but is
   closer to a 10 year cycle.
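The scale of this mismatch can be illustrated with simple arithmetic; the growth rates below are illustrative assumptions chosen to match the 1-2 year versus 10 year cycles described above, not measurements.

```python
# Illustrative assumption: processor throughput doubles every 2 years,
# while network bandwidth improves by 10x only about every 10 years.
years = 10
processor_growth = 2 ** (years / 2)    # 32x over a decade
network_growth = 10 ** (years / 10)    # 10x over a decade
# The gap between what a client can process and what the network can
# deliver widens by the ratio of the two.
print(processor_growth / network_growth)  # 3.2
```

Even under these rough assumptions, a client's appetite for data grows several times faster than the network's ability to deliver it.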
   This means that there is often a discontinuity between the
   processing capabilities of NFS clients and the speed at which they
   can extract data from an NFS server.  For some use cases, much of
   the data that is read by one client from an NFS server also needs
   to be read by another client.  Re-reading this data wastes the
   network bandwidth and processing capacity of the NFS server.  The
   same observation has driven the creation of peer-to-peer content
   distribution protocols, where data is read directly from peers
   rather than from servers.  It is apparent that a similar technique
   could be used to offload primary storage.

   The pNFS protocol distributes the I/O to a set of files across a
   cluster of data servers.  Arguably, its primary value is in
   balancing load across storage devices, especially when it can
   leverage a back end file system or storage cluster with automatic
   load balancing capabilities.  In NFSv4.1, no such consideration was
   given to metadata.  Metadata is critical to several workloads, to
   the point that, as defined in NFSv4.1, pNFS will not offer much
   value in those cases.  The load balancing capabilities of pNFS need
   to be brought to metadata.

   From an end user's perspective, the operations performed on a file
   include creating, reading, writing, deleting, and copying.  NFSv4
   has operations for all but the last.  While file copy has been
   proposed for NFS in the past, it was always rejected because
   operating environments lacked APIs through which applications could
   issue a copy operation.  The IT trend toward virtualization via
   hypervisors has changed the picture.  A copy operation will save
   network bandwidth on the client and server, and where the server
   supports it, intra-server file copy can potentially be zero-copy.

2.3.  Flash Memory Requirements

   Flash memory is rapidly filling the wide gap between expensive but
   fast Dynamic Random Access Memory (DRAM) and inexpensive but slow
   magnetic disk.  The cost per bit of flash is between that of DRAM
   and disk.  The access time per bit of flash is between that of DRAM
   and disk.  As a result, the cost per File access Operation Per
   Second (FOPS) of flash is superior to that of both DRAM and disk.
   Flash can easily be added as another storage medium to NFS servers,
   and this does not require a change to the NFS protocol.  However,
   the value of flash's superior FOPS is best realized when flash is
   closest to the application, i.e., on the NFS client.  One approach
   would be to forgo the use of network storage and revert to Direct
   Attached Storage (DAS).  However, this would require that the data
   protection value that exists in modern storage devices be brought
   to DAS, and this is not always convenient or cost effective.  A
   less traumatic way to leverage the full FOPS of flash would be for
   NFSv4 clients to use flash for caching of data.

   Today, NFSv4 supports whole-file delegations for enabling caching.
   Such granularity is useful for applications like user home
   directories, where there is little file sharing.  However, NFS is
   used for many more workloads, which include file sharing.  In these
   workloads, files are shared, whereas individual blocks might not
   be.  This drives a requirement for sub-file caching.

3.  Compliance

   New regulations for the IT industry limit who can view what data.
   NFSv4 has Access Control Lists (ACLs), but an ACL can be changed by
   the nominal file owner.  In practice, the end user that owns the
   file (essentially, has the right to delete the file or give
   permissions to other users) is not the legal owner of the file.
   The legal owner of the file wants to control not just who can
   access the file, but to whom they can pass the content of the file.
   The IT industry has addressed this need in the past with the notion
   of security labeling.  Labels are attached to devices, files,
   users, applications, network connections, etc.  When the labels of
   two objects match, data can be transferred from one to the other.
   For example, a label called "Secret" on a file results in only
   users with a "Secret" security clearance being allowed to view the
   file, regardless of what the ACL says.

   Attaching a label to a file requires that the label be created
   atomically with the file, which means that a new RECOMMENDED
   attribute for a security label is needed.

4.  Incremental Improvements

   Implementation experience with NFSv4.1 and related protocols, such
   as SMB2, has shown a number of areas where the protocol can be
   improved:

   o  Hints for the type of file access, such as sequential read.
      While traditionally NFS servers have been able to detect read-
      ahead patterns, with the introduction of pNFS this becomes
      harder.  Since NFS clients can detect patterns of access, they
      can advise servers.  In addition, the UNIX/Linux madvise() API
      is an example of how applications can provide direct advice to
      the NFS server.

   o  Head-of-line blocking.  Consider a client that wants to send
      three operations: a file creation, a read of one megabyte, and a
      write of one megabyte.  Each of these might be sent on a
      separate slot.  The client determines that it is not desirable
      for the read operation to wait behind the write operation, so it
      sends the create first.  However, it does not want to serialize
      the read and write behind the create, so the read is sent next,
      followed by the write.  On the reply side, the server does not
      know that the client wants the create satisfied first, so the
      read and write operations are processed first.  By the time the
      create is performed on the server, the response to the read is
      still filling the reply channel.  While NFSv4.1 could solve this
      problem by associating two connections with the session, using
      one connection for the create and the other for the read or
      write, multiple connections come at a cost.  The requirement is
      to solve this head-of-line blocking problem.  Tagging a request
      as one that should go to the head of the line for request and
      response processing is one possible way to address it.

   o  pNFS connectivity/access indication.  If a pNFS client is given
      a layout that directs it to a storage device it cannot access
      due to connectivity or access control issues, it has no way in
      NFSv4.1 to indicate the problem to the metadata server.

   o  RPCSEC_GSS sequence window size on the backchannel.  The NFSv4.1
      specification does not provide a way for the client to tell the
      server what window size to use on the backchannel.  The
      specification says that the window size will be the same as the
      one the server uses.  Potentially, a server could use a very
      large window size that the client does not want.

   o  Trunking discovery.  The NFSv4.1 specification is long on how a
      client verifies whether trunking is available between two
      connections, but short on how a client can discover destination
      addresses that can be trunked.  It would be useful if there were
      a method (such as an operation) to get a list of destinations
      that can be session or client ID trunked, as well as a
      notification when the set of destinations changes.

5.  IANA Considerations

   None.

6.  Security Considerations

   None.

7.  References

7.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

7.2.  Informative References

   [I-D.ietf-nfsv4-minorversion1]
              Shepler, S., Eisler, M., and D. Noveck, "NFS Version 4
              Minor Version 1", draft-ietf-nfsv4-minorversion1-29
              (work in progress), December 2008.

   [RFC3530]  Shepler, S., Callaghan, B., Robinson, D., Thurlow, R.,
              Beame, C., Eisler, M., and D. Noveck, "Network File
              System (NFS) version 4 Protocol", RFC 3530, April 2003.

Author's Address

   Michael Eisler (editor)
   NetApp
   5765 Chase Point Circle
   Colorado Springs, CO 80919
   US

   Phone: +1 719 599 8759
   Email: mike@eisler.com