Internet Engineering Task Force                           M. Eisler, Ed.
Internet-Draft                                                    NetApp
Intended status: Informational                          October 26, 2009
Expires: April 29, 2010


                        Requirements for NFSv4.2
           draft-eisler-nfsv4-minorversion-2-requirements-02

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on April 29, 2010.

Copyright Notice

   Copyright (c) 2009 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents in effect on the date of
   publication of this document (http://trustee.ietf.org/license-info).
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.

Abstract

   This document proposes requirements for NFSv4.2.

Table of Contents

   1.  Introduction
     1.1.  Requirements Language
   2.  Efficiency and Utilization Requirements
     2.1.  Capacity
     2.2.  Network Bandwidth and Processing
   3.  Flash Memory Requirements
   4.  Compliance
   5.  Incremental Improvements
   6.  IANA Considerations
   7.  Security Considerations
   8.  Acknowledgements
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Author's Address

1.  Introduction

   NFSv4.1 [I-D.ietf-nfsv4-minorversion1] is an approved specification.
   The NFSv4 [RFC3530] community has indicated a desire to continue
   innovating NFS, specifically via a new minor version of NFSv4,
   namely NFSv4.2.  The desire for future innovation is primarily driven
   by two trends in the storage industry:

   o  High efficiency and utilization of resources such as capacity,
      network bandwidth, and processors.

   o  Solid state flash storage, which promises faster throughput and
      lower latency than magnetic disk drives and lower cost than
      dynamic random access memory.

   Secondarily, innovation is being driven by the trend toward stronger
   compliance with information management.  In addition, as might be
   expected with a complex protocol like NFSv4.1, implementation
   experience has shown that minor changes to the protocol would be
   useful to improve the end user experience.

   This document proposes requirements along these four themes, and
   attempts to strike a balance between stating the problem and
   proposing solutions.  With respect to the latter, some thinking among
   the NFS community has taken place, and a future revision of this
   document will reference embodiments of such thinking.

1.1.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

2.  Efficiency and Utilization Requirements

2.1.  Capacity

   Despite the capacity of magnetic disk continuing to increase at
   exponential rates, the storage industry is under pressure to make the
   storage of data increasingly efficient, so that more data can be
   stored.  The driver for this counter-intuitive demand is that disk
   access times are not improving anywhere near as quickly as
   capacities.
   The industry has responded to this development by increasing data
   density via limiting the number of times a unique pattern of data is
   stored in a storage device.  For example, some storage devices
   support de-duplication.  When storing two files, a storage device
   might compare them for shared patterns of data, store each shared
   pattern just once, and set the reference counts on the blocks of
   that pattern to two.  With de-duplication, the number of times a
   storage device has to read a particular pattern would be reduced to
   just once, thus improving average access time.

   For a file access protocol such as NFS, there are several implied
   requirements for addressing this capacity efficiency trend:

   o  When blocks are shared between files, the "space_used" attribute
      of NFSv4 does not report meaningful information.  Removing a file
      with a "space_used" value of X bytes does not mean that the file
      system will see an increase of X available bytes.  Providing more
      meaningful information is a requirement.

   o  Because it is probable, especially for applications such as
      hypervisors, that the NFSv4 client is accessing multiple files
      with shared blocks of data, it is in the interest of the client
      and server for the client to know which blocks are shared so that
      they are not read multiple times, and not cached multiple times.
      Providing a block map of shared blocks is a requirement.  For an
      example of how NFSv4 could deal with this, see
      [I-D.eisler-nfsv4-pnfs-dedupe].

   o  Suppose an NFSv4 client is aware of which patterns exist in which
      files.  When it wants to write pattern X to file B at offset J,
      and it knows that X also exists at offset I of file A, then if it
      can advise the server of its intent, the server can arrange for
      pattern X to appear in file B as a zero copy.
      Even if the server does not support de-duplication, it can at
      least perform a local copy that saves network bandwidth and
      processor overhead on the client and server.

   o  File holes are patterns of zeros that in some file systems are
      unallocated blocks.  In a sense, holes are the ultimate
      de-duplicated pattern.  While proposals to extend NFS to support
      hole punching have been around since the 1980s, until recently
      there have not been NFS clients that could make use of hole
      punching.  The Information Technology (IT) trend toward
      virtualizing operating environments via hypervisors has resulted
      in a need for hypervisors to translate a (virtual) disk command
      to free a block into an NFS request to free that block.  On the
      read side, if a file contains holes, then again, as the ultimate
      in de-duplication, it would be better for the client to be told
      that the region it wants to read has a hole, instead of the
      server returning long arrays of zero bytes.  Even if a server
      does not support holes on write or read, avoiding the
      transmission of zeros will save network bandwidth and reduce
      processor overhead.

2.2.  Network Bandwidth and Processing

   The computational capabilities of processors continue to grow at an
   exponential rate.  However, as noted previously, because disk access
   times are not showing a commensurate exponential decrease, disk
   performance is not tracking processor performance.  In addition,
   while network bandwidth is exponentially increasing, unlike disk
   capacities and processor bandwidth, the improvement is not seen on a
   1-2 year cycle, but happens on something closer to a 10 year cycle.
   Because disk and network performance lag behind processor
   performance, there is often a mismatch between the processing
   capabilities of NFS clients and the speed at which they can extract
   data from an NFS server.
   For some use cases, much of the data that is read by one client from
   an NFS server also needs to be read by other clients.  Re-reading
   this data wastes the network bandwidth and processing of the NFS
   server.  This same observation has driven the creation of
   peer-to-peer content distribution protocols, where data is read
   directly from peers rather than servers.  It is apparent that a
   similar technique could be used to offload primary storage, such as
   that proposed in [I-D.myklebust-nfsv4-pnfs-backend].

   The pNFS protocol distributes the I/O to a set of files across a
   cluster of data servers.  Arguably, its primary value is in balancing
   load across storage devices, especially when it can leverage a back
   end file system or storage cluster with automatic load balancing
   capabilities.  In NFSv4.1, no consideration was given to metadata.
   Metadata is critical to several workloads, to the point that, as
   defined in NFSv4.1, pNFS will not offer much value in those cases.
   The load balancing capabilities of pNFS need to be brought to
   metadata.  An example of how to do so is in
   [I-D.eisler-nfsv4-pnfs-metastripe].

   From an end user perspective, the operations performed on a file
   include creating, reading, writing, deleting, and copying.  NFSv4 has
   operations for all but the last.  While file copy has been proposed
   for NFS in the past, it was always rejected because of the lack of
   Application Programming Interfaces (APIs) within existing operating
   environments to send a copy operation.  The IT trend toward
   virtualization via hypervisors has changed the situation: the
   emerging use case is to copy a virtual disk.  The use of a copy
   operation will save network bandwidth on the client and server, and
   where the server supports it, intra-server file copy has the
   potential to avoid all physical data copy.
   For an example, see [I-D.lentini-nfsv4-server-side-copy].

3.  Flash Memory Requirements

   Flash memory is rapidly filling the wide gap between expensive but
   fast Dynamic Random Access Memory (DRAM) and inexpensive but slow
   magnetic disk.  The cost per bit of flash is between that of DRAM
   and disk.  The access time per bit of flash is between that of DRAM
   and disk.  This has resulted in the File access Operations Per
   Second (FOPS) per unit of cost of flash exceeding that of both DRAM
   and disk.  Flash can be easily added as another storage medium to
   NFS servers, and this does not require a change to the NFS protocol.
   However, the value of flash's superior FOPS is best realized when
   flash is closest to the application, i.e., on the NFS client.  One
   approach would be to forgo the use of network storage and revert to
   Direct Attached Storage (DAS).  However, this would require that the
   data protection value that exists in modern storage devices be
   brought into DAS, and this is not always convenient or cost
   effective.  A less traumatic way to leverage the full FOPS of flash
   would be for NFSv4 clients to leverage flash for caching of data.

   Today, NFSv4 supports whole-file delegations to enable caching.
   Such a granularity is useful for applications like user home
   directories where there is little file sharing.  However, NFS is used
   for many more workloads, which include file sharing.  In these
   workloads, files are shared, whereas individual blocks might not be.
   This drives a requirement for sub-file caching.  A derivative of
   [I-D.eisler-nfsv4-pnfs-dedupe] could provide sub-file caching, and
   could be integrated with [I-D.myklebust-nfsv4-pnfs-backend] to
   provide off-NFS-server sub-file caching.

4.  Compliance

   New regulations for the IT industry limit who can view what data.
   NFSv4 has Access Control Lists (ACLs), but the ACL can be changed by
   the nominal file owner.
   In practice, the end user who owns the file (essentially, the user
   who has the right to delete the file or give permissions to other
   users) is not the legal owner of the file.  The legal owner of the
   file wants to control not just who can access (both read and modify)
   the file, but also to whom those users can pass the content of the
   file.  The legal owner of the file also wants to control which
   software can manipulate the files of the legal owner (for example,
   the legal owner might want to allow only software that has been
   certified).

   In the past, the IT industry has addressed these requirements with
   the notion of security labeling.  Labels are attached to devices,
   files, users, applications, network connections, etc.  When the
   labels of two objects match, data can be transferred from one to the
   other.  For example, a label called "Secret" on a file results in
   only users with a compatible security clearance (e.g., "Secret" or
   higher) being allowed to view the file, despite what the ACL says.

   In environments where labeling is mandated, this often means that a
   file access protocol like NFSv4 is not permitted, despite the fact
   that NFSv4 meets many of the other security and non-security
   requirements of such environments.  Thus, it is necessary that NFSv4
   support labeling, and it is highly desirable that label enforcement
   and application be supported by both the NFSv4 client and server.

   Attaching a label to a file requires that the label be created
   atomically with the file, which means that a new RECOMMENDED
   attribute for a security label is needed, such as that proposed in
   [I-D.quigley-nfsv4-sec-label].

5.  Incremental Improvements

   Implementation experience with NFSv4.1 and related protocols, such as
   SMB2, has shown a number of areas where the protocol can be improved.

   o  Hints for the type of file access, such as sequential read.
      While traditionally NFS servers have been able to detect
      read-ahead patterns, with the introduction of pNFS, this will be
      harder.  Since NFS clients can detect patterns of access, they
      can advise servers.  In addition, the UNIX/Linux madvise() API is
      an example of where applications can provide direct advice to the
      NFS server.

   o  Head of line blocking.  Consider a client that wants to send
      three operations: a file creation, a read for one megabyte, and a
      write for one megabyte.  Each of these might be sent on a separate
      slot.  The client determines that it is not desirable for the read
      operation to wait for the write operation to be sent, so it sends
      the create first.  However, it does not want to serialize the read
      and write behind the create, so the read gets sent, followed by
      the write.  On the reply side, the server does not know that the
      client wants the create satisfied first, so the read and write
      operations are processed first.  By the time the create is
      performed on the server, the response to the read is still filling
      the reply side.  While NFSv4.1 could solve this problem by
      associating two connections with the session, and using one
      connection for create, and the other for read or write, multiple
      connections come at a cost.  The requirement is to solve this head
      of line blocking problem.  Tagging a request as one that should go
      to the head of the line for request and response processing is one
      possible way to address it.

   o  pNFS connectivity/access indication.  If a pNFS client is given a
      layout that directs it to a storage device it cannot access due to
      connectivity or access control issues, it has no way in NFSv4.1 to
      indicate the problem to the metadata server.  See a proposal to
      address this in [I-D.faibish-nfsv4-pnfs-access-permissions-check].

   o  RPCSEC_GSS sequence window size on backchannel.
      The NFSv4.1 specification does not have a way for the client to
      tell the server what window size to use on the backchannel.  The
      specification says that the window size will be the same as what
      the server uses.  Potentially, a server could use a very large
      window size that the client does not want.

   o  Trunking discovery.  The NFSv4.1 specification describes at length
      how a client verifies whether trunking is available between two
      connections, but says little about how a client can discover
      destination addresses that can be trunked.  It would be useful if
      there were a method (such as an operation) to get a list of
      destinations that can be session or client ID trunked, as well as
      a notification when the set of destinations changes.

6.  IANA Considerations

   None.

7.  Security Considerations

   None.

8.  Acknowledgements

   Thanks to Dave Noveck and David Quigley for reviewing this document
   and providing valuable feedback.

9.  References

9.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

9.2.  Informative References

   [I-D.eisler-nfsv4-pnfs-dedupe]
              Eisler, M., "Storage De-Duplication Awareness in NFS",
              draft-eisler-nfsv4-pnfs-dedupe-00 (work in progress),
              October 2008.

   [I-D.eisler-nfsv4-pnfs-metastripe]
              Eisler, M., "Metadata Striping for pNFS",
              draft-eisler-nfsv4-pnfs-metastripe-01 (work in progress),
              October 2008.

   [I-D.faibish-nfsv4-pnfs-access-permissions-check]
              Eisler, M., "pNFS Access Permissions Check",
              draft-faibish-nfsv4-pnfs-access-permissions-check-01 (work
              in progress), October 2009.

   [I-D.ietf-nfsv4-minorversion1]
              Shepler, S., Eisler, M., and D. Noveck, "NFS Version 4
              Minor Version 1", draft-ietf-nfsv4-minorversion1-29 (work
              in progress), December 2008.

   [I-D.lentini-nfsv4-server-side-copy]
              Lentini, J., Eisler, M., Iyer, R., Kenchammana, D., and A.
              Madan, "NFS Server-side Copy",
              draft-lentini-nfsv4-server-side-copy-03 (work in
              progress), July 2009.

   [I-D.myklebust-nfsv4-pnfs-backend]
              Myklebust, T., "Network File System (NFS) version 4 pNFS
              back end protocol extensions",
              draft-myklebust-nfsv4-pnfs-backend-00 (work in progress),
              July 2009.

   [I-D.quigley-nfsv4-sec-label]
              Quigley, D. and J. Morris, "MAC Security Label Support for
              NFSv4", draft-quigley-nfsv4-sec-label-00 (work in
              progress), January 2009.

   [RFC3530]  Shepler, S., Callaghan, B., Robinson, D., Thurlow, R.,
              Beame, C., Eisler, M., and D. Noveck, "Network File System
              (NFS) version 4 Protocol", RFC 3530, April 2003.

Author's Address

   Michael Eisler (editor)
   NetApp
   5765 Chase Point Circle
   Colorado Springs, CO  80919
   US

   Phone: +1 719 599 9026
   Email: mike@eisler.com