Internet Engineering Task Force                           M. Eisler, Ed.
Internet-Draft                                                    NetApp
Intended status: Informational                        September 17, 2010
Expires: March 21, 2011


                        Requirements for NFSv4.2
            draft-ietf-nfsv4-minorversion-2-requirements-00

Abstract

   This document proposes requirements for NFSv4.2.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on March 21, 2011.

Copyright Notice

   Copyright (c) 2010 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Requirements Language
   2.  Efficiency and Utilization Requirements
     2.1.  Capacity
     2.2.  Network Bandwidth and Processing
   3.  Flash Memory Requirements
   4.  Compliance
   5.  Incremental Improvements
   6.  IANA Considerations
   7.  Security Considerations
   8.  Acknowledgements
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Author's Address

1.  Introduction

   NFSv4.1 [I-D.ietf-nfsv4-minorversion1] is an approved specification.
   The NFSv4 [RFC3530] community has indicated a desire to continue
   innovating NFS, and specifically via a new minor version of NFSv4,
   namely NFSv4.2.  The desire for future innovation is primarily driven
   by two trends in the storage industry:

   o  High efficiency and utilization of resources such as capacity,
      network bandwidth, and processors.

   o  Solid state flash storage, which promises faster throughput and
      lower latency than magnetic disk drives and lower cost than
      dynamic random access memory.

   Secondarily, innovation is being driven by the trend to stronger
   compliance with information management.  In addition, as might be
   expected with a complex protocol like NFSv4.1, implementation
   experience has shown that minor changes to the protocol would be
   useful to improve the end user experience.

   This document proposes requirements along these four themes, and
   attempts to strike a balance between stating the problem and
   proposing solutions.  With respect to the latter, some thinking among
   the NFS community has taken place, and a future revision of this
   document will reference embodiments of such thinking.

1.1.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

2.  Efficiency and Utilization Requirements

2.1.  Capacity

   Despite the capacity of magnetic disk continuing to increase at
   exponential rates, the storage industry is under pressure to make the
   storage of data increasingly efficient, so that more data can be
   stored within the same physical space.  The driver for this
   counter-intuitive demand is that disk access times are not improving
   anywhere near as quickly as capacities.  The industry has responded
   to this development by increasing data density through limiting the
   number of times a unique pattern of data is stored in a storage
   device.  For example, some storage devices support de-duplication.
   When storing two files, a storage device might compare them for
   shared patterns of data, store the pattern just once, and set the
   reference counts on the blocks of the unique pattern to two.  With
   de-duplication, the number of times a storage device has to read a
   particular pattern would be reduced to just once, thus improving
   average access time.

   For a file access protocol such as NFS, there are several implied
   requirements for addressing this capacity efficiency trend:

   o  In the presence of de-duplication, the "space_used" attribute of
      NFSv4 does not report meaningful information.  Removing a file
      with a "space_used" value of X bytes does not mean that the file
      system will see an increase of X available bytes.  Providing more
      meaningful information is a requirement.

   o  Because it is probable, especially for applications such as
      hypervisors, that the NFSv4 client is accessing multiple files
      with shared blocks of data, it is in the interest of the client
      and server for the client to know which blocks are shared so that
      they are not read multiple times, and not cached multiple times.
      Providing a block map of shared blocks is a requirement.  For an
      example of how NFSv4 could deal with this, see
      [I-D.eisler-nfsv4-pnfs-dedupe].

   o  If an NFSv4 client is aware of which patterns exist on which
      files, when it wants to write pattern X to file B at offset J, and
      it knows that X also exists at offset I of file A, then if it can
      advise the server of its intent, the server can arrange for
      pattern X to appear in file B as a zero-copy operation.  Even if
      the server does not support de-duplication, it can at least
      perform a local copy that saves network bandwidth and processor
      overhead on the client and server.

   o  File holes are patterns of zeros that in some file systems are
      unallocated blocks.  In a sense, holes are the ultimate
      de-duplicated pattern.  While proposals to extend NFS to support
      hole punching have been around since the 1980s, until recently
      there have not been NFS clients that could make use of hole
      punching.  The Information Technology (IT) trend toward
      virtualizing operating environments via hypervisors has resulted
      in a need for hypervisors to translate a (virtual) disk command to
      free a block into an NFS request to free that block.  On the read
      side, if a file contains holes, then again, as the ultimate in
      de-duplication, it would be better for the client to be told the
      region it wants to read has a hole, instead of the server
      returning long arrays of zero bytes.  Even if a server does not
      support holes on write or read, avoiding the transmission of
      zeroes will save network bandwidth and reduce processor overhead.

2.2.  Network Bandwidth and Processing

   The computational capabilities of processors continue to grow at an
   exponential rate.  However, as noted previously, because disk access
   times are not showing a commensurate exponential decrease, disk
   performance is not tracking processor performance.  In addition,
   while network bandwidth is exponentially increasing, unlike disk
   capacities and processor bandwidth, the improvement is not seen on a
   1-2 year cycle, but happens on something closer to a 10 year cycle.
   The lag between disk and network performance compared to processor
   performance means that there is often a discontinuity between the
   processing capabilities of NFS clients and the speed at which they
   can extract data from an NFS server.  For some use cases, much of the
   data that is read by one client from an NFS server also needs to be
   read by other clients.  Re-reading this data will result in a waste
   of the network bandwidth and processing of the NFS server.  This same
   observation has driven the creation of peer-to-peer content
   distribution protocols, where data is directly read from peers rather
   than servers.  It is apparent that a similar technique could be used
   to offload primary storage, such as that proposed in
   [I-D.myklebust-nfsv4-pnfs-backend].

   The pNFS protocol distributes the I/O to a set of files across a
   cluster of data servers.  Arguably, its primary value is in balancing
   load across storage devices, especially when it can leverage a back
   end file system or storage cluster with automatic load balancing
   capabilities.  In NFSv4.1, no consideration was given to metadata.
   Metadata is critical to several workloads, to the point that, as
   defined in NFSv4.1, pNFS will not offer much value in those cases.
   The load balancing capabilities of pNFS need to be brought to
   metadata.  An example of how to do so is in
   [I-D.eisler-nfsv4-pnfs-metastripe].

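
   As a purely illustrative sketch of spreading metadata load, and not
   the mechanism defined in [I-D.eisler-nfsv4-pnfs-metastripe], a client
   could hash a file name to choose among several metadata servers.  The
   function name metadata_server_for, the server names, and the
   hash-modulo placement are all assumptions made for this example.

```python
import hashlib
from typing import Sequence


def metadata_server_for(name: str, servers: Sequence[str]) -> str:
    """Pick a metadata server for a file name (hypothetical placement rule).

    Hashes the name with SHA-256 and maps the first 8 bytes of the digest
    onto the server list, so the same name always maps to the same server.
    """
    if not servers:
        raise ValueError("at least one metadata server is required")
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


if __name__ == "__main__":
    # Hypothetical server names, used only to exercise the function.
    servers = ["mds-a", "mds-b", "mds-c"]
    for name in ("vm1.img", "vm2.img", "home/alice/notes.txt"):
        print(name, "->", metadata_server_for(name, servers))
```

   Hash-modulo placement is deterministic and needs no shared state, but
   it remaps most names whenever the set of servers changes, which a
   real design would have to address.
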
   From an end user perspective, the operations performed on a file
   include creating, reading, writing, deleting, and copying.  NFSv4 has
   operations for all but the last.  While file copy has been proposed
   for NFS in the past, it was always rejected because of the lack of
   Application Programming Interfaces (APIs) within existing operating
   environments to send a copy operation.  The IT trend toward
   virtualization via hypervisors has changed the situation, where the
   emerging use case is to copy a virtual disk.  The use of a copy
   operation will save network bandwidth on the client and server, and
   where the server supports it, intra-server file copy has the
   potential to avoid all physical data copy.  For an example, see
   [I-D.lentini-nfsv4-server-side-copy].

3.  Flash Memory Requirements

   Flash memory is rapidly filling the wide gap between expensive but
   fast Dynamic Random Access Memory (DRAM) and inexpensive but slow
   magnetic disk.  The cost per bit of flash is between DRAM and disk.
   The access time per bit of flash is between DRAM and disk.  This has
   resulted in the File access Operations Per Second (FOPS) per unit of
   cost of flash exceeding that of DRAM and disk.  Flash can be easily
   added as another storage medium to NFS servers, and this does not
   require a change to the NFS protocol.  However, the value of flash's
   superior FOPS is best realized when flash is closest to the
   application, i.e., on the NFS client.  One approach would be to forgo
   the use of network storage and de-evolve back to Direct Attached
   Storage (DAS).  However, this would require that the data protection
   value that exists in modern storage devices be brought into DAS, and
   this is not always convenient or cost effective.  A less traumatic
   way to leverage the full FOPS of flash would be for NFSv4 clients to
   leverage flash for caching of data.

   Today NFSv4 supports whole file delegations for enabling caching.
   Such a granularity is useful for applications like user home
   directories where there is little file sharing.  However, NFS is used
   for many more workloads, which include file sharing.  In these
   workloads, files are shared, whereas individual blocks might not be.
   This drives a requirement for sub-file caching.  A derivative of
   [I-D.eisler-nfsv4-pnfs-dedupe] could provide sub-file caching, and
   could be integrated with [I-D.myklebust-nfsv4-pnfs-backend] to
   provide off-NFS-server sub-file caching.

4.  Compliance

   New regulations for the IT industry limit who can view what data.
   NFSv4 has Access Control Lists (ACLs), but the ACL can be changed by
   the nominal file owner.  In practice, the end user that owns the file
   (essentially, has the right to delete the file or give permissions to
   other users) is often not the legal owner of the file.  The legal
   owner of the file wants to control not just who can access (both read
   and modify) the file, but who they can pass the content of the file
   to.  The legal owner of the file also wants to control which software
   can manipulate the files of the legal owner (for example, the legal
   owner might want to only allow software that has been certified).

   In the past, the IT industry has addressed these requirements with
   the notion of security labeling.  Labels are attached to devices,
   files, users, applications, network connections, etc.  When the
   labels of two objects match, data can be transferred from one to
   another.  For example, a label called "Secret" on a file results in
   only users with a compatible security clearance (e.g., "Secret" or
   higher) being allowed to view the file, despite what the ACL says.

   In environments where labeling is mandated, this often means that a
   file access protocol like NFSv4 is not permitted, despite the fact
   that NFSv4 meets many of the other security and non-security
   requirements of such environments.
   Thus, it is necessary that NFSv4 support labeling, and highly
   desirable that label enforcement and application be supported by both
   the NFSv4 client and server.

   Attaching a label to a file requires that the label be created
   atomically with the file, which means that a new RECOMMENDED
   attribute for a security label is needed, such as that proposed in
   [I-D.quigley-nfsv4-sec-label].

5.  Incremental Improvements

   Implementation experience with NFSv4.1 and related protocols, such as
   SMB2, has shown a number of areas where the protocol can be improved.

   o  Hints for the type of file access, such as sequential read.  While
      traditionally NFS servers have been able to detect read-ahead
      patterns, with the introduction of pNFS, this will be harder.
      Since NFS clients can detect patterns of access, they can advise
      servers.  In addition, the UNIX/Linux madvise() API is an example
      of where applications can provide direct advice to the NFS server.

   o  Head of line blocking.  Consider a client that wants to send three
      operations: a file creation, a read for one megabyte, and a write
      for one megabyte.  Each of these might be sent on a separate slot.
      The client determines that it is not desirable for the read
      operation to wait for the write operation to be sent, so it sends
      the create.  However, it does not want to serialize the read and
      write behind the create, so the read gets sent, followed by the
      write.  On the reply side, the server does not know that the
      client wants the create satisfied first, so read and write
      operations are first processed.  By the time the create is
      performed on the server, the response to the read is still filling
      the reply side.  While NFSv4.1 could solve this problem by
      associating two connections with the session, and using one
      connection for create, and the other for read or write, multiple
      connections come at a cost.
      The requirement is to solve this head of line blocking problem.
      Tagging a request as one that should go to the head of the line
      for request and response processing is one possible way to address
      it.

   o  pNFS connectivity/access indication.  If a pNFS client is given a
      layout that directs it to a storage device it cannot access due to
      connectivity or access control issues, it has no way in NFSv4.1 to
      indicate the problem to the metadata server.  See a proposal to
      address this in [I-D.faibish-nfsv4-pnfs-access-permissions-check].

   o  RPCSEC_GSS sequence window size on backchannel.  The NFSv4.1
      specification does not have a way for the client to tell the
      server what window size to use on the backchannel.  The
      specification says that the window size will be the same as what
      the server uses.  Potentially, a server could use a very large
      window size that the client does not want.

   o  Trunking discovery.  The NFSv4.1 specification is long on how a
      client verifies if trunking is available between two connections,
      but short on how a client can discover destination addresses that
      can be trunked.  It would be useful if there were a method (such
      as an operation) to get a list of destinations that can be session
      or client ID trunked, as well as a notification when the set of
      destinations changes.

6.  IANA Considerations

   None.

7.  Security Considerations

   None.

8.  Acknowledgements

   Thanks to Dave Noveck and David Quigley for reviewing this document
   and providing valuable feedback.

9.  References

9.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

9.2.  Informative References

   [I-D.eisler-nfsv4-pnfs-dedupe]
              Eisler, M., "Storage De-Duplication Awareness in NFS",
              draft-eisler-nfsv4-pnfs-dedupe-00 (work in progress),
              October 2008.

   [I-D.eisler-nfsv4-pnfs-metastripe]
              Eisler, M., "Metadata Striping for pNFS",
              draft-eisler-nfsv4-pnfs-metastripe-01 (work in progress),
              October 2008.

   [I-D.faibish-nfsv4-pnfs-access-permissions-check]
              Faibish, S., Black, D., Eisler, M., and J. Glasgow, "pNFS
              Access Permissions Check",
              draft-faibish-nfsv4-pnfs-access-permissions-check-03 (work
              in progress), July 2010.

   [I-D.ietf-nfsv4-minorversion1]
              Shepler, S., Eisler, M., and D. Noveck, "NFS Version 4
              Minor Version 1", draft-ietf-nfsv4-minorversion1-29 (work
              in progress), December 2008.

   [I-D.lentini-nfsv4-server-side-copy]
              Lentini, J., Eisler, M., Kenchammana, D., Madan, A., and
              R. Iyer, "NFS Server-side Copy",
              draft-lentini-nfsv4-server-side-copy-05 (work in
              progress), July 2010.

   [I-D.myklebust-nfsv4-pnfs-backend]
              Myklebust, T., "Network File System (NFS) version 4 pNFS
              back end protocol extensions",
              draft-myklebust-nfsv4-pnfs-backend-00 (work in progress),
              July 2009.

   [I-D.quigley-nfsv4-sec-label]
              Quigley, D. and J. Morris, "MAC Security Label Support for
              NFSv4", draft-quigley-nfsv4-sec-label-01 (work in
              progress), February 2010.

   [RFC3530]  Shepler, S., Callaghan, B., Robinson, D., Thurlow, R.,
              Beame, C., Eisler, M., and D. Noveck, "Network File System
              (NFS) version 4 Protocol", RFC 3530, April 2003.

Author's Address

   Michael Eisler (editor)
   NetApp
   5765 Chase Point Circle
   Colorado Springs, CO  80919
   US

   Phone: +1 719 599 9026
   Email: mike@eisler.com