Network File System Version 4                                 S. Faibish
Internet-Draft                                                  D. Black
Intended status: Informational                                P. Shilane
Expires: February 5, 2020                                       Dell EMC
                                                         August 05, 2019

        Support for Data Reduction Attributes in NFSv4 Version 2
            draft-faibish-nfsv4-data-reduction-attributes-00

Abstract

   This document proposes extending NFSv4 operations to enable
   file extended attributes (xattrs) to be used in the protocol to
   provide information about the data reduction properties of files.
   New xattrs are proposed to allow the client application to
   communicate to the NFSv4 server data reduction attributes
   associated with files and directories using opaque metadata, not
   interpreted by the file system, but communicated to the Block
   Storage data reduction engines. Corresponding new file attributes
   are proposed to allow clients and client applications to query the
   server for data reduction xattr support and to get and set
   data reduction xattrs on files and directories. Such data reduction
   metadata is used as hints to the file server about what type of data
   reduction to apply. The proposed data reduction attributes include
   achievable ratios for compression and deduplication plus whether
   each data reduction technique applies to the file.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of Internet-Draft Shadow Directories can be accessed at
   https://www.ietf.org/standards/ids/internet-draft-mirror-sites/.

   This Internet-Draft will expire on February 5, 2020.

Copyright Notice

   Copyright (c) 2018 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.
   Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Introduction
      1.1. Terminology
   2. Use Cases
   3. Extended Attributes
   4. File System Support
   5. Namespaces
   6. Differences with Named Attributes
   7. Protocol Enhancements
   8. IANA Considerations
   9. Security Considerations
   10. References
      10.1. Normative References
      10.2. Informative References
   Acknowledgments
   Author's Address

1. Introduction

   Many NFS servers use expensive solid state media, e.g., NVMe SSDs,
   complemented by data reduction processing of files to reduce their
   size on the Block Storage via compression and deduplication, thereby
   optimizing media usage. This draft considers scenarios in which
   data reduction processing is performed in Block Storage for NFS
   servers, i.e., compression and deduplication processing occurs in
   the background or inline as a consequence of NFS files being
   written to the Block Storage.
   In these scenarios, the data reduction
   engines in Block Storage have limited information about how
   reducible (compressible and/or deduplicate-able) the data written
   to NFS is.

   There is additional strong interest in improving data reduction when
   using NVMe accessed media, and exposing such data attributes to the
   Block Storage as xattrs over NFS is one means of providing this
   critical information to Block Storage data reduction engines.

   There is an expired draft for use of NVMe (over fabric) in accessing
   a pNFS SCSI Layout [3] which could be extended to communicate data
   reduction attributes to NVMe storage. The shortcoming of the current
   pNFS SCSI NVMe layout is that it has no information related to data
   reduction attributes. This document discusses potential use of NFSv4
   extended attributes as currently standardized in [2], for
   communicating additional data reduction metadata; a future version
   of this document will propose updates to the NFSv4 protocol to
   support this functionality.

   The purpose of this draft is to add xattrs that will allow
   applications to send richer metadata information to the NFS server
   in order to optimize Block Storage data reduction engine operations
   and improve data reduction for data stored by NFS servers.

   Applications can handle files with different compression and
   deduplication characteristics and send this information to the data
   reduction engines. Current applications have defined data reduction
   characteristics and there are clear definitions for the typical
   compression and deduplication ratios of some types of data
   independent of the application that generated the data. For example,
   electronic design automation (EDA) has no single de facto standard
   file extension but generates application files with common
   compression and deduplication characteristics.
   Knowing that a file is already
   compressed lets the NFS server skip further compression attempts,
   which improves its latency and/or throughput. An additional example
   is that NFS backup of files that are already stored on the Block
   Storage is likely to result in a very high deduplication ratio.

1.1. Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [1].

   In this document, these words will appear with that interpretation
   only when in ALL CAPS. Lower case uses of these words are not to be
   interpreted as carrying RFC-2119 significance. We will refer to the
   block devices used by the NFS servers as "Block Storage".

2. Use Cases

   Applications can use extended attributes to store metadata together
   with the files and directories. Metadata regarding data reduction
   attributes may be available from applications that use different
   types of files. This metadata may not be directly useful to the file
   system but is relevant to the compression and deduplication engines
   used by the Block Storage to improve data reduction. Use of data
   reduction metadata is not expected to significantly impact I/O
   latency or throughput (IOPS).
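
   As a minimal sketch of the application side, the following Python
   fragment builds a pair of data reduction hints and attaches them to
   a file. The attribute names "user.dr.compression" and
   "user.dr.deduplication" are hypothetical, since this document does
   not yet define names, and encoding the hint as an ASCII decimal
   percentage is likewise an assumption. The os.setxattr call is
   available on Linux only and requires a file system with user xattr
   support.

```python
import os
from typing import Dict

# Hypothetical attribute names: the draft does not define any.
COMPRESSION_KEY = "user.dr.compression"
DEDUPLICATION_KEY = "user.dr.deduplication"


def encode_hint(percent: int) -> bytes:
    """Encode a 0-100 hint as ASCII decimal (an assumed encoding)."""
    if not 0 <= percent <= 100:
        raise ValueError("hint must be between 0 and 100")
    return str(percent).encode("ascii")


def build_hints(compression: int, deduplication: int) -> Dict[str, bytes]:
    """Return the xattr key/value pairs for one file or directory."""
    return {
        COMPRESSION_KEY: encode_hint(compression),
        DEDUPLICATION_KEY: encode_hint(deduplication),
    }


def tag_file(path: str, compression: int, deduplication: int) -> None:
    """Attach the hints to a file (Linux only, needs user xattr support)."""
    for key, value in build_hints(compression, deduplication).items():
        os.setxattr(path, key, value)
```

   An application writing already-compressed video might call
   tag_file(path, 0, 0) to hint that neither technique is worthwhile.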

          File Domain          |            Block Domain
                               |
        +-------------+        |        +------------------+
        | NFS Server  |--------|------->| Reduction Engine |
        +------+------+        |        +--------+---------+
               ^               |                 |
               |               |                 |
               |               |                 v
        +------+------+        |        +--------+---------+
        | NFS Client  |        |        |  Block storage   |
        +------+------+        |        +------------------+
               ^               |
               |               |
               |               |
        +------+------+        |
        | Application |        |
        +-------------+        |

             Figure 1: Data Reduction Domains for NFSv4

   Figure 1 shows the NFSv4 server configuration, data flow and
   functionality domains with the data reduction engine in the Block
   domain and located above the Block Storage. This figure represents
   NFSv4 without parallel NFS (pNFS) support. In this structure the NFS
   server can communicate xattrs as metadata directly to the Reduction
   Engine via an extension to the interface to Block Storage.

   In general, applications using block devices rely on SCSI protocols
   to access the data. Although SCSI protocols have a rich API, most
   communication between hosts and Block Storage, e.g., storage arrays,
   is in terms of blocks, not files. In contrast, applications use large
   files to read and write data to and from NFS servers. NFS servers
   typically store their file systems on SCSI (or NVMe) devices
   provisioned from Block Storage, e.g., external storage arrays, but
   file metadata, e.g., file type and file size, is not transferred to
   the block array in an explicit manner.

   An NFS Server might be able to infer data reduction characteristics
   based on the file type, e.g., a ".mp4" file can be expected to be an
   MP4 file that contains MPEG-4 content [7]. This is not sufficient
   due to file content variability, e.g., as a large variety of codecs
   are used to create MPEG-4 content whose compressibility may vary by
   codec.
   To go beyond the file type, the NFS Server could read the
   file contents to determine compressibility, but this is problematic
   due to complexity, e.g., the NFS Server may need to parse a
   significant amount of an MP4 file to obtain the information
   necessary to understand its compressibility characteristics. This
   may be impractical if the file is not written to the NFS Server
   sequentially, and moreover introduces an undesirable
   dependency on not only the MP4 file format, but also the set of
   codecs that it supports and individual codec
   characteristics. It is much better to have the application provide
   information on compressibility, as the application that generates
   an MP4 file has the information on the file's contents. A mechanism
   is needed to pass that information to the NFS Server; this document
   proposes using xattrs.

   Although the xattrs are stored with the files, the current xattr
   specification [6] indicates that the file system does not understand
   the structure or content of these extended attributes. If the NFS
   server could extract the data reduction xattrs and pass their
   contents to the Block Storage functionality, the Block Storage
   reduction engines could parse that content and adapt their data
   reduction behavior accordingly.
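
   The extraction step described above could be sketched as follows.
   This is an illustration under stated assumptions, not a proposed
   implementation: the "user.dr." prefix, the two attribute names and
   the ASCII decimal percentage encoding are all hypothetical, and the
   file's xattrs are modeled as an in-memory mapping rather than read
   from a real file system. Malformed or out-of-range values are
   treated as "no hint".

```python
from dataclasses import dataclass
from typing import Mapping, Optional

# Hypothetical prefix reserved for data reduction xattrs.
DR_PREFIX = "user.dr."


@dataclass(frozen=True)
class ReductionHints:
    """Hints handed to the reduction engine; None means no hint given."""
    compression: Optional[int] = None
    deduplication: Optional[int] = None


def _parse_percent(raw: bytes) -> Optional[int]:
    """Parse an ASCII decimal percentage; reject anything outside 0-100."""
    try:
        value = int(raw.decode("ascii"))
    except (UnicodeDecodeError, ValueError):
        return None
    return value if 0 <= value <= 100 else None


def extract_hints(xattrs: Mapping[str, bytes]) -> ReductionHints:
    """Pick the data reduction xattrs out of a file's full xattr set."""
    def lookup(name: str) -> Optional[int]:
        raw = xattrs.get(DR_PREFIX + name)
        return None if raw is None else _parse_percent(raw)

    return ReductionHints(lookup("compression"), lookup("deduplication"))
```

   All other xattrs on the file remain opaque and are ignored, so the
   server only needs to recognize the reserved prefix.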

     File Domain                    |          Block Domain
                                    |
   +-------------+                  |      +------------------+
   |             |------------------|----->|                  |
   | pNFS Server |          +-------|----->| Reduction Engine |
   |             |          |  +----|----->|                  |
   +------+------+          |  |    |      +---------+--------+
          ^                 |  |    |                |
          |                 |  |    |                |
          |                 |  |    |                v
   +------+------+  NVMe    |  |    |      +---------+--------+
   | pNFS Client |----------+  |    |      |  Block storage   |
   |             |-------------+    |      +------------------+
   +------+------+  SCSI            |
          ^                         |
          |                         |
          |                         |
   +------+------+                  |
   | Application |                  |
   +-------------+                  |

       Figure 2: Data Reduction Domains for pNFS over NVMe or SCSI

   The current situation is that data reduction done in the block
   domain lacks critical information that could be provided by the
   applications in order to improve efficiency of data compression and
   deduplication.

   Figure 2 shows another scenario with a pNFS Server and a block pNFS
   Client that accesses Block storage using either NVMe or SCSI
   over a network. In this scenario the pNFS Client could send data
   reduction attributes directly to the reduction engine above the
   Block storage layer if the block storage protocol (NVMe or SCSI in
   the figure) supports doing so. The assumption is that the
   application has additional information related to file types and
   typical compression and deduplication parameters associated with
   different file types, e.g., see the above discussion of MPEG-4
   content. The application can convey this information to the
   reduction engine to improve the reduction engine efficiency. If the
   application does not do so, then the user can also add data
   reduction characteristics for individual files towards improving
   data reduction efficiency without needing to change the storage
   array configuration.

   For this pNFS scenario the application enables sending
   data reduction parameters to the Block Device using extensions to
   the SCSI or NVMe protocols.
   The pNFS Client still needs to pass the
   data reduction xattrs to the pNFS Server because the pNFS Client is
   always allowed to fall back from a pNFS write to an NFS write via
   the NFS Server; this fallback is similar to the previous case where
   the NFS Server stores the data reduction xattrs associated with each
   file.

   For example, a video application knows whether a file consists of
   compressed data or uncompressed data. The application writing the
   data to the pNFS client can set a file attribute that will indicate
   that a file is uncompressed and hence it is likely to be productive
   for the data reduction engine to reduce the file's size. The pNFS
   client passes that information via an xattr that hints that the
   file is compressible. The pNFS server will change the data reduction
   xattr and will transmit the xattr to the Block Storage as a hint
   that the data is uncompressed. The pNFS client will stream the
   video using the pNFS NVMe data protocol and the compression engine
   in the Block Storage will compress the data blocks as long as the
   uncompressed hint is set in NVMe writes from the pNFS Client. If the
   xattr is changed to indicate that the data has been compressed, the
   compression engine does not compress the incoming blocks.

   A second example is related to encrypted files that can be neither
   compressed nor deduplicated in the absence of file copying. For this
   specific example we envision a not-deduplicatable hint.

   In this scenario the NFS client sets the deduplication hint to
   advise the data reduction engine that deduplication should be
   enabled for the file. Alternatively, if a new file is being written
   that is not based on modifying an existing file, the deduplication
   hint is set to indicate that deduplication should be disabled.

   Another use case involves compressed video files and images that are
   written by video applications.
   As such files are already compressed,
   further attempts to compress them are likely to be pointless, and
   may negatively impact the performance of the NFS Server.

   An additional scenario involves metadata at the start (header) of
   the file; an application that did not generate the file may
   nonetheless be able to access the metadata section in the file and
   set extended file attributes based on compression and deduplication
   information found in the file header. The NFS server doesn't have
   visibility into metadata included in file headers and cannot send
   file header content to the data reduction engine as separate
   metadata. Only the user application can access and parse the header
   and add xattrs when the file is written to the NFS server.

   Additional examples of known data reduction attributes are found
   in benchmarks such as SPECsfs, which uses predefined data reduction
   attributes. SPECsfs workloads [8] have DR/CR (Deduplication
   Ratio/Compression Ratio) characteristics that were collected from
   actual user data. They are as follows:

      EDA                 DR/CR=50%/50%
      SWBUILD             DR/CR=0%/80%
      VDI                 DR/CR=55%/70%
      DB                  DR/CR=0%/50%
      VDA                 DR/CR=0%/0%
      IT infrastructure   DR/CR=30%/50%
      Oracle DW           DR/CR=15%/70%
      Oracle OLTP         DR/CR=0%/65%
      Exchange 2010       DR/CR=15%/35%
      Geoseismic          DR/CR=3%/40%

   Another scenario involves placing files with the same known data
   reduction characteristics in the same directory, where the user or
   an application sets data reduction xattrs on the
   directory that are intended to apply to all files in the directory
   and possibly also sub-directories. In this case the NFS Server uses
   the data reduction xattrs on the directory to inform the data
   reduction engine of the data reduction characteristics of blocks in
   all files in that directory.
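
   One way the directory scenario could be resolved is sketched below.
   The policy shown is an assumption and not something this document
   specifies: a hint set on the file itself takes precedence, otherwise
   the nearest ancestor directory that carries the hint supplies it, so
   sub-directories inherit from their parents. The attribute name and
   the in-memory mapping of paths to xattrs are hypothetical stand-ins
   for the server's real xattr store.

```python
import posixpath
from typing import Mapping, Optional

# Hypothetical attribute name: the draft does not define any.
COMPRESSION_KEY = "user.dr.compression"


def effective_hint(
    path: str,
    key: str,
    xattrs_by_path: Mapping[str, Mapping[str, bytes]],
) -> Optional[bytes]:
    """Return the nearest hint: the file's own, else an ancestor's."""
    current = posixpath.normpath(path)
    while True:
        value = xattrs_by_path.get(current, {}).get(key)
        if value is not None:
            return value
        parent = posixpath.dirname(current)
        if parent == current:
            # Reached the root without finding a hint.
            return None
        current = parent
```

   With this policy a single xattr on a directory covers every file
   beneath it, while an individual file can still override the
   directory's hint.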

3. Extended Attributes

   Extended attributes, also called xattrs [6], are a means to associate
   opaque metadata with file system objects, e.g., files and
   directories. Extended attributes are especially useful when they
   add information that is not, or cannot be, present in the
   associated object itself. User-space applications can arbitrarily
   create, read from, and write these attributes.

   As extended attributes are file system-agnostic,
   applications do not need to be concerned about how the attributes
   are stored internally on the underlying file system. All major
   operating systems provide various flavors of extended attributes.
   Many user space tools allow xattrs to be included in attributes that
   need to be preserved when files and directories are updated, moved
   or copied.

   The proposed data reduction attributes are opaque to the file system
   but can be used by the data reduction engines in the Block Storage
   to improve data reduction and server operations
   by treating the xattrs as hints from the client application regarding
   file compression and deduplication characteristics. The Block Storage
   will parse these attributes and change the data reduction methods
   according to these hints with no need for the file system to know
   about the data reduction methods used.

   Extended attributes have long been considered unsuitable for
   portability because they are inadequately defined and not formally
   documented by any standard (such as POSIX). However, evidence
   suggests that xattrs are widely deployed and their support in modern
   disk-based file systems is fairly universal. What is different
   in the new use case is that the opaque metadata can be received and
   understood by the data reduction engines.
   A data reduction xattr
   can be 0 or 100, where 0 is a "don't do this" hint and 100 is a "do
   this, but can't predict how much reduction will actually result"
   hint. It can also take on a percentage value, e.g., from the
   SPECsfs data shown above.

   Any regular file or directory may have a set of extended attributes,
   each consisting of a key and associated value [6]. As currently
   specified, the NFS client or server MUST NOT interpret the contents
   of the key or value. This document proposes to remove that
   restriction in support of data reduction xattrs.

   The data reduction attributes can be carried by the extended
   attributes that most modern file systems support. They can be
   retrieved from the local file system on the client and added to the
   NFS extended attributes when files are moved from the local file
   system to NFS.

4. File System Support

   In Linux, ext3, ext4, JFS, XFS, and Btrfs, among other file systems,
   support extended attributes. The getfattr and setfattr utilities can
   be used to retrieve and set xattrs. The names of the extended
   attributes must be prefixed by the name of the category and a dot;
   hence these categories are generally qualified as namespaces.
   In the NTFS file system, extended attributes are one of several
   supported "file streams" [5].

   Xattrs can be retrieved and set through system calls [6] or shell
   commands and are generally supported by user-space tools that
   preserve other file attributes. For example, the "rsync" remote copy
   program will correctly preserve extended attributes between
   Linux/ext4 and OSX/hfs by stripping off the Linux-specific "user."
   prefix.

5. Namespaces

   Operating systems may define multiple "namespaces" in which xattrs
   can be set.
   Namespaces are more than organizational classes; the
   operating system may enforce different data reduction policies and
   allow different reduction characteristics depending on the namespace.

6. Differences with Named Attributes

   RFC 5661 defines named attributes as opaque byte streams that are
   associated with a directory or file and referred to by a string name
   [4]. Named attributes are intended to be used by client applications
   as a method to associate application-specific data with a regular
   file or directory. In that sense, xattrs are similar in concept and
   use to named attributes, but there are subtle differences. Named
   attributes are only visible to the NFS layer and not to the
   application, while extended attributes are accessible to the
   application layer and can be modified by users. File systems
   typically define individual xattr "get" and "set" operations. Xattrs
   generally have size limits ranging from a few bytes to several
   kilobytes; the maximum supported size is not universally defined
   and is usually restricted by the file system.

   There are no clear indications on how xattrs can be mapped to any
   existing recommended or optional file attributes defined in RFC 5661
   [4]; as a result, most NFS client implementations ignore
   application-specified xattrs. This results in data loss if one
   copies, over the NFS protocol, a file with data reduction related
   xattrs from one file system to another that also supports xattrs.
   Although different data reduction engines achieve different levels
   of reduction, these attributes are used by the reduction engines to
   increase the reduction to different levels for different algorithms.

   While it should be possible to write guidance about how a client can
   use the named attribute mechanism to act like xattrs, such as carving
   out some namespace and specifying locking primitives to enforce
   atomicity constraints on individual get/set operations, this is
   problematic for data reduction attributes that are specific to
   particular applications and file types and not defined by the user.
   As such there will be mechanisms that will detect the reduction
   attributes from the application or from local file system xattrs.

   The different implementations of the protocol would have to address
   these attributes based on additional guidance such as reserving
   some portion of the named attribute namespace for xattr-like
   functionality.

7. Protocol Enhancements

   This section proposes extensions to the NFSv4 protocol operations to
   allow data reduction xattrs to be queried and modified by clients.
   A new attribute is added to the bitmap4 data type to allow xattr
   support to be queried. This follows the guidelines specified in [2]
   with respect to minor versioning. We propose to add 2 bits that will
   be passed to the reduction engine and used to activate/deactivate
   the compression and/or the deduplication operations. The current
   NFSv4 xattr operations are not changed, but we will add 4 new
   operations, namely GETDRATTR, SETDRATTR, LISTXATTR and REMOVEXATTR,
   to query and set data reduction xattrs. The protocol details will be
   provided in the next version of the draft.

8. IANA Considerations

   All IANA considerations are covered in [4].

9. Security Considerations

   The additions to the NFS protocol for supporting extended attributes
   do not alter the security considerations of the NFSv4.1 protocol [4].

   Data reduction hints may enable attacks on Block Storage resources
   that support the NFS Server.
   Hinting at more data reduction than is
   possible may cause excessive data reduction processing, and hinting
   at less data reduction than is possible, including hinting not to
   perform any data reduction, may result in consumption of more
   potentially expensive storage capacity. A future version of this
   draft will discuss what to do about these possible resource attacks.

10. References

10.1. Normative References

   [1] Bradner, S., "Key words for use in RFCs to Indicate Requirement
       Levels", BCP 14, RFC 2119, March 1997.

   [2] Shepler, S., Ed., Eisler, M., Ed., and D. Noveck, Ed., "Network
       File System (NFS) Version 4 Minor Version 1 External Data
       Representation Standard (XDR) Description", RFC 5662, January
       2010.

   [4] Shepler, S., Ed., Eisler, M., Ed., and D. Noveck, Ed., "Network
       File System (NFS) Version 4 Minor Version 1 Protocol", RFC 5661,
       January 2010.

10.2. Informative References

   [3] C. Hellwig, "Using the Parallel NFS (pNFS) SCSI Layout
       with NVMe", June 2017,
       https://tools.ietf.org/html/draft-hellwig-nfsv4-scsi-layout-
       ... nvme-00

   [5] http://www.freedesktop.org/wiki/CommonExtendedAttributes,
       "Guidelines for extended attributes".

   [6] M. Naik, M. Eshel, "File System Extended Attributes in NFSv4"
       https://datatracker.ietf.org/doc/rfc8276/

   [7] ISO/IEC 14496-14 "Information technology - Coding of
       audio-visual objects - Part 14: MP4 file format"

   [8] SPEC SFS 2014 SP2 User's Guide,
       http://spec.org/sfs2014/docs/usersguide.pdf

Acknowledgments

   This draft has attempted to capture the latest industry trends of
   adding data reduction attributes needed to increase efficiency of
   newest flash NVMe technology for file servers. New protocols were
   proposed specific for NVMe media and we were inspired by new drafts
   proposed by Christoph Hellwig.

Author's Address

   Sorin Faibish
   Dell EMC
   228 South Street
   Hopkinton, MA 01774
   United States of America

   Phone: +1 508-249-5745
   Email: faibish.sorin@dell.com

   Philip Shilane
   Dell EMC
   228 South Street
   Hopkinton, MA 01774
   United States of America

   Phone: +1 908-286-7977
   Email: philip.shilane@dell.com

   David Black
   Dell EMC
   176 South Street
   Hopkinton, MA 01748
   United States of America

   Phone: +1 774-350-9323
   Email: david.black@dell.com