INTERNET-DRAFT                                              Tom Talpey
Expires: April 2007                                      Chet Juszczak

                                                          October, 2006

                       NFS RDMA Problem Statement
             draft-ietf-nfsv4-nfs-rdma-problem-statement-05

Status of this Memo

By submitting this Internet-Draft, each author represents that any
applicable patent or other IPR claims of which he or she is aware
have been or will be disclosed, and any of which he or she becomes
aware will be disclosed, in accordance with Section 6 of BCP 79.

Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.

Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other documents
at any time. It is inappropriate to use Internet-Drafts as
reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt. The list of
Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.

Abstract

This draft addresses applying Remote Direct Memory Access (RDMA) to
the NFS protocols. NFS implementations historically incur
significant overhead due to data copies on end-host systems, as well
as other sources. The potential benefits of RDMA to these
implementations are explored, and the reasons why RDMA is especially
well-suited to NFS and network file protocols in general are
evaluated.

Table of Contents

1. Introduction
   1.1. Background
2. Problem Statement
3. File Protocol Architecture
4. Sources of Overhead
   4.1. Savings from TOE
   4.2. Savings from RDMA
5. Application of RDMA to NFS
6. Conclusions
7. Security Considerations
8. IANA Considerations
9. Acknowledgements
10. Normative References
11. Informative References
Authors' Addresses
Intellectual Property and Copyright Statements

1. Introduction

The Network File System (NFS) protocol (as described in [RFC1094],
[RFC1813], and [RFC3530]) is one of several remote file access
protocols used in the class of processing architecture sometimes
called Network Attached Storage (NAS).

Historically, remote file access has proved to be a convenient,
cost-effective way to share information over a network, a concept
proven over time by the popularity of the NFS protocol. However,
there are issues in such a deployment.

As compared to a local (direct-attached) file access architecture,
NFS removes the overhead of managing the local on-disk filesystem
state and its metadata, but interposes at least a transport network
and two network endpoints between an application process and the
files it is accessing. This tradeoff has to date usually resulted
in a net performance loss as a result of reduced bandwidth,
increased application server CPU utilization, and other overheads.

Several classes of applications, including those directly supporting
enterprise activities in high-performance domains such as database
applications and shared clusters, have therefore encountered issues
with moving to NFS architectures.
While this has been due
principally to the performance costs of NFS versus direct-attached
files, other reasons are relevant, such as the lack of strong
consistency guarantees provided by NFS implementations.

Replication of local file access performance on NAS using
traditional network protocol stacks has proven difficult, not
because of protocol processing overheads, but because of data copy
costs in the network endpoints. This is especially true since host
buses are now often the main bottleneck in NAS architectures
[MOG03] [CHA+01].

The External Data Representation [RFC1832] employed beneath NFS and
RPC [RFC1831] can add more data copies, exacerbating the problem.

Data copy-avoidance designs have not been widely adopted for a
variety of reasons. [BRU99] points out that "many copy avoidance
techniques for network I/O are not applicable or may even backfire
if applied to file I/O." Other designs that eliminate unnecessary
copies, such as [PAI+00], are incompatible with existing APIs and
therefore force application changes.

In recent years, an effort to standardize a set of protocols for
Remote Direct Memory Access (RDMA) over the standard Internet
Protocol Suite has been chartered [RDDP]. Several drafts have been
proposed and are being considered for Standards Track.

RDMA is a general solution to the problem of CPU overhead incurred
due to data copies, primarily at the receiver. Substantial research
has addressed this and has borne out the efficacy of the approach.
An overview of this research is provided by the RDDP "Remote Direct
Memory Access (RDMA) over IP Problem Statement" document [RFC4297].

In addition to the per-byte savings of offloading data copies,
RDMA-enabled NICs (RNICs) offload the underlying protocol layers as
well, e.g., TCP, further reducing CPU overhead due to NAS
processing.

1.1. Background

The RDDP Problem Statement [RFC4297] asserts:

   "High costs associated with copying are an issue primarily for
   large scale systems ... with high bandwidth feeds, usually
   multiprocessors and clusters, that are adversely affected by
   copying overhead. Examples of such machines include all varieties
   of servers: database servers, storage servers, application servers
   for transaction processing, for e-commerce, and web serving,
   content distribution, video distribution, backups, data mining and
   decision support, and scientific computing.

   Note that such servers almost exclusively service many concurrent
   sessions (transport connections), which, in aggregate, are
   responsible for > 1 Gbits/s of communication. Nonetheless, the
   cost of copying overhead for a particular load is the same whether
   from few or many sessions."

Note that each of the servers listed above could be accessing its
file data as an NFS client, or serving that data to such clients as
an NFS server, or acting as both.

The CPU overhead of the NFS and TCP/IP protocol stacks (including
data copies or reduced-copy workarounds) becomes a significant
matter in these clients and servers. File access using locally
attached disks imposes relatively low overhead due to the highly
optimized I/O path and direct memory access afforded to the storage
controller. This is not the case with NFS, which must pass data to,
and especially from, the network and network processing stack to the
NFS stack.
Frequently, data copies are imposed on this transfer,
in some cases several such copies in each direction.

Copies are potentially encountered in an NFS implementation
exchanging data to and from user address spaces, within kernel
buffer caches, in XDR marshalling and unmarshalling, and within
network stacks and network drivers. Other overheads, such as
serialization among multiple threads of execution sharing a single
NFS mount point and transport connection, are additionally
encountered.

Numerous upper layer protocols achieve extremely high bandwidth and
low overhead through the use of RDMA. [MAF+02] shows that the
RDMA-based Direct Access File System (with a user-level
implementation of the file system client) can outperform even a
zero-copy implementation of NFS [CHA+01] [CHA+99] [GAL+99] [KM02].
Also, file data access implies the use of large Upper Layer Protocol
(ULP) messages. These large messages tend to amortize any increase
in per-message costs due to the offload of protocol processing
incurred when using RNICs, while gaining the benefits of reduced
per-byte costs. Finally, the direct memory addressing afforded by
RDMA avoids many sources of contention on network resources.

2. Problem Statement

The principal performance problem encountered by NFS
implementations is the CPU overhead required to implement the
protocol. Primary among the sources of this overhead is the
movement of data from NFS protocol messages to its eventual
destination in user buffers or aligned kernel buffers. Due to the
nature of the RPC and XDR protocols, the NFS data payload arrives at
arbitrary alignment, necessitating a copy at the receiver, and the
NFS requests are completed in an arbitrary sequence.

The data copies consume system bus bandwidth and CPU time, reducing
the available system capacity for applications [RFC4297].
Achieving zero-copy with NFS has, to date, required sophisticated,
version-specific "header cracking" hardware and/or extensive
platform-specific virtual memory mapping tricks. Such approaches
become even more difficult for NFS version 4 due to the existence of
the COMPOUND operation, which further reduces alignment and greatly
complicates ULP offload.

Furthermore, NFS will soon be challenged by emerging high-speed
network fabrics such as 10 Gbit/s Ethernet. Performing even raw
network I/O such as TCP is an issue at such speeds with today's
hardware. The problem is fundamental in nature and has led the IETF
to explore RDMA [RFC4297].

Zero-copy techniques benefit file protocols extensively, as they
enable direct user I/O, reduce the overhead of protocol stacks,
provide perfect alignment into caches, etc. Many studies have
already shown the performance benefits of such techniques [SKE+01]
[DCK+03] [FJNFS] [FJDAFS] [KM02] [MAF+02].

RDMA is compelling here for another reason: hardware-offloaded
networking support by itself does not avoid data copies without
resorting to implementing part of the NFS protocol in the NIC.
Support of RDMA by NFS enables the highest performance at the
architecture level rather than by implementation; this enables
ubiquitous and interoperable solutions.

By providing file access performance equivalent to that of local
file systems, NFS over RDMA will enable applications running on a
set of client machines to interact through an NFS file system, just
as applications running on a single machine might interact through a
local file system.

3. File Protocol Architecture

NFS runs as an ONC RPC [RFC1831] application. Being a file access
protocol, NFS is very "rich" in data content (versus control
information).

NFS messages can range from very small (under 100 bytes) to very
large (from many kilobytes to a megabyte or more). They are all
contained within an RPC message and follow a variable-length RPC
header. This layout provides an alignment challenge for the data
items contained in an NFS call (request) or reply (response)
message.

In addition to the control information in each NFS call or reply
message, sometimes there are large "chunks" of application file
data, for example, read and write requests. With NFS version 4 (due
to the existence of the COMPOUND operation) there can be several of
these data chunks interspersed with control information.

ONC RPC is a remote procedure call protocol that has been run over a
variety of transports. Most implementations today use UDP or TCP.
RPC messages are defined in terms of an eXternal Data Representation
(XDR) [RFC1832], which provides a canonical data representation
across a variety of host architectures. An XDR data stream is
conveyed differently on each type of transport. On UDP, RPC
messages are encapsulated inside datagrams, while on a TCP byte
stream, RPC messages are delineated by a record marking protocol.
An RDMA transport also conveys RPC messages in a unique fashion that
must be fully described if client and server implementations are to
interoperate.

The RPC transport is responsible for conveying an RPC message from a
sender to a receiver. An RPC message is either an RPC call from a
client to a server, or an RPC reply from the server back to the
client. An RPC message contains an RPC call header followed by
arguments if the message is an RPC call, or an RPC reply header
followed by results if the message is an RPC reply. The call header
contains a transaction ID (XID) followed by the program and
procedure number as well as a security credential. An RPC reply
header begins with an XID that matches that of the RPC call message,
followed by a security verifier and results. All data in an RPC
message is XDR encoded.

The encoding of XDR data into transport buffers is referred to as
"marshalling", and the decoding of XDR data from transport buffers
into destination RPC procedure result buffers is referred to as
"unmarshalling". Marshalling therefore takes place at the sender of
any particular message, be it an RPC call or an RPC reply;
unmarshalling takes place at the receiver.

Normally, any bulk data is moved (copied) as a result of the
unmarshalling process, because the destination address is not known
until the RPC code receives control and subsequently invokes the XDR
unmarshalling routine. In other words, XDR-encoded data is not
self-describing, and it carries no placement information. This
results in a data copy in most NFS implementations.
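
To make the copy concrete, the following minimal sketch (all names
and types are illustrative, not taken from any actual NFS
implementation) decodes an XDR opaque field from a receive buffer.
The bulk data begins wherever the variable-length headers happen to
end, so the decoder can do nothing but copy it into the caller's
aligned buffer after parsing:

   /* Sketch: unmarshalling an XDR opaque<> from a receive buffer.
    * 'p' points just past the decoded RPC/NFS headers, at whatever
    * byte alignment they left it. */
   #include <stdint.h>
   #include <string.h>
   #include <arpa/inet.h>

   size_t xdr_decode_opaque(const uint8_t *p, uint8_t *dst,
                            size_t dst_max)
   {
       uint32_t be_len, len;

       memcpy(&be_len, p, sizeof be_len);  /* may be unaligned */
       len = ntohl(be_len);                /* XDR length word */
       if (len > dst_max)
           len = dst_max;

       /* The copy this document is about: the payload cannot be
        * received directly into 'dst', because its position in the
        * stream is unknown until this point. */
       memcpy(dst, p + 4, len);

       return 4 + ((len + 3) & ~(size_t)3);  /* data padded to 4B */
   }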

One mechanism by which the RPC layer may overcome this is for each
request to include placement information, to be used for direct
placement during XDR encode. This "write chunk" can avoid sending
bulk data inline in an RPC message and generally results in one or
more RDMA Write operations.

Similarly, a "read chunk" conveys placement information referring to
bulk data that may be fetched directly, via one or more RDMA Read
operations, during XDR decode. The "read chunk" will therefore be
useful in both RPC calls and replies, while the "write chunk" is
used solely in replies.

These "chunks" are the key concept in an existing proposal
[RPCRDMA]. They convey what are effectively pointers to remote
memory across the network. They allow cooperating peers to exchange
data outside of XDR encodings, but still use XDR for describing the
data to be transferred. And, finally, through use of XDR they
maintain a large degree of on-the-wire compatibility.

The central concept of the RDMA transport is to provide the
additional encoding conventions to convey this placement information
in transport-specific encoding, and to modify the XDR handling of
bulk data.
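
The placement information a chunk conveys can be sketched as below,
loosely following the layout proposed in [RPCRDMA] (field names here
are illustrative; the actual XDR encoding is deferred to that
document):

   /* Sketch: the remote-memory reference conveyed by a chunk. */
   #include <stdint.h>

   struct rdma_segment {
       uint32_t handle;  /* steering tag for a registered region */
       uint32_t length;  /* length of the bulk data, in bytes */
       uint64_t offset;  /* remote virtual address of the data */
   };

   /* A read chunk also records where the segment belongs in the
    * XDR stream, so the receiver can fetch the bulk data with RDMA
    * Read and still decode the surrounding message normally. */
   struct read_chunk {
       uint32_t position;          /* byte offset in the XDR stream */
       struct rdma_segment target; /* where to fetch the data */
   };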

Block Diagram

   +------------------------+-----------------------------------+
   | NFS                    | NFS + RDMA                        |
   +------------------------+----------------------+------------+
   |            Operations / Procedures            |            |
   +-----------------------------------------------+            |
   |                    RPC/XDR                    |            |
   +--------------------------------+--------------+            |
   |        Stream Transport        |      RDMA Transport       |
   +--------------------------------+---------------------------+

4. Sources of Overhead

Network and file protocol costs can be categorized as follows:

o  per-byte costs - data touching costs such as checksum or data
   copy. Today's network interface hardware commonly offloads the
   checksum, which leaves the other major source of per-byte
   overhead, data copy.

o  per-packet costs - interrupts and lower-layer processing.
   Today's network interface hardware also commonly coalesces
   interrupts to reduce per-packet costs.

o  per-message (request or response) costs - Lower Layer Protocol
   (LLP) and ULP processing.

Improvement from optimization becomes more important if the overhead
it targets is a larger share of the total cost. As other sources of
overhead, such as the checksumming and interrupt handling above, are
eliminated, the remaining overheads (primarily data copy) loom
larger.

With copies crossing the bus twice per copy, network processing
overhead is high whenever network bandwidth is large in comparison
to CPU and memory bandwidths. Generally, with today's end-systems,
the effects are observable at network speeds at or above 1 Gbit/s.

A common question is whether an increase in CPU processing power
alleviates the problem of high processing costs of network I/O. The
answer is no; it is the memory bandwidth that is the issue. Faster
CPUs do not help if the CPU spends most of its time waiting for
memory [RFC4297].

TCP offload engine (TOE) technology aims to offload the CPU by
moving TCP/IP protocol processing to the NIC. However, TOE
technology by itself does nothing to avoid necessary data copies
within upper layer protocols. [MOG03] provides a description of the
role TOE can play in reducing per-packet and per-message costs.
Beyond the offloads commonly provided by today's network interface
hardware, TOE alone (without RDMA) helps in protocol header
processing, but this has been shown to be a minority component of
the total protocol processing overhead [CHA+01].

Numerous software approaches to the optimization of network
throughput have been made. Experience has shown that network I/O
interacts with other aspects of system processing such as file I/O
and disk I/O [BRU99] [CHU96]. Zero-copy optimizations based on page
remapping [CHU96] can be dependent upon machine architecture, and
are not scalable to multi-processor architectures. Correct buffer
alignment and sizing together are needed to optimize the performance
of zero-copy movement mechanisms [SKE+01]. The NFS message layout
described above does not facilitate the splitting of headers from
data, nor does it facilitate providing correct data buffer
alignment.

4.1. Savings from TOE

The expected improvement of TOE specifically for NFS protocol
processing can be quantified and shown to be fundamentally limited.
[SHI+03] presents a set of "LAWS" parameters which serve to
illustrate the issues. In the TOE case, the copy cost can be viewed
as part of the application processing "a". Application processing
increases the LAWS "gamma", which is shown by the paper to result in
a diminished benefit for TOE.

For example, if the overhead is 20% TCP/IP, 30% copy, and 50% real
application work, then gamma is 80/20, or 4, which means the maximum
benefit of TOE is 1/gamma, or only 25%.

For RDMA (with embedded TOE) and the same example, the "overhead"
(o) offloaded or eliminated is 50% (20% + 30%). Therefore, in the
RDMA case, gamma is 50/50, or 1, and the inverse gives the potential
benefit of 1 (100%), a factor of two.

   CPU overhead reduction factor

      No Offload   TCP Offload   RDMA Offload
     ------------+--------------+-------------
        1.00x         1.25x          2.00x

The analysis in the paper shows that RDMA could improve throughput
by the same factor of two, even when the host is (just) powerful
enough to drive the full network bandwidth without RDMA. It can
also be shown that the speedup may be higher if network bandwidth
grows faster than Moore's Law, although the higher benefits will
apply to a narrow range of applications.
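
The bound used above follows directly from the offloaded fraction of
work; the small sketch below restates the worked example in code (a
restatement of the arithmetic above, not code from [SHI+03]):

   /* Sketch: upper bound on CPU-overhead reduction from offload.
    * 'offloaded' is the fraction of total processing removed by
    * the offload (0 < offloaded < 1); gamma is the ratio of the
    * remaining work to the offloaded work, and the bound is
    * 1/gamma. */
   #include <stdio.h>

   static double max_offload_benefit(double offloaded)
   {
       double gamma = (1.0 - offloaded) / offloaded;
       return 1.0 / gamma;
   }

   int main(void)
   {
       /* TOE: only the 20% TCP/IP share is offloaded. */
       printf("TOE:  +%.0f%%\n", 100 * max_offload_benefit(0.20));
       /* RDMA with embedded TOE: 20% TCP/IP + 30% copy. */
       printf("RDMA: +%.0f%%\n", 100 * max_offload_benefit(0.50));
       return 0;  /* prints "TOE:  +25%" and "RDMA: +100%" */
   }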

4.2. Savings from RDMA

Performance measurements directly comparing an NFS-over-RDMA
prototype with conventional network-based NFS processing are
described in [CAL+03]. Comparisons of Read throughput and CPU
overhead were performed on two Gigabit Ethernet adapters, one
conventional and one with RDMA capability. The prototype RDMA
protocol performed all transfers via RDMA Read.

In these results, conventional network-based throughput was severely
limited by the client's CPU being saturated at 100% for all
transfers. Read throughput reached no more than 60 MBytes/s.

   I/O Type        Size    Read Throughput    CPU Utilization
   Conventional     2KB        20MB/s              100%
   Conventional    16KB        40MB/s              100%
   Conventional   256KB        60MB/s              100%

However, over RDMA, throughput rose to the theoretical maximum
throughput of the platform, while saturating the single-CPU system
only at maximum throughput.

   I/O Type        Size    Read Throughput    CPU Utilization
   RDMA             2KB        10MB/s               45%
   RDMA            16KB        40MB/s               70%
   RDMA           256KB       100MB/s              100%

The lower relative throughput of the RDMA prototype at the small
blocksize may be attributable to the RDMA Read imposed by the
prototype protocol, which reduces the operation rate since it
introduces additional latency. As well, it may reflect the relative
increase of per-packet setup costs within the DMA portion of the
transfer.

5. Application of RDMA to NFS

Efficient file protocols require efficient data positioning and
movement. The client system knows the client memory address where
the application has data to be written or wants read data deposited.
The server system knows the server memory address where the local
filesystem will accept write data or has data to be read. Neither
peer, however, is aware of the other's data destination in the
current NFS, RPC, or XDR protocols. Existing NFS implementations
have struggled with the performance costs of data copies when using
traditional Ethernet transports.

With the onset of faster networks, the network I/O bottleneck will
worsen. Fortunately, new transports that support RDMA have emerged.
RDMA excels at bulk transfer efficiency; it is an efficient way to
deliver direct data placement and remove a major part of the
problem: data copies. RDMA also addresses other overheads, e.g.,
underlying protocol offload, and offers separation of control
information from data.

The current NFS message layout provides the performance-enhancing
opportunity for an NFS-over-RDMA protocol that separates the control
information from data chunks while meeting the alignment needs of
both. The data chunks can be copied "directly" between the client
and server memory addresses above (with a single occurrence on each
memory bus) while the control information can be passed "inline".
[RPCRDMA] describes such a protocol.
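
The exchange such a protocol enables can be sketched as follows.
This is a minimal client-side sketch only; the helpers
(register_memory, send_inline, recv_inline) and the message layout
are hypothetical stand-ins for whatever registration and send
primitives an RDMA transport provides, not any particular API or the
actual [RPCRDMA] encoding:

   /* Sketch: an NFS READ using a "write chunk". The client
    * advertises its destination buffer; the server RDMA Writes the
    * file data directly into it, so no receive-side copy occurs. */
   #include <stdint.h>
   #include <stddef.h>

   struct rdma_segment { uint32_t handle, length; uint64_t offset; };

   /* Hypothetical transport helpers (assumed for illustration). */
   extern struct rdma_segment register_memory(void *buf, size_t len);
   extern void send_inline(const void *msg, size_t len);
   extern size_t recv_inline(void *msg, size_t len);

   size_t nfs_read_over_rdma(void *user_buf, size_t count)
   {
       /* 1. Expose the destination buffer as a write chunk. */
       struct rdma_segment chunk = register_memory(user_buf, count);

       /* 2. Send the READ call inline, carrying the chunk. */
       struct { uint32_t xid; struct rdma_segment wchunk; } call =
           { 0x1234, chunk };  /* remaining header fields elided */
       send_inline(&call, sizeof call);

       /* 3. The server (not shown) RDMA Writes the file data into
        *    'user_buf' via chunk.handle, then sends its reply. */

       /* 4. The inline reply carries only control information; the
        *    bulk data is already placed in 'user_buf'. */
       uint32_t reply[8];
       (void)recv_inline(reply, sizeof reply);
       return count;  /* simplification: assume a full-length read */
   }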

6. Conclusions

NFS version 4 [RFC3530] has been granted "Proposed Standard" status.
The NFSv4 protocol was developed along several design points,
important among them: effective operation over wide-area networks,
including the Internet itself; strong security integrated into the
protocol; extensive cross-platform interoperability, including
integrated locking semantics compatible with multiple operating
systems; and (this is key) protocol extension.

NFS version 4 is an excellent base on which to add the needed
performance enhancements and improved semantics described above.
The minor versioning support defined in NFS version 4 was designed
to support protocol improvements without disruption to the installed
base. Evolutionary improvement of the protocol via minor versioning
is a conservative and cautious approach to current and future
problems and shortcomings.

Many arguments can be made as to the efficacy of the file
abstraction in meeting the future needs of enterprise data service
and the Internet. Fine-grained Quality of Service (QoS) policies
(e.g., data delivery, retention, availability, security, ...) are
high among them.

It is vital that the NFS protocol continue to provide these benefits
to a wide range of applications, without its usefulness being
compromised by concerns about performance and semantic inadequacies.
This can reasonably be addressed in the existing NFS protocol
framework. A cautious evolutionary improvement of performance and
semantics allows building on the value already present in the NFS
protocol, while addressing new requirements that have arisen from
the application of networking technology.

7. Security Considerations

Security considerations are not covered by this document. Please
refer to the appropriate protocol documents for any security issues.

8. IANA Considerations

IANA considerations are not covered by this document. Please refer
to the appropriate protocol documents for any IANA issues.

9. Acknowledgements

The authors wish to thank Jeff Chase, who provided many useful
suggestions.

10. Normative References

[RFC3530]
   S. Shepler, et al., "NFS Version 4 Protocol", Standards Track RFC.

[RFC1831]
   R. Srinivasan, "RPC: Remote Procedure Call Protocol Specification
   Version 2", Standards Track RFC.

[RFC1832]
   R. Srinivasan, "XDR: External Data Representation Standard",
   Standards Track RFC.

[RFC1813]
   B. Callaghan, B. Pawlowski, P. Staubach, "NFS Version 3 Protocol
   Specification", Informational RFC.

11. Informative References

[BRU99]
   J. Brustoloni, "Interoperation of copy avoidance in network and
   file I/O", in Proc. INFOCOM '99, pages 534-542, New York, NY,
   March 1999, IEEE. Also available from
   http://www.cs.pitt.edu/~jcb/publs.html

[CAL+03]
   B. Callaghan, T. Lingutla-Raj, A. Chiu, P. Staubach, O. Asad,
   "NFS over RDMA", in Proceedings of ACM SIGCOMM Summer 2003 NICELI
   Workshop.

[CHA+01]
   J. S. Chase, A. J. Gallatin, K. G. Yocum, "Endsystem
   optimizations for high-speed TCP", IEEE Communications,
   39(4):68-74, April 2001.

[CHA+99]
   J. S. Chase, D. C. Anderson, A. J. Gallatin, A. R. Lebeck, K. G.
   Yocum, "Network I/O with Trapeze", in 1999 Hot Interconnects
   Symposium, August 1999.

[CHU96]
   H.K. Chu, "Zero-copy TCP in Solaris", in Proc. of the USENIX 1996
   Annual Technical Conference, San Diego, CA, January 1996.

[DCK+03]
   M. DeBergalis, P. Corbett, S. Kleiman, A. Lent, D. Noveck, T.
   Talpey, M. Wittle, "The Direct Access File System", in
   Proceedings of the 2nd USENIX Conference on File and Storage
   Technologies (FAST '03), San Francisco, CA, March 31 - April 2,
   2003.

[FJDAFS]
   Fujitsu Prime Software Technologies, "Meet the DAFS Performance
   with DAFS/VI Kernel Implementation using cLAN", available from
   http://www.pst.fujitsu.com/english/dafsdemo/index.html, 2001.

[FJNFS]
   Fujitsu Prime Software Technologies, "An Adaptation of VIA to NFS
   on Linux", available from
   http://www.pst.fujitsu.com/english/nfs/index.html, 2000.

[GAL+99]
   A. Gallatin, J. Chase, K. Yocum, "Trapeze/IP: TCP/IP at Near-
   Gigabit Speeds", in 1999 USENIX Technical Conference (Freenix
   Track), June 1999.

[KM02]
   K. Magoutis, "Design and Implementation of a Direct Access File
   System (DAFS) Kernel Server for FreeBSD", in Proceedings of the
   USENIX BSDCon 2002 Conference, San Francisco, CA, February 11-14,
   2002.

[MAF+02]
   K. Magoutis, S. Addetia, A. Fedorova, M. Seltzer, J. Chase, D.
   Gallatin, R. Kisley, R. Wickremesinghe, E. Gabber, "Structure and
   Performance of the Direct Access File System (DAFS)", in
   Proceedings of the 2002 USENIX Annual Technical Conference,
   Monterey, CA, June 9-14, 2002.

[MOG03]
   J. Mogul, "TCP offload is a dumb idea whose time has come", in
   9th Workshop on Hot Topics in Operating Systems (HotOS IX),
   Lihue, HI, May 2003, USENIX.

[NFSv4.1]
   S. Shepler, ed., "NFSv4 Minor Version 1", Internet-Draft work in
   progress, draft-ietf-nfsv4-minorversion1.

[PAI+00]
   V. S. Pai, P. Druschel, W. Zwaenepoel, "IO-Lite: a unified I/O
   buffering and caching system", ACM Trans. Computer Systems,
   18(1):37-66, February 2000.

[RDDP]
   RDDP Working Group charter,
   http://www.ietf.org/html.charters/rddp-charter.html

[RFC4297]
   A. Romanow, J. Mogul, T. Talpey, S. Bailey, "Remote Direct Memory
   Access (RDMA) over IP Problem Statement", Informational RFC.

[RFC1094]
   Sun Microsystems, "NFS: Network File System Protocol
   Specification".

[RPCRDMA]
   T. Talpey, B. Callaghan, "RDMA Transport for ONC RPC",
   Internet-Draft work in progress, draft-ietf-nfsv4-rpcrdma.

[SHI+03]
   P. Shivam, J. Chase, "On the Elusive Benefits of Protocol
   Offload", in Proceedings of ACM SIGCOMM Summer 2003 NICELI
   Workshop, also available from
   http://issg.cs.duke.edu/publications/niceli03.pdf

[SKE+01]
   K.-A. Skevik, T. Plagemann, V. Goebel, P. Halvorsen, "Evaluation
   of a Zero-Copy Protocol Implementation", in Proceedings of the
   27th Euromicro Conference - Multimedia and Telecommunications
   Track (MTT'2001), Warsaw, Poland, September 2001.

Authors' Addresses

Tom Talpey
Network Appliance, Inc.
375 Totten Pond Road
Waltham, MA 02451 USA

Phone: +1 781 768 5329
Email: thomas.talpey@netapp.com

Chet Juszczak
Chet's Boathouse Co.
P.O. Box 1467
Merrimack, NH 03054

Email: chetnh@earthlink.net

Intellectual Property and Copyright Statements

Intellectual Property Statement

The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed
to pertain to the implementation or use of the technology described
in this document or the extent to which any license under such
rights might or might not be available; nor does it represent that
it has made any independent effort to identify any such rights.
Information on the procedures with respect to rights in RFC
documents can be found in BCP 78 and BCP 79.

Copies of IPR disclosures made to the IETF Secretariat and any
assurances of licenses to be made available, or the result of an
attempt made to obtain a general license or permission for the use
of such proprietary rights by implementers or users of this
specification can be obtained from the IETF on-line IPR repository
at http://www.ietf.org/ipr.

The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary
rights that may cover technology that may be required to implement
this standard. Please address the information to the IETF at
ietf-ipr@ietf.org.

Disclaimer of Validity

This document and the information contained herein are provided on
an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE
INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Copyright Statement

Copyright (C) The Internet Society (2006).

This document is subject to the rights, licenses and restrictions
contained in BCP 78, and except as set forth therein, the authors
retain all their rights.

Acknowledgement

Funding for the RFC Editor function is currently provided by the
Internet Society.