idnits 2.17.1

draft-ietf-rddp-rdma-concerns-01.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in
     this document.
     Expected boilerplate is as follows today (2024-04-20) according to
     https://trustee.ietf.org/license-info :

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.a:

        This Internet-Draft is submitted in full conformance with the
        provisions of BCP 78 and BCP 79.

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2:

        Copyright (c) 2024 IETF Trust and the persons identified as the
        document authors.  All rights reserved.

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3:

        This document is subject to BCP 78 and the IETF Trust's Legal
        Provisions Relating to IETF Documents
        (https://trustee.ietf.org/license-info) in effect on the date of
        publication of this document.  Please review these documents
        carefully, as they describe your rights and restrictions with respect
        to this document.  Code Components extracted from this document must
        include Simplified BSD License text as described in Section 4.e of
        the Trust Legal Provisions and are provided without warranty as
        described in the Simplified BSD License.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard

  == The page length should not exceed 58 lines per page, but there was 7
     longer pages, the longest (page 6) being 68 lines

  == It seems as if not all pages are separated by form feeds - found 0 form
     feeds but 7 pages

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an Introduction section.
     (A line matching the expected section header was found, but with an
     unexpected indentation: ' 1. Overview' )

  ** The document seems to lack a Security Considerations section.
     (A line matching the expected section header was found, but with an
     unexpected indentation: ' 5. Security Considerations' )

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The abstract seems to contain references ([RFC2401], [RDDP-arch],
     [RDDP-ps], [RFC2119], [RFC2246], [RFC896], [RDDP-arch,RDDP-ps]), which
     it shouldn't.  Please replace those with straight textual mentions of
     the documents in question.

  ** The document seems to lack a both a reference to RFC 2119 and the
     recommended RFC 2119 boilerplate, even if it appears to use RFC 2119
     keywords -- however, there's a paragraph with a matching beginning.
     Boilerplate error?

     RFC 2119 keyword, line 212: '... specifications MUST contain mechanis...'
     RFC 2119 keyword, line 215: '...e specifications MUST contain mechanis...'
     RFC 2119 keyword, line 247: '...n addition, an RDMA specification MUST...'
     RFC 2119 keyword, line 257: '... (respectively) MAY be included in RD...'
     RFC 2119 keyword, line 258: '... are NOT REQUIRED....'
     (3 more instances...)

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  -- The exact meaning of the all-uppercase expression 'NOT REQUIRED' is not
     defined in RFC 2119.  If it is intended as a requirements expression, it
     should be rewritten using one of the combinations defined in RFC 2119;
     otherwise it should not be all-uppercase.

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to
     grant the BCP78 rights to the IETF Trust, then this is fine, and you can
     ignore this comment.  If not, you may need to add the pre-RFC5378
     disclaimer.  (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (June 2004) is 7249 days in the past.  Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Missing reference section? 'RDDP-arch' on line 318 looks like a reference

  -- Missing reference section? 'RDDP-ps' on line 323 looks like a reference

  -- Missing reference section? 'RFC2119' on line 138 looks like a reference

  -- Missing reference section? 'RFC 896' on line 326 looks like a reference

  -- Missing reference section? 'RFC 2246' on line 330 looks like a reference

  -- Missing reference section? 'RFC 2401' on line 332 looks like a reference

  -- Missing reference section? 'RFC 2119' on line 328 looks like a reference

     Summary: 6 errors (**), 0 flaws (~~), 3 warnings (==), 10 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

  2  Internet Draft                                         David L. Black
  4  Document: draft-ietf-rddp-rdma-concerns-01.txt                    EMC
  6  Expires: December 2004                               Michael F. Speer
  8                                                                    Sun
 10                                                        John Wroclawski
 12                                                                    MIT
 14                                                              June 2004

 16  DDP and RDMA Concerns

 18  Status of this Memo

 20  By submitting this Internet-Draft, I certify that any applicable
 21  patent or other IPR claims of which I am aware have been disclosed,
 23  and any of which I become aware will be disclosed, in accordance
 24  with RFC 3668.

 26  Internet-Drafts are working documents of the Internet Engineering
 27  Task Force (IETF), its areas, and its working groups. Note that
 28  other groups may also distribute working documents as Internet-
 29  Drafts.

 31  Internet-Drafts are draft documents valid for a maximum of six
 32  months and may be updated, replaced, or obsoleted by other
 33  documents at any time. It is inappropriate to use Internet-Drafts
 34  as reference material or to cite them other than as "work in
 35  progress."

 37  The list of current Internet-Drafts can be accessed at
 38  http://www.ietf.org/ietf/1id-abstracts.txt
 39  The list of Internet-Draft Shadow Directories can be accessed at
 40  http://www.ietf.org/shadow.html.

 42  Abstract

 44  This draft describes technical concerns that should be considered
 45  in the design of standardized RDMA and DDP protocols/mechanisms for
 47  use with Internet transport protocols. This draft was written to
 48  provide input to the proposed new Remote Direct Data Placement
 49  (rddp) WG, and is not intended for publication as an RFC.

 51  This is an updated and resubmitted version of draft-ietf-rddp-rdma-
 52  concerns-00.txt to make it available for current discussions of
 53  mandatory-to-implement security in the RDDP WG. Sections 4.1, 4.2,
 55  and 5 are of particular relevance to that discussion.

 57  DDP and RDMA Concerns                                            June
 58  2004

 60  Table of Contents

 62  1. Overview......................................................2
 63  2. Conventions used in this document.............................3
 64  3. Architectural Concerns........................................3
 65     3.1 Buffer Management.........................................3
 66     3.2 Reliability...............................................4
 67  4. Memory is more general than Transport Buffers.................4
 68     4.1 Overwrites................................................4
 69     4.2 Concurrent Operations to the Same Memory..................5
 70     4.3 Completions and Ordering..................................5
 71     4.4 Transfer Granularity......................................5
 72  5. Security Considerations.......................................6
 73  References.......................................................6
 74  Acknowledgements.................................................7
 75  Authors' Addresses...............................................7

 77  1. Overview

 79  A new effort to standardize RDMA (Remote Direct Memory Access) and
 80  DDP (Direct Data Placement) protocols/mechanisms for Internet
 81  transport protocols is going to take place in the proposed IETF
 82  Remote Direct Data Placement (rddp) WG. This draft describes
 83  technical concerns that should be addressed in the design and
 84  standardization of these protocols. A basic understanding of RDMA
 85  and DDP is assumed; while a basic introduction is included in this
 86  section, readers unfamiliar with these concepts may wish to refer
 87  to [RDDP-arch, RDDP-ps] for more background.

 89  Both Direct Data Placement (DDP) and Remote Direct Memory Access
 90  (RDMA) have the goal of eliminating copies between the protocol
 91  stack and application buffers at the receiver. For example, when a
 93  4-kilobyte file or disk block is retrieved, most operating systems
 94  expect the resulting block to be in 4kB of contiguous memory
 95  aligned to a 4kB boundary, but most networking interfaces do not
 96  behave in this fashion. The result is that a copy is required to
 97  produce an aligned 4kB block of data from the data delivered by the
 99  network interface.
     This copy has undesirable performance impacts;
100  the goal of DDP and RDMA is to enable elimination of this copy in
101  an application- and protocol-independent fashion. The basic
102  concept is that the sender identifies data to be placed directly
103  into application buffers, and transmits that identification with
104  the data so that the receiver can place the data directly into
105  application buffers when it is received.

107  DDP is envisioned to share network transport buffers with
108  applications, but to use application-specified tags and offsets to
109  select buffers for use on receive. The primary purposes of this
114  information are to separate application data from headers and deal
115  with applications that return data in unpredictable orders (e.g.,
116  the results of concurrent file and disk operations may be returned
117  to the invoker in arbitrary order). One way to view DDP on the
118  wire is that it annotates (or "decorates") data that would have
119  been sent anyway.

121  RDMA uses DDP or a DDP-like mechanism to implement remote read and
122  write operations on memory regions explicitly exported by end
123  systems. A tag is used to designate a memory region, and an offset
125  is used to indicate the address within that region. RDMA differs
126  from DDP in that it provides a memory abstraction rather than a
127  transport buffer abstraction. This raises concerns based on the
128  ways in which transport buffers differ from memory in general. In
129  addition, the system coupling over a potentially unreliable network
131  implied by DDP and RDMA raises several architectural concerns.

133  2. Conventions used in this document

135  The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
137  "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in
138  this document are to be interpreted as described in [RFC2119],
139  although they are used here to describe requirements on protocol
140  development and standardization rather than on protocol
141  implementations.

143  3. Architectural Concerns

145  Both DDP and RDMA expose memory resources on the receiver to one or
147  more potentially untrustworthy sender(s) over a potentially
148  unreliable network. This has a number of architectural
149  implications, particularly for resource management.

151  3.1 Buffer Management

153  Traditional network stacks utilize a pool of interchangeable (aka
154  anonymous) buffers to hold data received from the network. By
155  using specific identifiable application buffers, DDP and RDMA make
156  the memory used for specific receive operations identifiable and
157  may cause protocols to devote more resources to the receive
158  function than might otherwise be the case. In situations where
159  effective use is being made of DDP and/or RDMA, the actual resource
161  demand on the system may be lessened (e.g., because applications
162  only expose memory that is in their working set), but it is
163  necessary to anticipate applications that use DDP and RDMA in a way
165  that increases resource demands and take appropriate precautions to
167  limit system degradation.

172  3.2 Reliability

174  RDMA is motivated by experiences with both local DMA and transfers
175  over reliable channels; these experiences will not be completely
176  applicable to RDMA over IP networks. Local DMA provides an extreme
178  example, in that a local DMA failure is usually caused by hardware
179  problems that often result in the hardware being considered to have
181  failed.
     In contrast, RDMA over IP must deal with a variety of
182  "stupid IP network tricks" as part of its normal operation.
183  Channel behavior is a less extreme example as channel controllers
184  must expect occasional channel failures and be prepared to deal
185  with the result; one example can be found in multipathing software
186  for disk storage access.

188  This set of concerns is roughly analogous to the reliability
189  difference between local and remote procedure calls and its impact
190  on distributed system design [need to add a reference here]. The
191  impact of the difference in reliability between local DMA and/or
192  channels vs. RDMA needs to be considered as part of any
193  specification effort, but may be best dealt with in applicability
194  statements as opposed to making these considerations part of the
195  core protocol specifications.

197  4. Memory is more general than Transport Buffers

199  The following subsections describe concerns arising from the fact
200  that memory that can be read and/or written is a more general and
201  capable abstraction than a transport buffer.

203  4.1 Overwrites

205  A transport buffer can be written exactly once when the data is
206  received; in contrast, memory can be written multiple times. This
207  creates the opportunity for received DDP and RDMA data to overwrite
209  other data, including previously received data (that may or may not
211  have been transferred to the application(s)). DDP and RDMA
212  specifications MUST contain mechanisms to prevent overwrites from
213  impairing system integrity and to isolate the effect of overwrites
214  so that interference among otherwise unrelated applications is
215  prevented.
     In addition, the specifications MUST contain mechanisms
216  that allow applications to control the exposure of memory used for
217  DDP and RDMA receives to subsequent overwrites; this is to enable
218  an application to know that a check on received data (e.g., for
219  integrity) is performed after changes to it can no longer be made
220  by remote nodes via DDP or RDMA.

225  4.2 Concurrent Operations to the Same Memory

227  If a remote (or local) write takes place concurrently with a read
228  to the same memory, the read may return an arbitrary mix of the old
230  and new contents of the memory. If a remote (or local) write takes
232  place concurrently with another write, the resulting memory
233  contents may be an arbitrary mix of the data from the two writes.
234  These results are generally considered undesirable, and should be
235  avoided. DDP and RDMA specifications must consider how these
236  situations are to be avoided (e.g., application-level
237  synchronization may be required), so that at worst they will occur
238  only as the result of application errors in using DDP and RDMA.

240  4.3 Completions and Ordering

242  RDMA Read and Write operations are asynchronous with respect to the
244  protocol layers above RDMA, hence completion mechanisms are
245  necessary to enable applications to determine when RDMA operations
246  have completed, although these mechanisms need not be invoked for
247  every RDMA operation. In addition, an RDMA specification MUST
248  include the assumptions that an application may and may not make
249  about the state of "prior" RDMA operations based on observing the
250  completion of a specific RDMA operation. The word "prior" is in
251  quotes because an RDMA specification will need to define it as part
253  of specifying permissible inference of completion of "prior"
254  operations; the definition is likely to involve a partial order.
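
As a purely illustrative sketch of what a partial-order definition of
"prior" could look like, the Python model below assumes (hypothetically;
this draft does not mandate any particular rule) that operations are
ordered only within a single stream. Under that assumption, observing the
completion of one operation lets an application infer completion of
earlier operations on the same stream and nothing about other streams.
The names Op, is_prior and implied_complete are invented for this example
and do not come from any RDMA specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    """Hypothetical RDMA operation: a stream id plus an issue sequence number."""
    stream: int
    seq: int


def is_prior(a: Op, b: Op) -> bool:
    """Assumed partial order: a is "prior" to b only when both are on the
    same stream and a was issued first. Operations on different streams
    are unordered with respect to each other."""
    return a.stream == b.stream and a.seq < b.seq


def implied_complete(observed: Op, issued: list[Op]) -> set[Op]:
    """Operations the application may treat as complete after observing
    the completion of `observed`, under the assumed rule above."""
    return {op for op in issued if op == observed or is_prior(op, observed)}


issued = [Op(0, 1), Op(0, 2), Op(1, 1), Op(0, 3)]
done = implied_complete(Op(0, 2), issued)
```

Here `done` contains only the two operations on stream 0 with sequence
numbers 1 and 2; the operation on stream 1 and the later operation on
stream 0 remain undetermined, which is the kind of "may and may not
assume" statement a specification would need to spell out.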

256  Fence and stream abstractions to enforce and prevent ordering
257  (respectively) MAY be included in RDMA and DDP specifications, but
258  are NOT REQUIRED.

260  4.4 Transfer Granularity

262  IP transports include the functionality to bundle data so that a
263  set of small user transfers is accomplished via a single larger
264  transfer across the network and through the relevant portions of
265  the protocol stacks. By defining specific remote operations that
266  an application may reasonably expect to complete in a timely
267  fashion, RDMA may disrupt this behavior by requiring smaller
268  transfers to be done promptly. The potential inefficiencies of the
270  resulting behavior for protocol stacks and networks have been known
272  for a long time; see the discussion of the small-packet problem in
273  [RFC 896]. Any RDMA specification MUST consider the ability to
274  bundle operations and the potential performance impact of
275  performing multiple smaller transfers in place of a single larger
276  one. This may also apply to DDP, but the first priority is that
277  DDP SHOULD NOT cause major changes to the transmission behavior of
278  any transport protocol to which it is applied by comparison to the
279  same stream without the DDP annotations (some degree of minor
284  change is unavoidable due to the space consumed by the DDP
285  annotations).

287  5. Security Considerations

289  With the possible exception of the Completions and Ordering concerns
291  described in Section 4.3, all of these concerns have security
292  implications in that failing to deal with them adequately may
293  expose system resources, correct operation and/or integrity to
294  attack.

296  When memory is accessible via the network, such access must be
297  controlled, as allowing arbitrary access by untrusted entities
298  discloses the contents of the memory (read access) and/or allows it
300  to be corrupted (write access).
     Specifically, it is necessary to
301  provide mechanisms that enable applications to control RDMA and DDP
303  access to their exported memory by both identity (RDMA and DDP) and
305  type of access (read vs. write - RDMA only); this inherently
306  involves authentication of the principals granted access in order
307  to distinguish authorized from unauthorized access. Such
308  authentication MAY be implemented outside the DDP and/or RDMA
309  protocols (e.g., in the application or a separate security protocol
311  such as TLS [RFC 2246] or IPsec [RFC 2401]) provided that means are
313  specified to securely couple the authorization of DDP and RDMA
314  operations to the corresponding authentications.

316  References

318  [RDDP-arch] Bailey, S. and T. Talpey, "The Architecture of Direct
319     Data Placement (DDP) And Remote Direct Memory Access (RDMA) On
320     Internet Protocols", Internet-Draft draft-ietf-rddp-arch-04.txt,
322     Work in Progress, January 2004.
323  [RDDP-ps] Romanow, A., J. Mogul, T. Talpey, and S. Bailey, "RDMA
324     over IP Problem Statement", Internet-Draft draft-ietf-rddp-
325     problem-statement-03.txt, Work in Progress, January 2004.
326  [RFC 896] Nagle, J., "Congestion Control in IP/TCP Internetworks",
327     RFC 896, January 1984.
328  [RFC 2119] Bradner, S., "Key words for use in RFCs to Indicate
329     Requirement Levels", RFC 2119, BCP 14, March 1997.
330  [RFC 2246] Dierks, T. and C. Allen, "The TLS Protocol Version
331     1.0", RFC 2246, January 1999.
332  [RFC 2401] Kent, S. and R. Atkinson, "Security Architecture for the
334     Internet Protocol", RFC 2401, November 1998.

339  Acknowledgements

341  This draft is based in part on a presentation and discussion at an
342  end2end research group meeting at MIT in May 2002 - the authors
343  thank the end2end RG for providing the opportunity and gratefully
344  acknowledge the comments and suggestions of participants.

346  Authors' Addresses

348  David L. Black
349  EMC Corporation
350  176 South Street                  Phone: +1 (508) 293-7953
351  Hopkinton, MA, 01748, USA         Email: black_david@emc.com

353  Michael F. Speer
354  Sun Microsystems, Inc.
355  4150 Network Circle UMPK17-103    Phone: +1 (650) 786-6445
356  Santa Clara, CA 95054             Email: michael.speer@sun.com

358  John Wroclawski
359  MIT Lab for Computer Science
360  200 Technology Square             Phone: +1 (617) 253-7885
361  Cambridge, MA 02139               Email: jtw@lcs.mit.edu