idnits 2.17.1 draft-ietf-rddp-security-10.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** It looks like you're using RFC 3978 boilerplate. You should update this to the boilerplate described in the IETF Trust License Policy document (see https://trustee.ietf.org/license-info), which is required now. -- Found old boilerplate from RFC 3978, Section 5.1 on line 15. -- Found old boilerplate from RFC 3978, Section 5.5 on line 2432. -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 2443. -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 2450. -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 2456. ** This document has an original RFC 3978 Section 5.4 Copyright Line, instead of the newer IETF Trust Copyright according to RFC 4748. ** This document has an original RFC 3978 Section 5.5 Disclaimer, instead of the newer disclaimer which includes the IETF Trust according to RFC 4748. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- == The page length should not exceed 58 lines per page, but there was 1 longer page, the longest (page 1) being 59 lines Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- No issues found here. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not match the current year == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD', or 'RECOMMENDED' is not an accepted usage according to RFC 2119. Please use uppercase 'NOT' together with RFC 2119 keywords (if that is what you mean). 
Found 'not RECOMMENDED' in this paragraph: For these reasons, it is not RECOMMENDED that TLS be layered on top of RDMAP or DDP. -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (June 2006) is 6525 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Missing Reference: 'RFC 2246' is mentioned on line 1024, but not defined ** Obsolete undefined reference: RFC 2246 (Obsoleted by RFC 4346) == Unused Reference: 'IPv6-Trust' is defined on line 1972, but no explicit reference was found in the text == Unused Reference: 'VERBS-RDMAC-Overview' is defined on line 1985, but no explicit reference was found in the text == Unused Reference: 'ISER' is defined on line 2004, but no explicit reference was found in the text == Outdated reference: A later version (-07) exists of draft-ietf-rddp-ddp-05 == Outdated reference: A later version (-07) exists of draft-ietf-rddp-rdmap-05 ** Obsolete normative reference: RFC 2406 (Obsoleted by RFC 4303, RFC 4305) ** Obsolete normative reference: RFC 2409 (Obsoleted by RFC 4306) ** Obsolete normative reference: RFC 2401 (Obsoleted by RFC 4301) ** Obsolete normative reference: RFC 2402 (Obsoleted by RFC 4302, RFC 4305) ** Obsolete normative reference: RFC 2960 (Obsoleted by RFC 4960) ** Obsolete normative reference: RFC 793 (Obsoleted by RFC 9293) -- Obsolete informational reference (is this 
intentional?): RFC 2828 (Obsoleted by RFC 4949) == Outdated reference: A later version (-08) exists of draft-ietf-rddp-applicability-06 == Outdated reference: A later version (-04) exists of draft-ietf-nfsv4-channel-bindings-02 -- Obsolete informational reference (is this intentional?): RFC 4347 (ref. 'DTLS') (Obsoleted by RFC 6347) == Outdated reference: A later version (-06) exists of draft-ietf-ips-iser-05 -- Obsolete informational reference (is this intentional?): RFC 3530 (ref. 'NFSv4') (Obsoleted by RFC 7530) Summary: 10 errors (**), 0 flaws (~~), 12 warnings (==), 10 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Internet Draft James Pinkerton 3 draft-ietf-rddp-security-10.txt Microsoft Corporation 4 Category: Standards Track Ellen Deleganes 5 Expires: December, 2006 Intel Corporation 6 June 2006 8 DDP/RDMAP Security 10 Status of this Memo 11 By submitting this Internet-Draft, each author represents that 12 any applicable patent or other IPR claims of which he or she is 13 aware have been or will be disclosed, and any of which he or she 14 becomes aware will be disclosed, in accordance with Section 6 of 15 BCP 79. 17 Internet-Drafts are working documents of the Internet Engineering 18 Task Force (IETF), its areas, and its working groups. Note that 19 other groups may also distribute working documents as Internet- 20 Drafts. 22 Internet-Drafts are draft documents valid for a maximum of six 23 months and may be updated, replaced, or obsoleted by other 24 documents at any time. It is inappropriate to use Internet-Drafts 25 as reference material or to cite them other than as "work in 26 progress." 28 The list of current Internet-Drafts can be accessed at 29 http://www.ietf.org/ietf/1id-abstracts.txt 31 The list of Internet-Draft Shadow Directories can be accessed at 32 http://www.ietf.org/shadow.html. 
34 Abstract 35 This document analyzes security issues around implementation and 36 use of the Direct Data Placement Protocol (DDP) and Remote Direct 37 Memory Access Protocol (RDMAP). It first defines an architectural 38 model for an RDMA Network Interface Card (RNIC), which can 39 implement DDP alone or both RDMAP and DDP. The document reviews various 40 attacks against the resources defined in the architectural model 41 and the countermeasures that can be used to protect the system. 42 Attacks are grouped into those that can be mitigated by using 43 secure communication channels across the network, attacks from 44 Remote Peers, and attacks from Local Peers. Attack categories 45 include spoofing, tampering, information disclosure, denial of 46 service, and elevation of privilege. 48 J. Pinkerton, et al. Expires December, 2006 1 49 Table of Contents 51 1 Introduction.................................................4 52 2 Architectural Model..........................................7 53 2.1 Components...................................................8 54 2.2 Resources...................................................10 55 2.2.1 Stream Context Memory.....................................10 56 2.2.2 Data Buffers..............................................10 57 2.2.3 Page Translation Tables...................................11 58 2.2.4 Protection Domain (PD)....................................11 59 2.2.5 STag Namespace and Scope..................................12 60 2.2.6 Completion Queues.........................................13 61 2.2.7 Asynchronous Event Queue..................................13 62 2.2.8 RDMA Read Request Queue...................................13 63 2.3 RNIC Interactions...........................................14 64 2.3.1 Privileged Control Interface Semantics....................14 65 2.3.2 Non-Privileged Data Interface Semantics...................14 66 2.3.3 Privileged Data Interface Semantics.......................15 67 2.3.4 Initialization of RNIC 
Data Structures for Data Transfer..15 68 2.3.5 RNIC Data Transfer Interactions...........................16 69 3 Trust and Resource Sharing..................................18 70 4 Attacker Capabilities.......................................19 71 5 Attacks That Can be Mitigated With End-to-End Security......20 72 5.1 Spoofing....................................................20 73 5.1.1 Impersonation.............................................20 74 5.1.2 Stream Hijacking..........................................21 75 5.1.3 Man-in-the-Middle Attack..................................21 76 5.2 Tampering - Network based modification of buffer content....22 77 5.3 Information Disclosure - Network Based Eavesdropping........22 78 5.4 Specific Requirements for Security Services.................22 79 5.4.1 Introduction to Security Options..........................23 80 5.4.2 TLS is Inappropriate for DDP/RDMAP Security...............23 81 5.4.3 DTLS and RDDP.............................................24 82 5.4.4 ULPs Which Provide Security...............................24 83 5.4.5 Requirements for IPsec Encapsulation of DDP...............25 84 6 Attacks from Remote Peers...................................26 85 6.1 Spoofing....................................................26 86 6.1.1 Using an STag on a Different Stream.......................26 87 6.2 Tampering...................................................27 88 6.2.1 Buffer Overrun - RDMA Write or Read Response..............28 89 6.2.2 Modifying a Buffer After Indication.......................28 90 6.2.3 Multiple STags to access the same buffer..................29 91 6.3 Information Disclosure......................................29 92 6.3.1 Probing memory outside of the buffer bounds...............29 93 6.3.2 Using RDMA Read to Access Stale Data......................29 94 6.3.3 Accessing a Buffer After the Transfer.....................30 95 6.3.4 Accessing Unintended Data With a Valid STag...............30 96 6.3.5 
RDMA Read into an RDMA Write Buffer.......................30 97 6.3.6 Using Multiple STags Which Alias to the Same Buffer.......31 98 6.4 Denial of Service (DOS).....................................31 99 6.4.1 RNIC Resource Consumption.................................32 100 6.4.2 Resource Consumption by Idle ULPs.........................32 101 6.4.3 Resource Consumption By Active ULPs.......................33 102 6.4.3.1 Multiple Streams Sharing Receive Buffers...............33 103 6.4.3.2 Remote or Local Peer Attacking a Shared CQ.............35 104 6.4.3.3 Attacking the RDMA Read Request Queue..................37 105 6.4.4 Exercise of non-optimal code paths........................38 106 6.4.5 Remote Invalidate an STag Shared on Multiple Streams......38 107 6.4.6 Remote Peer attacking an Unshared CQ......................39 108 6.5 Elevation of Privilege......................................39 109 7 Attacks from Local Peers....................................40 110 7.1 Local ULP Attacking a Shared CQ.............................40 111 7.2 Local Peer Attacking the RDMA Read Request Queue............40 112 7.3 Local ULP Attacking the PTT & STag Mapping..................40 113 8 Security considerations.....................................42 114 9 IANA Considerations.........................................43 115 10 References..................................................44 116 10.1 Normative References......................................44 117 10.2 Informative References....................................44 118 11 Appendix A: ULP Issues for RDDP Client/Server Protocols.....46 119 12 Appendix B: Summary of RNIC and ULP Implementation 120 Requirements.....................................................50 121 13 Appendix C: Partial Trust Taxonomy..........................52 122 14 Author's Addresses..........................................54 123 15 Acknowledgments.............................................55 124 16 Full Copyright 
Statement....................................57 126 Table of Figures 128 Figure 1 - RDMA Security Model....................................8 130 1 Introduction 132 RDMA enables new levels of flexibility when communicating between 133 two parties compared to current conventional networking practice 134 (e.g. a stream-based model or datagram model). This flexibility 135 brings new security issues that must be carefully understood when 136 designing Upper Layer Protocols (ULPs) utilizing RDMA and when 137 implementing RDMA-aware NICs (RNICs). Note that for the purposes 138 of this security analysis, an RNIC may implement RDMAP [RDMAP] 139 and DDP [DDP], or just DDP. Also, a ULP may be an application or 140 it may be a middleware library. 142 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL 143 NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and 144 "OPTIONAL" in this document are to be interpreted as described in 145 RFC 2119. Additionally the security terminology defined in 146 [RFC2828] is used in this specification. 148 The document first develops an architectural model that is 149 relevant for the security analysis - it details components, 150 resources, and system properties that may be attacked in Section 151 2. The document uses Local Peer to represent the RDMA/DDP 152 protocol implementation on the local end of a Stream (implemented 153 with a transport protocol such as [RFC793] or [RFC2960]). The 154 local Upper-Layer-Protocol (ULP) is used to represent the 155 application or middle-ware layer above the Local Peer. The 156 document does not attempt to differentiate between a Remote Peer 157 and a Remote ULP (an RDMA/DDP protocol implementation on the 158 remote end of a Stream versus the application on the remote end) 159 for several reasons: often the source of the attack is difficult 160 to know for sure; and regardless of the source, the mitigations 161 required of the Local Peer or local ULP are the same. 
Thus the 162 document generically refers to a Remote Peer rather than trying 163 to further delineate the attacker. 165 The document then defines what resources a local ULP may share 166 across Streams and what resources the local ULP may share with 167 the Remote Peer across Streams in Section 3. 169 Intentional sharing of resources between multiple Streams may 170 imply some level of trust between the Streams. However, some 171 types of resource sharing have unmitigated security attacks which 172 would mandate not sharing a specific type of resource unless 173 there is some level of trust between the Streams sharing 174 resources. 176 This document defines a new term, "Partial Mutual Trust" to 177 address this concept: 179 Partial Mutual Trust - a collection of RDMAP/DDP Streams, 180 which represent the local and remote end points of the 181 Stream, which are willing to assume that the Streams from 182 the collection will not perform malicious attacks against 183 any of the other Streams in the collection. 185 ULPs have explicit control of which collection of endpoints is in 186 a Partial Mutual Trust collection through tools discussed in 187 Section 13 Appendix C: Partial Trust Taxonomy. 189 An untrusted peer relationship is appropriate when a ULP wishes 190 to ensure that it will be robust and uncompromised even in the 191 face of a deliberate attack by its peer. For example, a single 192 ULP that concurrently supports multiple unrelated Streams (e.g. a 193 server) would presumably treat each of its peers as an untrusted 194 peer. For a collection of Streams which share Partial Mutual 195 Trust, the assumption is that any Stream not in the collection is 196 untrusted. For the untrusted peer, a brief list of capabilities 197 is enumerated in Section 4. 199 The rest of the document is focused on analyzing attacks and 200 recommending specific mitigations to the attacks. 
Attacks are 201 categorized into attacks mitigated by end-to-end security, 202 attacks initiated by Remote Peers, and attacks initiated by Local 203 Peers. For each attack, possible countermeasures are reviewed. 205 ULPs within a host are divided into two categories - Privileged 206 and Non-Privileged. Both ULP types can send and receive data and 207 request resources. The key differences between the two are: 209 The Privileged ULP is trusted by the local system to not 210 maliciously attack the operating environment, but it is not 211 trusted to optimize resource allocation globally. For 212 example, the Privileged ULP could be a kernel ULP, thus the 213 kernel presumably has in some way vetted the ULP before 214 allowing it to execute. 216 A Non-Privileged ULP's capabilities are a logical sub-set of 217 the Privileged ULP's. It is assumed by the local system that 218 a Non-Privileged ULP is untrusted. All Non-Privileged ULP 219 interactions with the RNIC Engine that could affect other 220 ULPs need to be done through a trusted intermediary that can 221 verify the Non-Privileged ULP requests. 223 The appendices provide focused summaries of this specification. 224 Section 11 Appendix A: ULP Issues for RDDP Client/Server 225 Protocols focuses on implementers of traditional client/server 226 protocols. Section 12 Appendix B: Summary of RNIC and ULP 227 Implementation Requirements summarizes all normative requirements 228 in this specification. Section 13 Appendix C: Partial Trust 229 Taxonomy provides an abstract model for categorizing trust 230 boundaries. 232 If an RDMAP/DDP protocol implementation uses the mitigations 233 recommended in this document, that implementation should not 234 exhibit additional security vulnerabilities above and beyond 235 those of an implementation of the transport protocol (i.e., TCP 236 or SCTP) and protocols beneath it (e.g., IP) without RDMAP/DDP. 
238 2 Architectural Model 240 This section describes an RDMA architectural reference model that 241 is used as security issues are examined. It introduces the 242 components of the model, the resources that can be attacked, the 243 types of interactions possible between components and resources, 244 and the system properties which must be preserved. 246 Figure 1 shows the components comprising the architecture and the 247 interfaces where potential security attacks could be launched. 248 External attacks can be injected into the system from a ULP that 249 sits above the RNIC Interface or from the network. 251 The intent here is to describe high level components and 252 capabilities which affect threat analysis, and not focus on 253 specific implementation options. Also note that the architectural 254 model is an abstraction, and an actual implementation may choose 255 to subdivide its components along different boundary lines than 256 defined here. For example, the Privileged Resource Manager may be 257 partially or completely encapsulated in the Privileged ULP. 258 Regardless, it is expected that the security analysis of the 259 potential threats and countermeasures still apply. 261 Note that the model below is derived from several specific RDMA 262 implementations. A few of note are [VERBS-RDMAC], [VERBS-RDMAC- 263 Overview], and [INFINIBAND]. 
265             +-------------+
266             | Privileged  |
267             | Resource    |
268    Admin<-+>| Manager     |      ULP Control Interface
269           | |             |<------+-------------------+
270           | +-------------+       |                   |
271           |        ^              v                   v
272           |        |        +-------------+  +-----------------+
273           +---------------->| Privileged  |  | Non-Privileged  |
274                    |        | ULP         |  | ULP             |
275                    |        +-------------+  +-----------------+
276                    |              ^                   ^
277                    |Privileged    |Privileged         |Non-Privileged
278                    |Control       |Data               |Data
279                    |Interface     |Interface          |Interface
280 RNIC               |              |                   |
281 Interface          v              v                   v
282 =================================================================

284               +--------------------------------------+
285               |                                      |
286               |             RNIC Engine              |
287               |                                      |
288               +--------------------------------------+
289                                   ^
290                                   |
291                                   v
292                               Internet

294                  Figure 1 - RDMA Security Model

296 2.1 Components 298 The components shown in Figure 1 - RDMA Security Model are: 300 * RDMA Network Interface Controller Engine (RNIC) - the 301 component that implements the RDMA protocol and/or DDP 302 protocol. 304 * Privileged Resource Manager - the component responsible 305 for managing and allocating resources associated with the 306 RNIC Engine. The Resource Manager does not send or 307 receive data. Note that whether the Resource Manager is 308 an independent component, part of the RNIC, or part of 309 the ULP is implementation dependent. 311 * Privileged ULP - See Section 1 Introduction for a 312 definition of Privileged ULP. The local host 313 infrastructure can enable the Privileged ULP to map a 314 data buffer directly from the RNIC Engine to the host 315 through the RNIC Interface, but it does not allow the 316 Privileged ULP to directly consume RNIC Engine resources. 318 * Non-Privileged ULP - See Section 1 Introduction for a 319 definition of Non-Privileged ULP. 
321 A design goal of the DDP and RDMAP protocols is to allow, under 322 constrained conditions, a Non-Privileged ULP to send and receive 323 data directly to/from the RNIC Engine without Privileged Resource 324 Manager intervention - while ensuring that the host remains 325 secure. Thus, one of the primary goals of this document is to 326 analyze this usage model for the enforcement that is required in 327 the RNIC Engine to ensure the system remains secure. 329 DDP provides two mechanisms for transferring data: 331 * Untagged Data Transfer - the incoming payload simply 332 consumes the first buffer in a queue of buffers that are 333 in the order specified by the receiving Peer (commonly 334 referred to as the Receive Queue), and 336 * Tagged Data Transfer - the Peer transmitting the payload 337 explicitly states which destination buffer is targeted, 338 through use of an STag. STag-based transfers allow the 339 receiving ULP to be indifferent to what order (or in what 340 messages) the opposite Peer sent the data, or what order 341 packets are received in. 343 Both data transfer mechanisms are also enabled through RDMAP, 344 with additional control semantics. Typically Tagged Data Transfer 345 can be used for payload transfer, while Untagged Data Transfer is 346 best used for control messages. However, each Upper Layer 347 Protocol can determine the optimal use of tagged and untagged 348 messages for itself. See [APPLICABILITY] for more information on 349 the applicability of the two transfer mechanisms. 351 For DDP the two forms correspond to Untagged and Tagged DDP 352 Messages, respectively. For RDMAP the two forms correspond to 353 Send Type Messages and RDMA Messages (either RDMA Read or RDMA 354 Write Messages), respectively. 
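
The difference between the two transfer mechanisms can be illustrated with a minimal sketch. It is not part of DDP or of any RNIC interface: the class ReceiverModel and its methods are hypothetical names chosen for illustration, and error handling is simplified. The sketch assumes only what the text above states: Untagged data consumes the first posted buffer, while Tagged data names its destination through an STag and an offset, so arrival order does not matter.

```python
class ReceiverModel:
    """Toy model of a DDP receiver; all names are illustrative assumptions."""

    def __init__(self):
        self.receive_queue = []   # Untagged buffers, in the order the receiver posted them
        self.tagged = {}          # STag -> buffer exposed for Tagged placement

    def post_receive(self, size):
        buf = bytearray(size)
        self.receive_queue.append(buf)
        return buf

    def advertise(self, stag, size):
        buf = bytearray(size)
        self.tagged[stag] = buf
        return buf

    def place_untagged(self, payload):
        # Untagged: the sender does not choose a buffer; the first posted one is consumed.
        if not self.receive_queue:
            raise RuntimeError("no receive buffer posted")
        buf = self.receive_queue[0]
        if len(payload) > len(buf):
            raise ValueError("payload larger than posted buffer")
        self.receive_queue.pop(0)
        buf[:len(payload)] = payload
        return buf

    def place_tagged(self, stag, offset, payload):
        # Tagged: the sender names the buffer (STag) and the position within it.
        buf = self.tagged.get(stag)
        if buf is None:
            raise KeyError("unknown STag")
        end = offset + len(payload)
        if offset < 0 or end > len(buf):
            raise ValueError("placement outside buffer bounds")
        buf[offset:end] = payload


rx = ReceiverModel()
first = rx.post_receive(8)
second = rx.post_receive(8)
rx.place_untagged(b"ctrl-1")          # lands in `first`, whatever the sender intended

target = rx.advertise(0x1234, 8)
rx.place_tagged(0x1234, 4, b"BBBB")   # the later segment happens to arrive first
rx.place_tagged(0x1234, 0, b"AAAA")
```

In this model the receiver alone decides the order in which Untagged buffers are consumed, whereas for Tagged placement the bounds check is the only thing standing between the sender and the rest of memory.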
356 The host interfaces that could be exercised include: 358 * Privileged Control Interface - A Privileged Resource 359 Manager uses the RNIC Interface to allocate and manage 360 RNIC Engine resources, control the state within the RNIC 361 Engine, and monitor various events from the RNIC Engine. 362 It also uses this interface to act as a proxy for some 363 operations that a Non-Privileged ULP may require (after 364 applying appropriate countermeasures). 366 * ULP Control Interface - A ULP uses this interface to the 367 Privileged Resource Manager to allocate RNIC Engine 368 resources. The Privileged Resource Manager implements 369 countermeasures to ensure that an attack launched by a 370 Non-Privileged ULP cannot affect 371 other ULPs. 373 * Non-Privileged Data Transfer Interface - A Non-Privileged 374 ULP uses this interface to initiate and to check the 375 status of data transfer operations. 377 * Privileged Data Transfer Interface - A superset of the 378 functionality provided by the Non-Privileged Data 379 Transfer Interface. The Privileged ULP is allowed to directly 380 manipulate RNIC Engine mapping resources to map an STag 381 to a ULP data buffer. 383 If Internet control messages, such as ICMP, ARP, RIPv4, etc. are 384 processed by the RNIC Engine, the threat analyses for those 385 protocols are also applicable, but are outside the scope of this 386 document. 388 2.2 Resources 390 This section describes the primary resources in the RNIC Engine 391 that could be affected by an attack. For RDMAP, all of the 392 defined resources apply. For DDP, all of the resources except the 393 RDMA Read Request Queue apply. 395 2.2.1 Stream Context Memory 397 The state information for each Stream is maintained in memory, 398 which could be located in a number of places - on the NIC, inside 399 RAM attached to the NIC, in host memory, or in any combination of 400 the three, depending on the implementation. 
402 Stream Context Memory includes state associated with Data 403 Buffers. For Tagged Buffers, this includes how STag names, Data 404 Buffers, and Page Translation Tables (see Section 2.2.3) 405 interrelate. It also includes the list of Untagged Data Buffers 406 posted for reception of Untagged Messages (commonly called the 407 Receive Queue), and a list of operations to perform to send data 408 (commonly called the Send Queue). 410 2.2.2 Data Buffers 412 As mentioned previously, there are two different ways to expose a 413 local ULP's data buffers for data transfer: Untagged Data 414 Transfer - a buffer can be exposed for receiving RDMAP Send Type 415 Messages (a.k.a. DDP Untagged Messages) on DDP Queue zero - or 416 Tagged Data Transfer - the buffer can be exposed for remote 417 access through STags (a.k.a. DDP Tagged Messages). This 418 distinction is important because the attacks, and the 419 countermeasures used to protect against them, differ 420 depending on the method for exposing the buffer to the network. 422 For the purposes of the security discussion, for Tagged Data 423 Transfer a single logical Data Buffer is exposed with a single 424 STag on a given Stream. Actual implementations may support 425 scatter/gather capabilities to enable multiple physical data 426 buffers to be accessed with a single STag, but from a threat 427 analysis perspective it is assumed that a single STag enables 428 access to a single logical Data Buffer. 430 In any event, it is the responsibility of the Privileged Resource 431 Manager to ensure that no STag can be created that exposes memory 432 that the consumer had no authority to expose. 434 A data buffer has specific access rights. The local ULP can 435 control whether a data buffer is exposed for local-only access or for local 436 and remote access, and can assign specific access privileges (read, 437 write, or read and write) on a per-Stream basis. 
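
Per-Stream access rights on a single data buffer can be sketched as follows. This is an illustrative model only: the names Access and ExposedBuffer, and the use of integer Stream identifiers, are assumptions made for the example and do not come from DDP, RDMAP, or any verbs interface. A Stream with no entry in the rights table is treated as having local-only access, that is, no remote access at all.

```python
from enum import Flag


class Access(Flag):
    NONE = 0            # local-only: the Remote Peer may neither read nor write
    REMOTE_READ = 1
    REMOTE_WRITE = 2


class ExposedBuffer:
    """Toy data buffer whose remote access rights are tracked per Stream."""

    def __init__(self, size):
        self.data = bytearray(size)
        self.rights = {}   # Stream id -> Access granted on that Stream

    def grant(self, stream_id, access):
        self.rights[stream_id] = access

    def _check(self, stream_id, needed, offset, length):
        granted = self.rights.get(stream_id, Access.NONE)
        if needed not in granted:
            raise PermissionError(f"Stream {stream_id} lacks {needed.name}")
        if offset < 0 or length < 0 or offset + length > len(self.data):
            raise ValueError("access outside buffer bounds")

    def remote_read(self, stream_id, offset, length):
        self._check(stream_id, Access.REMOTE_READ, offset, length)
        return bytes(self.data[offset:offset + length])

    def remote_write(self, stream_id, offset, payload):
        self._check(stream_id, Access.REMOTE_WRITE, offset, len(payload))
        self.data[offset:offset + len(payload)] = payload


buf = ExposedBuffer(16)
buf.grant(1, Access.REMOTE_READ)                        # read-only Stream
buf.grant(2, Access.REMOTE_READ | Access.REMOTE_WRITE)  # read and write Stream
buf.remote_write(2, 0, b"hello")
```

The point of the sketch is that the rights check is keyed by the Stream on which the request arrived, not by the buffer alone, so the same buffer can be read-only on one Stream and inaccessible on another.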
439 For DDP, when an STag is advertised, the Remote Peer is 440 presumably given write access rights to the data (otherwise there 441 was not much point to the advertisement). For RDMAP, when a ULP 442 advertises an STag, it can enable write-only, read-only, or both 443 write and read access rights. 445 Similarly, some ULPs may wish to provide a single buffer with 446 different access rights on a per-Stream basis. For example, some 447 Streams may have read-only access, some may have remote read and 448 write access, while on other Streams only the local ULP/Local 449 Peer is allowed access. 451 2.2.3 Page Translation Tables 453 Page Translation Tables are the structures used by the RNIC to be 454 able to access ULP memory for data transfer operations. Even 455 though these structures are called "Page" Translation Tables, 456 they may not reference a page at all - conceptually they are used 457 to map a ULP address space representation (e.g. a virtual 458 address) of a buffer to the physical addresses that are used by 459 the RNIC Engine to move data. If on a specific system a mapping 460 is not used, then a subset of the attacks examined may be 461 appropriate. Note that the Page Translation Table may or may not 462 be a shared resource. 464 2.2.4 Protection Domain (PD) 466 A Protection Domain (PD) is a local construct to the RDMA 467 implementation, and never visible over the wire. Protection 468 Domains are assigned to three of the resources of concern - 469 Stream Context Memory, STags associated with Page Translation 470 Table entries, and data buffers. A correct implementation of a 471 Protection Domain requires that resources which belong to a given 472 Protection Domain can not be used on a resource belonging to 473 another Protection Domain, because Protection Domain membership 474 is checked by the RNIC prior to taking any action involving such 475 a resource. 
Protection Domains are therefore used to ensure that 476 an STag can only be used to access an associated data buffer on 477 one or more Streams that are associated with the same Protection 478 Domain as the specific STag. 480 If an implementation chooses not to share resources between 481 Streams, it is recommended that each Stream be associated with 482 its own, unique Protection Domain. If an implementation chooses 483 to allow resource sharing, it is recommended that each Protection 484 Domain be limited to the collection of Streams that have Partial 485 Mutual Trust with each other. 487 Note that a ULP (either Privileged or Non-Privileged) can 488 potentially have multiple Protection Domains. This could be used, 489 for example, to ensure that multiple clients of a server do not 490 have the ability to corrupt each other. The server would allocate 491 a Protection Domain per client to ensure that resources covered 492 by the Protection Domain could not be used by another (untrusted) 493 client. 495 2.2.5 STag Namespace and Scope 497 The DDP specification defines a 32-bit namespace for the STag. 498 Implementations may vary in terms of the actual number of STags 499 that are supported. In any case, this is a bounded resource that 500 can come under attack. Depending upon STag namespace allocation 501 algorithms, the actual namespace to attack may be significantly 502 smaller than 2^32. 504 The scope of an STag is the set of DDP/RDMAP Streams on which the 505 STag is valid. If an STag is valid on a particular DDP/RDMAP 506 Stream, then that Stream can modify the buffer, subject to the 507 access rights that the Stream has for the STag (see Section 2.2.2 508 Data Buffers for additional information). 510 The analysis presented in this document assumes two mechanisms 511 for limiting the set of Streams on which an STag is valid: 513 * Protection Domain scope. 
The STag is valid if used on 514 any Stream within a specific Protection Domain, and 515 is invalid if used on any Stream that is not a member 516 of the Protection Domain. 518 * Single Stream scope. The STag is valid on a single 519 Stream, regardless of what the Stream association is 520 to a Protection Domain. If used on any other Stream, 521 it is invalid. 523 2.2.6 Completion Queues 525 Completion Queues (CQ) are used in this document to conceptually 526 represent how the RNIC Engine notifies the ULP about the 527 completion of the transmission of data, or the completion of the 528 reception of data through the Data Transfer Interface 529 (specifically for Untagged Data Transfer - Tagged Data Transfer 530 can not cause a completion to occur). Because there could be many 531 transmissions or receptions in flight at any one time, 532 completions are modeled as a queue rather than a single event. An 533 implementation may also use the Completion Queue to notify the 534 ULP of other activities, for example, the completion of a mapping 535 of an STag to a specific ULP buffer. Completion Queues may be 536 shared by a group of Streams, or may be designated to handle a 537 specific Stream's traffic. Limiting Completion Queue association 538 to one, or a small number of RDMAP/DDP Streams can prevent 539 several forms of attacks by sharply limiting the scope of the 540 attack's effect. 542 Some implementations may allow this queue to be manipulated 543 directly by both Non-Privileged and Privileged ULPs. 545 2.2.7 Asynchronous Event Queue 547 The Asynchronous Event Queue is a queue from the RNIC to the 548 Privileged Resource Manager of bounded size. It is used by the 549 RNIC to notify the host of various events which might require 550 management action, including protocol violations, Stream state 551 changes, local operation errors, low water marks on receive 552 queues, and possibly other events. 
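
What "bounded size" means for the Asynchronous Event Queue can be shown with a small sketch. The class name AsyncEventQueue, the event tuples, and in particular the overflow policy (count the event as dropped and reject it) are assumptions made for illustration; this document does not specify how an implementation behaves when the queue is full.

```python
from collections import deque


class AsyncEventQueue:
    """Toy bounded event queue from the RNIC to the Privileged Resource Manager."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._events = deque()
        self.dropped = 0   # events lost because the queue was full (assumed policy)

    def post(self, stream_id, kind):
        # Called on behalf of the RNIC when an event occurs on a Stream.
        if len(self._events) >= self.capacity:
            self.dropped += 1
            return False
        self._events.append((stream_id, kind))
        return True

    def poll(self):
        # Called by the Privileged Resource Manager; returns None when empty.
        return self._events.popleft() if self._events else None


aeq = AsyncEventQueue(capacity=2)
aeq.post(1, "protocol violation")
aeq.post(1, "protocol violation")
lost = not aeq.post(2, "stream state change")   # Stream 2's event finds the queue full
```

Here two events caused on Stream 1 are enough to make an unrelated event for Stream 2 go unreported until the Privileged Resource Manager drains the queue.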
554 The Asynchronous Event Queue is a resource that can be attacked 555 because Remote or Local Peers and/or ULPs can cause events to 556 occur which have the potential of overflowing the queue. 558 Note that an implementation is at liberty to implement the 559 functions of the Asynchronous Event Queue in a variety of ways, 560 including multiple queues or even simple callbacks. All 561 vulnerabilities identified are intended to apply regardless of 562 the implementation of the Asynchronous Event Queue. For example, 563 a callback function may be viewed as simply a very short queue. 565 2.2.8 RDMA Read Request Queue 567 The RDMA Read Request Queue is the memory that holds state 568 information for one or more RDMA Read Request Messages that have 569 arrived, but for which the RDMA Read Response Messages have not 570 yet been completely sent. Because potentially more than one RDMA 571 Read Request can be outstanding at one time, the memory is 572 modeled as a queue of bounded size. Some implementations may 573 enable sharing of a single RDMA Read Request Queue across 574 multiple Streams. 576 2.3 RNIC Interactions 578 With RNIC resources and interfaces defined, it is now possible to 579 examine the interactions supported by the generic RNIC functional 580 interfaces through each of the 3 interfaces - Privileged Control 581 Interface, Privileged Data Interface, and Non-Privileged Data 582 Interface. As mentioned previously in Section 2.1 Components, 583 there are two data transfer mechanisms to be examined - Untagged 584 Data Transfer and Tagged Data Transfer. 586 2.3.1 Privileged Control Interface Semantics 588 Generically, the Privileged Control Interface controls the RNIC's 589 allocation, de-allocation, and initialization of RNIC global 590 resources. This includes allocation and de-allocation of Stream 591 Context Memory, Page Translation Tables, STag names, Completion 592 Queues, RDMA Read Request Queues, and Asynchronous Event Queues. 
594 The Privileged Control Interface is also typically used for 595 managing resources on behalf of the Non-Privileged ULP 596 (and possibly on behalf of the Privileged ULP as well). This includes 597 initialization and removal of Page Translation Table resources, 598 and managing RNIC events (possibly managing all events for the 599 Asynchronous Event Queue). 601 2.3.2 Non-Privileged Data Interface Semantics 603 The Non-Privileged Data Interface enables data transfer (transmit 604 and receive) but does not allow initialization of the Page 605 Translation Table resources. However, once the Page Translation 606 Table resources have been initialized, the interface may allow the ULP 607 to enable and disable a specific STag mapping by communicating 608 directly with the RNIC, or to create an STag mapping for a 609 buffer that has been previously initialized in the RNIC. 611 For RDMAP, ULP data can be sent by one of the previously 612 described data transfer mechanisms - Untagged Data Transfer or 613 Tagged Data Transfer. Two RDMAP data transfer mechanisms are 614 defined, one using Untagged Data Transfer (Send Type Messages), 615 and one using Tagged Data Transfer (RDMA Read Responses and RDMA 616 Writes). ULP data reception through RDMAP can be done by 617 receiving Send Type Messages into buffers that have been posted 618 on the Receive Queue or Shared Receive Queue. Thus a Receive 619 Queue or Shared Receive Queue can only be affected by Untagged 620 Data Transfer. Data reception can also be done by receiving RDMA 621 Write and RDMA Read Response Messages into buffers that have 622 previously been exposed for external write access through 623 advertisement of an STag (i.e. Tagged Data Transfer).
624 Additionally, to cause ULP data to be pulled (read) across the 625 network, RDMAP uses an RDMA Read Request Message (which only 626 contains RDMAP control information necessary to access the ULP 627 buffer to be read), to cause an RDMA Read Response Message to be 628 generated that contains the ULP data. 630 For DDP, transmitting data means sending DDP Tagged or Untagged 631 Messages. For data reception, DDP can receive Untagged Messages 632 into buffers that have been posted on the Receive Queue or Shared 633 Receive Queue. It can also receive Tagged DDP Messages into 634 buffers that have previously been exposed for external write 635 access through advertisement of an STag. 637 Completion of data transmission or reception generally entails 638 informing the ULP of the completed work by placing completion 639 information on the Completion Queue. For data reception, only an 640 Untagged Data Transfer can cause completion information to be put 641 in the Completion Queue. 643 2.3.3 Privileged Data Interface Semantics 645 The Privileged Data Interface semantics are a superset of the 646 Non-Privileged Data Transfer semantics. The interface can do 647 everything defined in the prior section, as well as 648 create/destroy buffer to STag mappings directly. This generally 649 entails initialization or clearing of Page Translation Table 650 state in the RNIC. 652 2.3.4 Initialization of RNIC Data Structures for Data Transfer 654 Initialization of the mapping between an STag and a Data Buffer 655 can be viewed in the abstract as two separate operations: 657 a. Initialization of the allocated Page Translation Table 658 entries with the location of the Data Buffer, and 660 b. Initialization of a mapping from an allocated STag name 661 to a set of Page Translation Table entry(s) or partial- 662 entries. 664 Note that an implementation may not have a Page Translation Table 665 (i.e. it may support a direct mapping between an STag and a Data 666 Buffer). 
If there is no Page Translation Table, then attacks 667 based on changing its contents or exhausting its resources are 668 not possible. 670 Initialization of the contents of the Page Translation Table can 671 be done by either the Privileged ULP or by the Privileged 672 Resource Manager as a proxy for the Non-Privileged ULP. By 673 definition the Non-Privileged ULP is not trusted to directly 674 manipulate the Page Translation Table. In general the concern is 675 that the Non-Privileged ULP may try to maliciously initialize the 676 Page Translation Table to access a buffer for which it does not 677 have permission. 679 The exact resource allocation algorithm for the Page Translation 680 Table is outside the scope of this document. It may be allocated 681 for a specific Data Buffer, or be allocated as a pooled resource 682 to be consumed by potentially multiple Data Buffers, or be 683 managed in some other way. This document attempts to abstract 684 implementation dependent issues, and group them into higher level 685 security issues such as resource starvation and sharing of 686 resources between Streams. 688 The next issue is how an STag name is associated with a Data 689 Buffer. For the case of an Untagged Data Buffer (i.e. Untagged 690 Data Transfer), there is no wire visible mapping between an STag 691 and the Data Buffer. Note that there may, in fact, be an STag 692 which represents the buffer, if an implementation chooses to 693 internally represent Untagged Data Buffer using STags. However, 694 because the STag by definition is not visible on the wire, this 695 is a local host implementation specific issue which should be 696 analyzed in the context of a local host implementation specific 697 security analysis, and thus is outside the scope of this 698 document. 700 For a Tagged Data Buffer (i.e. 
Tagged Data Transfer), either the 701 Privileged ULP or the Privileged Resource Manager acting on 702 behalf of the Non-Privileged ULP may initialize a mapping from an 703 STag to a Page Translation Table, or may have the ability to 704 simply enable/disable an existing STag to Page Translation Table 705 mapping. There may also be multiple STag names which map to a 706 specific group of Page Translation Table entries (or sub- 707 entries). Specific security issues with this level of flexibility 708 are examined in Section 6.2.3 Multiple STags to access the same 709 buffer. 711 There are a variety of implementation options for initialization 712 of Page Translation Table entries and mapping an STag to a group 713 of Page Translation Table entries which have security 714 repercussions. These include support for separation of mapping an 715 STag versus mapping a set of Page Translation Table entries, and 716 support for ULPs directly manipulating STag to Page Translation 717 Table entry mappings (versus requiring access through the 718 Privileged Resource Manager). 720 2.3.5 RNIC Data Transfer Interactions 722 RNIC Data Transfer operations can be subdivided into send 723 operations and receive operations. 725 For send operations, there is typically a queue that enables the 726 ULP to post multiple operation requests to send data (referred to 727 as the Send Queue). Depending upon the implementation, Data 728 Buffers used in the operations may or may not have Page 729 Translation Table entries associated with them, and may or may 730 not have STags associated with them. Because this is a local host 731 specific implementation issue rather than a protocol issue, the 732 security analysis of threats and mitigations is left to the host 733 implementation. 735 Receive operations differ between Tagged Data Buffers and 736 Untagged Data Buffers (i.e. Tagged Data Transfer vs. Untagged 737 Data Transfer).
For Untagged Data Transfer, if more than one 738 Untagged Data Buffer can be posted by the ULP, the DDP 739 specification requires that they be consumed in sequential order 740 (the RDMAP specification also requires this). Thus the most 741 general implementation is that there is a sequential queue of 742 receive Untagged Data Buffers (Receive Queue). Some 743 implementations may also support sharing of the sequential queue 744 between multiple Streams. In this case defining "sequential" 745 becomes non-trivial - in general the buffers for a single Stream 746 are consumed from the queue in the order that they were placed on 747 the queue, but there is no consumption order guarantee between 748 Streams. 750 For receive Tagged Data Transfer (i.e. Tagged Data Buffers, RDMA 751 Write Buffers, or RDMA Read Buffers), at some time prior to data 752 transfer, the mapping of the STag to specific Page Translation 753 Table entries (if present) and the mapping from the Page 754 Translation Table entries to the Data Buffer must have been 755 initialized (see Section 2.3.4 for interaction details). 757 3 Trust and Resource Sharing 759 It is assumed that in general the Local and Remote Peer are 760 untrusted, and thus attacks by either should have mitigations in 761 place. 763 A separate, but related issue is resource sharing between 764 multiple Streams. If local resources are not shared, the 765 resources are dedicated on a per Stream basis. Resources are 766 defined in Section 2.2 Resources. The advantage of not sharing 767 resources between Streams is that it reduces the types of attacks 768 that are possible. The disadvantage of not sharing resources is 769 that ULPs might run out of resources. Thus there can be a strong 770 incentive for sharing resources, if the security issues 771 associated with the sharing of resources can be mitigated. 
773 It is assumed in this document that the component that implements 774 the mechanism to control sharing of the RNIC Engine resources is 775 the Privileged Resource Manager. The RNIC Engine exposes its 776 resources through the RNIC Interface to the Privileged Resource 777 Manager. All Privileged and Non-Privileged ULPs request resources 778 from the Resource Manager (note that by definition both the Non- 779 Privileged and the Privileged application might try to greedily 780 consume resources, thus creating a potential Denial of Service 781 (DoS) attack). The Resource Manager implements resource 782 management policies to ensure fair access to resources. The 783 Resource Manager should be designed to take into account the security 784 attacks detailed in this document. Note that for some systems the 785 Privileged Resource Manager may be implemented within the 786 Privileged ULP. 788 All Non-Privileged ULP interactions with the RNIC Engine that 789 could affect other ULPs MUST be done using the Privileged 790 Resource Manager as a proxy. All ULP resource allocation requests 791 for scarce resources MUST also be done using a Privileged 792 Resource Manager. 794 The sharing of resources across Streams should be under the 795 control of the ULP, both in terms of the trust model the ULP 796 wishes to operate under and the level of resource sharing 797 the ULP wishes to give local processes. For more discussion on 798 types of trust models which combine partial trust and sharing of 799 resources, see Appendix C: Partial Trust Taxonomy. 801 The Privileged Resource Manager MUST NOT assume different Streams 802 share Partial Mutual Trust unless there is a mechanism to ensure 803 that the Streams do indeed share Partial Mutual Trust. This can 804 be done in several ways, including explicit notification from the 805 ULP that owns the Streams. 807 4 Attacker Capabilities 809 An attacker's capabilities delimit the types of attacks that the 810 attacker is able to launch.
RDMAP and DDP require that the 811 initial LLP Stream (and connection) be set up prior to 812 transferring RDMAP/DDP Messages. This requires at least one 813 round-trip handshake to occur. 815 If the attacker is not the Remote Peer that created the initial 816 connection, then the attacker's capabilities can be segmented 817 into send-only capabilities or send and receive capabilities. 818 Attacking with send-only capabilities requires the attacker to 819 first guess the current LLP Stream parameters (e.g. the TCP sequence number) before it can 820 attack RNIC resources. If this class 821 of attacker also has receive capabilities and the ability to pose 822 as the receiver to the sender and the sender to the receiver, 823 it is typically referred to as a "man-in-the-middle" attacker 824 [RFC3552]. A man-in-the-middle attacker has a much wider ability 825 to attack RNIC resources. The breadth of attack is essentially 826 the same as that of an attacking Remote Peer (i.e. the Remote 827 Peer that set up the initial LLP Stream). 829 5 Attacks That Can be Mitigated With End-to-End Security 831 This section describes the RDMAP/DDP attacks where the only 832 solution is to implement some form of end-to-end security. The 833 analysis includes a detailed description of each attack, what is 834 being attacked, and a description of the countermeasures that can 835 be taken to thwart the attack. 837 Some forms of attack involve modifying the RDMAP or DDP payload 838 by a network based attacker or involve monitoring the traffic to 839 discover private information. An effective tool to ensure 840 confidentiality is to encrypt the data stream through mechanisms 841 such as IPsec encryption. Additionally, authentication protocols 842 such as IPsec authentication are an effective tool to ensure that the 843 remote entity is who it claims to be and that 844 the payload is unmodified as it traverses the network.
846 Note that connection setup and teardown are presumed to be done in 847 stream mode (i.e. no RDMA encapsulation of the payload), so there 848 are no new attacks related to connection setup/teardown beyond 849 what is already present in the LLP (e.g. TCP or SCTP). Note, 850 however, that RDMAP/DDP parameters may be exchanged in stream 851 mode, and if they are corrupted by an attacker, unintended 852 consequences will result. Therefore, any existing mitigations for 853 LLP Spoofing, Tampering, Repudiation, Information Disclosure, 854 Denial of Service, or Elevation of Privilege continue to apply 855 (and are out of scope of this document). Thus the analysis in 856 this section focuses on attacks that are present regardless of 857 the LLP Stream type. 859 Tampering is any modification of the legitimate traffic (machine 860 internal or network). A spoofing attack is a special case of 861 tampering where the attacker falsifies an identity of the Remote 862 Peer (identity can be an IP address, machine name, ULP level 863 identity etc.). 865 5.1 Spoofing 867 Spoofing attacks can be launched by the Remote Peer, or by a 868 network based attacker. A network based spoofing attack applies 869 to all Remote Peers. This section analyzes the various types of 870 spoofing attacks applicable to RDMAP & DDP. 872 5.1.1 Impersonation 874 A network based attacker can impersonate a legal RDMAP/DDP Peer 875 (by spoofing a legal IP address). This can either be done as a 876 blind attack (see [RFC3552]) or by establishing an RDMAP/DDP 877 Stream with the victim. Because an RDMAP/DDP Stream requires an 878 LLP Stream to be fully initialized (e.g. for [RFC793] it is in 879 the ESTABLISHED state), existing transport layer protection 880 mechanisms against blind attacks remain in place. 882 For a blind attack to succeed, the attacker must inject 883 a valid transport layer segment (e.g.
for TCP it must match at 884 least the 4-tuple and carry a sequence number within the 885 window) while also guessing valid RDMAP or DDP parameters. There 886 are many ways to attack the RDMAP/DDP protocol if the transport 887 protocol is assumed to be vulnerable. For example, for Tagged 888 Messages, this entails guessing the STag and TO values. If the 889 attacker wishes to simply terminate the connection, it can do so 890 by correctly guessing the transport & network layer values, and 891 providing an invalid STag. Per the DDP specification, if an 892 invalid STag is received, the Stream is torn down and the Remote 893 Peer is notified with an error. If an attacker wishes to 894 overwrite an Advertised Buffer, it must successfully guess the 895 correct STag and TO. Given that the TO will often start at zero, 896 guessing the TO is straightforward. The value of the STag should therefore be chosen 897 at random, as discussed in Section 6.1.1 Using an STag on a 898 Different Stream. For Untagged Messages, if the MSN is invalid 899 then the connection may be torn down. If it is valid, then the 900 receive buffers can be corrupted. 902 End-to-end authentication (e.g. IPsec or ULP authentication) 903 provides protection against either the blind attack or the 904 connected attack. 906 5.1.2 Stream Hijacking 908 Stream hijacking happens when a network based attacker eavesdrops 909 on the LLP connection through the Stream establishment phase, and 910 waits until the authentication phase (if such a phase exists) is 911 completed successfully. The attacker then spoofs the IP address 912 and re-directs the Stream from the victim to its own machine. For 913 example, an attacker can wait until an iSCSI authentication is 914 completed successfully, and then hijack the iSCSI Stream. 916 The best protection against this form of attack is end-to-end 917 integrity protection and authentication, such as IPsec, to 918 prevent spoofing.
Another option is to provide a physically 919 segregated network for security. Discussion of physical security 920 is out of scope for this document. 922 Because the connection and/or Stream itself is established by the 923 LLP, some LLPs are more difficult to hijack than others. Please 924 see the relevant LLP documentation on security issues around 925 connection and/or Stream hijacking. 927 5.1.3 Man-in-the-Middle Attack 929 If a network based attacker has the ability to delete or modify 930 packets which will still be accepted by the LLP (e.g. the TCP 931 sequence number is correct) then the Stream can be exposed to a 932 man-in-the-middle attack. One style of attack is for the man-in- 933 the-middle to send Tagged Messages (either RDMAP or DDP). If it 934 can discover a buffer that has been exposed for STag enabled 935 access, then the man-in-the-middle can use an RDMA Read Operation 936 to read the contents of the associated data buffer, perform an 937 RDMA Write Operation to modify the contents of the associated 938 data buffer, or invalidate the STag to disable further access to 939 the buffer. 941 The best protection against this form of attack is end-to-end 942 integrity protection and authentication, such as IPsec, to 943 prevent spoofing or tampering. If authentication and integrity 944 protections are not used, then physical protection must be 945 employed to prevent man-in-the-middle attacks. 947 Because the connection/Stream itself is established by the LLP, 948 some LLPs are more exposed to man-in-the-middle attacks than 949 others. Please see the relevant LLP documentation on security 950 issues around connection and/or Stream hijacking. 952 Another approach is to restrict access to only the local 953 subnet/link, and provide some mechanism to limit access, such as 954 physical security or 802.1X. This model is an extremely limited 955 deployment scenario, and will not be further examined here.
957 5.2 Tampering - Network Based Modification of Buffer Content 959 This is actually a man-in-the-middle attack - but only on the 960 content of the buffer, as opposed to the man-in-the-middle attack 961 presented above, where both the signaling and content can be 962 modified. See Section 5.1.3 Man-in-the-Middle Attack. 964 5.3 Information Disclosure - Network Based Eavesdropping 966 An attacker that is able to eavesdrop on the network can read the 967 content of all read and write accesses to a Peer's buffers. To 968 prevent information disclosure, the read/written data must be 969 encrypted. See also Section 5.1.3 Man-in-the-Middle Attack. The 970 encryption can be done either by the ULP, or by a protocol that 971 can provide security services to RDMAP & DDP (e.g. IPsec). 973 5.4 Specific Requirements for Security Services 975 Generally speaking, Stream confidentiality protects against 976 eavesdropping. Stream and/or session authentication and integrity 977 protection are countermeasures against various spoofing and 978 tampering attacks. The effectiveness of authentication and 979 integrity against a specific attack depends on whether the 980 authentication is machine level authentication (such as IPsec), 981 or ULP authentication. 983 5.4.1 Introduction to Security Options 985 The following security services can be applied to an RDMAP/DDP 986 Stream: 988 1. Session confidentiality - protects against eavesdropping 989 (Section 5.3). 991 2. Per-packet data source authentication - protects against the 992 following spoofing attacks: network based impersonation 993 (Section 5.1.1), Stream hijacking (Section 5.1.2), and man-in- 994 the-middle (Section 5.1.3). 996 3. Per-packet integrity - protects against tampering done by 997 network based modification of buffer content (Section 5.2). 999 4. Packet sequencing - protects against replay attacks, which are 1000 a special case of the above tampering attack.
1002 If an RDMAP/DDP Stream may be subject to impersonation attacks, 1003 or Stream hijacking attacks, it is recommended that the Stream be 1004 authenticated, integrity protected, and protected from replay 1005 attacks; it may use confidentiality protection to protect from 1006 eavesdropping (in case the RDMAP/DDP Stream traverses a public 1007 network). 1009 IPsec is a protocol suite which is used to secure communication 1010 at the network layer between two peers. The IPsec protocol suite 1011 is specified within the IP Security Architecture [RFC2401], IKE 1012 [RFC2409], IPsec Authentication Header (AH) [RFC2402] and IPsec 1013 Encapsulating Security Payload (ESP) [RFC2406] documents. IKE is 1014 the key management protocol while AH and ESP are used to protect 1015 IP traffic. Please see those RFCs for a complete description of 1016 the respective protocols. 1018 IPsec is capable of providing the above security services for IP 1019 and TCP traffic. ULPs are able to provide 1020 only part of the above security services. 1022 5.4.2 TLS is Inappropriate for DDP/RDMAP Security 1024 TLS [RFC2246] provides Stream authentication, integrity and 1025 confidentiality for TCP based ULPs. TLS supports one-way (server 1026 only) or mutual certificate-based authentication. 1028 If TLS is layered underneath RDMAP, there are at least two 1029 limitations that make TLS inappropriate for DDP/RDMAP security: 1031 1. The maximum length supported by the TLS record layer protocol 1032 is 2^14 bytes - longer packets must be fragmented (as a 1033 comparison, the maximum length of an Untagged DDP Message is 1034 roughly 2^32 bytes). 1036 2. TLS is a connection oriented protocol. If a stream cipher or 1037 block cipher in CBC mode is used for bulk encryption, then a 1038 packet can be decrypted only after all the packets preceding 1039 it have already arrived.
If TLS is used to protect DDP/RDMAP 1040 traffic, then TCP must gather all out-of-order packets before 1041 TLS can decrypt them. Only after this is done can RDMAP/DDP 1042 place them into the ULP buffer. Thus one of the primary 1043 features of DDP/RDMAP - enabling implementations to have a 1044 flow-through architecture with little to no buffering - can 1045 not be achieved if TLS is used to protect the data stream. 1047 If TLS is layered on top of RDMAP or DDP, TLS does not protect 1048 the RDMAP and/or DDP headers. Thus a man-in-the-middle attack can 1049 still occur by modifying the RDMAP/DDP header to 1050 place the data into the wrong buffer, thus effectively corrupting 1051 the data stream. 1053 For these reasons, it is NOT RECOMMENDED that TLS be layered on 1054 top of RDMAP or DDP. 1056 5.4.3 DTLS and RDDP 1058 DTLS [DTLS] provides security services for datagram protocols, 1059 including unreliable datagram protocols. These services include 1060 anti-replay based on a mechanism adapted from IPsec that is 1061 intended to operate on packets as they are received from the 1062 network. For these and other reasons, DTLS is best applied to 1063 RDDP by employing DTLS beneath TCP, yielding a layering of RDDP 1064 over TCP over DTLS over UDP/IP. Such a layering inserts DTLS at 1065 roughly the same level in the protocol stack as IPsec, making 1066 DTLS's security services an alternative to IPsec's services from 1067 an RDDP standpoint. 1069 For RDDP, IPsec is the better choice for a security framework, 1070 and hence is mandatory-to-implement (as specified elsewhere in 1071 this document). An important contributing factor to the 1072 specification of IPsec rather than DTLS is that the non-RDDP 1073 versions of two initial adopters of RDDP (iSCSI [iSCSI][iSER] and 1074 NFSv4 [NFSv4][NFSv4.1]) are compatible with IPsec but neither of 1075 these protocols currently uses either TLS or DTLS.
For the 1076 specific case of iSCSI, IPsec is the basis for mandatory-to- 1077 implement security services [RFC3723]. Therefore this document 1078 and the RDDP protocol specifications contain mandatory 1079 implementation requirements for IPsec rather than for DTLS. 1081 5.4.4 ULPs Which Provide Security 1083 ULPs which provide integrated security but wish to leverage 1084 lower-layer protocol security should be aware of security 1085 concerns around correlating a specific channel's security 1086 mechanisms to the authentication performed by the ULP. See 1088 [NFSv4CHANNEL] for additional information on a promising approach 1089 called "channel binding". From [NFSv4CHANNEL]: 1091 "The concept of channel bindings allows applications to 1092 prove that the end-points of two secure channels at 1093 different network layers are the same by binding 1094 authentication at one channel to the session protection at 1095 the other channel. The use of channel bindings allows 1096 applications to delegate session protection to lower layers, 1097 which may significantly improve performance for some 1098 applications." 1100 5.4.5 Requirements for IPsec Encapsulation of DDP 1102 The IP Storage working group has spent significant time and 1103 effort to define the normative IPsec requirements for IP Storage 1104 [RFC3723]. Portions of that specification are applicable to a 1105 wide variety of protocols, including the RDDP protocol suite. In 1106 order not to replicate this effort, an RNIC implementation MUST 1107 follow the requirements defined in RFC3723 Section 2.3 and 1108 Section 5, including the associated normative references for 1109 those sections. Note that this means that support for IPsec ESP 1110 mode is normative. 1112 Additionally, since IPsec acceleration hardware may only be able 1113 to handle a limited number of active IKE Phase 2 SAs, Phase 2 1114 delete messages may be sent for idle SAs, as a means of keeping 1115 the number of active Phase 2 SAs to a minimum.
The receipt of an 1116 IKE Phase 2 delete message MUST NOT be interpreted as a reason 1117 for tearing down a DDP/RDMA Stream. Rather, it is preferable to 1118 leave the Stream up, and if additional traffic is sent on it, to 1119 bring up another IKE Phase 2 SA to protect it. This avoids the 1120 potential for continually bringing Streams up and down. 1122 Note that there are serious security issues if IPsec is not 1123 implemented end-to-end. For example, if IPsec is implemented as a 1124 tunnel in the middle of the network, any hosts between the Peer 1125 and the IPsec tunneling device can freely attack the unprotected 1126 Stream. 1128 6 Attacks from Remote Peers 1130 This section describes remote attacks that are possible against 1131 the RDMA system defined in Figure 1 - RDMA Security Model and the 1132 RNIC Engine resources defined in Section 2.2. The analysis 1133 includes a detailed description of each attack, what is being 1134 attacked, and a description of the countermeasures that can be 1135 taken to thwart the attack. 1137 The attacks are classified into five categories: Spoofing, 1138 Tampering, Information Disclosure, Denial of Service (DoS) 1139 attacks, and Elevation of Privileges. As mentioned previously, 1140 tampering is any modification of the legitimate traffic (machine 1141 internal or network). A spoofing attack is a special case of 1142 tampering where the attacker falsifies an identity of the Remote 1143 Peer (identity can be an IP address, machine name, ULP level 1144 identity etc.). 1146 6.1 Spoofing 1148 This section analyzes the various types of spoofing attacks 1149 applicable to RDMAP & DDP. Spoofing attacks can be launched by 1150 the Remote Peer, or by a network based attacker. For 1151 countermeasures against a network based attacker, see Section 5 1152 Attacks That Can be Mitigated With End-to-End Security. 
1154 6.1.1 Using an STag on a Different Stream 1156 One style of attack from the Remote Peer is for it to attempt to 1157 use STag values that it is not authorized to use. Note that if 1158 the Remote Peer sends an invalid STag to the Local Peer, per the 1159 DDP and RDMAP specifications, the Stream must be torn down. Thus 1160 the threat exists if an STag has been enabled for Remote Access 1161 on one Stream and a Remote Peer is able to use it on an unrelated 1162 Stream. If the attack is successful, the attacker could 1163 potentially perform RDMA Read Operations to 1164 read the contents of the associated data buffer, perform RDMA 1165 Write Operations to modify the contents of the associated data 1166 buffer, or invalidate the STag to disable further access to 1167 the buffer. 1169 An attempt by a Remote Peer to access a buffer with an STag on a 1170 different Stream in the same Protection Domain may or may not be 1171 an attack depending on whether resource sharing is intended (i.e. 1172 whether the Streams share Partial Mutual Trust or not). For some 1173 ULPs, using an STag on multiple Streams within the same 1174 Protection Domain could be desired behavior. For other ULPs, 1175 attempting to use an STag on a different Stream could be 1176 considered to be an attack. Since this varies by ULP, a ULP 1177 typically would need to be able to control the scope of the STag. 1179 In the case where an implementation does not share resources 1180 between Streams (including STags), this attack can be defeated by 1181 assigning each Stream to a different Protection Domain. Before 1182 allowing remote access to the buffer, the Protection Domain of 1183 the Stream where the access attempt was made is matched against 1184 the Protection Domain of the STag. If the Protection Domains do 1185 not match, access to the buffer is denied, an error is generated, 1186 and the RDMAP Stream associated with the attacking Stream is 1187 terminated.
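The Protection Domain match described above, together with the two STag scopes from Section 2.2.5, can be sketched in Python as follows. The class and field names, and the use of an exception to stand in for "deny access, generate an error, terminate the Stream", are illustrative assumptions and not part of any RNIC interface.

```python
from dataclasses import dataclass
from typing import Optional


class StreamTerminated(Exception):
    """Stands in for: deny access, generate an error, terminate the Stream."""


@dataclass(frozen=True)
class Stream:
    stream_id: int
    pd: int  # Protection Domain the Stream is associated with


@dataclass(frozen=True)
class STag:
    value: int
    pd: int                              # Protection Domain of the STag
    single_stream: Optional[int] = None  # stream_id when scope is one Stream


def check_stag_scope(stream: Stream, stag: STag) -> None:
    """Raise StreamTerminated unless the STag is valid on this Stream."""
    if stag.single_stream is not None:
        # Single Stream scope: valid on exactly one Stream.
        if stream.stream_id != stag.single_stream:
            raise StreamTerminated("STag used on a different Stream")
        return
    # Protection Domain scope: valid on any Stream in the same domain.
    if stream.pd != stag.pd:
        raise StreamTerminated("Protection Domain mismatch")
```

A real RNIC would perform this check before any remote access to the buffer referenced by the STag; the sketch only shows the decision, not the teardown itself.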

   For implementations that share resources between multiple
   Streams, it may not be practical to separate each Stream into its
   own Protection Domain. In this case, the ULP can still limit the
   scope of any of the STags to a single Stream (if it is enabling
   it for remote access). If the STag scope has been limited to a
   single Stream, any attempt to use that STag on a different Stream
   will result in an error, and the RDMAP Stream is terminated.

   Thus for implementations that do not share STags between Streams,
   each Stream MUST either be in a separate Protection Domain or the
   scope of an STag MUST be limited to a single Stream.

   An RNIC MUST ensure that a specific Stream in a specific
   Protection Domain can not access an STag in a different
   Protection Domain.

   An RNIC MUST ensure that if an STag is limited in scope to a
   single Stream, no other Stream can use the STag.

   An additional issue may be unintended sharing of STags (i.e. a
   bug in the ULP) or a bug in the Remote Peer which causes an
   off-by-one STag to be used. For additional protection, an
   implementation should allocate STags in such a fashion that it is
   difficult to predict the next allocated STag number, and also
   ensure that STags are reused at as slow a rate as possible. Any
   allocation method which would lead to intentional or
   unintentional reuse of an STag by the peer should be avoided
   (e.g. a method which always starts with a given STag and
   monotonically increases it for each new allocation, or a method
   which always uses the same STag for each operation).

6.2 Tampering

   A Remote Peer or a network based attacker can attempt to tamper
   with the contents of data buffers on a Local Peer that have been
   enabled for remote write access. The types of tampering attacks
   from a Remote Peer are outlined in the sections that follow.
   For countermeasures against a network based attacker, see
   Section 5 Attacks That Can be Mitigated With End-to-End Security.

6.2.1 Buffer Overrun - RDMA Write or Read Response

   This attack is an attempt by the Remote Peer to perform an RDMA
   Write or RDMA Read Response to memory outside of the valid length
   range of the data buffer enabled for remote write access. This
   attack can occur even when no resources are shared across
   Streams. This issue can also arise if the ULP has a bug.

   The countermeasure for this type of attack must be in the RNIC
   implementation, leveraging the STag. When the local ULP specifies
   to the RNIC the base address and the number of bytes in the
   buffer that it wishes to make accessible, the RNIC must ensure
   that the base and bounds check are applied to any access to the
   buffer referenced by the STag before the STag is enabled for
   access. When an RDMA data transfer operation (which includes an
   STag) arrives on a Stream, a base and bounds byte granularity
   access check must be performed to ensure the operation accesses
   only memory locations within the buffer described by that STag.

   Thus an RNIC implementation MUST ensure that a Remote Peer is not
   able to access memory outside of the buffer specified when the
   STag was enabled for remote access.

6.2.2 Modifying a Buffer After Indication

   This attack can occur if a Remote Peer attempts to modify the
   contents of an STag referenced buffer by performing an RDMA Write
   or an RDMA Read Response after the Remote Peer has indicated to
   the Local Peer or local ULP (by a variety of means) that the STag
   data buffer contents are ready for use. This attack can occur
   even when no resources are shared across Streams. Note that a bug
   in a Remote Peer, or network based tampering, could also result
   in this problem.

   For example, assume the STag referenced buffer contains ULP
   control information as well as ULP payload, and the ULP sequence
   of operation is to first validate the control information and
   then perform operations on the control information. If the Remote
   Peer can perform an additional RDMA Write or RDMA Read Response
   (thus changing the buffer) after the validity checks have been
   completed but before the control data is operated on, the Remote
   Peer could force the ULP down operational paths that were never
   intended.

   The local ULP can protect itself from this type of attack by
   revoking remote access when the original data transfer has
   completed and before it validates the contents of the buffer. The
   local ULP can either do this by explicitly revoking remote access
   rights for the STag when the Remote Peer indicates the operation
   has completed, or by checking to make sure the Remote Peer
   invalidated the STag through the RDMAP Remote Invalidate
   capability (see Section 6.4.5 Remote Invalidate an STag Shared on
   Multiple Streams for a definition of Remote Invalidate), and if
   it did not, the local ULP then explicitly revokes the STag remote
   access rights.

   The local ULP SHOULD follow the above procedure to protect the
   buffer before it validates the contents of the buffer (or uses
   the buffer in any way).

   An RNIC MUST ensure that network packets using the STag for a
   previously advertised buffer can no longer modify the buffer
   after the ULP revokes remote access rights for the specific STag.

6.2.3 Multiple STags to access the same buffer

   See Section 6.3.6 Using Multiple STags Which Alias to the Same
   Buffer for this analysis.

6.3 Information Disclosure

   The main potential source for information disclosure is through a
   local buffer that has been enabled for remote access.
   If the buffer can be probed by a Remote Peer on another Stream,
   then there is potential for information disclosure.

   The potential attacks that could result in unintended information
   disclosure and countermeasures are detailed in the following
   sections.

6.3.1 Probing memory outside of the buffer bounds

   This is essentially the same attack as described in Section 6.2.1
   Buffer Overrun - RDMA Write or Read Response, except an RDMA Read
   Request is used to mount the attack. The same countermeasure
   applies.

6.3.2 Using RDMA Read to Access Stale Data

   If a buffer is being used for some combination of reads and
   writes (either remote or local), and is exposed to a Remote Peer
   with at least remote read access rights before it is initialized
   with the correct data, there is a potential race condition where
   the Remote Peer can view the prior contents of the buffer. This
   becomes a security issue if the prior contents of the buffer were
   not intended to be shared with the Remote Peer.

   To eliminate this race condition, the local ULP SHOULD ensure
   that no stale data is contained in the buffer before remote read
   access rights are granted (this can be done by zeroing the
   contents of the memory, for example). This ensures that the
   Remote Peer can not access the buffer until the stale data has
   been removed.

6.3.3 Accessing a Buffer After the Transfer

   If the Remote Peer has remote read access to a buffer, and by
   some mechanism tells the local ULP that the transfer has been
   completed, but the local ULP does not disable remote access to
   the buffer before modifying the data, it is possible for the
   Remote Peer to retrieve the new data.

   This is similar to the attack defined in Section 6.2.2 Modifying
   a Buffer After Indication. The same countermeasures apply.
   In addition, the local ULP SHOULD grant remote read access rights
   only for the amount of time needed to retrieve the data.

6.3.4 Accessing Unintended Data With a Valid STag

   If the ULP enables remote access to a buffer using an STag that
   references the entire buffer, but intends only a portion of the
   buffer to be accessed, it is possible for the Remote Peer to
   access the other parts of the buffer anyway.

   To prevent this attack, the ULP SHOULD set the base and bounds of
   the buffer when the STag is initialized to expose only the data
   to be retrieved.

6.3.5 RDMA Read into an RDMA Write Buffer

   One form of disclosure can occur if the access rights on the
   buffer enabled remote read, when only remote write access was
   intended. If the buffer contained ULP data, or data from a
   transfer on an unrelated Stream, the Remote Peer could retrieve
   the data through an RDMA Read operation. Note that an RNIC
   implementation is not required to support STags that have both
   read and write access.

   The most obvious countermeasure for this attack is to not grant
   remote read access if the buffer is intended to be write-only.
   Then the Remote Peer would not be able to retrieve data
   associated with the buffer. An attempt to do so would result in
   an error and the RDMAP Stream associated with the Stream would be
   terminated.

   Thus if a ULP only intends a buffer to be exposed for remote
   write access, it MUST set the access rights to the buffer to only
   enable remote write access. Note that this requirement is not
   meant to restrict the use of zero-length RDMA Reads. Zero-length
   RDMA Reads do not expose ULP data. Because they are intended to
   be used as a mechanism to ensure that all RDMA Writes have been
   received, and do not even require a valid STag, their use is
   permitted even if a buffer has only been enabled for write
   access.
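
   As an illustration only, the access-rights rule above, including
   the zero-length RDMA Read exception, might be sketched as
   follows. The Access flag type and the rdma_read_permitted
   function are hypothetical names chosen for this sketch; a rights
   value of None models a request that carries no valid STag.

```python
from enum import Flag
from typing import Optional


class Access(Flag):
    """Hypothetical remote access rights attached to an STag."""
    REMOTE_READ = 1
    REMOTE_WRITE = 2


def rdma_read_permitted(rights: Optional[Access], length: int) -> bool:
    """Decide whether an incoming RDMA Read Request may proceed.

    A zero-length RDMA Read exposes no ULP data and does not require
    a valid STag, so it is always permitted. Any other read requires
    that the STag exists and has remote read access enabled.
    """
    if length < 0:
        raise ValueError("length must be non-negative")
    if length == 0:
        return True
    if rights is None:
        return False
    return bool(rights & Access.REMOTE_READ)
```

   A buffer registered with only Access.REMOTE_WRITE therefore
   rejects every non-empty read, while still allowing the
   zero-length read used to confirm that RDMA Writes have been
   received.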

6.3.6 Using Multiple STags Which Alias to the Same Buffer

   Multiple STags which alias to the same buffer at the same time
   can result in unintentional information disclosure if the STags
   are used by different, mutually untrusted, Remote Peers. This
   model applies specifically to client/server communication, where
   the server is communicating with multiple clients that do not
   trust each other.

   If only read access is enabled, then the local ULP has complete
   control over information disclosure. Thus a server which intended
   to expose the same data (i.e. buffer) to multiple clients by
   using multiple STags to the same buffer creates no new security
   issues beyond what has already been described in this document.
   Note that if the server did not intend to expose the same data to
   the clients, it should use separate buffers for each client (and
   separate STags).

   When one STag has remote read access enabled and a different STag
   has remote write access enabled to the same buffer, it is
   possible for one Remote Peer to view the contents that have been
   written by another Remote Peer.

   If both STags have remote write access enabled and the two Remote
   Peers do not mutually trust each other, it is possible for one
   Remote Peer to overwrite the contents that have been written by
   the other Remote Peer.

   Thus a ULP with multiple Remote Peers which do not share Partial
   Mutual Trust MUST NOT grant write access to the same buffer
   through different STags. A buffer should be exposed to only one
   untrusted Remote Peer at a time to ensure that no information
   disclosure or information tampering occurs between peers.

6.4 Denial of Service (DOS)

   A DOS attack is one of the primary security risks of RDMAP.
   This is because RNIC resources are valuable and scarce, and many
   ULP environments require communication with untrusted Remote
   Peers. If the Remote Peer can be authenticated or the ULP payload
   encrypted, clearly, the DOS profile can be reduced. For the
   purposes of this analysis, it is assumed that the RNIC must be
   able to operate in untrusted environments, which are open to DOS
   style attacks.

   Denial of service attacks against RNIC resources are not the
   typical unknown party spraying packets at a random host (such as
   a TCP SYN attack). Because the connection/Stream must be fully
   established (e.g. a 3 message transport layer handshake has
   occurred), the attacker must be able to both send and receive
   messages over that connection/Stream, or be able to guess a valid
   packet on an existing RDMAP Stream.

   This section outlines the potential attacks and the
   countermeasures available for dealing with each attack.

6.4.1 RNIC Resource Consumption

   This section covers attacks that fall into the general category
   of a local ULP attempting to unfairly allocate scarce (i.e.
   bounded) RNIC resources. The local ULP may be attempting to
   allocate resources on its own behalf, or on behalf of a Remote
   Peer. Resources that fall into this category include: Protection
   Domains, Stream Context Memory, Translation and Protection
   Tables, and STag namespace. These can be due to attacks by
   currently active local ULPs or ones that allocated resources
   earlier, but are now idle.

   This type of attack can occur regardless of whether or not
   resources are shared across Streams.

   The allocation of all scarce resources MUST be placed under the
   control of a Privileged Resource Manager. This allows the
   Privileged Resource Manager to:

   *  prevent a local ULP from allocating more than its fair
      share of resources.

   *  detect if a Remote Peer is attempting to launch a DOS
      attack by attempting to create an excessive number of
      Streams (with associated resources) and take corrective
      action (such as refusing the request or applying network
      layer filters against the Remote Peer).

   This analysis assumes that the Resource Manager is responsible
   for handing out Protection Domains, and RNIC implementations will
   provide enough Protection Domains to allow the Resource Manager
   to be able to assign a unique Protection Domain for each
   unrelated, untrusted local ULP (for a bounded, reasonable number
   of local ULPs). This analysis further assumes that the Resource
   Manager implements policies to ensure that untrusted local ULPs
   are not able to consume all of the Protection Domains through a
   DOS attack. Note that Protection Domain consumption cannot result
   from a DOS attack launched by a Remote Peer, unless a local ULP
   is acting on the Remote Peer's behalf.

6.4.2 Resource Consumption by Idle ULPs

   The simplest form of a DOS attack given a fixed amount of
   resources is for the Remote Peer to create an RDMAP Stream to a
   Local Peer, and request dedicated resources, then do no actual
   work. This allows the Remote Peer to be very light weight (i.e.
   only negotiate resources, but do no data transfer) and consumes a
   disproportionate amount of resources at the Local Peer.

   A general countermeasure for this style of attack is to monitor
   active RDMAP Streams and if resources are getting low, reap the
   resources from RDMAP Streams that are not transferring data and
   possibly terminate the Stream. This would presumably be under
   administrative control.

   Refer to Section 6.4.1 for the analysis and countermeasures for
   this style of attack on the following RNIC resources: Stream
   Context Memory, Page Translation Tables and STag namespace.
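
   As an illustration only, the general countermeasure of reaping
   resources from idle RDMAP Streams when resources run low might be
   sketched as follows. The function name, its parameters, and the
   policy of reaping the longest-idle Streams first are assumptions
   made for this sketch; this document does not specify thresholds
   or an ordering, and leaves the decision under administrative
   control.

```python
from typing import Dict, List


def streams_to_reap(last_transfer: Dict[int, float], now: float,
                    idle_after: float, free_resources: int,
                    low_threshold: int) -> List[int]:
    """Return the ids of Streams whose resources may be reclaimed.

    last_transfer maps a Stream id to the time of its last data
    transfer. Nothing is reaped while free resources are at or above
    low_threshold. Otherwise every Stream idle for at least
    idle_after is returned, longest idle first.
    """
    if free_resources >= low_threshold:
        return []
    idle = [sid for sid, t in last_transfer.items() if now - t >= idle_after]
    return sorted(idle, key=lambda sid: last_transfer[sid])
```

   The caller would then decide, per Stream, whether to reclaim only
   the resources or to terminate the Stream as well.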

   Note that some RNIC resources are not at risk of this type of
   attack from a Remote Peer because an attack requires the Remote
   Peer to send messages in order to consume the resource. Receive
   Data Buffers, Completion Queue, and RDMA Read Request Queue
   resources are examples. These resources are, however, at risk
   from a local ULP that attempts to allocate resources, then goes
   idle. This could also be created if the ULP negotiates the
   resource levels with the Remote Peer, which causes the Local Peer
   to consume resources, however the Remote Peer never sends data to
   consume them. The general countermeasure described in this
   section can be used to free resources allocated by an idle Local
   Peer.

6.4.3 Resource Consumption By Active ULPs

   This section describes DOS attacks from Local and Remote Peers
   that are actively exchanging messages. Attacks on each RDMA NIC
   resource are examined and specific countermeasures are
   identified. Note that attacks on Stream Context Memory, Page
   Translation Tables, and STag namespace are covered in Section
   6.4.1 RNIC Resource Consumption, so are not included here.

6.4.3.1 Multiple Streams Sharing Receive Buffers

   The Remote Peer can attempt to consume more than its fair share
   of receive data buffers (i.e. Untagged buffers for DDP or
   Send Type Messages for RDMAP) if receive buffers are shared
   across multiple Streams.

   If resources are not shared across multiple Streams, then this
   attack is not possible because the Remote Peer will not be able
   to consume more buffers than were allocated to the Stream. The
   worst case scenario is that the Remote Peer can consume more
   receive buffers than the local ULP allowed, resulting in no
   buffers being available, which could cause the Remote Peer's
   Stream to the Local Peer to be torn down, and all allocated
   resources to be released.

   If local receive data buffers are shared among multiple Streams,
   then the Remote Peer can attempt to consume more than its fair
   share of the receive buffers, causing a different Stream to be
   short of receive buffers, thus possibly causing the other Stream
   to be torn down. For example, if the Remote Peer sent enough one
   byte Untagged Messages, they might be able to consume all local
   shared receive queue resources with little effort on their part.

   One method the Local Peer could use is to recognize that a Remote
   Peer is attempting to use more than its fair share of resources
   and terminate the Stream (causing the allocated resources to be
   released). However, if the Local Peer is sufficiently slow, it
   may be possible for the Remote Peer to still mount a denial of
   service attack. One countermeasure that can protect against this
   attack is implementing a low-water notification. The low-water
   notification alerts the ULP if the number of buffers in the
   receive queue is less than a threshold.

   If all of the following conditions are true, then the Local Peer
   or local ULP can size the amount of local receive buffers posted
   on the receive queue to ensure a DOS attack can be stopped.

   *  a low-water notification is enabled, and

   *  the Local Peer is able to bound the amount of time that
      it takes to replenish receive buffers, and

   *  the Local Peer maintains statistics to determine which
      Remote Peer is consuming buffers.

   The above conditions enable the low-water notification to arrive
   before resources are depleted and thus the Local Peer or local
   ULP can take corrective action (e.g., terminate the Stream of the
   attacking Remote Peer).

   A different, but similar attack is if the Remote Peer sends a
   significant number of out-of-order packets and the RNIC has the
   ability to use the ULP buffer (i.e.
   the Untagged Buffer for DDP or the buffer consumed by a Send Type
   Message for RDMAP) as a reassembly buffer. In this case the
   Remote Peer can consume a significant number of ULP buffers, but
   never send enough data to enable the ULP buffer to be completed
   to the ULP.

   An effective countermeasure is to create a high-water
   notification which alerts the ULP if there is more than a
   specified number of receive buffers "in process" (partially
   consumed, but not completed). The notification is generated when
   more than the specified number of buffers are in process
   simultaneously on a specific Stream (i.e., packets have started
   to arrive for the buffer, but the buffer has not yet been
   delivered to the ULP).

   A different countermeasure is for the RNIC Engine to provide the
   capability to limit the Remote Peer's ability to consume receive
   buffers on a per Stream basis. Unfortunately this requires a
   large amount of state to be tracked in each RNIC on a per Stream
   basis.

   Thus, if an RNIC Engine provides the ability to share receive
   buffers across multiple Streams, the combination of the RNIC
   Engine and the Privileged Resource Manager MUST be able to detect
   if the Remote Peer is attempting to consume more than its fair
   share of resources so that the Local Peer or local ULP can apply
   countermeasures to detect and prevent the attack.

6.4.3.2 Remote or Local Peer Attacking a Shared CQ

   For an overview of the shared CQ attack model, see Section 7.1.

   The Remote Peer can attack a shared CQ by consuming more than its
   fair share of CQ entries by using one of the following methods:

   *  The ULP protocol allows the Remote Peer to cause the
      local ULP to reserve a specified number of CQ entries,
      possibly leaving insufficient entries for other Streams
      that are sharing the CQ.

   *  If the Remote Peer, Local Peer, or local ULP (or any
      combination) can attack the CQ by overwhelming the CQ
      with completions, then completion processing on other
      Streams sharing that Completion Queue can be affected
      (e.g. the Completion Queue overflows and stops
      functioning).

   The first method of attack can be avoided if the ULP does not
   allow a Remote Peer to reserve CQ entries or there is a trusted
   intermediary such as a Privileged Resource Manager. Unfortunately
   it is often unrealistic to not allow a Remote Peer to reserve CQ
   entries - particularly if the number of completion entries is
   dependent on other ULP negotiated parameters, such as the amount
   of buffering required by the ULP. Thus an implementation MUST
   implement a Privileged Resource Manager to control the allocation
   of CQ entries. See Section 2.1 Components for a definition of
   Privileged Resource Manager.

   One way that a Local or Remote Peer can attempt to overwhelm a CQ
   with completions is by sending minimum length RDMAP/DDP Messages
   to cause as many completions (receive completions for the Remote
   Peer, send completions for the Local Peer) per second as
   possible. If it is the Remote Peer attacking, and we assume that
   the Local Peer's receive queue(s) do not run out of receive
   buffers (if they do, then this is a different attack, documented
   in Section 6.4.3.1 Multiple Streams Sharing Receive Buffers),
   then it might be possible for the Remote Peer to consume more
   than its fair share of Completion Queue entries. Depending upon
   the CQ implementation, this could either cause the CQ to overflow
   (if it is not large enough to handle all of the completions
   generated) or for another Stream to not be able to generate CQ
   entries (if the RNIC had flow control on generation of CQ entries
   into the CQ).
   In either case, the CQ will stop functioning correctly and any
   Streams expecting completions on the CQ will stop functioning.

   This attack can occur regardless of whether all of the Streams
   associated with the CQ are in the same Protection Domain or are
   in different Protection Domains - the key issue is that the
   number of Completion Queue entries is less than the number of all
   outstanding operations that can cause a completion.

   The Local Peer can protect itself from this type of attack using
   either of the following methods:

   *  Size the CQ to the appropriate level, as specified below
      (note that if the CQ currently exists, and it needs to be
      resized, resizing the CQ is not required to succeed in
      all cases, so the CQ resize should be done before sizing
      the Send Queue and Receive Queue on the Stream), OR

   *  Grant fewer resources than the Remote Peer requested (not
      supplying the number of Receive Data Buffers requested).

   The proper sizing of the CQ is dependent on whether the local
   ULP(s) will post as many resources to the various queues as the
   size of the queue enables or not. If the local ULP(s) can be
   trusted to post a number of resources that is smaller than the
   size of the specific resource's queue, then a correctly sized CQ
   means that the CQ is large enough to hold completion status for
   all of the outstanding Data Buffers (both send and receive
   buffers), or:

      CQ_MIN_SIZE = SUM(MaxPostedOnEachRQ)
                  + SUM(MaxPostedOnEachSRQ)
                  + SUM(MaxPostedOnEachSQ)

   Where:

      MaxPostedOnEachRQ = the maximum number of requests which
         can cause a completion that will be posted on a
         specific Receive Queue.

      MaxPostedOnEachSRQ = the maximum number of requests which
         can cause a completion that will be posted on a
         specific Shared Receive Queue.

      MaxPostedOnEachSQ = the maximum number of requests which
         can cause a completion that will be posted on a
         specific Send Queue.

   If the local ULP must be able to completely fill the queues, or
   can not be trusted to observe a limit smaller than the queues,
   then the CQ must be sized to accommodate the maximum number of
   operations that it is possible to post at any one time. Thus the
   equation becomes:

      CQ_MIN_SIZE = SUM(SizeOfEachRQ)
                  + SUM(SizeOfEachSRQ)
                  + SUM(SizeOfEachSQ)

   Where:

      SizeOfEachRQ = the maximum number of requests which
         can cause a completion that can ever be posted
         on a specific Receive Queue.

      SizeOfEachSRQ = the maximum number of requests which
         can cause a completion that can ever be posted
         on a specific Shared Receive Queue.

      SizeOfEachSQ = the maximum number of requests which
         can cause a completion that can ever be posted
         on a specific Send Queue.

   Where MaxPosted*OnEach*Q and SizeOfEach*Q vary on a per Stream
   or per Shared Receive Queue basis.

   If the ULP is sharing a CQ across multiple Streams which do not
   share Partial Mutual Trust, then the ULP MUST implement a
   mechanism to ensure that the Completion Queue can not overflow.
   Note that it is possible to share CQs even if the Remote Peers
   accessing the CQs are untrusted if either of the above two
   formulas is implemented. If the ULP can be trusted to not post
   more than MaxPostedOnEachRQ, MaxPostedOnEachSRQ, and
   MaxPostedOnEachSQ, then the first formula applies. If the ULP can
   not be trusted to obey the limit, then the second formula
   applies.

6.4.3.3 Attacking the RDMA Read Request Queue

   The RDMA Read Request Queue can be attacked if the Remote Peer
   sends more RDMA Read Requests than the depth of the RDMA Read
   Request Queue at the Local Peer.
   If the RDMA Read Request Queue is a shared resource, this could
   corrupt the queue. If the queue is not shared, then the worst
   case is that the current Stream is no longer functional (e.g.
   torn down). One approach to solving the shared RDMA Read Request
   Queue would be to create thresholds, similar to those described
   in Section 6.4.3.1 Multiple Streams Sharing Receive Buffers. A
   simpler approach is to not share RDMA Read Request Queue
   resources among Streams or enforce hard limits of consumption per
   Stream. Thus RDMA Read Request Queue resource consumption MUST be
   controlled by the Privileged Resource Manager such that RDMAP/DDP
   Streams which do not share Partial Mutual Trust do not share RDMA
   Read Request Queue resources.

   If the issue is a bug in the Remote Peer's implementation, but
   not a malicious attack, the issue can be solved by requiring the
   Remote Peer's RNIC to throttle RDMA Read Requests. By properly
   configuring the Stream at the Remote Peer through a trusted
   agent, the RNIC can be made to not transmit RDMA Read Requests
   that exceed the depth of the RDMA Read Request Queue at the Local
   Peer. If the Stream is correctly configured, and if the Remote
   Peer submits more requests than the Local Peer's RDMA Read
   Request Queue can handle, the requests would be queued at the
   Remote Peer's RNIC until previous requests complete. If the
   Remote Peer's Stream is not configured correctly, the RDMAP
   Stream is terminated when more RDMA Read Requests arrive at the
   Local Peer than the Local Peer can handle (assuming the prior
   paragraph's recommendation is implemented). Thus an RNIC
   implementation SHOULD provide a mechanism to cap the number of
   outstanding RDMA Read Requests. The configuration of this limit
   is outside the scope of this document.
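
   As an illustration only, a requester-side cap on outstanding RDMA
   Read Requests, with excess requests queued locally until earlier
   ones complete, might be sketched as follows. The
   ReadRequestThrottle class and its method names are hypothetical;
   how the limit is configured is outside the scope of this
   document, so the sketch simply takes it as a constructor
   argument.

```python
from collections import deque
from typing import Any, Deque, Optional


class ReadRequestThrottle:
    """Caps the number of RDMA Read Requests outstanding on one Stream."""

    def __init__(self, max_outstanding: int) -> None:
        if max_outstanding < 1:
            raise ValueError("max_outstanding must be at least 1")
        self.max_outstanding = max_outstanding
        self.outstanding = 0
        self.pending: Deque[Any] = deque()  # requests held back locally

    def submit(self, request: Any) -> bool:
        """Return True if the request may be sent now, False if queued."""
        if self.outstanding < self.max_outstanding:
            self.outstanding += 1
            return True
        self.pending.append(request)
        return False

    def complete(self) -> Optional[Any]:
        """Record one completed request; return the next one to send, if any."""
        if self.outstanding == 0:
            raise RuntimeError("no RDMA Read Request is outstanding")
        self.outstanding -= 1
        if self.pending:
            self.outstanding += 1
            return self.pending.popleft()
        return None
```

   With the limit set to the depth of the Local Peer's RDMA Read
   Request Queue, the requester in this sketch never has more
   requests in flight than the responder can hold.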

6.4.4 Exercise of non-optimal code paths

   Another form of DOS attack is to attempt to exercise data paths
   that can consume a disproportionate amount of resources. An
   example might be if error cases are handled on a "slow path"
   (consuming either host or RNIC computational resources), and an
   attacker generates excessive numbers of errors in an attempt to
   consume these resources. Note that for most RDMAP or DDP errors,
   the attacking Stream will simply be torn down. Thus for this form
   of attack to be effective, the Remote Peer needs to exercise data
   paths which do not cause the Stream to be torn down.

   If an RNIC implementation contains "slow paths" which do not
   result in the tear down of the Stream, it is recommended that an
   implementation provide the ability to detect the above condition
   and allow an administrator to act, including potentially
   administratively tearing down the RDMAP Stream associated with
   the Stream exercising data paths consuming a disproportionate
   amount of resources.

6.4.5 Remote Invalidate an STag Shared on Multiple Streams

   If a Local Peer has enabled an STag for remote access, the Remote
   Peer could attempt to remote invalidate the STag by using the
   RDMAP Send with Invalidate or Send with SE and Invalidate
   Message. If the STag is only valid on the current Stream, then
   the only side effect is that the Remote Peer can no longer use
   the STag; thus there are no security issues.

   If the STag is valid across multiple Streams, then the Remote
   Peer can prevent other Streams from using that STag by using the
   remote invalidate functionality.

   Thus if RDDP Streams do not share Partial Mutual Trust (i.e. the
   Remote Peer may attempt to remote invalidate the STag
   prematurely), the ULP MUST NOT enable an STag which would be
   valid across multiple Streams.
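
   As an illustration only, a ULP-side guard for the rule above
   might be sketched as follows. The function name and its
   parameters are hypothetical; the sketch assumes the ULP already
   knows which Streams an STag would be valid on and whether those
   Streams share Partial Mutual Trust.

```python
from typing import Iterable


def check_stag_stream_scope(stream_ids: Iterable[int],
                            share_partial_mutual_trust: bool) -> None:
    """Reject an STag that would span Streams lacking Partial Mutual Trust.

    An STag valid on a single Stream is always acceptable, because a
    remote invalidate then affects only the Stream that issued it.
    """
    distinct = set(stream_ids)
    if not distinct:
        raise ValueError("an STag must be valid on at least one Stream")
    if len(distinct) > 1 and not share_partial_mutual_trust:
        raise ValueError(
            "STag must not be valid across multiple Streams that do not "
            "share Partial Mutual Trust"
        )
```

   A ULP could call such a check before enabling the STag for remote
   access, so the violation is caught at registration time rather
   than when a premature remote invalidate arrives.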

6.4.6 Remote Peer attacking an Unshared CQ

   The Remote Peer can attack an unshared CQ if the Local Peer does
   not size the CQ correctly. For example, if the Local Peer enables
   the CQ to handle completions of received buffers, and the receive
   buffer queue is longer than the Completion Queue, then an
   overflow can potentially occur. The effect on the attacker's
   Stream is catastrophic. However if an RNIC does not have the
   proper protections in place, then an attack to overflow the CQ
   can also cause corruption and/or termination of an unrelated
   Stream. Thus an RNIC MUST ensure that if a CQ overflows, any
   Streams which do not use the CQ MUST remain unaffected.

6.5 Elevation of Privilege

   The RDMAP/DDP Security Architecture explicitly differentiates
   between three levels of privilege - Non-Privileged, Privileged,
   and the Privileged Resource Manager. If a Non-Privileged ULP is
   able to elevate its privilege level to a Privileged ULP, then
   mapping a physical address list to an STag can provide local and
   remote access to any physical address location on the node. If a
   Privileged Mode ULP is able to promote itself to be a Resource
   Manager, then it is possible for it to perform denial of service
   type attacks where substantial amounts of local resources could
   be consumed.

   In general, elevation of privilege is a local implementation
   specific issue and thus outside the scope of this document.

7 Attacks from Local Peers

   This section describes local attacks that are possible against
   the RDMA system defined in Figure 1 - RDMA Security Model and the
   RNIC Engine resources defined in Section 2.2.
7.1 Local ULP Attacking a Shared CQ

DOS attacks against a Shared Completion Queue (CQ - see Section
2.2.6 Completion Queues) can be caused by either the local ULP or
the Remote Peer if either attempts to cause more completions than
its fair share of the number of entries, thus potentially
starving another unrelated ULP such that no Completion Queue
entries are available.

A Completion Queue entry can potentially be maliciously consumed
by a completion from the Send Queue or a completion from the
Receive Queue. In the former, the attacker is the local ULP. In
the latter, the attacker is the Remote Peer.

A form of attack can occur where the local ULPs can consume
resources on the CQ. A local ULP that is slow to free resources
on the CQ by not reaping the completion status quickly enough
could stall all other local ULPs attempting to use that CQ.

For these reasons, an RNIC MUST NOT enable sharing a CQ across
ULPs that do not share Partial Mutual Trust.

7.2 Local Peer Attacking the RDMA Read Request Queue

If RDMA Read Request Queue resources are pooled across multiple
Streams, one attack is for the local ULP to attempt to unfairly
allocate RDMA Read Request Queue resources for its Streams. For
example, a local ULP attempts to allocate all available resources
on a specific RDMA Read Request Queue for its Streams, thereby
denying the resource to ULPs sharing the RDMA Read Request Queue.
The same type of argument applies even if the RDMA Read Request
Queue is not shared, but a local ULP attempts to allocate all of
the RNIC's resources when the queue is created.

Thus access to interfaces that allocate RDMA Read Request Queue
entries MUST be restricted to a trusted Local Peer, such as a
Privileged Resource Manager.
The Privileged Resource Manager
SHOULD prevent a local ULP from allocating more than its fair
share of resources.

7.3 Local ULP Attacking the PTT & STag Mapping

If a Non-Privileged ULP is able to directly manipulate the RNIC
Page Translation Tables (which translate from an STag to a host
address), it is possible that the Non-Privileged ULP could point
the Page Translation Table at an unrelated Stream's or ULP's
buffers and thereby be able to gain access to information of the
unrelated Stream/ULP.

As discussed in Section 2 Architectural Model, introduction of a
Privileged Resource Manager to arbitrate the mapping requests is
an effective countermeasure. This enables the Privileged Resource
Manager to ensure a local ULP can only initialize the Page
Translation Table (PTT) to point to its own buffers.

Thus if Non-Privileged ULPs are supported, the Privileged
Resource Manager MUST verify that the Non-Privileged ULP has the
right to access a specific Data Buffer before allowing an STag
for which the ULP has access rights to be associated with a
specific Data Buffer. This can be done when the Page Translation
Table is initialized to access the Data Buffer or when the STag
is initialized to point to a group of Page Translation Table
entries, or both.

8 Security considerations

Please see Section 5 Attacks That Can be Mitigated With
End-to-End Security, Section 6 Attacks from Remote Peers, and
Section 7 Attacks from Local Peers, for a detailed analysis of
attacks and normative countermeasures to mitigate the attacks.

Additionally, the appendices provide a summary of the security
requirements for specific audiences.
Section 11 Appendix A: ULP
Issues for RDDP Client/Server Protocols provides a summary of
implementation issues and requirements for applications which
implement a traditional client/server style of interaction. It
provides additional insight and applicability of the normative
text in Sections 5, 6, and 7. Section 12, Appendix B: Summary of
RNIC and ULP Implementation Requirements provides a convenient
summary of normative requirements for implementers.

9 IANA Considerations

IANA considerations are not addressed by this document. Any IANA
considerations resulting from the use of DDP or RDMA must be
addressed in the relevant standards.

10 References

10.1 Normative References

[DDP] Shah, H., J. Pinkerton, R. Recio, and P. Culley, "Direct
   Data Placement over Reliable Transports", Internet-Draft Work
   in Progress draft-ietf-rddp-ddp-05.txt, July 2005.

[RDMAP] Recio, R., P. Culley, D. Garcia, J. Hilland, "An RDMA
   Protocol Specification", Internet-Draft Work in Progress
   draft-ietf-rddp-rdmap-05.txt, July 2005.

[RFC2406] Kent, S., Atkinson, R., "IP Encapsulating Security
   Payload (ESP)", RFC 2406, November 1998.

[RFC2409] Harkins, D., Carrel, D., "The Internet Key Exchange
   (IKE)", RFC 2409, November 1998.

[RFC2401] Kent, S., Atkinson, R., "Security Architecture for the
   Internet Protocol", RFC 2401, November 1998.

[RFC2402] Kent, S., Atkinson, R., "IP Authentication Header", RFC
   2402, November 1998.

[RFC3723] Aboba, B., et al, "Securing Block Storage Protocols
   over IP", RFC 3723, April 2004.

[RFC2960] Stewart, R. et al., "Stream Control Transmission
   Protocol", RFC 2960, October 2000.

[RFC793] Postel, J., "Transmission Control Protocol - DARPA
   Internet Program Protocol Specification", RFC 793, September
   1981.
10.2 Informative References

[RFC2828] Shirey, R., "Internet Security Glossary", FYI 36, RFC
   2828, May 2000.

[APPLICABILITY] Bestler, C., Coene, L., "Applicability of Remote
   Direct Memory Access Protocol (RDMA) and Direct Data
   Placement (DDP)", Internet-Draft Work in Progress
   draft-ietf-rddp-applicability-06.txt, April 2006.

[IPv6-Trust] Nikander, P., J. Kempf, E. Nordmark, "IPv6 Neighbor
   Discovery Trust Models and threats", Informational RFC,
   RFC 3756, May 2004.

[NFSv4CHANNEL] Williams, N., "On the Use of Channel Bindings to
   Secure Channels", Internet-Draft
   draft-ietf-nfsv4-channel-bindings-02.txt, July 2004.

[VERBS-RDMAC] "RDMA Protocol Verbs Specification", RDMA
   Consortium standard, April 2003.
   http://www.rdmaconsortium.org/home/draft-hilland-iwarp-verbs-v1.0-RDMAC.pdf

[VERBS-RDMAC-Overview] "RDMA enabled NIC (RNIC) Verbs Overview",
   slide presentation by Renato Recio, April 2003.
   http://www.rdmaconsortium.org/home/RNIC_Verbs_Overview2.pdf

[RFC3552] "Guidelines for Writing RFC Text on Security
   Considerations", Best Current Practice RFC, RFC 3552, July
   2003.

[INFINIBAND] "InfiniBand Architecture Specification Volume 1",
   release 1.2, InfiniBand Trade Association standard.
   http://www.infinibandta.org/specs. Verbs are documented in
   chapter 11.

[DTLS] E. Rescorla and N. Modadugu, "Datagram Transport Layer
   Security", RFC 4347, April 2006.

[iSCSI] J. Satran, et al, "Internet Small Computer Systems
   Interface (iSCSI)", RFC 3720, April 2004.

[ISER] M. Ko, et al, "iSCSI Extensions for RDMA Specification",
   Internet-Draft Work in Progress draft-ietf-ips-iser-05.txt,
   October 2005.

[NFSv4] S. Shepler, et al, "Network File System (NFS) version 4
   Protocol", RFC 3530, April 2003.

[NFSv4.1] S.
Shepler, ed., "NFSv4 Minor Version 1",
   Internet-Draft draft-ietf-nfsv4-minorversion1-03.txt, Work in
   Progress, June 2006.

11 Appendix A: ULP Issues for RDDP Client/Server Protocols

This section is a normative appendix to the document that is
focused on client/server ULP implementation requirements to
ensure a secure server implementation.

The prior sections outlined specific attacks and their
countermeasures. This section summarizes the attacks and
countermeasures that have been defined in the prior sections which
are applicable to creation of a secure ULP (e.g. application)
server. A ULP server is defined as a ULP which must be able to
communicate with many clients which do not necessarily have a
trust relationship with each other, and ensure that each client
cannot attack another client through server interactions.
Further, the server may wish to use multiple Streams to
communicate with a specific client, and those Streams may share
mutual trust. Note that this section assumes a compliant RNIC and
Privileged Resource Manager implementation - thus it focuses
specifically on ULP server (e.g. application) implementation
issues.

All of the prior sections' details on attacks and countermeasures
apply to the server, thus requirements which are repeated in this
section use non-normative "must", "should", "may". In some cases
normative SHOULD statements for the ULP from the main body of
this document are made MUST statements for the ULP server because
the operating conditions can be refined to make the motives for a
SHOULD inapplicable. If a prior SHOULD is changed to a MUST in
this section, it is explicitly noted and it uses upper-case
normative statements.
The following list summarizes the relevant attacks that clients
can mount on the shared server, by re-stating the previous
normative statements to be client/server specific. Note that each
client/server ULP may employ explicit RDMA operations (RDMA Read,
RDMA Write) in differing fashions. Therefore where appropriate,
"Local ULP", "Local Peer" and "Remote Peer" are used in place of
"server" or "client", in order to retain full generality of each
requirement.

* Spoofing

    * Sections 5.1.1 to 5.1.3. For protection against many
      forms of spoofing attacks, enable IPsec.

    * Section 6.1.1 Using an STag on a Different Stream. To
      ensure that one client cannot access another
      client's data via use of the other client's STag, the
      server ULP must either scope an STag to a single
      Stream or use a unique Protection Domain per client.
      If a single client has multiple Streams that share
      Partial Mutual Trust, then the STag can be shared
      between the associated Streams by using a single
      Protection Domain among the associated Streams (see
      Section 5.4.4 ULPs Which Provide Security for
      additional issues). To prevent unintended sharing of
      STags within the associated Streams, a server ULP
      should use STags in such a fashion that it is
      difficult to predict the next allocated STag number.

* Tampering

    * 6.2.2 Modifying a Buffer After Indication. Before the
      local ULP operates on a buffer that was written by
      the Remote Peer using an RDMA Write or RDMA Read, the
      local ULP MUST ensure the buffer can no longer be
      modified, by invalidating the STag for remote access
      (note that this is stronger than the SHOULD in
      Section 6.2.2).
      This can either be done explicitly by
      revoking remote access rights for the STag when the
      Remote Peer indicates the operation has completed, or
      by checking to make sure the Remote Peer Invalidated
      the STag through the RDMAP Invalidate capability, and
      if it did not, the local ULP then explicitly revoking
      the STag remote access rights.

* Information Disclosure

    * 6.3.2 Using RDMA Read to Access Stale Data. In a
      general purpose server environment there is no
      compelling rationale to not require a buffer to be
      initialized before remote read is enabled (and an
      enormous downside of unintentionally sharing data).
      Thus a local ULP MUST (this is stronger than the
      SHOULD in Section 6.3.2) ensure that no stale data is
      contained in a buffer before remote read access
      rights are granted to a Remote Peer (this can be done
      by zeroing the contents of the memory, for example).

    * 6.3.3 Accessing a Buffer After the Transfer. This
      mitigation is already covered by Section 6.2.2
      (above).

    * 6.3.4 Accessing Unintended Data With a Valid STag.
      The ULP must set the base and bounds of the buffer
      when the STag is initialized to expose only the data
      to be retrieved.

    * 6.3.5 RDMA Read into an RDMA Write Buffer. If a peer
      only intends a buffer to be exposed for remote write
      access, it must set the access rights to the buffer
      to only enable remote write access.

    * 6.3.6 Using Multiple STags Which Alias to the Same
      Buffer. The requirement in Section 6.1.1 (above)
      mitigates this attack. A server buffer is exposed to
      only one client at a time to ensure that no
      information disclosure or information tampering
      occurs between peers.

    * 5.3 - Network Based Eavesdropping. Confidentiality
      services should be enabled by the ULP if this threat
      is a concern.
* Denial of Service

    * 6.4.3.1 Multiple Streams Sharing Receive Buffers. ULP
      memory footprint size can be important for some
      server ULPs. If a server ULP is expecting significant
      network traffic from multiple clients, using a
      receive buffer queue per Stream where there is a
      large number of Streams can consume substantial
      amounts of memory. Thus a receive queue that can be
      shared by multiple Streams is attractive.

      However, because of the attacks outlined in this
      section, sharing a single receive queue between
      multiple clients must only be done if a mechanism is
      in place to ensure one client cannot consume receive
      buffers in excess of its limits, as defined by each
      ULP. For multiple Streams within a single client ULP
      (which presumably share Partial Mutual Trust) this
      added overhead may be avoided.

    * 7.1 Local ULP Attacking a Shared CQ. The normative
      RNIC mitigations require the RNIC to not enable
      sharing of a CQ if the local ULPs do not share
      Partial Mutual Trust. Thus while the ULP is not
      allowed to enable this feature in an unsafe mode, if
      the two local ULPs share Partial Mutual Trust, they
      must behave in the following manner:

      1) The sizing of the completion queue is based on the
         size of the receive queue and send queues as
         documented in 6.4.3.2 Remote or Local Peer Attacking
         a Shared CQ.

      2) The local ULP ensures that CQ entries are reaped
         frequently enough to adhere to Section 6.4.3.2's
         rules.

    * 6.4.3.2 Remote or Local Peer Attacking a Shared CQ.
      There are two mitigations specified in this section -
      one requires a worst-case size of the CQ, and can be
      implemented entirely within the Privileged Resource
      Manager. The second approach requires cooperation
      with the local ULP server (to not post too many
      buffers), and enables a smaller CQ to be used.
      In some server environments, partial trust of the
      server ULP (but not the clients) is acceptable, thus
      the smaller CQ fully mitigates the remote attacker.
      In other environments, the local server ULP could
      also contain untrusted elements which can attack the
      local machine (or have bugs). In those environments,
      the worst-case size of the CQ must be used.

    * 6.4.3.3 Attacking the RDMA Read Request Queue. The
      section requires a server's Privileged
      Resource Manager to not allow sharing of RDMA Read
      Request Queues across multiple Streams that do not
      share Partial Mutual Trust, for a ULP which performs
      RDMA Read operations to server buffers. However,
      because the server ULP knows best which of its
      Streams share Partial Mutual Trust, this requirement
      can be reflected back to the ULP. The ULP (i.e.
      server) requirement in this case is that it MUST NOT
      allow RDMA Read Request Queues to be shared between
      ULPs which do not have Partial Mutual Trust.

    * 6.4.5 Remote Invalidate an STag Shared on Multiple
      Streams. This mitigation is already covered by
      Section 6.2.2 (above).

12 Appendix B: Summary of RNIC and ULP Implementation Requirements

This appendix is informative.

Below is a summary of implementation requirements for the RNIC:

* 3 Trust and Resource Sharing

* 5.4.5 Requirements for IPsec Encapsulation of DDP

* 6.1.1 Using an STag on a Different Stream

* 6.2.1 Buffer Overrun - RDMA Write or Read Response

* 6.2.2 Modifying a Buffer After Indication

* 6.4.1 RNIC Resource Consumption

* 6.4.3.1 Multiple Streams Sharing Receive Buffers

* 6.4.3.2 Remote or Local Peer Attacking a Shared CQ

* 6.4.3.3 Attacking the RDMA Read Request Queue

* 6.4.6 Remote Peer attacking an Unshared CQ
* 6.5 Elevation of Privilege

* 7.1 Local ULP Attacking a Shared CQ

* 7.3 Local ULP Attacking the PTT & STag Mapping

Below is a summary of implementation requirements for the ULP
above the RNIC:

* 5.3 Information Disclosure - Network Based Eavesdropping

* 6.1.1 Using an STag on a Different Stream

* 6.2.2 Modifying a Buffer After Indication

* 6.3.2 Using RDMA Read to Access Stale Data

* 6.3.3 Accessing a Buffer After the Transfer

* 6.3.4 Accessing Unintended Data With a Valid STag

* 6.3.5 RDMA Read into an RDMA Write Buffer

* 6.3.6 Using Multiple STags Which Alias to the Same Buffer

* 6.4.5 Remote Invalidate an STag Shared on Multiple
  Streams

13 Appendix C: Partial Trust Taxonomy

This appendix is informative.

Partial Trust is defined as follows: when one party is willing to
assume that another party will refrain from a specific attack or
set of attacks, the parties are said to be in a state of Partial
Trust. Note that the partially trusted peer may attempt a different
set of attacks. This may be appropriate for many ULPs where any
adverse effects of the betrayal are easily confined and do not
place other clients or ULPs at risk.

The Trust Models described in this section have three primary
distinguishing characteristics. The Trust Model refers to a local
ULP and Remote Peer, which are intended to be the local and
remote ULP instances communicating via RDMA/DDP.

* Local Resource Sharing (yes/no) - When local resources
  are shared, they are shared across a grouping of
  RDMAP/DDP Streams. If local resources are not shared, the
  resources are dedicated on a per Stream basis. Resources
  are defined in Section 2.2 - Resources. The advantage of
  not sharing resources between Streams is that it reduces
  the types of attacks that are possible. The disadvantage
  is that ULPs might run out of resources.
* Local Partial Trust (yes/no) - Local Partial Trust is
  determined based on whether the local grouping of
  RDMAP/DDP Streams (which typically equates to one ULP or
  group of ULPs) mutually trust each other to not perform a
  specific set of attacks.

* Remote Partial Trust (yes/no) - The Remote Partial Trust
  level is determined based on whether the local ULP of a
  specific RDMAP/DDP Stream partially trusts the Remote
  Peer of the Stream (see the definition of Partial Trust
  in Section 1 Introduction).

Not all of the combinations of the trust characteristics are
expected to be used by ULPs. This document specifically analyzes
five ULP Trust Models that are expected to be in common use. The
Trust Models are as follows:

* NS-NT - Non-Shared Local Resources, no Local Trust, no
  Remote Trust - typically a server ULP that wants to run
  in the safest mode possible. All attack mitigations are
  in place to ensure robust operation.

* NS-RT - Non-Shared Local Resources, no Local Trust,
  Remote Partial Trust - typically a peer-to-peer ULP,
  which has, by some method outside of the scope of this
  document, authenticated the Remote Peer. Note that unless
  some form of key based authentication is used on a per
  RDMA/DDP Stream basis, it may be possible
  for man-in-the-middle attacks to occur.

* S-NT - Shared Local Resources, no Local Trust, no Remote
  Trust - typically a server ULP that runs in an untrusted
  environment where the amount of resources required is
  either too large or too dynamic to dedicate for each
  RDMAP/DDP Stream.

* S-LT - Shared Local Resources, Local Partial Trust, no
  Remote Trust - typically a ULP which provides a session
  layer and uses multiple Streams to provide additional
  throughput or fail-over capabilities.
  All of the Streams
  within the local ULP partially trust each other, but do
  not trust the Remote Peer. This trust model may be
  appropriate for embedded environments.

* S-T - Shared Local Resources, Local Partial Trust, Remote
  Partial Trust - typically a distributed application, such
  as a distributed database application or a High
  Performance Computing (HPC) application, which is intended
  to run on a cluster. Due to extreme resource and
  performance requirements, the application typically
  authenticates with all of its peers and then runs in a
  highly trusted environment. The application peers are all
  in a single application fault domain and depend on one
  another to be well-behaved when accessing data
  structures. If a trusted Remote Peer has an
  implementation defect that results in poor behavior, the
  entire application could be corrupted.

Models NS-NT and S-NT above are typical for Internet networking -
neither local ULPs nor the Remote Peer is trusted. Sometimes
optimizations can be done that enable sharing of Page Translation
Tables across multiple local ULPs, thus Model S-LT can be
advantageous. Model S-T is typically used when resource scaling
across a large parallel ULP makes it infeasible to use any other
model. Resource scaling issues can either be due to performance
around scaling or because there simply are not enough resources.
Model NS-RT is probably the least likely model to be used, but is
presented for completeness.

14 Authors' Addresses

James Pinkerton
Microsoft Corporation
One Microsoft Way
Redmond, WA. 98052 USA
Phone: +1 (425) 705-5442
Email: jpink@windows.microsoft.com

Ellen Deleganes
Intel Corporation
MS JF5-355
2111 NE 25th Ave.
Hillsboro, OR 97124 USA
Phone: +1 (503) 712-4173
Email: ellen.m.deleganes@intel.com

15 Acknowledgments

Sara Bitan
Microsoft Corporation
Email: sarab@microsoft.com

Allyn Romanow
Cisco Systems
170 W Tasman Drive
San Jose, CA 95134 USA
Phone: +1 408 525 8836
Email: allyn@cisco.com

Catherine Meadows
Naval Research Laboratory
Code 5543
Washington, DC 20375
Email: meadows@itd.nrl.navy.mil

Patricia Thaler
Agilent Technologies, Inc.
1101 Creekside Ridge Drive, #100
M/S-RG10
Roseville, CA 95678
Phone: +1-916-788-5662
Email: pat_thaler@agilent.com

James Livingston
NEC Solutions (America), Inc.
7525 166th Ave. N.E., Suite D210
Redmond, WA 98052-7811
Phone: +1 (425) 897-2033
Email: james.livingston@necsam.com

John Carrier
Adaptec, Inc.
691 S. Milpitas Blvd.
Milpitas, CA 95035 USA
Phone: +1 (360) 378-8526
Email: john_carrier@adaptec.com

Caitlin Bestler
Broadcom
49 Discovery
Irvine, CA 92618
Email: cait@asomi.com

Bernard Aboba
Microsoft Corporation
One Microsoft Way
Redmond, WA. 98052 USA
Phone: +1 (425) 706-6606
Email: bernarda@windows.microsoft.com

16 Full Copyright Statement

Copyright (C) The Internet Society (2006).

This document is subject to the rights, licenses and restrictions
contained in BCP 78, and except as set forth therein, the authors
retain all their rights.
This document and the information contained herein are provided
on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND
THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY
THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY
RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS
FOR A PARTICULAR PURPOSE.

Intellectual Property

The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be
claimed to pertain to the implementation or use of the technology
described in this document or the extent to which any license
under such rights might or might not be available; nor does it
represent that it has made any independent effort to identify any
such rights. Information on the procedures with respect to
rights in RFC documents can be found in BCP 78 and BCP 79.

Copies of IPR disclosures made to the IETF Secretariat and any
assurances of licenses to be made available, or the result of an
attempt made to obtain a general license or permission for the
use of such proprietary rights by implementers or users of this
specification can be obtained from the IETF on-line IPR
repository at http://www.ietf.org/ipr.

The IETF invites any interested party to bring to its attention
any copyrights, patents or patent applications, or other
proprietary rights that may cover technology that may be required
to implement this standard. Please address the information to
the IETF at ietf-ipr@ietf.org.

Acknowledgement

Funding for the RFC Editor function is currently provided by the
Internet Society.