idnits 2.17.1 

draft-ietf-rddp-security-10.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** It looks like you're using RFC 3978 boilerplate.  You should update this
     to the boilerplate described in the IETF Trust License Policy document
     (see https://trustee.ietf.org/license-info), which is required now.

  -- Found old boilerplate from RFC 3978, Section 5.1 on line 15.

  -- Found old boilerplate from RFC 3978, Section 5.5 on line 2432.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 2443.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 2450.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 2456.

  ** This document has an original RFC 3978 Section 5.4 Copyright Line,
     instead of the newer IETF Trust Copyright according to RFC 4748.

  ** This document has an original RFC 3978 Section 5.5 Disclaimer, instead
     of the newer disclaimer which includes the IETF Trust according to RFC
     4748.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  == The page length should not exceed 58 lines per page, but there was 1
     longer page, the longest (page 1) being 59 lines


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not
     match the current year

  == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD',
     or 'RECOMMENDED' is not an accepted usage according to RFC 2119.  Please
     use uppercase 'NOT' together with RFC 2119 keywords (if that is what you
     mean).
     
     Found 'not RECOMMENDED' in this paragraph:
     
     For these reasons, it is not RECOMMENDED that TLS be layered on top
     of RDMAP or DDP.

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (June 2006) is 6525 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: 'RFC 2246' is mentioned on line 1024, but not defined

  ** Obsolete undefined reference: RFC 2246 (Obsoleted by RFC 4346)

  == Unused Reference: 'IPv6-Trust' is defined on line 1972, but no explicit
     reference was found in the text

  == Unused Reference: 'VERBS-RDMAC-Overview' is defined on line 1985, but no
     explicit reference was found in the text

  == Unused Reference: 'ISER' is defined on line 2004, but no explicit
     reference was found in the text

  == Outdated reference: A later version (-07) exists of
     draft-ietf-rddp-ddp-05

  == Outdated reference: A later version (-07) exists of
     draft-ietf-rddp-rdmap-05

  ** Obsolete normative reference: RFC 2406 (Obsoleted by RFC 4303, RFC 4305)

  ** Obsolete normative reference: RFC 2409 (Obsoleted by RFC 4306)

  ** Obsolete normative reference: RFC 2401 (Obsoleted by RFC 4301)

  ** Obsolete normative reference: RFC 2402 (Obsoleted by RFC 4302, RFC 4305)

  ** Obsolete normative reference: RFC 2960 (Obsoleted by RFC 4960)

  ** Obsolete normative reference: RFC  793 (Obsoleted by RFC 9293)

  -- Obsolete informational reference (is this intentional?): RFC 2828
     (Obsoleted by RFC 4949)

  == Outdated reference: A later version (-08) exists of
     draft-ietf-rddp-applicability-06

  == Outdated reference: A later version (-04) exists of
     draft-ietf-nfsv4-channel-bindings-02

  -- Obsolete informational reference (is this intentional?): RFC 4347 (ref.
     'DTLS') (Obsoleted by RFC 6347)

  == Outdated reference: A later version (-06) exists of
     draft-ietf-ips-iser-05

  -- Obsolete informational reference (is this intentional?): RFC 3530 (ref.
     'NFSv4') (Obsoleted by RFC 7530)


     Summary: 10 errors (**), 0 flaws (~~), 12 warnings (==), 10 comments
     (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Internet Draft                            James Pinkerton
3	draft-ietf-rddp-security-10.txt             Microsoft Corporation
4	Category: Standards Track                 Ellen Deleganes
5	Expires: December, 2006                     Intel Corporation
6	                                          June 2006

8	                           DDP/RDMAP Security

10	Status of this Memo
11	   By submitting this Internet-Draft, each author represents that
12	   any applicable patent or other IPR claims of which he or she is
13	   aware have been or will be disclosed, and any of which he or she
14	   becomes aware will be disclosed, in accordance with Section 6 of
15	   BCP 79.

17	   Internet-Drafts are working documents of the Internet Engineering
18	   Task Force (IETF), its areas, and its working groups. Note that
19	   other groups may also distribute working documents as Internet-
20	   Drafts.

22	   Internet-Drafts are draft documents valid for a maximum of six
23	   months and may be updated, replaced, or obsoleted by other
24	   documents at any time. It is inappropriate to use Internet-Drafts
25	   as reference material or to cite them other than as "work in
26	   progress."

28	   The list of current Internet-Drafts can be accessed at
29	   http://www.ietf.org/ietf/1id-abstracts.txt

31	   The list of Internet-Draft Shadow Directories can be accessed at
32	   http://www.ietf.org/shadow.html.

34	Abstract
35	   This document analyzes security issues around implementation and
36	   use of the Direct Data Placement Protocol (DDP) and Remote Direct
37	   Memory Access Protocol (RDMAP). It first defines an architectural
38	   model for an RDMA Network Interface Card (RNIC), which can
39	   implement DDP or RDMAP and DDP. The document reviews various
40	   attacks against the resources defined in the architectural model
41	   and the countermeasures that can be used to protect the system.
42	   Attacks are grouped into those that can be mitigated by using
43	   secure communication channels across the network, attacks from
44	   Remote Peers, and attacks from Local Peers. Attack categories
45	   include spoofing, tampering, information disclosure, denial of
46	   service, and elevation of privilege.

48	   J. Pinkerton, et al.     Expires December, 2006                  1
49	   Table of Contents

51	   1    Introduction.................................................4
52	   2    Architectural Model..........................................7
53	   2.1  Components...................................................8
54	   2.2  Resources...................................................10
55	   2.2.1  Stream Context Memory.....................................10
56	   2.2.2  Data Buffers..............................................10
57	   2.2.3  Page Translation Tables...................................11
58	   2.2.4  Protection Domain (PD)....................................11
59	   2.2.5  STag Namespace and Scope..................................12
60	   2.2.6  Completion Queues.........................................13
61	   2.2.7  Asynchronous Event Queue..................................13
62	   2.2.8  RDMA Read Request Queue...................................13
63	   2.3  RNIC Interactions...........................................14
64	   2.3.1  Privileged Control Interface Semantics....................14
65	   2.3.2  Non-Privileged Data Interface Semantics...................14
66	   2.3.3  Privileged Data Interface Semantics.......................15
67	   2.3.4  Initialization of RNIC Data Structures for Data Transfer..15
68	   2.3.5  RNIC Data Transfer Interactions...........................16
69	   3    Trust and Resource Sharing..................................18
70	   4    Attacker Capabilities.......................................19
71	   5    Attacks That Can be Mitigated With End-to-End Security......20
72	   5.1  Spoofing....................................................20
73	   5.1.1  Impersonation.............................................20
74	   5.1.2  Stream Hijacking..........................................21
75	   5.1.3  Man-in-the-Middle Attack..................................21
76	   5.2  Tampering - Network based modification of buffer content....22
77	   5.3  Information Disclosure - Network Based Eavesdropping........22
78	   5.4  Specific Requirements for Security Services.................22
79	   5.4.1  Introduction to Security Options..........................23
80	   5.4.2  TLS is Inappropriate for DDP/RDMAP Security...............23
81	   5.4.3  DTLS and RDDP.............................................24
82	   5.4.4  ULPs Which Provide Security...............................24
83	   5.4.5  Requirements for IPsec Encapsulation of DDP...............25
84	   6    Attacks from Remote Peers...................................26
85	   6.1  Spoofing....................................................26
86	   6.1.1  Using an STag on a Different Stream.......................26
87	   6.2  Tampering...................................................27
88	   6.2.1  Buffer Overrun - RDMA Write or Read Response..............28
89	   6.2.2  Modifying a Buffer After Indication.......................28
90	   6.2.3  Multiple STags to access the same buffer..................29
91	   6.3  Information Disclosure......................................29
92	   6.3.1  Probing memory outside of the buffer bounds...............29
93	   6.3.2  Using RDMA Read to Access Stale Data......................29
94	   6.3.3  Accessing a Buffer After the Transfer.....................30
95	   6.3.4  Accessing Unintended Data With a Valid STag...............30
96	   6.3.5  RDMA Read into an RDMA Write Buffer.......................30
97	   6.3.6  Using Multiple STags Which Alias to the Same Buffer.......31
98	   6.4  Denial of Service (DOS).....................................31
99	   6.4.1  RNIC Resource Consumption.................................32
100	   6.4.2  Resource Consumption by Idle ULPs.........................32
101	   6.4.3  Resource Consumption By Active ULPs.......................33
102	   6.4.3.1   Multiple Streams Sharing Receive Buffers...............33
103	   6.4.3.2   Remote or Local Peer Attacking a Shared CQ.............35
104	   6.4.3.3   Attacking the RDMA Read Request Queue..................37
105	   6.4.4  Exercise of non-optimal code paths........................38
106	   6.4.5  Remote Invalidate an STag Shared on Multiple Streams......38
107	   6.4.6  Remote Peer attacking an Unshared CQ......................39
108	   6.5  Elevation of Privilege......................................39
109	   7    Attacks from Local Peers....................................40
110	   7.1  Local ULP Attacking a Shared CQ.............................40
111	   7.2  Local Peer Attacking the RDMA Read Request Queue............40
112	   7.3  Local ULP Attacking the PTT & STag Mapping..................40
113	   8    Security considerations.....................................42
114	   9    IANA Considerations.........................................43
115	   10   References..................................................44
116	   10.1   Normative References......................................44
117	   10.2   Informative References....................................44
118	   11   Appendix A: ULP Issues for RDDP Client/Server Protocols.....46
119	   12   Appendix B: Summary of RNIC and ULP Implementation
120	   Requirements.....................................................50
121	   13   Appendix C: Partial Trust Taxonomy..........................52
122	   14   Author's Addresses..........................................54
123	   15   Acknowledgments.............................................55
124	   16   Full Copyright Statement....................................57

126	   Table of Figures

128	   Figure 1 - RDMA Security Model....................................8

130	1  Introduction

132	   RDMA enables new levels of flexibility when communicating between
133	   two parties compared to current conventional networking practice
134	   (e.g. a stream-based model or datagram model). This flexibility
135	   brings new security issues that must be carefully understood when
136	   designing Upper Layer Protocols (ULPs) utilizing RDMA and when
137	   implementing RDMA-aware NICs (RNICs). Note that for the purposes
138	   of this security analysis, an RNIC may implement RDMAP [RDMAP]
139	   and DDP [DDP], or just DDP. Also, a ULP may be an application or
140	   it may be a middleware library.

142	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
143	   NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED",  "MAY", and
144	   "OPTIONAL" in this document are to be interpreted as described in
145	   RFC 2119. Additionally the security terminology defined in
146	   [RFC2828] is used in this specification.

148	   The document first develops an architectural model that is
149	   relevant for the security analysis - it details components,
150	   resources, and system properties that may be attacked in Section
151	   2. The document uses Local Peer to represent the RDMA/DDP
152	   protocol implementation on the local end of a Stream (implemented
153	   with a transport protocol such as [RFC793] or [RFC2960]). The
154	   local Upper-Layer-Protocol (ULP) is used to represent the
155	   application or middle-ware layer above the Local Peer. The
156	   document does not attempt to differentiate between a Remote Peer
157	   and a Remote ULP (an RDMA/DDP protocol implementation on the
158	   remote end of a Stream versus the application on the remote end)
159	   for several reasons: often the source of the attack is difficult
160	   to know for sure; and regardless of the source, the mitigations
161	   required of the Local Peer or local ULP are the same. Thus the
162	   document generically refers to a Remote Peer rather than trying
163	   to further delineate the attacker.

165	   The document then defines what resources a local ULP may share
166	   across Streams and what resources the local ULP may share with
167	   the Remote Peer across Streams in Section 3.

169	   Intentional sharing of resources between multiple Streams may
170	   imply some level of trust between the Streams. However, some
171	   types of resource sharing have unmitigated security attacks which
172	   would mandate not sharing a specific type of resource unless
173	   there is some level of trust between the Streams sharing
174	   resources.

176	   This document defines a new term, "Partial Mutual Trust" to
177	   address this concept:

179	        Partial Mutual Trust - a collection of RDMAP/DDP Streams,
180	        which represent the local and remote end points of the
181	        Stream, which are willing to assume that the Streams from
182	        the collection will not perform malicious attacks against
183	        any of the other Streams in the collection.

185	   ULPs have explicit control of which collection of endpoints is in
186	   a Partial Mutual Trust collection through tools discussed in
187	   Section 13 Appendix C: Partial Trust Taxonomy.

189	   An untrusted peer relationship is appropriate when a ULP wishes
190	   to ensure that it will be robust and uncompromised even in the
191	   face of a deliberate attack by its peer. For example, a single
192	   ULP that concurrently supports multiple unrelated Streams (e.g. a
193	   server) would presumably treat each of its peers as an untrusted
194	   peer. For a collection of Streams which share Partial Mutual
195	   Trust, the assumption is that any Stream not in the collection is
196	   untrusted. For the untrusted peer, a brief list of capabilities
197	   is enumerated in Section 4.

199	   The rest of the document is focused on analyzing attacks and
200	   recommending specific mitigations to the attacks. Attacks are
201	   categorized into attacks mitigated by end-to-end security,
202	   attacks initiated by Remote Peers, and attacks initiated by Local
203	   Peers. For each attack, possible countermeasures are reviewed.

205	   ULPs within a host are divided into two categories - Privileged
206	   and Non-Privileged. Both ULP types can send and receive data and
207	   request resources. The key differences between the two are:

209	        The Privileged ULP is trusted by the local system to not
210	        maliciously attack the operating environment, but it is not
211	        trusted to optimize resource allocation globally. For
212	        example, the Privileged ULP could be a kernel ULP, thus the
213	        kernel presumably has in some way vetted the ULP before
214	        allowing it to execute.

216	        A Non-Privileged ULP's capabilities are a logical sub-set of
217	        the Privileged ULP's. It is assumed by the local system that
218	        a Non-Privileged ULP is untrusted. All Non-Privileged ULP
219	        interactions with the RNIC Engine that could affect other
220	        ULPs need to be done through a trusted intermediary that can
221	        verify the Non-Privileged ULP requests.

223	   The appendices provide focused summaries of this specification.
224	   Section 11 Appendix A: ULP Issues for RDDP Client/Server
225	   Protocols focuses on implementers of traditional client/server
226	   protocols. Section 12 Appendix B: Summary of RNIC and ULP
227	   Implementation Requirements summarizes all normative requirements
228	   in this specification. Section 13 Appendix C: Partial Trust
229	   Taxonomy provides an abstract model for categorizing trust
230	   boundaries.

232	   If an RDMAP/DDP protocol implementation uses the mitigations
233	   recommended in this document, that implementation should not
234	   exhibit additional security vulnerabilities above and beyond
235	   those of an implementation of the transport protocol (i.e., TCP
236	   or SCTP) and protocols beneath it (e.g., IP) without RDMAP/DDP.

238	2  Architectural Model

240	   This section describes an RDMA architectural reference model that
241	   is used as security issues are examined. It introduces the
242	   components of the model, the resources that can be attacked, the
243	   types of interactions possible between components and resources,
244	   and the system properties which must be preserved.

246	   Figure 1 shows the components comprising the architecture and the
247	   interfaces where potential security attacks could be launched.
248	   External attacks can be injected into the system from a ULP that
249	   sits above the RNIC Interface or from the network.

251	   The intent here is to describe high level components and
252	   capabilities which affect threat analysis, and not focus on
253	   specific implementation options. Also note that the architectural
254	   model is an abstraction, and an actual implementation may choose
255	   to subdivide its components along different boundary lines than
256	   defined here. For example, the Privileged Resource Manager may be
257	   partially or completely encapsulated in the Privileged ULP.
258	   Regardless, it is expected that the security analysis of the
259	   potential threats and countermeasures still applies.

261	   Note that the model below is derived from several specific RDMA
262	   implementations. A few of note are [VERBS-RDMAC], [VERBS-RDMAC-
263	   Overview], and [INFINIBAND].

265	          +-------------+
266	          |  Privileged |
267	          |  Resource   |
268	 Admin<-+>|  Manager    |     ULP Control Interface
269	        | |             |<------+-------------------+
270	        | +-------------+       |                   |
271	        |       ^               v                   v
272	        |       |         +-------------+   +-----------------+
273	        +---------------->| Privileged  |   |  Non-Privileged |
274	                |         | ULP         |   |  ULP            |
275	                |         +-------------+   +-----------------+
276	                |               ^                   ^
277	                |Privileged     |Privileged         |Non-Privileged
278	                |Control        |Data               |Data
279	                |Interface      |Interface          |Interface
280	RNIC            |               |                   |
281	Interface       v               v                   v
282	=================================================================

284	              +--------------------------------------+
285	              |                                      |
286	              |               RNIC Engine            |
287	              |                                      |
288	              +--------------------------------------+
289	                                ^
290	                                |
291	                                v
292	                             Internet

294	                     Figure 1 - RDMA Security Model

296	2.1  Components

298	   The components shown in Figure 1 - RDMA Security Model are:

300	       *   RDMA Network Interface Controller Engine (RNIC) - the
301	           component that implements the RDMA protocol and/or DDP
302	           protocol.

304	       *   Privileged Resource Manager - the component responsible
305	           for managing and allocating resources associated with the
306	           RNIC Engine. The Resource Manager does not send or
307	           receive data. Note that whether the Resource Manager is
308	           an independent component, part of the RNIC, or part of
309	           the ULP is implementation dependent.

311	       *   Privileged ULP - See Section 1 Introduction for a
312	           definition of Privileged ULP. The local host
313	           infrastructure can enable the Privileged ULP to map a
314	           data buffer directly from the RNIC Engine to the host
315	           through the RNIC Interface, but it does not allow the
316	           Privileged ULP to directly consume RNIC Engine resources.

318	       *   Non-Privileged ULP - See Section 1 Introduction for a
319	           definition of Non-Privileged ULP.

321	   A design goal of the DDP and RDMAP protocols is to allow, under
322	   constrained conditions, a Non-Privileged ULP to send and receive
323	   data directly to/from the RNIC Engine without Privileged Resource
324	   Manager intervention - while ensuring that the host remains
325	   secure. Thus, one of the primary goals of this document is to
326	   analyze this usage model for the enforcement that is required in
327	   the RNIC Engine to ensure the system remains secure.

329	   DDP provides two mechanisms for transferring data:

331	       *   Untagged Data Transfer - the incoming payload simply
332	           consumes the first buffer in a queue of buffers that are
333	           in the order specified by the receiving Peer (commonly
334	           referred to as the Receive Queue), and

336	       *   Tagged Data Transfer - the Peer transmitting the payload
337	           explicitly states which destination buffer is targeted,
338	           through use of an STag. STag based transfers allow the
339	           receiving ULP to be indifferent to what order (or in what
340	           messages) the opposite Peer sent the data, or what order
341	           packets are received in.

343	   Both data transfer mechanisms are also enabled through RDMAP,
344	   with additional control semantics. Typically Tagged Data Transfer
345	   can be used for payload transfer, while Untagged Data Transfer is
346	   best used for control messages.  However, each upper layer
347	   protocol can determine the optimal use of tagged and untagged
348	   messages for itself. See [APPLICABILITY] for more information on
349	   application applicability for the two transfer mechanisms.

351	   For DDP the two forms correspond to Untagged and Tagged DDP
352	   Messages, respectively. For RDMAP the two forms correspond to
353	   Send Type Messages and RDMA Messages (either RDMA Read or RDMA
354	   Write Messages), respectively.
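
	   The difference between the two mechanisms can be illustrated with
	   a minimal sketch. It is not an RNIC or verbs API: the class
	   ToyReceiver and all of its method names are hypothetical, and the
	   model ignores Streams, access rights, and segmentation. It only
	   shows that an Untagged payload lands in whichever buffer the
	   receiver posted first, while a Tagged payload lands where the
	   sender's STag and offset say, subject to a bounds check.

```python
from collections import deque


class ToyReceiver:
    """Hypothetical receiver model; not an RNIC or verbs API."""

    def __init__(self):
        self.receive_queue = deque()  # Untagged buffers, in posted order
        self.tagged = {}              # STag -> bytearray (Tagged buffers)

    def post_receive(self, size):
        """Receiver posts an Untagged buffer to the Receive Queue."""
        buf = bytearray(size)
        self.receive_queue.append(buf)
        return buf

    def advertise(self, stag, size):
        """Receiver exposes a Tagged buffer under the given STag."""
        buf = bytearray(size)
        self.tagged[stag] = buf
        return buf

    def untagged_arrival(self, payload):
        """The payload consumes the first posted buffer; the sender
        has no say in which buffer is used."""
        if not self.receive_queue:
            raise RuntimeError("no receive buffer posted")
        if len(payload) > len(self.receive_queue[0]):
            raise ValueError("payload larger than receive buffer")
        buf = self.receive_queue.popleft()
        buf[:len(payload)] = payload
        return buf

    def tagged_arrival(self, stag, offset, payload):
        """The sender names the target buffer (STag) and offset."""
        buf = self.tagged.get(stag)
        if buf is None:
            raise KeyError("unknown STag")
        if offset < 0 or offset + len(payload) > len(buf):
            raise ValueError("placement outside buffer bounds")
        buf[offset:offset + len(payload)] = payload
```

	   In this model, two Tagged segments for the same STag produce the
	   same buffer contents whichever one arrives first, which is why
	   the receiving ULP can be indifferent to arrival order.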

356	   The host interfaces that could be exercised include:

358	       *   Privileged Control Interface - A Privileged Resource
359	           Manager uses the RNIC Interface to allocate and manage
360	           RNIC Engine resources, control the state within the RNIC
361	           Engine, and monitor various events from the RNIC Engine.
362	           It also uses this interface to act as a proxy for some
363	           operations that a Non-Privileged ULP may require (after
364	           performing appropriate countermeasures).

366	       *   ULP Control Interface - A ULP uses this interface to the
367	           Privileged Resource Manager to allocate RNIC Engine
368	           resources. The Privileged Resource Manager implements
369	           countermeasures to ensure that if the Non-Privileged ULP
370	           launches an attack it can prevent the attack from
371	           affecting other ULPs.

373	       *   Non-Privileged Data Transfer Interface - A Non-Privileged
374	           ULP uses this interface to initiate and to check the
375	           status of data transfer operations.

377	       *   Privileged Data Transfer Interface - A superset of the
378	           functionality provided by the Non-Privileged Data
379	           Transfer Interface. The ULP is allowed to directly
380	           manipulate RNIC Engine mapping resources to map an STag
381	           to a ULP data buffer.

383	   If Internet control messages, such as ICMP, ARP, RIPv4, etc. are
384	   processed by the RNIC Engine, the threat analyses for those
385	   protocols are also applicable, but outside the scope of this
386	   document.

388	2.2  Resources

390	   This section describes the primary resources in the RNIC Engine
391	   that could be affected if under attack. For RDMAP, all of the
392	   defined resources apply. For DDP, all of the resources except the
393	   RDMA Read Request Queue apply.

395	2.2.1  Stream Context Memory

397	   The state information for each Stream is maintained in memory,
398	   which could be located in a number of places - on the NIC, inside
399	   RAM attached to the NIC, in host memory, or in any combination of
400	   the three, depending on the implementation.

402	   Stream Context Memory includes state associated with Data
403	   Buffers. For Tagged Buffers, this includes how STag names, Data
404	   Buffers, and Page Translation Tables (see Section 2.2.3)
405	   interrelate. It also includes the list of Untagged Data Buffers
406	   posted for reception of Untagged Messages (commonly called the
407	   Receive Queue), and a list of operations to perform to send data
408	   (commonly called the Send Queue).

410	2.2.2  Data Buffers

412	   As mentioned previously, there are two different ways to expose a
413	   local ULP's data buffers for data transfer: Untagged Data
414	   Transfer - a buffer can be exposed for receiving RDMAP Send Type
415	   Messages (a.k.a. DDP Untagged Messages) on DDP Queue zero - or
416	   Tagged Data Transfer - the buffer can be exposed for remote
417	   access through STags (a.k.a. DDP Tagged Messages). This
418	   distinction is important because the attacks and the
419	   countermeasures used to protect against the attack are different
420	   depending on the method for exposing the buffer to the network.

422	   For the purposes of the security discussion, for Tagged Data
423	   Transfer a single logical Data Buffer is exposed with a single
424	   STag on a given Stream. Actual implementations may support
425	   scatter/gather capabilities to enable multiple physical data
426	   buffers to be accessed with a single STag, but from a threat
427	   analysis perspective it is assumed that a single STag enables
428	   access to a single logical Data Buffer.

430	   In any event, it is the responsibility of the Privileged Resource
431	   Manager to ensure that no STag can be created that exposes memory
432	   that the consumer had no authority to expose.

434	   A data buffer has specific access rights. The local ULP can
435	   control whether a data buffer is exposed for local only, or local
436	   and remote access, and assign specific access privileges (read,
437	   write, read and write) on a per Stream basis.

439	   For DDP, when an STag is advertised, the Remote Peer is
440	   presumably given write access rights to the data (otherwise there
441	   was not much point to the advertisement). For RDMAP, when a ULP
442	   advertises an STag, it can enable write-only, read-only, or both
443	   write and read access rights.

445	   Similarly, some ULPs may wish to provide a single buffer with
446	   different access rights on a per-Stream basis. For example, some
447	   Streams may have read-only access, some may have remote read and
448	   write access, while on other Streams only the local ULP/Local
449	   Peer is allowed access.
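
	   Per-Stream access rights on one buffer can be modeled roughly as
	   follows. This is an illustrative sketch only: Access,
	   ExposedBuffer, grant, and allowed are hypothetical names, and a
	   Stream that has been granted nothing is treated as having no
	   remote access (local-only use of the buffer).

```python
from enum import Flag


class Access(Flag):
    """Remote access rights a ULP may grant (hypothetical encoding)."""
    NONE = 0
    REMOTE_READ = 1
    REMOTE_WRITE = 2


class ExposedBuffer:
    """One data buffer with access rights recorded per Stream."""

    def __init__(self, size):
        self.data = bytearray(size)
        self.rights = {}  # Stream id -> Access

    def grant(self, stream_id, access):
        self.rights[stream_id] = access

    def allowed(self, stream_id, wanted):
        # Streams with no entry get Access.NONE, i.e. no remote access.
        granted = self.rights.get(stream_id, Access.NONE)
        return (granted & wanted) == wanted
```

	   For example, one Stream might be granted Access.REMOTE_READ only,
	   a second both rights, and a third nothing at all, all against
	   the same buffer.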

451	2.2.3  Page Translation Tables

453	   Page Translation Tables are the structures used by the RNIC to be
454	   able to access ULP memory for data transfer operations. Even
455	   though these structures are called "Page" Translation Tables,
456	   they may not reference a page at all - conceptually they are used
457	   to map a ULP address space representation (e.g. a virtual
458	   address) of a buffer to the physical addresses that are used by
459	   the RNIC Engine to move data. If on a specific system a mapping
460	   is not used, then a subset of the attacks examined may be
461	   appropriate. Note that the Page Translation Table may or may not
462	   be a shared resource.
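
	   As a rough illustration of the mapping described above, the
	   sketch below translates a ULP virtual address into a physical
	   address. Everything in it is an assumption made for the example:
	   the 4096-byte page size, the class name ToyTranslationTable, and
	   the representation of physical memory as a list of frame
	   numbers. Real RNICs may use entirely different structures.

```python
PAGE_SIZE = 4096  # assumed page size, for illustration only


class ToyTranslationTable:
    """Maps a registered virtual range onto physical frame numbers."""

    def __init__(self, base_va, length, frames):
        if length <= 0:
            raise ValueError("length must be positive")
        first_page = base_va // PAGE_SIZE
        last_page = (base_va + length - 1) // PAGE_SIZE
        if len(frames) != last_page - first_page + 1:
            raise ValueError("one frame is needed per page spanned")
        self.base_va = base_va
        self.length = length
        self.first_page = first_page
        self.frames = list(frames)

    def translate(self, va):
        """Return the physical address backing virtual address va."""
        # Reject addresses outside the registered buffer before any lookup.
        if va < self.base_va or va >= self.base_va + self.length:
            raise ValueError("address outside the registered buffer")
        frame = self.frames[va // PAGE_SIZE - self.first_page]
        return frame * PAGE_SIZE + va % PAGE_SIZE
```

	   The bounds check precedes the table lookup so that an
	   out-of-range address never indexes the frame list.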

464	2.2.4  Protection Domain (PD)

466	   A Protection Domain (PD) is a local construct to the RDMA
467	   implementation, and never visible over the wire. Protection
468	   Domains are assigned to three of the resources of concern -
469	   Stream Context Memory, STags associated with Page Translation
470	   Table entries, and data buffers. A correct implementation of a
471	   Protection Domain requires that resources which belong to a given
472	   Protection Domain cannot be used on a resource belonging to
473	   another Protection Domain, because Protection Domain membership
474	   is checked by the RNIC prior to taking any action involving such
475	   a resource. Protection Domains are therefore used to ensure that
476	   an STag can only be used to access an associated data buffer on
477	   one or more Streams that are associated with the same Protection
478	   Domain as the specific STag.

480	   If an implementation chooses to not share resources between
481	   Streams, it is recommended that each Stream be associated with
482	   its own, unique Protection Domain. If an implementation chooses
483	   to allow resource sharing, it is recommended that a Protection
484	   Domain be limited to the collection of Streams that have Partial
485	   Mutual Trust with each other.

487	   Note that a ULP (either Privileged or Non-Privileged) can
488	   potentially have multiple Protection Domains. This could be used,
489	   for example, to ensure that multiple clients of a server do not
490	   have the ability to corrupt each other. The server would allocate
491	   a Protection Domain per client to ensure that resources covered
492	   by the Protection Domain could not be used by another (untrusted)
493	   client.

495	2.2.5  STag Namespace and Scope

497	   The DDP specification defines a 32-bit namespace for the STag.
498	   Implementations may vary in terms of the actual number of STags
499	   that are supported. In any case, this is a bounded resource that
500	   can come under attack. Depending upon STag namespace allocation
501	   algorithms, the actual name space to attack may be significantly
502	   less than 2^32.
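
   As a rough illustration of why the effective namespace matters,
   the sketch below computes the chance that one uniformly random
   guess names a valid STag. The figures used (1024 valid STags, a
   16-bit effective namespace) are hypothetical values chosen for the
   example; they do not come from the DDP specification.

```python
from fractions import Fraction

STAG_NAMESPACE = 2 ** 32  # 32-bit STag namespace defined by DDP


def guess_success_probability(valid_stags, effective_namespace=STAG_NAMESPACE):
    """Chance that one uniformly random guess names a valid STag."""
    if not 0 <= valid_stags <= effective_namespace:
        raise ValueError("valid_stags must lie within the namespace")
    return Fraction(valid_stags, effective_namespace)


# Hypothetical example: 1024 valid STags spread over the full namespace,
# versus an allocator that only ever hands out 16-bit values.
full = guess_success_probability(1024)
reduced = guess_success_probability(1024, effective_namespace=2 ** 16)
print(full, reduced)  # 1/4194304 1/64
```

   Under these assumptions, an allocator that confines STags to a
   small range makes a single guess far more likely to succeed.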

504	   The scope of an STag is the set of DDP/RDMAP Streams on which the
505	   STag is valid. If an STag is valid on a particular DDP/RDMAP
506	   Stream, then that stream can modify the buffer, subject to the
507	   access rights that the stream has for the STag (see Section 2.2.2
508	   Data Buffers for additional information).

510	   The analysis presented in this document assumes two mechanisms
511	   for limiting the scope of Streams for which the STag is valid:

513	           *   Protection Domain scope. The STag is valid if used on
514	               any Stream within a specific Protection Domain, and
515	               is invalid if used on any Stream that is not a member
516	               of the Protection Domain.

518	           *   Single Stream scope. The STag is valid on a single
519	               Stream, regardless of what the Stream association is
520	               to a Protection Domain. If used on any other Stream,
521	               it is invalid.
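
   The two scoping mechanisms above can be sketched as a single
   validity check. This is a minimal model, not an RNIC interface:
   the Stream and STag classes, their field names, and the use of
   None to select Protection Domain scope are all assumptions made
   for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Stream:
    stream_id: int
    pd_id: int  # Protection Domain the Stream is associated with


@dataclass(frozen=True)
class STag:
    value: int
    pd_id: int
    # None selects Protection Domain scope; an integer selects
    # Single Stream scope bound to that Stream (hypothetical encoding).
    bound_stream_id: Optional[int] = None


def stag_valid_on_stream(stag: STag, stream: Stream) -> bool:
    if stag.bound_stream_id is not None:
        # Single Stream scope: valid on exactly one Stream, regardless
        # of that Stream's Protection Domain association.
        return stream.stream_id == stag.bound_stream_id
    # Protection Domain scope: valid on any Stream in the same PD.
    return stream.pd_id == stag.pd_id


s1, s2, s3 = Stream(1, pd_id=10), Stream(2, pd_id=10), Stream(3, pd_id=20)
pd_scoped = STag(0xABCD, pd_id=10)
single = STag(0x1234, pd_id=10, bound_stream_id=1)
```

   Here pd_scoped is accepted on s1 and s2 but not s3, while single
   is accepted only on s1.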

523	2.2.6  Completion Queues

525	   Completion Queues (CQ) are used in this document to conceptually
526	   represent how the RNIC Engine notifies the ULP about the
527	   completion of the transmission of data, or the completion of the
528	   reception of data through the Data Transfer Interface
529	   (specifically for Untagged Data Transfer - Tagged Data Transfer
530	   can not cause a completion to occur). Because there could be many
531	   transmissions or receptions in flight at any one time,
532	   completions are modeled as a queue rather than a single event. An
533	   implementation may also use the Completion Queue to notify the
534	   ULP of other activities, for example, the completion of a mapping
535	   of an STag to a specific ULP buffer. Completion Queues may be
536	   shared by a group of Streams, or may be designated to handle a
537	   specific Stream's traffic. Limiting Completion Queue association
538	   to one, or a small number of RDMAP/DDP Streams can prevent
539	   several forms of attacks by sharply limiting the scope of the
540	   attack's effect.

542	   Some implementations may allow this queue to be manipulated
543	   directly by both Non-Privileged and Privileged ULPs.

545	2.2.7  Asynchronous Event Queue

547	   The Asynchronous Event Queue is a queue from the RNIC to the
548	   Privileged Resource Manager of bounded size. It is used by the
549	   RNIC to notify the host of various events which might require
550	   management action, including protocol violations, Stream state
551	   changes, local operation errors, low water marks on receive
552	   queues, and possibly other events.

554	   The Asynchronous Event Queue is a resource that can be attacked
555	   because Remote or Local Peers and/or ULPs can cause events to
556	   occur which have the potential of overflowing the queue.

558	   Note that an implementation is at liberty to implement the
559	   functions of the Asynchronous Event Queue in a variety of ways,
560	   including multiple queues or even simple callbacks. All
561	   vulnerabilities identified are intended to apply regardless of
562	   the implementation of the Asynchronous Event Queue. For example,
563	   a callback function may be viewed as simply a very short queue.
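
   The overflow concern can be illustrated with a small bounded
   queue. The class name, the capacity, and the policy of dropping
   new events when full are assumptions for this sketch; a real
   implementation may handle overflow differently.

```python
from collections import deque


class AsyncEventQueue:
    """Bounded event queue that counts events lost to overflow."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be positive")
        self._capacity = capacity
        self._events = deque()
        self.dropped = 0

    def post(self, event):
        # Assumed policy: when the queue is full, the new event is lost.
        if len(self._events) >= self._capacity:
            self.dropped += 1
            return False
        self._events.append(event)
        return True

    def poll(self):
        # Return the oldest pending event, or None if the queue is empty.
        return self._events.popleft() if self._events else None


aeq = AsyncEventQueue(capacity=2)
# A peer that triggers many events can crowd out a later, unrelated one.
results = [aeq.post(e) for e in ("violation-1", "violation-2", "state-change")]
```

   In this run the third event is lost, which is the kind of effect
   an attacker who can generate events may try to cause.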

565	2.2.8  RDMA Read Request Queue

567	   The RDMA Read Request Queue is the memory that holds state
568	   information for one or more RDMA Read Request Messages that have
569	   arrived, but for which the RDMA Read Response Messages have not
570	   yet been completely sent. Because potentially more than one RDMA
571	   Read Request can be outstanding at one time, the memory is
572	   modeled as a queue of bounded size. Some implementations may
573	   enable sharing of a single RDMA Read Request Queue across
574	   multiple Streams.

576	2.3  RNIC Interactions

578	   With RNIC resources and interfaces defined, it is now possible to
579	   examine the interactions supported by each of the three generic
580	   RNIC functional interfaces - Privileged Control Interface,
581	   Privileged Data Interface, and Non-Privileged Data Interface.
582	   As mentioned previously in Section 2.1 Components,
583	   there are two data transfer mechanisms to be examined - Untagged
584	   Data Transfer and Tagged Data Transfer.

586	2.3.1  Privileged Control Interface Semantics

588	   Generically, the Privileged Control Interface controls the RNIC's
589	   allocation, de-allocation, and initialization of RNIC global
590	   resources. This includes allocation and de-allocation of Stream
591	   Context Memory, Page Translation Tables, STag names, Completion
592	   Queues, RDMA Read Request Queues, and Asynchronous Event Queues.

594	   The Privileged Control Interface is also typically used for
595	   managing Non-Privileged ULP resources for the Non-Privileged ULP
596	   (and possibly for the Privileged ULP as well). This includes
597	   initialization and removal of Page Translation Table resources,
598	   and managing RNIC events (possibly managing all events for the
599	   Asynchronous Event Queue).

601	2.3.2  Non-Privileged Data Interface Semantics

603	   The Non-Privileged Data Interface enables data transfer (transmit
604	   and receive) but does not allow initialization of the Page
605	   Translation Table resources. However, once the Page Translation
606	   Table resources have been initialized, the interface may allow a
607	   specific STag mapping to be enabled and disabled by directly
608	   communicating with the RNIC, or allow an STag mapping to be created
609	   for a buffer that has been previously initialized in the RNIC.

611	   For RDMAP, ULP data can be sent by one of the previously
612	   described data transfer mechanisms - Untagged Data Transfer or
613	   Tagged Data Transfer. Two RDMAP data transfer mechanisms are
614	   defined, one using Untagged Data Transfer (Send Type Messages),
615	   and one using Tagged Data Transfer (RDMA Read Responses and RDMA
616	   Writes). ULP data reception through RDMAP can be done by
617	   receiving Send Type Messages into buffers that have been posted
618	   on the Receive Queue or Shared Receive Queue. Thus a Receive
619	   Queue or Shared Receive Queue can only be affected by Untagged
620	   Data Transfer. Data reception can also be done by receiving RDMA
621	   Write and RDMA Read Response Messages into buffers that have
622	   previously been exposed for external write access through
623	   advertisement of an STag (i.e. Tagged Data Transfer).
624	   Additionally, to cause ULP data to be pulled (read) across the
625	   network, RDMAP uses an RDMA Read Request Message (which only
626	   contains RDMAP control information necessary to access the ULP
627	   buffer to be read), to cause an RDMA Read Response Message to be
628	   generated that contains the ULP data.

630	   For DDP, transmitting data means sending DDP Tagged or Untagged
631	   Messages. For data reception, DDP can receive Untagged Messages
632	   into buffers that have been posted on the Receive Queue or Shared
633	   Receive Queue. It can also receive Tagged DDP Messages into
634	   buffers that have previously been exposed for external write
635	   access through advertisement of an STag.

637	   Completion of data transmission or reception generally entails
638	   informing the ULP of the completed work by placing completion
639	   information on the Completion Queue. For data reception, only an
640	   Untagged Data Transfer can cause completion information to be put
641	   in the Completion Queue.

643	2.3.3  Privileged Data Interface Semantics

645	   The Privileged Data Interface semantics are a superset of the
646	   Non-Privileged Data Interface semantics. The interface can do
647	   everything defined in the prior section, as well as
648	   create/destroy buffer to STag mappings directly. This generally
649	   entails initialization or clearing of Page Translation Table
650	   state in the RNIC.

652	2.3.4  Initialization of RNIC Data Structures for Data Transfer

654	   Initialization of the mapping between an STag and a Data Buffer
655	   can be viewed in the abstract as two separate operations:

657	       a.  Initialization of the allocated Page Translation Table
658	           entries with the location of the Data Buffer, and

660	       b.  Initialization of a mapping from an allocated STag name
661	           to a set of Page Translation Table entry(s) or partial-
662	           entries.
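
   The two operations above can be sketched as follows. This is a
   simplified model under stated assumptions: a 4096-byte page size,
   page-aligned buffers, and a Tagged Offset treated as a zero-based
   offset into the buffer. The Rnic class and its method names are
   hypothetical and do not correspond to any real RNIC interface.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


class Rnic:
    def __init__(self):
        self.ptt = {}    # PTT index -> physical page base address
        self.stags = {}  # STag -> (first PTT index, number of entries)

    def init_ptt_entries(self, first_index, physical_pages):
        # Operation (a): record where the Data Buffer's pages live.
        for i, base in enumerate(physical_pages):
            self.ptt[first_index + i] = base

    def map_stag(self, stag, first_index, count):
        # Operation (b): bind an allocated STag to a run of PTT entries.
        if any(first_index + i not in self.ptt for i in range(count)):
            raise ValueError("PTT entries not initialized")
        self.stags[stag] = (first_index, count)

    def translate(self, stag, tagged_offset):
        # Resolve (STag, offset) to a physical address, with bounds check.
        first_index, count = self.stags[stag]  # KeyError for unknown STag
        if tagged_offset < 0:
            raise IndexError("offset outside the mapped buffer")
        page, offset = divmod(tagged_offset, PAGE_SIZE)
        if page >= count:
            raise IndexError("offset outside the mapped buffer")
        return self.ptt[first_index + page] + offset


rnic = Rnic()
rnic.init_ptt_entries(5, [0x10000, 0x80000])
rnic.map_stag(0x42, first_index=5, count=2)
addr = rnic.translate(0x42, PAGE_SIZE + 8)  # second page, offset 8
```

   Keeping the two operations separate makes it visible that a party
   able to perform only operation (b) can still choose which
   already-initialized entries an STag refers to.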

664	   Note that an implementation may not have a Page Translation Table
665	   (i.e. it may support a direct mapping between an STag and a Data
666	   Buffer). If there is no Page Translation Table, then attacks
667	   based on changing its contents or exhausting its resources are
668	   not possible.

670	   Initialization of the contents of the Page Translation Table can
671	   be done by either the Privileged ULP or by the Privileged
672	   Resource Manager as a proxy for the Non-Privileged ULP. By
673	   definition the Non-Privileged ULP is not trusted to directly
674	   manipulate the Page Translation Table. In general the concern is
675	   that the Non-Privileged ULP may try to maliciously initialize the
676	   Page Translation Table to access a buffer for which it does not
677	   have permission.

679	   The exact resource allocation algorithm for the Page Translation
680	   Table is outside the scope of this document. It may be allocated
681	   for a specific Data Buffer, or be allocated as a pooled resource
682	   to be consumed by potentially multiple Data Buffers, or be
683	   managed in some other way. This document attempts to abstract
684	   implementation dependent issues, and group them into higher level
685	   security issues such as resource starvation and sharing of
686	   resources between Streams.

688	   The next issue is how an STag name is associated with a Data
689	   Buffer. For the case of an Untagged Data Buffer (i.e. Untagged
690	   Data Transfer), there is no wire visible mapping between an STag
691	   and the Data Buffer. Note that there may, in fact, be an STag
692	   which represents the buffer, if an implementation chooses to
693	   internally represent Untagged Data Buffers using STags. However,
694	   because the STag by definition is not visible on the wire, this
695	   is a local host implementation specific issue which should be
696	   analyzed in the context of a local host implementation specific
697	   security analysis, and thus is outside the scope of this
698	   document.

700	   For a Tagged Data Buffer (i.e. Tagged Data Transfer), either the
701	   Privileged ULP or the Privileged Resource Manager acting on
702	   behalf of the Non-Privileged ULP may initialize a mapping from an
703	   STag to a Page Translation Table, or may have the ability to
704	   simply enable/disable an existing STag to Page Translation Table
705	   mapping. There may also be multiple STag names which map to a
706	   specific group of Page Translation Table entries (or sub-
707	   entries). Specific security issues with this level of flexibility
708	   are examined in Section 6.2.3 Multiple STags to access the same
709	   buffer.

711	   There are a variety of implementation options for initialization
712	   of Page Translation Table entries and mapping an STag to a group
713	   of Page Translation Table entries which have security
714	   repercussions. These include support for separation of mapping an
715	   STag versus mapping a set of Page Translation Table entries, and
716	   support for ULPs directly manipulating STag to Page Translation
717	   Table entry mappings (versus requiring access through the
718	   Privileged Resource Manager).

720	2.3.5  RNIC Data Transfer Interactions

722	   RNIC Data Transfer operations can be subdivided into send
723	   operations and receive operations.

725	   For send operations, there is typically a queue that enables the
726	   ULP to post multiple operation requests to send data (referred to
727	   as the Send Queue). Depending upon the implementation, Data
728	   Buffers used in the operations may or may not have Page
729	   Translation Table entries associated with them, and may or may
730	   not have STags associated with them. Because this is a local host
731	   specific implementation issue rather than a protocol issue, the
732	   security analysis of threats and mitigations is left to the host
733	   implementation.

735	   Receive operations are different for Tagged Data Buffers versus
736	   Untagged Data Buffers (i.e. Tagged Data Transfer vs. Untagged
737	   Data Transfer). For Untagged Data Transfer, if more than one
738	   Untagged Data Buffer can be posted by the ULP, the DDP
739	   specification requires that they be consumed in sequential order
740	   (the RDMAP specification also requires this). Thus the most
741	   general implementation is that there is a sequential queue of
742	   receive Untagged Data Buffers (Receive Queue). Some
743	   implementations may also support sharing of the sequential queue
744	   between multiple Streams. In this case defining "sequential"
745	   becomes non-trivial - in general the buffers for a single Stream
746	   are consumed from the queue in the order that they were placed on
747	   the queue, but there is no consumption order guarantee between
748	   Streams.
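
   The ordering property of a shared queue can be illustrated with a
   small model. The function name and the representation of arrivals
   as a list of Stream identifiers are assumptions for this sketch.

```python
from collections import deque


def buffers_by_stream(arrivals, posted_buffers):
    """Assign posted buffers to Streams in message arrival order.

    arrivals lists one Stream identifier per Untagged Message;
    posted_buffers lists buffer identifiers in posting order.
    """
    queue = deque(posted_buffers)
    consumed = {}
    for stream_id in arrivals:
        if not queue:
            raise LookupError("no receive buffer available")
        consumed.setdefault(stream_id, []).append(queue.popleft())
    return consumed


first = buffers_by_stream(["A", "B", "A", "B"], [0, 1, 2, 3])
second = buffers_by_stream(["B", "A", "A", "B"], [0, 1, 2, 3])
print(first)   # {'A': [0, 2], 'B': [1, 3]}
print(second)  # {'B': [0, 3], 'A': [1, 2]}
```

   In both runs each Stream consumes buffers in posting order, but
   which buffers a given Stream receives depends on how arrivals
   from the Streams interleave.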

750	   For receive Tagged Data Transfer (i.e. Tagged Data Buffers, RDMA
751	   Write Buffers, or RDMA Read Buffers), at some time prior to data
752	   transfer, the mapping of the STag to specific Page Translation
753	   Table entries (if present) and the mapping from the Page
754	   Translation Table entries to the Data Buffer must have been
755	   initialized (see Section 2.3.4 for interaction details).

757	3  Trust and Resource Sharing

759	   It is assumed that in general the Local and Remote Peer are
760	   untrusted, and thus attacks by either should have mitigations in
761	   place.

763	   A separate, but related issue is resource sharing between
764	   multiple Streams. If local resources are not shared, the
765	   resources are dedicated on a per Stream basis. Resources are
766	   defined in Section 2.2 Resources. The advantage of not sharing
767	   resources between Streams is that it reduces the types of attacks
768	   that are possible. The disadvantage of not sharing resources is
769	   that ULPs might run out of resources. Thus there can be a strong
770	   incentive for sharing resources, if the security issues
771	   associated with the sharing of resources can be mitigated.

773	   It is assumed in this document that the component that implements
774	   the mechanism to control sharing of the RNIC Engine resources is
775	   the Privileged Resource Manager. The RNIC Engine exposes its
776	   resources through the RNIC Interface to the Privileged Resource
777	   Manager. All Privileged and Non-Privileged ULPs request resources
778	   from the Resource Manager (note that by definition both the Non-
779	   Privileged and the Privileged application might try to greedily
780	   consume resources, thus creating a potential Denial of Service
781	   (DOS) attack). The Resource Manager implements resource
782	   management policies to ensure fair access to resources. The
783	   Resource Manager should be designed to take into account security
784	   attacks detailed in this document. Note that for some systems the
785	   Privileged Resource Manager may be implemented within the
786	   Privileged ULP.

788	   All Non-Privileged ULP interactions with the RNIC Engine that
789	   could affect other ULPs MUST be done using the Privileged
790	   Resource Manager as a proxy. All ULP resource allocation requests
791	   for scarce resources MUST also be done using a Privileged
792	   Resource Manager.
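
   One way a Privileged Resource Manager might limit greedy
   consumption is a per-ULP cap on a shared pool. This document does
   not prescribe a policy; the class, its parameters, and the cap
   itself are assumptions for this sketch.

```python
class PrivilegedResourceManager:
    """Grants units of a scarce resource, capped per ULP."""

    def __init__(self, total, per_ulp_limit):
        self._free = total
        self._limit = per_ulp_limit
        self._held = {}  # ULP identifier -> units currently held

    def allocate(self, ulp_id, count=1):
        held = self._held.get(ulp_id, 0)
        # Refuse requests that exceed the ULP's cap or the free pool.
        if count < 1 or held + count > self._limit or count > self._free:
            return False
        self._held[ulp_id] = held + count
        self._free -= count
        return True

    def release(self, ulp_id, count=1):
        held = self._held.get(ulp_id, 0)
        if count < 1 or count > held:
            raise ValueError("ULP does not hold that many units")
        self._held[ulp_id] = held - count
        self._free += count


mgr = PrivilegedResourceManager(total=8, per_ulp_limit=3)
greedy = [mgr.allocate("ulp-greedy") for _ in range(5)]
other = mgr.allocate("ulp-other", 3)
```

   With these values the greedy ULP is refused after three units,
   so the second ULP can still obtain its share.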

794	   The sharing of resources across Streams should be under the
795	   control of the ULP, both in terms of the trust model the ULP
796	   wishes to operate under, as well as the level of resource sharing
797	   the ULP wishes to give local processes. For more discussion on
798	   types of trust models which combine partial trust and sharing of
799	   resources, see Appendix C: Partial Trust Taxonomy.

801	   The Privileged Resource Manager MUST NOT assume different Streams
802	   share Partial Mutual Trust unless there is a mechanism to ensure
803	   that the Streams do indeed share Partial Mutual Trust. This can
804	   be done in several ways, including explicit notification from the
805	   ULP that owns the Streams.

807	4  Attacker Capabilities

809	   An attacker's capabilities delimit the types of attacks that
810	   attacker is able to launch. RDMAP and DDP require that the
811	   initial LLP Stream (and connection) be set up prior to
812	   transferring RDMAP/DDP Messages. This requires at least one
813	   round-trip handshake to occur.

815	   If the attacker is not the Remote Peer that created the initial
816	   connection, then the attacker's capabilities can be segmented
817	   into send only capabilities or send and receive capabilities.
818	   Attacking with send only capabilities requires the attacker to
819	   first guess the current LLP Stream parameters before they can
820	   attack RNIC resources (e.g. TCP sequence number). If this class
821	   of attacker also has receive capabilities and the ability to pose
822	   as the receiver to the sender and the sender to the receiver,
823	   they are typically referred to as a "man-in-the-middle" attacker
824	   [RFC3552]. A man-in-the-middle attacker has a much wider ability
825	   to attack RNIC resources. The breadth of attack is essentially
826	   the same as that of an attacking Remote Peer (i.e. the Remote
827	   Peer that set up the initial LLP Stream).

829	5  Attacks That Can be Mitigated With End-to-End Security

831	   This section describes the RDMAP/DDP attacks where the only
832	   solution is to implement some form of end-to-end security. The
833	   analysis includes a detailed description of each attack, what is
834	   being attacked, and a description of the countermeasures that can
835	   be taken to thwart the attack.

837	   Some forms of attack involve modifying the RDMAP or DDP payload
838	   by a network based attacker or involve monitoring the traffic to
839	   discover private information. An effective tool to ensure
840	   confidentiality is to encrypt the data stream through mechanisms
841	   such as IPsec encryption. Additionally, authentication protocols
842	   such as IPsec authentication are an effective tool to ensure the
843	   remote entity is who they claim to be as well as ensuring that
844	   the payload is unmodified as it traverses the network.

846	   Note that connection setup and teardown is presumed to be done in
847	   stream mode (i.e. no RDMA encapsulation of the payload), so there
848	   are no new attacks related to connection setup/teardown beyond
849	   what is already present in the LLP (e.g. TCP or SCTP). Note,
850	   however, that RDMAP/DDP parameters may be exchanged in stream
851	   mode, and if they are corrupted by an attacker, unintended
852	   consequences will result. Therefore, any existing mitigations for
853	   LLP Spoofing, Tampering, Repudiation, Information Disclosure,
854	   Denial of Service, or Elevation of Privilege continue to apply
855	   (and are out of scope of this document). Thus the analysis in
856	   this section focuses on attacks that are present regardless of
857	   the LLP Stream type.

859	   Tampering is any modification of the legitimate traffic (machine
860	   internal or network). A spoofing attack is a special case of
861	   tampering in which the attacker falsifies the identity of the
862	   Remote Peer (the identity can be an IP address, machine name, ULP
863	   level identity, etc.).

865	5.1  Spoofing

867	   Spoofing attacks can be launched by the Remote Peer, or by a
868	   network based attacker. A network based spoofing attack applies
869	   to all Remote Peers. This section analyzes the various types of
870	   spoofing attacks applicable to RDMAP & DDP.

872	5.1.1  Impersonation

874	   A network based attacker can impersonate a legal RDMAP/DDP Peer
875	   (by spoofing a legal IP address). This can either be done as a
876	   blind attack (see [RFC3552]) or by establishing an RDMAP/DDP
877	   Stream with the victim. Because an RDMAP/DDP Stream requires an
878	   LLP Stream to be fully initialized (e.g. for [RFC793] it is in
879	   the ESTABLISHED state), existing transport layer protection
880	   mechanisms against blind attacks remain in place.

882	   For a blind attack to succeed, the attacker must inject
883	   a valid transport layer segment (e.g. for TCP it must match at
884	   least the 4-tuple as well as guess a sequence number within the
885	   window) while also guessing valid RDMAP or DDP parameters.  There
886	   are many ways to attack the RDMAP/DDP protocol if the transport
887	   protocol is assumed to be vulnerable. For example, for Tagged
888	   Messages, this entails guessing the STag and TO values. If the
889	   attacker wishes to simply terminate the connection, it can do so
890	   by correctly guessing the transport & network layer values, and
891	   providing an invalid STag. Per the DDP specification, if an
892	   invalid STag is received, the Stream is torn down and the Remote
893	   Peer is notified with an error. If an attacker wishes to
894	   overwrite an Advertised Buffer, it must successfully guess the
895	   correct STag and TO. Given that the TO often will start at zero,
896	   guessing the TO is straightforward. The value of the STag should be
897	   chosen at random, as discussed in Section 6.1.1 Using an STag on a
898	   Different Stream. For Untagged Messages, if the MSN is invalid
899	   then the connection may be torn down. If it is valid, then the
900	   receive buffers can be corrupted.

902	   End-to-end authentication (e.g. IPsec or ULP authentication)
903	   provides protection against either the blind attack or the
904	   connected attack.

906	5.1.2  Stream Hijacking

908	   Stream hijacking happens when a network based attacker eavesdrops
909	   on the LLP connection through the Stream establishment phase, and
910	   waits until the authentication phase (if such a phase exists) is
911	   completed successfully. The attacker then spoofs the IP address
912	   and re-directs the Stream from the victim to its own machine. For
913	   example, an attacker can wait until an iSCSI authentication is
914	   completed successfully, and then hijack the iSCSI Stream.

916	   The best protection against this form of attack is end-to-end
917	   integrity protection and authentication, such as IPsec, to
918	   prevent spoofing. Another option is to provide a physically
919	   segregated network for security. Discussion of physical security
920	   is out of scope for this document.

922	   Because the connection and/or Stream itself is established by the
923	   LLP, some LLPs are more difficult to hijack than others. Please
924	   see the relevant LLP documentation on security issues around
925	   connection and/or Stream hijacking.

927	5.1.3  Man-in-the-Middle Attack

929	   If a network based attacker has the ability to delete or modify
930	   packets which will still be accepted by the LLP (e.g., TCP
931	   sequence number is correct) then the Stream can be exposed to a
932	   man-in-the-middle attack. One style of attack is for the man-in-
933	   the-middle to send Tagged Messages (either RDMAP or DDP). If it
934	   can discover a buffer that has been exposed for STag enabled
935	   access, then the man-in-the-middle can use an RDMA Read operation
936	   to read the contents of the associated data buffer, perform an
937	   RDMA Write Operation to modify the contents of the associated
938	   data buffer, or invalidate the STag to disable further access to
939	   the buffer.

941	   The best protection against this form of attack is end-to-end
942	   integrity protection and authentication, such as IPsec, to
943	   prevent spoofing or tampering. If authentication and integrity
944	   protections are not used, then physical protection must be
945	   employed to prevent man-in-the-middle attacks.

947	   Because the connection/Stream itself is established by the LLP,
948	   some LLPs are more exposed to man-in-the-middle attack than
949	   others. Please see the relevant LLP documentation on security
950	   issues around connection and/or Stream hijacking.

952	   Another approach is to restrict access to only the local
953	   subnet/link, and provide some mechanism to limit access, such as
954	   physical security or 802.1X. This model is an extremely limited
955	   deployment scenario, and will not be further examined here.

957	5.2  Tampering - Network based modification of buffer content

959	   This is actually a man-in-the-middle attack - but only on the
960	   content of the buffer, as opposed to the man-in-the-middle attack
961	   presented above, where both the signaling and content can be
962	   modified. See Section 5.1.3 Man-in-the-Middle Attack.

964	5.3  Information Disclosure - Network Based Eavesdropping

966	   An attacker that is able to eavesdrop on the network can read the
967	   content of all read and write accesses to a Peer's buffers. To
968	   prevent information disclosure, the read/written data must be
969	   encrypted. See also Section 5.1.3 Man-in-the-Middle Attack. The
970	   encryption can be done either by the ULP, or by a protocol that
971	   can provide security services to RDMAP & DDP (e.g. IPsec).

973	5.4  Specific Requirements for Security Services

975	   Generally speaking, Stream confidentiality protects against
976	   eavesdropping. Stream and/or session authentication and integrity
977	   protection are countermeasures against various spoofing and
978	   tampering attacks. The effectiveness of authentication and
979	   integrity protection against a specific attack depends on whether
980	   the authentication is machine level authentication (such as
981	   IPsec) or ULP authentication.

983	5.4.1  Introduction to Security Options

985	   The following security services can be applied to an RDMAP/DDP
986	   Stream:

988	   1.  Session confidentiality - protects against eavesdropping
989	       (Section 5.3).

991	   2.  Per-packet data source authentication - protects against the
992	       following spoofing attacks: network based impersonation
993	       (Section 5.1.1), Stream hijacking (Section 5.1.2), and
994	       man-in-the-middle (Section 5.1.3).

996	   3.  Per-packet integrity - protects against tampering done by
997	       network based modification of buffer content (Section 5.2).

999	   4.  Packet sequencing - protects against replay attacks, which are
1000	       a special case of the above tampering attack.

1002	   If an RDMAP/DDP Stream may be subject to impersonation attacks,
1003	   or Stream hijacking attacks, it is recommended that the Stream be
1004	   authenticated, integrity protected, and protected from replay
1005	   attacks; it may use confidentiality protection to protect from
1006	   eavesdropping (in case the RDMAP/DDP Stream traverses a public
1007	   network).

1009	   IPsec is a protocol suite which is used to secure communication
1010	   at the network layer between two peers. The IPsec protocol suite
1011	   is specified within the IP Security Architecture [RFC2401], IKE
1012	   [RFC2409], IPsec Authentication Header (AH) [RFC2402] and IPsec
1013	   Encapsulating Security Payload (ESP) [RFC2406] documents. IKE is
1014	   the key management protocol while AH and ESP are used to protect
1015	   IP traffic. Please see those RFCs for a complete description of
1016	   the respective protocols.

1018	   IPsec is capable of providing the above security services for IP
1019	   and TCP traffic. ULPs are able to provide only part of the above
1020	   security services.

1022	5.4.2  TLS is Inappropriate for DDP/RDMAP Security

1024	   TLS [RFC2246] provides Stream authentication, integrity and
1025	   confidentiality for TCP based ULPs. TLS supports one-way (server
1026	   only) or mutual certificate-based authentication.

1028	   If TLS is layered underneath RDMAP, there are at least two
1029	   limitations that make TLS inappropriate for DDP/RDMAP security:

1031	   1.  The maximum length supported by the TLS record layer protocol
1032	       is 2^14 bytes - longer packets must be fragmented (as a
1033	       comparison, the maximum length of an Untagged DDP Message is
1034	       roughly 2^32).

1036	   2.  TLS is a connection oriented protocol. If a stream cipher or
1037	       block cipher in CBC mode is used for bulk encryption, then a
1038	       packet can be decrypted only after all the packets preceding
1039	       it have already arrived. If TLS is used to protect DDP/RDMAP
1040	       traffic, then TCP must gather all out-of-order packets before
1041	       TLS can decrypt them. Only after this is done can RDMAP/DDP
1042	       place them into the ULP buffer. Thus one of the primary
1043	       features of DDP/RDMAP - enabling implementations to have a
1044	       flow-through architecture with little to no buffering, can
1045	       not be achieved if TLS is used to protect the data stream.
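
   The first limitation above can be put in numbers. The sketch
   below counts how many TLS records are needed to carry a message
   of a given length, using the 2^14-byte record limit stated above.
   The function and constant names are hypothetical.

```python
TLS_MAX_RECORD_PAYLOAD = 2 ** 14  # bytes per TLS record, as stated above


def tls_records_needed(message_length):
    """Number of TLS records required to carry message_length bytes."""
    if message_length < 0:
        raise ValueError("message_length must be non-negative")
    # Ceiling division without floating point.
    return -(-message_length // TLS_MAX_RECORD_PAYLOAD)


largest = tls_records_needed(2 ** 32)
print(largest)  # 262144
```

   A message near the 2^32-byte upper bound would therefore be split
   across 2^18 records.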

1047	   If TLS is layered on top of RDMAP or DDP, TLS does not protect
1048	   the RDMAP and/or DDP headers. Thus a man-in-the-middle attack can
1049	   still occur by modifying the RDMAP/DDP header to incorrectly
1050	   place the data into the wrong buffer, thus effectively corrupting
1051	   the data stream.

1053	   For these reasons, it is NOT RECOMMENDED that TLS be layered on
1054	   top of RDMAP or DDP.

1056	5.4.3  DTLS and RDDP

1058	   DTLS [DTLS] provides security services for datagram protocols,
1059	   including unreliable datagram protocols. These services include
1060	   anti-replay based on a mechanism adapted from IPsec that is
1061	   intended to operate on packets as they are received from the
1062	   network.  For these and other reasons, DTLS is best applied to
1063	   RDDP by employing DTLS beneath TCP, yielding a layering of RDDP
1064	   over TCP over DTLS over UDP/IP.  Such a layering inserts DTLS at
1065	   roughly the same level in the protocol stack as IPsec, making
1066	   DTLS's security services an alternative to IPsec's services from
1067	   an RDDP standpoint.

1069	   For RDDP, IPsec is the better choice for a security framework,
1070	   and hence is mandatory-to-implement (as specified elsewhere in
1071	   this document). An important contributing factor to the
1072	   specification of IPsec rather than DTLS is that the non-RDDP
1073	   versions of two initial adopters of RDDP (iSCSI [iSCSI][iSER] and
1074	   NFSv4 [NFSv4][NFSv4.1]) are compatible with IPsec but neither of
1075	   these protocols currently uses either TLS or DTLS.  For the
1076	   specific case of iSCSI, IPsec is the basis for mandatory-to-
1077	   implement security services [RFC3723].  Therefore this document
1078	   and the RDDP protocol specifications contain mandatory
1079	   implementation requirements for IPsec rather than for DTLS.

1081	5.4.4  ULPs Which Provide Security

1083	   ULPs which provide integrated security but wish to leverage
1084	   lower-layer protocol security should be aware of security
1085	   concerns around correlating a specific channel's security
1086	   mechanisms to the authentication performed by the ULP. See
1088	   [NFSv4CHANNEL] for additional information on a promising approach
1089	   called "channel binding". From [NFSv4CHANNEL]:

1091	        "The concept of channel bindings allows applications to
1092	        prove that the end-points of two secure channels at
1093	        different network layers are the same by binding
1094	        authentication at one channel to the session protection at
1095	        the other channel.  The use of channel bindings allows
1096	        applications to delegate session protection to lower layers,
1097	        which may significantly improve performance for some
1098	        applications."

1100	5.4.5  Requirements for IPsec Encapsulation of DDP

1102	   The IP Storage working group has spent significant time and
1103	   effort to define the normative IPsec requirements for IP Storage
1104	   [RFC3723]. Portions of that specification are applicable to a
1105	   wide variety of protocols, including the RDDP protocol suite. In
1106	   order to not replicate this effort, an RNIC implementation MUST
1107	   follow the requirements defined in RFC3723 Section 2.3 and
1108	   Section 5, including the associated normative references for
1109	   those sections. Note that this means that support for IPsec ESP
1110	   mode is normative.

1112	   Additionally, since IPsec acceleration hardware may only be able
1113	   to handle a limited number of active IKE Phase 2 SAs, Phase 2
1114	   delete messages may be sent for idle SAs, as a means of keeping
1115	   the number of active Phase 2 SAs to a minimum. The receipt of an
1116	   IKE Phase 2 delete message MUST NOT be interpreted as a reason
1117	   for tearing down a DDP/RDMA Stream. Rather, it is preferable to
1118	   leave the Stream up, and if additional traffic is sent on it, to
1119	   bring up another IKE Phase 2 SA to protect it. This avoids the
1120	   potential for continually bringing Streams up and down.

1122	   Note that there are serious security issues if IPsec is not
1123	   implemented end-to-end. For example, if IPsec is implemented as a
1124	   tunnel in the middle of the network, any hosts between the Peer
1125	   and the IPsec tunneling device can freely attack the unprotected
1126	   Stream.

1128	6  Attacks from Remote Peers

1130	   This section describes remote attacks that are possible against
1131	   the RDMA system defined in Figure 1 - RDMA Security Model and the
1132	   RNIC Engine resources defined in Section 2.2. The analysis
1133	   includes a detailed description of each attack, what is being
1134	   attacked, and a description of the countermeasures that can be
1135	   taken to thwart the attack.

1137	   The attacks are classified into five categories: Spoofing,
1138	   Tampering, Information Disclosure, Denial of Service (DoS)
1139	   attacks, and Elevation of Privileges. As mentioned previously,
1140	   tampering is any modification of the legitimate traffic (machine
1141	   internal or network). A spoofing attack is a special case of
1142	   tampering where the attacker falsifies the identity of the Remote
1143	   Peer (the identity can be an IP address, machine name, ULP-level
1144	   identity, etc.).

1146	6.1  Spoofing

1148	   This section analyzes the various types of spoofing attacks
1149	   applicable to RDMAP and DDP. Spoofing attacks can be launched by
1150	   the Remote Peer, or by a network based attacker. For
1151	   countermeasures against a network based attacker, see Section 5
1152	   Attacks That Can be Mitigated With End-to-End Security.

1154	6.1.1  Using an STag on a Different Stream

1156	   One style of attack from the Remote Peer is for it to attempt to
1157	   use STag values that it is not authorized to use. Note that if
1158	   the Remote Peer sends an invalid STag to the Local Peer, per the
1159	   DDP and RDMAP specifications, the Stream must be torn down. Thus
1160	   the threat exists if an STag has been enabled for Remote Access
1161	   on one Stream and a Remote Peer is able to use it on an unrelated
1162	   Stream. If the attack is successful, the attacker could
1163	   potentially be able to perform either RDMA Read Operations to
1164	   read the contents of the associated data buffer, perform RDMA
1165	   Write Operations to modify the contents of the associated data
1166	   buffer, or to invalidate the STag to disable further access to
1167	   the buffer.

1169	   An attempt by a Remote Peer to access a buffer with an STag on a
1170	   different Stream in the same Protection Domain may or may not be
1171	   an attack depending on whether resource sharing is intended (i.e.
1172	   whether the Streams share Partial Mutual Trust or not). For some
1173	   ULPs, using an STag on multiple Streams within the same
1174	   Protection Domain could be desired behavior. For other ULPs,
1175	   attempting to use an STag on a different Stream could be
1176	   considered to be an attack. Since this varies by ULP, a ULP
1177	   typically would need to be able to control the scope of the STag.

1179	   In the case where an implementation does not share resources
1180	   between Streams (including STags), this attack can be defeated by
1181	   assigning each Stream to a different Protection Domain. Before
1182	   allowing remote access to the buffer, the Protection Domain of
1183	   the Stream where the access attempt was made is matched against
1184	   the Protection Domain of the STag. If the Protection Domains do
1185	   not match, access to the buffer is denied, an error is generated,
1186	   and the RDMAP Stream associated with the attacking Stream is
1187	   terminated.

1189	   For implementations that share resources between multiple
1190	   Streams, it may not be practical to separate each Stream into its
1191	   own Protection Domain. In this case, the ULP can still limit the
1192	   scope of any of the STags to a single Stream (if it is enabling
1193	   it for remote access). If the STag scope has been limited to a
1194	   single Stream, any attempt to use that STag on a different Stream
1195	   will result in an error, and the RDMAP Stream is terminated.

1197	   Thus for implementations that do not share STags between Streams,
1198	   each Stream MUST either be in a separate Protection Domain or the
1199	   scope of an STag MUST be limited to a single Stream.

1201	   An RNIC MUST ensure that a specific Stream in a specific
1202	   Protection Domain can not access an STag in a different
1203	   Protection Domain.

1205	   An RNIC MUST ensure that if an STag is limited in scope to a
1206	   single Stream, no other Stream can use the STag.
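
   The Protection Domain and STag scope rules above can be
   illustrated with a minimal sketch. The names Stream, STagEntry,
   and check_stag_scope are hypothetical and are not taken from the
   RDDP specifications; a real RNIC would typically perform this
   check in hardware or firmware.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Stream:
    stream_id: int
    pd: int  # Protection Domain the Stream belongs to


@dataclass(frozen=True)
class STagEntry:
    pd: int                             # Protection Domain of the STag
    bound_stream: Optional[int] = None  # set when scope is a single Stream


def check_stag_scope(stream: Stream, entry: STagEntry) -> bool:
    """Return True only if the Stream is allowed to use the STag."""
    if stream.pd != entry.pd:
        return False  # Protection Domains do not match: deny access
    if entry.bound_stream is not None and entry.bound_stream != stream.stream_id:
        return False  # STag is scoped to a different Stream: deny access
    return True
```

   When the check fails, the caller would generate an error and
   terminate the RDMAP Stream on which the access was attempted.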

1208	   An additional issue may be unintended sharing of STags (i.e. a
1209	   bug in the ULP) or a bug in the Remote Peer which causes an off-
1210	   by-one STag to be used. For additional protection, an
1211	   implementation should allocate STags in such a fashion that it is
1212	   difficult to predict the next allocated STag number, and also
1213	   ensure that STags are reused at as slow a rate as possible. Any
1214	   allocation method which would lead to intentional or
1215	   unintentional reuse of an STag by the peer should be avoided
1216	   (e.g. a method which always starts with a given STag and
1217	   monotonically increases it for each new allocation, or a method
1218	   which always uses the same STag for each operation).
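
   One possible allocation strategy that follows this advice is
   sketched below. It assumes a 32-bit STag value and a hypothetical
   STagAllocator class; the quarantine length is an arbitrary
   illustrative parameter, not a value taken from any specification.

```python
import secrets
from collections import deque


class STagAllocator:
    """Hands out unpredictable 32-bit STags and delays their reuse."""

    def __init__(self, quarantine=1024):
        self._in_use = set()
        # Most recently freed STags; they cannot be handed out again
        # until newer frees push them out of this bounded queue.
        self._recent = deque(maxlen=quarantine)

    def allocate(self):
        while True:
            stag = secrets.randbits(32)  # not derivable from earlier STags
            if stag not in self._in_use and stag not in self._recent:
                self._in_use.add(stag)
                return stag

    def free(self, stag):
        self._in_use.remove(stag)
        self._recent.append(stag)
```

   Random selection avoids the monotonically increasing pattern
   described above, and the quarantine queue slows the rate at which
   a freed STag can reappear.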

1220	6.2  Tampering

1222	   A Remote Peer or a network based attacker can attempt to tamper
1223	   with the contents of data buffers on a Local Peer that have been
1224	   enabled for remote write access. The types of tampering attacks
1225	   from a Remote Peer are outlined in the sections that follow. For
1226	   countermeasures against a network based attacker, see Section 5
1227	   Attacks That Can be Mitigated With End-to-End Security.

1229	6.2.1  Buffer Overrun - RDMA Write or Read Response

1231	   This attack is an attempt by the Remote Peer to perform an RDMA
1232	   Write or RDMA Read Response to memory outside of the valid length
1233	   range of the data buffer enabled for remote write access. This
1234	   attack can occur even when no resources are shared across
1235	   Streams. This issue can also arise if the ULP has a bug.

1237	   The countermeasure for this type of attack must be in the RNIC
1238	   implementation, leveraging the STag. When the local ULP specifies
1239	   to the RNIC the base address and the number of bytes in the
1240	   buffer that it wishes to make accessible, the RNIC must ensure
1241	   that base and bounds checks are applied to any access to the
1242	   buffer referenced by the STag before the STag is enabled for
1243	   access. When an RDMA data transfer operation (which includes an
1244	   STag) arrives on a Stream, a base and bounds byte granularity
1245	   access check must be performed to ensure the operation accesses
1246	   only memory locations within the buffer described by that STag.

1248	   Thus an RNIC implementation MUST ensure that a Remote Peer is not
1249	   able to access memory outside of the buffer specified when the
1250	   STag was enabled for remote access.
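
   The byte granularity base and bounds check can be sketched as
   follows. The function name and parameters are hypothetical; the
   sketch only shows the comparison an RNIC would need to make for
   each tagged access.

```python
def within_bounds(base: int, length: int, offset: int, nbytes: int) -> bool:
    """Check that [offset, offset + nbytes) lies inside the buffer.

    base and length describe the buffer registered with the STag;
    offset and nbytes describe the incoming access.
    """
    if offset < base or nbytes < 0:
        return False
    return offset + nbytes <= base + length
```

   Python integers do not overflow. An implementation that uses
   fixed-width arithmetic would also need to guard against
   wraparound when computing offset + nbytes.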

1252	6.2.2  Modifying a Buffer After Indication

1254	   This attack can occur if a Remote Peer attempts to modify the
1255	   contents of an STag referenced buffer by performing an RDMA Write
1256	   or an RDMA Read Response after the Remote Peer has indicated to
1257	   the Local Peer or local ULP (by a variety of means) that the STag
1258	   data buffer contents are ready for use. This attack can occur
1259	   even when no resources are shared across Streams. Note that a bug
1260	   in a Remote Peer, or network based tampering, could also result
1261	   in this problem.

1263	   For example, assume the STag referenced buffer contains ULP
1264	   control information as well as ULP payload, and the ULP sequence
1265	   of operation is to first validate the control information and
1266	   then perform operations on the control information. If the Remote
1267	   Peer can perform an additional RDMA Write or RDMA Read Response
1268	   (thus changing the buffer) after the validity checks have been
1269	   completed but before the control data is operated on, the Remote
1270	   Peer could force the ULP down operational paths that were never
1271	   intended.

1273	   The local ULP can protect itself from this type of attack by
1274	   revoking remote access when the original data transfer has
1275	   completed and before it validates the contents of the buffer. The
1276	   local ULP can either do this by explicitly revoking remote access
1277	   rights for the STag when the Remote Peer indicates the operation
1278	   has completed, or by checking to make sure the Remote Peer
1279	   invalidated the STag through the RDMAP Remote Invalidate
1280	   capability (see Section 6.4.5 Remote Invalidate an STag Shared on
1281	   Multiple Streams for a definition of Remote Invalidate), and if
1282	   it did not, the local ULP then explicitly revokes the STag remote
1283	   access rights.

1285	   The local ULP SHOULD follow the above procedure to protect the
1286	   buffer before it validates the contents of the buffer (or uses
1287	   the buffer in any way).

1289	   An RNIC MUST ensure that network packets using the STag for a
1290	   previously advertised buffer can no longer modify the buffer
1291	   after the ULP revokes remote access rights for the specific STag.
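
   The ordering that matters here (revoke first, then validate, then
   use) can be sketched as follows. AdvertisedBuffer and consume are
   hypothetical names, and the model ignores bounds checking so that
   the ordering stays visible.

```python
class AdvertisedBuffer:
    """Hypothetical ULP-side view of a buffer advertised through an STag."""

    def __init__(self, size):
        self.data = bytearray(size)
        self.remote_access = True  # False once the STag is revoked/invalidated

    def remote_write(self, offset, payload):
        if not self.remote_access:
            return False  # writes are rejected once access is revoked
        self.data[offset:offset + len(payload)] = payload
        return True


def consume(buf, validate, process):
    """Revoke remote access, then validate, then operate on the data."""
    if buf.remote_access:          # the peer did not invalidate the STag
        buf.remote_access = False  # so the local ULP revokes it explicitly
    snapshot = bytes(buf.data)     # contents can no longer change remotely
    if not validate(snapshot):
        raise ValueError("control information failed validation")
    return process(snapshot)
```

   Because access is revoked before validate runs, a late RDMA Write
   from the Remote Peer is rejected and cannot alter the data between
   validation and use.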

1293	6.2.3  Multiple STags to access the same buffer

1295	   See Section 6.3.6 Using Multiple STags Which Alias to the Same
1296	   Buffer for this analysis.

1298	6.3  Information Disclosure

1300	   The main potential source for information disclosure is through a
1301	   local buffer that has been enabled for remote access. If the
1302	   buffer can be probed by a Remote Peer on another Stream, then
1303	   there is potential for information disclosure.

1305	   The potential attacks that could result in unintended information
1306	   disclosure and countermeasures are detailed in the following
1307	   sections.

1309	6.3.1  Probing memory outside of the buffer bounds

1311	   This is essentially the same attack as described in Section 6.2.1
1312	   Buffer Overrun - RDMA Write or Read Response, except an RDMA Read
1313	   Request is used to mount the attack. The same countermeasure
1314	   applies.

1316	6.3.2  Using RDMA Read to Access Stale Data

1318	   If a buffer is being used for some combination of reads and
1319	   writes (either remote or local), and is exposed to a Remote Peer
1320	   with at least remote read access rights before it is initialized
1321	   with the correct data, there is a potential race condition where
1322	   the Remote Peer can view the prior contents of the buffer. This
1323	   becomes a security issue if the prior contents of the buffer were
1324	   not intended to be shared with the Remote Peer.

1326	   To eliminate this race condition, the local ULP SHOULD ensure
1327	   that no stale data is contained in the buffer before remote read
1328	   access rights are granted (this can be done by zeroing the
1329	   contents of the memory, for example). This ensures that the
1330	   Remote Peer can not access the buffer until the stale data has
1331	   been removed.
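
   A minimal sketch of this ordering is shown below. The enable_read
   callback is a hypothetical stand-in for whatever interface the
   local ULP uses to enable the STag for remote read access.

```python
def expose_for_remote_read(buf: bytearray, enable_read) -> None:
    """Clear stale contents, and only then grant remote read access."""
    buf[:] = bytes(len(buf))  # overwrite every byte with zero, in place
    enable_read(buf)          # hypothetical hook that enables the STag
```

   A ULP that fills the buffer with its real data before granting
   access achieves the same effect without a separate zeroing pass.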

1333	6.3.3  Accessing a Buffer After the Transfer

1335	   If the Remote Peer has remote read access to a buffer, and by
1336	   some mechanism tells the local ULP that the transfer has been
1337	   completed, but the local ULP does not disable remote access to
1338	   the buffer before modifying the data, it is possible for the
1339	   Remote Peer to retrieve the new data.

1341	   This is similar to the attack defined in Section 6.2.2 Modifying
1342	   a Buffer After Indication. The same countermeasures apply. In
1343	   addition, the local ULP SHOULD grant remote read access rights
1344	   only for the amount of time needed to retrieve the data.

1346	6.3.4  Accessing Unintended Data With a Valid STag

1348	   If the ULP enables remote access to a buffer using an STag that
1349	   references the entire buffer, but intends only a portion of the
1350	   buffer to be accessed, it is possible for the Remote Peer to
1351	   access the other parts of the buffer anyway.

1353	   To prevent this attack, the ULP SHOULD set the base and bounds of
1354	   the buffer when the STag is initialized to expose only the data
1355	   to be retrieved.

1357	6.3.5  RDMA Read into an RDMA Write Buffer

1359	   One form of disclosure can occur if the access rights on the
1360	   buffer enabled remote read, when only remote write access was
1361	   intended. If the buffer contained ULP data, or data from a
1362	   transfer on an unrelated Stream, the Remote Peer could retrieve
1363	   the data through an RDMA Read operation. Note that an RNIC
1364	   implementation is not required to support STags that have both
1365	   read and write access.

1367	   The most obvious countermeasure for this attack is to not grant
1368	   remote read access if the buffer is intended to be write-only.
1369	   Then the Remote Peer would not be able to retrieve data
1370	   associated with the buffer. An attempt to do so would result in
1371	   an error and the RDMAP Stream would be terminated.

1374	   Thus if a ULP only intends a buffer to be exposed for remote
1375	   write access, it MUST set the access rights to the buffer to only
1376	   enable remote write access.  Note that this requirement is not
1377	   meant to restrict the use of zero-length RDMA Reads. Zero-length
1378	   RDMA Reads do not expose ULP data. Because they are intended to
1379	   be used as a mechanism to ensure that all RDMA Writes have been
1380	   received, and do not even require a valid STag, their use is
1381	   permitted even if a buffer has only been enabled for write
1382	   access.
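
   The access rights rule, including the exception for zero-length
   RDMA Reads, can be sketched as follows. The Access flag type and
   the read_permitted function are hypothetical names used only for
   illustration.

```python
from enum import Flag


class Access(Flag):
    REMOTE_READ = 1
    REMOTE_WRITE = 2


def read_permitted(rights: Access, nbytes: int) -> bool:
    """Decide whether an incoming RDMA Read Request may proceed."""
    if nbytes == 0:
        # A zero-length Read exposes no ULP data and does not need a
        # valid STag, so it is allowed regardless of the rights.
        return True
    return Access.REMOTE_READ in rights
```

   A buffer intended to be write-only is registered with
   Access.REMOTE_WRITE alone, so any non-empty RDMA Read against it
   is refused.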

1384	6.3.6  Using Multiple STags Which Alias to the Same Buffer

1386	   Multiple STags which alias to the same buffer at the same time
1387	   can result in unintentional information disclosure if the STags
1388	   are used by different, mutually untrusted, Remote Peers. This
1389	   model applies specifically to client/server communication, where
1390	   the server is communicating with multiple clients that do not
1391	   trust one another.

1393	   If only read access is enabled, then the local ULP has complete
1394	   control over information disclosure. Thus a server which intended
1395	   to expose the same data (i.e. buffer) to multiple clients by
1396	   using multiple STags to the same buffer creates no new security
1397	   issues beyond what has already been described in this document.
1398	   Note that if the server did not intend to expose the same data to
1399	   the clients, it should use separate buffers for each client (and
1400	   separate STags).

1402	   When one STag has remote read access enabled and a different STag
1403	   has remote write access enabled to the same buffer, it is
1404	   possible for one Remote Peer to view the contents that have been
1405	   written by another Remote Peer.

1407	   If both STags have remote write access enabled and the two Remote
1408	   Peers do not mutually trust each other, it is possible for one
1409	   Remote Peer to overwrite the contents that have been written by
1410	   the other Remote Peer.

1412	   Thus a ULP with multiple Remote Peers which do not share Partial
1413	   Mutual Trust MUST NOT grant write access to the same buffer
1414	   through different STags. A buffer should be exposed to only one
1415	   untrusted Remote Peer at a time to ensure that no information
1416	   disclosure or information tampering occurs between peers.

1418	6.4  Denial of Service (DOS)

1420	   A DOS attack is one of the primary security risks of RDMAP. This
1421	   is because RNIC resources are valuable and scarce, and many ULP
1422	   environments require communication with untrusted Remote Peers.
1423	   If the Remote Peer can be authenticated or the ULP payload
1424	   encrypted, clearly, the DOS profile can be reduced. For the
1425	   purposes of this analysis, it is assumed that the RNIC must be
1426	   able to operate in untrusted environments, which are open to DOS
1427	   style attacks.

1429	   Denial of service attacks against RNIC resources are not the
1430	   typical unknown party spraying packets at a random host (such as
1431	   a TCP SYN attack). Because the connection/Stream must be fully
1432	   established (e.g. a 3 message transport layer handshake has
1433	   occurred), the attacker must be able to both send and receive
1434	   messages over that connection/Stream, or be able to guess a valid
1435	   packet on an existing RDMAP Stream.

1437	   This section outlines the potential attacks and the
1438	   countermeasures available for dealing with each attack.

1440	6.4.1  RNIC Resource Consumption

1442	   This section covers attacks that fall into the general category
1443	   of a local ULP attempting to unfairly allocate scarce (i.e.
1444	   bounded) RNIC resources. The local ULP may be attempting to
1445	   allocate resources on its own behalf, or on behalf of a Remote
1446	   Peer. Resources that fall into this category include: Protection
1447	   Domains, Stream Context Memory, Translation and Protection
1448	   Tables, and STag namespace. These can be due to attacks by
1449	   currently active local ULPs or ones that allocated resources
1450	   earlier, but are now idle.

1452	   This type of attack can occur regardless of whether or not
1453	   resources are shared across Streams.

1455	   The allocation of all scarce resources MUST be placed under the
1456	   control of a Privileged Resource Manager. This allows the
1457	   Privileged Resource Manager to:

1459	       *   prevent a local ULP from allocating more than its fair
1460	           share of resources.

1462	       *   detect if a Remote Peer is attempting to launch a DOS
1463	           attack by attempting to create an excessive number of
1464	           Streams (with associated resources) and take corrective
1465	           action (such as refusing the request or applying network
1466	           layer filters against the Remote Peer).
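
   The first capability listed above, enforcing a fair share, can be
   sketched as a simple quota tracker. The class below is a
   hypothetical illustration; the per-ULP limit is an assumed policy
   input, and this document does not define how fair share is
   computed.

```python
class PrivilegedResourceManager:
    """Toy quota tracker for one type of scarce RNIC resource."""

    def __init__(self, total, per_ulp_limit):
        self.free = total
        self.per_ulp_limit = per_ulp_limit
        self.held = {}  # ULP identifier -> units currently allocated

    def allocate(self, ulp, units=1):
        used = self.held.get(ulp, 0)
        if units > self.free or used + units > self.per_ulp_limit:
            return False  # pool exhausted or fair share exceeded
        self.held[ulp] = used + units
        self.free -= units
        return True

    def release(self, ulp, units=1):
        units = min(units, self.held.get(ulp, 0))  # never over-release
        self.held[ulp] = self.held.get(ulp, 0) - units
        self.free += units
```

   A refused allocation is also a natural point at which to count
   repeated requests made on behalf of one Remote Peer and to trigger
   the corrective actions described above.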

1468	   This analysis assumes that the Resource Manager is responsible
1469	   for handing out Protection Domains, and RNIC implementations will
1470	   provide enough Protection Domains to allow the Resource Manager
1471	   to be able to assign a unique Protection Domain for each
1472	   unrelated, untrusted local ULP (for a bounded, reasonable number
1473	   of local ULPs). This analysis further assumes that the Resource
1474	   Manager implements policies to ensure that untrusted local ULPs
1475	   are not able to consume all of the Protection Domains through a
1476	   DOS attack. Note that Protection Domain consumption cannot result
1477	   from a DOS attack launched by a Remote Peer, unless a local ULP
1478	   is acting on the Remote Peer's behalf.

1480	6.4.2  Resource Consumption by Idle ULPs

1482	   The simplest form of a DOS attack given a fixed amount of
1483	   resources is for the Remote Peer to create an RDMAP Stream to a
1484	   Local Peer, request dedicated resources, and then do no actual
1485	   work. This allows the Remote Peer to be very lightweight (i.e.
1486	   only negotiate resources, but do no data transfer) and consumes a
1487	   disproportionate amount of resources at the Local Peer.

1489	   A general countermeasure for this style of attack is to monitor
1490	   active RDMAP Streams and if resources are getting low, reap the
1491	   resources from RDMAP Streams that are not transferring data and
1492	   possibly terminate the Stream. This would presumably be under
1493	   administrative control.
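
   A monitoring policy of this kind could be sketched as follows. The
   function name, the idle limit, and the low-resource mark are all
   hypothetical policy inputs that an administrator would choose.

```python
def select_streams_to_reap(last_activity, now, idle_limit, free, low_mark):
    """Return idle Stream ids, longest idle first, when resources run low.

    last_activity maps a Stream id to the time of its last data transfer.
    """
    if free >= low_mark:
        return []  # resources are not low: leave idle Streams alone
    idle = [(now - t, sid) for sid, t in last_activity.items()
            if now - t >= idle_limit]
    return [sid for _, sid in sorted(idle, reverse=True)]
```

   The caller decides whether to reclaim only the resources of the
   returned Streams or to terminate them as well.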

1495	   Refer to Section 6.4.1 for the analysis and countermeasures for
1496	   this style of attack on the following RNIC resources: Stream
1497	   Context Memory, Page Translation Tables and STag namespace.

1499	   Note that some RNIC resources are not at risk of this type of
1500	   attack from a Remote Peer because an attack requires the Remote
1501	   Peer to send messages in order to consume the resource. Receive
1502	   Data Buffers, Completion Queue, and RDMA Read Request Queue
1503	   resources are examples. These resources are, however, at risk
1504	   from a local ULP that attempts to allocate resources, then goes
1505	   idle. The same situation can also arise if the ULP negotiates the
1506	   resource levels with the Remote Peer, which causes the Local Peer
1507	   to consume resources, but the Remote Peer never sends data to
1508	   consume them. The general countermeasure described in this
1509	   section can be used to free resources allocated by an idle Local
1510	   Peer.

1512	6.4.3  Resource Consumption By Active ULPs

1514	   This section describes DOS attacks from Local and Remote Peers
1515	   that are actively exchanging messages. Attacks on each RDMA NIC
1516	   resource are examined and specific countermeasures are
1517	   identified. Note that attacks on Stream Context Memory, Page
1518	   Translation Tables, and STag namespace are covered in Section
1519	   6.4.1 RNIC Resource Consumption, so are not included here.

1521	6.4.3.1  Multiple Streams Sharing Receive Buffers

1523	   The Remote Peer can attempt to consume more than its fair share
1524	   of receive data buffers (i.e. Untagged Buffers for DDP or the
1525	   buffers consumed by Send Type Messages for RDMAP) if receive
1526	   buffers are shared across multiple Streams.

1528	   If resources are not shared across multiple Streams, then this
1529	   attack is not possible because the Remote Peer will not be able
1530	   to consume more buffers than were allocated to the Stream. The
1531	   worst case scenario is that the Remote Peer can consume more
1532	   receive buffers than the local ULP allowed, resulting in no
1533	   buffers being available, which could cause the Remote Peer's
1534	   Stream to the Local Peer to be torn down, and all allocated
1535	   resources to be released.

1537	   If local receive data buffers are shared among multiple Streams,
1538	   then the Remote Peer can attempt to consume more than its fair
1539	   share of the receive buffers, causing a different Stream to be
1540	   short of receive buffers, thus possibly causing the other Stream
1541	   to be torn down. For example, if the Remote Peer sent enough
1542	   one-byte Untagged Messages, it might be able to consume all local
1543	   shared receive queue resources with little effort on its part.

1545	   One method the Local Peer could use is to recognize that a Remote
1546	   Peer is attempting to use more than its fair share of resources
1547	   and terminate the Stream (causing the allocated resources to be
1548	   released). However, if the Local Peer is sufficiently slow, it
1549	   may be possible for the Remote Peer to still mount a denial of
1550	   service attack. One countermeasure that can protect against this
1551	   attack is implementing a low-water notification. The low-water
1552	   notification alerts the ULP if the number of buffers in the
1553	   receive queue is less than a threshold.

1555	   If all of the following conditions are true, then the Local Peer
1556	   or local ULP can size the amount of local receive buffers posted
1557	   on the receive queue to ensure a DOS attack can be stopped.

1559	       *   a low-water notification is enabled, and

1561	       *   the Local Peer is able to bound the amount of time that
1562	           it takes to replenish receive buffers, and

1564	       *   the Local Peer maintains statistics to determine which
1565	           Remote Peer is consuming buffers.

1567	   The above conditions enable the low-water notification to arrive
1568	   before resources are depleted and thus the Local Peer or local
1569	   ULP can take corrective action (e.g., terminate the Stream of the
1570	   attacking Remote Peer).
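
   The sizing argument can be made concrete with a small calculation.
   The sketch assumes, beyond what this document states, that the
   Local Peer knows an upper bound on the rate at which Remote Peers
   can consume shared receive buffers. The function and parameter
   names are hypothetical.

```python
def low_water_threshold(max_buffers_per_sec: int, replenish_usecs: int) -> int:
    """Smallest threshold that leaves buffers available while the
    Local Peer reacts to the low-water notification.

    max_buffers_per_sec: assumed worst-case buffer consumption rate.
    replenish_usecs: bound on the time needed to replenish buffers.
    """
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-max_buffers_per_sec * replenish_usecs // 1_000_000)
```

   If the threshold is at least this large, the notification arrives
   while enough buffers remain to cover the replenish interval.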

1572	   A different but similar attack occurs if the Remote Peer sends a
1573	   significant number of out-of-order packets and the RNIC has the
1574	   ability to use the ULP buffer (i.e. the Untagged Buffer for DDP
1575	   or the buffer consumed by a Send Type Message for RDMAP) as a
1576	   reassembly buffer. In this case the Remote Peer can consume a
1577	   significant number of ULP buffers, but never send enough data to
1578	   enable the ULP buffer to be completed to the ULP.

1580	   An effective countermeasure is to create a high-water
1581	   notification which alerts the ULP if there is more than a
1582	   specified number of receive buffers "in process" (partially
1583	   consumed, but not completed). The notification is generated when
1584	   more than the specified number of buffers are in process
1585	   simultaneously on a specific Stream (i.e., packets have started
1586	   to arrive for the buffer, but the buffer has not yet been
1587	   delivered to the ULP).
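
   The high-water notification can be sketched as a per-Stream
   counter of partially consumed buffers. InProcessTracker and its
   notify callback are hypothetical names; how the RNIC identifies a
   buffer is left abstract.

```python
class InProcessTracker:
    """Tracks receive buffers that are partially filled on one Stream."""

    def __init__(self, high_water, notify):
        self.high_water = high_water
        self.notify = notify     # hypothetical callback into the ULP
        self.in_process = set()  # buffers started but not yet completed

    def on_first_segment(self, buf_id):
        self.in_process.add(buf_id)
        if len(self.in_process) > self.high_water:
            self.notify(len(self.in_process))

    def on_completed(self, buf_id):
        self.in_process.discard(buf_id)
```

   The ULP can respond to the notification by terminating the Stream
   that holds an excessive number of incomplete buffers.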

1589	   A different countermeasure is for the RNIC Engine to provide the
1590	   capability to limit the Remote Peer's ability to consume receive
1591	   buffers on a per Stream basis. Unfortunately this requires a
1592	   large amount of state to be tracked in each RNIC on a per Stream
1593	   basis.

1595	   Thus, if an RNIC Engine provides the ability to share receive
1596	   buffers across multiple Streams, the combination of the RNIC
1597	   Engine and the Privileged Resource Manager MUST be able to detect
1598	   if the Remote Peer is attempting to consume more than its fair
1599	   share of resources so that the Local Peer or local ULP can apply
1600	   countermeasures to detect and prevent the attack.

1602	6.4.3.2  Remote or Local Peer Attacking a Shared CQ

1604	   For an overview of the shared CQ attack model, see Section 7.1.

1606	   The Remote Peer can attack a shared CQ by consuming more than its
1607	   fair share of CQ entries by using one of the following methods:

1609	       *   The ULP protocol allows the Remote Peer to cause the
1610	           local ULP to reserve a specified number of CQ entries,
1611	           possibly leaving insufficient entries for other Streams
1612	           that are sharing the CQ.

1614	       *   If the Remote Peer, Local Peer, or local ULP (or any
1615	           combination) can attack the CQ by overwhelming the CQ
1616	           with completions, then completion processing on other
1617	           Streams sharing that Completion Queue can be affected
1618	           (e.g. the Completion Queue overflows and stops
1619	           functioning).

1621	   The first method of attack can be avoided if the ULP does not
1622	   allow a Remote Peer to reserve CQ entries or there is a trusted
1623	   intermediary such as a Privileged Resource Manager. Unfortunately
1624	   it is often unrealistic to not allow a Remote Peer to reserve CQ
1625	   entries - particularly if the number of completion entries is
1626	   dependent on other ULP negotiated parameters, such as the amount
1627	   of buffering required by the ULP. Thus an implementation MUST
1628	   implement a Privileged Resource Manager to control the allocation
1629	   of CQ entries. See Section 2.1 Components for a definition of
1630	   Privileged Resource Manager.

1632	   One way that a Local or Remote Peer can attempt to overwhelm a CQ
1633	   with completions is by sending minimum length RDMAP/DDP Messages
1634	   to cause as many completions (receive completions for the Remote
1635	   Peer, send completions for the Local Peer) per second as
1636	   possible. If it is the Remote Peer attacking, and we assume that
1637	   the Local Peer's receive queue(s) do not run out of receive
1638	   buffers (if they do, then this is a different attack, documented
1639	   in Section 6.4.3.1 Multiple Streams Sharing Receive Buffers),
1640	   then it might be possible for the Remote Peer to consume more
1641	   than its fair share of Completion Queue entries. Depending upon
1642	   the CQ implementation, this could either cause the CQ to overflow
1643	   (if it is not large enough to handle all of the completions
1644	   generated) or for another Stream to not be able to generate CQ
1645	   entries (if the RNIC had flow control on generation of CQ entries
1646	   into the CQ). In either case, the CQ will stop functioning
1647	   correctly and any Streams expecting completions on the CQ will
1648	   stop functioning.

1650	   This attack can occur regardless of whether all of the Streams
1651	   associated with the CQ are in the same Protection Domain or are
1652	   in different Protection Domains - the key issue is that the
1653	   number of Completion Queue entries is less than the number of all
1654	   outstanding operations that can cause a completion.

1656	   The Local Peer can protect itself from this type of attack using
1657	   either of the following methods:

1659	       *   Size the CQ to the appropriate level, as specified below
1660	           (note that if the CQ currently exists, and it needs to be
1661	           resized, resizing the CQ is not required to succeed in
1662	           all cases, so the CQ resize should be done before sizing
1663	           the Send Queue and Receive Queue on the Stream), OR

1665	       *   Grant fewer resources than the Remote Peer requested (not
1666	           supplying the number of Receive Data Buffers requested).

1668	   The proper sizing of the CQ is dependent on whether the local
1669	   ULP(s) will post as many resources to the various queues as the
1670	   size of the queue enables or not. If the local ULP(s) can be
1671	   trusted to post a number of resources that is smaller than the
1672	   size of the specific resource's queue, then a correctly sized CQ
1673	   means that the CQ is large enough to hold completion status for
1674	   all of the outstanding Data Buffers (both send and receive
1675	   buffers), or:

1677	            CQ_MIN_SIZE = SUM(MaxPostedOnEachRQ)
1678	                          + SUM(MaxPostedOnEachSRQ)
1679	                          + SUM(MaxPostedOnEachSQ)

1681	   Where:

1683	           MaxPostedOnEachRQ = the maximum number of requests which
1684	                  can cause a completion that will be posted on a
1685	                  specific Receive Queue.

1687	           MaxPostedOnEachSRQ = the maximum number of requests which
1688	                  can cause a completion that will be posted on a
1689	                  specific Shared Receive Queue.

1691	           MaxPostedOnEachSQ = the maximum number of requests which
1692	                  can cause a completion that will be posted on a
1693	                  specific Send Queue.

1695	   If the local ULP must be able to completely fill the queues, or
1696	   can not be trusted to observe a limit smaller than the queues,
1697	   then the CQ must be sized to accommodate the maximum number of
1698	   operations that it is possible to post at any one time. Thus the
1699	   equation becomes:

1701	            CQ_MIN_SIZE = SUM(SizeOfEachRQ)
1702	                          + SUM(SizeOfEachSRQ)
1703	                          + SUM(SizeOfEachSQ)

1705	   Where:

1707	          SizeOfEachRQ = the maximum number of requests which
1708	                  can cause a completion that can ever be posted
1709	                  on a specific Receive Queue.

1711	          SizeOfEachSRQ = the maximum number of requests which
1712	                  can cause a completion that can ever be posted
1713	                  on a specific Shared Receive Queue.

1715	          SizeOfEachSQ = the maximum number of requests which
1716	                  can cause a completion that can ever be posted
1717	                  on a specific Send Queue.

1719	   Where MaxPostedOnEach*Q and SizeOfEach*Q vary on a per Stream
1720	   or per Shared Receive Queue basis.

1722	   If the ULP is sharing a CQ across multiple Streams which do not
1723	   share Partial Mutual Trust, then the ULP MUST implement a
1724	   mechanism to ensure that the Completion Queue can not overflow.
1725	   Note that it is possible to share CQs even if the Remote Peers
1726	   accessing the CQs are untrusted if either of the above two
1727	   formulas is implemented. If the ULP can be trusted to not post
1728	   more than MaxPostedOnEachRQ, MaxPostedOnEachSRQ, and
1729	   MaxPostedOnEachSQ, then the first formula applies. If the ULP can
1730	   not be trusted to obey the limit, then the second formula
1731	   applies.
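
	   As a non-normative illustration, the two sizing formulas above
	   can be sketched as follows. The WorkQueue type and the
	   cq_min_size function are hypothetical names introduced only for
	   this sketch and are not part of any RNIC interface. Each RQ,
	   SRQ, and SQ that reports completions to the CQ is assumed to be
	   listed exactly once.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class WorkQueue:
    """One RQ, SRQ, or SQ that reports completions to the CQ (hypothetical)."""
    size: int        # SizeOfEach*Q: most requests that can ever be posted
    max_posted: int  # MaxPostedOnEach*Q: limit the ULP promises to observe


def cq_min_size(queues: Iterable[WorkQueue], ulp_trusted: bool) -> int:
    """Return CQ_MIN_SIZE for the queues attached to one CQ.

    ulp_trusted=True applies the first formula (sum of MaxPostedOnEach*Q);
    ulp_trusted=False applies the second formula (sum of SizeOfEach*Q).
    """
    total = 0
    for q in queues:
        if ulp_trusted:
            if q.max_posted > q.size:
                raise ValueError("posting limit exceeds queue size")
            total += q.max_posted
        else:
            total += q.size
    return total


queues = [WorkQueue(size=64, max_posted=16),    # a Receive Queue
          WorkQueue(size=128, max_posted=32),   # a Shared Receive Queue
          WorkQueue(size=256, max_posted=100)]  # a Send Queue
trusted_size = cq_min_size(queues, ulp_trusted=True)
worst_case_size = cq_min_size(queues, ulp_trusted=False)
```

	   With these example values, the first formula yields a CQ of 148
	   entries and the second a CQ of 448 entries.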

1733	6.4.3.3  Attacking the RDMA Read Request Queue

1735	   The RDMA Read Request Queue can be attacked if the Remote Peer
1736	   sends more RDMA Read Requests than the depth of the RDMA Read
1737	   Request Queue at the Local Peer. If the RDMA Read Request Queue
1738	   is a shared resource, this could corrupt the queue. If the queue
1739	   is not shared, then the worst case is that the current Stream is
1740	   no longer functional (e.g. torn down). One approach to solving
1741	   the shared RDMA Read Request Queue would be to create thresholds,
1742	   similar to those described in Section 6.4.3.1 Multiple Streams
1743	   Sharing Receive Buffers. A simpler approach is to not share RDMA
1744	   Read Request Queue resources among Streams or to enforce hard limits
1745	   of consumption per Stream. Thus RDMA Read Request Queue resource
1746	   consumption MUST be controlled by the Privileged Resource Manager
1747	   such that RDMAP/DDP Streams which do not share Partial Mutual
1748	   Trust do not share RDMA Read Request Queue resources.

1750	   If the issue is a bug in the Remote Peer's implementation, but
1751	   not a malicious attack, the issue can be solved by requiring the
1752	   Remote Peer's RNIC to throttle RDMA Read Requests. By properly
1753	   configuring the Stream at the Remote Peer through a trusted
1754	   agent, the RNIC can be made to not transmit RDMA Read Requests
1755	   that exceed the depth of the RDMA Read Request Queue at the Local
1756	   Peer. If the Stream is correctly configured, and if the Remote
1757	   Peer submits more requests than the Local Peer's RDMA Read
1758	   Request Queue can handle, the requests would be queued at the
1759	   Remote Peer's RNIC until previous requests complete. If the
1760	   Remote Peer's Stream is not configured correctly, the RDMAP
1761	   Stream is terminated when more RDMA Read Requests arrive at the
1762	   Local Peer than the Local Peer can handle (assuming the prior
1763	   paragraph's recommendation is implemented). Thus an RNIC
1764	   implementation SHOULD provide a mechanism to cap the number of
1765	   outstanding RDMA Read Requests. The configuration of this limit
1766	   is outside the scope of this document.
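
	   The throttling behavior described above can be sketched as
	   follows. This is a minimal model, not an RNIC implementation:
	   the ReadRequestThrottle class and its method names are
	   hypothetical, and the queue depth of the Local Peer is assumed
	   to have been configured through some trusted agent.

```python
from collections import deque


class ReadRequestThrottle:
    """Caps outstanding RDMA Read Requests at the responder's queue depth."""

    def __init__(self, peer_queue_depth: int):
        if peer_queue_depth < 1:
            raise ValueError("queue depth must be at least 1")
        self.peer_queue_depth = peer_queue_depth
        self.outstanding = 0    # requests sent but not yet completed
        self.pending = deque()  # requests held back at the requester

    def submit(self, request) -> bool:
        """Return True if sent immediately, False if queued locally."""
        if self.outstanding < self.peer_queue_depth:
            self.outstanding += 1
            return True
        self.pending.append(request)
        return False

    def complete(self):
        """Record one completion; return the queued request now sent, if any."""
        if self.outstanding == 0:
            raise RuntimeError("no outstanding request to complete")
        self.outstanding -= 1
        if self.pending:
            self.outstanding += 1
            return self.pending.popleft()
        return None
```

	   In this model a third request submitted against a depth of two
	   is held at the requester until one of the first two completes,
	   so the responder's queue is never overrun.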

1768	6.4.4  Exercise of non-optimal code paths

1770	   Another form of DOS attack is to attempt to exercise data paths
1771	   that can consume a disproportionate amount of resources. An
1772	   example might be if error cases are handled on a "slow path"
1773	   (consuming either host or RNIC computational resources), and an
1774	   attacker generates excessive numbers of errors in an attempt to
1775	   consume these resources. Note that for most RDMAP or DDP errors,
1776	   the attacking Stream will simply be torn down. Thus for this form
1777	   of attack to be effective, the Remote Peer needs to exercise data
1778	   paths which do not cause the Stream to be torn down.

1780	   If an RNIC implementation contains "slow paths" which do not
1781	   result in the tear down of the Stream, it is recommended that an
1782	   implementation provide the ability to detect the above condition
1783	   and allow an administrator to act, including potentially
1784	   administratively tearing down the RDMAP Stream that is
1785	   exercising data paths which consume a disproportionate amount
1786	   of resources.

1788	6.4.5  Remote Invalidate an STag Shared on Multiple Streams

1790	   If a Local Peer has enabled an STag for remote access, the Remote
1791	   Peer could attempt to remote invalidate the STag by using the
1792	   RDMAP Send with Invalidate or Send with SE and Invalidate
1793	   Message. If the STag is only valid on the current Stream, then
1794	   the only side effect is that the Remote Peer can no longer use
1795	   the STag; thus there are no security issues.

1797	   If the STag is valid across multiple Streams, then the Remote
1798	   Peer can prevent other Streams from using that STag by using the
1799	   remote invalidate functionality.

1801	   Thus if RDDP Streams do not share Partial Mutual Trust (i.e. the
1802	   Remote Peer may attempt to remote invalidate the STag
1803	   prematurely), the ULP MUST NOT enable an STag which would be
1804	   valid across multiple Streams.
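
	   A minimal sketch of this rule follows. The STagRegistry class
	   and the notion of a "trust group" identifier are assumptions
	   made for illustration: Streams mapped to the same group are
	   taken to share Partial Mutual Trust, and Streams without an
	   entry are treated as trusting no other Stream.

```python
class STagRegistry:
    """Tracks which Streams each STag is valid on (illustrative only)."""

    def __init__(self, trust_groups):
        # stream id -> trust-group id (hypothetical bookkeeping)
        self._group = dict(trust_groups)
        self._valid_on = {}  # stag -> frozenset of stream ids

    def enable(self, stag, streams):
        streams = frozenset(streams)
        if not streams:
            raise ValueError("an STag must be valid on at least one Stream")
        groups = {self._group.get(s, ("untrusted", s)) for s in streams}
        if len(streams) > 1 and len(groups) > 1:
            # Streams lacking Partial Mutual Trust must not share an STag.
            raise PermissionError("STag would span untrusted Streams")
        self._valid_on[stag] = streams

    def remote_invalidate(self, stag, stream) -> bool:
        """Handle a Send with Invalidate that arrived on `stream`."""
        if stream not in self._valid_on.get(stag, ()):
            return False  # STag is not valid on this Stream; reject
        del self._valid_on[stag]
        return True
```

	   Under this model a Remote Peer can only invalidate an STag that
	   is valid on its own Stream, and an STag is never enabled across
	   Streams that could use remote invalidate against each other.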

1806	6.4.6  Remote Peer attacking an Unshared CQ

1808	   The Remote Peer can attack an unshared CQ if the Local Peer does
1809	   not size the CQ correctly. For example, if the Local Peer enables
1810	   the CQ to handle completions of received buffers, and the receive
1811	   buffer queue is longer than the Completion Queue, then an
1812	   overflow can potentially occur. The effect on the attacker's
1813	   Stream is catastrophic. However if an RNIC does not have the
1814	   proper protections in place, then an attack to overflow the CQ
1815	   can also cause corruption and/or termination of an unrelated
1816	   Stream. Thus an RNIC MUST ensure that if a CQ overflows, any
1817	   Streams which do not use the CQ MUST remain unaffected.

1819	6.5  Elevation of Privilege

1821	   The RDMAP/DDP Security Architecture explicitly differentiates
1822	   between three levels of privilege - Non-Privileged, Privileged,
1823	   and the Privileged Resource Manager. If a Non-Privileged ULP is
1824	   able to elevate its privilege level to a Privileged ULP, then
1825	   mapping a physical address list to an STag can provide local and
1826	   remote access to any physical address location on the node. If a
1827	   Privileged Mode ULP is able to promote itself to be a Resource
1828	   Manager, then it is possible for it to perform denial of service
1829	   type attacks where substantial amounts of local resources could
1830	   be consumed.

1832	   In general, elevation of privilege is a local implementation
1833	   specific issue and thus outside the scope of this document.

1835	7  Attacks from Local Peers

1837	   This section describes local attacks that are possible against
1838	   the RDMA system defined in Figure 1 - RDMA Security Model and the
1839	   RNIC Engine resources defined in Section 2.2.

1841	7.1  Local ULP Attacking a Shared CQ

1843	   DOS attacks against a Shared Completion Queue (CQ - see Section
1844	   2.2.6 Completion Queues) can be caused by either the local ULP or
1845	   the Remote Peer if either attempts to cause more completions than
1846	   its fair share of the number of entries, thus potentially
1847	   starving another unrelated ULP such that no Completion Queue
1848	   entries are available.

1850	   A Completion Queue entry can potentially be maliciously consumed
1851	   by a completion from the Send Queue or a completion from the
1852	   Receive Queue. In the former, the attacker is the local ULP. In
1853	   the latter, the attacker is the Remote Peer.

1855	   A form of attack can occur where the local ULPs can consume
1856	   resources on the CQ. A local ULP that is slow to free resources
1857	   on the CQ by not reaping the completion status quickly enough
1858	   could stall all other local ULPs attempting to use that CQ.

1860	   For these reasons, an RNIC MUST NOT enable sharing a CQ across
1861	   ULPs that do not share Partial Mutual Trust.

1863	7.2  Local Peer Attacking the RDMA Read Request Queue

1865	   If RDMA Read Request Queue resources are pooled across multiple
1866	   Streams, one attack is if the local ULP attempts to unfairly
1867	   allocate RDMA Read Request Queue resources for its Streams. For
1868	   example, a local ULP attempts to allocate all available resources
1869	   on a specific RDMA Read Request Queue for its Streams, thereby
1870	   denying the resource to ULPs sharing the RDMA Read Request Queue.
1871	   The same type of argument applies even if the RDMA Read Request
1872	   Queue is not shared but a local ULP attempts to allocate all of
1873	   the RNIC's resources when the queue is created.

1875	   Thus access to interfaces that allocate RDMA Read Request Queue
1876	   entries MUST be restricted to a trusted Local Peer, such as a
1877	   Privileged Resource Manager. The Privileged Resource Manager
1878	   SHOULD prevent a local ULP from allocating more than its fair
1879	   share of resources.
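
	   One possible way for a Privileged Resource Manager to enforce a
	   fair share is sketched below. The class name, the fixed
	   per-ULP cap, and the allocate/release interface are all
	   hypothetical; this document does not define how "fair share" is
	   computed.

```python
class ReadQueueResourceManager:
    """Grants RDMA Read Request Queue entries under a per-ULP cap."""

    def __init__(self, total_entries: int, max_share_per_ulp: int):
        self.total_entries = total_entries
        self.max_share_per_ulp = max_share_per_ulp
        self._allocated = {}  # ulp id -> entries currently held

    def allocate(self, ulp, entries: int) -> bool:
        """Grant the request only if it fits the ULP's share and the pool."""
        if entries <= 0:
            raise ValueError("entries must be positive")
        held = self._allocated.get(ulp, 0)
        in_use = sum(self._allocated.values())
        if held + entries > self.max_share_per_ulp:
            return False  # would exceed this ULP's fair share
        if in_use + entries > self.total_entries:
            return False  # pool exhausted
        self._allocated[ulp] = held + entries
        return True

    def release(self, ulp, entries: int) -> None:
        held = self._allocated.get(ulp, 0)
        if entries <= 0 or entries > held:
            raise ValueError("cannot release more than is held")
        self._allocated[ulp] = held - entries
```

	   Because only the manager can change the allocation table, a
	   single local ULP cannot claim the entire pool for its Streams.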

1881	7.3  Local ULP Attacking the PTT & STag Mapping

1883	   If a Non-Privileged ULP is able to directly manipulate the RNIC
1884	   Page Translation Tables (which translate from an STag to a host
1885	   address), it is possible that the Non-Privileged ULP could point
1886	   the Page Translation Table at an unrelated Stream's or ULP's
1887	   buffers and thereby be able to gain access to information of the
1888	   unrelated Stream/ULP.

1890	   As discussed in Section 2 Architectural Model, introduction of a
1891	   Privileged Resource Manager to arbitrate the mapping requests is
1892	   an effective countermeasure. This enables the Privileged Resource
1893	   Manager to ensure a local ULP can only initialize the Page
1894	   Translation Table (PTT) to point to its own buffers.

1896	   Thus if Non-Privileged ULPs are supported, the Privileged
1897	   Resource Manager MUST verify that the Non-Privileged ULP has the
1898	   right to access a specific Data Buffer before allowing an STag
1899	   for which the ULP has access rights to be associated with a
1900	   specific Data Buffer. This can be done when the Page Translation
1901	   Table is initialized to access the Data Buffer or when the STag
1902	   is initialized to point to a group of Page Translation Table
1903	   entries, or both.
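
	   The check described above can be sketched as follows. The
	   PrivilegedResourceManager class, its page-ownership table, and
	   its method names are assumptions made for this illustration;
	   a real implementation would rely on the host's own memory
	   ownership records.

```python
class PrivilegedResourceManager:
    """Arbitrates STag-to-buffer mappings for Non-Privileged ULPs."""

    def __init__(self):
        self._page_owner = {}  # page number -> owning ULP
        self._stag_owner = {}  # stag -> ULP holding rights to the STag
        self._ptt = {}         # stag -> tuple of page numbers

    def grant_pages(self, ulp, pages):
        """Record that `ulp` owns the given pages (done by trusted code)."""
        for page in pages:
            self._page_owner[page] = ulp

    def map_stag(self, ulp, stag, pages):
        """Associate `stag` with `pages` only if `ulp` may access all of them."""
        pages = tuple(pages)
        for page in pages:
            if self._page_owner.get(page) != ulp:
                raise PermissionError("ULP does not own page %r" % (page,))
        if self._stag_owner.get(stag, ulp) != ulp:
            raise PermissionError("ULP has no rights to this STag")
        self._stag_owner[stag] = ulp
        self._ptt[stag] = pages

    def translate(self, stag):
        return self._ptt[stag]
```

	   A mapping request that names another ULP's pages is refused
	   before any Page Translation Table entry is written.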

1905	8  Security considerations

1907	   Please see Section 5 Attacks That Can be Mitigated With End-to-
1908	   End Security, Section 6 Attacks from Remote Peers, and Section 7
1909	   Attacks from Local Peers, for a detailed analysis of attacks and
1910	   normative countermeasures to mitigate the attacks.

1912	   Additionally, the appendices provide a summary of the security
1913	   requirements for specific audiences. Section 11 Appendix A: ULP
1914	   Issues for RDDP Client/Server Protocols provides a summary of
1915	   implementation issues and requirements for applications which
1916	   implement a traditional client/server style of interaction. It
1917	   provides additional insight and applicability of the normative
1918	   text in Sections 5, 6, and 7. Section 12, Appendix B: Summary of
1919	   RNIC and ULP Implementation Requirements provides a convenient
1920	   summary of normative requirements for implementers.

1922	9  IANA Considerations

1924	   IANA considerations are not addressed by this document.  Any IANA
1925	   considerations resulting from the use of DDP or RDMA must be
1926	   addressed in the relevant standards.

1928	10 References

1930	10.1 Normative References

1932	   [DDP] Shah, H., J. Pinkerton, R. Recio, and P. Culley, "Direct
1933	       Data Placement over Reliable Transports", Internet-Draft Work
1934	       in Progress draft-ietf-rddp-ddp-05.txt, July 2005.

1936	   [RDMAP] Recio, R., P. Culley, D. Garcia, J. Hilland, "An RDMA
1937	       Protocol Specification", Internet-Draft Work in Progress
1938	       draft-ietf-rddp-rdmap-05.txt, July 2005.

1940	   [RFC2406] Kent, S., Atkinson, R. "IP Encapsulating Security
1941	       Payload (ESP)", RFC 2406, November 1998.

1943	   [RFC2409] Harkins, D., Carrel, D., "The Internet Key Exchange
1944	       (IKE)", RFC 2409, November 1998.

1946	   [RFC2401] Kent, S., Atkinson, R. "Security Architecture for the
1947	       Internet Protocol", RFC 2401, November 1998.

1949	   [RFC2402] Kent, S., Atkinson, R. "IP Authentication Header", RFC
1950	       2402, November 1998.

1952	   [RFC3723] Aboba, B., et al, "Securing Block Storage Protocols
1953	       over IP", RFC 3723, April 2004.

1955	   [RFC2960] Stewart, R. et al., "Stream Control Transmission
1956	       Protocol", RFC 2960, October 2000.

1958	   [RFC793] Postel, J., "Transmission Control Protocol - DARPA
1959	       Internet Program Protocol Specification", RFC 793, September
1960	       1981.

1962	10.2 Informative References

1964	   [RFC2828] Shirey, R., "Internet Security Glossary", FYI 36, RFC
1965	       2828, May 2000.

1967	   [APPLICABILITY] Bestler, C., Coene, L. "Applicability of Remote
1968	       Direct Memory Access Protocol (RDMA) and Direct Data
1969	       Placement (DDP)", Internet-Draft Work in Progress draft-ietf-
1970	       rddp-applicability-06.txt, April 2006.

1972	   [IPv6-Trust] Nikander, P., J. Kempf, E. Nordmark, "IPv6 Neighbor
1973	       Discovery Trust Models and threats", Informational RFC,
1974	       RFC 3756, May 2004.

1976	   [NFSv4CHANNEL] Williams, N., "On the Use of Channel Bindings to
1977	       Secure Channels", Internet-Draft draft-ietf-nfsv4-channel-
1978	       bindings-02.txt, July 2004.

1980	   [VERBS-RDMAC] "RDMA Protocol Verbs Specification", RDMA
1981	       Consortium standard, April 2003.
1982	       http://www.rdmaconsortium.org/home/draft-hilland-iwarp-verbs-
1983	       v1.0-RDMAC.pdf

1985	   [VERBS-RDMAC-Overview] "RDMA enabled NIC (RNIC) Verbs Overview",
1986	       slide presentation by Renato Recio, April 2003.
1987	       http://www.rdmaconsortium.org/home/RNIC_Verbs_Overview2.pdf

1989	   [RFC3552] "Guidelines for Writing RFC Text on Security
1990	       Considerations", Best Current Practice RFC, RFC 3552, July
1991	       2003.

1993	   [INFINIBAND] "InfiniBand Architecture Specification Volume 1",
1994	       release 1.2, InfiniBand Trade Association standard.
1995	       http://www.infinibandta.org/specs. Verbs are documented in
1996	       chapter 11.

1998	   [DTLS] E. Rescorla and N. Modadugu, "Datagram Transport Layer
1999	       Security", RFC 4347, April 2006.

2001	   [iSCSI] J. Satran, et al, "Internet Small Computer Systems
2002	       Interface (iSCSI)", RFC 3720, April 2004.

2004	   [ISER] M. Ko, et al, "iSCSI Extensions for RDMA Specification",
2005	       Internet-Draft Work in Progress draft-ietf-ips-iser-05.txt,
2006	       October 2005.

2008	   [NFSv4] S. Shepler, et al, "Network File System (NFS) version 4
2009	       Protocol", RFC 3530, April 2003.

2011	   [NFSv4.1] S. Shepler, ed., "NFSv4 Minor Version 1", Internet-
2012	       Draft draft-ietf-nfsv4-minorversion1-03.txt, Work in
2013	       Progress, June 2006.

2015	11 Appendix A: ULP Issues for RDDP Client/Server Protocols

2017	   This section is a normative appendix to the document that is
2018	   focused on client/server ULP implementation requirements to
2019	   ensure a secure server implementation.

2021	   The prior sections outlined specific attacks and their
2022	   countermeasures. This section summarizes the attacks and
2023	   countermeasures that have been defined in the prior sections which
2024	   are applicable to creation of a secure ULP (e.g. application)
2025	   server. A ULP server is defined as a ULP which must be able to
2026	   communicate with many clients which do not necessarily have a
2027	   trust relationship with each other, and ensure that each client
2028	   can not attack another client through server interactions.
2029	   Further, the server may wish to use multiple Streams to
2030	   communicate with a specific client, and those Streams may share
2031	   mutual trust. Note that this section assumes a compliant RNIC and
2032	   Privileged Resource Manager implementation - thus it focuses
2033	   specifically on ULP server (e.g. application) implementation
2034	   issues.

2036	   All of the prior sections' details on attacks and countermeasures
2037	   apply to the server, thus requirements which are repeated in this
2038	   section use non-normative "must", "should", "may". In some cases
2039	   normative SHOULD statements for the ULP from the main body of
2040	   this document are made MUST statements for the ULP server because
2041	   the operating conditions can be refined to make the motives for a
2042	   SHOULD inapplicable. If a prior SHOULD is changed to a MUST in
2043	   this section, it is explicitly noted and it uses upper-case
2044	   normative statements.

2046	   The following list summarizes the relevant attacks that clients
2047	   can mount on the shared server, by re-stating the previous
2048	   normative statements to be client/server specific. Note that each
2049	   client/server ULP may employ explicit RDMA operations (RDMA Read,
2050	   RDMA Write) in differing fashions. Therefore where appropriate,
2051	   "Local ULP", "Local Peer" and "Remote Peer" are used in place of
2052	   "server" or "client", in order to retain full generality of each
2053	   requirement.

2055	       *   Spoofing

2057	           *   Sections 5.1.1 to 5.1.3. For protection against many
2058	               forms of spoofing attacks, enable IPsec.

2060	           *   Section 6.1.1 Using an STag on a Different Stream. To
2061	               ensure that one client can not access another
2062	               client's data via use of the other client's STag, the
2063	               server ULP must either scope an STag to a single
2064	               Stream or use a unique Protection Domain per client.
2065	               If a single client has multiple Streams that share
2066	               Partial Mutual Trust, then the STag can be shared
2067	               between the associated Streams by using a single
2068	               Protection Domain among the associated Streams (see
2069	               Section 5.4.4 ULPs Which Provide Security for
2070	               additional issues). To prevent unintended sharing of
2071	               STags within the associated Streams, a server ULP
2072	               should use STags in such a fashion that it is
2073	               difficult to predict the next allocated STag number.

2075	       *   Tampering

2077	           *   6.2.2 Modifying a Buffer After Indication. Before the
2078	               local ULP operates on a buffer that was written by
2079	               the Remote Peer using an RDMA Write or RDMA Read, the
2080	               local ULP MUST ensure the buffer can no longer be
2081	               modified, by invalidating the STag for remote access
2082	               (note that this is stronger than the SHOULD in
2083	               Section 6.2.2). This can either be done explicitly by
2084	               revoking remote access rights for the STag when the
2085	               Remote Peer indicates the operation has completed, or
2086	               by checking to make sure the Remote Peer Invalidated
2087	               the STag through the RDMAP Invalidate capability, and
2088	               if it did not, the local ULP then explicitly revoking
2089	               the STag remote access rights.

2091	       *   Information Disclosure

2093	           *   6.3.2 Using RDMA Read to Access Stale Data. In a
2094	               general purpose server environment there is no
2095	               compelling rationale to not require a buffer to be
2096	               initialized before remote read is enabled (and an
2097	               enormous down side of unintentionally sharing data).
2098	               Thus a local ULP MUST (this is stronger than the
2099	               SHOULD in Section 6.3.2) ensure that no stale data is
2100	               contained in a buffer before remote read access
2101	               rights are granted to a Remote Peer (this can be done
2102	               by zeroing the contents of the memory, for example).

2104	           *   6.3.3 Accessing a Buffer After the Transfer. This
2105	               mitigation is already covered by Section 6.2.2
2106	               (above).

2108	           *   6.3.4 Accessing Unintended Data With a Valid STag.
2109	               The ULP must set the base and bounds of the buffer
2110	               when the STag is initialized to expose only the data
2111	               to be retrieved.

2113	           *   6.3.5 RDMA Read into an RDMA Write Buffer. If a peer
2114	               only intends a buffer to be exposed for remote write
2115	               access, it must set the access rights to the buffer
2116	               to only enable remote write access.

2118	           *   6.3.6 Using Multiple STags Which Alias to the Same
2119	               Buffer. The requirement in Section 6.1.1 (above)
2120	               mitigates this attack. A server buffer is exposed to
2121	               only one client at a time to ensure that no
2122	               information disclosure or information tampering
2123	               occurs between peers.

2125	           *   5.3 Information Disclosure - Network Based
2126	               Eavesdropping. Confidentiality services should be
2127	               enabled by the ULP if this threat is a concern.

2129	       *   Denial of Service

2131	           *   6.4.3.1 Multiple Streams Sharing Receive Buffers. ULP
2132	               memory footprint size can be important for some
2133	               server ULPs. If a server ULP is expecting significant
2134	               network traffic from multiple clients, using a
2135	               receive buffer queue per Stream where there is a
2136	               large number of Streams can consume substantial
2137	               amounts of memory. Thus a receive queue that can be
2138	               shared by multiple Streams is attractive.

2140	               However, because of the attacks outlined in this
2141	               section, sharing a single receive queue between
2142	               multiple clients must only be done if a mechanism is
2143	               in place to ensure one client cannot consume receive
2144	               buffers in excess of its limits, as defined by each
2145	               ULP. For multiple Streams within a single client ULP
2146	               (which presumably share Partial Mutual Trust) this
2147	               added overhead may be avoided.

2149	           *   7.1 Local ULP Attacking a Shared CQ. The normative
2150	               RNIC mitigations require the RNIC to not enable
2151	               sharing of a CQ if the local ULPs do not share
2152	               Partial Mutual Trust. Thus while the ULP is not
2153	               allowed to enable this feature in an unsafe mode, if
2154	               the two local ULPs share Partial Mutual Trust, they
2155	               must behave in the following manner:

2157	               1) The sizing of the completion queue is based on the
2158	               size of the receive queue and send queues as
2159	               documented in 6.4.3.2 Remote or Local Peer Attacking
2160	               a Shared CQ.

2162	               2) The local ULP ensures that CQ entries are reaped
2163	               frequently enough to adhere to Section 6.4.3.2's
2164	               rules.

2166	           *   6.4.3.2 Remote or Local Peer Attacking a Shared CQ.
2167	               There are two mitigations specified in this section -
2168	               one requires a worst-case size of the CQ, and can be
2169	               implemented entirely within the Privileged Resource
2170	               Manager. The second approach requires cooperation
2171	               with the local ULP server (to not post too many
2172	               buffers), and enables a smaller CQ to be used.

2174	               In some server environments, partial trust of the
2175	               server ULP (but not the clients) is acceptable, thus
2176	               the smaller CQ fully mitigates the remote attacker.
2177	               In other environments, the local server ULP could
2178	               also contain untrusted elements which can attack the
2179	               local machine (or have bugs). In those environments,
2180	               the worst-case size of the CQ must be used.

2182	           *   6.4.3.3 Attacking the RDMA Read Request Queue. The
2183	               section requires a server's Privileged Resource Manager
2184	               to not allow sharing of RDMA Read Request Queues across
2185	               multiple Streams that do not share Partial Mutual Trust,
2186	               for a ULP which performs RDMA Read operations to server
2187	               buffers. However, because the server ULP knows best which
2188	               of its Streams share Partial Mutual Trust, this
2189	               requirement can be reflected back to the ULP. The ULP
2190	               (i.e. server) requirement in this case is that it
2191	               MUST NOT allow RDMA Read Request Queues to be shared
2192	               between ULPs which do not have Partial Mutual Trust.

2194	           *   6.4.5 Remote Invalidate an STag Shared on Multiple
2195	               Streams. This mitigation is already covered by
2196	               Section 6.2.2 (above).
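
	   The buffer handling rules restated in the list above for
	   Sections 6.2.2, 6.3.2, 6.3.4, and 6.3.5 can be sketched with a
	   toy model. The ServerBuffer class and its fields are
	   hypothetical and stand in for the STag access rights and bounds
	   that a real server ULP would set through its RNIC interface.

```python
class ServerBuffer:
    """Toy model of one server Data Buffer and its STag access rights."""

    def __init__(self, size: int):
        self.data = bytearray(size)
        self.remote_read = False
        self.remote_write = False
        self.bounds = (0, 0)  # (base offset, length) exposed via the STag

    def enable_remote_read(self, payload: bytes) -> None:
        if len(payload) > len(self.data):
            raise ValueError("payload larger than buffer")
        # 6.3.2: clear stale data before granting remote read access.
        self.data[:] = bytes(len(self.data))
        self.data[:len(payload)] = payload
        # 6.3.4: expose only the bytes the peer is meant to retrieve.
        self.bounds = (0, len(payload))
        self.remote_read, self.remote_write = True, False

    def enable_remote_write(self) -> None:
        # 6.3.5: a buffer meant only for remote write gets no read right.
        self.bounds = (0, len(self.data))
        self.remote_read, self.remote_write = False, True

    def consume(self) -> bytes:
        # 6.2.2: revoke remote access before the local ULP uses the data,
        # whether or not the Remote Peer already invalidated the STag.
        self.remote_read = self.remote_write = False
        self.bounds = (0, 0)
        return bytes(self.data)
```

	   Revoking access unconditionally in consume() is a simplification
	   of the two options given for Section 6.2.2; it has the same
	   effect as checking for a remote invalidate and revoking only
	   when none occurred.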

2198	12 Appendix B: Summary of RNIC and ULP Implementation Requirements

2200	   This appendix is informative.

2202	   Below is a summary of implementation requirements for the RNIC:

2204	       *   3 Trust and Resource Sharing

2206	       *   5.4.5 Requirements for IPsec Encapsulation of DDP

2208	       *   6.1.1 Using an STag on a Different Stream

2210	       *   6.2.1 Buffer Overrun - RDMA Write or Read Response

2212	       *   6.2.2 Modifying a Buffer After Indication

2214	       *   6.4.1 RNIC Resource Consumption

2216	       *   6.4.3.1 Multiple Streams Sharing Receive Buffers

2218	       *   6.4.3.2 Remote or Local Peer Attacking a Shared CQ

2220	       *   6.4.3.3 Attacking the RDMA Read Request Queue

2222	       *   6.4.6 Remote Peer attacking an Unshared CQ

2224	       *   6.5 Elevation of Privilege

2226	       *   7.1 Local ULP Attacking a Shared CQ

2228	       *   7.3 Local ULP Attacking the PTT & STag Mapping

2230	   Below is a summary of implementation requirements for the ULP
2231	   above the RNIC:

2233	       *   5.3 Information Disclosure - Network Based Eavesdropping

2235	       *   6.1.1 Using an STag on a Different Stream

2237	       *   6.2.2 Modifying a Buffer After Indication

2239	       *   6.3.2 Using RDMA Read to Access Stale Data

2241	       *   6.3.3 Accessing a Buffer After the Transfer

2243	       *   6.3.4 Accessing Unintended Data With a Valid STag

2245	       *   6.3.5 RDMA Read into an RDMA Write Buffer

2247	       *   6.3.6 Using Multiple STags Which Alias to the Same Buffer
2248	       *   6.4.5 Remote Invalidate an STag Shared on Multiple
2249	           Streams

2251	13 Appendix C: Partial Trust Taxonomy

2253	   This appendix is informative.

2255	   Partial Trust is defined as follows: when one party is willing
2256	   to assume that another party will refrain from a specific attack
2257	   or set of attacks, the parties are said to be in a state of
2258	   Partial Trust. Note that the partially trusted peer may attempt
2259	   a different set of attacks. This may be appropriate for many ULPs
2260	   where any adverse effects of the betrayal are easily confined
2261	   and do not place other clients or ULPs at risk.

2263	   The Trust Models described in this section have three primary
2264	   distinguishing characteristics. The Trust Model refers to a local
2265	   ULP and Remote Peer, which are intended to be the local and
2266	   remote ULP instances communicating via RDMA/DDP.

2268	       *   Local Resource Sharing (yes/no) - When local resources
2269	           are shared, they are shared across a grouping of
2270	           RDMAP/DDP Streams. If local resources are not shared, the
2271	           resources are dedicated on a per Stream basis. Resources
2272	           are defined in Section 2.2 - Resources. The advantage of
2273	           not sharing resources between Streams is that it reduces
2274	           the types of attacks that are possible. The disadvantage
2275	           is that ULPs might run out of resources.

2277	       *   Local Partial Trust (yes/no) - Local Partial Trust is
2278	           determined based on whether the local grouping of
2279	           RDMAP/DDP Streams (which typically equates to one ULP or
2280	           group of ULPs) mutually trust each other to not perform a
2281	           specific set of attacks.

2283	       *   Remote Partial Trust (yes/no) - The Remote Partial Trust
2284	           level is determined based on whether the local ULP of a
2285	           specific RDMAP/DDP Stream partially trusts the Remote
2286	           Peer of the Stream (see the definition of Partial Trust
2287	           in Section 1 Introduction).

2289	   Not all of the combinations of the trust characteristics are
2290	   expected to be used by ULPs. This document specifically analyzes
2291	   five ULP Trust Models that are expected to be in common use. The
2292	   Trust Models are as follows:

2294	       *   NS-NT - Non-Shared Local Resources, no Local Trust, no
2295	           Remote Trust - typically a server ULP that wants to run
2296	           in the safest mode possible. All attack mitigations are
2297	           in place to ensure robust operation.

2299	       *   NS-RT - Non-Shared Local Resources, no Local Trust,
2300	           Remote Partial Trust - typically a peer-to-peer ULP,
2301	           which has, by some method outside of the scope of this
2302	           document, authenticated the Remote Peer. Note that unless
2303	           some form of key based authentication is used on a per
2304	           RDMA/DDP Stream basis, it may be possible for
2305	           man-in-the-middle attacks to occur.

2307	       *   S-NT - Shared Local Resources, no Local Trust, no Remote
2308	           Trust - typically a server ULP that runs in an untrusted
2309	           environment where the amount of resources required is
2310	           either too large or too dynamic to dedicate for each
2311	           RDMAP/DDP Stream.

2313	       *   S-LT - Shared Local Resources, Local Partial Trust, no
2314	           Remote Trust - typically a ULP that provides a session
2315	           layer and uses multiple Streams to provide additional
2316	           throughput or fail-over capabilities. All of the Streams
2317	           within the local ULP partially trust each other, but do
2318	           not trust the Remote Peer. This trust model may be
2319	           appropriate for embedded environments.

2321	       *   S-T - Shared Local Resources, Local Partial Trust, Remote
2322	           Partial Trust - typically a distributed application, such
2323	           as a distributed database application or a High
2324	           Performance Computing (HPC) application, which is intended
2325	           to run on a cluster. Due to extreme resource and
2326	           performance requirements, the application typically
2327	           authenticates with all of its peers and then runs in a
2328	           highly trusted environment. The application peers are all
2329	           in a single application fault domain and depend on one
2330	           another to be well-behaved when accessing data
2331	           structures. If a trusted Remote Peer has an
2332	           implementation defect that results in poor behavior, the
2333	           entire application could be corrupted.
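
The five Trust Models above can be read as named combinations of the three yes/no trust characteristics. The following Python fragment is a minimal illustrative sketch of that mapping and is not part of the specification. The names TrustCharacteristics, TRUST_MODELS, and classify are hypothetical and were chosen only for this example; combinations that this document does not analyze map to None.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TrustCharacteristics:
    """The three yes/no characteristics (field names are illustrative)."""
    shared_local_resources: bool
    local_partial_trust: bool
    remote_partial_trust: bool


# The five analyzed Trust Models, keyed by their characteristics:
# (shared local resources, local partial trust, remote partial trust).
TRUST_MODELS = {
    TrustCharacteristics(False, False, False): "NS-NT",
    TrustCharacteristics(False, False, True): "NS-RT",
    TrustCharacteristics(True, False, False): "S-NT",
    TrustCharacteristics(True, True, False): "S-LT",
    TrustCharacteristics(True, True, True): "S-T",
}


def classify(c: TrustCharacteristics) -> Optional[str]:
    """Return the Trust Model name, or None if the combination is not analyzed."""
    return TRUST_MODELS.get(c)
```

For example, classify(TrustCharacteristics(True, True, False)) yields "S-LT", while a combination such as non-shared resources with Local Partial Trust is not one of the five analyzed models and yields None.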

2335	   Models NS-NT and S-NT above are typical for Internet networking -
2336	   neither local ULPs nor the Remote Peer is trusted. Sometimes
2337	   optimizations are possible that enable sharing of Page Translation
2338	   Tables across multiple local ULPs, in which case Model S-LT can be
2339	   advantageous. Model S-T is typically used when resource scaling
2340	   across a large parallel ULP makes it infeasible to use any other
2341	   model. Resource scaling issues can arise either from performance
2342	   limits at scale or from a simple lack of resources.
2343	   Model NS-RT is probably the least likely model to be used, but is
2344	   presented for completeness.

2346	14 Authors' Addresses

2348	   James Pinkerton
2349	   Microsoft Corporation
2350	   One Microsoft Way
2351	   Redmond, WA. 98052 USA
2352	   Phone: +1 (425) 705-5442
2353	   Email: jpink@windows.microsoft.com

2355	   Ellen Deleganes
2356	   Intel Corporation
2357	   MS JF5-355
2358	   2111 NE 25th Ave.
2359	   Hillsboro, OR 97124 USA
2360	   Phone: +1 (503) 712-4173
2361	   Email: ellen.m.deleganes@intel.com

2363	15 Acknowledgments

2365	   Sara Bitan
2366	   Microsoft Corporation
2367	   Email: sarab@microsoft.com

2369	   Allyn Romanow
2370	   Cisco Systems
2371	   170 W Tasman Drive
2372	   San Jose, CA 95134 USA
2373	   Phone: +1 408 525 8836
2374	   Email: allyn@cisco.com

2376	   Catherine Meadows
2377	   Naval Research Laboratory
2378	   Code 5543
2379	   Washington, DC 20375
2380	   Email: meadows@itd.nrl.navy.mil

2382	   Patricia Thaler
2383	   Agilent Technologies, Inc.
2384	   1101 Creekside Ridge Drive, #100
2385	   M/S-RG10
2386	   Roseville, CA 95678
2387	   Phone: +1-916-788-5662
2388	   Email: pat_thaler@agilent.com

2390	   James Livingston
2391	   NEC Solutions (America), Inc.
2392	   7525 166th Ave. N.E., Suite D210
2393	   Redmond, WA 98052-7811
2394	   Phone: +1 (425) 897-2033
2395	   Email: james.livingston@necsam.com

2397	   John Carrier
2398	   Adaptec, Inc.
2399	   691 S. Milpitas Blvd.
2400	   Milpitas, CA 95035 USA
2401	   Phone: +1 (360) 378-8526
2402	   Email: john_carrier@adaptec.com

2404	   Caitlin Bestler
2405	   Broadcom
2406	   49 Discovery
2407	   Irvine, CA  92618
2408	   Email: cait@asomi.com

2410	   Bernard Aboba
2411	   Microsoft Corporation
2412	   One Microsoft Way
2413	   Redmond, WA. 98052 USA
2414	   Phone: +1 (425) 706-6606
2415	   Email: bernarda@windows.microsoft.com

2417	16 Full Copyright Statement

2419	   Copyright (C) The Internet Society (2006).

2421	   This document is subject to the rights, licenses and restrictions
2422	   contained in BCP 78, and except as set forth therein, the authors
2423	   retain all their rights.

2425	   This document and the information contained herein are provided
2426	   on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
2427	   REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND
2428	   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES,
2429	   EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY
2430	   THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY
2431	   RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS
2432	   FOR A PARTICULAR PURPOSE.

2434	   Intellectual Property

2436	   The IETF takes no position regarding the validity or scope of any
2437	   Intellectual Property Rights or other rights that might be
2438	   claimed to pertain to the implementation or use of the technology
2439	   described in this document or the extent to which any license
2440	   under such rights might or might not be available; nor does it
2441	   represent that it has made any independent effort to identify any
2442	   such rights.  Information on the procedures with respect to
2443	   rights in RFC documents can be found in BCP 78 and BCP 79.

2445	   Copies of IPR disclosures made to the IETF Secretariat and any
2446	   assurances of licenses to be made available, or the result of an
2447	   attempt made to obtain a general license or permission for the
2448	   use of such proprietary rights by implementers or users of this
2449	   specification can be obtained from the IETF on-line IPR
2450	   repository at http://www.ietf.org/ipr.

2452	   The IETF invites any interested party to bring to its attention
2453	   any copyrights, patents or patent applications, or other
2454	   proprietary rights that may cover technology that may be required
2455	   to implement this standard.  Please address the information to
2456	   the IETF at ietf-ipr@ietf.org.

2458	   Acknowledgement

2460	   Funding for the RFC Editor function is currently provided by the
2461	   Internet Society.