idnits 2.17.1 

draft-ietf-ppsp-peer-protocol-04.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** There are 5 instances of lines with control characters in the document.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (December 6, 2012) is 4157 days in the past.  Is this
     intentional?


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Unused Reference: 'RFC4960' is defined on line 2372, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC5226' is defined on line 2375, but no explicit
     reference was found in the text

  ** Obsolete normative reference: RFC 2459 (Obsoleted by RFC 3280)

  -- Obsolete informational reference (is this intentional?): RFC 4960
     (Obsoleted by RFC 9260)

  -- Obsolete informational reference (is this intentional?): RFC 5226
     (Obsoleted by RFC 8126)

  -- Obsolete informational reference (is this intentional?): RFC 5389
     (Obsoleted by RFC 8489)

  -- Obsolete informational reference (is this intentional?): RFC 6347
     (Obsoleted by RFC 9147)


     Summary: 2 errors (**), 0 flaws (~~), 3 warnings (==), 5 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	PPSP                                                           A. Bakker
3	Internet-Draft                                               R. Petrocco
4	Intended status: Informational                            V. Grishchenko
5	Expires: June 9, 2013                      Technische Universiteit Delft
6	                                                        December 6, 2012

8	              Peer-to-Peer Streaming Peer Protocol (PPSPP)
9	                    draft-ietf-ppsp-peer-protocol-04

11	Abstract

13	   The Peer-to-Peer Streaming Peer Protocol (PPSPP) is a transport
14	   protocol for disseminating the same content to a group of interested
15	   parties in a streaming fashion.  PPSPP supports streaming of both
16	   pre-recorded (on-demand) and live audio/video content.  It is based
17	   on the peer-to-peer paradigm, where clients consuming the content are
18	   put on equal footing with the servers initially providing the
19	   content, to create a system where everyone can potentially provide
20	   upload bandwidth.  It has been designed to provide short time-till-
21	   playback for the end user, and to prevent disruption of the streams
22	   by malicious peers.  PPSPP has also been designed to be flexible and
23	   extensible.  It can use different mechanisms to optimize peer
24	   uploading, prevent freeriding, and work with different peer discovery
25	   schemes (centralized trackers or Distributed Hash Tables).  It
26	   supports multiple methods for content integrity protection and chunk
27	   addressing.  Designed as a generic protocol that can run on top of
28	   various transport protocols, it currently runs on top of UDP using
29	   LEDBAT for congestion control.

31	Status of this Memo

33	   This Internet-Draft is submitted in full conformance with the
34	   provisions of BCP 78 and BCP 79.

36	   Internet-Drafts are working documents of the Internet Engineering
37	   Task Force (IETF).  Note that other groups may also distribute
38	   working documents as Internet-Drafts.  The list of current Internet-
39	   Drafts is at http://datatracker.ietf.org/drafts/current/.

41	   Internet-Drafts are draft documents valid for a maximum of six months
42	   and may be updated, replaced, or obsoleted by other documents at any
43	   time.  It is inappropriate to use Internet-Drafts as reference
44	   material or to cite them other than as "work in progress."

46	   This Internet-Draft will expire on June 9, 2013.

48	Copyright Notice
49	   Copyright (c) 2012 IETF Trust and the persons identified as the
50	   document authors.  All rights reserved.

52	   This document is subject to BCP 78 and the IETF Trust's Legal
53	   Provisions Relating to IETF Documents
54	   (http://trustee.ietf.org/license-info) in effect on the date of
55	   publication of this document.  Please review these documents
56	   carefully, as they describe your rights and restrictions with respect
57	   to this document.  Code Components extracted from this document must
58	   include Simplified BSD License text as described in Section 4.e of
59	   the Trust Legal Provisions and are provided without warranty as
60	   described in the Simplified BSD License.

62	Table of Contents

64	   1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  5
65	     1.1.  Purpose  . . . . . . . . . . . . . . . . . . . . . . . . .  5
66	     1.2.  Requirements Language  . . . . . . . . . . . . . . . . . .  6
67	     1.3.  Terminology  . . . . . . . . . . . . . . . . . . . . . . .  6
68	   2.  Overall Operation  . . . . . . . . . . . . . . . . . . . . . .  8
69	     2.1.  Joining a Swarm  . . . . . . . . . . . . . . . . . . . . .  8
70	     2.2.  Exchanging Chunks  . . . . . . . . . . . . . . . . . . . .  9
71	     2.3.  Leaving a Swarm  . . . . . . . . . . . . . . . . . . . . .  9
72	   3.  Messages . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
73	     3.1.  HANDSHAKE  . . . . . . . . . . . . . . . . . . . . . . . . 10
74	     3.2.  HAVE . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
75	     3.3.  DATA . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
76	     3.4.  ACK  . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
77	     3.5.  INTEGRITY  . . . . . . . . . . . . . . . . . . . . . . . . 11
78	     3.6.  SIGNED_INTEGRITY . . . . . . . . . . . . . . . . . . . . . 11
79	     3.7.  REQUEST  . . . . . . . . . . . . . . . . . . . . . . . . . 11
80	     3.8.  CANCEL . . . . . . . . . . . . . . . . . . . . . . . . . . 12
81	     3.9.  CHOKE and UNCHOKE  . . . . . . . . . . . . . . . . . . . . 12
82	     3.10. Peer Address Exchange and NAT Hole Punching  . . . . . . . 13
83	       3.10.1.  PEX_REQ and PEX_RES Messages  . . . . . . . . . . . . 13
84	       3.10.2.  Hole Punching via PPSPP Messages  . . . . . . . . . . 13
85	     3.11. Keep Alive Signalling  . . . . . . . . . . . . . . . . . . 13
86	     3.12. Storage Independence . . . . . . . . . . . . . . . . . . . 14
87	   4.  Chunk Addressing Schemes . . . . . . . . . . . . . . . . . . . 14
88	     4.1.  Start-End Ranges . . . . . . . . . . . . . . . . . . . . . 14
89	       4.1.1.   Chunk Ranges  . . . . . . . . . . . . . . . . . . . . 14
90	       4.1.2.   Byte Ranges . . . . . . . . . . . . . . . . . . . . . 14
91	     4.2.  Bin Numbers  . . . . . . . . . . . . . . . . . . . . . . . 14
92	     4.3.  In Messages  . . . . . . . . . . . . . . . . . . . . . . . 16
93	       4.3.1.   In HAVE Messages  . . . . . . . . . . . . . . . . . . 16
94	       4.3.2.   In ACK Messages . . . . . . . . . . . . . . . . . . . 16
95	     4.4.  Compatibility  . . . . . . . . . . . . . . . . . . . . . . 16

97	   5.  Content Integrity Protection . . . . . . . . . . . . . . . . . 17
98	     5.1.  Merkle Hash Tree Scheme  . . . . . . . . . . . . . . . . . 18
99	     5.2.  Content Integrity Verification . . . . . . . . . . . . . . 19
100	     5.3.  The Atomic Datagram Principle  . . . . . . . . . . . . . . 20
101	     5.4.  INTEGRITY Messages . . . . . . . . . . . . . . . . . . . . 21
102	     5.5.  Discussion and Overhead  . . . . . . . . . . . . . . . . . 21
103	     5.6.  Automatic Detection of Content Size  . . . . . . . . . . . 22
104	       5.6.1.   Peak Hashes . . . . . . . . . . . . . . . . . . . . . 22
105	       5.6.2.   Procedure . . . . . . . . . . . . . . . . . . . . . . 24
106	   6.  Live Streaming . . . . . . . . . . . . . . . . . . . . . . . . 25
107	     6.1.  Content Authentication . . . . . . . . . . . . . . . . . . 25
108	       6.1.1.   Sign All  . . . . . . . . . . . . . . . . . . . . . . 25
109	       6.1.2.   Unified Merkle Tree . . . . . . . . . . . . . . . . . 26
110	     6.2.  Forgetting Chunks  . . . . . . . . . . . . . . . . . . . . 29
111	   7.  Protocol Options . . . . . . . . . . . . . . . . . . . . . . . 29
112	     7.1.  End Option . . . . . . . . . . . . . . . . . . . . . . . . 29
113	     7.2.  Version  . . . . . . . . . . . . . . . . . . . . . . . . . 29
114	     7.3.  Minimum Version  . . . . . . . . . . . . . . . . . . . . . 30
115	     7.4.  Swarm Identifier . . . . . . . . . . . . . . . . . . . . . 30
116	     7.5.  Content Integrity Protection Method  . . . . . . . . . . . 30
117	     7.6.  Merkle Tree Hash Function  . . . . . . . . . . . . . . . . 31
118	     7.7.  Live Signature Algorithm . . . . . . . . . . . . . . . . . 31
119	     7.8.  Chunk Addressing Method  . . . . . . . . . . . . . . . . . 32
120	     7.9.  Live Discard Window  . . . . . . . . . . . . . . . . . . . 32
121	     7.10. Supported Messages . . . . . . . . . . . . . . . . . . . . 33
122	   8.  UDP Encapsulation  . . . . . . . . . . . . . . . . . . . . . . 33
123	     8.1.  Chunk Size . . . . . . . . . . . . . . . . . . . . . . . . 33
124	     8.2.  Datagrams and Messages . . . . . . . . . . . . . . . . . . 34
125	     8.3.  Channels . . . . . . . . . . . . . . . . . . . . . . . . . 35
126	     8.4.  HANDSHAKE  . . . . . . . . . . . . . . . . . . . . . . . . 35
127	     8.5.  HAVE . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
128	     8.6.  DATA . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
129	     8.7.  ACK  . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
130	     8.8.  INTEGRITY  . . . . . . . . . . . . . . . . . . . . . . . . 37
131	     8.9.  SIGNED_INTEGRITY . . . . . . . . . . . . . . . . . . . . . 37
132	     8.10. REQUEST  . . . . . . . . . . . . . . . . . . . . . . . . . 37
133	     8.11. CANCEL . . . . . . . . . . . . . . . . . . . . . . . . . . 37
134	     8.12. CHOKE and UNCHOKE  . . . . . . . . . . . . . . . . . . . . 38
135	     8.13. PEX_REQ, PEX_RESv4, PEX_RESv6 and PEX_REScert  . . . . . . 38
136	     8.14. KEEPALIVE  . . . . . . . . . . . . . . . . . . . . . . . . 38
137	     8.15. Flow and Congestion Control  . . . . . . . . . . . . . . . 39
138	   9.  Extensibility  . . . . . . . . . . . . . . . . . . . . . . . . 39
139	     9.1.  Chunk Picking Algorithms . . . . . . . . . . . . . . . . . 39
140	     9.2.  Reciprocity Algorithms . . . . . . . . . . . . . . . . . . 39
141	   10. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 39
142	   11. IANA Considerations  . . . . . . . . . . . . . . . . . . . . . 40
143	   12. Security Considerations  . . . . . . . . . . . . . . . . . . . 40
144	     12.1. Security of the Handshake Procedure  . . . . . . . . . . . 40
145	       12.1.1.  Protection against attack 1 . . . . . . . . . . . . . 41
146	       12.1.2.  Protection against attack 2 . . . . . . . . . . . . . 42
147	       12.1.3.  Protection against attack 3 . . . . . . . . . . . . . 42
148	     12.2. Secure Peer Address Exchange . . . . . . . . . . . . . . . 42
149	       12.2.1.  Protection against the Amplification Attack . . . . . 43
150	       12.2.2.  Example: Tracker as Certification Authority . . . . . 43
151	       12.2.3.  Protection Against Eclipse Attacks  . . . . . . . . . 44
152	     12.3. Support for Closed Swarms (PPSP.SEC.REQ-1) . . . . . . . . 45
153	     12.4. Confidentiality of Streamed Content (PPSP.SEC.REQ-2+3) . . 45
154	     12.5. Limit Potential Damage and Resource Exhaustion by Bad
155	           or Broken Peers (PPSP.SEC.REQ-4+6) . . . . . . . . . . . . 45
156	       12.5.1.  HANDSHAKE . . . . . . . . . . . . . . . . . . . . . . 45
157	       12.5.2.  HAVE  . . . . . . . . . . . . . . . . . . . . . . . . 46
158	       12.5.3.  DATA  . . . . . . . . . . . . . . . . . . . . . . . . 46
159	       12.5.4.  ACK . . . . . . . . . . . . . . . . . . . . . . . . . 46
160	       12.5.5.  INTEGRITY and SIGNED_INTEGRITY  . . . . . . . . . . . 47
161	       12.5.6.  REQUEST . . . . . . . . . . . . . . . . . . . . . . . 47
162	       12.5.7.  CANCEL  . . . . . . . . . . . . . . . . . . . . . . . 47
163	       12.5.8.  CHOKE . . . . . . . . . . . . . . . . . . . . . . . . 47
164	       12.5.9.  UNCHOKE . . . . . . . . . . . . . . . . . . . . . . . 48
165	       12.5.10. PEX_RES . . . . . . . . . . . . . . . . . . . . . . . 48
166	       12.5.11. Unsolicited Messages in General . . . . . . . . . . . 48
167	     12.6. Exclude Bad or Broken Peers (PPSP.SEC.REQ-5) . . . . . . . 48
168	   13. References . . . . . . . . . . . . . . . . . . . . . . . . . . 48
169	     13.1. Normative References . . . . . . . . . . . . . . . . . . . 48
170	     13.2. Informative References . . . . . . . . . . . . . . . . . . 50
171	   Appendix A.  Revision History  . . . . . . . . . . . . . . . . . . 53
172	   Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 57

174	1.  Introduction

176	1.1.  Purpose

178	   This document describes the Peer-to-Peer Streaming Peer Protocol
179	   (PPSPP), designed for disseminating the same content to a group of
180	   interested parties in a streaming fashion.  PPSPP supports streaming
181	   of both pre-recorded (on-demand) and live audio/video content.  It is
182	   based on the peer-to-peer paradigm where clients consuming the
183	   content are put on equal footing with the servers initially providing
184	   the content, to create a system where everyone can potentially
185	   provide upload bandwidth.

187	   PPSPP has been designed to provide short time-till-playback for the
188	   end user, and to prevent disruption of the streams by malicious
189	   peers.  Central in this design is a simple method of identifying
190	   content based on self-certification.  In particular, content in PPSPP
191	   is identified by a single cryptographic hash that is the root hash in
192	   a Merkle hash tree calculated recursively from the content
193	   [MERKLE][ABMRKL].  This self-certifying hash tree allows every peer
194	   to directly detect when a malicious peer tries to distribute fake
195	   content.  The tree can be used for both static and live content.
196	   Moreover, it ensures only a small amount of information is needed to
197	   start a download and to verify incoming chunks of content, thus
198	   ensuring short start-up times.

200	   PPSPP has also been designed to be extensible for different
201	   transports and use cases.  Hence, PPSPP is a generic protocol which
202	   can run directly on top of UDP, TCP, or other protocols.  As such,
203	   PPSPP defines a common set of messages that make up the protocol,
204	   which can have different representations on the wire depending on the
205	   lower-level protocol used.  When the lower-level transport allows,
206	   PPSPP can also use different congestion control algorithms.

208	   At present, PPSPP is set to run on top of UDP using LEDBAT for
209	   congestion control [I-D.ietf-ledbat-congestion].  Using LEDBAT
210	   enables PPSPP to serve the content after playback (seeding) without
211	   disrupting the user who may have moved to different tasks that use
212	   its network connection.  LEDBAT may be replaced with a different
213	   algorithm when the work of the IETF working group on RTP Media
214	   Congestion Avoidance Techniques (RMCAT) [RMCATCHART] matures.

216	   PPSPP is also flexible and extensible in the mechanisms it uses to
217	   promote client contribution and prevent freeriding, that is, how to
218	   deal with peers that only download content but never upload to
219	   others.  It also allows different schemes for chunk addressing and
220	   content integrity protection, if the defaults are not fit for a
221	   particular use case.  In addition, it can work with different peer
222	   discovery schemes, such as centralized trackers or fast Distributed
223	   Hash Tables [JIM11].  Finally, in this default setup, PPSPP maintains
224	   only a small amount of state per peer.  A reference implementation of
225	   PPSPP over UDP is available [SWIFTIMPL].

227	1.2.  Requirements Language

229	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
230	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
231	   document are to be interpreted as described in RFC 2119 [RFC2119].

233	1.3.  Terminology

235	   message
236	       The basic unit of PPSPP communication.  A message will have
237	       different representations on the wire depending on the transport
238	       protocol used.  Messages are typically multiplexed into a
239	       datagram for transmission.

241	   datagram
242	       A sequence of messages that is offered as a unit to the
243	       underlying transport protocol (UDP, etc.).  The datagram is
244	       PPSPP's Protocol Data Unit (PDU).

246	   content
247	       Either a live transmission, a pre-recorded multimedia asset, or a
248	       file.

250	   chunk
251	       The basic unit in which the content is divided.  E.g. a block of
252	       N kilobyte.

254	   chunk ID
255	       Unique identifier for a chunk of content (e.g. an integer).  Its
256	       type depends on the chunk addressing scheme used.

258	   chunk specification
259	       An expression that denotes one or more chunk IDs.

261	   chunk addressing scheme
262	       Scheme for identifying chunks and expressing the chunk
263	       availability map of a peer in a compact fashion.

265	   chunk availability map
266	       The set of chunks a peer has successfully downloaded and checked
267	       the integrity of.

269	   bin
270	       A number denoting a specific binary interval of the content
271	       (i.e., one or more consecutive chunks) in the bin numbers chunk
272	       addressing scheme (see Section 4).

274	   content integrity protection scheme
275	       Scheme for protecting the integrity of the content while it is
276	       being distributed via the peer-to-peer network.  I.e. methods for
277	       receiving peers to detect whether a requested chunk has been
278	       maliciously modified by the sending peer.

280	   hash
281	       The result of applying a cryptographic hash function, more
282	       specifically a modification detection code (MDC) [HAC01], such as
283	       SHA-1 [FIPS180-3], to a piece of data.

285	   Merkle hash tree
286	       A tree of hashes whose base is formed by the hashes of the chunks
287	       of content, and its higher nodes are calculated by recursively
288	       computing the hash of the concatenation of the two child hashes
289	       (see Section 5.1).

291	   root hash
292	       The root in a Merkle hash tree calculated recursively from the
293	       content (see Section 5.1).

295	   swarm
296	       A group of peers participating in the distribution of the same
297	       content.

299	   swarm ID
300	       Unique identifier for a swarm of peers, in PPSPP a sequence of
301	       bytes.  When Merkle hash trees are used for content integrity
302	       protection, the identifier is the so-called root hash of the
303	       content (video-on-demand).  For live streaming, the swarm ID is a
304	       public key.

306	   tracker
307	       An entity that records the addresses of peers participating in a
308	       swarm, usually for a set of swarms, and makes this membership
309	       information available to other peers on request.

311	   choking
312	       When a peer A is choking peer B it means that A is currently not
313	       willing to accept requests for content from B.

315	   seeding/leeching
316	       When a peer A is seeding it means that A has downloaded a static
317	       content asset completely and is now offering it for others to
318	       download.  If peer A does not yet have all content or is not
319	       offering it for download, A is said to be leeching.

321	2.  Overall Operation

323	   The basic unit of communication in PPSPP is the message.  Multiple
324	   messages are multiplexed into a single datagram for transmission.  A
325	   datagram (and hence the messages it contains) will have different
326	   representations on the wire depending on the transport protocol used
327	   (see Section 8).

329	   The overall operation of PPSPP is illustrated in the following
330	   examples.  The examples assume that UDP is used for transport, the
331	   recommended method for content integrity protection (Merkle hash
332	   trees) is used, and that a specific policy is used for selecting
333	   which chunks to download.

335	2.1.  Joining a Swarm

337	   Consider a peer A that wants to download a certain content asset.  To
338	   commence a PPSPP download, peer A must have the swarm ID of the
339	   content and a list of one or more tracker contact points (e.g. host+
340	   port).  The list of trackers is optional in the presence of a
341	   decentralized tracking mechanism.

343	   Peer A now registers with the tracker following e.g. the PPSP tracker
344	   protocol [I-D.ietf-ppsp-reqs] and receives the IP address and port of
345	   peers already in the swarm, say B, C, and D. Peer A now sends a
346	   datagram containing a HANDSHAKE message to B, C, and D. This message
347	   conveys protocol options and may serve as an end-to-end check that
348	   the peers are actually in the correct swarm (in which case it
349	   contains the ID of the swarm).

351	   Peer B and C respond with datagrams containing a HANDSHAKE message
352	   and one or more HAVE messages.  A HAVE message conveys (part of) the
353	   chunk availability of a peer and thus contains a chunk specification
354	   that denotes what chunks of the content peer B, resp. C have.  Peer D
355	   sends a datagram with a HANDSHAKE and HAVE messages, but also with a
356	   CHOKE message.  The latter indicates that D is not willing to upload
357	   chunks to A at present.

359	2.2.  Exchanging Chunks

361	   In response to B and C, A sends new datagrams to B and C containing
362	   REQUEST messages.  A REQUEST message indicates the chunks that a peer
363	   wants to download, and thus contains a chunk specification.  The
364	   REQUEST messages to B and C refer to disjunct sets of chunks.  B and
365	   C respond with datagrams containing HAVE, DATA and, in this example,
366	   INTEGRITY messages.  In the Merkle hash tree content protection
367	   scheme (see Section 5.1), the INTEGRITY messages contain all
368	   cryptographic hashes that peer A needs to verify the integrity of the
369	   content chunk sent in the DATA message.  Using these hashes peer A
370	   verifies that the chunks received from B and C are correct.  It also
371	   updates the chunk availability of B and C using the information in
372	   the received HAVE messages.

374	   After processing, A sends a datagram containing HAVE messages for the
375	   chunks it just received to all its peers.  In the datagram to B and C
376	   it includes an ACK message acknowledging the receipt of the chunks,
377	   and adds REQUEST messages for new chunks.  ACK messages are not used
378	   when a reliable transport protocol is used.  When e.g.  C finds that
379	   A obtained a chunk (from B) that C did not yet have, C's next
380	   datagram includes a REQUEST for that chunk.

382	   Peer D also sends HAVE messages to A when it downloads chunks from
383	   other peers.  When D is willing to accept REQUESTs from A, D sends a
384	   datagram with an UNCHOKE message to inform A. If B or C decide to
385	   choke A they sending a CHOKE message and A should then re-request
386	   from other peers.  B and C may continue to send HAVE, REQUEST, or
387	   periodic KEEPALIVE messages such that A keeps sending them HAVE
388	   messages.

390	   Once peer A has received all content (video-on-demand use case) it
391	   stops sending messages to all other peers that have all content
392	   (a.k.a. seeders).  Peer A MAY also contact the tracker or another
393	   source again to obtain more peer addresses.

395	2.3.  Leaving a Swarm

397	   To leave a swarm in a graceful way, peer A sends a "close-channel"
398	   datagram to all its peers and deregisters from the tracker following
399	   the (PPSP) tracker protocol.  Peers receiving the datagram should
400	   remove A from their current peer list.  If A crashes ungracefully,
401	   peers should remove A from their peer list when they detect it no
402	   longer sends messages.

404	3.  Messages

406	   In general, no error codes or responses are used in the protocol;
407	   absence of any response indicates an error.  Invalid messages are
408	   discarded, and further communication with the peer SHOULD be stopped.
409	   The rationale is that it is sufficient to classify peers as either
410	   good (i.e., responding with chunks) or bad and only use the good
411	   ones.  This behavior allows a peer to deal with slow, crashed and
412	   (silent) malicious peers.

414	   For the sake of simplicity, one swarm of peers deals with one content
415	   asset (e.g. file) only.  Retrieval of a collections of files can be
416	   done either by using multiple swarms or by using an external storage
417	   mapping from the linear byte space of a single swarm to different
418	   files, transparent to the protocol, as described in Section 3.12.

420	3.1.  HANDSHAKE

422	   The initiating peer and the addressed peer MUST send a HANDSHAKE
423	   message as the first message in the first datagrams they exchange.
424	   The payload of the HANDSHAKE message is a sequence of protocol
425	   options.  Example options are the content integrity protection scheme
426	   used and an option to specify the swarm identifier.  The latter
427	   option MAY be used as an end-to-end check that the peers are actually
428	   in the correct swarm.  Protocol options are specified in Section 7.

430	   After the handshakes are exchanged, the initiator knows that the peer
431	   really responds.  Hence, the second datagram the initiator sends MAY
432	   already contain some heavy payload, e.g.  DATA messages.  To minimize
433	   the number of initialization round-trips, the first two datagrams
434	   exchanged MAY also contain some minor payload, e.g.  HAVE messages to
435	   indicate the current progress of a peer or a REQUEST (see
436	   Section 3.7).

438	3.2.  HAVE

440	   The HAVE message is used to convey which chunks a peer has available
441	   for download.  The set of chunks it has available may be expressed
442	   using different chunk addressing and availability map compression
443	   schemes, described in Section 4.  HAVE messages can be used both for
444	   sending a complete overview of a peer's chunk availability as well as
445	   for updates to that set.

447	   In particular, whenever a receiving peer has successfully checked the
448	   integrity of a chunk or interval of chunks it MUST send a HAVE
449	   message to all peers it wants to interact with in the near future.
450	   The latter confinement allows the HAVE message to be used as a method
451	   of choking.  The HAVE message MUST contain the chunk specification of
452	   the received chunks.  A receiving peer MUST NOT send a HAVE message
453	   to peers for which the handshake procedure is still incomplete, see
454	   Section 12.1.

456	3.3.  DATA

458	   The DATA message is used to transfer chunks of content.  The DATA
459	   message MUST contain the chunk ID of the chunk and chunk itself.  A
460	   peer MAY send the DATA messages for multiple chunks in the same
461	   datagram.  The DATA message MAY contain additional information if
462	   needed by the specific congestion control mechanism used.  At present
463	   PPSPP uses LEDBAT [I-D.ietf-ledbat-congestion] for congestion
464	   control, which requires the current system time to be sent along with
465	   the DATA message, so the current system time MUST be included.

467	3.4.  ACK

469	   When PPSPP is run over an unreliable transport protocol, an
470	   implementation MAY choose to use ACK messages to acknowledge received
471	   data.  When used, a receiving peer that has successfully checked the
472	   integrity of a chunk or interval of chunks C it MUST send an ACK
473	   message containing a chunk specification for C. As LEDBAT is used, an
474	   ACK message MUST contain the one-way delay, computed from the peer's
475	   current system time received in the DATA message.  A peer MAY delay
476	   sending ACK messages as defined in the LEDBAT specification.

478	3.5.  INTEGRITY

480	   The INTEGRITY message carries information required by the receiver to
481	   verify the integrity of a chunk.  Its payload depends on the content
482	   integrity protection scheme used.  When the recommended method of
483	   Merkle hash trees is used, the datagram carrying the DATA message
484	   MUST include the cryptographic hashes that are necessary for a
485	   receiver to check the integrity of the chunk in the form of INTEGRITY
486	   messages.  What are the necessary hashes is explained in Section 5.3.

488	3.6.  SIGNED_INTEGRITY

490	   The SIGNED_INTEGRITY message carries digitally signed information
491	   required by the receiver to verify the integrity of a chunk in live
492	   streaming.  It logically contains a chunk specification and a digital
493	   signature.  Its exact payload depends on the live content integrity
494	   protection scheme used, see Section 6.1.

496	3.7.  REQUEST

498	   While bulk download protocols normally do explicit requests for
499	   certain ranges of data (i.e., use a pull model, for example,
500	   BitTorrent [BITTORRENT]), live streaming protocols quite often use a
501	   request-less push model to save round trips.  PPSPP supports both
502	   models of operation.

504	   A peer MAY send a REQUEST message that MUST contain the specification
505	   of the chunks it wants to download.  A peer receiving a REQUEST
506	   message MAY send out the requested chunks.  When peer Q receives
507	   multiple REQUESTs from the same peer P peer Q SHOULD process the
508	   REQUESTs sequentially.  Multiple REQUEST messages MAY be sent in one
509	   datagram, for example, when a peer wants to request several rare
510	   chunks at once.

512	   When live streaming via a push model, a peer receiving REQUESTs also
513	   MAY send some other chunks in case it runs out of requests or for
514	   some other reason.  In that case the only purpose of REQUEST messages
515	   is to provide hints and coordinate peers to avoid unnecessary data
516	   retransmission.

518	3.8.  CANCEL

520	   When downloading on demand or live streaming content, a peer MAY
521	   request urgent data from multiple peers to increase the probability
522	   of it being delivered on time.  In particular, when the specific
523	   chunk picking algorithm (see Section 9.1), detects that a request for
524	   urgent data might not be served on time, a request for the same data
525	   MAY be sent to a different peer.  When a peer P decides to request
526	   urgent data from a peer Q, peer P SHOULD send a CANCEL message to all
527	   the peers to which the data has been previously requested.  The
528	   CANCEL message contains the specification of the chunks P no longer
529	   wants to request.  In addition, when peer Q receives a HAVE message
530	   for the urgent data from peer P, peer Q MUST also cancel the previous
531	   REQUEST(s) from P. In other words, the HAVE message acts as an
532	   implicit CANCEL.

534	3.9.  CHOKE and UNCHOKE

536	   Peer A MAY send a CHOKE message to peer B to signal it will no longer
537	   be responding to REQUEST messages from B, for example, because A's
538	   upload capacity is exhausted.  Peer A MAY send a subsequent UNCHOKE
539	   message to signal that it will respond to new REQUESTs from B again
540	   (A SHOULD discard old requests).  When peer B receives a CHOKE
541	   message from A it MUST NOT send new REQUEST messages and SHOULD NOT
542	   expect answers to any outstanding ones.  The CHOKE and UNCHOKE
543	   messages are informational as a peer is not required to respond to
544	   REQUESTs, see Section 3.7.

546	3.10.  Peer Address Exchange and NAT Hole Punching

548	3.10.1.  PEX_REQ and PEX_RES Messages

550	   Peer address exchange messages (or PEX messages for short) are common
551	   in many peer-to-peer protocols.  By exchanging peer addresses in
552	   gossip fashion, peers relieve central coordinating entities (the
553	   trackers) from unnecessary work.  PPSPP optionally features two types
554	   of PEX messages: PEX_REQ and PEX_RES.  A peer that wants to retrieve
555	   some peer addresses MUST send a PEX_REQ message.  The receiving peer
556	   MAY respond with a PEX_RES message containing the (potentially
557	   signed) addresses of several peers.  The addresses MUST be of peers
558	   it has exchanged messages with in the last 60 seconds to guarantee
559	   liveliness.  Alternatively, the receiving peer MAY ignore PEX
560	   messages if uninterested in obtaining new peers or because of
561	   security considerations (rate limiting) or any other reason.  The PEX
562	   messages can be used to construct a dedicated tracker peer.

564	   As peer-address exchange enables a number of attacks it should not be
565	   used outside a benign environment unless extra security measures are
566	   in place.  These security measures, which involve exchanging
567	   addresses in cryptographically signed swarm-membership certificates,
568	   are described in Section 12.2.

570	3.10.2.  Hole Punching via PPSPP Messages

572	   PPSPP can be used in combination with STUN [RFC5389].  In addition,
573	   the native PEX messages can be used to do simple NAT hole punching
574	   [SNP], as follows.  When peer A introduces peer B to peer C by
575	   sending a PEX_RES message to C, it SHOULD also send a PEX_RES message
576	   to B introducing C. These messages SHOULD be within 2 seconds from
577	   each other, but MAY not be simultaneous, instead leaving a gap of
578	   twice the "typical" RTT, i.e. 300-600 ms.  As a result, the peers are
579	   supposed to initiate handshakes to each other thus forming a simple
580	   NAT hole punching pattern where the introducing peer effectively acts
581	   as a STUN server.  Note that the PEX_RES message is sent without a
582	   prior PEX_REQ in this case.  Also note the PEX_RES from A to B is
583	   likely to arrive because recent communication between A and B is a
584	   prerequisite for A introducing B to C, see previous section.

586	3.11.  Keep Alive Signalling

588	   A peer MUST send a "keep alive" message periodically to each peer it
589	   wants to interact with in the future, but has no other messages to
590	   send them at present.  PPSPP does not define an explicit message type
591	   for "keep alive" messages.  In the PPSP-over-UDP encapsulation they
592	   are implemented as simple datagrams consisting of a 4-byte channel
593	   number only, see Section 8.3 and Section 8.4.

595	3.12.  Storage Independence

597	   Note PPSPP does not prescribe how chunks are stored.  This also
598	   allows users of PPSPP to map different files into a single swarm as
599	   in BitTorrent's multi-file torrents [BITTORRENT], and more innovative
600	   storage solutions when variable-sized chunks are used.

602	4.  Chunk Addressing Schemes

604	   PPSPP can use different methods of chunk addressing, that is, support
605	   different ways of identifying chunks and different ways of expressing
606	   the chunk availability map of a peer in a compact fashion.

608	4.1.  Start-End Ranges

610	   A chunk specification consists of a single (start specification,end
611	   specification) pair that identifies a range of chunks (end
612	   inclusive).  The start and end specifications can use one of multiple
613	   addressing schemes.  Two schemes are currently defined, chunk ranges
614	   and byte ranges.

616	4.1.1.  Chunk Ranges

618	   The start and end specification are both chunk identifiers.  A PPSPP
619	   peer MUST support this scheme.

621	4.1.2.  Byte Ranges

623	   The start and end specification are byte offsets in the content.  A
624	   PPSPP peer MAY support this scheme.

626	4.2.  Bin Numbers

628	   PPSPP introduces a novel method of addressing chunks of content
629	   called "bin numbers" (or "bins" for short).  Bin numbers allow the
630	   addressing of a binary interval of data using a single integer.  This
631	   reduces the amount of state that needs to be recorded per peer and
632	   the space needed to denote intervals on the wire, making the protocol
633	   light-weight.  In general, this numbering system allows PPSPP to work
634	   with simpler data structures, e.g. to use arrays instead of binary
635	   trees, thus reducing complexity.  A PPSPP peer MAY support this
636	   scheme.

638	   In bin addressing, the smallest binary interval is a single chunk
639	   (e.g. a block of bytes which may be of variable size), the largest
640	   interval is a complete range of 2**63 chunks.  In a novel addition to
641	   the classical scheme, these intervals are numbered in a way which
642	   lays them out into a vector nicely, which is called bin numbering, as
643	   follows.  Consider an chunk interval of width W. To derive the bin
644	   numbers of the complete interval and the subintervals, a minimal
645	   balanced binary tree is built that is at least W chunks wide at the
646	   base.  The leaves from left-to-right correspond to the chunks 0..W-1
647	   in the interval, and have bin number I*2 where I is the index of the
648	   chunk (counting beyond W-1 to balance the tree).  The bin number of
649	   higher level nodes P in the tree is calculated as follows:

651	       binP = (binL + binR) / 2

653	   where binL is the bin of node P's left-hand child and binR is the bin
654	   of node P's right-hand child.  Given that each node in the tree
655	   represents a subinterval of the original interval, each such
656	   subinterval now is addressable by a bin number, a single integer.
657	   The bin number tree of an interval of width W=8 looks like this:

659	                                    7
660	                                   / \
661	                                 /     \
662	                               /         \
663	                             /             \
664	                            3                11
665	                           / \              / \
666	                          /   \            /   \
667	                         /     \          /     \
668	                        1       5        9       13
669	                       / \     / \      / \      / \
670	                      0   2   4   6    8   10  12   14

672	                      C0  C1  C2  C3   C4  C5  C6   C7

674	              The bin number tree of an interval of width W=8

676	                                 Figure 1

678	   So bin 7 represents the complete interval, bin 3 represents the
679	   interval of chunk 0..3, bin 1 represents the interval of chunks 0 and
680	   1, and bin 2 represents chunk C1.  The special numbers 0xFFFFFFFF
681	   (32-bit) or 0xFFFFFFFFFFFFFFFF (64-bit) stands for an empty interval,
682	   and 0x7FFF...FFF stands for "everything".

684	   When bin numbering is used, the ID of a chunk is its corresponding
685	   (leaf) bin number in the tree and the chunk specification in HAVE and
686	   ACK messages is equal to a single bin number, as follows.

688	4.3.  In Messages

690	4.3.1.  In HAVE Messages

692	   When a receiving peer has successfully checked the integrity of a
693	   chunk or interval of chunks it MUST send a HAVE message to all peers
694	   it wants to interact with.  The latter allows the HAVE message to be
695	   used as a method of choking.  The HAVE message MUST contain the chunk
696	   specification of the biggest complete interval of all chunks the
697	   receiver has received and checked so far that fully includes the
698	   interval of chunks just received.  So the chunk specification MUST
699	   denote at least the interval received, but the receiver is supposed
700	   to aggregate and acknowledge bigger intervals, when possible.

702	   As a result, every single chunk is acknowledged a logarithmic number
703	   of times.  That provides some necessary redundancy of acknowledgments
704	   and sufficiently compensates for unreliable transport protocols.

706	   Implementation note:

708	       To record which chunks a peer has in the state that an
709	       implementation keeps for each peer, an implementation MAY use the
710	       efficient "binmap" data structure, which is a hybrid of a bitmap
711	       and a binary tree, discussed in detail in [BINMAP].

713	4.3.2.  In ACK Messages

715	   When PPSPP is run over an unreliable transport protocol, an
716	   implementation MAY choose to use ACK messages to acknowledge received
717	   data.  When a receiving peer has successfully checked the integrity
718	   of a chunk or interval of chunks C it MUST send a ACK message
719	   containing the chunk specification of its biggest, complete, interval
720	   covering C to the sending peer (see HAVE).

722	4.4.  Compatibility

724	   In principle, peers using range addressing and peers using bin
725	   numbering can interact, with some limitations.  Alternatively, a peer
726	   A MAY refuse to interact with a peer B using a different addressing
727	   scheme.  In that case, A MUST respond to B'S HANDSHAKE message by
728	   sending an explicit close (see Section 8.4).  PPSPP presently
729	   supports only interaction between willing peers when fixed sized
730	   chunks are used, as follows:

732	   When a bin peer sends a message containing a chunk specification to a
733	   byte-range peer it MUST translate its internal bin numbers to byte
734	   ranges.  When a byte range peer sends a message with a chunk
735	   specification message to a bin peer, it MUST round its internal byte
736	   ranges to 1 or more bins.  For the latter translation, the byte-range
737	   peer MUST know the fixed chunk size used (which it should receive
738	   along with the swarm identifier).  When a range translates to
739	   multiple bins, the byte-range peer should send multiple e.g.  HAVE
740	   messages.  Note that the bin peer may not be able to request all
741	   content the byte-range peer has if it does not have an integral
742	   number of chunks.

744	   Aside: Translation from bytes to bins is possible for variable sized
745	   chunks only when the byte-range peer has extra information.  In
746	   particular, it will need to know the individual sizes of the chunks
747	   from the start of the content till the byte range it wants to convey
748	   to the bin peer.

750	   A similar translation MUST be done for translating between bins and
751	   chunk ranges.  Chunk ranges are directly translatable to bins.
752	   Assuming ranges are intervals of a list of chunks numbered 0...N, for
753	   a given bin number "bin" and bitwise operations AND and OR:

755	       startrange = (bin AND (bin + 1))/2

757	       endrange = ((bin OR (bin + 1)) - 1)/2

759	   The reverse translation may require a chunk range to be rounded to
760	   the largest binary interval it covers, or for a range be translated
761	   to a series of bin numbers that should be sent using multiple (e.g.
762	   HAVE) messages.

764	   Finally, byte-range peers can interact with chunk-range peers, by
765	   using the direct translation from chunks into bytes and by rounding
766	   byte ranges into chunk ranges.  The latter requires the byte-range
767	   peer to know the fixed chunk size.

769	5.  Content Integrity Protection

771	   PPSPP can use different methods for protecting the integrity of the
772	   content while it is being distributed via the peer-to-peer network.
773	   More specifically, PPSPP can use different methods for receiving
774	   peers to detect whether a requested chunk has been maliciously
775	   modified by the sending peer.

777	   This section describes the recommended method for bad content
778	   detection, the Merkle Hash Tree scheme, which SHOULD be implemented
779	   for protecting static content.  It can also be efficiently used in
780	   protecting live streams, as explained below and in Section 6.1.

782	   The Merkle hash tree scheme can use different chunk addressing
783	   schemes.  All it requires is the ability to address a range of
784	   chunks.  In the following description abstract node IDs are used to
785	   identify nodes in the tree.  On the wire these are translated to the
786	   corresponding range of chunks in the chosen chunk addressing scheme.
787	   When bin numbering is used, node IDs correspond directly to bin
788	   numbers in the INTEGRITY message, see below.

790	5.1.  Merkle Hash Tree Scheme

792	   PPSPP uses a method of naming content based on self-certification.
793	   In particular, content in PPSPP is identified by a single
794	   cryptographic hash that is the root hash in a Merkle hash tree
795	   calculated recursively from the content [ABMRKL].  This self-
796	   certifying hash tree allows every peer to directly detect when a
797	   malicious peer tries to distribute fake content.  It also ensures
798	   only a small the amount of information is needed to start a download
799	   (the root hash and some peer addresses).  For live streaming a
800	   dynamic tree and a public key are used, see below.

802	   The Merkle hash tree of a content asset that is divided into N chunks
803	   is constructed as follows.  Note the construction does not assume
804	   chunks of content to be fixed size.  Given a cryptographic hash
805	   function, more specifically a modification detection code (MDC)
806	   [HAC01] , such as SHA1, the hashes of all the chunks of the content
807	   are calculated.  Next, a binary tree of sufficient height is created.
808	   Sufficient height means that the lowest level in the tree has enough
809	   nodes to hold all chunk hashes in the set, as with bin numbering.
810	   The figure below shows the tree for a content asset consisting of 7
811	   chunks.  As before with the content addressing scheme, the leaves of
812	   the tree correspond to a chunk and in this case are assigned the hash
813	   of that chunk, starting at the left-most leaf.  As the base of the
814	   tree may be wider than the number of chunks, any remaining leaves in
815	   the tree are assigned an empty hash value of all zeros.  Finally, the
816	   hash values of the higher levels in the tree are calculated, by
817	   concatenating the hash values of the two children (again left to
818	   right) and computing the hash of that aggregate.  If the two children
819	   are empty hashes, the parent is an empty all zeros hash as well (to
820	   save computation).  This process ends in a hash value for the root
821	   node, which is called the "root hash".  Note the root hash only
822	   depends on the content and any modification of the content will
823	   result in a different root hash.

825	                               7 = root hash
826	                              / \
827	                            /     \
828	                          /         \
829	                        /             \
830	                      3*               11
831	                     / \              / \
832	                    /   \            /   \
833	                   /     \          /     \
834	                  1       5        9       13* = uncle hash
835	                 / \     / \      / \      / \
836	                0   2   4   6    8   10* 12   14

838	                C0  C1  C2  C3   C4  C5  C6   E
839	                =chunk index     ^^           = empty hash

841	             The Merkle hash tree of an interval of width W=8

843	                                 Figure 2

845	5.2.  Content Integrity Verification

847	   Assuming a peer receives the root hash of the content it wants to
848	   download from a trusted source, it can check the integrity of any
849	   chunk of that content it receives as follows.  It first calculates
850	   the hash of the chunk it received, for example chunk C4 in the
851	   previous figure.  Along with this chunk it MUST receive the hashes
852	   required to check the integrity of that chunk.  In principle, these
853	   are the hash of the chunk's sibling (C5) and that of its "uncles".  A
854	   chunk's uncles are the sibling Y of its parent X, and the uncle of
855	   that Y, recursively until the root is reached.  For chunk C4 its
856	   uncles are nodes 13 and 3, marked with * in the figure.  Using this
857	   information the peer recalculates the root hash of the tree, and
858	   compares it to the root hash it received from the trusted source.  If
859	   they match the chunk of content has been positively verified to be
860	   the requested part of the content.  Otherwise, the sending peer
861	   either sent the wrong content or the wrong sibling or uncle hashes.
862	   For simplicity, the set of sibling and uncles hashes is collectively
863	   referred to as the "uncle hashes".

865	   In the case of live streaming the tree of chunks grows dynamically
866	   and the root hash is undefined or, more precisely, transient, as long
867	   as new data is generated by the live source.  Section 6.1.2 defines a
868	   method for content integrity verification for live streams that works
869	   with such a dynamic tree.  Although the tree is dynamic, content
870	   verification works the same for both live and predefined content,
871	   resulting in a unified method for both types of streaming.

873	5.3.  The Atomic Datagram Principle

875	   As explained above, a datagram consists of a sequence of messages.
876	   Ideally, every datagram sent must be independent of other datagrams,
877	   so each datagram SHOULD be processed separately and a loss of one
878	   datagram MUST NOT disrupt the flow.  Thus, as a datagram carries zero
879	   or more messages, neither messages nor message interdependencies
880	   SHOULD span over multiple datagrams.

882	   This principle implies that as any chunk is verified using its uncle
883	   hashes the necessary hashes SHOULD be put into the same datagram as
884	   the chunk's data.  If this is not possible because of a limitation on
885	   datagram size, the necessary hashes MUST be sent first in one or more
886	   datagrams.  As a general rule, if some additional data is still
887	   missing to process a message within a datagram, the message SHOULD be
888	   dropped.

890	   The hashes necessary to verify a chunk are in principle its sibling's
891	   hash and all its uncle hashes, but the set of hashes to send can be
892	   optimized.  Before sending a packet of data to the receiver, the
893	   sender inspects the receiver's previous acknowledgments (HAVE or ACK)
894	   to derive which hashes the receiver already has for sure.  Suppose,
895	   the receiver had acknowledged chunks C0 and C1 (first two chunks of
896	   the file), then it must already have uncle hashes 5, 11 and so on.
897	   That is because those hashes are necessary to check C0 and C1 against
898	   the root hash.  Then, hashes 3, 7 and so on must be also known as
899	   they are calculated in the process of checking the uncle hash chain.
900	   Hence, to send chunk C7, the sender needs to include just the hashes
901	   for nodes 14 and 9, which let the data be checked against hash 11
902	   which is already known to the receiver.

904	   The sender MAY optimistically skip hashes which were sent out in
905	   previous, still unacknowledged datagrams.  It is an optimization
906	   trade-off between redundant hash transmission and possibility of
907	   collateral data loss in the case some necessary hashes were lost in
908	   the network so some delivered data cannot be verified and thus has to
909	   be dropped.  In either case, the receiver builds the Merkle tree on-
910	   demand, incrementally, starting from the root hash, and uses it for
911	   data validation.

913	   In short, the sender MUST put into the datagram the missing hashes
914	   necessary for the receiver to verify the chunk.  The receiver MUST
915	   remember all the hashes it needs to verify missing chunks that it
916	   still wants to download.  Note that the latter implies that a
917	   hardware-limited receiver MAY forget some hashes if it does not plan
918	   to announce possession of these chunks to others (i.e., does not plan
919	   to send HAVE messages.)

921	5.4.  INTEGRITY Messages

923	   Concretely, a peer that wants to send a chunk of content creates a
924	   datagram that MUST consist of a list of INTEGRITY messages followed
925	   by a DATA message.  If the INTEGRITY messages and DATA message cannot
926	   be put into a single datagram because of a limitation on datagram
927	   size, the INTEGRITY messages MUST be sent first in one or more
928	   datagrams.  The list of INTEGRITY messages sent MUST contain a
929	   INTEGRITY message for each hash the receiver misses for integrity
930	   checking.  A INTEGRITY message for a hash MUST contain the chunk
931	   specification corresponding to the node ID of the hash and the hash
932	   data itself.  The chunk specification corresponding to a node ID is
933	   defined as the range of chunks formed by the leaves of the subtree
934	   rooted at the node.  For example, node 3 in Figure 2 denotes chunks
935	   0,2,4,6, so the chunk specification should denote that interval.  The
936	   list of INTEGRITY messages MUST be sorted in order of tree height of
937	   the nodes, descending.  The DATA message MUST contain the chunk
938	   specification of the chunk and chunk itself.  A peer MAY send the
939	   required messages for multiple chunks in the same datagram, depending
940	   on the encapsulation.

942	5.5.  Discussion and Overhead

944	   The current method for protecting content integrity in BitTorrent
945	   [BITTORRENT] is not suited for streaming.  It involves providing
946	   clients with the hashes of the content's chunks before the download
947	   commences by means of metadata files (called .torrent files in
948	   BitTorrent.)  However, when chunks are small as in the current UDP
949	   encapsulation of PPSPP this implies having to download a large number
950	   of hashes before content download can begin.  This, in turn,
951	   increases time-till-playback for end users, making this method
952	   unsuited for streaming.

954	   The overhead of using Merkle hash trees is limited.  The size of the
955	   hash tree expressed as the total number of nodes depends on the
956	   number of chunks the content is divided (and hence the size of
957	   chunks) following this formula:

959	       nnodes = math.pow(2,math.log(nchunks,2)+1)

961	   In principle, the hash values of all these nodes will have to be sent
962	   to a peer once for it to verify all chunks.  Hence the maximum on-
963	   the-wire overhead is hashsize * nnodes.  However, the actual number
964	   of hashes transmitted can be optimized as described in Section 5.3.
965	   To see a peer can verify all chunks whilst receiving not all hashes,
966	   consider the example tree in Section 5.1.

968	   In case of a simple progressive download, of chunks 0,2,4,6, etc. the
969	   sending peer will send the following hashes:

971	          +-------+---------------------------------------------+
972	          | Chunk | Node IDs of hashes sent                     |
973	          +-------+---------------------------------------------+
974	          |   0   | 2,5,11                                      |
975	          |   2   | - (receiver already knows all)              |
976	          |   4   | 6                                           |
977	          |   6   | -                                           |
978	          |   8   | 10,13 (hash 3 can be calculated from 0,2,5) |
979	          |   10  | -                                           |
980	          |   12  | 14                                          |
981	          |   14  | -                                           |
982	          | Total | # hashes 7                                  |
983	          +-------+---------------------------------------------+

985	                  Table 1: Overhead for the example tree

987	   So the number of hashes sent in total (7) is less than the total
988	   number of hashes in the tree (16), as a peer does not need to send
989	   hashes that are calculated and verified as part of earlier chunks.

991	5.6.  Automatic Detection of Content Size

993	   In PPSPP, the root hash of a static content asset, such as a video
994	   file, along with some peer addresses is sufficient to start a
995	   download.  In addition, PPSPP can reliably and automatically derive
996	   the size of such content from information received from the network
997	   when fixed sized chunks are used.  As a result, it is not necessary
998	   to include the size of the content asset as the metadata of the
999	   content, in addition to the root hash.  Implementations of PPSPP MAY
1000	   use this automatic detection feature.  Note this feature is the only
1001	   feature of PPSPP that requires that a fixed-sized chunk is used.

1003	5.6.1.  Peak Hashes

1005	   The ability for a newcomer peer to detect the size of the content
1006	   depends heavily on the concept of peak hashes.  Peak hashes, in
1007	   general, enable two cornerstone features of PPSPP: reliable file size
1008	   detection and download/live streaming unification (see Section 6).
1009	   The concept of peak hashes depends on the concepts of filled and
1010	   incomplete nodes.  Recall that when constructing the binary trees for
1011	   content verification and addressing the base of the tree may have
1012	   more leaves than the number of chunks in the content.  In the Merkle
1013	   hash tree these leaves were assigned empty all-zero hashes to be able
1014	   to calculate the higher level hashes.  A filled node is now defined
1015	   as a node that corresponds to an interval of leaves that consists
1016	   only of hashes of content chunks, not empty hashes.  Reversely, an
1017	   incomplete (not filled) node corresponds to an interval that contains
1018	   also empty hashes, typically an interval that extends past the end of
1019	   the file.  In the following figure nodes 7, 11, 13 and 14 are
1020	   incomplete the rest is filled.

1022	   Formally, a peak hash is the hash of a filled node in the Merkle
1023	   tree, whose sibling is an incomplete node.  Practically, suppose a
1024	   file is 7162 bytes long and a chunk is 1 kilobyte.  That file fits
1025	   into 7 chunks, the tail chunk being 1018 bytes long.  The Merkle tree
1026	   for that file is shown in Figure 3.  Following the definition the
1027	   peak hashes of this file are in nodes 3, 9 and 12, denoted with a *.
1028	   E denotes an empty hash.

1030	                                  7
1031	                                 / \
1032	                               /     \
1033	                             /         \
1034	                           /             \
1035	                         3*               11
1036	                        / \              / \
1037	                       /   \            /   \
1038	                      /     \          /     \
1039	                     1       5        9*      13
1040	                    / \     / \      / \      / \
1041	                   0   2   4   6    8   10  12*  14

1043	                   C0  C1  C2  C3   C4  C5  C6   E
1044	                                            = 1018 bytes

1046	                    Peak hashes in a Merkle hash tree.

1048	                                 Figure 3

1050	   Peak hashes can be explained by the binary representation of the
1051	   number of chunks the file occupies.  The binary representation for 7
1052	   is 111.  Every "1" in binary representation of the file's packet
1053	   length corresponds to a peak hash.  For this particular file there
1054	   are indeed three peaks, nodes 3, 9, 12.  The number of peak hashes
1055	   for a file is therefore also at most logarithmic with its size.

1057	   A peer knowing which nodes contain the peak hashes for the file can
1058	   therefore calculate the number of chunks it consists of, and thus get
1059	   an estimate of the file size (given all chunks but the last are fixed
1060	   size).  Which nodes are the peaks can be securely communicated from
1061	   one (untrusted) peer A to another B by letting A send the peak hashes
1062	   and their node IDs to B. It can be shown that the root hash that B
1063	   obtained from a trusted source is sufficient to verify that these are
1064	   indeed the right peak hashes, as follows.

1066	   Lemma: Peak hashes can be checked against the root hash.

1068	   Proof: (a) Any peak hash is always the left sibling.  Otherwise, be
1069	   it the right sibling, its left neighbor/sibling must also be a filled
1070	   node, because of the way chunks are laid out in the leaves,
1071	   contradiction. (b) For the rightmost peak hash, its right sibling is
1072	   zero. (c) For any peak hash, its right sibling might be calculated
1073	   using peak hashes to the left and zeros for empty nodes. (d) Once the
1074	   right sibling of the leftmost peak hash is calculated, its parent
1075	   might be calculated. (e) Once that parent is calculated, we might
1076	   trivially get to the root hash by concatenating the hash with zeros
1077	   and hashing it repeatedly.

1079	   Informally, the Lemma might be expressed as follows: peak hashes
1080	   cover all data, so the remaining hashes are either trivial (zeros) or
1081	   might be calculated from peak hashes and zero hashes.

1083	   Finally, once peer B has obtained the number of chunks in the content
1084	   it can determine the exact file size as follows.  Given that all
1085	   chunks except the last are fixed size B just needs to know the size
1086	   of the last chunk.  Knowing the number of chunks B can calculate the
1087	   node ID of the last chunk and download it.  As always B verifies the
1088	   integrity of this chunk against the trusted root hash.  As there is
1089	   only one chunk of data that leads to a successful verification the
1090	   size of this chunk must be correct.  B can then determine the exact
1091	   file size as

1093	       (number of chunks -1) * fixed chunk size + size of last chunk

1095	5.6.2.  Procedure

1097	   A PPSPP implementation that wants to use automatic size detection
1098	   MUST operate as follows.  When a peer B sends a DATA message for the
1099	   first time to a peer A, B MUST first send all the peak hashes for the
1100	   content, unless A has already signalled earlier in the exchange that
1101	   it knows the peak hashes by having acknowledged any chunk.  If they
1102	   are needed, the peak hashes MUST be sent as an extra list of uncle
1103	   hashes for the chunk, before the list of actual uncle hashes of the
1104	   chunk as described in Section 5.3.  The receiver A MUST check the
1105	   peak hashes against the root hash to determine the approximate
1106	   content size.  To obtain the definite content size peer A MUST
1107	   download the last chunk of the content from any peer that offers it.

1109	6.  Live Streaming

1111	   The set of messages defined above can be used for live streaming as
1112	   well.  In a pull-based model, a live streaming injector can announce
1113	   the chunks it generates via HAVE messages, and peers can retrieve
1114	   them via REQUEST messages.  Areas that need special attention are
1115	   content authentication and chunk addressing (to achieve an infinite
1116	   stream of chunks).

1118	6.1.  Content Authentication

1120	   For live streaming, PPSPP supports two methods for a peer to
1121	   authenticate the content it receives from another peer, called "Sign
1122	   All" and "Unified Merkle Tree".

1124	   In the "Sign All" method, the live injector signs each chunk of
1125	   content using a private key and peers, upon receiving the chunk,
1126	   check the signature using the corresponding public key obtained from
1127	   a trusted source.  In particular, in PPSP, the swarm ID of the live
1128	   stream is that public key.  The signature is sent along with the DATA
1129	   message containing the relevant chunk using the SIGNED_INTEGRITY
1130	   message.

1132	   In the "Unified Merkle Tree" method, PPSPP combines the Merkle hash
1133	   tree scheme for static content with signatures to unify the video-on-
1134	   demand and live streaming case.  The use of Merkle hash trees reduces
1135	   the number of signing and verification operations per second, that
1136	   is, provide signature amortization similar to the approach described
1137	   in [SIGMCAST].

1139	   In both methods the swarm ID consists of a public key encoded as in a
1140	   DNSSEC DNSKEY resource record without BASE-64 encoding [RFC4034].  In
1141	   particular, the swarm ID consists of a 1 byte Algorithm field that
1142	   identifies the public key's cryptographic algorithm and determines
1143	   the format of the Public Key field that follows.

1145	6.1.1.  Sign All

1147	   Even with "Sign All", the number of cryptographic operations per
1148	   second may be limited.  For example, consider a 25 frame/second video
1149	   transmitted over UDP.  When each frame is transmitted in its own
1150	   chunk, only 25 signature verification operations per second are
1151	   required, for the receiving peer, for bitrates up to ~12.8 megabit/
1152	   second over UDP.  For higher bitrates multiple UDP packets per frame
1153	   are needed.

1155	6.1.2.  Unified Merkle Tree

1157	   In this method, the chunks of content are used as the basis for a
1158	   Merkle hash tree as for static content.  However, because chunks are
1159	   continuously generated, this tree is not static, but dynamic.  As a
1160	   result, the tree does not have a root hash, or more precisely has a
1161	   transient root hash.  A public key therefore serves as swarm ID of
1162	   the content.  It is used to digitally sign updates to the tree,
1163	   allowing peers to expand it based on trusted information using the
1164	   following procedure.

1166	   The live injector creates a number of chunks N, a fixed power of 2
1167	   (N>=2), which are added as new leaves to the existing hash tree,
1168	   expanding the tree as required.  As a result of this expansion, the
1169	   tree will have gotten a set of new peak hashes (see Section 5.6.1).
1170	   The injector now signs only the peak hashes in this set that are not
1171	   in the old set of peak hashes.  For N being a power of 2 there will
1172	   just be one new peak hash (see below).  This complementary signed
1173	   peak is distributed to the peers.  Receiving peers will verify the
1174	   signature on the signed peak against the swarm ID, update their tree
1175	   and request the new chunks.

1177	   To illustrate this procedure, consider the injector has generated the
1178	   tree shown in Figure 4 and it is connected to several peers that
1179	   currently have the same tree and all chunks.  In this tree the root
1180	   node 3 is also the peak node for this tree.  Now the injector
1181	   generates N=2 new chunks.  As a result the tree expands as shown in
1182	   Figure 5.  The two new pieces 8 and 10 extend the tree on the right
1183	   side, and to accommodate them a new root is created, node 7.  As this
1184	   tree is wider at the base than the actual number of chunks, there are
1185	   currently two empty leaves.  The peak nodes in this tree are 3 and 9.

1187	                                     3*
1188	                                    / \
1189	                                   /   \
1190	                                  /     \
1191	                                 1       5
1192	                                / \     / \
1193	                               0   2   4   6

1195	                             Current live tree

1197	                                 Figure 4
1198	                                     7
1199	                                    / \
1200	                                  /     \
1201	                                /         \
1202	                              /             \
1203	                            3*               11
1204	                           / \              / \
1205	                          /   \            /   \
1206	                         /     \          /     \
1207	                        1       5        9*      13
1208	                       / \     / \      / \      / \
1209	                      0   2   4   6    8   10   E   E

1211	                          Next current live tree

1213	                                 Figure 5

1215	   The injector now needs to inform its peers of the changed tree, in
1216	   particular the addition of the new complementary peak hash 9.  To
1217	   this extent, it sends an INTEGRITY message with the hash of node 9, a
1218	   SIGNED_INTEGRITY message with the signature of the hash of node 9 and
1219	   a HAVE message for node 9.  The receiving peers now expand their view
1220	   of the tree.  Next, the peers will request e.g. chunk 8 from the
1221	   injector by sending a REQUEST message.  The injector responds by
1222	   sending the requester an INTEGRITY message with the hash of node 10,
1223	   and a DATA message with chunk 8.  This allows the peer to verify the
1224	   chunk against peak hash 9 which is signed by the trusted injector.

1226	   The injector MAY send HAVE messages for the chunks it creates
1227	   immediately, and allow peers to retrieve them.  This optimizes the
1228	   use of the injector's bandwidth.  Peers MUST NOT forward these chunks
1229	   to others until they have received and checked the peak hash
1230	   signature and the necessary hashes.

1232	   This procedure generates just 1 new peak hash for every N blocks, so
1233	   it requires just one signature on each iteration, making it N times
1234	   cheaper than "Sign All".  To see why just 1 new peak hash is
1235	   generated each iteration let's return to the definition of a peak
1236	   hash in a tree, from Section 5.6.1.  A peak hash is the hash of a
1237	   filled node in the Merkle tree, whose sibling is an incomplete node.
1238	   Now consider the above procedure where N chunks (with N a power of 2)
1239	   are added to a tree at each iteration.  In the first iteration, the
1240	   tree consists of just N leaves, therefore the only peak is the root
1241	   of the tree.  In the second iteration, the tree consists of 2N chunks
1242	   and the only peak is the root of that bigger tree (depicted in
1243	   Figure 4 for N=2).  In the third iteration, we have 3N chunks as
1244	   leaves and a tree that is 4N wide (to span the 3N chunks) and hence
1245	   has N empty leaves (depicted in Figure 5 for N=2).  This implies that
1246	   the tree has 2 peaks, notably the peak from the previous iteration
1247	   (node 3 in the figure) and the top of the subtree of the N chunks
1248	   that were added last (node 9 in the figure).  Although this iteration
1249	   has two peaks, there is only one new peak as the expanded tree
1250	   overlaps with the tree from the previous iteration.  In the fourth
1251	   iteration, we have a complete balanced tree again, and just a single
1252	   new peak.  It is now easy to see that this process in which previous
1253	   peaks are either consumed into a single new peak, or peak sets
1254	   overlap with just 1 new addition yields a single new peak per N
1255	   chunks.

1257	   From this we can conclude that the injector has to sign less hashes
1258	   than in the "Sign All" method.  A receiving peer therefore also has
1259	   to verify less signatures.  It does additionally need to check one or
1260	   more hashes per chunk via the Merkle Tree scheme, but this requires
1261	   less CPU than a signature verification for every chunk.  This
1262	   approach is similar to signature amortization via Merkle Tree
1263	   Chaining [SIGMCAST].  The downside of this amortization of signature
1264	   costs over several chunks is that latency will increase.  A receiving
1265	   peer now has to wait for the signature before delivering the chunks
1266	   to the higher layers responsible for playback [POLLIVE], unless some
1267	   (optimistic) optimisations are made.  It MUST check the signature
1268	   before forwarding the chunks to other peers.

1270	   The number of chunks per signature N MUST be a fixed power of 2
1271	   (N>=2).  The procedure does not preclude using variable-sized chunks.
1272	   Using a variable number N, however, is not allowed as this breaks the
1273	   property of generating just 1 new peak per iteration.

1275	   Unification of static content checking and live content checking is
1276	   achieved by sending the signed peak hashes on-demand, ahead of the
1277	   actual data.  As before, the sender SHOULD use acknowledgments to
1278	   derive which content range the receiver has peak hashes for, and to
1279	   prepend the data hashes with the necessary (signed) peak hashes.
1280	   Except for the fact that the set of peak hashes changes with time,
1281	   other parts of the protocol work as described in Section 5.1.

1283	   This method of integrity verification has an added benefit if the
1284	   system includes some peers that saved the complete broadcast.  The
1285	   benefit is that as soon as the broadcast ends, the content is
1286	   available as a video-on-demand download using the now stabilized tree
1287	   and the final root hash as swarm identifier.  Peers that have saved
1288	   all chunks can now announce this root hash to the tracking
1289	   infrastructure and instantly seed it.

1291	6.2.  Forgetting Chunks

1293	   As a live broadcast progresses a peer may want to discard the chunks
1294	   that it already played out.  Ideally, other peers should be aware of
1295	   this fact such that they will not try to request these chunks from
1296	   this peer.  This could happen in scenarios where live streams may be
1297	   paused by viewers, or viewers are allowed to start late in a live
1298	   broadcast (e.g., start watching a broadcast at 20:35 whereas it began
1299	   at 20:30).

1301	   PPSPP provides a simple solution for peers to stay up-to-date with
1302	   the chunk availability of a discarding peer.  A discarding peer in a
1303	   live stream MUST enable the Live Discard Window protocol option,
1304	   specifying how many chunks/bytes it caches before the last chunk/byte
1305	   it advertised as being available (see Section 7.9).  Its peers SHOULD
1306	   apply this number as a sliding window filter over the peer's chunk
1307	   availability as conveyed via its HAVE messages.

1309	7.  Protocol Options

1311	   The HANDSHAKE message in PPSPP can contain the following protocol
1312	   options (cf. [RFC2132] (DHCP options)).  Each element in a protocol
1313	   option is 8 bits wide, unless stated otherwise.

1315	7.1.  End Option

1317	   A peer MUST conclude the list of protocol options with the end
1318	   option.  Subsequent octets should be considered protocol messages.
1319	   The code for the end option is 255, and its length is 1 octet.

1321	                                 +------+
1322	                                 | Code |
1323	                                 +------+
1324	                                 |  255 |
1325	                                 +------+

1327	7.2.  Version

1329	   A peer MUST include the maximum version of the PPSPP protocol it
1330	   supports as the first protocol option in the list.

1332	                          +------+-------------+
1333	                          | Code | Max Version |
1334	                          +------+-------------+
1335	                          |   0  |      v      |
1336	                          +------+-------------+

1338	   Currently one value is defined, 1 = protocol as described in this
1339	   document.

1341	7.3.  Minimum Version

1343	   When a peer initiates the handshake it MUST include the minimum
1344	   version of the PPSPP protocol it supports in the list of protocol
1345	   options, following the Min/max versioning scheme defined in
1346	   [RFC6709], Section 4.1.

1348	                          +------+-------------+
1349	                          | Code | Min Version |
1350	                          +------+-------------+
1351	                          |   1  |      v      |
1352	                          +------+-------------+

1354	   Currently one value is defined, 1 = protocol as described in this
1355	   document.

1357	7.4.  Swarm Identifier

1359	   To enable end-to-end checking of any peer discovery process a peer
1360	   MAY include a swarm identifier option.

1362	                 +------+-------------+------------------+
1363	                 | Code |    Length   | Swarm Identifier |
1364	                 +------+-------------+------------------+
1365	                 |   2  | n (16 bits) |     i1,i2,...    |
1366	                 +------+-------------+------------------+

1368	   Each PPSPP peer knows the IDs of the swarms it joins so this
1369	   information can be immediately verified upon receipt.  The length
1370	   field is 2 octets to allow for large public keys as identifiers in
1371	   live streaming.

1373	7.5.  Content Integrity Protection Method

1375	   A peer MUST include the content integrity method used by a swarm,
1376	   unless it uses the default, in which case it MAY include the method.

1378	                             +------+--------+
1379	                             | Code | Method |
1380	                             +------+--------+
1381	                             |   3  |    m   |
1382	                             +------+--------+

1384	   Currently three values are defined for the method, 0 = No integrity
1385	   protection, 1 = Merkle Hash Trees (for static content, see
1386	   Section 5.1), 2 = Sign All, and 3 = Unified Merkle Tree (for live
1387	   content, see Section 6.1).

1389	   The veracity of this information will come out when the receiver
1390	   successfully verifies the first chunk from any peer.

1392	7.6.  Merkle Tree Hash Function

1394	   When the content integrity protection method is Merkle Hash Trees
1395	   this option defining which hash function is used for the tree MUST
1396	   also be defined.

1398	                           +------+-----------+
1399	                           | Code | Hash Func |
1400	                           +------+-----------+
1401	                           |   4  |     h     |
1402	                           +------+-----------+

1404	   Currently the following values are defined for the hash function, 0 =
1405	   SHA1, 1 = SHA-224, 2 = SHA-256, 3 = SHA-384, and 4 = SHA-512
1406	   [FIPS180-3].

1408	   The veracity of this information will come out when the receiver
1409	   successfully verifies the first chunk from any peer.

1411	7.7.  Live Signature Algorithm

1413	   When the content integrity protection method is "Sign All" or
1414	   "Unified Merkle Tree" this option MUST also be defined.

1416	                            +------+---------+
1417	                            | Code | Sig Alg |
1418	                            +------+---------+
1419	                            |   5  |    s    |
1420	                            +------+---------+

1422	   The value of this option is one of the Domain Name System Security
1423	   (DNSSEC) Algorithm Numbers [IANADNSSECALGNUM].  If necessary, the key
1424	   size that impacts signature length can be derived from the swarm
1425	   identifier which is the signing public key in live streaming.

1427	   The veracity of this information will come out when the receiver
1428	   successfully verifies the first chunk from any peer.

1430	7.8.  Chunk Addressing Method

1432	   A peer MUST include the chunk addressing method it uses, unless it
1433	   uses the default, in which case it MAY include the method.

1435	                             +------+--------+
1436	                             | Code | Scheme |
1437	                             +------+--------+
1438	                             |   6  |    a   |
1439	                             +------+--------+

1441	   Currently six values are defined for the chunk addressing scheme,
1442	   0=32-bit bins, 1=64-bit byte ranges, and 2=32-bit chunk ranges, 3=64-
1443	   bit bins, 4=64-bit chunk ranges.

1445	   The veracity of this information will come out when the receiver
1446	   parses the first message containing a chunk specification from any
1447	   peer.

1449	7.9.  Live Discard Window

1451	   A peer in a live swarm MUST include the discard window it uses.  The
1452	   unit of the discard window depends on the chunk addressing method
1453	   used.  For bins and chunk ranges it is a number of chunks, for byte
1454	   ranges it is a number of bytes.  Its data type is the same as for a
1455	   bin, or one value in a range specification.  In other words, a 32-bit
1456	   or 64-bit integer in big endian format.  This option MUST therefore
1457	   appear after the Chunk Addressing option, if present in the list of
1458	   protocol options.

1460	                       +------+-------------------+
1461	                       | Code |       Scheme      |
1462	                       +------+-------------------+
1463	                       |   7  | w (32 or 64-bits) |
1464	                       +------+-------------------+

1466	   A peer that does not, under normal circumstances, discard chunks MUST
1467	   set this option to the special value 0xFFFFFFFF (32-bit) or
1468	   0xFFFFFFFFFFFFFFFF (64-bit).  For example, peers that record a
1469	   complete broadcast to offer it directly as a static asset after the
1470	   broadcast ends (see Section 6.1.2).

1472	   The veracity of this information does not impact a receiving peer
1473	   more than when a sender peer just does not respond to REQUEST
1474	   messages.

1476	7.10.  Supported Messages

1478	   Peers may support just a subset of the PPSPP messages.  For example,
1479	   peers running over TCP may not accept ACK messages, or peers used
1480	   with a centralized tracking infrastructure may not accept PEX
1481	   messages.  For these reasons, peers who support only a proper subset
1482	   of the PPSPP messages MUST signal which subset they support by means
1483	   of this protocol option.  The value of this option is a 256-bit
1484	   bitmap where each bit represents a message type.  The bitmap may be
1485	   truncated to the last non-zero byte.

1487	                    +------+--------+----------------+
1488	                    | Code | Length | Message Bitmap |
1489	                    +------+--------+----------------+
1490	                    |   8  |    n   |    m1,m2,...   |
1491	                    +------+--------+----------------+

1493	8.  UDP Encapsulation

1495	   Currently, PPSPP-over-UDP is the preferred deployment option.
1496	   Effectively, UDP allows the use of IP with minimal overhead and it
1497	   also allows userspace implementations.  LEDBAT is used for congestion
1498	   control [I-D.ietf-ledbat-congestion].  Using LEDBAT enables PPSPP to
1499	   serve the content after playback (seeding) without disrupting the
1500	   user who may have moved to different tasks that use its network
1501	   connection.  LEDBAT may be replaced with a different algorithm when
1502	   the work of the IETF working group on RTP Media Congestion Avoidance
1503	   Techniques (RMCAT) [RMCATCHART] matures.

1505	8.1.  Chunk Size

1507	   In general, an UDP datagram containing PPSPP messages SHOULD fit
1508	   inside a single IP packet, so its maximum size depends on the MTU of
1509	   the network.  The default is to use fixed-sized chunks of 1 kilobyte
1510	   such that a UDP datagram with a DATA message can be transmitted as a
1511	   single IP packet over an Ethernet network with 1500-byte frames.
1512	   PPSPP implementations can use larger chunk sizes.  For example, on
1513	   CPU-limited hardware 8 kilobyte chunks MAY be used, transported as a
1514	   single UDP datagram fragmented over multiple IP packets (with the
1515	   increased chance of that UDP datagram getting lost).  The chunk
1516	   addressing schemes can all work with different chunk sizes, see
1517	   Section 4.

1519	   The chunk size used for a particular swarm MUST be part of the
1520	   swarm's metadata (which then consists of the swarm ID and the chunk
1521	   size), unless it is the 1 KB default.

1523	8.2.  Datagrams and Messages

1525	   When using UDP, the abstract datagram described above corresponds
1526	   directly to a UDP datagram.  Most messages within a datagram have a
1527	   fixed length, which generally depends on the type of the message.
1528	   The first byte of a message denotes its type.  The currently defined
1529	   types are:

1531	   o  HANDSHAKE = 0x00

1533	   o  DATA = 0x01

1535	   o  ACK = 0x02

1537	   o  HAVE = 0x03

1539	   o  INTEGRITY = 0x04

1541	   o  PEX_RESv4 = 0x05

1543	   o  PEX_REQ = 0x06

1545	   o  SIGNED_INTEGRITY = 0x07

1547	   o  REQUEST = 0x08

1549	   o  CANCEL = 0x09

1551	   o  CHOKE = 0x0a

1553	   o  UNCHOKE = 0x0b

1555	   o  PEX_RESv6 = 0x0c

1557	   o  PEX_REScert = 0x0d

1559	   Furthermore, integers are serialized in the network (big-endian) byte
1560	   order.  So consider the example of a HAVE message (Section 3.2) using
1561	   bin chunk addressing.  It has message type of 0x02 and a payload of a
1562	   bin number, a four-byte integer (say, 1); hence, its on the wire
1563	   representation for UDP can be written in hex as: "0200000001".

1565	   All messages are idempotent or recognizable as duplicates.  In
1566	   particular, a peer MAY resend DATA, ACK, HAVE, INTEGRITY, PEX_*,
1567	   SIGNED_INTEGRITY, REQUEST, CANCEL, CHOKE and UNCHOKE messages without
1568	   problems when loss is suspected.  When a peer resends a HANDSHAKE
1569	   message it can be recognized as duplicate by the receiver and be
1570	   dealt with.

1572	8.3.  Channels

1574	   As it is increasingly complex for peers to enable UDP communication
1575	   between each other due to NATs and firewalls, PPSPP-over-UDP uses a
1576	   multiplexing scheme, called "channels", to allow multiple swarms to
1577	   use the same UDP port.  Channels loosely correspond to TCP
1578	   connections and each channel belongs to a single swarm.  To support
1579	   channels, each datagram starts with four bytes corresponding to the
1580	   receiving channel number.

1582	8.4.  HANDSHAKE

1584	   A channel is established with a handshake.  To start a handshake, the
1585	   initiating peer needs to know:

1587	   1.  the IP address of a peer

1589	   2.  peer's UDP port and

1591	   3.  the swarm ID of the content (see Section 5.1 and Section 6).

1593	   4.  the chunk size used, unless the 1 KB default

1595	   To do the handshake the initiating peer sends a datagram that MUST
1596	   start with an all 0-zeros channel number, followed by a HANDSHAKE
1597	   message, whose payload is a locally unused channel number and a list
1598	   of protocol options.

1600	   On the wire the datagram will look something like this:

1602	       (CHANNEL) 00000000 HANDSHAKE 00000011 v=01 si=123...1234 ca=0 end

1604	   (to unknown channel, handshake from channel 0x11 speaking protocol
1605	   version 0x01, initiating a transfer of a file with a root hash
1606	   123...1234 using bins for chunk addressing)

1608	   The receiving peer MAY respond in which case the returned datagram
1609	   MUST consist of the channel number from the sender's HANDSHAKE
1610	   message, a HANDSHAKE message, whose payload is a locally unused
1611	   channel number and a list of protocol options, followed by any other
1612	   messages it wants to send.

1614	   Peer's response datagram on the wire:

1616	       (CHANNEL) 00000011 HANDSHAKE 00000022 v=01 protocol options end

1618	   (peer to the initiator: use channel number 0x22 for this transfer and
1619	   proto version 0x01.)
1620	   At this point, the initiator knows that the peer really responds; for
1621	   that purpose channel IDs MUST be random enough to prevent easy
1622	   guessing.  So, the third datagram of a handshake MAY already contain
1623	   some heavy payload.  To minimize the number of initialization
1624	   roundtrips, the first two datagrams MAY also contain some minor
1625	   payload, e.g. a couple of HAVE messages roughly indicating the
1626	   current progress of a peer or a REQUEST (see Section 3.7).  When
1627	   receiving the third datagram, both peers have the proof they really
1628	   talk to each other; the three-way handshake is complete.

1630	   A peer MAY explicit close a channel by sending a HANDSHAKE message
1631	   that MUST contain an all 0-zeros channel number and a list of
1632	   protocol options.  The list MUST be either empty or contain the
1633	   maximum version number the sender supports, following the Min/max
1634	   versioning scheme defined in [RFC6709], Section 4.1.

1636	   On the wire:

1638	       (CHANNEL) 00000022 HANDSHAKE 00000000 end

1640	8.5.  HAVE

1642	   A HAVE message (type 0x03) consists of a chunk specification that
1643	   states that the sending peer has those chunks and successfully
1644	   checked their integrity.  A bin consists of a single integer, and a
1645	   chunk or byte range of two integers, of the width specified by the
1646	   Chunk Addressing protocol options, encoded big endian.

1648	   A HAVE message for bin 3 on the wire:

1650	       HAVE 00000003

1652	   (received and checked first four kilobytes of a file/stream)

1654	8.6.  DATA

1656	   A DATA message (type 0x01) consists of a chunk specification, a
1657	   timestamp and the actual chunk.  In case a datagram contains one DATA
1658	   message, a sender MUST always put the DATA message in the tail of the
1659	   datagram.  A datagram MAY contain multiple DATA messages unless one
1660	   of the chunks is the last chunk and smaller than the chunk size.  As
1661	   the LEDBAT congestion control is used, a sender MUST include a
1662	   timestamp, in particular, a 64-bit integer representing the current
1663	   system time with microsecond accuracy.  The timestamp MUST be
1664	   included between chunk specification and the actual chunk.

1666	   A DATA message for bin 0, with timestamp 12345678, and some data on
1667	   the wire:

1669	       DATA 00000000 12345678 48656c6c6f20776f726c6421

1671	   (This message accommodates an entire file: "Hello world!")

1673	8.7.  ACK

1675	   An ACK message (type 0x02) acknowledges data that was received from
1676	   its addressee; to comply with the LEDBAT delay-based congestion
1677	   control an ACK message consists of a chunk specification and a
1678	   timestamp representing an one-way delay sample.  The one-way delay
1679	   sample is a 64-bit integer with microsecond accuracy, and is computed
1680	   from the timestamp received from the previous DATA message containing
1681	   the chunk being acknowledged following the LEDBAT specification.

1683	   An ACK message for bin 2 with one-way delay 12345678 on the wire:

1685	       ACK 00000002 12345678

1687	8.8.  INTEGRITY

1689	   An INTEGRITY message (type 0x04) consists of a chunk specification
1690	   and the cryptographic hash for the specified chunk or node.  The type
1691	   and format of the hash depends on the protocol options.

1693	   An INTEGRITY message for bin 0 with a SHA1 hash on the wire:

1695	       INTEGRITY 00000000 1234123412341234123412341234123412341234

1697	8.9.  SIGNED_INTEGRITY

1699	   A SIGNED_INTEGRITY message (type 0x07) consists of a chunk
1700	   specification and a digital signature encoded as in DNSSEC without
1701	   the BASE-64 encoding [RFC4034].  The signature algorithm is defined
1702	   by the Live Signature Algorithm protocol option, see Section 7.7.

1704	8.10.  REQUEST

1706	   A REQUEST message (type 0x08) consists of a chunk specification for
1707	   the chunks the requester wants to download.

1709	8.11.  CANCEL

1711	   A CANCEL message (type 0x09) consists of a chunk specification for
1712	   the chunks the requester no longer is interested in.

1714	8.12.  CHOKE and UNCHOKE

1716	   Both CHOKE and UNCHOKE messages (types 0x0a and 0x0b, respectively)
1717	   carry no payload.

1719	8.13.  PEX_REQ, PEX_RESv4, PEX_RESv6 and PEX_REScert

1721	   A PEX_REQ (0x06) message has no payload.  A PEX_RES (0x05) message
1722	   consists of an IPv4 address in big endian format followed by a UDP
1723	   port number in big endian format.  A PEX_RESv6 (0x0c) message
1724	   contains a 128-bit IPv6 address instead of an IPv4 one.  If sender of
1725	   the PEX_REQ message does not have a private or link-local address,
1726	   then the PEX_RES* messages MUST NOT contain such addresses
1727	   [RFC1918][RFC4291].

1729	   A PEX_REScert (0x0d) message consists of a 16-bit integer in big
1730	   endian specifying the size of the membership certificate that
1731	   follows, see Section 12.2.1.  This membership certificate states that
1732	   peer P at time T is a member of swarm S and is a X.509v3 certificate
1733	   [RFC2459] that is encoded using the ASN.1 distinguished encoding
1734	   rules (DER) [CCITT.X208.1988].  The certificate MUST contain a
1735	   "Subject Alternative Name" extension, marked as critical, of type
1736	   uniformResourceIdentifier.

1738	   The URL the name extension contains MUST follow the generic syntax
1739	   for URLs [RFC3986], where its scheme component is "ppsp", the host in
1740	   the authority component is the DNS name or IP address of peer P, the
1741	   port in the authority component is the port of peer P, and the path
1742	   contains the swarm identifier for swarm S, in hexadecimal form.  In
1743	   particular, the preferred form of the swarm identifier is xxyyzz...,
1744	   where the 'x's, 'y's and 'z's are 2 hexadecimal digits of the 8-bit
1745	   pieces of the identifier.  The validity time of the certificate is
1746	   set with notBefore UTCTime set to T and notAfter UTCTime set to T
1747	   plus some expiry time defined by the issuer.  An example URL:

1749	       ppsp://192.168.0.1:6778/e5a12c7ad2d8fab33c699d1e198d66f79fa610c3

1751	8.14.  KEEPALIVE

1753	   Keepalives do not have a message type on UDP.  They are just simple
1754	   datagrams consisting of a 4-byte channel number only.

1756	   On the wire:

1758	       (CHANNEL) 00000022

1760	   A guideline for declaring a peer dead consist of a 3 minute delay
1761	   since that last packet has been received from that peer, and at least
1762	   3 datagrams were sent to that peer during the same period.

1764	8.15.  Flow and Congestion Control

1766	   Explicit flow control is not necessary in PPSPP-over-UDP.  In the
1767	   case of video-on-demand the receiver will request data explicitly
1768	   from peers and is therefore in control of how much data is coming
1769	   towards it.  In the case of live streaming, where a push-model may be
1770	   used, the amount of data incoming is limited to the bitrate, which
1771	   the receiver must be able to process otherwise it cannot play the
1772	   stream.  Should, for any reason, the receiver get saturated with data
1773	   that situation is perfectly detected by the congestion control.
1774	   PPSPP-over-UDP can support different congestion control algorithms.
1775	   At present, it uses the LEDBAT congestion control algorithm
1776	   [I-D.ietf-ledbat-congestion].

1778	9.  Extensibility

1780	9.1.  Chunk Picking Algorithms

1782	   Chunk (or piece) picking entirely depends on the receiving peer.  The
1783	   sender peer is made aware of preferred chunks by the means of REQUEST
1784	   messages.  In some (live) scenarios it may be beneficial to allow the
1785	   sender to ignore those hints and send unrequested data.

1787	   The chunk picking algorithm is external to the PPSPP protocol and
1788	   will generally be a pluggable policy that uses the mechanisms
1789	   provided by PPSPP.  The algorithm will handle the choices made by the
1790	   user consuming the content, such as seeking, switching audio tracks
1791	   or subtitles.  Example policies for P2P streaming can be found in
1792	   [BITOS], and [EPLIVEPERF].

1794	9.2.  Reciprocity Algorithms

1796	   The role of reciprocity algorithms in peer-to-peer systems is to
1797	   promote client contribution and prevent freeriding.  A peer is said
1798	   to be freeriding if it only downloads content but never uploads to
1799	   others.  Examples of reciprocity algorithms are tit-for-tat as used
1800	   in BitTorrent [TIT4TAT] and Give-to-Get [GIVE2GET].  In PPSPP,
1801	   reciprocity enforcement is the sole responsibility of the sender
1802	   peer.

1804	10.  Acknowledgements

1806	   Arno Bakker, Riccardo Petrocco and Victor Grishchenko are partially
1807	   supported by the P2P-Next project (http://www.p2p-next.org/), a
1808	   research project supported by the European Community under its 7th
1809	   Framework Programme (grant agreement no. 216217).  The views and
1810	   conclusions contained herein are those of the authors and should not
1811	   be interpreted as necessarily representing the official policies or
1812	   endorsements, either expressed or implied, of the P2P-Next project or
1813	   the European Commission.

1815	   The PPSPP protocol was designed by Victor Grishchenko at Technische
1816	   Universiteit Delft.  The authors would like to thank the following
1817	   people for their contributions to this draft: the chairs and members
1818	   of the IETF PPSP working group, and Mihai Capota, Raul Jimenez,
1819	   Flutra Osmani, Johan Pouwelse, and Raynor Vliegendhart.

1821	11.  IANA Considerations

1823	   To be determined.

1825	12.  Security Considerations

1827	   As any other network protocol, the PPSPP faces a common set of
1828	   security challenges.  An implementation must consider the possibility
1829	   of buffer overruns, DoS attacks and manipulation (i.e. reflection
1830	   attacks).  Any guarantee of privacy seems unlikely, as the user is
1831	   exposing its IP address to the peers.  A probable exception is the
1832	   case of the user being hidden behind a public NAT or proxy.  This
1833	   section discusses the protocol's security considerations in detail.

1835	12.1.  Security of the Handshake Procedure

1837	   Borrowing from the analysis in [RFC5971], the PPSP peer protocol may
1838	   be attacked with 3 types of denial-of-service attacks:

1840	   1.  DOS amplification attack: attackers try to use a PPSPP peer to
1841	       generate more traffic to a victim.

1843	   2.  DOS flood attack: attackers try to deny service to other peers by
1844	       allocating lots of state at a PPSPP peer.

1846	   3.  Disrupt service to an individual peer: attackers send bogus e.g.
1847	       REQUEST and HAVE messages appearing to come from victim peer A to
1848	       the peers B1..Bn serving that peer.  This causes A to receive
1849	       chunks it did not request or to not receive the chunks it
1850	       requested.

1852	   The basic scheme to protect against these attacks is the use of a
1853	   secure handshake procedure.  In the UDP encapsulation the handshake
1854	   procedure is secured by the use of randomly chosen channel IDs as
1855	   follows.  The channel IDs must be generated following the
1856	   requirements in [RFC4960](Sec. 5.1.3).

1858	   When UDP is used, all datagrams carrying PPSPP messages are prefixed
1859	   with a 4-byte channel ID.  These channel IDs are random numbers,
1860	   established during the handshake phase as follows.  Peer A initiates
1861	   an exchange with peer B by sending a datagram containing a HANDSHAKE
1862	   message prefixed with the channel ID consisting of all 0s.  Peer A's
1863	   HANDSHAKE contains a randomly chosen channel ID, chanA:

1865	   A->B: chan0 + HANDSHAKE(chanA) + ...

1867	   When peer B receives this datagram, it creates some state for peer A,
1868	   that at least contains the channel ID chanA.  Next, peer B sends a
1869	   response to A, consisting of a datagram containing a HANDSHAKE
1870	   message prefixed with the chanA channel ID.  Peer B's HANDSHAKE
1871	   contains a randomly chosen channel ID, chanB.

1873	   B->A: chanA + HANDSHAKE(chanB) + ...

1875	   Peer A now knows that peer B really responds, as it echoed chanA.  So
1876	   the next datagram that A sends may already contain heavy payload,
1877	   i.e., a chunk.  This next datagram to B will be prefixed with the
1878	   chanB channel ID.  When B receives this datagram, both peers have the
1879	   proof they are really talking to each other, the three-way handshake
1880	   is complete.  In other words, the randomly chosen channel IDs act as
1881	   tags (cf. [RFC4960](Sec. 5.1)).

1883	   A->B: chanB + HAVE + DATA + ...

1885	12.1.1.  Protection against attack 1

1887	   In short, PPSPP does a so-called return routability check before
1888	   heavy payload is sent.  This means that attack 1 is fended off: PPSPP
1889	   does not send back much more data than it received, unless it knows
1890	   it is talking to a live peer.  Attackers sending a spoofed HANDSHAKE
1891	   to B pretending to be A now need to intercept the message from B to A
1892	   to get B to send heavy payload, and ensure that that heavy payload
1893	   goes to the victim, something assumed too hard to be a practical
1894	   attack.

1896	   Note the rule is that no heavy payload may be sent until the third
1897	   datagram.  This has implications for PPSPP implementations that use
1898	   chunk addressing schemes that are verbose.  If a PPSPP implementation
1899	   uses large bitmaps to convey chunk availability these may not be sent
1900	   by peer B in the second datagram.

1902	12.1.2.  Protection against attack 2

1904	   On receiving the first datagram peer B will record some state about
1905	   peer A. At present this state consists of the chanA channel ID, and
1906	   the results of processing the other messages in the first datagram.
1907	   In particular, if A included some HAVE messages, B may add a chunk
1908	   availability map to A's state.  In addition, B may request some
1909	   chunks from A in the second datagram, and B will maintain state about
1910	   these outgoing requests.

1912	   So presently, PPSPP is somewhat vulnerable to attack 2.  An attacker
1913	   could send many datagrams with HANDSHAKEs and HAVEs and thus allocate
1914	   state at the PPSPP peer.  Therefore peer A MUST respond immediately
1915	   to the second datagram, if it is still interested in peer B.

1917	   The reason for using this slightly vulnerable three-way handshake
1918	   instead of the safer handshake procedure of SCTP [RFC4960](Sec. 5.1)
1919	   is quicker response time for the user.  In the SCTP procedure, peer A
1920	   and B cannot request chunks until datagrams 3 and 4 respectively, as
1921	   opposed to 2 and 1 in the proposed procedure.  This means that the
1922	   user has to wait shorter in PPSPP between starting the video stream
1923	   and seeing the first images.

1925	12.1.3.  Protection against attack 3

1927	   In general, channel IDs serve to authenticate a peer.  Hence, to
1928	   attack, a malicious peer T would need to be able to eavesdrop on
1929	   conversations between victim A and a benign peer B to obtain the
1930	   channel ID B assigned to A, chanB.  Furthermore, attacker T would
1931	   need to be able to spoof e.g.  REQUEST and HAVE messages from A to
1932	   cause B to send heavy DATA messages to A, or prevent B from sending
1933	   them, respectively.

1935	   The capability to eavesdrop is not common, so the protection afforded
1936	   by channel IDs will be sufficient in most cases.  If not, point-to-
1937	   point encryption of traffic should be used, see below.

1939	12.2.  Secure Peer Address Exchange

1941	   As described in Section 3.10, a peer A can send a Peer-Exchange
1942	   message PEX_RES to a peer B, which contains the IP address and port
1943	   of other peers that are supposedly also in the current swarm.  The
1944	   strength of this mechanism is that it allows decentralized tracking:
1945	   after an initial bootstrap no central tracker is needed anymore.  The
1946	   vulnerability of this mechanism (and DHTs) is that malicious peers
1947	   can use it for an Amplification attack.

1949	   In particular, a malicious peer T could send a PEX_RES to well-
1950	   behaved peer A containing a list of address B1,B2,...,BN and on
1951	   receipt, peer A could send a HANDSHAKE to all these peers.  So in the
1952	   worst case, a single datagram results in N datagrams.  The actual
1953	   damage depends on A's behaviour.  E.g. when A already has sufficient
1954	   connections it may not connect to the offered ones at all, but if it
1955	   is a fresh peer it may connect to all directly.

1957	   In addition, PEX can be used in Eclipse attacks [ECLIPSE] where
1958	   malicious peers try to isolate a particular peer such that it only
1959	   interacts with malicious peers.  Let us distinguish two specific
1960	   attacks:

1962	      E1.  Malicious peers try to eclipse the single injector in live
1963	      streaming.

1965	      E2.  Malicious peers try to eclipse a specific consumer peer.

1967	   Attack E1 has the most impact on the system as it would disrupt all
1968	   peers.

1970	12.2.1.  Protection against the Amplification Attack

1972	   If peer addresses are relatively stable, strong protection against
1973	   the attack can be provided by using public key cryptography and
1974	   certification.  In particular, a PEX_RES message will carry swarm-
1975	   membership certificates rather than IP address and port.  A
1976	   membership certificate for peer B states that peer B at address
1977	   (ipB,portB) is part of swarm S at time T and is cryptographically
1978	   signed.  The receiver A can check the cert for a valid signature, the
1979	   right swarm and liveliness and only then consider contacting B. These
1980	   swarm-membership certificates correspond to signed node descriptors
1981	   in secure decentralized peer sampling services [SPS].

1983	   Several designs are possible for the security environment for these
1984	   membership certificates.  That is, there are different designs
1985	   possible for who signs the membership certificates and how public
1986	   keys are distributed.  As an example, we describe a design where the
1987	   PPSP tracker acts as certification authority.

1989	12.2.2.  Example: Tracker as Certification Authority

1991	   A peer A wanting to join swarm S sends a certificate request message
1992	   to a tracker X for that swarm.  Upon receipt, the tracker creates a
1993	   membership certificate from the request with swarm ID S, a timestamp
1994	   T and the external IP and port it received the message from, signed
1995	   with the tracker's private key.  This certificate is returned to A.

1997	   Peer A then includes this certificate when it sends a PEX_RES to peer
1998	   B. Receiver B verifies it against the tracker public key.  This
1999	   tracker public key should be part of the swarm's metadata, which B
2000	   received from a trusted source.  Subsequently, peer B can send the
2001	   member certificate of A to other peers in PEX_RES messages.

2003	   Peer A can send the certification request when it first contacts the
2004	   tracker, or at a later time.  Furthermore, the responses the tracker
2005	   sends could contain membership certificates instead of plain
2006	   addresses, such that they can be gossiped securely as well.

2008	   We assume the tracker is protected against attacks and does a return
2009	   routability check.  The latter ensures that malicious peers cannot
2010	   obtain a certificate for a random host, just for hosts where they can
2011	   eavesdrop on incoming traffic.

2013	   The load generated on the tracker depends on churn and the lifetime
2014	   of a certificate.  Certificates can be fairly long lived, given that
2015	   the main goal of the membership certificates is to prevent that
2016	   malicious peer T can cause good peer A to contact *random* hosts.
2017	   The freshness of the timestamp just adds extra protection in addition
2018	   to achieving that goal.  It protects against malicious hosts causing
2019	   a good peer A to contact hosts that previously participated in the
2020	   swarm.

2022	   The membership certificate mechanism itself can be used for a kind of
2023	   amplification attack against good peers.  Malicious peer T can cause
2024	   peer A to spend some CPU to verify the signatures on the membership
2025	   certificates that T sends.  To counter this, A SHOULD check a few of
2026	   the certificates sent and discard the rest if they are defective.

2028	   The same membership certificates described above can be registered in
2029	   a Distributed Hash Table that has been secured against the well-known
2030	   DHT specific attacks [SECDHTS].

2032	   Note that this scheme does not work for peers behind a symmetric
2033	   Network Address Translator, but neither does normal tracker
2034	   registration.

2036	12.2.3.  Protection Against Eclipse Attacks

2038	   Before we can discuss Eclipse attacks we first need to establish the
2039	   security properties of the central tracker.  A tracker is vulnerable
2040	   to Amplification attacks too.  A malicious peer T could register a
2041	   victim B with the tracker, and many peers joining the swarm will
2042	   contact B. Trackers can also be used in Eclipse attacks.  If many
2043	   malicious peers register themselves at the tracker, the percentage of
2044	   bad peers in the returned address list may become high.  Leaving the
2045	   protection of the tracker to the PPSP tracker protocol specification,
2046	   we assume for the following discussion that it returns a true random
2047	   sample of the actual swarm membership (achieved via Sybil attack
2048	   protection).  This means that if 50% of the peers is bad, you'll
2049	   still get 50% good addresses from the tracker.

2051	   Attack E1 on PEX can be fended off by letting live injectors disable
2052	   PEX.  Or at least, let live injectors ensure that part of their
2053	   connections are to peers whose addresses came from the trusted
2054	   tracker.

2056	   The same measures defend against attack E2 on PEX.  They can also be
2057	   employed dynamically.  When the current set of peers B that peer A is
2058	   connected to doesn't provide good quality of service, A can contact
2059	   the tracker to find new candidates.

2061	12.3.  Support for Closed Swarms (PPSP.SEC.REQ-1)

2063	   The Closed Swarms [CLOSED] and Enhanced Closed Swarms [ECS]
2064	   mechanisms provide swarm-level access control.  The basic idea is
2065	   that a peer cannot download from another peer unless it shows a
2066	   Proof-of-Access.  Enhanced Closed Swarms improve on the original
2067	   Closed Swarms by adding on-the-wire encryption against man-in-the-
2068	   middle attacks and more flexible access control rules.

2070	   The exact mapping of ECS to PPSPP is defined in
2071	   [I-D.gabrijelcic-ppsp-ecs].

2073	12.4.  Confidentiality of Streamed Content (PPSP.SEC.REQ-2+3)

2075	   No extra mechanism is needed to support confidentiality in PPSPP.  A
2076	   content publisher wishing confidentiality should just distribute
2077	   content in cyphertext / DRM-ed format.  In that case it is assumed a
2078	   higher layer handles key management out-of-band.  Alternatively, pure
2079	   point-to-point encryption of content and traffic can be provided by
2080	   the proposed Closed Swarms access control mechanism, or by DTLS
2081	   [RFC6347] or IPsec [RFC4301].

2083	12.5.  Limit Potential Damage and Resource Exhaustion by Bad or Broken
2084	       Peers (PPSP.SEC.REQ-4+6)

2086	   In this section an analysis is given of the potential damage a
2087	   malicious peer can do with each message in the protocol, and how it
2088	   is prevented by the protocol (implementation).

2090	12.5.1.  HANDSHAKE
2091	   o  Secured against DoS amplification attacks as described in
2092	      Section 12.1.

2094	   o  Threat HS.1: An Eclipse attack where peers T1..TN fill all
2095	      connection slots of A by initiating the connection to A.

2097	      Solution: Peer A must not let other peers fill all its available
2098	      connection slots, i.e., A must initiate connections itself too, to
2099	      prevent isolation.

2101	12.5.2.  HAVE

2103	   o  Threat HAVE.1: Malicious peer T can claim to have content which it
2104	      hasn't.  Subsequently T won't respond to requests.

2106	      Solution: peer A will consider T to be a slow peer and not ask it
2107	      again.

2109	   o  Threat HAVE.2: Malicious peer T can claim not to have content.
2110	      Hence it won't contribute.

2112	      Solution: Peer and chunk selection algorithms external to the
2113	      protocol will implement fairness and provide sharing incentives.

2115	12.5.3.  DATA

2117	   o  Threat DATA.1: peer T sending bogus chunks.

2119	      Solution: The content integrity protection schemes defend against
2120	      this.

2122	   o  Threat DATA.2: peer T sends peer A unrequested chunks.

2124	      To protect against this threat we need network-level DoS
2125	      prevention.

2127	12.5.4.  ACK

2129	   o  Threat ACK.1: peer T acknowledges wrong chunks.

2131	      Solution: peer A will detect inconsistencies with the data it sent
2132	      to T.

2134	   o  Threat ACK.2: peer T modifies timestamp in ACK to peer A used for
2135	      time-based congestion control.

2137	      Solution: In theory, by decreasing the timestamp peer T could fake
2138	      there is no congestion when in fact there is, causing A to send
2139	      more data than it should.  [I-D.ietf-ledbat-congestion] does not
2140	      list this as a security consideration.  Possibly this attack can
2141	      be detected by the large resulting asymmetry between round-trip
2142	      time and measured one-way delay.

2144	12.5.5.  INTEGRITY and SIGNED_INTEGRITY

2146	   o  Threat INTEGRITY.1: An amplification attack where peer T sends
2147	      bogus INTEGRITY or SIGNED_INTEGRITY messages, causing peer A to
2148	      checks hashes or signatures, thus spending CPU unnecessarily.

2150	      Solution: If the hashes/signatures don't check out A will stop
2151	      asking T because of the atomic datagram principle and the content
2152	      integrity protection.  Subsequent unsolicited traffic from T will
2153	      be ignored.

2155	12.5.6.  REQUEST

2157	   o  Threat REQUEST.1: peer T could request lots from A, leaving A
2158	      without resources for others.

2160	      Solution: A limit is imposed on the upload capacity a single peer
2161	      can consume, for example, by using an upload bandwidth scheduler
2162	      that takes into account the need of multiple peers.  A natural
2163	      upper limit of this upload quotum is the bitrate of the content,
2164	      taking into account that this may be variable.

2166	12.5.7.  CANCEL

2168	   o  Threat CANCEL.1: peer T sends CANCEL messages for content it never
2169	      requested to peer A.

2171	      Solution: peer A will detect the inconsistency of the messages and
2172	      ignore them.  Note that CANCEL messages may be received
2173	      unexpectedly when a transport is used where REQUEST messages may
2174	      be lost or reordered with respect to the subsequent CANCELs.

2176	12.5.8.  CHOKE

2178	   o  Threat CHOKE.1: peer T sends REQUEST messages after peer A sent B
2179	      a CHOKE message.

2181	      Solution: peer A will just discard the unwanted REQUESTs and
2182	      resend the CHOKE, assuming it got lost.

2184	12.5.9.  UNCHOKE

2186	   o  Threat UNCHOKE.1: peer T sends an UNCHOKE message to peer A
2187	      without having sent a CHOKE message before.

2189	      Solution: peer A can easily detect this violation of protocol
2190	      state, and ignore it.  Note this can also happen due to loss of a
2191	      CHOKE message sent by a benign peer.

2193	   o  Threat UNCHOKE.2: peer T sends an UNCHOKE message to peer A, but
2194	      subsequently does not respond to its REQUESTs.

2196	      Solution: peer A will consider T to be a slow peer and not ask it
2197	      again.

2199	12.5.10.  PEX_RES

2201	   o  Secured against amplification and Eclipse attacks as described in
2202	      Section 12.2.

2204	12.5.11.  Unsolicited Messages in General

2206	   o  Threat: peer T could send a spoofed PEX_REQ or REQUEST from peer B
2207	      to peer A, causing A to send a PEX_RES/DATA to B.

2209	      Solution: the message from peer T won't be accepted unless T does
2210	      a handshake first, in which case the reply goes to T, not victim
2211	      B.

2213	12.6.  Exclude Bad or Broken Peers (PPSP.SEC.REQ-5)

2215	   A receiving peer can detect malicious or faulty senders as just
2216	   described, which it can then subsequently ignore.  However, excluding
2217	   such a bad peer from the system completely is complex.  Random
2218	   monitoring by trusted peers that would blacklist bad peers as
2219	   described in [DETMAL] is one option.  This mechanism does require
2220	   extra capacity to run such trusted peers, which must be
2221	   indistinguishable from regular peers, and requires a solution for the
2222	   timely distribution of this blacklist to peers in a scalable manner.

2224	13.  References

2226	13.1.  Normative References

2228	   [CCITT.X208.1988]
2229	              International International Telephone and Telegraph
2230	              Consultative Committee, "Specification of Abstract Syntax
2231	              Notation One (ASN.1)", CCITT Recommendation X.208,
2232	              November 1988.

2234	   [FIPS180-3]
2235	              Information Technology Laboratory,  National Institute of
2236	              Standards and Technology, "Federal Information Processing
2237	              Standards: Secure Hash Standard (SHS)", Publication 180-3,
2238	              Oct 2008.

2240	   [I-D.ietf-ledbat-congestion]
2241	              Shalunov, S., Hazel, G., Iyengar, J., and M. Kuehlewind,
2242	              "Low Extra Delay Background Transport (LEDBAT)",
2243	              draft-ietf-ledbat-congestion-10 (work in progress),
2244	              September 2012.

2246	   [IANADNSSECALGNUM]
2247	              IANA, "Domain Name System Security (DNSSEC) Algorithm
2248	              Numbers",
2249	              <http://www.iana.org/assignments/dns-sec-alg-numbers>.

2251	   [RFC1918]  Rekhter, Y., Moskowitz, R., Karrenberg, D., Groot, G., and
2252	              E. Lear, "Address Allocation for Private Internets",
2253	              BCP 5, RFC 1918, February 1996.

2255	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
2256	              Requirement Levels", BCP 14, RFC 2119, March 1997.

2258	   [RFC2459]  Housley, R., Ford, W., Polk, T., and D. Solo, "Internet
2259	              X.509 Public Key Infrastructure Certificate and CRL
2260	              Profile", RFC 2459, January 1999.

2262	   [RFC3986]  Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
2263	              Resource Identifier (URI): Generic Syntax", STD 66,
2264	              RFC 3986, January 2005.

2266	   [RFC4034]  Arends, R., Austein, R., Larson, M., Massey, D., and S.
2267	              Rose, "Resource Records for the DNS Security Extensions",
2268	              RFC 4034, March 2005.

2270	   [RFC4291]  Hinden, R. and S. Deering, "IP Version 6 Addressing
2271	              Architecture", RFC 4291, February 2006.

2273	   [RFC6709]  Carpenter, B., Aboba, B., and S. Cheshire, "Design
2274	              Considerations for Protocol Extensions", RFC 6709,
2275	              September 2012.

2277	13.2.  Informative References

2279	   [ABMRKL]   Bakker, A., "Merkle hash torrent extension", BitTorrent
2280	              Enhancement Proposal 30, Mar 2009,
2281	              <http://bittorrent.org/beps/bep_0030.html>.

2283	   [BINMAP]   Grishchenko, V. and J. Pouwelse, "Binmaps: hybridizing
2284	              bitmaps and binary trees", Technical Report PDS-2011-005,
2285	              Parallel and  Distributed Systems Group, Fac. of
2286	              Electrical Engineering,  Mathematics, and Computer
2287	              Science, Delft University of Technology,  The Netherlands,
2288	              Apr 2009.

2290	   [BITOS]    Vlavianos, A., Iliofotou, M., Mathieu, F., and M.
2291	              Faloutsos, "BiToS: Enhancing BitTorrent for Supporting
2292	              Streaming Applications", IEEE INFOCOM Global Internet
2293	              Symposium Barcelona, Spain, Apr 2006.

2295	   [BITTORRENT]
2296	              Cohen, B., "The BitTorrent Protocol Specification",
2297	              BitTorrent Enhancement Proposal 3, Feb 2008,
2298	              <http://bittorrent.org/beps/bep_0003.html>.

2300	   [CLOSED]   Borch, N., Mitchell, K., Arntzen, I., and D. Gabrijelcic,
2301	              "Access Control to BitTorrent Swarms Using Closed Swarms",
2302	              ACM workshop on Advanced Video Streaming Techniques for
2303	              Peer-to-Peer Networks and Social Networking (AVSTP2P '10),
2304	              Florence, Italy, Oct 2010,
2305	              <http://doi.acm.org/10.1145/1877891.1877898>.

2307	   [DETMAL]   Shetty, S., Galdames, P., Tavanapong, W., and Ying. Cai,
2308	              "Detecting Malicious Peers in Overlay Multicast
2309	              Streaming", IEEE Conference on Local Computer
2310	              Networks (LCN'06). Tampa, FL, USA, Nov 2006.

2312	   [ECLIPSE]  Sit, E. and R. Morris, "Security Considerations for Peer-
2313	              to-Peer Distributed Hash Tables", IPTPS '01: Revised
2314	              Papers from the First International Workshop on Peer-to-
2315	              Peer Systems pp. 261-269, Springer-Verlag, 2002.

2317	   [ECS]      Jovanovikj, V., Gabrijelcic, D., and T. Klobucar, "Access
2318	              Control in BitTorrent P2P Networks Using the Enhanced
2319	              Closed Swarms Protocol", International Conference on
2320	              Emerging Security  Information, Systems and
2321	              Technologies (SECURWARE 2011), pp. 97-102, Nice, France,
2322	              Aug 2011.

2324	   [EPLIVEPERF]
2325	              Bonald, T., Massoulie, L., Mathieu, F., Perino, D., and A.
2326	              Twigg, "Epidemic Live Streaming: Optimal Performance
2327	              Trade-offs", Proceedings of the 2008 ACM SIGMETRICS
2328	              International  Conference on Measurement and Modeling of
2329	              Computer Systems Annapolis, MD, USA, Jun 2008.

2331	   [GIVE2GET]
2332	              Mol, J., Pouwelse, J., Meulpolder, M., Epema, D., and H.
2333	              Sips, "Give-to-Get: Free-riding Resilient Video-on-demand
2334	              in P2P Systems", Proceedings Multimedia Computing and
2335	              Networking conference (Proceedings of SPIE Vol. 6818) San
2336	              Jose, California, USA, Jan 2008.

2338	   [HAC01]    Menezes, A., van Oorschot, P., and S. Vanstone, "Handbook
2339	              of Applied Cryptography", CRC Press, (Fifth Printing,
2340	              August 2001), Oct 1996.

2342	   [I-D.gabrijelcic-ppsp-ecs]
2343	              Gabrijelcic, D., "Enhanced Closed Swarm protocol",
2344	              draft-ppsp-gabrijelcic-ecs (work in progress),
2345	              November 2012.

2347	   [I-D.ietf-ppsp-reqs]
2348	              Williams, C., Xiao, L., Zong, N., Pascual, V., and Y.
2349	              Zhang, "P2P Streaming Protocol (PPSP) Requirements",
2350	              draft-ietf-ppsp-reqs-05 (work in progress), October 2011.

2352	   [JIM11]    Jimenez, R., Osmani, F., and B. Knutsson, "Sub-Second
2353	              Lookups on a Large-Scale Kademlia-Based Overlay", IEEE
2354	              International Conference on Peer-to-Peer
2355	              Computing (P2P'11), Kyoto, Japan, Aug 2011.

2357	   [MERKLE]   Merkle, R., "Secrecy, Authentication, and Public Key
2358	              Systems", Ph.D. thesis Dept. of Electrical Engineering,
2359	              Stanford University, CA, USA, pp 40-45, 1979.

2361	   [POLLIVE]  Dhungel, P., Hei, Xiaojun., Ross, K., and N. Saxena,
2362	              "Pollution in P2P Live Video Streaming", International
2363	              Journal of Computer Networks & Communications
2364	              (IJCNC) Vol.1, No.2, Jul 2009.

2366	   [RFC2132]  Alexander, S. and R. Droms, "DHCP Options and BOOTP Vendor
2367	              Extensions", RFC 2132, March 1997.

2369	   [RFC4301]  Kent, S. and K. Seo, "Security Architecture for the
2370	              Internet Protocol", RFC 4301, December 2005.

2372	   [RFC4960]  Stewart, R., "Stream Control Transmission Protocol",
2373	              RFC 4960, September 2007.

2375	   [RFC5226]  Narten, T. and H. Alvestrand, "Guidelines for Writing an
2376	              IANA Considerations Section in RFCs", BCP 26, RFC 5226,
2377	              May 2008.

2379	   [RFC5389]  Rosenberg, J., Mahy, R., Matthews, P., and D. Wing,
2380	              "Session Traversal Utilities for NAT (STUN)", RFC 5389,
2381	              October 2008.

2383	   [RFC5971]  Schulzrinne, H. and R. Hancock, "GIST: General Internet
2384	              Signalling Transport", RFC 5971, October 2010.

2386	   [RFC6347]  Rescorla, E. and N. Modadugu, "Datagram Transport Layer
2387	              Security Version 1.2", RFC 6347, January 2012.

2389	   [RMCATCHART]
2390	              Eggert, L. and others, "RTP Media Congestion Avoidance
2391	              Techniques (rmcat) Description of Working Group", 2012,
2392	              <http://datatracker.ietf.org/wg/rmcat/charter/>.

2394	   [SECDHTS]  Urdaneta, G., Pierre, G., and M. van Steen, "A Survey of
2395	              DHT Security Techniques", ACM Computing Surveys vol.
2396	              43(2), Jun 2011.

2398	   [SIGMCAST]
2399	              Wong, C. and S. Lam, "Digital Signatures for Flows and
2400	              Multicasts", IEEE/ACM Transactions on Networking 7(4), pp.
2401	              502-513, 1999.

2403	   [SNP]      Ford, B., Srisuresh, P., and D. Kegel, "Peer-to-Peer
2404	              Communication Across Network Address Translators",
2405	              Feb 2005, <http://www.brynosaurus.com/pub/net/p2pnat/>.

2407	   [SPS]      Jesi, G., Montresor, A., and M. van Steen, "Secure Peer
2408	              Sampling", Computer Networks vol. 54(12), pp. 2086-2098,
2409	              Elsevier, Aug 2010.

2411	   [SWIFTIMPL]
2412	              Grishchenko, V., Paananen, J., Pronchenkov, A., Bakker,
2413	              A., and R. Petrocco, "Swift reference implementation",
2414	              2012, <https://github.com/triblerteam/libswift/>.

2416	   [TIT4TAT]  Cohen, B., "Incentives Build Robustness in BitTorrent",
2417	              1st Workshop on Economics of Peer-to-Peer
2418	              Systems, Berkeley, CA, USA, Jun 2003.

2420	Appendix A.  Revision History

2422	   -00	2011-12-19 Initial version.

2424	   -01	2012-01-30 Minor text revision:

2426	       *   Changed heading to "A. Bakker"

2428	       *   Changed title to *Peer* Protocol, and abbreviation PPSPP.

2430	       *   Replaced swift with PPSPP.

2432	       *   Removed Sec. 6.4.  "HTTP (as PPSP)".

2434	       *   Renamed Sec. 8.4. to "Chunk Picking Algorithms".

2436	       *   Resolved Ticket #3: Removed sentence about random set of
2437	           peers.

2439	       *   Resolved Ticket #6: Added clarification to "Chunk Picking
2440	           Algorithms" section.

2442	       *   Resolved Ticket #11: Added Sec. 3.12 on Storage Independence

2444	       *   Resolved Ticket #14: Added clarification to "Automatic Size
2445	           Detection" section.

2447	       *   Resolved Ticket #15: Operation section now states it shows
2448	           example behaviour for a specific set of policies and schemes.

2450	       *   Resolved Ticket #30: Explained why multiple REQUESTs in one
2451	           datagram.

2453	       *   Resolved Ticket #31: Renamed PEX_ADD message to PEX_RES.

2455	       *   Resolved Ticket #32: Renamed Sec 3.8. to "Keep Alive
2456	           Signaling", and updated explanation.

2458	       *   Resolved Ticket #33: Explained NAT hole punching via only
2459	           PPSPP messages.

2461	       *   Resolved Ticket #34: Added section about limited overhead of
2462	           the Merkle hash tree scheme.

2464	   -02	2012-04-17 Major revision

2466	       *   Allow different chunk addressing and content integrity
2467	           protection schemes (ticket #13):

2469	       *   Added chunk ID, chunk specification, chunk addressing scheme,
2470	           etc. to terminology.

2472	       *   Created new Sections 4 and 5 discussing chunk addressing and
2473	           content integrity protection schemes, respectively and moved
2474	           relevant sections on bin numbering and Merkle hash trees
2475	           there.

2477	       *   Renamed Section 4 to "Merkle Hash Trees and The Automatic
2478	           Detection of Content Size".

2480	       *   Reformulated automatic size detection in terms of nodes, not
2481	           bins.

2483	       *   Extended HANDSHAKE message to carry protocol options and
2484	           created Section 8 on Protocol options.  VERSION and
2485	           MSGTYPE_RCVD messages replaced with protocol options.

2487	       *   Renamed HASH message to INTEGRITY.

2489	       *   Renamed HINT to REQUEST.

2491	       *   Added description of chunk addressing via (start,end) ranges.

2493	       *   Resolved Ticket #26: Extended "Security Considerations" with
2494	           section on the handshake procedure.

2496	       *   Resolved Ticket #17: Defined recently as "in last 60 seconds"
2497	           in PEX.

2499	       *   Resolved Ticket #20: Extended "Security Considerations" with
2500	           design to make Peer Address Exchange more secure.

2502	       *   Resolved Ticket #38+39 / PPSP.SEC.REQ-2+3: Extended "Security
2503	           Considerations" with a section on confidentiality of content.

2505	       *   Resolved Ticket #40+42 / PPSP.SEC.REQ-4+6: Extended "Security
2506	           Considerations" with a per-message analysis of threats and
2507	           how PPSPP is protected from them.

2509	       *   Progressed Ticket #41 / PPSP.SEC.REQ-5: Extended "Security
2510	           Considerations" with a section on possible ways of excluding
2511	           bad or broken peers from the system.

2513	       *   Moved Rationale to Appendix.

2515	       *   Resolved Ticket #43: Updated Live Streaming section to
2516	           include "Sign All" content authentication, and reference to
2517	           [SIGMCAST] following discussion with Fabio Picconi.

2519	       *   Resolved Ticket #12: Added a CANCEL message to cancel
2520	           REQUESTs for the same data that were sent to multiple peers
2521	           at the same time in time-critical situations.

2523	   -03	2012-10-22 Major revision

2525	       *   Updated Abstract and Introduction, removing download case.

2527	       *   Resolved Ticket #4: Added explicit CHOKE/UNCHOKE messages.

2529	       *   Removed directory lists unused in streaming.

2531	       *   Resolved Ticket #22, #23, #28: Failure behaviour, error codes
2532	           and dealing with peer crashes.

2534	       *   Resolved Ticket #13: Chunk ranges are the default chunk
2535	           addressing scheme that all peers MUST support.

2537	       *   Added a section on compatibility between chunk addressing
2538	           schemes.

2540	       *   Expanded the explanation of Unified Merkle Trees as a method
2541	           for content integrity protection for live streams.

2543	       *   Added a section on forgetting chunks in live streaming.

2545	       *   Added "End" option to protocol options and corrected bugs in
2546	           UDP encapsulation, following Karl Knutsson's comments.

2548	       *   Added SHA-2 support for Merkle Hash functions.

2550	       *   Added content integrity protection methods for live streaming
2551	           to the relevant protocol option.

2553	       *   Added a Live Signature Algorithm protocol option.

2555	       *   Resolved Ticket #24+27: The choice for UDP + LEDBAT as
2556	           transport has now been reflected in the draft.  TCP and RTP
2557	           encapsulations have been removed.

2559	       *   Superfluous parts of Section 10 on extensibility have been
2560	           removed.

2562	       *   Removed appendix with Rationale.

2564	       *   Resolved Ticket #21+25: PPSPP currently uses LEDBAT and the
2565	           DATA and ACK messages now contain the time fields it
2566	           requires.  Should other congestion control algorithms be
2567	           supported in the future, a protocol option will be added.

2569	   -04	2012-11-07 Minor revision

2571	       *   Corrected typos.

2573	       *   Added empty protocol option list when HANDSHAKE is used for
2574	           explicitly closing a channel in the UDP encapsulation.

2576	       *   Corrected definition of a range chunk specification to be a
2577	           single (start,end) pair.  To send multiple disjunct ranges
2578	           multiple messages should be used.

2580	       *   Clarified that in a range chunk specification the end is
2581	           inclusive.  I.e., [start,end] not [start,end)

2583	       *   Added PEX_REScert message to carry a membership certificate.
2584	           Renamed PEX_RES to PEX_RESv4.

2586	       *   Added a guideline about private and link-local addresses in
2587	           PEX_RES messages.

2589	       *   Defined the format of the public key that is used as swarm ID
2590	           in live streaming.

2592	       *   Clarified that a HANDSHAKE message must be the first message
2593	           in a datagram.

2595	       *   Clarified sending INTEGRITY messages ahead in a separate
2596	           datagram if not all necessary hashes that still need to be
2597	           sent and the chunk fit into a single datagram.  Defined an
2598	           order for the INTEGRITY messages.

2600	       *   Clarified rare case of sending multiple DATA messages in one
2601	           datagram.

2603	       *   Clarified UDP datagrams carrying PPSPP should adhere to the
2604	           network's MTU to avoid IP fragmentation.

2606	       *   Defined value for version protocol option.

2608	       *   Added small clarifications and corrected typos.

2610	       *   Extended versioning scheme to Min/max versioning scheme
2611	           defined in [RFC6709], Section 4.1, following Riccardo
2612	           Bernardini's suggestion.

2614	       *   Processed comments on unclear phrasing from Riccardo
2615	           Bernardini.

2617	       *   Added a guideline on when to declare a peer dead.

2619	       *   Made sure all essential references are listed as Normative
2620	           references following RFC3967.

2622	Authors' Addresses

2624	   Arno Bakker
2625	   Technische Universiteit Delft
2626	   Mekelweg 4
2627	   Delft,   2628CD
2628	   The Netherlands

2630	   Phone:
2631	   Email: arno@cs.vu.nl

2633	   Riccardo Petrocco
2634	   Technische Universiteit Delft
2635	   Mekelweg 4
2636	   Delft,   2628CD
2637	   The Netherlands

2639	   Phone:
2640	   Email: r.petrocco@gmail.com

2642	   Victor Grishchenko
2643	   Technische Universiteit Delft
2644	   Mekelweg 4
2645	   Delft,   2628CD
2646	   The Netherlands

2648	   Phone:
2649	   Email: victor.grishchenko@gmail.com