idnits 2.17.1
draft-ietf-p2psip-diagnostics-19.txt:
Checking boilerplate required by RFC 5378 and the IETF Trust (see
https://trustee.ietf.org/license-info):
----------------------------------------------------------------------------
No issues found here.
Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
----------------------------------------------------------------------------
No issues found here.
Checking nits according to https://www.ietf.org/id-info/checklist :
----------------------------------------------------------------------------
** There are 22 instances of too long lines in the document, the longest
one being 2 characters in excess of 72.
-- The draft header indicates that this document updates RFC6940, but the
abstract doesn't seem to mention this, which it should.
Miscellaneous warnings:
----------------------------------------------------------------------------
== The copyright year in the IETF Trust and authors Copyright Line does not
match the current year
== Line 337 has weird spacing: '...ionType type;...'
== Line 531 has weird spacing: '... opaque diagn...'
(Using the creation date from RFC6940, updated by this document, for
RFC5378 checks: 2008-10-28)
-- The document seems to contain a disclaimer for pre-RFC5378 work, and may
have content which was first submitted before 10 November 2008. The
disclaimer is necessary when there are original authors that you have
been unable to contact, or if some do not wish to grant the BCP78 rights
to the IETF Trust. If you are able to get all authors (current and
original) to grant those rights, you can and should remove the
disclaimer; otherwise, the disclaimer is needed and you can ignore this
comment. (See the Legal Provisions document at
https://trustee.ietf.org/license-info for more information.)
-- The document date (November 26, 2015) is 3075 days in the past. Is this
intentional?
Checking references for intended status: Proposed Standard
----------------------------------------------------------------------------
(See RFCs 3967 and 4897 for information about using normative references
to lower-maturity documents in RFCs)
== Missing Reference: 'TBD1' is mentioned on line 1102, but not defined
== Missing Reference: 'TBD2' is mentioned on line 1093, but not defined
== Missing Reference: 'TBD3' is mentioned on line 1094, but not defined
== Missing Reference: 'TBD4' is mentioned on line 1095, but not defined
== Missing Reference: 'TBD5' is mentioned on line 1096, but not defined
== Missing Reference: 'TBD6' is mentioned on line 1097, but not defined
== Missing Reference: '0x00' is mentioned on line 702, but not defined
== Missing Reference: '0x0F' is mentioned on line 702, but not defined
== Missing Reference: 'TBD7' is mentioned on line 1072, but not defined
== Missing Reference: 'TBD8' is mentioned on line 1073, but not defined
== Outdated reference: A later version (-09) exists of
draft-ietf-p2psip-concepts-07
-- Obsolete informational reference (is this intentional?): RFC 5226
(Obsoleted by RFC 8126)
Summary: 1 error (**), 0 flaws (~~), 14 warnings (==), 4 comments (--).
Run idnits with the --verbose option for more detailed information about
the items above.
--------------------------------------------------------------------------------
2 P2PSIP Working Group H. Song
3 Internet-Draft X. Jiang
4 Updates: 6940 (if approved) R. Even
5 Intended status: Standards Track Huawei
6 Expires: May 29, 2016 D. Bryan
7 ethernot.org
8 Y. Sun
9 ICT
10 November 26, 2015
12 P2P Overlay Diagnostics
13 draft-ietf-p2psip-diagnostics-19
15 Abstract
17 This document describes mechanisms for P2P overlay diagnostics. It
18 defines extensions to the RELOAD P2PSIP base protocol to collect
19 diagnostic information, and details the protocol specifications for
20 these extensions. Useful diagnostic information for connection and
21 node status monitoring is also defined. The document also describes
22 the usage scenarios and provides examples of how these methods are
23 used to perform diagnostics in P2PSIP overlay networks.
25 Status of This Memo
27 This Internet-Draft is submitted in full conformance with the
28 provisions of BCP 78 and BCP 79.
30 Internet-Drafts are working documents of the Internet Engineering
31 Task Force (IETF). Note that other groups may also distribute
32 working documents as Internet-Drafts. The list of current Internet-
33 Drafts is at http://datatracker.ietf.org/drafts/current/.
35 Internet-Drafts are draft documents valid for a maximum of six months
36 and may be updated, replaced, or obsoleted by other documents at any
37 time. It is inappropriate to use Internet-Drafts as reference
38 material or to cite them other than as "work in progress."
40 This Internet-Draft will expire on May 29, 2016.
42 Copyright Notice
44 Copyright (c) 2015 IETF Trust and the persons identified as the
45 document authors. All rights reserved.
47 This document is subject to BCP 78 and the IETF Trust's Legal
48 Provisions Relating to IETF Documents
49 (http://trustee.ietf.org/license-info) in effect on the date of
50 publication of this document. Please review these documents
51 carefully, as they describe your rights and restrictions with respect
52 to this document. Code Components extracted from this document must
53 include Simplified BSD License text as described in Section 4.e of
54 the Trust Legal Provisions and are provided without warranty as
55 described in the Simplified BSD License.
57 This document may contain material from IETF Documents or IETF
58 Contributions published or made publicly available before November
59 10, 2008. The person(s) controlling the copyright in some of this
60 material may not have granted the IETF Trust the right to allow
61 modifications of such material outside the IETF Standards Process.
62 Without obtaining an adequate license from the person(s) controlling
63 the copyright in such materials, this document may not be modified
64 outside the IETF Standards Process, and derivative works of it may
65 not be created outside the IETF Standards Process, except to format
66 it for publication as an RFC or to translate it into languages other
67 than English.
69 Table of Contents
71 1. Introduction
72 2. Terminology
73 3. Diagnostic Scenarios
74 4. Data Collection Mechanisms
75 4.1. Overview of Operations
76 4.2. "Ping-like" Behavior: Extending Ping
77 4.2.1. RELOAD Request Extension: Ping
78 4.3. "Traceroute-like" Behavior: The Path_Track Method
79 4.3.1. New RELOAD Request: PathTrack
80 4.4. Error Code Extensions
81 5. Diagnostic Data Structures
82 5.1. DiagnosticsRequest Data Structure
83 5.2. DiagnosticsResponse Data Structure
84 5.3. dMFlags and Diagnostic Kind ID Types
85 6. Message Processing
86 6.1. Message Creation and Transmission
87 6.2. Message Processing: Intermediate Peers
88 6.3. Message Response Creation
89 6.4. Interpreting Results
90 7. Authorization through Overlay Configuration
91 8. Security Considerations
92 9. IANA Considerations
93 9.1. Diagnostics Flag
94 9.2. Diagnostic Kind ID Types
95 9.3. Message Codes
96 9.4. Error Code
97 9.5. Message Extension
98 9.6. XML Name Space Registration
99 10. Acknowledgments
100 11. References
101 11.1. Normative References
102 11.2. Informative References
103 Appendix A. Examples
104 A.1. Example 1
105 A.2. Example 2
106 A.3. Example 3
107 Appendix B. Problems with Generating Multiple Responses on Path
108 Appendix C. Changes to the Draft
109 C.1. Changes since -00 version
110 C.2. Changes since -01 version
111 C.3. Changes since -02 version
112 C.4. Changes since -03 version
113 C.5. Changes since -04 version
114 C.6. Changes since -05 version
115 C.7. Changes in version -10
116 C.8. Changes in version -15
117 Authors' Addresses
119 1. Introduction
121 In the last few years, overlay networks have rapidly evolved and
122 emerged as a promising platform for deployment of new applications
123 and services in the Internet. One of the reasons overlay networks
124 are seen as an excellent platform for large scale distributed systems
125 is their resilience in the presence of failures. This resilience has
126 three aspects: data replication, routing recovery, and static
127 resilience. Routing recovery algorithms are used to repopulate the
128 routing table with live nodes when failures are detected. Static
129 resilience measures the extent to which an overlay can route around
130 failures even before the recovery algorithm repairs the routing
131 table. Both routing recovery and static resilience rely on accurate
132 and timely detection of failures.
134 There are a number of situations in which some nodes in a Peer-to-
135 Peer (P2P) overlay may malfunction or behave badly. For example,
136 these nodes may be disabled, congested, or may be misrouting
137 messages. The impact of these malfunctions on the overlay network
138 may be a degradation of quality of service provided collectively by
139 the peers in the overlay network or an interruption of the overlay
140 services. It is desirable to identify malfunctioning or badly
141 behaving peers through diagnostic tools, and exclude or reject them
142 from the P2P system. Node failures may also be caused by failures of
143 underlying layers. For example, recovery from an incorrect overlay
144 topology may be slow when the speed at which IP routing recovers
145 after link failures is very slow. Moreover, if a backbone link fails
146 and the failover is slow, the network may be partitioned, leading to
147 partitions of overlay topologies and inconsistent routing results
148 between different partitioned components.
150 Some keep-alive algorithms based on periodic probe and acknowledge
151 mechanisms enable accurate and timely detection of failures of one
152 node's neighbors [Overlay-Failure-Detection], but these algorithms by
153 themselves can only detect the disabled neighbors using the periodic
154 method. This may not be sufficient for the service provider
155 operating the overlay network.
157 For Peer-to-Peer SIP (P2PSIP), a single, general P2PSIP overlay
158 diagnostic framework supporting periodic and on-demand methods for
159 detecting node failures and network failures is desirable. This
160 document describes a general P2PSIP overlay diagnostic extension to
161 the P2PSIP base protocol RELOAD [RFC6940] and is intended as a
162 complement to keep-alive algorithms in the P2PSIP overlay itself.
163 Readers are advised to consult [I-D.ietf-p2psip-concepts] for further
164 background on the problem domain.
166 2. Terminology
168 This document uses the concepts defined in RELOAD [RFC6940].
170 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
171 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
172 document are to be interpreted as described in [RFC2119].
174 3. Diagnostic Scenarios
176 P2P systems are self-organizing and ideally setup and configuration
177 of individual P2P nodes requires no network management in the
178 traditional sense. However, users of an overlay, as well as P2P
179 service providers, may contemplate usage scenarios where some
180 monitoring and diagnostics are required. We present a simple
181 connectivity test and some useful diagnostic information that may be
182 used in such diagnostics.
184 The common usage scenarios for P2P diagnostics can be broadly
185 categorized in three classes:
187 a. Automatic diagnostics built into the P2P overlay routing
188 protocol. Nodes perform periodic checks of known neighbors and
189 remove those nodes from the routing tables that fail to respond
190 to connectivity checks [Handling_Churn_in_a_DHT]. Unresponsive
191 nodes may only be temporarily disabled, for example due to a
192 local cryptographic processing overload, disk processing overload
193 or link overload. It is therefore useful to repeat the
194 connectivity checks to see whether nodes have recovered and can
195 again be placed in the routing tables. This process is known as 'failed
196 node recovery' and can be optimized as described in the paper
197 "Handling Churn in a DHT" [Handling_Churn_in_a_DHT].
199 b. Diagnostics used by a particular node to follow up on an
200 individual user complaint or failure. For example, a technical
201 support staff member may use a desktop sharing application (with
202 the permission of the user) to remotely determine the health of,
203 and possible problems with, the malfunctioning node. Part of the
204 remote diagnostics may consist of simple connectivity tests with
205 other nodes in the P2PSIP overlay and retrieval of statistics
206 from nodes in the overlay. The simple connectivity tests are not
207 dependent on the type of P2PSIP overlay. Note that other tests
208 may be required as well, including checking the health and
209 performance of the user's computer or mobile device and checking
210 the bandwidth of the link connecting the user to the Internet.
212 c. P2P system-wide diagnostics used to check the overall health of
213 the P2P overlay network. These include checking the consumption
214 of network bandwidth, checking for the presence of problem links
215 and checking for abusive or malicious nodes. This is not a
216 trivial problem and has been studied in detail for content and
217 streaming P2P overlays [Diagnostic_Framework], and has not been
218 addressed in earlier P2PSIP documents
219 [Diagnostics_and_NAT_traversal_in_P2PP]. While this is a
220 difficult problem, a great deal of information that can help in
221 diagnosing these problems can be obtained by obtaining basic
222 diagnostic information for peers and the network. This document
223 provides a framework for obtaining this information.
225 4. Data Collection Mechanisms
227 4.1. Overview of Operations
229 The diagnostic mechanisms described in this document are primarily
230 intended to detect and locate failures or monitor performance in
231 P2PSIP overlay networks. They provide ways to detect and locate
232 malfunctioning or badly behaving nodes, including disabled nodes,
233 congested nodes, and misrouting peers. They also provide a mechanism
234 to detect direct connectivity or connectivity to a specified node, a
235 mechanism to detect the availability of specified resource records,
236 and a mechanism to discover the P2PSIP overlay topology and underlay
237 topology failures.
239 The P2PSIP diagnostics extensions define two mechanisms to collect
240 data. The first is an extension to the RELOAD Ping mechanism,
241 allowing diagnostic data to be queried from a node, as well as to
242 diagnose the path to that node. The second is a new method and
243 response, PathTrack, for collecting diagnostic information
244 iteratively. Payloads for these mechanisms allowing diagnostic data
245 to be collected and represented are presented, and additional error
246 codes are introduced. Essentially, this document reuses the RELOAD
247 [RFC6940] specification and extends it to introduce the new
248 diagnostics methods. The extensions strictly follow how RELOAD
249 specifies message routing, transport, NAT traversal, and other RELOAD
250 protocol features. The diagnostic methods are however P2PSIP
251 protocol independent.
253 This document primarily describes how to detect and locate failures
254 including disabled nodes, congested nodes, misrouting behaviors and
255 underlying network faults in P2PSIP overlay networks through a simple
256 and efficient mechanism. This mechanism is modeled after the ping/
257 traceroute paradigm: ping [RFC0792] is used for connectivity checks,
258 and traceroute is used for hop-by-hop fault localization as well as
259 path tracing. This document specifies a "ping-like" mode (by
260 extending the RELOAD Ping method to gather diagnostics) and a
261 "traceroute-like" mode (by defining the new PathTrack method) for
262 diagnosing P2PSIP overlay networks.
264 One way these tools can be used is to detect the connectivity to the
265 specified node or the availability of the specified resource-record
266 through the extended P2PSIP Ping operation. Once the overlay network
267 receives some alarms about overlay service degradation or
268 interruption, a Ping is sent. If the Ping fails, one can then send a
269 PathTrack to determine where the fault lies.
271 The diagnostic information can only be provided to authorized nodes.
272 Some diagnostic information can be provided to all the participants
273 in the P2PSIP overlay, and some other diagnostic information can only
274 be provided to the nodes authorized by the local or overlay policy.
275 The authorization depends on the type of the diagnostic information
276 and the administrative considerations, and is application specific.
278 This document considers the general administrative scenario based on
279 diagnostic kind type, where a whole overlay can authorize a certain
280 type of diagnostic information to a small list of particular nodes
281 (e.g. administrative nodes). That means, if a node gets the
282 authorization to access a diagnostic kind type, it can access that
283 information from all nodes in the overlay network. It leaves the
284 scenario where a particular node authorizes its diagnostic
285 information to a particular list of nodes out of scope. This could
286 be achieved by extending this document if such a requirement arises
287 in the future. The default policy or access rule for a type of
288 diagnostic information is "permit" unless specified in the
289 diagnostics extension document. As the RELOAD protocol already
290 requires that each message carries the message signature of the
291 sender, the receiver of the diagnostics requests can use the
292 signature to identify the sender. It can then use the overlay
293 configuration file with this signature to determine which types of
294 diagnostic information that node is authorized for.
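The default-permit rule described above can be illustrated with a minimal Python sketch. It assumes the sender's Node-ID has already been established from the message signature, and it represents the overlay policy as a plain dictionary; the table layout, the Node-ID strings, and the kind value 0xF001 are hypothetical and are not taken from the overlay configuration format.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical policy table: diagnostic kind -> Node-IDs allowed to read it.
# A kind that is absent from the table falls back to the default "permit".
RESTRICTED_KINDS: Dict[int, Set[str]] = {
    0xF001: {"admin-node-1", "admin-node-2"},
}


def is_authorized(sender_node_id: str, kind: int,
                  policy: Dict[int, Set[str]] = RESTRICTED_KINDS) -> bool:
    """Return True if the sender may receive the given diagnostic kind."""
    allowed = policy.get(kind)
    if allowed is None:
        # No restriction listed for this kind: default policy is "permit".
        return True
    return sender_node_id in allowed


def filter_kinds(sender_node_id: str, requested: Iterable[int],
                 policy: Dict[int, Set[str]] = RESTRICTED_KINDS) -> List[int]:
    """Keep only the requested kinds that the sender is authorized for."""
    return [k for k in requested if is_authorized(sender_node_id, k, policy)]
```

Because authorization is per kind and overlay-wide, a node listed for a kind is treated the same by every responder that shares the policy table.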
296 In the remainder of this section we define mechanisms for collecting
297 data, as well as the specific protocol extensions (message
298 extensions, new methods, and error codes) required to collect this
299 information. In Section 5 we discuss the format of the data
300 collected, and in Section 6 we discuss detailed message processing.
302 4.2. "Ping-like" Behavior: Extending Ping
304 To provide "ping-like" behavior, the RELOAD Ping method is extended
305 to collect diagnostic data along the path. The request message is
306 forwarded by the intermediate peers along the path and then
307 terminated by the responsible peer. After optional local
308 diagnostics, the responsible peer returns a response message. If an
309 error is found when routing, an Error response is sent to the
310 initiator node by the intermediate peer.
312 The message flow of a Ping message (with diagnostic extensions) is as
313 follows:
315 Peer A Peer B Peer C Peer D
316 | | | |
317 |(1). PingReq | | |
318 |------------------->|(2). PingReq | |
319 | |------------------->|(3). PingReq |
320 | | |------------------->|
321 | | | |
322 | | |<-------------------|
323 | |<-------------------|(4). PingAns |
324 |<-------------------|(5). PingAns | |
325 |(6). PingAns | | |
326 | | | |
328 Figure 1: Ping Diagnostic Message Flow
330 4.2.1. RELOAD Request Extension: Ping
332 To extend the ping request for use in diagnostics, a new extension of
333 RELOAD is defined. The structure for a MessageExtension in RELOAD is
334 defined as:
336 struct {
337 MessageExtensionType type;
338 Boolean critical;
339 opaque extension_contents<0..2^32-1>;
340 } MessageExtension;
342 For the Ping request extension, we define a new MessageExtensionType,
343 extension 0x0002 named Diagnostic_Ping, as specified in Table 4. The
344 extension contents consists of a DiagnosticsRequest structure,
345 defined later in this document in Section 5.1. This extension MAY be
346 used for new requests of the Ping method and MUST NOT be included in
347 requests using any other method.
349 This extension is not critical. If a peer does not support the
350 extension, it will simply ignore the diagnostic portion of the
351 message and will treat the message as if it were a normal Ping.
352 Senders MUST accept a response that lacks diagnostic information and
353 SHOULD NOT resend the message expecting a reply. Receivers who
354 receive a method other than Ping including this extension MUST ignore
355 the extension.
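As an illustration of the MessageExtension layout above, the following Python sketch serializes and parses one extension. It assumes the usual RELOAD presentation-language encoding: a 16-bit MessageExtensionType, a one-byte Boolean, and a 4-byte length prefix for the opaque contents, all in network byte order. The function names are hypothetical, and the contents are treated as already-encoded bytes.

```python
import struct
from typing import Tuple

DIAGNOSTIC_PING = 0x0002  # extension type named in the text above


def encode_message_extension(ext_type: int, critical: bool,
                             contents: bytes) -> bytes:
    """Encode type (uint16), critical (1 byte), then length-prefixed contents."""
    header = struct.pack("!HBI", ext_type, 1 if critical else 0, len(contents))
    return header + contents


def decode_message_extension(data: bytes) -> Tuple[int, bool, bytes]:
    """Parse one extension from the start of data."""
    ext_type, critical, length = struct.unpack_from("!HBI", data, 0)
    body = data[7:7 + length]  # fixed header is 2 + 1 + 4 = 7 bytes
    if len(body) != length:
        raise ValueError("truncated extension_contents")
    return ext_type, bool(critical), body
```

A Diagnostic_Ping extension would be built with critical set to False, so that a peer without support can skip it and answer as for a normal Ping.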
357 4.3. "Traceroute-like" Behavior: The Path_Track Method
359 We define a simple PathTrack method for retrieving diagnostic
360 information iteratively.
362 The operation of this request is shown below in Figure 2. The
363 initiator node A asks its neighbor B which is the next hop peer to
364 the destination ID, and B returns a message with the next hop peer C
365 information, along with optional diagnostic information for B to the
366 initiator node. Then the initiator node A asks the next hop peer C
367 (directly or via symmetric routing) to return next hop peer D
368 information and diagnostic information of C. Unless a failure
369 prevents the message from being forwarded, this step can be
370 iteratively repeated until the request reaches responsible peer D for
371 the destination ID, and retrieves diagnostic information of peer D.
373 The message flow of a PathTrack message (with diagnostic extensions)
374 is as follows:
376 Peer-A Peer-B Peer-C Peer-D
377 | | | |
378 |(1).PathTrackReq | | |
379 |------------------->| | |
380 |(2).PathTrackAns | | |
381 |<-------------------| | |
382 | |(3).PathTrackReq | |
383 |--------------------|------------------->| |
384 | |(4).PathTrackAns | |
385 |<-------------------|--------------------| |
386 | | |(5).PathTrackReq |
387 |--------------------|--------------------|------------------->|
388 | | |(6).PathTrackAns |
389 |<-------------------|--------------------|--------------------|
390 | | | |
392 Figure 2: PathTrack Diagnostic Message Flow
394 There have been proposals that RouteQuery and a series of Fetch
395 requests can be used to replace the PathTrack mechanism, but in the
396 presence of high rates of churn, such an operation would not,
397 strictly speaking, provide identical results, as the path may change
398 between RouteQuery and Fetch operations. While obviously the path
399 could change between steps of PathTrack as well, with a single
400 message rather than two messages for query and fetch, less
401 inconsistency is likely, and thus the use of a single message is
402 preferred.
404 Given that in a typical diagnostic scenario the peer sending the
405 PathTrack request desires to obtain information about the current
406 path to the destination, in the event that successive calls to
407 PathTrack return different paths, the results should be discarded and
408 the request resent, ensuring that the second request traverses the
409 appropriate path.
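One possible reading of this discard-and-resend rule is sketched below in Python: the trace is repeated until two successive runs report the same path. The callable `trace` and the `max_attempts` bound are hypothetical; the text does not specify a retry limit.

```python
from typing import Callable, List, Optional


def stable_path(trace: Callable[[], List[str]],
                max_attempts: int = 5) -> Optional[List[str]]:
    """Repeat a trace until two successive runs agree on the path.

    Returns the agreed path, or None if the path kept changing.
    """
    previous: Optional[List[str]] = None
    for _ in range(max_attempts):
        current = trace()
        if previous is not None and current == previous:
            return current
        # Paths differ (or this is the first run): discard and try again.
        previous = current
    return None
```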
411 4.3.1. New RELOAD Request: PathTrack
413 This document defines a new RELOAD method, PathTrack, to retrieve the
414 diagnostic information from the intermediate peers along the routing
415 path. At each step of the PathTrack request, the responsible peer
416 responds to the initiator node with requested status information.
417 Status information can include a peer's congestion state, processing
418 power, available bandwidth, the number of entries in its neighbor
419 table, uptime, identity, network address information, and next hop
420 peer information.
422 A PathTrack request specifies which diagnostic information is
423 requested using a DiagnosticsRequest data structure, defined and
424 discussed in detail later in this document in Section 5.1. Base
425 information is requested by setting the appropriate flags in the data
426 structure in the request. If all flags are clear (no bits are set),
427 then the PathTrack request is only used for requesting the next hop
428 information. In this case the iterative mode of PathTrack is
429 degraded to a RouteQuery method which is only used for checking the
430 liveness of the peers along the routing path. The PathTrack request
431 can be routed directly or through the overlay based on the routing
432 mode chosen by the initiator node.
434 A response to a successful PathTrackReq is a PathTrackAns message.
435 The PathTrackAns contains general diagnostic information in the
436 payload, returned using a DiagnosticsResponse data structure. This
437 data structure is defined and discussed in detail later in this
438 document in Section 5.2. The information returned is determined
439 based on the information requested in the flags in the corresponding
440 request.
442 4.3.1.1. PathTrack Request
444 The structure of the PathTrack request is as follows:
446 struct{
447 Destination destination;
448 DiagnosticsRequest request;
449 }PathTrackReq;
451 The fields of the PathTrackReq are as follows:
453 destination : The destination which the initiator node is
454 interested in. This may be any valid destination object,
455 including a NodeID, opaque ID, or ResourceID. Note that, for
456 debugging purposes, the initiator will use the same destination ID
457 that was used when the failure occurred.
459 request : A DiagnosticsRequest, as discussed in Section 5.1.
461 4.3.1.2. PathTrack Response
463 The structure of the PathTrack response is as follows:
465 struct{
466 Destination next_hop;
467 DiagnosticsResponse response;
468 }PathTrackAns;
470 The fields of the PathTrackAns are as follows:
472 next_hop : The information of the next hop node from the
473 responding intermediate peer to the destination. If the
474 responding peer is the responsible peer for the destination ID,
475 then the next_hop node ID equals the responding node ID, and after
476 receiving a PathTrackAns where the next_hop node ID equals the
477 responding node ID the initiator MUST stop the iterative process.
479 response : A DiagnosticsResponse, as discussed in Section 5.2.
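The iterative use of PathTrackReq and PathTrackAns, including the stop condition on next_hop, can be sketched in Python as follows. The `send` callable stands in for one request/response exchange and is hypothetical, as are the string Node-IDs and the `max_hops` safety bound; diagnostics are modelled as an opaque dictionary.

```python
from typing import Callable, Dict, List, Tuple

# One PathTrackAns, reduced to (next_hop, diagnostics of the responder).
Answer = Tuple[str, Dict[str, object]]


def path_track(first_hop: str, destination: str,
               send: Callable[[str, str], Answer],
               max_hops: int = 32) -> List[Tuple[str, Dict[str, object]]]:
    """Query peers one at a time until the responsible peer answers."""
    results: List[Tuple[str, Dict[str, object]]] = []
    current = first_hop
    for _ in range(max_hops):
        next_hop, diagnostics = send(current, destination)
        results.append((current, diagnostics))
        if next_hop == current:
            # The responder named itself: it is the responsible peer, so stop.
            return results
        current = next_hop
    raise RuntimeError("hop limit reached before the responsible peer answered")


def make_fake_send(routes: Dict[str, str]) -> Callable[[str, str], Answer]:
    """Build a toy overlay where routes maps each peer to its next hop."""
    def fake_send(peer: str, destination: str) -> Answer:
        return routes[peer], {"responder": peer}
    return fake_send
```

With routes B to C, C to D, and D to itself, the initiator collects diagnostics from B, C, and D in that order and then stops.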
481 4.4. Error Code Extensions
483 This document extends the Error response method defined in the RELOAD
484 specification to support error cases resulting from diagnostic
485 queries. When an error is encountered in RELOAD, the Message Code
486 0xFFFF is returned. The ErrorResponse structure includes an error
487 code. We define new error codes to report possible error conditions
488 detected while performing diagnostics:
490 Code Value Error Code Name
491 [TBD1] Underlay Destination Unreachable
492 [TBD2] Underlay Time exceeded
493 [TBD3] Message Expired
494 [TBD4] Upstream Misrouting
495 [TBD5] Loop detected
496 [TBD6] TTL hops exceeded
498 The final error codes will be assigned by IANA as specified in the
499 RELOAD protocol [RFC6940]. The error code is returned by the node
500 immediately upstream of the failed node. That upstream node uses a
501 normal Ping to determine the failure type and returns it to the
502 initiator node. This helps the user (the initiator node) understand
503 where the failure occurred and what kind of error it was, since a
504 failure is likely to occur at the same location and for the same
505 reason for both normal messages and diagnostic messages.
507 As defined in RELOAD, additional information may be stored (in an
508 implementation-specific way) in the optional error_info byte string.
509 While the specifics are obviously left to the implementation, as an
510 example, in the case of [TBD1], the error_info could be used to
511 provide additional information as to why the underlay destination is
512 unreachable (net unreachable, host unreachable, fragmentation needed,
513 etc.).
515 5. Diagnostic Data Structures
517 Both the extended Ping method and PathTrack method use the following
518 common diagnostics data structures to collect data. Two common
519 structures are defined: DiagnosticsRequest for requesting data, and
520 DiagnosticsResponse for returning the information.
522 5.1. DiagnosticsRequest Data Structure
524 The DiagnosticsRequest data structure is used to request diagnostic
525 information and has the following form:
527 enum{ (2^16-1) } DiagnosticKindId;
529 struct{
530 DiagnosticKindId kind;
531 opaque diagnostic_extension_contents<0..2^32-1>;
532 }DiagnosticExtension;
534 struct{
535 uint64 expiration;
536 uint64 timestamp_initiated;
537 uint64 dMFlags;
538 uint32 ext_length;
539 DiagnosticExtension diagnostic_extensions_list<0..2^32-1>;
540 }DiagnosticsRequest;
542 The fields in the DiagnosticsRequest are as follows:
544 expiration : The time when the request will expire represented as
545 the number of milliseconds elapsed since midnight Jan 1, 1970 UTC
546 not counting leap seconds. This will have the same values for
547 seconds as standard UNIX time or POSIX time. More information can
548 be found at UnixTime [UnixTime]. This value MUST be between 1 and
549 600 seconds in the future. This value is used to
550 prevent replay attacks.
552 timestamp_initiated : The time when the P2PSIP diagnostics request
553 was initiated represented as the number of milliseconds elapsed
554 since midnight Jan 1, 1970 UTC not counting leap seconds. This
555 will have the same values for seconds as standard UNIX time or
556 POSIX time.
558 dMFlags : A mandatory field which is an unsigned 64-bit integer
559 indicating which base diagnostic information the request initiator
560 node is interested in. The initiator sets different bits to
561 retrieve different kinds of diagnostic information. If dMFlags is
562 set to zero, then no base diagnostic information is conveyed in
563 the PathTrack response. If dMFlags is set to all '1's, then all
564 base diagnostic information values are requested. A request may
565 set any number of the flags to request the corresponding
566 diagnostic information.
568 Note that this memo specifies the initial set of flags; the flags
569 can be extended. The dMFlags indicate general diagnostic
570 information. The mapping between the bits in the dMFlags and the
571 diagnostic information kind presented is as described in Section 9.1.
573 ext_length : the length of the extended diagnostic request
574 information in bytes. If the value is greater than or equal to 1,
575 then some extended diagnostic information is being requested, on
576 the assumption this information will be included in the response
577 if the recipient understands the extended request and is willing
578 to provide it. The specific diagnostic information requested is
579 defined in the diagnostic_extensions_list below. A value of zero
580 indicates no extended diagnostic information is being requested.
581 The value of ext_length MUST NOT be negative. Note that it is not
582 the length of the entire DiagnosticsRequest data structure, but of
583 the data making up the diagnostic_extensions_list.
585 diagnostic_extensions_list : consists of one or more
586 DiagnosticExtension structures (see below) documenting additional
587 diagnostic information being requested. Each DiagnosticExtension
588 consists of the following fields:
590 kind : a numerical code indicating the type of extension
591 diagnostic information (see Section 9.2). Note that kinds
592 0xF000 - 0xFFFE are reserved for overlay specific diagnostics
593 and may be used without IANA registration for local diagnostic
594 information. Kinds from 0x0000 to 0x003F MUST NOT be indicated
595 in the diagnostic_extensions_list in the message request, as
596 they may be represented using the dMFlags in a much simpler
597 (and more space efficient) way.
599 diagnostic_extension_contents : the opaque data containing the
600 request for this particular extension. This data is extension
601 dependent.
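The following is a minimal, non-normative sketch of how an initiator might assemble a dMFlags value and screen extension kinds before placing them in the diagnostic_extensions_list. The flag values are copied from the registry in Section 9.1; the helper names build_dmflags and extension_kind_allowed are hypothetical and are not defined by this document.

```python
# Flag values copied from the "RELOAD Diagnostics Flag" table (Section 9.1).
STATUS_INFO = 0x0001
ROUTING_TABLE_SIZE = 0x0002
EWMA_BYTES_SENT = 0x1000
EWMA_BYTES_RCVD = 0x2000

ALL_FLAGS = 0xFFFFFFFFFFFFFFFF  # all '1's: every base kind is requested


def build_dmflags(*flags: int) -> int:
    """OR the selected flags into one unsigned 64-bit dMFlags value.

    Hypothetical helper; calling it with no arguments yields zero,
    i.e. no base diagnostic information is requested.
    """
    value = 0
    for flag in flags:
        value |= flag
    return value & ALL_FLAGS


def extension_kind_allowed(kind: int) -> bool:
    """Return True if 'kind' may appear in diagnostic_extensions_list.

    Kinds 0x0000-0x003F are requested through dMFlags instead, and
    0xFFFF is listed as reserved in Section 9.2.
    """
    return 0x0040 <= kind <= 0xFFFE
```

For instance, build_dmflags(STATUS_INFO, EWMA_BYTES_SENT) sets exactly the two corresponding bits, while a kind in the local-use range 0xF000-0xFFFE passes the extension check.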
603 5.2. DiagnosticsResponse Data Structure
604 enum { (2^16-1) } DiagnosticKindId;
605 struct {
606     DiagnosticKindId kind;
607     opaque diagnostic_info_contents<0..2^16-1>;
608 } DiagnosticInfo;
610 struct {
611     uint64 expiration;
612     uint64 timestamp_initiated;
613     uint64 timestamp_received;
614     uint8 hop_counter;
615     uint32 ext_length;
616     DiagnosticInfo diagnostic_info_list<0..2^32-1>;
617 } DiagnosticsResponse;
619 The fields in the DiagnosticsResponse are as follows:
621 expiration : The time when the response will expire represented as
622 the number of milliseconds elapsed since midnight Jan 1, 1970 UTC
623 not counting leap seconds. This will have the same values for
624 seconds as standard UNIX time or POSIX time. This value MUST be
625 between 1 and 600 seconds in the future.
627 timestamp_initiated: This value is copied from the diagnostics
628 request message. The benefit of containing such a value in the
629 response message is that the initiator node does not have to
630 maintain the state.
632 timestamp_received : The time when P2PSIP Overlay diagnostic
633 request was received represented as the number of milliseconds
634 elapsed since midnight Jan 1, 1970 UTC not counting leap seconds.
635 This will have the same values for seconds as standard UNIX time
636 or POSIX time.
638 hop_counter : This field only appears in diagnostic responses. It
639 MUST be exactly copied from the TTL field of the forwarding header
640 in the received request. This information is sent back to the
641 request initiator, allowing it to compute the number of hops that
642 the message traversed in the overlay.
644 ext_length : the length of the returned DiagnosticInfo information
645 in bytes. If the value is greater than or equal to 1, then some
646 extended diagnostic information (as specified in the
647 DiagnosticsRequest) was available and is being returned. In that
648 case, this value indicates the length of the returned information.
649 A value of zero indicates no extended diagnostic information is
650 included, either because none was requested or the request could
651 not be accommodated. The value of ext_length MUST NOT be
652 negative. Note that it is not the length of the entire
653 DiagnosticsResponse data structure, but of the data making up the
654 diagnostic_info_list.
656 diagnostic_info_list : consists of one or more DiagnosticInfo
657 structures containing the requested diagnostic_info_contents. The
658 fields in the DiagnosticInfo structure are as follows:
660 kind : A numeric code indicating the type of information being
661 returned. For base data requested using the dMFlags, this code
662 corresponds to the dMFlag set, and is described in Section 5.1.
663 For diagnostic extensions, this code will be identical to the
664 value of the DiagnosticKindId set in the "kind" field of the
665 DiagnosticExtension of the request. See Section 9.2.
667 diagnostic_info_contents : Data containing the value for the
668 diagnostic information being reported. Various kinds of
669 diagnostic information can be retrieved. Please refer to
670 Section 5.3 for details of the diagnostic kind ID for the base
671 diagnostic information that may be reported.
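As an illustration only, the sketch below serializes and parses DiagnosticInfo entries. It assumes the conventions of the RELOAD presentation language [RFC6940]: network byte order, and an opaque<0..2^16-1> vector preceded by a two-byte length. The function names are hypothetical, and the sketch handles only the bytes of the diagnostic_info_list, not the surrounding DiagnosticsResponse fields.

```python
import struct
from typing import List, Tuple


def encode_diagnostic_info(kind: int, contents: bytes) -> bytes:
    """Serialize one DiagnosticInfo: uint16 kind, uint16 length, contents."""
    if not 0 <= kind <= 0xFFFF:
        raise ValueError("kind must fit in 16 bits")
    if len(contents) > 0xFFFF:
        raise ValueError("contents exceed 2^16-1 bytes")
    return struct.pack("!HH", kind, len(contents)) + contents


def decode_diagnostic_info_list(data: bytes) -> List[Tuple[int, bytes]]:
    """Split the bytes of a diagnostic_info_list into (kind, contents)."""
    items = []
    offset = 0
    while offset < len(data):
        if offset + 4 > len(data):
            raise ValueError("truncated DiagnosticInfo header")
        kind, length = struct.unpack_from("!HH", data, offset)
        offset += 4
        if offset + length > len(data):
            raise ValueError("truncated diagnostic_info_contents")
        items.append((kind, data[offset:offset + length]))
        offset += length
    return items
```

Under these assumptions, the value a responder places in ext_length would be the length in bytes of the concatenated encoded entries.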
673 5.3. dMFlags and Diagnostic Kind ID Types
675 The dMFlags field described above is a 64 bit field that allows
676 initiator nodes to identify up to 62 items of base information to
677 request in a request message (the first and last flags being
678 reserved). When the requested base information is returned in the
679 response, the value of the diagnostic kind ID will correspond to the
680 numeric field marked in the dMFlags in the request. The values for
681 the dMFlags are defined in Section 9.1 and the diagnostic kind IDs
682 are defined in Section 9.2. The information contained for each value
683 is described in this section. Access to each kind of diagnostic
684 information MUST NOT be allowed unless compliant to the rules defined
685 in Section 7.
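The correspondence between a dMFlags bit and a diagnostic kind ID can be sketched as follows. The rule used here (the flag at bit position n maps to kind ID n + 1) is inferred from the initial registry contents in Sections 9.1 and 9.2 and is not stated normatively; the function names are hypothetical. Bit 63 is ignored because it would map to 0x0040, which lies outside the 0x0000-0x003F range assigned together with flags.

```python
def dmflags_to_kinds(dmflags: int) -> list:
    """List the diagnostic kind IDs selected by a dMFlags value.

    Assumption: bit n (counting from the least significant bit)
    corresponds to kind ID n + 1, as in the initial registries.
    """
    return [bit + 1 for bit in range(63) if dmflags & (1 << bit)]


def kind_to_flag(kind: int) -> int:
    """Return the dMFlags bit for a base kind ID in 0x0001-0x003F."""
    if not 0x0001 <= kind <= 0x003F:
        raise ValueError("kind is not in the range assigned with flags")
    return 1 << (kind - 1)
```

With the initial registry values, STATUS_INFO (flag 0x0001) maps to kind 0x0001 and BATTERY_STATUS (flag 0x8000) maps to kind 0x0010.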
687 STATUS_INFO (8 bits): A single value element containing an unsigned
688 byte representing whether or not the node is in congestion status.
689 An example usage of STATUS_INFO is for congestion-aware routing.
690 In this scenario, each peer has to update its congestion status
691 periodically. An intermediate peer in the distributed hash table
692 (DHT) network will choose its next hop according to both the DHT
693 routing algorithm and the status information. This is done to
694 avoid increasing load on congested peers. The rightmost 4 bits
695 are used and other bits MUST be cleared to "0"s for future use.
696 There are 16 levels of congestion status, with "0x00" representing
697 zero load and "0x0F" representing congested. This document does
698 not provide a specific method for measuring congestion, leaving
699 this decision to each overlay implementation. One possible option
700 for an overlay implementation would be to take the node's
701 CPU/memory/bandwidth usage percentage over the past 600 seconds
702 and normalize the highest value to the range [0x00, 0x0F]. An
703 overlay implementation can also decide not to use all 16 values
704 from 0x00 to 0x0F. A future draft may define an objective measure
705 or specific algorithm for this.
707 ROUTING_TABLE_SIZE (32 bits): A single value element containing an
708 unsigned 32-bit integer representing the number of peers in the
709 peer's routing table. The administrator of the overlay may be
710 interested in statistics of this value for reasons such as routing
711 efficiency.
713 PROCESS_POWER (64 bits): A single value element containing an
714 unsigned 64-bit integer specifying the processing power of the
715 node in units of MIPS. Fractional values are rounded up.
717 UPSTREAM_BANDWIDTH (64 bits): A single value element containing an
718 unsigned 64-bit integer specifying the upstream network bandwidth
719 (provisioned or maximum, not available) of the node in units of
720 Kbps. Fractional values are rounded up. For multihomed hosts,
721 this should be the link used to send the response.
723 DOWNSTREAM_BANDWIDTH (64 bits): A single value element containing
724 an unsigned 64-bit integer specifying the downstream network
725 bandwidth (provisioned or maximum, not available) of the node in
726 units of Kbps. Fractional values are rounded up. For multihomed
727 hosts, this should be the link the request was received from.
729 SOFTWARE_VERSION: A single value element containing a US-ASCII
730 string that identifies the manufacturer, model, operating system
731 information and the version of the software. Given that there is
732 a very large number of peers in some networks, and no peer is likely
733 to know all other peers' software, this information may be very
734 useful to help determine if the cause of certain groups of
735 misbehaving peers is related to specific software versions. While
736 the format is peer-defined, a suggested format is as follows:
737 "ApplicationProductToken (Platform; OS-or-CPU) VendorProductToken
738 (VendorComment)". For example: "MyReloadApp/1.0 (Unix; Linux
739 x86_64) libreload-java/0.7.0 (Stonyfish Inc.)". The string is a
740 C-style string, and MUST be delimited by "\0". "\0" MUST NOT be
741 included in the string itself to prevent confusion with the
742 delimiter.
744 MACHINE_UPTIME (64 bits): A single value element containing an
745 unsigned 64-bit integer specifying the time the node's underlying
746 system has been up in seconds.
748 APP_UPTIME (64 bits): A single value element containing an
749 unsigned 64-bit integer specifying the time the P2P application
750 has been up in seconds.
752 MEMORY_FOOTPRINT (64 bits): A single value element containing an
753 unsigned 64-bit integer representing the memory footprint of the
754 peer program in kilobytes (1024 bytes). Fractional values are
755 rounded up.
757 DATASIZE_STORED (64 bits): An unsigned 64-bit integer representing
758 the number of bytes of data being stored by this node.
760 INSTANCES_STORED: An array element containing the number of
761 instances of each kind stored. The array is indexed by Kind-ID.
762 Each entry is an unsigned 64-bit integer.
764 MESSAGES_SENT_RCVD: An array element containing the number of
765 messages sent and received. The array is indexed by method code.
766 Each entry in the array is a pair of unsigned 64-bit integers
767 (packed end to end) representing sent and received.
769 EWMA_BYTES_SENT (32 bits): A single value element containing an
770 unsigned 32-bit integer representing an exponential weighted
771 average of bytes sent per second by this peer. sent = alpha x
772 sent_present + (1 - alpha) x sent where sent_present represents
773 the bytes sent per second since the last calculation and sent
774 represents the last calculation of bytes sent per second. A
775 suitable value for alpha is 0.8. This value is calculated every
776 five seconds.
778 EWMA_BYTES_RCVD (32 bits): A single value element containing an
779 unsigned 32-bit integer representing an exponential weighted
780 average of bytes received per second by this peer. rcvd = alpha x
781 rcvd_present + (1 - alpha) x rcvd where rcvd_present represents
782 the bytes received per second since the last calculation and rcvd
783 represents the last calculation of bytes received per second. A
784 suitable value for alpha is 0.8. This value is calculated every
785 five seconds.
787 UNDERLAY_HOP (8 bits): Indicates the IP layer hops from the
788 intermediate peer which receives the diagnostics message to the
789 next hop peer for this message. (Note: RELOAD does not require
790 the intermediate peers to look into the message body. So here we
791 use PathTrack to gather underlay hops for diagnostics purpose).
793 BATTERY_STATUS (8 bits): The left-most bit is used to indicate
794 whether this peer is using a battery or not. If this bit is clear
795 (set to '0'), then the peer is using a battery for power. The
796 other 7 bits are to be determined by specific applications.
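Two of the computations described in the entries above can be sketched in a few lines. The first function applies the EWMA update given for EWMA_BYTES_SENT and EWMA_BYTES_RCVD with the suggested alpha of 0.8. The second shows one possible way to map a usage percentage onto the 4-bit STATUS_INFO range; the linear mapping and both function names are assumptions made for illustration, since this document leaves the congestion measure to each overlay.

```python
ALPHA = 0.8  # the value suggested in the text


def ewma_update(previous: float, present: float,
                alpha: float = ALPHA) -> float:
    """One EWMA step: alpha x present + (1 - alpha) x previous.

    'present' is the rate observed since the last calculation and
    'previous' is the last calculated average, both in bytes/second.
    """
    return alpha * present + (1 - alpha) * previous


def to_uint32(value: float) -> int:
    """Clamp an average into the unsigned 32-bit range of the field."""
    return max(0, min(int(round(value)), 0xFFFFFFFF))


def congestion_level(usage_percent: float) -> int:
    """Map a usage percentage (0-100) linearly onto 0x00-0x0F.

    Hypothetical normalization; only the rightmost 4 bits are used,
    so the remaining bits of the returned byte are always zero.
    """
    clamped = min(max(usage_percent, 0.0), 100.0)
    return round(clamped * 0x0F / 100.0) & 0x0F
```

A peer following the text would call ewma_update every five seconds and report to_uint32 of the result; an idle node would report congestion level 0x00 and a saturated node 0x0F.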
798 6. Message Processing
800 6.1. Message Creation and Transmission
802 When constructing either a Ping message with diagnostic extensions or
803 a PathTrack message, the sender first creates and populates a
804 DiagnosticsRequest data structure. The timestamp_initiated field is
805 set to the current time, and the expiration field is constructed
806 based on this time. The sender includes the dMFlags field in the
807 structure, setting any number (including all) of the flags to request
808 particular diagnostic information. The sender MAY leave all the bits
809 unset, requesting no particular diagnostic information.
811 The sender MAY also include diagnostic extensions in the
812 DiagnosticsRequest data structure to request additional information.
813 If the sender includes any extensions, it MUST calculate the length
814 of these extensions and set the ext_length field to this value. If
815 no extensions are included, the sender MUST set ext_length to zero.
817 The format of the DiagnosticsRequest data structure and its fields
818 MUST follow the restrictions defined in Section 5.1.
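The construction steps above can be sketched as follows. The sketch is not a wire encoder: it only fills in the field values of a DiagnosticsRequest as a dictionary. The function name, the default lifetime of 30 seconds, and the now_ms parameter (present so that the clock can be supplied by the caller) are assumptions made for illustration.

```python
import time

MIN_LIFETIME_MS = 1 * 1000    # expiration must be at least 1 second ahead
MAX_LIFETIME_MS = 600 * 1000  # and at most 600 seconds ahead


def build_request_fields(dmflags, extensions=b"", lifetime_ms=30000,
                         now_ms=None):
    """Populate DiagnosticsRequest field values (hypothetical helper).

    'extensions' is the already-encoded diagnostic_extensions_list;
    ext_length is its length in bytes, or zero if it is empty.
    """
    if not MIN_LIFETIME_MS <= lifetime_ms <= MAX_LIFETIME_MS:
        raise ValueError("expiration must be 1 to 600 seconds ahead")
    if now_ms is None:
        # Milliseconds since midnight Jan 1, 1970 UTC.
        now_ms = int(time.time() * 1000)
    return {
        "expiration": now_ms + lifetime_ms,
        "timestamp_initiated": now_ms,
        "dMFlags": dmflags,
        "ext_length": len(extensions),
        "diagnostic_extensions_list": extensions,
    }
```

Deriving expiration from the same clock reading as timestamp_initiated keeps the two fields consistent with each other.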
820 When constructing a Ping message with diagnostic extensions, the
821 sender MUST create a MessageExtension structure as defined in RELOAD
822 [RFC6940], setting the value of type to 0x0002, and the value of
823 critical to FALSE. The value of extension_contents MUST be a
824 DiagnosticsRequest structure as defined above. The message MAY be
825 directed to a particular NodeId or ResourceID, but MUST NOT be sent
826 to the broadcast NodeID.
828 When constructing a PathTrack message, the sender MUST set the
829 message_code for the RELOAD MessageContents structure to
830 path_track_req ([TBD7]). The request field of the PathTrackReq MUST
831 be set to the DiagnosticsRequest data structure defined above. The
832 destination field MUST be set to the desired destination, which MAY
833 be either a NodeId or ResourceID but SHOULD NOT be the broadcast
834 NodeID.
836 6.2. Message Processing: Intermediate Peers
838 When a request arrives at a peer, if the peer's responsible ID space
839 does not cover the destination ID of the request, then the peer MUST
840 continue processing this request according to the overlay-specified
841 routing mode of the RELOAD protocol.
843 In a P2PSIP overlay, error responses to a message can be generated by
844 either an intermediate peer or the responsible peer. When a request
845 is received at a peer, the peer may find connectivity failures or
846 malfunctioning peers through the pre-defined rules of the overlay
847 network, e.g., by analyzing the via list or underlay error messages.
848 In this case, the intermediate peer SHOULD return an error response to
849 the initiator node, reporting any available information about the
850 malfunctioning node in the error message payload. All error responses
851 generated MUST contain the appropriate error code.
853 Each intermediate peer receiving a Ping message with extensions (and
854 which understands the extension) or receiving a PathTrack request/
855 response SHOULD check the expiration value (Unix time format) to
856 determine if the message is expired. If the message has expired, the
857 intermediate peer SHOULD generate a response with Error Code [TBD3]
858 "Message Expired", return the response to the initiator node, and
859 discard the message.
861 The intermediate peer SHOULD return an error response with the Error
862 Code [TBD1] "Underlay Destination Unreachable" when it receives an
863 ICMP message with "Destination Unreachable" information after
864 forwarding the received request to the destination peer.
866 The intermediate peer SHOULD return an error response with the Error
867 Code [TBD2] "Underlay Time Exceeded" when it receives an ICMP message
868 with "Time Exceeded" information after forwarding the received
869 request.
871 The peer SHOULD return an Error response with Error Code [TBD4]
872 "Upstream Misrouting" when it finds its upstream peer disobeys the
873 routing rules defined in the overlay. The immediate upstream peer
874 information SHOULD also be conveyed to the initiator node.
876 The peer SHOULD return an Error response with Error Code [TBD5] "Loop
877 detected" when it finds a loop through analysis of the via list.
879 The peer SHOULD return an Error response with Error Code [TBD6] "TTL
880 hops exceeded" when it finds that the TTL field value is no more than
881 0 when forwarding.
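A minimal sketch of the local checks an intermediate peer might run before forwarding is shown below. It covers only the expiration, loop, and TTL conditions; the ICMP-triggered and misrouting errors depend on events outside the message itself. The function name, the order of the checks, and the loop test (the peer finding its own Node-ID already in the via list) are assumptions, since this document does not specify how a loop is detected. Error names are taken from Section 9.4 because the numeric codes are still to be assigned.

```python
from typing import List, Optional


def check_forwarding(expiration_ms: int, ttl: int, via_list: List[str],
                     own_node_id: str, now_ms: int) -> Optional[str]:
    """Return the name of the error to report, or None to forward.

    Hypothetical helper: the check order and the loop heuristic are
    illustrative choices, not requirements of this document.
    """
    if now_ms >= expiration_ms:
        return "Error_Message_Expired"
    if own_node_id in via_list:
        # Assumed loop test: this peer has already handled the message.
        return "Error_Loop_Detected"
    if ttl <= 0:
        return "Error_TTL_Hops_Exceeded"
    return None
```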
883 6.3. Message Response Creation
885 When a diagnostic request message arrives at a peer that is
886 responsible for the destination ID specified in the forwarding
887 header, and assuming it understands the extension (in the case of
888 Ping) or the new request type PathTrack, it MUST follow the
889 specifications defined in RELOAD to form the response header, and
890 perform the following operations:
892 When constructing a PathTrack response, the sender MUST set the
893 message_code for the RELOAD MessageContents structure to
894 path_track_ans ([TBD8]).
896 The receiver MUST check the expiration value (Unix time format) in
897 the DiagnosticsRequest to determine if the message is expired. If
898 the message is expired, the peer MUST generate a response with the
899 Error Code [TBD3] "Message Expired", return the response to the
900 initiator node, and discard the message.
902 If the message is not expired, the receiver MUST construct a
903 DiagnosticsResponse structure, as follows: The TTL value from the
904 forwarding header is copied to the hop_counter field of the
905 DiagnosticsResponse structure. Note that the default value for TTL
906 at the beginning represents 100-hops unless overlay configuration has
907 overridden the value. The receiver generates a Unix time format
908 timestamp for the current time of day and places it in the
909 timestamp_received field, and constructs a new expiration time and
910 places it in the expiration field of the DiagnosticsResponse.
912 The destination peer MUST check if the initiator node has the
913 authority to request specific types of diagnostic information, and if
914 appropriate, append the diagnostic information requested in the
915 dMFlags and diagnostic_extensions (if any) using the
916 diagnostic_info_list field to the DiagnosticsResponse structure. If
917 any information is returned, the receiver MUST calculate the length of
918 the response and set ext_length appropriately. If no diagnostic
919 information is returned, ext_length MUST be set to zero.
921 The format of the DiagnosticsResponse data structure and its fields
922 MUST follow the restrictions defined in Section 5.2.
924 In the event of an error, an error response containing the error code
925 followed by the description (if they exist) MUST be created and sent
926 to the sender. If the initiator node asks for diagnostic information
927 that they are not authorized to query, the receiving peer MUST return
928 an Error response with the Error Code 2 "Error_Forbidden".
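The response-creation rules above can be summarized in the following sketch. As before, it produces field values rather than wire bytes. The exception class, the function name, the representation of the authorization decision as a set of permitted kind IDs, and the 30-second default lifetime are all assumptions made for illustration.

```python
class DiagnosticsError(Exception):
    """Carries the name of the error code to return to the initiator."""


def build_response_fields(request, ttl, requested_kinds, authorized_kinds,
                          info_list, now_ms, lifetime_ms=30000):
    """Populate DiagnosticsResponse field values (hypothetical helper).

    'request' holds the received DiagnosticsRequest fields, 'ttl' is
    the TTL from the forwarding header, and 'info_list' is the encoded
    diagnostic_info_list (empty if nothing is returned).
    """
    if now_ms >= request["expiration"]:
        raise DiagnosticsError("Error_Message_Expired")
    if not set(requested_kinds) <= set(authorized_kinds):
        raise DiagnosticsError("Error_Forbidden")
    return {
        "expiration": now_ms + lifetime_ms,
        "timestamp_initiated": request["timestamp_initiated"],
        "timestamp_received": now_ms,
        "hop_counter": ttl,
        "ext_length": len(info_list),
        "diagnostic_info_list": info_list,
    }
```

The caller is expected to translate a raised DiagnosticsError into the corresponding RELOAD error response.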
930 6.4. Interpreting Results
932 The initiator node, as well as the responding peer, MAY compute the
933 overlay One-Way-Delay time through the value in timestamp_received
934 and the timestamp_initiated field. However, for a single hop
935 measurement, the traditional measurement methods (IP layer ping) MUST
936 be used instead of the overlay layer diagnostics methods.
938 The P2P overlay network using the diagnostics methods specified in
939 this document MUST enforce time synchronization with a central time
940 server. Network Time Protocol [RFC5905] can usually maintain time to
941 within tens of milliseconds over the public Internet, and can achieve
942 better than one millisecond accuracy in local area networks under
943 ideal conditions. However, this document does not specify the choice
944 for time synchronization, leaving it to the implementation.
946 The initiator node receiving the Ping response MAY check the
947 hop_counter field and compute the overlay hops to the destination
948 peer for the statistics of connectivity quality from the perspective
949 of overlay hops.
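The two computations described in this section can be sketched as follows. The hop calculation assumes that the TTL starts at the default of 100 (unless the overlay configuration overrides it) and is decremented by one at each overlay hop, as in RELOAD [RFC6940]. The function names are hypothetical, and the delay figure is only meaningful when the clocks of both peers are synchronized.

```python
DEFAULT_INITIAL_TTL = 100  # default unless the overlay overrides it


def one_way_delay_ms(response: dict) -> int:
    """Overlay one-way delay in milliseconds from the two timestamps."""
    return response["timestamp_received"] - response["timestamp_initiated"]


def overlay_hops(hop_counter: int,
                 initial_ttl: int = DEFAULT_INITIAL_TTL) -> int:
    """Number of overlay hops the request traversed.

    hop_counter is the TTL copied from the received request, so the
    difference from the initial TTL is the hop count.
    """
    return initial_ttl - hop_counter
```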
951 7. Authorization through Overlay Configuration
953 Different levels of access control can be defined for different users/
954 nodes. For example, diagnostic information A can be accessed by node
955 1 and 2, but diagnostic information B can only be accessed by node 2.
957 The overlay configuration file MUST contain the following XML
958 elements for authorizing a node to access the relevant diagnostic
959 kinds.
961 diagnostic-kind: This element has the attribute "kind", a hexadecimal
962 number indicating the diagnostic Kind Type with the same value as
963 defined in Section 9.2, and at least one sub-element
964 "access-node".
966 access-node: This element contains one hexadecimal number indicating
967 a NodeID, and the node with this NodeID is allowed to access the
968 diagnostic "kind" under the same diagnostic-kind element.
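To make the two elements concrete, the sketch below parses a made-up configuration fragment and answers access queries. Only the element names "diagnostic-kind" and "access-node" and the "kind" attribute come from this document. The enclosing "configuration" element, the shortened NodeID values, the "0x" prefix on the kind attribute, and the omission of the XML namespace are assumptions made to keep the example small.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; real NodeIDs are much longer hexadecimal values.
SAMPLE = """
<configuration>
  <diagnostic-kind kind="0x0001">
    <access-node>0a0b0c</access-node>
    <access-node>0d0e0f</access-node>
  </diagnostic-kind>
  <diagnostic-kind kind="0x000D">
    <access-node>0d0e0f</access-node>
  </diagnostic-kind>
</configuration>
"""


def load_access_rules(xml_text: str) -> dict:
    """Map each diagnostic kind ID to the set of NodeIDs allowed."""
    root = ET.fromstring(xml_text.strip())
    rules = {}
    for kind_el in root.iter("diagnostic-kind"):
        kind = int(kind_el.get("kind"), 16)
        nodes = {n.text.strip().lower()
                 for n in kind_el.findall("access-node")}
        rules.setdefault(kind, set()).update(nodes)
    return rules


def may_access(rules: dict, node_id: str, kind: int) -> bool:
    """True if node_id is listed under the given diagnostic kind."""
    return node_id.lower() in rules.get(kind, set())
```

In this fragment, node 0d0e0f may read both STATUS_INFO (0x0001) and EWMA_BYTES_SENT (0x000D), while node 0a0b0c may read only STATUS_INFO.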
970 8. Security Considerations
972 The authorization for diagnostic information must be designed with
973 care to prevent it becoming a method to retrieve information for bot
974 attacks. It should also be noted that attackers can use diagnostics
975 to analyze overlay information to attack certain key peers. For
976 example, diagnostic information might be used to fingerprint a peer
977 where the peer will lose its anonymity characteristics, but
978 anonymity might be very important for some P2P overlay networks, and
979 defenses against such fingerprinting are probably very hard. As
980 such, networks where anonymity is of very high importance may find
981 implementation of diagnostics problematic or even undesirable,
982 despite the many advantages it offers. As this document is a RELOAD
983 extension that follows the RELOAD message header and routing
984 specifications, the common security considerations described in the
985 base document [RFC6940] are also applicable to this document.
986 Overlays may define their own requirements on who can collect/share
987 diagnostic information.
989 9. IANA Considerations
991 9.1. Diagnostics Flag
993 IANA SHALL create a "RELOAD Diagnostics Flag" Registry. Entries in
994 this registry are 1-bit flags contained in a 64-bit integer
995 dMFlags denoting diagnostic information to be retrieved as described
996 in Section 4.3.1. New entries SHALL be defined via [RFC5226]
997 Standards Action. The initial contents of this registry are:
999 +-------------------------+------------------------------+----------+
1000 | diagnostic information  | diagnostic flag in dMFlags   |   RFC    |
1001 +-------------------------+------------------------------+----------+
1002 |Reserved                 |    0x 0000 0000 0000 0000    |RFC-[TBDX]|
1003 |STATUS_INFO              |    0x 0000 0000 0000 0001    |RFC-[TBDX]|
1004 |ROUTING_TABLE_SIZE       |    0x 0000 0000 0000 0002    |RFC-[TBDX]|
1005 |PROCESS_POWER            |    0x 0000 0000 0000 0004    |RFC-[TBDX]|
1006 |UPSTREAM_BANDWIDTH       |    0x 0000 0000 0000 0008    |RFC-[TBDX]|
1007 |DOWNSTREAM_BANDWIDTH     |    0x 0000 0000 0000 0010    |RFC-[TBDX]|
1008 |SOFTWARE_VERSION         |    0x 0000 0000 0000 0020    |RFC-[TBDX]|
1009 |MACHINE_UPTIME           |    0x 0000 0000 0000 0040    |RFC-[TBDX]|
1010 |APP_UPTIME               |    0x 0000 0000 0000 0080    |RFC-[TBDX]|
1011 |MEMORY_FOOTPRINT         |    0x 0000 0000 0000 0100    |RFC-[TBDX]|
1012 |DATASIZE_STORED          |    0x 0000 0000 0000 0200    |RFC-[TBDX]|
1013 |INSTANCES_STORED         |    0x 0000 0000 0000 0400    |RFC-[TBDX]|
1014 |MESSAGES_SENT_RCVD       |    0x 0000 0000 0000 0800    |RFC-[TBDX]|
1015 |EWMA_BYTES_SENT          |    0x 0000 0000 0000 1000    |RFC-[TBDX]|
1016 |EWMA_BYTES_RCVD          |    0x 0000 0000 0000 2000    |RFC-[TBDX]|
1017 |UNDERLAY_HOP             |    0x 0000 0000 0000 4000    |RFC-[TBDX]|
1018 |BATTERY_STATUS           |    0x 0000 0000 0000 8000    |RFC-[TBDX]|
1019 |Reserved                 |    0x FFFF FFFF FFFF FFFF    |RFC-[TBDX]|
1020 +-------------------------+------------------------------+----------+
1022 [To RFC editor: Please replace all RFC-[TBDX] in this document with
1023 the RFC number of this document.]
1025 9.2. Diagnostic Kind ID Types
1027 IANA SHALL create a "RELOAD Diagnostic Kind ID Types" Registry.
1028 Entries in this registry are 16-bit integers denoting diagnostics
1029 extension data kind types carried in the diagnostic request and
1030 response message, as described in Section 5.2. Code points from
1031 0x0000 to 0x003F SHALL be assigned together with flags within "RELOAD
1032 Diagnostics Flag" registry via RFC 5226 [RFC5226] standards action.
1033 Code points in the range 0x0040 to 0xEFFF SHALL be registered via RFC
1034 5226 standards action.
1036 +---------------------------+---------------+---------------+
1037 | Diagnostic Kind Type      | Code          | Specification |
1038 +---------------------------+---------------+---------------+
1039 | reserved                  | 0x0000        | RFC-[TBDX]    |
1040 | STATUS_INFO               | 0x0001        | RFC-[TBDX]    |
1041 | ROUTING_TABLE_SIZE        | 0x0002        | RFC-[TBDX]    |
1042 | PROCESS_POWER             | 0x0003        | RFC-[TBDX]    |
1043 | UPSTREAM_BANDWIDTH        | 0x0004        | RFC-[TBDX]    |
1044 | DOWNSTREAM_BANDWIDTH      | 0x0005        | RFC-[TBDX]    |
1045 | SOFTWARE_VERSION          | 0x0006        | RFC-[TBDX]    |
1046 | MACHINE_UPTIME            | 0x0007        | RFC-[TBDX]    |
1047 | APP_UPTIME                | 0x0008        | RFC-[TBDX]    |
1048 | MEMORY_FOOTPRINT          | 0x0009        | RFC-[TBDX]    |
1049 | DATASIZE_STORED           | 0x000A        | RFC-[TBDX]    |
1050 | INSTANCES_STORED          | 0x000B        | RFC-[TBDX]    |
1051 | MESSAGES_SENT_RCVD        | 0x000C        | RFC-[TBDX]    |
1052 | EWMA_BYTES_SENT           | 0x000D        | RFC-[TBDX]    |
1053 | EWMA_BYTES_RCVD           | 0x000E        | RFC-[TBDX]    |
1054 | UNDERLAY_HOP              | 0x000F        | RFC-[TBDX]    |
1055 | BATTERY_STATUS            | 0x0010        | RFC-[TBDX]    |
1056 | reserved for future flags | 0x0011-0x003F | RFC-[TBDX]    |
1057 | local use (reserved)      | 0xF000-0xFFFE | RFC-[TBDX]    |
1058 | reserved                  | 0xFFFF        | RFC-[TBDX]    |
1059 +---------------------------+---------------+---------------+
1061 Table 1: Diagnostic Kind Types
1063 9.3. Message Codes
1065 This document introduces two new types of messages and their
1066 responses, requiring the following additions to the "RELOAD Message
1067 Code" Registry defined in RELOAD [RFC6940]. These additions are:
1069 +-------------------+------------+----------+
1070 | Message Code Name | Code Value |   RFC    |
1071 +-------------------+------------+----------+
1072 | path_track_req    |   [TBD7]   | RFC-AAAA |
1073 | path_track_ans    |   [TBD8]   | RFC-AAAA |
1074 +-------------------+------------+----------+
1076 Table 2: Extensions to RELOAD Message Codes
1078 [To RFC editor: Values starting at [TBD1] were used to prevent
1079 collisions with RELOAD base values and other extensions. Please
1080 replace with the next highest available values. The final message
1081 codes will be assigned by IANA. And all RFC-AAAA should be replaced
1082 with the RFC number of RELOAD upon publication.]
1084 9.4. Error Code
1086 This document introduces the following new error codes, extending the
1087 "RELOAD Error Code" registry as described below:
1089 +----------------------------------------+------------+----------+
1090 | Message Code Name                      | Code Value |   RFC    |
1091 +----------------------------------------+------------+----------+
1092 | Error_Underlay_Destination_Unreachable |   [TBD1]   | RFC-AAAA |
1093 | Error_Underlay_Time_Exceeded           |   [TBD2]   | RFC-AAAA |
1094 | Error_Message_Expired                  |   [TBD3]   | RFC-AAAA |
1095 | Error_Upstream_Misrouting              |   [TBD4]   | RFC-AAAA |
1096 | Error_Loop_Detected                    |   [TBD5]   | RFC-AAAA |
1097 | Error_TTL_Hops_Exceeded                |   [TBD6]   | RFC-AAAA |
1098 +----------------------------------------+------------+----------+
1100 Table 3: Extensions to RELOAD Error Codes
1102 [To RFC editor: Values starting at [TBD1] were used to prevent
1103 collisions with RELOAD base values and other extensions. Please
1104 replace with the next highest available values. The final message
1105 codes will be assigned by IANA. And all RFC-AAAA should be replaced
1106 with the RFC number of RELOAD upon publication.]
1108 9.5. Message Extension
1110 This document introduces the following new RELOAD extension code:
1112 +-----------------+------------+----------+
1113 | Extension Name  | Code Value |   RFC    |
1114 +-----------------+------------+----------+
1115 | Diagnostic_Ping |   0x0002   | RFC-AAAA |
1116 +-----------------+------------+----------+
1118 Table 4: New RELOAD Extension Code
1120 [To RFC editor: The value 0x0002 was used to prevent collisions with
1121 other extensions. Please replace with the next highest available
1122 value. The final codes will be assigned by IANA. And all RFC-AAAA
1123 should be replaced with the RFC number of RELOAD upon publication.]
1125 9.6. XML Name Space Registration
1127 This document registers a URI for the config-diagnostics XML
1128 namespaces in the IETF XML registry defined in [RFC3688]. All the
1129 elements defined in this document belong to this namespace.
1131 URI: urn:ietf:params:xml:ns:p2p:config-diagnostics
1132 Registrant Contact: The IESG.
1133 XML: N/A, the requested URIs are XML namespaces
1135 The overlay configuration file MUST also contain the following XML
1136 element declaring P2PSIP diagnostics as a mandatory extension to
1137 RELOAD.
1139 <mandatory-extension>
1140 urn:ietf:params:xml:ns:p2p:config-diagnostics
1141 </mandatory-extension>
1143 10. Acknowledgments
1145 We would like to thank Zheng Hewen for the contribution of the
1146 initial version of this document. We would also like to thank Bruce
1147 Lowekamp, Salman Baset, Henning Schulzrinne, Jiang Haifeng and Marc
1148 Petit-Huguenin for the email discussion and their valued comments,
1149 and special thanks to Henry Sinnreich for contributing to the usage
1150 scenarios text. We would like to thank the authors of the RELOAD
1151 protocol for transferring text about diagnostics to this document.
1153 11. References
1155 11.1. Normative References
1157 [RFC0792] Postel, J., "Internet Control Message Protocol", STD 5,
1158 RFC 792, DOI 10.17487/RFC0792, September 1981.
1161 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
1162 Requirement Levels", BCP 14, RFC 2119,
1163 DOI 10.17487/RFC2119, March 1997.
1166 [RFC3688] Mealling, M., "The IETF XML Registry", BCP 81, RFC 3688,
1167 DOI 10.17487/RFC3688, January 2004.
1170 [RFC5905] Mills, D., Martin, J., Ed., Burbank, J., and W. Kasch,
1171 "Network Time Protocol Version 4: Protocol and Algorithms
1172 Specification", RFC 5905, DOI 10.17487/RFC5905, June 2010.
1175 [RFC6940] Jennings, C., Lowekamp, B., Ed., Rescorla, E., Baset, S.,
1176 and H. Schulzrinne, "REsource LOcation And Discovery
1177 (RELOAD) Base Protocol", RFC 6940, DOI 10.17487/RFC6940,
1178 January 2014.
1180 11.2. Informative References
1182 [UnixTime]
1183 "UnixTime".
1186 [I-D.ietf-p2psip-concepts]
1187 Bryan, D., Matthews, P., Shim, E., Willis, D., and S.
1188 Dawkins, "Concepts and Terminology for Peer to Peer SIP",
1189 draft-ietf-p2psip-concepts-07 (work in progress), May
1190 2015.
1192 [Overlay-Failure-Detection]
1193 Zhuang, S., "On failure detection algorithms in overlay
1194 networks", Proc. IEEE INFOCOM, Mar 2005.
1196 [Handling_Churn_in_a_DHT]
1197 Rhea, S., "Handling Churn in a DHT", USENIX Annual
1198 Conference, June 2004.
1200 [Diagnostic_Framework]
1201 Jin, X., "A Diagnostic Framework for Peer-to-Peer
1202 Streaming", 2005.
1204 [Diagnostics_and_NAT_traversal_in_P2PP]
1205 Gupta, G., "Diagnostics and NAT Traversal in P2PP - Design
1206 and Implementation", Columbia University Report, June
1207 2008.
1209 [RFC5226] Narten, T. and H. Alvestrand, "Guidelines for Writing an
1210 IANA Considerations Section in RFCs", BCP 26, RFC 5226,
1211 DOI 10.17487/RFC5226, May 2008.
1214 Appendix A. Examples
1216 Below, we sketch how these metrics can be used.
1218 A.1. Example 1
1220 A peer may set EWMA_BYTES_SENT and EWMA_BYTES_RCVD flags in the
1221 PathTrackReq to its direct neighbors. A peer can use EWMA_BYTES_SENT
1222 and EWMA_BYTES_RCVD of another peer to infer whether it is acting as
1223 a media relay. It may then choose not to forward any requests for
1224 media relay to this peer. Similarly, among the various candidates
1225 for filling up the routing table, a peer may prefer a peer with a large
1226 UPTIME value, small RTT, and small LAST_CONTACT value.
1228 A.2. Example 2
1230 A peer may set the STATUS_INFO Flag in the PathTrackReq to a remote
1231 destination peer. The overlay has its own threshold definition for
1232 congestion. The peer can obtain knowledge of all the status
1233 information of the intermediate peers along the path. Then it can
1234 choose other paths to that node for the subsequent requests.
1236 A.3. Example 3
1238 A peer may use Ping to evaluate the average overlay hops to other
1239 peers by sending PingReq to a set of random resource or node IDs in
1240 the overlay. A peer may adjust its timeout value according to the
1241 change of average overlay hops.
1243 Appendix B. Problems with Generating Multiple Responses on Path
1245 An earlier version of this document considered an approach in which
1246 each intermediate peer generated a response as the message traversed
1247 the overlay. This approach was discarded. One reason was that it
1248 could provide a DoS mechanism: an attacker could send an arbitrary
1249 message claiming to be from a spoofed "sender" that the attacker
1250 wished to attack. This one message would cause many messages to be
1251 generated and sent back to the spoofed "sender", one from each
1252 intermediate peer on the message path. While authentication
1253 mechanisms could reduce the risk of this attack, the approach was
1254 still a fundamental break from the request-response nature of the
1255 RELOAD protocol, because multiple responses are generated for a
1256 single request. Although a single request that elicits responses
1257 from all the peers on the route would be more efficient, the
1258 approach was judged too great a security risk and too large a
1259 deviation from the RELOAD architecture.
1261 Appendix C. Changes to the Draft
1263 To RFC editor: This section is to track the changes. Please remove
1264 this section before publication.
1266 C.1. Changes since -00 version
1268 1. Changed title from "Diagnose P2PSIP Overlay Network" to "P2PSIP
1269 Overlay Diagnostics".
1271 2. Changed the table of contents. Added a section about message
1272 processing and a section of examples.
1274 3. Merged diagnostics text from the p2psip base draft -01.
1276 4. Removed ECHO method for security reasons.
1278 C.2. Changes since -01 version
1280 Added BATTERY_STATUS as diagnostic information.
1282 Removed UnderlayTTL test from the Ping method, instead adding an
1283 UNDERLAY_HOP diagnostic information for PathTrack method.
1285 Gave some examples of diagnostic information, and added some
1286 editor's notes for further work.
1288 C.3. Changes since -02 version
1290 Provided further explanation as to why the base draft Ping in its
1291 current form cannot replace the diagnostic Ping, and why some
1292 combination of methods cannot replace PathTrack.
1294 C.4. Changes since -03 version
1296 Modified structure used to share information collected. Both
1297 mechanisms now use a common data structure to convey information.
1299 C.5. Changes since -04 version
1301 Updated the authors' addresses and modified the last sentence in
1302 Section 4.3.1.2.
1304 C.6. Changes since -05 version
1306 Resolved Marc's comments from the mailing list and defined the
1307 details of STATUS_INFO.
1309 C.7. Changes in version -10
1311 Resolved the authorization issue and other comments from WGLC (e.g.,
1312 defined diagnostics as a mandatory extension), and checked the
1313 language.
1315 C.8. Changes in version -15
1317 Changed several diagnostic kind return values to 64 bits rather than
1318 32 bits to provide headroom. Split bandwidth into upstream and
1319 downstream. Renamed length in the diagnostic request object to
1320 ext_length, added ext_length to the response object, and clarified
1321 that ext_length is the length of the diagnostic info/extensions being
1322 returned, not the length of the object.
1324 Aligned many flags/values with RELOAD by using hex rather than decimal values.
1326 Significant reorganization and editing for readability.
1328 Authors' Addresses
1330 Haibin Song
1331 Huawei
1333 Email: haibin.song@huawei.com
1335 Jiang Xingfeng
1336 Huawei
1338 Email: jiangxingfeng@huawei.com
1340 Roni Even
1341 Huawei
1342 14 David Hamelech
1343 Tel Aviv 64953
1344 Israel
1346 Email: roni.even@mail101.huawei.com
1348 David A. Bryan
1349 ethernot.org
1350 Cedar Park, Texas
1351 United States of America
1353 Email: dbryan@ethernot.org
1355 Yi Sun
1356 ICT
1358 Email: sunyi@ict.ac.cn