idnits 2.17.1
draft-ietf-p2psip-diagnostics-15.txt:
Checking boilerplate required by RFC 5378 and the IETF Trust (see
https://trustee.ietf.org/license-info):
----------------------------------------------------------------------------
No issues found here.
Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
----------------------------------------------------------------------------
No issues found here.
Checking nits according to https://www.ietf.org/id-info/checklist :
----------------------------------------------------------------------------
No issues found here.
Miscellaneous warnings:
----------------------------------------------------------------------------
== The copyright year in the IETF Trust and authors Copyright Line does not
match the current year
== Line 337 has weird spacing: '...ionType type;...'
== Line 525 has weird spacing: '... opaque diagn...'
== The document seems to contain a disclaimer for pre-RFC5378 work, but was
first submitted on or after 10 November 2008. The disclaimer is usually
necessary only for documents that revise or obsolete older RFCs, and that
take significant amounts of text from those RFCs. If you can contact all
authors of the source material and they are willing to grant the BCP78
rights to the IETF Trust, you can and should remove the disclaimer.
Otherwise, the disclaimer is needed and you can ignore this comment.
(See the Legal Provisions document at
https://trustee.ietf.org/license-info for more information.)
-- The document date (July 3, 2014) is 3584 days in the past. Is this
intentional?
Checking references for intended status: Proposed Standard
----------------------------------------------------------------------------
(See RFCs 3967 and 4897 for information about using normative references
to lower-maturity documents in RFCs)
== Missing Reference: '0x00' is mentioned on line 678, but not defined
== Missing Reference: '0x0F' is mentioned on line 678, but not defined
== Unused Reference: 'RFC0792' is defined on line 1126, but no explicit
reference was found in the text
== Unused Reference: 'I-D.ietf-p2psip-self-tuning' is defined on line 1149,
but no explicit reference was found in the text
== Unused Reference: 'I-D.ietf-p2psip-concepts' is defined on line 1155,
but no explicit reference was found in the text
== Outdated reference: A later version (-09) exists of
draft-ietf-p2psip-concepts-06
-- Obsolete informational reference (is this intentional?): RFC 5226
(Obsoleted by RFC 8126)
Summary: 0 errors (**), 0 flaws (~~), 10 warnings (==), 2 comments (--).
Run idnits with the --verbose option for more detailed information about
the items above.
--------------------------------------------------------------------------------
2 P2PSIP Working Group H. Song
3 Internet-Draft X. Jiang
4 Intended status: Standards Track R. Even
5 Expires: January 4, 2015 Huawei
6 D. Bryan
7 Ethernot.org
8 Y. Sun
9 ICT
10 July 3, 2014
12 P2P Overlay Diagnostics
13 draft-ietf-p2psip-diagnostics-15
15 Abstract
17 This document describes mechanisms for P2P overlay diagnostics. It
18 defines extensions to the RELOAD P2PSIP base protocol to collect
19 diagnostic information, and details the protocol specifications for
20 these extensions. Useful diagnostic information for connection and
21 node status monitoring is also defined. The document also describes
22 the usage scenarios and provides examples of how these methods are
23 used to perform diagnostics in P2PSIP overlay networks.
25 Status of This Memo
27 This Internet-Draft is submitted in full conformance with the
28 provisions of BCP 78 and BCP 79.
30 Internet-Drafts are working documents of the Internet Engineering
31 Task Force (IETF). Note that other groups may also distribute
32 working documents as Internet-Drafts. The list of current Internet-
33 Drafts is at http://datatracker.ietf.org/drafts/current/.
35 Internet-Drafts are draft documents valid for a maximum of six months
36 and may be updated, replaced, or obsoleted by other documents at any
37 time. It is inappropriate to use Internet-Drafts as reference
38 material or to cite them other than as "work in progress."
40 This Internet-Draft will expire on January 4, 2015.
42 Copyright Notice
44 Copyright (c) 2014 IETF Trust and the persons identified as the
45 document authors. All rights reserved.
47 This document is subject to BCP 78 and the IETF Trust's Legal
48 Provisions Relating to IETF Documents
49 (http://trustee.ietf.org/license-info) in effect on the date of
50 publication of this document. Please review these documents
51 carefully, as they describe your rights and restrictions with respect
52 to this document. Code Components extracted from this document must
53 include Simplified BSD License text as described in Section 4.e of
54 the Trust Legal Provisions and are provided without warranty as
55 described in the Simplified BSD License.
57 This document may contain material from IETF Documents or IETF
58 Contributions published or made publicly available before November
59 10, 2008. The person(s) controlling the copyright in some of this
60 material may not have granted the IETF Trust the right to allow
61 modifications of such material outside the IETF Standards Process.
62 Without obtaining an adequate license from the person(s) controlling
63 the copyright in such materials, this document may not be modified
64 outside the IETF Standards Process, and derivative works of it may
65 not be created outside the IETF Standards Process, except to format
66 it for publication as an RFC or to translate it into languages other
67 than English.
69 Table of Contents
71 1. Introduction
72 2. Terminology
73 3. Diagnostic Scenarios
74 4. Data Collection Mechanisms
75    4.1. Overview of Operations
76    4.2. "Ping-like" Behavior: Extending Ping
77       4.2.1. RELOAD Request Extension: Ping
78    4.3. "Traceroute-like" Behavior: The Path_Track Method
79       4.3.1. New RELOAD Request: PathTrack
80          4.3.1.1. PathTrack Request
81          4.3.1.2. PathTrack Response
82    4.4. Error Code Extensions
83 5. Diagnostic Data Structures
84    5.1. DiagnosticsRequest Data Structure
85    5.2. DiagnosticsResponse Data Structure
86    5.3. dMFlags and Diagnostic Kind ID Types
87 6. Message Processing
88    6.1. Message Creation and Transmission
89    6.2. Message Processing: Intermediate Peers
90    6.3. Message Response Creation
91    6.4. Interpreting Results
92 7. Authorization through Overlay Configuration
93 8. Security Considerations
94 9. IANA Considerations
95    9.1. Diagnostics Flag
96    9.2. Diagnostic Kind ID Types
97    9.3. Message Codes
98    9.4. Error Code
99    9.5. Message Extension
100    9.6. XML Name Space Registration
101 10. Acknowledgments
102 11. References
103    11.1. Normative References
104    11.2. Informative References
105 Appendix A. Examples
106    A.1. Example 1
107    A.2. Example 2
108    A.3. Example 3
109 Appendix B. Problems with Generating Multiple Responses on Path
110 Appendix C. Changes to the Draft
111    C.1. Changes since -00 version
112    C.2. Changes since -01 version
113    C.3. Changes since -02 version
114    C.4. Changes since -03 version
115    C.5. Changes since -04 version
116    C.6. Changes since -05 version
117    C.7. Changes in version -10
118    C.8. Changes in version -15
119 Authors' Addresses
121 1. Introduction
123 In the last few years, overlay networks have rapidly evolved and
124 emerged as a promising platform for deployment of new applications
125 and services in the Internet. One of the reasons overlay networks
126 are seen as an excellent platform for large scale distributed systems
127 is their resilience in the presence of failures. This resilience has
128 three aspects: data replication, routing recovery, and static
129 resilience. Routing recovery algorithms are used to repopulate the
130 routing table with live nodes when failures are detected. Static
131 resilience measures the extent to which an overlay can route around
132 failures even before the recovery algorithm repairs the routing
133 table. Both routing recovery and static resilience rely on accurate
134 and timely detection of failures.
136 There are a number of situations in which some nodes in a Peer-to-
137 Peer (P2P) overlay may malfunction or behave badly. For example,
138 these nodes may be disabled, congested, or may be misrouting
139 messages. The impact of these malfunctions on the overlay network
140 may be a degradation of quality of service provided collectively by
141 the peers in the overlay network or an interruption of the overlay
142 services. It is desirable to identify malfunctioning or badly
143 behaving peers through diagnostic tools, and exclude or reject them
144 from the P2P system. Node failures may also be caused by failures of
145 underlying layers. For example, recovery from an incorrect overlay
146 topology may be slow when the speed at which IP routing recovers
147 after link failures is very slow. Moreover, if a backbone link fails
148 and the failover is slow, the network may be partitioned, leading to
149 partitions of overlay topologies and inconsistent routing results
150 between different partitioned components.
152 Some keep-alive algorithms based on periodic probe and acknowledge
153 mechanisms enable accurate and timely detection of failures of one
154 node's neighbors [Overlay-Failure-Detection], but these algorithms by
155 themselves can only detect the disabled neighbors using the periodic
156 method. This may not be sufficient for the service provider
157 operating the overlay network.
159 For Peer-to-Peer SIP (P2PSIP), a single, general P2PSIP overlay
160 diagnostic framework supporting periodic and on-demand methods for
161 detecting node failures and network failures is desirable. This
162 document describes a general P2PSIP overlay diagnostic extension to
163 the P2PSIP base protocol RELOAD [RFC6940] and is intended as a
164 complement to keep-alive algorithms in the P2PSIP overlay itself.
166 2. Terminology
168 This document uses the concepts defined in RELOAD [RFC6940].
170 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
171 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
172 document are to be interpreted as described in [RFC2119].
174 3. Diagnostic Scenarios
176 P2P systems are self-organizing and ideally setup and configuration
177 of individual P2P nodes requires no network management in the
178 traditional sense. However, users of an overlay, as well as P2P
179 service providers may contemplate usage scenarios where some
180 monitoring and diagnostics are required. We present a simple
181 connectivity test and some useful diagnostic information that may be
182 used in such diagnostics.
184 The common usage scenarios for P2P diagnostics can be broadly
185 categorized in three classes:
187 1. Automatic diagnostics built into the P2P overlay routing
188 protocol. Nodes perform periodic checks of known neighbors and
189 remove those nodes from the routing tables that fail to respond
190 to connectivity checks [Handling_Churn_in_a_DHT]. Unresponsive
191 nodes may only be temporarily disabled, for example due to a
192 local cryptographic processing overload, disk processing overload
193 or link overload. It is therefore useful to repeat the
194 connectivity checks to see whether nodes have recovered and can again
195 be placed in the routing tables. This process is known as 'failed
196 node recovery' and can be optimized as described in the paper
197 "Handling Churn in a DHT" [Handling_Churn_in_a_DHT].
199 2. Diagnostics used by a particular node to follow up on an
200 individual user complaint or failure. For example, a technical
201 support staff member may use a desktop sharing application (with
202 the permission of the user) to remotely determine the health of,
203 and possible problems with, the malfunctioning node. Part of the
204 remote diagnostics may consist of simple connectivity tests with
205 other nodes in the P2PSIP overlay and retrieval of statistics
206 from nodes in the overlay. The simple connectivity tests are not
207 dependent on the type of P2PSIP overlay. Note that other tests
208 may be required as well, including checking the health and
209 performance of the user's computer or mobile device and checking
210 the bandwidth of the link connecting the user to the Internet.
212 3. P2P system-wide diagnostics used to check the overall health of
213 the P2P overlay network. These include checking the consumption
214 of network bandwidth, checking for the presence of problem links
215 and checking for abusive or malicious nodes. This is not a
216 trivial problem and has been studied in detail for content and
217 streaming P2P overlays [Diagnostic_Framework], and has not been
218 addressed in earlier P2PSIP documents
219 [Diagnostics_and_NAT_traversal_in_P2PP]. While this is a
220 difficult problem, a great deal of information that can help in
221 diagnosing these problems can be obtained by obtaining basic
222 diagnostic information for peers and the network. This document
223 provides a framework for obtaining this information.
225 4. Data Collection Mechanisms
227 4.1. Overview of Operations
229 The diagnostic mechanisms described in this document are primarily
230 intended to detect and locate failures or monitor performance in
231 P2PSIP overlay networks. They provide mechanisms to detect and locate
232 malfunctioning or badly behaving nodes, including disabled nodes,
233 congested nodes, and misrouting peers. They also provide a mechanism to
234 detect direct connectivity or connectivity to a specified node, a
235 mechanism to detect the availability of specified resource records,
236 and a mechanism to discover P2PSIP overlay topology and underlay
237 topology failures.
239 The P2PSIP diagnostics extensions define two mechanisms to collect
240 data. The first is an extension to the RELOAD Ping mechanism,
241 allowing diagnostic data to be queried from a node, as well as to
242 diagnose the path to that node. The second is a new method and
243 response, PathTrack, for collecting diagnostic information
244 iteratively. Payloads that allow diagnostic data to be collected and
245 represented are presented for these mechanisms, and additional error
246 codes are introduced. Essentially, this document reuses the RELOAD
247 [RFC6940] specification and extends it to introduce the new
248 diagnostics methods. The extensions strictly follow the RELOAD
249 specification for message routing, transport, NAT traversal, etc.
250 The diagnostic methods are, however, P2PSIP protocol independent.
252 This document primarily describes how to detect and locate failures
253 including disabled nodes, congested nodes, misrouting behaviors and
254 underlying network faults in P2PSIP overlay networks through a simple
255 and efficient mechanism. This mechanism is modeled after the ping/
256 traceroute paradigm: ping [RFC0792] is used for connectivity checks,
257 and traceroute is used for hop-by-hop fault localization as well as
258 path tracing. This document specifies a "ping-like" mode (by
259 extending the RELOAD Ping method to gather diagnostics) and a
260 "traceroute-like" mode (by defining the new PathTrack method) for
261 diagnosing P2PSIP overlay networks.
263 One way these tools can be used is to detect the connectivity to the
264 specified node or the availability of the specified resource-record
265 through the extended P2PSIP Ping operation. Once the overlay network
266 receives some alarms about overlay service degradation or
267 interruption, a Ping is sent. If the Ping fails, one can then send a
268 PathTrack to determine where the fault lies.
270 The diagnostic information can only be provided to authorized nodes.
271 Some diagnostic information can be provided to all the participants
272 in the P2PSIP overlay, and some other diagnostic information can only
273 be provided to the nodes authorized by the local or overlay policy.
274 The authorization depends on the type of the diagnostic information
275 and the administrative considerations, and is application specific.
277 This document considers the general administrative scenario based on
278 diagnostic kind type, where a whole overlay can authorize a certain
279 type of diagnostic information to a small list of particular nodes
280 (e.g. administrative nodes). That means, if a node gets the
281 authorization to access a diagnostic kind type, it can access that
282 information from all nodes in the overlay network. It leaves the
283 scenario where a particular node authorizes its diagnostic
284 information to a particular list of nodes out of scope. This could
285 be achieved by extending this document if there is a requirement in
286 the near future. The default policy or access rule for a type of
287 diagnostic information is "permit" unless specified in the
288 diagnostics extension document. As the RELOAD protocol already
289 requires that each message carries the message signature of the
290 sender, the receiver of the diagnostics requests can use the
291 signature to identify the sender. It can then use the overlay
292 configuration file with this signature to determine which types of
293 diagnostic information that node is authorized for.
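The authorization rule above can be sketched in Python as follows. This is only an illustration under stated assumptions: the policy table, the kind value 0x0005, and the node identifiers are hypothetical, the real overlay configuration format is the one described in Section 7, and the sender is assumed to have already been identified from the RELOAD message signature.

```python
# Hypothetical in-memory view of the overlay configuration: for each
# restricted diagnostic kind, the set of node identifiers allowed to
# request it. Kinds that do not appear here use the default "permit" rule.
RESTRICTED_KINDS = {
    0x0005: {"admin-node-1", "admin-node-2"},  # illustrative kind value
}


def is_authorized(sender_node_id, kind, restricted=RESTRICTED_KINDS):
    """Return True if the (already identified) sender may read `kind`."""
    allowed = restricted.get(kind)
    if allowed is None:
        return True  # default policy is "permit"
    return sender_node_id in allowed


def filter_requested_kinds(sender_node_id, kinds, restricted=RESTRICTED_KINDS):
    """Keep only the requested kinds the sender is authorized for."""
    return [k for k in kinds if is_authorized(sender_node_id, k, restricted)]
```

Because authorization is granted per diagnostic kind for the whole overlay, the same table can be consulted by every node that receives a diagnostics request.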
295 In the remainder of this section we define mechanisms for collecting
296 data, as well as the specific protocol extensions (message
297 extensions, new methods, and error codes) required to collect this
298 information. In Section 5 we discuss the format of the data
299 collected, and in Section 6 we discuss detailed message processing.
301 4.2. "Ping-like" Behavior: Extending Ping
303 To provide "ping-like" behavior, the RELOAD Ping method is extended
304 to collect diagnostic data along the path. The request message is
305 forwarded by the intermediate peers along the path and then
306 terminated by the responsible peer. After optional local
307 diagnostics, the responsible peer returns a response message. If an
308 error is found when routing, an Error response is sent to the
309 initiator node by the intermediate peer. Please refer to RELOAD
310 [RFC6940] for details of the protocol.
312 The message flow of a Ping message (with diagnostic extensions) is as
313 follows:
315      Peer A               Peer B               Peer C               Peer D
316        |                    |                    |                    |
317        |(1). PingReq        |                    |                    |
318        |------------------->|(2). PingReq        |                    |
319        |                    |------------------->|(3). PingReq        |
320        |                    |                    |------------------->|
321        |                    |                    |                    |
322        |                    |                    |<-------------------|
323        |                    |<-------------------|(4). PingAns        |
324        |<-------------------|(5). PingAns        |                    |
325        |(6). PingAns        |                    |                    |
326        |                    |                    |                    |
328 Figure 1: Ping Diagnostic Message Flow
330 4.2.1. RELOAD Request Extension: Ping
332 To extend the ping request for use in diagnostics, a new extension of
333 RELOAD is defined. The structure for a MessageExtension in RELOAD is
334 defined as:
336 struct {
337     MessageExtensionType  type;
338     Boolean               critical;
339     opaque                extension_contents<0..2^32-1>;
340 } MessageExtension;
342 For the Ping request extension, we define a new MessageExtensionType,
343 extension 0x0002 named Diagnostic_Ping, as specified in Table 4 and
344 in RELOAD [RFC6940]. The extension contents consist of a
345 DiagnosticsRequest structure, defined later in this document in
346 Section 5.1. This extension MAY be used for new requests of the
347 Ping method and MUST NOT be included in requests using any other
348 method.
350 This extension is not critical. If a peer does not support the
351 extension, it will simply ignore the diagnostic portion of the
352 message and will treat the message as if it were a normal ping.
353 Senders MUST accept a response that lacks diagnostic information and
354 SHOULD NOT resend the message expecting a reply. Receivers that
355 receive a method other than Ping including this extension MUST ignore
356 the extension.
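As a rough illustration, a Diagnostic_Ping MessageExtension could be serialized and parsed as in the Python sketch below. The wire layout is an assumption based on the structure shown above and RELOAD's big-endian encoding: a 2-byte type, a 1-byte Boolean, and a 4-byte length prefix for the opaque contents. The function names are hypothetical.

```python
import struct

DIAGNOSTIC_PING = 0x0002  # extension type named in this section

_HEADER = struct.Struct("!HBI")  # type, critical flag, contents length


def encode_message_extension(ext_type, critical, contents):
    """Serialize one MessageExtension (assumed layout, see lead-in)."""
    if not 0 <= ext_type <= 0xFFFF:
        raise ValueError("extension type must fit in 16 bits")
    return _HEADER.pack(ext_type, 1 if critical else 0, len(contents)) + contents


def decode_message_extension(data):
    """Parse one MessageExtension; returns (type, critical, contents, rest)."""
    if len(data) < _HEADER.size:
        raise ValueError("truncated MessageExtension header")
    ext_type, critical, length = _HEADER.unpack_from(data, 0)
    end = _HEADER.size + length
    if end > len(data):
        raise ValueError("truncated MessageExtension contents")
    return ext_type, bool(critical), data[_HEADER.size:end], data[end:]
```

Since the extension is not critical, a sender would pass `critical=False`, and the contents would be a serialized DiagnosticsRequest.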
358 4.3. "Traceroute-like" Behavior: The Path_Track Method
360 We define a simple PathTrack method for retrieving diagnostic
361 information iteratively. The mechanism defined in this document
362 follows the RELOAD specification; the new request and response
363 messages use the message format specified in RELOAD. Please
364 refer to RELOAD [RFC6940] for details of the protocol.
366 The operation of this request is shown below in Figure 2. The
367 initiator node A asks its neighbor B which is the next hop peer to
368 the destination ID, and B returns a message with the next hop peer C
369 information, along with optional diagnostic information for B to the
370 initiator node. Then the initiator node A asks the next hop peer C
371 (directly or via symmetric routing) to return next hop peer D
372 information and diagnostic information of C. Unless a failure
373 prevents the message from being forwarded, this step can be
374 iteratively repeated until the request reaches responsible peer D for
375 the destination ID, and retrieves diagnostic information of peer D.
377 The message flow of a PathTrack message (with diagnostic extensions)
378 is as follows:
380      Peer-A               Peer-B               Peer-C               Peer-D
381        |                    |                    |                    |
382        |(1).PathTrackReq    |                    |                    |
383        |------------------->|                    |                    |
384        |(2).PathTrackAns    |                    |                    |
385        |<-------------------|                    |                    |
386        |                    |(3).PathTrackReq    |                    |
387        |--------------------|------------------->|                    |
388        |                    |(4).PathTrackAns    |                    |
389        |<-------------------|--------------------|                    |
390        |                    |                    |(5).PathTrackReq    |
391        |--------------------|--------------------|------------------->|
392        |                    |                    |(6).PathTrackAns    |
393        |<-------------------|--------------------|--------------------|
394        |                    |                    |                    |
396 Figure 2: PathTrack Diagnostic Message Flow
398 There have been proposals that RouteQuery and a series of Fetch
399 requests can be used to replace the PathTrack mechanism, but in the
400 presence of churn such an operation would not, strictly speaking,
401 provide identical results, as the path may change between RouteQuery
402 and Fetch operations (although obviously the path could change
403 between steps of PathTrack as well).
405 4.3.1. New RELOAD Request: PathTrack
407 This document defines a new RELOAD method, PathTrack, to retrieve the
408 diagnostic information from the intermediate peers along the routing
409 path. At each step of the PathTrack request, the responsible peer
410 responds to the initiator node with requested status information.
411 Status information can include a peer's congestion state, processing
412 power, available bandwidth, the number of entries in its neighbor
413 table, uptime, identity, network address information, and next hop
414 peer information.
416 A PathTrack request specifies which diagnostic information is
417 requested using a DiagnosticsRequest data structure, defined and
418 discussed in detail later in this document in Section 5.1. Base
419 information is requested by setting the appropriate flags in the data
420 structure in the request. If all flags are clear (no bits are set),
421 then the PathTrack request is only used for requesting the next hop
422 information. In this case the iterative mode of PathTrack is
423 degraded to a RouteQuery method which is only used for checking the
424 liveness of the peers along the routing path. The PathTrack request
425 can be routed directly or through the overlay based on the routing
426 mode chosen by the initiator node.
428 A response to a successful PathTrackReq is a PathTrackAns message.
429 The PathTrackAns contains general diagnostic information in the
430 payload, returned using a DiagnosticsResponse data structure. This
431 data structure is defined and discussed in detail later in this
432 document in Section 5.2. The information returned is determined
433 based on the information requested in the flags in the corresponding
434 request.
436 4.3.1.1. PathTrack Request
438 The structure of the PathTrack request is as follows:
440 struct{
441     Destination         destination;
442     DiagnosticsRequest  request;
443 }PathTrackReq;
445 The fields of the PathTrackReq are as follows:
447 destination : The destination which the initiator node is
448 interested in. This may be any valid destination object,
449 including a NodeID, opaque ids, or ResourceID.
451 request : A DiagnosticsRequest, as discussed in Section 5.1.
453 4.3.1.2. PathTrack Response
455 The structure of the PathTrack Response is as follows:
457 struct{
458     Destination          next_hop;
459     DiagnosticsResponse  response;
460 }PathTrackAns;
462 The fields of the PathTrackAns are as follows:
464 next_hop : The information of the next hop node from the
465 responding intermediate peer to the destination node. If the
466 responding peer is the responsible peer for the destination ID,
467 then the next_hop node ID equals the responding node ID, and after
468 that the initiator MUST stop the iterative process.
470 response : A DiagnosticsResponse, as discussed in Section 5.2.
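The iterative use of PathTrackReq and PathTrackAns can be sketched in Python as below. The `send_request` callback is a hypothetical stand-in for whatever transport an implementation uses to send a PathTrackReq and return the fields of the PathTrackAns; the `max_steps` guard is an added safety assumption and not something this document requires.

```python
def path_track(first_hop, destination, send_request, max_steps=32):
    """Query peers one by one until the responsible peer is reached.

    `send_request(peer, destination)` is a hypothetical callback that
    sends a PathTrackReq to `peer` and returns (next_hop, diagnostics)
    taken from the PathTrackAns, or raises on failure.
    Returns the list of (peer, diagnostics) pairs in path order.
    """
    path = []
    peer = first_hop
    for _ in range(max_steps):
        next_hop, diagnostics = send_request(peer, destination)
        path.append((peer, diagnostics))
        if next_hop == peer:
            # The responder is the responsible peer: the initiator
            # MUST stop the iterative process here.
            return path
        peer = next_hop
    raise RuntimeError("responsible peer not reached within max_steps")
```

With a toy routing table such as `{"B": "C", "C": "D", "D": "D"}`, starting at B yields the path B, C, D, which matches the message flow in Figure 2.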
472 4.4. Error Code Extensions
474 This document extends the Error response method defined in the RELOAD
475 specification to support error cases resulting from diagnostic
476 queries. When an error is encountered in RELOAD, the Message Code
477 0xFFFF is returned. The ErrorResponse structure includes an error
478 code, and we define new error codes to report possible error
479 conditions detected while performing diagnostics:
481 Code Value    Error Code Name
482 0x65          Underlay Destination Unreachable
483 0x66          Underlay Time exceeded
484 0x67          Message Expired
485 0x68          Upstream Misrouting
486 0x69          Loop detected
487 0x70          TTL hops exceeded
489 The final error codes will be assigned by IANA as specified in RELOAD
490 protocol [RFC6940].
492 In addition, this document introduces several types of error
493 information in the error_info field in the case of Code 0x65. These
494 are represented as an opaque UTF-8 text string. Here are some
495 examples for the error info.
497 error_info:
499 net unreachable
500 host unreachable
501 protocol unreachable
502 port unreachable
503 fragmentation needed
504 source route failed
506 The error_info field values of the Code 0x66 to 0x70 are to be
507 application specific and defined by the particular overlay.
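For illustration only, the provisional code values listed above could be represented in Python as follows. The enum member names and the `describe_error` helper are hypothetical; the numeric values are copied from the table and remain subject to IANA assignment.

```python
from enum import IntEnum


class DiagnosticError(IntEnum):
    """Provisional diagnostic error codes from the table above."""
    UNDERLAY_DESTINATION_UNREACHABLE = 0x65
    UNDERLAY_TIME_EXCEEDED = 0x66
    MESSAGE_EXPIRED = 0x67
    UPSTREAM_MISROUTING = 0x68
    LOOP_DETECTED = 0x69
    TTL_HOPS_EXCEEDED = 0x70


def describe_error(code, error_info=b""):
    """Render an error code and its UTF-8 error_info as one line of text."""
    try:
        name = DiagnosticError(code).name
    except ValueError:
        name = "UNKNOWN"
    text = error_info.decode("utf-8", errors="replace")
    return f"0x{code:02X} {name}" + (f": {text}" if text else "")
```

For example, `describe_error(0x65, b"host unreachable")` combines the code with one of the example error_info strings shown above.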
509 5. Diagnostic Data Structures
511 Both the extended Ping method and the PathTrack method use the
512 following common diagnostics data structures to collect data. Two
513 common structures are defined: DiagnosticsRequest for requesting
514 data, and DiagnosticsResponse for returning the information.
516 5.1. DiagnosticsRequest Data Structure
518 The DiagnosticsRequest data structure is used to request diagnostic
519 information and has the following form:
521 enum{ (2^16-1) } DiagnosticKindId;
523 struct{
524     DiagnosticKindId  kind;
525     opaque            diagnostic_extension_contents<0..2^32-1>;
526 }DiagnosticExtension;
528 struct{
529     uint64               expiration;
530     uint64               timestamp_initiated;
531     uint64               dMFlags;
532     uint32               ext_length;
533     DiagnosticExtension  diagnostic_extensions_list<0..2^32-1>;
534 }DiagnosticsRequest;
536 The fields in the DiagnosticsRequest are as follows:
538 expiration : The time when the request will expire represented as
539 the number of milliseconds elapsed since midnight Jan 1, 1970 UTC
540 not counting leap seconds. This will have the same values for
541 seconds as standard UNIX time or POSIX time. More information can
542 be found at UnixTime [UnixTime]. This value MUST be between 1 and
543 600 seconds in the future.
545 timestamp_initiated : The time when the P2PSIP diagnostics request
546 was initiated represented as the number of milliseconds elapsed
547 since midnight Jan 1, 1970 UTC not counting leap seconds. This
548 will have the same values for seconds as standard UNIX time or
549 POSIX time.
551 dMFlags : A mandatory field which is an unsigned 64-bit integer
552 indicating which base diagnostic information the request initiator
553 node is interested in. The initiator sets different bits to
554 retrieve different kinds of diagnostic information. If dMFlags is
555 set to zero, then no base diagnostic information is conveyed in
556 the PathTrack response. If dMFlags is set to all '1's, then all
557 base diagnostic information values are requested. A request may
558 set any number of the flags to request the corresponding
559 diagnostic information.
561 ext_length : the length of the extended diagnostic request
562 information in bytes. If the value is greater than or equal to 1,
563 then some extended diagnostic information is requested. A value
564 of zero indicates no extended diagnostic information is included.
565 The value of ext_length MUST NOT be negative. Note that this is NOT
566 the length of the entire DiagnosticsRequest data structure.
568 Note that this memo specifies the initial set of flags; the flags can
569 be extended. The dMFlags indicate general diagnostic information.
570 The mapping between the bits in the dMFlags and the diagnostic
571 information kind presented is as described in Section 9.1.
573 diagnostic_extensions_list : consists of one or more
574 DiagnosticExtension structures (see below) documenting additional
575 diagnostic information being requested. Each DiagnosticExtension
576 consists of the following fields:
578 kind : a numerical code indicating the type of extension
579 diagnostic information (see Section 9.2). Note that kinds
580 0xF000 - 0xFFFE are reserved for overlay specific diagnostics
581 and may be used without IANA registration for local diagnostic
582 information. Kinds from 0x0000 to 0x003F MUST NOT be indicated
583 in the diagnostic_extensions_list in the message request
584 because they can be represented using the dMFlags in a much
585 simpler way.
587 diagnostic_extension_contents : the opaque data containing the
588 request for this particular extension. This data is extension
589 dependent.
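A rough Python sketch of how a DiagnosticsRequest could be serialized is given below. It rests on several assumptions that this section does not state explicitly: big-endian encoding as used elsewhere in RELOAD, a 4-byte length prefix on each <0..2^32-1> vector, and ext_length carrying the byte length of the serialized extension list. The function names and the 30-second default lifetime are illustrative.

```python
import struct
import time


def encode_extension(kind, contents=b""):
    """Serialize one DiagnosticExtension (2-byte kind, 4-byte length)."""
    if not 0 <= kind <= 0xFFFF:
        raise ValueError("kind must fit in 16 bits")
    if kind <= 0x003F:
        # Base kinds are requested through dMFlags, not the extension list.
        raise ValueError("kinds 0x0000-0x003F MUST NOT appear in the list")
    return struct.pack("!HI", kind, len(contents)) + contents


def encode_diagnostics_request(dm_flags, extensions=(), lifetime_ms=30_000,
                               now_ms=None):
    """Serialize a DiagnosticsRequest; `extensions` is (kind, bytes) pairs."""
    if not 0 <= dm_flags < 2 ** 64:
        raise ValueError("dMFlags is an unsigned 64-bit integer")
    if not 1_000 <= lifetime_ms <= 600_000:
        raise ValueError("expiration MUST be 1 to 600 seconds in the future")
    if now_ms is None:
        now_ms = int(time.time() * 1000)  # milliseconds since the Unix epoch
    ext_bytes = b"".join(encode_extension(k, c) for k, c in extensions)
    fixed = struct.pack("!QQQI", now_ms + lifetime_ms, now_ms, dm_flags,
                        len(ext_bytes))  # ext_length (assumed meaning)
    return fixed + struct.pack("!I", len(ext_bytes)) + ext_bytes
```

Under these assumptions a request with no extensions is 32 bytes long, and a PathTrack request that only asks for next hop information would pass `dm_flags=0`.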
591 5.2. DiagnosticsResponse Data Structure
593 enum { (2^16-1) } DiagnosticKindId;
594 struct{
595     DiagnosticKindId  kind;
596     opaque            diagnostic_info_contents<0..2^16-1>;
597 }DiagnosticInfo;
599 struct{
600     uint64          expiration;
601     uint64          timestamp_received;
602     uint8           hop_counter;
603     uint32          ext_length;
604     DiagnosticInfo  diagnostic_info_list<0..2^32-1>;
605 }DiagnosticsResponse;
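As a hedged illustration of the layout above, the following Python sketch decodes a DiagnosticsResponse. It assumes big-endian encoding, a 4-byte length prefix on diagnostic_info_list, and a 2-byte length prefix on each diagnostic_info_contents. The `hops_traversed` helper further assumes that the initiator knows the initial TTL it sent (100 is used here only as an illustrative default) and that each forwarding peer decrements the TTL by one.

```python
import struct

_FIXED = struct.Struct("!QQBII")  # expiration, timestamp_received,
                                  # hop_counter, ext_length, list length


def decode_diagnostics_response(data):
    """Parse a DiagnosticsResponse into a dict (assumed layout, see lead-in)."""
    if len(data) < _FIXED.size:
        raise ValueError("truncated DiagnosticsResponse")
    expiration, received, hop_counter, ext_length, list_len = \
        _FIXED.unpack_from(data, 0)
    offset = _FIXED.size
    end = offset + list_len
    if end > len(data):
        raise ValueError("truncated diagnostic_info_list")
    infos = []
    while offset < end:
        if end - offset < 4:
            raise ValueError("truncated DiagnosticInfo header")
        kind, length = struct.unpack_from("!HH", data, offset)
        offset += 4
        if offset + length > end:
            raise ValueError("truncated diagnostic_info_contents")
        infos.append((kind, data[offset:offset + length]))
        offset += length
    return {"expiration": expiration, "timestamp_received": received,
            "hop_counter": hop_counter, "ext_length": ext_length,
            "infos": infos}


def hops_traversed(hop_counter, initial_ttl=100):
    """Estimate overlay hops from the TTL value echoed in hop_counter."""
    if hop_counter > initial_ttl:
        raise ValueError("hop_counter exceeds the initial TTL")
    return initial_ttl - hop_counter
```

The decoder returns each DiagnosticInfo as a (kind, contents) pair and leaves interpretation of the contents to the caller, since their format depends on the diagnostic kind.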
607 The fields in the DiagnosticsResponse are as follows:
609 expiration : The time when the response will expire represented as
610 the number of milliseconds elapsed since midnight Jan 1, 1970 UTC
611 not counting leap seconds. This will have the same values for
612 seconds as standard UNIX time or POSIX time. This value MUST be
613 between 1 and 600 seconds in the future.
615 timestamp_received : The time when P2PSIP Overlay diagnostic
616 request was received represented as the number of milliseconds
617 elapsed since midnight Jan 1, 1970 UTC not counting leap seconds.
618 This will have the same values for seconds as standard UNIX time
619 or POSIX time.
621 hop_counter : This field only appears in diagnostic responses. It
622 MUST be exactly copied from the TTL field of the forwarding header
623 in the received request. This information is sent back to the
624 request initiator, allowing it to compute the number of hops that
625 the message traversed in the overlay.
627 ext_length : the length of the returned DiagnosticInfo information
628 in bytes. If the value is greater than or equal to 1, then some
629 extended diagnostic information is returned. A value of zero
630 indicates no extended diagnostic information is included. The
631 value of ext_length MUST NOT be negative. Note that this is NOT the
632 length of the entire DiagnosticsResponse data structure.
634 diagnostic_info_list : consists of one or more DiagnosticInfo
635 structures containing the requested diagnostic information. The
636 fields in the DiagnosticInfo structure are as follows:
638 kind : A numeric code indicating the type of information being
639 returned. For base data requested using the dMFlags, this code
640 corresponds to the dMFlag set, and is described in Section 5.1.
641 For diagnostic extensions, this code will be identical to the
642 value of the DiagnosticKindId set in the "kind" field of the
643 DiagnosticExtension of the request. See Section 9.2.
645 diagnostic_info_contents : Data containing the value for the
646 diagnostic information being reported. Various kinds of
647 diagnostic information can be retrieved. Please refer to
648 Section 5.3 for details of the diagnostic kind ID for the base
649 diagnostic information that may be reported.
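The layout above can be sketched in Python. This is only an illustration, not a normative encoding: it assumes network byte order, a 16-bit length prefix on diagnostic_info_contents, and a separate 32-bit length prefix on diagnostic_info_list in addition to ext_length. The function names are hypothetical.

```python
import struct

HEADER = "!QQBI"  # expiration, timestamp_received, hop_counter, ext_length


def encode_diagnostic_info(kind, contents):
    """DiagnosticInfo: 16-bit kind, then contents behind a 16-bit length."""
    if len(contents) > 0xFFFF:
        raise ValueError("diagnostic_info_contents exceeds 2^16-1 bytes")
    return struct.pack("!HH", kind, len(contents)) + contents


def encode_response(expiration_ms, received_ms, hop_counter, infos):
    """infos is a list of (kind, contents) pairs."""
    body = b"".join(encode_diagnostic_info(k, c) for k, c in infos)
    # ext_length counts only the DiagnosticInfo bytes, not the whole structure.
    header = struct.pack(HEADER, expiration_ms, received_ms, hop_counter, len(body))
    # Assumption: the <0..2^32-1> vector carries its own 32-bit length prefix.
    return header + struct.pack("!I", len(body)) + body


def decode_response(data):
    expiration, received, hops, ext_length = struct.unpack_from(HEADER, data, 0)
    offset = struct.calcsize(HEADER)
    (vector_length,) = struct.unpack_from("!I", data, offset)
    offset += 4
    end = offset + vector_length
    infos = []
    while offset < end:
        kind, size = struct.unpack_from("!HH", data, offset)
        offset += 4
        infos.append((kind, data[offset:offset + size]))
        offset += size
    return {
        "expiration": expiration,
        "timestamp_received": received,
        "hop_counter": hops,
        "ext_length": ext_length,
        "infos": infos,
    }
```

A response carrying a single one-byte STATUS_INFO value would then have ext_length 5 under these assumptions: 2 bytes of kind, 2 bytes of length, and 1 byte of contents.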
651 5.3. dMFlags and Diagnostic Kind ID Types
653 The dMFlags field described above is a 64-bit field that allows
654 initiator nodes to identify up to 62 items of base information to
655 request in a request message (the first and last flags being
656 reserved). When the requested base information is returned in the
657 response, the value of the diagnostic kind ID will correspond to the
658 numeric field marked in the dMFlags in the request. The values for
659 the dMFlags are defined in Section 9.1 and the diagnostic kind IDs
660 are defined in Section 9.2. The information contained for each value
661 is described in this section.
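As an illustration of the correspondence between flags and kind IDs, the sketch below lists the sixteen base items in the order of the tables in Sections 9.1 and 9.2. In those tables the item with flag bit i (counting from the least significant bit) has kind ID i + 1. The function name requested_kinds is hypothetical, and ignoring flag bits with no initial assignment is an assumption.

```python
BASE_ITEMS = [
    "STATUS_INFO", "ROUTING_TABLE_SIZE", "PROCESS_POWER",
    "UPSTREAM_BANDWIDTH", "DOWNSTREAM_BANDWIDTH", "SOFTWARE_VERSION",
    "MACHINE_UPTIME", "APP_UPTIME", "MEMORY_FOOTPRINT", "DATASIZE_STORED",
    "INSTANCES_STORED", "MESSAGES_SENT_RCVD", "EWMA_BYTES_SENT",
    "EWMA_BYTES_RCVD", "UNDERLAY_HOP", "BATTERY_STATUS",
]


def requested_kinds(dmflags):
    """Return (name, kind ID) pairs for the defined flags set in dmflags."""
    if not 0 <= dmflags < 2 ** 64:
        raise ValueError("dMFlags is a 64-bit field")
    result = []
    for bit, name in enumerate(BASE_ITEMS):
        if dmflags & (1 << bit):
            # Flag 1 << bit pairs with kind ID bit + 1 in the initial tables.
            result.append((name, bit + 1))
    return result
```

For example, a request with dMFlags 0x1001 asks for STATUS_INFO (kind 0x0001) and EWMA_BYTES_SENT (kind 0x000D).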
663 STATUS_INFO (8 bits): A single value element containing an
664 unsigned byte representing whether or not the node is in
665 congestion status. An example usage of STATUS_INFO is for
666 congestion-aware routing. In this scenario, each peer has to
667 update its congestion status periodically. An intermediate peer
668 in the distributed hash table (DHT) network will choose its next
669 hop according to both the DHT routing algorithm and the status
670 information. This is done to avoid increasing load on congested
671 peers. The rightmost 4 bits are used and other bits MUST be
672 cleared to "0"s for future use. There are 16 levels of congestion
673 status, with "0x00" representing zero load and "0x0F" representing
674 congested. This document does not provide a specific method for
675 determining congestion, leaving this decision to each node. One possible
676 option for a node would be to take its CPU/memory/bandwidth usage
677 percentage in the past 600 seconds and normalize the highest value
678 to the range [0x00, 0x0F]. A future draft may define an objective
679 measure or specific algorithm for this.
681 ROUTING_TABLE_SIZE (32 bits): A single value element containing an
682 unsigned 32-bit integer representing the number of peers in the
683 peer's routing table. The administrator of the overlay may be
684 interested in statistics of this value for reasons such as routing
685 efficiency. Access to this kind of diagnostic information MUST
686 NOT be allowed unless compliant to the rules defined in Section 7.
688 PROCESS_POWER (64 bits): A single value element containing an
689 unsigned 64-bit integer specifying the processing power of the
690 node in units of MIPS. Fractional values are rounded up.
692 UPSTREAM_BANDWIDTH (64 bits): A single value element containing an
693 unsigned 64-bit integer specifying the upstream network bandwidth
694 (provisioned or maximum, not available) of the node in units of
695 Kbps. Fractional values are rounded up. For multihomed hosts,
696 this should be the link used to send the response.
698 DOWNSTREAM_BANDWIDTH (64 bits): A single value element containing
699 an unsigned 64-bit integer specifying the downstream network
700 bandwidth (provisioned or maximum, not available) of the node in
701 units of Kbps. Fractional values are rounded up. For multihomed
702 hosts, this should be the link the request was received from.
704 SOFTWARE_VERSION: A single value element containing a US-ASCII
705 string that identifies the manufacturer, model, operating system
706 information and the version of the software. While the format is
707 peer-defined, a suggested format is as follows:
709 "ApplicationProductToken (Platform; OS-or-CPU) VendorProductToken
710 (VendorComment)". For example: "MyReloadApp/1.0 (Unix; Linux
711 x86_64) libreload-java/0.7.0 (Stonyfish Inc.)". Access to this
712 kind of diagnostic information MUST NOT be allowed unless
713 compliant to the rules defined in Section 7.
715 MACHINE_UPTIME (64 bits): A single value element containing an
716 unsigned 64-bit integer specifying the time the node's underlying
717 system has been up in seconds.
719 APP_UPTIME (64 bits): A single value element containing an
720 unsigned 64-bit integer specifying the time the P2P application
721 has been up in seconds.
723 MEMORY_FOOTPRINT (64 bits): A single value element containing an
724 unsigned 64-bit integer representing the memory footprint of the
725 peer program in kilobytes (1024 bytes). Fractional values are
726 rounded up. Access to this kind of diagnostic information MUST
727 NOT be allowed unless compliant to the rules defined in Section 7.
729 DATASIZE_STORED (64 bits): An unsigned 64-bit integer representing
730 the number of bytes of data being stored by this node. Access to
731 this kind of diagnostic information MUST NOT be allowed unless
732 compliant to the rules defined in Section 7.
734 INSTANCES_STORED: An array element containing the number of
735 instances of each kind stored. The array is indexed by Kind-ID.
736 Each entry is an unsigned 64-bit integer. Access to this kind of
737 diagnostic information MUST NOT be allowed unless compliant to the
738 rules defined in Section 7.
740 MESSAGES_SENT_RCVD: An array element containing the number of
741 messages sent and received. The array is indexed by method code.
742 Each entry in the array is a pair of unsigned 64-bit integers
743 (packed end to end) representing sent and received. Access to
744 this kind of diagnostic information MUST NOT be allowed unless
745 compliant to the rules defined in Section 7.
747 EWMA_BYTES_SENT (32 bits): A single value element containing an
748 unsigned 32-bit integer representing an exponentially weighted
749 moving average of bytes sent per second by this peer: sent = alpha x
750 sent_present + (1 - alpha) x sent, where sent_present represents
751 the bytes sent per second since the last calculation and sent
752 represents the last calculation of bytes sent per second. A
753 suitable value for alpha is 0.8. This value is calculated every
754 five seconds. Access to this kind of diagnostic information MUST
755 NOT be allowed unless compliant to the rules defined in Section 7.
757 EWMA_BYTES_RCVD (32 bits): A single value element containing an
758 unsigned 32-bit integer representing an exponentially weighted
759 moving average of bytes received per second by this peer: rcvd = alpha x
760 rcvd_present + (1 - alpha) x rcvd, where rcvd_present represents
761 the bytes received per second since the last calculation and rcvd
762 represents the last calculation of bytes received per second. A
763 suitable value for alpha is 0.8. This value is calculated every
764 five seconds. Access to this kind of diagnostic information MUST
765 NOT be allowed unless compliant to the rules defined in Section 7.
767 UNDERLAY_HOP (8 bits): Indicates the IP layer hops from the
768 intermediate peer which receives the diagnostics message to the
769 next hop peer for this message. (Note: RELOAD does not require
770 the intermediate peers to look into the message body, so here we
771 use PathTrack to gather underlay hops for diagnostic purposes.)
773 BATTERY_STATUS (8 bits): The left-most bit is used to indicate
774 whether this peer is using a battery or not. If this bit is clear
775 (set to '0'), then the peer is using a battery for power. The
776 other 7 bits are to be determined by specific applications.
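Two of the definitions above involve a small computation, sketched here in Python. The EWMA update follows the formula given for EWMA_BYTES_SENT and EWMA_BYTES_RCVD with the suggested alpha of 0.8. The STATUS_INFO function follows the "one possible option" in the text: it takes the highest of three usage percentages and scales it into [0x00, 0x0F]. The linear scaling with truncation and both function names are assumptions, since the document leaves the method to each node.

```python
def ewma_update(previous, present, alpha=0.8):
    """One EWMA step: alpha x present + (1 - alpha) x previous.

    previous is the last computed bytes-per-second value; present is the
    rate observed since that calculation (every five seconds in the text).
    """
    return alpha * present + (1 - alpha) * previous


def status_info(cpu_pct, memory_pct, bandwidth_pct):
    """Map the highest usage percentage to a congestion level 0x00..0x0F."""
    highest = max(cpu_pct, memory_pct, bandwidth_pct)
    highest = min(max(highest, 0.0), 100.0)  # clamp to 0..100 percent
    level = int(highest * 0x0F / 100)        # assumed: linear, truncated
    return level & 0x0F                      # upper 4 bits stay cleared
```

A node whose previous rate was 1000 bytes per second and whose latest interval measured 2000 would report roughly 1800 after one update.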
778 6. Message Processing
780 6.1. Message Creation and Transmission
782 When constructing either a Ping message with diagnostic extensions or
783 a PathTrack message, the sender first creates and populates a
784 DiagnosticsRequest data structure. The timestamp_initiated field is
785 set to the current time, and the expiration field is constructed
786 based on this time. The sender includes the dMFlags field in the
787 structure, setting any number (including all) of the flags to request
788 particular diagnostic information. The sender MAY leave all the bits
789 unset, requesting no particular diagnostic information.
791 The sender MAY also include diagnostic extensions in the
792 DiagnosticsRequest data structure to request additional information.
793 If the sender includes any extensions, it MUST calculate the length
794 of these extensions and set the ext_length field to this value. If
795 no extensions are included, the sender MUST set ext_length to zero.
797 The format of the DiagnosticsRequest data structure and its fields
798 MUST follow the restrictions defined in Section 5.1.
800 When constructing a Ping message with diagnostic extensions, the
801 sender MUST create a MessageExtension structure as defined in RELOAD
802 [RFC6940], setting the value of type to 0x0002, and the value of
803 critical to FALSE. The value of extension_contents MUST be a
804 DiagnosticsRequest structure as defined above. The message MAY be
805 directed to a particular NodeId or ResourceID, but SHOULD NOT be sent
806 to the broadcast NodeID.
808 When constructing a PathTrack message, the sender MUST set the
809 message_code for the RELOAD MessageContents structure to
810 path_track_req (0x65). The request field of the PathTrackReq MUST be
811 set to the DiagnosticsRequest data structure defined above. The
812 destination field MUST be set to the desired destination, which MAY
813 be either a NodeId or ResourceID but SHOULD NOT be the broadcast
814 NodeID.
816 6.2. Message Processing: Intermediate Peers
818 When a request arrives at a peer, if the peer's responsible ID space
819 does not cover the destination ID of the request, then the peer MUST
820 continue processing this request according to the overlay-specified
821 routing mode of the RELOAD protocol.
823 In a P2PSIP overlay, error responses to a message can be generated by
824 either an intermediate peer or the responsible peer. When a request
825 is received at a peer, the peer may find connectivity failures or
826 malfunctioning peers through the pre-defined rules of the overlay
827 network, e.g., by analyzing the Via list or underlay error messages. In
828 this case, the intermediate peer SHOULD return an error response to
829 the initiator node, reporting any malfunctioning node information
830 available in the error message payload. All error responses
831 generated MUST contain the appropriate error code.
833 Each intermediate peer receiving a Ping message with extensions (and
834 which understands the extension) or receiving a PathTrack request/
835 response SHOULD check the expiration value (Unix time format) to
836 determine if the message is expired. If the message has expired, the
837 intermediate peer SHOULD generate a response with Error Code 0x67
838 "Message Expired", return the response to the initiator node, and
839 discard the message.
841 The intermediate peer SHOULD return an error response with the Error
842 Code 0x65 "Underlay Destination Unreachable" when it receives an ICMP
843 message with "Destination Unreachable" information after forwarding
844 the received request to the destination peer.
846 The intermediate peer SHOULD return an error response with the Error
847 Code 0x66 "Underlay Time Exceeded" when it receives an ICMP message
848 with "Time Exceeded" information after forwarding the received
849 request.
851 The peer SHOULD return an Error response with Error Code 0x68
852 "Upstream Misrouting" when it finds that its upstream peer disobeys the
853 routing rules defined in the overlay. The immediate upstream peer
854 information SHOULD also be conveyed to the initiator node.
856 The peer SHOULD return an Error response with Error Code 0x69 "Loop
857 detected" when it finds a loop through the analysis of the Via list.
859 The peer SHOULD return an Error response with Error Code 0x70 "TTL
860 hops exceeded" when it finds that the TTL field value is no more than
861 0 when forwarding.
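The checks in this section can be summarized in a short Python sketch. The error code values are those used in this document. The order of the checks, the loop test (the peer finding its own ID in the Via list), the string keys for ICMP events, and the function names are assumptions made for illustration. Upstream misrouting (0x68) is omitted because detecting it depends on the overlay's routing rules.

```python
ERROR_UNDERLAY_DESTINATION_UNREACHABLE = 0x65
ERROR_UNDERLAY_TIME_EXCEEDED = 0x66
ERROR_MESSAGE_EXPIRED = 0x67
ERROR_LOOP_DETECTED = 0x69
ERROR_TTL_HOPS_EXCEEDED = 0x70


def forwarding_error(now_ms, expiration_ms, ttl, via_list, own_id):
    """Return an error code for a request being forwarded, or None."""
    if now_ms > expiration_ms:
        return ERROR_MESSAGE_EXPIRED
    if own_id in via_list:  # assumed loop test: we already handled this message
        return ERROR_LOOP_DETECTED
    if ttl <= 0:
        return ERROR_TTL_HOPS_EXCEEDED
    return None


def icmp_error(icmp_event):
    """Map an ICMP event seen after forwarding to an overlay error code."""
    return {
        "destination_unreachable": ERROR_UNDERLAY_DESTINATION_UNREACHABLE,
        "time_exceeded": ERROR_UNDERLAY_TIME_EXCEEDED,
    }.get(icmp_event)
```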
863 6.3. Message Response Creation
865 When a diagnostic request message arrives at a peer, if the peer is
866 responsible for the destination ID specified in the forwarding
867 header, and assuming it understands the extension (in the case of
868 Ping) or the new request type PathTrack, it MUST follow the
869 specifications defined in RELOAD [RFC6940] to form the response
870 header, and perform the following operations:
872 When constructing a PathTrack response, the sender MUST set the
873 message_code for the RELOAD MessageContents structure to
874 path_track_ans (0x66).
876 The receiver MUST check the expiration value (Unix time format) in
877 the DiagnosticsRequest to determine if the message is expired. If
878 the message is expired, the peer MUST generate a response with the
879 Error Code 0x67 "Message Expired", return the response to the
880 initiator node, and discard the message.
882 If the message is not expired, the receiver MUST construct a
883 DiagnosticsResponse structure, as follows: The TTL value from the
884 forwarding header is copied to the hop_counter field of the
885 DiagnosticsResponse structure. Note that the default value for TTL
886 at the beginning represents 100-hops unless overlay configuration has
887 overridden the value. The receiver generates a Unix time format
888 timestamp for the current time of day and places it in the
889 timestamp_received field, and constructs a new expiration time and
890 places it in the expiration field of the DiagnosticsResponse.
892 The destination peer MUST check if the initiator node has the
893 authority to request specific types of diagnostic information, and if
894 appropriate, append the diagnostic information requested in the
895 dMFlags and diagnostic_extensions (if any) using the
896 diagnostic_info_list field to the DiagnosticsResponse structure. If
897 any information is returned, the receiver MUST calculate the length of
898 the response and set ext_length appropriately. If no diagnostic
899 information is returned, ext_length MUST be set to zero.
901 The format of the DiagnosticsResponse data structure and its fields
902 MUST follow the restrictions defined in Section 5.2.
904 In the event of an error, an error response containing the error code
905 followed by the description (if they exist) MUST be created and sent
906 to the sender. If the initiator node asks for diagnostic information
907 that it is not authorized to query, the receiving peer MUST return
908 an Error response with the Error Code 2 "Error_Forbidden".
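The response steps above might be sketched as follows. This is a simplified model using Python dictionaries rather than wire structures. The request is assumed to have been reduced to its expiration and a list of requested kind IDs. The restricted mapping (kind ID to the set of node IDs allowed to read it), the 60-second response lifetime, and the 4-byte overhead assumed for each DiagnosticInfo are illustrative choices, not values taken from this document.

```python
ERROR_FORBIDDEN = 2
ERROR_MESSAGE_EXPIRED = 0x67


def build_response(request, ttl, now_ms, requester_id, restricted,
                   local_values, lifetime_ms=60_000):
    """Return a DiagnosticsResponse-like dict, or a dict with an error code."""
    if now_ms > request["expiration"]:
        return {"error": ERROR_MESSAGE_EXPIRED}
    for kind in request["kinds"]:
        # Kinds absent from `restricted` are treated as open (assumption).
        if kind in restricted and requester_id not in restricted[kind]:
            return {"error": ERROR_FORBIDDEN}
    infos = [(k, local_values[k]) for k in request["kinds"] if k in local_values]
    # Assumed encoding: 2-byte kind + 2-byte length + contents per entry.
    ext_length = sum(4 + len(value) for _, value in infos)
    return {
        "expiration": now_ms + lifetime_ms,  # 1..600 seconds in the future
        "timestamp_received": now_ms,
        "hop_counter": ttl,                  # copied from the forwarding header
        "ext_length": ext_length,
        "diagnostic_info_list": infos,
    }
```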
910 6.4. Interpreting Results
912 The initiator node, as well as the responding peer, MAY compute the
913 overlay One-Way-Delay time through the value in timestamp_received
914 and the timestamp_initiated field. However, for a single hop
915 measurement, the traditional measurement methods MUST be used instead
916 of the overlay layer diagnostics methods.
918 The P2P overlay network using the diagnostics methods specified in
919 this document MUST enforce time synchronization with a central time
920 server. Network Time Protocol [RFC5905] can usually maintain time to
921 within tens of milliseconds over the public Internet, and can achieve
922 better than one millisecond accuracy in local area networks under
923 ideal conditions. However, this document does not specify the choice
924 for time synchronization, leaving it to the implementation.
926 The initiator node receiving the Ping response MAY check the
927 hop_counter field and compute the overlay hops to the destination
928 peer for the statistics of connectivity quality from the perspective
929 of overlay hops.
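Both computations are simple subtractions, shown here as a Python sketch. It assumes that the initiator knows the TTL it placed in the request (100 by default, per Section 6.3), that each overlay hop decrements the TTL by one as in RELOAD, and that clocks are synchronized. The function names are hypothetical.

```python
DEFAULT_INITIAL_TTL = 100


def one_way_delay_ms(timestamp_initiated, timestamp_received):
    """Overlay one-way delay in milliseconds.

    A negative result suggests the two clocks are not synchronized.
    """
    return timestamp_received - timestamp_initiated


def overlay_hops(hop_counter, initial_ttl=DEFAULT_INITIAL_TTL):
    """Hops traversed: the initial TTL minus the TTL seen at the destination."""
    if hop_counter > initial_ttl:
        raise ValueError("hop_counter exceeds the initial TTL")
    return initial_ttl - hop_counter
```

For example, a response with hop_counter 97 to a request sent with the default TTL indicates that the request crossed three overlay hops.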
931 7. Authorization through Overlay Configuration
933 The overlay configuration file MUST contain the following XML
934 elements for authorizing a node to access the relevant diagnostic
935 kinds.
937 diagnostic-kind: This has the attribute "kind" with the hexadecimal
938 number indicating the diagnostic Kind Type (this attribute has the
939 same value as defined in Section 9.2) and at least one sub-element
940 "access-node".
942 access-node: This element contains one hexadecimal number indicating
943 a NodeID, and the node with this NodeID is allowed to access the
944 diagnostic "kind" under the same diagnostic-kind element.
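To make the two elements concrete, the Python sketch below parses a hypothetical configuration fragment into a table of access rules. The enclosing configuration element, the absence of XML namespaces, the sample NodeID, and the function name are assumptions. Only the element names diagnostic-kind and access-node and the "kind" attribute come from the text above.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: ROUTING_TABLE_SIZE (0x0002) readable by one node.
SAMPLE = """<configuration>
  <diagnostic-kind kind="0x0002">
    <access-node>0123456789abcdef0123456789abcdef</access-node>
  </diagnostic-kind>
</configuration>"""


def load_access_rules(xml_text):
    """Return {kind ID: set of NodeIDs allowed to read that kind}."""
    root = ET.fromstring(xml_text)
    rules = {}
    for element in root.iter("diagnostic-kind"):
        kind = int(element.get("kind"), 16)
        nodes = {int(n.text.strip(), 16) for n in element.findall("access-node")}
        if not nodes:
            raise ValueError("diagnostic-kind needs at least one access-node")
        rules.setdefault(kind, set()).update(nodes)
    return rules
```

A peer could consult the resulting table before answering a request for a restricted kind.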
946 8. Security Considerations
948 The authorization for diagnostic information must be designed with
949 care to prevent it becoming a method to retrieve information for bot
950 attacks. It should also be noted that attackers can use diagnostics
951 to analyze overlay information to attack certain key peers. As this
952 document is a RELOAD extension, it follows the RELOAD message header and
953 routing specifications; the common security considerations described
954 in the base document [RFC6940] are also applicable to this document.
955 Overlays may define their own requirements on who can collect/share
956 diagnostic information.
958 9. IANA Considerations
960 9.1. Diagnostics Flag
962 IANA SHALL create a "RELOAD Diagnostics Flag" Registry. Entries in
963 this registry are 1-bit flags contained in a 64-bit integer
964 dMFlags denoting diagnostic information to be retrieved as described
965 in Section 5.3. New entries SHALL be defined via [RFC5226]
966 Standards Action. The initial contents of this registry are:
968 +-------------------------+------------------------------+--------+
969 | diagnostic information  | diagnostic flag in dMFlags   |  RFC   |
970 |-------------------------+------------------------------+--------|
971 |Reserved                 | 0x 0000 0000 0000 0000       |RFC-XXXX|
972 |STATUS_INFO              | 0x 0000 0000 0000 0001       |RFC-XXXX|
973 |ROUTING_TABLE_SIZE       | 0x 0000 0000 0000 0002       |RFC-XXXX|
974 |PROCESS_POWER            | 0x 0000 0000 0000 0004       |RFC-XXXX|
975 |UPSTREAM_BANDWIDTH       | 0x 0000 0000 0000 0008       |RFC-XXXX|
976 |DOWNSTREAM_BANDWIDTH     | 0x 0000 0000 0000 0010       |RFC-XXXX|
977 |SOFTWARE_VERSION         | 0x 0000 0000 0000 0020       |RFC-XXXX|
978 |MACHINE_UPTIME           | 0x 0000 0000 0000 0040       |RFC-XXXX|
979 |APP_UPTIME               | 0x 0000 0000 0000 0080       |RFC-XXXX|
980 |MEMORY_FOOTPRINT         | 0x 0000 0000 0000 0100       |RFC-XXXX|
981 |DATASIZE_STORED          | 0x 0000 0000 0000 0200       |RFC-XXXX|
982 |INSTANCES_STORED         | 0x 0000 0000 0000 0400       |RFC-XXXX|
983 |MESSAGES_SENT_RCVD       | 0x 0000 0000 0000 0800       |RFC-XXXX|
984 |EWMA_BYTES_SENT          | 0x 0000 0000 0000 1000       |RFC-XXXX|
985 |EWMA_BYTES_RCVD          | 0x 0000 0000 0000 2000       |RFC-XXXX|
986 |UNDERLAY_HOP             | 0x 0000 0000 0000 4000       |RFC-XXXX|
987 |BATTERY_STATUS           | 0x 0000 0000 0000 8000       |RFC-XXXX|
988 |Reserved                 | 0x FFFF FFFF FFFF FFFF       |RFC-XXXX|
989 +-------------------------+------------------------------+--------+
991 [To RFC editor: Please replace all RFC-XXXX in this document with the
992 RFC number of this document.]
994 9.2. Diagnostic Kind ID Types
996 IANA SHALL create a "RELOAD Diagnostic Kind ID Types" Registry.
997 Entries in this registry are 16-bit integers denoting diagnostics
998 extension data kind types carried in the diagnostic request and
999 response message, as described in Section 5.2. Code points from
1000 0x0000 to 0x003F SHALL be assigned together with flags within "RELOAD
1001 Diagnostics Flag" registry via RFC 5226 [RFC5226] standards action.
1002 Code points in the range 0x0040 to 0xEFFF SHALL be registered via RFC
1003 5226 standards action.
1005 +---------------------------+---------------+---------------+
1006 | Diagnostic Kind Type      | Code          | Specification |
1007 +---------------------------+---------------+---------------+
1008 | reserved                  | 0x0000        | RFC-XXXX      |
1009 | STATUS_INFO               | 0x0001        | RFC-XXXX      |
1010 | ROUTING_TABLE_SIZE        | 0x0002        | RFC-XXXX      |
1011 | PROCESS_POWER             | 0x0003        | RFC-XXXX      |
1012 | UPSTREAM_BANDWIDTH        | 0x0004        | RFC-XXXX      |
1013 | DOWNSTREAM_BANDWIDTH      | 0x0005        | RFC-XXXX      |
1014 | SOFTWARE_VERSION          | 0x0006        | RFC-XXXX      |
1015 | MACHINE_UPTIME            | 0x0007        | RFC-XXXX      |
1016 | APP_UPTIME                | 0x0008        | RFC-XXXX      |
1017 | MEMORY_FOOTPRINT          | 0x0009        | RFC-XXXX      |
1018 | DATASIZE_STORED           | 0x000A        | RFC-XXXX      |
1019 | INSTANCES_STORED          | 0x000B        | RFC-XXXX      |
1020 | MESSAGES_SENT_RCVD        | 0x000C        | RFC-XXXX      |
1021 | EWMA_BYTES_SENT           | 0x000D        | RFC-XXXX      |
1022 | EWMA_BYTES_RCVD           | 0x000E        | RFC-XXXX      |
1023 | UNDERLAY_HOP              | 0x000F        | RFC-XXXX      |
1024 | BATTERY_STATUS            | 0x0010        | RFC-XXXX      |
1025 | reserved for future flags | 0x0011-0x003F | RFC-XXXX      |
1026 | local use (reserved)      | 0xF000-0xFFFE | RFC-XXXX      |
1027 | reserved                  | 0xFFFF        | RFC-XXXX      |
1028 +---------------------------+---------------+---------------+
1030 Table 1: Diagnostic Kind Types
1032 9.3. Message Codes
1034 This document introduces two new types of messages and their
1035 responses, requiring the following additions to the "RELOAD Message
1036 Code" Registry defined in RELOAD [RFC6940]. These additions are:
1038 +-------------------+------------+----------+
1039 | Message Code Name | Code Value | RFC      |
1040 +-------------------+------------+----------+
1041 | path_track_req    | 0x65       | RFC-AAAA |
1042 | path_track_ans    | 0x66       | RFC-AAAA |
1043 +-------------------+------------+----------+
1045 Table 2: Extensions to RELOAD Message Codes
1047 [To RFC editor: Values starting at 0x65 were used to prevent
1048 collisions with RELOAD base values and other extensions. Please
1049 replace with the next highest available values. The final message
1050 codes will be assigned by IANA. All RFC-AAAA should be replaced
1051 with the RFC number of RELOAD upon publication.]
1053 9.4. Error Code
1055 This document introduces the following new error codes, extending the
1056 "RELOAD Error Code" registry as described below:
1058 +----------------------------------------+------------+----------+
1059 | Error Code Name                        | Code Value | RFC      |
1060 +----------------------------------------+------------+----------+
1061 | Error_Underlay_Destination_Unreachable | 0x65       | RFC-AAAA |
1062 | Error_Underlay_Time_Exceeded           | 0x66       | RFC-AAAA |
1063 | Error_Message_Expired                  | 0x67       | RFC-AAAA |
1064 | Error_Upstream_Misrouting              | 0x68       | RFC-AAAA |
1065 | Error_Loop_Detected                    | 0x69       | RFC-AAAA |
1066 | Error_TTL_Hops_Exceeded                | 0x70       | RFC-AAAA |
1067 +----------------------------------------+------------+----------+
1069 Table 3: Extensions to RELOAD Error Codes
1071 [To RFC editor: Values starting at 0x65 were used to prevent
1072 collisions with RELOAD base values and other extensions. Please
1073 replace with the next highest available values. The final error
1074 codes will be assigned by IANA. All RFC-AAAA should be replaced
1075 with the RFC number of RELOAD upon publication.]
1077 9.5. Message Extension
1079 This document introduces the following new RELOAD extension code:
1081 +-----------------+------------+----------+
1082 | Extension Name  | Code Value | RFC      |
1083 +-----------------+------------+----------+
1084 | Diagnostic_Ping | 0x0002     | RFC-AAAA |
1085 +-----------------+------------+----------+
1087 Table 4: New RELOAD Extension Code
1089 [To RFC editor: The value 0x0002 was used to prevent collisions with
1090 other extensions. Please replace with the next highest available
1091 value. The final codes will be assigned by IANA. All RFC-AAAA
1092 should be replaced with the RFC number of RELOAD upon publication.]
1094 9.6. XML Name Space Registration
1096 This document registers a URI for the config-diagnostics XML
1097 namespaces in the IETF XML registry defined in [RFC3688]. All the
1098 elements defined in this document belong to this namespace.
1100 URI: urn:ietf:params:xml:ns:p2p:config-diagnostics
1101 Registrant Contact: The IESG.
1102 XML: N/A, the requested URIs are XML namespaces
1104 The overlay configuration file MUST contain the following XML
1105 declaring P2PSIP diagnostics as a mandatory extension to
1106 RELOAD.
1108 <mandatory-extension>
1109 urn:ietf:params:xml:ns:p2p:config-diagnostics
1110 </mandatory-extension>
1112 10. Acknowledgments
1114 We would like to thank Zheng Hewen for the contribution of the
1115 initial version of this document. We would also like to thank Bruce
1116 Lowekamp, Salman Baset, Henning Schulzrinne, Jiang Haifeng and Marc
1117 Petit-Huguenin for the email discussion and their valued comments,
1118 and special thanks to Henry Sinnreich for contributing to the usage
1119 scenarios text. We would like to thank the authors of the RELOAD
1120 protocol for transferring text about diagnostics to this document.
1122 11. References
1124 11.1. Normative References
1126 [RFC0792] Postel, J., "Internet Control Message Protocol", STD 5,
1127 RFC 792, September 1981.
1129 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
1130 Requirement Levels", BCP 14, RFC 2119, March 1997.
1132 [RFC3688] Mealling, M., "The IETF XML Registry", BCP 81, RFC 3688,
1133 January 2004.
1135 [RFC5905] Mills, D., Martin, J., Burbank, J., and W. Kasch, "Network
1136 Time Protocol Version 4: Protocol and Algorithms
1137 Specification", RFC 5905, June 2010.
1139 [RFC6940] Jennings, C., Lowekamp, B., Rescorla, E., Baset, S., and
1140 H. Schulzrinne, "REsource LOcation And Discovery (RELOAD)
1141 Base Protocol", RFC 6940, January 2014.
1143 11.2. Informative References
1145 [UnixTime]
1146 "UnixTime", .>.
1149 [I-D.ietf-p2psip-self-tuning]
1150 Maenpaa, J. and G. Camarillo, "Self-tuning Distributed
1151 Hash Table (DHT) for REsource LOcation And Discovery
1152 (RELOAD)", draft-ietf-p2psip-self-tuning-15 (work in
1153 progress), June 2014.
1155 [I-D.ietf-p2psip-concepts]
1156 Bryan, D., Matthews, P., Shim, E., Willis, D., and S.
1157 Dawkins, "Concepts and Terminology for Peer to Peer SIP",
1158 draft-ietf-p2psip-concepts-06 (work in progress), June
1159 2014.
1161 [Overlay-Failure-Detection]
1162 Zhuang, S., "On failure detection algorithms in overlay
1163 networks", Proc. IEEE Infocomm, Mar 2005.
1165 [Handling_Churn_in_a_DHT]
1166 Rhea, S., "Handling Churn in a DHT", USENIX Annual
1167 Conference, June 2004.
1169 [Diagnostic_Framework]
1170 Jin, X., "A Diagnostic Framework for Peer-to-Peer
1171 Streaming", 2005.
1173 [Diagnostics_and_NAT_traversal_in_P2PP]
1174 Gupta, G., "Diagnostics and NAT Traversal in P2PP - Design
1175 and Implementation", Columbia University Report , June
1176 2008.
1178 [RFC5226] Narten, T. and H. Alvestrand, "Guidelines for Writing an
1179 IANA Considerations Section in RFCs", BCP 26, RFC 5226,
1180 May 2008.
1182 Appendix A. Examples
1184 Below, we sketch how these metrics can be used.
1186 A.1. Example 1
1188 A peer may set EWMA_BYTES_SENT and EWMA_BYTES_RCVD flags in the
1189 PathTrackReq to its direct neighbors. A peer can use EWMA_BYTES_SENT
1190 and EWMA_BYTES_RCVD of another peer to infer whether it is acting as
1191 a media relay. It may then choose not to forward any requests for
1192 media relay to this peer. Similarly, among the various candidates
1193 for filling up the routing table, a peer may prefer a peer with a large
1194 UPTIME value, small RTT, and small LAST_CONTACT value.
1196 A.2. Example 2
1198 A peer may set the STATUS_INFO Flag in the PathTrackReq to a remote
1199 destination peer. The overlay has its own threshold definition for
1200 congestion. The peer can obtain knowledge of all the status
1201 information of the intermediate peers along the path. Then it can
1202 choose other paths to that node for the subsequent requests.
1204 A.3. Example 3
1206 A peer may use Ping to evaluate the average overlay hops to other
1207 peers by sending PingReq to a set of random resource or node IDs in
1208 the overlay. A peer may adjust its timeout value according to the
1209 change of average overlay hops.
1211 Appendix B. Problems with Generating Multiple Responses on Path
1213 An earlier version of this document considered an approach where a
1214 response was generated by each intermediate peer as the message
1215 traversed the overlay. This approach was discarded. One reason this
1216 approach was discarded was that it could provide a DoS mechanism,
1217 whereby an attacker could send an arbitrary message claiming to be
1218 from a spoofed "sender" the real sender wished to attack. As a
1219 result of sending this one message, many messages would be generated
1220 and sent back to the spoofed "sender" - one from each intermediate
1221 peer on the message path. While authentication mechanisms could
1222 reduce some risk of this attack, it still resulted in a fundamental
1223 break from the request-response nature of the RELOAD protocol, as
1224 multiple responses are generated to a single request. Although one
1225 request with responses from all the peers in the route will be more
1226 efficient, it was determined to be too great a security risk and
1227 deviation from the RELOAD architecture.
1229 Appendix C. Changes to the Draft
1231 To RFC editor: This section is to track the changes. Please remove
1232 this section before publication.
1234 C.1. Changes since -00 version
1236 1. Changed title from "Diagnose P2PSIP Overlay Network" to "P2PSIP
1237 Overlay Diagnostics".
1239 2. Changed the table of contents. Add a section about message
1240 processing and a section of examples.
1242 3. Merge diagnostics text from the p2psip base draft -01.
1244 4. Removed ECHO method for security reasons.
1246 C.2. Changes since -01 version
1248 Added BATTERY_STATUS as diagnostic information.
1250 Removed UnderlayTTL test from the Ping method, instead adding an
1251 UNDERLAY_HOP diagnostic information for PathTrack method.
1253 Give some examples for diagnostic information, and give some
1254 editor's notes for further work.
1256 C.3. Changes since -02 version
1258 Provided further explanation as to why the base draft Ping in the
1259 current form cannot be used to replace Ping, and why some combination
1260 of methods cannot replace PathTrack.
1262 C.4. Changes since -03 version
1264 Modified structure used to share information collected. Both
1265 mechanisms now use a common data structure to convey information.
1267 C.5. Changes since -04 version
1269 Updated the authors' addresses and modified the last sentence in
1270 Section 4.3.1.2.
1272 C.6. Changes since -05 version
1274 Resolve Marc's comments from the mailing list. And define the
1275 details of STATUS_INFO.
1277 C.7. Changes in version -10
1279 Resolve the authorization issue and other comments (e.g. define
1280 diagnostics as a mandatory extension) from WGLC. And check for the
1281 languages.
1283 C.8. Changes in version -15
1285 Changed several diagnostic kind return values to be 64 bit vs. 32 bit
1286 to provide headroom. Split bandwidth into upstream and downstream.
1287 Renamed length in diagnostic request object to ext_length, added
1288 ext_length to response object, and clarified that ext_length is
1289 length of diagnostic info/extensions being returned, not the length
1290 of the object.
1292 Aligned many flags/values with RELOAD by using hex vs decimal values.
1294 Significant reorganization and edit for readability.
1296 Authors' Addresses
1298 Haibin Song
1299 Huawei
1301 Email: haibin.song@huawei.com
1303 Jiang Xingfeng
1304 Huawei
1306 Email: jiangxignfeng@huawei.com
1308 Roni Even
1309 Huawei
1310 14 David Hamelech
1311 Tel Aviv 64953
1312 Israel
1314 Email: roni.even@mail101.huawei.com
1315 David A. Bryan
1316 Ethernot.org
1317 Cedar Park, Texas
1318 United States of America
1320 Email: dbryan@ethernot.org
1322 Yi Sun
1323 ICT
1325 Email: sunyi@ict.ac.cn