Internet Engineering Task Force                                 A. Clark
Internet-Draft                                     Telchemy Incorporated
Expires: 4 January 2008                                     A. Pendleton
                                                                  Nortel
                                                                R. Kumar
                                                               K. Connor
                                                           Cisco Systems
                                                                 G. Hunt
                                                                      BT
                                                               July 2007

          RTCP HR - High Resolution VoIP Metrics Report Blocks
                      draft-ietf-avt-rtcphr-01.txt

Status of this Memo

By submitting this Internet-Draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she becomes aware will be disclosed, in accordance with Section 6 of BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

This Internet-Draft will expire on 4 January 2008.

Copyright Notice

Copyright (C) The IETF Trust (2007).

Abstract

This document defines extensions to the RTCP XR extended report packet type blocks to support Voice over IP (VoIP) monitoring for services that require higher resolution or more detailed metrics than those supported by RFC3611 [3].

Clark [Page 1] draft-ietf-avt-rtcphr-01.txt July 2007

Table of Contents

   1.  Introduction
   2.  Definitions
   3.  High Resolution VoIP Metrics Report Block
   4.  RTCP HR Configuration Block
   5.  Application of RTCP HR to RTP translators
   6.  RTCP HR Block Multiplexing
   7.  SDP Signaling
   8.  Practical applications
   9.  IANA Considerations
   10. Security Considerations
   11. Contributors
   12. Informative References
   Authors' Addresses
   Intellectual Property and Copyright Statements

1. Introduction

This draft defines several new block types to augment those defined in RFC3611 for use in Quality of Service reporting for Voice over IP. The new block types support the reporting of metrics to a higher resolution to support certain applications, for example carrier backbone networks.

For certain types of VoIP service it is desirable to report VoIP performance metrics to a higher resolution than provided in the RFC3611 [3] VoIP Metrics block or RFC3550 [2] Receiver Reports. The report blocks described in this document provide both interval based and cumulative metrics with a higher resolution than that provided in the RFC3611 VoIP Metrics report block [3].

The new block types defined in this draft are the High Resolution VoIP Metrics Report Block, and the High Resolution VoIP Metrics Configuration Block.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.

2. Definitions

2.1 Cumulative and Interval Metrics

Cumulative metrics relate to the entire duration of the call to the point at which metrics are determined and reported, and are typically used to report call quality. Cumulative metrics generally result in a lower volume of data that may need to be stored, as each report supersedes earlier reports.

Interval metrics relate to the period since the last Interval report. Interval data may be easier to correlate with specific network events for which timing is known, and may also be used as a basis for threshold crossing alerts. Note that interval metrics for the start and end of calls may be unreliable due to factors such as irregular start and end interval length and the difficulty in knowing when packet transmission started and ended.

2.2 Bursts, Gaps, and Concealed Seconds

The terms Burst and Gap are used in a manner consistent with that of RTCP XR (RFC3611). RTCP XR views a call as being divided into bursts, which are periods during which the combined packet loss and discard rate is high enough to cause noticeable call quality degradation (generally over 5 percent loss/discard rate), and gaps, which are periods during which lost or discarded packets are infrequent and hence call quality is generally acceptable.

The recommended value for Gmin in RFC3611 results in a Burst being a period of time during which the call quality is degraded to a similar extent to a typical PCM Severely Errored Second.

The term Concealed Seconds defines a count of seconds during which some proportion of the media stream was lost through packet loss and discard. The term Severely Concealed Seconds defines a count of seconds during which the proportion of the media stream lost through packet loss and discard exceeds a specified threshold.

2.3 Numeric formats

This report block makes use of binary fractions. The terminology used is S X:Y, where the optional S prefix indicates a signed representation, X the number of bits before the binary point and Y the number of bits after the binary point. Hence 8:8 represents an unsigned number in the range 0.0 to 255.996 with a granularity of 0.0039. S7:8 would represent the range -127.996 to +127.996.
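
The X:Y notation above can be illustrated with a short conversion sketch. This is not part of the draft: the function names are hypothetical, and the use of two's complement for signed (S X:Y) values and of saturation on overflow are assumptions, since the text does not state how the sign is carried or how out-of-range values are handled.

```python
def encode_fixed(value, int_bits, frac_bits, signed=False):
    """Encode a number as an X:Y (or S X:Y) binary fraction.

    Returns the raw field as a non-negative integer. Negative values use
    two's complement, which is an assumption of this sketch.
    """
    total_bits = int_bits + frac_bits + (1 if signed else 0)
    max_raw = (1 << (int_bits + frac_bits)) - 1   # largest magnitude
    min_raw = -max_raw if signed else 0
    raw = round(value * (1 << frac_bits))         # scale by 2^Y
    raw = max(min_raw, min(max_raw, raw))         # saturate (assumption)
    return raw & ((1 << total_bits) - 1)


def decode_fixed(raw, int_bits, frac_bits, signed=False):
    """Decode a raw X:Y (or S X:Y) field back to a float."""
    total_bits = int_bits + frac_bits + (1 if signed else 0)
    if signed and raw & (1 << (total_bits - 1)):
        raw -= 1 << total_bits                    # undo two's complement
    return raw / (1 << frac_bits)


print(decode_fixed(0xFFFF, 8, 8))   # top of the 8:8 range
print(decode_fixed(0x0001, 8, 8))   # 8:8 granularity
```

The two printed values correspond to the 255.996 upper bound and the 0.0039 granularity quoted for the 8:8 format.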

3. High Resolution VoIP Metrics Report Block

3.1 Block Description

This block comprises a header and a series of sub-blocks. The Map field in the header defines which sub-blocks are present.

Header sub-block

    0               1               2               3
    0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     BT=N      |      Map      |         block length          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        SSRC of source                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           Duration                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Basic Loss/Discard Metrics sub-block

    0               1               2               3
    0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |        Loss Proportion        |      Discard Proportion       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   Number of frames expected                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Burst/Gap metrics sub-block

    0               1               2               3
    0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   Threshold   |              Burst Duration (ms)              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       Gap Duration (ms)                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Burst Loss/Disc Proportion   |   Gap Loss/Disc Proportion    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Playout metrics sub-block

    0               1               2               3
    0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   On-time Playout Duration                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |            On-time Active Speech Playout Duration             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   Loss Concealment Duration                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |            Buffer Adjustment Concealment Duration             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Concealed Seconds metrics sub-block

    0               1               2               3
    0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Unimpaired Seconds                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       Concealed Seconds                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Severely Concealed Seconds   |   RESERVED    | SCS Threshold |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Delay and PDV metrics sub-block

    0               1               2               3
    0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   Network Round Trip Delay    |       End System Delay        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |        External Delay         |           Mean PDV            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Pos Threshold/Peak PDV     |      Pos PDV Percentile       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Neg Threshold/Peak PDV     |      Neg PDV Percentile       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   PDV Type    | JB/PLC config |          JB nominal           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          JB maximum           |          JB abs max           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      JB high water mark       |       JB low water mark       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Call Quality metrics sub-block

    0               1               2               3
    0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |             R-LQ              |             R-CQ              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |            MOS-LQ             |            MOS-CQ             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  R-LQ Ext In  | R-LQ Ext Out  |RFC3550 Payload|  Media Type   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | RxSigLev (IP) |RxNoiseLev (IP)|  Local RERL   |  Remote RERL  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | RxSigLev (Ext)|RxNoiseLev(Ext)|         Metric Status         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Vendor specific extension sub-block

    0               1               2               3
    0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           Vendor ID                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Vendor ID Src |Manuf Code Src |    Extension Block Length     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                Vendor-specific extension data                 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

3.2 Header

Implementations MUST send the Header block within each High Resolution Metrics report.

3.2.1 Block type

Nine High Resolution VoIP Metrics blocks are defined:

   mmm    = HR Metrics - Cumulative, Locally Generated
   mmm+1  = HR Metrics - Cumulative, Relayed from Remote IP Endpoint
   mmm+2  = HR Metrics - Cumulative, Relayed from Remote Ext Endpoint
   mmm+3  = HR Metrics - Cumulative, TBD
   mmm+4  = HR Metrics - Interval, Locally Generated
   mmm+5  = HR Metrics - Interval, Relayed from Remote IP Endpoint
   mmm+6  = HR Metrics - Interval, Relayed from Remote Ext Endpoint
   mmm+7  = HR Metrics - Interval, TBD
   mmm+8  = HR Metrics - Alert, Locally Generated
   mmm+9  = HR Metrics - Alert, Relayed from Remote IP Endpoint
   mmm+10 = HR Metrics - Alert, Relayed from Remote Ext Endpoint
   mmm+11 = HR Metrics - Alert, TBD

The time interval associated with these report blocks is left to the implementation. Spacing of RTCP reports should be in accordance with RFC3550. The specific timing of RTCP HR reports may be determined in response to an internally derived alert such as a threshold violation; however, the spacing of RTCP HR reports must not exceed that defined in RFC3550.

Note that interval data may be derived by subtracting successive cumulative reports, which provides increased tolerance to potential loss of RTCP reports.

When these block types are used in SDP offer-answer, semantics are as defined in RFC3611 [3], see clause 6. In particular, for "sendrecv" unicast connections, a block type in an offer indicates the offerer's wish to receive the specified block type. A block type in an answer indicates the answerer's wish to receive the specified block type.

3.2.2 Map field

The Map field indicates which optional sub-blocks are present in this report. A 1 indicates that the sub-block is present, and a 0 that the sub-block is absent. If present, the sub-blocks must be in the sequence defined in this document. The bits have the following definitions:

   0    Burst/Gap Metrics block
   1    Playout Metrics block
   2    Concealed Seconds Metrics block
   3    Call Quality Metrics block
   4    Vendor specific extension block
   5-7  Reserved, set to 0

3.2.3 Block Length

The block length indicates the length of this report in 32 bit words and includes the header and any extension octets.

3.2.4 SSRC

The SSRC of the stream to which this report relates. The value of this field shall follow the rules defined in RFC3550 with regard to the forwarding of RTP and RTCP messages.

3.2.5 Duration

The duration of time for which this report applies, expressed in milliseconds. For cumulative reports this would be the call duration. For interval reports this would be the duration of the interval.

3.3 Basic Loss/Discard Metrics

The Basic Loss/Discard Metrics sub-block MUST be present.

This block reports the proportion of frames lost by the network and the proportion of frames discarded due to jitter. For sample-based codecs such as G.711, a frame shall be defined as an RTP frame.
For endpoints that incorporate jitter buffers capable of fractional frame discard, the proportion of frames discarded MAY be determined on the basis of the proportion of samples discarded.

If Voice Activity Detection is used then the proportion of frames lost and discarded shall be determined based on transmitted packets, i.e. frames that contained silence and were not transmitted shall not be considered.

A frame shall be regarded as lost if it fails to arrive within an implementation-specific time window. A frame that arrives within this time window but is too early or late to be played out shall be regarded as discarded. A frame shall be classified as one of received (or OK), discarded or lost.

The Loss and Discard metrics are determined after the effects of FEC, redundancy (RFC2198) or other similar process.

3.3.1 Loss Proportion

Proportion of frames lost within the network, expressed as a binary fraction in 0:16 format. Duplicate frames shall be disregarded.

3.3.2 Discard Proportion

Proportion of voice frames received but discarded due to late or early arrival, expressed as a binary fraction in 0:16 format.

3.3.3 Dead connection detection

If no RTP, SID or RTP no-op packets have been received for a pre-specified time interval (for example ten seconds) then an RTCP HR dead connection indication MUST be sent. This time interval may be changed during the connection, for example being longer at the start of a call. Dead connection detection may be temporarily disabled during silence periods if VAD is used.

A dead connection is indicated by setting both the frames lost and frames discarded fields above to 0xFFFF (equivalent to indicating 99.99% of packets have been both lost and discarded).

Receipt of an RTCP HR block with either field set to a value other than 0xFFFF shall indicate that the remote endpoint HAS received some valid RTP packets.
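
As an illustration of sections 3.3.1 to 3.3.3, the following sketch fills the Basic Loss/Discard Metrics sub-block and tests a received sub-block for the dead connection indication. The function names are hypothetical. Network byte order, the 16/16/32 bit field widths read from the diagram in 3.1, and sending zero in the (undefined) frames expected field of a dead connection report are assumptions of this sketch rather than statements of the draft.

```python
import struct

DEAD = 0xFFFF  # value placed in both proportion fields for a dead connection


def encode_proportion(count, expected):
    """Express count/expected as a 0:16 binary fraction."""
    if expected <= 0:
        return 0
    return min(0xFFFF, round(count * 65536 / expected))


def pack_basic_loss_discard(lost, discarded, expected, dead=False):
    """Return the 8-octet Basic Loss/Discard Metrics sub-block."""
    if dead:
        # Frames expected is undefined in this case; zero is an assumption.
        return struct.pack("!HHI", DEAD, DEAD, 0)
    return struct.pack(
        "!HHI",
        encode_proportion(lost, expected),
        encode_proportion(discarded, expected),
        expected,
    )


def is_dead_connection(sub_block):
    """True only if BOTH proportion fields carry 0xFFFF."""
    loss, discard, _expected = struct.unpack("!HHI", sub_block)
    return loss == DEAD and discard == DEAD


report = pack_basic_loss_discard(lost=5, discarded=0, expected=100)
print(report.hex(), is_dead_connection(report))
```

Note that a stream with total loss but no discards yields 0xFFFF in only one field, so it is not mistaken for the dead connection indication.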

3.3.4 Number of frames expected

A count of the number of frames expected, estimated if necessary. If no frames have been received then this count shall be set to zero; however, if a dead connection is indicated then this count shall be regarded as undefined.

3.4 Burst/Gap metrics sub-block

The Burst/Gap metrics sub-block MAY be present and if present MUST be indicated in the Map field. This block provides information on transient IP problems and is able to represent the combined effect of packet loss and packet discard. Burst/Gap metrics are typically used in Cumulative reports but MAY be used in Interval reports.

The definition of Burst and Gap is consistent with that defined in the RFC3611 VoIP Metrics block, with the clarification that Loss and Discard are defined in terms of frames (as described in 3.3 above). To accommodate the range of jitter buffer algorithms and packet discard logic that may be used by implementors, the method used to distinguish between bursts and gaps may be an equivalent method to that defined in RFC3611. The method used SHOULD produce the same result as that defined in RFC3611 for conditions of burst packet loss, but MAY produce different results for conditions of time varying jitter.

If Voice Activity Detection is used the Burst and Gap Duration shall be determined as if silence frames had been sent, i.e. a period of silence in excess of Gmin frames MUST terminate a burst condition.

The Burst/Gap Metrics sub-block contains the following elements.

3.4.1 Threshold

The Threshold is equivalent to Gmin in RFC3611, i.e. the number of successive frames that must be received and not discarded prior to and following a lost or discarded frame in order for this lost or discarded frame to be regarded as part of a gap.

3.4.2 Burst Duration (ms)

The average duration of a burst of lost and discarded frames.

3.4.3 Gap Duration (ms)

The average duration of periods between bursts.

3.4.4 Burst Loss/Discard Proportion

The proportion of Lost and Discarded frames during Bursts, expressed as a binary fraction in 0:16 format.

3.4.5 Gap Loss/Discard Proportion

The proportion of Lost and Discarded frames during Gaps, expressed as a binary fraction in 0:16 format.

3.5 Playout Metrics sub-block

The Playout Metrics sub-block MAY be present and if present MUST be indicated in the Map field.

At any instant, the audio output at a receiver may be classified as either 'normal' or 'concealed'. 'Normal' refers to playout of audio payload received from the remote end, and also includes locally generated signals such as announcements, tones and comfort noise. Concealment refers to playout of locally-generated signals used to mask the impact of network impairments or to reduce the audibility of jitter buffer adaptations.

This sub-block accounts for the source of the output audio, in millisecond units. The on-time and active speech playout durations allow calculation of the voice activity fraction. The on-time and concealment durations allow calculation of concealment ratios. This sub-block distinguishes between reactive (due to effective packet loss) and proactive (due to buffer adaptation) concealment.

3.5.1 On-time Playout Duration

'On-time' playout is the uninterrupted, in-sequence playout of valid decoded audio information originating from the remote endpoint. This includes comfort noise during periods of remote talker silence, if VAD is used, and locally generated or regenerated tones and announcements.

An equivalent definition is that on-time playout is playout of any signal other than those used for concealment.

On-time playout duration MUST include both speech and silence intervals, whether VAD is used or not. This duration is reported in millisecond units.
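
The draft states that the playout durations allow a voice activity fraction and concealment ratios to be calculated, but gives no formulas. The sketch below shows one plausible reading: activity as active speech over on-time playout, and concealment as concealed time over total playout. Both formulas and the function names are assumptions. The all-ones value marking an unknown duration is the one described for the optional fields in 3.5.2 and 3.5.4.

```python
UNKNOWN = 0xFFFFFFFF  # all ones: duration not known


def voice_activity_fraction(on_time_ms, active_speech_ms):
    """Active speech as a fraction of on-time playout, or None if unknown."""
    if active_speech_ms == UNKNOWN or on_time_ms == 0:
        return None
    return active_speech_ms / on_time_ms


def concealment_ratio(on_time_ms, loss_conceal_ms, buffer_adj_ms=UNKNOWN):
    """Concealed playout as a fraction of all playout (assumed definition).

    An unknown buffer adjustment duration is treated as contributing zero.
    """
    concealed = loss_conceal_ms
    if buffer_adj_ms != UNKNOWN:
        concealed += buffer_adj_ms
    total = on_time_ms + concealed
    return concealed / total if total else None


print(voice_activity_fraction(60000, 24000))
print(concealment_ratio(9900, 100))
```

A receiver could track these two ratios per interval report to separate talker behaviour from network-induced concealment.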

3.5.2 On-time Active Speech Playout Duration

The duration, in milliseconds, of the on-time playout duration corresponding to playout of active speech signals, if known. If not known, then this field is set to all ones (0xFFFFFFFF).

In the absence of silence suppression, on-time active speech playout equals on-time playout (section 3.5.1).

3.5.3 Loss Concealment Duration

The duration, in milliseconds, of audio playout corresponding to Loss-type concealment.

Loss-type concealment is reactive insertion or deletion of samples in the audio playout stream due to effective frame loss at the audio decoder. "Effective frame loss" is the event in which a frame of coded audio is simply not present at the audio decoder when required. In this case, substitute audio samples are generally formed, at the decoder or elsewhere, to reduce audible impairment.

Only loss-type concealment is necessary to form Concealed and Severely Concealed Seconds counts, in Section 3.6.

3.5.4 Buffer Adjustment Concealment Duration (optional)

The duration, in milliseconds, of audio playout corresponding to Buffer Adjustment-type concealment, if known. If not known, then this field is set to all ones (0xFFFFFFFF).

Buffer Adjustment-type concealment is proactive or controlled insertion or deletion of samples in the audio playout stream due to jitter buffer adaptation, re-sizing or re-centering decisions within the endpoint.

Because this insertion is controlled, rather than occurring randomly in response to losses, it is typically less audible than loss-type concealment (section 3.5.3). For example, jitter buffer adaptation events may be constrained to occur during periods of talker silence, in which case only silence duration is affected, or sophisticated time-stretching methods for insertion/deletion during favorable periods in active speech may be employed.
For these reasons, buffer adjustment-type concealment MAY be exempted from inclusion in calculations of Concealed Seconds and Severely Concealed Seconds.

However, an implementation SHOULD include buffer adjustment-type concealment in counts of Concealed Seconds and Severely Concealed Seconds if the event occurs at an 'inopportune' moment, with an emergency or large, immediate adaptation during active speech, or for unsophisticated adaptation during speech without regard for the underlying signal, in which cases the assumption of low audibility cannot hold. In other words, jitter buffer adaptation events which may be presumed to be audible SHOULD be included in Concealed Seconds and Severely Concealed Seconds counts.

Concealment events which cannot be classified as Buffer Adjustment-type MUST be classified as Loss-type.

3.6 Concealed Seconds metrics sub-block

The Concealed Seconds metrics sub-block MAY be present and if present MUST be indicated in the Map field.

This sub-block provides a description of potentially audible impairments due to lost and discarded packets at the endpoint, expressed on a time basis analogous to a traditional PSTN T1/E1 errored seconds metric.

The following metrics are based on successive one second intervals as declared by a local clock. This local clock does NOT need to be synchronized to any external time reference. The starting time of this clock is unspecified. Note that this implies that the same loss pattern could result in slightly different count values, depending on where the losses occur relative to the particular one-second demarcation points. For example, two loss events occurring 50ms apart could result in either one concealed second or two, depending on the particular 1000 ms boundaries used.

The seconds in this sub-block are not necessarily calendar seconds.
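
The dependence on the one-second boundaries can be seen in a small counting sketch that applies the definitions given in 3.6.1 to 3.6.4. The function name and the input representation (a list of loss-type concealment intervals in integer milliseconds since the start of the local clock) are hypothetical. Treating a tail of exactly 500 ms as disregarded is an assumption, since the text only covers tails above and below 500 ms.

```python
def count_seconds(concealed, call_ms, scs_threshold_ms=50):
    """Return (unimpaired, concealed, severely_concealed) second counts.

    concealed: list of (start_ms, end_ms) loss-type concealment intervals.
    """
    full, tail = divmod(call_ms, 1000)
    n = full + (1 if tail > 500 else 0)   # a tail over 500 ms is counted
    per_second = [0] * n                  # concealed ms in each second
    for start, end in concealed:
        t = start
        while t < end:
            idx = t // 1000
            chunk = min(end, (idx + 1) * 1000) - t   # split at boundaries
            if idx < n:
                per_second[idx] += chunk
            t += chunk
    concealed_s = sum(1 for ms in per_second if ms > 0)
    severe_s = sum(1 for ms in per_second if ms > scs_threshold_ms)
    return n - concealed_s, concealed_s, severe_s


# Two 20 ms loss events 50 ms apart, placed across or within a boundary.
print(count_seconds([(970, 990), (1020, 1040)], 3000))
print(count_seconds([(400, 420), (450, 470)], 3000))
```

The first call straddles the 1000 ms boundary and reports two concealed seconds; the second places the same pattern inside one second and reports one. Severely concealed seconds are a subset of the concealed count, as 3.6.2 requires.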

At the tail end of a call, periods of time of less than 1000 ms shall be incorporated into these counts if they exceed 500 ms and shall be disregarded if they are less than 500 ms.

3.6.1 Unimpaired Seconds

A count of the number of unimpaired Seconds that have occurred.

An unimpaired Second is defined as a continuous period of 1000 ms during which no frame loss or discard due to late arrival has occurred. Every second in a call must be classified as either unimpaired (OK) or Concealed.

Normal playout of comfort noise or other silence concealment signal during periods of talker silence, if VAD is used, shall be counted as unimpaired seconds.

3.6.2 Concealed Seconds

A count of the number of Concealed Seconds that have occurred.

A Concealed Second is defined as a continuous period of 1000 ms during which any frame loss or discard due to late arrival has occurred.

Equivalently, a concealed second is one in which some Loss-type concealment (defined in section 3.5.3) has occurred. Buffer adjustment-type concealment SHALL NOT cause Concealed Seconds to be incremented, with the following exception. An implementation MAY cause Concealed Seconds to be incremented for 'emergency' buffer adjustments made during talkspurts.

For clarification, the count of Concealed Seconds MUST include the count of Severely Concealed Seconds.

3.6.3 Severely Concealed Seconds

A count of the number of Severely Concealed Seconds.

A Severely Concealed Second is defined as a non-overlapping period of 1000 ms during which the cumulative amount of time that has been subject to frame loss or discard due to late arrival exceeds the SCS Threshold.

3.6.4 SCS Threshold

The SCS Threshold defines the amount of time corresponding to lost or discarded frames that must occur within a one second period in order for the second to be classified as a Severely Concealed Second. This is expressed in milliseconds and hence can represent a range of 0.1 to 25.5 percent loss/discard.
A default threshold of 50 ms (5% effective frame loss per second) is suggested.

3.7 Delay and Packet Delay Variation (PDV) metrics sub-block

The Delay and PDV metrics sub-block MUST be present.

This sub-block contains a number of parameters related to overall delay (latency), delay variation and the current jitter buffer configuration.

3.7.1 Network Round Trip Delay (ms)

The Network Round Trip Delay is the most recently measured value of the RTP-to-RTP interface round trip delay, typically determined using RTCP SR/RR. If no measured delay is available then this field shall be set to 0xFFFF.

3.7.2 End System Delay (ms)

The End System Delay is the internal round trip delay within the reporting endpoint, calculated using the nominal value of the jitter buffer delay plus the accumulation/encoding and decoding/playout delay associated with the codec being used.

3.7.3 External Delay (ms)

The External Delay parameter indicates external network round trip delay through cellular, satellite or other types of network with significant delay impact, if known. A value of 0xFFFF shall indicate that the delay is unknown. If the external network is IP based then this parameter is typically determined using RTCP SR/RR. If the external network delay is known and does not vary materially then this value may be provisioned.

3.7.4 PDV/Jitter Metrics

Jitter metrics defined are:

(i) Mean PDV - For MAPDV this value is generated according to ITU-T G.1020. For interval reports the MAPDV value is reset at the start of the interval. For PPDV the value reported is the value of J(i) calculated according to RFC3550 at the time the report is generated. The value is 16 bits in S11:4 format, expressed in milliseconds.

(ii) Positive Threshold/Peak PDV - the PDV associated with the Positive PDV percentile (16 bit, S11:4 format) expressed in milliseconds. The term Positive is associated with packets arriving later than the expected time.

(iii) Negative Threshold/Peak PDV - the PDV associated with the Negative PDV percentile (16 bit, S11:4 format) expressed in milliseconds. The term Negative is associated with packets arriving earlier than the expected time.

(iv) Positive PDV Percentile - the percentage of packets on the call for which individual packet delays were less than the Positive Threshold PDV, expressed in 8:8 format.

(v) Negative PDV Percentile - the percentage of packets on the call for which individual packet delays were more than the Negative Threshold PDV, expressed in 8:8 format.

If the PDV Type indicated is IPDV and the Positive and Negative PDV Percentiles are set to 100.0 then the Positive and Negative Threshold/Peak PDV values are the peak values measured during the reporting interval (which may be from the start of the call for cumulative reports). In this case, the difference between the Positive and Negative Threshold/Peak values defines the range of IPDV.

3.7.5 PDV Type

Indicates the type of algorithm used to calculate PDV:

   PPDV (0) according to RFC3550 [2],
   MAPDV (1) according to ITU-T G.1020 [4],
   IPDV (2) according to ITU-T Y.1540 [6]
   Other values reserved

For example:

(a) To report PPDV (RFC3550):

      Mean PDV       = PPDV
      Threshold PDV  = Undefined (FF FF)
      PDV Percentile = Undefined (FF FF)
      PDV type       = 0 PPDV

(b) To report MAPDV (G.1020):

      Mean PDV           = average MAPDV
      Pos Threshold PDV  = 50.0
      Pos PDV Percentile = 95.3
      Neg Threshold PDV  = 50.0 (note - implies -50ms)
      Neg PDV Percentile = 98.4
      PDV type           = 1 MAPDV

Note that implementations may either fix the reported percentile and calculate the associated PDV level OR may fix a threshold PDV level and calculate the associated percentile. From a practical implementation perspective it is simpler to use the second of these approaches (except of course in the extreme case of a 100% percentile).
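
The second approach (fix the thresholds, compute the percentiles) might look like the sketch below. The function names and the per-packet list of signed PDV samples in milliseconds are hypothetical. Following example (b), the negative threshold is passed as a positive magnitude that implies a negative delay variation. The field encodings assume simple scaling by 2^8 for 8:8 and 2^4 for S11:4.

```python
def pdv_percentiles(pdv_ms, pos_threshold_ms, neg_threshold_ms):
    """Return (positive, negative) PDV percentiles for fixed thresholds.

    pdv_ms: signed per-packet delay variation samples, late = positive.
    neg_threshold_ms: magnitude, e.g. 50.0 stands for -50 ms.
    """
    n = len(pdv_ms)
    if n == 0:
        return None
    below_pos = sum(1 for d in pdv_ms if d < pos_threshold_ms)
    above_neg = sum(1 for d in pdv_ms if d > -neg_threshold_ms)
    return 100.0 * below_pos / n, 100.0 * above_neg / n


def to_8_8(percent):
    """Percentile as an unsigned 8:8 field."""
    return min(0xFFFF, round(percent * 256))


def to_s11_4(ms):
    """Non-negative threshold in ms as an S11:4 field."""
    return min(0x7FFF, round(ms * 16))


samples = [-60, -10, 0, 5, 20, 49, 51, 80]
pos, neg = pdv_percentiles(samples, 50.0, 50.0)
print(pos, neg, hex(to_8_8(pos)), hex(to_s11_4(50.0)))
```

Counting against a fixed threshold needs only two counters per stream, whereas fixing the percentile requires keeping a distribution of samples, which is consistent with the draft's remark that the second approach is simpler.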

IPDV, according to Y.1540, is the difference in delay between the i-th packet and the first packet of the stream. If the sending and receiving clocks are not synchronized, this metric includes the effect of relative timing drift.

3.7.6 Jitter Buffer / PLC Configuration

Indicates the configuration of the jitter buffer and the type of PLC algorithm in use.

   bits 0-3
      0 = silence insertion
      1 = simple replay, no attenuation
      2 = simple replay, with attenuation
      3 = enhanced
      Other values reserved

   bits 4-7
      0 = Fixed jitter buffer
      1 = Adaptive jitter buffer
      Other values reserved

3.7.7 Jitter Buffer Size parameters

Current nominal, maximum and absolute maximum jitter buffer size expressed in milliseconds, as defined in RFC3611.

3.8 Call Quality Metrics sub-block

The Call Quality Metrics sub-block MAY be present and if present MUST be indicated in the Header Map field.

This sub-block reports call quality metrics and estimates of signal, noise and echo levels. Signal, noise and echo metrics should be long term averages and should not be instantaneous values.

3.8.1 Listening and Conversation Quality R Factors - R-LQ, R-CQ

Expresses listening and conversational quality in terms of R factor, a 0-120 scaled parameter in 8:8 format. The algorithm used to calculate R factor MAY be defined in the RTCP HR Configuration block (see Section 4).

3.8.2 Listening and Conversation Quality MOS - MOS-LQ, MOS-CQ

Expresses listening and conversational quality in terms of MOS, a 1-5 scaled parameter in 8:8 format. The algorithm used to calculate MOS MAY be defined in the RTCP HR Configuration block.

Note that R factors and MOS scores may be defined for both narrow and wide-band VoIP calls. R Factors are continuous for narrow and wideband, hence the R factor for a wideband call may be higher than that for a narrowband call.
MOS scores are scaled relative to reference conditions and hence both narrow and wideband MOS occupy the same 1-5 scale; this can lead to a wideband MOS being lower than a narrowband MOS even though the listening quality may be higher.

3.8.3 R-LQ Ext In and Out

These parameters provide call quality information for external networks - for example an external PCM or cellular network - or for reporting call quality from the "other" side of a transcoding device or mixer - for example a conference bridge.

   R-LQ Ext In  - measured by this endpoint for incoming connection on "other" side of this endpoint

   R-LQ Ext Out - copied from RTCP XR message received from remote endpoint on "other" side of this endpoint

e.g.   Phone A <---> Bridge <----> Phone B

In XR message from Bridge to Phone A:

   - R-LQ        = quality for Phone A ----> Bridge path
   - R-LQ-ExtIn  = quality for Bridge <---- Phone B path
   - R-LQ-ExtOut = quality for Bridge -----> Phone B path

This allows Phone A to assess

   (i) received quality from the combination of R-LQ measured at A and R-LQ-ExtIn reported by the Bridge to A

   (ii) remote endpoint quality from the combination of R-LQ reported by the Bridge and R-LQ-ExtOut reported by the Bridge

3.8.4 RFC3550 RTP Payload Type

The RTP Payload type field - as per RFC3551 and http://www.iana.org/assignments/rtp-parameters

3.8.5 Media Type

Media type -

   0 = No media present
   1 = Narrowband audio
   2 = Wideband audio

3.8.6 Received Signal and Noise Levels - IP side

The received signal level during talkspurts and the noise level expressed in dBm0, for the decoded packet stream.

3.8.7 Received Signal and Noise Levels - External

The received signal level during talkspurts and the noise level expressed in dBm0, for the PCM side of a gateway, audio input from a handset or decoded packet stream for an IP-to-IP gateway.

3.8.8 Local and Remote Residual Echo Return Loss

The Local and Remote Residual Echo Return Loss (RERL) expressed in dB.
The Local RERL is the echo level that would be reflected into the IP path due to line echo on the circuit switched element side of this IP endpoint if a gateway or acoustic echo if a handset or wireless terminal. The Remote RERL is the echo level that would be reflected into the remote IP endpoint from the network "behind" it, and would typically be measured at and reported from the remote endpoint. This value is included as it may be used in calculating the R-CQ and MOS-CQ values expressed in this report block.

3.8.9 Metric Status

Indicates the source of parameter values used in call quality calculation:

   Bit    Description / Source

   0-1    Local IP side Signal/Noise Levels measured on the incoming decoded VoIP stream to this endpoint
             00 = assumed
             01 = measured for this call
             10 = measured across multiple calls on this port
             11 = measured across multiple ports

   2-3    Remote IP side Signal/Noise Levels reported by the remote IP endpoint through RTCP XR or equivalent
             00 = assumed
             01 = measured for this call
             10 = measured across multiple calls on this port
             11 = measured across multiple ports

   4-5    Local Trunk side Signal/Noise Levels measured on the incoming PCM, Audio or non-IP side of this endpoint
             00 = assumed
             01 = measured for this call
             10 = measured across multiple calls on this port
             11 = measured across multiple ports

   6-7    Local Echo level measured in the incoming line/trunk/handset direction at this endpoint after the effects of echo cancellation
             00 = assumed
             01 = measured for this call
             10 = measured across multiple calls on this port
             11 = measured across multiple ports

   8      Remote Echo level measured in the incoming line/trunk/handset direction at the remote endpoint after the effects of echo cancellation and reported to this endpoint via RTCP XR or equivalent
             0 = assumed
             1 = reported from remote endpoint

   9-15   Reserved

For example, if this endpoint is "C" in the diagram below then the following definitions would apply.

   Endpoint B -----RTP-------> Gateway C <-----PCM-------> D
    "Remote"                    "Local"         "Trunk/PCM/External"

   Reporting endpoint is "C"

Local IP side signal/noise metrics relate to signal/noise levels from decoded RTP packets received by C from B

Remote IP side signal/noise metrics relate to signal/noise levels from decoded RTP packets received by B from C, and reported by B to C through RTCP XR or RTCP HR VoIP Metrics blocks

Local Trunk side signal/noise metrics relate to signal/noise levels from the PCM signal received by C from D

Local Echo level relates to the proportion of the signal passing from B to C to D that is reflected back to C at some point between C and D or on the far side of D. This would typically be electrical echo or acoustic echo.

Remote Echo level relates to the proportion of the signal passing from D to C to B that is reflected back to B at some point between B and the user. This echo level is typically measured at B and reported to C via RTCP XR or RTCP HR VoIP Metrics blocks.

3.9 Vendor Extension sub-block

One or more Vendor Extension sub-blocks MAY be present and if present MUST be indicated in the Header Map field. Note that the map field does not indicate the number of vendor extension sub-blocks. This must be deduced from the length of the overall report block and the lengths of the Vendor Extension sub-blocks.

Each vendor extension sub-block consists of an extension header and vendor-specific extension data. The extension header has the form Vendor ID, Vendor ID Source and Extension Block Length. The Extension Block Length is defined as including these extension header and extension data octets but does not include any subsequent vendor extension sub-blocks. An implementation can skip over a vendor extension sub-block that it does not understand.

The Vendor ID Source field indicates whether the four-octet Vendor ID is based on ITU-T Recommendation T.35 or is an IANA private enterprise number.
The designated values for these options are 1 and 2 respectively. If the Vendor ID Source field is assigned a value of 0, then all of the fields in the vendor extension sub-block with the exception of the Vendor ID Source field and the Extension Block Length field have proprietary definitions.

If the Vendor ID is based on ITU-T Recommendation T.35, its first two octets either contain a country code from Annex A of ITU-T Rec. T.35 followed by 0x00, or an escape code of 0xFF followed by a country code from Annex B of ITU-T Rec. T.35. The next two octets comprise a terminal provider code allocated by a national assignment authority (http://people.itu.int/~campos/t35/t35db.htm). This field is padded with leading zeros if necessary. If a vendor has multiple terminal provider codes in different registries (e.g. the H-series, T-series and V-series registries for the USA), then this provenance shall be indicated in the Manufacturer Code Source field. Possible values of the Manufacturer Code Source field are:

   0       Unspecified/don't care (may be used if there is no conflict)
   1       G-series registry
   2       H-series registry
   3       T-series registry
   4       V-series registry
   5-255   Reserved.

If the four-octet Vendor ID is an IANA private enterprise number, then it is padded with leading zeros as necessary. Current private enterprise numbers (www.iana.org/assignments/enterprise-numbers) can be accommodated within two octets. The additional two octets provide for future growth.

4. RTCP HR Configuration Block

This block type provides a flexible means to describe the algorithms used for call quality calculation and other data. This block need only be exchanged occasionally, for example sent once at the start of a call.
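
Both this block and the High Resolution VoIP Metrics block may carry Vendor Extension sub-blocks (Section 3.9), and a receiver is expected to skip those it does not understand. The sketch below shows one way to walk a run of such sub-blocks. It rests on assumptions that this document does not state: the header is taken to be Vendor ID (4 octets), Vendor ID Source (1 octet), Manufacturer Code Source (1 octet) and Extension Block Length (2 octets), and the length is taken to be counted in octets with the header included. The function names and the vendor numbers are hypothetical.

```python
import struct

# Assumed header layout (illustrative, not normative):
#   Vendor ID (4) | Vendor ID Source (1) | Manufacturer Code Source (1)
#   | Extension Block Length (2, in octets, header included)
HEADER = struct.Struct("!IBBH")


def iter_vendor_extensions(buf: bytes):
    """Yield (vendor_id, id_source, code_source, data) for each sub-block."""
    offset = 0
    while offset < len(buf):
        if len(buf) - offset < HEADER.size:
            raise ValueError("truncated extension header")
        vendor_id, id_source, code_source, length = HEADER.unpack_from(buf, offset)
        if length < HEADER.size or offset + length > len(buf):
            raise ValueError("bad Extension Block Length")
        data = buf[offset + HEADER.size:offset + length]
        yield vendor_id, id_source, code_source, data
        offset += length  # the length alone is enough to reach the next sub-block


def extensions_for(buf: bytes, known_vendor_id: int):
    """Return the data of one vendor's sub-blocks, skipping all others."""
    return [data for vid, _, _, data in iter_vendor_extensions(buf)
            if vid == known_vendor_id]


# Two sub-blocks with Vendor ID Source 2 (IANA private enterprise number).
# The enterprise numbers 99999 and 12345 are made up for this example.
blob = (HEADER.pack(99999, 2, 0, 8 + 4) + b"\x01\x02\x03\x04"
        + HEADER.pack(12345, 2, 0, 8 + 2) + b"\xAA\xBB")
```

Calling extensions_for(blob, 12345) steps over the first sub-block using only its length field, without interpreting its data.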

Header sub-block

    0               1               2               3
    0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     BT=N      |      Map      |         block length          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        SSRC of source                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Correlation Tag sub-block

    0               1               2               3
    0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   Tag Type    |  Tag length   |      Correlation Tag...       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      ... Correlation Tag                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Algorithm sub-block

    0               1               2               3
    0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   Alg type    | Descriptor len|    Algorithm descriptor...    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   ... Algorithm descriptor                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Vendor Specific Extensions sub-block

    0               1               2               3
    0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           Vendor ID                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Vendor ID Src |Manuf Code Src |    Extension Block Length     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                Vendor-specific extension data                 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

4.1 Header

Implementations MUST send the Header block within each RTCP HR Configuration report.

4.2.1 Block type

One RTCP HR Configuration block is defined

   mmm+12 = RTCP HR Configuration Block

The time interval associated with these report blocks is left to the implementation.
Spacing of RTCP reports should be in accordance with RFC3550; however, the specific timing of RTCP HR reports may be determined in response to an internally derived alert such as a threshold crossing.

4.2.2 Map field

A Map field indicates the optional sub-blocks present in this report. A '1' indicates that the sub-block is present, and a '0' that the sub-block is absent. If present, the sub-blocks must be in the sequence defined in this document. The bits have the following definitions:

   0     Correlation Tag
   1     Algorithm Descriptor 1
   2     Algorithm Descriptor 2
   3     Algorithm Descriptor 3
   4     Algorithm Descriptor 4
   5     Vendor Specific Extension
   6-7   Reserved, set to '0'

4.2.3 Block Length

The block length indicates the length of this report in 32 bit words and includes the header and any extension octets.

4.2.4 SSRC

The SSRC of the stream to which this report relates.

4.3 Correlation Tag

The Correlation Tag sub-block MAY be present and if present MUST be indicated in the map field.

This tag facilitates the correlation of the high resolution VoIP metrics report blocks with other call-related data, session-related data or endpoint data.

An example use case is for an endpoint to convey its version of a call identifier or a global call identifier via this tag. A flow measurement tool (sniffer) that is not call-aware can then forward the RTCP-HR reports along with this correlation tag to network management. Network management can then use this tag to correlate this report with other diagnostic information such as call detail records.

The Tag Type indicates the use of the correlation tag. The following values are defined:

   0: IMS Charging Identity (ICID) subfield of the P-Charging-Vector header specified in RFC 3455 [7].
   1: Globally unique ID as specified in ITU-T H.225.0 (Table 20/H.225.0) [8].
   2: Conference Identifier, per ITU-T H.225.0 (Table 20/H.225.0) [8].
   3: SIP Call-ID as defined in RFC 3261 [9].
   4: PacketCable Billing Call ID (BCID) [10].
   5: Text string using the US-ASCII character set [11].
   6: Octet string.
   7-255: Future growth.

Although the intent of this RFC is to list all currently known values of usable correlation tags, it is possible that new values may be defined in the future. An IANA registry of correlation tags is recommended.

The tag length indicates the overall length of the sub-block in 32 bit words and includes the tag type and length fields.

4.4 Algorithm descriptor

The Algorithm Type sub-block MAY be present and if present MUST be indicated in the map field.

The Algorithm Type is a bit field which indicates which algorithm is being described. The bits are defined as:-

   Bit 0:   MOS-LQ Algorithm
   Bit 1:   MOS-CQ Algorithm
   Bit 2:   R-LQ Algorithm
   Bit 3:   R-CQ Algorithm
   Bit 4-7: Reserved and set to '0'

The descriptor length gives the overall length of the descriptor in 32 bit words and includes the algorithm descriptor and length fields.

The algorithm descriptor is a text field that contains the description or name of the algorithm. If the algorithm name is shorter than the length of the field then the trailing octets must be set to 0x00.

For example, an implementation may report:

   Algorithm type    = 0xF0         - R and MOS algorithms
   Descriptor length = 3            - 3 words
   Descriptor        = "Alg X" 0x00 - description

Call quality estimation algorithms may be defined for listening or conversational quality MOS or R factor.

4.5 Vendor Extension field

One or more vendor specific extension blocks may be added, as defined in Section 3.9.

5. Application of RTCP HR to RTP translators

RFC3550 [2] defines three types of RTP systems which may source and sink RTP streams. These types are RTP end systems, RTP mixers, and RTP translators. For purposes of reporting connection quality to other RTP systems, RTP mixers and RTP end systems are very similar.
Mixers resynchronize audio packets and do not relay RTCP reports received from one cloud towards other cloud(s). Translators do not resynchronize packets and SHOULD forward certain RTCP reports between clouds. Translators have a wide range of possible reporting behaviours. This section describes in more detail the application of RTCP HR to RTP translators. An RTP translator is defined in RFC3550 [2] as "An intermediate system that forwards RTP packets with their synchronization source identifier intact". RFC3550 gives the following examples of translators: devices that convert encodings without mixing; replicators from multicast to unicast; and application-level filters in firewalls. For each session, an RTP translator has at least two RTP connections via different logical interfaces to network clouds, with logical separation based on a transport layer identifier (e.g. by UDP port) or at a lower layer (e.g. IP address, Ethernet VLAN identifier, ATM VCI, or physical interface). RTP traffic streams flowing via these two connections may be unicast or multicast, and may be unidirectional or bidirectional. Section 7 of RFC3550 [2] defines the RTCP processing in the translator which is required to maintain correct semantics for the RTCP communication between the RTP end systems involved in the connection. Where the RTP translator makes its own measurements of the incoming packet stream, section 7 of RFC3550 [2] allows it to send RTCP reception reports about what it has received. To do this, it is allowed to allocate an SSRC and a CNAME for itself. An RTP translator has several potential sources of information about transport and application quality in a session. 
These include

   o  RTCP (including XR and HR) reports received from RTP end systems and mixers

   o  RTCP (including XR and HR) reports received from other translators

   o  Measurements made locally on incoming RTP streams at the translators' interfaces

In general, it is a matter of policy whether each of these types of information should be reported or forwarded by the translator towards neighbouring RTP systems involved in the same session. The objective of this policy may be to ensure that sufficient information is available to support network and service management at RTP systems, whilst avoiding excessive RTCP processing. Where a translator forms a boundary between service providers' domains, a provider may wish to apply policy to restrict the release of network performance information towards the provider's peer.

This suggests that policy controlling reporting and forwarding of RTCP information (including XR and HR) MAY be configurable differently for each network interface, and may be chosen differently for each session if the translator is capable of this level of control. This allows certain types of RTCP reports for the session to be sent outwards via one subset of the translator's interfaces involved in the session, whilst restricting reporting and forwarding outwards via another subset.

The following behaviours are potentially useful at a translator:

   o  RTCP [2] reports received from an upstream RTP end system or mixer SHOULD be forwarded towards the downstream RTP end systems or mixers involved in the session. This behaviour is RECOMMENDED to maintain the integrity of the RTCP connection between the RTP end systems involved in the connection. We call this behaviour "a" to aid discussion below of the examples of end-to-end capabilities.

   o  RTCP XR and RTCP HR reports received from an upstream RTP end system or mixer MAY be forwarded towards the downstream RTP end systems or mixers involved in the session.
      We call this behaviour "b".

   o  The results of measurements made on an RTP stream received at one of the translator's network interfaces MAY be sent out as RTCP reports towards neighbouring RTP systems, either upstream in the direction towards the RTP system which originated the RTP stream (behaviour "c1"), or downstream in the direction towards the RTP system(s) which will receive the translated RTP stream (behaviour "c2"), or both upstream and downstream (described as behaviour "c1+c2").

   o  RTCP XR and RTCP HR reports received from an upstream RTP translator MAY be forwarded towards the downstream RTP end systems or mixers involved in the session. We call this behaviour "d".

NOTE: Behaviours "a", "b", "c" and "d" may be further classified into a) transparent forwarding ("a_t", "b_t", "c_t", "d_t") and b) forwarding with modification of RTCP fields ("a_m", "b_m", "c_m", "d_m"), dependent on the specific, VoIP call-individual mode of operation of the RTP translator (see also the note at the end of clause 5.1.1). The detailed behaviour is inherent to the RTP translator configuration, thus orthogonal to the applied measurement mode.

Material in section 6 illustrates how policies may be applied to control these behaviours and hence achieve potentially useful end-to-end reporting modes.

6. RTCP HR Report Multiplexing

For RTP sessions that traverse one or more RTP translators, implementations MAY concatenate RTCP HR Report Blocks. The order of concatenated RTCP HR blocks shall be the same as the order in which network segments are traversed.
This behaviour is described below with respect to the following example of a bidirectional unicast connection between RTP end systems A and Z using three translators J, K, and L:

    ----     -----     -----     -----     ----
   |   T|-->|1R 2T|-->|1R 2T|-->|1R 2T|-->|R   |
   | A  |   |  J  |   |  K  |   |  L  |   |  Z |
   |   R|<--|1T 2R|<--|1T 2R|<--|1T 2R|<--|T   |
    ----     -----     -----     -----     ----

[Editors notes:

   (a) In above example, if segment A->J does not support HR then reports from K, L, Z should not be forwarded to A - for further study.

   (b) Suggest three options -
       - local only, i.e. (A->J) only
       - endpoint only, i.e. (A->Z) only
       - all segments (A->J) and (J->K)....]

In the first example (Figure 1) translators implement policies "a", "b", "c1+c2" and "d" of section 5. The result is that the connection operates in a mode called "Global Reporting" in which all measurements by translators are forwarded to end systems. Without concatenation, this set of policies results in a large number of separate RTCP packets.

In Figure 2, concatenation is applied to reduce the number of separate RTCP packets, here to 3. The packet labelled "RTCP(A.R)" is the report from the end system, identified because its SSRC is the same as the SSRC of the RTP flow in the same direction. This is forwarded without concatenation. The packet labelled RTCP(J.1R, K.1R, L.1R) contains translator reports of measurements made on the RTP flow from A to Z, identified because they report the measured flow having SSRC equal to the SSRC of A. The packet labelled RTCP(J.2R, K.2R, L.2R) contains translator reports of measurements made on the RTP flow from Z to A, identified because they report the measured flow having SSRC equal to the SSRC of Z.

A.R/A.T------>J.1R/J.2T---->K.1R/K.2T---->L.1R/L.2T------>Z.R/Z.T
   |     RTP     |             |             |              |
   |<------------|-------------|-------------|------------->|
   |             |             |             |              |
   | RTCP(A.R)   | RTCP(A.R)   | RTCP(A.R)   | RTCP(A.R)    |
   |- - - - - - >|- - - - - - >|- - - - - - >|- - - - - - > |
   |             | RTCP(J.1R)  | RTCP(J.1R)  | RTCP(J.1R)   |
   |             |- - - - - - >|- - - - - - >|- - - - - - > |
   |             | RTCP(J.2R)  | RTCP(J.2R)  | RTCP(J.2R)   |
   |             |- - - - - - >|- - - - - - >|- - - - - - > |
   |             |             | RTCP(K.1R)  | RTCP(K.1R)   |
   |             |             |- - - - - - >|- - - - - - > |
   |             |             | RTCP(K.2R)  | RTCP(K.2R)   |
   |             |             |- - - - - - >|- - - - - - > |
   |             |             |             | RTCP(L.1R)   |
   |             |             |             |- - - - - - > |
   |             |             |             | RTCP(L.2R)   |
   |             |             |             |- - - - - - > |
   |             |             |             |              |
A.T/A.R<------J.1T/J.2R<----K.1T/K.2R<----L.1T/L.2R<------Z.T/Z.R
   |     RTP     |             |             |              |
   |<------------|-------------|-------------|------------->|
   |             |             |             |              |
   | RTCP(Z.R)   | RTCP(Z.R)   | RTCP(Z.R)   | RTCP(Z.R)    |
   |< - - - - - -|< - - - - - -|< - - - - - -|< - - - - - - |
   | RTCP(L.1R)  | RTCP(L.1R)  | RTCP(L.1R)  |              |
   |< - - - - - -|< - - - - - -|< - - - - - -|              |
   | RTCP(L.2R)  | RTCP(L.2R)  | RTCP(L.2R)  |              |
   |< - - - - - -|< - - - - - -|< - - - - - -|              |
   | RTCP(K.1R)  | RTCP(K.1R)  |             |              |
   |< - - - - - -|< - - - - - -|             |              |
   | RTCP(K.2R)  | RTCP(K.2R)  |             |              |
   |< - - - - - -|< - - - - - -|             |              |
   | RTCP(J.1R)  |             |             |              |
   |< - - - - - -|             |             |              |
   | RTCP(J.2R)  |             |             |              |
   |< - - - - - -|             |             |              |
   |             |             |             |              |

             Figure 1   The Global reporting mode

A.R/A.T------>J.1R/J.2T---->K.1R/K.2T---->L.1R/L.2T------>Z.R/Z.T
   |     RTP     |             |             |              |
   |<------------|-------------|-------------|------------->|
   |             |             |             |              |
   | RTCP(A.R)   | RTCP(A.R)   | RTCP(A.R)   | RTCP(A.R)    |
   |- - - - - - >|- - - - - - >|- - - - - - >|- - - - - - > |
   |             |             |             | RTCP(J.1R,   |
   |             |             | RTCP(J.1R,  |      K.1R,   |
   |             | RTCP(J.1R)  |      K.1R)  |      L.1R)   |
   |             |- - - - - - >|- - - - - - >|- - - - - - > |
   |             |             |             | RTCP(J.2R,   |
   |             |             | RTCP(J.2R,  |      K.2R,   |
   |             | RTCP(J.2R)  |      K.2R)  |      L.2R)   |
   |             |- - - - - - >|- - - - - - >|- - - - - - > |
A.T/A.R<------J.1T/J.2R<----K.1T/K.2R<----L.1T/L.2R<------Z.T/Z.R
   |     RTP     |             |             |              |
   |<------------|-------------|-------------|------------->|
   |             |             |             |              |
   | RTCP(Z.R)   | RTCP(Z.R)   | RTCP(Z.R)   | RTCP(Z.R)    |
   |< - - - - - -|< - - - - - -|< - - - - - -|< - - - - - - |
   | RTCP(L.1R,  |             |             |              |
   |      K.1R,  | RTCP(L.1R,  |             |              |
   |      J.1R)  |      K.1R)  | RTCP(L.1R)  |              |
   |< - - - - - -|< - - - - - -|< - - - - - -|              |
   | RTCP(L.2R,  |             |             |              |
   |      K.2R,  | RTCP(L.2R,  |             |              |
   |      J.2R)  |      K.2R)  | RTCP(L.2R)  |              |
   |< - - - - - -|< - - - - - -|< - - - - - -|              |
   |             |             |             |              |

   Figure 2   Proposed multiplexing scheme of the Global reporting mode

In the second example (Figure 3), translators implement policies "a", "b", and "c1+c2" of section 5. The result is that the connection operates in a mode called "Local Reporting" in which measurements by each translator are sent to the nearest-neighbour RTP system in each direction but are not forwarded further. Even without concatenation, this set of policies results in a maximum of three RTCP packets in each direction on each link (per RTCP measurement cycle). Concatenation does not occur in this mode.
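
The per-link packet counts implied by Figures 1 to 3 can be checked with a short sketch. It models only the forward (A to Z) direction of the example chain and assumes that each translator emits one report for each of its two receive interfaces (1R and 2R) per measurement cycle. The function forward_packets and its arguments are hypothetical names used for illustration.

```python
# Sketch only: models the forward (A-to-Z) direction of the example chain
# A - J - K - L - Z. Names and structure are illustrative assumptions.

def forward_packets(translators, mode, concatenate=False):
    """Return, for each link in path order, the RTCP packets sent towards Z.

    A packet is a tuple of report names, so a concatenated packet
    carries several reports.
    """
    links = [[("A.R",)]]                      # link from A to the first translator
    for i, name in enumerate(translators):
        own = [f"{name}.1R", f"{name}.2R"]
        if mode == "local":
            # Only this translator's own reports; nothing is forwarded on.
            seen = [own]
        else:                                 # "global"
            seen = [[f"{t}.1R", f"{t}.2R"] for t in translators[: i + 1]]
        if concatenate:
            # One packet per measured flow, reports kept in path order.
            packets = [tuple(r[0] for r in seen), tuple(r[1] for r in seen)]
        else:
            packets = [(report,) for r in seen for report in r]
        links.append([("A.R",)] + packets)
    return links


def counts(links):
    """Number of RTCP packets on each link."""
    return [len(link) for link in links]


chain = ["J", "K", "L"]
global_plain = counts(forward_packets(chain, "global"))                   # [1, 3, 5, 7]
global_concat = counts(forward_packets(chain, "global", concatenate=True))  # [1, 3, 3, 3]
local_plain = counts(forward_packets(chain, "local"))                     # [1, 3, 3, 3]
```

Under these assumptions, Global Reporting without concatenation grows by two packets per translator traversed, while both the concatenated Global mode and the Local mode stay at three packets per link.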

A.R/A.T------>J.1R/J.2T---->K.1R/K.2T---->L.1R/L.2T------>Z.R/Z.T
   |     RTP     |             |             |              |
   |<------------|-------------|-------------|------------->|
   |             |             |             |              |
   | RTCP(A.R)   | RTCP(A.R)   | RTCP(A.R)   | RTCP(A.R)    |
   |- - - - - - >|- - - - - - >|- - - - - - >|- - - - - - > |
   |             | RTCP(J.1R)  | RTCP(K.1R)  | RTCP(L.1R)   |
   |             |- - - - - - >|- - - - - - >|- - - - - - > |
   |             | RTCP(J.2R)  | RTCP(K.2R)  | RTCP(L.2R)   |
   |             |- - - - - - >|- - - - - - >|- - - - - - > |
   |             |             |             |              |
   |             |             |             |              |
A.T/A.R<------J.1T/J.2R<----K.1T/K.2R<----L.1T/L.2R<------Z.T/Z.R
   |     RTP     |             |             |              |
   |<------------|-------------|-------------|------------->|
   |             |             |             |              |
   | RTCP(Z.R)   | RTCP(Z.R)   | RTCP(Z.R)   | RTCP(Z.R)    |
   |< - - - - - -|< - - - - - -|< - - - - - -|< - - - - - - |
   | RTCP(J.1R)  | RTCP(K.1R)  | RTCP(L.1R)  |              |
   |< - - - - - -|< - - - - - -|< - - - - - -|              |
   | RTCP(J.2R)  | RTCP(K.2R)  | RTCP(L.2R)  |              |
   |< - - - - - -|< - - - - - -|< - - - - - -|              |
   |             |             |             |              |

             Figure 3   Local reporting mode

Figure 4 depicts an example of the RTCP reports configuration of the proposed multiplexing scheme.

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                            Header                             |
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   |                          Sender info                          |
   |                                                               |
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   |                    Report block of SSRC 1                     |
   |                                                               |
   |                                                               |
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   |                            Chunk 1                            |
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   |                           HR header                           |
   |                                                               |
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   |                    Report blocks (Chunk 1)                    |
   |                                                               |
   |                                                               |
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   |                                                               |
   |                              ...                              |
   |                                                               |
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   |                            Chunk N                            |
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   |                           HR header                           |
   |                                                               |
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   |                    Report blocks (Chunk N)                    |
   |                                                               |
   |                                                               |
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+

   Figure 4   An example of the multiplexed RTCP reports packet

7. SDP Signaling

7.1 Block type values: assignment of initial value "mmm"

The assignment of the initial value "mmm" is the responsibility of "non-RTP means" (for the definition see clause 3/RFC 3550), which is typically SDP signalling.

7.2 Applicability for single port RTP/RTCP sessions

The selection of a value range for HR block types is independent of whether RTP and RTCP use different ports or a single port (see draft-ietf-avt-rtp-and-rtcp-mux).

Single port RTP/RTCP applications must ensure that RTP payload type 79 is NOT used, because it conflicts with RTCP XR packet type 207 (see clause 4/draft-ietf-avt-rtp-and-rtcp-mux).

7.3 The SDP attribute "rtcp-hr"

The "rtcp-hr" attribute complements and extends the "rtcp-xr" SDP attribute as defined in clause 5.1/RFC3611.

7.4 Usage in Offer/Answer

See clause 5.2/RFC3611.

7.5 Usage Outside of Offer/Answer

See clause 5.3/RFC3611.

8. Practical Applications

8.1 Overview

The objective of this section is to identify a number of cases in which there could potentially be some ambiguity in the application of the report blocks defined above or some exceptions to the defined operation of the metrics.

8.2 Supplementary Services: Call Hold and Transfer

8.2.1 General

Supplementary services are under control of call/session control protocols like SIP. Such signaling protocols also act as "non-RTP means" (for the definition see clause 3/RFC 3550) in such service scenarios.
The "northbound" served user instance for RTCP HR data is typically co-located with the served user instance of the call/session control protocol controlling the supplementary service. In principle, this allows supplementary service control events to be correlated with RTCP HR measurements in such network elements (like a SIP UA, SIP proxy, application server, etc.).

Thus, the correlation between RTP/RTCP session control and supplementary service control helps to minimize potential ambiguity. The sub-clauses below provide additional notes that depend on the specific supplementary service.

8.2.2 Supplementary Service: Call Transfer

A successful call transfer means that an initial call between A and B is transferred to a call between C and B. This means that the RTP end system A is "replaced" by RTP end system C, accompanied by all corresponding changes in an RTP/RTCP endpoint (e.g., SSRC for A "replaced" by SSRC for C).

In the scope of RTCP HR, it is therefore recommended to consider the two call phases (1st phase: call A-B, 2nd phase: call C-B) as separate measurement phases. Separate measurement phases could be based, for example, on interval metrics and the derivation of call phase-individual cumulative metrics by the "northbound" served user instance of RTCP HR, or on "resetting" the cumulative metrics in each call phase.

8.2.3 Supplementary Service: Call Hold

Call hold enables the served (holding) user A to put user B (with whom user A has an active call) into a hold condition (held user) and subsequently to retrieve that user again. During this hold condition, user B may be provided with media on hold (MoH). The served (holding) user A may perform other actions while user B is being held, e.g. consulting with another user C.

In the scope of RTCP HR, it is recommended to consider the different call phases as separate measurement phases (see also 8.2.2).
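
As an illustration of the separate measurement phases recommended in 8.2.2 and 8.2.3, the sketch below derives phase-individual cumulative metrics from interval reports, opening a new phase when a transfer or hold event is signalled. It is a sketch of what a "northbound" user of RTCP HR data might do, not a defined procedure. The class PhaseTracker, its methods, and the reduction of an interval report to a pair of expected and lost packet counts are all assumptions made for illustration.

```python
# Sketch only: class and field names are hypothetical. An interval report is
# reduced to (packets expected, packets lost) for illustration.

class PhaseTracker:
    """Keep cumulative loss counts separately for each call phase."""

    def __init__(self, first_phase: str):
        self.phases = []              # [label, expected, lost] in call order
        self.start_phase(first_phase)

    def start_phase(self, label: str) -> None:
        """Begin a new measurement phase, e.g. after a transfer or hold."""
        self.phases.append([label, 0, 0])

    def add_interval(self, expected: int, lost: int) -> None:
        """Fold one interval report into the current phase only."""
        current = self.phases[-1]
        current[1] += expected
        current[2] += lost

    def loss_rates(self):
        """Return (label, loss fraction) for each phase."""
        return [(label, lost / expected if expected else 0.0)
                for label, expected, lost in self.phases]


tracker = PhaseTracker("A-B")
tracker.add_interval(expected=500, lost=5)
tracker.add_interval(expected=500, lost=15)
tracker.start_phase("C-B")            # call transferred from A to C
tracker.add_interval(expected=1000, lost=0)
```

Here the loss seen during the A-B phase (20 of 1000 packets) stays attributed to that phase and does not dilute, or get diluted by, the loss-free C-B phase.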

8.3 Bitrate efficiency improvements: Silence Suppression based on Voice Activity Detection (VAD)

A VoIP call either has silence suppression enabled or has it disabled. This is typically a call-individual configuration parameter, negotiated during the call establishment phase and not changed during the remainder of the call.

Enabling silence suppression affects almost all high resolution VoIP metrics. The "northbound" served user instance of RTCP HR may require access to the information on whether silence suppression was enabled or disabled for that call, in order to indicate that mode of operation in the VoIP measurement data.

8.4 Endpoint configuration changes mid-call

An endpoint relates to an RTP end system, which can be either
   a) located in VoIP user/terminal equipment (e.g. SIP UA), or
   b) located in VoIP gateway equipment (e.g. PSTN-to-RTP H.248 media gateway), or
   c) located in VoIP media server equipment.

8.4.1 Changes due to mid-call transitions between different voice codec types

Voice codec type changes are reflected in RTP payload type changes, which are visible in the Call Quality Metrics sub-block (see clause 3.8.4).

8.4.2 Changes due to mid-call transitions from VoIP to RTP-based VBDoIP

There might be mid-call transitions from VoIP to dedicated modes of operation for voiceband data services support if at least one RTP end system is located in type (b) equipment. Mode transitions should again be reflected in RTP payload type changes in the case of RTP-based VBD transport (e.g. ITU-T Rec. V.152 for VBDoIP).

Details are for further study.

8.4.3 Changes due to mid-call transitions from VoIP to non-RTP-based VBDoIP

UDPTL/UDP based realtime facsimile according to ITU-T Rec. T.38 is an example of RTP-less transport of facsimile/modem signals. Any mid-call transition to T.38 would inherently terminate the RTP/RTCP session, and thus the measurement phase.
Details are for further study.

8.5 SSRC changes mid-call

An SSRC change may be, for example, the consequence of a mid-call transport address change.

Details are for further study.

9. IANA Considerations

This document defines a series of new RTCP Extended Report (XR) block types within the existing Internet Assigned Numbers Authority (IANA) registry of RTP RTCP XR block types.

In addition, this document defines the need for an IANA registry of correlation tag types (Section 4.3).

10. Security Considerations

RTCP reports can contain sensitive information since they can provide information about the nature and duration of a session established between two endpoints. As a result, any third party wishing to obtain this information should be properly authenticated and the information transferred securely.

11. Contributors

The authors gratefully acknowledge the comments and contributions made by Jim Frauenthal, Mike Ramalho, Paul Jones, Claus Dahm, Bob Biskner, Mohamed Mostafa, Tom Hock, Albert Higashi, Shane Holthaus, Amit Arora, Bruce Adams, Albrecht Schwarz, Keith Lantz, Randy Ethier, Philip Arden, Ravi Raviraj and Hideaki Yamada.

12. Informative References

[1]  Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, June 1997.

[2]  Schulzrinne, H., Casner, S., Frederick, R. and V. Jacobson, "RTP: A Transport Protocol for Real-Time Applications", STD 64, RFC 3550, July 2003.

[3]  Friedman, T., Caceres, R. and A. Clark, "RTP Control Protocol Extended Reports (RTCP XR)", RFC 3611, November 2003.

[4]  "Performance parameter definitions for quality of speech and other voiceband applications utilizing IP networks", ITU-T Rec. G.1020, November 2003.

[5]  Annex A to G.1020: "VoIP Gateway specific reference points and performance parameters", Amendment 1 to ITU-T Rec. G.1020, May 2004.

[6]  "Internet protocol data communication service - IP packet transfer and availability performance parameters", ITU-T Rec. Y.1540, December 2002.

[7]  Garcia-Martin, M., Henrikson, E. and D. Mills, "Private Header (P-Header) Extensions to the Session Initiation Protocol (SIP) for the 3rd-Generation Partnership Project (3GPP)", RFC 3455, January 2003.

[8]  ITU-T H.225.0, "Call signalling protocols and media stream packetization for packet-based multimedia communication systems", July 2003.

[9]  Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A., Peterson, J., Sparks, R., Handley, M. and E. Schooler, "SIP: Session Initiation Protocol", RFC 3261, June 2002.

[10] PacketCable(TM) Multimedia Specification, PKT-SP-MM-I02-040930, September 2004.

[11] Coded Character Set--7-Bit American Standard Code for Information Interchange, ANSI X3.4-1986.

Authors' Addresses

   Alan Clark
   Telchemy Incorporated
   2905 Premiere Parkway, Suite 280
   Duluth, GA 30097
   Email: alan@telchemy.com

   Amy Pendleton
   Nortel
   2380 Performance Drive
   Richardson, TX 75081
   Email: aspen@nortel.com

   Rajesh Kumar
   Cisco Systems
   170 West Tasman Drive
   San Jose, CA 95134
   Email: rkumar@cisco.com

   Kevin Connor
   Cisco Systems
   5590 Whitehorn Way
   Blaine, WA 98230
   Email: kconnor@cisco.com

   Geoff Hunt
   BT
   BT Adastral Park
   Martlesham Heath
   Ipswich IP5 3RE
   UK
   Email: geoff.hunt@bt.com

Full Copyright Statement

Copyright (C) The IETF Trust (2007).

This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights.

This document and the information contained herein are provided on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights. Information on the procedures with respect to rights in RFC documents can be found in BCP 78 and BCP 79.

Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at http://www.ietf.org/ipr.

The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard. Please address the information to the IETF at ietf-ipr@ietf.org.

Acknowledgement

Funding for the RFC Editor function is provided by the IETF Administrative Support Activity (IASA).