
Re: [AVT] Suggestion for how to handle derived/computed metric in draft-avt-rtcpxr-*



"Alan Clark" <alan.d.clark at telchemy.com> writes:
>>I read G.799.1, and I saw no reference on how to calculate R or MOS.  Even
>>if they're elsewhere or in an appendix I missed, 
>
>You should look again - Appendix IV to G.799.1 (2004) is entitled "Mapping
>of E Model (G.107) Parameters from RFC3611 Metrics". 

Ah.  The .doc I found online didn't have that appendix.
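For readers without the appendix to hand, the final step of "calculating
MOS" from R can be sketched with the R-to-MOS conversion published in
ITU-T G.107.  This is a minimal sketch of that one mapping only; the
function name r_to_mos is illustrative, and deriving R itself from RFC 3611
metrics (the subject of the G.799.1 appendix) is not shown here.

```python
def r_to_mos(r: float) -> float:
    """Map an E-model rating factor R to an estimated MOS.

    Uses the conversion given in ITU-T G.107: MOS is clamped to 1.0 for
    R <= 0 and to 4.5 for R >= 100, with a cubic curve in between.
    """
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6


if __name__ == "__main__":
    for r in (50.0, 70.0, 80.0, 93.2):
        print(f"R = {r:5.1f}  ->  estimated MOS = {r_to_mos(r):.2f}")
```

Note that the output is an estimate derived from R, not a score collected
from listeners.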

>>those may not (properly) handle things like adaptive jitter buffers
>>(G.799.1 mentions jitter buffers briefly, but implies the delay must be
>>chosen according to the goals of the network - implying a fixed jitter
>>buffer).  Given all the ways an adaptive jitter buffer can deal with
>>lengthening and shrinking the buffer, I don't think any objective algorithm
>>is going to do a good job of measuring the effect of that operation,
>>certainly not from outside the jitter buffer/codec code.
>
>Actually - I don't quite agree with you.  A jitter buffer essentially
>replaces jitter with additional packet loss and delay.  The combined effects
>of network packet loss and packets discarded by the jitter buffer due to
>late arrival lead to the same result.  The added delay due to the jitter
>buffer increases overall delay as seen by the user; however, this effect is
>generally perceived over a longer time period due to conversational
>interaction problems (unless, of course, there is a high level of echo on
>the call).

Sorry, that's wrong (IMHO), depending on the jitter buffer.  An adaptive
jitter buffer has to speed up and slow down playout in some manner.  This
could be by throwing away or adding entire frames (which would look mostly
like packet loss or unexpected repetition), or by much fancier mechanisms
that expand and contract time during speech by less than a frame (which
would be very unlike packet loss or repetition, and very tricky to compute
an MOS for, unless you have a way to do it from *only* the reconstructed
sample stream).  (Note: I haven't read that G.799.1 appendix.)

Your comments would apply to a static jitter buffer, however.
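The static case can be sketched as follows: with a fixed playout delay,
every packet is either played on time, lost in the network, or discarded
for arriving late, so the buffer's effect reduces to extra loss plus a
constant delay.  This is a minimal sketch under simplifying assumptions
(timestamps in milliseconds on a shared clock, None marking a packet lost in
the network); the function and field names are illustrative.  It does not
model an adaptive buffer, whose sub-frame time stretching has no
counterpart in these counters.

```python
from typing import Dict, Optional, Sequence


def static_jitter_buffer(
    send_ms: Sequence[float],
    arrival_ms: Sequence[Optional[float]],
    playout_delay_ms: float,
) -> Dict[str, float]:
    """Summarise a fixed-delay jitter buffer as loss and discard rates."""
    total = len(send_ms)
    if total == 0:
        return {"network_loss": 0.0, "discard": 0.0, "effective_loss": 0.0}

    lost = 0
    discarded = 0
    for sent, arrived in zip(send_ms, arrival_ms):
        if arrived is None:
            lost += 1                      # never arrived: network loss
        elif arrived > sent + playout_delay_ms:
            discarded += 1                 # arrived after its playout deadline
    return {
        "network_loss": lost / total,
        "discard": discarded / total,
        "effective_loss": (lost + discarded) / total,
    }


if __name__ == "__main__":
    send = [0, 20, 40, 60, 80]
    arrival = [30, 55, None, 150, 112]
    print(static_jitter_buffer(send, arrival, playout_delay_ms=60))
```

In this example one packet is lost in the network and one arrives after its
deadline, so the listener experiences two missing frames out of five plus a
constant 60 ms of added delay.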

>> Roni's point was that UA developers generally get black-box codecs to play
>>with.  They generally know about packet loss, jitter (going into the jitter
>>buffer), codec, etc., but PLC may well be hidden from them.
>
>This functionality could of course be implemented by the codec.  In
>addition, many commercial codecs have APIs that allow metrics to be
>retrieved - in some cases a fairly prolific set of metrics.  This does
>typically include information related to both packet processing (i.e. loss,
>discard, jitter buffer depth) and signal processing (signal levels, noise
>levels, echo levels).

Sure, some do - and more don't, from what I've seen.  The ones I've seen
have no metrics, or they have just enough for plain RTCP SR/RR.

>> Also, a loss of a packet when someone is listening (i.e. silence) has far
>>less effect than a loss at the start of a word (witness the Neil Armstrong
>>moon landing, sort of), and the UA developer probably doesn't know if the
>>packets on either side of the loss indicate that there was talking or not.
>
>Several answers to this one!!
>
>(i) Objective quality measurement generally does not take into account
>emotional or semantic content, but generally tries to be robust over a wide
>range of types of content (e.g. testing using content in multiple
>languages).  A good analogy was explained to me by a subjective test expert
>from a major US service provider - apparently they used to conduct state by
>state phone surveys of subscribers to fulfil FCC reporting requirements.
>One survey indicated that the (subjective) quality of service in that state
>had dropped significantly, which caused considerable concern to senior
>management; they found out however that the unemployment rate in the state
>had just increased significantly - you can imagine if someone had just lost
>their job and the phone company called to say "how happy are you with our
>service" that the response may be less than complimentary!!

Ok - though my point was that automated MOS scoring could/should take
into account the content of the packets lost, at least to the level of
determining if there was (likely) any speech coded within the lost packet.
(Look at the packets on either side of the loss - if they have significant
speech-like qualities, then this packet almost certainly coded speech too.)
If it likely coded silence, it has much less impact on subjective MOS,
and so objective MOS should try to model that (when it can).
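The neighbour check described above can be sketched roughly as follows.
This assumes the receiver can obtain some per-frame speech-likeness measure
for received packets; here that is a hypothetical frame energy in dB, with
None marking a lost packet, and the -40 dB threshold is an arbitrary
illustrative value.  With a black-box codec even this much may not be
available.

```python
from typing import Optional, Sequence, Tuple


def classify_losses(
    frame_energy_db: Sequence[Optional[float]],
    speech_threshold_db: float = -40.0,
) -> Tuple[int, int]:
    """Count lost frames that likely carried speech vs. likely silence.

    A lost frame is counted as speech only if every nearest received
    neighbour (before and after the loss run) looks speech-like.
    """
    n = len(frame_energy_db)
    speech_losses = 0
    silence_losses = 0
    for i, energy in enumerate(frame_energy_db):
        if energy is not None:
            continue  # frame was received; nothing to classify
        # Nearest received frame on each side, skipping over other losses.
        prev = next((frame_energy_db[j] for j in range(i - 1, -1, -1)
                     if frame_energy_db[j] is not None), None)
        nxt = next((frame_energy_db[j] for j in range(i + 1, n)
                    if frame_energy_db[j] is not None), None)
        neighbours = [e for e in (prev, nxt) if e is not None]
        if neighbours and all(e >= speech_threshold_db for e in neighbours):
            speech_losses += 1
        else:
            silence_losses += 1
    return speech_losses, silence_losses


if __name__ == "__main__":
    frames = [-20.0, None, -18.0, -60.0, None, -65.0, -22.0, None, -70.0]
    print(classify_losses(frames))
```

A scoring model could then weight the first count more heavily than the
second; how much more is a modelling decision this sketch does not attempt.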

>(ii) You can take into account whether packet loss affects silence or
>speech; however, from a service quality perspective this is not
>necessarily good.  If
>there is an apparent dip in service quality at some point as losses (for
>some strange reason) are tending to impact talkspurts more than silence then
>the service provider may view this as the metric describing service quality
>dropping when the service they are delivering is essentially unchanged.
>
>In some senses, a reported MOS score is not a prediction of what this
>specific user would say about this specific call but a measure of the
>"service quality" of this call expressed in terms of MOS.  The reason for
>doing this is that service providers understand what MOS is; it provides
>a language to describe service quality in terms that relate to the user
>experience.  

No, MOS ("real MOS") is what specific user(s) would rate the call as.
"Reported MOS" is something different - it's NOT "real" MOS.  It may be an
approximation of real MOS, but if so then it could be calculated in all
sorts of different ways.  It sounds like you're looking at it as something
related to but not the same as real MOS; something that is more useful to a
service provider than "real" MOS is, and therefore you (or whoever) would
prefer not to make the "reported MOS" any closer to "real MOS" as judged by
a human.  Which is fine, but it means we should move away from calling it
MOS.  If I want to know how the listener would rate that exact call, then I
want something as close to "real MOS" as possible.

-- 
Randell Jesup, Worldgate (developers of the Ojo videophone), ex-Amiga OS team
rjesup at wgate.com
"The fetters imposed on liberty at home have ever been forged out of the weapons
provided for defence against real, pretended, or imaginary dangers from abroad."
		- James Madison, 4th US president (1751-1836)

