Dear Mr. Perkins,
I'd like to follow up on your draft "Guidelines for the use of
Variable Bit Rate Audio with Secure RTP"
(http://www.ietf.org/id/draft-perkins-avt-srtp-vbr-audio-01.txt),
which proposes that "VBR audio codecs that alter the size or
spacing of their output according to the characteristics of the
input speech signal SHOULD NOT be used with encrypted SRTP
sessions." As already expressed in my previous email, I do not agree
with such a proposal. Please let me explain why:
1. The decision of which codec to use does not belong in SRTP or with
the IETF, just as SRTP should not prescribe that only hardware
encryption be used, or that all calls be routed through secure
proxies. While these kinds of requirements would boost security, they
also come at a cost, and it is not the role of SRTP or the IETF to
decide the best trade-off between security and cost.
2. The cost of constant bitrate (CBR) coding is not insignificant:
the average bitrate can easily double compared to variable bitrate
(VBR). Many situations exist where Internet access is expensive or
limited.
3. Many users place limited value on security, as witnessed by the
wide use of PSTN and GSM. Such users may be happy with "cheap"
security such as encryption, but would rather avoid the
inconvenience, higher price or lower quality that come with more
drastic security measures.
4. Combining (2) and (3) suggests that encrypted VBR could be the
"sweet spot" for some or even most users. As it happens, most VoIP
calls today are made using software-encrypted VBR.
5. The [spot-me] reference in the draft shows that sentences spoken
in a controlled environment can be recognized with some degree of
accuracy. Given that there were 122 sentences in their "dictionary",
and assuming at least 7 words per sentence, that suggests an
extracted information rate of less than 1 bit per word, on average.
Without major improvements, such a system could thus not be used to
recognize natural speech, which has a higher entropy.
6. You may disagree with (5) based on the view that it is very
dangerous to downplay the threat from any information leak. But then
your proposed method of adding overhang periods to active speech
intervals is equally dangerous, as all it does is reduce the size of
the leak. My personal view is that it is dangerous to jump to
conclusions without further analysis or common sense.
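The estimate in point 5 can be checked with a few lines of Python. This
is only a rough sketch: it assumes the 122 sentences are equally likely
and recognized perfectly, so that log2(122) is an upper bound on the
bits recovered per sentence. The function name bits_per_word is
illustrative and does not come from the draft or the [spot-me] paper.

```python
import math

def bits_per_word(num_sentences: int, words_per_sentence: int) -> float:
    """Upper bound on information extracted per word, assuming every
    sentence in the dictionary is equally likely and always recognized."""
    # Picking one sentence out of num_sentences yields at most
    # log2(num_sentences) bits.
    bits_per_sentence = math.log2(num_sentences)
    # Spread those bits over the words in the sentence.
    return bits_per_sentence / words_per_sentence

# 122 sentences, at least 7 words each (figures from point 5)
rate = bits_per_word(122, 7)
print(f"{rate:.2f} bits per word")  # prints "0.99 bits per word"
```

Longer sentences or imperfect recognition only lower this figure, so
the "less than 1 bit per word" bound holds under these assumptions.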
I applaud any effort to raise awareness about security limitations
and risks. But the goal should be to inform developers and users who
want to balance security risk and cost as they see fit.
best,
koen.
Quoting Koen Vos <koen.vos at skype.net>:
Hi,
I saw this draft suggesting that variable bitrate (VBR) speech
codecs not be used in combination with SRTP:
http://www.ietf.org/id/draft-perkins-avt-srtp-vbr-audio-01.txt
I didn't find much discussion about this in the archive, and wonder
how serious the proposal is.
In particular, I would like to know how someone should use a VBR
codec in combination with encryption in the future. Or will the
IETF decide that what is prevalent today, because of its pragmatic
trade-off between security and efficiency, is simply no longer
acceptable tomorrow?
thanks,
koen.
_______________________________________________
Audio/Video Transport Working Group
avt at ietf.org
https://www.ietf.org/mailman/listinfo/avt