Re: [Fecframe] Comments about draft-galanos-fecframe-rtp-reedsolomon-00
Hello Einat,
I had a discussion with Orly this morning and it helped me clarify
some aspects.
See my comments below.
* Section 4 is confusing because it mixes the notions of packets,
symbols,
and "elements of the Galois Field". The following is the official
terminology:
[...]
[Einat] You are correct here. There are some inconsistencies and wrong
terminology in this section.
May I suggest:
In general a Reed Solomon code takes a group of K source symbols and
generates N - K repair symbols. A receiver needs to receive any K of the
N source or repair symbols in order to recover the remaining N-K
symbols. In this document we suggest a symbol to be a whole RTP packet.
As explained in RFC5510, the Reed-Solomon algorithm operates over
multiple elements each taken from a single source symbol. Symbols are
composed of S "m-bit elements", where m is the Galois Field exponent
GF(2^m). In the usual case of GF(2^8), elements are bytes, and the size
S in terms of elements is of course equal to the symbol size in bytes.
In terms of implementation simplicity it is also recommended to use
8-bit elements. For more information on Reed Solomon codes, the reader
is referred to [Rizzo97].
=> VR: okay.
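To make the symbol/element relationship concrete, here is a minimal Python sketch. The helper name elements_per_symbol is hypothetical and not taken from the I-D; it only restates the definition above (a symbol is S elements of m bits each).

```python
def elements_per_symbol(symbol_size_bytes: int, m: int) -> int:
    """Return S, the number of m-bit elements in one symbol.

    Hypothetical helper: it assumes a symbol must hold a whole number
    of m-bit elements, and rejects sizes that do not.
    """
    bits = symbol_size_bytes * 8
    if bits % m != 0:
        raise ValueError("symbol size must be a whole number of m-bit elements")
    return bits // m


# With GF(2^8) an element is one byte, so S equals the symbol size in bytes.
s_gf256 = elements_per_symbol(1200, 8)
# With GF(2^16) an element is two bytes, so S is half the size in bytes.
s_gf65536 = elements_per_symbol(1200, 16)
```

With m = 8 no conversion between bytes and elements is needed, which is one reason 8-bit elements are simpler to implement.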
Similarly, in section 5, it is said:
"A source block for the Reed-Solomon code contains K data blocks."
There is no notion of "K data blocks" in the official terminology. I
think you
want to say:
"A source block [...] contains K source symbols."
(idem for the following sentences).
[Einat] You are right again. Data block should be replaced with Source
symbols. To link this to the FEC framework the appropriate phrasing
should be:
A source block for the Reed-Solomon code contains K source symbols. In
the scheme presented by this document, each source symbol contains a
single Application data Unit (as defined in FEC framework), which is in
our case an RTP packet. Therefore a source block contains exactly K
consecutive RTP packets.
=> VR: okay
* Section 5: is the "one symbol per RTP packet" really appropriate?
If RTP packets have largely different sizes (usual case), the sender may
want to use symbols of size the median value (for instance) in order to
avoid
having too much padding. Of course padding is not sent (i.e. there's no
transmission overhead penalty), but encoding/decoding operations will be
faster with smaller symbols. There is a price to pay for that,
essentially
complexity (it breaks the packet <=> symbol relationship and there are
more
symbols), so I'm not sure there's a real benefit. What do you think?
[Einat] It is true that one could use a median packet size in order to
make encoding/decoding more efficient. However, since we are dealing
with packet losses and not symbol losses, breaking a large packet into
several symbols and unifying small packets to a single symbol will break
the equality between packets. For example, if we use 2 repair symbols
per block and one of the RTP packets in this block contains 3 source
symbols then if we lose this single packet we could not restore it even
if all other source and repair symbols are received.
=> VR: you're right with your example, but it is a little bit biased. If
we think in terms of
"transmitted bytes" rather than "number of packets", the comparison is
not fair. If the
symbol size is half the size of the largest source packet, then I can
send two repair packets
for almost the same transmission overhead as what you are proposing in
the I-D (I'm not
considering here the various protocol header overheads of course).
There's an additional incentive for using median symbols sizes that I
forgot to mention:
if symbols are of size the largest source packet size, it means that the
repair symbols are also
of that size, which has an impact on the transmission overhead. With
median size symbols,
this would not be the case, and it makes a big difference if there are a
significant number
of repair symbols.
I note there's a misunderstanding in your answer: I don't suggest to
"unify small packets to
a single symbol". I still pad source packets that are smaller than the
symbol size.
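The trade-off discussed here can be checked with a small Python sketch. The packet sizes are made up for illustration, and block_cost is a hypothetical helper; it assumes each packet is padded up to a whole number of symbols and that small packets are never merged into one symbol.

```python
import math


def block_cost(packet_sizes, symbol_size, repair_symbols):
    """Return (source_symbols, padding_bytes, repair_bytes) for one block.

    Each packet occupies ceil(size / symbol_size) symbols and is padded
    to fill them. Protocol header overheads are ignored.
    """
    symbols_per_packet = [math.ceil(size / symbol_size) for size in packet_sizes]
    source_symbols = sum(symbols_per_packet)
    padding_bytes = source_symbols * symbol_size - sum(packet_sizes)
    repair_bytes = repair_symbols * symbol_size
    return source_symbols, padding_bytes, repair_bytes


sizes = [200, 400, 600, 1200]  # made-up RTP packet sizes, in bytes
largest = max(sizes)

# One symbol per packet, symbol size = largest packet, 1 repair symbol.
one_per_packet = block_cost(sizes, largest, repair_symbols=1)
# -> (4, 2400, 1200)

# Symbol size = half the largest packet, 2 repair symbols.
half_size = block_cost(sizes, largest // 2, repair_symbols=2)
# -> (5, 600, 1200)
```

In this example both configurations send 1200 bytes of repair data, but the half-size configuration sends two repair symbols and needs less padding during encoding. The 1200-byte packet now spans two symbols, so losing it alone uses up both repair symbols, which is the concern raised above.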
* Section 5: It is said:
"B. The sequence number of the source packets must be consecutive
in a source block."
IMHO the "MUST be consecutive" is too strong. It may happen that some
source RTP packets be lost in some networks prior to reaching the FECFRAME
sender (i.e. if FECFRAME is used in an intermediate box rather than
end-to-end).
And what about (crazy) situations where RTP packets are lost (e.g. by a
FIFO overflow) before reaching FECFRAME?
IMHO it's worth designing a solution that is robust against this kind
of situation, no matter the cause.
[Einat] The 'MUST be consecutive' restriction is there in order to
simplify the signaling of which RTP packets are protected. If not
consecutive then we could not use the SN base + num packets fields in
the FEC header and we'd have to use SN base + bitmask, or worse -
specifically list the sequence numbers of protected packets. I think
that cases where some source packets are lost before FEC encoder are
unique and can be handled by creating smaller FEC blocks (in between the
gaps).
=> VR: I understand the simplification purpose. With end2end FECFRAME
(i.e. located on the same host as the source application), it's a
reasonable assumption. In other situations, it makes the proposal
unusable, unless you have a TCP (or similar) connection to guarantee
there's no loss. So I'm in favor of some solution. Which solution to
choose is a separate
discussion (see below).
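For reference, the "smaller FEC blocks in between the gaps" approach could look like the following Python sketch. The function names and the max_k parameter are assumptions for illustration; the only facts relied on are that RTP sequence numbers are 16 bits wide and wrap around, and that the FEC header carries SN base plus the number of packets.

```python
def split_consecutive(seq_numbers, max_k):
    """Split RTP sequence numbers (in sending order) into source blocks.

    A new block starts whenever the next SN does not directly follow the
    previous one (modulo 2^16), or when the block already holds max_k
    packets. max_k is a hypothetical per-block limit.
    """
    blocks = []
    current = []
    for sn in seq_numbers:
        is_gap = bool(current) and sn != (current[-1] + 1) % 65536
        is_full = len(current) == max_k
        if is_gap or is_full:
            blocks.append(current)
            current = []
        current.append(sn)
    if current:
        blocks.append(current)
    return blocks


def fec_header_fields(block):
    """Return (SN base, number of packets) for one consecutive block."""
    return block[0], len(block)


# SN 2 was lost before reaching the FEC encoder, so two blocks result.
blocks = split_consecutive([65534, 65535, 0, 1, 3, 4], max_k=10)
```

A gap costs nothing in header size with this approach, but each gap shortens a block, so frequent losses before the encoder lead to many small blocks.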
* Section 6.2.1:
One stupid question: if there are several repair flows and if they are
all sent to the same multicast group/port, how does a receiver tell the
difference between the packets belonging to the different repair flows?
I think it's by means of the PT. This is not said in any case.
(I understand that sending everything to the same destination is not
necessarily a good idea, but that's a different topic)
[Einat] The payload type, which is signaled in SDP is used for that. We
could add a specific sentence describing this.
=> VR: yes, please.
* Section 6.2.2, Fig. 3: the choice of the various field sizes has
implications that are not explicitly mentioned.
Num Packets (badly named, I'd prefer K): 16 bits => too much in case
of RS over GF(2^8), more appropriate to GF(2^16).
[Einat] We did not limit GF to 8. It is only a recommendation.
i and N-K: 8 bits => okay with GF(2^8), a limit in case of GF(2^16),
but that's perhaps not the target.
* Whole document: More generally, I have the feeling it is nowhere
explained whether or not there are restrictions on the GF size. Which
values are possible for the m parameter?
[Einat] I agree that such restrictions and considerations should be
further elaborated.
In general we focused more on the concept of how to apply Reed-Solomon
on RTP packets than on the Reed-Solomon operation.
=> VR: okay, but an I-D that specifies actual header formats must be
clear on such
assumptions.
* Section 6.2.2: I have the feeling that carrying N-K in the FEC header
is
perhaps not required. On the other hand, there MUST be a way to inform the
receiver of the GF(2^m) used (e.g. it can be done out-of-band in SDP
description).
So if I know m, I don't need to know the actual N-K as I know K and
max_K=(2^m)-1.
[Einat] I'm not sure that I follow here. Our intention was to use a
dynamic N-K value.
e.g. for one block it could be 2 repair symbols and for the next block
it could be 4 repair symbols.
This variable number must be signaled for each block.
=> VR: I misunderstood your point and the discussion I had with Orly
today clarified
a lot. So let's summarize:
- for a pure RS encoding/decoding aspect, knowing n-k is not needed, m
is sufficient
(it was the main motivation for my comment);
- from a practical point of view, sending N-K can be helpful in some
situations (but
not all) to help the receiver take decisions on whether or not
decoding will be feasible.
It makes sense. However (1) it requires the receiver actually got at
least one repair
packet otherwise he won't know if it's worth waiting for hypothetical
repair packets,
and (2) it's not absolutely required, we can also use the realtime
deadline of the data
transported.
All in all, carrying N-K can help, but I don't think it's the ultimate
solution, we also
need to consider the realtime deadline of data being transported for
robustness.
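The receiver-side reasoning summarized above can be sketched in Python as follows. This is only an illustration under stated assumptions: next_action and its parameters are hypothetical names, K and N-K are assumed to become known only once a repair packet has arrived, and known_lost is assumed to be whatever count of definitely lost symbols the receiver can derive.

```python
from typing import Optional, Tuple


def next_action(received: int,
                known_lost: int,
                fec_info: Optional[Tuple[int, int]],
                now: float,
                deadline: float) -> str:
    """Decide what to do with a partially received block.

    fec_info is (K, N-K) taken from a repair packet header, or None if
    no repair packet has arrived yet. Returns "decode", "give_up" or
    "wait".
    """
    if fec_info is not None:
        k, repair_count = fec_info
        if received >= k:
            return "decode"      # any K of the N symbols are enough
        if known_lost > repair_count:
            return "give_up"     # fewer than K symbols can ever arrive
    if now >= deadline:
        return "give_up"         # realtime deadline of the data has passed
    return "wait"
```

Without fec_info the function can only fall back on the deadline, which matches point (1) above: N-K helps only once at least one repair packet has been received.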
* Section 6.2.2: SN Base is used to identify the source block (since all
the FEC repair packets associated to the same source block have the same
SN Base
value). Since SN Base is a field that regularly wraps around, there is a
(theoretical) risk that two consecutive source blocks use the same SN
Base
value. This might happen in tricky configurations with large source
blocks
(GF(2^16)) and in case the RTP sequence numbers are not consecutive (see
above
discussion about possible erasures *before* FECFRAME).
Additionally, and more seriously, this vulnerability may also be
exploited as
a possible DoS (Denial of Service) to make a FECFRAME sender crash.
So it's worth mentioning this topic in the doc, both in the core spec
and in the
Security section.
If we clarify that this document is not for situations experiencing high
erasure
rates before FECFRAME (which can be checked on-the-fly), it should be
okay IMHO.
[Einat] Good point. Thanks.
* Section 6.2.2: Back to the idea of removing the constraint on
consecutive
source packets, I'm wondering if the following FEC header format
wouldn't be
okay:
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |            Base SN            |            Max SN             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Source Block Length (k)    |   Enc. Symbol ID (16 bits)    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Thanks to this FEC header:
- We know that all SN values in {Base SN..Max SN} are included in this
source block.
- We know their actual number (k) even if some of RTP SN are missing.
- We know the ESI of the FEC repair packet (in {K..N-1}), no matter the
GF m parameter since it's 16-bit long.
- And we know in which order incoming source RTP packets should be
stored in the block, even if there are gaps in the SN.
At a receiver, the actual SN value in the RTP header of an erased
source packet is anyway recovered along with the rest of the
header/payload
so there's no problem either.
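As an illustration, the four 16-bit fields of this proposed header could be serialized as in the Python sketch below. The function names are hypothetical, and network byte order is an assumption (the usual convention for such headers), not something stated in this thread.

```python
import struct

# Four unsigned 16-bit fields: Base SN, Max SN, Source Block Length (k),
# Encoding Symbol ID. "!" selects network (big-endian) byte order, which
# is an assumption here.
FEC_HEADER = struct.Struct("!HHHH")


def pack_fec_header(base_sn: int, max_sn: int, k: int, esi: int) -> bytes:
    """Serialize the proposed 8-byte FEC header."""
    return FEC_HEADER.pack(base_sn, max_sn, k, esi)


def unpack_fec_header(data: bytes):
    """Parse the first 8 bytes of a repair packet payload.

    Returns the tuple (base_sn, max_sn, k, esi).
    """
    return FEC_HEADER.unpack_from(data)


# A block spanning the SN wrap-around: SNs 65530..4 with one SN missing,
# so k = 10, and a first repair symbol with ESI = k = 10.
header = pack_fec_header(65530, 4, 10, 10)
```

Because every field is 16 bits wide, the same layout works for GF(2^8) and GF(2^16) alike.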
[Einat] This is interesting. Do you mean that all RTP packets in the
range [Base SN...Max SN] will be included in this block?
If not then the receiver will know for sure if he can recover a lost
packet only after making the decoding operation on the relevant block.
This could lead to unnecessary decoding calls.
See my comment above regarding the need for signaling N-K for each FEC
block.
=> VR: my above proposal is not sufficient! It won't help identifying
the position of source
packets inside the source block.
However, I think there's a solution to the case where there's a limited
number of erasures
prior to FECFRAME. Instead of carrying a bit mask that identifies which
source packet
of a SN range are actually considered or not, I suggest to replace
erased source packets
by zero'ed symbols.
These symbols will be reconstructed by the receiving side FECFRAME, and
a quick
check will enable him to get rid of them immediately (there's no RTP
header, it's all zero).
To answer your question: yes, all the packets in [Base SN; Max SN] are
in the block,
except that some of them are "fake" packets (zero'ed) which means they
are easily
recognized.
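The zero'ed symbol idea can be sketched in Python as follows. The function names, the dictionary input and the padding with zero bytes are assumptions made for illustration; the sketch only relies on the fact that a real RTP packet starts with a non-zero byte (the version field is 2), so an all-zero symbol cannot be a genuine packet.

```python
def build_source_block(packets_by_sn, base_sn, max_sn, symbol_size):
    """Build the list of source symbols for SNs base_sn..max_sn.

    packets_by_sn maps an RTP sequence number to the packet bytes.
    Sequence numbers missing from the map (erased before the FEC
    encoder) are replaced by an all-zero symbol. Shorter packets are
    padded with zero bytes up to symbol_size.
    """
    count = ((max_sn - base_sn) % 65536) + 1   # handles 16-bit wrap-around
    block = []
    for offset in range(count):
        sn = (base_sn + offset) % 65536
        packet = packets_by_sn.get(sn, b"")
        if len(packet) > symbol_size:
            raise ValueError("packet larger than symbol size")
        block.append(packet.ljust(symbol_size, b"\x00"))
    return block


def is_fake(symbol: bytes) -> bool:
    """True if the symbol is all zero, i.e. a placeholder to discard."""
    return not any(symbol)


# SN 11 never reached the FEC encoder; 0x80 is the first byte of an RTP
# header with version 2 and no padding, extension or CSRC.
packets = {10: b"\x80abc", 12: b"\x80xyz"}
block = build_source_block(packets, base_sn=10, max_sn=12, symbol_size=8)
```

The position of each symbol in the block is then simply its SN offset from Base SN, so no bit mask is needed in the header.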
* Section 7.1.1: it is not clear to me if max_N is equal to 2^m or if
it is only a "codec limitation". I think it's related to the GF, but
it's
worth mentioning. And the "symbol-size" should also be clarified as already
mentioned.
[Einat] Symbol size should be element size as mentioned above.
I also agree that considerations for determining MAX_N should be listed.
* Section 8.3.1: This section does not explain how to manage the case
where
there are several repair flows. It's a problem.
[Einat] What do you see as a problem here? The different repair flows
are distinguished by their payload type.
It is determined in the SDP which payload type is used by which repair
flow to protect which RTP flows.
=> VR: said like that it's clear. But it just needs to be said (same
comment as above).
* Section 11.
You cannot get rid of the security aspects in this way. Most
specifications
add additional security risks (like the DoS opportunities mentioned
above).
[Einat] Correct again. Would it be OK to say that we neglected some
edges in order to make it on time for the meeting...?
This will be fixed if/when the concept presented here is agreed.
=> VR: okay, I can also help.
Regards,
Vincent