Re: [Int-area] SEAL - draft-templin-intearea-seal-05.txt
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Hi, Margaret,
I'll help with a few answers, since I've been tracking SEAL fairly
closely. I hope this helps the list and Fred in his update...
Joe
Margaret Wasserman wrote:
...
>> Furthermore, when a source node that requires ICMP error message
>> feedback when a packet is dropped due to an MTU restriction does not
>> receive the messages, a path MTU-related black hole occurs. This
>> means that the source will continue to send packets that are too
>> large and never receive an indication from the network that they are
>> being discarded.
>
> My (incomplete) understanding is that this problem has been addressed
> by new PMTUD work.
That work requires upper layer protocol participation in MTU discovery;
for protocols that do not already implement this, or for largely
unidirectional protocols (e.g., streaming), this can be difficult. SEAL
supports PMTUD in its tunnel mechanism over the path of the tunnel,
which can be used either on a sub-path or end-to-end to augment the
ability of an application to traverse paths that do not support the
expected MTU.
...
>> If IPv4 fragmentation were used,
>> this would quickly wrap the 16-bit Identification field and could
>> lead to undetected data corruption.
>
> Ummm... Why and how? I think I understand what this is driving at,
> but it is not clearly explained. If the ingress tunnel endpoint is
> using a single source address/destination address/protocol three-tuple
> for a very large amount of traffic generated from a site, and if that
> tunnel endpoint is performing fragmentation on _external_ packets to
> make them fit in the tunnel, I can see how you might have accidental
> ID overlaps.
>
> However, it is more typical for tunnels to fragment the _inner_ IP
> packet, causing reassembly to happen at the end nodes, and thus
> avoiding the need to collapse many ID numbering spaces into one. For
> precisely this reason.
The primary reason for fragmenting the inner IP is to avoid the need to
reassemble a single stream at the tunnel egress, i.e., to distribute the
load of reassembly, or, more directly, to simplify (and reduce the
expense of) tunnel egress implementations. I like to call this "writing
a check you don't have to cash", and I think we should avoid that sort
of thing wherever possible.
In particular, this can subject less capable endpoints (e.g., PDAs,
cellphones) to reassembly overheads that they are challenged to support,
both computationally and in memory space.
However, this also perpetuates intermediate fragmentation, which you
note below has problems.
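To put a rough number on "quickly wrap", here is a minimal sketch of the
arithmetic. It assumes, as in the scenario Margaret describes, that the ITE
sends everything under a single source/destination/protocol three-tuple at
full line rate and consumes one Identification value per packet. The
function name and the link rates are illustrative, not from the draft.

```python
def id_wrap_seconds(link_bps: float, packet_bytes: int, id_bits: int = 16) -> float:
    """Seconds until the Identification counter repeats at full line rate.

    Assumes one ID value is consumed per packet and that every packet
    shares the same (source, destination, protocol) three-tuple.
    """
    packets_per_second = link_bps / (packet_bytes * 8)
    return (2 ** id_bits) / packets_per_second


if __name__ == "__main__":
    # Illustrative link rates only; 1500-byte packets throughout.
    for rate in (100e6, 1e9, 10e9):
        seconds = id_wrap_seconds(rate, 1500)
        print(f"{rate / 1e6:>7.0f} Mb/s: ID space wraps in {seconds:.3f} s")
```

At 1 Gb/s with 1500-byte packets the 16-bit space repeats in under a second,
which is far shorter than typical reassembly timers, so a stale fragment can
be matched against a newer packet that reuses the same ID.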
...
>> The situation is exacerbated further still by IPsec tunnels, since
>> only the first IPv4 fragment of a fragmented packet contains the
>> transport protocol selectors (e.g., the source and destination ports)
>> required for identifying the correct security association rendering
>> fragmentation useless under certain circumstances. Even worse, there
>> may be no way for a site border router that configures an IPsec
>> tunnel to transcribe the encrypted packet fragment contained in an
>> ICMP error message into a suitable ICMP error message to return to
>> the original source.
>
> This is one of the classic reasons to avoid intermediate fragmentation.
...
>> This document introduces a Subnetwork Encapsulation and Adaptation
>> Layer (SEAL) for tunnel-mode operation of IP over subnetworks that
>> connect Ingress and Egress Tunnel Endpoints (ITEs/ETEs) of border
>> nodes. It provides a standalone specification designed to be
>> tailored to specific associated tunneling protocols such as VET
>> [I-D.templin-intarea-vet], the Locator-Identifier Split Protocol
>> (LISP) [I-D.ietf-lisp] and others.
>
> I do not believe that the LISP WG has explicitly been asked to review
> this document. However, this solution has been proposed on the LISP
> WG mailing list... It did not receive wide-spread support, was
> rejected by at least some LISP participants, and was ruled
> out-of-scope for LISP discussion. So, it is a bit odd to see an AD-
> sponsored submission that claims that this is tailored to LISP...
LISP encapsulates from ingress to egress, but the WG is focused on how
to determine the encapsulation information. LISP writes off the MTU
issue (draft-ietf-lisp-04.txt, section 5 - basically drops all packets
that can't fit), and proposes two mechanisms in sec 5.4 that have a very
thin explanation. LISP docs do not discuss ID collapse issues, partly
because they just drop packets that have DF=1 and won't fit.
I don't think these are tenable positions. IMO, SEAL is as much in scope
as these sections are to the primary LISP document. More to the point,
LISP requires a solution that can handle DF=1 packets that won't fit the
MTU, or it'll be relegated to an experimental curiosity.
...
>> The keywords MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD,
>> SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL, when they appear in this
>> document, are to be interpreted as described in [RFC2119].
>
> This document makes frequent use of lower-case requirements keywords
> (such as "must") in cases where they appear to be (or could at least
> be interpreted as) requirements. They should be uppercased, switched
> to other words, or clarified.
FWIW, I have inserted language in I-Ds that clarifies as follows:

    When used in lower case (e.g., must, must not, etc.), these
    words MUST NOT be interpreted as described in [RFC2119], but
    are rather interpreted as they would be in common English.

>> SEAL introduces a minimal new sublayer for IPvX in IPvY encapsulation
>> (e.g., as IPv4/SEAL/IPv6), and appears as a subnetwork encapsulation
>> as seen by the inner IP layer. SEAL can also be used as a sublayer
>> for encapsulating inner IPvX packets within outer IPvY/UDP headers
>> (e.g., as IPv4/UDP/SEAL/IPv6) such as for the Teredo domain of
>> applicability [RFC4380]. When it appears immediately after the outer
>> IPv4 header, the SEAL header is processed exactly as for IPv6
>> extension headers.
>
> What does it mean that the SEAL header is processed "exactly as for IPv6
> extension headers"? IPv4 does have an IP header extension mechanism
> (IP options), why wasn't that used?
IPv6 extension headers are handled as successive encapsulations, where
the outer header's Next Header field indicates the next extension header,
not the protocol of the payload.
I.e., IPv6 indicates the next extension header type in the outermost
(i.e., first) Next Header field and indicates the payload protocol in the
innermost (i.e., last) Next Header field in the chain. IPv4 options leave
the payload protocol in the outermost Protocol field. SEAL is
implemented as an IP encapsulation sub-layer, creating an arbitrarily
extensible chain in the same way that IPv6 does, and in the same way
that IPsec does for IPv4.
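A minimal sketch of that chaining, as a toy model rather than a packet
parser: each chained header contributes one "next" value, and the walk ends
at the first protocol that is not a chained header. The TCP, IPv6, Fragment,
and AH numbers are the real IANA values; the SEAL value is an assumption for
illustration only (253 is one of the numbers reserved for experimentation),
as are the function and variable names.

```python
from typing import List

# Real IANA protocol numbers.
TCP, IPV6, FRAGMENT, AH = 6, 41, 44, 51
# Hypothetical: SEAL's number is assumed here; 253 is reserved for experiments.
SEAL = 253

# Headers that carry their own next-header field, so the chain continues.
CHAINED = {FRAGMENT, AH, SEAL}


def walk_chain(first_protocol: int, next_fields: List[int]) -> List[int]:
    """Return the protocol numbers seen, outermost first.

    first_protocol is the outer IP header's Protocol / Next Header value;
    next_fields holds the next-header value of each chained header in order.
    """
    seen = [first_protocol]
    fields = iter(next_fields)
    while seen[-1] in CHAINED:
        seen.append(next(fields))
    return seen


if __name__ == "__main__":
    # IPv4/SEAL/IPv6: the outer header names SEAL, SEAL names the payload.
    print(walk_chain(SEAL, [IPV6]))
    # IPv4 with IP options and a TCP payload: the outer header names TCP.
    print(walk_chain(TCP, []))
```

In the IPv4-options case the walk stops immediately, because options live
inside the IPv4 header and never appear in the Protocol field.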
...
>> Note that this SEAL segmentation ignores the fact that the mid-layer
>> packet may be unfragmentable outside of the subnetwork. This
>> segmentation process is a mid-layer (not an IP layer) operation
>> employed by the ITE to adapt the mid-layer packet to the subnetwork
>> path characteristics, and the ETE will restore the packet to its
>> original form during reassembly. Therefore, the fact that the packet
>> may have been segmented within the subnetwork is not observable
>> outside of the subnetwork.
>
> I think there may be some significant problems with this approach.
> While it is true that the exact same bits will be received on the other
> end in normal circumstances, this has most of the same problems that
> are present with any in-the-network fragmentation. In particular,
> what I think of as the "unit of loss" is not maintained end-to-end. A
> single packet from the source is broken into several pieces that may
> be lost separately. In order to make up for one lost piece, the full
> original packet needs to be retransmitted. This has very bad
> properties for congestion, especially when large packets are used.
This is the way IP fragmentation interacts with TCP congestion control
currently for both IPv4 and IPv6. Retransmission of lost fragments at
lower layers could result in variable delays and feedback loop
interactions with TCP which are undesirable.
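The "unit of loss" effect can be sketched numerically. This assumes each
piece is lost independently with the same probability, which real paths
(bursty loss, congestion) will not strictly obey; the function name and the
sample values are illustrative.

```python
def packet_loss_probability(piece_loss: float, pieces: int) -> float:
    """Probability that a packet split into `pieces` fails to reassemble.

    Assumes independent loss per piece: the packet survives only if every
    piece arrives, so P(loss) = 1 - (1 - p) ** n.
    """
    if not 0.0 <= piece_loss <= 1.0:
        raise ValueError("piece_loss must be a probability")
    if pieces < 1:
        raise ValueError("pieces must be at least 1")
    return 1.0 - (1.0 - piece_loss) ** pieces


if __name__ == "__main__":
    # Illustrative: 1% per-piece loss, packet split into 1..8 pieces.
    for n in (1, 2, 4, 6, 8):
        print(f"{n} piece(s): {packet_loss_probability(0.01, n):.4f}")
```

Under this model a 1% per-piece loss rate becomes roughly 5.9% for a packet
split six ways, and every such loss costs a retransmission of the whole
original packet.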
...
>> Note that when IPv6 is used as the outer IP encapsulation layer, the
>> ITE must insert an IPv6 fragment header with an Identification value
>> set as described in Section 4.3.6.
>
> Middle boxes are not supposed to fragment IPv6 packets en route. There
> should be no reason to do so, as all IPv6 nodes support PMTUD.
An ITE adds the outer header. As such, it is an *origin* of those
packets, and so if fragmentation is required, it MUST happen at that
location according to IPv6 requirements.
It's required here because the ITE is transmitting packets whose size it
cannot control. It can find out the MTU of the path, but it can't
necessarily communicate it back to the origin of the packets it receives
(they may not be IPv6, e.g.). So it has no choice but to either drop all
packets larger than the known MTU, or to issue IPv6 fragments.
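As a rough sketch of the "issue IPv6 fragments" option, the following splits
an inner packet into fragment payload sizes at the ITE. It assumes a 40-byte
outer IPv6 header and an 8-byte Fragment header, keeps every non-final
fragment a multiple of 8 bytes as IPv6 requires, and deliberately ignores the
SEAL header and any other extension headers. The function name is
hypothetical and this is not the procedure from the draft.

```python
from typing import List

IPV6_HEADER = 40      # fixed IPv6 header length in bytes
FRAGMENT_HEADER = 8   # IPv6 Fragment extension header length in bytes


def ite_fragment_sizes(inner_len: int, path_mtu: int) -> List[int]:
    """Payload bytes carried by each outer IPv6 fragment, in order.

    Simplification: SEAL and other per-packet overhead is not counted.
    """
    room = path_mtu - IPV6_HEADER - FRAGMENT_HEADER
    room -= room % 8  # non-final fragments must carry a multiple of 8 bytes
    if room <= 0:
        raise ValueError("path MTU too small to carry any payload")
    sizes = []
    remaining = inner_len
    while remaining > 0:
        chunk = min(room, remaining)
        sizes.append(chunk)
        remaining -= chunk
    return sizes


if __name__ == "__main__":
    # A 1500-byte inner packet over a path limited to the IPv6 minimum MTU.
    print(ite_fragment_sizes(1500, 1280))
```

The alternative in the paragraph above, dropping everything larger than the
known MTU, corresponds to refusing any packet for which this function would
return more than one size.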
Joe
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (MingW32)
iEYEARECAAYFAkq6MgwACgkQE5f5cImnZrtSoQCfQzV3+fDEv1BNF7+tuF4PEQ5v
IdkAnjN6WnhHA2/gfY2/idONXF1x8SGf
=Krr6
-----END PGP SIGNATURE-----