[v6ops] Stateful NAT64: Strange translation of ICMPv4 error source addresses

Tore Anderson <tore@fud.no> Fri, 13 March 2015 09:36 UTC

Date: Fri, 13 Mar 2015 10:36:15 +0100
From: Tore Anderson <tore@fud.no>
To: draft-ietf-behave-v6v4-xlate-stateful@tools.ietf.org
Archived-At: <http://mailarchive.ietf.org/arch/msg/v6ops/uUxGgQ7dpxoG-A5lR3Egwu7_7eo>
Cc: v6ops@ietf.org, jool@nic.mx

tl;dr:

Stateful NAT64 (RFC6146) requires that whenever an ICMPv4 error is
received from a router in the IPv4 network, the (IPv4-converted) IPv6
source address of the translated ICMPv6 error must be set to the
(IPv4-converted) address of the destination address in the embedded
ICMPv4 payload, i.e., the original destination of the packet causing
the ICMPv4 error. I find this rather counter-intuitive - I would expect
the (IPv4-converted) address of the IPv4 router originating the ICMPv4
error to be used instead. Is my reading of RFC6146 faulty, or if not,
is this truly the intended behaviour (why?), or is an erratum warranted?

How do actual Stateful NAT64 implementations behave? [Cc: v6ops]



Longer version:

While testing Jool, an implementation of Stateful NAT64, I noticed that
traceroutes started from the IPv6 network towards IPv4 destinations
were looking decidedly odd. Some traceroute programs (e.g., mtr) would
report every traceroute target (including definitively unreachable
destinations such as a martian) as successfully reached on the hop
immediately after the Stateful NAT64 router, while others (e.g.,
traceroute6) would report each hop beyond the Stateful NAT64 as having
the same address as the traceroute target.

Examples of these odd traceroute outputs can be found in the bug report
I submitted about this error: https://github.com/NICMx/NAT64/issues/132

It turns out that the first class of traceroute programs considers any
ICMPv6 packet received from the targeted address, including "time
exceeded" errors, as an overall success. The second class of traceroute
programs will, on the other hand, continue to transmit packets with an
increasing Hop Limit until something other than an ICMPv6 error (ICMPv6
echo reply, TCP SYN/ACK, etc., depending on the type of traceroute) has
been received from the target.
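
The difference between the two classes can be sketched roughly as
follows. This is only an illustration of the behaviour described above,
not code taken from mtr or traceroute6; the Reply type and both function
names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reply:
    """Hypothetical summary of one probe reply (not a real traceroute struct)."""
    src: str             # source address of the reply packet
    is_icmp_error: bool  # True for e.g. an ICMPv6 "time exceeded" error


def reached_first_class(reply: Reply, target: str) -> bool:
    # First class: any reply whose source is the target counts as success.
    return reply.src == target


def reached_second_class(reply: Reply, target: str) -> bool:
    # Second class: only a non-error reply from the target ends the trace.
    return reply.src == target and not reply.is_icmp_error


target = "64:ff9b::c000:201"  # i.e. 64:ff9b::192.0.2.1
# A "time exceeded" error whose source the NAT64 has set to the target:
hop = Reply(src=target, is_icmp_error=True)

print(reached_first_class(hop, target))   # first class stops at this hop
print(reached_second_class(hop, target))  # second class keeps probing
```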

After closer reading of RFC6146, both the Jool developers and I agree
that it appears that Jool is not buggy, as it faithfully implements the
algorithm in RFC6146. See below for a walk-through of our understanding
of the algorithm. However, the resulting outcome makes absolutely no
sense from an operational point of view; I'd expect the IPv4 hops to be
represented using their IPv4-converted IPv6 address, in other words by
having the RFC6052 translation prefix prepended. This would have the
added benefit of causing the correct hostnames for each hop to be
reported if DNS64 is in use.
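
For illustration, prepending the translation prefix can be sketched as
below, assuming the well-known prefix 64:ff9b::/96 used in the examples
in this message. The function name ipv4_converted is made up for this
sketch, and only the /96 case (IPv4 address in the last 32 bits) is
handled; RFC6052 also defines shorter prefix lengths.

```python
import ipaddress

# Well-known prefix, as used in the RFC6146 examples.
WKP = ipaddress.IPv6Network("64:ff9b::/96")


def ipv4_converted(v4: str, prefix: ipaddress.IPv6Network = WKP) -> ipaddress.IPv6Address:
    """Embed an IPv4 address in the low 32 bits of a /96 translation prefix."""
    if prefix.prefixlen != 96:
        raise ValueError("this sketch only handles /96 prefixes")
    return ipaddress.IPv6Address(
        int(prefix.network_address) | int(ipaddress.IPv4Address(v4))
    )


# The router from step 2 of the walkthrough:
print(ipv4_converted("198.51.100.1"))  # 64:ff9b::c633:6401
```

Python prints the embedded address in hexadecimal; 64:ff9b::c633:6401
is the same address as 64:ff9b::198.51.100.1.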

Towards the end of the initial message of the Github issue there is an
example traceroute output that looks more like what I'd expect, but on
the other hand that's not using a true Stateful NAT64 (RFC6146)
implementation, but rather the bastard child of TAYGA (a SIIT-ish
IPv4/IPv6 translator) plus NAPT44 (Linux IPTables), which might explain
why its outcome isn't RFC6146 compliant.



Walkthrough of Stateful NAT64 processing, re-using the example
addresses used in section 1.2.2 of RFC6146:

1) A TCP packet is sent from 2001:db8::1,1500 towards
64:ff9b::192.0.2.1,80 through a Stateful NAT64 (whose IPv4 address is
203.0.113.1).

The NAT64 picks an available source port (2000) and saves the following
mapping:

(2001:db8::1,1500) <--> (203.0.113.1,2000)

The resulting translated IPv4 packet gets the following addresses:

Src: 203.0.113.1,2000
Dst: 192.0.2.1,80


2) This packet ends up being discarded by the IPv4 router 198.51.100.1
for some reason (say, TTL exceeded). It originates an ICMPv4 error
packet that looks like this:

Src: 198.51.100.1
Dst: 203.0.113.1
Src embedded in ICMPv4 payload: 203.0.113.1,2000
Dst embedded in ICMPv4 payload: 192.0.2.1,80


3) Upon receipt of the ICMPv4 error packet, the NAT64 determines an
"Incoming Tuple" according to section 3.4:

> If the incoming IP packet contains a complete (un-fragmented) ICMP
> error message containing a UDP or a TCP packet, then the incoming
> 5-tuple is computed by extracting the appropriate fields from the IP
> packet embedded inside the ICMP error message.  However, the role of
> source and destination is swapped when doing this: the embedded
> source IP address becomes the destination IP address in the incoming
> 5-tuple, the embedded source port becomes the destination port in the
> incoming 5-tuple, etc.

Thus, now we have a complete Incoming Tuple, containing the following:

SrcIP: 192.0.2.1
SrcPort: 80
DstIP: 203.0.113.1
DstPort: 2000
Protocol: 6 (TCP)

Note that at this point the IPv4 source address of the IPv4 router
originating the ICMPv4 error (198.51.100.1) is already lost, but I'll
finish the walkthrough for completeness anyway...
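
As a minimal sketch of this step (the FiveTuple type and the function
name are hypothetical, and packet parsing is left out entirely), the
swap described in section 3.4 amounts to:

```python
from typing import NamedTuple


class FiveTuple(NamedTuple):
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    proto: int


def incoming_tuple_from_icmp_error(embedded: FiveTuple) -> FiveTuple:
    """Build the incoming 5-tuple from the packet embedded in an ICMP error.

    Source and destination swap roles; the outer ICMPv4 header (and with
    it the address of the router that sent the error) is never consulted.
    """
    return FiveTuple(
        src_ip=embedded.dst_ip,
        src_port=embedded.dst_port,
        dst_ip=embedded.src_ip,
        dst_port=embedded.src_port,
        proto=embedded.proto,
    )


# The packet embedded in the ICMPv4 error from step 2:
embedded = FiveTuple("203.0.113.1", 2000, "192.0.2.1", 80, 6)
print(incoming_tuple_from_icmp_error(embedded))
```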


4) The NAT64 proceeds to compute the "Outgoing Tuple" for this packet,
according to section 3.6.1:

> The transport protocol in the outgoing 5-tuple is always the same as
> that in the incoming 5-tuple.  When translating from IPv4 ICMP to
> IPv6 ICMP, the protocol number in the last next header field in the
> protocol chain is set to 58 (IPv6-ICMP).

These two sentences do seem to me to contradict each other, but my
assumption is that the latter sentence is to be understood as an
exception to the first (otherwise the Protocol of the Outgoing Tuple
ends up being TCP, which would make zero sense). Thus the Outgoing
Tuple at this point contains:

Protocol: 58 (IPv6-ICMP)

> When translating in the IPv4 --> IPv6 direction, let the source and
> destination transport addresses in the incoming 5-tuple be (S,s) and
> (D,d), respectively.  The outgoing source transport address is
> computed as follows:

In other words, at this point we (temporarily) have:

(S,s) = (192.0.2.1,80)
(D,d) = (203.0.113.1,2000)

> The outgoing source transport address is generated from S using
> the address transformation algorithm described in Section 3.5.4.

Section 3.5.4 essentially says «see RFC6052», so:

Replacement S = 64:ff9b::192.0.2.1

> The BIB table is searched for an entry (X',x) <--> (D,d), and if
> one is found, the outgoing destination transport address is set to
> (X',x).

This lookup will find the entry that was saved in step #1. Thus:

Replacement (D,d) = (2001:db8::1,1500)

So now we have a complete Outgoing Tuple:

SrcIP: 64:ff9b::192.0.2.1
SrcPort: 80
DstIP: 2001:db8::1
DstPort: 1500
Protocol: 58 (IPv6-ICMP)
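
Step 4 as I read it can be sketched as follows. This is an illustration
under stated assumptions, not an implementation of RFC6146: the BIB is
modelled as a plain dict, the prefix is fixed to 64:ff9b::/96, the
protocol is set to 58 per the interpretation above, and all names are
made up for the example.

```python
import ipaddress
from typing import NamedTuple, Optional


class FiveTuple(NamedTuple):
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    proto: int


PREFIX = int(ipaddress.IPv6Address("64:ff9b::"))  # assumed /96 prefix

# BIB entry saved in step 1, keyed by the IPv4 side: (D,d) -> (X',x)
BIB = {("203.0.113.1", 2000): ("2001:db8::1", 1500)}


def outgoing_tuple(incoming: FiveTuple, bib: dict) -> Optional[FiveTuple]:
    # Outgoing source: S run through the RFC6052 address transformation.
    src = ipaddress.IPv6Address(PREFIX | int(ipaddress.IPv4Address(incoming.src_ip)))
    # Outgoing destination: BIB lookup on (D,d).
    entry = bib.get((incoming.dst_ip, incoming.dst_port))
    if entry is None:
        return None  # no mapping; this sketch simply gives up
    # Protocol 58 (IPv6-ICMP), following the reading given in step 4.
    return FiveTuple(str(src), incoming.src_port, entry[0], entry[1], 58)


incoming = FiveTuple("192.0.2.1", 80, "203.0.113.1", 2000, 6)
print(outgoing_tuple(incoming, BIB))
```

The source address prints as 64:ff9b::c000:201, which is
64:ff9b::192.0.2.1 in hexadecimal notation.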


5) The Stateful NAT64 builds an IPv6 packet according to section 3.7:

> o  When translating an IP header (Sections 4.1 and 5.1 of [RFC6145]),
>    the source and destination IP address fields are set to the source
>    and destination IP addresses from the outgoing tuple as determined
>    in Section 3.6.

and

> o  When the protocol following the IP header is ICMP and it is an
>    ICMP error message, the source and destination transport addresses
>    in the embedded packet are set to the destination and source
>    transport addresses from the outgoing 5-tuple (note the swap of
>    source and destination).

Thus we end up with the final translated ICMPv6 packet:

Src: 64:ff9b::192.0.2.1 <-- I'd expect 64:ff9b::198.51.100.1 here!
Dst: 2001:db8::1 <-- makes sense
Src embedded in ICMPv6 payload: 2001:db8::1,1500 <-- makes sense
Dst embedded in ICMPv6 payload: 64:ff9b::192.0.2.1,80 <-- makes sense
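
To make the difference concrete, here is a small sketch comparing the
source address produced by my reading of RFC6146 with the one I'd
expect. The function names and the use_router switch are hypothetical
and exist only for this comparison; the prefix is again assumed to be
64:ff9b::/96.

```python
import ipaddress

PREFIX = ipaddress.IPv6Network("64:ff9b::/96")  # assumed translation prefix


def to_v6(v4: str) -> str:
    """Embed an IPv4 address in the low 32 bits of the /96 prefix."""
    return str(ipaddress.IPv6Address(
        int(PREFIX.network_address) | int(ipaddress.IPv4Address(v4))
    ))


def icmpv6_error_source(router_v4: str, embedded_dst_v4: str, use_router: bool) -> str:
    # use_router=False: source taken from the outgoing tuple, i.e. the
    #                   original destination (my reading of RFC6146).
    # use_router=True:  source taken from the outer ICMPv4 header, i.e.
    #                   the router that sent the error (what I'd expect).
    return to_v6(router_v4 if use_router else embedded_dst_v4)


as_read = icmpv6_error_source("198.51.100.1", "192.0.2.1", use_router=False)
expected = icmpv6_error_source("198.51.100.1", "192.0.2.1", use_router=True)
print(as_read)   # 64:ff9b::c000:201  (= 64:ff9b::192.0.2.1)
print(expected)  # 64:ff9b::c633:6401 (= 64:ff9b::198.51.100.1)
```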


Tore