[BEHAVE] Review of draft-boucadair-behave-dns-a64-01

Dear colleagues,

I have read draft-boucadair-behave-dns-a64-01.  I have some comments.
This is a somewhat lengthy review, but I decided to post it anyway
because I think the draft touches on the fundamental questions of the
utility (and disutility, not to say harmfulness) of NAT64/DNS64.  If
you don't care about my views on A64, but want to see two alternatives
that could be integrated into DNS64 if we really need this, then skip
to section 3, near the end.

To begin with, I am sympathetic to the concerns behind the proposal.
When it was first suggested to me that the DNS might be used to solve
v4-v6 interoperation, my reaction was that it was a bad idea precisely
because of the kind of issue the authors raise.  From a purist point
of view, any time you start mucking with data in the DNS, or adding
data that sort of works but that isn't quite right, you introduce a
pile of new failure modes that you ought not to be introducing.  The
authors have identified some of those failure modes peculiar to the
DNS64 arrangement combined with certain use cases.

Nevertheless, my overall view is that the proposal in the draft is not
a good idea.  It is effectively introducing a new kind of query that,
if deployed, will live a long time (maybe forever) on the Internet,
for as long as people are looking up AAAA records.  That seems to me
to be a high cost for what I regard as a low-value and short-term
benefit.

    1.  Is the problem this solves something that should be or can be
        solved?

To begin with, I am not sure I find the important use case terribly
compelling.

DNS64 relies on the fact that an IPv6-using host will query for AAAA
records when it is trying to resolve another host on the Internet.
The DNS64 function captures that AAAA lookup, returns a AAAA record if
there is one, and then proceeds to A records in order to see whether
it can generate a synthetic response.  
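
As a minimal sketch of that AAAA-then-A behaviour (not a real DNS64
implementation): the zone is a plain dictionary, the function names are
made up for illustration, and the prefix is assumed to be the
well-known 64:ff9b::/96 from RFC 6052; a deployment could use its own
Pref64::/n instead.

```python
import ipaddress

# Assumption: the RFC 6052 well-known prefix; real deployments may differ.
PREF64 = ipaddress.IPv6Network("64:ff9b::/96")


def synthesize_aaaa(a_addr, prefix=PREF64):
    """Embed an IPv4 address in the low 32 bits of a /96 prefix."""
    if prefix.prefixlen != 96:
        raise ValueError("this sketch handles only /96 prefixes")
    v4 = ipaddress.IPv4Address(a_addr)
    return str(ipaddress.IPv6Address(int(prefix.network_address) | int(v4)))


def dns64_answer(name, zone):
    """Answer a AAAA query; zone maps (name, rrtype) -> list of addresses."""
    real = zone.get((name, "AAAA"), [])
    if real:
        return real  # a real AAAA exists, so pass it through untouched
    # No AAAA: fall back to the A records and synthesize from them.
    return [synthesize_aaaa(a) for a in zone.get((name, "A"), [])]


# Hypothetical data: one v4-only name, one native v6 name.
zone = {
    ("v4only.example.", "A"): ["192.0.2.33"],
    ("native.example.", "AAAA"): ["2001:db8::1"],
}
synthetic = dns64_answer("v4only.example.", zone)
native = dns64_answer("native.example.", zone)
```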

A similar, if ugly, expedient is used for sites that wish to provide
v6 access to a v4-only service: they just put a synthetic AAAA (that
is to say, "A v6 address they control") into the zone, and do NAT64 at
the site.  This case is slightly more problematic than the previous
one, for while client-end DNS64 functions can be used or not depending
on whether the client is provisioned with both v4 and v6 connectivity,
the target-site-published synthetic AAAA record looks just like any
other AAAA, even though the target host does not actually speak v6.
So v6-enabled hosts will almost always get the v6 answer (since almost
all of them ask for AAAA first).  The effect of this is that the NAT64
will be used even when v4-only transit could better be used.

Now, there are two things to say about this.  First, the draft is
quite concerned about the case where site-published synthetic AAAA
records cause clients to use NAT64 when they could have native v4
transit, with possible concomitant loss of functionality.  My overall
reaction to that is, "Tough."  The plain fact is that the NAT64
functionality is still a NAT, and if your application loses
functionality across NATs, then NAT64 is just a poor choice.  Don't
publish synthetic AAAA records for that application.  (If such advice
is not abundantly clear from the DNS64 document, please send text.  It
should be.)  In my view, the cost of clients using NAT64 when they
don't need to is just part of the cost of deploying with NAT64, and if
you don't like that there's an easy solution open to you: get v6
transit, and deploy your applications using IPv6.  NAT64 is supposed
to help make that less hard when applications, or hosts, or whatever,
simply can't be upgraded.  But it is not a really good long-term
strategy for v4/v6 interoperation, and we have to stop imagining that
it will be.  We similarly have to make those limitations plain to
potential deployers.  Many IT shops to this day remain completely
astonished at the damage their NATs do, and I think this is partly
because during the early adoption of NATs there was not enough
information that made it plain, "This is going to break all manner of
things you might want later to have working.  Don't be surprised."

The second is that, in the presence of A64-unaware caches, the
strategy does not provide the desired guarantees anyway.  An
A64-unaware recursor serving an A64-unaware initiating host will just
get the AAAA record anyway, no matter what.  This results in a
first-mover problem: there's no benefit to deploying A64-aware
recursors if nobody has published A64 records; and there's no reason
to publish and maintain additional host records if nobody is
going to ask for them.  But in both cases, there is a cost to using
the A64 functionality, so there's actually a disincentive to deploy
the proposed technology.  This brings us to the cost of the A64
proposal, which results from how it has to be used.

    2.  How A64 will actually work

To understand my worry, we need to unpack the real meaning of section
4.6 of the draft.  The section obscures what is in fact required to
make this proposal work.  For the long-term effect of adding A64 to
the DNS is that every AAAA lookup will require an A64 lookup too.
Over the long term (i.e. as IPv6 becomes widely deployed), what this
means is a doubling of host queries to the DNS, with an effective
retirement time of "never".

We have three classes of actors in the DNS chain that are relevant to
the A64 proposal.  The first is the initiating host.  It can either be
A64-aware, or not.  If it is A64-aware, then it must query first for
A64; if it gets no result, then for AAAA, then for A; but if it gets a
result, maybe for A or maybe it can do something clever to strip the
prefix and extract the v4 address.  Since it cannot know whether the
target site has published v4-mapped v6-records, it always has to ask.
(Worse, if the goal really is to avoid using NAT64 when v4 is
available, it might query for the A _anyway_ after querying for AAAA,
in order to try to see whether the AAAA is actually an A record "in
disguise".  Until A64-awareness is universal, a site operator that
wants to offer NAT64 has to publish synthetic AAAA anyway, and it
might not have published A64.  To be charitable, let's assume every
mapped-v4-publishing site decides to publish A64, and ignore this
issue.)  The reason the initiator has to query for the A64 first is
that it needs to know whether the AAAA record it would get will
contain a v4-mapped address, and make a decision on that basis.  (You
could reverse the order, of course, but it makes no difference: you
have to do both.)
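
The query sequence for an A64-aware initiator can be sketched roughly
as follows.  This is only an illustration under my reading of the
draft: lookup() is a hypothetical stand-in for a stub resolver call,
and the sketch takes the "just ask for the A" branch rather than
stripping a prefix.

```python
def resolve_a64_aware(name, lookup):
    """Return (rrtype_used, records, queries_sent) for one host lookup.

    lookup(name, rrtype) -> list of records; hypothetical resolver call.
    """
    queries = []

    def ask(rrtype):
        queries.append(rrtype)
        return lookup(name, rrtype)

    if ask("A64"):
        # An A64 answer means the AAAA would be synthetic: go get the A.
        return "A", ask("A"), queries
    aaaa = ask("AAAA")
    if aaaa:
        return "AAAA", aaaa, queries
    return "A", ask("A"), queries


# Hypothetical zone data for illustration only.
zone = {
    ("native.example.", "AAAA"): ["2001:db8::1"],
    ("nat64.example.", "A64"): ["64:ff9b::c000:221"],
    ("nat64.example.", "AAAA"): ["64:ff9b::c000:221"],
    ("nat64.example.", "A"): ["192.0.2.33"],
}


def zone_lookup(name, rrtype):
    return zone.get((name, rrtype), [])


native_result = resolve_a64_aware("native.example.", zone_lookup)
nat64_result = resolve_a64_aware("nat64.example.", zone_lookup)
```

Note that even the purely native v6 name costs two queries (A64, then
AAAA) where an A64-oblivious host would have sent one.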

The second actor is the recursive resolver(s) in the path between the
initiating host and the authority server.  This case is interesting
only if the initiator is A64-oblivious (if the initiator is A64-aware,
then the recursor will presumably just do what the initiator needs.
Let's ignore for the moment the case where the A64-aware initiator
accidentally gets hooked up to an A64-aware DNS64 node, and suppose
that it works).  If the recursor is A64-oblivious, it will simply
query for the AAAA, just as the host did, and the v4-mapped address
will be what gets used.  If the recursor knows A64, however, on
receiving a AAAA query, it first needs to perform an A64 query; if
that returns an answer, it can either extract the v4 address in some
way, or just query for the A record.  If the A64 query does not return
an answer, then the recursor asks for a AAAA record.  The reason it
has to ask for A64 first is that it needs to know whether the AAAA it
might get is a "real" one or not.  As before, you could reverse this,
but you still have to do it.

But it gets worse.  The recursor now does not know whether the
initiating host is dual-stack or not.  Of course, it might implicitly
know (i.e. by configuration) because it is a recursor that only serves
the dual-stack hosts (this supposes that the recursors for an ISP's
dual-stack and v6-only customers are different pools.  I'll leave
aside the complications arising from different ISPs, one for v6 and
another for v4, being involved in this nightmare).  But depending on
whether it is serving dual-stack hosts or NAT64-requiring hosts, it
has to follow different paths.  If it is serving dual-stack hosts,
then if it gets an A64 record, it returns the A record to the
initiator.  If it is serving v6-only hosts, then it converts the A64
record into a AAAA record, and returns that instead.  Unless, of
course, the initiator set the CD bit on the query, in which case the
conversion is not allowed, because the initiator will notice the
tampering and reject the answer.  In the case of the CD bit being set,
the recursor has to ask for the AAAA record and return that answer
instead.
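
To make the branching concrete, here is a rough sketch of the decision
an A64-aware recursor faces on receiving a AAAA query.  The function
name and lookup() are hypothetical, and I am assuming (without checking
the draft's wire format) that the A64 rdata carries the same IPv6
address that the synthetic AAAA would.

```python
def recursor_reply(name, lookup, serves_dual_stack, cd_bit):
    """Return (rrtype, records) for an incoming AAAA query.

    lookup(name, rrtype) -> list of rdata strings (hypothetical upstream
    query).  Assumption: A64 rdata is the address a synthetic AAAA holds.
    """
    a64 = lookup(name, "A64")  # the extra query, made every time
    if not a64:
        return "AAAA", lookup(name, "AAAA")
    if serves_dual_stack:
        # Steer the dual-stack host towards native v4.
        return "A", lookup(name, "A")
    if cd_bit:
        # The initiator validates for itself: no conversion allowed,
        # so fetch and return the published AAAA.
        return "AAAA", lookup(name, "AAAA")
    # v6-only host, no CD bit: relabel the A64 rdata as a AAAA answer.
    return "AAAA", list(a64)


# Hypothetical zone data for illustration only.
zone = {
    ("nat64.example.", "A64"): ["64:ff9b::c000:221"],
    ("nat64.example.", "AAAA"): ["64:ff9b::c000:221"],
    ("nat64.example.", "A"): ["192.0.2.33"],
    ("native.example.", "AAAA"): ["2001:db8::1"],
}


def zone_lookup(name, rrtype):
    return zone.get((name, rrtype), [])
```

Every path starts with the A64 query, and the two flags have to come
from somewhere: the CD bit is in the query, but serves_dual_stack can
only come from configuration.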

Finally, we have the authority server at the target site.  There are
two configuration possibilities and three action possibilities here.
Either the authority server has only A64 records in its zone, or it
has A64 records and equivalent AAAA records.  If it has only A64
records, then it either has to do special processing when it receives
a AAAA query, and convert the AAAA into an A64 query; or it has to
return NOERROR with an empty answer for the AAAA query, and simply not
get the communication benefit of NAT64 that was the whole purpose of
the exercise in the first place.  In either case, signing (if desired)
needs to be done on the fly.  If it is configured with the equivalent
AAAA record, that is functionally equivalent to the special processing
action above, except that it imposes on the server administrator the
burden to keep the A64 record in sync with the AAAA record (and, of
course, increases the size of the zone, but eliminates the need to
sign on the fly).  It introduces a powerful new set of failure
conditions, where the A64 record and the AAAA record don't match.
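
The kind of consistency check an operator would end up needing might
look something like the sketch below.  It assumes, purely for
illustration, that an A64 record and its companion AAAA carry the same
IPv6 address; the zone layout and function name are invented.

```python
import ipaddress


def a64_aaaa_mismatches(zone):
    """List owner names whose A64 and AAAA record sets differ.

    zone maps (owner, rrtype) -> list of IPv6 address strings.
    Assumption: A64 rdata should equal the companion AAAA rdata.
    """
    bad = []
    for (owner, rrtype), rdata in zone.items():
        if rrtype != "A64":
            continue
        # Parse the addresses so different text forms compare equal.
        a64 = {ipaddress.IPv6Address(r) for r in rdata}
        aaaa = {
            ipaddress.IPv6Address(r)
            for r in zone.get((owner, "AAAA"), [])
        }
        if a64 != aaaa:
            bad.append(owner)
    return sorted(bad)


# Hypothetical zone: "www" is in sync, "mail" has drifted.
zone = {
    ("www.example.", "A64"): ["64:ff9b::c000:221"],
    ("www.example.", "AAAA"): ["64:ff9b:0:0:0:0:c000:221"],
    ("mail.example.", "A64"): ["64:ff9b::c000:222"],
    ("mail.example.", "AAAA"): ["64:ff9b::c000:2ff"],
}
drifted = a64_aaaa_mismatches(zone)
```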

I haven't really thought through the security implications of any of
this, but at first blush it seems to me that the A64 record is going
to offer another opportunity to inject DNS poison, since an attacker
is going to know something about the capabilities of a querying host
and will also automatically know what query is coming next.  This
makes DNSSEC even more important in the case of A64, yet A64 presents
another stumbling block for DNSSEC, since an initiator that is DNSSEC
validating but A64-oblivious is either out of luck, or just going to
experience the issues that the A64 draft is supposed to solve.

In any case, it is plain that adding A64 awareness either to hosts, or
to recursors, or both will guarantee an additional query for every
single AAAA query, effectively until the end of time.  (You would need
to know that there are no more v4-only hosts offering NAT64 before you
could shut this off.  How would you learn that?)  Depending on its
deployment at a target site, the A64 RR either requires server-side
processing (converting a AAAA query into an A64 query) or else
requires the operator to forgo most of the benefit that was supposed
to come from NAT64, which is making v4-only networks available to
v6-only networks without having to upgrade everything on (or even
significant parts of) the Internet in a co-ordinated fashion.

    3.  Alternatives

Those of us working on the DNS64 draft asked at various DNS working
group meetings, "Should we have a way of identifying these synthetic
answers?" and received, "No," in answer, as far as the DNS people are
concerned.  However, if it is desirable to be able to identify the
synthetic answers in some way, then there are two obvious ways to do
so without requiring an additional query for every AAAA query.

    a.  Add the "base" A record to the Additional section

If a DNS64 server were to provide synthetic AAAA records and include
the corresponding A record in the Additional section, a DNS64-aware
querying client could take the presence of that A record as a hint
that there might be more to the answer than meets the eye.  It should
be able to extract the equivalent IPv4 address from the AAAA record,
and thereby infer that the AAAA is synthetic.  A dual-stack host could
then just use the A record instead.  A significant disadvantage of this
approach is that the Additional section is the first thing to be
truncated when a response gets long, so it's possible that the
Additional section won't have the "base A" when it's needed.  In
addition, a site publishing statically-configured synthetic AAAA
records can do so with a bog-standard DNS server, and there'd be no
way to tell that the records are in fact synthetic, so such servers
wouldn't include the A record without some kind of additional work.
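
A client-side check along these lines could be sketched as follows.
It assumes a /96 prefix, so that the IPv4 address sits in the low 32
bits of the AAAA, and the function name and arguments are hypothetical
rather than any existing resolver API.

```python
import ipaddress


def choose_target(answer_aaaa, additional_a, dual_stack):
    """Pick the address to connect to from one DNS response.

    answer_aaaa:  AAAA addresses from the Answer section.
    additional_a: A addresses from the Additional section (may be empty
                  if the section was truncated).
    Assumption: /96 prefix, so the v4 address is the low 32 bits.
    """
    hints = {ipaddress.IPv4Address(a) for a in additional_a}
    for aaaa in answer_aaaa:
        low32 = ipaddress.IPv4Address(
            int(ipaddress.IPv6Address(aaaa)) & 0xFFFFFFFF
        )
        if dual_stack and low32 in hints:
            # The AAAA embeds the hinted A record, so it is synthetic;
            # a dual-stack host can go straight to v4.
            return str(low32)
    return answer_aaaa[0]
```

With a truncated (empty) Additional section the function simply falls
back to the AAAA, which is the weakness described above.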

    b.  Use an EDNS0 option to signal DNS64 awareness

If we defined an EDNS0 option (call it the SY bit, for synthetic) to
signal DNS64 awareness, then either an initiator or a recursor could
signal its ability to understand DNS64.  A server receiving such a
signal could include the Pref64::/n in the answer using a new RRTYPE
(say, PREF64).  That prefix could then be stripped off the AAAA by
whatever node got the answer, and the resulting IPv4 address could be
used as appropriate.  One disadvantage of this approach is that it
requires participating nodes to upgrade in order to achieve the
functionality; but in fact, A64 also requires that, and so the effect
is no worse.  Moreover, it does not introduce new queries or anything
of the sort to the mix, so we aren't increasing the number of queries.
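
Stripping the prefix is then mechanical.  Here is a small sketch,
assuming the prefix arrives as text from the hypothetical PREF64 record
and is a /96; other prefix lengths place the IPv4 bits differently and
are not handled here.

```python
import ipaddress


def strip_pref64(aaaa, pref64):
    """Return the embedded IPv4 address, or None if aaaa is not synthetic.

    pref64 is the prefix signalled by the (hypothetical) PREF64 record.
    Assumption: only /96 prefixes are handled in this sketch.
    """
    net = ipaddress.IPv6Network(pref64)
    addr = ipaddress.IPv6Address(aaaa)
    if net.prefixlen != 96 or addr not in net:
        return None
    return str(ipaddress.IPv4Address(int(addr) & 0xFFFFFFFF))


v4 = strip_pref64("64:ff9b::c000:221", "64:ff9b::/96")
not_synthetic = strip_pref64("2001:db8::1", "64:ff9b::/96")
```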

    4.  Conclusion

Even if this is a problem that needs to be solved -- a position that
is not obvious to me -- it is not plain that A64 solves it.  Even if
it does solve it, it solves it at a very great price, and a price that
will be paid long after A64 has ceased to be broadly useful.  In any
case, there are other ways to solve the same problem, and they are
less disruptive than adding A64 to the DNS.  Therefore, I do not think
the A64 proposal should be adopted.

Best regards,

Andrew


-- 
Andrew Sullivan
ajs at shinkuro.com
Shinkuro, Inc.
