[Enum] Proposed tweak to section 2 of 3761

Lawrence Conroy <lconroy@insensate.co.uk> Sat, 02 May 2009 14:39 UTC

Message-Id: <635004FB-361A-4917-B2BE-09F6666B7ECD@insensate.co.uk>
From: Lawrence Conroy <lconroy@insensate.co.uk>
To: IETF ENUM list <enum@ietf.org>
Content-Type: text/plain; charset="US-ASCII"; format="flowed"; delsp="yes"
Content-Transfer-Encoding: 7bit
Mime-Version: 1.0 (Apple Message framework v930.3)
Date: Sat, 02 May 2009 15:40:56 +0100
Cc: paf@cisco.com, sob@harvard.edu
Subject: [Enum] Proposed tweak to section 2 of 3761
Precedence: list

Hi Folks,
  I have just had yet another developer confused by DDDS and RFC 3761
and the way that key/FQDN generation is described for ENUM.
This confusion is caused exclusively by the text currently in 3761
section 2, and I ask the WG if they would mind this being changed into
English for the next spin of 3761 (as proposed at the end of this mail).
I'd prefer to avoid having to raise an erratum against 3761 first.

[In rfc3761bis, I believe that the equivalent change is to move all but
the last two paragraphs of 2.4.1 into 2.2 (and renumber the following
sub-sections)]

Comments?

all the best,
   Lawrence
p.s. I am unsure of Michael's email address so couldn't ping him.
If anyone knows this, could you pass this on?

--------
Background:
DDDS has some useful definitions in RFC3402 ss 2, describing the AUS and
the First Well Known Rule, and the Rule Database. The definitions are:

    Application Unique String
       A string that is the initial input to a DDDS application.  The
       lexical structure of this string must imply a unique delegation
       path, which is analyzed and traced by the repeated selection and
       application of Rewrite Rules.

    First Well Known Rule
       This is a rewrite rule that is defined by the application and not
       actually in the Rule Database.  It is used to produce the first
       valid key.

    Rule Database
       Any store of Rules such that a unique key can identify a set of
       Rules that specify the delegation step used when that particular
       Key is used.

Section 4 of RFC 3402 expands on these, defining what must be specified
for a DDDS application (like E2U).

    Application Unique String:
       This is the only string that the rewrite rules will apply to.  The
       string must have some regular structure and be unique within the
       application such that anyone applying Rules taken from the same
       Database will end up with the same Keys.  For example, the URI
       Resolution application defines the Application Unique String to be
       a URI.

       No application is allowed to define an Application Unique String
       such that the Key obtained by a rewrite rule is treated as the
       Application Unique String for input to a new rule.  This leads to
       sendmail style rewrite rules which are fragile and error prone.
       The one single exception to this is when an Application defines
       some flag or state where the rules for that application are
       suspended and a new DDDS Application or some other arbitrary set
       of rules take over.  If this is the case then, by definition, none
       of these rules apply.  One such case can be found in the URI
       Resolution application which defines the 'p' flag which states
       that the next step is 'protocol specific' and thus outside of the
       scope of DDDS.

    First Well Known Rule:
       This is the first rule that, when applied to the Application
       Unique String, produces the first valid Key.  It can be expressed
       in the same form as a Rule or it can be something more complex.
       For example, the URI Resolution application might specify that the
       rule is that the sequence of characters in the URI up to but not
       including the first colon (the URI scheme) is the first Key.

    Valid Databases:
       The application can define which Databases are valid.  For each
       Database the Application must define how the First Well Known
       Rule's output (the first Key) is turned into something that is
       valid for that Database.  For example, the URI Resolution
       application could use the Domain Name System (DNS) as a Database.
       The operation for turning this first Key into something that was
       valid for the database would be to turn it into some DNS-valid
       domain-name.  Additionally, for each Database an Application
       defines, it must also specify what the valid character sets are
       that will produce the correct Keys.  In the URI Resolution example
       shown here, the character set of a URI is 7 bit ASCII which
       matches fairly well with DNS's 8 bit limitation on characters in
       its zone files.
--

So... In the definition of the E2U DDDS application (as specified in
3761 and soon to be updated in rfc3761bis):

-   you might expect some defined pre-processing of a passed telephone
     number to make a "canonical form" Application Unique String so that
     "the lexical structure of this string" implies "a unique delegation
     path".


-   you might expect the first well known rule to be specified so that
     the telephone number/AUS can be converted into a key in the rules
     database; in the case of DNS, that is going to be an owner or Fully
     Qualified Domain Name.

-   you might also expect that the database is defined as DNS, with
     NAPTRs as the records that carry the DDDS Rules.
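
As an illustration of the first expectation, here is a minimal Python
sketch of that kind of pre-processing. The function name to_aus is
hypothetical, and the sketch assumes the caller already holds a full
E.164 number; it does not check the number against any dialing plan.

```python
def to_aus(number: str) -> str:
    """Reduce a written E.164 number to '+' followed by its digits only."""
    # Keep ASCII digits only; separators such as '-', ' ' and '(' are dropped.
    digits = "".join(ch for ch in number if ch in "0123456789")
    if not digits:
        raise ValueError("no digits found in %r" % number)
    # Re-attach the single leading '+' that anchors the string as E.164.
    return "+" + digits


print(to_aus("+44-116-496-0348"))  # +441164960348
```

Two differently punctuated spellings of the same number give the same
string, which is what lets the string imply a unique delegation path.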

What is currently in RFC 3761 (sections 2.1, 2.2, and 2.4 respectively)
is at best confusing, and may be incorrect as written.

2.1:  The AUS definition is clear.
2.2:  The First Well Known Rule is not right; it is shown as identity,
       which does not generate a valid key into DNS (a FQDN).
2.4:  The Database Definition includes a conversion step that will, if
       followed, be applied to ALL terms that will be converted into a
       FQDN, including the output of non-terminal Rules. That works for
       the AUS with the current identity First Well Known Rule, but it
       breaks badly when given the output of a non-terminal NAPTR.
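
To make the 2.4 problem concrete, here is a rough Python sketch. The
function name section_2_4_steps and the domain enum2.example.net are
hypothetical; the steps follow the order currently used in 3761 section
2.4 (strip non-digits, add dots, reverse, append the apex).

```python
def section_2_4_steps(term: str) -> str:
    """Apply the four conversion steps to whatever term is handed in."""
    digits = "".join(ch for ch in term if ch in "0123456789")  # step 1
    dotted = ".".join(digits)                                   # step 2
    flipped = dotted[::-1]                                      # step 3
    return flipped + ".e164.arpa"                               # step 4


# Applied to the AUS, the result is the expected ENUM domain.
print(section_2_4_steps("+442079460148"))      # 8.4.1.0.6.4.9.7.0.2.4.4.e164.arpa

# Applied to a domain name produced by a non-terminal NAPTR, it is not.
print(section_2_4_steps("enum2.example.net"))  # 2.e164.arpa
```

The second result is a name under e164.arpa that bears no relation to
the domain the non-terminal Rule actually produced.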

Adding a caveat to section 2.4 is not the right approach, and HAS caused
entirely unnecessary confusion.
It's like adding a second ear to the fish. It may make it symmetrical,
but the real solution is not to have the mess in the first place.

What one would have expected in the first well known rule is that it
takes the AUS and converts it into a FQDN, by stripping all characters
except digits (step 1), reversing the sequence of the characters (step
3), interspersing dots (step 2) and then appending the apex (step 4).
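
As a rough Python sketch of such a First Well Known Rule (the function
name first_well_known_rule is hypothetical, and the apex is assumed to
be e164.arpa), with the reversal done before the dots are added:

```python
def first_well_known_rule(aus: str, apex: str = "e164.arpa") -> str:
    """Map an AUS such as '+442079460148' to its ENUM domain name."""
    # Strip everything except the digits (this drops the leading '+').
    digits = [ch for ch in aus if ch in "0123456789"]
    # Reverse the digit sequence so the least significant digit comes first.
    digits.reverse()
    # Intersperse dots, then append the apex.
    return ".".join(digits) + "." + apex


print(first_well_known_rule("+442079460148"))
# 8.4.1.0.6.4.9.7.0.2.4.4.e164.arpa
```

The rule is applied once, to the AUS only; keys produced later by
non-terminal Rules are already domain names and are used unchanged.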

The Database definition then only needs to refer to 3403. The steps are
not appropriate in this section, as E2U uses DNS and NAPTRs in exactly
the same way as any other.
--------

What I propose to fix this is:
The First Well Known Rule section inherits the FQDN generation steps
from section 2.4. For convenience, I'd suggest that steps 2 and 3 are
swapped. Every implementation I've seen does it that way, as it's
quicker.

The Database definition would have the first paragraph only.
The last two paragraphs of the current section 2.4 could stay there;
these are merely notes.

Proposed text for the start of section 2 (up to but excluding
2.4.1) follows (with changes indicated by arrows):
----------------------------------------------------------------
----------------------------------------------------------------
2.  The ENUM Application Specifications

    This template defines the ENUM DDDS Application according to the
    rules and requirements found in [7].  The DDDS database used by this
    Application is found in [2] which is the document that defines the
    NAPTR DNS Resource Record type.

    ENUM is only applicable for E.164 numbers.  ENUM compliant
    applications MUST only query DNS for what it believes is an E.164
    number.  Since there are numerous dialing plans which can change over
    time, it is probably impossible for a client application to have
    perfect knowledge about every valid and dialable E.164 number.
    Therefore a client application, doing everything within its power,
    can end up with what it thinks is a syntactically correct E.164
    number which in reality is not actually valid or dialable.  This
    implies that applications MAY send DNS queries when, for example, a
    user mistypes a number in a user interface.  Because of this, there
    is the risk that collisions between E.164 numbers and non-E.164
    numbers can occur.  To mitigate this risk, the E2U portion of the
    service field MUST NOT be used for non-E.164 numbers.

2.1.  Application Unique String

    The Application Unique String is a fully qualified E.164 number minus
    any non-digit characters except for the '+' character which appears
    at the beginning of the number.  The "+" is kept to provide a well
    understood anchor for the AUS in order to distinguish it from other
    telephone numbers that are not part of the E.164 namespace.

    For example, the E.164 number could start out as "+44-116-496-0348".
    To ensure that no syntactic sugar is allowed into the AUS, all non-
    digits except for "+" are removed, yielding "+441164960348".

2.2.  First Well Known Rule
---->
    The First Well Known Rule for this Application converts the
    Application Unique String (AUS) into a key for the Rules Database,
    which in this case is the DNS, so this key is a domain name.
    The output of this rule is the same as the input.
    The E.164 namespace and this Application's database are organized in
    such a way that it is possible to go directly from the name to the
    smallest granularity of the namespace directly from the name itself.
    The first well known rule merely maps from the pre-processed
    telephone number into the ENUM namespace as reflected within the DNS.

    In order to convert the AUS to a unique key in this Database the
    string is converted into a domain-name according to this algorithm:

    1. Remove all characters with the exception of the digits.  For
       example, the AUS is "+442079460148".  This step would simply
       remove the leading "+", producing "442079460148".

    2. Reverse the order of the digits.  Example:
       "841064970244"

    3. Put dots (".") between each digit.  Example:
       8.4.1.0.6.4.9.7.0.2.4.4

    4. Append the string ".e164.arpa" to the end.  Example:
       8.4.1.0.6.4.9.7.0.2.4.4.e164.arpa
<----
    This domain-name is the key used to request NAPTR records which may
    contain the end result or, if the flags field is blank, produces new
    keys in the form of domain-names from the DNS.

2.3.  Expected Output

    The output of the last DDDS loop is a Uniform Resource Identifier in
    its absolute form according to the 'absoluteURI' production in the
    Collected ABNF found in RFC2396 [4].

2.4.  Valid Databases

    At present only one DDDS Database is specified for this Application.
    "Dynamic Delegation Discovery System (DDDS) Part Three: The DNS
    Database" (RFC 3403) [2] specifies a DDDS Database that uses the
    NAPTR DNS resource record to contain the rewrite rules.  The Keys for
    this database are encoded as domain-names.
---->
<----
    Some nameserver implementations attempt to be intelligent about items
    that are inserted into the additional information section of a given
    DNS response.  For example, BIND will attempt to determine if it is
    authoritative for a domain whenever it encodes one into a packet.  If
    it is, then it will insert any A records it finds for that domain
    into the additional information section of the answer until the
    packet reaches the maximum length allowed.  It is therefore
    potentially useful for a client to check for this additional
    information.  It is also easy to contemplate an ENUM enhanced
    nameserver that understands the actual contents of the NAPTR records
    it is serving and inserts more appropriate information into the
    additional information section of the response.  Thus, DNS servers
    MAY interpret Flag values and use that information to include
    appropriate resource records in the Additional Information portion of
    the DNS packet.  Clients are encouraged to check for additional
    information but are not required to do so.  See the Additional
    Information Processing section of RFC 3403 [2], Section 4.2 for more
    information on NAPTR records and the Additional Information section
    of a DNS response packet.

    The character set used to encode the substitution expression is
    UTF-8.  The allowed input characters are all those characters that are
    allowed anywhere in an E.164 number.  The characters allowed to be in
    a Key are those that are currently defined for DNS domain-names.
----------------------------------------------------------------
----------------------------------------------------------------
