<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">
<!-- This is built from a template for a generic Internet Draft. Suggestions for
     improvement welcome - write to Brian Carpenter, brian.e.carpenter @ gmail.com -->
<!-- This can be converted using the Web service at http://xml.resource.org/experimental.html
     (which supports the latest, sometimes undocumented and under-tested, features.) -->
<?rfc toc="yes"?>
<!-- You want a table of contents -->
<?rfc symrefs="yes"?>
<!-- Use symbolic labels for references -->
<?rfc sortrefs="yes"?>
<!-- This sorts the references -->
<?rfc iprnotified="no" ?>
<!-- Change to "yes" if someone has disclosed IPR for the draft -->
<?rfc compact="yes"?>
<!-- This defines the specific filename and version number of your draft (and inserts the appropriate IETF boilerplate -->
<rfc category="info" docName="draft-carpenter-referral-ps-02"
     ipr="trust200902">
  <front>
    <title abbrev="Referral Problem">Problem Statement for
    Referral</title>

    <author fullname="Brian Carpenter" initials="B. E." surname="Carpenter">
      <organization abbrev="Univ. of Auckland"></organization>

      <address>
        <postal>
          <street>Department of Computer Science</street>

          <street>University of Auckland</street>

          <street>PB 92019</street>

          <city>Auckland</city>

          <region></region>

          <code>1142</code>

          <country>New Zealand</country>
        </postal>

        <email>brian.e.carpenter@gmail.com</email>
      </address>
    </author>

    <author fullname="Sheng Jiang" initials="S.J." surname="Jiang">
      <organization>Huawei Technologies Co., Ltd</organization>

      <address>
        <postal>
          <street>Huawei Building, No.3 Xinxi Rd.,</street>

          <city>Shang-Di Information Industry Base, Hai-Dian District,
          Beijing</city>

          <country>P.R. China</country>
        </postal>

        <email>jiangsheng@huawei.com</email>
      </address>
    </author>

    <author fullname="Zhen Cao" initials="Z" surname="Cao">
      <organization abbrev="ChinaMobile">China Mobile</organization>

      <address>
        <postal>
          <street>Unit2, 28 Xuanwumenxi Ave,Xuanwu District</street>

          <city>Beijing</city>

          <region></region>

          <code>100053</code>

          <country>P.R. China</country>
        </postal>

        <email>caozhenpku@gmail.com</email>
      </address>
    </author>

    <date day="23" month="February" year="2011" />

    <area></area>

    <workgroup>Network Working Group</workgroup>

    <abstract>
      <t>The purpose of a referral is to enable a given entity in a multiparty
      Internet application to pass information to another party. It enables a
      communication initiator to be aware of relevant information of its
      destination entity before launching the communication. This memo
      discusses the problems involved in referral scenarios.</t>
    </abstract>
  </front>

  <middle>
    <section anchor="intro" title="Introduction">
      <t>A frequently occurring situation is that one entity A connected to
      the Internet (or to some private network using the Internet protocol
      suite) needs to be aware of the information of another entity B in order
      to reach it. The information can be obtained from B itself or some
      third-party entity C. This is known as a referral.</t>

      <t>Referral is the act whereby one entity informs another entity how to
      contact a specific entity. It enables a communication initiator to be
      aware of relevant information of its destination entity in order to
      launch a communication channel. This referral information can be
      obtained through an existing communication channel between these two
      entities or from thrid-party entities.</t>

      <t>In the original design of the Internet, IP addresses were global,
      unique, and quasi-permanent. Also any differentiation beyond that
      provided by an IP address was done by protocol and port numbers.
      Referrals were therefore performed simply by passing an IP address and
      possibly protocol and port numbers. In fact simple referrals (the first
      case above, sometimes called first-party referrals) were never needed
      since A could simply use B's address. Third-party referrals were
      trivial: C would tell A about B's address. Thus, it became common
      practice to pass raw addresses between entities. A classical example is
      the FTP PORT command <xref target="RFC0959"></xref>.</t>
    </section>

    <section title="Terminology">
      <t>This document makes use of the following terms:</t>

      <list style="symbols">
        <t>"Entity": we use this rather than "application" to describe any
        software component embedded in an Internet host, not just a specific
        application, that sends, receives or makes use of referrals. Also, in
        case of dynamic load sharing or failover, an entity might even migrate
        between hosts.</t>

        <t>"Referral": the act of one entity informing another entity how to
        contact a specific entity.</t>

        <t>"Reference": the actual data (name, address, identitifier, locator,
        pointer, etc.) that is the basis of a referral.</t>

        <t>"Referring entity": the entity that sends a referral.</t>

        <t>"Receiving entity": the entity that receives a referral.</t>

        <t>"Referenced entity": the target entity of a reference.</t>

        <t>"Scope": the region or regions of the Internet within which a given
        reference is applicable to reach the referenced entity.</t>
      </list>
    </section>

    <section title="Goals of Referral">
      <t>The principal purpose of referral is to enable one entity in a
      multi-party application to pass information to another party involved in
      the same application. This document makes no assumptions about whether
      the entities are acting as clients, servers, peers, super-nodes, relays,
      proxies, etc., as far as the application is concerned. Neither does it
      take a position as to how the various entities become aware of the need
      to send a referral; this depends entirely on the structure of the
      application.</t>

      <section title="Reachability">
        <t>The primary goals of referral is to enable a communication
        initiator to reach its destination entity. Referral is a best effort
        mechanism. It does not guarantee actual reachability, since the
        referring entity has no general way of knowing which paths exist
        between the receiving entity and the referenced entity. Even if a
        reference is theoretically in scope, and within its defined lifetime,
        it may have become unreachable since it was sent. A receiving entity
        should always be prepared for reachability failures and associated
        retry and failover mechanisms, which are out of scope for the referral
        mechanism itself.</t>
      </section>

      <section title="Path Selection">
        <t>A reference might carry multiple references for the same target.
        These may lead to multiple possible paths from the receiving entity to
        the referenced entity. This scenario is particularly generic when the
        destination or/and source entity has multiple interfaces or is
        multi-homed.</t>

        <t>The referring entity is not likely to know which path is best. The
        receiving entity will need to make a choice, possibly by local policy
        (e.g. <xref target="RFC3484"></xref>) or possibly by trial and error
        (e.g. <xref target="RFC4038"></xref>, <xref target="RFC5534"></xref>).
        This choice is also out of scope for the referral mechanism
        itself.</t>

        <section title="An Example: Triangle Path Optimization">
          <t>In application scenarios, the triangular path shown below is
          common. Both Host A and Host B connect to an application server and
          the application server forwards traffic as a relay agent. A slightly
          more complicated scenario is when the two hosts connect to different
          application servers individually and application servers talk to
          each other's relay agents. In SIP, this is often called the "SIP
          trapezoid".</t>

          <t><artwork><![CDATA[
              +------------------------------+ 
              |      application server      |
              +--+------------------------+--+
                /                         \ 
               |                          |
               |   referral information   | 
               |                          |
               |                          |
             +-+-+                      +-+-+
             | A +----------------------+ B |
             +---+ direct communication +---+]]></artwork></t>

          <t>By passing A's reference to B, B can try to communicate directly
          with A, using the communication line at the bottom. If the direct
          communication is established successfully, the triangle path gets
          optimized. Both the application server and network bandwidth can be
          benefit from this operation.</t>
        </section>
      </section>

      <section title="Interface Selection">
        <t>We also encounter multi-interfaced hosts whose reachability is
        bound to a particular (logical/physical) interface. More information
        is required to indicate which interface may be used under different
        circumstances. The multi-interface problem is defined and studied by the IETF MIF WG.
        Here referral can provide host A's multi-interface information to
        host B; accordingly, host B can select one of the interfaces to
        establish a connection.</t>

        <t><artwork><![CDATA[
          +------------+      Path 1    +------------+
          |Interface A1+----------------+Interface B1|
          |   Host A   |                |   Host B   |
          |Interface A2|----------------+Interface B2|
          +------------+      Path 2    +-----------+]]></artwork></t>

        <t>For example, as shown in the above figure, Host A has connected to
        Host B through Path 1. They can exchange references through Path 1.
        They may disciver that Path 2, using different interfaces, is better than
        Path 1, maybe cheaper, faster or more stable. Then, they can switch to
        Path 2. For example, Host A has interface A1 as broadband access, almost free; and
        interface A2 is 3G access, which costs 0.1 $ per MB. Both of them are
        avaible for incoming connections. If this information is passed to
        host B, through referral, then host B should choose the A1 interface to
        reach host A. Such information is useful to express a host's
        status or preference.</t>

        <t>In order to choose between different interfaces, not only the
        connectivity information of these interfaces, but also some additonal
        information may be helpful, such as bandwidth, financial cost, latency,
        etc. This additional information may also be provided through
        referral. However, this additional information, even if accurate when sent by the
        referring entity, may nevertheless be invalid at the location of the receiving
        entity.</t>
      </section>
    </section>

    <section anchor="ps" title="Problem Statement">
      <t>Unfortunately, the simple approach to referrals, passing an IP
      address, often fails in today's Internet. As has been known for some
      time <xref target="RFC2101"></xref>, hosts' IP addresses no longer all
      have global scope. They often have limited reachability, and may have
      limited lifetime. They are not sufficient to establish communication in
      many cases of dynamic referrals, for a variety of reasons. FQDNs may be
      used instead in some scenarios. However, FQDNs also have their own
      limitations and may fail in some scenarios.</t>

      <section anchor="ip" title="IP Addresses are not sufficient">
        <t>It is no longer reasonable to assume that a host with a fixed
        location has a fixed IP address, or even a stable IP address.</t>

        <t>Furthermore, in the context of IPv4 address exhaustion, several
        solutions have emerged to share a single public IPv4 address between
        several customers simultaneously. Consequently, a public IPv4 address
        often no longer identifies a single customer/user/host, while a private
        IPv4 address is meaningless out of the private network scope. Other
        information (e.g., port range) is required to identify unambiguously a
        given customer/user/host. Both IP addresses and port numbers may be
        different on either side of a NAT or some other middlebox <xref
        target="RFC3234"></xref>, and firewalls may block them. It is no
        longer reasonable to assume that an IP address for a host, which
        allows a given peer to reach that host in one location, also works
        from a different location - even if that host is reachable from the
        second location.</t>

        <t>Also, the Internet now has two co-existing address formats for IPv4
        and IPv6. Direct communication can only be established when both peers
        use the same IP version. Having the address of the source and
        destination in the same IP version does not necessarily mean that the
        path will be using that IP version. Simple approaches may cause
        unnecessary double translation <xref
        target="I-D.boucadair-softwire-cgn-bypass"></xref>. Some addresses may
        even be the result of translation between IPv4 and IPv6, with severe
        limitations on their scope and lifetime. Sending an out-of-scope or
        expired address, or one of the wrong format, as a referral will
        fail.</t>

        <t>A specific problem of this kind may be caused by NAT64
        <xref target="I-D.ietf-behave-v6v4-xlate-stateful"/>. If an IPv6-only
        host behind a NAT64 obtains a synthetic IPv6 address for an IPv4-only
        host, it can communicate successfully via the NAT64. However, if
        the synthetic address is referred to another IPv6 host, it may or may
        not work correctly. We can consider four cases: </t>
        <t><list style="numbers">
           <t>If the receiving entity is behind the same NAT64 as the
              referring entity, all should be well. </t>
           <t>If the referring entity and the receiving entity are
              behind different NAT64 devices, both using the defined Well Known Prefix <xref target="RFC6052"/>,
              all should be well, because the same synthetic address will work in both cases. </t> 
           <t>If the receiving entity is behind a different NAT64 that uses a
              Network Specific Prefix <xref target="RFC6052"/>, the synthetic address will 
              be meaningless and communication will fail. The only way to avoid this
              failure is for the original NAT64's Network Specific Prefix to be globally reachable,
              which seems highly unlikely for operational and security reasons. </t>
           <t>If the receiving entity is a dual stack node that is not behind
              a NAT64, the synthetic address will be meaningless. Although
              there is an IPv4 path to the target host, the receiving entity
              will not know how to find it. Again, the only way to avoid this
              failure is for the original NAT64's prefix to be globally reachable. </t>
           </list></t>
           <t>In the last two cases, even if connectivity failure is avoided, the path
           taken by the packets will be far from optimal, traversing the original NAT64. </t>

        <t>IP addresses today may have an implied "context" (VPN, VoIP VC, IP
        TV, etc.): the reachability of such an address depends on that
        context.</t>

        <t>An implication of these issues is that there is no clean definition
        of the scope of an address (especially an IPv4 address, due to the
        prevalence of NAT). It is impossible to determine algorithmically, by
        inspecting the bits of an address, what its scope of reachability is.
        Resolving this problem would greatly clarify the general problem of
        referrals.</t>
      </section>

      <section anchor="fqdn" title="FQDNs are not sufficient">
        <t>In some cases, this problem may be readily solved by passing a
        Fully Qualified Domain Name (FQDN) instead of an IP address. Indeed,
        that is an architecturally preferred solution <xref
        target="RFC1958" />. However, it is not sufficient in many cases of
        dynamic referrals. Experience shows that an application cannot use a
        domain name in order to reliably find usable address(es) of an
        arbitrary peer. Domain names work fairly well to find the addresses of
        public servers, as in web servers or SMTP servers, because operators
        of such servers take pains to make sure that their domain names work.
        But DNS records are not as reliably maintained for arbitrary hosts
        such as might need to be contacted in peer-to-peer applications, or
        for servers within corporate networks. Many small networks do not even
        maintain DNS entries for their hosts, and for some networks that do
        list local hosts in DNS, the listings may well be unusable from a
        remote location, because of two-faced DNS, or because the A record
        contains a private address. These cases may even be intentional as
        part of a security ring-fence, where w3.example.com only resolves
        within the corporate boundary, and/or resolves to IP addresses which
        are only reachable within the corporate administrative boundaries. In
        such contexts, incoming connections are usually filtered by the
        corporate firewall.</t>

        <t>An additional issue with FQDNs is the very common situation where
        multiple hosts are hidden behind a NAT, but they share one FQDN which
        is in fact a dummy name, created automatically by the ISP so that
        reverse DNS lookup will succeed for the NAT's public IPv4 address.
        Such FQDNs are useless for identifying specific hosts.</t>

        <t>Furthermore, an FQDN may not be sufficient to establish successful
        communications involving heterogeneous peers (i.e., IPv4 and IPv6)
        since A and AAAA records may not be consistently provisioned. There
        are known cases where a server has one name that produces an A record
        (e.g., www.example.com) and another name that produces an AAAA record
        (e.g., ipv6.example.com). An additional complication is that some
        answers from DNS may be synthetic IP addresses, e.g., AAAA records
        sent by DNS64. The host may have no means to detect that such an
        address represents an IPv4 host. These addresses should not be
        interpreted as native IPv6 address.</t>

        <t>In such cases, an IP address either cannot be derived from an FQDN,
        or if so derived, cannot be accessed from an arbitrary location in the
        Internet.</t>

        <t>A related problem is that an application does not have a reliable
        way of knowing its own domain name - or to be more precise, a way of
        knowing a domain name that will allow the application to be reached
        from another location.</t>

        <t>There are wider systemic problems with the DNS as a reliable way to
        find a usable address, which are somewhat out of scope here, but can
        be summarised:</t>

        <list style="symbols">
          <t>In large networks, it is now quite common that the DNS
          administrator is out of touch with the applications user or
          administrator, and as a result, that the DNS is out of sync with
          reality.</t>

          <t>DNS was never designed to accommodate mobile or roaming hosts,
          whose locator may change rapidly.</t>

          <t>DNS has never been satisfactorily adapted to isolated,
          transiently-connected, or ad hoc networks.</t>

          <t>It is no longer reasonable to assume that all addresses
          associated with a DNS name are bound to a single host. One result is
          that the DNS name might suffice for an initial connection, but a
          specific address is needed to rebind to the same peer, say, to
          recover from a broken connection.</t>

          <t>It is no longer reasonable to assume that a DNS query will return
          all usable addresses for a host.</t>

          <t>Hosts may be identified by a different URI per service: no unique
          URI scheme, meaning no single FQDN, will apply.</t>
        </list>

        <t>For all the above reasons, the problem of address referrals cannot
        be solved simply by recommending the use of FQDNs instead. The
        guideline in <xref target="RFC1958" /> is in fact too simple for
        today's network. Something more elaborate than an IP address or an
        FQDN appears to be needed in the general case of application
        referrals.</t>
      </section>

      <section anchor="Information" title="Relevant Information is lacking">
        <t>Neither an IP address nor an FQDN gives complete information about
        the referenced entity. For example, IP addresses normally have
        associated lifetimes (derived from DHCP, SLAAC or the relevant DNS
        TTL), so they should be treated as invalid after their lifetimes
        expire. A referral that does not convey the lifetime associated with
        an address is problematic. As mentioned above, the scope of a
        reference also affects its usefulness. These are examples of
        additional information that is necessary to correctly interpret a
        referral; therefore part of the problem is conveying such information
        along with the reference.</t>
      </section>

      <section anchor="IDLocator"
               title="Extra complexity from ID-Locator Split Mechanisms">
        <t>Additional complexity for referrals would come from the deployment
        of any technology that separates locators from identifiers, rather
        than combining the two as an IP address. Since a very wide range of
        such solutions have been proposed (e.g. HIP, LISP, ILNP and Name-based
        Sockets) <xref target="I-D.ubillos-name-based-sockets"></xref>, it is
        difficult to define the resulting problems precisely.</t>

        <t>However, to consider the example of Name-based Sockets, if a
        referral was made based on the IP address being used at a given
        instant for a Name-based Socket, that address might be useless by the
        time the referral was completed, because the socket suddenly migrated
        to a different IP address.</t>

        <t>The SHIM6 protocol <xref target="RFC5533"> </xref> and the Multiple
        Interfaces (MIF) Working Group may produce similar difficulties, since
        they also consider scenarios where the IP address in use for some
        purpose may change unexpectedly.</t>

        <t>Any referral mechanism must be able to deal with situations where
        the locator corresponding to a given identifier is subject to
        change.</t>
      </section>
    </section>

    <section anchor="motivation"
             title="A Generic Referral Mechanism is needed">
      <t>The first motivation is the observation that unless the parties
      involved have reached an understanding about the scope, lifetime, and
      format of the elements in a referral through some other means, that
      information must be passed with the referral. This is required so that
      the receiving entity can determine whether or not the referral is
      useful. The referral therefore needs to consist of a fully-fledged data
      structure, or to be made using a mutually agreed referral protocol.</t>

      <t>When an attempt to establish a communication channel based on certain
      referral information fails, good design suggests that the receiving
      entity should attempt to correct the situation. For example, if
      communication fails to be established using an IP address, it would
      often be appropriate to attempt a DNS lookup, despite the difficulties
      mentioned above. The second motivating problem is that it may be helpful
      to the entity receiving a reference to also receive information about
      the source of the reference, such as an FQDN, if that is known to the
      sender of the reference. The receiving entity can then attempt to
      recover a valid address (and possibly port number) for the referred
      entity.</t>

      <t>The third motivating problem is to allow a reference to contain
      alternatives to an IP address or an FQDN, when any such alternatives
      exist.</t>

      <t>Additional arguments for a generic referral mechanism include:</t>

      <list style="numbers">
        <t>Allow for general mechanisms that can be used by any application to
        handle references and understand the meaning of referral information,
        such as IP address, possibly protocol and port numbers. However, there
        is an open question whether this standard referral design should be
        used for new applications only, or extended to existing
        applications.</t>

        <t>Simplify ALG design during middlebox traversal. There are
        middleboxes, like firewalls and translators, especially in the mobile
        network, which require application layer gateways ALG. The cost of ALG
        functions is huge for the mobile operator in terms of implementation,
        performance. Standard references could simplify ALG implementation
        during middlebox traversal in the mobile network.</t>

        <t>Simplify packet inspection. Operators sometimes need to inspect
        information or details during communication for administration
        reasons. If referral mechanism is standardized, it is easier for an
        operator to capture and investigate the required information.</t>
      </list>

      <t>We observe that we have identified two general requirements: the need
      to define address scope more precisely, and the need to communicate
      references in a generic way.</t>

      <t>It should be noted that partial or application-specific solutions to
      these problems abound, because any multi-party distributed application
      must solve them. The best documented example is ICE <xref
      target="RFC5245" />, which is an active protocol specific to
      applications mediated by SDP <xref target="RFC4566" />. ICE "works by
      including a multiplicity of IP addresses and ports in SDP offers and
      answers, which are then tested for connectivity by peer-to-peer
      connectivity checks." The question raised here is whether we can define
      requirements for a generic solution that can be used by future
      applications, and possibly be retro-fitted to existing applications.</t>

      <t>One approach could be a "SuperICE" designed to be completely general
      and not tied to the SDP model. Another approach is the idea of a generic
      referral object. Such an object could be passed between the entities of
      a multi-party application, but without defining a specific protocol for
      that purpose. Some applications might choose to send it in-band as a raw
      binary object, others might use a simple ASCII encoding, and still
      others might prefer to encode it in XML, for example. Finally, it might
      also be used as part of SuperICE.</t>
    </section>

    <section anchor="security" title="Security Considerations">
      <t>It should be noted that referral should not function as a way to
      nullify the effect of a firewall or any other security mechanism. If the
      receiving entity chooses a particular reference and attempts to send
      packets to the corresponding IP address, whether they are delivered or
      not will depend on the existing security mechanisms, whatever they may
      be.</t>

      <t>Nevertheless, if a site security policy requires it, certain
      references may be excluded from referral information sent to certain
      destinations. This would require a security policy mechanism to be added
      to the process of generating referral information.</t>

      <t>Forged or intercepted referral information would enable a wide
      variety of attacks. Although not fundamentally different from attacks
      based on forged or observed IP addresses or FQDNs, no doubt referral
      would allow such attacks to be more ingenious, simply because they
      provide more information than an address or FQDN alone. Referral
      information should be transmitted through authenticated and encrypted
      channels. It is not further elaborated here.</t>

      <t>Referral may raise potential privacy issues, which are not explored
      in this document. For example, in the SIP context, mechanisms such as
      <xref target="RFC3323"></xref> and <xref target="RFC5767"></xref> are
      available to hide information that might identify end-points. Referral
      usage scenarios must ensure that they do not unintentionally defeat
      privacy solutions.</t>
    </section>

    <!-- security -->

    <section anchor="iana" title="IANA Considerations">
      <t>This document requests no action by IANA.</t>
    </section>

    <!-- iana -->

    <section anchor="ack" title="Acknowledgements">

      <t>Bo Zhou, formerly with China Mobile, was an original author of this document. 
         His contributions are gratefully acknowledged. </t>

      <t>Valuable comments and contributions were made by Mohamed Boucadair,
      Dan Wing, Keith Moore and others.</t>

      <t>This document was produced using the xml2rfc tool <xref
      target="RFC2629"></xref>.</t>
    </section>

    <!-- ack -->

    <section anchor="changes" title="Change log">
      <t>draft-carpenter-referral-ps-00: original version, 2010-06-21.</t>

      <t>draft-carpenter-referral-ps-01: add content regarding to ID-Locator
      Split Mechanisms, 2010-08-30.</t>

      <t>draft-carpenter-referral-ps-02: add content regarding NAT64, 2011-02-23.</t>

    </section>

    <!-- changes -->
  </middle>

  <back>

    <references title="Informative References">

      <?rfc include="reference.RFC.5245"?>

      <?rfc include="reference.RFC.5534"?>

      <?rfc include="reference.RFC.5533"?>

      <?rfc include="reference.RFC.3323"?>

      <?rfc include="reference.RFC.4566"?>

      <?rfc include="reference.RFC.3484"?>

      <?rfc include="reference.RFC.0959"?>
  

       

      <?rfc include="reference.RFC.2629"?>  

      <?rfc include="reference.RFC.2101"?>
   
      <?rfc include="reference.RFC.2775"?>

       --&gt; 

      <?rfc include="reference.RFC.1958"?>
    
      <?rfc include="reference.RFC.3234"?>
      
      <?rfc include="reference.RFC.4038"?>
    
      <?rfc include="reference.RFC.5767"?>

      <?rfc include="reference.RFC.6052"?>     

<?rfc include="reference.I-D.boucadair-softwire-cgn-bypass"?>

<?rfc include="reference.I-D.ubillos-name-based-sockets"?>
       
<?rfc include="reference.I-D.ietf-behave-v6v4-xlate-stateful"?>

       
    </references>
  </back>
</rfc>
