<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">
<?rfc toc="yes"?>
<?rfc tocompact="yes"?>
<?rfc tocdepth="3"?>
<?rfc tocindent="yes"?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes"?>
<?rfc comments="yes"?>
<?rfc inline="yes"?>
<?rfc compact="yes"?>
<?rfc subcompact="no"?>
<rfc category="std" docName="draft-xyz-pidloc-ps-02" ipr="trust200902">
  <front>
    <title abbrev="Pidloc Problem Statement">
     Problem Statement for Secure End to End Privacy in IdLoc Systems</title>


   <author fullname="Dirk von Hugo" initials="D.H." surname="von Hugo">
      <organization abbrev="Deutsche Telekom">Deutsche Telekom</organization>
      <address>
        <postal>
          <street>Deutsche-Telekom-Allee 7</street>
          <city>D-64295 Darmstadt</city>
          <code></code>
          <country>Germany</country>
        </postal>
        <phone></phone>
        <email>Dirk.von-Hugo@telekom.de</email>
      </address>
    </author>

    <author fullname="Behcet Sarikaya" initials="B.S." surname="Sarikaya">
      <organization>Denpel Informatique</organization>
      <address>
        <postal>
          <street></street>
          <street></street>
          <city></city>
          <region></region>
          <code></code>
        </postal>
        <email>sarikaya@ieee.org</email>
      </address>
    </author>
    <author fullname="Luigi Iannone" initials="L.I." surname="Iannone">
      <organization>Telecom ParisTech</organization>
        <address>
          <postal>
            <street></street>
            <street></street>
            <city></city>
            <region></region>
            <code></code>
          </postal>
        <email>ggx@gigix.net</email>
      </address>
    </author>
    <author fullname="Alex Petrescu" initials="A.P." surname="Petrescu">
      <organization abbrev="">CEA, LIST</organization>
      <address>
        <postal>
          <street></street>
          <city></city>
          <code></code>
          <country></country>
        </postal>
        <phone></phone>
        <email>alexandre.petrescu@gmail.com</email>
      </address>
    </author>
      <author fullname="Kyoungjae Sun" initials="K.S." surname="Sun">
      <organization abbrev="">Soongsil University</organization>
      <address>
        <postal>
          <street></street>
          <city></city>
          <code></code>
          <country></country>
        </postal>
        <phone></phone>
        <email>gomjae@dcn.ssu.ac.kr</email>
      </address>
    </author>
        <author fullname="Umberto Fattore" initials="U.F." surname="Fattore">
      <organization abbrev="">NEC</organization>
      <address>
        <postal>
          <street></street>
          <city></city>
          <code></code>
          <country></country>
        </postal>
        <phone></phone>
        <email>Umberto.Fattore@neclab.eu</email>
      </address>
    </author>

    <date  year="2019" />

    <abstract>
      <t>
       Efficient, service-aware, and flexible end-to-end routing in future
       communication networks can be achieved by routing protocols that make
       use of Identifier-Locator separation systems.  Because these systems
       require a correlation between identifiers and locations, which might
       allow individuals' identities and locations to be tracked and misused,
       their operation demands strong security measures to preserve the
       privacy of users and devices.  This document identifies and describes
       typical use cases and derives from them a problem statement describing
       the issues and challenges in applying privacy-preserving
       Identifier-Locator split (PidLoc) approaches.
      </t>

    </abstract>
  </front>

  <middle>
    <section title="Introduction">


      <t>
      Future communication systems currently being specified by various
      Standards Development Organizations (SDOs) aim at higher resource
      efficiency and flexibility than currently deployed and operated
      networks.  Independent of specific access technologies, multiple
      applications are to be served with different levels of policy-driven
      mobility support and quality of service in terms of bandwidth, latency,
      error probability, etc.  In current practice, an IP address carries the
      semantics of session identification as well as entity location and name
      resolution.  Many networking and information-processing topics, such as
      cloud computing, software-defined networking, network function
      virtualization, logical network slicing, and convergence of multiple
      heterogeneous access and transport technologies, call for new
      approaches towards service-specific and optimized packet routing.

      </t>
      <t>
      Promising proposals are Identifier Locator (Id-Loc) separation systems
   like Identifier Locator Addressing (ILA) <xref target="I-D.herbert-intarea-ila"/>, Identifier-Locator Network
   Protocol (ILNP) <xref target="RFC6740"/>, Locator/ID Separation Protocol (LISP) <xref target="I-D.ietf-lisp-rfc6830bis"/>
   <xref target="I-D.ietf-lisp-rfc6833bis"/>, and others.



      </t>
      <t>
      Architectures and protocols for these approaches are already documented
      in detail and are under continuous evolution in different working
      groups (WGs).  This document, on the other hand, attempts to identify
      potential issues with respect to real-world deployment scenarios that
      may call for implementations of the above-mentioned Id-Loc systems.  In
      particular, this document focuses on threats to the privacy of devices
      and their users, including location detection and movement tracking,
      for which specific countermeasures may be needed.


      </t>
      <t>
      To provide a problem statement, this draft documents the common aspects
      and differences of several Id-Loc approaches from a high-level
      perspective and describes a set of use cases, from which issues and
      challenges concerning privacy and security are identified.  A set of
      requirements resulting from a detailed analysis of these generic and
      use-case-specific questions will be provided in a companion document.
      </t>


    </section>

    <section title="Conventions and Terminology">
      <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
      "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
      document are to be interpreted as described in <xref
      target="RFC2119">RFC 2119</xref>.</t>

      <t>
      Identifier: An identifier is information that unambiguously identifies
      an entity or an entity group within a given scope.  An identifier is
      the equivalent of an End-point IDentifier (EID) in the Locator/ID
      Separation Protocol (LISP).  It may or may not be visible in
      communications.


      </t>
      <t>
      Locator: A locator is a routable network address.  It may be
      associated with an identifier and used for communication on the network
      layer according to the identifier-locator split principle.  A locator
      is the equivalent of a Routing Locator (RLOC) in LISP or an IP address
      in other cases.
      </t>

    </section>





    <section title="Identifier Locator Separation Protocols">
    	<t>
    	An identifier represents a communication end-point of an entity and may not be routable.
    	A locator also represents a communication end-point; however, it is a routable network address.
    	Because entities identified by an identifier can move, the association between identifiers
    	and locators may be ephemeral.  A database called a mapping system is needed for
    	identifier-to-locator mapping.  Identifiers are mapped to locators for reachability purposes.
    	A mapping system has to handle mobility by updating the identifier-to-locator mappings in the database.
 	</t>
	<t>
	To start a communication, a device needs to know the identifier of the destination; it then relies on an identifier lookup process to obtain the associated locator(s).
	Note that both identifier and locator may be carried in the clear in packet headers, depending
	on the specific technology used and the level of security/privacy enforced.
	</t>
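	<t>
	As a non-normative illustration of the lookup and mobility update described above,
	the following Python sketch models a mapping system as a simple in-memory table.
	The class name MappingSystem, its methods, and the example identifier and locator
	values are hypothetical; they are assumptions of this sketch and do not correspond
	to any of the protocols discussed in this document.
	</t>
	<figure><artwork><![CDATA[
```python
# Hypothetical in-memory model of an Id-Loc mapping system.
class MappingSystem:
    def __init__(self):
        # identifier -> list of currently valid locators
        self._table = {}

    def register(self, identifier, locators):
        """Create or replace the mapping for an identifier."""
        self._table[identifier] = list(locators)

    def lookup(self, identifier):
        """Return a copy of the locators, or [] if unknown."""
        return list(self._table.get(identifier, []))


ms = MappingSystem()
ms.register("id-1234", ["2001:db8:a::1"])
before = ms.lookup("id-1234")

# The entity moves: only the locator changes, not the identifier.
ms.register("id-1234", ["2001:db8:b::1"])
after = ms.lookup("id-1234")
```
]]></artwork></figure>
	<t>
	In this sketch, any party able to call lookup() learns the current locator of an
	identifier, which is the kind of exposure this document is concerned with.
	</t>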


		<t>
		The use of identifiers that are readily available for public access raises privacy issues.
		For public entities, it may be desirable to have their fully qualified domain names or
		host names available for public lookups by clients.  This is, however, not the case in
		general for all identifiers, e.g. for those of individuals roaming in a mobile network.
	</t>

  <section title="ILNP">
  	<t>
	  Identifier-Locator Network Protocol (ILNP) <xref target="RFC6740"/> is a
   host-based approach enabling mobility using mechanisms that are only
   deployed in end-systems and do not require any router changes.
</t>
</section>

<section title="ILA">
  <t>
   Identifier-Locator Addressing (ILA) <xref target="I-D.herbert-intarea-ila"/> uses address
   transformation: it proposes splitting an IPv6 address into a 64-bit
   identifier portion (the lower address bits) and a locator portion (the
   higher address bits).  The locator part is determined dynamically from a
   mapping table that maintains associations between the
   location-independent identifiers and topologically significant locators.
	</t>
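	<t>
	The address split described above can be sketched in Python as follows.  This is a
	minimal illustration only: the function names split_address and join_address and
	the example addresses are assumptions of this sketch, and further details of the
	ILA specification are not modelled.
	</t>
	<figure><artwork><![CDATA[
```python
import ipaddress

MASK64 = (1 << 64) - 1


def split_address(addr):
    """Return (locator, identifier) as 64-bit integers."""
    value = int(ipaddress.IPv6Address(addr))
    return value >> 64, value & MASK64


def join_address(locator, identifier):
    """Combine a 64-bit locator and identifier into an address."""
    return ipaddress.IPv6Address((locator << 64) | identifier)


# Example values only (2001:db8::/32 is a documentation prefix).
locator, identifier = split_address("2001:db8:1:2::abcd")

# A mapping table lookup (not shown) would supply the new locator.
new_locator, _ = split_address("2001:db8:9:9::")
moved = join_address(new_locator, identifier)
```
]]></artwork></figure>
	<t>
	The identifier in the lower 64 bits is unchanged after the move; only the upper
	64 bits are replaced.
	</t>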
	<t>
	ILA is currently deployed in commercially available cloud systems, such as those of Facebook and Google, which are Layer 3 based.
	A kernel implementation of ILA is also available in Linux.  ILA does not require any transport layer (UDP/TCP)
	changes.
	</t>

</section>

<section title="LISP">
	<t>
    Locator/ID Separation Protocol (LISP) <xref target="I-D.ietf-lisp-rfc6830bis"/> <xref target="I-D.ietf-lisp-rfc6833bis"/> is based on a map-and-encap approach, which provides a level of indirection for routing
    and addressing performed at specific ingress/egress routers at the LISP domain boundaries.  Border routers performing LISP encapsulation at the packet's source stub network are called Ingress Tunnel Routers (ITRs), while border routers at the packet's destination stub network are called Egress Tunnel Routers (ETRs); both are referred to by the general term xTRs.  To obtain the mappings used for encapsulation, xTRs query the mapping system for all mappings related to a certain EID only when necessary (usually, but not exclusively, at the beginning of a new flow transmission).
    The LISP control plane protocol <xref target="I-D.ietf-lisp-rfc6833bis"/> supports
    several different mapping systems (e.g., LISP+ALT <xref target="RFC6836"/> and LISP-DDT <xref target="RFC8111"/>).
    Moreover, it can also be applied to various other data plane protocols.


	</t>
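	<t>
	The on-demand mapping query and the encapsulation step described above can be
	illustrated with the following simplified Python sketch.  The class ITR, its
	methods, the dictionary used as a packet, and the example addresses are
	hypothetical.  The sketch uses exact-match EIDs and ignores EID prefixes, mapping
	lifetimes, and locator selection, so it is not a model of the LISP message formats.
	</t>
	<figure><artwork><![CDATA[
```python
# Hypothetical, simplified model of an ITR with a map-cache.
class ITR:
    def __init__(self, resolver):
        self._resolver = resolver  # callable: EID -> RLOC list
        self._cache = {}
        self.queries = 0           # mapping system queries sent

    def rlocs_for(self, eid):
        """Query the mapping system only on a cache miss."""
        if eid not in self._cache:
            self.queries += 1
            self._cache[eid] = self._resolver(eid)
        return self._cache[eid]

    def encapsulate(self, eid, payload):
        """Wrap the packet in an outer header towards an RLOC."""
        rloc = self.rlocs_for(eid)[0]
        return {"outer_dst": rloc, "inner_dst": eid,
                "payload": payload}


# Example mapping database (documentation addresses only).
database = {"2001:db8:e::1": ["192.0.2.1"]}
itr = ITR(lambda eid: database[eid])
first = itr.encapsulate("2001:db8:e::1", b"hello")
second = itr.encapsulate("2001:db8:e::1", b"again")
```
]]></artwork></figure>
	<t>
	Only the first packet of the flow triggers a query; the second one is served from
	the cache.  Note that the inner destination (the EID) remains visible in the
	encapsulated packet.
	</t>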
</section>
<section title="Privacy in IdLoc Protocols">
	<t>
	In all of the above protocols (ILNP, ILA, and LISP), identifiers are carried
   in packet headers in the clear, and therefore the privacy of identifiers needs to be preserved.  Otherwise,
   private information such as the location and content of the communication can be revealed.
   
	</t>
	<t>
	In the case of ILNP, the public DNS can be used by end nodes to obtain the destination
	identifier for a given Fully Qualified Domain
   Name (FQDN).  However, the same node also obtains the locator values,
   which raises serious privacy issues in the control plane.
   As for the data plane, both the source locator and the source identifier need privacy protection; techniques such as
   locator rewriting and ephemeral-use identifiers, respectively, are suggested.
	</t>
	<t>
	In the control plane, ILA exhibits similar privacy issues if the ILA mapping system, which defines the identifier-to-locator mappings,
	is publicly accessible.
	In the data plane, ILA addresses privacy by having the User Equipment (UE) simultaneously
   use different addresses, chosen from a block of addresses, for different
   connections.
	</t>
	<t>
	In the LISP mapping system, privacy support for a given identifier
   value is lacking in the control plane
   due to the use of DNS, as in ILNP.
   In the data plane, the same approach as in ILA can be used:
   the UE simultaneously uses different addresses, chosen from a block of
   addresses, for different connections.
	</t>

</section>

</section>
          <section anchor="solutions" title="Use Cases">
      <t>

The following collection of use cases serves as a starting point for identifying issues and challenges, from which
   requirements for future solutions providing privacy and security in
   generic Identifier-Locator split approaches can later be derived.
      </t>

         <section anchor="iot" title="Industrial IoT">
      <t>
    Sensors and other connected things in industry are usually not
   personal items (such as wearables) that could reveal an individual's
   sensitive information.  Yet industrial connected objects are business assets that should be detected and accessed
   only by authorised intra-company entities.  Since the huge number of
   these things (massive IoT) as well as the typical energy and bandwidth
   constraints of battery-powered devices may pose a challenge to
   traditional routing and security measures, privacy-enabled Id-Loc
   split approaches are proposed as a viable option here
   <xref target="I-D.nordmark-id-loc-privacy"/>.
      </t>
      <t>
      In Industrial IoT, there are very strong reasons not to share the ID/Locator binding
      with third parties, i.e. to retain privacy.  This can be achieved in a number of ways, such as:
      using an ID/Locator system with some fixed anchor points as locators; injecting routing
      prefixes for the ID prefixes into the normal routing system and using proxy indirection;
      or providing limited ID/Locator exposure.  These are just examples; more approaches should be
      explored in order to find the one most suitable in the context of industrial IoT.
      </t>
      </section>

      <section anchor="nextgen" title="5G Use Case">
      <t>
  Upcoming universal communication via so-called 5G systems
   will demand much more than (just) higher bandwidth and lower
   latency.  The integration of multiple heterogeneous access technologies
   (both wireless and wireline) controlled by a common converged core
   network, and the evolution towards service-based flexible functionalities
   instead of hard-coded network functions, call for new protocols on both
   the control plane and the user (data) plane.  While an Id-Loc approach would serve
   well here, the challenge of providing a unique level of security and
   privacy even for a lightweight routing and forwarding mechanism, one that allows for ease of deployment and migration from existing operational network architectures, remains to be solved.


      </t>
      </section>
      <section anchor="Cloud" title="Cloud Use Case">
      <t>
    The cloud, i.e. a set of distributed data centers for processing and
   storage connected via high-speed transmission paths, is seen as the
   logical location for content and also for virtualized network
   function instances.  It shall provide measures for easy relocation
   and migration of these instances, which are deployed as e.g. containers or
   virtual machines.  Id-Loc split routing protocols such as
   ILA <xref target="I-D.herbert-intarea-ila"/> and LISP <xref target="I-D.ietf-lisp-rfc6830bis"/> <xref target="I-D.ietf-lisp-rfc6833bis"/> are proposed for use here, while the topology of the cloud components and their logical
   correlations shall remain invisible from outside.
      </t>
      <t>
      In a cloud, an upstream IP address does not necessarily belong to the actual service location but
may belong to a gateway or load balancer.  Thus the locator, or even the ID, reveals the location only with the accuracy of a data center, not
that of the function handling a service request.  This issue also manifests itself in today's LTE networks, where
Packet Data Network Gateways (PGWs) located in a data center bind the IP addresses of UEs, which are taken from the network of the data center.
      </t>
      </section>
      <section anchor="vehic" title="Vehicular Networks">
      <t>
      In vehicular network use cases (e.g. for a future Cooperative
   Intelligent Transport System, C-ITS), there are some privacy-related
problems.  Cars are mandated to beacon Cooperative Awareness Messages (CAMs), also known
as Basic Safety Messages (BSMs),
very frequently (more
than once per second).  These messages contain identifiers such as MAC
addresses.  MAC addresses are unique, their vendor prefixes are listed in the public oui.txt file,
and they can be tracked.  These are, however, MAC addresses, not IP addresses.


      </t>
      <t>
      If, in the future, cars beacon Router Advertisements (RAs) as well, then
the source address of these RAs, the link-local (LL) address, poses a risk.  LL addresses are usually
formed from the MAC address, even though <xref target="RFC7217"/>
suggests using a random ID in the Interface Identifier (IID)
instead of the MAC
address.  That RFC is silent about the prefix length; since its
method also covers LL addresses and requires them to follow
RFC 4291 (64-bit IIDs), the random ID still has a fixed length
of 64 bits.  IIDs longer than 64 bits may benefit privacy, since cryptographic attacks on
them would be harder.


      </t>
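      <t>
      The derivation of such an IID can be sketched as follows.  RFC 7217 computes a
      pseudorandom value from the prefix, the network interface, an optional network
      identifier, a DAD counter, and a secret key, and takes the IID from its least
      significant bits.  The function name stable_iid, the use of SHA-256, the "|"
      separator, and the iid_bits parameter are assumptions of this sketch; iid_bits is
      included only to illustrate the discussion of IIDs longer than 64 bits.  The
      check against reserved IIDs is omitted.
      </t>
      <figure><artwork><![CDATA[
```python
import hashlib


def stable_iid(prefix, net_iface, network_id, dad_counter,
               secret_key, iid_bits=64):
    """Derive an opaque IID from RFC 7217 style inputs."""
    data = b"|".join([
        prefix, net_iface, network_id,
        str(dad_counter).encode("ascii"), secret_key,
    ])
    rid = int.from_bytes(hashlib.sha256(data).digest(), "big")
    # Keep the least significant iid_bits bits of the hash output.
    return rid & ((1 << iid_bits) - 1)


key = b"example-secret"  # placeholder, not a real key
iid_a = stable_iid(b"fe80::/64", b"eth0", b"", 0, key)
iid_b = stable_iid(b"fe80::/64", b"eth0", b"", 0, key)
iid_c = stable_iid(b"2001:db8:1::/64", b"eth0", b"", 0, key)
iid_long = stable_iid(b"fe80::/64", b"eth0", b"", 0, key,
                      iid_bits=96)
```
]]></artwork></figure>
      <t>
      The resulting IID is stable for a given prefix and interface, differs between
      prefixes, and does not embed the MAC address.
      </t>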
      <t>
      A variable length IID in link-local addresses may help create a flexible
identifier-locator split thus increasing privacy.
      </t>
       	<t>

In addition, C-ITS shall help improve vehicular-network-based
   services, e.g. by predicting traffic congestion along the route and
   proposing a redirection towards alternative routes, or by predicting network
   coverage along the foreseen path in order to adapt a critical service.  This,
   on the other hand, requires knowledge of the actual route, i.e.
   tracking of the vehicle.  As was shown in <xref target="NYC_cab"/>, even anonymization
   sometimes does not prevent privacy breaches.  ...

      </t>
		<t>
		Strong access control to the ID/LOC mapping system (e.g. using longer and variable-length
		IIDs, crypto-IDs, etc.) involves a tradeoff between enhanced privacy and increased delay.
		In vehicular networks, reducing delay is also very important,
		because vehicles move so fast that little time is available for configuration.
		</t>
			<t>
		For vehicle-to-vehicle (V2V) communication, using temporary identifiers between two vehicles can be one solution to protect privacy.
In a typical V2V exchange, most of the data concerns current traffic conditions, speed, or accident
information, none of which requires identifying the unique device.
<xref target="I-D.ietf-lisp-eid-anonymity"/> can be one good solution to provide anonymity.
<xref target="I-D.ietf-ipwave-vehicular-networking"/> suggests MAC address pseudonyms, in which
the MAC address is changed periodically.
		</t>
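		<t>
		As a rough illustration of a MAC address pseudonym, the following Python sketch
		generates a random, locally administered unicast MAC address.  The function name
		random_pseudonym_mac is hypothetical, and the sketch does not reproduce the
		mechanism of the cited draft; in particular, when and how often the address is
		changed is not shown.
		</t>
		<figure><artwork><![CDATA[
```python
import secrets


def random_pseudonym_mac():
    """Return a random locally administered unicast MAC."""
    octets = bytearray(secrets.token_bytes(6))
    # Set the locally administered bit, clear the multicast bit.
    octets[0] = (octets[0] | 0x02) & 0xFE
    return ":".join(f"{o:02x}" for o in octets)


mac = random_pseudonym_mac()
```
]]></artwork></figure>
		<t>
		Because the locally administered bit is set, such an address does not carry a
		vendor-assigned prefix.
		</t>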

     </section>







    </section>

      <section anchor="Issues" title="PIdLoc Issues and Challenges">
      <t>
      This section summarizes both common and specific issues and challenges in PIdLoc, in order to
      allow the derivation of requirements for potential solutions.  These requirements will serve a gap analysis
      to be documented in upcoming drafts, e.g. (I-D.xyz-pidloc-reqs).
      </t>
    </section>




    <section anchor="IANA" title="IANA Considerations ">
      <t>
      TBD.
      </t>
    </section>

    <section anchor="Security" title="Security Considerations">
      <t>
        TBD
      </t>
    </section>



    <section anchor="acks" title="Acknowledgements">
      <t>


   </t>
    </section>

  </middle>

  <back>
    <references title="References">
      <?rfc include="reference.RFC.2119"?>




					<?rfc include='reference.I-D.ietf-lisp-rfc6830bis'?>
					<?rfc include='reference.I-D.ietf-lisp-rfc6833bis'?>

      <?rfc include="reference.RFC.6740"?>
      <?rfc include="reference.RFC.6836"?>
      <?rfc include="reference.RFC.8111"?>


      <?rfc include="reference.RFC.7217"?>


     	         <?rfc include='reference.I-D.ietf-intarea-tunnels'?>
     <?rfc include='reference.I-D.herbert-intarea-ila'?>
     
     <?rfc include='reference.I-D.nordmark-id-loc-privacy'?>
     <?rfc include='reference.I-D.ietf-lisp-eid-anonymity'?>
  	<?rfc include='reference.I-D.ietf-lisp-sec'?>
  	<?rfc include='reference.I-D.ietf-ipwave-vehicular-networking'?>
  	
	 <reference anchor="NYC_cab">
        <front>
          <title>Anonymizing NYC Taxi Data: Does It Matter?</title>
          <author initials="M" surname="Douriez, et al.">
          </author>

          <date year="2016"/>
        </front>
        <seriesInfo name="Proc. of IEEE Intl.
Conf. on Data Science and Advanced Analytics (DSAA'16)" value=""/>
        <seriesInfo name="pp." value="140-148"/>
      </reference>

    </references>

  </back>

</rfc>
