Re: Last Call: <draft-hardie-privsec-metadata-insertion-05.txt> (Design considerations for Metadata Insertion) to Informational RFC

Stephen Farrell <stephen.farrell@cs.tcd.ie> Tue, 07 March 2017 12:07 UTC

Subject: Re: Last Call: <draft-hardie-privsec-metadata-insertion-05.txt> (Design considerations for Metadata Insertion) to Informational RFC
To: mohamed.boucadair@orange.com, Ted Hardie <ted.ietf@gmail.com>
References: <148527996733.12573.15522530300481191993.idtracker@ietfa.amsl.com> <787AE7BB302AE849A7480A190F8B933009E16627@OPEXCLILMA3.corporate.adroot.infra.ftgroup> <CA+9kkMBw-QbaDzDanWs6sH-z7rEteofCvp8-d-qSf9J31zJykA@mail.gmail.com> <787AE7BB302AE849A7480A190F8B933009E183CF@OPEXCLILMA3.corporate.adroot.infra.ftgroup> <CA+9kkMC4-e=HXSa=QX4m1GgKFA1y-PmKsHkwQTg-ckEM2tGUbw@mail.gmail.com> <787AE7BB302AE849A7480A190F8B933009E1B2E6@OPEXCLILMA3.corporate.adroot.infra.ftgroup> <CA+9kkMBATNM9VAAoRVzAPrvshqeAkNjL_VA_Bwz_JhU75DTRiA@mail.gmail.com> <787AE7BB302AE849A7480A190F8B933009E1BDA0@OPEXCLILMA3.corporate.adroot.infra.ftgroup> <CA+9kkMAMBaq7p4t88T=Rvkg3Qr+wRaGxfPHnKAj7eReWq2Unkw@mail.gmail.com> <68cef6ce-f517-4b36-ae91-cc82c6fa4465@OPEXCLILM42.corporate.adroot.infra.ftgroup> <CA+9kkMD7p70mPp1ta1L1Zy7sDN9_=3zrFr=3nfbK8Wyb0C-fOg@mail.gmail.com> <a9787ef7-734b-49a4-8590-b793272893d7@OPEXCLILM5C.corporate.adroot.infra.ftgroup>
From: Stephen Farrell <stephen.farrell@cs.tcd.ie>
Openpgp: id=D66EA7906F0B897FB2E97D582F3C8736805F8DA2; url=
Message-ID: <cebf0487-2d71-ea1e-5dd9-cf1d8c80df3a@cs.tcd.ie>
Date: Tue, 07 Mar 2017 12:07:38 +0000
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.7.0
MIME-Version: 1.0
In-Reply-To: <a9787ef7-734b-49a4-8590-b793272893d7@OPEXCLILM5C.corporate.adroot.infra.ftgroup>
Content-Type: multipart/signed; micalg="pgp-sha256"; protocol="application/pgp-signature"; boundary="s4rBBtEBM47UpdAmu9qPDdcCewgwTCP7d"
Archived-At: <https://mailarchive.ietf.org/arch/msg/ietf/PmUesGueTyEeTJBi1chH1kwAHA4>
Cc: "draft-hardie-privsec-metadata-insertion@ietf.org" <draft-hardie-privsec-metadata-insertion@ietf.org>, "ietf@ietf.org" <ietf@ietf.org>

Hi Med and Ted,

Thanks for the discussion. My conclusion from that is that
the best next step will be to continue with IESG evaluation.
Progressing this draft had fairly clear support in earlier
discussions on the saag list and in saag meetings, so it
may be, Med, that you're just in the rough. Given the limited
comment during LC, I'd say getting the IESG to evaluate
this will be the easiest way to figure that out. And as an
outgoing AD, I'd much prefer to try to get this sorted before
I'm done rather than leaving it to Kathleen or Eric to pick
up later. (Though we can go there if needed.) Anyway, I
think logistically it makes sense to keep this on the March
16th IESG telechat and see where we go from there. I'll
include a pointer to the archive for this mail so other
ADs can easily find this discussion.

As to the content of the discussion, I think Ted has answered
your comments, even though you and he haven't reached full
agreement on all topics. To me, the remaining areas of
discussion seem to be:

- language around "restore" or not (fwiw I agree with Ted's
  take on that term)

- "consent" - actually it may be best to just remove the two
  uses of that term entirely - the term is fairly legally
  "loaded" and I'm not sure it's needed for this design
  advice anyway - if we can figure a way to rephrase those
  bits that may be an improvement worth making

I'll start the IESG evaluation process for -07 shortly.

Cheers,
S.

PS: Sorry if I've missed anything else that is outstanding.
The form of (not really;-) quoting earlier messages used makes
it very hard to follow the conversation over more than two
emails.

On 07/03/17 07:31, mohamed.boucadair@orange.com wrote:
> Hi Ted,
> 
> Thank you for the answers.
> 
> Please see inline.
> 
> Cheers, Med
> 
> From: Ted Hardie [mailto:ted.ietf@gmail.com]
> Sent: Monday, 6 March 2017 18:14
> To: BOUCADAIR Mohamed IMT/OLN
> Cc: ietf@ietf.org; draft-hardie-privsec-metadata-insertion@ietf.org
> Subject: Re: Last Call: <draft-hardie-privsec-metadata-insertion-05.txt>
> (Design considerations for Metadata Insertion) to Informational RFC
> 
> Hi Mohamed, Replies in-line.
> 
> 
> On Mon, Mar 6, 2017 at 1:48 AM, <mohamed.boucadair@orange.com>
> wrote:
> 
> 
> * A Forward-For header inserted by a proxy does not restore
> any data; it only reveals data that is already present in the
> packet issued by the client itself. That's what restore means here.
> [Med] Then, this needs to be defined in the document. I naively
> assumed that “restored” is used to mean any piece of information
> that the client does not want to insert in a packet, but that an
> on-path device decides to inject even though there is no consent from
> the client. What you are describing is more about “maintaining” or
> “preserving” information, not restoring it.
> 
> The common uses of restore in English all focus on putting something
> back that has been lost, [Med] But that information is not lost for
> an on-path device that encapsulates a packet in another one (so the
> inner header is still carrying the source IP address) or the one that
> supplies the original source IP address as a metadata when source IP
> address/port rewriting is required. The notion of “putting back”
> does not make sense to me because we are not dealing with the
> internal processing of a packet within an on-path device, but we only
> focus on the external behavior. This is exactly the role of “via”
> headers for SIP proxies; when there is a mismatch the received tag is
> completed with the visible source address.
> 
> so I believe restore is better than "maintain" or "preserve", which
> imply something is being carried forward as-is, rather than being put
> back after loss. [Med] Please see above. Because we don’t have a
> standard behavior of an on-path device (proxy, tunnel endpoint, etc.),
> it seems weird to me to say that a proxy that preserves the source IP
> address is “putting back information that is lost”.
> 
> 
> If the information is present as metadata in the packet sent to the
> proxy but would be absent as metadata under normal operation of the
> proxy, adding it back in somewhere else restores the metadata. [Med]
> “normal operation of proxy” is not a standard. A “normal
> operation of proxy” would be to maintain the information sent by
> the client when relaying it to the server. I’m sure you know for
> instance that SIP B2BUAs can do whatever they want!
> 
> You're right that the normal operation of a proxy is not a standard,
> and I should have said "the normal operation of the protocols used by
> a proxy". [Med] This is much better, but still not sufficient.
> On-path devices that manipulate packets may not be a
> “protocol-specific proxy”: tunnel endpoint (e.g., LISP), CGN
> (NAT64, NAT44, DS-Lite), MAP-E BR, etc.
> 
> If the action of the proxy is to start a new TCP connection to an
> origin server, for example, the normal operation of TCP is to use the
> initiator's IP address. [Med] This is protocol-specific. I can
> provide an example of a proxy behavior that relays the source IP
> address/port as part of its normal operation:
> http://www.haproxy.org/download/1.8/doc/proxy-protocol.txt - TCP/IPv4
> : "PROXY TCP4 255.255.255.255 255.255.255.255 65535 65535\r\n" => 5 +
> 1 + 4 + 1 + 15 + 1 + 15 + 1 + 5 + 1 + 5 + 2 = 56 chars
> 
> - TCP/IPv6 : "PROXY TCP6 ffff:f...f:ffff ffff:f...f:ffff 65535
> 65535\r\n" => 5 + 1 + 4 + 1 + 39 + 1 + 39 + 1 + 5 + 1 + 5 + 2 = 104
> chars
> 
> 
> That this loses the IP address of the querying host is implied by
> that normal operation (in other words, it elides metadata about any
> client that caused this new TCP connection to be created). [Med] This
> makes sense if losing the original IP address is an intended property
> of the proxy. But this cannot be a generalized proxy behavior (see the
> example above).
> 
> So origin IP address starts out in the IP header of the original
> packet but gets pushed from that slot when the proxy constructs the
> onward IP packet to the server.  For it to reach the server, it has
> to be placed somewhere else in the onward packet, restoring the lost
> metadata. [Med] The client agreed to send packets with its source IP
> address (which means consent). Why would the proxy need an extra
> channel to get consent for relaying the source IP address to a
> server?
> 
> 
> Because the client agreed to send packets to the proxy by putting it
> in the destination [Med] The client is not even aware that a proxy
> exists on the path! Packets are sent to the ultimate server’s
> address, not that of the proxy. Even for SOME cases where packets
> are sent explicitly to the proxy (e.g., SOCKS proxy), state is
> already in place to graft the outgoing packets to a binding context
> involving the destination server.
> 
> , and did not agree to general disclosure; you can't infer onward
> consent. [Med] Hmm… I’m afraid this conclusion is not technically
> backed, e.g.:
> * A client that sends packets to a server located on the Internet
>   is NOT necessarily aware that a proxy is involved in the
>   forwarding path. Packets are sent using the server’s IP address.
> * The client and proxy may be owned by the same administrative
>   entity (case of enterprise networks). That entity is responsible
>   for determining which information the proxy needs to leak.
> * The proxy and the server may be owned by the same administrative
>   entity (content provider). Supplying data by a proxy to the
>   server, based on the content of a packet received from a host,
>   does not induce a privacy concern here because the proxy and the
>   server are owned by the same entity.
> 
> Had it been present in the packet as a header value in the HTTP
> exchange, it would not have been stripped by normal operation.  There,
> the proxy forwarding it on would simply be preserving it. [Med]
> This is another question: whether the same or a distinct channel can
> be used to communicate the SAME data that was present in the initial
> packet issued by a host.
> 
> 
> That depends on the nature of the channel.  Obviously, if you set the
> origin client's IP address as the source address, you're going to get
> a different result from that spoofing than from putting it in a client
> subnet EDNS option or forwarded-for header. [Med] Agree.
> 
> 
> * An address sharing device, under for example DS-Lite
> (RFC6333), that inserts the source IPv6 prefix in the TCP HOST_ID
> option (RFC7974) is not RESTORING any data. The content of that TCP
> option is already visible in the packet sent by the host. I agree
> with the IESG analysis of RFC7974.  It does restore information by
> taking information which normal operation would have elided and
> restoring it. [Med] The implication of what you are saying here is
> that proxies are good because they hide the source IP addresses of
> hosts!
> 
> 
> Aggregating proxies can have a positive privacy impact, yes.  An
> observer seeing traffic from an aggregating proxy to
> sensitive-topic.example.com knows
> only that some user behind that proxy is looking for information on
> sensitive-topic.  To know which user, the observer must have either
> suborned the proxy or have a way of observing traffic between hosts
> and the proxy.  Both are more expensive and at higher risk of
> discovery than a simple tap near
> sensitive-topic.example.com.
> 
> [Med] The main point here is that, even in the presence of an
> aggregating proxy, a server can demux users by correlating various
> information leaked at the application layer (e.g.,
> https://panopticlick.eff.org/). Tracking those users when they change
> their source IP address is possible in this case, too.
> 
> 
> If the data is taken from a portion of the packet that would not
> normally be forwarded to an upstream host and added to a portion that
> is forwarded to an upstream host, then the device adding the data
> back in should know it is a restoration. [Med] That definition is not
> trivial as mentioned above. I would use “preserve” or
> “maintain” rather than “restore”. Please see above.
> "Restore" is closer, in my opinion, than either preserve or
> maintain.
> 
> 
> 
> If the endpoint sends the data, data will be consistently available
> in that header.  The data changes, of course. [Med] I’m not sure I
> follow you here. What is meant by “consistent availability” then?
> Do you mean the same channel/procedure to communicate the
> information? Or “consistent data”?
> 
> I mean that if you define a protocol such that a well-formed message
> from the client has the data the server needs, it will be
> consistently available.  If you rely on intermediate network devices
> to add the data, it may not be available if there is no cooperating
> network device on path (e.g. if the DNS resolver does not support the
> relevant EDNS0 option).
> 
> [Med] Thank you. Please clarify this in the draft. I had trouble
> parsing what you meant by “consistent availability”. That said,
> there might also be “not cooperating on-path devices” that may
> strip/alter the content of client-supplied data (easy for HTTP for
> example).
> 
> 
> 
> [Med] Resources may not be restricted to CPU or disk but may include
> granting access to the service (e.g., downloading a file when a quota
> per source address is enforced). It can be whatever the servers
> consider to be critical for them; it is up to the
> service designer to characterize it. The NEW wording proposed above is
> technically correct. Please reconsider adding it to the draft.
> 
> 
> 
> I did consider it, but I continue to believe that it moves the needle
> too far into simple server preference.  I retained the original PSAP
> language in -07 as a result. [Med] Emergency is only an example;
> other services may exist that impose the same trust model.
> 
> 
> I think there is a qualitative difference between situations in which
> the resources at risk are human lives and those where they are host
> resources. [Med] I agree with you as an individual. But, it is not up
> to us to mandate this condition for executing services. It is up to
> the (protocol) designers/service providers to decide what is
> critical/key for their service operation. That's why the carve out
> was limited in the GEOPRIV case. [Med] GEOPRIV is not the only
> protocol/service that is concerned with human lives; we can consider
> vehicular networking that trusts the information shared by the
> infrastructure. I prefer neutral wording that cites emergency as an
> example.
> 
> 
> I also added a note about your extensive review.  While you and I
> clearly have some differences of view, the document has gotten better
> from your engagement with it, and I appreciate your efforts. [Med] I
> reviewed the -07. Although it is better compared to -05, I still
> don’t think it is ready to be published as it is. Thank you for
> your effort. And thank you for yours, regards, Ted
> 
> regards, Ted
>
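
For readers following the PROXY protocol example quoted above, here is a minimal sketch of how a relay might build the version 1 (human-readable) header line that carries the original source address and port onward. The function name proxy_v1_line is hypothetical and used only for illustration; the line layout and the 56/104 character worst cases are taken from the quoted excerpt of the haproxy document, and the sketch assumes the "TCP4"/"TCP6" forms only.

```python
import ipaddress


def proxy_v1_line(src: str, dst: str, src_port: int, dst_port: int) -> bytes:
    """Build a PROXY protocol v1 line (hypothetical helper, illustration only).

    Layout, as in the quoted excerpt:
        "PROXY" SP family SP src-addr SP dst-addr SP src-port SP dst-port CRLF
    """
    src_ip = ipaddress.ip_address(src)
    dst_ip = ipaddress.ip_address(dst)
    if src_ip.version != dst_ip.version:
        raise ValueError("source and destination must share an address family")
    for port in (src_port, dst_port):
        if not 0 <= port <= 65535:
            raise ValueError(f"port out of range: {port}")
    family = "TCP4" if src_ip.version == 4 else "TCP6"
    line = f"PROXY {family} {src_ip} {dst_ip} {src_port} {dst_port}\r\n"
    return line.encode("ascii")


if __name__ == "__main__":
    # Worst-case lengths from the quoted text: 56 chars (IPv4), 104 chars (IPv6).
    v4 = proxy_v1_line("255.255.255.255", "255.255.255.255", 65535, 65535)
    v6 = proxy_v1_line("ffff:" * 7 + "ffff", "ffff:" * 7 + "ffff", 65535, 65535)
    print(len(v4), len(v6))
```

Whether a relay that emits such a line is "restoring" or "preserving" the source address is exactly the terminology question debated in the thread; the sketch only shows the mechanics.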