Re: [p2pi] Charter and problem statement

I agree with Richard - as long as content IDs are optional and abstracted (e.g. content IDs are not Torrent IDs) I don't think there's a privacy issue.

There are a number of use cases where metadata related to the content could be useful in allowing the ALTO server to make decisions. For example, an ISP might want to implement policies in their ALTO server such as:

- Don't cache consumer-generated content (requires being able to differentiate between CGC and commercial content).
- Provide different guidance to different applications (requires knowledge of p2p application).
- Provide guidance based on distribution of that content (requires ID of that specific content).
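
As a rough illustration of the three policies above, here is a minimal Python sketch. Everything in it is hypothetical: ALTO defines no such message, and the names AltoRequest, Guidance, guide, the field names, and the "cgc" label are assumptions made only for this example. What it tries to show is that every content-related field can be optional, with the server falling back to content-neutral guidance when a field is absent.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List, Optional


@dataclass
class AltoRequest:
    """Hypothetical ALTO request; only the client address is required."""
    client_ip: str
    application: Optional[str] = None    # e.g. "pando" (assumed label)
    content_class: Optional[str] = None  # "cgc" or "commercial" (assumed labels)
    content_id: Optional[str] = None     # opaque, abstracted ID, not a torrent ID


@dataclass
class Guidance:
    """Hypothetical result: whether to point at a cache, plus policy notes."""
    use_cache: bool
    notes: List[str] = field(default_factory=list)


def guide(req: AltoRequest,
          known_popular: FrozenSet[str] = frozenset()) -> Guidance:
    """Apply the three example policies; absent fields change nothing."""
    result = Guidance(use_cache=True)
    # Policy 1: don't cache consumer-generated content.
    if req.content_class == "cgc":
        result.use_cache = False
        result.notes.append("consumer-generated content: no cache")
    # Policy 2: different guidance for different applications.
    if req.application is not None:
        result.notes.append(f"application-specific guidance: {req.application}")
    # Policy 3: guidance based on the distribution of this specific content.
    if req.content_id is not None and req.content_id in known_popular:
        result.notes.append("content-specific guidance: widely distributed")
    return result
```

A request carrying only client_ip gets content-neutral guidance (use_cache true, no notes), which is the privacy-preserving default.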

It's a good point that once content is cached the ISP knows a lot about the content and its distribution. In particular, since it's participating in the p2p network it knows the IP addresses of people exchanging the content, and whatever metadata is exposed about the content in the protocol (e.g. size, number of files, names). And for p2p networks that don't secure content on caches (Pando does, but some others do not) the ISP has the content on the cache server and can inspect it directly. But this is (IMO, IANAL) similar to the case of an HTTP cache or an email server - the ISP is providing a service, and privacy of use of that service is protected by policy, not by technology.

Given all of that, ALTO can certainly be applied to content that isn't cached, and it should preserve privacy for that content. :-)

- Laird Popkin, CTO, Pando Networks
  mobile: 646/465-0570

----- Original Message -----
From: "Richard Woundy" <Richard_Woundy at cable.comcast.com>
To: "Stas Khirman" <stas at khirman.com>, "Laird Popkin" <laird at pando.com>
Cc: p2pi at ietf.org
Sent: Thursday, July 17, 2008 10:34:48 PM (GMT-0500) America/New_York
Subject: RE: [p2pi] Charter and problem statement

>It seems that we agree on the key point - content identity in ALTO
>["peer selection"] request will be harmful.

>It will be very important to get confirmation/rejection of this
>assumption from ISP's representatives.

I've been thinking about this since yesterday, and I'm not really sure I
agree with this statement.

I definitely understand that there would likely be privacy concerns if
ALTO mandated content identification in its messages. Content identities
should be an optional parameter in the ALTO message exchange (if we can
agree that it should be kept at all).

But I would think that having the content identity could help an ALTO
server make a better recommendation, at least for P2P caches.

Consider the case when the ISP is operating an ALTO server as well as a
P2P cache. If there isn't an issue of violating customer privacy, why
would it be undesirable to reveal the content identity to the ALTO
server, when it is necessary to reveal the content identity to the P2P
caching server?

In fact, the P2P caching server knows potentially more valuable
information than the content identity. ALTO would only know about
metadata for content, whereas the P2P cache would have a copy of the
actual content.

To illustrate, suppose that some content's metadata (e.g. torrent)
indicates that the file is "Spiderman" and not much more. It is possible
that the file represents a DVD rip of a Spiderman movie. It is equally
possible that the file represents a digital home video of someone's
child wearing a Spiderman costume for Halloween. Examining the actual
content (as stored on the P2P cache) would help determine which scenario
is true.

-- Rich

-----Original Message-----
From: p2pi-bounces at ietf.org [mailto:p2pi-bounces at ietf.org] On Behalf Of
Stas Khirman
Sent: Wednesday, July 16, 2008 5:55 PM
To: 'Laird Popkin'
Cc: p2pi at ietf.org
Subject: Re: [p2pi] Charter and problem statement

> -----Original Message-----
> From: Laird Popkin [mailto:laird at pando.com]
> Sent: Wednesday, July 16, 2008 11:15 AM
> 
> I agree that keeping content identity out of cache location is a good
> idea.

[Stas Khirman] 
Laird,

It seems that we agree on the key point - content identity in ALTO
["peer selection"] request will be harmful.

It will be very important to get confirmation/rejection of this
assumption from ISP's representatives.


> That being said, cache location doesn't need to be content-specific,
> because once the p2p client connects to the caches, the caches can
> determine what they do on a content-specific basis. So ALTO could (for
> example) decide which cache servers to send to a user based on the
> user's network location and protocol (e.g. Pando caches might run on
> different servers from Kontiki caches or HTTP caches, etc.). But as
> long as ALTO can return multiple cache servers (i.e. pools) the cache
> servers can manage whether to cache specific content, or where to
> assign it, transparently to ALTO, without ALTO having to know (for
> cache location) what content is assigned where. This is good for the
> caches, as it gives them maximum flexibility, without having to
> implement a complex state-change mechanism between the cache servers
> and the ALTO server.
> 
[Stas Khirman] 
Let's review a few use cases to decide whether ALTO can locate an
optimal cache without content identity.

1.) Cache implemented as a single virtual peer or as a set of peers with
the same cached content. Certainly, no content knowledge is needed to
locate the most appropriate server. However, such an architecture
doesn't scale.

2.) Cache implemented as a distributed set of servers, where given
content may be located on some subset of nodes. If ALTO returns an
[ordered] list of all known servers, it will be up to the client
application to poll them to discover which one has the appropriate
content. Sometimes this will work, but with a large list it takes
significant time, which is especially harmful for applications that use
the cache for bootstrapped playback.

3.) Cache implemented as a master-slave hierarchy. ALTO returns a
pointer to a master cache (cache director). Upon client connection, the
master redirects it to a slave cache that has the appropriate content.
However, the "master" does not have enough knowledge to pick the "ALTO
optimized" slave server from the set of appropriate ones (I assume that
popular content may be present in multiple cache locations). The master
can resolve this issue with a second ALTO lookup, but that is no
different from the separate "discovery" and "optimization" calls that I
suggested.

4.) Cache service implemented by a third-party CDN (I assume you
referred to this specific case). In this case the decision about which
cache to use is under the control of another party (the CDN owner), who
may have an "optimization" policy quite different from that of the
originating ISP (thus diminishing the value of the ISP's own ALTO
service). It is probably extreme speculation, but in theory a CDN
operated by a Tier-1 ISP may use a "selection" policy that increases its
own profit but does not decrease the cost to the originating ISP.
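
The separate "discovery" and "optimization" calls mentioned in case 3 can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names discover and optimize, the holdings and cost tables, and the cache names are all hypothetical, and nothing here is a defined ALTO interface. The sketch shows the content-aware step staying with the cache director, while the ALTO step ranks candidates without ever seeing a content ID.

```python
from typing import Dict, List, Set


def discover(content_id: str, holdings: Dict[str, Set[str]]) -> List[str]:
    """Discovery (cache director): which caches hold this content?

    This is the only step that sees the content ID.
    """
    return sorted(cache for cache, ids in holdings.items() if content_id in ids)


def optimize(candidates: List[str], cost: Dict[str, int]) -> List[str]:
    """Optimization (ALTO): order candidates by cost for this client.

    Content-blind: it receives cache names only. Caches with no known
    cost sort last.
    """
    return sorted(candidates, key=lambda c: cost.get(c, float("inf")))


# Hypothetical data: which cache holds which content, and the ISP's
# cost of reaching each cache from one client's network location.
holdings = {"cache-a": {"x1"}, "cache-b": {"x1", "x2"}, "cache-c": {"x2"}}
cost = {"cache-a": 30, "cache-b": 10, "cache-c": 5}

ranked = optimize(discover("x1", holdings), cost)
print(ranked)  # ['cache-b', 'cache-a']
```

Note that cache-c has the lowest cost but is never returned for "x1", because it does not hold that content; this is the gap between cases 2 and 3 above.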




_______________________________________________
p2pi mailing list
p2pi at ietf.org
https://www.ietf.org/mailman/listinfo/p2pi



Note: Messages sent to this list are the opinions of the senders and do not imply endorsement by the IETF.