an ISP might want to implement policies in their ALTO server such as:
- Don't cache consumer generated content (requires being able to
  differentiate between CGC and commercial content).
- Provide different guidance to different applications (requires
  knowledge of the p2p application).
- Provide guidance based on distribution of that content (requires the
  ID of that specific content).
It's a good point that once content is cached the ISP knows a lot about
the content and its distribution. In particular, since it's participating
in the p2p network, it knows the IP addresses of the people exchanging
the content and whatever metadata the protocol exposes about the content
(e.g. size, number of files, names). And for p2p networks that don't
secure content on caches (Pando does, but some others do not), the ISP
has the content on the cache server and can inspect it directly. But this
is (IMO, IANAL) similar to the case of an HTTP cache or an email server:
the ISP is providing a service, and privacy of use of that service is
protected by policy, not by technology.

Given all of that, ALTO can certainly be applied to content that isn't
cached, and it should preserve privacy for that content. :-)
- Laird Popkin, CTO, Pando Networks
mobile: 646/465-0570
----- Original Message -----
From: "Richard Woundy" <Richard_Woundy at cable.comcast.com>
To: "Stas Khirman" <stas at khirman.com>, "Laird Popkin" <laird at pando.com>
Cc: p2pi at ietf.org
Sent: Thursday, July 17, 2008 10:34:48 PM (GMT-0500) America/New_York
Subject: RE: [p2pi] Charter and problem statement
It seems that we agree on the key point - content identity in ALTO ["peer
selection"] request will be harmful.
It will be very important to get confirmation/rejection of this
assumption from ISP's representatives.

I've been thinking about this since yesterday, and I'm not really sure I
agree with this statement.

I definitely understand that there would likely be privacy concerns if
ALTO mandated content identification in its messages. Content identities
should be an optional parameter in the ALTO message exchange (if we can
agree that it should be kept at all).

But I would think that having the content identity could help an ALTO
server make a better recommendation, at least for P2P caches.
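
To make the "optional parameter" idea concrete, here is a minimal sketch
of a request that may or may not carry a content identity. Everything in
it is an assumption for illustration: PeerSelectionRequest,
rank_candidates and the cached_on table are hypothetical names, not part
of any ALTO specification.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class PeerSelectionRequest:
    """Hypothetical ALTO peer-selection request (not a real ALTO message)."""
    client_ip: str
    candidates: List[str]
    content_id: Optional[str] = None  # optional: omit it to preserve privacy


def rank_candidates(req: PeerSelectionRequest,
                    cached_on: Dict[str, Set[str]]) -> List[str]:
    """Return the candidates, preferring caches known to hold the content.

    cached_on maps content_id -> set of cache addresses. It is assumed to
    be known to the server only because the ISP also runs the caches.
    """
    holders = cached_on.get(req.content_id, set()) if req.content_id else set()
    # sorted() is stable, so the original order is kept within each group.
    return sorted(req.candidates, key=lambda addr: addr not in holders)
```

When content_id is omitted the candidate order is returned unchanged, so
a client that withholds the identity still gets an answer, only a less
informed one.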
Consider the case when the ISP is operating an ALTO server as well as a
P2P cache. If there isn't an issue of violating customer privacy, why
would it be undesirable to reveal the content identity to the ALTO
server, when it is necessary to reveal the content identity to the P2P
caching server?

In fact, the P2P caching server knows potentially more valuable
information than the content identity. ALTO would only know about
metadata for content, whereas the P2P cache would have a copy of the
actual content.

To illustrate, suppose that some content's metadata (e.g. torrent)
indicates that the file is "Spiderman" and not much more. It is possible
that the file represents a DVD rip of a Spiderman movie. It is equally
possible that the file represents a digital home video of someone's
child wearing a Spiderman costume for Halloween. Examining the actual
content (as stored on the P2P cache) would help determine which scenario
is true.
-- Rich
-----Original Message-----
From: p2pi-bounces at ietf.org [mailto:p2pi-bounces at ietf.org] On Behalf Of Stas Khirman
Sent: Wednesday, July 16, 2008 5:55 PM
To: 'Laird Popkin'
Cc: p2pi at ietf.org
Subject: Re: [p2pi] Charter and problem statement
-----Original Message-----
From: Laird Popkin [mailto:laird at pando.com]
Sent: Wednesday, July 16, 2008 11:15 AM
I agree that keeping content identity out of cache location is a good
idea.
[Stas Khirman]
Laird,
It seems that we agree on the key point - content identity in ALTO ["peer
selection"] request will be harmful.
It will be very important to get confirmation/rejection of this
assumption from ISP's representatives.
That being said, cache location doesn't need to be content-specific,
because once the p2p client connects to the caches, the caches can
determine what they do on a content-specific basis. So ALTO could (for
example) decide which cache servers to send to a user based on the
user's network location and protocol (e.g. Pando caches might run on
different servers from Kontiki caches or HTTP caches, etc.). But as long
as ALTO can return multiple cache servers (i.e. pools), the cache servers
can manage whether to cache specific content, or where to assign it,
transparently to ALTO, without ALTO having to know (for cache location)
what content is assigned where. This is good for the caches, as it gives
them maximum flexibility, without having to implement a complex
state-change mechanism between the cache servers and the ALTO server.
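
As a rough sketch of cache location that depends only on network
location and protocol, the lookup below returns a pool of servers and
never sees a content ID. The prefixes, host names, CACHE_POOLS table and
locate_caches function are all made up for illustration.

```python
import ipaddress

# Hypothetical pool table: (client prefix, protocol) -> cache servers.
CACHE_POOLS = {
    ("10.1.0.0/16", "pando"): ["pando-cache-a.example", "pando-cache-b.example"],
    ("10.1.0.0/16", "http"): ["http-cache-a.example"],
    ("10.2.0.0/16", "pando"): ["pando-cache-c.example"],
}


def locate_caches(client_ip, protocol):
    """Return the cache pool for a client; no content ID is involved."""
    addr = ipaddress.ip_address(client_ip)
    for (prefix, proto), pool in CACHE_POOLS.items():
        if proto == protocol and addr in ipaddress.ip_network(prefix):
            return list(pool)
    return []  # no pool known for this location/protocol pair
```

Which server in the returned pool actually holds or accepts a given
piece of content is left to the caches themselves, as described above.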
[Stas Khirman]
Let's review a few usage cases to decide whether ALTO can locate an
optimal cache without content identity.
1.) Cache implemented as a single virtual peer, or as a set of peers with
the same cached content. Certainly, no content knowledge is needed to
locate the most appropriate server. However, such an architecture doesn't
scale.
2.) Cache implemented as a distributed set of servers, where given
content may be located on some subset of the nodes. If ALTO returns an
[ordered] list of all known servers, it will be up to the client
application to poll them to discover which one has the appropriate
content. Sometimes this will work, but with a large list it takes
significant time, which is especially harmful for applications that use
the cache for bootstrapped playback.
3.) Cache implemented as a master-slave hierarchy. ALTO returns a pointer
to a master cache (cache director). Upon client connection, the master
redirects the client to a slave cache that has the appropriate content.
However, the "master" does not have enough knowledge to pick the "ALTO
optimized" slave server from the set of appropriate ones (I assume that
popular content may be present in multiple cache locations). The master
can resolve this issue with a second ALTO lookup, but that is no
different from the separate "discovery" and "optimization" calls I
suggested.
4.) Cache service implemented by a third-party CDN (I assume you referred
to this specific case). In this case the decision about which cache to
use is under the control of another party (the CDN owner), who may have
an "optimization" policy quite different from that of the originating ISP
(thus diminishing the value of the ISP's own ALTO service). It is
probably extreme speculation, but in theory a CDN operated by a Tier-1
ISP may use a "selection" policy that increases its own profit but does
not decrease the cost to the originating ISP.
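
The split into separate "discovery" and "optimization" calls mentioned
in case 3 could look roughly like the sketch below. It is only an
illustration under assumed data structures: discover, optimize,
cache_index and cost are hypothetical names, and the cost values stand
in for whatever metric an ALTO server would actually use.

```python
def discover(content_id, cache_index):
    """Discovery step (e.g. done by a cache director): which caches hold it.

    cache_index maps cache address -> set of content IDs (hypothetical).
    """
    return [cache for cache, held in cache_index.items() if content_id in held]


def optimize(candidates, cost):
    """Optimization step (ALTO-style): order candidates by network cost.

    Only addresses are passed in, so this step never sees the content ID.
    Caches with no known cost sort last.
    """
    return sorted(candidates, key=lambda c: cost.get(c, float("inf")))
```

Here the content identity is used only in the discovery step; the
optimization step ranks whatever candidate list it is handed.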