Re: [p2pi] Content Identifiers sent to ALTO server?
Laird Popkin <laird at pando.com> wrote:
>
>> I see two very different use cases for what we're discussing here.
>>
>> 1. Client has a list of IP addresses, and wants to know which to
>> try first;
>>
>> 2. Client doesn't have a list of IP addresses, but has a Content
>> Identifier, and wants an ordered list of IP addresses to try.
>
> I don't think that either of the scenarios that you list is particularly
> likely for p2p swarm file downloading.
I'm perfectly happy for "client" to mean the tracker itself, BTW.
> I'll explain why, then outline the model of ALTO that seems most
> valuable for that application.
>
> 1. If the client has a list of IP addresses, the list is short enough
> that it can test throughput with each of them all fairly quickly,
> so the value of ALTO is low in terms of optimizing p2p performance.
There's no functional problem with such a list exceeding 100 IPs.
Many ISPs consider 100 simultaneous TCP connections abusive. (Unless
they actually overlap, I don't see how they can be tried "quickly".)
> This scenario makes sense for some other applications (e.g. finding
> a good network relay, or optimizing small swarm downloads).
Exactly. We should not design something useful only for p2p swarm.
> 2) I think that it's safe to assume that p2p networks will never send
> a list of all IP addresses associated with a particular content
> swarm to an ISP's ALTO server. This raises obvious business,
> privacy and legal issues, which are at best quite concerning.
Nonetheless, privacy is compromised as soon as they send it to a
potential user. More likely, they would send an incomplete list,
possibly adding non-swarm IPs to obscure things.
> In addition, putting the ALTO protocol in the middle of a peer request
> creates unacceptable operational problems for the p2p network that
> would be far worse than any optimization that ALTO could provide.
I don't follow you. I see neither "unacceptable operational problems"
nor a lack of potential optimization to outweigh them.
> Let's compare a standard BitTorrent "tracker announce", a "tracker
> announce with ALTO IP lookup" and a "tracker announce with ALTO
> guidance" for a peer in a swarm of 10,000 peers.
Please do...
> standard: peer announces, Tracker responds from list in memory (one
> MTU each way, UDP). Fast, reliable in-memory operation.
One Round-Trip-Time; but client must try the IPs in turn. Often it
will go through several TCP connections on paths the ISP would prefer
weren't used.
> ALTO IP list: In this model, information sent into or out of ALTO
> is lists of IP addresses. Peer announces. Tracker sends request,
> with the peer address and a complete list of the 10,000 peers in
> the swarm, to the ALTO server.
I doubt they'd send all 10,000, but if they did we're talking no
more than tens of milliseconds, still all from memory.
> ALTO server computes optimal list and returns 50 peers.
50 seems enough for an end-user to actually try.
> Tracker returns list to peer. Tracker does this (send swarm peer list,
> receive ALTO peer list) 2,000 times per second.
I _really_ doubt they'd send 10,000 IPs 2,000 times a second; but
that's for them to optimize. (An obvious optimization is to determine
the AS number of the end-user.)
> This makes the p2p tracker slow and unreliable (by adding external
> communication to what is now an in-memory operation),
Any tracker that busy would no doubt cache the information useful
for ordering the list, and serve from cache rather than wait for a
response from the ALTO server if a similar IP pairing hadn't expired
from the cache.
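A minimal sketch of such a cache, in Python, purely for illustration.
It assumes that a "similar IP pairing" means both addresses fall in
the same fixed-length prefixes (a /24 here) and that each entry carries
a single numeric cost and a fixed lifetime. The class name, the prefix
length and the TTL are all hypothetical; nothing here comes from an
ALTO specification.

```python
import ipaddress
import time


class PairCostCache:
    """Hypothetical tracker-side cache of costs learned from an ALTO server.

    Entries are keyed by (source prefix, destination prefix), so a lookup
    for a "similar" pair of IPs hits the same entry until it expires.
    """

    def __init__(self, ttl_seconds=3600.0, prefix_len=24, clock=time.monotonic):
        self.ttl = ttl_seconds        # assumed lifetime of one entry
        self.prefix_len = prefix_len  # assumed granularity of "similar"
        self.clock = clock            # injectable so expiry can be tested
        self._entries = {}            # (src_net, dst_net) -> (cost, expiry)

    def _key(self, src_ip, dst_ip):
        def to_net(ip):
            # strict=False masks off the host bits of the address.
            return ipaddress.ip_network(f"{ip}/{self.prefix_len}", strict=False)
        return (to_net(src_ip), to_net(dst_ip))

    def put(self, src_ip, dst_ip, cost):
        """Store a cost for the prefix pair containing these addresses."""
        expiry = self.clock() + self.ttl
        self._entries[self._key(src_ip, dst_ip)] = (cost, expiry)

    def get(self, src_ip, dst_ip):
        """Return the cached cost, or None if absent or expired."""
        key = self._key(src_ip, dst_ip)
        entry = self._entries.get(key)
        if entry is None:
            return None
        cost, expiry = entry
        if self.clock() >= expiry:
            del self._entries[key]
            return None
        return cost
```

On a miss (None), the tracker would answer the peer from whatever
ordering it already has and refresh the entry from the ALTO server
out of band, so the announce path never blocks on an external query.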
> and places a huge operational burden on the ISP's ALTO server.
ISPs _could_ manage such a load, but certainly would not choose to
do so unless they saw a return on the investment.
> I suspect that ISPs would have problems with this as well, because
> it would effectively make their ALTO server a p2p tracker, with all that
> implies.
I don't follow...
> ALTO guidance: Tracker (updated periodically, asynchronously) receives
> optimization rules from the ALTO server.
Certainly a reasonable design.
> Peer announces.
I don't follow...
> Tracker picks optimal list from known swarm peers based on the ISP's
> rules, and returns 50 peers to announcing peer. This is slightly slower
> than a standard announce, but it's all in memory in the Tracker,
> and only imposes a low volume of ALTO communications.
> Communication is the same as the first case (one MTU each way, UDP),
> plus one ALTO message every hour (or day, etc.) to update the
> optimization rules.
The devil is in the details...
Obviously, there could be a simple rule, along the lines of "always
prefer ASs in this order". But that won't cover all situations, and
risks causing the whole service to be ignored.
A more useful rule would state that certain CIDR blocks can reach
other CIDR blocks over a path with these characteristics. While this
implies some computation on the part of the Tracker making the query,
that can easily be cached by CIDR-block-pairs.
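As a rough sketch of what the Tracker's side of that could look like
(Python, illustration only): the rule format below, a triple of source
CIDR block, destination CIDR block and a single numeric cost, is an
assumption standing in for "a path with these characteristics", and the
function names are hypothetical. The most specific matching rule wins,
and peers with no matching rule sort last.

```python
import ipaddress


def parse_rules(raw_rules):
    """Turn (src_cidr, dst_cidr, cost) string triples into network objects."""
    return [
        (ipaddress.ip_network(src), ipaddress.ip_network(dst), cost)
        for src, dst, cost in raw_rules
    ]


def path_cost(rules, src_ip, dst_ip, default=float("inf")):
    """Cost of the most specific rule covering src_ip -> dst_ip."""
    src = ipaddress.ip_address(src_ip)
    dst = ipaddress.ip_address(dst_ip)
    best = None  # (specificity, cost) of the best match so far
    for src_net, dst_net, cost in rules:
        if src in src_net and dst in dst_net:
            # Longer combined prefix length means a more specific rule.
            specificity = src_net.prefixlen + dst_net.prefixlen
            if best is None or specificity > best[0]:
                best = (specificity, cost)
    return default if best is None else best[1]


def order_peers(rules, requester_ip, peer_ips, limit=50):
    """Return up to `limit` peers, lowest path cost first."""
    ranked = sorted(peer_ips, key=lambda ip: path_cost(rules, requester_ip, ip))
    return ranked[:limit]


rules = parse_rules([
    ("192.0.2.0/24", "192.0.2.0/24", 1),      # same block
    ("192.0.2.0/24", "198.51.100.0/24", 5),   # a nearby block
    ("192.0.2.0/24", "0.0.0.0/0", 10),        # everything else
])
swarm = ["203.0.113.7", "198.51.100.3", "192.0.2.99"]
ordered = order_peers(rules, "192.0.2.10", swarm)
```

Since the result of path_cost depends only on which blocks the two
addresses fall in, it can be memoized per CIDR-block pair rather than
recomputed for every announce.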
So, you have made a case for returning CIDR length on ALTO replies.
I agree that looks like a good idea.
I didn't mention caching, because it seemed a local optimization.
I still don't think it _needs_ to be in the spec, but it very likely
deserves to be mentioned.
(And please don't forget that ALTO will be used for more than just
p2p swarms.)
--
John Leslie <john at jlc.net>