Re: [p2pi] draft-livingood-woundy-p4p-experiences-02 posted

Hi All,
To explain a bit of the test methodology, downloaders were globally distributed, and every downloader was assigned randomly to one of the five swarms at the time that they started the download. The result of this is that while there were quite comparable numbers of downloaders in each swarm globally, the number of downloaders within a given ISP for each swarm has random variation. For example, one ISP might have a few percent more downloaders of the 'random' swarm while another ISP might have a few percent more downloaders of the 'P4P generic weight matrix' swarm. As with any statistical analysis of real world data, you should expect some random variation based on sample size. In particular, within Comcast there happened to be a few percent more 'random' downloaders than for the guided swarms.

To be more precise, I just analyzed whether the random number generator did a good job of assigning downloaders to swarms. Consider the assignment of Comcast downloaders to different swarms. Let's focus on one particular swarm, say Generic. Under uniform random assignment, a downloader from Comcast is assigned to Generic with probability 1/5. Define random variable X1 as an indicator that the first downloader from Comcast is assigned to Generic, X2 as an indicator that the second Comcast downloader is assigned to Generic, and so on. Then:

Xi = 1 with prob. 1/5; 0 with prob. 4/5.

Define SN = X1 + X2 + ... + XN, where N is the number of downloaders from Comcast. Then SN is the number of Comcast downloaders assigned to Generic.

Applying the Central Limit Theorem (each Xi has mean 1/5 and standard deviation sqrt(1/5 * 4/5) = 0.4),

       SN - N/5
z = ---------------
     0.4 * sqrt(N)

should approximately follow the standard normal distribution.
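
As a minimal sketch of the calculation above: the function name and the counts below are hypothetical and are not the trial's data. Only the 1/5 assignment probability and the normal approximation come from the text.

```python
import math

def assignment_z_score(assigned: int, total: int, p: float = 0.2) -> float:
    """z-score of the number of downloaders assigned to one swarm,
    assuming each downloader is assigned independently with probability p."""
    mean = p * total                         # expected count, N/5 when p = 1/5
    std = math.sqrt(total * p * (1.0 - p))   # sqrt(N) * 0.4 when p = 1/5
    return (assigned - mean) / std

# Hypothetical counts for illustration only (not from the trial logs).
n_comcast = 10_000   # N: Comcast downloaders across all five swarms
n_generic = 1_960    # SN: Comcast downloaders assigned to Generic

z = assignment_z_score(n_generic, n_comcast)
print(f"z = {z:.2f}")
```

A value of z between -1.96 and 1.96 would be consistent with uniform assignment at the 5% level.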

I calculated the z scores for fine, coarse, generic, and random:

Z_fine = 0.69
Z_coarse = 1.27
Z_generic = -0.81
Z_random = -1.43

It looks like the assignment algorithm did a decent job of assigning downloaders from Comcast to the different swarms (under the standard normal distribution, 95% of z values fall between -1.96 and 1.96). Of course, the preceding is meant to give intuition, and we can write a more rigorous statistical hypothesis test. (BTW, there is a typo in the previous email: Random had the smallest number of downloaders from Comcast assigned to it.) We did observe a higher cancellation rate for Random (3.17%) than for the others (1.77%-2.22%), but this is just one sample. We can derive more samples by reprocessing the testing logs by day, calculating the cancellation rate for each day to obtain a series, and then conducting a statistical test of whether the guided swarms have a lower cancellation rate than the series for Random. But if the cancellation rate is of real interest, we may want to conduct more experiments to evaluate it.
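
One possible way to run the per-day comparison is a paired sign test, sketched below. This is an assumption about which test could be used, not a description of what was actually done, and the daily rates are made-up placeholders rather than values from the logs.

```python
from math import comb

def sign_test_p_value(random_rates, guided_rates):
    """One-sided paired sign test.

    Null hypothesis: on any given day, Random is equally likely to have a
    higher or a lower cancellation rate than the guided swarm.
    Returns P(at least the observed number of 'Random higher' days).
    """
    wins = sum(r > g for r, g in zip(random_rates, guided_rates))
    losses = sum(r < g for r, g in zip(random_rates, guided_rates))
    n = wins + losses  # days with equal rates are dropped
    return sum(comb(n, k) for k in range(wins, n + 1)) / 2 ** n

# Hypothetical daily cancellation rates in percent (illustration only).
random_daily = [3.4, 2.9, 3.1, 3.6, 2.8, 3.3, 3.0, 3.2]
guided_daily = [2.1, 2.3, 1.9, 2.2, 2.9, 2.0, 1.8, 2.4]

p = sign_test_p_value(random_daily, guided_daily)
print(f"one-sided p-value = {p:.4f}")
```

The sign test uses only the direction of each daily difference, so it makes no assumption about how the daily rates are distributed. A paired t-test or a two-proportion test on the raw counts would be alternatives if their assumptions hold.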

Richard
Looking at the numbers of downloads and their completion rates, all of the swarms had completion rates that were extremely high, with the guided (P4P and PNA) having a higher completion rate (about 98%) and the unguided (random) downloads having a slightly lower completion rate (about 97%).

The standard technique for dealing with random samples is to look at percentages rather than absolute values. That is, instead of looking at the number of bytes each swarm downloaded from external sources (for example), it's better to compare the percentage of downloaded data that came from external sources. It's not particularly meaningful that one swarm had a few percent more downloaders than another within one ISP, but comparing cancellation rates, or the ratio of internal to external download volumes, is rather illuminating.
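
A small sketch of that normalization, with hypothetical byte counts and swarm names (none of these figures come from the trial):

```python
def external_share(internal_mb: float, external_mb: float) -> float:
    """Fraction of downloaded data that came from outside the ISP."""
    total = internal_mb + external_mb
    return external_mb / total if total else 0.0

# Hypothetical (internal MB, external MB) per swarm within one ISP.
# The totals differ between swarms, which is why shares are compared.
swarms = {
    "random": (6_000, 94_000),
    "p4p_generic": (30_000, 75_000),
}

shares = {name: external_share(i, e) for name, (i, e) in swarms.items()}
for name, share in shares.items():
    print(f"{name}: {share:.1%} of downloaded data was external")
```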

- Laird Popkin, CTO, Pando Networks
  mobile: 646/465-0570

----- Original Message -----
From: "Y. Richard Yang" <yry at cs.yale.edu>
To: "Robb Topolski" <robb at funchords.com>
Cc: p2pi at ietf.org
Sent: Sunday, November 9, 2008 4:57:32 PM (GMT-0500) America/New_York
Subject: Re: [p2pi] draft-livingood-woundy-p4p-experiences-02 posted

Hi Robb,

Thanks for the excellent comments. Please see below.

Robb Topolski wrote:
It seems like a broken statistic given this experiment.

If it were tightly controlled (and thus less Internet-like), all of
the downloads would have completed and the download byte amounts would
be virtually the same.

Welcome to the real world :-) Large-scale and yet tightly controlled experiments are hard to conduct (if you have an idea or are interested in more discussion, that would be fantastic). We did quite a few controlled experiments using clients on PlanetLab, but we could get only a couple hundred users if we were lucky. Nor are tightly controlled experiments necessarily desirable, because then we would not know what we were missing (your example of P2P streaming is an excellent one). One objective of real experiments is unexpected discovery. Sometimes the discoveries are pleasant, and sometimes they are not. Download aborts were one unexpected discovery. But given the small fraction, I think they do not change the reported results.

For full disclosure, another unexpected user behavior we discovered during our log processing was that users sometimes paused their download process (e.g., put the machines to sleep). We spent quite a lot of time on this issue. Unfortunately, the logs we had did not capture such events. We were careful to use statistically more robust metrics (e.g., the download rates at different percentiles instead of the average). We also tried to check the impact of such events. For example, we recomputed the statistics after filtering to make sure they did not change much. One filter we applied was that if a user had no activity for too long (we tried two minutes and some other thresholds), we flagged that the user might have paused, and we ignored that user's data. We did not see a large change in the statistics we computed before and after such filtering.
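
The filtering described above might look roughly like the following sketch. The log layout, user names, and rates are hypothetical; only the two-minute inactivity threshold and the use of percentiles instead of the average come from the text.

```python
from statistics import quantiles

PAUSE_GAP_S = 120  # two minutes; other thresholds were also tried

def looks_paused(timestamps, max_gap=PAUSE_GAP_S):
    """True if any gap between consecutive activity records exceeds max_gap."""
    ts = sorted(timestamps)
    return any(later - earlier > max_gap for earlier, later in zip(ts, ts[1:]))

def rate_quartiles(rates):
    """25th, 50th and 75th percentiles of download rates."""
    q25, q50, q75 = quantiles(rates, n=4, method="inclusive")
    return {"p25": q25, "p50": q50, "p75": q75}

# Hypothetical log: user -> (activity timestamps in seconds, rate in kB/s).
logs = {
    "a": ([0, 30, 60, 90], 400.0),
    "b": ([0, 45, 400, 430], 35.0),   # 355 s gap: flagged as possibly paused
    "c": ([0, 20, 50, 110], 250.0),
    "d": ([0, 60, 120, 180], 300.0),
    "e": ([0, 100, 200, 300], 150.0),
}

all_rates = [rate for _, rate in logs.values()]
kept_rates = [rate for ts, rate in logs.values() if not looks_paused(ts)]

before = rate_quartiles(all_rates)
after = rate_quartiles(kept_rates)
print("before filtering:", before)
print("after filtering: ", after)
```

Comparing the two sets of percentiles shows how sensitive the reported statistics are to users who may have paused.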

Of course, we learn from such experiences. In our current experiments, we are designing a new log format to try to capture more events. We might be surprised again and go back to log more. We will be happy to share and work with others on robust experiment design and statistics collection.
One might argue that there is virtue to these observations because a
downloader who cancels a slow download is a bad thing, because they
ultimately did not get what they attempted to download.

But is there virtue in an aborted download?  If they canceled the
download in frustration or because they simply couldn't stay connected
to the swarm long enough before being forced to quit by some other
personal obligation, then one could say that an aborted download is
a bad thing. OTOH, an incomplete download is a good thing when a
downloader may have changed his mind (wanted to hear/see/do something
else) with less bandwidth wasted. There's just no way to know why the
user aborted the download.

I agree it would be fantastic if we could know the reason for an aborted download. One thought that came to mind immediately is a pop-up window after a download abort asking the user for the reason(s). I do not know of any P2P clients doing this right now, and I am not sure how many users would respond (e.g., rather than just clicking cancel). But it is an excellent suggestion! We will certainly talk to the P2P developers we are working with to see if we can add such an option to one of our log plug-ins.
Even if we did know, this experiment doesn't deal with other possible
P4P-advantaged uses such as streaming-P2P video delivery (a mode more
prone to user "taste-testing" before completing a download than
traditional file-transfer models).

Absolutely good comment. This is why we are focusing on P2P streaming, where channel hopping is a common user behavior (http://ccr.sigcomm.org/online/?q=node/404), and startup delay is a major performance metric.

Hope to hear more such good comments!

Richard
Robb

2008/11/6 Ye WANG <wangye.thu at gmail.com>:
Hi Haibin,

Yes, the Random swarm has notably fewer finished downloads than Generic or
Coarse Grained during the period.

Since the swarm sizes (# downloading peers) are roughly equal across all
five swarms (Richard explained this in detail), we suspect that a portion
of slow peers terminate/discard their downloads.  This is almost the same
hypothesis pointed out by Rich Woundy.

Another piece of evidence is that we do notice significantly slower peers in
the Random swarm, e.g., the slowest Random peer took 7268s (>2 hours) to
download the video, but the slowest Generic peer took 2725s (<1 hour) and the
slowest Coarse Grained peer took 3114s (<1 hour).  Hours of downloading may
make users impatient.  If the "tail" peers in the Random swarm suffer much
lower download rates, presumably the number of "terminated" peers is larger
in the Random swarm.

On Thu, Nov 6, 2008 at 4:09 AM, Song Haibin <melodysong at huawei.com> wrote:
Hi Richard and all,

The access download of each swarm should be equal to the sum of those
downloaded by the clients in each swarm. So if the number of downloads in
each swarm is the same and the amount downloaded is the same, then each
swarm should have the same access download.
Song Haibin: From section 4.1, we can see that "The results of the trial
indicated that P4P can improve the speed of downloads to P2P clients", so if
the statistics were collected during a fixed period (from July 2 to
July 17, 2008), then the guided swarms will have downloaded more than the
random swarm. I don't think each swarm downloaded the same amount of chunk
files during the statistics period.
Best Regards,
Song Haibin
Email: melodysong at huawei.com
Skype: alexsonghw



-----Original Message-----
From: p2pi-bounces at ietf.org [mailto:p2pi-bounces at ietf.org] On Behalf Of Y. R. Yang
Sent: Thursday, November 06, 2008 10:40 AM
To: Woundy, Richard
Cc: p2pi at ietf.org; Livingood, Jason
Subject: Re: [p2pi] draft-livingood-woundy-p4p-experiences-02 posted


Hi Rich and others,

The access download of each swarm should be equal to the sum of those
downloaded by the clients in each swarm. So if the number of downloads in
each swarm is the same and the amount downloaded is the same, then each
swarm should have the same access download.

First look at the amount downloaded. There can be some differences due
to duplicated chunks and the way we detected data chunks (there are also
control data in the logs), but the difference appears to be small.

Now let's look at the number of downloads. During the test, each client is
uniformly assigned to a swarm. Given the large number of clients, each
swarm should have about the same number of clients. But there can be two
factors for us to see different numbers of *reported* downloads: (1) some
clients are old and may not report or the reporting of logs was not
successful; and (2) different # of clients finished downloading (if a
client does not finish downloading, it does not report. Laird, please
correct me if I am wrong). I believe the first factor should be small due
to uniform random assignment of peers to swarms.

I just looked at the data available. We did detect a smaller number of
finished downloads with Random than with the P4P swarms. For example, from
July 3 to July 10, the detected # of finished downloads of Generic is about 5%
more than Random, and Coarse is 7% more than Random. From July 10 to
July 17, Generic is 10.5% more than Random, and Coarse is 4.7% more.
Looking at the traffic volume in Section 4.2, I see that Generic is about
7% higher, and Coarse is about 8.5% higher. Note that the # of finished
downloads and the volume differ due to duplicated chunks and missing
logs.

So I would like to support Rich's theory/guess that some users
terminated the download prematurely and that faster downloads may result in
fewer such terminations. But the difference may also include the effects of
(1) differences in initial assignment due to random numbers; and (2) the # of
finished but non-reporting clients.

If you have any other suggestions, we will be more than happy to look
further into the available data.

Richard

On Wed, 5 Nov 2008, 6:26pm -0500, Woundy, Richard wrote:

My current theory/guess is that some users may have terminated the download
prematurely, e.g. due to user impatience. So faster downloads (e.g. thanks
to P4P) may result in fewer user terminations.



Laird is checking the data to see if we can confirm that, or find
another explanation.



-- Rich



________________________________

From: Laird Popkin [mailto:laird at pando.com]
Sent: Wednesday, November 05, 2008 6:12 PM
To: Robb Topolski
Cc: Livingood, Jason; p2pi at ietf.org; Woundy, Richard
Subject: Re: [p2pi] draft-livingood-woundy-p4p-experiences-02 posted



That's a good question, and Richard and I spoke about this yesterday.
I'll be looking into the data to see what the cause is.

- Laird Popkin, CTO, Pando Networks
  mobile: 646/465-0570

----- Original Message -----
From: "Robb Topolski" <robb at funchords.com>
To: "Richard Woundy" <Richard_Woundy at cable.comcast.com>
Cc: "Jason Livingood" <Jason_Livingood at cable.comcast.com>, p2pi at ietf.org
Sent: Wednesday, November 5, 2008 5:42:51 PM (GMT-0500) America/New_York
Subject: Re: [p2pi] draft-livingood-woundy-p4p-experiences-02 posted

I don't get the part where access network download consumption increased
as a result of using P4P (section 4.2).  Can someone explain how that
could happen?

Robb

On Wed, Nov 5, 2008 at 2:29 PM, Woundy, Richard
<Richard_Woundy at cable.comcast.com> wrote:

Reinaldo,

I can answer the easy questions. We will need some assistance from Pando
(and Yale) for some of the other ones.


What was the file size in those experiments?
21 megabytes. From section 2: "Pando distributed a special 21 MB
licensed video file as in order to measure the effectiveness of P4P
iTrackers."


How long would it take to download the file in the three different
scenarios? I know that more consumed bandwidth in access might lead one
to conclude that file was downloaded faster...

To clarify, most of the raw data (download speed and Internet
peering/transit traffic volumes) were collected by Pando Networks from
their P2P clients, not collected by Comcast across its links. So my
assumption is that the Pando client used the content size (21 MB), and
divided by the download time to get the speed.
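
As a rough illustration of that assumed calculation (not a description of what the Pando client actually does), the sketch below applies it to the slowest-peer download times reported elsewhere in this thread. The function name is hypothetical, and 1 MB is taken to be 10^6 bytes.

```python
FILE_SIZE_MB = 21  # content size from section 2 of the draft

def avg_rate_kbps(download_seconds: float, size_mb: float = FILE_SIZE_MB) -> float:
    """Average download rate in kilobits per second: size divided by time.

    Assumes 1 MB = 10**6 bytes, so 1 MB = 8000 kilobits.
    """
    return size_mb * 8_000 / download_seconds

# Slowest-peer download times (seconds) quoted in this thread.
slowest_times = {"random": 7268, "generic": 2725, "coarse": 3114}

rates = {swarm: avg_rate_kbps(t) for swarm, t in slowest_times.items()}
for swarm, rate in rates.items():
    print(f"slowest {swarm} peer: about {rate:.0f} kbit/s on average")
```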


Was the file already seeded in Comcast's network? More specifically, how
was file propagation done?

Any seeding happened outside of Comcast's network, and outside of
Comcast's control. That's really a question for Pando.


Was PEX, DHT and others enabled in the clients?
Pando would know whether PEX was enabled. It would be safe to assume
that with respect to this trial, DHT was NOT enabled, since Pando
supplied the tracker. (The pTracker in the draft is a tracker operated
by Pando.)


Was local peer discovery enabled in the clients?
Pando would know.


BTW, can broadcast/multicast peer discovery work in Cable networks?
Do you mean something like this:
http://bittorrent.org/beps/bep_0026.html?

If so, peer discovery probably would not work over the typical last mile
cable network. Maybe I'm wrong, but I see this protocol as intended for
peer discovery within one's home network / LAN / WiFi network, not over
a cable network.


So, were clients allowed to become seeders to the outside of Comcast's
network?

Yes, they were.

As a related item, look closely at section 4.2. The amount of aggregate
uploaded data from Comcast clients (per swarm) was about 140,000 MB. The
amount of aggregate downloaded data from Comcast clients (per swarm) was
about 60,000 MB or so. So the typical Comcast client uploaded more than
twice the amount of data that it downloaded.


How much of the swarm was within Comcast and outside?
Most of the swarm was outside of Comcast. Unfortunately I don't have
access to the size of the global swarm, but I would guess that Comcast
clients represented no more than 15% of the swarm, and maybe as little
as 5%. Those guesses are based on the behavior of the random swarm, e.g.
Comcast clients uploaded to non-Comcast clients 94% of the time in the
random swarm.
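
The back-of-the-envelope arithmetic behind these figures can be sketched as follows. The share estimate rests on an assumption that is not stated in the draft: in the random swarm, upload targets are picked roughly uniformly across the whole swarm, so the fraction of uploads going to Comcast peers approximates Comcast's share of the swarm.

```python
# Aggregate per-swarm figures for Comcast clients, as quoted above (MB).
uploaded_mb = 140_000
downloaded_mb = 60_000

# Upload-to-download ratio for the typical Comcast client.
upload_download_ratio = uploaded_mb / downloaded_mb

# Fraction of Comcast uploads that went to non-Comcast peers (random swarm).
non_comcast_upload_fraction = 0.94

# Assumption: uniform peer selection, so the remaining fraction of uploads
# approximates Comcast's share of the swarm.
estimated_comcast_share = 1.0 - non_comcast_upload_fraction

print(f"upload/download ratio: {upload_download_ratio:.2f}")
print(f"estimated Comcast share of swarm: {estimated_comcast_share:.0%}")
```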

-- Rich


-----Original Message-----
From: p2pi-bounces at ietf.org [mailto:p2pi-bounces at ietf.org] On Behalf Of Reinaldo Penno
Sent: Wednesday, November 05, 2008 11:23 AM
To: Livingood, Jason; p2pi at ietf.org
Subject: Re: [p2pi] draft-livingood-woundy-p4p-experiences-02 posted

Hello Jason/Rich,

This is such an interesting draft. I'm surprised there are no questions
about it. Maybe everybody else is part of P4P one way or another and I'm not
in the 'in' crowd (;-) so I have questions.

* What was the file size in those experiments? Some post long ago said the
file size in some P4P experiments was really small, as opposed to the top
100 torrents where the file size is ~1Gb. I was curious what the
optimization payback is in terms of download time for large files as
opposed to small files.

* How long would it take to download the file in the three different
scenarios? I know that more consumed bandwidth in access might lead one to
conclude that file was downloaded faster but I'm not sure this is a
straightforward conclusion.

* Was the file already seeded in Comcast's network? More specifically, how
was file propagation done? Did all clients start from scratch and have to
start pulling the file from some other side of the world and then exchange
pieces? This is mainly due to the discussion in 4.2.

* Was PEX, DHT and others enabled in the clients?

* Was local peer discovery enabled in the clients? BTW, can
broadcast/multicast peer discovery work in Cable networks?

* If more clients finish downloading faster and become seeders you would
think that for popular content Comcast's upstream bandwidth would increase
due to the number of seeders in its network. So, were clients allowed to
become seeders to the outside of Comcast's network? How much of the swarm
was within Comcast and outside?

Thanks,

Reinaldo

On 11/3/08 12:49 PM, "Livingood, Jason" <Jason_Livingood at cable.comcast.com> wrote:

For some reason the URL was cut to two lines - trying again:


http://www.ietf.org/internet-drafts/draft-livingood-woundy-p4p-experiences-02.txt



-----Original Message-----
From: p2pi-bounces at ietf.org [mailto:p2pi-bounces at ietf.org] On
Behalf Of Livingood, Jason
Sent: Monday, November 03, 2008 3:35 PM
To: p2pi at ietf.org
Subject: [p2pi] draft-livingood-woundy-p4p-experiences-02 posted

A draft at
http://www.ietf.org/internet-drafts/draft-livingood-woundy-p4p
-experienc
es-02.txt may be of interest to folks that have been
interested in P2Pi and ALTO.  We have requested time on the
ALTO agenda at IETF 73 to present this.

Regards
Jason
_______________________________________________
p2pi mailing list
p2pi at ietf.org
https://www.ietf.org/mailman/listinfo/p2pi

--
Robb Topolski (robb at funchords.com)
Hillsboro, Oregon USA
http://www.funchords.com/

Note: Messages sent to this list are the opinions of the senders and do not imply endorsement by the IETF.