Here's what I captured...
Spencer
3.1 Welcome/Note Well/Intro Slides
Goals:
· To think about the Comcast/BitTorrent problem.
· To think about what the IETF might do to work on the problem.
· Focus on what’s actionable for the IETF: engineering, not research or politics.
“Not dealing with
illegal traffic” – Cullen asked us to focus on the problems that we
still have when all the traffic is legal.
3.2 Service Provider Perspective (Comcast)
Presenters are Jason Livingood and Richard Woundy from Comcast (including
attributions because they’re presenting a sponsor’s perspective).
View that this is a
two-way street – applications becoming more network-friendly, networks
becoming more application-friendly.
Comcast has committed to
implement a protocol-agnostic network management technique (by yearend).
Comcast was managing
Vonage customer traffic, and customers were complaining. Not all the complaints
reflected this management (Comcast and Vonage both experienced problems with
transit providers), but that wasn’t obvious to customers.
Most ISPs started
managing P2P traffic about 2-3 years ago. When Comcast started managing
its traffic, they stopped getting complaints from Vonage and from P2P users.
The recent concerns have involved a suspicion that Comcast was degrading Vonage
VoIP service in an attempt to favor Comcast’s own VoIP products.
Current DOCSIS can
experience congestion independently on upstream and downstream and experience
congestion independently across DOCSIS domains.
We talked about some
DOCSIS details, but remember that we have to solve problems for a broad class
of access networks, not just for DOCSIS.
Slide point: “How
‘TCP flow fairness’ applies to the DOCSIS upstream is an open
research question”.
DOCSIS also has
constant-bandwidth “service flows” (so you don’t need to
request-to-send every packet). A Canadian cable company charges extra to set up
a second service flow for Vonage, but other cable companies haven’t
followed this lead because of concerns about “net neutrality”.
We did note that if you
have enough bandwidth, you don’t need to prioritize traffic, but this
doesn’t help when traffic levels increase suddenly (“network
capacity increases are not instantaneous”), and capacity increases can be
consumed quickly.
Not all customers are
similar (even at the DOCSIS domain level) – college populations use more
P2P, VoIP/streaming video are more diurnal than bulk file distribution.
Comcast would like to
avoid protocol-specific “cat and mouse” detection and mitigation,
and would like to avoid capacity increases that benefit only 5 percent of heavy
users.
DOCSIS 3.0 adds capacity
but doesn’t solve the capacity problem. Comcast will deploy DOCSIS 3.0 in
20 percent of network by yearend. This more than doubles bandwidth available
per subscriber, but the conflict is between adding capacity and supporting
applications that are designed to maximize bulk bandwidth consumption.
There are Internet2
techniques for client-to-network signaling that haven’t been standardized
in IETF yet…
Caching content in the
network can reduce TCP RTTs and reduce the peak-to-trough differences of
individual flows.
Network management trials
begin in June – management based on traffic characteristics, not on
specific protocols. Algorithm – mark all flows as priority and then mark
specific flows as “best-effort” when congestion approaches
“the Near Congestion State”.
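A minimal sketch of the marking algorithm as described in the notes: everything starts as priority, and selected flows are re-marked best-effort only while the link is near congestion. The utilization threshold, the per-flow rate threshold and the names `Flow` and `update_markings` are assumptions made for illustration; the presentation did not give values or an implementation.

```python
from dataclasses import dataclass

# Both thresholds are illustrative assumptions, not values from the presentation.
NEAR_CONGESTION_UTILIZATION = 0.70  # link utilization treated as "near congestion"
HEAVY_FLOW_BPS = 5_000_000          # rate above which a flow counts as heavy


@dataclass
class Flow:
    flow_id: str
    recent_bps: float          # average rate over the last measurement window
    marking: str = "priority"  # every flow starts out marked as priority


def update_markings(flows, link_capacity_bps):
    """Re-mark flows for one measurement window and return them."""
    load_bps = sum(f.recent_bps for f in flows)
    near_congestion = load_bps / link_capacity_bps >= NEAR_CONGESTION_UTILIZATION
    for f in flows:
        heavy = f.recent_bps >= HEAVY_FLOW_BPS
        # Protocol-agnostic: only measured rates are examined, never ports or payloads.
        f.marking = "best-effort" if near_congestion and heavy else "priority"
    return flows
```

Because the marking is recomputed every window, a heavy flow returns to priority as soon as the link leaves the near-congestion state, which matches the stated wish not to penalize high-bandwidth users on an uncongested network.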
Comcast hasn’t
figured out how to notify users that they have been marked as best-effort yet.
BT is planning a similar
mechanism on their DSL network, but they are planning on varying the bandwidth,
not changing QoS markings. Comcast thinks customers are more concerned about
changing markings and don’t want to penalize high-bandwidth users when
the network is NOT congested (“it’s a high-bandwidth service,
people should use it”).
Henning is concerned
about mechanisms that have 8 independent tweakable parameters…
Comcast does still care
about protocol-specific management mechanisms, when the carrier is providing
the application, or when the traffic is illegal, etc.
3.3 Application Designer Perspective (BitTorrent)
Presenter is Stanislav Shalunov.
BitTorrent is willing to
make changes based on outcomes at this conference.
BitTorrent sees two
problems – RTT measured in seconds, inefficient overlay routing.
Peer selection is random,
except that there is some bias towards established peers. Trackers randomize
the peer list, and peers ask the resulting randomized list what peers they know
about. Rarest-first ensures best reliability and helps peers have pieces of
files to “trade”. This also gives peer diversity.
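Rarest-first piece selection can be sketched as follows. This is a simplified illustration, not BitTorrent's actual client code; the function name and the data shapes (sets of piece indices per peer) are assumptions.

```python
import random
from collections import Counter


def rarest_first(my_pieces, peer_pieces, rng=random):
    """Return the index of the next piece to request, or None if nothing is useful.

    my_pieces: set of piece indices already held locally.
    peer_pieces: dict mapping a peer id to the set of piece indices that peer holds.
    """
    availability = Counter()
    for pieces in peer_pieces.values():
        # Count only pieces we still need.
        availability.update(pieces - my_pieces)
    if not availability:
        return None
    rarest_count = min(availability.values())
    candidates = [p for p, n in availability.items() if n == rarest_count]
    # Break ties randomly so peers do not all chase the same piece.
    return rng.choice(candidates)
```

Requesting the least-replicated piece first raises the number of copies of scarce pieces in the swarm, which is where the reliability benefit comes from.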
The best peers can be
farther away – a peer with a big pipe, a long way away, can drive you at
your downlink speed, but a peer on your own cable modem subnet can only drive
you at its uplink speed.
Danny’s comment – restated for minutes: “If you have more localized
distribution via caching, or location techniques, or whatever, then burst rates
(i.e., peak-to-trough ratios) may actually be larger and you may increase HFC
contention (note: HFC in particular, vs. DSL, etc.)”
Overlay efficiency – Stanislav is experimenting with preferring peers in the
same AS, or in a set of ASes (but then you need the set), or in a list of IP
prefixes (but then you need the list), or based on some cost minimization model
(but then you need costs).
If you have more than 20
peers in an AS, you can probably trade files happily, and 57 percent of transit
traffic could be reduced with this modification. If there aren’t enough
local peers, make one a cache (and please choose a fast one), or 10, or 100.
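A minimal sketch of AS-biased peer selection, assuming each candidate peer already carries an AS number. How a client learns that mapping is exactly the "but then you need the set" problem noted above and is not shown here. The function name and data shapes are hypothetical.

```python
import random


def select_peers(candidates, my_asn, want, rng=random):
    """Pick up to `want` peers, taking same-AS peers first.

    candidates: list of (peer_id, asn) tuples.
    """
    local = [p for p in candidates if p[1] == my_asn]
    remote = [p for p in candidates if p[1] != my_asn]
    # Keep selection random within each group, as trackers do today.
    rng.shuffle(local)
    rng.shuffle(remote)
    return (local + remote)[:want]
```

With fewer local peers than requested, the remainder is filled from other ASes, so a peer in a sparsely populated AS still gets a full peer list.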
Stanislav’s experience is that they can achieve a 30-percent hit rate with a
1-TByte cache and 80 percent with 100 TBytes – and 1 TByte fits into a device.
Dave Oran asked what the
required bandwidth for an 80-percent hit rate of 1 TByte is – he is
concerned that this architecture may reveal bandwidth bottlenecks onto and off
the cache device itself.
Stanislav hopes the IETF can work in these areas:
· Cache discovery protocol
· Network information for smart peer selection
· Experimental congestion control
· BCP on P2P pain
Stanislav doesn’t think we’re ready to do congestion control for P2P yet – do
a framework, but don’t standardize the mechanism yet.
Lars says there’s a
TSV committee looking at new congestion control mechanisms, but the
participants are all from the long-delay community.
Current BitTorrent cache
discovery protocol is DNS-based, with an SRV record based on the reverse DNS
lookup of your IP address.
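The name construction behind such a lookup can be sketched with the standard library alone. The service label `_bittorrent-cache._tcp` and the walk up through parent domains are assumptions made for illustration; the notes give neither the actual label nor the search order. The DNS queries themselves are omitted.

```python
import ipaddress

# Hypothetical service label; the real label was not given in the talk.
SERVICE_LABEL = "_bittorrent-cache._tcp"


def ptr_query_name(ip):
    """Name to query for the PTR (reverse DNS) record of an address."""
    return ipaddress.ip_address(ip).reverse_pointer


def srv_query_names(ptr_hostname):
    """Candidate SRV names built from the hostname a PTR lookup returned.

    Drops the host label, then walks up the parent domains, stopping
    before the bare top-level domain (an assumed search order).
    """
    labels = ptr_hostname.rstrip(".").split(".")
    return [
        f"{SERVICE_LABEL}." + ".".join(labels[i:])
        for i in range(1, len(labels) - 1)
    ]
```

For example, a peer at 192.0.2.10 would first query the PTR name `10.2.0.192.in-addr.arpa`; if that returned `host1.pop3.example.net`, the SRV candidates would be tried from most to least specific.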
Dave Oran asked for input
from DSL and Metro-Ethernet network operators, because he sees the network
characteristics being radically different, and the problem he sees on deployed
networks is under-provisioned aggregation network links.
3.4 Lightning Talks & General Discussion
3.4.1 Robb Topolski
This talk covers about a
third of Robb’s submitted paper.
Robb believes that the
common accusations about P2P file-sharing aren’t actually true –
for example, it’s very rare for any applications to rate-limit, so almost
all applications will use all available bandwidth.
We need to know a lot
more about the problems we’re facing (including “where congestion
actually is”) in order to solve problems, instead of simply dealing with
effects.
3.4.2 Nick Weaver
The problem with P2P
filesharing is that it shifts cost, doesn’t save costs. It increases
aggregate costs, and it’s transport-inefficient.
Could easily build a
BitTorrent cache today – all the components are in place. But it’s
less efficient than HTTP, and it’s lawsuit-bait for pirated content.
3.4.3 Leslie Daigle
“The open,
decentralized nature must be preserved.”
The problems aren’t
about P2P technology itself. There’s an assumption that network operators
won’t/can’t adjust the network to meet demand (as P2P supernodes
move).
What are we trying to do:
· Make P2P networks work better?
· Make applications and networks more aware of each other?
· Solve the general case of bandwidth-intensive applications?
· Deal with in-network issues or impact of cross-network traffic?
3.5 Localization and Caches
3.5.1 Laird Popkin and Haiyong Xie – P4P
Average P2P bit travels
5-7 metro hops in Verizon networks – network-oblivious P2P applications
may not be network-efficient. Routing information is hidden, TCP rate control
is coarse-grained.
P4P goal is to design a
framework that enables better cooperation between providers and applications.
P4P tracker provides
static topologies, policies, provider capabilities, virtual cost…
Virtual costs can be
distributed based on provider preferences, but they are talking to providers
about 5-minute refresh windows (for scaling reasons). Costs are provided by the
providers (not currently tied to routing, but could be if someone wants to
provide a feed from routing).
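How a client might consume provider-supplied virtual costs can be sketched as below. The cost-map shape (a dictionary keyed by source and destination PID), the function names and the treatment of unknown pairs are assumptions; only the 5-minute refresh window comes from the discussion.

```python
REFRESH_SECONDS = 5 * 60  # refresh window under discussion with providers


def needs_refresh(fetched_at, now):
    """True once a cached cost map is older than the refresh window."""
    return now - fetched_at >= REFRESH_SECONDS


def rank_peers_by_cost(my_pid, peers, cost_map):
    """Order candidate peers from cheapest to most expensive.

    peers: list of (peer_id, pid) tuples.
    cost_map: dict mapping (source_pid, destination_pid) to a virtual cost.
    Pairs missing from the map are treated as most expensive (an assumption).
    """
    unknown = float("inf")
    return sorted(peers, key=lambda peer: cost_map.get((my_pid, peer[1]), unknown))
```

Because the costs are opaque numbers chosen by the provider, the client needs no knowledge of topology or routing, only an ordering.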
Trials at Verizon and
Telefonica showed significant increases in intra-ISP traffic, including
increase in “local” traffic based on IP prefixes. This was a large
swarm (3500 peers at the peak, 300 after a few days).
Possible IETF work
– “trackerless P2P” (DNS), lookup mechanisms to find AS-PIDs
and trackers for a given ASN.
Jon asked for information
about what IETF protocols are being modified.
Henning said it would be
nice to have everyone work together, but IETF is slow… P4P Forum
doesn’t think they are doing a standard, they think they are validating a
concept.
3.5.2 Yu-Shun Wang – P2PI Traffic Localization - Microsoft
Lots of reasons to
localize – apply QoS policy, throttling, admission control – but
localization makes the user experience “non-uniform”.
Using multi-layer
trackers (global, local, anything in between).
Want to be “local”
– but may have to accommodate traffic conditions, load balancing, policy,
etc. Need the right metrics – the policy might be NOT to localize.
Most of focus is on
topology matching between link layer and application layer. Caches are easy
(long-lived peers) but caches have to be in the right place in order to gain
benefits.
Looking at tracker
redirection/delegation. Prefer off-path for reliability and avoiding
performance bottlenecks.
Eric Burger asked what
incentive people have to deploy a mechanism like this.
Jon liked this
paper/presentation because it seemed easy to chop into engineering-sized parts.
Bob raised a concern
about ISP liability if they are helping pirates choose optimal paths for
particular swarms, but if you provide information for all users in all swarms,
this isn’t scalable.
3.5.3 Vinay Aggrawa – ISP-aided Neighbor Selection – DT
Vinay is talking about
something that’s similar to P4P – having a provider supply an
oracle so that any peer can find its “local neighbors”.
Henning is concerned
about oracles that dump traffic off-net, and beggar their neighbors.
Dave Oran thought the
oracle would also know who was talking to who now – and this approach
doesn’t have that information yet.
People are thinking that
9000 nodes is a big swarm – for legal content with advertising, it
probably isn’t. We need to think about much larger swarms.
Could just run oracle as
web server/UDP server at well-known location (similar to BIND). Open and simple
solution, open for all overlays.
Oracle service
doesn’t cache content at ISP, doesn’t participate in file sharing.
Are ISPs willing to
accept approaches like oracles that rely on interaction with clients?
They’ve resisted this in the past.
3.5.4 Cullen – IETF work on localization?
(Cullen had a slide of
possible work from listening to presentations and questions/answers)
Leslie – still need
to decide what problem we’re trying to solve…
EKR – people talked
about making clients less aggressive, moving data into caches so it
doesn’t consume uplink capacity, and choosing better peers. All of
Cullen’s list was about localization. Doesn’t seem that
localization would be important to Comcast, and I’m terrified of the
legal issues involved in caching.
Would be useful for nodes
to find their own locations in a topology – find own AS numbers, etc.
Henning – providing
information that customers can’t verify, can lead to undesirable
situations (like “hot-potato” in routing).
Standardize a protocol to
delegate from global trackers to local trackers? Lisa says lots of HTTP servers
have their own mirror mapping information distribution mechanisms
(proprietary). It’s possible this would be reusable, but possible that
the two problem spaces are just too different.
Lisa also had layer-nine concerns about inter-ISP localization issues.
Why not look at the
entire problem, and not just parts by parts? We did IM as an application, and
Google is starting to use the standardized versions. Jon thinks that IM
came into the IETF in just the way he’s thinking of for P2P – what
are the parts we need?
Lars – we have
transport technologies that aren’t seeing deployment. Maybe incentives
have changed, but something would have to change in order for new transport
stuff to be deployed. Localized changes, at the point that benefits from the
change, do get deployed.
Lisa is concerned that
localization is one of the technologies that can work in practice, in a limited
environment, but can’t be standardized – because of issues about
trust, accuracy, etc…
3.6 New Approaches to Congestion
3.6.1 Bob Briscoe
Bob is presenting
Diffserv weighted scheduling versus weighted congestion points.
Diffserv is typically
used without an API, so the application itself does nothing about Diffserv.
Diffserv isn’t
designed to work at the granularity of individual users.
When a light user sends
to a heavy user, the light user is using up Diffserv limits in his own network,
only for the traffic to be marked down in the heavy user’s network.
Bob is concerned that
we’re going to end up in the same place as ATM – with too many
traffic classes.
Bob proposes a
“congestion harm (cost) metric”.
Is this anything like
Comcast’s proposal earlier in the workshop? Important to take what ISPs
are doing now into account as we plan years into the future.
ECN is helpful but
ingress policer can’t see congestion on the rest of the path.
Bob is asking the
IETF/IRTF to set up a longer-term design team for congestion.
3.6.2 Marcin Matusze
This presentation is
based on conversations with about 20 ISPs.
ISP customer acquisition
costs are much higher than monthly revenue, and cost structure varies (fixed
versus mobile). It’s difficult to change contracts, and 5 percent of
users generate 75 percent of the traffic.
Mobile network operators
have more tools (volume-based accounting, fixed pricing up to a limit). Most
tools aren’t widely used.
BT is starting to use
deep packet inspection, and Bob says that’s 10 times more cost-effective
than adding bandwidth.
Finnish ISPs were very
concerned about P2P traffic in 2004, but much less concerned today.
3.7 Quality of Service
3.7.1 Mary Barnes – QoS-enabled traffic prioritization
Many service providers
are implementing deep packet inspection, but this isn’t viable long-term.
Diffserv can be
implemented today but doesn’t solve all the problems, and
provisioning/managing still has to look at the events that cause peak usage.
Why hasn’t Diffserv
been deployed in the past? Maybe the problem is big enough now? Bob says data
is showing that the amount of P2P traffic in the world is decreasing…
People are using QoS
– just not on the Internet. More than 50 percent of Cisco’s
enterprise users have turned Diffserv on – actually higher than that, for
enterprise VoIP or video users. There’s a real problem with DSLAMs with
very limited processing capabilities – Dave Oran made the point that the
problems in layer-two aggregation networks are severe and overlooked.
3.7.2 Henning Schulzrinne – encouraging bandwidth efficiency
Average TV consumption in North America for HDTV = 972 GBytes per month –
reasonable upper limit.
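To put the 972 GBytes figure in perspective, the arithmetic below converts it to an average sustained rate. It assumes decimal gigabytes and a 30-day month; neither assumption comes from the talk.

```python
def average_rate_mbps(gbytes_per_month, days_per_month=30):
    """Average sustained rate, in Mbit/s, implied by a monthly volume."""
    total_bits = gbytes_per_month * 10**9 * 8  # decimal gigabytes assumed
    seconds = days_per_month * 24 * 3600
    return total_bits / seconds / 1e6


print(f"{average_rate_mbps(972):.1f} Mbit/s")  # prints "3.0 Mbit/s"
```

Under those assumptions the upper limit corresponds to about 3 Mbit/s averaged around the clock per subscriber.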
Goal is to minimize
OVERALL cost of content delivery. Focusing only on HTTP efficiency misses
the point.
Transit bandwidth trends
changed in 2004 – from a steady decline to very little change since 2004.
Cost for sharing content
is a step function – moving from the neighborhood to a national
distribution requires different infrastructure.
FIOS TV architecture is
very different from the CATV architecture discussed this morning.
We have a general
discovery problem (network topology, STUN, HELD, LoST, SIP local network
configuration…). We have a variety of solutions (DHCP, DNS,
anycast…)
If you’re volume-based,
you need application-visible charging indication to allow people to make
rational decisions. Can the IETF help here?
Misbehaving applications
expose user to risk in a charging situation.
3.8 Conclusions & Wrap
Generalized
“talking to your local infrastructure” mechanism keeps coming up,
over and over.
Indication of charging
would be machine-parsed for any “rational actor” – person or
machine.
Eric is concerned that we’re identifying so many topics that are
application-specific and hard (or too hard) to generalize into a transport
mechanism.
Jon thinks P4P
“virtual costs” makes sense as something we can extract out.
Do we need more than
“this will be expensive/this will be cheap”?
“I’d be happy
to tag low-priority traffic if that made any difference to me” – if
ISPs are putting bandwidth caps in place and scavenger-class traffic
didn’t count, that would make a difference.
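The incentive described in that comment can be sketched as a cap-accounting rule in which scavenger-class bytes are not billed. The class label, the record shape and the function name are hypothetical; no ISP policy of this kind was presented.

```python
SCAVENGER = "scavenger"  # hypothetical traffic-class label


def billable_bytes(records, exempt_classes=(SCAVENGER,)):
    """Sum the bytes that count toward a bandwidth cap.

    records: iterable of (traffic_class, byte_count) tuples.
    """
    return sum(count for cls, count in records if cls not in exempt_classes)
```

Under such a rule, an application that marks its bulk transfers as scavenger gives its user a concrete benefit, which is the condition the speaker set for tagging low-priority traffic.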
Congestion control
isn’t going to fix the cost issue – if we aren’t going to
grow a network to accommodate increased traffic, we’re just deciding what
order the packets queue in. The IETF needs to decide what’s about
economics and what’s about technology.
Can we stop whining about
P2P using too many TCP connections?
Is it possible to please
all the parties? What if benefits to users and benefits to ISPs are just
incompatible?
We need to be very
cautious about matching up network protocols and economic models – this
usually ends badly (micropayments, etc). Focus on technical topics (congestion,
etc).
Congestion arises when there’s
a scarcity, and you allocate resources based on cost (which may be money, or
might be pain). Congestion isn’t different from cost – ISPs know
that. Just because cost proposals haven’t worked in the past
doesn’t mean there’s no way to address cost.
Greg – worth IETF
attention, momentum around caching, momentum around optimizing peer selection.
Good to think about whether we bring all of the work in at once. On deep packet
inspection, as a vendor, unless you want to trust host markings, first-hop routers
have to inspect packets and that’s not cheap.
Cullen – Microsoft
API will default-mark voice traffic above router traffic in priority. Unless we
get our act together we’re headed for a train wreck.
Danny – IETF might
want to consider guidelines on application interaction with transport issues
(multiple ports, etc). Mobile networks have all kinds of differential pricing
(nights and weekends), but other providers have no hooks into charging for
things like distance (even though it might cost the provider more to retrieve a
file from Japan).
Lars – a lot of the
things we’re talking about aren’t just P2P – IPTV (for
example) might have many of the same issues. What are the other heavy bandwidth
usages of the network?
Bob – congestion
isn’t cost until it becomes part of the contract that you have with a
provider – and providers can’t see congestion outside the
provider’s own network.
Eric – in
economics, pricing is a protocol and tells you when you should spend money. We
need to remember that P2P is about content, too – networks that were
engineered for small-request/large-response applications are now carrying
servers from grandmothers.
Stanislav – opening multiple TCP connections isn’t the problem – delay is the
same for P2P and Facebook. (The response: this is only the case when you’re
the only user on the link – but other applications open multiple connections,
too. Quoting Dave Oran – “there are non-evil reasons to open multiple TCP
connections and you can’t tell them apart”).