CURRENT MEETING REPORT
Reported by Fred Baker,
Cisco Systems
Minutes of the Integrated
Services Working Group (intserv)
SESSION ONE
This meeting was devoted to
new topics in INT SERV.
INT SERV MIB PROGRESS REPORT
- Fred Baker
JJ (John Krawczyk) and Fred
have developed a MIB (6 objects). The key issue is how to divide
INT SERV information with RSVP and ST-II information. The MIB
just tells you information about interface properties (what link
capacity, etc). The information in RSVP and ST-II MIBs tells you
how the capacity is allocated.
By interface, you can:
Fred expressed interest in
getting this MIB done soon, so please send him suggested objects.
INT SERV DATA OBJECTS -
John Wroclawski
The Internet-Draft (draft-int-serv-data-formats-00.txt) describes some requirements and a data format for I-S data objects within protocol messages, RSPECS & TSPECS, etc.
John presented a data object library:
Implementation is available at:
ftp://mercury.lcs.mit.edu/pub/intserv/libintsrv.shar
SERVICE DISCRIMINATION
FOR BEST-EFFORT TRAFFIC - Dave Clark
David Clark gave a talk on
the topic of explicit allocation of best effort traffic. This
represents a somewhat new topic for the Integrated Services Working
Group. The initial goal for INTSERV was to add support for new
services like voice and video. However, discussions on the mailing
lists have shown that there is also a need to give the ability
to control best-effort quality of service too. This talk described
one possible way to provide multiple levels of best-effort traffic
service.
The goal of the enhancement
proposed by Clark was to allow different users to be explicitly
allocated different service capacity during times of congestion.
Capacity need not be measured using a simple model such as a fixed
minimum rate, but could be characterized in a broad range of ways,
which could model traffic with different degrees of burstiness
at different time frames. The approach was to provide each user
with a usage profile, and tag traffic entering the network as
to whether it is in or out of this profile. At any point of congestion,
the router preferentially discards packets marked as being out
of profile. The result of this discard is that the sending TCP
perceives an indication of congestion and slows down. It will
continue this process until all of its packets are marked as in
profile. At that point, different TCPs may be getting very different
services, depending on what their profiles are.
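As a rough illustration of the tagging idea (not taken from the talk), the sketch below assumes the usage profile is a simple token bucket and that a congested router pushes out queued "out" packets to make room for "in" packets. The names ProfileMeter and enqueue, and the pushout policy itself, are hypothetical choices made for this example; the talk allowed much richer profiles.

```python
from dataclasses import dataclass


@dataclass
class ProfileMeter:
    """Token-bucket usage profile (assumed form): rate in bytes/s, depth in bytes."""
    rate: float
    depth: float
    tokens: float = 0.0
    last: float = 0.0

    def __post_init__(self):
        self.tokens = self.depth  # start with a full bucket

    def tag(self, now: float, size: int) -> str:
        """Return "in" if the packet fits the profile, otherwise "out"."""
        elapsed = now - self.last
        self.last = now
        # Refill tokens for the elapsed time, capped at the bucket depth.
        self.tokens = min(self.depth, self.tokens + elapsed * self.rate)
        if size <= self.tokens:
            self.tokens -= size
            return "in"
        return "out"


def enqueue(queue: list, packet: tuple, limit: int) -> bool:
    """Admit a (tag, size) packet to a queue, discarding "out" packets first."""
    if len(queue) < limit:
        queue.append(packet)
        return True
    if packet[0] == "out":
        return False  # the arriving out-of-profile packet is dropped
    for i, (tag, _) in enumerate(queue):
        if tag == "out":
            del queue[i]  # push out a queued out-of-profile packet
            queue.append(packet)
            return True
    return False  # the queue holds only in-profile packets
```

A sender whose packets keep coming back tagged "out" and dropped would, under TCP, slow down until its traffic fits the profile.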
The talk presented two schemes,
one driven by controls at the sender, and one by controls at the
receiver. It also included some initial simulations to suggest
that somewhat accurate control on usage was actually possible.
This talk reported ongoing
work, not material being proposed for standardization at this time.
Discussion on Dave Clark's
Presentation
It was asked whether the profile
scheme depends on both source and destination profiles. Clark noted
that it does: even if you have a big profile, if the destination
has a small pipe, you won't get full throughput. It is a matter
of education. (People need to be aware of Internet heterogeneity.)
It was suggested that the
profile be moved out to the hosts. Clark agreed with this point.
When asked whether using the IP ToS
field with WFQ would give similar service, Clark was unsure.
There was a question regarding
the pricing scheme. If ISPs only charge during congestion, doesn't
that give them a motivation to be congested? Clark noted that currently
ISPs charge you for capacity (or modem hours). ISPs already
have the conundrum that there is no motivation to add capacity
except the fear that a competitor has better capacity and will attract
one's users. And if one looks at the congestion portion of the fee, it
is pretty small. It must also be noted that the expectation scheme
means we actually don't charge during congestion (we charge for
a promise that you won't be congested).
Greg Minshall asked how
in/out marking works at multiple checkpoints (MIT, NEARNET, MCI
all check). Clark said that this is fine, and that one can mark
packets as out at multiple spots (and indeed, ISPs can observe
traffic marked in by MIT that the ISP needed to mark out and suggest
to MIT that they increase their expectation).
Abel Weinrib asked whether the scheme
falls apart if someone supplies a TurboTCP that doesn't back off.
Clark said that even if we are overloaded, in packets still get
through, and outs get nailed (and trying to be greedy just ensures
you get marked out).
MAPPING INT-SERV ONTO LEVEL
2 TECHNOLOGIES - JJ (John Krawczyk)
He pointed out that the world is more than just point-to-point links and ATM links.
It was asked if vendors are
thinking about adding guarantees in MAC switches (Ethernet)? Fred
Baker pointed out that MAC switches sell not on features but cost
per port, so guarantees better be very very cheap to include.
Hoffman and Yavatkar made
some points about the Mother May I protocol for Level 2 and services
that do admission control for 802.3.
Don mentioned the protocol
he's developed using soft state and reservation messages (MMIP).
A multicast protocol is used to find the subnet bandwidth manager,
and a unicast protocol is used to talk to the manager (Mother).
There is no enforcement, but there is some policing at hosts that
cooperate.
Raj Yavatkar is writing a
spec for the protocol.
SESSION TWO
There were four items on the agenda:
The major objective of the
evening was to reach resolution on the G and CD specs.
Craig Partridge presented
the material on the G Internet-Draft. At the last meeting, the
sense was that the Internet-Draft could go forward as a proposed
standard once a specific set of issues was resolved. After the
last meeting, Roch Guerin revised the spec to include the new
concept of the "slack term". This was circulated on
the list; no objections were raised and some support was expressed. The
conclusion of this meeting was to proceed with the spec in the
form that includes the slack term.
Two small issues were noted prior to the meeting:
1) The draft is ambiguous on where reshaping points are, that is, places where traffic is made to conform to a token bucket. The answer is that it is done for two purposes:
2) The draft needs to be more clear on exactly how to compute the C and D terms of the delay bound formula.
The answer is that D is a property
of the box, precomputed when the box is configured, while C is more
complex and is computed for each flow. The draft will so indicate.
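The minutes do not reproduce the delay bound formula. As a sketch only, the example below assumes the simple fluid-model form, bound = b/R + Ctot/R + Dtot, where b is the token bucket depth, R the reserved rate, and Ctot and Dtot the sums of the per-hop C and D terms. The function name and the units (bytes, bytes/s, seconds) are choices made for this illustration; the draft should be consulted for the exact expression.

```python
def delay_bound(b: float, R: float, hops: list) -> float:
    """End-to-end delay bound in seconds under the assumed fluid-model form.

    b    -- token bucket depth in bytes
    R    -- reserved rate in bytes per second
    hops -- list of (C, D) pairs, one per box: C in bytes (per-flow term),
            D in seconds (per-box term fixed at configuration time)
    """
    c_tot = sum(c for c, _ in hops)
    d_tot = sum(d for _, d in hops)
    return b / R + c_tot / R + d_tot
```

For instance, with b = 10000 bytes, R = 100000 bytes/s and two hops of (1500, 0.001) and (1500, 0.002), the terms are 0.1 + 0.03 + 0.003 seconds.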
There was discussion of where
to put an expanded discussion of how the delay terms are computed.
The conclusion was to put the meat of discussion in an Informational
RFC, but add some material to the example implementation section
of the standards document, and make that document refer to the
Informational RFC.
The proposed action, confirmed
by the meeting, was to revise the spec to meet the above objectives,
and submit it for consideration as a proposed standard. The Informational
RFC would be prepared by Craig Partridge and sent to the list
for comment.
There was a final point of
discussion as to whether the intent of the G spec was that implementation
over all media types was mandated. The answer is that implementation
is only anticipated over an appropriate subset of media types
used in the Internet.
Craig Partridge then chaired
the next part of the meeting, in which John Wroclawski presented
the status of the CL spec. He presented a prepared talk with slides,
which is summarized below.
The status of the CL spec is that it was first presented in Dallas, where the sense of the meeting was that it should proceed towards proposed standard after seeing what comments arose on the net. There were in fact a few substantive issues raised between that meeting and now, which were summarized in this talk. There were three sorts of issues:
1) Some wording was unclear. These points are noted and will be fixed.
2) There was some uncertainty, or possible disagreement, about the goals of CL. The talk reviewed the issues here, but the assumption was that the goals were correct as stated.
3) There were two technical
questions as to how the features of the CL specification meet
those goals. The talk clarified these issues, and presented specific
design choices for review.
The goals of the CL spec, by way of review:
1) Flows within profile (conforming to their proposed Tspec) should see the behavior associated with an "unloaded" network.
2) Especially in the case of multicast, the CL service and the Best Effort (BE) service must coexist. Use of CL service for a flow should not make the service worse than BE for that same flow.
3) (Not a CL goal but
a more basic service issue:) Non rate-adaptive CL flows must not
be able to kill rate adaptive (TCP) traffic. This implies that
a router needs to support a sharing model other than brute force
wins. This point has always been around, but CL emphasizes the
need to deal with it, since CL legitimizes non rate adaptive traffic.
The first technical point is whether the CL service is allowed to reorder traffic that is out of profile. Some implementation schemes reorder under overload. The following points are germane:
1) Reordering may be easier for the network element implementor.
2) Reordering is _never_ preferable for the application, since CL does not have packet selectors to control which packets get delayed, and CL has no marking to show which packets have been out of profile at previous hops on the path. Thus, in a reordering scheme, there is every reason to think that not just some but essentially all of the packets may be treated as BE and reordered by the time the traffic crosses a multihop network. However, reordering may be acceptable for one class of apps -- reordering-insensitive, like vat. There was some discussion as to the degree to which this is true.
3) There is no reason to offer
both schemes. If reordering is never preferable, then rather than
implementing both, just do the better one.
Currently, the CL spec allows
both reordering and non-reordering implementations of the overload
behavior. This range of options was included to permit product
differentiation and ease of implementation in different circumstances.
The alternative that was proposed on the list was to preclude
reordering implementations.
The discussion in the meeting raised the following points, positions and issues:
1) The Internet today reorders packets. It's a fact. Since the Internet does reorder, leave the spec as is.
2) Reordering can hurt voice as well. But it was noted that reordering can only occur if traffic is out of profile, not in the normal case. Further, it was not clear that reordering implies a worse delay bound.
3) Due to possible distortion
of traffic shape as it passes through routers, traffic that conforms
to its profile at the edge may be non-conformant in the middle
of the net.
After discussion of several
options (preclude reordering, say "should not", or
allow both), there was a first sense of the meeting to prefer "should
not". The point was reconsidered, and the final conclusion
was to put an extended discussion of the issue in the implementation
discussion in the spec, and to note the benefits of not reordering
in the evaluation criteria section.
The next technical issue raised
was the role of the "B" parameter and its relation to
buffer allocation, for both real time and TCP-style flows.
Non rate-adaptive flows such
as real time can exceed their reservation for two reasons: first,
multiple use of a shared reservation, and second, a transient
due to burst aggregation. The minimum necessary to deal with this
is a reasonable amount of buffering in the switch, but a better
approach is to drop packets when the delay gets too high. One
must not drop just because the traffic exceeds its Tspec, but
only when actual queues develop. This queue management discipline
is in fact the same as the discipline suitable for rate-adaptive
flows such as TCP. Thus there is no recognized need to have two
modes of buffer management in CL, for flows whose rate can or
cannot adapt. Adding two explicit modes raised the further issue
of ensuring that one cannot cheat and get better service by asking
for the "wrong" mode. So the proposal, since there is
not a clear need for two modes, is not to define a means, such
as the "TCP" flag, to select or specify explicit buffer
management disciplines.
It was further noted, in response
to questions from the floor concerning the meaning of the B parameter
for TCP, that if you are asking for a reservation for TCP, with
the goal that the TCP really achieve the reserved rate in the
Tspec, the reservation must tell the router to allocate a specific
number of buffers, not just link capacity. Otherwise, packets
of a typical TCP burst will get lost. In the absence of a specific
calculation of the expected TCP burst size (as much as one round
trip of traffic), one could try setting B to 2 * MTU, which is
sufficient to deal with the TCP behavior in which two packets
are produced in response to one ack. Without this level of buffer
assurance, the source will have to pace its sending of packets,
which is difficult with today's systems and hardware. The question
was raised whether the CL spec stated that the buffers specified
in the Tspec must be explicitly allocated and dedicated to this
flow. The answer is no; the router is permitted to allocate both
buffers and capacity on a statistical basis. But all the Tspec
parameters should be considered by the admission control algorithm.
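To make the arithmetic concrete, the sketch below picks a candidate B under the two cases mentioned: 2 * MTU when no burst estimate is available, and one round trip of traffic at the reserved rate when the round-trip time is known. The function name, the 1500-byte default MTU, and the choice to never go below 2 * MTU are assumptions made for this illustration, not part of the CL spec.

```python
def burst_allowance(rate_bytes_per_s: float, rtt_s: float = None, mtu: int = 1500) -> int:
    """Return a candidate token bucket depth B in bytes for a TCP reservation."""
    floor = 2 * mtu  # covers two packets sent in response to one ack
    if rtt_s is None:
        return floor  # no burst estimate available
    # Up to one round trip of traffic at the reserved rate.
    return max(floor, round(rate_bytes_per_s * rtt_s))
```

With a reserved rate of 100000 bytes/s and a 0.25 s round trip, this gives 25000 bytes, against 3000 bytes for the 2 * MTU fallback.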
After some discussion, it
was the sense of the meeting that trying to distinguish the treatment
that TCP and that real time flows need is more complex than could
be justified. The current form of the spec, which does not make
any distinction, was affirmed.
There was no discussion from
the floor at the meeting of revising the basic goals of CL, as
they were presented by Wroclawski.
The action approved by the
meeting was to clean up certain wording in the spec, to leave
the technical matters discussed above as they are currently treated
in the draft, and send the draft in for consideration as a proposed
standard.
The next item on the agenda
was a presentation by Juha Heinanen on Protected Best Effort (PBE)
Service.
The motivation of this service
is the claim that CL is too complex. What is needed is to protect
BE users from non-adaptive flows. We need to replace Van Jacobson
in his current role as real time traffic police. So a PBE service
is proposed, as a way to allow a user to ask for a fair share
of BE capacity. This request could be used to control IP routing,
etc.
The specific features of the
PBE service are as follows.
PBE is intended for elastic,
time insensitive applications. It may not be suitable for some
real time traffic. It provides an isolated BE flow with optional
min. BW. All unprotected BE traffic is considered as a single PBE
flow. The end to end behavior is that a flow gets a fair share
or minimum requested bandwidth (MRE), whichever is larger. There
should be fair transit delay among PBE packets on same path. There
should be ordered delivery of PBE packets as long as the path
remains the same. There is no guarantee of delivery of packets
at higher than the min. rate.
Why PBE? The motivation is
to support all elastic non real time users and only them. It supports
apps that don't know their traffic characteristics, and protects
the apps against greedy internet users. It may provide an upper
bound on transfer times.
The TSpec: min. requested
BW, min. policed unit, max. packet size, and might add peak rate.
A PBE flow is nonconformant
if it exceeds the fair share or minimum requested bandwidth, whichever
is larger. Fair is defined according to the max-min concept.
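The max-min concept can be illustrated with the usual progressive-filling computation: flows that want less than an equal share get what they ask for, and the leftover is split equally among the rest. The sketch below is an illustration only; the function names and the use of per-flow demands as input are assumptions, and the presentation did not specify how a router would compute the shares.

```python
def max_min_share(capacity: float, demands: dict) -> dict:
    """Max-min fair allocation of capacity among flows by progressive filling."""
    alloc = {}
    remaining = capacity
    pending = sorted(demands.items(), key=lambda kv: kv[1])  # smallest demand first
    while pending:
        share = remaining / len(pending)  # equal split of what is left
        flow, demand = pending[0]
        if demand <= share:
            # This flow wants no more than its share: satisfy it fully.
            alloc[flow] = demand
            remaining -= demand
            pending.pop(0)
        else:
            # Every remaining flow wants more than the share: cap them all.
            for name, _ in pending:
                alloc[name] = share
            break
    return alloc


def is_conformant(rate: float, fair_share: float, min_requested: float) -> bool:
    """A PBE flow conforms while it stays within the larger of the two limits."""
    return rate <= max(fair_share, min_requested)
```

With a capacity of 10 and demands of 2, 4 and 10, the allocations come out as 2, 4 and 4.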
Policing: nonconformant traffic
may be discarded, but should not be as long as conforming traffic
can be protected. It must be discarded if buffer resources would
not be adequate. Under this approach, one can calculate how much
buffer space each flow uses, so we don't need RED.
Evaluation criteria:
Following this presentation, there was discussion that raised the following issues and concerns:
1) Is fairness a good thing? It seems initially appealing, but is not obviously the right thing.
2) The max-min definition of fairness may not be suitable in practice, and is not at all simple to achieve. So this specification may not in fact be simple.
3) The proposal provides "fairness" only for those flows which make a reservation (flow setup). Movement towards per-flow setup for best effort may be a bad direction for the Internet.
4) The simplicity of the service
may come at the cost of constraining the implementation options
to essentially one approach, a per flow WFQ. This may not be an
acceptable constraint for the range of routers in the Internet.
It was noted that by relaxing
the definition of fair allocation, one might permit a wider range
of implementations. However, this might also remove some of the
appealing initial simplicity of the service goal.
It was agreed that this proposal
would be further discussed and refined.
The next and last item on
the agenda was a presentation by Lee Breslau on Partial Deployment
of the Service Model.
The three questions under
consideration in this talk were 1) whether we can expect all services
to be deployed at all IS routers? If not, 2) can end to end services
be composed out of the different heterogeneous services that are
found at each router, and if so, 3) should the end node be allowed
to control this?
Can we expect universal deployment
of advanced Integrated Services (IS)? The answer seems obviously
no. Implementation is hard on some subnets, it is left to service
providers to decide what to install, etc. So how can we deploy
and utilize real time services in a heterogeneous environment
with varying subnet characteristics? One option is only to define
services that are implementable on all subnet technologies. This
would imply a baseline service weaker than CL. This reduces to
too low a common denominator.
Services at some but not all
hops may be very useful. Even today, we tunnel between non-RSVP
nodes, which has same effect. And if service is in place at critical
(e.g. congested) nodes, a flow can get most of the benefit. If
we do not allow service variation, then the only solution is just
to reject all non-conforming requests. But if we allow composition
of heterogeneous service at different nodes, then we must decide
how to compose. One question is whether apps will want to know if the
service they get does not exactly conform. Some can just work
without knowing, but some will need/want to know the nature of
the actual service.
One mechanism to let apps
know what is going on is to add a flag to the request message
saying that an app will or won't accept replacement service at
nodes. In multicast, merging is a problem for this flag. Another
mechanism is for routers to advertise availability of services.
This allows apps to know the degree to which a request will achieve
the service, but needs extra mechanism for advertising.
Service replacement might
work as follows: when a router does not offer a service, it offers
some replacement, such as BE. When it gets a service request,
it maps it to the local replacement. Replacements are characterized
as reliable or unreliable. BE is a replacement (perhaps unreliable)
for anything. Replacement decisions are local.
Adspecs can be augmented to
indicate to the requester what the degree of replacement is as
follows. Include in the adspec counts of the number of routers
that offered reliable and unreliable replacement as well as how
many implemented the actual service. Each router can then export
additional substitute parameters, such as an estimate of D for
guaranteed service.
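The three-counter adspec described above might be sketched as follows. The type and field names are hypothetical, and the sketch covers only the counts, not the additional substitute parameters; nothing here reflects an actual adspec encoding.

```python
from dataclasses import dataclass
from enum import Enum


class Offer(Enum):
    """What a router on the path provided for the requested service."""
    ACTUAL = "actual service"
    RELIABLE = "reliable replacement"
    UNRELIABLE = "unreliable replacement"


@dataclass
class ReplacementCounts:
    """Hypothetical adspec extension: one counter per kind of offer."""
    actual: int = 0
    reliable: int = 0
    unreliable: int = 0

    def record(self, offer: Offer) -> None:
        """Update the counts as the adspec passes through one router."""
        if offer is Offer.ACTUAL:
            self.actual += 1
        elif offer is Offer.RELIABLE:
            self.reliable += 1
        else:
            self.unreliable += 1


def walk_path(path: list) -> ReplacementCounts:
    """Accumulate the counts over the routers on a path, in order."""
    counts = ReplacementCounts()
    for offer in path:
        counts.record(offer)
    return counts
```

A receiver seeing a nonzero unreliable count would know that at least one hop fell back to something like BE, without learning which hop.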
In addition to the three questions
at the beginning of the talk, there are two more. 4) should we
consider service-specific solutions for this problem, and 5) do
we want extensible solutions?
Following this presentation,
the following issues were raised in the discussion.
It was observed that this
approach, if kept simple, is a good idea, but it could be a real
morass of complexity. Does the detailed information really matter
to the end service? One extreme is to set one bit in adspec saying
a replacement occurred, but this cannot (for example) distinguish
between reliable replacement and BE replacement. However, another
approach noted was to use the bit to signal the end node that
some replacement occurred, which could then use diagnostic tools
to find out what actually happened at each node. This might permit
the actual notification mechanism to be very simple. There was
some disagreement as to whether an adspec of three integers is
simple or complex.
Another concern was that since
different services have different parameters, replacement may
be limited by wrong parameter set. However, today this does not
seem to be a real issue, since we have a limited number of services,
and this is not a factor with them. Mostly today the replacement
service would be BE, and in that case we never need additional
info.
A question was asked as to
whether there is a problem with how admission control works in the
presence of various sorts of service replacements.
There is a practical issue
of what the individual vendor chooses to call a true implementation
of a service, a reliable replacement, and an unreliable one. There
may be marketing issues here. There is also a technical issue
as to when a replacement is reliable. For example, if a BE service
is currently very underloaded, it might at the moment represent,
in practice, a reliable replacement even for G. The current proposal
would preclude this sort of replacement as reliable, unless the
underloaded condition was known to persist for a very long time.
The determination might be administrative, rather than based on
measurement.
Craig Partridge offered, since he wanted to move forward on this aspect of the problem, to work with Lee to make this happen.