intserv-minutes-96mar.html

CURRENT MEETING REPORT

Reported by Fred Baker, Cisco Systems

Minutes of the Integrated Services Working Group (intserv)

SESSION ONE

This meeting was devoted to new topics in INT SERV.

INT SERV MIB PROGRESS REPORT - Fred Baker

JJ (John Krawczyk) and Fred have developed a MIB (6 objects). The key issue is how to divide INT SERV information with RSVP and ST-II information. The MIB just tells you information about interface properties (what link capacity, etc). The information in RSVP and ST-II MIBs tells you how the capacity is allocated.

By interface, you can:

configure total bit rate that can be allocated
configure max bit rate for any one flow
display currently allocated bandwidth
current delay being experienced by traffic (has been deleted)
add display parameter indicating bit rate allocated per service
currently allocated buffer is in the MIB (sum of burst sizes)
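
A minimal sketch of how the per-interface limits listed above could bound what a setup protocol hands out. The class and field names are hypothetical and are not MIB object names; the admission rule is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass
class InterfaceAllocation:
    """Hypothetical per-interface view; names are illustrative, not MIB objects."""
    max_allocatable_bps: int          # configured total bit rate that can be allocated
    max_flow_bps: int                 # configured max bit rate for any one flow
    allocated_bps: int = 0            # currently allocated bandwidth
    allocated_buffer_bytes: int = 0   # sum of admitted burst sizes

    def admit(self, rate_bps: int, burst_bytes: int) -> bool:
        """Admit a flow only if both configured limits still hold (assumed rule)."""
        if rate_bps > self.max_flow_bps:
            return False
        if self.allocated_bps + rate_bps > self.max_allocatable_bps:
            return False
        self.allocated_bps += rate_bps
        self.allocated_buffer_bytes += burst_bytes
        return True
```

In this sketch the interface object only holds the limits and running totals; which flows hold the capacity would be recorded elsewhere, matching the split between this MIB and the RSVP and ST-II MIBs.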

Fred expressed interest in getting this MIB done soon, so please send him suggested objects.

INT SERV DATA OBJECTS - John Wroclawski

The Internet-Draft (draft-int-serv-data-formats-00.txt) describes some requirements and a data format for I-S data objects within protocol messages, RSPECS & TSPECS, etc.

John presented a data object library:

Treats I-S data as opaque to the setup protocol or similar.

Supports:
- Creation and deletion of objects.
- An ordered set of per-service information fields within each object (ordered so that an RSpec, etc., can express the priority of different services in a request).
- Addition and deletion of service-specific info from the set.
- Listing of which service-specific parameters are actually present for each service.
- Creation, modification, and deletion of the parameters given for each service.
- Some simple error handling and checking (e.g., that IEEE floating-point numbers are valid).

Implementation is available at:

ftp://mercury.lcs.mit.edu/pub/intserv/libintsrv.shar

SERVICE DISCRIMINATION FOR BEST-EFFORT TRAFFIC - Dave Clark

David Clark gave a talk on the topic of explicit allocation of best effort traffic. This represents a somewhat new topic for the Integrated Services Working Group. The initial goal for INTSERV was to add support for new services like voice and video. However, discussions on the mailing lists have shown that there is also a need to control best-effort quality of service. This talk described one possible way to provide multiple levels of best-effort traffic service.

The goal of the enhancement proposed by Clark was to allow different users to be explicitly allocated different capacity during times of congestion. Capacity need not be measured using a simple model such as a fixed minimum rate, but could be characterized in a broad range of ways, which could model traffic with different degrees of burstiness at different time frames. The approach was to provide each user with a usage profile, and tag traffic entering the network as to whether it is in or out of this profile. At any point of congestion, the router preferentially discards packets marked as being out of profile. The result of this discard is that the sending TCP perceives an indication of congestion and slows down. It will continue this process until all of its packets are marked as in profile. At that point, different TCPs may be getting very different services, depending on what their profiles are.
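
A minimal sketch of the tagging and preferential-discard idea, assuming the usage profile is a simple token bucket (the talk allowed much broader profile shapes). The names ProfileMeter and enqueue, and the eviction rule, are hypothetical and not taken from the talk.

```python
class ProfileMeter:
    """Tags packets "in" or "out" of an assumed token-bucket usage profile."""

    def __init__(self, rate_bytes_per_s, bucket_bytes):
        self.rate = rate_bytes_per_s
        self.depth = bucket_bytes
        self.tokens = bucket_bytes   # start with a full bucket
        self.last = 0.0

    def tag(self, now, size):
        # Refill for the elapsed time, capped at the bucket depth.
        self.tokens = min(self.depth, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if size <= self.tokens:
            self.tokens -= size
            return "in"
        return "out"


def enqueue(queue, packet, limit):
    """Congested-router sketch; returns the dropped packet, or None if nothing is dropped.

    When the queue is full, an arriving "in" packet evicts the most recently
    queued "out" packet if one exists; otherwise the arriving packet is dropped.
    """
    if len(queue) < limit:
        queue.append(packet)
        return None
    if packet["tag"] == "in":
        for i in range(len(queue) - 1, -1, -1):
            if queue[i]["tag"] == "out":
                dropped = queue.pop(i)
                queue.append(packet)
                return dropped
    return packet
```

A sender whose packets keep coming back tagged "out" sees those packets lost first under congestion, which is the signal that makes its TCP slow down toward its profile.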

The talk presented two schemes, one driven by controls at the sender, and one by controls at the receiver. It also included some initial simulations to suggest that somewhat accurate control on usage was actually possible.

This talk reported ongoing work, not material being proposed for standardization at this time.

Discussion on Dave Clark's Presentation

It was asked whether the profile scheme depends on both source and destination profiles. Clark noted that it does: if you've got a big profile but the destination has a small pipe, you won't get full throughput. It is a matter of education. (People need to be aware of Internet heterogeneity.)

It was suggested that the profile be moved out to the hosts. Clark agreed with this point.

When asked whether IP ToS field on WFQ gives similar service, Clark was unsure.

There was a question regarding the pricing scheme. If ISPs are only charging during congestion, doesn't that give them motivation to be congested? Clark noted that currently ISPs charge you for capacity (or modem hours). ISPs currently have this conundrum that there's no motivation to add capacity except fear that a competitor has better capacity and will attract one's users. And if one looks at the congestion portion of the fee, it is pretty small. It must also be noted that the expectation scheme means we actually don't charge during congestion. (We charge for a promise you won't be congested.)

Greg Minshall asked how in/out marking works at multiple check points (MIT, NEARNET, MCI all check). Clark said that this is fine, and that one can mark packets as out at multiple spots (and indeed, ISPs can observe traffic marked in by MIT that the ISP needed to mark out and suggest to MIT that they increase their expectation).

Abel Weinrib asked: if someone gave me TurboTCP that didn't back off, doesn't this fall apart? Clark said that yes, if we are overloaded, in packets still get through, and outs get nailed (and trying to be greedy just ensures you get marked out).

MAPPING INT-SERV ONTO LEVEL 2 TECHNOLOGIES - JJ (John Krawczyk)

He pointed out that the world is more than just point-to-point links and ATM links.

In relation to frame-switched media problems, LAN switches do simple FIFO, and frame relay has no delay characterizations.
In current work, IEEE 802.1p signalling via management and GARP has no admission control or policing, just destination and source MACs and priority scheduling.
He proposes that we do some work on implementing services (i.e., a how-to guide with current technologies and discussion of adaptation protocols that we use over layer 2 protocols).

It was asked if vendors are thinking about adding guarantees in MAC switches (Ethernet)? Fred Baker pointed out that MAC switches sell not on features but cost per port, so guarantees better be very very cheap to include.

Hoffman and Yavatkar made some points about the Mother May I protocol for Level 2 and services that do admission control for 802.3.

Don mentioned the protocol he's developed using soft state and reservation messages (MMIP). A multicast protocol is used to find the subnet bandwidth manager, and a unicast protocol is used to talk to the manager (Mother). There is no enforcement, but there is some policing at hosts who cooperate.

Raj Yavatkar is writing a spec for the protocol.

SESSION TWO

There were four items on the agenda:

Discussion of the Guaranteed Service (G) specification ID
Discussion of the Controlled Load (CL) specification ID
Presentation on the Protected Best Effort Service
Presentation on the issue of dealing with heterogeneity of QoS service offering within the Internet.

The major objective of the evening was to reach resolution on the G and CL specs.

Craig Partridge presented the material on the G Internet-Draft. At the last meeting, the sense was that the Internet-Draft could go forward as a proposed standard once a specific set of issues was resolved. After the last meeting, Roch Guerin revised the spec to include the new concept of the "slack term". This was circulated on the list; no objections were raised and some support was detected. The conclusion of this meeting was to proceed with the spec in the form that includes the slack term.

Two small issues were noted prior to the meeting:

1) The draft is ambiguous on where reshaping points are, that is, places where traffic is made to conform to a token bucket. The answer is that reshaping is done for two purposes:

to restore shape to a conformant flow; or,
to force merged flows into a particular shape and perhaps discard excess traffic.

The draft will be revised to make this clear. The draft does not say what to do when splitting reservations. The draft will make clear that reshaping is to be done at that point as well.

2) The draft needs to be more clear on exactly how to compute the C and D terms of the delay bound formula.

The answer is that D is precomputed when a box is configured, a property of the box, while C is more complex, computed for each flow. The draft will so indicate.
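
A minimal sketch of how per-element C and D terms could combine into an end-to-end bound. The minutes do not state the formula; this assumes the simple form b/R + Ctot/R + Dtot used in the later published Guaranteed Service specification, with b the token bucket depth and R the reserved rate. The function name and the units (bytes, bytes per second, seconds) are choices made for this sketch.

```python
def delay_bound(b_bytes, reserved_rate, hops):
    """End-to-end queueing delay bound under the assumed form b/R + Ctot/R + Dtot.

    hops is an iterable of (C, D) pairs, one per network element:
    C is the rate-dependent error term in bytes (per flow),
    D is the rate-independent error term in seconds (a property of the box).
    """
    c_tot = sum(c for c, _ in hops)
    d_tot = sum(d for _, d in hops)
    return b_bytes / reserved_rate + c_tot / reserved_rate + d_tot


# Two hops, 10 kB bucket, 100 kB/s reservation.
bound = delay_bound(10_000, 100_000, [(1500, 0.001), (1500, 0.002)])
```

Only the C terms shrink as the reserved rate grows, which is why the two terms are tracked separately.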

There was discussion of where to put an expanded discussion of how the delay terms are computed. The conclusion was to put the meat of the discussion in an Informational RFC, but add some material to the example implementation section of the standards document, and make that document refer to the Informational RFC.

The proposed action, confirmed by the meeting, was to revise the spec to meet the above objectives, and submit it for consideration as a proposed standard. The Informational RFC would be prepared by Craig Partridge and sent to the list for comment.

There was a final point of discussion as to whether the intent of the G spec was that implementation over all media types was mandated. The answer is that implementation is only anticipated over an appropriate subset of media types used in the Internet.

Craig Partridge then chaired the next part of the meeting, in which John Wroclawski presented the status of the CL spec. He presented a prepared talk with slides, which is summarized below.

The status of the CL spec is that it was first presented in Dallas, where the sense of the meeting was that it should proceed towards proposed standard after seeing what comments arose on the net. There were in fact a few substantive issues raised between that meeting and now, which were summarized in this talk. There were three sorts of issues:

1) Some wording was unclear. These points are noted and will be fixed.

2) There was some uncertainty, or possible disagreement, about the goals of CL. The talk reviewed the issues here, but the assumption was that the goals were correct as stated.

3) There were two technical questions as to how the features of the CL specification meet those goals. The talk clarified these issues, and presented specific design choices for review.

The goals of the CL spec, by way of review:

1) Flows within profile (conforming to their proposed Tspec) should see the behavior associated with an "unloaded" network.

2) Especially in the case of multicast, the CL service and the Best Effort (BE) service must coexist. Use of CL service for a flow should not make the service worse than BE for that same flow.

3) (Not a CL goal but a more basic service issue:) Non rate-adaptive CL flows must not be able to kill rate adaptive (TCP) traffic. This implies that a router needs to support a sharing model other than brute force wins. This point has always been around, but CL emphasizes the need to deal with it, since CL legitimizes non rate adaptive traffic.

The first technical point is whether the CL service is allowed to reorder traffic that is out of profile. Some implementation schemes reorder under overload. The following points are germane:

1) Reordering may be easier for the network element implementor.

2) Reordering is _never_ preferable for the application, since CL does not have packet selectors to control which packets get delay, and CL has no marking to show which packets have been out of profile at previous hops on the path. Thus, there is every reason to think, in a reordering scheme, that not just some but essentially all of the packets may be treated as BE and reordered by the time the traffic crosses a multihop network. However, reordering may be acceptable for one class of apps: those that are reordering-insensitive, like vat. There was some discussion as to the degree to which this is true.

3) There is no reason to offer both schemes. If reordering is never preferable, an implementor able to build both should just do the better one.

Currently, the CL spec allows both reordering and non-reordering implementations of the overload behavior. This range of options was included to permit product differentiation and ease of implementation in different circumstances. The alternative that was proposed on the list was to preclude reordering implementations.

The discussion in the meeting raised the following points, positions and issues:

1) The Internet today reorders packets. It's a fact. Since the Internet does reorder, leave the spec as is.

2) Reordering can hurt voice as well. But it was noted that reordering can only occur if traffic is out of profile, not in the normal case. Further, it was not clear that reordering implies a worse delay bound.

3) Due to possible distortion of traffic shape as it passes through routers, traffic that conforms to its profile at the edge may be non-conformant in the middle of the net.

After discussion of several options--preclude reordering, or say "should not", or allow both, there was a first sense of the meeting to prefer "should not". The point was reconsidered, and the final conclusion was to put an extended discussion of the issue in the implementation discussion in the spec, and to note the benefits of not reordering in the evaluation criteria section.

The next technical issue raised was the role of the "B" parameter and its relation to buffer allocation, for both real time and TCP-style flows.

Non rate-adaptive flows such as real time can exceed their reservation for two reasons: first, multiple use of a shared reservation, and second, a transient due to burst aggregation. The minimum necessary to deal with this is a reasonable amount of buffering in the switch, but a better approach is to drop packets when the delay gets too high. One must not drop just because the traffic exceeds its Tspec, but only when actual queues develop. This queue management discipline is in fact the same as the discipline suitable for rate-adaptive flows such as TCP. Thus there is no recognized need to have two modes of buffer management in CL, for flows whose rate can or cannot adapt. Adding two explicit modes raised the further issue of ensuring that one cannot cheat and get better service by asking for the "wrong" mode. So the proposal, since there is not a clear need for two modes, is not to define a means, such as the "TCP" flag, to select or specify explicit buffer management disciplines.
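
A minimal sketch of the single queue management discipline described above: the drop decision looks only at the delay implied by the actual backlog and never at Tspec conformance. The class name, the fixed delay threshold, and the byte-based accounting are assumptions for illustration.

```python
from collections import deque


class DelayBoundedQueue:
    """Drops arrivals only when the standing queue implies too much delay."""

    def __init__(self, link_bytes_per_s, max_delay_s):
        self.link = link_bytes_per_s
        self.max_delay = max_delay_s
        self.packets = deque()
        self.backlog = 0  # bytes currently queued

    def offer(self, size):
        # Delay the arrival would see: the backlog ahead of it drained at link speed.
        if self.backlog / self.link > self.max_delay:
            return False  # dropped because an actual queue has developed
        self.packets.append(size)
        self.backlog += size
        return True

    def transmit(self):
        size = self.packets.popleft()
        self.backlog -= size
        return size
```

Because the rule is the same for every flow, there is no mode for a sender to pick, so there is nothing to gain by declaring the "wrong" kind of flow.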

It was further noted, in response to questions from the floor concerning the meaning of the B parameter for TCP, that if you are asking for a reservation for TCP, with the goal that the TCP really achieve the reserved rate in the Tspec, the reservation must tell the router to allocate a specific number of buffers, not just link capacity. Otherwise, packets of a typical TCP burst will get lost. In the absence of a specific calculation of the expected TCP burst size (as much as one round trip of traffic), one could try setting B to 2 * MTU, which is sufficient to deal with the TCP behavior in which two packets are produced in response to one ack. Without this level of buffer assurance, the source will have to pace its sending of packets, which is difficult with today's systems and hardware. The question was raised whether the CL spec stated that the buffers specified in the Tspec must be explicitly allocated and dedicated to this flow. The answer is no; the router is permitted to allocate both buffers and capacity on a statistical basis. But all the Tspec parameters should be considered by the admission control algorithm.
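
A minimal sketch of choosing B for a TCP reservation along the lines discussed. The 2 * MTU fallback comes from the discussion; taking one round trip of traffic at the reserved rate when the round-trip time is known, and never going below the fallback, are assumptions of this sketch. The function name is hypothetical.

```python
def tspec_bucket_depth(mtu_bytes, rate_bytes_per_s=None, rtt_s=None):
    """Suggest a bucket depth B in bytes for a TCP flow's Tspec."""
    fallback = 2 * mtu_bytes  # covers two packets sent in response to one ack
    if rate_bytes_per_s is None or rtt_s is None:
        return fallback
    # Assumed: a burst can be as much as one round trip of traffic.
    return max(fallback, int(rate_bytes_per_s * rtt_s))
```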

After some discussion, it was the sense of the meeting that trying to distinguish the treatment that TCP and that real time flows need is more complex than could be justified. The current form of the spec, which does not make any distinction, was affirmed.

There was no discussion from the floor at the meeting of revising the basic goals of CL, as they were presented by Wroclawski.

The action approved by the meeting was to clean up certain wording in the spec, to leave the technical matters discussed above as they are currently treated in the draft, and send the draft in for consideration as a proposed standard.

The next item on the agenda was a presentation by Juha Heinanen on Protected Best Effort (PBE) Service.

The motivation of this service is the claim that CL is too complex. What is needed is to protect BE users from non-adaptive flows. We need to replace Van Jacobson in his current role as real time traffic police. So a PBE service is proposed, as a way to allow a user to ask for a fair share of BE capacity. This request could be used to control IP routing, etc.

The specific features of the PBE service are as follows.

PBE is intended for elastic, time insensitive applications. It may not be suitable for some real time traffic. It provides an isolated BE flow with optional min. BW. All unprotected BE traffic is considered as a single PBE flow. The end to end behavior is that a flow gets a fair share or minimum requested bandwidth (MRE), whichever is larger. There should be fair transit delay among PBE packets on the same path. There should be ordered delivery of PBE packets as long as the path remains the same. There is no guarantee of delivery of packets at higher than the min. rate.

Why PBE? The motivation is to support all elastic non real time users and only them. It supports apps that don't know their traffic characteristics, and protects the apps against greedy Internet users. It may provide an upper bound on transfer times.

The TSpec: min. requested BW, min. policed unit, max. packet size, and might add peak rate.

A PBE flow is nonconformant if it exceeds the fair share or minimum requested bandwidth, whichever is larger. Fair is defined according to the max-min concept.
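
A minimal sketch of the max-min concept on a single link, followed by the conformance test stated above. Treating each flow's offered rate as its demand, and the function names, are assumptions of this sketch; the presentation did not specify an algorithm.

```python
def max_min_share(capacity, demands):
    """Max-min fair allocation of one link's capacity among flows.

    Flows with small demands are satisfied in full; the capacity they leave
    unused is split equally among the remaining flows.
    """
    alloc = [0.0] * len(demands)
    remaining = float(capacity)
    active = sorted(range(len(demands)), key=lambda i: demands[i])
    while active:
        share = remaining / len(active)
        i = active[0]
        if demands[i] <= share:
            alloc[i] = float(demands[i])
            remaining -= demands[i]
            active.pop(0)
        else:
            # Every remaining flow wants more than an equal split.
            for j in active:
                alloc[j] = share
            break
    return alloc


def pbe_conformant(rate, fair_share, min_requested):
    """A flow conforms while it stays within the larger of the two limits."""
    return rate <= max(fair_share, min_requested)
```

For a link of capacity 10 and demands of 2, 2.6, 4 and 5, the two small flows are satisfied in full and the other two split the remainder equally, about 2.7 each.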

Policing: nonconformant traffic may be discarded, but should not as long as conforming traffic can be protected. It must be discarded if buffer resources would not be adequate. Under this approach, one can calculate how much buffer space each flow uses, so we don't need RED.

Evaluation criteria:

fairness can be measured with a parking lot algorithm
delay should depend only on active flows
bandwidth should not depend on the number or behavior of unprotected BE flows
each flow should get its minimum rate

Following this presentation, there was discussion that raised the following issues and concerns:

1) Is fairness a good thing? It seems initially appealing, but is not obviously the right thing.

2) The max-min definition of fairness may not be suitable in practice, and is not at all simple to achieve. So this specification may not in fact be simple.

3) The proposal provides "fairness" only for those flows which make a reservation (flow setup). Movement towards per-flow setup for best effort may be a bad direction for the Internet.

4) The simplicity of the service may come at the cost of constraining the implementation options to essentially one approach, a per flow WFQ. This may not be an acceptable constraint for the range of routers in the Internet.

It was noted that by relaxing the definition of fair allocation, one might permit a wider range of implementations. However, this might also remove some of the appealing initial simplicity of the service goal.

It was agreed that this proposal would be further discussed and refined.

The next and last item on the agenda was a presentation by Lee Breslau on Partial Deployment of the Service Model.

The three questions under consideration in this talk were 1) whether we can expect all services to be deployed at all IS routers? If not, 2) can end to end services be composed out of the different heterogeneous services that are found at each router, and if so, 3) should the end node be allowed to control this?

Can we expect universal deployment of advanced Integrated Services (IS)? The answer seems obviously no. Implementation is hard on some subnets, it is left to service providers to decide what to install, etc. So how can we deploy and utilize real time services in a heterogeneous environment with varying subnet characteristics? One option is only to define services that are implementable on all subnet technologies. This would imply a baseline service weaker than CL. This reduces to too low a common denominator.

Services at some but not all hops may be very useful. Even today, we tunnel between non-RSVP nodes, which has the same effect. And if service is in place at critical (e.g. congested) nodes, a flow can get most of the benefit. If we do not allow service variation, then the only solution is just to reject all non-conforming requests. But if we allow composition of heterogeneous service at different nodes, then we must decide how to compose. One question is whether apps will want to know if the service they get does not exactly conform. Some can just work without knowing, but some will need/want to know the nature of the actual service.

One mechanism to let apps know what is going on is to add a flag to the request message saying that an app will or won't accept replacement service at nodes. In multicast, merging is a problem for this flag. Another mechanism is for routers to advertise availability of services. This allows apps to know the degree to which a request will achieve the service, but needs extra mechanism for advertising.

Service replacement might work as follows: when a router does not offer a service, it offers some replacement, such as BE. When it gets a service request, it maps it to the local replacement. Replacements are characterized as reliable or unreliable. BE is a replacement (perhaps unreliable) for anything. Replacement decisions are local.

Adspecs can be augmented to indicate to the requester what the degree of replacement is as follows. Include in the adspec counts of the number of routers that offered reliable and unreliable replacement as well as how many implemented the actual service. Each router can then export additional substitute parameters, such as an estimate of D for guaranteed service.
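
A minimal sketch of the three counts being accumulated hop by hop. The class, field, and treatment names are hypothetical; no wire format or field layout was specified in the talk.

```python
from dataclasses import dataclass


@dataclass
class AdspecCounts:
    """Hypothetical adspec extension: three integers updated at each router."""
    actual: int = 0       # routers that implemented the requested service
    reliable: int = 0     # routers that offered a reliable replacement
    unreliable: int = 0   # routers that offered an unreliable replacement (e.g. BE)

    def record_hop(self, treatment: str) -> None:
        """Each router makes a local decision and bumps exactly one counter."""
        if treatment not in ("actual", "reliable", "unreliable"):
            raise ValueError(f"unknown treatment: {treatment}")
        setattr(self, treatment, getattr(self, treatment) + 1)

    def fully_served(self) -> bool:
        """True when no router along the path substituted another service."""
        return self.reliable == 0 and self.unreliable == 0


counts = AdspecCounts()
for hop in ["actual", "reliable", "actual", "unreliable"]:
    counts.record_hop(hop)
```

The receiver learns how many hops substituted a service and of which kind, but not which routers did so.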

In addition to the three questions at the beginning of the talk, there are two more. 4) should we consider service-specific solutions for this problem, and 5) do we want extensible solutions?

Following this presentation, the following issues were raised in the discussion.

It was observed that this approach, if kept simple, is a good idea, but it could be a real morass of complexity. Does the detailed information really matter to the end service? One extreme is to set one bit in the adspec saying a replacement occurred, but this cannot (for example) distinguish between reliable replacement and BE replacement. However, another approach noted was to use the bit to signal the end node that some replacement occurred; the end node could then use diagnostic tools to find out what actually happened at each node. This might permit the actual notification mechanism to be very simple. There was some disagreement as to whether an adspec of three integers is simple or complex.

Another concern was that since different services have different parameters, replacement may be limited by the wrong parameter set. However, today this does not seem to be a real issue, since we have a limited number of services, and this is not a factor with them. Mostly today the replacement service would be BE, and in that case we never need additional info.

A question was asked as to whether there is a problem of how admission control works in the presence of various sorts of service replacements.

There is a practical issue of what the individual vendor chooses to call a true implementation of a service, a reliable replacement, and an unreliable one. There may be marketing issues here. There is also a technical issue as to when a replacement is reliable. For example, if a BE service is currently very underloaded, it might at the moment represent, in practice, a reliable replacement even for G. The current proposal would preclude this sort of replacement as reliable, unless the underloaded condition was known to persist for a very long time. The determination might be administrative, rather than based on measurement.

Craig Partridge offered, since he wanted to move forward on this aspect of the problem, to work with Lee to make this happen.