[PCN] Feedback to draft-briscoe-tsvwg-cl-architecture-03.txt

Hi all,

I've read the draft
http://www.ietf.org/internet-drafts/draft-briscoe-tsvwg-cl-architecture-03.txt

I find the proposed concept pretty good as I consider it a simple
and technically feasible solution. Here are my comments regarding the draft.

1) Abstract: Although I have been working in this field for years, I did
not understand what the draft is about on first reading. Here is how
I'd summarize the presented mechanism.

This document describes a distributed, measurement-based, and resilient
admission control for application in large DiffServ-based networks.
All routers have a configured admission rate and pre-emption rate for their
adjacent links. They measure the admitted traffic they forward on each
outgoing link and use a so-called
pre-congestion notification (PCN) mechanism to mark packets that exceed
the configured rates. The egress router measures the rates of non-labelled,
admission-marked, and preemption-marked packets from each ingress, and
notifies the respective ingress if it must not admit any further flows
or if it must even pre-empt a certain rate of admitted flows. Per-flow
admission states are kept at the edge routers only, while the PCN markers
that are required for all routers focus on the aggregate traffic. The
admission control is resilient with regard to the following two aspects:
(1) the configured admission and preemption rates can be
chosen in such a way that the reservations for admitted flows are
not pre-empted if a foreseeable failure occurs, and (2) reservations
can be torn down explicitly if a large failure occurs, to preserve the
quality of service of the remaining flows.

2) The summary is very long. As the terminology is explained only in 1.2,
it is sometimes hard to follow Section 1.1. It is also unnecessarily
imprecise in places. I think Sections 1.1 and 1.2 could be rewritten
without increasing their length such that they
provide more clarity even on first reading.

3) Terminology for "sustainable-aggregate-rate" is not clear. Suggestion:
rate of non-preemption marked packets measured by an egress gateway for a
specific ingress gateway; this is the supportable traffic rate between
these gateways if the corresponding path is congested.

4) Section 2.2: o Routing: a study regarding the accuracy of load
balancing schemes can be found in

[LoadBalancing-a] Ruediger Martin, Michael Menth, and Michael Hemmkeppler:
"Accuracy and Dynamics of Hash-Based Load Balancing Algorithms for
Multipath Internet Routing", IEEE Broadnets, San Jose, CA, USA,
October 2006
http://www3.informatik.uni-wuerzburg.de/~menth/Publications/Menth06p.pdf

[LoadBalancing-b] Ruediger Martin, Michael Menth, and Michael Hemmkeppler:
"Accuracy and Dynamics of Multi-Stage Load Balancing for Multipath
Internet Routing", currently under submission
http://www3.informatik.uni-wuerzburg.de/~menth/Publications/Menth07-Sub-6.pdf

Sections 3 and 4: no comments; they look good and are easy to understand.

5) Section 5.1:

o tunnel: what kind of tunnel? IP-in-IP tunneling from
ingress gateway to egress gateway with the effect that all flows are
forced by ECMP onto the path that is determined by the IPs of these
tunnel endpoints? This is how I understood tunneling later in 5.7.1.

o probes: probe packets have the same address as the corresponding
flow. Do they finally leave the CL-domain and reach the destination?
This is unwanted extra traffic for this machine.

6) Section 5.2:

Another potential solution: smart marking

The meaning of preemption marked packets is: this is an excess packet
and the rate of the corresponding CL-region-aggregate will be reduced
in the future. Therefore, packets that are already preemption marked
should not be counted at another router for the metering process which
is done for preemption marking. This does not solve the problem in
Figure 4, but it alleviates the problem in other constellations:

                        IAR-b=1
                           @
             +-----+ L1 +-----+ L2
             |CPR=1|    |CPR=2|@@@@ SAR-b=2/3
IAR-a=2 >####| R1  |####| R2  |#### SAR-a=1/2*2/3=1/3
             +-----+    +-----+

These SARs lead in the end to a rate of 1 on link L2 although the
configured-admission-rate (CAR) is 2. If already marked packets are not
counted at R2 in the metering process for preemption marking, we get
SAR-b=1 and SAR-a=1/2, which leads in the end to the fully used CAR on
link L2.

After I wrote this, I read the draft about the markers:
http://www.ietf.org/internet-drafts/draft-briscoe-tsvwg-cl-phb-02.txt
The suggested algorithm in fact implements the preemption marker
in such a way that it counts only non-preemption-marked
packets in the measurement process. This is a requirement which
should also be mentioned in this draft.
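
The difference between the two metering rules in the example above can be
sketched with a small fluid model. This is only an illustration under my
own assumptions: marking is applied uniformly to all not-yet-marked
traffic on a link, and the SAR values are read as the unmarked share of
each aggregate's ingress rate. The names pass_link, unmarked_shares and
count_marked are hypothetical and come from neither draft.

```python
from fractions import Fraction


def pass_link(flows, cpr, count_marked):
    """Apply one preemption marker to the traffic on a link.

    flows maps an aggregate name to (unmarked_rate, marked_rate).
    If count_marked is False, the meter ignores already marked traffic.
    """
    metered = sum(u + (m if count_marked else 0) for u, m in flows.values())
    # Share of the still unmarked traffic that stays unmarked on this link.
    keep = Fraction(1) if metered <= cpr else Fraction(cpr) / metered
    return {name: (u * keep, m + u * (1 - keep)) for name, (u, m) in flows.items()}


def unmarked_shares(count_marked):
    """Unmarked share per aggregate for the two-router example (IAR-a=2, IAR-b=1)."""
    offered = {"a": Fraction(2), "b": Fraction(1)}
    after_l1 = pass_link({"a": (offered["a"], Fraction(0))}, cpr=1,
                         count_marked=count_marked)
    on_l2 = dict(after_l1, b=(offered["b"], Fraction(0)))
    after_l2 = pass_link(on_l2, cpr=2, count_marked=count_marked)
    return {name: after_l2[name][0] / offered[name] for name in offered}


print(unmarked_shares(count_marked=True))   # meter counts marked packets too
print(unmarked_shares(count_marked=False))  # meter ignores marked packets
```

Under these assumptions the first call yields the shares 1/3 for a and
2/3 for b, and the second call yields 1/2 for a and 1 for b.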

7) Section 5.5

Another potential solution: add the declared rate to the SAR at the
ingress gateway as long as new flows send hardly any traffic. This
information must be obtained from the corresponding policers which
are also under the control of the ingress gateway. Or add it for an
initial period of e.g. 10 s?

8) Section 5.7.2
What does over-congested mean?

9) Section 5.7.4
Reminds me of RFC4127: Russian Dolls Bandwidth Constraints Model for
Diffserv-aware MPLS Traffic Engineering (and other models for that
objective)

10) Section 5.7.5
I did not understand the last sentence of this section.
Can it be rephrased?

11) Section 6.8 *NEW*
Other Network Admission Control Approaches
Link admission control (LAC) describes how admission control (AC) can be
done on a single link and comprises, e.g., the calculation of
effective bandwidths which may be the base for a parameter-based AC.
In contrast, network AC (NAC) describes how AC can be done for a network
and focuses on the locations from which data is gathered for the admission
decision. Most approaches implement a link budget based NAC (LB NAC)
where each link has a certain AC-budget. RSVP works according to that
principle, and the new concept likewise admits additional flows as long as
each link on the new flow's path still has resources available. The
border-to-border
budget based NAC (BBB NAC) pre-configures an AC budget for all
border-to-border relationships (=CL-region-aggregates) and if this
capacity budget is exhausted, new flows are rejected. The TCA-based
admission control which is associated with the DiffServ architecture
implements an ingress budget based NAC (IB NAC). These fundamentally
different concepts have different flexibility and efficiency with regard to
the use of link bandwidths [NAC-a,NAC-b]. They can be made resilient
by choosing the budgets in such a way that the network will not be
congested after rerouting due to a failure. The efficiency of the
approaches differs with and without such resilience requirements.

[NAC-a] Michael Menth: "Efficient Admission Control and Routing in
Resilient Communication Networks", PhD thesis, July 2004,
http://opus.bibliothek.uni-wuerzburg.de/opus/volltexte/2004/994/pdf/Menth04.pdf

[NAC-b] Michael Menth, Stefan Kopf, Joachim Charzinski, and Karl
Schrodi: "Resilient Network Admission Control", currently under
submission.
http://www3.informatik.uni-wuerzburg.de/~menth/Publications/Menth07-Sub-3.pdf
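
To make the distinction between the three budget types concrete, here is
a rough sketch of the admission decision in each case. It only reflects
where the budget is kept, as described in the proposed text above. All
function and parameter names are hypothetical, and the load bookkeeping
(adding the rate of admitted flows, releasing it on teardown) is left out.

```python
def admit_lb(path_links, rate, link_budget, link_load):
    """LB NAC: every link on the new flow's path must still have room."""
    return all(link_load.get(link, 0) + rate <= link_budget[link]
               for link in path_links)


def admit_bbb(ingress, egress, rate, b2b_budget, b2b_load):
    """BBB NAC: one budget per border-to-border relationship."""
    pair = (ingress, egress)
    return b2b_load.get(pair, 0) + rate <= b2b_budget[pair]


def admit_ib(ingress, rate, ingress_budget, ingress_load):
    """IB NAC: one budget per ingress, regardless of the flow's destination."""
    return ingress_load.get(ingress, 0) + rate <= ingress_budget[ingress]


# Example: a flow of rate 2 from ingress "A" to egress "B" over links L1 and L2.
print(admit_lb(["L1", "L2"], 2, {"L1": 10, "L2": 5}, {"L1": 3, "L2": 4}))
print(admit_bbb("A", "B", 2, {("A", "B"): 6}, {("A", "B"): 3}))
print(admit_ib("A", 2, {"A": 8}, {"A": 7}))
```

In the example, the LB check fails because L2 has only 1 unit left, the
BBB check succeeds, and the IB check fails.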

12) Section 11.1
Paragraph "The ECN-Capable ... codepoint set." is hard to understand.
Two more sentences of explanation might help.

13) Section 11.3
Running the EWMA whenever a new packet for a specific
CL-region-aggregate arrives makes the aging process depend on the
arrival rate of the traffic. As a consequence, there is the problem
of stale information. This can be done in a better way, e.g.
by using the time-exponentially weighted moving average (TEWMA) [TEWMA].

[TEWMA] Ruediger Martin and Michael Menth: "Improving the Timeliness
of Rate Measurements", in proceedings of GI/ITG Conference on Measuring,
Modelling and Evaluation of Computer and Communication Systems (MMB),
Dresden, Germany, September 2004
http://www3.informatik.uni-wuerzburg.de/~menth/Publications/Menth04h.pdf

If the TEWMA seems too complex (it requires the calculation of an
exponential function), the rate of the admission- or preemption-marked
packets can be calculated over fixed intervals of, e.g., 20 ms and used
as input for a conventional EWMA.
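
A minimal sketch of this interval-based alternative follows. It is not
the TEWMA from [TEWMA]; it only shows a conventional EWMA that is fed one
rate sample per fixed interval, so that empty intervals also age the
estimate. The class name, the weight of 0.25 and the use of integer
millisecond timestamps are assumptions made for the illustration.

```python
class IntervalEwmaRate:
    """Conventional EWMA fed with one rate sample per fixed interval."""

    def __init__(self, interval_ms=20, weight=0.25):
        self.interval_ms = interval_ms
        self.weight = weight
        self.rate = 0.0   # smoothed rate in bytes per second
        self._slot = 0    # index of the interval currently being filled
        self._bytes = 0   # marked bytes counted in that interval

    def _close_slots(self, now_ms):
        # Close every interval that has ended, including empty ones, so that
        # aging depends on elapsed time and not on packet arrivals.
        current = now_ms // self.interval_ms
        while self._slot < current:
            sample = self._bytes * 1000.0 / self.interval_ms
            self.rate += self.weight * (sample - self.rate)
            self._bytes = 0
            self._slot += 1

    def on_marked_packet(self, now_ms, size_bytes):
        self._close_slots(now_ms)
        self._bytes += size_bytes

    def read(self, now_ms):
        self._close_slots(now_ms)
        return self.rate


meter = IntervalEwmaRate(interval_ms=20, weight=0.5)
meter.on_marked_packet(0, 1000)
meter.on_marked_packet(10, 1000)
print(meter.read(20))  # one closed interval containing 2000 bytes
print(meter.read(60))  # two further empty intervals have aged the estimate
```

Because the estimate is also updated on read, it keeps decaying while no
marked packets arrive, which addresses the stale-information problem.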

14) Some places in the document say that admission and preemption marking
is done based on the rate of the packets AND some virtual queue size. This
is misleading from my perspective. The objective is
a) to admission-mark all packets of a CL-region-aggregate if the
configured admission rate is exceeded and
b) to preemption-mark only those packets exceeding the configured
preemption rate.
The virtual queue is just a means to allow a tolerance against
short-time rate fluctuations, like the "cell delay variation tolerance"
or the "burst tolerance" in ATM. More on this topic
in an extra mail regarding draft-briscoe-tsvwg-cl-phb-02.txt
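
To illustrate what I mean by the virtual queue acting only as a
tolerance, here is a small sketch of case b). It is not the algorithm
from draft-briscoe-tsvwg-cl-phb-02, only a generic virtual queue drained
at the configured rate. A packet counts as excess when it would push the
backlog beyond the tolerance. The class and parameter names are
hypothetical.

```python
class ExcessMarker:
    """Virtual queue drained at the configured rate, with a burst tolerance."""

    def __init__(self, rate_bps, tolerance_bits):
        self.rate_bps = rate_bps              # configured rate that drains the queue
        self.tolerance_bits = tolerance_bits  # permitted backlog (burst tolerance)
        self.backlog = 0.0
        self.last_time = 0.0

    def is_excess(self, now, size_bits):
        # Drain the virtual queue for the time elapsed since the last packet.
        drained = (now - self.last_time) * self.rate_bps
        self.backlog = max(0.0, self.backlog - drained)
        self.last_time = now
        if self.backlog + size_bits > self.tolerance_bits:
            return True   # excess packet: mark it and do not enqueue it
        self.backlog += size_bits
        return False


marker = ExcessMarker(rate_bps=1000, tolerance_bits=3000)
burst = [marker.is_excess(0.0, 1000) for _ in range(4)]
print(burst)                          # only the fourth packet of the burst is excess
print(marker.is_excess(1.0, 1000))    # one second later the queue has drained enough
```

A larger tolerance lets longer bursts pass unmarked, while the long-term
unmarked rate stays bounded by the configured rate.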

Best wishes,

    Michael

-- 
Dr. Michael Menth, Assistant Professor
University of Wuerzburg, Institute of Computer Science
Am Hubland, D-97074 Wuerzburg, Germany, room B206
phone: (+49)-931/888-6644, fax: (+49)-931/888-6632
mailto:menth@informatik.uni-wuerzburg.de
http://www3.informatik.uni-wuerzburg.de/research/ngn
