Re: [Sipping] Overload/Congestion mechanisms - design choices and issues
Jeroen:
Please see below.
Indra
Jeroen van Bemmel wrote:
Volker, Jonathan,
"load balancing" and "overload control" are two related but different
issues. I get the feeling that we need to get the distinction clear
before plunging into this discussion. I don't pretend to know the
answers, but let me try and give it a go
I couldn't agree more with your statement that load balancing and
overload control are related but different issues.
I'd say that in general they are not related, but in this context, they
can be argued to be somewhat related.
"Load balancing" is about resource optimization and scalability. What
makes load balancing for SIP (or proxy systems in general) difficult,
is that behind a SIP element there is typically a cloud of other
systems. An upstream SIP peer typically has no knowledge of the
structure of this cloud, especially when it is in another domain. My
hunch is that to enable a peer to make good load balancing decisions,
it will need load information that is not only based on its direct
next hop but also on the systems behind that. So a major issue (for
load balancing) is the propagation and aggregation of this information
(and the underlying model).
Your hunch is also correct: load balancing based on information
(e.g., load) from behind the next hop will lead to better performance
IF this information can be obtained. We have verified this in our
simulation experiments.
The problem is how to get this "behind the cloud" information.
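To make the "behind the cloud" point concrete, here is a minimal Python sketch. It assumes each element can report its own utilization and that a hop's effective load is the worse of its own load and its best downstream path. The Node class, the aggregation rule and the function names are illustrative assumptions only; they are not part of SIP or of any proposal in this thread.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical model of a SIP element and the systems behind it."""
    name: str
    own_load: float  # utilization of this element itself, in [0, 1]
    downstream: List["Node"] = field(default_factory=list)


def aggregate_load(node: Node) -> float:
    """Effective load of a hop: the worse of its own load and the load
    of its least-loaded downstream path (an assumed aggregation model)."""
    if not node.downstream:
        return node.own_load
    best_downstream = min(aggregate_load(d) for d in node.downstream)
    return max(node.own_load, best_downstream)


def pick_next_hop(candidates: List[Node]) -> Node:
    """Choose the candidate next hop with the lowest aggregated load."""
    return min(candidates, key=aggregate_load)
```

With this model, a lightly loaded proxy that fronts heavily loaded servers loses out to a busier proxy with idle servers behind it, which is a choice that direct next-hop load alone would get wrong.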
"Overload/congestion control" is about reducing the load on a
particular system that is getting more than it is capable of handling.
The two mechanisms that can be applied here are traffic diversion and
throttling. To decide that overload control is needed, information
from a single element (i.e., the system that is experiencing overload)
suffices. However, to enable good diversion/throttling decisions, you
may need the same information that is used for load balancing.
I wager that the upstream system does not need to know exactly which
part of the peer system (disk, network, etc.) is experiencing overload.
This information would be useful for diagnosis, but not to decide what
to do (i.e., divert and/or throttle). Since the system under overload
would know best what effect certain decisions would have on its
overload status (and perhaps also on the rest of its network), I
reckon that the system under overload should make a suggestion to the
upstream peer on what to do. In other words, not "help, I am 98%
loaded, and it is because my disks are failing" but concrete
suggestions / parameters on what to do (see next).
Traffic diversion done in a sensible manner is equivalent to load
balancing, thus confirming the relatedness, right?
I agree with you that the system under overload should make a suggestion
to its upstream peer to perform the overload control. In fact, if the
system under overload performs its own overload control, the overall
system is likely to experience congestion collapse.
To take it to the next level, it would be even better, if it can be
done, for the system under overload to make a suggestion to its ingress
proxy (though this is probably pushing too much).
Regarding traffic diversion: what we currently have is forwarding or
3xx responses. What we might need here is something Diameter also
offers, which I would call "selective temporary redirection". In
Diameter, you can say "for the next 10 minutes, send all traffic for
{this session, this domain, this user, ...} to this and that node".
You could do the same for SIP, the selection criteria could e.g. be
pre-established, standardized profiles or fully dynamic.
A solution would probably need to support both implicit and explicit
alternative hosts, i.e., the system in overload must be able to express
"send your traffic to these nodes instead" but also "to any node other
than me".
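As a rough illustration of "selective temporary redirection", the Python sketch below models a rule with selection criteria, an expiry time, and either explicit or implicit alternative hosts. The RedirectRule type, its field names and the choose_destination helper are hypothetical; nothing here corresponds to an existing SIP or Diameter encoding.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class RedirectRule:
    """Hypothetical rule sent by an overloaded node to its upstream peer."""
    criteria: Dict[str, str]  # e.g. {"domain": "example.com"}; empty matches all
    targets: List[str]        # explicit alternatives; empty means "anyone but me"
    expires_at: float         # absolute time (seconds) after which the rule lapses

    def matches(self, request: Dict[str, str], now: float) -> bool:
        """True if the rule is still valid and every criterion matches."""
        if now >= self.expires_at:
            return False
        return all(request.get(k) == v for k, v in self.criteria.items())


def choose_destination(request: Dict[str, str], rules: List[RedirectRule],
                       overloaded_node: str, known_nodes: List[str],
                       now: float) -> str:
    """Return where the upstream peer should send this request."""
    for rule in rules:
        if rule.matches(request, now):
            # Explicit targets win; otherwise pick any node except the sender.
            candidates = rule.targets or [
                n for n in known_nodes if n != overloaded_node
            ]
            if candidates:
                return candidates[0]
    return overloaded_node  # no applicable rule: route as usual
```

An upstream peer would consult such rules before forwarding, and simply fall back to normal routing once the suggested period has passed.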
For throttling we have 503, which only applies to a single request.
What we might need here is a mechanism that allows a node to respond
with classification criteria (for example "all dialog creating
requests", or something more dynamic like IMS initial filter criteria)
and something like shaper settings, e.g. minimum intervals between
requests, max requests per time interval, etc. Again, such a filter
would probably have a suggested time period.
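A throttling filter of this kind could look roughly like the Python sketch below: a classification criterion (here a set of request methods) plus shaper settings and a validity period. The ThrottleFilter class, its parameter names and the choice of methods are assumptions made for illustration, not an existing mechanism.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ThrottleFilter:
    """Hypothetical filter an overloaded node suggests to its upstream peer."""
    methods: frozenset   # classification: request methods subject to throttling
    min_interval: float  # minimum seconds between admitted requests
    max_requests: int    # maximum admitted requests per sliding window
    window: float        # sliding window length in seconds
    expires_at: float    # absolute time (seconds) after which the filter lapses
    _admitted: deque = field(default_factory=deque)  # admit timestamps

    def admit(self, method: str, now: float) -> bool:
        """Return True if the upstream peer may forward this request now."""
        # Expired filters and unclassified requests are never throttled.
        if now >= self.expires_at or method not in self.methods:
            return True
        # Forget admissions that have left the sliding window.
        while self._admitted and now - self._admitted[0] >= self.window:
            self._admitted.popleft()
        # Enforce the minimum spacing between admitted requests.
        if self._admitted and now - self._admitted[-1] < self.min_interval:
            return False
        # Enforce the cap on requests per window.
        if len(self._admitted) >= self.max_requests:
            return False
        self._admitted.append(now)
        return True
```

For example, a filter over {"INVITE"} with a one-second minimum interval and at most two requests per ten seconds would leave BYE requests untouched while shaping new call attempts.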
My 2 cents,
Jeroen
_______________________________________________
Sipping mailing list https://www1.ietf.org/mailman/listinfo/sipping
This list is for NEW development of the application of SIP
Use sip-implementors at cs.columbia.edu for questions on current sip
Use sip at ietf.org for new developments of core SIP