RE: [Dime] CER/CEA on an open connection


After rereading the proposed text, I have some comments/questions, mainly
prompted by the sentence "Applications can be restarted for various reasons
including maintenance and upgrades." This again makes me think about
graceful shutdown of applications, which I briefly mentioned before on the
mailing list.

I see three different scenarios in which the application set supported by a
node changes:
a) An application is taken out for maintenance/upgrade; I will call this
controlled shutdown.
b) An application crashes, etc.; I will call this uncontrolled shutdown.
c) An application starts; maybe it is newly installed, or it restarted after
a)/b).

For c), I think CER is appropriate, and likewise for b). OTOH for
a), IMO the more useful approach is to gracefully shut down the application.
This means waiting until all pending sessions are over and only then
shutting down the application. As an example, consider a credit control
server with several session-based credit control sessions in progress. I
wouldn't think operators would be happy to sacrifice all of those sessions
when the application needs to be shut down for maintenance purposes.

This issue can be addressed by introducing a new error code in the protocol
errors class. If the server is in the process of being gracefully shut down,
it could reply with this error code to requests initiating a new session.
Requests for existing sessions would be answered as usual. When the
immediate neighbor of the server (which could be a client or a relay/proxy)
receives an answer with this new error code, it shouldn't send new session
requests to the server anymore. For relays/proxies this probably corresponds
to not sending messages that lack a Destination-Host AVP to the
server. When the server has no more pending sessions, it can send CER.

What do people think about this?

   Thanks,
   Tolga

> -----Original Message-----
> From: Ram O V Vishnu-A14676 [mailto:vishnu at motorola.com]
> Sent: Monday, June 05, 2006 3:40 AM
> To: john.loughney at nokia.com
> Cc: dime at ietf.org
> Subject: RE: [Dime] CER/CEA on an open connection
>
>
> All,
>
> Below is the proposed text for peer capabilities update in Base Diameter
> (as discussed in this email list earlier).
>
> John, please advise us on the process for text proposals to *-bis
> versions.
>
> Regards,
> Vishnu.
>
> ------------------------------------------------------------
> 5.3.8 Peer Capabilities Update
>
> Among the capabilities exchanged during Diameter connection
> initialization, the list of applications supported by the node can change
> dynamically. Applications can be restarted for various reasons including
> maintenance and upgrades.
>
> A Diameter node MUST initiate peer capabilities update by sending a
> Capabilities-Exchange-Request (CER) to all its peers which support peer
> capabilities update and are in open state. Diameter nodes that implement
> peer capabilities update SHOULD check the version information advertised
> by its peer in the Diameter header of the previous CER/CEA exchange to
> determine if the peer supports peer capabilities update. The Diameter
> node MUST NOT send peer capabilities update to the peer if it determines
> that the peer has no support for such scheme, instead it SHOULD
> gracefully disconnect its current connection and attempt to establish a
> new connection towards that peer. In either case, the Diameter node is
> expected to advertise the most recent set of supported applications in a
> CER message, as specified by the peer state machine (see Section 5.6)
> while it is in the open state.
>
> The receiver of CER in open state MUST process and reply to the CER as
> described in Section 5.3. The CEA which the receiver sends MUST contain
> its latest capabilities. Note that peers which successfully process the
> peer capabilities update SHOULD also update their routing tables to
> reflect the change. The receiver of the CEA, with a Result-Code AVP
> other than DIAMETER_SUCCESS, initiates the transport disconnect. Peer
> capabilities update in the open state SHOULD be limited to the
> advertisement of the new list of supported applications and MUST
> preclude re-negotiation of security mechanism or other parameters.
> ------------------------------------------------------------
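The sender-side rule in the proposed text can be sketched as follows. This is only an illustration under stated assumptions: the `Peer` class, the `supports_update` flag (assumed to have been derived from the version seen in the earlier CER/CEA exchange) and the action names are hypothetical and not part of the proposal.

```python
from dataclasses import dataclass


@dataclass
class Peer:
    name: str
    state: str             # e.g. "open" or "closed"
    supports_update: bool  # assumed: derived from the earlier CER/CEA version check


def plan_capabilities_update(peers):
    """Return the action to take per peer when the local application set changes."""
    actions = {}
    for peer in peers:
        if peer.state != "open":
            # Peers that are not in open state are not part of the update.
            continue
        if peer.supports_update:
            # MUST send a CER carrying the latest application list.
            actions[peer.name] = "send_cer"
        else:
            # MUST NOT send CER in open state; SHOULD disconnect gracefully
            # and establish a new connection instead.
            actions[peer.name] = "reconnect"
    return actions
```

For example, with one capable open peer, one legacy open peer and one closed peer, only the first two get an action.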
>
> Regards,
> Vishnu.
>
> Motorola India Electronics Pvt Ltd
> +91 9844178052
> [*] Motorola Internal Use Only
>
>
>
> -----Original Message-----
> From: Victor Fajardo [mailto:vfajardo at tari.toshiba.com]
> Sent: Monday, April 10, 2006 8:13 PM
> To: Tolga Asveren
> Cc: dime at ietf.org
> Subject: Re: [Dime] CER/CEA on an open connection
>
>
> Hi All,
>
> > I believe in general now we all have a good understanding about what
> > the issues are for renegotiation. It could be an idea to have a new
> > iteration of the proposed text, and to continue the discussions on the
> > new version.
> >
>
> To that end, I'll plan on generating a new set of text maybe next week
> as we let people digest and hopefully comment on the latest round of
> discussion for the next couple of days.
>
> best regards,
> victor
>
>
> _______________________________________________
> DiME mailing list
> DiME at ietf.org
> https://www1.ietf.org/mailman/listinfo/dime
> I believe in general now we all have a good understanding about what the
> issues are for renegotiation. It could be an idea to have a new
> iteration of the proposed text, and to continue the discussions on the
> new version.
>
> A few more comments/questions below.
>
>     Thanks,
>     Tolga
>
> > -----Original Message-----
> > From: Ram O V Vishnu-A14676 [mailto:vishnu at motorola.com]
> > Sent: Monday, April 10, 2006 2:38 AM
> > To: dime at ietf.org
> > Cc: Nakhjiri Madjid-MNAKHJI1
> > Subject: RE: [Dime] CER/CEA on an open connection
> >
> >
> > Hi Victor/Timothy,
> >
> > 1. Comments on Retry mechanism <to Timothy>:
> > To summarize, we are sending CER in open state because our
> > capabilities (supported apps) are changing and we would like to update
> > the peer about it. We will maintain open state.
> >
> > We may not receive a CEA for the following possible reasons:
> > 1.1) The peer does not support CER in open state. In such a case, the peer
> > might still originate unsupported app messages towards us, which will
> > result in a "DIAMETER_APPLICATION_UNSUPPORTED" error from us. We can
> > leave it to the peer implementation to "correct this error" as
> > mentioned in sec 7.1.3 of RFC 3588. This will take care of the case
> > where there is a relay/proxy in between.
> > 1.2) Connection is lost with the peer, in which case the DWR/DWA
> > mechanism should step in.
> >
> > The solution suggested by Timothy (to wait for CEA & timer) will
> > need changes to the FSM, even though we agree that this is consistent with
> > the initial CER/CEA scenario.
> >
> > Of course the use of "DIAMETER_APPLICATION_UNSUPPORTED" will result in
> > added (unnecessary) traffic in case the peer/proxy is not clever enough
> > to correct its behaviour. This can be avoided by using the version
> > number exchange as Victor pointed out. So, we will not attempt the CER
> > in open state to a peer which doesn't support CERs in open state,
> > and can go for a reconnection.
> [TOLGA]The problem case of a node not being clever enough to take action
> based on "DIAMETER_APPLICATION_UNSUPPORTED" is not limited to
> renegotiations. This could happen in existing systems developed
> according to RFC3588 as well. I would expect a node, whose
> DIAMETER_APPLICATION_UNSUPPORTED replies are not honored by its neighbor
> to drop the connection, but as said this is not limited to renegotiation
> case. For cases, where the neighbor does not support renegotiation, what
> would be the result code in CEA? DIAMETER_UNABLE_TO_COMPLY? We can say
> that such a result code in CEA should indicate that the neighbor does
> not support renegotiations and sender of CER should/may drop/reestablish
> the connection. Please note that with a properly behaving neighbor,
> dropping connection is probably not necessary because the first
> DIAMETER_APPLICATION_UNSUPPORTED should update its tables properly. I
> don't know whether we need to standardize how a node determines whether
> a neighbor honors DIAMETER_APPLICATION_UNSUPPORTED result code.
> >
> > 2. Comments on "re-negotiation" <to Victor>
> > We think that what we need is an advertisement mechanism and not a
> > negotiation mechanism. The CER/CEA in open state allows us to do that.
> > We are proposing that the receiver of the CER takes the decision on
> > the disconnection. Thus we are able to delay the disconnection
> > (assuming that the point mentioned in my email before that the app
> > changes are temporary). Since CER allows us to convey the latest set
> > of supported apps to the peer, we favour the CER to the DPR.
> >
> > Following is a scenario where sending CER/CEA is better than the DPR
> > mechanism. (Apologies if the figure is mangled due to tab settings).
> [TOLGA]Although the example case below is a race condition which
> probably won't happen very often -it relies on changing supported
> application set on two nodes more or less simultaneously-, I also prefer
> relying on CER in such a situation, so that there is a single way of
> handling renegotiations in terms of advertising changes to the
> neighbors.
> >
> > Initial exchanged app list:
> > A: 1,2,3               B:2
> > 2 went down 3 came up just when 2 went down on A
> > ________            ________
> > |A     |----DPR---> |B     |
> > |      |            |      |
> > |      |<---CER---- |      |
> > |______|    [3]     |______|
> >         <---DPA----
> >
> > This, in our understanding, will result in reconnection even though
> > 3 is in common.
> >
> > Initial exchanged app list:
> > A: 1,2,3             B:2
> > 2 went down 3 came up just when 2 went down on A
> > ________              ________
> > |A      |----CER---> |B      |
> > |       | [1,3]      |       |
> > |       |<---CER---- |       |
> > |_______| [2,3]      |_______|
> >          ----CEA---->
> >           [1,3]
> >          <---CEA-----
> >           [2,3]
> > This will avoid a reconnection, because 3 is in common.
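The outcome of the second exchange can be checked with a small sketch. It assumes, as suggested elsewhere in this thread, that a connection stays up as long as the two most recently advertised application lists share at least one application; the function name and return values are hypothetical.

```python
def after_update(local_apps, peer_apps):
    """Decide whether the connection survives a capabilities update.

    Both arguments are the most recently advertised application lists.
    """
    common = set(local_apps) & set(peer_apps)
    # With no common application left, DIAMETER_NO_COMMON_APPLICATION
    # would be returned in the CEA and the connection closed.
    return ("keep", common) if common else ("disconnect", common)
```

With A advertising [1,3] and B advertising [2,3], the common set is {3}, so the connection is kept.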
> >
> > Regards,
> > Vishnu.
> >
> > Motorola India Electronics Pvt Ltd
> > +91 9844178052
> > [*] Motorola Internal Use Only
> >
> >
> > -----Original Message-----
> > From: Timothy Smith [mailto:tjsmith at us.ibm.com]
> > Sent: Monday, April 10, 2006 7:11 AM
> > To: Victor Fajardo
> > Cc: dime at ietf.org; Nakhjiri Madjid-MNAKHJI1; Ram O V Vishnu-A14676
> > Subject: Re: [Dime] CER/CEA on an open connection
> >
> >
> >
> > Hi Victor,
> >
> > Thanks for your response.  Comments below:
> > >>
> > >> Good summary!  I agree with most of your points.  I'm not sure,
> > >> however, that I distinguish between (1) and (2).  I think whether
> > >> temporary or not, we should handle CER/CEA exchanges in the same
> > >> manner.
> > >I'm just speculating but I think (2) refers to application change
> > >requiring a reboot.
> > >>
> > >> I agree with your design goals (3) ,(4), and suggestions (5) , (6)
> > >> , (8), and (9).
> > >If some are more in favor of scheme (5), maybe we need more opinions on
> > >whether the receiver of the CEA can always comply with the change
> > >request regardless of any scenario. I think (6) and (9) can be
> > >avoided regardless of which scheme we decide to use (see my previous
> > >reply).
> > >
> >
> > Here is the text from your previous reply:
> > "This certainly simplifies things but it also implies that the sender of
> > the CER mandates a change and the receiver has no choice but to accept
> > it. In some sense, the scheme is no longer a re-negotiation but merely
> > notifying the peer of a change. The proposed text was considering the
> > case where the receiver of the CER cannot comply with the change for
> > whatever reason."
> >
> > I tend to agree with the notion that the sender of the CER is
> > mandating a change.  The receiver does have a choice just as it has a
> > choice in the original CER/CEA exchange.  The receiver should respond
> > with the list of App Ids that is the intersection of what was listed in
> > the CER and of what App Ids it wishes to support.  It is a renegotiation
> > which may add or remove App Ids.  But either way, it is telling the other
> > side about its intentions.  I don't think the receiving side has any
> > choice but to participate in the renegotiation.  For example, if an App Id
> > is being removed, the side where it is being removed renegotiates with a
> > CER that does not include that App Id.  It doesn't do the receiving side
> > any good to decline this as the App will be shut down regardless.
> >
> >
> > >> Item 7, "7. Cross over and sequencing of CER/CEA exchange. We don't
> > >> think there is a problem here. Can't find any race condition."  I
> > >> thought that there was a subtle problem here, but I think you are
> > >> right.  Given that the connection ensures sequencing, the latest
> > >> CER or CEA that you receive is what you should use as the list of
> > >> negotiated App Ids.
> > >>
> > >Mmmm... i'm not sure that the connection itself ensures sequencing.
> > >Anyway, majority rule favors removing this text.
> > >> Should there be some discussion on the retry mechanism?  You send a
> > >> CER, and no response is received.  Does this mean that the peer
> > >> just doesn't know how to renegotiate?  Should we ignore and retry
> > >> every n seconds?  Bring down the connection?  Or do we just
> > >> continue to use the apps from the initial negotiation? My
> > >> preference on this is that we should require a CEA response to the
> > >> CER.  If the CEA is not received, we shut down the connection and
> > >> start over.  This would address compatibility issues with existing
> > >> implementations.  It would also be consistent with the processing of
> > >> the initial CER/CEA.
> > >>
> > >The current proposal has text mentioning the use of version number of
> > >initial CER/CEA exchange to determine whether a peer is capable of
> > >renegotiating. If a node knows that its peer is not capable of
> > >re-negotiating then the node should not initiate re-negotiation. In
> > >this scheme, existing implementations will be spared of any change.
> >
> > I'm not sure I picked up the version number.  I don't see a reason to
> > do something special.  I would simply like to renegotiate.  If the other
> > side does not respond to the CER, you shut down the connection and
> > restart it after some period of time.  This is what you would do if your
> > initial CER did not get a response.  How is this different?
> >
> >
> > Best Regards,
> > Timothy Smith
> >
> > Vishnu wrote:
> > > Hi,
> > >
> > > Some clarifications and comments on the discussion on this thread:
> > >
> > > We would like to clarify the following practical scenarios in which
> > > this happens:
> > > 1. There is a list of published applications which the box supports
> > > and these are installed in the box. Now, some of them go down/up.
> > > This translates into a change in peer capabilities.
> > > 2. A new application is being installed on/removed from
> > > the box.
> > >
> > > We think that (1) is the most probable scenario, so
> > > there is value in giving a simple solution assuming that this change
> > > (in capabilities) is most probably temporary. If this is not the case,
> > > we are talking about (2), which is a major change in the box anyway.
> > > So in (2), we assume it is OK for the connections to be
> > > reestablished.
> > >
> > > We would like to say that our design goals are:
> > > 3. solution should be simple & as backward compatible as possible.
> > > 4. minimize the FSM changes.
> > >
> > > We suggest:
> > > 5. Updating peer capabilities is done only using a CER/CEA in the open
> > > state. The sender of the CER/CEA updates the local capabilities before
> > > sending the message, and hence this is a local decision.
> > >
> > > 6. The rest of the processing of the CER/CEA will be as per the
> > > current RFC. For example, if there are no common applications left
> > > with that peer, DIAMETER_NO_COMMON_APPLICATION is sent in the CEA and
> > > the connection is closed.
> > >
> > > Problems in the email thread under discussion & some comments:
> > > 7. Cross over and sequencing of CER/CEA exchange. We don't think there
> > > is a problem here. Can't find any race condition.
> > >
> > > 8. Mutual agreement to bring down applications will not work due to
> > > possible relays in between as Tolga has pointed out in the mailing
> > > list.
> > >
> > > 9. The DPR solution in the suggested text is not a good idea,
> > > because DPR cannot advertise the latest local applications to the
> > > peer. This may cause the race condition and sequencing problem. This
> > > problem can be avoided by using the approach which we suggested in (5).
> > >
> > > Regards,
> > > Vishnu.
> >
> > tjsmith at us.ibm.com
> > (919) 254-4723
> >
> > _______________________________________________
> > DiME mailing list
> > DiME at ietf.org
> > https://www1.ietf.org/mailman/listinfo/dime


_______________________________________________
DiME mailing list
DiME at ietf.org
https://www1.ietf.org/mailman/listinfo/dime




Note: Messages sent to this list are the opinions of the senders and do not imply endorsement by the IETF.