
RE: [PWE3] BFD for MPLS PWs



Luca,
 
Your entire email suggests that all that OAM stuff isn't really that critical. To some extent I even agree with you. When you run MPLS over SONET or SDH, the probability of a pure MPLS or PW failure is low and does not warrant a heavy investment in PW OAM. If I were an operator, I don't think that I would run BFD over every PW.
 
However, your comment about ATM not being used as the Internet Protocol is nonsense. The Internet is by design a connectionless network with completely different properties than the services for which ATM was designed. There are many different network architectures and many different services and for some of these OAM implementations are required that do not apply to the Internet.
 
The irony of the situation is that even though you don't see much value in OAM, you are proposing solutions that complicate the implementation of it: you believe that in many cases defects should be reported twice, once via in-band notifications and once via PW Status.
 
My proposal is to define OAM in a way that is as simple as possible. For each defect there is a well-defined, minimal set of consequent actions. That seems the obvious way to minimize the burden that OAM could potentially impose. One would think that this would resonate with you. Yet, for some reason, and I don't understand why, you are fighting it.
 
I can't prevent you from writing emails to explain what you believe is the best solution, but it is worth considering that several of us have spent a lot of time defining the OAM Message Mapping draft. Therefore, if you believe that there is a fundamentally better solution, I would rather see you work it out in a document with sufficient detail so that we can compare apples with apples.
 
Peter
 
 

 -----Original Message-----
From: Luca Martini [mailto:lmartini at cisco.com]
Sent: Tuesday, August 22, 2006 6:47 PM
To: Busschbach, Peter B (Peter)
Cc: Swallow George; 'Thomas D. Nadeau'; Pignataro Carlos; Morrow Monique; pwe3 WG ((((((((E-mail)))))))); Danny McPherson; Agarwal Rahul; Stewart (stbryant) Bryant
Subject: Re: [PWE3] BFD for MPLS PWs

Busschbach, Peter B (Peter) wrote:
Luca,

  
I would suggest that you look at the e-mail archive. If I remember 
correctly there were strong opinions about making the PW 
status messages mandatory.
    

I remember that there was a strong consensus to use PW Status instead of Label Withdraw. In that sense PW Status is indeed mandatory. However, I don't remember ever having seen a discussion about the mandatory use of PW Status for *every* single PW and AC defect. It would be helpful if you could identify that email exchange.

Further comments in-line.

Peter

  
-----Original Message-----
From: Luca Martini [mailto:lmartini at cisco.com]
Sent: Friday, August 18, 2006 6:15 PM
To: Busschbach, Peter B (Peter)
Cc: Swallow George; 'Thomas D. Nadeau'; Pignataro Carlos; Morrow
Monique; pwe3 WG ((((((((E-mail)))))))); Danny McPherson; 
Agarwal Rahul;
Stewart (stbryant) Bryant
Subject: Re: [PWE3] BFD for MPLS PWs


Busschbach, Peter B (Peter) wrote:
    
To move the discussion forward, let me follow up on my own question.

I believe that there is a broad consensus that insertion of 
      
AIS, as mandated by draft-ietf-pwe3-atm-encap-11, implies 
that the PE will NOT send an additional PW Status message to 
report the same defect.
    
  
      
Here I am assuming you mean insertion of an AIS alarm at the 
PE toward 
the local attachment circuit.
    

No. My example was about LOS, which is an AC defect, which according to draft-ietf-pwe3-atm-encap-11 triggers the PE to insert F4 AIS over the PW.

  
We should also send a status message with status "0x00000008 - Local 
PSN-facing PW (ingress) Receive Fault" in this case.
Remember that the MPLS path might have gone down completely because 
somewhere a router was mis-configured, and can now only 
forward IP packets.
This message would be immediate, and much faster than any VCCV path 
fault detection scheme.
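
For reference, the status code quoted above is one bit in the 32-bit PW status field. The following is a minimal sketch of how such a field could be decoded. The bit values are the RFC 4447 code points as best recalled here and should be checked against the RFC; the class, member, and function names are hypothetical and chosen only for illustration.

```python
from enum import IntFlag


class PwStatus(IntFlag):
    """PW status bits (values assumed from RFC 4447; names are illustrative)."""
    NOT_FORWARDING = 0x00000001
    AC_INGRESS_RX_FAULT = 0x00000002   # Local Attachment Circuit (ingress) Receive Fault
    AC_EGRESS_TX_FAULT = 0x00000004    # Local Attachment Circuit (egress) Transmit Fault
    PSN_INGRESS_RX_FAULT = 0x00000008  # Local PSN-facing PW (ingress) Receive Fault
    PSN_EGRESS_TX_FAULT = 0x00000010   # Local PSN-facing PW (egress) Transmit Fault


def describe(code):
    """Return the names of all fault bits set in a status word.

    An empty list corresponds to 0x00000000 (PW forwarding, all faults clear).
    """
    return [flag.name for flag in PwStatus if code & flag]


if __name__ == "__main__":
    print(describe(0x00000008))  # the code proposed in the message above
    print(describe(0x00000002))  # an AC-side receive fault, for comparison
```

Note that an AC defect such as LOS and a PSN-facing defect map to different bits, which is part of why the two participants talk past each other in the exchange that follows.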
    

You seem to be confusing fault detection and fault notification. My example was about AC defects, but let's look at the PW defects that you are addressing.

  
No, I wanted to understand which kind of fault you are referring to: AC or PSN faults.

If for some reason the MPLS path goes down, you need some mechanism to find out that that is the case. You could do this via BFD or Y.1711, or you could wait until you receive an RSVP error notification.

or more likely receive an OSPF/ISIS route update, or an LDP label withdraw.

Since, according to your assumption, the control plane stays up, a PE will not detect the failure through an LDP session failure.

Detecting LDP session failures is always a last resort. I have experienced this only once in 5 years of running a real network. There was always some other event first.

PW Status only serves to inform a PE about a defect detected by its peer. But then the peer needs a way to detect the defect. Two PEs that don't do anything else than sending each other PW Status messages will never detect a defect.

  
Defects are communicated by the network. Some people like the circuit based architectures and want every circuit to monitor and communicate defects.
This was the ATM concept, and clearly it did not work as we are not running ATM as the Internet Protocol.

Is PW Status faster? It depends. Let's assume you use BFD for failure detection. A PE enters the defect state at expiry of the control detection time. In the next control packet it sends to its peer, it indicates that it entered the defect state. Since the two sessions are asynchronous, there may be a time lag between defect detection in PE1 and (BFD) notification to PE2, but that lag is smaller than the timer value. 
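
The timing argument above can be put in numbers. This is a rough sketch only: it assumes a detect multiplier of 3 (an assumption, though it matches the 10-minute timer and 30-minute detection figures used later in this message), treats the notification lag as at most one transmit interval, and ignores jitter and propagation delay. The function name is hypothetical.

```python
def bfd_timing(tx_interval_ms, detect_mult=3):
    """Return (detection_time_ms, worst_case_peer_notified_ms).

    detection_time_ms: the detecting PE declares a defect after
        detect_mult transmit intervals without a control packet.
    worst_case_peer_notified_ms: the peer learns of the defect from the
        next control packet, at most one transmit interval later.
    """
    detection_time_ms = tx_interval_ms * detect_mult
    worst_case_peer_notified_ms = detection_time_ms + tx_interval_ms
    return detection_time_ms, worst_case_peer_notified_ms


if __name__ == "__main__":
    for label, interval_ms in (("10 ms", 10), ("10 min", 10 * 60 * 1000)):
        detected, notified = bfd_timing(interval_ms)
        print(f"{label} timer: detected after {detected} ms, "
              f"peer notified within {notified} ms")
```

Under these assumptions the notification lag is small relative to the detection time in both cases, which is the point being made: an out-of-band PW Status message only helps if it arrives well inside one transmit interval.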

Consider two scenarios: in the first case, the operator requires sub-50-ms failure detection and restoration and therefore provisions use of a 10-ms timer. Within 10 ms after failure detection both sides know about the defect. With PW Status, that is hard to achieve, especially since an MPLS failure may bring hundreds of PWs down, each of which would trigger transmission of a PW Status message.

A 10 ms timer is not realistic. Anything below 10 seconds will not scale in a real network. Why would one spend so many resources to protect against possible bugs?
Under normal operation you will get a notification from the PSN that something has gone down.
  
OK, and generating hundreds of ATM AIS alarms is going to be easier? No, it would take far more resources. If this is your concern, I suggest using the Grouping TLV (or group ID) and wildcard PW status messages. This mechanism was designed specifically for ATM to improve the downtime response.

Alternatively, consider that the operator provisions 10-minute timers. In that case, it might take several minutes before PE1 informs PE2 about the defect and the use of PW Status would result in a much faster failure notification. But in this case, what is the point? If it is acceptable that defect detection might take 30 minutes, it does not seem necessary to inform the peer PE about the defect within a much smaller time interval.

  
I think that it might very well be acceptable that a very rare, bug-related network defect be detected more slowly, especially since in 99.9999% of the cases this will not happen.
  
RFC 4447 states "The PW status signaling procedures 
      
described in this section MUST be fully implemented." The 
document itself, however, specifies HOW to use PW Status 
signaling but is vague about WHEN to use PW Status signaling. 
The latter is addressed in the encapsulation drafts, such as 
atm-encap, and is fully defined in OAM-MAP 
(draft-ietf-pwe3-oam-msg-map-04).
    
  
      
This is not my interpretation. When we say the procedures 
described in this 
section MUST be fully implemented, it implies that the protocol 
will send 
and receive status messages when appropriate. The trigger to send the 
status messages is attachment circuit specific, and therefore is 
described in the encapsulation drafts.
    

But the atm-encapsulation draft that I referenced specifies transmission of F4 AIS, not of PW Status. 
  
That is an omission that we can still fix. Matthew had at one point suggested that we remove this entire section.


The notion that RFC 4447 should be interpreted as a mandate 
      
to send a PW Status message for every single defect is an 
incorrect interpretation of the spirit of the standard, does 
not reflect WG consensus and, if implemented, would lead to 
inefficient implementations.
    
  
      
I disagree, and I have seen no indication that the WG interpreted 
RFC 4447 in this fashion.
    

Perhaps you should read OAM-MSG-MAP. It clearly shows that the people who worked on OAM never thought that PW Status would be transmitted for every single PW and AC defect.

  
I realize this, and we need to resolve the problem.

I will summarize what I believe is the best solution in a separate e-mail.

Thanks.
Luca

  
Therefore, I strongly disagree with the point of view that 
      
Tom and Luca have formulated regarding the use of BFD for 
both fault detection and status signaling. When this option 
is used, there is, IMO, no need to send PW Status messages. 
The current text of OAM-MAP is in line with this view.
    
  
      
I would suggest that you look at the e-mail archive. If I remember 
correctly there were strong opinions about making the PW 
status messages 
mandatory.

The current text of OAM-MAP needs to be changed.

Luca
 
    
Peter

  
      
-----Original Message-----
From: Busschbach, Peter B (Peter) [mailto:busschbach at lucent.com]
Sent: Wednesday, August 16, 2006 4:40 PM
To: 'Luca Martini'
Cc: Swallow George; 'Thomas D. Nadeau'; Pignataro Carlos; Morrow
Monique; pwe3 WG ((((((((E-mail)))))))); Danny McPherson; 
Agarwal Rahul;
Stewart (stbryant) Bryant
Subject: RE: [PWE3] BFD for MPLS PWs


Luca,

Section 7.4 of the ATM Encapsulation draft 
(draft-ietf-pwe3-atm-encap-11.txt) mandates that upon LOS the 
ingress PE inserts F4 AIS for every affected VPC. Is it your 
opinion that because of RFC 4447 the ingress PE must send a 
PW Status message in addition to F4 AIS insertion?

Peter

    
        
-----Original Message-----
From: Luca Martini [mailto:lmartini at cisco.com]
Sent: Tuesday, August 15, 2006 6:29 PM
To: Busschbach, Peter B (Peter)
Cc: 'Thomas D. Nadeau'; Swallow George; Pignataro Carlos; Morrow
Monique; pwe3 WG ((((((((E-mail)))))))); Danny McPherson; 
Agarwal Rahul;
Stewart (stbryant) Bryant
Subject: Re: [PWE3] BFD for MPLS PWs


Busschbach, Peter B (Peter) wrote:
      
          
In my opinion, the work on VCCV and OAM-MAP provides a 
        
            
further specification of RFC4447 and in fact overrules the 
requirement that LDP Status signaling must be used.
      
          
  
        
            
Peter, 
We agreed a long time ago that the LDP status messaging was 
going to be mandatory; in fact, people insisted I make it 
          
mandatory.
    
So I would have to say that the LDP status MUST always be 
used as mandated in RFC 4447, and the BFD status messaging is 
optional and, in my opinion, not a very useful option 
when LDP is in use.

Luca


      
          
A minor comment on your email:
I am not sure what you mean by the "de-facto status must 
        
            
always rely on LDP". Either you declare that the PW is 
down when (1) BFD OR LSP indicate a PW failure, or (2) based 
only on the LDP status. In my mind "de-facto status" implies 
(2), whereas the beginning of your email says that (1) is the 
proposed procedure.
      
          
Peter
 

_______________________________________________
pwe3 mailing list
pwe3 at ietf.org
https://www1.ietf.org/mailman/listinfo/pwe3
  
        
            