PCE Working Group                                           Renhai Zhang
Internet Draft                                                    Huawei
Expires: December 2005                                         Defeng Li
                                                                  Huawei
                                                               June 2005


                 draft-zhang-pce-comm-app-model-00.txt

              PCE Communication Protocol Application Model


Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on December 6, 2005.

Copyright Notice

   Copyright (C) The Internet Society (2005).  All Rights Reserved.

Abstract

   In an environment where multiple PCEs are responsible for computing
   paths in a domain, race conditions are more likely than with
   traditional methods.  Fast convergence of network state (e.g., the
   TED and established LSPs) is therefore more important for PCEs.
   Furthermore, load balancing of computation among these PCEs should
   be considered.

   This document is not intended to provide a detailed description of
   all the architectural components of the PCE communication protocol.
   Rather, it specifies the Global Computing Table (GCT), along with
   corresponding mechanisms and procedures, to resolve some predictable
   problems in PCEs, especially in situations where multiple PCEs are
   responsible for computing TE LSP paths in a domain.  It is hoped that
   these mechanisms can cooperate with the PCE communication protocol to
   improve the capabilities of the protocol.

Zhang & Li        Informational - Expires December 2005         [Page 1]

Internet Draft  PCE Communication Protocol Application Model   June 2005

Table of Contents

   1.   Terminology
   2.   Introduction
   3.   Global Computing Table
   4.   Resolving race conditions
   4.1  request state in GCT
   4.2  relevant activity of PCE and PCC
   4.3  procedure
   5.   Load balancing of computation
   6.   IANA Considerations
   7.   Security Considerations
   8.   References
   8.1  Normative References
   8.2  Informative References
   9.   Author's Addresses
   10.  Full Copyright Statement

1. Terminology

   This memo makes use of the following terms:

   o  Path Computation Element (PCE): an entity that is responsible for
      computing/finding inter/intra domain LSPs.  This entity can
      simultaneously act as a client and a server.  Several PCEs can be
      deployed in a given AS.

   o  Path Computation Client (PCC): a PCE acting as a client.
      This entity is responsible for issuing path computation requests
      that fulfill the Service Management constraints for the
      establishment of inter/intra domain LSPs.

   o  TED: Traffic Engineering Database.

   o  GCT: Global Computing Table, contained in a PCE to hold all the
      requests being served or to be served.

2. Introduction

   The Path Computation Element (PCE) is responsible for finding an
   inter-domain path satisfying a set of constraints (such as QoS
   performance guarantees) in order to establish inter-domain
   constraint-based LSPs.  Once computed, the result is provided to the
   head-end Label Switching Router (LSR), which can signal downstream
   LSRs to establish an inter-domain Label Switched Path (LSP).

   For the purpose of this document, a domain is considered to be any
   collection of network elements within a common realm of address space
   or path computation responsibility.  Examples of such domains include
   Autonomous Systems, IGP areas and GMPLS overlay networks.

   Compared to traditional mechanisms, more race conditions may occur
   when multiple PCEs perform computation simultaneously in a domain.
   This is because synchronizing network state among PCEs and
   transmitting protocol messages lengthens the period during which the
   distributed routing information differs from the real-time network
   state.  In addition, when a computed path is about to be
   established, the resources it will reserve have not yet been updated
   and flooded in the domain, so a computation based on the information
   currently held by a PCE might result in failure.

   A PCE may be blocked due to a large number of requests received at
   the same time.  If a PCE is blocked, requests sent by a PCC are
   rejected and error messages are sent back to the PCC.
   PCCs may randomly select another PCE in the domain and send the
   request again, but that PCE may also be blocked at that time.  Hence,
   some approach must be taken to avoid this.  When the first PCE is
   blocked, it is best to identify which PCE in the domain can currently
   serve the request.  By doing this, a request from a PCC will not fail
   because of blocked PCEs and random reselection.  Assuming that a PCE
   knows the status of the other PCEs, upon being blocked it can
   designate an alternate PCE with appropriate capability and send this
   designation back to the PCC with the rejection message.  When the PCC
   receives a failure reply with such information, it can quickly resend
   the request to the designated PCE.

3. Global Computing Table

   The Global Computing Table (GCT) is a table maintained by a PCE that
   holds all the requests currently being served, or waiting to be
   served, by all PCEs in a domain.  With the GCT, an approach is also
   specified to perform load balancing and to enable exclusive
   computation between PCEs to avoid race conditions.  This provides
   significant improvements in the successful computation and setup of
   TE LSPs, especially when the recovery of a node or link (such as an
   inter-domain link) results in a large number of request messages
   being generated at the same time.

   For PCEs in charge of path computation in a domain, it may be
   particularly useful to know each other's running status, such as how
   many requests are waiting to be served in each PCE and how many
   calculated paths are being established.  The benefit is shown in
   Sections 4 and 5.  It is recommended that each PCE in a domain learn
   the other PCEs' computing capabilities through some mechanism, for
   example, a PCE discovery protocol.

   When using the GCT, once a request is generated by a PCC, it MUST be
   sent to all PCEs in the domain.  The consequence is that each PCE in
   the domain maintains an identical computing table.
   Each request contained in the table MUST include information such as
   the kind of resource it applies for, the priority of the request,
   etc.  For example:

   ................................................
   .                                              .
   .   --------                  --------         .
   .   | PCE1 |                  | PCE2 |         .
   .   --------                  --------         .
   .                                              .
   .                                              .
   .   --------                  ---------        .
   .   | RTA  |                  |  RTC  |---------------
   .   --------                  ---------        .
   .      |          --------        |            .
   .      |----------| RTB  |--------|            .
   .                 --------                     .
   .                                              .
   ................................................

   In the above figure, there are three routers (RTA, RTB and RTC) and
   two PCEs (PCE1 and PCE2) in a domain.  Each router may learn of the
   PCEs' capabilities through the PCE discovery protocol, and each
   router can make a request to a PCE for path computation.

   For example, before router RTA establishes an LSP with some
   constraint, it should select an appropriate PCE in the domain for
   this request, based on PCE capability and state; suppose it selects
   PCE1.  When RTA constructs the request packet, PCE1's ID is included.
   Note that this ID is not used as the destination of the request
   message.  Instead, the request message should be sent to all PCEs in
   the domain.

   Upon receiving the request message, each PCE can determine, by
   parsing the packet, whether it is being asked to perform this
   computation.  If the ID in the packet is the same as its own, the
   request should be served locally, and the request is therefore queued
   in the Local Computing Table according to its priority.  Conversely,
   if the ID is not the same as its own, the request should be queued in
   the Remote Computing Table, where requests are identified and
   organized by PCE ID and stored in the corresponding PCE's computing
   queue according to the request priority.

   When a PCE has successfully computed the path, the result is returned
   to the PCC.
   The result SHOULD also be sent to the other PCEs.  After the PCC has
   established the LSP, information on the LSP is sent to all PCEs.  It
   MUST be possible to determine the resources reserved by the path from
   this message.  Upon receiving the message, all PCEs SHOULD update the
   TED or LSP state that they maintain and delete the corresponding
   request from the GCT.

   The GCT in PCE1 is organized as in the figure below:

   -------------------------------------------------------------
   |                                                           |
   |                  Global Computing Table                   |
   |                                                           |
   |  Local Computing Table       Remote Computing Table       |
   |                                                           |
   |  ---------------     ................................     |
   |  |RTA's request|     .  PCE2's table          ...   .     |
   |  ---------------     .  ---------------             .     |
   |  |RTC's request|     .  |RTB's request|       ...   .     |
   |  ---------------     .  ---------------             .     |
   |  |RTB's request|     .                              .     |
   |  ---------------     ................................     |
   |                                                           |
   -------------------------------------------------------------

   The figure shows two main modules: the Local Computing Table and the
   Remote Computing Table.  In the Local Computing Table, PCE1 has three
   requests being served or to be served, from RTA, RTC and RTB.  The
   order also indicates that RTA's request has the highest priority and
   RTB's request the lowest.  In the Remote Computing Table, we can see
   that PCE2 is also performing a computation and currently has one
   request to handle.

   In other words, using the reliable delivery of the PCE communication
   protocol and the GCT, each PCE in a domain is capable of knowing the
   state of the other PCEs.  This is especially useful in situations
   where race conditions occur and load balancing is desired.

4. Resolving race conditions

   There are many situations in which race conditions may occur, for
   example, the recovery of an inter-AS TE link.  In such situations,
   path setup based on a computation result might not succeed.  A
   mechanism based on the GCT is presented here.  It should be noted
   that not all PCCs care about race conditions when generating a
   request message.
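
   As a rough illustration of the GCT described in Section 3, the
   following Python sketch shows one way a PCE might queue a request in
   its Local or Remote Computing Table according to the PCE ID carried
   in the request, and how the same table could be consulted to find a
   less loaded PCE.  It is not part of the mechanism specified in this
   document.  The names Request, GlobalComputingTable, enqueue and
   least_loaded_pce are hypothetical, and the sketch assumes that a
   smaller number means a higher priority and that queue length is an
   adequate measure of load.  PCE3 is the idle PCE mentioned in
   Section 5.

```python
import heapq
import itertools
from dataclasses import dataclass, field


@dataclass(order=True)
class Request:
    # Assumption: a smaller number means a higher priority.
    priority: int
    seq: int  # arrival order, used only to break priority ties
    pcc: str = field(compare=False)
    target_pce: str = field(compare=False)


class GlobalComputingTable:
    """Hypothetical GCT held by one PCE (all names are illustrative)."""

    def __init__(self, own_id, all_pce_ids):
        self.own_id = own_id
        self._seq = itertools.count()
        # One priority queue per PCE. The queue for own_id plays the role
        # of the Local Computing Table; the others together form the
        # Remote Computing Table.
        self.queues = {pce_id: [] for pce_id in all_pce_ids}

    def enqueue(self, pcc, target_pce, priority):
        """Record a request that the PCC sent to every PCE in the domain."""
        req = Request(priority, next(self._seq), pcc, target_pce)
        heapq.heappush(self.queues[target_pce], req)
        return req

    @property
    def local(self):
        """Local requests in serving order (highest priority first)."""
        return sorted(self.queues[self.own_id])

    def least_loaded_pce(self, exclude=()):
        """Pick the PCE with the fewest queued requests (assumed metric)."""
        candidates = [p for p in self.queues if p not in exclude]
        return min(candidates, key=lambda p: (len(self.queues[p]), p))


# Reproduce the figure: three requests for PCE1, one for PCE2.
gct = GlobalComputingTable("PCE1", ["PCE1", "PCE2", "PCE3"])
gct.enqueue("RTB", "PCE1", priority=3)
gct.enqueue("RTA", "PCE1", priority=1)
gct.enqueue("RTC", "PCE1", priority=2)
gct.enqueue("RTB", "PCE2", priority=1)

local_order = [r.pcc for r in gct.local]
alternate = gct.least_loaded_pce(exclude=("PCE1",))
print(local_order, alternate)  # ['RTA', 'RTC', 'RTB'] PCE3
```

   Because every PCE receives every request, each PCE can build the
   same table independently.  That shared view is what would let a
   blocked PCE1 name PCE3 rather than PCE2 as the alternate.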

4.1 request state in GCT

   First, the request status in a PCE is specified.  A request is in one
   of the following states:

   1. Computing: the request is being served or waiting to be served.

   2. Computed: the computation has been performed and the result has
      been sent to the PCC and the other PCEs, but the path has not yet
      been established.  Upon receiving this result, the other PCEs also
      update their network state for LSPs.

   3. Done: the computed path has been established and the PCC has
      informed all PCEs of the result.  The network state about LSPs in
      the domain maintained by each PCE is updated from the result, and
      the corresponding request is deleted from the GCT.

4.2 relevant activity of PCE and PCC

   When a request message is generated, the PCC SHOULD send it to all
   PCEs in the domain in the way presented in Section 3.

   After a PCE has performed a successful computation, it has two
   choices for the result: first, the result is sent back only to the
   requesting PCC; second, it is sent to the requesting PCC and all the
   other PCEs in the domain.  With the first choice, the request state
   in the GCT will only be Computing or Done.  With the second choice,
   the request state changes from Computing to Computed when the PCEs
   receive the result message, and the network state on LSPs maintained
   in each PCE SHOULD be updated.  Additional details are provided in
   the next section.

4.3 procedure

   Even in the traditional scenario, where path computation is performed
   in the ingress LSR, race conditions may occur.  This is because there
   may be many ingress LSRs in a domain, each ingress LSR can perform
   the computation without any collaboration, and there is no method to
   know whether the TED maintained in memory has been synchronized with
   the real-time network state.  PCE-based computation, however, can
   provide a way to overcome this.
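
   The request states of Section 4.1 and the two result-distribution
   choices of Section 4.2 can be sketched as a small state machine.  The
   Python below is only an illustration under stated assumptions: the
   names RequestState, GctEntry, on_result_received, on_lsp_established
   and finish are hypothetical, and deletion from the GCT is modeled by
   removing the entry from a dictionary.

```python
from enum import Enum


class RequestState(Enum):
    COMPUTING = "Computing"  # being served or waiting to be served
    COMPUTED = "Computed"    # result distributed, path not yet established
    DONE = "Done"            # path established, all PCEs informed


class GctEntry:
    """Hypothetical record for one request in a PCE's GCT."""

    def __init__(self, request_id):
        self.request_id = request_id
        self.state = RequestState.COMPUTING
        self.history = [self.state]

    def _move(self, new_state, allowed_from):
        if self.state not in allowed_from:
            raise ValueError(
                f"{self.state.value} -> {new_state.value} not allowed")
        self.state = new_state
        self.history.append(new_state)

    def on_result_received(self):
        """Second choice in 4.2: the result was also sent to the PCEs."""
        self._move(RequestState.COMPUTED, {RequestState.COMPUTING})

    def on_lsp_established(self):
        """The PCC reports the established LSP to all PCEs."""
        # With the first choice in 4.2 the entry never enters Computed,
        # so Done must be reachable from Computing as well.
        self._move(RequestState.DONE,
                   {RequestState.COMPUTING, RequestState.COMPUTED})


def finish(gct, request_id):
    """Mark a request Done and delete it from the GCT (a dict here)."""
    entry = gct.pop(request_id)
    entry.on_lsp_established()
    return entry


gct = {"req-1": GctEntry("req-1"), "req-2": GctEntry("req-2")}

# req-1: result sent to the PCC and to all PCEs.
gct["req-1"].on_result_received()
first = finish(gct, "req-1")

# req-2: result sent only to the requesting PCC.
second = finish(gct, "req-2")

print([s.value for s in first.history])   # ['Computing', 'Computed', 'Done']
print([s.value for s in second.history])  # ['Computing', 'Done']
```

   Rejecting any other transition keeps the GCT copies held by
   different PCEs from diverging silently if messages are processed in
   an unexpected order.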

   Note that a PCC may have different requirements regarding race
   conditions for different requests.  To avoid race conditions, a
   computation MUST be exclusive in a domain.  To achieve this, multiple
   PCEs should collaborate when performing computation.

   In this section, requests are divided into three classes based on the
   degree of exclusion that the PCC desires when it generates the
   request, and the corresponding procedures are presented:

   Non-exclusive computation: the PCC does not care about race
   conditions and wants to get the result immediately.  When receiving
   this kind of request, the PCE can perform the computation at once,
   regardless of the status of other requests in its GCT.

   Partial-exclusive computation: by requesting this kind of
   computation, a PCC cares only about possible race conditions with
   paths in the domain that have already been computed but are yet to be
   established.  When receiving this kind of request, before performing
   the computation, the PCE SHOULD compare it to the other requests in
   the Computed state in the GCT.  If it determines that race conditions
   are possible and the priority of this request is lower than that of
   those requests, the request SHOULD be delayed until the
   higher-priority requests change from Computed to Done and the LSP
   information maintained by the PCE has been updated by the PCC
   message; otherwise, if the priority of the request is the highest
   among these requests, the request can be served without delay.  Note
   that the determination approach is local to the PCE and outside the
   scope of this document.  However, the approach MUST be consistent
   among PCEs and SHOULD also consider the resource style of the request
   and the possible path.

   Conversely, if it is determined that there is no possibility of race
   conditions with the requests in the Computed state, the computation
   can be performed at once.  Therefore, this kind of request allows
   PCEs in a domain to perform computation in parallel except when there
   is a possibility of race conditions with requests in the Computed
   state.  It should be noted that there is still a possibility of race
   conditions when performing this kind of computation, since requests
   in the Computing state are not taken into account.

   Full-exclusive computation: by making this kind of request, a PCC
   cares about every possibility of race conditions, not only with
   Computed requests but also with Computing requests, and desires that
   the corresponding path establishment succeed.  A PCE will compare
   this request to all the other requests in the GCT to gather all those
   that may result in race conditions.  These requests will be served in
   turn according to their priority, as described for partial-exclusive
   computation.  Other requests without the possibility of race
   conditions can still be served in parallel among PCEs.

5. Load balancing of computation

   Refer to the figure in Section 3 that shows the GCT in PCE1.  Without
   the GCT, suppose that PCE1 is blocked when performing the computation
   for RTA's request.  If the other requests (from RTC and RTB) cannot
   be served for a relatively long time, routers RTC and RTB may
   determine that their requests are blocked, either through a timeout
   or through a notification from PCE1.  The request messages may then
   be sent again to another PCE, e.g., PCE2.  If PCE2 is also blocked,
   the requests will fail again.  But assume that another PCE, PCE3,
   exists and is idle; on the first failure, the request message should
   reasonably be sent to PCE3 instead of PCE2.  With the GCT, PCE1 knows
   the state of PCE2 and PCE3 and can designate PCE3 as the alternate
   PCE in its rejection message, as described in Section 2.

6. IANA Considerations

   The solution proposed in this draft uses the exclusion degree of
   computation requested by PCCs, values for which SHOULD be assigned by
   IANA [RFC2434].

7. Security Considerations

   This document does not introduce new security issues.

8. References

8.1 Normative References

   [RFC2434]  Narten, T. and H. Alvestrand, "Guidelines for Writing an
              IANA Considerations Section in RFCs", BCP 26, RFC 2434,
              October 1998.

   [RFC3667]  Bradner, S., "IETF Rights in Contributions", BCP 78,
              RFC 3667, February 2004.

   [RFC3668]  Bradner, S., Ed., "Intellectual Property Rights in IETF
              Technology", BCP 79, RFC 3668, February 2004.

   [RSVP]     Braden, R., et al., "Resource ReSerVation Protocol (RSVP)
              - Version 1 Functional Specification", RFC 2205,
              September 1997.

   [RSVP-TE]  Awduche, D., et al., "RSVP-TE: Extensions to RSVP for LSP
              Tunnels", RFC 3209, December 2001.

8.2 Informative References

   [PCE-ARCH]  Farrel, A., Vasseur, J.P., and J. Ash, "Path Computation
              Element (PCE) Architecture", draft-ietf-pce-architecture,
              work in progress.

   [PCE-GEN-COM-REQ]  Ash, J., Le Roux, J.L., et al., "PCE Communication
              Protocol Generic Requirements",
              draft-ietf-pce-comm-protocol-gen-reqs, work in progress.

   [INT-AREA-REQ]  Le Roux, J.L., Vasseur, J.P., Boyle, J., et al.,
              "Requirements for inter-area MPLS Traffic Engineering",
              draft-ietf-tewg-interarea-mpls-te-req-03.txt, work in
              progress.

   [INT-DOMAIN-FRWK]  Farrel, A., Vasseur, J.P., and A. Ayyangar, "A
              Framework for Inter-Domain MPLS Traffic Engineering",
              draft-ietf-ccamp-inter-domain-framework-00.txt, work in
              progress.

   [PCE-DISC-REQ]  Le Roux, J.L., et al., "Requirements for Path
              Computation Element (PCE) Discovery",
              draft-leroux-pce-discovery-reqs-00.txt, work in progress.

9. Author's Addresses

   Renhai Zhang
   Huawei Technologies
   No. 3 Xinxi Road, Shangdi, Haidian District,
   Beijing City, P. R. China
   Email: zhangrenhai@huawei.com

   Defeng Li
   Huawei Technologies
   No. 3 Xinxi Road, Shangdi, Haidian District,
   Beijing City, P. R. China
   Email: 77cronux.leed0621@huawei.com

10. Full Copyright Statement

   Copyright (C) The Internet Society (2005).
   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and translations of it MAY be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation MAY be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph are
   included on all such copies and derivative works.  However, this
   document itself MAY not be modified in any way, such as by removing
   the copyright notice or references to the Internet Society or other
   Internet organizations, except as needed for the purpose of
   developing Internet standards in which case the procedures for
   copyrights defined in the Internet Standards process MUST be
   followed, or as required to translate it into languages other than
   English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assigns.

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.