P2PSIP Yunfei.Zhang Internet Draft China Mobile Intended status: Informational Gang.Li Expires: April 2011 China Mobile Jin.Peng China Mobile Baohong.He China CATR Shihui.Duan China CATR Wei.Zhu China CATR March 11, 2011 bcp for a large scale carrier-level VoIP system using p2psip draft-zhang-p2psip-bcp-04.txt Status of this Memo This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html This Internet-Draft will expire on September 11, 2011. Copyright Notice Copyright (c) 2011 IETF Trust and the persons identified as the document authors. All rights reserved. Zhang, et al Expires September 11, 2011 [Page 1] Internet-Draft p2psip bcp March 11, 2011 This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Abstract This document introduces a practical work for Peer-to-Peer (P2P) SIP system based on Distributed Service Network (DSN), which was proposed by China Mobile in ITU-T. In this document, it introduces some key problems of carrier grade P2PSIP VoIP system. Then it gives a brief introduction on DSN VoIP system and especially discusses some key technologies aimed at those mentioned problems. At last, we present some measurement works for validating DSN VoIP system call transaction performance, reliability and high Availability. These measurements include the measurement in lab environment and in real internet. Zhang, et al Expires September 11, 2011 [Page 2] Internet-Draft p2psip bcp March 11, 2011 Table of Contents 1. Introduction ................................................ 4 2. Requirements and key problems of carrier-grade P2P VoIP system 4 2.1. Requirements ........................................... 4 2.2. Some Key problems ....................................... 5 3. DSN VoIP System Overview ..................................... 6 3.1. System Architecture ..................................... 6 3.2. System Implementation and Components .................... 8 3.3. System Key technologies ................................ 10 4. Validation of DSN VoIP system in lab environment ............ 12 4.1. Overview .............................................. 12 4.2. System bulk call performance measurement ............... 13 4.2.1. Pure bulk call performance measurement results ..... 13 4.2.2. Mix Bulk call performance measurement results ...... 14 4.2.3. conclusions ....................................... 15 4.3. Churn measurement ...................................... 16 4.3.1. Churn model and setting ........................... 16 4.3.2. Measurement results ............................... 17 4.3.3. conclusions ....................................... 18 5. Validation of DSN VoIP system in real environment ........... 18 5.1. Overview .............................................. 18 5.2. Single Capabilities measurement ........................ 19 5.3. conclusions ........................................... 21 6. Security Considerations ..................................... 21 7. IANA Considerations ........................................ 21 8. Conclusions ................................................ 21 9. References ................................................. 22 9.1. Normative References ................................... 22 9.2. Informative References ................................. 22 10. Acknowledgments ........................................... 22 Zhang, et al Expires September 11, 2011 [Page 3] Internet-Draft p2psip bcp March 11, 2011 1. Introduction DSN [1],the abbreviation of Distributed Services Network, is a new question being standardized in ITU-T proposed by China Mobile. [3] outlines the DSN Architecture. As described in [3], DSN can support many applications, such as Multimedia telephony services, streaming services, content distribution services and so on. DSN VoIP system is designed based on the theory of DSN and P2PSIP. According to [1], DSN VoIP system includes two layers over IP network: P2PSIP layer and the application layer. The application layer uses SIP protocol for call and P2PSIP layer uses RELOAD[4] protocol for call routing in overlay network. It also analyzes some key problems in P2P VoIP system and defines the corresponding requirements in this document. It also includes the system architecture, components and key technologies. At last it presents some measurement work for validating the DSN VoIP system. 2. Requirements and key problems of carrier-grade P2P VoIP system 2.1. Requirements Peer-to-Peer(P2P) systems have been widely deployed in current internet, due to its advantages such as high scalability and cost- effectiveness. Many P2P technologies have been to implement the traditional telecom voice service (for example, IMS or softswitch) such as Skype. But it hasn't been discussed thoroughly that how to make P2P system meet the telecom infrastructure performance requirements, which is usually called as the "Carrier-Grade" requirements. According to the current performance of telecom voice service deployments, the requirements of the carrier-grade services can be summarized but not limited to those listed below: 1. Qos guarantee: Current IMS requires that any client-side operations should be able to get the final responses in 300 milliseconds. Therefore, P2P VoIP system should be able to guarantee the signaling response time which is comparable to this time and the media transmission time which is less than 400ms. There are also some specifications for voice quality, which is not mentioned in this document. Zhang, et al Expires September 11, 2011 [Page 4] Internet-Draft p2psip bcp March 11, 2011 2. High Availability: The high availability includes network availability, service availability, and subscriber data availability and so on. The most common practice is to use some special mechanisms to acquire robustness and availability such as hot backup redundancy. High availability is a severe challenge for P2P VoIP systems which intends to achieve the same level of the telecom systems. 3. Scalability: The P2P VoIP system should be adapted to the expanding of the system scale, such as the increasing of the number of the clients or core nodes. 4. Load balance: The resource is distributed among P2P nodes and each node should be able to collaborate with one another to avoid the emergence of the centralization of resource and traffic. 5. Cost-effectiveness: It's the aim that the P2P VoIP system can achieve the same performance as the telecom commercial system by fully utilizing only commodity PCs, not using the costly hardware. 6. Maintainability: it's quite appealing that it needs to do a little work to configure the nodes and network in order to maintain the natural operation of the P2P VoIP system. P2P VoIP system can realize the self-organization even after any network failure, so as to reduce the demand for emergent response and well experienced operators. 7. Others: TBD. 2.2. Some Key problems To obtain the same performance as the telecom system, there are many key problems needed to be solved. Some of them are listed as below: 1. Routing performance: The traditional P2P routing performance is relative to the network dimensions. The routing performance will be sharply descended as the increase of the network dimensions, for example, the traditional Chord routing performance is O(logN). 2. Redundant traffic: The P2P network does not match the underlying network, so the P2P nodes forward the traffic only according to the logic P2P routing and not to the real optimal IP layer network routing, and this will cause mass P2P redundant traffic iteratively across many Automatic Systems(AS). Zhang, et al Expires September 11, 2011 [Page 5] Internet-Draft p2psip bcp March 11, 2011 3. The reliability and availability of subscriber data: Current P2P network can't provide an efficient solution for the reliability and availability of subscriber data in case of the random joining or exiting of the P2P nodes happened in the network. 4. Duplicate data consistency: There are many replicas for subscriber data to acquire enough availability. It's very common that the subscribers' data are often modified, so the system must allow for the updating and amendment information to be propagated to all replicas asynchronously. 5. Hot spot: The subscriber data are distributed unevenly in the network, which will cause that some nodes have much more subscribers' data than the others. Therefore, it might probably lead to more traffic due to those high-loaded nodes which might makes these nodes overload and even fault in some cases. 6. Single node failure: If a node fails, its clients can't use the services. 7. Others: TBD. It discusses some key technologies in section 3.3 with a view to solve the above questions. 3. DSN VoIP System Overview DSN VoIP system is a VoIP system based on P2P overlay. DSN VoIP system uses SIP protocol for call transaction and RELOAD for call routing. 3.1. System Architecture According to the architecture of RELOAD, DSN VoIP system has the similar architecture and contains some essential components defined in RELOAD, such as Usage Layer, Routing Layer, Storage and so on. The architecture of DSN VoIP system is shown as figure 1. Zhang, et al Expires September 11, 2011 [Page 6] Internet-Draft p2psip bcp March 11, 2011 +-----------------------------------------------------------+ | +-------------------------------------------------------+ | | |Application layer | | | | +---------- --+ +------------+ | | | | | SIP Usage | | XMPP Usage | ... | | | | +-------------+ +------------+ | | | +-------------------------------------------------------+ | | ^ ^ | | =================Message Routing API===================== | | v v | | +------------------------------------------------------ + | | |P2PSIP layer | | | | +-------------------------------------------------- + | | | | | Data Storage | | | | | |-------------------------------------------------- | | | | | |Parallel Recovery| |Consistency Mgt| |User Index| | | | | | |------------------------------------------------ --| | | | | | Replica Placement Strategy | | User Profile | | | | | +----------------------------------------------- ---+ | | | | ^ ^ | | | | v v | | | | +---------------------------------------------------+ | | | | | Key Based Routing(KBR) | | | | | |---------------------------------------------------| | | | | | ID Mgt | | DHT Protocol | | Topo Maintenance| | | | | | +---------------------------------------------------+ | | | +-------------------------------------------------------+ | | ^ ^ | | ====================Transport API======================== | | v v | | +-------------------------------------------------------+ | | | +------------------------------------------------ --+ | | | | | Communication | | | | | |---------------------------------------------------| | | | | | TCP | | UDP | | Other | | | | | +------------------------------------------------- -| | | | | IP Network | | | | | | | +--------------------------------------------------- ---+ | +----------------------------------------------------- -----+ Figure 1 The architecture of DSN VoIP system Based on the RELOAD architecture, DSN VoIP system implements the necessary functions required by RELOAD and some additional functions to make the system more practicability. Zhang, et al Expires September 11, 2011 [Page 7] Internet-Draft p2psip bcp March 11, 2011 There are two basic modules in P2PSIP layer: KBR and Data Storage module. The key based routing, KBR, refers to find the best suitable host for an input key, also includes ID management, DHT routing protocol, topology maintenance, peers failure detecting and etc. DSN VoIP system implements a unique KBR routing scheme with real-time response and a traffic localization mechanism, including a novel ID assignment method. The data storage module takes charge of management of subscriber data, including storage, restoring, consistency verification and etc. Here DSN VoIP system implements a unique and practical replica replacement strategy and an effective consistency strategy. 3.2. System Implementation and Components DSN VoIP system includes some distributed servers and subscriber terminals. These servers form the DSN VoIP network by using P2P protocol and share load for each other. According to the different functions, the server can be Peer Node(PN), Application Server(AS), Enrollment Server(ES), Edge Agent(EA), Super Peer-Maintenance (SPM),Relay Node(RN) and RN Cluster Server(RCS). According to the support for P2P protocol, the subscriber terminal can be the traditional SIP Client(SiC) and Peer Client(PeC). The DSN VoIP system framework is shown as figure 2. Zhang, et al Expires September 11, 2011 [Page 8] Internet-Draft p2psip bcp March 11, 2011 +-----+ +----+ +-----+ | ES | | AS | | SPM | +-----+ +----+ +-----+ | | | | | | +----+ +----+ +----+ +----+ | PN |----| PN |----| PN |- ... -| PN | +----+ +----+ +----+ +----+ | | | | +----+ +----+ +----+ +----+ | PN |----| PN |----| PN |- ... -| PN | +----+ +----+ +----+ +----+ | | | | | | +----+ +----+ +----+ | EA | | PeC| | RCS| +----+ +----+ +----+ | | | | +----+ +----+ | SiC| | RN | +----+ +----+ Figure 2 The framework and components of DSN VoIP system Peer Node(PN): PN is the core node in DSN VoIP network and is deployed by telecom operators. PN is responsible for DHT topology maintenance, overlay routing, storage and access for subscriber/service data, subscriber registration and Authentication, BootStrap, SIP session control and VoIP service. Application Server(AS): AS can provide other services which can't be provided by Peer Node(PN). Enrollment Server(ES): ES implements BootStrap service when the PeC or PN joins the DSN VoIP network for the first time. When new PN joins, ES is responsible for authentication for the new node. ES is deployed by telecom operators and fixed in the network. Edge Agent(EA):EA provides the relay service for SiC and adapt to all kinds of services and connections to PN. EA supports SIP proxy discovery for DSN and connection with PN. SPM(Super Peer Maintenance): SPM is an administrator for one zone. SPM undertakes the registration of the PN nodes in the same zone, assigns the nodes IDs to PN nodes and maintain DHT routing information. Zhang, et al Expires September 11, 2011 [Page 9] Internet-Draft p2psip bcp March 11, 2011 Relay Node(RN): RN is responsible for NAT traversal and media traffic forwarding. RN can discover the DSN and connection with PN. RN can be deployed by telecom operators or upgraded from PeC. RN Cluster Server(RCS): RCS is responsible for the management of RN nodes according to topology information or RTT(Round-Trip Time). RCS manages one AS and assigns a suitable RN for PeC according to the information about PeC's request or location etc. SIP Client(SiC): SiC support only SIP protocol and can't discover the DSN network if it is directly connected to the DSN. It must connect directly with Edge Agent (EA). Peer Client(PeC): PeC support both SIP protocol and P2P protocol. PeC can be upgraded to be Relay Node (RN) or RN Cluster, but can't be Peer Node(PN). PN nodes form the DSN VoIP network and interconnect via P2P protocol. The subscriber terminal can get valid service once it connects to any PN. If some but not all PN nodes fail or go offline, this will not affect the normal operation of DSN network. We can add more new PN nodes to augment the capacity of DSN network. 3.3. System Key technologies In this section, it presents with some key technologies which is aiming to solve the problem in sec 2.2. 1. SPM assisted one hop route: DSN VoIP system is typically deployed as a two-layer DHT, including a global DHT (corresponding to countrywide) and several regional DHTs (corresponding to provinces). For each region, there are several special peers dedicated to routing information updating and some other management tasks and these special peers are SPM. Multiple SPMs can be deployed in a large region. Each one is responsible for a partition of PN belonging to that region, and these SPMs backup each other. SPMs from different regions are full-mesh connected, they run a gossip-based protocol to maintain a strongly consistent view of their existence. All PN nodes must register to the local SPM of the same region, when they join the network. During runtime, once the PN's state change is detected by its neighboring PN, the neighbor will notify the local SPM, and then the SPM will disseminate the event to other SPMs, finally each SPM will broadcast this notification to all PN nodes under the charge of it. In this way, each PN can obtain other PNs' information through SPM and can maintain routing table for all PNs. So SPM assisted one hop route can be implemented. Zhang, et al Expires September 11, 2011 [Page 10] Internet-Draft p2psip bcp March 11, 2011 2. Traffic Localization: Recent studies showed that a large volume of inter-domain redundant traffic aroused by P2P mismatch problem already became a serious problem. As mentioned above, each PN node joins the countrywide global DHT and one regional DHT at the same time. We use a novel peer ID assignment scheme optimized for global load balancing, which can write the regional information into peer ID. With this unique ID scheme, the global DHT and regional DHT can be maintained by only one KBR (Key Based Routing) incarnation, that's different with many other layered DHT proposed. Every object will be put in the regional DHT and global DHT for disaster recovery, according to the normal DHT semantics internally. 3. Replica Placement strategy: The DSN VoIP system raises a unique N:B replica placement which provides an effective way to backup and recovery subscribers' data from local and remote region. Under this solution a parallel recovery factor N is defined as how many candidates should be used as the recovery. And B is defined as the number of Backups which means how many duplicated replicas the overlay hold. Subscriber data in each primary peer node is divided into S=N/B slices. After that, each B backup peers in a group for this primary peer will backup the same slice of subscriber data. One peer node could execute data backup procedure to send data to different backup peers simultaneously. Compare to 1:1 backup mechanism, not only does our N:B replica placement increase the efficiency of parallel recovery, but also avoids the reservation of half capacities (CPU cycles, bandwidth) for potential migration of the service backed up. 4. Consistency Strategy: In order to ensure better data consistency, DSN VoIP system will trigger "DataCheck" and "DataGoHome" mechanisms periodically. DSN VoIP system through "DataCheck" triggers the comparison between the data version information of the corresponding PN nodes, and updates the inconsistent data to ensure that the eventual consistency of subscriber profiles. "DataGoHome" takes charge of identifying those temporary irregular data, recalculate their key-value pair and push them to correct PN nodes. With the help of timing service always available in telecom offices, DSN VoIP system uses timestamp in order to capture causality between different versions of the same data. One can determine whether two versions of a data have a causal ordering by examine their timestamps. Zhang, et al Expires September 11, 2011 [Page 11] Internet-Draft p2psip bcp March 11, 2011 5. Strip Segmentation method Based ID Assignment: We use this method to generate the subscriber ID. This method can evenly distribute the subscriber data on each PN node so as to achieve traffic load balance and avoid hot spot. The detailed method can be referred to [5]. 6. Single node failure handle: If a PN node fails, we design some mechanisms to ensure the natural operation of VoIP service. For the session in progress, the session between the caller and the callee is established, the standby PN checks the primary PN in failure, and the standby PN notifies the associated nodes and switch the session to itself. For the new call originated from the caller, if the standby PN has switch to be the primary PN and notified the subscribers, the call originated from the subscriber will be routed to the standby PN. For the new call originated from the callee, if the callee knows the failure in the primary PN by requesting routing or querying routing table, the callee will route the query to the standby PN and establishes the session. 7. Others: TBD. 4. Validation of DSN VoIP system in lab environment 4.1. Overview We did some measurements to validate the DSN VoIP system and its key technologies and all these were described in this document of version 01 which was submitted in IETF 77 meeting. In this document, we don't repeat the results and conclusions in the first document. Here we do some more thorough measurements to validate the performance and reliability of DSN VoIP system which are carried out in a bigger network. We set up a DSN VoIP demo system in CMCC lab. The demo system includes 58 PN nodes, two SPM and one ES. All the PN nodes are deployed on VMWare machine and have the same configuration: two virtual CPU, 2G virtual memory, 75G hard disk space and SUSE Linux Enterprise Desktop 10 SP2 (i586). We deploy the four VM machines on one HP DL320 server and we use seventeen HP DL320 servers. All these HP servers connect through one Cisco 7500 layer3 switch. Our emphases are system bulk call performance measurement and reliability measurement under churn. Zhang, et al Expires September 11, 2011 [Page 12] Internet-Draft p2psip bcp March 11, 2011 4.2. System bulk call performance measurement For measuring the bulk call performance of the demo system, we configure the following parameters: 1.call parameters: each call last time=3 second each measurement last time = 15 minutes 2. subscriber data parameters: caller subscriber number = callee subscriber number = 16000 The format of SIP subscriber terminal ID is like 8xxxxxxx@bj.dsn.com. The callers and callees must be registered before bulk call performance measurement. All these registered subscribers and calls handled by each PN are randomly averagely distributed in system. Our algorithm can ensure the approximate well-proportioned distribution for registered subscribers and calls handled by each PN. We measured the bulk call transaction performance when the demo system respectively included 24, 32, 40, 48, 56 PN nodes. We collected the following parameters: 1.Registration statistics: average register time 2.Call Establishment time: average call setup time, average call tear down time, Post dial delay 3.Call finish: call success ratio, call attempts per second(CAPS) If the call lose ratio is below 0.5% and don't persist to increase, we think the call performance is under the limit of the system. We do two measurements: 1)there are only callers and callees in the demo system; 2)except the callers and callees, there are some other users which originate registration for simulating the transfer from one place to another and the number of these users is one fifth of the total users. 4.2.1. Pure bulk call performance measurement results Note: the maximal call capability of the demo system increases along with the number of PN nodes and it goes beyond the maximal capability Zhang, et al Expires September 11, 2011 [Page 13] Internet-Draft p2psip bcp March 11, 2011 of the test device. So we can get accurate measurement results when the number of PN nodes is no more than 48. The following are the results of the first measurement. 1.average register time Node number 24 32 40 48 average register time(ms): 21 21 22 22 2. average call setup time Node number 24 32 40 48 average setup time(ms): 338 423 311 223 3. average call tear down time Node number 24 32 40 48 average tear down time(ms):139 152 140 97 4. Post dial delay Node number 24 32 40 48 Post dial delay (ms): 338 423 310 223 5. call success ratio Node number 24 32 40 48 call success ratio(%): 99.97 99.95 99.89 99.91 6. call attempts per second(CAPS) Node number 24 32 40 48 System CAPS: 1252 1708 2221 2562 PN's average CAPS: 52.2 53.4 55.5 53.4 4.2.2. Mix Bulk call performance measurement results The following are the results of the second measurement. Zhang, et al Expires September 11, 2011 [Page 14] Internet-Draft p2psip bcp March 11, 2011 1.average register time Node number 24 32 40 48 average register time(ms): 21 24 22 21 2. average call setup time Node number 24 32 40 48 average setup time(ms): 346 376 319 244 3. average call tear down time Node number 24 32 40 48 average tear down time(ms):138 176 137 115 4. Post dial delay Node number 24 32 40 48 Post dial delay (ms): 346 439 318 244 5. call success ratio Node number 24 32 40 48 call success ratio(%): 99.98 99.06 99.96 99.77 6. call attempts per second(CAPS) Node number 24 32 40 48 System CAPS: 1196 1591 1986 2267 PN's average CAPS: 49.8 49.7 49.7 47.2 4.2.3. conclusions Through above measurement, the conclusions are: 1. The DSN VoIP system has the capability for call transaction. 2. The system capability can approximately linearly increase as the number of PN nodes increase. Zhang, et al Expires September 11, 2011 [Page 15] Internet-Draft p2psip bcp March 11, 2011 We have measure the bulk call performance for the demo system. However due to measurement time limited, these results are just elementary and further measurement will be done for more large scale network. 4.3. Churn measurement 4.3.1. Churn model and setting The P2P overlay will churn when there are some nodes arriving or leaving. Generally we can give some assumption: 1. one node has the the probability p% to arise churn in an hour for leaving or poweroff etc and the churning node will equably distribute in the network. 2. The churn will occur with the periods of T1 minutes in the P2P network and p% of all nodes will leave the P2P overlay. 3. After T2 minutes, the total leaving nodes will return the P2P overlay with the probability q%. T2 must be less than T1. Because we use backup mechanisms for user's information, if one node leaves the overlay, its backup nodes will take on the incoming calls. When the node leaves the overlay, it will lead to call errors. If one node return the overlay, it will get its previously possessed information from the backup nodes and this action will lead to generate new traffic(we call these traffic as recover traffic). These traffics will influence the network bandwidth and increase as the total number of the all users. Based on the above discuss, for simplifying and accelerating the measurements, we set some supreme values for the measurement as follows: 1. p = 5, 10, 20, q = 80 2. T1 = 10 minutes, T2 = 4 minutes 3. Number of users = 100 thousands, 300 thousands, 500 thousands, 1000 thousands We do churn measurements under different p and number of users in the demo system with 58 PN nodes. Because we don't care the bulk call performance under churn, we use test device to originate 1800caps Zhang, et al Expires September 11, 2011 [Page 16] Internet-Draft p2psip bcp March 11, 2011 call in each measurement and the settings for the calls are the same as section 4.2. We collected the following parameters: 1. Call success ratio: call success establishment compared to all call. 2. Call error: call fail caused by either caller or callee. we randomly get the call errors from two arbitrary nodes which leave and return the overlay. 3. sampled recover traffics: we randomly get the recover traffics from two arbitrary nodes which leave and return the overlay. 4.3.2. Measurement results The following are the results. 1. p = 5% (about 3 nodes involved in churn at the same time) user number(thousands): 100 300 500 1000 call success ratio(%): 99.00 99.31 99.52 99.93 call error: 34283 22702 16240 3316 32676 22290 15681 2545 Recover traffic(kbps): 999 490 198 12372 880 117 441 11613 2. p = 10% (about 6 nodes involved in churn at the same time) user number(thousands): 100 300 500 1000 call success ratio(%): 99.74 99.65 99.67 99.71 call error: 10299 13041 12532 15256 9020 12038 11445 12055 Recover traffic(kbps): 1430 1360 2500 13840 1170 1020 2670 9720 Zhang, et al Expires September 11, 2011 [Page 17] Internet-Draft p2psip bcp March 11, 2011 3. p = 20% (about 12 nodes involved in churn at the same time) user number(thousands): 100 300 500 1000 call success ratio(%): 99.11 99.16 98.98 98.93 call error: 28276 26684 32474 51705 28275 26684 32474 42942 Recover traffic(kbps): 250 450 660 5397 220 410 600 10888 4.3.3. conclusions Through above measurement, the conclusions are: 1. The recover traffic will increase as the total users increase. 2. If the churn involves a small quantity of PN nodes, the churn has little effect to the system call transaction performance and almost is independent of the total user's number. 5. Validation of DSN VoIP system in real environment 5.1. Overview We establish a small DSN VoIP demo network over the internet to validate the DSN VoIP system's usability in real internet. The demo network is deployed in three cities: Beijing, Shenzhen and Wuhan. This network includes 7 PN nodes, three in Beijing, two in Shenzhen and two in Wuhan. These seven PNs have different configuration: the PNs in Wuhan use common PCs, the PNs in Shenzhen use virtual machines in a common PC and the PNs in Beijing use virtual machines in two HP DL320 servers. The local network in each place belongs to different operator network, the PNs in Wuhan use CERNET, the PNs in Shenzhen use China Telecom's IP network and the PNs in Beijing use China Mobile's IP network. The access bandwidth for each point is quite different. The access bandwidth for these three locations is shared bandwidth, there are some other traffics on the same access point, we don't know how much bandwidth is used by the other traffic and how the other traffic may affect the available bandwidth for the DSN VoIP traffic. Zhang, et al Expires September 11, 2011 [Page 18] Internet-Draft p2psip bcp March 11, 2011 (Note: we do many measurements for different number of PNs in three cities and here only choose a typical measurement due to limited space.) 5.2. Single Capabilities measurement For measuring the performance of the DSN VoIP demo network, we configure the following parameters: 1. call parameters: call length =20 second Total call number = 200,000 2. subscriber data parameters: caller number = callee number = 100,000 The callers and callees must be registered before measurement. All these registered subscribers and calls handled by each PN are randomly averagely distributed in system. So there are one sixth calls between Beijing and Beijing, one sixth between Beijing and Wuhan, one six between Beijing and Shenzhen, one sixth between Shenzhen and Shenzhen, one sixth between Shenzhen and Wuhan, one sixth between Wuhan and Wuhan. We guess that these circs may occur, namely the least time for the calls is the calls between Beijing and Beijing, then the calls between Shenzhen and Shenzhen and the calls between Wuhan and Wuhan, the calls between different cities will take much longer time than the two cases. Since there are no standards for Internet VoIP, we don't know how to evaluate the results of this measurement. We refer to [2] and [2] gives some performance referenced parameters for mobile calls. Every caller will finish two calls during the measurement if no error happens. We collected the following results: 1. Call Response time(msec)-CR time Minimum: 63 Average: 195 Maximum: 6122 Call Response time is the time when the first 100 trying message is received after Invite is sent. In Sec 3.2.3.1 of [2], the mean value for CR time is required to be less than 800ms and it doesn't exceed 1000ms with 0.95 probability. Zhang, et al Expires September 11, 2011 [Page 19] Internet-Draft p2psip bcp March 11, 2011 In our test about 85% CR times in successful calls doesn't exceed 1000ms. 2. Call setup(msec)-CS time Minimum: 73 Average: 1548 Maximum: 24525 In Sec 3.2.3.8 of [2], the mean value for CS time is required to be less than 2200ms and it doesn't exceed 2400ms with 0.95 probability. In our test, about 76% CS times in successful calls doesn't exceed 2400ms. 3. Tear Down(msec)-TD time Minimum: 17 Average: 919 Maximum: 19735 In Sec 3.2.3.5 of [2], the mean value for TD time is required to be less than 400ms and it doesn't exceed 700ms with 0.95 probability. In our test, about 82% TD times in successful calls doesn't exceed 700ms. 4. Post Dial delay(msec)-PD time Minimum: 73 Average: 1544 Maximum: 24528 In Sec 3.2.3.7 of [2], the mean value for PD time is required to be less than 175ms and it doesn't exceed 350ms with 0.95 probability. In this measurement, less than 45% PD times in successful calls doesn't exceed 350ms. From the above results, we can see these results show much difference with the results in lab environment, for example, the average call setup time is much higher in real network. We give a brief analysis as following: 1. The measurement spans three operator's network so that the traffic go cross too many different network and it can lead to long transmission delay. If the long transmission delay exceeds the threshold of some timers, it will lead the call fail. For example, we do traceroute from Shenzhen to Beijing and find there are 16 hops between these two points. In lab environment, there is only one hop between every pair of PNs and the ping delay is under 5ms. 2. For security, the PNs in these three points are all behind firewall though these PNs have public IP addresses. The firewall has its secure policy for every stream and inspects every ingoing/outgoing packet. So the firewall's action has some impact Zhang, et al Expires September 11, 2011 [Page 20] Internet-Draft p2psip bcp March 11, 2011 on the communication between PNs. We find that the communication between two neighbor PNs is often lost for some reasons and this badly affect the call success ratio. In lab environment, there is no firewall and direction communication between PNs. 3. Since the PNs in different points have different configuration, the PN has different call transaction capability. In DSN VoIP system, the calls distributed for each PN are approximately uniform, as the call traffic increases gradually, the PN with the lowest configuration will lead to call fails firstly. In lab environment, all PNs have the same configuration and the call fail has an even distribution. 5.3. conclusions Through above measurement, the conclusions are: 1. The DSN VoIP system can work in global internet and doesn't perform well compared with the tests in lab environment. 2. Some issues obviously affect the performance of the system, e.g. limited IP interconnection bandwidth, firewall configuration, heterogeneous platform configuration. Due to many unpredictable reasons in the WAN, these results show much ununiformity which can't be seen in the lab environment. 6. Security Considerations This draft does not introduce any new security issues. 7. IANA Considerations This memo includes no request to IANA. 8. Conclusions Through the measurement for the demo system, we validate that DSN VoIP system is a potential operationable system, which has good call transaction capability, high reliability and availability to settle some key problem mentioned in Sec2.2 at a certain extent. The call transaction capability of DSN VoIP system can linearly increase as the server continuously joins the overlay. And this makes it possible that we can provide Carrier-grade call transaction capability by using normal computer server. Zhang, et al Expires September 11, 2011 [Page 21] Internet-Draft p2psip bcp March 11, 2011 The DSN VoIP system has an excellent anti-churn capability so that this overcomes the single point failure and provides the high network and service reliability for subscribers. The measurement in real WAN shows that the DSN VoIP system can be deployed in the internet and affected by many unpredictable issues. 9. References 9.1. Normative References [1] ITU-T DSN, Proposed scope of DSN, http://www.itu.int/md/meetingdoc.asp?lang=en&parent=T09-SG13- 090112-C&PageLB=50. [2] 3GPP TS 43.005 V9.0.0, Technical Specification Group Core Network and Terminals, Technical performance objectives 9.2. Informative References [3] [draft-zhang-ppsp-dsn-introduction-00]www.ietf.org/internet- draft/draft-zhang-ppsp-dsn-introduction-00.txt [4] [draft-ietf-p2psip-base-07]www.ietf.org/id/draft-ietf-p2psip- base-07.txt [5] Guangyu Shi, Jian Chen, Hao Gong, Lingyuan Fan, Haiqiang Xue, Qingming Lu, and Liang Liang, " SandStone: A DHT based Carrier Grade Distributed Storage System ", Proc. ICPP 2009 10. Acknowledgments Thanks to Cheng Li, Haitao Yang, Xiaolin Jiang, Jian Zhang and Jiguang Cao for their hard efforts, comments and suggestions. Special thanks to Guangwu He for his elaborate technical supports for DSN VoIP system. Zhang, et al Expires September 11, 2011 [Page 22] Internet-Draft p2psip bcp March 11, 2011 Authors' Addresses Yunfei Zhang China Mobile Beijing 100045, China Phone: 86-13601032119 Email: zhangyunfei@chinamobile.com Gang Li China Mobile Beijing 100045, China Phone: 86-13501279120 Email: ligangyf@chinamobile.com Jin Peng China Mobile Beijing 100045, China Phone: 86-13911281193 Email: pengjin@chinamobile.com Baohong He China CATR Beijing 100045, China Phone: 86-10-62300050 Email: hebaohong@catr.cn Shihui Duan China CATR Beijing 100045, China Phone: 86-10-62300068 Email: duanshihui@catr.cn Wei Zhu China CATR Beijing 100045, China Phone: 86-10-62300089 Email: zhuwei@catr.cn Zhang, et al Expires September 11, 2011 [Page 23]