Network Working Group                                      Indra Widjaja
                                          Fujitsu Network Communications
Internet Draft                                             Anwar Elwalid
Expires in six months                      Bell Labs, Lucent Technologies
                                                            October 1998

           Performance Issues in VC-Merge Capable ATM LSRs

                  <draft-widjaja-mpls-vc-merge-01.txt>
Status of this Memo

This document is an Internet Draft. Internet Drafts are working
documents of the Internet Engineering Task Force (IETF), its Areas,
and its Working Groups. Note that other groups may also distribute
working documents as Internet Drafts.

Internet Drafts are draft documents valid for a maximum of six
months. Internet Drafts may be updated, replaced, or obsoleted by
other documents at any time. It is not appropriate to use Internet
Drafts as reference material or to cite them other than as a "working
draft" or "work in progress."

Please check the 1id-abstracts.txt listing contained in the
internet-drafts Shadow Directories on nic.ddn.mil, nnsc.nsf.net,
nic.nordu.net, ftp.nisc.sri.com, or munnari.oz.au to learn the
current status of any Internet Draft.
Abstract

VC merging allows many routes to be mapped to the same VC label,
thereby providing a scalable mapping method that can support
thousands of edge routers. VC merging requires reassembly buffers so
that cells belonging to different packets intended for the same
destination do not interleave with each other. This document
investigates the impact of VC merging on the additional buffer
required for the reassembly buffers and other buffers. The main
result indicates that VC merging incurs a minimal overhead compared
to non-VC merging in terms of additional buffering. Moreover, the
overhead decreases as utilization increases, or as the traffic
becomes more bursty.
1.0 Introduction

Recently, some radical proposals to overhaul the legacy router
architectures have been presented by several organizations, notably
Ipsilon's IP switching [1], Cisco's Tag switching [2], Toshiba's
CSR [3], IBM's ARIS [4], and the IETF's MPLS [5]. Although the details
of their implementations vary, there is one fundamental concept that
is shared by all these proposals: map the route information to short
fixed-length labels so that next-hop routers can be determined by
direct indexing.
Although any layer 2 switching mechanism can in principle be applied,
the use of ATM switches in the backbone network is believed to be a
very attractive solution since ATM hardware switches have been
extensively studied and are widely available in many different
architectures. In this document, we will assume that layer 2 switching
uses ATM technology. In this case, each IP packet may be segmented
into multiple 53-byte cells before being switched. Traditionally,
AAL 5 has been used as the encapsulation method in data communications
since it is simple, efficient, and has a powerful error detection
mechanism. For the ATM switch to forward incoming cells to the correct
outputs, the IP route information needs to be mapped to ATM labels
which are kept in the VPI and/or VCI fields. The relevant route
information that is stored semi-permanently in the IP routing table
contains the tuple (destination, next-hop router). The route
information changes when the network state changes; this typically
occurs slowly, except during transient cases. The word
``destination'' typically refers to the destination network (or CIDR
prefix), but can be readily generalized to (destination network,
QoS), (destination host, QoS), or many other granularities. In this
document, the destination can mean any of the above or other possible
granularities.
Several methods of mapping the route information to ATM labels exist.
In the simplest form, each source-destination pair is mapped to a
unique VC value at a switch. This method, called the non-VC merging
case, allows the receiver to easily reassemble cells into their
respective packets since the VC values can be used to distinguish the
senders. However, if there are n sources and destinations, each
switch is potentially required to manage O(n^2) VC labels for
full-meshed connectivity. For example, if there are 1,000
sources/destinations, then the size of the VC routing table is on the
order of 1,000,000 entries. Clearly, this method is not scalable to
large networks. In the second method, called VP merging, the VP
labels of cells that are intended for the same destination are
translated to the same outgoing VP value, thereby reducing VP
consumption downstream. For each VP, the VC value is used to identify
the sender so that the receiver can reconstruct packets even though
cells from different packets are allowed to interleave. Each switch
is now required to manage O(n) VP labels - a considerable saving from
O(n^2). Although the number of label entries is considerably reduced,
VP merging is limited to only 4,096 entries at the network-to-network
interface.
Moreover, VP merging requires coordination of the VC values for a
given VP, which introduces more complexity. A third method, called
VC merging, maps incoming VC labels for the same destination to the
same outgoing VC label. This method is scalable and does not have the
space constraint problem of VP merging. With VC merging, cells for
the same destination are indistinguishable at the output of a switch.
Therefore, cells belonging to different packets for the same
destination cannot interleave with each other, or else the receiver
will not be able to reassemble the packets. With VC merging, the
boundary between two adjacent packets is identified by the
``End-of-Packet'' (EOP) marker used by AAL 5.
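The scaling difference between the mapping methods can be made
concrete with a small sketch (ours, not the draft's; the function
names are purely illustrative):

```python
# Illustrative label-table sizes for the mapping methods above.
# Non-VC merging needs a distinct VC label for every
# source-destination pair; VP or VC merging needs only one merged
# label per destination.

def non_vc_merging_entries(n):
    # O(n^2): one VC label per source-destination pair
    return n * n

def merging_entries(n):
    # O(n): one merged label per destination
    return n

print(non_vc_merging_entries(1000))  # 1000000, the draft's example
print(merging_entries(1000))         # 1000
```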
It is worth mentioning that cell interleaving may be allowed if we
use the AAL 3/4 Message Identifier (MID) field to identify the sender
uniquely. However, this method has some serious drawbacks: 1) the
MID size may not be sufficient to identify all senders, 2) the
encapsulation method is not efficient, 3) the CRC capability is not
as powerful as in AAL 5, and 4) AAL 3/4 is not as widely supported as
AAL 5 in data communications.
Before VC merging with no cell interleaving can be qualified as the
most promising approach, two main issues need to be addressed.
First, the feasibility of an ATM switch that is capable of merging
VCs needs to be investigated. Second, there is widespread concern
that the additional amount of buffering required to implement VC
merging is excessive, making the VC-merging method impractical.
Through analysis and simulation, we will dispel these concerns in
this document by showing that the additional buffer requirement for
VC merging is minimal for most practical purposes. Other
performance-related issues such as the additional delay due to VC
merging will also be discussed.
2.0 A VC-Merge Capable MPLS Switch Architecture
In principle, the reassembly buffers can be placed at the input or
output side of a switch. If they are located at the input, then the
switch fabric has to transfer all cells belonging to a given packet
in an atomic manner since cells are not allowed to interleave. This
requires the fabric to perform frame switching, which is neither
flexible nor desirable when multiple QoSs need to be supported. On
the other hand, if the reassembly buffers are located at the output,
the switch fabric can forward each cell independently as in normal
ATM switching. Placing the reassembly buffers at the output makes an
output-buffered ATM switch a natural choice.
We consider a generic output-buffered VC-merge capable MPLS switch
with VCI translation performed at the output. Other possible
architectures may also be adopted. The switch consists of a
non-blocking cell switch fabric and multiple output modules (OMs),
each associated with an output port. Each arriving ATM cell is
appended with two fields containing an output port number and an
input port number. Based on the output port number, the switch fabric
forwards each cell to the correct output port, just as in normal ATM
switches. If VC merging is not implemented, then the OM consists of
an output buffer. If VC merging is implemented, the OM contains a
number of reassembly buffers (RBs), followed by a merging unit, and
an output buffer. Each RB typically corresponds to an incoming VC
value. It is important to note that each buffer is a logical buffer,
and it is envisioned that a common pool of memory is shared by the
reassembly buffers and the output buffer.
The purpose of the RB is to ensure that cells for a given packet do
not interleave with other cells that are merged to the same VC. This
mechanism (called store-and-forward at the packet level) can be
accomplished by storing each incoming cell for a given packet at the
RB until the last cell of the packet arrives. When the last cell
arrives, all cells in the packet are transferred in an atomic manner
to the output buffer for transmission to the next hop. It is worth
pointing out that performing cut-through at the RB is not recommended
since it would waste bandwidth if the subsequent cells are delayed.
During the transfer of a packet to the output buffer, the incoming
VCI is translated to the outgoing VCI by the merging unit. To save VC
translation table space, different incoming VCIs are merged to the
same outgoing VCI during the translation process if the cells are
intended for the same destination. If all traffic is best-effort,
full merging, where all incoming VCs destined for the same
destination network are mapped to the same outgoing VC, can be
implemented. However, if the traffic is composed of multiple classes,
it is desirable to implement partial merging, where incoming VCs
destined for the same (destination network, QoS) are mapped to the
same outgoing VC.
Regardless of whether full merging or partial merging is implemented,
the output buffer may consist of a single FIFO buffer or multiple
buffers, each corresponding to a destination network or (destination
network, QoS). If a single output buffer is used, then the switch
essentially tries to emulate frame switching. If multiple output
buffers are used, VC merging is different from frame switching since
cells of a given packet are not bound to be transmitted back-to-back.
In fact, fair queueing can be implemented so that cells from their
respective output buffers are served according to some QoS
requirements. Note that cell-by-cell scheduling can be implemented
with VC merging, whereas only packet-by-packet scheduling can be
implemented with frame switching. In summary, VC merging is more
flexible than frame switching and supports better QoS control.
3.0 Performance Investigation of VC Merging
This section compares the VC-merging switch and the non-VC merging
switch. The non-VC merging switch is analogous to the traditional
output-buffered ATM switch, whereby cells of any packets are allowed
to interleave. Since each cell is a distinct unit of information,
the non-VC merging switch is a work-conserving system at the cell
level. On the other hand, the VC-merging switch is
non-work-conserving, so its performance is always lower than that of
the non-VC merging switch. The main objective here is to study the
performance implications of VC merging for MPLS switches, such as
additional delay and additional buffering, subject to different
traffic conditions.
In the simulation, the arrival process to each reassembly buffer is
an independent ON-OFF process. Cells within an ON period form a
single packet. During an OFF period, the slots are idle. Note that
the ON-OFF process is a general process that can model any traffic
process.
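The arrival model can be sketched as follows (an illustrative
simulation of ours, assuming geometrically distributed ON and OFF
period lengths; the parameter names are hypothetical, not the
draft's):

```python
# Sketch of a slotted ON-OFF source: each slot carries a cell (1)
# during an ON period and is idle (0) during an OFF period.
import random

def geometric(rng, p):
    # geometric variate >= 1 with success probability p; mean 1/p
    count = 1
    while rng.random() > p:
        count += 1
    return count

def on_off_source(mean_on, mean_off, slots, rng):
    # alternate ON bursts (one packet each) with idle OFF periods
    arrivals = []
    while len(arrivals) < slots:
        arrivals.extend([1] * geometric(rng, 1.0 / mean_on))
        arrivals.extend([0] * geometric(rng, 1.0 / mean_off))
    return arrivals[:slots]

rng = random.Random(0)
cells = on_off_source(mean_on=10, mean_off=10, slots=100000, rng=rng)
utilization = sum(cells) / len(cells)
print(round(utilization, 2))  # close to 0.5 for equal mean ON/OFF
```

With equal mean ON and OFF lengths, the offered load is mean_on /
(mean_on + mean_off) = 0.5; skewing the two means lets the simulation
sweep the utilizations studied in the following subsections.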
3.1 Effect of Utilization on Additional Buffer Requirement
We first investigate the effect of switch utilization on the
additional buffer requirement for a given overflow probability. To
carry out the comparison, we analyze the VC-merging and non-VC
merging cases when the average packet size is equal to 10 cells,
using geometrically distributed packet sizes and packet interarrival
times, with the cells of a packet arriving contiguously (later, we
consider other distributions). The results show that, as expected,
the VC-merging switch requires more buffering than the non-VC merging
switch. When the utilization is low, there may be relatively many
incomplete packets in the reassembly buffers at any given time, thus
wasting storage resources. For example, when the utilization is 0.3,
VC merging requires additional storage of about 45 cells to achieve
the same overflow probability. However, as the utilization increases
to 0.9, the additional storage needed to achieve the same overflow
probability drops to about 30 cells. The reason is that when the
traffic intensity increases, the VC-merging system becomes more
work-conserving.
It is important to note that ATM switches must be dimensioned at high
utilization values (in the range of 0.8-0.9) to withstand harsh
traffic conditions. At a utilization of 0.9, a VC-merge ATM switch
requires a buffer of 976 cells to provide an overflow probability of
10^{-5}, whereas a non-VC merge ATM switch requires a buffer of 946
cells. These numbers translate to an additional buffer requirement of
about 3% for VC merging - hardly a significant additional buffering
cost.
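The roughly 3% figure follows directly from the two buffer sizes:

```python
# Relative buffer overhead of VC merging at utilization 0.9,
# using the two buffer sizes quoted above.
vc_merge_cells = 976
non_vc_merge_cells = 946
overhead = (vc_merge_cells - non_vc_merge_cells) / non_vc_merge_cells
print(round(100 * overhead, 1))  # 3.2, i.e. "about 3%"
```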
3.2 Effect of Packet Size on Additional Buffer Requirement
We now vary the average packet size to see its impact on the buffer
requirement. We fix the utilization at 0.5 and use two different
average packet sizes, B=10 and B=30 cells. To achieve the same
overflow probability, VC merging requires an additional buffer of
about 40 cells (or 4 packets) compared to non-VC merging when B=10.
When B=30, the additional buffer requirement is about 90 cells (or 3
packets). As expected, the additional buffer requirement in terms of
cells increases as the packet size increases. However, the additional
buffer requirement is roughly constant in terms of packets.
3.3 Additional Buffer Overhead Due to Packet Reassembly
There may be some concern that VC merging may require too much
buffering when the number of reassembly buffers increases, which
would happen if the switch size is increased or if cells for packets
going to different destinations are allowed to interleave. We will
show that this concern is unfounded since buffer sharing becomes more
efficient as the number of reassembly buffers increases.
To demonstrate our argument, we consider the overflow probability for
VC merging for several numbers of reassembly buffers (N): N=4, 8, 16,
32, 64, and 128. The utilization is fixed at 0.8 for each case, and
the average packet size is chosen to be 10 cells. For a given
overflow probability, the increase in buffer requirement becomes less
pronounced as N increases. Beyond a certain value (N=32), the
increase in buffer requirement becomes insignificant. The reason is
that as N increases, the traffic gets thinned and eventually
approaches a limiting process.
3.4 Effect of Interarrival Time Distribution on Additional Buffer
    Requirement
We now turn our attention to different traffic processes. First, we
use the same ON period distribution and change the OFF period
distribution from geometric to hypergeometric, which has a larger
Square Coefficient of Variation (SCV), defined to be the ratio of the
variance to the square of the mean. Here we fix the utilization at
0.5. As expected, the switch performance degrades as the SCV
increases in both the VC-merging and non-VC merging cases. To achieve
a buffer overflow probability of 10^{-4}, the additional buffer
required is about 40 cells when SCV=1, 26 cells when SCV=1.5, and 24
cells when SCV=2.6. The result shows that VC merging becomes more
work-conserving as the SCV increases. In summary, as the interarrival
time between packets becomes more bursty, the additional buffer
requirement for VC merging diminishes.
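For reference, the SCV just defined can be computed with a small
helper (ours, not from the draft):

```python
# SCV as defined above: variance divided by the square of the mean.
def scv(samples):
    n = len(samples)
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / n
    return var / mean ** 2

# A geometric(p) interarrival time has mean 1/p and variance
# (1-p)/p^2, so its SCV is 1-p: close to 1 for small p, matching the
# SCV=1 baseline case above. Heavier-tailed OFF periods push SCV > 1.
print(scv([2, 4, 4, 6]))  # 0.125 for this toy sample
```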
3.5 Effect of Internet Packets on Additional Buffer Requirement
The distribution appears bi-modal with two big masses at 40 bytes (about Up to now, the packet size has been modeled as a geometric distribu-
a third) due to TCP acknowledgment packets, and 552 bytes (about 22 per- tion with a certain parameter. We modify the packet size distribu-
cent) due to Maximum Transmission Unit (MTU) limitations in many tion to a more realistic one for the rest of this document. Since
routers. Other prominent packet sizes include 72 bytes (about 4.1 per- the initial deployment of VC-merge capable ATM switches is likely to
cent), 576 bytes (about 3.6 percent), 44 bytes (about 3 percent), 185 be in the core network, it is more realistic to consider the packet
bytes (about 2.7 percent), and 1500 bytes (about 1.5 percent) due to size distribution in the Wide Area Network. To this end, we refer to
Ethernet MTU. The mean packet size is 257 bytes, and the variance is the data given in [6]. The data collected on Feb 10, 1996, in FIX-
84,287 bytes^2. Thus, the SCV for the Internet packet size is about 1.1. West network, is in the form of probability mass function versus
packet size in bytes. Data collected at other dates closely resemble
this one.
The distribution appears bi-modal, with two large masses at 40 bytes
(about a third) due to TCP acknowledgment packets and at 552 bytes
(about 22 percent) due to Maximum Transmission Unit (MTU) limitations
in many routers. Other prominent packet sizes include 72 bytes (about
4.1 percent), 576 bytes (about 3.6 percent), 44 bytes (about 3
percent), 185 bytes (about 2.7 percent), and 1500 bytes (about 1.5
percent) due to the Ethernet MTU. The mean packet size is 257 bytes,
and the variance is 84,287 bytes^2. Thus, the SCV for the Internet
packet size is about 1.3.
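As a quick arithmetic check on the statistics just quoted (an
illustration of the SCV definition, nothing more): with SCV defined
as variance over squared mean, the reported figures give an SCV near
1.3, while the plain coefficient of variation (standard deviation
over mean) is about 1.13.

```python
# Arithmetic check on the reported FIX-West statistics.
mean_bytes = 257.0
var_bytes2 = 84287.0

scv = var_bytes2 / mean_bytes ** 2    # variance / mean^2, ~1.28
cv = var_bytes2 ** 0.5 / mean_bytes   # std / mean, ~1.13, for comparison
```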
To convert the IP packet size in bytes to ATM cells, we assume AAL 5
with null encapsulation, where the additional overhead in AAL 5 is 8
bytes [7]. Using the null encapsulation technique, the average packet
size is about 6.2 ATM cells.

We examine the buffer overflow probability against the buffer size
using the Internet packet size distribution. The OFF period is
assumed to have a geometric distribution. Again, we find the same
behavior as before, except that the buffer requirement drops with
Internet packets due to the smaller average packet size.
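The byte-to-cell conversion can be sketched as follows, assuming the
8-byte AAL5 trailer and 48-byte cell payload; the 6.2-cell average
quoted above results from weighting this conversion by the full
empirical distribution, which is not reproduced here.

```python
import math

def aal5_cells(ip_bytes, trailer=8, cell_payload=48):
    """ATM cells needed for an IP packet under AAL5 null encapsulation:
    the 8-byte AAL5 trailer is added and the result is padded up to a
    whole number of 48-byte cell payloads."""
    return math.ceil((ip_bytes + trailer) / cell_payload)

# The two dominant WAN packet sizes from the distribution above:
ack_cells = aal5_cells(40)    # 40 + 8 = 48 bytes, exactly 1 cell
mtu_cells = aal5_cells(552)   # 552 + 8 = 560 bytes, 12 cells
```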
3.6 Effect of Correlated Interarrival Times on Additional Buffer
    Requirement

To model correlated interarrival times, we use the DAR(p) process
(discrete autoregressive process of order p) [8], which has been used
to accurately model video traffic (the Star Wars movie) in [9]. The
DAR(p) process is a p-th order (lag-p) discrete-time Markov chain.
The state of the process at time n depends explicitly on the states
at times (n-1), ..., (n-p).
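The DAR(p) construction can be illustrated with its simplest case,
DAR(1): with probability rho the previous value is repeated,
otherwise a fresh value is drawn from the desired marginal, so the
marginal distribution is preserved and the lag-1 correlation equals
rho. The sketch below uses hypothetical parameters (geometric
interarrival times with p=0.3, rho=0.9), not the simulated workload.

```python
import math
import random

def dar1(marginal_sampler, rho, n, seed=7):
    """Sample n values of a DAR(1) process: with probability rho repeat
    the previous value, otherwise draw fresh from the marginal."""
    rng = random.Random(seed)
    x = [marginal_sampler(rng)]
    for _ in range(n - 1):
        x.append(x[-1] if rng.random() < rho else marginal_sampler(rng))
    return x

# Hypothetical geometric interarrival times (mean 1/p slots) with
# lag-1 correlation 0.9, as in the correlated case described above.
p = 0.3
geom = lambda rng: int(math.log(1.0 - rng.random()) / math.log(1.0 - p)) + 1
x = dar1(geom, rho=0.9, n=200000)

mean = sum(x) / len(x)
var = sum((v - mean) ** 2 for v in x) / len(x)
lag1 = sum((a - mean) * (b - mean) for a, b in zip(x, x[1:])) / (len(x) * var)
```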
We examine the overflow probability for the case where the
interarrival time between packets is geometric and independent, and
the case where the interarrival time is geometric and correlated to
the previous one with a coefficient of correlation equal to 0.9. The
empirical distribution of the Internet packet size from the last
section is used. The utilization is fixed at 0.5 in each case.
Although the overflow probability increases as p increases, the
additional amount of buffering actually decreases for VC merging as
p, or equivalently the correlation, increases. One can easily
conclude that higher-order correlation or long-range dependence,
which occurs in self-similar traffic, will result in similar
qualitative performance.
3.7 Slow Sources

The discussions up to now have assumed that cells within a packet
arrive back-to-back. When traffic shaping is implemented, adjacent
cells within the same packet would typically be spaced by idle slots.
We call such sources "slow sources". Adjacent cells within the same
packet may also be perturbed and spaced as they travel downstream,
due to the merging and splitting of cells at preceding nodes.
Here, we assume that each source transmits at a rate of r_s (0 < r_s
<= 1), in units of the link speed, to the ATM switch. To capture the
merging and splitting of cells as they travel through the network, we
also assume that the cell interarrival time within a packet is
randomly perturbed. To model this perturbation, we stretch the
original ON period by 1/r_s and flip a Bernoulli coin with parameter
r_s during the stretched ON period. In other words, during the ON
period a slot contains a cell with probability r_s and is idle with
probability 1-r_s. By doing so, the average packet size remains the
same as r_s is varied. We simulated slow sources on the VC-merge ATM
switch using the Internet packet size distribution with r_s=1 and
r_s=0.2. The packet interarrival time is assumed to be geometrically
distributed. Reducing the source rate in general reduces the stress
on the ATM switch, since the traffic becomes smoother. With VC
merging, slow sources also have the effect of increasing the
reassembly time. At a utilization of 0.5, the reassembly time is more
dominant and causes the slow source (with r_s=0.2) to require more
buffering than the fast source (with r_s=1). At a utilization of 0.8,
the smoother traffic is more dominant and causes the slow source
(with r_s=0.2) to require less buffering than the fast source (with
r_s=1). This result again has practical consequences for ATM switch
design, where buffer dimensioning is performed at reasonably high
utilization. In this situation, slow sources only help.
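The stretch-and-thin construction above can be sketched as follows.
The packet length of 6 cells and the rate r_s=0.2 are illustrative
examples, not the simulated workload; the point is that the expected
number of cells per packet, (L/r_s) * r_s = L, is unchanged.

```python
import random

def slow_source_cells(packet_cells, r_s, rng):
    """Emit one packet from a source of rate r_s: the ON period is
    stretched by 1/r_s and each slot carries a cell with probability
    r_s (1 = cell, 0 = idle), preserving the mean packet size."""
    slots = round(packet_cells / r_s)
    return [1 if rng.random() < r_s else 0 for _ in range(slots)]

rng = random.Random(42)
L, r_s, trials = 6, 0.2, 20000
avg = sum(sum(slow_source_cells(L, r_s, rng)) for _ in range(trials)) / trials
# avg is close to L: thinning by r_s over an ON period stretched by 1/r_s
```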
3.8 Packet Delay

It is of interest to see the impact of cell reassembly on packet
delay. Here we consider the delay at one node only; end-to-end delays
are the subject of ongoing work. We define the delay of a packet as
the time between the arrival of the first cell of a packet at the
switch and the departure of the last cell of the same packet. We
study the average packet delay as a function of utilization for both
VC-merging and non-VC merging switches for the case r_s=1
(back-to-back cells in a packet). Again, the Internet packet size
distribution is used to adopt the more realistic scenario. The
interarrival time of packets is geometrically distributed. Although
the difference in worst-case delay between VC merging and non-VC
merging can in theory be very large, we observe that the difference
in average delays of the two systems is consistently about one
average packet time for a wide range of utilizations. The difference
is due to the average time needed to reassemble a packet.
To see the effect of cell spacing in a packet, we again simulate the
average packet delay for r_s=0.2. We observe that the difference in
average delays of VC merging and non-VC merging increases to a few
packet times (approximately 20 cells at high utilization). It should
be noted that when a VC-merge capable ATM switch reassembles packets,
it in effect performs a task that the receiver would otherwise have
to do. From a practical point of view, an increase of 20 cells
translates to about 60 microseconds at OC-3 link speed. This
additional delay should be insignificant for most applications.
4.0 Security Considerations

There are no security considerations directly related to this
document, since it is concerned with the performance implications of
VC merging. There are also no known security considerations arising
from the proposed modification of a legacy ATM LSR to incorporate VC
merging.
5.0 Discussion

This document has investigated the impact of VC merging on the
performance of an ATM LSR. We experimented with various traffic
processes to understand the detailed behavior of VC-merge capable ATM
LSRs. Our main finding is that VC merging incurs minimal overhead
compared to non-VC merging in terms of additional buffering.
Moreover, the overhead decreases as utilization increases, or as the
traffic becomes more bursty. This fact has important practical
consequences, since switches are dimensioned for high utilization and
stressful traffic conditions. We have considered the case where the
output buffer uses FIFO scheduling. However, based on our
investigation of slow sources, we believe that fair queueing will not
have a significant impact on the additional amount of buffering.
Others may wish to investigate this further.
6.0 Acknowledgement

The authors thank Debasis Mitra for his penetrating questions during
the internal talks and discussions.
7.0 References

[1] P. Newman, T. Lyon and G. Minshall, "Flow Labelled IP:
    Connectionless ATM Under IP," in Proceedings of INFOCOM'96,
    San Francisco, Apr. 1996.

[2] Y. Rekhter, B. Davie, D. Katz, E. Rosen and G. Swallow,
    "Cisco Systems' Tag Switching Architecture Overview," RFC 2105,
    Feb. 1997.

[3] Y. Katsube, K. Nagami and H. Esaki, "Toshiba's Router
    Architecture Extensions for ATM: Overview," RFC 2098, Feb. 1997.

[4] A. Viswanathan, N. Feldman, R. Boivie and R. Woundy, "ARIS:
    Aggregate Route-Based IP Switching," Internet Draft
    <draft-viswanathan-aris-overview-00.txt>, Mar. 1997.

[5] R. Callon, P. Doolan, N. Feldman, A. Fredette, G. Swallow and
    A. Viswanathan, "A Framework for Multiprotocol Label Switching,"
    Internet Draft <draft-ietf-mpls-framework-00.txt>, Nov. 1997.

[6] WAN Packet Size Distribution,
    http://www.nlanr.net/NA/Learn/packetsizes.html.

[7] J. Heinanen, "Multiprotocol Encapsulation over ATM Adaptation
    Layer 5," RFC 1483, Jul. 1993.

[8] P. Jacobs and P. Lewis, "Discrete Time Series Generated by
    Mixtures III: Autoregressive Processes (DAR(p))," Technical
    Report NPS55-78-022, Naval Postgraduate School, 1978.

[9] B.K. Ryu and A. Elwalid, "The Importance of Long-Range
    Dependence of VBR Video Traffic in ATM Traffic Engineering,"
    ACM SigComm'96, Stanford, CA, pp. 3-14, Aug. 1996.
Author Information:

Indra Widjaja
Fujitsu Network Communications
4403 Bland Road
Raleigh, NC 27609, USA
Phone: 919 790-2037
Email: indra.widjaja@fnc.fujitsu.com

Anwar Elwalid
Bell Labs, Lucent Technologies
600 Mountain Ave., Rm 2C-124
Murray Hill, NJ 07974, USA
Phone: 908 582-7589
Email: anwar@lucent.com