| < draft-fioccola-rfc8321bis-02.txt | draft-fioccola-rfc8321bis-03.txt > | |||
|---|---|---|---|---|
| Network Working Group G. Fioccola, Ed. | Network Working Group G. Fioccola, Ed. | |||
| Internet-Draft Huawei Technologies | Internet-Draft Huawei Technologies | |||
| Obsoletes: 8321 (if approved) M. Cociglio | Obsoletes: 8321 (if approved) M. Cociglio | |||
| Intended status: Standards Track Telecom Italia | Intended status: Standards Track Telecom Italia | |||
| Expires: August 21, 2022 G. Mirsky | Expires: August 27, 2022 G. Mirsky | |||
| Ericsson | Ericsson | |||
| T. Mizrahi | T. Mizrahi | |||
| T. Zhou | T. Zhou | |||
| Huawei Technologies | Huawei Technologies | |||
| X. Min | X. Min | |||
| ZTE Corp. | ZTE Corp. | |||
| February 17, 2022 | February 23, 2022 | |||
| Alternate-Marking Method | Alternate-Marking Method | |||
| draft-fioccola-rfc8321bis-02 | draft-fioccola-rfc8321bis-03 | |||
| Abstract | Abstract | |||
| This document describes the Alternate-Marking technique to perform | This document describes the Alternate-Marking technique to perform | |||
| packet loss, delay, and jitter measurements on live traffic. This | packet loss, delay, and jitter measurements on live traffic. This | |||
| technology can be applied in various situations and for different | technology can be applied in various situations and for different | |||
| protocols. It could be considered Passive or Hybrid depending on the | protocols. It could be considered Passive or Hybrid depending on the | |||
| application. This document obsoletes [RFC8321]. | application. This document obsoletes [RFC8321]. | |||
| Status of This Memo | Status of This Memo | |||
| skipping to change at page 1, line 42 ¶ | skipping to change at page 1, line 42 ¶ | |||
| Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
| Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
| working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
| Drafts is at https://datatracker.ietf.org/drafts/current/. | Drafts is at https://datatracker.ietf.org/drafts/current/. | |||
| Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
| and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
| time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
| material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
| This Internet-Draft will expire on August 21, 2022. | This Internet-Draft will expire on August 27, 2022. | |||
| Copyright Notice | Copyright Notice | |||
| Copyright (c) 2022 IETF Trust and the persons identified as the | Copyright (c) 2022 IETF Trust and the persons identified as the | |||
| document authors. All rights reserved. | document authors. All rights reserved. | |||
| This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
| Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
| (https://trustee.ietf.org/license-info) in effect on the date of | (https://trustee.ietf.org/license-info) in effect on the date of | |||
| publication of this document. Please review these documents | publication of this document. Please review these documents | |||
| skipping to change at page 2, line 22 ¶ | skipping to change at page 2, line 22 ¶ | |||
| the Trust Legal Provisions and are provided without warranty as | the Trust Legal Provisions and are provided without warranty as | |||
| described in the Simplified BSD License. | described in the Simplified BSD License. | |||
| Table of Contents | Table of Contents | |||
| 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 | |||
| 1.1. Requirements Language . . . . . . . . . . . . . . . . . . 4 | 1.1. Requirements Language . . . . . . . . . . . . . . . . . . 4 | |||
| 2. Overview of the Method . . . . . . . . . . . . . . . . . . . 4 | 2. Overview of the Method . . . . . . . . . . . . . . . . . . . 4 | |||
| 3. Detailed Description of the Method . . . . . . . . . . . . . 5 | 3. Detailed Description of the Method . . . . . . . . . . . . . 5 | |||
| 3.1. Packet Loss Measurement . . . . . . . . . . . . . . . . . 5 | 3.1. Packet Loss Measurement . . . . . . . . . . . . . . . . . 5 | |||
| 3.1.1. Coloring the Packets . . . . . . . . . . . . . . . . 10 | 3.2. One-Way Delay Measurement . . . . . . . . . . . . . . . . 9 | |||
| 3.1.2. Counting the Packets . . . . . . . . . . . . . . . . 10 | 3.2.1. Single-Marking Methodology . . . . . . . . . . . . . 9 | |||
| 3.1.3. Collecting Data and Calculating Packet Loss . . . . . 11 | 3.2.2. Double-Marking Methodology . . . . . . . . . . . . . 10 | |||
| 3.2. Timing Aspects . . . . . . . . . . . . . . . . . . . . . 12 | 3.3. Delay Variation Measurement . . . . . . . . . . . . . . . 11 | |||
| 3.3. One-Way Delay Measurement . . . . . . . . . . . . . . . . 13 | 4. Alternate Marking Functions . . . . . . . . . . . . . . . . . 12 | |||
| 3.3.1. Single-Marking Methodology . . . . . . . . . . . . . 14 | 4.1. Marking the Packets . . . . . . . . . . . . . . . . . . . 12 | |||
| 3.3.2. Double-Marking Methodology . . . . . . . . . . . . . 16 | 4.2. Counting and Timestamping Packets . . . . . . . . . . . . 13 | |||
| 3.4. Delay Variation Measurement . . . . . . . . . . . . . . . 17 | 4.3. Data Collection and Correlation . . . . . . . . . . . . . 14 | |||
| 4. Considerations . . . . . . . . . . . . . . . . . . . . . . . 17 | 5. Synchronization and Timing . . . . . . . . . . . . . . . . . 15 | |||
| 4.1. Synchronization . . . . . . . . . . . . . . . . . . . . . 17 | 6. Packet Fragmentation . . . . . . . . . . . . . . . . . . . . 17 | |||
| 4.2. Data Correlation . . . . . . . . . . . . . . . . . . . . 18 | 7. Results of the Alternate Marking Experiment . . . . . . . . . 17 | |||
| 4.3. Packet Reordering . . . . . . . . . . . . . . . . . . . . 19 | 7.1. Controlled Domain requirement . . . . . . . . . . . . . . 19 | |||
| 4.4. Packet Fragmentation . . . . . . . . . . . . . . . . . . 20 | 8. Compliance with Guidelines from RFC 6390 . . . . . . . . . . 19 | |||
| 5. Results of the Alternate Marking Experiment . . . . . . . . . 20 | 9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 21 | |||
| 5.1. Controlled Domain requirement . . . . . . . . . . . . . . 22 | 10. Security Considerations . . . . . . . . . . . . . . . . . . . 21 | |||
| 6. Compliance with Guidelines from RFC 6390 . . . . . . . . . . 22 | 11. Contributors . . . . . . . . . . . . . . . . . . . . . . . . 23 | |||
| 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 24 | 12. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 23 | |||
| 8. Security Considerations . . . . . . . . . . . . . . . . . . . 24 | 13. References . . . . . . . . . . . . . . . . . . . . . . . . . 23 | |||
| 9. Contributors . . . . . . . . . . . . . . . . . . . . . . . . 26 | 13.1. Normative References . . . . . . . . . . . . . . . . . . 23 | |||
| 10. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 26 | 13.2. Informative References . . . . . . . . . . . . . . . . . 24 | |||
| 11. References . . . . . . . . . . . . . . . . . . . . . . . . . 26 | Appendix A. Changes Log . . . . . . . . . . . . . . . . . . . . 26 | |||
| 11.1. Normative References . . . . . . . . . . . . . . . . . . 26 | Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 27 | |||
| 11.2. Informative References . . . . . . . . . . . . . . . . . 27 | ||||
| Appendix A. Changes Log . . . . . . . . . . . . . . . . . . . . 29 | ||||
| Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 29 | ||||
| 1. Introduction | 1. Introduction | |||
| Nowadays, most Service Providers' networks carry traffic with | Nowadays, most Service Providers' networks carry traffic with | |||
| contents that are highly sensitive to packet loss [RFC7680], delay | contents that are highly sensitive to packet loss [RFC7680], delay | |||
| [RFC7679], and jitter [RFC3393]. | [RFC7679], and jitter [RFC3393]. | |||
| In view of this scenario, Service Providers need methodologies and | In view of this scenario, Service Providers need methodologies and | |||
| tools to monitor and measure network performance with an adequate | tools to monitor and measure network performance with an adequate | |||
| accuracy, in order to constantly control the quality of experience | accuracy, in order to constantly control the quality of experience | |||
| skipping to change at page 8, line 5 ¶ | skipping to change at page 8, line 5 ¶ | |||
| value of the counters not immediately after the color switch: some | value of the counters not immediately after the color switch: some | |||
| packets could arrive out of order and increment the counter | packets could arrive out of order and increment the counter | |||
| associated with the previous block (color), so it is worth waiting | associated with the previous block (color), so it is worth waiting | |||
| for some time. A safe choice is to wait L/2 time units (where L is | for some time. A safe choice is to wait L/2 time units (where L is | |||
| the duration for each block) after the color switch, to read the | the duration for each block) after the color switch, to read the | |||
| still counter of the previous color, so the possibility of reading a | still counter of the previous color, so the possibility of reading a | |||
| running counter instead of a still one is minimized. The drawback is | running counter instead of a still one is minimized. The drawback is | |||
| that the longer the duration of the block, the less frequently the | that the longer the duration of the block, the less frequently the | |||
| measurement can be taken. | measurement can be taken. | |||
| The following table shows how the counters can be used to calculate | The timer-based batching is preferable because it is more | |||
| the packet loss between R1 and R2. The first column lists the | deterministic than the counter-based batching, and it will be | |||
| sequence of traffic blocks, while the other columns contain the | considered hereafter. | |||
| counters of A-colored packets and B-colored packets for R1 and R2. | ||||
| In this example, we assume that the values of the counters are reset | ||||
| to zero whenever a block ends and its associated counter has been | ||||
| read: with this assumption, the table shows only relative values, | ||||
| which is the exact number of packets of each color within each block. | ||||
| If the values of the counters were not reset, the table would contain | ||||
| cumulative values, but the relative values could be determined simply | ||||
| by the difference from the value of the previous block of the same | ||||
| color. | ||||
| The color is switched on the basis of a fixed timer (not shown in the | ||||
| table), so the number of packets in each block is different. | ||||
| +-------+--------+--------+--------+--------+------+ | ||||
| | Block | C(A)R1 | C(B)R1 | C(A)R2 | C(B)R2 | Loss | | ||||
| +-------+--------+--------+--------+--------+------+ | ||||
| | 1 | 375 | 0 | 375 | 0 | 0 | | ||||
| | 2 | 0 | 388 | 0 | 388 | 0 | | ||||
| | 3 | 382 | 0 | 381 | 0 | 1 | | ||||
| | 4 | 0 | 377 | 0 | 374 | 3 | | ||||
| | ... | ... | ... | ... | ... | ... | | ||||
| | 2n | 0 | 387 | 0 | 387 | 0 | | ||||
| | 2n+1 | 379 | 0 | 377 | 0 | 2 | | ||||
| +-------+--------+--------+--------+--------+------+ | ||||
| Figure 4: Evaluation of Counters for Packet Loss Measurements | ||||
| During an A block (blocks 1, 3, and 2n+1), all the packets are | ||||
| A-colored; therefore, the C(A) counters are incremented to the number | ||||
| seen on the interface, while C(B) counters are zero. Conversely, | ||||
| during a B block (blocks 2, 4, and 2n), all the packets are | ||||
| B-colored: C(A) counters are zero, while C(B) counters are | ||||
| incremented. | ||||
| When a block ends (because of color switching), the relative counters | ||||
| stop incrementing; it is possible to read them, compare the values | ||||
| measured on routers R1 and R2, and calculate the packet loss within | ||||
| that block. | ||||
| For example, looking at the table above, during the first block | ||||
| (A-colored), C(A)R1 and C(A)R2 have the same value (375), which | ||||
| corresponds to the exact number of packets of the first block (no | ||||
| loss). Also, during the second block (B-colored), R1 and R2 counters | ||||
| have the same value (388), which corresponds to the number of packets | ||||
| of the second block (no loss). During the third and fourth blocks, | ||||
| R1 and R2 counters are different, meaning that some packets have been | ||||
| lost: in the example, one single packet (382-381) was lost during | ||||
| block three, and three packets (377-374) were lost during block four. | ||||
| The method applied to R1 and R2 can be extended to any other router | ||||
| and applied to more complex networks, as far as the measurement is | ||||
| enabled on the path followed by the traffic flow(s) being observed. | ||||
| It's worth mentioning two different strategies that can be used when | It's worth mentioning two different strategies that can be used when | |||
| implementing the method: | implementing the method: | |||
| o flow-based: the flow-based strategy is used when only a limited | o flow-based: the flow-based strategy is used when only a limited | |||
| number of traffic flows need to be monitored. According to this | number of traffic flows need to be monitored. According to this | |||
| strategy, only a subset of the flows is colored. Counters for | strategy, only a subset of the flows is colored. Counters for | |||
| packet loss measurements can be instantiated for each single flow, | packet loss measurements can be instantiated for each single flow, | |||
| or for the set as a whole, depending on the desired granularity. | or for the set as a whole, depending on the desired granularity. | |||
| A relevant problem with this approach is the necessity to know in | A relevant problem with this approach is the necessity to know in | |||
| skipping to change at page 10, line 12 ¶ | skipping to change at page 9, line 8 ¶ | |||
| transparent to intermediate nodes and independent from the path | transparent to intermediate nodes and independent from the path | |||
| followed by traffic flows. On the contrary, to monitor the flow on a | followed by traffic flows. On the contrary, to monitor the flow on a | |||
| hop-by-hop basis along its whole path, it is necessary to enable the | hop-by-hop basis along its whole path, it is necessary to enable the | |||
| monitoring on every node from the source to the destination. In case | monitoring on every node from the source to the destination. In case | |||
| the exact path followed by the flow is not known a priori (i.e., the | the exact path followed by the flow is not known a priori (i.e., the | |||
| flow has multiple paths to reach the destination), it is necessary to | flow has multiple paths to reach the destination), it is necessary to | |||
| enable the monitoring system on every path: counters on interfaces | enable the monitoring system on every path: counters on interfaces | |||
| traversed by the flow will report packet count, whereas counters on | traversed by the flow will report packet count, whereas counters on | |||
| other interfaces will be null. | other interfaces will be null. | |||
| 3.1.1. Coloring the Packets | 3.2. One-Way Delay Measurement | |||
| The coloring operation is fundamental in order to create packet | ||||
| blocks. This implies choosing where to activate the coloring and how | ||||
| to color the packets. | ||||
| In case of flow-based measurements, the flow to monitor can be | ||||
| defined by a set of selection rules (e.g., header fields) used to | ||||
| match a subset of the packets; in this way, it is possible to control | ||||
| the number of involved nodes, the path followed by the packets, and | ||||
| the size of the flows. It is possible, in general, to have multiple | ||||
| coloring nodes or a single coloring node that is easier to manage and | ||||
| doesn't raise any risk of conflict. Coloring in multiple nodes can | ||||
| be done, and the requirement is that the coloring must change | ||||
| periodically between the nodes according to the timing considerations | ||||
| in Section 3.2; so every node that is designated as a measurement | ||||
| point along the path should be able to identify unambiguously the | ||||
| colored packets. Furthermore, [I-D.fioccola-rfc8889bis] generalizes | ||||
| the coloring for multipoint-to-multipoint flow. In addition, it can | ||||
| be advantageous to color the flow as close as possible to the source | ||||
| because it allows an end-to-end measure if a measurement point is | ||||
| enabled on the last-hop router as well. | ||||
| For link-based measurements, all traffic needs to be colored when | ||||
| transmitted on the link. If the traffic had already been colored, | ||||
| then it has to be re-colored because the color must be consistent on | ||||
| the link. This means that each hop along the path must (re-)color | ||||
| the traffic; the color is not required to be consistent along | ||||
| different links. | ||||
| Traffic coloring can be implemented by setting a specific bit in the | ||||
| packet header and changing the value of that bit periodically. How | ||||
| to choose the marking field depends on the application and is out of | ||||
| scope here. | ||||
| 3.1.2. Counting the Packets | ||||
| For flow-based measurements, assuming that the coloring of the | ||||
| packets is performed only by the source nodes, the nodes between | ||||
| source and destination (included) have to count the colored packets | ||||
| that they receive and forward: this operation can be enabled on every | ||||
| router along the path or only on a subset, depending on which network | ||||
| segment is being monitored (a single link, a particular metro area, | ||||
| the backbone, or the whole path). Since the color switches | ||||
| periodically between two values, two counters (one for each value) | ||||
| are needed: one counter for packets with color A and one counter for | ||||
| packets with color B. For each flow (or group of flows) being | ||||
| monitored and for every interface where the monitoring is Active, two | ||||
| counters are needed. For example, in order to separately monitor | ||||
| three flows on a router with four interfaces involved, 24 counters | ||||
| are needed (two counters for each of the three flows on each of the | ||||
| four interfaces). Furthermore, [I-D.fioccola-rfc8889bis] generalizes | ||||
| the counting for multipoint-to-multipoint flow. | ||||
| In case of link-based measurements, the behavior is similar except | ||||
| that coloring and counting operations are performed on a link-by-link | ||||
| basis at each endpoint of the link. | ||||
| Another important aspect to take into consideration is when to read | ||||
| the counters: in order to count the exact number of packets of a | ||||
| block, the routers must perform this operation when that block has | ||||
| ended; in other words, the counter for color A must be read when the | ||||
| current block has color B, in order to be sure that the value of the | ||||
| counter is stable. This task can be accomplished in two ways. The | ||||
| general approach suggests reading the counters periodically, many | ||||
| times during a block duration, and comparing these successive | ||||
| readings: when the counter stops incrementing, it means that the | ||||
| current block has ended, and its value can be elaborated safely. | ||||
| Alternatively, if the coloring operation is performed on the basis of | ||||
| a fixed timer, it is possible to configure the reading of the | ||||
| counters according to that timer: for example, reading the counter | ||||
| for color A every period in the middle of the subsequent block with | ||||
| color B is a safe choice. A sufficient margin should be considered | ||||
| between the end of a block and the reading of the counter, in order | ||||
| to take into account any out-of-order packets. | ||||
| 3.1.3. Collecting Data and Calculating Packet Loss | ||||
| The nodes enabled to perform performance monitoring collect the value | ||||
| of the counters, but they are not able to directly use this | ||||
| information to measure packet loss, because they only have their own | ||||
| samples. For this reason, an external Network Management System | ||||
| (NMS) can be used to collect and elaborate data and to perform packet | ||||
| loss calculation. The NMS compares the values of counters from | ||||
| different nodes and can calculate if some packets were lost (even a | ||||
| single packet) and where those packets were lost. | ||||
| The value of the counters needs to be transmitted to the NMS as soon | ||||
| as it has been read. This can be accomplished by using SNMP or FTP | ||||
| and can be done in Push Mode or Polling Mode. In the first case, | ||||
| each router periodically sends the information to the NMS; in the | ||||
| latter case, it is the NMS that periodically polls routers to collect | ||||
| information. In any case, the NMS has to collect all the relevant | ||||
| values from all the routers within one cycle of the timer. | ||||
| It would also be possible to use a protocol to exchange values of | ||||
| counters between the two endpoints in order to let them perform the | ||||
| packet loss calculation for each traffic direction. | ||||
| 3.2. Timing Aspects | ||||
| This document introduces two color-switching methods: one is based on | ||||
| a fixed number of packets, and the other is based on a fixed timer. | ||||
| But the method based on a fixed timer is preferable because it is | ||||
| more deterministic, and it is considered in the document. | ||||
| In general, clocks in network devices are not accurate and for this | ||||
| reason, there is a clock error between the measurement points R1 and | ||||
| R2. But, to implement the methodology, they must be synchronized to | ||||
| the same clock reference with an accuracy of +/- L/2 time units, | ||||
| where L is the fixed time duration of the block. So each colored | ||||
| packet can be assigned to the right batch by each router. This is | ||||
| because the minimum time distance between two packets of the same | ||||
| color but that belong to different batches is L time units. | ||||
| In practice, in addition to clock errors, the delay between | ||||
| measurement points also affects the implementation of the methodology | ||||
| because each packet can be delayed differently, and this can produce | ||||
| out of order at batch boundaries. This means that, without | ||||
| considering clock error, we wait L/2 after color switching to be sure | ||||
| to take a still counter. | ||||
| In summary, we need to take into account two contributions: clock | ||||
| error between network devices and the interval we need to wait to | ||||
| avoid packets being out of order because of network delay. | ||||
| The following figure explains both issues. | ||||
| ...BBBBBBBBB | AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA | BBBBBBBBB... | ||||
| |<======================================>| | ||||
| | L | | ||||
| ...=========>|<==================><==================>|<==========... | ||||
| | L/2 L/2 | | ||||
| |<===>| |<===>| | ||||
| d | | d | ||||
| |<==========================>| | ||||
| available counting interval | ||||
| Figure 5: Timing Aspects | ||||
| It is assumed that all network devices are synchronized to a common | ||||
| reference time with an accuracy of +/- A/2. Thus, the difference | ||||
| between the clock values of any two network devices is bounded by A. | ||||
| The network delay between the network devices can be represented as a | ||||
| data set and 99.7% of the samples are within 3 standard deviation of | ||||
| the average. | ||||
| The guard band d is given by: | ||||
| d = A + D_avg + 3*D_stddev, | ||||
| where A is the clock accuracy, D_avg is the average value of the | ||||
| network delay between the network devices, and D_stddev is the | ||||
| standard deviation of the delay. | ||||
| The available counting interval is L - 2d that must be > 0. | ||||
| The condition that must be satisfied and is a requirement on the | ||||
| synchronization accuracy is: | ||||
| d < L/2. | ||||
| 3.3. One-Way Delay Measurement | ||||
| The same principle used to measure packet loss can be applied also to | The same principle used to measure packet loss can be applied also to | |||
| one-way delay measurement. There are three alternatives, as | one-way delay measurement. There are three alternatives, as | |||
| described hereinafter. | described hereinafter. | |||
| Note that, for all the one-way delay alternatives described in the | Note that, for all the one-way delay alternatives described in the | |||
| next sections, by summing the one-way delays of the two directions of | next sections, by summing the one-way delays of the two directions of | |||
| a path, it is always possible to measure the two-way delay (round- | a path, it is always possible to measure the two-way delay (round- | |||
| trip "virtual" delay). | trip "virtual" delay). | |||
| 3.3.1. Single-Marking Methodology | 3.2.1. Single-Marking Methodology | |||
| The alternation of colors can be used as a time reference to | The alternation of colors can be used as a time reference to | |||
| calculate the delay. Whenever the color changes (which means that a | calculate the delay. Whenever the color changes (which means that a | |||
| new block has started), a network device can store the timestamp of | new block has started), a network device can store the timestamp of | |||
| the first packet of the new block; that timestamp can be compared | the first packet of the new block; that timestamp can be compared | |||
| with the timestamp of the same packet on a second router to compute | with the timestamp of the same packet on a second router to compute | |||
| packet delay. When looking at Figure 2, R1 stores the timestamp | packet delay. When looking at Figure 2, R1 stores the timestamp | |||
| TS(A1)R1 when it sends the first packet of block 1 (A-colored), the | TS(A1)R1 when it sends the first packet of block 1 (A-colored), the | |||
| timestamp TS(B2)R1 when it sends the first packet of block 2 | timestamp TS(B2)R1 when it sends the first packet of block 2 | |||
| (B-colored), and so on for every other block. R2 performs the same | (B-colored), and so on for every other block. R2 performs the same | |||
| operation on the receiving side, recording TS(A1)R2, TS(B2)R2, and so | operation on the receiving side, recording TS(A1)R2, TS(B2)R2, and so | |||
| on. Since the timestamps refer to specific packets (the first packet | on. Since the timestamps refer to specific packets (the first packet | |||
| of each block), we are sure that timestamps compared to compute delay | of each block), we are sure that timestamps compared to compute delay | |||
| refer to the same packets. By comparing TS(A1)R1 with TS(A1)R2 (and | refer to the same packets. By comparing TS(A1)R1 with TS(A1)R2 (and | |||
| similarly TS(B2)R1 with TS(B2)R2, and so on), it is possible to | similarly TS(B2)R1 with TS(B2)R2, and so on), it is possible to | |||
| measure the delay between R1 and R2. In order to have more | measure the delay between R1 and R2. In order to have more | |||
| measurements, it is possible to take and store more timestamps, | measurements, it is possible to take and store more timestamps, | |||
| referring to other packets within each block. | referring to other packets within each block. The number of | |||
| measurements could be increased by considering multiple packets in | ||||
| the block: for instance, a timestamp could be taken every N packets, | ||||
| thus generating multiple delay measurements. Taking this to the | ||||
| limit, in principle, the delay could be measured for each packet by | ||||
| taking and comparing the corresponding timestamps (possible but | ||||
| impractical from an implementation point of view). | ||||
| In order to coherently compare timestamps collected on different | In order to coherently compare timestamps collected on different | |||
| routers, the clocks on the network nodes must be in sync. | routers, the clocks on the network nodes must be in sync. | |||
| Furthermore, a measurement is valid only if no packet loss occurs and | Furthermore, a measurement is valid only if no packet loss occurs and | |||
| if packet misordering can be avoided; otherwise, the first packet of | if packet misordering can be avoided; otherwise, the first packet of | |||
| a block on R1 could be different from the first packet of the same | a block on R1 could be different from the first packet of the same | |||
| block on R2 (for instance, if that packet is lost between R1 and R2 | block on R2 (for instance, if that packet is lost between R1 and R2 | |||
| or it arrives after the next one). Since packet misordering is | or it arrives after the next one). Since packet misordering is | |||
| generally undetectable, it is not possible to check whether the first | generally undetectable, it is not possible to check whether the first | |||
| packet on R1 is the same on R2, and this is part of the intrinsic | packet on R1 is the same on R2, and this is part of the intrinsic | |||
| error in this measurement. | error in this measurement. | |||
| The following table shows how timestamps can be used to calculate the | 3.2.1.1. Mean Delay | |||
| delay between R1 and R2. The first column lists the sequence of | ||||
| blocks, while other columns contain the timestamp referring to the | ||||
| first packet of each block on R1 and R2. The delay is computed as a | ||||
| difference between timestamps. For the sake of simplicity, all the | ||||
| values are expressed in milliseconds. | ||||
| +-------+---------+---------+---------+---------+-------------+ | ||||
| | Block | TS(A)R1 | TS(B)R1 | TS(A)R2 | TS(B)R2 | Delay R1-R2 | | ||||
| +-------+---------+---------+---------+---------+-------------+ | ||||
| | 1 | 12.483 | - | 15.591 | - | 3.108 | | ||||
| | 2 | - | 6.263 | - | 9.288 | 3.025 | | ||||
| | 3 | 27.556 | - | 30.512 | - | 2.956 | | ||||
| | 4 | - | 18.113 | - | 21.269 | 3.156 | | ||||
| | ... | ... | ... | ... | ... | ... | | ||||
| | 2n | 77.463 | - | 80.501 | - | 3.038 | | ||||
| | 2n+1 | - | 24.333 | - | 27.433 | 3.100 | | ||||
| +-------+---------+---------+---------+---------+-------------+ | ||||
| Figure 6: Evaluation of Timestamps for Delay Measurements | ||||
| The first row shows timestamps taken on R1 and R2, respectively, and | ||||
| refers to the first packet of block 1 (which is A-colored). Delay | ||||
| can be computed as a difference between the timestamp on R2 and the | ||||
| timestamp on R1. Similarly, the second row shows timestamps (in | ||||
| milliseconds) taken on R1 and R2 and refers to the first packet of | ||||
| block 2 (which is B-colored). By comparing timestamps taken on | ||||
| different nodes in the network and referring to the same packets | ||||
| (identified using the alternation of colors), it is possible to | ||||
| measure delay on different network segments. | ||||
| For the sake of simplicity, in the above example, a single | ||||
| measurement is provided within a block, taking into account only the | ||||
| first packet of each block. The number of measurements can be easily | ||||
| increased by considering multiple packets in the block: for instance, | ||||
| a timestamp could be taken every N packets, thus generating multiple | ||||
| delay measurements. Taking this to the limit, in principle, the | ||||
| delay could be measured for each packet by taking and comparing the | ||||
| corresponding timestamps (possible but impractical from an | ||||
| implementation point of view). | ||||
| 3.3.1.1. Mean Delay | ||||
| As mentioned before, the method previously described for measuring the | As mentioned before, the method previously described for measuring the | |||
| delay is sensitive to out-of-order reception of packets. In order to | delay is sensitive to out-of-order reception of packets. In order to | |||
| overcome this problem, a different approach has been considered: it | overcome this problem, a different approach has been considered: it | |||
| is based on the concept of mean delay. The mean delay is calculated | is based on the concept of mean delay. The mean delay is calculated | |||
| by considering the average arrival time of the packets within a | by considering the average arrival time of the packets within a | |||
| single block. The network device locally stores a timestamp for each | single block. The network device locally stores a timestamp for each | |||
| packet received within a single block: summing all the timestamps and | packet received within a single block: summing all the timestamps and | |||
| dividing by the total number of packets received, the average arrival | dividing by the total number of packets received, the average arrival | |||
| time for that block of packets can be calculated. By subtracting the | time for that block of packets can be calculated. By subtracting the | |||
| skipping to change at page 16, line 15 ¶ | skipping to change at page 10, line 33 ¶ | |||
| out-of-order packets and also to packet loss (only a small error is | out-of-order packets and also to packet loss (only a small error is | |||
| introduced). Moreover, it greatly reduces the number of timestamps | introduced). Moreover, it greatly reduces the number of timestamps | |||
| (only one per block for each network device) that have to be | (only one per block for each network device) that have to be | |||
| collected by the management system. On the other hand, it only gives | collected by the management system. On the other hand, it only gives | |||
| one measure for the duration of the block, and it doesn't give the | one measure for the duration of the block, and it doesn't give the | |||
| minimum, maximum, and median delay values [RFC6703]. This limitation | minimum, maximum, and median delay values [RFC6703]. This limitation | |||
| could be overcome by reducing the duration of the block (for | could be overcome by reducing the duration of the block (for | |||
| instance, from 5 minutes to a few seconds), which implies a highly | instance, from 5 minutes to a few seconds), which implies a highly | |||
| optimized implementation of the method. | optimized implementation of the method. | |||
| 3.3.2. Double-Marking Methodology | 3.2.2. Double-Marking Methodology | |||
| The Single-Marking methodology for one-way delay measurement is | The Single-Marking methodology for one-way delay measurement is | |||
| sensitive to out-of-order reception of packets. The first approach | sensitive to out-of-order reception of packets. The first approach | |||
| to overcome this problem has been described before and is based on | to overcome this problem has been described before and is based on | |||
| the concept of mean delay. But the limitation of mean delay is that | the concept of mean delay. But the limitation of mean delay is that | |||
| it doesn't give information about the delay value's distribution for | it doesn't give information about the delay value's distribution for | |||
| the duration of the block. Additionally, it may be useful to have | the duration of the block. Additionally, it may be useful to have | |||
| not only the mean delay but also the minimum, maximum, and median | not only the mean delay but also the minimum, maximum, and median | |||
| delay values and, in wider terms, to know more about the statistical | delay values and, in wider terms, to know more about the statistical | |||
| distribution of delay values. So, in order to have more information | distribution of delay values. So, in order to have more information | |||
| skipping to change at page 17, line 22 ¶ | skipping to change at page 11, line 40 ¶ | |||
| Double-Marking Method, where a subset of batch packets is selected | Double-Marking Method, where a subset of batch packets is selected | |||
| for extensive delay calculation by using a second marking. In this | for extensive delay calculation by using a second marking. In this | |||
| way, it is possible to perform a detailed analysis on these double- | way, it is possible to perform a detailed analysis on these double- | |||
| marked packets. Please note that there are classic algorithms for | marked packets. Please note that there are classic algorithms for | |||
| median and variance calculation, but they are out of the scope of | median and variance calculation, but they are out of the scope of | |||
| this document. The comparison between the mean delay for the entire | this document. The comparison between the mean delay for the entire | |||
| batch and the mean delay on these double-marked packets gives useful | batch and the mean delay on these double-marked packets gives useful | |||
| information since it is possible to understand if the Double-Marking | information since it is possible to understand if the Double-Marking | |||
| measurements are actually representative of the delay trends. | measurements are actually representative of the delay trends. | |||
| 3.4. Delay Variation Measurement | 3.3. Delay Variation Measurement | |||
| Similar to one-way delay measurement (both for Single Marking and | Similar to one-way delay measurement (both for Single Marking and | |||
| Double Marking), the method can also be used to measure the inter- | Double Marking), the method can also be used to measure the inter- | |||
| arrival jitter. We refer to the definition in RFC 3393 [RFC3393]. | arrival jitter. We refer to the definition in RFC 3393 [RFC3393]. | |||
| The alternation of colors, for a Single-Marking Method, can be used | The alternation of colors, for a Single-Marking Method, can be used | |||
| as a time reference to measure delay variations. In case of Double | as a time reference to measure delay variations. In case of Double | |||
| Marking, the time reference is given by the second-marked packets. | Marking, the time reference is given by the second-marked packets. | |||
| Considering the example depicted in Figure 2, R1 stores the timestamp | Considering the example depicted in Figure 2, R1 stores the timestamp | |||
| TS(A)R1 whenever it sends the first packet of a block, and R2 stores | TS(A)R1 whenever it sends the first packet of a block, and R2 stores | |||
| the timestamp TS(B)R2 whenever it receives the first packet of a | the timestamp TS(B)R2 whenever it receives the first packet of a | |||
| block. The inter-arrival jitter can be easily derived from one-way | block. The inter-arrival jitter can be easily derived from one-way | |||
| delay measurement, by evaluating the delay variation of consecutive | delay measurement, by evaluating the delay variation of consecutive | |||
| samples. | samples. | |||
| The concept of mean delay can also be applied to delay variation, by | The concept of mean delay can also be applied to delay variation, by | |||
| evaluating the average variation of the interval between consecutive | evaluating the average variation of the interval between consecutive | |||
| packets of the flow from R1 to R2. | packets of the flow from R1 to R2. | |||
| 4. Considerations | 4. Alternate Marking Functions | |||
| This section highlights some considerations about the methodology. | 4.1. Marking the Packets | |||
| 4.1. Synchronization | The coloring operation is fundamental in order to create packet | |||
| blocks and marked packets. This implies choosing where to activate | ||||
| the coloring and how to color the packets. | ||||
| The Alternate-Marking technique does not require a strong | In case of flow-based measurements, the flow to monitor can be | |||
| synchronization, especially for packet loss and two-way delay | defined by a set of selection rules (e.g., header fields) used to | |||
| measurement. Only one-way delay measurement requires network devices | match a subset of the packets; in this way, it is possible to control | |||
| to have synchronized clocks. | the number of involved nodes, the path followed by the packets, and | |||
| the size of the flows. It is possible, in general, to have multiple | ||||
| coloring nodes or a single coloring node that is easier to manage and | ||||
| doesn't raise any risk of conflict. Coloring in multiple nodes can | ||||
| be done, and the requirement is that the coloring must change | ||||
| periodically between the nodes according to the timing considerations | ||||
| in Section 5; so every node that is designated as a measurement point | ||||
| along the path should be able to identify unambiguously the colored | ||||
| packets. Furthermore, [I-D.fioccola-rfc8889bis] generalizes the | ||||
| coloring for multipoint-to-multipoint flow. In addition, it can be | ||||
| advantageous to color the flow as close as possible to the source | ||||
| because it allows an end-to-end measure if a measurement point is | ||||
| enabled on the last-hop router as well. | ||||
| Color switching is the reference for all the network devices, and the | For link-based measurements, all traffic needs to be colored when | |||
| only requirement to be achieved is that all network devices have to | transmitted on the link. If the traffic had already been colored, | |||
| recognize the right batch along the path. | then it has to be re-colored because the color must be consistent on | |||
| the link. This means that each hop along the path must (re-)color | ||||
| the traffic; the color is not required to be consistent along | ||||
| different links. | ||||
| Section 3.2 specifies the level of synchronization accuracy so that | Traffic coloring can be implemented by setting a specific flag bit in the | |||
| all network devices consistently match the color bit to the correct | packet header and changing the value of that bit periodically. How | |||
| block. | to choose the marking field depends on the application and is out of | |||
| scope here. | ||||
| This synchronization requirement can be satisfied even with a | 4.2. Counting and Timestamping Packets | |||
| relatively inaccurate synchronization method. This is true for | ||||
| packet loss and two-way delay measurement, but not for one-way delay | ||||
| measurement, where clock synchronization must be accurate. | ||||
| Therefore, a system that uses only packet loss and two-way delay | For flow-based measurements, assuming that the coloring of the | |||
| measurement does not require synchronization. This is because the | packets is performed only by the source nodes, the nodes between | |||
| value of the clocks of network devices does not affect the | source and destination (included) have to count and timestamp the | |||
| computation of the two-way delay measurement. | colored packets that they receive and forward: this operation can be | |||
| enabled on every router along the path or only on a subset, depending | ||||
| on which network segment is being monitored (a single link, a | ||||
| particular metro area, the backbone, or the whole path). Since the | ||||
| color switches periodically between two values, two counters (one for | ||||
| each value) are needed: one counter for packets with color A and one | ||||
| counter for packets with color B. For each flow (or group of flows) | ||||
| being monitored and for every interface where the monitoring is | ||||
| active, two counters are needed. For example, in order to separately | ||||
| monitor three flows on a router with four interfaces involved, 24 | ||||
| counters are needed (two counters for each of the three flows on each | ||||
| of the four interfaces). The number of timestamps to be stored | ||||
| depends on the method for delay measurement that is applied. | ||||
| Furthermore, [I-D.fioccola-rfc8889bis] generalizes the counting for | ||||
| multipoint-to-multipoint flow. | ||||
| 4.2. Data Correlation | In case of link-based measurements, the behavior is similar except | |||
| that coloring, counting and timestamping operations are performed on | ||||
| a link-by-link basis at each endpoint of the link. | ||||
| Data correlation is the mechanism to compare counters and timestamps | Another important aspect to take into consideration is when to read | |||
| for packet loss, delay, and delay variation calculation. It could be | the counters: in order to count the exact number of packets of a | |||
| performed in several ways depending on the Alternate-Marking | block, the routers must perform this operation when that block has | |||
| application and use case. Some possibilities are to: | ended; in other words, the counter for color A must be read when the | |||
| current block has color B, in order to be sure that the value of the | ||||
| counter is stable. This task can be accomplished in two ways. The | ||||
| general approach suggests reading the counters periodically, many | ||||
| times during a block duration, and comparing these successive | ||||
| readings: when the counter stops incrementing, it means that the | ||||
| current block has ended, and its value can be processed safely. | ||||
| Alternatively, if the coloring operation is performed on the basis of | ||||
| a fixed timer, it is possible to configure the reading of the | ||||
| counters according to that timer: for example, reading the counter | ||||
| for color A every period in the middle of the subsequent block with | ||||
| color B is a safe choice. A sufficient margin should be considered | ||||
| between the end of a block and the reading of the counter, in order | ||||
| to take into account any out-of-order packets. Regarding the | ||||
| selection of the packet to be double-marked for delay measurement, | ||||
| the same considerations for packet loss measurement apply also here | ||||
| and it is reasonable to choose the double-marked packet in the middle | ||||
| of the block. The timing aspects are further described in Section 5. | ||||
| o use a centralized solution using NMS to correlate data; and | 4.3. Data Collection and Correlation | |||
| o define a protocol-based distributed solution by introducing a new | The nodes enabled to perform performance monitoring collect the value | |||
| protocol or by extending the existing protocols (e.g., see RFC | of the counters and timestamps, but they are not able to directly use | |||
| 6374 [RFC6374] or the Two-Way Active Measurement Protocol (TWAMP) | this information to measure packet loss and delay, because they only | |||
| have their own samples. | ||||
| Data collection enables the transmission of the counters and | ||||
| timestamps as soon as they have been read, while data correlation is | ||||
| the mechanism to compare counters and timestamps for packet loss, | ||||
| delay, and delay variation calculation. | ||||
| There are two main possibilities to perform both data collection and | ||||
| correlation depending on the Alternate-Marking application and use | ||||
| case: | ||||
| o Use of a centralized solution using Network Management System | ||||
| (NMS) to correlate data. This can be done in Push Mode or Polling | ||||
| Mode. In the first case, each router periodically sends the | ||||
| information to the NMS; in the latter case, it is the NMS that | ||||
| periodically polls routers to collect information. In any case, | ||||
| the NMS has to collect all the relevant values from all the | ||||
| routers within one cycle of the timer. | ||||
| o Definition of a protocol-based distributed solution to exchange | ||||
| values of counters and timestamps between the endpoints. This can | ||||
| be done by introducing a new protocol or by extending the existing | ||||
| protocols (e.g., the Two-Way Active Measurement Protocol (TWAMP) | ||||
| as defined in RFC 5357 [RFC5357] or the One-Way Active Measurement | as defined in RFC 5357 [RFC5357] or the One-Way Active Measurement | |||
| Protocol (OWAMP) as defined in RFC 4656 [RFC4656]) in order to | Protocol (OWAMP) as defined in RFC 4656 [RFC4656]) in order to | |||
| communicate the counters and timestamps between nodes. | communicate the counters and timestamps between nodes. | |||
| In the following paragraphs, an example data correlation mechanism is | In the following paragraphs, an example data correlation mechanism is | |||
| explained and could be used independently of the adopted solutions. | explained and could be used independently of the adopted solutions. | |||
| When data is collected on the upstream and downstream nodes, e.g., | When data is collected on the upstream and downstream nodes, e.g., | |||
| packet counts for packet loss measurement or timestamps for packet | packet counts for packet loss measurement or timestamps for packet | |||
| delay measurement, and is periodically reported to or pulled by other | delay measurement, and is periodically reported to or pulled by other | |||
| skipping to change at page 19, line 20 ¶ | skipping to change at page 15, line 18 ¶ | |||
| When the nodes or NMS see, for example, the same BNs associated with | When the nodes or NMS see, for example, the same BNs associated with | |||
| two packet counts from an upstream and a downstream node, | two packet counts from an upstream and a downstream node, | |||
| respectively, it considers that these two packet counts correspond to | respectively, it considers that these two packet counts correspond to | |||
| the same block, i.e., these two packet counts belong to the same | the same block, i.e., these two packet counts belong to the same | |||
| block of markers from the upstream and downstream nodes. The | block of markers from the upstream and downstream nodes. The | |||
| assumption of this BN mechanism is that the measurement nodes are | assumption of this BN mechanism is that the measurement nodes are | |||
| time synchronized. This requires the measurement nodes to have a | time synchronized. This requires the measurement nodes to have a | |||
| certain time synchronization capability (e.g., the Network Time | certain time synchronization capability (e.g., the Network Time | |||
| Protocol (NTP) [RFC5905] or the IEEE 1588 Precision Time Protocol | Protocol (NTP) [RFC5905] or the IEEE 1588 Precision Time Protocol | |||
| (PTP) [IEEE-1588]). Synchronization aspects are further discussed in | (PTP) [IEEE-1588]). | |||
| Section 3.2. | ||||
| 4.3. Packet Reordering | 5. Synchronization and Timing | |||
| Due to ECMP, packet reordering is very common in an IP network. The | This document introduces two color-switching methods: one is based on | |||
| accuracy of a marking-based PM, especially packet loss measurement, | a fixed number of packets, and the other is based on a fixed timer. | |||
| may be affected by packet reordering. Take a look at the following | But the method based on a fixed timer is preferable because it is | |||
| example: | more deterministic, and it is the one considered in this document. | |||
| Block : 1 | 2 | 3 | 4 | 5 |... | Color switching is the reference for all the network devices, and the | |||
| --------|---------|---------|---------|---------|---------|--- | only requirement to be achieved is that all network devices have to | |||
| Node R1 : AAAAAAA | BBBBBBB | AAAAAAA | BBBBBBB | AAAAAAA |... | recognize the right batch along the path. | |||
| Node R2 : AAAAABB | AABBBBA | AAABAAA | BBBBBBA | ABAAABA |... | ||||
| Figure 7: Packet Reordering | In general, clocks in network devices are not accurate and for this | |||
| reason, there is a clock error between the measurement points R1 and | ||||
| R2. But, to implement the methodology, they must be synchronized to | ||||
| the same clock reference with an accuracy of +/- L/2 time units, | ||||
| where L is the fixed time duration of the block. So each colored | ||||
| packet can be assigned to the right batch by each router. This is | ||||
| because the minimum time distance between two packets of the same | ||||
| color but that belong to different batches is L time units. This | ||||
| level of accuracy guarantees that all network devices consistently | ||||
| match the marking bit to the correct block. | ||||
| In Figure 7, the packet stream for Node R1 isn't being reordered and | If the value of L is not too small, this synchronization requirement | |||
| can be safely assigned to interval blocks, but the packet stream for | could be satisfied even with a relatively inaccurate synchronization | |||
| Node R2 is being reordered; so, looking at the packet with the marker | method. This is true for packet loss and two-way delay measurement, | |||
| of "B" in block 3, there is no safe way to tell whether the packet | but not for one-way delay measurement, where clock synchronization | |||
| belongs to block 2 or block 4. | must be accurate. Therefore, a system that uses only packet loss and | |||
| two-way delay measurement does not require a very precise | ||||
| synchronization. This is because the value of the clocks of network | ||||
| devices does not affect the computation of the two-way delay | ||||
| measurement. | ||||
| In general, there is the need to assign packets with the marker of | But, in practice, besides clock errors, packet reordering is also | |||
| "B" or "A" to the right interval blocks. Most of the packet | very common in a packet network due to equal-cost multipath (ECMP). | |||
| reordering occurs at the edge of adjacent blocks, and they are easy | In particular, the delay between measurement points is the main cause | |||
| to handle if the interval of each block is sufficiently large. Then, | of out-of-order packets because each packet can be delayed differently. For | |||
| it can be assumed that the packets with different markers belong to | this reason, the accuracy of the Alternate-Marking Method, especially | |||
| the block that they are closer to. If the interval is small, it is | for packet loss measurement, is affected by packet reordering. | |||
| difficult and sometimes impossible to determine to which block a | ||||
| packet belongs. | ||||
| Section 3.2 provides a guidance on how to choose a proper interval | If the time duration L of each block is too small, it may be | |||
| and mitigate packet reordering issues. | difficult to determine to which block the reordered packets belong. | |||
| However, if the value of L is sufficiently large, packet reordering | ||||
| occurs only at the edge of adjacent blocks and it becomes easy to | ||||
| assign reordered packets to the right interval blocks. This means | ||||
| that, without considering clock error, we can wait L/2 after color | ||||
| switching to be sure to read a stable counter and mitigate the | ||||
| reordering issues. | ||||
| 4.4. Packet Fragmentation | In summary, we need to take into account two contributions: clock | |||
| error between network devices and the interval we need to wait to | ||||
| avoid packets being out of order because of network delay. | ||||
| The following figure explains both issues. | ||||
| ...BBBBBBBBB | AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA | BBBBBBBBB... | ||||
| |<======================================>| | ||||
| | L | | ||||
| ...=========>|<==================><==================>|<==========... | ||||
| | L/2 L/2 | | ||||
| |<===>| |<===>| | ||||
| d | | d | ||||
| |<==========================>| | ||||
| available counting interval | ||||
| Figure 4: Timing Aspects | ||||
| It is assumed that all network devices are synchronized to a common | ||||
| reference time with an accuracy of +/- A/2. Thus, the difference | ||||
| between the clock values of any two network devices is bounded by A. | ||||
| The network delay between the network devices can be represented as a | ||||
| data set and 99.7% of the samples are within 3 standard deviations of | ||||
| the average. | ||||
| The guard band d is given by: | ||||
| d = A + D_avg + 3*D_stddev, | ||||
| where A is the clock accuracy, D_avg is the average value of the | ||||
| network delay between the network devices, and D_stddev is the | ||||
| standard deviation of the delay. | ||||
| The available counting interval is L - 2d, which must be > 0. | ||||
| The condition that must be satisfied and is a requirement on the | ||||
| synchronization accuracy is: | ||||
| d < L/2. | ||||
| 6. Packet Fragmentation | ||||
| Fragmentation can be managed with the Alternate-Marking Method and in | Fragmentation can be managed with the Alternate-Marking Method and in | |||
| particular it is possible to give the following guidance: | particular it is possible to give the following guidance: | |||
| Marking nodes MUST mark all fragments if there are flag bits to | Marking nodes MUST mark all fragments if there are flag bits to | |||
| use (i.e., it is in the specific encapsulation), as if they were | use (i.e., it is in the specific encapsulation), as if they were | |||
| separate packets. | separate packets. | |||
| Nodes that fragment packets within the measurement domain SHOULD, | Nodes that fragment packets within the measurement domain SHOULD, | |||
| if they have the capability to do so, ensure that only one | if they have the capability to do so, ensure that only one | |||
| skipping to change at page 20, line 37 ¶ | skipping to change at page 17, line 44 ¶ | |||
| (a) other fragments are also marked, or (b) it observes all other | (a) other fragments are also marked, or (b) it observes all other | |||
| fragments and they are unmarked. | fragments and they are unmarked. | |||
| The proposed approach allows the marking node to mark all the | The proposed approach allows the marking node to mark all the | |||
| fragments except in the case of fragmentation within the network | fragments except in the case of fragmentation within the network | |||
| domain; in that event, it is suggested to mark only the first | domain; in that event, it is suggested to mark only the first | |||
| fragment. In addition it could be possible to take the counters | fragment. In addition it could be possible to take the counters | |||
| properly in order to keep track of both marked and unmarked | properly in order to keep track of both marked and unmarked | |||
| fragments. | fragments. | |||
| 5. Results of the Alternate Marking Experiment | 7. Results of the Alternate Marking Experiment | |||
| The methodology described in the previous sections can be applied to | The methodology described in the previous sections can be applied to | |||
| various performance measurement problems, as explained in [RFC8321]. | various performance measurement problems, as explained in [RFC8321]. | |||
| The only requirement is to select and mark the flow to be monitored; | The only requirement is to select and mark the flow to be monitored; | |||
| in this way, packets are batched by the sender, and each batch is | in this way, packets are batched by the sender, and each batch is | |||
| alternately marked such that it can be easily recognized by the | alternately marked such that it can be easily recognized by the | |||
| receiver. | receiver. | |||
| Either one or two flag bits might be available for marking in | Either one or two flag bits might be available for marking in | |||
| different deployments: | different deployments: | |||
| One flag: packet loss measurement SHOULD be done as described in | One flag: packet loss measurement SHOULD be done as described in | |||
| Section 3.1, while delay measurement MAY be done according to the | Section 3.1, while delay measurement MAY be done according to the | |||
| single-marking method described in Section 3.3.1. Mean delay | single-marking method described in Section 3.2.1. Mean delay | |||
| (Section 3.3.1.1) is NOT RECOMMENDED since it implies more | (Section 3.2.1.1) is NOT RECOMMENDED since it implies more | |||
| computational load. | computational load. | |||
| Two flags: packet loss measurement SHOULD be done as described in | Two flags: packet loss measurement SHOULD be done as described in | |||
| Section 3.1, while delay measurement SHOULD be done according to | Section 3.1, while delay measurement SHOULD be done according to | |||
| double-marking method Section 3.3.2. In this case single-marking | double-marking method Section 3.2.2. In this case single-marking | |||
| MAY also be used in combination with double-marking and the two | MAY also be used in combination with double-marking and the two | |||
| approaches provide slightly different pieces of information that | approaches provide slightly different pieces of information that | |||
| can be combined to have a more robust data set. | can be combined to have a more robust data set. | |||
| The experiment with Alternate Marking methodologies confirmed the | The experiment with Alternate Marking methodologies confirmed the | |||
| following benefits: | following benefits: | |||
| o easy implementation: it can be implemented by using features | o easy implementation: it can be implemented by using features | |||
| already available on major routing platforms, or by applying an | already available on major routing platforms, or by applying an | |||
| optimized implementation of the method for both legacy and newest | optimized implementation of the method for both legacy and newest | |||
| skipping to change at page 22, line 16 ¶ | skipping to change at page 19, line 22 ¶ | |||
| specific extension is included or not within the packets. | specific extension is included or not within the packets. | |||
| It is worth mentioning some related work: in particular | It is worth mentioning some related work: in particular | |||
| [IEEE-Network-PNPM] explains the Alternate-Marking method together | [IEEE-Network-PNPM] explains the Alternate-Marking method together | |||
| with new mechanisms based on hashing techniques as also further | with new mechanisms based on hashing techniques as also further | |||
| described in [I-D.mizrahi-ippm-marking]; while | described in [I-D.mizrahi-ippm-marking]; while | |||
| [I-D.zhou-ippm-enhanced-alternate-marking] extends the Alternate- | [I-D.zhou-ippm-enhanced-alternate-marking] extends the Alternate- | |||
| Marking Data Fields, to provide enhanced capabilities and allow | Marking Data Fields, to provide enhanced capabilities and allow | |||
| advanced functionalities. | advanced functionalities. | |||
| 5.1. Controlled Domain requirement | 7.1. Controlled Domain requirement | |||
| The Alternate Marking Method is an example of a solution limited to a | The Alternate Marking Method is an example of a solution limited to a | |||
| controlled domain [RFC8799]. | controlled domain [RFC8799]. | |||
| A controlled domain is a managed network that selects, monitors, and | A controlled domain is a managed network that selects, monitors, and | |||
| controls access by enforcing policies at the domain boundaries, in | controls access by enforcing policies at the domain boundaries, in | |||
| order to discard undesired external packets entering the domain and | order to discard undesired external packets entering the domain and | |||
| check internal packets leaving the domain. It does not necessarily | check internal packets leaving the domain. It does not necessarily | |||
| mean that a controlled domain is a single administrative domain or a | mean that a controlled domain is a single administrative domain or a | |||
| single organization. A controlled domain can correspond to a single | single organization. A controlled domain can correspond to a single | |||
| administrative domain or multiple administrative domains under a | administrative domain or multiple administrative domains under a | |||
| defined network management. It must be possible to control the | defined network management. It must be possible to control the | |||
| domain boundaries, and use specific precautions if traffic traverses | domain boundaries, and use specific precautions if traffic traverses | |||
| the Internet. | the Internet. | |||
| For security reasons, the Alternate Marking Method is RECOMMENDED | For security reasons, the Alternate Marking Method is RECOMMENDED | |||
| only for controlled domains. | only for controlled domains. | |||
| 6. Compliance with Guidelines from RFC 6390 | 8. Compliance with Guidelines from RFC 6390 | |||
| RFC 6390 [RFC6390] defines a framework and a process for developing | RFC 6390 [RFC6390] defines a framework and a process for developing | |||
| Performance Metrics for protocols above and below the IP layer (such | Performance Metrics for protocols above and below the IP layer (such | |||
| as IP-based applications that operate over reliable or datagram | as IP-based applications that operate over reliable or datagram | |||
| transport protocols). | transport protocols). | |||
| This document doesn't aim to propose a new Performance Metric but | This document doesn't aim to propose a new Performance Metric but | |||
| rather a new Method of Measurement for a few Performance Metrics that | rather a new Method of Measurement for a few Performance Metrics that | |||
| have already been standardized. Nevertheless, it's worth applying | have already been standardized. Nevertheless, it's worth applying | |||
| guidelines from [RFC6390] to the present document, in order to | guidelines from [RFC6390] to the present document, in order to | |||
| skipping to change at page 23, line 34 ¶ | skipping to change at page 20, line 38 ¶ | |||
| number of packets sent by the source node and not received by the | number of packets sent by the source node and not received by the | |||
| destination node. | destination node. | |||
| o Measurement Point(s) with Potential Measurement Domain: the | o Measurement Point(s) with Potential Measurement Domain: the | |||
| measurement can be performed between adjacent nodes, on a per-link | measurement can be performed between adjacent nodes, on a per-link | |||
| basis, or along a multi-hop path, provided that the traffic under | basis, or along a multi-hop path, provided that the traffic under | |||
| measurement follows that path. In case of a multi-hop path, the | measurement follows that path. In case of a multi-hop path, the | |||
| measurements can be performed both end-to-end and hop-by-hop. | measurements can be performed both end-to-end and hop-by-hop. | |||
| o Measurement Timing: the method has a constraint on the frequency | o Measurement Timing: the method has a constraint on the frequency | |||
| of measurements. This is detailed in Section 3.2, where it is | of measurements. This is detailed in Section 5, where it is | |||
| specified that the marking period and the guard band interval are | specified that the marking period and the guard band interval are | |||
| strictly related to each other to avoid out-of-order issues. That is | strictly related to each other to avoid out-of-order issues. That is | |||
| because, in order to perform a measurement, the counter must be in | because, in order to perform a measurement, the counter must be in | |||
| a steady state, and this happens when the traffic is being colored | a steady state, and this happens when the traffic is being colored | |||
| with the alternate color. | with the alternate color. | |||
| o Implementation: the method uses one or two marking bits to color | o Implementation: the method uses one or two marking bits to color | |||
| the packets; this enables the use of policy configurations on the | the packets; this enables the use of policy configurations on the | |||
| router to color the packets and accordingly configure the counter | router to color the packets and accordingly configure the counter | |||
| for each color. The path followed by traffic being measured | for each color. The path followed by traffic being measured | |||
| skipping to change at page 24, line 27 ¶ | skipping to change at page 21, line 32 ¶ | |||
| o Dependencies: the values of the counters have to be correlated to | o Dependencies: the values of the counters have to be correlated to | |||
| the time interval they refer to. | the time interval they refer to. | |||
| o Organization of Results: the Method of Measurement produces | o Organization of Results: the Method of Measurement produces | |||
| singletons. | singletons. | |||
| o Parameters: currently, the main parameter of the method is the | o Parameters: currently, the main parameter of the method is the | |||
| time interval used to alternate the colors and read the counters. | time interval used to alternate the colors and read the counters. | |||
| 7. IANA Considerations | 9. IANA Considerations | |||
| This document has no IANA actions. | This document has no IANA actions. | |||
| 8. Security Considerations | 10. Security Considerations | |||
| This document specifies a method to perform measurements in the | This document specifies a method to perform measurements in the | |||
| context of a Service Provider's network and has not been developed to | context of a Service Provider's network and has not been developed to | |||
| conduct Internet measurements, so it does not directly affect | conduct Internet measurements, so it does not directly affect | |||
| Internet security or applications that run on the Internet. | Internet security or applications that run on the Internet. | |||
| However, implementation of this method must be mindful of security | However, implementation of this method must be mindful of security | |||
| and privacy concerns. | and privacy concerns. | |||
| There are two types of security concerns: potential harm caused by | There are two types of security concerns: potential harm caused by | |||
| the measurements and potential harm to the measurements. | the measurements and potential harm to the measurements. | |||
| skipping to change at page 26, line 8 ¶ | skipping to change at page 23, line 13 ¶ | |||
| each block, marked by a dedicated color bit. Therefore, a | each block, marked by a dedicated color bit. Therefore, a | |||
| man-in-the-middle attacker can selectively induce synthetic delay | man-in-the-middle attacker can selectively induce synthetic delay | |||
| only to delay-colored packets, causing systematic error in the delay | only to delay-colored packets, causing systematic error in the delay | |||
| measurements. As discussed in previous sections, the methods | measurements. As discussed in previous sections, the methods | |||
| described in this document rely on an underlying time synchronization | described in this document rely on an underlying time synchronization | |||
| protocol. Thus, by attacking the time protocol, an attacker can | protocol. Thus, by attacking the time protocol, an attacker can | |||
| potentially compromise the integrity of the measurement. A detailed | potentially compromise the integrity of the measurement. A detailed | |||
| discussion about the threats against time protocols and how to | discussion about the threats against time protocols and how to | |||
| mitigate them is presented in RFC 7384 [RFC7384]. | mitigate them is presented in RFC 7384 [RFC7384]. | |||
| 9. Contributors | 11. Contributors | |||
| Mach(Guoyi) Chen | Mach(Guoyi) Chen | |||
| Huawei Technologies | Huawei Technologies | |||
| Email: mach.chen@huawei.com | Email: mach.chen@huawei.com | |||
| Alessandro Capello | Alessandro Capello | |||
| Telecom Italia | Telecom Italia | |||
| Email: alessandro.capello@telecomitalia.it | Email: alessandro.capello@telecomitalia.it | |||
| 10. Acknowledgements | 12. Acknowledgements | |||
| The authors would like to thank Alberto Tempia Bonda, Luca | The authors would like to thank Alberto Tempia Bonda, Luca | |||
| Castaldelli and Lianshu Zheng for their contribution to the | Castaldelli and Lianshu Zheng for their contribution to the | |||
| experimentation of the method. | experimentation of the method. | |||
| The authors would also like to thank Martin Duke and Tommy Pauly for their | The authors would also like to thank Martin Duke and Tommy Pauly for their | |||
| assistance and their detailed and valuable reviews. | assistance and their detailed and valuable reviews. | |||
| 11. References | 13. References | |||
| 11.1. Normative References | 13.1. Normative References | |||
| [IEEE-1588] | [IEEE-1588] | |||
| IEEE, "IEEE Standard for a Precision Clock Synchronization | IEEE, "IEEE Standard for a Precision Clock Synchronization | |||
| Protocol for Networked Measurement and Control Systems", | Protocol for Networked Measurement and Control Systems", | |||
| IEEE Std 1588-2008. | IEEE Std 1588-2008. | |||
| [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | |||
| Requirement Levels", BCP 14, RFC 2119, | Requirement Levels", BCP 14, RFC 2119, | |||
| DOI 10.17487/RFC2119, March 1997, | DOI 10.17487/RFC2119, March 1997, | |||
| <https://www.rfc-editor.org/info/rfc2119>. | <https://www.rfc-editor.org/info/rfc2119>. | |||
| [RFC5905] Mills, D., Martin, J., Ed., Burbank, J., and W. Kasch, | [RFC5905] Mills, D., Martin, J., Ed., Burbank, J., and W. Kasch, | |||
| "Network Time Protocol Version 4: Protocol and Algorithms | "Network Time Protocol Version 4: Protocol and Algorithms | |||
| Specification", RFC 5905, DOI 10.17487/RFC5905, June 2010, | Specification", RFC 5905, DOI 10.17487/RFC5905, June 2010, | |||
| <https://www.rfc-editor.org/info/rfc5905>. | <https://www.rfc-editor.org/info/rfc5905>. | |||
| [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC | [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC | |||
| 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, | 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, | |||
| May 2017, <https://www.rfc-editor.org/info/rfc8174>. | May 2017, <https://www.rfc-editor.org/info/rfc8174>. | |||
| 11.2. Informative References | 13.2. Informative References | |||
| [I-D.fioccola-rfc8889bis] | [I-D.fioccola-rfc8889bis] | |||
| Fioccola, G., Cociglio, M., Sapio, A., Sisto, R., and T. | Fioccola, G., Cociglio, M., Sapio, A., Sisto, R., and T. | |||
| Zhou, "Multipoint Alternate-Marking Method", draft- | Zhou, "Multipoint Alternate-Marking Method", draft- | |||
| fioccola-rfc8889bis-01 (work in progress), December 2021. | fioccola-rfc8889bis-02 (work in progress), February 2022. | |||
| [I-D.mizrahi-ippm-marking] | [I-D.mizrahi-ippm-marking] | |||
| Mizrahi, T., Fioccola, G., Cociglio, M., Chen, M., and G. | Mizrahi, T., Fioccola, G., Cociglio, M., Chen, M., and G. | |||
| Mirsky, "Marking Methods for Performance Measurement", | Mirsky, "Marking Methods for Performance Measurement", | |||
| draft-mizrahi-ippm-marking-00 (work in progress), October | draft-mizrahi-ippm-marking-00 (work in progress), October | |||
| 2021. | 2021. | |||
| [I-D.zhou-ippm-enhanced-alternate-marking] | [I-D.zhou-ippm-enhanced-alternate-marking] | |||
| Zhou, T., Fioccola, G., Liu, Y., Lee, S., Cociglio, M., | Zhou, T., Fioccola, G., Liu, Y., Lee, S., Cociglio, M., | |||
| and W. Li, "Enhanced Alternate Marking Method", draft- | and W. Li, "Enhanced Alternate Marking Method", draft- | |||
| skipping to change at page 29, line 44 ¶ | skipping to change at page 26, line 44 ¶ | |||
| o New section on "Packet Fragmentation" | o New section on "Packet Fragmentation" | |||
| Changes in v-(02) include: | Changes in v-(02) include: | |||
| o Considerations on how to handle unmarked traffic in section 5 on | o Considerations on how to handle unmarked traffic in section 5 on | |||
| "Results of the Alternate Marking Experiment" | "Results of the Alternate Marking Experiment" | |||
| o Minor rewording in section 4.4 on "Packet Fragmentation" | o Minor rewording in section 4.4 on "Packet Fragmentation" | |||
| Changes in v-(03) include: | ||||
| o Deleted numeric examples in sections on "Packet Loss Measurement" | ||||
| and on "Single-Marking Methodology" | ||||
| o New section on "Alternate Marking Functions" | ||||
| o Moved sections 3.1.1 on "Coloring the Packets", 3.1.2 on "Counting | ||||
| the Packets" and 3.1.3 on "Collecting Data and Calculating Packet | ||||
| Loss" into the new section on "Alternate Marking Functions" | ||||
| o Renamed sections 4.1 as "Marking the Packets", 4.2 as "Counting | ||||
| and Timestamping Packets" and 4.3 as "Data Collection and | ||||
| Correlation" | ||||
| o Merged old section on "Data Correlation" with section 4.3 on "Data | ||||
| Collection and Correlation" | ||||
| o Moved and renamed section on "Timing Aspects" as "Synchronization | ||||
| and Timing" | ||||
| o Merged old section on "Synchronization" with section on | ||||
| "Synchronization and Timing" | ||||
| o Merged old section on "Packet Reordering" with section on | ||||
| "Synchronization and Timing" | ||||
| Authors' Addresses | Authors' Addresses | |||
| Giuseppe Fioccola (editor) | Giuseppe Fioccola (editor) | |||
| Huawei Technologies | Huawei Technologies | |||
| Riesstrasse, 25 | Riesstrasse, 25 | |||
| Munich 80992 | Munich 80992 | |||
| Germany | Germany | |||
| Email: giuseppe.fioccola@huawei.com | Email: giuseppe.fioccola@huawei.com | |||
| Mauro Cociglio | Mauro Cociglio | |||
| Telecom Italia | Telecom Italia | |||
| Via Reiss Romoli, 274 | Via Reiss Romoli, 274 | |||
| Torino 10148 | Torino 10148 | |||
| Italy | Italy | |||
| Email: mauro.cociglio@telecomitalia.it | Email: mauro.cociglio@telecomitalia.it | |||
| Greg Mirsky | Greg Mirsky | |||
| Ericsson | Ericsson | |||
| End of changes. 49 change blocks. | ||||
| 367 lines changed or deleted | 267 lines changed or added | |||
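
The RFC 6390 compliance items in the diff above (packet loss as packets sent but not received, counters correlated to the time interval they refer to, and the dependency between marking period and guard band) can be illustrated with a minimal sketch. This is not taken from either draft version: the names `BlockCounter`, `period_index`, `is_safe_to_read`, and `packet_loss` are hypothetical, and the rule that a counter is read only when the current time is at least one guard band away from both ends of the marking period is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockCounter:
    """Packet count for one marking period at one measurement point (hypothetical type)."""
    period_index: int  # marking period the counter refers to
    color: int         # marking bit value used during that period (0 or 1)
    packets: int       # packets counted with that color in that period


def period_index(time_s: float, marking_period_s: float) -> int:
    """Map a time to the index of the marking period it falls in."""
    return int(time_s // marking_period_s)


def is_safe_to_read(now_s: float, marking_period_s: float, guard_band_s: float) -> bool:
    """Assumed rule: the counter of the previous color is steady only when
    'now' is at least one guard band away from both period boundaries."""
    if not 0 <= guard_band_s < marking_period_s / 2:
        raise ValueError("guard band must be non-negative and shorter than half the marking period")
    offset = now_s % marking_period_s
    return guard_band_s <= offset <= marking_period_s - guard_band_s


def packet_loss(sent: BlockCounter, received: BlockCounter) -> int:
    """Packets sent by the source node and not received by the destination node,
    computed only for counters that refer to the same block."""
    if (sent.period_index, sent.color) != (received.period_index, received.color):
        raise ValueError("counters refer to different blocks")
    return sent.packets - received.packets
```

For example, with a 10-second marking period and a 1-second guard band (both values are illustrative), a read at t = 12.5 s falls inside the assumed safe window, while reads at t = 10.4 s or t = 19.5 s do not. Raising an error on mismatched blocks reflects the "Dependencies" item: a difference between counters is meaningful only when both refer to the same time interval.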