ippm                                                        H. Song, Ed.
Internet-Draft                                                   T. Zhou
Intended status: Experimental                                     Huawei
Expires: December 29, 2017                                 June 27, 2017


                     On Scalability of In-situ OAM
                  draft-song-ippm-ioam-scalability-01

Abstract

   This document describes several potential scalability issues when
   implementing in-situ OAM based on the current in-situ OAM documents
   and proposes the corresponding solutions and modifications to the
   current in-situ OAM specification.
   Specifically, we extend in-situ OAM to support more standard tracing
   data than is currently defined and add new features to avoid
   limitations on MTU, bandwidth, forwarding path length, and node
   processing capability.  We provide use cases to motivate our
   proposal and base the changes on the current in-situ OAM header
   format specification.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on December 29, 2017.

Copyright Notice

   Copyright (c) 2017 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Motivation for Better iOAM Scalability
     2.1.  Support Data Type Extensions
       2.1.1.  Motivating Use Cases
     2.2.  Cope with Packet Size Limitation
       2.2.1.  Motivating Use Cases
     2.3.  Adapt to Node Processing Capability
       2.3.1.  Motivating Use Cases
   3.  Scalable Data Type Extension
     3.1.  Data Type Bitmap
     3.2.  Scalable Data Type Extension Use Cases
     3.3.  Consideration for Data Packing
     3.4.  Other Data Extension Possibilities
   4.  Segment In-situ OAM
     4.1.  Segment and Hops
     4.2.  Considerations for Data Handling
     4.3.  Segment iOAM Use Cases
   5.  In-situ OAM Sampling and Data Validation
     5.1.  Valid Node Bitmap and Valid Data Bitmap
     5.2.  iOAM Sampling and Data Validation Use Cases
   6.  Security Considerations
   7.  IANA Considerations
   8.  Acknowledgments
   9.  Contributors
   10. Informative References
   Authors' Addresses

1.  Introduction

   In-situ OAM (iOAM) [I-D.brockners-inband-oam-requirements] records
   OAM information within user packets while the packets traverse a
   network.  The data types and data formats for in-situ OAM data
   records have been defined in [I-D.brockners-inband-oam-data].  We
   identify several scalability issues in implementing the current iOAM
   specification and propose solutions in this draft.

2.  Motivation for Better iOAM Scalability

2.1.  Support Data Type Extensions

   Currently 11 data types and associated formats (including wide format
   and short format of the same data) are defined in
   [I-D.brockners-inband-oam-data].  The presence of data is indicated
   by a 16-bit bitmap in the "OAM-Trace-Type" field.

   In the current specification only five bits are left to identify new
   data types.  Moreover, some data is forced to be bundled together as
   a single unit to save bitmap space and pack data to the ideal size
   (e.g., the hop limit and the node id are bundled, and the ingress
   interface id and the egress interface id are bundled), even though
   an application may ask for only part of the data.  Last but not
   least, each data item is forced to be 4-byte aligned for easier
   access, which wastes header space in many cases.

   Since the data plane bandwidth, the data plane packet processing, and
   the management plane data handling are all scarce and precious
   resources, the scheme should strive to be simple and precise.  The
   application should be able to control the exact type and format of
   the data it needs to collect and analyze.  It is conceivable that
   more types of data will be introduced in the future.  However, the
   current scheme cannot support them once all the bits in the bitmap
   are used up.

   Currently, bit 7 is used to indicate the presence of variable length
   opaque state snapshot data.  While this data field can be used to
   store arbitrary data, such data is difficult to standardize and
   another schema is needed to decode it, which may lead to low data
   plane performance.

2.1.1.  Motivating Use Cases

   When a flow traverses a series of middleboxes (e.g., firewall, NAT,
   and load balancer), its identity (e.g., the 5-tuple) is often
   altered, which makes the OAM system lose track of the flow trace.
   In this case, we may want to copy some of the original packet header
   fields into the iOAM header so the original flow can be identified at
   any point of the network.

   In wireless, mobile, and optical network environments, some physical
   data associated with a flow (e.g., power, temperature, signal
   strength, GPS location) needs to be collected to monitor the service
   performance.

   Both cases require new iOAM data types.  More examples are listed in
   Section 3.2.

2.2.  Cope with Packet Size Limitation

   The total size of data is limited by the MTU.  When the number of
   required data types is large and the forwarding path is long, there
   may not be enough space in the iOAM header to save the data.  The
   current proposal is to flag the overflow status and stop adding new
   node data to the packet, which leads to loss of information.

   Even if the header has enough space to hold the iOAM data, the
   overhead may be too large and consume too much bandwidth.  For
   example, assuming a moderate 20 bytes of data per node, a path of 10
   hops needs 200 bytes to hold the data.  This inflates a small 64-byte
   packet to more than four times its original size.  Even for a large
   packet (e.g., 1500 bytes), the overhead (>10%) is not negligible.
   Therefore, we need to limit the iOAM data overhead without
   sacrificing the data collection capability.

   There is another related issue.  Packets can be dropped anywhere in a
   network for various reasons.  If we can only collect iOAM data at the
   path end, we lose all data from the dropped packets and have no idea
   where the packets were dropped.  This defeats the purpose of iOAM and
   wastes the work done by the iOAM-enabled nodes.

2.2.1.  Motivating Use Cases

   Some use cases are described in Section 4.3.

2.3.  Adapt to Node Processing Capability

   iOAM can designate the flows to which the iOAM header is added and
   for which data is collected along the forwarding path.  A flow can
   have arbitrary granularity.  However, processing the data can be a
   heavy burden for the network nodes, especially when some data needs
   to be calculated by the node (e.g., the transit delay).  If the flow
   traffic is heavy, a node may be unable to keep up with the iOAM
   processing, and performance problems such as long latency and packet
   drops may occur.

   Although it is good for OAM applications to obtain detailed
   information on every packet at every node, in many cases such
   information is repetitive and redundant.  The large quantity of data
   would also burden the management plane, which needs to collect and
   stream the data for analytics.  It is also possible that some nodes
   cannot provide the requested data at all or are unwilling to provide
   some data because of security or privacy concerns.  So a trade-off is
   needed to balance the performance impact against data availability
   and completeness.

2.3.1.  Motivating Use Cases

   To minimize the network impact, a network operator decides to collect
   the iOAM data only for the first and last packets of a flow (e.g.,
   TCP packets with SYN, FIN, and RST flags).

   A head node alternates between two iOAM headers, each requesting a
   subset of the iOAM data.  Hence, each node on the flow path only
   needs to handle part of the data for any given packet.  The requests
   can be balanced without exhausting the network nodes.

   A node is temporarily under heavy traffic load.  It is in danger of
   dropping packets if it tries to satisfy all the iOAM data requests.
   In this case, it would rather deny some requests than drop user
   traffic.

   More examples are listed in Section 5.2.

3.  Scalable Data Type Extension

   Based on the observation in Section 2.1, we propose a method for data
   type encoding which can solve the current limitation and address
   future data requirements.

3.1.  Data Type Bitmap

   A bitmap is a simple and efficient data structure for high
   performance data plane implementation.  The base bitmap size is kept
   at 16 bits.  We use one bit to indicate a single type of data in a
   single format.  The last bit in the bitmap (i.e., bit 15), if set,
   indicates the presence of the next data type bitmap, which is 32
   bits long.  In the second bitmap, bit 31 is again reserved to
   indicate a third bitmap, and so on.  With each extra bitmap, 31 more
   data types can be defined.

   Figure 1 shows an example of the in-situ OAM header format with two
   extended OAM trace type fields.  Except for the OAM Trace Type
   fields, all other fields remain the same as defined in
   [I-D.brockners-inband-oam-data].

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Base OAM Trace Type     |1| Length Field  |     Flags     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  Extended OAM Trace Type 1                  |1|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  Extended OAM Trace Type 2                  |0|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   |                       Node Data List []                       |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

             Figure 1: Extended OAM Trace Type Header Format

   The specification of the Base OAM Trace Type is the same as the OAM
   Trace Type in [I-D.brockners-inband-oam-data] except for the last
   bit, which is defined as follows:

   o  Bit 15: When set, indicates the presence of the next bitmap.

   The OAM trace type fields are labeled as Base OAM Trace Type,
   Extended OAM Trace Type 1, Extended OAM Trace Type 2, and so on.
   The Base OAM Trace Type is always present.  If the application
   requests no data type in Extended OAM Trace Type n and beyond, then
   the last bit in the previous bitmap is set to 0 and these extended
   fields are not included in the header.  On the other hand, to
   eliminate ambiguity, if the application requests any data in
   Extended OAM Trace Type n, then Extended OAM Trace Type 1 to (n-1)
   must be included in the header, even if no data type in these
   bitmaps is needed (i.e., an all-zero bitmap except for the last
   bit).

   The actual data in a node is packed together in the same order as
   listed in the OAM Trace Type bitmaps.  Each node's data is padded to
   a multiple of 4 bytes.

3.2.  Scalable Data Type Extension Use Cases

   New types of data may be added and standardized, each of which
   requires a new bit in the OAM Trace Type bitmaps.  Some examples are
   listed here.

   o  Metered flow bandwidth.

   o  Time gap between two consecutive flow packets.

   o  Remaining time budget to the packet delivery deadline.

   o  Buffer occupancy on the Node.

   o  Queue depth on each level of hierarchical QoS queues.

   o  Packet jitter at the Node.

   o  Current packet IP addresses.

   o  Current packet port numbers.

   o  Other node statistics.

3.3.  Consideration for Data Packing

   The length of each data item must be a multiple of 2 bytes.  However,
   allowing different data types to have different lengths, while
   efficient in storage, makes data alignment and packing difficult.

   If we can define the maximum number of data types that can be carried
   per packet, the offset of each data item in the node can be
   pre-calculated and carried in the iOAM header.  The overhead can be
   justified by the overall space saving in the node data list.
   Otherwise, each data item's offset in the node must be calculated in
   each device, with the help of a table which stores the size of each
   data type.
   We can also arrange the bitmap to reflect the order in which data
   becomes available in the system (e.g., the bit for egress_if_id must
   come after the bit for ingress_if_id), so that in a pipeline-based
   system the required data can be packed one item after another.

3.4.  Other Data Extension Possibilities

   A bitmap is simple and supports parallel processing in hardware;
   however, it is not the only option to support data type extension.
   For example, cascaded TLVs can be used to support an arbitrary
   number of new data types.

4.  Segment In-situ OAM

   Based on the observation in Section 2.2, we propose a method to limit
   the size of the node data list.

4.1.  Segment and Hops

   A hop is a node on a flow's forwarding path which is capable of
   processing iOAM data.  A segment is a fixed number of hops on a
   flow's forwarding path.  While working in the "per hop" mode, the
   segment size (SSize) and the remaining hops (RHop) are added to the
   iOAM header at the edge.  Initially, RHop is equal to SSize.  At each
   hop, if RHop is not zero, the node data is added to the node data
   list at the corresponding location and then RHop is decremented by
   1.  If RHop is equal to 0 when the packet is received, the node needs
   to remove (in incremental trace option) or clear (in pre-allocated
   trace option) the iOAM node data list and reset RHop to SSize.  Then
   the node adds its data to the node data list as if it were the edge
   node.

   Figure 2 shows the proposed in-situ OAM header format.  The last bit
   (bit 31) in the Flags field is used to indicate that the current
   header is a segment iOAM header.  In this context, the third byte of
   the first word is partitioned into two 4-bit pieces.  The first piece
   is used to save the segment size and the second piece is used to save
   the remaining hops.  This limits the maximum segment size to 15.
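   The per-hop rule and the SSize/RHop byte layout described in this
   section can be sketched as follows.  This is only an illustration
   under stated assumptions, not part of the proposal: the names
   SegmentHeader, pack_segment_byte, unpack_segment_byte, and
   process_hop are hypothetical, SSize is assumed to occupy the upper
   four bits of the byte, and a node that resets RHop is assumed to
   decrement it again after adding its own data.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class SegmentHeader:
    """Hypothetical in-memory view of the segment iOAM fields."""
    ssize: int                                   # segment size (4-bit field)
    rhop: int                                    # remaining hops (4-bit field)
    node_data: List[bytes] = field(default_factory=list)


def pack_segment_byte(ssize: int, rhop: int) -> int:
    """Pack SSize (assumed upper 4 bits) and RHop (lower 4 bits) into a byte."""
    if not (1 <= ssize <= 15 and 0 <= rhop <= 15):
        raise ValueError("SSize must be 1..15 and RHop 0..15")
    return (ssize << 4) | rhop


def unpack_segment_byte(value: int) -> Tuple[int, int]:
    """Return (SSize, RHop) from the packed byte."""
    return (value >> 4) & 0xF, value & 0xF


def process_hop(hdr: SegmentHeader, data: bytes) -> Optional[List[bytes]]:
    """Apply the per-hop rule at one node.

    Returns the node data list of the finished segment when RHop was 0
    on arrival (so the caller can report it), otherwise None.
    """
    finished = None
    if hdr.rhop == 0:
        finished = list(hdr.node_data)   # copy out before clearing
        hdr.node_data.clear()            # remove or clear the list
        hdr.rhop = hdr.ssize             # start a new segment
    hdr.node_data.append(data)           # add this node's data
    hdr.rhop -= 1
    return finished
```

   With SSize set to 3 and a path of seven hops, this sketch hands back
   two complete segments of three node data items each and leaves the
   seventh node's data in the packet, which matches the remark in
   Section 4.2 that the last segment may be shorter than SSize.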

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Base OAM Trace Type     |0| SSize | RHop  |    Flags    |1|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   |                       Node Data List []                       |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                  Figure 2: Segment iOAM Header Format

4.2.  Considerations for Data Handling

   At any hop where RHop is equal to 0, the node data list is copied
   from the iOAM header.  The data can be encapsulated and reported to
   the controller or the edge node as configured.  The encapsulation
   and report method is beyond the scope of this draft but should
   comply with the method used by the iOAM edge node.

   The actual size of the last segment may not be equal to SSize but
   this is not a problem.

4.3.  Segment iOAM Use Cases

   Segment iOAM is necessary in the following example scenarios:

   o  Segment iOAM can be used to detect in which segment a flow packet
      is dropped.  If SSize is set to 1, then the exact drop node can be
      identified.  The iOAM data collected before the dropping point is
      also retained.

   o  The path MTU allows at most k node data items to be added to the
      list without fragmentation.  Therefore SSize is set to k and at
      each hop where RHop is 0, the node data list is retrieved and sent
      in a standalone packet.

   o  A flow contains mainly short packets and travels a long path.
      Keeping a large node data list in each packet would be inefficient
      because it lowers the effective network bandwidth utilization.  In
      this case, segment iOAM can be used to limit the ratio of the iOAM
      data to the flow packet payload.

   o  The network allows a budget of at most n bytes for the iOAM data.
      There is a tradeoff between the number of data types that can be
      collected and the number of hops over which data is collected.
      The segment size must therefore be chosen to meet the
      application's data requirement within this budget (i.e., SSize *
      Node Data Size <= n).

5.  In-situ OAM Sampling and Data Validation

   Based on the observation in Section 2.3, the source edge node should
   be able to define either the period or the probability with which
   the iOAM header is added to the packets of the selected flow.  In
   this way, only a subset of the flow's packets carries the OAM data,
   which not only reduces the overall iOAM data quantity but also
   reduces the processing workload of the network nodes.

5.1.  Valid Node Bitmap and Valid Data Bitmap

   It is possible that even an iOAM capable node will not add data to
   the node data list as requested.  In some cases, a node can be too
   busy to handle the data request or some types of the requested data
   are not available.  Therefore, we propose to add two bitmaps, a Valid
   Node Bitmap and a Valid Data Bitmap, to the iOAM specification.

   The Valid Node Bitmap is inserted before the Node Data List as shown
   in Figure 3.  Each bit in the bitmap corresponds to a hop on the
   packet's forwarding path.  The bits are listed in the same order as
   the hops on the packet's forwarding path.  The bitmap is initially
   cleared to all zeros.  If a hop can add data to the Node Data List,
   the corresponding bit in the Valid Node Bitmap is set to 1.  The bit
   location for a hop can be calculated from the length field (e.g.,
   the bit index is equal to SSize-RHop).  The number of valid node data
   items in the node data list is equal to the number of 1's in the
   Valid Node Bitmap.
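   The bit bookkeeping for the Valid Node Bitmap can be illustrated
   with a short sketch.  It assumes a 32-bit bitmap as drawn in
   Figure 3, assumes that bit 0 is the most significant bit (following
   the bit numbering used in the header figures), and takes RHop as the
   value seen when the packet arrives at the hop.  The function names
   are hypothetical.

```python
from typing import List

BITMAP_WIDTH = 32  # assumed width, as drawn in Figure 3


def hop_bit_index(ssize: int, rhop: int) -> int:
    """Bit index of a hop within its segment: SSize - RHop."""
    if not (1 <= rhop <= ssize):
        raise ValueError("expected 1 <= RHop <= SSize")
    return ssize - rhop


def set_valid(bitmap: int, index: int) -> int:
    """Return the bitmap with the bit for hop `index` set to 1.

    Bit 0 is taken to be the most significant bit (an assumption).
    """
    if not (0 <= index < BITMAP_WIDTH):
        raise ValueError("bit index out of range")
    return bitmap | (1 << (BITMAP_WIDTH - 1 - index))


def valid_hops(bitmap: int) -> List[int]:
    """List the hop indexes whose bit is set, in path order."""
    return [i for i in range(BITMAP_WIDTH)
            if bitmap & (1 << (BITMAP_WIDTH - 1 - i))]


def valid_count(bitmap: int) -> int:
    """Number of valid node data items, i.e., the number of 1's."""
    return bin(bitmap).count("1")
```

   For a segment with SSize equal to 4 in which the second hop declines
   to add data, the hops arrive with RHop values 4, 3, 2, and 1, which
   map to bit indexes 0 through 3.  Only bits 0, 2, and 3 end up set,
   so a collector would expect three node data items in the list.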

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Base OAM Trace Type     |0| Length Field  |     Flags     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       Valid Node Bitmap                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   |                       Node Data List []                       |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

           Figure 3: iOAM Header Format with Valid Node Bitmap

   For each node data item in the node data list, a Valid Data Bitmap is
   added before the node data.  The number of bits in the Valid Data
   Bitmap is equal to the number of 1's in the OAM Trace Type bitmaps
   (excluding the next trace type bitmap indicator bits).  When a bit
   is set, the corresponding data is valid in the node; otherwise, the
   corresponding data is invalid and the management plane should ignore
   it after the data is collected.

   The size of the bitmap can be padded to two or four bytes, which
   allows up to 16 or 32 types of data to be included in a node.

5.2.  iOAM Sampling and Data Validation Use Cases

   We give some examples to show the usefulness of the in-situ OAM
   sampling and data validation features.

   o  An application needs to track a flow's forwarding path and knows
      the path will not change frequently, so it sets a low sampling
      rate to periodically insert the iOAM header to request the node
      ID.

   o  In a heterogeneous data plane, some nodes can provide data x but
      other nodes cannot.  However, an application is still interested
      in collecting data x where available.  In this case, the iOAM
      header can still be configured to ask for data x, and the nodes
      that cannot provide the data simply invalidate it by clearing the
      corresponding bit in the Valid Data Bitmap.

   o  Multiple sampling rates and multiple data request schemas can be
      defined for a flow based on application requirements and the
      properties of the data, so a given flow packet may carry no iOAM
      header or one of several different iOAM headers.  A node does not
      need to process all data all the time.

   o  For security reasons, a node decides not to participate in the
      iOAM data collection.  While it processes the other iOAM header
      fields as usual, it neither sets its bit in the Valid Node Bitmap
      nor adds node data to the Node Data List.

6.  Security Considerations

   There are no extra security considerations beyond those that have
   been identified for the in-situ OAM protocol.

7.  IANA Considerations

   This memo includes no request to IANA.

8.  Acknowledgments

   We would like to thank Frank Brockners and Carlos Pignataro for
   helpful comments and suggestions.

9.  Contributors

   The document is inspired by numerous discussions with James N.
   Guichard.  He also provided significant comments and suggestions to
   help improve this document.

10.  Informative References

   [I-D.brockners-inband-oam-data]
              Brockners, F., Bhandari, S., Pignataro, C., Gredler, H.,
              Leddy, J., Youell, S., Mizrahi, T., Mozes, D., Lapukhov,
              P., and R. <>, "Data Formats for In-situ OAM", draft-
              brockners-inband-oam-data-02 (work in progress), October
              2016.

   [I-D.brockners-inband-oam-requirements]
              Brockners, F., Bhandari, S., Dara, S., Pignataro, C.,
              Gredler, H., Leddy, J., Youell, S., Mozes, D., Mizrahi,
              T., <>, P., and r. remy@barefootnetworks.com,
              "Requirements for In-situ OAM", draft-brockners-inband-
              oam-requirements-02 (work in progress), October 2016.

Authors' Addresses

   Haoyu Song (editor)
   Huawei
   2330 Central Expressway
   Santa Clara, 95050
   USA

   Email: haoyu.song@huawei.com


   Tianran Zhou
   Huawei
   156 Beiqing Road
   Beijing, 100095
   P.R. China

   Email: zhoutianran@huawei.com