Network Working Group                                          G. Lencse
Internet Draft                         Budapest Univ. of Tech. and Econ.
Intended status: Experimental                               Sz. Szilagyi
Expires: December 2020                            University of Debrecen
                                                                F. Fejes
                                                Eotvos Lorand University
                                                                      M.
Georgescu
                                                                 RCS&RDS
                                                           June 13, 2020

                   MPT Network Layer Multipath Library
                      draft-lencse-tsvwg-mpt-06.txt

Abstract

   Although several contemporary IT devices have multiple network
   interfaces, communication sessions are restricted to using only one
   of them at a time due to the design of the TCP/IP protocol stack:
   the communication endpoint is identified by an IP address and a TCP
   or UDP port number. The simultaneous use of these multiple
   interfaces for a communication session would improve user experience
   through higher throughput and improved resilience to network
   failures.

   MPT is a network layer multipath solution, which provides a tunnel
   over multiple paths using the GRE-in-UDP specification, thus being
   different from both MPTCP and Huawei's GRE Tunnel Bonding Protocol.

   MPT can also be used as a router, routing the packets among several
   networks between the tunnel endpoints, thus establishing a multipath
   site-to-site connection.

   The version of the tunnel IP and the version of the path IP are
   independent of each other; therefore, MPT can also be used for IPv6
   transition purposes.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."
   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This Internet-Draft will expire on December 13, 2020.

Copyright Notice

   Copyright (c) 2020 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document. Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Table of Contents

   1. Introduction
      1.1. Design Assumptions
      1.2. MPT in the Networking Stack
      1.3. Terminology
      1.4. MPT Concept
   2. Conventions Used in this Document
   3. Operation Overview
   4. MPT Control
      4.1. Configuration Information
         4.1.1. General Information for the MPT Server
         4.1.2. Connection Specifications
      4.2. MPT Configuration Commands
   5. Possible Mappings of the Tunnel Traffic to Paths
      5.1. Per Packet Based Mapping
      5.2. Flow Based Mapping
      5.3. Combined Mapping
   6. Packet Reordering
   7. Why MPT is Considered Experimental?
      7.1. Published Results
         7.1.1. MPT Concept and First Implementation
         7.1.2. Estimation of the Channel Aggregation Capabilities
         7.1.3. Demonstrating the Resilience of an MPT Connection
      7.2. Open questions
         7.2.1. Parameters
         7.2.2. Development of Further Mapping Algorithms
         7.2.3. Performance Issues
   8. Security Considerations
   9. IANA Considerations
   10. Conclusions
   11. References
      11.1. Normative References
      11.2. Informative References
   12. Acknowledgments
   Appendix A. Sample C code for packet reordering

1. Introduction

   MPT is a multipath extension of the GRE-in-UDP encapsulation
   [RFC8086].

1.1. Design Assumptions

   MPT is intended to be used as a preconfigured tunnel, and the
   application of MPT does not require any modifications to the
   applications using the TCP/IP socket interface API.

1.2. MPT in the Networking Stack

   The layer architecture of MPT is shown in Fig. 1. MPT extends the
   GRE-in-UDP [RFC8086] architecture by allowing multiple physical
   paths.
   In this respect, it can be compared to MPTCP [RFC6824], but unlike
   MPTCP, MPT uses UDP in the underlying layer, builds on GRE-in-UDP,
   and provides a tunnel IP layer, over which both UDP and TCP can be
   used. The aim of Huawei's GRE Tunnel Bonding Protocol [RFC8157] is
   also similar to that of MPT: it targets bonded access to wired and
   wireless networks in customer premises. However, it uses GRE (not
   GRE-in-UDP), which is less supported in ISP networks than UDP, and
   it seems to limit the number of physical interfaces to two. For a
   comparison of MPT with other multipath solutions, please refer to
   [Alm2017].

   +---------------------------------------------+
   |            Application (Tunnel)             |
   +---------------------------------------------+
   |              TCP/UDP (Tunnel)               |
   +---------------------------------------------+
   |             IPv4/IPv6 (Tunnel)              |
   +---------------------------------------------+
   |                 GRE-in-UDP                  |    +-----+
   +----------------------+----------------------+<--| MPT |
   |    UDP (Physical)    |    UDP (Physical)    |    +-----+
   +----------------------+----------------------+
   | IPv4/IPv6 (Physical) | IPv4/IPv6 (Physical) |
   +----------------------+----------------------+
   |    Network Access    |    Network Access    |
   +----------------------+----------------------+

                Figure 1: MPT Layer Architecture

1.3. Terminology

   This document uses a number of terms that are either MPT specific
   or have a defined meaning in the context of MPT, as follows:

   MPT server: An MPT server is software that implements network layer
      multipath communication by providing a UDP tunnel (named
      "connection" in the MPT terminology) over several underlying
      "paths".

   MPT client: An MPT client is a software tool used to control the
      local MPT server (e.g. start/stop connections, add paths to
      connections, etc.).
   Connection: An MPT connection (also referred to as a communication
      session) is a UDP tunnel between two MPT servers, which can be
      used to carry user data. A connection can be established over
      one or more paths. A connection is initiated on the basis of a
      "connection specification".

   Path: A path refers to a pair of network cards of the end nodes
      (identified by the pair of IP addresses of the cards). Using a
      specified path, the packet transmission runs between the given
      pair of network cards.

   Connection specification: A connection specification is stored in a
      configuration file, and it is used by an MPT server to establish
      an MPT connection with another MPT server. It contains all the
      configuration information for the connection (e.g. endpoint IP
      versions and addresses, the number of paths, and configuration
      information for all paths). The precise definition of the
      connection specification can be found in Section 4.1.2.

   Data port: Data port means the GRE-in-UDP port defined in [RFC8086]
      as 4754. It is used for transmitting data packets.

   Local command port: An MPT server accepts commands from the MPT
      client at the local command port.

   Remote command port: An MPT server MAY accept commands from other
      MPT servers at the remote command port.

   Data plane: The parts and functions of MPT which are responsible
      for handling user data packets.

   Control plane: All parts and functions of MPT except the data
      plane, e.g. handling connections and paths, all the
      communication through local or remote command ports, etc.

1.4. MPT Concept

   When an MPT server is started, it reads its configuration files,
   and depending on their contents, it MAY wait for and accept
   connection(s) initiated by other MPT server(s) and/or it MAY
   initiate one or more MPT connection(s) with other MPT server(s).
   In the simplest case, the MPT server uses the connection
   specifications described in its configuration files for initiating
   connections. In addition to that, new connections MAY be
   established, connections MAY be closed, and the parameters of the
   connections MAY be changed later (e.g. some paths may be switched
   on and off) dynamically, by issuing the appropriate commands using
   an MPT client.

   MPT connections between MPT servers implement tunnels. The traffic
   coming from the tunnel interface is distributed over the active
   paths of the MPT connection by the MPT server. There are three
   possible mappings (see Section 5 for details and illustrative
   examples):

   o per packet based mapping, where a mapping decision is made for
     every single packet

   o flow based mapping, where the flows, identified by the usual five
     tuple, are always mapped to a given path

   o combined mapping, where the flows, identified by the usual five
     tuple, are always mapped to a given connection, and a mapping
     decision is made for every single packet of each connection

   The peer MPT server receives and de-encapsulates the traffic from
   the different paths and restores the tunnel traffic, using the
   optional GRE sequence numbers for packet reordering if necessary.

2. Conventions Used in this Document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in
   this document are to be interpreted as described in RFC 2119
   [RFC2119].

   In this document, these words will appear with that interpretation
   only when in ALL CAPS. Lower case uses of these words are not to be
   interpreted as carrying the significance described in RFC 2119.

3. Operation Overview

   In this section, we describe the operation of the data plane,
   whereas the operation of the control plane can be found in Section
   4.
   The data packet transmission and receive mechanism of MPT is
   summarized in Fig. 2. We shall now follow the route and processing
   of data packets.

   When a packet is read from the tunnel interface, the MPT software
   looks up the appropriate connection specification, which determines
   the mapping of the packets to the paths. The connection
   specification determines the behavior of the multipath
   communication, especially the distribution of the packets among the
   paths (see Section 5 for possible mapping methods). The path is
   selected, and the user data packet is encapsulated into a
   GRE-in-UDP data unit, which may optionally contain a GRE Sequence
   Number for reordering. The simplest GRE header contains 4 octets:
   16 bits of zeros and a 16-bit protocol type identification value
   (i.e. 0x86DD in the case of using IPv6 on the tunnel interface, or
   0x0800 in the case of IPv4). Then the GRE-in-UDP data unit is
   encapsulated into the UDP/IP data unit of the selected path, where
   the destination UDP port number is the 4754 GRE-in-UDP port and the
   IP addresses (either both IPv4 or both IPv6) are determined by the
   path definition. Finally, the encapsulated packet is transmitted
   through the physical interface. The encapsulation of the different
   protocol data units is shown in Fig. 3.
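   As an illustration (not code taken from the MPT implementation),
   the GRE header construction described above can be sketched in C as
   follows; the optional Sequence Number field follows the layout of
   RFC 2890, i.e. it is appended when the S bit is set in the first 16
   bits:

```c
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h> /* htons(), htonl() */

/* Illustrative sketch: build the GRE header described above.
 * The flags are all zero in the simplest case; setting the S bit
 * (0x1000) appends a 32-bit sequence number used for reordering.
 * Returns the resulting header length in octets (4 or 8). */
static size_t build_gre_header(uint8_t *buf, int inner_is_ipv6,
                               int with_seq, uint32_t seq)
{
    uint16_t flags = with_seq ? 0x1000 : 0x0000;   /* S bit, RFC 2890 */
    uint16_t proto = inner_is_ipv6 ? 0x86DD : 0x0800;
    uint16_t f = htons(flags), p = htons(proto);

    memcpy(buf,     &f, 2);                        /* flags + version */
    memcpy(buf + 2, &p, 2);                        /* protocol type   */
    if (!with_seq)
        return 4;
    uint32_t s = htonl(seq);
    memcpy(buf + 4, &s, 4);                        /* sequence number */
    return 8;
}
```

   The resulting header is then prepended to the tunnel IP packet and
   carried in a UDP datagram with destination port 4754.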
   +------------------------------------------------+
   |                Tunnel Interface                |
   +------------------------------------------------+
               ||                      /\
               ||                      ||
   ############||######################||############
   #           \/                      ||           #
   # +--------------------+  +--------------------+ #
   # |    data packet     |  | forwarding to the  | #       +----------+
   # |      reading       |  |  tunnel interface  | #       | Control  |
   # +--------------------+  +--------------------+ #       | Protocol |
   #           ||                      /\           #       |   PDU    |
   #           \/                      ||           #       +----------+
   # +--------------------+  +--------------------+ #            ||
   # | checking connection|  | packet reordering  | #            ||
   # |    specification   |  |     (optional)     | #            ||
   # +--------------------+  +--------------------+ #            \/
   #           ||                      /\           #       ###########
   #           \/                      ||           #<======#   MPT   #
   # +--------------------+  +--------------------+ #       # Control #
   # |   path selection,  |  |        data        | #       ###########
   # | GRE-in-UDP encaps. |  |      checking      | #
   # +--------------------+  +--------------------+ #
   #           ||                      /\           #
   #           \/                      ||           #
   # +--------------------+  +--------------------+ #
   # |   physical data    |  |    data packet     | #
   # |    transmission    |  |      reading       | #
   # +--------------------+  +--------------------+ #
   #           ||                      /\           #
   ############||######################||############
               ||                      ||
               \/                      ||
   +------------------------------------------------+
   |               Physical Interface               |
   +------------------------------------------------+

     Figure 2: Conceptual architecture of MPT working mechanism

   When a packet is read from the physical interface, its destination
   UDP port number is the 4754 GRE-in-UDP port. MPT reads the packet,
   identifies the connection the packet belongs to (by the source and
   destination IP addresses of the tunnel IP header) and runs checking
   mechanisms (e.g. connection validity check, GRE sequence number
   check or GRE Key value check, if present).
   If all the checking mechanisms finish successfully and no
   reordering is necessary, then the packet is promptly transmitted to
   the Transport and Application Layers through the tunnel interface.
   If reordering is on and the GRE sequence number indicates that one
   or more data units are missing, then the packet is placed into a
   buffer array for reordering purposes. (Reordering is discussed in
   Section 6.)

   +----------+-----------+---------+-----------+---------+-----------+
   | path IP  | path UDP  | GRE-in- | tunnel IP | tunnel  |application|
   | v4 or v6 |(port 4754)|   UDP   | v4 or v6  | TCP/UDP |   data    |
   +----------+-----------+---------+-----------+---------+-----------+

        Figure 3: PDU encapsulation of the MPT data communication

4. MPT Control

   A connection can be established between two MPT servers in two
   ways:

   1. When the MPT server is started, it establishes the connection on
      the basis of a connection specification from the configuration
      files. In this case, the connection specification contains all
      the necessary parameters. MPT client commands can still be used
      to modify the parameters, switch paths off and on, etc., as
      described in Section 4.2.

   2. The connection is established by using MPT client commands. In
      this case, the command line arguments of the MPT commands and
      the configuration files contain the necessary parameters.

4.1. Configuration Information

   The MPT configuration files contain various pieces of information.
   They can be divided into two groups:

   1. general information for the MPT server

   2. connection specification(s)

4.1.1. General Information for the MPT Server

   The MPT configuration file is made up of sections. The "general"
   section MUST be present, and it contains general information for
   the operation of the MPT server, whereas several sections MAY
   follow, each of which describes a different tunnel.
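   For illustration only, the overall structure of a configuration
   file with one tunnel might look like the following sketch; the
   actual section and key names are implementation dependent and
   purely hypothetical here:

      [general]
         (general information for the MPT server)

      [tunnel_1]
         (parameters of the first tunnel interface)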
   The general section MUST contain the following elements:

   o tunnel number: the number of tunnels to create (they are to be
     described in separate sections)

   o accept remote: a yes/no key specifying whether this MPT server
     should accept commands from other MPT servers to build up
     connections which are not defined in the local configuration
     files

   o local command port: the port number on which the local MPT client
     software can give commands to the MPT server

   o command timeout: the timeout value for the MPT client

   For each tunnel, a separate section is to be used to describe the
   following parameters:

   o name: The name is used by the operating system to access the
     interface.

   o MTU: The maximum transmission unit of the tunnel interface. For
     an Ethernet environment, the value should be set between 1436 and
     1468 (depending on the additional header sizes used by the actual
     system). It can be calculated as:
     1500-Path_IP_header_size-UDP_header_size-GRE_header_size.

   o ipv4_addr: IPv4 address and mask

   o ipv6_addr: IPv6 address and mask

   Note that both ipv4_addr and ipv6_addr MAY be present. At least one
   of them MUST be present.

   It is important to note that the same tunnel may be used by several
   connections. A connection can be uniquely identified by the IP
   addresses of the two endpoints, which have to be of the same type
   (IPv4 or IPv6).

4.1.2. Connection Specifications

   A connection specification is made up of sections. The "connection"
   section contains parameters that are to be specified only once for
   each connection. The "paths" section contains one or more path
   definitions. The optional "networks" section contains network
   definitions for routing purposes.

   The general section (called "connection") MUST contain the
   following elements:

   o name: The unique name of the connection.
     If multiple connections are used, the name must uniquely identify
     the connection.

   o permissions: There MAY be SEND and RECEIVE permissions, which
     allow sending and receiving connection updates. The term SEND
     means that the local MPT environment is allowed to initiate a
     configuration change at the peer. The term RECEIVE means that the
     peer is allowed to initiate a configuration change, and the local
     MPT environment will accept it. (The actual execution of the
     requested change depends on further conditions, e.g. successful
     authentication.)

   o IP version: its possible values are 4 or 6.

   o local IP address: must be of the IP version specified above and
     must be the same as defined for the tunnel.

   o remote IP address: the IP address of the remote peer; must be of
     the IP version specified above

   o local data port number: used for data communication, SHOULD be
     set to the 4754 GRE-in-UDP port number

   o remote data port number: used for data communication, SHOULD be
     set to the 4754 GRE-in-UDP port number

   o remote command port number: The UDP port number of the peer,
     which is used to accept control commands. If the local MPT client
     starts an MPT command (e.g. turning off a path usage), the MPT
     server will communicate this action to the peer by using the
     remote command port number as the destination port number.

   o path count: The key is an integer P, denoting the number of paths
     defined for this connection. The minimum value is 1; the maximum
     value is implementation dependent, e.g. 20. The configuration
     file MUST have P path definition sections (usually named
     [path_n], where 0 <= n < P).

   o reorder window: the length of the buffer array used for packet
     reordering; ordered packet transmission is requested by setting
     its value greater than 0 (see Section 6).

   o maximum buffer delay: If ordered packet transmission is required,
     maximum buffer delay specifies the maximum time (in milliseconds)
     while the packet may be stored in the buffer-array.

   o authentication key: The authentication key contains the key value
     of the control communication authentication.
     Some algorithms do not need authentication keys. In this case the
     specification of the authentication key is not necessary, or it
     will be ignored.

   A path definition section MUST contain the following elements:

   o interface name: The value is the name of the physical interface
     used by the given path for packet forwarding (e.g. eth0, wlan0).

   o IP version: Specifies the version of IP used by the path. The
     value can be 4 or 6.

   o public IP address: Specifies the public IP address of the
     interface used for the tunnel communication. If the host is
     placed in the Global Address Realm, the public IP address is the
     IP address of the interface; otherwise (i.e. when the host is
     behind a NAT box) it is the public address assigned by the NAT
     box to the tunnel communication session. If the path uses IPv4
     and NAT, then the special address value of 0.0.0.0 can be used to
     force the MPT server program to determine the public IP address
     automatically.

   o remote IP address: Indicates the public IP address of the remote
     endpoint.

   o gateway IP address: The IP address of the gateway used to reach
     the peer (i.e. the remote IP address) using the given path. If
     the operating system uses the Network Manager (nmcli) software
     for network configuration, then the value of 0.0.0.0 can be used
     to find the gateway of the named interface automatically.

   o weight out: This is the "weight of the path" in the system,
     expressing the estimated transmission capacity of the path. The
     MPT server program distributes the outgoing packets among the
     available paths according to their weights, if per packet based
     mapping is used. The value must be between 1 and 10,000.

   o status: This key specifies the initial state of the path after
     starting the MPT server. The value "up" means that the path is
     usable (working), and the state of the path is OK. If required,
     the path may initially be set to "down".
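   As a purely hypothetical example (the key names and the syntax are
   implementation dependent), a path definition with the mandatory
   elements above might look like the following snippet; the addresses
   are taken from the documentation ranges:

      [path_0]
      interface name:     eth0
      IP version:         4
      public IP address:  198.51.100.10
      remote IP address:  203.0.113.10
      gateway IP address: 198.51.100.1
      weight out:         5
      status:             up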
   A path definition section MAY contain the following elements:

   o private IP address: The IP address of the physical interface. It
     can be omitted if the public IP address is assigned directly to
     the interface. When using IPv4 and NAT, the special value of
     0.0.0.0 can be used to force the MPT server application to read
     and use the first IPv4 address assigned to the interface.

   o keepalive time: The MPT system monitors the availability of each
     path by sending keepalive messages regularly. The key specifies
     the time between the keepalive messages (in seconds) that the MPT
     server uses for sending keepalives. The value of zero (which is
     the default value) switches off the keepalive mechanism.

   o dead time: If the keepalive mechanism is active, and the host
     does not receive any keepalive message on the given path from the
     peer for dead time seconds, then the path is considered "dead"
     and will not be used for data transmission. (The default value is
     3*keepalive time.)

   o weight in: This field is used by the "mpt path up" command (see
     Section 4.2) to set the outgoing weight of the corresponding path
     at the peer. The default value is 1.

   o command default: This key can be used to specify one path as the
     default path for control command communication. In the case of
     receiving the "create connection" control command, the system
     will use this path for the control communication.

   The optional "networks" section contains network definitions for
   routing purposes. Each network definition begins with its name in
   the [net_n] format and contains the following parameters:

   o IP version: Specifies the version of IP used in the network
     definitions. The value can be 4 or 6.

   o source address: specifies the source network and its prefix
     length in the CIDR notation.
   o destination address: specifies the destination network and its
     prefix length in the CIDR notation.

   The network configuration can also be used to provide a multipath
   Internet connection by specifying 0.0.0.0/0 as the destination
   address and prefix length. (The source is our tunnel address in
   this case.)

4.2. MPT Configuration Commands

   The same control interface is used for the local administration of
   the MPT server (by the MPT client accessing the MPT server at the
   local command port through the loopback interface) and for the
   communication of the local MPT server with the remote MPT server
   (accessing it at its remote command port).

   Some client commands follow. Although some of the syntax of our MPT
   implementation is used, the focus is not on the syntax, which may
   be implementation dependent, but rather on the functionality of the
   commands. The execution of these commands may also involve
   communication between the local MPT server and the remote MPT
   server.

   mpt address {add|del} IPADDRESS/PREFIX dev INTERFACE

   An IPv4 or IPv6 address can be added to or deleted from a (local)
   interface.

   mpt interface INTERFACE {up|down}

   The specified interface is turned up or down; in addition, all the
   paths that are based on the given local physical interface are
   turned on or off by invoking the "mpt path {up|down}" command (see
   below) for each affected path.

   mpt path PATH {up|down}

   This command can be used to turn a specified path on or off. If the
   path status is changed to down, then it is not used by the
   connection (i.e. no data is sent through that path by the MPT
   software).

   mpt connection CONNECTION {create|delete}

   This command can be used to establish or tear down a connection
   between the local and a remote MPT server. (The parameters are
   taken from local configuration files.)
   If the remote server is configured to do so, it accepts the
   parameters of the connection from the local server.

   mpt save [FILENAME]

   The current configuration can be changed during runtime by remote
   peers. (This can be enabled with the accept remote key and with the
   permissions key.) This command is used to write these connection
   changes to the configuration files, so that the new settings remain
   in effect after a server restart or after mpt reload.

   mpt reload [FILENAME]

   Warm restart: the MPT server builds up its connections according to
   its configuration files. (Our implementation only establishes
   connections, but does not tear them down.)

5. Possible Mappings of the Tunnel Traffic to Paths

   The data packets coming from the tunnel interface must be forwarded
   through one of the active paths of the connection. Three possible
   mapping solutions are proposed:

   o Per packet based mapping means that the tunnel traffic is
     distributed among the paths on the basis of the parameters of the
     paths only, regardless of which network flow a given packet
     belongs to.

   o Flow based mapping means that packets which belong to a given
     network flow, identified by the usual five tuple of source IP
     address, destination IP address, source port number, destination
     port number, and protocol number (TCP or UDP), or the three tuple
     of source IP address, destination IP address, and protocol number
     (TCP, UDP or ICMP), are always mapped to the same path.

   o Combined mapping means the combination of the two above: packets
     which belong to a given network flow, identified as described
     above, are always mapped to the same connection, and the packets
     that belong to a connection are distributed among the paths of
     that connection by per packet decisions on the basis of the
     parameters of the paths of the connection.
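   As a hypothetical illustration of a flow based decision (the
   function and rule structure below are not part of MPT; the real
   rules come from the path configuration described in Section 5.2), a
   per-path TCP destination port match might be sketched in C as
   follows:

```c
#include <stdint.h>
#include <stddef.h>

/* Illustrative sketch: each path may list the TCP destination ports
 * it should carry (cf. the tcp_dst key in Section 5.2). Unmatched
 * packets fall back to path 0 here; a real implementation would fall
 * back to per packet based mapping instead. */
struct path_rule {
    const uint16_t *tcp_dst;  /* TCP destination ports for this path */
    size_t n_ports;
};

static int select_path(const struct path_rule *rules, size_t n_paths,
                       uint16_t tcp_dst_port)
{
    for (size_t p = 0; p < n_paths; p++)
        for (size_t i = 0; i < rules[p].n_ports; i++)
            if (rules[p].tcp_dst[i] == tcp_dst_port)
                return (int)p;   /* flow is pinned to this path */
    return 0;                    /* fallback for unlisted ports */
}
```

   Because the decision depends only on the flow identifier, all
   packets of a flow take the same path, which is exactly the property
   that distinguishes flow based mapping from per packet based
   mapping.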
   We illustrate the three mapping solutions by examples.

   Definitions for the examples:

   Computers A and B are interconnected by 3 different paths:

      path_1: 100Base-TX Ethernet

      path_2: 802.11g WiFi

      path_3: LTE

   Connection_1 has 3 paths with the following weight out values:

      path_1: 5

      path_2: 2

      path_3: 3

   Example 1 (Per packet based mapping)

   All the traffic between the two computers is distributed among the
   three paths of Connection_1 proportionally to their weight out
   values. A decision is made about every single packet as described
   in Section 5.1, regardless of which application it belongs to.

   Advantage: The transmission capacity of all the paths can be
   utilized.

   Disadvantage: There is no possibility to use different mappings for
   different applications.

   Example 2 (Flow based mapping)

   Based on the destination port number or port range, the traffic of
   different applications is mapped to paths as follows:

      HTTP, VoD: path_1

      FTP, Bit-Torrent: path_2

      VoIP: path_3

   Advantage: Applications can be differentiated: e.g. the delay
   critical VoIP can use LTE, whereas the free WiFi is satisfactory
   for the non-mission critical Bit-Torrent.

   Disadvantage: The mapping of the traffic is too rigid: all the
   traffic of applications of a given type is mapped to a single path;
   therefore, the applications (and thus their users) do not
   experience the benefits of multipath transmission.
711 Example 3 (Combined mapping) 713 We define further two connections: 715 Connection_2 717 path_1: 5 719 path_2: 2 721 Connection_2 723 path_1: 5 724 path_3: 3 726 Based on the destination port number or port range, the traffic of 727 different applications are mapped to paths as follows: 729 HTTP: connection_1 731 FTP, Bit-Torrent: connecton_2 733 VoIP, VoD: connection_3 735 Advantage: The applications may benefit from the multipath 736 transmission, whereas each types of applications use those paths, 737 which are beneficial and affordable for them. 739 Disadvantage: The price of the above resilience is the time and 740 computational complexity of the execution of both algorithms. 742 Conclusion: The appropriate choice of the mapping algorithm depends 743 on the expectations of the user. 745 5.1. Per Packet Based Mapping 747 The aim of the "per packet based" mapping is to distribute the 748 tunnel traffic to the paths proportionally to their transmission 749 capacity. This mapping facilitates the aggregation of the 750 transmission capacities of the paths. 752 In MPT, the transmission capacity of the paths is represented by 753 their WEIGT_OUT parameter. 755 The following algorithm calculates the sending vector, which 756 contains the indices of the paths in the order they are to be used 757 for transmission. 759 ALGORITHM calculate_sending_vector 761 INPUT: W[i] (1 <= i <= N), the vector of the weights of the paths. 763 (Note: We have N paths with indices (1, ... , N) 765 OUTPUT: O[j] ( 1 <= j <= M ), the sending vector containing the 766 indices of the paths; where M is the length of the sending cycle. 768 lcm := Least Common Multiple for (W[1], ... , W[N]) 769 M := 0 771 s[i] := 0, for all i (1 <= i <= N) 773 (Note: s[i] will store the sum of the increments for path i, where 774 the increment is lcm/W[i]) 776 WHILE TRUE DO 778 z := min(s[1]+lcm/W[1], ... 
, s[N]+lcm/W[N]) 780 k := the smallest index i for which z == s[i]+lcm/W[i] 782 M := M+1 784 s[k] := z 786 O[M] := k 788 IF s[i] == z for all i (1 <= i <= N) THEN RETURN 790 DONE 792 END 794 A sample C code can be found in the Appendix. 796 5.2. Flow Based Mapping 798 The aim of the flow based mapping is to be able to distinguish the 799 packets belonging to different network flows and map them to the path 800 that was set for them. (E.g. WiFi is used for Torrent traffic and 801 LTE is used for VoIP calls.) 803 Our current implementation realizes a port-based flow mapping. It is 804 possible to select the interface for the outgoing traffic based on 805 the transport protocol and port. For communication between two MPT 806 servers, you can precisely specify which flow is mapped to which path. 808 The configuration of the mechanism is simple. Four new values can be 809 added to the definition of paths: 811 tcp_dst - TCP destination port matches 813 tcp_src - TCP source port matches 814 udp_dst - UDP destination port matches 816 udp_src - UDP source port matches 818 All of these are optional, and each of them can list multiple ports. 819 Traffic to ports that are not listed continues to use per packet 820 based mapping. The current implementation of MPT with flow based 821 mapping can be found at [MptFlow]. 823 In the example below, each outgoing TCP packet with destination port 824 80, 443, or 8080, as well as each UDP packet with destination port 5901, will be 825 sent on path_1. TCP packets with source port 7880 or 56000 will be 826 sent on path_2. 828 Example (flow based mapping configuration snippet) 830 [path_1] 831 ... 832 tcp_dst = 80 443 8080 833 udp_dst = 5901 835 [path_2] 837 ... 839 tcp_src = 7880 56000 841 5.3. Combined Mapping 843 TBD 845 6. Packet Reordering 847 As the delay of the different paths can be different, packet 848 reordering may appear in a packet sequence transmission.
The MPT 849 environment offers an optional feature to ensure in-order 850 packet delivery for the tunnel communication. If this feature is 851 enabled, the receiver uses a buffer array to store the incoming 852 (unordered) packets. The packets are then sorted according to their 853 GRE sequence numbers, thus ensuring ordered delivery to the 854 receiver's tunnel interface. 856 Two parameters control the reordering. The 857 reorder window parameter specifies the length of the buffer array 858 used for reordering. The maximum buffer delay parameter specifies 859 the maximum time (in milliseconds) for which a packet may be stored in the 860 buffer array. If a packet has been delayed in the buffer array for the 861 specified time, it is transmitted to the tunnel interface, even 862 if some packets preceding it are still missing. 863 The missing packets are considered lost (i.e. MPT 864 does not wait for them any longer). The packets that have arrived are 865 transferred to the tunnel interface according to their GRE sequence 866 numbers, so ordered delivery is maintained even in the case of 867 packet loss. 869 How should the values of these parameters be set? 871 As for the maximum buffer delay, if its value is too small, then MPT may 872 incorrectly consider a sequence number lost, and if the packet arrives 873 later, MPT has to drop it to maintain in-order delivery. If 874 its value is too large, then packet loss is detected too 875 late, and thus the communication performance may decrease. Our 876 experience shows that a feasible choice is a few times the 877 RTT (Round-Trip Time) of the slowest path. 879 As for the reorder window, it MUST be large enough to store the packets 880 arriving at maximum line rate from all the active paths of the given 881 connection during a maximum buffer delay interval. 883 The appropriate choice of these parameters is still a subject of 884 research. 886 7.
Why Is MPT Considered Experimental? 888 We view MPT as a research area rather than a solution that is ready 889 for deployment. We have a working MPT implementation, 890 but it contains only the "per packet based" mapping of the tunnel 891 traffic to the paths. One of our aims in writing this Internet Draft 892 is to enable others to write MPT implementations. It is our hope 893 that the experience gained with preparing other implementations, as 894 well as the results of their testing and performance analysis, will 895 lead to a better MPT specification, which may then serve as a 896 standards track specification of an improved MPT that is ready 897 for deployment. 899 In this section, we summarize the most important results as well as 900 the open questions of MPT-related research. 902 7.1. Published Results 904 7.1.1. MPT Concept and First Implementation 906 The conceptual architecture of MPT, a comparison with other multipath 907 solutions, some details of the first implementation, and some test 908 results are available in [Alm2017]. 910 The user manual of the first MPT implementation and the precompiled 911 MPT libraries for Linux (both i386 and amd64) and Raspbian are 912 available from [Mpt2017]. 914 7.1.2. Estimation of the Channel Aggregation Capabilities 916 The channel aggregation capabilities of an early MPT implementation, 917 which did not use GRE-in-UDP, were analyzed up to twelve 100Mbps 918 links in [Len2015]. 920 Some of the above tests were repeated with the current GRE-in-UDP 921 based MPT implementation, and the path aggregation capabilities of 922 MPT were compared to those of MPTCP in [Kov2016] and [Szi2018]. 924 Measurements were also performed using two 1Gbps links [Szi2018b] 925 and four 1Gbps links [Szi2019]. 927 The performance of MPT and MPTCP was compared using two 10Gbps links 928 in [Szi2019b]. 930 7.1.3.
Demonstrating the Resilience of an MPT Connection 932 The resilience property of the early MPT implementation, which did 933 not use GRE-in-UDP, was demonstrated in [Alm2014] and [Alm2015]. 935 The fast connection recovery of the GRE-in-UDP based MPT 936 implementation was demonstrated in [Fej2016]. 938 A playout buffer length triggered path switching algorithm was 939 developed for the GRE-in-UDP based MPT, and its effectiveness was 940 demonstrated by the elimination of stalling events during YouTube 941 video playback [Fej2017]. 943 7.2. Open Questions 945 7.2.1. Parameters 947 The optimal (or good enough) choice of the reorder window size and 948 maximum buffer delay parameters is an important question, which 949 should be solved before MPT can be deployed. 951 7.2.2. Development of Further Mapping Algorithms 953 The current MPT implementation [Mpt2017] includes only the per 954 packet based mapping. For a precise specification of the further two 955 mapping algorithms, we would like to rely on our experience with them. 956 There are also open questions, e.g. how to handle traffic that is 957 neither TCP nor UDP. 959 7.2.3. Performance Issues 961 The current MPT implementation [Mpt2017] works in user space. Thus, 962 it is not surprising that multipath transmission of the same amount 963 of traffic by MPT results in a higher CPU load than its multipath 964 transmission by MPTCP [Kov2019]. How much CPU power could a kernel 965 space MPT implementation save? 967 It was also pointed out in [Kov2019] that MPT is not able to 968 utilize the computing power of more than two CPU cores. This is so 969 because MPT uses only two threads (one for each direction). This is 970 not a serious issue when MPT is used on personal computers. 971 However, when MPT is used to connect several networks, it is an 972 important question how MPT could utilize the computing power of 973 modern CPUs with several cores. 975 7.3.
Implementation 977 A sample implementation of the MPT software is available from 978 [MptSrc] under the GPLv3 license. It is intended for research and 979 experimentation purposes only, as it has not been sufficiently 980 tested to be used for commercial purposes. 982 8. Security Considerations 984 Threats that apply to GRE-in-UDP tunneling apply here, too. For the 985 security considerations of GRE-in-UDP, please refer to Section 11 of 986 [RFC8086]. 988 If an MPT server is configured to allow its peer to build up 989 connections, this may lead to resource exhaustion and thus to 990 successful DoS (Denial of Service) attacks. 992 Authentication between MPT servers is optional, which may lead to 993 security issues. 995 9. IANA Considerations 997 Port numbers may be reserved for the local command port and remote 998 command port. 1000 10. Conclusions 1002 Hereby we publish the specification of the MPT network layer 1003 multipath library in the hope that it can be made better by the 1004 review and comments of the WG members and that, after several 1005 open questions have been answered, MPT can one day mature into a production tool. We 1006 seek interested volunteers for an independent implementation, and we 1007 would be happy to take part in research cooperation. We welcome all 1008 kinds of feedback from anyone to make MPT better. 1010 11. References 1012 11.1. Normative References 1014 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 1015 Requirement Levels", BCP 14, RFC 2119, March 1997. 1017 [RFC8086] Yong, L. (Ed.), Crabbe, E., Xu, X., and T. Herbert, 1018 "GRE-in-UDP Encapsulation", RFC 8086, DOI 1019 10.17487/RFC8086, March 2017. 1021 11.2. Informative References 1023 [Alm2014] Almasi, B., "A solution for changing the communication 1024 interfaces between WiFi and 3G without packet loss", in 1025 Proc. 37th Int. Conf. on Telecommunications and Signal 1026 Processing (TSP 2014), Berlin, Germany, Jul. 1-3, 2014, 1027 pp.
73-77 1029 [Alm2015] Almasi, B., Kosa, M., Fejes, F., Katona, R., and L. Pusok, 1030 "MPT: a solution for eliminating the effect of network 1031 breakdowns in case of HD video stream transmission", in: 1032 Proc. 6th IEEE Conf. on Cognitive Infocommunications 1033 (CogInfoCom 2015), Gyor, Hungary, 2015, pp. 121-126, doi: 1034 10.1109/CogInfoCom.2015.7390576 1036 [Alm2017] Almasi, B., Lencse, G., and Sz. Szilagyi, "Investigating 1037 the Multipath Extension of the GRE in UDP Technology", 1038 Computer Communications (Elsevier), vol. 103, no. 1, 1039 2017, pp. 29-38, DOI: 10.1016/j.comcom.2017.02.002 1041 [Fej2016] Fejes, F., Katona, R., and L. Pusok, "Multipath strategies 1042 and solutions in multihomed mobile environments", in: 1043 Proc. 7th IEEE Conf. on Cognitive Infocommunications 1044 (CogInfoCom 2016), Wroclaw, Poland, 2016, pp. 79-84, doi: 1045 10.1109/CogInfoCom.2016.7804529 1047 [Fej2017] Fejes, F., Racz, S., and G. Szabo, "Application agnostic 1048 QoE triggered multipath switching for Android devices", 1049 in: Proc. 2017 IEEE International Conference on 1050 Communications (IEEE ICC 2017), Paris, France, May 1051 21-25, 2017, pp. 1585-1591. 1053 [Kov2016] Kovacs, A., "Comparing the aggregation capability of the 1054 MPT communications library and multipath TCP", in: Proc. 1055 7th IEEE Conf. on Cognitive Infocommunications (CogInfoCom 1056 2016), Wroclaw, Poland, 2016, pp. 157-162, doi: 1057 10.1109/CogInfoCom.2016.7804542 1059 [Kov2019] Kovacs, A., "Evaluation of the Aggregation Capability of 1060 the MPT Communications Library and Multipath TCP", Acta 1061 Polytechnica Hungarica, vol. 16, no. 6, 2019, pp. 129-147, 1062 DOI: 10.12700/APH.16.6.2019.6.9 1064 [Len2015] Lencse, G. and A. Kovacs, "Advanced Measurements of the 1065 Aggregation Capability of the MPT Multipath Communication 1066 Library", International Journal of Advances in 1067 Telecommunications, Electrotechnics, Signals and Systems, 1068 vol. 4, no. 2, 2015, pp. 41-48.
DOI: 1069 10.11601/ijates.v4i2.112 1071 [Szi2018] Szilagyi, Sz., Fejes, F., and R. Katona, "Throughput 1072 Performance Comparison of MPT-GRE and MPTCP in the Fast 1073 Ethernet IPv4/IPv6 Environment", Journal of 1074 Telecommunications and Information Technology, vol. 3, no. 1075 2, 2018, pp. 53-59, DOI: 10.26636/jtit.2018.122817 1077 [Szi2018b] Szilagyi, Sz., Bordan, I., Harangi, L., and B. Kiss, 1078 "MPT-GRE: A Novel Multipath Communication Technology for 1079 the Cloud", in: Proc. 9th IEEE Conf. on Cognitive 1080 Infocommunications (CogInfoCom 2018), Budapest, Hungary, 1081 2018, pp. 81-86, doi: 10.1109/CogInfoCom.2018.8639941 1083 [Szi2019] Szilagyi, Sz., Bordan, I., Harangi, L., and B. Kiss, 1084 "Throughput Performance Comparison of MPT-GRE and MPTCP in 1085 the Gigabit Ethernet IPv4/IPv6 Environment", Journal of 1086 Electrical and Electronics Engineering, vol. 12, no. 1, 1087 2019, pp. 57-60, ISSN: 1844-6035 1089 [Szi2019b] Szilagyi, Sz., Bordan, I., Harangi, L., and B. Kiss, 1090 "Throughput Performance Analysis of the Multipath 1091 Communication Technologies for the Cloud", Journal of 1092 Electrical and Electronics Engineering, vol. 12, no. 2, 1093 2019, pp. 69-72, ISSN: 1844-6035 1095 [Mpt2017] MPT - Multipath Communication Library, 1096 https://irh.inf.unideb.hu/~szilagyi/index.php/en/mpt/ 1098 [MptFlow] "MPT - Multi Path Tunnel", source code version with flow 1099 based packet to path mapping feature, 1100 https://github.com/spyff/mpt/tree/flow_mapping 1102 [MptSrc] "MPT - Multi Path Tunnel", source code, 1103 https://github.com/spyff/mpt 1105 [RFC6824] Ford, A., Raiciu, C., Handley, M., and O. Bonaventure, "TCP 1106 Extensions for Multipath Operation with Multiple 1107 Addresses", RFC 6824, DOI 10.17487/RFC6824, January 1108 2013. 1110 [RFC8157] Leymann, N., Heidemann, C., Zhang, M., Sarikaya, B., and M. 1111 Cullen, "Huawei's GRE Tunnel Bonding Protocol", RFC 8157, 1112 DOI 10.17487/RFC8157, May 2017. 1114 12.
Acknowledgments 1116 The MPT Network Layer Multipath Library was invented by Bela Almasi, 1117 the organizer and original leader of the MPT development team. 1119 This document was prepared using 2-Word-v2.0.template.dot. 1121 Appendix A. Sample C code for calculating the packet sending order 1123 <CODE BEGINS> /* M and O[] are assumed to be defined by the embedding code: O[] is the sending vector (an array of at least M path_type pointers) and M is its capacity, which must be at least the length of the sending cycle, i.e. the sum of lcm/weight_out over the paths with nonzero weight. CALCULATE_GCD is assumed to compute the greatest common divisor of its two arguments. */ 1124 void calculate_pathselection(connection_type *con) { 1125 long long lcm; 1126 long gcd, min_inc, cinc; 1127 int i, j, min_idx; 1128 path_type *p; 1130 con->pathselectionlength = 0; 1131 lcm = 1; /* running least common multiple of the nonzero weights */ 1132 1133 for (i = 0; i < M; i++) 1134 O[i] = NULL; 1136 for (i = 0; i < con->path_count; i++) { 1139 con->mpath[i].selection_increment = 0; if (con->mpath[i].weight_out == 0) continue; /* a path with zero weight is never selected */ 1137 gcd = CALCULATE_GCD(lcm, con->mpath[i].weight_out); 1138 lcm = (lcm / gcd) * con->mpath[i].weight_out; 1140 } 1142 for (j = 0; j < M; j++) { 1143 min_idx = 0; 1144 min_inc = lcm + 1; 1145 for (i = 0; i < con->path_count; i++) { 1146 p = &con->mpath[i]; if (p->weight_out == 0) continue; /* avoid division by zero */ 1147 cinc = p->selection_increment + (lcm / p->weight_out); 1148 if (cinc < min_inc) { 1149 min_idx = i; 1150 min_inc = cinc; 1151 } 1152 } 1153 O[j] = &con->mpath[min_idx]; 1154 con->mpath[min_idx].selection_increment = min_inc; 1156 for (i = 0; i < con->path_count; i++) // check if ready 1157 if (con->mpath[i].weight_out && con->mpath[i].selection_increment != min_inc) 1158 goto NEXT_SELECTION; 1159 break; 1161 NEXT_SELECTION: 1162 continue; 1163 } 1164 con->path_index = 0; 1165 con->pathselectionlength = j + 1; 1166 } 1167 <CODE ENDS> 1168 Copyright (c) 2020 IETF Trust and the persons identified as authors 1169 of the code. All rights reserved. 1171 Redistribution and use in source and binary forms, with or without 1172 modification, is permitted pursuant to, and subject to the license 1173 terms contained in, the Simplified BSD License set forth in Section 1174 4.c of the IETF Trust's Legal Provisions Relating to IETF Documents 1175 (http://trustee.ietf.org/license-info).
1177 Authors' Addresses 1179 Gabor Lencse 1180 Budapest University of Technology and Economics 1181 Magyar Tudosok korutja 2. 1182 H-1117 Budapest 1183 Hungary 1185 Phone: +36 1 463 2055 1186 Email: lencse@hit.bme.hu 1188 Szabolcs Szilagyi 1189 University of Debrecen 1190 Egyetem ter 1. 1191 H-4032 Debrecen 1192 Hungary 1194 Phone: +36 52 512 900 / 75013 1195 Email: szilagyi.szabolcs@inf.unideb.hu 1197 Ferenc Fejes 1198 Eotvos Lorand University 1199 Egyetem ter 1-3. 1200 H-1053 Budapest 1201 Hungary 1203 Phone: +36 70 545 48 07 1204 Email: fejes@inf.elte.hu 1206 Marius Georgescu 1207 RCS&RDS 1208 Strada Dr. Nicolae D. Staicovici 71-75 1209 Bucharest 030167 1210 Romania 1212 Phone: +40 31 005 0979 1213 Email: marius.georgescu@rcs-rds.ro