Network Working Group                                          G. Lencse
Internet Draft                         Budapest Univ. of Tech. and Econ.
Intended status: Experimental                               Sz. Szilagyi
Expires: June 2019                                                F. Fejes
                                                  University of Debrecen
                                                             M. Georgescu
                                                                  RCS&RDS
                                                        December 10, 2018

                  MPT Network Layer Multipath Library
                     draft-lencse-tsvwg-mpt-03.txt

Abstract

Although several contemporary IT devices have multiple network
interfaces, communication sessions are restricted to use only one of
them at a time due to the design of the TCP/IP protocol stack: the
communication endpoint is identified by an IP address and a TCP or UDP
port number. The simultaneous use of these multiple interfaces for a
communication session would improve the user experience through higher
throughput and improved resilience to network failures.

MPT is a network layer multipath solution, which provides a tunnel
over multiple paths using the GRE-in-UDP specification, thus being
different from both MPTCP and Huawei's GRE Tunnel Bonding Protocol.

MPT can also be used as a router, routing the packets among several
networks between the tunnel endpoints, thus establishing a multipath
site-to-site connection.

The version of the tunnel IP and the version of the path IP are
independent of each other, therefore MPT can also be used for IPv6
transition purposes.

Status of this Memo

This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference material
or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html

This Internet-Draft will expire on June 10, 2019.

Copyright Notice

Copyright (c) 2018 IETF Trust and the persons identified as the
document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents carefully,
as they describe your rights and restrictions with respect to this
document. Code Components extracted from this document must include
Simplified BSD License text as described in Section 4.e of the Trust
Legal Provisions and are provided without warranty as described in the
Simplified BSD License.

Table of Contents

1. Introduction ................................................. 3
   1.1. Design Assumptions ...................................... 3
   1.2. MPT in the Networking Stack ............................. 3
   1.3. Terminology ............................................. 4
   1.4. MPT Concept ............................................. 5
2. Conventions Used in this Document ............................ 6
3. Operation Overview ........................................... 6
4. MPT Control .................................................. 8
   4.1. Configuration Information ............................... 8
      4.1.1. General Information for the MPT Server ............. 8
      4.1.2. Connection Specifications .......................... 9
   4.2. MPT Configuration Commands ............................. 13
5. Possible Mappings of the Tunnel Traffic to Paths ............ 14
   5.1. Per Packet Based Mapping ............................... 17
   5.2. Flow Based Mapping ..................................... 18
   5.3. Combined Mapping ....................................... 19
6. Packet Reordering ........................................... 19
7. Why is MPT Considered Experimental? ......................... 20
   7.1. Published Results ...................................... 21
      7.1.1. MPT Concept and First Implementation .............. 21
      7.1.2. Estimation of the Channel Aggregation Capabilities  21
      7.1.3. Demonstrating the Resilience of an MPT Connection . 21
   7.2. Open Questions ......................................... 21
      7.2.1. Parameters ........................................ 21
      7.2.2. Development of Further Mapping Algorithms ......... 21
      7.2.3. Performance Issues ................................ 22
   7.3. Implementation ......................................... 22
8. Security Considerations ..................................... 22
9. IANA Considerations ......................................... 22
10. Conclusions ................................................ 23
11. References ................................................. 23
   11.1. Normative References .................................. 23
   11.2. Informative References ................................ 23
12. Acknowledgments ............................................ 25
Appendix A. Sample C code for calculating the packet sending order  26
1. Introduction

MPT is a multipath extension of the GRE-in-UDP encapsulation
[RFC8086].

1.1. Design Assumptions

MPT is intended to be used as a preconfigured tunnel, and its use does
not require any modification of the applications that use the TCP/IP
socket interface API.

1.2. MPT in the Networking Stack

The layer architecture of MPT is shown in Fig. 1. MPT extends the
GRE-in-UDP [RFC8086] architecture by allowing multiple physical paths.
In this respect it can be compared to MPTCP [RFC6824], but unlike
MPTCP, MPT uses UDP in the underlying layer, builds on GRE-in-UDP, and
provides a tunnel IP layer, over which both UDP and TCP can be used.
The aim of Huawei's GRE Tunnel Bonding Protocol [RFC8157] is also
similar to that of MPT: it targets bonded access to wired and wireless
networks on customer premises. However, it uses GRE (not GRE-in-UDP),
which is less widely supported in ISP networks than UDP, and it seems
to limit the number of physical interfaces to two. For a comparison of
MPT with other multipath solutions, please refer to [Alm2017].

   +---------------------------------------------+
   |             Application (Tunnel)            |
   +---------------------------------------------+
   |               TCP/UDP (Tunnel)              |
   +---------------------------------------------+
   |              IPv4/IPv6 (Tunnel)             |
   +---------------------------------------------+
   |                  GRE-in-UDP                 |   +-----+
   +----------------------+----------------------+<--| MPT |
   |    UDP (Physical)    |    UDP (Physical)    |   +-----+
   +----------------------+----------------------+
   | IPv4/IPv6 (Physical) | IPv4/IPv6 (Physical) |
   +----------------------+----------------------+
   |    Network Access    |    Network Access    |
   +----------------------+----------------------+

           Figure 1: MPT Layer Architecture

1.3. Terminology

This document uses a number of terms that are either MPT specific or
have a defined meaning in the context of MPT, as follows:

MPT server: An MPT server is a piece of software that implements
network layer multipath communication by providing a UDP tunnel
(called a "connection" in MPT terminology) over several underlying
"paths".

MPT client: An MPT client is a software tool used to control the local
MPT server (e.g. to start/stop connections, add paths to connections,
etc.).

Connection: An MPT connection (also referred to as a communication
session) is a UDP tunnel between two MPT servers, which can be used to
carry user data. A connection can be established over one or more
paths. A connection is initiated on the basis of a "connection
specification".

Path: A path refers to the pair of network cards of the end nodes
(identified by the pair of IP addresses of the cards). When a
specified path is used, packet transmission runs between the given
pair of network cards.

Connection specification: A connection specification is stored in a
configuration file and is used by an MPT server to establish an MPT
connection with another MPT server. It contains all the configuration
information for the connection (e.g. endpoint IP versions and
addresses, number of paths and configuration information for all
paths). The precise definition of the connection specification can be
found in Section 4.1.2.

Data port: Data port means the GRE-in-UDP port defined in [RFC8086]
as 4754.
It is used for transmitting data packets.

Local command port: An MPT server accepts commands from the MPT client
at the local command port.

Remote command port: An MPT server MAY accept commands from other MPT
servers at the remote command port.

Data plane: The parts and functions of MPT that are responsible for
handling user data packets.

Control plane: All parts and functions of MPT except the data plane,
e.g. handling connections and paths, all communication through the
local or remote command ports, etc.

1.4. MPT Concept

When an MPT server is started, it reads its configuration files, and
depending on their contents, it MAY wait for and accept connection(s)
initiated by other MPT server(s) and/or it MAY initiate one or more
MPT connection(s) with other MPT server(s). In the simplest case, the
MPT server uses the connection specifications described in its
configuration files for initiating connections. In addition, new
connections MAY be established, connections MAY be closed, and the
parameters of the connections MAY be changed dynamically later (e.g.
some paths may be switched on and off) by issuing the appropriate
commands using an MPT client.

MPT connections between MPT servers implement tunnels. The traffic
coming from the tunnel interface is distributed over the active paths
of the MPT connection by the MPT server. There are three possible
mappings (see Section 5 for details and illustrative examples):

   o  per packet based mapping, where a mapping decision is made for
      every single packet

   o  flow based mapping, where the flows, identified by the usual five
      tuple, are always mapped to a given path

   o  combined mapping, where the flows, identified by the usual five
      tuple, are always mapped to a given connection, and a mapping
      decision is made for every single packet of each connection

The peer MPT server receives and de-encapsulates the traffic from the
different paths and restores the tunnel traffic, using the optional
GRE sequence numbers for packet reordering if necessary.

2. Conventions Used in this Document

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119].

In this document, these words will appear with that interpretation
only when in ALL CAPS. Lower case uses of these words are not to be
interpreted as carrying the significance described in RFC 2119.

3. Operation Overview

In this section, we describe the operation of the data plane; the
operation of the control plane is described in Section 4.

The data packet transmission and reception mechanism of MPT is
summarized in Fig. 2. In the following, we trace the route and
processing of the data packets.

When a packet is read from the tunnel interface, the MPT software
looks up the appropriate connection specification, which determines
the mapping of the packets to the paths. The connection specification
determines the behavior of the multipath communication, especially the
distribution of the packets among the paths (see Section 5 for
possible mapping methods). The path is selected and the user data
packet is encapsulated into a GRE-in-UDP data unit, which may
optionally contain a GRE Sequence Number for reordering.
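As an illustration, the following minimal sketch (not taken from the
MPT sources; all names are illustrative, and the GRE header fields
used here are explained in the next paragraph) shows how such a
GRE-in-UDP data unit could be built and sent on the selected path,
assuming that a UDP socket has already been opened on the physical
interface of that path:

   #include <stdint.h>
   #include <string.h>
   #include <sys/types.h>
   #include <sys/socket.h>
   #include <arpa/inet.h>

   #define GRE_S_FLAG 0x1000  /* Sequence Number Present bit (RFC 2890) */

   /* Encapsulate one tunnel packet (pkt, len) into GRE-in-UDP and send
    * it on the selected path. "dst" holds the peer's path IP address
    * and the GRE-in-UDP destination port 4754; "proto" is 0x0800 for an
    * IPv4 or 0x86DD for an IPv6 tunnel packet; a GRE Sequence Number is
    * added only if use_seq is nonzero. */
   ssize_t mpt_send_gre_in_udp(int fd, const struct sockaddr *dst,
                               socklen_t dst_len, const uint8_t *pkt,
                               size_t len, uint16_t proto,
                               int use_seq, uint32_t seq)
   {
       uint8_t buf[8 + 9000];            /* GRE header + tunnel packet   */
       size_t hdr = use_seq ? 8 : 4;     /* 4 octets, 8 with sequencing  */
       uint16_t flags = htons(use_seq ? GRE_S_FLAG : 0);
       uint16_t ptype = htons(proto);
       uint32_t nseq = htonl(seq);

       if (len > sizeof(buf) - hdr)
           return -1;
       memcpy(buf, &flags, 2);           /* C=0, K=0, S set if needed    */
       memcpy(buf + 2, &ptype, 2);       /* EtherType of tunnel packet   */
       if (use_seq)
           memcpy(buf + 4, &nseq, 4);    /* optional GRE Sequence Number */
       memcpy(buf + hdr, pkt, len);      /* tunnel packet = GRE payload  */

       /* The UDP and IP headers of the physical path are added by the
        * kernel; the path is selected by the destination address. */
       return sendto(fd, buf, hdr + len, 0, dst, dst_len);
   }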
The simplest GRE header contains 4 octets: 16 bits of 264 zeros and 16 bits of protocol type identification value (i.e. 0x86DD 265 in the case of using IPv6 on the tunnel interface, or 0x0800 in the 266 case of IPv4). Then the GRE-in-UDP data unit is encapsulated into 267 the UDP/IP data unit of the selected path, where the destination UDP 268 port number is the 4754 GRE-in-UDP port and the IP addresses (either 269 both IPv4 or both IPv6) are determined by the path definition. 270 Finally, the encapsulated packet is transmitted through the physical 271 interface. The encapsulation of the different protocol data units is 272 shown in Fig. 3. 274 +------------------------------------------------+ 275 | Tunnel Interface | 276 +------------------------------------------------+ 277 || /\ 278 || || 279 ############||######################||############ 280 # \/ || # 281 # +--------------------+ +--------------------+ # 282 # | data packet | | forwarding to the | # +----------+ 283 # | reading | | tunnel interface | # | Control | 284 # +--------------------+ +--------------------+ # | Protocol | 285 # || /\ # | PDU | 286 # \/ || # +--------- + 287 # +--------------------+ +--------------------+ # || 288 # | checking connection| | packet reordering | # || 289 # | specification | | (optional) | # || ########### 290 # +--------------------+ +--------------------+ # \/ # MPT # 291 # || /\ #<======# Control # 292 # \/ || # # # 293 # +--------------------+ +--------------------+ # ########### 294 # | path selection, | | data | # 295 # | GRE-in-UDP encaps. | | checking | # 296 # +--------------------+ +--------------------+ # 297 # || /\ # 298 # \/ || # 299 # +--------------------+ +--------------------+ # 300 # | physical data | | data packet | # 301 # | transmission | | reading | # 302 # +--------------------+ +--------------------+ # 303 # || /\ # 304 ############||######################||############ 305 || || 306 \/ || 307 +------------------------------------------------+ 308 | Physical Interface | 309 +------------------------------------------------+ 311 Figure 2: Conceptual architecture of MPT working mechanism 313 When a packet is read from the physical interface, its destination 314 UDP port number is the 4754 GRE-in-UDP port. MPT reads the packet, 315 identifies the connection the packet belongs to (by the source and 316 destination IP addresses of the tunnel IP header) and runs checking 317 mechanisms (e.g. connection validity check, GRE sequence number 318 check or GRE Key value check, if present). If all the checking 319 mechanisms finish successfully and no reordering is necessary, then 320 the packet is promptly transmitted to the Transport and Application 321 Layers through the tunnel interface. If reordering is on and GRE 322 sequence number indicates that one or more data unit(s) are missing, 323 then the packet is placed into a buffer array for reordering 324 purposes. (Reordering is discussed in Section 6.) 326 +----------+-----------+---------+-----------+---------+-----------+ 327 | path IP | path UDP | GRE-in- | tunnel IP | tunnel |application| 328 | v4 or v6 |(port 4754)| UDP | v4 or v6 | TCP/UDP | data | 329 +----------+-----------+---------+-----------+---------+-----------+ 331 Figure 3: PDU encapsulation of the MPT data communication 333 4. MPT Control 335 A connection can be established between two MPT servers in two ways: 337 1. When the MPT server is started, it establishes the connection on 338 the basis of a connection specification from the configuration 339 files. 
In this case, the connection specification contains all 340 the necessary parameters. MPT client commands still can be used 341 to modify the parameters, switch off and on paths, etc. as 342 described in Section 4.2. 344 2. The connection is established by using MPT client commands. In 345 this case the command line arguments of the MPT commands and 346 configuration files contain the necessary parameters. 348 4.1. Configuration Information 350 The MPT configuration files contain various pieces of information. 351 They can be divided into two groups: 353 1. general information for the MPT server 355 2. connections specification(s) 357 4.1.1. General Information for the MPT Server 359 The MPT configuration file is made up of sections. The "general" 360 section MUST be present and it contains general information for the 361 operation of the MPT server, whereas there MAY several sections 362 follow, each of which describes a different tunnel. 364 The general section MUST contain the following elements: 366 o tunnel number: the number of tunnels to create (they are to be 367 described in separate sections) 369 o accept remote: it is a key (yes/no) whether this MPT server 370 should accept commands from other MPT servers to build up 371 connections, which are not defined in the local configuration 372 files 374 o local command port: the port number on which the local MPT client 375 software can give commands to the MPT server 377 o command timeout: the timeout value for the MPT client 379 For each tunnel, a separate section is to be used to describe the 380 following parameters: 382 o name: The name is used by the operating system to access to the 383 interface. 385 o MTU: The maximum transmission unit of the tunnel interface. For 386 the Ethernet environment, the value should be set between 1436 387 and 1468 (depending on the additional header sizes, used by the 388 actual system). It can be calculated as: 1500- 389 Path_IP_header_size-UDP_header_size-GRE_header_size. 391 o ipv4_addr: IPv4 address and mask 393 o ipv6_addr: IPv6 address and mask 395 Note that both ipv4_addr and ipv6_addr MAY be present. At least one 396 of them MUST be present. 398 It is important that the same tunnel may be used by several 399 connections. A connection can be uniquely identified by the IP 400 addresses of the two endpoints, which have to be of the same type 401 (IPv4 or IPv6). 403 4.1.2. Connection Specifications 405 A connection specification is made up of sections. The "connection" 406 section contains parameters that are to be specified only ones for 407 each connection. The "paths" section contains one or more path 408 definitions. The optional "networks" section contains network 409 definitions for routing purposes. 411 The general section (called "connection") MUST contain the following 412 elements: 414 o name: The unique name of the connection. If we use multiple 415 connections, the name must uniquely identify the connection. 417 o permissions: There MAY be SEND and RECEIVE permissions, which 418 allow sending and receiving connection updates. The term SEND 419 means that the local MPT environment is allowed to start 420 configuration change to the peer. The term RECEIVE means that the 421 peer is allowed to start a configuration change, and the local 422 MPT environment will accept it. (The actual execution of the 423 requested change depends on further conditions, e.g. successful 424 authentication.) 426 o IP version: its possible values are 4 or 6. 
428 o local IP address: must be of the IP version specified above and 429 must be the same as defined for the tunnel. 431 o remote IP address: the IP address of the remote peer, must be of 432 the IP version specified above 434 o local data port number: used for data communication, SHOULD be 435 set to the 4754 GRE-in-UDP port number 437 o remote data port number: used for data communication, SHOULD be 438 set to the 4754 GRE-in-UDP port number 440 o remote command port number: The UDP port number of the peer, 441 which is used to accept control commands. If the local MPT client 442 starts an MPT command (e.g. turning off a path usage), the MPT 443 server will communicate this action to the peer by using the 444 remote command port number as the destination port number. 446 o path count: The key is an integer P, denoting the number of paths 447 defined for this connection. The minimum value is 1, the maximum 448 value is implementation dependent, e.g. 20. This configuration 449 file MUST have P sections (usually named [path_n]), where 0<=n 0). If ordered packet 481 transmission is required, maximum buffer delay specifies the 482 maximum time (in milliseconds) while the packet may be stored in 483 the buffer-array. 485 o authentication key: The authentication key contains the key value 486 of the control communication authentication. Some algorithms do 487 not need authentication keys. In this case the specification of 488 the authentication key is not necessary, or will be ignored. 490 A path definition section MUST contain the following elements: 492 o interface name: The value is the name of the physical interface 493 used by the given path for packet forwarding (e.g. eth0, wlan0). 495 o IP version: Specifies the version of IP used by the path. The 496 value can be 4 or 6. 498 o public IP address: Specifies the public IP address of the 499 interface used for the tunnel communication. If the host is 500 placed into the Global Address Realm, the public IP address is 501 the IP address of the interface, otherwise (i.e. when the host is 502 behind a NAT-Box) it is the public address assigned by the NAT- 503 Box to the tunnel communication session. If the path uses IPv4 504 and NAT, then the special address value of 0.0.0.0 can be used to 505 force the MPT server program to determine the public IP address 506 automatically. 508 o remote IP address: Indicates the public IP address of the remote 509 endpoint. 511 o gateway IP address: The IP address of the gateway, used to reach 512 the peer (i.e. remote IP address) using the given path. If the 513 operating system uses the Network Manager (nmcli) software for 514 network configuration, then the value of 0.0.0.0 can be used to 515 find the gateway of the named interface automatically. 517 o weight out: This is the "weight of the path" in the system 518 expressing the estimated transmission capacity of the path. The 519 MPT server program distributes the outgoing packets between the 520 available paths according to their weights, if per packet based 521 mapping is used. The value must be between 1 and 10,000. 523 o status: This key means the initial state of the path after 524 starting the MPT server. The value "up" means that the path is 525 usable (working), and the state of the path is OK. If required, 526 may be set initially as "down". 528 A path definition section MAY contain the following elements: 530 o private IP address: The IP address of the physical interface. 
Can 531 be omitted, if the public IP address is assigned directly to the 532 interface. When using IPv4 and NAT, the special value of 0.0.0.0 533 can be used to force the MPT server application to read and use 534 the first IPv4 address assigned to the interface. 536 o keepalive time: The MPT system monitors the availability of each 537 path by sending keepalive messages regularly. The key specifies 538 the frequency (i.e. the time between the keepalive messages in 539 seconds) that the MPT server uses for sending keepalives. The 540 value of zero (which is the default value) means switching off 541 the keepalive mechanism. 543 o dead time: If the keepalive mechanism is active, and the host 544 does not receive any keepalive message on the given path from the 545 peer for dead time seconds, then the path is considered as "dead" 546 and will not be used for data transmission. (The default value is 547 3*keepalive time.) 549 o weight in: This field is used at the "mpt path up" command (see 550 Section 4.2) to set the outgoing weight of the corresponding path 551 at the peer. The default value is 1. 553 o command default: This key can be used to specify one path as the 554 default path for control command communication. In the case of 555 receiving the control command of "create connection", the system 556 will use this path for the control communication. 558 The optional "networks" section contains network definitions for 559 routing purposes. Each network definition begins with its name in 560 the [net_n] format and contains the following parameters: 562 o IP version: Specifies the version of IP used in the network 563 definitions. The value can be 4 or 6. 565 o source address: specifies the source network and its prefix 566 length in the CIDR notation. 568 o destination address: specifies the destination network and its 569 prefix length in the CIDR notation. 571 The network configuration can also be used to provide multipath 572 Internet connection by specifying 0.0.0.0/0 as destination address 573 and prefix length. (The source is our tunnel address in this case.) 575 4.2. MPT Configuration Commands 577 The same control interface is used for the local administration of 578 the MPT server (by the MPT client accessing the MPT server at the 579 local command port through the loopback interface) and for the 580 communication of the local MPT server with the remote MPT server 581 (accessing it at its remote command port). 583 Now, some client commands will follow. Although some of the syntax 584 of our MPT implementation will be used, the focus is not on their 585 syntax, which may be implementation dependent, but rather on their 586 functionalities. The execution of these commands may also involve 587 communication between the local MPT server and a/the remote MPT 588 server. 590 mpt address {add|del} IPADDRESS/PREFIX dev INTERFACE 592 An IPv4 or IPv6 address can be added to or deleted from a (local) 593 interface. 595 mpt interface INTERFACE {up|down} 597 The specified interface is turned up or down plus all the paths, 598 that are based on the given local physical interface are also turned 599 on or off by starting the "mpt path {up|down}" command (see below) 600 for each considered path. 602 mpt path PATH {up|down} 604 This command can be used to turn on or off a specified path. If the 605 path status is changed to down, then it is not used by the 606 connection, (i.e. no data is sent through that path by the MPT 607 software). 
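For example, assuming that the configuration files define a path named
path_2 on the WiFi interface (path naming and the exact client command
syntax are implementation dependent), that path can be excluded from
the data transmission and later re-enabled as follows:

   mpt path path_2 down
   mpt path path_2 up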
mpt connection CONNECTION {create|delete}

This command can be used to establish or tear down a connection
between the local and a remote MPT server. (The parameters are taken
from the local configuration files.) If the remote server is
configured to do so, it accepts the parameters of the connection from
the local server.

mpt save [FILENAME]

The current configuration can be changed during runtime by remote
peers. (This can be enabled with the accept remote key and with the
permissions key.) This command is used to write these connection
changes to the configuration files, so that the new settings remain in
effect after a server restart or after mpt reload.

mpt reload [FILENAME]

Warm restart: the MPT server builds up its connections according to
its configuration files. (Our implementation only establishes
connections; it does not tear them down.)

5. Possible Mappings of the Tunnel Traffic to Paths

The data packets coming from the tunnel interface must be forwarded
through one of the active paths of the connection. Three possible
mapping solutions are proposed:

   o  Per packet based mapping means that the tunnel traffic is
      distributed among the paths on the basis of the parameters of
      the paths only, regardless of which network flow a given packet
      belongs to.

   o  Flow based mapping means that packets which belong to a given
      network flow, identified by the usual five tuple of source IP
      address, destination IP address, source port number, destination
      port number, and protocol number (TCP or UDP), or by the three
      tuple of source IP address, destination IP address, and protocol
      number (TCP, UDP or ICMP), are always mapped to the same path.

   o  Combined mapping means the combination of the above two: packets
      which belong to a given network flow, identified as described
      above, are always mapped to the same connection, and the packets
      that belong to a connection are distributed among the paths of
      that connection by per packet decisions on the basis of the
      parameters of the paths of the connection.

We illustrate the three mapping solutions by examples.

Definitions for the examples:

Computers A and B are interconnected by 3 different paths:

   path_1: 100Base-TX Ethernet

   path_2: 802.11g WiFi

   path_3: LTE

Connection_1 has 3 paths with the following weight out values:

   path_1: 5

   path_2: 2

   path_3: 3

Example 1 (Per packet based mapping)

All the traffic between the two computers is distributed among the
three paths of Connection_1 proportionally to their weight out values.
A decision is made about every single packet as described in Section
5.1, regardless of which application it belongs to.

Advantage: The transmission capacity of all the paths can be utilized.

Disadvantage: There is no possibility of using different mappings for
different applications.

Example 2 (Per flow based mapping)

Based on the destination port number or port range, the traffic of
different applications is mapped to paths as follows:

   HTTP, VoD: path_1

   FTP, Bit-Torrent: path_2

   VoIP: path_3

Advantage: Applications can be differentiated: e.g. the delay critical
VoIP traffic can use LTE, whereas the free WiFi is satisfactory for
the non-mission-critical Bit-Torrent traffic.
Disadvantage: The mapping of the traffic is too rigid: all the traffic
of applications of a given type is mapped to a single path; therefore,
the applications (and thus their users) do not experience the benefits
of multipath transmission.

Example 3 (Combined mapping)

We define two further connections:

   Connection_2

      path_1: 5

      path_2: 2

   Connection_3

      path_1: 5

      path_3: 3

Based on the destination port number or port range, the traffic of
different applications is mapped to connections as follows:

   HTTP: Connection_1

   FTP, Bit-Torrent: Connection_2

   VoIP, VoD: Connection_3

Advantage: The applications may benefit from the multipath
transmission, while each type of application uses those paths that are
beneficial and affordable for it.

Disadvantage: The price of the above flexibility is the time and
computational complexity of executing both algorithms.

Conclusion: The appropriate choice of the mapping algorithm depends on
the expectations of the user.

5.1. Per Packet Based Mapping

The aim of the "per packet based" mapping is to distribute the tunnel
traffic among the paths proportionally to their transmission
capacities. This mapping facilitates the aggregation of the
transmission capacities of the paths.

In MPT, the transmission capacity of the paths is represented by their
weight out parameter.

The following algorithm calculates the sending vector, which contains
the indices of the paths in the order they are to be used for
transmission.

ALGORITHM calculate_sending_vector

INPUT: W[i] (1 <= i <= N), the vector of the weights of the paths.
(Note: we have N paths with indices 1, ..., N.)

OUTPUT: O[j] (1 <= j <= M), the sending vector containing the indices
of the paths, where M is the length of the sending cycle.

   lcm := Least Common Multiple of (W[1], ... , W[N])

   M := 0

   s[i] := 0, for all i (1 <= i <= N)

   (Note: s[i] will store the sum of the increments for path i, where
   the increment is lcm/W[i])

   WHILE TRUE DO

      z := min(s[1]+lcm/W[1], ... , s[N]+lcm/W[N])

      k := the smallest index i, for which z == s[i]+lcm/W[i]

      M := M+1

      s[k] := z

      O[M] := k

      IF s[i] == z for all i (1 <= i <= N) THEN RETURN

   DONE

END

For example, for the weight out values of Connection_1 above (5, 2,
3), lcm = 30 and the algorithm produces the sending vector
(1, 3, 1, 2, 1, 3, 1, 1, 2, 3): in every cycle of 10 packets, path_1
carries 5, path_2 carries 2 and path_3 carries 3 packets.

Sample C code can be found in Appendix A.

5.2. Flow Based Mapping

The aim of the flow based mapping is to be able to distinguish the
packets belonging to different network flows and to map them to the
path that was configured for them (e.g. WiFi is used for torrent
traffic and LTE is used for VoIP calls).

Our current implementation realizes a port-based flow mapping. It is
possible to select the interface for the outgoing traffic based on the
transport protocol and port. For communication between two MPT
servers, one can precisely specify which flow is mapped to which path.

The configuration of the mechanism is simple. Four new keys can be
added to the definition of paths:

   tcp_dst - TCP destination port matches

   tcp_src - TCP source port matches

   udp_dst - UDP destination port matches

   udp_src - UDP source port matches

All of these keys are optional, and each of them can list several
ports. Traffic to ports that are not listed continues to be mapped on
a per packet basis.
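The following sketch (purely illustrative; the structure and function
names are hypothetical and do not come from the MPT sources) shows how
such a port-based classification can be combined with the per packet
based fallback when a path is selected for an outgoing tunnel packet:

   #include <stddef.h>
   #include <stdint.h>

   struct path_ports {
       const uint16_t *tcp_dst; size_t n_tcp_dst;   /* tcp_dst = ... */
       const uint16_t *tcp_src; size_t n_tcp_src;   /* tcp_src = ... */
       const uint16_t *udp_dst; size_t n_udp_dst;   /* udp_dst = ... */
       const uint16_t *udp_src; size_t n_udp_src;   /* udp_src = ... */
   };

   static int match(const uint16_t *list, size_t n, uint16_t port)
   {
       for (size_t i = 0; i < n; i++)
           if (list[i] == port)
               return 1;
       return 0;
   }

   /* Returns the index of the first path whose port lists match the
    * packet, or -1 if no list matches (the packet is then mapped by
    * the per packet based algorithm of Section 5.1). */
   int select_path_by_flow(const struct path_ports *paths,
                           int path_count, uint8_t ip_proto,
                           uint16_t sport, uint16_t dport)
   {
       for (int i = 0; i < path_count; i++) {
           const struct path_ports *p = &paths[i];
           if (ip_proto == 6 &&   /* TCP */
               (match(p->tcp_dst, p->n_tcp_dst, dport) ||
                match(p->tcp_src, p->n_tcp_src, sport)))
               return i;
           if (ip_proto == 17 &&  /* UDP */
               (match(p->udp_dst, p->n_udp_dst, dport) ||
                match(p->udp_src, p->n_udp_src, sport)))
               return i;
       }
       return -1;  /* fall back to per packet based mapping */
   }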
The current implementation of MPT with flow based mapping can be found
at [MptFlow].

In the example below, each outgoing TCP packet with destination port
80, 443, or 8080 and each UDP packet with destination port 5901 will
be sent on path_1. TCP packets with source port 7880 or 56000 will be
sent on path_2.

Example (flow based mapping configuration snippet)

   [path_1]

   ...

   tcp_dst = 80 443 8080
   udp_dst = 5901

   [path_2]

   ...

   tcp_src = 7880 56000

5.3. Combined Mapping

TBD

6. Packet Reordering

As the delays of the different paths can differ, the packets of a
sequence may arrive reordered. The MPT environment offers an optional
feature to ensure in-order packet delivery for the tunnel
communication. If this feature is enabled, the receiver uses a
buffer-array to store the incoming (unordered) packets. Then the
packets are sorted according to the GRE sequence numbers, thus
ensuring ordered delivery to the receiver's tunnel interface.

There are two parameters that control the reordering. The reorder
window parameter specifies the length of the buffer-array used for
reordering. The maximum buffer delay parameter specifies the maximum
time (in milliseconds) for which a packet may be stored in the
buffer-array. If a packet has been held in the buffer-array for the
specified time, it is transmitted to the tunnel interface even if some
packets preceding it are still missing. The missing packets are
considered lost (i.e. we do not wait for them any longer). The packets
that have arrived are transferred to the tunnel interface according to
their GRE sequence numbers, so ordered delivery is maintained even in
the case of packet loss.

How should the values of these parameters be set?

As for maximum buffer delay, if its value is too small, then MPT may
incorrectly consider a packet lost, and if it arrives later, MPT has
to drop it to maintain in-order delivery. If its value is too large,
then packet loss is detected too late, and thus the communication
performance may decrease. Our experience shows that a feasible choice
is a few times the RTT (Round-Trip Time) of the slowest path.

As for reorder window, it MUST be large enough to store the packets
arriving at maximum line rate from all the active paths of the given
connection during a maximum buffer delay interval.

The appropriate choice of these parameters is still a subject of
research.

7. Why is MPT Considered Experimental?

We view MPT as a research area rather than a solution that is ready
for deployment. We have a working MPT implementation, but it contains
only the "per packet based" mapping of the tunnel traffic to the
paths. One of our aims in writing this Internet-Draft is to enable
others to write MPT implementations. It is our hope that the
experience gained with preparing other implementations, as well as the
results of their testing and performance analysis, will lead to a
better MPT specification, which may then serve as a standards track
specification of an improved MPT that is ready for deployment.

In this section, we summarize the most important results as well as
the open questions of MPT-related research.

7.1. Published Results
7.1.1. MPT Concept and First Implementation

The conceptual architecture of MPT, a comparison with other multipath
solutions, some details of the first implementation and some test
results are available in [Alm2017].

The user manual of the first MPT implementation and the precompiled
MPT libraries for Linux (both i386 and amd64) and Raspbian are
available from [Mpt2017].

7.1.2. Estimation of the Channel Aggregation Capabilities

The channel aggregation capabilities of an early MPT implementation,
which did not use GRE-in-UDP, were analyzed for up to twelve 100Mbps
links in [Len2015].

Some of the above tests were repeated with the current GRE-in-UDP
based MPT implementation, and the path aggregation capabilities of MPT
were compared to those of MPTCP in [Kov2016] and [Szil2018].

7.1.3. Demonstrating the Resilience of an MPT Connection

The resilience property of the early MPT implementation, which did not
use GRE-in-UDP, was demonstrated in [Alm2014] and in [Alm2015].

The fast connection recovery of the GRE-in-UDP based MPT
implementation was demonstrated in [Fej2016].

A playout buffer length triggered path switching algorithm was
developed for the GRE-in-UDP based MPT, and its effectiveness was
demonstrated by the elimination of stalling events during YouTube
video playback [Fej2017].

7.2. Open Questions

7.2.1. Parameters

The optimal (or good enough) choice of the reorder window size and
maximum buffer delay parameters is an important question, which should
be answered before MPT can be deployed.

7.2.2. Development of Further Mapping Algorithms

The current MPT implementation [Mpt2017] includes only the per packet
based mapping. For a precise specification of the other two mapping
algorithms, we would like to use our experience with them. There are
also some open questions, e.g. how to handle traffic that is neither
TCP nor UDP.

7.2.3. Performance Issues

The current MPT implementation [Mpt2017] works in user space. Thus, it
is not surprising that the multipath transmission of a given amount of
traffic by MPT results in a higher CPU load than its multipath
transmission by MPTCP [Kov2018]. How much CPU power could a kernel
space MPT implementation save?

It was also pointed out in [Kov2018] that MPT is not able to utilize
the computing power of more than two CPU cores. This is because MPT
uses only two threads (one for each direction). This is not a serious
issue when MPT is used on personal computers. However, when MPT is
used to connect several networks, it is an important question how MPT
could utilize the computing power of modern CPUs with several cores.

7.3. Implementation

A sample implementation of the MPT software is available from [MptSrc]
under the GPLv3 license. It is intended for research and
experimentation purposes only, as it has not been sufficiently tested
to be used for commercial purposes.

8. Security Considerations

Threats that apply to GRE-in-UDP tunneling apply here, too. For the
security considerations of GRE-in-UDP, please refer to Section 11 of
[RFC8086].

If an MPT server is configured to do so, its peer is allowed to build
up connections. This may lead to resource exhaustion and thus to
successful DoS (Denial of Service) attacks.

Authentication between MPT servers is optional, which may lead to
security issues.
9. IANA Considerations

Port numbers may be reserved for the local command port and the remote
command port.

10. Conclusions

We hereby publish the specification of the MPT network layer multipath
library in the hope that it can be improved by the review and comments
of the WG members and that, after several open questions have been
answered, MPT can one day mature into a production tool. We seek
interested volunteers for an independent implementation and we would
be happy to take part in research cooperation. We welcome all kinds of
feedback from anyone to make MPT better.

11. References

11.1. Normative References

[RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
           Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC8086]  Yong, L., Ed., Crabbe, E., Xu, X., and T. Herbert,
           "GRE-in-UDP Encapsulation", RFC 8086,
           DOI: 10.17487/RFC8086, March 2017.

11.2. Informative References

[Alm2014]  Almasi, B., "A solution for changing the communication
           interfaces between WiFi and 3G without packet loss", in:
           Proc. 37th Int. Conf. on Telecommunications and Signal
           Processing (TSP 2014), Berlin, Germany, Jul. 1-3, 2014,
           pp. 73-77.

[Alm2015]  Almasi, B., Kosa, M., Fejes, F., Katona, R., and L. Pusok,
           "MPT: a solution for eliminating the effect of network
           breakdowns in case of HD video stream transmission", in:
           Proc. 6th IEEE Conf. on Cognitive Infocommunications
           (CogInfoCom 2015), Gyor, Hungary, 2015, pp. 121-126,
           DOI: 10.1109/CogInfoCom.2015.7390576.

[Alm2017]  Almasi, B., Lencse, G., and Sz. Szilagyi, "Investigating
           the Multipath Extension of the GRE in UDP Technology",
           Computer Communications (Elsevier), vol. 103, no. 1, 2017,
           pp. 29-38, DOI: 10.1016/j.comcom.2017.02.002.

[Fej2016]  Fejes, F., Katona, R., and L. Pusok, "Multipath strategies
           and solutions in multihomed mobile environments", in:
           Proc. 7th IEEE Conf. on Cognitive Infocommunications
           (CogInfoCom 2016), Wroclaw, Poland, 2016, pp. 79-84,
           DOI: 10.1109/CogInfoCom.2016.7804529.

[Fej2017]  Fejes, F., Racz, S., and G. Szabo, "Application agnostic
           QoE triggered multipath switching for Android devices",
           in: Proc. 2017 IEEE International Conference on
           Communications (IEEE ICC 2017), Paris, France,
           May 21-25, 2017, pp. 1585-1591.

[Kov2016]  Kovacs, A., "Comparing the aggregation capability of the
           MPT communications library and multipath TCP", in: Proc.
           7th IEEE Conf. on Cognitive Infocommunications (CogInfoCom
           2016), Wroclaw, Poland, 2016, pp. 157-162,
           DOI: 10.1109/CogInfoCom.2016.7804542.

[Kov2018]  Kovacs, A., "Evaluation of the Aggregation Capability of
           the MPT Communications Library and Multipath TCP",
           unpublished.

[Len2015]  Lencse, G. and A. Kovacs, "Advanced Measurements of the
           Aggregation Capability of the MPT Multipath Communication
           Library", International Journal of Advances in
           Telecommunications, Electrotechnics, Signals and Systems,
           vol. 4, no. 2, 2015, pp. 41-48,
           DOI: 10.11601/ijates.v4i2.112.

[Szil2018] Szilagyi, Sz., Fejes, F., and R. Katona, "Throughput
           Performance Comparison of MPT-GRE and MPTCP in the Fast
           Ethernet IPv4/IPv6 Environment", Journal of
           Telecommunications and Information Technology, vol. 3,
           no. 2, 2018, pp. 53-59,
           DOI: 10.26636/jtit.2018.122817.

[Mpt2017]  MPT - Multipath Communication Library,
           http://irh.inf.unideb.hu/user/szilagyi/mpt/

[MptFlow]  "MPT - Multi Path Tunnel", source code version with the
           flow based packet to path mapping feature,
           https://github.com/spyff/mpt/tree/flow_mapping

[MptSrc]   "MPT - Multi Path Tunnel", source code,
           https://github.com/spyff/mpt

[RFC6824]  Ford, A., Raiciu, C., Handley, M., and O. Bonaventure,
           "TCP Extensions for Multipath Operation with Multiple
           Addresses", RFC 6824, DOI: 10.17487/RFC6824, January 2013.

[RFC8157]  Leymann, N., Heidemann, C., Zhang, M., Sarikaya, B., and
           M. Cullen, "Huawei's GRE Tunnel Bonding Protocol",
           RFC 8157, DOI: 10.17487/RFC8157, May 2017.

12. Acknowledgments

The MPT Network Layer Multipath Library was invented by Bela Almasi,
the organizer and original leader of the MPT development team.

This document was prepared using 2-Word-v2.0.template.dot.

Appendix A. Sample C code for calculating the packet sending order

<CODE BEGINS>
/* Calculates the sending order of the paths of the connection "con"
 * proportionally to their weight_out values (see Section 5.1).
 * The types connection_type and path_type, the CALCULATE_GCD macro,
 * the sending vector O[] and its maximum length M are defined
 * elsewhere in the MPT sources. */
void calculate_pathselection(connection_type *con) {
    long long lcm, min_inc, cinc;
    long gcd;
    int i, j, min_idx;
    path_type *p;

    con->pathselectionlength = 0;
    gcd = con->mpath[0].weight_out;
    lcm = gcd;
    for (i = 0; i < M; i++)
        O[i] = NULL;

    /* lcm becomes a common multiple of the weights (an overestimate
     * of the least common multiple does not change the resulting
     * sending order) */
    for (i = 0; i < con->path_count; i++) {
        gcd = CALCULATE_GCD(gcd, con->mpath[i].weight_out);
        lcm = (lcm * con->mpath[i].weight_out) / gcd;
        con->mpath[i].selection_increment = 0;
    }

    for (j = 0; j < M; j++) {
        /* select the path with the smallest increased increment sum */
        min_idx = 0;
        min_inc = lcm + 1;
        for (i = 0; i < con->path_count; i++) {
            p = &con->mpath[i];
            cinc = p->selection_increment + (lcm / p->weight_out);
            if ((p->weight_out) && (cinc < min_inc)) {
                min_idx = i;
                min_inc = cinc;
            }
        }
        O[j] = &con->mpath[min_idx];
        con->mpath[min_idx].selection_increment = min_inc;

        for (i = 0; i < con->path_count; i++) /* check if ready */
            if (con->mpath[i].selection_increment != min_inc)
                goto NEXT_SELECTION;
        break;                                /* the cycle is complete */

    NEXT_SELECTION:
        continue;
    }
    con->path_index = 0;
    con->pathselectionlength = j + 1;
}
<CODE ENDS>

Copyright (c) 2018 IETF Trust and the persons identified as authors of
the code. All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, is permitted pursuant to, and subject to the license
terms contained in, the Simplified BSD License set forth in Section
4.c of the IETF Trust's Legal Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info).

Authors' Addresses

Gabor Lencse
Budapest University of Technology and Economics
Magyar Tudosok korutja 2.
H-1117 Budapest, Hungary

Phone: +36 1 463 2055
Email: lencse@hit.bme.hu

Szabolcs Szilagyi
University of Debrecen
Egyetem ter 1.
H-4032 Debrecen
Hungary

Phone: +36 52 512 900 / 75015
Email: szilagyi.szabolcs@inf.unideb.hu

Ferenc Fejes
University of Debrecen
Egyetem ter 1.
H-4032 Debrecen
Hungary

Phone: +36 70 545 48 07
Email: fejes@openmailbox.org

Marius Georgescu
RCS&RDS
Strada Dr. Nicolae D. Staicovici 71-75
Bucharest 030167
Romania

Phone: +40 31 005 0979
Email: marius.georgescu@rcs-rds.ro