INTERNET DRAFT                                         S. Bandyopadhyay
draft-shyam-site-multi-43.txt                          October 12, 2018
Intended status: Experimental
Expires: April 12, 2019

         Solution for Site Multihoming in a Real IP Environment
                     draft-shyam-site-multi-43.txt

Abstract

   This document provides a solution for Site Multihoming of stub
   networks in a real IP environment. Each user interface in a customer
   network may have as many global unicast addresses as the number of
   service providers the network is connected with. Users can establish
   multiple connections through different service providers
   simultaneously. A customer network can maintain private address
   space for communication among its users. Customer networks can
   provide IP mobility services as well.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts.
   The list of current Internet-Drafts is at
   http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 12, 2019.

Copyright Notice

   Copyright (c) 2018 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.

Table of Contents

   1. Introduction
   2. Solution for site multihoming
   2.1. Multihoming and IP Mobility
   2.2. Selection of source and destination address
   2.2.1. Path selection
   2.2.1.1. RSVP extension for path selection from client application
   2.2.2. Link failure and switch over to an alternate route
   2.3. Implementation aspects
   2.3.1. Processing of system call 'getcommaddr'
   2.3.2. Processing of 'gethostbynamewithsrcaddr'
   2.3.3. Changes required in ip_output and ip_forwarding modules
   2.3.4. Processing of protocol input routines and socket IO
          system calls
   2.4. Multihoming and VPN
   2.5. IP Address Stacking
   3. Security Consideration
   4. IANA Consideration
   5. Normative References
   6. Informative References
   7. Author's Address

1. Introduction

   RFC3582[1] defines "multihoming" as follows:

   "A "multihomed" site is one with more than one transit provider.
   "Site-multihoming" is the practice of arranging a site to be
   multihomed."

   This is a general solution for site multihoming of stub networks in a
   real IP world irrespective of the framework supported by the service
   provider network. The solution is applicable to any customer network
   that receives globally unique IP addresses for all of its nodes and
   communicates with the rest of the world without the help of NAT[17].
   It is applicable to any version of IP, i.e. IPv4, IPv6 or any new
   generation of IP that may emerge by removing the drawbacks associated
   with IPv6[7]. Within a provider assigned address space, each customer
   network will possess as many global unicast address spaces as the
   number of service providers it is connected with. So, a user
   interface of a host may have as many global unicast addresses as the
   number of service providers the network is connected with.

   Users can maintain multiple connections through multiple service
   providers simultaneously. A customer network can maintain private IP
   addresses for communication among its users. Communication using
   private IP is restricted to private IP space for the sake of privacy.
   Customer networks can provide IP mobility support as well.

   There are many variants of UNIX systems (as well as real time
   operating systems) which make use of BSD source code for their
   implementation of the TCP/IP stack.
   The solution given below highlights
   the changes required in the BSD release 4.4 source code, using the
   notation of IPv4. It addresses issues relevant to IPv6 wherever
   applicable. All other implementations of TCP/IP have to be updated
   in a similar manner.

   In this document the term "default router" refers to the customer
   edge (CE) router that communicates with the provider network. The
   term "intermediate routers" refers to all routers other than the CE
   routers.

2. Solution for site multihoming

   RFC1122[2] made an extensive study of different aspects of
   multihoming. Some of the requirements suggested in that document
   related to UDP and the application layer were avoided for multihomed
   hosts in a connected network with a single gateway to reach the
   outside world. This was achieved by the implementation of TCP/IP by
   making sure that the interface address of an outgoing packet gets
   selected based on the route to be followed by the destination
   address. This criterion holds good in a connected environment with a
   single gateway to reach the outside world. Once more than one gateway
   comes into play to reach the outside world, either the routing table
   of the entire world has to be brought in or the existing system
   needs some enhancements to make things work.

   Whenever a customer network gets service from more than one service
   provider, the customer network can be viewed as having multiple
   source-id (user-id) spaces. Each of these IP domains gets connected
   to a different service provider through a different router. So each
   interface of the customer network may have as many global unicast
   addresses as the number of service providers it is connected with.
   The number of routing entries in the routing table will (roughly)
   become a multiple of the number of IP domains that it supports.
   Communication between any two hosts
   within the customer network will follow the traditional routing
   mechanism. In order to provide multihoming services, a host computer
   must always forward packets to the customer edge router associated
   with the IP domain of its source address while communicating with
   someone in the outside world. For example, if the interface of a
   host computer H receives IP addresses 'addr1' and 'addr2' from two
   service providers P1 and P2, which are connected through routers R1
   and R2 respectively, host H has to forward a packet to R1 when it
   uses 'addr1' as its source address to send packets to the outside
   world. So, host computers as well as the intermediate routers have
   to use default routing based on the domain of the source address in
   the IP header.

   In order to achieve this, host computers as well as intermediate
   routers need to have information related to each of their IP domains
   (net address/net mask) and the associated default router. They need
   to have a route entry per IP domain for each of the default routers.
   This information should be loaded at system start-up time.

   Routing of IP packets (in the ip_output module of the hosts and in
   the ip_forwarding module of the intermediate routers) needs to be
   modified in the following manner.

   If the destination address of a packet falls outside of its IP
   domains, the packet has to be forwarded to the default router of the
   domain that the source address belongs to.

   If the destination address in the IP header falls within any one of
   its IP domains, the usual routing mechanism has to be followed.

   If the customer network maintains a private IP domain, communication
   using private IP addresses has to be restricted to the private IP
   space.
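
   The forwarding rules above can be illustrated with a minimal sketch.
   It is not taken from the BSD sources or from this draft: the
   structure 'ip_domain', the enum 'fwd_action' and the function
   'select_next_hop' are hypothetical names chosen for illustration,
   addresses are plain 32-bit values in host byte order, and the rule
   for private address space is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Builds an IPv4 address in host byte order from dotted-quad parts. */
#define IP4(a, b, c, d) \
    (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
     ((uint32_t)(c) << 8) | (uint32_t)(d))

/* One provider-assigned IP domain and its CE ("default") router.
 * Hypothetical layout: the draft does not define this structure. */
struct ip_domain {
    uint32_t net;        /* network address */
    uint32_t mask;       /* network mask */
    uint32_t def_router; /* default router for this domain */
};

enum fwd_action {
    FWD_USUAL_ROUTING,   /* destination is inside one of the domains */
    FWD_DEFAULT_ROUTER,  /* forward to the router of the source domain */
    FWD_NO_DOMAIN        /* source address belongs to no known domain */
};

/* Returns the domain that contains addr, or NULL if there is none. */
static const struct ip_domain *find_domain(const struct ip_domain *doms,
                                           size_t n, uint32_t addr)
{
    size_t i;

    for (i = 0; i < n; i++)
        if ((addr & doms[i].mask) == doms[i].net)
            return &doms[i];
    return NULL;
}

/* Applies the two global-address rules; *next_hop is written only
 * when FWD_DEFAULT_ROUTER is returned. */
static enum fwd_action select_next_hop(const struct ip_domain *doms,
                                       size_t n, uint32_t src,
                                       uint32_t dst, uint32_t *next_hop)
{
    const struct ip_domain *d;

    assert(next_hop != NULL);
    if (find_domain(doms, n, dst) != NULL)
        return FWD_USUAL_ROUTING;
    d = find_domain(doms, n, src);
    if (d == NULL)
        return FWD_NO_DOMAIN;
    *next_hop = d->def_router;
    return FWD_DEFAULT_ROUTER;
}
```

   With two domains served by R1 and R2, a packet whose source address
   lies in the second domain and whose destination lies outside both is
   handed to R2, whatever the destination is.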

   UDP (or RAW) based servers that need to support multiple clients
   simultaneously need to respond to a client's request with the same
   source address that the client had specified as the destination
   address. In order to satisfy this, the system needs to introduce two
   system calls alongside the existing ones (i.e. read, write, send,
   sendto, recv, recvfrom):

   ssize_t recvwithdstaddr (int sockfd, char *buf, size_t nbytes,
           int flags, struct sockaddr *from, socklen_t *fromlen,
           struct sockaddr *fromcladdr, socklen_t *fromcladdrlen,
           struct sockaddr *dst, socklen_t *dstlen,
           struct sockaddr *dstcladdr, socklen_t *dstcladdrlen);

   'recvwithdstaddr' receives data together with the destination
   address specified by the sender. It is similar to 'recvfrom' with the
   additional field 'dst', which holds the address of the receiving
   interface of the host. 'fromcladdr' and 'dstcladdr' will hold the
   values of the co-located care-of addresses (see section 2.2) of
   source and destination if they happen to be mobile.

   ssize_t sendwithsrcaddr (int sockfd, char *buf, size_t nbytes,
           int flags, struct sockaddr *to, socklen_t tolen,
           struct sockaddr *dstcladdr, socklen_t dstcladdrlen,
           struct sockaddr *src, socklen_t srclen,
           struct sockaddr *srccladdr, socklen_t srccladdrlen);

   'sendwithsrcaddr' sends data specifying the source address of the
   outgoing interface of the host. It is similar to 'sendto' with
   additional parameters related to the source address. It behaves like
   'sendto' if no address is specified for 'src'. 'srccladdr' and
   'dstcladdr' will hold the values of the co-located care-of addresses
   of source and destination.

   All UDP based servers that need to support multiple clients
   simultaneously need to replace 'sendto' with 'sendwithsrcaddr' and
   'recvfrom' with 'recvwithdstaddr'.

   It has been expressed in several documents, including RFC4291[3],
   that a single interface will possess multiple IP addresses in a real
   IP environment. In these cases, all the UDP servers have to be
   updated with the system calls 'sendwithsrcaddr' and
   'recvwithdstaddr' even if a customer site gets attached to a single
   gateway to reach the outside world.

   The same logic will apply to server applications with RAW sockets.
   Server applications that are TCP based should work in the usual
   manner.

2.1. Multihoming and IP Mobility

   For a mobile node, its co-located care-of IP address[4] has to be
   bound to one of the IP addresses supported by the service providers
   (if a mobile node advertises more than one address, the home agent
   will get confused; there are other implications as well). The
   transport layer must ensure that the 'home address' gets tightly
   coupled with that particular IP address.

   A mobile node in a foreign site will have all the IP addresses
   supported by the foreign site as well as its "Home Address". As the
   mobile node will also communicate with the outside world with its
   "Home Address", the user should have a provision to choose the "Home
   Address" while initiating communication. If the mobile node makes
   use of an address of the foreign site for applications that do not
   need its "Home Address" (say, accessing a web site), the cost of
   communication will be reduced. This feature is useful when a mobile
   user is in a foreign site but remains within the same sphere of
   influence (say a user lives in one city but works in a different
   city which is in a different sphere of influence and likes to access
   the web during his working hours).

   If "Home Address" is selected for communication, the transport layer
   of the mobile node should use its care-of address as the source
   address and pass its "Home Address" as an option field in the stack.
   This is because multihoming expects the source address to be the
   deciding factor for packet forwarding.

   All the issues that need to be handled for IP mobility have been
   thoroughly discussed in section 5 of the architectural
   specification[7].

2.2. Selection of source and destination address

   If a source network is connected with 'n' service providers and the
   destination network is connected with 'm' service providers, there
   will be 'm*n' possible source-destination pairs for a connection
   between source and destination. So, an application program needs to
   select a source and a destination address before initiating
   communication with the destination.

   A system call needs to be introduced to get the source address based
   on the destination address. If an application program needs to use
   the destination address directly, it needs to use this system call.

   int getcommaddr(int sockfd, struct in_addr *dst, struct addr_pair
           *endpts);

   'addr_pair' holds the addresses of the communication end points as
   follows:

   struct addr_pair {
           struct in_addr src;
           struct in_addr dst;
   };

   'getcommaddr' returns the number of source-destination pairs for
   communication; the field 'endpts' will hold the array of these
   addresses. The array will be sorted by best possible route, best
   first. 'sockfd' is used to get the 'type of service' assigned. So,
   an application program needs to set its type of service before using
   this call.

   'getcommaddr' needs to call a routine 'getmappedaddr'[7] to resolve
   the mapped provider assigned addresses of a provider independent
   address.

   int getmappedaddr(struct in_addr *piaddr, struct in_addr *mpiaddr);

   'getmappedaddr' will return the number of mapped addresses and
   'mpiaddr' will hold their values.

   Users may use a name instead of an IP address to reach the
   destination.
   A new system call, 'gethostbynamewithsrcaddr', needs to be
   introduced as an extension to 'gethostbyname':

   struct hostent *gethostbynamewithsrcaddr(int sockfd,const char *name,
           int *nroutes, struct addr_pair *endpts);

   'gethostbynamewithsrcaddr' takes 'name' and 'sockfd' as input
   parameters and finds out the best possible route to reach the
   destination. It returns the pointer to the 'hostent' structure as
   returned by the 'gethostbyname' system call. The parameter 'nroutes'
   gets the number of possible routes, and the corresponding source and
   destination addresses are assigned to 'endpts' in sorted order.
   'sockfd' is used to get the 'type of service' assigned. So, an
   application program needs to set its type of service before using
   this call.

   An application program needs to try these address pairs from the top
   (i.e. the 0th) to establish a connection with the destination. It
   needs to bind the source address 'src' and then connect to the
   destination address 'dst'.

2.2.1. Path selection

   Paths are selected by sending RSVP messages from the user to the PE
   routers with the following changes in the respective modules.

   In order to transport a packet from one network to another, the
   provider network sets up an LSP. In RSVP[10,11], resource reservation
   is receiver-initiated. The sending application constructs the Path
   message using the RSVP SENDER_TSPEC and ADSPEC objects. The path
   properties of the ADSPEC object get modified by the network elements
   as the Path message moves from sender to receiver. The receiver
   makes use of the SENDER_TSPEC and ADSPEC objects, forms a FLOWSPEC
   object and sends it back to the network element towards the sender.
   In order to decide which path an application should select from the
   multiple possible paths due to multihoming, the composed general
   parameters of the ADSPEC object that were received by the receiver
   have to be passed back to the sender by appending them to the Resv
   message.

   For best effort service, the path is selected based on the
   widest-shortest path approach, i.e. the path having the maximum
   effective available bandwidth with minimum NUMBER_OF_IS_HOPS.
   Effective available bandwidth is calculated as

      bandwidth allocated to the customer
   ----------------------------------------- * AVAILABLE_PATH_BANDWIDTH
   gross effective bandwidth allocated to customers

   If (Effective available bandwidth > unused bandwidth
       allocated to the customer)

       Effective available bandwidth = unused bandwidth
       allocated to the customer.

   When a Path message is sent from a user to the ingress PE router for
   best-effort service, the PE router sets up an LSP with the egress PE
   router, if no LSP has already been created, and stores the path
   attributes with the ADSPEC objects. The ingress PE router sends the
   path attributes (with AVAILABLE_PATH_BANDWIDTH set to the Effective
   available bandwidth) to the sender. If the ingress PE router finds
   an existing LSP for the destination node, it sends the path
   attributes associated with that LSP.

   PE routers need to maintain a list of customers that have accessed
   the LSP, with the last time of access. At the end of each RSVP
   refresh time, a PE router needs to check the list and delete those
   entries whose last time of access is older than the RSVP refresh
   time. Gross effective bandwidth is calculated as the sum of the
   bandwidths allocated to all the customers available in the list.

   The above equation is applicable when communication takes place
   between global unicast/multicast addresses.
   In case of VPN, service
   providers allocate a fixed bandwidth path between two customer
   locations. So, when communication takes place between private
   addresses, the actual unused bandwidth of that path has to be
   returned.

   For Guaranteed bandwidth[14] and Controlled-Load service[13], the
   path is selected with MINIMUM_PATH_LATENCY with minimum
   NUMBER_OF_IS_HOPS. Sender applications also need to send PathTear
   messages for all the paths that are not selected.

   A PE router will be in a different address space than the address
   space of the customer network. As hosts need not be aware of the PE
   routers, hosts need to send queries to the CE router to get the
   address of the PE router and store it in their cache, the way it
   works with DNS.

2.2.1.1. RSVP extension for path selection from client application

   As stated above, for a client application to select a path, the RSVP
   Resv message needs to pass back the composed general parameters of
   the ADSPEC object that were received by the receiver. This is done
   by appending the default ADSPEC general parameters (service 1)
   NUMBER_OF_IS_HOPS, AVAILABLE_PATH_BANDWIDTH, and
   MINIMUM_PATH_LATENCY[15] to the FLOWSPEC object. These parameters
   need to be returned to the ingress PE router without modification.
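
   As an illustration of how a sender could use the returned
   parameters for best effort service (section 2.2.1), the following
   minimal sketch computes the effective available bandwidth and then
   picks a path. The names 'path_attr', 'effective_bw' and
   'pick_best_effort' are hypothetical. The draft does not state which
   criterion takes precedence, so the sketch assumes the conventional
   widest-shortest ordering: fewest hops first, widest bandwidth as the
   tie-break.

```c
#include <assert.h>
#include <stddef.h>

/* Path attributes returned with the Resv message.
 * Field names are illustrative, not taken from the draft. */
struct path_attr {
    unsigned hops;     /* NUMBER_OF_IS_HOPS */
    double   avail_bw; /* AVAILABLE_PATH_BANDWIDTH as set by the PE */
};

/*
 * Effective available bandwidth as computed by the ingress PE router:
 * the customer's share of the path bandwidth, capped by the bandwidth
 * the customer has left unused. All arguments use the same unit.
 */
static double effective_bw(double customer_bw, double gross_bw,
                           double path_bw, double unused_bw)
{
    double eff;

    assert(gross_bw > 0.0);
    eff = customer_bw / gross_bw * path_bw;
    return eff > unused_bw ? unused_bw : eff;
}

/*
 * Returns the index of the preferred best-effort path, or -1 if n is 0.
 * Assumed precedence: fewest hops first, then widest bandwidth.
 */
static int pick_best_effort(const struct path_attr *p, size_t n)
{
    size_t i, best = 0;

    if (n == 0)
        return -1;
    for (i = 1; i < n; i++) {
        if (p[i].hops < p[best].hops ||
            (p[i].hops == p[best].hops &&
             p[i].avail_bw > p[best].avail_bw))
            best = i;
    }
    return (int)best;
}
```

   For example, a customer allocated 10 units out of a gross 40 on a
   path with 100 units available gets an effective bandwidth of 25
   units, or 20 units if only 20 units of its allocation are unused.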

   The FLOWSPEC object for Controlled-Load service as defined in RFC
   2210[16] will appear as follows:

        31           24 23           16 15            8 7             0
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   1   | 0 (a) |    reserved           |            13 (b)             |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   2   |    5  (c)     |0| reserved    |            12 (d)             |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   3   |   127 (e)     |    0 (f)      |             5 (g)             |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   4   |  Token Bucket Rate [r] (32-bit IEEE floating point number)    |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   5   |  Token Bucket Size [b] (32-bit IEEE floating point number)    |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   6   |  Peak Data Rate [p] (32-bit IEEE floating point number)       |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   7   |  Minimum Policed Unit [m] (32-bit integer)                    |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   8   |  Maximum Packet Size [M] (32-bit integer)                     |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   9   |     4 (h)     |     (i)       |             1 (j)             |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   10  |  IS hop cnt (32-bit unsigned integer)                         |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   11  |     6 (k)     |     (l)       |             1 (m)             |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   12  |  Path b/w estimate (32-bit IEEE floating point number)        |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   13  |     8 (n)     |     (o)       |             1 (p)             |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   14  |  Minimum path latency (32-bit integer)                        |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   (a) - Message format version number (0)
   (b) - Overall length (13 words not including header)
   (c) - Service header, service number 5 (Controlled-Load)
   (d) - Length of controlled-load data, 12 words not including
         per-service header
   (e) - Parameter ID, parameter 127 (Token Bucket TSpec)
   (f) - Parameter 127 flags (none set)
   (g) - Parameter 127 length, 5 words not including per-service
         header
   (h) - Parameter ID, parameter 4 (Number-of-IS-hops param from
         [RFC 2215])
   (i) - Parameter 4 flag byte
   (j) - Parameter 4 length, 1 word not including header
   (k) - Parameter ID, parameter 6 (Path-BW param from [RFC 2215])
   (l) - Parameter 6 flag byte
   (m) - Parameter 6 length, 1 word not including header
   (n) - Parameter ID, parameter 8 (minimum path latency from
         [RFC 2215])
   (o) - Parameter 8 flag byte
   (p) - Parameter 8 length, 1 word not including header

   The FLOWSPEC object for Guaranteed bandwidth service as defined in
   RFC 2210[16] will appear as follows:

        31           24 23           16 15            8 7             0
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   1   | 0 (a) |    Unused             |            16 (b)             |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   2   |    2  (c)     |0| reserved    |            15 (d)             |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   3   |   127 (e)     |    0 (f)      |             5 (g)             |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   4   |  Token Bucket Rate [r] (32-bit IEEE floating point number)    |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   5   |  Token Bucket Size [b] (32-bit IEEE floating point number)    |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   6   |  Peak Data Rate [p] (32-bit IEEE floating point number)       |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   7   |  Minimum Policed Unit [m] (32-bit integer)                    |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   8   |  Maximum Packet Size [M] (32-bit integer)                     |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   9   |   130 (h)     |    0 (i)      |             2 (j)             |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   10  |  Rate [R] (32-bit IEEE floating point number)                 |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   11  |  Slack Term [S] (32-bit integer)                              |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   12  |     4 (k)     |     (l)       |             1 (m)             |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   13  |  IS hop cnt (32-bit unsigned integer)                         |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   14  |     6 (n)     |     (o)       |             1 (p)             |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   15  |  Path b/w estimate (32-bit IEEE floating point number)        |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   16  |     8 (q)     |     (r)       |             1 (s)             |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   17  |  Minimum path latency (32-bit integer)                        |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   (a) - Message format version number (0)
   (b) - Overall length (16 words not including header)
   (c) - Service header, service number 2 (Guaranteed)
   (d) - Length of per-service data, 15 words not including per-service
         header
   (e) - Parameter ID, parameter 127 (Token Bucket TSpec)
   (f) - Parameter 127 flags (none set)
   (g) - Parameter 127 length, 5 words not including parameter header
   (h) - Parameter ID, parameter 130 (Guaranteed Service RSpec)
   (i) - Parameter 130 flags (none set)
   (j) - Parameter 130 length, 2 words not including parameter header
   (k) - Parameter ID, parameter 4 (Number-of-IS-hops param from
         [RFC 2215])
   (l) - Parameter 4 flag byte
   (m) - Parameter 4 length, 1 word not including header
   (n) - Parameter ID, parameter 6 (Path-BW param from [RFC 2215])
   (o) - Parameter 6 flag byte
   (p) - Parameter 6 length, 1 word not including header
   (q) - Parameter ID, parameter 8 (minimum path latency from
         [RFC 2215])
   (r) - Parameter 8 flag byte
   (s) - Parameter 8 length, 1 word not including header

2.2.2. Link failure and switch over to an alternate route

   As stated in section 2.2, there are "m*n" possible routes. Client
   applications select any one of them for communication. If
   communication fails due to link failure, it may be desirable to
   switch over to an alternate route (application programs must ensure
   that this conforms to the requirements of the application).

   In reality link failure is a rare phenomenon, so detection of link
   failure should not become an overhead for the network. A fault gets
   detected first at the local site that the fault is associated with.
   Say, if the CE-PE link fails, it is the CE router that comes to know
   about it first. So, the local site needs to take the initiative for
   the switchover operation. When a failure happens, the system
   generates a trap which triggers the operation for switchover to an
   alternate route.

   The steps can be summarized as follows:

   o When a client application calls 'getcommaddr' or
     'gethostbynamewithsrcaddr', the system finds out a list of possible
     "source-destination" pairs for communication. If the number of
     routes happens to be more than one, the rest of the steps are
     followed.

   o The client application establishes a TLS [5] session with its peer
     after the connection 5-tuple gets established. After the handshake
     operation, the client application sends the list of
     source-destination pairs to its peer in secured mode.
Exchange of routes is required because failure 535 may happen in the remote site too; 536 o Both client application and its peer store security parameters of 537 TLS session and the list of source-destination routes with the 538 protocol control block (PCB) using 'setsockopt' which informs the 539 system to activate switchover operation if there is a link failure; 541 o When CE router detects failure of CE-PE link, it broadcasts an ICMP 542 message ICMP_LINKFAILURE_CE_PE_LINK to all the hosts. 544 o On receiving ICMP_LINKFAILURE_CE_PE_LINK, system goes through the 545 list of PCB and gets the list of applications for which it needs to 546 start the switchover operation. For any such particular application, 547 it prepares the list of possible routes for communication through the 548 active links. It tries to set alternate route to its peer by sending 549 ICMP message ICMP_LINKFAILURE_SET_ALT_ROUTE in secured mode with the 550 best possible route. 552 o On receiving ICMP_LINKFAILURE_SET_ALT_ROUTE, peer host checks 553 whether there is any application in the list of PCB where the request 554 will be applicable. On finding the right PCB, it sets the alternate 555 route and sends a message ICMP_LINKFAILURE_ALT_ROUTE_ESTABLISHED to 556 its peer. 558 o On receiving ICMP_LINKFAILURE_ALT_ROUTE_ESTABLISHED, system sets 559 the alternate route and completes the operation of switchover. 561 So, it introduces an ICMP message of type ICMP_LINKFAILURE 562 with the following codes: 563 ICMP_LINKFAILURE_CE_PE_LINK 1 564 ICMP_LINKFAILURE_CE_FAILURE 2 565 ICMP_LINKFAILURE_SET_ALT_ROUTE 3 566 ICMP_LINKFAILURE_ALT_ROUTE_ESTABLISHED 4 568 In order to provide secured communication it needs to depend on 569 security protocol SSL/TLS. Security parameters e.g. secret key, 570 compression method and cipher spec are stored in the PCB. ICMP 571 messages will have two parts; information in the first part, i.e. 
572 'struct icmp' will hold all the necessary information to locate the 573 connection entry in the list of PCB. The second part will hold the 574 information related to the operation and will be in encrypted form 575 with record header. So, changes within a PCB entry is allowed only 576 if ICMP message is received in a secured mode. 578 It introduces an element 'struct id_pcb' inside union 'icmp_dun' of 579 'struct icmp' as follows: 581 struct icmp { 582 u_char icmp_type; /* type of message, see below */ 583 u_char icmp_code; /* type sub code */ 584 u_short icmp_cksum; /* ones complement cksum of struct */ 585 union { 586 u_char ih_pptr; /* ICMP_PARAMPROB */ 587 struct in_addr ih_gwaddr; /* ICMP_REDIRECT */ 588 struct ih_idseq { 589 uint16_t icd_id; /* network format */ 590 uint16_t icd_seq; /* network format */ 591 } ih_idseq; 592 int ih_void; 593 /* ICMP_UNREACH_NEEDFRAG -- Path MTU Discovery (RFC1191) */ 594 struct ih_pmtu { 595 uint16_t ipm_void; /* network format */ 596 uint16_t ipm_nextmtu; /* network format */ 597 } ih_pmtu; 598 struct ih_rtradv { 599 u_char irt_num_addrs; 600 u_char irt_wpa; 601 u_int16_t irt_lifetime; 602 } ih_rtradv; 603 } icmp_hun; 604 union { 605 struct id_ts { /* ICMP Timestamp */ 606 uint32_t its_otime; /* Originate */ 607 uint32_t its_rtime; /* Receive */ 608 uint32_t its_ttime; /* Transmit */ 609 } id_ts; 610 struct id_ip { 611 struct ip idi_ip; 612 /* options and then 64 bits of data */ 613 } id_ip; 614 struct id_pcb { 615 u_char ipcb_ip_proto; /* protocol TCP/UDP */ 616 struct in_addr ipcb_laddr, /* source address */ 617 ipcb_faddr; /* destination address */ 618 u_short ipcb_lport, /* source port */ 619 ipcb_fport; /* destination port */ 620 } id_pcb; 621 struct icmp_ra_addr id_radv; 622 u_int32_t id_mask; 623 char id_data[1]; 624 } icmp_dun; 625 }; 627 'struct inpcb' of protocol control block includes four new fields 628 'inp_lf_n_routes', 'inp_lf_stat', 'inp_lf_routes' and 629 'inp_seq_params' of type SecParams (SecParams is a 
      type of struct
630   whose elements are elements of SecurityParameters as defined in
631   section 6.1 of RFC5246 [5]) as follows:

633   struct inpcb {
634      struct inpcb *inp_next, *inp_prev; /* doubly linked list */
635      struct inpcb *inp_head;     /* pointer back to chain of inpcb's for
636                                     this protocol */
637      struct in_addr inp_faddr;   /* foreign IP address */
638      u_short inp_fport;          /* foreign port# */
639      struct in_addr inp_laddr;   /* local IP address */
640      u_short inp_lport;          /* local port# */
641      struct in_addr inp_fcladdr; /* foreign care-of address */
642      struct in_addr inp_lcladdr; /* local care-of address */
643      struct in_addr inp_hagentaddr; /* address of home agent */
644      struct socket *inp_socket;  /* back pointer to socket */
645      caddr_t inp_ppcb;           /* pointer to per-protocol pcb */
646      struct route inp_route;     /* placeholder for routing entry */
647      int inp_flags;              /* generic IP/datagram flags */
648      struct ip inp_ip;           /* header prototype; should have more */
649      struct mbuf *inp_options;   /* IP options */
650      struct ip_moptions *inp_moptions; /* IP multicast options */
651      u_char inp_lf_n_routes;     /* number of possible routes */
652      u_char inp_lf_stat;         /* state of switchover;
653                                     STAT_DO_NOT_ALTER(0)/STAT_ALTER(1) */
654      struct addr_pair *inp_lf_routes; /* pointer to the array of routes */
655      SecParams inp_seq_params;   /* security parameters */
656   };

658   From the application layer, the field 'inp_seq_params' is set with the
659   system call 'setsockopt' by introducing a new socket option
660   SO_SEQPARAM of level SOL_SOCKET; route information i.e.
661   inp_lf_n_routes, inp_lf_routes and inp_lf_stat are set by system call
662   'setsockopt' by introducing another socket option SO_LFROUTES of
663   level SOL_SOCKET.
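
With SO_LFROUTES, the option length is the only place where the route
count is conveyed, so a handler has to recover 'inp_lf_n_routes' from
it. The following is a minimal sketch of that step, not part of the
draft. It assumes a two-address layout for 'struct addr_pair' (the
structure is not shown in this section), and the helper name
'lf_routes_count' and the constant LF_MAX_ROUTES are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <netinet/in.h>

/* Assumed layout: one source/destination address pair per candidate route. */
struct addr_pair {
    struct in_addr src;   /* local address of the alternate route */
    struct in_addr dst;   /* peer address of the alternate route  */
};

#define LF_MAX_ROUTES 255 /* 'inp_lf_n_routes' is a u_char */

/*
 * Derive the route count from the option length passed to setsockopt.
 * Returns 0 and stores the count on success; returns -1 if the length is
 * zero, is not a whole number of pairs, or the count exceeds a u_char.
 */
static int lf_routes_count(size_t optlen, unsigned char *n_routes)
{
    assert(n_routes != NULL);

    if (optlen == 0 || optlen % sizeof(struct addr_pair) != 0)
        return -1;

    size_t n = optlen / sizeof(struct addr_pair);
    if (n > LF_MAX_ROUTES)
        return -1;

    *n_routes = (unsigned char)n;
    return 0;
}
```

Rejecting a length that is not a multiple of the pair size keeps a
truncated array from being read as a shorter but valid route list.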
665   setsockopt (sockfd, SOL_SOCKET, SO_SEQPARAM, (char *)&seq_param,
666               sizeof(SecurityParameters));

668   setsockopt (sockfd, SOL_SOCKET, SO_LFROUTES, (char *)routes, sizeof
669               (struct addr_pair)*n_routes);

671   ICMP messages with 'icmp_code' ICMP_LINKFAILURE_SET_ALT_ROUTE and
672   ICMP_LINKFAILURE_ATL_ROUTE_ESTABLISHED will have the same format, as
673   follows:

675   Information of the current active link of the PCB entry, i.e. protocol
676   id, source address, destination address, source port and destination
677   port, is set in the fields of 'struct id_pcb' of 'struct icmp'. The
678   encrypted part of the message will have three fields: source address
679   and destination address of the alternate route, and ICMP code (i.e.
680   ICMP_LINKFAILURE_SET_ALT_ROUTE/
681   ICMP_LINKFAILURE_ATL_ROUTE_ESTABLISHED) as it was set in the ICMP
682   header.

684   The recipient of these messages needs to search for the PCB entry in
685   the following manner:

687   If 'source port', 'destination port' and 'protocol id' of the incoming
688   ICMP message match an entry in the list of PCB, and the fields
689   'source address' and 'destination address' of the ICMP message
690   match an entry of 'inp_lf_routes' of the corresponding entry
691   in the PCB, it will be considered as a match. If no matching entry is
692   found, the message has to be dropped. With the security information of
693   the PCB entry, the encrypted part of the ICMP message gets decrypted.
694   If it fails to decrypt the message, or the message is received with an
695   invalid MAC, the message needs to be dropped. If ICMP code in the header
696   does not match that of the encrypted part, the message also
697   needs to be dropped.

699   Details of the ICMP operations are described below:

701   ICMP_LINKFAILURE_CE_PE_LINK

703   The CE router detects link failure and sends this message to all the
704   users in the network. The field 'icmp_gwaddr' of 'struct icmp' holds
705   the IP address of the PE router.

707   ICMP_LINKFAILURE_CE_FAILURE

709   CE router itself may fail.
      It gets detected by the alternate CE router.
710   CE routers send keep-alive messages between themselves at regular
711   intervals to detect this failure. The field 'icmp_gwaddr' of 'struct
712   icmp' holds the IP address of the faulty CE router.

714   ICMP_LINKFAILURE_SET_ALT_ROUTE

716   This message is sent by a host after receiving ICMP broadcast message
717   ICMP_LINKFAILURE_CE_PE_LINK or ICMP_LINKFAILURE_CE_FAILURE from a CE
718   router for all the entries of PCB whose ('inp_lf_stat' = STAT_ALTER
719   and source-destination route passes through the failed link), to
720   their peer. It maintains a list of information where each entry will
721   have the connection details including the best possible route. For
722   best effort traffic, the route is selected by sending echo messages and
723   calculating round trip delay; for the rest it follows the approach
724   stated in section 2.1.1. For each entry in the list, the host sends
725   ICMP_LINKFAILURE_SET_ALT_ROUTE up to an (arbitrary) 'n' number of times
726   with an (arbitrary) interval of 't' msecs (long enough for the
727   reply to ICMP_LINKFAILURE_SET_ALT_ROUTE to come back and be
728   processed; roughly twice the round trip delay) till it receives a
729   positive acknowledgment ICMP_LINKFAILURE_ATL_ROUTE_ESTABLISHED from
730   its peer. On receiving a positive acknowledgment
731   ICMP_LINKFAILURE_ATL_ROUTE_ESTABLISHED, it deletes the corresponding
732   entry from the list and updates the route information in the PCB.

734   ICMP_LINKFAILURE_ATL_ROUTE_ESTABLISHED

736   On receiving ICMP_LINKFAILURE_SET_ALT_ROUTE, the host needs to look for
737   a match in the PCB. If there is a match, the host sends
738   ICMP_LINKFAILURE_ATL_ROUTE_ESTABLISHED to its peer on successful
739   completion of changing 'source address' and 'destination address'
740   to the desired value of the alternate route in the PCB.
      The message
741   will contain all the fields of the received message, with
742   'icmp_code' set to ICMP_LINKFAILURE_ATL_ROUTE_ESTABLISHED both in
743   the header part and in the encrypted part.

745   Switchover operation requires some amount of time. This duration is
746   under the tolerance limit for best effort traffic. For Guaranteed
747   bandwidth and Controlled-Load service, as the circuit needs to be
748   reestablished, it may cause flicker. This situation can be avoided
749   by maintaining a back-up circuit through an alternate route. As link
750   failure is a rare phenomenon, this feature can be provided on an
751   on-demand basis or based on the application type.

753   2.3. Implementation aspects

755   The following changes are expected in the source code of BSD.

757   Introduce ip_domain structure and some parameters as follows:

759   struct ip_domain {
760      struct in_addr net_addr;
761      struct in_addr net_mask;
762      struct in_addr def_router;
763   };
764   #define MAX_IP_DOMAINS 16
765   short num_ipdomains;
766   struct ip_domain *ipdomain[MAX_IP_DOMAINS];

768   If the customer network maintains a private IP domain (along with the
769   user-id space provided by the service providers) and expects its
770   communication to be confined within its own space, 'def_router' has
771   to be set as NULL.

773   Upload IP domain information for all of its IP domains during system
774   start up. This domain information can be uploaded through router
775   advertisement or through DHCP. The domain information should contain
776   the next hop address to reach the corresponding default router as
777   well.

779   There has to be a provision to upload this information through
780   'sysctl' to configure it manually.

782   Three new 'sysctl' routines have to be introduced under the 'ip' node
783   of the MIB tree (i.e. under CTL_NET, PF_INET, IPPROTO_IP):
784   IPCTL_NUM_DOMAINS, IPCTL_DOMAIN and IPCTL_USE_HOMEADDR (applicable
785   for mobile node).
      Both IPCTL_NUM_DOMAINS and IPCTL_USE_HOMEADDR are
786   of type CTLTYPE_INT and IPCTL_DOMAIN is of type CTLTYPE_NODE. Using
787   'sysctl', IPCTL_NUM_DOMAINS has to be configured first. Configuration
788   of IPCTL_NUM_DOMAINS has to populate IPCTL_NUM_DOMAINS entries of
789   nodes under IPCTL_DOMAIN, and for each of these nodes three MIB
790   attributes DOMAIN_NET_ADDR, DOMAIN_NET_MASK and DOMAIN_DEF_ROUTER
791   (each of type CTLTYPE_NODE) have to be allocated.

793   All the routers as well as hosts that have interfaces
794   connecting to multiple subnets need to be configured through
795   'sysctl'.

797   Mobile users should get provision to change the IPCTL_USE_HOMEADDR
798   attribute dynamically.

800   Add a route entry for all the default routers during system start up.

802   2.3.1. Processing of system call 'getcommaddr'

804   Introduce a routine (say 'getendpointaddr') that will find out a list
805   of source-destination addresses sorted in order based on sending
806   Path messages between a list of source addresses and a list of
807   destination addresses. The routine should select the service type
808   based on the type of service field (which can be obtained by calling
809   'getsockopt' with the socket id 'sockfd' passed as a parameter).

811   System call 'getcommaddr' has to be processed in the following
812   manner:

814   If destination address of the IP packet falls outside of its
815   IP domains {
816      If destination address is from private address space {
817         if the host is having only one interface {
818            for each private address assigned to the interface get
819            an entry for the source list.
820         }
821         else {
822            for all the default routers {
823               use 'rtalloc' to get the next hop address for the
824               default router.

826               get an entry for the source list based on
827               the outgoing interface 'ia', and the private
828               address associated with the default router.
829            }
830         }
831         destination list will have a lone entry with the
832         destination private address.
833      }
834      else {
835         If destination address is provider independent {
836            call 'getmappedaddr' to get all the associated PA addresses;
837            for each PA address get an entry of the destination list
838         }
839         else {
840            get a lone entry for the destination list with the
841            destination address.
842         }

844         If user has selected its "Home Address" {
845            /*Applicable to IP mobility*/
846            get a lone entry in the source list with the "Home Address".
847         }
848         else {
849            if the host is having only one interface {
850               for each global unicast address of the interface,
851               get an entry for the source list.
852            }
853            else {
854               for all the default routers {
855                  use 'rtalloc' to get the next hop address for the
856                  default router.

858                  get an entry for the source list based on
859                  the outgoing interface 'ia', and the global unicast
860                  address associated with the default router.
861               }
862            }
863         }
864      }
865      call 'getendpointaddr' to get the list of source-destination
866      addresses in sorted manner.
867   }
868   else { /* i.e. destination address is inside its IP domains */
869      use 'rtalloc' to get the next hop address for the
870      destination address.

872      if destination address is a link local address {
873         select source address based on the outgoing interface
874         and the link local address assigned to it.
875      }
876      else {
877         select source address based on the outgoing interface
878         and the domain that the destination address belongs to.
879      }
880      there is only one possible source-destination combination.
881   }

883   2.3.2. Processing of 'gethostbynamewithsrcaddr'

885   System call 'gethostbynamewithsrcaddr' has to be processed in the
886   following manner:

888   This is an enhancement of the system call 'gethostbyname'.
889   'gethostbyname' calls three routines that perform host table search,
890   NIS search and DNS search. Once the name is resolved, the following
891   additions are expected to resolve the source-destination pair.
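
The first decision in this procedure is whether any resolved address
lies inside one of the host's IP domains and, if so, which one becomes
the destination. The following is a minimal sketch of that selection,
not part of the draft. It keeps addresses as host-order 32-bit values
instead of 'struct in_addr', reads "private address" as the RFC 1918
IPv4 ranges (an assumption; the draft does not define the term in this
section), and only considers private addresses that are themselves
inside a domain. All function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Host-order stand-in for the draft's 'struct ip_domain'. */
struct ip_domain4 {
    uint32_t net_addr;
    uint32_t net_mask;
    uint32_t def_router;
};

/* Non-zero when addr falls inside any of the n configured domains. */
static int in_any_domain(const struct ip_domain4 *d, size_t n, uint32_t addr)
{
    for (size_t i = 0; i < n; i++)
        if ((addr & d[i].net_mask) == (d[i].net_addr & d[i].net_mask))
            return 1;
    return 0;
}

/* Assumption: "private" means the RFC 1918 ranges. */
static int is_private4(uint32_t a)
{
    return (a & 0xFF000000u) == 0x0A000000u    /* 10.0.0.0/8     */
        || (a & 0xFFF00000u) == 0xAC100000u    /* 172.16.0.0/12  */
        || (a & 0xFFFF0000u) == 0xC0A80000u;   /* 192.168.0.0/16 */
}

/*
 * Returns the index in addrs of the destination to use, or -1 when no
 * resolved address is inside a local domain (the multi-route branch
 * then applies). An in-domain private address is preferred; otherwise
 * the first in-domain global unicast address is taken.
 */
static int pick_local_destination(const uint32_t *addrs, size_t n_addrs,
                                  const struct ip_domain4 *d, size_t n_dom)
{
    int global_idx = -1;

    assert(addrs != NULL || n_addrs == 0);
    for (size_t i = 0; i < n_addrs; i++) {
        if (!in_any_domain(d, n_dom, addrs[i]))
            continue;
        if (is_private4(addrs[i]))
            return (int)i;
        if (global_idx < 0)
            global_idx = (int)i;
    }
    return global_idx;
}
```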
893   If 'hostent' structure contains addresses which are inside its IP
894   domains {
895      if 'hostent' structure contains a private address {
896         Assign destination address as a private address
897         contained in 'hostent';

899         use 'rtalloc' to get the next hop address for the
900         destination address.

902         select source address based on the outgoing interface
903         and the domain that the destination address belongs to.
904      }
905      else {
906         Select a global unicast address contained in 'hostent' for
907         destination address.

909         use 'rtalloc' to get the next hop address for the
910         destination address.

912         select source address based on the outgoing interface
913         and the domain that the destination address belongs to.
914      }
915      there is only one possible source-destination combination.
916   }
917   else {
918      if 'hostent' structure contains private address {
919         if host is having only one interface {
920            for each private address assigned to the interface get
921            an entry for the source list.

923         }
924         else {
925            for all the default routers {
926               use 'rtalloc' to get the next hop address for the
927               default router.

929               get an entry for the source list based on
930               the outgoing interface 'ia', and the private
931               address associated with the default router.
932            }
933         }
934         for each private address in the 'hostent' structure
935         get an entry for the destination list.
936      }
937      else {
938         for each PA address in the 'hostent' structure
939         get an entry for the destination list.

941         if user has selected its "Home Address" {
942            /*Applicable only to IP mobility */
943            get a lone entry in the source list with the "Home Address".
944         }
945         else {
946            if the host is having only one interface {
947               for each global unicast address of the interface,
948               get an entry for the source list.
949            }
950            else {
951               for all the default routers {
952                  use 'rtalloc' to get the next hop address for the
953                  default router.
955                  get an entry for the source list based on
956                  the outgoing interface 'ia', and the global unicast
957                  address associated with the default router.
958               }
959            }
960         }
961      }
962      call 'getendpointaddr' to get the list of source-destination
963      addresses in sorted manner.
964   }

966   2.3.3. Changes required in ip_output and ip_forwarding modules

968   Execute the following steps in the 'ip_output' routine of the IP
969   stack before it calls 'rtalloc' for route look up.

971   If destination address of the IP packet falls outside of its
972   IP domains {
973      get def router address based on the IP domain
974      the source address belongs to.

976      use 'rtalloc' to get the next hop address for the def router.

978      Forward the packet to the next hop.
979   }
980   else { /* i.e. destination address is inside its IP domains */
981      follow the usual procedure to forward packets
982   }

984   In BSD, the 'ip_forwarding' routine calls 'ip_output'; so it should
985   be left as it is.

987   2.3.4. Processing of protocol input routines and socket IO system calls

989   Protocol input routines need to locate the socket/process in the
990   usual manner with the 5-tuple (i.e. protocol, source address,
991   source port, destination address, destination port).

993   When a packet is received by a mobile node (at a foreign site), it
994   can be received in two modes. It can be received directly from the
995   correspondent node with the 'destination address' as the co-located
996   care-of address and its home address in the IP stack (see section 4.1
997   of RFC6275[8]). In the second mode the packet can be received via the
998   home agent using IP over IP. Once the IP layer receives a packet with
999   IP over IP, it is supposed to strip off the outer header before
1000  passing the packet to the protocol input routine. In this case the
1001  packet will be received by the protocol input routine with
1002  destination address as the home address of the mobile node with no
1003  information related to its care-of address.
      So, the protocol input
1004  routine needs to check whether the destination address of the
1005  received packet belongs to any one of its IP domains. If it does
1006  not, it needs to find out the co-located care-of address by going
1007  through the interface list if it is not already found in the packet
1008  received. This information is needed by the TCP input routine while
1009  processing a SYN message. It is also needed by the UDP/RAW modules
1010  while processing the system call 'recvwithdstaddr'.

1012  While processing output routines like 'sendwithsrcaddr' and
1013  'sendto', UDP/RAW modules need to check the parameters related to
1014  source address, source port, destination address, destination port,
1015  care-of address of the source and care-of address of the destination in
1016  the protocol control block. Parameters in the PCB should prevail over
1017  parameters passed by the system call while forming the IP packet.

1019  2.4. Multihoming and VPN

1021  A corporation that maintains multiple offices and communicates
1022  among them through private address space using VPN needs to
1023  distribute its entire private address space to all its sites in a
1024  suitable manner. Each one of its offices will get multiple private
1025  address spaces, each of them associated with a particular
1026  link. Let us consider that one of its offices gets connected to two
1027  providers P1 and P2 and gets address space as
1028  'unicastNetAddr1'/'unicastNetMask1' and
1029  'unicastNetAddr2'/'unicastNetMask2' respectively. It is also
1030  assigned private address space by its corporation as
1031  'privateDomainNetAddr1'/'privateDomainNetMask1' and
1032  'privateDomainNetAddr2'/'privateDomainNetMask2' which will be
1033  associated with the CE routers CE1 and CE2 respectively.
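
For the two-provider office just described, the resulting ip_domain
table and the default-router choice of section 2.3.3 can be sketched as
follows. This is an illustration, not part of the draft: the prefixes
and the CE1/CE2 addresses are documentation and private example values
chosen here, addresses are host-order 32-bit values rather than
'struct in_addr', and the names 'domain_of' and 'lookup_target' are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IP4(a, b, c, d) \
    (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
     ((uint32_t)(c) << 8) | (uint32_t)(d))

/* Host-order stand-in for the draft's 'struct ip_domain'. */
struct ip_domain4 {
    uint32_t net_addr;
    uint32_t net_mask;
    uint32_t def_router;
};

#define CE1 IP4(198, 51, 100, 1)   /* illustrative router addresses */
#define CE2 IP4(203, 0, 113, 1)

/* One entry per provider prefix and one per private prefix. */
static const struct ip_domain4 office_domains[4] = {
    { IP4(198, 51, 100, 0), IP4(255, 255, 255, 0), CE1 }, /* via P1   */
    { IP4(203, 0, 113, 0),  IP4(255, 255, 255, 0), CE2 }, /* via P2   */
    { IP4(10, 1, 0, 0),     IP4(255, 255, 0, 0),   CE1 }, /* private  */
    { IP4(10, 2, 0, 0),     IP4(255, 255, 0, 0),   CE2 }, /* private  */
};

/* First domain containing addr, or NULL when there is none. */
static const struct ip_domain4 *domain_of(const struct ip_domain4 *d,
                                          size_t n, uint32_t addr)
{
    assert(d != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if ((addr & d[i].net_mask) == (d[i].net_addr & d[i].net_mask))
            return &d[i];
    return NULL;
}

/*
 * Address to hand to the route lookup: the destination itself when it
 * is inside a local domain, otherwise the def_router of the domain the
 * source address belongs to. Returns 0 when the source belongs to no
 * configured domain.
 */
static uint32_t lookup_target(const struct ip_domain4 *d, size_t n,
                              uint32_t src, uint32_t dst)
{
    if (domain_of(d, n, dst) != NULL)
        return dst;
    const struct ip_domain4 *sd = domain_of(d, n, src);
    return sd != NULL ? sd->def_router : 0;
}
```

Because the router is chosen from the source address, a packet sent
from a P2 address leaves through CE2 even when CE1 is also reachable.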
1035  All hosts as well as the intermediate routers will have four entries
1036  of ip_domain:

1038  1: 'net_addr'   = 'unicastNetAddr1'
1039     'net_mask'   = 'unicastNetMask1'
1040     'def_router' = CE1
1041  2: 'net_addr'   = 'unicastNetAddr2'
1042     'net_mask'   = 'unicastNetMask2'
1043     'def_router' = CE2
1044  3: 'net_addr'   = 'privateDomainNetAddr1'
1045     'net_mask'   = 'privateDomainNetMask1'
1046     'def_router' = CE1
1047  4: 'net_addr'   = 'privateDomainNetAddr2'
1048     'net_mask'   = 'privateDomainNetMask2'
1049     'def_router' = CE2

1051  2.5. IP Address Stacking

1053  IP address stacking in IPv6 is performed with the approach introduced
1054  in section 6.4 of RFC6275[8] with slight modification. RFC6275
1055  describes how to pass "Home Address" as well as co-located care-of
1056  address of the destination address if it happens to be mobile. The
1057  same approach has been extended to support IP address stacking for
1058  the source address and to support IP address stacking for both source
1059  address as well as destination address. The "Reserved" space in the
1060  type 2 routing header has been split into two parts: a one-octet
1061  field to address the "Stacking Type", and the remaining 3 octets are
1062  left as Reserved.

1064  Stacking Type is interpreted as follows:

1066  Stacking Type=0
1067     Source Address: Address of the sender.
1068     Destination Address: co-located care-of address of the receiver.
1069     Address 1: Home Address of the receiver.
1070     Hdr Ext Len=2.
1072  The type 2 routing header for stacking type 0 will be as follows:

1074  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1075  |  Next Header  | Hdr Ext Len=2 | Routing Type=2|Segments Left=1|
1076  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1077  |Stacking Type=0|                   Reserved                    |
1078  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1079  |                                                               |
1080  +                                                               +
1081  |                                                               |
1082  +            Address 1:Home Address of the receiver             +
1083  |                                                               |
1084  +                                                               +
1085  |                                                               |
1086  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

1088  Stacking Type=1
1089     Source Address: co-located care-of address of the sender.
1090     Destination Address: Address of the receiver.
1091     Address 1: Home Address of the sender.
1092     Hdr Ext Len=2.

1094  The type 2 routing header for stacking type 1 will be as follows:

1096  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1097  |  Next Header  | Hdr Ext Len=2 | Routing Type=2|Segments Left=1|
1098  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1099  |Stacking Type=1|                   Reserved                    |
1100  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1101  |                                                               |
1102  +                                                               +
1103  |                                                               |
1104  +             Address 1:Home Address of the sender              +
1105  |                                                               |
1106  +                                                               +
1107  |                                                               |
1108  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

1110  Stacking Type=2
1111     Source Address: co-located care-of address of the sender.
1112     Destination Address: co-located care-of address of the receiver.
1113     Address 1: Home Address of the sender.
1114     Address 2: Home Address of the receiver.

1116     Hdr Ext Len=4.
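
The Hdr Ext Len values quoted for the three stacking types follow from
the header size: 8 fixed octets plus 16 octets per stacked address,
counted in 8-octet units without the first 8 octets. The following is
a minimal sketch of an encoder for this layout, not part of the draft.
The function name 'build_stacking_rh2' is hypothetical, and Segments
Left is written as 1 for every stacking type because that is what the
draft's diagrams show.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Write a type 2 routing header with a Stacking Type octet into buf.
 * addr1 is always required; addr2 is used only for stacking type 2.
 * Returns the header length in octets (24 or 40), or 0 when the
 * stacking type is unknown, an address is missing, or buf is too small.
 */
static size_t build_stacking_rh2(uint8_t *buf, size_t buflen,
                                 uint8_t next_header, uint8_t stacking_type,
                                 const uint8_t *addr1, const uint8_t *addr2)
{
    size_t n_addrs;

    if (stacking_type == 0 || stacking_type == 1)
        n_addrs = 1;
    else if (stacking_type == 2)
        n_addrs = 2;
    else
        return 0;

    size_t total = 8 + 16 * n_addrs;
    assert(total % 8 == 0);            /* Hdr Ext Len counts 8-octet units */

    if (buf == NULL || addr1 == NULL || buflen < total)
        return 0;
    if (n_addrs == 2 && addr2 == NULL)
        return 0;

    buf[0] = next_header;
    buf[1] = (uint8_t)((total - 8) / 8);   /* Hdr Ext Len: 2 or 4 */
    buf[2] = 2;                            /* Routing Type */
    buf[3] = 1;                            /* Segments Left, per the diagrams */
    buf[4] = stacking_type;
    buf[5] = buf[6] = buf[7] = 0;          /* 24-bit Reserved field */
    memcpy(buf + 8, addr1, 16);
    if (n_addrs == 2)
        memcpy(buf + 24, addr2, 16);
    return total;
}
```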
1118  The type 2 routing header for stacking type 2 will be as follows:

1120  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1121  |  Next Header  | Hdr Ext Len=4 | Routing Type=2|Segments Left=1|
1122  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1123  |Stacking Type=2|                   Reserved                    |
1124  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1125  |                                                               |
1126  +                                                               +
1127  |                                                               |
1128  +             Address 1:Home Address of the sender              +
1129  |                                                               |
1130  +                                                               +
1131  |                                                               |
1132  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1133  |                                                               |
1134  +                                                               +
1135  |                                                               |
1136  +            Address 2:Home Address of the receiver             +
1137  |                                                               |
1138  +                                                               +
1139  |                                                               |
1140  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

1142  Next Header
1143     8-bit selector. Identifies the type of header immediately
1144     following the routing header. Uses the same values as the IPv6
1145     Next Header field [9].

1147  Hdr Ext Len
1148     4 (8-bit unsigned integer); length of the routing header in 8-
1149     octet units, not including the first 8 octets.

1151  Routing Type
1152     2 (8-bit unsigned integer).

1154  Segments Left
1155     1 (8-bit unsigned integer).

1157  Stacking Type
1158     2 (8-bit unsigned integer).

1160  Reserved
1161     24-bit reserved field. The value MUST be initialized to zero by
1162     the sender, and MUST be ignored by the receiver.

1164  Address 1
1165     Home Address of the sender.

1167  Address 2
1168     Home Address of the receiver.

1170  IP address stacking in IPv4 is performed by introducing a new IP option
1171  under the option class "Datagram or Network Control", i.e. 0. The
1172  option number is 16. The CODE(144) field is followed by a one-octet
1173  field "Stacking Type", followed by two octets of reserved space (NULL)
1174  as padding, followed by the address fields based on the Stacking Type.

1176  Stacking Type is interpreted as follows:
1177  Stacking Type=0
1178     Source Address: Address of the sender.
1179     Destination Address: co-located care-of address of the receiver.
1180     Address 1: Home Address of the receiver.
1181     Header Length:7

1183  Format of IP address stacking option with stacking type 0
1184  in the IP header will be as follows:

1186  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1187  |   CODE(144)   |Stacking Type=0|           Reserved            |
1188  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1189  +            Address 1:Home Address of the receiver             +
1190  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

1192  Stacking Type=1
1193     Source Address: co-located care-of address of the sender.
1194     Destination Address: Address of the receiver.
1195     Address 1: Home Address of the sender.
1196     Header Length:7

1198  Format of IP address stacking option with stacking type 1
1199  in the IP header will be as follows:

1201  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1202  |   CODE(144)   |Stacking Type=1|           Reserved            |
1203  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1204  +             Address 1:Home Address of the sender              +
1205  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

1207  Stacking Type=2
1208     Source Address: co-located care-of address of the sender.
1209     Destination Address: co-located care-of address of the receiver.
1210     Address 1: Home Address of the sender.
1211     Address 2: Home Address of the receiver.

1213     Header Length:8

1215  Format of IP address stacking option with stacking type 2
1216  in the IP header will be as follows:

1218  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1219  |   CODE(144)   |Stacking Type=2|           Reserved            |
1220  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1221  +             Address 1:Home Address of the sender              +
1222  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1223  +            Address 2:Home Address of the receiver             +
1224  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

1226  3. Security Consideration

1228  This document provides a solution for site multihoming of stub networks.
1229  Message exchange between source and destination related to link failure
1230  has to be done in a secured mode as explained in section 2.1.2. For common
1231  security related issues that any site may experience, one needs to
1232  consult the "Site Security Handbook", RFC2196[6]. For issues
1233  related to IP Mobility, section 5 of RFC5944[4] has to be consulted.

1235  4. IANA Consideration

1237  IANA has assigned an ICMP type (ICMP_LINKFAILURE) for link
1238  failure. IANA has also assigned two socket options SO_SEQPARAM
1239  for security parameters and SO_LFROUTES for
1240  routes to be considered on link failure.

1242  5. Normative References

1244  [1]  J. Abley, B. Black, V. Gill, "Goals for IPv6 Site-Multihoming
1245       Architectures", RFC 3582, August 2003.

1247  [2]  R. Braden, "Requirements for Internet Hosts -- Communication
1248       Layers", RFC 1122, October 1989.

1250  [3]  R. Hinden, S. Deering, "IP Version 6 Addressing Architecture",
1251       RFC 4291, February 2006.

1253  [4]  C. Perkins, "IP Mobility Support for IPv4, Revised", RFC 5944,
1254       November 2010.

1256  [5]  T. Dierks, E. Rescorla, "The Transport Layer Security (TLS)
1257       Protocol Version 1.2", RFC 5246, August 2008.

1259  [6]  B. Fraser, "Site Security Handbook", RFC 2196, September 1997.

1261  [7]  S. Bandyopadhyay, "An Architectural Framework of the Internet
1262       for the Real IP World"
1263       (work in progress).
1264  [8]  C. Perkins, Ed., D. Johnson, J. Arkko, "Mobility Support in
1265       IPv6", RFC 6275, July 2011.

1267  [9]  S. Deering, R. Hinden, "Internet Protocol, Version 6 (IPv6)
1268       Specification", RFC 2460, December 1998.

1270  [10] L. Zhang, S. Berson, S. Herzog, S. Jamin, "Resource ReSerVation
1271       Protocol (RSVP) -- Version 1 Functional Specification", RFC
1272       2205, September 1997.

1274  [11] D. Awduche, L. Berger, D. Gan, T. Li, V. Srinivasan, G.
1275       Swallow, "RSVP-TE: Extensions to RSVP for LSP Tunnels",
1276       RFC 3209, December 2001.

1278  [12] G. Swallow, J. Drake, H. Ishimatsu, Y.
      Rekhter, "Generalized
1279       Multiprotocol Label Switching (GMPLS) User-Network Interface
1280       (UNI): Resource ReserVation Protocol-Traffic Engineering
1281       (RSVP-TE) Support for the Overlay Model", RFC 4208,
1282       October 2005.

1284  [13] J. Wroclawski, "Specification of the Controlled-Load Network
1285       Element Service", RFC 2211, September 1997.

1287  [14] S. Shenker, C. Partridge, R. Guerin, "Specification of
1288       Guaranteed Quality of Service", RFC 2212, September 1997.

1290  [15] S. Shenker, J. Wroclawski, "General Characterization Parameters
1291       for Integrated Service Network Elements", RFC 2215,
1292       September 1997.

1294  [16] J. Wroclawski, "The Use of RSVP with IETF Integrated Services",
1295       RFC 2210, September 1997.

1297  6. Informative References

1299  [17] P. Srisuresh, K. Egevang, "Traditional IP Network Address
1300       Translator (Traditional NAT)", RFC 3022, January 2001.

1302  7. Author's Address

1304  Shyamaprasad Bandyopadhyay
1305  HL No 205/157/7, Kharagpur 721305, India
1306  Phone: +91 3222 225137
1307  e-mail: shyamb66@gmail.com