MPLS Working Group                                               S. Kini
Internet-Draft                                              D. Sinicrope
Intended Status: Standards Track                                Ericsson
Expires: September 2011                                   March 14, 2011

    Encapsulation Methods for Transport of packets over an MPLS PSN -
                         efficient for IP/MPLS
           draft-kini-pwe3-pkt-encap-efficient-ip-mpls-02.txt

Status of this Memo

   Distribution of this memo is unlimited.

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on September 15, 2011.

Copyright Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.

Abstract

   A Packet Pseudowire (PPW) must be able to carry a packet of any
   protocol that can be carried over Ethernet. In many cases IP and MPLS
   are the predominant protocols on a PPW transported over an MPLS PSN.
   Other protocols are used mainly for control purposes. In such a
   scenario it is highly beneficial to make IP/MPLS encapsulation
   efficient.
   This document defines such an encapsulation while
   retaining the ability to exchange packets of any other protocol over
   the PPW.

Table of Contents

   1. Introduction
   2. Conventions used in this document
   3. Scope
   4. Network Reference Model
   5. Solution
      5.1. Encapsulation format on the PPW
         5.1.1. IP packets
         5.1.2. MPLS packet
         5.1.3. An arbitrary protocol
      5.2. Traffic adaptation
         5.2.1. PE-bound
         5.2.2. CE-bound
      5.3. QoS considerations
      5.4. PW Types
      5.5. Control Word
         5.5.1. Characteristics without CW
         5.5.2. PPW-EIM-CW
      5.6. Signaling extensions
      5.7. Implementation considerations
   6. PSN MTU requirements
   7. Security Considerations
   8. IANA Considerations
   9. Conclusion
   10. References
      10.1. Normative References
      10.2. Informative References
   11. Acknowledgments
   Appendix A: Example
      A.1. PWE3-ETH-EVC to connect routers
      A.2. CE co-existing with PE - interconnect
   Authors' Addresses

1. Introduction

   A packet transport service modeled along [PWE3-ARCH] is considered
   useful. Such a service is also referred to as a packet pseudowire
   (PPW). The server network is a Packet Switched Network (PSN) and
   could be an MPLS (or an MPLS-TP) network. The client requires a
   generic packet transport service that is isolated from the
   underlying PSN.

   It must be possible to carry any number and type of client protocols
   on the PPW, similar to Ethernet. Some of these may be purely control
   protocols such as [ARP] or [LLDP]. Such protocols may not take up the
   majority of the bandwidth of the service. On the other hand, client
   protocols such as IP and MPLS can take up the majority of the
   bandwidth and it is very useful for the PPW to encapsulate them
   efficiently.

   This document defines an encapsulation for a PPW over an MPLS PSN
   that efficiently encapsulates IP and MPLS. However, it is still
   possible to carry all client protocols on the PPW. It is useful when
   IP and/or MPLS are the predominant protocols on the PPW. The
   encapsulation defined in this document is referred to as PPW-EIM
   (where EIM stands for Efficient IP MPLS). The efficiency is realized
   by minimizing any extra headers that would be needed to transport an
   IP or MPLS packet when compared to a solution such as [PWE3-ETH]. The
   benefits of this efficiency include increased bandwidth available for
   user traffic due to lower overhead, better throughput due to a
   reduced possibility of fragmentation, and more efficient use of ECMP
   paths.

2. Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

3. Scope

   This document covers a PPW as a point-to-point (p2p) service.
   Multi-access service is considered outside the scope of this version
   of the document.

   The encapsulation scheme PPW-EIM is useful when IP/MPLS packets are
   the majority of the packets on the PPW. The method to determine this
   is considered outside the scope of this document.

4. Network Reference Model

   The solution in this document addresses the following two cases of
   the reference model in Figure 2 of [PWE3-ARCH]:

      1. The native service is an Ethernet virtual circuit (EVC). The
         EVC may either be untagged or tagged. The untagged traffic is
         treated as a unique EVC. The stack of VLAN Identifiers (VIDs)
         in the VLAN tags stack of an Ethernet frame uniquely identifies
         an EVC. The number of VIDs in the stack identifying the circuit
         may be one (as in [802.1q], e.g. a customer tag C-tag) or more
         (similar to [802.1ad], e.g. a customer and service tag C-tag
         and S-tag). Typically the physical interface between CE and PE
         will be an Ethernet interface. Note that if another VLAN tag is
         stacked on an EVC it MUST be treated as a separate EVC to apply
         PPW-EIM. This is a subset of the reference model in [PWE3-ETH]
         and is henceforth referred to as PWE3-ETH-EVC. PPW-EIM
         encapsulates a single EVC into a PPW. If a packet transport
         service is required for multiple EVCs then a separate PPW
         should be used for each. The encapsulation in [PWE3-ETH] must
         be used instead of PPW-EIM under the following conditions:

            a. If an EVC has to be transported transparently in a single
               pseudowire (PW) by carrying all VLAN tags encapsulated
               inside the EVC.

            b. If the EVC is not predominantly carrying IP or MPLS. The
               method to determine this is outside the scope of this
               document.

            c. If there are a large number of EVCs (predominantly
               carrying IP/MPLS) that need a p2p transport service
               towards another PE but one of the PEs has PPW scaling
               limitations that prevent it from creating separate PPWs
               per EVC as required by PPW-EIM.

      2. The CE and the corresponding PE are co-located in the same
         equipment. This is similar to a virtual untagged point-to-point
         (p2p) Ethernet interface between the two CEs. This should be
         treated as the case of providing p2p transport service for the
         untagged traffic EVC of the PWE3-ETH-EVC reference model
         described above.

   It should be noted that the access circuit is modeled as an EVC since
   an EVC can carry any protocol packet. However, the technique defined
   in this draft can be extended to any access circuit encapsulation
   that encapsulates IP and MPLS packets.

5. Solution

   This solution does not use a data link layer header (such as
   Ethernet) on the PPW to transport IP/MPLS packets. This reduces the
   overhead bytes for such packets. There are implementations that look
   beyond the MPLS label stack for an IP packet. For non-IP/MPLS
   packets, whenever there is a potential for such a condition, an IP
   encapsulation (with GRE) is used. Thus ECMP based on looking for an
   IP packet beyond the MPLS stack will work correctly and not re-order
   any flows. To prevent the GRE encapsulated packets from having IP
   address conflicts with the IP address space of the customer's
   network, a non-routable IP address (in the 127/8 range) is used. The
   details of the packet encapsulation are in section 5.1. The
   adaptation of PE-bound and CE-bound traffic is explained in section
   5.2.

5.1. Encapsulation format on the PPW

   The encapsulation of the packet is described below along with any
   control word (CW) bits that are required to be defined. A more formal
   definition of the CW for PPW-EIM is in section 5.5.

5.1.1. IP packets

   The encapsulation of an IPv4/v6 packet into a PPW depends on whether
   the CW is present. If the CW is not present, the encapsulation is as
   shown in Figure 1. Any ECMP implementation that looks for an IP
   packet beyond the label stack will not re-order flows. If the CW is
   present then the flags bits 6 and 7 in the CW are set to 01. The
   encapsulation is as shown in Figure 2. In both cases the first nibble
   of the IP packet is used to distinguish between an IPv4 and IPv6
   packet.

   +------------------------------------------------+
   |PSN Tunnel & PSN Physical Headers               | m octets
   |------------------------------------------------|
   |PW Label (S=0 if FAT-PW label present, else S=1)| 4
   |------------------------------------------------|
   |Optional FAT-PW label S=1                       | 4
   |------------------------------------------------|
   |IP v4/v6 packet                                 | n octets
   |                                                |
   +------------------------------------------------+

   Figure 1 IPv4/v6 packet encapsulated into PPW without CW

   +------------------------------------------------+
   |PSN Tunnel & PSN Physical Headers               | m octets
   |------------------------------------------------|
   |PW Label (S=0 if FAT-PW label present, else S=1)| 4
   |------------------------------------------------|
   |Optional FAT-PW label S=1                       | 4
   |------------------------------------------------|
   |Control Word with Flags bits 6,7 set to 01      | 4
   |------------------------------------------------|
   |IP v4/v6 packet                                 | n octets
   |                                                |
   +------------------------------------------------+

   Figure 2 IPv4/v6 packet encapsulated into PPW with CW

5.1.2. MPLS packet

   The encapsulation of an MPLS packet into a PPW depends on whether the
   CW is present in the packet. If the CW is present then the flags bits
   6 and 7 in the CW are set to 10. The encapsulation is as shown in
   Figure 3. If the CW is not present, the S-bit in the bottom-most
   label in the pseudowire label stack is set to zero and the format is
   as shown in Figure 4. The pseudowire label stack (including the PSN
   tunnel label stack if any) along with the label stack of the payload
   appear as a single label stack. This is also consistent with the
   notion of having a single S-bit set in a labeled packet. Since the
   payload (MPLS) has (independently) ensured that looking beyond the
   label stack correctly interprets IP payloads and PWE3 payloads, the
   same holds true for the combined label stack. Hence flows are
   identified correctly.

   +------------------------------------------------+
   |PSN Tunnel & PSN Physical Headers               | m octets
   |------------------------------------------------|
   |PW Label (S=0 if FAT-PW label present, else S=1)| 4
   |------------------------------------------------|
   |Optional FAT-PW label S=1                       | 4
   |------------------------------------------------|
   |Control Word with Flags bits 6,7 set to 10      | 4
   |------------------------------------------------|
   |MPLS Packet                                     | n octets
   |                                                |
   +------------------------------------------------+

   Figure 3 MPLS packet encapsulated into PPW with CW

   +------------------------------------------------+
   |PSN Tunnel & PSN Physical Headers               | m octets
   |------------------------------------------------|
   |PW Label S=0                                    | 4
   |------------------------------------------------|
   |Optional FAT-PW label S=0                       | 4
   |------------------------------------------------|
   |MPLS Packet                                     | n octets
   |                                                |
   +------------------------------------------------+

   Figure 4 MPLS packet encapsulated into PPW without CW

5.1.3. An arbitrary protocol

   The encapsulation of an arbitrary protocol (other than IP and MPLS)
   into a PPW depends on whether a CW is present. If a CW is not present
   a GRE encapsulation MUST be used as shown in Figure 5. This extends
   the encapsulation for an IPv4 packet shown earlier in Figure 1 of
   section 5.1.1. The IP destination address in the GRE delivery header
   is a non-routable address from the 127/8 range. It is used to
   identify that the packet does not belong to a real GRE tunnel in the
   IP address space of the payload but rather is a protocol packet on
   the PPW. Also the protocol type in the GRE Header is set according to
   the protocol that is being carried. The TTL in the GRE delivery
   header is set to 0 (or 1) to prevent this packet from being IP
   routed.

   If the CW is present then the flags bits 6 and 7 in the CW are set to
   00 and the format is as shown in Figure 6. Note that the Ethernet
   frame carrying the arbitrary protocol packet immediately follows the
   CW. The GRE encapsulation is not needed in this case.

   +------------------------------------------------+
   |PSN Tunnel & PSN Physical Headers               | m octets
   |------------------------------------------------|
   |PW Label (S=0 if FAT-PW label present, else S=1)| 4
   |------------------------------------------------|
   |Optional FAT-PW label S=1                       | 4
   |------------------------------------------------|
   |IPv4 header (GRE Delivery header)               | 20
   |  IPv4 protocol field=47(GRE)                   |
   |  TTL=1                                         |
   |                                                |
   |  Dst Addr 127/8                                |
   |------------------------------------------------|
   | GRE Header                                     | 8
   |                                                |
   +------------------------------------------------+
   | GRE Payload Packet - any arbitrary protocol    | n octets
   |                                                |
   +------------------------------------------------+

   Figure 5 An arbitrary protocol packet encapsulated into PPW without CW

   +------------------------------------------------+
   |PSN Tunnel & PSN Physical Headers               | m octets
   |------------------------------------------------|
   |PW Label (S=0 if FAT-PW label present, else S=1)| 4
   |------------------------------------------------|
   |Optional FAT-PW label S=1                       | 4
   |------------------------------------------------|
   |Control Word with Flags bits 6,7 set to 00      | 4
   |------------------------------------------------|
   |Ethernet frame of an arbitrary protocol         | n octets
   |                                                |
   +------------------------------------------------+

   Figure 6 An arbitrary protocol packet encapsulated into PPW with CW

5.2. Traffic adaptation

5.2.1. PE-bound

   After the Native service processing (NSP), the Ethernet frame (from
   CE) MUST be mapped into the PPW based on the value of the Ethernet
   type field as follows:

      1. If it is IP (0x0800 - IPv4 or 0x86DD - IPv6), the Ethernet
         header (including the VLAN tags stack) is stripped off and the
         encapsulation format is as described in section 5.1.1. If the
         CW is present, the flags bits 6 and 7 in the CW MUST be set to
         01.

      2. If it is MPLS (0x8847, 0x8848), the Ethernet header (including
         the VLAN tags stack) is stripped off and the encapsulation
         format is as described in section 5.1.2. The S-bit in the
         bottom-most label of the pseudowire label stack is set to 1 if
         the CW is present and to 0 if it is not. If the CW is present,
         the flags bits 6 and 7 in the CW MUST be set to 10.

      3. For all other values of the Ethernet type field, the entire
         Ethernet frame is carried on the PPW. Depending on whether the
         CW is in use, the encapsulation is as follows:

            a. If the CW is not present then the frame is first
               encapsulated into GRE (with IP) and the encapsulation
               format is as described in section 5.1.3 (Figure 5). The
               GRE header protocol-type is set according to the protocol
               being carried. The IP destination address MUST be chosen
               from the 127/8 range. The same source and destination
               addresses SHOULD be used for the life of the PPW. The IP
               header TTL SHOULD be set to 0. If there is any hardware
               limitation due to which a TTL of zero cannot be set then
               a TTL of 1 MUST be used. The checksum in the GRE Header
               and the IP header MAY be set to 0 since the packet is not
               forwarded based on these headers and the protocol packet
               typically has its own data integrity verification
               mechanisms. If the IP packet (encapsulating GRE) exceeds
               the PW's MTU, IP fragmentation SHOULD be used provided
               the PW peer is capable of IP reassembly. If the PW peer
               is not capable of reassembly the packet must be dropped.

            b. If the CW is present then the Ethernet frame immediately
               follows the CW. If the packet exceeds the MTU then
               [PWE3-FRAG] SHOULD be used.

5.2.2. CE-bound

   The association between the EVC and the PPW has the following extra
   information that will be used when adapting traffic from the PPW to
   the EVC.

      1. MAC address of the directly connected CE.
         This would be the source MAC address of any frame received from
         the CE and is henceforth referred to as PPW-EIM-SMAC. This may
         be configured, signaled or dynamically learnt.

      2. MAC address of the remotely connected CE. This would be the
         source MAC address of any frame received from the remote CE and
         is henceforth referred to as PPW-EIM-DMAC. This may be
         configured or dynamically learnt.

      3. The VLAN tag stack (henceforth referred to as PPW-EIM-VSTACK).
         The VLAN Identifier (VID) portion of PPW-EIM-VSTACK should be
         known as this uniquely identifies the EVC. The Canonical Format
         Indicator (CFI) must always be 0.

      4. A mapping function to map the IP differentiated services (DS)
         [RFC2474] field to Ethernet PCP bits (henceforth referred to as
         PPW-EIM-DS-to-PCP). This is applicable only if the EVC is
         tagged. If there are multiple tags in the VLAN tag stack this
         may be a separate mapping for each tag. It is recommended that
         the same mapping be used for all tags. The mapping may be
         user-configurable. A default mapping of a DS field "xyzPQRCU"
         to a PCP of "xyz" is recommended.

   When the packet is parsed the type and location of the user data is
   known. If the packet belongs to the G-ACh then its processing is
   defined in [VCCV] and remains unchanged for PPW-EIM. The processing
   for an IP or MPLS packet in the PW is as follows:

      1. If the payload of the PPW is an MPLS packet it is mapped into
         an Ethernet frame as follows:

            a. PPW-EIM-SMAC as the source MAC address.

            b. PPW-EIM-DMAC as the destination MAC address.

            c. PPW-EIM-VSTACK as the VLAN tag stack. The PCP bits for
               each tag in the stack are mapped from the Traffic Class
               (TC) bits of the first MPLS label in the payload.

            d. The Ethernet type field is set to 0x8847 (MPLS).

      2. If the payload of the PPW is an IP packet, the first nibble of
         the IP header and the Protocol-type then determine further
         processing.

            a. If the first nibble is 0x6 then the payload of the PPW is
               an IPv6 packet. The IPv6 packet is mapped into an
               Ethernet frame as follows:

                  i. PPW-EIM-SMAC as the source MAC address.

                 ii. If the destination IPv6 address is
                     broadcast/multicast then the destination MAC
                     address of the Ethernet frame is determined
                     accordingly. Else if the destination IPv6 address
                     is unicast then PPW-EIM-DMAC is used.

                iii. PPW-EIM-VSTACK as the VLAN tag stack. The PCP bits
                     for each tag in the stack are mapped from the DS
                     field in the IPv6 header using the
                     PPW-EIM-DS-to-PCP mapping.

                 iv. The Ethernet type field is set to 0x86DD (IPv6).

            b. If the first nibble is 0x4 then the payload of the PPW is
               an IPv4 packet. The IP destination address together with
               the protocol field determines further processing:

                  i. If the destination IP address is in the 127/8 range
                     and the protocol field is 47 (GRE) then the GRE
                     payload packet is an arbitrary protocol packet on
                     the PPW. It should be noted that comparing 3 fields
                     that start at fixed offsets in the header and
                     require a comparison of a fixed number of bits from
                     those offsets is sufficient to shunt the packet off
                     the IP/MPLS de-capsulation path. These three fields
                     are the first nibble (starting at octet offset 0,
                     field size 1 nibble), the IP header protocol field
                     (starting at octet offset 9, field size 1 octet)
                     and the IP destination address (starting at octet
                     offset 16, compare just the first octet). Moreover
                     these comparisons are against fixed values and
                     should be easily implementable in hardware. Further
                     validation of the GRE Delivery header for checksum,
                     TTL, etc. as well as the GRE header validation can
                     be done after the packet is shunted off the IP/MPLS
                     de-capsulation path.
                     The VLAN tag stack in the Ethernet frame is
                     validated against PPW-EIM-VSTACK and if the VLAN
                     IDs match, the frame is passed to the NSP. If the
                     IP packet was fragmented it SHOULD be reassembled.
                     If the node is not capable of IP reassembly, the
                     packet is dropped.

                 ii. For all other values it is an IPv4 packet and the
                     processing is similar to that of an IPv6 packet
                     except that the Ethernet type field on the CE-bound
                     frame is set to 0x0800 (IPv4).

      3. If the payload of the PPW is any other protocol packet, then it
         is an Ethernet frame.

5.3. QoS considerations

   The QoS considerations in [PWE3-ETH] are applicable in this document.

5.4. PW Types

   Depending on the requirements of a particular deployment the packet
   transport service may be required to carry only a subset of the
   packet types that are carried on a PPW. The following deployment
   scenarios of the client network on the p2p link (that is emulated by
   the PPW) are considered useful:

      1. IP only - In this deployment scenario the client network uses
         the p2p link to exchange exclusively IP packets. This would be
         especially true when the PE and CE co-exist on the same device
         at both ends of the PPW and the CEs exchange only IP packets
         on that p2p link. A MAC address is not needed in this case.
         This deployment scenario would also be the case when the PE and
         CE are on separate devices, the CEs exchange only IP packets
         on the p2p link and the MAC address mapping for the IP is
         configured on the CE (e.g. static ARP entry). IP encapsulated
         control protocols (such as RIP, OSPF, etc.) could run on the
         link.

      2. IP and ARP only - In this deployment scenario the client
         network uses the p2p link to exchange exclusively IP packets
         but additionally uses ARP for layer-2 address resolution.

      3. MPLS only - In this deployment scenario the client network uses
         the p2p link to exchange exclusively MPLS packets. Typically
         the client network would be purely an MPLS (or MPLS-TP) network
         and would not even use an IP based control plane. This
         deployment scenario would be especially true when the PE and CE
         co-exist on the same device at both ends of the PPW and the
         CEs exchange only MPLS packets on the p2p link. A MAC address
         is not needed in this case. This deployment scenario would also
         be the case when the PE and CE are on separate devices, the
         client network uses the p2p link to exchange MPLS (or MPLS-TP)
         packets and the mapping of MPLS-label to MAC address is
         configured on the CE. The MAC address may be from an assigned
         range (as defined in MPLS-TP).

      4. IP/MPLS only - In this deployment scenario the client network
         uses the p2p link to exchange exclusively IP/MPLS packets. This
         would be the typical case when the PE and CE co-exist on the
         same device at both ends of the PPW and the CE sends only
         IP/MPLS packets on the p2p link. A MAC address is not needed in
         this case. This would also be the case when the PE and CE are
         on separate devices but the MAC address mapping for IP and MPLS
         is configured on the CE (e.g. static ARP entry). IP
         encapsulated control protocols (such as RIP, OSPF, BGP, LDP,
         RSVP-TE, etc.) could run on the link.

      5. IP/MPLS and ARP only - In this deployment scenario the client
         network uses the p2p link to exchange exclusively IP/MPLS
         packets but additionally uses ARP for layer-2 address
         resolution. This is the typical case when the client network
         uses that p2p link exclusively with the IP protocol for layer-3
         routing and the MPLS protocol for switching but uses ARP for
         layer-2 address resolution.

      6. Generic packet service - In this deployment scenario the client
         network can use the p2p link to exchange any type of packet
         that can be sent over an EVC. Even MAC address configuration is
         not necessary since ARP can be run on this link.

   For many of these scenarios a subset of the encapsulation and traffic
   adaptation that has been defined for PPW-EIM is relevant. The
   following pseudowire types are additionally defined that perform a
   subset of the full functionality of PPW-EIM.

      1. IP-only-PPW-EIM - Only IP traffic is transported in PPW-EIM.
         The relevant encapsulations are in section 5.1.1. Only the
         adaptations for IP traffic are relevant from section 5.2. This
         PW would not implement the [GRE] encapsulation. It would
         optionally implement the CW. When the CW is not used the
         encapsulation format of this PW is similar to L3VPN.

      2. MPLS-only-PPW-EIM - Only MPLS traffic is transported in
         PPW-EIM. The relevant encapsulations are in section 5.1.2. Only
         the adaptations for MPLS traffic are relevant from section 5.2.
         This PW would not implement the [GRE] encapsulation. It would
         optionally implement the CW. When the CW is not used, the
         encapsulation (label-stack) of this PW is similar to an MPLS-TP
         LSP that has MPLS as a client.

      3. IPMPLS-only-PPW-EIM - Only IP and MPLS traffic is transported
         in PPW-EIM. The relevant encapsulations are in sections 5.1.1
         and 5.1.2. Only the adaptations for IP and MPLS traffic are
         relevant from section 5.2. This PW would not implement the
         [GRE] encapsulation. It would optionally implement the CW.

   Each deployment scenario described earlier can be realized by the
   generic PPW-EIM. However many deployment scenarios can also be
   realized by a PPW that implements a subset of PPW-EIM. The method and
   choice of PPW to do this for each deployment scenario is as follows:

      1. IP only - A PW can be realized with an IP-only-PPW-EIM.

      2. IP and ARP only - The straightforward way to realize this is by
         the generic PPW-EIM. It is also possible to realize it using an
         IP-only-PPW-EIM if the PE acts as a proxy ARP ([PXY-ARP])
         gateway to its directly connected CE.

      3. MPLS only - A PW can be realized with an MPLS-only-PPW-EIM.

      4. IP/MPLS only - A PW can be realized with an
         IPMPLS-only-PPW-EIM.

      5. IP/MPLS and ARP only - The straightforward way to realize this
         is by the generic PPW-EIM. It is also possible to realize it
         using an IPMPLS-only-PPW-EIM if the PE acts as a proxy ARP
         gateway to its directly connected CE.

      6. Generic packet service - This of course should be realized
         using PPW-EIM.

5.5. Control Word

   One of the primary purposes of the CW ([PWE3-CW]) is to prevent
   re-ordering within a flow if there are implementations that look
   beyond the label stack for an IP flow. PPW-EIM has different
   characteristics due to the use of IP for encapsulating non-IP/MPLS
   packets. Hence a CW is considered optional and the characteristics of
   PPW-EIM without a CW are analyzed in section 5.5.1. A CW that meets
   the requirements in [PWE3-CW] is described in section 5.5.2. This
   should be used in cases where a CW is required for reasons other than
   preventing flow re-ordering.

5.5.1. Characteristics without CW

   PPW-EIM (without CW) is not susceptible to re-ordering flows within
   the PPW. It can also take advantage of ECMP implementations that
   examine the first nibble after the MPLS label stack to determine
   whether the labeled packet is an IP packet. Such implementations are
   widely available today and will correctly identify the IP flow in the
   PPW. Even the flows of non-IP/MPLS protocols will not be re-ordered
   as long as the same source and destination IP addresses are used in
   the GRE Delivery header for the life of the PPW. Hence a CW is not
   necessary for PPW-EIM to prevent flow re-ordering.
   Because transit nodes can identify the IP flow directly, the need
   for [FAT-PW] within PPW-EIM can also be obviated, which saves the
   processing power otherwise spent at ingress on identifying the flow
   (through packet classification) and adding the flow-label. When
   ECMP based on the label stack is required (and available), then
   [FAT-PW] must be used with PPW-EIM. An important benefit of not
   adding a CW and/or flow-label is that the difference in packet size
   between the access network and the PSN is further reduced by up to
   8 bytes (compared with [PWE3-ETH]), and hence there is less chance
   of fragmentation of jumbo IP/MPLS packets.

5.5.2. PPW-EIM-CW

   If a CW is needed for PPW-EIM, then the one defined in [PWE3-ETH]
   must be used with the following extension. In accordance with the
   preferred CW format in [PWE3-CW], which specifies the flags field
   for per-payload signaling, bits 6 and 7 are defined as follows:

   - 00 indicates payload is any protocol encapsulated in an
     Ethernet frame

   - 01 indicates payload is IP

   - 10 indicates payload is MPLS

   This CW is also applicable to IP-only-PPW-EIM, MPLS-only-PPW-EIM
   and IPMPLS-only-PPW-EIM.

5.6. Signaling extensions

   New values for the "PW type" field should be defined for the
   pseudowire encapsulations as "Packet - Efficient IP/MPLS", "Packet -
   IP only Efficient IP/MPLS", "Packet - MPLS only Efficient IP/MPLS",
   "Packet - IPMPLS only Efficient IP/MPLS" (values to be allocated by
   IANA).

   An LDP optional parameter TLV "Local MAC Address" may be used to
   indicate the local MAC address to the remote peer. This TLV should
   be used in the LDP Notification message. The MAC address may have
   been configured or dynamically learnt.
   The format of the Local MAC address TLV is:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |U|F|   Local MAC addr (TBA)    |           Length=6            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       Local MAC address                       |
   |                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   U bit: Unknown bit. This bit MUST be set to 1. If the MAC address
   format is not understood, then the TLV is not understood and MUST be
   ignored.

   F bit: Forward bit. This bit MUST be set to 1. In an MS-PW the S-PE
   should not interpret this TLV and it MUST be forwarded.

5.7. Implementation considerations

   It is worth noting that IP-only-PPW-EIM without the CW has an
   encapsulation format similar to that used in L3VPN. Also, MPLS-only-
   PPW-EIM without the CW has a packet format similar to that of an
   MPLS-TP LSP that has MPLS as a client. The action of pop and forward
   of the packet is in line with the MPLS architecture. The capability
   to handle these formats should exist in most of the currently used
   hardware. PPW-EIM with CW has a format that is in line with the
   format in [PWE3-CW], and existing hardware should be capable of
   handling it. It is important to note that even with the GRE
   encapsulation, the PE does not have to do any of the typical GRE
   processing such as IP lookups. A capability to match a few
   nibbles/bytes in the header is sufficient to correctly identify and
   process the packet. Alternatively, an implementation may make CW
   mandatory for PPW-EIM, in which case the GRE encapsulation is not
   needed.

6.
PSN MTU requirements

   The MPLS PSN MUST be configured with an MTU that is large enough to
   transport a maximum-sized Ethernet frame that has been encapsulated
   with a control word, a flow label (if ECMP is desired), a pseudowire
   demultiplexer, and a tunnel encapsulation. With MPLS used as the
   tunneling protocol, for example, this is likely to be 12 or 16 bytes
   greater than the largest frame size. The methodology described in
   [PWE3-FRAG] MAY be used to fragment encapsulated frames that exceed
   the PSN MTU. However, if [PWE3-FRAG] is not used and if the ingress
   router determines that an encapsulated layer 2 PDU exceeds the MTU
   of the PSN tunnel through which it must be sent, the PDU MUST be
   dropped.

   Note that the benefits associated with [FAT-PW] can be realized in
   PPW-EIM for IP/MPLS packets without adding the flow-label, if ECMP
   is done by looking for an IP packet beyond the MPLS label stack when
   the PPW is set up without a control-word. This also reduces the MTU
   difference to only 8 bytes for IP/MPLS packets, since neither the
   control-word nor the flow-label is needed. In the scenario where
   the EVC is [802.1q] and the PE's interface into the PSN is Ethernet
   but not virtualized, the MTU difference is further reduced to 4
   bytes. For the extreme case where the PSN tunnel is an MPLS LSP with
   a single hop and has PHP, there is no difference in the MTU.
   Alternatively, if the EVC has two or more tags (similar to
   [802.1ad]), no fragmentation is needed for IP/MPLS packets even if
   the PSN tunnel LSP has multiple hops and there is no PHP.

7. Security Considerations

   The security considerations in [PWE3-ETH] are applicable to this
   document.

8. IANA Considerations

   IANA needs to allocate values for the following:

   1.
'PW Type' field for "Packet - Efficient IP/MPLS", "Packet - IP
      only Efficient IP/MPLS", "Packet - MPLS only Efficient IP/MPLS"
      and "Packet - IPMPLS only Efficient IP/MPLS". The next
      available values 0x0020, 0x0021, 0x0022 and 0x0023 are
      recommended.

   2. LDP 'TLV type' for 'Local MAC address'. The available value
      0x0405 is recommended.

9. Conclusion

   PPW-EIM has the following advantages:

   1. Reduces the number of bytes on the wire. This translates into a
      significant reduction in bandwidth (as a percentage of packet
      size) for smaller packets.

   2. Reduces the possibility of fragmentation (and reassembly) of
      jumbo IP/MPLS packets. This improves the throughput of the
      network.

   3. Helps multi-layer networks by reducing the overhead required to
      stack each layer. This also reduces the possibility of
      fragmentation for jumbo packets in such networks.

   4. Utilizes ECMP based on IP, a capability that exists in many
      current implementations.

   5. Reduces the requirement to implement [FAT-PW] by taking
      advantage of existing implementations of ECMP based on IP.

   6. Makes ECMP more efficient in multi-layer networks by enabling
      existing implementations (at any layer) to examine the label
      stack through all higher layers. In addition, it enables
      existing implementations (at any layer) to easily examine the
      end-host's IP packet and simplifies deep-packet-inspection and
      flow-based applications (including ECMP).

10. References

10.1. Normative References

   [RFC2119]    Bradner, S., "Key words for use in RFCs to Indicate
                Requirement Levels", BCP 14, RFC 2119, March 1997.

   [GRE]        Farinacci, D., et al, "Generic Routing Encapsulation
                (GRE)", RFC 2784, March 2000.

   [PWE3-ARCH]  Bryant, S., et al, "Pseudo Wire Emulation Edge-to-Edge
                (PWE3) Architecture", RFC 3985, March 2005.
   [PWE3-CW]    Bryant, S., et al, "Pseudowire Emulation Edge-to-Edge
                (PWE3) Control Word for Use over an MPLS PSN", RFC 4385,
                February 2006.

   [PWE3-FRAG]  Malis, A., et al, "Pseudowire Emulation Edge-to-Edge
                (PWE3) Fragmentation and Reassembly", RFC 4623, August
                2006.

   [VCCV]       Nadeau, T., et al, "Pseudowire Virtual Circuit
                Connectivity Verification (VCCV): A Control Channel for
                Pseudowires", RFC 5085, December 2007.

10.2. Informative References

   [ARP]        Plummer, D., "An Ethernet Address Resolution Protocol",
                RFC 826, November 1982.

   [PXY-ARP]    Carl-Mitchell, S., et al, "Using ARP to Implement
                Transparent Subnet Gateways", RFC 1027, October 1987.

   [ISIS]       International Organization for Standardization,
                "Intermediate system to Intermediate system intra-domain
                routeing information exchange protocol for use in
                conjunction with the protocol for providing the
                connectionless-mode Network Service (ISO 8473)", ISO
                Standard 10589, 1992.

   [RFC2474]    Nichols, K., et al, "Definition of the Differentiated
                Services Field (DS Field) in the IPv4 and IPv6 Headers",
                RFC 2474, December 1998.

   [PWE3-ETH]   Martini, L., et al, "Encapsulation Methods for Transport
                of Ethernet over MPLS Networks", RFC 4448, April 2006.

   [FAT-PW]     Bryant, S., et al, "Flow Aware Transport of Pseudowires
                over an MPLS PSN", draft-ietf-pwe3-fat-pw-05 (Work in
                progress), October 2010.

   [802.1q]     "Virtual Bridged Local Area Networks", IEEE Std
                802.1Q-2005, 2005.

   [802.1ad]    "Virtual Bridged Local Area Networks - Amendment 4:
                Provider Bridges", IEEE Std 802.1ad-2005, 2005.

   [LLDP]       "IEEE Standard for Local and Metropolitan Area Networks -
                Station and Media Access Control Connectivity Discovery",
                IEEE Std 802.1AB-2005, 2005.

   [MS-PW-ARCH] Bocci, M., et al, "An Architecture for Multi-Segment
                Pseudowire Emulation Edge-to-Edge", RFC 5659, October
                2009.
   [CAIDA-PKT-SIZE] CAIDA, "Packet size distribution comparison between
                Internet links in 1998 and 2008",
                http://www.caida.org/research/traffic-analysis/
                pkt_size_distribution/graphs.xml

11. Acknowledgments

   The authors would like to thank Joel Halpern, Loa Andersson, Andy
   Malis, Stewart Bryant and Edwin Mallette for their comments.

Appendix A: Example

   Two examples are provided, one for each of the two cases of the
   reference model described in section 4.

A.1. PWE3-ETH-EVC to connect routers

   +------+                               +------+
   |      |  AC   +---+  PSN  +---+  AC   |      |
   |  R1  |-------|PE1|-------|PE2|-------|  R2  |
   |      |   E   +---+   L   +---+   E   |      |
   +------+                               +------+

   R1, R2 - IP routers
   PE1, PE2 - PPW(PPW-EIM) capable PEs
   AC - Attachment Circuit
   E - Ethernet Frame, L - MPLS packet

   Figure 7 Router inter-connect using PPW

   R1 has a p2p IP interface to R2. This interface is created on VLAN 5
   and runs ISIS level-2 ([ISIS]) as a routing protocol.

   MAC addr - R1: 00-01-02-03-04-05, R2: 10-11-12-13-14-15
   IP address - R1: 198.0.2.1/24, R2: 198.0.2.2/24

   VLAN 5 is emulated with a PPW (using encapsulation PPW-EIM) from
   PE1 to PE2 for EVC 5. Neither a control-word nor a flow-label is
   used on the PPW. PE2 has allocated an MPLS label 0x4321 as the PW
   demultiplexer. The PPW is encapsulated in an MPLS PSN and the PSN
   tunnel is a 1-hop LSP tunnel from PE1 to PE2 set up with PHP.
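   As an illustration of the setup above, the PW demultiplexer label
   0x4321 can be packed into a 4-byte MPLS label stack entry. The
   following is a minimal sketch, assuming the standard label stack
   entry layout (20-bit label, 3-bit TC, 1-bit S, 8-bit TTL). The
   function names are hypothetical, and the TTL value of 2 is an
   assumption made only for the example; this document does not
   specify it.

```python
def encode_label_stack_entry(label: int, tc: int, s: int, ttl: int) -> bytes:
    """Pack one MPLS label stack entry into 4 bytes (network order)."""
    if not 0 <= label < (1 << 20):
        raise ValueError("label must fit in 20 bits")
    if not 0 <= tc < 8:
        raise ValueError("TC must fit in 3 bits")
    if s not in (0, 1):
        raise ValueError("S must be 0 or 1")
    if not 0 <= ttl < 256:
        raise ValueError("TTL must fit in 8 bits")
    word = (label << 12) | (tc << 9) | (s << 8) | ttl
    return word.to_bytes(4, "big")


def decode_label_stack_entry(entry: bytes) -> dict:
    """Unpack the first 4 bytes of `entry` into its four fields."""
    word = int.from_bytes(entry[:4], "big")
    return {
        "label": word >> 12,
        "tc": (word >> 9) & 0x7,
        "s": (word >> 8) & 0x1,
        "ttl": word & 0xFF,
    }


# PW demultiplexer for a control packet: label 0x4321, TC=0x7, S=1.
# TTL=2 is an assumed value for illustration only.
pw_entry = encode_label_stack_entry(0x4321, tc=0x7, s=1, ttl=2)
print(pw_entry.hex())  # prints 04321f02
```

   Because the PSN tunnel in this example is a 1-hop LSP with PHP, this
   single entry would be the whole label stack on the wire between PE1
   and PE2.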
   Using a typical encapsulation on an Ethernet port for an ISIS
   protocol packet, the level-2 LAN ISIS hello packet (LAN-IIH) from R1
   to R2 is formatted by R1 into an Ethernet frame E as shown below:

   +------------------------------------------------+
   | Dest MAC addr AllL2ISs 01-80-C2-00-00-14       |  4
   |                       +------------------------|
   |                       |                        |  4
   +-----------------------+                        |
   | Src MAC addr 00-01-02-03-04-05                 |  4
   +-----------------------+------------------------|
   | TPID=0x8100           | VID=0x5 PCP=111 CFI=0  |  4
   +-----------------------+------------------------|
   | Length= n+3           | LLC = 0xFE 0xFE        |  4
   +-----------+------------------------------------+
   | SNAP=0x03 | NLPID=0x83|                        |  4
   +-----------+-----------+                        |
   |          ISIS L2 LAN-IIH                       |  n-3 octets
   |                                                |
   +------------------------------------------------+

   Figure 8 ISIS L2 LAN-IIH from R1 to R2 on AC

   When the IIH is carried over the PPW it is encapsulated by PE1 as
   shown below:

   +------------------------------------------------+
   |PSN Physical layer headers                      |  m octets
   +------------------------------------------------+
   |PW Demultiplexer Label=0x4321 S=1 TC=0x7        |  4
   |------------------------------------------------|
   |IPv4 header (GRE Delivery header)               |  20
   |      IPv4 protocol field=47(GRE)               |
   |      TTL=0, Checksum=                          |
   |      Src Addr 127.0.0.1                        |
   |      Dst Addr 127.0.0.1                        |
   |------------------------------------------------|
   | GRE Header Protocol Type=0x8100                |  8
   |            Checksum=                           |
   +------------------------------------------------+
   | GRE Payload Packet - frame E                   |  n+22 octets
   |                                                |
   +------------------------------------------------+

   Figure 9 ISIS L2 LAN-IIH from R1 to R2 on PPW-EIM

   A unicast IP packet routed by R1 that has 198.0.2.2 as next-hop is
   formatted by R1 as shown below:

   +------------------------------------------------+
   | Dest MAC addr 10-11-12-13-14-15                |  4
   |                       +------------------------|
   |                       |                        |  4
   +-----------------------+                        |
   | Src MAC addr 00-01-02-03-04-05                 |  4
   +-----------------------+------------------------|
   | TPID=0x8100           | VID=0x5 PCP=000 CFI=0  |  4
   +-----------------------+------------------------|
   | EtherType=0x800       |                        |  4
   +-----------------------+                        |
   |          IP packet                             |  n-2 octets
   |                                                |
   +------------------------------------------------+

   Figure 10 IP packet from R1 to R2 on AC

   When this IP packet is carried over the PPW it is encapsulated by
   PE1 as shown below:

   +------------------------------------------------+
   |PSN Physical layer headers                      |  m octets
   +------------------------------------------------+
   |PW Demultiplexer Label=0x4321 S=1 TC=0x0        |  4
   +------------------------------------------------+
   |          IP packet                             |  n octets
   |                                                |
   +------------------------------------------------+

   Figure 11 IP packet from R1 to R2 on PPW-EIM

A.2. CE co-existing with PE - interconnect

      R1                                      R2
   +-------+                               +-------+
   |CE1|   |                               |   |CE2|
   +---|   |             +---+             |   |---|
   |   |PE1|-------------| P |-------------|PE2|   |
   |   .   |     L1      +---+      L2     |   .   |
   |   .   |                               |   .   |
   +-------+                               +-------+

   R1, R2 - IP/MPLS routers with co-existing PE and CE
   PE1, PE2 - PPW(PPW-EIM) capable PEs
   CE1, CE2 - IP/MPLS routers with a p2p IP/MPLS interface
   P - MPLS P router
   L1, L2 - MPLS packets

   Figure 12 CE interconnect when co-existing with PE

   CE1 has a p2p unnumbered IP interface to CE2. This interface runs
   ISIS level-2 as a routing protocol.

   The IP interface is emulated with a PPW (using encapsulation
   PPW-EIM) from PE1 to PE2. Neither a control-word nor a flow-label is
   used on the PPW. PE2 has allocated an MPLS label 0x4321 as the PW
   demultiplexer. The PPW is encapsulated in an MPLS PSN tunnel that is
   a 2-hop bi-directional LSP TE tunnel from PE1 to PE2 set up without
   PHP.
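   The per-packet overhead that this setup implies can be worked out
   with simple arithmetic. The sketch below is illustrative only. It
   assumes that the PSN tunnel contributes exactly one 4-byte label
   (no PHP), that physical layer headers are excluded, and that the
   GRE Delivery header and GRE header take the 20 and 8 octets shown
   in the figures of this appendix. The function and constant names
   are hypothetical.

```python
LABEL_BYTES = 4               # one MPLS label stack entry
GRE_DELIVERY_IPV4_BYTES = 20  # IPv4 header used as GRE Delivery header
GRE_HEADER_BYTES = 8          # GRE header as sized in the figures


def ppw_eim_overhead(payload_kind: str, tunnel_labels: int = 1) -> int:
    """Bytes added in front of the payload when no CW and no
    flow-label are used (physical layer headers not counted)."""
    if payload_kind not in ("ip", "mpls", "other"):
        raise ValueError("payload_kind must be 'ip', 'mpls' or 'other'")
    # Tunnel label(s) plus the PW demultiplexer label.
    overhead = (tunnel_labels + 1) * LABEL_BYTES
    if payload_kind == "other":
        # Non-IP/MPLS payloads (for example an ISIS IIH) are wrapped
        # in GRE behind an IPv4 Delivery header.
        overhead += GRE_DELIVERY_IPV4_BYTES + GRE_HEADER_BYTES
    return overhead


for kind in ("ip", "mpls", "other"):
    print(kind, ppw_eim_overhead(kind))
```

   Under these assumptions IP and MPLS payloads carry 8 bytes of label
   overhead, while control packets such as the IIH carry 36 bytes.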
   The level-2 p2p ISIS hello packet (IIH) from CE1 to CE2 is
   encapsulated by PE1 as shown below:

   +------------------------------------------------+
   |PSN Tunnel and Physical layer headers           |  m octets
   +------------------------------------------------+
   |PW Demultiplexer Label=0x4321 S=1 TC=0x7        |  4
   |------------------------------------------------|
   |IPv4 header (GRE Delivery header)               |  20
   |      IPv4 protocol field=47(GRE)               |
   |      TTL=1, Checksum=                          |
   |      Src Addr 127.0.0.1                        |
   |      Dst Addr 127.0.0.1                        |
   |------------------------------------------------|
   | GRE Header Protocol Type=Length=n              |  8
   |            Checksum=                           |
   +------------------------------------------------+
   | GRE Payload Packet - IIH                       |  n octets
   |                                                |
   +------------------------------------------------+

   Figure 13 ISIS IIH from CE1 to CE2 on PPW-EIM

   An IP packet routed by CE1 that has the unnumbered interface to CE2
   as the next-hop is encapsulated by PE1 as shown below:

   +------------------------------------------------+
   |PSN Tunnel and Physical layer headers           |  m octets
   +------------------------------------------------+
   |PW Demultiplexer Label=0x4321 S=1 TC=0x0        |  4
   +------------------------------------------------+
   |          IP packet                             |  n octets
   |                                                |
   +------------------------------------------------+

   Figure 14 IP packet from CE1 to CE2 on PPW-EIM

   An MPLS packet switched by CE1 that has the unnumbered interface to
   CE2 as the next-hop is encapsulated by PE1 as shown below:

   +------------------------------------------------+
   |PSN Tunnel and Physical layer headers           |  m octets
   +------------------------------------------------+
   |PW Demultiplexer Label=0x4321 S=0 TC=0x0        |  4
   +------------------------------------------------+
   |          MPLS packet                           |  n octets
   |                                                |
   +------------------------------------------------+

   Figure 15 MPLS packet from CE1 to CE2 on PPW-EIM

Authors'
Addresses

   Sriganesh Kini
   Ericsson
   300 Holger Way, San Jose, CA 95134
   EMail: sriganesh.kini@ericsson.com

   David Sinicrope
   Ericsson
   8001 Development Dr, Research Triangle Park, NC 27709
   EMail: david.sinicrope@ericsson.com