Network Working Group                                              Y. Gu
Internet-Draft                                                    W. Hao
Intended status: Standards Track                                  Huawei
Expires: January 10, 2013                                   July 9, 2012


Analysis of external assistance to NVE and consideration of architecture
                    draft-gu-nvo3-overlay-cp-arch-00

Abstract

   Draft [overlay-cp] has introduced some control plane requirements and
   characteristics.
   From the NVE's perspective, this draft describes what
   assistance is needed for the NVE to satisfy the requirements and
   characteristics introduced in [overlay-cp].  Not all of this
   assistance is necessarily provided by an external controller.  Some
   of the assistance requirements can be regarded as complementary
   requirements to [overlay-cp], while others are requirements on an
   assistance database.  This draft also provides considerations on what
   the network virtualization architecture should look like and how this
   assistance can be fulfilled.  The goal is to help the working group
   work out the architecture of the overlay control plane, not to
   provide solutions.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 10, 2013.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Terminologies and concepts
   3.  The fundamental requirements and characteristics
     3.1.  Assistance to NVE
       3.1.1.  Assistance from TES
     3.2.  Access Control List
     3.3.  QoS
     3.4.  DHCP Snooping
     3.5.  NVE to VNI Registration
     3.6.  VNI to Multicast Addr Mapping
     3.7.  Synchronization
   4.  Implementation Options and Architecture considerations
     4.1.  Exclusively using External Controller
     4.2.  Hybrid of External Controller and Centralized Database
       4.2.1.  Brief introduction of VDP profile database and
               work flow
       4.2.2.  Example Architecture and Work Flow
   5.  Summary
   6.  Security Considerations
   7.  References
     7.1.  Normative Reference
     7.2.  Informative Reference
   Authors' Addresses

1.  Introduction

   Draft [overlay-cp] has introduced some control plane requirements and
   characteristics.
   From the NVE's perspective, this draft describes what
   assistance is needed for the NVE to satisfy the requirements and
   characteristics introduced in [overlay-cp].  Not all of this
   assistance is necessarily provided by an external controller.  Some
   of the assistance requirements can be regarded as complementary
   requirements to [overlay-cp], while others are requirements on an
   assistance database.  This draft also provides considerations on what
   the network virtualization architecture should be and how this
   assistance can be fulfilled.  The goal is to help the working group
   work out the architecture of the overlay control plane, not to
   provide solutions.

2.  Terminologies and concepts

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

   This document uses terms defined in [framework] and [overlay-cp].

   VN: Virtual Network.  This is a virtual L2 or L3 domain that belongs
   to a tenant.

   VNI: Virtual Network Instance.  This is one instance of a virtual
   overlay network.  Two Virtual Networks are isolated from one another
   and may use overlapping addresses.

   Virtual Network Context or VN Context: Field that is part of the
   overlay encapsulation header which allows the encapsulated frame to
   be delivered to the appropriate virtual network endpoint by the
   egress NVE.  The egress NVE uses this field to determine the
   appropriate virtual network context in which to process the packet.
   This field MAY be an explicit, unique (to the administrative domain)
   virtual network identifier (VNID) or MAY express the necessary
   context information in other ways (e.g. a locally significant
   identifier).

   VNID: Virtual Network Identifier.
   In the case where the VN context
   has global significance, this is the ID value that is carried in each
   data packet in the overlay encapsulation that identifies the Virtual
   Network the packet belongs to.

   NVE: Network Virtualization Edge.  It is a network entity that sits
   on the edge of the NVO3 network.  It implements network
   virtualization functions that allow for L2 and/or L3 tenant
   separation and for hiding tenant addressing information (MAC and IP
   addresses).  An NVE could be implemented as part of a virtual switch
   within a hypervisor, a physical switch or router, a Network Service
   Appliance or even be embedded within an End Station.

   Underlay or Underlying Network: This is the network that provides the
   connectivity between NVEs.  The Underlying Network can be completely
   unaware of the overlay packets.  Addresses within the Underlying
   Network are also referred to as "outer addresses" because they exist
   in the outer encapsulation.  The Underlying Network can use a
   completely different protocol (and address family) from that of the
   overlay.

   Data Center (DC): A physical complex housing physical servers,
   network switches and routers, Network Service Appliances and
   networked storage.  The purpose of a Data Center is to provide
   application and/or compute and/or storage services.  One such service
   is virtualized data center services, also known as Infrastructure as
   a Service.

   VM: Virtual Machine.  Several Virtual Machines can share the
   resources of a single physical computer server using the services of
   a Hypervisor (see the definition below).

   Hypervisor: Server virtualization software running on a physical
   compute server that hosts Virtual Machines.  The hypervisor provides
   shared compute/memory/storage and network connectivity to the VMs
   that it hosts.  Hypervisors often embed a Virtual Switch (see below).

   Virtual Switch: A function within a Hypervisor (typically implemented
   in software) that provides similar services to a physical Ethernet
   switch.  It switches Ethernet frames between VMs' virtual NICs within
   the same physical server, or between a VM and a physical NIC
   connecting the server to a physical Ethernet switch.  It also
   enforces network isolation between VMs that should not communicate
   with each other.

   Tenant: A customer who consumes virtualized data center services
   offered by a cloud service provider.  A single tenant may consume one
   or more Virtual Data Centers hosted by the same cloud service
   provider.

   Tenant End System (TES): An end system of a particular tenant, which
   can be, for instance, a virtual machine (VM), a non-virtualized
   server, or a physical appliance.

   Virtual Access Points (VAPs): Tenant End Systems are connected to the
   Tenant Instance through Virtual Access Points (VAPs).  In practice,
   the VAPs can be physical ports on a ToR or virtual ports identified
   through logical interface identifiers (VLANs, internal VSwitch
   Interface ID leading to a VM).

   VN Name: A globally unique name for a VN.  The VN Name is not carried
   in data packets originating from End Stations, but must be mapped
   into an appropriate VN-ID for a particular encapsulating technology.
   Using VN Names rather than VN-IDs to identify VNs in configuration
   files and control protocols increases the portability of a VDC and
   its associated VNs when moving among different administrative domains
   (e.g. switching to a different cloud service provider).

   VSI: Virtual Station Interface.  Typically, a VSI is a virtual NIC
   connected directly to a VM.  [Qbg]

3.  The fundamental requirements and characteristics

   This section summarizes the fundamental requirements and
   characteristics given in [overlay-cp].

   Summary of requirements:

   o  Inner to Outer address mapping

   o  Underlying Network Multi-Destination Delivery Address(es)

   o  VN Connect/Disconnect Notification

   o  VN Name to VN-ID Mapping

   Summary of characteristics:

   o  As little locally cached state as possible

   o  Fast acquisition of needed state

   o  Fast detection/update of stale cached state information

   o  Minimize processing overhead

   o  Highly scalable

   o  Minimize the complexity of the implementation

   o  Extensible

   o  Simple protocol configuration

   o  Do not rely on IP Multicast

   o  Flexible mapping sources

3.1.  Assistance to NVE

   This section describes the assistance to the NVE that is needed in
   addition to the requirements enumerated above.  The additional
   requirements must also satisfy the required characteristics.  We
   call it assistance, rather than control plane requirements, because
   it can be provided by a controller or by a database, and a database
   is not traditionally regarded as part of the control plane.

   Later sections introduce more than one option for providing this
   assistance.  Whatever control plane components the working group
   finally adopts, the assistance requirements must be satisfied.

3.1.1.  Assistance from TES

   Draft [tes-nve-mechanism] describes some requirements and possible
   mechanisms to meet them.  These requirements are the assistance that
   the TES can provide, possibly together with external entities, e.g.
   controllers or a profile database.  A summary is given here.

   REQUIREMENT-1: The TNP (TES to NVE notification mechanism and
   protocol) MUST support TES to notify NVE about the VM's status,
   including but not limited to Start up, Shut down, Emigration and
   Immigration.

   REQUIREMENT-2: The TNP MUST support TES to notify NVE about the VM's
   VN Clue, which can be one identifier or a combination of several
   identifiers.

   REQUIREMENT-3: The TNP MUST support TES to notify NVE about the VM's
   inner address.  The inner address MUST include the MAC address of
   the VM's virtual NIC, the VM's IP address, or both.  The TNP SHOULD
   be extensible to carry new address types.

   REQUIREMENT-4: The TNP MUST support NVE to notify TES about the VM's
   local tag.  The local tag types supported by the TNP MUST include
   the IEEE 802.1Q tag.  The TNP SHOULD be extensible to carry other
   types of local tag.

   REQUIREMENT-5: The TNP SHOULD support NVE to notify TES about the
   VM's traffic PCP value.

   The following sections describe assistance that the NVE needs and
   that can be provided by entities other than the TES, e.g. by an
   external controller or a database.  These assistance requirements
   are complementary to those introduced in [overlay-cp].

3.2.  Access Control List

   When a VAP identifies a new member, be it a VM or a physical server,
   the NVE needs to obtain the Access Control List (ACL) for that
   member.  The ACL may be associated with a specific member or with a
   specific VNI.  If the ACL is associated with a specific VNI, the NVE
   only needs to obtain the ACL the first time it becomes associated
   with that VNI.

   If the ACL changes, e.g. rules are modified or deleted, the
   assistance subject must be able to notify the NVE to update the ACL.

   When the member migrates to a new NVE, the new NVE must be able to
   obtain the ACL as soon as possible.

3.3.  QoS

   As with the ACL, the NVE needs to obtain the QoS policies when a new
   member is associated with the NVE.  To enforce QoS policies, not
   only the NVE but also the other network devices on the traffic path
   need to be aware of the QoS policies.  In the NVO3 working group,
   however, we focus only on the NVE.

   When the member migrates to a new NVE, the new NVE must be able to
   obtain the QoS policies as soon as possible.

3.4.  DHCP Snooping

   When the DHCP Snooping function is enabled on an NVE, a DHCP
   snooping table entry is created by the access NVE.  When a VM
   migrates to a new NVE, the VM may not resend a DHCP request, since
   the migration is transparent to the VM and its IP address must stay
   the same.  In this case, the new NVE must have some way to obtain
   the DHCP Snooping information created by the original NVE, and the
   original NVE must be able to delete that information in a timely
   manner.

3.5.  NVE to VNI Registration

   When the first member of a specific VNI is created on an NVE, the
   NVE needs to register the association with an external entity.  The
   purpose is to provide a global view of which NVEs belong to a
   specific VNI.  The NVE to VNI mapping is needed for multicast within
   a single VNI and for updating QoS/ACL policies.  For example, all
   NVEs that serve at least one member of a particular VNI have to be
   notified of updated ACL or QoS policies related to this VNI.

3.6.  VNI to Multicast Addr Mapping

   An NVE can obtain the inner to outer address mapping through control
   plane assistance or through data plane learning.  In the latter
   case, the NVE must be able to learn the VNI to multicast address
   mapping in order to forward unknown unicast and broadcast traffic.

3.7.  Synchronization

   This assistance is a general requirement.  For any information that
   an NVE obtains from an external entity, when the origin of the
   information changes, all relevant NVEs that hold a local copy of the
   information must be able to synchronize with the origin.  Examples
   of such information are ACL, QoS, Inner to Outer address mapping, VN
   Name to VNID mapping, and the global view of NVEs to VNI.

4.  Implementation Options and Architecture considerations

   The requirements summarized in Section 3, together with the
   assistance described in Sections 3.1 through 3.7, are what the NVE
   needs in order to perform overlay forwarding in a way that satisfies
   the characteristics in Section 3.  Not all of this assistance is
   necessarily a requirement on an external controller.  In fact, there
   is more than one way to meet these requirements.  This section
   introduces two kinds of assistance subject that can meet them.
   These should not be regarded as solution proposals, but as
   considerations on overlay control plane components.

   This draft only considers the situation where an external NVE is
   embedded in a network device and VMs reach the NVE through the
   hypervisor.  For other cases, the mechanisms introduced here can
   also be used, with the necessary adaptation.

   Two assistance subjects are introduced: an external controller and a
   centralized database.  It is not feasible to use only a database;
   e.g. it is hard for a database to synchronize mappings and QoS/ACL
   policies among all NVEs relevant to a VNI.  A centralized database
   can, however, offload much work from the controller.

4.1.  Exclusively using External Controller

   In this option, only an external controller assists the NVE with
   virtual network forwarding.  The controller may host a database
   itself or have one directly attached.

                     +------Control Protocol-----+
                     |                           |
                     |                           |
        +------------+---------+           +------------+
        | +----------+-------+ |           | Controller |-----+
        | |  Overlay Module  | |           +-------+----+     |
        | +---------+--------+ |                   | Database |
        |           |VN context|                   +----------+
        |           |          |
        |  +--------+-------+  |
        |  |      VNI       |  |
   NVE1 |  +-+------------+-+  |
        |    |    VAPs    |    |
        +----+------------+----+
             |            |
      -------+------------+-----
             |            |
             |            |
           Tenant End Systems

                Fig1. Architecture with only controller

   The work flow is as follows.

   TES/VM          NVE                         Controller
     |--start up-->|
     or immigrate  |<-get mappings and policies-->|
                     (VNID, inner to outer, etc)
              locally create               locally record
                  caches                   NVE-VNI mapping

     |-data frame->|
                   |--encapsulation---->

     |-emigrate--->|
                   |--notify VM emigration-------->|
              locally update               locally update
                  caches                   NVE-VNI mapping

                   |<-synch mappings and policies--|
              locally update
                  caches

       Fig2. Work flow in which the controller assists the NVE

4.2.  Hybrid of External Controller and Centralized Database

4.2.1.  Brief introduction of VDP profile database and work flow

   Take the profile database introduced in IEEE 802.1Qbg [Qbg] as an
   example of a centralized database.  IEEE 802.1Qbg mentions a
   database that assists the VDP protocol.  The database is not
   standardized in IEEE 802.1Qbg, but it is assumed as background in
   the definition of VDP.  Please refer to [tes-nve-mechanism] for a
   brief introduction to VDP.  The following figure shows what a
   profile database is and how it works.

   +-------------------+
   | +----+  +-------+ |Step4 +---------+
   | | VM |--|Hyper- | |------| Bridge  |--------+
   | +----+  |Visor  | | VDP  +---------+        |
   |         +-------+ |           | Database    |
   +-------------------+           | protocol    |
         | Step3             Step5 |         +--------+
         |                         |     +---|Network |
     +-----------+   API   +---------+   |   |Admin   |
     |VM Manager |---------| Profile |---+   +--------+
     +-----------+  Step2  | Database| Step1
                           +---------+

                      Fig3. VDP Profile Database

   A profile database is a centralized database that stores the
   profiles of VSI types and VMs.  A VSI type is a set of policies or
   resource definitions that can be shared by all VMs that choose to
   use this VSI type.  A VSI type can be regarded as an instance of a
   Virtual Network.  The profile is quite flexible; it can be organized
   as shown in the following figure and include one or more of the
   following items of information.
   Other profile organization formats are possible, and the profile is
   easy to extend to include more information.

   +--------+---------------+------------------------------+
   |VSI type|Profile type   | description                  |
   +--------+---------------+------------------------------+
   |VN1     |Priority       | The priority of traffic      |
   |        |QoS            | QoS policies for the VSI type|
   |        |ACL            | ACL rules for the VSI type   |
   |        |Bandwidth      | Bandwidth of the traffic     |
   |        |Multicast Addr | The multicast addr for all   |
   |        |               | VMs belong to the VN         |
   |        |VNID           | A globally unique ID for this|
   |        |               | VN                           |
   +--------+---------------+------------------------------+
   |VN2     |Priority       | The priority of traffic      |
   |        |QoS            | QoS policies for the VSI type|
   |        |ACL            | ACL rules for the VSI type   |
   |        |Bandwidth      | Bandwidth of the traffic     |
   |        |Multicast Addr | The multicast addr for all   |
   |        |               | VMs belong to the VSI type   |
   |        |VNID           | A globally unique ID for this|
   |        |               | VN                           |
   +--------+---------------+------------------------------+

                  Fig4. Profile organization example

   A mapping between VSI type and VM is also managed in the database.

   +--------+-------+-------------+---------------------------+
   |VSI type|VM list| Profile type| description               |
   +--------+-------+-------------+---------------------------+
   |VN1     |VM1    |MAC Addr     | The MAC Addr of VM's vNIC.|
   |        |       |VID          | The VID to which the VM is|
   |        |       |             | associated.               |
   |        |       |Inner Addr   | The inner addr of the VM, |
   |        |       |             | which can be IPv4/v6 addr.|
   |        |       |Outer Addr   | The outer addr of the VM, |
   |        |       |             | which can be IPv4/v6 addr.|
   |        |-------+-------------+---------------------------+
   |        |VM2    |MAC Addr     | The MAC Addr of VM's vNIC.|
   |        |       |VID          | The VID to which the VM is|
   |        |       |             | associated.               |
   |        |       |Inner Addr   | The inner addr of the VM, |
   |        |       |             | which can be IPv4/v6 addr.|
   |        |       |Outer Addr   | The outer addr of the VM, |
   |        |       |             | which can be IPv4/v6 addr.|
   +--------+-------+-------------+---------------------------+

                    Fig5. VSI type to VM mapping

   The work flow of VDP with a profile database is as follows.

   o  Step1: The Network Administrator creates the VSI type database.

   o  Step2: The VM Manager queries the available VSI types and obtains
      a VSI type instance.

   o  Step3: The VM Manager creates a VM on a physical server and pushes
      the VSI type information to the Hypervisor.

   o  Step4: When a VM starts up, shuts down, emigrates or immigrates,
      VDP messages are exchanged between the hypervisor and the bridge.

   o  Step5: The Bridge retrieves the VSI type information from the
      profile database.

4.2.2.  Example Architecture and Work Flow

                     +------Control Protocol-----+
                     |                           |
                     |                           |
        +------------+---------+           +------------+
        | +----------+-------+ |           | Controller |
        | |  Overlay Module  | |           +------------+
        | +---------+--------+ |
        |           |VN context|
        |           |          |            +-----------+
        |  +--------+-------+  |  Database  |  Profile  |
        |  |      VNI       |  |------------| Database  |
   NVE1 |  +-+------------+-+  |    API     +-----------+
        |    |    VAPs    |    |
        +----+------------+----+
             |            |
      -------+------------+-----
             |            |
             |            |
           Tenant End Systems

                     Fig6. Example architecture

   TES/VM          NVE                        database
     |--start up-->|
     or immigrate  |<-get mappings and policies->|
                     (VNID, inner to outer, etc)
              locally create
                  caches                          Controller
                   |--register NVE-VNI mapping-------->|
                                                  locally update
                                                  NVE-VNI mapping
     |-data frame->|
                   |--encapsulation---->

     |-emigrate--->|
                   |--notify VM emigration------------>|
              locally update                      locally update
                  caches                          NVE-VNI mapping

                                                |-syn->|
                                           while mappings and/or
                                           policies are updated

                   |<-synch mappings and policies------|

                   |<-get mappings and policies->|
                     (VNID, inner to outer, etc)
              locally update
                  caches

                       Fig7. Example work flow

5.  Summary

   Comparing the mechanisms in Sections 4.1 and 4.2 gives the following
   results.  From an architecture point of view, the exclusive
   controller option has a simpler architecture with fewer interaction
   requirements and a simpler work flow.

   From the point of view of performance and reuse of existing
   protocols, the hybrid mechanism can offload queries for static
   information to the database, which can improve the performance of
   the controller and make the system more extensible.

6.  Security Considerations

   TBA

7.  References

7.1.  Normative Reference

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [Qbg]      "IEEE P802.1Qbg Edge Virtual Bridging".

7.2.  Informative Reference

   [framework]
              Lasserre, M., Balus, F., Morin, T., Bitar, N., and Y.
              Rekhter, "draft-lasserre-nvo3-framework-02", June 2012.

   [overlay-cp]
              Kreeger, L., Dutt, D., Narten, T., Black, D., and M.
              Sridharan, "draft-kreeger-nvo3-overlay-cp-00", Jan 2012.

   [tes-nve-mechanism]
              Gu, Y., "The mechanism and protocol between TES and NVE to
              facilitate NVO3", July 2012.

Authors' Addresses

   Gu Yingjie
   Huawei
   No. 101 Software Avenue
   Nanjing, Jiangsu Province  210001
   P.R.China

   Phone: +86-25-56625392
   Email: guyingjie@huawei.com


   Weiguo Hao
   Huawei
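
   As an informal illustration of the hybrid work flow in Section 4.2.2
   (Fig7), the following Python sketch models an NVE that fetches
   static profiles from a profile database, registers its NVE-VNI
   association with a controller, and is told by the controller to
   re-synchronize when a profile changes.  It is only a sketch under
   stated assumptions, not a protocol definition: the class names
   (ProfileDatabase, Controller, NVE), the method names (get_profile,
   register, deregister, notify_update, vm_attach, vm_detach, resync)
   and the profile fields are all hypothetical and do not come from
   [overlay-cp], [Qbg] or this draft.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class ProfileDatabase:
    """Hypothetical centralized store of static per-VN profiles."""
    profiles: dict = field(default_factory=dict)  # VN name -> profile dict

    def get_profile(self, vn_name):
        # Hand out a deep copy so an NVE cache never aliases the origin.
        return copy.deepcopy(self.profiles[vn_name])


@dataclass
class Controller:
    """Hypothetical controller holding the global NVE-to-VNI view."""
    members: dict = field(default_factory=dict)  # VN name -> set of NVE ids
    nves: dict = field(default_factory=dict)     # NVE id -> NVE object

    def register(self, nve, vn_name):
        self.nves[nve.nve_id] = nve
        self.members.setdefault(vn_name, set()).add(nve.nve_id)

    def deregister(self, nve, vn_name):
        self.members.get(vn_name, set()).discard(nve.nve_id)

    def notify_update(self, vn_name):
        # Ask every NVE that serves this VN to refresh its cached profile.
        for nve_id in sorted(self.members.get(vn_name, ())):
            self.nves[nve_id].resync(vn_name)


@dataclass
class NVE:
    nve_id: str
    database: ProfileDatabase
    controller: Controller
    cache: dict = field(default_factory=dict)     # VN name -> cached profile
    attached: dict = field(default_factory=dict)  # VN name -> set of VM ids

    def vm_attach(self, vm_id, vn_name):
        # Start up or immigration: fetch and register on first member only.
        first_member = vn_name not in self.attached
        self.attached.setdefault(vn_name, set()).add(vm_id)
        if first_member:
            self.cache[vn_name] = self.database.get_profile(vn_name)
            self.controller.register(self, vn_name)

    def vm_detach(self, vm_id, vn_name):
        # Shut down or emigration: drop state when the last member leaves.
        vms = self.attached.get(vn_name, set())
        vms.discard(vm_id)
        if not vms:
            self.attached.pop(vn_name, None)
            self.cache.pop(vn_name, None)
            self.controller.deregister(self, vn_name)

    def resync(self, vn_name):
        self.cache[vn_name] = self.database.get_profile(vn_name)


if __name__ == "__main__":
    db = ProfileDatabase({"VN1": {"vnid": 5001, "acl": ["permit any"]}})
    ctl = Controller()
    nve1, nve2 = NVE("NVE1", db, ctl), NVE("NVE2", db, ctl)
    nve1.vm_attach("VM1", "VN1")
    nve2.vm_attach("VM2", "VN1")
    db.profiles["VN1"]["acl"] = ["deny any"]  # the origin changes
    ctl.notify_update("VN1")                  # controller triggers resync
    print(nve1.cache["VN1"]["acl"], nve2.cache["VN1"]["acl"])
```

   In this sketch the database only answers queries for static
   information, while the controller keeps the NVE-VNI view and drives
   synchronization, which mirrors the division of work suggested in
   Section 5.  How the controller learns that a profile has changed
   (the "syn" step in Fig7) is left out and would need its own
   mechanism.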