idnits 2.17.1 draft-cth-rtgwg-bgp-control-04.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- No issues found here. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year == The document doesn't use any RFC 2119 keywords, yet seems to have RFC 2119 boilerplate text. -- The document date (April 30, 2020) is 1457 days in the past. Is this intentional? Checking references for intended status: Informational ---------------------------------------------------------------------------- == Unused Reference: 'RFC4271' is defined on line 738, but no explicit reference was found in the text == Unused Reference: 'I-D.ietf-idr-bgpls-segment-routing-epe' is defined on line 761, but no explicit reference was found in the text == Unused Reference: 'I-D.ietf-idr-flowspec-path-redirect' is defined on line 767, but no explicit reference was found in the text == Unused Reference: 'I-D.ietf-isis-segment-routing-extensions' is defined on line 778, but no explicit reference was found in the text == Unused Reference: 'I-D.ietf-rtgwg-bgp-routing-large-dc' is defined on line 790, but no explicit reference was found in the text == Unused Reference: 'I-D.ietf-spring-segment-routing' is defined on line 795, but no explicit reference was found in the text ** Obsolete normative reference: RFC 1771 (Obsoleted by RFC 4271) ** Obsolete normative reference: RFC 3107 (Obsoleted by RFC 
8277) ** Obsolete normative reference: RFC 5575 (Obsoleted by RFC 8955) ** Obsolete normative reference: RFC 7752 (Obsoleted by RFC 9552) == Outdated reference: A later version (-12) exists of draft-ietf-idr-flowspec-path-redirect-10 == Outdated reference: A later version (-26) exists of draft-ietf-idr-segment-routing-te-policy-08 Summary: 4 errors (**), 0 flaws (~~), 10 warnings (==), 1 comment (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Network Working Group Y. Luo 3 Internet-Draft L. Qu 4 Intended status: Informational China Telcom Co., Ltd. 5 Expires: November 1, 2020 X. Huang 6 Tencent 7 H. Chen 8 Futurewei 9 S. Zhuang 10 Z. Li 11 Huawei 12 April 30, 2020 14 Architecture for Use of BGP as Central Controller 15 draft-cth-rtgwg-bgp-control-04 17 Abstract 19 BGP is a core part of a network including Software-Defined Networking 20 (SDN) system. It has the traffic engineering information on the 21 network topology and can compute optimal paths for a given traffic 22 flow across the network. 24 This document describes some reference architectures for BGP as a 25 central controller. A BGP-based central controller can simplify the 26 operations on the network and use network resources efficiently for 27 providing services with high quality. 29 Requirements Language 31 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 32 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 33 document are to be interpreted as described in RFC 2119 [RFC2119]. 35 Status of This Memo 37 This Internet-Draft is submitted in full conformance with the 38 provisions of BCP 78 and BCP 79. 40 Internet-Drafts are working documents of the Internet Engineering 41 Task Force (IETF). Note that other groups may also distribute 42 working documents as Internet-Drafts. 
The list of current Internet- 43 Drafts is at https://datatracker.ietf.org/drafts/current/. 45 Internet-Drafts are draft documents valid for a maximum of six months 46 and may be updated, replaced, or obsoleted by other documents at any 47 time. It is inappropriate to use Internet-Drafts as reference 48 material or to cite them other than as "work in progress." 49 This Internet-Draft will expire on November 1, 2020. 51 Copyright Notice 53 Copyright (c) 2020 IETF Trust and the persons identified as the 54 document authors. All rights reserved. 56 This document is subject to BCP 78 and the IETF Trust's Legal 57 Provisions Relating to IETF Documents 58 (https://trustee.ietf.org/license-info) in effect on the date of 59 publication of this document. Please review these documents 60 carefully, as they describe your rights and restrictions with respect 61 to this document. Code Components extracted from this document must 62 include Simplified BSD License text as described in Section 4.e of 63 the Trust Legal Provisions and are provided without warranty as 64 described in the Simplified BSD License. 66 Table of Contents 68 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 69 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 3 70 3. Architectures . . . . . . . . . . . . . . . . . . . . . . . . 4 71 3.1. Building Blocks . . . . . . . . . . . . . . . . . . . . . 5 72 3.1.1. TEDB . . . . . . . . . . . . . . . . . . . . . . . . 5 73 3.1.2. SLDB . . . . . . . . . . . . . . . . . . . . . . . . 5 74 3.1.3. TPDB . . . . . . . . . . . . . . . . . . . . . . . . 5 75 3.1.4. CSPF . . . . . . . . . . . . . . . . . . . . . . . . 6 76 3.1.5. TM . . . . . . . . . . . . . . . . . . . . . . . . . 6 77 3.2. One Controller . . . . . . . . . . . . . . . . . . . . . 6 78 3.3. Controller Cluster . . . . . . . . . . . . . . . . . . . 8 79 3.4. Hierarchical Controllers . . . . . . . . . . . . . . . . 10 80 4. Application Scenarios . . . . . . . . . . . . . . . . . . 
. . 11 81 4.1. Business-oriented Traffic Steering . . . . . . . . . . . 11 82 4.1.1. Preferential Users . . . . . . . . . . . . . . . . . 11 83 4.1.2. Preferential Services . . . . . . . . . . . . . . . . 12 84 4.2. Traffic Congestion Mitigation . . . . . . . . . . . . . . 13 85 4.2.1. Congestion Mitigation in Core . . . . . . . . . . . . 14 86 4.2.2. Congestion Mitigation among ISPs . . . . . . . . . . 14 87 4.2.3. Congestion Mitigation at International Edge . . . . . 15 88 5. Security Considerations . . . . . . . . . . . . . . . . . . . 16 89 6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 16 90 7. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 16 91 8. Contributors . . . . . . . . . . . . . . . . . . . . . . . . 16 92 9. References . . . . . . . . . . . . . . . . . . . . . . . . . 16 93 9.1. Normative References . . . . . . . . . . . . . . . . . . 17 94 9.2. Informative References . . . . . . . . . . . . . . . . . 17 95 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 18 97 1. Introduction 99 Border Gateway Protocol (BGP) [RFC1771] is an exterior gateway 100 protocol (EGP). It was developed to exchange routing information 101 among routers in different autonomous systems (ASes). Over the course of its 102 development, BGP has been extended to provide numerous new 103 functions. It collects link states, including traffic engineering 104 (TE) information, from other protocols such as an IGP and distributes 105 them among routers in different ASes [RFC7752]. It also controls the 106 redirection of traffic flows [RFC5575]. Furthermore, it distributes 107 MPLS labels [RFC3107]. For scalability, BGP has been extended with the 108 Route Reflector (RR) [RFC4456]. 110 For segment routing (SR), BGP is extended to advertise SR policies 111 with candidate paths to the policy headend routers, which are 112 typically ingress routers [I-D.ietf-idr-segment-routing-te-policy]. 
113 The SR-specific PCEP extensions are defined in 114 [I-D.ietf-pce-segment-routing]. A stateful PCE can compute an SR 115 traffic engineering (SR-TE) path satisfying a set of constraints, and 116 initiate an SR-TE path on a headend router using the extensions. 118 An SDN controller (or controller for short) is the core of an SDN 119 system or network. It sits between network elements (NEs) such as 120 routers or switches at one end and applications such as an Operational 121 Support System (OSS) or a Network Management System (NMS) at the other 122 end. The essential function of a controller is to steer traffic 123 flows across the network in order to provide more services with higher 124 quality. It manages network resources such as link bandwidth, 125 computes expected paths for carrying traffic flows based on available 126 network resources, programs the network elements for the creation of 127 tunnels along the paths, and redirects traffic flows into the 128 corresponding tunnels. 130 Given what BGP already does, it is natural, beneficial and relatively 131 simple to extend BGP to act as a controller. Using BGP as a 132 controller for a network can simplify the operations on the 133 network: it avoids deploying, operating and maintaining an additional 134 component or protocol, such as PCE, as the controller in the network. 136 This document describes some reference architectures for BGP as a 137 central controller and introduces some scenarios to which the BGP 138 controller can be applied. 140 2. 
Terminology 142 o SR: Segment Routing 144 o RR: Route Reflector 145 o SID: Segment Identifier 147 o SR-Path: Segment Routing Path 149 o SR-Tunnel: Segment Routing Tunnel 151 o TEDB: Traffic Engineering Database 153 o LSDB: Link State Database 155 o SLDB: SID/Label Database 157 o TPDB: Tunnel and Path Database 159 o CSPF: Constrained Shortest Path First 161 o TM: Tunnel Manager 163 o NMS: Network Management System 165 o SRLB: SR Local Block 167 o NE: Network Element 169 o PCE: Path Computation Element 171 o AS: Autonomous System 173 o QoS: Quality of Service 175 o ISP: Internet Service Provider 177 o MAN: Metropolitan Area Network 179 o OTT: Over the Top 181 o OTTSP: Over the Top Service Provider, or Content Operator 183 o AR: Access Router 185 3. Architectures 187 An architecture for the use of BGP as a central controller is based 188 on the essential function of a controller. It is constructed from 189 some building blocks or components. This section first introduces the building 190 blocks and then describes a few reference 191 architectures. 193 3.1. Building Blocks 195 The critical building blocks are briefly described below: the Traffic 196 Engineering Database (TEDB or TED for short), the SID/Label Database 197 (SLDB), the Tunnel and Path Database (TPDB), Constrained Shortest Path 198 First (CSPF), and the Tunnel Manager (TM). 200 3.1.1. TEDB 202 The Traffic Engineering Database (TEDB) stores the Traffic 203 Engineering (TE) information about the network. It includes the 204 unreserved bandwidth at each of eight priority levels for every link 205 in the network. 207 The TEDB can be a separate block, which is constructed from the link 208 state information received. Alternatively, it may be embedded into the link state 209 database (LSDB) in BGP when BGP creates or updates the LSDB from 210 the link state information it receives. 212 3.1.2. 
SLDB 214 The SID/Label Database (SLDB) records and maintains the status of 215 every Segment Identifier (SID) and label for every node, interface/ 216 link and/or prefix in the network, which the controller controls. 217 The status of SID/label indicates whether the SID/Label is assigned. 218 If it is assigned, then the object such as the node, link or prefix, 219 to which it is assigned, is recorded. 221 SLDB can be an individual block, which is constructed from the link 222 state information such as SR Local Block (SRLB) that the BGP 223 receives. It may be embedded into the link state database (LSDB) in 224 the BGP when the BGP creates the LSDB from the link state information 225 it receives. 227 3.1.3. TPDB 229 The Tunnel and Path Database (TPDB) stores the information for every 230 tunnel, which includes: 232 o the parameters received for the tunnel from a user/application, 234 o the path computed for the tunnel, 236 o the resources such as link bandwidth reserved along the path for 237 the tunnel, 239 o the SID/labels assigned along the path for the tunnel, and 240 o the status of the tunnel. 242 3.1.4. CSPF 244 The Constrained Shortest Path First (CSPF) computes a path for a 245 tunnel such as SR tunnel or LSP tunnel that satisfies a set of given 246 constraints using the information in TEDB. 248 3.1.5. TM 250 The Tunnel Manager (TM) receives a request for an operation on a 251 tunnel from a user or an application such as Network Management 252 System (NMS). The operation may be a creation of a new tunnel, a 253 deletion of an existing tunnel, or a change to an existing tunnel. 255 When receiving a request for creating a new tunnel, the TM asks the 256 CSPF to compute a path for the tunnel that satisfies the constraints 257 given for the tunnel. 
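The path computation that the TM requests from the CSPF can be sketched as follows. This is only an illustrative sketch, not a definitive implementation: the TEDB layout (a dictionary keyed by directed link), the field names "metric" and "unreserved", the sample topology, and the function name cspf are all assumptions made for this example. It prunes every link whose unreserved bandwidth at the requested priority level is too small, then runs an ordinary shortest path first computation on what remains.

```python
import heapq

# Hypothetical TEDB layout: (from_node, to_node) -> link attributes.
# "unreserved" holds the unreserved bandwidth at each of the eight
# priority levels (index 0..7); "metric" is the TE metric of the link.
TEDB = {
    ("A", "B"): {"metric": 10, "unreserved": [100] * 8},
    ("B", "D"): {"metric": 10, "unreserved": [20] * 8},
    ("A", "C"): {"metric": 15, "unreserved": [100] * 8},
    ("C", "D"): {"metric": 15, "unreserved": [100] * 8},
}


def cspf(tedb, src, dst, bandwidth, priority):
    """Return (cost, path) for the cheapest path that has enough unreserved
    bandwidth at the given priority on every link, or None if none exists."""
    # Step 1: prune links that cannot satisfy the bandwidth constraint.
    adjacency = {}
    for (u, v), link in tedb.items():
        if link["unreserved"][priority] >= bandwidth:
            adjacency.setdefault(u, []).append((v, link["metric"]))

    # Step 2: run Dijkstra's shortest path first on the pruned topology.
    best = {src: 0}
    queue = [(0, src, [src])]
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return cost, path
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, metric in adjacency.get(node, []):
            new_cost = cost + metric
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                heapq.heappush(queue, (new_cost, neighbor, path + [neighbor]))
    return None
```

With the sample topology, a request for 10 units of bandwidth takes the cheaper path through B, while a request for 50 units is forced through C because link B->D has only 20 units unreserved. Pruning before the shortest path computation keeps the constraint check separate from the path search, which is one common way to structure CSPF.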
259 After obtaining the path for the tunnel from the CSPF, the TM 260 requests the SLDB to assign SID/labels along the path for the tunnel 261 and asks the TEDB to reserve the resources such as link bandwidth 262 along the path for the tunnel. 264 The TM in a central controller may set up the tunnel along the path 265 in the network by programming each of the NEs along the path through 266 the API to the network. In an SR network, the TM initiates an SR 267 tunnel in the network by sending a sequence of SID/labels to the 268 source NE of the tunnel. 270 The TM records the information for the tunnel in the Tunnel and Path 271 Database (TPDB). The information includes the path computed for the 272 tunnel, the resources such as bandwidth reserved along the path, the 273 SID/labels assigned along the path for the tunnel, and the status of 274 the tunnel. 276 3.2. One Controller 278 The figure below illustrates a reference architecture for using BGP 279 as a central controller, which controls a network. In this 280 reference architecture, the BGP controller controls the network through 281 an API to the network such as BGP+/RR+ (extensions to BGP for central 282 controller). The BGP controller is responsible for creating and 283 maintaining every tunnel in the network. It also controls the 284 redirection of traffic flows into each tunnel. 286 The BGP controller comprises a number of modules, including a TM, a 287 CSPF, a TEDB, an SLDB and a TPDB. 
The interfaces among these modules 288 are listed as follows: 290 +------------------------------------------+ 291 | Users/Applications(Orchestrator/OSS/NMS) | 292 +------------------------------------------+ 293 | 294 +----------------------------------------------+ 295 | BGP as Controller | 296 | +---------------+ | 297 | /------------| TM | | 298 | / Ia +---------------+ | 299 | +--------+ | | | \ | 300 | | CSPF | ________| | | \Id | 301 | +--------+ / Ib /Ic | +---------+ | 302 | \Ie / / | | TPDB | | 303 | +---------+ +-------+ | +---------+ | 304 | | TEDB | | SLDB | | | 305 | +---------+ +-------+ | | 306 | \ \ |In | 307 +----------------API to Network(RR+)-----------+ 308 / \ 309 / \____ 310 / \ \____ 311 /\ .---. .---+ \ 312 | \( ' |'.---. | 313 |---\ Network | '+. 314 (o \ | | ) 315 ( | | o) 316 ( | | ) 317 ( o o .-' 318 ' ) 319 '---._.-. ) 320 '---' 322 o Interface Ia between the TM and the CSPF. Through this interface, 323 the TM requests the CSPF to compute a path for a tunnel with a set 324 of constraints, and the CSPF responds to the TM with the computed 325 path that satisfies the constraints. 327 o Interface Ib between the TM and the TEDB. When a tunnel is to be 328 created, through this interface, the TM reserves in the TEDB the 329 TE resources such as link bandwidth on every link along the path 330 computed for the tunnel. When a tunnel is deleted, the TM 331 releases the TE resources such as link bandwidth on every link 332 along the path for the tunnel. 334 o Interface Ic between the TM and the SLDB. When a tunnel is to be 335 created, through this interface, the TM reserves in the SLDB a 336 SID/label for every link or some links along the path computed for 337 the tunnel. When a tunnel is deleted, the TM releases the SID/ 338 label for every link or some links along the path for the tunnel. 340 o Interface Id between the TM and the TPDB. The TM updates the 341 information for every tunnel in the TPDB through this interface. 
343 o Interface Ie between the CSPF and the TEDB. Through this 344 interface, the CSPF accesses the traffic engineering information 345 such as link bandwidth when it computes a path for a tunnel. 347 There is an interface, In, between the BGP controller and the network. 348 In fact, there is a control channel (or interface) between the BGP 349 controller and every (edge) node in the network. 351 Initially, the TEDB obtains the original traffic engineering (TE) 352 information such as link bandwidth from the network through the 353 interface In (i.e., the API to the network) for every link in the network. 354 The SLDB gets the original SID/label resources from the network 355 through the same interface for every node, link and prefix in the network. 357 3.3. Controller Cluster 359 A critical issue in a network with a central controller is the 360 failure of the controller, which is a single point of failure (SPOF). 361 If the controller fails, the entire network may stop working. 363 A controller cluster (i.e., a group of controllers) works as a single 364 controller from the user's point of view. A simple controller cluster 365 consists of two controllers. One works as an active (or primary) 366 controller, and the other as a standby (or secondary) controller. 367 In normal operation, the active controller is responsible for the 368 network it controls. It also synchronizes with the standby 369 controller. When the active controller fails, the standby controller 370 becomes the new active controller, which controls the network. 372 The figure below illustrates a simple controller cluster containing 373 two BGP-based controllers: Active BGP-based Controller and Standby 374 BGP-based Controller. In normal operation, the active controller 375 interacts with users and/or applications. For example, it receives 376 configurations for tunnels and the traffic flows to tunnels from 377 users. 
The active controller instructs the network elements in the 378 network to provide the services requested by users and/or 379 applications. For example, after receiving the configurations for a 380 tunnel and a traffic flow to the tunnel, the active controller 381 computes a path for the tunnel, programs (that is, instructs) the 382 network elements along the path to create the tunnel, and 383 instructs the ingress of the tunnel to direct the traffic flow into 384 the tunnel. 386 +-------------------------------------------+ 387 | Users/Applications(Orchestrator/OSS/NMS) | 388 +-------------------------------------------+ 389 ^ 390 | 391 +--------------------------+------------------------+ 392 | Controller ______________|_____________ | 393 | Cluster | | | 394 | | ___________________ | | 395 | | | Synchronization | | | 396 | v v v v | 397 | +------------+ +------------+ | 398 | | Active | | Standby | | 399 | | BGP-based | | BGP-based | | 400 | | Controller | | Controller | | 401 | +------------+ +------------+ | 402 | ^ ^ | 403 | |____________________________| | 404 | | | 405 | v | 406 +-----------------API to Network(RR+)---------------+ 407 / \ 408 / \____ 409 / \ \____ 410 /\ .---. .---+ \ 411 | \( ' |'.---. | 412 |---\ Network | '+. 413 (o \ | | ) 414 ( | | o) 415 ( | | ) 416 ( o o .-' 417 ' ) 418 '---._.-. ) 419 '---' 421 During this process, the status information about the network is 422 updated in the active controller. This information includes the 423 traffic engineering information in its TEDB, the SID/label 424 information in its SLDB, and the configurations, paths, resources 425 and status for tunnels in its TPDB. The active controller 426 synchronizes this information with the standby controller. Thus 427 the two controllers have the same status information about the 428 network. When the active controller fails, the standby controller 429 takes over the role of the active controller smoothly and becomes 430 the active controller. 432 3.4. 
Hierarchical Controllers 434 The Figure below illustrates a system with hierarchical controllers. 435 There is one Parent Controller and four Child Controllers: Child 436 Controller 1, Child Controller 2, Child Controller 3 and Child 437 Controller 4. 439 +-------------------------------------------+ 440 | Users/Applications(Orchestrator/OSS/NMS) | 441 +----------------------+--------------------+ 442 | 443 +---------+---------+ 444 | Parent Controller | 445 +--+---------+----+-+ 446 _/| \ \____ 447 _/ | \ \____ 448 _/ | \ \__ 449 __/ | +---------+---------+ \ 450 __/ | |Child Controller 3 | | 451 / | +-------------------+ | 452 +---------+---------+ | / \ | 453 |Child Controller 1 | | .---. .---,\ | 454 +-------------------+ | ( ' ') | 455 / \ | ( Domain 3 ) | 456 .---. .---,\ | ( ) +---------+---------+ 457 ( ' ') | '-o-.--o) |Child Controller 4 | 458 ( Domain 1 ) | | +-------------------+ 459 ( ) | | / \____ 460 '-o-.---) +--------+----------+ \ / \ \____ 461 | |Child Controller 2 | \ /\ .---. .---+ \ 462 | +-------------------+ \ | \( ' |'.---. | 463 | / \____ \_ |---\ Domain 4 | '+, 464 \ / \ \____ (o \ | | ) 465 \ /\ .---. .---+ \ ( | | o) 466 \ | \( ' |'.---. | ( | | ) 467 \ |---\ Domain 2 | '+. ( o o .-' 468 \____(o \ | | ) ' ) 469 ( | | o)-------o---._.-.-----) 470 ( | | ) 471 ( o o .-' 472 ' ) 473 '---._.-.-----) 475 The parent controller communicates with these four child controllers 476 and controls them, each of which controls (or is responsible for) a 477 domain. Child controller 1 controls domain 1, Child controller 2 478 controls domain 2, Child controller 3 controls domain 3, and Child 479 controller 4 controls domain 4. 481 One level of hierarchy of controllers is illustrated in the figure 482 above. There is one parent controller at top level, which is not a 483 child controller. Under the parent controller, there are four child 484 controllers, which are not parent controllers. 
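The one-level hierarchy in the figure above can be modeled as a small tree of controllers. The following is a minimal sketch under stated assumptions: the class name Controller, its fields, and the methods role and controller_for are hypothetical names chosen for illustration, and nothing here describes how controllers actually communicate.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Controller:
    """A controller in the hierarchy (names and fields are illustrative)."""
    name: str
    domain: Optional[str] = None  # set when the controller controls a domain
    children: List["Controller"] = field(default_factory=list)
    # Excluded from repr/compare to avoid walking the parent/child cycle.
    parent: Optional["Controller"] = field(default=None, repr=False, compare=False)

    def add_child(self, child: "Controller") -> None:
        child.parent = self
        self.children.append(child)

    def role(self) -> str:
        """Classify the controller by its position in the hierarchy."""
        if self.parent is None:
            return "top-level parent"
        if self.children:
            return "parent and child"
        return "child only"

    def controller_for(self, domain: str) -> Optional["Controller"]:
        """Depth-first search for the controller responsible for a domain."""
        if self.domain == domain:
            return self
        for child in self.children:
            found = child.controller_for(domain)
            if found is not None:
                return found
        return None


# The hierarchy in the figure: one parent and four child controllers,
# each child controlling one domain.
parent = Controller("Parent Controller")
for i in range(1, 5):
    parent.add_child(Controller(f"Child Controller {i}", domain=f"Domain {i}"))
```

For example, parent.controller_for("Domain 3") returns the object for Child Controller 3, and every child reports the role "child only". Because a child may itself have children, the same structure also covers deeper hierarchies.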
486 In the general case, at the top level there is one parent controller that 487 is not a child controller, there are some controllers that are both 488 parent controllers and child controllers, and there are a number of 489 child controllers that are not parent controllers. This is a system 490 with multiple levels of hierarchy, in which one parent controller 491 controls or communicates with a first number of child controllers, 492 some of which are also parent controllers, each of which controls or 493 communicates with a second number of child controllers, and so on. 495 The parent controller receives requests for creating end-to-end 496 tunnels from users or applications. For each request, the parent 497 controller is responsible for obtaining a path for the tunnel and 498 creating the tunnel along the path by sending instructions to 499 the corresponding child controllers. 501 4. Application Scenarios 503 This section introduces a set of scenarios to which the controller 504 can be applied. 506 4.1. Business-oriented Traffic Steering 508 It is reasonable in a commercial sense to provide multiple paths to the 509 same destination with differentiated experiences for preferential 510 users/services. This is an efficient approach to maximize providers' 511 network resource usage as well as their profit, and to offer more choices 512 to network users. 514 4.1.1. Preferential Users 516 In the ISP network shown in the figure below, there are three kinds of 517 users in Sydney, namely Gold, Silver and Bronze, and they wish to 518 visit a website located in HongKong. The ISP provides three different 519 paths with different experiences according to the users' priority. The 520 Gold Users may use Path1 with less latency and loss. The Silver 521 Users may use Path2 through Singapore with less latency but possibly 522 some congestion there. The Bronze Users may use Path3 through LA 523 with some latency and loss. 
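The mapping from user priority to path described above can be written down as a small policy table. This is a sketch only: the table layout, the function name select_path, and the fallback to the Bronze path for unknown user classes are assumptions of the example. Path1 is taken here to be the direct Sydney-HongKong path, as the figure suggests.

```python
# Hypothetical policy table: user class -> (path name, hops).
STEERING_POLICY = {
    "gold": ("Path1", ["Sydney", "HongKong"]),
    "silver": ("Path2", ["Sydney", "Singapore", "HongKong"]),
    "bronze": ("Path3", ["Sydney", "LA", "HongKong"]),
}


def select_path(user_class):
    """Return the (path name, hops) pair for a user class.

    Unknown classes fall back to the bronze path; this default is an
    assumption of the sketch, not something the text specifies.
    """
    return STEERING_POLICY.get(user_class.lower(), STEERING_POLICY["bronze"])
```

A controller following this policy would steer the traffic of a Silver user onto the hops returned by select_path("silver").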
525 +----------+ 526 | HongKong | 527 --+----------+-- 528 --- | --- 529 --- | --- 530 -- | -- 531 +----------+ | +----------+ 532 |Singapore | | | LA | 533 +----------+ | +----------+ 534 -- |Path1 -- 535 --- | --- 536 Path2 --- | --- Path3 537 --+----------+-- 538 | Sydney | 539 +----------+ 540 | 541 | 542 +-----------+-----------+ 543 | | | 544 +-------+ +-------+ +-------+ 545 |Silver | |Gold | |Bronze | 546 |Users | |Users | |Users | 547 +-------+ +-------+ +-------+ 549 4.1.2. Preferential Services 551 As depicted in the figure below, the OTTSP has three exits with one ISP, 552 which are located in City A, City B and City C. The content is 553 obtained from the Content Server and sent to the exits through the AR. An 554 OTTSP may base its steering strategy on different services. 555 For example, the OTTSP in the figure may choose exit R21 for video 556 service and exit R22 for web service, which requires a mechanism or 557 system that identifies the different services within the traffic flow. 559 * * 560 City A * City B * City C 561 * * 562 * +-----+ * 563 * |Users| * 564 * +-----+ * 565 * | * 566 +-----------+-----------+ 567 | * | * | 568 +-----+ * +-----+ * +-----+ 569 | R11 |-----| R12 |-----| R13 | 570 +-----+ * +-----+ * +-----+ ISP 571 | * | * | 572 *****|***********|***********|********* 573 | * | * | 574 | * | * | OTT 575 +-----+ * +-----+ * +-----+ 576 | R21 |-----| R22 |-----| R23 | 577 +-----+ * +-----+ * +-----+ 578 | * | * | 579 +-----------+-----------+ 580 * | * 581 * +-----+ * +-------+ 582 * | AR |--------|Content| 583 * +-----+ * |Server | 584 +-------+ 586 4.2. Traffic Congestion Mitigation 588 It is a persistent goal for providers to increase the utilization 589 ratio of their current network resources and to mitigate traffic 590 congestion. Traffic congestion can happen anywhere in the 591 ISP network (MAN, IDC, core and the links between them), because 592 Internet traffic is hard to predict. 
For example, there might be 593 some local online events that the network operators did not know about 594 beforehand, or a sudden attack. Even for big 595 events that can be predicted, such as the annual online sale of an 596 e-commerce company or an iOS update from Apple Inc., there is no 597 guarantee that congestion will not occur. Network capacity 598 expansion is usually an annual operation, and any stage of that 599 engineering work can be delayed. As a result, there is always a need for temporary traffic 600 steering. The same applies to OTT 601 networks as well. 603 Note that this traffic steering is not a 604 global behavior: it acts on only part of the network, and it is 605 temporary. 607 4.2.1. Congestion Mitigation in Core 609 As depicted in the figure below, traffic from MAN C1 to MAN D2 610 follows the path Core C->Core B->Core D as the primary path, but 611 the load on that path may become too high. It is reasonable to 612 transfer some traffic load to the less utilized path Core C->Core A->Core 613 D when the primary path is congested. 615 Core 617 +----------+ 618 | Core A | 619 +------+ --+----------+-- +------+ 620 |MAN C1|-+ --- --- +-|MAN D1| 621 +------+ | --- --- | +------+ 622 | -- -- | 623 | +----------+ +----------+ | 624 +-| Core C | | Core D |-+ 625 | +----------+ +----------+ | 626 | -- -- | 627 +------+ | --- --- | +------+ 628 |MAN C2|-+ --- --- +-|MAN D2| 629 +------+ --+----------+-- +------+ 630 | Core B | 631 +----------+ 633 4.2.2. Congestion Mitigation among ISPs 635 As depicted in the figure below, ISP1 and ISP2 are interconnected by three 636 exits, one in each of three cities. The links between 637 ISP1 and ISP2 in the same city are called local links, and the rest 638 are long-distance links. Traffic from IXP C1 to Core A in ISP 2 639 usually passes through link IXP C1->IXP A2->Core A. This is a 640 long-distance route, directly connecting city C and city A. 
Part of 641 the traffic could be transferred to link IXP. 643 * * 644 City A * City B * City C 645 * * 646 +-------+ * +-------+ * +-------+ 647 |IXP A1 |----|IXP B1|---|IXP C1 | 648 +-------+ * +-------+ * +-------+ ISP 1 649 | * | * | | 650 *******|*************|*********|**|********** 651 | +----------|---------+ | 652 | | * | * | ISP 2 653 | | * | * | 654 +------+ * +------+ * +------+ 655 |IXP A2|----|IXP B2|----|IXP C2| 656 +------+ * +------+ * +------+ 657 | * | * | 658 | * | * | 659 +-------+ * +-------+ * +-------+ 660 |Core A |----|Core B |---|Core C | 661 +-------+ * +-------+ * +-------+ 663 4.2.3. Congestion Mitigation at International Edge 665 An ISP usually interconnects with more than two transit networks at the 666 international edge, so it is quite common that multiple paths 667 exist for the same foreign destination. Paths with 668 better QoS properties, such as latency, loss and jitter, are usually 669 preferred. Since these properties change over time, 670 the path selection decision has to be made dynamically. 672 As depicted in the figure below, the traffic to the foreign 673 destination H from the IP core network (AS C1) has two choices of transit 674 network, namely Transit A and Transit B. Under normal conditions, 675 Transit B is the primary choice, but Transit A will be preferred when 676 the QoS of Transit B degrades. As a result, the same traffic will 677 go through Transit A instead. 
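The dynamic choice between the two transit networks described above can be sketched as a threshold check on measured QoS. Everything specific in this sketch is an assumption made for illustration: the QosSample fields, the three threshold values, and the function names is_degraded and select_transit. The text does not say how QoS is measured or what counts as degraded.

```python
from dataclasses import dataclass


@dataclass
class QosSample:
    """Measured QoS toward the foreign destination over one transit."""
    latency_ms: float
    loss_pct: float
    jitter_ms: float


# Hypothetical limits beyond which a transit is considered degraded.
MAX_LATENCY_MS = 200.0
MAX_LOSS_PCT = 1.0
MAX_JITTER_MS = 30.0


def is_degraded(sample):
    """True if any measured property exceeds its limit."""
    return (sample.latency_ms > MAX_LATENCY_MS
            or sample.loss_pct > MAX_LOSS_PCT
            or sample.jitter_ms > MAX_JITTER_MS)


def select_transit(primary, backup, samples):
    """Prefer the primary transit; use the backup only when the primary
    is degraded and the backup is not."""
    if is_degraded(samples[primary]) and not is_degraded(samples[backup]):
        return backup
    return primary


# Example measurements: Transit B (the primary) has high latency.
samples = {
    "Transit B": QosSample(latency_ms=250.0, loss_pct=0.2, jitter_ms=10.0),
    "Transit A": QosSample(latency_ms=120.0, loss_pct=0.1, jitter_ms=5.0),
}
chosen = select_transit("Transit B", "Transit A", samples)
```

With these sample measurements, chosen is "Transit A". Re-running select_transit on each new set of measurements gives the dynamic behavior the text calls for; a real deployment would likely add hysteresis so that traffic does not flap between transits.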
679 * * 680 City A * City B * City C 681 * * 682 +-------+ * +-------+ * +-------+ 683 |IXP A1 |----|IXP B1|---|IXP C1 | 684 +-------+ * +-------+ * +-------+ ISP 1 685 | * | * | | 686 *******|*************|*********|**|********** 687 | +----------|---------+ | 688 | | * | * | ISP 2 689 | | * | * | 690 +------+ * +------+ * +------+ 691 |IXP A2|----|IXP B2|----|IXP C2| 692 +------+ * +------+ * +------+ 693 | * | * | 694 | * | * | 695 +-------+ * +-------+ * +-------+ 696 |Core A |----|Core B |---|Core C | 697 +-------+ * +-------+ * +-------+ 699 5. Security Considerations 701 The interactions with a BGP-based controller are similar to those 702 with any other SDN controller. The security implications of SDN 703 controller have not been fully discussed or described. Therefore, 704 protocol and applicability for solutions around this architecture 705 must take proper account of these concerns. 707 6. IANA Considerations 709 This document does not require any IANA actions. 711 7. Acknowledgements 713 The authors would like to thank Chris Bowers, Jeff Tantsura for their 714 valuable suggestions and comments on this draft. 716 8. Contributors 718 Nan Wu 719 Huawei 720 Email: eric.wu@huawei.com 722 9. References 723 9.1. Normative References 725 [RFC1771] Rekhter, Y. and T. Li, "A Border Gateway Protocol 4 (BGP- 726 4)", RFC 1771, DOI 10.17487/RFC1771, March 1995, 727 . 729 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 730 Requirement Levels", BCP 14, RFC 2119, 731 DOI 10.17487/RFC2119, March 1997, 732 . 734 [RFC3107] Rekhter, Y. and E. Rosen, "Carrying Label Information in 735 BGP-4", RFC 3107, DOI 10.17487/RFC3107, May 2001, 736 . 738 [RFC4271] Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A 739 Border Gateway Protocol 4 (BGP-4)", RFC 4271, 740 DOI 10.17487/RFC4271, January 2006, 741 . 743 [RFC4456] Bates, T., Chen, E., and R. 
Chandra, "BGP Route 744 Reflection: An Alternative to Full Mesh Internal BGP 745 (IBGP)", RFC 4456, DOI 10.17487/RFC4456, April 2006, 746 . 748 [RFC5575] Marques, P., Sheth, N., Raszuk, R., Greene, B., Mauch, J., 749 and D. McPherson, "Dissemination of Flow Specification 750 Rules", RFC 5575, DOI 10.17487/RFC5575, August 2009, 751 . 753 [RFC7752] Gredler, H., Ed., Medved, J., Previdi, S., Farrel, A., and 754 S. Ray, "North-Bound Distribution of Link-State and 755 Traffic Engineering (TE) Information Using BGP", RFC 7752, 756 DOI 10.17487/RFC7752, March 2016, 757 . 759 9.2. Informative References 761 [I-D.ietf-idr-bgpls-segment-routing-epe] 762 Previdi, S., Talaulikar, K., Filsfils, C., Patel, K., Ray, 763 S., and J. Dong, "BGP-LS extensions for Segment Routing 764 BGP Egress Peer Engineering", draft-ietf-idr-bgpls- 765 segment-routing-epe-19 (work in progress), May 2019. 767 [I-D.ietf-idr-flowspec-path-redirect] 768 Velde, G., Patel, K., and Z. Li, "Flowspec Indirection-id 769 Redirect", draft-ietf-idr-flowspec-path-redirect-10 (work 770 in progress), October 2019. 772 [I-D.ietf-idr-segment-routing-te-policy] 773 Previdi, S., Filsfils, C., Talaulikar, K., Mattes, P., 774 Rosen, E., Jain, D., and S. Lin, "Advertising Segment 775 Routing Policies in BGP", draft-ietf-idr-segment-routing- 776 te-policy-08 (work in progress), November 2019. 778 [I-D.ietf-isis-segment-routing-extensions] 779 Previdi, S., Ginsberg, L., Filsfils, C., Bashandy, A., 780 Gredler, H., and B. Decraene, "IS-IS Extensions for 781 Segment Routing", draft-ietf-isis-segment-routing- 782 extensions-25 (work in progress), May 2019. 784 [I-D.ietf-pce-segment-routing] 785 Sivabalan, S., Filsfils, C., Tantsura, J., Henderickx, W., 786 and J. Hardwick, "PCEP Extensions for Segment Routing", 787 draft-ietf-pce-segment-routing-16 (work in progress), 788 March 2019. 790 [I-D.ietf-rtgwg-bgp-routing-large-dc] 791 Lapukhov, P., Premji, A., and J. 
Mitchell, "Use of BGP for 792 routing in large-scale data centers", draft-ietf-rtgwg- 793 bgp-routing-large-dc-11 (work in progress), June 2016. 795 [I-D.ietf-spring-segment-routing] 796 Filsfils, C., Previdi, S., Ginsberg, L., Decraene, B., 797 Litkowski, S., and R. Shakir, "Segment Routing 798 Architecture", draft-ietf-spring-segment-routing-15 (work 799 in progress), January 2018. 801 Authors' Addresses 803 Yujia 804 China Telcom Co., Ltd. 805 109 West Zhongshan Ave,Tianhe District 806 Guangzhou 510630 807 China 809 Email: luoyuj@sdu.edu.cn 811 Liang 812 China Telcom Co., Ltd. 813 109 West Zhongshan Ave,Tianhe District 814 Guangzhou 510630 815 China 817 Email: ouliang@chinatelecom.cn 818 Xiang 819 Tencent 821 Email: terranhuang@tencent.com 823 Huaimo Chen 824 Futurewei 825 Boston, MA 826 USA 828 Email: Huaimo.chen@futurewei.com 830 Shunwan Zhuang 831 Huawei 832 Huawei Bld., No.156 Beiqing Rd. 833 Beijing 100095 834 China 836 Email: zhuangshunwan@huawei.com 838 Zhenbin Li 839 Huawei 840 Huawei Bld., No.156 Beiqing Rd. 841 Beijing 100095 842 China 844 Email: lizhenbin@huawei.com