idnits 2.17.1

draft-cth-rtgwg-bgp-control-01.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == The document doesn't use any RFC 2119 keywords, yet seems to have RFC
     2119 boilerplate text.

  -- The document date (March 10, 2019) is 1873 days in the past.  Is this
     intentional?

  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Unused Reference: 'RFC4271' is defined on line 741, but no explicit
     reference was found in the text

  == Unused Reference: 'I-D.ietf-idr-bgpls-segment-routing-epe' is defined
     on line 764, but no explicit reference was found in the text

  == Unused Reference: 'I-D.ietf-idr-flowspec-path-redirect' is defined on
     line 770, but no explicit reference was found in the text

  == Unused Reference: 'I-D.ietf-isis-segment-routing-extensions' is defined
     on line 781, but no explicit reference was found in the text

  == Unused Reference: 'I-D.ietf-rtgwg-bgp-routing-large-dc' is defined on
     line 793, but no explicit reference was found in the text

  == Unused Reference: 'I-D.ietf-spring-segment-routing' is defined on line
     798, but no explicit reference was found in the text

  ** Obsolete normative reference: RFC 1771 (Obsoleted by RFC 4271)

  ** Obsolete normative reference: RFC 3107 (Obsoleted by RFC 8277)

  ** Obsolete normative reference: RFC 5575 (Obsoleted by RFC 8955)

  ** Obsolete normative reference: RFC 7752 (Obsoleted by RFC 9552)

  == Outdated reference: A later version (-19) exists of
     draft-ietf-idr-bgpls-segment-routing-epe-17

  == Outdated reference: A later version (-12) exists of
     draft-ietf-idr-flowspec-path-redirect-07

  == Outdated reference: A later version (-26) exists of
     draft-ietf-idr-segment-routing-te-policy-05

  == Outdated reference: A later version (-25) exists of
     draft-ietf-isis-segment-routing-extensions-22

     Summary: 4 errors (**), 0 flaws (~~), 12 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

2    Network Working Group                                             Y. Luo
3    Internet-Draft                                                     L. Qu
4    Intended status: Informational                    China Telcom Co., Ltd.
5    Expires: September 11, 2019                                     X. Huang
6                                                                     Tencent
7                                                                     H. Chen
8                                                                   S. Zhuang
9                                                                       Z. Li
10                                                                     Huawei
11                                                             March 10, 2019

13               Architecture for Use of BGP as Central Controller
14                        draft-cth-rtgwg-bgp-control-01

16   Abstract

18      BGP is a core part of a network including Software-Defined Networking
19      (SDN) system.  It has the traffic engineering information on the
20      network topology and can compute optimal paths for a given traffic
21      flow across the network.

23      This document describes the architecture for BGP as a central
24      controller.  A BGP-based central controller can simplify the
25      operations on the network and use network resources efficiently for
26      providing services with high quality.

28   Requirements Language

30      The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
31      "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
32      document are to be interpreted as described in RFC 2119 [RFC2119].

34   Status of This Memo

36      This Internet-Draft is submitted in full conformance with the
37      provisions of BCP 78 and BCP 79.

39      Internet-Drafts are working documents of the Internet Engineering
40      Task Force (IETF).  Note that other groups may also distribute
41      working documents as Internet-Drafts.  The list of current Internet-
42      Drafts is at https://datatracker.ietf.org/drafts/current/.

44      Internet-Drafts are draft documents valid for a maximum of six months
45      and may be updated, replaced, or obsoleted by other documents at any
46      time.  It is inappropriate to use Internet-Drafts as reference
47      material or to cite them other than as "work in progress."
48      This Internet-Draft will expire on September 11, 2019.

50   Copyright Notice

52      Copyright (c) 2019 IETF Trust and the persons identified as the
53      document authors.  All rights reserved.

55      This document is subject to BCP 78 and the IETF Trust's Legal
56      Provisions Relating to IETF Documents
57      (https://trustee.ietf.org/license-info) in effect on the date of
58      publication of this document.  Please review these documents
59      carefully, as they describe your rights and restrictions with respect
60      to this document.  Code Components extracted from this document must
61      include Simplified BSD License text as described in Section 4.e of
62      the Trust Legal Provisions and are provided without warranty as
63      described in the Simplified BSD License.

65   Table of Contents

67      1.  Introduction
68      2.  Terminology
69      3.  Architecture
70        3.1.  Building Blocks
71          3.1.1.  TEDB
72          3.1.2.  SLDB
73          3.1.3.  TPDB
74          3.1.4.  CSPF
75          3.1.5.  TM
76        3.2.  One Controller
77        3.3.  Controller Cluster
78        3.4.  Hierarchical Controllers
79      4.  Application Scenarios
80        4.1.  Business-oriented Traffic Steering
81          4.1.1.  Preferential Users
82          4.1.2.  Preferential Services
83        4.2.  Traffic Congestion Mitigation
84          4.2.1.  Congestion Mitigation in Core
85          4.2.2.  Congestion Mitigation among ISPs
86          4.2.3.  Congestion Mitigation at International Edge
87      5.  Security Considerations
88      6.  IANA Considerations
89      7.  Acknowledgements
90      8.  Contributors
91      9.  References
92        9.1.  Normative References
93        9.2.  Informative References
94      Authors' Addresses

96   1.  Introduction

98      Border Gateway Protocol (BGP) [RFC1771] is an exterior gateway
99      protocol (EGP).  It was developed to exchange routing information
100     among routers in different autonomous systems (ASes).  Over the
101     course of its development, BGP has been extended to provide numerous
102     new functions.  It collects link states, including traffic engineering
103     (TE) information, from other protocols such as an IGP and distributes
104     them among routers in different ASes [RFC7752].  It also controls the
105     redirection of traffic flows [RFC5575].  Furthermore, it distributes
106     MPLS labels [RFC3107].  For scalability, BGP has been extended with
107     the Route Reflector (RR) [RFC4456].

109     For segment routing (SR), BGP is extended to advertise SR policies
110     with candidate paths to the policy headend routers, which are
111     typically ingress routers [I-D.ietf-idr-segment-routing-te-policy].
112     The SR specific PCEP extensions are defined in
113     [I-D.ietf-pce-segment-routing].  A stateful PCE can compute an SR
114     traffic engineering (SR-TE) path satisfying a set of constraints, and
115     initiate an SR-TE path on a headend router using the extensions.

117     An SDN controller (or controller for short) is the core of an SDN
118     system or network.  It sits between network elements (NEs) such as
119     routers or switches at one end and applications such as Operational
120     Support System (OSS) or Network Management System (NMS) at the other
121     end.  The essential function of a controller is to steer traffic
122     flows across the network for providing more services with higher
123     quality.  It manages network resources such as link bandwidth,
124     computes expected paths for carrying traffic flows based on available
125     network resources, programs the network elements for the creation of
126     tunnels along the paths, and redirects traffic flows into
127     corresponding tunnels.

129     Based on the current BGP, it is natural, beneficial and relatively
130     simple to extend BGP to become a controller.  Using BGP as a
131     controller for a network will greatly simplify the operations on the
132     network.  It avoids deploying, operating and maintaining an additional
133     component or protocol such as PCE as a controller in the network.

135     This document describes the architecture for BGP as a central
136     controller and introduces some scenarios to which the BGP controller
137     can be applied.

139  2.  Terminology

141     o  SR: Segment Routing

143     o  RR: Route Reflector
144     o  SID: Segment Identifier

146     o  SR-Path: Segment Routing Path

148     o  SR-Tunnel: Segment Routing Tunnel

150     o  TEDB: Traffic Engineering Database

152     o  LSDB: Link State Database

154     o  SLDB: SID/Label Database

156     o  TPDB: Tunnel and Path Database

158     o  CSPF: Constrained Shortest Path First

160     o  TM: Tunnel Manager

162     o  NMS: Network Management System

164     o  SRLB: SR Local Block

166     o  NE: Network Element

168     o  PCE: Path Computation Element

170     o  AS: Autonomous System

172     o  QoS: Quality of Service

174     o  ISP: Internet Service Provider

176     o  MAN: Metropolitan Area Network

178     o  OTT: Over the Top

180     o  OTTSP: Over the Top Service Provider, or Content Operator

182     o  AR: Access Router

184  3.  Architecture

186     The architecture for the use of BGP as a central controller is based
187     on the essential function of a controller.  It is constructed from
188     some building blocks or components.  After an introduction to the
189     building blocks, a few reference architectures are described in this
190     section.

192  3.1.  Building Blocks

194     Some critical building blocks are briefly described.  They are Traffic
195     Engineering Database (TEDB or TED for short), SID/Label Database
196     (SLDB), Tunnel and Path Database (TPDB), Constrained Shortest Path
197     First (CSPF), and Tunnel Manager (TM).

199  3.1.1.  TEDB

201     The Traffic Engineering Database (TEDB) stores the Traffic
202     Engineering (TE) information about the network.  It includes the
203     unreserved bandwidth at each of eight priority levels for every link
204     in the network.

206     TEDB can be an individual block, which is constructed from the link
207     state information received.  It may be embedded into the link state
208     database (LSDB) in the BGP when the BGP creates/updates the LSDB from
209     the link state information it receives.

211  3.1.2.  SLDB

213     The SID/Label Database (SLDB) records and maintains the status of
214     every Segment Identifier (SID) and label for every node, interface/
215     link and/or prefix in the network, which the controller controls.
216     The status of a SID/label indicates whether the SID/label is assigned.
217     If it is assigned, then the object such as the node, link or prefix,
218     to which it is assigned, is recorded.

220     SLDB can be an individual block, which is constructed from the link
221     state information such as SR Local Block (SRLB) that the BGP
222     receives.  It may be embedded into the link state database (LSDB) in
223     the BGP when the BGP creates the LSDB from the link state information
224     it receives.  An SRLB indicates the range(s) of SIDs/labels allocated
225     to a node for local SIDs.

227  3.1.3.  TPDB

229     The Tunnel and Path Database (TPDB) stores the related information
230     for every tunnel.  The information stored in the TPDB for each tunnel
231     includes:

233     o  the parameters received for the tunnel from a user or application,

235     o  the path computed for the tunnel,

237     o  the resources such as link bandwidth reserved along the path for
238        the tunnel,

240     o  the SID/labels assigned along the path for the tunnel, and

242     o  the status of the tunnel.

244  3.1.4.  CSPF

246     The Constrained Shortest Path First (CSPF) computes a path for a
247     tunnel such as an SR tunnel or an LSP tunnel that satisfies a set of
248     given constraints using the information in the traffic engineering
249     database (TEDB).

251  3.1.5.  TM

253     The Tunnel Manager (TM) receives a request for an operation on a
254     tunnel from a user or an application such as Network Management
255     System (NMS).  The operation may be a creation of a new tunnel, a
256     deletion of an existing tunnel, or a change to an existing tunnel.

258     When receiving a request for creating a new tunnel, the TM asks the
259     CSPF to compute a path for the tunnel that satisfies the constraints
260     given for the tunnel.
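
The path computation that the TM requests from the CSPF can be sketched
as below.  This is a minimal illustration under stated assumptions, not
the procedure defined by this document: the function name cspf(), the
link tuple layout and the sample topology are hypothetical, each link
carries a single unreserved-bandwidth value (the TEDB described above
keeps one per priority level), and the only constraint applied is a
minimum bandwidth.  Links that cannot satisfy the constraint are pruned
first, and an ordinary shortest-path search then runs on what remains.

```python
import heapq


def cspf(links, src, dst, min_bw):
    """Return (cost, path) for the cheapest path whose links all have at
    least min_bw unreserved bandwidth, or None if no such path exists.

    links: iterable of (from_node, to_node, cost, unreserved_bw) tuples,
    each treated as one direction of a link (hypothetical layout).
    """
    # Constraint step: drop links that cannot carry the requested bandwidth.
    adj = {}
    for u, v, cost, unreserved_bw in links:
        if unreserved_bw >= min_bw:
            adj.setdefault(u, []).append((v, cost))

    # Shortest-path step: Dijkstra on the pruned topology.
    best = {src: 0}
    prev = {}
    heap = [(0, src)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == dst:
            path = [dst]
            while path[-1] != src:
                path.append(prev[path[-1]])
            return dist, path[::-1]
        if dist > best[node]:
            continue  # stale heap entry
        for nxt, cost in adj.get(node, []):
            cand = dist + cost
            if cand < best.get(nxt, float("inf")):
                best[nxt] = cand
                prev[nxt] = node
                heapq.heappush(heap, (cand, nxt))
    return None


# Hypothetical four-node topology: A-B-D is cheaper but B-D has little
# unreserved bandwidth left.
LINKS = [
    ("A", "B", 1, 100), ("B", "D", 1, 20),
    ("A", "C", 2, 100), ("C", "D", 2, 100),
]
```

With this sample data, a request for 10 units of bandwidth from A to D
can use the cheaper path through B, a request for 50 units is forced
onto the path through C because link B-D is pruned, and a request for
200 units finds no path at all.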

262     After obtaining the path for the tunnel from the CSPF, the TM
263     requests the SLDB to assign SID/labels along the path for the tunnel
264     and asks the TEDB to reserve the resources such as link bandwidth
265     along the path for the tunnel.

267     The TM in a central controller may set up the tunnel along the path
268     in the network by programming each of the NEs along the path through
269     the API to the network.  In an SR network, the TM initiates an SR
270     tunnel in the network by sending a sequence of SID/labels to the
271     source NE of the tunnel.

273     The TM records the related information for the tunnel in the Tunnel
274     and Path database (TPDB).  The information includes the path computed
275     for the tunnel, the resources such as bandwidth reserved along the
276     path, the SID/labels assigned along the path for the tunnel, and the
277     status of the tunnel.

279  3.2.  One Controller

281     The figure below illustrates a reference architecture for using the
282     BGP as a central controller, which controls a network.  The BGP as a
283     controller in the reference architecture controls a network through
284     an API to the network such as BGP+/RR+ (extensions to BGP for central
285     controller).  The BGP controller is responsible for creating and
286     maintaining every tunnel in the network.  It also controls the
287     redirection of traffic flow to each tunnel.

289     The BGP controller comprises a number of modules, including a TM, a
290     CSPF, a TEDB, an SLDB and a TPDB.  The interfaces among these modules
291     are listed as follows:

293       +------------------------------------------+
294       | Users/Applications(Orchestrator/OSS/NMS) |
295       +------------------------------------------+
296                             |
297     +----------------------------------------------+
298     |  BGP as Controller                           |
299     |                     +---------------+        |
300     |        /------------|      TM       |        |
301     |       / Ia          +---------------+        |
302     |  +--------+           |   |   |   \          |
303     |  |  CSPF  |   ________|   |   |    \Id       |
304     |  +--------+  / Ib        /Ic  |  +---------+ |
305     |     \Ie     /           /     |  |  TPDB   | |
306     |  +---------+     +-------+    |  +---------+ |
307     |  |  TEDB   |     | SLDB  |    |              |
308     |  +---------+     +-------+    |              |
309     |      \               \        |In            |
310     +----------------API to Network(RR+)-----------+
311                      /      \
312                     /        \____
313                    /          \   \____
314                   /\ .---. .---+       \
315                  |  \(    '    |'.---.  |
316                  |---\ Network |       '+.
317                 (o    \        |        | )
318                  (     |       |        o)
319                  (     |       |          )
320                   (    o       o       .-'
321                    '                  )
322                     '---._.-.        )
323                               '---'

325     o  Interface Ia between the TM and the CSPF.  Through this interface,
326        the TM requests the CSPF to compute a path for a tunnel with a set
327        of constraints, and the CSPF responds to the TM with the path
328        computed that satisfies the constraints.

330     o  Interface Ib between the TM and the TEDB.  When a tunnel is to be
331        created, through this interface, the TM reserves in the TEDB the
332        TE resources such as link bandwidths on every link along the path
333        computed for the tunnel.  When a tunnel is deleted, the TM
334        releases the TE resources such as link bandwidths on every link
335        along the path for the tunnel.

337     o  Interface Ic between the TM and the SLDB.  When a tunnel is to be
338        created, through this interface, the TM reserves in the SLDB a
339        SID/label for every link or some links along the path computed for
340        the tunnel.  When a tunnel is deleted, the TM releases the SID/
341        label for every link or some links along the path for the tunnel.

343     o  Interface Id between the TM and the TPDB.  The TM updates the
344        information for every tunnel in the TPDB through this interface.

346     o  Interface Ie between the CSPF and the TEDB.  Through this
347        interface, the CSPF accesses the traffic engineering information
348        such as link bandwidths when it computes a path for a tunnel.

350     There is an interface In between the BGP controller and the network.
351     In fact, there is a control channel (or interface) between the BGP
352     controller and every (edge) node in the network.

354     Initially, the TEDB obtains the original traffic engineering (TE)
355     information such as link bandwidths from the network through the
356     interface In (i.e., API to network) for every link in the network.
357     The SLDB gets the original SID/label resources from the network
358     through the interface for every node, link and prefix in the network.

360  3.3.  Controller Cluster

362     A critical issue in a network with a central controller is the
363     failure of the controller, which is a single point of failure (SPOF).
364     If the controller fails, the entire network may not work.

366     A controller cluster (i.e., a group of controllers) works as a single
367     controller from the user's point of view.  A simple controller cluster
368     consists of two controllers.  One works as an active (or primary)
369     controller, and the other as a standby (or secondary) controller.
370     In normal operations, the active controller is responsible for the
371     network it controls.  It also synchronizes with the standby
372     controller.  When the active controller fails, the standby controller
373     becomes a new active controller, which controls the network.

375     The figure below illustrates a simple controller cluster containing
376     two BGP-based controllers: Active BGP-based Controller and Standby
377     BGP-based Controller.  In normal operations, the active controller
378     interacts with users and/or applications.  For example, it receives
379     configurations for tunnels and the traffic flows to tunnels from
380     users.  The active controller instructs the network elements in the
381     network to provide the services requested by users and/or
382     applications.  For example, after receiving the configurations for a
383     tunnel and a traffic flow to the tunnel, the active controller
384     computes a path for the tunnel, programs (or instructs) the
385     network elements along the path for creating the tunnel, and
386     instructs the ingress of the tunnel to direct the traffic flow into
387     the tunnel.

389     During this process, the status information about the network is
390     updated in the active controller.  The information includes: the
391     traffic engineering information in its TEDB, the SID/label
392     information in its SLDB, and the configurations, paths, resources
393     and status for tunnels in its TPDB.  The active controller
394     synchronizes this information with the standby controller.  Thus
395     these two controllers have the same status information about the
396     network.  When the active controller fails, the standby controller
397     takes over the role of the active controller smoothly and becomes
398     the new active controller.

400         +-------------------------------------------+
401         | Users/Applications(Orchestrator/OSS/NMS)  |
402         +-------------------------------------------+
403                                ^
404                                |
405     +--------------------------+------------------------+
406     | Controller ______________|_____________           |
407     | Cluster    |                          |           |
408     |            |   ___________________    |           |
409     |            |  |  Synchronization   |  |           |
410     |            v  v                    v  v           |
411     |       +------------+          +------------+      |
412     |       |   Active   |          |  Standby   |      |
413     |       | BGP-based  |          | BGP-based  |      |
414     |       | Controller |          | Controller |      |
415     |       +------------+          +------------+      |
416     |            ^                            ^         |
417     |            |____________________________|         |
418     |                          |                        |
419     |                          v                        |
420     +-----------------API to Network(RR+)---------------+
421                         /      \
422                        /        \____
423                       /          \   \____
424                      /\ .---. .---+       \
425                     |  \(    '    |'.---.  |
426                     |---\ Network |       '+.
427                    (o    \        |        | )
428                     (     |       |        o)
429                     (     |       |          )
430                      (    o       o       .-'
431                       '                  )
432                        '---._.-.        )
433                                  '---'

435  3.4.  Hierarchical Controllers

437     The figure below illustrates a system with hierarchical controllers.
438     There is one Parent Controller and four Child Controllers: Child
439     Controller 1, Child Controller 2, Child Controller 3 and Child
440     Controller 4.

442            +-------------------------------------------+
443            | Users/Applications(Orchestrator/OSS/NMS)  |
444            +----------------------+--------------------+
445                                   |
446                         +---------+---------+
447                         | Parent Controller |
448                         +--+---------+----+-+
449                          _/|          \    \____
450                        _/  |           \        \____
451                      _/    |            \            \__
452                   __/      |   +---------+---------+    \
453                __/         |   |Child Controller 3 |     |
454               /            |   +-------------------+     |
455     +---------+---------+  |       /      \              |
456     |Child Controller 1 |  |    .---. .---,\             |
457     +-------------------+  |   (     '     ')            |
458         /      \           |   ( Domain 3   )            |
459      .---. .---,\          |    (          )   +---------+---------+
460     (     '     ')         |     '-o-.--o)     |Child Controller 4 |
461     ( Domain 1   )         |            |      +-------------------+
462      (          )          |            |        /         \____
463       '-o-.---)   +--------+----------+  \      /           \   \____
464         |         |Child Controller 2 |   \    /\ .---.  .---+       \
465         |         +-------------------+    \  |  \(    '     |'.---.  |
466         |             /         \____       \_|---\ Domain 4 |       '+,
467          \           /           \   \____   (o    \         |        | )
468           \         /\ .---.  .---+       \   (     |        |        o)
469            \       |  \(    '     |'.---.  |  (     |        |          )
470             \      |---\ Domain 2 |       '+.  (    o        o       .-'
471              \____(o    \         |        | )  '                   )
472                    (     |        |        o)-------o---._.-.-----)
473                    (     |        |          )
474                     (    o        o       .-'
475                      '                   )
476                         '---._.-.-----)

478     The parent controller communicates with these four child controllers
479     and controls them, each of which controls (or is responsible for) a
480     domain.  Child controller 1 controls domain 1, Child controller 2
481     controls domain 2, Child controller 3 controls domain 3, and Child
482     controller 4 controls domain 4.

484     One level of hierarchy of controllers is illustrated in the figure
485     above.  There is one parent controller at top level, which is not a
486     child controller.  Under the parent controller, there are four child
487     controllers, which are not parent controllers.
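
The one-level hierarchy above can be modelled as a small tree.  The
sketch below is illustrative only: the Controller class and its
add_child(), is_parent, is_child and domains() members are hypothetical
names that do not come from this document or from any BGP
implementation.  It shows the two roles the text distinguishes, a
controller that has children and a controller that has a parent, and
how the set of domains under a controller follows from the tree.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Controller:
    name: str
    domain: Optional[str] = None  # set only on controllers that own a domain
    children: List["Controller"] = field(default_factory=list)
    # Back-reference to the parent; excluded from repr and comparison
    # so that the parent/child cycle is never traversed.
    parent: Optional["Controller"] = field(
        default=None, repr=False, compare=False
    )

    def add_child(self, child: "Controller") -> "Controller":
        child.parent = self
        self.children.append(child)
        return child

    @property
    def is_parent(self) -> bool:
        return bool(self.children)

    @property
    def is_child(self) -> bool:
        return self.parent is not None

    def domains(self) -> List[str]:
        """Domains controlled at or below this controller."""
        found = [self.domain] if self.domain else []
        for child in self.children:
            found.extend(child.domains())
        return found


# The system in the figure: one parent, four children, one domain each.
parent = Controller("Parent Controller")
for i in range(1, 5):
    parent.add_child(Controller(f"Child Controller {i}", domain=f"Domain {i}"))
```

Because add_child() can be called on any controller, the same structure
also represents deeper trees in which a controller has both a parent
and children of its own.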

489     In the general case, at top level there is one parent controller that
490     is not a child controller, there are some controllers that are both
491     parent controllers and child controllers, and there are a number of
492     child controllers that are not parent controllers.  This is a system
493     of multiple levels of hierarchies, in which one parent controller
494     controls or communicates with a first number of child controllers,
495     some of which are also parent controllers, each of which controls or
496     communicates with a second number of child controllers, and so on.

498     The parent controller receives requests for creating end to end
499     tunnels from users or applications.  For each request, the parent
500     controller is responsible for obtaining a path for the tunnel and
501     creating the tunnel along the path through sending instructions to
502     the corresponding child controllers.

504  4.  Application Scenarios

506     This section introduces a set of scenarios to which the controller
507     can be applied.

509  4.1.  Business-oriented Traffic Steering

511     It is commercially reasonable to provide multiple paths to the
512     same destination with differentiated experiences for preferential
513     users/services.  This is an efficient approach to maximize providers'
514     network resource usage as well as their profit and offer more choices
515     to network users.

517  4.1.1.  Preferential Users

519     In the figure below for an ISP network, there are three kinds of
520     users in Sydney, namely Gold, Silver and Bronze, and they wish to
521     visit a website located in HongKong.  The ISP provides three different
522     paths with different experiences according to users' priority.  The
523     Gold Users may use Path1 with less latency and loss.  The Silver
524     Users may use Path2 through Singapore with less latency but possibly
525     some congestion there.  The Bronze Users may use Path3 through LA
526     with some latency and loss.

528                        +----------+
529                        | HongKong |
530                      --+----------+--
531                   ---       |        ---
532                ---          |           ---
533              --             |              --
534       +----------+          |          +----------+
535       |Singapore |          |          |    LA    |
536       +----------+          |          +----------+
537              --             |Path1         --
538                ---          |           ---
539          Path2    ---       |        ---    Path3
540                      --+----------+--
541                        |  Sydney  |
542                        +----------+
543                             |
544                             |
545                 +-----------+-----------+
546                 |           |           |
547             +-------+   +-------+   +-------+
548             |Silver |   |Gold   |   |Bronze |
549             |Users  |   |Users  |   |Users  |
550             +-------+   +-------+   +-------+

552  4.1.2.  Preferential Services

554     As depicted in the figure below, the OTTSP has 3 exits with one ISP,
555     which are located in City A, City B and City C.  The content is
556     obtained from the Content Server and sent to the exits through the AR.
557     An OTTSP may base its steering strategy on different services.
558     For example, the OTTSP in the figure may choose exit R21 for video
559     service and exit R22 for web service, which requires a mechanism or
560     system to identify the different services in the traffic flow.

562                *           *
563        City A  *   City B  *   City C
564                *           *
565                *  +-----+  *
566                *  |Users|  *
567                *  +-----+  *
568                *     |     *
569          +-----------+-----------+
570          |     *     |     *     |
571       +-----+  *  +-----+  *  +-----+
572       | R11 |-----| R12 |-----| R13 |
573       +-----+  *  +-----+  *  +-----+      ISP
574          |     *     |     *     |
575     *****|***********|***********|*********
576          |     *     |     *     |
577          |     *     |     *     |         OTT
578       +-----+  *  +-----+  *  +-----+
579       | R21 |-----| R22 |-----| R23 |
580       +-----+  *  +-----+  *  +-----+
581          |     *     |     *     |
582          +-----------+-----------+
583                *     |     *
584                *  +-----+  *     +-------+
585                *  | AR  |--------|Content|
586                *  +-----+  *     |Server |
587                                  +-------+

589  4.2.  Traffic Congestion Mitigation

591     It is a persistent goal for providers to increase the utilization
592     ratio of their current network resources, and to mitigate traffic
593     congestion.  Traffic congestion can happen anywhere in the
594     ISP network (MAN, IDC, core and the links between them), because
595     internet traffic is hard to predict.  For example, there might be
596     some local online events that the network operators did not know about
597     beforehand, or a sudden attack.  Even for the big
598     events that can be predicted, such as the annual online discount of
599     an e-commerce company, or an iOS update from Apple Inc., we cannot
600     guarantee there is no congestion.  Since network capacity
601     expansion is usually an annual operation, there could be delays at any
602     stage of the engineering work.  As a result, temporary traffic
603     steering is always needed.  The same thing happens to the OTT
604     networks as well.

606     It should be noted that traffic steering is not a
607     global behavior.  It acts on only part of the network, and it is
608     temporary.

610  4.2.1.  Congestion Mitigation in Core

612     As depicted in the figure below, traffic from MAN C1 to MAN D2
613     follows the path Core C->Core B->Core D as the primary path, but
614     the load on that path may become too high.  It is reasonable to
615     transfer some traffic load to the less utilized path Core C->Core
616     A->Core D when the primary path has congestion.

618                                 Core

620                             +----------+
621                             |  Core A  |
622     +------+              --+----------+--              +------+
623     |MAN C1|-+         ---                ---         +-|MAN D1|
624     +------+ |      ---                      ---      | +------+
625              |    --                            --    |
626              | +----------+              +----------+ |
627              +-|  Core C  |              |  Core D  |-+
628              | +----------+              +----------+ |
629              |    --                            --    |
630     +------+ |      ---                      ---      | +------+
631     |MAN C2|-+         ---                ---         +-|MAN D2|
632     +------+              --+----------+--              +------+
633                             |  Core B  |
634                             +----------+

636  4.2.2.  Congestion Mitigation among ISPs

638     As depicted in the figure below, ISP1 and ISP2 are interconnected by 3
639     exits which are located in 3 cities respectively.  The links between
640     ISP1 and ISP2 in the same city are called local links, and the rest
641     are long distance links.  Traffic from IXP C1 to Core A in ISP 2
642     usually passes through link IXP C1->IXP A2->Core A.  This is a
643     long-distance route, directly connecting city C and city A.  Part of
644     traffic could be transferred to link IXP.

646                   *            *
647          City A   *   City B   *  City C
648                   *            *
649         +-------+ *  +-------+ * +-------+
650         |IXP A1 |----|IXP B1 |---|IXP C1 |
651         +-------+ *  +-------+ * +-------+     ISP 1
652            |      *      |     *   |  |
653     *******|*************|*********|**|**********
654            |  +----------|---------+  |
655            |  |   *      |     *      |     ISP 2
656            |  |   *      |     *      |
657          +------+ *  +------+  * +------+
658          |IXP A2|----|IXP B2|----|IXP C2|
659          +------+ *  +------+  * +------+
660             |     *      |     *     |
661             |     *      |     *     |
662         +-------+ *  +-------+ * +-------+
663         |Core A |----|Core B |---|Core C |
664         +-------+ *  +-------+ * +-------+

666  4.2.3.  Congestion Mitigation at International Edge

668     An ISP usually interconnects with more than 2 transit networks at the
669     international edge, so it is quite common that multiple paths may
670     exist for the same foreign destination.  Paths with better QoS
671     properties such as latency, loss and jitter are usually
672     preferred.  Since these properties keep changing from time to time,
673     the decision of path selection has to be made dynamically.

675     As depicted in the figure below, the traffic to the foreign
676     destination H from IP core network (AS C1) has two choices of transit
677     network, namely Transit A and Transit B.  Under normal conditions,
678     Transit B is the primary choice, but Transit A will be preferred when
679     the QoS of Transit B gets worse.  As a result, the same traffic will
680     go through Transit A instead.
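
The selection rule described above (stay on the primary transit while
its QoS is acceptable, otherwise move to a better one) can be sketched
as follows.  This is an illustration under stated assumptions, not a
mechanism defined by this document: the function select_transit(), the
metric names latency_ms and loss_pct, and the threshold values are all
hypothetical, and how the QoS measurements are obtained is out of scope.

```python
def select_transit(primary, candidates, qos, max_latency_ms, max_loss_pct):
    """Keep the primary transit while its measured QoS is within bounds;
    otherwise return the in-bounds alternative with the lowest latency.

    qos maps a transit name to {"latency_ms": ..., "loss_pct": ...}
    (hypothetical measurement format).
    """
    def within_bounds(name):
        m = qos[name]
        return (m["latency_ms"] <= max_latency_ms
                and m["loss_pct"] <= max_loss_pct)

    if within_bounds(primary):
        return primary
    healthy = [c for c in candidates if c != primary and within_bounds(c)]
    if not healthy:
        return primary  # no better alternative is known; stay put
    return min(healthy, key=lambda c: qos[c]["latency_ms"])


TRANSITS = ["Transit A", "Transit B"]
normal = {
    "Transit A": {"latency_ms": 120, "loss_pct": 0.2},
    "Transit B": {"latency_ms": 90, "loss_pct": 0.1},
}
# Same measurements, except that Transit B has degraded.
degraded = dict(normal, **{"Transit B": {"latency_ms": 400, "loss_pct": 3.0}})
```

With the sample measurements, Transit B is kept under normal conditions
and Transit A is chosen once Transit B exceeds the bounds.  Because the
properties keep changing, a controller would re-evaluate such a rule
whenever new measurements arrive.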

682                   *            *
683          City A   *   City B   *  City C
684                   *            *
685         +-------+ *  +-------+ * +-------+
686         |IXP A1 |----|IXP B1 |---|IXP C1 |
687         +-------+ *  +-------+ * +-------+     ISP 1
688            |      *      |     *   |  |
689     *******|*************|*********|**|**********
690            |  +----------|---------+  |
691            |  |   *      |     *      |     ISP 2
692            |  |   *      |     *      |
693          +------+ *  +------+  * +------+
694          |IXP A2|----|IXP B2|----|IXP C2|
695          +------+ *  +------+  * +------+
696             |     *      |     *     |
697             |     *      |     *     |
698         +-------+ *  +-------+ * +-------+
699         |Core A |----|Core B |---|Core C |
700         +-------+ *  +-------+ * +-------+

702  5.  Security Considerations

704     The interactions with a BGP-based controller are similar to those
705     with any other SDN controller.  The security implications of SDN
706     controllers have not been fully discussed or described.  Therefore,
707     protocol and applicability for solutions around this architecture
708     must take proper account of these concerns.

710  6.  IANA Considerations

712     This document does not require any IANA actions.

714  7.  Acknowledgements

716     The authors would like to thank Chris Bowers and Jeff Tantsura for
717     their valuable suggestions and comments on this draft.

719  8.  Contributors

721     Nan Wu
722     Huawei
723     Email: eric.wu@huawei.com

725  9.  References
726  9.1.  Normative References

728     [RFC1771]  Rekhter, Y. and T. Li, "A Border Gateway Protocol 4 (BGP-
729                4)", RFC 1771, DOI 10.17487/RFC1771, March 1995,
730                <https://www.rfc-editor.org/info/rfc1771>.

732     [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
733                Requirement Levels", BCP 14, RFC 2119,
734                DOI 10.17487/RFC2119, March 1997,
735                <https://www.rfc-editor.org/info/rfc2119>.

737     [RFC3107]  Rekhter, Y. and E. Rosen, "Carrying Label Information in
738                BGP-4", RFC 3107, DOI 10.17487/RFC3107, May 2001,
739                <https://www.rfc-editor.org/info/rfc3107>.

741     [RFC4271]  Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A
742                Border Gateway Protocol 4 (BGP-4)", RFC 4271,
743                DOI 10.17487/RFC4271, January 2006,
744                <https://www.rfc-editor.org/info/rfc4271>.

746     [RFC4456]  Bates, T., Chen, E., and R. Chandra, "BGP Route
747                Reflection: An Alternative to Full Mesh Internal BGP
748                (IBGP)", RFC 4456, DOI 10.17487/RFC4456, April 2006,
749                <https://www.rfc-editor.org/info/rfc4456>.

751     [RFC5575]  Marques, P., Sheth, N., Raszuk, R., Greene, B., Mauch, J.,
752                and D. McPherson, "Dissemination of Flow Specification
753                Rules", RFC 5575, DOI 10.17487/RFC5575, August 2009,
754                <https://www.rfc-editor.org/info/rfc5575>.

756     [RFC7752]  Gredler, H., Ed., Medved, J., Previdi, S., Farrel, A., and
757                S. Ray, "North-Bound Distribution of Link-State and
758                Traffic Engineering (TE) Information Using BGP", RFC 7752,
759                DOI 10.17487/RFC7752, March 2016,
760                <https://www.rfc-editor.org/info/rfc7752>.

762  9.2.  Informative References

764     [I-D.ietf-idr-bgpls-segment-routing-epe]
765                Previdi, S., Talaulikar, K., Filsfils, C., Patel, K., Ray,
766                S., and J. Dong, "BGP-LS extensions for Segment Routing
767                BGP Egress Peer Engineering", draft-ietf-idr-bgpls-
768                segment-routing-epe-17 (work in progress), October 2018.

770     [I-D.ietf-idr-flowspec-path-redirect]
771                Velde, G., Patel, K., and Z. Li, "Flowspec Indirection-id
772                Redirect", draft-ietf-idr-flowspec-path-redirect-07 (work
773                in progress), December 2018.

775     [I-D.ietf-idr-segment-routing-te-policy]
776                Previdi, S., Filsfils, C., Jain, D., Mattes, P., Rosen,
777                E., and S. Lin, "Advertising Segment Routing Policies in
778                BGP", draft-ietf-idr-segment-routing-te-policy-05 (work in
779                progress), November 2018.

781     [I-D.ietf-isis-segment-routing-extensions]
782                Previdi, S., Ginsberg, L., Filsfils, C., Bashandy, A.,
783                Gredler, H., and B. Decraene, "IS-IS Extensions for
784                Segment Routing", draft-ietf-isis-segment-routing-
785                extensions-22 (work in progress), December 2018.

787     [I-D.ietf-pce-segment-routing]
788                Sivabalan, S., Filsfils, C., Tantsura, J., Henderickx, W.,
789                and J. Hardwick, "PCEP Extensions for Segment Routing",
790                draft-ietf-pce-segment-routing-16 (work in progress),
791                March 2019.

793     [I-D.ietf-rtgwg-bgp-routing-large-dc]
794                Lapukhov, P., Premji, A., and J. Mitchell, "Use of BGP for
795                routing in large-scale data centers", draft-ietf-rtgwg-
796                bgp-routing-large-dc-11 (work in progress), June 2016.

798     [I-D.ietf-spring-segment-routing]
799                Filsfils, C., Previdi, S., Ginsberg, L., Decraene, B.,
800                Litkowski, S., and R. Shakir, "Segment Routing
801                Architecture", draft-ietf-spring-segment-routing-15 (work
802                in progress), January 2018.

804  Authors' Addresses

806     Yujia
807     China Telcom Co., Ltd.
808     109 West Zhongshan Ave,Tianhe District
809     Guangzhou  510630
810     China

812     Email: luoyuj@gsta.com

814     Liang
815     China Telcom Co., Ltd.
816     109 West Zhongshan Ave,Tianhe District
817     Guangzhou  510630
818     China

820     Email: oul@gsta.com
821     Xiang
822     Tencent

824     Email: terranhuang@tencent.com

826     Huaimo Chen
827     Huawei
828     Boston, MA
829     USA

831     Email: Huaimo.chen@huawei.com

833     Shunwan Zhuang
834     Huawei
835     Huawei Bld., No.156 Beiqing Rd.
836     Beijing  100095
837     China

839     Email: zhuangshunwan@huawei.com

841     Zhenbin Li
842     Huawei
843     Huawei Bld., No.156 Beiqing Rd.
844     Beijing  100095
845     China

847     Email: lizhenbin@huawei.com