Network Working Group                                    N. Bahadur, Ed.
Internet-Draft                                            R. Folkes, Ed.
Intended status: Informational                    Juniper Networks, Inc.
Expires: March 20, 2014                                          S. Kini
                                                                Ericsson
                                                               J. Medved
                                                                   Cisco
                                                      September 16, 2013


                  Routing Information Base Info Model
                   draft-ietf-i2rs-rib-info-model-00

Abstract

   Routing and routing functions in enterprise and carrier networks are
   typically performed by network devices (routers and switches) using a
   routing information base (RIB). Protocols and configuration push
   data into the RIB and the RIB manager installs state into the
   hardware for packet forwarding. This draft specifies an information
   model for the RIB to enable defining a standardized data model. Such
   a data model can be used to define an interface to the RIB from an
   entity that may even be external to the network device. This
   interface can be used to support new use-cases being defined by the
   IETF I2RS WG.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on March 20, 2014.

Copyright Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Conventions used in this document
   2.  RIB data
     2.1.  RIB definition
     2.2.  Routing instance
     2.3.  Route
     2.4.  Nexthop
       2.4.1.  Nexthop types
       2.4.2.  Nexthop list attributes
       2.4.3.  Nexthop content
       2.4.4.  Nexthop attributes
       2.4.5.  Nexthop vendor attributes
       2.4.6.  Special nexthops
   3.  Reading from the RIB
   4.  Writing to the RIB
   5.  Events and Notifications
   6.  RIB grammar
   7.  Using the RIB grammar
     7.1.  Using route preference and metric
     7.2.  Using different nexthops types
       7.2.1.  Tunnel nexthops
       7.2.2.  Replication lists
       7.2.3.  Weighted lists
       7.2.4.  Protection lists
       7.2.5.  Nexthop chains
       7.2.6.  Lists of lists
     7.3.  Performing multicast
     7.4.  Solving optimized exit control
   8.  RIB operations at scale
     8.1.  RIB reads
     8.2.  RIB writes
     8.3.  RIB events and notifications
   9.  Security Considerations
   10. IANA Considerations
   11. Acknowledgements
   12. References
     12.1.  Normative References
     12.2.  Informative References
   Authors' Addresses

1. Introduction

   Routing and routing functions in enterprise and carrier networks are
   traditionally performed in network devices. Traditionally, routers
   run routing protocols and the routing protocols (along with static
   config) populate the routing information base (RIB) of the router.
   The RIB is managed by the RIB manager, which provides a north-bound
   interface to its clients, i.e., the routing protocols, to insert
   routes into the RIB. The RIB manager consults the RIB and decides how
   to program the forwarding information base (FIB) of the hardware by
   interfacing with the FIB-manager. The relationship between these
   entities is shown in Figure 1.

          +-------------+        +-------------+
          |RIB-Client 1 | ...... |RIB-Client N |
          +-------------+        +-------------+
                 ^                      ^
                 |                      |
                 +----------------------+
                            |
                            V
                 +---------------------+
                 |RIB-Manager          |
                 |                     |
                 |       +-----+       |
                 |       | RIB |       |
                 |       +-----+       |
                 +---------------------+
                            ^
                            |
           +---------------------------------+
           |                                 |
           V                                 V
    +-------------+                   +-------------+
    |FIB-Manager 1|                   |FIB-Manager M|
    |   +-----+   |    ..........     |   +-----+   |
    |   | FIB |   |                   |   | FIB |   |
    |   +-----+   |                   |   +-----+   |
    +-------------+                   +-------------+

          Figure 1: RIB-Manager, RIB-Clients and FIB-Managers

   Routing protocols are inherently distributed in nature and each
   router makes an independent decision based on the routing data
   received from its peers. With the advent of newer deployment
   paradigms and the need for specialized applications, there is an
   emerging need to guide the router's routing function
   [I-D.atlas-i2rs-problem-statement]. Traditional network-device
   protocol-based RIB population suffices for most use cases where
   distributed network control works. However, there are use cases in
   which the network admins today configure static routes, policies and
   RIB import/export rules on the routers. There is also a growing list
   of use cases [I-D.white-i2rs-use-case],
   [I-D.hares-i2rs-use-case-vn-vc] in which a network admin might want
   to program the RIB based on data unrelated to just routing (within
   that network's domain). It could be based on routing data in an
   adjacent domain or it could be based on load on storage and compute
   in the given domain. Or it could simply be a programmatic way of
   creating on-demand dynamic overlays between compute hosts (without
   requiring the hosts to run traditional routing protocols). If there
   were a standardized programmatic interface to a RIB, it would fuel
   further networking applications targeted towards specific niches.

   A programmatic interface to the RIB involves two types of
   operations: reading what is in the RIB and adding/modifying/deleting
   contents of the RIB. [I-D.white-i2rs-use-case] lists various
   use-cases which require read and/or write manipulation of the RIB.

   In order to understand what is in a router's RIB, methods like
   per-protocol SNMP MIBs and screen scraping of show command output
   are being used. These methods are not scalable, since they are client
   pull mechanisms and not proactive push (from the router) mechanisms.
   Screen scraping is error prone (since the output format can change)
   and vendor dependent. Building a RIB from per-protocol MIBs is error
   prone since the MIB data represents protocol data and not the exact
   information that went into the RIB. Thus, just getting read-only RIB
   information from a router is a hard task.

   Adding content to the RIB from an external entity can be done today
   using static configuration support provided by router vendors.
   However, the mix of what can be modified in the RIB varies from
   vendor to vendor and the way of configuring it is also vendor
   dependent. This makes it hard for an external entity to program a
   multi-vendor network in a consistent and vendor independent way.

   The purpose of this draft is to specify an information model for the
   RIB. Using the information model, one can build a detailed data
   model for the RIB. That data model could then be used by an
   external entity to program a network device.

   The rest of this document is organized as follows. Section 2 goes
   into the details of what constitutes a RIB and what can be programmed
   in it. Guidelines for reading and writing the RIB are provided in
   Section 3 and Section 4 respectively. Section 5 provides a high-level
   view of the events and notifications going from a network device to
   an external entity, to update the external entity on asynchronous
   events.
   The RIB grammar is specified in Section 6. Examples of
   using the RIB grammar are shown in Section 7. Section 8 covers
   considerations for performing RIB operations at scale.

1.1. Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

2. RIB data

   This section describes the details of a RIB. It makes forward
   references to objects in the RIB grammar (Section 6). A high-level
   description of the RIB contents is shown below.

                       routing-instance
                         |         |
                         |         |
                   0..N  |         |  1..N
                         |         |
                   interface(s)  RIB(s)
                                   |
                                   |
                                   |  0..N
                               route(s)

2.1. RIB definition

   A RIB is an entity that contains routes. A RIB is identified by its
   name and a RIB is contained within a routing instance (Section 2.2).
   The name MUST be unique within a routing instance. All routes in a
   given RIB MUST be of the same type (e.g. IPv4). Each RIB MUST
   belong to some routing instance.

   A RIB can be tagged with a MULTI_TOPOLOGY_ID. If a routing instance
   is divided into multiple logical topologies, then the multi-topology
   field is used to distinguish one topology from the other, so as to
   keep routes from one topology independent of routes from another
   topology.

   If a routing instance contains multiple RIBs of the same type (e.g.
   IPv4), then a MULTI_TOPOLOGY_ID MUST be associated with each such
   RIB. Multiple RIBs are useful when describing multiple topology IGP
   (Interior Gateway Protocol) networks (see [RFC4915] and [RFC5120]).
   In a given routing instance, MULTI_TOPOLOGY_ID MUST be unique across
   RIBs of the same type.

   Each RIB can be optionally associated with an ENABLE_IP_RPF_CHECK
   attribute that enables reverse path forwarding (RPF) checks on all IP
   routes in that RIB.
   Reverse path forwarding (RPF) check is used to
   prevent spoofing and limit malicious traffic. For IP packets, the IP
   source address is looked up and the rpf interface(s) associated with
   the route for that IP source address is found. If the incoming IP
   packet's interface matches one of the rpf interface(s), then the IP
   packet is forwarded based on its IP destination address; otherwise,
   the IP packet is discarded.

2.2. Routing instance

   A routing instance, in the context of the RIB information model, is a
   collection of RIBs, interfaces, and routing parameters. A routing
   instance creates a logical slice of the router and allows different
   logical slices, across a set of routers, to communicate with each
   other. Layer 3 Virtual Private Networks (VPN), Layer 2 VPNs (L2VPN)
   and Virtual Private LAN Service (VPLS) can be modeled as routing
   instances. Note that modeling a Layer 2 VPN using a routing instance
   only models the Layer-3 (RIB) aspect and does not model any layer-2
   information (like ARP) that might be associated with the L2VPN.

   The set of interfaces indicates which interfaces are associated with
   this routing instance. The RIBs specify how incoming traffic is to
   be forwarded. The routing parameters control the information in
   the RIBs. The intersection set of interfaces of two routing instances
   SHOULD be the null set. In other words, an interface MUST NOT be
   present in two routing instances. Thus a routing instance describes
   the routing information and parameters across a set of interfaces.

   A routing instance MUST contain the following mandatory fields.
   o  INSTANCE_NAME: A routing instance is identified by its name,
      INSTANCE_NAME. This SHOULD be unique across all routing instances
      in a given network device.
   o  INSTANCE_DISTINGUISHER: Each routing instance MUST have a
      distinguisher associated with it.
      It enables one to distinguish
      routes across routing instances. The instance distinguisher MUST
      be unique across all routing instances in a given network device.
      How the INSTANCE_DISTINGUISHER is allocated and kept unique is
      outside the scope of this document. The instance distinguisher
      maps well to BGP route-distinguisher for virtual private networks
      (VPNs). However, the same concept can be used for other use-cases
      as well.
   o  rib-list: This is the list of RIBs associated with this routing
      instance. Each routing instance can have multiple RIBs to
      represent routes of different types. For example, one would put
      IPv4 routes in one RIB and MPLS routes in another RIB.

   A routing instance MAY contain the following optional fields.
   o  interface-list: This represents the list of interfaces associated
      with this routing instance. The interface list helps constrain
      the boundaries of packet forwarding. Packets coming on these
      interfaces are directly associated with the given routing
      instance. The interface list contains a list of identifiers, with
      each identifier uniquely identifying an interface.
   o  ROUTER_ID: The router-id field identifies the network device in
      control plane interactions with other network devices. This field
      is to be used if one wants to virtualize a physical router into
      multiple virtual routers. Each virtual router MUST have a unique
      router-id. ROUTER_ID MUST be unique across all network devices in
      a given domain.
   o  as-data: This is an identifier of the administrative domain to
      which the routing instance belongs. The as-data field is used
      when the routes in this instance are to be tagged with certain
      autonomous system (AS) characteristics. The RIB manager can use
      AS length as one of the parameters for making route selection.
      as-data consists of an AS number and an optional Confederation AS
      number ([RFC5065]).

2.3. Route

   A route is essentially a match condition and an action following the
   match. The match condition specifies the kind of route (IPv4, MPLS,
   etc.) and the set of fields to match on. Figure 2 represents the
   overall contents of a route.

                              route
                              | | |
                    +---------+ | +----------+
                    |           |            |
               0..N |           |            | 0..N
            route-attributes  match    nexthop-list
                                |
                                |
                +-------+-------+-------+--------+
                |       |       |       |        |
                |       |       |       |        |
              IPv4    IPv6    MPLS     MAC   Interface

                         Figure 2: Route model

   This document specifies the following match types:
   o  IPv4: Match on destination IP in IPv4 header
   o  IPv6: Match on destination IP in IPv6 header
   o  MPLS: Match on an MPLS label
   o  MAC: Match on Ethernet destination address
   o  Interface: Match on incoming interface of packet
   o  IP multicast: Match on (S, G) or (*, G), where S and G are IP
      prefixes

   Each route can have associated with it one or more optional route
   attributes.
   o  ROUTE_PREFERENCE: This is a numerical value that allows for
      comparing routes from different protocols (where static
      configuration is also considered a protocol for the purpose of
      this field). It is also known as administrative-distance. The
      lower the value, the higher the preference. For example, there
      can be an OSPF route for 192.0.2.1/32 with a preference of 5. If a
      controller programs a route for 192.0.2.1/32 with a preference of
      2, then the controller-entered route will be preferred by the RIB
      manager. Preference should be used to dictate behavior. For more
      examples of preference, see Section 7.1.
   o  ROUTE_METRIC: Route preference is used for comparing routes from
      different protocols. Route metric is used for comparing routes
      learned by the same protocol. If a controller wishes to program
      two or more routes to the same destination, then it can use the
      metric field to disambiguate those routes. For more examples, see
      Section 7.1.
   o  LOCAL_ONLY: This is a boolean value. If this is present, then it
      means that this route should not be exported into other RIBs.
   o  rpf-check-interface: Reverse path forwarding (RPF) check is used
      to prevent spoofing and limit malicious traffic. For IP packets,
      the IP source address is looked up and the rpf-check-interface
      associated with the route for that IP source address is found. If
      the incoming IP packet's interface matches one of the
      rpf-check-interfaces, then the IP packet is forwarded based on its
      IP destination address; otherwise, the IP packet is discarded. For
      MPLS routes, there is no source address to be looked up, so the
      usage is slightly different. For an MPLS route, a packet with the
      specified MPLS label will only be forwarded if it is received on
      one of the interfaces specified by the rpf-check-interface. If no
      rpf-check-interface is specified, then matching packets are not
      subject to this check. This field overrides the
      ENABLE_IP_RPF_CHECK flag on the RIB and interfaces provided in
      this list are used for doing the RPF check.
   o  as-path: A route can have an as-path associated with it to
      indicate which set of autonomous systems has to be traversed to
      reach the final destination. The as-path attribute can be used by
      the RIB manager in multiple ways. The RIB manager can choose
      paths with lower as-path length. Or the RIB manager can choose to
      not install paths going via a particular AS. How exactly the RIB
      manager uses the as-path is outside the scope of this document.
      For details of how the as-path is formed, see Section 5.1.2 of
      [RFC4271] and Section 3 of [RFC5065].
   o  route-vendor-attributes: Vendors can specify vendor-specific
      attributes using this. The details of this field are outside the
      scope of this document.

2.4. Nexthop

   A nexthop represents an object or action resulting from a route
   lookup.
   For example, if a route lookup results in sending the packet
   out a given interface, then the nexthop represents that interface.

   Nexthops can be fully resolved or unresolved. A resolved nexthop is
   something that is ready for installation in the FIB. An example is a
   nexthop that points to an interface. An unresolved nexthop is
   something that requires the RIB manager to figure out the final
   resolved nexthop. For example, a nexthop could point to an IP
   address. The RIB manager has to resolve how to reach that IP address:
   is the IP address reachable by regular IP forwarding, by an MPLS
   tunnel, or by both? If the RIB manager cannot resolve the nexthop,
   then the nexthop stays in unresolved state and is NOT a candidate for
   installation in the FIB. Future RIB events can cause a nexthop to get
   resolved (like that IP address being advertised by an IGP neighbor).

   The RIB information model allows an external entity to program
   nexthops that may be unresolved initially. Whenever an unresolved
   nexthop gets resolved, the RIB manager will send a notification of
   the same (see Section 5).

   The overall structure and usage of a nexthop is as shown in the
   figure below.

                                  route
                                    |
                                    | 0..N
                              nexthop-list
                                    |
                 +------------------+------------------+
            1..N |                                     |
                 |                                     |
        nexthop-list-member                     special-nexthop
                 |
                 |
           nexthop-chain
                 |
            1..N |
              nexthop
                 |
                 +------- nexthop-attributes
                 |
                 |
        +--------+------+------------------+------------------+
        |               |                  |                  |
        |               |                  |                  |
   nexthop-id   egress-interface    logical-tunnel      tunnel-encap

   Nexthops can be identified by an identifier to create a level of
   indirection. The identifier is set by the RIB manager and returned
   to the external entity on request. The RIB data-model SHOULD support
   a way to optionally receive a nexthop identifier for a given nexthop.
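
   The indirection can be pictured with a small sketch. This is only an
   illustration under stated assumptions, not part of the information
   model: the class names, method names and interface names below are
   all hypothetical, and resolution is reduced to setting an egress
   interface. Routes store only a nexthop identifier, so a change in how
   the nexthop is reached is picked up by every route that refers to it,
   and a notification is recorded when an unresolved nexthop first
   becomes resolved.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Nexthop:
    """Hypothetical nexthop: an address plus the interface it resolves to."""
    address: str
    egress_interface: Optional[str] = None  # None means "unresolved"

    @property
    def resolved(self) -> bool:
        return self.egress_interface is not None


@dataclass
class RibManager:
    """Toy RIB manager that hands out nexthop identifiers (illustrative only)."""
    nexthops: Dict[int, Nexthop] = field(default_factory=dict)
    routes: Dict[str, int] = field(default_factory=dict)  # prefix -> nexthop-id
    notifications: List[Tuple[str, int]] = field(default_factory=list)
    _next_id: int = 1

    def add_nexthop(self, nexthop: Nexthop) -> int:
        """Store a nexthop and return the identifier chosen by the manager."""
        nexthop_id = self._next_id
        self._next_id += 1
        self.nexthops[nexthop_id] = nexthop
        return nexthop_id

    def add_route(self, prefix: str, nexthop_id: int) -> None:
        """Program a route that refers to a nexthop only by its identifier."""
        if nexthop_id not in self.nexthops:
            raise KeyError(f"unknown nexthop identifier {nexthop_id}")
        self.routes[prefix] = nexthop_id

    def resolve(self, nexthop_id: int, egress_interface: str) -> None:
        """Resolve a nexthop, or move an already resolved one to a new path."""
        nexthop = self.nexthops[nexthop_id]
        was_resolved = nexthop.resolved
        nexthop.egress_interface = egress_interface
        if not was_resolved:
            # Only the unresolved -> resolved transition is notified here.
            self.notifications.append(("nexthop-resolved", nexthop_id))

    def forwarding_interface(self, prefix: str) -> Optional[str]:
        """Follow the identifier to find where the route currently points."""
        return self.nexthops[self.routes[prefix]].egress_interface


rib = RibManager()
peer = rib.add_nexthop(Nexthop("192.0.2.1"))  # unresolved at first
rib.add_route("198.51.100.0/24", peer)
rib.add_route("203.0.113.0/24", peer)
rib.resolve(peer, "ge-0/0/0")  # the address becomes reachable
rib.resolve(peer, "ge-0/0/1")  # the transport path changes later
```

   Neither route is reprogrammed when the path changes; both follow the
   identifier to the updated nexthop.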

   For example, one can create a nexthop that points to a BGP peer. The
   returned nexthop identifier can then be used for programming routes
   to point to the same nexthop. Given that the RIB manager has created
   an indirection for that BGP peer using the nexthop identifier, if the
   transport path to the BGP peer changes, that change in path will be
   seamless to the external entity and all routes that point to that BGP
   peer will automatically start going over the new transport path.
   Nexthop indirection using identifiers can be applied not just to
   unicast nexthops, but also to nexthops that contain chains and nested
   nexthops (Section 2.4.1).

2.4.1. Nexthop types

   This document specifies a very generic, extensible and recursive
   grammar for nexthops. Nexthops can be:
   o  Unicast nexthops - pointing to an interface
   o  Tunnel nexthops - pointing to a tunnel
   o  Replication lists - list of nexthops to which to replicate a
      packet
   o  Weighted lists - for load-balancing
   o  Protection lists - for primary/backup paths
   o  Nexthop chains - for chaining headers, e.g. MPLS label over a GRE
      header
   o  Lists of lists - recursive application of the above
   o  Indirect nexthops - pointing to a nexthop identifier
   o  Special nexthops - for performing specific well-defined functions
   It is expected that all network devices will have a limit on how many
   levels of lookup can be performed and not all hardware will be able
   to support all kinds of nexthops. RIB capability negotiation becomes
   very important for this reason and a RIB data-model MUST specify a
   way for an external entity to learn about the network device's
   capabilities. Examples of when and how to use various kinds of
   nexthops are shown in Section 7.2.

   Tunnel nexthops allow an external entity to program static tunnel
   headers.
   There can be cases where the remote tunnel end-point does
   not support dynamic signaling (e.g. no LDP support on a host) and in
   those cases the external entity might want to program the tunnel
   header on both ends of the tunnel. The tunnel nexthop is kept
   generic with specifications provided for some commonly used tunnels.
   It is expected that the data-model will model these tunnel types with
   complete accuracy.

   Nexthop chains can be used to specify multiple headers over a packet,
   before a packet is forwarded. One simple example is that of MPLS
   over GRE, wherein the packet has an inner MPLS header followed by a
   GRE header followed by an IP header. The outermost IP header is
   decided by the network device whereas the MPLS header and GRE header
   are specified by the controller. Not every network device will be
   able to support all kinds of nexthop chains and an arbitrary number
   of headers chained together. The RIB data-model SHOULD provide a way
   to expose nexthop chaining capability supported by a given network
   device.

2.4.2. Nexthop list attributes

   For nexthops that are of the form of a list(s), attributes can be
   associated with each member of the list to indicate the role of an
   individual member of the list. Two kinds of attributes are
   specified:
   o  PROTECTION_PREFERENCE: This provides a primary/backup like
      preference. The preference is an integer value that should be set
      to 1 or 2. Nexthop members with a preference of 1 are preferred
      over those with preference of 2. The network device SHOULD create
      a list of nexthops with preference 1 (primary) and another list of
      nexthops with preference 2 (backup) and SHOULD pre-program the
      forwarding plane with both the lists. If all the primary
      nexthops fail, then traffic MUST be switched over to members of
      the backup nexthop list.
      Either all members in a list MUST have a
      protection preference specified, or all members in a list MUST NOT
      have a protection preference specified.
   o  LOAD_BALANCE_WEIGHT: This is used for load-balancing. Each list
      member MUST be assigned a weight. The weight is a percentage
      number from 1 to 99. The weight determines how much traffic is
      sent over a given list member. If one of the member nexthops in
      the list is not active, then the weight value of that nexthop
      SHOULD be distributed among the other active members. How the
      distribution is done is up to the network device and not in the
      scope of the document. In other words, traffic should always be
      load-balanced even if there is a failure. After a failure, the
      external entity SHOULD re-program the nexthop list with updated
      weights so as to get a deterministic behavior among the remaining
      list members. To perform equal load-balancing, one MAY specify a
      weight of "0" for all the member nexthops. The value "0" is
      reserved for equal load-balancing and if applied, MUST be applied
      to all member nexthops.

   A nexthop list MAY contain elements that have both
   PROTECTION_PREFERENCE and LOAD_BALANCE_WEIGHT set. When both are
   set, it means that under normal operation the network device should
   load balance the traffic over all nexthops with a protection
   preference of 1. When all nexthops with a protection preference of 1
   are down (or unavailable), traffic MUST be load balanced over
   elements with protection preference of 2.

2.4.3. Nexthop content

   At the lowest level, a nexthop can point to one of the following:
   o  identifier: This is an identifier returned by the network device
      representing another nexthop or another nexthop chain.
   o  EGRESS_INTERFACE: This represents a physical, logical or virtual
      interface on the network device.
   o  address: This can be an IP address or MAC address or ISO address.
      *  An optional RIB name can also be specified to indicate the RIB
         in which the address is to be looked up further. One can use
         the RIB name field to direct the packet from one domain into
         another domain. For example, an MPLS packet coming in on an
         interface would be looked up in an MPLS RIB and the nexthop for
         that could indicate that we strip the MPLS label and do a
         subsequent IPv4 lookup in an IPv4 RIB. By default the RIB will
         be the same in which the route lookup was performed.
      *  An optional egress interface can be specified to indicate which
         interface to send the packet out on. The egress interface is
         useful when the network device contains Ethernet interfaces and
         one needs to perform an ARP lookup for the IP packet.
   o  tunnel encap: This can be an encap representing an IP tunnel or
      MPLS tunnel or others as defined in this document. An optional
      egress interface can be specified to indicate which interface to
      send the packet out on. The egress interface is useful when the
      network device contains Ethernet interfaces and one needs to
      perform an ARP lookup for the IP packet.
   o  logical tunnel: This can be an MPLS LSP or a GRE tunnel (or others
      as defined in this document), that is represented by a unique
      identifier (e.g., name).
   o  RIB_NAME: A nexthop pointing to a RIB indicates that the route
      lookup needs to continue in the specified RIB. This is a way to
      perform chained lookups.

2.4.4. Nexthop attributes

   Certain information is encoded implicitly in the nexthop and does not
   need to be specified by the controller. For example, when an IP
   packet is forwarded out, the IP TTL is decremented by default. The
   same applies for an MPLS packet. Similarly, when an IP packet is sent
   over an Ethernet interface, any ARP processing is handled implicitly
   by the network device and does not need to be programmed by an
   external device.
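
   The implicit TTL handling above, combined with the chained lookup of
   Section 2.4.3 (a nexthop that points to a RIB_NAME), can be sketched
   as follows. This is a minimal sketch under simplifying assumptions
   that are not taken from this document: each RIB is named after the
   header field it matches on, matching is exact rather than
   longest-prefix, label removal is not modeled, and the lookup depth
   limit is an arbitrary stand-in for a device capability. All names in
   the code are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional

MAX_LOOKUP_DEPTH = 4  # assumed device limit, not a value from this document


@dataclass(frozen=True)
class Nexthop:
    """Hypothetical nexthop: either an egress interface or another RIB."""
    egress_interface: Optional[str] = None
    rib_name: Optional[str] = None  # continue the lookup in this RIB


@dataclass
class Packet:
    """Hypothetical packet: one lookup key per RIB name, plus a TTL."""
    keys: Dict[str, str]
    ttl: int


def forward(ribs: Dict[str, Dict[str, Nexthop]],
            packet: Packet, rib_name: str) -> Optional[str]:
    """Return the egress interface, following RIB_NAME nexthops."""
    for _ in range(MAX_LOOKUP_DEPTH):
        nexthop = ribs[rib_name].get(packet.keys[rib_name])
        if nexthop is None:
            return None  # no matching route
        if nexthop.rib_name is not None:
            rib_name = nexthop.rib_name  # chained lookup in the next RIB
            continue
        packet.ttl -= 1  # implicit behavior: no flag was needed to ask for it
        return nexthop.egress_interface
    raise RuntimeError("lookup chain exceeds the assumed device limit")


ribs = {
    "mpls": {"100": Nexthop(rib_name="ipv4")},
    "ipv4": {"192.0.2.1": Nexthop(egress_interface="eth0")},
}
packet = Packet(keys={"mpls": "100", "ipv4": "192.0.2.1"}, ttl=64)
egress = forward(ribs, packet, "mpls")
```

   The depth limit stands in for the hardware limit on levels of lookup
   mentioned in Section 2.4.1; a chain that loops back on itself raises
   an error instead of running forever.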

   A nexthop can have some attributes associated with it. The purpose
   of the attributes is to either override implicit behavior (like that
   related to TTL processing) or to guide the network device to perform
   something specific. Vendor specific attributes can also be
   specified. The details of vendor specific attributes are outside the
   scope of this document.

2.4.4.1. Nexthop flags

   Nexthop flags are an optional attribute of a nexthop, used to signal
   specific behavior to the hardware. Two common types of
   operations are specified using nexthop flags.
   o  NO_DECREMENT_TTL: This indicates that the IPv4 time-to-live field
      in an IPv4 packet MUST NOT be decremented before the packet is
      forwarded. This may be applied when an IPv4 packet is
      encapsulated in a tunnel (e.g., MPLS) and one wants to hide the
      fact that the packet is going through a tunnel.
   o  NO_PROPAGATE_TTL: This indicates that the IPv4 time-to-live field
      in an IPv4 packet MUST NOT be propagated into an equivalent field,
      when the IPv4 packet is tunneled. For example, if the IPv4 packet
      is tunneled over MPLS, then the network device should use the
      default time-to-live value for the outer MPLS header. This field
      can also be used to indicate that when a tunnel terminates, one
      does not propagate the outer header's time-to-live value into the
      inner header. So, on MPLS tunnel termination, one does not
      propagate the MPLS TTL value into the IPv4 header.
   The TTL nexthop flags can be used to simulate a Pipe model for
   tunnels. See [RFC3443] for a detailed understanding of Pipe model
   and Uniform model.

2.4.5. Nexthop vendor attributes

   This field has been defined for vendor specific extensions. The
   contents of this field are beyond the scope of this document.

2.4.6. Special nexthops

   This document specifies certain special nexthops.
The purpose of
647    each of them is explained below:
648    o DISCARD: This indicates that the network device should drop the
649      packet and increment a drop counter.
650    o DISCARD_WITH_ERROR: This indicates that the network device should
651      drop the packet, increment a drop counter and send back an
652      appropriate error message (like an ICMP error).
653    o RECEIVE: This indicates that the traffic is destined for the
654      network device. Examples are protocol packets and OAM packets.
655      All locally destined traffic SHOULD be throttled to avoid a denial
656      of service attack on the router's control plane. An optional
657      rate-limiter can be specified to indicate how to throttle traffic
658      destined for the control plane. The description of the rate-
659      limiter is outside the scope of this document.

661 3. Reading from the RIB

663    A RIB data-model MUST allow an external entity to read entries for
664    RIBs created by that entity. The network device administrator MAY
665    allow reading of other RIBs by an external entity through access
666    lists on the network device. The details of access lists are outside
667    the scope of this document.

669    The data-model MUST support a full read of the RIB and subsequent
670    incremental reads of changes to the RIB. An external agent SHOULD be
671    able to request a full read at any time in the lifecycle of the
672    connection. When sending data to an external entity, the RIB manager
673    SHOULD try to send all dependencies of an object prior to sending
674    that object.

676 4. Writing to the RIB

678    A RIB data-model MUST allow an external entity to write entries for
679    RIBs created by that entity. The network device administrator MAY
680    allow writes to other RIBs by an external entity through access lists
681    on the network device. The details of access lists are outside the
682    scope of this document.
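
The read and write permission rules of Sections 3 and 4 can be sketched as a small policy check. This is only an illustration under stated assumptions: the document leaves access lists out of scope, so the `RibAccessPolicy` class, its field names, and the client and RIB names below are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class RibAccessPolicy:
    # RIB name -> identity of the external entity that created it.
    creators: Dict[str, str] = field(default_factory=dict)
    # Hypothetical administrator-configured access lists:
    # RIB name -> entities additionally permitted to read or write.
    read_acl: Dict[str, Set[str]] = field(default_factory=dict)
    write_acl: Dict[str, Set[str]] = field(default_factory=dict)

    def can_read(self, entity: str, rib: str) -> bool:
        # The creator can always read; others need a read access-list entry.
        return (self.creators.get(rib) == entity
                or entity in self.read_acl.get(rib, set()))

    def can_write(self, entity: str, rib: str) -> bool:
        # The creator can always write; others need a write access-list entry.
        return (self.creators.get(rib) == entity
                or entity in self.write_acl.get(rib, set()))


policy = RibAccessPolicy(
    creators={"rib-a": "client-1"},
    read_acl={"rib-a": {"client-2"}},   # administrator grants read only
)
print(policy.can_read("client-1", "rib-a"), policy.can_write("client-1", "rib-a"))
print(policy.can_read("client-2", "rib-a"), policy.can_write("client-2", "rib-a"))
```

Keeping the read and write lists separate lets an administrator grant read access to another entity's RIB without also granting write access.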
684    When writing an object to a RIB, the external entity SHOULD try to
685    write all dependencies of the object prior to sending that object.
686    The data-model MUST support requesting identifiers for nexthops and
687    collecting the identifiers back in the response.

689    Route programming in the RIB MUST result in a return code that
690    contains the following attributes:
691    o Installed - Yes/No (Indicates whether the route got installed in
692      the FIB)
693    o Active - Yes/No (Indicates whether a route is fully resolved and
694      is a candidate for selection)
695    o Reason - e.g., Not authorized
696    The data-model MUST specify which objects are modifiable objects. A
697    modifiable object is one whose contents can be changed without
698    having to change objects that depend on it and without affecting any
699    data forwarding. To change a non-modifiable object, one will need to
700    create a new object and delete the old one. For example, routes that
701    use a nexthop that is identified by a nexthop-identifier should be
702    unaffected when the contents of that nexthop change.

704 5. Events and Notifications

706    Asynchronous notifications are sent by the network device's RIB
707    manager to an external entity when some event occurs on the network
708    device. A RIB data-model MUST support sending asynchronous
709    notifications. A brief list of suggested notifications is given below:
710    o Route change notification, with return code as specified in
711      Section 4
712    o Nexthop resolution status (resolved/unresolved) notification

714 6. RIB grammar

716    This section specifies the RIB information model in Routing Backus-
717    Naur Form [RFC5511].

719    ::=
720    []
721    [] []

723    ::= []

725    ::= ( ...)

727    ::= ( ...)
728    ::=
729    [ ...
] []
730    [ENABLE_IP_RPF_CHECK]
731    ::= | |
732    |

734    ::=
735    []
736    []

738    ::= | | |
739    |

741    ::= []
742    ::=

744    ::= []
745    ::=

747    ::=
748    ::= ( )
749    ::=

751    ::=
752

754    ::=
755

757    ::= [] []
758    []
759    []

761    ::= |
762    |
763
764    ::= [] []
765    ::= ( ) [ ...]
766    ::= | |
767    |
768    ::= ( ...) []

770    ::=

772    ::= []
773    ::= <>
774    ::= <>

776    ::= |
777    (() |
778    ([ ... ] ))

780    ::= ( |
781    )
782    []
783    ::= []
784    []

786    ::= ( ...)
787    ::= |
788    ::= ( | |
789    (
790    ([] | [])) |
791    ( []) |
792    |
793    )
794    []
795    []

797    ::= |
798    ::= ( ) |
799    ( ) |
800    ( ) |
801    ( )
802    ::= | |
803    ( [] [])
804    ::= <>

806    ::=
807    ::= | | | |

809    ::= ( ) |
810    ( ) |
811    ( ) |
812    ( ) |
813    ( ) |
814    ( )

816    ::=
817    [] []

819    ::=
820    []
821    [] []

823    ::= ( ...)
824    ::= ( []
825    [] []) |
826    ( [])

828    ::= []
829    ::= ( | )
830    []
831    ::= ( | )
832
833    []

835    ::= []
836    []
837    ::= | | |
838    ::= [] []
839    ::= <>

841 7. Using the RIB grammar

843    The RIB grammar is very generic and covers a variety of features.
844    This section provides examples of using objects in the RIB grammar
845    and examples of programming certain use cases.

847 7.1. Using route preference and metric

849    Using route preference, one can pre-install protection paths in the
850    network. For example, if OSPF has a route preference of 10, then one
851    can install a route with route preference of 20 to the same
852    destination. The OSPF route will get precedence and will get
853    installed in the FIB. When the OSPF route goes away (for any
854    reason), the protection path will get installed in the FIB. If the
855    hardware supports it, then the RIB manager can choose to pre-install
856    both routes, with the OSPF nexthop getting preference.

858    Route preference can also be used to prevent denial of service
859    attacks by installing routes with the best preference that either
860    drop the offending traffic or route it to some monitoring/analysis
861    station.
Since the routes are installed with the best preference,
862    they will supersede any route installed by any other protocol.

864    Route metric is used to disambiguate between 2 or more routes to the
865    same destination with the same preference and in the same RIB. One
866    usage of this is to install 2 routes, each with a different nexthop.
867    The preferred nexthop is given a better metric than the other one.
868    This results in traffic being forwarded to the preferred nexthop. If
869    the preferred nexthop fails, then the RIB manager will automatically
870    install a route to the other nexthop.

872 7.2. Using different nexthop types

874    The RIB grammar allows one to create a variety of nexthops. This
875    section describes uses for certain types of nexthops.

877 7.2.1. Tunnel nexthops

879    A tunnel nexthop points to a tunnel of some kind. Traffic that goes
880    over the tunnel gets encapsulated with the tunnel encap. Tunnel
881    nexthops are useful for abstracting out details of the network, by
882    having the traffic seamlessly route between network edges.

884 7.2.2. Replication lists

886    One can create a replication list for replicating traffic to multiple
887    destinations. The destinations, in turn, could be complex nexthops
888    in themselves - at a level supported by the network device. Point to
889    multipoint and broadcast are examples that involve replication.

891    A replication list (at the simplest level) can be represented as:

893    ::= [ ... ]

895    The above can be derived from the grammar as follows:

897    ::= [ ...]
898    ::= [ ...]
899    ::= [ ... ]

901 7.2.3. Weighted lists

903    A weighted list is used to load-balance traffic among a set of
904    nexthops. From a modeling perspective, a weighted list is very
905    similar to a replication list, with the difference that each member
906    nexthop MUST have a LOAD_BALANCE_WEIGHT associated with it.

908    A weighted list (at the simplest level) can be represented as:

910    ::= ( )
911    [( )...
]

913    The above can be derived from the grammar as follows:

915    ::= [ ...]
916    ::= ( )
917    [(
918    ) ...]
919    ::= ( )
920    [( ) ... ]
921    ::= ( )
922    [( )... ]

924 7.2.4. Protection lists

926    Protection lists are similar to weighted lists. A protection list
927    specifies a set of primary nexthops and a set of backup nexthops.
928    The attribute indicates which nexthop is
929    primary and which is backup.

931    A protection list can be represented as:

933    ::= ( )
934    [( )... ]

936    A protection list can also be a weighted list. In other words,
937    traffic can be load-balanced among the primary nexthops of a
938    protection list. In such a case, the list will look like:

940    ::= (
941    )
942    [(
943    )... ]

945 7.2.5. Nexthop chains

947    A nexthop chain is a nexthop that puts one or more headers on an
948    outgoing packet. One example is a Pseudowire, which is MPLS over
949    some transport (MPLS or GRE for instance). Another example is VXLAN
950    over IP. A nexthop chain allows an external entity to break up the
951    programming of the nexthop into independent pieces - one per
952    encapsulation.

954    A simple example of MPLS over GRE can be represented as:

956    ::= ( ) ( )

958    The above can be derived from the grammar as follows:

960    ::= [ ...]
961    ::=
962    ::= [ ... ]
963    ::= ( [ ...])
964    ::= ()
965    ::= ( ) ( )

967 7.2.6. Lists of lists

969    Lists of lists is a complex construct. One example of usage of such
970    a construct is to replicate traffic to multiple destinations, with
971    high availability. In other words, for each destination you have a
972    primary and backup nexthop (protection list) to ensure there is no
973    traffic drop in case of a failure. So the outer list is a replication
974    list and the inner lists are protection lists of primary/backup
975    nexthops.

977 7.3. Performing multicast

979    IP multicast involves matching a packet on (S, G) or (*, G), where
980    both S (source) and G (group) are IP prefixes.
Following the match,
981    the packet is replicated to one or more recipients. How the
982    recipients subscribe to the multicast group is outside the scope of
983    this document.

985    In PIM-based multicast, the packets are IP forwarded on an IP
986    multicast tree. The downstream nodes at each point in the multicast
987    tree are one or more IP addresses. These can be represented as a
988    replication list (Section 7.2.2).

990    In MPLS-based multicast, the packets are forwarded on a point to
991    multipoint (P2MP) label-switched path (LSP). The nexthop for a P2MP
992    LSP can be represented in the nexthop grammar as a
993    (P2MP LSP identifier) or a replication list (Section 7.2.2) of
994    , with each tunnel encap representing a single MPLS
995    downstream nexthop.

997 7.4. Solving optimized exit control

999    In case of optimized exit control, a controller wants to control the
1000    edge device (and optionally control the outgoing interface on that
1001    edge device) that is used by a server to send traffic out. This can
1002    be easily achieved by having the controller program the edge router
1003    (e.g., 192.0.2.10) and the server along the following lines:

1005    Server:
1006    ::= (
1007    )
1008    ::= <198.51.100.1/16>
1009    ( )
1010    ( )

1012    ::= <198.51.100.1/16>
1013    ( <100>)
1014    ( <192.0.2.10> )

1016    Edge Router:
1017    ::=
1018    ::= ( <100>)

1020    In the above case, the label 100 identifies the egress interface
1021    on the edge router.

1023 8. RIB operations at scale

1025    This section discusses the scale requirements for a RIB data-model.
1026    The RIB data-model should be able to handle a large scale of
1027    operations, to enable deployment of RIB applications in large
1028    networks.

1030 8.1. RIB reads

1032    Bulking (grouping of multiple objects in a single message) MUST be
1033    supported when a network device sends RIB data to an external entity.
1034    Similarly, the data-model MUST enable a RIB client to request data in
1035    bulk from a network device.

1037 8.2.
RIB writes

1039    Bulking (grouping of multiple write operations in a single message)
1040    MUST be supported when an external entity wants to write to the RIB.
1041    The response from the network device MUST include a return-code for
1042    each write operation in the bulk message.

1044 8.3. RIB events and notifications

1046    There can be cases where a single network event results in multiple
1047    events and/or notifications from the network device to an external
1048    entity. Conversely, when several things happen at the same
1049    time, a network device might have to send
1050    multiple events and/or notifications to an external entity. The
1051    event/notification message originated by the network device MUST
1052    support bulking of multiple events and notifications in a single
      message.

1054 9. Security Considerations

1056    All interactions between a RIB manager and an external entity MUST be
1057    authenticated and authorized. The RIB manager MUST protect itself
1058    against a denial of service attack by a rogue external entity, by
1059    throttling request processing. A RIB manager MUST enforce limits on
1060    how much data can be programmed by an external entity and return
1061    an error when such a limit is reached.

1063    The RIB manager MUST expose a data-model that it implements. An
1064    external agent MUST send requests to the RIB manager that comply with
1065    the supported data-model. The data-model MUST specify the behavior
1066    of the RIB manager on handling of unsupported data requests.

1068 10. IANA Considerations

1070    This document does not generate any considerations for IANA.

1072 11. Acknowledgements

1074    The authors would like to thank the working group co-chairs and
1075    reviewers for their comments and suggestions on this draft.
The
1076    following people contributed to the design of the RIB model as part
1077    of the I2RS Interim meeting in April 2013 - Wes George, Chris
1078    Liljenstolpe, Jeff Tantsura, Sriganesh Kini, Susan Hares, Fabian
1079    Schneider and Nitin Bahadur.

1081 12. References

1083 12.1. Normative References

1085    [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
1086              Requirement Levels", BCP 14, RFC 2119, March 1997.

1088 12.2. Informative References

1090    [I-D.atlas-i2rs-problem-statement]
1091              Atlas, A., Nadeau, T., and D. Ward, "Interface to the
1092              Routing System Problem Statement",
1093              draft-atlas-i2rs-problem-statement-02 (work in progress),
1094              August 2013.

1096    [I-D.hares-i2rs-use-case-vn-vc]
1097              Hares, S., "Use Cases for Virtual Connections on Demand
1098              (VCoD) and Virtual Network on Demand using Interface to
1099              Routing System", draft-hares-i2rs-use-case-vn-vc-00 (work
1100              in progress), February 2013.

1102    [I-D.white-i2rs-use-case]
1103              White, R., Hares, S., and A. Retana, "Protocol Independent
1104              Use Cases for an Interface to the Routing System",
1105              draft-white-i2rs-use-case-01 (work in progress),
1106              August 2013.

1108    [RFC3443] Agarwal, P. and B. Akyol, "Time To Live (TTL) Processing
1109              in Multi-Protocol Label Switching (MPLS) Networks",
1110              RFC 3443, January 2003.

1112    [RFC4271] Rekhter, Y., Li, T., and S. Hares, "A Border Gateway
1113              Protocol 4 (BGP-4)", RFC 4271, January 2006.

1115    [RFC4915] Psenak, P., Mirtorabi, S., Roy, A., Nguyen, L., and P.
1116              Pillay-Esnault, "Multi-Topology (MT) Routing in OSPF",
1117              RFC 4915, June 2007.

1119    [RFC5065] Traina, P., McPherson, D., and J. Scudder, "Autonomous
1120              System Confederations for BGP", RFC 5065, August 2007.

1122    [RFC5120] Przygienda, T., Shen, N., and N. Sheth, "M-ISIS: Multi
1123              Topology (MT) Routing in Intermediate System to
1124              Intermediate Systems (IS-ISs)", RFC 5120, February 2008.
1126    [RFC5511] Farrel, A., "Routing Backus-Naur Form (RBNF): A Syntax
1127              Used to Form Encoding Rules in Various Routing Protocol
1128              Specifications", RFC 5511, April 2009.

1130 Authors' Addresses

1132    Nitin Bahadur (editor)
1133    Juniper Networks, Inc.
1134    1194 N. Mathilda Avenue
1135    Sunnyvale, CA 94089
1136    US

1138    Phone: +1 408 745 2000
1139    Email: nitinb@juniper.net
1140    URI: www.juniper.net

1142    Ron Folkes (editor)
1143    Juniper Networks, Inc.
1144    1194 N. Mathilda Avenue
1145    Sunnyvale, CA 94089
1146    US

1148    Phone: +1 408 745 2000
1149    Email: ronf@juniper.net
1150    URI: www.juniper.net

1152    Sriganesh Kini
1153    Ericsson

1155    Email: sriganesh.kini@ericsson.com

1157    Jan Medved
1158    Cisco

1160    Email: jmedved@cisco.com