Network Working Group                                    N. Bahadur, Ed.
Internet-Draft                                            R. Folkes, Ed.
Intended status: Informational                    Juniper Networks, Inc.
Expires: January 16, 2014                                        S. Kini
                                                                Ericsson
                                                               J. Medved
                                                                   Cisco
                                                           July 15, 2013

                  Routing Information Base Info Model
                  draft-nitinb-i2rs-rib-info-model-01

Abstract

   Routing and routing functions in enterprise and carrier networks are
   typically performed by network devices (routers and switches) using a
   routing information base (RIB). Protocols and configuration push
   data into the RIB, and the RIB manager installs state into the
   hardware for packet forwarding. This draft specifies an information
   model for the RIB to enable defining a standardized data model. Such
   a data model can be used to define an interface to the RIB from an
   entity that may even be external to the network device. This
   interface can be used to support new use cases being defined by the
   IETF I2RS WG.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 16, 2014.

Copyright Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Conventions used in this document
   2.  RIB data
     2.1.  RIB definition
     2.2.  Routing tables
     2.3.  Route
     2.4.  Nexthop
       2.4.1.  Nexthop types
       2.4.2.  Nexthop list attributes
       2.4.3.  Nexthop content
       2.4.4.  Nexthop attributes
       2.4.5.  Nexthop vendor attributes
       2.4.6.  Special nexthops
   3.  Reading from the RIB
   4.  Writing to the RIB
   5.  Events and Notifications
   6.  RIB grammar
   7.  Using the RIB grammar
     7.1.  Using route preference and metric
     7.2.  Using different nexthops types
       7.2.1.  Tunnel nexthops
       7.2.2.  Replication lists
       7.2.3.  Weighted lists
       7.2.4.  Protection lists
       7.2.5.  Nexthop chains
       7.2.6.  Lists of lists
     7.3.  Performing multicast
     7.4.  Solving optimized exit control
   8.  RIB operations at scale
     8.1.  RIB reads
     8.2.  RIB writes
     8.3.  RIB events and notifications
   9.  Security Considerations
   10. IANA Considerations
   11. Acknowledgements
   12. References
     12.1.  Normative References
     12.2.  Informative References
   Authors' Addresses

1. Introduction

   Routing and routing functions in enterprise and carrier networks are
   traditionally performed in network devices. Traditionally, routers
   run routing protocols, and the routing protocols (along with static
   config) populate the routing information base (RIB) of the router.
   The RIB is managed by the RIB manager, which provides a north-bound
   interface to its clients, i.e., the routing protocols, to insert
   routes into the RIB. The RIB manager consults the RIB and decides
   how to program the forwarding information base (FIB) of the hardware
   by interfacing with the FIB-manager. The relationship between these
   entities is shown in Figure 1.

        +-------------+ ...... +-------------+
        |RIB-Client 1 |        |RIB-Client N |
        +-------------+        +-------------+
               ^                      ^
               |                      |
               +----------------------+
                          |
                          V
               +---------------------+
               |RIB-Manager          |
               |                     |
               |      +-----+        |
               |      | RIB |        |
               |      +-----+        |
               +---------------------+
                          ^
                          |
         +---------------------------------+
         |                                 |
         V                                 V
  +-------------+                   +-------------+
  |FIB-Manager 1|                   |FIB-Manager M|
  |   +-----+   |    ..........     |   +-----+   |
  |   | FIB |   |                   |   | FIB |   |
  |   +-----+   |                   |   +-----+   |
  +-------------+                   +-------------+

          Figure 1: RIB-Manager, RIB-Clients and FIB-Managers

   Routing protocols are inherently distributed in nature and each
   router makes an independent decision based on the routing data
   received from its peers. With the advent of newer deployment
   paradigms and the need for specialized applications, there is an
   emerging need to guide the router's routing function
   [I-D.atlas-i2rs-problem-statement]. Traditional network-device
   protocol-based RIB population suffices for most use cases where
   distributed network control works. However, there are use cases in
   which network admins today configure static routes, policies and
   RIB import/export rules on the routers. There is also a growing list
   of use cases [I-D.white-i2rs-use-case],
   [I-D.hares-i2rs-use-case-vn-vc] in which a network admin might want
   to program the RIB based on data unrelated to just routing (within
   that network's domain). It could be based on routing data in an
   adjacent domain, or it could be based on load on storage and compute
   in the given domain. Or it could simply be a programmatic way of
   creating on-demand dynamic overlays between compute hosts (without
   requiring the hosts to run traditional routing protocols). If there
   were a standardized programmatic interface to a RIB, it would fuel
   further networking applications targeted towards specific niches.

   A programmatic interface to the RIB involves two types of operations:
   reading what is in the RIB and adding/modifying/deleting contents of
   the RIB. [I-D.white-i2rs-use-case] lists various use cases that
   require read and/or write manipulation of the RIB.

   In order to understand what is in a router's RIB, methods like
   per-protocol SNMP MIBs and show output screen scraping are being
   used. These methods are not scalable, since they are client pull
   mechanisms and not proactive push (from the router) mechanisms.
   Screen scraping is error prone (since output can change) and vendor
   dependent. Building a RIB from per-protocol MIBs is error prone since
   the MIB data represents protocol data and not the exact information
   that went into the RIB. Thus, just getting read-only RIB information
   from a router is a hard task.

   Adding content to the RIB from an external entity can be done today
   using static configuration support provided by router vendors.
   However, the mix of what can be modified in the RIB varies from
   vendor to vendor, and the way of configuring it is also vendor
   dependent. This makes it hard for an external entity to program a
   multi-vendor network in a consistent and vendor-independent way.

   The purpose of this draft is to specify an information model for the
   RIB. Using the information model, one can build a detailed data
   model for the RIB. That data model could then be used by an
   external entity to program a router.

   The rest of this document is organized as follows. Section 2
   describes what constitutes a RIB and what can be programmed in it.
   Section 5 provides a high-level view of the events and notifications
   going from a network device to an external entity, to update the
   external entity on asynchronous events. The RIB grammar is specified
   in Section 6. Examples of using the RIB grammar are shown in
   Section 7.

1.1. Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

2. RIB data

   This section describes the details of a RIB. It makes forward
   references to objects in the RIB grammar (Section 6).

2.1. RIB definition

   A RIB is a logical construct controlled by an external entity. A RIB
   contains one or more routing instances. On a network device, a RIB
   is uniquely identified by its name. A routing instance can be in
   only one RIB. A routing instance, in the context of the RIB
   information model, is a collection of routing tables, interfaces, and
   routing parameters. A routing instance creates a logical slice of
   the router and allows different logical slices, across a set of
   routers, to communicate with each other. Layer 3 Virtual Private
   Networks (VPN), Layer 2 VPNs (L2VPN) and Virtual Private LAN Service
   (VPLS) can be modeled as routing instances. Note that modeling a
   Layer 2 VPN using a routing instance only models the Layer 3 (RIB)
   aspect and does not model any Layer 2 information (like ARP) that
   might be associated with the L2VPN.

   The set of interfaces indicates which interfaces this routing
   instance has control over. The routing tables specify how incoming
   traffic is to be forwarded. The routing parameters control the
   information in the routing tables. The intersection set of
   interfaces of two routing instances MUST be the null set. In other
   words, an interface MUST NOT be present in two routing instances.
   Thus a routing instance describes the routing information and
   parameters across a set of interfaces.

   A routing instance MUST contain the following mandatory fields.
   o  INSTANCE_NAME: A routing instance is identified by its name,
      INSTANCE_NAME.
   o  INSTANCE_DISTINGUISHER: Each routing instance must have a unique
      distinguisher associated with it. It enables one to distinguish
      routes across routing instances. The instance distinguisher
      SHOULD be unique across all routing instances in a given network
      device. How the INSTANCE_DISTINGUISHER is allocated and kept
      unique is outside the scope of this document. The instance
      distinguisher maps well to the BGP route-distinguisher for virtual
      private networks (VPNs). However, the same concept can be used for
      other use cases as well.
   o  routing-table-list: This is the list of routing tables associated
      with this routing instance. Each routing instance can have
      multiple tables to represent routes of different types. For
      example, one would put IPv4 routes in one table and MPLS routes in
      another table.

   A routing instance MAY contain the following optional fields.
   o  interface-list: This represents the list of interfaces in this
      routing instance. The interface list helps constrain the
      boundaries of packet forwarding. Packets coming in on these
      interfaces are directly associated with the given routing
      instance. The interface list contains a list of identifiers, with
      each identifier uniquely identifying an interface.
   o  ROUTER_ID: The router-id field identifies the router. This field
      is to be used if one wants to virtualize a physical router into
      multiple virtual routers. Each virtual router will have a unique
      router-id.
   o  ISO_SYSTEM_ID: For IS-IS to operate on a router, a system
      identifier is needed. This field represents that identifier.
   o  as-data: The as-data field is used when the routes in this
      instance are to be tagged with certain autonomous system (AS)
      characteristics. The RIB manager can use AS length as one of the
      parameters for making path selection. as-data consists of an AS
      number and an optional Confederation AS number ([RFC5065]).

2.2. Routing tables

   A routing table is an entity that contains routes. A routing table
   is identified by its name. The name MUST be unique within a RIB.
   All routes in a given routing table MUST be of the same type (e.g.
   IPv4). Each routing table MUST belong to some routing instance.

   A routing table can be tagged with a MULTI_TOPOLOGY_ID. If a routing
   instance is divided into multiple logical topologies, then the
   multi-topology field is used to distinguish one topology from the
   other, so as to keep routes from one topology independent of routes
   from another topology.

   If a routing instance contains multiple tables of the same type (e.g.
   IPv4), then a MULTI_TOPOLOGY_ID MUST be associated with each such
   table. Multiple tables are useful when describing multiple topology
   IGP (Interior Gateway Protocol) networks (see [RFC4915] and
   [RFC5120]). In a given routing instance, MULTI_TOPOLOGY_ID MUST be
   unique across routing tables of the same type.

   Each routing table can optionally be associated with an
   ENABLE_IP_RPF_CHECK attribute that enables Reverse Path Forwarding
   (RPF) checks on all IP routes in that table. The RPF check is used
   to prevent spoofing and limit malicious traffic. For IP packets, the
   IP source address is looked up and the RPF interface(s) associated
   with the route for that IP source address are found. If the incoming
   IP packet's interface matches one of the RPF interface(s), then the
   IP packet is forwarded based on its IP destination address;
   otherwise, the IP packet is discarded.

2.3. Route

   A route is essentially a match condition and an action following the
   match. The match condition specifies the kind of route (IPv4, MPLS,
   etc.) and the set of fields to match on.
   This document specifies the following match types:
   o  IPv4: Match on destination IP in IPv4 header
   o  IPv6: Match on destination IP in IPv6 header
   o  MPLS: Match on an MPLS label
   o  MAC: Match on Ethernet destination addresses
   o  Interface: Match on incoming interface of packet
   o  IP multicast: Match on (S, G) or (*, G), where S and G are IP
      prefixes

   Each route can have one or more optional route attributes associated
   with it.
   o  ROUTE_PREFERENCE: This is a numerical value that allows for
      comparing routes from different protocols. It is also known as
      administrative-distance. The lower the value, the higher the
      preference. For example, there can be an OSPF route for
      192.0.2.1/32 with a preference of 5. If a controller programs a
      route for 192.0.2.1/32 with a preference of 2, then the
      controller-entered route will be preferred by the RIB manager.
      Preference should be used to dictate behavior. For more examples
      of preference, see Section 7.1.
   o  ROUTE_METRIC: Route preference is used for comparing routes from
      different protocols. Route metric is used for comparing routes
      learned by the same protocol. If a controller wishes to program
      two or more routes to the same destination, then it can use the
      metric field to disambiguate the routes. For more examples, see
      Section 7.1.
   o  LOCAL_ONLY: This is a boolean value. If this is present, then it
      means that this route should not be exported into other RIBs or
      other route tables.
   o  rpf-check-interface: Reverse path forwarding (RPF) check is used
      to prevent spoofing and limit malicious traffic. For IP packets,
      the IP source address is looked up and the rpf-check-interface
      associated with the route for that IP source address is found.
      If the incoming IP packet's interface matches one of the
      rpf-check-interfaces, then the IP packet is forwarded based on its
      IP destination address; otherwise, the IP packet is discarded. For
      MPLS routes, there is no source address to be looked up, so the
      usage is slightly different. For an MPLS route, a packet with the
      specified MPLS label will only be forwarded if it is received on
      one of the interfaces specified by the rpf-check-interface. If no
      rpf-check-interface is specified, then matching packets are not
      subject to this check. This field overrides the
      ENABLE_IP_RPF_CHECK flag on the routing table, and the interfaces
      provided in this list are used for doing the RPF check.
   o  as-path: A route can have an as-path associated with it to
      indicate which set of autonomous systems has to be traversed to
      reach the final destination. The as-path attribute can be used by
      the RIB manager in multiple ways. The RIB manager can choose
      paths with lower as-path length. Or the RIB manager can choose to
      not install paths going via a particular AS. How exactly the RIB
      manager uses the as-path is outside the scope of this document.
      For details of how the as-path is formed, see Section 5.1.2 of
      [RFC4271] and Section 3 of [RFC5065].
   o  route-vendor-attributes: Vendors can specify vendor-specific
      attributes using this. The details of this field are outside the
      scope of this document.

2.4. Nexthop

   A nexthop represents an object or action resulting from a route
   lookup. For example, if a route lookup results in sending the packet
   out a given interface, then the nexthop represents that interface.

   Nexthops can be fully resolved nexthops or unresolved nexthops. A
   resolved nexthop is something that is ready for installation in the
   FIB. For example, a nexthop that points to an interface is resolved.
   An unresolved nexthop is something that requires the RIB manager to
   figure out the final resolved nexthop. For example, a nexthop could
   point to an IP address. The RIB manager has to resolve how to reach
   that IP address: whether it is reachable by regular IP forwarding,
   by an MPLS tunnel, or by both. If the RIB manager cannot resolve the
   nexthop, then the nexthop stays in unresolved state and is NOT a
   candidate for installation in the FIB. Future RIB events can cause a
   nexthop to get resolved (like that IP address being advertised by an
   IGP neighbor).

   The RIB information model allows an external entity to program
   nexthops that may be unresolved initially. Whenever an unresolved
   nexthop gets resolved, the RIB manager will send a notification of
   the same (see Section 5).

   Nexthops can be identified by an identifier to create a level of
   indirection. The identifier is set by the RIB manager and returned
   to the external entity on request. The RIB data-model SHOULD support
   a way to optionally receive a nexthop identifier for a given nexthop.
   For example, one can create a nexthop that points to a BGP peer. The
   returned nexthop identifier can then be used for programming routes
   to point to the same nexthop. Given that the RIB manager has created
   an indirection for that BGP peer using the nexthop identifier, if the
   transport path to the BGP peer changes, that change in path will be
   seamless to the external entity and all routes that point to that BGP
   peer will automatically start going over the new transport path.
   Nexthop indirection using an identifier can be applied not just to
   unicast nexthops, but also to nexthops that contain chains and nested
   nexthops (Section 2.4.1).

2.4.1. Nexthop types

   This document specifies a very generic, extensible and recursive
   grammar for nexthops.
   Nexthops can be
   o  Unicast nexthops - pointing to an interface
   o  Tunnel nexthops - pointing to a tunnel
   o  Replication lists - list of nexthops to which a packet is
      replicated
   o  Weighted lists - for load-balancing
   o  Protection lists - for primary/backup paths
   o  Nexthop chains - for chaining headers, e.g. MPLS label over a GRE
      header
   o  Lists of lists - recursive application of the above
   o  Indirect nexthops - pointing to a nexthop identifier
   o  Special nexthops - for performing specific well-defined functions

   It is expected that all network devices will have a limit on
   recursion and that not all hardware will be able to support all kinds
   of nexthops. RIB capability negotiation becomes very important for
   this reason, and a RIB data-model MUST specify a way for an external
   entity to learn about the network device's capabilities. Examples of
   when and how to use various kinds of nexthops are shown in
   Section 7.2.

   Tunnel nexthops allow an external entity to program static tunnel
   headers. There can be cases where the remote tunnel end-point does
   not support dynamic signaling (e.g. no LDP support on a host) and in
   those cases the external entity might want to program the tunnel
   header on both ends of the tunnel. The tunnel nexthop is kept
   generic, with specifications provided for some commonly used tunnels.
   It is expected that the data-model will model these tunnel types with
   complete accuracy.

   Nexthop chains can be used to specify multiple headers over a packet,
   before a packet is forwarded. One simple example is that of MPLS
   over GRE, wherein the packet has an inner MPLS header followed by a
   GRE header followed by an IP header. The outermost IP header is
   decided by the network device whereas the MPLS header and GRE header
   are specified by the controller.
   Not every network device will be able to support all kinds of
   nexthop chains and an arbitrary number of headers chained together.
   The RIB data-model SHOULD provide a way to expose the nexthop
   chaining capability supported by a given network device.

2.4.2. Nexthop list attributes

   For nexthops that are of the form of a list(s), attributes can be
   associated with each member of the list to indicate the role of an
   individual member of the list. Two kinds of attributes are
   specified:
   o  PROTECTION_PREFERENCE: This provides a primary/backup like
      preference. The preference is an integer value that should be set
      to 1 or 2. Nexthop members with a preference of 1 are preferred
      over those with a preference of 2. The network device SHOULD
      create a list of nexthops with preference 1 (primary) and another
      list of nexthops with preference 2 (backup) and SHOULD pre-program
      the forwarding plane with both the lists. If all the primary
      nexthops fail, then traffic MUST be switched over to members of
      the backup nexthop list. All members in a list MUST either have a
      protection preference specified or all members in a list MUST NOT
      have a protection preference specified.
   o  LOAD_BALANCE_WEIGHT: This is used for load-balancing. Each list
      member MUST be assigned a weight. The weight is a percentage
      number from 1 to 99. The weight determines how much traffic is
      sent over a given list member. If one of the member nexthops in
      the list is not active, then the weight value of that nexthop
      SHOULD be distributed among the other active members. How the
      distribution is done is up to the network device and not in the
      scope of the document. In other words, traffic should always be
      load-balanced even if there is a failure.
      After a failure, the external entity SHOULD re-program the nexthop
      list with updated weights so as to get a deterministic behavior
      among the remaining list members. To perform equal
      load-balancing, one MAY specify a weight of "0" for all the member
      nexthops. The value "0" is reserved for equal load-balancing and,
      if applied, MUST be applied to all member nexthops.

2.4.3. Nexthop content

   At the lowest level, a nexthop can point to one of the following:
   o  identifier: This is an identifier returned by the network device
      representing another nexthop or another nexthop chain.
   o  EGRESS_INTERFACE: This represents a physical, logical or virtual
      interface on the network device.
   o  address: This can be an IP address or MAC address or ISO address.
      *  An optional table name can also be specified to indicate the
         table in which the address is to be looked up further. One can
         use the table name field to direct the packet from one domain
         into another domain. For example, an MPLS packet coming in on
         an interface would be looked up in an MPLS routing table and
         the nexthop for that could indicate that we strip the MPLS
         label and do a subsequent IPv4 lookup in an IPv4 table. By
         default the table will be the same in which the route lookup
         was performed.
      *  An optional egress interface can be specified to indicate which
         interface to send the packet out on. The egress interface is
         useful when the network device contains Ethernet interfaces and
         one needs to perform an ARP lookup for the IP packet.
   o  tunnel encap: This can be an encap representing an IP tunnel or
      MPLS tunnel or others as defined in this document. An optional
      egress interface can be specified to indicate which interface to
      send the packet out on. The egress interface is useful when the
      network device contains Ethernet interfaces and one needs to
      perform an ARP lookup for the IP packet.
   o  logical tunnel: This can be an MPLS LSP or a GRE tunnel (or others
      as defined in this document), that is represented by a unique
      identifier (e.g. a name).
   o  ROUTING_TABLE_NAME: This is a routing table that exists in the
      RIB. A nexthop pointing to a table indicates that the route
      lookup needs to continue in the specified table. This is a way to
      perform chained lookups.

2.4.4. Nexthop attributes

   Certain information is encoded implicitly in the nexthop and does not
   need to be specified by the controller. For example, when an IP
   packet is forwarded out, the IP TTL is decremented by default. The
   same applies for an MPLS packet. Similarly, when an IP packet is
   sent over an Ethernet interface, any ARP processing is handled
   implicitly by the network device and does not need to be programmed
   by an external device.

   A nexthop can have some attributes associated with it. The purpose
   of the attributes is to either override implicit behavior (like that
   related to TTL processing) or to guide the network device to perform
   something specific. Vendor-specific attributes can also be
   specified. The details of vendor-specific attributes are outside the
   scope of this document.

2.4.4.1. Nexthop flags

   Nexthop flags are an optional nexthop attribute used to convey
   specific handling to the hardware. Two common types of operations
   are specified using nexthop flags.
   o  NO_DECREMENT_TTL: This indicates that the IPv4 time-to-live field
      in an IPv4 packet MUST NOT be decremented before the packet is
      forwarded. This may be applied when an IPv4 packet is
      encapsulated in a tunnel (e.g. MPLS) and one wants to hide the
      fact that the packet is going through a tunnel.
   o  NO_PROPAGATE_TTL: This indicates that the IPv4 time-to-live field
      in an IPv4 packet MUST NOT be propagated into an equivalent field,
      when the IPv4 packet is tunneled.
      For example, if the IPv4 packet is tunneled over MPLS, then the
      network device should use the default time-to-live value for the
      outer MPLS header. This field can also be used to indicate that
      when a tunnel terminates, one does not propagate the outer
      header's time-to-live value into the inner header. So, on MPLS
      tunnel termination, one does not propagate the MPLS TTL value into
      the IPv4 header.

   The TTL nexthop flags can be used to simulate a Pipe model for
   tunnels. See [RFC3443] for a detailed understanding of the Pipe
   model and the Uniform model.

2.4.5. Nexthop vendor attributes

   This field has been defined for vendor-specific extensions. The
   contents of this field are beyond the scope of this document.

2.4.6. Special nexthops

   This document specifies certain special nexthops. The purpose of
   each of them is explained below:
   o  DISCARD: This indicates that the network device should drop the
      packet and increment a drop counter.
   o  DISCARD_WITH_ERROR: This indicates that the network device should
      drop the packet, increment a drop counter and send back an
      appropriate error message (like an ICMP error).
   o  RECEIVE: This indicates that the traffic is destined for the
      network device, for example, protocol packets or OAM packets.
      All locally destined traffic SHOULD be throttled to avoid a denial
      of service attack on the router's control plane. An optional
      rate-limiter can be specified to indicate how to throttle traffic
      destined for the control plane. The description of the
      rate-limiter is outside the scope of this document.

3. Reading from the RIB

   A RIB data-model MUST allow an external entity to read entries, for
   RIBs created by that entity. The network device administrator MAY
   allow reading of other RIBs by an external entity through access
   lists on the network device. The details of access lists are outside
   the scope of this document.

   The data-model MUST support a full read of the RIB and subsequent
   incremental reads of changes to the RIB. An external agent SHOULD be
   able to request a full read at any time in the lifecycle of the
   connection. When sending data to an external entity, the RIB manager
   SHOULD try to send all dependencies of an object prior to sending
   that object.

4. Writing to the RIB

   A RIB data-model MUST allow an external entity to write entries for
   RIBs created by that entity. The network device administrator MAY
   allow writes to other RIBs by an external entity through access lists
   on the network device. The details of access lists are outside the
   scope of this document.

   When writing an object to a RIB, the external entity SHOULD try to
   write all dependencies of the object prior to sending that object.
   The data-model MUST support requesting identifiers for nexthops and
   collecting the identifiers back in the response.

   Route programming in the RIB SHOULD result in a return code that
   contains the following attributes:
   o  Installed - Yes/No (Indicates whether the route got installed in
      the FIB)
   o  Active - Yes/No (Indicates whether a route is fully resolved and
      is a candidate for selection)
   o  Reason - e.g. Not authorized

   The data-model MUST specify which objects are modifiable. A
   modifiable object is one whose contents can be changed without
   having to change objects that depend on it and without affecting any
   data forwarding. To change a non-modifiable object, one will need to
   create a new object and delete the old one. For example, routes that
   use a nexthop that is identified by a nexthop-identifier should be
   unaffected when the contents of that nexthop change.

5. Events and Notifications

   Asynchronous notifications are sent by the network device's RIB
   manager to an external entity when some event occurs on the network
   device.
   A RIB data-model MUST support sending asynchronous
   notifications. A brief list of suggested notifications follows:
   o  Route change notification, with return code as specified in
      Section 4
   o  Nexthop resolution status (resolved/unresolved) notification

6. RIB grammar

   This section specifies the RIB information model in Routing
   Backus-Naur Form [RFC5511].

     ::= [ ...]
     ::=
         []
         [] []
         []

     ::= []

     ::= ( ...)

     ::= ( ...)
     ::=
         [ ... ] []
         [ENABLE_IP_RPF_CHECK]
     ::= | |
         |

     ::=
         []
         []

     ::= | | |
         |
     ::=
         []
     ::=
         []
     ::=
     ::= ( )
     ::=

     ::=
     ::=

     ::= [] []
         []
         []

     ::= |
         |
     ::= [] []
     ::= ( ) [ ...]
     ::= | |
         |
     ::= ( ...) []

     ::=

     ::= []
     ::= <>
     ::= <>

     ::= |
         (() |
         ([ ... ] ))

     ::= ( |
         )
         []
     ::= []
         []

     ::= ( ...)
     ::= |
     ::= ( | |
         (
         ([] | [])) |
         ( []) |
         |
         )
         []
         []

     ::= |
     ::= ( ) |
         ( ) |
         ( ) |
         ( )

     ::= | |
         ( [] [])
     ::= <>

     ::=
     ::= | | | |

     ::= ( ) |
         ( ) |
         ( ) |
         ( ) |
         ( ) |
         ( )

     ::=
         [] []

     ::=
         []
         [] []

     ::= ( ...)
     ::= ( []
         [] []) |
         ( [])

     ::= []
     ::= ( | )
         []
     ::= ( | )
         []

     ::= []
         []
     ::= | | |
     ::= [] []
     ::= <>

7. Using the RIB grammar

   The RIB grammar is very generic and covers a variety of features.
   This section provides examples of using objects in the RIB grammar
   and examples of programming certain use cases.

7.1. Using route preference and metric

   Using route preference, one can pre-install protection paths in the
   network. For example, if OSPF has a route preference of 10, then one
   can install a route with a route preference of 20 to the same
   destination. The OSPF route will get precedence and will get
   installed in the FIB.
   When the OSPF route goes away (for any
   reason), the protection path will get installed in the FIB.

   Route preference can also be used to prevent denial of service
   attacks by installing routes with the best preference, which either
   drop the offending traffic or route it to some monitoring/analysis
   station. Since the routes are installed with the best preference,
   they will supersede any route installed by any other protocol.

   Route metric is used to disambiguate between 2 or more routes to the
   same destination with the same preference and in the same route
   table. One usage of this is to install 2 routes, each with a
   different nexthop. The preferred nexthop is given a better metric
   than the other one. This results in traffic being forwarded to the
   preferred nexthop. If the preferred nexthop fails, then the RIB
   manager will automatically install a route to the other nexthop.

7.2. Using different nexthop types

   The RIB grammar allows one to create a variety of nexthops. This
   section describes uses for certain types of nexthops.

7.2.1. Tunnel nexthops

   A tunnel nexthop points to a tunnel of some kind. Traffic that goes
   over the tunnel gets encapsulated with the tunnel encap. Tunnel
   nexthops are useful for abstracting out details of the network, by
   having the traffic seamlessly route between network edges.

7.2.2. Replication lists

   One can create a replication list for replicating traffic to multiple
   destinations. The destinations, in turn, could be complex nexthops
   in themselves - at a level supported by the network device. Point to
   multipoint and broadcast are examples that involve replication.

   A replication list (at the simplest level) can be represented as:

     ::= [ ... ]

   The above can be derived from the grammar as follows:

     ::= [ ...]
     ::= [ ...]
     ::= [ ... ]

7.2.3. Weighted lists

   A weighted list is used to load-balance traffic among a set of
   nexthops. A weighted list is very similar to a replication list,
   with the difference that each member nexthop MUST have a
   LOAD_BALANCE_WEIGHT associated with it.

   A weighted list (at the simplest level) can be represented as:

     ::= ( )
         [( )... ]

   The above can be derived from the grammar as follows:

     ::= [ ...]
     ::= ( )
         [(
         ) ...]
     ::= ( )
         [( ) ... ]
     ::= ( )
         [( )... ]

7.2.4. Protection lists

   Protection lists are similar to weighted lists. A protection list
   specifies a set of primary nexthops and a set of backup nexthops.
   The attribute indicates which nexthop is
   primary and which is backup.

   A protection list can be represented as:

     ::= ( )
         [( )... ]

   A protection list can also be a weighted list. In other words,
   traffic can be load-balanced among the primary nexthops of a
   protection list. In such a case, the list will look like:

     ::= (
         )
         [(
         )... ]

7.2.5. Nexthop chains

   A nexthop chain is a nexthop that puts one or more headers on an
   outgoing packet. One example is a Pseudowire, which is MPLS over
   some transport (MPLS or GRE for instance). Another example is VxLAN
   over IP. A nexthop chain allows an external entity to break up the
   programming of the nexthop into independent pieces, one per
   encapsulation.

   A simple example of MPLS over GRE can be represented as:

     ::= ( ) ( )

   The above can be derived from the grammar as follows:

     ::= [ ...]
     ::=
     ::= [ ... ]
     ::= ( [ ...])
     ::= ()
     ::= ( ) ( )

7.2.6. Lists of lists

   Lists of lists is a complex construct. One example of usage of such
   a construct is to replicate traffic to multiple destinations, with
   high availability.
   In other words, for each destination you have a
   primary and backup nexthop (replication list) to ensure there is no
   traffic drop in case of a failure. So the outer list is a list of
   destinations and the inner lists are replication lists of
   primary/backup nexthops.

7.3. Performing multicast

   IP multicast involves matching a packet on (S, G) or (*, G), where
   both S (source) and G (group) are IP prefixes. Following the match,
   the packet is replicated to one or more recipients. How the
   recipients subscribe to the multicast group is outside the scope of
   this document.

   In PIM-based multicast, the packets are IP forwarded on an IP
   multicast tree. The downstream nodes at each point in the multicast
   tree are one or more IP addresses. These can be represented as a
   replication list (Section 7.2.2).

   In MPLS-based multicast, the packets are forwarded on a point to
   multipoint (P2MP) label-switched path (LSP). The nexthop for a P2MP
   LSP can be represented in the nexthop grammar as a
   (P2MP LSP identifier) or a replication list (Section 7.2.2) of
   , with each tunnel encap representing a single MPLS
   downstream nexthop.

7.4. Solving optimized exit control

   In case of optimized exit control, a controller wants to control the
   edge device (and optionally control the outgoing interface on that
   edge device) that is used by a server to send traffic out. This can
   be easily achieved by having the controller program the edge router
   (e.g. 192.0.2.10) and the server along the following lines:

   Server:
     ::= (
         )
     ::= <198.51.100.1/16>
         ( )
         ( )

     ::= <198.51.100.1/16>
         ( <100>)
         ( <192.0.2.10> )

   Edge Router:
     ::=
     ::= ( <100>)

   In the above case, the label 100 identifies the egress interface
   on the edge router.

8. RIB operations at scale

   This section discusses the scale requirements for a RIB data-model.
   The RIB data-model should be able to handle a large scale of
   operations, to enable deployment of RIB applications in large
   networks.

8.1. RIB reads

   Bulking (grouping of multiple objects in a single message) MUST be
   supported when a network device sends RIB data to an external entity.

8.2. RIB writes

   Bulking (grouping of multiple write operations in a single message)
   MUST be supported when an external entity wants to write to the RIB.
   The response from the network device MUST include a return-code for
   each write operation in the bulk message.

8.3. RIB events and notifications

   There can be cases where a single network event results in multiple
   events and/or notifications from the network device to an external
   entity. On the other hand, because multiple things can happen at the
   same time, a network device might have to send
   multiple events and/or notifications to an external entity. The
   network device originated event/notification message MUST support
   bulking of multiple events and notifications in a single message.

9. Security Considerations

   All interactions between a RIB manager and an external entity MUST be
   authenticated. The RIB manager MUST protect itself against a denial
   of service attack by a rogue external entity, by throttling request
   processing. A RIB manager MUST enforce limits on how much data can
   be programmed by an external entity and return an error when such a
   limit is reached.

   The RIB manager MUST expose a data-model that it implements. An
   external agent MUST send requests to the RIB manager that comply with
   the supported data-model. The data-model MUST specify the behavior
   of the RIB manager on handling of unsupported data requests.

10. IANA Considerations

   This document does not generate any considerations for IANA.

11. Acknowledgements

   The authors would like to thank Alia Atlas, Edward Crabbe, Hariharan
   Ananthakrishnan, Jeff Haas and Ina Minei for their comments and
   suggestions on this draft. The following people contributed to the
   design of the RIB model as part of the I2RS Interim meeting in April
   2013: Wes George, Chris Liljenstolpe, Jeff Tantsura, Sriganesh Kini,
   Susan Hares, Fabian Schneider and Nitin Bahadur.

12. References

12.1. Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

12.2. Informative References

   [I-D.atlas-i2rs-problem-statement]
              Atlas, A., Nadeau, T., and D. Ward, "Interface to the
              Routing System Problem Statement",
              draft-atlas-i2rs-problem-statement-01 (work in progress),
              July 2013.

   [I-D.hares-i2rs-use-case-vn-vc]
              Hares, S., "Use Cases for Virtual Connections on Demand
              (VCoD) and Virtual Network on Demand using Interface to
              Routing System", draft-hares-i2rs-use-case-vn-vc-00 (work
              in progress), February 2013.

   [I-D.white-i2rs-use-case]
              White, R., Hares, S., and R. Fernando, "Use Cases for an
              Interface to the Routing System",
              draft-white-i2rs-use-case-00 (work in progress),
              February 2013.

   [RFC3443]  Agarwal, P. and B. Akyol, "Time To Live (TTL) Processing
              in Multi-Protocol Label Switching (MPLS) Networks",
              RFC 3443, January 2003.

   [RFC4271]  Rekhter, Y., Li, T., and S. Hares, "A Border Gateway
              Protocol 4 (BGP-4)", RFC 4271, January 2006.

   [RFC4915]  Psenak, P., Mirtorabi, S., Roy, A., Nguyen, L., and P.
              Pillay-Esnault, "Multi-Topology (MT) Routing in OSPF",
              RFC 4915, June 2007.

   [RFC5065]  Traina, P., McPherson, D., and J. Scudder, "Autonomous
              System Confederations for BGP", RFC 5065, August 2007.

   [RFC5120]  Przygienda, T., Shen, N., and N. Sheth, "M-ISIS: Multi
              Topology (MT) Routing in Intermediate System to
              Intermediate Systems (IS-ISs)", RFC 5120, February 2008.

   [RFC5511]  Farrel, A., "Routing Backus-Naur Form (RBNF): A Syntax
              Used to Form Encoding Rules in Various Routing Protocol
              Specifications", RFC 5511, April 2009.

Authors' Addresses

   Nitin Bahadur (editor)
   Juniper Networks, Inc.
   1194 N. Mathilda Avenue
   Sunnyvale, CA 94089
   US

   Phone: +1 408 745 2000
   Email: nitinb@juniper.net
   URI:   www.juniper.net

   Ron Folkes (editor)
   Juniper Networks, Inc.
   1194 N. Mathilda Avenue
   Sunnyvale, CA 94089
   US

   Phone: +1 408 745 2000
   Email: ronf@juniper.net
   URI:   www.juniper.net

   Sriganesh Kini
   Ericsson

   Email: sriganesh.kini@ericsson.com

   Jan Medved
   Cisco

   Email: jmedved@cisco.com