Network Working Group                                            R. Bush
Internet-Draft                        Arrcus & Internet Initiative Japan
Intended status: Standards Track                              R. Austein
Expires: November 8, 2020                                       K. Patel
                                                                  Arrcus
                                                             May 7, 2020


                     Layer 3 Discovery and Liveness
                        draft-ietf-lsvr-l3dl-04

Abstract

   In Massive Data Centers, BGP-SPF and similar routing protocols are
   used to build topology and reachability databases. These protocols
   need to discover IP Layer 3 attributes of links, such as neighbor IP
   addressing, logical link IP encapsulation abilities, and link
   liveness. This Layer 3 Discovery and Liveness protocol collects
   these data, which may then be disseminated using BGP-SPF and similar
   protocols.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on November 8, 2020.

Copyright Notice

   Copyright (c) 2020 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document. Code Components
   extracted from this document must include Simplified BSD License text
   as described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  Background
   4.  Top Level Overview
   5.  Inter-Link Protocol Overview
     5.1.  L3DL Ladder Diagram
   6.  Transport Layer
   7.  The Checksum
   8.  TLV PDUs
   9.  Logical Link Endpoint Identifier
   10. HELLO
   11. OPEN
   12. ACK
     12.1.  Retransmission
   13. The Encapsulations
     13.1.  The Encapsulation PDU Skeleton
     13.2.  Encapsulation Flags
     13.3.  IPv4 Encapsulation
     13.4.  IPv6 Encapsulation
     13.5.  MPLS Label List
     13.6.  MPLS IPv4 Encapsulation
     13.7.  MPLS IPv6 Encapsulation
   14. VENDOR - Vendor Extensions
   15. KEEPALIVE - Layer 2 Liveness
   16. Layers 2.5 and 3 Liveness
   17. The North/South Protocol
     17.1.  Use BGP-LS as Much as Possible
     17.2.  Extensions to BGP-LS
   18. Discussion
     18.1.  HELLO Discussion
     18.2.  HELLO versus KEEPALIVE
   19. VLANs/SVIs/Sub-interfaces
   20. Implementation Considerations
   21. Security Considerations
   22. IANA Considerations
     22.1.  PDU Types
     22.2.  Signature Type
     22.3.  Flag Bits
     22.4.  Error Codes
   23. IEEE Considerations
   24. Acknowledgments
   25. References
     25.1.  Normative References
     25.2.  Informative References
   Authors' Addresses

1.  Introduction

   The Massive Data Center (MDC) environment presents unusual problems
   of scale, e.g. O(10,000) forwarding devices, while its homogeneity
   presents opportunities for simple approaches.
   Approaches such as Jupiter Rising [JUPITER] use a central controller
   to deal with scaling, while BGP-SPF [I-D.ietf-lsvr-bgp-spf] provides
   massive scale-out without centralization using a tried and tested
   scalable distributed control plane, offering a scalable routing
   solution in Clos [Clos0][Clos1] and similar environments. But
   BGP-SPF and similar higher level device-spanning protocols, e.g.
   [I-D.malhotra-bess-evpn-lsoe], need logical link state and addressing
   data from the network to build the routing topology. They also need
   prompt but prudent reaction to (logical) link failure.

   Layer 3 Discovery and Liveness (L3DL) provides brutally simple
   mechanisms for devices to

   o  Discover each other's unique endpoint identification,

   o  Discover mutually supported layer 3 encapsulations, e.g. IP/MPLS,

   o  Discover Layer 3 IP and/or MPLS addressing of interfaces of the
      encapsulations,

   o  Present these data, using a very restricted profile of a BGP-LS
      [RFC7752] API, to BGP-SPF which computes the topology and builds
      routing and forwarding tables,

   o  Enable Layer 3 link liveness such as BFD,

   o  Provide Layer 2 keep-alive messages for session continuity, and
      finally

   o  Provide for authenticity verification of protocol messages.

   In this document, the use case for L3DL is for point to point links
   in a datacenter Clos in order to exchange the data needed for BGP-SPF
   [I-D.ietf-lsvr-bgp-spf] bootstrap and continuity. Once layer two
   connectivity has been leveraged to get layer three addressability and
   forwarding capabilities, normal layer three forwarding and routing
   can take over.

   L3DL might be found to be more widely applicable to a range of
   routing and similar protocols which need layer three discovery and
   characterisation.

2.  Terminology

   Even though it concentrates on the inter-device layer, this document
   relies heavily on routing terminology.
   The following attempts to clarify the use of some possibly confusing
   terms:

   ASN:       Autonomous System Number [RFC4271], a BGP identifier for
              an originator of Layer 3 routes, particularly BGP
              announcements.
   BGP-LS:    A mechanism by which link-state and TE information can be
              collected from networks and shared with external
              components using the BGP routing protocol. See [RFC7752].
   BGP-SPF:   A hybrid protocol using BGP transport but a Dijkstra
              Shortest Path First decision process. See
              [I-D.ietf-lsvr-bgp-spf].
   Clos:      A hierarchic subset of a crossbar switch topology commonly
              used in data centers.
   Datagram:  The L3DL content of a single Layer 2 frame, sans Ethernet
              framing. A full L3DL PDU may be packaged in multiple
              Datagrams.
   Encapsulation:  Address Family Indicator and Subsequent Address
              Family Indicator (AFI/SAFI). I.e. classes of layer 2.5
              and 3 addresses such as IPv4, IPv6, MPLS, etc.
   Frame:     A Layer 2 Ethernet packet.
   Link or Logical Link:  A logical connection between two logical ports
              on two devices. E.g. two VLANs between the same two ports
              are two links.
   LLEI:      Logical Link Endpoint Identifier, the unique identifier of
              one end of a logical link, see Section 9.
   MAC Address:  48-bit Layer 2 addresses are assumed since they are
              used by all widely deployed Layer 2 network technologies
              of interest, especially Ethernet. See [IEEE.802_2001].
   MDC:       Massive Data Center, commonly composed of thousands of Top
              of Rack Switches (TORs).
   MTU:       Maximum Transmission Unit, the size in octets of the
              largest packet that can be sent on a medium, see [RFC1122]
              Section 1.3.3.
   PDU:       Protocol Data Unit, an L3DL application layer message. A
              PDU's content may need to be broken into multiple
              Datagrams to make it through MTU or other restrictions.
   RouterID:  A 32-bit identifier unique in the current routing domain,
              see [RFC6286].
   Session:   A session, established via OPEN PDUs, between two L3DL
              capable link end-points.
   SPF:       Shortest Path First, an algorithm for finding the shortest
              paths between nodes in a graph; AKA Dijkstra's algorithm.
   System Identifier:  An eight octet ISO System Identifier a la
              [RFC1629] System ID.
   TOR:       Top Of Rack switch, aggregates the servers in a rack and
              connects to aggregation layers of the Clos tree, AKA the
              Clos spine.
   ZTP:       Zero Touch Provisioning gives devices initial addresses,
              credentials, etc. on boot/restart.

3.  Background

   L3DL is primarily designed for a Clos type datacenter scale and
   topology, but can accommodate richer topologies which contain
   potential cycles.

   While L3DL is designed for the MDC, there are no inherent reasons it
   could not run on a WAN. The authentication and authorization needed
   to run safely on a WAN need to be considered, and the appropriate
   level of security options chosen.

   L3DL assumes a new IEEE assigned EtherType (TBD).

   The number of addresses of one Encapsulation type on an interface
   link may be quite large given a TOR with tens of servers, each server
   having a few hundred micro-services, resulting in an inordinate
   number of addresses. And highly automated micro-service migration
   can cause serious address prefix disaggregation, resulting in
   interfaces with thousands of disaggregated prefixes.

   Therefore the L3DL protocol is session oriented and uses incremental
   announcement and withdrawal with session restart, a la BGP
   ([RFC4271]).

4.  Top Level Overview

   o  Devices discover each other on logical links

   o  Logical Link Endpoint Identifiers (LLEIs) are exchanged

   o  Layer 2 Liveness checks may be started

   o  Encapsulation data are exchanged and IP-Level Liveness checks
      enabled

   o  A BGP-like upper layer protocol is assumed to use the identifiers
      and encapsulation data to discover and build a topology database

   +-------------------+   +-------------------+   +-------------------+
   |      Device       |   |      Device       |   |      Device       |
   |                   |   |                   |   |                   |
   |+-----------------+|   |+-----------------+|   |+-----------------+|
   ||                 ||   ||                 ||   ||                 ||
   ||     BGP-SPF     <+---+>     BGP-SPF     <+---+>     BGP-SPF     ||
   ||                 ||   ||                 ||   ||                 ||
   |+--------^--------+|   |+--------^--------+|   |+--------^--------+|
   |         |         |   |         |         |   |         |         |
   |         |         |   |         |         |   |         |         |
   |+--------+--------+|   |+--------+--------+|   |+--------+--------+|
   || Encapsulations  ||   || Encapsulations  ||   || Encapsulations  ||
   ||   Addresses     ||   ||   Addresses     ||   ||   Addresses     ||
   ||  L2 Liveness    ||   ||  L2 Liveness    ||   ||  L2 Liveness    ||
   |+--------^--------+|   |+--------^--------+|   |+--------^--------+|
   |         |         |   |         |         |   |         |         |
   |         |         |   |         |         |   |         |         |
   |+--------v--------+|   |+--------v--------+|   |+--------v--------+|
   ||                 ||   ||                 ||   ||                 ||
   ||Inter-Device PDUs<+---+>Inter-Device PDUs<+---+>Inter-Device PDUs||
   ||                 ||   ||                 ||   ||                 ||
   |+-----------------+|   |+-----------------+|   |+-----------------+|
   +-------------------+   +-------------------+   +-------------------+

   There are two protocols, the inter-device (left-right in the diagram)
   per-link layer 3 discovery and the API to the upper level BGP-like
   routing protocol (up-down in the above diagram):

   o  Inter-device PDUs are used to exchange device and logical link
      identities and layer 2.5 (MPLS) and 3 identifiers (not payloads),
      e.g. device IDs, port identities, VLAN IDs, Encapsulations, and IP
      addresses.

   o  A Link Layer to BGP API presents these data up the stack to a BGP
      protocol or another device-spanning upper layer protocol,
      presenting them using the BGP-LS BGP-like data format.

   The upper layer BGP family routing protocols cross all the devices,
   though they are not part of these L3DL protocols.

   To simplify this document, Layer 2 framing is not shown. L3DL is
   about layer 3.

5.  Inter-Link Protocol Overview

   Two devices discover each other and their respective identities by
   sending multicast HELLO PDUs (Section 10). To assure discovery of
   new devices coming up on a multi-link topology, devices on such a
   topology, and only on a multi-link topology, send periodic HELLOs
   forever, see Section 18.1.

   Once a new device is recognized, both devices attempt to negotiate
   and establish a session by sending unicast OPEN PDUs (Section 11) to
   the source MAC addresses (plus VIDs if VLANs) of the received HELLOs.
   Once a session is established through the OPEN exchange, the
   Encapsulations (Section 13) configured on an end point may be
   announced and modified. Note that these are only the encapsulations
   and addresses configured on the announcing interface; though a
   device's loopback and overlay interface(s) may also be announced.
   When two devices on a link have compatible Encapsulations and
   addresses, i.e. the same AFI/SAFI and the same subnet, the link is
   announced via the BGP-LS API.

5.1.  L3DL Ladder Diagram

   The HELLO, Section 10, is a priming message sent on all configured
   logical links. It is a small L3DL PDU encapsulated in an Ethernet
   multicast frame with the simple goal of discovering the identities of
   logical link endpoint(s) reachable from a Logical Link Endpoint,
   Section 9.
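
   As a rough illustration of the compatibility test at the end of
   Section 5 (same AFI/SAFI and same subnet), the following
   non-normative sketch compares two IPv4 interface addresses. The
   function names, the host-byte-order address representation, and the
   use of a prefix length per address are assumptions made for this
   example only; the Encapsulation PDUs themselves are defined in
   Section 13.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of L3DL: mask for a prefix length.
   The caller guarantees prefix_len <= 32. */
static uint32_t ipv4_mask(unsigned prefix_len)
{
    /* A shift by 32 is undefined in C, so /0 is handled separately. */
    return prefix_len == 0
        ? 0u
        : (uint32_t)(0xFFFFFFFFu << (32u - prefix_len));
}

/* True if both addresses (host byte order) carry the same prefix
   length and fall within the same subnet. */
bool ipv4_same_subnet(uint32_t a, unsigned a_len,
                      uint32_t b, unsigned b_len)
{
    if (a_len != b_len || a_len > 32)
        return false;
    uint32_t mask = ipv4_mask(a_len);
    return (a & mask) == (b & mask);
}
```

   For example, 192.0.2.1/24 and 192.0.2.254/24 would be treated as
   compatible, while 192.0.2.1/24 and 198.51.100.1/24 would not.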

   The HELLO and OPEN, Section 11, PDUs, which are used to discover and
   exchange detailed Logical Link Endpoint Identifiers, LLEIs, and the
   ACK/ERROR PDU, are mandatory; other PDUs are optional; though at
   least one encapsulation SHOULD be agreed at some point.

   The following is a ladder-style diagram of the L3DL protocol
   exchanges:

       |            HELLO            | Logical Link Peer discovery
       |---------------------------->|
       |            HELLO            | Mandatory
       |<----------------------------|
       |                             |
       |                             |
       |            OPEN             | MACs, IDs, etc.
       |---------------------------->|
       |             ACK             |
       |<----------------------------|
       |                             |
       |            OPEN             | Mandatory
       |<----------------------------|
       |             ACK             |
       |---------------------------->|
       |                             |
       |                             |
       |  Interface IPv4 Addresses   | Interface IPv4 Addresses
       |---------------------------->| Optional
       |             ACK             |
       |<----------------------------|
       |                             |
       |  Interface IPv4 Addresses   |
       |<----------------------------|
       |             ACK             |
       |---------------------------->|
       |                             |
       |                             |
       |  Interface IPv6 Addresses   | Interface IPv6 Addresses
       |---------------------------->| Optional
       |             ACK             |
       |<----------------------------|
       |                             |
       |  Interface IPv6 Addresses   |
       |<----------------------------|
       |             ACK             |
       |---------------------------->|
       |                             |
       |                             |
       |   Interface MPLSv4 Labels   | Interface MPLSv4 Labels
       |---------------------------->| Optional
       |             ACK             |
       |<----------------------------|
       |                             |
       |   Interface MPLSv4 Labels   | Interface MPLSv4 Labels
       |<----------------------------| Optional
       |             ACK             |
       |---------------------------->|
       |                             |
       |                             |
       |   Interface MPLSv6 Labels   | Interface MPLSv6 Labels
       |---------------------------->| Optional
       |             ACK             |
       |<----------------------------|
       |                             |
       |   Interface MPLSv6 Labels   | Interface MPLSv6 Labels
       |<----------------------------| Optional
       |             ACK             |
       |---------------------------->|
       |                             |
       |                             |
       |       L3DL KEEPALIVE        | Layer 2 Liveness
       |---------------------------->| Optional
       |       L3DL KEEPALIVE        |
       |<----------------------------|

6.  Transport Layer

   L3DL PDUs are carried by a simple transport layer which allows long
   PDUs to occupy many Ethernet frames. The L3DL content of a single
   Ethernet frame, exclusive of Ethernet framing data, is referred to as
   a Datagram.

   The L3DL Transport Layer encapsulates each Datagram using a common
   transport header.

   If a PDU does not fit in a single Datagram, it is broken into
   multiple Datagrams and reassembled by the receiver a la [RFC0791]
   Section 2.3 Fragmentation.

   This is not classic 'fragmentation', but rather decomposition at the
   origin to allow PDU payloads larger than the frame allows. There are
   no intermediate devices capable of further fragmentation or
   reassembly.

   L3DL is carrying relatively small amounts of data on relatively high
   bandwidth links, and at a time when the link is not active with other
   data as it does not yet have layer three connectivity. So congestion
   is not considered a sufficiently significant risk to warrant
   additional complexity.

   Should a PDU need to be retransmitted, it MUST be sent as the
   identical Datagram set as the original transmission. The
   Transmission Sequence Number informs the receiver that it is the same
   PDU.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   Version   | Transmission Sequence Number  |L|               ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ~        Datagram Number        |        Datagram Length        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           Checksum                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          Payload...                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The fields of the L3DL Transport Header are as follows:

   Version: Seven-bit Version number of the protocol, currently 0.
      Values other than 0 MUST be treated as an error. The protocol
      version needs to be in one and only one place, so it is in the
      Datagram as opposed to, for example, the PDU header.

   L: A bit that is set to one if this Datagram is the last Datagram of
      the PDU. For a PDU which fits in only one Datagram, it is set to
      one. Note that this is the inverse of the marking technique used
      by [RFC0791].

   Transmission Sequence Number: A 16-bit strictly increasing unsigned
      integer identifying this PDU, possibly across retransmissions,
      that wraps from 2^16-1 to 0. The initial value is arbitrary. See
      [RFC1982] on DNS Serial Number Arithmetic for too much detail on
      comparing and incrementing a wrapping sequence number.

   Datagram Number: A monotonically increasing 24-bit value which
      starts at zero for each PDU. This is used to reassemble frames
      into PDUs a la [RFC0791] Section 2.3. Note that this limits an
      L3DL PDU to 2^24 frames.

   Datagram Length: Total number of octets in the Datagram including
      all payloads and fields. Note that this limits a Datagram to 2^16
      octets; though Ethernet framing is likely to impose a smaller
      limit.

   Checksum: A 32 bit hash over the Datagram to detect bit flips, see
      Section 7.

      If a Datagram fails checksum verification, the Datagram is invalid
      and should be silently discarded. The sender will retransmit the
      PDU, and the receiver can assemble it.

   Payload: The PDU being transported or a fragment thereof.

   To avoid the need for a receiver to reassemble two PDUs at the same
   time, a sender MUST NOT send a subsequent PDU when a PDU is already
   in flight and not yet acknowledged; assuming it is an ACKed PDU Type.

7.  The Checksum

   There is a reason conservative folk use a checksum in UDP. And as
   many operators stretch to jumbo frames (over 1,500 octets) longer
   checksums are the prudent approach.

   For the purpose of computing a checksum, the checksum field itself is
   assumed to be zero.

   The following code describes a suggested algorithm. This
   specification avoids mandatory to implement, algorithm agility, etc.
   What matters is that the same algorithm is used consistently in any
   deployment.

   Sum up 32-bit unsigned ints in a 64-bit long, then take the
   high-order section, shift it right, rotate, add it in, repeat until
   zero.

   <CODE BEGINS>
   #include <stddef.h>
   #include <stdint.h>

   /* The F table from Skipjack, and it would work for the S-Box. */
   static const uint8_t sbox[256] = {
     0xa3,0xd7,0x09,0x83,0xf8,0x48,0xf6,0xf4,0xb3,0x21,0x15,0x78,
     0x99,0xb1,0xaf,0xf9,0xe7,0x2d,0x4d,0x8a,0xce,0x4c,0xca,0x2e,
     0x52,0x95,0xd9,0x1e,0x4e,0x38,0x44,0x28,0x0a,0xdf,0x02,0xa0,
     0x17,0xf1,0x60,0x68,0x12,0xb7,0x7a,0xc3,0xe9,0xfa,0x3d,0x53,
     0x96,0x84,0x6b,0xba,0xf2,0x63,0x9a,0x19,0x7c,0xae,0xe5,0xf5,
     0xf7,0x16,0x6a,0xa2,0x39,0xb6,0x7b,0x0f,0xc1,0x93,0x81,0x1b,
     0xee,0xb4,0x1a,0xea,0xd0,0x91,0x2f,0xb8,0x55,0xb9,0xda,0x85,
     0x3f,0x41,0xbf,0xe0,0x5a,0x58,0x80,0x5f,0x66,0x0b,0xd8,0x90,
     0x35,0xd5,0xc0,0xa7,0x33,0x06,0x65,0x69,0x45,0x00,0x94,0x56,
     0x6d,0x98,0x9b,0x76,0x97,0xfc,0xb2,0xc2,0xb0,0xfe,0xdb,0x20,
     0xe1,0xeb,0xd6,0xe4,0xdd,0x47,0x4a,0x1d,0x42,0xed,0x9e,0x6e,
     0x49,0x3c,0xcd,0x43,0x27,0xd2,0x07,0xd4,0xde,0xc7,0x67,0x18,
     0x89,0xcb,0x30,0x1f,0x8d,0xc6,0x8f,0xaa,0xc8,0x74,0xdc,0xc9,
     0x5d,0x5c,0x31,0xa4,0x70,0x88,0x61,0x2c,0x9f,0x0d,0x2b,0x87,
     0x50,0x82,0x54,0x64,0x26,0x7d,0x03,0x40,0x34,0x4b,0x1c,0x73,
     0xd1,0xc4,0xfd,0x3b,0xcc,0xfb,0x7f,0xab,0xe6,0x3e,0x5b,0xa5,
     0xad,0x04,0x23,0x9c,0x14,0x51,0x22,0xf0,0x29,0x79,0x71,0x7e,
     0xff,0x8c,0x0e,0xe2,0x0c,0xef,0xbc,0x72,0x75,0x6f,0x37,0xa1,
     0xec,0xd3,0x8e,0x62,0x8b,0x86,0x10,0xe8,0x08,0x77,0x11,0xbe,
     0x92,0x4f,0x24,0xc5,0x32,0x36,0x9d,0xcf,0xf3,0xa6,0xbb,0xac,
     0x5e,0x6c,0xa9,0x13,0x57,0x25,0xb5,0xe3,0xbd,0xa8,0x3a,0x01,
     0x05,0x59,0x2a,0x46
   };

   /* non-normative example C code, constant time even */

   uint32_t sbox_checksum_32(const uint8_t *b, const size_t n)
   {
     /* Four running sums, one per byte position modulo 4. */
     uint32_t sum[4] = {0, 0, 0, 0};
     uint64_t result = 0;
     for (size_t i = 0; i < n; i++)
       sum[i & 3] += sbox[*b++];
     /* Combine the sums, offsetting each by one octet. */
     for (size_t i = 0; i < sizeof(sum)/sizeof(*sum); i++)
       result = (result << 8) + sum[i];
     /* Fold the high 32 bits back in, twice to absorb any carry. */
     result = (result >> 32) + (result & 0xFFFFFFFFU);
     result = (result >> 32) + (result & 0xFFFFFFFFU);
     return (uint32_t) result;
   }
   <CODE ENDS>

8.  TLV PDUs

   The basic L3DL application layer PDU is a typical TLV (Type Length
   Value) PDU. It includes a signature to provide optional integrity
   and authentication. It may be broken into multiple Datagrams, see
   Section 6.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   PDU Type    |                Payload Length                 ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ~               |                  Payload ...                  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   Sig Type    |       Signature Length        |               ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+               +
   ~                           Signature                           ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The fields of the basic L3DL header are as follows:

   PDU Type: An integer differentiating PDU payload types. See
      Section 22.1.

   Payload Length: Total number of octets in the Payload field.

   Payload: The application layer content of the L3DL PDU.

   Sig Type: The type of the Signature, see Section 22.2. Type 0, a
      null signature, is defined in this document.

      Sig Type 0 indicates a null Signature.
      For a trivial PDU such as KEEPALIVE, the underlying Datagram
      checksum may be sufficient for integrity, though it lacks
      authenticity.

      Other Sig Types may be defined in other documents, cf.
      [I-D.ymbk-lsvr-l3dl-signing].

   Signature Length: The length of the Signature, possibly including
      padding, in octets. If Sig Type is 0, Signature Length MUST be 0.

   Signature: The result of running the signature algorithm specified
      in Sig Type over all octets of the PDU except for the Signature
      itself.

9.  Logical Link Endpoint Identifier

   L3DL discovers neighbors on logical links and establishes sessions
   between the two ends of all consenting discovered logical links. A
   logical link is described by a pair of Logical Link Endpoint
   Identifiers, LLEIs.

   An LLEI is a variable length descriptor which could be an ASN, a
   classic RouterID, a catenation of the two, an eight octet ISO System
   Identifier [RFC1629], or any other identifier unique to a single
   logical link endpoint in the topology.

   An L3DL deployment will choose and define an LLEI which suits its
   needs, simple or complex. Examples of two extremes follow:

   A simplistic view of a link between two devices is two ports,
   identified by unique MAC addresses, carrying a layer 3 protocol
   conversation. In this case, the MAC addresses might suffice for the
   LLEIs.

   Unfortunately, things can get more complex. Multiple VLANs can run
   between those two MAC addresses. In practice, many real devices use
   the same MAC address on multiple ports and/or sub-interfaces.

   Therefore, in the general circumstance, a fully described LLEI might
   be as follows:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +                       System Identifier                       +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                            ifIndex                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   System Identifier, a la [RFC1629], is an eight octet identifier
   unique in the entire operational space. Routers and switches usually
   have internal MAC Addresses which can be padded with high order zeros
   and used if no System ID exists on the device. If no unique
   identifier is burned into a device, the local L3DL configuration
   SHOULD create and assign a unique one, likely by configuration.

   ifIndex is the SNMP identifier of the (sub-)interface, see [RFC1213].
   This uniquely identifies the port.

   For a layer 3 tagged sub-interface or a VLAN/SVI interface, ifIndex
   is that of the logical sub-interface, so no further disambiguation is
   needed.

   L3DL PDUs learned over VLAN-ports may be interpreted by upper layer-3
   routing protocols as being learned on the corresponding layer-3 SVI
   interface for the VLAN.

   LLEIs are big-endian.

10.  HELLO

   The HELLO PDU is unique in that it is encapsulated in a multicast
   Ethernet frame. It solicits response(s) from other LLEI(s) on the
   link. See Section 18.1 for why multicast is used. The destination
   multicast MAC Address to be used MUST be one of the following, see
   Clause 9.2.2 of [IEEE802-2014]:

   01-80-C2-00-00-0E: Nearest Bridge = Propagation constrained to a
      single physical link; stopped by all types of bridges (including
      MPRs (media converters)). This SHOULD be used when the link is
      known to be a simple point to point link.

   To Be Assigned: When a switch receives a frame with a multicast
      destination MAC it does not recognize, it forwards to all ports.
      This destination MAC is to be sent when the interface is known to
      be connected to a switch. See Section 23. This SHOULD be used
      when the link may be a multi-point link.

   All other L3DL PDUs are encapsulated in unicast frames, as the peer's
   destination MAC address is known after the HELLO exchange.

   When an interface is turned up on a device, it SHOULD issue a HELLO
   if it is to participate in L3DL sessions.

   If a constrained Nearest Bridge destination address has been
   configured for a point-to-point interface, see above, then the HELLO
   SHOULD NOT be repeated once a session has been created by an exchange
   of OPENs.

   If the configured destination address is one that is propagated by
   switches, the HELLO SHOULD be repeated at a configured interval, with
   a default of 60 seconds. This allows discovery by new devices which
   come up on the layer-2 mesh. In this multi-link scenario, the
   operator should be aware of the trade-off between timer tuning and
   network noise and adjust the inter-HELLO timer accordingly.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | PDU Type = 0  |              Payload Length = 0               ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ~               | Sig Type = 0  |     Signature Length = 0      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   If more than one device responds, one adjacency is formed for each
   unique source LLEI response. L3DL treats each adjacency as a
   separate logical link.

   When a HELLO is received from a source MAC address (plus VID if VLAN)
   with which there is no established L3DL session, the receiver SHOULD
   respond by sending an OPEN PDU to the source MAC address (plus VID).
695 The two devices establish an L3DL session by exchanging OPEN PDUs. 697 To ameliorate possible load spikes during bootstrap or event 698 recovery, there SHOULD be a jittered delay between receipt of a HELLO 699 and issue of the OPEN. The default delay range SHOULD BE zero to 700 five seconds, and MUST be configurable. 702 If a HELLO is received from a MAC address with which there is an 703 established session, the HELLO should be dropped. 705 The Payload Length is zero as there is no payload. 707 HELLO PDUs can not be signed as keying material has yet to be 708 exchanged. Hence the signature MUST always be the null type. 710 11. OPEN 712 Each device has learned the other's MAC Address from the HELLO 713 exchange, see Section 10. Therefore the OPEN and all subsequent PDUs 714 MUST BE unicast, as opposed to the HELLO's multicast frame. 716 0 1 2 3 717 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 718 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 719 | PDU Type = 1 | Payload Length ~ 720 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 721 ~ | Nonce ~ 722 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 723 ~ | LLEI Length | My LLEI | 724 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-~ 725 ~ | AttrCount | ~ 726 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 727 ~ Attribute List ... | Auth Type | Key Length ~ 728 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 729 ~ | Key ... | 730 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 731 | Serial Number | 732 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 733 | Sig Type | Signature Length | Signature ... | 734 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 736 The Payload Length is the number of octets in all fields of the PDU 737 from the Nonce through the Serial Number, not including the three 738 final signature fields. 
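   As an illustration of the HELLO rules in Section 10, the following
   Python sketch encodes the fixed HELLO PDU and applies the receive
   rules. The field widths (8-bit PDU Type, 32-bit Payload Length,
   8-bit Sig Type, 16-bit Signature Length) are read off the HELLO
   diagram and should be treated as an assumption. The function names
   and the session-table representation are hypothetical and are not
   part of L3DL.

```python
import random
import struct

PDU_HELLO = 0   # PDU Type of HELLO (Section 22.1)
SIG_NULL = 0    # Null signature type (Section 22.2)


def encode_hello() -> bytes:
    """Return the 8-octet HELLO PDU: no payload, null signature."""
    # Assumed widths: PDU Type 8 bits, Payload Length 32 bits,
    # Sig Type 8 bits, Signature Length 16 bits, network byte order.
    return struct.pack("!BIBH", PDU_HELLO, 0, SIG_NULL, 0)


def open_delay(max_delay: float = 5.0) -> float:
    """Pick the jittered delay (seconds) between HELLO receipt and OPEN."""
    # Default range of zero to five seconds; max_delay is the
    # configurable upper bound.
    return random.uniform(0.0, max_delay)


def on_hello(established: set, src_mac: str, vid=None) -> str:
    """Decide what to do with a received HELLO.

    `established` is a hypothetical set of (MAC, VID) keys for peers
    with which an L3DL session already exists.
    """
    if (src_mac, vid) in established:
        return "drop"        # session exists: the HELLO is dropped
    return "send-open"       # no session: answer with an OPEN after open_delay()
```

   A caller would schedule the OPEN open_delay() seconds after
   on_hello() returns "send-open", which spreads the load when many
   devices boot at once.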
   The Nonce enables detection of a duplicate OPEN PDU. It SHOULD be
   either a random number or a high resolution timestamp. It is needed
   to prevent session closure due to a repeated OPEN caused by a race or
   a dropped or delayed ACK.

   My LLEI is the sender's LLEI, see Section 9.

   AttrCount is the number of attributes in the Attribute List.
   Attributes are single octets, the semantics of which are
   operator-defined.

   A node may have zero or more operator-defined attributes, e.g.:
   spine, leaf, backbone, route reflector, arabica, ...

   Attribute syntax and semantics are local to an operator or
   datacenter; hence there is no global registry. Nodes exchange their
   attributes only in the OPEN PDU.

   Auth Type is the Signature algorithm suite, see Section 8.

   Key Length is a 16-bit field denoting the length in octets of the Key
   itself, not including the Auth Type or the Key Length. If the Auth
   Type is zero, then the Key Length MUST also be zero, and there MUST
   be no Key data.

   The Key is specific to the operational environment. A failure to
   authenticate is a failure to start the L3DL session; an ERROR PDU
   MUST be sent (Error Code 3), and HELLOs MUST be restarted.

   The Serial Number is that of the last received and processed PDU.
   This allows a receiver sending an OPEN to tell the sender that the
   receiver wants to resume a session and the sender only needs to send
   data more recent than the Serial Number. If this OPEN is not trying
   to restart a lost session, the Serial Number MUST be set to zero.

   The Signature fields are described in Section 8 and in an asymmetric
   key environment serve as a proof of possession of the signing auth
   data by the sender.

   Once two logical link endpoints know each other, and have ACKed each
   other's OPEN PDUs, Layer 2 KEEPALIVEs (see Section 15) MAY be started
   to ensure Layer 2 liveness and keep the session semantics alive. The
   timing and acceptable drop of KEEPALIVE PDUs are discussed in
   Section 15.

   If a sender of an OPEN does not receive an ACK of the OPEN PDU, then
   it MUST resend the same OPEN PDU, with the same Nonce. Resending
   an unacknowledged OPEN PDU, like other ACKed PDUs, SHOULD use
   exponential back-off, see [RFC1122].

   If a properly authenticated OPEN arrives at L3DL speaker A with a new
   Nonce from an LLEI, speaker B, with which A believes it already has
   an L3DL session (OPENs have already been exchanged), and the Serial
   Number in the OPEN PDU is non-zero, speaker A SHOULD establish a new
   session by sending an OPEN with the Serial Number being the same as
   that of A's last sent and ACKed PDU. Each party MUST resume sending
   encapsulations etc. subsequent to the other party's Serial Number,
   and each MUST retain all previously discovered encapsulation and
   other data.

   If a properly authenticated OPEN arrives with a new Nonce from an
   LLEI with which the receiving logical link endpoint believes it
   already has an L3DL session (OPENs have already been exchanged), and
   the Serial Number in the OPEN is zero, then the receiver MUST assume
   that the sending LLEI or entire device has been reset. All
   previously discovered encapsulation data MUST NOT be kept and MUST be
   withdrawn via the BGP-LS API, and the recipient MUST respond with a
   new OPEN.

12.  ACK

   The ACK PDU acknowledges receipt of a PDU and reports any error
   condition which might have been raised.

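   The OPEN handling rules of Section 11 (duplicate Nonce, resume with
   a non-zero Serial Number, reset with a zero Serial Number) can be
   sketched as a small decision function. This is a minimal Python
   sketch that assumes the OPEN has already been authenticated. The
   Session record, the function name, and the returned action strings
   are hypothetical, and what an implementation does with a duplicate
   beyond not closing the session is not specified here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Session:
    """Hypothetical per-peer state kept by the receiver of an OPEN."""
    peer_nonce: int    # Nonce of the OPEN that established the session


def classify_open(session: Optional[Session], nonce: int, serial: int) -> str:
    """Classify an authenticated OPEN received from a given LLEI."""
    if session is None:
        return "new-session"     # no session yet: normal OPEN exchange
    if nonce == session.peer_nonce:
        return "duplicate"       # repeated OPEN: must not close the session
    if serial != 0:
        return "resume"          # keep learned data, resume after Serial Number
    return "peer-reset"          # discard and withdraw learned data, send new OPEN


existing = Session(peer_nonce=0x1234)
for args in [(None, 1, 0), (existing, 0x1234, 0),
             (existing, 0x9999, 42), (existing, 0x9999, 0)]:
    print(classify_open(*args))
```

   Keeping the decision separate from the side effects (withdrawing
   via the BGP-LS API, sending the new OPEN) makes the four cases easy
   to test in isolation.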
814 0 1 2 3 815 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 816 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 817 | PDU Type = 3 | Payload Length = 5 ~ 818 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 819 ~ | ACKed PDU | EType | Error Code | 820 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 821 | Error Hint | Sig Type |Signature Leng.~ 822 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 823 ~ | Signature ... | 824 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 826 The ACK acknowledges receipt of an OPEN, Encapsulation, VENDOR PDU, 827 etc. 829 The ACKed PDU is the PDU Type of the PDU being acknowledged, e.g., 830 OPEN, one of the Encapsulations, etc. 832 If there was an error processing the received PDU, then the EType is 833 non-zero. If the EType is zero, Error Code and Error Hint MUST also 834 be zero. 836 A non-zero EType is the receiver's way of telling the PDU's sender 837 that the receiver had problems processing the PDU. The Error Code 838 and Error Hint will tell the sender more detail about the error. 840 The decimal value of EType gives a strong hint how the receiver 841 sending the ACK believes things should proceed: 843 0 - No Error, Error Code and Error Hint MUST be zero 844 1 - Warning, something not too serious happened, continue 845 2 - Session should not be continued, try to restart 846 3 - Restart is hopeless, call the operator 847 4-15 - Reserved 849 The Error Codes, noting protocol failures, are listed in 850 Section 22.4. Someone stuck in the 1990s might think the catenation 851 of EType and Error Code as an echo of 0x1zzz, 0x2zzz, etc. They 852 might be right; or not. 854 The Error Hint, an arbitrary 16 bits, is any additional data the 855 sender of the error PDU thinks will help the recipient or the 856 debugger with the particular error. 858 The Signature fields are described in Section 8. 860 12.1. 
Retransmission

   If a PDU sender expects an ACK, e.g. for an OPEN, an Encapsulation, a
   VENDOR PDU, etc., and does not receive the ACK for a configurable
   time (default one second), and the interface is live at layer 2, the
   sender resends the PDU using exponential back-off, see [RFC1122].
   This cycle MAY be repeated a configurable number of times (default
   three) before it is considered a failure. The session MAY be
   considered closed in the case of such an ACK failure.

   If the link is broken at layer 2, retransmission MAY be retried when
   the link is restored.

13.  The Encapsulations

   Once the devices know each other's LLEIs, know each other's upper
   layer (L2.5 and L3) identities, have means to ensure link state,
   etc., the L3DL session is considered established, and the devices
   SHOULD exchange L3 interface encapsulations, L3 addresses, and L2.5
   labels.

   The Encapsulation types the peers exchange may be IPv4
   (Section 13.3), IPv6 (Section 13.4), MPLS IPv4 (Section 13.6), MPLS
   IPv6 (Section 13.7), and/or possibly others not defined here.

   The sender of an Encapsulation PDU MUST NOT assume that the peer is
   capable of the same Encapsulation Type. An ACK (Section 12) merely
   acknowledges receipt. Only if both peers have sent the same
   Encapsulation Type is it safe for Layer 3 protocols to assume that
   they are compatible for that type.

   A receiver of an encapsulation might recognize an addressing
   conflict, such as both ends of the link trying to use the same
   address. In this case, the receiver SHOULD respond with an ACK
   carrying an error (Error Code 2). As there may be other usable
   addresses or encapsulations, the receiver might log this error and
   continue, letting an upper layer topology builder deal with what
   works.

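   The retransmission behaviour of Section 12.1 can be illustrated
   with a short Python sketch. The text only asks for exponential
   back-off with a default initial wait of one second and a default of
   three repetitions; the doubling factor, the function names, and the
   action strings below are assumptions made for illustration.

```python
def retransmit_delays(initial: float = 1.0, retries: int = 3,
                      factor: float = 2.0) -> list:
    """Waits (seconds) before each successive retransmission."""
    # Assumption: each wait is `factor` times the previous one.
    return [initial * factor ** attempt for attempt in range(retries)]


def next_action(acked: bool, link_up: bool, attempts: int,
                retries: int = 3) -> str:
    """Decide what a sender does when its ACK timer fires."""
    if acked:
        return "done"
    if not link_up:
        return "wait-for-link"   # MAY retry once layer 2 is restored
    if attempts >= retries:
        return "fail"            # session MAY be considered closed
    return "resend"


print(retransmit_delays())
```

   With the defaults, the waits grow as one, two, and four seconds, so
   an unresponsive peer is declared failed after roughly seven seconds
   under these assumptions.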
   Further, to consider a logical link of a type to formally be
   established so that it may be pushed up to upper layer protocols, the
   addressing for the type must be compatible, e.g. on the same IP
   subnet.

13.1.  The Encapsulation PDU Skeleton

   The header for all encapsulation PDUs is as follows:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   PDU Type    |                Payload Length                 ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ~               |                     Count                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         Serial Number                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |             Encapsulation List...             |   Sig Type    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |       Signature Length        |         Signature ...         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   An Encapsulation PDU describes zero or more addresses of the
   encapsulation type.

   The 24-bit Count is the number of Encapsulations in the Encapsulation
   list.

   The Serial Number is a monotonically increasing 32-bit value
   representing the sender's state in time. It may be an integer, a
   timestamp, etc. On session restart (new OPEN), a receiver MAY send
   the last received Serial Number to tell the sender to only send
   newer data.

   If a sender has multiple links on the same interface, separate state
   (data, ACKs, etc.) must be kept for each peer session.

   Over time, multiple Encapsulation PDUs may be sent for an interface
   as configuration changes.

   If the length of an Encapsulation PDU exceeds the Datagram size limit
   on media, the PDU is broken into multiple Datagrams. See Section 8.

   The Signature fields are described in Section 8.

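   The common Encapsulation header of Section 13.1 (PDU Type, Payload
   Length, 24-bit Count, Serial Number) can be packed and unpacked as
   in the following Python sketch. The 8/32/24/32-bit widths are read
   off the skeleton diagram and the function names are hypothetical.
   What the Payload Length covers for Encapsulation PDUs is not
   restated in this section, so the caller supplies the value.

```python
import struct

ENCAP_HEADER_LEN = 12   # 1 + 4 + 3 + 4 octets under the assumed widths


def encode_encap_header(pdu_type: int, payload_length: int,
                        count: int, serial: int) -> bytes:
    """Pack PDU Type, Payload Length, Count and Serial Number."""
    if not 0 <= count < 1 << 24:
        raise ValueError("Count must fit in 24 bits")
    return (struct.pack("!BI", pdu_type, payload_length)
            + count.to_bytes(3, "big")       # struct has no 24-bit code
            + struct.pack("!I", serial))


def decode_encap_header(data: bytes) -> tuple:
    """Unpack the first 12 octets into (type, length, count, serial)."""
    if len(data) < ENCAP_HEADER_LEN:
        raise ValueError("truncated Encapsulation header")
    pdu_type, payload_length = struct.unpack_from("!BI", data, 0)
    count = int.from_bytes(data[5:8], "big")
    (serial,) = struct.unpack_from("!I", data, 8)
    return pdu_type, payload_length, count, serial


header = encode_encap_header(4, 6, 1, 7)   # PDU Type 4 = IPv4
print(header.hex(), decode_encap_header(header))
```

   Big-endian packing is used throughout, matching the network byte
   order that the bit diagrams imply.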
   The Receiver MUST acknowledge the Encapsulation PDU with an ACK PDU
   (Type 3, see Section 12) whose ACKed PDU field carries the PDU Type
   of the encapsulation being announced.

   If the Sender does not receive an ACK in a configurable interval
   (default one second), and the interface is live at layer 2, it
   SHOULD retransmit. After a user configurable number of failures
   (default three), the L3DL session should be considered dead and the
   OPEN process SHOULD be restarted.

   If the link is broken at layer 2, retransmission MAY be retried if
   data have not changed in the interim.

13.2.  Encapsulation Flags

   The Encapsulation Flags are a sequence of bit fields as follows:

         0            1            2            3         4 ... 7
   +------------+------------+------------+------------+------------+
   |  Ann/With  |  Primary   | Under/Over |  Loopback  | Reserved ..|
   +------------+------------+------------+------------+------------+

   Each encapsulation in an Encapsulation PDU of Type T may announce new
   and/or withdraw old encapsulations of Type T. It indicates this with
   the Ann/With Encapsulation Flag, Announce == 1, Withdraw == 0.

   Each Encapsulation interface address in an Encapsulation PDU either
   announces a new encapsulation (Ann/With == 1) or requests that an
   existing one be withdrawn (Ann/With == 0), a la BGP. Adding an
   encapsulation which already exists SHOULD raise an Announce/Withdraw
   Error (see Section 22.4); the EType SHOULD be 2, suggesting a session
   restart (see Section 12), so that all encapsulations will be resent.

   If an LLEI has multiple addresses for an encapsulation type, one and
   only one address MAY be marked as primary (Primary Flag == 1) for
   that Encapsulation Type.

   An Encapsulation interface address in an Encapsulation PDU MAY be
   marked as a loopback, in which case the Loopback bit is set.
   Loopback addresses are generally not seen directly on an external
   interface.
   One or more loopback addresses MAY be exposed by
   configuration on one or more L3DL speaking external interfaces, e.g.
   for iBGP peering. They SHOULD be marked as such, Loopback Flag == 1.

   Each Encapsulation interface address in an Encapsulation PDU is
   either that of the direct 'underlay' interface (Under/Over == 1) or
   an 'overlay' address (Under/Over == 0), likely that of a VM or
   container guest bridged or configured onto an interface which
   already has an underlay address.

13.3.  IPv4 Encapsulation

   The IPv4 Encapsulation describes a device's ability to exchange IPv4
   packets on one or more subnets. It does so by stating the
   interface's addresses and the corresponding prefix lengths.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  PDU Type = 4 |                Payload Length                 ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ~               |                     Count                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         Serial Number                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Encaps Flags  |                 IPv4 Address                  ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ~               |   PrefixLen   |    more ...   |   Sig Type    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |       Signature Length        |         Signature ...         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The 24-bit Count is the sum of the number of IPv4 Encapsulations
   being announced and/or withdrawn.

13.4.  IPv6 Encapsulation

   The IPv6 Encapsulation describes a logical link's ability to exchange
   IPv6 packets on one or more subnets. It does so by stating the
   interface's addresses and the corresponding prefix lengths.

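   To make the pairing of flags, address, and prefix length concrete,
   the following Python sketch encodes one IPv6 Encapsulation list
   entry: an Encaps Flags octet, the 16-octet address, and a one-octet
   PrefixLen. It assumes that flag bit 0 is the most significant bit
   of the octet, as is usual for bit diagrams in this style, and it
   follows the flag polarity given in Section 13.2 (Announce == 1,
   underlay == 1). The constant and function names are hypothetical.

```python
import ipaddress

# Assumed bit positions: bit 0 of the flags octet is the MSB.
FLAG_ANNOUNCE = 0x80   # bit 0, Ann/With: 1 = announce, 0 = withdraw
FLAG_PRIMARY = 0x40    # bit 1, Primary
FLAG_UNDERLAY = 0x20   # bit 2, Under/Over: 1 = underlay, 0 = overlay
FLAG_LOOPBACK = 0x10   # bit 3, Loopback


def encode_ipv6_entry(flags: int, interface: str) -> bytes:
    """Encode flags + IPv6 address + prefix length (18 octets)."""
    if not 0 <= flags <= 0xFF:
        raise ValueError("flags must fit in one octet")
    iface = ipaddress.IPv6Interface(interface)   # e.g. "2001:db8::1/64"
    return (bytes([flags])
            + iface.ip.packed                    # 16 octets, network order
            + bytes([iface.network.prefixlen]))


entry = encode_ipv6_entry(FLAG_ANNOUNCE | FLAG_PRIMARY | FLAG_UNDERLAY,
                          "2001:db8::1/64")
print(len(entry), entry.hex())
```

   An IPv4 entry would follow the same pattern with a 4-octet address,
   using ipaddress.IPv4Interface in place of IPv6Interface.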
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  PDU Type = 5 |                Payload Length                 ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ~               |                     Count                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         Serial Number                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Encaps Flags  |                                               |
   +-+-+-+-+-+-+-+-+                                               +
   |                                                               |
   +                                                               +
   |                                                               |
   +                                                               +
   |                         IPv6 Address                          |
   +               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |               |   PrefixLen   |    more ...   |   Sig Type    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |       Signature Length        |         Signature ...         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The 24-bit Count is the sum of the number of IPv6 Encapsulations
   being announced and/or withdrawn.

13.5.  MPLS Label List

   As an MPLS enabled interface may have a label stack, see [RFC3032], a
   variable length list of labels is needed. These are the labels the
   sender will accept for the prefix to which the list is attached.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Label Count  |                 Label                 | Exp |S|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                 Label                 | Exp |S|   more ...    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   A Label Count of zero is an implicit withdraw of all labels for that
   prefix on that interface.

13.6.  MPLS IPv4 Encapsulation

   The MPLS IPv4 Encapsulation describes a logical link's ability to
   exchange labeled IPv4 packets on one or more subnets. It does so by
   stating the interface's addresses, the corresponding prefix lengths,
   and the corresponding labels which will be accepted for each address.

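   The MPLS Label List of Section 13.5 can be sketched in Python as
   follows. The sketch assumes, from the diagram, an 8-bit Label Count
   followed by three octets per label, made up of a 20-bit Label, a
   3-bit Exp field, and the 1-bit S (bottom of stack) flag. The
   function names and the tuple representation are hypothetical.

```python
def encode_label_list(labels: list) -> bytes:
    """Encode [(label, exp, bottom_of_stack), ...] as a Label List."""
    if len(labels) > 255:
        raise ValueError("Label Count must fit in 8 bits")
    out = bytearray([len(labels)])
    for label, exp, bottom in labels:
        if not (0 <= label < 1 << 20 and 0 <= exp < 8):
            raise ValueError("label or Exp out of range")
        # 24-bit word: Label (20 bits) | Exp (3 bits) | S (1 bit)
        word = (label << 4) | (exp << 1) | int(bool(bottom))
        out += word.to_bytes(3, "big")
    return bytes(out)


def decode_label_list(data: bytes) -> tuple:
    """Return (labels, octets_consumed) for a Label List at data[0]."""
    count = data[0]
    if len(data) < 1 + 3 * count:
        raise ValueError("truncated Label List")
    labels = []
    for i in range(count):
        word = int.from_bytes(data[1 + 3 * i:4 + 3 * i], "big")
        labels.append((word >> 4, (word >> 1) & 0x7, bool(word & 1)))
    return labels, 1 + 3 * count


print(encode_label_list([(16, 0, True)]).hex())
print(encode_label_list([]).hex())   # Label Count 0: implicit withdraw
```

   Returning the number of octets consumed lets a parser step past the
   variable length list to the address and prefix length that follow
   it in the MPLS encapsulations.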
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  PDU Type = 6 |                Payload Length                 ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ~               |                     Count                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         Serial Number                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Encaps Flags  |      MPLS Label List ...      |               ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ~                 IPv4 Address                  |   PrefixLen   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    more ...   |   Sig Type    |       Signature Length        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           Signature                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The 24-bit Count is the sum of the number of MPLS IPv4 Encapsulations
   being announced and/or withdrawn.

13.7.  MPLS IPv6 Encapsulation

   The MPLS IPv6 Encapsulation describes a logical link's ability to
   exchange labeled IPv6 packets on one or more subnets. It does so by
   stating the interface's addresses, the corresponding prefix lengths,
   and the corresponding labels which will be accepted for each address.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  PDU Type = 7 |                Payload Length                 ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ~               |                     Count                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         Serial Number                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Encaps Flags  |      MPLS Label List ...      |               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+               +
   |                                                               |
   +                                                               +
   |                                                               |
   +                                                               +
   |                         IPv6 Address                          |
   +                                               +-+-+-+-+-+-+-+-+
   |                                               |  Prefix Len   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    more ...   |   Sig Type    |       Signature Length        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         Signature ...                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The 24-bit Count is the sum of the number of MPLS IPv6 Encapsulations
   being announced and/or withdrawn.

14.  VENDOR - Vendor Extensions

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | PDU Type = 255|                Payload Length                 ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ~               |                 Serial Number                 ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ~               |               Enterprise Number               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   Ent Type    |              Enterprise Data ...              ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ~               |   Sig Type    |       Signature Length        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         Signature ...                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Vendors or enterprises may define TLVs beyond the scope of L3DL
   standards. This is done using a Private Enterprise Number [IANA-PEN]
   followed by Enterprise Data in a format defined for that Enterprise
   Number and Ent Type.

   Ent Type allows a VENDOR PDU to be sub-typed in the event that the
   vendor/enterprise needs multiple PDU types.

   As with Encapsulation PDUs, a receiver of a VENDOR PDU MUST respond
   with an ACK or an ERROR PDU. Similarly, a VENDOR PDU MUST only be
   sent over an open session.

15.
KEEPALIVE - Layer 2 Liveness 1159 0 1 2 3 1160 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1161 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1162 | PDU Type = 2 | Payload Length = 0 ~ 1163 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1164 ~ | Sig Type = 0 | Signature Length = 0 | 1165 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1167 L3DL devices SHOULD beacon frequent Layer 2 KEEPALIVE PDUs to ensure 1168 session continuity. The inter-KEEPALIVE interval is configurable, 1169 with a default of ten seconds. A receiver may choose to ignore 1170 KEEPALIVE PDUs. 1172 An operational deployment MUST BE configured whether to use 1173 KEEPALIVEs or not, either globally, or as finely as to per-link 1174 granularity. Disagreement MAY result in repeated session failure and 1175 reestablishment. 1177 KEEPALIVEs SHOULD be beaconed at a configured frequency. One per 1178 second is the default. Layer 3 liveness, such as BFD, may be more 1179 (or less) aggressive. 1181 When a sender transmits a PDU which is not a KEEPALIVE, the sender 1182 SHOULD reset the KEEPALIVE timer. I.e. sending any PDU acts as a 1183 keepalive. Once the last fragment has been sent, the KEEPALIVE timer 1184 SHOULD BE restarted. Do not wait for the ACK. 1186 If a KEEPALIVE or other PDUs have not been received from a peer with 1187 which a receiver has an open session for a configurable time (default 1188 30 seconds), the link SHOULD BE presumed down. The devices MAY keep 1189 configuration state and restore it without retransmission if no data 1190 have changed. Otherwise, a new session SHOULD BE established and new 1191 Encapsulation PDUs exchanged. 1193 16. Layers 2.5 and 3 Liveness 1195 Layer 2 liveness may be continuously tested by KEEPALIVE PDUs, see 1196 Section 15. 
As layer 2.5 or layer 3 connectivity could still break, 1197 liveness above layer 2 MAY be frequently tested using BFD ([RFC5880]) 1198 or a similar technique. 1200 This protocol assumes that one or more Encapsulation addresses may be 1201 used to ping, run BFD, or whatever the operator configures. 1203 17. The North/South Protocol 1205 Thus far, a one-hop point-to-point logical link discovery protocol 1206 has been defined. 1208 The devices know their unique LLEIs and know the unique peer LLEIs 1209 and Encapsulations on each logical link interface. 1211 Full topology discovery is not appropriate at the L3DL layer, so 1212 Dijkstra a la IS-IS etc. is assumed to be done by higher level 1213 protocols such as BGP-SPF. 1215 Therefore the LLEIs, link Encapsulations, and state changes are 1216 pushed North via a small subset of the BGP-LS API. The upper layer 1217 routing protocol(s), e.g. BGP-SPF, learn and maintain the topology, 1218 run Dijkstra, and build the routing database(s). 1220 For example, if a neighbor's IPv4 Encapsulation address changes, the 1221 devices seeing the change push that change Northbound. 1223 17.1. Use BGP-LS as Much as Possible 1225 BGP-LS [RFC7752] defines BGP-like Datagrams describing logical link 1226 state (links, nodes, link prefixes, and many other things), and a new 1227 BGP path attribute providing Northbound transport, all of which can 1228 be ingested by upper layer protocols such as BGP-SPF; see Section 4 1229 of [I-D.ietf-lsvr-bgp-spf]. 1231 For IPv4 links, TLVs 259 and 260 are used. For IPv6 links, TLVs 261 1232 and 262. If there are multiple addresses on a link, multiple TLV 1233 pairs are pushed North, having the same ID pairs. 1235 17.2. Extensions to BGP-LS 1237 The Northbound protocol needs a few minor extensions to BGP-LS. 1238 Luckily, others have needed the same extensions. 
1240 Similarly to BGP-SPF, the BGP protocol is used in the Protocol-ID 1241 field specified in table 1 of 1242 [I-D.ietf-idr-bgpls-segment-routing-epe]. The local and remote node 1243 descriptors for all NLRI are the IDs described in Section 11. This 1244 is equivalent to an adjacency SID or a node SID if the address is a 1245 loopback address. 1247 Label Sub-TLVs from [I-D.ietf-idr-bgp-ls-segment-routing-ext] 1248 Section 2.1.1, are used to associate one or more MPLS Labels with a 1249 link. 1251 18. Discussion 1253 This section explores some trade-offs taken and some considerations. 1255 18.1. HELLO Discussion 1257 A device with multiple Layer 2 interfaces, traditionally called a 1258 switch, may be used to forward frames and therefore packets from 1259 multiple devices to one logical interface (LLEI), I, on an L3DL 1260 speaking device. Interface I could discover a peer J across the 1261 switch. Later, a prospective peer K could come up across the switch. 1262 If I was not still sending and listening for HELLOs, the potential 1263 peering with K could not be discovered. Therefore, on multi-link 1264 interfaces, L3DL MUST continue to send HELLOs as long as they are 1265 turned up. 1267 18.2. HELLO versus KEEPALIVE 1269 Both HELLO and KEEPALIVE are periodic. KEEPALIVE might be eliminated 1270 in favor of keeping only HELLOs. But KEEPALIVEs are unicast, and 1271 thus less noisy on the network, especially if HELLO is configured to 1272 transit layer-2-only switches, see Section 18.1. 1274 19. VLANs/SVIs/Sub-interfaces 1276 One can think of the protocol as an instance (i.e. state machine) 1277 which runs on each logical link of a device. 1279 As the upper routing layer must view VLAN topologies as separate 1280 graphs, L3DL treats VLAN ports as separate links. 1282 L3DL PDUs learned over VLAN-ports may be interpreted by upper layer-3 1283 routing protocols as being learned on the corresponding layer-3 SVI 1284 interface for the VLAN. 
   As Sub-Interfaces each have their own LLEIs, they act as separate
   interfaces, forming their own links.

20.  Implementation Considerations

   An implementation SHOULD provide the ability to configure each
   logical interface as L3DL speaking or not.

   An implementation SHOULD provide the ability to configure whether
   HELLOs on an L3DL enabled interface send Nearest Bridge or the MAC
   which is propagated by switches from that interface; see Section 10.

   An implementation SHOULD provide the ability to distribute one or
   more loopback addresses or interfaces into L3DL on an external L3DL
   speaking interface.

   An implementation SHOULD provide the ability to distribute one or
   more overlay and/or underlay addresses or interfaces into L3DL on an
   external L3DL speaking interface.

   An implementation SHOULD provide the ability to configure one of the
   addresses of an encapsulation as primary on an L3DL speaking
   interface. If there is only one address for a particular
   encapsulation, the implementation MAY mark it as primary by default.

   An implementation MAY allow optional configuration which updates the
   local forwarding table with overlay and underlay data both learned
   from L3DL peers and configured locally.

21.  Security Considerations

   The protocol as is MUST NOT be used outside a datacenter or similarly
   closed environment without authentication and authorization
   mechanisms such as [I-D.ymbk-lsvr-l3dl-signing].

   Many MDC operators have a strange belief that physical walls and
   firewalls provide sufficient security. This is not credible. All
   MDC protocols need to be examined for exposure and attack surface.
   In the case of L3DL, Authentication and Integrity as provided in
   [I-D.ymbk-lsvr-l3dl-signing] is strongly recommended.

   It is generally unwise to assume that on the wire Layer 2 is secure.
1328 Strange/unauthorized devices may plug into a port. Mis-wiring is 1329 very common in datacenter installations. A poisoned laptop might be 1330 plugged into a device's port, form malicious sessions, etc. to 1331 divert, intercept, or drop traffic. 1333 Similarly, malicious nodes/devices could mis-announce addressing. 1335 If OPENs are not being authenticated, an attacker could forge an OPEN 1336 for an existing session and cause the session to be reset. 1338 For these reasons, the OPEN PDU's authentication data exchange SHOULD 1339 be used. 1341 If the KEEPALIVE PDU is not signed (as suggested in Section 8) to 1342 save computation, then a MITM could fake a session being alive. 1344 22. IANA Considerations 1346 22.1. PDU Types 1348 This document requests the IANA create a registry for L3DL PDU Type, 1349 which may range from 0 to 255. The name of the registry should be 1350 L3DL-PDU-Type. The policy for adding to the registry is RFC Required 1351 per [RFC5226], either standards track or experimental. The initial 1352 entries should be the following: 1354 PDU 1355 Code PDU Name 1356 ---- ------------------- 1357 0 HELLO 1358 1 OPEN 1359 2 KEEPALIVE 1360 3 ACK 1361 4 IPv4 Announcement 1362 5 IPv6 Announcement 1363 6 MPLS IPv4 Announcement 1364 7 MPLS IPv6 Announcement 1365 8-254 Reserved 1366 255 VENDOR 1368 22.2. Signature Type 1370 This document requests the IANA create a registry for L3DL Signature 1371 Type, AKA Sig Type, which may range from 0 to 255. The name of the 1372 registry should be L3DL-Signature-Type. The policy for adding to the 1373 registry is RFC Required per [RFC5226], either standards track or 1374 experimental. The initial entries should be the following: 1376 Number Name 1377 ------ ------------------- 1378 0 Null 1379 1-255 Reserved 1381 22.3. Flag Bits 1383 This document requests the IANA create a registry for L3DL PL Flag 1384 Bits, which may range from 0 to 7. The name of the registry should 1385 be L3DL-PL-Flag-Bits. 
The policy for adding to the registry is RFC 1386 Required per [RFC5226], either standards track or experimental. The 1387 initial entries should be the following: 1389 Bit Bit Name 1390 ---- ------------------- 1391 0 Announce/Withdraw (ann == 0) 1392 1 Primary 1393 2 Underlay/Overlay (under == 0) 1394 3 Loopback 1395 4-7 Reserved 1397 22.4. Error Codes 1399 This document requests the IANA create a registry for L3DL Error 1400 Codes, a 16 bit integer. The name of the registry should be L3DL- 1401 Error-Codes. The policy for adding to the registry is RFC Required 1402 per [RFC5226], either standards track or experimental. The initial 1403 entries should be the following: 1405 Error 1406 Code Error Name 1407 ---- ------------------- 1408 0 No Error 1409 1 Checksum Error 1410 2 Logical Link Addressing Conflict 1411 3 Authorization Failure 1412 4 Announce/Withdraw Error 1414 23. IEEE Considerations 1416 This document requires a new EtherType. 1418 This document requires a new multicast MAC address that will be 1419 broadcast through a switch. 1421 24. Acknowledgments 1423 The authors thank Cristel Pelsser for multiple reviews, Harsha Kovuru 1424 for comments during implementation, Jeff Haas for review and 1425 comments, Joerg Ott for an early but deep transport review, Joe 1426 Clarke for a useful review, John Scudder for deeply serious review 1427 and comments, Larry Kreeger for a lot of layer 2 clue, Martijn 1428 Schmidt for his contribution, Nalinaksh Pai for transport 1429 discussions, Neeraj Malhotra for review, Paul Congdon for Ethernet 1430 hints, Russ Housley for checksum discussion and sBox, and Steve 1431 Bellovin for checksum advice. 1433 25. References 1435 25.1. Normative References 1437 [I-D.ietf-idr-bgp-ls-segment-routing-ext] 1438 Previdi, S., Talaulikar, K., Filsfils, C., Gredler, H., 1439 and M. Chen, "BGP Link-State extensions for Segment 1440 Routing", draft-ietf-idr-bgp-ls-segment-routing-ext-16 1441 (work in progress), June 2019. 

   [I-D.ietf-idr-bgpls-segment-routing-epe]
              Previdi, S., Talaulikar, K., Filsfils, C., Patel, K., Ray,
              S., and J. Dong, "BGP-LS extensions for Segment Routing
              BGP Egress Peer Engineering",
              draft-ietf-idr-bgpls-segment-routing-epe-19 (work in
              progress), May 2019.

   [I-D.ietf-lsvr-bgp-spf]
              Patel, K., Lindem, A., Zandi, S., and W. Henderickx,
              "Shortest Path Routing Extensions for BGP Protocol",
              draft-ietf-lsvr-bgp-spf-08 (work in progress), March 2020.

   [I-D.ymbk-lsvr-l3dl-signing]
              Bush, R. and R. Austein, "Layer 3 Discovery and Liveness
              Signing", draft-ymbk-lsvr-l3dl-signing-01 (work in
              progress), May 2020.

   [IANA-PEN]
              "IANA Private Enterprise Numbers".

   [IEEE.802_2001]
              IEEE, "IEEE Standard for Local and Metropolitan Area
              Networks: Overview and Architecture", IEEE 802-2001,
              DOI 10.1109/ieeestd.2002.93395, July 2002.

   [IEEE802-2014]
              Institute of Electrical and Electronics Engineers, "Local
              and Metropolitan Area Networks: Overview and
              Architecture", IEEE Std 802-2014, 2014.

   [RFC1213]  McCloghrie, K. and M. Rose, "Management Information Base
              for Network Management of TCP/IP-based internets: MIB-II",
              STD 17, RFC 1213, DOI 10.17487/RFC1213, March 1991.

   [RFC1629]  Colella, R., Callon, R., Gardner, E., and Y. Rekhter,
              "Guidelines for OSI NSAP Allocation in the Internet",
              RFC 1629, DOI 10.17487/RFC1629, May 1994.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC3032]  Rosen, E., Tappan, D., Fedorkow, G., Rekhter, Y.,
              Farinacci, D., Li, T., and A. Conta, "MPLS Label Stack
              Encoding", RFC 3032, DOI 10.17487/RFC3032, January 2001.

   [RFC4271]  Rekhter, Y., Ed., Li, T., Ed., and S.
              Hares, Ed., "A Border Gateway Protocol 4 (BGP-4)",
              RFC 4271, DOI 10.17487/RFC4271, January 2006.

   [RFC5226]  Narten, T. and H. Alvestrand, "Guidelines for Writing an
              IANA Considerations Section in RFCs", RFC 5226,
              DOI 10.17487/RFC5226, May 2008.

   [RFC5880]  Katz, D. and D. Ward, "Bidirectional Forwarding Detection
              (BFD)", RFC 5880, DOI 10.17487/RFC5880, June 2010.

   [RFC6286]  Chen, E. and J. Yuan, "Autonomous-System-Wide Unique BGP
              Identifier for BGP-4", RFC 6286, DOI 10.17487/RFC6286,
              June 2011.

   [RFC7752]  Gredler, H., Ed., Medved, J., Previdi, S., Farrel, A., and
              S. Ray, "North-Bound Distribution of Link-State and
              Traffic Engineering (TE) Information Using BGP", RFC 7752,
              DOI 10.17487/RFC7752, March 2016.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

25.2. Informative References

   [Clos0]    Clos, C., "A study of non-blocking switching networks
              [PAYWALLED]", Bell System Technical Journal 32 (2), pp
              406-424, March 1953.

   [Clos1]    "Clos Network".

   [I-D.malhotra-bess-evpn-lsoe]
              Malhotra, N., Patel, K., and J. Rabadan, "LSoE-based PE-CE
              Control Plane for EVPN", draft-malhotra-bess-evpn-lsoe-00
              (work in progress), March 2019.

   [JUPITER]  Singh, A., Ong, J., Agarwal, A., Anderson, G., Armistead,
              A., Bannon, R., Boving, S., Desai, G., Felderman, B.,
              Germano, P., Kanagala, A., Liu, H., Provost, J., Simmons,
              J., Tanda, E., Wanderer, J., Hölzle, U., Stuart, S., and
              A. Vahdat, "Jupiter rising", Communications of the
              ACM Vol. 59, pp. 88-97, DOI 10.1145/2975159, August 2016.

   [RFC0791]  Postel, J., "Internet Protocol", STD 5, RFC 791,
              DOI 10.17487/RFC0791, September 1981.

   [RFC1122]  Braden, R., Ed., "Requirements for Internet Hosts -
              Communication Layers", STD 3, RFC 1122,
              DOI 10.17487/RFC1122, October 1989.

   [RFC1982]  Elz, R. and R. Bush, "Serial Number Arithmetic", RFC 1982,
              DOI 10.17487/RFC1982, August 1996.

Authors' Addresses

   Randy Bush
   Arrcus & Internet Initiative Japan
   5147 Crystal Springs
   Bainbridge Island, WA 98110
   US

   Email: randy@psg.com

   Rob Austein
   Arrcus, Inc

   Email: sra@hactrn.net

   Keyur Patel
   Arrcus
   2077 Gateway Place, Suite #400
   San Jose, CA 95119
   US

   Email: keyur@arrcus.com