INTERNET-DRAFT                                              Linda Dunbar
Intended status: Proposed Standard                       Donald Eastlake
Updates: ESADI                                                    Huawei
                                                           Radia Perlman
                                                                   Intel
                                                          Igor Gashinsky
                                                                   Yahoo
                                                               Yizhou Li
                                                                  Huawei
Expires: August 13, 2014                               February 14, 2014


                TRILL: Edge Directory Assist Mechanisms
         <draft-ietf-trill-directory-assist-mechanisms-00.txt>


Abstract

   This document describes mechanisms for providing directory service to
   TRILL (Transparent Interconnection of Lots of Links) edge switches.
   The directory information provided can be used in reducing
   multi-destination traffic, particularly ARP/ND and unknown unicast
   flooding.

Status of This Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Distribution of this document is unlimited. Comments should be sent
   to the TRILL working group mailing list.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html. The list of Internet-Draft
   Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Table of Contents

   1. Introduction
      1.1 Terminology

   2. Push Model Directory Assistance Mechanisms
      2.1 Requesting Push Service
      2.2 Push Directory Servers
      2.3 Push Directory Server State Machine
         2.3.1 Push Directory States
         2.3.2 Push Directory Events and Conditions
         2.3.3 State Transition Diagram and Table
      2.4 Additional Push Details
      2.5 Primary to Secondary Server Push Service

   3. Pull Model Directory Assistance Mechanisms
      3.1 Pull Directory Message Common Format
      3.2 Pull Directory Query and Response Messages
         3.2.1 Pull Directory Query Message Format
         3.2.2 Pull Directory Response Format
      3.3 Cache Consistency
         3.3.1 Update Message Format
         3.3.2 Acknowledge Message Format
      3.4 Pull Directory Hosted on an End Station
      3.5 Pull Directory Message Errors
      3.6 Additional Pull Details

   4. Events That May Cause Directory Use
      4.1 Forged Native Frame Ingress
      4.2 Unknown Destination MAC
      4.3 Address Resolution Protocol (ARP)
      4.4 IPv6 Neighbor Discovery (ND)
      4.5 Reverse Address Resolution Protocol (RARP)

   5. Layer 3 Address Learning

   6. Directory Use Strategies and Push-Pull Hybrids
      6.1 Strategy Configuration

   7. Security Considerations

   8. IANA Considerations
      8.1 ESADI-Parameter Data Extensions
      8.2 RBridge Channel Protocol Number
      8.3 The Pull Directory (PUL) and No Data (NOD) Bits

   Acknowledgments
   Normative References
   Informational References
   Authors' Addresses

1. Introduction

   [RFC7067] gives a problem statement and high level design for using
   directory servers to assist TRILL [RFC6325] edge nodes to reduce
   multi-destination ARP/ND and unknown unicast flooding traffic and to
   potentially improve security against address spoofing within a TRILL
   campus. Because multi-destination traffic becomes an increasing
   burden as a network scales up in number of nodes, reducing ARP/ND and
   unknown unicast flooding improves TRILL network scalability. This
   document describes specific mechanisms for directory servers to
   assist TRILL edge nodes. These mechanisms are optional to implement.

   The information held by the Directory(s) is address mapping and
   reachability information. Most commonly, this is the MAC address
   [RFC7042] that corresponds to an IP address within a Data Label (VLAN
   or FGL (Fine Grained Label [RFCfgl])), together with the egress TRILL
   switch (RBridge), and optionally the specific TRILL switch port, from
   which that MAC address is reachable. It could also be the IP address
   that corresponds to a MAC address, or possibly other address mappings
   or reachability information.

   In the data center environment, it is common for orchestration
   software to know and control where all the IP addresses, MAC
   addresses, and VLANs/tenants are in a data center. Thus such
   orchestration software is appropriate for providing the directory
   function or for supplying the Directory(s) with directory
   information.

   Directory services can be offered in a Push or Pull Mode. Push Mode,
   in which a directory server pushes information to TRILL switches
   indicating interest, is specified in Section 2. Pull Mode, in which a
   TRILL switch queries a server for the information it wants, is
   specified in Section 3. More detail on modes of operation, including
   hybrid Push/Pull, is provided in Section 6.

   The mechanisms used to initially populate directory data in primary
   servers are beyond the scope of this document. A primary server can
   use the Push Directory service to provide directory data to secondary
   servers as described in Section 2.5.

1.1 Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

   The terminology and acronyms of [RFC6325] are used herein along with
   the following:

   COP: Complete Push flag bit. See Sections 2 and 8.1 below.

   CSNP Time: Complete Sequence Number PDU Time. See ESADI [RFCesadi]
      and Section 8.1 below.

   Data Label: VLAN or FGL.

   FGL: Fine Grained Label [RFCfgl].

   Host: Application running on a physical server or a virtual machine.
      A host must have a MAC address and usually has at least one IP
      address.

   IP: Internet Protocol. In this document, IP includes both IPv4 and
      IPv6.

   PSH: Push Directory flag bit. See Sections 2 and 8.1 below.

   PUL: Pull Directory flag bit. See Sections 3 and 8.3 below.

   primary server: A Directory server that obtains the information it is
      serving up by a reliable mechanism outside the scope of this
      document designed to assure the freshness of that information.
      (See secondary server.)

   RBridge: An alternative name for a TRILL switch.

   secondary server: A Directory server that obtains the information it
      is serving up from one or more primary servers.

   tenant: Sometimes used as a synonym for FGL.

   TRILL switch: A device that implements the TRILL protocol.

2. Push Model Directory Assistance Mechanisms

   In the Push Model [RFC7067], one or more Push Directory servers
   reside at TRILL switches and push down the address mapping
   information for the various addresses associated with end station
   interfaces and the TRILL switches from which those interfaces are
   reachable [IA]. This service is scoped by Data Label (VLAN or FGL
   [RFCfgl]). A Push Directory also advertises whether or not it
   believes it has pushed complete mapping information for a Data Label.
   It might be pushing only a subset of the mapping and/or reachability
   information for a Data Label. The Push Model uses the ESADI
   [RFCesadi] protocol as its distribution mechanism.

   With the Push Model, if complete address mapping information for a
   Data Label being pushed is available, a TRILL switch (RBridge) which
   has that complete pushed information and is ingressing a native frame
   can simply drop the frame if the destination unicast MAC address
   can't be found in the mapping information available, instead of
   flooding the frame (ingressing it as an unknown MAC destination TRILL
   Data frame). However, this will result in lost traffic if the ingress
   TRILL switch's directory information is incomplete.

2.1 Requesting Push Service

   In the Push Model, it is necessary to have a way for a TRILL switch
   to request information from the directory server(s). TRILL switches
   simply use the ESADI [RFCesadi] protocol mechanism to announce, in
   their core IS-IS LSPs, the Data Labels for which they are
   participating in ESADI by using the Interested VLANs and/or
   Interested Labels sub-TLVs [RFC6326bis]. This will cause them to be
   pushed the Directory information for all such Data Labels that are
   being served by one or more Push Directory servers.

2.2 Push Directory Servers

   Push Directory servers advertise their availability to push the
   mapping information for a particular Data Label to each other and to
   ESADI participants for that Data Label through ESADI by turning on a
   flag bit in their ESADI Parameter APPsub-TLV for that ESADI instance
   (see [RFCesadi] and Section 8.1). Each Push Directory server MUST
   participate in ESADI for the Data Labels for which it will push
   mappings and set the PSH (Push Directory) bit in its ESADI-Parameters
   APPsub-TLV for that Data Label.

   For robustness, it is useful to have more than one copy of the data
   being pushed. Each Push Directory server is configured with a number
   in the range 1 to 8, which defaults to 2, for each Data Label for
   which it can push directory information. If the Push Directories for
   a Data Label are configured the same in this regard and enough such
   servers are available, this is the number of copies of the directory
   that will be pushed.

   Each Push Directory server also has an 8-bit priority to be Active
   (see Section 8.1 of this document). This priority is treated as an
   unsigned integer where larger magnitude means higher priority and is
   in its ESADI Parameter APPsub-TLV. In cases of equal priority, the
   6-byte IS-IS System IDs of the tied Push Directories are used as a
   tie breaker and treated as an unsigned integer where larger magnitude
   means higher priority.

   For each Data Label it can serve, each Push Directory server orders,
   by priority, the Push Directory servers that it can see in the ESADI
   link state database for that Data Label that are data reachable
   [RFCclear] and determines its own position in that order.
   If a Push Directory server is configured to believe that N copies of
   the mappings for a Data Label should be pushed and finds that it is
   number K in the priority ordering (where number 1 is the highest
   priority), then if K is less than or equal to N the Push Directory
   server is Active. If K is greater than N it is Passive. Active and
   Passive behavior are specified below.

   For a Push Directory to reside on an end station, one or more TRILL
   switches locally connected to that end station must proxy for the
   Push Directory server and advertise themselves as Push Directory
   servers. It appears to the rest of the TRILL campus that these TRILL
   switches (that are proxying for the end station) are the Push
   Directory server(s). The protocol between such a Push Directory end
   station and the one or more proxying TRILL switches acting as Push
   Directory servers is beyond the scope of this document.

2.3 Push Directory Server State Machine

   The subsections below describe the states, events, and corresponding
   actions for Push Directory servers.

2.3.1 Push Directory States

   A Push Directory Server is in one of six states, as listed below, for
   each Data Label it can serve. In addition, it has an internal
   State-Transition-Time variable for each Data Label it can serve which
   is set at each state transition and which enables it to determine how
   long it has been in its current state for that Data Label.

   Down: A completely shut down virtual state defined for convenience in
      specifying state diagrams. A Push Directory Server in this state
      does not advertise any Push Directory data. It may be
      participating in ESADI [RFCesadi] with the PSH bit zero in its
      ESADI-Parameters or might be not participating in ESADI at all.
      All states other than the Down state are considered to be Up
      states.

   Passive: No Push Directory data is advertised.
      Any outstanding ESADI-LSP fragments containing directory data are
      updated to remove that data and if the result is an empty fragment
      (contains nothing except possibly an Authentication TLV), the
      fragment is purged. The Push Directory participates in ESADI
      [RFCesadi] and advertises its ESADI fragment zero that includes an
      ESADI-Parameters APPsub-TLV with the PSH bit set to one and COP
      (Complete Push) bit zero.

   Active: If a Push Directory server is Active, it advertises its
      directory data and any changes through ESADI [RFCesadi] in its
      ESADI-LSPs using the Interface Addresses [IA] APPsub-TLV and
      updates that information as it changes. The PSH bit is set to one
      in the ESADI-Parameters and the COP bit set to zero.

   Completing: Same behavior as the Active state but responds
      differently to events.

   Complete: The same behavior as Active except that the COP bit in the
      ESADI-Parameters APPsub-TLV is set to one and the server responds
      differently to events.

   Reducing: The same behavior as Complete except that the COP bit is
      cleared to zero in the ESADI-Parameters APPsub-TLV and the server
      responds differently to events. The PSH bit remains a one.
      Directory updates continue to be advertised.

2.3.2 Push Directory Events and Conditions

   Three auxiliary conditions referenced later in this section are
   defined as follows for convenience:

   The Activate Condition: The Push Directory server determines that it
      is priority K among the data reachable Push Directory servers
      (where highest priority is 1), the server is configured that there
      should be N copies pushed, and K is less than or equal to N. For
      example, the Push Directory server is configured that 2 copies
      should be pushed and finds that it is priority 1 or 2 among the
      Push Directory servers it can see.

   The Pacify Condition: The Push Directory server determines that it is
      priority K among the data reachable Push Directory servers (where
      highest priority is 1), the server is configured that there should
      be N copies pushed, and K is greater than N. For example, the Push
      Directory server is configured that 2 copies should be pushed and
      finds that it is priority 3 or lower priority (higher number)
      among the Push Directory servers it can see.

   The Time Condition: The Push Directory server has been in its current
      state for an amount of time equal to or larger than its CSNP time
      (see Section 8.1).

   The events and conditions listed below cause state transitions in
   Push Directory servers.

   1. Push Directory server was Down but is now up.

   2. The Push Directory server or the TRILL switch on which it resides
      is being shut down.

   3. The Activate Condition is met and the server is not configured to
      believe it has complete data.

   4. The server determines that the Pacify Condition is met.

   5. The Activate Condition is met and the server is configured to
      believe it has complete data.

   6. The server is configured to believe it does not have complete
      data.

   7. The Time Condition is met.
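
   As an illustration only, the Activate and Pacify Conditions can be
   sketched in Python as follows. The PushServer record and the function
   names are hypothetical and are not defined by this document; the
   sketch assumes the caller has already collected the data reachable
   Push Directory servers for one Data Label from the ESADI link state
   database.

```python
from typing import Iterable, NamedTuple


class PushServer(NamedTuple):
    """Hypothetical record for one Push Directory server in a Data Label."""
    priority: int      # 8-bit priority; larger value means higher priority
    system_id: bytes   # 6-byte IS-IS System ID, used only to break ties


def sort_key(server: PushServer) -> tuple:
    # Both fields compare as unsigned integers; larger is higher priority.
    return (server.priority, int.from_bytes(server.system_id, "big"))


def position(me: PushServer, reachable: Iterable[PushServer]) -> int:
    """Return K, the 1-based rank of `me` (1 is the highest priority)."""
    ordered = sorted(set(reachable) | {me}, key=sort_key, reverse=True)
    return ordered.index(me) + 1


def activate_condition(me: PushServer, reachable: Iterable[PushServer],
                       copies: int = 2) -> bool:
    # K <= N, where N is the configured number of copies (default 2).
    return position(me, reachable) <= copies


def pacify_condition(me: PushServer, reachable: Iterable[PushServer],
                     copies: int = 2) -> bool:
    # K > N: the server should not be pushing directory data.
    return position(me, reachable) > copies
```

   With the default of 2 copies and three servers visible, the two
   highest ranked servers meet the Activate Condition and the third
   meets the Pacify Condition.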

2.3.3 State Transition Diagram and Table

   The state transition table is as follows:

    Event ||  Down |Passive   |Active  |Completing|Complete|Reducing|
    ------++-------+----------+--------+----------+--------+--------+
      1   ||Passive|Passive   |Active  |Completing|Complete|Reducing|
      2   || Down  | Down     |Passive |Passive   |Reducing|Reducing|
      3   || Down  |Active    |Active  |Active    |Reducing|Reducing|
      4   || Down  |Passive   |Passive |Passive   |Reducing|Reducing|
      5   || Down  |Completing|Complete|Completing|Complete|Complete|
      6   || Down  |Passive   |Active  |Active    |Reducing|Reducing|
      7   || Down  |Passive   |Active  |Complete  |Complete|Active  |

   The above state table is equivalent to the following transition
   diagram:

      +-----------+
      |   Down    |<---------+
      +-----------+          |
       |1  ^    | 3,4,5,6,7  |
       |   |    +------------+
       V   |2
      +-----------+
      |  Passive  |<-----------------------
      +-----------+      ^   ^           ^
       |5  |3  |1,4,6,7  |   |           |
       |   |   +---------+   |           |
       |   V                 |2,4        |
       | +---------------------+         |
       | |       Active        |<--+     |
       | +---------------------+   |     |
       |   |5  ^   |1,3,6,7  ^     |     |
       |   |   |   |         |     |     |
       |   |   |   +---------+     |     |
       |   |   |                   |     |
       V   V   |3,6                |     |
      +--------------+             |     |
      |  Completing  |-------------------+
      +--------------+ 2,4         |
       |7  |1,5  ^                 |
       |   |     |                 |
       |   +-----+                 |
       V                           |7
      +-------------+          +----------------+
      |  Complete   |--------->|    Reducing    |<--+
      +-------------+ 2,3,4,6  +----------------+   |
       |1,5,7 ^   ^              |5  |1,2,3,4,6     |
       |      |   |              |   |              |
       +------+   +--------------+   +--------------+

                Figure 1. Push Server State Diagram

2.4 Additional Push Details

   Push Directory mappings can be distinguished from other data
   distributed through ESADI because mappings are distributed only with
   the Interface Addresses APPsub-TLV [IA] and are flagged as being Push
   Directory data.
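
   As an illustration only, the state transition table in Section 2.3.3
   can be encoded as a lookup structure. The sketch below is in Python;
   the names STATES, TABLE, and next_state are hypothetical and not part
   of this document. The entries are copied from the table, with events
   numbered as in Section 2.3.2.

```python
# Column order matches the state transition table in Section 2.3.3.
STATES = ("Down", "Passive", "Active", "Completing", "Complete", "Reducing")

# TABLE[event] lists the next state for each current state, in STATES order.
TABLE = {
    1: ("Passive", "Passive", "Active", "Completing", "Complete", "Reducing"),
    2: ("Down", "Down", "Passive", "Passive", "Reducing", "Reducing"),
    3: ("Down", "Active", "Active", "Active", "Reducing", "Reducing"),
    4: ("Down", "Passive", "Passive", "Passive", "Reducing", "Reducing"),
    5: ("Down", "Completing", "Complete", "Completing", "Complete",
        "Complete"),
    6: ("Down", "Passive", "Active", "Active", "Reducing", "Reducing"),
    7: ("Down", "Passive", "Active", "Complete", "Complete", "Active"),
}


def next_state(state: str, event: int) -> str:
    """Look up the state entered when `event` occurs in `state`."""
    return TABLE[event][STATES.index(state)]
```

   A per Data Label implementation would additionally record the
   State-Transition-Time at each call so that the Time Condition
   (event 7) can be evaluated.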

   TRILL switches, whether or not they are a Push Directory server, MAY
   continue to advertise any locally learned MAC attachment information
   in ESADI [RFCesadi] using the Reachable MAC Addresses TLV [RFC6165].

   However, if a Data Label is being served by complete Push Directory
   servers, advertising such locally learned MAC attachment generally
   SHOULD NOT be done as it would not add anything and would just waste
   bandwidth and ESADI link state space. An exception might be when a
   TRILL switch learns local MAC connectivity and that information
   appears to be missing from the directory mapping.

   Because a Push Directory server may need to advertise interest in
   Data Labels even if it does not want to receive end station
   multi-destination data in those Data Labels, the No Data (NOD) flag
   bit is provided as specified in Section 8.3.

   When a Push Directory server is no longer data reachable [RFCclear],
   TRILL switches MUST ignore any Push Directory data from that server
   because it is no longer being updated and may be stale.

   The nature of dynamic distributed asynchronous systems is such that
   it is impossible for a TRILL switch receiving Push Directory
   information to be absolutely certain that it has complete
   information. However, it can obtain a reasonable assurance of
   complete information by requiring two conditions to be met:

   1. The PSH and COP bits are on in the ESADI zero fragment from the
      server for the relevant Data Label.

   2. It has had continuous data connectivity to the server for the
      larger of the client's and the server's CSNP times.

   Condition 2 is necessary because a client TRILL switch might be just
   coming up and receive an ESADI LSP meeting the requirement in
   condition 1 above but have not yet received all of the ESADI LSP
   fragments from the Push Directory server.
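
   As an illustration only, the two conditions above can be sketched as
   a single check in Python. The function and parameter names are
   hypothetical, and the sketch assumes that the connectivity duration
   and both CSNP times are expressed in the same unit (seconds here).

```python
def push_data_reasonably_complete(psh: bool, cop: bool,
                                  connected_seconds: float,
                                  client_csnp_time: float,
                                  server_csnp_time: float) -> bool:
    """Apply conditions 1 and 2 for one server and one Data Label."""
    # Condition 1: PSH and COP are both set in the server's ESADI
    # fragment zero for the relevant Data Label.
    flags_ok = psh and cop
    # Condition 2: continuous data connectivity for at least the larger
    # of the client's and the server's CSNP times.
    required = max(client_csnp_time, server_csnp_time)
    return flags_ok and connected_seconds >= required
```

   For example, with a client CSNP time of 30 and a server CSNP time of
   40, the check fails after 35 seconds of connectivity and succeeds
   after 40.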

   There may be conflicts between mapping information from different
   Push Directory servers or conflicts between locally learned
   information and information received from a Push Directory server. In
   case of such conflicts, information with a higher confidence value
   [RFC6325] is preferred over information with a lower confidence. In
   case of equal confidence, Push Directory information is preferred to
   locally learned information and if information from Push Directory
   servers conflicts, the information from the higher priority Push
   Directory server is preferred.

2.5 Primary to Secondary Server Push Service

   A secondary Push or Pull Directory server is one that obtains its
   data from a primary directory server. Other techniques MAY be used
   but, by default, this data transfer occurs through the primary server
   acting as a Push Directory server for the Data Labels involved while
   the secondary directory server takes the pushed data it receives from
   the highest priority Push Directory server and re-originates it. Such
   a secondary server may be a Push Directory server or a Pull Directory
   server or both for any particular Data Label.

3. Pull Model Directory Assistance Mechanisms

   In the Pull Model [RFC7067], a TRILL switch (RBridge) pulls directory
   information from an appropriate Directory Server when needed.

   Pull Directory servers for a particular Data Label X are found by
   looking in the core TRILL IS-IS link state database for data
   reachable TRILL switches that advertise themselves by having the Pull
   Directory flag (PUL) on in their Interested VLANs or Interested
   Labels sub-TLV [RFC6326bis] for that Data Label.
   If multiple such TRILL switches indicate that they are Pull Directory
   Servers for a particular Data Label, pull requests can be sent to any
   one or more of them but it is RECOMMENDED that pull requests be
   preferentially sent to the server or servers that are lower cost from
   the requesting TRILL switch.

   Pull Directory requests are sent by enclosing them in an RBridge
   Channel [Channel] message using the Pull Directory channel protocol
   number (see Section 8.2). Responses are returned in an RBridge
   Channel message using the same channel protocol number. See Section
   3.2 for Query and Response message formats. For cache consistency or
   notification purposes, Pull Directory servers can send unsolicited
   Update messages to client TRILL switches that they believe may be
   holding old data and those clients can acknowledge such updates, as
   described in Section 3.3. All these messages have a common header as
   described in Section 3.1. Error returns can be sent for queries or
   updates as described in Section 3.5.

   The requests to Pull Directory Servers are typically derived from
   ingressed ARP [RFC826], ND [RFC4861], or RARP [RFC903] messages, or
   data frames with unknown unicast destination MAC addresses,
   intercepted by an ingress TRILL switch as described in Section 4.

   Pull Directory responses include an amount of time for which the
   response should be considered valid. This includes negative responses
   that indicate no data is available. Thus both positive responses with
   data and negative responses can be cached and used to locally handle
   ARP, ND, RARP, or unknown destination MAC frames, until the responses
   expire. If information previously pulled is about to expire, a TRILL
   switch MAY try to refresh it by issuing a new pull request but, to
   avoid unnecessary requests, SHOULD NOT do so if it has not been
   recently used.
   The validity timer of cached Pull Directory responses is NOT reset or
   extended merely because that cache entry is used.

3.1 Pull Directory Message Common Format

   All Pull Directory messages are transmitted as the payload of RBridge
   Channel messages. All Pull Directory messages are formatted as
   described below starting with the following common 8-byte header:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Ver  | Type  | Flags | Count |      Err      |    SubErr     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Sequence Number                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Type Specific Payload - variable length
   +-+-+- ...

   Ver: Version of the Pull Directory protocol as an unsigned
      integer. Version zero is specified in this document.

   Type: The Pull Directory message type as follows:

         Type  Section  Name
         ----  -------  --------
           0    3.2.1   Query
           1    3.2.2   Response
           2    3.3.1   Update
           3    3.3.2   Acknowledge
          4-15    -     Reserved

   Flags: Four flag bits whose meaning depends on the Pull Directory
      message Type. Flags whose meaning is not specified are
      reserved, MUST be sent as zero, and ignored on receipt.

   Count: Most Pull Directory message types specified herein have
      zero or more occurrences of a Record as part of the type
      specific payload. The Count field is the number of occurrences
      of that Record as an unsigned integer. For Pull Directory
      messages not structured with such occurrences, this field MUST
      be sent as zero and ignored on receipt.

   Err, SubErr: The error and suberror fields are only used in
      messages that are in the nature of replies or acknowledgements.
      In messages that are requests or updates, these fields MUST be
      sent as zero and ignored on receipt.
      The meaning of values in the Err field depends on the Pull
      Directory message Type but in all cases the value zero means no
      error. The meaning of values in the SubErr field depends on both
      the message Type and on the value of the Err field but in all
      cases, a zero SubErr field is allowed and provides no additional
      information beyond the value of the Err field.

   Sequence Number: An opaque 32-bit quantity set by the TRILL switch
      sending a request or other unsolicited message and returned in
      any reply or acknowledgement. It is used to match up responses
      with the message to which they respond.

   Type Specific Payload: Format depends on the Pull Directory
      message Type.

3.2 Pull Directory Query and Response Messages

3.2.1 Pull Directory Query Message Format

   A Pull Directory Query message is sent as the Channel Protocol
   specific content of an RBridge Channel message [Channel] TRILL Data
   packet or as a native RBridge Channel data frame (see Section 3.4).
   The Data Label of the packet is the Data Label in which the query is
   being made. The priority of the channel message is a mapping of the
   priority of the frame being ingressed that caused the query with the
   default mapping depending, per Data Label, on the strategy (see
   Section 6) or a configured priority for generated queries. The
   Channel Protocol specific data is formatted as a header and a
   sequence of zero or more QUERY Records as follows:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Ver  | Type  | Flags | Count |      Err      |    SubErr     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Sequence Number                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  QUERY 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-...
   |  QUERY 2
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-...
   |  ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-...
   |  QUERY K
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-...

   Ver, Sequence Number: See 3.1.

   Type: 1 for Query. Queries received by a TRILL switch that is not
      a Pull Directory result in an error response (see Section 3.5)
      unless inhibited by rate limiting.

   Flags, Err, and SubErr: MUST be sent as zero and ignored on
      receipt.

   Count: Number of QUERY Records present. A Query message Count of
      zero is explicitly allowed, for the purpose of pinging a Pull
      Directory server to see if it is responding. On receipt of such
      an empty Query message, a Response message that also has a
      Count of zero is sent unless inhibited by rate limiting.

   QUERY: Each QUERY Record within a Pull Directory Query message is
      formatted as follows:

           0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
         +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
         |         SIZE          |   RESV    |   QTYPE   |
         +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
         If QTYPE = 1
         +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
         |                      AFN                      |
         +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
         |  Query address ...
         +--+--+--+--+--+--+--+--+--+--+--...
         If QTYPE = 2, 3, 4, or 5
         +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
         |  Query frame ...
         +--+--+--+--+--+--+--+--+--+--+--...

      SIZE: Size of the QUERY record in bytes as an unsigned integer
         starting after the SIZE field and following byte. Thus the
         minimum legal value is 2. A value of SIZE less than 2
         indicates a malformed QUERY record. The QUERY record with
         the illegal SIZE value and any subsequent QUERY records MUST
         be ignored and the entire Query message MAY be ignored.

      RESV: A block of reserved bits. MUST be sent as zero and
         ignored on receipt.

      QTYPE: There are several types of QUERY Records currently
         defined in two classes as follows: (1) a QUERY Record that
         provides an explicit address and asks for all addresses for
         the interface specified by the query address and (2) a QUERY
         Record that includes a frame. The fields of each are
         specified below. Values of QTYPE are as follows:

            QTYPE   Description
            -----   -----------
              0     reserved
              1     address query
              2     ARP query frame
              3     ND query frame
              4     RARP query frame
              5     Unknown unicast MAC query frame
             6-14   assignable by IETF Review
              15    reserved

         AFN: Address Family Number of the query address.

         Address Query: The query is asking for any other addresses,
            and the nickname of the TRILL switch from which they are
            reachable, that correspond to the same interface, within
            the data label of the query. Typically that would be
            either (1) a MAC address with the querying TRILL switch
            primarily interested in the TRILL switch by which that
            MAC address is reachable, or (2) an IP address with the
            querying TRILL switch interested in the corresponding MAC
            address and the TRILL switch by which that MAC address is
            reachable. But it could be some other address type.

         Query Frame: Where a QUERY Record is the result of an ARP,
            ND, RARP, or unknown unicast MAC destination address, the
            ingress TRILL switch MAY send the frame to a Pull
            Directory Server if the frame is small enough that the
            resulting Query message fits into a TRILL Data packet
            within the campus MTU.

   If no response is received to a Pull Directory Query message within a
   timeout configurable in milliseconds that defaults to 200, the Query
   message should be re-transmitted with the same Sequence Number up to
   a configurable number of times that defaults to three.
   If there are multiple QUERY Records in a Query message, responses
   can be received to various subsets of these QUERY Records before the
   timeout. In that case, the remaining unanswered QUERY Records should
   be re-sent in a new Query message with a new Sequence Number. If a
   TRILL switch is not capable of handling partial responses to queries
   with multiple QUERY Records, it MUST NOT send a Query message with
   more than one QUERY Record in it.

   See Section 3.5 for a discussion of how Query message errors are
   handled.

3.2.2 Pull Directory Response Format

   Pull Directory Response messages are sent as the Channel Protocol
   specific content of an RBridge Channel message [Channel] TRILL Data
   packet or as a native RBridge Channel data frame (see Section 3.4).
   Responses are sent with the same Data Label and priority as the Query
   message to which they correspond except that the Response message
   priority is limited to be not more than a configured value. This
   priority limit is configurable per TRILL switch and defaults to
   priority 6. Pull Directory Response messages SHOULD NOT be sent with
   priority 7 as that priority SHOULD be reserved for messages critical
   to network connectivity.

   The RBridge Channel protocol specific data format is as follows:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Ver  | Type  | Flags | Count |      Err      |    SubErr     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Sequence Number                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  RESPONSE 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-...
   |  RESPONSE 2
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-...
   |  ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-...
   |  RESPONSE K
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-...

   Ver, Sequence Number: As specified in Section 3.1.

   Type: 2 = Response.

   Flags: MUST be sent as zero and ignored on receipt.

   Count: Count is the number of RESPONSE Records present in the
      Response message.

   Err, SubErr: A two-part error code. Zero unless there was an error
      in the Query message, for which case see Section 3.5.

   RESPONSE: Each RESPONSE Record within a Pull Directory Response
      message is formatted as follows:

        0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
      +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
      |         SIZE          |OV|   RESV    | Index  |
      +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
      |                   Lifetime                    |
      +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
      |  Response Data ...
      +--+--+--+--+--+--+--+--+--+--+--...

      SIZE: Size of the RESPONSE Record in bytes starting after the
         SIZE field and following byte. Thus the minimum value of
         SIZE is 2. If SIZE is less than 2, that RESPONSE Record and
         all subsequent RESPONSE Records in the Response message MUST
         be ignored and the entire Response message MAY be ignored.

      OV: The overflow flag. Indicates, as described below, that
         there was too much Response Data to include in one Response
         message.

      RESV: Four reserved bits that MUST be sent as zero and ignored
         on receipt.

      Index: The relative index of the QUERY Record in the Query
         message to which this RESPONSE Record corresponds. The index
         will always be one for Query messages containing a single
         QUERY Record. If the Index is larger than the Count was in
         the corresponding Query, that RESPONSE Record MUST be
         ignored and subsequent RESPONSE Records or the entire
         Response message MAY be ignored.

      Lifetime: The length of time for which the response should be
         considered valid in units of 200 milliseconds except that
         the values zero and 2**16-1 are special. If zero, the
         response can only be used for the particular query from
         which it resulted and MUST NOT be cached.
         If 2**16-1, the response MAY be kept indefinitely but not
         after the Pull Directory server goes down or becomes
         unreachable. The maximum definite time that can be expressed
         is a little over 3.6 hours.

      Response Data: There are various types of RESPONSE Records.
         - If the Err field is non-zero, then the Response Data is a
           copy of the corresponding QUERY Record data, that is,
           either an AFN followed by an address or a query frame.
           See Section 3.5 for additional information on errors.
         - If the Err field is zero and the corresponding QUERY
           Record was an address query, then the Response Data is
           the contents of an Interface Addresses APPsub-TLV [IA].
           The maximum size of such contents is 253 bytes in the
           case when SIZE is 255.
         - If the Err field is zero and the corresponding QUERY
           Record was a frame query, then the Response Data consists
           of the response frame for ARP, ND, or RARP and a copy of
           the frame for unknown unicast destination MAC.

   Multiple RESPONSE Records can appear in a Response message with the
   same Index if the answer to a QUERY Record consists of multiple
   Interface Addresses APPsub-TLV contents. This would be necessary if,
   for example, a MAC address within a Data Label appears to be
   reachable by multiple TRILL switches. However, all RESPONSE Records
   to any particular QUERY Record MUST occur in the same Response
   message. If a Pull Directory holds more mappings for a queried
   address than will fit into one Response message, it selects which to
   include by some method outside the scope of this document and sets
   the overflow flag (OV) in all of the RESPONSE Records responding to
   that query address.

   See Section 3.5 for a discussion of how errors are handled.
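
   The RESPONSE Record layout above can be illustrated with a short
   parsing sketch. It is not part of the protocol definition. The
   function names are hypothetical, and the bit positions of OV and
   Index within the second byte (OV as the top bit, Index as the low
   three bits) are an assumption read off the record diagram and the
   "Four reserved bits" description.

```python
import struct

LIFETIME_UNIT_MS = 200        # Lifetime is in units of 200 milliseconds
LIFETIME_NO_CACHE = 0         # usable for the originating query only
LIFETIME_INDEFINITE = 0xFFFF  # 2**16-1: keep while the server is reachable


def parse_response_records(payload: bytes):
    """Parse consecutive RESPONSE Records (hypothetical helper).

    Parsing stops at the first malformed record, so that record and
    all subsequent records are ignored.
    """
    records = []
    offset = 0
    while offset + 2 <= len(payload):
        size = payload[offset]       # bytes that follow the first two bytes
        flags = payload[offset + 1]  # assumed layout: OV | RESV(4) | Index(3)
        if size < 2 or offset + 2 + size > len(payload):
            break
        (lifetime,) = struct.unpack_from("!H", payload, offset + 2)
        records.append({
            "overflow": bool(flags & 0x80),
            "index": flags & 0x07,
            "lifetime": lifetime,
            "data": payload[offset + 4:offset + 2 + size],
        })
        offset += 2 + size
    return records


def lifetime_ms(lifetime: int):
    """Return the validity time in milliseconds, or None for the
    two special values (zero and 2**16-1)."""
    if lifetime in (LIFETIME_NO_CACHE, LIFETIME_INDEFINITE):
        return None
    return lifetime * LIFETIME_UNIT_MS
```

   The largest definite Lifetime is 65534 units, that is, 65534 * 200
   ms = 13,106,800 ms, or about 3.64 hours, which matches the "a
   little over 3.6 hours" figure above.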

3.3 Cache Consistency

   A Pull Directory MUST take action to minimize the amount of time that
   a TRILL switch will continue to use stale information from that Pull
   Directory by sending Update messages.

   A Pull Directory server MUST maintain one of the following three sets
   of records, in order of increasing specificity. Retaining more
   specific records, such as that given in item 3 below, minimizes
   Spontaneous Update messages sent to update pull client TRILL switch
   caches but increases the record keeping burden on the Pull Directory
   server. Retaining less specific records, such as that given in item
   1, will generally increase the volume and overhead due to Spontaneous
   Update messages and due to unnecessarily invalidating cached
   information, but will still maintain consistency and will reduce the
   record keeping burden on the Pull Directory server. In all cases,
   there may still be brief periods of time when directory information
   has changed but cached information at pull clients has not yet been
   updated or expunged.

   1. An overall record per Data Label of when the last positive
      response data sent will expire at some requester and when the
      last negative response will expire at some requester, assuming
      those requesters cached the response.

   2. For each unit of data (IA APPsub-TLV Address Set [IA]) held by
      the server and each address about which a negative response
      was sent, when the last response sent with that positive
      response data or negative response will expire at a requester,
      assuming the requester cached the response.

   3. For each unit of data held by the server (IA APPsub-TLV Address
      Set [IA]) and each address about which a negative response was
      sent, a list of TRILL switches that were sent that data as a
      positive response or sent a negative response for the address,
      and the expected time to expiration for that data or address at
      each such TRILL switch, assuming the requester cached the
      response.

   A Pull Directory server may have a limit as to how many TRILL
   switches for which it can maintain expiry information by method 3
   above or how many data units or addresses it can maintain expiry
   information for by method 2. If such limits are exceeded, it MUST
   transition to a lower numbered strategy but, in all cases, MUST
   support, at a minimum, method 1.

   When data at a Pull Directory changes or is deleted or data is added
   and there may be unexpired stale information at a requesting TRILL
   switch, the Pull Directory MUST send an Update message as discussed
   below. The sending of such an Update message MAY be delayed by a
   configurable number of milliseconds that defaults to 50 to await
   other possible changes that could be included in the same Update.

   If method 1, the crudest method, is being followed, then when any
   Pull Directory information in a Data Label is changed or deleted and
   there are outstanding cached positive data response(s), an all-
   addresses flush positive Update message is flooded within that Data
   Label as an RBridge Channel message with an Inner.MacDA of All-
   Egress-RBridges. And if data is added and there are outstanding
   cached negative responses, an all-addresses flush negative message is
   similarly flooded. "All-addresses" is indicated by the Count field
   being zero in an Update message.
   On receiving an all-addresses flooded flush positive Update from a
   Pull Directory server it has used, indicated by the F and P bits
   being one and the Count being zero, a TRILL switch discards all
   cached data responses it has for that Data Label. Similarly, on
   receiving an all-addresses flush negative Update, indicated by the F
   and N bits being one and the Count being zero, it discards all cached
   negative replies for that Data Label. A combined flush positive and
   negative can be flooded by having all of the F, P, and N bits set to
   one, resulting in the discard of all positive and negative cached
   information for the Data Label.

   If method 2 is being followed, then a Pull Directory server floods
   address-specific positive Update messages when data that might be
   cached by a querying TRILL switch is changed or deleted and floods
   address-specific negative Update messages when such information is
   added. Such messages are similar to the method 1 flooded flush Update
   messages and are also sent as RBridge Channel messages with an
   Inner.MacDA of All-Egress-RBridges. However, the Count field will be
   non-zero and either the P or N bit, but not both, will be one. On
   receiving such an address-specific unsolicited update, if it is
   positive, the addresses in the RESPONSE Records in the unsolicited
   update are compared to the addresses about which the receiving TRILL
   switch is holding cached positive information from that server and,
   if they match, the cached information is updated. On receiving an
   address-specific unsolicited update negative message, the addresses
   in the RESPONSE Records in the unsolicited update are compared to the
   addresses about which the receiving TRILL switch is holding cached
   negative information from that server and, if they match, the cached
   negative information is updated.
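
   The receiver side of the flooded flush and address-specific positive
   Updates described above might be sketched as follows. This is an
   illustrative sketch only: the class and method names are
   hypothetical, the F, P, and N bits are passed in as booleans rather
   than decoded from the Flags field, and the handling of address-
   specific negative Updates is left out because the text does not say
   what "updated" means for a negative entry.

```python
class PullClientCache:
    """Responses cached from one Pull Directory server (hypothetical)."""

    def __init__(self):
        # Both collections are keyed by (data_label, address).
        self.positive = {}     # key -> cached response data
        self.negative = set()  # keys that drew a negative response

    def apply_update(self, data_label, flood, positive, negative, records):
        """Apply one Update message.

        flood, positive, and negative mirror the F, P, and N bits.
        records is a list of (address, data) pairs taken from the
        RESPONSE Records; it is empty when Count is zero.
        """
        if not records:
            # Count == 0 means "all addresses". The text describes
            # this flush only for flooded Updates, so F must be set.
            if not flood:
                return
            if positive:
                self.positive = {k: v for k, v in self.positive.items()
                                 if k[0] != data_label}
            if negative:
                self.negative = {k for k in self.negative
                                 if k[0] != data_label}
            return
        if positive:
            # Address-specific positive Update: only entries that are
            # already cached are refreshed; nothing new is learned.
            for address, data in records:
                key = (data_label, address)
                if key in self.positive:
                    self.positive[key] = data
```

   Note that an address-specific Update never adds an entry: it only
   refreshes information the TRILL switch had already cached.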

   If method 3 is being followed, the same sort of unsolicited update
   messages are sent as with method 2 above except they are not normally
   flooded but unicast only to the specific TRILL switches the directory
   server believes may be holding the cached positive or negative
   information that needs updating. However, a Pull Directory server MAY
   flood the unsolicited update under method 3, for example if it
   determines that a sufficiently large fraction of the TRILL switches
   in some Data Label are requesters that need to be updated.

   A Pull Directory server tracking cached information with method 3
   MUST NOT clear the indication that it needs to update cached
   information at a querying TRILL switch until it has sent an Update
   message and received a corresponding Acknowledge message or it has
   sent a configurable number of updates at a configurable interval,
   which default to 3 updates 200 milliseconds apart.

   A Pull Directory server tracking cached information with methods 2 or
   1 SHOULD NOT clear the indication that it needs to update cached
   information until it has sent an Update message and received a
   corresponding Acknowledge message from all of its ESADI neighbors or
   it has sent a configurable number of updates at a configurable
   interval, which default to 3 updates 200 milliseconds apart.

3.3.1 Update Message Format

   An Update message is formatted as a Response message except that the
   Type field in the message header is a different value.

   Update messages are initiated by a Pull Directory server. The
   Sequence Number space used is controlled by the originating Pull
   Directory server and is different from the Sequence Number space used
   in a Query and the corresponding Response, which is controlled by the
   querying TRILL switch.
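
   The rule at the end of Section 3.3 for when a server may clear its
   "needs update" indication can be sketched as a small piece of per-
   client state. This is a minimal sketch under assumptions: the class
   name, field names, and the use of a caller-supplied millisecond clock
   are all hypothetical, and only the defaults (3 updates, 200
   milliseconds apart) come from the text.

```python
from dataclasses import dataclass


@dataclass
class PendingUpdate:
    """Update owed to one client TRILL switch (hypothetical)."""
    max_sends: int = 3       # configurable number of updates
    interval_ms: int = 200   # configurable interval between updates
    sends: int = 0
    acked: bool = False
    last_send_ms: int = 0

    def due(self, now_ms: int) -> bool:
        """True if another copy of the Update should be sent now."""
        if self.acked or self.sends >= self.max_sends:
            return False
        if self.sends == 0:
            return True
        return now_ms - self.last_send_ms >= self.interval_ms

    def record_send(self, now_ms: int) -> None:
        self.sends += 1
        self.last_send_ms = now_ms

    def can_clear(self) -> bool:
        """The indication may be cleared once an Acknowledge arrived
        or the configured number of updates has been sent."""
        return self.acked or self.sends >= self.max_sends
```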

   The Flags field of the message header for an Update message is as
   follows:

      +---+---+---+---+
      | F | P | N | R |
      +---+---+---+---+

   F: The Flood bit. If zero, the Update is unicast. If F=1, it is
      multicast to All-Egress-RBridges.

   P, N: Flags used to indicate positive or negative Update messages.
      P=1 indicates positive. N=1 indicates negative. Both may be 1 for
      a flooded all-addresses Update.

   R: Reserved. MUST be sent as zero and ignored on receipt.

3.3.2 Acknowledge Message Format

   An Acknowledge message is sent in response to an Update to confirm
   receipt or indicate an error unless response is inhibited by rate
   limiting. It is also formatted as a Response message.

   If there are no errors in the processing of an Update message, the
   message is essentially echoed back with the Type changed to
   Acknowledge.

   If there was an overall or header error in an Update message, it is
   echoed back as an Acknowledge message with the Err and SubErr fields
   set appropriately (see Section 3.5).

   If there is a RESPONSE Record level error in an Update message, one
   or more Acknowledge messages may be returned as indicated in Section
   3.5.

3.4 Pull Directory Hosted on an End Station

   Optionally, a Pull Directory actually hosted on an end station MAY be
   supported. In that case, a TRILL switch must proxy for the end
   station and advertise itself as a Pull Directory server.

   When the proxy TRILL switch receives a Query message, it modifies the
   inter-RBridge Channel message received into a native RBridge Channel
   message and forwards it to that end station. Later, when it receives
   one or more responses from that end station by native RBridge Channel
   messages, it modifies them into inter-RBridge Channel messages and
   forwards them to the source TRILL switch of the original Query
   message.
   Similarly, an Update from the end station is forwarded to client
   TRILL switches and acknowledgements from those TRILL switches are
   returned to the end station by the proxy. Because native RBridge
   Channel messages have no TRILL Header and are addressed by MAC
   address, as opposed to inter-RBridge Channel messages that are TRILL
   Data packets and are addressed by nickname, nickname information must
   be added to the native RBridge Channel version of Pull Directory
   messages.

   The native Pull Directory RBridge Channel messages use the same
   Channel protocol number as do the inter-RBridge Pull Directory
   RBridge Channel messages. The native messages SHOULD be sent with an
   Outer.VLAN tag that gives the priority of each message, which is the
   priority of the original inter-RBridge request packet. The Outer.VLAN
   ID used is the Designated VLAN on the link to the end station. Since
   there is no TRILL Header or inner Data Label for native RBridge
   Channel messages, that information is added to the header.

   The native RBridge Channel message protocol dependent data Pull
   Directory message is the same as for inter-RBridge Channel messages
   except that the 8-byte header described in Section 3.1 is expanded to
   14 or 18 bytes as follows:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Ver  | Type  | Flags | Count |      Err      |    SubErr     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Sequence Number                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Nickname (2 bytes)                                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+...+-+
   |  Data Label ... (4 or 8 bytes)                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+...+-+
   |  Type Specific Payload - variable length
   +-+-+- ...

   Fields not described below are as in Section 3.1.

   Data Label: The Data Label that normally appears right after the
      Inner.MacSA of an RBridge Channel Pull Directory message
      appears here in the native RBridge Channel message version.
      This might appear in a Query message, to be reflected in a
      Response message, or it might appear in an Update message, to
      be reflected in an Acknowledge message.

   Nickname: The nickname of the TRILL switch that is communicating
      with the end station Pull Directory. Usually this is a remote
      TRILL switch but it could be the TRILL switch to which the end
      station is attached. The proxy copies this from the ingress
      nickname when mapping a Query or Acknowledge message to native
      form. It also takes this from a native Response or Update to be
      used as the egress of the inter-RBridge form of the message
      unless it is a flooded Update, in which case a distribution tree
      is used.

3.5 Pull Directory Message Errors

   A non-zero Err field in the Pull Directory message header indicates
   an error message.

   If there is an error that applies to an entire Query message or its
   header, as indicated by the range of the value of the Err field, then
   the QUERY Records in the request are just echoed back in the RESPONSE
   Records of the Response message but expanded with a zero Lifetime and
   the insertion of the Index field. If there is an error that applies
   to an entire Update message or its header, then the RESPONSE Records
   in the update, if any, are echoed back in the Acknowledge message.

   If errors occur at the QUERY Record level for a Query message, they
   MUST be reported in a Response message separate from the results of
   any successful non-erroneous QUERY Records. If multiple QUERY Records
   in a Query message have different errors, they MUST be reported in
   separate Response messages.
   If multiple QUERY Records in a Query message have the same error,
   this error response MAY be reported in one Response message. In an
   error Response message, the QUERY Record or records being responded
   to appear, expanded by the Lifetime for which the server thinks the
   error might persist and with their Index inserted, as the RESPONSE
   Record or records.

   If errors occur at the RESPONSE Record level for an Update message,
   they MUST be reported in an Acknowledge message separate from the
   acknowledgement of any non-erroneous RESPONSE Records. If multiple
   RESPONSE Records in an Update have different errors, they MUST be
   reported in separate Acknowledge messages. If multiple RESPONSE
   Records in an Update message have the same error, this error response
   MAY be reported in one Acknowledge message. In an error Acknowledge
   message, the RESPONSE Record or records being responded to appear,
   expanded by the time for which the server thinks the error might
   persist and with their Index inserted, as a RESPONSE Record or
   records.

   Err values 1 through 127 are available for encoding Query or Update
   message level errors. Err values 128 through 254 are available for
   encoding QUERY or RESPONSE Record level errors. The SubErr field is
   available for providing more detail on errors. The meaning of a
   SubErr field value depends on the value of the Err field.
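
   The Err ranges and the "separate message per distinct error" rule
   above can be sketched as follows. The function names are
   hypothetical. Grouping on the (Err, SubErr) pair is an assumption: it
   follows from the message header carrying a single Err and SubErr, but
   the text itself speaks only of "different errors".

```python
from collections import defaultdict


def classify_err(err: int) -> str:
    """Classify an Err value by the ranges given in the text."""
    if not 0 <= err <= 255:
        raise ValueError("Err is a one-byte field")
    if err == 0:
        return "none"
    if err <= 127:
        return "message"   # Query or Update message level error
    if err <= 254:
        return "record"    # QUERY or RESPONSE Record level error
    return "reserved"


def group_record_errors(failures):
    """Group failed records so each distinct error gets one message.

    failures is an iterable of (index, err, suberr) tuples. The result
    maps each (err, suberr) pair to the list of record indexes that
    can share one error Response or Acknowledge message.
    """
    groups = defaultdict(list)
    for index, err, suberr in failures:
        if classify_err(err) != "record":
            raise ValueError("not a record level error")
        groups[(err, suberr)].append(index)
    return dict(groups)
```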

        Err      Meaning
        ---      -------
          0      (no error)

          1      Unknown or reserved Query message field value
          2      Request data too short
          3      Unknown or reserved Update message field value
          4      Update data too short
        5-127    (Available for allocation by IETF Review)

        128      Unknown or reserved QUERY Record field value
        129      Address not found
        130      Unknown or reserved RESPONSE Record field value
       131-254   (Available for allocation by IETF Review)

        255      Reserved

   The following sub-errors are specified under error codes 1 and 3:

      SubErr   Field with Error
      ------   ----------------
         0     Unspecified
         1     Unknown V field value
         2     Reserved T field value
         3     Zero sequence number in request
       4-254   (Available for allocation by Expert Review)
        255    Reserved

   The following sub-errors are specified under error codes 128 and 130:

      SubErr   Field with Error
      ------   ----------------
         0     Unspecified
         1     Unknown AFN field value
         2     Unknown or Reserved TYPE field value
         3     Invalid or inconsistent SIZE field value
       4-254   (Available for allocation by Expert Review)
        255    Reserved

   More TBD

3.6 Additional Pull Details

   If a TRILL switch notices that a Pull Directory server is no longer
   data reachable [RFCclear], it MUST promptly discard all pull
   responses it is retaining from that server as it can no longer
   receive cache consistency update messages from the server.

   Because a Pull Directory server may need to advertise interest in
   Data Labels even though it does not want to receive end station data
   in those Data Labels, the No Data (NOD) flag bit is provided as
   specified in Section 8.3. For example, an RBridge hosting a Pull
   Directory may be a secondary directory that wants to receive its data
   from a primary Push Directory server but has no interest in
   receiving multicast traffic from end stations.

4. Events That May Cause Directory Use

   A TRILL switch can consult Directory information whenever it wants,
   by (1) searching through information that has been retained after
   being pushed to it or pulled by it or (2) requesting information
   from a Pull Directory. However, the following are expected to be the
   most common circumstances leading to directory information use. All
   of these are cases of ingressing (or originating) a native frame.

   ARP requests and replies normally have the broadcast address in their
   MAC destination address and are normally treated the same way as any
   broadcast Ethernet frame. A directory assisted RBridge MUST intercept
   ARP broadcast, ND multicast, and unknown unicast destination MAC
   address native frames. It SHOULD also intercept RARP and, if complete
   directory information is available, forged source MAC frames.

   Support for each of the cases below is separately optional.

4.1 Forged Native Frame Ingress

   End stations can forge the source MAC and/or IP address in a native
   frame that an edge TRILL switch receives for ingress in some
   particular Data Label. If there is complete Directory information as
   to what end stations should be reachable by an egress TRILL switch,
   frames with forged source addresses SHOULD be discarded. If such
   frames are discarded, then none of the special processing in the
   remaining subsections of this Section 4 occurs and MAC address
   learning (see [RFC6325] Section 4.8) SHOULD NOT occur. ("SHOULD NOT"
   is chosen because learning is harmless in cases where it has no
   effect, for example, if complete directory information is available
   and such directory information is treated as having a higher
   confidence than MAC addresses learned from the data plane.)

   If directory information includes the TRILL switch port by which a
   MAC and/or IP address is reachable, that may also be tested on
   ingress so that an end station on one TRILL switch port cannot forge
   a source MAC or IP address that should not be reachable by that port
   even if it is reachable by that TRILL switch.

4.2 Unknown Destination MAC

   Ingressing a native frame with an unknown unicast destination MAC:

      The mapping from the destination MAC and Data Label to the egress
      TRILL switch from which it is reachable is needed to ingress the
      frame as unicast. If the egress TRILL switch is unknown, the frame
      must be either dropped or ingressed as a multi-destination frame
      which is flooded to all edge TRILL switches for its Data Label,
      resulting in increased link utilization compared with unicast
      routing. Depending on the configuration of the TRILL switch
      ingressing the native frame (see Section 6), directory information
      can be used for the { destination MAC, Data Label } to egress
      TRILL switch nickname mapping and frames to destination MACs for
      which such directory information is not available MAY be
      discarded.

4.3 Address Resolution Protocol (ARP)

   Ingressing an ARP [RFC826]:

      ARP is a flexible protocol detected by its Ethertype of 0x0806. It
      is commonly used on a link to (1) query for the MAC address
      corresponding to an IPv4 address, (2) test if an IPv4 address is
      in use, or (3) announce a change in any of IPv4 address, MAC
      address, and/or point of attachment.

      The logically important elements in an ARP are (1) the
      specification of a "protocol" and a "hardware" address type, (2)
      an operation code that can be Request or Reply, and (3) fields for
      the protocol and hardware address of the sender and the target
      (destination) node.

      Examining the three types of ARP use:

      1. General ARP Request / Response

      This is a request for the destination "hardware" address
      corresponding to the destination "protocol" address; however, if
      the source and destination protocol addresses are equal, it should
      be handled as in type 3 below. A general ARP is handled by doing a
      directory lookup on the destination "protocol" address provided in
      hopes of finding a mapping to the desired "hardware" address. If
      such information is obtained from a directory, a response can be
      synthesized.

      2. Address Probe ARP Query

      An address probe ARP is used to determine if an IPv4 address is in
      use [RFC5227]. It can be identified by the source "protocol"
      (IPv4) address field being zero. The destination "protocol"
      address field is the IPv4 address being tested. If some host
      believes it has that destination IPv4 address, it would respond to
      the ARP query, which indicates that the address is in use.
      Address probe ARPs can be handled in the same way as General ARP
      queries above.

      3. Gratuitous ARP

      A gratuitous ARP is an unsolicited ARP message, usually a response
      but sometimes a query, used by a host to announce a new IPv4
      address, new MAC address, and/or new point of network attachment.
      Such ARPs are identifiable because the sender and destination
      "protocol" address fields have the same value. Thus, under normal
      circumstances, there really isn't any separate destination host to
      generate a response. If complete Push Directory information is
      being used with the Notify flag set in the IA APPsub-TLVs being
      pushed [IA] by all the TRILL switches in the Data Label, then
      gratuitous ARPs SHOULD be discarded rather than ingressed.
      Otherwise, they are either ingressed and flooded or discarded
      depending on local policy.
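
   The three ARP uses above can be told apart from the sender and
   target protocol address fields alone. The following sketch shows one
   way to do so for Ethernet/IPv4 ARP; the function name and the
   returned labels are hypothetical, and the input is assumed to be the
   28-byte ARP body that follows the Ethernet header.

```python
import struct

# Ethernet/IPv4 ARP body: hardware type, protocol type, hardware
# length, protocol length, opcode, sender hardware address, sender
# protocol address, target hardware address, target protocol address.
ARP_FORMAT = "!HHBBH6s4s6s4s"
ARP_LENGTH = struct.calcsize(ARP_FORMAT)  # 28 bytes


def classify_arp(arp_body: bytes) -> str:
    """Return 'probe', 'gratuitous', 'general', or 'other'."""
    if len(arp_body) < ARP_LENGTH:
        return "other"
    (htype, ptype, hlen, plen, _oper,
     _sha, spa, _tha, tpa) = struct.unpack(ARP_FORMAT,
                                           arp_body[:ARP_LENGTH])
    if (htype, ptype, hlen, plen) != (1, 0x0800, 6, 4):
        return "other"           # not Ethernet/IPv4 ARP
    if spa == b"\x00\x00\x00\x00":
        return "probe"           # sender protocol address is zero
    if spa == tpa:
        return "gratuitous"      # sender and target addresses equal
    return "general"
```

   The probe test comes first so that a packet with an all-zero sender
   address is never mistaken for a gratuitous ARP.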

4.4 IPv6 Neighbor Discovery (ND)

   Ingressing an IPv6 ND [RFC4861]:

      TBD

      Secure Neighbor Discovery messages [RFC3971] will, in general,
      have to be sent to the neighbor intended so that neighbor can sign
      the answer; however, directory information can be used to unicast
      a Secure Neighbor Discovery packet rather than multicasting it.

4.5 Reverse Address Resolution Protocol (RARP)

   Ingressing a RARP [RFC903]:

      RARP uses the same packet format as ARP but a different Ethertype
      (0x8035) and opcode values. Its use is similar to the General ARP
      Request/Response as described above. The difference is that it is
      intended to query for the destination "protocol" address
      corresponding to the destination "hardware" address provided. It
      is handled by doing a directory lookup on the destination
      "hardware" address provided in hopes of finding a mapping to the
      desired "protocol" address, for example, looking up a MAC address
      to find the corresponding IP address.

5. Layer 3 Address Learning

   TRILL switches MAY learn IP addresses in a manner similar to that in
   which they learn MAC addresses. On ingress of a native IP frame, they
   can learn the { IP address, MAC address, Data Label, input port } set
   and on the egress of a native IP frame, they can learn the { IP
   address, MAC address, Data Label, remote RBridge } information plus
   the nickname of the RBridge that ingressed the frame.

   This locally learned information is retained and times out in a
   similar manner to MAC address learning specified in [RFC6325]. By
   default, it has the same Confidence as locally learned MAC
   reachability information.

   Such learned Layer 3 address information MAY be disseminated with
   ESADI [RFCesadi] using the IA APPsub-TLV [IA].
   It can also be used as, in effect, local directory information to
   assist in locally responding to ARP/ND packets as discussed in
   Section 4.

6. Directory Use Strategies and Push-Pull Hybrids

   For some edge nodes that have a great number of Data Labels enabled,
   managing the MAC and Data Label <-> Edge RBridge mapping for hosts
   under all those Data Labels can be a challenge. This is especially
   true for Data Center gateway nodes, which need to communicate with a
   majority of Data Labels, if not all.

   For those edge TRILL switch nodes, a hybrid model should be
   considered. That is, the Push Model is used for some Data Labels, and
   the Pull Model is used for other Data Labels. It is the network
   operator's decision by configuration as to which Data Labels' mapping
   entries are pushed down from directories and which Data Labels'
   mapping entries are pulled.

   For example, assume a data center where hosts in specific Data
   Labels, say VLANs 1 through 100, communicate regularly with external
   peers. Probably, the mapping entries for those 100 VLANs should be
   pushed down to the data center gateway routers. For hosts in other
   Data Labels which only communicate with external peers occasionally
   for management interface, the mapping entries for those VLANs should
   be pulled down from directory when the need comes up.

   The mechanisms described above for Push and Pull Directory services
   make it easy to use Push for some Data Labels and Pull for others. In
   fact, different TRILL switches can even be configured so that some
   use Push Directory services and some use Pull Directory services for
   the same Data Label if both Push and Pull Directory services are
   available for that Data Label. And there can be Data Labels for which
   directory services are not used at all.

   For Data Labels in which a hybrid push/pull approach is being taken,
   it would make sense to use push for address information of hosts that
   frequently communicate with many other hosts in the Data Label, such
   as a file or DNS server. Pull could then be used for hosts that
   communicate with few other hosts, perhaps such as hosts being used as
   compute engines.

6.1 Strategy Configuration

   Each TRILL switch that has the ability to use directory assistance
   has, for each Data Label X in which it might ingress native frames,
   one of four major modes:

   0. No directory use: The TRILL switch does not subscribe to Push
      Directory data or make Pull Directory requests for Data Label X
      and directory data is not consulted on ingressed frames in Data
      Label X that might have used directory data. This includes ARP,
      ND, RARP, and unknown MAC destination addresses, which are
      flooded as appropriate.

   1. Use Push only: The TRILL switch subscribes to Push Directory
      data for Data Label X.

   2. Use Pull only: When the TRILL switch ingresses a frame in Data
      Label X that can use Directory information, if it has cached
      information for the address it uses it. If it does not have
      either cached positive or negative information for the address,
      it sends a Pull Directory query.

   3. Use Push and Pull: The TRILL switch subscribes to Push
      Directory data for Data Label X. When it ingresses a frame in
      Data Label X that can use Directory information and it does not
      find that information in its link state database of Push
      Directory information, it makes a Pull Directory query.

   The above major Directory use mode is per Data Label. In addition,
   there is a per Data Label per priority minor mode as listed below
   that indicates what should be done if Directory Data is not available
   for the ingressed frame.
   In all cases, if you are holding Push Directory or Pull Directory
   information to handle the frame given the major mode, the directory
   information is simply used and, in that instance, the minor mode does
   not matter.

   A. Flood immediate: Flood the frame immediately (even if you are
      also sending a Pull Directory request).

   B. Flood: Flood the frame immediately unless you are going to do a
      Pull Directory request, in which case you wait for the response
      or for the request to time out after retries and flood the
      frame if the request times out.

   C. Discard if complete or Flood immediate: If you have complete
      Push Directory information and the address is not in that
      information, discard the frame. If you do not have complete
      Push Directory information, behave as in A above.

   D. Discard if complete or Flood: If you have complete Push
      Directory information and the address is not in that
      information, discard the frame. If you do not have complete
      Push Directory information, behave as in B above.

   In addition, the query message priority for Pull Directory requests
   sent can be configured on a per Data Label, per ingressed frame
   priority basis. The default mappings are as follows, where Ingress
   Priority is the priority of the native frame that provoked the Pull
   Directory query:

      Ingress     If Flood     If Flood
      Priority    Immediate    Delayed
      --------    ---------    --------
         7            5           6
         6            5           6
         5            4           5
         4            3           4
         3            2           3
         2            0           2
         0            1           0
         1            1           1

   Priority 7 is normally only used for urgent messages critical to
   adjacency and so is avoided by default for directory traffic.
   Unsolicited updates are sent with a priority that is configured per
   Data Label and defaults to priority 5.

7.
Security Considerations

   Incorrect directory information can result in a variety of security
   threats including the following:

      Incorrect directory mappings can result in data being delivered to
      the wrong end stations, or set of end stations in the case of
      multi-destination packets, violating security policy.

      Missing or incorrect directory data can result in denial of
      service due to sending data packets to black holes or discarding
      data on ingress due to incorrect information that their
      destinations are not reachable.

   Push Directory data is distributed through ESADI-LSPs [RFCesadi] that
   can be authenticated with the same mechanisms as IS-IS LSPs. See
   [RFC5304] [RFC5310] and the Security Considerations section of
   [RFCesadi].

   Pull Directory queries and responses are transmitted as
   RBridge-to-RBridge or native RBridge Channel messages. Such messages
   can be secured as specified in [ChannelTunnel].

   For general TRILL security considerations, see [RFC6325].

8. IANA Considerations

   This section gives IANA allocation and registry considerations.

8.1 ESADI-Parameter Data Extensions

   IANA is requested to allocate two ESADI-Parameter TRILL APPsub-TLV
   flag bits for "Push Directory" (PSH) and "Complete Push" (COP) and to
   create a sub-registry in the TRILL Parameters Registry as follows:

      Sub-Registry: ESADI-Parameter APPsub-TLV Flag Bits

      Registration Procedures: Expert Review

      References: [RFCesadi] [This document]

      Bit  Mnemonic  Description               Reference
      ---  --------  -----------               ---------
        0  UN        Supports Unicast ESADI    ESADI [RFCesadi]
        1  PSH       Push Directory Server     This document
        2  COP       Complete Push             This document
      3-7  -         available for allocation

   The COP bit is ignored if the PSH bit is zero.
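
   A minimal Python sketch of how a receiver might interpret these flag
   bits follows. The mask values assume that bit 0 is the most
   significant bit of the Flags byte, which is the usual IETF bit
   numbering but is not stated in this section. The function name
   decode_flags and the dictionary keys are invented for this
   illustration.

```python
# Assumed masks: bit 0 is taken to be the most significant bit of the
# one-byte Flags field. This numbering is an assumption of the sketch.
UN = 0x80   # bit 0: Supports Unicast ESADI
PSH = 0x40  # bit 1: Push Directory Server
COP = 0x20  # bit 2: Complete Push


def decode_flags(flags):
    """Decode the Flags byte, ignoring COP when PSH is zero."""
    push = bool(flags & PSH)
    return {
        "unicast_esadi": bool(flags & UN),
        "push_directory": push,
        # COP is only meaningful when PSH is set.
        "complete_push": push and bool(flags & COP),
    }
```

   With these assumed masks, a Flags byte of 0x20 (COP set, PSH clear)
   decodes with complete_push false, reflecting the rule that COP is
   ignored when PSH is zero.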

   In addition, the ESADI-Parameter APPsub-TLV is optionally extended,
   as provided in its original specification in ESADI [RFCesadi], by
   one byte as shown below:

      +-+-+-+-+-+-+-+-+
      | Type          |           (1 byte)
      +-+-+-+-+-+-+-+-+
      | Length        |           (1 byte)
      +-+-+-+-+-+-+-+-+
      |R| Priority    |           (1 byte)
      +-+-+-+-+-+-+-+-+
      | CSNP Time     |           (1 byte)
      +-+-+-+-+-+-+-+-+
      | Flags         |           (1 byte)
      +---------------+
      |PushDirPriority|           (optional, 1 byte)
      +---------------+
      | Reserved for expansion    (variable)
      +-+-+-+-...

   The meanings of all the fields are as specified in ESADI [RFCesadi]
   except that the added PushDirPriority is the priority of the
   advertising ESADI instance to be a Push Directory as described in
   Section 2.3. If the PushDirPriority field is not present (Length = 3)
   it is treated as if it were 0x40. 0x40 is also the value used and
   placed here by a TRILL switch whose priority to be a Push Directory
   has not been configured.

8.2 RBridge Channel Protocol Number

   IANA is requested to allocate a new RBridge Channel protocol number
   for "Pull Directory Services" from the range allocable by Standards
   Action and to update the subregistry of such protocol numbers in the
   TRILL Parameters Registry to reference this document.

8.3 The Pull Directory (PUL) and No Data (NOD) Bits

   IANA is requested to allocate two currently reserved bits in the
   Interested VLANs field of the Interested VLANs sub-TLV (suggested
   bits 18 and 19) and the Interested Labels field of the Interested
   Labels sub-TLV (suggested bits 6 and 7) [RFC6326bis] to indicate Pull
   Directory server (PUL) and No Data (NOD) respectively. These bits are
   to be added, with this document as reference, to the "Interested
   VLANs Flag Bits" and "Interested Labels Flag Bits" subregistries
   created by [RFCesadi].

   In the TRILL base protocol [RFC6325] as extended for FGL [RFCfgl],
   the mere presence of an Interested VLANs or Interested Labels
   sub-TLV in the LSP of a TRILL switch indicates connection to end
   stations in the VLAN(s) or FGL(s) listed and thus a desire to receive
   multi-destination traffic in those Data Labels. But, with Push and
   Pull Directories, advertising that you are a directory server
   requires using these sub-TLVs to indicate the Data Label(s) you are
   serving. If such a directory server does not wish to receive
   multi-destination TRILL Data packets for the Data Labels it lists in
   one of these sub-TLVs, it sets the "No Data" (NOD) bit to one. This
   means that data on a distribution tree may be pruned so as not to
   reach the "No Data" TRILL switch as long as there are no TRILL
   switches interested in the Data that are beyond the "No Data" TRILL
   switch on a distribution tree. The NOD bit is backwards compatible
   because TRILL switches ignorant of it will simply not prune when they
   could, which is safe although it may cause increased link
   utilization.

   An example of a TRILL switch serving as a directory that would not
   want multi-destination traffic in some Data Labels might be a TRILL
   switch that does not offer end station service for any of the Data
   Labels for which it is serving as a directory and is a Pull
   Directory and/or a Push Directory for which all of the ESADI traffic
   can be handled by unicast ESADI [RFCesadi].

Acknowledgments

   The contributions of the following persons are gratefully
   acknowledged:

      TBD

   The document was prepared in raw nroff. All macros used were defined
   within the source file.

Normative References

   [RFC826] - Plummer, D., "An Ethernet Address Resolution Protocol",
         RFC 826, November 1982.

   [RFC903] - Finlayson, R., Mann, T., Mogul, J., and M.
         Theimer, "A Reverse Address Resolution Protocol", STD 38,
         RFC 903, June 1984.

   [RFC2119] - Bradner, S., "Key words for use in RFCs to Indicate
         Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3971] - Arkko, J., Ed., Kempf, J., Zill, B., and P. Nikander,
         "SEcure Neighbor Discovery (SEND)", RFC 3971, March 2005.

   [RFC4861] - Narten, T., Nordmark, E., Simpson, W., and H. Soliman,
         "Neighbor Discovery for IP version 6 (IPv6)", RFC 4861,
         September 2007.

   [RFC5304] - Li, T. and R. Atkinson, "IS-IS Cryptographic
         Authentication", RFC 5304, October 2008.

   [RFC5310] - Bhatia, M., Manral, V., Li, T., Atkinson, R., White, R.,
         and M. Fanto, "IS-IS Generic Cryptographic Authentication",
         RFC 5310, February 2009.

   [RFC6165] - Banerjee, A. and D. Ward, "Extensions to IS-IS for
         Layer-2 Systems", RFC 6165, April 2011.

   [RFC6325] - Perlman, R., Eastlake 3rd, D., Dutt, D., Gai, S., and A.
         Ghanwani, "Routing Bridges (RBridges): Base Protocol
         Specification", RFC 6325, July 2011.

   [RFC7042] - Eastlake 3rd, D. and J. Abley, "IANA Considerations and
         IETF Protocol and Documentation Usage for IEEE 802 Parameters",
         BCP 141, RFC 7042, October 2013.

   [RFC6326bis] - Eastlake, D., Banerjee, A., Dutt, D., Perlman, R., and
         A. Ghanwani, "TRILL Use of IS-IS", draft-ietf-isis-rfc6326bis,
         work in progress.

   [RFCclear] - Eastlake, D., M. Zhang, A. Ghanwani, V. Manral, A.
         Banerjee, draft-ietf-trill-clear-correct-06.txt, in RFC
         Editor's queue.

   [Channel] - D. Eastlake, V. Manral, Y. Li, S. Aldrin, D. Ward,
         "TRILL: RBridge Channel Support", draft-ietf-trill-rbridge-
         channel-08.txt, in RFC Editor's queue.

   [RFCfgl] - D. Eastlake, M. Zhang, P. Agarwal, R. Perlman, D. Dutt,
         "TRILL: Fine-Grained Labeling", draft-ietf-trill-fine-
         labeling-07.txt, in RFC Editor's queue.

   [RFCesadi] - Zhai, H., F. Hu, R. Perlman, D. Eastlake, O.
         Stokes, "TRILL (Transparent Interconnection of Lots of Links):
         The ESADI (End Station Address Distribution Information)
         Protocol", draft-ietf-trill-esadi, work in progress.

   [IA] - Eastlake, D., L. Yizhou, R. Perlman, "TRILL: Interface
         Addresses APPsub-TLV", draft-eastlake-trill-ia-appsubtlv, work
         in progress.

Informational References

   [RFC5227] - Cheshire, S., "IPv4 Address Conflict Detection", RFC
         5227, July 2008.

   [RFC7067] - Dunbar, L., Eastlake 3rd, D., Perlman, R., and I.
         Gashinsky, "Directory Assistance Problem and High-Level Design
         Proposal", RFC 7067, November 2013.

   [ChannelTunnel] - D. Eastlake, Y. Li, "TRILL: RBridge Channel Tunnel
         Protocol", draft-eastlake-trill-channel-tunnel, work in
         progress.

   [ARP reduction] - Shah, et al., "ARP Broadcast Reduction for Large
         Data Centers", Oct 2010.

Authors' Addresses

   Linda Dunbar
   Huawei Technologies
   5430 Legacy Drive, Suite #175
   Plano, TX 75024, USA

   Phone: +1-469-277-5840
   Email: ldunbar@huawei.com

   Donald Eastlake
   Huawei Technologies
   155 Beaver Street
   Milford, MA 01757 USA

   Phone: +1-508-333-2270
   Email: d3e3e3@gmail.com

   Radia Perlman
   Intel Labs
   2200 Mission College Blvd.
   Santa Clara, CA 95054-1549 USA

   Phone: +1-408-765-8080
   Email: Radia@alum.mit.edu

   Igor Gashinsky
   Yahoo
   45 West 18th Street 6th floor
   New York, NY 10011

   Email: igor@yahoo-inc.com

   Yizhou Li
   Huawei Technologies
   101 Software Avenue,
   Nanjing 210012 China

   Phone: +86-25-56622310
   Email: liyizhou@huawei.com

Copyright, Disclaimer, and Additional IPR Provisions

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License. The definitive version of
   an IETF Document is that published by, or under the auspices of, the
   IETF. Versions of IETF Documents that are published by third parties,
   including those that are translated into other languages, should not
   be considered to be definitive versions of IETF Documents. The
   definitive version of these Legal Provisions is that published by, or
   under the auspices of, the IETF. Versions of these Legal Provisions
   that are published by third parties, including those that are
   translated into other languages, should not be considered to be
   definitive versions of these Legal Provisions. For the avoidance of
   doubt, each Contributor to the IETF Standards Process licenses each
   Contribution that he or she makes as part of the IETF Standards
   Process to the IETF Trust pursuant to the provisions of RFC 5378. No
   language to the contrary, or terms, conditions or rights that differ
   from or are inconsistent with the rights and licenses granted under
   RFC 5378, shall have any effect and shall be null and void, whether
   published or posted by such Contributor, or included with or in such
   Contribution.