IDR Working Group                                               C. Loibl
Internet-Draft                                   next layer Telekom GmbH
Obsoletes: 5575,7674 (if approved)                              S. Hares
Intended status: Standards Track                                  Huawei
Expires: October 25, 2020                                      R. Raszuk
                                                            Bloomberg LP
                                                            D. McPherson
                                                                Verisign
                                                               M. Bacher
                                                        T-Mobile Austria
                                                          April 23, 2020

               Dissemination of Flow Specification Rules
                      draft-ietf-idr-rfc5575bis-23

Abstract

This document defines a Border Gateway Protocol Network Layer
Reachability Information (BGP NLRI) encoding format that can be used
to distribute traffic Flow Specifications. This allows the routing
system to propagate information regarding more specific components of
the traffic aggregate defined by an IP destination prefix.

It also specifies BGP Extended Community encoding formats, that can
be used to propagate Traffic Filtering Actions along with the Flow
Specification NLRI. Those Traffic Filtering Actions encode actions a
routing system can take if the packet matches the Flow Specification.

Additionally, it defines two applications of that encoding format:
one that can be used to automate inter-domain coordination of traffic
filtering, such as what is required in order to mitigate
(distributed) denial-of-service attacks, and a second application to
provide traffic filtering in the context of a BGP/MPLS VPN service.
Other applications (e.g.
centralized control of traffic in an SDN or NFV context) are also
possible. Other documents may specify Flow Specification extensions.

The information is carried via BGP, thereby reusing protocol
algorithms, operational experience, and administrative processes such
as inter-provider peering agreements.

This document obsoletes both RFC5575 and RFC7674.

Status of This Memo

This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current
Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."

This Internet-Draft will expire on October 25, 2020.

Copyright Notice

Copyright (c) 2020 IETF Trust and the persons identified as the
document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(https://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.

Table of Contents

1.  Introduction
2.  Definitions of Terms Used in This Memo
3.  Flow Specifications
4.  Dissemination of IPv4 Flow Specification Information
  4.1.  Length Encoding
  4.2.  NLRI Value Encoding
    4.2.1.  Operators
    4.2.2.  Components
  4.3.  Examples of Encodings
5.  Traffic Filtering
  5.1.  Ordering of Flow Specifications
6.  Validation Procedure
7.  Traffic Filtering Actions
  7.1.  Traffic Rate in Bytes (traffic-rate-bytes) sub-type 0x06
  7.2.  Traffic Rate in Packets (traffic-rate-packets) sub-type TBD
  7.3.  Traffic-action (traffic-action) sub-type 0x07
  7.4.  RT Redirect (rt-redirect) sub-type 0x08
  7.5.  Traffic Marking (traffic-marking) sub-type 0x09
  7.6.  Interaction with other Filtering Mechanisms in Routers
  7.7.  Considerations on Traffic Filtering Action Interference
8.  Dissemination of Traffic Filtering in BGP/MPLS VPN Networks
9.  Traffic Monitoring
10. Error Handling
11. IANA Considerations
  11.1.  AFI/SAFI Definitions
  11.2.  Flow Component Definitions
  11.3.  Extended Community Flow Specification Actions
12. Security Considerations
13. Contributors
14. Acknowledgements
15. References
  15.1.  Normative References
  15.2.  Informative References
  15.3.  URIs
Appendix A.  Python code: flow_rule_cmp
Appendix B.  Comparison with RFC 5575
Authors' Addresses

1. Introduction

This document obsoletes "Dissemination of Flow Specification Rules"
[RFC5575] (see Appendix B for the differences). This document also
obsoletes "Clarification of the Flowspec Redirect Extended Community"
[RFC7674] since it incorporates the encoding of the BGP Flow
Specification Redirect Extended Community in Section 7.4.

Modern IP routers have the capability to forward traffic and to
classify, shape, rate limit, filter, or redirect packets based on
administratively defined policies. These traffic policy mechanisms
allow the operator to define match rules that operate on multiple
fields of the packet header. Actions such as the ones described
above can be associated with each rule.

The n-tuple consisting of the matching criteria defines an aggregate
traffic Flow Specification. The matching criteria can include
elements such as source and destination address prefixes, IP
protocol, and transport protocol port numbers.

Section 4 of this document defines a general procedure to encode Flow
Specifications for aggregated traffic flows so that they can be
distributed as a BGP [RFC4271] NLRI. Additionally, Section 7 of this
document defines the required Traffic Filtering Actions BGP Extended
Communities and mechanisms to use BGP for intra- and inter-provider
distribution of traffic filtering rules to filter (distributed)
denial-of-service (DoS) attacks.

By expanding routing information with Flow Specifications, the
routing system can take advantage of the ACL (Access Control List) or
firewall capabilities in the router's forwarding path. Flow
Specifications can be seen as more specific routing entries to a
unicast prefix and are expected to depend upon the existing unicast
data information.

A Flow Specification received from an external autonomous system will
need to be validated against unicast routing before being accepted
(Section 6). The Flow Specification received from an internal BGP
peer within the same autonomous system [RFC4271] is assumed to have
been validated prior to transmission within the internal BGP (iBGP)
mesh of an autonomous system. If the aggregate traffic flow defined
by the unicast destination prefix is forwarded to a given BGP peer,
then the local system can install more specific Flow Specifications
that may result in different forwarding behavior, as requested by
this system.

From an operational perspective, the utilization of BGP as the
carrier for this information allows a network service provider to
reuse both internal route distribution infrastructure (e.g., route
reflector or confederation design) and existing external
relationships (e.g., inter-domain BGP sessions to a customer
network).

While it is certainly possible to address this problem using other
mechanisms, this solution has been utilized in deployments because of
the substantial advantage of being an incremental addition to already
deployed mechanisms.

In current deployments, the information distributed by this extension
is originated both manually and automatically, the latter by systems
that are able to detect malicious traffic flows. When automated
systems are used, care should be taken to ensure the correctness of
the automated system. The limitations of the receiving systems that
need to process these automated Flow Specifications need to be taken
into consideration as well (see also Section 12).

This specification defines required protocol extensions to address
most common applications of IPv4 unicast and VPNv4 unicast filtering.
The same mechanism can be reused and new match criteria added to
address similar filtering needs for other BGP address families such
as IPv6 families [I-D.ietf-idr-flow-spec-v6].

2. Definitions of Terms Used in This Memo

AFI - Address Family Identifier.

AS - Autonomous System.

Loc-RIB - The Loc-RIB contains the routes that have been selected
   by the local BGP speaker's Decision Process [RFC4271].

NLRI - Network Layer Reachability Information.

PE - Provider Edge router.

RIB - Routing Information Base.

SAFI - Subsequent Address Family Identifier.

VRF - Virtual Routing and Forwarding instance.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in BCP
14 [RFC2119] [RFC8174] when, and only when, they appear in all
capitals, as shown here.

3. Flow Specifications

A Flow Specification is an n-tuple consisting of several matching
criteria that can be applied to IP traffic. A given IP packet is
said to match the defined Flow Specification if it matches all the
specified criteria. This n-tuple is encoded into a BGP NLRI defined
below.

A given Flow Specification may be associated with a set of
attributes, depending on the particular application; such attributes
may or may not include reachability information (i.e., NEXT_HOP).
Well-known or AS-specific community attributes can be used to encode
a set of predetermined actions.

A particular application is identified by a specific (Address Family
Identifier, Subsequent Address Family Identifier (AFI, SAFI)) pair
[RFC4760] and corresponds to a distinct set of RIBs. Those RIBs
should be treated independently from each other in order to assure
non-interference between distinct applications.

BGP itself treats the NLRI as a key to an entry in its databases.
Entries that are placed in the Loc-RIB are then associated with a
given set of semantics, which is application dependent. This is
consistent with existing BGP applications. For instance, IP unicast
routing (AFI=1, SAFI=1) and IP multicast reverse-path information
(AFI=1, SAFI=2) are handled by BGP without any particular semantics
being associated with them until installed in the Loc-RIB.

Standard BGP policy mechanisms, such as UPDATE filtering by NLRI
prefix as well as community matching, must apply to the Flow
Specification defined NLRI type. Network operators can also control
propagation of such routing updates by enabling or disabling the
exchange of a particular (AFI, SAFI) pair on a given BGP peering
session.

4. Dissemination of IPv4 Flow Specification Information

This document defines a Flow Specification NLRI type (Figure 1) that
may include several components such as destination prefix, source
prefix, protocol, ports, and others (see Section 4.2 below).

This NLRI information is encoded using MP_REACH_NLRI and
MP_UNREACH_NLRI attributes as defined in [RFC4760]. When advertising
Flow Specifications, the Length of Next Hop Network Address SHOULD be
set to 0. The Network Address of Next Hop field MUST be ignored.

The NLRI field of the MP_REACH_NLRI and MP_UNREACH_NLRI is encoded as
one or more 2-tuples of the form <length, NLRI value>. It consists
of a 1- or 2-octet length field followed by a variable-length NLRI
value. The length is expressed in octets.
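
As a rough illustration of the length field, the following Python
sketch encodes and decodes it according to the rule given in
Section 4.1 (a single octet for lengths below 240, otherwise two
octets whose most significant nibble is all ones). The function names
encode_nlri_length and decode_nlri_length are hypothetical and are
not defined by this document.

```python
from typing import Tuple


def encode_nlri_length(length: int) -> bytes:
    """Encode an NLRI length as 1 octet (0xnn) or 2 octets (0xfnnn)."""
    if not 0 <= length <= 4095:
        raise ValueError("NLRI length must be in the range 0..4095")
    if length < 240:
        return bytes([length])
    # Extended form: upper nibble all ones, followed by a 12-bit length.
    return (0xF000 | length).to_bytes(2, "big")


def decode_nlri_length(data: bytes) -> Tuple[int, int]:
    """Return (length, octets consumed) for the length field at data[0]."""
    if not data:
        raise ValueError("empty NLRI")
    first = data[0]
    if first < 0xF0:
        return first, 1
    if len(data) < 2:
        raise ValueError("truncated extended length")
    return ((first & 0x0F) << 8) | data[1], 2
```

With these helpers, a length of 239 is encoded as the single octet
0xef and a length of 240 as the two octets 0xf0f0.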

   +-------------------------------+
   |    length (0xnn or 0xfnnn)    |
   +-------------------------------+
   |    NLRI value   (variable)    |
   +-------------------------------+

      Figure 1: Flow Specification NLRI for IPv4

Implementations wishing to exchange Flow Specification MUST use BGP's
Capability Advertisement facility to exchange the Multiprotocol
Extension Capability Code (Code 1) as defined in [RFC4760]. The
(AFI, SAFI) pair carried in the Multiprotocol Extension Capability
MUST be (AFI=1, SAFI=133) for IPv4 Flow Specification, and (AFI=1,
SAFI=134) for VPNv4 Flow Specification.

4.1. Length Encoding

o  If the NLRI length is smaller than 240 (0xf0 hex) octets, the
   length field can be encoded as a single octet.

o  Otherwise, it is encoded as an extended-length 2-octet value in
   which the most significant nibble of the first octet is all ones.

In Figure 1 above, values less than 240 are encoded using two hex
digits (0xnn). Values above 239 are encoded using 3 hex digits
(0xfnnn). The highest value that can be represented with this
encoding is 4095. For example, the length value of 239 is encoded as
0xef (single octet) while 240 is encoded as 0xf0f0 (2-octet).

4.2. NLRI Value Encoding

The Flow Specification NLRI value consists of a list of optional
components and is encoded as follows:

   Encoding: <[component]+>

A specific packet is considered to match the Flow Specification when
it matches the intersection (AND) of all the components present in
the Flow Specification.

Components MUST follow strict type ordering by increasing numerical
order. A given component type may (exactly once) or may not be
present in the Flow Specification. If present, it MUST precede any
component of higher numeric type value.

All combinations of components within a single Flow Specification are
allowed. However, some combinations cannot match any packets (e.g.
"ICMP Type AND Port" will never match any packets), and thus SHOULD
NOT be propagated by BGP.

An NLRI value not encoded as specified in Section 4.2 is considered
malformed and error handling according to Section 10 is performed.

4.2.1. Operators

Most of the components described below make use of comparison
operators. Which of the two operators (numeric or bitmask) is used
is defined by each component in Section 4.2.2. Each operator is
encoded as a single octet.

4.2.1.1. Numeric Operator (numeric_op)

This operator is encoded as shown in Figure 2.

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | e | a |  len  | 0 |lt |gt |eq |
   +---+---+---+---+---+---+---+---+

      Figure 2: Numeric Operator (numeric_op)

e - end-of-list bit: Set in the last {op, value} pair in the list.

a - AND bit: If unset, the result of the previous {op, value} pair
   is logically ORed with the current one. If set, the operation is
   a logical AND. In the first operator octet of a sequence it
   SHOULD be encoded as unset and MUST be treated as always unset on
   decoding. The AND operator has higher priority than OR for the
   purposes of evaluating logical expressions.

len - length: The length of the value field for this operator given
   as (1 << len). This encodes 1 (len=00), 2 (len=01), 4 (len=10), 8
   (len=11) octets.

0 - SHOULD be set to 0 on NLRI encoding, and MUST be ignored during
   decoding.

lt - less than comparison between data and value.

gt - greater than comparison between data and value.

eq - equality between data and value.

The bits lt, gt, and eq can be combined to produce common relational
operators such as "less or equal", "greater or equal", and "not equal
to" as shown in Table 1.
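
The bit layout of Figure 2 and the AND/OR rules above can be
illustrated with the following Python sketch. It decodes a list of
{numeric_op, value} pairs from raw octets and evaluates the list
against a packet field. The names parse_numeric_ops and numeric_match
are hypothetical helpers, not part of this specification, and the
handling of malformed input is reduced to a minimum.

```python
from typing import List, Tuple

Pair = Tuple[bool, int, int]  # (AND bit, lt/gt/eq bits, value)


def parse_numeric_ops(data: bytes) -> List[Pair]:
    """Decode {numeric_op, value} pairs up to the end-of-list bit."""
    pairs: List[Pair] = []
    pos = 0
    while True:
        if pos >= len(data):
            raise ValueError("list ended without end-of-list bit")
        op = data[pos]
        size = 1 << ((op >> 4) & 0x03)   # len field: 1, 2, 4 or 8 octets
        raw = data[pos + 1:pos + 1 + size]
        if len(raw) != size:
            raise ValueError("truncated value")
        # The AND bit is treated as unset in the first operator octet.
        and_bit = bool(op & 0x40) and bool(pairs)
        pairs.append((and_bit, op & 0x07, int.from_bytes(raw, "big")))
        pos += 1 + size
        if op & 0x80:                    # e: end-of-list
            return pairs


def _compare(bits: int, data: int, value: int) -> bool:
    """Apply the lt (0x4), gt (0x2) and eq (0x1) bits."""
    return bool((bits & 0x4 and data < value)
                or (bits & 0x2 and data > value)
                or (bits & 0x1 and data == value))


def numeric_match(pairs: List[Pair], data: int) -> bool:
    """Evaluate the list; AND binds more tightly than OR."""
    groups: List[bool] = []              # one entry per AND-group
    for and_bit, bits, value in pairs:
        term = _compare(bits, data, value)
        if and_bit and groups:
            groups[-1] = groups[-1] and term
        else:
            groups.append(term)
    return any(groups)
```

Applied to the port operand of Example 2 in Section 4.3.2 (octets
03 89 45 8b 91 1f 90), this sketch decodes the expression
"(>= 137 AND <= 139) OR == 8080", so ports 138 and 8080 match and
port 140 does not.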

   +----+----+----+-----------------------------------+
   | lt | gt | eq | Resulting operation               |
   +----+----+----+-----------------------------------+
   | 0  | 0  | 0  | false (independent of the value)  |
   | 0  | 0  | 1  | == (equal)                        |
   | 0  | 1  | 0  | > (greater than)                  |
   | 0  | 1  | 1  | >= (greater than or equal)        |
   | 1  | 0  | 0  | < (less than)                     |
   | 1  | 0  | 1  | <= (less than or equal)           |
   | 1  | 1  | 0  | != (not equal value)              |
   | 1  | 1  | 1  | true (independent of the value)   |
   +----+----+----+-----------------------------------+

      Table 1: Comparison operation combinations

4.2.1.2. Bitmask Operator (bitmask_op)

This operator is encoded as shown in Figure 3.

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | e | a |  len  | 0 | 0 |not| m |
   +---+---+---+---+---+---+---+---+

      Figure 3: Bitmask Operator (bitmask_op)

e, a, len - Most significant nibble: (end-of-list bit, AND bit, and
   length field), as defined in the Numeric Operator format in
   Section 4.2.1.1.

not - NOT bit: If set, logical negation of operation.

m - Match bit: If set, this is a bitwise match operation defined as
   "(data AND value) == value"; if unset, (data AND value) evaluates
   to TRUE if any of the bits in the value mask are set in the data.

0 - all 0 bits: SHOULD be set to 0 on NLRI encoding, and MUST be
   ignored during decoding.

4.2.2. Components

The encoding of each of the components begins with a type field (1
octet) followed by a variable length parameter. The following
sections define component types and parameter encodings for the IPv4
IP layer and transport layer headers. IPv6 NLRI component types are
described in [I-D.ietf-idr-flow-spec-v6].

4.2.2.1. Type 1 - Destination Prefix

   Encoding: <type (1 octet), length (1 octet), prefix (variable)>

Defines the destination prefix to match. The length and prefix
fields are encoded as in BGP UPDATE messages [RFC4271].

4.2.2.2. Type 2 - Source Prefix

   Encoding: <type (1 octet), length (1 octet), prefix (variable)>

Defines the source prefix to match. The length and prefix fields are
encoded as in BGP UPDATE messages [RFC4271].

4.2.2.3. Type 3 - IP Protocol

   Encoding: <type (1 octet), [numeric_op, value]+>

Contains a list of {numeric_op, value} pairs that are used to match
the IP protocol value octet in the IP packet header (see [RFC0791]
Section 3.1).

This component uses the Numeric Operator (numeric_op) described in
Section 4.2.1.1. Type 3 component values SHOULD be encoded as a
single octet (numeric_op len=00).

4.2.2.4. Type 4 - Port

   Encoding: <type (1 octet), [numeric_op, value]+>

Defines a list of {numeric_op, value} pairs that matches source OR
destination TCP/UDP ports (see [RFC0793] Section 3.1 and [RFC0768]
Section "Format"). This component matches if either the destination
port OR the source port of an IP packet matches the value.

This component uses the Numeric Operator (numeric_op) described in
Section 4.2.1.1. Type 4 component values SHOULD be encoded as 1- or
2-octet quantities (numeric_op len=00 or len=01).

If the port (destination-port, source-port) component is present,
only TCP or UDP packets can match the entire Flow Specification. The
port component, if present, never matches when the packet's IP
protocol value is not 6 (TCP) or 17 (UDP), if the packet is
fragmented and this is not the first fragment, or if the system is
unable to locate the transport header. Different implementations may
or may not be able to decode the transport header in the presence of
IP options or Encapsulating Security Payload (ESP) NULL [RFC4303]
encryption.

4.2.2.5. Type 5 - Destination Port

   Encoding: <type (1 octet), [numeric_op, value]+>

Defines a list of {numeric_op, value} pairs used to match the
destination port of a TCP or UDP packet (see also [RFC0793]
Section 3.1 and [RFC0768] Section "Format").

This component uses the Numeric Operator (numeric_op) described in
Section 4.2.1.1. Type 5 component values SHOULD be encoded as 1- or
2-octet quantities (numeric_op len=00 or len=01).

The last paragraph of Section 4.2.2.4 also applies to this component.

4.2.2.6. Type 6 - Source Port

   Encoding: <type (1 octet), [numeric_op, value]+>

Defines a list of {numeric_op, value} pairs used to match the source
port of a TCP or UDP packet (see also [RFC0793] Section 3.1 and
[RFC0768] Section "Format").

This component uses the Numeric Operator (numeric_op) described in
Section 4.2.1.1. Type 6 component values SHOULD be encoded as 1- or
2-octet quantities (numeric_op len=00 or len=01).

The last paragraph of Section 4.2.2.4 also applies to this component.

4.2.2.7. Type 7 - ICMP type

   Encoding: <type (1 octet), [numeric_op, value]+>

Defines a list of {numeric_op, value} pairs used to match the type
field of an ICMP packet (see also [RFC0792] Section "Message
Formats").

This component uses the Numeric Operator (numeric_op) described in
Section 4.2.1.1. Type 7 component values SHOULD be encoded as a
single octet (numeric_op len=00).

If the ICMP type (code) component is present, only ICMP packets can
match the entire Flow Specification. The ICMP type (code) component,
if present, never matches when the packet's IP protocol value is not
1 (ICMP), if the packet is fragmented and this is not the first
fragment, or if the system is unable to locate the transport header.
Different implementations may or may not be able to decode the
transport header in the presence of IP options or Encapsulating
Security Payload (ESP) NULL [RFC4303] encryption.

4.2.2.8. Type 8 - ICMP code

   Encoding: <type (1 octet), [numeric_op, value]+>

Defines a list of {numeric_op, value} pairs used to match the code
field of an ICMP packet (see also [RFC0792] Section "Message
Formats").

This component uses the Numeric Operator (numeric_op) described in
Section 4.2.1.1. Type 8 component values SHOULD be encoded as a
single octet (numeric_op len=00).

The last paragraph of Section 4.2.2.7 also applies to this component.

4.2.2.9. Type 9 - TCP flags

   Encoding: <type (1 octet), [bitmask_op, bitmask]+>

Defines a list of {bitmask_op, bitmask} pairs used to match TCP
Control Bits (see also [RFC0793] Section 3.1).

This component uses the Bitmask Operator (bitmask_op) described in
Section 4.2.1.2. Type 9 component bitmasks MUST be encoded as a 1-
or 2-octet bitmask (bitmask_op len=00 or len=01).

When a single octet (bitmask_op len=00) is specified, it matches
octet 14 of the TCP header (see also [RFC0793] Section 3.1), which
contains the TCP Control Bits. When a 2-octet (bitmask_op len=01)
encoding is used, it matches octets 13 and 14 of the TCP header with
the data offset (leftmost 4 bits) always treated as 0.

If the TCP flags component is present, only TCP packets can match the
entire Flow Specification. The TCP flags component, if present,
never matches when the packet's IP protocol value is not 6 (TCP), if
the packet is fragmented and this is not the first fragment, or if
the system is unable to locate the transport header. Different
implementations may or may not be able to decode the transport header
in the presence of IP options or Encapsulating Security Payload (ESP)
NULL [RFC4303] encryption.

4.2.2.10. Type 10 - Packet length

   Encoding: <type (1 octet), [numeric_op, value]+>

Defines a list of {numeric_op, value} pairs used to match on the
total IP packet length (excluding Layer 2 but including IP header).

This component uses the Numeric Operator (numeric_op) described in
Section 4.2.1.1. Type 10 component values SHOULD be encoded as 1- or
2-octet quantities (numeric_op len=00 or len=01).

4.2.2.11. Type 11 - DSCP (Diffserv Code Point)

   Encoding: <type (1 octet), [numeric_op, value]+>

Defines a list of {numeric_op, value} pairs used to match the 6-bit
DSCP field (see also [RFC2474]).

This component uses the Numeric Operator (numeric_op) described in
Section 4.2.1.1. Type 11 component values MUST be encoded as a
single octet (numeric_op len=00).

The six least significant bits contain the DSCP value. All other
bits SHOULD be treated as 0.

4.2.2.12. Type 12 - Fragment

   Encoding: <type (1 octet), [bitmask_op, bitmask]+>

Defines a list of {bitmask_op, bitmask} pairs used to match specific
IP fragments.

This component uses the Bitmask Operator (bitmask_op) described in
Section 4.2.1.2. The Type 12 component bitmask MUST be encoded as a
single octet bitmask (bitmask_op len=00).

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | 0 | 0 | 0 | 0 |LF |FF |IsF|DF |
   +---+---+---+---+---+---+---+---+

      Figure 4: Fragment Bitmask Operand

Bitmask values:

DF - Don't fragment - match if [RFC0791] IP Header Flags Bit-1 (DF)
   is 1

IsF - Is a fragment - match if [RFC0791] IP Header Fragment Offset
   is not 0

FF - First fragment - match if [RFC0791] IP Header Fragment Offset
   is 0 AND Flags Bit-2 (MF) is 1

LF - Last fragment - match if [RFC0791] IP Header Fragment Offset is
   not 0 AND Flags Bit-2 (MF) is 0

0 - SHOULD be set to 0 on NLRI encoding, and MUST be ignored during
   decoding

4.3. Examples of Encodings

4.3.1. Example 1

An example of a Flow Specification NLRI encoding for: "all packets to
192.0.2.0/24 and TCP port 25".

   +--------+----------------+----------+----------+
   | length | destination    | protocol | port     |
   +--------+----------------+----------+----------+
   | 0x0b   | 01 18 c0 00 02 | 03 81 06 | 04 81 19 |
   +--------+----------------+----------+----------+

Decoded:

   +-------+------------+-------------------------------+
   | Value |            |                               |
   +-------+------------+-------------------------------+
   | 0x0b  | length     | 11 octets (len<240 1-octet)   |
   | 0x01  | type       | Type 1 - Destination Prefix   |
   | 0x18  | length     | 24 bit                        |
   | 0xc0  | prefix     | 192                           |
   | 0x00  | prefix     | 0                             |
   | 0x02  | prefix     | 2                             |
   | 0x03  | type       | Type 3 - IP Protocol          |
   | 0x81  | numeric_op | end-of-list, value size=1, == |
   | 0x06  | value      | 6 (TCP)                       |
   | 0x04  | type       | Type 4 - Port                 |
   | 0x81  | numeric_op | end-of-list, value size=1, == |
   | 0x19  | value      | 25                            |
   +-------+------------+-------------------------------+

This constitutes an NLRI with an NLRI length of 11 octets.

4.3.2. Example 2

An example of a Flow Specification NLRI encoding for: "all packets to
192.0.2.0/24 from 203.0.113.0/24 and port {range [137, 139] or
8080}".

   +--------+----------------+----------------+-------------------------+
   | length | destination    | source         | port                    |
   +--------+----------------+----------------+-------------------------+
   | 0x12   | 01 18 c0 00 02 | 02 18 cb 00 71 | 04 03 89 45 8b 91 1f 90 |
   +--------+----------------+----------------+-------------------------+

Decoded:

   +--------+------------+-------------------------------+
   | Value  |            |                               |
   +--------+------------+-------------------------------+
   | 0x12   | length     | 18 octets (len<240 1-octet)   |
   | 0x01   | type       | Type 1 - Destination Prefix   |
   | 0x18   | length     | 24 bit                        |
   | 0xc0   | prefix     | 192                           |
   | 0x00   | prefix     | 0                             |
   | 0x02   | prefix     | 2                             |
   | 0x02   | type       | Type 2 - Source Prefix        |
   | 0x18   | length     | 24 bit                        |
   | 0xcb   | prefix     | 203                           |
   | 0x00   | prefix     | 0                             |
   | 0x71   | prefix     | 113                           |
   | 0x04   | type       | Type 4 - Port                 |
   | 0x03   | numeric_op | value size=1, >=              |
   | 0x89   | value      | 137                           |
   | 0x45   | numeric_op | "AND", value size=1, <=       |
   | 0x8b   | value      | 139                           |
   | 0x91   | numeric_op | end-of-list, value size=2, == |
   | 0x1f90 | value      | 8080                          |
   +--------+------------+-------------------------------+

This constitutes an NLRI with an NLRI length of 18 octets.

4.3.3. Example 3

An example of a Flow Specification NLRI encoding for: "all packets to
192.0.2.1/32 and fragment { DF or FF }" (matching packets with the DF
bit set or First Fragments).

   +--------+-------------------+----------+
   | length | destination       | fragment |
   +--------+-------------------+----------+
   | 0x09   | 01 20 c0 00 02 01 | 0c 80 05 |
   +--------+-------------------+----------+

Decoded:

   +-------+------------+------------------------------+
   | Value |            |                              |
   +-------+------------+------------------------------+
   | 0x09  | length     | 9 octets (len<240 1-octet)   |
   | 0x01  | type       | Type 1 - Destination Prefix  |
   | 0x20  | length     | 32 bit                       |
   | 0xc0  | prefix     | 192                          |
   | 0x00  | prefix     | 0                            |
   | 0x02  | prefix     | 2                            |
   | 0x01  | prefix     | 1                            |
   | 0x0c  | type       | Type 12 - Fragment           |
   | 0x80  | bitmask_op | end-of-list, value size=1    |
   | 0x05  | bitmask    | DF=1, FF=1                   |
   +-------+------------+------------------------------+

This constitutes an NLRI with an NLRI length of 9 octets.

5. Traffic Filtering

Traffic filtering policies have been traditionally considered to be
relatively static. Limitations of these static mechanisms caused
this new dynamic mechanism to be designed for the three new
applications of traffic filtering:

o  Prevention of traffic-based, denial-of-service (DoS) attacks.

o  Traffic filtering in the context of BGP/MPLS VPN service.

o  Centralized traffic control for SDN/NFV networks.

These applications require coordination among service providers
and/or coordination among the ASes within a service provider.

The Flow Specification NLRI defined in Section 4 conveys information
about traffic filtering rules for traffic that should be discarded or
handled in a manner specified by a set of pre-defined actions (which
are defined in BGP Extended Communities).
This mechanism is 733 primarily designed to allow an upstream autonomous system to perform 734 inbound filtering on its ingress routers of traffic that a given 735 downstream AS wishes to drop. 737 In order to achieve this goal, this document specifies two 738 application-specific NLRI identifiers that provide traffic filters, 739 and a set of actions encoded in BGP Extended Communities. The two 740 application-specific NLRI identifiers are: 742 o IPv4 Flow Specification identifier (AFI=1, SAFI=133) along with 743 specific semantic rules for IPv4 routes, and 745 o VPNv4 Flow Specification identifier (AFI=1, SAFI=134), which 746 can be used to propagate traffic filtering information in a BGP/ 747 MPLS VPN environment. 749 Encoding of the NLRI is described in Section 4 for IPv4 Flow 750 Specification and in Section 8 for VPNv4 Flow Specification. The 751 filtering actions are described in Section 7. 753 5.1. Ordering of Flow Specifications 755 More than one Flow Specification may match a particular traffic flow. 756 Thus, it is necessary to define the order in which Flow 757 Specifications are matched and actions are applied to a particular 758 traffic flow. This ordering function is such that it does not depend 759 on the arrival order of the Flow Specification via BGP and thus is 760 consistent in the network. 762 The relative order of two Flow Specifications is determined by 763 comparing their respective components. The algorithm starts by 764 comparing the left-most components (lowest component type value) of 765 the Flow Specifications. If the types differ, the Flow Specification 766 with the lowest numeric type value has higher precedence (and thus will 767 match before) than the Flow Specification that doesn't contain that 768 component type. If the component types are the same, then a type- 769 specific comparison is performed (see below). If the components compare as equal, 770 the algorithm continues with the next component.
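As an illustration only, the component-by-component walk described above can be sketched in Python as follows. This is not the code from Appendix A: the function name compare_flowspecs, the (type, value) tuple representation, and the -1/0/1 return convention are assumptions made for this sketch. Prefix components are assumed to be ipaddress.IPv4Network objects and all other components raw bytes; the type-specific rules applied are the ones stated in the remainder of this section.

```python
import ipaddress
from functools import cmp_to_key
from itertools import zip_longest

DEST_PREFIX, SRC_PREFIX = 1, 2  # component types 1 and 2


def compare_flowspecs(a, b):
    """Hypothetical helper: a and b are lists of (type, value) tuples,
    ordered by ascending component type.
    Returns -1 if a has precedence, 1 if b has precedence, 0 if equal."""
    for comp_a, comp_b in zip_longest(a, b):
        # A rule that lacks a component the other one has loses.
        if comp_a is None:
            return 1
        if comp_b is None:
            return -1
        (type_a, val_a), (type_b, val_b) = comp_a, comp_b
        # Types differ: the lower numeric type value wins.
        if type_a != type_b:
            return -1 if type_a < type_b else 1
        if type_a in (DEST_PREFIX, SRC_PREFIX):
            if val_a.overlaps(val_b):
                # One prefix contains the other: more specific wins.
                if val_a.prefixlen != val_b.prefixlen:
                    return -1 if val_a.prefixlen > val_b.prefixlen else 1
            else:
                # Unrelated prefixes: the lowest IP value wins.
                return -1 if val_a < val_b else 1
        else:
            # memcmp-style comparison over the common prefix.
            common = min(len(val_a), len(val_b))
            if val_a[:common] != val_b[:common]:
                return -1 if val_a[:common] < val_b[:common] else 1
            # Common prefix equal: the longer string wins.
            if len(val_a) != len(val_b):
                return -1 if len(val_a) > len(val_b) else 1
        # Components are equal: continue with the next component.
    return 0


rules = [
    [(DEST_PREFIX, ipaddress.IPv4Network("192.0.2.0/24"))],
    [(DEST_PREFIX, ipaddress.IPv4Network("192.0.2.0/25"))],
]
ordered = sorted(rules, key=cmp_to_key(compare_flowspecs))
```

Returning -1/0/1 lets the function be passed to functools.cmp_to_key, so a rule set can be sorted into match order regardless of the order in which the rules arrived.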
772 For IP prefix values (IP destination or source prefix): If one of the 773 two prefixes to compare is a more specific prefix of the other, the 774 more specific prefix has higher precedence. Otherwise, the one with 775 the lowest IP value has higher precedence. 777 For all other component types, unless otherwise specified, the 778 comparison is performed by comparing the component data as a binary 779 string using the memcmp() function as defined by [ISO_IEC_9899]. For 780 strings with equal lengths, the lowest string (memcmp) has higher 781 precedence. For strings of different lengths, the common prefix is 782 compared. If the common prefix is not equal, the string with the 783 lowest prefix has higher precedence. If the common prefix is equal, 784 the longer string is considered to have higher precedence than the 785 shorter one. 787 The code in Appendix A shows a Python3 implementation of the 788 comparison algorithm. The full code was tested with Python 3.6.3 and 789 can be obtained at 790 https://github.com/stoffi92/rfc5575bis/tree/master/flowspec-cmp [1]. 792 6. Validation Procedure 794 Flow Specifications received from a BGP peer that are accepted in the 795 respective Adj-RIB-In are used as input to the route selection 796 process. Although the forwarding attributes of two routes for the 797 same Flow Specification prefix may be the same, BGP is still required 798 to perform its path selection algorithm in order to select the 799 correct set of attributes to advertise. 801 The first step of the BGP Route Selection procedure (Section 9.1.2 of 802 [RFC4271]) is to exclude from the selection procedure routes that are 803 considered non-feasible. In the context of IP routing information, 804 this step is used to validate that the NEXT_HOP attribute of a given 805 route is resolvable. 807 The concept can be extended, in the case of the Flow Specification 808 NLRI, to allow other validation procedures.
810 The validation process described below validates Flow Specifications 811 against unicast routes received over the same AFI but the associated 812 unicast routing information SAFI: 814 Flow Specification received over SAFI=133 will be validated 815 against routes received over SAFI=1 817 Flow Specification received over SAFI=134 will be validated 818 against routes received over SAFI=128 820 In the absence of explicit configuration a Flow Specification NLRI 821 MUST be validated such that it is considered feasible if and only if 822 all of the conditions below are true: 824 a) A destination prefix component is embedded in the Flow 825 Specification. 827 b) The originator of the Flow Specification matches the originator 828 of the best-match unicast route for the destination prefix 829 embedded in the Flow Specification (this is the unicast route with 830 the longest possible prefix length covering the destination prefix 831 embedded in the Flow Specification). 833 c) There are no "more-specific" unicast routes, when compared with 834 the flow destination prefix, that have been received from a 835 different neighboring AS than the best-match unicast route, which 836 has been determined in rule b). 838 However, rule a) MAY be relaxed by explicit configuration, permitting 839 Flow Specifications that include no destination prefix component. If 840 such is the case, rules b) and c) are moot and MUST be disregarded. 842 By "originator" of a BGP route, we mean either the address of the 843 originator in the ORIGINATOR_ID Attribute [RFC4456], or the source IP 844 address of the BGP peer, if this path attribute is not present. 846 BGP implementations MUST also enforce that the AS_PATH attribute of a 847 route received via the External Border Gateway Protocol (eBGP) 848 contains the neighboring AS in the left-most position of the AS_PATH 849 attribute. While this rule is optional in the BGP specification, it 850 becomes necessary to enforce it here for security reasons. 
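Rules a) to c) above can be sketched as a single feasibility check. The sketch below is a simplified illustration, not a reference implementation: UnicastRoute, is_feasible, and the require_dest flag are hypothetical names, the route table is assumed to hold exactly one (already selected) unicast route per prefix, and "originator" is reduced to a plain string. It relies on ipaddress.IPv4Network.subnet_of(), which requires Python 3.7 or later.

```python
import ipaddress
from typing import NamedTuple, Optional, Sequence


class UnicastRoute(NamedTuple):
    # Hypothetical, simplified view of a selected unicast route.
    prefix: ipaddress.IPv4Network
    originator: str   # ORIGINATOR_ID, or the peer address if absent
    neighbor_as: int  # left-most AS in the AS_PATH


def is_feasible(dest: Optional[ipaddress.IPv4Network],
                originator: str,
                routes: Sequence[UnicastRoute],
                require_dest: bool = True) -> bool:
    # Rule a): a destination prefix component must be present,
    # unless relaxed by explicit configuration (then b and c are moot).
    if dest is None:
        return not require_dest

    # Best-match unicast route: longest prefix covering the destination.
    covering = [r for r in routes if dest.subnet_of(r.prefix)]
    if not covering:
        return False
    best = max(covering, key=lambda r: r.prefix.prefixlen)

    # Rule b): originators must match.
    if originator != best.originator:
        return False

    # Rule c): no more-specific unicast route from a different
    # neighboring AS than the best-match route.
    for r in routes:
        more_specific = r.prefix != dest and r.prefix.subnet_of(dest)
        if more_specific and r.neighbor_as != best.neighbor_as:
            return False
    return True
```

Because the best-match route can change independently of the Flow Specification, a check of this kind would have to be re-run whenever the unicast table changes.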
852 The best-match unicast route may change over time independently 853 of the Flow Specification NLRI. Therefore, a revalidation of the 854 Flow Specification NLRI MUST be performed whenever unicast routes 855 change. Revalidation is defined as retesting rules a) to c) as 856 described above. 858 Explanation: 860 The underlying concept is that the neighboring AS that advertises the 861 best unicast route for a destination is allowed to advertise Flow 862 Specification information that conveys a destination prefix that is 863 more or equally specific. Thus, as long as there are no "more- 864 specific" unicast routes, received from a different neighboring AS, 865 which would be affected by that Flow Specification, the Flow 866 Specification is validated successfully. 868 The neighboring AS is the immediate destination of the traffic 869 described by the Flow Specification. If it requests these flows to 870 be dropped, that request can be honored without concern that it 871 represents a denial of service in itself. The reasoning is that this 872 is as if the traffic is being dropped by the downstream autonomous 873 system, and there is no added value in carrying the traffic to it. 875 7. Traffic Filtering Actions 877 This document defines a minimum set of Traffic Filtering Actions that 878 it standardizes as BGP extended communities [RFC4360]. This is not 879 meant to be an inclusive list of all the possible actions, but only a 880 subset that can be interpreted consistently across the network. 881 Additional actions can be defined as either requiring standards or as 882 vendor specific. 884 The default action for a matching Flow Specification is to accept the 885 packet (treat the packet according to the normal forwarding behaviour 886 of the system). 888 This document defines the extended community values shown 889 in Table 2 in the form 0xttss where tt indicates the type and ss 890 indicates the sub-type of the extended community.
Encodings for 891 these extended communities are described below. 893 +-------------+---------------------------+-------------------------+ 894 | community | action | encoding | 895 | 0xttss | | | 896 +-------------+---------------------------+-------------------------+ 897 | 0x8006 | traffic-rate-bytes | 2-octet AS, 4-octet | 898 | | (Section 7.1) | float | 899 | TBD | traffic-rate-packets | 2-octet AS, 4-octet | 900 | | (Section 7.1) | float | 901 | 0x8007 | traffic-action | bitmask | 902 | | (Section 7.3) | | 903 | 0x8008 | rt-redirect AS-2octet | 2-octet AS, 4-octet | 904 | | (Section 7.4) | value | 905 | 0x8108 | rt-redirect IPv4 | 4-octet IPv4 address, | 906 | | (Section 7.4) | 2-octet value | 907 | 0x8208 | rt-redirect AS-4octet | 4-octet AS, 2-octet | 908 | | (Section 7.4) | value | 909 | 0x8009 | traffic-marking | DSCP value | 910 | | (Section 7.5) | | 911 +-------------+---------------------------+-------------------------+ 913 Table 2: Traffic Filtering Action Extended Communities 915 Multiple Traffic Filtering Actions defined in this document may be 916 present for a single Flow Specification and SHOULD be applied to the 917 traffic flow (for example traffic-rate-bytes and rt-redirect can be 918 applied to packets at the same time). If not all of the Traffic 919 Filtering Actions can be applied to a traffic flow they should be 920 treated as interfering Traffic Filtering Actions (see below). 922 Some Traffic Filtering Actions may interfere with each other or even 923 contradict. Section 7.7 of this document provides general 924 considerations on such Traffic Filtering Action interference. Any 925 additional definition of Traffic Filtering Actions SHOULD specify the 926 action to take if those Traffic Filtering Actions interfere (also 927 with existing Traffic Filtering Actions). 929 All Traffic Filtering Actions are specified as transitive BGP 930 Extended Communities. 932 7.1. 
Traffic Rate in Bytes (traffic-rate-bytes) sub-type 0x06 934 The traffic-rate-bytes extended community uses the following extended 935 community encoding: 937 The first two octets carry the 2-octet id, which can be assigned from 938 a 2-octet AS number. When a 4-octet AS number is locally present, 939 the 2 least significant octets of such an AS number can be used. 940 This value is purely informational and SHOULD NOT be interpreted by 941 the implementation. 943 The remaining 4 octets carry the maximum rate information in IEEE 944 floating point [IEEE.754.1985] format, units being bytes per second. 945 A traffic-rate-bytes of 0 should result in all traffic for the particular 946 flow being discarded. On encoding, the traffic-rate-bytes MUST NOT be 947 negative. On decoding, negative values MUST be treated as zero 948 (discard all traffic). 950 Interferes with: May interfere with the traffic-rate-packets (see 951 Section 7.2). A policy may allow both filtering by traffic-rate- 952 packets and traffic-rate-bytes. If the policy does not allow this, 953 these two actions will conflict. 955 7.2. Traffic Rate in Packets (traffic-rate-packets) sub-type TBD 957 The traffic-rate-packets extended community uses the same encoding as 958 the traffic-rate-bytes extended community. The floating point value 959 carries the maximum packet rate in packets per second. A traffic- 960 rate-packets of 0 should result in all traffic for the particular 961 flow being discarded. On encoding, the traffic-rate-packets MUST NOT 962 be negative. On decoding, negative values MUST be treated as zero 963 (discard all traffic). 965 Interferes with: May interfere with the traffic-rate-bytes (see 966 Section 7.1). A policy may allow both filtering by traffic-rate- 967 packets and traffic-rate-bytes. If the policy does not allow this, 968 these two actions will conflict. 970 7.3.
Traffic-action (traffic-action) sub-type 0x07 972 The traffic-action extended community consists of 6 octets of which 973 only the 2 least significant bits of the 6th octet (from left to 974 right) are defined by this document as shown in Figure 5. 976 0 1 2 3 977 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 978 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 979 | Traffic Action Field | 980 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 981 | Tr. Action Field (cont.) |S|T| 982 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 984 Figure 5: Traffic-action Extended Community Encoding 986 where S and T are defined as: 988 o T: Terminal Action (bit 47): When this bit is set, the traffic 989 filtering engine will evaluate any subsequent Flow Specifications 990 (as defined by the ordering procedure Section 5.1). If not set, 991 the evaluation of the traffic filters stops when this Flow 992 Specification is evaluated. 994 o S: Sample (bit 46): Enables traffic sampling and logging for this 995 Flow Specification (only effective when set). 997 o Traffic Action Field: Other Traffic Action Field (see Section 11) 998 bits unused in this specification. These bits SHOULD be set to 0 999 on encoding, and MUST be ignored during decoding. 1001 The use of the Terminal Action (bit 47) may result in more than one 1002 Flow Specification matching a particular traffic flow. All the 1003 Traffic Filtering Actions from these Flow Specifications shall be 1004 collected and applied. In case of interfering Traffic Filtering 1005 Actions it is an implementation decision which Traffic Filtering 1006 Actions are selected. See also Section 7.7. 1008 Interferes with: No other BGP Flow Specification Traffic Filtering 1009 Action in this document. 1011 7.4. 
RT Redirect (rt-redirect) sub-type 0x08 1013 The redirect extended community allows the traffic to be redirected 1014 to a VRF routing instance that lists the specified route-target in 1015 its import policy. If several local instances match these criteria, 1016 the choice between them is a local matter (for example, the instance 1017 with the lowest Route Distinguisher value can be selected). 1019 This Extended Community allows three different encoding formats for the 1020 route-target (type 0x80, 0x81, 0x82). It uses the same encoding as 1021 the Route Target Extended Community in Sections 3.1 (type 0x80: 1022 2-octet AS, 4-octet value), 3.2 (type 0x81: 4-octet IPv4 address, 1023 2-octet value) and 4 of [RFC4360] and Section 2 (type 0x82: 4-octet 1024 AS, 2-octet value) of [RFC5668] with the high-order octet of the Type 1025 field 0x80, 0x81, 0x82 respectively and the low-order octet of the Type 1026 field (Sub-Type) always 0x08. 1028 Interferes with: No other BGP Flow Specification Traffic Filtering 1029 Action in this document. 1031 7.5. Traffic Marking (traffic-marking) sub-type 0x09 1033 The traffic marking extended community instructs a system to modify 1034 the DSCP bits in the IP header ([RFC2474] Section 3) of a transiting 1035 IP packet to the corresponding value encoded in the 6 least 1036 significant bits of the extended community value as shown in 1037 Figure 6. 1039 The extended community is encoded as follows: 1041 0 1 2 3 1042 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1043 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1044 | reserved | reserved | reserved | reserved | 1045 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1046 | reserved | r.| DSCP | 1047 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1049 Figure 6: Traffic Marking Extended Community Encoding 1051 o DSCP: new DSCP value for the transiting IP packet. 1053 o reserved, r.: SHOULD be set to 0 on encoding, and MUST be ignored 1054 during decoding.
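To make the layout of Figure 6 concrete, the following Python sketch encodes and decodes a traffic-marking extended community as an 8-octet value: the type/sub-type pair 0x8009 from Table 2, followed by the 6-octet value field whose 6 least significant bits carry the DSCP. The function names are hypothetical and chosen for this example only.

```python
TYPE_TRANSITIVE_EXPERIMENTAL = 0x80  # "tt" of 0x8009 in Table 2
SUBTYPE_TRAFFIC_MARKING = 0x09       # "ss" of 0x8009 in Table 2


def encode_traffic_marking(dscp: int) -> bytes:
    """Hypothetical helper: build the 8-octet extended community."""
    if not 0 <= dscp <= 0x3F:
        raise ValueError("DSCP must fit in 6 bits")
    # Reserved bits are set to 0 on encoding.
    return bytes([TYPE_TRANSITIVE_EXPERIMENTAL, SUBTYPE_TRAFFIC_MARKING,
                  0, 0, 0, 0, 0, dscp])


def decode_traffic_marking(community: bytes) -> int:
    """Hypothetical helper: return the DSCP carried in the community."""
    if (len(community) != 8
            or community[0] != TYPE_TRANSITIVE_EXPERIMENTAL
            or community[1] != SUBTYPE_TRAFFIC_MARKING):
        raise ValueError("not a traffic-marking extended community")
    # Reserved bits are ignored on decoding: keep the low 6 bits only.
    return community[7] & 0x3F
```

Masking with 0x3F on decode means a community with non-zero reserved bits still yields a valid DSCP, which matches the "MUST be ignored during decoding" requirement.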
1056 Interferes with: No other BGP Flow Specification Traffic Filtering 1057 Action in this document. 1059 7.6. Interaction with other Filtering Mechanisms in Routers 1061 Implementations should provide mechanisms that map an arbitrary BGP 1062 community value (normal or extended) to Traffic Filtering Actions 1063 that require different mappings on different systems in the network. 1064 For instance, providing packets with a worse-than-best-effort per-hop 1065 behavior is a functionality that is likely to be implemented 1066 differently in different systems and for which no standard behavior 1067 is currently known. Rather than attempting to define it here, this 1068 can be accomplished by mapping a user-defined community value to 1069 platform-/network-specific behavior via user configuration. 1071 7.7. Considerations on Traffic Filtering Action Interference 1073 Since Traffic Filtering Actions are represented as BGP extended 1074 community values, Traffic Filtering Actions may interfere with each 1075 other (e.g. there may be more than one conflicting traffic-rate-bytes 1076 Traffic Filtering Action associated with a single Flow 1077 Specification). Traffic Filtering Action interference has no impact 1078 on BGP propagation of Flow Specifications (all communities are 1079 propagated according to policies). 1081 If a Flow Specification associated with interfering Traffic Filtering 1082 Actions is selected for packet forwarding, it is an implementation 1083 decision which of the interfering Traffic Filtering Actions are 1084 selected. Implementors of this specification SHOULD document the 1085 behaviour of their implementation in such cases. 1087 Operators are encouraged to make use of the BGP policy framework 1088 supported by their implementation in order to achieve a predictable 1089 behaviour. See also Section 12. 1091 8. 
Dissemination of Traffic Filtering in BGP/MPLS VPN Networks 1093 Provider-based Layer 3 VPN networks, such as the ones using a BGP/ 1094 MPLS IP VPN [RFC4364] control plane, may have different traffic 1095 filtering requirements than Internet service providers. But also 1096 Internet service providers may use those VPNs for scenarios like 1097 having the Internet routing table in a VRF, resulting in the same 1098 traffic filtering requirements as defined for the global routing 1099 table environment within this document. This document defines an 1100 additional BGP NLRI type (AFI=1, SAFI=134) value, which can be used 1101 to propagate Flow Specification in a BGP/MPLS VPN environment. 1103 The NLRI format for this address family consists of a fixed-length 1104 Route Distinguisher field (8 octets) followed by the Flow 1105 Specification NLRI value (Section 4.2). The NLRI length field shall 1106 include both the 8 octets of the Route Distinguisher as well as the 1107 subsequent Flow Specification NLRI value. The resulting encoding is 1108 shown in Figure 7. 1110 +--------------------------------+ 1111 | length (0xnn or 0xfn nn) | 1112 +--------------------------------+ 1113 | Route Distinguisher (8 octets) | 1114 +--------------------------------+ 1115 | NLRI value (variable) | 1116 +--------------------------------+ 1118 Figure 7: Flow Specification NLRI for MPLS 1120 Propagation of this NLRI is controlled by matching Route Target 1121 extended communities associated with the BGP path advertisement with 1122 the VRF import policy, using the same mechanism as described in BGP/ 1123 MPLS IP VPNs [RFC4364]. 1125 Flow Specifications received via this NLRI apply only to traffic that 1126 belongs to the VRF(s) in which it is imported. By default, traffic 1127 received from a remote PE is switched via an MPLS forwarding decision 1128 and is not subject to filtering. 
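The NLRI layout of Figure 7 can be illustrated with a short Python sketch. It assumes the length rule indicated by "0xnn or 0xfn nn" and by the "len<240 1-octet" notes in the examples of Section 4.3: lengths below 240 use one octet, larger lengths use two octets with the most significant nibble set to 0xf. The function name and the sample Route Distinguisher are hypothetical; the Flow Specification value reuses the octets of Example 3.

```python
def encode_vpn_flowspec_nlri(rd: bytes, flowspec_value: bytes) -> bytes:
    """Hypothetical helper: length + Route Distinguisher + NLRI value."""
    if len(rd) != 8:
        raise ValueError("Route Distinguisher must be 8 octets")
    # The length covers the RD as well as the Flow Specification value.
    length = len(rd) + len(flowspec_value)
    if length < 240:
        header = bytes([length])                      # 0xnn
    elif length <= 0x0FFF:
        header = (0xF000 | length).to_bytes(2, "big")  # 0xfn nn
    else:
        raise ValueError("NLRI too long for the extended length field")
    return header + rd + flowspec_value


# Sample (made-up) type 0 RD 64500:1 and the 9-octet value of Example 3.
sample_rd = bytes.fromhex("0000fbf400000001")
example3_value = bytes.fromhex("0120c00002010c8005")
nlri = encode_vpn_flowspec_nlri(sample_rd, example3_value)
```

With these inputs the length field covers 17 octets (8 for the Route Distinguisher plus 9 for the Flow Specification value), so the single-octet form is used.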
1130 Contrary to the behavior specified for the non-VPN NLRI, Flow 1131 Specifications are accepted by default, when received from remote PE 1132 routers. 1134 The validation procedure (Section 6) and Traffic Filtering Actions 1135 (Section 7) are the same as for IPv4. 1137 9. Traffic Monitoring 1139 Traffic filtering applications require monitoring and traffic 1140 statistics facilities. While this is an implementation specific 1141 choice, implementations SHOULD provide: 1143 o A mechanism to log the packet header of filtered traffic. 1145 o A mechanism to count the number of matches for a given Flow 1146 Specification. 1148 10. Error Handling 1150 Error handling according to [RFC7606] and [RFC4760] applies to this 1151 specification. 1153 This document introduces Traffic Filtering Action Extended 1154 Communities. Malformed Traffic Filtering Action Extended Communities 1155 in the sense of [RFC7606] Section 7.14. are Extended Community values 1156 that cannot be decoded according to Section 7 of this document. 1158 11. IANA Considerations 1160 This section complies with [RFC7153]. 1162 11.1. AFI/SAFI Definitions 1164 IANA maintains a registry entitled "SAFI Values". For the purpose of 1165 this work, IANA is requested to update the following SAFIs to read 1166 according to the table below (Note: This document obsoletes both 1167 RFC7674 and RFC5575 and all references to those documents should be 1168 deleted from the registry below): 1170 +-------+------------------------------------------+----------------+ 1171 | Value | Name | Reference | 1172 +-------+------------------------------------------+----------------+ 1173 | 133 | Dissemination of Flow Specification | [this | 1174 | | rules | document] | 1175 | 134 | L3VPN Dissemination of Flow | [this | 1176 | | Specification rules | document] | 1177 +-------+------------------------------------------+----------------+ 1179 Table 3: Registry: SAFI Values 1181 11.2. 
Flow Component Definitions 1183 A Flow Specification consists of a sequence of flow components, which 1184 are identified by an 8-bit component type. IANA has created and 1185 maintains a registry entitled "Flow Spec Component Types". IANA is 1186 requested to update the reference for this registry to [this 1187 document]. Furthermore the references to the values should be 1188 updated according to the table below (Note: This document obsoletes 1189 both RFC7674 and RFC5575 and all references to those documents should 1190 be deleted from the registry below). 1192 +-------+--------------------+-----------------+ 1193 | Value | Name | Reference | 1194 +-------+--------------------+-----------------+ 1195 | 1 | Destination Prefix | [this document] | 1196 | 2 | Source Prefix | [this document] | 1197 | 3 | IP Protocol | [this document] | 1198 | 4 | Port | [this document] | 1199 | 5 | Destination port | [this document] | 1200 | 6 | Source port | [this document] | 1201 | 7 | ICMP type | [this document] | 1202 | 8 | ICMP code | [this document] | 1203 | 9 | TCP flags | [this document] | 1204 | 10 | Packet length | [this document] | 1205 | 11 | DSCP | [this document] | 1206 | 12 | Fragment | [this document] | 1207 +-------+--------------------+-----------------+ 1209 Table 4: Registry: Flow Spec Component Types 1211 In order to manage the limited number space and accommodate several 1212 usages, the following policies defined by [RFC8126] are used: 1214 +--------------+-------------------------------+ 1215 | Type Values | Policy | 1216 +--------------+-------------------------------+ 1217 | 0 | Reserved | 1218 | [1 .. 12] | Defined by this specification | 1219 | [13 .. 127] | Specification required | 1220 | [128 .. 255] | First Come First Served | 1221 +--------------+-------------------------------+ 1223 Table 5: Flow Spec Component Types Policies 1225 11.3. 
Extended Community Flow Specification Actions 1227 The Extended Community Flow Specification Action types defined in 1228 this document consist of two parts: 1230 Type (BGP Transitive Extended Community Type) 1232 Sub-Type 1234 For the type-part, IANA maintains a registry entitled "BGP Transitive 1235 Extended Community Types". For the purpose of this work (Section 7), 1236 IANA is requested to update the references to the following entries 1237 according to the table below (Note: This document obsoletes both 1238 RFC7674 and RFC5575 and all references to those documents should be 1239 deleted in the registry below): 1241 +-------+-----------------------------------------------+-----------+ 1242 | Type | Name | Reference | 1243 | Value | | | 1244 +-------+-----------------------------------------------+-----------+ 1245 | 0x81 | Generic Transitive Experimental | [this | 1246 | | Use Extended Community Part 2 (Sub-Types are | document] | 1247 | | defined in the "Generic Transitive | | 1248 | | Experimental Use Extended Community Part 2 | | 1249 | | Sub-Types" Registry) | | 1250 | 0x82 | Generic Transitive Experimental | [this | 1251 | | Use Extended Community Part 3 | document] | 1252 | | (Sub-Types are defined in the "Generic | | 1253 | | Transitive Experimental Use | | 1254 | | Extended Community Part 3 Sub-Types" | | 1255 | | Registry) | | 1256 +-------+-----------------------------------------------+-----------+ 1258 Table 6: Registry: BGP Transitive Extended Community Types 1260 For the sub-type part of the extended community Traffic Filtering 1261 Actions IANA maintains the following registries. IANA is requested 1262 to update all names and references according to the tables below and 1263 assign a new value for the "Flow spec traffic-rate-packets" Sub-Type 1264 (Note: This document obsoletes both RFC7674 and RFC5575 and all 1265 references to those documents should be deleted from the registries 1266 below). 
1268 +----------+--------------------------------------------+-----------+ 1269 | Sub-Type | Name | Reference | 1270 | Value | | | 1271 +----------+--------------------------------------------+-----------+ 1272 | 0x06 | Flow spec traffic-rate-bytes | [this | 1273 | | | document] | 1274 | TBD | Flow spec traffic-rate-packets | [this | 1275 | | | document] | 1276 | 0x07 | Flow spec traffic-action (Use | [this | 1277 | | of the "Value" field is defined in the | document] | 1278 | | "Traffic Action Fields" registry) | | 1279 | 0x08 | Flow spec rt-redirect | [this | 1280 | | AS-2octet format | document] | 1281 | 0x09 | Flow spec traffic-remarking | [this | 1282 | | | document] | 1283 +----------+--------------------------------------------+-----------+ 1285 Table 7: Registry: Generic Transitive Experimental Use Extended 1286 Community Sub-Types 1288 +------------+----------------------------------------+-------------+ 1289 | Sub-Type | Name | Reference | 1290 | Value | | | 1291 +------------+----------------------------------------+-------------+ 1292 | 0x08 | Flow spec rt-redirect IPv4 | [this | 1293 | | format | document] | 1294 +------------+----------------------------------------+-------------+ 1296 Table 8: Registry: Generic Transitive Experimental Use Extended 1297 Community Part 2 Sub-Types 1299 +------------+-----------------------------------------+------------+ 1300 | Sub-Type | Name | Reference | 1301 | Value | | | 1302 +------------+-----------------------------------------+------------+ 1303 | 0x08 | Flow spec rt-redirect | [this | 1304 | | AS-4octet format | document] | 1305 +------------+-----------------------------------------+------------+ 1307 Table 9: Registry: Generic Transitive Experimental Use Extended 1308 Community Part 3 Sub-Types 1310 Furthermore IANA is requested to update the reference for the 1311 registries "Generic Transitive Experimental Use Extended Community 1312 Part 2 Sub-Types" and "Generic Transitive Experimental Use Extended 1313 
Community Part 3 Sub-Types" to [this document]. 1315 The "traffic-action" extended community (Section 7.3) defined in this 1316 document has 46 unused bits, which can be used to convey additional 1317 meaning. IANA created and maintains a registry entitled: "Traffic 1318 Action Fields". IANA is requested to update the reference for this 1319 registry to [this document]. Furthermore IANA is requested to update 1320 the references according to the table below. These values should be 1321 assigned via IETF Review rules only (Note: This document obsoletes 1322 both RFC7674 and RFC5575 and all references to those documents should 1323 be deleted from the registry below). 1325 +-----+-----------------+-----------------+ 1326 | Bit | Name | Reference | 1327 +-----+-----------------+-----------------+ 1328 | 47 | Terminal Action | [this document] | 1329 | 46 | Sample | [this document] | 1330 +-----+-----------------+-----------------+ 1332 Table 10: Registry: Traffic Action Fields 1334 12. Security Considerations 1336 As long as Flow Specifications are restricted to match the 1337 corresponding unicast routing paths for the relevant prefixes 1338 (Section 6), the security characteristics of this proposal are 1339 equivalent to the existing security properties of BGP unicast 1340 routing. Any relaxation of the validation procedure described in 1341 Section 6 may allow unwanted Flow Specifications to be propagated and 1342 thus unwanted Traffic Filtering Actions may be applied to flows. 1344 Where the above mechanisms are not in place, this could open the door 1345 to further denial-of-service attacks such as unwanted traffic 1346 filtering, remarking or redirection. 1348 Deployment of specific relaxations of the validation within an 1349 administrative boundary of a network are useful in some networks for 1350 quickly distributing filters to prevent denial-of-service attacks. 
1351 For a network to utilize this relaxation, the BGP policies must 1352 support additional filtering since the origin AS field is empty. 1353 Specifications relaxing the validation restrictions MUST contain 1354 security considerations that provide details on the required 1355 additional filtering. For example, the use of Origin validation can 1356 provide enhanced filtering within an AS confederation. 1358 Inter-provider routing is based on a web of trust. Neighboring 1359 autonomous systems are trusted to advertise valid reachability 1360 information. If this trust model is violated, a neighboring 1361 autonomous system may cause a denial-of-service attack by advertising 1362 reachability information for a given prefix for which it does not 1363 provide service (unfiltered address space hijack). Since validation 1364 of the Flow Specification is tied to the announcement of the best 1365 unicast route, a failure in the validation of the best-path route may 1366 prevent the Flow Specification from being used by a local router. 1367 Possible mitigations are [RFC6811] and [RFC8205]. 1369 At IXPs, routes are often exchanged via route servers which do not 1370 extend the AS_PATH. In such cases it is not possible to enforce that the 1371 left-most AS in the AS_PATH is the neighbor AS (the AS of the 1372 route server). Since the validation of Flow Specification 1373 (Section 6) depends on this, additional care must be taken. It is 1374 advised to use a strict inbound route policy in such scenarios. 1376 Enabling firewall-like capabilities in routers without centralized 1377 management could make certain failures harder to diagnose. For 1378 example, it is possible to allow TCP packets to pass between a pair 1379 of addresses but not ICMP packets. It is also possible to permit 1380 packets smaller than 900 or greater than 1000 octets to pass between 1381 a pair of addresses, but not packets whose length is in the range 1382 900-1000.
Such behavior may be confusing and these capabilities 1383 should be used with care whether manually configured or coordinated 1384 through the protocol extensions described in this document. 1386 Flow Specification BGP speakers (e.g. automated DDoS controllers) not 1387 properly programmed, algorithms that are not performing as expected, 1388 or simply rogue systems may announce unintended Flow Specifications, 1389 send updates at a high rate or generate a high number of Flow 1390 Specifications. This may stress the receiving systems, exceed their 1391 capacity, or lead to unwanted Traffic Filtering Actions being applied 1392 to flows. 1394 While the general verification of the Flow Specification NLRI is 1395 specified in this document (Section 6) the Traffic Filtering Actions 1396 received by a third party may need custom verification or filtering. 1397 In particular all non traffic-rate actions may allow a third party to 1398 modify packet forwarding properties and potentially gain access to 1399 other routing-tables/VPNs or undesired queues. This can be avoided 1400 by proper filtering/screening of the Traffic Filtering Action 1401 communities at network borders and only exposing a predefined subset 1402 of Traffic Filtering Actions (see Section 7) to third parties. One 1403 way to achieve this is by mapping user-defined communities, that can 1404 be set by the third party, to Traffic Filtering Actions and not 1405 accepting Traffic Filtering Action extended communities from third 1406 parties. 1408 This extension adds additional information to Internet routers. 1409 These are limited in terms of the maximum number of data elements 1410 they can hold as well as the number of events they are able to 1411 process in a given unit of time. Service providers need to consider 1412 the maximum capacity of their devices and may need to limit the 1413 number of Flow Specifications accepted and processed. 1415 13. 
Contributors

1417  Barry Greene, Pedro Marques, Jared Mauch, and Nischal Sheth were
1418  authors of [RFC5575] and are therefore contributing authors of this
1419  document.

1421  14. Acknowledgements

1423  The authors would like to thank Yakov Rekhter, Dennis Ferguson, Chris
1424  Morrow, Charlie Kaufman, and David Smith for their comments on the
1425  original [RFC5575]. Chaitanya Kodeboyina helped
1426  design the flow validation procedure, and Steven Lin and Jim Washburn
1427  ironed out all the details necessary to produce a working
1428  implementation in the original [RFC5575].

1430  A packet rate Traffic Filtering Action was also described in a Flow
1431  Specification extension draft, and the authors would like to thank
1432  Wesley Eddy, Justin Dailey, and Gilbert Clark for their work.

1434  Additionally, the authors would like to thank Alexander Mayrhofer,
1435  Nicolas Fevrier, Job Snijders, Jeffrey Haas, and Adam Chappell for
1436  their comments and review.

1438  15. References

1440  15.1. Normative References

1442  [IEEE.754.1985]
1443            IEEE, "Standard for Binary Floating-Point Arithmetic",
1444            IEEE 754-1985, August 1985.

1446  [ISO_IEC_9899]
1447            ISO, "Information technology -- Programming languages --
1448            C", ISO/IEC 9899:2018, June 2018.

1450  [RFC0768] Postel, J., "User Datagram Protocol", STD 6, RFC 768,
1451            DOI 10.17487/RFC0768, August 1980.

1454  [RFC0791] Postel, J., "Internet Protocol", STD 5, RFC 791,
1455            DOI 10.17487/RFC0791, September 1981.

1458  [RFC0792] Postel, J., "Internet Control Message Protocol", STD 5,
1459            RFC 792, DOI 10.17487/RFC0792, September 1981.

1462  [RFC0793] Postel, J., "Transmission Control Protocol", STD 7,
1463            RFC 793, DOI 10.17487/RFC0793, September 1981.

1466  [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
1467            Requirement Levels", BCP 14, RFC 2119,
1468            DOI 10.17487/RFC2119, March 1997.

1471  [RFC2474] Nichols, K., Blake, S., Baker, F., and D.
Black,
1472            "Definition of the Differentiated Services Field (DS
1473            Field) in the IPv4 and IPv6 Headers", RFC 2474,
1474            DOI 10.17487/RFC2474, December 1998.

1477  [RFC4271] Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A
1478            Border Gateway Protocol 4 (BGP-4)", RFC 4271,
1479            DOI 10.17487/RFC4271, January 2006.

1482  [RFC4360] Sangli, S., Tappan, D., and Y. Rekhter, "BGP Extended
1483            Communities Attribute", RFC 4360, DOI 10.17487/RFC4360,
1484            February 2006.

1486  [RFC4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private
1487            Networks (VPNs)", RFC 4364, DOI 10.17487/RFC4364, February
1488            2006.

1490  [RFC4456] Bates, T., Chen, E., and R. Chandra, "BGP Route
1491            Reflection: An Alternative to Full Mesh Internal BGP
1492            (IBGP)", RFC 4456, DOI 10.17487/RFC4456, April 2006.

1495  [RFC4760] Bates, T., Chandra, R., Katz, D., and Y. Rekhter,
1496            "Multiprotocol Extensions for BGP-4", RFC 4760,
1497            DOI 10.17487/RFC4760, January 2007.

1500  [RFC5668] Rekhter, Y., Sangli, S., and D. Tappan, "4-Octet AS
1501            Specific BGP Extended Community", RFC 5668,
1502            DOI 10.17487/RFC5668, October 2009.

1505  [RFC7153] Rosen, E. and Y. Rekhter, "IANA Registries for BGP
1506            Extended Communities", RFC 7153, DOI 10.17487/RFC7153,
1507            March 2014.

1509  [RFC7606] Chen, E., Ed., Scudder, J., Ed., Mohapatra, P., and K.
1510            Patel, "Revised Error Handling for BGP UPDATE Messages",
1511            RFC 7606, DOI 10.17487/RFC7606, August 2015.

1514  [RFC8126] Cotton, M., Leiba, B., and T. Narten, "Guidelines for
1515            Writing an IANA Considerations Section in RFCs", BCP 26,
1516            RFC 8126, DOI 10.17487/RFC8126, June 2017.

1519  [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
1520            2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
1521            May 2017.

1523  15.2. Informative References

1525  [I-D.ietf-idr-flow-spec-v6]
1526            Loibl, C., Raszuk, R., and S.
Hares, "Dissemination of
1527            Flow Specification Rules for IPv6",
1528            draft-ietf-idr-flow-spec-v6-10 (work in progress),
                November 2019.

1530  [RFC4303] Kent, S., "IP Encapsulating Security Payload (ESP)",
1531            RFC 4303, DOI 10.17487/RFC4303, December 2005.

1534  [RFC5575] Marques, P., Sheth, N., Raszuk, R., Greene, B., Mauch, J.,
1535            and D. McPherson, "Dissemination of Flow Specification
1536            Rules", RFC 5575, DOI 10.17487/RFC5575, August 2009.

1539  [RFC6811] Mohapatra, P., Scudder, J., Ward, D., Bush, R., and R.
1540            Austein, "BGP Prefix Origin Validation", RFC 6811,
1541            DOI 10.17487/RFC6811, January 2013.

1544  [RFC7674] Haas, J., Ed., "Clarification of the Flowspec Redirect
1545            Extended Community", RFC 7674, DOI 10.17487/RFC7674,
1546            October 2015.

1548  [RFC8205] Lepinski, M., Ed. and K. Sriram, Ed., "BGPsec Protocol
1549            Specification", RFC 8205, DOI 10.17487/RFC8205, September
1550            2017.

1552  15.3. URIs

1554  [1] https://github.com/stoffi92/rfc5575bis/tree/master/flowspec-cmp

1556  Appendix A. Python code: flow_rule_cmp

"""
Copyright (c) 2020 IETF Trust and the persons identified as authors of
the code. All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, is permitted pursuant to, and subject to the license
terms contained in, the Simplified BSD License set forth in Section
4.c of the IETF Trust's Legal Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info).
"""

import itertools
import ipaddress

# Result codes returned by flow_rule_cmp. The listing uses these names
# without defining them; the values below are assumed for illustration.
EQUAL = 0
A_HAS_PRECEDENCE = 1
B_HAS_PRECEDENCE = 2

# Component type codes of the destination prefix (type 1) and source
# prefix (type 2) components.
IP_DESTINATION = 1
IP_SOURCE = 2


def flow_rule_cmp(a, b):
    for comp_a, comp_b in itertools.zip_longest(a.components,
                                                b.components):
        # If a component type does not exist in one rule
        # this rule has lower precedence
        if not comp_a:
            return B_HAS_PRECEDENCE
        if not comp_b:
            return A_HAS_PRECEDENCE
        # higher precedence for lower component type
        if comp_a.component_type < comp_b.component_type:
            return A_HAS_PRECEDENCE
        if comp_a.component_type > comp_b.component_type:
            return B_HAS_PRECEDENCE
        # component types are equal -> type specific comparison
        if comp_a.component_type in (IP_DESTINATION, IP_SOURCE):
            # assuming comp_a.value, comp_b.value of type
            # ipaddress.IPv4Network
            if comp_a.value.overlaps(comp_b.value):
                # longest prefixlen has precedence
                if comp_a.value.prefixlen > comp_b.value.prefixlen:
                    return A_HAS_PRECEDENCE
                if comp_a.value.prefixlen < comp_b.value.prefixlen:
                    return B_HAS_PRECEDENCE
                # components equal -> continue with next component
            elif comp_a.value > comp_b.value:
                return B_HAS_PRECEDENCE
            elif comp_a.value < comp_b.value:
                return A_HAS_PRECEDENCE
        else:
            # assuming comp_a.value, comp_b.value of type bytearray
            if len(comp_a.value) == len(comp_b.value):
                if comp_a.value > comp_b.value:
                    return B_HAS_PRECEDENCE
                if comp_a.value < comp_b.value:
                    return A_HAS_PRECEDENCE
                # components equal -> continue with next component
            else:
                common = min(len(comp_a.value), len(comp_b.value))
                if comp_a.value[:common] > comp_b.value[:common]:
                    return B_HAS_PRECEDENCE
                elif comp_a.value[:common] < comp_b.value[:common]:
                    return A_HAS_PRECEDENCE
                # the first common bytes match
                elif len(comp_a.value) > len(comp_b.value):
                    return A_HAS_PRECEDENCE
                else:
                    return B_HAS_PRECEDENCE
    return EQUAL

1625  Appendix B.
Comparison with RFC 5575

1627  This document includes numerous editorial changes to [RFC5575]. It
1628  also completely incorporates the redirect action clarification
1629  document [RFC7674]. It is recommended to read the entire document.
1630  The authors, however, want to point out the following technical
1631  changes to [RFC5575]:

1633     Section 1 introduces the Flow Specification NLRI. In [RFC5575],
1634     this NLRI was defined as an opaque-key in BGP's database. This
1635     specification has removed all references to an opaque-key
1636     property. BGP implementations are able to understand the NLRI
1637     encoding.

1639     Section 4.2.1.1 defines a numeric operator and comparison bit
1640     combinations. In [RFC5575], the meaning of those bit combinations
1641     was not explicitly defined and was left open to the reader.

1643     Sections 4.2.2.3 through 4.2.2.8, Section 4.2.2.10, and
1644     Section 4.2.2.11 make use of the above numeric operator. The
1645     allowed length of the comparison value was not consistently
1646     defined in [RFC5575].

1648     Section 7 defines all Traffic Filtering Action Extended
1649     communities as transitive extended communities. [RFC5575] defined
1650     the traffic-rate action to be non-transitive and did not define
1651     the transitivity of the other Traffic Filtering Action communities
1652     at all.

1654     Section 7.2 introduces a new Traffic Filtering Action
1655     (traffic-rate-packets). This action did not exist in [RFC5575].

1657     Section 7.4 contains the same redirect actions already defined in
1658     [RFC5575]; however, these actions have been renamed to
1659     "rt-redirect" to make it clearer that the redirection is based on
1660     route-target. This section also completely incorporates the
1661     [RFC7674] clarifications of the Flowspec Redirect Extended
1662     Community.

1664     Section 7.7 contains general considerations on interfering traffic
1665     actions. Section 7.3 also cross-references this section.
1666     [RFC5575] did not mention this.
1668     Section 10 contains new error handling.

1670  Authors' Addresses

1672  Christoph Loibl
1673  next layer Telekom GmbH
1674  Mariahilfer Guertel 37/7
1675  Vienna 1150
1676  AT

1678  Phone: +43 664 1176414
1679  Email: cl@tix.at

1681  Susan Hares
1682  Huawei
1683  7453 Hickory Hill
1684  Saline, MI 48176
1685  USA

1687  Email: shares@ndzh.com

1688  Robert Raszuk
1689  Bloomberg LP
1690  731 Lexington Ave
1691  New York City, NY 10022
1692  USA

1694  Email: robert@raszuk.net

1696  Danny McPherson
1697  Verisign
1698  USA

1700  Email: dmcpherson@verisign.com

1702  Martin Bacher
1703  T-Mobile Austria
1704  Rennweg 97-99
1705  Vienna 1030
1706  AT

1708  Email: mb.ietf@gmail.com