idnits 2.17.1 draft-ietf-idr-rfc5575bis-21.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- -- The document has examples using IPv4 documentation addresses according to RFC6890, but does not use any IPv6 documentation addresses. Maybe there should be IPv6 examples, too? -- The draft header indicates that this document obsoletes RFC7674, but the abstract doesn't seem to directly say this. It does mention RFC7674 though, so this could be OK. -- The draft header indicates that this document obsoletes RFC5575, but the abstract doesn't seem to directly say this. It does mention RFC5575 though, so this could be OK. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (April 16, 2020) is 1469 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) -- Looks like a reference, but probably isn't: '137' on line 646 -- Looks like a reference, but probably isn't: '139' on line 646 -- Looks like a reference, but probably isn't: '1' on line 1549 -- Possible downref: Non-RFC (?) normative reference: ref. 
'IEEE.754.1985' ** Obsolete normative reference: RFC 793 (Obsoleted by RFC 9293) == Outdated reference: A later version (-22) exists of draft-ietf-idr-flow-spec-v6-10 -- Obsolete informational reference (is this intentional?): RFC 5575 (Obsoleted by RFC 8955) -- Obsolete informational reference (is this intentional?): RFC 7674 (Obsoleted by RFC 8955) Summary: 1 error (**), 0 flaws (~~), 2 warnings (==), 10 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 IDR Working Group C. Loibl 3 Internet-Draft next layer Telekom GmbH 4 Obsoletes: 5575,7674 (if approved) S. Hares 5 Intended status: Standards Track Huawei 6 Expires: October 18, 2020 R. Raszuk 7 Bloomberg LP 8 D. McPherson 9 Verisign 10 M. Bacher 11 T-Mobile Austria 12 April 16, 2020 14 Dissemination of Flow Specification Rules 15 draft-ietf-idr-rfc5575bis-21 17 Abstract 19 This document defines a Border Gateway Protocol Network Layer 20 Reachability Information (BGP NLRI) encoding format that can be used 21 to distribute traffic Flow Specifications. This allows the routing 22 system to propagate information regarding more specific components of 23 the traffic aggregate defined by an IP destination prefix. 25 It also specifies BGP Extended Community encoding formats, that can 26 be used to propagate Traffic Filtering Actions along with the Flow 27 Specification NLRI. Those Traffic Filtering Actions encode actions a 28 routing system can take if the packet matches the Flow Specification. 30 Additionally, it defines two applications of that encoding format: 31 one that can be used to automate inter-domain coordination of traffic 32 filtering, such as what is required in order to mitigate 33 (distributed) denial-of-service attacks, and a second application to 34 provide traffic filtering in the context of a BGP/MPLS VPN service. 35 Other applications (ie. 
centralized control of traffic in a SDN or 36 NFV context) are also possible. Other documents may specify Flow 37 Specification extensions. 39 The information is carried via BGP, thereby reusing protocol 40 algorithms, operational experience, and administrative processes such 41 as inter-provider peering agreements. 43 This document obsoletes both RFC5575 and RFC7674. 45 Status of This Memo 47 This Internet-Draft is submitted in full conformance with the 48 provisions of BCP 78 and BCP 79. 50 Internet-Drafts are working documents of the Internet Engineering 51 Task Force (IETF). Note that other groups may also distribute 52 working documents as Internet-Drafts. The list of current Internet- 53 Drafts is at https://datatracker.ietf.org/drafts/current/. 55 Internet-Drafts are draft documents valid for a maximum of six months 56 and may be updated, replaced, or obsoleted by other documents at any 57 time. It is inappropriate to use Internet-Drafts as reference 58 material or to cite them other than as "work in progress." 60 This Internet-Draft will expire on October 18, 2020. 62 Copyright Notice 64 Copyright (c) 2020 IETF Trust and the persons identified as the 65 document authors. All rights reserved. 67 This document is subject to BCP 78 and the IETF Trust's Legal 68 Provisions Relating to IETF Documents 69 (https://trustee.ietf.org/license-info) in effect on the date of 70 publication of this document. Please review these documents 71 carefully, as they describe your rights and restrictions with respect 72 to this document. Code Components extracted from this document must 73 include Simplified BSD License text as described in Section 4.e of 74 the Trust Legal Provisions and are provided without warranty as 75 described in the Simplified BSD License. 77 Table of Contents 79 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 80 2. Definitions of Terms Used in This Memo . . . . . . . . . . . 5 81 3. Flow Specifications . . . . . . . . . . . . . . . . . 
. . . . 5 82 4. Dissemination of IPv4 Flow Specification Information . . . . 6 83 4.1. Length Encoding . . . . . . . . . . . . . . . . . . . . . 7 84 4.2. NLRI Value Encoding . . . . . . . . . . . . . . . . . . . 7 85 4.2.1. Operators . . . . . . . . . . . . . . . . . . . . . . 7 86 4.2.2. Components . . . . . . . . . . . . . . . . . . . . . 9 87 4.3. Examples of Encodings . . . . . . . . . . . . . . . . . . 14 88 5. Traffic Filtering . . . . . . . . . . . . . . . . . . . . . . 16 89 5.1. Ordering of Flow Specifications . . . . . . . . . . . . . 17 90 6. Validation Procedure . . . . . . . . . . . . . . . . . . . . 18 91 7. Traffic Filtering Actions . . . . . . . . . . . . . . . . . . 19 92 7.1. Traffic Rate in Bytes (traffic-rate-bytes) sub-type 0x06 20 93 7.2. Traffic Rate in Packets (traffic-rate-packets) sub-type 94 TBD . . . . . . . . . . . . . . . . . . . . . . . . . . . 21 95 7.3. Traffic-action (traffic-action) sub-type 0x07 . . . . . . 21 96 7.4. RT Redirect (rt-redirect) sub-type 0x08 . . . . . . . . . 22 97 7.5. Traffic Marking (traffic-marking) sub-type 0x09 . . . . . 22 98 7.6. Interaction with other Filtering Mechanisms in Routers . 23 99 7.7. Considerations on Traffic Filtering Action Interference . 23 100 8. Dissemination of Traffic Filtering in BGP/MPLS VPN Networks . 24 101 9. Traffic Monitoring . . . . . . . . . . . . . . . . . . . . . 25 102 10. Error Handling . . . . . . . . . . . . . . . . . . . . . . . 25 103 11. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 25 104 11.1. AFI/SAFI Definitions . . . . . . . . . . . . . . . . . . 25 105 11.2. Flow Component Definitions . . . . . . . . . . . . . . . 26 106 11.3. Extended Community Flow Specification Actions . . . . . 27 107 12. Security Considerations . . . . . . . . . . . . . . . . . . . 29 108 13. Contributors . . . . . . . . . . . . . . . . . . . . . . . . 31 109 14. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 31 110 15. References . . . . . . . . . . . . . . . 
. . . . . . . . . . 31 111 15.1. Normative References . . . . . . . . . . . . . . . . . . 31 112 15.2. Informative References . . . . . . . . . . . . . . . . . 33 113 15.3. URIs . . . . . . . . . . . . . . . . . . . . . . . . . . 34 114 Appendix A. Python code: flow_rule_cmp . . . . . . . . . . . . . 34 115 Appendix B. Comparison with RFC 5575 . . . . . . . . . . . . . . 35 116 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 36 118 1. Introduction 120 This document obsoletes "Dissemination of Flow Specification Rules" 121 [RFC5575], the differences can be found in Appendix B. This document 122 also obsoletes 123 "Clarification of the Flowspec Redirect Extended Community" [RFC7674] 124 since it incorporates the encoding of the BGP Flow Specification 125 Redirect Extended Community in Section 7.4. 127 Modern IP routers contain both the capability to forward traffic 128 according to IP prefixes as well as to classify, shape, rate limit, 129 filter, or redirect packets based on administratively defined 130 policies. These traffic policy mechanisms allow the operator to 131 define match rules that operate on multiple fields of the packet 132 header. Actions such as the ones described above can be associated 133 with each rule. 135 The n-tuple consisting of the matching criteria defines an aggregate 136 traffic Flow Specification. The matching criteria can include 137 elements such as source and destination address prefixes, IP 138 protocol, and transport protocol port numbers. 140 Section 4 of this document defines a general procedure to encode Flow 141 Specification for aggregated traffic flows so that they can be 142 distributed as a BGP [RFC4271] NLRI. Additionally, Section 7 of this 143 document defines the required Traffic Filtering Actions BGP Extended 144 Communities and mechanisms to use BGP for intra- and inter-provider 145 distribution of traffic filtering rules to filter (distributed) 146 denial-of-service (DoS) attacks. 
148 By expanding routing information with Flow Specifications, the 149 routing system can take advantage of the ACL (Access Control List) or 150 firewall capabilities in the router's forwarding path. Flow 151 Specifications can be seen as more specific routing entries to a 152 unicast prefix and are expected to depend upon the existing unicast 153 data information. 155 A Flow Specification received from an external autonomous system will 156 need to be validated against unicast routing before being accepted 157 (Section 6). The Flow Specification received from an internal BGP 158 peer within the same autonomous system [RFC4271] is assumed to have 159 been validated prior to transmission within the internal BGP (iBGP) 160 mesh of an autonomous system. If the aggregate traffic flow defined 161 by the unicast destination prefix is forwarded to a given BGP peer, 162 then the local system can install more specific Flow Specifications 163 that may result in different forwarding behavior, as requested by 164 this system. 166 From an operational perspective, the utilization of BGP as the 167 carrier for this information allows a network service provider to 168 reuse both internal route distribution infrastructure (e.g., route 169 reflector or confederation design) and existing external 170 relationships (e.g., inter-domain BGP sessions to a customer 171 network). 173 While it is certainly possible to address this problem using other 174 mechanisms, this solution has been utilized in deployments because of 175 the substantial advantage of being an incremental addition to already 176 deployed mechanisms. 178 In current deployments, the information distributed by this extension 179 is originated both manually as well as automatically, the latter by 180 systems that are able to detect malicious traffic flows. 
When 181 automated systems are used, care should be taken to ensure their 182 correctness as well as the limitations of the systems that receive 183 and process the advertised Flow Specifications (see also Section 12). 185 This specification defines required protocol extensions to address 186 most common applications of IPv4 unicast and VPNv4 unicast filtering. 187 The same mechanism can be reused and new match criteria added to 188 address similar filtering needs for other BGP address families such 189 as IPv6 families [I-D.ietf-idr-flow-spec-v6]. 191 2. Definitions of Terms Used in This Memo 193 AFI - Address Family Identifier. 195 AS - Autonomous System. 197 Loc-RIB - The Loc-RIB contains the routes that have been selected 198 by the local BGP speaker's Decision Process [RFC4271]. 200 NLRI - Network Layer Reachability Information. 202 PE - Provider Edge router. 204 RIB - Routing Information Base. 206 SAFI - Subsequent Address Family Identifier. 208 VRF - Virtual Routing and Forwarding instance. 210 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 211 "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and 212 "OPTIONAL" in this document are to be interpreted as described in BCP 213 14 [RFC2119] [RFC8174] when, and only when, they appear in all 214 capitals, as shown here. 216 3. Flow Specifications 218 A Flow Specification is an n-tuple consisting of several matching 219 criteria that can be applied to IP traffic. A given IP packet is 220 said to match the defined Flow Specification if it matches all the 221 specified criteria. This n-tuple is encoded into a BGP NLRI defined 222 below. 224 A given Flow Specification may be associated with a set of 225 attributes, depending on the particular application; such attributes 226 may or may not include reachability information (i.e., NEXT_HOP). 227 Well-known or AS-specific community attributes can be used to encode 228 a set of predetermined actions. 
230    A particular application is identified by a specific (Address Family
231    Identifier, Subsequent Address Family Identifier (AFI, SAFI)) pair
232    [RFC4760] and corresponds to a distinct set of RIBs. Those RIBs
233    should be treated independently from each other in order to assure
234    non-interference between distinct applications.

236    BGP itself treats the NLRI as a key to an entry in its databases.
237    Entries that are placed in the Loc-RIB are then associated with a
238    given set of semantics, which is application dependent. This is
239    consistent with existing BGP applications. For instance, IP unicast
240    routing (AFI=1, SAFI=1) and IP multicast reverse-path information
241    (AFI=1, SAFI=2) are handled by BGP without any particular semantics
242    being associated with them until installed in the Loc-RIB.

244    Standard BGP policy mechanisms, such as UPDATE filtering by NLRI
245    prefix as well as community matching and manipulation, must apply to
246    the Flow Specification defined NLRI-type, especially in an inter-
247    domain environment. Network operators can also control propagation
248    of such routing updates by enabling or disabling the exchange of a
249    particular (AFI, SAFI) pair on a given BGP peering session.

251 4. Dissemination of IPv4 Flow Specification Information

253    This document defines a Flow Specification NLRI type (Figure 1) that
254    may include several components such as destination prefix, source
255    prefix, protocol, ports, and others (see Section 4.2 below).

257    This NLRI information is encoded using MP_REACH_NLRI and
258    MP_UNREACH_NLRI attributes as defined in [RFC4760]. When advertising
259    Flow Specifications, the Length of Next Hop Network Address SHOULD be
260    set to 0. The Network Address of Next Hop field MUST be ignored.

262    The NLRI field of the MP_REACH_NLRI and MP_UNREACH_NLRI is encoded as
263    one or more 2-tuples of the form <length, NLRI value>. It consists
264    of a 1- or 2-octet length field followed by a variable-length NLRI
265    value.
    The length is expressed in octets.

267      +-------------------------------+
268      |    length (0xnn or 0xfnnn)    |
269      +-------------------------------+
270      |    NLRI value   (variable)    |
271      +-------------------------------+

273          Figure 1: Flow Specification NLRI for IPv4

275    Implementations wishing to exchange Flow Specification MUST use BGP's
276    Capability Advertisement facility to exchange the Multiprotocol
277    Extension Capability Code (Code 1) as defined in [RFC4760]. The
278    (AFI, SAFI) pair carried in the Multiprotocol Extension Capability
279    MUST be (AFI=1, SAFI=133) for IPv4 Flow Specification, and (AFI=1,
280    SAFI=134) for VPNv4 Flow Specification.

282 4.1. Length Encoding

284    o  If the NLRI length is smaller than 240 (0xf0 hex) octets, the
285       length field can be encoded as a single octet.

287    o  Otherwise, it is encoded as an extended-length 2-octet value in
288       which the most significant nibble of the first octet is all ones.

290    In Figure 1 above, values less than 240 are encoded using two hex
291    digits (0xnn). Values above 239 are encoded using 3 hex digits
292    (0xfnnn). The highest value that can be represented with this
293    encoding is 4095. For example, the length value of 239 is encoded as
294    0xef (single octet) while 240 is encoded as 0xf0f0 (2-octet).

296 4.2. NLRI Value Encoding

298    The Flow Specification NLRI value consists of a list of optional
299    components and is encoded as follows:

301    Encoding: <[component]+>

303    A specific packet is considered to match the Flow Specification when
304    it matches the intersection (AND) of all the components present in
305    the Flow Specification.

307    Components MUST follow strict type ordering by increasing numerical
308    order. A given component type may (exactly once) or may not be
309    present in the Flow Specification. If present, it MUST precede any
310    component of higher numeric type value.

312    All combinations of components within a single Flow Specification are
313    allowed.
    However, some combinations cannot match any packets (e.g.,
314    "ICMP Type AND Port" will never match any packets), and thus SHOULD
315    NOT be propagated by BGP.

317    An NLRI value not encoded as specified in Section 4.2 is considered
318    malformed, and error handling according to Section 10 is performed.

320 4.2.1. Operators

322    Most of the components described below make use of comparison
323    operators. Which of the two operators is used is defined by the
324    components in Section 4.2.2. The operators are encoded as a single
325    octet.

327 4.2.1.1. Numeric Operator (numeric_op)

329    This operator is encoded as shown in Figure 2.

331         0   1   2   3   4   5   6   7
332       +---+---+---+---+---+---+---+---+
333       | e | a |  len  | 0 |lt |gt |eq |
334       +---+---+---+---+---+---+---+---+

336          Figure 2: Numeric Operator (numeric_op)

338    e - end-of-list bit: Set in the last {op, value} pair in the list.

340    a - AND bit: If unset, the result of the previous {op, value} pair
341       is logically ORed with the current one. If set, the operation is
342       a logical AND. In the first operator octet of a sequence, it
343       SHOULD be encoded as unset and MUST be treated as always unset on
344       decoding. The AND operator has higher priority than OR for the
345       purposes of evaluating logical expressions.

347    len - length: The length of the value field for this operator given
348       as (1 << len). This encodes 1 (len=00), 2 (len=01), 4 (len=10), 8
349       (len=11) octets.

351    0 - SHOULD be set to 0 on NLRI encoding, and MUST be ignored during
352       decoding.

354    lt - less than comparison between data and value.

356    gt - greater than comparison between data and value.

358    eq - equality between data and value.

360    The bits lt, gt, and eq can be combined to produce common relational
361    operators such as "less or equal", "greater or equal", and "not equal
362    to" as shown in Table 1.
364       +----+----+----+-----------------------------------+
365       | lt | gt | eq | Resulting operation               |
366       +----+----+----+-----------------------------------+
367       | 0  | 0  | 0  | false (independent of the value)  |
368       | 0  | 0  | 1  | == (equal)                        |
369       | 0  | 1  | 0  | > (greater than)                  |
370       | 0  | 1  | 1  | >= (greater than or equal)        |
371       | 1  | 0  | 0  | < (less than)                     |
372       | 1  | 0  | 1  | <= (less than or equal)           |
373       | 1  | 1  | 0  | != (not equal value)              |
374       | 1  | 1  | 1  | true (independent of the value)   |
375       +----+----+----+-----------------------------------+

377          Table 1: Comparison operation combinations

379 4.2.1.2. Bitmask Operator (bitmask_op)

381    This operator is encoded as shown in Figure 3.

383         0   1   2   3   4   5   6   7
384       +---+---+---+---+---+---+---+---+
385       | e | a |  len  | 0 | 0 |not| m |
386       +---+---+---+---+---+---+---+---+

388          Figure 3: Bitmask Operator (bitmask_op)

390    e, a, len - Most significant nibble: (end-of-list bit, AND bit, and
391       length field), as defined in the Numeric Operator format in
392       Section 4.2.1.1.

394    not - NOT bit: If set, logical negation of operation.

396    m - Match bit: If set, this is a bitwise match operation defined as
397       "(data AND value) == value"; if unset, (data AND value) evaluates
398       to TRUE if any of the bits in the value mask are set in the data.

400    0 - all 0 bits: SHOULD be set to 0 on NLRI encoding, and MUST be
401       ignored during decoding.

403 4.2.2. Components

405    The encoding of each of the components begins with a type field (1
406    octet) followed by a variable length parameter. The following
407    sections define component types and parameter encodings for the IPv4
408    IP layer and transport layer headers. IPv6 NLRI component types are
409    described in [I-D.ietf-idr-flow-spec-v6].

411 4.2.2.1. Type 1 - Destination Prefix

413    Encoding: <type (1 octet), length (1 octet), prefix (variable)>

415    Defines the destination prefix to match. The length and prefix
416    fields are encoded as in BGP UPDATE messages [RFC4271].

418 4.2.2.2. Type 2 - Source Prefix

420    Encoding: <type (1 octet), length (1 octet), prefix (variable)>

422    Defines the source prefix to match. The length and prefix fields are
423    encoded as in BGP UPDATE messages [RFC4271].

425 4.2.2.3. Type 3 - IP Protocol

427    Encoding: <type (1 octet), [numeric_op, value]+>

429    Contains a list of {numeric_op, value} pairs that are used to match
430    the IP protocol value octet in the IP packet header (see [RFC0791]
431    Section 3.1).

433    This component uses the Numeric Operator (numeric_op) described in
434    Section 4.2.1.1. Type 3 component values SHOULD be encoded as a single
435    octet (numeric_op len=00).

437 4.2.2.4. Type 4 - Port

439    Encoding: <type (1 octet), [numeric_op, value]+>

441    Defines a list of {numeric_op, value} pairs that matches source OR
442    destination TCP/UDP ports (see [RFC0793] Section 3.1 and [RFC0768]
443    Section "Format"). This component matches if either the destination
444    port OR the source port of an IP packet matches the value.

446    This component uses the Numeric Operator (numeric_op) described in
447    Section 4.2.1.1. Type 4 component values SHOULD be encoded as 1- or
448    2-octet quantities (numeric_op len=00 or len=01).

450    If the port (destination-port, source-port) component is present,
451    only TCP or UDP packets can match the entire Flow
452    Specification. The port component, if present, never matches when
453    the packet's IP protocol value is not 6 (TCP) or 17 (UDP), if the
454    packet is fragmented and this is not the first fragment, or if the
455    system is unable to locate the transport header. Different
456    implementations may or may not be able to decode the transport header
457    in the presence of IP options or Encapsulating Security Payload (ESP)
458    NULL [RFC4303] encryption.

460 4.2.2.5. Type 5 - Destination Port

462    Encoding: <type (1 octet), [numeric_op, value]+>

464    Defines a list of {numeric_op, value} pairs used to match the
465    destination port of a TCP or UDP packet (see also [RFC0793]
466    Section 3.1 and [RFC0768] Section "Format").

468    This component uses the Numeric Operator (numeric_op) described in
469    Section 4.2.1.1.
    Type 5 component values SHOULD be encoded as 1- or
470    2-octet quantities (numeric_op len=00 or len=01).

472    The last paragraph of Section 4.2.2.4 also applies to this component.

474 4.2.2.6. Type 6 - Source Port

476    Encoding: <type (1 octet), [numeric_op, value]+>

478    Defines a list of {numeric_op, value} pairs used to match the source
479    port of a TCP or UDP packet (see also [RFC0793] Section 3.1 and
480    [RFC0768] Section "Format").

482    This component uses the Numeric Operator (numeric_op) described in
483    Section 4.2.1.1. Type 6 component values SHOULD be encoded as 1- or
484    2-octet quantities (numeric_op len=00 or len=01).

486    The last paragraph of Section 4.2.2.4 also applies to this component.

488 4.2.2.7. Type 7 - ICMP type

490    Encoding: <type (1 octet), [numeric_op, value]+>

492    Defines a list of {numeric_op, value} pairs used to match the type
493    field of an ICMP packet (see also [RFC0792] Section "Message
494    Formats").

496    This component uses the Numeric Operator (numeric_op) described in
497    Section 4.2.1.1. Type 7 component values SHOULD be encoded as a single
498    octet (numeric_op len=00).

500    If the ICMP type (code) component is present, only ICMP
501    packets can match the entire Flow Specification. The ICMP type
502    (code) component, if present, never matches when the packet's IP
503    protocol value is not 1 (ICMP), if the packet is fragmented and this
504    is not the first fragment, or if the system is unable to locate the
505    transport header. Different implementations may or may not be able
506    to decode the transport header in the presence of IP options or
507    Encapsulating Security Payload (ESP) NULL [RFC4303] encryption.

509 4.2.2.8. Type 8 - ICMP code

511    Encoding: <type (1 octet), [numeric_op, value]+>

513    Defines a list of {numeric_op, value} pairs used to match the code
514    field of an ICMP packet (see also [RFC0792] Section "Message
515    Formats").

517    This component uses the Numeric Operator (numeric_op) described in
518    Section 4.2.1.1. Type 8 component values SHOULD be encoded as a single
519    octet (numeric_op len=00).
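
   As an illustration only (not part of this specification), the
   {numeric_op, value} lists used by the components above could be
   decoded and evaluated along the following lines. The sketch assumes
   that the input is the octet string following the component type
   octet and that values are in network byte order, as Example 2 in
   Section 4.3 suggests. The function names parse_numeric_ops,
   term_matches, and numeric_match are hypothetical.

```python
def parse_numeric_ops(data: bytes):
    """Decode component data (after the type octet) into (op, value) pairs."""
    pairs, i = [], 0
    while i < len(data):
        op = data[i]
        size = 1 << ((op >> 4) & 0x03)      # len field: 1, 2, 4 or 8 octets
        raw = data[i + 1:i + 1 + size]
        if len(raw) != size:
            raise ValueError("truncated numeric_op value")
        # Assumption: values are big-endian (network byte order).
        pairs.append((op, int.from_bytes(raw, "big")))
        i += 1 + size
        if op & 0x80:                       # e bit: last pair in the list
            break
    return pairs


def term_matches(op: int, value: int, data: int) -> bool:
    """Apply the lt/gt/eq bits of one operator octet (see Table 1)."""
    return bool((op & 0x04 and data < value)
                or (op & 0x02 and data > value)
                or (op & 0x01 and data == value))


def numeric_match(pairs, data: int) -> bool:
    """Evaluate the list; AND binds tighter than OR."""
    groups = []                             # one entry per AND-group
    for index, (op, value) in enumerate(pairs):
        term = term_matches(op, value, data)
        if index > 0 and op & 0x40:         # a bit, ignored in the first octet
            groups[-1] = groups[-1] and term
        else:
            groups.append(term)             # start a new OR alternative
    return any(groups)


# Port component data from Example 2, without the leading type octet 0x04.
example_pairs = parse_numeric_ops(bytes.fromhex("0389458b911f90"))
```

   With the data from Example 2, example_pairs holds three pairs
   (>= 137, AND <= 139, == 8080), and numeric_match(example_pairs, port)
   is true for ports 137 through 139 and for port 8080.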
521    The last paragraph of Section 4.2.2.7 also applies to this component.

523 4.2.2.9. Type 9 - TCP flags

525    Encoding: <type (1 octet), [bitmask_op, bitmask]+>

527    Defines a list of {bitmask_op, bitmask} pairs used to match TCP
528    Control Bits (see also [RFC0793] Section 3.1).

530    This component uses the Bitmask Operator (bitmask_op) described in
531    Section 4.2.1.2. Type 9 component bitmasks MUST be encoded as 1- or
532    2-octet bitmasks (bitmask_op len=00 or len=01).

534    When a single octet (bitmask_op len=00) is specified, it matches
535    octet 14 of the TCP header (see also [RFC0793] Section 3.1), which
536    contains the TCP Control Bits. When a 2-octet (bitmask_op len=01)
537    encoding is used, it matches octets 13 and 14 of the TCP header with
538    the data offset (leftmost 4 bits) always treated as 0.

540    If the TCP flags component is present, only TCP packets
541    can match the entire Flow Specification. The TCP flags component, if
542    present, never matches when the packet's IP protocol value is not 6
543    (TCP), if the packet is fragmented and this is not the first
544    fragment, or if the system is unable to locate the transport header.
545    Different implementations may or may not be able to decode the
546    transport header in the presence of IP options or Encapsulating
547    Security Payload (ESP) NULL [RFC4303] encryption.

549 4.2.2.10. Type 10 - Packet length

551    Encoding: <type (1 octet), [numeric_op, value]+>

553    Defines a list of {numeric_op, value} pairs used to match on the
554    total IP packet length (excluding Layer 2 but including IP header).

556    This component uses the Numeric Operator (numeric_op) described in
557    Section 4.2.1.1. Type 10 component values SHOULD be encoded as 1- or
558    2-octet quantities (numeric_op len=00 or len=01).

560 4.2.2.11. Type 11 - DSCP (Diffserv Code Point)

562    Encoding: <type (1 octet), [numeric_op, value]+>

564    Defines a list of {numeric_op, value} pairs used to match the 6-bit
565    DSCP field (see also [RFC2474]).

567    This component uses the Numeric Operator (numeric_op) described in
568    Section 4.2.1.1.
    Type 11 component values MUST be encoded as a single
569    octet (numeric_op len=00).

571    The six least significant bits contain the DSCP value. All other
572    bits SHOULD be treated as 0.

574 4.2.2.12. Type 12 - Fragment

576    Encoding: <type (1 octet), [bitmask_op, bitmask]+>

578    Defines a list of {bitmask_op, bitmask} pairs used to match specific
579    IP fragments.

581    This component uses the Bitmask Operator (bitmask_op) described in
582    Section 4.2.1.2. The Type 12 component bitmask MUST be encoded as a
583    single octet bitmask (bitmask_op len=00).

585         0   1   2   3   4   5   6   7
586       +---+---+---+---+---+---+---+---+
587       | 0 | 0 | 0 | 0 |LF |FF |IsF|DF |
588       +---+---+---+---+---+---+---+---+

590          Figure 4: Fragment Bitmask Operand

592    Bitmask values:

594    DF - Don't fragment - match if [RFC0791] IP Header Flags Bit-1 (DF)
595       is 1

597    IsF - Is a fragment - match if [RFC0791] IP Header Fragment Offset
598       is not 0

600    FF - First fragment - match if [RFC0791] IP Header Fragment Offset
601       is 0 AND Flags Bit-2 (MF) is 1

603    LF - Last fragment - match if [RFC0791] IP Header Fragment Offset is
604       not 0 AND Flags Bit-2 (MF) is 0

606    0 - SHOULD be set to 0 on NLRI encoding, and MUST be ignored during
607       decoding.

609 4.3. Examples of Encodings

611 4.3.1. Example 1

613    An example of a Flow Specification NLRI encoding for: "all packets to
614    192.0.2.0/24 and TCP port 25".
616    +--------+----------------+----------+----------+
617    | length | destination    | protocol | port     |
618    +--------+----------------+----------+----------+
619    | 0x0b   | 01 18 c0 00 02 | 03 81 06 | 04 81 19 |
620    +--------+----------------+----------+----------+

622    Decoded:

624    +-------+------------+-------------------------------+
625    | Value |            |                               |
626    +-------+------------+-------------------------------+
627    | 0x0b  | length     | 11 octets (len<240 1-octet)   |
628    | 0x01  | type       | Type 1 - Destination Prefix   |
629    | 0x18  | length     | 24 bit                        |
630    | 0xc0  | prefix     | 192                           |
631    | 0x00  | prefix     | 0                             |
632    | 0x02  | prefix     | 2                             |
633    | 0x03  | type       | Type 3 - IP Protocol          |
634    | 0x81  | numeric_op | end-of-list, value size=1, == |
635    | 0x06  | value      | 6 (TCP)                       |
636    | 0x04  | type       | Type 4 - Port                 |
637    | 0x81  | numeric_op | end-of-list, value size=1, == |
638    | 0x19  | value      | 25                            |
639    +-------+------------+-------------------------------+

641    This constitutes an NLRI with an NLRI length of 11 octets.

643 4.3.2. Example 2

645    An example of a Flow Specification NLRI encoding for: "all packets to
646    192.0.2.0/24 from 203.0.113.0/24 and port {range [137, 139] or
647    8080}".
649    +--------+----------------+----------------+-------------------------+
650    | length | destination    | source         | port                    |
651    +--------+----------------+----------------+-------------------------+
652    | 0x12   | 01 18 c0 00 02 | 02 18 cb 00 71 | 04 03 89 45 8b 91 1f 90 |
653    +--------+----------------+----------------+-------------------------+

655    Decoded:

657    +--------+------------+-------------------------------+
658    | Value  |            |                               |
659    +--------+------------+-------------------------------+
660    | 0x12   | length     | 18 octets (len<240 1-octet)   |
661    | 0x01   | type       | Type 1 - Destination Prefix   |
662    | 0x18   | length     | 24 bit                        |
663    | 0xc0   | prefix     | 192                           |
664    | 0x00   | prefix     | 0                             |
665    | 0x02   | prefix     | 2                             |
666    | 0x02   | type       | Type 2 - Source Prefix        |
667    | 0x18   | length     | 24 bit                        |
668    | 0xcb   | prefix     | 203                           |
669    | 0x00   | prefix     | 0                             |
670    | 0x71   | prefix     | 113                           |
671    | 0x04   | type       | Type 4 - Port                 |
672    | 0x03   | numeric_op | value size=1, >=              |
673    | 0x89   | value      | 137                           |
674    | 0x45   | numeric_op | "AND", value size=1, <=       |
675    | 0x8b   | value      | 139                           |
676    | 0x91   | numeric_op | end-of-list, value size=2, == |
677    | 0x1f90 | value      | 8080                          |
678    +--------+------------+-------------------------------+

680    This constitutes an NLRI with an NLRI length of 18 octets.

682 4.3.3. Example 3

684    An example of a Flow Specification NLRI encoding for: "all packets to
685    192.0.2.1/32 and fragment { DF or FF }" (matching packets with the DF
686    bit set or First Fragments).

688    +--------+-------------------+----------+
689    | length | destination       | fragment |
690    +--------+-------------------+----------+
691    | 0x09   | 01 20 c0 00 02 01 | 0c 80 05 |
692    +--------+-------------------+----------+

694    Decoded:

696    +-------+------------+------------------------------+
697    | Value |            |                              |
698    +-------+------------+------------------------------+
699    | 0x09  | length     | 9 octets (len<240 1-octet)   |
700    | 0x01  | type       | Type 1 - Destination Prefix  |
701    | 0x20  | length     | 32 bit                       |
702    | 0xc0  | prefix     | 192                          |
703    | 0x00  | prefix     | 0                            |
704    | 0x02  | prefix     | 2                            |
705    | 0x01  | prefix     | 1                            |
706    | 0x0c  | type       | Type 12 - Fragment           |
707    | 0x80  | bitmask_op | end-of-list, value size=1    |
708    | 0x05  | bitmask    | DF=1, FF=1                   |
709    +-------+------------+------------------------------+

711    This constitutes an NLRI with an NLRI length of 9 octets.

713 5. Traffic Filtering

715    Traffic filtering policies have been traditionally considered to be
716    relatively static. Limitations of these static mechanisms caused
717    this new dynamic mechanism to be designed for the three new
718    applications of traffic filtering:

720    o  Prevention of traffic-based, denial-of-service (DoS) attacks.

722    o  Traffic filtering in the context of BGP/MPLS VPN service.

724    o  Centralized traffic control for SDN/NFV networks.

726    These applications require coordination among service providers and/
727    or coordination among the ASes within a service provider.

729    The Flow Specification NLRI defined in Section 4 conveys information
730    about traffic filtering rules for traffic that should be discarded or
731    handled in a manner specified by a set of pre-defined actions (which
732    are defined in BGP Extended Communities).
This mechanism is 733 primarily designed to allow an upstream autonomous system to perform 734 inbound filtering in its ingress routers of traffic that a given 735 downstream AS wishes to drop. 737 In order to achieve this goal, this document specifies two 738 application-specific NLRI identifiers that provide traffic filters, 739 and a set of actions encoded in BGP Extended Communities. The two 740 application-specific NLRI identifiers are: 742 o IPv4 Flow Specification identifier (AFI=1, SAFI=133) along with 743 specific semantic rules for IPv4 routes, and 745 o VPNv4 Flow Specification identifier (AFI=1, SAFI=134), which 746 can be used to propagate traffic filtering information in a BGP/ 747 MPLS VPN environment. 749 Encoding of the NLRI is described in Section 4 for IPv4 Flow 750 Specification and in Section 8 for VPNv4 Flow Specification. The 751 filtering actions are described in Section 7. 753 5.1. Ordering of Flow Specifications 755 More than one Flow Specification may match a particular traffic flow. 756 Thus, it is necessary to define the order in which Flow 757 Specifications are matched and actions are applied to a particular 758 traffic flow. This ordering function is such that it does not depend 759 on the arrival order of the Flow Specification via BGP and thus is 760 consistent in the network. 762 The relative order of two Flow Specifications is determined by 763 comparing their respective components. The algorithm starts by 764 comparing the left-most components (lowest component type value) of 765 the Flow Specifications. If the types differ, the Flow Specification 766 with the lowest numeric type value has higher precedence (and thus will 767 match before) than the Flow Specification that doesn't contain that 768 component type. If the component types are the same, then a type- 769 specific comparison is performed (see below). If the components compare as equal, 770 the algorithm continues with the next component.
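The ordering procedure can be sketched in Python 3 as follows. This is a minimal illustration only and is not the Appendix A code: the Component tuple, the compare() function, the return constants, and the convention that prefix components hold ipaddress.IPv4Network objects while all other components hold raw bytes are assumptions made for this example. It combines the component type comparison described above with the type-specific rules given in the paragraphs that follow.

```python
import ipaddress
from itertools import zip_longest
from typing import NamedTuple, Sequence, Union

A_FIRST, EQUAL, B_FIRST = -1, 0, 1
PREFIX_TYPES = (1, 2)  # Type 1 - Destination Prefix, Type 2 - Source Prefix


class Component(NamedTuple):
    """Hypothetical container: component type plus its decoded value."""
    ctype: int
    value: Union[ipaddress.IPv4Network, bytes]


def compare(a: Sequence[Component], b: Sequence[Component]) -> int:
    """Return A_FIRST if a has higher precedence, B_FIRST if b has."""
    for ca, cb in zip_longest(a, b):
        # A rule that runs out of components lacks a type the other has.
        if ca is None:
            return B_FIRST
        if cb is None:
            return A_FIRST
        # Types differ: the lowest numeric component type wins.
        if ca.ctype != cb.ctype:
            return A_FIRST if ca.ctype < cb.ctype else B_FIRST
        if ca.ctype in PREFIX_TYPES:
            if ca.value.overlaps(cb.value):
                # One prefix covers the other: the more specific one wins.
                if ca.value.prefixlen != cb.value.prefixlen:
                    longer_a = ca.value.prefixlen > cb.value.prefixlen
                    return A_FIRST if longer_a else B_FIRST
            else:
                # Disjoint prefixes: the lowest IP value wins.
                return A_FIRST if ca.value < cb.value else B_FIRST
        else:
            # memcmp-style comparison of the common prefix, then length.
            common = min(len(ca.value), len(cb.value))
            pa, pb = ca.value[:common], cb.value[:common]
            if pa != pb:
                return A_FIRST if pa < pb else B_FIRST
            if len(ca.value) != len(cb.value):
                return A_FIRST if len(ca.value) > len(cb.value) else B_FIRST
        # Components are equal: continue with the next component.
    return EQUAL
```

Because the result depends only on the contents of the two Flow Specifications, sorting a rule set with such a function (for example via functools.cmp_to_key) gives the same order regardless of arrival order.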
772 For IP prefix values (IP destination or source prefix): If one of the 773 two prefixes to compare is a more specific prefix of the other, the 774 more specific prefix has higher precedence. Otherwise, the one with 775 the lowest IP value has higher precedence. 777 For all other component types, unless otherwise specified, the 778 comparison is performed by comparing the component data as a binary 779 string using the memcmp() function as defined by [ISO_IEC_9899]. For 780 strings of equal length, the lowest string (memcmp) has higher 781 precedence. For strings of different lengths, the common prefix is 782 compared. If the common prefix is not equal, the string with the 783 lowest prefix has higher precedence. If the common prefix is equal, 784 the longer string is considered to have higher precedence than the 785 shorter one. 787 The code in Appendix A shows a Python 3 implementation of the 788 comparison algorithm. The full code was tested with Python 3.6.3 and 789 can be obtained at https://github.com/stoffi92/flowspec-cmp [1]. 791 6. Validation Procedure 793 Flow Specifications received from a BGP peer that are accepted in the 794 respective Adj-RIB-In are used as input to the route selection 795 process. Although the forwarding attributes of two routes for the 796 same Flow Specification prefix may be the same, BGP is still required 797 to perform its path selection algorithm in order to select the 798 correct set of attributes to advertise. 800 The first step of the BGP Route Selection procedure (Section 9.1.2 of 801 [RFC4271]) is to exclude from the selection procedure routes that are 802 considered non-feasible. In the context of IP routing information, 803 this step is used to validate that the NEXT_HOP attribute of a given 804 route is resolvable. 806 The concept can be extended, in the case of the Flow Specification 807 NLRI, to allow other validation procedures.
809 The validation process described below validates Flow Specifications 810 against unicast routes received over the same AFI but the associated 811 unicast routing information SAFI: 813 Flow Specification received over SAFI=133 will be validated 814 against routes received over SAFI=1 816 Flow Specification received over SAFI=134 will be validated 817 against routes received over SAFI=128 819 By default a Flow Specification NLRI MUST be validated such that it 820 is considered feasible if and only if all of the below is true: 822 a) A destination prefix component is embedded in the Flow 823 Specification. 825 b) The originator of the Flow Specification matches the originator 826 of the best-match unicast route for the destination prefix 827 embedded in the Flow Specification (this is the unicast route with 828 the longest possible prefix length covering the destination prefix 829 embedded in the Flow Specification). 831 c) There are no more specific unicast routes, when compared with 832 the flow destination prefix, that have been received from a 833 different neighboring AS than the best-match unicast route, which 834 has been determined in rule b). 836 However, rule a) MAY be relaxed by explicit configuration, permitting 837 Flow Specifications that include no destination prefix component. If 838 such is the case, rules b) and c) are moot and MUST be disregarded. 840 By originator of a BGP route, we mean either the address of the 841 originator in the ORIGINATOR_ID Attribute [RFC4456], or the source IP 842 address of the BGP peer, if this path attribute is not present. 844 BGP implementations MUST also enforce that the AS_PATH attribute of a 845 route received via the External Border Gateway Protocol (eBGP) 846 contains the neighboring AS in the left-most position of the AS_PATH 847 attribute. While this rule is optional in the BGP specification, it 848 becomes necessary to enforce it for security reasons. 
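Rules a) to c) above can be sketched in Python 3 as follows. This is an illustration under stated assumptions, not a definitive implementation: the UnicastRoute tuple, the is_feasible() function, and the allow_no_dest flag are hypothetical names. The sketch assumes that the supplied routes are the selected best paths of the associated unicast SAFI (one per prefix) and that a missing covering route makes the Flow Specification non-feasible. It uses IPv4Network.subnet_of(), which requires Python 3.7 or later.

```python
import ipaddress
from typing import Iterable, NamedTuple, Optional


class UnicastRoute(NamedTuple):
    """Hypothetical view of a best unicast route in the matching SAFI."""
    prefix: ipaddress.IPv4Network
    originator: str   # ORIGINATOR_ID, or the source IP address of the peer
    neighbor_as: int  # left-most AS in the AS_PATH


def is_feasible(dest: Optional[ipaddress.IPv4Network],
                flow_originator: str,
                routes: Iterable[UnicastRoute],
                allow_no_dest: bool = False) -> bool:
    routes = list(routes)
    # Rule a): a destination prefix component must be embedded, unless this
    # is relaxed by explicit configuration (rules b and c are then skipped).
    if dest is None:
        return allow_no_dest
    # Rule b): the best match is the longest prefix covering the destination,
    # and its originator must match the originator of the Flow Specification.
    covering = [r for r in routes if dest.subnet_of(r.prefix)]
    if not covering:
        return False  # assumption: no covering route means non-feasible
    best = max(covering, key=lambda r: r.prefix.prefixlen)
    if best.originator != flow_originator:
        return False
    # Rule c): no more-specific route from a different neighboring AS.
    return not any(
        r.prefix != dest
        and r.prefix.subnet_of(dest)
        and r.neighbor_as != best.neighbor_as
        for r in routes
    )
```

Since the outcome depends on the current unicast routes, such a check would have to be rerun whenever those routes change.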
850 The best-match unicast route may change over time independently 851 of the Flow Specification NLRI. Therefore, a revalidation of the 852 Flow Specification NLRI MUST be performed whenever unicast routes 853 change. Revalidation is defined as retesting rules a) to c) as 854 described above. 856 Explanation: 858 The underlying concept is that the neighboring AS that advertises the 859 best unicast route for a destination is allowed to advertise Flow 860 Specification information that conveys a more or equally specific 861 destination prefix. This holds as long as there are no more-specific 862 unicast routes, received from a different neighboring AS, that would 863 be affected by that Flow Specification. 865 The neighboring AS is the immediate destination of the traffic 866 described by the Flow Specification. If it requests these flows to 867 be dropped, that request can be honored without concern that it 868 represents a denial of service in itself. Supposedly, the traffic is 869 being dropped by the downstream autonomous system, and there is no 870 added value in carrying the traffic to it. 872 7. Traffic Filtering Actions 874 This document defines a minimum set of Traffic Filtering Actions that 875 it standardizes as BGP extended community values [RFC7153]. This is 876 not meant to be an exhaustive list of all the possible actions, but 877 only a subset that can be interpreted consistently across the 878 network. Additional actions can be defined as either requiring 879 standards or being vendor specific. 881 The default action for a matching Flow Specification is to accept the 882 packet (treat the packet according to the normal forwarding behaviour 883 of the system). 885 This document defines the following extended community values shown 886 in Table 2 in the form 0xttss where tt indicates the type and ss 887 indicates the sub-type of the extended community. Encodings for 888 these extended communities are described below.
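As a rough illustration of the 0xttss notation, the following Python 3 sketch splits an 8-octet extended community into its type, sub-type, and 6-octet value. The function names are made up for this example; the layout (type in the first octet, sub-type in the second, followed by 6 octets of value) is assumed from the extended community format of [RFC4360].

```python
def split_community(community: bytes):
    """Return (type, sub-type, value) for an 8-octet extended community."""
    if len(community) != 8:
        raise ValueError("an extended community is exactly 8 octets long")
    return community[0], community[1], community[2:]


def ttss(community: bytes) -> int:
    """Return the combined 0xttss code (tt = type, ss = sub-type)."""
    ctype, subtype, _ = split_community(community)
    return (ctype << 8) | subtype


# Type 0x80, sub-type 0x06 (traffic-rate-bytes), all-zero 6-octet value.
example = bytes.fromhex("8006000000000000")
print(hex(ttss(example)))  # prints 0x8006
```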
890 +-------------+---------------------------+-------------------------+ 891 | community | action | encoding | 892 | 0xttss | | | 893 +-------------+---------------------------+-------------------------+ 894 | 0x8006 | traffic-rate-bytes | 2-octet ASN, 4-octet | 895 | | (Section 7.1) | float | 896 | TBD | traffic-rate-packets | 2-octet ASN, 4-octet | 897 | | (Section 7.2) | float | 898 | 0x8007 | traffic-action | bitmask | 899 | | (Section 7.3) | | 900 | 0x8008 | rt-redirect AS-2octet | 2-octet AS, 4-octet | 901 | | (Section 7.4) | value | 902 | 0x8108 | rt-redirect IPv4 | 4-octet IPv4 address, | 903 | | (Section 7.4) | 2-octet value | 904 | 0x8208 | rt-redirect AS-4octet | 4-octet AS, 2-octet | 905 | | (Section 7.4) | value | 906 | 0x8009 | traffic-marking | DSCP value | 907 | | (Section 7.5) | | 908 +-------------+---------------------------+-------------------------+ 910 Table 2: Traffic Filtering Action Extended Communities 912 Multiple Traffic Filtering Actions defined in this document may be 913 present for a single Flow Specification and SHOULD be applied to the 914 traffic flow (for example, traffic-rate-bytes and rt-redirect can be 915 applied to packets at the same time). If not all of the Traffic 916 Filtering Actions can be applied to a traffic flow, they should be 917 treated as interfering Traffic Filtering Actions (see below). 919 Some Traffic Filtering Actions may interfere with or even 920 contradict each other. Section 7.7 of this document provides general 921 considerations on such Traffic Filtering Action interference. Any 922 additional definition of Traffic Filtering Actions SHOULD specify the 923 action to take if those Traffic Filtering Actions interfere (also 924 with existing Traffic Filtering Actions). 926 All Traffic Filtering Actions are specified as transitive BGP 927 Extended Communities. 929 7.1.
Traffic Rate in Bytes (traffic-rate-bytes) sub-type 0x06 931 The traffic-rate-bytes extended community uses the following extended 932 community encoding: 934 The first two octets carry the 2-octet id, which can be assigned from 935 a 2-octet AS number. When a 4-octet AS number is locally present, 936 the 2 least significant octets of such an AS number can be used. 937 This value is purely informational and SHOULD NOT be interpreted by 938 the implementation. 940 The remaining 4 octets carry the maximum rate information in IEEE 941 floating point [IEEE.754.1985] format, units being bytes per second. 942 A traffic-rate of 0 should result in all traffic for the particular 943 flow being discarded. On encoding, the traffic-rate MUST NOT be 944 negative. On decoding, negative values MUST be treated as zero 945 (discard all traffic). 947 Interferes with: No other BGP Flow Specification Traffic Filtering 948 Action in this document. 950 7.2. Traffic Rate in Packets (traffic-rate-packets) sub-type TBD 952 The traffic-rate-packets extended community uses the same encoding as 953 the traffic-rate-bytes extended community. The floating point value 954 carries the maximum packet rate in packets per second. A traffic- 955 rate-packets of 0 should result in all traffic for the particular 956 flow being discarded. On encoding, the traffic-rate-packets MUST NOT 957 be negative. On decoding, negative values MUST be treated as zero 958 (discard all traffic). 960 Interferes with: No other BGP Flow Specification Traffic Filtering 961 Action in this document. 963 7.3. Traffic-action (traffic-action) sub-type 0x07 965 The traffic-action extended community consists of 6 octets of which 966 only the 2 least significant bits of the 6th octet (from left to 967 right) are defined by this document as shown in Figure 5.
969 0 1 2 3 970 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 971 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 972 | Traffic Action Field | 973 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 974 | Tr. Action Field (cont.) |S|T| 975 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 977 Figure 5: Traffic-action Extended Community Encoding 979 where S and T are defined as: 981 o T: Terminal Action (bit 47): When this bit is set, the traffic 982 filtering engine will evaluate any subsequent Flow Specifications 983 (as defined by the ordering procedure in Section 5.1). If not set, 984 the evaluation of the traffic filters stops when this Flow 985 Specification is evaluated. 987 o S: Sample (bit 46): Enables traffic sampling and logging for this 988 Flow Specification (only effective when set). 990 o Traffic Action Field: Other Traffic Action Field (see Section 11) 991 bits unused in this specification. These bits SHOULD be set to 0 992 on encoding, and MUST be ignored during decoding. 994 The use of the Terminal Action (bit 47) may result in more than one 995 Flow Specification matching a particular traffic flow. All the 996 Traffic Filtering Actions from these Flow Specifications shall be 997 collected and applied. In case of interfering Traffic Filtering 998 Actions, it is an implementation decision which Traffic Filtering 999 Actions are selected. See also Section 7.7. 1001 Interferes with: No other BGP Flow Specification Traffic Filtering 1002 Action in this document. 1004 7.4. RT Redirect (rt-redirect) sub-type 0x08 1006 The redirect extended community allows the traffic to be redirected 1007 to a VRF routing instance that lists the specified route-target in 1008 its import policy. If several local instances match this criterion, 1009 the choice between them is a local matter (for example, the instance 1010 with the lowest Route Distinguisher value can be elected).
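The selection of a redirect target can be sketched in Python 3 as follows. The Vrf tuple and the redirect_target() function are hypothetical names, and route-targets are represented as plain strings purely for readability. Choosing the lowest Route Distinguisher is only the example tie-break mentioned above; the choice remains a local matter.

```python
from typing import FrozenSet, Iterable, NamedTuple, Optional


class Vrf(NamedTuple):
    """Hypothetical local VRF routing instance."""
    name: str
    rd: bytes                   # 8-octet Route Distinguisher
    import_rts: FrozenSet[str]  # route-targets listed in the import policy


def redirect_target(vrfs: Iterable[Vrf], route_target: str) -> Optional[Vrf]:
    """Pick the VRF that imports route_target, or None if there is none."""
    candidates = [v for v in vrfs if route_target in v.import_rts]
    # Example tie-break only: elect the lowest Route Distinguisher value.
    return min(candidates, key=lambda v: v.rd, default=None)


vrfs = [
    Vrf("blue", bytes.fromhex("0000fbf000000064"), frozenset({"64496:100"})),
    Vrf("red", bytes.fromhex("0000fbf000000001"),
        frozenset({"64496:100", "64496:200"})),
]
```

With these two instances, a redirect to "64496:100" matches both and resolves to "red" because its Route Distinguisher is lower, while a route-target that no instance imports yields None.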
1012 This Extended Community allows 3 different encoding formats for the 1013 route-target (type 0x80, 0x81, 0x82). It uses the same encoding as 1014 the Route Target Extended Community in Sections 3.1 (type 0x80: 1015 2-octet AS, 4-octet value), 3.2 (type 0x81: 4-octet IPv4 address, 1016 2-octet value) and 4 of [RFC4360] and Section 2 (type 0x82: 4-octet 1017 AS, 2-octet value) of [RFC5668] with the high-order octet of the Type 1018 field 0x80, 0x81, 0x82, respectively, and the low-order octet of the Type 1019 field (Sub-Type) always 0x08. 1021 Interferes with: No other BGP Flow Specification Traffic Filtering 1022 Action in this document. 1024 7.5. Traffic Marking (traffic-marking) sub-type 0x09 1026 The traffic marking extended community instructs a system to modify 1027 the DSCP bits in the IP header ([RFC2474] Section 3) of a transiting 1028 IP packet to the corresponding value encoded in the 6 least 1029 significant bits of the extended community value as shown in 1030 Figure 6. 1032 The extended community is encoded as follows: 1034 0 1 2 3 1035 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1036 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1037 | reserved | reserved | reserved | reserved | 1038 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1039 | reserved | r.| DSCP | 1040 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1042 Figure 6: Traffic Marking Extended Community Encoding 1044 o DSCP: new DSCP value for the transiting IP packet. 1046 o reserved, r.: SHOULD be set to 0 on encoding, and MUST be ignored 1047 during decoding. 1049 Interferes with: No other BGP Flow Specification Traffic Filtering 1050 Action in this document. 1052 7.6. Interaction with other Filtering Mechanisms in Routers 1054 Implementations should provide mechanisms that map an arbitrary BGP 1055 community value (normal or extended) to Traffic Filtering Actions 1056 that require different mappings in different systems in the network.
For instance, providing packets with a worse-than-best-effort, per- 1058 hop behavior is a functionality that is likely to be implemented 1059 differently in different systems and for which no standard behavior 1060 is currently known. Rather than attempting to define it here, this 1061 can be accomplished by mapping a user-defined community value to 1062 platform-/network-specific behavior via user configuration. 1064 7.7. Considerations on Traffic Filtering Action Interference 1066 Since Traffic Filtering Actions are represented as BGP extended 1067 community values, Traffic Filtering Actions may interfere with each 1068 other (e.g., there may be more than one conflicting traffic-rate-bytes 1069 Traffic Filtering Action associated with a single Flow 1070 Specification). Traffic Filtering Action interference has no impact 1071 on BGP propagation of Flow Specifications (all communities are 1072 propagated according to policies). 1074 If a Flow Specification associated with interfering Traffic Filtering 1075 Actions is selected for packet forwarding, it is an implementation 1076 decision which of the interfering Traffic Filtering Actions are 1077 selected. Implementors of this specification SHOULD document the 1078 behaviour of their implementation in such cases. 1080 Operators are encouraged to make use of the BGP policy framework 1081 supported by their implementation in order to achieve a predictable 1082 behaviour (i.e., match, replace, or delete communities on administrative 1083 boundaries). See also Section 12. 1085 8. Dissemination of Traffic Filtering in BGP/MPLS VPN Networks 1087 Provider-based Layer 3 VPN networks, such as the ones using a BGP/ 1088 MPLS IP VPN [RFC4364] control plane, may have different traffic 1089 filtering requirements than Internet service providers.
However, 1090 Internet service providers may also use those VPNs for scenarios like 1091 having the Internet routing table in a VRF, resulting in the same 1092 traffic filtering requirements as defined for the global routing 1093 table environment within this document. This document defines an 1094 additional BGP NLRI type (AFI=1, SAFI=134), which can be used 1095 to propagate Flow Specifications in a BGP/MPLS VPN environment. 1097 The NLRI format for this address family consists of a fixed-length 1098 Route Distinguisher field (8 octets) followed by the Flow 1099 Specification NLRI value (Section 4.2). The NLRI length field shall 1100 include both the 8 octets of the Route Distinguisher as well as the 1101 subsequent Flow Specification NLRI value. The resulting encoding is 1102 shown in Figure 7. 1104 +--------------------------------+ 1105 | length (0xnn or 0xfn nn) | 1106 +--------------------------------+ 1107 | Route Distinguisher (8 octets) | 1108 +--------------------------------+ 1109 | NLRI value (variable) | 1110 +--------------------------------+ 1112 Figure 7: Flow Specification NLRI for MPLS 1114 Propagation of this NLRI is controlled by matching Route Target 1115 extended communities associated with the BGP path advertisement with 1116 the VRF import policy, using the same mechanism as described in BGP/ 1117 MPLS IP VPNs [RFC4364]. 1119 Flow Specifications received via this NLRI apply only to traffic that 1120 belongs to the VRF(s) into which they are imported. By default, traffic 1121 received from a remote PE is switched via an MPLS forwarding decision 1122 and is not subject to filtering. 1124 Contrary to the behavior specified for the non-VPN NLRI, Flow 1125 Specifications are accepted by default, when received from remote PE 1126 routers. 1128 The validation procedure (Section 6) and Traffic Filtering Actions 1129 (Section 7) are the same as for IPv4. 1131 9.
Traffic Monitoring 1133 Traffic filtering applications require monitoring and traffic 1134 statistics facilities. While this is an implementation-specific 1135 choice, implementations SHOULD provide: 1137 o A mechanism to log the packet header of filtered traffic. 1139 o A mechanism to count the number of matches for a given Flow 1140 Specification. 1142 10. Error Handling 1144 Error handling according to [RFC7606] and [RFC4760] applies to this 1145 specification. 1147 This document introduces Traffic Filtering Action Extended 1148 Communities. Malformed Traffic Filtering Action Extended Communities 1149 in the sense of [RFC7606] Section 7.14 are Extended Community values 1150 that cannot be decoded according to Section 7 of this document. 1152 11. IANA Considerations 1154 This section complies with [RFC7153]. 1156 11.1. AFI/SAFI Definitions 1158 IANA maintains a registry entitled "SAFI Values". For the purpose of 1159 this work, IANA is requested to update the following SAFIs to read 1160 according to the table below (Note: This document obsoletes both 1161 RFC7674 and RFC5575 and all references to those documents should be 1162 deleted from the registry below): 1164 +-------+------------------------------------------+----------------+ 1165 | Value | Name | Reference | 1166 +-------+------------------------------------------+----------------+ 1167 | 133 | Dissemination of Flow Specification | [this | 1168 | | rules | document] | 1169 | 134 | L3VPN Dissemination of Flow | [this | 1170 | | Specification rules | document] | 1171 +-------+------------------------------------------+----------------+ 1173 Table 3: Registry: SAFI Values 1175 11.2. Flow Component Definitions 1177 A Flow Specification consists of a sequence of flow components, which 1178 are identified by an 8-bit component type. IANA has created and 1179 maintains a registry entitled "Flow Spec Component Types". IANA is 1180 requested to update the reference for this registry to [this 1181 document].
Furthermore the references to the values should be 1182 updated according to the table below (Note: This document obsoletes 1183 both RFC7674 and RFC5575 and all references to those documents should 1184 be deleted from the registry below). 1186 +-------+--------------------+-----------------+ 1187 | Value | Name | Reference | 1188 +-------+--------------------+-----------------+ 1189 | 1 | Destination Prefix | [this document] | 1190 | 2 | Source Prefix | [this document] | 1191 | 3 | IP Protocol | [this document] | 1192 | 4 | Port | [this document] | 1193 | 5 | Destination port | [this document] | 1194 | 6 | Source port | [this document] | 1195 | 7 | ICMP type | [this document] | 1196 | 8 | ICMP code | [this document] | 1197 | 9 | TCP flags | [this document] | 1198 | 10 | Packet length | [this document] | 1199 | 11 | DSCP | [this document] | 1200 | 12 | Fragment | [this document] | 1201 +-------+--------------------+-----------------+ 1203 Table 4: Registry: Flow Spec Component Types 1205 In order to manage the limited number space and accommodate several 1206 usages, the following policies defined by [RFC8126] are used: 1208 +--------------+-------------------------------+ 1209 | Type Values | Policy | 1210 +--------------+-------------------------------+ 1211 | 0 | Reserved | 1212 | [1 .. 12] | Defined by this specification | 1213 | [13 .. 127] | Specification required | 1214 | [128 .. 255] | First Come First Served | 1215 +--------------+-------------------------------+ 1217 Table 5: Flow Spec Component Types Policies 1219 11.3. Extended Community Flow Specification Actions 1221 The Extended Community Flow Specification Action types defined in 1222 this document consist of two parts: 1224 Type (BGP Transitive Extended Community Type) 1226 Sub-Type 1228 For the type-part, IANA maintains a registry entitled "BGP Transitive 1229 Extended Community Types". 
For the purpose of this work (Section 7), 1230 IANA is requested to update the references to the following entries 1231 according to the table below (Note: This document obsoletes both 1232 RFC7674 and RFC5575 and all references to those documents should be 1233 deleted in the registry below): 1235 +-------+-----------------------------------------------+-----------+ 1236 | Type | Name | Reference | 1237 | Value | | | 1238 +-------+-----------------------------------------------+-----------+ 1239 | 0x81 | Generic Transitive Experimental | [this | 1240 | | Use Extended Community Part 2 (Sub-Types are | document] | 1241 | | defined in the "Generic Transitive | | 1242 | | Experimental Use Extended Community Part 2 | | 1243 | | Sub-Types" Registry) | | 1244 | 0x82 | Generic Transitive Experimental | [this | 1245 | | Use Extended Community Part 3 | document] | 1246 | | (Sub-Types are defined in the "Generic | | 1247 | | Transitive Experimental Use | | 1248 | | Extended Community Part 3 Sub-Types" | | 1249 | | Registry) | | 1250 +-------+-----------------------------------------------+-----------+ 1252 Table 6: Registry: BGP Transitive Extended Community Types 1254 For the sub-type part of the extended community Traffic Filtering 1255 Actions IANA maintains the following registries. IANA is requested 1256 to update all names and references according to the tables below and 1257 assign a new value for the "Flow spec traffic-rate-packets" Sub-Type 1258 (Note: This document obsoletes both RFC7674 and RFC5575 and all 1259 references to those documents should be deleted from the registries 1260 below). 
1262 +----------+--------------------------------------------+-----------+ 1263 | Sub-Type | Name | Reference | 1264 | Value | | | 1265 +----------+--------------------------------------------+-----------+ 1266 | 0x06 | Flow spec traffic-rate-bytes | [this | 1267 | | | document] | 1268 | TBD | Flow spec traffic-rate-packets | [this | 1269 | | | document] | 1270 | 0x07 | Flow spec traffic-action (Use | [this | 1271 | | of the "Value" field is defined in the | document] | 1272 | | "Traffic Action Fields" registry) | | 1273 | 0x08 | Flow spec rt-redirect | [this | 1274 | | AS-2octet format | document] | 1275 | 0x09 | Flow spec traffic-remarking | [this | 1276 | | | document] | 1277 +----------+--------------------------------------------+-----------+ 1279 Table 7: Registry: Generic Transitive Experimental Use Extended 1280 Community Sub-Types 1282 +------------+----------------------------------------+-------------+ 1283 | Sub-Type | Name | Reference | 1284 | Value | | | 1285 +------------+----------------------------------------+-------------+ 1286 | 0x08 | Flow spec rt-redirect IPv4 | [this | 1287 | | format | document] | 1288 +------------+----------------------------------------+-------------+ 1290 Table 8: Registry: Generic Transitive Experimental Use Extended 1291 Community Part 2 Sub-Types 1293 +------------+-----------------------------------------+------------+ 1294 | Sub-Type | Name | Reference | 1295 | Value | | | 1296 +------------+-----------------------------------------+------------+ 1297 | 0x08 | Flow spec rt-redirect | [this | 1298 | | AS-4octet format | document] | 1299 +------------+-----------------------------------------+------------+ 1301 Table 9: Registry: Generic Transitive Experimental Use Extended 1302 Community Part 3 Sub-Types 1304 Furthermore IANA is requested to update the reference for the 1305 registries "Generic Transitive Experimental Use Extended Community 1306 Part 2 Sub-Types" and "Generic Transitive Experimental Use Extended 1307 
Community Part 3 Sub-Types" to [this document]. 1309 The "traffic-action" extended community (Section 7.3) defined in this 1310 document has 46 unused bits, which can be used to convey additional 1311 meaning. IANA created and maintains a registry entitled: "Traffic 1312 Action Fields". IANA is requested to update the reference for this 1313 registry to [this document]. Furthermore IANA is requested to update 1314 the references according to the table below. These values should be 1315 assigned via IETF Review rules only (Note: This document obsoletes 1316 both RFC7674 and RFC5575 and all references to those documents should 1317 be deleted from the registry below). 1319 +-----+-----------------+-----------------+ 1320 | Bit | Name | Reference | 1321 +-----+-----------------+-----------------+ 1322 | 47 | Terminal Action | [this document] | 1323 | 46 | Sample | [this document] | 1324 +-----+-----------------+-----------------+ 1326 Table 10: Registry: Traffic Action Fields 1328 12. Security Considerations 1330 As long as Flow Specifications are restricted to match the 1331 corresponding unicast routing paths for the relevant prefixes 1332 (Section 6), the security characteristics of this proposal are 1333 equivalent to the existing security properties of BGP unicast 1334 routing. Any relaxation of the validation procedure described in 1335 Section 6 may allow unwanted Flow Specifications to be propagated and 1336 thus unwanted Traffic Filtering Actions may be applied to flows. 1338 Where the above mechanisms are not in place, this could open the door 1339 to further denial-of-service attacks such as unwanted traffic 1340 filtering, remarking or redirection. 1342 Deployment of specific relaxations of the validation within an 1343 administrative boundary of a network, defined by an AS or an AS- 1344 Confederation boundary, may be useful in some networks for quickly 1345 distributing filters to prevent denial-of-service attacks. 
For a 1346 network to utilize this relaxation, the BGP policies must support 1347 additional filtering since the origin AS field is empty. 1348 Specifications relaxing the validation restrictions MUST contain 1349 security considerations that provide details on the required 1350 additional filtering. For example, [RFC6811] can be used to enhance 1351 filtering within an AS confederation. 1353 Inter-provider routing is based on a web of trust. Neighboring 1354 autonomous systems are trusted to advertise valid reachability 1355 information. If this trust model is violated, a neighboring 1356 autonomous system may cause a denial-of-service attack by advertising 1357 reachability information for a given prefix for which it does not 1358 provide service (unfiltered address space hijack). Since validation 1359 of the Flow Specification is tied to the announcement of the best 1360 unicast route, this may also cause this validation to fail and 1361 consequently prevent Flow Specifications from being accepted by a 1362 peer. Possible mitigations are [RFC6811] and [RFC8205]. 1364 On IXPs, routes are often exchanged via route servers that do not 1365 extend the AS_PATH. In such cases, it is not possible to enforce that the 1366 left-most AS in the AS_PATH is the neighbor AS (the AS of the 1367 route server). Since the validation of Flow Specifications 1368 (Section 6) depends on this, additional care must be taken. It is 1369 advised to use a strict inbound route policy in such scenarios. 1371 Enabling firewall-like capabilities in routers without centralized 1372 management could make certain failures harder to diagnose. For 1373 example, it is possible to allow TCP packets to pass between a pair 1374 of addresses but not ICMP packets. It is also possible to permit 1375 packets smaller than 900 or greater than 1000 octets to pass between 1376 a pair of addresses, but not packets whose length is in the range 1377 900-1000.
Such behavior may be confusing and these capabilities 1378 should be used with care whether manually configured or coordinated 1379 through the protocol extensions described in this document. 1381 Flow Specification BGP speakers (e.g. automated DDoS controllers) not 1382 properly programmed, algorithms that are not performing as expected, 1383 or simply rogue systems may announce unintended Flow Specifications, 1384 send updates at a high rate or generate a high number of Flow 1385 Specifications. This may stress the receiving systems, exceed their 1386 maximum capacity or may lead to unwanted Traffic Filtering Actions 1387 being applied to flows. 1389 While the general verification of the Flow Specification NLRI is 1390 specified in this document (Section 6) the Traffic Filtering Actions 1391 received by a third party may need custom verification or filtering. 1392 In particular all non traffic-rate actions may allow a third party to 1393 modify packet forwarding properties and potentially gain access to 1394 other routing-tables/VPNs or undesired queues. This can be avoided 1395 by proper filtering/screening of the Traffic Filtering Action 1396 communities at network borders and only exposing a predefined subset 1397 of Traffic Filtering Actions (see Section 7) to third parties. One 1398 way to achieve this is by mapping user-defined communities, that can 1399 be set by the third party, to Traffic Filtering Actions and not 1400 accepting Traffic Filtering Action extended communities from third 1401 parties. 1403 This extension adds additional information to Internet routers. 1404 These are limited in terms of the maximum number of data elements 1405 they can hold as well as the number of events they are able to 1406 process in a given unit of time. Service providers need to consider 1407 the maximum capacity of their devices and may need to limit the 1408 number of Flow Specifications accepted and processed. 1410 13. 
Contributors

   Barry Greene, Pedro Marques, Jared Mauch and Nischal Sheth were
   authors on [RFC5575], and therefore are contributing authors on this
   document.

14. Acknowledgements

   The authors would like to thank Yakov Rekhter, Dennis Ferguson,
   Chris Morrow, Charlie Kaufman, and David Smith for their comments on
   the original [RFC5575]. Chaitanya Kodeboyina helped design the flow
   validation procedure; and Steven Lin and Jim Washburn ironed out all
   the details necessary to produce a working implementation in the
   original [RFC5575].

   A packet rate Traffic Filtering Action was also described in a Flow
   Specification extension draft, and the authors would like to thank
   Wesley Eddy, Justin Dailey and Gilbert Clark for their work.

   Additionally, the authors would like to thank Alexander Mayrhofer,
   Nicolas Fevrier, Job Snijders, Jeffrey Haas and Adam Chappell for
   their comments and review.

15. References

15.1. Normative References

   [IEEE.754.1985]
              IEEE, "Standard for Binary Floating-Point Arithmetic",
              IEEE 754-1985, August 1985.

   [ISO_IEC_9899]
              ISO, "Information technology -- Programming languages --
              C", ISO/IEC 9899:2018, June 2018.

   [RFC0768]  Postel, J., "User Datagram Protocol", STD 6, RFC 768,
              DOI 10.17487/RFC0768, August 1980.

   [RFC0791]  Postel, J., "Internet Protocol", STD 5, RFC 791,
              DOI 10.17487/RFC0791, September 1981.

   [RFC0792]  Postel, J., "Internet Control Message Protocol", STD 5,
              RFC 792, DOI 10.17487/RFC0792, September 1981.

   [RFC0793]  Postel, J., "Transmission Control Protocol", STD 7,
              RFC 793, DOI 10.17487/RFC0793, September 1981.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC2474]  Nichols, K., Blake, S., Baker, F., and D.
              Black, "Definition of the Differentiated Services Field
              (DS Field) in the IPv4 and IPv6 Headers", RFC 2474,
              DOI 10.17487/RFC2474, December 1998.

   [RFC4271]  Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A
              Border Gateway Protocol 4 (BGP-4)", RFC 4271,
              DOI 10.17487/RFC4271, January 2006.

   [RFC4360]  Sangli, S., Tappan, D., and Y. Rekhter, "BGP Extended
              Communities Attribute", RFC 4360, DOI 10.17487/RFC4360,
              February 2006.

   [RFC4364]  Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private
              Networks (VPNs)", RFC 4364, DOI 10.17487/RFC4364,
              February 2006.

   [RFC4456]  Bates, T., Chen, E., and R. Chandra, "BGP Route
              Reflection: An Alternative to Full Mesh Internal BGP
              (IBGP)", RFC 4456, DOI 10.17487/RFC4456, April 2006.

   [RFC4760]  Bates, T., Chandra, R., Katz, D., and Y. Rekhter,
              "Multiprotocol Extensions for BGP-4", RFC 4760,
              DOI 10.17487/RFC4760, January 2007.

   [RFC5668]  Rekhter, Y., Sangli, S., and D. Tappan, "4-Octet AS
              Specific BGP Extended Community", RFC 5668,
              DOI 10.17487/RFC5668, October 2009.

   [RFC7153]  Rosen, E. and Y. Rekhter, "IANA Registries for BGP
              Extended Communities", RFC 7153, DOI 10.17487/RFC7153,
              March 2014.

   [RFC7606]  Chen, E., Ed., Scudder, J., Ed., Mohapatra, P., and K.
              Patel, "Revised Error Handling for BGP UPDATE Messages",
              RFC 7606, DOI 10.17487/RFC7606, August 2015.

   [RFC8126]  Cotton, M., Leiba, B., and T. Narten, "Guidelines for
              Writing an IANA Considerations Section in RFCs", BCP 26,
              RFC 8126, DOI 10.17487/RFC8126, June 2017.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

15.2. Informative References

   [I-D.ietf-idr-flow-spec-v6]
              Loibl, C., Raszuk, R., and S.
              Hares, "Dissemination of Flow Specification Rules for
              IPv6", draft-ietf-idr-flow-spec-v6-10 (work in progress),
              November 2019.

   [RFC4303]  Kent, S., "IP Encapsulating Security Payload (ESP)",
              RFC 4303, DOI 10.17487/RFC4303, December 2005.

   [RFC5575]  Marques, P., Sheth, N., Raszuk, R., Greene, B., Mauch,
              J., and D. McPherson, "Dissemination of Flow
              Specification Rules", RFC 5575, DOI 10.17487/RFC5575,
              August 2009.

   [RFC6811]  Mohapatra, P., Scudder, J., Ward, D., Bush, R., and R.
              Austein, "BGP Prefix Origin Validation", RFC 6811,
              DOI 10.17487/RFC6811, January 2013.

   [RFC7674]  Haas, J., Ed., "Clarification of the Flowspec Redirect
              Extended Community", RFC 7674, DOI 10.17487/RFC7674,
              October 2015.

   [RFC8205]  Lepinski, M., Ed. and K. Sriram, Ed., "BGPsec Protocol
              Specification", RFC 8205, DOI 10.17487/RFC8205,
              September 2017.

15.3. URIs

   [1] https://github.com/stoffi92/flowspec-cmp

Appendix A. Python code: flow_rule_cmp

"""
Copyright (c) 2020 IETF Trust and the persons identified as authors of
the code. All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, is permitted pursuant to, and subject to the license
terms contained in, the Simplified BSD License set forth in Section
4.c of the IETF Trust's Legal Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info).
"""

import itertools
import ipaddress

# The names below are used by flow_rule_cmp but are not defined in the
# listing itself. The result codes are assumed values (any three
# distinct values work); the component types are those of the
# destination prefix (type 1) and source prefix (type 2) components.
EQUAL = 0
A_HAS_PRECEDENCE = 1
B_HAS_PRECEDENCE = 2
IP_DESTINATION = 1
IP_SOURCE = 2


def flow_rule_cmp(a, b):
    for comp_a, comp_b in itertools.zip_longest(a.components,
                                                b.components):
        # If a component type does not exist in one rule
        # this rule has lower precedence
        if not comp_a:
            return B_HAS_PRECEDENCE
        if not comp_b:
            return A_HAS_PRECEDENCE
        # higher precedence for lower component type
        if comp_a.component_type < comp_b.component_type:
            return A_HAS_PRECEDENCE
        if comp_a.component_type > comp_b.component_type:
            return B_HAS_PRECEDENCE
        # component types are equal -> type specific comparison
        if comp_a.component_type in (IP_DESTINATION, IP_SOURCE):
            # assuming comp_a.value, comp_b.value of type
            # ipaddress.IPv4Network
            if comp_a.value.overlaps(comp_b.value):
                # longest prefixlen has precedence
                if comp_a.value.prefixlen > comp_b.value.prefixlen:
                    return A_HAS_PRECEDENCE
                if comp_a.value.prefixlen < comp_b.value.prefixlen:
                    return B_HAS_PRECEDENCE
                # components equal -> continue with next component
            elif comp_a.value > comp_b.value:
                return B_HAS_PRECEDENCE
            elif comp_a.value < comp_b.value:
                return A_HAS_PRECEDENCE
        else:
            # assuming comp_a.value, comp_b.value of type bytearray
            if len(comp_a.value) == len(comp_b.value):
                if comp_a.value > comp_b.value:
                    return B_HAS_PRECEDENCE
                if comp_a.value < comp_b.value:
                    return A_HAS_PRECEDENCE
                # components equal -> continue with next component
            else:
                common = min(len(comp_a.value), len(comp_b.value))
                if comp_a.value[:common] > comp_b.value[:common]:
                    return B_HAS_PRECEDENCE
                elif comp_a.value[:common] < comp_b.value[:common]:
                    return A_HAS_PRECEDENCE
                # the first common bytes match
                elif len(comp_a.value) > len(comp_b.value):
                    return A_HAS_PRECEDENCE
                else:
                    return B_HAS_PRECEDENCE
    return EQUAL

Appendix B.
Comparison with RFC 5575

   This document includes numerous editorial changes to [RFC5575]. It
   also completely incorporates the redirect action clarification
   document [RFC7674]. It is recommended to read the entire document.
   The authors, however, want to point out the following technical
   changes to [RFC5575]:

      Section 1 introduces the Flow Specification NLRI. In [RFC5575]
      this NLRI was defined as an opaque-key in BGP's database. This
      specification has removed all references to an opaque-key
      property. BGP is able to understand the NLRI encoding.

      Section 4.2.1.1 defines a numeric operator and comparison bit
      combinations. In [RFC5575] the meaning of those bit combinations
      was not explicitly defined and left open to the reader.

      Section 4.2.2.3 - Section 4.2.2.8, Section 4.2.2.10, and
      Section 4.2.2.11 make use of the above numeric operator. The
      allowed length of the comparison value was not consistently
      defined in [RFC5575].

      Section 7 defines all Traffic Filtering Action Extended
      communities as transitive extended communities. [RFC5575]
      defined the traffic-rate action to be non-transitive and did not
      define the transitivity of the other Traffic Filtering Action
      communities at all.

      Section 7.2 introduces a new Traffic Filtering Action
      (traffic-rate-packets). This action did not exist in [RFC5575].

      Section 7.4 contains the same redirect actions already defined in
      [RFC5575]; however, these actions have been renamed to
      "rt-redirect" to make it clearer that the redirection is based on
      route-target. This section also completely incorporates the
      [RFC7674] clarifications of the Flowspec Redirect Extended
      Community.

      Section 7.7 contains general considerations on interfering
      traffic actions. Section 7.3 also cross-references this section.
      [RFC5575] did not mention this.

      Section 10 contains new error handling.
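
   As an illustration of the numeric operator item above, the following
   is a minimal sketch of how the lt, gt, and eq comparison bits of a
   numeric operator could be evaluated. It is not part of the reference
   code in Appendix A. The names numeric_match and length_in_range are
   hypothetical, and the bit masks assume that lt, gt, and eq occupy the
   three least significant bits of the operator octet; they should be
   checked against Section 4.2.1.1 before being relied upon.

```python
# Hypothetical helpers for illustration only; not reference code.
# Assumed masks: lt, gt and eq are taken to be the three least
# significant bits of the numeric operator octet (see Section 4.2.1.1).
LT = 0x04
GT = 0x02
EQ = 0x01


def numeric_match(op, data, value):
    """Return True if data satisfies the lt/gt/eq bits of op vs. value."""
    return bool(
        (op & LT and data < value)
        or (op & GT and data > value)
        or (op & EQ and data == value)
    )


def length_in_range(length, low=900, high=1000):
    """AND of two comparisons: length >= low and length <= high."""
    return (numeric_match(GT | EQ, length, low)
            and numeric_match(LT | EQ, length, high))
```

   Under these assumptions, gt combined with eq expresses "greater than
   or equal", lt combined with gt expresses "not equal", an operator
   with none of the three bits set never matches, and one with all
   three set always matches. The length_in_range helper mirrors the
   packet length range 900-1000 used as an example in the Security
   Considerations.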

Authors' Addresses

   Christoph Loibl
   next layer Telekom GmbH
   Mariahilfer Guertel 37/7
   Vienna 1150
   AT

   Phone: +43 664 1176414
   Email: cl@tix.at


   Susan Hares
   Huawei
   7453 Hickory Hill
   Saline, MI 48176
   USA

   Email: shares@ndzh.com


   Robert Raszuk
   Bloomberg LP
   731 Lexington Ave
   New York City, NY 10022
   USA

   Email: robert@raszuk.net


   Danny McPherson
   Verisign
   USA

   Email: dmcpherson@verisign.com


   Martin Bacher
   T-Mobile Austria
   Rennweg 97-99
   Vienna 1030
   AT

   Email: mb.ietf@gmail.com