IDR Working Group                                               C. Loibl
Internet-Draft                                   next layer Telekom GmbH
Obsoletes: 5575,7674 (if approved)                              S. Hares
Intended status: Standards Track                                  Huawei
Expires: October 19, 2020                                      R. Raszuk
                                                            Bloomberg LP
                                                            D. McPherson
                                                                Verisign
                                                               M. Bacher
                                                        T-Mobile Austria
                                                          April 17, 2020

               Dissemination of Flow Specification Rules
                      draft-ietf-idr-rfc5575bis-22

Abstract

   This document defines a Border Gateway Protocol Network Layer
   Reachability Information (BGP NLRI) encoding format that can be used
   to distribute traffic Flow Specifications. This allows the routing
   system to propagate information regarding more specific components of
   the traffic aggregate defined by an IP destination prefix.

   It also specifies BGP Extended Community encoding formats that can
   be used to propagate Traffic Filtering Actions along with the Flow
   Specification NLRI. Those Traffic Filtering Actions encode actions a
   routing system can take if the packet matches the Flow Specification.

   Additionally, it defines two applications of that encoding format:
   one that can be used to automate inter-domain coordination of traffic
   filtering, such as what is required in order to mitigate
   (distributed) denial-of-service attacks, and a second application to
   provide traffic filtering in the context of a BGP/MPLS VPN service.
   Other applications (e.g.,
   centralized control of traffic in a SDN or NFV context) are also
   possible. Other documents may specify Flow Specification extensions.

   The information is carried via BGP, thereby reusing protocol
   algorithms, operational experience, and administrative processes such
   as inter-provider peering agreements.

   This document obsoletes both RFC5575 and RFC7674.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on October 19, 2020.

Copyright Notice

   Copyright (c) 2020 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Definitions of Terms Used in This Memo
   3.  Flow Specifications
   4.  Dissemination of IPv4 Flow Specification Information
     4.1.  Length Encoding
     4.2.  NLRI Value Encoding
       4.2.1.  Operators
       4.2.2.  Components
     4.3.  Examples of Encodings
   5.  Traffic Filtering
     5.1.  Ordering of Flow Specifications
   6.  Validation Procedure
   7.  Traffic Filtering Actions
     7.1.  Traffic Rate in Bytes (traffic-rate-bytes) sub-type 0x06
     7.2.  Traffic Rate in Packets (traffic-rate-packets) sub-type TBD
     7.3.  Traffic-action (traffic-action) sub-type 0x07
     7.4.  RT Redirect (rt-redirect) sub-type 0x08
     7.5.  Traffic Marking (traffic-marking) sub-type 0x09
     7.6.  Interaction with other Filtering Mechanisms in Routers
     7.7.  Considerations on Traffic Filtering Action Interference
   8.  Dissemination of Traffic Filtering in BGP/MPLS VPN Networks
   9.  Traffic Monitoring
   10. Error Handling
   11. IANA Considerations
     11.1.  AFI/SAFI Definitions
     11.2.  Flow Component Definitions
     11.3.  Extended Community Flow Specification Actions
   12. Security Considerations
   13. Contributors
   14. Acknowledgements
   15. References
     15.1.  Normative References
     15.2.  Informative References
     15.3.  URIs
   Appendix A.  Python code: flow_rule_cmp
   Appendix B.  Comparison with RFC 5575
   Authors' Addresses

1.  Introduction

   This document obsoletes "Dissemination of Flow Specification Rules"
   [RFC5575]; the differences are listed in Appendix B. This document
   also obsoletes "Clarification of the Flowspec Redirect Extended
   Community" [RFC7674], since it incorporates the encoding of the BGP
   Flow Specification Redirect Extended Community in Section 7.4.

   Modern IP routers can forward traffic according to IP prefixes and
   can also classify, shape, rate limit, filter, or redirect packets
   based on administratively defined policies. These traffic policy
   mechanisms allow the operator to define match rules that operate on
   multiple fields of the packet header. Actions such as the ones
   described above can be associated with each rule.

   The n-tuple consisting of the matching criteria defines an aggregate
   traffic Flow Specification. The matching criteria can include
   elements such as source and destination address prefixes, IP
   protocol, and transport protocol port numbers.

   Section 4 of this document defines a general procedure to encode Flow
   Specifications for aggregated traffic flows so that they can be
   distributed as a BGP [RFC4271] NLRI. Additionally, Section 7 of this
   document defines the required Traffic Filtering Actions BGP Extended
   Communities and mechanisms to use BGP for intra- and inter-provider
   distribution of traffic filtering rules to filter (distributed)
   denial-of-service (DoS) attacks.

   By expanding routing information with Flow Specifications, the
   routing system can take advantage of the ACL (Access Control List) or
   firewall capabilities in the router's forwarding path. Flow
   Specifications can be seen as more specific routing entries to a
   unicast prefix and are expected to depend upon the existing unicast
   data information.

   A Flow Specification received from an external autonomous system will
   need to be validated against unicast routing before being accepted
   (Section 6). The Flow Specification received from an internal BGP
   peer within the same autonomous system [RFC4271] is assumed to have
   been validated prior to transmission within the internal BGP (iBGP)
   mesh of an autonomous system. If the aggregate traffic flow defined
   by the unicast destination prefix is forwarded to a given BGP peer,
   then the local system can install more specific Flow Specifications
   that may result in different forwarding behavior, as requested by
   this system.

   From an operational perspective, the utilization of BGP as the
   carrier for this information allows a network service provider to
   reuse both internal route distribution infrastructure (e.g., route
   reflector or confederation design) and existing external
   relationships (e.g., inter-domain BGP sessions to a customer
   network).

   While it is certainly possible to address this problem using other
   mechanisms, this solution has been utilized in deployments because of
   the substantial advantage of being an incremental addition to already
   deployed mechanisms.

   In current deployments, the information distributed by this extension
   is originated both manually and automatically, the latter by
   systems that are able to detect malicious traffic flows.
   When automated systems are used, care should be taken to ensure
   their correctness and to account for the limitations of the systems
   that receive and process the advertised Flow Specifications (see
   also Section 12).

   This specification defines required protocol extensions to address
   most common applications of IPv4 unicast and VPNv4 unicast filtering.
   The same mechanism can be reused and new match criteria added to
   address similar filtering needs for other BGP address families such
   as IPv6 families [I-D.ietf-idr-flow-spec-v6].

2.  Definitions of Terms Used in This Memo

   AFI - Address Family Identifier.

   AS - Autonomous System.

   Loc-RIB - The Loc-RIB contains the routes that have been selected
      by the local BGP speaker's Decision Process [RFC4271].

   NLRI - Network Layer Reachability Information.

   PE - Provider Edge router.

   RIB - Routing Information Base.

   SAFI - Subsequent Address Family Identifier.

   VRF - Virtual Routing and Forwarding instance.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

3.  Flow Specifications

   A Flow Specification is an n-tuple consisting of several matching
   criteria that can be applied to IP traffic. A given IP packet is
   said to match the defined Flow Specification if it matches all the
   specified criteria. This n-tuple is encoded into a BGP NLRI defined
   below.

   A given Flow Specification may be associated with a set of
   attributes, depending on the particular application; such attributes
   may or may not include reachability information (i.e., NEXT_HOP).
   Well-known or AS-specific community attributes can be used to encode
   a set of predetermined actions.

   A particular application is identified by a specific (Address Family
   Identifier, Subsequent Address Family Identifier (AFI, SAFI)) pair
   [RFC4760] and corresponds to a distinct set of RIBs. Those RIBs
   should be treated independently from each other in order to assure
   non-interference between distinct applications.

   BGP itself treats the NLRI as a key to an entry in its databases.
   Entries that are placed in the Loc-RIB are then associated with a
   given set of semantics, which is application dependent. This is
   consistent with existing BGP applications. For instance, IP unicast
   routing (AFI=1, SAFI=1) and IP multicast reverse-path information
   (AFI=1, SAFI=2) are handled by BGP without any particular semantics
   being associated with them until installed in the Loc-RIB.

   Standard BGP policy mechanisms, such as UPDATE filtering by NLRI
   prefix as well as community matching and manipulation, must apply to
   the Flow Specification defined NLRI-type, especially in an
   inter-domain environment. Network operators can also control
   propagation of such routing updates by enabling or disabling the
   exchange of a particular (AFI, SAFI) pair on a given BGP peering
   session.

4.  Dissemination of IPv4 Flow Specification Information

   This document defines a Flow Specification NLRI type (Figure 1) that
   may include several components such as destination prefix, source
   prefix, protocol, ports, and others (see Section 4.2 below).

   This NLRI information is encoded using MP_REACH_NLRI and
   MP_UNREACH_NLRI attributes as defined in [RFC4760]. When advertising
   Flow Specifications, the Length of Next Hop Network Address SHOULD be
   set to 0. The Network Address of Next Hop field MUST be ignored.

   The NLRI field of the MP_REACH_NLRI and MP_UNREACH_NLRI is encoded as
   one or more 2-tuples of the form <length, NLRI value>. It consists
   of a 1- or 2-octet length field followed by a variable-length NLRI
   value.
   The length is expressed in octets.

                   +-------------------------------+
                   |    length (0xnn or 0xfnnn)    |
                   +-------------------------------+
                   |    NLRI value   (variable)    |
                   +-------------------------------+

                Figure 1: Flow Specification NLRI for IPv4

   Implementations wishing to exchange Flow Specification MUST use BGP's
   Capability Advertisement facility to exchange the Multiprotocol
   Extension Capability Code (Code 1) as defined in [RFC4760]. The
   (AFI, SAFI) pair carried in the Multiprotocol Extension Capability
   MUST be (AFI=1, SAFI=133) for IPv4 Flow Specification, and (AFI=1,
   SAFI=134) for VPNv4 Flow Specification.

4.1.  Length Encoding

   o  If the NLRI length is smaller than 240 (0xf0 hex) octets, the
      length field can be encoded as a single octet.

   o  Otherwise, it is encoded as an extended-length 2-octet value in
      which the most significant nibble of the first octet is all ones.

   In Figure 1 above, values less than 240 are encoded using two hex
   digits (0xnn). Values above 239 are encoded using 3 hex digits
   (0xfnnn). The highest value that can be represented with this
   encoding is 4095. For example, the length value of 239 is encoded as
   0xef (single octet) while 240 is encoded as 0xf0f0 (2-octet).

4.2.  NLRI Value Encoding

   The Flow Specification NLRI value consists of a list of optional
   components and is encoded as follows:

   Encoding: <[component]+>

   A specific packet is considered to match the Flow Specification when
   it matches the intersection (AND) of all the components present in
   the Flow Specification.

   Components MUST follow strict type ordering by increasing numerical
   order. A given component type may (exactly once) or may not be
   present in the Flow Specification. If present, it MUST precede any
   component of higher numeric type value.

   All combinations of components within a single Flow Specification are
   allowed.
   However, some combinations cannot match any packets (e.g.,
   "ICMP Type AND Port" will never match any packets), and thus SHOULD
   NOT be propagated by BGP.

   An NLRI value not encoded as specified in Section 4.2 is considered
   malformed, and error handling according to Section 10 is performed.

4.2.1.  Operators

   Most of the components described below make use of comparison
   operators. Which of the two operators is used is defined by the
   components in Section 4.2.2. The operators are encoded as a single
   octet.

4.2.1.1.  Numeric Operator (numeric_op)

   This operator is encoded as shown in Figure 2.

        0   1   2   3   4   5   6   7
      +---+---+---+---+---+---+---+---+
      | e | a |  len  | 0 |lt |gt |eq |
      +---+---+---+---+---+---+---+---+

              Figure 2: Numeric Operator (numeric_op)

   e - end-of-list bit: Set in the last {op, value} pair in the list.

   a - AND bit: If unset, the result of the previous {op, value} pair
      is logically ORed with the current one. If set, the operation is
      a logical AND. In the first operator octet of a sequence it
      SHOULD be encoded as unset and MUST be treated as always unset on
      decoding. The AND operator has higher priority than OR for the
      purposes of evaluating logical expressions.

   len - length: The length of the value field for this operator given
      as (1 << len). This encodes 1 (len=00), 2 (len=01), 4 (len=10), 8
      (len=11) octets.

   0 - SHOULD be set to 0 on NLRI encoding, and MUST be ignored during
      decoding.

   lt - less than comparison between data and value.

   gt - greater than comparison between data and value.

   eq - equality between data and value.

   The bits lt, gt, and eq can be combined to produce common relational
   operators such as "less or equal", "greater or equal", and "not equal
   to" as shown in Table 1.

       +----+----+----+-----------------------------------+
       | lt | gt | eq | Resulting operation               |
       +----+----+----+-----------------------------------+
       | 0  | 0  | 0  | false (independent of the value)  |
       | 0  | 0  | 1  | == (equal)                        |
       | 0  | 1  | 0  | > (greater than)                  |
       | 0  | 1  | 1  | >= (greater than or equal)        |
       | 1  | 0  | 0  | < (less than)                     |
       | 1  | 0  | 1  | <= (less than or equal)           |
       | 1  | 1  | 0  | != (not equal value)              |
       | 1  | 1  | 1  | true (independent of the value)   |
       +----+----+----+-----------------------------------+

              Table 1: Comparison operation combinations

4.2.1.2.  Bitmask Operator (bitmask_op)

   This operator is encoded as shown in Figure 3.

        0   1   2   3   4   5   6   7
      +---+---+---+---+---+---+---+---+
      | e | a |  len  | 0 | 0 |not| m |
      +---+---+---+---+---+---+---+---+

              Figure 3: Bitmask Operator (bitmask_op)

   e, a, len - Most significant nibble: (end-of-list bit, AND bit, and
      length field), as defined in the Numeric Operator format in
      Section 4.2.1.1.

   not - NOT bit: If set, logical negation of operation.

   m - Match bit: If set, this is a bitwise match operation defined as
      "(data AND value) == value"; if unset, (data AND value) evaluates
      to TRUE if any of the bits in the value mask are set in the data.

   0 - all 0 bits: SHOULD be set to 0 on NLRI encoding, and MUST be
      ignored during decoding.

4.2.2.  Components

   The encoding of each of the components begins with a type field (1
   octet) followed by a variable length parameter. The following
   sections define component types and parameter encodings for the IPv4
   IP layer and transport layer headers. IPv6 NLRI component types are
   described in [I-D.ietf-idr-flow-spec-v6].

4.2.2.1.  Type 1 - Destination Prefix

   Encoding: <type (1 octet), length (1 octet), prefix (variable)>

   Defines the destination prefix to match. The length and prefix
   fields are encoded as in BGP UPDATE messages [RFC4271].

4.2.2.2.  Type 2 - Source Prefix

   Encoding: <type (1 octet), length (1 octet), prefix (variable)>

   Defines the source prefix to match. The length and prefix fields are
   encoded as in BGP UPDATE messages [RFC4271].

4.2.2.3.  Type 3 - IP Protocol

   Encoding: <type (1 octet), [numeric_op, value]+>

   Contains a list of {numeric_op, value} pairs that are used to match
   the IP protocol value octet in the IP packet header (see [RFC0791]
   Section 3.1).

   This component uses the Numeric Operator (numeric_op) described in
   Section 4.2.1.1. Type 3 component values SHOULD be encoded as single
   octet (numeric_op len=00).

4.2.2.4.  Type 4 - Port

   Encoding: <type (1 octet), [numeric_op, value]+>

   Defines a list of {numeric_op, value} pairs that matches source OR
   destination TCP/UDP ports (see [RFC0793] Section 3.1 and [RFC0768]
   Section "Format"). This component matches if either the destination
   port OR the source port of an IP packet matches the value.

   This component uses the Numeric Operator (numeric_op) described in
   Section 4.2.1.1. Type 4 component values SHOULD be encoded as 1- or
   2-octet quantities (numeric_op len=00 or len=01).

   If the port (destination-port, source-port) component is present,
   only TCP or UDP packets can match the entire Flow Specification.
   The port component, if present, never matches when the packet's IP
   protocol value is not 6 (TCP) or 17 (UDP), if the packet is
   fragmented and this is not the first fragment, or if the system is
   unable to locate the transport header. Different implementations
   may or may not be able to decode the transport header in the
   presence of IP options or Encapsulating Security Payload (ESP) NULL
   [RFC4303] encryption.

4.2.2.5.  Type 5 - Destination Port

   Encoding: <type (1 octet), [numeric_op, value]+>

   Defines a list of {numeric_op, value} pairs used to match the
   destination port of a TCP or UDP packet (see also [RFC0793]
   Section 3.1 and [RFC0768] Section "Format").

   This component uses the Numeric Operator (numeric_op) described in
   Section 4.2.1.1.
   Type 5 component values SHOULD be encoded as 1- or
   2-octet quantities (numeric_op len=00 or len=01).

   The last paragraph of Section 4.2.2.4 also applies to this component.

4.2.2.6.  Type 6 - Source Port

   Encoding: <type (1 octet), [numeric_op, value]+>

   Defines a list of {numeric_op, value} pairs used to match the source
   port of a TCP or UDP packet (see also [RFC0793] Section 3.1 and
   [RFC0768] Section "Format").

   This component uses the Numeric Operator (numeric_op) described in
   Section 4.2.1.1. Type 6 component values SHOULD be encoded as 1- or
   2-octet quantities (numeric_op len=00 or len=01).

   The last paragraph of Section 4.2.2.4 also applies to this component.

4.2.2.7.  Type 7 - ICMP type

   Encoding: <type (1 octet), [numeric_op, value]+>

   Defines a list of {numeric_op, value} pairs used to match the type
   field of an ICMP packet (see also [RFC0792] Section "Message
   Formats").

   This component uses the Numeric Operator (numeric_op) described in
   Section 4.2.1.1. Type 7 component values SHOULD be encoded as single
   octet (numeric_op len=00).

   If the ICMP type (code) component is present, only ICMP packets can
   match the entire Flow Specification. The ICMP type (code)
   component, if present, never matches when the packet's IP protocol
   value is not 1 (ICMP), if the packet is fragmented and this is not
   the first fragment, or if the system is unable to locate the
   transport header. Different implementations may or may not be able
   to decode the transport header in the presence of IP options or
   Encapsulating Security Payload (ESP) NULL [RFC4303] encryption.

4.2.2.8.  Type 8 - ICMP code

   Encoding: <type (1 octet), [numeric_op, value]+>

   Defines a list of {numeric_op, value} pairs used to match the code
   field of an ICMP packet (see also [RFC0792] Section "Message
   Formats").

   This component uses the Numeric Operator (numeric_op) described in
   Section 4.2.1.1. Type 8 component values SHOULD be encoded as single
   octet (numeric_op len=00).
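
   As an illustration of how the {numeric_op, value} lists used by the
   components above (Types 3 through 8) might be evaluated, the
   following Python sketch decodes the operator octet from Figure 2 and
   applies Table 1. It is not part of the specification or of the
   Appendix A code; the function name eval_numeric_ops is hypothetical,
   and the sketch assumes a well-formed list (no bounds checking or
   error handling per Section 10).

```python
def eval_numeric_ops(encoded: bytes, data: int) -> bool:
    """Evaluate a list of {numeric_op, value} pairs against one field.

    Hypothetical helper: 'encoded' is the component body after the type
    octet, 'data' is the packet field value (e.g., a port number).
    Assumes the list is well formed and ends with the end-of-list bit.
    """
    or_terms = []      # results of completed AND-groups
    current = None     # running result of the current AND-group
    pos = 0
    first = True
    while True:
        op = encoded[pos]
        size = 1 << ((op >> 4) & 0x03)           # len field: 1, 2, 4, 8
        value = int.from_bytes(encoded[pos + 1:pos + 1 + size], "big")
        pos += 1 + size

        # Table 1: lt, gt, and eq combine as a logical OR of comparisons.
        result = bool(
            (op & 0x04 and data < value)
            or (op & 0x02 and data > value)
            or (op & 0x01 and data == value)
        )

        # The AND bit is treated as unset in the first operator octet.
        if (op & 0x40) and not first:
            current = current and result
        else:
            if current is not None:
                or_terms.append(current)
            current = result
        first = False

        if op & 0x80:                            # end-of-list bit
            break
    or_terms.append(current)
    # AND binds tighter than OR, so OR the AND-groups together.
    return any(or_terms)
```

   Grouping consecutive AND-ed pairs before OR-ing the groups is one
   simple way to honor the rule that AND has higher priority than OR.
   For instance, the port list from Example 2 in Section 4.3.2 (octets
   03 89 45 8b 91 1f 90) could be passed as
   bytes.fromhex("0389458b911f90") together with a candidate port
   number.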

   The last paragraph of Section 4.2.2.7 also applies to this component.

4.2.2.9.  Type 9 - TCP flags

   Encoding: <type (1 octet), [bitmask_op, bitmask]+>

   Defines a list of {bitmask_op, bitmask} pairs used to match TCP
   Control Bits (see also [RFC0793] Section 3.1).

   This component uses the Bitmask Operator (bitmask_op) described in
   Section 4.2.1.2. Type 9 component bitmasks MUST be encoded as 1- or
   2-octet bitmask (bitmask_op len=00 or len=01).

   When a single octet (bitmask_op len=00) is specified, it matches
   octet 14 of the TCP header (see also [RFC0793] Section 3.1), which
   contains the TCP Control Bits. When a 2-octet (bitmask_op len=01)
   encoding is used, it matches octets 13 and 14 of the TCP header with
   the data offset (leftmost 4 bits) always treated as 0.

   If the TCP flags component is present, only TCP packets can match
   the entire Flow Specification. The TCP flags component, if
   present, never matches when the packet's IP protocol value is not 6
   (TCP), if the packet is fragmented and this is not the first
   fragment, or if the system is unable to locate the transport header.
   Different implementations may or may not be able to decode the
   transport header in the presence of IP options or Encapsulating
   Security Payload (ESP) NULL [RFC4303] encryption.

4.2.2.10.  Type 10 - Packet length

   Encoding: <type (1 octet), [numeric_op, value]+>

   Defines a list of {numeric_op, value} pairs used to match on the
   total IP packet length (excluding Layer 2 but including IP header).

   This component uses the Numeric Operator (numeric_op) described in
   Section 4.2.1.1. Type 10 component values SHOULD be encoded as 1- or
   2-octet quantities (numeric_op len=00 or len=01).

4.2.2.11.  Type 11 - DSCP (Diffserv Code Point)

   Encoding: <type (1 octet), [numeric_op, value]+>

   Defines a list of {numeric_op, value} pairs used to match the 6-bit
   DSCP field (see also [RFC2474]).

   This component uses the Numeric Operator (numeric_op) described in
   Section 4.2.1.1.
   Type 11 component values MUST be encoded as single
   octet (numeric_op len=00).

   The six least significant bits contain the DSCP value. All other
   bits SHOULD be treated as 0.

4.2.2.12.  Type 12 - Fragment

   Encoding: <type (1 octet), [bitmask_op, bitmask]+>

   Defines a list of {bitmask_op, bitmask} pairs used to match specific
   IP fragments.

   This component uses the Bitmask Operator (bitmask_op) described in
   Section 4.2.1.2. The Type 12 component bitmask MUST be encoded as
   single octet bitmask (bitmask_op len=00).

        0   1   2   3   4   5   6   7
      +---+---+---+---+---+---+---+---+
      | 0 | 0 | 0 | 0 |LF |FF |IsF|DF |
      +---+---+---+---+---+---+---+---+

                Figure 4: Fragment Bitmask Operand

   Bitmask values:

   DF - Don't fragment - match if [RFC0791] IP Header Flags Bit-1 (DF)
      is 1.

   IsF - Is a fragment - match if [RFC0791] IP Header Fragment Offset
      is not 0.

   FF - First fragment - match if [RFC0791] IP Header Fragment Offset
      is 0 AND Flags Bit-2 (MF) is 1.

   LF - Last fragment - match if [RFC0791] IP Header Fragment Offset is
      not 0 AND Flags Bit-2 (MF) is 0.

   0 - SHOULD be set to 0 on NLRI encoding, and MUST be ignored during
      decoding.

4.3.  Examples of Encodings

4.3.1.  Example 1

   An example of a Flow Specification NLRI encoding for: "all packets to
   192.0.2.0/24 and TCP port 25".

   +--------+----------------+----------+----------+
   | length | destination    | protocol | port     |
   +--------+----------------+----------+----------+
   | 0x0b   | 01 18 c0 00 02 | 03 81 06 | 04 81 19 |
   +--------+----------------+----------+----------+

   Decoded:

   +-------+------------+-------------------------------+
   | Value |            |                               |
   +-------+------------+-------------------------------+
   | 0x0b  | length     | 11 octets (len<240 1-octet)   |
   | 0x01  | type       | Type 1 - Destination Prefix   |
   | 0x18  | length     | 24 bit                        |
   | 0xc0  | prefix     | 192                           |
   | 0x00  | prefix     | 0                             |
   | 0x02  | prefix     | 2                             |
   | 0x03  | type       | Type 3 - IP Protocol          |
   | 0x81  | numeric_op | end-of-list, value size=1, == |
   | 0x06  | value      | 6 (TCP)                       |
   | 0x04  | type       | Type 4 - Port                 |
   | 0x81  | numeric_op | end-of-list, value size=1, == |
   | 0x19  | value      | 25                            |
   +-------+------------+-------------------------------+

   This constitutes an NLRI with an NLRI length of 11 octets.

4.3.2.  Example 2

   An example of a Flow Specification NLRI encoding for: "all packets to
   192.0.2.0/24 from 203.0.113.0/24 and port {range [137, 139] or
   8080}".

   +--------+----------------+----------------+-------------------------+
   | length | destination    | source         | port                    |
   +--------+----------------+----------------+-------------------------+
   | 0x12   | 01 18 c0 00 02 | 02 18 cb 00 71 | 04 03 89 45 8b 91 1f 90 |
   +--------+----------------+----------------+-------------------------+

   Decoded:

   +--------+------------+-------------------------------+
   | Value  |            |                               |
   +--------+------------+-------------------------------+
   | 0x12   | length     | 18 octets (len<240 1-octet)   |
   | 0x01   | type       | Type 1 - Destination Prefix   |
   | 0x18   | length     | 24 bit                        |
   | 0xc0   | prefix     | 192                           |
   | 0x00   | prefix     | 0                             |
   | 0x02   | prefix     | 2                             |
   | 0x02   | type       | Type 2 - Source Prefix        |
   | 0x18   | length     | 24 bit                        |
   | 0xcb   | prefix     | 203                           |
   | 0x00   | prefix     | 0                             |
   | 0x71   | prefix     | 113                           |
   | 0x04   | type       | Type 4 - Port                 |
   | 0x03   | numeric_op | value size=1, >=              |
   | 0x89   | value      | 137                           |
   | 0x45   | numeric_op | "AND", value size=1, <=       |
   | 0x8b   | value      | 139                           |
   | 0x91   | numeric_op | end-of-list, value size=2, == |
   | 0x1f90 | value      | 8080                          |
   +--------+------------+-------------------------------+

   This constitutes an NLRI with an NLRI length of 18 octets.

4.3.3.  Example 3

   An example of a Flow Specification NLRI encoding for: "all packets to
   192.0.2.1/32 and fragment { DF or FF }" (matching packets with the DF
   bit set or First Fragments).

   +--------+-------------------+----------+
   | length | destination       | fragment |
   +--------+-------------------+----------+
   | 0x09   | 01 20 c0 00 02 01 | 0c 80 05 |
   +--------+-------------------+----------+

   Decoded:

   +-------+------------+------------------------------+
   | Value |            |                              |
   +-------+------------+------------------------------+
   | 0x09  | length     | 9 octets (len<240 1-octet)   |
   | 0x01  | type       | Type 1 - Destination Prefix  |
   | 0x20  | length     | 32 bit                       |
   | 0xc0  | prefix     | 192                          |
   | 0x00  | prefix     | 0                            |
   | 0x02  | prefix     | 2                            |
   | 0x01  | prefix     | 1                            |
   | 0x0c  | type       | Type 12 - Fragment           |
   | 0x80  | bitmask_op | end-of-list, value size=1    |
   | 0x05  | bitmask    | DF=1, FF=1                   |
   +-------+------------+------------------------------+

   This constitutes an NLRI with an NLRI length of 9 octets.

5.  Traffic Filtering

   Traffic filtering policies have been traditionally considered to be
   relatively static. Limitations of these static mechanisms caused
   this new dynamic mechanism to be designed for the three new
   applications of traffic filtering:

   o  Prevention of traffic-based, denial-of-service (DoS) attacks.

   o  Traffic filtering in the context of BGP/MPLS VPN service.

   o  Centralized traffic control for SDN/NFV networks.

   These applications require coordination among service providers
   and/or coordination among the ASes within a service provider.

   The Flow Specification NLRI defined in Section 4 conveys information
   about traffic filtering rules for traffic that should be discarded or
   handled in a manner specified by a set of pre-defined actions (which
   are defined in BGP Extended Communities).
   This mechanism is primarily designed to allow an upstream autonomous
   system to perform inbound filtering in its ingress routers of traffic
   that a given downstream AS wishes to drop.

   In order to achieve this goal, this document specifies two
   application-specific NLRI identifiers that provide traffic filters,
   and a set of actions encoded in BGP Extended Communities. The two
   application-specific NLRI identifiers are:

   o  IPv4 Flow Specification identifier (AFI=1, SAFI=133) along with
      specific semantic rules for IPv4 routes, and

   o  VPNv4 Flow Specification identifier (AFI=1, SAFI=134), which can
      be used to propagate traffic filtering information in a BGP/MPLS
      VPN environment.

   Encoding of the NLRI is described in Section 4 for IPv4 Flow
   Specification and in Section 8 for VPNv4 Flow Specification. The
   filtering actions are described in Section 7.

5.1.  Ordering of Flow Specifications

   More than one Flow Specification may match a particular traffic flow.
   Thus, it is necessary to define the order in which Flow
   Specifications get matched and actions are applied to a particular
   traffic flow. This ordering function is such that it does not depend
   on the arrival order of the Flow Specification via BGP and thus is
   consistent in the network.

   The relative order of two Flow Specifications is determined by
   comparing their respective components. The algorithm starts by
   comparing the left-most components (lowest component type value) of
   the Flow Specifications. If the types differ, the Flow Specification
   with the lowest numeric type value has higher precedence (and thus
   will match before) than the Flow Specification that doesn't contain
   that component type. If the component types are the same, then a
   type-specific comparison is performed (see below). If the compared
   components are equal, the algorithm continues with the next
   component.
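
   As a rough illustration of the ordering described in this section,
   the following Python sketch walks two Flow Specifications component
   by component. It is not the Appendix A code. The names flowspec_cmp,
   cmp_component, A_FIRST, B_FIRST and EQUAL, and the representation of
   a Flow Specification as a list of (type, value) tuples sorted by
   type, are assumptions made for this example only. The type-specific
   rules it applies are the ones given in the remainder of this section.

```python
import ipaddress
from itertools import zip_longest

# Hypothetical result codes for this sketch.
A_FIRST, EQUAL, B_FIRST = -1, 0, 1
PREFIX_TYPES = (1, 2)  # Type 1 destination prefix, Type 2 source prefix


def cmp_component(ctype, a, b):
    """Type-specific comparison of two components of the same type."""
    if ctype in PREFIX_TYPES:
        # Assumption: prefix values are ipaddress network objects.
        if a.overlaps(b):
            # One prefix covers the other: the more specific one wins.
            if a.prefixlen != b.prefixlen:
                return A_FIRST if a.prefixlen > b.prefixlen else B_FIRST
            return EQUAL
        # Disjoint prefixes: the lowest IP value wins.
        return A_FIRST if a.network_address < b.network_address else B_FIRST
    # Assumption: all other component values are raw bytes.
    common = min(len(a), len(b))
    if a[:common] != b[:common]:
        # memcmp-like comparison of the common prefix: lowest wins.
        return A_FIRST if a[:common] < b[:common] else B_FIRST
    if len(a) != len(b):
        # Equal common prefix: the longer string wins.
        return A_FIRST if len(a) > len(b) else B_FIRST
    return EQUAL


def flowspec_cmp(fs_a, fs_b):
    """Compare two Flow Specifications given as sorted (type, value) lists."""
    for comp_a, comp_b in zip_longest(fs_a, fs_b):
        # A Flow Specification that lacks the component loses.
        if comp_a is None:
            return B_FIRST
        if comp_b is None:
            return A_FIRST
        (type_a, value_a), (type_b, value_b) = comp_a, comp_b
        if type_a != type_b:
            # The lower component type has higher precedence.
            return A_FIRST if type_a < type_b else B_FIRST
        result = cmp_component(type_a, value_a, value_b)
        if result != EQUAL:
            return result
    return EQUAL
```

   Because the result depends only on the component contents, two
   routers that hold the same set of Flow Specifications would sort
   them identically, regardless of the order in which they arrived.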
   For IP prefix values (IP destination or source prefix): If one of the
   two prefixes to compare is a more specific prefix of the other, the
   more specific prefix has higher precedence. Otherwise the one with
   the lowest IP value has higher precedence.

   For all other component types, unless otherwise specified, the
   comparison is performed by comparing the component data as a binary
   string using the memcmp() function as defined by [ISO_IEC_9899]. For
   strings with equal lengths, the lowest string (memcmp) has higher
   precedence. For strings of different lengths, the common prefix is
   compared. If the common prefix is not equal, the string with the
   lowest prefix has higher precedence. If the common prefix is equal,
   the longest string is considered to have higher precedence than the
   shorter one.

   The code in Appendix A shows a Python3 implementation of the
   comparison algorithm. The full code was tested with Python 3.6.3 and
   can be obtained at
   https://github.com/stoffi92/rfc5575bis/tree/master/flowspec-cmp [1].

6.  Validation Procedure

   Flow Specifications received from a BGP peer that are accepted in the
   respective Adj-RIB-In are used as input to the route selection
   process. Although the forwarding attributes of two routes for the
   same Flow Specification prefix may be the same, BGP is still required
   to perform its path selection algorithm in order to select the
   correct set of attributes to advertise.

   The first step of the BGP Route Selection procedure (Section 9.1.2 of
   [RFC4271]) is to exclude from the selection procedure routes that are
   considered non-feasible. In the context of IP routing information,
   this step is used to validate that the NEXT_HOP attribute of a given
   route is resolvable.

   The concept can be extended, in the case of the Flow Specification
   NLRI, to allow other validation procedures.
   The validation process described below validates Flow Specifications
   against unicast routes received over the same AFI but the associated
   unicast routing information SAFI:

      Flow Specification received over SAFI=133 will be validated
      against routes received over SAFI=1

      Flow Specification received over SAFI=134 will be validated
      against routes received over SAFI=128

   By default a Flow Specification NLRI MUST be validated such that it
   is considered feasible if and only if all of the conditions below are
   true:

   a)  A destination prefix component is embedded in the Flow
       Specification.

   b)  The originator of the Flow Specification matches the originator
       of the best-match unicast route for the destination prefix
       embedded in the Flow Specification (this is the unicast route
       with the longest possible prefix length covering the destination
       prefix embedded in the Flow Specification).

   c)  There are no more specific unicast routes, when compared with
       the flow destination prefix, that have been received from a
       different neighboring AS than the best-match unicast route, which
       has been determined in rule b).

   However, rule a) MAY be relaxed by explicit configuration, permitting
   Flow Specifications that include no destination prefix component. If
   such is the case, rules b) and c) are moot and MUST be disregarded.

   By originator of a BGP route, we mean either the address of the
   originator in the ORIGINATOR_ID Attribute [RFC4456], or the source IP
   address of the BGP peer, if this path attribute is not present.

   BGP implementations MUST also enforce that the AS_PATH attribute of a
   route received via the External Border Gateway Protocol (eBGP)
   contains the neighboring AS in the left-most position of the AS_PATH
   attribute. While this rule is optional in the BGP specification, it
   becomes necessary to enforce it for security reasons.
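
   The default rules a) to c) above can be sketched in Python as
   follows. This is a minimal illustration under simplifying
   assumptions, not a normative procedure. The names UnicastRoute,
   covers, is_feasible and allow_no_dest are hypothetical. The sketch
   assumes that routes holds one already-selected unicast route per
   prefix for the matching AFI/SAFI, and it treats a Flow Specification
   with no covering unicast route as not feasible, since rule b) cannot
   then be satisfied.

```python
import ipaddress
from typing import NamedTuple, Optional, Sequence


class UnicastRoute(NamedTuple):
    prefix: ipaddress.IPv4Network
    originator: str   # ORIGINATOR_ID if present, else the peer's source IP
    neighbor_as: int  # neighboring AS the route was received from


def covers(outer, inner):
    """True if prefix `outer` is equal to or less specific than `inner`."""
    return outer.prefixlen <= inner.prefixlen and outer.overlaps(inner)


def is_feasible(dest: Optional[ipaddress.IPv4Network],
                fs_originator: str,
                routes: Sequence[UnicastRoute],
                allow_no_dest: bool = False) -> bool:
    # Rule a): a destination prefix component must be present, unless
    # relaxed by configuration, in which case b) and c) are disregarded.
    if dest is None:
        return allow_no_dest

    # Rule b): the best-match route is the longest prefix covering dest,
    # and its originator must match the Flow Specification's originator.
    candidates = [r for r in routes if covers(r.prefix, dest)]
    if not candidates:
        return False  # assumption of this sketch, see the lead-in
    best = max(candidates, key=lambda r: r.prefix.prefixlen)
    if best.originator != fs_originator:
        return False

    # Rule c): no more specific unicast route may come from a different
    # neighboring AS than the best-match route.
    for r in routes:
        more_specific = r.prefix != dest and covers(dest, r.prefix)
        if more_specific and r.neighbor_as != best.neighbor_as:
            return False
    return True
```

   An implementation following this shape would call the check again
   whenever the set of unicast routes changes, since the best-match
   route is not fixed over time.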
   The best-match unicast route may change over time independently of
   the Flow Specification NLRI. Therefore, a revalidation of the Flow
   Specification NLRI MUST be performed whenever unicast routes change.
   Revalidation is defined as retesting that rule a) and rule b) above
   are true.

   Explanation:

   The underlying concept is that the neighboring AS that advertises the
   best unicast route for a destination is allowed to advertise Flow
   Specification information that conveys a more or equally specific
   destination prefix. This holds as long as there are no more specific
   unicast routes, received from a different neighboring AS, which would
   be affected by that Flow Specification.

   The neighboring AS is the immediate destination of the traffic
   described by the Flow Specification. If it requests these flows to
   be dropped, that request can be honored without concern that it
   represents a denial of service in itself. Supposedly, the traffic is
   being dropped by the downstream autonomous system, and there is no
   added value in carrying the traffic to it.

7.  Traffic Filtering Actions

   This document defines a minimum set of Traffic Filtering Actions that
   it standardizes as BGP extended communities [RFC4360]. This is not
   meant to be an inclusive list of all the possible actions, but only a
   subset that can be interpreted consistently across the network.
   Additional actions can be defined as either requiring standards or as
   vendor specific.

   The default action for a matching Flow Specification is to accept the
   packet (treat the packet according to the normal forwarding behaviour
   of the system).

   This document defines the following extended community values shown
   in Table 2 in the form 0xttss where tt indicates the type and ss
   indicates the sub-type of the extended community. Encodings for
   these extended communities are described below.
   +-------------+---------------------------+-------------------------+
   | community   | action                    | encoding                |
   | 0xttss      |                           |                         |
   +-------------+---------------------------+-------------------------+
   | 0x8006      | traffic-rate-bytes        | 2-octet ASN, 4-octet    |
   |             | (Section 7.1)             | float                   |
   | TBD         | traffic-rate-packets      | 2-octet ASN, 4-octet    |
   |             | (Section 7.2)             | float                   |
   | 0x8007      | traffic-action            | bitmask                 |
   |             | (Section 7.3)             |                         |
   | 0x8008      | rt-redirect AS-2octet     | 2-octet AS, 4-octet     |
   |             | (Section 7.4)             | value                   |
   | 0x8108      | rt-redirect IPv4          | 4-octet IPv4 address,   |
   |             | (Section 7.4)             | 2-octet value           |
   | 0x8208      | rt-redirect AS-4octet     | 4-octet AS, 2-octet     |
   |             | (Section 7.4)             | value                   |
   | 0x8009      | traffic-marking           | DSCP value              |
   |             | (Section 7.5)             |                         |
   +-------------+---------------------------+-------------------------+

         Table 2: Traffic Filtering Action Extended Communities

   Multiple Traffic Filtering Actions defined in this document may be
   present for a single Flow Specification and SHOULD be applied to the
   traffic flow (for example traffic-rate-bytes and rt-redirect can be
   applied to packets at the same time). If not all of the Traffic
   Filtering Actions can be applied to a traffic flow, they should be
   treated as interfering Traffic Filtering Actions (see below).

   Some Traffic Filtering Actions may interfere with or even contradict
   each other. Section 7.7 of this document provides general
   considerations on such Traffic Filtering Action interference. Any
   additional definition of Traffic Filtering Actions SHOULD specify the
   action to take if those Traffic Filtering Actions interfere (also
   with existing Traffic Filtering Actions).

   All Traffic Filtering Actions are specified as transitive BGP
   Extended Communities.

7.1.
Traffic Rate in Bytes (traffic-rate-bytes) sub-type 0x06

   The traffic-rate-bytes extended community uses the following extended
   community encoding:

   The first two octets carry the 2-octet id, which can be assigned from
   a 2-octet AS number. When a 4-octet AS number is locally present,
   the 2 least significant octets of such an AS number can be used.
   This value is purely informational and SHOULD NOT be interpreted by
   the implementation.

   The remaining 4 octets carry the maximum rate information in IEEE
   floating point [IEEE.754.1985] format, units being bytes per second.
   A traffic-rate of 0 should result in all traffic for the particular
   flow being discarded. On encoding, the traffic-rate MUST NOT be
   negative. On decoding, negative values MUST be treated as zero
   (discard all traffic).

   Interferes with: No other BGP Flow Specification Traffic Filtering
   Action in this document.

7.2.  Traffic Rate in Packets (traffic-rate-packets) sub-type TBD

   The traffic-rate-packets extended community uses the same encoding as
   the traffic-rate-bytes extended community. The floating point value
   carries the maximum packet rate in packets per second. A
   traffic-rate-packets of 0 should result in all traffic for the
   particular flow being discarded. On encoding, the
   traffic-rate-packets MUST NOT be negative. On decoding, negative
   values MUST be treated as zero (discard all traffic).

   Interferes with: No other BGP Flow Specification Traffic Filtering
   Action in this document.

7.3.  Traffic-action (traffic-action) sub-type 0x07

   The traffic-action extended community consists of 6 octets of which
   only the 2 least significant bits of the 6th octet (from left to
   right) are defined by this document as shown in Figure 5.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     Traffic Action Field                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Tr. Action Field (cont.) |S|T|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

          Figure 5: Traffic-action Extended Community Encoding

   where S and T are defined as:

   o  T: Terminal Action (bit 47): When this bit is set, the traffic
      filtering engine will evaluate any subsequent Flow Specifications
      (as defined by the ordering procedure in Section 5.1). If not
      set, the evaluation of the traffic filters stops when this Flow
      Specification is evaluated.

   o  S: Sample (bit 46): Enables traffic sampling and logging for this
      Flow Specification (only effective when set).

   o  Traffic Action Field: Other Traffic Action Field (see Section 11)
      bits unused in this specification. These bits SHOULD be set to 0
      on encoding, and MUST be ignored during decoding.

   The use of the Terminal Action (bit 47) may result in more than one
   Flow Specification matching a particular traffic flow. All the
   Traffic Filtering Actions from these Flow Specifications shall be
   collected and applied. In case of interfering Traffic Filtering
   Actions it is an implementation decision which Traffic Filtering
   Actions are selected. See also Section 7.7.

   Interferes with: No other BGP Flow Specification Traffic Filtering
   Action in this document.

7.4.  RT Redirect (rt-redirect) sub-type 0x08

   The redirect extended community allows the traffic to be redirected
   to a VRF routing instance that lists the specified route-target in
   its import policy. If several local instances match these criteria,
   the choice between them is a local matter (for example, the instance
   with the lowest Route Distinguisher value can be elected).

   This Extended Community allows 3 different encoding formats for the
   route-target (type 0x80, 0x81, 0x82). It uses the same encoding as
   the Route Target Extended Community in Sections 3.1 (type 0x80:
   2-octet AS, 4-octet value), 3.2 (type 0x81: 4-octet IPv4 address,
   2-octet value) and 4 of [RFC4360] and Section 2 (type 0x82: 4-octet
   AS, 2-octet value) of [RFC5668] with the high-order octet of the Type
   field 0x80, 0x81, 0x82 respectively and the low-order octet of the
   Type field (Sub-Type) always 0x08.

   Interferes with: No other BGP Flow Specification Traffic Filtering
   Action in this document.

7.5.  Traffic Marking (traffic-marking) sub-type 0x09

   The traffic marking extended community instructs a system to modify
   the DSCP bits in the IP header ([RFC2474] Section 3) of a transiting
   IP packet to the corresponding value encoded in the 6 least
   significant bits of the extended community value as shown in
   Figure 6.

   The extended community is encoded as follows:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   reserved    |   reserved    |   reserved    |   reserved    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   reserved    | r.|   DSCP    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

          Figure 6: Traffic Marking Extended Community Encoding

   o  DSCP: new DSCP value for the transiting IP packet.

   o  reserved, r.: SHOULD be set to 0 on encoding, and MUST be ignored
      during decoding.

   Interferes with: No other BGP Flow Specification Traffic Filtering
   Action in this document.

7.6.  Interaction with other Filtering Mechanisms in Routers

   Implementations should provide mechanisms that map an arbitrary BGP
   community value (normal or extended) to Traffic Filtering Actions
   that require different mappings in different systems in the network.
   For instance, providing packets with a worse-than-best-effort
   per-hop behavior is a functionality that is likely to be implemented
   differently in different systems and for which no standard behavior
   is currently known. Rather than attempting to define it here, this
   can be accomplished by mapping a user-defined community value to
   platform-/network-specific behavior via user configuration.

7.7.  Considerations on Traffic Filtering Action Interference

   Since Traffic Filtering Actions are represented as BGP extended
   community values, Traffic Filtering Actions may interfere with each
   other (e.g., there may be more than one conflicting
   traffic-rate-bytes Traffic Filtering Action associated with a single
   Flow Specification). Traffic Filtering Action interference has no
   impact on BGP propagation of Flow Specifications (all communities are
   propagated according to policies).

   If a Flow Specification associated with interfering Traffic Filtering
   Actions is selected for packet forwarding, it is an implementation
   decision which of the interfering Traffic Filtering Actions are
   selected. Implementors of this specification SHOULD document the
   behaviour of their implementation in such cases.

   Operators are encouraged to make use of the BGP policy framework
   supported by their implementation in order to achieve a predictable
   behaviour (i.e., match, replace or delete communities on
   administrative boundaries). See also Section 12.

8.  Dissemination of Traffic Filtering in BGP/MPLS VPN Networks

   Provider-based Layer 3 VPN networks, such as the ones using a BGP/
   MPLS IP VPN [RFC4364] control plane, may have different traffic
   filtering requirements than Internet service providers.
   However, Internet service providers may also use those VPNs for
   scenarios like having the Internet routing table in a VRF, resulting
   in the same traffic filtering requirements as defined for the global
   routing table environment within this document. This document
   defines an additional BGP NLRI type (AFI=1, SAFI=134), which can be
   used to propagate Flow Specifications in a BGP/MPLS VPN environment.

   The NLRI format for this address family consists of a fixed-length
   Route Distinguisher field (8 octets) followed by the Flow
   Specification NLRI value (Section 4.2). The NLRI length field shall
   include both the 8 octets of the Route Distinguisher as well as the
   subsequent Flow Specification NLRI value. The resulting encoding is
   shown in Figure 7.

   +--------------------------------+
   | length (0xnn or 0xfn nn)       |
   +--------------------------------+
   | Route Distinguisher (8 octets) |
   +--------------------------------+
   | NLRI value (variable)          |
   +--------------------------------+

              Figure 7: Flow Specification NLRI for MPLS

   Propagation of this NLRI is controlled by matching Route Target
   extended communities associated with the BGP path advertisement with
   the VRF import policy, using the same mechanism as described in BGP/
   MPLS IP VPNs [RFC4364].

   Flow Specifications received via this NLRI apply only to traffic that
   belongs to the VRF(s) in which they are imported. By default,
   traffic received from a remote PE is switched via an MPLS forwarding
   decision and is not subject to filtering.

   Contrary to the behavior specified for the non-VPN NLRI, Flow
   Specifications are accepted by default, when received from remote PE
   routers.

   The validation procedure (Section 6) and Traffic Filtering Actions
   (Section 7) are the same as for IPv4.

9.
Traffic Monitoring

   Traffic filtering applications require monitoring and traffic
   statistics facilities. While this is an implementation-specific
   choice, implementations SHOULD provide:

   o  A mechanism to log the packet header of filtered traffic.

   o  A mechanism to count the number of matches for a given Flow
      Specification.

10.  Error Handling

   Error handling according to [RFC7606] and [RFC4760] applies to this
   specification.

   This document introduces Traffic Filtering Action Extended
   Communities. Malformed Traffic Filtering Action Extended Communities
   in the sense of [RFC7606] Section 7.14 are Extended Community values
   that cannot be decoded according to Section 7 of this document.

11.  IANA Considerations

   This section complies with [RFC7153].

11.1.  AFI/SAFI Definitions

   IANA maintains a registry entitled "SAFI Values". For the purpose of
   this work, IANA is requested to update the following SAFIs to read
   according to the table below (Note: This document obsoletes both
   RFC7674 and RFC5575 and all references to those documents should be
   deleted from the registry below):

   +-------+------------------------------------------+----------------+
   | Value | Name                                     | Reference      |
   +-------+------------------------------------------+----------------+
   | 133   | Dissemination of Flow Specification      | [this          |
   |       | rules                                    | document]      |
   | 134   | L3VPN Dissemination of Flow              | [this          |
   |       | Specification rules                      | document]      |
   +-------+------------------------------------------+----------------+

                     Table 3: Registry: SAFI Values

11.2.  Flow Component Definitions

   A Flow Specification consists of a sequence of flow components, which
   are identified by an 8-bit component type. IANA has created and
   maintains a registry entitled "Flow Spec Component Types". IANA is
   requested to update the reference for this registry to [this
   document].
   Furthermore the references to the values should be updated according
   to the table below (Note: This document obsoletes both RFC7674 and
   RFC5575 and all references to those documents should be deleted from
   the registry below).

   +-------+--------------------+-----------------+
   | Value | Name               | Reference       |
   +-------+--------------------+-----------------+
   | 1     | Destination Prefix | [this document] |
   | 2     | Source Prefix      | [this document] |
   | 3     | IP Protocol        | [this document] |
   | 4     | Port               | [this document] |
   | 5     | Destination port   | [this document] |
   | 6     | Source port        | [this document] |
   | 7     | ICMP type          | [this document] |
   | 8     | ICMP code          | [this document] |
   | 9     | TCP flags          | [this document] |
   | 10    | Packet length      | [this document] |
   | 11    | DSCP               | [this document] |
   | 12    | Fragment           | [this document] |
   +-------+--------------------+-----------------+

              Table 4: Registry: Flow Spec Component Types

   In order to manage the limited number space and accommodate several
   usages, the following policies defined by [RFC8126] are used:

   +--------------+-------------------------------+
   | Type Values  | Policy                        |
   +--------------+-------------------------------+
   | 0            | Reserved                      |
   | [1 .. 12]    | Defined by this specification |
   | [13 .. 127]  | Specification required        |
   | [128 .. 255] | First Come First Served       |
   +--------------+-------------------------------+

               Table 5: Flow Spec Component Types Policies

11.3.  Extended Community Flow Specification Actions

   The Extended Community Flow Specification Action types defined in
   this document consist of two parts:

      Type (BGP Transitive Extended Community Type)

      Sub-Type

   For the type-part, IANA maintains a registry entitled "BGP Transitive
   Extended Community Types".
   For the purpose of this work (Section 7), IANA is requested to update
   the references to the following entries according to the table below
   (Note: This document obsoletes both RFC7674 and RFC5575 and all
   references to those documents should be deleted in the registry
   below):

   +-------+-----------------------------------------------+-----------+
   | Type  | Name                                          | Reference |
   | Value |                                               |           |
   +-------+-----------------------------------------------+-----------+
   | 0x81  | Generic Transitive Experimental               | [this     |
   |       | Use Extended Community Part 2 (Sub-Types are  | document] |
   |       | defined in the "Generic Transitive            |           |
   |       | Experimental Use Extended Community Part 2    |           |
   |       | Sub-Types" Registry)                          |           |
   | 0x82  | Generic Transitive Experimental               | [this     |
   |       | Use Extended Community Part 3                 | document] |
   |       | (Sub-Types are defined in the "Generic        |           |
   |       | Transitive Experimental Use                   |           |
   |       | Extended Community Part 3 Sub-Types"          |           |
   |       | Registry)                                     |           |
   +-------+-----------------------------------------------+-----------+

       Table 6: Registry: BGP Transitive Extended Community Types

   For the sub-type part of the extended community Traffic Filtering
   Actions IANA maintains the following registries. IANA is requested
   to update all names and references according to the tables below and
   assign a new value for the "Flow spec traffic-rate-packets" Sub-Type
   (Note: This document obsoletes both RFC7674 and RFC5575 and all
   references to those documents should be deleted from the registries
   below).

   +----------+--------------------------------------------+-----------+
   | Sub-Type | Name                                       | Reference |
   | Value    |                                            |           |
   +----------+--------------------------------------------+-----------+
   | 0x06     | Flow spec traffic-rate-bytes               | [this     |
   |          |                                            | document] |
   | TBD      | Flow spec traffic-rate-packets             | [this     |
   |          |                                            | document] |
   | 0x07     | Flow spec traffic-action (Use              | [this     |
   |          | of the "Value" field is defined in the     | document] |
   |          | "Traffic Action Fields" registry)          |           |
   | 0x08     | Flow spec rt-redirect                      | [this     |
   |          | AS-2octet format                           | document] |
   | 0x09     | Flow spec traffic-remarking                | [this     |
   |          |                                            | document] |
   +----------+--------------------------------------------+-----------+

     Table 7: Registry: Generic Transitive Experimental Use Extended
                           Community Sub-Types

   +------------+----------------------------------------+-------------+
   | Sub-Type   | Name                                   | Reference   |
   | Value      |                                        |             |
   +------------+----------------------------------------+-------------+
   | 0x08       | Flow spec rt-redirect IPv4             | [this       |
   |            | format                                 | document]   |
   +------------+----------------------------------------+-------------+

     Table 8: Registry: Generic Transitive Experimental Use Extended
                       Community Part 2 Sub-Types

   +------------+-----------------------------------------+------------+
   | Sub-Type   | Name                                    | Reference  |
   | Value      |                                         |            |
   +------------+-----------------------------------------+------------+
   | 0x08       | Flow spec rt-redirect                   | [this      |
   |            | AS-4octet format                        | document]  |
   +------------+-----------------------------------------+------------+

     Table 9: Registry: Generic Transitive Experimental Use Extended
                       Community Part 3 Sub-Types

   Furthermore IANA is requested to update the reference for the
   registries "Generic Transitive Experimental Use Extended Community
   Part 2 Sub-Types" and "Generic Transitive Experimental Use Extended
   Community Part 3 Sub-Types" to [this document].

   The "traffic-action" extended community (Section 7.3) defined in this
   document has 46 unused bits, which can be used to convey additional
   meaning. IANA created and maintains a registry entitled "Traffic
   Action Fields". IANA is requested to update the reference for this
   registry to [this document]. Furthermore IANA is requested to update
   the references according to the table below. These values should be
   assigned via IETF Review rules only (Note: This document obsoletes
   both RFC7674 and RFC5575 and all references to those documents should
   be deleted from the registry below).

   +-----+-----------------+-----------------+
   | Bit | Name            | Reference       |
   +-----+-----------------+-----------------+
   | 47  | Terminal Action | [this document] |
   | 46  | Sample          | [this document] |
   +-----+-----------------+-----------------+

                Table 10: Registry: Traffic Action Fields

12.  Security Considerations

   As long as Flow Specifications are restricted to match the
   corresponding unicast routing paths for the relevant prefixes
   (Section 6), the security characteristics of this proposal are
   equivalent to the existing security properties of BGP unicast
   routing. Any relaxation of the validation procedure described in
   Section 6 may allow unwanted Flow Specifications to be propagated and
   thus unwanted Traffic Filtering Actions may be applied to flows.

   Where the above mechanisms are not in place, this could open the door
   to further denial-of-service attacks such as unwanted traffic
   filtering, remarking or redirection.

   Deployment of specific relaxations of the validation within an
   administrative boundary of a network, defined by an AS or an
   AS-Confederation boundary, may be useful in some networks for quickly
   distributing filters to prevent denial-of-service attacks.
   For a network to utilize this relaxation, the BGP policies must
   support additional filtering since the origin AS field is empty.
   Specifications relaxing the validation restrictions MUST contain
   security considerations that provide details on the required
   additional filtering. One example is the use of [RFC6811] to enhance
   filtering within an AS confederation.

   Inter-provider routing is based on a web of trust. Neighboring
   autonomous systems are trusted to advertise valid reachability
   information. If this trust model is violated, a neighboring
   autonomous system may cause a denial-of-service attack by advertising
   reachability information for a given prefix for which it does not
   provide service (unfiltered address space hijack). Since validation
   of the Flow Specification is tied to the announcement of the best
   unicast route, this may also cause this validation to fail and
   consequently prevent Flow Specifications from being accepted by a
   peer. Possible mitigations are [RFC6811] and [RFC8205].

   At IXPs, routes are often exchanged via route servers, which do not
   extend the AS_PATH. In such cases it is not possible to enforce the
   left-most AS in the AS_PATH to be the neighbor AS (the AS of the
   route server). Since the validation of Flow Specification
   (Section 6) depends on this, additional care must be taken. It is
   advised to use a strict inbound route policy in such scenarios.

   Enabling firewall-like capabilities in routers without centralized
   management could make certain failures harder to diagnose. For
   example, it is possible to allow TCP packets to pass between a pair
   of addresses but not ICMP packets. It is also possible to permit
   packets smaller than 900 or greater than 1000 octets to pass between
   a pair of addresses, but not packets whose length is in the range
   900-1000.
   Such behavior may be confusing and these capabilities should be used
   with care whether manually configured or coordinated through the
   protocol extensions described in this document.

   Flow Specification BGP speakers (e.g., automated DDoS controllers)
   not properly programmed, algorithms that are not performing as
   expected, or simply rogue systems may announce unintended Flow
   Specifications, send updates at a high rate or generate a high number
   of Flow Specifications. This may stress the receiving systems,
   exceed their maximum capacity or may lead to unwanted Traffic
   Filtering Actions being applied to flows.

   While the general verification of the Flow Specification NLRI is
   specified in this document (Section 6), the Traffic Filtering Actions
   received by a third party may need custom verification or filtering.
   In particular, all non traffic-rate actions may allow a third party
   to modify packet forwarding properties and potentially gain access to
   other routing-tables/VPNs or undesired queues. This can be avoided
   by proper filtering/screening of the Traffic Filtering Action
   communities at network borders and only exposing a predefined subset
   of Traffic Filtering Actions (see Section 7) to third parties. One
   way to achieve this is by mapping user-defined communities, which can
   be set by the third party, to Traffic Filtering Actions and not
   accepting Traffic Filtering Action extended communities from third
   parties.

   This extension adds additional information to Internet routers.
   These are limited in terms of the maximum number of data elements
   they can hold as well as the number of events they are able to
   process in a given unit of time. Service providers need to consider
   the maximum capacity of their devices and may need to limit the
   number of Flow Specifications accepted and processed.

13.
Contributors 1413 Barry Greene, Pedro Marques, Jared Mauch and Nischal Sheth were 1414 authors on [RFC5575], and therefore are contributing authors on this 1415 document. 1417 14. Acknowledgements 1419 The authors would like to thank Yakov Rekhter, Dennis Ferguson, Chris 1420 Morrow, Charlie Kaufman, and David Smith for their 1421 comments on the original [RFC5575]. Chaitanya Kodeboyina helped 1422 design the flow validation procedure; and Steven Lin and Jim Washburn 1423 ironed out all the details necessary to produce a working 1424 implementation in the original [RFC5575]. 1426 A packet-rate Traffic Filtering Action was also described in a Flow 1427 Specification extension draft, and the authors would like to thank Wesley 1428 Eddy, Justin Dailey and Gilbert Clark for their work. 1430 Additionally, the authors would like to thank Alexander Mayrhofer, 1431 Nicolas Fevrier, Job Snijders, Jeffrey Haas and Adam Chappell for 1432 their comments and review. 1434 15. References 1436 15.1. Normative References 1438 [IEEE.754.1985] 1439 IEEE, "Standard for Binary Floating-Point Arithmetic", 1440 IEEE 754-1985, August 1985. 1442 [ISO_IEC_9899] 1443 ISO, "Information technology -- Programming languages -- 1444 C", ISO/IEC 9899:2018, June 2018. 1446 [RFC0768] Postel, J., "User Datagram Protocol", STD 6, RFC 768, 1447 DOI 10.17487/RFC0768, August 1980, 1448 . 1450 [RFC0791] Postel, J., "Internet Protocol", STD 5, RFC 791, 1451 DOI 10.17487/RFC0791, September 1981, 1452 . 1454 [RFC0792] Postel, J., "Internet Control Message Protocol", STD 5, 1455 RFC 792, DOI 10.17487/RFC0792, September 1981, 1456 . 1458 [RFC0793] Postel, J., "Transmission Control Protocol", STD 7, 1459 RFC 793, DOI 10.17487/RFC0793, September 1981, 1460 . 1462 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 1463 Requirement Levels", BCP 14, RFC 2119, 1464 DOI 10.17487/RFC2119, March 1997, 1465 . 1467 [RFC2474] Nichols, K., Blake, S., Baker, F., and D.
Black, 1468 "Definition of the Differentiated Services Field (DS 1469 Field) in the IPv4 and IPv6 Headers", RFC 2474, 1470 DOI 10.17487/RFC2474, December 1998, 1471 . 1473 [RFC4271] Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A 1474 Border Gateway Protocol 4 (BGP-4)", RFC 4271, 1475 DOI 10.17487/RFC4271, January 2006, 1476 . 1478 [RFC4360] Sangli, S., Tappan, D., and Y. Rekhter, "BGP Extended 1479 Communities Attribute", RFC 4360, DOI 10.17487/RFC4360, 1480 February 2006, . 1482 [RFC4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private 1483 Networks (VPNs)", RFC 4364, DOI 10.17487/RFC4364, February 1484 2006, . 1486 [RFC4456] Bates, T., Chen, E., and R. Chandra, "BGP Route 1487 Reflection: An Alternative to Full Mesh Internal BGP 1488 (IBGP)", RFC 4456, DOI 10.17487/RFC4456, April 2006, 1489 . 1491 [RFC4760] Bates, T., Chandra, R., Katz, D., and Y. Rekhter, 1492 "Multiprotocol Extensions for BGP-4", RFC 4760, 1493 DOI 10.17487/RFC4760, January 2007, 1494 . 1496 [RFC5668] Rekhter, Y., Sangli, S., and D. Tappan, "4-Octet AS 1497 Specific BGP Extended Community", RFC 5668, 1498 DOI 10.17487/RFC5668, October 2009, 1499 . 1501 [RFC7153] Rosen, E. and Y. Rekhter, "IANA Registries for BGP 1502 Extended Communities", RFC 7153, DOI 10.17487/RFC7153, 1503 March 2014, . 1505 [RFC7606] Chen, E., Ed., Scudder, J., Ed., Mohapatra, P., and K. 1506 Patel, "Revised Error Handling for BGP UPDATE Messages", 1507 RFC 7606, DOI 10.17487/RFC7606, August 2015, 1508 . 1510 [RFC8126] Cotton, M., Leiba, B., and T. Narten, "Guidelines for 1511 Writing an IANA Considerations Section in RFCs", BCP 26, 1512 RFC 8126, DOI 10.17487/RFC8126, June 2017, 1513 . 1515 [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 1516 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, 1517 May 2017, . 1519 15.2. Informative References 1521 [I-D.ietf-idr-flow-spec-v6] 1522 Loibl, C., Raszuk, R., and S. 
Hares, "Dissemination of 1523 Flow Specification Rules for IPv6", draft-ietf-idr-flow- 1524 spec-v6-10 (work in progress), November 2019. 1526 [RFC4303] Kent, S., "IP Encapsulating Security Payload (ESP)", 1527 RFC 4303, DOI 10.17487/RFC4303, December 2005, 1528 . 1530 [RFC5575] Marques, P., Sheth, N., Raszuk, R., Greene, B., Mauch, J., 1531 and D. McPherson, "Dissemination of Flow Specification 1532 Rules", RFC 5575, DOI 10.17487/RFC5575, August 2009, 1533 . 1535 [RFC6811] Mohapatra, P., Scudder, J., Ward, D., Bush, R., and R. 1536 Austein, "BGP Prefix Origin Validation", RFC 6811, 1537 DOI 10.17487/RFC6811, January 2013, 1538 . 1540 [RFC7674] Haas, J., Ed., "Clarification of the Flowspec Redirect 1541 Extended Community", RFC 7674, DOI 10.17487/RFC7674, 1542 October 2015, . 1544 [RFC8205] Lepinski, M., Ed. and K. Sriram, Ed., "BGPsec Protocol 1545 Specification", RFC 8205, DOI 10.17487/RFC8205, September 1546 2017, . 1548 15.3. URIs 1550 [1] https://github.com/stoffi92/rfc5575bis/tree/master/flowspec-cmp 1552 Appendix A. Python code: flow_rule_cmp 1554 1555 """ 1556 Copyright (c) 2020 IETF Trust and the persons identified as authors of 1557 the code. All rights reserved. 1559 Redistribution and use in source and binary forms, with or without 1560 modification, is permitted pursuant to, and subject to the license 1561 terms contained in, the Simplified BSD License set forth in Section 1562 4.c of the IETF Trust's Legal Provisions Relating to IETF Documents 1563 (http://trustee.ietf.org/license-info). 
"""

import itertools
import ipaddress

# Return values; the concrete numbers are an assumption of this
# listing, any three distinct values would do.
EQUAL = 0
A_HAS_PRECEDENCE = 1
B_HAS_PRECEDENCE = 2

# Flow Specification component types: Type 1 (destination prefix)
# and Type 2 (source prefix)
IP_DESTINATION = 1
IP_SOURCE = 2


def flow_rule_cmp(a, b):
    for comp_a, comp_b in itertools.zip_longest(a.components,
                                                b.components):
        # If a component type does not exist in one rule
        # this rule has lower precedence
        if not comp_a:
            return B_HAS_PRECEDENCE
        if not comp_b:
            return A_HAS_PRECEDENCE
        # higher precedence for lower component type
        if comp_a.component_type < comp_b.component_type:
            return A_HAS_PRECEDENCE
        if comp_a.component_type > comp_b.component_type:
            return B_HAS_PRECEDENCE
        # component types are equal -> type specific comparison
        if comp_a.component_type in (IP_DESTINATION, IP_SOURCE):
            # assuming comp_a.value, comp_b.value of type
            # ipaddress.IPv4Network
            if comp_a.value.overlaps(comp_b.value):
                # longest prefixlen has precedence
                if comp_a.value.prefixlen > comp_b.value.prefixlen:
                    return A_HAS_PRECEDENCE
                if comp_a.value.prefixlen < comp_b.value.prefixlen:
                    return B_HAS_PRECEDENCE
                # components equal -> continue with next component
            elif comp_a.value > comp_b.value:
                return B_HAS_PRECEDENCE
            elif comp_a.value < comp_b.value:
                return A_HAS_PRECEDENCE
        else:
            # assuming comp_a.value, comp_b.value of type bytearray
            if len(comp_a.value) == len(comp_b.value):
                if comp_a.value > comp_b.value:
                    return B_HAS_PRECEDENCE
                if comp_a.value < comp_b.value:
                    return A_HAS_PRECEDENCE
                # components equal -> continue with next component
            else:
                common = min(len(comp_a.value), len(comp_b.value))
                if comp_a.value[:common] > comp_b.value[:common]:
                    return B_HAS_PRECEDENCE
                elif comp_a.value[:common] < comp_b.value[:common]:
                    return A_HAS_PRECEDENCE
                # the first common bytes match
                elif len(comp_a.value) > len(comp_b.value):
                    return A_HAS_PRECEDENCE
                else:
                    return B_HAS_PRECEDENCE
    return EQUAL

1621 Appendix B.
Comparison with RFC 5575 1623 This document includes numerous editorial changes to [RFC5575]. It 1624 also completely incorporates the redirect action clarification 1625 document [RFC7674]. It is recommended to read the entire document. 1626 The authors, however, want to point out the following technical 1627 changes to [RFC5575]: 1629 Section 1 introduces the Flow Specification NLRI. In [RFC5575], 1630 this NLRI was defined as an opaque-key in BGP's database. This 1631 specification has removed all references to an opaque-key property. 1632 BGP is able to understand the NLRI encoding. 1634 Section 4.2.1.1 defines a numeric operator and comparison bit 1635 combinations. In [RFC5575], the meaning of those bit combinations 1636 was not explicitly defined and was left open to the reader. 1638 Section 4.2.2.3 - Section 4.2.2.8, Section 4.2.2.10, and 1639 Section 4.2.2.11 make use of the above numeric operator. The 1640 allowed length of the comparison value was not consistently 1641 defined in [RFC5575]. 1643 Section 7 defines all Traffic Filtering Action Extended 1644 communities as transitive extended communities. [RFC5575] defined 1645 the traffic-rate action to be non-transitive and did not define 1646 the transitivity of the other Traffic Filtering Action communities 1647 at all. 1649 Section 7.2 introduces a new Traffic Filtering Action 1650 (traffic-rate-packets). This action did not exist in [RFC5575]. 1652 Section 7.4 contains the same redirect actions already defined in 1653 [RFC5575]; however, these actions have been renamed to 1654 "rt-redirect" to make it clearer that the redirection is based on 1655 route-target. This section also completely incorporates the 1656 [RFC7674] clarifications of the Flowspec Redirect Extended 1657 Community. 1659 Section 7.7 contains general considerations on interfering traffic 1660 actions. Section 7.3 also cross-references this section. 1661 [RFC5575] did not mention this. 1663 Section 10 contains new error handling.
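
As a non-normative illustration of the Section 4.2.1.1 item in the list above, the following Python sketch decodes one numeric operator octet and evaluates its lt/gt/eq bits against a value. The names NumericOp, decode_numeric_op and compare are hypothetical and not part of this specification. The bit layout given in the comments is an assumption of this sketch and should be checked against Section 4.2.1.1.

```python
import collections

# Hypothetical container for the fields of one numeric operator octet.
NumericOp = collections.namedtuple(
    "NumericOp", "end_of_list and_bit value_len lt gt eq")


def decode_numeric_op(octet):
    """Split a numeric operator octet into its fields.

    Assumed layout, most significant bit first: e, a, len (2 bits),
    one reserved zero bit, lt, gt, eq.
    """
    return NumericOp(
        end_of_list=bool(octet & 0x80),
        and_bit=bool(octet & 0x40),
        value_len=1 << ((octet >> 4) & 0x03),  # value length in octets
        lt=bool(octet & 0x04),
        gt=bool(octet & 0x02),
        eq=bool(octet & 0x01),
    )


def compare(op, data, value):
    """Evaluate one operator/value pair against a packet field.

    With no bit set the result is always false; with all three bits
    set it is always true, independent of the value.
    """
    return ((op.lt and data < value) or
            (op.gt and data > value) or
            (op.eq and data == value))
```

For instance, decode_numeric_op(0x93) describes a two-octet value with the gt and eq bits set, so compare() behaves as a greater-than-or-equal match. Making every one of the eight lt/gt/eq combinations well defined is the point of the Section 4.2.1.1 change.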
1665 Authors' Addresses 1667 Christoph Loibl 1668 next layer Telekom GmbH 1669 Mariahilfer Guertel 37/7 1670 Vienna 1150 1671 AT 1673 Phone: +43 664 1176414 1674 Email: cl@tix.at 1676 Susan Hares 1677 Huawei 1678 7453 Hickory Hill 1679 Saline, MI 48176 1680 USA 1682 Email: shares@ndzh.com 1683 Robert Raszuk 1684 Bloomberg LP 1685 731 Lexington Ave 1686 New York City, NY 10022 1687 USA 1689 Email: robert@raszuk.net 1691 Danny McPherson 1692 Verisign 1693 USA 1695 Email: dmcpherson@verisign.com 1697 Martin Bacher 1698 T-Mobile Austria 1699 Rennweg 97-99 1700 Vienna 1030 1701 AT 1703 Email: mb.ietf@gmail.com