Internet Draft                                              L. Yang
Expiration: Dec 2003                                     Intel Labs
File: draft-yang-forces-model-02.txt                     J. Halpern
Working Group: ForCES
                                                                 R.
                                                              Gopal
                                                              Nokia
                                                           A. DeKok
                                                           IDT Inc.
                                                          June 2003

             ForCES Forwarding Element Functional Model

                   draft-yang-forces-model-02.txt

Status of this Memo

This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026. Internet-Drafts are
working documents of the Internet Engineering Task Force (IETF),
its areas, and its working groups. Note that other groups may also
distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other
documents at any time. It is inappropriate to use Internet-Drafts
as reference material or to cite them other than as ``work in
progress.''

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.

Abstract

This document defines a functional model for forwarding elements
(FEs) used in the Forwarding and Control Plane Separation (ForCES)
protocol. This model is used to describe the capabilities,
capacities, state and configuration of ForCES forwarding elements
within the context of the ForCES protocol, so that ForCES control
elements (CEs) can control the FEs accordingly. The model
specifies what logical functions are present in the FEs, what
capabilities these functions support, and how these functions are
or can be interconnected. The forwarding element model defined
herein is intended to satisfy the requirements specified in the
ForCES requirements draft [FORCES-REQ]. Using this model,
predefined or vendor specific logical functions can be expressed
and configured. However, the definitions of these individual
functions are not given in this document.

Table of Contents

   Abstract
   1.
   Definitions
   2. Motivation and Requirements of FE model
   3. State Model versus Capability Model
   4. FE Model Concepts: FE Block and FE Block Topology
      4.1. FE Blocks
      4.2. FE Block Topology
         4.2.1. Configuring FE Block Topology
         4.2.2. Modeling FE Block Topology
   5. Logical FE Block Library
      5.1. FE Input/Output Block Characterization
         5.1.1. Source Block
         5.1.2. Sink Block
         5.1.3. Port Block
         5.1.4. Dropper Block
         5.1.5. MUX Block
         5.1.6. Redirector (de-MUX) Block
         5.1.7. Shaper Block
      5.2. FE Processing Blocks
         5.2.1. Counter Block
         5.2.2. Meter Block
         5.2.3. Filter Block
         5.2.4. Classifier Block
         5.2.5. Redirecting Classifier Block
         5.2.6. Modifier Block
         5.2.7. Packet Header Rewriter Block
         5.2.8. Packet Compression/Decompression Block
         5.2.9. Packet Encryption/Decryption Block
         5.2.10. Packet Encapsulation/Decapsulation Block
   6. Minimal Set of Logical Functions Required for FE Model
      6.1. QoS Functions
         6.1.1.
         Classifier
         6.1.2. Meter
         6.1.3. Marker
         6.1.4. Dropper
         6.1.5. Counter
         6.1.6. Queue and Scheduler (?)
         6.1.7. Shaper
      6.2. Generic Filtering Functions
      6.3. Vendor Specific Functions
      6.4. Port Functions
      6.5. Forwarding Functions
      6.6. High-Touch Functions
      6.7. Security Functions
      6.8. Off-loaded Functions
   7. Cascading Multiple FEs
   8. Data Modeling and Representation
   9. Security Considerations
   10. Intellectual Property Right
   11. IANA consideration
   12. Normative References
   13. Informative References
   14. Acknowledgments
   15. Authors' Addresses

Conventions used in this document

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in
this document are to be interpreted as described in [RFC-2119].

1. Definitions

A set of terminology associated with the ForCES requirements is
defined in [FORCES-REQ] and is not copied here.
The following list of terminology is relevant to the FE model
defined in this document.

Datapath -- A conceptual path taken by packets within the
forwarding plane, inside an FE. There might exist more than one
datapath within an FE.

Forwarding Element (FE) Block -- An abstraction of the basic packet
processing logical functions in the datapath. It is the building
block of FE functionality. This concept abstracts away
implementation details from the parameters of interest for
configuration, control and management by the CE.

Forwarding Element (FE) Stage -- Representation of an FE block
instance in an FE's datapath. As a packet flows through an FE along
a datapath, it flows through one or multiple distinct stages, with
each stage implementing an instance of a certain logical function
block. There may be multiple instances of the same functional
block in an FE's datapath.

FE Topology -- Representation of how the multiple FEs in a single
NE are interconnected. Sometimes it is called inter-FE topology,
to be distinguished from intra-FE (block) topology.

FE Block Topology -- Representation of how the FE stages are
interconnected and placed along the datapath within one FE.
Sometimes it is also called intra-FE topology, to be distinguished
from inter-FE topology.

Inter-FE Topology -- See FE Topology.

Intra-FE Topology -- See FE Block Topology.

2. Motivation and Requirements of FE model

The ForCES architecture allows Forwarding Elements (FEs) of varying
functionality to participate in a ForCES network element (NE). The
implication of this varying functionality is that CEs can make only
minimal assumptions about the functionality provided by their FEs.
Before CEs can configure and control the forwarding behavior of
FEs, CEs need to query and discover the capabilities and states of
their FEs.
[FORCES-REQ] mandates that this capability and state
information be expressed in the form of an FE model, and this model
will be used as the basis for CEs to control and manipulate FEs'
behavior via the ForCES protocol.

[FORCES-REQ] describes all the requirements placed on the FE model
in detail. We provide a brief summary here to highlight some of the
design issues we face.
 - The FE model MUST express what logical functions can be
   applied to packets as they pass through an FE.
 - The FE model MUST be capable of supporting/allowing variations
   in the way logical functions are implemented on an FE.
 - The model MUST be capable of describing the order in which
   these logical functions are applied in an FE.
 - The FE model SHOULD be extendable and should have provision to
   express new or vendor specific logical functions.
 - The FE model SHOULD be able to support a minimal set of logical
   functions that are already identified, such as port functions,
   forwarding functions, QoS functions, filtering functions,
   high-touch functions, security functions, vendor-specific
   functions and off-loaded functions.

3. State Model versus Capability Model

Since the motivation of an FE model is to allow the CEs later to
control and configure the FEs' behavior via the ForCES protocol, it
becomes essential to examine and understand what kind of control
and configuration the CEs might do to the FEs. It is equally
essential to understand how configurable or programmable FEs are
today and will be in the near future.

To understand the issue better, it is helpful to make a distinction
between two different kinds of FE models: an FE state model and an
FE capability model.

An FE state model describes the current state of the FE, that is,
the instantaneous values or operational behavior of the FE. The FE
state model presents the snapshot view of the FE to the CE.
For example, using an FE state model, an FE may be described to
its CE as the following:
 - on a given port the packets are classified using a given
   classification filter;
 - the given classifier results in packets being metered in a
   certain way, and then marked in a certain way;
 - the packets coming from specific markers are delivered into a
   shared queue for handling, while other packets are delivered to
   a different queue;
 - a specific scheduler with specific behavior and parameters will
   service these collected queues.

On the other hand, the FE capability model describes the
configurable capabilities and capacities of an FE in terms of
variations of functions supported or limitations contained.
Conceptually, the FE capability model presents the many possible
states allowed on an FE, with capacity information indicating
certain quantitative limits or constraints. For example, an FE
capability model may describe the FE at a coarse level such as:
 - this FE can handle IPv4 and IPv6 forwarding;
 - this FE can perform classification on the following fields:
   source IP address, destination IP address, source port number,
   destination port number, etc;
 - this FE can perform metering;
 - this FE can handle up to N queues (capacity);
 - this FE can add and remove encapsulating headers of types
   including IPSec, GRE, L2TP.

The information on the capabilities and capacities of the FE helps
the CE understand the flexibility of the FE functions. It is more
complicated for the capability model to cope with detailed limits,
such as how many classifiers, queues, buffer pools and meters the
FE can support.

While one could try to build an object model for representing
capabilities in full, other efforts have found this to be a
significant undertaking.
A middle-of-the-road approach is to define coarse-grained
capabilities and simple capacity measures. Then, if the CE attempts
to instruct the FE to set up some specific behavior it is not
capable of, the FE will return an error indicating the problem.
Such an approach is taken by RFC3318 in defining a set of
Provisioning Classes (PRCs) for the Framework Policy Information
Base (PIB). For example, in Section 4.1 of RFC3318, a "Component
Limitations Table" is described so that "the PEP can report some
limitations of attribute values and/or classes and possibly
guidance values for the attribute". A similar approach is also
taken in the Differentiated Services QoS Policy Information Base
[RFC3317]. The DiffServ QoS PIB includes capability reporting
classes for individual devices, like classification capabilities,
metering capabilities, etc. Two additional classes are also defined
to allow specification of the element linkage capabilities of the
PEP: the dsIfElmDepthCaps PRC indicates the maximum number of
functional datapath elements that can be linked consecutively in a
datapath, while the dsIfElmLinkCaps PRC indicates what functional
datapath elements may follow a specific type of element in a
datapath. Such capability reporting classes in the DiffServ and
Framework PIB are all meant to allow the PEP to indicate some
general guidelines about what the device can do. They are intended
to be an aid to the PDP when it constructs policy for the PEP.
These classes do not necessarily allow the PEP to indicate every
possible configuration that it can or cannot support. If a PEP
receives a policy that it cannot implement, it must notify the PDP
with a failure report.

Figure 1 shows the concepts of FE state, capabilities, capacities
and configuration in the context of CE-FE communication via the
ForCES protocol.
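
The middle-of-the-road approach described above (coarse capability
and capacity reporting, backed by an error report when a request
cannot be met) can be pictured with a short Python sketch. It is
illustrative only and not part of the model: the class, method and
function names ("ForwardingElement", "configure", "queue", and so on)
are assumptions made for the example, not ForCES protocol elements.

```python
from dataclasses import dataclass, field


class ConfigError(Exception):
    """Error report returned to the CE when a request cannot be met."""


@dataclass
class ForwardingElement:
    # Coarse-grained capabilities: names of the functions the FE supports.
    capabilities: set
    # Simple capacity measures: maximum instance count per function.
    capacities: dict
    # Current state: number of configured instances per function.
    state: dict = field(default_factory=dict)

    def report_capabilities(self):
        """What the FE can be (capability model)."""
        return {"capabilities": sorted(self.capabilities),
                "capacities": dict(self.capacities)}

    def report_state(self):
        """What the FE is now (state model)."""
        return dict(self.state)

    def configure(self, function, count):
        """What the FE should be; requests it cannot satisfy are rejected."""
        if function not in self.capabilities:
            raise ConfigError(f"unsupported function: {function}")
        limit = self.capacities.get(function)
        if limit is not None and count > limit:
            raise ConfigError(f"{function}: requested {count}, limit {limit}")
        self.state[function] = count


# Hypothetical FE: three supported functions and a 16-queue capacity.
fe = ForwardingElement(capabilities={"ipv4-forwarding", "metering", "queue"},
                       capacities={"queue": 16})
fe.configure("queue", 8)           # accepted: within capacity
try:
    fe.configure("queue", 32)      # rejected: exceeds the reported capacity
except ConfigError as err:
    failure = str(err)
```

The CE in this sketch does not need a complete capability model: it
consults the coarse report when it can and otherwise relies on the
error returned by the FE.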

It is clear to us that in the context of ForCES, a state model is
definitely necessary but not sufficient. A simple state model
without any capability flavor will severely limit ForCES's ability
to take advantage of the flexibility offered by programmable FEs.
The question is how much of the capability model is needed in
addition to the state model. As we discussed previously, a
detailed capability model is difficult to develop and may impose
unnecessary overhead for those FEs that do not have much
flexibility in their capability. We believe that a good balance
between simplicity and flexibility can be achieved for the ForCES
FE model by taking an approach similar to that of the DiffServ PIB
[RFC3317] and the Framework PIB [RFC3318], that is, by combining
coarse-level capability reporting, for both the individual FE
functions and the linkage constraints, with an error reporting
mechanism.

   +-------+                                          +-------+
   |       | FE capabilities/capacity: what it can be. |       |
   |       |<-----------------------------------------|       |
   |       |                                          |       |
   |  CE   | FE state: what it is now.                |  FE   |
   |       |<-----------------------------------------|       |
   |       |                                          |       |
   |       | FE configuration: what it should be.     |       |
   |       |----------------------------------------->|       |
   +-------+                                          +-------+

Figure 1. Illustration of FE state, capabilities, capacities and
configuration in the context of CE-FE communication via ForCES.

4. FE Model Concepts: FE Block and FE Block Topology

Conceptually, the FE model presents two levels of information about
the FE. At the first level are the individual FE functions. We
call these individual FE functions FE blocks. The second level of
information that the FE model should present is about how these
individual functions are ordered and placed along the datapath to
deliver a complete forwarding plane service.
The interconnection of the FE functions is called "FE block
topology".

4.1. FE Blocks

A new term, "FE Functional Block", is used to refer to the
individual FE functions that constitute the very basic units for FE
models. Each FE functional block performs a well-defined action or
computation on the packets passing through it. Upon completion of
such a function, either the packets are modified in certain ways
(like decapsulator, marker), or some results are generated and
stored, probably in the form of meta-data (like classifier). Each
FE Block typically does one thing and one thing only. Classifiers,
shapers and meters are all examples of FE blocks. Modeling FE
blocks at such fine granularity allows us to use a small number of
FE blocks to create the higher-order FE functions (like IPv4
forwarder) precisely, which in turn can describe more complex
networking functions and vendor implementations of software and
hardware.

                   +----------+
                   |    CE    |
                   +----------+
                      |    ^
                      |    |
                      v    |
                   +----------+
       Inputs ---> | FE Block | ---> Outputs
        (P,M)      |          |      (P',M')
                   |    S     |
                   +----------+

Figure 2. Generic FE Block Layout

An FE Block has inputs, outputs, and a connection to and from the
CE, as shown in Figure 2. The horizontal axis is in the forwarding
plane, and the vertical axis denotes interaction between the
forwarding and control planes. An FE block contains internal state
S, composed of one or both of the following: CE->FE configuration,
and data created and managed by the FE itself. An FE Block also has
one or more inputs, each of which takes a packet P, and optionally
metadata M; and produces one or more outputs, each of which carries
a packet P', and optionally metadata M'.

Meta-data is data which is associated with the packet in the
network processing device (router, switch, etc), but which is not
sent across the network.
CE to FE communication is for configuration, control and packet
injection, while FE to CE communication is for packet re-direction
to the control plane, RMON, accounting information, errors, etc.

The FE model defines a generic FE block akin to an abstract base
class in object-oriented terminology. The generic FE block contains
basic information like block type and textual description of the
block function. A namespace is used to associate a unique name or
ID with each type of FE block. The namespace must be extensible so
that new logical functions can also be added later to accommodate
future innovation in the forwarding plane.

Based on this generic FE block, each FE logical function is defined
with additional state and capability information pertinent to each
specific function. Typically it is important to specify
information such as:
 - how many inputs it takes and what kinds of packets and meta data
   it takes for each input;
 - how many outputs it produces and what kind of packets and meta
   data it emits for each output;
 - the packet processing (such as modification) behavior;
 - what information is programmed into it (e.g., LPM list, next hop
   list, WRED parameters, etc.) and what parameters among them are
   configurable;
 - what statistics it keeps (e.g., drop count, CRC error count,
   etc.);
 - what events it can throw (e.g., table miss, port down, etc.).
These parameters are further described in Section 5, below.

4.2. FE Block Topology

Packets coming into the FE from ingress ports generally flow
through multiple functional blocks before leaving through the
egress ports. Different packets (or packets from different flows)
may take different datapaths inside the same FE and hence traverse
different sequences of FE blocks. Such interconnection of the FE
blocks as traversed by the packets is referred to as FE block
topology.
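
The generic FE block of Section 4.1, and the way packets traverse a
sequence of connected blocks, can be pictured with the following
Python sketch. It is illustrative only and not part of the model:
the class names, the process/forward methods, the "match_proto"
parameter and the dictionary-based packets are all assumptions made
for the example.

```python
from abc import ABC, abstractmethod


class FEBlock(ABC):
    """Generic FE block: block type, description and internal state S."""

    block_type = "generic"
    description = "abstract FE block"

    def __init__(self, **config):
        self.state = dict(config)      # CE -> FE configuration
        self.stats = {"packets": 0}    # data created and managed by the FE
        self.next_blocks = {}          # output index -> downstream block

    def connect(self, output, block):
        """Link one output of this block to the input of another block."""
        self.next_blocks[output] = block

    @abstractmethod
    def process(self, packet, metadata):
        """Take (P, M) and return (output index, P', M')."""

    def forward(self, packet, metadata):
        """Run this block, then hand the result to the connected block."""
        self.stats["packets"] += 1
        output, packet, metadata = self.process(packet, metadata)
        nxt = self.next_blocks.get(output)
        return nxt.forward(packet, metadata) if nxt else (packet, metadata)


class Classifier(FEBlock):
    block_type = "classifier"
    description = "records a class in the metadata; packet is unchanged"

    def process(self, packet, metadata):
        matched = packet.get("proto") == self.state["match_proto"]
        cls = "ip-in-ip" if matched else "other"
        return (0 if matched else 1), packet, {**metadata, "class": cls}


class Decapsulator(FEBlock):
    block_type = "decapsulator"
    description = "strips the outer header; packet is modified"

    def process(self, packet, metadata):
        return 0, packet["inner"], metadata


classifier = Classifier(match_proto=4)   # IP protocol number 4 is IP-in-IP
decap = Decapsulator()
classifier.connect(0, decap)             # output 0: IP-in-IP packets

tunneled = {"proto": 4, "inner": {"proto": 6}}
out_packet, out_meta = classifier.forward(tunneled, {})
plain_packet, plain_meta = classifier.forward({"proto": 6}, {})
```

The two packets take different datapaths through the same blocks:
the tunneled packet passes through the decapsulator, while the plain
packet leaves on the classifier's other output.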

It is important to point out that the FE block topology here is the
logical topology that the packets flow through, not the physical
topology as determined by how the FE hardware is laid out. Figure
3(a) shows part of the block topology of one simple FE example.
Three ingress ports are present in the FE and these ports may be of
different types with different characteristics. If we model a
single ingress port function as an FE block, clearly we need a way
to model multiple instances of one FE block, with each instance
having a separate set of parameters to allow independent
configuration.

                    +------------------------------------------+
                    |                                          |
     +-----------+  |  +-----------+            +--------+     |
     |           |  v  |           |if IP-in-IP |        |     |
---->|  ingress  |---->|classifier |----------->| Decap. |---->+
     |   ports   |     |           |----+       |        |
     +-----------+     +-----------+    |others +--------+
                                        |
                                        V
   (a) The FE block topology example with a logical loop

                      instance tables
                     =================
   ingress port         classifier           Decapsulator
+---+--------+---+   +---+--------+---+   +---+-----------+
|id |IP Addr |...|   |id |#filters|...|   |id |    ...    |
+---+--------+---+   +---+--------+---+   +---+-----------+
|1  |x.x.x.x |...|   |1  |10      |...|   |1  |    ...    |
+---+--------+---+   +---+--------+---+   +---+-----------+
|2  |x.x.x.x |...|   |2  |10      |...|
+---+--------+---+   +---+--------+---+
|3  |x.x.x.x |...|
+---+--------+---+

   (b) The block instance tables used for such an FE block
       topology

    +-------+   +-----------+            +------+   +-----------+
    |       |   |           |if IP-in-IP |      |   |           |
--->|ingress|-->|classifier1|----------->|Decap.|-->|classifier2|->
    | ports |   |           |----+       |      |   |           |
    +-------+   +-----------+    |others +------+   +-----------+
                                 |
                                 V
   (c) The FE block topology equivalent of (a) without the loop

Figure 3. An FE block topology example with block instance
tables.

Figure 3(a) also shows that it is possible for a packet to flow
through a certain function more than once and hence create a
logical loop in the FE block topology. For example, an IP-in-IP
packet from an IPSec application like VPN may go to the classifier
first and have the classification done based on the outer IP
header; upon being classified as an IP-in-IP packet, the packet is
then sent to a decapsulator to strip off the outer IP header,
followed by the classifier again to perform classification on the
inner IP header. It is clear from Figure 3(a) that such a logical
loop is sometimes necessary and must be properly modeled in the FE
block topology.

To represent the FE block instances, we define an "FE block
instance table" associated with each FE block; each row of the
table corresponds to one instance of the block. An instance ID is
needed to distinguish different instances of one block. Multiple
instances of the same block can be configured independently with
different parameters. Figure 3(b) shows the FE block instance
tables for the FE block topology in (a). The instance table of the
ingress ports has 3 rows because there are 3 ingress ports. The
classifier block has two rows, one corresponding to the classifier
instance after the ingress port, and the other corresponding to
the instance after the decapsulator. The decapsulator has only
one row in its instance table since only one instance of
decapsulator is used. Each row in the instance table has its own
parameters and so each instance can be configured independently.

A way to model the logical loop to the classifier in Figure 3(a) is
to treat it as if there are two different instances of classifier,
as shown in Figure 3(c).
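
The block instance tables of Figure 3(b) and the unrolled topology
of Figure 3(c) can be pictured as plain data structures. The Python
sketch below is illustrative only: the table layout, the parameter
names ("ip_addr", "num_filters") and the helper functions are
assumptions made for the example, and the link from every ingress
port instance to the first classifier instance is likewise assumed.

```python
# Block instance tables: block type -> {instance id -> parameters}.
# Parameter names and values are placeholders.
instance_tables = {
    "ingress_port": {1: {"ip_addr": "x.x.x.x"},
                     2: {"ip_addr": "x.x.x.x"},
                     3: {"ip_addr": "x.x.x.x"}},
    "classifier": {1: {"num_filters": 10}, 2: {"num_filters": 10}},
    "decapsulator": {1: {}},
}

# Directed links between stages; a stage is (block type, instance id).
# The logical loop of Figure 3(a) is unrolled as in Figure 3(c).
links = [
    (("ingress_port", 1), ("classifier", 1)),
    (("ingress_port", 2), ("classifier", 1)),
    (("ingress_port", 3), ("classifier", 1)),
    (("classifier", 1), ("decapsulator", 1)),   # if IP-in-IP
    (("decapsulator", 1), ("classifier", 2)),   # classify on inner header
]


def undefined_stages(tables, edges):
    """Return stages used in the topology but missing from the tables."""
    used = {stage for edge in edges for stage in edge}
    return sorted(s for s in used if s[1] not in tables.get(s[0], {}))


def has_cycle(edges):
    """Depth-first search for a cycle among stage instances."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)
    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True                 # reached a stage already on the path
        visiting.add(node)
        found = any(visit(n) for n in graph.get(node, []))
        visiting.discard(node)
        done.add(node)
        return found

    return any(visit(node) for node in list(graph))


# The same topology with the loop of Figure 3(a) left in place.
looped = links[:4] + [(("decapsulator", 1), ("classifier", 1))]
```

With two classifier rows in the instance table, the unrolled links
form a loop-free graph, whereas the "looped" variant reuses
classifier instance 1 and therefore contains a cycle.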

While there is little doubt that the individual FE blocks must be
configurable, the configurability question becomes complicated and
controversial for FE block topology. To discuss the issue further,
we need to answer the following questions:
1) Is the FE block topology configurable at all? Is that feasible
   with today's forwarding plane technology? Even if the CE can
   dynamically configure an FE block topology, how can the CE
   interpret an arbitrary FE block topology and know what to do
   with it?
2) If the FE block topology can be configured by the CE, how do we
   model the FE block topology?
We discuss these questions in the rest of this section.

4.2.1. Configuring FE Block Topology

We believe that the FE block topology should be configurable in the
ForCES model because even today's forwarding plane technology can
potentially allow that. As network processors are being used
increasingly in the forwarding plane, many of the packet processing
functions on the FE are implemented in software. As such, the FE
can afford much flexibility and programmability of its
functionality by configuring the software either at runtime or at
compile time. It is conceivable for the FE to change its FE block
topology by recompiling the set of software components and their
chaining order along the datapath. It might be possible to achieve
some of the reconfiguration at runtime. Therefore, we argue that it
is necessary for ForCES to allow the FE block topology to be
configurable in its FE model, since it is technically feasible.

For example, a NAT-enabled router may have several line cards (FEs)
that are capable of both NAT (Network Address Translator) functions
and IPv4 Forwarding. Such an FE contains two FE blocks: NAT
and IPv4 Forwarder.
Depending on where in the network this router is deployed, the
network administrator may decide on a different configuration for
the CE to apply to the FEs. If the router sits
on the edge of a private address domain, the CE may want to
configure the FEs to perform NAT first and IPv4 forwarding later so
that the forwarding is done with the correct address space. On the
other hand, if the router sits inside the private address domain,
the CE may want to configure the FEs to perform only the IPv4
forwarding function and bypass the NAT because the address space is
already translated by the edge router. Therefore, the FEs might be
asked to configure the NAT block as an optional stage in the FE
block topologies to accommodate the two deployment scenarios. This
is a very simple example and the switch between these two
topologies could be easily done with a runtime flag in the FE
software. Simple as it is, it does demonstrate the need to allow
for FE block topology configuration.

           +-------------+         +--------------+
           |             |         |              |
   ------->|     NAT     |-------->|IPv4 Forwarder|------>
           |             |         |              |
           +-------------+         +--------------+

              (a) NAT followed by IPv4 Forwarder

           +-------------+         +--------------+
           |             |         |              |
   --->-+  |     NAT     |   +---->|IPv4 Forwarder|------>
        |  |             |   |     |              |
        |  +-------------+   |     +--------------+
        |                    |
        +---------->---------+

      (b) NAT is skipped and only the forwarder is used

Figure 4. A simple example to configure different FE
topologies.

We want to point out that allowing a configurable FE block topology
in the FE model does not mandate that all FEs must have such
capability. Even if the FE elects to support block topology
reconfiguration, it is entirely up to the FE designers to decide
how the FE actually implements such reconfiguration.
Whether it is only a simple runtime switch that allows a few choices
as in Figure 4, or a much more elaborate reconfiguration as shown
later in Figure 5, possibly supported by recompilation, is an
implementation detail internal to the FE and outside the scope of
the FE model. The purpose of this discussion is to justify the
motivation and necessity of supporting FE block topology
configuration in the FE model, not to dictate how this should be
done inside the FEs.

We have just answered the question "Is it possible to configure the
FE block topology with today's forwarding plane technology?" Now it
is time to look at the other related question: "Even if it is
feasible to configure an FE block topology, how can the CE interpret
an arbitrary FE block topology (presented to it by the FE) and know
what to do with it? Alternatively, how does the CE know what kind
of FE block topology it should use to implement a particular NE
service or application?"

The example in Figure 4 is too trivial to require much intelligence
at the CE. Figure 5 shows a more complex example where a
QoS-enabled router has several line cards that have a few ingress
ports and egress ports, a specialized classification chip, and a
network processor containing code for FE blocks such as meter,
marker, dropper, counter, mux, queue, scheduler and IPv4 forwarder.
Some of the FE block topology is already fixed and has to remain
static due to the physical layout of the line cards. For example,
all the ingress ports might be hard wired into the classification
chip, so all packets must flow from the ingress port into the
classification engine. On the other hand, the blocks on the network
processor are programmable, and the order of these blocks can be
changed by recompiling the code. There might exist certain
capacity limits and linkage constraints between these blocks.
Examples of the capacity limits might be: there can be no more than
8 meters; there can be no more than 16 queues in one FE; the
scheduler can handle at most 16 queues; etc. The linkage
constraints might dictate that the classification engine may be
followed by a meter, marker, dropper, counter, queue or IPv4
forwarder, but not a scheduler; queues can only be followed by a
scheduler; a scheduler must be followed by the IPv4 forwarder; the
last block in the datapath before the egress ports must be the IPv4
forwarder; etc.

Once the FE reports such capability and capacity to the CE, it is
up to the CE to translate the QoS policy into the desired
configuration for the FE. The question then arises as to whether
the CE has the intelligence to translate a high level QoS policy
into the configuration data for the FEs. We argue that this
question is outside the scope of the FE model itself. It is
possible that some human intervention is still necessary. For
example, the network administrator might be called upon to
translate the high level QoS policy into the configurable FE data
(including the block topology) that the CE uses to configure the
line cards. It is also conceivable that within a given network
service domain (like DiffServ), a certain amount of intelligence can
be programmed into the CE such that the CE has a general
understanding of the FE blocks involved, so that the translation
from high level QoS policy to low level FE configuration can be done
automatically. In any event, this is considered an implementation
issue internal to the control plane and outside the scope of the FE
model. Therefore, it is not discussed any further in this draft.

Figure 5(a) depicts the FE capability, while 5(b) and 5(c) depict
two different topologies that the FE might be asked to configure
into.
Note that both ingress and egress are omitted in (b) and (c) for
simplicity in the figures. The topology in (b) is considerably more
complex than (c), but both are feasible with the FE capabilities,
and so the FE should accept either configuration request from the
CE.

As demonstrated in the example shown in Figure 5, many variants of
the FE block topology come directly from the configuration of the
individual FE blocks. For example, the number of datapath branches
from the classifier is determined by the number of filters used by
the classifier. Figure 5(b) uses four filters, so four main
datapath branches fan out from the classifier, while 5(c) uses only
two filters, resulting in two datapath branches. Each datapath is
further configured by configuring the FE blocks along the path.

     +----------+     +-----------+
     |          |     |           |
---->| Ingress  |---->|classifier |--------------+
     |          |     |chip       |              |
     +----------+     +-----------+              |
                                                 |
                                                 v
                   +-------------------------------------------+
                   |   Network Processor                       |
     +--------+    |                                           |
<----| Egress |    |  +------+  +------+  +-------+  +---+     |
     +--------+    |  |Meter |  |Marker|  |Dropper|  |Mux|     |
          ^        |  +------+  +------+  +-------+  +---+     |
          |        |                                           |
+---------+--------+                                           |
|         |                                                    |
|    +---------+       +---------+   +------+   +---------+    |
|    |Forwarder|<------|Scheduler|<--|Queue |   |Counter  |    |
|    +---------+       +---------+   +------+   +---------+    |
|                                                              |
+--------------------------------------------------------------+

        (a) The Capability of the FE, reported to the CE

                                          Queue1
                 +---+                     +--+
                 |  A|-------------------->|  |--+
              +->|   |                     |  |  |
              |  |  B|--+  +--+   +--+     +--+  |
              |  +---+  |  |  |   |  |           |
              |  Meter1 +->|  |-->|  |           |
              |            |  |   |  |           |
              |            +--+   +--+           |         IPv4
              |         Counter1 Dropper1 Queue2 |    +--+ Fwd.
       +---+  |                            +--+  +--->|A |  +-+
       |  A|--+                            |  |------>|B |  | |
------>|  B|------------------------------>|  |  +--->|C |->| |->
       |  C|---+                           +--+  | +->|D |  | |
       |  D|-+ |                                 | |  +--+  +-+
       +---+ | |  +---+             +---+ Queue3 | | Scheduler
Classifier1  | |  |  A|------------>|A  |  +--+  | |
             | +->|   |             |   |->|  |--+ |
             |    |  B|--+  +--+ +->|B  |  |  |    |
             |    +---+  |  |  | |  +---+  +--+    |
             |    Meter2 +->|  |-+  Mux1           |
             |              |  |                   |
             |              +--+          Queue4   |
             |             Marker1         +--+    |
             +---------------------------->|  |----+
                                           |  |
                                           +--+

     (b) One FE block topology as configured by the CE and
         accepted by the FE

       +-----+    +-------+                      +---+
       |    A|--->|Queue1 |--------------------->|   |
------>|     |    +-------+                      |   |  +---+
       |     |                                   |   |  |   |
       |     |    +-------+      +-------+       |   |  |   |
       |    B|--->|Meter1 |----->|Queue2 |------>|   |->|   |
       |     |    |       |      +-------+       |   |  |   |
       |     |    |       |--+                   |   |  |   |
       +-----+    +-------+  |   +-------+       |   |  +---+
      classifier             +-->|Dropper|       |   |  IPv4
                                 +-------+       +---+  Fwd.
                                               scheduler

     (c) Another FE block topology as configured by the CE
         and accepted by the FE

     Figure 5. Another example of configuring FE block topology.

4.2.2. Modeling FE Block Topology

Now that we have seen some examples of how FE block topology can be
configured, we need to focus on the question of how to model the FE
block topology traversed by the packets. As discussed below, there
are two different approaches to modeling the FE block topology.

. Directed Graph Topological Approach

An FE stage is simply an instance of an FE block within an FE's
datapath. As a packet flows through an FE along a datapath, it
flows through one or more distinct stages, with each stage
instantiating a certain FE logical function. So an FE stage is
simply a row in the "FE block instance table" corresponding to the
block type of the stage. Each FE allocates an FE-unique stage ID
to each of its stages.
One way to assign the stage ID is to combine the block-type
namespace and the instance ID in the instance table.

The FE block topology can then be modeled by a directed graph
interconnecting all the FE stages present in the FE, with each node
in the graph corresponding to an FE stage, and the direction
between two nodes coinciding with the packet flow direction. In
order to represent the directed interconnection between two
consecutive nodes along a datapath, each stage contains a field
called "number of downstream stages" and an array of "downstream
stage IDs" that point to the set of downstream nodes following this
stage. Such a modeling approach directly models the datapath
topological graph of the FE stages, and so we refer to it as the
directed graph topological approach.

For such a directed graph topological approach, the following
information needs to be specified for each FE stage in the graph:
- stage identifier, which uniquely identifies the node within this
  FE graph;
- block type, which identifies the block function that this stage
  is an instance of;
- number of downstream stages, which corresponds to the number of
  downstream nodes connected to this stage;
- downstream stage identifiers, which correspond to the set of
  downstream nodes connected to this stage.

Such information can be combined into the rows of the "FE block
instance table" for each FE block type present on the FE. With
such information defined for each row in the instance table, it is
possible to traverse the whole graph node by node, following the
downstream stage identifiers, as long as the initial stage(s) are
known. For example, the topology model for Figure 5(c) is shown in
Figure 6. It is assumed that the FE has four ingress ports and two
egress ports.
The stage ID has the format "xx.yy", where xx is the block type
name and yy is the instance ID of that stage in the instance table
of type xx. The following shorthand is used for the FE block type
namespace: IG=Ingress-port; CL=classifier; EG=egress-port; QU=queue;
MT=meter; DR=dropper; SC=scheduler; and FW=Forwarder.

In Figure 6, by starting from the initial stages {IG.1; IG.2; IG.3;
IG.4} and using the instance tables, all the datapaths in the FE
block topology can easily be traversed. From this example, it is
clear that the directed graph topological approach is
straightforward and graphical, and hence easy to understand and
implement. DiffServ [RFC3317] uses this approach in modeling its
QoS functions and their interconnection. However, such an approach
has certain limitations. One limitation is the implicit assumption
within such a model that each node affects the datapath branching
only for the immediately following stage. For example, in Figure
5(c), the classifier directs packets into either queue1 or meter1,
but once the packets enter meter1, the classification results have
no impact on which of the two branches leaving meter1 (i.e., queue2
or dropper) is taken. While this limitation might be perfectly
reasonable for many FE designs, some find it insufficient. For
example, some classification engines use the classification results
to determine the full datapath, i.e., not just the stage
immediately following the classifier, but all the subsequent FE
stages that the packets should traverse. It is difficult to
represent such an FE design using the pure directed graph
topological approach. An alternative approach, the encoded state
approach, is more suitable in this case because it carries metadata
between the stages.
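
As an illustration only, the instance tables for the topology of
Figure 5(c) (tabulated in Figure 6) could be held in memory and
walked as sketched below in Python. The dictionary layout and the
names INSTANCE_TABLES, downstream, reachable_stages and datapaths are
assumptions made for this sketch; they are not defined by the FE
model.

```python
from collections import deque

# Hypothetical encoding of the per-type instance tables: block type ->
# {instance id -> list of downstream stage IDs in "xx.yy" format}.
# "#next" is simply the length of each list.
INSTANCE_TABLES = {
    "IG": {1: ["CL.1"], 2: ["CL.1"], 3: ["CL.1"], 4: ["CL.1"]},
    "CL": {1: ["QU.1", "MT.1"]},
    "QU": {1: ["SC.1"], 2: ["SC.1"]},
    "MT": {1: ["QU.2", "DR.1"]},
    "DR": {1: []},
    "SC": {1: ["FW.1"]},
    "FW": {1: ["EG.1", "EG.2"]},
    "EG": {1: [], 2: []},
}


def downstream(stage_id):
    """Look up the downstream stage IDs of one stage."""
    block_type, instance = stage_id.split(".")
    return INSTANCE_TABLES[block_type][int(instance)]


def reachable_stages(initial_stages):
    """Breadth-first walk of the graph; stages in first-visit order."""
    seen, order, queue = set(), [], deque(initial_stages)
    while queue:
        stage = queue.popleft()
        if stage in seen:
            continue
        seen.add(stage)
        order.append(stage)
        queue.extend(downstream(stage))
    return order


def datapaths(stage_id):
    """Every path from stage_id to a stage with no downstream stage.

    Assumes the graph is acyclic, as it is in this example.
    """
    nexts = downstream(stage_id)
    if not nexts:
        return [[stage_id]]
    return [[stage_id] + tail for n in nexts for tail in datapaths(n)]


if __name__ == "__main__":
    print(reachable_stages(["IG.1", "IG.2", "IG.3", "IG.4"]))
    for path in datapaths("CL.1"):
        print(" -> ".join(path))
```

Note that the walk only ever consults the immediately downstream
stage IDs, which is exactly the limitation discussed above: nothing
computed at CL.1 is visible when MT.1 chooses between QU.2 and DR.1.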

Instance tables:
================
      IG                     CL                       QU
+---+-----+----+    +---+-----+---------+    +---+-----+----+
|id |#next|next|    |id |#next|next     |    |id |#next|next|
+---+-----+----+    +---+-----+---------+    +---+-----+----+
|1  |  1  |CL.1|    |1  |  2  |QU.1;MT.1|    |1  |  1  |SC.1|
+---+-----+----+    +---+-----+---------+    +---+-----+----+
|2  |  1  |CL.1|                             |2  |  1  |SC.1|
+---+-----+----+                             +---+-----+----+
|3  |  1  |CL.1|
+---+-----+----+
|4  |  1  |CL.1|
+---+-----+----+

      DR                     MT                       EG
+---+-----+----+    +---+-----+---------+    +---+-----+----+
|id |#next|next|    |id |#next|next     |    |id |#next|next|
+---+-----+----+    +---+-----+---------+    +---+-----+----+
|1  |  0  |    |    |1  |  2  |QU.2;DR.1|    |1  |  0  |    |
+---+-----+----+    +---+-----+---------+    +---+-----+----+
                                             |2  |  0  |    |
                                             +---+-----+----+

      SC                     FW
+---+-----+----+    +---+-----+---------+
|id |#next|next|    |id |#next|next     |
+---+-----+----+    +---+-----+---------+
|1  |  1  |FW.1|    |1  |  2  |EG.1;EG.2|
+---+-----+----+    +---+-----+---------+

Directed Graph:
===============
Traverse the graph by starting from {IG.1;IG.2;IG.3;IG.4}.

*Notes:
1) The fields shown in the instance tables are only the fields
   common to all: id (instance ID); #next (number of immediate next
   stages); next (the stage IDs of all the immediate next stages).
   The parameters pertinent to each block type are not shown in the
   instance tables because they do not affect the topology
   modeling.
2) The stage ID has the format "xx.yy", where xx is the block type
   name and yy is the instance ID of that stage in the instance
   table of type xx.
3) The following shorthand is used for the FE block type namespace:
   IG=Ingress-port; CL=classifier; EG=egress-port; QU=queue;
   MT=meter; DR=dropper; SC=scheduler; and FW=Forwarder.

Figure 6. Using the directed graph approach to model the FE
          block topology in Figure 5(c).

. Encoded State Approach

In addition to the topological approach, the QDDIM model also
adopts the encoded state approach, so that information about the
treatment that a packet received on an ingress interface can be
communicated along with the packet to the egress interface (see
[QDDIM] Section 3.8.3). The QDDIM model represents this information
transfer in terms of a packet preamble.

        +----------------+
        |    Meter-A     |
        |                |
  ----->|            In -|-----PM-1--->
        |                |
        |           Out -|-----PM-2--->
        +----------------+

  Figure 7: Meter Followed by Two Preamble Markers

Figure 7 shows an example used in [QDDIM] (Section 3.8.3) in which
meter results are captured in a packet preamble. "PreambleMarker
PM-1 adds to the packet preamble an indication that the packet
exited Meter A as conforming traffic. Similarly, PreambleMarker
PM-2 adds to the preambles of packets that come through it
indications that they exited Meter A as nonconforming traffic. A
PreambleMarker appends its information to whatever is already
present in a packet preamble, as opposed to overwriting what is
already there." "To foster interoperability, the basic format of
the information captured by a PreambleMarker is specified." "Once a
meter result has been stored in a packet preamble, it is available
for any subsequent Classifier to use."

In the example of Figure 5(c), if the results from the classifier
are to affect all the following stages, even beyond the immediate
next stage, the encoded state approach should be used, so that
metadata representing the classifier results is inserted and made
available to all following stages.
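
The append-only preamble behaviour quoted above can be sketched in
Python as follows. This is a minimal illustration under stated
assumptions: the Packet class, the (tag, value) tuple encoding, and
the functions preamble_marker, meter_a and later_classifier are
hypothetical names introduced here; [QDDIM] specifies its own
preamble format, which this sketch does not reproduce.

```python
from dataclasses import dataclass, field


@dataclass
class Packet:
    payload: bytes
    # Metadata that travels with the packet between stages.
    preamble: list = field(default_factory=list)


def preamble_marker(packet, tag, value):
    """Append an indication to the preamble; never overwrite."""
    packet.preamble.append((tag, value))
    return packet


def meter_a(packet, conforming):
    """Hypothetical meter: the caller supplies the metering verdict."""
    if conforming:
        return preamble_marker(packet, "Meter-A", "conforming")   # PM-1
    return preamble_marker(packet, "Meter-A", "nonconforming")    # PM-2


def later_classifier(packet):
    """A subsequent stage reads the most recent stored meter result."""
    for tag, value in reversed(packet.preamble):
        if tag == "Meter-A":
            return "queue" if value == "conforming" else "drop"
    return "default"


if __name__ == "__main__":
    p = Packet(b"example")
    preamble_marker(p, "Classifier", "class-B")  # earlier classifier result
    meter_a(p, conforming=False)
    print(p.preamble, later_classifier(p))
```

Because earlier entries stay in the preamble, a stage several hops
downstream can still see the classifier result, which is what the
pure directed graph approach cannot express.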

        +------------+   +------------+   +------------+
 input  |  Ethernet  |   |            |   |  Ethernet  |output
------->|  Ingress   |-->| IPv4 L3 LPM|-->|  Egress    |---->
        |  Port Mgr  |   |  Forwarder |   |  Port Mgr  |
        +------------+   +------------+   +------------+

     (a) using encoded state approach

 Input  +------------+   +------------+                    output
------->|Ingr-Port #1|-->|            |
        +------------+   |            |   +------------+
------->|Ingr-Port #2|-->|            |-->|EgressPort#1|----->
        +------------+   |            |   +------------+
------->|Ingr-Port #3|-->|IPv4 L3 LPM |-->|EgressPort#2|----->
        +------------+   |Forwarder   |   +------------+
------->|Ingr-Port #4|-->|            |-->|EgressPort#3|----->
        +------------+   |            |   +------------+
------->|Ingr-Port #5|-->|            |-->|EgressPort#4|----->
        +------------+   |            |   +------------+
------->|Ingr-Port #6|-->|            |
        +------------+   +------------+

     (b) using directed graph topological approach

     Figure 8. A simple example using two different approaches.

Using the topological approach, as exemplified by the DiffServ
model, there are N connections between a 1:N fan-out node (e.g., a
classifier) and its next stages. Using the encoded state approach,
fewer connections are typically needed between the same fan-out
node and its next stages, because each packet carries some state
information as metadata that the next stage nodes can interpret in
order to invoke different packet treatment. Pure topological
approaches can be overly complex to represent because they force
one to build elaborate topologies with many more connections. An
encoded state approach allows one to simplify the graph and
represent the functional blocks with more clarity. But it does
require extra metadata to be carried along with the packet, like
the preamble in the QDDIM model.

For example, in Figure 8(a), the IPv4 L3 LPM forwarder generates
some metadata at its output to indicate which port the packets
should go to, and block #3 (the Ethernet Egress Port Manager) uses
this metadata to direct the packets to the right egress port.
Figure 8(b) shows what the FE graph looks like when using the pure
topological approach instead, assuming six ingress and four egress
ports. It is clear that (b) is unwieldy compared to (a).

Note that the FE graph can represent largely arbitrary topologies
of the stages, regardless of which approach (topological or encoded
state) is taken. Clearly the two approaches are not mutually
exclusive. For complex topologies, a combination of the two is
most useful and flexible. Therefore, we recommend that the ForCES
FE model adopt both approaches. More specifically, the directed
graph topological approach should be used as the basic model, while
the encoded state approach can be used as an option when metadata
is needed between stages beyond the immediate next neighbors.

5. Logical FE Block Library

A small set of fine-grained FE blocks can be identified as the very
basic units from which all other FE functions can be built. Such a
set of FE blocks can be viewed as an FE block library. This
section defines such a library.

Several working groups in the IETF have already done some relevant
work in modeling the provisioning policy data for some of the
functions we are interested in, for example, the DiffServ
(Differentiated Services) PIB [RFC3317] and the IPSec PIB
[IPSEC-PIB]. Whenever possible, we should try to reuse the work
done elsewhere instead of reinventing the wheel.

FE blocks may be characterized into two general classes:
input/output oriented blocks, and processing blocks. Each class is
composed of a number of sub-blocks, and the combination of classes
and sub-blocks can completely characterize FE functions.

The FE input/output blocks are characterized by their inputs and
outputs, and they generally do not modify or further process the
data that they handle. The FE processing blocks are characterized
by the manner in which they modify the packet, metadata, or
internal state, independent of how that information is input into
the block.

5.1. FE Input/Output Block Characterization

The FE input/output blocks are characterized by the following
elements:

- number of inputs
- number of outputs

These blocks do not modify or examine the packet in any way.

5.1.1. Source Block

A source block has no inputs and one output. It "sources" events
from the external world into the FE model.

The purpose of the source block is to allow the model to explicitly
interact with objects that are outside of the model. For example,
an Ethernet port that injects packets into the FE may be modeled as
a "source" block because, from the point of view of the model, it
creates packets out of the "ether", outside the scope of the model.
See also the FE Port block in Section 5.1.3.

5.1.2. Sink Block

A sink block has one input and no outputs. It "sinks" events from
the FE model into the external world.

The purpose of the sink block is to allow the model to explicitly
interact with objects that are outside of the model. For example,
an Ethernet port that sends packets from an FE may be modeled as a
"sink" block because, from the point of view of the model, it sends
packets into the "ether", outside the scope of the model. See also
the FE Port block in Section 5.1.3.

5.1.3. Port Block

An FE Port Block is used to describe specific sinks or sources. An
FE Source Block may source events other than packets, such as TCP
timers. An FE Source block may also not require complex
configuration.
In addition, the model should be able to map both sources and sinks
onto one logical block which models a port that implements those
functions. For these reasons, it is useful to define a Port Block
separately from the previously defined Source and Sink blocks, even
though there is some overlap between them.

The FE Port Block contains a number of configurable parameters,
which may include, but are not limited to, the following items:

- the number of ports on the FE;
- the sub-interfaces, if any;
- the static attributes of each port (e.g., port type, direction,
  link speed);
- the configurable attributes of each port (e.g., IP address,
  administrative status);
- the statistics collected on each port (e.g., number of packets
  received);
- the current status (up or down).

5.1.4. Dropper Block

A dropper block has one input and no outputs. It discards all
packets that it receives without any modification or examination of
those packets.

The purpose of a dropper block is to allow the description of
"sinks" within the model, where those sinks do not result in the
packet being sent to any object external to the model.

5.1.5. MUX Block

A mux block has N inputs and one output. It multiplexes packets
from the inputs onto its output.

5.1.6. Redirector (de-MUX) Block

A redirector block has one input and N outputs. It is the inverse
of a MUX block.

The redirector block takes an input packet P and uses the metadata
M to redirect that packet to one or more of N outputs, most
commonly for unicast forwarding, multicast, or broadcast.

5.1.7. Shaper Block

A shaper block has one input and one output. It takes input
packets and metadata at some time t, and outputs the packet and
(possibly updated) metadata at some other time t'. The packet is
not examined or modified during this process.

The metadata is used to determine how to shape the outgoing
traffic. The packet and metadata are conceptually added to the
internal state S of the block when the packet is received, and are
removed from that internal state when the packet is output from the
block.

5.2. FE Processing Blocks

An FE processing block may be characterized by four parameters:

P - the packet that it is processing
t - the time at which that packet is being processed
M - the metadata that is associated with that packet
S - the internal state of the block
    (including any CE->FE configuration, and any internal FE
    data)

We do not model or describe how any of these parameters arrive at
the block. Instead, we characterize the blocks by how they process
those parameters.

5.2.1. Counter Block

A counter block updates its internal state S by counting packets or
metadata. The packet is not modified, and the metadata may or may
not be modified.

A counter block is independent of time t, in that it does not
perform any time-dependent counting. The time at which a count is
made may, however, be associated with that count.

5.2.2. Meter Block

A meter block is a counter block that is time dependent. That is,
it meters the rate over time at which packets or metadata flow
through the block.

5.2.3. Filter Block

According to [DiffServ], "a filter consists of a set of conditions
on the component values of a packet's classification key (the
header values, contents, and attributes relevant for
classification)".

That is, a filter block examines the packet without modifying it,
and uses its internal state S to make decisions about the packet.
The result of that examination is that the filter block creates new
metadata, "match" or "no match", to associate with that packet,
depending on whether or not the packet matched the conditions of
the filter.

A filter block may be viewed as a special case of a classifier
block. Alternatively, a classifier block may be viewed as
consisting of multiple filter blocks.

5.2.4. Classifier Block

A classifier block uses its internal state S to classify the packet
into one of N different logical classes. That is, it takes an
input packet and metadata, and produces the same packet with new or
additional metadata. A classifier is parameterized by filters.
Classification is done by matching the contents of the incoming
packets against the filters, and the result of classification is
produced in the form of metadata. Note that this classifier is
modeled solely on its internal processing, and not on its inputs
and outputs. It is a single-exit classifier that does NOT
physically redirect the packet. In contrast, a DiffServ-like
classifier is a 1:N (fan-out) device: it takes a single traffic
stream as input and generates N logically separate traffic streams
as output. That kind of multi-exit classifier can be modeled by
combining this classifier with a redirector (see Section 5.1.6).

Note that other FE Blocks MAY perform simple classification on the
packet or metadata. The purpose of the FE Classifier Block is to
model a block that "digests" large amounts of input data (packet,
metadata) to produce a "summary" of the classification results, in
the form of additional metadata. Other FE Blocks can then use this
summary information to quickly and simply perform trivial
"classifications".
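
A minimal Python sketch of a single-exit classifier parameterized by
filters is given below. It makes several assumptions that are not in
the model: a packet is represented as a dictionary of header fields,
a filter is a set of exact-match conditions, the first matching
filter determines the class, and the names make_filter and classify
are hypothetical.

```python
def make_filter(conditions):
    """Build a filter from conditions on the classification key.

    `conditions` maps a header field name to the value it must have.
    The returned function reports "match" (True) or "no match" (False)
    without modifying the packet.
    """
    def matches(packet):
        return all(packet.get(k) == v for k, v in conditions.items())
    return matches


def classify(packet, metadata, filters):
    """Single-exit classifier.

    The packet is returned unmodified; the classification result is
    recorded as additional metadata. Assumption: the first filter that
    matches decides the class.
    """
    result = dict(metadata)  # keep the existing metadata
    result["class"] = next(
        (name for name, matches in filters if matches(packet)),
        "no match",
    )
    return packet, result


if __name__ == "__main__":
    filters = [
        ("forward", make_filter({"proto": "tcp", "dport": 80})),
        ("drop", make_filter({"proto": "udp"})),
    ]
    pkt = {"proto": "udp", "dport": 53}
    print(classify(pkt, {"ingress": "IG.1"}, filters))
```

A redirector placed after this block would only need to read the
"class" entry to pick an output, which is the "trivial
classification" by other blocks described above.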

The requirement for a unique and separate FE Classifier Block comes
about because it would not make sense to model a classifier block
inside each and every other block. Such a model would be highly
redundant. We therefore specifically model a complex
classification block, and explicitly state that other blocks may
make decisions based on the parameters S, t, and M, but not on P.

5.2.5. Redirecting Classifier Block

This block is logically a combination of the FE Classifier Block in
Section 5.2.4 and the FE Redirector Block in Section 5.1.6. It
uses its internal classification rules to redirect the input packet
P to one or more outputs.

Its purpose is to allow the "atomic" modeling of classification
with redirection. If this block were described as two blocks, then
the model would be required to describe the format and
interpretation of the metadata. As there is not yet consensus on
the format and interpretation of metadata, it is preferable to
define an additional block which allows us to avoid most of that
contention.

It is expected that once there is experience with using the FE
model and the blocks defined here, we may reach consensus on the
format and interpretation of the metadata. At that time, we may
revisit the definition of this block, and may choose to remove it
due to redundancy with previously defined blocks.

5.2.6. Modifier Block

A modifier block modifies incoming packets and sends them out. This
is a generic "catch-all" block for packet processing which is not
modeled in one of the other blocks. Usually the metadata is used
to determine how to modify the packet.

This block is defined in a generic manner, and we expect that
specific examples of packet and/or metadata modification will be
described as named sub-classes of the modifier block, as below.

5.2.7. Packet Header Rewriter Block

This block is a sub-class of the Modifier Block. It is used to
rewrite fields in the packet header, for example IPv4 TTL
decrementing, checksum calculation, or TCP/IP NAT.

5.2.8. Packet Compression/Decompression Block

This block is a sub-class of the Modifier Block. It is used to
compress or decompress packet data, such as with IPv4 Van Jacobson
header compression.

It may be useful to split this block into separate compression and
decompression blocks. This decision should be made after we have
more experience with the model.

5.2.9. Packet Encryption/Decryption Block

This block is a sub-class of the Modifier Block. It is used to
encrypt or decrypt packet data, such as with TLS.

It may be useful to split this block into separate encryption and
decryption blocks. This decision should be made after we have more
experience with the model.

5.2.10. Packet Encapsulation/Decapsulation Block

This block is a sub-class of the Modifier Block. It is used to
encapsulate or decapsulate packet data, such as with IP in IP.

It may be useful to split this block into separate encapsulation
and decapsulation blocks. This decision should be made after we
have more experience with the model.

6. Minimal Set of Logical Functions Required for FE Model

A minimum set of FE functions that must be supported by any
proposed FE model is defined in [FORCES-REQ]. In this section, we
demonstrate how the small FE block library defined in Section 5 can
be used to model all the logical functions required in
[FORCES-REQ].

6.1. QoS Functions

The IETF community has already done some work in modeling the QoS
functions in the datapath.
The IETF DiffServ working group has defined an informal data model
[RFC3290] for QoS-related functions such as classification,
metering, marking, dropping, counting, multiplexing, and queueing.
The latest work on the DiffServ PIB (Policy Information Base)
[RFC3317] defines a set of provisioning classes to provide policy
control of resources implementing the Differentiated Services
Architecture. The DiffServ PIB also has some elements of
capability reporting in it. The IETF Policy Framework working
group is also defining an information model [QDDIM] to describe the
QoS mechanisms inherent in different network devices, including
hosts. This model is intended to be used with the QoS Policy
Information Model [QPIM] to model how policies can be defined to
manage and configure the QoS mechanisms present in the datapath of
devices.

Here is a list of QoS functional blocks that should be supported
directly in the library or indirectly via a combination of the FE
blocks in the library:
. Classifier
. Meter
. Marker
. Dropper
. Counter
. Queue and Scheduler
. Shaper

6.1.1. Classifier

There are two ways to define a classifier block: as a single-exit
classifier or as a multi-exit classifier.

A single-exit classifier follows the QDDIM model. It takes an input
packet and metadata, and produces the same packet with new or
additional metadata. Such a single-exit classifier does not
physically redirect the packets. It only decides which metadata to
associate with the packet, and such metadata can be used by later
blocks to physically redirect the packets.

A multi-exit classifier, on the other hand, follows the DiffServ
model. It is equivalent to a single-exit classifier followed by a
redirector. Such a classifier directs packets to different output
paths.

6.1.2. Meter

Meter is directly defined in the FE Block library.

6.1.3. Marker

Marker can be modeled as a special kind of FE Modifier Block.

6.1.4. Dropper

Dropper is directly defined in the FE Block library.

6.1.5. Counter

Counter is directly defined in the FE Block library.

6.1.6. Queue and Scheduler (?)

6.1.7. Shaper

Shaper is directly defined in the FE Block library.

6.2. Generic Filtering Functions

A combination of classifier, redirector, modifier, etc. can model a
complex set of filtering functions. For example, Figure 9
represents a filtering function that classifies packets into one of
two logical classes: forward and drop. These logical classes are
represented as metadata M1 and M2. The redirector uses this
metadata to redirect the packet to one of two outputs. The first
sinks the packet back into the network. The second silently drops
the packet.

   classifier -> redirector ---M1--- sink
                          \
                           \-M2--- dropper

     Figure 9. A filtering function example.

6.3. Vendor Specific Functions

New and currently unknown FE functionality can be derived (i.e.,
extended) from the generic FE Block. The namespace used to
identify the FE block type must be extensible, so that new logical
functions can be defined and added later to accommodate future
innovation in the forwarding plane, as long as the new functions
are modeled as FE blocks.

6.4. Port Functions

Every FE contains a certain number of interfaces (ports), including
both inter-NE interfaces and intra-NE interfaces. The inter-NE
interfaces are the external interfaces through which the NE
receives packets from, and forwards packets to, the external world.
The intra-NE interfaces are used for FE-FE or FE-CE communications.
   The same model should be used for both the inter-NE and intra-NE
   interfaces, but the distinction between the two must be made known
   to the CE so that the CE can configure them differently.

   Certain types of physical ports have sub-interfaces (frame relay
   DLCIs, ATM VCs, Ethernet VLANs, etc.) as virtual or logical
   interfaces. Some implementations treat tunnels (e.g., GRE, L2TP,
   IPsec, MPLS, etc.) as interfaces, while others do not. [FORCES-REQ]
   treats tunneling as a high-touch function, so the FE model does not
   model tunneling as part of the port functions. Instead, tunneling
   is covered in Section 6.6.

6.5. Forwarding Functions

   Support for IPv4 and IPv6 unicast and multicast forwarding
   functions must be provided by the model.

   Typically, the control plane maintains the Routing Information Base
   (RIB), which contains all the routes discovered by all the routing
   protocols, with all kinds of attributes relevant to the routes. The
   forwarding plane uses a different database, the Forwarding
   Information Base (FIB), which contains only the active subset of
   those routes (only the best routes chosen for forwarding) with
   only the attributes that are relevant for forwarding. A component
   in the control plane, termed the Route Table Manager (RTM), is
   responsible for managing the RIB in the CE and maintaining the FIB
   used by the FEs. Therefore, the most important aspect in modeling
   the forwarding functions is the data model for the FIB. The model
   also needs to support the possibility of multiple paths.

   At the very minimum, each route in the FIB needs to contain the
   following layer-3 information:
   - the prefix of the destination IP address;
   - the length of the prefix;
   - the number of equal-cost paths;
   - the next hop IP address and the egress interface for each path.
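
   The minimum FIB route contents listed above can be sketched as a
   small data structure. This is only an illustration under stated
   assumptions, not part of the FE model: the names Path, FibRoute and
   lookup are hypothetical, the interface names and addresses are made
   up, and longest-prefix matching is assumed as the lookup rule.

```python
from dataclasses import dataclass, field
from ipaddress import IPv4Address, IPv4Network
from typing import List, Optional


@dataclass(frozen=True)
class Path:
    """One forwarding path: next hop IP address plus egress interface."""
    next_hop: IPv4Address
    egress_interface: str  # hypothetical interface naming


@dataclass
class FibRoute:
    """Minimum layer-3 information for one FIB route (hypothetical type)."""
    prefix: IPv4Network  # carries both the prefix and its length
    paths: List[Path] = field(default_factory=list)

    @property
    def path_count(self) -> int:
        # Number of equal-cost paths for this route.
        return len(self.paths)


def lookup(fib: List[FibRoute], destination: IPv4Address) -> Optional[FibRoute]:
    """Return the matching route with the longest prefix (an assumption)."""
    matches = [route for route in fib if destination in route.prefix]
    return max(matches, key=lambda route: route.prefix.prefixlen, default=None)


fib = [
    FibRoute(IPv4Network("10.0.0.0/8"),
             [Path(IPv4Address("192.0.2.1"), "eth0")]),
    FibRoute(IPv4Network("10.1.0.0/16"),
             [Path(IPv4Address("192.0.2.2"), "eth1"),
              Path(IPv4Address("192.0.2.3"), "eth2")]),
]

route = lookup(fib, IPv4Address("10.1.2.3"))
```

   Here the destination 10.1.2.3 falls under both prefixes, and the
   sketch selects the /16 route, which carries two equal-cost paths.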

   Another aspect of the forwarding functions is the method to resolve
   a next hop destination IP address into the associated media
   address. There are many ways to resolve the Layer 3 to Layer 2
   address mapping, depending on the link layer. For example, on
   Ethernet links, the Address Resolution Protocol (ARP, defined in
   RFC 826) is used for IPv4 address resolution.

   Assuming a separate table is maintained in the FEs for address
   resolution, the following information is necessary for each address
   resolution entry:
   - the next hop IP address;
   - the media address.

   Different implementations may have different ways to maintain the
   FIB and the resolution table. For example, a FIB may consist of two
   separate tables, one to match the prefix to the next hop and the
   other to match the next hop to the egress interface. Another
   implementation may use one table instead. Our approach of using
   fine-grained FE blocks to model the forwarding functions allows
   such flexibility.

   For example, a combination of a classifier, followed by a modifier
   and a redirector, can model the forwarding function.

6.6. High-Touch Functions

   High-touch functions are those that take action on the contents or
   headers of a packet based on content other than what is found in
   the IP header. Examples of such functions include NAT, ALG,
   firewall, tunneling and L7 content recognition.

   The ForCES working group first needs to agree upon a small set of
   common high-touch functions with well-defined behavior to be
   included in the initial FE block library. Here is a list of
   candidate blocks:
   . NAT
   . Firewall
   . Encapsulator
   . Decapsulator

   NAT, Encapsulator and Decapsulator are all examples of the modifier
   FE block, while the firewall can be modeled as a filtering function
   (Section 6.2).

6.7. Security Functions

   The FE model must be able to describe the types of encryption
   and/or decryption functions that an FE supports and the associated
   attributes for such functions. In general, encryption and
   decryption can be modeled by a modifier.

   The IP Security Policy (IPSP) Working Group in the IETF has started
   work on defining the IPsec Policy Information Base [IPSEC-PIB].
   Further study is needed to determine whether it can be reused here
   and whether any additional work is needed.

6.8. Off-loaded Functions

   In addition to the packet processing functions that are typically
   found on the FEs, some logical functions may also be executed
   asynchronously by some FEs, according to a certain finite-state
   machine, triggered not only by packet events but by timer events
   as well. Examples of such functions include the finite-state
   machine execution required by TCP termination or OSPF Hello
   processing off-loaded from the CE. The FE model must be capable of
   expressing these asynchronous functions, so that the CE may take
   advantage of such off-loaded functions on the FEs.

   The ForCES working group first needs to agree upon a small set of
   such off-loaded functions with well-understood behavior and
   interactions with the control plane.

7. Cascading Multiple FEs

   An FE may contain zero, one or more external ingress ports.
   Similarly, an FE may contain zero, one or more external egress
   ports. In other words, not every FE has to contain external
   ingress or egress interfaces. For example, Figure 10 shows two
   cascading FEs. FE #1 contains one external ingress interface but
   no external egress interface, while FE #2 contains one external
   egress interface but no external ingress interface. It is possible
   to connect these two FEs together via their internal interfaces to
   achieve the complete ingress-to-egress packet processing function.
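
   The cascading of the two FEs in Figure 10 can be sketched as
   follows. This is a minimal illustration, not a defined ForCES
   mechanism: the FE class, its field names and the cascade function
   are hypothetical, and each FE is reduced to an ordered list of
   block names plus counts of its external ports.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FE:
    """Simplified FE: ordered block names and external port counts."""
    name: str
    blocks: List[str]
    external_ingress_ports: int = 0  # zero, one or more
    external_egress_ports: int = 0   # zero, one or more


def cascade(fes: List[FE]) -> List[str]:
    """Join FEs in order via their internal interfaces.

    Returns the end-to-end sequence of blocks. The chain is treated as
    complete only if the first FE has an external ingress port and the
    last FE has an external egress port (an assumption of this sketch).
    """
    if not fes or fes[0].external_ingress_ports == 0:
        raise ValueError("first FE needs an external ingress port")
    if fes[-1].external_egress_ports == 0:
        raise ValueError("last FE needs an external egress port")
    path: List[str] = []
    for fe in fes:
        path.extend(fe.blocks)
    return path


fe1 = FE("FE #1",
         ["Ingress port", "Header Decompressor", "IPv4 Forwarder"],
         external_ingress_ports=1)
fe2 = FE("FE #2",
         ["Header Compressor", "Egress port"],
         external_egress_ports=1)

pipeline = cascade([fe1, fe2])
```

   Neither FE alone forms a complete datapath in this sketch; only the
   ordered pair yields a path from an external ingress port to an
   external egress port.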

   This provides the flexibility to spread the functions across
   multiple FEs and interconnect them later for certain applications.

      +-----------------------------------------------------+
      |  +---------+   +------------+   +---------+         |
 input|  |         |   |            |   |         | output  |
   ---+->| Ingress |-->|Header      |-->|IPv4     |---------+--->+
      |  | port    |   |Decompressor|   |Forwarder| FE      |    |
      |  +---------+   +------------+   +---------+ #1      |    |
      +-----------------------------------------------------+    V
                                                                 |
           +-----------------------<-----------------------------+
           |
           |    +----------------------------------------+
           V    |  +------------+   +----------+         |
           | input |            |   |          | output  |
           +->--+->|Header      |-->| Egress   |---------+-->
                |  |Compressor  |   | port     | FE      |
                |  +------------+   +----------+ #2      |
                +----------------------------------------+

   Figure 10. An example of two different FEs connected together.

   While the inter-FE communication protocol is out of scope for
   ForCES, it is up to the CE to query and understand the FE functions
   and the inter-FE topology for multiple FEs, and to cascade them
   together when necessary to perform a complete ingress-to-egress
   packet processing function, as described in Figure 10.

8. Data Modeling and Representation

   A formal data modeling language is needed to represent the
   conceptual FE model described in this document, and a full
   specification will be written using such a data modeling language.
   It is also necessary to identify a data representation method for
   over-the-wire transport of the FE model data.

   The following is a list of some potential candidates for
   consideration. For the moment, we intend to leave this as an open
   issue, and much debate is needed in the ForCES WG before a decision
   can be made. Therefore, we only provide the candidate list and some
   initial discussion here without drawing a conclusion yet.

   - XML (Extensible Markup Language) Schema
   - ASN.1 (Abstract Syntax Notation One)
   - SMI (Structure of Management Information) [RFC1155]
   - SPPI (Structure of Policy Provisioning Information) [RFC3159]
   - UML (Unified Modeling Language)

   Most of the candidates here, with the notable exception of UML, are
   capable of representing the model both in the document and over the
   wire. Of course, it is also possible to choose one data modeling
   language for specification in the document and later allow several
   over-the-wire representations to map the model into different
   implementations.

   XML has the advantage of being both human- and machine-readable,
   with widely available tool support. However, it is very verbose and
   hence less efficient for over-the-wire transport. It also requires
   XML parsing functions in both the CE and FE and hence may impose a
   large footprint, especially on FEs. Currently XML is not yet widely
   deployed and used in network elements. XML for network
   configuration in general remains an open area that still requires
   substantial investigation and experiment in the IETF.

   The ASN.1 format is human readable and widely used in network
   protocols. SMI is based on a subset of ASN.1 and is used to define
   the Management Information Base (MIB) for SNMP. SPPI is the adapted
   subset of SMI used to define the Policy Information Base (PIB) for
   COPS. Substantial investment has been made in SMI/MIBs/SNMP by the
   IETF, and the Internet community collectively has had many years of
   design and operation experience with SMI/MIBs/SNMP. However, it is
   also well recognized that SMI/MIBs/SNMP is not well suited for
   configuration, and so SPPI/PIBs/COPS-PR attempts to optimize for
   network provisioning and configuration.

   UML is the software industry's standard language for specifying,
   visualizing, constructing and documenting the artifacts of software
   systems.
   It is a powerful tool for data modeling. However, it does
   not provide a data representation format for over-the-wire
   transport.

9. Security Considerations

   The FE model only describes the representation and organization of
   data sets and attributes in the forwarding plane. The associated
   communication protocol (i.e., the ForCES protocol) will be defined
   in separate documents, and so the security issues will be addressed
   there.

10. Intellectual Property Right

   The authors are not aware of any intellectual property right issues
   pertaining to this document.

11. IANA Considerations

   A namespace is needed to uniquely identify the FE block type for
   each FE logical function.

12. Normative References

   [RFC1812] F. Baker, "Requirements for IP Version 4 Routers",
      RFC 1812, June 1995.

   [RFC1155] M. Rose, et al., "Structure and Identification of
      Management Information for TCP/IP-based Internets", RFC 1155,
      May 1990.

   [RFC3084] K. Chan, et al., "COPS Usage for Policy Provisioning",
      RFC 3084, March 2001.

   [RFC3159] K. McCloghrie, et al., "Structure of Policy Provisioning
      Information (SPPI)", RFC 3159, August 2001.

   [RFC3290] Y. Bernet, et al., "An Informal Management Model for
      Diffserv Routers", RFC 3290, May 2002.

   [FORCES-REQ] H. Khosravi, et al., "Requirements for Separation of
      IP Control and Forwarding", work in progress, May 2003.

13. Informative References

   [RFC3317] K. Chan, et al., "Differentiated Services Quality of
      Service Policy Information Base", RFC 3317, March 2003.

   [RFC3318] R. Sahita, et al., "Framework Policy Information Base",
      RFC 3318, March 2003.

   [QDDIM] B. Moore, et al., "Information Model for Describing
      Network Device QoS Datapath Mechanisms", work in progress,
      May 2002.

   [QPIM] Y. Snir, et al., "Policy Framework QoS Information Model",
      work in progress, Nov 2001.

   [IPSEC-MIB] C. Madson, et al., "IPsec Flow Monitoring MIB", work
      in progress, March 2003.

14. Acknowledgments

   The authors would also like to thank the following individuals for
   their invaluable technical input: David Putzolu, Hormuzd Khosravi,
   Eric Johnson, David Durham, Andrzej Matejko, T. Sridhar, Jamal
   Hadi, Alex Audu.

15. Authors' Addresses

   Lily L. Yang
   Intel Labs
   2111 NE 25th Avenue
   Hillsboro, OR 97124, USA
   Phone: +1 503 264 8813
   Email: lily.l.yang@intel.com

   Joel Halpern
   P.O. Box 6049
   Leesburg, VA 20178, USA
   Phone: +1 703 371 3043
   Email: jmh@joelhalpern.com

   Ram Gopal
   Nokia Research Center
   5, Wayside Road,
   Burlington, MA 01803, USA
   Phone: +1 781 993 3685
   Email: ram.gopal@nokia.com

   Alan DeKok
   IDT Inc.
   1575 Carling Ave.
   Ottawa, ON K1G 0T3, Canada
   Phone: +1 613 724 6004 ext. 231
   Email: alan.dekok@idt.com