idnits 2.17.1

draft-ietf-aaa-acct-04.txt:

  ** The Abstract section seems to be numbered

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Looks like you're using RFC 2026 boilerplate. This must be updated to
     follow RFC 3978/3979, as updated by RFC 4748.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** Missing expiration date. The document expiration date should appear on
     the first and last page.

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     Shadow Directories.

  ** The document is more than 15 pages and seems to lack a Table of Contents.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack a Security Considerations section.

  ** The document seems to lack an IANA Considerations section. (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack separate sections for Informative/Normative
     References. All references will be assumed normative when checking for
     downward references.

  ** There are 3 instances of too long lines in the document, the longest one
     being 9 characters in excess of 72.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not
     match the current year

  == Line 125 has weird spacing: '...tion be measu...'

  == Line 585 has weird spacing: '...is only one a...'

  == Line 796 has weird spacing: '..., or on expir...'

  == Line 810 has weird spacing: '...condary serve...'

  == Line 1151 has weird spacing: '...need to keep ...'

  == (11 more instances...)

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008. If you
     have contacted all the original authors and they are all willing to
     grant the BCP78 rights to the IETF Trust, then this is fine, and you can
     ignore this comment. If not, you may need to add the pre-RFC5378
     disclaimer. (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- Couldn't find a document date in the document -- date freshness check
     skipped.

  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '<CODE BEGINS>' and
     '<CODE ENDS>' lines.

  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Unused Reference: '10' is defined on line 2251, but no explicit
     reference was found in the text
  == Unused Reference: '14' is defined on line 2266, but no explicit
     reference was found in the text
  == Unused Reference: '16' is defined on line 2272, but no explicit
     reference was found in the text
  == Unused Reference: '17' is defined on line 2275, but no explicit
     reference was found in the text
  == Unused Reference: '21' is defined on line 2290, but no explicit
     reference was found in the text
  == Unused Reference: '22' is defined on line 2294, but no explicit
     reference was found in the text
  == Unused Reference: '23' is defined on line 2297, but no explicit
     reference was found in the text
  == Unused Reference: '28' is defined on line 2314, but no explicit
     reference was found in the text
  == Unused Reference: '29' is defined on line 2318, but no explicit
     reference was found in the text
  == Unused Reference: '30' is defined on line 2321, but no explicit
     reference was found in the text
  == Unused Reference: '31' is defined on line 2324, but no explicit
     reference was found in the text
  == Unused Reference: '32' is defined on line 2328, but no explicit
     reference was found in the text
  == Unused Reference: '33' is defined on line 2332, but no explicit
     reference was found in the text
  == Unused Reference: '34' is defined on line 2336, but no explicit
     reference was found in the text
  == Unused Reference: '35' is defined on line 2339, but no explicit
     reference was found in the text
  == Unused Reference: '36' is defined on line 2342, but no explicit
     reference was found in the text
  == Unused Reference: '37' is defined on line 2346, but no explicit
     reference was found in the text

  ** Obsolete normative reference: RFC 2138 (ref. '3') (Obsoleted by RFC 2865)
  ** Obsolete normative reference: RFC 2139 (ref. '4') (Obsoleted by RFC 2866)
  ** Obsolete normative reference: RFC 2393 (ref. '5') (Obsoleted by RFC 3173)
  ** Obsolete normative reference: RFC 793 (ref. '7') (Obsoleted by RFC 9293)
  ** Obsolete normative reference: RFC 2486 (ref. '8') (Obsoleted by RFC 4282)
  ** Obsolete normative reference: RFC 2576 (ref. '11') (Obsoleted by RFC 3584)
  ** Obsolete normative reference: RFC 2298 (ref. '12') (Obsoleted by RFC 3798)
  ** Obsolete normative reference: RFC 1892 (ref. '14') (Obsoleted by RFC 3462)
  ** Obsolete normative reference: RFC 1521 (ref. '17') (Obsoleted by RFC 2045,
     RFC 2046, RFC 2047, RFC 2048, RFC 2049)
  ** Obsolete normative reference: RFC 2570 (ref. '19') (Obsoleted by RFC 3410)

  == Outdated reference: A later version (-05) exists of
     draft-ietf-fax-timely-delivery-00

  -- Unexpected draft version: The latest known version of
     draft-ietf-radius-ext is -06, but you're referring to -07.

  == Outdated reference: A later version (-13) exists of
     draft-ietf-sigtran-sctp-05

  ** Obsolete normative reference: RFC 2571 (ref. '27') (Obsoleted by RFC 3411)
  ** Obsolete normative reference: RFC 1902 (ref. '31') (Obsoleted by RFC 2578)
  ** Obsolete normative reference: RFC 1903 (ref. '32') (Obsoleted by RFC 2579)
  ** Obsolete normative reference: RFC 1904 (ref. '33') (Obsoleted by RFC 2580)
  ** Obsolete normative reference: RFC 1906 (ref. '36') (Obsoleted by RFC 3417)
  ** Obsolete normative reference: RFC 2572 (ref. '37') (Obsoleted by RFC 3412)
  ** Obsolete normative reference: RFC 2574 (ref. '38') (Obsoleted by RFC 3414)
  ** Obsolete normative reference: RFC 2573 (ref. '39') (Obsoleted by RFC 3413)
  ** Obsolete normative reference: RFC 2575 (ref. '40') (Obsoleted by RFC 3415)
  ** Obsolete normative reference: RFC 1905 (ref. '41') (Obsoleted by RFC 3416)

  == Outdated reference: A later version (-34) exists of
     draft-ietf-cat-kerberos-pk-init-09
  == Outdated reference: A later version (-08) exists of
     draft-ietf-cat-kerberos-pk-cross-04
  == Outdated reference: A later version (-02) exists of
     draft-hornstein-snmpv3-ksm-00
  == Outdated reference: A later version (-04) exists of
     draft-ietf-aaa-accounting-attributes-03
  == Outdated reference: A later version (-09) exists of
     draft-irtf-nmrg-snmp-tcp-03
  == Outdated reference: A later version (-01) exists of
     draft-irtf-nmrg-snmp-compression-00
  == Outdated reference: A later version (-01) exists of
     draft-irtf-nmrg-get-subtree-mib-00

  Summary: 29 errors (**), 0 flaws (~~), 33 warnings (==), 4 comments (--).

  Run idnits with the --verbose option for more detailed information about
  the items above.

--------------------------------------------------------------------------------

1    AAA Working Group                                          Bernard Aboba
2    INTERNET-DRAFT                                     Microsoft Corporation
3    Category: Informational                                       Jari Arkko
4                                                                    Ericsson
5    11 June 2000                                            David Harrington
6                                                      Cabletron Systems Inc.

8                     Introduction to Accounting Management

10   1. Status of this Memo

12   This document is an Internet-Draft and is in full conformance with all
13   provisions of Section 10 of RFC 2026.

15   Internet-Drafts are working documents of the Internet Engineering Task
16   Force (IETF), its areas, and its working groups. Note that other groups
17   may also distribute working documents as Internet-Drafts. Internet-
18   Drafts are draft documents valid for a maximum of six months and may be
19   updated, replaced, or obsoleted by other documents at any time. It is
20   inappropriate to use Internet-Drafts as reference material or to cite
21   them other than as "work in progress."

23   The list of current Internet-Drafts can be accessed at
24   http://www.ietf.org/ietf/1id-abstracts.txt.

26   To view the list of Internet-Draft Shadow Directories, see
27   http://www.ietf.org/shadow.html.

29   2. Copyright Notice

31   Copyright (C) The Internet Society (2000). All Rights Reserved.

33   3. Abstract

35   The field of Accounting Management is concerned with the collection of
36   resource consumption data for the purposes of capacity and trend
37   analysis, cost allocation, auditing, and billing. This document
38   describes each of these problems, and discusses the issues involved in
39   design of modern accounting systems.

41   Since accounting applications do not have uniform security and
42   reliability requirements, it is not possible to devise a single
43   accounting protocol and set of security services that will meet all
44   needs. Thus the goal of accounting management is to provide a set of
45   tools that can be used to meet the requirements of each application.
46   This document describes the currently available tools as well as the
47   state of the art in accounting protocol design. A companion document,
48   draft-ietf-aaa-accounting-attributes-03.txt, reviews the state of the
49   art in accounting attributes and record formats.

51   3.1. History

53   -04 draft: rewrote SNMP section, cleaned up references
54   -03 draft: rewrote SNMPv3 section.
55   -02 draft: added discussion of accounting proxies. Expanded
56              discussion of accounting server faults and failover. Revised
57              section on SNMPv3. Revised requirements and evaluation tables.
58              Fixed spelling mistakes.

60   4. Table of Contents

62   1.  Status of this Memo
63   2.  Copyright notice
64   3.  Abstract
65   4.  Table of Contents
66   5.  Introduction
67       5.1  Requirements language
68       5.2  Terminology
69       5.3  Accounting management architecture
70       5.4  Accounting management objectives
71       5.5  Intra-domain and inter-domain accounting
72       5.6  Accounting record production
73       5.7  Requirements summary
74   6.  Scaling and reliability
75       6.1  Fault resilience
76       6.2  Resource consumption
77       6.3  Data collection models
78   7.  Review of Accounting Protocols
79       7.1  RADIUS
80       7.2  TACACS+
81       7.3  SNMP
82   8.  Review of Accounting Data Transfer
83       8.1  SMTP
84       8.2  Other protocols
85   9.  Summary
86   10. Acknowledgments
87   11. References
88   12. Authors' Addresses
89   13. Intellectual Property Statement
90   14. Full Copyright Statement
91   15. Expiration date

93   5. Introduction

95   The field of Accounting Management is concerned with the collection of
96   resource consumption data for the purposes of capacity and trend
97   analysis, cost allocation, auditing, and billing. This document
98   describes each of these problems, and discusses the issues involved in
99   design of modern accounting systems.

101  Since accounting applications do not have uniform security and
102  reliability requirements, it is not possible to devise a single
103  accounting protocol and set of security services that will meet all
104  needs. Thus the goal of accounting management is to provide a set of
105  tools that can be used to meet the requirements of each application.
106  This document describes the currently available tools as well as the
107  state of the art in accounting protocol design. A companion document,
108  draft-ietf-aaa-accounting-attributes-03.txt, reviews the state of the
109  art in accounting attributes and record formats.

111  5.1. Requirements language

113  In this document, the key words "MAY", "MUST", "MUST NOT", "optional",
114  "recommended", "SHOULD", and "SHOULD NOT", are to be interpreted as
115  described in [6].

117  5.2. Terminology

119  This document frequently uses the following terms:

121  Accounting
122            The collection of resource consumption data for the purposes
123            of capacity and trend analysis, cost allocation, auditing, and
124            billing. Accounting management requires that resource
125            consumption be measured, rated, assigned, and communicated
126            between appropriate parties.

128  Archival accounting
129            In archival accounting, the goal is to collect all accounting
130            data, to reconstruct missing entries as best as possible in
131            the event of data loss, and to archive data for a mandated
132            time period. It is "usual and customary" for these systems to
133            be engineered to be very robust against accounting data loss.
134            This may include provisions for transport layer as well as
135            application layer acknowledgments, use of non-volatile
136            storage, interim accounting capabilities (stored or
137            transmitted over the wire), etc. Legal or financial
138            requirements frequently mandate archival accounting practices,
139            and may often dictate that data be kept confidential,
140            regardless of whether it is to be used for billing purposes or
141            not.

143  Rating    The act of determining the price to be charged for use of a
144            resource.

146  Billing   The act of preparing an invoice.

148  Usage sensitive billing
149            A billing process that depends on usage information to prepare
150            an invoice can be said to be usage-sensitive. In contrast, a
151            process that is independent of usage information is said to be
152            non-usage-sensitive.

154  Auditing  The act of verifying the correctness of a procedure. In order
155            to be able to conduct an audit it is necessary to be able to
156            definitively determine what procedures were actually carried
157            out so as to be able to compare this to the recommended
158            process. Accomplishing this may require security services such
159            as authentication and integrity protection.

161  Cost Allocation
162            The act of allocating costs between entities. Note that cost
163            allocation and rating are fundamentally different processes.
164            In cost allocation the objective is typically to allocate a
165            known cost among several entities. In rating the objective is
166            to determine the amount to be charged for use of a resource.
167            In cost allocation, the cost per unit of resource may need to
168            be determined; in rating, this is typically a given.

170  Interim accounting
171            Interim accounting provides a snapshot of usage during a
172            user's session. This may be useful in the event of a device
173            reboot or other network problem that prevents the reception or
174            generation of a session summary packet or session record.
175            Interim accounting records can always be summarized without
176            the loss of information. Note that interim accounting records
177            may be stored internally on the device (such as in non-
178            volatile storage) so as to survive a reboot and thus may not
179            always be transmitted over the wire.

181  Session record
182            A session record represents a summary of the resource
183            consumption of a user over the entire session. Accounting
184            gateways creating the session record may do so by processing
185            interim accounting events or accounting events from several
186            devices serving the same user.

188  Accounting Protocol
189            A protocol used to convey data for accounting purposes.

191  Intra-domain accounting
192            Intra-domain accounting involves the collection of information
193            on resource usage within an administrative domain, for use
194            within that domain. In intra-domain accounting, accounting
195            packets and session records typically do not cross
196            administrative boundaries.

198  Inter-domain accounting
199            Inter-domain accounting involves the collection of information
200            on resource usage within an administrative domain, for use
201            within another administrative domain. In inter-domain
202            accounting, accounting packets and session records will
203            typically cross administrative boundaries.

205  Real-time accounting
206            Real-time accounting involves the processing of information on
207            resource usage within a defined time window. Time constraints
208            are typically imposed in order to limit financial risk.

210  Accounting server
211            The accounting server receives accounting data from devices
212            and translates it into session records. The accounting server
213            may also take responsibility for the routing of session
214            records to interested parties.

216  5.3. Accounting management architecture

218  The accounting management architecture involves interactions between
219  network devices, accounting servers, and billing servers. The network
220  device collects resource consumption data in the form of accounting
221  metrics. This information is then transferred to an accounting server.
222  Typically this is accomplished via an accounting protocol, although it
223  is also possible for devices to generate their own session records.

225  The accounting server then processes the accounting data received from
226  the network device. This processing may include summarization of interim
227  accounting information, elimination of duplicate data, or generation of
228  session records.

230  The processed accounting data is then submitted to a billing server,
231  which typically handles rating and invoice generation, but may also
232  carry out auditing, cost allocation, trend analysis or capacity planning
233  functions. Session records may be batched and compressed by the
234  accounting server prior to submission to the billing server in order to
235  reduce the volume of accounting data and the bandwidth required to
236  accomplish the transfer.

238  One of the functions of the accounting server is to distinguish between
239  inter- and intra-domain accounting events and to route them
240  appropriately. For session records containing a Network Access
241  Identifier (NAI), described in [8], the distinction can be made by
242  examining the domain portion of the NAI. If the domain portion is absent
243  or corresponds to the local domain, then the session record is treated
244  as an intra-domain accounting event. Otherwise, it is treated as an
245  inter-domain accounting event.

247  Intra-domain accounting events are typically routed to the local billing
248  server, while inter-domain accounting events will be routed to
249  accounting servers operating within other administrative domains. While
250  it is not required that session record formats used in inter- and intra-
251  domain accounting be the same, this is desirable, since it eliminates
252  translations that would otherwise be required.

254  Where a proxy forwarder is employed, domain-based access controls may be
255  employed by the proxy forwarder, rather than by the devices themselves.
256  The network device will typically speak an accounting protocol to the
257  proxy forwarder, which may then either convert the accounting packets to
258  session records, or forward the accounting packets to another domain.
259  In either case, domain separation is typically achieved by having the
260  proxy forwarder sort the session records or accounting messages by
261  destination.

263  Where the accounting proxy is not trusted, it may be difficult to verify
264  that the proxy is issuing correct session records based on the
265  accounting messages it receives, since the original accounting messages
266  typically are not forwarded along with the session records. Therefore
267  where trust is an issue, the proxy typically forwards the accounting
268  packets themselves. Assuming that the accounting protocol supports data
269  object security, this allows the end-points to verify that the proxy has
270  not modified the data in transit or snooped on the packet contents.
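
The NAI-based routing rule and the destination sorting performed by a
proxy forwarder, both described in this section, can be illustrated with
a short sketch. It is not part of the draft and makes several
assumptions: the function names, the "nai" record field, the
"local-billing" queue label, and the example.com local domain are all
hypothetical, and the NAI is treated as a simple "user@domain" string.

```python
from collections import defaultdict

LOCAL_DOMAIN = "example.com"  # hypothetical local administrative domain


def nai_domain(nai):
    """Return the domain portion of an NAI ("user@domain"), or "" if absent."""
    _, sep, domain = nai.rpartition("@")
    return domain.lower() if sep else ""


def classify(nai, local_domain=LOCAL_DOMAIN):
    """Apply the rule above: absent or local domain means intra-domain."""
    domain = nai_domain(nai)
    if not domain or domain == local_domain.lower():
        return "intra-domain"
    return "inter-domain"


def sort_by_destination(records, local_domain=LOCAL_DOMAIN):
    """Group session records by destination, as a proxy forwarder might.

    Intra-domain records go to a queue for the local billing server;
    inter-domain records are queued per remote domain.
    """
    queues = defaultdict(list)
    for record in records:
        if classify(record["nai"], local_domain) == "intra-domain":
            queues["local-billing"].append(record)
        else:
            queues[nai_domain(record["nai"])].append(record)
    return dict(queues)
```

For instance, sort_by_destination([{"nai": "fred"}, {"nai":
"nancy@bigco.com"}]) would place the first record in the local billing
queue and the second in a queue for bigco.com. A real deployment would
also need the access controls and data object security discussed above,
which this sketch omits.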
272 The diagram below illustrates the accounting management architecture: 274 +------------+ 275 | | 276 | Network | 277 | Device | 278 | | 279 +------------+ 280 | 281 Accounting | 282 Protocol | 283 | 284 V 285 +------------+ +------------+ 286 | | | | 287 | Org B | Inter-domain session records | Org A | 288 | Acctg. |<----------------------------->| Acctg. | 289 |Proxy/Server| or accounting protocol | Server | 290 | | | | 291 +------------+ +------------+ 292 | | 293 | | 294 Transfer | Intra-domain | 295 Protocol | Session records | 296 | | 297 V V 298 +------------+ +------------+ 299 | | | | 300 | Org B | | Org A | 301 | Billing | | Billing | 302 | Server | | Server | 303 | | | | 304 +------------+ +------------+ 306 5.4. Accounting management objectives 308 Accounting Management involves the collection of resource consumption 309 data for the purposes of capacity and trend analysis, cost allocation, 310 auditing, billing. Each of these tasks has different requirements. 312 5.4.1. Trend analysis and capacity planning 314 In trend analysis and capacity planning, the goal is typically a 315 forecast of future usage. Since such forecasts are inherently 316 imperfect, high reliability is typically not required, and moderate 317 packet loss can be tolerated. Where it is possible to use statistical 318 sampling techniques to reduce data collection requirements while still 319 providing the forecast with the desired statistical accuracy, it may be 320 possible to tolerate high packet loss as long as bias is not introduced. 322 The security requirements for trend analysis and capacity planning 323 depend on the circumstances of data collection and the sensitivity of 324 the data. Additional security services may be required when data is 325 being transferred between administrative domains. 
For example, when 326 information is being collected and analyzed within the same 327 administrative domain, integrity protection and authentication may be 328 used in order to guard against collection of invalid data. In inter- 329 domain applications confidentiality may be desirable to guard against 330 snooping by third parties. 332 5.4.2. Billing 334 When accounting data is used for billing purposes, the requirements 335 depend on whether the billing process is usage-sensitive or not. 337 5.4.2.1. Non-usage sensitive billing 339 Since by definition, non-usage-sensitive billing does not require usage 340 information, in theory all accounting data can be lost without affecting 341 the billing process. Of course this would also affect other tasks such 342 as trend analysis or auditing, so that such wholesale data loss would 343 still be unacceptable. 345 5.4.2.2. Usage-sensitive billing 347 Since usage-sensitive billing processes depend on usage information, 348 packet loss may translate directly to revenue loss. As a result, the 349 billing process may need to conform to financial reporting and legal 350 requirements, and therefore an archival accounting approach may be 351 needed. 353 Usage-sensitive systems may also require low processing delay. Today 354 credit risk is commonly managed by computerized fraud detection systems 355 that are designed to detect unusual activity. While efficiency concerns 356 might otherwise dictate batched transmission of accounting data, where 357 there is a risk of fraud, financial exposure increases with processing 358 delay. Thus it may be advisable to transmit each event individually to 359 minimize batch size, or even to utilize quality of service techniques to 360 minimize queuing delays. In addition, it may be necessary for 361 authorization to be dependent on ability to pay. 363 Whether these techniques will be useful varies by application since the 364 degree of financial exposure is application-dependent. 
For dial-up 365 Internet access from a local provider, charges are typically low and 366 therefore the risk of loss is small. However, in the case of dial-up 367 roaming or voice over IP, time-based charges may be substantial and 368 therefore the risk of fraud is larger. In such situations it is highly 369 desirable to quickly detect unusual account activity, and it may be 370 desirable for authorization to depend on ability to pay. In situations 371 where valuable resources can be reserved, or where charges can be high, 372 very large bills may be rung up quickly, and processing may need to be 373 completed within a defined time window in order to limit exposure. 375 Since in usage-sensitive systems, accounting data translates into 376 revenue, the security and reliability requirements are greater. Due to 377 financial and legal requirements such systems need to be able to survive 378 an audit. Thus security services such as authentication, integrity and 379 replay protection are frequently required and confidentiality and data 380 object integrity may also be desirable. Application-layer 381 acknowledgments are also often required so as to guard against 382 accounting server failures. 384 5.4.3. Auditing 386 With enterprise networking expenditures on the rise, interest in 387 auditing is increasing. Auditing, which is the act of verifying the 388 correctness of a procedure, commonly relies on accounting data. Auditing 389 tasks include verifying the correctness of an invoice submitted by a 390 service provider, or verifying conformance to usage policy, service 391 level agreements, or security guidelines. 393 To permit a credible audit, the auditing data collection process must be 394 at least as reliable as the accounting process being used by the entity 395 that is being audited. Similarly, security policies for the audit should 396 be at least as stringent as those used in preparation of the original 397 invoice. 
Due to financial and legal requirements, archival accounting 398 practices are frequently required in this application. 400 Where auditing procedures are used to verify conformance to usage or 401 security policies, security services may be desired. This typically will 402 include authentication, integrity and replay protection as well as 403 confidentiality and data object integrity. In order to permit response 404 to security incidents in progress, auditing applications frequently are 405 built to operate with low processing delay. 407 5.4.4. Cost allocation 409 The application of cost allocation and billback methods by enterprise 410 customers is not yet widespread. However, with the convergence of 411 telephony and data communications, there is increasing interest in 412 applying cost allocation and billback procedures to networking costs, as 413 is now commonly practiced with telecommunications costs. 415 Cost allocation models, including traditional costing mechanisms 416 described in [21]-[23] and activity-based costing techniques described 417 in [24] are typically based on detailed analysis of usage data, and as a 418 result they are almost always usage-sensitive. Whether these techniques 419 are applied to allocation of costs between partners in a venture or to 420 allocation of costs between departments in a single firm, cost 421 allocation models often have profound behavioral and financial impacts. 422 As a result, systems developed for this purposes are typically as 423 concerned with reliable data collection and security as are billing 424 applications. Due to financial and legal requirements, archival 425 accounting practices are frequently required in this application. 427 5.5. Intra-domain and inter-domain accounting 429 Much of the initial work on accounting management has focused on intra- 430 domain accounting applications. 
However, with the increasing deployment 431 of services such as dial-up roaming, Internet fax, Voice and Video over 432 IP and QoS, applications requiring inter-domain accounting are becoming 433 increasingly common. 435 Inter-domain accounting differs from intra-domain accounting in several 436 important ways. Intra-domain accounting involves the collection of 437 information on resource consumption within an administrative domain, for 438 use within that domain. In intra-domain accounting, accounting packets 439 and session records typically do not cross administrative boundaries. As 440 a result, intra-domain accounting applications typically experience low 441 packet loss and involve transfer of data between trusted entities. 443 In contrast, inter-domain accounting involves the collection of 444 information on resource consumption within an administrative domain, for 445 use within another administrative domain. In inter-domain accounting, 446 accounting packets and session records will typically cross 447 administrative boundaries. As a result, inter-domain accounting 448 applications may experience substantial packet loss. In addition, the 449 entities involved in the transfers cannot be assumed to trust each 450 other. 452 Since inter-domain accounting applications involve transfers of 453 accounting data between domains, additional security measures may be 454 desirable. In addition to authentication, replay and integrity 455 protection, it may be desirable to deploy security services such as 456 confidentiality and data object integrity. In inter-domain accounting 457 each involved party also typically requires a copy of each accounting 458 event for invoice generation and auditing. 460 5.6. Accounting record production 462 Typically, a single accounting record is produced per session, or in 463 some cases, a set of interim records which can be summarized in a single 464 record for billing purposes. 
However, to support deployment of services 465 such as wireless access or complex billing regimes, a more sophisticated 466 approach is required. 468 It is necessary to generate several accounting records from a single 469 session when pricing changes during a session. For instance, the price 470 of a service can be higher during peak hours than off-peak. For a 471 session continuing from one tariff period to another, it becomes 472 necessary for a device to report "packets sent" during both periods. 474 Time is not the only factor requiring this approach. For instance, in 475 mobile access networks the user may roam from one place to another while 476 still being connected in the same session. If roaming causes a change in 477 the tariffs, it is necessary to account for resource consumed in the 478 first and second areas. Another example is where modifications are 479 allowed to an ongoing session. For example, it is possible that a 480 session could be re-authorized with improved QoS. This would require 481 production of accounting records at both QoS levels. 483 These examples could be addressed by using vectors or multi-dimensional 484 arrays to represent resource consumption within a single session record. 485 For example, the vector or array could describe the resource consumption 486 for each combination of factors, e.g. one data item could be the number 487 of packets during peak hour in the area of the home operator. However, 488 such an approach seems complicated and inflexible and as a result, most 489 current systems produce a set of records from one session. A session 490 identifier needs to be present in the records to permit accounting 491 systems to tie the records together. 493 In most cases, the network device will determine when multiple session 494 records are needed, as the local device is aware of factors affecting 495 local tariffs, such as QoS changes and roaming. 
However, future systems
are being designed that enable the home domain to control the generation
of accounting records. This is of importance in inter-domain accounting
or when network devices do not have tariff information. The centralized
control of accounting record production can be realized, for instance,
by having authorization servers require re-authorization at certain
times and requiring the production of accounting records upon each
re-authorization.

In conclusion, in some cases it is necessary to produce multiple
accounting records from a single session. It must be possible to do
this without requiring the user to start a new session or to
re-authenticate. The production of multiple records can be controlled
either by the network device or by the AAA server. The requirements for
timeliness, security and reliability in multiple record sessions are the
same as for single-record sessions.

5.7.  Requirements summary

+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                |                     |                    |
| Usage          | Intra-domain        | Inter-domain       |
|                |                     |                    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                | Robustness vs.      | Robustness vs.     |
|                | packet loss         | packet loss        |
| Capacity       |                     |                    |
| Planning       | Integrity,          | Integrity,         |
|                | authentication,     | authentication,    |
|                | replay protection   | replay protection  |
|                | [confidentiality]   | confidentiality    |
|                |                     | [data object sec.] |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Non-usage      | Integrity,          | Integrity,         |
| Sensitive      | authentication,     | authentication,    |
| Billing        | replay protection   | replay protection  |
|                | [confidentiality]   | confidentiality    |
|                |                     | [data object sec.] |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                | Archival            | Archival           |
| Usage          | accounting          | accounting         |
| Sensitive      | Integrity,          | Integrity,         |
| Billing,       | authentication,     | authentication,    |
| Cost           | replay protection   | replay protection  |
| Allocation &   | [confidentiality]   | confidentiality    |
| Auditing       | [Bounds on          | [data object sec.] |
|                | processing delay]   | [Bounds on         |
|                |                     | processing delay]  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                | Archival            | Archival           |
| Time           | accounting          | accounting         |
| Sensitive      | Integrity,          | Integrity,         |
| Billing,       | authentication,     | authentication,    |
| fraud          | replay protection   | replay protection  |
| detection,     | [confidentiality]   | confidentiality    |
| roaming        |                     | [Data object       |
|                | Bounds on           | security and       |
|                | processing delay    | receipt support]   |
|                |                     | Bounds on          |
|                |                     | processing delay   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Key
[] = optional

6.  Scaling and reliability

With the continuing growth of the Internet, it is important that
accounting management systems be scalable and reliable. This section
discusses the resources consumed by accounting management systems as
well as the scalability and reliability properties exhibited by various
data collection and transport models.

6.1.  Fault resilience

As noted earlier, in applications such as usage-sensitive billing, cost
allocation and auditing, an archival approach to accounting is
frequently mandated, due to financial and legal requirements. Since in
such situations loss of accounting data can translate to revenue loss,
there is incentive to engineer a high degree of fault resilience. Faults
which may be encountered include:

   Packet loss
   Accounting server failures
   Network failures
   Device reboots

To date, much of the debate on accounting reliability has focused on
resilience against packet loss and the differences between UDP, SCTP and
TCP-based transport. However, it should be understood that resilience
against packet loss is only one aspect of meeting archival accounting
requirements.

As noted in [18], "once the cable is cut you don't need more
retransmissions, you need a *lot* more voltage." Thus, the choice of
transport has no impact on resilience against faults such as network
partition, accounting server failures or device reboots. What does
provide resilience against these faults is non-volatile storage.

The importance of non-volatile storage in design of reliable accounting
systems cannot be over-emphasized. Without non-volatile storage,
event-driven systems will lose data once the transmission timeout has
been exceeded, and batching designs will experience data loss once the
internal memory used for accounting data storage has been exceeded. Via
use of non-volatile storage, and internally stored interim records, most
of these data losses can be avoided.

It may even be argued that non-volatile storage is more important to
accounting reliability than network connectivity, since for many years
reliable accounting systems were implemented based solely on physical
storage, without any network connectivity. For example, phone usage data
For example, phone usage data 606 used to be stored on paper, film, or magnetic media and carried from the 607 place of collection to a central location for bill processing. 609 6.1.1. Interim accounting 611 Interim accounting provides protection against loss of session summary 612 data by providing checkpoint information that can be used to reconstruct 613 the session record in the event that the session summary information is 614 lost. This technique may be applied to any data collection model (i.e. 615 event-driven or polling) and is supported in both RADIUS [25] and in 616 TACACS+. 618 While interim accounting can provide resilience against packet loss, 619 server failures, short-duration network failures, or device reboot, its 620 applicability is limited. Transmission of interim accounting data over 621 the wire should not be thought of as a mainstream reliability 622 improvement technique since it increases use of network bandwidth in 623 normal operation, while providing benefits only in the event of a fault. 625 Since most packet loss on the Internet is due to congestion, sending 626 interim accounting data over the wire can make the problem worse by 627 increasing bandwidth usage. Therefore on-the-wire interim accounting is 628 best restricted to high-value accounting data such as information on 629 long-lived sessions. To protect against loss of data on such sessions, 630 the interim reporting interval is typically set several standard 631 deviations larger than the average session duration. This ensures that 632 most sessions will not result in generation of interim accounting events 633 and the additional bandwidth consumed by interim accounting will be 634 limited. However, as the interim accounting interval decreases toward 635 the average session time, the additional bandwidth consumed by interim 636 accounting increases markedly, and as a result, the interval must be set 637 with caution. 

Where non-volatile storage is unavailable, interim accounting can also
result in excessive consumption of memory that could be better allocated
to storage of session data. As a result, implementors should be careful
to ensure that new interim accounting data overwrites previous data
rather than accumulating additional interim records in memory, thereby
worsening the buffer exhaustion problem.

Given the increasing popularity of non-volatile storage for use in
consumer devices such as digital cameras, such storage is rapidly
declining in price. This makes it increasingly feasible for network
devices to include built-in support for non-volatile storage. This can
be accomplished, for example, by support for compact PCMCIA cards.

Where non-volatile storage is available, this can be used to store
interim accounting data. Stored interim events are then replaced by
updated interim events or by session data when the session completes.
The session data can itself be erased once the data has been transmitted
and acknowledged at the application layer. This approach avoids interim
data being transmitted over the wire except in the case of a device
reboot. When a device reboots, internally stored interim records are
transferred to the accounting server.

6.1.2.  Multiple record sessions

Generation of multiple accounting records within a session can introduce
scalability problems that cannot be controlled using the techniques
available in interim accounting.

For example, in the case of interim records kept in non-volatile
storage, it is possible to overwrite previous interim records with the
most recent one or summarize them to a session record. Where interim
updates are sent over the wire, it is possible to control bandwidth
usage by adjusting the interim accounting interval.
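
The overwrite behavior described above might be sketched as follows.
This is a minimal illustration only: the class and method names are
hypothetical, and a Python dictionary stands in for non-volatile
storage.

```python
class AccountingStore:
    """One slot per session; a dict stands in for non-volatile storage."""

    def __init__(self):
        self._slots = {}

    def put_interim(self, session_id, counters):
        # A newer interim record overwrites the previous one for the session.
        self._slots[session_id] = ("interim", dict(counters))

    def put_session_record(self, session_id, counters):
        # On session completion the session record replaces the interim data.
        self._slots[session_id] = ("session", dict(counters))

    def acknowledge(self, session_id):
        # Erase a session record only after an application-layer ACK.
        slot = self._slots.get(session_id)
        if slot is not None and slot[0] == "session":
            del self._slots[session_id]

    def pending(self):
        # Records that would be sent to the accounting server after a reboot.
        return dict(self._slots)


store = AccountingStore()
store.put_interim("s1", {"octets": 100})
store.put_interim("s1", {"octets": 250})  # replaces the earlier checkpoint
```

Because each session occupies a single slot, storage use in this sketch
grows with the number of concurrent sessions rather than with the number
of interim updates.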

These measures are not applicable where multiple session records are
produced from a single session, since these records cannot be summarized
or overwritten without loss of information. As a result, multiple
record production can result in increased consumption of bandwidth and
memory. Implementors should be careful to ensure that worst-case
multiple record processing requirements do not exceed the capabilities
of their systems.

As an example, a tariff change at a particular time of day could, if
implemented carelessly, create a sudden peak in the consumption of
memory and bandwidth as the records need to be stored and/or
transported. Rather than attempting to send all of the records at once,
it may be desirable to keep them in non-volatile storage and send all of
the related records together in a batch when the session completes. It
may also be desirable to shape the accounting traffic flow so as to
reduce the peak bandwidth consumption. This can be accomplished by
introduction of a randomized delay interval. If the home domain can
also control the generation of multiple accounting records, the
estimation of the worst-case processing requirements can be very
difficult.

6.1.3.  Packet loss

As packet loss is a fact of life on the Internet, accounting protocols
dealing with session data need to be resilient against packet loss. This
is particularly important in inter-domain accounting, where packets
often pass through Network Access Points (NAPs) where packet loss may be
substantial. Resilience against packet loss can be accomplished via
implementation of a retry mechanism on top of UDP, or use of TCP [7] or
SCTP [26]. On-the-wire interim accounting provides only limited benefits
in mitigating the effects of packet loss.

UDP-based transport is frequently used in accounting applications.
However, this is not appropriate in all cases.
Where accounting data
will not fit within a single UDP packet without fragmentation, use of
TCP or SCTP transport may be preferred to use of multiple round-trips in
UDP. As noted in [47] and [49], this may be an issue in the retrieval of
large tables.

In addition, in cases where congestion is likely, such as in
inter-domain accounting, TCP or SCTP congestion control and round-trip
time estimation will be very useful, optimizing throughput. In
applications which require maintenance of session state, such as
simultaneous usage control, TCP and application-layer keep-alive packets
or SCTP with its built-in heartbeat capabilities provide a mechanism for
keeping track of session state.

When implementing UDP retransmission, there are a number of issues to
keep in mind:

   Data model
   Retry behavior
   Congestion control
   Timeout behavior

Accounting reliability can be influenced by how the data is modeled.
For example, it is almost always preferable to use cumulative variables
rather than expressing accounting data in terms of a change from a
previous data item. With cumulative data, the current state can be
recovered by a successful retrieval, even after many packets have been
lost. However, if the data is transmitted as a change then the state
will not be recovered until the next cumulative update is sent. Thus,
such implementations are much more vulnerable to packet loss, and should
be avoided wherever possible.

In designing a UDP retry mechanism, it is important that the retry
timers relate to the round-trip time, so that retransmissions will not
typically occur within the period in which acknowledgments may be
expected to arrive. Accounting bandwidth may be significant in some
circumstances, so that the added traffic due to unnecessary
retransmissions may increase congestion levels.
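
One way to relate retry timers to the round-trip time is to borrow the
smoothed estimator that TCP uses for its retransmission timer (RFC
6298). The Python sketch below is an assumption about how a UDP-based
accounting client might do this; it is not taken from any accounting
protocol specification, it omits the clock granularity term of RFC
6298, and the class name, clamping bounds, and initial timeout are
hypothetical.

```python
class RetryTimer:
    """RTT-based retransmission timeout with exponential back-off."""

    ALPHA = 1 / 8  # gain for the smoothed RTT (value used by RFC 6298)
    BETA = 1 / 4   # gain for the RTT variation (value used by RFC 6298)

    def __init__(self, initial_rto=1.0, min_rto=0.2, max_rto=60.0):
        self.srtt = None      # smoothed round-trip time, in seconds
        self.rttvar = None    # round-trip time variation, in seconds
        self.rto = initial_rto
        self.min_rto = min_rto  # hypothetical lower bound
        self.max_rto = max_rto  # hypothetical upper bound

    def on_ack(self, rtt_sample):
        """Update the estimate from the RTT of a request that was not retried."""
        if self.srtt is None:
            self.srtt = rtt_sample
            self.rttvar = rtt_sample / 2
        else:
            deviation = abs(self.srtt - rtt_sample)
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * deviation
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * rtt_sample
        rto = self.srtt + 4 * self.rttvar
        self.rto = min(self.max_rto, max(self.min_rto, rto))

    def on_timeout(self):
        """Double the timeout after an unanswered request (congestive back-off)."""
        self.rto = min(self.max_rto, self.rto * 2)


timer = RetryTimer()
timer.on_ack(0.5)     # first RTT measurement of 500 ms
```

After the first sample the timeout sits above the measured round-trip
time, so a retransmission is not sent while an acknowledgment may still
reasonably be expected to arrive.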

Congestion control in accounting data transfer is a somewhat
controversial issue. Since accounting traffic is often considered
mission-critical, it has been argued that congestion control is not a
requirement; better to let other less-critical traffic back off in
response to congestion. Moreover, without non-volatile storage,
congestive back-off in accounting applications can result in data loss
due to buffer exhaustion.

However, it can also be argued that in modern accounting
implementations, it is possible to implement congestion control while
improving throughput and maintaining high reliability. In circumstances
where there is sustained packet loss, there simply is not sufficient
capacity to maintain existing transmission rates. Thus, aggregate
throughput will actually improve if congestive back-off is implemented.
This is due to elimination of retransmissions and the ability to utilize
techniques such as RED to desynchronize flows. In addition, with QoS
mechanisms such as differentiated services, it is possible to mark
accounting packets for preferential handling so as to provide for lower
packet loss if desired. Thus considerable leeway is available to the
network administrator in controlling the treatment of accounting packets
and hard coding inelastic behavior is unnecessary. Typically, systems
implementing non-volatile storage allow for backlogged accounting data
to be placed in non-volatile storage pending transmission, so that
buffer exhaustion resulting from congestive back-off need not be a
concern.

Since UDP is not really a transport protocol, UDP-based accounting
protocols such as [4] often do not prescribe timeout behavior. Thus
implementations may exhibit widely different behavior.
For example, one
implementation may drop accounting data after three constant duration
retries to the same server, while another may implement exponential
back-off to a given server, then switch to another server, up to a total
timeout interval of twelve hours, while storing the untransmitted data
on non-volatile storage. The practical difference between these
approaches is substantial; the former approach will not satisfy archival
accounting requirements while the latter may. More predictable behavior
can be achieved via use of SCTP or TCP transport.

6.1.4.  Accounting server failover

In the event of a failure of the primary accounting server, it is
desirable for the device to fail over to a secondary server. Providing
one or more secondary servers can remove much of the risk of accounting
server failure, and as a result use of secondary servers has become
commonplace.

For protocols based on TCP, it is possible for the device to maintain
connections to both the primary and secondary accounting servers, using
the secondary connection after expiration of a timer on the primary
connection. Alternatively, it is possible to open a connection to the
secondary accounting server after a timeout or loss of the primary
connection, or on expiration of a timer. Thus, accounting protocols
based on TCP are capable of responding more rapidly to connectivity
failures than TCP timeouts would otherwise allow, at the expense of an
increased risk of duplicates.

With SCTP, it is possible to control transport layer timeout behavior,
and therefore it is not necessary for the accounting application to
maintain its own timers. SCTP also enables multiplexing of multiple
connections within a single transport connection, all maintaining the
same congestion control state, avoiding the "head of line blocking"
issues that can occur with TCP.
However, since SCTP is not widely
available, use of this transport can impose an additional implementation
burden on the designer.

For protocols using UDP, transmission to the secondary server can occur
after a number of retries or timer expiration. For compatibility with
congestion avoidance, it is advisable to incorporate techniques such as
round-trip-time estimation, slow start and congestive back-off. Thus
the accounting protocol designer utilizing UDP is often led to
re-inventing techniques already existing in TCP and SCTP. As a result,
the use of raw UDP transport in accounting applications is not
recommended.

With any transport it is possible for the primary and secondary
accounting servers to receive duplicate packets, so support for
duplicate elimination is required. Since accounting server failures can
result in data accumulation on accounting clients, use of non-volatile
storage can guard against data loss due to transmission timeouts or
buffer exhaustion. On-the-wire interim accounting provides only limited
benefits in mitigating the effects of accounting server failures.

6.1.5.  Application layer acknowledgments

It is possible for the accounting server to experience partial failures.
For example, a failure in the database back end could leave the
accounting retrieval process or thread operable while the process or
thread responsible for storing the data is non-functional. Similarly, it
is possible for the accounting application to run out of disk space,
making it unable to continue storing incoming session records.

In such cases it is desirable to distinguish between transport layer
acknowledgment and application layer acknowledgment. Even though both
acknowledgments may be sent within the same packet (such as a TCP
segment carrying an application layer acknowledgment along with a
piggy-backed ACK), the semantics are different.
A transport-layer
acknowledgment means "the transport layer has taken responsibility for
delivering the data to the application", while an application-layer
acknowledgment means "the application has taken responsibility for the
data".

A common misconception is that use of TCP transport guarantees that data
is delivered to the application. However, as noted in RFC 793 [7]:

   An acknowledgment by TCP does not guarantee that the data has been
   delivered to the end user, but only that the receiving TCP has taken
   the responsibility to do so.

Therefore, if the receiving TCP fails after sending the ACK, the
application may not receive the data. Similarly, if the application
fails prior to committing the data to stable storage, the data may be
lost. In order for a sending application to be sure that the data it
sent was received by the receiving application, either a graceful close
of the TCP connection or an application-layer acknowledgment is
required. In order to protect against data loss, it is necessary that
the application-layer acknowledgment imply that the data has been
written to stable storage or suitably processed so as to guard against
loss.

In the case of partial failures, it is possible for the transport layer
to acknowledge receipt via transport layer acknowledgment, without
having delivered the data to the application. Similarly, the application
may not complete the tasks necessary to take responsibility for the
data.

For example, an accounting server may receive data from the transport
layer but be incapable of storing it due to a back end database
problem or disk fault. In this case it should not send an application
layer acknowledgment, even though a transport layer acknowledgment is
appropriate. Rather, an application layer error message should be sent
indicating the source of the problem, such as "Backend store
unavailable".
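
The distinction can be illustrated with a small Python sketch of a
server-side handler. It is only an illustration under stated
assumptions: the function names, the dictionary-shaped reply, and the
use of an append-only file as the "stable storage" are all hypothetical
and do not correspond to any particular accounting protocol.

```python
import json
import os


def store_record(path, record):
    """Append one record to a file and force it to stable storage."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
        f.flush()
        os.fsync(f.fileno())  # ask the OS to commit the data to disk


def handle_record(path, record):
    """Return an application-layer ACK only after the record is stored."""
    try:
        store_record(path, record)
    except OSError as exc:
        # The transport may already have acknowledged the bytes; the
        # application reports that it has NOT taken responsibility, and why.
        return {
            "session": record.get("session"),
            "status": "error",
            "reason": "Backend store unavailable",
            "detail": str(exc),
        }
    return {"session": record.get("session"), "status": "ack"}
```

A client receiving the error reply would keep the record, for example
in non-volatile storage, and retry or fail over, rather than deleting
it as it would after an "ack" reply.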

Thus application-layer acknowledgment capability requires not only the
ability to acknowledge when the application has taken responsibility for
the data, but also the ability to indicate when the application has not
taken responsibility for the data, and why.

6.1.6.  Network failures

Network failures may result in partial or complete loss of connectivity
for the accounting client. In the event of partial connectivity loss, it
may not be possible to reach the primary accounting server, in which
case switch over to the secondary accounting server is necessary. In
the event of a network partition, it may be necessary to store
accounting events in device memory or non-volatile storage until
connectivity can be re-established.

As with accounting server failures, on-the-wire interim accounting
provides only limited benefits in mitigating the effects of network
failures.

6.1.7.  Device reboots

In the event of a device reboot, it is desirable to minimize the loss of
data on sessions in progress. Such losses may be significant even if the
devices themselves are very reliable, due to long-lived sessions, which
can comprise a significant fraction of total resource consumption. To
guard against loss of these high-value sessions, interim accounting data
is typically transmitted over the wire. When interim accounting in-place
is combined with non-volatile storage it becomes possible to guard
against data loss in much shorter sessions. This is possible since
interim accounting data need only be stored in non-volatile memory until
the session completes, at which time the interim data may be replaced by
the session record. As a result, interim accounting data need never be
sent over the wire, and it is possible to decrease the interim interval
so as to provide a very high degree of protection against data loss.

6.1.8.  Accounting proxies

In order to maintain high reliability, it is important that accounting
proxies pass through transport and application layer acknowledgments and
do not store and forward accounting packets. This enables the
end-systems to control re-transmission behavior and utilize techniques
such as non-volatile storage and secondary servers to improve
resilience.

Accounting proxies sending a transport or application layer ACK to the
device without receiving one from the accounting server fool the device
into thinking that the accounting request had been accepted by the
accounting server when this is not the case. As a result, the device can
delete the accounting packet from non-volatile storage before it has
been accepted by the accounting server. This leaves the accounting proxy
responsible for delivering accounting packets. If the accounting proxy
involves moving parts (e.g. a disk drive) while the devices do not,
overall system reliability can be reduced.

Store and forward accounting proxies only add value in situations where
the accounting subsystem is unreliable. For example, where devices do
not implement non-volatile storage and the accounting protocol lacks
transport and application layer reliability, locating the accounting
proxy (with its stable storage) close to the device can reduce the risk
of data loss.

However, such systems are inherently unreliable so that they are only
appropriate for use in capacity planning or non-usage sensitive billing
applications. If archival accounting reliability is desired, it is
necessary to engineer a reliable accounting system from the start using
the techniques described in this document, rather than attempting to
patch an inherently unreliable system by adding store and forward
accounting proxies.

6.1.9.  Fault resilience summary

+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                |                                        |
| Fault          | Counter-measures                       |
|                |                                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                |                                        |
| Packet         | Retransmission based on RTT            |
| loss           | Congestion control                     |
|                | Well-defined timeout behavior          |
|                | Duplicate elimination                  |
|                | Interim accounting*                    |
|                | Non-volatile storage                   |
|                | Cumulative variables                   |
|                |                                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                |                                        |
| Accounting     | Primary-secondary servers              |
| server & net   | Duplicate elimination                  |
| failures       | Interim accounting*                    |
|                | Application layer ACK & error msgs.    |
|                | Non-volatile storage                   |
|                |                                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                |                                        |
| Device         | Interim accounting*                    |
| reboots        | Non-volatile storage                   |
|                |                                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Key
* = limited usefulness without non-volatile storage

Note: Accounting proxies are not a reliability
enhancement mechanism.

6.2.  Resource consumption

In the process of growing to meet the needs of providers and customers,
accounting management systems consume a variety of resources, including:

   Network bandwidth
   Memory
   Non-volatile storage
   State on the accounting management system
   CPU on the management system and managed devices

In order to understand the limits to scaling, we examine each of these
resources in turn.

6.2.1.  Network bandwidth

Accounting management systems consume network bandwidth in transferring
accounting data. The network bandwidth consumed is proportional to the
amount of data transferred, as well as required network overhead.
Since
accounting data for a given event may be 100 octets or less, if each
event is transferred individually, overhead can represent a considerable
proportion of total bandwidth consumption. As a result, it is often
desirable to transfer accounting data in batches, enabling network
overhead to be spread over a larger payload, and enabling efficient use
of compression. As noted in [48], compression can be enabled in the
accounting protocol, or can be done at the IP layer as described in [5].

6.2.2.  Memory

In accounting systems without non-volatile storage, accounting data must
be stored in volatile memory during the period between when it is
generated and when it is transferred. The resulting memory consumption
will depend on retry and retransmission algorithms. Since systems
designed for high reliability will typically wish to retry for long
periods, or may store interim accounting data, the resulting memory
consumption can be considerable. As a result, if non-volatile storage is
unavailable, it may be desirable to compress accounting data awaiting
transmission.

As noted earlier, implementors of interim accounting should take care to
ensure against excessive memory usage by overwriting older interim
accounting data with newer data for the same session rather than
accumulating interim data in the buffer.

6.2.3.  Non-volatile storage

Since accounting data stored in memory will typically be lost in the
event of a device reboot or a timeout, it may be desirable to provide
non-volatile storage for undelivered accounting data. With the costs of
non-volatile storage declining rapidly, network devices will be
increasingly capable of incorporating non-volatile storage support over
the next few years.

Non-volatile storage may be used to store interim or session records.
As
with memory utilization, interim accounting overwrite is desirable so as
to prevent excessive storage consumption. Note that the use of ASCII
data representation enables use of highly efficient text compression
algorithms that can minimize storage requirements. Such compression
algorithms are typically applied only to session records so as to enable
implementation of interim data overwrite.

6.2.4.  State on the accounting management system

In order to keep track of received accounting data, accounting
management systems may need to keep state on managed devices or
concurrent sessions. Since the number of devices is typically much
smaller than the number of concurrent sessions, it is desirable to keep
only per-device state if possible.

6.2.5.  CPU requirements

CPU consumption of the managed and managing nodes will be proportional
to the complexity of the required accounting processing. Operations such
as ASN.1 encoding and decoding, compression/decompression, and
encryption/decryption can consume considerable resources, both on
accounting clients and servers.

The effect of these operations on accounting system reliability should
not be under-estimated, particularly in the case of devices with
moderate CPU resources. In the event that devices are over-taxed by
accounting tasks, it is likely that overall device reliability will
suffer.

6.2.6.  Efficiency measures

+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                |                                        |
| Resource       | Efficiency measures                    |
|                |                                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                |                                        |
| Network        | Batching                               |
| Bandwidth      | Compression                            |
|                |                                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                |                                        |
| Memory         | Compression                            |
|                | Interim accounting overwrite           |
|                |                                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                |                                        |
| Non-volatile   | Compression                            |
| Storage        | Interim accounting overwrite           |
|                |                                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                |                                        |
| System         | Per-device state                       |
| state          |                                        |
|                |                                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                |                                        |
| CPU            | Hardware assisted                      |
| requirements   | compression/encryption                 |
|                |                                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

6.3.  Data collection models

Several data collection models are currently in use for the
purposes of accounting data collection. These include:

   Polling model
   Event-driven model without batching
   Event-driven model with batching
   Event-driven polling model

6.3.1.  Polling model

In the polling model, an accounting manager will poll devices for
accounting information at regular intervals. In order to ensure against
loss of data, the polling interval will need to be shorter than the
maximum time that accounting data can be stored on the polled device.
For devices without non-volatile storage, this is typically determined
by available memory; for devices with non-volatile storage the maximum
polling interval is determined by the size of non-volatile storage.
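
The bound on the polling interval can be estimated with simple
arithmetic, as in the Python sketch below. The function names, the
safety factor, and the sample figures (other than the 100 octet record
size mentioned earlier) are hypothetical, and the sketch assumes a
steady rate of fixed-size records.

```python
def max_polling_interval(buffer_octets, record_octets, records_per_second):
    """Seconds until the device's accounting buffer would fill."""
    if record_octets <= 0 or records_per_second <= 0:
        raise ValueError("record size and rate must be positive")
    return buffer_octets / (record_octets * records_per_second)


def choose_polling_interval(buffer_octets, record_octets,
                            records_per_second, safety_factor=0.5):
    """Poll well before the buffer fills, leaving room for retries."""
    limit = max_polling_interval(buffer_octets, record_octets,
                                 records_per_second)
    return safety_factor * limit


# Hypothetical device: 1,000,000 octets of buffer, 100 octet records,
# 10 accounting events per second.
limit = max_polling_interval(1_000_000, 100, 10)
interval = choose_polling_interval(1_000_000, 100, 10)
```

Under these assumptions the buffer fills in 1000 seconds, so any
polling interval at or above that value would lose data even without a
fault.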

The polling model results in an accumulation of data within individual
devices, and as a result, data is typically transferred to the
accounting manager in a batch, resulting in an efficient transfer
process. In terms of Accounting Manager state, polling systems scale
with the number of managed devices, and system bandwidth usage scales
with the amount of data transferred.

Without non-volatile storage, the polling model results in loss of
accounting data due to device reboots, but not due to packet loss or
network failures of sufficiently short duration to be handled within
available memory. This is because the Accounting Manager will continue
to poll until the data is received. In situations where operational
difficulties are encountered, the volume of accounting data will
frequently increase so as to make data loss more likely. However, in
this case the polling model will detect the problem since attempts to
reach the managed devices will fail.

The polling model scales poorly for implementation of shared use or
roaming services, including wireless data, Internet telephony, QoS
provisioning or Internet access. This is because in order to retrieve
accounting data for users within a given domain, the Accounting
Management station would need to periodically poll all devices in all
domains, most of which would not contain any relevant data. There are
also issues with processing delay, since use of a polling interval also
implies an average processing delay of half the polling interval. This
may be too high for accounting data that requires low processing delay.
Thus the event-driven polling or the pure event-driven approach is more
appropriate for usage sensitive billing applications such as shared use
or roaming implementations.
1149 Per-device state is typical of polling-based network management systems, 1150 which often also carry out accounting management functions, since 1151 network management systems need to keep track of the state of network 1152 devices for operational purposes. These systems offer average processing 1153 delays equal to half the polling interval. 1155 6.3.2. Event-driven model without batching 1157 In the event-driven model, a device will contact the accounting server 1158 or manager when it is ready to transfer accounting data. Most event- 1159 driven accounting systems, such as those based on RADIUS accounting, 1160 described in [4], transfer only one accounting event per packet, which 1161 is inefficient. 1163 Without non-volatile storage, a pure event-driven model typically stores 1164 accounting events that have not yet been delivered only until the 1165 timeout interval expires. As a result this model has the smallest memory 1166 requirements. Once the timeout interval has expired, the accounting 1167 event is lost, even if the device has sufficient buffer space to 1168 continue to store it. As a result, the event-driven model is the least 1169 reliable, since accounting data loss will occur due to device reboots, 1170 sustained packet loss, or network failures of duration greater than the 1171 timeout interval. In event-driven protocols without a "keep alive" 1172 message, accounting servers cannot assume a device failure should no 1173 messages arrive for an extended period. Thus, event-driven accounting 1174 systems are typically not useful in monitoring of device health. 1176 The event-driven model is frequently used in shared use networks and 1177 roaming, since this model sends data to the recipient domains without 1178 requiring them to poll a large number of devices, most of which have no 1179 relevant data. 
Since the event-driven model typically does not support 1180 batching, it permits accounting records to be sent with low processing 1181 delay, enabling application of fraud prevention techniques. However, 1182 because roaming accounting events are frequently of high value, the poor 1183 reliability of this model is an issue. As a result, the event-driven 1184 polling model may be more appropriate. 1186 Per-session state is typical of event-driven systems without batching. 1187 As a result, the event-driven approach scales poorly. However, event- 1188 driven systems offer the lowest processing delay since events are 1189 processed immediately and there is no possibility of an event requiring 1190 low processing delay being caught behind a batch transfer. 1192 6.3.3. Event-driven model with batching 1194 In the event-driven model with batching, a device will contact the 1195 accounting server or manager when it is ready to transfer accounting 1196 data. The device can contact the server when a batch of a given size has 1197 been gathered, when data of a certain type is available or after a 1198 minimum time period has elapsed. Such systems can transfer more than one 1199 accounting event per packet and are thus more efficient. 1201 An event-driven system with batching will store accounting events that 1202 have not yet been delivered up to the limits of memory. As a result, 1203 accounting data loss will occur due to device reboots, but not due to 1204 packet loss or network failures of sufficiently short duration to be 1205 handled within available memory. Note that while transfer efficiency 1206 will increase with batch size, without non-volatile storage, the 1207 potential data loss from a device reboot will also increase. 1209 Where event-driven systems with batching have a keep-alive interval and 1210 run over reliable transport, the accounting server can assume that a 1211 failure has occurred if no messages are received within the keep-alive 1212 interval. 
Thus, such implementations can be useful in monitoring of 1213 device health. When used for this purpose the average time delay prior 1214 to failure detection is one half the keep-alive interval. 1216 Through implementation of a scheduling algorithm, event-driven systems 1217 with batching can deliver appropriate service to accounting events that 1218 require low processing delay. For example, high-value inter-domain 1219 accounting events could be sent immediately, thus enabling use of fraud- 1220 prevention techniques, while all other events would be batched. However, 1221 there is a possibility that an event requiring low processing delay will 1222 be caught behind a batch transfer in progress. Thus the maximum 1223 processing delay is proportional to the maximum batch size divided by 1224 the link speed. 1226 Event-driven systems with batching scale with the number of active 1227 devices. As a result this approach scales better than the pure event- 1228 driven approach, or even the polling approach, and is equivalent in 1229 terms of scaling to the event-driven polling approach. However, the 1230 event-driven batching approach has lower processing delay than the 1231 event-driven polling approach, since delivery of accounting data 1232 requires fewer round-trips and events requiring low processing delay can 1233 be accommodated if a scheduling algorithm is employed. 1235 6.3.4. Event-driven polling model 1237 In the event-driven polling model an accounting manager will poll the 1238 device for accounting data only when it receives an event. The 1239 accounting client can generate an event when a batch of a given size has 1240 been gathered, when data of a certain type is available or after a 1241 minimum time period has elapsed. Note that while transfer efficiency 1242 will increase with batch size, without non-volatile storage, the 1243 potential data loss from a device reboot will also increase. 
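The three conditions under which an accounting client generates an event
(batch size reached, data of a certain type available, minimum time period
elapsed) can be sketched as follows. This is an illustration, not a protocol
definition: the class name, the thresholds and the idea of a set of "urgent"
event types are assumptions made for the example.

```python
class NotificationTrigger:
    """Decides when an accounting client should signal its manager.

    Hypothetical helper; thresholds and urgent types are illustrative.
    """

    def __init__(self, batch_size, min_period, urgent_types=()):
        self.batch_size = batch_size
        self.min_period = min_period
        self.urgent_types = set(urgent_types)
        self.pending = []
        self.oldest = None  # arrival time of the oldest undelivered event

    def add(self, event_type, now):
        """Queue an event and report whether the manager should be signalled."""
        if not self.pending:
            self.oldest = now
        self.pending.append(event_type)
        return self.should_signal(now)

    def should_signal(self, now):
        if not self.pending:
            return False
        batch_full = len(self.pending) >= self.batch_size
        urgent = any(t in self.urgent_types for t in self.pending)
        period_elapsed = now - self.oldest >= self.min_period
        return batch_full or urgent or period_elapsed

    def drain(self):
        """Hand over the queued events, as when the manager polls."""
        batch, self.pending, self.oldest = self.pending, [], None
        return batch


trigger = NotificationTrigger(batch_size=3, min_period=60,
                              urgent_types={"roaming"})
```

In this sketch a larger batch_size improves transfer efficiency, but every
event still held in pending is lost if the device reboots without
non-volatile storage.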
1245 Without non-volatile storage, an event-driven polling model will lose 1246 data due to device reboots, but not due to packet loss or network 1247 partitions of short duration. Unless a minimum delivery interval is set, 1248 event-driven polling systems are not useful in monitoring of device 1249 health. 1251 The event-driven polling model can be suitable for use in roaming since 1252 it permits accounting data to be sent to the roaming partners with low 1253 processing delay. At the same time non-roaming accounting can be handled 1254 via more efficient polling techniques, thereby providing the best of 1255 both worlds. 1257 Where batching can be implemented, the state required in event-driven 1258 polling can be reduced to scale with the number of active devices. If 1259 portions of the network vary widely in usage, then this state may 1260 actually be less than that of the polling approach. Note that processing 1261 delay in this approach is higher than in event-driven accounting with 1262 batching since at least two round-trips are required to deliver data: 1263 one for the event notification, and one for the resulting poll. 1265 6.3.5.
Data collection summary

1267 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1268 |               |                   |                   |
1269 | Model         | Pros              | Cons              |
1270 |               |                   |                   |
1271 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1272 | Polling       | Per-device state  | Not robust        |
1273 |               | Robust against    | against device    |
1274 |               | packet loss       | reboot, server    |
1275 |               | Batch transfers   | or network        |
1276 |               |                   | failures*         |
1277 |               |                   | Polling interval  |
1278 |               |                   | determined by     |
1279 |               |                   | storage limit     |
1280 |               |                   | High processing   |
1281 |               |                   | delay             |
1282 |               |                   | Unsuitable for    |
1283 |               |                   | use in roaming    |
1284 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1285 | Event-driven, | Lowest processing | Not robust        |
1286 | no batching   | delay             | against packet    |
1287 |               | Suitable for      | loss, device      |
1288 |               | use in roaming    | reboot, or        |
1289 |               |                   | network           |
1290 |               |                   | failures*         |
1291 |               |                   | Low efficiency    |
1292 |               |                   | Per-session state |
1293 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1294 | Event-driven, | Single round-trip | Not robust        |
1295 | with batching | latency           | against device    |
1296 | and           | Batch transfers   | reboot, network   |
1297 | scheduling    | Suitable for      | failures*         |
1298 |               | use in roaming    |                   |
1299 |               | Per active device |                   |
1300 |               | state             |                   |
1301 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1302 | Event-driven  | Batch transfers   | Not robust        |
1303 | polling       | Suitable for      | against device    |
1304 |               | use in roaming    | reboot, network   |
1305 |               | Per active device | failures*         |
1306 |               | state             | Two round-trip    |
1307 |               |                   | latency           |
1308 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

1310 Key
1311 * = addressed by non-volatile storage

1313 7. Review of Accounting Protocols

1315 Accounting systems have been successfully implemented using protocols
1316 such as RADIUS, TACACS+, and SNMP. This section describes the
1317 characteristics of each of these protocols.

1319 7.1.
RADIUS 1321 RADIUS accounting, described in [4], was developed as an add-on to the 1322 RADIUS authentication protocol, described in [3]. As a result, RADIUS 1323 accounting shares the event-driven approach of RADIUS authentication, 1324 without support for batching or polling. As a result, RADIUS accounting 1325 scales with the number of accounting events instead of the number of 1326 devices, and accounting transfers are inefficient. 1328 Since RADIUS accounting is based on UDP and timeout and retry parameters 1329 are not specified, implementations vary widely in their approach to 1330 reliability, with some implementations retrying until delivery or buffer 1331 exhaustion, and others losing accounting data after a few retries. Since 1332 RADIUS accounting does not provide for application-layer acknowledgments 1333 or error messages, a RADIUS Accounting-Response is equivalent to a 1334 transport-layer acknowledgment and provides no protection against 1335 application layer malfunctions. Due to the lack of reliability, it is 1336 not possible to do simultaneous usage control based on RADIUS accounting 1337 alone. Typically another device data source is required, such as polling 1338 of a session MIB or a command-line session over telnet. 1340 RADIUS accounting implementations are vulnerable to packet loss as well 1341 as application layer failures, network failures and device reboots. 1342 These deficiencies are magnified in inter-domain accounting as is 1343 required in roaming ([1],[2]). On the other hand, the event-driven 1344 approach of RADIUS accounting is useful where low processing delay is 1345 required, such as credit risk management or fraud detection. 
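Because timeout and retry parameters are not specified, two implementations
can treat the same outage very differently. The sketch below contrasts a
"give up after a few retries" policy with a "retry until delivery" policy. It
does not use any real RADIUS library; the function name and the list of
per-attempt outcomes are assumptions made purely for illustration.

```python
def deliver(attempt_outcomes, max_retries=None):
    """Return the attempt number that succeeded, or None if the record is lost.

    attempt_outcomes: iterable of booleans, one per transmission attempt
        (True means an Accounting-Response came back in time).
    max_retries: retransmissions allowed after the first attempt; None means
        keep trying for as long as the record stays buffered, modelled here
        (as an assumption) by the outcomes running out.
    """
    for attempt, acknowledged in enumerate(attempt_outcomes, start=1):
        if acknowledged:
            return attempt
        if max_retries is not None and attempt > max_retries:
            return None  # retry budget spent; the accounting record is dropped
    return None  # buffer exhausted before any acknowledgment arrived


# Hypothetical outage: the first three transmissions go unanswered.
outage = [False, False, False, True]
limited = deliver(outage, max_retries=2)  # gives up after the third attempt
persistent = deliver(outage)              # succeeds on the fourth attempt
```

The same outage loses the record under the limited policy and delivers it
under the persistent one, which is why reliability cannot be assumed from the
protocol alone.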
1347 While RADIUS accounting does provide hop-by-hop authentication and 1348 integrity protection, and IPSEC can be employed to provide hop-by-hop 1349 confidentiality, data object security is not supported, and thus systems 1350 based on RADIUS accounting are not capable of being deployed with 1351 untrusted proxies, or in situations requiring auditability, as noted in 1352 [2]. 1354 While RADIUS does not support compression, IP compression, described in 1355 [5], can be employed to provide this. While in principle extensible 1356 with the definition of new attributes, RADIUS suffers from the very 1357 small standard attribute space (256 attributes). 1359 7.2. TACACS+ 1361 TACACS+ offers an accounting model with start, stop, and interim update 1362 messages. Since TACACS+ is based on TCP, implementations are typically 1363 resilient against packet loss and short-lived network partitions, and 1364 TACACS+ scales with the number of devices. Since TACACS+ runs over TCP, 1365 it offers support for both transport layer and application layer 1366 acknowledgments, and is suitable for simultaneous usage control and 1367 handling of accounting events that require moderate though not the 1368 lowest processing delay. 1370 TACACS+ provides for hop-by-hop authentication and integrity protection 1371 as well as hop-by-hop confidentiality. Data object security is not 1372 supported, and therefore systems based on TACACS+ accounting are not 1373 deployable in the presence of untrusted proxies. While TACACS+ does not 1374 support compression, IP compression, described in [5], can be employed 1375 to provide this. 1377 7.3. SNMP 1379 SNMP, described in [19],[27]-[41], has been widely deployed in a wide 1380 variety of intra-domain accounting applications, typically using the 1381 polling data collection model. Since polling allows data to be collected 1382 on multiple accounting events simultaneously, this model results in per- 1383 device state. 
Since the management agent is able to retry requests when 1384 a response is not received, such systems are resilient against packet 1385 loss or even short-lived network partitions. While implementations 1386 without non-volatile storage can only store accounting events up to the 1387 limits of their memory, and thus are not robust against device reboots 1388 or network failures, when combined with non-volatile storage, they can 1389 be made highly reliable. With version 2 of the SNMP protocol operations 1390 [41] (supported in SNMPv2c and SNMPv3) it is possible to support Inform 1391 Requests which can be considered to be confirmed notifications. This 1392 makes it possible to implement an event-driven polling model or even an 1393 event-driven batching model. However, we are not aware of any SNMP-based 1394 accounting implementations built on these models. 1396 7.3.1. Security services 1398 While SNMPv1 and SNMPv2c support per-packet authentication (via the 1399 community string), they do not support per-packet integrity and replay 1400 protection or confidentiality. With SNMPv1 and SNMPv2c it is possible to 1401 support view-based access control. The snmpCommunityMIB defined in [11] 1402 allows a communityName to be mapped onto the proper contextName, 1403 securityName, and assumes a default securityLevel of noAuth, permitting 1404 view-based access controls described in [40] to be applied. 1406 SNMPv3 supports per-packet authentication, integrity and replay 1407 protection as well as confidentiality. The User Security Model (USM) is 1408 described in [38], and the View Access Control Model (VACM) is described 1409 in [40]. As a result, SNMPv3-based accounting implementations can 1410 provide hop-by-hop authentication, integrity and replay protection, 1411 confidentiality and access-control. 
1413 Where the product of the number of domains and devices is large, such as 1414 in inter-domain accounting applications, the number of shared secrets 1415 can get out of hand. The localized key capability in the SNMPv3 USM 1416 allows a manager to have one central key, sharing it with many agents in 1417 a localized way while preventing the agents from getting at each other's 1418 data. This can assist in cross-domain security if deployed properly. 1420 SNMPv3 does not support data-object security. Merely providing security 1421 for individual MIB variables is not sufficient. In order to prevent a 1422 cut and paste attack by an untrusted proxy, it is necessary to provide 1423 integrity protection covering enough of the packet (including other MIB 1424 variables) to protect against replay. Note that data-object security is 1425 very computationally intensive (no matter what protocol is used to carry 1426 it), so that it should be used sparingly. Thus its use may be reserved 1427 for inter-domain accounting involving user-sensitive billing, where data 1428 passes through an untrusted proxy. 1430 7.3.2. Application layer acknowledgments 1432 SNMP provides for limited application-layer acknowledgment capability. 1433 Since an SNMP Response to a get, get-next or get-bulk request returns 1434 the requested data, this implies that the receiving application received 1435 the request. 1437 For a SetRequest, RFC 1905 [41], section 4.2.5 states: 1439 If any of these assignments fail (even after all the previous 1440 validations), then all other assignments are undone, and the 1441 Response-PDU is modified to have the value of its error-status field 1442 set to 'commitFailed', and the value of its error-index field set to 1443 the index of the failed variable binding. 
1445 If and only if it is not possible to undo all the assignments, then 1446 the Response-PDU is modified to have the value of its error-status 1447 field set to 'undoFailed', and the value of its error-index field is 1448 set to zero. Note that implementations are strongly encouraged to 1449 take all possible measures to avoid use of either 'commitFailed' or 1450 'undoFailed' - these two error-status codes are not to be taken as 1451 license to take the easy way out in an implementation. 1453 Finally, the generated Response-PDU is encapsulated into a message, 1454 and transmitted to the originator of the SetRequest-PDU. 1456 Section 4.2.4 states: 1458 Upon receipt of a Response-PDU, the receiving SNMP entity presents 1459 its contents to the SNMP application which generated the request 1460 with the same request-id value. 1462 If the error-status field of the Response-PDU is non-zero, the value 1463 fields of the variable bindings in the variable binding list are 1464 ignored. 1466 If both the error-status field and the error-index field of the 1467 Response-PDU are non-zero, then the value of the error-index field is 1468 the index of the variable binding (in the variable-binding list of 1469 the corresponding request) for which the request failed. 1471 As a result, a 'noError' Response to a SetRequest indicates that the 1472 requested assignments were made by the application, which can include 1473 writing to stable storage if required. An error-response indicates that 1474 the Command Responder application received the request, but did not 1475 succeed in executing it. While a Response to an errant SetRequest 1476 operation does return SetRequest-specific error codes, and does indicate 1477 the index of the variable binding for which the request failed, it does 1478 not return MIB-specific error codes. 
Thus, it is not possible to 1479 indicate why a specific MIB object could not be set/changed, and thus it 1480 is not possible to provide application-specific error codes. 1482 For an InformRequest, RFC 1905 [41] section 4.2.7 states: 1484 An InformRequest-PDU is generated and transmitted at the request of 1485 an application in a SNMP entity acting in a manager role, that 1486 wishes to notify another application (in a SNMP entity also acting 1487 in a manager role) of information in a MIB view which is remote to 1488 the receiving application. 1490 The destination(s) to which an InformRequest-PDU is sent is specified 1491 by the requesting application. The first two variable bindings in 1492 the variable binding list of an InformRequest-PDU are sysUpTime.0 1493 and snmpTrapOID.0 respectively. If the OBJECTS clause is present 1494 in the invocation of the corresponding NOTIFICATION-TYPE macro, then 1495 each corresponding variable, as instantiated by this notification, is 1496 copied, in order, to the variable-bindings field. 1498 Upon receipt of an InformRequest-PDU... otherwise the receiving 1499 SNMP entity: 1501 (1) presents its contents to the appropriate SNMP application; 1502 (2) generates a Response-PDU with the same values in its request-id 1503 and variable-bindings fields as the received InformRequest-PDU, 1504 with the value of its error-status field is set to `noError' 1505 and the value of its error-index field is zero; and 1507 (3) transmits the generated Response-PDU to the originator of the 1508 InformRequest-PDU 1510 As with the SetRequest, the Response to an Inform-Request returns the 1511 same values as in the Inform-Request on completion of the operation, and 1512 returns only a general error-status, not MIB-specific or application- 1513 level error codes. In the case of a Response to an Inform-Request, the 1514 error-index can ONLY be zero. 
Thus while a Response to an errant Inform- 1515 Request operation can return an error-status, it cannot indicate the 1516 index of the variable binding for which the Inform-Request failed, nor 1517 can it return MIB-specific or application-specific error codes. Thus, it 1518 is not possible to indicate which specific MIB object included in the 1519 Inform-Request could not be accepted, or why. 1521 The processing sequence prior to sending a Response to an Inform-Request 1522 PDU is implementation specific and not standardized. An Inform-Request 1523 is transmitted at the request of a sending application, and is presented 1524 to an application on the receiving side. In a typical implementation, 1525 the Notification receiver accepts an Inform-Request, stores the varBinds 1526 in an internal buffer, and tells the SNMP stack/engine that it was 1527 received, at which point a Response is sent. In such an implementation, 1528 the Response confirms receipt by the receiving application, and is 1529 presented to the sending application. However, it does not confirm that 1530 the application has processed the data. Should an error occur in 1531 processing, it would be too late to send an error message since the 1532 Response has already been sent. In such an implementation, the Response 1533 is not a true application-layer acknowledgment, since it does not 1534 indicate that the application has taken responsibility for the data. On 1535 the other hand, it is possible for an implementation to wait until the 1536 receiving application has processed the Inform-Request prior to sending 1537 the Response. In this case, the Response would represent a true 1538 application-layer acknowledgment. 1540 7.3.3. Proxy forwarders 1542 In the accounting management architecture, proxy forwarders play an 1543 important role, forwarding intra and inter-domain accounting events to 1544 the correct destinations. 
The proxy forwarder may also play a role in a 1545 polling or event-driven polling architecture. 1547 The functionality of an SNMP Proxy Forwarder is defined in [39]. The use 1548 of proxy forwarders simplifies the configuration of the devices, as well 1549 as reducing the number of shared secrets required for inter-domain 1550 accounting. For example, the network devices may be configured to send 1551 notifications for all domains to the Proxy Forwarder, and the devices 1552 may be configured to allow the Proxy Forwarder to access all MIB data. 1554 Where Proxy Forwarders are employed, the domains typically share a 1555 secret with the Proxy Forwarder, and in turn, the Proxy Forwarder shares 1556 a secret with each of the devices. Thus the number of shared secrets 1557 will scale with the sum of the number of devices and domains rather than 1558 the product. 1560 Note that an SNMP Proxy Forwarder is forbidden to provide access control 1561 at the varBind level. This means that the Proxy Forwarder does not need 1562 to look inside the PDU of the message except to determine the 1563 contextEngineID to verify it is not destined to itself. If 1564 contextEngine == securityEngine, with other qualifications, then the 1565 message is being sent to the current engine, so it is processed locally 1566 rather than being sent to the proxy forwarder. 1568 Restrictions on use of proxies to provide access control at the varBind 1569 level also affect the ability to provide support for legacy devices. If 1570 legacy devices do not support view-based access control, then the proxy 1571 will not be able to provide this capability. 1573 Issues of legacy support also exist with the NMRG proposals. A proxy 1574 forwarder receiving a "get subtree" PDU destined for a device that did 1575 not support this PDU would need to translate the "get subtree" PDU into 1576 multiple get-next or get-bulk requests. 
This issue would also exist with 1577 the subtree retrieval MIB since unless the legacy devices also supported 1578 the MIB, the proxy would encounter the "get-bulk overshoot" problem. 1580 Similarly, unless a device supports the SNMP-over-TCP transport mapping, 1581 deployment of a TCP transport-capable proxy forwarder will not provide 1582 much benefit, since the proxy forwarder will need to fall back to UDP- 1583 based get-next or get-bulk operations. This will result in multiple 1584 round-trips and high latency, and in addition the risk of inconsistent 1585 tables would remain. Existing proxy forwarders support only current 1586 standards, so new proxy forwarder code would be needed to support 1587 the NMRG proposals. 1589 7.3.4. Domain-based access controls in SNMP 1591 Domain-based access controls are required where multiple administrative 1592 domains are involved, such as in the shared use networks and roaming 1593 associations described in [1]. Since the same device may be accessed by 1594 multiple organizations, it is often necessary to control access to 1595 accounting data according to the user's organization. This ensures that 1596 organizations may be given access to accounting data relating to their 1597 users, but not to data relating to users of other organizations. 1599 In order to apply domain-based access controls in inter-domain 1600 accounting, it is first necessary to identify the data subset that is 1601 to have its access controlled. Several conceptual abstractions are used 1602 for identifying subsets of data in SNMP. These include engines, 1603 contexts, and views. This section describes how this functionality may 1604 be applied in intra- and inter-domain accounting. 1606 7.3.4.1. Engines 1608 The new SNMP architecture, described in [27], added the concept of 1609 engine and contextEngineID. The SNMPv3 message format explicitly passes 1610 the contextEngineID in the message.
SNMPv1/v2c depend on a mapping table 1611 (snmpCommunityTable, described in [11]) to map a community name plus 1612 possibly an address into a contextEngineID and contextName. 1614 SNMPv3 supports the use of the contextEngineID field in order to 1615 identify the engine which provides access to the data. In traditional 1616 terms, this is the agent. engineID support was added in order to improve 1617 handling of mobility as well as to improve SNMP proxy forwarder 1618 support. Use of engineID enables improved mobility, allowing the agent 1619 on a laptop to be identified independently of the IP address where it is 1620 attached to the network. In SNMPv1 and v2c, different endpoint addresses 1621 imply different agents. This is not the case with SNMPv3. 1623 contextEngineID also enables an SNMP Proxy Forwarder to identify the 1624 data origin. While in SNMPv1 and v2c, the data origin is automatically 1625 assumed to be the communications endpoint (SNMP agent), with SNMPv3 it 1626 is possible to distinguish the data origin from the communications 1627 endpoint. 1629 For example, let us assume that agent A sends a notification to manager 1630 M through Proxy Forwarder B, which forwards it. Let us assume that an SNMP 1631 Proxy Forwarder supporting SNMPv3 is used, and manager M supports SNMPv3 1632 as well. Then the notification received by manager M will have a source 1633 address corresponding to Proxy Forwarder B, and contextEngineID will 1634 identify agent A as the data origin of the notification. Thus, by using 1635 contextEngineID, Proxy Forwarder B can allow manager M to identify the 1636 data origin, no matter how many Proxy Forwarders have forwarded it.
1637 However, SNMPv1 and v2c do not support explicitly passing 1638 contextEngineID in the message, so that if Proxy Forwarder B or manager 1639 M only supported SNMPv1/v2c, M would need to assume that the data in the 1640 notification (from A) refers to the instrumentation of the agent at the 1641 last hop (B). 1643 Note that in SNMP there is only a single contextEngineID per SNMP 1644 entity. 1646 7.3.4.2. Contexts 1648 Contexts are used to identify subsets of objects that are tied to 1649 instrumentation. These subsets are defined by the agent rather than the 1650 manager since if the manager defined them, the agent would not know how 1651 to tie the contexts to the underlying instrumentation. 1653 In SNMPv3, contextName is passed explicitly within the message. 1654 SNMPv1/v2c depend on a mapping table (snmpCommunityTable, described in 1655 [11]) to map a community name plus possibly an address into a 1656 contextEngineID and contextName. 1658 contextName represents a slice of the data contained within a particular 1659 engine. Contexts are defined in a dynamic table, with the names defined 1660 as read-only. The agent uses the dynamic table to tell the manager what 1661 contexts it recognizes. 1663 Contexts are commonly tied to hardware components, to logical entities 1664 related to the hardware components, or to logical services. For example, 1665 contextNames might include board5, board7, repeater1, repeater2, etc. 1667 While each context may support multiple MIB modules, each contextName is 1668 limited to one instance of a particular MIB module. Thus, if multiple 1669 instances of a MIB module are required per engine, then unique 1670 contextNames must be defined. If it only makes sense to have one 1671 instance of a MIB module in an engine, such as the USM userTable, such a 1672 MIB will typically fall into the default context "". 
Note that while a 1673 MIB module may allow more than one instance per engine, a given SNMPv3 1674 implementation may not support this. 1676 7.3.4.3. Views 1678 A view is a mask for a particular contextName (subset of data). The view 1679 identifies which objects are visible, by specifying OIDs of the subtrees 1680 involved. There is also a mechanism to allow wildcards in the OID 1681 specification. 1683 For example, it is possible to define a view that includes RMON tables, 1684 and another view that includes only the SNMPv3 security related tables. 1685 Using these views, it is possible to allow access to the RMON view for 1686 users Joe and Josephine (the RMON administrators), and access to the 1687 SNMPv3 security tables for user Adam (the SNMP security Administrator). 1689 Views can be set up with wildcards. For a table that is indexed using IP 1690 addresses, Joe can be allowed access to all rows in given RMON tables 1691 (e.g. the RMON hostTable). For example, rows that are in the subnet 1692 134.141.x.x can be accessed by Joe, with Josephine given access to all 1693 rows for subnet 134.200.x.x. However, for this to work the table must be 1694 indexed by the differentiating variable, since views filter at the name 1695 level (OIDs), not at the value level. It is therefore not possible to 1696 define a view that filters on the value of non-index data. In this 1697 example, were the IP address to have been used merely as a data item 1698 rather than an index, it would not be possible to utilize view-based 1699 access control to achieve the desired objective (delegation of 1700 administrative responsibility according to subnet). 1702 7.3.5. Inter-domain access-control alternatives 1704 In order to be able to control access to accounting data on a per-domain 1705 basis, there are several alternatives. These include use of the domain 1706 as an index, engines, contexts and proxies. 1708 7.3.5.1. 
Domain as index 1710 Through use of view-based access control [40], it is possible to define 1711 multiple fine-grained views of an SNMP MIB, and to assign views to 1712 specific groups of users, such that access rights to the included data 1713 elements will depend on the identity of the user making the request. 1714 For example, all users of bigco.com who are allowed access to the 1715 device would be defined in the User-based security MIB (or other 1716 security model MIB). For simplicity in administering access control, 1717 the users can be grouped using a vacmGroupName, e.g. bigco. A view of a 1718 subset of the data objects in the MIB can be defined in the 1719 vacmViewFamilyTreeTable. A vacmAccessTable pairs groups and views. For 1720 messages received from users in the bigco group, access would only be 1721 provided to the data permitted to be viewed by bigco users, as defined 1722 in the view family tree. This requires that each domain accessing the 1723 data be given one or more separate vacmGroupNames, an appropriate 1724 ViewTable be defined, and the vacmAccessTable be configured for each 1725 group. 1727 As the number of network devices within the shared use or roaming 1728 network grows, the polling model of data collection becomes increasingly 1729 impractical since most devices will not carry data relating to the 1730 polling organization. As a result, shared-use networks or roaming 1731 associations relying on SNMP-based accounting have generally collected 1732 data for all organizations and then sorted the resulting session records 1733 for delivery to each organization. While functional, this approach will 1734 typically result in increased processing delay as the number of 1735 organizations and data records grows. 1737 This issue can be addressed in SNMPv3 through use of view-based access 1738 control and the SNMP notification tables, using the event-driven, event- 1739 driven polling or event-driven batching approaches.
This permits SNMPv3-enabled devices to notify domains that have
accounting data awaiting collection.

However, since views filter at the OID level, not at the data level,
when using views to filter by domain it is necessary to use the domain
as an index. For example, a table of session data could be indexed by
record number and domain, allowing a view to be defined that could
restrict access to domain data to the administrators of that domain.
For example, user bigco could be allowed to view data relating to users
within the bigco.com domain, but user smallco would not be allowed
access to this view.

An advantage of using domains as an index is that this technique can be
used with SNMPv1 and v2c agents as well as with v3 agents. A
disadvantage is that the MIBs must be specifically designed for this
purpose. Since existing MIBs rarely use the domain as an index, domain
separation cannot be enabled within legacy MIBs using this technique.

7.3.5.2. Engines

Another approach is to use contextEngineID to differentiate between
data within individual domains. This approach would only be feasible
for use with SNMPv3, where contextEngineID is supported. Since this
technique can work with existing MIBs it enables domain separation to
be applied to MIBs not specifically designed with domain separation in
mind.

One way this can be implemented is to provide multiple SNMP agents on
the same system, one for each domain, differentiated by
contextEngineID. However, this approach is not very scalable,
particularly if there are a large number of domains involved, since it
would require multiple agent instantiations, each with its own separate
data space.

7.3.5.3. Contexts

Still another approach is to use contextName to differentiate between
data within individual domains.
contextName could offer a mechanism for de-multiplexing MIB modules,
just as community names did in SNMPv1/v2c. The distinction is that
community was overloaded to serve multiple purposes, while contextName
is not. Since this technique can work with existing MIBs it enables
domain separation to be applied to MIBs not specifically designed with
domain separation in mind.

contextName can also be made to work with SNMPv1/v2c as well as SNMPv3.
contextName can be supported in SNMPv1/v2c via the snmpCommunityMIB
described in [11]. In this case contextName does not travel in the SNMP
message; the communityName is mapped to a contextName via the
communityTable. This means that an SNMPv1/v2c agent can use
contextNames and VACM-based access control if it supports the VACM MIB
[40] and the snmpCommunityMIB [11].

Of course, legacy SNMPv1/v2c agents typically do not support these
MIBs, so in most cases a new multi-lingual SNMPv1/v2c/v3 agent that
supports not only SNMPv3 security (USM) but also VACM and
snmpCommunityMIB will be required. Such an agent can handle SNMP
get/get-next/get-bulk requests from SNMPv1, SNMPv2c and SNMPv3
managers, and can map such requests into the proper contextEngineID,
contextName, securityName and securityLevel. VACM-based access control
can then be used for any SNMP request, whether it arrived in SNMPv1,
SNMPv2c or SNMPv3 format.

Since multiple contextNames, and the use of contextNames for domain
separation, represent new territory, careful consideration is
recommended prior to implementation. In the design of SNMPv3, contexts
were intended to be used by an agent to inform a manager about the
contexts known to the agent. As a result, vacmContextName is read-only
and so cannot be configured directly using SNMP.
Since contextName is not manager configurable, this implies that agents
must dynamically create contexts, or that a new MIB needs to be defined
for this purpose.

Manager-defined contexts are problematic because the agent doesn't know
what objects are encompassed by such a manager-defined context. A
supplementary MIB module, to which the manager-defined domain names can
be written, could be used to circumvent the read-only nature of the
vacmContexts. The agent would need to take the manager-defined context
names from the supplementary MIB module and reflect them in the
vacmContext table, i.e. dynamically create the domain-specific
contexts.

For an agent to create contexts as needed, it needs to sort accounting
data by domain, dynamically creating new contextNames and putting the
accounting records into virtual tables corresponding to the
contextNames. Note that in AgentX [51], when a subagent registers a MIB
(or subtree) for a new contextName, then the master agent can decide to
create a new contextName.

Since views filter at the name level, not the value level, domain
separation is handled by using contextName to differentiate multiple
virtual tables. For example, if accounting data has been collected on
users with the bigco.com and smallco.com domains, then a separate
instance of the accounting session record table would exist for each
domain, and each domain would have a corresponding contextName. When a
get-bulk request is made with a contextName of bigco.com, then the
table instance corresponding to bigco.com would be returned. Since this
approach requires multiple virtual tables, instance-level access
control is required so that an operation can be permitted or denied
based on the provided contextName.

Indexing of the virtual tables can be handled in several ways.
The indexing scheme could be chosen so that a given index never exists
in more than one context. For example, the table for the bigco.com
context could have indexes 4,5,6,7,8,9 and 10 and the smallco.com
context could have 1 and 2, with nobody having 3 and the default
context having a virtual table with all index values.

Another strategy would be to hide the provisioning from the customer by
having bigco.com own indices 1,2,3,4,5,6 in context bigco.com, while
smallco.com would own indices 1 and 2 in context smallco.com. Thus the
indices would be renormalized within each context. Note that this
approach makes it more difficult to handle the default context, which
comprises all entries.

In practice, for this approach to work, it is necessary that the
contextName be passed to the agent. Within AgentX [51], the only
standardized sub-agent protocol, access control is handled by the
master agent. After the master agent starts up, sub-agent(s) connect to
the master agent and register a MIB or table in a MIB, at which time
the sub-agent registers for one or more contextNames. Note that in this
case the sub-agent will typically need to create new contexts on the
fly, and so may not know beforehand all the contextNames it should
register for.

The master agent then passes requests with the requested contextNames.
The decision on whether the data is returned to the manager is in the
hands of the master agent. When the sub-agent registers for one or more
contextNames, the master agent creates entries in contextTable. Since
contextName is part of the index of the vacmAccessTable, when a request
arrives, the master agent will not pass the request to the sub-agent if
access is not allowed within that contextName.
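The contextName approach described above can be sketched in a few lines
of Python. This is only an illustration under stated assumptions, not an
AgentX or VACM implementation: the record layout, the ACCESS set (a
hypothetical stand-in for the vacmAccessTable) and both function names
are invented for the example. It shows records being sorted into one
virtual table per domain with indices renormalized within each context,
and a master agent refusing to forward a request when the group has no
access within the requested contextName.

```python
from collections import defaultdict


def build_virtual_tables(records):
    """Sort accounting records by domain into one virtual table per
    contextName, renumbering the index from 1 within each context."""
    tables = defaultdict(dict)
    for rec in records:
        # Assumption: the domain is the part of the user name after '@'.
        context_name = rec["user"].split("@", 1)[1]
        next_index = len(tables[context_name]) + 1
        tables[context_name][next_index] = rec
    return dict(tables)


# Hypothetical stand-in for the vacmAccessTable: the set of
# (groupName, contextName) pairs that are allowed read access.
ACCESS = {("bigco", "bigco.com"), ("smallco", "smallco.com")}


def master_agent_get(group, context_name, tables, access=ACCESS):
    """Return the virtual table for context_name, or None when the
    master agent denies access and never consults the sub-agent."""
    if (group, context_name) not in access:
        return None
    return tables.get(context_name, {})


records = [
    {"user": "joe@bigco.com", "octets": 1200},
    {"user": "adam@smallco.com", "octets": 300},
    {"user": "josephine@bigco.com", "octets": 4500},
]
tables = build_virtual_tables(records)
bigco_view = master_agent_get("bigco", "bigco.com", tables)
denied = master_agent_get("smallco", "bigco.com", tables)
```

With the sample records, the bigco group sees two rows indexed 1 and 2
in its own context, while a request from the smallco group for the
bigco.com context is rejected before any table lookup takes place.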
The agent (combination of master and sub-agents) creates the
appropriate virtual tables and makes the data available within the
appropriate contextName. Since legacy agents will not be able to do
this, they must be updated to take advantage of this capability.

7.3.6. Outstanding issues

Reference [49] discusses issues that arise when using SNMP for transfer
of bulk data. These include issues of latency, network overhead, and
table retrieval.

In accounting applications, it is often necessary for management
stations to retrieve large tables, and in such situations the latency
can be high, even with the get-bulk operation. This is because the
response must fit into the largest supported packet size, requiring
multiple round-trips. Unless multiple threads are employed, the
transfers will be serialized and the resulting latency will be a
combination of multiple round-trip times, timeout and re-transmission
delays and processing overhead, which may result in unacceptable
performance. Since data may change during the course of multiple
retrievals, it can be difficult to get a consistent snapshot.

Reference [49] also discusses file-based storage of SNMP data (the Bulk
File MIB), as well as the FTP MIB. Together these MIBs enable storage
of SNMP data in non-volatile storage, and subsequent bulk transfer via
FTP. It is noted that this approach requires implementation of
additional MIBs as well as FTP, and requires separate security
mechanisms such as IPSEC to provide authentication, replay protection,
integrity protection and confidentiality for the data in transit.
However, the file-based transfer approach also has an important
benefit, which is compatibility with non-volatile storage.
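The latency concern that motivates file-based transfer can be made
concrete with a rough estimate. The Python sketch below is an
assumption-laden model, not a measurement: the function name, its
parameters, and the simplification that each lost exchange costs exactly
one timeout are all illustrative. It adds up the components named above
(round-trip times, processing overhead, and timeout delays) for a
serialized table retrieval.

```python
import math


def serialized_retrieval_latency(rows, rows_per_response, rtt_s,
                                 processing_s=0.0, loss_rate=0.0,
                                 timeout_s=0.0):
    """Estimate the seconds needed to retrieve a table when each
    response carries at most rows_per_response rows and requests are
    issued one after another."""
    round_trips = math.ceil(rows / rows_per_response)
    # Simplification: each exchange is lost with probability loss_rate
    # and a lost exchange costs one timeout before it is retried.
    expected_timeouts = round_trips * loss_rate
    return (round_trips * (rtt_s + processing_s)
            + expected_timeouts * timeout_s)


# 10,000 session records, 50 rows per response, 100 ms round-trip time.
no_loss = serialized_retrieval_latency(10000, 50, 0.1)
# Same retrieval with 1% loss and a 5 second retransmission timeout.
with_loss = serialized_retrieval_latency(10000, 50, 0.1,
                                         loss_rate=0.01, timeout_s=5.0)
```

Under these assumed figures the retrieval needs 200 round trips, so the
loss-free case takes about 20 seconds and even 1% loss adds roughly
another 10 seconds of timeout delay.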
For bulk transfers, SNMP network overhead can be high due to the lack
of compression, inefficiency of BER encoding, the transmission of
redundant OID prefixes, and the "get-bulk overshoot problem".
Compression methods include compression of the IP packet, as described
in [5], or compression of the SNMP payload, described in [48]. In bulk
transfer of a table, the OIDs transferred are redundant: all OID
prefixes up to the column number are identical, as are the instance
identifier postfixes of all entries of a single table row. Thus it may
be possible to reduce this redundancy by compressing the OIDs, or even
to avoid transferring an OID with each variable altogether.

The "get-bulk overshoot problem", described in reference [50], occurs
when using the get-bulk PDU. The problem is that the manager typically
does not know the number of rows in the table. As a result, it must
either request too many rows, retrieving unneeded data, or too few,
resulting in the need for multiple get-bulk requests. Note that the
"get-bulk overshoot" problem may be preventable on the agent side.
Reference [41] states that an agent can terminate the get-bulk because
of "local constraints" (see items 1 and 3 on pages 15/16 of [41]). This
could be interpreted to mean that it is possible to stop at the end of
a table.

7.3.6.1. Ongoing research

To address issues of latency and efficiency, the Network Management
Research Group (NMRG) was formed within the Internet Research Task
Force (IRTF). Since the NMRG work is research and is not on the
standards track, it should be understood that the NMRG proposals may
never be standardized, or may change substantially during the
standardization process. As a result, these proposals represent works
in progress and are not readily available for use.
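The "get-bulk overshoot problem" that motivates part of this research
can be illustrated with a small simulation. The Python sketch below is a
simplified model rather than protocol code: the function name is
invented, a single column is retrieved, and the agent is assumed to fill
every response with exactly max-repetitions variables, ignoring message
size limits and the "local constraints" option mentioned in [41]. The
manager learns that the table has ended only when a response contains a
variable from beyond it.

```python
def getbulk_walk(table_rows, max_repetitions):
    """Simulate walking one table column with get-bulk when the row
    count is unknown to the manager.

    Returns (requests, overshoot): the number of get-bulk requests
    issued and the number of returned variables lying beyond the end
    of the table."""
    retrieved = 0
    requests = 0
    while True:
        requests += 1
        remaining = table_rows - retrieved
        if remaining >= max_repetitions:
            # The whole response lies inside the table, so the manager
            # cannot yet tell whether more rows follow.
            retrieved += max_repetitions
        else:
            # The response runs past the last row; the excess
            # variables are unneeded data.
            return requests, max_repetitions - remaining


too_few = getbulk_walk(100, 30)
too_many = getbulk_walk(100, 1000)
exact_multiple = getbulk_walk(60, 30)
```

For a 100-row table, asking for 30 repetitions takes four requests and
returns 20 unneeded variables, while asking for 1000 takes one request
but returns 900 unneeded variables. When the table size is an exact
multiple of max-repetitions, an extra request is needed just to discover
the end of the table.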
The proposals under discussion in the IRTF Network Management Research
Group (NMRG) are described in [46]. These include an SNMP-over-TCP
transport mapping, described in [47]; SNMP payload compression,
described in [48]; and the addition of a "get subtree" PDU or the
subtree retrieval MIB [50].

The SNMP-over-TCP transport mapping results in substantial latency
reductions in table retrieval. While it is possible for SNMP to operate
in polling, event-driven, event-driven batching and event-driven
polling modes, the latency reduction from the SNMP-over-TCP transport
mapping manifests itself primarily in the polling, event-driven polling
and event-driven batching modes. While an SNMP-over-TCP transport
mapping is easily implemented, it does require an SNMP agent to listen
on TCP port 161 and a manager to listen on TCP port 162.

Reference [46] also discusses addition of a "get subtree" PDU. Since it
may be possible to address the "get-bulk overshoot problem" without
changes to the SNMP protocol, the necessity of this modification is
controversial, especially since addition of a "get subtree" PDU implies
changes to every agent that the management station will interact with.
In order to reduce the required changes, a subtree retrieval MIB has
been proposed [50]. Since the subtree retrieval MIB requires no changes
to the SNMP protocol or SNMP protocol engine, it can be implemented and
deployed more easily.

7.3.6.2. Security model extensions

In order to simplify key management and enable use of certificate-based
security in SNMPv3, a Kerberos Security Model (KSM) for SNMPv3 has been
proposed in [44]. This draft is not yet on the standards track, and
therefore is not yet readily available for use.
Use of Kerberos with SNMPv3 requires storage of a key on the KDC for
each device and domain, while dynamically generating a session key for
conversations between domains and devices. Thus, in terms of stored
keys the KSM approach scales with the sum of devices and domains,
whereas in terms of dynamic session keys, it scales as the product of
domains and devices.

As Kerberos is extended to allow initial authentication via public key,
as described in [42], and cross-realm authentication, as described in
[43], the KSM inherits these capabilities. As a result, this approach
may have potential to reduce or even eliminate the shared secret
management problem. However, it should also be noted that
certificate-based authentication can strain the limits of UDP packet
sizes supported in SNMP implementations, so that the SNMP-over-TCP
transport mapping described in [47] may be required to support this.

An IPSEC-based security model for SNMPv3 has also been discussed.
Implementation of such a security model would require the SNMPv3 engine
to be able to retrieve the properties of the IPSEC security association
used to protect the SNMPv3 traffic. This would include the security
services invoked, as well as information relating to the other
endpoint, such as the authentication method and presented identity and
certificate. To date such APIs have not been widely implemented, and in
addition, most IPSEC implementations only support machine certificates,
which may not provide the required granularity of identification. Thus,
an IPSEC-based security model for SNMPv3 will probably take several
years to come to fruition.

7.3.7. SNMP summary

Given the wealth of existing accounting-related MIBs, it is likely that
SNMP will remain a popular accounting protocol for the foreseeable
future.
Given the SNMPv3 enhancements, it is desirable for SNMP-based
intra-domain accounting implementations to upgrade to SNMPv3. Such an
upgrade is virtually mandatory for inter-domain applications.

With SNMPv3, it is possible to provide hop-by-hop security services.
Through use of the SNMPv3 notify tables, and confirmed notifications,
it is possible to implement the event-driven, event-driven polling and
event-driven batching models. This makes it possible to notify domains
of available data rather than requiring them to poll for it, which is
critical in shared use networks and roaming.

In inter-domain accounting, the burden of managing SNMPv3 shared
secrets can be reduced via the localized key capability or via
implementation of a Proxy Forwarder. In the long term, alternative
security models such as the Kerberos Security Model may further reduce
the effort required to manage security and enable streamlined
inter-domain operation.

As noted in [49], SNMP-based accounting has limitations in terms of
efficiency and latency that may make it inappropriate for use in
situations requiring low processing delay or low overhead. This
includes usage sensitive billing applications where fraud detection may
be required. These issues can be addressed via proposals under
discussion in the IRTF Network Management Research Group (NMRG).
Compatibility with non-volatile storage can be achieved via
implementation of the Bulk File and FTP MIBs, described in [49].

SNMP supports separation of accounting data by domain, using either of
two approaches. The domain as index approach can be used where the MIB
supports this. This approach can be used with any version of SNMP.
However, few current MIBs support the domain as an index.
It is typically not possible to retrofit existing MIBs to support
domain separation, since while it is possible to add columns to a table
via an AUGMENTS clause, it is not possible to add indexes to a table.
AUGMENTS uses the same indexing as the INDEX of the original (base)
table. Thus a new MIB is typically required.

Where it is desirable to support an existing MIB, and where contextName
is supported, multiple instances of the MIB can be used to provide
domain separation support by contextName. This can be used with any
version of SNMP, but typically requires re-writing the sub-agent. Since
there are no known implementations of this approach, the use of
contextName for domain separation represents new ground.

In usage or time sensitive billing applications, latency reduction may
be important. In such cases it would be advisable to consider use of
the SNMP-over-TCP transport mapping. Depending on the volume of data,
some form of compression may also be worth considering. However, since
these proposals are still in the research stage, and are not on the
standards track, these capabilities are not readily available, and the
specifications could change considerably before they reach their final
form.

8. Review of Accounting Data Transfer

In order for session records to be transmitted between accounting
servers, a transfer protocol is required. Transfer protocols in use
today include SMTP, FTP, and HTTP. For a review of accounting
attributes and record formats, see [45].

Reference [49] contains a discussion of alternative encodings for SMI
data types, as well as alternative protocols for transmission of
accounting data. For example, [49] describes how MIME tags and XML DTDs
may be used for encoding of SNMP messages or SMI data types.
This enables data from SNMP MIBs to be transported using any protocol
that can encapsulate MIME or XML, including SMTP and HTTP.

8.1. SMTP

To date, few accounting management systems have been built on SMTP
since the implementation of a store-and-forward message system has
traditionally required access to non-volatile storage which has not
been widely available on network devices. However, SMTP-based
implementations have many desirable characteristics, particularly with
regard to security.

Accounting management systems using SMTP for accounting transfer will
typically support batching so that message processing overhead will be
spread over multiple accounting records. As a result, these systems
result in per-active device state. Since accounting systems using SMTP
as a transfer mechanism have access to substantial non-volatile
storage, they can generate, compress if necessary, and store accounting
records until they are transferred to the collection site. As a result,
accounting systems implemented using SMTP can be highly efficient and
scalable. Using IPSEC, TLS or Kerberos, hop-by-hop security services
such as authentication, integrity protection and confidentiality can be
provided.

As described in [13] and [15], data object security is available for
SMTP, and in addition, the facilities described in [12] make it
possible to request and receive signed receipts, which enables
non-repudiation as described in [12]-[17]. As a result, accounting
systems utilizing SMTP for accounting data transfer are capable of
satisfying the most demanding security requirements. However, such
systems are not typically capable of providing low processing delay,
although this may be addressed by the enhancements described in [20].

8.2. Other protocols

File transfer protocols such as FTP and HTTP have been used for
transfer of accounting data. For example, Reference [9] describes a
means for representing ASN.1-based accounting data for storage on
archival media. Through the use of the Bulk File MIB, accounting data
from an SNMP MIB can be stored in ASN.1, bulk binary or Bulk ASCII
format, and then subsequently retrieved as required using the FTP
Client MIB.

Given access to sufficient non-volatile storage, accounting systems
based on record formats and transfer protocols can avoid loss of data
due to long-duration network partitions, server failures or device
reboots. Since it is possible for the transfer to be driven from the
collection site, the collector can retry transfers until successful, or
with HTTP may even be able to restart partially completed transfers. As
a result, file transfer-based systems can be made highly reliable, and
the batching of accounting records makes possible efficient transfers
and application of required security services with lessened overhead.

9. Summary

As noted previously in this document, accounting applications vary in
their security and reliability requirements. Some uses such as capacity
planning may only require authentication, integrity and replay
protection, and modest reliability. Other applications such as
inter-domain usage-sensitive billing may require the highest degree of
security and reliability, since in these cases the transfer of
accounting data will lead directly to the transfer of funds.

Since accounting applications do not have uniform security and
reliability requirements, it is not possible to devise a single
accounting protocol and set of security services that will meet all
needs.
Rather, the goal of accounting management should be to provide a set of
tools that can be used to construct accounting systems meeting the
requirements of an individual application. As a result, it is important
to analyze a given accounting application to ensure that the methods
chosen meet the security and reliability requirements of the
application.

Based on an analysis of the requirements, it appears that existing
deployed protocols are capable of meeting the requirements for
intra-domain capacity planning and non-usage sensitive billing. In
these applications efficient transfer of bulk data is useful although
not critical. Thus, it is possible to use SNMPv3 to satisfy these
requirements without the NMRG extensions (TCP transport mapping,
sub-tree retrieval, and OID compression).

In inter-domain capacity planning and non-usage sensitive billing, the
security and reliability requirements are greater. As a result, no
existing deployed protocol satisfies the requirements. For example,
existing protocols lack data object security support, and extensions to
improve scalability of inter-domain authentication, such as the
Kerberos Security Model (KSM) for SNMPv3, are needed.

For usage sensitive billing, as well as cost allocation and auditing
applications, the reliability requirements are greater. Here transport
layer reliability is required to provide robustness against packet
loss, as well as application layer acknowledgments to provide
robustness against accounting server failures. SNMP provides
application layer acknowledgments, and the TCP transport mapping
proposed by NMRG provides robustness against packet loss. Inter-domain
operation can benefit from data object security (which no existing
protocol provides) as well as inter-domain security model enhancements
(such as the KSM).
Where high-value sessions are involved, such as in roaming, Mobile IP,
or telephony, it may be necessary to put bounds on processing delay.
This implies the need to reduce table retrieval latency. As a result,
the NMRG extensions are required in time sensitive billing
applications, including TCP transport mapping, get-subtree capabilities
and OID compression. High reliability is also required in this
application, implying the need for application layer as well as
transport layer acknowledgments. SNMPv3 with the NMRG extensions and
security scalability improvements such as the KSM can satisfy the
requirements in intra-domain use.

However, in inter-domain use, additional security precautions such as
data object security and receipt support are required. No existing
protocol can meet these requirements. A summary is given in the table
below.

+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                 |                   |                   |
|  Usage          |  Intra-domain     | Inter-domain      |
|                 |                   |                   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                 |                   |                   |
|  Capacity       |  SNMPv3           | SNMPv3 <*         |
|  Planning       |  RADIUS #%@       |                   |
|                 |  TACACS+ @        |                   |
|                 |                   |                   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                 |                   |                   |
|  Non-usage      |  SNMPv3           | SNMPv3 <*         |
|  Sensitive      |  RADIUS #%@       |                   |
|  Billing        |  TACACS+ @        |                   |
|                 |                   |                   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                 |                   |                   |
|  Usage          |                   |                   |
|  Sensitive      |                   |                   |
|  Billing,       |  SNMPv3 >$        | SNMPv3 <>*$       |
|  Cost           |  TACACS+ &$@      |                   |
|  Allocation &   |                   |                   |
|  Auditing       |                   |                   |
|                 |                   |                   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                 |                   |                   |
|  Time           |                   |                   |
|  Sensitive      |  SNMPv3 >$        | No existing       |
|  Billing,       |                   | protocol          |
|  fraud          |                   |                   |
|  detection,     |                   |                   |
|  roaming        |                   |                   |
|                 |                   |                   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Key
# = lacks confidentiality support
* = lacks data object security
% = limited robustness against packet loss
& = lacks application layer acknowledgment
$ = requires non-volatile storage
@ = lacks batching support
< = lacks certificate support (KSM, work in progress)
> = lacks support for large packet sizes (TCP transport mapping,
    experimental)

10. Acknowledgments

The authors would like to thank Bert Wijnen (Lucent), Keith McCloghrie
(Cisco Systems), Jan Melen (Ericsson) and Jarmo Savolainen (Ericsson)
for useful discussions of this problem space.

11. References

[1]  Aboba, B., Lu J., Alsop J., Ding J., and W. Wang, "Review of
     Roaming Implementations", RFC 2194, September 1997.

[2]  Aboba, B., and G. Zorn, "Criteria for Evaluating Roaming
     Protocols", RFC 2477, January 1999.

[3]  Rigney, C., Rubens, A., Simpson, W., Willens, S., "Remote
     Authentication Dial In User Service (RADIUS)", RFC 2138, April
     1997.

[4]  Rigney, C., "RADIUS Accounting", RFC 2139, April 1997.

[5]  Shacham, A., Monsour, R., Pereira, R., Thomas, M., "IP Payload
     Compression Protocol (IPComp)", RFC 2393, December 1998.

[6]  Bradner, S., "Key words for use in RFCs to Indicate Requirement
     Levels", BCP 14, RFC 2119, March 1997.

[7]  Information Sciences Institute, "Transmission Control Protocol",
     RFC 793, September 1981.

[8]  Aboba, B., and M. Beadles, "The Network Access Identifier",
     RFC 2486, January 1999.

[9]  McCloghrie, K., Heinanen, J., Greene, W., Prasad, A., "Accounting
     Information for ATM Networks", RFC 2512, February 1999.

[10] McCloghrie, K., Heinanen, J., Greene, W., Prasad, A., "Managed
     Objects for Controlling the Collection and Storage of Accounting
     Information for Connection-Oriented Networks", RFC 2513, February
     1999.

[11] Frye, R., Levi, D., Routhier, S., Wijnen, B., "Coexistence between
     Version 1, Version 2, and Version 3 of the Internet-standard
     Management Framework", RFC 2576, March 2000.

[12] Fajman, R., "An Extensible Message Format for Message Disposition
     Notifications", RFC 2298, March 1998.

[13] Elkins, M., "MIME Security with Pretty Good Privacy (PGP)", RFC
     2015, October 1996.

[14] Vaudreuil, G., "The Multipart/Report Content Type for the
     Reporting of Mail System Administrative Messages", RFC 1892,
     January 1996.

[15] Galvin, J., et al. "Security Multiparts for MIME:
     Multipart/Signed and Multipart/Encrypted", RFC 1847, October 1995.

[16] Crocker, D., "MIME Encapsulation of EDI Objects", RFC 1767, March
     1995.

[17] Borenstein, N., Freed, N, "MIME (Multipurpose Internet Mail
     Extensions) Part One: Mechanisms for Specifying and Describing
     the Format of Internet Message Bodies", RFC 1521, December 1993.

[18] Rose, M.T., The Simple Book, Second Edition, Prentice Hall, Upper
     Saddle River, NJ, 1996.

[19] Case, J., Mundy, R., Partain, D., Stewart, B., "Introduction to
     Version 3 of the Internet-standard Network Management Framework",
     RFC 2570, April 1999.

[20] Klyne, G., "Timely Delivery for Facsimile Using Internet Mail",
     Internet draft (work in progress),
     draft-ietf-fax-timely-delivery-00.txt, October 1999.

[21] Johnson, H. T., Kaplan, R. S., Relevance Lost: The Rise and Fall
     of Management Accounting, Harvard Business School Press, Boston,
     Massachusetts, 1987.

[22] Horngren, C. T., Foster, G., Cost Accounting: A Managerial
     Emphasis. Prentice Hall, Englewood Cliffs, New Jersey, 1991.

[23] Kaplan, R. S., Atkinson, Anthony A., Advanced Management
     Accounting, Prentice Hall, Englewood Cliffs, New Jersey, 1989.

[24] Cooper, R., Kaplan, R. S., The Design of Cost Management Systems.
     Prentice Hall, Englewood Cliffs, New Jersey, 1991.

[25] Rigney, C., Willens, S., Calhoun, P., "RADIUS Extensions",
     draft-ietf-radius-ext-07.txt, Internet Draft (work in progress),
     February 2000.

[26] Stewart, R. R., et al., "Simple Control Transmission Protocol",
     Internet draft (work in progress), draft-ietf-sigtran-sctp-05.txt,
     January 2000.

[27] Harrington, D., Presuhn, R., and B. Wijnen, "An Architecture for
     Describing SNMP Management Frameworks", RFC 2571, April 1999.

[28] Rose, M., and K. McCloghrie, "Structure and Identification of
     Management Information for TCP/IP-based Internets", RFC 1155, May
     1990.

[29] Rose, M., and K. McCloghrie, "Concise MIB Definitions", RFC 1212,
     March 1991.

[30] M. Rose, "A Convention for Defining Traps for use with the SNMP",
     RFC 1215, March 1991.

[31] Case, J., McCloghrie, K., Rose, M., and S. Waldbusser, "Structure
     of Management Information for Version 2 of the Simple Network
     Management Protocol (SNMPv2)", RFC 1902, January 1996.

[32] Case, J., McCloghrie, K., Rose, M., and S. Waldbusser, "Textual
     Conventions for Version 2 of the Simple Network Management
     Protocol (SNMPv2)", RFC 1903, January 1996.

[33] Case, J., McCloghrie, K., Rose, M., and S. Waldbusser,
     "Conformance Statements for Version 2 of the Simple Network
     Management Protocol (SNMPv2)", RFC 1904, January 1996.

[34] Case, J., Fedor, M., Schoffstall, M., and J. Davin, "Simple
     Network Management Protocol", RFC 1157, May 1990.

[35] Case, J., McCloghrie, K., Rose, M., and S. Waldbusser,
     "Introduction to Community-based SNMPv2", RFC 1901, January 1996.

[36] Case, J., McCloghrie, K., Rose, M., and S. Waldbusser, "Transport
     Mappings for Version 2 of the Simple Network Management Protocol
     (SNMPv2)", RFC 1906, January 1996.

[37] Case, J., Harrington D., Presuhn R., and B.
Wijnen, "Message
2347 Processing and Dispatching for the Simple Network Management
2348 Protocol (SNMP)", RFC 2572, April 1999.

2350 [38] Blumenthal, U., and B. Wijnen, "User-based Security Model (USM) for
2351 version 3 of the Simple Network Management Protocol (SNMPv3)", RFC
2352 2574, April 1999.

2354 [39] Levi, D., Meyer, P., and B. Stewart, "SNMPv3 Applications", RFC
2355 2573, April 1999.

2357 [40] Wijnen, B., Presuhn, R., and K. McCloghrie, "View-based Access
2358 Control Model (VACM) for the Simple Network Management Protocol
2359 (SNMP)", RFC 2575, April 1999.

2361 [41] Case, J., McCloghrie, K., Rose, M., and S. Waldbusser, "Protocol
2362 Operations for Version 2 of the Simple Network Management Protocol
2363 (SNMPv2)", RFC 1905, January 1996.

2365 [42] Tung, B., Neuman, C., Hur, M., Medvinsky, A., Medvinsky, S., Wray,
2366 J., Trostle, J., "Public Key Cryptography for Initial
2367 Authentication in Kerberos", Internet draft (work in progress),
2368 draft-ietf-cat-kerberos-pk-init-09.txt, June 1999.

2370 [43] Tung, B., Ryutov, T., Neuman, C., Tsudik, G., Sommerfeld, B.,
2371 Medvinsky, A., Hur, M., "Public Key Cryptography for Cross-Realm
2372 Authentication in Kerberos", Internet draft (work in progress),
2373 draft-ietf-cat-kerberos-pk-cross-04.txt, June 1999.

2375 [44] Hornstein, K., Hardaker, W., "A Kerberos Security Model for
2376 SNMPv3", Internet draft (work in progress), draft-hornstein-
2377 snmpv3-ksm-00.txt, June 1999.

2379 [45] Brownlee, N., Blount, A., "Accounting Attributes and Record
2380 Formats", Internet draft (work in progress), draft-ietf-aaa-
2381 accounting-attributes-03.txt, April 2000.

2383 [46] Network Management Research Group Web page, http://www.ibr.cs.tu-
2384 bs.de/projects/nmrg/

2386 [47] Schoenwaelder, J., "SNMP-over-TCP Transport Mapping", Internet draft
2387 (work in progress), draft-irtf-nmrg-snmp-tcp-03.txt, April 2000.
2389 [48] Schoenwaelder, J., "SNMP Payload Compression", Internet draft (work
2390 in progress), draft-irtf-nmrg-snmp-compression-00.txt, June 1999.

2392 [49] Sprenkels, R., Martin-Flatin, J., "Bulk Transfers of MIB Data",
2393 Simple Times, http://www.simple-times.org/pub/simple-
2394 times/issues/7-1.html, March 1999.

2396 [50] Thaler, D., "Get Subtree Retrieval MIB", Internet draft (work in
2397 progress), draft-irtf-nmrg-get-subtree-mib-00.txt, October 1999.

2399 [51] Daniele, M., Wijnen, B., Ellison, M., Francisco, D., "Agent
2400 Extensibility (AgentX) Protocol Version 1", RFC 2741, January 2000.

2402 12. Authors' Addresses

2404 Bernard Aboba
2405 Microsoft Corporation
2406 One Microsoft Way
2407 Redmond, WA 98052
2408 USA

2410 Phone: +1 425 936 6605
2411 EMail: bernarda@microsoft.com

2413 Jari Arkko
2414 Oy LM Ericsson Ab
2415 02420 Jorvas
2416 Finland

2418 Phone: +358 40 5079256
2419 EMail: Jari.Arkko@ericsson.com

2421 David Harrington
2422 Cabletron Systems Inc.
2423 P.O.Box 5005
2424 Rochester NH 03867-5005
2425 USA

2427 Phone: +1 603 337 7357
2428 EMail: dbh@cabletron.com

2430 13. Intellectual Property Statement

2432 The IETF takes no position regarding the validity or scope of any
2433 intellectual property or other rights that might be claimed to pertain
2434 to the implementation or use of the technology described in this
2435 document or the extent to which any license under such rights might or
2436 might not be available; neither does it represent that it has made any
2437 effort to identify any such rights. Information on the IETF's
2438 procedures with respect to rights in standards-track and standards-
2439 related documentation can be found in BCP-11.
Copies of claims of
2440 rights made available for publication and any assurances of licenses to
2441 be made available, or the result of an attempt made to obtain a general
2442 license or permission for the use of such proprietary rights by
2443 implementors or users of this specification can be obtained from the
2444 IETF Secretariat.

2446 The IETF invites any interested party to bring to its attention any
2447 copyrights, patents or patent applications, or other proprietary rights
2448 which may cover technology that may be required to practice this
2449 standard. Please address the information to the IETF Executive
2450 Director.

2452 14. Full Copyright Statement

2454 Copyright (C) The Internet Society (2000). All Rights Reserved.
2455 This document and translations of it may be copied and furnished to
2456 others, and derivative works that comment on or otherwise explain it or
2457 assist in its implementation may be prepared, copied, published and
2458 distributed, in whole or in part, without restriction of any kind,
2459 provided that the above copyright notice and this paragraph are included
2460 on all such copies and derivative works. However, this document itself
2461 may not be modified in any way, such as by removing the copyright notice
2462 or references to the Internet Society or other Internet organizations,
2463 except as needed for the purpose of developing Internet standards in
2464 which case the procedures for copyrights defined in the Internet
2465 Standards process must be followed, or as required to translate it into
2466 languages other than English. The limited permissions granted above are
2467 perpetual and will not be revoked by the Internet Society or its
2468 successors or assigns.
This document and the information contained
2469 herein is provided on an "AS IS" basis and THE INTERNET SOCIETY AND THE
2470 INTERNET ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR
2471 IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
2472 INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
2473 WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

2475 15. Expiration Date

2477 This memo is filed as , and expires January
2478 1, 2001.