Evolution Of SNMP WG                                  Satyen Chandragiri
INTERNET-DRAFT                                       Ranch Networks, Inc
Expires October 2001                                          April 2001

                  Efficient Transfer of Bulk SNMP Data
                     draft-ietf-eos-snmpbulk-00.txt

Status of this Memo

   This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

Abstract

   Managing networks and network devices using the Internet Standard Management Framework often requires the retrieval of significant amounts of MIB data via SNMP, which can result in large latency, increased network overhead, and other problems. This memo discusses the need for an efficient mechanism for transferring large amounts of SNMP data and explores possible solutions for overcoming the current limitations of the protocol.
S.Chandragiri              Expires October 2001                 [Page 1]
Efficient Transfer of SNMP Bulk Data                          April 2001

Table of Contents

   Status of this Memo
   Abstract
   Overview
   Previous Work
   Proposed Solution
   Summary
   Security Considerations
   IANA Considerations
   Author's Address
   Acknowledgements
   References
   Full Copyright Statement

Overview

   As network elements grow in size and complexity, so does the size of the MIB tables used to monitor and configure them. Since the first introduction of SNMP in the late 1980s, the amount of MIB data that must be transferred between a command generator and a command responder has grown tremendously. For example, the IP routing tables in a typical backbone router, or the subscriber management tables in a cable modem termination system, can be quite substantial. Retrieving such data via SNMP may involve the transfer of several hundred kilobytes of data across a network.

   Although SNMP has evolved substantially, with version 3 providing many desirable features such as security and access control, no enhancements have been made that address the issue of bulk transfer of SNMP data. Using SNMPv1, a command generator has to generate a series of GetNextRequests to traverse the MIB tables.
   For large tables like the ones mentioned above, this requires a very long sequence of request-response exchanges between the two entities, resulting in high latency.

   To address this problem, the GetBulkRequest operation was introduced in SNMPv2. This operation attempts to reduce the number of protocol exchanges required to retrieve a large amount of MIB data by returning a series of variable bindings in a single response. The command generator is required to specify a "max-repetitions" count, and the command responder then fills in as many variable bindings as it can without exceeding either this count or the maximum message size.

   The main problem with retrieving tables using GetBulkRequest is that the command generator typically does not know the number of rows in the table, and hence cannot set max-repetitions to the optimal value. As a result, it must either set max-repetitions to some very large value, potentially wasting a large amount of bandwidth when many more variable bindings are returned than are needed (this is referred to as the "overshoot" problem), or else issue multiple GetBulkRequests sequentially to traverse a large table.

   Another significant issue affecting the efficiency of bulk transfer is network overhead [ref]. This refers to the number of non-data bytes (header information, encoding bytes, etc.) that must be sent along with each PDU. All current versions of SNMP use the Basic Encoding Rules (BER) to encode PDUs. Though BER was selected because of its simplicity and easy availability, it is quite inefficient in terms of network overhead. Added to this is the repetitive information contained in OIDs: identical portions of identifiers occur many times in the OIDs of the MIB values sent in a response. Transferring large amounts of MIB data further compounds these problems.
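
   The max-repetitions trade-off described above can be made concrete with a small model. The sketch below is illustrative only and is not part of any SNMP specification: the function name getbulk_walk_cost is hypothetical, and the model assumes that every response carries exactly max-repetitions repetitions of each requested column and that the maximum message size never truncates a response.

```python
def getbulk_walk_cost(rows, columns, max_repetitions):
    """Estimate the cost of walking a table with GetBulkRequest.

    Simplified model (an assumption, not part of SNMP): every response
    carries exactly max_repetitions repetitions of each requested column,
    and the maximum message size never truncates a response.

    Returns (requests, wasted_varbinds).
    """
    if rows < 0 or columns <= 0 or max_repetitions <= 0:
        raise ValueError("rows must be >= 0; columns and max_repetitions > 0")
    # The command generator only learns that the table has ended when a
    # response runs past the last row, so at least one repetition always
    # overshoots.
    requests = rows // max_repetitions + 1
    returned = requests * max_repetitions * columns
    wasted = returned - rows * columns
    return requests, wasted


if __name__ == "__main__":
    for m in (10, 100, 1000):
        requests, wasted = getbulk_walk_cost(rows=150, columns=4,
                                             max_repetitions=m)
        print(f"max-repetitions={m}: {requests} requests, "
              f"{wasted} wasted varbinds")
```

   Under these assumptions, a 150-row table with four requested columns needs 16 requests at max-repetitions 10 (40 unwanted varbinds), but a single request at max-repetitions 1000 returns 3400 unwanted varbinds. Neither extreme is satisfactory, which is the dilemma the command generator faces when it does not know the table size.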

   Retrieving MIB table data (as opposed to MIB scalars) can pose some special problems. The main reason is that SNMP does not recognize table rows and columns, and thus all protocol operations have to deal with "conceptual" rows and columns. If a table allows certain columnar objects of a conceptual row to be absent, this creates "holes" in the table. A command generator that performs a "GetNext" or "GetBulk" on all columnar objects of a row is returned all elements of the following row, except where there are "holes", in which case the columnar object from the first subsequent row that does have this object is returned. The command generator, as receiver of the response, has to realize that not all returned objects are from the same row, and has to correctly reconstruct the MIB table while determining the locations of the "holes". This can be very time consuming and challenging for a network management application to implement.

   Another problem arises with tables that have rapidly changing values (e.g. packet counts, number of active connections, etc.): the latency in retrieving table rows can create inconsistencies, since by the time a management application reads a value it may already be obsolete on the device.

Previous Work

   Several solutions have been proposed and discussed in the past that attempt to address the problems mentioned above. These range from those requiring no changes to the existing protocol, to evolutionary changes to the framework, to non-SNMP solutions. Some of these proposals are described in this section. Each method has its own merits and demerits.

   Pipeline Retrieval

   In RFC 1187 "Bulk Table Retrieval with the SNMP", the authors propose a pipeline algorithm using multiple threads to traverse different sections of a MIB table simultaneously. This improves latency because several pieces of MIB data are gathered in parallel; however, it adds complexity on both the command generator and the command responder, especially when packets are dropped and retransmission of the request or response is required. It also does not reduce the number of request-response exchanges required between the two entities.

   SNMP over TCP

   Another proposal is to extend the transport mapping for SNMP by sending the PDUs over a TCP connection rather than UDP. An immediate benefit is that UDP's 64KB restriction on the SNMP maximum message size is eliminated, since TCP's windowing mechanism can be used to send several segments of data in parallel. This reduces the number of request-response exchanges, thereby significantly lowering latency as well as network overhead. On the other hand, the SNMP entities now have to manage their TCP connections and be able to accommodate larger buffers for packet processing.

   Changing PDU Encoding

   As previously mentioned, the BER encoding used for SNMP PDUs is inefficient and is a major contributor to network overhead. Several alternatives exist, but it should be noted that adopting any other encoding scheme in place of BER entails a major change in implementation and reduces interoperability. One alternative scheme is "Packed Encoding Rules" (PER), which has approximately 30% shorter encodings and requires much less encoding buffer space than BER. Other possibilities are "Lightweight Encoding Rules" (LER), which allow quick encoding and decoding, thereby reducing latency; or "Distinguished Encoding Rules" (DER), which have better encoding time while also keeping the network overhead low.

   Notification-based GetSubtree

   The GetSubtree operation allows a management application to retrieve "subtrees" of MIB data. It first specifies the root of the subtree to be retrieved (it can be rooted anywhere: at the head of a table, a specific column, etc.) and then triggers the retrieval operation.
   The command responder must then retrieve the MIB data contained lexicographically under the specified root and send the retrieved values back to the management application. The responder stops when it reaches the end of the subtree (thereby eliminating the overshoot problem of GetBulk). The responses are sent back as a series of Notifications to the management application. Multiple varbinds are bundled in each trap, packing in as many as the maximum message size constraint allows. A sequence number is provided so that the receiver can detect packet loss and request retransmission (via a GetBulk request). The solution can be extended to allow multiple subtrees to be retrieved in parallel if the command responder can handle it. The limitations of this approach are that a) it requires the management application to be registered as a notification receiver, b) it tightly couples the command generator and notification receiver, and c) in a lossy network this protocol degenerates to GetBulk.

   Non-SNMP Solutions

   Several non-SNMP based solutions to the bulk MIB-data transfer problem have been implemented or proposed. Prominent among them is Cisco Systems' FTP-based solution, where the MIB data is retrieved and stored in a file on the device, and then transferred to the management application via FTP. Two MIB modules are involved: the "CISCO-BULK-FILE-MIB" and the "CISCO-FTP-CLIENT-MIB". The bulk file MIB consists of three tables that are used to specify the MIB data objects to retrieve and the name of the file, its storage type, and encoding format (BER / binary / ASCII). The FTP client MIB has a single table that allows the management application to specify the FTP server details, and the name of the local file where the data should be uploaded.
   This solution requires FTP capability on both SNMP entities and, since a non-SNMP transfer mechanism is used, security considerations need to be taken into account by the implementers.

   Other non-SNMP alternatives for bulk transfer include using MIME or an XML Document Type Definition (DTD) to encode the MIB data. The data can then be transferred via the well-known HTTP protocol, since it is well suited for bulk transfer of MIME-encapsulated data. However, since HTTP is primarily intended for transferring World Wide Web data, it has many features and options that are not required for management data. A compliant implementation will nevertheless have to implement those features, thereby increasing the size and complexity of the SNMP implementations without additional benefit. In addition, future evolution of HTTP may add features or requirements that make it unsuitable for transferring MIB data.

Proposed Solution

   While each of the proposals above attempts to improve or eliminate one or more problem areas (latency, overhead, table retrieval), none addresses them all. Specifically, the question of how to handle "holes" in MIB tables is not addressed. Moreover, for a solution to be widely adopted by the development community it would have to be easily implementable and not be overly complex or require special resources. Similarly, solutions that are proprietary or rely on non-SNMP mechanisms are not good candidates for standardization as an evolutionary change to the SNMP protocol.

   A solution that fits these requirements and adequately addresses the problems associated with bulk transfer is the "GetCols" operation proposed by David Perkins. This solution requires the addition of a new PDU type called "GetColsRequest" to SNMP that will operate as explained below. This PDU type will have the same syntax as the GetBulkRequest and Response PDUs, thus minimizing the effect on implementation.

   Columnar objects in MIB tables are usually attributes of modeled entities.
   Management applications often need to retrieve specific columns of MIB data rather than all columns (i.e. entire rows are not required). Moreover, only a certain segment of MIB data (called a "slice") may be required rather than the entire table. The GetColsRequest is used to specify the columns of interest to the management application. The command responder sends back only the data of interest and terminates the PDU with an "end-of-data" marker.

   The GetColsRequest PDU is identical to the GetBulk PDU, and is as follows:

      BulkPDU ::= SEQUENCE {
          request-id         Integer32,
          non-repeaters      INTEGER (0..max-bindings),
          max-repetitions    INTEGER (0..max-bindings),
          variable-bindings  VarBindList
      }

   The Response PDU is unchanged from that currently used, and is as follows:

      PDU ::= SEQUENCE {
          request-id         Integer32,
          error-status                 -- sometimes ignored
              INTEGER {
                  noError(0),
                  tooBig(1),
                  noSuchName(2),       -- for proxy compatibility
                  badValue(3),         -- for proxy compatibility
                  readOnly(4),         -- for proxy compatibility
                  genErr(5),
                  noAccess(6),
                  wrongType(7),
                  wrongLength(8),
                  wrongEncoding(9),
                  wrongValue(10),
                  noCreation(11),
                  inconsistentValue(12),
                  resourceUnavailable(13),
                  commitFailed(14),
                  undoFailed(15),
                  authorizationError(16),
                  notWritable(17),
                  inconsistentName(18)
              },
          error-index                  -- sometimes ignored
              INTEGER (0..max-bindings),
          variable-bindings            -- values are sometimes ignored
              VarBindList
      }

   GetCols Example:

   The Interfaces Table (ifTable) has 18 columns, but a management application may only be interested in retrieving ifIndex, ifType, ifAdminStatus and ifOperStatus for all the interfaces on a router, and the value of ifTableLastChange. The application could use a GetBulkRequest to retrieve the data, but not knowing the number of instances present, it would not be able to set the "max-repetitions" parameter to an optimal value.
   As explained earlier, this may require the application to make multiple requests until the entire table is traversed, or it may cause an overshoot problem where a large amount of unwanted data from past the end of the table is retrieved.

   The GetColsRequest operation can therefore be utilized here in place of GetBulkRequest. The request would be constructed as follows:

      GetColsRequest (<request-id>, 1, 100,
                      ifTableLastChange:NULL,
                      ifIndex:NULL, ifType:NULL,
                      ifAdminStatus:NULL, ifOperStatus:NULL)

   The command responder will reply with a list of varbinds containing the value of ifTableLastChange and repetitions of the other varbinds and values, similar to a GetBulk response, but there are important differences:

   a. There will be no "holes" in the response. If an instance of a columnar object does not exist, the responder uses a "noSuchInstance" exception in its place rather than skipping over to the next instance for that column. Thus, each set of repetitions in the response has the same instance value.

   b. If the max-repetitions value exceeds the number of instances available in the table, the command responder stops at the end of the table and adds an "End-Of-Rows" marker in the Response PDU rather than overshooting and providing irrelevant MIB data that will be discarded by the requester.

   c. The OIDs in the Response PDU are compressed to eliminate redundancy. Since all repetitions of a columnar value have a common OID prefix, differing only in the instance part, the Response PDU only needs to use the suffix identifiers to distinguish between instance values.

   The OID compression is done as follows: either two or three sub-identifiers are used in place of the complete OID, in the form <table>.<column>.<index> or <table>.<column>. The <table> sub-identifier represents the MIB table to which the instance value belongs. In the example above, since all the varbinds belong to the same MIB table (ifTable), <table> would be '0'.
   Had there been objects from other tables, the <table> sub-identifier for those instances would be '1', '2', etc. The <column> sub-identifier represents the column to which the instances belong. The first requested column will have a value of '0' for <column>, the second column will have a value of '1', and so on. The <index> sub-identifier is only present for the first columnar instance of a row and represents the index value. Since subsequent columnar objects of that row have the same index value, it need not be specified in the Response, and hence <index> is not required for these objects. An end-of-rows indication for a table is provided by setting the <column> sub-identifier for that table to one more than the sub-identifier of the last requested column (that is, to the number of columns requested, since columns are numbered from '0').

   The Response PDU to the GetCols request in the example presented above would therefore be:

      Response (<request-id>, noError, 0,
                ifTableLastChange:<value>,
                0.0.1:<value>, 0.1:<value>, 0.2:<value>, 0.3:<value>,
                0.0.2:<value>, 0.1:<value>, 0.2:<value>, 0.3:<value>,
                ...
                0.0.n:<value>, 0.1:<value>, 0.2:<value>, 0.3:<value>,
                0.4:<value>)

   The Response contains the non-repeaters (ifTableLastChange in this example) followed by a VarBindList for the repeaters. The compressed OID values '0.0.x' represent the first requested column (ifIndex in this example), where 'x' is the row index. Compressed OID '0.1' represents the second requested column (ifType), '0.2' represents the third requested column (ifAdminStatus), and '0.3' represents the fourth requested column (ifOperStatus). Note that for the second, third and fourth columns the row index is not included. The special OID '0.4' marks the end of rows for this table (ifTable). Of course, this marker is present only if the number of rows returned (n) is less than or equal to the max-repetitions requested in the GetCols request. If there are more rows in the table, the absence of this marker notifies the command generator that there are more rows to be retrieved.
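
   The OID compression rules above can be sketched from the command generator's side as a small decoder. This is a minimal illustration under stated assumptions, not a definitive implementation: the names expand_getcols and NO_SUCH_INSTANCE are hypothetical, compressed OIDs are modeled as tuples of integers, values are plain Python objects, and an index that spans several sub-identifiers is accepted even though the text describes only a single index sub-identifier.

```python
NO_SUCH_INSTANCE = object()  # stand-in for the SNMP noSuchInstance exception


def expand_getcols(repeaters, tables):
    """Rebuild full OIDs from the compressed repeaters of a GetCols response.

    repeaters: list of (compressed_oid, value); compressed_oid is a tuple of
               ints shaped (table, column) or (table, column, index...).
    tables:    one list of column OID strings per requested table, in the
               order the columns appeared in the request.
    Returns (varbinds, finished): the varbinds with full OIDs, and the set
    of table numbers for which an end-of-rows marker was seen.
    """
    current_index = {}  # table number -> index of the row being decoded
    varbinds = []
    finished = set()
    for oid, value in repeaters:
        if len(oid) < 2:
            raise ValueError(f"compressed OID too short: {oid}")
        table, column = oid[0], oid[1]
        if not 0 <= table < len(tables):
            raise ValueError(f"unknown table number in {oid}")
        columns = tables[table]
        if column == len(columns):
            # Column number just past the last requested column: end of rows.
            finished.add(table)
            continue
        if not 0 <= column < len(columns):
            raise ValueError(f"unknown column number in {oid}")
        if len(oid) > 2:
            # The first columnar instance of a row carries the row index.
            current_index[table] = oid[2:]
        elif table not in current_index:
            raise ValueError(f"no row index seen before {oid}")
        index = ".".join(str(sub) for sub in current_index[table])
        varbinds.append((f"{columns[column]}.{index}", value))
    return varbinds, finished


if __name__ == "__main__":
    IF_ENTRY = "1.3.6.1.2.1.2.2.1"
    # ifIndex, ifType, ifAdminStatus, ifOperStatus
    columns = [f"{IF_ENTRY}.{c}" for c in (1, 3, 7, 8)]
    response = [  # made-up values for two interfaces, one with a hole
        ((0, 0, 1), 1), ((0, 1), 6), ((0, 2), 1), ((0, 3), 1),
        ((0, 0, 2), 2), ((0, 1), 24), ((0, 2), 1), ((0, 3), NO_SUCH_INSTANCE),
        ((0, 4), None),  # end-of-rows marker for table 0
    ]
    full, finished = expand_getcols(response, [columns])
    for oid, value in full:
        print(oid, "noSuchInstance" if value is NO_SUCH_INSTANCE else value)
    print("end of rows seen for tables:", sorted(finished))
```

   Because every row carries one varbind per requested column, with a noSuchInstance exception standing in for any missing object, the decoder never has to guess which row a value belongs to. If the returned set of finished tables is empty, the command generator knows that more rows remain to be retrieved.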

Summary

   The GetCols operation adds a new PDU type to the SNMP protocol, but by reusing the same message formats it minimizes the implementation effort required to add this feature to existing applications. It provides many desirable features such as reduced latency and network overhead, elimination of problems caused by holes in MIB tables, and OID compression to greatly reduce the amount of data transmitted in the Response PDU. The overshoot problem of GetBulk is also eliminated because GetCols stops at the end of the table, and hence the management application can choose a very large value for the max-repetitions parameter (constrained mainly by the maximum message size limit).

   Calculations (presented by David Perkins) demonstrate that significant performance improvements can be gained by using GetCols rather than GetBulk operations to retrieve large amounts of MIB data.

Security Considerations

IANA Considerations

Author's Address

   Satyen Chandragiri
   Ranch Networks, Inc
   65 Route 34 North Suite 200
   Morganville, NJ 07751
   Phone: 1-732-817-1900 x264
   Email: satyen@ranchnetworks.com

Acknowledgements

   The author wishes to acknowledge Ron Sprenkels, Jean-Philippe Martin-Flatin, Juergen Schoenwaelder and other members of the Network Management Research Group (NMRG) of the IRTF for their prior work in this area; and David Perkins for the GetCols proposal presented in this document.

References

   [1] M. Rose, K. McCloghrie, J. Davin, "Bulk Table Retrieval with the SNMP", RFC 1187, October 1990.

   [2] R. Sprenkels, J. Martin-Flatin, "Bulk Transfers of MIB Data", The Simple Times, Volume 7, Number 1, March 1999.

   [3] D. Thaler, "Subtree Retrieval MIB", Presentation at the 48th IETF meeting in Pittsburgh, PA, July 2000.

   [4] D. Perkins, "GetCols Operation", Presentation at the 50th IETF meeting in Minneapolis, MN, March 2001.

Full Copyright Statement

   Copyright (C) The Internet Society (2001).
   All Rights Reserved.

   This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the Internet Society or other Internet organizations, except as needed for the purpose of developing Internet standards in which case the procedures for copyrights defined in the Internet Standards process must be followed, or as required to translate it into languages other than English.

   The limited permissions granted above are perpetual and will not be revoked by the Internet Society or its successors or assigns.

   This document and the information contained herein is provided on an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.