Internet Draft                                               R. Perlman
                                                Sun Microsystems, Inc.
                                                         6 January 1998

                      Folklore of Protocol Design
                   draft-iab-perlman-folklore-00.txt

Status of this Memo

This document is an Internet-Draft. Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its areas, and
its working groups. Note that other groups may also distribute working
documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference material
or to cite them other than as "work in progress."

To view the entire list of current Internet-Drafts, please check the
"1id-abstracts.txt" listing contained in the Internet-Drafts Shadow
Directories on ftp.is.co.za (Africa), ftp.nordu.net (Europe),
munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or
ftp.isi.edu (US West Coast).

Abstract

This document is intended to set the tone for an IETF collaboration to
collect various tricks and "gotchas" in protocol design. It is not
intended to declare the "right" and "wrong" ways of doing things, but
rather to say "this practice has the following advantages and
disadvantages", or "here are several ways of solving the following
problem", with technical explanation of the pros and cons of the
various approaches.

Discussion will take place on the mailing list
folklore@external.cisco.com. To join, send a message to
folklore-request@external.cisco.com.

1 Simplicity vs Flexibility vs Optimality

Obviously a simpler protocol is better, all things being equal, but
other goals, such as making the protocol flexible enough to fit every
possible situation or always finding the theoretically optimal
solution, create a more complex protocol. The question to ask is
whether the tradeoff is worth it. Sometimes going after "the optimal"
solution makes a protocol many times as complex, when users wouldn't
actually be able to tell the difference between a "pretty good"
solution and an "optimal" solution. Also, sometimes designing for
every possible problem and every possible future technology change
makes a protocol too complicated for the added flexibility.

The simpler the protocol, the more likely it is to be successfully
implemented and deployed. If a protocol works in most situations but
fails in some obscure case, such as a network in which there are 300
baud links or routers implemented on toasters, it might be worthwhile
to abandon those cases, either forcing users to upgrade their
equipment or designing a custom protocol for those networks.

Underspecification creates complexity. When the goal of flexibility is
carried too far, one can wind up with a protocol that is so general
that it is unlikely that two independent, conformant (to the
specification) implementations will interwork. Many of the ISO
protocols had this property. The specification was so general, and
left so many choices, that it was necessary to hold "implementor
workshops" to agree on what subsets to build and what choices to make.
The specification wasn't a specification of a protocol. Instead it was
a framework in which a protocol could be designed and implemented. In
other words, rather than specifying an algorithm for, say, data
compression, the standard would only specify "compression type" and
"type-specific data". Often even the type codes would not be defined
in the specification, much less the specifics of each choice. Choices
are often the result of the inability of the committee to reach
consensus.
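
To make the interoperability risk concrete, here is a minimal sketch
(in Python, with made-up type codes) of two implementations that each
conform to such a "framework" specification and still cannot talk to
each other:

   # Hypothetical "compression type" registry the spec left open;
   # each implementor picked its own subset of codes.
   impl_a_supports = {1, 2}        # vendor A built types 1 and 2
   impl_b_supports = {3, 4}        # vendor B built types 3 and 4

   def negotiate(offered, supported):
       """Pick any compression type both sides implement."""
       common = offered & supported
       return min(common) if common else None

   # Both implementations are "conformant" to the framework, yet:
   assert negotiate(impl_a_supports, impl_b_supports) is None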

An interesting example is cryptographic algorithm choices. For
example, PGP specified "RSA for keys, IDEA for encryption". One
argument is that it is necessary to have a choice of algorithms, in
case an algorithm is broken or is only legal in some countries.
However, having a choice of algorithms means the protocol has to be
more complex in order to negotiate algorithms, and runs the risk of
non-interoperability because different nodes might implement
non-overlapping subsets. If simplicity is chosen instead of
flexibility, then a new protocol can be deployed if an algorithm is
broken, or in countries where the chosen algorithm is illegal. But
then it could be argued that a new protocol is needed in order to
negotiate which of the simple, non-flexible protocols to use, and the
result is similar to having designed a flexible protocol with
algorithm choices.

A middle ground for something like cryptographic algorithms, where
there is the possibility that one or more will be broken, is to
specify a set of algorithms and have all implementations capable of
using any from that set. Then later, if an algorithm gets broken, it
is simple to configure each implementation to no longer generate (or
accept) that algorithm.
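
A sketch of that middle ground, with hypothetical algorithm names:
because every conformant node implements the whole mandatory set,
negotiation always succeeds, and a broken algorithm is retired by
configuration rather than by deploying a new protocol.

   # Mandatory-to-implement set, fixed by the specification.
   MANDATORY_SET = {"alg-A", "alg-B", "alg-C"}   # hypothetical names

   class Node:
       def __init__(self):
           self.enabled = set(MANDATORY_SET)     # local configuration

       def disable(self, alg):
           """Operator action once an algorithm is broken."""
           self.enabled.discard(alg)

       def negotiate(self, peer):
           common = self.enabled & peer.enabled
           return max(common) if common else None  # any agreed-on rule

   a, b = Node(), Node()
   a.disable("alg-C")      # alg-C gets broken; stop generating it
   assert a.negotiate(b) in {"alg-A", "alg-B"}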

2 Define the Problem

The first step to designing a good protocol is defining the problem.
What applications will use it? What are their "must have" needs vs
their "desirable" features? One example is multicast. A protocol
reasonable for broadcasting IETF meetings to the majority of the
Internet might be very different from a protocol for a conference call
of several participants. Is it better to design one general protocol
that will meet the needs of very different sorts of multicast groups,
or is it better to design multiple protocols? The answer is "it
depends", but before designing any protocol, it is good to justify the
choice.

A justification for designing without defining the problem is that one
cannot imagine what applications will develop. Design the tool and the
applications will come. The argument against is that a protocol
designed without defining the problem is likely to be more complex and
expensive (bandwidth, etc.) than necessary, and if an application with
different needs does appear, the protocol may not serve it well
anyway.

Another example is "policy based routing". Dave Clark described the
general problem, from a theoretical point of view, in [Clark]. But
nobody ever described all the actual customer needs. BGP provides some
set of policies, but not the general case. For instance, a BGP router
chooses a single path to the destination, without taking into account
the source. Maybe some sources need to have data routed differently
from others. Did BGP solve the important cases, or did the world adapt
to what BGP happened to solve? If the latter, would the world have
been satisfied with a more conveniently accommodated subset, or
perhaps even without policy-based routing at all?

3 Overhead/Scaling

One should calculate the overhead of the algorithm. For example, the
bandwidth used by source route bridging increases exponentially with
the number of nodes in a reasonably richly interconnected topology. It
is usually possible to choose an algorithm with less dramatic growth,
but most algorithms have some limit. Estimate reasonable bounds on the
limits, and publish these in the specification. Sometimes there is no
reason to scale beyond a certain point. For example, a protocol that
was n**2 or even exponential might be reasonable if it's known that
there would never be more than 5 nodes participating.

4 Operation Above Capacity

If there are assumptions about the size of the problem to be solved,
either the limit should be so large that it would never in practice be
exceeded, or the protocol should be designed to gracefully degrade if
the limit is exceeded, or at the very least detect that the topology
is now illegal and complain (or disconnect a subset to bring the
topology within legal limits).

An example of a protocol that considered graceful operation beyond
expected limits is IS-IS, for the case when a router's capacity for
storing link state information is exceeded. Routing depends on all
routers making decisions based on identical link state databases, so
loops and other disruption can form if a router attempts to continue
making decisions based on a subset of the information. The protocol
was designed so that:

* an overloaded router would not disrupt operations by being on any
  paths (except as a last resort)
* the router was still reachable on the network, so that it could be
  remotely managed
* if the router was on a cut set of the network, the nodes on the
  other side would (probably) still be reachable through that router
* if the routing database somehow got smaller, the router would return
  to normal operation without human intervention

This was accomplished by having the router report, in its own link
state information, that it was "overloaded". Other routers treated
links to that router as usable only as a "last resort". If some amount
of time elapsed without the router needing to discard link state
information, the router declared itself normal again by reissuing its
link state information.
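
The core of the overload mechanism can be sketched as follows (a
simplified model, not the actual IS-IS encoding or route computation;
the "last resort" fallback, which would allow overloaded transit when
nothing else reaches a destination, is left out for brevity):

   # Routers advertising overload stay reachable but carry no transit
   # traffic.  graph: {node: {neighbor: cost}}
   import heapq

   def shortest_paths(graph, source, overloaded):
       dist = {source: 0}
       heap = [(0, source)]
       while heap:
           d, u = heapq.heappop(heap)
           if d > dist.get(u, float("inf")):
               continue
           if u != source and u in overloaded:
               continue   # usable as an endpoint, never as transit
           for v, cost in graph[u].items():
               if d + cost < dist.get(v, float("inf")):
                   dist[v] = d + cost
                   heapq.heappush(heap, (d + cost, v))
       return dist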

5 Identifiers

Often a protocol contains a field identifying something, for instance
a protocol type. Most IETF standards have numbers assigned by the
IANA. This enables a field to be reasonably compact. An alternative is
an "object identifier" as in ASN.1. Object identifiers are very large,
but have the advantage that it is not necessary to obtain one from the
IANA, since the hierarchical structure of the object identifier makes
it possible to get a unique identifier without central administration.
There might also be cases in which companies want to deploy
proprietary extensions without letting anyone know that they are doing
this. With an object identifier it is not necessary to tell a central
authority of your plans. And in some cases the central authority might
publicly divulge the assigned numbers, and the recipient of each
assigned number.

There are several disadvantages to object identifiers:

* the field is larger, and therefore consumes memory, bandwidth, and
  CPU
* there is no central place to look up all the currently used object
  identifiers, so it might be difficult to debug a network
* sometimes the same protocol will wind up with multiple object
  identifiers, again because there is no central coordination, so two
  different organizations might define an object identifier for the
  same protocol. Then two implementations might in theory be
  interoperable, but since the object identifiers assigned to some
  field differ, the two implementations might refuse to interoperate.

6 Optimize for Most Common or Important Case

Huffman coding is an example of this principle. It might be applicable
to implementation or to protocol design. An example of an
implementation that optimizes for the usual case is one in which a
"common" IP packet (no options, nothing else unusual) is switched in
hardware, whereas if there is anything unusual about the packet it is
sent to the dungeon of the central processor to be prodded and
pondered when the router finds it convenient. An example of this
principle in protocol design is encoding "unusual" requests, such as
source routing, as an option, which is less efficient in space and in
parsing overhead than having the capability encoded in a fixed portion
of the header.
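
A sketch of the fast path/slow path split for IPv4 (the header facts
are real: version 4, and a header length (IHL) of 5 words means no
options; the dispatch itself is illustrative):

   import struct

   def classify(packet):
       """Return 'fast' for a plain IPv4 packet, 'slow' otherwise."""
       if len(packet) < 20:
           return "slow"
       version_ihl, = struct.unpack_from("!B", packet, 0)
       version, ihl = version_ihl >> 4, version_ihl & 0x0F
       flags_frag, = struct.unpack_from("!H", packet, 6)
       if version == 4 and ihl == 5 and (flags_frag & 0x3FFF) == 0:
           return "fast"   # no options, not a fragment: hardware path
       return "slow"       # anything unusual: punt to the CPU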

7 Forward Compatibility

Protocols generally evolve, and it is good to design them with
provision for making minor or major changes. Some changes are
"incompatible", so that it is preferable for the later version node to
be aware that it is talking to an earlier version node, and switch to
speaking the earlier version of the protocol. Other changes are
"compatible", where later version protocol messages can be processed
without harm by earlier version nodes. There are various techniques.

7.1 Large Enough Fields

A common mistake is to make fields too small. It is better to
overestimate than to underestimate. It greatly expands the lifetime of
a protocol. Examples of fields that one could argue should have been
larger are:

* the IP address
* the "packet identifier" in the IP header (because it could wrap
  around within a packet lifetime)
* the "fragment identifier" in IS-IS (because an LSP could be larger
  than 256 fragments)
* the packet size in IPv6 (though some might argue that "optimize for
  the most common case" is the reason for splitting the high order
  part into an option in the very unusual case where packets larger
  than 64K bytes would be desired)
* date fields

7.2 Independence of Layers

It is desirable to design a protocol with as little dependence on
other layers as possible, so that in the future one layer can be
replaced without affecting other layers. An example is having
protocols above layer 3 make the assumption that addresses are 4 bytes
long.

The downside of this principle is that if you do not exploit the
special capabilities of a particular technology at layer n, then you
wind up with the "least common denominator". For example, not all data
links provide multicast capability, yet it is very useful for routing
algorithms to use link level multicast for neighbor discovery,
efficient propagation of information to all LAN neighbors, etc. If we
adhered too strictly to the principle of not making special
assumptions about the data link layer, then we might not have allowed
layer 3 to exploit the multicast capability of some layer 2
technologies.

Another danger of exploiting special capabilities of layer n-1 is that
a new technology at layer n-1 might need to be altered in unnatural
ways to make it support the API designed for a different technology.
An example is attempting to make a technology like Frame Relay or SMDS
provide multicast so that it "looks like" Ethernet. The way in which
multicast was simulated in SMDS was to have packets with a multicast
destination address transmitted to a special node that was manually
configured with the individual members, and that node sent
individually addressed copies of the "multicast" packet to each of the
recipients.

7.3 Reserved Fields

Often there are spare bits. If they are carefully specified to be
transmitted as zero and ignored upon receipt, then they can later be
used for functions such as signaling that the transmitting node has
implemented later version features, or they can be used to encode
information, such as priority, that is safe for some nodes not to
understand. This is an excellent example of the maxim "Be conservative
in what you send, and liberal in what you accept", because you should
always set reserved bits to zero and ignore them upon receipt.

7.4 Single Version Number Field

One method of expressing version is a single number. What should an
implementation do if the version number is different? Sometimes a node
might implement multiple previous versions. Sometimes later versions
are indeed compatible with previous versions. It is generally good to
specify that a node that receives a packet with a larger version
number simply drop it, or respond with an earlier version packet,
rather than logging an error, or crashing. If two nodes attempt to
communicate, and the one with the larger version notices it is talking
to a node with a smaller version, the later version node simply
switches to talking the older version of the protocol, setting the
version number to the one recognized by the other side.

One problem that can result is that two new version nodes might get
tricked into talking the old version of the protocol to each other,
since any memory on one side that the other side is older will cause
it to talk the older version, and therefore cause the other side to
talk the older version as well. A method of solving this problem is to
use a reserved bit indicating "I could be speaking a later version but
I think this is the latest version you support". Another possibility
is to periodically probe with a later version packet.
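
A minimal sketch of the fallback rule together with the reserved-bit
fix (the field layout and flag are hypothetical):

   MY_VERSION = 3

   def choose_version(rx_version, rx_could_speak_later):
       """Pick the version to speak back to this neighbor."""
       if rx_version >= MY_VERSION:
           return MY_VERSION    # peer is newer; it will drop to us
       if rx_could_speak_later:
           # Peer is merely echoing an old version it thinks we need;
           # answering at MY_VERSION lets both sides climb back up.
           return MY_VERSION
       return rx_version        # genuinely old peer: speak its version

   assert choose_version(5, False) == 3   # newer peer
   assert choose_version(2, True) == 3    # stuck-in-the-past peer
   assert choose_version(2, False) == 2   # really-old peer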

7.5 Split Version Number Field

This strategy uses two or more subfields, sometimes referred to as
"major" and "minor" version numbers. The major subfield is incremented
if the protocol has been modified in an incompatible way and it is
dangerous for an old version node to attempt to process the packet.
The minor subfield is incremented if there are compatible changes to
the protocol. An example of a compatible change is where a Transport
layer protocol might have added the feature of delayed acks to avoid
silly window syndrome [Clark's paper]. The same result could be
achieved with reserved bits (signalling that you implement enhanced
features that are compatible with this version), but having a "minor"
version field in addition to the "major" version allows 2**n possible
enhancements to be signalled with an n-bit "minor version" field
(assuming the enhancements were added to the protocol in sequential
order, so that announcing enhancement 23 means you support all
previous enhancements as well).

If you want to allow more flexibility than "all versions up to n",
then there are various possibilities:

* "I support all capabilities between k and n" (requires double the
  "minor" version field)
* "I support capabilities 2, 3, and 6" (probably better off with a
  bitmask)

With a version number field, care must be taken if it is allowed to
wrap around. It is far simpler not to face this issue, by either
making the version number field very large or being conservative about
incrementing it.

7.6 Options

Another way of providing for future protocol evolution is to allow
appending "options". IP has option fields. It is desirable to encode
options in a way that allows an unknown option to be skipped, though
sometimes it is desirable for an unknown option to generate an error
rather than be ignored. The most flexible capability is to specify,
for each option, what a node that does not recognize the option should
do, whether it be "skip and ignore", "skip and log", or "stop parsing
and generate an error".

To be able to skip unknown options, strategies are:

* have a special marker at the end of the option (requires a linear
  scan of the option to find the end)
* have options be TLV encoded, which means a "type" field, a "length"
  field, and a "value" field

Note that the "L" has to always mean the same thing. Sometimes
protocols have L depend on T, for instance not having any L field if
the particular type is always fixed length, or having the L be
expressed in bits vs bytes. If L depends on T, then an unknown option
cannot be skipped. Another way to make it impossible to parse an
unknown option is if L is the "usable length", and the actual length
is always padded to, say, a multiple of 8 bytes. If the specification
is clear that all options interpret L that way, then options can be
parsed, but if some option types use L as "how much data to skip" and
others as "relevant information" from which padding is inferred
somehow, then it is not possible to parse unknown options.

To know what to do with unknown options, there are various strategies:

* Specify the handling of all unknown types (e.g., skip and log, skip
  and ignore, generate an error and ignore the entire packet).
* Have a field present in all options that specifies the handling of
  the option (such as the "copy" flag in IPv4 that specifies whether
  an option should be copied into each fragment or just the initial
  fragment, so that a router can fragment correctly even if it does
  not understand the option).
* Have the handling implicit in the type number, for instance a range
  of T values that the specification says should be ignored and
  another range to be skipped and logged, etc. This is similar to
  considering a bit in the type field as a flag indicating the
  handling of the option.

An example of an option that would make sense to ignore if unknown is
priority. An example of an option in which the packet should be
dropped is strict source routing.
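
A sketch of a TLV option walker with the handling implicit in the type
number (here, hypothetically, the top two bits of the type; L never
depends on T, which is what makes skipping possible):

   import struct

   SKIP_IGNORE, SKIP_LOG, DROP = 0, 1, 2   # hypothetical encoding

   def walk_options(buf, handlers):
       off = 0
       while off + 2 <= len(buf):
           t, l = struct.unpack_from("!BB", buf, off)
           value = buf[off + 2 : off + 2 + l]
           if t in handlers:
               handlers[t](value)
           else:
               action = t >> 6          # handling encoded in the type
               if action == DROP:
                   raise ValueError("unknown critical option %d" % t)
               if action == SKIP_LOG:
                   print("skipping unknown option", t)
               # SKIP_IGNORE: fall through silently
           off += 2 + l                 # always skippable

   # Unknown type 0x05 (top bits 00 = ignore): silently skipped.
   walk_options(bytes([0x05, 0x02, 0xAA, 0xBB]), handlers={})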

8 Parameters

There are various reasons for having parameters, some good and some
bad:

* The protocol designers could not figure out the proper values, so
  they leave it to the user to figure them out. This might make sense
  if deployment experience might help determine reasonable values.
  However, if the protocol designers simply can't decide, it is
  unreasonable to expect the users to have any better judgement. At
  any rate, if deployment experience does give enough information to
  set the values, then the parameters should no longer be settable,
  and should instead just be constants given in the specification.
* There are reasonable tradeoffs, say between responsiveness and
  overhead. In this case, the parameter descriptions should explain
  the range, and the reasons for choosing points in the range.

In general, it is a good idea to avoid parameters wherever possible,
because they make for intimidating documentation which must be written
and, more importantly, read in order to use the protocol. It is also
desirable, whenever possible, for the computers to figure out the
values of the parameters rather than forcing the parameters to be set
by humans. Examples include link cost, which could be measured at link
startup time by measuring the round trip delay and bandwidth, and the
network layer address.

It is important to design the protocol so that parameters set by
people can be modified in a running network, one node at a time. In
some protocols, parameters can be set incorrectly and the protocol
will not run properly. Unfortunately it isn't as simple as having a
legal range for the parameter, because one parameter might interact
with another, even a parameter in a different layer. In a distributed
system it's possible for two systems to independently have reasonable
parameter settings, yet have the settings be incompatible with each
other. A simple example of incompatible settings is in a neighbor
aliveness detection protocol, where one node sends hellos every n
seconds and the other declares the neighbor dead if it does not hear a
hello for k seconds. If k is not greater than n, the protocol will not
work very well.

There are some tricks for causing parameters to be compatible in a
distributed system. In some cases, it is reasonable for nodes to
operate with different parameter settings, just so long as all the
nodes know the parameter settings of the other (relevant) nodes. The
"report" method has node N report the value of its parameter, in
protocol messages, to all the other nodes that need to hear it. IS-IS
uses the "report" method. If the parameter is one that neighbors need
to know, then it is reported in a "Hello" message (a message that does
not get forwarded, and is therefore only seen by the neighbors). If
the parameter is one that all nodes (in an area) need to know, then it
is reported in an LSP. This method allows each node to have
independent parameter settings and yet interoperate because, for
example, a node will adjust its Listen timer (when to declare a
neighbor dead) for neighbor N based on N's reported Hello timer (how
often it sends Hellos).
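
The report method in miniature (the timer names and the multiplier of
three are illustrative, not taken from any particular specification):

   class Neighbor:
       def __init__(self):
           self.reported_hello = None   # seconds, learned from Hellos
           self.last_heard = 0.0

   def on_hello(nbr, hello_interval, now):
       """Each Hello carries the sender's own Hello timer setting."""
       nbr.reported_hello = hello_interval
       nbr.last_heard = now

   def is_dead(nbr, now, multiplier=3):
       # The Listen timer is derived from the neighbor's reported
       # setting, so each node can pick its own Hello interval and
       # still interoperate.
       if nbr.reported_hello is None:
           return False
       return now - nbr.last_heard > multiplier * nbr.reported_hello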

Another method is the "detect misconfiguration" method, in which
parameters are reported so that nodes can detect whether they are
misconfigured. An example where the "detect misconfiguration" strategy
makes sense is where routers on a LAN might report to each other the
(IP address, subnet mask) of the LAN. An example where the "detect
misconfiguration" method is not the best choice is the OSPF protocol,
which puts the Hello timer and other parameters into Hello messages,
and has neighbors refuse to talk if the parameter settings aren't
identical. This forces all nodes on a LAN to have the same Hello
timer, but there might be legitimate reasons why the
responsiveness/overhead tradeoff for one router might be different
from that for another router, so that neighbors might legitimately
need different values for the Hello timer. Also, the OSPF method makes
it difficult to change parameters in a running network, because
neighbors will refuse to talk to each other while the network is being
migrated from one value to another.

Another method is the "use my parameters" method. One example is the
bridge spanning tree algorithm, where the Root bridge reports, in its
spanning tree message, its values for parameters that should be used
by all the bridges. This way bridges can be configured one by one, but
a non-Root bridge will simply store the configured value in
nonvolatile storage, to be used if that bridge becomes Root. The
values everyone uses for the parameters are the ones configured into
the bridge that is currently acting as Root. This is a reasonable
strategy provided that there is no reason to want nodes to be working
with different parameter values. Another example of "use my
parameters" is Appletalk, where the "seed router" informs the other
routers of the proper LAN parameters, such as the network number
range. However, it is different from the bridge algorithm, because if
there is more than one seed router, they must be configured with the
same parameter values.

A dangerous version of the "use my parameters" method is one in which
all nodes store the parameters when receiving a report. This might
lead to problems, because misconfiguring one node can cause all the
other nodes to be permanently misconfigured. In contrast, with the
bridge algorithm, although the Root bridge might get misconfigured
with undesirable parameters, even if those parameters cause the
network to be nonfunctional, simply disconnecting the Root bridge will
cause some other bridge to take over, and cause all bridges to use
that bridge's parameter settings. Or simply reconfiguring the one Root
bridge will clear the network.
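
A sketch of the safe variant of "use my parameters" (a pure model of
the spanning tree behavior described above; real BPDU formats are not
shown): configured values survive in nonvolatile storage, and the
values in use always track the current Root.

   class Bridge:
       def __init__(self, bridge_id, hello_time, max_age):
           self.bridge_id = bridge_id
           # Own configuration, kept in nonvolatile storage; used
           # only while this bridge is Root.
           self.configured = {"hello_time": hello_time,
                              "max_age": max_age}
           self.in_use = dict(self.configured)

       def on_root_update(self, root_id, root_params):
           if root_id == self.bridge_id:
               self.in_use = dict(self.configured)  # I am Root
           else:
               self.in_use = dict(root_params)      # adopt Root's values
           # 'configured' is never overwritten, so one misconfigured
           # Root cannot permanently poison the other bridges.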

9 Making Multiprotocol Operation Possible

Unfortunately, there is not a single protocol or protocol suite in the
world. There will be computers that will want to be able to receive
packets in multiple "languages". Since the protocol designers do not
in general coordinate with each other to make their protocols
self-describing, it is necessary to figure out a way to ensure that a
computer can receive a message in your protocol and not confuse it
with another protocol the computer may also be capable of handling.
There are several methods of doing this, and because of that it can be
very confusing. There is no single "right" way to do it, although the
world would be simpler if everyone did it the same way, but we will
attempt to explain the various approaches:

* Protocol type at layer n-1. This is a field administered by the
  owner of the layer n-1 specification. Each layer n protocol that
  wishes to be carried in a layer n-1 envelope is given a unique
  value. The Ethernet standard [XXX] has a protocol type field
  assigned.
* Socket, port, or SAP at layer n-1. This consists of two fields at
  layer n-1, one applying to the source and the other applying to the
  destination. This makes sense when these fields need to be assigned
  dynamically. However, almost always when this approach is taken,
  there are some predefined "well-known" sockets. A process tends to
  "listen" on the well-known socket and wait for a dynamically
  assigned socket from another machine to connect. In practice,
  although the IEEE 802.2 header is defined as using "SAPs", in
  reality the field is used as a protocol type, because the SAP values
  are either well-known (and therefore the Destination and Source SAP
  values will be the same), or there is a special SAP known as the
  "SNAP SAP" which indicates that true multiplexing is done with a
  protocol type later in the header.
* Protocol type at layer n. This consists of a field in the layer n
  header that allows multiple different layer n protocols to
  distinguish themselves from each other. This is usually done when
  multiple protocols defined by a particular standards body share the
  same layer n-1 protocol type. One could argue that the "version
  number" field in IP is actually a layer n protocol type, especially
  since "version"=5 is clearly not intended as the next "version" of
  IP.

So the multiplexing information might be one field or two (one for
source, one for destination), and the multiplexing information might
be dynamically assigned or "well-known". Multiplexing based on
dynamically assigned sockets does not work well with n-party
protocols, so for something like a LAN on which multicast is possible,
sockets would be the wrong choice. In particular, IEEE made the wrong
choice when it changed the Ethernet protocol to have sockets (SAPs),
especially with the destination and source sockets being only 8 bits
long. Furthermore, IEEE defined 2 of the bits, so there were only 64
possible values to assign to "well-known" sockets, and 64 possible
values to be assigned dynamically, or by anyone other than IEEE.
Because of this mistake, the SNAP encoding was invented, whereby a
single well-known socket (the SNAP SAP) was assigned to indicate that
the header was expanded to include a true protocol type field.

Dynamically assigned values work best in a connection-oriented
environment. If one believes that Ethernet should always be combined
with LLC type 2 (a connection-oriented, reliable protocol), then it
might be reasonable to multiplex based on sockets. Indeed it is
similar to combining TCP or UDP with Ethernet, and including the
TCP/UDP port numbers in the combined protocol. However, if reliability
is considered as belonging in a different layer (if needed at all),
then SAPs were a poor choice. If protocol types were used instead of
SAPs in IEEE for multiplexing, then all the functionality of LLC type
2 (or any other connection-oriented protocol) could have been easily
accomplished by assigning LLC type 2 a protocol type, and having LLC
type 2 define socket fields within its own header. It is not as easy
to accommodate connectionless protocols on top of sockets unless you
"cheat" by assigning well-known socket values, and basically treating
the socket as a protocol type. Especially in the IEEE case this was
inconvenient, because there were not enough socket values to assign a
well-known value to every connectionless protocol. The SNAP kludge
saved the day, though, by allowing all connectionless protocols to
share a single SAP.
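
The resulting demultiplexing logic, as a sketch (the SNAP SAP value
0xAA and the IPv4 protocol type 0x0800 are real; error handling is
omitted):

   import struct

   SNAP_SAP = 0xAA

   def demux_llc(frame):
       """Parse an 802.2 LLC header, following SNAP if present."""
       dsap, ssap, control = struct.unpack_from("!BBB", frame, 0)
       if dsap == SNAP_SAP and ssap == SNAP_SAP:
           # SNAP: a 3-byte OUI and then a 2-byte protocol type; the
           # "socket" is really being used as a protocol type.
           ethertype, = struct.unpack_from("!H", frame, 6)
           return ("ethertype", ethertype)
       return ("sap", dsap)   # well-known SAP acting as protocol type

   # IPv4 over 802.2/SNAP: UI control (0x03), OUI 00-00-00, type 0x0800
   frame = bytes([0xAA, 0xAA, 0x03, 0x00, 0x00, 0x00, 0x08, 0x00])
   assert demux_llc(frame) == ("ethertype", 0x0800)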

10 Running over Layer 3 vs Layer 2

Sometimes protocols that only work neighbor to neighbor are
encapsulated in a layer 3 header. Examples are many of the routing
protocols for routing IP. Since such messages are not intended to ever
be forwarded by IP, there is no reason to have an IP header. The IP
header makes the messages longer, and care must be taken to ensure
that the packets don't actually get routed, because that could confuse
distant routers into thinking they are neighbors. The alternative is
to acquire a layer 2 protocol type.

Sometimes there are implementation reasons to run a
neighbor-to-neighbor protocol such as a routing algorithm over layer
3. For instance, there might be an API for running over layer 3, so
that the application can be built as a user process, whereas there
might not be an API for running over layer 2, and therefore running
over layer 2 would require modifications to the kernel. Or it might be
bureaucratically difficult to obtain a layer 2 protocol type.
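
For comparison, a sketch of the direct layer 2 encapsulation of a
hop-by-hop hello (the protocol type 0x88B5 is just a stand-in for "a
layer 2 protocol type you would have to acquire", and the MAC
addresses are arbitrary):

   import struct

   HELLO = b"HELLO v1"

   def frame_layer2(dst_mac, src_mac, payload, ethertype=0x88B5):
       # 14-byte Ethernet header, no IP header.  A frame like this
       # cannot be forwarded, so a distant router can never mistake
       # the sender for a neighbor.
       return dst_mac + src_mac + struct.pack("!H", ethertype) + payload

   frame = frame_layer2(b"\x01\x00\x00\x00\x00\x01",
                        b"\x02\x00\x00\x00\x00\x02", HELLO)
   assert len(frame) == 14 + len(HELLO)
   # The layer 3 alternative would add at least a 20-byte IP header
   # and would need something like TTL=1 to keep the packet from
   # being routed.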

11 Robustness

One type of robustness is "simple robustness", where the protocol
adapts to node and link fail-stop failures.

Another type is "self-stabilization", where although operation might
have become disrupted due to extraordinary events, like a
malfunctioning node injecting incorrect messages, once the
malfunctioning node is disconnected from the network, the network
should return to normal operation. The ARPANET link state distribution
protocol was not self-stabilizing, and after a sick router injected a
few bad LSPs, the network would have stayed down forever without hours
of difficult manual intervention, even though the sick router had
failed completely hours before and only "correctly functioning"
routers were participating in the protocol.

Another type is "Byzantine robustness", where the network can continue
to work properly even in the face of malfunctioning nodes, whether the
malfunctions are due to hardware problems or even malice. As society
gets more dependent on networks, it is desirable to attempt to achieve
Byzantine robustness in any distributed algorithm, such as clock
synchronization, directory system synchronization, or routing. This is
difficult; however, it is important if the protocol is to be used in a
hostile environment (such as where the nodes cooperating in the
protocol are remotely manageable from across the Internet, or where a
disgruntled employee might be able to physically access one of the
nodes).

Some interesting points to consider for making a system robust:

* Every line of code should be exercised frequently. If there is code
  that only gets invoked when the nuclear power plant is about to
  explode, it is possible that the code will no longer work when it is
  actually needed. This could be due to modifications that have been
  made to the system since the special case code was last checked, or
  to seemingly unrelated events such as increasing link bandwidth.
* Sometimes it is better to crash rather than gradually degrade in the
  presence of problems, so that the problems get fixed, or at least
  diagnosed. For example, it might be preferable to bring down a link
  that has a high error rate.
* It is sometimes possible to partition the network with containment
  points, so that a problem on one side will not spread to the other.
  An example is attaching two LANs with a router vs a bridge. A
  broadcast storm (using data link multicast) will spread to both
  sides of a bridge, whereas it will not spread through a router.
* Connectivity can be weird. For instance, a link might be one-way,
  either because that is the way the technology works or because the
  hardware is broken (e.g., one side has a broken transmitter, or the
  other has a broken receiver). Or a link might work except be
  sensitive to certain bit patterns. Or it might look to your protocol
  like a node is a neighbor when in fact there are bridges in between,
  and somewhere on the bridged path is a link with a smaller MTU size.
  Therefore it could look like you are neighbors, but packets beyond a
  certain size will not get through. It is a good idea to have your
  protocol check that the link is indeed functioning properly (e.g.,
  pad hellos to maximum length to determine whether large packets
  actually get through, test that connectivity is 2-way, etc.).
* Certain checksums detect certain error conditions better than
  others. For example, if bytes are getting swapped, the Fletcher
  checksum will catch the problem whereas the IPv4 checksum will not,
  as the sketch after this list demonstrates.
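
A demonstration of that last point, with straightforward
implementations of both checksums (swapping the two 16-bit words
leaves the ones-complement Internet checksum unchanged, because it is
a pure sum; the Fletcher checksum is position-dependent and catches
it):

   def internet_checksum(data):
       """RFC 1071 style: ones-complement sum of 16-bit words."""
       s = 0
       for i in range(0, len(data), 2):
           s += (data[i] << 8) | data[i + 1]
       while s >> 16:
           s = (s & 0xFFFF) + (s >> 16)   # fold the carries back in
       return ~s & 0xFFFF

   def fletcher16(data):
       c0 = c1 = 0
       for byte in data:
           c0 = (c0 + byte) % 255
           c1 = (c1 + c0) % 255           # position-dependent term
       return (c1 << 8) | c0

   original = b"\x12\x34\x56\x78"
   swapped  = b"\x56\x78\x12\x34"         # two 16-bit words reordered
   assert internet_checksum(original) == internet_checksum(swapped)
   assert fletcher16(original) != fletcher16(swapped)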

12 Determinism vs Stability

The Designated Router election protocols in IS-IS and OSPF differ in
an interesting way. In IS-IS the protocol is "deterministic",
considered by some to be a desirable property. "Determinism" means
that the behavior at this moment does not depend on past events. So
the protocol was designed so that, given a particular set of routers
that are up, the same one would always be DR. In contrast, OSPF went
for "stability", to cause minimal disruption to the network when
routers go up or down. In OSPF, once a node is elected DR it will
remain DR unless it crashes, whereas in IS-IS a router with a "better"
configured priority will usurp the role when it comes up.

A good compromise was adopted in the NLSP protocol (basically IS-IS
for IPX). Nodes raise their priority by some constant (say 20) after
being DR for some time (say a minute). By configuring all the routers
with the same priority, the protocol acts like OSPF. By configuring
all the routers with priorities more than 20 apart, it acts like
IS-IS. To allow OSPF-like behavior among a particular subset of the
routers (e.g., higher capacity routers), set them all to a priority 20
greater than any of the other routers. That way, if any router in the
high priority set is up, a high priority router will become DR, but no
other router will usurp the role. Perhaps a simpler way to think of it
is that each router could be configured with two priorities: one
initially, and one after being DR for a time.

13 Performance for Correctness

Sometimes in order to be "correct" an implementation must meet certain
performance constraints. An example is the bridge spanning tree
algorithm. Loops in a bridged network can be disastrous, since packets
can proliferate exponentially while they are looping. The spanning
tree algorithm depends on receipt of spanning tree messages in order
to keep a link from forwarding. If temporary congestion caused a
bridge to throw away packets before processing them, then the bridge
might be throwing away spanning tree messages, causing links that
should be in hot-standby to forward traffic, causing loops and
exponentially more congestion. It is very possible that a bridged
topology might not recover from such an event. Therefore it is highly
desirable, if not something worth mandating, that bridges operate at
wire speed. A lot of denial of service attacks are possible (e.g., the
TCP SYN attack) because nodes are not capable of processing every
received packet at wire speed.

14 ASN.1

The concept of ASN.1 is appealing. You don't have to think about how
the actual data is represented on each machine; bit/byte order and
word size do not have to be considered by the protocol designer. Many
protocols therefore define their packet formats using ASN.1. However,
there are certain "gotchas" that should be understood when deciding
whether ASN.1 is a good choice:

* ASN.1 has a lot of overhead. It adds bytes of overhead in databases
  and bytes on the wire, and increases the complexity of the code.
  Although an expert in ASN.1 can define structures so that they will
  generate reasonably efficient data structures, a nonexpert can
  easily create wildly inefficient structures. For example, the way an
  address was defined in ASN.1 in Kerberos version 5, an IPv4 address
  would be encoded (in databases and on the wire) in 11 bytes, whereas
  an ASN.1 expert could have defined it differently, to use 6 bytes
  (see the sketch after this list). Some might argue that a naive C
  programmer can generate inefficient code, but perhaps inefficient C
  code is less important because it only affects the inside of a
  machine, and can later be improved, whereas an inefficient data
  structure results in bits on the wire.
* TLV encoding makes optional fields easy and should make forward
  compatibility easy. However, ASN.1 1984 was not implemented to make
  it easy to add optional fields. Although it translated into TLV
  encoding, the parser would reject a data structure with added
  fields. Although the 1988 version of ASN.1 fixed this, most
  protocols continue to use 1984 ASN.1 because of the availability of
  1984 ASN.1 compilers.
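
To see where the bytes go, here is a hand-rolled DER-style TLV
encoding (short-form lengths only) of a 4-byte IPv4 address in a
structure of roughly the shape Kerberos uses; the exact Kerberos
layout differs, but the arithmetic is the point.

   def tlv(tag, content):
       """DER-style TLV; short-form length (content under 128 bytes)."""
       return bytes([tag, len(content)]) + content

   addr = bytes([192, 168, 0, 1])     # the 4 bytes we actually need

   # SEQUENCE { INTEGER addr-type, OCTET STRING address }:
   # 11 bytes on the wire for a 4-byte address.
   verbose = tlv(0x30, tlv(0x02, b"\x02") + tlv(0x04, addr))
   assert len(verbose) == 11

   # A leaner definition: a single (implicitly tagged) OCTET STRING,
   # with the address type implied by the tag: 6 bytes.
   lean = tlv(0x04, addr)
   assert len(lean) == 6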

15 Security Pitfalls

Although a complete coverage of security pitfalls is beyond the scope
of a short paper, it is probably useful to note a few:

* Bad random number generators for key seeds. Though this is usually
  an implementation problem rather than a protocol problem, it is a
  sufficiently common mistake that it is worth mentioning.
* Encryption alone does not necessarily provide data integrity.
  Consider, for example, an encryption algorithm that precomputes a
  pseudorandom bit string and XORs it with the data. If the data is
  predictable, then the real data can be XORed out and replaced with
  new data, even though the ciphertext cannot be "decrypted".
* Reflection attacks, especially with multiple servers. If the same
  secret is used with multiple servers, a common mistake in some (bad)
  protocols allows a message sent to one server to be replayed at
  another.
* Backward compatibility with weak or broken cryptographic algorithms.
  Sometimes, for compatibility with exportable versions or old
  versions, a negotiation is done in which one side can request weaker
  security. If this negotiation is not itself integrity protected, an
  intruder can fool two sides capable of talking good security into
  speaking weaker security, by injecting a message into the
  negotiation requesting the weaker security.
* IP addresses are spoofable. Sometimes the assumption is that only
  the client needs to authenticate to the server. However, if an
  intruder spoofs a server, it can cause the client machine to do
  things like send the user's password in the clear.
* Sometimes protocols can be tricked into decrypting or signing
  something. For example, if the method of authentication is to accept
  any arbitrary challenge and sign it with your private key, then the
  "challenge" might actually be a promise to pay someone a million
  dollars. The PKCS standards are designed to avoid this sort of
  pitfall.

16 Author's Address

Radia Perlman
Sun Microsystems, Inc.
2 Elizabeth Drive
Chelmsford, MA 01824

Tel:   +1.978.442.3252
Email: radia.perlman@sun.com