idnits 2.17.1

draft-hildebrand-spud-prototype-01.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == Line 104 has weird spacing: '...unknown no in...'

  == Line 107 has weird spacing: '...opening the i...'

  == Line 110 has weird spacing: '...running the t...'

  == Line 112 has weird spacing: '...esuming an ou...'

  == The document seems to lack the recommended RFC 2119 boilerplate, even if
     it appears to use RFC 2119 keywords.

     (The document does seem to have the reference to RFC 2119 which the
     ID-Checklist requires).

  -- The document date (February 12, 2015) is 3361 days in the past. Is this
     intentional?

  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  ** Obsolete normative reference: RFC 7049 (Obsoleted by RFC 8949)

     Summary: 1 error (**), 0 flaws (~~), 6 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

2 Network Working Group                                      J. Hildebrand
3 Internet-Draft                                             Cisco Systems
4 Intended status: Informational                               B. Trammell
5 Expires: August 16, 2015                                      ETH Zurich
6                                                        February 12, 2015

8          Substrate Protocol for User Datagrams (SPUD) Prototype
9                    draft-hildebrand-spud-prototype-01

11 Abstract

13    SPUD is a prototype for grouping UDP packets together in a "tube",
14    also allowing network devices on the path between endpoints to
15    participate explicitly in the tube outside the end-to-end context.

17 Status of This Memo

19    This Internet-Draft is submitted in full conformance with the
20    provisions of BCP 78 and BCP 79.

22    Internet-Drafts are working documents of the Internet Engineering
23    Task Force (IETF). Note that other groups may also distribute
24    working documents as Internet-Drafts. The list of current Internet-
25    Drafts is at http://datatracker.ietf.org/drafts/current/.

27    Internet-Drafts are draft documents valid for a maximum of six months
28    and may be updated, replaced, or obsoleted by other documents at any
29    time. It is inappropriate to use Internet-Drafts as reference
30    material or to cite them other than as "work in progress."

32    This Internet-Draft will expire on August 16, 2015.

34 Copyright Notice

36    Copyright (c) 2015 IETF Trust and the persons identified as the
37    document authors. All rights reserved.

39    This document is subject to BCP 78 and the IETF Trust's Legal
40    Provisions Relating to IETF Documents
41    (http://trustee.ietf.org/license-info) in effect on the date of
42    publication of this document. Please review these documents
43    carefully, as they describe your rights and restrictions with respect
44    to this document. Code Components extracted from this document must
45    include Simplified BSD License text as described in Section 4.e of
46    the Trust Legal Provisions and are provided without warranty as
47    described in the Simplified BSD License.

49 1. Introduction

51    The goal of SPUD (Substrate Protocol for User Datagrams) is to
52    provide a mechanism for grouping UDP packets together into a "tube"
53    with a defined beginning and end in time.
Devices on the network
54    path between the endpoints speaking SPUD may communicate explicitly
55    with the endpoints outside the context of the end-to-end
56    conversation.

58    The SPUD protocol is a prototype, intended to promote further
59    discussion of potential use cases within the framework of a concrete
60    approach. To move forward, ideas explored in this protocol might be
61    implemented inside another protocol such as DTLS.

63 1.1. Terminology

65    In this document, the key words "MUST", "MUST NOT", "REQUIRED",
66    "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY",
67    and "OPTIONAL" are to be interpreted as described in BCP 14, RFC 2119
68    [RFC2119].

70 2. Requirements

72    o  Deploy on existing Internet

74    o  No kernel modifications required

76    o  Only widely-available APIs required

78    o  No root permissions required for endpoint applications

80    o  New choices for congestion, retransmit, etc. available in
81       transport protocols inside SPUD

83    o  Single firewall-traversal mechanism, multiple transport semantics

85    o  Low overhead

87       *  Determine SPUD is in use (very fast)

89       *  Associate packets with a tube (relatively fast)

91    o  Policy per-tube

93    o  Multiple interfaces for each endpoint

95 3. Lifetime of a tube

97    A tube is a grouping of packets between two endpoints on the network.
98    Tubes are started by the "initiator" expressing an interest in
99    communicating with the "responder". A tube may be closed by either
100    endpoint.

102    A tube may be in one of the following states:

104    unknown  no information is currently known about the tube. All tubes
105       implicitly start in the unknown state.

107    opening  the initiator has requested a tube that the responder has
108       not yet acknowledged.

110    running  the tube is set up and will allow data to flow

112    resuming  an out-of-sequence SPUD packet has been received for this
113       tube.
Policy will need to be developed describing how (or if)
114       this state can be exploited for quicker tube resumption by higher-
115       level protocols.

117    This leads to the following state transitions (see Section 4.2 for
118    details on the commands that cause transitions):

120    +--------------------+ +-----+
121    |                    | |close|
122    |                    v |     v
123    |      +-----open--- +-------+ <--close----+
124    |      |             |unknown|             |
125    |      |    +------> +-------+ --ack,-+    |
126    |      |    |                    data |    |
127    |      |  close                       |    |
128    |      v    |                         v    |
129    |     +-------+ -------data-------> +--------+
130    | +---|opening|                     |resuming|---+
131    | |   +-------+ <------open-------- +--------+   |
132    | |     ^   |                         |    ^     |
133    | |     |   |                         |    |     |
134    | +open-+   +-ack--> +-------+ <--ack-+    +-data+
135    |                    |running|
136    +-------close------- +-------+
137                          ^    |
138                          |    | open,ack,data
139                          +----+

141                      Figure 1: State transitions

143 4. Packet layout

145    SPUD packets are sent inside UDP packets, with the SPUD header
146    directly after the UDP header.

148     0                   1                   2                   3
149     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
150    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
151    |                      magic = 0xd80000d8                       |
152    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
153    |cmd|a|p|                        tube ID                        |
154    +-+-+-+-+                                                       +
155    |                                                               |
156    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
157    |                           CBOR Map                            |
158    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

160                         Figure 2: SPUD packets

162    The fields in the packet are:

164    o  32-bit constant magic number (see Section 4.1)

166    o  2 bits of command (see Section 4.2)

168    o  1 bit marking this packet as an application declaration (adec)

170    o  1 bit marking this packet as a path declaration (pdec)

172    o  60 bits defining the id of this tube

174    o  Data. If any of the command, adec, or pdec bits are set, the data
175       is CBOR.

177 4.1. Detecting usage

179    The first 32 bits of every SPUD packet are the constant bit pattern
180    d80000d8 (hex), or 1101 1000 0000 0000 0000 0000 1101 1000 (binary).
This
181    pattern was selected to be invalid UTF-8, UTF-16 (both big- and
182    little-endian), and UTF-32 (both big- and little-endian). The intent
183    is to ensure that text-based non-SPUD protocols would not use this
184    pattern by mistake. A survey of other protocols will be done to see
185    if this pattern occurs often in existing traffic.

187    The intent of this magic number is not to provide conclusive evidence
188    that SPUD is being used in this packet, but instead to allow a very
189    fast (i.e., trivially implementable in hardware) way to decide that
190    SPUD is not in use on packets that do not include the magic number.

192 4.2. Commands

194    The next 2 bits of a SPUD packet encode a command:

196    Data (00)  Normal data in a running tube

198    Open (01)  A request to begin a tube

200    Close (10)  A request to end a tube

202    Ack (11)  An acknowledgement to an open request

204 4.3. Declaration bits

206    The adec bit is set when the application is making a declaration to
207    the path. The pdec bit is set when the path is making a declaration
208    to the application.

210 4.4. Additional information

212    The information after the SPUD header is a CBOR [RFC7049] map (major
213    type 5). Each key in the map may be an integer (major type 0 or 1)
214    or a text string (major type 3). Integer keys are reserved for
215    standardized protocols, with a registry defining their meaning. This
216    convention can save several bytes per packet, since small integers
217    only take a single byte in the CBOR encoding, and a single-character
218    string takes at least two bytes (more when useful-length strings are
219    used).

221    The only integer keys reserved by this version of the document are:

223    0 (anything)  Application Data. Any CBOR data type, used as
224       application-specific data. Often this will be a byte string
225       (major type 2), particularly for protocols that encrypt data.
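
   As a rough illustration of the layout in Section 4 and of the key 0
   convention above, the following Python sketch packs and parses a SPUD
   packet. It is not part of the draft. The function names (pack_spud,
   parse_spud, cbor_app_data) are hypothetical, the sketch assumes
   network (big-endian) byte order with the command bits as the most
   significant bits of the 64-bit word that follows the magic number,
   and the hand-written CBOR helper only covers the map {0: byte string}
   for byte strings shorter than 24 bytes.

```python
import struct

MAGIC = 0xD80000D8  # constant bit pattern from Section 4.1
CMD_DATA, CMD_OPEN, CMD_CLOSE, CMD_ACK = 0b00, 0b01, 0b10, 0b11
TUBE_ID_MASK = (1 << 60) - 1  # tube IDs are 60 bits wide


def cbor_app_data(payload: bytes) -> bytes:
    """Encode the CBOR map {0: payload} by hand (assumption: short payloads only)."""
    if len(payload) >= 24:
        raise ValueError("sketch only handles byte strings shorter than 24 bytes")
    # 0xA1 = map with one pair, 0x00 = integer key 0,
    # 0x40 | n = byte string of length n (n < 24).
    return bytes([0xA1, 0x00, 0x40 | len(payload)]) + payload


def pack_spud(cmd, tube_id, adec=False, pdec=False, body=b""):
    """Build magic + (cmd, adec, pdec, tube ID) + body in network byte order."""
    if cmd not in (CMD_DATA, CMD_OPEN, CMD_CLOSE, CMD_ACK):
        raise ValueError("command must fit in 2 bits")
    if not 0 <= tube_id <= TUBE_ID_MASK:
        raise ValueError("tube ID must fit in 60 bits")
    word = (cmd << 62) | (int(adec) << 61) | (int(pdec) << 60) | tube_id
    return struct.pack("!IQ", MAGIC, word) + body


def parse_spud(packet: bytes):
    """Return the header fields, or None when the magic number is absent."""
    if len(packet) < 12 or packet[:4] != struct.pack("!I", MAGIC):
        return None  # fast rejection, as intended by Section 4.1
    (word,) = struct.unpack("!Q", packet[4:12])
    return {
        "cmd": word >> 62,
        "adec": bool((word >> 61) & 1),
        "pdec": bool((word >> 60) & 1),
        "tube_id": word & TUBE_ID_MASK,
        "body": packet[12:],
    }


packet = pack_spud(CMD_OPEN, 0x0123456789ABCDE, body=cbor_app_data(b"x"))
fields = parse_spud(packet)
```

   Keeping the magic-number check as the first test in parse_spud
   mirrors the stated intent that non-SPUD packets be rejected cheaply
   before any per-tube lookup takes place.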

227    The overhead for always using CBOR is therefore effectively three or
228    more bytes: 0xA1 (map with one element), 0x00 (integer 0 as the key),
229    and 0x41 (byte string containing one byte). [EDITOR'S NOTE: It may
230    be that the simplicity and extensibility of this approach are worth
231    the three bytes of overhead.]

233 5. Initiating a tube

235    To begin a tube, the initiator sends a SPUD packet with the "open"
236    command (bits 01).

238    Future versions of this specification may contain CBOR requesting
239    proof of implementation from the receiving endpoint.

241 6. Acknowledging tube creation

243    To acknowledge the creation of a tube, the responder sends a SPUD
244    packet with the "ack" command (bits 11). The current thought is that
245    the security provided by the TCP three-way handshake would be left to
246    transport protocols inside of SPUD. Further exploration of this
247    prototype will help decide how much of this handshake needs to be
248    made visible to path elements that _only_ process SPUD.

250 7. Closing a tube

252    To close a tube, either side sends a packet with the "close" command
253    (bits 10). Whenever a path element sees a close packet for a tube,
254    it MAY drop all stored state for that tube. Further exploration of
255    this prototype will determine when close packets are sent, what CBOR
256    they contain, and how they interact with transport protocols inside
257    of SPUD.

259    What is likely at this time is that SPUD close packets MAY contain
260    error information in the following CBOR keys (and associated values):

262    "error" (map, major type 5)  a map from text string (major type 3) to
263       text string. The keys are [RFC5646] language tags, and the values
264       are strings that can be presented to a user that understands that
265       language. The key "*" can be used as the default.

267    "url" (text string, major type 3)  a URL identifying some information
268       about the path or its relationship with the tube.
The URL
269       represents some path condition, and retrieval of content at the
270       URL should include a human-readable description.

272 8. Path declarations

274    SPUD can be used for path declarations: information delivered to the
275    endpoints from devices along the path. Path declarations can be
276    thought of as enhanced ICMP for transports using SPUD, allowing
277    information about the condition or state of the path or the tube to
278    be communicated directly to a sender.

280    Path declarations may be sent in either direction (toward the
281    initiator or responder) at any time. The scope of a path declaration
282    is the tube (identified by tube ID) to which it is associated.
283    Devices along the path cannot make declarations to endpoints without
284    a tube to associate them with. Path declarations are sent to one
285    endpoint in a SPUD conversation by the path device sending SPUD
286    packets with the source IP address and UDP port from the other
287    endpoint in the conversation. These "spoofed" packets are required
288    to allow existing network elements that pass traffic for a given
289    5-tuple to continue to work. To ensure that the context for these
290    declarations is correct, path declaration packets MUST have the pdec
291    bit set. Path declarations MUST use the "data" command (bits 00).

293    Path declarations do not imply specific required actions on the part
294    of receivers. Any path declaration MAY be ignored by a receiving
295    application. When using a path declaration as input to an algorithm,
296    the application will make decisions about the trustworthiness of the
297    declaration before using the data in the declaration.

299    The data associated with a path declaration may always have the
300    following keys (and associated values), regardless of what other
301    information is included:

303    "ipaddr" (byte string, major type 2)  the IPv4 address or IPv6
304       address of the sender, as a string of 4 or 16 bytes in network
305       order.
This is necessary as the source IP address of the packet
306       is spoofed.

308    "cookie" (byte string, major type 2)  data that identifies the
309       sending path element unambiguously

311    "url" (text string, major type 3)  a URL identifying some information
312       about the path or its relationship with the tube. The URL
313       represents some path condition, and retrieval of content at the
314       URL should include a human-readable description.

316    "warning" (map, major type 5)  a map from text string (major type 3)
317       to text string. The keys are [RFC5646] language tags, and the
318       values are strings that can be presented to a user that
319       understands that language. The key "*" can be used as the
320       default.

322    The SPUD mechanism is defined to be completely extensible in terms of
323    the types of path declarations that can be made. However, in order
324    for this mechanism to be of use, endpoints and devices along the path
325    must share a relatively limited vocabulary of path declarations. The
326    following subsections briefly explore declarations we believe may be
327    useful, and which will be further developed on the background of
328    concrete use cases to be defined as part of the SPUD effort.

330    Terms in this vocabulary considered universally useful may be added
331    to the SPUD path declaration map keys, which in this case would then
332    be defined as an IANA registry.

334 8.1. ICMP

336    ICMP messages (e.g., ICMPv6 [RFC4443]) are sometimes blocked by path
337    elements attempting to provide security. Even when they are delivered
338    to the host, many ICMP messages are not made available to
339    applications through portable socket interfaces. As such, a path
340    element might decide to copy the ICMP message into a path
341    declaration, using the following key/value pairs:

343    "icmp" (byte string, major type 2)  the full ICMP payload.
This is
344       intended to allow ICMP messages (which may be blocked by the path,
345       or not made available to the receiving application) to be bound to
346       a tube. Note that sending a path declaration ICMP message is not
347       a substitute for sending a required ICMP or ICMPv6 message.

349    "icmp-type" (unsigned, major type 0)  the ICMP type

351    "icmp-code" (unsigned, major type 0)  the ICMP code

353    Other information from particular ICMP codes may be parsed out into
354    key/value pairs.

356 8.2. Address translation

358    SPUD-aware path elements that perform Network Address Translation
359    MUST send a path declaration describing the translation that was
360    done, using the following key/value pairs:

362    "translated-external-address" (byte string, major type 2)  The
363       translated external IPv4 address or IPv6 address for this
364       endpoint, as a string of 4 or 16 bytes in network order

366    "translated-external-port" (unsigned, major type 0)  The translated
367       external UDP port number for this endpoint

369    "internal-address" (byte string, major type 2)  The pre-translation
370       (internal) IPv4 address or IPv6 address for this endpoint, as a
371       string of 4 or 16 bytes in network order

373    "internal-port" (unsigned, major type 0)  The pre-translation
374       (internal) UDP port number for this endpoint

376    The internal addresses are useful when multiple address translations
377    take place on the same path.

379 8.3. Tube lifetime

381    SPUD-aware path elements that are maintaining state MAY drop state
382    using inactivity timers; however, if they use a timer, they MUST send
383    a path declaration in both directions with the length of that timer,
384    using the following key/value pairs:

386    "inactivity-timer" (unsigned, major type 0)  The length of the
387       inactivity timer (in microseconds). A value of 0 means no timeout
388       is being enforced by this path element, which might be useful if
389       the timeout changes over the lifetime of a tube.

391 8.4. Explicit congestion notification

393    Similar to ICMP, getting explicit access to ECN [RFC3168] information
394    in applications can be difficult. As such, a path element might
395    decide to generate a path declaration using the following key/value
396    pairs:

398    "ecn" (True, major type 7)  congestion has been detected

400    [EDITOR'S NOTE: we will track current proposals to improve ECN
401    resolution here. DCTCP uses higher marking rate and lower response
402    rate to get high resolution marking; we have ints, which are more
403    powerful, if we can find an algorithm simple enough for path elements
404    to use.]

406 8.5. Path element identity

408    Path elements can describe themselves using the following key/value
409    pairs:

411    "description" (text string, major type 3)  the name of the software,
412       hardware, product, etc. that generated the declaration

414    "version" (text string, major type 3)  the version of the software,
415       hardware, product, etc. that generated the declaration

417    "caps" (byte string, major type 2)  a hash of the capabilities of the
418       software, hardware, product, etc. that generated the declaration
419       [TO BE DESCRIBED]

421    "ttl" (unsigned integer, major type 0)  IP time to live / IPv6 Hop
422       Limit of associated device [EDITOR'S NOTE: more detail is required
423       on how this is calculated]

425 8.6. Maximum Datagram Size

427    A path element may tell the endpoint the maximum size of a datagram
428    it is willing or able to forward for a tube, to augment various path
429    MTU discovery mechanisms. This declaration uses the following key/
430    value pairs:

432    "mtu" (unsigned, major type 0)  the maximum transmission unit (in
433       bytes)

435 8.7. Rate Limit

437    A path element may tell the endpoint the maximum data rate (in octets
438    or packets) that it is willing or able to forward for a tube.
As all
439    path declarations are advisory, the device along the path must not
440    rely on the endpoint to set its sending rate at or below the declared
441    rate limit, and reduction of rate is not a guarantee to the endpoint
442    of zero queueing delay. This mechanism is intended for "gross" rate
443    limitation, i.e. to declare that the output interface is connected to
444    a limited or congested link, not as a substitute for loss-based or
445    explicit congestion notification on the RTT timescale. This
446    declaration uses the following key/value pairs:

448    "max-byte-rate" (unsigned, major type 0)  the maximum bandwidth (in
449       bytes per second)

451    "max-packet-rate" (unsigned, major type 0)  the maximum bandwidth (in
452       packets per second)

454 8.8. Latency Advisory

456    A path element may tell the endpoint the latency attributable to
457    traversing that path element. This mechanism is intended for "gross"
458    latency advisories, for instance to declare the output interface is
459    connected to a satellite or [RFC1149] link. This declaration uses
460    the following key/value pairs:

462    "latency" (unsigned, major type 0)  the latency (in microseconds)

464 8.9. Prohibition Report

466    A path element which refuses to forward a packet may declare why the
467    packet was not forwarded, similar to the various Destination
468    Unreachable codes of ICMP.

470    [EDITOR'S NOTE: Further thought will be given to how these reports
471    interact with the ICMP support from Section 8.1.]

473 9. Declaration reflection

475    In some cases, a device along the path may wish to send a path
476    declaration but may not be able to send packets on the reverse path.
477    It may ask the endpoint in the forward direction to reflect a SPUD
478    packet back along the reverse path in this case.

480    [EDITOR'S NOTE: Bob Briscoe raised this issue during the SEMI
481    workshop, which has largely to do with tunnels.
It is not clear to
482    the authors yet how a point along the path would know that it must
483    reflect a declaration, but this approach is included for
484    completeness.]

486    A reflected declaration is a SPUD packet with both the pdec and adec
487    flags set, and contains the same content as a path declaration would.
488    However, the packet has the same source address and port and
489    destination address and port as the SPUD packet which triggered it.

491    When a SPUD endpoint receives a declaration reflection, it SHOULD
492    reflect it: swapping the source and destination IP addresses and
493    ports. The reflecting endpoint MUST unset the adec bit, sending the
494    packet as if it were a path declaration.

496    [EDITOR'S NOTE: this facility will need careful security analysis
497    before it makes it into any final specification.]

499 10. Application declarations

501    Applications may also use the SPUD mechanism to describe the traffic
502    in the tube to the application on the other side, and/or to any point
503    along the path. As with path declarations, the scope of an
504    application declaration is the tube (identified by tube ID) to which
505    it is associated.

507    An application declaration is a SPUD packet with the adec flag set,
508    and contains an application declaration formatted in CBOR in its
509    payload. As with path declarations, an application declaration is a
510    CBOR map, which may always have the following keys:

512    o  cookie (byte string, major type 2): an identifier for this
513       application declaration, used to address a particular path element

515    Unless the cookie matches one sent by the path element for this tube,
516    every device along the path MUST forward application declarations on
517    towards the destination endpoint.

519    The definition of an application declaration vocabulary is left as
520    future work; we note only at this point that the mechanism supports
521    such declarations.

523 11. CBOR Profile

525    Moving forward, we will likely specify a subset of CBOR that can be
526    used in SPUD, including the avoidance of floating point numbers,
527    indefinite-length arrays, and indefinite-length maps. This will
528    allow a significantly less complicated CBOR implementation to be
529    used, which would be particularly nice on constrained devices.

531 12. Security Considerations

533    This gives endpoints the ability to expose information about
534    conversations to elements on the path. As such, there are going to be
535    very strict security requirements about what can be exposed, how it
536    can be exposed, etc. This prototype DOES NOT tackle these issues
537    yet.

539    The goal is to ensure that this layer is better than TCP from a
540    security perspective. The prototype is clearly not yet to that
541    point.

543 13. IANA Considerations

545    If this protocol progresses beyond prototype in some way, a registry
546    will be needed for well-known CBOR map keys.

548 14. Acknowledgements

550    Thanks to Ted Hardie for suggesting the change from "Session" to
551    "Substrate" in the title, and to Joel Halpern for suggesting the
552    change from "session" to "tube" in the protocol description.

554 15. References

556 15.1. Normative References

558    [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
559               Requirement Levels", BCP 14, RFC 2119, March 1997.

561    [RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
562               of Explicit Congestion Notification (ECN) to IP", RFC
563               3168, September 2001.

565    [RFC4443]  Conta, A., Deering, S., and M. Gupta, "Internet Control
566               Message Protocol (ICMPv6) for the Internet Protocol
567               Version 6 (IPv6) Specification", RFC 4443, March 2006.

569    [RFC5646]  Phillips, A. and M. Davis, "Tags for Identifying
570               Languages", BCP 47, RFC 5646, September 2009.

572    [RFC7049]  Bormann, C. and P. Hoffman, "Concise Binary Object
573               Representation (CBOR)", RFC 7049, October 2013.

575 15.2. Informative References

577    [RFC1149]  Waitzman, D., "Standard for the transmission of IP
578               datagrams on avian carriers", RFC 1149, April 1990.

580 Authors' Addresses

582    Joe Hildebrand
583    Cisco Systems

585    Email: jhildebr@cisco.com

587    Brian Trammell
588    ETH Zurich

590    Email: ietf@trammell.ch