idnits 2.17.1

draft-sharabayko-mops-srt-01.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD',
     or 'RECOMMENDED' is not an accepted usage according to RFC 2119. Please
     use uppercase 'NOT' together with RFC 2119 keywords (if that is what you
     mean).

     Found 'MUST not' in this paragraph:

     The existing key values MUST not be extended, and MUST not differ from
     those described in this section.

  -- The document date (9 September 2020) is 1315 days in the past. Is this
     intentional?

  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '<CODE BEGINS>' and
     '<CODE ENDS>' lines.
  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Outdated reference: A later version (-34) exists of
     draft-ietf-quic-http-29

  == Outdated reference: A later version (-34) exists of
     draft-ietf-quic-transport-29

  -- Obsolete informational reference (is this intentional?): RFC 2898
     (Obsoleted by RFC 8018)

  -- Obsolete informational reference (is this intentional?): RFC 8312
     (Obsoleted by RFC 9438)


     Summary: 0 errors (**), 0 flaws (~~), 4 warnings (==), 4 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

MOPS                                                     M.P. Sharabayko
Internet-Draft                                           M.A. Sharabayko
Intended status: Standards Track           Haivision Network Video, GmbH
Expires: 13 March 2021                                           J. Dube
                                                               Haivision
                                                                 JS. Kim
                                                                 JW. Kim
                                                    SK Telecom Co., Ltd.
                                                        9 September 2020


                            The SRT Protocol
                      draft-sharabayko-mops-srt-01

Abstract

   This document specifies the Secure Reliable Transport (SRT) protocol.
   SRT is a user-level protocol that runs over the User Datagram
   Protocol and provides reliability and security optimized for low
   latency live video streaming, as well as generic bulk data transfer.
   For this, SRT introduces control packet extensions, improved flow
   control, enhanced congestion control and a mechanism for data
   encryption.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 13 March 2021.

Copyright Notice

   Copyright (c) 2020 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Motivation
     1.2.  Secure Reliable Transport Protocol
   2.  Terms and Definitions
   3.  Packet Structure
     3.1.  Data Packets
     3.2.  Control Packets
       3.2.1.  Handshake
       3.2.2.  Key Material
       3.2.3.  Keep-Alive
       3.2.4.  ACK (Acknowledgment)
       3.2.5.  NAK (Loss Report)
       3.2.6.  Shutdown
       3.2.7.  ACKACK
   4.
       SRT Data Transmission and Control
     4.1.  Stream Multiplexing
     4.2.  Data Transmission Modes
       4.2.1.  Message Mode
       4.2.2.  Live Mode
       4.2.3.  Buffer Mode
     4.3.  Handshake Messages
       4.3.1.  Caller-Listener Handshake
       4.3.2.  Rendezvous Handshake
     4.4.  SRT Buffer Latency
     4.5.  Timestamp-Based Packet Delivery
       4.5.1.  Packet Delivery Time
     4.6.  Too-Late Packet Drop
     4.7.  Drift Management
     4.8.  Acknowledgement and Lost Packet Handling
       4.8.1.  Packet Acknowledgement (ACKs, ACKACKs)
       4.8.2.  Packet Retransmission (NAKs)
     4.9.  Bidirectional Transmission Queues
     4.10. Round-Trip Time Estimation
     4.11. Congestion Control
   5.  Encryption
     5.1.  Overview
       5.1.1.  Encryption Scope
       5.1.2.  AES Counter
       5.1.3.  Stream Encrypting Key (SEK)
       5.1.4.  Key Encrypting Key (KEK)
       5.1.5.  Key Material Exchange
       5.1.6.  KM Refresh
     5.2.  Encryption Process
       5.2.1.
               Generating the Stream Encrypting Key
       5.2.2.  Encrypting the Payload
     5.3.  Decryption Process
       5.3.1.  Restoring the Stream Encrypting Key
       5.3.2.  Decrypting the Payload
   6.  Security Considerations
   7.  IANA Considerations
   Contributors
   Acknowledgments
   References
     Normative References
     Informative References
   Appendix A.  Packet Sequence List Coding
   Appendix B.  SRT Access Control
     B.1.  General Syntax
     B.2.  Standard Keys
     B.3.  Examples
   Appendix C.  Changelog
     C.1.  Since Version 00
   Authors' Addresses

1.  Introduction

1.1.  Motivation

   The demand for live video streaming has been increasing steadily for
   many years. With the emergence of cloud technologies, many video
   processing pipeline components have transitioned from on-premises
   appliances to software running on cloud instances. While real-time
   streaming over TCP-based protocols like RTMP [RTMP] is possible at
   low bitrates and on a small scale, the exponential growth of the
   streaming market has created a need for more powerful solutions.

   To improve scalability on the delivery side, content delivery
   networks (CDNs) at one point transitioned to segmentation-based
   technologies like HLS (HTTP Live Streaming) [RFC8216] and DASH
   (Dynamic Adaptive Streaming over HTTP) [ISO23009]. This move
   increased the end-to-end latency of live streaming to over 30
   seconds, which makes it unattractive for many use cases. Over time,
   the industry optimized these delivery methods, bringing the latency
   down to 3 seconds.

   While the delivery side scaled up, improvements to video transcoding
   became a necessity. Viewers watch video streams on a variety of
   different devices, connected over different types of networks. Since
   upload bandwidth from on-premises locations is often limited, video
   transcoding moved to the cloud.

   RTMP became the de facto standard for contribution over the public
   Internet. However, it limits the payload that can be transmitted:
   as a media-specific protocol, RTMP supports only two audio channels
   and a restricted set of audio and video codecs, and lacks support
   for newer formats such as HEVC [H.265], VP9 [VP9], or AV1 [AV1].

   Since RTMP, HLS and DASH rely on TCP, these protocols can only
   guarantee acceptable reliability over connections with low RTTs, and
   cannot use the bandwidth of network connections to their full extent
   due to limitations imposed by congestion control. Notably, QUIC
   [I-D.ietf-quic-transport] has been designed to address these problems
   with HTTP-based delivery protocols in HTTP/3 [I-D.ietf-quic-http].
   Like QUIC, SRT [SRTSRC] uses UDP instead of the TCP transport
   protocol, but assures more reliable delivery using Automatic Repeat
   Request (ARQ), packet acknowledgments, end-to-end latency management,
   etc.

1.2.
Secure Reliable Transport Protocol

   Low latency video transmissions across reliable (usually local)
   IP-based networks typically take the form of MPEG-TS [ISO13818-1]
   unicast or multicast streams using the UDP/RTP protocol, where any
   packet loss can be mitigated by enabling forward error correction
   (FEC). Achieving the same low latency between sites in different
   cities, countries or even continents is more challenging. While it
   is possible with satellite links or dedicated MPLS [RFC3031]
   networks, these are expensive solutions. The use of public Internet
   connectivity, while less expensive, imposes significant bandwidth
   overhead to achieve the necessary level of packet loss recovery.
   Introducing selective packet retransmission (reliable UDP) to recover
   from packet loss removes those limitations.

   Derived from the UDP-based Data Transfer (UDT) protocol [GHG04b], SRT
   is a user-level protocol that retains most of the core concepts and
   mechanisms while introducing several refinements and enhancements,
   including control packet modifications, improved flow control for
   handling live streaming, enhanced congestion control, and a mechanism
   for encrypting packets.

   SRT is a transport protocol that enables the secure, reliable
   transport of data across unpredictable networks, such as the
   Internet. While any data type can be transferred via SRT, it is
   ideal for low latency (sub-second) video streaming. SRT provides
   improved bandwidth utilization compared to RTMP, allowing much higher
   contribution bitrates over long distance connections.

   As packets are streamed from source to destination, SRT detects and
   adapts to the real-time network conditions between the two endpoints,
   and helps compensate for jitter and bandwidth fluctuations due to
   congestion over noisy networks. Its error recovery mechanism
   minimizes the packet loss typical of Internet connections.

   To achieve low latency streaming, SRT had to address timing issues.
   The characteristics of a stream from a source network are completely
   changed by transmission over the public Internet, which introduces
   delays, jitter, and packet loss. This, in turn, leads to problems
   with decoding, as the audio and video decoders do not receive packets
   at the expected times. The use of large buffers helps, but latency
   is increased. SRT includes a mechanism to keep a constant end-to-end
   latency, thus recreating the signal characteristics on the receiver
   side, and reducing the need for buffering.

   Like TCP, SRT employs a listener/caller model. The data flow is
   bidirectional and independent of the connection initiation - either
   the sender or receiver can operate as listener or caller to initiate
   a connection. The protocol provides an internal multiplexing
   mechanism, allowing multiple SRT connections to share the same UDP
   port, providing access control functionality to identify the caller
   on the listener side.

   Supporting forward error correction (FEC) and selective packet
   retransmission (ARQ), SRT provides the flexibility to use either of
   the two mechanisms or both combined, allowing for use cases ranging
   from the lowest possible latency to the highest possible reliability.

   SRT maintains the ability for fast file transfers introduced in UDT,
   and adds support for AES encryption.

2.  Terms and Definitions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

   SRT: The Secure Reliable Transport protocol described by this
      document.

   PRNG: Pseudo-Random Number Generator.

3.
Packet Structure

   SRT packets are transmitted as UDP payload [RFC0768]. Every UDP
   packet carrying SRT traffic contains an SRT header immediately after
   the UDP header (Figure 1).

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |            SrcPort            |            DstPort            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |              Len              |            ChkSum             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +                          SRT Packet                           +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                   Figure 1: SRT packet as UDP payload

   SRT has two types of packets distinguished by the Packet Type Flag:
   data packet and control packet.

   The structure of the SRT packet is shown in Figure 2.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+- SRT Header +-+-+-+-+-+-+-+-+-+-+-+-+-+
   |F|         (Field meaning depends on the packet type)          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          (Field meaning depends on the packet type)           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           Timestamp                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     Destination Socket ID                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +                        Packet Contents                        |
   |                 (depends on the packet type)                  +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                     Figure 2: SRT packet structure

   F: 1 bit. Packet Type Flag. The control packet has this flag set to
      "1". The data packet has this flag set to "0".

   Timestamp: 32 bits. The timestamp of the packet, in microseconds.
      The value is relative to the time the SRT connection was
      established. Depending on the transmission mode (Section 4.2),
      the field stores the packet send time or the packet origin time.

   Destination Socket ID: 32 bits. A fixed-width field providing the
      SRT socket ID to which a packet should be dispatched. The field
      may have the special value "0" when the packet is a connection
      request.

3.1.  Data Packets

   The structure of the SRT data packet is shown in Figure 3.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+- SRT Header +-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0|                   Packet Sequence Number                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |P P|O|K K|R|                  Message Number                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           Timestamp                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     Destination Socket ID                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +                             Data                              +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                     Figure 3: Data packet structure

   Packet Sequence Number: 31 bits. The sequential number of the data
      packet.

   PP: 2 bits. Packet Position Flag. This field indicates the position
      of the data packet in the message. The value "10b" (binary) means
      the first packet of the message. "00b" indicates a packet in the
      middle. "01b" designates the last packet. If a single data packet
      forms the whole message, the value is "11b".

   O: 1 bit. Order Flag. Indicates whether the message should be
      delivered by the receiver in order (1) or not (0). Certain
      restrictions apply depending on the data transmission mode used
      (Section 4.2).

   KK: 2 bits. Key-based Encryption Flag. The flag bits indicate
      whether or not data is encrypted. The value "00b" (binary) means
      data is not encrypted. "01b" indicates that data is encrypted with
      an even key, and "10b" is used for odd key encryption. Refer to
      Section 5. The value "11b" is only used in control packets.

   R: 1 bit. Retransmitted Packet Flag.
      This flag is clear when a packet is transmitted the first time.
      The flag is set to "1" when a packet is retransmitted.

   Message Number: 26 bits. The sequential number of consecutive data
      packets that form a message (see PP field).

   Timestamp: 32 bits. See Section 3.

   Destination Socket ID: 32 bits. See Section 3.

   Data: variable length. The payload of the data packet. The length
      of the data is the remaining length of the UDP packet.

3.2.  Control Packets

   An SRT control packet has the following structure.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+- SRT Header +-+-+-+-+-+-+-+-+-+-+-+-+-+
   |1|        Control Type         |            Subtype            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   Type-specific Information                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           Timestamp                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     Destination Socket ID                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+- CIF -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +                   Control Information Field                   +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                   Figure 4: Control packet structure

   Control Type: 15 bits. Control Packet Type. The use of these bits
      is determined by the control packet type definition. See Table 1.

   Subtype: 16 bits. This field specifies an additional subtype for
      specific packets. See Table 1.

   Type-specific Information: 32 bits. The use of this field depends on
      the particular control packet type. Handshake packets do not use
      this field.

   Timestamp: 32 bits. See Section 3.

   Destination Socket ID: 32 bits. See Section 3.

   Control Information Field (CIF): variable length. The use of this
      field is defined by the Control Type field of the control packet.

   The types of SRT control packets are shown in Table 1. The value
   "0x7FFF" is reserved for a user-defined type.
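
   As an illustration of the header layouts in Figures 2, 3 and 4, the
   following Python sketch splits the four 32-bit header words into
   their fields. It is not taken from any SRT implementation: the names
   SrtHeader and parse_srt_header are hypothetical, and network
   (big-endian) byte order is an assumption, since this section does
   not state the byte order of the header words.

```python
import struct
from dataclasses import dataclass


@dataclass
class SrtHeader:
    is_control: bool    # F flag: True for control packets, False for data
    fields: dict        # type-dependent fields from the first two words
    timestamp_us: int   # Timestamp, in microseconds
    dst_socket_id: int  # Destination Socket ID


def parse_srt_header(packet: bytes) -> SrtHeader:
    """Split the 16-byte SRT header into fields (hypothetical helper)."""
    if len(packet) < 16:
        raise ValueError("an SRT header occupies four 32-bit words")
    # Assumption: header words use network (big-endian) byte order.
    w0, w1, timestamp, dst_id = struct.unpack_from("!IIII", packet, 0)
    is_control = bool(w0 >> 31)  # F is the most significant bit of word 0
    if is_control:
        fields = {
            "control_type": (w0 >> 16) & 0x7FFF,  # 15 bits
            "subtype": w0 & 0xFFFF,               # 16 bits
            "type_specific": w1,                  # 32 bits
        }
    else:
        fields = {
            "seq_no": w0 & 0x7FFFFFFF,   # Packet Sequence Number, 31 bits
            "pp": (w1 >> 30) & 0x3,      # Packet Position Flag, 2 bits
            "o": (w1 >> 29) & 0x1,       # Order Flag, 1 bit
            "kk": (w1 >> 27) & 0x3,      # Key-based Encryption Flag, 2 bits
            "r": (w1 >> 26) & 0x1,       # Retransmitted Packet Flag, 1 bit
            "msg_no": w1 & 0x03FFFFFF,   # Message Number, 26 bits
        }
    return SrtHeader(is_control, fields, timestamp, dst_id)
```

   Under these assumptions, everything after the first 16 bytes of the
   UDP payload (packet[16:]) is the Data field of a data packet or the
   CIF of a control packet.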

       +===================+==============+=========+===============+
       | Packet Type       | Control Type | Subtype | Section       |
       +===================+==============+=========+===============+
       | HANDSHAKE         | 0x0000       | 0x0     | Section 3.2.1 |
       +-------------------+--------------+---------+---------------+
       | KEEPALIVE         | 0x0001       | 0x0     | Section 3.2.3 |
       +-------------------+--------------+---------+---------------+
       | ACK               | 0x0002       | 0x0     | Section 3.2.4 |
       +-------------------+--------------+---------+---------------+
       | NAK (Loss Report) | 0x0003       | 0x0     | Section 3.2.5 |
       +-------------------+--------------+---------+---------------+
       | SHUTDOWN          | 0x0005       | 0x0     | Section 3.2.6 |
       +-------------------+--------------+---------+---------------+
       | ACKACK            | 0x0006       | 0x0     | Section 3.2.7 |
       +-------------------+--------------+---------+---------------+
       | User-Defined Type | 0x7FFF       | -       | N/A           |
       +-------------------+--------------+---------+---------------+

                     Table 1: SRT Control Packet Types

3.2.1.  Handshake

   Handshake control packets (Control Type = 0x0000) are used to
   exchange peer configurations, to agree on connection parameters, and
   to establish a connection.

   The Control Information Field (CIF) of a handshake control packet is
   shown in Figure 5.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                            Version                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |       Encryption Field        |        Extension Field        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                Initial Packet Sequence Number                 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                Maximum Transmission Unit Size                 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   Maximum Flow Window Size                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Handshake Type                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         SRT Socket ID                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          SYN Cookie                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +                                                               +
   |                                                               |
   +                        Peer IP Address                        +
   |                                                               |
   +                                                               +
   |                                                               |
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   |        Extension Type         |       Extension Length        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +                      Extension Contents                       +
   |                                                               |
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+

                  Figure 5: Handshake packet structure

   Version: 32 bits. A base protocol version number. Currently used
      values are 4 and 5. Values greater than 5 are reserved for future
      use.

   Encryption Field: 16 bits. Block cipher family and key size. The
      values of this field are described in Table 2. The default value
      is AES-128.

                  +=======+============================+
                  | Value | Cipher family and key size |
                  +=======+============================+
                  | 0     | No Encryption Advertised   |
                  +-------+----------------------------+
                  | 2     | AES-128                    |
                  +-------+----------------------------+
                  | 3     | AES-192                    |
                  +-------+----------------------------+
                  | 4     | AES-256                    |
                  +-------+----------------------------+

               Table 2: Handshake Encryption Field Values

   Extension Field: 16 bits. This field is a message-specific extension
      related to the Handshake Type field. The value MUST be set to 0
      except in the following cases. (1) If the handshake control
      packet is the INDUCTION message, this field is sent back by the
      Listener. (2) In the case of a CONCLUSION message, this field
      value should contain a combination of Extension Type values,
      expressed as the flags in Table 3. For more details, see
      Section 4.3.1.

                         +============+========+
                         | Bitmask    | Flag   |
                         +============+========+
                         | 0x00000001 | HSREQ  |
                         +------------+--------+
                         | 0x00000002 | KMREQ  |
                         +------------+--------+
                         | 0x00000004 | CONFIG |
                         +------------+--------+

                   Table 3: Handshake Extension Flags

   Initial Packet Sequence Number: 32 bits. The sequence number of the
      very first data packet to be sent.

   Maximum Transmission Unit Size: 32 bits. This value is typically set
      to 1500, which is the default Maximum Transmission Unit (MTU) size
      for Ethernet, but can be less.

   Maximum Flow Window Size: 32 bits. The value of this field is the
      maximum number of data packets allowed to be "in flight" (i.e.
      the number of sent packets for which an ACK control packet has
      not yet been received).

   Handshake Type: 32 bits. This field indicates the handshake packet
      type. The possible values are described in Table 4. For more
      details refer to Section 4.3.

                     +============+================+
                     | Value      | Handshake type |
                     +============+================+
                     | 0xFFFFFFFD | DONE           |
                     +------------+----------------+
                     | 0xFFFFFFFE | AGREEMENT      |
                     +------------+----------------+
                     | 0xFFFFFFFF | CONCLUSION     |
                     +------------+----------------+
                     | 0x00000000 | WAVEHAND       |
                     +------------+----------------+
                     | 0x00000001 | INDUCTION      |
                     +------------+----------------+

                         Table 4: Handshake Type

   SRT Socket ID: 32 bits. This field holds the ID of the source SRT
      socket from which a handshake packet is issued.

   SYN Cookie: 32 bits. Randomized value for processing a handshake.
      The value of this field is specified by the handshake message
      type. See Section 4.3.

   Peer IP Address: 128 bits. IPv4 or IPv6 address of the packet's
      sender. The value consists of four 32-bit fields. In the case of
      IPv4 addresses, fields 2, 3 and 4 are filled with zeroes.

   Extension Type: 16 bits. The value of this field is used to process
      an integrated handshake. Each extension can have a pair of
      request and response types.

            +=======+====================+===================+
            | Value | Extension Type     | HS Extension Flag |
            +=======+====================+===================+
            | 1     | SRT_CMD_HSREQ      | HSREQ             |
            +-------+--------------------+-------------------+
            | 2     | SRT_CMD_HSRSP      | HSREQ             |
            +-------+--------------------+-------------------+
            | 3     | SRT_CMD_KMREQ      | KMREQ             |
            +-------+--------------------+-------------------+
            | 4     | SRT_CMD_KMRSP      | KMREQ             |
            +-------+--------------------+-------------------+
            | 5     | SRT_CMD_SID        | CONFIG            |
            +-------+--------------------+-------------------+
            | 6     | SRT_CMD_CONGESTION | CONFIG            |
            +-------+--------------------+-------------------+
            | 7     | SRT_CMD_FILTER     | CONFIG            |
            +-------+--------------------+-------------------+
            | 8     | SRT_CMD_GROUP      | CONFIG            |
            +-------+--------------------+-------------------+

                Table 5: Handshake Extension Type values

   Extension Length: 16 bits. The length of the Extension Contents
      field in four-byte blocks.

   Extension Contents: variable length. The payload of the extension.

3.2.1.1.  Handshake Extension Message

   In a Handshake Extension, the value of the Extension Type field of
   the handshake control packet is defined as 1 for a Handshake
   Extension request (SRT_CMD_HSREQ in Table 5), and 2 for a Handshake
   Extension response (SRT_CMD_HSRSP in Table 5).
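
   The Extension Type / Extension Length / Extension Contents layout
   described above can be illustrated with a short Python sketch that
   walks the extension blocks of a handshake. The function name
   iter_handshake_extensions is hypothetical, the input is assumed to
   be the part of the handshake CIF that follows the Peer IP Address
   field, and network (big-endian) byte order for the two 16-bit fields
   is an assumption rather than something stated in this section.

```python
import struct

# Extension Type values as listed in Table 5.
EXTENSION_TYPES = {
    1: "SRT_CMD_HSREQ",
    2: "SRT_CMD_HSRSP",
    3: "SRT_CMD_KMREQ",
    4: "SRT_CMD_KMRSP",
    5: "SRT_CMD_SID",
    6: "SRT_CMD_CONGESTION",
    7: "SRT_CMD_FILTER",
    8: "SRT_CMD_GROUP",
}


def iter_handshake_extensions(ext_bytes: bytes):
    """Yield (type name, contents) for each extension block.

    `ext_bytes` is assumed to start at the first Extension Type field.
    """
    offset = 0
    while offset < len(ext_bytes):
        if len(ext_bytes) - offset < 4:
            raise ValueError("truncated extension header")
        # Assumption: both 16-bit fields are in network byte order.
        ext_type, length_words = struct.unpack_from("!HH", ext_bytes, offset)
        offset += 4
        size = length_words * 4  # Extension Length counts four-byte blocks
        contents = ext_bytes[offset:offset + size]
        if len(contents) != size:
            raise ValueError("truncated extension contents")
        offset += size
        name = EXTENSION_TYPES.get(ext_type, f"UNKNOWN({ext_type})")
        yield name, contents
```

   A caller would typically dispatch on the returned name, for example
   handing the contents of SRT_CMD_HSREQ to a parser for the Handshake
   Extension Message. Unknown types are reported rather than rejected
   here; that is a design choice of this sketch, not a requirement of
   the protocol.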

   The Extension Contents field of a Handshake Extension Message is
   structured as follows:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          SRT Version                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           SRT Flags                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Receiver TSBPD Delay      |      Sender TSBPD Delay       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

             Figure 6: Handshake Extension Message structure

   SRT Version: 32 bits. SRT library version MUST be formed as major *
      0x10000 + minor * 0x100 + patch.

   SRT Flags: 32 bits. SRT configuration flags (see Section 3.2.1.1.1).

   Receiver TSBPD Delay: 16 bits. Timestamp-Based Packet Delivery
      (TSBPD) Delay of the receiver. Refer to Section 4.5.

   Sender TSBPD Delay: 16 bits. TSBPD Delay of the sender. Refer to
      Section 4.5.

3.2.1.1.1.  Handshake Extension Message Flags

                     +============+===============+
                     | Bitmask    | Flag          |
                     +============+===============+
                     | 0x00000001 | TSBPDSND      |
                     +------------+---------------+
                     | 0x00000002 | TSBPDRCV      |
                     +------------+---------------+
                     | 0x00000004 | CRYPT         |
                     +------------+---------------+
                     | 0x00000008 | TLPKTDROP     |
                     +------------+---------------+
                     | 0x00000010 | PERIODICNAK   |
                     +------------+---------------+
                     | 0x00000020 | REXMITFLG     |
                     +------------+---------------+
                     | 0x00000040 | STREAM        |
                     +------------+---------------+
                     | 0x00000080 | PACKET_FILTER |
                     +------------+---------------+

               Table 6: Handshake Extension Message Flags

   *  TSBPDSND flag defines whether the TSBPD mechanism (Section 4.5)
      will be used for sending.

   *  TSBPDRCV flag defines whether the TSBPD mechanism (Section 4.5)
      will be used for receiving.

   *  CRYPT flag MUST be set.
      It is a legacy flag that indicates the party understands the KK
      field of the SRT data packet (Figure 3).

   *  TLPKTDROP flag should be set if the too-late packet drop mechanism
      will be used during transmission. See Section 4.6.

   *  PERIODICNAK flag, when set, indicates the peer will send periodic
      NAK packets. See Section 4.8.2.

   *  REXMITFLG flag MUST be set. It is a legacy flag that indicates
      the peer understands the R field of the SRT data packet
      (Figure 3).

   *  STREAM flag identifies the transmission mode (Section 4.2) to be
      used in the connection. If the flag is set, the buffer mode
      (Section 4.2.3) will be used. Otherwise, message mode
      (Section 4.2.1) is to be used.

   *  PACKET_FILTER flag indicates whether the peer supports the packet
      filter.

3.2.1.2.  Key Material Extension Message

   If an encrypted connection is being established, the Key Material
   (KM) is first transmitted as a Handshake Extension message. This
   extension is not supplied for unprotected connections. The purpose
   of the extension is to let peers exchange and negotiate
   encryption-related information to be used to encrypt and decrypt the
   payload of the stream.

   The extension can be supplied with the Handshake Extension Type field
   set to either SRT_CMD_KMREQ or SRT_CMD_KMRSP (see Table 5 in
   Section 3.2.1). For more details refer to Section 4.3.

   The KM message is placed in the Extension Contents. See
   Section 3.2.2 for the structure of the KM message.

3.2.1.3.  Stream ID Extension Message

   The Stream ID handshake extension message can be used to identify the
   stream content. The Stream ID value can be free-form, but there is
   also a recommended convention that can be used to achieve
   interoperability.

   The Stream ID handshake extension message has the SRT_CMD_SID
   extension type (see Table 5).
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
|                           Stream ID                           |
                               ...
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

             Figure 7: Stream ID Extension Message

The Extension Contents field holds a sequence of UTF-8 characters
(see Figure 7). The maximum allowed size of the StreamID extension
is 512 bytes. The actual size is determined by the Extension Length
field (Figure 5), which defines the length in four-byte blocks. If
the actual payload is shorter than the declared length, the remaining
bytes are set to zeros.

The content is stored as 32-bit little-endian words.

3.2.1.4.  Group Membership Extension

The Group Membership handshake extension is used to distinguish
single SRT connections from bonded SRT connections (group
connections).

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           Group ID                            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     Type      |     Flags     |            Weight             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

          Figure 8: Group Membership Extension Message

Group ID: 32 bits. The identifier of a group whose members include
   the sender socket that is making a connection. The target socket
   that interprets it should belong to the corresponding group on
   its side (or should create one, if it doesn't exist).

Type: 8 bits. Group type, as per the SRT_GTYPE_ enumeration.

   *  0: undefined group type

   *  1: broadcast group type

   *  2: main/backup group type

   *  3: balancing group type (reserved for future use)

   *  4: multicast group type (reserved for future use)

Flags: 8 bits. Special flags, mostly reserved for future use. See
   Figure 9.

Weight: 16 bits. Special value whose interpretation depends on the
   Type field value.

   *  Not used with broadcast groups.

   *  Defines the link priority in backup groups.

   *  Not yet defined (reserved for future use) in all other cases.

 0 1 2 3 4 5 6 7
+-+-+-+-+-+-+-+-+
|   (zero)    |M|
+-+-+-+-+-+-+-+-+

   Figure 9: Group Membership Extension Flags

M: 1 bit. When set, defines synchronization on message numbers;
   otherwise transmission is synchronized on sequence numbers.

3.2.2.  Key Material

The purpose of the Key Material Message is to let peers exchange
encryption-related information to be used to encrypt and decrypt the
payload of the stream.

This message can be supplied in two possible ways:

*  as a Handshake Extension, see Section 3.2.1.2,

*  in the Content Information Field of the User-Defined control
   packet (described below).

When the Key Material is transmitted as a control packet, the Control
Type field of the SRT packet header is set to User-Defined Type (see
Table 1), and the Subtype field of the header is set to SRT_CMD_KMREQ
for a key-refresh request or SRT_CMD_KMRSP for a key-refresh response
(Table 5). The KM Refresh mechanism is described in Section 5.1.6.

The structure of the Key Material message is illustrated in
Figure 10.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|S|  V  |  PT   |             Sign              |   Resv1   | KK|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                             KEKI                              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    Cipher     |     Auth      |      SE       |     Resv2     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|             Resv3             |    SLen/4     |    KLen/4     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                             Salt                              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                          Wrapped Key                          +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

            Figure 10: Key Material Message structure

S: 1 bit, value = {0}. This is a fixed-width field that is reserved
   for future usage.

Version (V): 3 bits, value = {1}. This is a fixed-width field that
   indicates the SRT version:

   *  1: initial version

Packet Type (PT): 4 bits, value = {2}. This is a fixed-width field
   that indicates the Packet Type:

   *  0: Reserved

   *  1: Media Stream Message (MSmsg)

   *  2: Keying Material Message (KMmsg)

   *  7: Reserved to discriminate MPEG-TS packet (0x47=sync byte)

Sign: 16 bits, value = {0x2029}. This is a fixed-width field that
   contains the signature 'HAI' encoded as a PnP Vendor ID ([PNPID])
   (in big-endian order).

Resv1: 6 bits, value = {0}. This is a fixed-width field reserved for
   flag extension or other usage.

Key-based Encryption (KK): 2 bits. This is a fixed-width field that
   indicates which SEKs (odd and/or even) are provided in the
   extension:

   *  00b: no SEK is provided (invalid extension format)

   *  01b: even key is provided

   *  10b: odd key is provided

   *  11b: both even and odd keys are provided

Key Encryption Key Index (KEKI): 32 bits, value = {0}. This is a
   fixed-width field that specifies the index (in big-endian order)
   of the KEK that was used to wrap (and optionally authenticate)
   the SEK(s). The value 0 is used to indicate the default key of
   the current stream. Other values are reserved for the possible
   use of a key management system in the future to retrieve a
   cryptographic context.

   *  0: Default stream associated key (stream/system default)

   *  1..255: Reserved for manually indexed keys

Cipher: 8 bits, value = {0..2}. This is a fixed-width field for
   specifying the encryption cipher and mode:

   *  0: None or KEKI indexed crypto context

   *  2: AES-CTR [SP800-38A]

Authentication (Auth): 8 bits, value = {0}. This is a fixed-width
   field for specifying a message authentication code algorithm:

   *  0: None or KEKI indexed crypto context

Stream Encapsulation (SE): 8 bits, value = {2}. This is a fixed-
   width field for describing the stream encapsulation:

   *  0: Unspecified or KEKI indexed crypto context

   *  1: MPEG-TS/UDP

   *  2: MPEG-TS/SRT

Resv2: 8 bits, value = {0}. This is a fixed-width field reserved for
   future use.

Resv3: 16 bits, value = {0}. This is a fixed-width field reserved
   for future use.

SLen/4: 8 bits, value = {4}. This is a fixed-width field for
   specifying the salt length SLen in bytes divided by 4. Can be
   zero if no salt/IV is present. The only valid salt length
   defined is 128 bits.

KLen/4: 8 bits, value = {4,6,8}. This is a fixed-width field for
   specifying the SEK length in bytes divided by 4. It gives the
   size of one key even if two keys are present. MUST match the key
   size specified in the Encryption Field of the handshake packet
   (Table 2).

Salt (SLen): SLen * 8 bits, value = { }. This is a variable-width
   field that complements the keying material by specifying a salt
   key.

Wrap: (64 + n * KLen * 8) bits, value = { }. This is a variable-
   width field for specifying the wrapped key(s), where
   n = (KK + 1)/2 (integer division) and the size of the Wrap field
   is ((n * KLen) + 8) bytes.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                 Integrity Check Vector (ICV)                  +
|                                                               |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
|                             xSEK                              |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
|                             oSEK                              |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+

               Figure 11: Unwrapped key structure

ICV: 64 bits. 64-bit Integrity Check Vector (AES key wrap
   integrity). This field is used to detect whether the keys were
   unwrapped properly. If the KEK in hand is invalid, validation
   fails and the unwrapped keys are discarded.

xSEK: variable width. This field identifies an odd or even SEK. If
   only one key is present, the bit set in the KK field tells which
   SEK is provided. If both keys are present, then this field is
   eSEK (even key) and it is followed by the odd key oSEK. The
   length of this field is KLen * 8 bits.

oSEK: variable width. This field with the odd key is present only
   when the message carries the two SEKs (identified by the KK
   field).

3.2.3.  Keep-Alive

Keep-alive control packets are sent after a certain timeout from the
last time any packet (Control or Data) was sent. The purpose of this
control packet is to notify the peer to keep the connection open when
no data exchange is taking place.

The default timeout for a keep-alive packet to be sent is 1 second.
An SRT keep-alive packet is formatted as follows:

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+- SRT Header +-+-+-+-+-+-+-+-+-+-+-+-+-+
|1|        Control Type         |           Reserved            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                   Type-specific Information                   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           Timestamp                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                     Destination Socket ID                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

              Figure 12: Keep-Alive control packet

Packet Type: 1 bit, value = 1. The packet type value of a keep-alive
   control packet is "1".

Control Type: 15 bits, value = KEEPALIVE{0x0001}. The control type
   value of a keep-alive control packet is "1".

Reserved: 16 bits, value = 0. This is a fixed-width field reserved
   for future use.

Type-specific Information: 32 bits. This field is reserved for
   future definition.

Timestamp: 32 bits. See Section 3.

Destination Socket ID: 32 bits. See Section 3.

Keep-alive control packets do not contain a Control Information Field
(CIF).

3.2.4.  ACK (Acknowledgment)

Acknowledgment control packets are used to provide the delivery
status of data packets. By acknowledging the reception of data
packets up to the acknowledged packet sequence number, the receiver
notifies the sender that all prior packets were received or, in the
case of live transmission mode (Section 4.2.2), that any preceding
missing packets were dropped as too late to be delivered.

ACK packets may also carry some additional information from the
receiver like RTT, bandwidth, receiving speed, etc. The CIF portion
of the ACK control packet is expanded as follows:

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+- SRT Header +-+-+-+-+-+-+-+-+-+-+-+-+-+
|1|        Control Type         |           Reserved            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Acknowledgement Number                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           Timestamp                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                     Destination Socket ID                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+- CIF -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           Last Acknowledged Packet Sequence Number            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                              RTT                              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         RTT Variance                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                     Available Buffer Size                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Packets Receiving Rate                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Estimated Link Capacity                    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Receiving Rate                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                  Figure 13: ACK control packet

Packet Type: 1 bit, value = 1. The packet type value of an ACK
   control packet is "1".

Control Type: 15 bits, value = ACK{0x0002}. The control type value
   of an ACK control packet is "2".

Reserved: 16 bits, value = 0. This is a fixed-width field reserved
   for future use.

Acknowledgement Number: 32 bits. This field contains the sequential
   number of the full acknowledgment packet, starting from 1.

Timestamp: 32 bits. See Section 3.

Destination Socket ID: 32 bits. See Section 3.

Last Acknowledged Packet Sequence Number: 32 bits. This field
   contains the sequence number of the last data packet being
   acknowledged plus one. In other words, it is the sequence number
   of the first unacknowledged packet.

RTT: 32 bits. RTT value, in microseconds, estimated by the receiver
   based on the previous ACK-ACKACK packet exchange.

RTT Variance: 32 bits. The variance of the RTT estimation, in
   microseconds.

Available Buffer Size: 32 bits. Available size of the receiver's
   buffer, in packets.

Packets Receiving Rate: 32 bits. The rate at which packets are being
   received, in packets per second.

Estimated Link Capacity: 32 bits. Estimated bandwidth of the link,
   in packets per second.

Receiving Rate: 32 bits. Estimated receiving rate, in bytes per
   second.

There are several types of ACK packets:

*  A Full ACK control packet is sent every 10 ms and has all the
   fields of Figure 13.

*  A Lite ACK control packet includes only the Last Acknowledged
   Packet Sequence Number field. The Type-specific Information field
   should be set to 0.

*  A Small ACK includes the fields up to and including the Available
   Buffer Size field. The Type-specific Information field should be
   set to 0.

The sender only acknowledges the receipt of Full ACK packets (see
Section 3.2.7).

The Lite ACK and Small ACK packets are used in cases when the
receiver should acknowledge received data packets more often than
every 10 ms. This is usually needed at high data rates. It is up to
the receiver to decide the condition and the type of ACK packet to
send (Lite or Small). The recommendation is to send a Lite ACK for
every 64 packets received.

3.2.5.  NAK (Loss Report)

Negative acknowledgment (NAK) control packets are used to signal
failed data packet deliveries. The receiver notifies the sender
about lost data packets by sending a NAK packet that contains a list
of sequence numbers for those lost packets.

An SRT NAK packet is formatted as follows:

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+- SRT Header +-+-+-+-+-+-+-+-+-+-+-+-+-+
|1|        Control Type         |           Reserved            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                   Type-specific Information                   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           Timestamp                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                     Destination Socket ID                     |
+-+-+-+-+-+-+-+-+-+-+-+- CIF (Loss List) -+-+-+-+-+-+-+-+-+-+-+-+
|0|                 Lost packet sequence number                 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|1|         Range of lost packets from sequence number          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|0|                    Up to sequence number                    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|0|                 Lost packet sequence number                 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                  Figure 14: NAK control packet

Packet Type: 1 bit, value = 1. The packet type value of a NAK
   control packet is "1".

Control Type: 15 bits, value = NAK{0x0003}. The control type value
   of a NAK control packet is "3".

Reserved: 16 bits, value = 0. This is a fixed-width field reserved
   for future use.

Type-specific Information: 32 bits. This field is reserved for
   future definition.

Timestamp: 32 bits. See Section 3.

Destination Socket ID: 32 bits. See Section 3.

Control Information Field (CIF): A single value or a range of lost
   packet sequence numbers. See packet sequence number coding in
   Appendix A.

3.2.6.  Shutdown

Shutdown control packets are used to initiate the closing of an SRT
connection.
An SRT shutdown control packet is formatted as follows:

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+- SRT Header +-+-+-+-+-+-+-+-+-+-+-+-+-+
|1|        Control Type         |           Reserved            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                   Type-specific Information                   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           Timestamp                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                     Destination Socket ID                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

               Figure 15: Shutdown control packet

Packet Type: 1 bit, value = 1. The packet type value of a shutdown
   control packet is "1".

Control Type: 15 bits, value = SHUTDOWN{0x0005}. The control type
   value of a shutdown control packet is "5".

Type-specific Information: 32 bits. This field is reserved for
   future definition.

Timestamp: 32 bits. See Section 3.

Destination Socket ID: 32 bits. See Section 3.

Shutdown control packets do not contain a Control Information Field
(CIF).

3.2.7.  ACKACK

ACKACK control packets are sent to acknowledge the reception of a
Full ACK, and are used in the calculation of RTT by the receiver.

An SRT ACKACK control packet is formatted as follows:

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+- SRT Header +-+-+-+-+-+-+-+-+-+-+-+-+-+
|1|        Control Type         |           Reserved            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Acknowledgement Number                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           Timestamp                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                     Destination Socket ID                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                Figure 16: ACKACK control packet

Packet Type: 1 bit, value = 1. The packet type value of an ACKACK
   control packet is "1".

Control Type: 15 bits, value = ACKACK{0x0006}. The control type
   value of an ACKACK control packet is "6".

Acknowledgement Number: 32 bits. This field contains the
   Acknowledgement Number of the Full ACK packet whose reception is
   being acknowledged by this ACKACK packet.

Timestamp: 32 bits. See Section 3.

Destination Socket ID: 32 bits. See Section 3.

ACKACK control packets do not contain a Control Information Field
(CIF).

4.  SRT Data Transmission and Control

This section describes key concepts related to the handling of
control and data packets during the transmission process.

After the handshake and exchange of capabilities is completed, packet
data can be sent and received over the established connection. To
fully utilize the features of low latency and error recovery provided
by SRT, the sender and receiver must handle control packets, timers,
and buffers for the connection as specified in this section.

4.1.  Stream Multiplexing

Multiple SRT sockets may share the same UDP socket. Packets received
on this UDP socket are dispatched to the SRT sockets for which they
are destined.

During the handshake, the parties exchange their SRT Socket IDs.
These IDs are then used in the Destination Socket ID field of every
control and data packet (see Section 3).

4.2.  Data Transmission Modes

SRT was created mainly for live streaming, and therefore its main and
default transmission mode is "live". However, SRT also supports the
modes that the original UDT library supported, that is, buffer and
message transmission.

4.2.1.  Message Mode

When the STREAM flag of the Handshake Extension Message
(Section 3.2.1.1) is set to 0, the protocol operates in Message mode,
characterized as follows:

*  Every packet has its own Packet Sequence Number.

*  One or several consecutive SRT Data packets can form a message.

*  All the packets belonging to the same message have the same
   message number set in the Message Number field.

The first packet of a message has the first bit of the Packet
Position Flags (Section 3.1) set to 1. The last packet of the
message has the second bit of the Packet Position Flags set to 1.
Thus, a PP equal to "11b" indicates a packet that forms the whole
message. A PP equal to "00b" indicates a packet that belongs to the
inner part of the message.

The concept of the message in SRT comes from UDT ([GHG04b]). In this
mode a single sending instruction passes exactly one piece of data
that has boundaries (a message). This message may span multiple UDP
packets (and multiple SRT data packets). The only size limitation is
that it shall fit as a whole in the buffers of the sender and the
receiver. Although internally all operations (e.g. ACK, NAK) on data
packets are performed independently, an application must send and
receive the whole message. Until the message is complete (all
packets are received) the application will not be allowed to read it.

When the Order Flag of a Data packet is set to 1, this imposes a
sequential reading order on messages. An Order Flag set to 0 allows
an application to read messages that are already fully available,
before any preceding messages that may have some packets missing.

4.2.2.  Live Mode

Live mode is a special type of message mode where only data packets
with their PP field set to "11b" are allowed.
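The Packet Position (PP) rules for message mode and live mode can be sketched in Python as follows. This is an illustration only: the constant and function names are hypothetical, and the "first bit" of the PP field is assumed to be the more significant of its two bits, which matches the "11b"/"00b" notation used in Section 4.2.1.

```python
from typing import Sequence

PP_MIDDLE = 0b00  # packet belongs to the inner part of a message
PP_LAST = 0b01    # second PP bit set: last packet of a message
PP_FIRST = 0b10   # first PP bit set: first packet of a message
PP_SOLO = 0b11    # both bits set: the packet forms the whole message


def forms_whole_message(pp_sequence: Sequence[int]) -> bool:
    """Check PP values of consecutive packets sharing one message number."""
    if len(pp_sequence) == 0:
        return False
    if len(pp_sequence) == 1:
        return pp_sequence[0] == PP_SOLO
    return (pp_sequence[0] == PP_FIRST
            and pp_sequence[-1] == PP_LAST
            and all(pp == PP_MIDDLE for pp in pp_sequence[1:-1]))


def allowed_in_live_mode(pp: int) -> bool:
    """Live mode only permits packets that carry a whole message."""
    return pp == PP_SOLO
```

A receiver in message mode could use a check like forms_whole_message before handing a message to the application, since a message must not be readable until all of its packets have arrived.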
Additionally, the Timestamp-Based Packet Delivery (TSBPD)
(Section 4.5) and Too-Late Packet Drop (Section 4.6) mechanisms are
used in this mode.

4.2.3.  Buffer Mode

Buffer mode is negotiated during the Handshake by setting the STREAM
flag of the Handshake Extension Message Flags to 1.

In this mode consecutive packets form one continuous stream that can
be read in portions of any size.

4.3.  Handshake Messages

SRT is a connection-oriented protocol. It embraces the concepts of
"connection" and "session". The UDP system protocol is used by SRT
for sending data and control packets.

An SRT connection is characterized by the fact that it is:

*  first engaged by a handshake process;

*  maintained as long as any packets are being exchanged in a timely
   manner;

*  considered closed when a party receives the appropriate close
   command from its peer (connection closed by the foreign host), or
   when it receives no packets at all for some predefined time
   (connection broken on timeout).

SRT supports two connection configurations:

1.  Caller-Listener, where one side waits for the other to initiate a
    connection

2.  Rendezvous, where both sides attempt to initiate a connection

The handshake is performed between two parties: "Initiator" and
"Responder":

*  Initiator starts the extended SRT handshake process and sends
   appropriate SRT extended handshake requests.

*  Responder expects the SRT extended handshake requests to be sent
   by the Initiator and sends SRT extended handshake responses back.

There are three basic types of SRT handshake extensions that are
exchanged in the handshake:

*  Handshake Extension Message exchanges the basic SRT information;

*  Key Material Exchange exchanges the wrapped stream encryption key
   (used only if encryption is requested).

*  Stream ID extension exchanges some stream-specific information
   that can be used by the application to identify the incoming
   stream connection.

The Initiator and Responder roles are assigned depending on the
connection mode.

For Caller-Listener connections: the Caller is the Initiator, the
Listener is the Responder. For Rendezvous connections: the Initiator
and Responder roles are assigned based on the initial data
interchange during the handshake.

The Handshake Type field in the Handshake Structure (see Figure 5)
indicates the handshake message type.

The Caller-Listener handshake exchange has the following order of
Handshake Types:

1.  Caller to Listener: INDUCTION

2.  Listener to Caller: INDUCTION (reports cookie)

3.  Caller to Listener: CONCLUSION (uses previously returned cookie)

4.  Listener to Caller: CONCLUSION (confirms connection established)

The Rendezvous handshake exchange has the following order of
Handshake Types:

1.  After starting the connection: WAVEAHAND.

2.  After receiving the above message from the peer: CONCLUSION.

3.  After receiving the above message from the peer: AGREEMENT.

When a connection process has failed before either party can send the
CONCLUSION handshake, the Handshake Type field will contain the
appropriate error value for the rejected connection. See the list of
error codes in Table 7.

+======+================+=========================================+
| Code | Error          | Description                             |
+======+================+=========================================+
| 1000 | REJ_UNKNOWN    | Unknown reason                          |
+------+----------------+-----------------------------------------+
| 1001 | REJ_SYSTEM     | System function error                   |
+------+----------------+-----------------------------------------+
| 1002 | REJ_PEER       | Rejected by peer                        |
+------+----------------+-----------------------------------------+
| 1003 | REJ_RESOURCE   | Resource allocation problem             |
+------+----------------+-----------------------------------------+
| 1004 | REJ_ROGUE      | Incorrect data in handshake             |
+------+----------------+-----------------------------------------+
| 1005 | REJ_BACKLOG    | Listener's backlog exceeded             |
+------+----------------+-----------------------------------------+
| 1006 | REJ_IPE        | Internal program error                  |
+------+----------------+-----------------------------------------+
| 1007 | REJ_CLOSE      | Socket is closing                       |
+------+----------------+-----------------------------------------+
| 1008 | REJ_VERSION    | Peer is older version than agent's min  |
+------+----------------+-----------------------------------------+
| 1009 | REJ_RDVCOOKIE  | Rendezvous cookie collision             |
+------+----------------+-----------------------------------------+
| 1010 | REJ_BADSECRET  | Wrong password                          |
+------+----------------+-----------------------------------------+
| 1011 | REJ_UNSECURE   | Password required or unexpected         |
+------+----------------+-----------------------------------------+
| 1012 | REJ_MESSAGEAPI | Stream flag collision                   |
+------+----------------+-----------------------------------------+
| 1013 | REJ_CONGESTION | Incompatible congestion-controller type |
+------+----------------+-----------------------------------------+
| 1014 | REJ_FILTER     | Incompatible packet filter              |
+------+----------------+-----------------------------------------+
| 1015 | REJ_GROUP      | Incompatible group                      |
+------+----------------+-----------------------------------------+

            Table 7: Handshake Rejection Reason Codes

The specification of the cipher family and block size is decided by
the data Sender. When the transmission is bidirectional, this value
MUST be agreed upon at the outset because when both are set the
Responder wins. For Caller-Listener connections it is reasonable to
set this value on the Listener only. In the case of Rendezvous the
only reasonable approach is to decide upon the correct value from the
different sources and to set it on both parties (note that *AES-128*
is the default).

4.3.1.  Caller-Listener Handshake

This section describes the handshaking process where a Listener is
waiting for an incoming Handshake request on a bound UDP port from a
Caller. The process has two phases: induction and conclusion.

4.3.1.1.  The Induction Phase

The INDUCTION phase serves only to set a cookie on the Listener so
that it doesn't allocate resources, thus mitigating a potential DoS
attack that might be perpetrated by flooding the Listener with
handshake commands.

The Caller begins by sending the INDUCTION handshake, which contains
the following (significant) fields:

*  Version: MUST always be 4

*  Encryption Field: 0

*  Extension Field: 2

*  Handshake Type: INDUCTION

*  SRT Socket ID: SRT Socket ID of the Caller

*  SYN Cookie: 0

The Destination Socket ID of the SRT packet header in this message is
0, which is interpreted as a connection request.

The handshake version number is set to 4 in this initial handshake.
This is because SRT was initially designed to be compliant with the
UDT protocol ([GHG04b]), on which it is based.
The Listener responds with the following:

*  Version: 5

*  Encryption Field: Advertised cipher family and block size.

*  Extension Field: SRT magic code 0x4A17

*  Handshake Type: INDUCTION

*  SRT Socket ID: Socket ID of the Listener

*  SYN Cookie: a cookie that is crafted based on host, port and
   current time with 1 minute accuracy to avoid SYN flooding attack
   [RFC4987]

At this point the Listener still does not know if the Caller is SRT
or UDT, and it responds with the same set of values regardless of
whether the Caller is SRT or UDT.

If the party is SRT, it does interpret the values in Version and
Extension Field. If it receives the value 5 in Version, it
understands that the response comes from an SRT party, so it knows
that it should prepare the proper messages for the next handshake
phase. It also checks the following:

*  whether the Extension Field contains the magic value 0x4A17;
   otherwise the connection is rejected. This is a contingency for
   the case where someone, in an attempt to extend UDT independently,
   increases the Version value to 5 and tries to test it against SRT.

*  whether the Encryption Field contains a non-zero value, which is
   interpreted as an advertised cipher family and block size.

A legacy UDT party completely ignores the values reported in Version
and Handshake Type. It is, however, interested in the SYN Cookie
value, as this must be passed to the next phase. It does interpret
these fields, but only in the "conclusion" message.

4.3.1.2.  The Conclusion Phase

Once the Caller gets the SYN cookie from the Listener, it sends the
CONCLUSION handshake to the Listener.
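Before sending CONCLUSION, an SRT Caller applies the checks from Section 4.3.1.1 to the Listener's INDUCTION response. The following Python sketch illustrates them under stated assumptions: the InductionResponse class and check_induction_response function are hypothetical, the handshake is assumed to be already parsed into integer fields, and treating any Version other than 5 as a legacy (UDT-style) peer is an interpretation of the text rather than a rule stated in it.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

SRT_MAGIC_CODE = 0x4A17  # expected in the Extension Field
HS_VERSION_SRT = 5       # Version reported by an SRT Listener


@dataclass
class InductionResponse:
    """Hypothetical, already-parsed view of the INDUCTION response."""
    version: int
    encryption_field: int
    extension_field: int
    syn_cookie: int
    srt_socket_id: int


def check_induction_response(
        rsp: InductionResponse) -> Tuple[bool, Optional[int]]:
    """Return (peer_is_srt, advertised_cipher) or raise on rejection."""
    if rsp.version != HS_VERSION_SRT:
        # Assumption: anything else is handled as a legacy peer, for
        # which only the SYN cookie matters in the next phase.
        return False, None
    if rsp.extension_field != SRT_MAGIC_CODE:
        # Version 5 without the magic code: reject the connection.
        raise ValueError("missing SRT magic code; connection rejected")
    # A non-zero Encryption Field advertises cipher family and block size.
    advertised = rsp.encryption_field or None
    return True, advertised
```

In either accepted case the Caller keeps rsp.syn_cookie and rsp.srt_socket_id, since both are echoed in the CONCLUSION message.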
1490 The following values are set by the compliant caller: 1492 * Version: 5 1494 * Handshake Type: CONCLUSION 1496 * SRT Socket ID: Socket ID of the Caller 1497 * SYN Cookie: the cookie previously received in the induction phase 1499 The Destination Socket ID in this message is the socket ID that was 1500 previously received in the induction phase in the SRT Socket ID field 1501 of the handshake structure. 1503 * Encryption Flags: advertised cipher family and block size. 1505 * Extension Flags: A set of flags that define the extensions 1506 provided in the handshake. 1508 The Listener responds with the same values shown above, without the 1509 cookie (which is not needed here), as well as the extensions for HS 1510 Version 5 (which will probably be exactly the same). 1512 There is not any "negotiation" here. If the values passed in the 1513 handshake are in any way not acceptable by the other side, the 1514 connection will be rejected. The only case when the Listener can 1515 have precedence over the Caller is the advertised Cipher Family and 1516 Block Size (Table 2) in the Encryption Field of the Handshake. 1518 The value for latency is always agreed to be the greater of those 1519 reported by each party. 1521 4.3.2. Rendezvous Handshake 1523 The Rendezvous process uses a state machine. It is slightly 1524 different from UDT Rendezvous handshake [GHG04b], although it is 1525 still based on the same message request types. 1527 Both parties start with WAVEAHAND and use the Version value of 5. 1528 Legacy Version 4 clients do not look at the Version value, whereas 1529 Version 5 clients can detect version 5. The parties only continue 1530 with the Version 5 Rendezvous process when Version is set to 5 for 1531 both. Otherwise the process continues exclusively according to 1532 Version 4 rules [GHG04b]. 1534 With Version 5 Rendezvous, both parties create a cookie for a process 1535 called the "cookie contest". 
This is necessary for the assignment of 1536 Initiator and Responder roles. Each party generates a cookie value 1537 (a 32-bit number) based on the host, port, and current time with 1 1538 minute accuracy. This value is scrambled using an MD5 sum 1539 calculation. The cookie values are then compared with one another. 1541 Since it is impossible to have two sockets on the same machine bound 1542 to the same NIC and port and operating independently, it is virtually 1543 impossible that the parties will generate identical cookies. 1544 However, this situation may occur if an application tries to "connect 1545 to itself" - that is, either connects to a local IP address, when the 1546 socket is bound to INADDR_ANY, or to the same IP address to which the 1547 socket was bound. If the cookies are identical (for any reason), the 1548 connection will not be made until new, unique cookies are generated 1549 (after a delay of up to one minute). In the case of an application 1550 "connecting to itself", the cookies will always be identical, and so 1551 the connection will never be established. 1553 When one party's cookie value is greater than its peer's, it wins the 1554 cookie contest and becomes Initiator (the other party becomes the 1555 Responder). 1557 At this point there are two possible "handshake flows": serial and 1558 parallel. 1560 4.3.2.1. Serial Handshake Flow 1562 In the serial handshake flow, one party is always first, and the 1563 other follows. That is, while both parties are repeatedly sending 1564 WAVEAHAND messages, at some point one party - let's say Alice - will 1565 find she has received a WAVEAHAND message before she can send her 1566 next one, so she sends a CONCLUSION message in response. Meantime, 1567 Bob (Alice's peer) has missed Alice's WAVEAHAND messages, so that 1568 Alice's CONCLUSION is the first message Bob has received from her. 
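The cookie contest described in Section 4.3.2 can be sketched as follows. This is only an illustration under stated assumptions, not the reference implementation: the text says the cookie is a 32-bit value derived from host, port and the current time with 1 minute accuracy and scrambled with MD5, but the exact input layout ("host:port:minute") and the choice of the first four digest bytes are assumptions, and make_cookie and resolve_role are hypothetical names.

```python
import hashlib
import time
from typing import Optional


def make_cookie(host: str, port: int, now: Optional[float] = None) -> int:
    """Derive a 32-bit cookie from host, port and the current minute.

    Assumption: the string layout and the use of the first four MD5
    digest bytes are chosen for this sketch only.
    """
    minute = int((time.time() if now is None else now) // 60)
    digest = hashlib.md5(f"{host}:{port}:{minute}".encode()).digest()
    return int.from_bytes(digest[:4], "big")


def resolve_role(own_cookie: int, peer_cookie: int) -> Optional[str]:
    """Return the handshake role, or None when the cookies are identical."""
    if own_cookie > peer_cookie:
        return "INITIATOR"
    if own_cookie < peer_cookie:
        return "RESPONDER"
    # Identical cookies: no role can be assigned until new, unique
    # cookies are generated.
    return None
```

Both parties evaluate resolve_role with the arguments swapped, so they always reach complementary roles unless the cookies are equal.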
1570 This process can be described easily as a series of exchanges between 1571 the first and following parties (Alice and Bob, respectively): 1573 1. Initially, both parties are in the waving state. Alice sends a 1574 handshake message to Bob: 1576 * Version: 5 1578 * Type: Extension field: 0, Encryption field: advertised 1579 "PBKEYLEN". 1581 * Handshake Type: WAVEAHAND 1583 * SRT Socket ID: Alice's socket ID 1585 * SYN Cookie: Created based on host/port and current time. 1587 While Alice does not yet know if she is sending this message to a 1588 Version 4 or Version 5 peer, the values from these fields would not 1589 be interpreted by the Version 4 peer when the Handshake Type is 1590 WAVEAHAND. 1592 2. Bob receives Alice's WAVEAHAND message, switches to the 1593 "attention" state. Since Bob now knows Alice's cookie, he 1594 performs a "cookie contest" (compares both cookie values). If 1595 Bob's cookie is greater than Alice's, he will become the 1596 Initiator. Otherwise, he will become the Responder. 1598 The resolution of the Handshake Role (Initiator or Responder) is 1599 essential for further processing. 1601 Then Bob responds: 1603 * Version: 5 1605 * Extension field: appropriate flags if Initiator, otherwise 0 1607 * Encryption field: advertised PBKEYLEN 1609 * Handshake Type: CONCLUSION 1611 If Bob is the Initiator and encryption is on, he will use either his 1612 own cipher family and block size or the one received from Alice (if 1613 she has advertised those values). 1615 3. Alice receives Bob's CONCLUSION message. While at this point she 1616 also performs the "cookie contest", the outcome will be the same. 1617 She switches to the "fine" state, and sends: 1619 * Version: 5 1621 * Appropriate extension flags and encryption flags 1623 * Handshake Type: CONCLUSION 1625 Both parties always send extension flags at this point, which will 1626 contain HSREQ if the message comes from an Initiator, or HSRSP if it 1627 comes from a Responder.
If the Initiator has received a previous 1628 message from the Responder containing an advertised cipher family and 1629 block size in the encryption flags field, it will be used as the key 1630 length for key generation sent next in the KMREQ extension. 1632 4. Bob receives Alice's CONCLUSION message, and then does one of the 1633 following (depending on Bob's role): 1635 * If Bob is the Initiator (Alice's message contains HSRSP), he: 1637 - switches to the "connected" state 1638 - sends Alice a message with Handshake Type AGREEMENT, but 1639 containing no SRT extensions (Extension Flags field should 1640 be 0) 1642 * If Bob is the Responder (Alice's message contains HSREQ), he: 1644 - switches to "initiated" state 1646 - sends Alice a message with Handshake Type CONCLUSION that 1647 also contains extensions with HSRSP 1649 - awaits a confirmation from Alice that she is also 1650 connected (preferably by AGREEMENT message) 1652 5. Alice receives the above message, enters into the "connected" 1653 state, and then does one of the following (depending on Alice's 1654 role): 1656 * If Alice is the Initiator (received CONCLUSION with HSRSP), 1657 she sends Bob a message with Handshake Type = AGREEMENT. 1659 * If Alice is the Responder, the received message has Handshake 1660 Type AGREEMENT and in response she does nothing. 1662 6. At this point, if Bob was Initiator, he is connected already. If 1663 he was a Responder, he should receive the above AGREEMENT 1664 message, after which he switches to the "connected" state. In 1665 the case where the UDP packet with the agreement message gets 1666 lost, Bob will still enter the "connected" state once he receives 1667 anything else from Alice. If Bob is going to send, however, he 1668 has to continue sending the same CONCLUSION until he gets the 1669 confirmation from Alice. 1671 4.3.2.2.
Parallel Handshake Flow 1673 The chances of the parallel handshake flow are very low, but still it 1674 may occur if the handshake messages with WAVEAHAND are sent and 1675 received by both peers at precisely the same time. 1677 The resulting flow is very much like Bob's behaviour in the serial 1678 handshake flow, but for both parties. Alice and Bob will go through 1679 the same state transitions: 1681 Waving -> Attention -> Initiated -> Connected 1683 In the Attention state they know each other's cookies, so they can 1684 assign roles. In contrast to serial flows, which are mostly based on 1685 request-response cycles, here everything happens completely 1686 asynchronously: the state switches upon reception of a particular 1687 handshake message with appropriate contents (the Initiator MUST 1688 attach the HSREQ extension, and Responder MUST attach the "HSRSP" 1689 extension). 1691 Here's how the parallel handshake flow works, based on roles: 1693 Initiator: 1695 1. Waving 1697 * Receives WAVEAHAND message 1699 * Switches to Attention 1701 * Sends CONCLUSION + HSREQ 1703 2. Attention 1705 * Receives CONCLUSION message, which: 1707 - contains no extensions: 1709 o switches to Initiated, still sends CONCLUSION + HSREQ 1711 - contains "HSRSP" extension: 1713 o switches to Connected, sends AGREEMENT 1715 3. Initiated 1717 * Receives CONCLUSION message, which: 1719 - Contains no extensions: 1721 o REMAINS IN THIS STATE, still sends CONCLUSION + HSREQ 1723 - contains "HSRSP" extension: 1725 o switches to Connected, sends AGREEMENT 1727 4. Connected 1729 * May receive CONCLUSION and respond with AGREEMENT, but 1730 normally by now it should already have received payload 1731 packets. 1733 Responder: 1735 1. Waving 1737 * Receives WAVEAHAND message 1739 * Switches to Attention 1741 * Sends CONCLUSION message (with no extensions) 1743 2. Attention 1745 * Receives CONCLUSION message with HSREQ. 
This message might 1746 contain no extensions, in which case the party shall simply 1747 send the empty CONCLUSION message, as before, and remain in 1748 this state. 1750 * Switches to Initiated and sends CONCLUSION message with HSRSP 1752 3. Initiated 1754 * Receives: 1756 - CONCLUSION message with HSREQ 1758 o responds with CONCLUSION with HSRSP and remains in this 1759 state 1761 - AGREEMENT message 1763 o responds with AGREEMENT and switches to Connected 1765 - Payload packet 1767 o responds with AGREEMENT and switches to Connected 1769 4. Connected 1771 * Is not expecting to receive any handshake messages anymore. 1772 The AGREEMENT message is always sent only once or per every 1773 final CONCLUSION message. 1775 Note that any of these packets may be lost, and the sending party 1776 will never become aware of it. The missing packet problem is resolved this 1777 way: 1779 1. If the Responder misses the CONCLUSION + HSREQ message, it simply 1780 continues sending empty CONCLUSION messages. Only upon reception 1781 of CONCLUSION + HSREQ does it respond with CONCLUSION + HSRSP. 1783 2. If the Initiator misses the CONCLUSION + HSRSP response from the 1784 Responder, it continues sending CONCLUSION + HSREQ. The 1785 Responder MUST always respond with CONCLUSION + HSRSP when the 1786 Initiator sends CONCLUSION + HSREQ, even if it has already 1787 received and interpreted it. 1789 3. When the Initiator switches to the Connected state it responds 1790 with an AGREEMENT message, which may be missed by the Responder. 1791 Nonetheless, the Initiator may start sending data packets because 1792 it considers itself connected - it does not know that the 1793 Responder has not yet switched to the Connected state.
Therefore 1794 it is exceptionally allowed that when the Responder is in the 1795 Initiated state and receives a data packet (or any control packet 1796 that is normally sent only between connected parties) over this 1797 connection, it may switch to the Connected state just as if it 1798 had received an AGREEMENT message. 1800 4. If the Initiator has already switched to the Connected state 1801 it will not bother the Responder with any more handshake 1802 messages. But the Responder may be completely unaware of that 1803 (having missed the AGREEMENT message from the Initiator). 1804 Therefore it does not exit the connecting state, which means that 1805 it continues sending CONCLUSION + HSRSP messages until it 1806 receives any packet that will make it switch to the Connected 1807 state (normally AGREEMENT). Only then does it exit the 1808 connecting state and the application can start transmission. 1810 4.4. SRT Buffer Latency 1812 The SRT sender and receiver have buffers to store packets. 1814 On the sender, latency is the time that SRT holds a packet to give it 1815 a chance to be delivered successfully while maintaining the rate of 1816 the sender at the receiver. If an acknowledgment (ACK) is missing or 1817 late for more than the configured latency, the packet is dropped from 1818 the sender buffer. A packet can be retransmitted as long as it 1819 remains in the buffer for the duration of the latency window. On the 1820 receiver, packets are delivered to an application from a buffer after 1821 the latency interval has passed. This helps to recover from 1822 potential packet losses. See Section 4.5, Section 4.6 for details. 1824 Latency is a value, in milliseconds, that can cover the time to 1825 transmit hundreds or even thousands of packets at high bitrate.
1826 Latency can be thought of as a window that slides over time, during 1827 which a number of activities take place, such as the reporting of 1828 acknowledged packets (ACKs) (Section 4.8.1) and unacknowledged 1829 packets (NAKs)(Section 4.8.2). 1831 Latency is configured through the exchange of capabilities during the 1832 extended handshake process between initiator and responder. The 1833 Handshake Extension Message (Section 3.2.1.1) has TSBPD delay 1834 information, in milliseconds, from the SRT receiver and sender. The 1835 latency for a connection will be established as the maximum value of 1836 latencies proposed by the initiator and responder. 1838 4.5. Timestamp-Based Packet Delivery 1840 The goal of the SRT Timestamp-Based Packet Delivery (TSBPD) mechanism 1841 is to reproduce the output of the sending application (e.g., encoder) 1842 at the input of the receiving application (e.g., decoder) in live 1843 data transmission mode (see Section 4.2). It attempts to reproduce 1844 the timing of packets committed by the sending application to the SRT 1845 sender. This allows packets to be scheduled for delivery by the SRT 1846 receiver, making them ready to be read by the receiving application 1847 (see Figure 17). 1849 The SRT receiver, using the timestamp of the SRT data packet header, 1850 delivers packets to a receiving application with a fixed minimum 1851 delay from the time the packet was scheduled for sending on the SRT 1852 sender side. Basically, the sender timestamp in the received packet 1853 is adjusted to the receiver's local time (compensating for the time 1854 drift or different time zones) before releasing the packet to the 1855 application. Packets can be withheld by the SRT receiver for a 1856 configured receiver delay. A higher delay can accommodate a larger 1857 uniform packet drop rate, or a larger packet burst drop. 
Packets 1858 received after their "play time" are dropped if the Too-Late Packet 1859 Drop feature is enabled (see Section 4.6). 1861 The packet timestamp, in microseconds, is relative to the SRT 1862 connection creation time. Packets are inserted based on the sequence 1863 number in the header field. The origin time, in microseconds, of the 1864 packet is already sampled when a packet is first submitted by the 1865 application to the SRT sender unless explicitly provided. The TSBPD 1866 feature uses this time to stamp the packet for first transmission and 1867 any subsequent retransmission. This timestamp and the configured SRT 1868 latency (Section 4.4) control the recovery buffer size and the 1869 instant that packets are delivered at the destination (the 1870 aforementioned "play time" which is decided by adding the timestamp 1871 to the configured latency). 1873 It is worth mentioning that the use of the packet sending time to 1874 stamp the packets is inappropriate for the TSBPD feature, since a new 1875 time (current sending time) is used for retransmitted packets, 1876 putting them out of order when inserted at their proper place in the 1877 stream. 1879 Figure 17 illustrates the key latency points during the packet 1880 transmission with the TSBPD feature enabled. 
1882 | Sending | | | 1883 | Delay | ~RTT/2 | SRT Latency | 1884 |<--------->|<------------>|<----------------->| 1885 | | | | 1886 | | | | 1887 | | | | 1888 ___ Scheduled Sent Received Scheduled 1889 / for sending | | for delivery 1890 Packet | | | | 1891 State | | | | 1892 | | | | 1893 | | | | 1894 -----------------------------------------------------> 1895 Time 1897 Figure 17: Key Latency Points during the Packet Transmission 1899 The main packet states shown in Figure 17 are the following: 1901 * "Scheduled for sending": the packet is committed by the sending 1902 application, stamped and ready to be sent; 1904 * "Sent": the packet is passed to the UDP socket and sent; 1906 * "Received": the packet is received and read from the UDP socket; 1908 * "Scheduled for delivery": the packet is scheduled for the delivery 1909 and ready to be read by the receiving application. 1911 It is worth noting that the round-trip time (RTT) of an SRT link may 1912 vary in time. However the actual end-to-end latency on the link 1913 becomes fixed and is approximately equal to (RTT_0/2 + SRT Latency) 1914 once the SRT handshake exchange happens, where RTT_0 is the actual 1915 value of the round-trip time during the SRT handshake exchange (the 1916 value of the round-trip time once the SRT connection has been 1917 established). 1919 The value of sending delay depends on the hardware performance. 1920 Usually it is relatively small (several microseconds) in contrast to 1921 RTT_0/2 and SRT latency which are measured in milliseconds. 1923 4.5.1. Packet Delivery Time 1925 Packet delivery time is the moment, estimated by the receiver, when a 1926 packet should be delivered to the upstream application. 
The 1927 calculation of packet delivery time (PktTsbpdTime) is performed upon 1928 receiving a data packet according to the following formula: 1930 PktTsbpdTime = TsbpdTimeBase + PKT_TIMESTAMP + TsbpdDelay + Drift 1932 where 1934 * TsbpdTimeBase is the time base that reflects the time difference 1935 between the local clock of the receiver and the clock used by the 1936 sender to timestamp packets being sent (see Section 4.5.1.1); 1938 * PKT_TIMESTAMP is the data packet timestamp, in microseconds; 1940 * TsbpdDelay is the receiver's buffer delay (or receiver's buffer 1941 latency, or SRT Latency). This is the time, in milliseconds, that 1942 SRT holds a packet from the moment it has been received till the 1943 time it should be delivered to the upstream application; 1945 * Drift is the time drift used to adjust the fluctuations between 1946 sender and receiver clocks, in microseconds. 1948 SRT Latency (TsbpdDelay) should be a buffer time large enough to 1949 cover an unexpectedly extended RTT, and the time needed to 1950 retransmit the lost packet. The value of minimum TsbpdDelay is 1951 negotiated during the SRT handshake exchange and is equal to 120 1952 milliseconds. The recommended value of TsbpdDelay is 3-4 times RTT. 1954 It is worth noting that TsbpdDelay limits the number of packet 1955 retransmissions to a certain extent, making it impossible to retransmit 1956 packets endlessly. This is important for live data transmission. 1958 4.5.1.1. TSBPD Time Base Calculation 1960 The initial value of TSBPD time base (TsbpdTimeBase) is calculated at 1961 the moment the second handshake request is received as follows: 1963 TsbpdTimeBase = T_NOW - HSREQ_TIMESTAMP 1965 where T_NOW is the current time according to the receiver clock; 1966 HSREQ_TIMESTAMP is the handshake packet timestamp, in microseconds.
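The two formulas above can be combined into a short worked example. It is a sketch under assumptions: all clock values are taken to be integer microseconds on the receiver clock, TsbpdDelay is converted from milliseconds before it is added (the formula mixes both units), and the function names are hypothetical.

```python
US_PER_MS = 1000  # microseconds per millisecond


def tsbpd_time_base(t_now_us: int, hsreq_timestamp_us: int) -> int:
    """TsbpdTimeBase = T_NOW - HSREQ_TIMESTAMP (both in microseconds)."""
    return t_now_us - hsreq_timestamp_us


def pkt_tsbpd_time(time_base_us: int, pkt_timestamp_us: int,
                   tsbpd_delay_ms: int, drift_us: int = 0) -> int:
    """PktTsbpdTime = TsbpdTimeBase + PKT_TIMESTAMP + TsbpdDelay + Drift.

    TsbpdDelay is given in milliseconds and converted to microseconds
    so that every term uses the same unit.
    """
    return (time_base_us + pkt_timestamp_us
            + tsbpd_delay_ms * US_PER_MS + drift_us)


# Handshake request stamped 20 ms arrives when the receiver clock
# reads 5 s; a packet stamped 1 s is then scheduled with 120 ms latency.
base = tsbpd_time_base(5_000_000, 20_000)
delivery = pkt_tsbpd_time(base, 1_000_000, 120)
```

With these inputs the time base is 4,980,000 microseconds and the packet becomes deliverable at 6,100,000 microseconds on the receiver clock.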
1968 The value of TsbpdTimeBase is approximately equal to the initial one- 1969 way delay of the link RTT_0/2, where RTT_0 is the actual value of the 1970 round-trip time during the SRT handshake exchange. 1972 During the transmission process, the value of TSBPD time base may be 1973 adjusted in two cases: 1975 1. During the TSBPD wrapping period. The TSBPD wrapping period 1976 happens every 01:11:35 hours. This time corresponds to the 1977 maximum timestamp value of a packet (MAX_TIMESTAMP). 1978 MAX_TIMESTAMP is equal to 0xFFFFFFFF, or the maximum value of 1979 32-bit unsigned integer, in microseconds (Section 3). The TSBPD 1980 wrapping period starts 30 seconds before reaching the maximum 1981 timestamp value of a packet and ends once the packet with 1982 timestamp within (30, 60) seconds interval is delivered (read 1983 from the buffer). The updated value of TsbpdTimeBase will be 1984 recalculated as follows: 1986 TsbpdTimeBase = TsbpdTimeBase + MAX_TIMESTAMP + 1 1988 2. By drift tracer. See Section 4.7 for details. 1990 4.6. Too-Late Packet Drop 1992 The Too-Late Packet Drop (TLPKTDROP) mechanism allows the sender to 1993 drop packets that have no chance to be delivered in time, and allows 1994 the receiver to skip missing packets that have not been delivered in 1995 time. The timeout of dropping a packet is based on the TSBPD 1996 mechanism (see Section 4.5). 1998 In the SRT, when Too-Late Packet Drop is enabled, and a packet 1999 timestamp is older than 125% of the SRT latency, it is considered too 2000 late to be delivered and may be dropped by the sender. However, the 2001 sender keeps packets for at least 1 second in case the SRT latency is 2002 not enough for a large RTT (that is, if 125% of the SRT latency is 2003 less than 1 second). 2005 When enabled on the receiver, the receiver drops packets that have 2006 not been delivered or retransmitted in time, and delivers the 2007 subsequent packets to the application when it is their time to play. 
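The sender-side rule above (a packet may be dropped once it is older than 125% of the SRT latency, but the sender keeps it for at least 1 second) reduces to a single comparison. The sketch below assumes that the packet age is measured in milliseconds from the packet's origin time; the function names are hypothetical.

```python
MIN_SENDER_HOLD_MS = 1000  # the sender keeps packets for at least 1 second


def sender_drop_threshold_ms(srt_latency_ms: float) -> float:
    """Age after which the sender may drop a packet as too late."""
    return max(1.25 * srt_latency_ms, MIN_SENDER_HOLD_MS)


def is_too_late(packet_age_ms: float, srt_latency_ms: float) -> bool:
    """True when the packet is older than the sender drop threshold."""
    return packet_age_ms > sender_drop_threshold_ms(srt_latency_ms)
```

For a latency of 120 milliseconds the 1 second floor applies, while for a latency of 2000 milliseconds the threshold is 2500 milliseconds.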
2009 In pseudo-code, the algorithm of reading from the receiver buffer is 2010 the following:

    pos = 0;  /* Current receiver buffer position */
    i = 0;    /* Position of the next available packet in the receiver
                 buffer, relative to the current buffer position pos */

    while (true) {
        // Get the position i of the next available packet
        // in the receiver buffer
        i = next_avail();
        // Calculate packet delivery time PktTsbpdTime
        // for the next available packet
        PktTsbpdTime = delivery_time(i);

        // Not yet time to deliver this packet: check again
        if (T_NOW < PktTsbpdTime)
            continue;

        Drop packets whose buffer position is less than i;

        Deliver the packet at buffer position i;

        pos = i + 1;
    }

2036 where T_NOW is the current time according to the receiver clock. 2038 The TLPKTDROP mechanism can be turned off to always ensure a clean 2039 delivery. However, a lost packet can simply pause delivery for 2040 a longer, potentially unbounded time, and cause even worse tearing 2041 for the player. Setting a higher SRT latency will help much more 2042 when TLPKTDROP causes packet drops too often. 2044 4.7. Drift Management 2046 When the sender enters "connected" status it tells the application 2047 there is a socket interface that is transmitter-ready. At this point 2048 the application can start sending data packets. It adds packets to 2049 the SRT sender's buffer at a certain input rate, from which they are 2050 transmitted to the receiver at scheduled times. 2052 A synchronized time is required to keep proper sender/receiver buffer 2053 levels, taking into account the time zone and round-trip time (up to 2054 2 seconds for satellite links). Considering addition/subtraction 2055 round-off, and possibly unsynchronized system times, an agreed-upon 2056 time base drifts by a few microseconds every minute.
The drift may 2057 accumulate over many days to a point where the sender or receiver 2058 buffers will overflow or deplete, seriously affecting the quality of 2059 the video. SRT has a time management mechanism to compensate for 2060 this drift. 2062 When a packet is received, SRT determines the difference between the 2063 time it was expected and its timestamp. The timestamp is calculated 2064 on the receiver side. The RTT tells the receiver how much time it 2065 was supposed to take. SRT maintains a reference between the time at 2066 the leading edge of the send buffer's latency window and the 2067 corresponding time on the receiver (the present time). This allows 2068 the packet timestamp to be converted to the local receiver time. Based on 2069 this time, various events (packet delivery, etc.) can be scheduled. 2071 The receiver samples time drift data and periodically calculates a 2072 packet timestamp correction factor, which is applied to each data 2073 packet received by adjusting the inter-packet interval. When a 2074 packet is received it is not given right away to the application. As 2075 time advances, the receiver knows the expected time for any missing 2076 or dropped packet, and can use this information to fill any "holes" 2077 in the receive queue with another packet (see Section 4.5). 2079 It is worth noting that the period of sampling time drift data is 2080 based on a number of packets rather than a time duration to ensure 2081 enough samples, independently of the media stream packet rate. The 2082 effect of network jitter on the estimated time drift is attenuated by 2083 using a large number of samples. Because the actual time drift is very 2084 slow (affecting a stream only after many hours), it does not require a 2085 fast reaction. 2087 The receiver uses local time to be able to schedule events -- to 2088 determine, for example, if it is time to deliver a certain packet 2089 right away.
The timestamps in the packets themselves are just 2090 references to the beginning of the session. When a packet is 2091 received (with a timestamp from the sender), the receiver makes a 2092 reference to the beginning of the session to recalculate its 2093 timestamp. The start time is derived from the local time at the 2094 moment that the session is connected. A packet timestamp equals 2095 "now" minus "StartTime", where the latter is the point in time when 2096 the socket was created. 2098 4.8. Acknowledgement and Lost Packet Handling 2100 To enable the Automatic Repeat reQuest of data packet 2101 retransmissions, a sender stores all sent data packets in its buffer. 2103 The SRT receiver periodically sends acknowledgments (ACKs) for the 2104 received data packets so that the SRT sender can remove the 2105 acknowledged packets from its buffer (Section 4.8.1). Once the 2106 acknowledged packets are removed, their retransmission is no longer 2107 possible and presumably not needed. 2109 Upon receiving the full acknowledgment (ACK) control packet, the SRT 2110 sender should acknowledge its reception to the receiver by sending an 2111 ACKACK control packet with the sequence number of the full ACK packet 2112 being acknowledged. 2114 The SRT receiver also sends NAK control packets to notify the sender 2115 about the missing packets (Section 4.8.2). The sending of a NAK 2116 packet can be triggered immediately after a gap in sequence numbers 2117 of data packets is detected. In addition, a Periodic NAK report 2118 mechanism can be used to send NAK reports periodically. The NAK 2119 packet in that case will list all the packets that the receiver 2120 considers being lost up to the moment the Periodic NAK report is 2121 sent. 2123 Upon reception of the NAK packet, the SRT sender prioritizes 2124 retransmissions of lost packets over the regular data packets to be 2125 transmitted for the first time. 
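The immediate NAK trigger mentioned above, a gap in the sequence numbers of received data packets, can be illustrated with a small detector. This is a simplified sketch: sequence number wraparound is deliberately ignored, and the GapDetector class and its method names are hypothetical.

```python
from typing import Optional, Tuple


class GapDetector:
    """Tracks the next expected sequence number and reports gaps.

    Assumption: sequence numbers grow monotonically; wraparound of the
    sequence number space is not handled in this sketch.
    """

    def __init__(self, first_seq: int) -> None:
        self.next_expected = first_seq

    def on_data_packet(self, seq: int) -> Optional[Tuple[int, int]]:
        """Return the inclusive range to report in a NAK, or None."""
        gap = None
        if seq > self.next_expected:
            # Packets next_expected .. seq - 1 have not arrived yet.
            gap = (self.next_expected, seq - 1)
        if seq >= self.next_expected:
            self.next_expected = seq + 1
        # A packet below next_expected is a retransmission or a
        # reordered packet and never opens a new gap.
        return gap
```

A returned range would be placed in a NAK control packet; packets that arrive later within a previously reported range do not trigger another report here.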
2127 The retransmission of the missing packet is repeated until the 2128 receiver acknowledges its receipt, or if both peers agree to drop 2129 this packet (see Section 4.6). 2131 4.8.1. Packet Acknowledgement (ACKs, ACKACKs) 2133 At certain intervals (see below), the SRT receiver sends an 2134 acknowledgment (ACK) that causes the acknowledged packets to be 2135 removed from the SRT sender's buffer. 2137 An ACK control packet contains the sequence number of the packet 2138 immediately following the latest in the list of received packets. 2139 Where no packet loss has occurred up to the packet with sequence 2140 number n, an ACK would include the sequence number (n + 1). 2142 An ACK (from a receiver) will trigger the transmission of an ACKACK 2143 (by the sender), with almost no delay. The time it takes for an ACK 2144 to be sent and an ACKACK to be received is the RTT. The ACKACK tells 2145 the receiver to stop sending the ACK position because the sender 2146 already knows it. Otherwise, ACKs (with outdated information) would 2147 continue to be sent regularly. Similarly, if the sender does not 2148 receive an ACK, it does not stop transmitting. 2150 There are two conditions for sending an acknowledgment. A full ACK 2151 is based on a timer of 10 milliseconds (the ACK period or 2152 synchronization time interval SYN). For high bitrate transmissions, 2153 a "light ACK" can be sent, which is an ACK for a sequence of packets. 2154 In a 10 milliseconds interval, there are often so many packets being 2155 sent and received that the ACK position on the sender does not 2156 advance quickly enough. To mitigate this, after 64 packets (even if 2157 the ACK period has not fully elapsed) the receiver sends a light ACK. 2158 A light ACK is a shorter ACK (SRT header and one 32-bit field). It 2159 does not trigger an ACKACK. 
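The two acknowledgment conditions above (a full ACK every 10 milliseconds and a light ACK after 64 packets) can be sketched as a decision made on each data packet arrival. The assumptions are that the decision is evaluated on packet arrival rather than by an independent timer and that the packet counter is reset by either kind of ACK; AckScheduler is a hypothetical name.

```python
from typing import Optional

ACK_PERIOD_US = 10_000    # full ACK period (SYN), 10 milliseconds
LIGHT_ACK_PKT_COUNT = 64  # packets received before a light ACK is sent


class AckScheduler:
    """Decides which kind of ACK, if any, follows a received packet."""

    def __init__(self, now_us: int = 0) -> None:
        self.last_full_ack_us = now_us
        self.pkts_since_ack = 0

    def on_data_packet(self, now_us: int) -> Optional[str]:
        self.pkts_since_ack += 1
        if now_us - self.last_full_ack_us >= ACK_PERIOD_US:
            # The ACK period has elapsed: send a full ACK.
            self.last_full_ack_us = now_us
            self.pkts_since_ack = 0
            return "FULL_ACK"
        if self.pkts_since_ack >= LIGHT_ACK_PKT_COUNT:
            # Many packets arrived within one period: send a light ACK.
            self.pkts_since_ack = 0
            return "LIGHT_ACK"
        return None
```

With one packet arriving every 100 microseconds, the 64th packet produces a light ACK well before the 10 millisecond period ends.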
2161 When a receiver encounters the situation where the next packet to be 2162 played was not successfully received from the sender, it will "skip" 2163 this packet (see Section 4.6) and send a fake ACK. To the sender, 2164 this fake ACK is a real ACK, and so it just behaves as if the packet 2165 had been received. This facilitates the synchronization between SRT 2166 sender and receiver. The fact that a packet was skipped remains 2167 unknown by the sender. Skipped packets are recorded in the 2168 statistics on the SRT receiver. 2170 4.8.2. Packet Retransmission (NAKs) 2172 The SRT receiver sends NAK control packets to notify the sender about 2173 the missing packets. The NAK packet sending can be triggered 2174 immediately after a gap in sequence numbers of data packets is 2175 detected. 2177 Upon reception of the NAK packet, the SRT sender prioritizes 2178 retransmissions of lost packets over the regular data packets to be 2179 transmitted for the first time. 2181 The SRT sender maintains a list of lost packets (loss list) that is 2182 built from NAK reports. When scheduling packet transmission, it 2183 looks to see if a packet in the loss list has priority and sends it 2184 if so. Otherwise, it sends the next packet scheduled for the first 2185 transmission list. Note that when a packet is transmitted, it stays 2186 in the buffer in case it is not received by the SRT receiver. 2188 NAK packets are processed to fill in the loss list. As the latency 2189 window advances and packets are dropped from the sending queue, a 2190 check is performed to see if any of the dropped or resent packets are 2191 in the loss list, to determine if they can be removed from there as 2192 well so that they are not retransmitted unnecessarily. 2194 There is a counter for the packets that are resent. If there is no 2195 ACK for a packet, it will stay in the loss list and can be resent 2196 more than once. Packets in the loss list are prioritized. 
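The priority of the loss list over first-time transmissions can be sketched as follows. The sketch rests on assumptions not stated in the text: a loss-list entry is removed when it is retransmitted and is added again if a later NAK reports it, an ACK removes all entries below the acknowledged sequence number, and sequence number wraparound is ignored. SendScheduler and its methods are hypothetical names.

```python
from collections import deque
from typing import Deque, Dict, Iterable, Optional, Set, Tuple


class SendScheduler:
    """Chooses the next packet: retransmissions first, then new data."""

    def __init__(self) -> None:
        self.loss_list: Set[int] = set()
        self.new_packets: Deque[int] = deque()
        self.resend_count: Dict[int, int] = {}

    def submit(self, seq: int) -> None:
        """Queue a packet for its first transmission."""
        self.new_packets.append(seq)

    def on_nak(self, lost_seqs: Iterable[int]) -> None:
        """Add the sequence numbers reported by a NAK to the loss list."""
        self.loss_list.update(lost_seqs)

    def on_ack(self, ack_seq: int) -> None:
        """Forget losses below the acknowledged sequence number."""
        self.loss_list = {s for s in self.loss_list if s >= ack_seq}

    def next_packet(self) -> Optional[Tuple[int, bool]]:
        """Return (sequence number, is_retransmission) or None."""
        if self.loss_list:
            seq = min(self.loss_list)
            self.loss_list.remove(seq)
            self.resend_count[seq] = self.resend_count.get(seq, 0) + 1
            return seq, True
        if self.new_packets:
            return self.new_packets.popleft(), False
        return None
```

The resend_count dictionary plays the role of the counter for resent packets; a packet reported by several NAKs is counted once per retransmission.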
2198 If packets in the loss list continue to block the send queue, at some 2199 point this will cause the send queue to fill. When the send queue is 2200 full, the sender will begin to drop packets without even sending them 2201 the first time. An encoder (or other application) may continue to 2202 provide packets, but there's no place for them, so they will end up 2203 being thrown away. 2205 This condition where packets are unsent does not happen often. There 2206 is a maximum number of packets held in the send buffer based on the 2207 configured latency. Older packets that have no chance to be 2208 retransmitted and played in time are dropped, making room for newer 2209 real-time packets produced by the sending application. See 2210 Section 4.5, Section 4.6 for details. 2212 In addition to the regular NAKs, the Periodic NAK report mechanism 2213 can be used to send NAK reports periodically. The NAK packet in that 2214 case will list all the packets that the receiver considers lost 2215 at the time of sending the Periodic NAK report. 2217 SRT Periodic NAK reports are sent with a period of (RTT + 4 * RTTVar) 2218 / 2 (the so-called NAKInterval), with a floor of 20 milliseconds, where RTT 2219 and RTTVar are defined in Section 4.10. A NAK control packet 2220 contains a compressed list of the lost packets. Therefore, only lost 2221 packets are retransmitted. By using NAKInterval for the NAK report 2222 period, it may happen that lost packets are retransmitted more than 2223 once, but it helps maintain low latency in the case where NAK packets 2224 are lost. 2226 An ACKACK tells the receiver to stop sending the ACK position because 2227 the sender already knows it. Otherwise, ACKs (with outdated 2228 information) would continue to be sent regularly. 2230 An ACK serves as a ping, with a corresponding ACKACK pong, to measure 2231 RTT. The time it takes for an ACK to be sent and an ACKACK to be 2232 received is the RTT. Each ACK has a number.
A corresponding ACKACK 2233 has that same number. The receiver keeps a list of all ACKs in a 2234 queue to match them. Unlike a full ACK, which contains the current 2235 RTT and several other values in the Control Information Field (CIF) 2236 (Section 3.2.4), a light ACK just contains the sequence number. All 2237 control messages are sent directly and processed upon reception, but 2238 ACKACK processing time is negligible (the time this takes is included 2239 in the round-trip time). 2241 4.9. Bidirectional Transmission Queues 2243 Once an SRT connection is established, both peers can send data 2244 packets simultaneously. 2246 4.10. Round-Trip Time Estimation 2248 Round-trip time (RTT) in SRT is estimated during the transmission of 2249 data packets based on the difference in time between the moment an ACK packet is 2250 sent out and the moment the corresponding ACKACK packet is received back by the 2251 SRT receiver. 2253 An ACK sent by the receiver triggers an ACKACK from the sender with 2254 minimal processing delay. The ACKACK response is expected to arrive 2255 at the receiver roughly one RTT after the corresponding ACK was sent. 2257 The SRT receiver records the time when an ACK is sent out. The ACK 2258 carries a unique sequence number (independent of the data packet 2259 sequence number). The corresponding ACKACK carries the same 2260 sequence number. Upon receiving the ACKACK, SRT calculates the RTT 2261 as the difference between the ACKACK arrival time and the 2262 ACK departure time. In the following formula, RTT is the current 2263 value that the receiver maintains and rtt is the recent value that 2264 was just calculated from an ACK/ACKACK pair: 2266 RTT = RTT * 0.875 + rtt * 0.125 2268 RTT variance RTTVar is obtained as follows: 2270 RTTVar = RTTVar * 0.75 + abs(RTT - rtt) * 0.25 2272 where abs() denotes the absolute value. 2274 Both RTT and RTTVar are measured in microseconds. The initial value 2275 of RTT is 100 milliseconds, and that of RTTVar is 50 milliseconds.
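The receiver-side estimation above, together with the NAKInterval expression from Section 4.8.2, can be sketched as follows. Two points are assumptions of this sketch rather than statements of the specification: the two smoothing updates are applied in the order in which the text lists them (RTT first, then RTTVar using the updated RTT), and ACK departure times are kept in a dictionary keyed by ACK number. RttEstimator and its method names are hypothetical.

```python
from typing import Dict, Optional

INITIAL_RTT_US = 100_000.0        # 100 milliseconds
INITIAL_RTT_VAR_US = 50_000.0     # 50 milliseconds
NAK_INTERVAL_FLOOR_US = 20_000.0  # 20 milliseconds


class RttEstimator:
    """Smoothed RTT and RTT variance from ACK/ACKACK pairs."""

    def __init__(self) -> None:
        self.rtt_us = INITIAL_RTT_US
        self.rtt_var_us = INITIAL_RTT_VAR_US
        self._ack_sent_at: Dict[int, float] = {}

    def on_ack_sent(self, ack_number: int, now_us: float) -> None:
        """Record the departure time of a full ACK."""
        self._ack_sent_at[ack_number] = now_us

    def on_ackack(self, ack_number: int, now_us: float) -> Optional[float]:
        """Match an ACKACK to its ACK and fold the sample into RTT."""
        sent_at = self._ack_sent_at.pop(ack_number, None)
        if sent_at is None:
            return None  # unknown or already matched ACK number
        sample = now_us - sent_at
        self.update(sample)
        return sample

    def update(self, rtt_sample_us: float) -> None:
        # Assumption: RTT is updated first and RTTVar then uses the
        # updated RTT; the opposite order gives slightly different values.
        self.rtt_us = self.rtt_us * 0.875 + rtt_sample_us * 0.125
        self.rtt_var_us = (self.rtt_var_us * 0.75
                           + abs(self.rtt_us - rtt_sample_us) * 0.25)

    def nak_interval_us(self) -> float:
        """(RTT + 4 * RTTVar) / 2 with a floor of 20 milliseconds."""
        return max((self.rtt_us + 4 * self.rtt_var_us) / 2,
                   NAK_INTERVAL_FLOOR_US)
```

Starting from the initial values, a single 60 millisecond sample moves RTT to 95 milliseconds and RTTVar to 46.25 milliseconds, which gives a NAKInterval of 140 milliseconds.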
2277 The smoothed RTT calculated by the receiver as well as the RTT 2278 variance RTTVar are sent with the next full acknowledgement packet 2279 (see Section 3.2.4). Note that the first ACK in an SRT session might 2280 contain an initial RTT value of 100 milliseconds, because the early 2281 calculations may not be precise. 2283 The sender always gets the RTT from the receiver. It does not have 2284 an analog to the ACK/ACKACK mechanism, i.e. it can not send a message 2285 that guarantees an immediate return without processing. Upon an ACK 2286 reception, the SRT sender updates its own RTT and RTTVar values using 2287 the same formulas as above, in which case rtt is the most recent 2288 value it receives, i.e., carried by an incoming ACK. 2290 Note that an SRT socket can both send and receive data packets. RTT 2291 and RTTVar are updated by the socket based on algorithms for the 2292 sender (using ACK packets) and for the receiver (using ACK-ACKACK 2293 pairs). When an SRT socket receives data, it updates its local RTT 2294 and RTTVar, which can be used for its own sender as well. 2296 4.11. Congestion Control 2298 SRT provides certain mechanisms for the sender to get some feedback 2299 from the receiving side through the ACK packets (Section 3.2.4). 2300 Every 10 ms the sender receives the latest values of RTT and RTT 2301 variance, Available Buffer Size, Packets Receiving Rate and Estimated 2302 Link Capacity. Upon reception of the NAK packet (Section 3.2.5) the 2303 sender can detect packet losses during the transmission. These 2304 mechanisms provide a solid background for various congestion control 2305 algorithms. 2307 Given that SRT can operate in live and file transfer modes, there are 2308 two groups of congestion control algorithms possible. 2310 For live transmission mode (Section 4.2.2) the congestion control 2311 algorithm does not need to control the sending pace of the data 2312 packets, as the sending timing is provided by the live input. 
2313 However, certain limitations on the minimal inter-sending time of 2314 consecutive packets can be applied in order to avoid congestion 2315 during fluctuations of the source bitrate. It is also allowed to 2316 drop those packets that cannot be delivered in time. 2318 For file transfer, any known file congestion control algorithm, such as 2319 CUBIC [RFC8312] or BBR [BBR], can be applied, including the congestion 2320 control mechanism proposed in UDT [GHG04b], [GuAnAO]. The UDT 2321 congestion control relies on the available link capacity, packet loss 2322 reports (NAKs) and packet acknowledgements (ACKs). It then slows down 2323 the output of packets as needed by adjusting the packet sending pace. 2324 In periods of congestion, it can block the main stream and focus on 2325 the lost packets. 2327 5. Encryption 2329 This section describes the encryption mechanism that protects the 2330 payload of SRT streams. Based on standard cryptographic algorithms, 2331 the mechanism provides an efficient stream cipher together with a key 2332 establishment method. 2334 5.1. Overview 2336 SRT implements encryption using AES [AES] in counter mode (AES-CTR) 2337 [SP800-38A] with a short-lived key to encrypt and decrypt the media 2338 stream. The AES-CTR cipher is suitable for continuous stream 2339 encryption that permits decryption from any point, without access to 2340 the start of the stream (random access), and for the same reason 2341 tolerates packet loss. It also offers strong confidentiality when 2342 the counter is managed properly. 2344 5.1.1. Encryption Scope 2346 SRT encrypts only the payload of SRT data packets (Section 3.1), 2347 while the header is left unencrypted. The unencrypted header 2348 contains the Packet Sequence Number field used to keep the 2349 cipher counter synchronized between the encrypting sender 2350 and the decrypting receiver.
No constraints apply to the payload of 2351 SRT data packets as no padding of the payload is required by counter 2352 mode ciphers. 2354 5.1.2. AES Counter 2356 The counter for AES-CTR is the size of the cipher's block, i.e. 128 2357 bits. It is derived from a 128-bit sequence consisting of 2359 * a block counter in the least significant 16 bits, which counts the 2360 blocks in a packet, 2362 * a packet index - based on the packet sequence number in the SRT 2363 header - in the next 32 bits, 2365 * eighty zeroed bits. 2367 The upper 112 bits of this sequence are XORed with an Initialization 2368 Vector (IV) to produce a unique counter for each crypto block. The 2369 IV is derived from the Salt provided in the Keying Material 2370 (Section 3.2.2): 2372 IV = MSB(112, Salt): Most significant 112 bits of the salt. 2374 5.1.3. Stream Encrypting Key (SEK) 2376 The key used for AES-CTR encryption is called the "Stream Encrypting 2377 Key" (SEK). It is used for up to 2^25 packets with further rekeying. 2378 The short-lived SEK is generated by the sender using a pseudo-random 2379 number generator (PRNG), and transmitted within the stream, wrapped 2380 with another longer-term key, the Key Encrypting Key (KEK), using a 2381 known AES key wrap protocol. 2383 For connection-oriented transport such as SRT, there is no need to 2384 periodically transmit the short-lived key since no additional party 2385 can join a stream in progress. The keying material is transmitted 2386 within the connection handshake packets, and for a short period when 2387 rekeying occurs. 2389 5.1.4. Key Encrypting Key (KEK) 2391 The Key Encrypting Key (KEK) is derived from a secret (passphrase) 2392 shared between the sender and the receiver. The KEK provides access 2393 to the Stream Encrypting Key, which in turn provides access to the 2394 protected payload of SRT data packets. The KEK has to be at least as 2395 long as the SEK. 
2397 The KEK is generated by a password-based key generation function 2398 (PBKDF2) [RFC2898], using the passphrase, a number of iterations 2399 (2048), a keyed-hash (HMAC-SHA1) [RFC2104], and a key length value 2400 (KLen). The PBKDF2 function hashes the passphrase to make a long 2401 string, by repetition or padding. The number of iterations is based 2402 on how much time can be given to the process without it becoming 2403 disruptive. 2405 5.1.5. Key Material Exchange 2407 The KEK is used to generate a wrap [RFC3394] that is put in a key 2408 material (KM) message by the initiator of a connection (i.e. caller 2409 in caller-listener handshake and initiator in the rendezvous 2410 handshake, see Section 4.3) to send to the responder (listener). The 2411 KM message contains the key length, the salt (one of the arguments 2412 provided to the PBKDF2 function), the protocol being used (e.g. AES- 2413 256) and the AES counter (which will eventually change, see 2414 Section 5.1.6). 2416 On the other side, the responder attempts to decode the wrap to 2417 obtain the Stream Encrypting Key. In the protocol for the wrap there 2418 is a padding, which is a known template, so the responder knows from 2419 the KM that it has the right KEK to decode the SEK. The SEK 2420 (generated and transmitted by the initiator) is random, and cannot be 2421 known in advance. The KEK formula is calculated on both sides, with 2422 the difference that the responder gets the key length (KLen) from the 2423 initiator via the key material (KM). It is the initiator who decides 2424 on the configured length. The responder obtains it from the material 2425 sent by the initiator. 2427 The responder returns the same KM message to show that it has the 2428 same information as the initiator, and that the encoded material will 2429 be decrypted. If the responder does not return this status, this 2430 means that it does not have the SEK. 
All incoming encrypted packets 2431 received by the responder will be lost (undecrypted). Even if they 2432 are transmitted successfully, the receiver will be unable to decrypt 2433 them, and so packets will be dropped. All data packets coming from 2434 responder will be unencrypted. 2436 5.1.6. KM Refresh 2438 The short lived SEK is regenerated for cryptographic reasons when a 2439 pre-determined number of packets has been encrypted. The KM refresh 2440 period is determined by the implementation. The receiver knows which 2441 SEK (odd or even) was used to encrypt the packet by means of the KK 2442 field of the SRT Data Packet (Section 3.1). 2444 There are two variables used to determine the KM Refresh timing: 2446 * KM Refresh Period specifies the number of packets to be sent 2447 before switching to the new SEK, 2449 * KM Pre-Announcement Period specifies when a new key is announced 2450 in a number of packets before key switchover. The same value is 2451 used to determine when to decommission the old key after 2452 switchover. 2454 The recommended KM Refresh Period is after 2^25 packets encrypted 2455 with the same SEK are sent. The recommended KM Pre-Announcement 2456 Period is 4000 packets (i.e. a new key is generated, wrapped, and 2457 sent at 2^25 minus 4000 packets; the old key is decommissioned at 2458 2^25 plus 4000 packets). 2460 Even and odd keys are alternated during transmission the following 2461 way. The packets with the earlier key #1 (let it be the odd key) 2462 will continue to be sent. The receiver will receive the new key #2 2463 (even), then decrypt and unwrap it. The receiver will reply to the 2464 sender if it is able to understand. Once the sender gets to the 2465 2^25th packet using the odd key (key #1), it will then start to send 2466 packets with the even key (key #2), knowing that the receiver has 2467 what it needs to decrypt them. This happens transparently, from one 2468 packet to the next. 
At 2^25 plus 4000 packets the first key will be 2469 decommissioned automatically. 2471 Both keys live in parallel for two times the Pre-Announcement Period 2472 (e.g. 4000 packets before the key switch, and 4000 packets after). 2473 This is to allow for packet retransmission. It is possible for 2474 packets with the older key to arrive at the receiver a bit late. 2475 Each packet contains a description of which key it requires, so the 2476 receiver will still have the ability to decrypt it. 2478 5.2. Encryption Process 2480 5.2.1. Generating the Stream Encrypting Key 2482 On the sending side, the SEK, Salt and KEK are generated in the following 2483 way: 2485 SEK = PRNG(KLen) 2486 Salt = PRNG(128) 2487 KEK = PBKDF2(passphrase, LSB(64,Salt), Iter, KLen) 2489 where 2491 * PBKDF2 is the PKCS#5 Password Based Key Derivation Function 2492 [RFC2898], 2494 * passphrase is the pre-shared passphrase, 2496 * Salt is the field of the KM message, 2498 * LSB(n, v) is the function taking n least significant bits of v, 2500 * Iter=2048 defines the number of iterations for PBKDF2, 2502 * KLen is the field of the KM message. 2504 Wrap = AESkw(KEK, SEK) 2506 where AESkw(KEK, SEK) is the key wrapping function [RFC3394]. 2508 5.2.2. Encrypting the Payload 2510 The encryption of the payload of the SRT data packet is done with 2511 AES-CTR: 2513 EncryptedPayload = AES_CTR_Encrypt(SEK, IV, UnencryptedPayload) 2515 where the Initialization Vector is derived as 2517 IV = (MSB(112, Salt) << 2) XOR (PktSeqNo) 2519 * PktSeqNo is the value of the Packet Sequence Number field of the 2520 SRT data packet. 2522 5.3. Decryption Process 2523 5.3.1. Restoring the Stream Encrypting Key 2525 For the receiver to be able to decrypt the incoming stream, it has to 2526 know the Stream Encrypting Key (SEK) used by the sender. The 2527 receiver must know the passphrase used by the sender. The remaining 2528 information can be extracted from the Keying Material message.
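Because both peers derive the KEK from the same passphrase and Salt, the derivation of Section 5.2.1 can be sketched once for both sides. The following is a minimal illustration using the Python standard library, not a reference implementation. The function name derive_kek is hypothetical. The sketch assumes that KLen is expressed in bytes (16, 24 or 32), that the passphrase is encoded as UTF-8, and that LSB(64, Salt) corresponds to the last 8 bytes of the 16-byte Salt. None of these three points is stated in this section.

```python
import hashlib

PBKDF2_ITERATIONS = 2048  # Iter, as given in Section 5.2.1


def derive_kek(passphrase: str, salt: bytes, klen: int) -> bytes:
    """KEK = PBKDF2(passphrase, LSB(64, Salt), Iter, KLen) with HMAC-SHA1."""
    if len(salt) != 16:
        raise ValueError("Salt is expected to be 128 bits (16 bytes)")
    if klen not in (16, 24, 32):
        raise ValueError("KLen is assumed to be an AES key length in bytes")
    # Assumption: the 64 least significant bits are the trailing 8 bytes.
    salt_lsb64 = salt[-8:]
    return hashlib.pbkdf2_hmac(
        "sha1",
        passphrase.encode("utf-8"),  # assumption: UTF-8 passphrase encoding
        salt_lsb64,
        PBKDF2_ITERATIONS,
        dklen=klen,
    )
```

A sender and a receiver calling derive_kek with the same passphrase, the Salt from the KM message and the same KLen obtain the same KEK. The AES key wrap and unwrap steps [RFC3394] are left out, since the standard library offers no implementation of them.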
2530 The Keying Material message contains the AES-wrapped [RFC3394] SEK 2531 used by the encoder. The Key-Encryption Key (KEK) required to unwrap 2532 the SEK is calculated as: 2534 KEK = PBKDF2(passphrase, LSB(64,Salt), Iter, KLen) 2536 where 2538 * PBKDF2 is the PKCS#5 Password Based Key Derivation Function 2539 [RFC2898], 2541 * passphrase is the pre-shared passphrase, 2543 * Salt is the field of the KM message, 2545 * LSB(n, v) is the function taking n least significant bits of v, 2547 * Iter=2048 defines the number of iterations for PBKDF2, 2549 * KLen is the field of the KM message. 2551 SEK = AESkuw(KEK, Wrap) 2553 where AESkuw(KEK, Wrap) is the key unwrapping function. 2555 5.3.2. Decrypting the Payload 2557 The decryption of the payload of the SRT data packet is done with 2558 AES-CTR 2560 DecryptedPayload = AES_CTR_Encrypt(SEK, IV, EncryptedPayload) 2562 where the Initialization Vector is derived as 2564 IV = (MSB(112, Salt) << 2) XOR (PktSeqNo) 2566 * PktSeqNo is the value of the Packet Sequence Number field of the 2567 SRT data packet. 2569 6. Security Considerations 2571 SRT supports confidentiality of user data using stream ciphering 2572 based on AES. Session keys for ciphering are delivered through 2573 control packets during handshake, with the protection by Key 2574 Encryption Key, which is generated by a sender and receiver with pre- 2575 shared secret such as passphrase. As in UDT, careful uses of SYN 2576 Cookies may help to deter denial of service attacks. Appropriate 2577 security policy including key size, key refresh period, as well as 2578 passphrase should be managed by security officers, which is out of 2579 scope of the present document. 2581 7. IANA Considerations 2583 This document makes no requests of the IANA. 2585 Contributors 2587 This specification is heavily based on the SRT Protocol Technical 2588 Overview [SRTTO] written by Jean Dube and Steve Matthews. 
2590 In alphabetical order, the contributors to the pre-IETF SRT project 2591 and specification at Haivision are: Marc Cymontkowski, Roman 2592 Diouskine, Jean Dube, Mikolaj Malecki, Steve Matthews, Maria 2593 Sharabayko, Maxim Sharabayko, Adam Yellen. 2595 The contributors to this specification at SK Telecom are Jeongseok 2596 Kim and Joonwoong Kim. 2598 We cannot list all the contributors to the open-sourced 2599 implementation of SRT on GitHub. But we appreciate the help, 2600 contribution, integrations and feedback of the SRT and SRT Alliances 2601 community. 2603 Acknowledgments 2605 The basis of the SRT protocol and its implementation was the UDP- 2606 based Data Transfer Protocol [GHG04b]. The authors thank Yunhong Gu 2607 and Robert Grossman, the authors of the UDP-based Data Transfer 2608 Protocol [GHG04b]. 2610 TODO acknowledge. 2612 References 2614 Normative References 2616 [RFC0768] Postel, J., "User Datagram Protocol", STD 6, RFC 768, 2617 DOI 10.17487/RFC0768, August 1980, 2618 . 2620 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 2621 Requirement Levels", BCP 14, RFC 2119, 2622 DOI 10.17487/RFC2119, March 1997, 2623 . 2625 Informative References 2627 [AES] National Institute of Standards and Technology, "FIPS Pub 2628 197: Advanced Encryption Standard (AES)", November 2001, 2629 . 2632 [AV1] Rivaz, P.d. and J. Haughton, "AV1 Bitstream & Decoding 2633 Process Specification", September 2020, 2634 . 2636 [BBR] Cardwell, N., Cheng, Y., Gunn, C.S., Yeganeh, S.H., and V. 2637 Jacobson, "BBR: Congestion-Based Congestion Control", 2638 October 2016. 2640 [GHG04b] Gu, Y., Hong, X., and R.L. Grossman, "Experiences in 2641 Design and Implementation of a High Performance Transport 2642 Protocol", DOI 10.1109/SC.2004.24, December 2004, 2643 . 2645 [GuAnAO] Gu, Y., Hong, X., and R.L. Grossman, "An Analysis of AIMD 2646 Algorithm with Decreasing Increases", October 2004. 
2648 [H.265] International Telecommunications Union, "H.265 : High 2649 efficiency video coding", ITU-T Recommendation H.265, 2650 2019. 2652 [I-D.ietf-quic-http] 2653 Bishop, M., "Hypertext Transfer Protocol Version 3 2654 (HTTP/3)", Work in Progress, Internet-Draft, draft-ietf- 2655 quic-http-29, 9 June 2020, . 2658 [I-D.ietf-quic-transport] 2659 Iyengar, J. and M. Thomson, "QUIC: A UDP-Based Multiplexed 2660 and Secure Transport", Work in Progress, Internet-Draft, 2661 draft-ietf-quic-transport-29, 9 June 2020, 2662 . 2665 [ISO13818-1] 2666 ISO, "Information technology -- Generic coding of moving 2667 pictures and associated audio information: Systems", ISO/ 2668 IEC 13818-1, September 2020. 2670 [ISO23009] ISO, "Information technology -- Dynamic adaptive streaming 2671 over HTTP (DASH)", ISO/IEC 23009:2019, September 2020. 2673 [PNPID] "PNP ID AND ACPI ID REGISTRY", September 2020, 2674 . 2676 [RFC2104] Krawczyk, H., Bellare, M., and R. Canetti, "HMAC: Keyed- 2677 Hashing for Message Authentication", RFC 2104, 2678 DOI 10.17487/RFC2104, February 1997, 2679 . 2681 [RFC2898] Kaliski, B., "PKCS #5: Password-Based Cryptography 2682 Specification Version 2.0", RFC 2898, 2683 DOI 10.17487/RFC2898, September 2000, 2684 . 2686 [RFC3031] Rosen, E., Viswanathan, A., and R. Callon, "Multiprotocol 2687 Label Switching Architecture", RFC 3031, 2688 DOI 10.17487/RFC3031, January 2001, 2689 . 2691 [RFC3394] Schaad, J. and R. Housley, "Advanced Encryption Standard 2692 (AES) Key Wrap Algorithm", RFC 3394, DOI 10.17487/RFC3394, 2693 September 2002, . 2695 [RFC4987] Eddy, W., "TCP SYN Flooding Attacks and Common 2696 Mitigations", RFC 4987, DOI 10.17487/RFC4987, August 2007, 2697 . 2699 [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2700 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, 2701 May 2017, . 2703 [RFC8216] Pantos, R., Ed. and W. May, "HTTP Live Streaming", 2704 RFC 8216, DOI 10.17487/RFC8216, August 2017, 2705 . 
2707 [RFC8312] Rhee, I., Xu, L., Ha, S., Zimmermann, A., Eggert, L., and 2708 R. Scheffenegger, "CUBIC for Fast Long-Distance Networks", 2709 RFC 8312, DOI 10.17487/RFC8312, February 2018, 2710 . 2712 [RTMP] "Real-Time Messaging Protocol", September 2020, 2713 . 2715 [SP800-38A] 2716 Dworkin, M., "Recommendation for Block Cipher Modes of 2717 Operation", December 2001. 2719 [SRTSRC] "SRT fully functional reference implementation", September 2720 2020, . 2722 [SRTTO] Dube, J. and S. Matthews, "SRT Protocol Technical 2723 Overview", December 2019. 2725 [VP9] WebM, "VP9 Video Codec", September 2020, 2726 . 2728 Appendix A. Packet Sequence List Coding 2730 A single packet sequence number is coded as the original sequence 2731 number in the field. The first bit MUST be "0". 2733 0 1 2 3 2734 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2735 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2736 |0| Sequence Number | 2737 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2739 Figure 18: Single sequence numbers coding 2741 For a range of consecutive packet sequence numbers where the difference 2742 between the last and the first is more than 1, only the first (a) 2743 and the last (b) sequence numbers are recorded in the list field, and 2744 the first bit of a is set to "1". 2746 0 1 2 3 2747 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2748 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2749 |1| Sequence Number a (first) | 2750 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2751 |0| Sequence Number b (last) | 2752 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2754 Figure 19: Range of sequence numbers coding 2756 Appendix B. SRT Access Control 2758 One type of information that can be exchanged when a connection is 2759 being established in SRT is the Stream ID, which can be used in a 2760 caller-listener connection layout.
This is a string of at most 512 2761 characters set on the caller side. It can be retrieved at the 2762 listener side on the newly accepted connection. 2764 An SRT listener can notify an upstream application about the connection 2765 attempt when a HS conclusion arrives, exposing the contents of the 2766 Stream ID extension message. Based on this information, the 2767 application can accept or reject the connection, select the desired 2768 data stream, or set an appropriate passphrase for the connection. 2770 The Stream ID value can be used as free-form, but there is a 2771 recommended convention so that all SRT users speak the same language. 2772 The intent of the convention is to: 2774 * promote readability and consistency among free-form names, 2776 * interpret some typical data in the key-value style. 2778 B.1. General Syntax 2780 This recommended syntax starts with the characters known as an 2781 executable specification in POSIX: #!. 2783 The next two characters are, in order: 2785 : - the format marker; this marks the YAML format, the only one currently used, 2786 and the content format marker, which is either: 2787 : - the comma-separated keys with no nesting, or 2788 { - like above, but nesting is allowed and the content must end with } 2790 (Nesting means that you can have multiple levels of brace-enclosed parts 2791 inside.) 2793 The form of the key-value pair is: 2795 key1=value1,key2=value2... 2797 B.2. Standard Keys 2799 Besides the general syntax, there are several top-level keys treated 2800 as standard keys. All single letter key definitions, including those 2801 not listed in this section, are reserved for future use. Users can 2802 additionally use custom key definitions with user_* or companyname_* 2803 prefixes, where user and companyname are to be replaced with an 2804 actual user or company name. 2806 The existing key values MUST NOT be extended, and MUST NOT differ 2807 from those described in this section.
2809 The following keys are standard: 2811 * u: User Name, or authorization name, that is expected to control 2812 which password should be used for the connection. The application 2813 should interpret it to distinguish which user should be used by 2814 the listener party to set up the password. 2816 * r: Resource Name identifies the name of the resource and 2817 facilitates selection should the listener party be able to serve 2818 multiple resources. 2820 * h: Host Name identifies the hostname of the resource. For 2821 example, to request a stream with the URI somehost.com/videos/querry.php?vid=366 2822 the hostname field should contain somehost.com, 2823 and the resource name can contain videos/querry.php?vid=366 or simply 2824 366. Note that this is still a key to be specified explicitly. 2825 Support tools that apply simplifications and URI extraction are 2826 expected to insert only the host portion of the URI here. 2828 * s: Session ID is a temporary resource identifier negotiated with 2829 the server, used just for verification. This is a one-shot 2830 identifier, invalidated after the first use. The expected usage 2831 is when details for the resource and authorization are negotiated 2832 over a separate connection first, and then the session ID is used 2833 here alone. 2835 * t: Type specifies the purpose of the connection. Several standard 2836 types are defined, but users may extend the use: 2838 - stream (default, if not specified): for exchanging the user-specified 2839 payload for an application-defined purpose, 2841 - file: for transmitting a file, where r is the filename, 2842 - auth: for exchanging sensitive data. The r value states its 2843 purpose. No specific values for it are defined so far 2844 (FUTURE USE). 2846 * m: Mode expected for this connection: 2848 - request (default): the caller wants to receive the stream, 2850 - publish: the caller wants to send the stream data, 2852 - bidirectional: bidirectional data exchange is expected.
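The flat, non-nested form of the recommended syntax and the standard keys listed above can be sketched as a small parser. This is only an illustration of how a listener application might interpret a Stream ID. The function names parse_stream_id and with_defaults are hypothetical, the nested "{" form is deliberately not handled, and the defaults applied are the ones marked "default" for the t and m keys.

```python
from typing import Dict

STREAM_ID_PREFIX = "#!::"  # "#!", YAML format marker, flat content marker
MAX_STREAM_ID_LEN = 512    # maximum Stream ID length in characters


def parse_stream_id(stream_id: str) -> Dict[str, str]:
    """Split a flat '#!::key1=value1,key2=value2' Stream ID into a dict."""
    if len(stream_id) > MAX_STREAM_ID_LEN:
        raise ValueError("Stream ID is longer than 512 characters")
    if not stream_id.startswith(STREAM_ID_PREFIX):
        raise ValueError("Stream ID is not in the flat '#!::' form")
    body = stream_id[len(STREAM_ID_PREFIX):]
    pairs: Dict[str, str] = {}
    if not body:
        return pairs
    for item in body.split(","):
        # Split on the first '=' only, so values such as
        # 'videos/querry.php?vid=366' are kept intact.
        key, sep, value = item.partition("=")
        if not sep or not key:
            raise ValueError("malformed key-value pair: %r" % item)
        pairs[key] = value
    return pairs


def with_defaults(pairs: Dict[str, str]) -> Dict[str, str]:
    """Apply the documented defaults: t=stream and m=request."""
    result = {"t": "stream", "m": "request"}
    result.update(pairs)
    return result
```

For instance, parsing "#!::u=johnny,t=file,m=publish,r=results.csv" yields the four keys u, t, m and r, which the application can then use to accept or reject the connection. A free-form Stream ID that does not follow the convention raises ValueError here and would need separate handling by the application.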
2854 Note that "m" is not required in the case where Stream ID is not used 2855 to distinguish authorization or resources, and the caller is expected 2856 to send the data. This is only for cases where the listener can 2857 handle various purposes of the connection and is therefore required 2858 to know what the caller is attempting to do. 2860 B.3. Examples 2862 The example content of the StreamID is: 2864 #!::u=admin,r=bluesbrothers1_hi 2866 It specifies the username and the resource name of the stream to be 2867 served to the caller. 2869 #!::u=johnny,t=file,m=publish,r=results.csv 2871 This specifies that the file is expected to be transmitted from the 2872 caller to the listener and its name is results.csv. 2874 Appendix C. Changelog 2876 C.1. Since Version 00 2878 * Improved and extended the description of "Encryption" section, 2880 * Improved and extended the description of "Round-Trip Time 2881 Estimation" section, 2883 * Extended the description of "Handshake" section with "Stream ID 2884 Extension Message", "Group Membership Extension" subsections, 2886 * Extended "Handshake Messages" section with the detailed 2887 description of handshake procedure, 2889 * Improved "Key Material" section description, 2890 * Changed packet structure formatting for "Packet Structure" 2891 section, 2893 * Did minor additions to the "Acknowledgement and Lost Packet 2894 Handling" section, 2896 * Fixed broken links, 2898 * Extended the list of references. 2900 Authors' Addresses 2902 Maxim Sharabayko 2903 Haivision Network Video, GmbH 2905 Email: maxsharabayko@haivision.com 2907 Maria Sharabayko 2908 Haivision Network Video, GmbH 2910 Email: msharabayko@haivision.com 2912 Jean Dube 2913 Haivision 2915 Email: jdube@haivision.com 2917 Jeongseok Kim 2918 SK Telecom Co., Ltd. 2920 Email: jeongseok.kim@sk.com 2922 Joonwoong Kim 2923 SK Telecom Co., Ltd. 2925 Email: joonwoong.kim@sk.com