idnits 2.17.1 draft-sharabayko-srt-00.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- No issues found here. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD', or 'RECOMMENDED' is not an accepted usage according to RFC 2119. Please use uppercase 'NOT' together with RFC 2119 keywords (if that is what you mean). Found 'MUST not' in this paragraph: The existing key values MUST not be extended, and MUST not differ from those described in this section. -- The document date (8 March 2021) is 1145 days in the past. Is this intentional? -- Found something which looks like a code comment -- if you have code sections in the document, please surround them with '' and '' lines. Checking references for intended status: Informational ---------------------------------------------------------------------------- == Outdated reference: A later version (-34) exists of draft-ietf-quic-http-33 -- Obsolete informational reference (is this intentional?): RFC 2898 (Obsoleted by RFC 8018) -- Obsolete informational reference (is this intentional?): RFC 6528 (Obsoleted by RFC 9293) -- Obsolete informational reference (is this intentional?): RFC 8312 (Obsoleted by RFC 9438) Summary: 0 errors (**), 0 flaws (~~), 3 warnings (==), 5 comments (--). 
Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Network Working Group M.P. Sharabayko 3 Internet-Draft M.A. Sharabayko 4 Intended status: Informational Haivision Network Video, GmbH 5 Expires: 9 September 2021 J. Dube 6 Haivision Systems, Inc. 7 JS. Kim 8 JW. Kim 9 SK Telecom Co., Ltd. 10 8 March 2021 12 The SRT Protocol 13 draft-sharabayko-srt-00 15 Abstract 17 This document specifies Secure Reliable Transport (SRT) protocol. 18 SRT is a user-level protocol over User Datagram Protocol and provides 19 reliability and security optimized for low latency live video 20 streaming, as well as generic bulk data transfer. For this, SRT 21 introduces control packet extension, improved flow control, enhanced 22 congestion control and a mechanism for data encryption. 24 Status of This Memo 26 This Internet-Draft is submitted in full conformance with the 27 provisions of BCP 78 and BCP 79. 29 Internet-Drafts are working documents of the Internet Engineering 30 Task Force (IETF). Note that other groups may also distribute 31 working documents as Internet-Drafts. The list of current Internet- 32 Drafts is at https://datatracker.ietf.org/drafts/current/. 34 Internet-Drafts are draft documents valid for a maximum of six months 35 and may be updated, replaced, or obsoleted by other documents at any 36 time. It is inappropriate to use Internet-Drafts as reference 37 material or to cite them other than as "work in progress." 39 This Internet-Draft will expire on 9 September 2021. 41 Copyright Notice 43 Copyright (c) 2021 IETF Trust and the persons identified as the 44 document authors. All rights reserved. 46 This document is subject to BCP 78 and the IETF Trust's Legal 47 Provisions Relating to IETF Documents (https://trustee.ietf.org/ 48 license-info) in effect on the date of publication of this document. 
49 Please review these documents carefully, as they describe your rights 50 and restrictions with respect to this document. Code Components 51 extracted from this document must include Simplified BSD License text 52 as described in Section 4.e of the Trust Legal Provisions and are 53 provided without warranty as described in the Simplified BSD License.

55 Table of Contents

57 1. Introduction
58   1.1. Motivation
59   1.2. Secure Reliable Transport Protocol
60 2. Terms and Definitions
61 3. Packet Structure
62   3.1. Data Packets
63   3.2. Control Packets
64     3.2.1. Handshake
65     3.2.2. Key Material
66     3.2.3. Keep-Alive
67     3.2.4. ACK (Acknowledgment)
68     3.2.5. NAK (Loss Report)
69     3.2.6. Congestion Warning
70     3.2.7. Shutdown
71     3.2.8. ACKACK
72     3.2.9. Message Drop Request
73     3.2.10. Peer Error
74 4. SRT Data Transmission and Control
75   4.1. Stream Multiplexing
76   4.2. Data Transmission Modes
77     4.2.1. Message Mode
78     4.2.2. Buffer Mode
79   4.3. Handshake Messages
80     4.3.1. Caller-Listener Handshake
81     4.3.2. Rendezvous Handshake
82   4.4. SRT Buffer Latency
83   4.5. Timestamp-Based Packet Delivery
84     4.5.1. Packet Delivery Time
85   4.6. Too-Late Packet Drop
86   4.7. Drift Management
87   4.8. Acknowledgement and Lost Packet Handling
88     4.8.1. Packet Acknowledgement (ACKs, ACKACKs)
89     4.8.2. Packet Retransmission (NAKs)
90   4.9. Bidirectional Transmission Queues
91   4.10. Round-Trip Time Estimation
92 5. SRT Packet Pacing and Congestion Control
93   5.1. SRT Packet Pacing and Live Congestion Control (LiveCC)
94     5.1.1. Configuring Maximum Bandwidth
95     5.1.2. SRT's Default LiveCC Algorithm
96   5.2. File Transfer Congestion Control (FileCC)
97     5.2.1. SRT's Default FileCC Algorithm
98 6. Encryption
99   6.1. Overview
100     6.1.1. Encryption Scope
101     6.1.2. AES Counter
102     6.1.3. Stream Encrypting Key (SEK)
103     6.1.4. Key Encrypting Key (KEK)
104     6.1.5. Key Material Exchange
105     6.1.6. KM Refresh
106   6.2. Encryption Process
107     6.2.1. Generating the Stream Encrypting Key
108     6.2.2. Encrypting the Payload
109   6.3. Decryption Process
110     6.3.1. Restoring the Stream Encrypting Key
111     6.3.2. Decrypting the Payload
112 7. Best Practices and Configuration Tips for Data Transmission via
113    SRT
114   7.1. Live Streaming
115   7.2. File Transmission
116     7.2.1. File Transmission in Buffer Mode
117     7.2.2. File Transmission in Message Mode
118 8. Security Considerations
119 9. IANA Considerations
120 Contributors
121 Acknowledgments
122 References
123   Normative References
124   Informative References
125 Appendix A. Packet Sequence List Coding
126 Appendix B. SRT Access Control
127   B.1. General Syntax
128   B.2. Standard Keys
129   B.3. Examples
130 Appendix C. Changelog
131   C.1. Since Version 00
132   C.2. Since Version 01
133 Authors' Addresses

135 1. Introduction
136 1.1. Motivation

138 The demand for live video streaming has been increasing steadily for 139 many years. With the emergence of cloud technologies, many video 140 processing pipeline components have transitioned from on-premises 141 appliances to software running on cloud instances.
While real-time 142 streaming over TCP-based protocols like RTMP [RTMP] is possible at 143 low bitrates and on a small scale, the exponential growth of the 144 streaming market has created a need for more powerful solutions. 146 To improve scalability on the delivery side, content delivery 147 networks (CDNs) at one point transitioned to segmentation-based 148 technologies like HLS (HTTP Live Streaming) [RFC8216] and DASH 149 (Dynamic Adaptive Streaming over HTTP) [ISO23009]. This move 150 increased the end-to-end latency of live streaming to over a few tens 151 of seconds, which makes it unattractive for specific use cases where 152 real-time delivery is important. Over time, the industry optimized these 153 delivery methods, bringing the latency down to a few seconds. 155 While the delivery side scaled up, improvements to video transcoding 156 became a necessity. Viewers watch video streams on a variety of 157 different devices, connected over different types of networks. Since 158 upload bandwidth from on-premises locations is often limited, video 159 transcoding moved to the cloud. 161 RTMP became the de facto standard for contribution over the public 162 Internet. However, the payload that can be 163 transmitted is limited, since RTMP, as a media-specific protocol, supports only 164 two audio channels and a restricted set of audio and video codecs, 165 lacking support for newer formats such as HEVC [H.265], VP9 [VP9], or 166 AV1 [AV1]. 168 Since RTMP, HLS and DASH rely on TCP, these protocols can only 169 guarantee acceptable reliability over connections with low RTTs, and 170 cannot use the bandwidth of network connections to their full extent 171 due to limitations imposed by congestion control. Notably, QUIC 172 [I-D.ietf-quic-transport] has been designed to address these problems 173 with HTTP-based delivery protocols in HTTP/3 [I-D.ietf-quic-http].
174 Like QUIC, SRT [SRTSRC] uses UDP instead of the TCP transport 175 protocol, but assures more reliable delivery using Automatic Repeat 176 Request (ARQ), packet acknowledgments, end-to-end latency management, 177 etc. 179 1.2. Secure Reliable Transport Protocol 181 Low latency video transmissions across reliable (usually local) IP 182 based networks typically take the form of MPEG-TS [ISO13818-1] 183 unicast or multicast streams using the UDP/RTP protocol, where any 184 packet loss can be mitigated by enabling forward error correction 185 (FEC). Achieving the same low latency between sites in different 186 cities, countries or even continents is more challenging. While it 187 is possible with satellite links or dedicated MPLS [RFC3031] 188 networks, these are expensive solutions. The use of public Internet 189 connectivity, while less expensive, imposes significant bandwidth 190 overhead to achieve the necessary level of packet loss recovery. 191 Introducing selective packet retransmission (reliable UDP) to recover 192 from packet loss removes those limitations. 194 Derived from the UDP-based Data Transfer (UDT) protocol [GHG04b], SRT 195 is a user-level protocol that retains most of the core concepts and 196 mechanisms while introducing several refinements and enhancements, 197 including control packet modifications, improved flow control for 198 handling live streaming, enhanced congestion control, and a mechanism 199 for encrypting packets. 201 SRT is a transport protocol that enables the secure, reliable 202 transport of data across unpredictable networks, such as the 203 Internet. While any data type can be transferred via SRT, it is 204 ideal for low latency (sub-second) video streaming. SRT provides 205 improved bandwidth utilization compared to RTMP, allowing much higher 206 contribution bitrates over long distance connections. 
208 As packets are streamed from source to destination, SRT detects and 209 adapts to the real-time network conditions between the two endpoints, 210 and helps compensate for jitter and bandwidth fluctuations due to 211 congestion over noisy networks. Its error recovery mechanism 212 minimizes the packet loss typical of Internet connections. 214 To achieve low latency streaming, SRT had to address timing issues. 215 The characteristics of a stream from a source network are completely 216 changed by transmission over the public Internet, which introduces 217 delays, jitter, and packet loss. This, in turn, leads to problems 218 with decoding, as the audio and video decoders do not receive packets 219 at the expected times. The use of large buffers helps, but latency 220 is increased. SRT includes a mechanism to keep a constant end-to-end 221 latency, thus recreating the signal characteristics on the receiver 222 side, and reducing the need for buffering. 224 Like TCP, SRT employs a listener/caller model. The data flow is bi- 225 directional and independent of the connection initiation - either the 226 sender or receiver can operate as listener or caller to initiate a 227 connection. The protocol provides an internal multiplexing 228 mechanism, allowing multiple SRT connections to share the same UDP 229 port, providing access control functionality to identify the caller 230 on the listener side. 232 Supporting forward error correction (FEC) and selective packet 233 retransmission (ARQ), SRT provides the flexibility to use either of 234 the two mechanisms or both combined, allowing for use cases ranging 235 from the lowest possible latency to the highest possible reliability. 237 SRT maintains the ability for fast file transfers introduced in UDT, 238 and adds support for AES encryption. 240 2. 
Terms and Definitions 242 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 243 "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and 244 "OPTIONAL" in this document are to be interpreted as described in BCP 245 14 [RFC2119] [RFC8174] when, and only when, they appear in all 246 capitals, as shown here. 248 SRT: The Secure Reliable Transport protocol described by this 249 document. 251 PRNG: Pseudo-Random Number Generator. 253 3. Packet Structure 255 SRT packets are transmitted as UDP payload [RFC0768]. Every UDP 256 packet carrying SRT traffic contains an SRT header immediately after 257 the UDP header (Figure 1). 259 0 1 2 3 260 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 261 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 262 | SrcPort | DstPort | 263 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 264 | Len | ChkSum | 265 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 266 | | 267 + SRT Packet + 268 | | 269 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 271 Figure 1: SRT packet as UDP payload 273 SRT has two types of packets distinguished by the Packet Type Flag: 274 data packet and control packet. 276 The structure of the SRT packet is shown in Figure 2. 
278 0 1 2 3 279 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 280 +-+-+-+-+-+-+-+-+-+-+-+-+- SRT Header +-+-+-+-+-+-+-+-+-+-+-+-+-+ 281 |F| (Field meaning depends on the packet type) | 282 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 283 | (Field meaning depends on the packet type) | 284 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 285 | Timestamp | 286 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 287 | Destination Socket ID | 288 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 289 | | 290 + Packet Contents | 291 | (depends on the packet type) + 292 | | 293 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 295 Figure 2: SRT packet structure 297 F: 1 bit. Packet Type Flag. The control packet has this flag set to 298 "1". The data packet has this flag set to "0". 300 Timestamp: 32 bits. The timestamp of the packet, in microseconds. 301 The value is relative to the time the SRT connection was 302 established. Depending on the transmission mode (Section 4.2), 303 the field stores the packet send time or the packet origin time. 305 Destination Socket ID: 32 bits. A fixed-width field providing the 306 SRT socket ID to which a packet should be dispatched. The field 307 may have the special value "0" when the packet is a connection 308 request. 310 3.1. Data Packets 312 The structure of the SRT data packet is shown in Figure 3. 
314 0 1 2 3 315 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 316 +-+-+-+-+-+-+-+-+-+-+-+-+- SRT Header +-+-+-+-+-+-+-+-+-+-+-+-+-+ 317 |0| Packet Sequence Number | 318 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 319 |P P|O|K K|R| Message Number | 320 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 321 | Timestamp | 322 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 323 | Destination Socket ID | 324 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 325 | | 326 + Data + 327 | | 328 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 330 Figure 3: Data packet structure 332 Packet Sequence Number: 31 bits. The sequential number of the data 333 packet. 335 PP: 2 bits. Packet Position Flag. This field indicates the position 336 of the data packet in the message. The value "10b" (binary) means 337 the first packet of the message. "00b" indicates a packet in the 338 middle. "01b" designates the last packet. If a single data packet 339 forms the whole message, the value is "11b". 341 O: 1 bit. Order Flag. Indicates whether the message should be 342 delivered by the receiver in order (1) or not (0). Certain 343 restrictions apply depending on the data transmission mode used 344 (Section 4.2). 346 KK: 2 bits. Key-based Encryption Flag. The flag bits indicate 347 whether or not data is encrypted. The value "00b" (binary) means 348 data is not encrypted. "01b" indicates that data is encrypted with 349 an even key, and "10b" is used for odd key encryption. Refer to 350 Section 6. The value "11b" is only used in control packets. 352 R: 1 bit. Retransmitted Packet Flag. This flag is clear when a 353 packet is transmitted the first time. The flag is set to "1" when 354 a packet is retransmitted. 356 Message Number: 26 bits. The sequential number of consecutive data 357 packets that form a message (see PP field). 359 Timestamp: 32 bits. See Section 3. 
361 Destination Socket ID: 32 bits. See Section 3. 363 Data: variable length. The payload of the data packet. The length 364 of the data is the remaining length of the UDP packet. 366 3.2. Control Packets 368 An SRT control packet has the following structure. 370 0 1 2 3 371 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 372 +-+-+-+-+-+-+-+-+-+-+-+-+- SRT Header +-+-+-+-+-+-+-+-+-+-+-+-+-+ 373 |1| Control Type | Subtype | 374 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 375 | Type-specific Information | 376 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 377 | Timestamp | 378 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 379 | Destination Socket ID | 380 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+- CIF -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 381 | | 382 + Control Information Field + 383 | | 384 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 386 Figure 4: Control packet structure 388 Control Type: 15 bits. Control Packet Type. The use of these bits 389 is determined by the control packet type definition. See Table 1. 391 Subtype: 16 bits. This field specifies an additional subtype for 392 specific packets. See Table 1. 394 Type-specific Information: 32 bits. The use of this field depends on 395 the particular control packet type. Handshake packets do not use 396 this field. 398 Timestamp: 32 bits. See Section 3. 400 Destination Socket ID: 32 bits. See Section 3. 402 Control Information Field (CIF): variable length. The use of this 403 field is defined by the Control Type field of the control packet. 405 The types of SRT control packets are shown in Table 1. The value 406 "0x7FFF" is reserved for a user-defined type. 
408 +====================+==============+=========+================+ 409 | Packet Type | Control Type | Subtype | Section | 410 +====================+==============+=========+================+ 411 | HANDSHAKE | 0x0000 | 0x0 | Section 3.2.1 | 412 +--------------------+--------------+---------+----------------+ 413 | KEEPALIVE | 0x0001 | 0x0 | Section 3.2.3 | 414 +--------------------+--------------+---------+----------------+ 415 | ACK | 0x0002 | 0x0 | Section 3.2.4 | 416 +--------------------+--------------+---------+----------------+ 417 | NAK (Loss Report) | 0x0003 | 0x0 | Section 3.2.5 | 418 +--------------------+--------------+---------+----------------+ 419 | Congestion Warning | 0x0004 | 0x0 | Section 3.2.6 | 420 +--------------------+--------------+---------+----------------+ 421 | SHUTDOWN | 0x0005 | 0x0 | Section 3.2.7 | 422 +--------------------+--------------+---------+----------------+ 423 | ACKACK | 0x0006 | 0x0 | Section 3.2.8 | 424 +--------------------+--------------+---------+----------------+ 425 | DROPREQ | 0x0007 | 0x0 | Section 3.2.9 | 426 +--------------------+--------------+---------+----------------+ 427 | PEERERROR | 0x0008 | 0x0 | Section 3.2.10 | 428 +--------------------+--------------+---------+----------------+ 429 | User-Defined Type | 0x7FFF | - | N/A | 430 +--------------------+--------------+---------+----------------+ 432 Table 1: SRT Control Packet Types 434 3.2.1. Handshake 436 Handshake control packets (Control Type = 0x0000) are used to 437 exchange peer configurations, to agree on connection parameters, and 438 to establish a connection. 440 The Control Information Field (CIF) of a handshake control packet is 441 shown in Figure 5. 
443 0 1 2 3 444 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 445 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 446 | Version | 447 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 448 | Encryption Field | Extension Field | 449 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 450 | Initial Packet Sequence Number | 451 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 452 | Maximum Transmission Unit Size | 453 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 454 | Maximum Flow Window Size | 455 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 456 | Handshake Type | 457 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 458 | SRT Socket ID | 459 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 460 | SYN Cookie | 461 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 462 | | 463 + + 464 | | 465 + Peer IP Address + 466 | | 467 + + 468 | | 469 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ 470 | Extension Type | Extension Length | 471 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 472 | | 473 + Extension Contents + 474 | | 475 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ 477 Figure 5: Handshake packet structure 479 Version: 32 bits. A base protocol version number. Currently used 480 values are 4 and 5. Values greater than 5 are reserved for future 481 use. 483 Encryption Field: 16 bits. Block cipher family and key size. The 484 values of this field are described in Table 2. The default value 485 is AES-128. 
487 +=======+============================+ 488 | Value | Cipher family and key size | 489 +=======+============================+ 490 | 0 | No Encryption Advertised | 491 +-------+----------------------------+ 492 | 2 | AES-128 | 493 +-------+----------------------------+ 494 | 3 | AES-192 | 495 +-------+----------------------------+ 496 | 4 | AES-256 | 497 +-------+----------------------------+ 499 Table 2: Handshake Encryption 500 Field Values 502 Extension Field: 16 bits. This field is a message-specific extension 503 related to the Handshake Type field. The value MUST be set to 0 504 except for the following cases. (1) If the handshake control 505 packet is the INDUCTION message, this field is sent back by the 506 Listener. (2) In the case of a CONCLUSION message, this field 507 value should contain a combination of Extension Type values. For 508 more details, see Section 4.3.1. 510 +============+========+ 511 | Bitmask | Flag | 512 +============+========+ 513 | 0x00000001 | HSREQ | 514 +------------+--------+ 515 | 0x00000002 | KMREQ | 516 +------------+--------+ 517 | 0x00000004 | CONFIG | 518 +------------+--------+ 520 Table 3: Handshake 521 Extension Flags 523 Initial Packet Sequence Number: 32 bits. The sequence number of the 524 very first data packet to be sent. 526 Maximum Transmission Unit Size: 32 bits. This value is typically set 527 to 1500, which is the default Maximum Transmission Unit (MTU) size 528 for Ethernet, but can be less. 530 Maximum Flow Window Size: 32 bits. The value of this field is the 531 maximum number of data packets allowed to be "in flight" (i.e., the 532 number of sent packets for which an ACK control packet has not yet 533 been received). 535 Handshake Type: 32 bits. This field indicates the handshake packet 536 type. The possible values are described in Table 4. For more 537 details refer to Section 4.3.
539 +============+================+ 540 | Value | Handshake type | 541 +============+================+ 542 | 0xFFFFFFFD | DONE | 543 +------------+----------------+ 544 | 0xFFFFFFFE | AGREEMENT | 545 +------------+----------------+ 546 | 0xFFFFFFFF | CONCLUSION | 547 +------------+----------------+ 548 | 0x00000000 | WAVEHAND | 549 +------------+----------------+ 550 | 0x00000001 | INDUCTION | 551 +------------+----------------+ 553 Table 4: Handshake Type 555 SRT Socket ID: 32 bits. This field holds the ID of the source SRT 556 socket from which a handshake packet is issued. 558 SYN Cookie: 32 bits. Randomized value for processing a handshake. 559 The value of this field is specified by the handshake message 560 type. See Section 4.3. 562 Peer IP Address: 128 bits. IPv4 or IPv6 address of the packet's 563 sender. The value consists of four 32-bit fields. In the case of 564 IPv4 addresses, fields 2, 3 and 4 are filled with zeroes. 566 Extension Type: 16 bits. The value of this field is used to process 567 an integrated handshake. Each extension can have a pair of 568 request and response types. 
570 +=======+====================+===================+ 571 | Value | Extension Type | HS Extension Flag | 572 +=======+====================+===================+ 573 | 1 | SRT_CMD_HSREQ | HSREQ | 574 +-------+--------------------+-------------------+ 575 | 2 | SRT_CMD_HSRSP | HSREQ | 576 +-------+--------------------+-------------------+ 577 | 3 | SRT_CMD_KMREQ | KMREQ | 578 +-------+--------------------+-------------------+ 579 | 4 | SRT_CMD_KMRSP | KMREQ | 580 +-------+--------------------+-------------------+ 581 | 5 | SRT_CMD_SID | CONFIG | 582 +-------+--------------------+-------------------+ 583 | 6 | SRT_CMD_CONGESTION | CONFIG | 584 +-------+--------------------+-------------------+ 585 | 7 | SRT_CMD_FILTER | CONFIG | 586 +-------+--------------------+-------------------+ 587 | 8 | SRT_CMD_GROUP | CONFIG | 588 +-------+--------------------+-------------------+ 590 Table 5: Handshake Extension Type values 592 Extension Length: 16 bits. The length of the Extension Contents 593 field in four-byte blocks. 595 Extension Contents: variable length. The payload of the extension. 597 3.2.1.1. Handshake Extension Message 599 In a Handshake Extension, the value of the Extension Type field of the 600 handshake control packet is defined as 1 for a Handshake Extension 601 request (SRT_CMD_HSREQ in Table 5), and 2 for a Handshake Extension 602 response (SRT_CMD_HSRSP in Table 5).
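The extension block layout described above (a 16-bit Extension Type, a 16-bit Extension Length counted in four-byte blocks, then the Extension Contents) can be illustrated with a minimal Python sketch that walks the extension blocks following the fixed part of a handshake CIF. The names parse_hs_extensions and EXT_TYPES are hypothetical and are not defined by this document, and the sketch assumes that the two 16-bit fields are carried in network byte order.

```python
import struct

# Extension Type values as listed in Table 5.
EXT_TYPES = {
    1: "SRT_CMD_HSREQ",
    2: "SRT_CMD_HSRSP",
    3: "SRT_CMD_KMREQ",
    4: "SRT_CMD_KMRSP",
    5: "SRT_CMD_SID",
    6: "SRT_CMD_CONGESTION",
    7: "SRT_CMD_FILTER",
    8: "SRT_CMD_GROUP",
}


def parse_hs_extensions(buf: bytes):
    """Split a run of extension blocks into (name, contents) pairs.

    Assumption (not stated in this section): the Extension Type and
    Extension Length fields are big-endian (network byte order).
    """
    blocks = []
    offset = 0
    while offset < len(buf):
        if len(buf) - offset < 4:
            raise ValueError("truncated extension header")
        ext_type, ext_len = struct.unpack_from("!HH", buf, offset)
        offset += 4
        size = ext_len * 4  # Extension Length counts four-byte blocks
        if len(buf) - offset < size:
            raise ValueError("truncated extension contents")
        name = EXT_TYPES.get(ext_type, f"UNKNOWN({ext_type})")
        blocks.append((name, buf[offset:offset + size]))
        offset += size
    return blocks
```

With this sketch, a buffer holding a type 5 block with Extension Length 1 followed by four content bytes yields a single SRT_CMD_SID entry carrying those four bytes. A declared length that runs past the end of the buffer is rejected rather than silently truncated.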
604 The Extension Contents field of a Handshake Extension Message is 605 structured as follows: 607 0 1 2 3 608 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 609 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 610 | SRT Version | 611 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 612 | SRT Flags | 613 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 614 | Receiver TSBPD Delay | Sender TSBPD Delay | 615 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 617 Figure 6: Handshake Extension Message structure 619 SRT Version: 32 bits. SRT library version MUST be formed as major * 620 0x10000 + minor * 0x100 + patch. 622 SRT Flags: 32 bits. SRT configuration flags (see Section 3.2.1.1.1). 624 Receiver TSBPD Delay: 16 bits. Timestamp-Based Packet Delivery 625 (TSBPD) Delay of the receiver. Refer to Section 4.5. 627 Sender TSBPD Delay: 16 bits. TSBPD of the sender. Refer to 628 Section 4.5. 630 3.2.1.1.1. Handshake Extension Message Flags 632 +============+===============+ 633 | Bitmask | Flag | 634 +============+===============+ 635 | 0x00000001 | TSBPDSND | 636 +------------+---------------+ 637 | 0x00000002 | TSBPDRCV | 638 +------------+---------------+ 639 | 0x00000004 | CRYPT | 640 +------------+---------------+ 641 | 0x00000008 | TLPKTDROP | 642 +------------+---------------+ 643 | 0x00000010 | PERIODICNAK | 644 +------------+---------------+ 645 | 0x00000020 | REXMITFLG | 646 +------------+---------------+ 647 | 0x00000040 | STREAM | 648 +------------+---------------+ 649 | 0x00000080 | PACKET_FILTER | 650 +------------+---------------+ 652 Table 6: Handshake 653 Extension Message Flags 655 * TSBPDSND flag defines if the TSBPD mechanism (Section 4.5) will be 656 used for sending. 658 * TSBPDRCV flag defines if the TSBPD mechanism (Section 4.5) will be 659 used for receiving. 661 * CRYPT flag MUST be set. 
It is a legacy flag that indicates the 662 peer understands the KK field of the SRT data packet (Figure 3). 664 * TLPKTDROP flag should be set if the too-late packet drop mechanism 665 will be used during transmission. See Section 4.6. 667 * PERIODICNAK flag, when set, indicates that the peer will send periodic NAK 668 packets. See Section 4.8.2. 670 * REXMITFLG flag MUST be set. It is a legacy flag that indicates 671 the peer understands the R field of the SRT data packet 672 (Figure 3). 674 * STREAM flag identifies the transmission mode (Section 4.2) to be 675 used in the connection. If the flag is set, the buffer mode 676 (Section 4.2.2) is used. Otherwise, the message mode 677 (Section 4.2.1) is used. 679 * PACKET_FILTER flag indicates whether the peer supports packet filtering. 681 3.2.1.2. Key Material Extension Message 683 If an encrypted connection is being established, the Key Material 684 (KM) is first transmitted as a Handshake Extension message. This 685 extension is not supplied for unprotected connections. The purpose 686 of the extension is to let peers exchange and negotiate 687 encryption-related information to be used to encrypt and decrypt the payload of 688 the stream. 690 The extension can be supplied with the Handshake Extension Type field 691 set to either SRT_CMD_KMREQ or SRT_CMD_KMRSP (see Table 5 in 692 Section 3.2.1). For more details refer to Section 4.3. 694 The KM message is placed in the Extension Contents. See 695 Section 3.2.2 for the structure of the KM message. 697 3.2.1.3. Stream ID Extension Message 699 The Stream ID handshake extension message can be used to identify the 700 stream content. The Stream ID value can be free-form, but there is 701 also a recommended convention that can be used to achieve 702 interoperability. 704 The Stream ID handshake extension message has the SRT_CMD_SID extension 705 type (see Table 5). The extension contents are a sequence of UTF-8 706 characters. The maximum allowed size of the Stream ID extension is 707 512 bytes.
709 0 1 2 3 710 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 711 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 712 | | 713 | Stream ID | 714 ... 715 | | 716 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 718 Figure 7: Stream ID Extension Message 720 The Extension Contents field holds a sequence of UTF-8 characters 721 (see Figure 7). The maximum allowed size of the StreamID extension 722 is 512 bytes. The actual size is determined by the Extension Length 723 field (Figure 5), which defines the length in four byte blocks. If 724 the actual payload is less than the declared length, the remaining 725 bytes are set to zeros. 727 The content is stored as 32-bit little endian words. 729 3.2.1.4. Group Membership Extension 731 The Group Membership handshake extension is reserved for the future 732 and is going to be used to allow multipath SRT connections. 734 0 1 2 3 735 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 736 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 737 | Group ID | 738 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 739 | Type | Flags | Weight | 740 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 742 Figure 8: Group Membership Extension Message 744 GroupID: 32 bits. The identifier of a group whose members include 745 the sender socket that is making a connection. The target socket 746 that is interpreting GroupID SHOULD belong to the corresponding 747 group on the target side. If such a group does not exist, the 748 target socket MAY create it. 750 Type: 8 bits. Group type, as per SRT_GTYPE_ enumeration: 752 * 0: undefined group type, 754 * 1: broadcast group type, 756 * 2: main/backup group type, 757 * 3: balancing group type, 759 * 4: multicast group type (reserved for future use). 761 Flags: 8 bits. Special flags mostly reserved for the future. See 762 Figure 9. 764 Weight: 16 bits. 
Special value with interpretation depending on the
      Type field value:

      *  Not used with broadcast group type,

      *  Defines the link priority for main/backup group type,

      *  Not yet defined for any other cases (reserved for future use).

    0 1 2 3 4 5 6 7
   +-+-+-+-+-+-+-+-+
   |   (zero)    |M|
   +-+-+-+-+-+-+-+-+

             Figure 9: Group Membership Extension Flags

   M:  1 bit.  When set, defines synchronization on message numbers,
      otherwise transmission is synchronized on sequence numbers.

3.2.2.  Key Material

   The purpose of the Key Material Message is to let peers exchange
   encryption-related information to be used to encrypt and decrypt the
   payload of the stream.

   This message can be supplied in two possible ways:

   *  as a Handshake Extension (see Section 3.2.1.2)

   *  in the Control Information Field of the User-Defined control
      packet (described below).

   When the Key Material is transmitted as a control packet, the Control
   Type field of the SRT packet header is set to User-Defined Type (see
   Table 1), and the Subtype field of the header is set to SRT_CMD_KMREQ
   for a key-refresh request and SRT_CMD_KMRSP for a key-refresh
   response (Table 5).  The KM Refresh mechanism is described in
   Section 6.1.6.

   The structure of the Key Material message is illustrated in
   Figure 10.
805 0 1 2 3 806 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 807 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 808 |S| V | PT | Sign | Resv1 | KK| 809 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 810 | KEKI | 811 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 812 | Cipher | Auth | SE | Resv2 | 813 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 814 | Resv3 | SLen/4 | KLen/4 | 815 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 816 | Salt | 817 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 818 | | 819 + Wrapped Key + 820 | | 821 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 823 Figure 10: Key Material Message structure 825 S: 1 bit, value = {0}. This is a fixed-width field that is reserved 826 for future usage. 828 Version (V): 3 bits, value = {1}. This is a fixed-width field that 829 indicates the SRT version: 831 * 1: Initial version. 833 Packet Type (PT): 4 bits, value = {2}. This is a fixed-width field 834 that indicates the Packet Type: 836 * 0: Reserved 838 * 1: Media Stream Message (MSmsg) 840 * 2: Keying Material Message (KMmsg) 842 * 7: Reserved to discriminate MPEG-TS packet (0x47=sync byte). 844 Sign: 16 bits, value = {0x2029}. This is a fixed-width field that 845 contains the signature 'HAI' encoded as a PnP Vendor ID [PNPID] 846 (in big-endian order). 848 Resv1: 6 bits, value = {0}. This is a fixed-width field reserved for 849 flag extension or other usage. 851 Key-based Encryption (KK): 2 bits. This is a fixed-width field that 852 indicates which SEKs (odd and/or even) are provided in the 853 extension: 855 * 00b: No SEK is provided (invalid extension format); 857 * 01b: Even key is provided; 859 * 10b: Odd key is provided; 861 * 11b: Both even and odd keys are provided. 863 Key Encryption Key Index (KEKI): 32 bits, value = {0}. 
This is a
      fixed-width field specifying the index (in big-endian order) of
      the KEK that was used to wrap (and optionally authenticate) the
      SEK(s).  The value 0 is used to indicate the default key of the
      current stream.  Other values are reserved for the possible use
      of a key management system in the future to retrieve a
      cryptographic context.

      *  0: Default stream associated key (stream/system default)

      *  1..255: Reserved for manually indexed keys.

   Cipher:  8 bits, value = {0..2}.  This is a fixed-width field for
      specifying the encryption cipher and mode:

      *  0: None or KEKI indexed crypto context

      *  2: AES-CTR [SP800-38A].

   Authentication (Auth):  8 bits, value = {0}.  This is a fixed-width
      field for specifying a message authentication code algorithm:

      *  0: None or KEKI indexed crypto context.

   Stream Encapsulation (SE):  8 bits, value = {2}.  This is a fixed-
      width field for describing the stream encapsulation:

      *  0: Unspecified or KEKI indexed crypto context

      *  1: MPEG-TS/UDP

      *  2: MPEG-TS/SRT.

   Resv2:  8 bits, value = {0}.  This is a fixed-width field reserved for
      future use.

   Resv3:  16 bits, value = {0}.  This is a fixed-width field reserved
      for future use.

   SLen/4:  8 bits, value = {4}.  This is a fixed-width field for
      specifying the salt length SLen in bytes divided by 4.  It can be
      zero if no salt/IV is present.  The only valid salt length
      defined is 128 bits.

   KLen/4:  8 bits, value = {4,6,8}.  This is a fixed-width field for
      specifying the SEK length in bytes divided by 4.  This is the
      size of one key even if two keys are present.  It MUST match the
      key size specified in the Encryption Field of the handshake
      packet (Table 2).

   Salt (SLen):  SLen * 8 bits, value = { }.  This is a variable-width
      field that complements the keying material by specifying a salt
      key.

   Wrap:  (64 + n * KLen * 8) bits, value = { }.
This is a variable-width
      field for specifying the wrapped key(s), where n = (KK + 1)/2
      (integer division) is the number of keys provided, and the size
      of the Wrap field is ((n * KLen) + 8) bytes.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +                  Integrity Check Vector (ICV)                 +
   |                                                               |
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   |                              xSEK                             |
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   |                              oSEK                             |
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+

                  Figure 11: Unwrapped key structure

   ICV:  64 bits.  64-bit Integrity Check Vector (AES key wrap
      integrity).  This field is used to detect whether the keys were
      unwrapped properly.  If the KEK in hand is invalid, validation
      fails and the unwrapped keys are discarded.

   xSEK:  variable width.  This field identifies an odd or even SEK.  If
      only one key is present, the bit set in the KK field tells which
      SEK is provided.  If both keys are present, then this field is
      eSEK (even key) and it is followed by the odd key oSEK.  The
      length of this field is KLen * 8 bits.

   oSEK:  variable width.  This field with the odd key is present only
      when the message carries the two SEKs (identified by the KK
      field).

3.2.3.  Keep-Alive

   Keep-alive control packets are sent after a certain timeout from the
   last time any packet (Control or Data) was sent.  The purpose of this
   control packet is to notify the peer to keep the connection open when
   no data exchange is taking place.

   The default timeout for a keep-alive packet to be sent is 1 second.
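   The keep-alive rule above can be sketched as a small timer that is
   reset whenever any packet is sent.  This is only an illustration
   under stated assumptions: the class and method names are
   hypothetical, time is passed in explicitly as seconds, and the
   1 second value is the default quoted above.

```python
KEEPALIVE_TIMEOUT_S = 1.0  # default keep-alive timeout from the text


class KeepAliveTimer:
    """Hypothetical helper tracking when the last packet was sent."""

    def __init__(self, now: float) -> None:
        self.last_sent = now

    def on_packet_sent(self, now: float) -> None:
        # Any packet (Control or Data) restarts the keep-alive timeout.
        self.last_sent = now

    def keepalive_due(self, now: float) -> bool:
        # A keep-alive is due once the timeout has elapsed with no sends.
        return now - self.last_sent >= KEEPALIVE_TIMEOUT_S
```

   A sender would call keepalive_due() from its periodic timer and,
   when it returns True, emit a keep-alive packet and then call
   on_packet_sent().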
   An SRT keep-alive packet is formatted as follows:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+- SRT Header +-+-+-+-+-+-+-+-+-+-+-+-+-+
   |1|         Control Type        |            Reserved           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   Type-specific Information                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           Timestamp                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     Destination Socket ID                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                 Figure 12: Keep-Alive control packet

   Packet Type:  1 bit, value = 1.  The packet type value of a keep-alive
      control packet is "1".

   Control Type:  15 bits, value = KEEPALIVE{0x0001}.  The control type
      value of a keep-alive control packet is "1".

   Reserved:  16 bits, value = 0.  This is a fixed-width field reserved
      for future use.

   Type-specific Information:  32 bits.  This field is reserved for
      future definition.

   Timestamp:  32 bits.  See Section 3.

   Destination Socket ID:  32 bits.  See Section 3.

   Keep-alive control packets do not contain a Control Information
   Field (CIF).

3.2.4.  ACK (Acknowledgment)

   Acknowledgment (ACK) control packets are used to provide the delivery
   status of data packets.  By acknowledging the reception of data
   packets up to the acknowledged packet sequence number, the receiver
   notifies the sender that all prior packets were received or, in the
   case of live streaming (Section 4.2, Section 7.1), preceding missing
   packets (if any) were dropped as too late to be delivered
   (Section 4.6).

   ACK packets may also carry some additional information from the
   receiver, such as estimates of RTT, RTT variance, link capacity,
   receiving speed, etc.
The CIF portion of the ACK control packet is 1004 expanded as follows: 1006 0 1 2 3 1007 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1008 +-+-+-+-+-+-+-+-+-+-+-+-+- SRT Header +-+-+-+-+-+-+-+-+-+-+-+-+-+ 1009 |1| Control Type | Reserved | 1010 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1011 | Acknowledgement Number | 1012 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1013 | Timestamp | 1014 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1015 | Destination Socket ID | 1016 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+- CIF -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1017 | Last Acknowledged Packet Sequence Number | 1018 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1019 | RTT | 1020 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1021 | RTT Variance | 1022 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1023 | Available Buffer Size | 1024 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1025 | Packets Receiving Rate | 1026 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1027 | Estimated Link Capacity | 1028 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1029 | Receiving Rate | 1030 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1032 Figure 13: ACK control packet 1034 Packet Type: 1 bit, value = 1. The packet type value of an ACK 1035 control packet is "1". 1037 Control Type: 15 bits, value = ACK{0x0002}. The control type value 1038 of an ACK control packet is "2". 1040 Reserved: 16 bits, value = 0. This is a fixed-width field reserved 1041 for future use. 1043 Acknowledgement Number: 32 bits. This field contains the sequential 1044 number of the full acknowledgment packet starting from 1. 1046 Timestamp: 32 bits. See Section 3. 1048 Destination Socket ID: 32 bits. See Section 3. 1050 Last Acknowledged Packet Sequence Number: 32 bits. 
This field
      contains the sequence number of the last data packet being
      acknowledged plus one.  In other words, it is the sequence number
      of the first unacknowledged packet.

   RTT:  32 bits.  RTT value, in microseconds, estimated by the receiver
      based on the previous ACK-ACKACK packet exchange.

   RTT Variance:  32 bits.  The variance of the RTT estimate, in
      microseconds.

   Available Buffer Size:  32 bits.  Available size of the receiver's
      buffer, in packets.

   Packets Receiving Rate:  32 bits.  The rate at which packets are being
      received, in packets per second.

   Estimated Link Capacity:  32 bits.  Estimated bandwidth of the link,
      in packets per second.

   Receiving Rate:  32 bits.  Estimated receiving rate, in bytes per
      second.

   There are several types of ACK packets:

   *  A Full ACK control packet is sent every 10 ms and has all the
      fields of Figure 13.

   *  A Light ACK control packet includes only the Last Acknowledged
      Packet Sequence Number field.  The Type-specific Information field
      should be set to 0.

   *  A Small ACK includes the fields up to and including the Available
      Buffer Size field.  The Type-specific Information field should be
      set to 0.

   The sender only acknowledges the receipt of Full ACK packets (see
   Section 3.2.8).

   The Light ACK and Small ACK packets are used in cases when the
   receiver should acknowledge received data packets more often than
   every 10 ms.  This is usually needed at high data rates.  It is up to
   the receiver to decide the condition and the type of ACK packet to
   send (Light or Small).  The recommendation is to send a Light ACK for
   every 64 packets received.

3.2.5.  NAK (Loss Report)

   Negative acknowledgment (NAK) control packets are used to signal
   failed data packet deliveries.
The receiver notifies the sender 1100 about lost data packets by sending a NAK packet that contains a list 1101 of sequence numbers for those lost packets. 1103 An SRT NAK packet is formatted as follows: 1105 0 1 2 3 1106 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1107 +-+-+-+-+-+-+-+-+-+-+-+-+- SRT Header +-+-+-+-+-+-+-+-+-+-+-+-+-+ 1108 |1| Control Type | Reserved | 1109 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1110 | Type-specific Information | 1111 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1112 | Timestamp | 1113 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1114 | Destination Socket ID | 1115 +-+-+-+-+-+-+-+-+-+-+-+- CIF (Loss List) -+-+-+-+-+-+-+-+-+-+-+-+ 1116 |0| Lost packet sequence number | 1117 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1118 |1| Range of lost packets from sequence number | 1119 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1120 |0| Up to sequence number | 1121 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1122 |0| Lost packet sequence number | 1123 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1125 Figure 14: NAK control packet 1127 Packet Type: 1 bit, value = 1. The packet type value of a NAK 1128 control packet is "1". 1130 Control Type: 15 bits, value = NAK{0x0003}. The control type value 1131 of a NAK control packet is "3". 1133 Reserved: 16 bits, value = 0. This is a fixed-width field reserved 1134 for future use. 1136 Type-specific Information: 32 bits. This field is reserved for 1137 future definition. 1139 Timestamp: 32 bits. See Section 3. 1141 Destination Socket ID: 32 bits. See Section 3. 1143 Control Information Field (CIF). A single value or a range of lost 1144 packets sequence numbers. See packet sequence number coding in 1145 Appendix A. 1147 3.2.6. Congestion Warning 1149 The Congestion Warning control packet is reserved for future use. 
1150 Its purpose is to allow a receiver to signal a sender that there is 1151 congestion happening at the receiving side. The expected behaviour 1152 is that upon receiving this packet the sender slows down its sending 1153 rate by increasing the minimum inter-packet sending interval by a 1154 discrete value (posited to be 12.5%). 1156 Note that the conditions for a receiver to issue this type of packet 1157 are not yet defined. 1159 0 1 2 3 1160 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1161 +-+-+-+-+-+-+-+-+-+-+-+-+- SRT Header +-+-+-+-+-+-+-+-+-+-+-+-+-+ 1162 |1| Control Type = 4 | Reserved = 0 | 1163 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1164 | Type-specific Information = 0 | 1165 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1166 | Timestamp | 1167 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1168 | Destination Socket ID | 1169 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1171 Figure 15: Congestion Warning control packet 1173 Packet Type: 1 bit, value = 1. The packet type value of a Congestion 1174 Warning control packet is "1". 1176 Control Type: 15 bits, value = 4. The control type value of a 1177 Congestion Warning control packet is "4". 1179 Timestamp: 32 bits. See Section 3. 1181 Destination Socket ID: 32 bits. See Section 3. 1183 Type-specific Information. This field is reserved for future 1184 definition. 1186 3.2.7. Shutdown 1188 Shutdown control packets are used to initiate the closing of an SRT 1189 connection. 
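   A Shutdown packet carries no CIF, so it consists of the 16-byte SRT
   control packet header only.  Serializing such a header-only packet
   can be sketched as follows.  The sketch assumes that all header
   fields are written in network (big-endian) byte order, which is not
   restated in this section; the function and constant names are
   illustrative and not part of the protocol.

```python
import struct

CONTROL_FLAG = 0x8000  # most significant bit set: control packet
SHUTDOWN = 0x0005      # control type of a shutdown packet


def build_control_header(control_type: int, timestamp: int,
                         dst_socket_id: int, type_specific: int = 0) -> bytes:
    """Hypothetical helper returning the 16-byte control header."""
    first16 = CONTROL_FLAG | (control_type & 0x7FFF)
    # Layout: flag + control type (16 bits), Reserved (16 bits),
    # Type-specific Information, Timestamp, Destination Socket ID
    # (32 bits each).  Big-endian order is an assumption of this sketch.
    return struct.pack(">HHIII", first16, 0, type_specific,
                       timestamp, dst_socket_id)
```

   Calling build_control_header(SHUTDOWN, timestamp, peer_id) yields a
   complete shutdown packet under these assumptions.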
1191 An SRT shutdown control packet is formatted as follows: 1193 0 1 2 3 1194 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1195 +-+-+-+-+-+-+-+-+-+-+-+-+- SRT Header +-+-+-+-+-+-+-+-+-+-+-+-+-+ 1196 |1| Control Type | Reserved | 1197 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1198 | Type-specific Information | 1199 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1200 | Timestamp | 1201 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1202 | Destination Socket ID | 1203 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1205 Figure 16: Shutdown control packet 1207 Packet Type: 1 bit, value = 1. The packet type value of a shutdown 1208 control packet is "1". 1210 Control Type: 15 bits, value = SHUTDOWN{0x0005}. The control type 1211 value of a shutdown control packet is "5". 1213 Timestamp: 32 bits. See Section 3. 1215 Destination Socket ID: 32 bits. See Section 3. 1217 Type-specific Information. This field is reserved for future 1218 definition. 1220 Shutdown control packets do not contain Control Information Field 1221 (CIF). 1223 3.2.8. ACKACK 1225 ACKACK control packets are sent to acknowledge the reception of a 1226 Full ACK, and are used in the calculation of RTT by the receiver. 1228 An SRT ACKACK Control packet is formatted as follows: 1230 0 1 2 3 1231 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1232 +-+-+-+-+-+-+-+-+-+-+-+-+- SRT Header +-+-+-+-+-+-+-+-+-+-+-+-+-+ 1233 |1| Control Type | Reserved | 1234 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1235 | Acknowledgement Number | 1236 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1237 | Timestamp | 1238 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1239 | Destination Socket ID | 1240 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1242 Figure 17: ACKACK control packet 1244 Packet Type: 1 bit, value = 1. 
The packet type value of an ACKACK
      control packet is "1".

   Control Type:  15 bits, value = ACKACK{0x0006}.  The control type
      value of an ACKACK control packet is "6".

   Acknowledgement Number:  32 bits.  This field contains the
      Acknowledgement Number of the Full ACK packet the reception of
      which is being acknowledged by this ACKACK packet.

   Timestamp:  32 bits.  See Section 3.

   Destination Socket ID:  32 bits.  See Section 3.

   ACKACK control packets do not contain a Control Information Field
   (CIF).

3.2.9.  Message Drop Request

   A Message Drop Request control packet is sent by the sender to the
   receiver when the receiver requests the retransmission of an
   unacknowledged packet (all or part of a message) which is no longer
   present in the sender's buffer.  This may happen, for example, when a
   TTL parameter (passed in the sending function) triggers a timeout for
   retransmitting lost packets which constitute parts of a message,
   causing these packets to be removed from the sender's buffer.

   With this packet the sender notifies the receiver that it must not
   wait for retransmission of this message.  Note that a Message Drop
   Request control packet is not sent if the Too Late Packet Drop
   mechanism (Section 4.6) causes the sender to drop a message, as in
   this case the receiver is expected to drop it anyway.
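   The sender-side decision described above can be sketched as follows.
   This is a simplified illustration, not the behaviour of any
   particular implementation: the send buffer is modelled as a
   dictionary keyed by sequence number, the record of expired messages
   is a hypothetical bookkeeping structure, and sequence number
   wrap-around is ignored.

```python
from typing import Dict, Tuple


def on_retransmission_request(
    seq_no: int,
    send_buffer: Dict[int, bytes],
    expired_messages: Dict[int, Tuple[int, int]],
):
    """Decide how to answer a retransmission request for seq_no.

    expired_messages maps a message number to the (first, last)
    sequence numbers of a message already removed from the buffer.
    """
    if seq_no in send_buffer:
        # The packet is still buffered: retransmit it.
        return ("RETRANSMIT", send_buffer[seq_no])
    for msg_no, (first_seq, last_seq) in expired_messages.items():
        if first_seq <= seq_no <= last_seq:
            # The message is gone: tell the receiver not to wait for it.
            return ("DROPREQ", (msg_no, first_seq, last_seq))
    # Nothing is known about this sequence number.
    return ("IGNORE", None)
```

   The tuple returned for "DROPREQ" holds the values that would fill
   the Message Number, First Packet Sequence Number and Last Packet
   Sequence Number fields of the Message Drop Request.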
1277 0 1 2 3 1278 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1279 +-+-+-+-+-+-+-+-+-+-+-+-+- SRT Header +-+-+-+-+-+-+-+-+-+-+-+-+-+ 1280 |1| Control Type = 7 | Reserved = 0 | 1281 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1282 | Message Number | 1283 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1284 | Timestamp | 1285 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1286 | Destination Socket ID | 1287 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1288 | First Packet Sequence Number | 1289 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1290 | Last Packet Sequence Number | 1291 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1293 Figure 18: Drop Request control packet 1295 Packet Type: 1 bit, value = 1. The packet type value of a Drop 1296 Request control packet is "1". 1298 Control Type: 15 bits, value = 7. The control type value of a Drop 1299 Request control packet is "7". 1301 Message Number: 32 bits. The identifying number of the message 1302 requested to be dropped. See the Message Number field in 1303 Section 3.1. 1305 Timestamp: 32 bits. See Section 3. 1307 Destination Socket ID: 32 bits. See Section 3. 1309 First Packet Sequence Number: 32 bits. The sequence number of the 1310 first packet in the message. 1312 Last Packet Sequence Number: 32 bits. The sequence number of the 1313 last packet in the message. 1315 3.2.10. Peer Error 1317 The Peer Error control packet is sent by a receiver when a processing 1318 error (e.g. write to disk failure) occurs. This informs the sender 1319 of the situation and unblocks it from waiting for further responses 1320 from the receiver. 1322 The sender receiving this type of control packet must unblock any 1323 sending operation in progress. 1325 *NOTE*: This control packet is only used if the File Transfer 1326 Congestion Control (Section 5.2) is enabled. 
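   Recognizing a Peer Error packet and extracting its error code can be
   sketched as follows, based on the layout of Figure 19.  The sketch
   assumes network (big-endian) byte order for the header fields; the
   function and constant names are illustrative only.  How a sender
   then unblocks a pending send operation is left to the application
   and is not shown.

```python
import struct

PEER_ERROR = 0x0008       # control type of a Peer Error packet
FILE_SYSTEM_ERROR = 4000  # the only error code defined at the moment


def parse_peer_error(packet: bytes) -> int:
    """Return the Error Code of a Peer Error packet (hypothetical helper)."""
    if len(packet) < 16:
        raise ValueError("packet shorter than the control header")
    # The Error Code occupies the third field of the 16-byte header.
    first16, _reserved, error_code, _timestamp, _dst = struct.unpack(
        ">HHIII", packet[:16])
    is_control = bool(first16 & 0x8000)
    if not is_control or (first16 & 0x7FFF) != PEER_ERROR:
        raise ValueError("not a Peer Error control packet")
    return error_code
```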
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+- SRT Header +-+-+-+-+-+-+-+-+-+-+-+-+-+
   |1|      Control Type = 8       |         Reserved = 0          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          Error Code                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           Timestamp                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     Destination Socket ID                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                 Figure 19: Peer Error control packet

   Packet Type:  1 bit, value = 1.  The packet type value of a Peer Error
      control packet is "1".

   Control Type:  15 bits, value = 8.  The control type value of a Peer
      Error control packet is "8".

   Error Code:  32 bits.  Peer error code.  At the moment the only value
      defined is 4000 - file system error.

   Timestamp:  32 bits.  See Section 3.

   Destination Socket ID:  32 bits.  See Section 3.

4.  SRT Data Transmission and Control

   This section describes key concepts related to the handling of
   control and data packets during the transmission process.

   After the handshake and exchange of capabilities is completed, packet
   data can be sent and received over the established connection.  To
   fully utilize the features of low latency and error recovery provided
   by SRT, the sender and receiver must handle control packets, timers,
   and buffers for the connection as specified in this section.

4.1.  Stream Multiplexing

   Multiple SRT sockets may share the same UDP socket.  Packets
   received on this UDP socket are dispatched to the SRT sockets for
   which they are destined.

   During the handshake, the parties exchange their SRT Socket IDs.
   These IDs are then used in the Destination Socket ID field of every
   control and data packet (see Section 3).

4.2.
Data Transmission Modes

   There are two data transmission modes supported by SRT: message mode
   (Section 4.2.1) and buffer mode (Section 4.2.2).  These are the modes
   originally defined in the UDT protocol [GHG04b].

   As SRT has been mainly designed for live video and audio streaming,
   its main and default transmission mode is message mode with certain
   settings applied (Section 7.1).

   Besides live streaming, SRT maintains the ability for fast file
   transfers introduced in UDT (Section 7.2).  The usage of both message
   and buffer modes is possible in this case.

   Best practices and configuration tips for both use cases can be found
   in Section 7.

4.2.1.  Message Mode

   When the STREAM flag of the Handshake Extension Message
   (Section 3.2.1.1) is set to 0, the protocol operates in message
   mode, characterized as follows:

   *  Every packet has its own Packet Sequence Number.

   *  One or several consecutive SRT data packets can form a message.

   *  All the packets belonging to the same message have the same
      message number set in the Message Number field.

   The first packet of a message has the first bit of the Packet
   Position Flags (Section 3.1) set to 1.  The last packet of the
   message has the second bit of the Packet Position Flags set to 1.
   Thus, a PP equal to "11b" indicates a packet that forms the whole
   message.  A PP equal to "00b" indicates a packet that belongs to the
   inner part of the message.

   The concept of the message in SRT comes from UDT [GHG04b].  In this
   mode, a single sending instruction passes exactly one piece of data
   that has boundaries (a message).  This message may span multiple UDP
   packets and multiple SRT data packets.  The only size limitation is
   that it shall fit as a whole in the buffers of the sender and the
   receiver.
Although internally all operations (e.g., ACK, NAK) on
   data packets are performed independently, an application must send
   and receive the whole message.  Until the message is complete (all
   packets are received) the application will not be allowed to read it.

   When the Order Flag of a data packet is set to 1, this imposes a
   sequential reading order on messages.  An Order Flag set to 0 allows
   an application to read messages that are already fully available,
   before any preceding messages that may have some packets missing.

4.2.2.  Buffer Mode

   Buffer mode is negotiated during the handshake by setting the STREAM
   flag of the Handshake Extension Message Flags (Section 3.2.1.1.1) to
   1.

   In this mode, consecutive packets form one continuous stream that can
   be read in portions of any size.

4.3.  Handshake Messages

   SRT is a connection-oriented protocol.  It embraces the concepts of
   "connection" and "session".  SRT uses UDP as the underlying
   transport for sending data and control packets.

   An SRT connection is characterized by the fact that it is:

   *  first engaged by a handshake process,

   *  maintained as long as any packets are being exchanged in a timely
      manner, and

   *  considered closed when a party receives the appropriate close
      command from its peer (connection closed by the foreign host), or
      when it receives no packets at all for some predefined time
      (connection broken on timeout).

   SRT supports two connection configurations:

   1.  Caller-Listener, where one side waits for the other to initiate a
       connection;

   2.  Rendezvous, where both sides attempt to initiate a connection.

   The handshake is performed between two parties, "Initiator" and
   "Responder", in the following order:

   *  Initiator starts an extended SRT handshake process and sends
      appropriate SRT extended handshake requests.
1468 * Responder expects the SRT extended handshake requests to be sent 1469 by the Initiator and sends SRT extended handshake responses back. 1471 There are three basic types of SRT handshake extensions that are 1472 exchanged in the handshake: 1474 * Handshake Extension Message exchanges the basic SRT information; 1476 * Key Material Exchange exchanges the wrapped stream encryption key 1477 (used only if an encryption is requested). 1479 * Stream ID extension exchanges some stream-specific information 1480 that can be used by the application to identify an incoming stream 1481 connection. 1483 The Initiator and Responder roles are assigned depending on the 1484 connection mode. 1486 For Caller-Listener connections: the Caller is the Initiator, the 1487 Listener is the Responder. For Rendezvous connections: the Initiator 1488 and Responder roles are assigned based on the initial data 1489 interchange during the handshake. 1491 The Handshake Type field in the Handshake Structure (see Figure 5) 1492 indicates the handshake message type. 1494 Caller-Listener handshake exchange has the following order of 1495 Handshake Types: 1497 1. Caller to Listener: INDUCTION 1499 2. Listener to Caller: INDUCTION (reports cookie) 1501 3. Caller to Listener: CONCLUSION (uses previously returned cookie) 1503 4. Listener to Caller: CONCLUSION (confirms connection established). 1505 Rendezvous handshake exchange has the following order of Handshake 1506 Types: 1508 1. After starting the connection: WAVEAHAND 1509 2. After receiving the above message from the peer: CONCLUSION 1511 3. After receiving the above message from the peer: AGREEMENT. 1513 When a connection process has failed before either party can send the 1514 CONCLUSION handshake, the Handshake Type field will contain the 1515 appropriate error value for the rejected connection. See the list of 1516 error codes in Table 7. 
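   Since a rejected connection reports its reason in the Handshake Type
   field, a party may want to map that value back to a symbolic name.
   A minimal sketch using the codes of Table 7 is given below; the
   class and function names are illustrative and not part of the
   protocol.

```python
from enum import IntEnum
from typing import Optional


class RejectReason(IntEnum):
    """Handshake rejection reason codes as listed in Table 7."""
    REJ_UNKNOWN = 1000
    REJ_SYSTEM = 1001
    REJ_PEER = 1002
    REJ_RESOURCE = 1003
    REJ_ROGUE = 1004
    REJ_BACKLOG = 1005
    REJ_IPE = 1006
    REJ_CLOSE = 1007
    REJ_VERSION = 1008
    REJ_RDVCOOKIE = 1009
    REJ_BADSECRET = 1010
    REJ_UNSECURE = 1011
    REJ_MESSAGEAPI = 1012
    REJ_CONGESTION = 1013
    REJ_FILTER = 1014
    REJ_GROUP = 1015


def rejection_reason(handshake_type: int) -> Optional[RejectReason]:
    """Return the rejection reason, or None for a regular handshake type."""
    try:
        return RejectReason(handshake_type)
    except ValueError:
        return None
```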

   +======+================+=========================================+
   | Code | Error          | Description                             |
   +======+================+=========================================+
   | 1000 | REJ_UNKNOWN    | Unknown reason                          |
   +------+----------------+-----------------------------------------+
   | 1001 | REJ_SYSTEM     | System function error                   |
   +------+----------------+-----------------------------------------+
   | 1002 | REJ_PEER       | Rejected by peer                        |
   +------+----------------+-----------------------------------------+
   | 1003 | REJ_RESOURCE   | Resource allocation problem             |
   +------+----------------+-----------------------------------------+
   | 1004 | REJ_ROGUE      | Incorrect data in handshake             |
   +------+----------------+-----------------------------------------+
   | 1005 | REJ_BACKLOG    | Listener's backlog exceeded             |
   +------+----------------+-----------------------------------------+
   | 1006 | REJ_IPE        | Internal program error                  |
   +------+----------------+-----------------------------------------+
   | 1007 | REJ_CLOSE      | Socket is closing                       |
   +------+----------------+-----------------------------------------+
   | 1008 | REJ_VERSION    | Peer is older version than agent's min  |
   +------+----------------+-----------------------------------------+
   | 1009 | REJ_RDVCOOKIE  | Rendezvous cookie collision             |
   +------+----------------+-----------------------------------------+
   | 1010 | REJ_BADSECRET  | Wrong password                          |
   +------+----------------+-----------------------------------------+
   | 1011 | REJ_UNSECURE   | Password required or unexpected         |
   +------+----------------+-----------------------------------------+
   | 1012 | REJ_MESSAGEAPI | Stream flag collision                   |
   +------+----------------+-----------------------------------------+
   | 1013 | REJ_CONGESTION | Incompatible congestion-controller type |
   +------+----------------+-----------------------------------------+
   | 1014 | REJ_FILTER     | Incompatible packet filter              |
   +------+----------------+-----------------------------------------+
   | 1015 | REJ_GROUP      | Incompatible group                      |
   +------+----------------+-----------------------------------------+

               Table 7: Handshake Rejection Reason Codes

   The specification of the cipher family and block size is decided by
   the data Sender.  When the transmission is bidirectional, this value
   MUST be agreed upon at the outset because when both are set the
   Responder wins.  For Caller-Listener connections it is reasonable to
   set this value on the Listener only.  In the case of Rendezvous the
   only reasonable approach is to decide upon the correct value from the
   different sources and to set it on both parties (note that *AES-128*
   is the default).

4.3.1.  Caller-Listener Handshake

   This section describes the handshaking process where a Listener is
   waiting for an incoming Handshake request on a bound UDP port from a
   Caller.  The process has two phases: induction and conclusion.

4.3.1.1.  The Induction Phase

   The INDUCTION phase serves only to set a cookie on the Listener so
   that it doesn't allocate resources, thus mitigating a potential DoS
   attack that might be perpetrated by flooding the Listener with
   handshake commands.

   The Caller begins by sending the INDUCTION handshake which contains
   the following significant fields:

   *  Version: MUST always be 4

   *  Encryption Field: 0

   *  Extension Field: 2

   *  Handshake Type: INDUCTION

   *  SRT Socket ID: SRT Socket ID of the Caller

   *  SYN Cookie: 0.

   The Destination Socket ID of the SRT packet header in this message is
   0, which is interpreted as a connection request.

   The handshake version number is set to 4 in this initial handshake.
   This is due to the initial design of SRT that was to be compliant
   with the UDT protocol [GHG04b] on which it is based.

   The Listener responds with the following:

   *  Version: 5

   *  Encryption Field: Advertised cipher family and block size

   *  Extension Field: SRT magic code 0x4A17

   *  Handshake Type: INDUCTION

   *  SRT Socket ID: Socket ID of the Listener

   *  SYN Cookie: a cookie that is crafted based on host, port and
      current time with 1 minute accuracy to avoid SYN flooding attack
      [RFC4987].

   At this point the Listener still does not know if the Caller is SRT
   or UDT, and it responds with the same set of values regardless of
   whether the Caller is SRT or UDT.

   An SRT party does interpret the values in Version and Extension
   Field.  If it receives the value 5 in Version, it understands that
   the response comes from an SRT party, so it knows that it should
   prepare the proper handshake messages for the next phase.  It also
   checks the following:

   *  whether the Extension Field contains the magic value 0x4A17;
      otherwise the connection is rejected.  This is a contingency for
      the case where someone, in an attempt to extend UDT
      independently, increases the Version value to 5 and tries to test
      it against SRT;

   *  whether the Encryption Field contains a non-zero value, which is
      interpreted as an advertised cipher family and block size.

   A legacy UDT party completely ignores the values reported in Version
   and Handshake Type.  It is, however, interested in the SYN Cookie
   value, as this must be passed to the next phase.  It does interpret
   these fields, but only in the "conclusion" message.

4.3.1.2.  The Conclusion Phase

   Once the Caller gets the SYN cookie from the Listener, it sends the
   CONCLUSION handshake to the Listener.
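
   Before sending CONCLUSION, an SRT Caller checks the Listener's
   INDUCTION response as described in Section 4.3.1.1.  The following is
   a minimal sketch of those checks, not a reference implementation.
   The InductionResponse container, the function name and the returned
   dictionary are hypothetical, and treating any Version other than 5 as
   a legacy peer is an assumption made for illustration.

```python
from dataclasses import dataclass

SRT_MAGIC_CODE = 0x4A17  # Extension Field value sent by an SRT Listener


@dataclass
class InductionResponse:
    """Hypothetical container for the fields named in Section 4.3.1.1."""
    version: int
    encryption_field: int
    extension_field: int
    syn_cookie: int


def check_induction_response(resp: InductionResponse) -> dict:
    """Classify the Listener's INDUCTION response from the Caller's side."""
    if resp.version != 5:
        # Assumption: anything other than Version 5 is handled as a
        # legacy peer; only the cookie is carried to the next phase.
        return {"peer": "legacy", "cookie": resp.syn_cookie}
    if resp.extension_field != SRT_MAGIC_CODE:
        # Version 5 without the magic code: the connection is rejected.
        return {"peer": "rejected", "cookie": None}
    return {
        "peer": "srt",
        "cookie": resp.syn_cookie,
        # A non-zero value advertises a cipher family and block size.
        "cipher_advertised": resp.encryption_field != 0,
    }
```

   The cookie returned here is the value the Caller copies into the SYN
   Cookie field of its CONCLUSION handshake.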

   The following values are set by the compliant Caller:

   *  Version: 5

   *  Handshake Type: CONCLUSION

   *  SRT Socket ID: Socket ID of the Caller

   *  SYN Cookie: the cookie previously received in the induction phase

   *  Encryption Flags: advertised cipher family and block size

   *  Extension Flags: a set of flags that define the extensions
      provided in the handshake

   *  The Destination Socket ID in this message is the socket ID that
      was previously received in the induction phase in the SRT Socket
      ID field of the handshake structure.

   The Listener responds with the same values shown above, without the
   cookie (which is not needed here), as well as the extensions for HS
   Version 5 (which will probably be exactly the same).

   There is no "negotiation" here.  If the values passed in the
   handshake are in any way not acceptable to the other side, the
   connection will be rejected.  The only case when the Listener can
   have precedence over the Caller is the advertised Cipher Family and
   Block Size (see Table 2) in the Encryption Field of the Handshake.

   The value for latency is always agreed to be the greater of those
   reported by each party.

4.3.2.  Rendezvous Handshake

   The Rendezvous process uses a state machine.  It is slightly
   different from UDT Rendezvous handshake [GHG04b], although it is
   still based on the same message request types.

   Both parties start with WAVEAHAND and use the Version value of 5.
   Legacy Version 4 clients do not look at the Version value, whereas
   Version 5 clients can detect version 5.  The parties only continue
   with the Version 5 Rendezvous process when Version is set to 5 for
   both.  Otherwise the process continues exclusively according to
   Version 4 rules [GHG04b].

   With Version 5 Rendezvous, both parties create a cookie for a process
   called the "cookie contest".
   This is necessary for the assignment of
   Initiator and Responder roles.  Each party generates a cookie value
   (a 32-bit number) based on the host, port, and current time with 1
   minute accuracy.  This value is scrambled using an MD5 sum
   calculation.  The cookie values are then compared with one another.

   Since it is impossible to have two sockets on the same machine bound
   to the same NIC and port and operating independently, it is virtually
   impossible that the parties will generate identical cookies.
   However, this situation may occur if an application tries to "connect
   to itself" - that is, either connects to a local IP address, when the
   socket is bound to INADDR_ANY, or to the same IP address to which the
   socket was bound.  If the cookies are identical (for any reason), the
   connection will not be made until new, unique cookies are generated
   (after a delay of up to one minute).  In the case of an application
   "connecting to itself", the cookies will always be identical, and so
   the connection will never be established.

   When one party's cookie value is greater than its peer's, it wins the
   cookie contest and becomes Initiator (the other party becomes the
   Responder).

   At this point there are two possible "handshake flows": serial and
   parallel.

4.3.2.1.  Serial Handshake Flow

   In the serial handshake flow, one party is always first, and the
   other follows.  That is, while both parties are repeatedly sending
   WAVEAHAND messages, at some point one party - let's say Alice - will
   find she has received a WAVEAHAND message before she can send her
   next one, so she sends a CONCLUSION message in response.  Meanwhile,
   Bob (Alice's peer) has missed Alice's WAVEAHAND messages, so that
   Alice's CONCLUSION is the first message Bob has received from her.
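
   The cookie contest that assigns the Initiator and Responder roles
   before either flow starts can be sketched as follows.  This is an
   illustration only: the function names are hypothetical, and the exact
   byte layout that is fed to MD5 is an assumption, since this section
   only states that the cookie is derived from the host, the port and
   the current time with 1 minute accuracy.

```python
import hashlib


def make_cookie(host: str, port: int, now: float) -> int:
    """Derive a 32-bit cookie from host, port and the current minute.

    The string layout hashed here is an assumption for illustration.
    """
    minute = int(now // 60)  # 1-minute accuracy
    digest = hashlib.md5(f"{host}:{port}:{minute}".encode()).digest()
    return int.from_bytes(digest[:4], "big")  # keep 32 bits


def cookie_contest(own_cookie: int, peer_cookie: int) -> str:
    """Return the local party's role, or 'undecided' on a tie."""
    if own_cookie > peer_cookie:
        return "initiator"
    if own_cookie < peer_cookie:
        return "responder"
    # Identical cookies: no role can be assigned until they change.
    return "undecided"
```

   Because both parties compare the same pair of values, they reach
   complementary results without any additional message exchange.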

   This process can be described easily as a series of exchanges between
   the first and following parties (Alice and Bob, respectively):

   1.  Initially, both parties are in the waving state.  Alice sends a
       handshake message to Bob:

       *  Version: 5

       *  Type: Extension field: 0, Encryption field: advertised
          "PBKEYLEN"

       *  Handshake Type: WAVEAHAND

       *  SRT Socket ID: Alice's socket ID

       *  SYN Cookie: Created based on host/port and current time.

       While Alice does not yet know if she is sending this message to a
       Version 4 or Version 5 peer, the values from these fields would
       not be interpreted by the Version 4 peer when the Handshake Type
       is WAVEAHAND.

   2.  Bob receives Alice's WAVEAHAND message, switches to the
       "attention" state.  Since Bob now knows Alice's cookie, he
       performs a "cookie contest" (compares both cookie values).  If
       Bob's cookie is greater than Alice's, he will become the
       Initiator.  Otherwise, he will become the Responder.

       The resolution of the Handshake Role (Initiator or Responder) is
       essential for further processing.

       Then Bob responds:

       *  Version: 5

       *  Extension field: appropriate flags if Initiator, otherwise 0

       *  Encryption field: advertised PBKEYLEN

       *  Handshake Type: CONCLUSION.

       If Bob is the Initiator and encryption is on, he will use either
       his own cipher family and block size or the one received from
       Alice (if she has advertised those values).

   3.  Alice receives Bob's CONCLUSION message.  While at this point she
       also performs the "cookie contest", the outcome will be the same.
       She switches to the "fine" state, and sends:

       *  Version: 5

       *  Appropriate extension flags and encryption flags

       *  Handshake Type: CONCLUSION.

       Both parties always send extension flags at this point, which
       will contain HSREQ if the message comes from an Initiator, or
       HSRSP if it comes from a Responder.
       If the Initiator has
       received a previous message from the Responder containing an
       advertised cipher family and block size in the encryption flags
       field, it will be used as the key length for key generation sent
       next in the KMREQ extension.

   4.  Bob receives Alice's CONCLUSION message, and then does one of the
       following (depending on Bob's role):

       *  If Bob is the Initiator (Alice's message contains HSRSP), he:

          -  switches to the "connected" state, and

          -  sends Alice a message with Handshake Type AGREEMENT, but
             containing no SRT extensions (Extension Flags field should
             be 0).

       *  If Bob is the Responder (Alice's message contains HSREQ), he:

          -  switches to "initiated" state,

          -  sends Alice a message with Handshake Type CONCLUSION that
             also contains extensions with HSRSP, and

          -  awaits a confirmation from Alice that she is also connected
             (preferably by AGREEMENT message).

   5.  Alice receives the above message, enters into the "connected"
       state, and then does one of the following (depending on Alice's
       role):

       *  If Alice is the Initiator (received CONCLUSION with HSRSP),
          she sends Bob a message with Handshake Type = AGREEMENT.

       *  If Alice is the Responder, the received message has Handshake
          Type AGREEMENT and in response she does nothing.

   6.  At this point, if Bob was an Initiator, he is connected already.
       If he was a Responder, he should receive the above AGREEMENT
       message, after which he switches to the "connected" state.  In
       the case where the UDP packet with the agreement message gets
       lost, Bob will still enter the "connected" state once he receives
       anything else from Alice.  If Bob is going to send, however, he
       has to continue sending the same CONCLUSION until he gets the
       confirmation from Alice.

4.3.2.2.  Parallel Handshake Flow

   The parallel handshake flow is very unlikely, but it may still occur
   if the handshake messages with WAVEAHAND are sent and received by
   both peers at precisely the same time.

   The resulting flow is very much like Bob's behaviour in the serial
   handshake flow, but for both parties.  Alice and Bob will go through
   the same state transitions:

   Waving -> Attention -> Initiated -> Connected

   In the Attention state they know each other's cookies, so they can
   assign roles.  In contrast to serial flows, which are mostly based on
   request-response cycles, here everything happens completely
   asynchronously: the state switches upon reception of a particular
   handshake message with appropriate contents (the Initiator MUST
   attach the HSREQ extension, and the Responder MUST attach the HSRSP
   extension).

   Here is how the parallel handshake flow works, based on roles and
   states:

   (1) Initiator

   1.  Waving

       *  Receives WAVEAHAND message,

       *  Switches to Attention,

       *  Sends CONCLUSION + HSREQ.

   2.  Attention

       Receives CONCLUSION message which

       *  either contains no extensions, then switches to Initiated,
          still sends CONCLUSION + HSREQ; or

       *  contains HSRSP extension, then switches to Connected, sends
          AGREEMENT.

   3.  Initiated

       Receives CONCLUSION message, which

       *  either contains no extensions, then REMAINS IN THIS STATE,
          still sends CONCLUSION + HSREQ; or

       *  contains HSRSP extension, then switches to Connected, sends
          AGREEMENT.

   4.  Connected

       May receive CONCLUSION and respond with AGREEMENT, but normally
       by now it should already have received payload packets.

   (2) Responder

   1.  Waving

       *  Receives WAVEAHAND message,

       *  Switches to Attention,

       *  Sends CONCLUSION message (with no extensions).

   2.  Attention

       *  Receives CONCLUSION message with HSREQ.
          This message might
          contain no extensions, in which case the party SHALL simply
          send the empty CONCLUSION message, as before, and remain in
          this state.

       *  Switches to Initiated and sends CONCLUSION message with HSRSP.

   3.  Initiated

       Receives:

       *  CONCLUSION message with HSREQ, then responds with CONCLUSION
          with HSRSP and remains in this state;

       *  AGREEMENT message, then responds with AGREEMENT and switches
          to Connected;

       *  Payload packet, then responds with AGREEMENT and switches to
          Connected.

   4.  Connected

       Is not expecting to receive any handshake messages anymore.  The
       AGREEMENT message is sent either only once or once for every
       final CONCLUSION message received.

   Note that any of these packets may be missing, and the sending party
   will never become aware.  The missing packet problem is resolved this
   way:

   1.  If the Responder misses the CONCLUSION + HSREQ message, it simply
       continues sending empty CONCLUSION messages.  Only upon reception
       of CONCLUSION + HSREQ does it respond with CONCLUSION + HSRSP.

   2.  If the Initiator misses the CONCLUSION + HSRSP response from the
       Responder, it continues sending CONCLUSION + HSREQ.  The
       Responder MUST always respond with CONCLUSION + HSRSP when the
       Initiator sends CONCLUSION + HSREQ, even if it has already
       received and interpreted it.

   3.  When the Initiator switches to the Connected state it responds
       with an AGREEMENT message, which may be missed by the Responder.
       Nonetheless, the Initiator may start sending data packets because
       it considers itself connected - it does not know that the
       Responder has not yet switched to the Connected state.
       Therefore
       it is exceptionally allowed that when the Responder is in the
       Initiated state and receives a data packet (or any control packet
       that is normally sent only between connected parties) over this
       connection, it may switch to the Connected state just as if it
       had received an AGREEMENT message.

   4.  If the Initiator has already switched to the Connected state
       it will not bother the Responder with any more handshake
       messages.  But the Responder may be completely unaware of that
       (having missed the AGREEMENT message from the Initiator).
       Therefore it does not exit the connecting state, which means that
       it continues sending CONCLUSION + HSRSP messages until it
       receives any packet that will make it switch to the Connected
       state (normally AGREEMENT).  Only then does it exit the
       connecting state and the application can start transmission.

4.4.  SRT Buffer Latency

   The SRT sender and receiver have buffers to store packets.

   On the sender, latency is the time that SRT holds a packet to give it
   a chance to be delivered successfully while maintaining the rate of
   the sender at the receiver.  If an acknowledgment (ACK) is missing or
   late for more than the configured latency, the packet is dropped from
   the sender buffer.  A packet can be retransmitted as long as it
   remains in the buffer for the duration of the latency window.  On the
   receiver, packets are delivered to an application from a buffer after
   the latency interval has passed.  This helps to recover from
   potential packet losses.  See Section 4.5, Section 4.6 for details.

   Latency is a value, in milliseconds, that can cover the time to
   transmit hundreds or even thousands of packets at high bitrate.
   Latency can be thought of as a window that slides over time, during
   which a number of activities take place, such as the reporting of
   acknowledged packets (ACKs) (Section 4.8.1) and unacknowledged
   packets (NAKs) (Section 4.8.2).

   Latency is configured through the exchange of capabilities during the
   extended handshake process between initiator and responder.  The
   Handshake Extension Message (Section 3.2.1.1) has TSBPD delay
   information, in milliseconds, from the SRT receiver and sender.  The
   latency for a connection will be established as the maximum value of
   latencies proposed by the initiator and responder.

4.5.  Timestamp-Based Packet Delivery

   The goal of the SRT Timestamp-Based Packet Delivery (TSBPD) mechanism
   is to reproduce the output of the sending application (e.g., encoder)
   at the input of the receiving application (e.g., decoder) in the case
   of live streaming (Section 4.2, Section 7.1).  It attempts to
   reproduce the timing of packets committed by the sending application
   to the SRT sender.  This allows packets to be scheduled for delivery
   by the SRT receiver, making them ready to be read by the receiving
   application (see Figure 20).

   The SRT receiver, using the timestamp of the SRT data packet header,
   delivers packets to a receiving application with a fixed minimum
   delay from the time the packet was scheduled for sending on the SRT
   sender side.  Basically, the sender timestamp in the received packet
   is adjusted to the receiver's local time (compensating for the time
   drift or different time zones) before releasing the packet to the
   application.  Packets can be withheld by the SRT receiver for a
   configured receiver delay.  A higher delay can accommodate a larger
   uniform packet drop rate, or a larger packet burst drop.
   Packets
   received after their "play time" are dropped if the Too-Late Packet
   Drop feature is enabled (Section 4.6).  For example, in the case of
   live video streaming, the TSBPD and Too-Late Packet Drop mechanisms
   allow packets that were lost and have no chance of being
   retransmitted before their play time to be dropped intentionally.
   Thus, SRT provides a fixed end-to-end latency of the stream.

   The packet timestamp, in microseconds, is relative to the SRT
   connection creation time.  Packets are inserted based on the sequence
   number in the header field.  Unless explicitly provided, the origin
   time of the packet, in microseconds, is sampled when the packet is
   first submitted by the application to the SRT sender.  The TSBPD
   feature uses this time to stamp the packet for first transmission and
   any subsequent retransmission.  This timestamp and the configured SRT
   latency (Section 4.4) control the recovery buffer size and the
   instant that packets are delivered at the destination (the
   aforementioned "play time" which is decided by adding the timestamp
   to the configured latency).

   It is worth mentioning that the use of the packet sending time to
   stamp the packets is inappropriate for the TSBPD feature, since a new
   time (current sending time) is used for retransmitted packets,
   putting them out of order when inserted at their proper place in the
   stream.

   Figure 20 illustrates the key latency points during the packet
   transmission with the TSBPD feature enabled.

                 |  Sending  |              |                   |
                 |   Delay   |    ~RTT/2    |    SRT Latency    |
                 |<--------->|<------------>|<----------------->|
                 |           |              |                   |
                 |           |              |                   |
                 |           |              |                   |
         ___ Scheduled     Sent         Received            Scheduled
        /   for sending      |              |             for delivery
   Packet        |           |              |                   |
   State         |           |              |                   |
                 |           |              |                   |
                 |           |              |                   |
                 ----------------------------------------------------->
                                                                   Time

       Figure 20: Key Latency Points during the Packet Transmission

   The main packet states shown in Figure 20 are the following:

   *  "Scheduled for sending": the packet is committed by the sending
      application, stamped and ready to be sent;

   *  "Sent": the packet is passed to the UDP socket and sent;

   *  "Received": the packet is received and read from the UDP socket;

   *  "Scheduled for delivery": the packet is scheduled for the delivery
      and ready to be read by the receiving application.

   It is worth noting that the round-trip time (RTT) of an SRT link may
   vary in time.  However the actual end-to-end latency on the link
   becomes fixed and is approximately equal to (RTT_0/2 + SRT Latency)
   once the SRT handshake exchange happens, where RTT_0 is the actual
   value of the round-trip time during the SRT handshake exchange (the
   value of the round-trip time once the SRT connection has been
   established).

   The value of sending delay depends on the hardware performance.
   Usually it is relatively small (several microseconds) in contrast to
   RTT_0/2 and SRT latency which are measured in milliseconds.

4.5.1.  Packet Delivery Time

   Packet delivery time is the moment, estimated by the receiver, when a
   packet should be delivered to the upstream application.
   The
   calculation of packet delivery time (PktTsbpdTime) is performed upon
   receiving a data packet according to the following formula:

   PktTsbpdTime = TsbpdTimeBase + PKT_TIMESTAMP + TsbpdDelay + Drift

   where

   *  TsbpdTimeBase is the time base that reflects the time difference
      between local clock of the receiver and the clock used by the
      sender to timestamp packets being sent (see Section 4.5.1.1);

   *  PKT_TIMESTAMP is the data packet timestamp, in microseconds;

   *  TsbpdDelay is the receiver's buffer delay (or receiver's buffer
      latency, or SRT Latency).  This is the time, in milliseconds, that
      SRT holds a packet from the moment it has been received till the
      time it should be delivered to the upstream application;

   *  Drift is the time drift used to adjust the fluctuations between
      sender and receiver clock, in microseconds.

   SRT Latency (TsbpdDelay) should be a buffer time large enough to
   cover the unexpectedly extended RTT time, and the time needed to
   retransmit the lost packet.  The value of minimum TsbpdDelay is
   negotiated during the SRT handshake exchange and is equal to 120
   milliseconds.  The recommended value of TsbpdDelay is 3-4 times RTT.

   It is worth noting that TsbpdDelay limits the number of packet
   retransmissions to a certain extent making it impossible to
   retransmit packets endlessly.  This is important for the case of live
   streaming (Section 4.2, Section 7.1).

4.5.1.1.  TSBPD Time Base Calculation

   The initial value of TSBPD time base (TsbpdTimeBase) is calculated at
   the moment the second handshake request is received as follows:

   TsbpdTimeBase = T_NOW - HSREQ_TIMESTAMP

   where T_NOW is the current time according to the receiver clock;
   HSREQ_TIMESTAMP is the handshake packet timestamp, in microseconds.
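
   The two formulas above can be combined in a short worked example.
   This is a sketch under stated assumptions: the function and parameter
   names are hypothetical, all clock values are taken to be integers in
   microseconds, and TsbpdDelay is converted from milliseconds before it
   is added.

```python
def tsbpd_time_base(t_now_us: int, hsreq_timestamp_us: int) -> int:
    """TsbpdTimeBase = T_NOW - HSREQ_TIMESTAMP (both in microseconds)."""
    return t_now_us - hsreq_timestamp_us


def packet_delivery_time(time_base_us: int, pkt_timestamp_us: int,
                         tsbpd_delay_ms: int, drift_us: int = 0) -> int:
    """PktTsbpdTime = TsbpdTimeBase + PKT_TIMESTAMP + TsbpdDelay + Drift.

    TsbpdDelay is given in milliseconds, so it is scaled to
    microseconds to match the other terms.
    """
    return (time_base_us + pkt_timestamp_us
            + tsbpd_delay_ms * 1000 + drift_us)


# Handshake stamped 20 000 us by the sender arrives when the receiver
# clock reads 5 000 000 us.
base = tsbpd_time_base(5_000_000, 20_000)            # 4 980 000 us

# A data packet stamped 1 000 000 us, with a 120 ms SRT latency and no
# drift correction, becomes deliverable at this receiver time.
delivery = packet_delivery_time(base, 1_000_000, 120)  # 6 100 000 us
```

   The receiver releases the packet to the application once its local
   clock reaches the computed value.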

   The value of TsbpdTimeBase is approximately equal to the initial one-
   way delay of the link RTT_0/2, where RTT_0 is the actual value of the
   round-trip time during the SRT handshake exchange.

   During the transmission process, the value of TSBPD time base may be
   adjusted in two cases:

   1.  During the TSBPD wrapping period.  The TSBPD wrapping period
       occurs every 01:11:35 (hours:minutes:seconds).  This time
       corresponds to the maximum timestamp value of a packet
       (MAX_TIMESTAMP).  MAX_TIMESTAMP is equal to 0xFFFFFFFF, or the
       maximum value of 32-bit unsigned integer, in microseconds
       (Section 3).  The TSBPD wrapping period starts 30 seconds before
       reaching the maximum timestamp value of a packet and ends once
       the packet with timestamp within (30, 60) seconds interval is
       delivered (read from the buffer).  The updated value of
       TsbpdTimeBase will be recalculated as follows:

       TsbpdTimeBase = TsbpdTimeBase + MAX_TIMESTAMP + 1

   2.  By drift tracer.  See Section 4.7 for details.

4.6.  Too-Late Packet Drop

   The Too-Late Packet Drop (TLPKTDROP) mechanism allows the sender to
   drop packets that have no chance to be delivered in time, and allows
   the receiver to skip missing packets that have not been delivered in
   time.  The timeout for dropping a packet is based on the TSBPD
   mechanism (Section 4.5).

   In SRT, when Too-Late Packet Drop is enabled, and a packet
   timestamp is older than 125% of the SRT latency, it is considered too
   late to be delivered and may be dropped by the sender.  However, the
   sender keeps packets for at least 1 second in case the SRT latency is
   not enough for a large RTT (that is, if 125% of the SRT latency is
   less than 1 second).

   When enabled on the receiver, the receiver drops packets that have
   not been delivered or retransmitted in time, and delivers the
   subsequent packets to the application when it is their time to play.
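
   The sender-side rule above (125% of the SRT latency, but never less
   than 1 second) can be sketched as follows.  The function names are
   hypothetical, and measuring a packet's age from its origin time in
   microseconds is an assumption made for illustration.

```python
MIN_SENDER_KEEP_US = 1_000_000  # the sender keeps packets at least 1 s


def sender_drop_threshold_us(srt_latency_ms: int) -> int:
    """Age after which the sender may drop a packet, in microseconds."""
    threshold_us = srt_latency_ms * 1000 * 125 // 100  # 125% of latency
    return max(threshold_us, MIN_SENDER_KEEP_US)


def is_too_late(now_us: int, origin_time_us: int,
                srt_latency_ms: int) -> bool:
    """True if the packet is older than the sender drop threshold."""
    age_us = now_us - origin_time_us
    return age_us > sender_drop_threshold_us(srt_latency_ms)
```

   With a latency of 120 milliseconds the 1 second floor applies, while
   a latency of 2000 milliseconds gives a threshold of 2.5 seconds.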

   In pseudo-code, the algorithm of reading from the receiver buffer is
   the following:

   pos = 0;  /* Current receiver buffer position */
   i = 0;    /* Position of the next available packet in the receiver
                buffer, relative to the current buffer position pos */

   while (True) {
       // Get the position i of the next available packet
       // in the receiver buffer
       i = next_avail();

       // Calculate packet delivery time PktTsbpdTime
       // for the next available packet
       PktTsbpdTime = delivery_time(i);

       // Not yet time to deliver this packet: check again
       if (T_NOW < PktTsbpdTime)
           continue;

       Drop packets whose buffer position number is less than i;

       Deliver packet with the buffer position i;

       pos = i + 1;
   }

   where T_NOW is the current time according to the receiver clock.

   The TLPKTDROP mechanism can be turned off to always ensure a clean
   delivery.  However, a lost packet can then pause delivery for a
   longer, potentially unbounded time, and cause even worse tearing for
   the player.  Setting a higher SRT latency helps much more when
   TLPKTDROP causes packet drops too often.

4.7.  Drift Management

   When the sender enters "connected" status it tells the application
   there is a socket interface that is transmitter-ready.  At this point
   the application can start sending data packets.  It adds packets to
   the SRT sender's buffer at a certain input rate, from which they are
   transmitted to the receiver at scheduled times.

   A synchronized time is required to keep proper sender/receiver buffer
   levels, taking into account the time zone and round-trip time (up to
   2 seconds for satellite links).  Considering addition/subtraction
   round-off, and possibly unsynchronized system times, an agreed-upon
   time base drifts by a few microseconds every minute.
   The drift may
   accumulate over many days to a point where the sender or receiver
   buffers will overflow or deplete, seriously affecting the quality of
   the video.  SRT has a time management mechanism to compensate for
   this drift.

   When a packet is received, SRT determines the difference between the
   time it was expected and its timestamp.  The timestamp is calculated
   on the receiver side.  The RTT tells the receiver how much time it
   was supposed to take.  SRT maintains a reference between the time at
   the leading edge of the send buffer's latency window and the
   corresponding time on the receiver (the present time).  This allows
   the packet timestamp to be converted to the local receiver time.
   Based on this time, various events (packet delivery, etc.) can be
   scheduled.

   The receiver samples time drift data and periodically calculates a
   packet timestamp correction factor, which is applied to each data
   packet received by adjusting the inter-packet interval.  When a
   packet is received it is not given right away to the application.  As
   time advances, the receiver knows the expected time for any missing
   or dropped packet, and can use this information to fill any "holes"
   in the receive queue with another packet (see Section 4.5).

   It is worth noting that the period of sampling time drift data is
   based on a number of packets rather than time duration to ensure
   enough samples, independently of the media stream packet rate.  The
   effect of network jitter on the estimated time drift is attenuated by
   using a large number of samples.  Because the actual time drift is
   very slow (affecting a stream only after many hours), it does not
   require a fast reaction.

   The receiver uses local time to be able to schedule events -- to
   determine, for example, if it is time to deliver a certain packet
   right away.
   The timestamps in the packets themselves are just
   references to the beginning of the session.  When a packet is
   received (with a timestamp from the sender), the receiver makes a
   reference to the beginning of the session to recalculate its
   timestamp.  The start time is derived from the local time at the
   moment that the session is connected.  A packet timestamp equals
   "now" minus "StartTime", where the latter is the point in time when
   the socket was created.

4.8.  Acknowledgement and Lost Packet Handling

   To enable the Automatic Repeat reQuest of data packet
   retransmissions, a sender stores all sent data packets in its buffer.

   The SRT receiver periodically sends acknowledgments (ACKs) for the
   received data packets so that the SRT sender can remove the
   acknowledged packets from its buffer (Section 4.8.1).  Once the
   acknowledged packets are removed, their retransmission is no longer
   possible and presumably not needed.

   Upon receiving the full acknowledgment (ACK) control packet, the SRT
   sender SHOULD acknowledge its reception to the receiver by sending an
   ACKACK control packet with the sequence number of the full ACK packet
   being acknowledged.

   The SRT receiver also sends NAK control packets to notify the sender
   about the missing packets (Section 4.8.2).  The sending of a NAK
   packet can be triggered immediately after a gap in sequence numbers
   of data packets is detected.  In addition, a Periodic NAK report
   mechanism can be used to send NAK reports periodically.  The NAK
   packet in that case will list all the packets that the receiver
   considers to be lost up to the moment the Periodic NAK report is
   sent.

   Upon reception of the NAK packet, the SRT sender prioritizes
   retransmissions of lost packets over the regular data packets to be
   transmitted for the first time.

   The retransmission of the missing packet is repeated until the
   receiver acknowledges its receipt, or until both peers agree to drop
   this packet (Section 4.6).

4.8.1.  Packet Acknowledgement (ACKs, ACKACKs)

   At certain intervals (see below), the SRT receiver sends an
   acknowledgment (ACK) that causes the acknowledged packets to be
   removed from the SRT sender's buffer.

   An ACK control packet contains the sequence number of the packet
   immediately following the latest in the list of received packets.
   Where no packet loss has occurred up to the packet with sequence
   number n, an ACK would include the sequence number (n + 1).

   An ACK (from a receiver) will trigger the transmission of an ACKACK
   (by the sender), with almost no delay.  The time it takes for an ACK
   to be sent and an ACKACK to be received is the RTT.  The ACKACK tells
   the receiver to stop sending the ACK position because the sender
   already knows it.  Otherwise, ACKs (with outdated information) would
   continue to be sent regularly.  Similarly, if the sender does not
   receive an ACK, it does not stop transmitting.

   There are two conditions for sending an acknowledgment.  A full ACK
   is based on a timer of 10 milliseconds (the ACK period or
   synchronization time interval SYN).  For high bitrate transmissions,
   a "light ACK" can be sent, which is an ACK for a sequence of packets.
   In a 10-millisecond interval, there are often so many packets being
   sent and received that the ACK position on the sender does not
   advance quickly enough.  To mitigate this, after 64 packets (even if
   the ACK period has not fully elapsed) the receiver sends a light ACK.
   A light ACK is a shorter ACK (SRT header and one 32-bit field).  It
   does not trigger an ACKACK.
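
   The two ACK conditions above can be sketched as a small receiver-side
   scheduler.  This is a simplified illustration: the class and method
   names are hypothetical, the decision is only evaluated when a data
   packet arrives (a real receiver would also fire the 10 millisecond
   timer without traffic), and resetting the packet counter after each
   ACK is an assumption.

```python
from typing import Optional

SYN_INTERVAL_US = 10_000   # full ACK period: 10 milliseconds
LIGHT_ACK_PACKETS = 64     # light ACK after this many data packets


class AckScheduler:
    """Decides which kind of ACK, if any, follows a received packet."""

    def __init__(self, now_us: int) -> None:
        self.last_full_ack_us = now_us
        self.packets_since_ack = 0

    def on_data_packet(self, now_us: int) -> Optional[str]:
        """Return 'full', 'light' or None for the packet just received."""
        self.packets_since_ack += 1
        if now_us - self.last_full_ack_us >= SYN_INTERVAL_US:
            # ACK period elapsed: send a full ACK (triggers an ACKACK).
            self.last_full_ack_us = now_us
            self.packets_since_ack = 0
            return "full"
        if self.packets_since_ack >= LIGHT_ACK_PACKETS:
            # Many packets within one period: send a light ACK.
            self.packets_since_ack = 0
            return "light"
        return None
```

   At one packet every 100 microseconds, for example, the 64th packet
   arrives after 6.4 milliseconds and produces a light ACK before the
   full ACK period has elapsed.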
2312 When a receiver encounters the situation where the next packet to be 2313 played was not successfully received from the sender, it will "skip" 2314 this packet (see Section 4.6) and send a fake ACK. To the sender, 2315 this fake ACK is a real ACK, and so it just behaves as if the packet 2316 had been received. This facilitates the synchronization between SRT 2317 sender and receiver. The fact that a packet was skipped remains 2318 unknown by the sender. Skipped packets are recorded in the 2319 statistics on the SRT receiver. 2321 4.8.2. Packet Retransmission (NAKs) 2323 The SRT receiver sends NAK control packets to notify the sender about 2324 the missing packets. The NAK packet sending can be triggered 2325 immediately after a gap in sequence numbers of data packets is 2326 detected. 2328 Upon reception of the NAK packet, the SRT sender prioritizes 2329 retransmissions of lost packets over the regular data packets to be 2330 transmitted for the first time. 2332 The SRT sender maintains a list of lost packets (loss list) that is 2333 built from NAK reports. When scheduling packet transmission, it 2334 looks to see if a packet in the loss list has priority and sends it 2335 if so. Otherwise, it sends the next packet scheduled for the first 2336 transmission list. Note that when a packet is transmitted, it stays 2337 in the buffer in case it is not received by the SRT receiver. 2339 NAK packets are processed to fill in the loss list. As the latency 2340 window advances and packets are dropped from the sending queue, a 2341 check is performed to see if any of the dropped or resent packets are 2342 in the loss list, to determine if they can be removed from there as 2343 well so that they are not retransmitted unnecessarily. 2345 There is a counter for the packets that are resent. If there is no 2346 ACK for a packet, it will stay in the loss list and can be resent 2347 more than once. Packets in the loss list are prioritized. 
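The sender-side scheduling described in this section (retransmissions from the loss list take priority over first-time transmissions, and packets stay buffered until acknowledged) can be sketched as follows. The class SenderQueue and its method names are hypothetical, and sequence number wrap-around and the latency-based dropping of Section 4.6 are deliberately left out.

```python
from collections import deque


class SenderQueue:
    """Hypothetical sketch of loss-list priority on the SRT sender."""

    def __init__(self):
        self.buffer = {}            # seqno -> payload, kept until acknowledged
        self.new_packets = deque()  # seqnos waiting for their first transmission
        self.loss_list = []         # seqnos reported lost by NAKs, ascending
        self.resent = 0             # counter of retransmitted packets

    def submit(self, seqno, payload):
        self.buffer[seqno] = payload
        self.new_packets.append(seqno)

    def on_nak(self, lost_seqnos):
        # Fill in the loss list; ignore packets no longer in the buffer.
        for seqno in lost_seqnos:
            if seqno in self.buffer and seqno not in self.loss_list:
                self.loss_list.append(seqno)
        self.loss_list.sort()

    def on_ack(self, ack_seqno):
        # ack_seqno is the sequence number following the last received packet.
        for seqno in [s for s in self.buffer if s < ack_seqno]:
            del self.buffer[seqno]
        self.loss_list = [s for s in self.loss_list if s >= ack_seqno]

    def next_to_send(self):
        # Lost packets are sent before packets scheduled for the first time.
        if self.loss_list:
            self.resent += 1
            return self.loss_list.pop(0)  # a later NAK may list it again
        if self.new_packets:
            return self.new_packets.popleft()
        return None
```

A packet that is still missing after its retransmission would be listed again by a later (for example, periodic) NAK, which is how it can be resent more than once in this sketch.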
2349 If packets in the loss list continue to block the send queue, at some 2350 point this will cause the send queue to fill. When the send queue is 2351 full, the sender will begin to drop packets without even sending them 2352 the first time. An encoder (or other application) may continue to 2353 provide packets, but there's no place for them, so they will end up 2354 being thrown away. 2356 This condition where packets are unsent does not happen often. There 2357 is a maximum number of packets held in the send buffer based on the 2358 configured latency. Older packets that have no chance to be 2359 retransmitted and played in time are dropped, making room for newer 2360 real-time packets produced by the sending application. See 2361 Section 4.5, Section 4.6 for details. 2363 In addition to the regular NAKs, the Periodic NAK report mechanism 2364 can be used to send NAK reports periodically. The NAK packet in that 2365 case will have all the packets that the receiver considers being lost 2366 at the time of sending the Periodic NAK report. 2368 SRT Periodic NAK reports are sent with a period of (RTT + 4 * RTTVar) 2369 / 2 (so called NAKInterval), with a 20 milliseconds floor, where RTT 2370 and RTTVar are defined in Section 4.10. A NAK control packet 2371 contains a compressed list of the lost packets. Therefore, only lost 2372 packets are retransmitted. By using NAKInterval for the NAK reports 2373 period, it may happen that lost packets are retransmitted more than 2374 once, but it helps maintain low latency in the case where NAK packets 2375 are lost. 2377 An ACKACK tells the receiver to stop sending the ACK position because 2378 the sender already knows it. Otherwise, ACKs (with outdated 2379 information) would continue to be sent regularly. 2381 An ACK serves as a ping, with a corresponding ACKACK pong, to measure 2382 RTT. The time it takes for an ACK to be sent and an ACKACK to be 2383 received is the RTT. Each ACK has a number. 
A corresponding ACKACK has that same number. The receiver keeps a list of all ACKs in a queue to match them. Unlike a full ACK, which contains the current RTT and several other values in the Control Information Field (CIF) (Section 3.2.4), a light ACK just contains the sequence number. All control messages are sent directly and processed upon reception, but ACKACK processing time is negligible (the time this takes is included in the round-trip time).

4.9. Bidirectional Transmission Queues

Once an SRT connection is established, both peers can send data packets simultaneously.

4.10. Round-Trip Time Estimation

Round-trip time (RTT) in SRT is estimated during the transmission of data packets, based on the time difference between the moment an ACK packet is sent out and the moment the corresponding ACKACK packet is received back by the SRT receiver.

An ACK sent by the receiver triggers an ACKACK from the sender with minimal processing delay. The ACKACK response is expected to arrive at the receiver roughly one RTT after the corresponding ACK was sent.

The SRT receiver records the time when an ACK is sent out. The ACK carries a unique sequence number (independent of the data packet sequence number). The corresponding ACKACK also carries the same sequence number. Upon receiving the ACKACK, SRT calculates the RTT as the difference between the ACKACK arrival time and the ACK departure time. In the following formula, RTT is the current value that the receiver maintains and rtt is the recent value that was just calculated from an ACK/ACKACK pair:

   RTT = 7/8 * RTT + 1/8 * rtt

RTT variance (RTTVar) is obtained as follows:

   RTTVar = 3/4 * RTTVar + 1/4 * abs(RTT - rtt)

where abs() means an absolute value.

Both RTT and RTTVar are measured in microseconds. The initial value of RTT is 100 milliseconds, and the initial value of RTTVar is 50 milliseconds.
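The ACK/ACKACK matching and the two smoothing formulas above can be sketched as follows. The class RttEstimator and its method names are hypothetical. The text does not state which of the two values is updated first; this sketch assumes RTTVar is updated first, using the previous RTT.

```python
from dataclasses import dataclass, field


@dataclass
class RttEstimator:
    """Hypothetical receiver-side RTT estimator; all times in microseconds."""
    rtt_us: float = 100_000.0      # initial RTT: 100 milliseconds
    rtt_var_us: float = 50_000.0   # initial RTTVar: 50 milliseconds
    pending: dict = field(default_factory=dict)  # ACK number -> send time

    def on_ack_sent(self, ack_no: int, now_us: float) -> None:
        self.pending[ack_no] = now_us

    def on_ackack(self, ack_no: int, now_us: float) -> None:
        sent_us = self.pending.pop(ack_no, None)
        if sent_us is not None:
            self.update(now_us - sent_us)

    def update(self, sample_us: float) -> None:
        # Assumption: RTTVar is computed against the RTT before its update.
        self.rtt_var_us = 0.75 * self.rtt_var_us + 0.25 * abs(self.rtt_us - sample_us)
        self.rtt_us = 0.875 * self.rtt_us + 0.125 * sample_us
```

With the initial values, a single 20 millisecond sample moves RTT to 90 milliseconds, which shows how slowly the estimate leaves its 100 millisecond starting point.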
The round-trip time (RTT) calculated by the receiver as well as the RTT variance (RTTVar) are sent with the next full acknowledgement packet (see Section 3.2.4). Note that the first ACK in an SRT session might contain an initial RTT value of 100 milliseconds, because the early calculations may not be precise.

The sender always gets the RTT from the receiver. It does not have an analog to the ACK/ACKACK mechanism, i.e. it cannot send a message that guarantees an immediate return without processing. Upon reception of an ACK, the SRT sender updates its own RTT and RTTVar values using the same formulas as above, in which case rtt is the most recent value it receives, i.e. the value carried by the incoming ACK.

Note that an SRT socket can both send and receive data packets. RTT and RTTVar are updated by the socket based on algorithms for the sender (using ACK packets) and for the receiver (using ACK/ACKACK pairs). When an SRT socket receives data, it updates its local RTT and RTTVar, which can be used for its own sender as well.

5. SRT Packet Pacing and Congestion Control

SRT provides certain mechanisms for exchanging feedback on the state of packet transmission between sender and receiver. Every 10 milliseconds the receiving side sends acknowledgement (ACK) packets (Section 3.2.4) to the sender that include the latest values of RTT, RTT variance, available buffer size, receiving rate, and estimated link capacity. Similarly, NAK packets (Section 3.2.5) from the receiver inform the sender of any packet loss during the transmission, triggering an appropriate response. These mechanisms provide a solid basis for the integration of various congestion control algorithms in the SRT protocol.
2460 As SRT is designed both for live streaming and file transmission 2461 (Section 4.2), there are two groups of congestion control algorithms 2462 defined in SRT: Live Congestion Control (LiveCC), and File Transfer 2463 Congestion Control (FileCC). 2465 5.1. SRT Packet Pacing and Live Congestion Control (LiveCC) 2467 To ensure smooth video playback on a receiving peer during live 2468 streaming, SRT must control the sender's buffer level to prevent 2469 overfill and depletion. The pacing control module is designed to 2470 send packets as fast as they are submitted by a video application 2471 while maintaining a relatively stable buffer level. While this looks 2472 like a simple problem, the details of the Automatic Repeat Request 2473 (ARQ) behaviour between input and output of the SRT sender add some 2474 complexity. 2476 SRT needs a certain amount of bandwidth overhead in order to have 2477 space for the sender to insert packets for retransmission with 2478 minimum impact on the output rate of the main packet transmission. 2480 This balance is achieved by adjusting the maximum allowed bandwidth 2481 MAX_BW (Section 5.1.1) which limits the bandwidth usage by SRT. The 2482 MAX_BW value is used by the Live Congestion Control (LiveCC) module 2483 to calculate the minimum interval between consecutive sent packets 2484 PKT_SND_PERIOD. In principle, the space between packets determines 2485 where retransmissions can be inserted, and the overhead represents 2486 the available margin. There is an empiric calculation that defines 2487 the interval, in microseconds, between two packets to give a certain 2488 bitrate. It is a function of the average packet payload (which 2489 includes video, audio, etc.) and the configured maximum bandwidth 2490 (MAX_BW). See Section 5.1.2 for details. 2492 In the case of live streaming, the sender is allowed to drop packets 2493 that cannot be delivered in time (Section 4.6). 
The combination of pacing control and Live Congestion Control (LiveCC), based on the input rate and an overhead for packet retransmission, helps avoid congestion during fluctuations of the source bitrate.

During live streaming over highly variable networks, fairness can be achieved by controlling the bitrate of the source encoder at the input of the SRT sender. The SRT sender can provide a variety of network-related statistics, such as the RTT estimate, the packet loss level, the number of packets dropped, etc., to the encoder, which can use them to make decisions and adjust the bitrate in real time.

5.1.1. Configuring Maximum Bandwidth

There are several ways of configuring maximum bandwidth (MAX_BW):

1. MAXBW_SET mode: Set the value explicitly.

   The recommended default value is 1 Gbps. The default value is set only for live streaming.

   Note that this static setting is not well-suited to a variable input, such as when the bitrate is changed on an encoder. Each time the input bitrate is configured on the encoder, MAX_BW should also be reconfigured.

2. INPUTBW_SET mode: Set the SRT sender's input rate (INPUT_BW) and overhead (OVERHEAD).

   In this mode, SRT calculates the maximum bandwidth as follows:

      MAX_BW = INPUT_BW * (1 + OVERHEAD / 100)

   Note that INPUTBW_SET mode reduces to the MAXBW_SET mode and the same restrictions apply.

3. INPUTBW_ESTIMATED mode: Measure the SRT sender's input rate internally and set the overhead (OVERHEAD).

   In this mode, SRT adjusts the value of maximum bandwidth each time it gets the updated estimate of the input rate EST_INPUT_BW:

      MAX_BW = EST_INPUT_BW * (1 + OVERHEAD / 100)

Note that the units of MAX_BW, INPUT_BW, and EST_INPUT_BW are bytes per second. OVERHEAD is defined in %.
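The three configuration modes above can be sketched as a single function. The function name and its keyword parameters are hypothetical; all rates are in bytes per second and OVERHEAD is in percent, as stated above. Integer arithmetic is used here only to keep the example exact.

```python
def compute_max_bw(mode, *, max_bw=0, input_bw=0, est_input_bw=0, overhead_pct=0):
    """Return MAX_BW in bytes per second for the given configuration mode."""
    if mode == "MAXBW_SET":
        # The value is set explicitly; INPUT_BW and OVERHEAD are ignored.
        return max_bw
    if mode == "INPUTBW_SET":
        # MAX_BW = INPUT_BW * (1 + OVERHEAD / 100)
        return input_bw * (100 + overhead_pct) // 100
    if mode == "INPUTBW_ESTIMATED":
        # MAX_BW = EST_INPUT_BW * (1 + OVERHEAD / 100), re-evaluated on
        # every new estimate of the input rate.
        return est_input_bw * (100 + overhead_pct) // 100
    raise ValueError(f"unknown mode: {mode}")
```

For example, an input rate of 1,000,000 bytes per second with an overhead of 25% (a value chosen only for illustration) yields a MAX_BW of 1,250,000 bytes per second.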
INPUTBW_ESTIMATED mode is recommended for setting the maximum bandwidth (MAX_BW) as it follows the fluctuations in SRT sender's input rate. However, there are certain considerations that should be taken into account.

In INPUTBW_SET mode, SRT takes as an input the rate that had been configured as the expected output rate of an encoder (in terms of bitrate for the packets including audio and overhead). But it is normal for an encoder to occasionally overshoot. At low bitrate, sometimes an encoder can be too optimistic and will output more bits than expected. Under these conditions, SRT packets would not go out fast enough because the configured bandwidth limitation would be too low.

This is mitigated by calculating the bitrate internally (INPUTBW_ESTIMATED mode). SRT examines the packets being submitted and calculates the input rate as a moving average. However, this introduces a bit of a delay based on the content. It also means that if an encoder encounters black screens or still frames, this would dramatically lower the bitrate being measured, which would in turn reduce the SRT output rate. And then, when the video picks up again, the input rate rises sharply. SRT would not start up again fast enough on output because of the time it takes to measure the speed. Packets might be accumulated in the SRT's sender buffer and delayed as a result, causing them to arrive too late at the decoder, and possible drops by the receiver.

The following table shows a summary of the bandwidth configuration modes and the variables that need to be set (v) or ignored (-):

   | Mode / Variable       | MAX_BW | INPUT_BW | OVERHEAD |
   | --------------------- | ------ | -------- | -------- |
   | MAXBW_SET             | v      | -        | -        |
   | INPUTBW_SET           | -      | v        | v        |
   | INPUTBW_ESTIMATED     | -      | -        | v        |

5.1.2. SRT's Default LiveCC Algorithm

The main goal of the SRT's default LiveCC algorithm is to adjust the minimum allowed packet sending period PKT_SND_PERIOD (and, as a result, the maximum allowed sending rate) during transmission based on the average packet payload size (AvgPayloadSize) and maximum bandwidth (MAX_BW).

On the sender side, there are three events that the LiveCC algorithm reacts to: (1) sending a data packet, (2) receiving an acknowledgement (ACK) packet, and (3) a timeout event as described below.

(1) On sending a data packet (either original or retransmitted), update the value of average packet payload size (AvgPayloadSize):

   AvgPayloadSize = 7/8 * AvgPayloadSize + 1/8 * PacketPayloadSize

where PacketPayloadSize is the payload size of a sent data packet, in bytes; the initial value of AvgPayloadSize is equal to the maximum allowed packet payload size, which cannot be larger than 1456 bytes.

(2) On an acknowledgement (ACK) packet reception:

Step 1. Calculate SRT packet size (PktSize) as the sum of average payload size (AvgPayloadSize) and SRT header size (Section 3), in bytes.

Step 2. Calculate the minimum allowed packet sending period (PKT_SND_PERIOD) as:

   PKT_SND_PERIOD = PktSize * 1000000 / MAX_BW

where MAX_BW is the configured maximum bandwidth which limits the bandwidth usage by SRT, in bytes per second; PKT_SND_PERIOD is measured in microseconds.

(3) On a retransmission timeout (RTO) event, follow the same steps as described in method (1) above.

RTO is the amount of time within which an acknowledgement is expected after a data packet is sent out. If there is no ACK after this amount of time has elapsed, a timeout event is triggered.
Since SRT only acknowledges every SYN time (Section 4.8.1), the value of retransmission timeout is defined as follows:

   RTO = RTT + 4 * RTTVar + 2 * SYN

where RTT is the round-trip time estimate, in microseconds, and RTTVar is the variance of the RTT estimate, in microseconds, reported by the receiver and smoothed at the sender side (see Section 3.2.4, Section 4.10). Here and throughout the current section, smoothing means applying an exponentially weighted moving average (EWMA).

Consecutive timeouts should increase the RTO value. In SRT, a counter (RexmitCount) is used to track the number of consecutive timeouts:

   RTO = RexmitCount * (RTT + 4 * RTTVar + 2 * SYN) + SYN

On the receiver side, when a loss report is sent, the sending interval of periodic NAK reports (Section 4.8.2) is updated as follows:

   NAKInterval = max((RTT + 4 * RTTVar) / 2, 20000)

where RTT and RTTVar are the receiver's estimates, in microseconds (see Section 3.2.4, Section 4.10). The minimum value of NAKInterval is set to 20 milliseconds (20000 microseconds) in order to avoid sending periodic NAK reports too often under low latency conditions.

5.2. File Transfer Congestion Control (FileCC)

For file transfer (Section 4.2), any known congestion control algorithm like CUBIC [RFC8312] or BBR [BBR] can be applied, including SRT's default FileCC algorithm described below.

5.2.1. SRT's Default FileCC Algorithm

SRT's default FileCC algorithm is a modified version of the UDT native congestion control algorithm [GuAnAO], [GHG04b] designed for bulk data transfer over networks with a large bandwidth-delay product (BDP). It is a hybrid Additive Increase Multiplicative Decrease (AIMD) algorithm, hence it adjusts both congestion window size (CWND_SIZE) and packet sending period (PKT_SND_PERIOD). The units of measurement for CWND_SIZE and PKT_SND_PERIOD are packets and microseconds, respectively.
2664 The algorithm controls sending rate by tuning the packet sending 2665 period (i.e. how often packets are sent out). The sending rate is 2666 increased upon receipt of an acknowledgement (ACK), and decreased 2667 when receiving a loss report (negative acknowledgement, or NAK). 2668 Only full ACKs, not light ACKs (Section 4.8.1), trigger an increase 2669 in the sending rate. 2671 SRT congestion control has two phases: "Slow Start" and "Congestion 2672 Avoidance". In the slow start phase the congestion control module 2673 probes the network to determine available bandwidth and the target 2674 sending rate for the next (operational) phase, which is congestion 2675 avoidance. In this phase, if there is no congestion detected via 2676 loss reports, the sending rate is gradually increased. Conversely, 2677 if a network congestion is detected, the algorithm decreases the 2678 sending rate to reduce subsequent packet loss. The slow start phase 2679 runs exactly once at the beginning of a connection, and stops when a 2680 packet loss occurs, when the congestion window size reaches its 2681 maximum value, or on a timeout event. 2683 The detailed algorithm behaviour at both phases is described in 2684 Section 5.2.1.1 and Section 5.2.1.2, respectively. 2686 As with LiveCC, SRT's default FileCC algorithm reacts to three 2687 events: (1) sending a data packet, (2) receiving an acknowledgement 2688 (ACK) packet, and (3) a timeout event. These are described below as 2689 they apply to the congestion control phases. 2691 5.2.1.1. Slow Start 2693 During the slow start phase, the packet sending period PKT_SND_PERIOD 2694 is kept at 1 microsecond in order to send packets as fast as 2695 possible, but not at an infinite rate. The initial value of the 2696 congestion window size (CWND_SIZE) is set to 16 packets. 
CWND_SIZE has an upper threshold, which is the maximum allowed congestion window size (MAX_CWND_SIZE), so that even if there is no packet loss, the slow start phase has to stop at a certain point. The threshold can be set to the maximum receiver buffer size (12 MB).

(1) On an acknowledgement (ACK) packet reception:

Step 1. If the interval since the last time the sending rate was either increased or kept (LastRCTime) is less than RC_INTERVAL:

   a. Keep the sending rate at the same level;

   b. Stop.

   if (currTime - LastRCTime < RC_INTERVAL)
   {
       Keep the sending rate at the same level;
       Stop;
   }

where currTime is the current time, in microseconds; LastRCTime is the last time the sending rate was either increased, or kept, in microseconds.

Step 2. Update the value of LastRCTime to the current time:

   LastRCTime = currTime

Step 3. The size of congestion window CWND_SIZE is increased by the difference in sequence numbers of the data packet being acknowledged ACK_SEQNO and the last acknowledged data packet LAST_ACK_SEQNO:

   CWND_SIZE += ACK_SEQNO - LAST_ACK_SEQNO

Step 4. The sequence number of the last acknowledged data packet LAST_ACK_SEQNO is updated as follows:

   LAST_ACK_SEQNO = ACK_SEQNO

Step 5. If the congestion window size CWND_SIZE calculated at Step 3 is greater than the upper threshold MAX_CWND_SIZE, slow start phase ends. Set the packet sending period PKT_SND_PERIOD as follows:

   if (RECEIVING_RATE > 0)
       PKT_SND_PERIOD = 1000000 / RECEIVING_RATE;
   else
       PKT_SND_PERIOD = CWND_SIZE / (RTT + RC_INTERVAL);

where

*  RECEIVING_RATE is the rate at which packets are being received, in packets per second, reported by the receiver and smoothed at the sender side (see Section 3.2.4, Section 5.2.1.3);

*  RTT is the round-trip time estimate, in microseconds, reported by the receiver and smoothed at the sender side (see Section 3.2.4, Section 4.10);

*  RC_INTERVAL is the fixed rate control interval, in microseconds. RC_INTERVAL of SRT is SYN, or synchronization time interval, which is 0.01 second. An ACK in SRT is sent every fixed time interval. The maximum and default ACK time interval is SYN. See Section 4.8.1 for details.

(2) On a loss report (NAK) packet reception:

*  Slow start phase ends;

*  Set the packet sending period PKT_SND_PERIOD as described in Step 5 of section (1) above.

(3) On a retransmission timeout (RTO) event:

*  Slow start phase ends;

*  Set the packet sending period PKT_SND_PERIOD as described in Step 5 of section (1) above.

5.2.1.2. Congestion Avoidance

Once the slow start phase ends, the algorithm enters the congestion avoidance phase and behaves as described below.

(1) On an acknowledgement (ACK) packet reception:

Step 1. If the interval since the last time the sending rate was either increased or kept (LastRCTime) is less than RC_INTERVAL:

   a. Keep the sending rate at the same level;

   b. Stop.

   if (currTime - LastRCTime < RC_INTERVAL)
   {
       Keep the sending rate at the same level;
       Stop;
   }

where currTime is the current time, in microseconds; LastRCTime is the last time the sending rate was either increased, or kept, in microseconds.

Step 2. Update the value of LastRCTime to the current time:

   LastRCTime = currTime

Step 3. Set the congestion window size to:

   CWND_SIZE = RECEIVING_RATE * (RTT + RC_INTERVAL) / 1000000 + 16

Step 4. If there is packet loss reported by the receiver (bLoss=True):

   a. Keep the value of PKT_SND_PERIOD at the same level;

   b. Set the value of bLoss to False;

   c. Stop.

bLoss flag is equal to True if a packet loss has happened since the last sending rate increase. Initial value: False.

Step 5. If there is no packet loss reported by the receiver (bLoss=False), calculate PKT_SND_PERIOD as follows:

   inc = 0;

   lossBandwidth = 2 * (1000000 / LastDecPeriod);
   linkCapacity = min(lossBandwidth, EST_LINK_CAPACITY);
   B = linkCapacity - 1000000 / PKT_SND_PERIOD;

   if ((PKT_SND_PERIOD > LastDecPeriod) && ((linkCapacity / 9) < B))
       B = linkCapacity / 9;
   if (B <= 0)
       inc = 1 / S;
   else
   {
       inc = pow(10.0, ceil(log10(B * S * 8))) * 0.0000015 / S;
       inc = max(inc, 1 / S);
   }

   PKT_SND_PERIOD = (PKT_SND_PERIOD * RC_INTERVAL) /
                    (PKT_SND_PERIOD * inc + RC_INTERVAL);

where

*  LastDecPeriod is the value of PKT_SND_PERIOD right before the last sending rate decrease has happened (on a loss report (NAK) packet reception), in microseconds. The initial value of LastDecPeriod is set to 1 microsecond;

*  EST_LINK_CAPACITY is the estimated link capacity reported by the receiver within an ACK packet and smoothed at the sender side (Section 5.2.1.3), in packets per second;

*  B is the estimated available bandwidth, in packets per second;

*  S is the SRT packet size (in terms of IP payload) in bytes. SRT treats 1500 bytes as a standard packet size.

A detailed explanation of the formulas used to calculate the increase in sending rate can be found in [GuAnAO].
UDT's available bandwidth estimation has been modified to take into account the bandwidth registered at the moment of packet loss, since the estimated link capacity reported by the receiver may overestimate the actual link capacity significantly.

Step 6. If the value of maximum bandwidth MAX_BW defined in Section 5.1 is set, limit the value of PKT_SND_PERIOD to the minimum allowed period, if necessary:

   if (MAX_BW)
   {
       MIN_PERIOD = 1000000 / (MAX_BW / S);

       if (PKT_SND_PERIOD < MIN_PERIOD)
           PKT_SND_PERIOD = MIN_PERIOD;
   }

Note that in the case of file transmission the maximum allowed bandwidth (MAX_BW) for SRT can be defined. This limits the minimum possible interval between packets sent. Only the usage of MAXBW_SET mode is possible (Section 5.1.1). In contrast with live streaming, there is no default value set for MAX_BW, and the transmission rate is not limited if not set explicitly.

(2) On a loss report (NAK) packet reception:

Step 1. Set the value of flag bLoss equal to True.

Step 2. If the current loss ratio estimated by the sender is less than 2%:

   a. Keep the sending rate at the same level;

   b. Update the value of LastDecPeriod:

      LastDecPeriod = PKT_SND_PERIOD

   c. Stop.

This modification has been introduced to increase the algorithm's tolerance to random packet loss, which is typical for public networks but is not related to a lack of available bandwidth.

Step 3. If the sequence number of a packet being reported as lost is greater than the largest sequence number that had been sent at the time of the last sending rate decrease (LastDecSeq), i.e. this NAK starts a new congestion period:

   a. Set the value of LastDecPeriod to the current packet sending period PKT_SND_PERIOD;

   b. Increase the value of packet sending period:

      PKT_SND_PERIOD = 1.03 * PKT_SND_PERIOD

   c. Update AvgNAKNum:

      AvgNAKNum = 0.97 * AvgNAKNum + 0.03 * NAKCount

   d. Reset NAKCount and DecCount values to 1;

   e. Record the current largest sent sequence number LastDecSeq;

   f. Set DecRandom to a random (uniformly distributed) number between 1 and AvgNAKNum. If DecRandom < 1: DecRandom = 1;

   g. Stop;

where

*  AvgNAKNum is the average number of NAKs during a congestion period. Initial value: 0;

*  NAKCount is the number of NAKs received so far in the current congestion period. Initial value: 0;

*  DecCount is the number of times that the sending rate has been decreased during the congestion period. Initial value: 0;

*  DecRandom is a random number used to decide if the rate should be decreased or not for the following NAKs (not the first one) during the congestion period. DecRandom is a random number between 1 and the average number of NAKs per congestion period (AvgNAKNum).

A congestion period is defined as the time between two NAKs in which the biggest lost packet sequence number carried in the NAK is greater than LastDecSeq.

The coefficients used in the formulas above have been slightly modified to reduce the amount by which the sending rate decreases.

Step 4. If DecCount <= 5, and NAKCount == DecCount * DecRandom:

   a. Increase the packet sending period:

      PKT_SND_PERIOD = 1.03 * PKT_SND_PERIOD

   b. Increase DecCount and NAKCount by 1;

   c. Record the current largest sent sequence number (LastDecSeq).

5.2.1.3. Link Capacity and Receiving Rate Estimation

Estimates of link capacity and receiving rate, in packets/bytes per second, are calculated at the receiver side during file transmission (Section 4.2). It is worth noting that the receiving rate estimate, while available during the entire data transmission period, is used only during the slow start phase of the congestion control algorithm (Section 5.2.1.1).
The latest estimate obtained before the end of 2972 the slow start period is used by the sender as a reference maximum 2973 speed to continue data transmission without further congestion. Link 2974 capacity is estimated all the time and used primarily (as well as 2975 packet loss ratio and other protocol statistics) for sending rate 2976 adjustments during the transmission process. 2978 As each data packet arrives, the receiver records the time delta with 2979 respect to the arrival of the previous data packet, which is used to 2980 estimate bandwidth and receiving speed (delivery rate). This and 2981 other control information is communicated to the sender by means of 2982 acknowledgment (ACK) packets sent every 10 milliseconds. At the 2983 sender side, upon receiving a new value, an exponentially weighted 2984 moving average (EWMA) is applied to update the latest estimate 2985 maintained at the sender side. 2987 It is important to note that for bandwidth estimation only data 2988 probing packets are taken into account, while all data packets (both 2989 data and data probing) are used for estimating receiving speed. Data 2990 probing refers to the use of the packet pairs technique, whereby 2991 pairs of probing packets are sent to a server back-to-back, thus 2992 making it possible to measure the minimum interval in receiving 2993 consecutive packets. 2995 The detailed description of models used to estimate link capacity and 2996 receiving rate can be found in [GuAnAO], [GHG04b]. 2998 6. Encryption 3000 This section describes the encryption mechanism that protects the 3001 payload of SRT streams. Based on standard cryptographic algorithms, 3002 the mechanism allows an efficient stream cipher with a key 3003 establishment method. 3005 6.1. Overview 3007 SRT implements encryption using AES [AES] in counter mode (AES-CTR) 3008 [SP800-38A] with a short-lived key to encrypt and decrypt the media 3009 stream. 
The AES-CTR cipher is suitable for continuous stream 3010 encryption that permits decryption from any point, without access to 3011 start of the stream (random access), and for the same reason 3012 tolerates packet loss. It also offers strong confidentiality when 3013 the counter is managed properly. 3015 6.1.1. Encryption Scope 3017 SRT encrypts only the payload of SRT data packets (Section 3.1), 3018 while the header is left unencrypted. The unencrypted header 3019 contains the Packet Sequence Number field used to keep the 3020 synchronization of the cipher counter between the encrypting sender 3021 and the decrypting receiver. No constraints apply to the payload of 3022 SRT data packets as no padding of the payload is required by counter 3023 mode ciphers. 3025 6.1.2. AES Counter 3027 The counter for AES-CTR is the size of the cipher's block, i.e. 128 3028 bits. It is derived from a 128-bit sequence consisting of 3030 * a block counter in the least significant 16 bits which counts the 3031 blocks in a packet; 3033 * a packet index, based on the packet sequence number in the SRT 3034 header, in the next 32 bits; 3036 * eighty zeroed bits. 3038 The upper 112 bits of this sequence are XORed with an Initialization 3039 Vector (IV) to produce a unique counter for each crypto block. The 3040 IV is derived from the Salt provided in the Keying Material 3041 (Section 3.2.2): 3043 IV = MSB(112, Salt): Most significant 112 bits of the salt. 3045 6.1.3. Stream Encrypting Key (SEK) 3047 The key used for AES-CTR encryption is called the "Stream Encrypting 3048 Key" (SEK). It is used for up to 2^25 packets with further rekeying. 3049 The short-lived SEK is generated by the sender using a pseudo-random 3050 number generator (PRNG), and transmitted within the stream, wrapped 3051 with another longer-term key, the Key Encrypting Key (KEK), using a 3052 known AES key wrap protocol. 
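As an illustration of the counter layout given in Section 6.1.2, the following sketch builds the 128-bit AES-CTR counter block for one crypto block of one packet. The function name is hypothetical, the Salt is assumed to be 128 bits long, and the packet index is taken as a ready-made 32-bit input (its exact derivation from the packet sequence number is not shown here). No encryption is performed; only the counter value is computed.

```python
def aes_ctr_counter(salt: bytes, packet_index: int, block_counter: int) -> bytes:
    """Return the 16-byte counter block (big-endian) for one crypto block."""
    if len(salt) != 16:
        raise ValueError("this sketch assumes a 128-bit salt")
    if not 0 <= packet_index < 2**32 or not 0 <= block_counter < 2**16:
        raise ValueError("packet index is 32 bits, block counter is 16 bits")

    # IV = MSB(112, Salt): the most significant 112 bits (14 bytes) of the salt.
    iv = int.from_bytes(salt[:14], "big")

    # Upper 112 bits of the sequence: eighty zeroed bits followed by the
    # 32-bit packet index.
    upper = packet_index

    # XOR the upper 112 bits with the IV; the 16-bit block counter stays
    # in the least significant bits.
    counter = ((upper ^ iv) << 16) | block_counter
    return counter.to_bytes(16, "big")
```

Since the block counter occupies 16 bits and is not touched by the IV, consecutive blocks of the same packet differ only in the last two bytes of the counter.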

   For connection-oriented transport such as SRT, there is no need to
   periodically transmit the short-lived key since no additional party
   can join a stream in progress. The keying material is transmitted
   within the connection handshake packets, and for a short period when
   rekeying occurs.

6.1.4. Key Encrypting Key (KEK)

   The Key Encrypting Key (KEK) is derived from a secret (passphrase)
   shared between the sender and the receiver. The KEK provides access
   to the Stream Encrypting Key, which in turn provides access to the
   protected payload of SRT data packets. The KEK has to be at least as
   long as the SEK.

   The KEK is generated by a password-based key derivation function
   (PBKDF2) [RFC2898], using the passphrase, a number of iterations
   (2048), a keyed-hash (HMAC-SHA1) [RFC2104], and a key length value
   (KLen). PBKDF2 repeatedly applies the keyed hash to the passphrase
   and salt to produce a key of the requested length. The number of
   iterations is based on how much time can be given to the process
   without it becoming disruptive.

6.1.5. Key Material Exchange

   The KEK is used to generate a wrap [RFC3394] that is put in a key
   material (KM) message by the initiator of a connection (i.e. the
   caller in the caller-listener handshake and the initiator in the
   rendezvous handshake, see Section 4.3) to send to the responder
   (listener). The KM message contains the key length, the salt (one of
   the arguments provided to the PBKDF2 function), the protocol being
   used (e.g. AES-256) and the AES counter (which will eventually
   change, see Section 6.1.6).

   On the other side, the responder attempts to decode the wrap to
   obtain the Stream Encrypting Key. The wrap includes padding with a
   known template, so the responder knows from the KM that it has the
   right KEK to decode the SEK.
   The SEK (generated and transmitted by the initiator) is random, and
   cannot be known in advance. The KEK is calculated in the same way on
   both sides, except that the initiator decides on the configured key
   length (KLen), while the responder obtains it from the key material
   (KM) sent by the initiator.

   The responder returns the same KM message to show that it has the
   same information as the initiator, and that the encoded material will
   be decrypted. If the responder does not return this status, it does
   not have the SEK. In that case the responder is unable to decrypt
   incoming encrypted packets, so they are dropped even if they are
   transmitted successfully, and all data packets coming from the
   responder are unencrypted.

6.1.6. KM Refresh

   The short-lived SEK is regenerated for cryptographic reasons when a
   pre-determined number of packets has been encrypted. The KM refresh
   period is determined by the implementation. The receiver knows which
   SEK (odd or even) was used to encrypt the packet by means of the KK
   field of the SRT Data Packet (Section 3.1).

   There are two variables used to determine the KM Refresh timing:

   *  KM Refresh Period specifies the number of packets to be sent
      before switching to the new SEK.

   *  KM Pre-Announcement Period specifies how many packets before the
      key switchover a new key is announced. The same value is used to
      determine when to decommission the old key after switchover.

   The recommended KM Refresh Period is 2^25 packets encrypted with the
   same SEK. The recommended KM Pre-Announcement Period is 4000 packets
   (i.e. a new key is generated, wrapped, and sent at 2^25 minus 4000
   packets; the old key is decommissioned at 2^25 plus 4000 packets).

   Even and odd keys are alternated during transmission in the
   following way. Packets encrypted with the earlier key #1 (let it be
   the odd key) continue to be sent. The receiver receives the new key
   #2 (even), then decrypts and unwraps it, and replies to the sender if
   it is able to do so. Once the sender gets to the 2^25th packet using
   the odd key (key #1), it starts to send packets with the even key
   (key #2), knowing that the receiver has what it needs to decrypt
   them. This happens transparently, from one packet to the next. At
   2^25 plus 4000 packets the first key is decommissioned automatically.

   Both keys live in parallel for two times the Pre-Announcement Period
   (e.g. 4000 packets before the key switch, and 4000 packets after).
   This is to allow for packet retransmission. It is possible for
   packets with the older key to arrive at the receiver a bit late.
   Each packet contains a description of which key it requires, so the
   receiver will still have the ability to decrypt it.

6.2. Encryption Process

6.2.1. Generating the Stream Encrypting Key

   On the sending side SEK, Salt and KEK are generated in the following
   way:

      SEK  = PRNG(KLen)
      Salt = PRNG(128)
      KEK  = PBKDF2(passphrase, LSB(64,Salt), Iter, KLen)

   where

   *  PBKDF2 is the PKCS#5 Password Based Key Derivation Function
      [RFC2898];

   *  passphrase is the pre-shared passphrase;

   *  Salt is a field of the KM message;

   *  LSB(n, v) is the function taking n least significant bits of v;

   *  Iter=2048 defines the number of iterations for PBKDF2;

   *  KLen is a field of the KM message.

      Wrap = AESkw(KEK, SEK)

   where AESkw(KEK, SEK) is the key wrapping function [RFC3394].

6.2.2. Encrypting the Payload

   The encryption of the payload of the SRT data packet is done with
   AES-CTR

      EncryptedPayload = AES_CTR_Encrypt(SEK, IV, UnencryptedPayload)

   where the Initialization Vector (IV) is derived as

      IV = (MSB(112, Salt) << 2) XOR (PktSeqNo)

   and PktSeqNo is the value of the Packet Sequence Number field of the
   SRT data packet.

6.3. Decryption Process

6.3.1. Restoring the Stream Encrypting Key

   For the receiver to be able to decrypt the incoming stream it has to
   know the stream encrypting key (SEK) used by the sender. The
   receiver MUST know the passphrase used by the sender. The remaining
   information can be extracted from the Keying Material message.

   The Keying Material message contains the AES-wrapped [RFC3394] SEK
   used by the sender. The Key Encrypting Key (KEK) required to unwrap
   the SEK is calculated as:

      KEK = PBKDF2(passphrase, LSB(64,Salt), Iter, KLen)

   where

   *  PBKDF2 is the PKCS#5 Password Based Key Derivation Function
      [RFC2898];

   *  passphrase is the pre-shared passphrase;

   *  Salt is a field of the KM message;

   *  LSB(n, v) is the function taking n least significant bits of v;

   *  Iter=2048 defines the number of iterations for PBKDF2;

   *  KLen is a field of the KM message.

      SEK = AESkuw(KEK, Wrap)

   where AESkuw(KEK, Wrap) is the key unwrapping function.

6.3.2. Decrypting the Payload

   The decryption of the payload of the SRT data packet is done with
   AES-CTR (in counter mode, decryption applies the same operation as
   encryption)

      DecryptedPayload = AES_CTR_Encrypt(SEK, IV, EncryptedPayload)

   where the Initialization Vector (IV) is derived as

      IV = (MSB(112, Salt) << 2) XOR (PktSeqNo)

   and PktSeqNo is the value of the Packet Sequence Number field of the
   SRT data packet.

7. Best Practices and Configuration Tips for Data Transmission via SRT

7.1. Live Streaming

   This section describes real world examples of live audio/video
   streaming and the current consensus on maintaining the compatibility
   between SRT implementations by different vendors. It is meant as
   guidance for developers to write applications compatible with
   existing SRT implementations.

   The term "live streaming" refers to MPEG-TS style continuous data
   transmission with latency management. Live streaming based on
   segmentation and transmission of files, as in the HLS protocol
   [RFC8216], is not part of this use case.

   The default SRT data transmission mode for continuous live streaming
   is message mode (Section 4.2.1) with certain settings applied as
   described below:

   *  Only data packets with their Packet Position Flag (PP) field set
      to "11b" are allowed, meaning a single data packet forms exactly
      one message (Section 3.1).

   *  Timestamp-Based Packet Delivery (TSBPD) (Section 4.5) and Too-Late
      Packet Drop (TLPKTDROP) (Section 4.6) mechanisms must be enabled.

   *  Live Congestion Control (LiveCC) (Section 5.1) must be used.

   *  The Order Flag (Section 3.1) needs special attention. In the case
      of live streaming, it is set to 0, allowing out of order delivery
      of a packet. However, in this use case the Order Flag has to be
      ignored by the receiver. As TSBPD is enabled, the receiver will
      still deliver packets in order, but based on the timestamps. In
      the case of a packet arriving too late and skipped by the
      TLPKTDROP mechanism, the order of delivery is still maintained
      except for potential sequence discontinuity.

      This method has evolved historically and is the current common
      standard for live streaming across different SRT implementations.
      A change or variation of the settings will break compatibility
      between two parties.

   This combination of settings allows live streaming with a constant
   latency (Section 4.4).
   The receiving end will not "fall behind" in time by waiting for
   missing packets. However, data integrity might not be ensured if
   packets or retransmitted packets do not arrive within the expected
   time frame. Audio or video interruption can occur, but the overall
   latency is maintained and does not increase over time whenever
   packets are missing.

7.2. File Transmission

   This section describes the use case of file transmission and provides
   configuration examples.

   The usage of both message and buffer modes (Section 4.2) is possible
   in this case. For both modes, Timestamp-Based Packet Delivery
   (TSBPD) (Section 4.5) and Too-Late Packet Drop (TLPKTDROP)
   (Section 4.6) mechanisms must be turned off, while File Transfer
   Congestion Control (FileCC) (Section 5.2) must be enabled.

   When TSBPD is disabled, each packet gets timestamped with the time it
   is sent by the SRT sender. A packet being sent for the first time
   will have a timestamp different from that of a corresponding
   retransmitted packet. In contrast to the live streaming case, the
   timing of packet delivery is not critical when sending files. The
   most important thing is data integrity. Therefore the TLPKTDROP
   mechanism must be disabled in this case. No data is allowed to be
   dropped, because this will result in corrupted files with missing
   data. The retransmission of missing packets has to happen until the
   packets are finally acknowledged by the SRT receiver.

   The File Transfer Congestion Control (FileCC) mechanism will take
   care of using the available link bandwidth for maximum transfer
   speed.

7.2.1. File Transmission in Buffer Mode

   The original UDT protocol [GHG04b] used buffer mode (Section 4.2.2)
   to send files, and the same is possible in SRT. This mode was
   designed to transmit one file per connection. For a single file
   transmission, a socket is opened, a file is transmitted, and then the
   socket is closed. This procedure is repeated for each subsequent
   single file, as the receiver cannot distinguish between two files in
   a continuous data stream.

   Buffer mode is not suitable for the transmission of many small files
   since for every file a new connection has to be established. To
   initiate a new connection, at least two round-trip times (RTTs) for
   the handshake exchange are required (Section 4.3).

   It is also important to note that the SRT protocol does not add any
   information to the data being transmitted. The file name or any
   auxiliary information can be declared separately by the sending
   application, e.g., in the form of a Stream ID Extension Message
   (Section 3.2.1.3).

7.2.2. File Transmission in Message Mode

   If message mode (Section 4.2.1) is used for the file transmission,
   the application should either segment the file into several messages,
   or use one message per file. The size of an individual message plays
   an important role on the receiving side since the receiver buffer
   should be large enough to store at least a single message entirely.

   In the case of file transfer in message mode, the file name,
   segmentation rules, or any auxiliary information can be specified
   separately by both sending and receiving applications. The SRT
   protocol does not provide a specific way of doing this. It could be
   done by setting the file name, etc., in the very first message of a
   message sequence, followed by the file itself.

   When designing an application for SRT file transfer, it is also
   important to be aware of the delivery order of the received messages.
   This can be set by the Order Flag as described in Section 3.1.

8. Security Considerations

   SRT provides confidentiality of the payload using a stream cipher
   and a pre-shared private key as specified in Section 6. The security
   can be compromised if the pre-shared passphrase is known to the
   attacker.

   On the protocol control level, SRT does not encrypt packet headers.
   Therefore it has some vulnerabilities similar to TCP [RFC6528]:

   *  During the handshake, a peer tells its counterpart its public IP
      address, which is visible to any attacker.

   *  An attacker may potentially count the number of SRT processes
      behind a Network Address Translator (NAT) by establishing multiple
      SRT connections and tracking the ranges of SRT Socket IDs. If a
      random Socket ID is generated for the first connection, subsequent
      connections may get consecutive SRT Socket IDs. Assuming one
      system runs only one SRT process, for example, then an attacker
      can estimate the number of systems behind a NAT.

   *  Similarly, the possibility of attack depends on the implementation
      of the initial sequence number (ISN) generation. If an ISN is not
      generated randomly for each connection, an attacker may
      potentially count the number of systems behind a NAT by
      establishing a number of SRT connections and identifying the
      number of different sequence number "spaces", given that no SRT
      packet headers are encrypted.

   *  An eavesdropper can hijack existing connections only if it steals
      the IP and port of one of the parties. If some stream addresses
      an existing SRT receiver by its SRT socket ID, IP, and port
      number, but arrives from a different IP or port, the SRT receiver
      ignores it.

   *  SRT has a certain protection from DoS attacks, see Section 4.3.

   There are some important considerations regarding the encryption
   feature of SRT:

   *  The SEK must be changed at an appropriate refresh interval to
      avoid the risk associated with the use of security keys over a
      long period of time.

   *  The shared secret for KEK generation must be carefully configured
      by a security officer responsible for security policies, enforcing
      encryption, and limiting key size selection.

9. IANA Considerations

   This document makes no requests of the IANA.

Contributors

   This specification is heavily based on the SRT Protocol Technical
   Overview [SRTTO] written by Jean Dube and Steve Matthews.

   In alphabetical order, the contributors to the pre-IETF SRT project
   and specification at Haivision are: Marc Cymontkowski, Roman
   Diouskine, Jean Dube, Mikolaj Malecki, Steve Matthews, Maria
   Sharabayko, Maxim Sharabayko, Adam Yellen.

   The contributors to this specification at SK Telecom are Jeongseok
   Kim and Joonwoong Kim.

   We cannot list all the contributors to the open-sourced
   implementation of SRT on GitHub. But we appreciate the help,
   contribution, integrations and feedback of the SRT and SRT Alliances
   community.

Acknowledgments

   The basis of the SRT protocol and its implementation was the
   UDP-based Data Transfer Protocol [GHG04b]. The authors thank Yunhong
   Gu and Robert Grossman, the authors of the UDP-based Data Transfer
   Protocol [GHG04b].

   TODO acknowledge.

References

Normative References

   [GHG04b]   Gu, Y., Hong, X., and R.L. Grossman, "Experiences in
              Design and Implementation of a High Performance Transport
              Protocol", DOI 10.1109/SC.2004.24, December 2004.

   [RFC0768]  Postel, J., "User Datagram Protocol", STD 6, RFC 768,
              DOI 10.17487/RFC0768, August 1980.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

Informative References

   [AES]      National Institute of Standards and Technology, "FIPS Pub
              197: Advanced Encryption Standard (AES)", November 2001.

   [AV1]      Rivaz, P.d. and J. Haughton, "AV1 Bitstream & Decoding
              Process Specification", March 2021.

   [BBR]      Cardwell, N., Cheng, Y., Gunn, C.S., Yeganeh, S.H., and V.
              Jacobson, "BBR: Congestion-Based Congestion Control", ACM
              Queue, vol. 14, October 2016.

   [GuAnAO]   Gu, Y., Hong, X., and R.L. Grossman, "An Analysis of AIMD
              Algorithm with Decreasing Increases", Proceedings of the
              1st International Workshop on Networks for Grid
              Applications (GridNets '04), October 2004.

   [H.265]    International Telecommunications Union, "H.265 : High
              efficiency video coding", ITU-T Recommendation H.265,
              2019.

   [I-D.ietf-quic-http]
              Bishop, M., "Hypertext Transfer Protocol Version 3
              (HTTP/3)", Work in Progress, Internet-Draft,
              draft-ietf-quic-http-33, 15 December 2020.

   [I-D.ietf-quic-transport]
              Iyengar, J. and M. Thomson, "QUIC: A UDP-Based Multiplexed
              and Secure Transport", Work in Progress, Internet-Draft,
              draft-ietf-quic-transport-34, 14 January 2021.

   [ISO13818-1]
              ISO, "Information technology -- Generic coding of moving
              pictures and associated audio information: Systems",
              ISO/IEC 13818-1, March 2021.

   [ISO23009] ISO, "Information technology -- Dynamic adaptive streaming
              over HTTP (DASH)", ISO/IEC 23009:2019, March 2021.

   [PNPID]    "PNP ID AND ACPI ID REGISTRY", March 2021.

   [RFC2104]  Krawczyk, H., Bellare, M., and R. Canetti, "HMAC:
              Keyed-Hashing for Message Authentication", RFC 2104,
              DOI 10.17487/RFC2104, February 1997.

   [RFC2898]  Kaliski, B., "PKCS #5: Password-Based Cryptography
              Specification Version 2.0", RFC 2898,
              DOI 10.17487/RFC2898, September 2000.

   [RFC3031]  Rosen, E., Viswanathan, A., and R. Callon, "Multiprotocol
              Label Switching Architecture", RFC 3031,
              DOI 10.17487/RFC3031, January 2001.

   [RFC3394]  Schaad, J. and R. Housley, "Advanced Encryption Standard
              (AES) Key Wrap Algorithm", RFC 3394, DOI 10.17487/RFC3394,
              September 2002.

   [RFC4987]  Eddy, W., "TCP SYN Flooding Attacks and Common
              Mitigations", RFC 4987, DOI 10.17487/RFC4987, August 2007.

   [RFC6528]  Gont, F. and S. Bellovin, "Defending against Sequence
              Number Attacks", RFC 6528, DOI 10.17487/RFC6528, February
              2012.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

   [RFC8216]  Pantos, R., Ed. and W. May, "HTTP Live Streaming",
              RFC 8216, DOI 10.17487/RFC8216, August 2017.

   [RFC8312]  Rhee, I., Xu, L., Ha, S., Zimmermann, A., Eggert, L., and
              R. Scheffenegger, "CUBIC for Fast Long-Distance Networks",
              RFC 8312, DOI 10.17487/RFC8312, February 2018.

   [RTMP]     "Real-Time Messaging Protocol", March 2021.

   [SP800-38A]
              Dworkin, M., "Recommendation for Block Cipher Modes of
              Operation", December 2001.

   [SRTSRC]   "SRT fully functional reference implementation", March
              2021.

   [SRTTO]    Dube, J. and S. Matthews, "SRT Protocol Technical
              Overview", December 2019.

   [VP9]      WebM, "VP9 Video Codec", March 2021.

Appendix A. Packet Sequence List Coding

   For any single packet sequence number, it uses the original sequence
   number in the field. The first bit MUST start with "0".

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0|                       Sequence Number                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

               Figure 21: Single sequence numbers coding

   For a range of consecutive packet sequence numbers in which the
   difference between the last and the first is more than 1, only the
   first (a) and the last (b) sequence numbers are recorded in the list
   field, and the first bit of a is set to "1".

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |1|                  Sequence Number a (first)                  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0|                  Sequence Number b (last)                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

              Figure 22: Range of sequence numbers coding

Appendix B. SRT Access Control

   One type of information that can be interchanged when a connection is
   being established in SRT is the Stream ID, which can be used in a
   caller-listener connection layout. This is a string of maximum 512
   characters set on the caller side. It can be retrieved at the
   listener side on the newly accepted connection.

   An SRT listener can notify an upstream application about the
   connection attempt when a handshake (HS) conclusion arrives, exposing
   the contents of the Stream ID extension message. Based on this
   information, the application can accept or reject the connection,
   select the desired data stream, or set an appropriate passphrase for
   the connection.

   The Stream ID value can be used as free-form, but there is a
   recommended convention so that all SRT users speak the same language.
   The intent of the convention is to:

   *  promote readability and consistency among free-form names,

   *  interpret some typical data in the key-value style.

B.1. General Syntax

   This recommended syntax starts with the characters known as an
   executable specification in POSIX: #!.

   The next two characters are:

   *  the format specifier: ":" marks the YAML format, the only one
      currently used;

   *  the content format, which is either ":" (comma-separated keys
      with no nesting) or "{" (like above, but nesting is allowed and
      the content must end with "}").

   (Nesting means that you can have multiple level brace-enclosed parts
   inside.)

   The form of the key-value pair is

      key1=value1,key2=value2...

B.2. Standard Keys

   Besides the general syntax, there are several top-level keys treated
   as standard keys. All single letter key definitions, including those
   not listed in this section, are reserved for future use. Users can
   additionally use custom key definitions with user_* or companyname_*
   prefixes, where user and companyname are to be replaced with an
   actual user or company name.

   The existing key values MUST NOT be extended, and MUST NOT differ
   from those described in this section.

   The following keys are standard:

   *  u: User Name, or authorization name, that is expected to control
      which password should be used for the connection. The application
      should interpret it to distinguish which user should be used by
      the listener party to set up the password.

   *  r: Resource Name identifies the name of the resource and
      facilitates selection should the listener party be able to serve
      multiple resources.

   *  h: Host Name identifies the hostname of the resource. For
      example, to request a stream with the URI
      somehost.com/videos/querry.php?vid=366 the hostname field should
      have somehost.com, and the resource name can have
      videos/querry.php?vid=366 or simply 366.
      Note that this is still a key to be specified explicitly.
      Support tools that apply simplifications and URI extraction are
      expected to insert only the host portion of the URI here.

   *  s: Session ID is a temporary resource identifier negotiated with
      the server, used just for verification. This is a one-shot
      identifier, invalidated after the first use. The expected usage
      is when details for the resource and authorization are negotiated
      over a separate connection first, and then the session ID is used
      here alone.

   *  t: Type specifies the purpose of the connection. Several standard
      types are defined:

      -  stream (default, if not specified): for exchanging the
         user-specified payload for an application-defined purpose,

      -  file: for transmitting a file where r is the filename,

      -  auth: for exchanging sensitive data. The r value states its
         purpose. No specific possible values for that are known so far
         (for future use).

   *  m: Mode expected for this connection:

      -  request (default): the caller wants to receive the stream data,

      -  publish: the caller wants to send the stream data,

      -  bidirectional: bidirectional data exchange is expected.

   Note that "m" is not required in the case where Stream ID is not used
   to distinguish authorization or resources, and the caller is expected
   to send the data. This is only for cases where the listener can
   handle various purposes of the connection and is therefore required
   to know what the caller is attempting to do.

B.3. Examples

   The example content of the StreamID is the following:

      #!::u=admin,r=bluesbrothers1_hi

   It specifies the username and the resource name of the stream to be
   served to the caller.
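
   A listener application might split such a Stream ID into its keys
   roughly as in the following Python sketch. The function name
   parse_stream_id is hypothetical and not part of any SRT library. The
   sketch handles only the flat "#!::" form (no nesting), treats any
   other string as a free-form Stream ID, and fills in the defaults that
   Section B.2 gives for "t" and "m".

```python
def parse_stream_id(stream_id: str) -> dict:
    """Parse a flat '#!::key=value,...' Stream ID into a dict."""
    prefix = "#!::"
    if not stream_id.startswith(prefix):
        # Free-form Stream ID: there are no keys to extract.
        return {}
    fields = {}
    for pair in stream_id[len(prefix):].split(","):
        if not pair:
            continue  # tolerate a stray or trailing comma
        key, sep, value = pair.partition("=")
        if not sep:
            raise ValueError(f"malformed key-value pair: {pair!r}")
        fields[key] = value
    # Defaults stated in Section B.2 for the "t" and "m" keys.
    fields.setdefault("t", "stream")
    fields.setdefault("m", "request")
    return fields
```

   Applied to the example above, the function returns the keys u=admin
   and r=bluesbrothers1_hi together with the defaults t=stream and
   m=request. The application can then use "u" to choose a passphrase
   and "r" to select the stream before accepting the connection.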

   The next example specifies that the file is expected to be
   transmitted from the caller to the listener and its name is
   results.csv:

      #!::u=johnny,t=file,m=publish,r=results.csv

Appendix C. Changelog

C.1. Since draft-sharabayko-mops-srt Version 00

   *  Improved and extended the description of "Encryption" section.

   *  Improved and extended the description of "Round-Trip Time
      Estimation" section.

   *  Extended the description of "Handshake" section with "Stream ID
      Extension Message", "Group Membership Extension" subsections.

   *  Extended "Handshake Messages" section with the detailed
      description of handshake procedure.

   *  Improved "Key Material" section description.

   *  Changed packet structure formatting for "Packet Structure"
      section.

   *  Made minor additions to the "Acknowledgement and Lost Packet
      Handling" section.

   *  Fixed broken links.

   *  Extended the list of references.

C.2. Since draft-sharabayko-mops-srt Version 01

   *  Extended "Congestion Control" section with the detailed
      description of SRT packet pacing for both live streaming and file
      transmission cases.

   *  Improved "Group Membership Extension" section.

   *  Reworked "Security Considerations" section.

   *  Added missing control packets: Drop Request, Peer Error,
      Congestion Warning.

   *  Improved "Data Transmission Modes" section as well as added "Best
      Practices and Configuration Tips for Data Transmission via SRT"
      section describing the use cases of live streaming and file
      transmission via SRT.

   *  Changed the workgroup from "MOPS" to "Network Working Group".

   *  Changed the intended status of the document from "Standards Track"
      to "Informational".

   *  Overall corrections throughout the document: fixed lists,
      punctuation, etc.

Authors' Addresses

   Maxim Sharabayko
   Haivision Network Video, GmbH

   Email: maxsharabayko@haivision.com

   Maria Sharabayko
   Haivision Network Video, GmbH

   Email: msharabayko@haivision.com

   Jean Dube
   Haivision Systems, Inc.

   Email: jdube@haivision.com

   Jeongseok Kim
   SK Telecom Co., Ltd.

   Email: jeongseok.kim@sk.com

   Joonwoong Kim
   SK Telecom Co., Ltd.

   Email: joonwoong.kim@sk.com