idnits 2.17.1 draft-irtf-nwcrg-bats-00.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- No issues found here. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (February 21, 2021) is 1159 days in the past. Is this intentional? Checking references for intended status: Informational ---------------------------------------------------------------------------- -- Looks like a reference, but probably isn't: '0' on line 532 == Missing Reference: 'K-1' is mentioned on line 532, but not defined == Missing Reference: 'K-2' is mentioned on line 229, but not defined == Missing Reference: 'T-P-1' is mentioned on line 232, but not defined == Missing Reference: 'T-P' is mentioned on line 232, but not defined == Missing Reference: 'T-1' is mentioned on line 364, but not defined == Missing Reference: 'WI' is mentioned on line 370, but not defined -- Looks like a reference, but probably isn't: '1' on line 532 Summary: 0 errors (**), 0 flaws (~~), 7 warnings (==), 3 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Internet Engineering Task Force S. Yang 3 Internet-Draft CUHK(SZ) 4 Intended status: Informational X. Huang 5 Expires: August 25, 2021 R.W. Yeung 6 CUHK 7 J.K. Zao 8 NCTU 9 February 21, 2021 11 BATS Coding Scheme for Multi-hop Data Transport 12 draft-irtf-nwcrg-bats-00 14 Abstract 16 BATS code is a class of efficient linear network coding scheme with a 17 matrix generalization of fountain codes as the outer code, and batch- 18 based linear network coding as the inner code. This document 19 describes a baseline BATS coding scheme for communication through 20 multi-hop networks, and discusses the related research issues towards 21 a more sophisticated BATS coding scheme. 23 Status of This Memo 25 This Internet-Draft is submitted in full conformance with the 26 provisions of BCP 78 and BCP 79. 28 Internet-Drafts are working documents of the Internet Engineering 29 Task Force (IETF). Note that other groups may also distribute 30 working documents as Internet-Drafts. The list of current Internet- 31 Drafts is at https://datatracker.ietf.org/drafts/current/. 33 Internet-Drafts are draft documents valid for a maximum of six months 34 and may be updated, replaced, or obsoleted by other documents at any 35 time. It is inappropriate to use Internet-Drafts as reference 36 material or to cite them other than as "work in progress." 38 This Internet-Draft will expire on August 25, 2021. 40 Copyright Notice 42 Copyright (c) 2021 IETF Trust and the persons identified as the 43 document authors. All rights reserved. 45 This document is subject to BCP 78 and the IETF Trust's Legal 46 Provisions Relating to IETF Documents 47 (https://trustee.ietf.org/license-info) in effect on the date of 48 publication of this document. Please review these documents 49 carefully, as they describe your rights and restrictions with respect 50 to this document. Code Components extracted from this document must 51 include Simplified BSD License text as described in Section 4.e of 52 the Trust Legal Provisions and are provided without warranty as 53 described in the Simplified BSD License. 55 Table of Contents 57 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 58 1.1. Requirements Language . . . . . . . . . . . . . . . . . . 3 59 2. Procedures . . . . . . . . . . . . . . . . . . . . . . . . . 4 60 2.1. Introduction . . . . . . . . . . . . . . . . . . . . . . 4 61 2.2. Data Delivery Procedures . . . . . . . . . . . . . . . . 5 62 2.2.1. Source Node Data Partitioning and Padding . . . . . . 5 63 2.2.2. Source Node Outer Code Encoding Procedure . . . . . . 6 64 2.2.3. Recoding Procedures . . . . . . . . . . . . . . . . . 7 65 2.2.4. Destination Node Procedures . . . . . . . . . . . . . 8 66 2.3. Recommendation for the Parameters . . . . . . . . . . . . 8 67 2.4. Example DDP Packet Format . . . . . . . . . . . . . . . . 9 68 2.4.1. Packet Header . . . . . . . . . . . . . . . . . . . . 9 69 2.4.2. Packet Payload . . . . . . . . . . . . . . . . . . . 10 70 2.4.3. Packet Footer . . . . . . . . . . . . . . . . . . . . 10 71 3. BATS Code Specification . . . . . . . . . . . . . . . . . . . 11 72 3.1. Common Parts . . . . . . . . . . . . . . . . . . . . . . 11 73 3.2. Outer Code Encoder . . . . . . . . . . . . . . . . . . . 12 74 3.3. Inner Code Encoder (Recoder) . . . . . . . . . . . . . . 13 75 3.4. Belief Propagation Decoder . . . . . . . . . . . . . . . 13 76 4. Research Issues . . . . . . . . . . . . . . . . . . . . . . . 14 77 4.1. Coding Design Issues . . . . . . . . . . . . . . . . . . 14 78 4.2. Protocol Design Issues . . . . . . . . . . . . . . . . . 15 79 4.3. Application Related Issues . . . . . . . . . . . . . . . 16 80 5. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 17 81 6. Security Considerations . . . . . . . . . . . . . . . . . . . 17 82 6.1. Provision of Confidentiality Protection . . . . . . . . . 17 83 6.2. Countermeasures against Pollution Attacks . . . . . . . . 18 84 7. References . . . . . . . . . . . . . . . . . . . . . . . . . 18 85 7.1. Normative References . . . . . . . . . . . . . . . . . . 18 86 7.2. Informative References . . . . . . . . . . . . . . . . . 19 87 Appendix A. Additional Stuff . . . . . . . . . . . . . . . . . . 20 88 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 20 90 1. Introduction 92 This document specifies a baseline BATS code [Yang14] scheme for data 93 delivery in multi-hop networks, and discusses the related research 94 issues towards a more sophisticated scheme. The BATS code described 95 here includes an outer code and an inner code. The outer code is a 96 matrix generalization of fountain codes (see also the RapterQ code 97 described in RFC 6330 [RFC6330]), which inherits the advantages of 98 reliability and efficiency and possesses the extra desirable property 99 of being network coding compatible. The inner code, also called 100 recoding, is formed by linear network coding for combating packet 101 loss, improving the multicast efficiency, etc. A detailed design and 102 analysis of BATS codes are provided in the BATS monograph [Yang17]. 104 A BATS coding scheme can be applied in multi-hop networks formed by 105 wireless communication links, which are inherently unreliable due to 106 interference. Existing transport protocols like TCP use end-to-end 107 retransmission, while network protocols such as IP might enable 108 store-and-forward at the relays, so that packet loss would accumulate 109 along the way. 111 A BATS coding scheme can be used for various data delivery 112 applications like file transmission, video streaming over wireless 113 multi-hop networks, etc. Different from traditional forward error 114 correcting (FEC) schemes that are applied either hop-by-hop or end- 115 to-end, the BATS coding scheme combines the end-to-end coding (the 116 outer code) with certain hop-by-hop coding (the inner code), and 117 hence can potentially achieve better performance. 119 The baseline coding scheme described here considers a network with 120 multiple communication flows. For each flow, the source node encodes 121 the data for transmission separately. Inside the network, however, 122 it is possible to mix the packets from different flows for recoding. 123 In this document, we describe a simple case where recoding is 124 performed within each flow. Note that the same encoding/decoding 125 scheme described here can be used with different recoding schemes as 126 long as they follow the principle as we illustrate in this document. 128 The purpose of the baseline BATS coding scheme is twofold. First, it 129 provides researchers and engineers a starting point for developing 130 network communication applications/protocols based on BATS codes. 131 Second, it helps to make the research issues more clear towards a 132 sophisticated BATS code based network protocol. Important research 133 directions include the security issues, congestion control and 134 routing algorithms for BATS codes, etc. 136 1.1. Requirements Language 138 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 139 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 140 document are to be interpreted as described in RFC 2119 [RFC2119]. 142 2. Procedures 144 2.1. Introduction 146 A BATS coding scheme includes an outer code encoder (also called 147 encoder), an inner code encoder (also called recoder) and a decoder. 148 The BATS coding scheme can be used for a single data flow that 149 includes a single source and one or multiple destinations. Thus 150 there exists only one encoder with multiple recoders and decoders. 151 The BATS coding scheme described in this document can be used by a 152 Data Delivery Protocol (DDP) with the following procedures. 154 Outer Code Encoding at a source node which has the data for 155 transmission: 157 * The DDP provides the data to be delivered and the related 158 information to the BATS encoder. 160 * The BATS encoder generates a sequence of batches, each 161 consisting of a set of coded packets and the information 162 pertaining to the batch. 164 The batches generated at the source node are further recoded 165 before transmitting: 167 * A BATS recoder generates recoded packets of a batch. 169 * The DDP forms and transmits the DDP packets using the batches 170 and the corresponding batch information. 172 Recoding at an intermediate node that does not need the data: 174 * The DDP extracts the batches and the corresponding batch 175 information from its received DDP packets. 177 * A BATS recoder generates recoded packets of a batch. 179 * The DDP forms and transmits DDP packets using the recoded 180 packets and the corresponding batch information. 182 Decoding at a destination node that needs the data: 184 * The DDP extracts the batches and the corresponding batch 185 information from its received DDP packets. 187 * A BATS decoder tries to recover the transmitted data using the 188 received batches. 190 * The DDP sends the decoded data to the application that needs 191 the data. 193 2.2. Data Delivery Procedures 195 Suppose that the DDP has F octets of data for transmission. We 196 describe the procedures of one BATS session for transmitting the F 197 octets. There is a limit on F of a single BATS session. If the 198 total data has more than the limit, the data needs to be transmitted 199 using multiple BATS sessions. The limit on F of a single BATS 200 session depends on the MTU (maximum transmission unit) of the 201 network, which MUST be known by the DDP. We have F is no more than 202 (MTU-10)2^16-1 octets. 204 2.2.1. Source Node Data Partitioning and Padding 206 The DDP first determines the following parameters: 208 o Batch size (M): the number of coded packets in a batch. 210 o Recoding field size (q): the number of elements in the finite 211 field for recoding. q is 2 or 2^8 213 o BATS payload size (TO): the number of payload octets in a BATS 214 packet, including the coded data and the coefficient vector. 216 Based on the above parameters, the parameters T, O and K are 217 calculated as follows: 219 o O: the number of octets of a coefficient vector, calculated as O = 220 ceil(M*log2(q)/8). 222 o T: the number of data octets of a BATS packet, calculated as T = 223 TO - O. 225 o K: number of source packets, calculated as K = floor(F/T)+1. 227 The data MUST be padded to have T*K octets, which will be partitioned 228 into K source packets b[0], ..., b[K-1], each of T octets. In our 229 padding scheme, b[0], ..., b[K-2] are filled with data bits, and 230 b[K-1] is filled with the remaining data octets and padding octets. 231 Let P = K*T-F denote the number of padding octets. We use b[K-1, 0], 232 ..., b[K-1, T-P-1] to denote the T-P source octets and b[K-1, T-P], 233 ..., b[K-1, T-1] to denote the P padding octets in b[K-1], 234 respectively. The padding process is shown in Figure 1. 236 Z = T - P 237 Let bl be the last source packet b[K-1] 238 for i = 1, 2, ... do 239 if Z + i >= T - 1 do 240 bl[Z...T-1] = i 241 break 242 bl[Z...Z+i-1] = i 243 Z = Z + i 245 Figure 1: Data Padding Process 247 2.2.2. Source Node Outer Code Encoding Procedure 249 The DDP provides the BATS encoder with the following information: 251 o Batch size (M): the number of coded packets in a batch. 253 o Recoding field size (q): the number of elements in the finite 254 field for recoding. 256 o MAX_DEG: the size of DD. 258 o The degree distribution (DD), which is an unsigned integer array 259 of size MAX_DEG+1. 261 o A sequence of batch IDs (j, j = 0, 1, ...). 263 o Number of source packets (K). 265 o Packet size (T): the number of octets in a source packet. 267 o The source packets (b[i], i = 0, 1, ..., K-1). 269 Using this information, the (outer code) encoder generates a batch 270 for each batch ID. For the batch ID j, the encoder returns the DDP 271 that contains 273 o a sparse degree d[j], and 275 o M coded packets (x[j,i], i =0, 1, ..., M-1), each containing TO 276 octets. 278 The DDP will use the batches to form DDP packets to be transmitted to 279 other network nodes towards the destination nodes. The DDP MUST 280 deliver with each coded packet its 282 o d: sparse degree 283 o BID: batch ID 285 The DDP MUST deliver the following information to each recoder: 287 o M: batch size M 289 o q: recoding field size 291 The DDP MUST deliver the following information to each decoder: 293 o M: batch size 295 o q: recoding field size 297 o K: the number of source packets 299 o T: the number of octets in a source packet 301 The BID is used by both recoders and decoders. The BATS payload size 302 TO MUST be known by all the nodes. 304 The DDP will also include some necessary extra information in the 305 packet header so that the network nodes can identify different BATS 306 sessions, and different end-to-end communication flows. However, 307 such specifications are beyond the scope of this document. 309 2.2.3. Recoding Procedures 311 Both the source node and the intermediate nodes perform recoding on 312 the batches before transmission. At the source node, the recoder 313 receives the batches from the outer code encoding procedure. At an 314 intermediate node, the DDP receives the DDP packets from the other 315 network nodes, and should be able to extract coded packets and the 316 corresponding batch information from these packets. 318 The DDP provides the recoder with the following information: 320 o the batch size M, 322 o the recoding field size q, 324 o a number of received coded packets of the same batch, each 325 containing TO octets, and 327 o link statistics, e.g., packet loss rates. 329 For a received batch, the recoder determines a positive integer Mr, 330 the number of recoded packets to be transmitted for the batch. The 331 recoder uses the information provided by the DDP to generate Mr 332 recoded packets, each containing TO octets. The DDP uses the Mr 333 recoded packets to form the DDP packets for transmitting. 335 2.2.4. Destination Node Procedures 337 A destination node needs the data transmitted by the source node. At 338 the destination node, the DDP receives DDP packets from the other 339 network nodes, and should be able to extract coded packets and the 340 corresponding batch information from these packets. 342 The DDP provides the decoder with the following information: 344 o M: batch size, 346 o q: recoding field size, 348 o K: the number of source packets 350 o T: the number of octets of a source packet 352 o A sequence of batches, each of which is formed by a number of 353 coded packets belonging to the same batch, with their 354 corresponding batch IDs and degrees. 356 The decoder uses this information to decode the K source packets. If 357 successful, the decoder returns the recovered K source packets to the 358 DDP, which will use the K source packets to form the F octets data. 359 The recommended padding process is shown as follows: 361 // this procedure returns the number P of padding octets 362 // at the end of b[K-1] 363 Let bl be the last decoded source packet b[K-1] 364 PL = bl[T-1] 365 if PL == 1 do 366 return P = 1 367 WI = T - 1 368 while bl[WI] == PL do 369 WI = WI - 1 370 return P = (1 + bl[WI]) * bl[WI] + T - WI - 1 372 Figure 2: Data Depadding Process 374 2.3. Recommendation for the Parameters 376 The recommendation for the parameters M and q is shown as follows: 378 o When q=2, M=16,32,64 379 o When q=256, M=8,16,32,64 381 It is RECOMMENDED that K is at least 128. However, the encoder/ 382 decoder SHALL support an arbitrary positive integer value less than 383 2^16. 385 2.4. Example DDP Packet Format 387 A DDP can form a DDP packet with a header (5 octets), a footer (3 388 octets) and a payload (TO octets). A DDP packet has totally 8+TO 389 octets. 391 2.4.1. Packet Header 393 The BATS packet header has 40 bits (5 octets) and includes fields 394 Packet_Count, Mq, Batch_ID, and Degree. 396 0 1 2 3 397 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 398 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 399 | Packet_Count | Mq | Batch_ID | 400 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 401 | Degree | 402 +-+-+-+-+-+-+-+-+ 404 Figure 3: BATS packet header format. 406 o Packet_Count: 16-bit unsigned integer, specifying the number K of 407 packets of the BATS session. 409 o Mq: 4-bit unsigned integer to specify the value of M and q as 410 Table 1. 412 o Batch_ID: 12-bit unsigned integer, specifying the batch ID BID of 413 the batch the packet belonging to. 415 o Degree: 8-bit unsigned integer, specifying the batch degree d of 416 the batch the packet belonging to. 418 +------+----+-----+----+ 419 | Mq | M | q | O | 420 +------+----+-----+----+ 421 | 0010 | 16 | 2 | 2 | 422 | 0100 | 32 | 2 | 4 | 423 | 0110 | 64 | 2 | 8 | 424 | 0001 | 8 | 256 | 8 | 425 | 0011 | 16 | 256 | 16 | 426 | 0101 | 32 | 256 | 32 | 427 | 0111 | 64 | 256 | 64 | 428 +------+----+-----+----+ 430 Table 1: Values of Mq field 432 2.4.2. Packet Payload 434 O T 435 +-----------------------+-------------------------------+ 436 | coefficient vector | coded data | 437 +-----------------------+-------------------------------+ 439 Figure 4: BATS packet payload format. 441 The payload has TO octets, where the first O octets contain the 442 coefficient vector and the remaining T octets contain the coded data. 443 Information in both fields MAY be encoded in JSON (ASCII) or protobuf 444 (binary) formats. 446 o coefficient vector: O octets. The range of the value of O is in 447 Table 1. 449 o coded data: T octets. T is at most MTU - 10, where 10 is the 450 total of the header and footer length plus the minimum value of O. 452 2.4.3. Packet Footer 454 0 1 2 455 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 456 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 457 | signature | parity check | 458 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 460 Figure 5: BATS packet footer format. 462 The footer has three octets. 464 o signature: 2 octets. A signature of the individual packet to 465 prevent pollution attack. 467 o parity check: 1 octet. A parity check field used to verity the 468 correctness of the packet. 470 3. BATS Code Specification 472 3.1. Common Parts 474 The T octets of a source packets are treated as a column vector of T 475 elements in GF(256). Linear algebra and matrix operations over 476 finite fields are assumed in this section. 478 Suppose that a pseudorandom number generator Rand() which generates 479 an unsigned integer of 32 bits is shared by both encoding and 480 decoding. The pseudorandom generator can be initialized by 481 Rand_Init(S) with seed S. When S is not provided, the pseudorandom 482 generator is initialized arbitrarily. One example of such a 483 pseudorandom generator is defined in RFC 8682 [RFC8682]. 485 A function called BatchSampler is used in both encoding and decoding. 486 The function takes two integers j and d as input, and generates an 487 array idx of d integers and a d x M matrix G. The function first 488 initializes the pseudorandom generator with j, sample d distinct 489 integers from 0 to K-1 as idx, and sample d*M integers from 0 to 255 490 as G. See the pseudocode in Figure 6. 492 function BatchSampler(j,d) 493 // initialize the pseudorandom generator by seed j. 494 Rand_Init(j) 495 // sample d distinct integers between 0 and K-1. 496 for k = 0, ..., d-1 do 497 r = Rand() % K 498 while r already exists in idx do 499 r = Rand() % K 500 idx[k] = r 502 // sample d x M matrix 503 for r = 0, ..., d-1 do 504 for c = 0,...,M-1 do 505 G[r,c] = Rand() % 256 507 return idx, G 509 Figure 6: Batch Sampler Function 511 3.2. Outer Code Encoder 513 Define a function called DegreeSampler that return an integer d using 514 the degree distribution DD. We expect that the empirical 515 distribution of the returning d converges to DD(d) when d < K. One 516 design of DegreeSampler is illustrated in Figure 7. 518 function DegreeSampler(j, DD) 519 Let CDF be an array 520 CDF[0] = 0 521 for i = 1, ..., MAX_DEG do 522 CDF[i] = CDF[i-1] + DD[i] 523 Rand_Init() 524 r = Rand() % CDF[MAX_DEG] 525 for d = 1, ..., MAX_DEG do 526 if r >= CDF[d] do 527 return min(d,K) 528 return min(MAX_DEG,K) 530 Figure 7: Degree Sampler Function 532 Let b[0], b[1], ..., b[K-1] be the K source packets. A batch with 533 BID j is generated using the following steps. 535 o Obtain a degree d by calling DegreeSampler with input j. 537 o Obtain idx and G[j] by calling BatchSampler with input j and d. 539 o Let B[j] = (b[idx[0]], b[idx[1]], ..., b[idx[d-1]]). Form the 540 batch X[j] = B[j]*G[j], whose dimension is T x M. 542 o Form the TO x M matrix Xr[j], where the first O rows of Xr[j] form 543 the M x M identity matrix I with entries in GF(q), and the last T 544 rows of Xr[j] is X[j]. 546 See the pseudocode of the batch generating process in Figure 8. 548 function GenBatch(j) 549 d = DegreeSampler(j) 550 (idx, G) = BatchSampler(j,d) 551 B = (b[idx[0]], b[idx[i]], ..., b[idx[d-1]]) 552 X = B * G 553 Xr = [I_M; X] 554 return Xr 556 Figure 8: Batch Generation Function 558 3.3. Inner Code Encoder (Recoder) 560 The inner code comprises (random) linear network coding applied on 561 the coded packets belonging to the same batch. At a particular 562 network node, recoded packets are generated by (random) linear 563 combinations of the received coded packets of a batch. The recoded 564 packets have the same BID, sparse degree and coded packet length. 566 The number Mr of recoded packets for a batch is decided first by the 567 recoder. Mr can be set as M. When the link statistics is known, the 568 recoder can try to obtain the link packet loss rate e for the link to 569 transmit the recoded batch, and set Mr to be (1+e)M. 571 Suppose that coded packets xr[i], i = 0, 1, ..., r-1, which have the 572 same BID j, have been received at an intermediate node. Using the 573 recommended packet format, it can be verified whether the 574 corresponding packet headers of these coded packets are the same. 575 Then a recoded packet can be generated by one of the following two 576 approaches: 578 o forwarding: when receiving xr[i], directly use xr[i] as a recoded 579 packet. 581 o linear combination recoding: (randomly) choose a sequence of 582 coefficients c[i], i = 0, 1, ..., r-1 from GF(q). Generate 583 c[0]xr[0]+c[1]xr[1]+...+c[r-1]xr[r-1] as a recoded packet. 585 A recoder can combine these two approaches to generate recoded 586 packets. For example, the recoder will output xr[i], i = 0, 1, ..., 587 r-1 as r systematic recoded packets and generate Mr-r recoded packets 588 using linear combinations of randomly chosen coefficients. 590 3.4. Belief Propagation Decoder 592 The decoder receives a sequence of batches Yr[j], j = 0, 1, ..., n-1, 593 each of which is a TO-row matrix over GF(256). The degree d[j] of 594 batch j is also known. Let Y[j] be the submatrix of the last T rows 595 of Yr[j]. When q = 256, let H[j] be the first M rows of Yr[j]; when 596 q = 2, let H[j] be the matrix over GF(256) formed by embedding each 597 bit in the first M/8 rows of Yr[j] into GF(256). 599 By calling BatchSampler with input j and d[j], we obtain idx[j] and 600 G[j]. According to the encoding and recoding processes described in 601 Section 3.2 and Section 3.3, we have the system of linear equations 602 Y[j] = B[j]G[j]H[j] for each received batch with ID j, where B[j] = 603 (b[idx[j,0]], b[idx[j,1]], ..., b[idx[j,d-1]]) is unknown. 605 We describe a belief propagation (BP) decoder that can efficiently 606 solve the source packets when a sufficient number of batches have 607 been received. A batch j is said to be decodable if rank(G[j]H[j]) = 608 d[j] (i.e., the system of linear equations Y[j] = B[j]G[j]H[j] with 609 B[j] as the variable matrix has a unique solution). The BP decoding 610 algorithm has multiple iterations. Each iteration is formed by the 611 following steps: 613 o Decoding step: Find a batches j that is decodable. Solve the 614 corresponding system of linear equations Y[j] = B[j]G[j]H[j] and 615 decode B[j]. 617 o Substitution step: Substitute the decoded source packets into 618 undecodable batches. Suppose that a decoded source packet b[k] is 619 used in generating a undecodable Y[j]. The substitution involves 620 1) removing the entry in idx[j] corresponding to k, 2) removing 621 the row in G[j] corresponding to b[k], and 3) reducing d[j] by 1. 623 The BP decoder repeats the above steps until no batches are decodable 624 during the decoding step. 626 4. Research Issues 628 The baseline BATS coding scheme described in Section 2 and Section 3 629 needs various refinement and complement towards a more sophisticated 630 network communication application. Various related research issues 631 are discussed in this section, but the security related issues are 632 left to Section 6. 634 4.1. Coding Design Issues 636 The BATS code specification in Section 3 has nearly optimal 637 throughput when the number of source packets K is sufficiently large. 638 But when K is small, the degree sampler function in Figure 7 and the 639 BatchSampler function in Figure 6 based on a pseudorandom generator 640 may not sample all the source packets evenly, so that some of the 641 source packets are not well protected. One approach to solve this 642 issue is to generate a deterministic degree sequence when the number 643 of batches is relatively small, and design a special pseudorandom 644 generator that has a good sampling performance when K is small. 646 The belief propagation decoder in Section 3.4 guarantees the recovery 647 of a given fraction of the source packets. To recover all the source 648 packets, a precode can be applied to the source packets to generate a 649 fraction of redundant packets before applying the outer code 650 encoding. Moreover, when the belief propagation decoder stops, it is 651 possible to continue with inactivation decoding, where certain source 652 packets are treated inactive so that a similar belief propagation 653 process can be resumed. The reader is referred to RFC 6330 [RFC6330] 654 for the design of a precode with a good inactivation decoding 655 performance. 657 There are research issues related to recoding discussed in 658 Section 3.3. One question is how many recoded packets to generate 659 for each batch. Though it is asymptotically optimal when using the 660 same number of recoded packets for all batches, it has been shown 661 that transmitting a different number of recoded packets for different 662 batches can improve the recoding efficiency. The intuition is that 663 for a batch with a lower rank, a smaller number of recoded packets 664 need to be transmitted. This kind of recoding scheme is called 665 adaptive recoding [Yin19]. 667 Packet loss in network communication is usually bursty, which may 668 harm the recoding performance. One way to resolve this issue is to 669 transmit the packets of different batches in a mixed order, which is 670 also called batch interleaving [Yin20]. How to efficiently 671 interleave batches without increasing too much end-to-end latency is 672 a research issue. 674 Though we only focus on the BATS coding scheme with one source node 675 and one destination node, a BATS coding scheme can be used for 676 multiple source and destination nodes. To benefit from multiple 677 source nodes, we would need different source nodes to generate 678 statistically independent batches. For communicating the same data 679 to multiple destination nodes, which is also call multicast, it is 680 well-known that linear network coding [Li03] achieves the mulicast 681 capacity. BATS codes can benefit from network coding due to its 682 inner code, but how to efficiently implement multicast needs further 683 research. 685 4.2. Protocol Design Issues 687 The baseline scheme in this document focuses on the reliable 688 communication. There are other issues to be considered towards 689 designing a fully functionally DDP based on a BATS coding scheme. 690 Here we discuss some network management issues that are closely 691 related to a BATS coding scheme: routing, congestion control and 692 media access control. 694 The outer code of a BATS code can be regarded as a channel code for 695 the channel induced by the inner code, and hence the network 696 management algorithms should try to maximize the capacity of the 697 channel induced by the inner code. A network utility maximization 698 problem [Dong20] for BATS coding can be applied to study routing, 699 congestion control and media access control jointly. Compared with 700 the network utility maximization for Internet, there are two major 701 differences. First, the network flow rate is not measured by the 702 rate of the raw packets. Instead, a rank based measurement induced 703 by the inner code is applied for BATS coding schemes. Second, due to 704 recoding, the raw packet rate of a flow may not be the same for 705 different links, i.e., no flow conservation for BATS coding schemes. 706 These differences affect both the objective and the constraints of 707 the utility maximization problem. 709 Practical congestion control, routing and media access control 710 algorithms for BATS coding schemes deserve more research efforts. 711 Due to the recoding operation, congestion control cannot be only 712 performed end-to-end. The rate of transmitting batches can be 713 controlled end-to-end, but the number of recoded packets generated 714 for a batch must be controlled at the intermediate nodes, which 715 introduces new research issues for congestion control. For routing, 716 the BATS coding scheme is flexible for implementing multi-path data 717 transmission, and different batches can be transmitted on a different 718 path between a source node and a destination node. Under the 719 scenario of BATS coding schemes, media access control can have some 720 different considerations: Retransmission is not necessary, and a 721 reasonably high packet loss rate can be tolerated. 723 4.3. Application Related Issues 725 There are more researche issues pertaining to different applications. 726 The reliable communication technique provided by BATS codes can be 727 used for a broad range of network communication scenarios. In 728 general, a BATS coding scheme is suitable for data delivery in 729 networks with multiple hops and unreliable links. 731 One class of typical application scenario is wireless mesh and ad hoc 732 networks [Toh02], including vehicular networks, wireless sensor 733 networks, smart-lamppost networks, etc. These networks are 734 characterized by a large number of network devices connected 735 wirelessly with each other without a centralized network 736 infrastructure. A BATS coding scheme is suitable for high data load 737 delivery in such networks without the requirement that the point-to- 738 point/one-hop communication is highly reliable. Therefore, employing 739 a BATS coding scheme can provide more freedom for media access 740 control, including power control so that the overall network 741 throughput can be improved. 743 Another typical application scenario of BATS coding schemes is 744 underwater acoustic networks [Sprea19], where the propagation delay 745 of acoustic waves in underwater can be as long as several seconds. 746 Due to the long delay, feedback based mechanisms become inefficient. 747 Moreover, point-to-point/one-hop underwater acoustic communication 748 (for both the forward and reverse directions) is highly unreliable. 750 Due to these reasons, traditional networking techinques developed for 751 radio and wireline networks cannot be directly applied to underwater 752 networks. As a BATS coding scheme does not rely on the feedback for 753 reliability communication and can tolerate highly unreliable links, 754 it makes a good candidate for developing data delivery protocols for 755 underwater acoustic networks. 757 Last but not least, due to its capability of performing multi-source, 758 multi-destination communications, a BATS coding scheme can be applied 759 in various content distribution scenarios. For example, a BATS 760 coding scheme can be a candidate for the erasure code used in the 761 liquid data networking framework [Byers20] of CCN (content centric 762 networking), and provides the extra benefit of network coding 763 [Zhang16]. 765 5. IANA Considerations 767 This memo includes no request to IANA. 769 6. Security Considerations 771 Subsuming both Random Linear Network Codes (RLNC) and fountain codes, 772 BATS codes naturally inherit both their desirable capability of 773 offering confidentiality protection as well as their vulnerability 774 towards pollution attacks. 776 6.1. Provision of Confidentiality Protection 778 Since the transported messages are linearly combined with random 779 coefficients at each recoding node, it is statistically impossible to 780 recover the individual messages by capturing the coded messages at 781 any one or small number of nodes. As long as the coding matrices of 782 the transported messages cannot be fully recovered, any attempt of 783 decoding any particular symbol of the transported messages is 784 equivalent to random guessing [Bhattad05]. 786 The threat towards confidentiality, however, also exists in the form 787 of eavesdropping on the initial encoding process, which takes place 788 at the encoding nodes. In these nodes, the transported data are 789 presented in plain text and can be read along their transfer paths. 790 Hence, information isolation between the encoding process and all 791 other user processes running on the node must be assured. 793 In addition, the authenticity and trustworthiness of the encoding, 794 recoding and decoding program running on all the nodes must be 795 attested by a trusted authority. Such a measure is also necessary in 796 countering pollution attacks. 798 6.2. Countermeasures against Pollution Attacks 800 Like all network codes, BATS codes are vulnerable under pollution 801 attacks. In these attacks, one or more compromised coding node(s) 802 can pollute the coded messages by injecting forged messages into the 803 coding network and thus prevent the receivers from recovering the 804 transported data correctly. 806 The research community has long been investigating the use of various 807 signature schemes (including homomorphic signatures) to identify the 808 forged messages and stall the attacks (see [Zhao07], [Yu08], 809 [Agrawal09]). However, these countermeasures are regarded as being 810 too computationally expensive to be employed in broadband 811 communications. Hence, a system-level approach based on Trusted 812 Computing [TC-Wikipedia] is proposed as a practical alternative to 813 protect BATS codes against pollution attacks. This Trusted Computing 814 based protection consists of the following countermeasures: 816 1. Attestation and Validation of all BATS encoding, recoding and 817 decoding nodes in the network. Remote attestation and repetitive 818 validation of the identity and capability of these node based on 819 valid public key certificates with proper authorization MUST be a 820 pre-requisite for admitting these nodes to a network and 821 permitting them to remain on that network. 823 2. Attestation of all encoding, recoding and decoding programs used 824 in the coding nodes. All programs used to perform the BATS 825 encoding, recoding and decoding processes MUST be remotely 826 attested before they are permitted to run on any of the coding 827 nodes. Reloading or alteration of programs MUST NOT be permitted 828 during an encoding session. Programs MUST be attested or 829 validated again when they are executed in new execution 830 environments instantiated even in the same node. 832 3. Original Authentication of all coded messages using network level 833 security protocols such as IPsec or Peer Authentication over 834 session-based communication using transport level security 835 protocols such as TLS/DTLS MUST be employed in order to provide 836 Message Origin or Communication Peer Authentication to every 837 coded message sent through the coding network. 839 7. References 841 7.1. Normative References 843 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 844 Requirement Levels", BCP 14, RFC 2119, 845 DOI 10.17487/RFC2119, March 1997, 846 . 848 [RFC8682] Saito, M., Matsumoto, M., Roca, V., Ed., and E. Baccelli, 849 "TinyMT32 Pseudorandom Number Generator (PRNG)", RFC 8682, 850 DOI 10.17487/RFC8682, January 2020, 851 . 853 7.2. Informative References 855 [Agrawal09] 856 Agrawal, S. and D. Boneh, "Homomorphic MACs: MAC-based 857 integrity for network coding", International Conference on 858 Applied Cryptography and Network Security , 2009. 860 [Bhattad05] 861 Bhattad, K. and K. Narayanan, "Weakly Secure Network 862 Coding", ISIT , 2007. 864 [Byers20] Byers, J. and M. Luby, "Liquid Data Networking", ICN , 865 2020. 867 [Dong20] Dong, Y., Jin, S., Yang, S., and H. Yin, "Network Utility 868 Maximization for BATS Code enabled Multihop Wireless 869 Networks", ICC , 2020. 871 [Li03] Li, S., Yeung, R., and N. Cai, "Linear Network Coding", 872 IEEE Transactions on Information Theory , 2003. 874 [RFC6330] Luby, M., Shokrollahi, A., Watson, M., Stockhammer, T., 875 and L. Minder, "RaptorQ Forward Error Correction Scheme 876 for Object Delivery", RFC 6330, DOI 10.17487/RFC6330, 877 August 2011, . 879 [Sprea19] Sprea, N., Bashir, M., Truhachev, D., Srinivas, K., 880 Schlegel, C., and C. Claudio Sacchi, "BATS Coding for 881 Underwater Acoustic Communication Networks", OCEANS , 882 2019. 884 [TC-Wikipedia] 885 "Trusted Computing", 886 Wikipedia https://en.wikipedia.org/wiki/Trusted_Computing. 888 [Toh02] Toh, C., "Ad Hoc Mobile Wireless Networks", Prentice Hall 889 Publishers , 2002. 891 [Yang14] Yang, S. and R. Yeung, "Batched Sparse Codes", IEEE 892 Transactions on Information Theory 60(9), 5322-5346, 2014. 894 [Yang17] Yang, S. and R. Yeung, "BATS Codes: Theory and Practice", 895 Morgan & Claypool Publishers , 2017. 897 [Yin19] Yin, H., Tang, B., Ng, K., Yang, S., Wang, X., and Q. 898 Zhou, "A Unified Adaptive Recoding Framework for Batched 899 Network Coding", ISIT , 2019. 901 [Yin20] Yin, H., Yeung, R., and S. Yang, "A Protocol Design 902 Paradigm for Batched Sparse Codes", Entroy , 2020. 904 [Yu08] Yu, Z., Wei, Y., Ramkumar, B., and Y. Guan, "An Efficient 905 Signature-Based Scheme for Securing Network Coding Against 906 Pollution Attacks", INFOCOM , 2008. 908 [Zhang16] Zhang, G. and Z. Xu, "Combing CCN with network coding: An 909 architectural perspective", Computer Networks , 2016. 911 [Zhao07] Zhao, F., Kalker, T., Medard, M., and K. Han, "Signatures 912 for content distribution with network coding", ISIT , 913 2007. 915 Appendix A. Additional Stuff 917 Authors' Addresses 919 Shenghao Yang 920 CUHK(SZ) 921 Shenzhen, Guangdong 922 China 924 Phone: +86 755 8427 3827 925 Email: shyang@cuhk.edu.cn 927 Xuan Huang 928 CUHK 929 Hong Kong, Hong Kong SAR 930 China 932 Phone: +852 3943 8375 933 Email: 1155136647@link.cuhk.edu.hk 934 Raymond W. Yeung 935 CUHK 936 Hong Kong, Hong Kong SAR 937 China 939 Phone: +852 3943 8375 940 Email: whyeung@ie.cuhk.edu.hk 942 John K. Zao 943 NCTU 944 Hsinchu, Taiwan 945 China 947 Email: jkzao@ieee.org