idnits 2.17.1

draft-faltstrom-base45-04.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == The document doesn't use any RFC 2119 keywords, yet seems to have RFC
     2119 boilerplate text.

  -- The document date (April 03, 2021) is 1118 days in the past.  Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: 'A' is mentioned on line 97, but not defined

  == Missing Reference: 'B' is mentioned on line 91, but not defined

  == Missing Reference: 'C' is mentioned on line 97, but not defined

  == Missing Reference: 'D' is mentioned on line 97, but not defined

  == Missing Reference: 'E' is mentioned on line 91, but not defined

  == Missing Reference: '65 66' is mentioned on line 146, but not defined

  == Missing Reference: '105 101' is mentioned on line 175, but not defined

  == Missing Reference: '116 102' is mentioned on line 175, but not defined

  -- Looks like a reference, but probably isn't: '33' on line 175

  -- Possible downref: Non-RFC (?) normative reference: ref. 'ISO18004'

     Summary: 0 errors (**), 0 flaws (~~), 10 warnings (==), 3 comments (--).
     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

2    Network Working Group                                       P. Faltstrom
3    Internet-Draft                                                    Netnod
4    Intended status: Standards Track                            F. Ljunggren
5    Expires: October 5, 2021                                           Kirei
6                                                                D. van Gulik
7                                                                  Webweaving
8                                                              April 03, 2021

10                           The Base45 Data Encoding
11                          draft-faltstrom-base45-04

13   Abstract

15      This document describes the base 45 encoding scheme, which is built
16      upon the base 64, base 32 and base 16 encoding schemes.

18   Status of This Memo

20      This Internet-Draft is submitted in full conformance with the
21      provisions of BCP 78 and BCP 79.

23      Internet-Drafts are working documents of the Internet Engineering
24      Task Force (IETF).  Note that other groups may also distribute
25      working documents as Internet-Drafts.  The list of current Internet-
26      Drafts is at https://datatracker.ietf.org/drafts/current/.

28      Internet-Drafts are draft documents valid for a maximum of six months
29      and may be updated, replaced, or obsoleted by other documents at any
30      time.  It is inappropriate to use Internet-Drafts as reference
31      material or to cite them other than as "work in progress."

33      This Internet-Draft will expire on October 5, 2021.

35   Copyright Notice

37      Copyright (c) 2021 IETF Trust and the persons identified as the
38      document authors.  All rights reserved.

40      This document is subject to BCP 78 and the IETF Trust's Legal
41      Provisions Relating to IETF Documents
42      (https://trustee.ietf.org/license-info) in effect on the date of
43      publication of this document.  Please review these documents
44      carefully, as they describe your rights and restrictions with respect
45      to this document.  Code Components extracted from this document must
46      include Simplified BSD License text as described in Section 4.e of
47      the Trust Legal Provisions and are provided without warranty as
48      described in the Simplified BSD License.

50   Table of Contents

52      1.  Introduction
53      2.  Conventions Used in This Document
54      3.  Interpretation of Encoded Data
55      4.  The Base 45 Encoding
56        4.1.  When to use Base45
57        4.2.  The alphabet used in Base45
58        4.3.  Encoding example
59        4.4.  Decoding example
60      5.  IANA Considerations
61      6.  Security Considerations
62      7.  Acknowledgements
63      8.  Normative References
64      Authors' Addresses

66   1.  Introduction

68      When using QR or Aztec codes, a different encoding scheme is needed
69      than the already established base 64, base 32 and base 16 encoding
70      schemes that are described in RFC 4648 [RFC4648].  Base 45 differs
71      from those schemes in its key table and in that padding with '='
72      is not required.

74   2.  Conventions Used in This Document

76      The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
77      "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
78      document are to be interpreted as described in RFC 2119 [RFC2119].

80   3.  Interpretation of Encoded Data

82      Encoded data is to be interpreted as described in RFC 4648 [RFC4648]
83      with the exception that a different alphabet is selected.

85   4.  The Base 45 Encoding

87      A 45-character subset of US-ASCII is used: the 45 characters that
88      can be used in a QR or Aztec code.  Base 64 encodes 3 bytes in 4
89      characters; Base 45 encodes 2 bytes in 3 characters.

91      The two bytes [A, B] are turned into [C, D, E] where (A*256) + B = C
92      + (D*45) + (E*45*45).  The values C, D and E are then looked up in
93      Table 1 to produce a three-character string.  Decoding performs the
94      same steps in reverse.

96      If the number of octets is not divisible by two, the last remaining
97      byte is represented by two characters.  [A] is turned into [C, D]
98      where A = C + (D*45).

100  4.1.  When to use Base45

102     If binary data is to be stored in a QR code, one possible way is to
103     use the Alphanumeric encoding, which uses 11 bits for 2 characters
104     as defined in Section 7.3.4 of ISO/IEC 18004:2015 [ISO18004].  The
105     mode indicator for this encoding is 0010.

107     If the data is to be sent over some other transport, a transport
108     encoding suitable for that transport should be used.  It is not
109     recommended to, for example, first encode data in Base45 and then
110     encode the Base45 string in Base64 if the data is to be sent via
111     email.  Instead, the Base45 encoding should be removed and the data
112     itself should be encoded in Base64.

114  4.2.  The alphabet used in Base45

116     The alphanumeric code is defined to use the 45 characters specified
117     in this alphabet.

119                       Table 1: The Base 45 Alphabet

121     Value Encoding  Value Encoding  Value Encoding  Value Encoding
122        00 0            12 C            24 O            36 Space
123        01 1            13 D            25 P            37 $
124        02 2            14 E            26 Q            38 %
125        03 3            15 F            27 R            39 *
126        04 4            16 G            28 S            40 +
127        05 5            17 H            29 T            41 -
128        06 6            18 I            30 U            42 .
129        07 7            19 J            31 V            43 /
130        08 8            20 K            32 W            44 :
131        09 9            21 L            33 X
132        10 A            22 M            34 Y
133        11 B            23 N            35 Z

135  4.3.  Encoding example

137     A series of bytes is split into groups of two.  Each such 16-bit
138     value is turned into a series of three values by repeatedly
139     dividing by 45 and taking the remainder.  The values are in turn
140     looked up in Table 1.

142     Although the examples are all text, Base45 is an encoding for
143     binary data in which each octet can have any value from 0 to
144     255.

146     Encoding example 1: The string "AB" is the byte sequence [65 66].
147     The 16-bit value is 65 * 256 + 66 = 16706.  16706 equals 11 + 45 * 11
148     + 45 * 45 * 8, so the sequence in base 45 is [11 11 8].  Looking up
149     these values in Table 1 gives the encoded string "BB8".

151     Encoding example 2: The string "Hello!!" is the byte sequence [72 101
152     108 108 111 33 33].  Taken as 16-bit values, this is [18533
153     27756 28449 33].  Note the 33 for the last byte.  Expressed in
154     base 45, these values are [[38 6 9] [36 31 13] [9 2 14] [33 0]], where
155     the last byte is represented by two values.  Looking up these values
156     in Table 1 gives the encoded string "%69 VD92EX0".

158     Encoding example 3: The string "base-45" is the byte sequence [98 97
159     115 101 45 52 53].  Taken as 16-bit values, this is [25185
160     29541 11572 53].  Note the 53 for the last byte.  Expressed in
161     base 45, these values are [[30 19 12] [21 26 14] [7 32 5] [8 1]], where
162     the last byte is represented by two values.  Looking up these values
163     in Table 1 gives the encoded string "UJCLQE7W581".

165  4.4.  Decoding example

167     Each character is looked up in Table 1, and the resulting values,
168     taken three at a time, are interpreted as a number in base 45.
169     This number is then turned into two bytes (two base 256 digits).

171     Decoding example 1: The string "QED8WEX0" represents, when looked up
172     in Table 1, the values [26 14 13 8 32 14 33 0].  Grouping the numbers
173     in threes (except the last group) gives [[26 14 13] [8 32 14]
174     [33 0]].  Read as base 45 numbers, these are [26981 29798 33], and
175     the bytes are [[105 101] [116 102] [33]].  Read as ASCII, these
176     bytes are the string "ietf!".

178  5.  IANA Considerations

180     There are no considerations for IANA in this document.

182  6.  Security Considerations

184     When implementing encoding and decoding, it is important to take
185     great care that buffer overflows and similar errors do not occur.
186     This includes the calculations modulo 45 and the lookups in the
187     table of characters.  A decoder must also be robust regarding its
188     input, including proper handling of any byte value 0-255,
189     including the NUL character (ASCII 0).

191     Base 64, for example, pads the string so that the encoding has the
192     correct number of characters.  Base 45 does not do this, i.e.,
193     Base 45 does not include padding.  Because of this, special care
194     must be taken when an odd number of octets is encoded, which
195     results not in N*3 characters but in (N-1)*3+2 characters in the
196     encoded string, and likewise when decoding a string whose number
197     of characters is not divisible by 3.

199     Furthermore, Base45-encoded data includes non-URL-safe characters,
200     so if Base45-encoded data has to be URL safe, one has to use
201     %-encoding.

203  7.  Acknowledgements

205     The authors thank Alan Barrett, Tomas Harreveld, Christian Landgren,
206     Anders Lowinger, Jakob Schlyter, Peter Teufl and Gaby Whitehead for
207     their feedback, and everyone whose work with Base64 over the years
208     has proven that the implementations are stable.

210  8.  Normative References

212     [ISO18004]
213                ISO/IEC JTC 1/SC 31, "ISO/IEC 18004:2015 Information
214                technology - Automatic identification and data capture
215                techniques - QR Code bar code symbology specification",
216                ISO/IEC
217                18004:2015 https://www.iso.org/standard/62021.html,
218                February 2015.

220     [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
221                Requirement Levels", BCP 14, RFC 2119,
222                DOI 10.17487/RFC2119, March 1997.

225     [RFC4648]  Josefsson, S., "The Base16, Base32, and Base64 Data
226                Encodings", RFC 4648, DOI 10.17487/RFC4648, October 2006.

229  Authors' Addresses

231     Patrik Faltstrom
232     Netnod

234     Email: paf@netnod.se
235     Fredrik Ljunggren
236     Kirei

238     Email: fredrik@kirei.se

240     Dirk-Willem van Gulik
241     Webweaving

243     Email: dirkx@webweaving.org
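
The encoding and decoding steps described in Sections 4, 4.3 and 4.4 of the draft can be sketched in Python as follows. This is an illustrative sketch, not part of the draft and not a reference implementation. The names ALPHABET, b45encode and b45decode are hypothetical. The error handling is an assumption of this sketch, since the draft does not specify it: the decoder rejects characters outside Table 1, a trailing single character, and groups whose value does not fit in two bytes (or one byte for a trailing pair).

```python
# Illustrative sketch only: all names here are hypothetical and the
# error handling is an assumption, not something the draft specifies.

# Table 1: the position of each character is its Base45 value.
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:"


def b45encode(data: bytes) -> str:
    """Encode bytes: each 2-byte group gives 3 characters,
    a trailing single byte gives 2 characters."""
    out = []
    for i in range(0, len(data) - 1, 2):
        n = data[i] * 256 + data[i + 1]   # (A*256) + B
        n, c = divmod(n, 45)              # C is the least significant digit
        e, d = divmod(n, 45)              # then D, then E
        out += [ALPHABET[c], ALPHABET[d], ALPHABET[e]]
    if len(data) % 2:                     # odd length: A = C + (D*45)
        d, c = divmod(data[-1], 45)
        out += [ALPHABET[c], ALPHABET[d]]
    return "".join(out)


def b45decode(text: str) -> bytes:
    """Decode a Base45 string, rejecting malformed input."""
    try:
        values = [ALPHABET.index(ch) for ch in text]
    except ValueError:
        raise ValueError("character outside the Base45 alphabet") from None
    if len(values) % 3 == 1:
        raise ValueError("a single trailing character cannot be decoded")
    out = bytearray()
    for i in range(0, len(values), 3):
        group = values[i:i + 3]
        if len(group) == 3:
            n = group[0] + group[1] * 45 + group[2] * 45 * 45
            if n > 0xFFFF:
                raise ValueError("group does not fit in two bytes")
            out += n.to_bytes(2, "big")   # high byte A first, then B
        else:
            n = group[0] + group[1] * 45
            if n > 0xFF:
                raise ValueError("trailing group does not fit in one byte")
            out.append(n)
    return bytes(out)
```

With these definitions, b45encode(b"AB") gives "BB8" and b45decode("QED8WEX0") gives b"ietf!", matching encoding example 1 and decoding example 1. Checking the group value against 65535 before converting it back to bytes is one way to address the robustness concern raised in the Security Considerations, since three Base45 characters can represent values up to 91124.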