INTERNET-DRAFT                                        T. Boutell, et al
Portable Network Graphics                             Boutell.Com, Inc.
Expires: 10 Dec 1996                                       10 June 1996

             PNG (Portable Network Graphics) Specification

                              Version 1.0

                  File draft-boutell-png-spec-04.txt

Status of This Memo

   This document is an Internet-Draft. Internet-Drafts are working
   documents of the Internet Engineering Task Force (IETF), its areas,
   and its working groups. Note that other groups may also distribute
   working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   To learn the current status of any Internet-Draft, please check the
   "1id-abstracts.txt" listing contained in the Internet-Drafts Shadow
   Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
   munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or
   ftp.isi.edu (US West Coast).

   Distribution of this memo is unlimited.

Abstract

   This document describes PNG (Portable Network Graphics), an
   extensible file format for the lossless, portable, well-compressed
   storage of raster images.
   PNG provides a patent-free replacement for
   GIF and can also replace many common uses of TIFF. Indexed-color,
   grayscale, and truecolor images are supported, plus an optional alpha
   channel. Sample depths range from 1 to 16 bits.

   PNG is designed to work well in online viewing applications, such as
   the World Wide Web, so it is fully streamable with a progressive
   display option. PNG is robust, providing both full file integrity
   checking and simple detection of common transmission errors. Also,
   PNG can store gamma and chromaticity data for improved color matching
   on heterogeneous platforms.

   This specification defines a proposed Internet Media Type image/png.

Table of Contents

   1. Introduction
   2. Data Representation
      2.1. Integers and byte order
      2.2. Color values
      2.3. Image layout
      2.4. Alpha channel
      2.5. Filtering
      2.6. Interlaced data order
      2.7. Gamma correction
      2.8. Text strings
   3. File Structure
      3.1. PNG file signature
      3.2. Chunk layout
      3.3. Chunk naming conventions
      3.4. CRC algorithm
   4. Chunk Specifications
      4.1. Critical Chunks
         4.1.1. IHDR Image Header
         4.1.2. PLTE Palette
         4.1.3. IDAT Image Data
         4.1.4. IEND Image Trailer
      4.2. Ancillary Chunks
         4.2.1. bKGD Background Color
         4.2.2. cHRM Primary Chromaticities and White Point
         4.2.3. gAMA Image Gamma
         4.2.4. hIST Image Histogram
         4.2.5. pHYs Physical Pixel Dimensions
         4.2.6. sBIT Significant Bits
         4.2.7. tEXt Textual Data
         4.2.8. tIME Image Last-Modification Time
         4.2.9. tRNS Transparency
         4.2.10. zTXt Compressed Textual Data
      4.3. Summary of Standard Chunks
      4.4. Additional Chunk Types
   5. Deflate/Inflate Compression
   6. Filter Algorithms
      6.1. Filter type 0: None
      6.2. Filter type 1: Sub
      6.3. Filter type 2: Up
      6.4. Filter type 3: Average
      6.5. Filter type 4: Paeth
   7. Chunk Ordering Rules
      7.1. Behavior of PNG editors
      7.2. Ordering of ancillary chunks
      7.3. Ordering of critical chunks
   8. Miscellaneous Topics
      8.1. File name extension
      8.2. Internet media type
      8.3. Macintosh file layout
      8.4. Multiple-image extension
      8.5. Security considerations
   9. Recommendations for Encoders
      9.1. Bit depth scaling
      9.2. Encoder gamma handling
      9.3. Encoder color handling
      9.4. Alpha channel creation
      9.5. Suggested palettes
      9.6. Filter selection
      9.7. Text chunk processing
      9.8. Use of private chunks
      9.9. Private type and method codes
   10. Recommendations for Decoders
      10.1. Error checking
      10.2. Pixel dimensions
      10.3. Truecolor image handling
      10.4. Bit depth rescaling
      10.5. Decoder gamma handling
      10.6. Decoder color handling
      10.7. Background color
      10.8. Alpha channel processing
      10.9. Progressive display
      10.10. Suggested-palette and histogram usage
      10.11. Text chunk processing
   11. Glossary
   12. Appendix: Rationale
      12.1. Why a new file format?
      12.2. Why these features?
      12.3. Why not these features?
      12.4. Why not use format X?
      12.5. Byte order
      12.6. Interlacing
      12.7. Why gamma?
      12.8. Non-premultiplied alpha
      12.9. Filtering
      12.10. Text strings
      12.11. PNG file signature
      12.12. Chunk layout
      12.13. Chunk naming conventions
      12.14. Palette histograms
   13. Appendix: Gamma Tutorial
   14. Appendix: Color Tutorial
   15. Appendix: Sample CRC Code
   16. Appendix: Online Resources
   17. Appendix: Revision History
   18. References
   19. Credits

1. Introduction

   The PNG format provides a portable, legally unencumbered,
   well-compressed, well-specified standard for lossless bitmapped image
   files.

   Although the initial motivation for developing PNG was to replace
   GIF, the design provides some useful new features not available in
   GIF, with minimal cost to developers.

   GIF features retained in PNG include:

      * Indexed-color images of up to 256 colors.
      * Streamability: files can be read and written serially, thus
        allowing the file format to be used as a communications
        protocol for on-the-fly generation and display of images.
      * Progressive display: a suitably prepared image file can be
        displayed as it is received over a communications link,
        yielding a low-resolution image very quickly followed by
        gradual improvement of detail.
      * Transparency: portions of the image can be marked as
        transparent, creating the effect of a nonrectangular image.
      * Ancillary information: textual comments and other data can be
        stored within the image file.
      * Complete hardware and platform independence.
      * Effective, 100% lossless compression.

   Important new features of PNG, not available in GIF, include:

      * Truecolor images of up to 48 bits per pixel.
      * Grayscale images of up to 16 bits per pixel.
      * Full alpha channel (general transparency masks).
      * Image gamma information, which supports automatic display of
        images with correct brightness/contrast regardless of the
        machines used to originate and display the image.
      * Reliable, straightforward detection of file corruption.
      * Faster initial presentation in progressive display mode.

   PNG is designed to be:

      * Simple and portable: developers should be able to implement PNG
        easily.
      * Legally unencumbered: to the best knowledge of the PNG authors,
        no algorithms under legal challenge are used. (Some
        considerable effort has been spent to verify this.)
      * Well compressed: both indexed-color and truecolor images are
        compressed as effectively as in any other widely used lossless
        format, and in most cases more effectively.
      * Interchangeable: any standard-conforming PNG decoder will read
        all conforming PNG files.
      * Flexible: the format allows for future extensions and private
        add-ons, without compromising interchangeability of basic PNG.
      * Robust: the design supports full file integrity checking as
        well as simple, quick detection of common transmission errors.

   The main part of this specification gives the definition of the file
   format and recommendations for encoder and decoder behavior. An
   appendix gives the rationale for many design decisions. Although the
   rationale is not part of the formal specification, reading it can
   help implementors understand the design. Cross-references in the
   main text point to relevant parts of the rationale. Additional
   appendixes, also not part of the formal specification, provide
   tutorials on gamma and color theory as well as other supporting
   material.

   See Rationale: Why a new file format? (Section 12.1), Why these
   features? (Section 12.2), Why not these features? (Section 12.3), Why
   not use format X? (Section 12.4).

   Pronunciation

      PNG is pronounced "ping".

2. Data Representation

   This chapter discusses basic data representations used in PNG files,
   as well as the expected representation of the image data.

   2.1. Integers and byte order

      All integers that require more than one byte will be in network
      byte order: the most significant byte comes first, then the less
      significant bytes in descending order of significance (MSB LSB for
      two-byte integers, B3 B2 B1 B0 for four-byte integers). The
      highest bit (value 128) of a byte is numbered bit 7; the lowest
      bit (value 1) is numbered bit 0. Values are unsigned unless
      otherwise noted. Values explicitly noted as signed are represented
      in two's complement notation.

      See Rationale: Byte order (Section 12.5).

   2.2. Color values

      Colors can be represented by either grayscale or RGB (red, green,
      blue) sample data. Grayscale data represents luminance; RGB data
      represents calibrated color information (if the cHRM chunk is
      present) or uncalibrated device-dependent color (if cHRM is
      absent).
      All color values range from zero (representing black) to
      most intense at the maximum value for the bit depth. Note that
      the maximum value at a given bit depth is (2^bitdepth)-1, not
      2^bitdepth.

      Sample values are not necessarily linear; the gAMA chunk specifies
      the gamma characteristic of the source device, and viewers are
      strongly encouraged to compensate properly. See Gamma correction
      (Section 2.7).

      Source data with a precision not directly supported in PNG (for
      example, 5 bit/sample truecolor) must be scaled up to the next
      higher supported bit depth. This scaling is reversible with no
      loss of data, and it reduces the number of cases that decoders
      must cope with. See Recommendations for Encoders: Bit depth
      scaling (Section 9.1) and Recommendations for Decoders: Bit depth
      rescaling (Section 10.4).

   2.3. Image layout

      Conceptually, a PNG image is a rectangular pixel array, with
      pixels appearing left-to-right within each scanline, and scanlines
      appearing top-to-bottom. (For progressive display purposes, the
      data may actually be transmitted in a different order; see
      Interlaced data order, Section 2.6.) The size of each pixel is
      determined by the bit depth, which is the number of bits per
      sample in the image data.

      Three types of pixel are supported:

         * An indexed-color pixel is represented by a single sample
           that is an index into a supplied palette. The image bit
           depth determines the maximum number of palette entries, but
           not the color precision within the palette.
         * A grayscale pixel is represented by a single sample that is
           a grayscale level, where zero is black and the largest value
           for the bit depth is white.
         * A truecolor pixel is represented by three samples: red (zero
           = black, max = red) appears first, then green (zero = black,
           max = green), then blue (zero = black, max = blue).
           The bit depth specifies the size of each sample, not the
           total pixel size.

      Optionally, grayscale and truecolor pixels can also include an
      alpha sample, as described in the next section.

      Pixels are always packed into scanlines with no wasted bits
      between pixels. Pixels smaller than a byte never cross byte
      boundaries; they are packed into bytes with the leftmost pixel in
      the high-order bits of a byte, the rightmost in the low-order
      bits. Permitted bit depths and pixel types are restricted so that
      in all cases the packing is simple and efficient.

      PNG permits multi-sample pixels only with 8- and 16-bit samples,
      so multiple samples of a single pixel are never packed into one
      byte. 16-bit samples are stored in network byte order (MSB
      first).

      Scanlines always begin on byte boundaries. When pixels have fewer
      than 8 bits and the scanline width is not evenly divisible by the
      number of pixels per byte, the low-order bits in the last byte of
      each scanline are wasted. The contents of these wasted bits are
      unspecified.

      An additional "filter type" byte is added to the beginning of
      every scanline (see Filtering, Section 2.5). The filter type byte
      is not considered part of the image data, but it is included in
      the datastream sent to the compression step.

   2.4. Alpha channel

      An alpha channel, representing transparency information on a
      per-pixel basis, can be included in grayscale and truecolor PNG
      images.

      An alpha value of zero represents full transparency, and a value
      of (2^bitdepth)-1 represents a fully opaque pixel. Intermediate
      values indicate partially transparent pixels that can be combined
      with a background image to yield a composite image. (Thus, alpha
      is really the degree of opacity of the pixel. But most people
      refer to alpha as providing transparency information, not opacity
      information, and we continue that custom here.)
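
      As an informal illustration only, the following Python sketch
      maps an alpha sample onto an opacity fraction and combines one
      foreground sample with one background sample. The function names
      are hypothetical and are not defined by this specification, and
      the sketch assumes that both samples are already linear (no gamma
      handling is attempted).

```python
def alpha_fraction(alpha, bit_depth):
    """Map an alpha sample to an opacity fraction in 0.0..1.0.

    0 means fully transparent; (2^bitdepth)-1 means fully opaque.
    """
    max_value = (1 << bit_depth) - 1
    if not 0 <= alpha <= max_value:
        raise ValueError("alpha sample out of range for this bit depth")
    return alpha / max_value


def composite_sample(foreground, background, alpha, bit_depth):
    """Combine one foreground sample with one background sample.

    Assumption (not from the text above): both samples are linear
    and use the same bit depth as the alpha sample.
    """
    a = alpha_fraction(alpha, bit_depth)
    return round(a * foreground + (1.0 - a) * background)
```

      With 8-bit samples, an alpha of 255 returns the foreground sample
      unchanged and an alpha of 0 returns the background sample.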

      Alpha channels can be included with images that have either 8 or
      16 bits per sample, but not with images that have fewer than 8
      bits per sample. Alpha samples are represented with the same bit
      depth used for the image samples. The alpha sample for each pixel
      is stored immediately following the grayscale or RGB samples of
      the pixel.

      The color values stored for a pixel are not affected by the alpha
      value assigned to the pixel. This rule is sometimes called
      "unassociated" or "non-premultiplied" alpha. (Another common
      technique is to store sample values premultiplied by the alpha
      fraction; in effect, such an image is already composited against a
      black background. PNG does not use premultiplied alpha.)

      Transparency control is also possible without the storage cost of
      a full alpha channel. In an indexed-color image, an alpha value
      can be defined for each palette entry. In grayscale and truecolor
      images, a single pixel value can be identified as being
      "transparent". These techniques are controlled by the tRNS
      ancillary chunk type.

      If neither an alpha channel nor a tRNS chunk is present, all
      pixels in the image are to be treated as fully opaque.

      Viewers can support transparency control partially, or not at all.

      See Rationale: Non-premultiplied alpha (Section 12.8),
      Recommendations for Encoders: Alpha channel creation (Section
      9.4), and Recommendations for Decoders: Alpha channel processing
      (Section 10.8).

   2.5. Filtering

      PNG allows the image data to be filtered before it is compressed.
      Filtering can improve the compressibility of the data. The filter
      step itself does not reduce the size of the data. All PNG filters
      are strictly lossless.

      PNG defines several different filter algorithms, including "none"
      which indicates no filtering.
      The filter algorithm is specified
      for each scanline by a filter type byte that precedes the filtered
      scanline in the precompression datastream. An intelligent encoder
      can switch filters from one scanline to the next. The method for
      choosing which filter to employ is up to the encoder.

      See Filter Algorithms (Chapter 6) and Rationale: Filtering
      (Section 12.9).

   2.6. Interlaced data order

      A PNG image can be stored in interlaced order to allow progressive
      display. The purpose of this feature is to allow images to "fade
      in" when they are being displayed on-the-fly. Interlacing
      slightly expands the file size on average, but it gives the user a
      meaningful display much more rapidly. Note that decoders are
      required to be able to read interlaced images, whether or not they
      actually perform progressive display.

      With interlace method 0, pixels are stored sequentially from left
      to right, and scanlines sequentially from top to bottom (no
      interlacing).

      Interlace method 1, known as Adam7 after its author, Adam M.
      Costello, consists of seven distinct passes over the image. Each
      pass transmits a subset of the pixels in the image. The pass in
      which each pixel is transmitted is defined by replicating the
      following 8-by-8 pattern over the entire image, starting at the
      upper left corner:

         1 6 4 6 2 6 4 6
         7 7 7 7 7 7 7 7
         5 6 5 6 5 6 5 6
         7 7 7 7 7 7 7 7
         3 6 4 6 3 6 4 6
         7 7 7 7 7 7 7 7
         5 6 5 6 5 6 5 6
         7 7 7 7 7 7 7 7

      Within each pass, the selected pixels are transmitted left to
      right within a scanline, and selected scanlines sequentially from
      top to bottom. For example, pass 2 contains pixels 4, 12, 20,
      etc. of scanlines 0, 8, 16, etc. (numbering from 0,0 at the upper
      left corner). The last pass contains the entirety of scanlines 1,
      3, 5, etc.
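
      The pattern lookup described above can be sketched in Python as
      follows. This is an informal illustration under stated
      assumptions, not a reference implementation: the names
      ADAM7_PATTERN, adam7_pass, and pixels_in_pass are hypothetical,
      and a real decoder would normally compute pass geometry
      arithmetically rather than scan every pixel.

```python
# The 8-by-8 pass pattern from the text, indexed as [y % 8][x % 8].
ADAM7_PATTERN = (
    (1, 6, 4, 6, 2, 6, 4, 6),
    (7, 7, 7, 7, 7, 7, 7, 7),
    (5, 6, 5, 6, 5, 6, 5, 6),
    (7, 7, 7, 7, 7, 7, 7, 7),
    (3, 6, 4, 6, 3, 6, 4, 6),
    (7, 7, 7, 7, 7, 7, 7, 7),
    (5, 6, 5, 6, 5, 6, 5, 6),
    (7, 7, 7, 7, 7, 7, 7, 7),
)


def adam7_pass(x, y):
    """Return the pass (1..7) in which pixel (x, y) is transmitted."""
    return ADAM7_PATTERN[y % 8][x % 8]


def pixels_in_pass(pass_number, width, height):
    """List the (x, y) pixels of one pass in transmission order:
    left to right within a scanline, scanlines top to bottom."""
    return [
        (x, y)
        for y in range(height)
        for x in range(width)
        if adam7_pass(x, y) == pass_number
    ]
```

      For instance, pixels_in_pass(2, 24, 17) yields pixels 4, 12, and
      20 of scanlines 0, 8, and 16, matching the example in the text.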

      The data within each pass is laid out as though it were a complete
      image of the appropriate dimensions. For example, if the complete
      image is 16 by 16 pixels, then pass 3 will contain two scanlines,
      each containing four pixels. When pixels have fewer than 8 bits,
      each such scanline is padded as needed to fill an integral number
      of bytes (see Image layout, Section 2.3). Filtering is done on
      this reduced image in the usual way, and a filter type byte is
      transmitted before each of its scanlines (see Filter Algorithms,
      Chapter 6). Notice that the transmission order is defined so that
      all the scanlines transmitted in a pass will have the same number
      of pixels; this is necessary for proper application of some of the
      filters.

      Caution: If the image contains fewer than five columns or fewer
      than five rows, some passes will be entirely empty. Encoder and
      decoder authors must be careful to handle this case correctly. In
      particular, filter type bytes are only associated with nonempty
      scanlines; no filter type bytes are present in an empty pass.

      See Rationale: Interlacing (Section 12.6) and Recommendations for
      Decoders: Progressive display (Section 10.9).

   2.7. Gamma correction

      PNG images can specify, via the gAMA chunk, the gamma
      characteristic of the image with respect to the original scene.
      Display programs are strongly encouraged to use this information,
      plus information about the display device they are using and room
      lighting, to present the image to the viewer in a way that
      reproduces what the image's original author saw as closely as
      possible. See Gamma Tutorial (Chapter 13) if you aren't already
      familiar with gamma issues.

      Gamma correction is not applied to the alpha channel, if any.
      Alpha samples always represent a linear fraction of full opacity.

      For high-precision applications, the exact chromaticity of the RGB
      data in a PNG image can be specified via the cHRM chunk, allowing
      more accurate color matching than gamma correction alone will
      provide. See Color Tutorial (Chapter 14) if you aren't already
      familiar with color representation issues.

      See Rationale: Why gamma? (Section 12.7), Recommendations for
      Encoders: Encoder gamma handling (Section 9.2), and
      Recommendations for Decoders: Decoder gamma handling (Section
      10.5).

   2.8. Text strings

      A PNG file can store text associated with the image, such as an
      image description or copyright notice. Keywords are used to
      indicate what each text string represents.

      ISO 8859-1 (Latin-1) is the character set recommended for use in
      text strings [ISO-8859]. This character set is a superset of
      7-bit ASCII.

      Character codes not defined in Latin-1 should not be used, because
      they have no platform-independent meaning. If a non-Latin-1 code
      does appear in a PNG text string, its interpretation will vary
      across platforms and decoders. Some systems might not even be
      able to display all the characters in Latin-1, but most modern
      systems can.

      Provision is also made for the storage of compressed text.

      See Rationale: Text strings (Section 12.10).

3. File Structure

   A PNG file consists of a PNG signature followed by a series of
   chunks. This chapter defines the signature and the basic properties
   of chunks. Individual chunk types are discussed in the next chapter.

   3.1. PNG file signature

      The first eight bytes of a PNG file always contain the following
      (decimal) values:

         137 80 78 71 13 10 26 10

      This signature indicates that the remainder of the file contains a
      single PNG image, consisting of a series of chunks beginning with
      an IHDR chunk and ending with an IEND chunk.

      See Rationale: PNG file signature (Section 12.11).

   3.2. Chunk layout

      Each chunk consists of four parts:

      Length
         A 4-byte unsigned integer giving the number of bytes in the
         chunk's data field. The length counts only the data field, not
         itself, the chunk type code, or the CRC. Zero is a valid
         length. Although encoders and decoders should treat the length
         as unsigned, its value must not exceed (2^31)-1 bytes.

      Chunk Type
         A 4-byte chunk type code. For convenience in description and
         in examining PNG files, type codes are restricted to consist of
         uppercase and lowercase ASCII letters (A-Z and a-z, or 65-90
         and 97-122 decimal). However, encoders and decoders must treat
         the codes as fixed binary values, not character strings. For
         example, it would not be correct to represent the type code
         IDAT by the EBCDIC equivalents of those letters. Additional
         naming conventions for chunk types are discussed in the next
         section.

      Chunk Data
         The data bytes appropriate to the chunk type, if any. This
         field can be of zero length.

      CRC
         A 4-byte CRC (Cyclic Redundancy Check) calculated on the
         preceding bytes in the chunk, including the chunk type code and
         chunk data fields, but not including the length field. The CRC
         is always present, even for chunks containing no data. See CRC
         algorithm (Section 3.4).

      The chunk data length can be any number of bytes up to the
      maximum; therefore, implementors cannot assume that chunks are
      aligned on any boundaries larger than bytes.

      Chunks can appear in any order, subject to the restrictions placed
      on each chunk type. (One notable restriction is that IHDR must
      appear first and IEND must appear last; thus the IEND chunk serves
      as an end-of-file marker.) Multiple chunks of the same type can
      appear, but only if specifically permitted for that type.

      See Rationale: Chunk layout (Section 12.12).

   3.3. Chunk naming conventions

      Chunk type codes are assigned so that a decoder can determine some
      properties of a chunk even when it does not recognize the type
      code. These rules are intended to allow safe, flexible extension
      of the PNG format, by allowing a decoder to decide what to do when
      it encounters an unknown chunk. The naming rules are not normally
      of interest when the decoder does recognize the chunk's type.

      Four bits of the type code, namely bit 5 (value 32) of each byte,
      are used to convey chunk properties. This choice means that a
      human can read off the assigned properties according to whether
      each letter of the type code is uppercase (bit 5 is 0) or
      lowercase (bit 5 is 1). However, decoders should test the
      properties of an unknown chunk by numerically testing the
      specified bits; testing whether a character is uppercase or
      lowercase is inefficient, and even incorrect if a locale-specific
      case definition is used.

      It is worth noting that the property bits are an inherent part of
      the chunk name, and hence are fixed for any chunk type. Thus,
      TEXT and Text would be unrelated chunk type codes, not the same
      chunk with different properties. Decoders should recognize type
      codes by a simple four-byte literal comparison; it is incorrect to
      perform case conversion on type codes.

      The semantics of the property bits are:

      Ancillary bit: bit 5 of first byte
         0 (uppercase) = critical, 1 (lowercase) = ancillary.

         Chunks that are not strictly necessary in order to meaningfully
         display the contents of the file are known as "ancillary"
         chunks. A decoder encountering an unknown chunk in which the
         ancillary bit is 1 can safely ignore the chunk and proceed to
         display the image. The time chunk (tIME) is an example of an
         ancillary chunk.

         Chunks that are necessary for successful display of the file's
         contents are called "critical" chunks.
         A decoder encountering
         an unknown chunk in which the ancillary bit is 0 must indicate
         to the user that the image contains information it cannot
         safely interpret. The image header chunk (IHDR) is an example
         of a critical chunk.

      Private bit: bit 5 of second byte
         0 (uppercase) = public, 1 (lowercase) = private.

         A public chunk is one that is part of the PNG specification or
         is registered in the list of PNG special-purpose public chunk
         types. Applications can also define private (unregistered)
         chunks for their own purposes. The names of private chunks
         must have a lowercase second letter, while public chunks will
         always be assigned names with uppercase second letters. Note
         that decoders do not need to test the private-chunk property
         bit, since it has no functional significance; it is simply an
         administrative convenience to ensure that public and private
         chunk names will not conflict. See Additional Chunk Types
         (Section 4.4) and Recommendations for Encoders: Use of private
         chunks (Section 9.8).

      Reserved bit: bit 5 of third byte
         Must be 0 (uppercase) always.

         The significance of the case of the third letter of the chunk
         name is reserved for possible future expansion. At the present
         time all chunk names must have uppercase third letters.
         (Decoders should not complain about a lowercase third letter,
         however, as some future version of the PNG specification could
         define a meaning for this bit. It is sufficient to treat a
         chunk with a lowercase third letter in the same way as any
         other unknown chunk type.)

      Safe-to-copy bit: bit 5 of fourth byte
         0 (uppercase) = unsafe to copy, 1 (lowercase) = safe to copy.

         This property bit is not of interest to pure decoders, but it
         is needed by PNG editors (programs that modify PNG files).
         This bit defines the proper handling of unrecognized chunks in
         a file that is being modified.
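
         As an informal illustration of the numeric test recommended
         above, the following Python sketch extracts all four property
         bits by masking bit 5 (value 32) of each byte. The function
         name and the dictionary keys are hypothetical and chosen only
         for this example; they are not defined by this specification.

```python
def chunk_properties(type_code: bytes) -> dict:
    """Return the four property bits of a 4-byte chunk type code."""
    if len(type_code) != 4:
        raise ValueError("chunk type code must be exactly 4 bytes")
    # Test bit 5 (value 32) numerically; no case conversion and no
    # locale-dependent character classification is involved.
    bits = [bool(byte & 0x20) for byte in type_code]
    return {
        "ancillary": bits[0],      # first byte
        "private": bits[1],        # second byte
        "reserved": bits[2],       # third byte
        "safe_to_copy": bits[3],   # fourth byte
    }


# The two examples named in the text: IHDR is critical, tIME is ancillary.
print(chunk_properties(b"IHDR"))
print(chunk_properties(b"tIME"))
```

         Because the type code is handled as four raw bytes, the same
         routine works for unknown chunk types, which is the case the
         property bits exist for.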

         If a chunk's safe-to-copy bit is 1, the chunk may be copied to
         a modified PNG file whether or not the software recognizes the
         chunk type, and regardless of the extent of the file
         modifications.

         If a chunk's safe-to-copy bit is 0, it indicates that the chunk
         depends on the image data. If the program has made any changes
         to critical chunks, including addition, modification, deletion,
         or reordering of critical chunks, then unrecognized unsafe
         chunks must not be copied to the output PNG file. (Of course,
         if the program does recognize the chunk, it can choose to
         output an appropriately modified version.)

         A PNG editor is always allowed to copy all unrecognized chunks
         if it has only added, deleted, modified, or reordered ancillary
         chunks. This implies that it is not permissible for ancillary
         chunks to depend on other ancillary chunks.

         PNG editors that do not recognize a critical chunk must report
         an error and refuse to process that PNG file at all. The
         safe/unsafe mechanism is intended for use with ancillary
         chunks. The safe-to-copy bit will always be 0 for critical
         chunks.

         Rules for PNG editors are discussed further in Chunk Ordering
         Rules (Chapter 7).

      For example, the hypothetical chunk type name "bLOb" has the
      property bits:

         bLOb  <-- 32 bit chunk type code represented in text form
         ||||
         |||+- Safe-to-copy bit is 1 (lower case letter; bit 5 is 1)
         ||+-- Reserved bit is 0     (upper case letter; bit 5 is 0)
         |+--- Private bit is 0      (upper case letter; bit 5 is 0)
         +---- Ancillary bit is 1    (lower case letter; bit 5 is 1)

      Therefore, this name represents an ancillary, public, safe-to-copy
      chunk.

      See Rationale: Chunk naming conventions (Section 12.13).

   3.4. CRC algorithm

      Chunk CRCs are calculated using standard CRC methods with pre and
      post conditioning, as defined by ISO 3309 [ISO-3309] or ITU-T V.42
      [ITU-V42].
      The CRC polynomial employed is

         x^32+x^26+x^23+x^22+x^16+x^12+x^11+x^10+x^8+x^7+x^5+x^4+x^2+x+1

      The 32-bit CRC register is initialized to all 1's, and then the
      data from each byte is processed from the least significant bit
      (1) to the most significant bit (128). After all the data bytes
      are processed, the CRC register is inverted (its ones complement
      is taken). This value is transmitted (stored in the file) MSB
      first. For the purpose of separating into bytes and ordering, the
      least significant bit of the 32-bit CRC is defined to be the
      coefficient of the x^31 term.

      Practical calculation of the CRC always employs a precalculated
      table to greatly accelerate the computation. See Sample CRC Code
      (Chapter 15).

4. Chunk Specifications

   This chapter defines the standard types of PNG chunks.

   4.1. Critical Chunks

      All implementations must understand and successfully render the
      standard critical chunks. A valid PNG image must contain an IHDR
      chunk, one or more IDAT chunks, and an IEND chunk.

      4.1.1. IHDR Image Header

         The IHDR chunk must appear FIRST. It contains:

            Width:              4 bytes
            Height:             4 bytes
            Bit depth:          1 byte
            Color type:         1 byte
            Compression method: 1 byte
            Filter method:      1 byte
            Interlace method:   1 byte

         Width and height give the image dimensions in pixels. They are
         4-byte integers. Zero is an invalid value. The maximum for each
         is (2^31)-1 in order to accommodate languages that have
         difficulty with unsigned 4-byte values.

         Bit depth is a single-byte integer giving the number of bits
         per sample (not per pixel, except when a pixel contains just
         one sample). Valid values are 1, 2, 4, 8, and 16, although not
         all values are allowed for all color types.

         Color type is a single-byte integer that describes the
         interpretation of the image data.
         Color type codes represent
         sums of the following values: 1 (palette used), 2 (color used),
         and 4 (alpha channel used). Valid values are 0, 2, 3, 4, and 6.

         Bit depth restrictions for each color type are imposed to
         simplify implementations and to prohibit combinations that do
         not compress well. Decoders must support all legal
         combinations of bit depth and color type. The allowed
         combinations are:

            Color    Allowed     Interpretation
            Type     Bit Depths

            0        1,2,4,8,16  Each pixel is a grayscale sample.

            2        8,16        Each pixel is an R,G,B triple.

            3        1,2,4,8     Each pixel is a palette index;
                                 a PLTE chunk must appear.

            4        8,16        Each pixel is a grayscale sample,
                                 followed by an alpha sample.

            6        8,16        Each pixel is an R,G,B triple,
                                 followed by an alpha sample.

         Compression method is a single-byte integer that indicates the
         method used to compress the image data. At present, only
         compression method 0 (deflate/inflate compression with a 32K
         sliding window) is defined. All standard PNG images must be
         compressed with this scheme. The compression method field is
         provided for possible future expansion or proprietary variants.
         Decoders must check this byte and report an error if it holds
         an unrecognized code. See Deflate/Inflate Compression (Chapter
         5) for details.

         Filter method is a single-byte integer that indicates the
         preprocessing method applied to the image data before
         compression. At present, only filter method 0 (adaptive
         filtering with five basic filter types) is defined. As with
         the compression method field, decoders must check this byte and
         report an error if it holds an unrecognized code. See Filter
         Algorithms (Chapter 6) for details.

         Interlace method is a single-byte integer that indicates the
         transmission order of the image data. Two values are currently
         defined: 0 (no interlace) or 1 (Adam7 interlace).
         See Interlaced data order (Section 2.6) for details.

      4.1.2. PLTE Palette

         The PLTE chunk contains from 1 to 256 palette entries, each a
         three-byte series of the form:

            Red:   1 byte (0 = black, 255 = red)
            Green: 1 byte (0 = black, 255 = green)
            Blue:  1 byte (0 = black, 255 = blue)

         The number of entries is determined from the chunk length. A
         chunk length not divisible by 3 is an error.

         This chunk must appear for color type 3, and can appear for
         color types 2 and 6; it is not allowed for color types 0 and 4.
         If this chunk does appear, it must precede the first IDAT
         chunk. There cannot be more than one PLTE chunk.

         For color type 3 (indexed color), the PLTE chunk is required.
         The first entry in PLTE is referenced by pixel value 0, the
         second by pixel value 1, etc. The number of palette entries
         must not exceed the range that can be represented in the image
         bit depth (for example, 2^4 = 16 for a bit depth of 4). It is
         permissible to have fewer entries than the bit depth would
         allow. In that case, any out-of-range pixel value found in the
         image data is an error.

         For color types 2 and 6 (truecolor and truecolor with alpha),
         the PLTE chunk is optional. If present, it provides a
         suggested set of from 1 to 256 colors to which the truecolor
         image can be quantized if the viewer cannot display truecolor
         directly. If PLTE is not present, such a viewer must select
         colors on its own, but it is often preferable for this to be
         done once by the encoder. (See Recommendations for Encoders:
         Suggested palettes, Section 9.5.)

         Note that the palette uses 8 bits (1 byte) per sample
         regardless of the image bit depth specification. In
         particular, the palette is 8 bits deep even when it is a
         suggested quantization of a 16-bit truecolor image.

         There is no requirement that the palette entries all be used by
         the image, nor that they all be different.

      4.1.3. IDAT Image Data

         The IDAT chunk contains the actual image data. To create this
         data:

             * Begin with image scanlines represented as described in
               Image layout (Section 2.3); the layout and total size of
               this raw data are determined by the fields of IHDR.
             * Filter the image data according to the filtering method
               specified by the IHDR chunk. (Note that with filter
               method 0, the only one currently defined, this implies
               prepending a filter type byte to each scanline.)
             * Compress the filtered data using the compression method
               specified by the IHDR chunk.

         The IDAT chunk contains the output datastream of the
         compression algorithm.

         To read the image data, reverse this process.

         There can be multiple IDAT chunks; if so, they must appear
         consecutively with no other intervening chunks. The compressed
         datastream is then the concatenation of the contents of all the
         IDAT chunks. The encoder can divide the compressed datastream
         into IDAT chunks however it wishes. (Multiple IDAT chunks are
         allowed so that encoders can work in a fixed amount of memory;
         typically the chunk size will correspond to the encoder's
         buffer size.) It is important to emphasize that IDAT chunk
         boundaries have no semantic significance and can occur at any
         point in the compressed datastream. A PNG file in which each
         IDAT chunk contains only one data byte is legal, though
         remarkably wasteful of space. (For that matter, zero-length
         IDAT chunks are legal, though even more wasteful.)

         See Filter Algorithms (Chapter 6) and Deflate/Inflate
         Compression (Chapter 5) for details.

      4.1.4. IEND Image Trailer

         The IEND chunk must appear LAST. It marks the end of the PNG
         datastream. The chunk's data field is empty.

   4.2. Ancillary Chunks

      All ancillary chunks are optional, in the sense that encoders need
      not write them and decoders can ignore them.
      However, encoders
      are encouraged to write the standard ancillary chunks when the
      information is available, and decoders are encouraged to interpret
      these chunks when appropriate and feasible.

      The standard ancillary chunks are listed in alphabetical order.
      This is not necessarily the order in which they would appear in a
      file.

      4.2.1. bKGD Background Color

         The bKGD chunk specifies a default background color to present
         the image against. Note that viewers are not bound to honor
         this chunk; a viewer can choose to use a different background.

         For color type 3 (indexed color), the bKGD chunk contains:

            Palette index:  1 byte

         The value is the palette index of the color to be used as
         background.

         For color types 0 and 4 (grayscale, with or without alpha),
         bKGD contains:

            Gray:  2 bytes, range 0 .. (2^bitdepth)-1

         (For consistency, 2 bytes are used regardless of the image bit
         depth.) The value is the gray level to be used as background.

         For color types 2 and 6 (truecolor, with or without alpha),
         bKGD contains:

            Red:   2 bytes, range 0 .. (2^bitdepth)-1
            Green: 2 bytes, range 0 .. (2^bitdepth)-1
            Blue:  2 bytes, range 0 .. (2^bitdepth)-1

         (For consistency, 2 bytes per sample are used regardless of the
         image bit depth.) This is the RGB color to be used as
         background.

         When present, the bKGD chunk must precede the first IDAT chunk,
         and must follow the PLTE chunk, if any.

         See Recommendations for Decoders: Background color (Section
         10.7).

      4.2.2. cHRM Primary Chromaticities and White Point

         Applications that need device-independent specification of
         colors in a PNG file can use the cHRM chunk to specify the 1931
         CIE x,y chromaticities of the red, green, and blue primaries
         used in the image, and the referenced white point. See Color
         Tutorial (Chapter 14) for more information.

         The cHRM chunk contains:

            White Point x: 4 bytes
            White Point y: 4 bytes
            Red x:         4 bytes
            Red y:         4 bytes
            Green x:       4 bytes
            Green y:       4 bytes
            Blue x:        4 bytes
            Blue y:        4 bytes

         Each value is encoded as a 4-byte unsigned integer,
         representing the x or y value times 100000. For example, a
         value of 0.3127 would be stored as the integer 31270.

         cHRM is allowed in all PNG files, although it is of little
         value for grayscale images.

         If the encoder does not know the chromaticity values, it should
         not write a cHRM chunk; the absence of a cHRM chunk indicates
         that the image's primary colors are device-dependent.

         If the cHRM chunk appears, it must precede the first IDAT
         chunk, and it must also precede the PLTE chunk if present.

         See Recommendations for Encoders: Encoder color handling
         (Section 9.3), and Recommendations for Decoders: Decoder color
         handling (Section 10.6).

      4.2.3. gAMA Image Gamma

         The gAMA chunk specifies the gamma of the camera (or simulated
         camera) that produced the image, and thus the gamma of the
         image with respect to the original scene. More precisely, the
         gAMA chunk encodes the file_gamma value, as defined in Gamma
         Tutorial (Chapter 13).

         The gAMA chunk contains:

            Image gamma: 4 bytes

         The value is encoded as a 4-byte unsigned integer, representing
         gamma times 100000. For example, a gamma of 0.45 would be
         stored as the integer 45000.

         If the encoder does not know the image's gamma value, it should
         not write a gAMA chunk; the absence of a gAMA chunk indicates
         that the gamma is unknown.

         If the gAMA chunk appears, it must precede the first IDAT
         chunk, and it must also precede the PLTE chunk if present.

         See Gamma correction (Section 2.7), Recommendations for
         Encoders: Encoder gamma handling (Section 9.2), and
         Recommendations for Decoders: Decoder gamma handling (Section
         10.5).

      4.2.4. hIST Image Histogram

         The hIST chunk gives the approximate usage frequency of each
         color in the color palette. A histogram chunk can appear only
         when a palette chunk appears. If a viewer is unable to provide
         all the colors listed in the palette, the histogram may help it
         decide how to choose a subset of the colors for display.

         The hIST chunk contains a series of 2-byte (16 bit) unsigned
         integers. There must be exactly one entry for each entry in
         the PLTE chunk. Each entry is proportional to the fraction of
         pixels in the image that have that palette index; the exact
         scale factor is chosen by the encoder.

         Histogram entries are approximate, with the exception that a
         zero entry specifies that the corresponding palette entry is
         not used at all in the image. It is required that a histogram
         entry be nonzero if there are any pixels of that color.

         When the palette is a suggested quantization of a truecolor
         image, the histogram is necessarily approximate, since a
         decoder may map pixels to palette entries differently than the
         encoder did. In this situation, zero entries should not
         appear.

         The hIST chunk, if it appears, must follow the PLTE chunk, and
         must precede the first IDAT chunk.

         See Rationale: Palette histograms (Section 12.14), and
         Recommendations for Decoders: Suggested-palette and histogram
         usage (Section 10.10).

      4.2.5. pHYs Physical Pixel Dimensions

         The pHYs chunk specifies the intended resolution for display of
         the image. It contains:

            Pixels per unit, X axis: 4 bytes (unsigned integer)
            Pixels per unit, Y axis: 4 bytes (unsigned integer)
            Unit specifier:          1 byte

         The following values are legal for the unit specifier:

            0: unit is unknown
            1: unit is the meter

         When the unit specifier is 0, the pHYs chunk defines pixel
         aspect ratio only; the actual size of the pixels remains
         unspecified.
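
         The following Python sketch is one possible reading of the
         pHYs layout; it is illustrative only. The function names and
         result keys are hypothetical. It assumes the two 4-byte
         integers are stored most significant byte first, like other
         PNG integers, and uses the fact that one inch is exactly
         0.0254 meters to report dots per inch when the unit is the
         meter.

```python
import struct

METERS_PER_INCH = 0.0254  # exact conversion factor


def parse_phys(data: bytes):
    """Decode a 9-byte pHYs data field into (x_ppu, y_ppu, unit)."""
    if len(data) != 9:
        raise ValueError("pHYs data field must be exactly 9 bytes")
    # Two 4-byte unsigned integers, then the 1-byte unit specifier.
    return struct.unpack(">IIB", data)


def describe_phys(data: bytes) -> dict:
    """Summarize a pHYs data field (hypothetical helper)."""
    x_ppu, y_ppu, unit = parse_phys(data)
    info = {}
    if x_ppu and y_ppu:
        # Width of one pixel divided by its height; 1.0 means square.
        info["pixel_aspect"] = y_ppu / x_ppu
    if unit == 1:
        # Pixels per meter times meters per inch gives pixels per inch.
        info["dpi"] = (x_ppu * METERS_PER_INCH, y_ppu * METERS_PER_INCH)
    return info


# 2835 pixels per meter on both axes is roughly 72 pixels per inch.
print(describe_phys(struct.pack(">IIB", 2835, 2835, 1)))
# Unit specifier 0: only the aspect ratio is meaningful.
print(describe_phys(struct.pack(">IIB", 2, 1, 0)))
```

         With unit specifier 0 the sketch deliberately reports no
         physical resolution, since only the ratio of the two values is
         defined in that case.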

         Conversion note: one inch is equal to exactly 0.0254 meters.

         If this ancillary chunk is not present, pixels are assumed to
         be square, and the physical size of each pixel is unknown.

         If present, this chunk must precede the first IDAT chunk.

         See Recommendations for Decoders: Pixel dimensions (Section
         10.2).

      4.2.6. sBIT Significant Bits

         To simplify decoders, PNG specifies that only certain sample
         bit depths can be used, and further specifies that sample
         values should be scaled to the full range of possible values at
         that bit depth. However, the sBIT chunk is provided in order
         to store the original number of significant bits. This allows
         decoders to recover the original data losslessly even if the
         data had a bit depth not directly supported by PNG. We
         recommend that an encoder emit an sBIT chunk if it has
         converted the data from a lower bit depth.

         For color type 0 (grayscale), the sBIT chunk contains a single
         byte, indicating the number of bits that were significant in
         the source data.

         For color type 2 (truecolor), the sBIT chunk contains three
         bytes, indicating the number of bits that were significant in
         the source data for the red, green, and blue channels,
         respectively.

         For color type 3 (indexed color), the sBIT chunk contains three
         bytes, indicating the number of bits that were significant in
         the source data for the red, green, and blue components of the
         palette entries, respectively.

         For color type 4 (grayscale with alpha channel), the sBIT chunk
         contains two bytes, indicating the number of bits that were
         significant in the source grayscale data and the source alpha
         data, respectively.

         For color type 6 (truecolor with alpha channel), the sBIT chunk
         contains four bytes, indicating the number of bits that were
         significant in the source data for the red, green, blue and
         alpha channels, respectively.

         Each depth specified in sBIT must be greater than zero and less
         than or equal to the sample depth (which is 8 for indexed-color
         images, and the bit depth given in IHDR for other color types).

         A decoder need not pay attention to sBIT: the stored image is a
         valid PNG file of the sample depth indicated by IHDR. However,
         if the decoder wishes to recover the original data at its
         original precision, this can be done by right-shifting the
         stored samples (the stored palette entries, for an
         indexed-color image). The encoder must scale the data in such a
         way that the high-order bits match the original data.

         If the sBIT chunk appears, it must precede the first IDAT
         chunk, and it must also precede the PLTE chunk if present.

         See Recommendations for Encoders: Bit depth scaling (Section
         9.1) and Recommendations for Decoders: Bit depth rescaling
         (Section 10.4).

      4.2.7. tEXt Textual Data

         Textual information that the encoder wishes to record with the
         image can be stored in tEXt chunks. Each tEXt chunk contains a
         keyword and a text string, in the format:

            Keyword:        1-79 bytes (character string)
            Null separator: 1 byte
            Text:           n bytes (character string)

         The keyword and text string are separated by a zero byte (null
         character). Neither the keyword nor the text string can
         contain a null character. Note that the text string is not
         null-terminated (the length of the chunk is sufficient
         information to locate the ending). The keyword must be at
         least one character and less than 80 characters long.
         The text
         string can be of any length from zero bytes up to the maximum
         permissible chunk size less the length of the keyword and
         separator.

         Any number of tEXt chunks can appear, and more than one with
         the same keyword is permissible.

         The keyword indicates the type of information represented by
         the text string. The following keywords are predefined and
         should be used where appropriate:

            Title            Short (one line) title or caption for image
            Author           Name of image's creator
            Description      Description of image (possibly long)
            Copyright        Copyright notice
            Creation Time    Time of original image creation
            Software         Software used to create the image
            Disclaimer       Legal disclaimer
            Warning          Warning of nature of content
            Source           Device used to create the image
            Comment          Miscellaneous comment; conversion from
                             GIF comment

         For the Creation Time keyword, the date format defined in
         section 5.2.14 of RFC 1123 is suggested, but not required
         [RFC-1123]. Decoders should allow for free-format text
         associated with this or any other keyword.

         Other keywords may be invented for other purposes. Keywords of
         general interest can be registered with the maintainers of the
         PNG specification. However, it is also permitted to use
         private unregistered keywords. (Private keywords should be
         reasonably self-explanatory, in order to minimize the chance
         that the same keyword will be used for incompatible purposes by
         different people.)

         Both keyword and text are interpreted according to the ISO
         8859-1 (Latin-1) character set [ISO-8859]. The text string can
         contain any Latin-1 character. Newlines in the text string
         should be represented by a single linefeed character (decimal
         10); use of other control characters in the text is
         discouraged.
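
         As a rough sketch of how a decoder might split a tEXt data
         field, the Python fragment below separates keyword and text at
         the first null byte, checks the length limits given above, and
         decodes both parts as Latin-1. The function name is
         hypothetical, and the sketch does not attempt to validate
         which characters appear in the keyword.

```python
def parse_text_chunk(data: bytes):
    """Split a tEXt data field into (keyword, text) strings."""
    keyword, separator, text = data.partition(b"\x00")
    if not separator:
        raise ValueError("tEXt data field has no null separator")
    if not 1 <= len(keyword) <= 79:
        raise ValueError("keyword must be 1 to 79 bytes long")
    if b"\x00" in text:
        raise ValueError("text string must not contain a null character")
    # Every byte value maps to a Latin-1 code point, so decoding
    # cannot fail; the text is not null-terminated and may be empty.
    return keyword.decode("latin-1"), text.decode("latin-1")


print(parse_text_chunk(b"Title\x00Sunset over the bay"))
print(parse_text_chunk(b"Comment\x00"))
```

         The end of the text string is found from the chunk length
         alone, which is why the sketch takes the whole data field as
         its argument.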

         Keywords must contain only printable Latin-1 characters and
         spaces; that is, only character codes 32-126 and 161-255
         decimal are allowed. To reduce the chances for human
         misreading of a keyword, leading and trailing spaces are
         forbidden, as are consecutive spaces. Note also that the
         non-breaking space (code 160) is not permitted in keywords,
         since it is visually indistinguishable from an ordinary space.

         Keywords must be spelled exactly as registered, so that
         decoders can use simple literal comparisons when looking for
         particular keywords. In particular, keywords are considered
         case-sensitive.

         See Recommendations for Encoders: Text chunk processing
         (Section 9.7) and Recommendations for Decoders: Text chunk
         processing (Section 10.11).

      4.2.8. tIME Image Last-Modification Time

         The tIME chunk gives the time of the last image modification
         (not the time of initial image creation). It contains:

            Year:   2 bytes (complete; for example, 1995, not 95)
            Month:  1 byte (1-12)
            Day:    1 byte (1-31)
            Hour:   1 byte (0-23)
            Minute: 1 byte (0-59)
            Second: 1 byte (0-60)    (yes, 60, for leap seconds; not 61,
                                     a common error)

         Universal Time (UTC, also called GMT) should be specified
         rather than local time.

         The tIME chunk is intended for use as an automatically-applied
         time stamp that is updated whenever the image data is changed.
         It is recommended that tIME not be changed by PNG editors that
         do not change the image data. See also the Creation Time tEXt
         keyword, which can be used for a user-supplied time.

      4.2.9. tRNS Transparency

         The tRNS chunk specifies that the image uses simple
         transparency: either alpha values associated with palette
         entries (for indexed-color images) or a single transparent
         color (for grayscale and truecolor images).
         Although simple
         transparency is not as elegant as the full alpha channel, it
         requires less storage space and is sufficient for many common
         cases.

         For color type 3 (indexed color), the tRNS chunk contains a
         series of one-byte alpha values, corresponding to entries in
         the PLTE chunk:

            Alpha for palette index 0:  1 byte
            Alpha for palette index 1:  1 byte
            ... etc ...

         Each entry indicates that pixels of the corresponding palette
         index should be treated as having the specified alpha value.
         Alpha values have the same interpretation as in an 8-bit full
         alpha channel: 0 is fully transparent, 255 is fully opaque,
         regardless of image bit depth. The tRNS chunk must not contain
         more alpha values than there are palette entries, but tRNS can
         contain fewer values than there are palette entries. In this
         case, the alpha value for all remaining palette entries is
         assumed to be 255. In the common case in which only palette
         index 0 need be made transparent, only a one-byte tRNS chunk is
         needed.

         For color type 0 (grayscale), the tRNS chunk contains a single
         gray level value, stored in the format:

            Gray:  2 bytes, range 0 .. (2^bitdepth)-1

         (For consistency, 2 bytes are used regardless of the image bit
         depth.) Pixels of the specified gray level are to be treated as
         transparent (equivalent to alpha value 0); all other pixels are
         to be treated as fully opaque (alpha value (2^bitdepth)-1).

         For color type 2 (truecolor), the tRNS chunk contains a single
         RGB color value, stored in the format:

            Red:   2 bytes, range 0 .. (2^bitdepth)-1
            Green: 2 bytes, range 0 .. (2^bitdepth)-1
            Blue:  2 bytes, range 0 .. (2^bitdepth)-1

         (For consistency, 2 bytes per sample are used regardless of the
         image bit depth.)
         Pixels of the specified color value are to be
         treated as transparent (equivalent to alpha value 0); all other
         pixels are to be treated as fully opaque (alpha value
         (2^bitdepth)-1).

         tRNS is prohibited for color types 4 and 6, since a full alpha
         channel is already present in those cases.

         Note: when dealing with 16-bit grayscale or truecolor data, it
         is important to compare both bytes of the sample values to
         determine whether a pixel is transparent. Although decoders
         may drop the low-order byte of the samples for display, this
         must not occur until after the data has been tested for
         transparency. For example, if the grayscale level 0x0001 is
         specified to be transparent, it would be incorrect to compare
         only the high-order byte and decide that 0x0002 is also
         transparent.

         When present, the tRNS chunk must precede the first IDAT chunk,
         and must follow the PLTE chunk, if any.

      4.2.10. zTXt Compressed Textual Data

         The zTXt chunk contains textual data, just as tEXt does;
         however, zTXt takes advantage of compression. zTXt and tEXt
         chunks are semantically equivalent, but zTXt is recommended for
         storing large blocks of text.

         A zTXt chunk contains:

            Keyword:            1-79 bytes (character string)
            Null separator:     1 byte
            Compression method: 1 byte
            Compressed text:    n bytes

         The keyword and null separator are exactly the same as in the
         tEXt chunk. Note that the keyword is not compressed. The
         compression method byte identifies the compression method used
         in this zTXt chunk. The only value presently defined for it is
         0 (deflate/inflate compression). The compression method byte is
         followed by a compressed datastream that makes up the remainder
         of the chunk. For compression method 0, this datastream
         adheres to the zlib datastream format (see Deflate/Inflate
         Compression, Chapter 5).
         Decompression of this datastream
         yields Latin-1 text that is identical to the text that would be
         stored in an equivalent tEXt chunk.

         Any number of zTXt and tEXt chunks can appear in the same file.
         See the preceding definition of the tEXt chunk for the
         predefined keywords and the recommended format of the text.

         See Recommendations for Encoders: Text chunk processing
         (Section 9.7), and Recommendations for Decoders: Text chunk
         processing (Section 10.11).

   4.3. Summary of Standard Chunks

      This table summarizes some properties of the standard chunk types.

         Critical chunks (must appear in this order, except PLTE
                          is optional):

                 Name  Multiple  Ordering constraints
                         OK?

                 IHDR    No      Must be first
                 PLTE    No      Before IDAT
                 IDAT    Yes     Multiple IDATs must be consecutive
                 IEND    No      Must be last

         Ancillary chunks (need not appear in this order):

                 Name  Multiple  Ordering constraints
                         OK?

                 cHRM    No      Before PLTE and IDAT
                 gAMA    No      Before PLTE and IDAT
                 sBIT    No      Before PLTE and IDAT
                 bKGD    No      After PLTE; before IDAT
                 hIST    No      After PLTE; before IDAT
                 tRNS    No      After PLTE; before IDAT
                 pHYs    No      Before IDAT
                 tIME    No      None
                 tEXt    Yes     None
                 zTXt    Yes     None

      Standard keywords for tEXt and zTXt chunks:

         Title            Short (one line) title or caption for image
         Author           Name of image's creator
         Description      Description of image (possibly long)
         Copyright        Copyright notice
         Creation Time    Time of original image creation
         Software         Software used to create the image
         Disclaimer       Legal disclaimer
         Warning          Warning of nature of content
         Source           Device used to create the image
         Comment          Miscellaneous comment; conversion from
                          GIF comment

   4.4. Additional Chunk Types

      Additional public PNG chunk types are defined in the document "PNG
      Special-Purpose Public Chunks" [PNG-EXTENSIONS].
   Chunks described there are expected to be less widely supported
   than those defined in this specification. However, application
   authors are encouraged to use those chunk types whenever
   appropriate for their applications. Additional chunk types can be
   proposed for inclusion in that list by contacting the PNG
   specification maintainers at png-info@uunet.uu.net.

   New public chunks will only be registered if they are of use to
   others and do not violate the design philosophy of PNG. Chunk
   registration is not automatic, although it is the intent of the
   authors that it be straightforward when a new chunk of potentially
   wide application is needed. Note that the creation of new
   critical chunk types is discouraged unless absolutely necessary.

   Applications can also use private chunk types to carry data that
   is not of interest to other applications. See Recommendations for
   Encoders: Use of private chunks (Section 9.8).

   Decoders must be prepared to encounter unrecognized public or
   private chunk type codes. Unrecognized chunk types must be
   handled as described in Chunk naming conventions (Section 3.3).

5. Deflate/Inflate Compression

   PNG compression method 0 (the only compression method presently
   defined for PNG) specifies deflate/inflate compression with a 32K
   sliding window. Deflate compression is an LZ77 derivative used in
   zip, gzip, pkzip and related programs. Extensive research has been
   done supporting its patent-free status. Portable C implementations
   are freely available.

   Deflate-compressed datastreams within PNG are stored in the "zlib"
   format, which has the structure:

      Compression method/flags code: 1 byte
      Additional flags/check bits:   1 byte
      Compressed data blocks:        n bytes
      Check value:                   4 bytes

   Further details on this format are given in the zlib specification
   [RFC-1950].
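
   The four fields listed above can be inspected without inflating
   anything. The following Python sketch is illustrative only: the
   function name parse_zlib_header is hypothetical, and the bit-level
   layout it assumes (method in the low four bits of the first byte,
   window size exponent minus 8 in the high four bits, the preset
   dictionary flag in bit 5 of the second byte, the two header bytes
   forming a multiple of 31, and a big-endian Adler-32 check value)
   comes from the zlib specification [RFC-1950], not from this
   document.

```python
import zlib


def parse_zlib_header(stream: bytes) -> dict:
    """Split a complete zlib datastream into its header fields.

    Hypothetical helper for illustration; it does not decompress anything.
    """
    if len(stream) < 6:
        raise ValueError("a zlib datastream needs 2 header and 4 check bytes")
    cmf, flg = stream[0], stream[1]
    # RFC 1950: the header bytes, read as a 16-bit big-endian number,
    # must be a multiple of 31.
    if (cmf * 256 + flg) % 31 != 0:
        raise ValueError("header check bits are inconsistent")
    return {
        "method": cmf & 0x0F,                    # 8 means deflate
        "window_size": 1 << ((cmf >> 4) + 8),    # LZ77 window in bytes
        "preset_dictionary": bool(flg & 0x20),   # FDICT flag
        "check_value": int.from_bytes(stream[-4:], "big"),
    }


if __name__ == "__main__":
    print(parse_zlib_header(zlib.compress(b"example text")))
```

   A decoder built along these lines could reject a datastream early
   when the method is not 8, the window exceeds 32K, or a preset
   dictionary is requested.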

   For PNG compression method 0, the zlib compression method/flags code
   must specify method code 8 ("deflate" compression) and an LZ77 window
   size of not more than 32K. Note that the zlib compression method
   number is not the same as the PNG compression method number. The
   additional flags must not specify a preset dictionary.

   The compressed data within the zlib datastream is stored as a series
   of blocks, each of which can represent raw (uncompressed) data,
   LZ77-compressed data encoded with fixed Huffman codes, or
   LZ77-compressed data encoded with custom Huffman codes. A marker bit
   in the final block identifies it as the last block, allowing the
   decoder to recognize the end of the compressed datastream. Further
   details on the compression algorithm and the encoding are given in
   the deflate specification [RFC-1951].

   The check value stored at the end of the zlib datastream is
   calculated on the uncompressed data represented by the datastream.
   Note that the algorithm used is not the same as the CRC calculation
   used for PNG chunk check values. The zlib check value is useful
   mainly as a cross-check that the deflate and inflate algorithms are
   implemented correctly. Verifying the chunk CRCs provides adequate
   confidence that the PNG file has been transmitted undamaged.

   In a PNG file, the concatenation of the contents of all the IDAT
   chunks makes up a zlib datastream as specified above. This
   datastream decompresses to filtered image data as described elsewhere
   in this document.

   It is important to emphasize that the boundaries between IDAT chunks
   are arbitrary and can fall anywhere in the zlib datastream. There is
   not necessarily any correlation between IDAT chunk boundaries and
   deflate block boundaries or any other feature of the zlib data.
   For example, it is entirely possible for the terminating zlib check
   value to be split across IDAT chunks.

   In the same vein, there is no required correlation between the
   structure of the image data (i.e., scanline boundaries) and deflate
   block boundaries or IDAT chunk boundaries. The complete image data
   is represented by a single zlib datastream that is stored in some
   number of IDAT chunks; a decoder that assumes any more than this is
   incorrect. (Of course, some encoder implementations may emit files
   in which some of these structures are indeed related. But decoders
   cannot rely on this.)

   PNG also uses zlib datastreams in zTXt chunks. In a zTXt chunk, the
   remainder of the chunk following the compression method byte is a
   zlib datastream as specified above. This datastream decompresses to
   the user-readable text described by the chunk's keyword. Unlike the
   image data, such datastreams are not split across chunks; each zTXt
   chunk contains an independent zlib datastream.

   Additional documentation and portable C code for deflate and inflate
   are available from the Info-ZIP archives.

6. Filter Algorithms

   This chapter describes the filter algorithms that can be applied
   before compression. The purpose of these filters is to prepare the
   image data for optimum compression.

   PNG filter method 0 defines five basic filter types:

      Type    Name

      0       None
      1       Sub
      2       Up
      3       Average
      4       Paeth

   (Note that filter method 0 in IHDR specifies exactly this set of five
   filter types. If the set of filter types is ever extended, a
   different filter method number will be assigned to the extended set,
   so that decoders need not decompress the data to discover that it
   contains unsupported filter types.)

   The encoder can choose which of these filter algorithms to apply on a
   scanline-by-scanline basis.
   In the image data sent to the compression step, each scanline is
   preceded by a filter type byte that specifies the filter algorithm
   used for that scanline.

   Filtering algorithms are applied to bytes, not to pixels, regardless
   of the bit depth or color type of the image. The filtering
   algorithms work on the byte sequence formed by a scanline that has
   been represented as described in Image layout (Section 2.3). If the
   image includes an alpha channel, the alpha data is filtered in the
   same way as the image data.

   When the image is interlaced, each pass of the interlace pattern is
   treated as an independent image for filtering purposes. The filters
   work on the byte sequences formed by the pixels actually transmitted
   during a pass, and the "previous scanline" is the one previously
   transmitted in the same pass, not the one adjacent in the complete
   image. Note that the subimage transmitted in any one pass is always
   rectangular, but is of smaller width and/or height than the complete
   image. Filtering is not applied when this subimage is empty.

   For all filters, the bytes "to the left of" the first pixel in a
   scanline must be treated as being zero. For filters that refer to
   the prior scanline, the entire prior scanline must be treated as
   being zeroes for the first scanline of an image (or of a pass of an
   interlaced image).

   To reverse the effect of a filter, the decoder must use the decoded
   values of the prior pixel on the same line, the pixel immediately
   above the current pixel on the prior line, and the pixel just to the
   left of the pixel above. This implies that at least one scanline's
   worth of image data must be stored by the decoder at all times.
   Even though some filter types do not refer to the prior scanline, the
   decoder must always store each scanline as it is decoded, since the
   next scanline might use a filter that refers to it.

   PNG imposes no restriction on which filter types can be applied to an
   image. However, the filters are not equally effective on all types
   of data. See Recommendations for Encoders: Filter selection (Section
   9.6).

   See also Rationale: Filtering (Section 12.9).

6.1. Filter type 0: None

   With the None filter, the scanline is transmitted unmodified; it
   is only necessary to insert a filter type byte before the data.

6.2. Filter type 1: Sub

   The Sub filter transmits the difference between each byte and the
   value of the corresponding byte of the prior pixel.

   To compute the Sub filter, apply the following formula to each
   byte of the scanline:

      Sub(x) = Raw(x) - Raw(x-bpp)

   where x ranges from zero to the number of bytes representing the
   scanline minus one, Raw(x) refers to the raw data byte at that
   byte position in the scanline, and bpp is defined as the number of
   bytes per complete pixel, rounding up to one. For example, for
   color type 2 with a bit depth of 16, bpp is equal to 6 (three
   samples, two bytes per sample); for color type 0 with a bit depth
   of 2, bpp is equal to 1 (rounding up); for color type 4 with a bit
   depth of 16, bpp is equal to 4 (two-byte grayscale sample, plus
   two-byte alpha sample).

   Note this computation is done for each byte, regardless of bit
   depth. In a 16-bit image, each MSB is predicted from the
   preceding MSB and each LSB from the preceding LSB, because of the
   way that bpp is defined.

   Unsigned arithmetic modulo 256 is used, so that both the inputs
   and outputs fit into bytes. The sequence of Sub values is
   transmitted as the filtered scanline.

   For all x < 0, assume Raw(x) = 0.
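
   The Sub computation and the definition of bpp can be sketched in
   Python as follows. This is a minimal illustration, not a reference
   implementation: the function names bytes_per_pixel and sub_filter
   are hypothetical, and the filter type byte that precedes each
   scanline is assumed to be written by the caller.

```python
def bytes_per_pixel(samples_per_pixel: int, bit_depth: int) -> int:
    """bpp as defined for the Sub filter: whole bytes per pixel, at least 1."""
    return max(1, (samples_per_pixel * bit_depth) // 8)


def sub_filter(raw: bytes, bpp: int) -> bytes:
    """Apply Sub(x) = Raw(x) - Raw(x-bpp) to one scanline, modulo 256."""
    out = bytearray(len(raw))
    for x in range(len(raw)):
        # Bytes to the left of the first pixel are treated as zero.
        left = raw[x - bpp] if x >= bpp else 0
        out[x] = (raw[x] - left) & 0xFF
    return bytes(out)


if __name__ == "__main__":
    bpp = bytes_per_pixel(samples_per_pixel=3, bit_depth=8)
    print(sub_filter(bytes([1, 2, 3, 4, 5, 6]), bpp).hex())
```

   Masking with 0xFF gives the unsigned modulo-256 result, so a
   difference such as 3 - 5 is stored as 254.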

   To reverse the effect of the Sub filter after decompression,
   output the following value:

      Sub(x) + Raw(x-bpp)

   (computed mod 256), where Raw refers to the bytes already decoded.

6.3. Filter type 2: Up

   The Up filter is just like the Sub filter except that the pixel
   immediately above the current pixel, rather than just to its left,
   is used as the predictor.

   To compute the Up filter, apply the following formula to each byte
   of the scanline:

      Up(x) = Raw(x) - Prior(x)

   where x ranges from zero to the number of bytes representing the
   scanline minus one, Raw(x) refers to the raw data byte at that
   byte position in the scanline, and Prior(x) refers to the
   unfiltered bytes of the prior scanline.

   Note this is done for each byte, regardless of bit depth.
   Unsigned arithmetic modulo 256 is used, so that both the inputs
   and outputs fit into bytes. The sequence of Up values is
   transmitted as the filtered scanline.

   On the first scanline of an image (or of a pass of an interlaced
   image), assume Prior(x) = 0 for all x.

   To reverse the effect of the Up filter after decompression, output
   the following value:

      Up(x) + Prior(x)

   (computed mod 256), where Prior refers to the decoded bytes of the
   prior scanline.

6.4. Filter type 3: Average

   The Average filter uses the average of the two neighboring pixels
   (left and above) to predict the value of a pixel.

   To compute the Average filter, apply the following formula to each
   byte of the scanline:

      Average(x) = Raw(x) - floor((Raw(x-bpp)+Prior(x))/2)

   where x ranges from zero to the number of bytes representing the
   scanline minus one, Raw(x) refers to the raw data byte at that
   byte position in the scanline, Prior(x) refers to the unfiltered
   bytes of the prior scanline, and bpp is defined as for the Sub
   filter.

   Note this is done for each byte, regardless of bit depth.
   The sequence of Average values is transmitted as the filtered
   scanline.

   The subtraction of the predicted value from the raw byte must be
   done modulo 256, so that both the inputs and outputs fit into
   bytes. However, the sum Raw(x-bpp)+Prior(x) must be formed
   without overflow (using at least nine-bit arithmetic). floor()
   indicates that the result of the division is rounded to the next
   lower integer if fractional; in other words, it is an integer
   division or right shift operation.

   For all x < 0, assume Raw(x) = 0. On the first scanline of an
   image (or of a pass of an interlaced image), assume Prior(x) = 0
   for all x.

   To reverse the effect of the Average filter after decompression,
   output the following value:

      Average(x) + floor((Raw(x-bpp)+Prior(x))/2)

   where the result is computed mod 256, but the prediction is
   calculated in the same way as for encoding. Raw refers to the
   bytes already decoded, and Prior refers to the decoded bytes of
   the prior scanline.

6.5. Filter type 4: Paeth

   The Paeth filter computes a simple linear function of the three
   neighboring pixels (left, above, upper left), then chooses as
   predictor the neighboring pixel closest to the computed value.
   This technique is due to Alan W. Paeth [PAETH].

   To compute the Paeth filter, apply the following formula to each
   byte of the scanline:

      Paeth(x) = Raw(x) - PaethPredictor(Raw(x-bpp), Prior(x),
                                         Prior(x-bpp))

   where x ranges from zero to the number of bytes representing the
   scanline minus one, Raw(x) refers to the raw data byte at that
   byte position in the scanline, Prior(x) refers to the unfiltered
   bytes of the prior scanline, and bpp is defined as for the Sub
   filter.

   Note this is done for each byte, regardless of bit depth.
   Unsigned arithmetic modulo 256 is used, so that both the inputs
   and outputs fit into bytes.
   The sequence of Paeth values is transmitted as the filtered
   scanline.

   The PaethPredictor function is defined by the following
   pseudocode:

      function PaethPredictor (a, b, c)
      begin
           ; a = left, b = above, c = upper left
           p := a + b - c        ; initial estimate
           pa := abs(p - a)      ; distances to a, b, c
           pb := abs(p - b)
           pc := abs(p - c)
           ; return nearest of a,b,c,
           ; breaking ties in order a,b,c.
           if pa <= pb AND pa <= pc then return a
           else if pb <= pc then return b
           else return c
      end

   The calculations within the PaethPredictor function must be
   performed exactly, without overflow. Arithmetic modulo 256 is to
   be used only for the final step of subtracting the function result
   from the target byte value.

   Note that the order in which ties are broken is critical and must
   not be altered. The tie break order is: pixel to the left, pixel
   above, pixel to the upper left. (This order differs from that
   given in Paeth's article.)

   For all x < 0, assume Raw(x) = 0 and Prior(x) = 0. On the first
   scanline of an image (or of a pass of an interlaced image), assume
   Prior(x) = 0 for all x.

   To reverse the effect of the Paeth filter after decompression,
   output the following value:

      Paeth(x) + PaethPredictor(Raw(x-bpp), Prior(x), Prior(x-bpp))

   (computed mod 256), where Raw and Prior refer to bytes already
   decoded. Exactly the same PaethPredictor function is used by both
   encoder and decoder.

7. Chunk Ordering Rules

   To allow new chunk types to be added to PNG, it is necessary to
   establish rules about the ordering requirements for all chunk types.
   Otherwise a PNG editing program cannot know what to do when it
   encounters an unknown chunk.

   We define a "PNG editor" as a program that modifies a PNG file and
   wishes to preserve as much as possible of the ancillary information
   in the file.
   Two examples of PNG editors are a program that adds or modifies text
   chunks, and a program that adds a suggested palette to a truecolor
   PNG file. Ordinary image editors are not PNG editors in this sense,
   because they usually discard all unrecognized information while
   reading in an image. (Note: we strongly encourage programs handling
   PNG files to preserve ancillary information whenever possible.)

   As an example of possible problems, consider a hypothetical new
   ancillary chunk type that is safe-to-copy and is required to appear
   after PLTE if PLTE is present. If our program to add a suggested
   PLTE does not recognize this new chunk, it may insert PLTE in the
   wrong place, namely after the new chunk. We could prevent such
   problems by requiring PNG editors to discard all unknown chunks, but
   that is a very unattractive solution. Instead, PNG requires
   ancillary chunks not to have ordering restrictions like this.

   To prevent this type of problem while allowing for future extension,
   we put some constraints on both the behavior of PNG editors and the
   allowed ordering requirements for chunks.

7.1. Behavior of PNG editors

   The rules for PNG editors are:

      * When copying an unknown unsafe-to-copy ancillary chunk, a
        PNG editor must not move the chunk relative to any critical
        chunk. It can relocate the chunk freely relative to other
        ancillary chunks that occur between the same pair of
        critical chunks. (This is well defined since the editor
        must not add, delete, modify, or reorder critical chunks if
        it is preserving unknown unsafe-to-copy chunks.)
      * When copying an unknown safe-to-copy ancillary chunk, a PNG
        editor must not move the chunk from before IDAT to after
        IDAT or vice versa. (This is well defined because IDAT is
        always present.) Any other reordering is permitted.
      * When copying a known ancillary chunk type, an editor need
        only honor the specific chunk ordering rules that exist for
        that chunk type. However, it can always choose to apply the
        above general rules instead.
      * PNG editors must give up on encountering an unknown critical
        chunk type, because there is no way to be certain that a
        valid file will result from modifying a file containing such
        a chunk. (Note that simply discarding the chunk is not good
        enough, because it might have unknown implications for the
        interpretation of other chunks.)

   These rules are expressed in terms of copying chunks from an input
   file to an output file, but they apply in the obvious way if a PNG
   file is modified in place.

   See also Chunk naming conventions (Section 3.3).

7.2. Ordering of ancillary chunks

   The ordering rules for an ancillary chunk type cannot be any
   stricter than this:

      * Unsafe-to-copy chunks can have ordering requirements
        relative to critical chunks.
      * Safe-to-copy chunks can have ordering requirements relative
        to IDAT.

   The actual ordering rules for any particular ancillary chunk type
   may be weaker. See for example the ordering rules for the
   standard ancillary chunk types (Summary of Standard Chunks,
   Section 4.3).

   Decoders must not assume more about the positioning of any
   ancillary chunk than is specified by the chunk ordering rules. In
   particular, it is never valid to assume that a specific ancillary
   chunk type occurs with any particular positioning relative to
   other ancillary chunks. (For example, it is unsafe to assume that
   your private ancillary chunk occurs immediately before IEND. Even
   if your application always writes it there, a PNG editor might
   have inserted some other ancillary chunk after it. But you can
   safely assume that your chunk will remain somewhere between IDAT
   and IEND.)

7.3. Ordering of critical chunks

   Critical chunks can have arbitrary ordering requirements, because
   PNG editors are required to give up if they encounter unknown
   critical chunks. For example, IHDR has the special ordering rule
   that it must always appear first. A PNG editor, or indeed any
   PNG-writing program, must know and follow the ordering rules for
   any critical chunk type that it can emit.

8. Miscellaneous Topics

8.1. File name extension

   On systems where file names customarily include an extension
   signifying file type, the extension ".png" is recommended for PNG
   files. Lower case ".png" is preferred if file names are
   case-sensitive.

8.2. Internet media type

   The PNG authors intend to register "image/png" as the Internet
   Media Type for PNG [RFC-1521, RFC-1590]. At the date of this
   document, the media type registration process had not been
   completed. It is recommended that implementations also recognize
   the interim media type "image/x-png".

8.3. Macintosh file layout

   In the Apple Macintosh system, the following conventions are
   recommended:

      * The four-byte file type code for PNG files is "PNGf". (This
        code has been registered with Apple for PNG files.) The
        creator code will vary depending on the creating
        application.
      * The contents of the data fork shall be a PNG file exactly as
        described in the rest of this specification.
      * The contents of the resource fork are unspecified. It may
        be empty or may contain application-dependent resources.
      * When transferring a Macintosh PNG file to a non-Macintosh
        system, only the data fork should be transferred.

8.4. Multiple-image extension

   PNG itself is strictly a single-image format. However, it may be
   necessary to store multiple images within one file; for example,
   this is needed to convert some GIF files.
   In the future, a multiple-image format based on PNG may be defined.
   Such a format will be considered a separate file format and will
   have a different signature. PNG-supporting applications may or may
   not choose to support the multiple-image format.

   See Rationale: Why not these features? (Section 12.3).

8.5. Security considerations

   A PNG file or datastream is composed of a collection of explicitly
   typed "chunks". Chunks whose contents are defined by the
   specification could actually contain anything, including malicious
   code. But there is no known risk that such malicious code could
   be executed on the recipient's computer as a result of decoding
   the PNG image.

   The possible security risks associated with future chunk types
   cannot be specified at this time. Security issues will be
   considered when evaluating chunks proposed for registration as
   public chunks. There is no additional security risk associated
   with unknown or unimplemented chunk types, because such chunks
   will be ignored, or at most be copied into another PNG file.

   The tEXt and zTXt chunks contain data that is meant to be
   displayed as plain text. It is possible that if the decoder
   displays such text without filtering out control characters,
   especially the ESC (escape) character, certain systems or
   terminals could behave in undesirable and insecure ways. We
   recommend that decoders filter out control characters to avoid
   this risk; see Recommendations for Decoders: Text chunk processing
   (Section 10.11).

   Because every chunk's length is available at its beginning, and
   because every chunk has a CRC trailer, there is a very robust
   defense against corrupted data and against fraudulent chunks that
   attempt to overflow the decoder's buffers. Also, the PNG
   signature bytes provide early detection of common file
   transmission errors.
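
   As a rough illustration of the length and CRC checks mentioned
   above, the Python sketch below validates one chunk before handing
   its data to the rest of a decoder. It assumes the chunk layout
   defined earlier in this specification (four-byte big-endian
   length, four-byte type, data, four-byte CRC computed over the type
   and data) and assumes that zlib.crc32 from the Python standard
   library computes the same CRC-32 that PNG uses. The names
   read_chunk and MAX_CHUNK_LENGTH are hypothetical.

```python
import struct
import zlib

MAX_CHUNK_LENGTH = 2**31 - 1  # largest chunk length the format allows


def read_chunk(buf: bytes, offset: int):
    """Return (chunk_type, data, next_offset) or raise ValueError."""
    if offset + 8 > len(buf):
        raise ValueError("truncated chunk header")
    length, chunk_type = struct.unpack_from(">I4s", buf, offset)
    if length > MAX_CHUNK_LENGTH:
        raise ValueError("chunk length exceeds (2^31)-1")
    end = offset + 8 + length
    # Refuse to read past the data actually available.
    if end + 4 > len(buf):
        raise ValueError("chunk length runs past the available data")
    data = buf[offset + 8:end]
    (stored_crc,) = struct.unpack_from(">I", buf, end)
    # The CRC covers the chunk type and data, but not the length field.
    if zlib.crc32(chunk_type + data) != stored_crc:
        raise ValueError("CRC mismatch in chunk")
    return chunk_type, data, end + 4
```

   Checking the declared length against the data actually available
   before slicing is the step that keeps an oversized or fraudulent
   length from driving the decoder past its buffers.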

   A decoder that fails to check CRCs could be subject to data
   corruption. The only likely consequence of such corruption is
   incorrectly displayed pixels within the image. Worse things might
   happen if the CRC of the IHDR chunk is not checked and the width
   or height fields are corrupted. See Recommendations for Decoders:
   Error checking (Section 10.1).

   A poorly written decoder might be subject to buffer overflow,
   because chunks can be extremely large, up to (2^31)-1 bytes long.
   But properly written decoders will handle large chunks without
   difficulty.

9. Recommendations for Encoders

   This chapter gives some recommendations for encoder behavior. The
   only absolute requirement on a PNG encoder is that it produce files
   that conform to the format specified in the preceding chapters.
   However, best results will usually be achieved by following these
   recommendations.

9.1. Bit depth scaling

   When encoding input samples that have a bit depth that cannot be
   directly represented in PNG, the encoder must scale the samples up
   to a bit depth that is allowed by PNG. The most accurate scaling
   method is the linear equation

      output = ROUND(input * MAXOUTSAMPLE / MAXINSAMPLE)

   where the input samples range from 0 to MAXINSAMPLE and the
   outputs range from 0 to MAXOUTSAMPLE (which is (2^bitdepth)-1).

   A close approximation to the linear scaling method can be achieved
   by "left bit replication", which is shifting the valid bits to
   begin in the most significant bit and repeating the most
   significant bits into the open bits. This method is often faster
   to compute than linear scaling. As an example, assume that 5-bit
   samples are being scaled up to 8 bits.
   If the source sample value is 27 (in the range from 0-31), then the
   original bits are:

      4 3 2 1 0
      ---------
      1 1 0 1 1

   Left bit replication gives a value of 222:

      7 6 5 4 3  2 1 0
      ----------------
      1 1 0 1 1  1 1 0
      |=======|  |===|
          |      Leftmost Bits Repeated to Fill Open Bits
          |
      Original Bits

   which matches the value computed by the linear equation. Left bit
   replication usually gives the same value as linear scaling, and is
   never off by more than one.

   A distinctly less accurate approximation is obtained by simply
   left-shifting the input value and filling the low order bits with
   zeroes. This scheme cannot reproduce white exactly, since it does
   not generate an all-ones maximum value; the net effect is to
   darken the image slightly. This method is not recommended in
   general, but it does have the effect of improving compression,
   particularly when dealing with greater-than-eight-bit sample
   depths. Since the relative error introduced by zero-fill scaling
   is small at high bit depths, some encoders may choose to use it.
   Zero-fill should not be used for alpha channel data, however,
   since many decoders will special-case alpha values of all zeroes
   and all ones. It is important to represent both those values
   exactly in the scaled data.

   When the encoder writes an sBIT chunk, it is required to do the
   scaling in such a way that the high-order bits of the stored
   samples match the original data. That is, if the sBIT chunk
   specifies a bit depth of S, the high-order S bits of the stored
   data must agree with the original S-bit data values. This allows
   decoders to recover the original data by shifting right. The
   added low-order bits are not constrained. Note that all the above
   scaling methods meet this restriction.
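
   The linear equation and left bit replication described above can
   be compared with a short Python sketch. The function names
   scale_linear and scale_replicate are hypothetical, ROUND is
   assumed to mean rounding to the nearest integer, and both
   functions assume that the output depth is at least the input
   depth.

```python
def scale_linear(value: int, in_bits: int, out_bits: int) -> int:
    """output = ROUND(input * MAXOUTSAMPLE / MAXINSAMPLE), in integers."""
    max_in = (1 << in_bits) - 1
    max_out = (1 << out_bits) - 1
    # floor(v * max_out / max_in + 1/2) without floating point
    return (2 * value * max_out + max_in) // (2 * max_in)


def scale_replicate(value: int, in_bits: int, out_bits: int) -> int:
    """Left bit replication: repeat the source bits into the open bits."""
    result = 0
    shift = out_bits - in_bits
    while shift > 0:
        result |= value << shift
        shift -= in_bits
    # shift is now zero or negative: add the last (possibly partial) copy,
    # which keeps only the most significant bits of the source value.
    result |= value >> -shift
    return result


if __name__ == "__main__":
    for v in (0, 7, 27, 31):
        print(v, scale_linear(v, 5, 8), scale_replicate(v, 5, 8))
```

   For the worked example, both functions map the 5-bit value 27 to
   222, and both map the all-ones input to the all-ones output, which
   zero-fill shifting cannot do.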

   When scaling up source data, it is recommended that the low-order
   bits be filled consistently for all samples; that is, the same
   source value should generate the same sample value at any pixel
   position. This improves compression by reducing the number of
   distinct sample values. However, this is not a requirement, and
   some encoders may choose not to follow it. For example, an
   encoder might instead dither the low-order bits, improving
   displayed image quality at the price of increasing file size.

   In some applications the original source data may have a range
   that is not a power of 2. The linear scaling equation still works
   for this case, although the shifting methods do not. It is
   recommended that an sBIT chunk not be written for such images,
   since sBIT suggests that the original data range was exactly
   0..2^S-1.

9.2. Encoder gamma handling

   See Gamma Tutorial (Chapter 13) if you aren't already familiar
   with gamma issues.

   Proper handling of gamma encoding and the gAMA chunk in an encoder
   depends on the prior history of the sample values and on whether
   these values have already been quantized to integers.

   If the encoder has access to sample intensity values in
   floating-point or high-precision integer form (perhaps from a
   computer image renderer), then it is recommended that the encoder
   perform its own gamma encoding before quantizing the data to
   integer values for storage in the file. Applying gamma encoding at
   this stage results in images with fewer banding artifacts at a
   given sample bit depth, or allows smaller samples while retaining
   the same visual quality.

   A linear intensity level, expressed as a floating-point value in
   the range 0 to 1, can be converted to a gamma-encoded sample value
   by

      sample = ROUND((intensity ^ encoder_gamma) * MAXSAMPLEVAL)

   The file_gamma value to be written in the PNG gAMA chunk is the
   same as encoder_gamma in this equation, since we are assuming the
   initial intensity value is linear (in effect, camera_gamma is
   1.0).

   If the image is being written to a file only, the encoder_gamma
   value can be selected somewhat arbitrarily. Values of 0.45 or 0.5
   are generally good choices because they are common in video
   systems, and so most PNG decoders should do a good job displaying
   such images.

   Some image renderers may simultaneously write the image to a PNG
   file and display it on-screen. The displayed pixels should be
   gamma corrected for the display system and viewing conditions in
   use, so that the user sees a proper representation of the intended
   scene. An appropriate gamma correction value is

      screen_gc = viewing_gamma / display_gamma

   If the renderer wants to write the same gamma-corrected sample
   values to the PNG file, avoiding a separate gamma-encoding step
   for file output, then this screen_gc value should be written in
   the gAMA chunk. This will allow a PNG decoder to reproduce what
   the file's originator saw on screen during rendering (provided the
   decoder properly supports arbitrary values in a gAMA chunk).

   However, it is equally reasonable for a renderer to apply gamma
   correction for screen display using a gamma appropriate to the
   viewing conditions, and to separately gamma-encode the sample
   values for file storage using a standard value of gamma such as
   0.5. In fact, this is preferable, since some PNG decoders may not
   accurately display images with unusual gAMA values.
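
   The encoding equation above can be sketched in Python as follows.
   This is only an illustration under stated assumptions: the
   function names gamma_encode and gama_chunk_value are hypothetical,
   ROUND is taken to mean rounding to the nearest integer, and the
   integer stored in the gAMA chunk is assumed to be the gamma value
   times 100000, as in the gAMA chunk definition earlier in this
   specification.

```python
def gamma_encode(intensity: float, encoder_gamma: float, bit_depth: int = 8) -> int:
    """sample = ROUND((intensity ^ encoder_gamma) * MAXSAMPLEVAL)."""
    if not 0.0 <= intensity <= 1.0:
        raise ValueError("intensity must lie in the range 0 to 1")
    max_sample = (1 << bit_depth) - 1
    return int((intensity ** encoder_gamma) * max_sample + 0.5)


def gama_chunk_value(file_gamma: float) -> int:
    """Integer to store in gAMA (assumed to be gamma times 100000)."""
    return int(file_gamma * 100000 + 0.5)


if __name__ == "__main__":
    encoder_gamma = 0.45
    ramp = [gamma_encode(i / 10, encoder_gamma) for i in range(11)]
    print(ramp, gama_chunk_value(encoder_gamma))
```

   With encoder_gamma below 1.0 the dark end of the intensity range
   is spread over more sample codes than it would be with linear
   quantization, which is where the reduction in banding comes from.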

   Computer graphics renderers often do not perform gamma encoding,
   instead making sample values directly proportional to scene light
   intensity. If the PNG encoder receives sample values that have
   already been quantized into linear-light integer values, there is
   no point in doing gamma encoding on them; that would just result
   in further loss of information. The encoder should just write the
   sample values to the PNG file. This "linear" sample encoding is
   equivalent to gamma encoding with a gamma of 1.0, so graphics
   programs that produce linear samples should always emit a gAMA
   chunk specifying a gamma of 1.0.

   When the sample values come directly from a piece of hardware, the
   correct gAMA value is determined by the gamma characteristic of
   the hardware. In the case of video digitizers ("frame grabbers"),
   gAMA should be 0.45 or 0.5 for NTSC (possibly less for PAL or
   SECAM) since video camera transfer functions are standardized.
   Image scanners are less predictable. Their output samples may be
   linear (gamma 1.0) since CCD sensors themselves are linear, or the
   scanner hardware may have already applied gamma correction
   designed to compensate for dot gain in subsequent printing (gamma
   of about 0.57), or the scanner may have corrected the samples for
   display on a CRT (gamma of 0.4-0.5). You will need to refer to
   the scanner's manual, or even scan a calibrated gray wedge, to
   determine what a particular scanner does.

   File format converters generally should not attempt to convert
   supplied images to a different gamma. Store the data in the PNG
   file without conversion, and record the source gamma if it is
   known. Gamma alteration at file conversion time causes
   requantization of the set of intensity levels that are
   represented, introducing further roundoff error with little
   benefit.
It's almost always better to just copy the sample values
intact from the input to the output file.

In some cases, the supplied image may be in an image format (e.g.,
TIFF) that can describe the gamma characteristic of the image. In
such cases, a file format converter is strongly encouraged to
write a PNG gAMA chunk that corresponds to the known gamma of the
source image. Note that some file formats specify the gamma of
the display system, not the camera. If the input file's gamma
value is greater than 1.0, it is almost certainly a display system
gamma, and you should use its reciprocal for the PNG gAMA.

If the encoder or file format converter does not know how an image
was originally created, but does know that the image has been
displayed satisfactorily on a display with gamma display_gamma
under lighting conditions where a particular viewing_gamma is
appropriate, then the image can be marked as having the
file_gamma:

   file_gamma = viewing_gamma / display_gamma

This will allow viewers of the PNG file to see the same image that
the person running the file format converter saw. Although this
may not be precisely the correct value of the image gamma, it's
better to write a gAMA chunk with an approximately right value
than to omit the chunk and force PNG decoders to guess at an
appropriate gamma.

On the other hand, if the image file is being converted as part of
a "bulk" conversion, with no one looking at each image, then it is
better to omit the gAMA chunk entirely. If the image gamma must
be guessed at, leave it to the decoder to do the guessing.

Gamma does not apply to alpha samples; alpha is always represented
linearly.

See also Recommendations for Decoders: Decoder gamma handling
(Section 10.5).

9.3.
Encoder color handling

See Color Tutorial (Chapter 14) if you aren't already familiar
with color issues.

If it is possible for the encoder to determine the chromaticities
of the source display primaries, or to make a strong guess based
on the origin of the image or the hardware running it, then the
encoder is strongly encouraged to output the cHRM chunk. If it
does so, the gAMA chunk should also be written; decoders can do
little with cHRM if gAMA is missing.

Video created with recent video equipment probably uses the CCIR
709 primaries and D65 white point [ITU-BT709], which are:

            R        G        B        White
   x      0.640    0.300    0.150    0.3127
   y      0.330    0.600    0.060    0.3290

An older but still very popular video standard is SMPTE-C
[SMPTE-170M]:

            R        G        B        White
   x      0.630    0.310    0.155    0.3127
   y      0.340    0.595    0.070    0.3290

The original NTSC color primaries have not been used in decades.
Although you may still find the NTSC numbers listed in standards
documents, you won't find any images that actually use them.

Scanners that produce PNG files as output should insert the filter
chromaticities into a cHRM chunk and the camera_gamma into a gAMA
chunk.

In the case of hand-drawn or digitally edited images, you have to
determine what monitor they were viewed on when being produced.
Many image editing programs allow you to specify what type of
monitor you are using. This is often because they are working in
some device-independent space internally. Such programs have
enough information to write valid cHRM and gAMA chunks, and should
do so automatically.

If the encoder is compiled as a portion of a computer image
renderer that performs full-spectral rendering, the monitor values
that were used to convert from the internal device-independent
color space to RGB should be written into the cHRM chunk.
Any colors that are outside the gamut of the chosen RGB device should
be clipped or otherwise constrained to be within the gamut; PNG
does not store out of gamut colors.

If the computer image renderer performs calculations directly in
device-dependent RGB space, a cHRM chunk should not be written
unless the scene description and rendering parameters have been
adjusted to look good on a particular monitor. In that case, the
data for that monitor (if known) should be used to construct a
cHRM chunk.

There are often cases where an image's exact origins are unknown,
particularly if it began life in some other format. A few image
formats store calibration information, which can be used to fill
in the cHRM chunk. For example, all PhotoCD images use the CCIR
709 primaries and D65 whitepoint, so these values can be written
into the cHRM chunk when converting a PhotoCD file. PhotoCD also
uses the SMPTE-170M transfer function, which is closely
approximated by a gAMA of 0.5. (PhotoCD can store colors outside
the RGB gamut, so the image data will require gamut mapping before
writing to PNG format.) TIFF 6.0 files can optionally store
calibration information, which if present should be used to
construct the cHRM chunk. GIF and most other formats do not store
any calibration information.

It is not recommended that file format converters attempt to
convert supplied images to a different RGB color space. Store the
data in the PNG file without conversion, and record the source
primary chromaticities if they are known. Color space
transformation at file conversion time is a bad idea because of
gamut mismatches and rounding errors. As with gamma conversions,
it's better to store the data losslessly and incur at most one
conversion when the image is finally displayed.

See also Recommendations for Decoders: Decoder color handling
(Section 10.6).

9.4. Alpha channel creation

The alpha channel can be regarded either as a mask that
temporarily hides transparent parts of the image, or as a means
for constructing a non-rectangular image. In the first case, the
color values of fully transparent pixels should be preserved for
future use. In the second case, the transparent pixels carry no
useful data and are simply there to fill out the rectangular image
area required by PNG. In this case, fully transparent pixels
should all be assigned the same color value for best compression.

Encoders should keep in mind the possibility that a decoder will
ignore transparency control. Hence, the colors assigned to
transparent pixels should be reasonable background colors whenever
feasible.

For applications that do not require a full alpha channel, or
cannot afford the price in compression efficiency, the tRNS
transparency chunk is also available.

If the image has a known background color, this color should be
written in the bKGD chunk. Even decoders that ignore transparency
may use the bKGD color to fill unused screen area.

If the original image has premultiplied (also called "associated")
alpha data, convert it to PNG's non-premultiplied format by
dividing each sample value by the corresponding alpha value, then
multiplying by the maximum value for the image bit depth, and
rounding to the nearest integer. In valid premultiplied data, the
sample values never exceed their corresponding alpha values, so
the result of the division should always be in the range 0 to 1.
If the alpha value is zero, output black (zeroes).

9.5. Suggested palettes

A PLTE chunk can appear in truecolor PNG files.
In such files,
the chunk is not an essential part of the image data, but simply
represents a suggested palette that viewers may use to present the
image on indexed-color display hardware. A suggested palette is
of no interest to viewers running on truecolor hardware.

If an encoder chooses to provide a suggested palette, it is
recommended that a hIST chunk also be written to indicate the
relative importance of the palette entries. The histogram values
are most easily computed as "nearest neighbor" counts, that is,
the approximate usage of each palette entry if no dithering is
applied. (These counts will often be available for free as a
consequence of developing the suggested palette.)

For images of color type 2 (truecolor without alpha channel), it
is recommended that the palette and histogram be computed with
reference to the RGB data only, ignoring any transparent-color
specification. If the file uses transparency (has a tRNS chunk),
viewers can easily adapt the resulting palette for use with their
intended background color. They need only replace the palette
entry closest to the tRNS color with their background color (which
may or may not match the file's bKGD color, if any).

For images of color type 6 (truecolor with alpha channel), it is
recommended that a bKGD chunk appear and that the palette and
histogram be computed with reference to the image as it would
appear after compositing against the specified background color.
This definition is necessary to ensure that useful palette entries
are generated for pixels having fractional alpha values. The
resulting palette will probably only be useful to viewers that
present the image against the same background color. It is
recommended that PNG editors delete or recompute the palette if
they alter or remove the bKGD chunk in an image of color type 6.
If PLTE appears without bKGD in an image of color type 6, the
circumstances under which the palette was computed are
unspecified.

9.6. Filter selection

For images of color type 3 (indexed color), filter type 0 (none)
is usually the most effective.

Filter type 0 is also recommended for images of bit depths less
than 8. For low-bit-depth grayscale images, it may be a net win
to expand the image to 8-bit representation and apply filtering,
but this is rare.

For truecolor and grayscale images, any of the five filters may
prove the most effective. If an encoder uses a fixed filter, the
Paeth filter is most likely to be the best.

For best compression of truecolor and grayscale images, we
recommend an adaptive filtering approach in which a filter is
chosen for each scanline. The following simple heuristic has
performed well in early tests: compute the output scanline using
all five filters, and select the filter that gives the smallest
sum of absolute values of outputs. (Consider the output bytes as
signed differences for this test.) This method usually
outperforms any single fixed filter choice. However, it is likely
that much better heuristics will be found as more experience is
gained with PNG.

Filtering according to these recommendations is effective on
interlaced as well as noninterlaced images.

9.7. Text chunk processing

A nonempty keyword must be provided for each text chunk. The
generic keyword "Comment" can be used if no better description of
the text is available. If a user-supplied keyword is used, be
sure to check that it meets the restrictions on keywords.

PNG text strings are expected to use the Latin-1 character set.
Encoders should avoid storing characters that are not defined in
Latin-1, and should provide character code remapping if the local
system's character set is not Latin-1.

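A check on user-supplied keywords might look like the sketch below. The restrictions it enforces are an assumption here, since they are defined with the tEXt chunk rather than in this section: 1 to 79 bytes, only printable Latin-1 codes (32-126 and 161-255), no leading or trailing space, and no consecutive spaces. Verify them against the chunk definition before relying on this. The function name keyword_ok is hypothetical.

```c
#include <stddef.h>

/*
 * Return 1 if kw[0..len-1] looks like an acceptable text keyword,
 * 0 otherwise.  The rules below are assumed, not taken from this
 * section; see the tEXt chunk definition.
 */
int keyword_ok(const unsigned char *kw, size_t len)
{
    size_t i;

    if (len < 1 || len > 79)
        return 0;                       /* empty or too long */
    if (kw[0] == 32 || kw[len - 1] == 32)
        return 0;                       /* leading/trailing space */

    for (i = 0; i < len; i++) {
        unsigned char c = kw[i];
        if (!((c >= 32 && c <= 126) || c >= 161))
            return 0;                   /* not printable Latin-1 */
        if (c == 32 && i + 1 < len && kw[i + 1] == 32)
            return 0;                   /* consecutive spaces */
    }
    return 1;
}
```

An encoder could fall back to the generic keyword "Comment" when such a check rejects the user's input.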
Encoders should discourage the creation of single lines of text
longer than 79 characters, in order to facilitate easy reading.

It is recommended that text items less than 1K (1024 bytes) in
size be output using uncompressed tEXt chunks. In particular, it
is recommended that the basic title and author keywords always be
output using uncompressed tEXt chunks. Lengthy disclaimers, on the
other hand, are ideal candidates for zTXt.

Placing large tEXt and zTXt chunks after the image data (after
IDAT) can speed up image display in some situations, since the
decoder won't have to read over the text to get to the image data.
But it is recommended that small text chunks, such as the image
title, appear before IDAT.

9.8. Use of private chunks

Applications can use PNG private chunks to carry information that
need not be understood by other applications. Such chunks must be
given names with lowercase second letters, to ensure that they can
never conflict with any future public chunk definition. Note,
however, that there is no guarantee that some other application
will not use the same private chunk name. If you use a private
chunk type, it is prudent to store additional identifying
information at the beginning of the chunk data.

Use an ancillary chunk type (lowercase first letter), not a
critical chunk type, for all private chunks that store information
that is not absolutely essential to view the image. Creation of
private critical chunks is discouraged because they render PNG
files unportable. Such chunks should not be used in publicly
available software or files. If private critical chunks are
essential for your application, it is recommended that one appear
near the start of the file, so that a standard decoder need not
read very far before discovering that it cannot handle the file.

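The letter-case conventions above can be tested with a single bit, since in ASCII a lowercase letter differs from its uppercase form only in bit 5 (value 32). The following is a minimal sketch; the function names are hypothetical.

```c
/* Lowercase first letter: ancillary (not critical) chunk. */
int chunk_is_ancillary(const char type[4])
{
    return (type[0] & 0x20) != 0;
}

/* Lowercase second letter: private (not public) chunk. */
int chunk_is_private(const char type[4])
{
    return (type[1] & 0x20) != 0;
}

/*
 * A private chunk name of the recommended kind is both
 * ancillary and private, e.g. "prVt" (an invented name).
 */
int chunk_name_ok_for_private_use(const char type[4])
{
    return chunk_is_ancillary(type) && chunk_is_private(type);
}
```

An application could run such a check on any chunk name it invents before writing files that leave its own environment.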
If you want others outside your organization to understand a chunk
type that you invent, contact the maintainers of the PNG
specification to submit a proposed chunk name and definition for
addition to the list of special-purpose public chunks (see
Additional Chunk Types, Section 4.4). Note that a proposed public
chunk name (with uppercase second letter) must not be used in
publicly available software or files until registration has been
approved.

If an ancillary chunk contains textual information that might be
of interest to a human user, you should not create a special chunk
type for it. Instead use a tEXt chunk and define a suitable
keyword. That way, the information will be available to users not
using your software.

Keywords in tEXt chunks should be reasonably self-explanatory,
since the idea is to let other users figure out what the chunk
contains. If of general usefulness, new keywords can be
registered with the maintainers of the PNG specification. But it
is permissible to use keywords without registering them first.

9.9. Private type and method codes

This specification defines the meaning of only some of the
possible values of some fields. For example, only compression
method 0 and filter types 0 through 4 are defined. Use numbers
greater than 127 when inventing experimental or private
definitions of values for any of these fields. Numbers below 128
are reserved for possible future public extensions of this
specification. Note that use of private type codes may render a
file unreadable by standard decoders. Such codes are strongly
discouraged except for experimental purposes, and should not
appear in publicly available software or files.

10. Recommendations for Decoders

This chapter gives some recommendations for decoder behavior.
The only absolute requirement on a PNG decoder is that it successfully
read any file conforming to the format specified in the preceding
chapters. However, best results will usually be achieved by
following these recommendations.

10.1. Error checking

To ensure early detection of common file-transfer problems,
decoders should verify that all eight bytes of the PNG file
signature are correct. (See Rationale: PNG file signature,
Section 12.11.) A decoder can have additional confidence in the
file's integrity if the next eight bytes are an IHDR chunk header
with the correct chunk length.

Unknown chunk types must be handled as described in Chunk naming
conventions (Section 3.3). An unknown chunk type is not to be
treated as an error unless it is a critical chunk.

It is strongly recommended that decoders verify the CRC on each
chunk.

In some situations it is desirable to check chunk headers (length
and type code) before reading the chunk data and CRC. The chunk
type can be checked for plausibility by seeing whether all four
bytes are ASCII letters (codes 65-90 and 97-122); note that this
need only be done for unrecognized type codes. If the total file
size is known (from file system information, HTTP protocol, etc.),
the chunk length can be checked for plausibility as well.

If CRCs are not checked, dropped/added data bytes or an erroneous
chunk length can cause the decoder to get out of step and
misinterpret subsequent data as a chunk header. Verifying that
the chunk type contains letters is an inexpensive way of providing
early error detection in this situation.

For known-length chunks such as IHDR, decoders should treat an
unexpected chunk length as an error. Future extensions to this
specification will not add new fields to existing chunks; instead,
new chunk types will be added to carry new information.

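The inexpensive checks described in this section can be sketched as below. The function names are hypothetical. The signature bytes are the eight decimal values 137 80 78 71 13 10 26 10, and the length check assumes each chunk carries a 4-byte CRC after its data.

```c
#include <string.h>

static const unsigned char expected_signature[8] =
    {137, 80, 78, 71, 13, 10, 26, 10};

/* Return 1 if the first eight bytes match the PNG signature. */
int signature_ok(const unsigned char *buf)
{
    return memcmp(buf, expected_signature, 8) == 0;
}

/* Return 1 if all four type bytes are ASCII letters. */
int chunk_type_plausible(const unsigned char *type)
{
    int i;

    for (i = 0; i < 4; i++) {
        unsigned char c = type[i];
        if (!((c >= 65 && c <= 90) || (c >= 97 && c <= 122)))
            return 0;
    }
    return 1;
}

/*
 * Return 1 if a chunk with the given data length, followed by
 * its 4-byte CRC, fits in the bytes remaining after the 8-byte
 * chunk header.  Usable only when the total size is known.
 */
int chunk_length_plausible(unsigned long length,
                           unsigned long bytes_remaining)
{
    return bytes_remaining >= 4 && length <= bytes_remaining - 4;
}
```

These tests supplement CRC verification and do not replace it; they are most useful for failing early on a stream that has lost synchronization.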
Unexpected values in fields of known chunks (for example, an
unexpected compression method in the IHDR chunk) must be checked
for and treated as errors. However, it is recommended that
unexpected field values be treated as fatal errors only in
critical chunks. An unexpected value in an ancillary chunk can be
handled by ignoring the whole chunk as though it were an unknown
chunk type. (This recommendation assumes that the chunk's CRC has
been verified. In decoders that do not check CRCs, it is safer to
treat any unexpected value as indicating a corrupted file.)

10.2. Pixel dimensions

Non-square pixels can be represented (see the pHYs chunk), but
viewers are not required to account for them; a viewer can present
any PNG file as though its pixels are square.

Conversely, viewers running on display hardware with non-square
pixels are strongly encouraged to rescale images for proper
display.

10.3. Truecolor image handling

To achieve PNG's goal of universal interchangeability, decoders
are required to accept all types of PNG image: indexed-color,
truecolor, and grayscale. Viewers running on indexed-color
display hardware need to be able to reduce truecolor images to
indexed format for viewing. This process is usually called "color
quantization".

A simple, fast way of doing this is to reduce the image to a fixed
palette. Palettes with uniform color spacing ("color cubes") are
usually used to minimize the per-pixel computation. For
photograph-like images, dithering is recommended to avoid ugly
contours in what should be smooth gradients; however, dithering
introduces graininess that can be objectionable.

The quality of rendering can be improved substantially by using a
palette chosen specifically for the image, since a color cube
usually has numerous entries that are unused in any particular
image.
This approach requires more work, first in choosing the
palette, and second in mapping individual pixels to the closest
available color. PNG allows the encoder to supply a suggested
palette in a PLTE chunk, but not all encoders will do so, and the
suggested palette may be unsuitable in any case (it may have too
many or too few colors). High-quality viewers will therefore need
to have a palette selection routine at hand. A large lookup table
is usually the most feasible way of mapping individual pixels to
palette entries with adequate speed.

Numerous implementations of color quantization are available. The
PNG reference implementation, libpng, includes code for the
purpose.

10.4. Bit depth rescaling

Decoders may wish to scale PNG data to a lesser bit depth (sample
precision) for display. For example, 16-bit data will need to be
reduced to 8-bit depth for use on most present-day display
hardware. Reduction of 8-bit data to 5-bit depth is also common.

The most accurate scaling is achieved by the linear equation

   output = ROUND(input * MAXOUTSAMPLE / MAXINSAMPLE)

where

   MAXINSAMPLE = (2^bitdepth)-1
   MAXOUTSAMPLE = (2^desired_bitdepth)-1

A slightly less accurate conversion is achieved by simply shifting
right by bitdepth-desired_bitdepth places. For example, to reduce
16-bit samples to 8-bit, one need only discard the low-order byte.
In many situations the shift method is sufficiently accurate for
display purposes, and it is certainly much faster. (But if gamma
correction is being done, sample rescaling can be merged into the
gamma correction lookup table, as is illustrated in Decoder gamma
handling, Section 10.5.)

When an sBIT chunk is present, the original pre-PNG data can be
recovered by shifting right to the bit depth specified by sBIT.
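The two reduction methods of this section, linear scaling and right shift, can be compared with the following sketch. The function names are hypothetical, and the code assumes out_depth is no greater than in_depth and that both are at most 16. Rounding is done in integer arithmetic by adding half the divisor before dividing.

```c
/* output = ROUND(input * MAXOUTSAMPLE / MAXINSAMPLE) */
unsigned rescale_linear(unsigned input, unsigned in_depth,
                        unsigned out_depth)
{
    unsigned long max_in  = (1UL << in_depth) - 1;
    unsigned long max_out = (1UL << out_depth) - 1;

    return (unsigned) ((input * max_out + max_in / 2) / max_in);
}

/* Discard the low-order (in_depth - out_depth) bits. */
unsigned rescale_shift(unsigned input, unsigned in_depth,
                       unsigned out_depth)
{
    return input >> (in_depth - out_depth);
}
```

For example, reducing the 16-bit sample 255 to 8 bits gives 1 with the linear method but 0 with the shift, while both methods map 65535 to 255.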
Note that linear scaling will not necessarily reproduce the
original data, because the encoder is not required to have used
linear scaling to scale the data up. However, the encoder is
required to have used a method that preserves the high-order bits,
so shifting always works. This is the only case in which shifting
might be said to be more accurate than linear scaling.

When comparing pixel values to tRNS chunk values to detect
transparent pixels, it is necessary to do the comparison exactly.
Therefore, transparent pixel detection must be done before
reducing sample precision.

10.5. Decoder gamma handling

See Gamma Tutorial (Chapter 13) if you aren't already familiar
with gamma issues.

To produce correct tone reproduction, a good image display program
must take into account the gammas of the image file and the
display device, as well as the viewing_gamma appropriate to the
lighting conditions near the display. This can be done by
calculating

   gbright = sampleval / MAXSAMPLEVAL
   bright = gbright ^ (1.0 / file_gamma)
   vbright = bright ^ viewing_gamma
   gcvideo = vbright ^ (1.0 / display_gamma)
   fbval = ROUND(gcvideo * MAXFBVAL)

where MAXSAMPLEVAL is the maximum sample value in the file (255
for 8-bit, 65535 for 16-bit, etc.), MAXFBVAL is the maximum value
of a frame buffer sample (255 for 8-bit, 31 for 5-bit, etc.),
sampleval is the value of the sample in the PNG file, and fbval is
the value to write into the frame buffer. The first line converts
from integer samples into a normalized 0 to 1 floating point
value, the second undoes the gamma encoding of the image file to
produce a linear intensity value, the third adjusts for the
viewing conditions, the fourth corrects for the display system's
gamma value, and the fifth converts to an integer frame buffer
sample.
In practice, the second through fourth lines can be
merged into

   gcvideo = gbright^(viewing_gamma / (file_gamma*display_gamma))

so as to perform only one power calculation. For color images, the
entire calculation is performed separately for R, G, and B values.

It is not necessary to perform transcendental math for every
pixel. Instead, compute a lookup table that gives the correct
output value for every possible sample value. This requires only
256 calculations per image (for 8-bit accuracy), not one or three
calculations per pixel. For an indexed-color image, a one-time
correction of the palette is sufficient, unless the image uses
transparency and is being displayed against a nonuniform
background.

In some cases even the cost of computing a gamma lookup table may
be a concern. In these cases, viewers are encouraged to have
precomputed gamma correction tables for file_gamma values of 1.0
and 0.5 with some reasonable choice of viewing_gamma and
display_gamma, and to use the table closest to the gamma indicated
in the file. This will produce acceptable results for the majority
of real files.

When the incoming image has unknown gamma (no gAMA chunk), choose
a likely default file_gamma value, but allow the user to select a
new one if the result proves too dark or too light.

In practice, it is often difficult to determine what value of
display_gamma should be used. In systems with no built-in gamma
correction, the display_gamma is determined entirely by the CRT.
Assuming a CRT_gamma of 2.5 is recommended, unless you have
detailed calibration measurements of this particular CRT
available.

However, many modern frame buffers have lookup tables that are
used to perform gamma correction, and on these systems the
display_gamma value should be the gamma of the lookup table and
CRT combined.
You may not be able to find out what the lookup
table contains from within an image viewer application, so you may
have to ask the user what the system's gamma value is.
Unfortunately, different manufacturers use different ways of
specifying what should go into the lookup table, so interpretation
of the system gamma value is system-dependent. Gamma Tutorial
(Chapter 13) gives some examples.

The response of real displays is actually more complex than can be
described by a single number (display_gamma). If actual
measurements of the monitor's light output as a function of
voltage input are available, the fourth and fifth lines of the
computation above can be replaced by a lookup in these
measurements, to find the actual frame buffer value that most
nearly gives the desired brightness.

The value of viewing_gamma depends on lighting conditions; see
Gamma Tutorial (Chapter 13) for more detail. Ideally, a viewer
would allow the user to specify viewing_gamma, either directly
numerically, or via selecting from "bright surround", "dim
surround", and "dark surround" conditions. Viewers that don't
want to do this should just assume a value for viewing_gamma of
1.0, since most computer displays live in brightly-lit rooms.

When viewing images that are digitized from video, or that are
destined to become video frames, the user might want to set the
viewing_gamma to about 1.25 regardless of the actual level of room
lighting. This value of viewing_gamma is "built into" NTSC video
practice, and displaying an image with that viewing_gamma allows
the user to see what a TV set would show under the current room
lighting conditions. (This is not the same thing as trying to
obtain the most accurate rendition of the content of the scene,
which would require adjusting viewing_gamma to correspond to the
room lighting level.)
This is another reason viewers might want
to allow users to adjust viewing_gamma directly.

10.6. Decoder color handling

See Color Tutorial (Chapter 14) if you aren't already familiar
with color issues.

In many cases, decoders will treat image data in PNG files as
device-dependent RGB data and display it without modification
(except for appropriate gamma correction). This provides the
fastest display of PNG images. But unless the viewer uses exactly
the same display hardware as the original image author used, the
colors will not be exactly the same as the original author saw,
particularly for darker or near-neutral colors. The cHRM chunk
provides information that allows closer color matching than that
provided by gamma correction alone.

Decoders can use the cHRM data to transform the image data from
RGB to XYZ and thence into a perceptually linear color space such
as CIE LAB. They can then partition the colors to generate an
optimal palette, because the geometric distance between two colors
in CIE LAB is strongly related to how different those colors
appear (unlike, for example, RGB or XYZ spaces). The resulting
palette of colors, once transformed back into RGB color space,
could be used for display or written into a PLTE chunk.

Decoders that are part of image processing applications might also
transform image data into CIE LAB space for analysis.

In applications where color fidelity is critical, such as product
design, scientific visualization, medicine, architecture, or
advertising, decoders can transform the image data from source_RGB
to the display_RGB space of the monitor used to view the image.
This involves calculating the matrix to go from source_RGB to XYZ
and the matrix to go from XYZ to display_RGB, then combining them
to produce the overall transformation.
The decoder is responsible
for implementing gamut mapping.

Decoders running on platforms that have a Color Management System
(CMS) can pass the image data, gAMA and cHRM values to the CMS for
display or further processing.

Decoders that provide color printing facilities can use the
facilities in Level 2 PostScript to specify image data in
calibrated RGB space or in a device-independent color space such
as XYZ. This will provide better color fidelity than a simple RGB
to CMYK conversion. The PostScript Language Reference manual
gives examples of this process [POSTSCRIPT]. Such decoders are
responsible for implementing gamut mapping between source_RGB
(specified in the cHRM chunk) and the target printer. The
PostScript interpreter is then responsible for producing the
required colors.

Decoders can use the cHRM data to calculate an accurate grayscale
representation of a color image. Conversion from RGB to gray is
simply a case of calculating the Y (luminance) component of XYZ,
which is a weighted sum of the R, G, and B values. The weights
depend on the monitor type, i.e., the values in the cHRM chunk.
Decoders may wish to do this for PNG files with no cHRM chunk. In
that case, a reasonable default would be the CCIR 709 primaries
[ITU-BT709]. Do not use the original NTSC primaries, unless you
really do have an image color-balanced for such a monitor. Few
monitors ever used the NTSC primaries, so such images are probably
nonexistent these days.

10.7. Background color

The background color given by bKGD will typically be used to fill
unused screen space around the image, as well as any transparent
pixels within the image. (Thus, bKGD is valid and useful even
when the image does not use transparency.) If no bKGD chunk is
present, the viewer must make its own decision about a suitable
background color.

Viewers that have a specific background against which to present
the image (such as Web browsers) will ignore the bKGD chunk, in
effect overriding bKGD with their preferred background color or
background image.

The background color given by bKGD is not to be considered
transparent, even if it happens to match the color given by tRNS
(or, in the case of an indexed-color image, refers to a palette
index that is marked as transparent by tRNS). Otherwise one would
have to imagine something "behind the background" to composite
against. The background color is either used as background or
ignored; it is not an intermediate layer between the PNG image and
some other background.

Indeed, it will be common that bKGD and tRNS specify the same
color, since then a decoder that does not implement transparency
processing will give the intended display, at least when no
partially-transparent pixels are present.

10.8. Alpha channel processing

In the most general case, the alpha channel can be used to
composite a foreground image against a background image; the PNG
file defines the foreground image and the transparency mask, but
not the background image. Decoders are not required to support
this most general case. It is expected that most will be able to
support compositing against a single background color, however.

The equation for computing a composited sample value is

   output = alpha * foreground + (1-alpha) * background

where alpha and the input and output sample values are expressed
as fractions in the range 0 to 1. This computation should be
performed with linear (non-gamma-encoded) sample values. For
color images, the computation is done separately for R, G, and B
samples.

The following code illustrates the general case of compositing a
foreground image over a background image.
      It assumes that you
      have the original pixel data available for the background image,
      and that output is to a frame buffer for display. Other variants
      are possible; see the comments below the code. The code allows
      the bit depths and gamma values of foreground image, background
      image, and frame buffer/CRT all to be different. Don't assume
      they are the same without checking.

      This code is standard C, with line numbers added for reference in
      the comments below.

         01 int foreground[4];  /* image pixel: R, G, B, A */
         02 int background[3];  /* background pixel: R, G, B */
         03 int fbpix[3];       /* frame buffer pixel */
         04 int fg_maxsample;   /* foreground max sample */
         05 int bg_maxsample;   /* background max sample */
         06 int fb_maxsample;   /* frame buffer max sample */
         07 int ialpha, i;      /* integer alpha; sample index */
         08 float alpha, compalpha;
         09 float gamfg, linfg, gambg, linbg, comppix, gcvideo;

            /* Get max sample values in data and frame buffer */
         10 fg_maxsample = (1 << fg_bit_depth) - 1;
         11 bg_maxsample = (1 << bg_bit_depth) - 1;
         12 fb_maxsample = (1 << frame_buffer_bit_depth) - 1;
            /*
             * Get integer version of alpha.
             * Check for opaque and transparent special cases;
             * no compositing needed if so.
             *
             * We show the whole gamma decode/correct process in
             * floating point, but it would more likely be done
             * with lookup tables.
             */
         13 ialpha = foreground[3];
         14 if (ialpha == 0) {
                /*
                 * Foreground image is transparent here.
                 * If the background image is already in the frame
                 * buffer, there is nothing to do.
                 */
         15     ;
         16 } else if (ialpha == fg_maxsample) {
                /*
                 * Copy foreground pixel to frame buffer.
                 */
         17     for (i = 0; i < 3; i++) {
         18         gamfg = (float) foreground[i] / fg_maxsample;
         19         linfg = pow(gamfg, 1.0/fg_gamma);
         20         comppix = linfg;
         21         gcvideo = pow(comppix,viewing_gamma/display_gamma);
         22         fbpix[i] = (int) (gcvideo * fb_maxsample + 0.5);
         23     }
         24 } else {
                /*
                 * Compositing is necessary.
                 * Get floating-point alpha and its complement.
                 * Note: alpha is always linear; gamma does not
                 * affect it.
                 */
         25     alpha = (float) ialpha / fg_maxsample;
         26     compalpha = 1.0 - alpha;

         27     for (i = 0; i < 3; i++) {
                    /*
                     * Convert foreground and background to floating
                     * point, then linearize (undo gamma encoding).
                     */
         28         gamfg = (float) foreground[i] / fg_maxsample;
         29         linfg = pow(gamfg, 1.0/fg_gamma);
         30         gambg = (float) background[i] / bg_maxsample;
         31         linbg = pow(gambg, 1.0/bg_gamma);
                    /*
                     * Composite.
                     */
         32         comppix = linfg * alpha + linbg * compalpha;
                    /*
                     * Gamma correct for display.
                     * Convert to integer frame buffer pixel.
                     */
         33         gcvideo = pow(comppix,viewing_gamma/display_gamma);
         34         fbpix[i] = (int) (gcvideo * fb_maxsample + 0.5);
         35     }
         36 }

      Variations:

         * If output is to another PNG image file instead of a frame
           buffer, lines 21, 22, 33, and 34 should be changed to be
           something like

              /*
               * Gamma encode for storage in output file.
               * Convert to integer sample value.
               */
              gamout = pow(comppix, outfile_gamma);
              outpix[i] = (int) (gamout * out_maxsample + 0.5);

           Also, it becomes necessary to process background pixels when
           alpha is zero, rather than just skipping pixels. Thus, line
           15 must be replaced by copies of lines 17-23, but processing
           background instead of foreground pixel values.
         * If the bit depth of the output file, foreground file, and
           background file are all the same, and the three gamma values
           also match, then the no-compositing code in lines 14-23
           reduces to nothing more than copying pixel values from the
           input file to the output file if alpha is one, or copying
           pixel values from background to output file if alpha is
           zero. Since alpha is typically either zero or one for the
           vast majority of pixels in an image, this is a great
           savings. No gamma computations are needed for most pixels.
         * When the bit depths and gamma values all match, it may
           appear attractive to skip the gamma decoding and encoding
           (lines 28-31, 33-34) and just perform line 32 using
           gamma-encoded sample values. Although this doesn't hurt
           image quality too badly, the time savings are small if alpha
           values of zero and one are special-cased as recommended
           here.
         * If the original pixel values of the background image are no
           longer available, only processed frame buffer pixels left by
           display of the background image, then lines 30 and 31 must
           extract intensity from the frame buffer pixel values using
           code like

              /*
               * Decode frame buffer value back into linear space.
               */
              gcvideo = (float) fbpix[i] / fb_maxsample;
              linbg = pow(gcvideo, display_gamma / viewing_gamma);

           However, some roundoff error can result, so it is better to
           have the original background pixels available if at all
           possible.
         * Note that lines 18-22 are performing exactly the same gamma
           computation that is done when no alpha channel is present.
           So, if you handle the no-alpha case with a lookup table, you
           can use the same lookup table here. Lines 28-31 and 33-34
           can also be done with (different) lookup tables.
         * Of course, everything here can be done in integer
           arithmetic.
           Just be careful to maintain sufficient
           precision all the way through.

      Note: in floating point, no overflow or underflow checks are
      needed, because the input sample values are guaranteed to be
      between 0 and 1, and compositing always yields a result that is in
      between the input values (inclusive). With integer arithmetic,
      some roundoff-error analysis might be needed to guarantee no
      overflow or underflow.

      When displaying a PNG image with full alpha channel, it is
      important to be able to composite the image against some
      background, even if it's only black. Ignoring the alpha channel
      will cause PNG images that have been converted from an
      associated-alpha representation to look wrong. (Of course, if the
      alpha channel is a separate transparency mask, then ignoring alpha
      is a useful option: it allows the hidden parts of the image to be
      recovered.)

      Even if the decoder author does not wish to implement true
      compositing logic, it is simple to deal with images that contain
      only zero and one alpha values. (This is implicitly true for
      grayscale and truecolor PNG files that use a tRNS chunk; for
      indexed-color PNG files, it is easy to check whether tRNS contains
      any values other than 0 and 255.) In this simple case,
      transparent pixels are replaced by the background color, while
      others are unchanged. If a decoder contains only this much
      transparency capability, it should deal with a full alpha channel
      by treating all nonzero alpha values as fully opaque; that is, do
      not replace partially transparent pixels by the background. This
      approach will not yield very good results for images converted
      from associated-alpha formats, but it's better than doing nothing.

   10.9. Progressive display

      When receiving images over slow transmission links, decoders can
      improve perceived performance by displaying interlaced images
      progressively.
      This means that as each pass is received, an
      approximation to the complete image is displayed based on the data
      received so far. One simple yet pleasing effect can be obtained
      by expanding each received pixel to fill a rectangle covering the
      yet-to-be-transmitted pixel positions below and to the right of
      the received pixel. This process can be described by the
      following pseudocode:

         Starting_Row  [1..7] = { 0, 0, 4, 0, 2, 0, 1 }
         Starting_Col  [1..7] = { 0, 4, 0, 2, 0, 1, 0 }
         Row_Increment [1..7] = { 8, 8, 8, 4, 4, 2, 2 }
         Col_Increment [1..7] = { 8, 8, 4, 4, 2, 2, 1 }
         Block_Height  [1..7] = { 8, 8, 4, 4, 2, 2, 1 }
         Block_Width   [1..7] = { 8, 4, 4, 2, 2, 1, 1 }
         pass := 1
         while pass <= 7
         begin
             row := Starting_Row[pass]

             while row < height
             begin
                 col := Starting_Col[pass]

                 while col < width
                 begin
                     visit (row, col,
                            min (Block_Height[pass], height - row),
                            min (Block_Width[pass], width - col))
                     col := col + Col_Increment[pass]
                 end
                 row := row + Row_Increment[pass]
             end

             pass := pass + 1
         end

      Here, the function "visit(row,column,height,width)" obtains the
      next transmitted pixel and paints a rectangle of the specified
      height and width, whose upper-left corner is at the specified row
      and column, using the color indicated by the pixel. Note that row
      and column are measured from 0,0 at the upper left corner.

      If the decoder is merging the received image with a background
      image, it may be more convenient just to paint the received pixel
      positions; that is, the "visit()" function sets only the pixel at
      the specified row and column, not the whole rectangle. This
      produces a "fade-in" effect as the new image gradually replaces
      the old. An advantage of this approach is that proper alpha or
      transparency processing can be done as each pixel is replaced.
      Painting a rectangle as described above will overwrite
      background-image pixels that may be needed later, if the pixels
      eventually received for those positions turn out to be wholly or
      partially transparent. Of course, this is only a problem if the
      background image is not stored anywhere offscreen.

   10.10. Suggested-palette and histogram usage

      In truecolor PNG files, the encoder may have provided a suggested
      PLTE chunk for use by viewers running on indexed-color hardware.

      If the image has a tRNS chunk, the viewer will need to adapt the
      suggested palette for use with its desired background color. To
      do this, replace the palette entry closest to the tRNS color with
      the desired background color; or just add a palette entry for the
      background color, if the viewer can handle more colors than there
      are PLTE entries.

      For images of color type 6 (truecolor with alpha channel), any
      suggested palette should have been designed for display of the
      image against a uniform background of the color specified by bKGD.
      Viewers should probably ignore the palette if they intend to use a
      different background, or if the bKGD chunk is missing. Viewers
      can use a suggested palette for display against a different
      background than it was intended for, but the results may not be
      very good.

      If the viewer presents a transparent truecolor image against a
      background that is more complex than a single color, it is
      unlikely that the suggested palette will be optimal for the
      composite image. In this case it is best to perform a truecolor
      compositing step on the truecolor PNG image and background image,
      then color-quantize the resulting image.

      The histogram chunk is useful when the viewer cannot provide as
      many colors as are used in the image's palette.
      If the viewer is
      only short a few colors, it is usually adequate to drop the
      least-used colors from the palette. To reduce the number of
      colors substantially, it's best to choose entirely new
      representative colors, rather than trying to use a subset of the
      existing palette. This amounts to performing a new color
      quantization step; however, the existing palette and histogram can
      be used as the input data, thus avoiding a scan of the image data.

      If no palette or histogram chunk is provided, a decoder can
      develop its own, at the cost of an extra pass over the image data.
      Alternatively, a default palette (probably a color cube) can be
      used.

      See also Recommendations for Encoders: Suggested palettes (Section
      9.5).

   10.11. Text chunk processing

      If practical, decoders should have a way to display to the user
      all tEXt and zTXt chunks found in the file. Even if the decoder
      does not recognize a particular text keyword, the user might be
      able to understand it.

      PNG text is not supposed to contain any characters outside the ISO
      8859-1 "Latin-1" character set (that is, no codes 0-31 or
      127-159), except for the newline character (decimal 10). But
      decoders might encounter such characters anyway. Some of these
      characters can be safely displayed (e.g., TAB, FF, and CR, decimal
      9, 12, and 13, respectively), but others, especially the ESC
      character (decimal 27), could pose a security hazard because
      unexpected actions may be taken by display hardware or software.
      To prevent such hazards, decoders should not attempt to directly
      display any non-Latin-1 characters (except for newline and perhaps
      TAB, FF, CR) encountered in a tEXt or zTXt chunk. Instead, ignore
      them or display them in a visible notation such as "\nnn". See
      Security considerations (Section 8.5).
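
      The filtering rule above can be sketched in C as follows. This
      is only an illustration, not part of the specification: the
      function names are hypothetical, "\nnn" is assumed to mean a
      three-digit octal escape, and the sketch takes the conservative
      option of passing through newline only, so TAB, FF, and CR are
      escaped as well.

```c
#include <stddef.h>
#include <stdio.h>

/* Nonzero if Latin-1 code c may be shown as-is: newline, or any code
 * outside the control ranges 0-31 and 127-159.  (Hypothetical helper.) */
static int text_code_is_displayable(unsigned char c)
{
    if (c == 10)
        return 1;
    return !(c <= 31 || (c >= 127 && c <= 159));
}

/* Copy in[0..in_len) to out, replacing each unsafe code with a visible
 * "\nnn" octal escape (an assumed reading of the "\nnn" notation).
 * Output is NUL-terminated and truncated if out is too small.
 * Returns the number of characters written, excluding the NUL. */
size_t sanitize_png_text(const unsigned char *in, size_t in_len,
                         char *out, size_t out_size)
{
    size_t n = 0;

    if (out_size == 0)
        return 0;
    for (size_t k = 0; k < in_len; k++) {
        if (text_code_is_displayable(in[k])) {
            if (out_size - n < 2)      /* room for 1 char + NUL? */
                break;
            out[n++] = (char) in[k];
        } else {
            if (out_size - n < 5)      /* room for 4 chars + NUL? */
                break;
            snprintf(out + n, out_size - n, "\\%03o", (unsigned) in[k]);
            n += 4;
        }
    }
    out[n] = '\0';
    return n;
}
```

      A viewer that prefers to drop unsafe codes rather than show them
      can skip the escape branch; either choice keeps ESC and similar
      codes away from the display.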

      Even though encoders are supposed to represent newlines as LF, it
      is recommended that decoders not rely on this; it's best to
      recognize all the common newline combinations (CR, LF, and CR-LF)
      and display each as a single newline. TAB can be expanded to the
      proper number of spaces needed to arrive at a column multiple of
      8.

      Decoders running on systems with non-Latin-1 character set
      encoding should provide character code remapping so that Latin-1
      characters are displayed correctly. Some systems may not provide
      all the characters defined in Latin-1. Mapping unavailable
      characters to a visible notation such as "\nnn" is a good
      fallback. In particular, character codes 127-255 should be
      displayed only if they are printable characters on the decoding
      system. Some systems may interpret such codes as control
      characters; for security, decoders running on such systems should
      not display such characters literally.

      Decoders should be prepared to display text chunks that contain
      any number of printing characters between newline characters, even
      though encoders are encouraged to avoid creating lines in excess
      of 79 characters.

11. Glossary

   Alpha
      A value representing a pixel's degree of transparency. The more
      transparent a pixel, the less it hides the background against
      which the image is presented. In PNG, alpha is really the degree
      of opacity: zero alpha represents a completely transparent pixel,
      maximum alpha represents a completely opaque pixel. But most
      people refer to alpha as providing transparency information, not
      opacity information, and we continue that custom here.

   Ancillary chunk
      A chunk that provides additional information. A decoder can still
      produce a meaningful image, though not necessarily the best
      possible image, without processing the chunk.

   Byte
      Eight bits; also called an octet.

   Channel
      The set of all samples of the same kind within an image; for
      example, all the blue samples in a truecolor image. (The term
      "component" is also used, but not in this specification.) A
      sample is the intersection of a channel and a pixel.

   Chunk
      A section of a PNG file. Each chunk has a type indicated by its
      chunk type name. Most types of chunks also include some data.
      The format and meaning of the data within the chunk are determined
      by the type name.

   Chromaticity
      A pair of values x,y that precisely specify the hue, though not
      the absolute brightness, of a perceived color.

   Composite
      As a verb, to form an image by merging a foreground image and a
      background image, using transparency information to determine
      where the background should be visible. The foreground image is
      said to be "composited against" the background.

   CRC
      Cyclic Redundancy Check. A CRC is a type of check value designed
      to catch most transmission errors. A decoder calculates the CRC
      for the received data and compares it to the CRC that the encoder
      calculated, which is appended to the data. A mismatch indicates
      that the data was corrupted in transit.

   CRT
      Cathode Ray Tube: a common type of computer display hardware.

   Critical chunk
      A chunk that must be understood and processed by the decoder in
      order to produce a meaningful image from a PNG file.

   Datastream
      A sequence of bytes. This term is used rather than "file" to
      describe a byte sequence that is only a portion of a file. We
      also use it to emphasize that a PNG image might be generated and
      consumed "on the fly", never appearing in a stored file at all.

   Deflate
      The name of the compression algorithm used in standard PNG files,
      as well as in zip, gzip, pkzip, and other compression programs.
      Deflate is a member of the LZ77 family of compression methods.

   Filter
      A transformation applied to image data in hopes of improving its
      compressibility. PNG uses only lossless (reversible) filter
      algorithms.

   Frame buffer
      The final digital storage area for the image shown by a computer
      display. Software causes an image to appear onscreen by loading
      it into the frame buffer.

   Gamma
      The brightness of mid-level tones in an image. More precisely, a
      parameter that describes the shape of the transfer function for
      one or more stages in an imaging pipeline. The transfer function
      is given by the expression

         output = input ^ gamma

      where both input and output are scaled to the range 0 to 1.

   Grayscale
      An image representation in which each pixel is represented by a
      single sample value representing overall luminance (on a scale
      from black to white). PNG also permits an alpha sample to be
      stored for each pixel of a grayscale image.

   Indexed color
      An image representation in which each pixel is represented by a
      single sample that is an index into a palette or lookup table.
      The selected palette entry defines the actual color of the pixel.

   Lossless compression
      Any method of data compression that guarantees the original data
      can be reconstructed exactly, bit-for-bit.

   Lossy compression
      Any method of data compression that reconstructs the original data
      approximately, rather than exactly.

   LSB
      Least Significant Byte of a multi-byte value.

   Luminance
      Perceived brightness, or grayscale level, of a color. Luminance
      and chromaticity together fully define a perceived color.

   LUT
      Look Up Table. In general, a table used to transform data. In
      frame buffer hardware, a LUT can be used to map indexed-color
      pixels into a selected set of truecolor values, or to perform
      gamma correction.
      In software, a LUT can be used as a fast way of
      implementing any one-variable mathematical function.

   MSB
      Most Significant Byte of a multi-byte value.

   Palette
      The set of colors available in an indexed-color image. In PNG, a
      palette is an array of colors defined by red, green, and blue
      samples. (Alpha values can also be defined for palette entries,
      via the tRNS chunk.)

   Pixel
      The information stored for a single grid point in the image. The
      complete image is a rectangular array of pixels.

   PNG editor
      A program that modifies a PNG file and preserves ancillary
      information, including chunks that it does not recognize. Such a
      program must obey the rules given in Chunk Ordering Rules (Chapter
      7).

   Sample
      A single number in the image data; for example, the red value of a
      pixel. A pixel is composed of one or more samples. We use
      "sample" both for color values and for the palette index values of
      an indexed-color image.

   Scanline
      One horizontal row of pixels within an image.

   Truecolor
      An image representation in which pixel colors are defined by
      storing three samples for each pixel, representing red, green, and
      blue intensities respectively. PNG also permits an alpha sample
      to be stored for each pixel of a truecolor image.

   White point
      The chromaticity of a computer display's nominal white value.

   zlib
      A particular format for data that has been compressed using
      deflate-style compression. Also the name of a library
      implementing this method. PNG implementations need not use the
      zlib library, but they must conform to its format for compressed
      data.

   x^y
      Exponentiation; x raised to the power y. C programmers should be
      careful not to misread this notation as exclusive-or. Note that
      in gamma-related calculations, zero raised to any power is valid
      and should give a zero result.

12. Appendix: Rationale

   (This appendix is not part of the formal PNG specification.)

   This appendix gives the reasoning behind some of the design decisions
   in PNG. Many of these decisions were the subject of considerable
   debate. The authors freely admit that another group might have made
   different decisions; however, we believe that our choices are
   defensible and consistent.

   12.1. Why a new file format?

      Does the world really need yet another graphics format? We
      believe so. GIF is no longer freely usable, but no other commonly
      used format can directly replace it, as is discussed in more
      detail below. We might have used an adaptation of an existing
      format, for example GIF with an unpatented compression scheme.
      But this would require new code anyway; it would not be all that
      much easier to implement than a whole new file format. (PNG is
      designed to be simple to implement, with the exception of the
      compression engine, which would be needed in any case.) We feel
      that this is an excellent opportunity to design a new format that
      fixes some of the known limitations of GIF.

   12.2. Why these features?

      The features chosen for PNG are intended to address the needs of
      applications that previously used the special strengths of GIF.
      In particular, GIF is well adapted for online communications
      because of its streamability and progressive display capability.
      PNG shares those attributes.

      We have also addressed some of the widely known shortcomings of
      GIF. In particular, PNG supports truecolor images. We know of no
      widely used image format that losslessly compresses truecolor
      images as effectively as PNG does. We hope that PNG will make use
      of truecolor images more practical and widespread.

      Some form of transparency control is desirable for applications in
      which images are displayed against a background or together with
      other images. GIF provided a simple transparent-color
      specification for this purpose. PNG supports a full alpha channel
      as well as transparent-color specifications. This allows both
      highly flexible transparency and compression efficiency.

      Robustness against transmission errors has been an important
      consideration. For example, images transferred across the
      Internet are often mistakenly processed as text, leading to file
      corruption. PNG is designed so that such errors can be detected
      quickly and reliably.

      PNG has been expressly designed not to be completely dependent on
      a single compression technique. Although deflate/inflate
      compression is mentioned in this document, PNG would still exist
      without it.

   12.3. Why not these features?

      Some features have been deliberately omitted from PNG. These
      choices were made to simplify implementation of PNG, promote
      portability and interchangeability, and make the format as simple
      and foolproof as possible for users. In particular:

         * There is no uncompressed variant of PNG. It is possible to
           store uncompressed data by using only uncompressed deflate
           blocks (a feature normally used to guarantee that deflate
           does not make incompressible data much larger). However,
           any software that does not support full deflate/inflate will
           not be considered compliant with the PNG standard. The two
           most important features of PNG---portability and
           compression---are absolute requirements for online
           applications, and users demand them. Failure to support full
           deflate/inflate compromises both of these objectives.
         * There is no lossy compression in PNG. Existing formats such
           as JFIF already handle lossy compression well.
           Furthermore,
           available lossy compression methods (e.g., JPEG) are far
           from foolproof --- a poor choice of quality level can ruin
           an image. To avoid user confusion and unintentional loss of
           information, we feel it is best to keep lossy and lossless
           formats strictly separate. Also, lossy compression is
           complex to implement. Adding JPEG support to a PNG decoder
           might increase its size by an order of magnitude. This
           would certainly cause some decoders to omit support for the
           feature, which would destroy our goal of interchangeability.
         * There is no support for CMYK or other unusual color spaces.
           Again, this is in the name of promoting portability. CMYK,
           in particular, is far too device-dependent to be useful as a
           portable image representation.
         * There is no standard chunk for thumbnail views of images.
           In discussions with software vendors who use thumbnails in
           their products, it has become clear that most would not use
           a "standard" thumbnail chunk. For one thing, every vendor
           has a different idea of what the dimensions and
           characteristics of a thumbnail should be. Also, some
           vendors keep thumbnails in separate files to accommodate
           varied image formats; they are not going to stop doing that
           simply because of a thumbnail chunk in one new format.
           Proprietary chunks containing vendor-specific thumbnails
           appear to be more practical than a common thumbnail format.

      It is worth noting that private extensions to PNG could easily add
      these features. We will not, however, include them as part of the
      basic PNG standard.

      Basic PNG also does not support multiple images in one file. This
      restriction is a reflection of the reality that many applications
      do not need and will not support multiple images per file. (While
      the GIF standard nominally allows multiple images per file, few
      applications actually support it.)
      In any case, single images are
      a fundamentally different sort of object from sequences of images.
      Rather than make false promises of interchangeability, we have
      drawn a clear distinction between single-image and multi-image
      formats. PNG is a single-image format.

   12.4. Why not use format X?

      Numerous existing formats were considered before deciding to
      develop PNG. None could meet the requirements we felt were
      important for PNG.

      GIF is no longer suitable as a universal standard because of legal
      entanglements. Although just replacing GIF's compression method
      would avoid that problem, GIF does not support truecolor images,
      alpha channels, or gamma correction. The spec has more subtle
      problems too. Only a small subset of the GIF89 spec is actually
      portable across a variety of implementations, but there is no
      codification of the most portable part of the spec.

      TIFF is far too complex to meet our goals of simplicity and
      interchangeability. Defining a TIFF subset would meet that
      objection, but would frustrate users making the reasonable
      assumption that a file saved as TIFF from their existing software
      would load into a program supporting our flavor of TIFF.
      Furthermore, TIFF is not designed for stream processing, has no
      provision for progressive display, and does not currently provide
      any good, legally unencumbered, lossless compression method.

      IFF has also been suggested, but is not suitable in detail:
      available image representations are too machine-specific or not
      adequately compressed. The overall chunk structure of IFF is a
      useful concept that PNG has liberally borrowed from, but we did
      not attempt to be bit-for-bit compatible with IFF chunk structure.
      Again this is due to detailed issues, notably the fact that IFF
      FORMs are not designed to be serially writable.

      Lossless JPEG is not suitable because it does not provide for the
      storage of indexed-color images. Furthermore, its lossless
      truecolor compression is often inferior to that of PNG.

   12.5. Byte order

      It has been asked why PNG uses network byte order. We have
      selected one byte ordering and used it consistently. Which order
      in particular is of little relevance, but network byte order has
      the advantage that routines to convert to and from it are already
      available on any platform that supports TCP/IP networking,
      including all PC platforms. The functions are trivial and will be
      included in the reference implementation.

   12.6. Interlacing

      PNG's two-dimensional interlacing scheme is more complex to
      implement than GIF's line-wise interlacing. It also costs a
      little more in file size. However, it yields an initial image
      eight times faster than GIF (the first pass transmits only 1/64th
      of the pixels, compared to 1/8th for GIF). Although this initial
      image is coarse, it is useful in many situations. For example, if
      the image is a World Wide Web imagemap that the user has seen
      before, PNG's first pass is often enough to determine where to
      click. The PNG scheme also looks better than GIF's, because
      horizontal and vertical resolution never differ by more than a
      factor of two; this avoids the odd "stretched" look seen when
      interlaced GIFs are filled in by replicating scanlines.
      Preliminary results show that small text in an interlaced PNG
      image is typically readable about twice as fast as in an
      equivalent GIF, i.e., after PNG's fifth pass or 25% of the image
      data, instead of after GIF's third pass or 50%. This is again due
      to PNG's more balanced increase in resolution.

   12.7. Why gamma?

      It might seem natural to standardize on storing sample values that
      are linearly proportional to light intensity (that is, have gamma
      of 1.0). But in fact, it is common for images to have a gamma of
      less than 1. There are three good reasons for this:

         * For reasons detailed in Gamma Tutorial (Chapter 13), all
           video cameras apply a "gamma correction" function to the
           intensity information. This causes the video signal to have
           a gamma of about 0.5 relative to the light intensity in the
           original scene. Thus, images obtained by frame-grabbing
           video already have a gamma of about 0.5.
         * The human eye has a nonlinear response to intensity, so
           linear encoding of samples either wastes sample codes in
           bright areas of the image, or provides too few sample codes
           to avoid banding artifacts in dark areas of the image, or
           both. At least 12 bits per sample are needed to avoid
           visible artifacts in linear encoding with a 100:1 image
           intensity range. An image gamma in the range 0.3 to 0.5
           allocates sample values in a way that roughly corresponds to
           the eye's response, so that 8 bits/sample are enough to
           avoid artifacts caused by insufficient sample precision in
           almost all images. This makes "gamma encoding" a much
           better way of storing digital images than the simpler linear
           encoding.
         * Many images are created on PCs or workstations with no gamma
           correction hardware and no software willing to provide gamma
           correction either. In these cases, the images have had
           their lighting and color chosen to look best on this
           platform --- they can be thought of as having "manual" gamma
           correction built in. To see what the image author intended,
           it is necessary to treat such images as having a file_gamma
           value in the range 0.4-0.6, depending on the room lighting
           level that the author was working in.
In practice, image gamma values around 1.0 and around 0.5 are both widely found. Older image standards such as GIF often do not account for this fact. The JFIF standard specifies that images in that format should use linear samples, but many JFIF images found on the Internet actually have a gamma somewhere near 0.4 or 0.5. The variety of images found and the variety of systems that people display them on have led to widespread problems with images appearing "too dark" or "too light".

PNG expects viewers to compensate for image gamma at the time that the image is displayed. Another possible approach is to expect encoders to convert all images to a uniform gamma at encoding time. While that method would speed viewers slightly, it has fundamental flaws:

* Gamma correction is inherently lossy due to quantization and roundoff error. Requiring conversion at encoding time thus causes irreversible loss. Since PNG is intended to be a lossless storage format, this is undesirable; we should store unmodified source data.
* The encoder might not know the source gamma value. If the decoder does gamma correction at viewing time, it can adjust the gamma (change the displayed brightness) in response to feedback from a human user. The encoder has no such recourse.
* Whatever "standard" gamma we settled on would be wrong for some displays. Hence viewers would still need gamma correction capability.

Since there will always be images with no gamma or an incorrect recorded gamma, good viewers will need to incorporate gamma adjustment code anyway. Gamma correction at viewing time is thus the right way to go.

See Gamma Tutorial (Chapter 13) for more information.

12.8. Non-premultiplied alpha

PNG uses "unassociated" or "non-premultiplied" alpha so that images with separate transparency masks can be stored losslessly.
Another common technique, "premultiplied alpha", stores pixel values premultiplied by the alpha fraction; in effect, the image is already composited against a black background. Any image data hidden by the transparency mask is irretrievably lost by that method, since multiplying by a zero alpha value always produces zero.

Some image rendering techniques generate images with premultiplied alpha (the alpha value actually represents how much of the pixel is covered by the image). This representation can be converted to PNG by dividing the sample values by alpha, except where alpha is zero. The result will look good if displayed by a viewer that handles alpha properly, but will not look very good if the viewer ignores the alpha channel.

Although each form of alpha storage has its advantages, we did not want to require all PNG viewers to handle both forms. We standardized on non-premultiplied alpha as being the lossless and more general case.

12.9. Filtering

PNG includes filtering capability because filtering can significantly reduce the compressed size of truecolor and grayscale images. Filtering is also sometimes of value on indexed-color images, although this is less common.

The filter algorithms are defined to operate on bytes, rather than pixels; this gains simplicity and speed with very little cost in compression performance. Tests have shown that filtering is usually ineffective for images with fewer than 8 bits per sample, so providing pixelwise filtering for such images would be pointless. For 16 bit/sample data, bytewise filtering is nearly as effective as pixelwise filtering, because MSBs are predicted from adjacent MSBs, and LSBs are predicted from adjacent LSBs.

The encoder is allowed to change filters for each new scanline.
This creates no additional complexity for decoders, since a decoder is required to contain defiltering logic for every filter type anyway. The only cost is an extra byte per scanline in the pre-compression datastream. Our tests showed that when the same filter is selected for all scanlines, this extra byte compresses away to almost nothing, so there is little storage cost compared to a fixed filter specified for the whole image. And the potential benefits of adaptive filtering are too great to ignore. Even with the simplistic filter-choice heuristics so far discovered, adaptive filtering usually outperforms fixed filters. In particular, an adaptive filter can change behavior for successive passes of an interlaced image; a fixed filter cannot.

12.10. Text strings

Most graphics file formats include the ability to store some textual information along with the image. But many applications need more than that: they want to be able to store several identifiable pieces of text. For example, a database using PNG files to store medical X-rays would likely want to include the patient's name, the doctor's name, etc. A simple way to do this in PNG would be to invent new private chunks holding text. The disadvantage of such an approach is that other applications would have no idea what was in those chunks, and would simply ignore them. Instead, we recommend that textual information be stored in standard tEXt chunks with suitable keywords. Use of tEXt tells any PNG viewer that the chunk contains text that might be of interest to a human user. Thus, a person looking at the file with another viewer will still be able to see the text, and even understand what it is if the keywords are reasonably self-explanatory. (To this end, we recommend spelled-out keywords, not abbreviations that will be hard for a person to understand.
Saving a few bytes on a keyword is false economy.)

The ISO 8859-1 (Latin-1) character set was chosen as a compromise between functionality and portability. Some platforms cannot display anything more than 7-bit ASCII characters, while others can handle characters beyond the Latin-1 set. We felt that Latin-1 represents a widely useful and reasonably portable character set. Latin-1 is a direct subset of character sets commonly used on popular platforms such as Microsoft Windows and X Windows. It can also be handled on Macintosh systems with a simple remapping of characters.

There is presently no provision for text employing character sets other than Latin-1. We recognize that the need for other character sets will increase. However, PNG already requires that programmers implement a number of new and unfamiliar features, and text representation is not PNG's primary purpose. Since PNG provides for the creation and public registration of new ancillary chunks of general interest, we expect that text chunks for other character sets, such as Unicode, eventually will be registered and increase gradually in popularity.

12.11. PNG file signature

The first eight bytes of a PNG file always contain the following values:

   (decimal)              137  80  78  71  13  10   26  10
   (hexadecimal)           89  50  4e  47  0d  0a   1a  0a
   (ASCII C notation)    \211   P   N   G  \r  \n \032  \n

This signature both identifies the file as a PNG file and provides for immediate detection of common file-transfer problems. The first two bytes distinguish PNG files on systems that expect the first two bytes to identify the file type uniquely. The first byte is chosen as a non-ASCII value to reduce the probability that a text file may be misrecognized as a PNG file; also, it catches bad file transfers that clear bit 7. Bytes two through four name the format.
The CR-LF sequence catches bad file transfers that alter newline sequences. The control-Z character stops file display under MS-DOS. The final line feed checks for the inverse of the CR-LF translation problem.

A decoder may further verify that the next eight bytes contain an IHDR chunk header with the correct chunk length; this will catch bad transfers that drop or alter null (zero) bytes.

Note that there is no version number in the signature, nor indeed anywhere in the file. This is intentional: the chunk mechanism provides a better, more flexible way to handle format extensions, as explained in Chunk naming conventions (Section 12.13).

12.12. Chunk layout

The chunk design allows decoders to skip unrecognized or uninteresting chunks: it is simply necessary to skip the appropriate number of bytes, as determined from the length field.

Limiting chunk length to (2^31)-1 bytes avoids possible problems for implementations that cannot conveniently handle 4-byte unsigned values. In practice, chunks will usually be much shorter than that anyway.

A separate CRC is provided for each chunk in order to detect badly-transferred images as quickly as possible. In particular, critical data such as the image dimensions can be validated before being used.

The chunk length is excluded from the CRC so that the CRC can be calculated as the data is generated; this avoids a second pass over the data in cases where the chunk length is not known in advance. Excluding the length from the CRC does not create any extra risk of failing to discover file corruption, since if the length is wrong, the CRC check will fail: the CRC will be computed on the wrong set of bytes and then be tested against the wrong value from the file.

12.13. Chunk naming conventions

The chunk naming conventions allow safe, flexible extension of the PNG format. This mechanism is much better than a format version number, because it works on a feature-by-feature basis rather than being an overall indicator. Decoders can process newer files if and only if the files use no unknown critical features (as indicated by finding unknown critical chunks). Unknown ancillary chunks can be safely ignored. We decided against having an overall format version number because experience has shown that format version numbers hurt portability as much as they help. Version numbers tend to be set unnecessarily high, leading to older decoders rejecting files that they could have processed (this was a serious problem for several years after the GIF89 spec came out, for example). Furthermore, private extensions can be made either critical or ancillary, and standard decoders will react appropriately; overall version numbers are no help for private extensions.

A hypothetical chunk for vector graphics would be a critical chunk, since if ignored, important parts of the intended image would be missing. A chunk carrying the Mandelbrot set coordinates for a fractal image would be ancillary, since other applications could display the image without understanding what the image represents. In general, a chunk type should be made critical only if it is impossible to display a reasonable representation of the intended image without interpreting that chunk.

The public/private property bit ensures that any newly defined public chunk type name cannot conflict with proprietary chunks that could be in use somewhere. However, this does not protect users of private chunk names from the possibility that someone else may use the same chunk name for a different purpose.
It is a good idea to put additional identifying information at the start of the data for any private chunk type.

When a PNG file is modified, certain ancillary chunks may need to be changed to reflect changes in other chunks. For example, a histogram chunk needs to be changed if the image data changes. If the file editor does not recognize histogram chunks, copying them blindly to a new output file is incorrect; such chunks should be dropped. The safe/unsafe property bit allows ancillary chunks to be marked appropriately.

Not all possible modification scenarios are covered by the safe/unsafe semantics. In particular, chunks that are dependent on the total file contents are not supported. (An example of such a chunk is an index of IDAT chunk locations within the file: adding a comment chunk would inadvertently break the index.) Definition of such chunks is discouraged. If absolutely necessary for a particular application, such chunks can be made critical chunks, with consequent loss of portability to other applications. In general, ancillary chunks can depend on critical chunks but not on other ancillary chunks. It is expected that mutually dependent information should be put into a single chunk.

In some situations it may be unavoidable to make one ancillary chunk dependent on another. Although the chunk property bits are insufficient to represent this case, a simple solution is available: in the dependent chunk, record the CRC of the chunk depended on. It can then be determined whether that chunk has been changed by some other program.

The same technique can be useful for other purposes. For example, if a program relies on the palette being in a particular order, it can store a private chunk containing the CRC of the PLTE chunk.
If this value matches when the file is again read in, then it provides high confidence that the palette has not been tampered with. Note that it is not necessary to mark the private chunk unsafe-to-copy when this technique is used; thus, such a private chunk can survive other editing of the file.

12.14. Palette histograms

A viewer may not be able to provide as many colors as are listed in the image's palette. (For example, some colors could be reserved by a window system.) To produce the best results in this situation, it is helpful to have information about the frequency with which each palette index actually appears, in order to choose the best palette for dithering or to drop the least-used colors. Since images are often created once and viewed many times, it makes sense to calculate this information in the encoder, although it is not mandatory for the encoder to provide it.

Other image formats have usually addressed this problem by specifying that the palette entries should appear in order of frequency of use. That is an inferior solution, because it doesn't give the viewer nearly as much information: the viewer can't determine how much damage will be done by dropping the last few colors. Nor does a sorted palette give enough information to choose a target palette for dithering, in the case that the viewer must reduce the number of colors substantially. A palette histogram provides the information needed to choose such a target palette without making a pass over the image data.

13. Appendix: Gamma Tutorial

(This appendix is not part of the formal PNG specification.)

It would be convenient for graphics programmers if all of the components of an imaging system were linear.
The voltage coming from an electronic camera would be directly proportional to the intensity (power) of light in the scene, the light emitted by a CRT would be directly proportional to its input voltage, and so on. However, real-world devices do not behave in this way. All CRT displays, almost all photographic film, and many electronic cameras have nonlinear signal-to-light-intensity or intensity-to-signal characteristics.

Fortunately, all of these nonlinear devices have a transfer function that is approximated fairly well by a single type of mathematical function: a power function. This power function has the general equation

   output = input ^ gamma

where ^ denotes exponentiation, and "gamma" (often printed using the Greek letter gamma, thus the name) is simply the exponent of the power function.

By convention, "input" and "output" are both scaled to the range 0..1, with 0 representing black and 1 representing maximum white (or red, etc). Normalized in this way, the power function is completely described by a single number, the exponent "gamma".

So, given a particular device, we can measure its output as a function of its input, fit a power function to this measured transfer function, extract the exponent, and call it gamma. We often say "this device has a gamma of 2.5" as a shorthand for "this device has a power-law response with an exponent of 2.5". We can also talk about the gamma of a mathematical transform, or of a lookup table in a frame buffer, so long as the input and output of the thing are related by the power-law expression above.

How do gammas combine?

Real imaging systems will have several components, and more than one of these can be nonlinear.
If all of the components have transfer characteristics that are power functions, then the transfer function of the entire system is also a power function. The exponent (gamma) of the whole system's transfer function is just the product of all of the individual exponents (gammas) of the separate stages in the system.

Also, stages that are linear pose no problem, since a power function with an exponent of 1.0 is really a linear function. So a linear transfer function is just a special case of a power function, with a gamma of 1.0.

Thus, as long as our imaging system contains only stages with linear and power-law transfer functions, we can meaningfully talk about the gamma of the entire system. This is indeed the case with most real imaging systems.

What should overall gamma be?

If the overall gamma of an imaging system is 1.0, its output is linearly proportional to its input. This means that the ratio between the intensities of any two areas in the reproduced image will be the same as it was in the original scene. It might seem that this should always be the goal of an imaging system: to accurately reproduce the tones of the original scene. Alas, that is not the case.

When the reproduced image is to be viewed in "bright surround" conditions, where other white objects nearby in the room have about the same brightness as white in the image, then an overall gamma of 1.0 does indeed give real-looking reproduction of a natural scene. Photographic prints viewed under room light and computer displays in bright room light are typical "bright surround" viewing conditions.

However, sometimes images are intended to be viewed in "dark surround" conditions, where the room is substantially black except for the image. This is typical of the way movies and slides (transparencies) are viewed by projection.
Under these circumstances, an accurate reproduction of the original scene results in an image that human viewers judge as "flat" and lacking in contrast. It turns out that the projected image needs to have a gamma of about 1.5 relative to the original scene for viewers to judge it "natural". Thus, slide film is designed to have a gamma of about 1.5, not 1.0.

There is also an intermediate condition called "dim surround", where the rest of the room is still visible to the viewer, but is noticeably darker than the reproduced image itself. This is typical of television viewing, at least in the evening, as well as subdued-light computer work areas. In dim surround conditions, the reproduced image needs to have a gamma of about 1.25 relative to the original scene in order to look natural.

The requirement for boosted contrast (gamma) in dark surround conditions is due to the way the human visual system works, and applies equally well to computer monitors. Thus, a PNG viewer trying to achieve the maximum realism for the images it displays really needs to know what the room lighting conditions are, and adjust the gamma of the displayed image accordingly.

If asking the user about room lighting conditions is inappropriate or too difficult, just assume that the overall gamma (viewing_gamma as defined below) should be 1.0 or 1.25. That's all that most systems that implement gamma correction do.

What is a CRT's gamma?

All CRT displays have a power-law transfer characteristic with a gamma of about 2.5. This is due to the physical processes involved in controlling the electron beam in the electron gun, and has nothing to do with the phosphor.

An exception to this rule is fancy "calibrated" CRTs that have internal electronics to alter their transfer function.
If you have one of these, you probably should believe what the manufacturer tells you its gamma is. But in all other cases, assuming 2.5 is likely to be pretty accurate.

There are various images around that purport to measure gamma, usually by comparing the intensity of an area containing alternating white and black with a series of areas of continuous gray of different intensity. These are usually not reliable. Test images that use a "checkerboard" pattern of black and white are the worst, because a single white pixel will be reproduced considerably darker than a large area of white. An image that uses alternating black and white horizontal lines (such as the "gamma.png" test image at ftp://ftp.uu.net/graphics/png/images/suite/gamma.png) is much better, but even it may be inaccurate at high "picture" settings on some CRTs.

If you have a good photometer, you can measure the actual light output of a CRT as a function of input voltage and fit a power function to the measurements. However, note that this procedure is very sensitive to the CRT's black level adjustment, somewhat sensitive to its picture adjustment, and also affected by ambient light. Furthermore, CRTs spread some light from bright areas of an image into nearby darker areas; a single bright spot against a black background may be seen to have a "halo". Your measuring technique will need to minimize the effects of this.

Because of the difficulty of measuring gamma, using either test images or measuring equipment, you're usually better off just assuming gamma is 2.5 rather than trying to measure it.

What is gamma correction?

A CRT has a gamma of 2.5, and we can't change that. To get an overall gamma of 1.0 (or somewhere near that) for an imaging system, we need to have at least one other component of the "image pipeline" that is nonlinear.
If, in fact, there is only one nonlinear stage in addition to the CRT, then it's traditional to say that the CRT has a certain gamma, and that the other nonlinear stage provides "gamma correction" to compensate for the CRT. However, exactly where the "correction" is done depends on circumstance.

In all broadcast video systems, gamma correction is done in the camera. This choice was made in the days when television electronics were all analog, and a good gamma-correction circuit was expensive to build. The original NTSC video standard required cameras to have a transfer function with a gamma of 1/2.2, or about 0.45. Recently, a more complex two-part transfer function has been adopted [SMPTE-170M], but its behavior can be well approximated by a power function with a gamma of 0.5. When the resulting image is displayed on a CRT with a gamma of 2.5, the image on screen ends up with a gamma of about 1.25 relative to the original scene, which is appropriate for "dim surround" viewing.

These days, video signals are often digitized and stored in computer frame buffers. This works fine, but remember that gamma correction is "built into" the video signal, and so the digitized video has a gamma of about 0.5 relative to the original scene.

Computer rendering programs often produce linear samples. To display these correctly, intensity on the CRT must be directly proportional to the sample values in the frame buffer. This can be done with a special hardware lookup table between the frame buffer and the CRT hardware. The lookup table (often called LUT) is loaded with a mapping that implements a power function with a gamma of 0.4, thus providing "gamma correction" for the CRT gamma.

Thus, gamma correction sometimes happens before the frame buffer, sometimes after.
As long as images created in a particular environment are always displayed in that environment, everything is fine. But when people try to exchange images, differences in gamma correction conventions often result in images that seem far too bright and washed out, or far too dark and contrasty.

Gamma-encoded samples are good

So, is it better to do gamma correction before or after the frame buffer?

In an ideal world, sample values would be stored in floating point, there would be lots of precision, and it wouldn't really matter much. But in reality, we're always trying to store images in as few bits as we can.

If we decide to use samples that are linearly proportional to intensity, and do the gamma correction in the frame buffer LUT, it turns out that we need to use at least 12 bits for each of red, green, and blue to have enough precision in intensity. With any less than that, we will sometimes see "contour bands" or "Mach bands" in the darker areas of the image, where two adjacent sample values are still far enough apart in intensity for the difference to be visible.

However, through an interesting coincidence, the human eye's subjective perception of brightness is related to the physical stimulation of light intensity in a manner that is very much like the power function used for gamma correction. If we apply gamma correction to measured (or calculated) light intensity before quantizing to an integer for storage in a frame buffer, we can get away with using many fewer bits to store the image. In fact, 8 bits per color is almost always sufficient to avoid contouring artifacts. This is because, since gamma correction is so closely related to human perception, we are assigning our 256 available sample codes to intensity values in a manner that approximates how visible those intensity changes are to the eye.
Compared to a linear-sample image, we allocate fewer sample values to brighter parts of the tonal range and more sample values to the darker portions of the tonal range.

Thus, for the same apparent image quality, images using gamma-encoded sample values need only about two-thirds as many bits of storage as images using linear samples.

General gamma handling

When more than two nonlinear transfer functions are involved in the image pipeline, the term "gamma correction" becomes too vague. If we consider a pipeline that involves capturing (or calculating) an image, storing it in an image file, reading the file, and displaying the image on some sort of display screen, there are at least 5 places in the pipeline that could have nonlinear transfer functions. Let's give each a specific name for its characteristic gamma:

camera_gamma
   the characteristic of the image sensor

encoding_gamma
   the gamma of any transformation performed by the software writing the image file

decoding_gamma
   the gamma of any transformation performed by the software reading the image file

LUT_gamma
   the gamma of the frame buffer LUT, if present

CRT_gamma
   the gamma of the CRT, generally 2.5

In addition, let's add a few other names:

file_gamma
   the gamma of the image in the file, relative to the original scene. This is

      file_gamma = camera_gamma * encoding_gamma

display_gamma
   the gamma of the "display system" downstream of the frame buffer. This is

      display_gamma = LUT_gamma * CRT_gamma

viewing_gamma
   the overall gamma that we want to obtain to produce pleasing images --- generally 1.0 to 1.5.

The file_gamma value, as defined above, is what goes in the gAMA chunk in a PNG file.
If file_gamma is not 1.0, we know that gamma correction has been done on the sample values in the file, and we could call them "gamma corrected" samples. However, since there can be so many different values of gamma in the image display chain, and some of them are not known at the time the image is written, the samples are not really being "corrected" for a specific display condition. We are really using a power function in the process of encoding an intensity range into a small integer field, and so it is more correct to say "gamma encoded" samples instead of "gamma corrected" samples.

When displaying an image file, the image decoding program is responsible for making the overall gamma of the system equal to the desired viewing_gamma, by selecting the decoding_gamma appropriately. When displaying a PNG file, the gAMA chunk provides the file_gamma value. The display_gamma may be known for this machine, or it might be obtained from the system software, or the user might have to be asked what it is. The correct viewing_gamma depends on lighting conditions, and that will generally have to come from the user.

Ultimately, you should have

   file_gamma * decoding_gamma * display_gamma = viewing_gamma

Some specific examples

In digital video systems, camera_gamma is about 0.5 by declaration of the various video standards documents. CRT_gamma is 2.5 as usual, while encoding_gamma, decoding_gamma, and LUT_gamma are all 1.0. As a result, viewing_gamma ends up being about 1.25.

On frame buffers that have hardware gamma correction tables, and that are calibrated to display linear samples correctly, display_gamma is 1.0.

Many workstations and X terminals and PC displays lack gamma correction lookup tables. Here, LUT_gamma is always 1.0, so display_gamma is 2.5.

On the Macintosh, there is a LUT.
By default, it is loaded with a table whose gamma is about 0.72, giving a display_gamma (LUT and CRT combined) of about 1.8. Some Macs have a "Gamma" control panel that allows gamma to be changed to 1.0, 1.2, 1.4, 1.8, or 2.2. These settings load alternate LUTs that are designed to give a display_gamma that is equal to the label on the selected button. Thus, the "Gamma" control panel setting can be used directly as display_gamma in decoder calculations.

On recent SGI systems, there is a hardware gamma-correction table whose contents are controlled by the (privileged) "gamma" program. The gamma of the table is actually the reciprocal of the number that "gamma" prints, and it does not include the CRT gamma. To obtain the display_gamma, you need to find the SGI system gamma (either by looking in a file, or asking the user) and then calculate

   display_gamma = 2.5 / SGI_system_gamma

You will find SGI systems with the system gamma set to 1.0 and 2.2 (or higher), but the default when machines are shipped is 1.7.

A note about video gamma

The original NTSC video standards specified a simple power-law camera transfer function with a gamma of 1/2.2 or 0.45. This is not possible to implement exactly in analog hardware because the function has infinite slope at x=0, so all cameras deviated to some degree from this ideal. More recently, a new camera transfer function that is physically realizable has been accepted as a standard [SMPTE-170M]. It is

   Vout = 4.5 * Vin                     if Vin < 0.018
   Vout = 1.099 * (Vin^0.45) - 0.099    if Vin >= 0.018

where Vin and Vout are measured on a scale of 0 to 1. Although the exponent remains 0.45, the multiplication and subtraction change the shape of the transfer function, so it is no longer a pure power function.
      If you want to perform extremely precise
      calculations on video signals, you should use the expression above
      (or its inverse, as required).

      However, PNG does not provide a way to specify that an image uses
      this exact transfer function; the gAMA chunk always assumes a pure
      power-law function. If we plot the two-part transfer function
      above along with the family of pure power functions, we find that
      a power function with a gamma of about 0.5 to 0.52 (not 0.45) most
      closely approximates the transfer function. Thus, when writing a
      PNG file with data obtained from digitizing the output of a modern
      video camera, the gAMA chunk should contain 0.5 or 0.52, not 0.45.
      The remaining difference between the true transfer function and
      the power function is insignificant for almost all purposes. (In
      fact, the alignment errors in most cameras are likely to be larger
      than the difference between these functions.) The designers of
      PNG deemed the simplicity and flexibility of a power-law
      definition of gAMA to be more important than being able to
      describe the SMPTE-170M transfer curve exactly.

      The PAL and SECAM video standards specify a power-law camera
      transfer function with a gamma of 1/2.8 or 0.36 --- not the 1/2.2
      of NTSC. However, this is too low in practice, so real cameras
      are likely to have their gamma set close to NTSC practice. Just
      guessing 0.45 or 0.5 is likely to give you viewable results, but
      if you want precise values you'll probably have to measure the
      particular camera.

   Further reading

      If you have access to the World Wide Web, read Charles Poynton's
      excellent "Gamma FAQ" [GAMMA-FAQ] for more information about
      gamma.

14. Appendix: Color Tutorial

   (This appendix is not part of the formal PNG specification.)

   About chromaticity

      The cHRM chunk is used, together with the gAMA chunk, to convey
      precise color information so that a PNG image can be displayed or
      printed with better color fidelity than is possible without this
      information. The preceding chapters state how this information is
      encoded in a PNG image. This tutorial briefly outlines the
      underlying color theory for those who might not be familiar with
      it.

      Note that displaying an image with incorrect gamma will produce
      much larger color errors than failing to use the chromaticity
      data. First be sure the monitor set-up and gamma correction are
      right, then worry about chromaticity.

   The problem

      The color of an object depends not only on the precise spectrum of
      light emitted or reflected from it, but also on the observer ---
      their species, what else they can see at the same time, even what
      they have recently looked at! Furthermore, two very different
      spectra can produce exactly the same color sensation. Color is
      not an objective property of real-world objects; it is a
      subjective, biological sensation. However, by making some
      simplifying assumptions (such as: we are talking about human
      vision) it is possible to produce a mathematical model of color
      and thereby obtain good color accuracy.

   Device-dependent color

      Display the same RGB data on three different monitors, side by
      side, and you will get a noticeably different color balance on
      each display. This is because each monitor emits a slightly
      different shade and intensity of red, green, and blue light. RGB
      is an example of a device-dependent color model --- the color you
      get depends on the device. This also means that a particular
      color --- represented as, say, RGB 87, 146, 116 on one monitor ---
      might have to be specified as RGB 98, 123, 104 on another to
      produce the same color.

   Device-independent color

      A full physical description of a color would require specifying
      the exact spectral power distribution of the light source.
      Fortunately, the human eye and brain are not so sensitive as to
      require exact reproduction of a spectrum. Mathematical,
      device-independent color models exist that describe fairly well
      how a particular color will be seen by humans. The most important
      device-independent color model, to which all others can be
      related, was developed by the International Commission on
      Illumination (CIE, from its French name) and is called XYZ.

      In XYZ, X is the sum of a weighted power distribution over the
      whole visible spectrum. So are Y and Z, each with different
      weights. Thus any arbitrary spectral power distribution is
      condensed down to just three floating point numbers. The weights
      were derived from color matching experiments done on human
      subjects in the 1920s. CIE XYZ has been an International Standard
      since 1931, and it has a number of useful properties:

         * two colors with the same XYZ values will look the same to
           humans
         * two colors with different XYZ values will not look the same
         * the Y value represents all the brightness information
           (luminance)
         * the XYZ color of any object can be objectively measured

      Color models based on XYZ have been used for many years by people
      who need accurate control of color --- lighting engineers for film
      and TV, paint and dyestuffs manufacturers, and so on. They are
      thus proven in industrial use. Accurate, device-independent color
      started to spread from high-end, specialized areas into the
      mainstream during the late 1980s and early 1990s, and PNG takes
      notice of that trend.

   Calibrated, device-dependent color

      Traditionally, image file formats have used uncalibrated,
      device-dependent color.
      If the precise details of the original display
      device are known, it becomes possible to convert the
      device-dependent colors of a particular image to
      device-independent ones. Making simplifying assumptions, such as
      working with CRTs (which are much easier than printers), all we
      need to know are the XYZ values of each primary color and the
      CRT_gamma.

      So why does PNG not store images in XYZ instead of RGB? Well, two
      reasons. First, storing images in XYZ would require more bits of
      precision, which would make the files bigger. Second, all
      programs would have to convert the image data before viewing it.
      Whether calibrated or not, all variants of RGB are close enough
      that undemanding viewers can get by with simply displaying the
      data without color correction. By storing calibrated RGB, PNG
      retains compatibility with existing programs that expect RGB data,
      yet provides enough information for conversion to XYZ in
      applications that need precise colors. Thus, we get the best of
      both worlds.

   What are chromaticity and luminance?

      Chromaticity is an objective measurement of the color of an
      object, leaving aside the brightness information. Chromaticity
      uses two parameters x and y, which are readily calculated from
      XYZ:

         x = X / (X + Y + Z)
         y = Y / (X + Y + Z)

      XYZ colors having the same chromaticity values will appear to have
      the same hue but can vary in absolute brightness. Notice that x,y
      are dimensionless ratios, so they have the same values no matter
      what units we've used for X,Y,Z.

      The Y value of an XYZ color is directly proportional to its
      absolute brightness and is called the luminance of the color. We
      can describe a color either by XYZ coordinates or by chromaticity
      x,y plus luminance Y. The XYZ form has the advantage that it is
      linearly related to (linear, gamma=1.0) RGB color spaces.
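
      As an illustration only (not part of the PNG specification), the
      two chromaticity formulas and their reverse can be sketched in C.
      The type and function names used here (xyz_color, chromaticity,
      xyz_to_xy, xyY_to_xyz) are hypothetical, chosen for this example.
      The reverse conversion follows from the formulas themselves:
      X + Y + Z = Y / y, so X = x * Y / y and Z = (1 - x - y) * Y / y.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper types for this sketch. */
typedef struct { double X, Y, Z; } xyz_color;
typedef struct { double x, y; } chromaticity;

/* Compute chromaticity x,y from XYZ.
   Returns 0 on success, or -1 if X+Y+Z is zero, in which
   case chromaticity is undefined (for example, pure black). */
int xyz_to_xy(xyz_color c, chromaticity *out)
{
  double sum = c.X + c.Y + c.Z;

  assert(out != NULL);
  if (sum == 0.0)
    return -1;
  out->x = c.X / sum;
  out->y = c.Y / sum;
  return 0;
}

/* Rebuild XYZ from chromaticity x,y plus luminance Y.
   Returns 0 on success, or -1 if y is zero (division by zero). */
int xyY_to_xyz(chromaticity c, double Y, xyz_color *out)
{
  assert(out != NULL);
  if (c.y == 0.0)
    return -1;
  out->X = c.x * Y / c.y;
  out->Y = Y;
  out->Z = (1.0 - c.x - c.y) * Y / c.y;
  return 0;
}
```

      Black (X = Y = Z = 0) has no defined chromaticity, so this sketch
      reports an error for it rather than dividing by zero.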

   How are computer monitor colors described?

      The "white point" of a monitor is the chromaticity x,y of the
      monitor's nominal white, that is, the color produced when
      R=G=B=maximum.

      It's customary to specify monitor colors by giving the
      chromaticities of the individual phosphors R, G, and B, plus the
      white point. The white point allows one to infer the relative
      brightnesses of the three phosphors, which isn't determined by
      their chromaticities alone.

      Note that the absolute brightness of the monitor is not specified.
      For computer graphics work, we generally don't care very much
      about absolute brightness levels. Instead of dealing with
      absolute XYZ values (in which X,Y,Z are expressed in physical
      units of radiated power, such as candelas per square meter), it is
      convenient to work in "relative XYZ" units, where the monitor's
      nominal white is taken to have a luminance (Y) of 1.0. Given this
      assumption, it's simple to compute XYZ coordinates for the
      monitor's white, red, green, and blue from their chromaticity
      values.

      Why does cHRM use x,y rather than XYZ? Simply because that is how
      manufacturers print the information in their spec sheets!
      Usually, the first thing a program will do is convert the cHRM
      chromaticities into relative XYZ space.

   What can I do with it?

      If a PNG file has the gAMA and cHRM chunks, the source_RGB values
      can be converted to XYZ. This lets you:

         * do accurate grayscale conversion (just use the Y component)
         * convert to RGB for your own monitor (to see the original
           colors)
         * print the image in Level 2 PostScript with better color
           fidelity than a simple RGB to CMYK conversion could provide
         * calculate an optimal color palette
         * pass the image data to a color management system
         * etc.

   How do I convert from source_RGB to XYZ?

      Make a few simplifying assumptions first, like the monitor really
      is jet black with no input and the guns don't interfere with one
      another. Then, given that you know the CIE XYZ values for each of
      red, green, and blue for a particular monitor, you put them into a
      matrix m:

                 Xr Xg Xb
            m =  Yr Yg Yb
                 Zr Zg Zb

      Here we assume we are working with linear RGB floating point data
      in the range 0..1. If the gamma is not 1.0, make it so on the
      floating point data. Then convert source_RGB to XYZ by matrix
      multiplication:

            X     R
            Y = m G
            Z     B

      In other words, X = Xr*R + Xg*G + Xb*B, and similarly for Y and Z.
      You can go the other way too:

            R      X
            G = im Y
            B      Z

      where im is the inverse of the matrix m.

   What is a gamut?

      The gamut of a device is the subset of visible colors which that
      device can display. (It has nothing to do with gamma.) The gamut
      of an RGB device can be visualized as a polyhedron in XYZ space;
      the vertices correspond to the device's black, blue, red, green,
      magenta, cyan, yellow and white.

      Different devices have different gamuts; in other words, one
      device will be able to display certain colors (usually highly
      saturated ones) that another device cannot. The gamut of a
      particular RGB device can be determined from its R, G, and B
      chromaticities and white point (the same values given in the cHRM
      chunk). The gamut of a color printer is more complex and can only
      be determined by measurement. However, printer gamuts are
      typically smaller than monitor gamuts, meaning that there can be
      many colors in a displayable image that cannot physically be
      printed.

      Converting image data from one device to another generally results
      in gamut mismatches --- colors that cannot be represented exactly
      on the destination device.
      The process of making the colors fit,
      which can range from a simple clip to elaborate nonlinear scaling
      transformations, is termed gamut mapping. The aim is to produce a
      reasonable visual representation of the original image.

   Further reading

      References [COLOR-1] through [COLOR-5] provide more detail about
      color theory.

15. Appendix: Sample CRC Code

   The following sample code represents a practical implementation of
   the CRC (Cyclic Redundancy Check) employed in PNG chunks. (See also
   ISO 3309 [ISO-3309] or ITU-T V.42 [ITU-V42] for a formal
   specification.)

   The sample code is in the ANSI C programming language. Non-C users
   may find it easier to read with these hints:

   &
      Bitwise AND operator.

   ^
      Bitwise exclusive-OR operator. (Caution: elsewhere in this
      document, ^ represents exponentiation.)

   >>
      Bitwise right shift operator. When applied to an unsigned
      quantity, as here, right shift inserts zeroes at the left.

   !
      Logical NOT operator.

   ++
      "n++" increments the variable n.

   0xNNN
      0x introduces a hexadecimal (base 16) constant. Suffix L
      indicates a long value (at least 32 bits).

   /* Table of CRCs of all 8-bit messages. */
   unsigned long crc_table[256];

   /* Flag: has the table been computed? Initially false. */
   int crc_table_computed = 0;

   /* Make the table for a fast CRC. */
   void make_crc_table(void)
   {
     unsigned long c;
     int n, k;

     for (n = 0; n < 256; n++) {
       c = (unsigned long) n;
       for (k = 0; k < 8; k++) {
         if (c & 1)
           c = 0xedb88320L ^ (c >> 1);
         else
           c = c >> 1;
       }
       crc_table[n] = c;
     }
     crc_table_computed = 1;
   }

   /* Update a running CRC with the bytes buf[0..len-1]--the CRC
      should be initialized to all 1's, and the transmitted value
      is the 1's complement of the final running CRC (see the
      crc() routine below).
   */

   unsigned long update_crc(unsigned long crc, unsigned char *buf,
                            int len)
   {
     unsigned long c = crc;
     int n;

     if (!crc_table_computed)
       make_crc_table();
     for (n = 0; n < len; n++) {
       c = crc_table[(c ^ buf[n]) & 0xff] ^ (c >> 8);
     }
     return c;
   }

   /* Return the CRC of the bytes buf[0..len-1]. */
   unsigned long crc(unsigned char *buf, int len)
   {
     return update_crc(0xffffffffL, buf, len) ^ 0xffffffffL;
   }

16. Appendix: Online Resources

   (This appendix is not part of the formal PNG specification.)

   This appendix gives the locations of some Internet resources for PNG
   software developers. By the nature of the Internet, the list is
   incomplete and subject to change.

   Archive sites

      The latest released versions of this document and related
      information can always be found at the PNG FTP archive site,
      ftp://ftp.uu.net/graphics/png/. The PNG specification is
      available in several formats, including HTML, plain text, and
      PostScript.

   Reference implementation and test images

      A reference implementation in portable C is available from the PNG
      FTP archive site, ftp://ftp.uu.net/graphics/png/src/. The
      reference implementation is freely usable in all applications,
      including commercial applications.

      Test images are available from
      ftp://ftp.uu.net/graphics/png/images/.

   Electronic mail

      The maintainers of the PNG specification can be contacted by
      e-mail at png-info@uunet.uu.net.

   PNG home page

      There is a World Wide Web home page for PNG at
      http://quest.jpl.nasa.gov/PNG/. This page is a central location
      for current information about PNG and PNG-related tools.

17. Appendix: Revision History

   (This appendix is not part of the formal PNG specification.)

   The PNG format has been frozen since the Ninth Draft of March 7,
   1995, and all future changes are intended to be backwards compatible.
   The revisions since the Ninth Draft are simply clarifications,
   improvements in presentation, and additions of supporting material.

   Changes since the Tenth Draft of 5 May, 1995

         * Clarified meaning of a suggested-palette PLTE chunk in a
           truecolor image that uses transparency
         * Clarified exact semantics of sBIT and allowed bit depth
           scaling procedures
         * Clarified status of spaces in tEXt chunk keywords
         * Distinguished private and public extension values in type
           and method fields
         * Added a "Creation Time" tEXt keyword
         * Macintosh representation of PNG specified
         * Added discussion of security issues
         * Added more extensive discussion of gamma and chromaticity
           handling, including tutorial appendixes
         * Added a glossary
         * Editing and reformatting

18. References

   [COLOR-1]
      Hall, Roy, Illumination and Color in Computer Generated Imagery.
      Springer-Verlag, New York, 1989. ISBN 0-387-96774-5.

   [COLOR-2]
      Kasson, J., and W. Plouffe, "An Analysis of Selected Computer
      Interchange Color Spaces", ACM Transactions on Graphics, vol 11 no
      4 (1992), pp 373-405.

   [COLOR-3]
      Lilley, C., F. Lin, W.T. Hewitt, and T.L.J. Howard, Colour in
      Computer Graphics. CVCP, Sheffield, 1993. ISBN 1-85889-022-5.
      Also available from

   [COLOR-4]
      Stone, M.C., W.B. Cowan, and J.C. Beatty, "Color gamut mapping and
      the printing of digital images", ACM Transactions on Graphics, vol
      7 no 3 (1988), pp 249-292.

   [COLOR-5]
      Travis, David, Effective Color Displays --- Theory and Practice.
      Academic Press, London, 1991. ISBN 0-12-697690-2.

   [GAMMA-FAQ]
      Poynton, C., "Gamma FAQ".

   [ISO-3309]
      International Organization for Standardization, "Information
      Processing Systems --- Data Communication High-Level Data Link
      Control Procedure --- Frame Structure", IS 3309, October 1984, 3rd
      Edition.

   [ISO-8859]
      International Organization for Standardization, "Information
      Processing --- 8-bit Single-Byte Coded Graphic Character Sets ---
      Part 1: Latin Alphabet No. 1", IS 8859-1, 1987.
      Also see sample files at
      ftp://ftp.uu.net/graphics/png/documents/iso_8859-1.*

   [ITU-BT709]
      International Telecommunications Union, "Basic Parameter Values
      for the HDTV Standard for the Studio and for International
      Programme Exchange", ITU-R Recommendation BT.709 (formerly CCIR
      Rec. 709), 1990.

   [ITU-V42]
      International Telecommunications Union, "Error-correcting
      Procedures for DCEs Using Asynchronous-to-Synchronous Conversion",
      ITU-T Recommendation V.42, 1994, Rev. 1.

   [PAETH]
      Paeth, A.W., "Image File Compression Made Easy", in Graphics Gems
      II, James Arvo, editor. Academic Press, San Diego, 1991. ISBN
      0-12-064480-0.

   [POSTSCRIPT]
      Adobe Systems Incorporated, PostScript Language Reference Manual,
      2nd edition. Addison-Wesley, Reading, 1990. ISBN 0-201-18127-4.

   [PNG-EXTENSIONS]
      PNG Group, "PNG Special-Purpose Public Chunks". Available in
      several formats from
      ftp://ftp.uu.net/graphics/png/documents/pngextensions.*

   [RFC-1123]
      Braden, R., Editor, "Requirements for Internet Hosts ---
      Application and Support", STD 3, RFC 1123, USC/Information
      Sciences Institute, October 1989.

   [RFC-1521]
      Borenstein, N., and N. Freed, "MIME (Multipurpose Internet Mail
      Extensions) Part One: Mechanisms for Specifying and Describing the
      Format of Internet Message Bodies", RFC 1521, Bellcore, Innosoft,
      September 1993.

   [RFC-1590]
      Postel, J., "Media Type Registration Procedure", RFC 1590,
      USC/Information Sciences Institute, March 1994.

   [RFC-1950]
      Deutsch, P. and J-L. Gailly, "ZLIB Compressed Data Format
      Specification version 3.3", RFC 1950, Aladdin Enterprises, May
      1996.

   [RFC-1951]
      Deutsch, P., "DEFLATE Compressed Data Format Specification version
      1.3", RFC 1951, Aladdin Enterprises, May 1996.

   [SMPTE-170M]
      Society of Motion Picture and Television Engineers, "Television
      --- Composite Analog Video Signal --- NTSC for Studio
      Applications", SMPTE-170M, 1994.

19. Credits

   Editor

      Thomas Boutell, boutell@boutell.com

   Contributing Editor

      Tom Lane, tgl@sss.pgh.pa.us

   Authors

      Authors' names are presented in alphabetical order.

         * Mark Adler, madler@alumni.caltech.edu
         * Thomas Boutell, boutell@boutell.com
         * Christian Brunschen, cb@df.lth.se
         * Adam M. Costello, amc@cs.wustl.edu
         * Lee Daniel Crocker, lee@piclab.com
         * Andreas Dilger, adilger@enel.ucalgary.ca
         * Oliver Fromme, fromme@rz.tu-clausthal.de
         * Jean-loup Gailly, gzip@prep.ai.mit.edu
         * Chris Herborth, chrish@qnx.com
         * Alex Jakulin, alex@hermes.si
         * Neal Kettler, kettler@cs.colostate.edu
         * Tom Lane, tgl@sss.pgh.pa.us
         * Alexander Lehmann, alex@hal.rhein-main.de
         * Chris Lilley, chris.lilley@mcc.ac.uk
         * Dave Martindale, davem@cs.ubc.ca
         * Owen Mortensen, ojm@csi.compuserve.com
         * Robert P.
           Poole, lionboy@primenet.com
         * Glenn Randers-Pehrson, glennrp@arl.mil or
           randeg@alumni.rpi.edu
         * Greg Roelofs, newt@uchicago.edu
         * Willem van Schaik, gwillem@ntuvax.ntu.ac.sg
         * Guy Schalnat, schalnat@group42.com
         * Paul Schmidt, pschmidt@photodex.com
         * Tim Wegner, 71320.675@compuserve.com
         * Jeremy Wohl, jeremy@cs.sunysb.edu

      The authors wish to acknowledge the contributions of the Portable
      Network Graphics mailing list and the readers of comp.graphics.

   Trademarks

      GIF is a service mark of CompuServe Incorporated. IBM PC is a
      trademark of International Business Machines Corporation.
      Macintosh is a trademark of Apple Computer, Inc. Microsoft and
      MS-DOS are trademarks of Microsoft Corporation. PhotoCD is a
      trademark of Eastman Kodak Company. PostScript and TIFF are
      trademarks of Adobe Systems Incorporated. SGI is a trademark of
      Silicon Graphics, Inc. X Window System is a trademark of the
      Massachusetts Institute of Technology.

   IESG Note

      A disclaimer by the Internet Engineering Steering Group regarding
      intellectual property claims will be inserted here.

   COPYRIGHT NOTICE

      Copyright (c) 1996 by: Massachusetts Institute of Technology (MIT)

      This W3C specification is being provided by the copyright holders
      under the following license. By obtaining, using and/or copying
      this specification, you agree that you have read, understood, and
      will comply with the following terms and conditions:

      Permission to use, copy, and distribute this specification for any
      purpose and without fee or royalty is hereby granted, provided
      that the full text of this NOTICE appears on ALL copies of the
      specification or portions thereof, including modifications, that
      you make.

      THIS SPECIFICATION IS PROVIDED "AS IS," AND COPYRIGHT HOLDERS MAKE
      NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR IMPLIED.
      BY WAY OF
      EXAMPLE, BUT NOT LIMITATION, COPYRIGHT HOLDERS MAKE NO
      REPRESENTATIONS OR WARRANTIES OF MERCHANTABILITY OR FITNESS FOR
      ANY PARTICULAR PURPOSE OR THAT THE USE OF THE SPECIFICATION WILL
      NOT INFRINGE ANY THIRD PARTY PATENTS, COPYRIGHTS, TRADEMARKS OR
      OTHER RIGHTS. COPYRIGHT HOLDERS WILL BEAR NO LIABILITY FOR ANY
      USE OF THIS SPECIFICATION.

      The name and trademarks of copyright holders may NOT be used in
      advertising or publicity pertaining to the specification without
      specific, written prior permission. Title to copyright in this
      specification and any associated documentation will at all times
      remain with copyright holders.

Security Considerations

   Security issues are discussed in Security considerations (Section
   8.5).

Author's Address

   Thomas Boutell
   PO Box 20837
   Seattle, WA 98102

   Phone: (206) 329-4969

   EMail: boutell@boutell.com

End of PNG Specification