idnits 2.17.1 draft-hansen-rfc-use-of-pdf-07.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack a Security Considerations section. ** The document seems to lack an IANA Considerations section. (See Section 2.2 of https://www.ietf.org/id-info/checklist for how to handle the case when there are no actions for IANA.) Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (March 25, 2015) is 3319 days in the past. Is this intentional? 
Checking references for intended status: Informational ---------------------------------------------------------------------------- -- Looks like a reference, but probably isn't: '1' on line 653 -- Looks like a reference, but probably isn't: '2' on line 655 -- Looks like a reference, but probably isn't: '3' on line 657 -- Looks like a reference, but probably isn't: '4' on line 659 -- Looks like a reference, but probably isn't: '5' on line 661 -- Looks like a reference, but probably isn't: '6' on line 663 -- Looks like a reference, but probably isn't: '7' on line 666 == Missing Reference: 'XMP' is mentioned on line 614, but not defined -- Looks like a reference, but probably isn't: '8' on line 668 == Missing Reference: 'PDFA3' is mentioned on line 625, but not defined == Missing Reference: 'PDF' is mentioned on line 609, but not defined -- Looks like a reference, but probably isn't: '9' on line 670 == Missing Reference: 'PDFA2' is mentioned on line 621, but not defined == Missing Reference: 'PDFUA' is mentioned on line 629, but not defined -- Looks like a reference, but probably isn't: '10' on line 746 -- Looks like a reference, but probably isn't: '11' on line 771 -- Looks like a reference, but probably isn't: '12' on line 787 -- Looks like a reference, but probably isn't: '13' on line 799 -- Looks like a reference, but probably isn't: '14' on line 801 -- Looks like a reference, but probably isn't: '15' on line 803 -- Looks like a reference, but probably isn't: '16' on line 805 -- Looks like a reference, but probably isn't: '17' on line 806 -- Looks like a reference, but probably isn't: '18' on line 834 -- Looks like a reference, but probably isn't: '19' on line 834 -- Looks like a reference, but probably isn't: '20' on line 834 -- Looks like a reference, but probably isn't: '21' on line 834 -- Looks like a reference, but probably isn't: '22' on line 834 == Unused Reference: 'RFC3778' is defined on line 635, but no explicit reference was found in the text -- 
Obsolete informational reference (is this intentional?): RFC 3778 (Obsoleted by RFC 8118) == Outdated reference: A later version (-06) exists of draft-flanagan-nonascii-04 == Outdated reference: A later version (-10) exists of draft-hildebrand-html-rfc-04 Summary: 2 errors (**), 0 flaws (~~), 9 warnings (==), 24 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Network Working Group T. Hansen, Ed. 3 Internet-Draft AT&T Laboratories 4 Intended status: Informational L. Masinter 5 Expires: September 26, 2015 M. Hardy 6 Adobe 7 March 25, 2015 9 PDF for an RFC Series Output Document Format 10 draft-hansen-rfc-use-of-pdf-07 12 Abstract 14 This document discusses options and requirements for the PDF 15 rendering of RFCs in the RFC Series, as outlined in RFC 6949. It 16 also discusses the use of PDF for Internet-Drafts, and available or 17 needed software tools for producing and working with PDF. 19 Status of This Memo 21 This Internet-Draft is submitted in full conformance with the 22 provisions of BCP 78 and BCP 79. 24 Internet-Drafts are working documents of the Internet Engineering 25 Task Force (IETF). Note that other groups may also distribute 26 working documents as Internet-Drafts. The list of current Internet- 27 Drafts is at http://datatracker.ietf.org/drafts/current/. 29 Internet-Drafts are draft documents valid for a maximum of six months 30 and may be updated, replaced, or obsoleted by other documents at any 31 time. It is inappropriate to use Internet-Drafts as reference 32 material or to cite them other than as "work in progress." 34 This Internet-Draft will expire on September 26, 2015. 36 Copyright Notice 38 Copyright (c) 2015 IETF Trust and the persons identified as the 39 document authors. All rights reserved. 
41 This document is subject to BCP 78 and the IETF Trust's Legal 42 Provisions Relating to IETF Documents 43 (http://trustee.ietf.org/license-info) in effect on the date of 44 publication of this document. Please review these documents 45 carefully, as they describe your rights and restrictions with respect 46 to this document. Code Components extracted from this document must 47 include Simplified BSD License text as described in Section 4.e of 48 the Trust Legal Provisions and are provided without warranty as 49 described in the Simplified BSD License. 51 Table of Contents 53 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 54 2. Options and Requirements for PDF RFCs . . . . . . . . . . . . 3 55 2.1. "Visible" Requirements . . . . . . . . . . . . . . . . . 3 56 2.1.1. General Visible Requirements . . . . . . . . . . . . 4 57 2.1.2. Page Size, Margins . . . . . . . . . . . . . . . . . 4 58 2.1.3. Headers and Footers . . . . . . . . . . . . . . . . . 4 59 2.1.4. Paragraph Numbering . . . . . . . . . . . . . . . . . 4 60 2.1.5. Paged Content Layout Quality . . . . . . . . . . . . 5 61 2.1.6. Similarity to Other Outputs . . . . . . . . . . . . . 6 62 2.1.7. Typeface Choices . . . . . . . . . . . . . . . . . . 6 63 2.1.8. Hyphenation and Line Breaks . . . . . . . . . . . . . 7 64 2.1.9. Hyperlinks . . . . . . . . . . . . . . . . . . . . . 7 65 2.2. "Invisible" Options and Requirements . . . . . . . . . . 8 66 2.2.1. Internal Text Representation . . . . . . . . . . . . 8 67 2.2.2. Unicode Support . . . . . . . . . . . . . . . . . . . 9 68 2.2.3. Image Processing (Artwork) . . . . . . . . . . . . . 10 69 2.2.4. Text Description of Images (Alt-Text) . . . . . . . . 10 70 2.2.5. Metadata Support . . . . . . . . . . . . . . . . . . 10 71 2.2.6. Document Structure Support . . . . . . . . . . . . . 11 72 2.2.7. Tagged PDF . . . . . . . . . . . . . . . . . . . . . 11 73 2.2.8. Embedded Files . . . . . . . . . . . . . . . . . . . 11 74 2.3. 
Digital Signatures . . . . . . . . . . . . . . . . . . . 12 75 3. Choosing PDF versions and Standards . . . . . . . . . . . . . 12 76 4. References . . . . . . . . . . . . . . . . . . . . . . . . . 13 77 4.1. References . . . . . . . . . . . . . . . . . . . . . . . 13 78 4.2. Informative References . . . . . . . . . . . . . . . . . 14 79 4.3. URIs . . . . . . . . . . . . . . . . . . . . . . . . . . 14 80 Appendix A. History and Current Use of PDF with RFCs and 81 Internet-Drafts . . . . . . . . . . . . . . . . . . 15 82 A.1. RFCs . . . . . . . . . . . . . . . . . . . . . . . . . . 16 83 A.2. Internet-Drafts . . . . . . . . . . . . . . . . . . . . . 16 84 Appendix B. Tooling . . . . . . . . . . . . . . . . . . . . . . 16 85 B.1. PDF Viewers . . . . . . . . . . . . . . . . . . . . . . . 16 86 B.2. Printers . . . . . . . . . . . . . . . . . . . . . . . . 17 87 B.3. PDF Generation Libraries . . . . . . . . . . . . . . . . 17 88 B.4. Typefaces . . . . . . . . . . . . . . . . . . . . . . . . 17 89 B.5. Other Tools . . . . . . . . . . . . . . . . . . . . . . . 18 90 Appendix C. Additional Reading . . . . . . . . . . . . . . . . . 18 91 Appendix D. Acknowledgements . . . . . . . . . . . . . . . . . . 18 92 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 18 94 1. Introduction 96 The RFC Series is evolving, as outlined in [RFC6949]. Future 97 documents will use a canonical format, XML, with renderings in 98 various formats, including PDF. 100 Because PDF has a wide range of capabilities and alternatives, not 101 all PDFs are "equal". For example, visually similar documents could 102 consist of scanned or rasterized images, or include text layout 103 options, hyperlinks, embedded fonts, and digital signatures. (See 104 [1] for a history of PDF. Also see [2] and [3] for descriptions of 105 PDF/A and PDF/UA, respectively.) 107 This document explains some of the relevant options and makes 108 recommendations, both for the RFC series and Internet-Drafts.
110 The PDF format and the tools to manipulate it are not as well known 111 as those for the other RFC formats, at least in the IETF community. 112 This document discusses some of the processes for creating and working with 113 PDFs using both open source and commercial products. 115 NOTE: see [4] for XML source, related files, and an issue tracker for 116 this document. 118 2. Options and Requirements for PDF RFCs 120 This section lays out options and requirements for PDFs produced by 121 the RFC editor for RFCs. There are two subsections: "Visible" options 122 are related to how the PDF appears when it is viewed with a PDF 123 viewer. "Invisible" options affect the ability to process 124 PDFs in other ways, but do not control the way the document appears. 125 (Of course, a viewer UI might display processing capabilities, such 126 as showing if a document has been digitally signed.) 128 In many cases, the choice of PDF requirements is heavily influenced 129 by the capabilities of available tools to create PDFs. Most of the 130 discussion of tooling is to be found in Appendix B. 132 NOTE: each option in this section will eventually outline the nature 133 of the design choice, outline the pros and cons, and make a 134 recommendation. 136 2.1. "Visible" Requirements 138 PDF supports rich visible layout of fixed-sized pages. 140 2.1.1. General Visible Requirements 142 For a consistent "look" across RFCs and good style, the PDFs produced by 143 the RFC editor should have a clear, consistent, identifiable, and 144 easy-to-read style. They should print well on the widest range of 145 printers, and look good on displays of varying resolution. 147 2.1.2. Page Size, Margins 149 PDF files are laid out for a particular size of page and margins. 150 There are two paper sizes in common use: "US Letter" (8.5 x 11 151 inches, 216x279 mm, in popular use in North America) and "A4" 152 (210x297 mm, 8.27x11.7 inches, standard for the rest of the world).
153 Usually PDF printing software is used in a "shrink to fit" mode where 154 the printing is adjusted to fit the paper in the printer. There is 155 some controversy, but the argument that A4 is an international 156 standard is compelling. 158 Recommendation: The Internet-Draft and RFC processors should produce 159 A4 size by default. However, the margins and header positioning will 160 need to be chosen to look good on both paper sizes using common 161 printing methods. 163 2.1.3. Headers and Footers 165 Page headers and footers are part of the page layout. There are a 166 variety of options. Note that page headers and footers in PDF can be 167 typeset in a way that the entire (longer) title might fit. 169 Recommendation: Page headers and footers should contain information 170 similar to the headers in the current text versions of 171 documents, including page numbers, title, author, and working group. 172 However, the page headers and footers should be typeset so 173 as to be unobtrusive. The page headers and footers should be placed 174 into the PDF in a way that does not interfere with screen readers. 176 2.1.4. Paragraph Numbering 178 One common feature of the Internet-Draft output formats is optional 179 visible paragraph numbers, to aid in discussions. In the PDF and 180 thus printed rendition, it is possible to make paragraph numbers 181 unobtrusive, and even to impinge on the margins. 183 Recommendation: When the XML "editing=yes" option has been chosen, 184 show paragraph numbers in the right margin, typeset so as to 185 be unobtrusive. (The right margin instead of the left margin 186 prevents the paragraph numbers from being confused with the section 187 numbers.) If possible, the paragraph numbers should be coded in a 188 way that they do not interfere with screen readers. 190 2.1.5.
Paged Content Layout Quality 192 The process of creating a paged document from running text typically 193 involves ensuring that related material is present on the same page 194 together, and that artifacts of pagination don't interfere with easy 195 reading of the document. Typical high-quality layout processors do 196 several things: 198 Widow and Orphan Management: Widows and orphans ([5]) should be 199 avoided automatically (unless the entire paragraph is only one 200 line). Ensure that a page break does not occur after the first 201 line of a paragraph (orphans), if necessary, using slightly longer 202 page sizes. Similarly, ensure that a page break does not occur 203 before the last line of a paragraph (widows). 205 Keep Section Heading Contiguous: Do not page break immediately after 206 a section heading. If there isn't room on a page for the first 207 (two) lines of a section after the section heading, page break 208 before the heading. 210 Avoid Splitting Artwork: Figures should not be split from figure 211 titles. If possible, keep the figure on the same page as the 212 (first) mention of the figure. 214 Headers for Long Tables after Page Breaks: Another common option in 215 producing paginated documents is to include the column headings of 216 a table if the table cannot be displayed on a single page. 217 Similarly, tables should not be split from the table titles. 219 keepWithNext and keepWithPrevious: The XML attributes of 220 "keepWithNext" and "keepWithPrevious" should be followed whenever 221 possible. 223 Whitespace Preservation: The XML entities such as NBSP and NBHYPHEN 224 should be followed as directed whenever possible. 226 Layout engines differ in the quality of the algorithms used to 227 automate these processes. In some cases, the automated processes 228 require some manual assistance to ensure, for example, that a text 229 line intended as a heading is "kept" with the text it is heading for.
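The widow and orphan rule above can be illustrated with a small sketch. This is not how any particular layout engine works; it is a minimal greedy paginator written for illustration, and the function name, the tuple layout, and the choice to end a page early (rather than lengthen it slightly, which the text also allows) are all assumptions.

```python
def paginate(paragraph_lengths, page_lines):
    """Split paragraphs (given as line counts) into pages.

    Returns a list of pages; each page is a list of
    (paragraph_index, first_line, end_line_exclusive) tuples.
    A page is ended early rather than leaving the first line of a
    paragraph alone at the bottom (orphan) or the last line alone at
    the top of the next page (widow).
    """
    if page_lines < 3:
        # Hypothetical lower bound that guarantees progress on an empty page.
        raise ValueError("page_lines must be at least 3")
    pages, current, room = [], [], page_lines
    for index, total in enumerate(paragraph_lengths):
        start = 0
        while start < total:
            remaining = total - start
            if remaining <= room:
                take = remaining          # the rest of the paragraph fits
            else:
                take = room
                if take > 0 and remaining - take == 1:
                    take -= 1             # would leave a widow: hold one line back
                if start == 0 and take == 1:
                    take = 0              # would leave an orphan: move to next page
            if take > 0:
                current.append((index, start, start + take))
                start += take
                room -= take
            if start < total:             # paragraph continues on a new page
                pages.append(current)
                current, room = [], page_lines
    if current:
        pages.append(current)
    return pages


# A 4-line and a 5-line paragraph on 8-line pages: the second paragraph
# is split 3 + 2 instead of 4 + 1.
print(paginate([4, 5], 8))
```

A production layout engine would weigh these constraints together with heading, figure, and table constraints instead of handling them greedily.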
231 Recommendation: Choose a layout engine so that manual intervention is 232 minimized, and so that widow and orphan processing and the keeping of headings and titles 233 with their content are automatic. 235 2.1.6. Similarity to Other Outputs 237 There is some advantage to having the PDF files look like the text or 238 HTML renderings of the same document. There are several options even 239 so. The PDF 241 1. could look like the text version of the document, or 243 2. could look like the text version of the document but with 244 pictures rendered as pictures instead of using their ASCII-art 245 equivalent, or 247 3. could look like the HTML version. 249 Recommendation: the PDF rendition should look like the HTML 250 rendition, at least in spirit. Some differences from the HTML 251 rendition would include different typeface and size (chosen for 252 printing), page numbers in the table of contents and index, and the 253 use of page headers and footers. 255 Most of the choices used for the HTML rendering are thus applicable: 257 TBA 259 Most of the choices used for the CSS rendering are also applicable: 261 TBA 263 2.1.7. Typeface Choices 265 A PDF may refer to a font by name, or it may use an embedded font. 266 When a font is not embedded, a PDF viewer will attempt to locate a 267 locally installed font of the same name. If it cannot find an exact 268 match, it will find a "close match". If a close match is not 269 available, it will fall back to something implementation dependent 270 and usually undesirable. 272 In addition, the PDF/A standards mandate the embedding of fonts. 273 Preferably, the software generating the files would produce PDF/ 274 A-conforming files directly, thus ensuring that all glyphs include 275 Unicode mappings and embedded fonts from the outset. 277 If the HTML version of the document is being visually mimicked, the 278 font(s) chosen should have both variable width and constant width 279 components, as well as bold and italic representations.
281 The typefaces used by Internet-Drafts and by RFCs need not be 282 identical. 284 Few fonts have glyphs for the entire repertoire of Unicode 285 characters; for this purpose, the PDF generation tool may need a set 286 of fonts and a way of choosing them. The RFC Editor is defining 287 where Unicode characters may be used within 288 RFCs [I-D.flanagan-nonascii]. 290 Typefaces are typically licensed and, in many cases, there is a fee 291 for use by PDF creation tools, but not for display or printing of 292 the embedded fonts. 294 Recommendations: 296 o For consistent viewing, all fonts should be 297 embedded. The fonts used must be available for use by the IETF 298 community. 300 o The choice of typefaces with respect to serif, sans serif, 301 monospace, etc., should follow the recommendations for HTML and 302 CSS rendering [I-D.hildebrand-html-rfc]. 304 o The range of Unicode characters allowed in the XML source for 305 Internet-Drafts and RFCs may be bounded by the availability of 306 embeddable fonts with appropriate glyphs [I-D.flanagan-nonascii]. 308 2.1.8. Hyphenation and Line Breaks 310 When doing page layout of running text, especially with 311 narrow page width and long words, layout processors of English text 312 often have the option of hyphenating words, or using existing hyphens 313 as a place to introduce line breaks. However, a mid-word line break 314 is actively harmful when the "word" is actually a sequence of characters 315 representing a protocol element or protocol sequence. 318 Recommendation: avoid introducing hyphenated line breaks mid-word 319 into the visual display, consistent with requirements for plain text 320 and HTML. 322 2.1.9. Hyperlinks 324 PDF supports hyperlinks both to sections of the same document and to 325 other documents.
327 The conversion to PDF can generate: 329 o hyperlinks within the document 331 o hyperlinks to other RFCs and Internet-Drafts 332 o hyperlinks to external locations 334 o hyperlinks within a table of contents 336 o hyperlinks within an index 338 One question that must be answered is where hyperlinks to RFCs should 339 point: to the info page for the RFC, or to the PDF version of the RFC? 341 Similar questions need to be answered for references to Internet- 342 Drafts: should hyperlinks to Internet-Drafts point to the 343 datatracker entry, to the tools entry, or to a PDF version of the 344 Internet-Draft? 346 Recommendations: 348 o All hyperlinks available in the HTML rendition of the RFC should 349 also be visible and active in the PDF produced. This includes 350 both internal hyperlinks and hyperlinks to external resources. 352 o The table of contents, including page numbers, is useful when 353 printed. Its entries should also be hyperlinked to their respective 354 sections. 356 o Hyperlinks to RFCs from the references section should point to the 357 RFC "info" page, which then links to the various formats 358 available. 360 o Hyperlinks to Internet-Drafts from the references section should 361 point to the datatracker entry page for the draft, which then 362 links to the various formats available. 364 2.2. "Invisible" Options and Requirements 366 PDF offers a number of features which improve the utility of PDF 367 files in a variety of workflows, at the cost of extra effort in the 368 xml2rfc conversion process; the tradeoffs may be different for the 369 RFC editor production of RFCs and for Internet-Drafts. 371 2.2.1. Internal Text Representation 373 The contents of a PDF file can be represented in many ways. The PDF 374 file could be generated: 376 o as an image of the visual representation, such as a JPEG image of 377 the word "IETF". That is, there might be no internal 378 representation of letters, words or paragraphs at all.
380 o by placing individual characters in position on the page, such as 381 saying "put an 'F' here", then "put a 'T' before it", then "put 382 an 'E' before that", then "put an 'I' before that" to render the 383 word "IETF". That is, there might be no internal representation 384 of words or paragraphs at all. 386 o by placing words in position on the page, such that the word 387 "IETF" would be kept together. That is, there might be no 388 internal representation of paragraphs at all. 390 o by ensuring that the running order of text in the content stream 391 matches the logical reading order. That is, a sentence 392 such as 'The Internet Engineering Task Force (IETF) supports the 393 Internet.' would be kept together as a sentence, and multiple 394 sentences within a paragraph would be kept together. 396 All of these end up with essentially the same visual representation 397 of the output. However, each level has tradeoffs for auxiliary uses, 398 such as searching or indexing, commenting and annotation, and 399 accessibility (text-to-speech). Keeping the running order of text in 400 the content stream in the proper order supports all of these auxiliary 401 uses. 403 In addition, the "role map" feature of PDF ([6]) would 404 allow for the mapping of the logical tags found in the original XML 405 into tags in the PDF. 407 Recommendations: 409 o Text in content streams should follow the XML document's logical 410 order (in the order of tags) to the extent possible. This will 411 provide optimal reuse by software that does not understand Tagged 412 PDF. (PDF/UA requires this.) 414 o It might be possible to use the "role map" annotation to capture 415 enough of the xml2rfc source structure, to the point where it is 416 possible to reconstruct the XML source structure completely. 417 However, there is not a compelling case to do so over embedding 418 the original XML, as described in Section 2.2.8. 420 2.2.2.
Unicode Support 422 PDF itself does not require use of Unicode. Text is represented as a 423 sequence of glyphs which then can be mapped to Unicode. 425 Recommendations: 427 PDF files generated must have the full text, as it appears in the 428 original XML. 430 Unicode normalization may occur. 432 Text within SVG images should also have Unicode mappings. 434 Alt-text for images should also support Unicode. 436 2.2.3. Image Processing (Artwork) 438 The XML allows both ASCII art and SVG to be used for artwork. 440 Recommendations: 442 If both ASCII art and SVG are available for a picture, the SVG 443 artwork should be preferred over the ASCII artwork. 445 ASCII artwork must be rendered using a monospace font. 447 2.2.4. Text Description of Images (Alt-Text) 449 Guidelines for accessibility of PDF [7] recommend that images, 450 formulas, and other non-text items provide textual alternatives, 451 using the '/Alt' property in PDF to provide human-readable text that can 452 be vocalized by text-to-speech technology. 454 Recommendation: Any alt-text for artwork and figures available in the 455 XML source should be stored using the PDF /Alt property. Internet-Draft 456 authors and the RFC editor should ensure that alt-text is included 457 for all SVG and other images within the XML source. 459 2.2.5. Metadata Support 461 Metadata encodes information about the document authors, the document 462 series, date created, etc. Having this metadata within the PDF file 463 allows it to be used by search engines, viewers and other reuse 464 tools. PDF supports embedded metadata in a variety of ways, 465 including the Extensible Metadata Platform (XMP) [XMP]. 466 The RFC editor maintains metadata about an RFC on its info page. 468 Recommendation: The PDFs generated should have all of the metadata 469 from the XML version embedded directly as XMP metadata, including the 470 author, date, the document series, and a URL for where the document 471 can be retrieved.
This information should be consistent with the RFC 472 editor info page at the time of publication. 474 2.2.6. Document Structure Support 476 PDF supports an "outline" feature where sections of the document are 477 marked; this could be used in addition to the table of contents as a 478 navigation aid. 480 The section structure of an RFC can be mapped into the PDF elements 481 for the document structure. This will allow the bookmark feature of 482 PDF readers to be used to quickly access sections of the document. 484 Recommendation: The section structure of an RFC should be mapped into 485 the PDF elements for the document structure. This would include 486 section headings for the boilerplate sections such as the Abstract, 487 Status of This Memo, Table of Contents, and Authors' Addresses, plus 488 the obvious section headings that are normally included in the 489 Table of Contents. If possible, this should be done in a way that 490 the same fragment identifiers used for the HTML version of the RFC will 491 work for the PDF version. 493 2.2.7. Tagged PDF 495 NOTE: say more about the use of alternative texts for images, tagging 496 text spans, and providing replacement texts for symbols and images. 497 A role-map could be provided here to map the logical tags found in 498 the RFC XML to the standard tagset for PDF. This could be included 499 in the generated PDF. (See also [8].) 501 2.2.8. Embedded Files 503 PDF has the capability of including other files; the files may be 504 labeled both by a media type and by a role (the AFRelationship key 505 [PDFA3]). In this way, the PDF file acts also as a container. 507 Embedded content may be compressed. 509 Many PDF viewers support the ability to view and extract embedded 510 files, although this capability is not universal. 512 Embedding content in the PDF file allows the PDF to act as a complete 513 package, which can be transformed, archived, and digitally signed.
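To make the labeling by media type and role concrete, the following sketch builds the two PDF objects involved: a compressed embedded file stream and a file specification dictionary carrying the AFRelationship key. It is a fragment for illustration only, not a complete PDF writer. The function names, the object numbers 10 and 11, and the file name "rfc.xml" are hypothetical, and the sketch assumes ASCII names without parentheses or backslashes.

```python
import zlib

# Characters left unescaped in a PDF name; anything else is written as #xx.
_NAME_SAFE = set(
    "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-_.+"
)


def pdf_name(text):
    """Return ASCII text as a PDF name object (application/xml -> /application#2Fxml)."""
    return "/" + "".join(c if c in _NAME_SAFE else "#%02X" % ord(c) for c in text)


def embedded_file_objects(filename, media_type, data,
                          relationship="Source", stream_obj=10, spec_obj=11):
    """Build an embedded file stream and its file specification, as bytes.

    The object numbers are placeholders; a real writer would allocate them
    and would also reference the file specification from the document.
    """
    compressed = zlib.compress(data)  # "Embedded content may be compressed."
    stream_dict = (
        "<< /Type /EmbeddedFile /Subtype %s /Filter /FlateDecode "
        "/Length %d /Params << /Size %d >> >>"
        % (pdf_name(media_type), len(compressed), len(data))
    )
    stream = (
        ("%d 0 obj\n%s\nstream\n" % (stream_obj, stream_dict)).encode("ascii")
        + compressed
        + b"\nendstream\nendobj\n"
    )
    spec = (
        "%d 0 obj\n<< /Type /Filespec /F (%s) /UF (%s) "
        "/AFRelationship /%s /EF << /F %d 0 R >> >>\nendobj\n"
        % (spec_obj, filename, filename, relationship, stream_obj)
    ).encode("ascii")
    return stream, spec


stream, spec = embedded_file_objects("rfc.xml", "application/xml", b"<rfc/>")
print(spec.decode("ascii"))
```

In a complete file, the file specification would additionally need to be reachable from the document catalog; that wiring, and the choice of relationship value for each embedded file, are left out of this sketch.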
514 Useful possibilities: 516 Embed the source XML input file itself within the PDF. If the 517 source SVG and images for illustrations are also embedded, this 518 would make the PDF file totally self-contained. 520 Embed directly extractable components that are useful for 521 independent processing, including ABNF, MIBs, and source code for 522 reference implementations. This capability might be supported 523 through other mechanisms from the XML source files, but could also 524 be supported within the PDF. 526 Finding, extracting and embedding other components may require 527 additional markup to clearly identify them, and additional review 528 to ensure the correctness of embedded files that are not visible. 530 Recommendations: 532 Embed the XML source and all illustrations, for RFCs, as a 533 standard feature for xml2rfc's PDF output. 535 If possible, make this a standard feature for Internet-Drafts as 536 well. 538 Named entries should be embedded. 540 Image files (SVG sources, JPEGs, PNGs, etc.) should be 541 embedded. 543 2.3. Digital Signatures 545 PDF has supported digital signatures since PDF 1.2. There are 546 multiple methods for signing PDF files. The signature is intended to 547 apply not only to the bits in the file (that they haven't been 548 modified) but also to lock down the visual presentation. 550 Normally, the authenticity of RFC files is not an issue, since the 551 RFC editor maintains a repository of all RFCs which is widely 552 replicated. However, the RFC Editor and staff are at times called to 553 provide evidence that a particular RFC is the "original" and has not 554 been visually modified, and there may be other use cases. As 555 signatures also apply to embedded content, embedding the XML source 556 will provide a way of signing the source XML as well. 558 Recommendation: PDFs produced by the RFC editor should be signed with 559 a PDF digital signature.
The management of certificates for the RFC 560 editor function needs further review. 562 Recommendation: At this time, the authors see no need for Internet- 563 Drafts to be signed with a PDF digital signature. 565 3. Choosing PDF versions and Standards 567 PDF has gone through several revisions, primarily for the addition of 568 features. PDF features have generally been added in a way that older 569 viewers 'fail gracefully'. Even so, the older the PDF version 570 produced, the more legacy viewers will support that version, but the 571 fewer features will be available. 573 As PDF has evolved a broad set of capabilities, additional standards 574 for PDF files are applicable. These standards establish ground rules 575 that are important for specific applications. For example, PDF/X was 576 specifically designed for prepress digital data exchange, with 577 careful attention to color management and printing instructions, 578 while the PDF/E standard was designed for engineering documents. 580 Two additional standards families are important to the RFC format, 581 though: long-term preservation (PDF/A), and user accessibility 582 (PDF/UA). These then have sub-profiles (PDF/A-1, PDF/A-2, PDF/A-3), each 583 of which has conformance levels. These standards are then supported 584 by various software libraries and tools. 586 It is effective and useful to use these standards to capture PDF for 587 RFC requirements, and they will make the PDF files useful in 588 workflows that expect them. 590 Recommendations: 592 Use PDF 1.7; although relatively recent, it is well supported by 593 widely available viewers. 595 For RFCs, require PDF/A-3 with conformance level "U". This 596 captures the archivability and long-term stability of PDF 1.7 597 files, mandatory Unicode mapping, and many of the required 598 features. 600 Use PDF/A-3 for embedding additional data (including the XML 601 source file) in RFCs and Internet-Drafts. 603 Use PDF/UA. 605 4. References 607 4.1.
References 609 [PDF] ISO, "Portable document format -- Part 1: PDF 1.7", ISO 610 32000-1, 2008. 612 Also available free from Adobe. 614 [XMP] ISO, "Extensible metadata platform (XMP) specification -- 615 Part 1: Data model, serialization and core properties", 616 ISO 16684-1, 2012. 618 Not available free, but there are a number of descriptive 619 resources, e.g., [9] 621 [PDFA2] ISO, "Electronic document file format for long-term 622 preservation -- Part 2: Use of ISO 32000-1 (PDF/A-2)", 623 ISO 19005-2, 2011. 625 [PDFA3] ISO, "Electronic document file format for long-term 626 preservation -- Part 3: Use of ISO 32000-1 with support 627 for embedded files (PDF/A-3)", ISO 19005-3, 2012. 629 [PDFUA] ISO, "Electronic document file format enhancement for 630 accessibility -- Part 1: Use of ISO 32000-1 (PDF/UA-1)", 631 ISO 14289-1, 2012. 633 4.2. Informative References 635 [RFC3778] Taft, E., Pravetz, J., Zilles, S., and L. Masinter, "The 636 application/pdf Media Type", RFC 3778, May 2004. 638 [RFC6949] Flanagan, H. and N. Brownlee, "RFC Series Format 639 Requirements and Future Development", RFC 6949, May 2013. 641 [I-D.flanagan-nonascii] 642 Flanagan, H., "The Use of Non-ASCII Characters in RFCs", 643 draft-flanagan-nonascii-04 (work in progress), January 644 2015. 646 [I-D.hildebrand-html-rfc] 647 Hildebrand, J. and H. Flanagan, "HyperText Markup Language 648 Request For Comments Format", draft-hildebrand-html-rfc-04 649 (work in progress), October 2014. 651 4.3.
URIs 653 [1] https://en.wikipedia.org/wiki/Portable_Document_Format 655 [2] https://en.wikipedia.org/wiki/PDF/A 657 [3] https://en.wikipedia.org/wiki/PDF/UA 659 [4] https://github.com/masinter/pdfrfc 661 [5] https://en.wikipedia.org/wiki/Widows_and_orphans 663 [6] http://help.adobe.com/en_US/acrobat/X/pro/using/ 664 WS58a04a822e3e50102bd615109794195ff-7cd8.w.html 666 [7] http://www.w3.org/TR/WCAG20-TECHS/PDF1.html 668 [8] http://www.pdfa.org/2011/10/the-value-of-tagged-pdf/ 670 [9] http://en.wikipedia.org/wiki/List_of_PDF_software#Viewers 672 [10] http://www.i-programmer.info/news/136-open-source/7433-google- 673 open-sources-pdf-software-library.html 675 [11] http://greenbytes.de/tech/webdav/rfc2629xslt/ 676 rfc2629xslt.html#output.pdf.fop 678 [12] https://sourceforge.net/projects/ 679 sourcesans.adobe/?source=directory 681 [13] https://sourceforge.net/projects/ 682 sourceserifpro.adobe/?source=directory 684 [14] https://sourceforge.net/projects/ 685 sourcecodepro.adobe/?source=drectory 687 [15] https://www.rosettatype.com/Skolar 689 [16] https://www.google.com/get/noto/ 691 [17] http://www.pdflib.com/fileadmin/pdflib/pdf/whitepaper/ 692 Whitepaper-Technical-Introduction-to-PDFA.pdf 694 [18] http://www.pdfa.org/wp-content/uploads/2011/08/ 695 tn0003_metadata_in_pdfa-1_2008-03-128.pdf 697 [19] http://www.pdfa.org/wp-content/uploads/2011/08/PDFA-in- 698 a-Nutshell_1b.pdf 700 [20] http://www.pdfa.org/2011/08/pdfa-%E2%80%93-a-look-at-the- 701 technical-side/ 703 [21] http://pdf.editme.com/pdfa 705 Appendix A. History and Current Use of PDF with RFCs and Internet- 706 Drafts 708 NOTE: this section is meant as an overview to give some background. 710 A.1. RFCs 712 The RFC series has for a long time accepted Postscript renderings of 713 RFCs, either in addition to or instead of the text renderings of 714 those same RFCs. These have usually been produced when there was a 715 complicated figure or mathematics within the document. 
For example, 716 consider the figures and mathematics found in RFC 1119 and RFC 1142, 717 and compare the figures found in the text version of RFC 3550 with 718 those in the PostScript version. The RFC Editor has also provided PDF 719 renderings of RFCs. Usually, this has been a print of the text file 720 that does not take advantage of any of the broader PDF functionality, 721 unless there was a PostScript version of the RFC, which would then be 722 used by the RFC Editor to generate the PDF. 724 A.2. Internet-Drafts 726 In addition to PDFs generated and published by the RFC Editor, the 727 IETF tools community has also long supported PDF for Internet-Drafts. 728 Most RFCs start as Internet-Drafts, edited by individual authors. 729 The Internet-Drafts submission tool at https://datatracker.ietf.org/ 730 submit/ accepts PDF and PostScript files in addition to the 731 (required) text submission and (currently optional) XML. If a PDF 732 was not submitted for a particular version of an Internet-Draft, the 733 tools would generate one from the PostScript, HTML, or text. 735 Appendix B. Tooling 737 This section discusses tools for viewing, comparing, creating, 738 manipulating, and transforming PDF files, including those currently in 739 use by the RFC Editor and for Internet-Drafts, and outlines 740 available PDF tools for various processes. 742 B.1. PDF Viewers 744 As with most file formats, PDF files are experienced through a reader 745 or viewer, and there are numerous viewers. One partial 746 list of PDF viewers can be found at [10]. 748 PDF viewers vary in capabilities, and it is important to note which 749 PDF viewers support the features utilized in PDF RFCs and Internet- 750 Drafts (features such as links, digital signatures, Tagged PDF and 751 others mentioned in Section 2). 
753 A survey of the IETF community might broaden the list of viewers in 754 common use, but an initial list to consider includes both 755 currently maintained and supported viewers and legacy systems. 756 Maintained viewers include: 758 Adobe Reader Multiple platforms. Supports all of the features on 759 most platforms. 761 Google Chrome Multiple platforms. Web browser that includes PDF 762 support. Rapidly moving target, open source. 764 PDF.js Multiple platforms. A JavaScript library to convert PDF 765 files into HTML5, usable as a web-based viewer that can be 766 included in web browsers. Used by Mozilla Firefox. Also a rapidly 767 moving target. 769 Foxit Reader Multiple platforms. PDF viewer/reader for desktop 770 computers and mobile devices. Recently licensed by Google, and the 771 code for this purpose was made open source; see [11]. 773 Several "legacy" viewers to consider include: Ghostview, Xpdf. 775 B.2. Printers 777 Printing is one of the most important use cases for PDFs, and 778 almost all viewers support printing of PDF files. Some printers have 779 direct PDF support. 781 B.3. PDF Generation Libraries 783 Because the xml2rfc format is unique, software for 784 converting XML source documents to the various formats will be 785 needed, including PDF generation. 787 One promising direction is suggested in [12]: using XSLT to generate 788 XSL-FO, which is then processed by a formatting object processor such 789 as Apache FOP. 791 Several libraries are also available for generating PDF signatures. 793 B.4. Typefaces 795 This section is intended to discuss available typefaces that might 796 satisfy requirements. Some openly available typefaces 797 (without extensive Unicode support, however; only Source Code Pro is fixed-width) include: 799 o Source Sans [13] 801 o Source Serif Pro [14] 803 o Source Code Pro [15] 804 A font that looks promising for its broad Unicode support is Skolar 805 [16], but it requires licensing. 
Another potentially useful set of 806 typefaces is the Noto [17] family from Google. 808 B.5. Other Tools 810 In addition to generating and viewing PDF, other categories of PDF 811 tools are available and may be useful both during specification 812 development and for published RFCs. These include tools for 813 comparing two PDFs, checkers that could be used to validate the 814 results of conversion, reviewing and commentary tools that attach 815 annotations to PDF files, and digital signature creation and 816 validation. 818 Validation of an arbitrary author-generated PDF file would be quite 819 difficult; there are few PDF validation tools. However, if RFCs and 820 Internet-Drafts are generated by conversion from XML via xml2rfc, 821 then explicit validation of PDF and adherence to expected profiles 822 would mainly be useful to ensure that xml2rfc has functioned 823 properly. 825 Recommendations: 827 o Discourage (but allow) submission of a PDF representation for 828 Internet-Drafts. In most cases, the PDF for an Internet-Draft 829 should be produced automatically when XML is submitted, with an 830 opportunity to verify the conversion. 832 Appendix C. Additional Reading 834 [18] [19] [20] [21] [22] 836 Appendix D. Acknowledgements 838 The input of the following people is gratefully acknowledged: Brian 839 Carpenter, Chris Dearlove, Martin Duerst, Heather Flanagan, Joe 840 Hildebrand, Duff Johnson, Leonard Rosenthol, .... 842 Authors' Addresses 844 Tony Hansen (editor) 845 AT&T Laboratories 846 200 Laurel Ave. South 847 Middletown, NJ 07748 848 USA 850 Email: tony+rfc2pdf@maillennium.att.com 851 Larry Masinter 852 Adobe 853 345 Park Ave 854 San Jose, CA 95110 855 USA 857 Email: masinter@adobe.com 858 URI: http://larry.masinter.net 860 Matthew Hardy 861 Adobe 862 345 Park Ave 863 San Jose, CA 95110 864 USA 866 Email: mahardy@adobe.com