idnits 2.17.1 draft-iab-rfc-use-of-pdf-02.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack a Security Considerations section. ** The document seems to lack an IANA Considerations section. (See Section 2.2 of https://www.ietf.org/id-info/checklist for how to handle the case when there are no actions for IANA.) Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (May 17, 2016) is 2901 days in the past. Is this intentional? 
Checking references for intended status: Informational ---------------------------------------------------------------------------- == Missing Reference: 'PDF' is mentioned on line 614, but not defined == Missing Reference: 'PDFUA' is mentioned on line 635, but not defined == Missing Reference: 'PDFA2' is mentioned on line 627, but not defined == Missing Reference: 'PDFA3' is mentioned on line 631, but not defined == Missing Reference: 'XMP' is mentioned on line 619, but not defined -- Looks like a reference, but probably isn't: '1' on line 811 -- Looks like a reference, but probably isn't: '2' on line 813 -- Looks like a reference, but probably isn't: '3' on line 815 -- Looks like a reference, but probably isn't: '4' on line 818 -- Looks like a reference, but probably isn't: '5' on line 819 == Outdated reference: A later version (-05) exists of draft-hardy-pdf-mime-01 == Outdated reference: A later version (-03) exists of draft-housley-rfc-and-id-signatures-02 Summary: 2 errors (**), 0 flaws (~~), 8 warnings (==), 6 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Network Working Group T. Hansen, Ed. 3 Internet-Draft AT&T Laboratories 4 Intended status: Informational L. Masinter 5 Expires: November 18, 2016 M. Hardy 6 Adobe 7 May 17, 2016 9 PDF for an RFC Series Output Document Format 10 draft-iab-rfc-use-of-pdf-02 12 Abstract 14 This document discusses options and requirements for the PDF 15 rendering of RFCs in the RFC Series, as outlined in RFC 6949. It 16 also discusses the use of PDF for Internet-Drafts, and available or 17 needed software tools for producing and working with PDF. 19 Status of This Memo 21 This Internet-Draft is submitted in full conformance with the 22 provisions of BCP 78 and BCP 79. 24 Internet-Drafts are working documents of the Internet Engineering 25 Task Force (IETF). 
Note that other groups may also distribute 26 working documents as Internet-Drafts. The list of current Internet- 27 Drafts is at http://datatracker.ietf.org/drafts/current/. 29 Internet-Drafts are draft documents valid for a maximum of six months 30 and may be updated, replaced, or obsoleted by other documents at any 31 time. It is inappropriate to use Internet-Drafts as reference 32 material or to cite them other than as "work in progress." 34 This Internet-Draft will expire on November 18, 2016. 36 Copyright Notice 38 Copyright (c) 2016 IETF Trust and the persons identified as the 39 document authors. All rights reserved. 41 This document is subject to BCP 78 and the IETF Trust's Legal 42 Provisions Relating to IETF Documents 43 (http://trustee.ietf.org/license-info) in effect on the date of 44 publication of this document. Please review these documents 45 carefully, as they describe your rights and restrictions with respect 46 to this document. Code Components extracted from this document must 47 include Simplified BSD License text as described in Section 4.e of 48 the Trust Legal Provisions and are provided without warranty as 49 described in the Simplified BSD License.
51 Table of Contents
53 1. Introduction
54 2. Choosing PDF versions and Standards
55 3. Options and Requirements for PDF RFCs
56   3.1. "Visible" Requirements
57     3.1.1. General Visible Requirements
58     3.1.2. Page Size, Margins
59     3.1.3. Headers and Footers
60     3.1.4. Paragraph Numbering
61     3.1.5. Paged Content Layout
62     3.1.6. Typeface Choices
63     3.1.7. Hyphenation and Line Breaks
64     3.1.8. Hyperlinks
65     3.1.9. Similarity to Other Outputs
66   3.2. "Invisible" Options and Requirements
67     3.2.1. Internal Text Representation
68     3.2.2. Unicode Support
69     3.2.3. Image Processing (Artwork)
70     3.2.4. Text Description of Images (Alt-Text)
71     3.2.5. Metadata Support
72     3.2.6. Document Structure Support
73     3.2.7. Embedded Files
74   3.3. Digital Signatures
75 4. References
76   4.1. References
77   4.2. Informative References
78 Appendix A. History and Current Use of PDF with RFCs and 79 Internet-Drafts
80   A.1. RFCs
81   A.2. Internet-Drafts
82 Appendix B. Paged Content Layout Quality
83 Appendix C. Tooling
84   C.1. PDF Viewers
85   C.2. Printers
86   C.3. PDF Generation Libraries
87   C.4. Typefaces
88   C.5. Other Tools
89 Appendix D. Acknowledgements
90 Authors' Addresses
92 1. Introduction
94 The RFC Series is evolving, as outlined in [RFC6949]. Future 95 documents will use a canonical format, XML, with renderings in 96 various formats, including PDF.
98 Because PDF has a wide range of capabilities and alternatives, not 99 all PDFs are "equal". For example, visually similar documents could 100 consist of scanned or rasterized images, or include text layout 101 options, hyperlinks, embedded fonts, and digital signatures. (See 102 [I-D.hardy-pdf-mime] for a history of PDF.) 104 This document explains some of the relevant options and makes 105 recommendations, both for the RFC series and Internet-Drafts. 107 The PDF format and the tools to manipulate it are not as well known 108 as those for the other RFC formats, at least in the IETF community. 109 This document discusses some of the processes for creating and using 110 PDFs using both open source and commercial products. 112 The details described in this document are expected to change based 113 on experience gained in implementing the RFC production center's 114 toolset. Revised documents will be published capturing those changes 115 as the toolset is completed. Other implementers must not expect 116 those changes to remain backwards-compatible with the details 117 described in this document. 119 NOTE: [RFC-EDITOR: This note should be removed before publication.] 120 See for XML source, related 121 files, and an issue tracker for this document. 123 2. Choosing PDF versions and Standards 125 PDF [PDF] has gone through several revisions, primarily for the 126 addition of features. PDF features have generally been added in a 127 way that older viewers 'fail gracefully', but even so, the older the 128 PDF version produced, the more legacy viewers will support that 129 version, but the fewer features will be enabled. 131 As PDF has evolved a broad set of capabilities, additional standards 132 for PDF files are applicable. These standards establish ground rules 133 that are important for specific applications. For example PDF/X was 134 specifically designed for Prepress digital data exchange, with 135 careful attention to color management and printing instructions. 
The 136 PDF/E standard was designed for engineering documents with dynamic 137 workflows (where a document continues to be revised after 138 publication) and allows interactive media (including animation and 139 3D). 141 Two additional standards families are important to the RFC format, 142 though: long-term preservation (PDF/A), and user accessibility (PDF/ 143 UA [PDFUA]). PDF/A has sub-profiles (PDF/A-1, PDF/A-2 [PDFA2], 144 PDF/A-3 [PDFA3]), each of which has conformance levels. These 145 standards are in turn supported by various software libraries and tools. 147 It is effective and useful to use these standards to capture the PDF 148 requirements for RFCs, and they will make the PDF files useful in 149 workflows that expect them. 151 Recommendations: 153 Use PDF 1.7; although relatively recent, it is well supported by 154 widely available viewers. 156 For RFCs, require PDF/A-3 with conformance level "U". This 157 captures the archivability and long-term stability of PDF 1.7 158 files, mandatory Unicode mapping, and many of the required 159 features. 161 Use PDF/A-3 for embedding additional data (including the XML 162 source file) in RFCs and Internet-Drafts. 164 Use PDF/UA for user accessibility. 166 3. Options and Requirements for PDF RFCs 168 This section lays out options and requirements for PDFs produced by 169 the RFC editor for RFCs. There are two sections: "Visible" options 170 are related to how the PDF appears when it is viewed with a PDF 171 viewer. "Internal Structure" options affect the ability to process 172 PDFs in other ways, but do not control the way the document appears. 173 (Of course, a viewer UI might display processing capabilities, such 174 as showing whether a document has been digitally signed.) 176 In many cases, the choice of PDF requirements is heavily influenced 177 by the capabilities of available tools to create PDFs. Most of the 178 discussion of tooling is to be found in Appendix C. 180 3.1.
"Visible" Requirements 182 PDF supports rich visible layout of fixed-sized pages. 184 3.1.1. General Visible Requirements 186 For a consistent "look" of RFC and good style, the PDFs produced by 187 the RFC editor should have a clear, consistent, identifiable and 188 easy-to-read style. They should print well on the widest range of 189 printers, and look good on displays of varying resolution. 191 3.1.2. Page Size, Margins 193 PDF files are laid out for a particular size of page and margins. 194 There are two paper sizes in common use: "US Letter" (8.5 x 11 195 inches, 216x279 mm, in popular use in North America) and "A4" 196 (210x297 mm, 8.27x11.7 inches, standard for the rest of the world). 197 Usually PDF printing software is used in a "shrink to fit" mode where 198 the printing is adjusted to fit the paper in the printer. There is 199 some controversy, but the argument that A4 is an international 200 standard is compelling. However, if the margins and header 201 positioning are chosen appropriately, the document can be printed 202 without any scaling. 204 Recommendation: The Internet-Draft and RFC processors should produce 205 A4 size by default. However, the margins and header positioning need 206 to be chosen to look good on both paper sizes without scaling. 207 Following the advice found in [RFC2346], this means that we should 208 use A4 portrait mode with left and right margins of 20 mm, and top 209 and bottom margins of 33 mm. 211 3.1.3. Headers and Footers 213 Page headers and footers are part of the page layout. There are a 214 variety of options. Note that page headers and footers in PDF can be 215 typeset in a way that the entire (longer) title might fit. 217 Recommendation: Page headers and footers should contain similar 218 information as the headings in the current text versions of 219 documents, including page numbers, title, author, working group. 220 However, the page headers and footers should be typeset in a way so 221 as to be unobtrusive. 
The page headers and footers should be placed 222 into the PDF in a way that does not interfere with screen readers. 224 3.1.4. Paragraph Numbering 226 One common feature of the Internet-Draft output formats is optional 227 visible paragraph numbers, to aid in discussions. In the PDF and 228 thus printed rendition, it is possible to make paragraph numbers 229 unobtrusive, and even to let them impinge on the margins. 231 Recommendation: When the XML "editing=yes" option has been chosen, 232 show paragraph numbers in the right margin, typeset so as to 233 be unobtrusive. (Using the right margin instead of the left margin 234 prevents the paragraph numbers from being confused with the section 235 numbers.) If possible, the paragraph numbers should be coded in a 236 way that they do not interfere with screen readers. 238 3.1.5. Paged Content Layout 240 By its nature, PDF is paginated, so pagination issues must be 241 considered. This is reflected in two areas: running headers and 242 footers, and how text is laid out on a page for optimal reading. 244 Appendix B describes the process of creating a paged document from 245 running text such that related material is present on the same page 246 together and artifacts of pagination don't interfere with easy 247 reading of the document. 249 Layout engines differ in the quality of the algorithms used to 250 automate these processes. In some cases, the automated processes 251 require some manual assistance to ensure, for example, that a text 252 line intended as a heading is "kept" with the text it heads. 254 Recommendations: 256 o Headers and footers should be printed on each page. The 257 information should include the RFC number or Internet-Draft name, 258 the page number, the category (informational, etc.), a shortened 259 version of the authors' names, the date of the RFC or Internet- 260 Draft, and the short form of the document title.
262 o Choose a layout engine so that manual intervention is minimized, 263 and so that widow and orphan processing and heading and title 264 contiguity are automatic. 266 3.1.6. Typeface Choices 268 A PDF may refer to a font by name, or it may use an embedded font. 269 When a font is not embedded, a PDF viewer will attempt to locate a 270 locally installed font of the same name. If it cannot find an exact 271 match, it will find a "close match". If a close match is not 272 available, it will fall back to something implementation dependent 273 and usually undesirable. 275 In addition, the PDF/A standards mandate the embedding of fonts. 276 Instead of using additional software to embed the fonts, the software 277 generating the PDF files should produce PDF/A-conforming files 278 directly, thus ensuring that all glyphs include Unicode mappings and 279 embedded fonts from the outset. 281 If the HTML version of the document is being visually mimicked, the 282 font(s) chosen should have both variable width and constant width 283 components, as well as bold and italic representations. 285 The typefaces used by Internet-Drafts and by RFCs need not be 286 identical. 288 Few fonts have glyphs for the entire repertoire of Unicode 289 characters; for this purpose, the PDF generation tool may need a set 290 of fonts and a way of choosing them. The RFC Editor is defining 291 where Unicode characters may be used within 292 RFCs [I-D.flanagan-nonascii]. 294 Typefaces are typically licensed and, in many cases, there is a fee 295 for use by PDF creation tools, but not for displaying or printing 296 the embedded fonts. 298 Recommendations: 300 o For consistent viewing, all fonts should be embedded. The fonts 301 used must be available for use by the IETF community. Some 302 discussion of available typefaces can be found in Appendix C.4.
304 o The choice of typefaces with respect to serif, sans serif, 305 monospace, etc., should follow the recommendations for HTML and 306 CSS rendering [I-D.hildebrand-html-rfc] and 307 [I-D.flanagan-rfc-css]. 309 o The range of Unicode characters allowed in the XML source for 310 Internet-Drafts and RFCs may be bounded by the availability of 311 embeddable fonts with appropriate glyphs [I-D.flanagan-nonascii]. 313 3.1.7. Hyphenation and Line Breaks 315 When doing page layout of running text, especially with 316 narrow page width and long words, layout processors of English text 317 often have the option of hyphenating words, or using existing hyphens 318 as a place to introduce line breaks. However, inserting line breaks 319 mid-word can be harmful when the "word" is actually a sequence of 320 characters representing a protocol element or protocol sequence. 322 Recommendation: avoid introducing hyphenated line breaks mid-word 323 into the visual display, consistent with requirements for plain text 324 and HTML. 326 3.1.8. Hyperlinks 328 PDF supports hyperlinks both to sections of the same document and to 329 other documents. 331 The conversion to PDF can generate: 333 o hyperlinks within the document 335 o hyperlinks to other RFCs and Internet-Drafts 337 o hyperlinks to external locations 339 o hyperlinks within a table of contents 341 o hyperlinks within an index 343 Recommendations: 345 o All hyperlinks available in the HTML rendition of the RFC should 346 also be visible and active in the PDF produced. This includes 347 both internal hyperlinks and hyperlinks to external resources. 349 o The table of contents, including page numbers, is useful when 350 printed. Its entries should also be hyperlinked to their respective 351 sections. 353 o As specified in the section on Referencing RFCs in [RFC7322], 354 hyperlinks to RFCs from the references section should point to the 355 RFC "info" page, which then links to the various formats 356 available.
358 o Hyperlinks to Internet-Drafts from the references section should 359 point to the datatracker entry page for the draft, which then 360 links to the various formats available. 362 3.1.9. Similarity to Other Outputs 364 There is some advantage to having the PDF files look like the text or 365 HTML renderings of the same document. There are several options even 366 so. The PDF 368 1. could look like the text version of the document, or 370 2. could look like the text version of the document but with 371 pictures rendered as pictures instead of using their ASCII-art 372 equivalent, or 374 3. could look like the HTML version. 376 Recommendation: the PDF rendition should look like the HTML 377 rendition, at least in spirit. Some differences from the HTML 378 rendition would include different typeface and size (chosen for 379 printing), page numbers in the table of contents and index, and the 380 use of page headers and footers. 382 Most of the choices used for the [I-D.hildebrand-html-rfc] rendering 383 and [I-D.flanagan-rfc-css] are thus applicable. See those documents 384 for specifics on the rendering of the specific XML elements. Some 385 notes are: 387 Every place in the document that would receive an HTML ID would be 388 given an identical PDF named destination. In addition, a named 389 destination will be created for each page with the form "pg-#", as 390 in "pg-35". 392 No pilcrows are generated or made visible. 394 The table of contents (generated if the XML's <rfc> element's 395 tocInclude attribute has the value "true") will have the section 396 number linked to that section named destination, but will also 397 include a page number that is linked to the page named 398 destination. The section title and the page number will be 399 separated by a visually-appropriate separator and the page numbers 400 will be aligned with each other.
402 The index (generated if the XML's <rfc> element's indexInclude 403 attribute has the value "true") will have the section number 404 linked to that section named destination, but will also include a 405 page number that is linked to the page named destination. 407 The running header in one line (on page 2 and all subsequent 408 pages) has the RFC number on the left (RFC NNNN), the (possibly 409 shortened form) title centered, and the date (Month Year) on the 410 right. The text is rendered in a way that is visually 411 unobtrusive. 413 The running footer in one line (on all pages) has the author's 414 last name on the left, category centered, and the page number on 415 the right ([Page N]). The text is rendered in a way that is 416 visually unobtrusive. 418 We should not attempt to replicate in PDF the feature of the HTML 419 format that includes a dynamic block that displays up-to-date 420 information on updates, obsoletions and errata. 422 3.2. "Invisible" Options and Requirements 424 PDF offers a number of features which improve the utility of PDF 425 files in a variety of workflows, at the cost of extra effort in the 426 xml2rfc conversion process; the tradeoffs may be different for the 427 RFC editor production of RFCs and for Internet-Drafts. 429 3.2.1. Internal Text Representation 431 The contents of a PDF file can be represented in many ways. The PDF 432 file could be generated: 434 o as an image of the visual representation, such as a JPEG image of 435 the word "IETF". That is, there might be no internal 436 representation of letters, words or paragraphs at all. 438 o by placing individual characters in position on the page, such as 439 saying "put an 'F' here", then "put a 'T' before it", then "put 440 an 'E' before that", then "put an 'I' before that" to render the 441 word "IETF". That is, there might be no internal representation 442 of words or paragraphs at all.
444 o by placing words in position on the page, such that the word 445 "IETF" would be kept together. That is, there might be no 446 internal representation of paragraphs at all. 448 o by ensuring that the running order of text in the content stream 449 matches the logical reading order. That is, a sentence such as 450 'The Internet Engineering Task Force (IETF) supports the 451 Internet.' would be kept together as a sentence, and multiple 452 sentences within a paragraph would be kept together. 454 All of these end up with essentially the same visual representation 455 of the output. However, each level has tradeoffs for auxiliary uses, 456 such as searching or indexing, commenting and annotation, and 457 accessibility (text-to-speech). Keeping the text in 458 the content stream in logical reading order supports all of these 459 auxiliary uses. 461 In addition, the "role map" feature of PDF 462 would 464 allow for the mapping of the logical tags found in the original XML 465 into tags in the PDF. 467 Recommendations: 469 o Text in content streams should follow the XML document's logical 470 order (in the order of tags) to the extent possible. This will 471 provide optimal reuse by software that does not understand Tagged 472 PDF. (PDF/UA requires this.) 474 o It might be possible to use the "role map" annotation to capture 475 enough of the xml2rfc source structure, to the point where it is 476 possible to reconstruct the XML source structure completely. 477 However, there is not a compelling case to do so over embedding 478 the original XML, as described in Section 3.2.7. 480 3.2.2. Unicode Support 482 PDF itself does not require use of Unicode. Text is represented as a 483 sequence of glyphs which then can be mapped to Unicode. 485 Recommendations: 487 PDF files generated must have the full text, as it appears in the 488 original XML. 490 Unicode normalization may occur.
492 Text within SVG images should also have Unicode mappings. 494 Alt-text for images should also support Unicode. 496 3.2.3. Image Processing (Artwork) 498 The XML allows both ASCII art and SVG to be used for artwork. 500 Recommendations: 502 If both ASCII art and SVG are available for a picture, the SVG 503 artwork should be preferred over the ASCII artwork. 505 ASCII artwork must be rendered using a monospace font. 507 3.2.4. Text Description of Images (Alt-Text) 509 Guidelines for accessibility of PDF recommend that images, formulas, and other non-text 511 items provide textual alternatives, using the '/Alt' Tag in PDF to 512 provide human-readable text that can be vocalized by text-to-speech 513 technology. 515 Recommendation: Any alt-text for artwork and figures available in the 516 XML source should be stored using the PDF /Alt property. Internet- 517 Draft authors and the RFC editor should ensure inclusion of alt-text 518 for all SVG and other images within the XML source. 520 3.2.5. Metadata Support 522 Metadata encodes information about the document authors, the document 523 series, date created, etc. Having this metadata within the PDF file 524 allows it to be used by search engines, viewers and other reuse 525 tools. PDF supports embedded metadata in a variety of ways, 526 including the Extensible Metadata Platform (XMP) [XMP]. 527 The RFC editor maintains metadata about an RFC on its info page. 529 Recommendation: The PDFs generated should have all of the metadata 530 from the XML version embedded directly as XMP metadata, including the 531 author, date, the document series, and a URL for where the document 532 can be retrieved. This information should be consistent with the RFC 533 editor info page at the time of publication. 535 3.2.6. Document Structure Support 537 PDF supports an "outline" feature where sections of the document are 538 marked; this could be used in addition to the table of contents as a 539 navigation aid.
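As an illustration only, the nesting that such an outline needs can be derived from the dotted section numbers alone. The sketch below is not part of xml2rfc or of any PDF library; the names OutlineNode, build_outline, and render are hypothetical, and the input is assumed to be a list of (number, title) pairs in document order, with an empty number for unnumbered headings such as the Abstract.

```python
from dataclasses import dataclass, field


@dataclass
class OutlineNode:
    """One outline (bookmark) entry; children are its subsections."""
    title: str
    children: list = field(default_factory=list)


def build_outline(headings):
    """Nest (number, title) pairs by dotted section number.

    Hypothetical helper: an empty number marks an unnumbered heading,
    which is attached at the top level. A heading whose parent number
    has not been seen is also attached at the top level.
    """
    root = OutlineNode("(document)")
    by_number = {}
    for number, title in headings:
        label = f"{number}. {title}" if number else title
        node = OutlineNode(label)
        # "3.1.2" -> parent "3.1"; "3" or "" -> parent "" (top level).
        parent_number = number.rpartition(".")[0]
        by_number.get(parent_number, root).children.append(node)
        if number:
            by_number[number] = node
    return root


def render(node, depth=0):
    """Return the outline below `node` as indented text lines."""
    lines = []
    for child in node.children:
        lines.append("  " * depth + child.title)
        lines.extend(render(child, depth + 1))
    return lines
```

A PDF generation library would then walk this tree to emit the actual outline entries; which library is used, and how it exposes outlines, is left open here.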
541 The section structure of an RFC can be mapped into the PDF elements 542 for the document structure. This will allow the bookmark feature of 543 PDF readers to be used to quickly access sections of the document. 545 Recommendation: The section structure of an RFC should be mapped into 546 the PDF elements for the document structure. This would include 547 section headings for the boilerplate sections such as the Abstract, 548 Status of the Document, Table of Contents, and Author Addresses, plus 549 the obvious section headings that are normally included in the 550 Table of Contents. If possible, this should be done in a way that 551 the same fragment identifiers for the HTML version of the RFC will 552 work for the PDF version. 554 3.2.7. Embedded Files 556 PDF has the capability of including other files; the files may be 557 labeled both by a media type and a role, the AFRelationship key 558 [PDFA3]. In this way, the PDF file acts also as a container. 560 Embedded content may be compressed. 562 Many PDF viewers support the ability to view and extract embedded 563 files, although this capability is not universal. 565 Embedding content in the PDF file allows the PDF to act as a complete 566 package, which can be transformed, archived, and digitally signed. 567 (Some sample code illustrating how items can be attached to a PDF 568 file and subsequently extracted can be found at 569 .) Useful possibilities: 571 Embed the source XML input file itself within the PDF. If the 572 source SVG and images for illustrations are also embedded, this 573 would make the PDF file totally self-referential. 575 Embed directly extractable components that are useful for 576 independent processing, including ABNF, MIBs, source code for 577 reference implementations. This capability might be supported 578 through other mechanisms from the XML source files, but could also 579 be supported within the PDF. 
581 Finding, extracting and embedding other components may require 582 additional markup to clearly identify them, and additional review 583 to ensure the correctness of embedded files that are not visible. 585 Recommendations: 587 Embed the XML source and all illustrations, for RFCs, as a 588 standard feature for xml2rfc's PDF output. 590 If possible, make this a standard feature for Internet-Drafts as 591 well. 593 Named entries should be embedded. 595 Images (SVG sources, JPEGs, PNGs, etc.) should be embedded. 597 3.3. Digital Signatures 599 The RFC Editor and staff are at times called upon to provide evidence that 600 a particular RFC is the "original" and has not been modified; digital 601 signatures can provide that verification. As signatures also apply 602 to embedded content, embedding the XML source will provide a way of 603 signing the source XML that was used to produce the PDF file as well. 605 PDF has supported digital signatures since PDF 1.2, and there are 606 multiple methods and options available for signing PDF files. The 607 signing of Internet-Drafts and RFCs will be guided by 608 [I-D.housley-rfc-and-id-signatures]. 610 4. References 612 4.1. References 614 [PDF] ISO, "Portable document format -- Part 1: PDF 1.7", 615 ISO 32000-1, 2008. 617 Also available free from Adobe. 619 [XMP] ISO, "Extensible metadata platform (XMP) specification -- 620 Part 1: Data model, serialization and core properties", 621 ISO 16684-1, 2012. 623 Not available free, but there are a number of descriptive 624 resources, e.g., 627 [PDFA2] ISO, "Electronic document file format for long-term 628 preservation -- Part 2: Use of ISO 32000-1 (PDF/A-2)", 629 ISO 19005-2, 2011. 631 [PDFA3] ISO, "Electronic document file format for long-term 632 preservation -- Part 3: Use of ISO 32000-1 with support 633 for embedded files (PDF/A-3)", ISO 19005-3, 2012.
635 [PDFUA] ISO, "Electronic document file format enhancement for 636 accessibility -- Part 1: Use of ISO 32000-1 (PDF/UA-1)", 637 ISO 14289-1, 2012. 639 4.2. Informative References 641 [RFC2346] Palme, J., "Making Postscript and PDF International", 642 RFC 2346, DOI 10.17487/RFC2346, May 1998, 643 . 645 [RFC6949] Flanagan, H. and N. Brownlee, "RFC Series Format 646 Requirements and Future Development", RFC 6949, 647 DOI 10.17487/RFC6949, May 2013, 648 . 650 [RFC7322] Flanagan, H. and S. Ginoza, "RFC Style Guide", RFC 7322, 651 DOI 10.17487/RFC7322, September 2014, 652 . 654 [I-D.flanagan-nonascii] 655 Flanagan, H., "The Use of Non-ASCII Characters in RFCs", 656 draft-flanagan-nonascii-06 (work in progress), November 657 2015. 659 [I-D.flanagan-rfc-css] 660 Flanagan, H., "CSS Requirements for RFCs", draft-flanagan- 661 rfc-css-04 (work in progress), September 2015. 663 [I-D.hardy-pdf-mime] 664 Hardy, M., Masinter, L., Markovic, D., Johnson, D., and M. 665 Bailey, "The application/pdf Media Type", draft-hardy-pdf- 666 mime-01 (work in progress), April 2016. 668 [I-D.hildebrand-html-rfc] 669 Hildebrand, J. and P. Hoffman, "HyperText Markup Language 670 Request For Comments Format", draft-hildebrand-html-rfc-10 671 (work in progress), August 2015. 673 [I-D.housley-rfc-and-id-signatures] 674 Housley, R., "Digital Signatures on RFC and Internet-Draft 675 Documents", draft-housley-rfc-and-id-signatures-02 (work 676 in progress), May 2016. 678 4.3. URIs 680 [1] https://sourceforge.net/projects/ 681 sourcesans.adobe/?source=directory 683 [2] https://sourceforge.net/projects/ 684 sourceserifpro.adobe/?source=directory 686 [3] https://sourceforge.net/projects/ 687 sourcecodepro.adobe/?source=drectory 689 [4] https://www.rosettatype.com/Skolar 691 [5] https://www.google.com/get/noto/ 693 Appendix A. History and Current Use of PDF with RFCs and Internet- 694 Drafts 696 NOTE: this section is meant as an overview to give some background. 698 A.1.
RFCs 700 The RFC series has for a long time accepted PostScript renderings of 701 RFCs, either in addition to or instead of the text renderings of 702 those same RFCs. These have usually been produced when there was a 703 complicated figure or mathematics within the document. For example, 704 consider the figures and mathematics found in RFC 1119 and RFC 1142, 705 and compare the figures found in the text version of RFC 3550 with 706 those in the PostScript version. The RFC Editor has provided a PDF 707 rendering of RFCs. Usually, this has been a print of the text file 708 that does not take advantage of any of the broader PDF functionality, 709 unless there was a PostScript version of the RFC, which would then be 710 used by the RFC Editor to generate the PDF. 712 A.2. Internet-Drafts 714 In addition to PDFs generated and published by the RFC Editor, the 715 IETF tools community has also long supported PDF for Internet-Drafts. 716 Most RFCs start with Internet-Drafts, edited by individual authors. 717 The Internet-Drafts submission tool at https://datatracker.ietf.org/ 718 submit/ accepts PDF and PostScript files in addition to the 719 (required) text submission and (currently optional) XML. If a PDF 720 wasn't submitted for a particular version of an Internet-Draft, the 721 tools would generate one from the PostScript, HTML, or text. 723 Appendix B. Paged Content Layout Quality 725 The process of creating a paged document from running text typically 726 involves ensuring that related material is present on the same page 727 together, and that artifacts of pagination don't interfere with easy 728 reading of the document. Typical high-quality layout processors do 729 several things: 731 Widow and Orphan Management: Widows and orphans 732 () should be 733 avoided automatically (unless the entire paragraph is only one 734 line).
Ensure that a page break does not occur after the first 735 line of a paragraph (orphans), if necessary, using slightly longer 736 page sizes. Similarly, ensure that a page break does not occur 737 before the last line of a paragraph (widows). 739 Keep Section Headings Contiguous: Do not insert a page break 740 immediately after a section heading. If there isn't room on a 741 page for the first (two) lines of a section after the section 742 heading, insert a page break before the heading. 744 Avoid Splitting Artwork: Figures should not be split from figure 745 titles. If possible, keep the figure on the same page as the 746 (first) mention of the figure. 748 Headers for Long Tables after Page Breaks: Another common option in 749 producing paginated documents is to repeat the column headings of 750 a table on each page when the table cannot be displayed on a single page. 751 Similarly, tables should not be split from the table titles. 753 keepWithNext and keepWithPrevious: The XML attributes of 754 "keepWithNext" and "keepWithPrevious" should be followed whenever 755 possible. 757 Whitespace Preservation: The XML entities such as NBSP and NBHYPHEN 758 should be honored as directed whenever possible. 760 Appendix C. Tooling 762 This section discusses tools for viewing, comparing, creating, 763 manipulating, and transforming PDF files, including those currently in 764 use by the RFC Editor and for Internet-Drafts, as well as outlining 765 available PDF tools for various processes. 767 C.1. PDF Viewers 769 As with most file formats, PDF files are experienced through a reader 770 or viewer of PDF files. For most of the common platforms in use 771 (iOS, OS X, Windows, Android, ChromeOS, Kindle) and for most browsers 772 (Edge, Safari, Chrome, Firefox), PDF viewing is built in. In 773 addition, there are many PDF viewers available for download and 774 install.
776 PDF viewers vary in capabilities, and it is important to note which 777 PDF viewers support the features utilized in PDF RFCs and Internet- 778 Drafts (features such as links, digital signatures, Tagged PDF and 779 others mentioned in Section 3). 781 C.2. Printers 783 Almost all viewers also support printing of PDF files, and printing 784 is one of the most important use cases for PDFs. Some printers have 785 direct PDF support. 787 C.3. PDF Generation Libraries 789 Because the xml2rfc format is a unique format, software for 790 converting XML source documents to the various formats will be 791 needed, including PDF generation. 793 One promising direction is suggested in 794 : using XSLT to generate XSL-FO, which 796 is then processed by a formatting object processor such as Apache 797 FOP. 799 Several libraries are also available for generating PDF. 800 The choice of library to use for xml2pdf will depend on many factors: 802 programming language, quality of implementation, quality of PDF 803 generated, support, cost, availability, and so forth. 805 C.4. Typefaces 807 This section is intended to discuss available typefaces that might 808 satisfy requirements. Some openly available typefaces 809 (without extensive Unicode support, however) include: 811 o Source Sans [1] 813 o Source Serif Pro [2] 815 o Source Code Pro [3] 817 A font that looks promising for its broad Unicode support is Skolar 818 [4], but it requires licensing. Another potentially useful set of 819 typefaces is the Noto [5] family from Google. 821 C.5. Other Tools 823 In addition to generating and viewing PDF, other categories of PDF 824 tools are available and may be useful both during specification 825 development and for published RFCs.
These include tools for 826 comparing two PDFs, checkers that could be used to validate the 827 results of conversion, reviewing and commentary tools that attach 828 annotations to PDF files, and digital signature creation and 829 validation. 831 Validation of an arbitrary author-generated PDF file would be quite 832 difficult; there are few PDF validation tools. However, if RFCs and 833 Internet-Drafts are generated by conversion from XML via xml2rfc, 834 then explicit validation of PDF and adherence to expected profiles 835 would mainly be useful to ensure that xml2rfc has functioned 836 properly. 838 Recommendations: 840 o Discourage (but allow) submission of a PDF representation for 841 Internet-Drafts. In most cases, the PDF for an Internet-Draft 842 should be produced automatically when XML is submitted, with an 843 opportunity to verify the conversion. 845 Appendix D. Acknowledgements 847 The input of the following people is gratefully acknowledged: Nevil 848 Brownlee (ISE), Brian Carpenter, Chris Dearlove, Martin Duerst, 849 Heather Flanagan (RSE), Joe Hildebrand, Paul Hoffman, Duff Johnson, 850 Ted Lemon, Sean Leonard, Henrik Levkowetz, Julian Reschke, Adam 851 Roach, Leonard Rosenthol, Alice Russo, Robert Sparks, Andrew 852 Sullivan, and Dave Thaler. 854 Authors' Addresses 856 Tony Hansen (editor) 857 AT&T Laboratories 858 200 Laurel Ave. South 859 Middletown, NJ 07748 860 USA 862 Email: tony@att.com 864 Larry Masinter 865 Adobe 866 345 Park Ave 867 San Jose, CA 95110 868 USA 870 Email: masinter@adobe.com 871 URI: http://larry.masinter.net 873 Matthew Hardy 874 Adobe 875 345 Park Ave 876 San Jose, CA 95110 877 USA 879 Email: mahardy@adobe.com