<?xml version="1.0" encoding="US-ASCII"?>
<!-- This template is for creating an Internet Draft using xml2rfc,
     which is available here: http://xml.resource.org. -->
<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [
<!-- One method to get references from the online citation libraries.
     There has to be one entity for each item to be referenced. 
     An alternate method (rfc include) is described in the references. -->

<!ENTITY RFC4291 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.4291.xml">
<!ENTITY RFC3493 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3493.xml">
<!ENTITY RFC2119 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml">
<!ENTITY RFC4038 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.4038.xml">
<!ENTITY RFC5156 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.5156.xml">
<!ENTITY RFC2629 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2629.xml">
<!ENTITY RFC3552 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3552.xml">
<!ENTITY I-D.narten-iana-considerations-rfc2434bis SYSTEM "http://xml.resource.org/public/rfc/bibxml3/reference.I-D.narten-iana-considerations-rfc2434bis.xml">
]>
<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>
<!-- used by XSLT processors -->
<!-- For a complete list and description of processing instructions (PIs), 
     please see http://xml.resource.org/authoring/README.html. -->
<!-- Below are generally applicable Processing Instructions (PIs) that most I-Ds might want to use.
     (Here they are set differently than their defaults in xml2rfc v1.32) -->
<?rfc strict="yes" ?>
<!-- give errors regarding ID-nits and DTD validation -->
<!-- control the table of contents (ToC) -->
<?rfc toc="yes"?>
<!-- generate a ToC -->
<?rfc tocdepth="4"?>
<!-- the number of levels of subsections in ToC. default: 3 -->
<!-- control references -->
<?rfc symrefs="yes"?>
<!-- use symbolic references tags, i.e, [RFC2119] instead of [1] -->
<?rfc sortrefs="yes" ?>
<!-- sort the reference entries alphabetically -->
<!-- control vertical white space 
     (using these PIs as follows is recommended by the RFC Editor) -->
<?rfc compact="yes" ?>
<!-- do not start each main section on a new page -->
<?rfc subcompact="no" ?>
<!-- keep one blank line between list items -->
<!-- end of list of popular I-D processing instructions -->
<rfc category="info" docName="draft-kawamura-ipv6-text-representation-01" ipr="trust200902">
  <!-- category values: std, bcp, info, exp, and historic
     ipr values: full3667, noModification3667, noDerivatives3667
     you can add the attributes updates="NNNN" and obsoletes="NNNN" 
     they will automatically be output with "(if approved)" -->

  <!-- ***** FRONT MATTER ***** -->

  <front>
    <!-- The abbreviated title is used in the page header - it is only necessary if the 
         full title is longer than 39 characters -->

    <title abbrev="IPv6 Text Representation">A Recommendation for IPv6 Address Text Representation</title>

    <!-- add 'role="editor"' below for the editors if appropriate -->

    <!-- Another author who claims to be an editor -->

    <author fullname="Seiichi Kawamura" initials="S.K." 
            surname="Kawamura">
      <organization>NEC BIGLOBE, Ltd.</organization>

      <address>
        <postal>
          <street>14-22, Shibaura 4-chome</street>

          <!-- Reorder these if your country does things differently -->

          <city>Minatoku</city>

          <region>Tokyo</region>

          <code>108-8558</code>

          <country>JAPAN</country>
        </postal>

        <phone>+81 3 3798 6085</phone>

        <email>kawamucho@mesh.ad.jp</email>

        <!-- uri and facsimile elements may also be added -->
      </address>
    </author>

    <author fullname="Masanobu Kawashima" initials="M.K." 
            surname="Kawashima">
      <organization>NEC AccessTechnica, Ltd.</organization>

      <address>
        <postal>
          <street>800, Shimomata</street>

          <!-- Reorder these if your country does things differently -->

          <city>Kakegawa-shi</city>

          <region>Shizuoka</region>

          <code>436-8501</code>

          <country>JAPAN</country>
        </postal>

        <phone>+81 537 23 9655</phone>

        <email>kawashimam@necat.nec.co.jp</email>

        <!-- uri and facsimile elements may also be added -->
      </address>
    </author>

    <date month="April" year="2009" />

    <!-- If the month and year are both specified and are the current ones, xml2rfc will fill 
         in the current day for you. If only the current year is specified, xml2rfc will fill 
	 in the current day and month for you. If the year is not the current one, it is 
	 necessary to specify at least a month (xml2rfc assumes day="1" if not specified for the 
	 purpose of calculating the expiry date).  With drafts it is normally sufficient to 
	 specify just the year. -->

    <!-- Meta-data Declarations -->

    <area>General</area>

    <workgroup>Internet Engineering Task Force</workgroup>

    <!-- WG name at the upperleft corner of the doc,
         IETF is fine for individual submissions.  
	 If this element is not present, the default is "Network Working Group",
         which is used by the RFC Editor as a nod to the history of the IETF. -->

    <keyword>IPv6</keyword>
    <keyword>text representation</keyword>

    <!-- Keywords will be incorporated into HTML output
         files in a meta tag but they have no effect on text or nroff
         output. If you submit your draft to the RFC Editor, the
         keywords will be used for the search engine. -->

    <abstract>
      <t>As IPv6 networks grow, more engineers, and also non-engineers, will need to use IPv6 
         addresses in text. While section 2.2 of the IPv6 address architecture [RFC4291] depicts a flexible model for 
         text representation of an IPv6 address, this flexibility has been causing problems for operators, system 
         engineers, and customers. This document describes the problems that a flexible text representation 
         has been causing. It also recommends a canonical representation format that best avoids confusion.
      </t>
    </abstract>
  </front>

  <middle>
    <section title="Introduction">
      <t>A single IPv6 address can be text represented in many ways. Examples are shown below.</t>

      <t><list style="empty">
      <t>2001:db8:0:0:1:0:0:1</t>
      <t>2001:0db8:0:0:1:0:0:1</t>
      <t>2001:db8::1:0:0:1</t>
      <t>2001:db8::0:1:0:0:1</t>
      <t>2001:0db8::1:0:0:1</t>
      <t>2001:db8:0:0:1::1</t>
      <t>2001:db8:0000:0:1::1</t>
      <t>2001:DB8:0:0:1::1</t>
      </list></t>

      <t>All of the above point to the same IPv6 address. This flexibility has caused many problems
      for operators, system engineers, and customers. The problems are described in section 3,
      and a canonical representation format that avoids them is introduced in section 4.
      </t>
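      <t>The equivalence of the spellings listed above can be checked mechanically. The following is a minimal sketch, assuming Python and its standard ipaddress module (neither is referenced elsewhere in this document); the variable names are illustrative only.</t>
      <figure><artwork><![CDATA[
```python
import ipaddress

# The eight spellings listed in the Introduction.
SPELLINGS = [
    "2001:db8:0:0:1:0:0:1",
    "2001:0db8:0:0:1:0:0:1",
    "2001:db8::1:0:0:1",
    "2001:db8::0:1:0:0:1",
    "2001:0db8::1:0:0:1",
    "2001:db8:0:0:1::1",
    "2001:db8:0000:0:1::1",
    "2001:DB8:0:0:1::1",
]

# Parse each spelling into its 128-bit integer value.
values = {int(ipaddress.IPv6Address(s)) for s in SPELLINGS}

# As plain text the spellings are all distinct ...
distinct_strings = len(set(SPELLINGS))
# ... but they all denote one and the same address.
distinct_addresses = len(values)
```
]]></artwork></figure>
      <t>Compared as text, the eight entries are eight different strings; compared by value, they collapse into a single address.</t>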

        <section title="Requirements Language">
          <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
          "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
          document are to be interpreted as described in <xref
          target="RFC2119">RFC 2119</xref>.</t>
        </section>
    </section>

    <section title="Text representation flexibility of RFC4291">
          <t>Examples of flexibility in Section 2.2 of RFC4291 are described below.
          </t>
        <section title="leading zeros">
          <t><list style="empty">
          <t>'It is not necessary to write the leading zeros in an individual field.'</t>
          </list></t>
          <t>In other words, it is also not necessary to omit leading zeros. This means that any of the
             following forms may be chosen. The final 16-bit field is written differently in each, but all of these addresses are the same.
          </t>
          <t><list style="empty">
          <t>2001:db8:aaaa:bbbb:cccc:dddd:eeee:0001</t>
          <t>2001:db8:aaaa:bbbb:cccc:dddd:eeee:001</t>
          <t>2001:db8:aaaa:bbbb:cccc:dddd:eeee:01</t>
          <t>2001:db8:aaaa:bbbb:cccc:dddd:eeee:1</t>
          </list></t>
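          <t>To illustrate how the spellings of a single field multiply, the sketch below enumerates them. It is written in Python as an assumption of this example; the function name field_spellings is hypothetical.</t>
          <figure><artwork><![CDATA[
```python
def field_spellings(value):
    """Return every hexadecimal spelling of one 16-bit field
    that differs only in the number of leading zeros."""
    shortest = format(value, "x")
    return [shortest.zfill(width)
            for width in range(len(shortest), 5)]

# The final field of the example addresses above has value 1.
spellings = field_spellings(0x1)
```
]]></artwork></figure>
          <t>For the value 1 this yields the four spellings shown in the list above. Since every field of an address can vary independently, the number of spellings of a whole address is the product of the per-field counts.</t>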
        </section>

        <section title="zero compression">
          <t><list style="empty">
          <t>'A special syntax is available to compress the zeros. The use of "::" indicates one or more groups of 16 bits of zeros.'</t>
          </list></t>
          <t>It is possible to choose whether or not to compress just one 16-bit field of zeros.
          </t>
          <t><list style="empty">
          <t>2001:db8:aaaa:bbbb:cccc:dddd::1</t>
          <t>2001:db8:aaaa:bbbb:cccc:dddd:0:1</t>
          </list></t>
          <t>Where there is more than one consecutive zero field, there is a choice of how many fields to compress. Examples follow.
          </t>
          <t><list style="empty">
          <t>2001:db8:0:0:0::1</t>
          <t>2001:db8:0:0::1</t>
          <t>2001:db8:0::1</t>
          <t>2001:db8::1</t>
          <t>... and more</t>
          </list></t>
          <t>In addition, section 2.2 of RFC4291 notes,
          </t>
          <t><list style="empty">
          <t>'The "::" can also be used to compress leading or trailing zeros in an address.'</t>
          </list></t>
          <t>When a single address contains more than one run of zero fields, it is possible to choose which run to compress, the earlier one or the later one. Examples are shown below.
          </t>
          <t><list style="empty">
          <t>2001:db8::aaaa:0:0:1</t>
          <t>2001:db8:0:0:aaaa::1</t>
          </list></t>
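          <t>The number of choices that "::" alone introduces can be enumerated. The following is a sketch only, assuming Python and its standard ipaddress module for parsing; compressed_forms is a hypothetical helper, and leading zeros are omitted so that only the placement of "::" varies.</t>
          <figure><artwork><![CDATA[
```python
import ipaddress

def compressed_forms(address):
    """List every spelling of address that differs only in
    where (or whether) "::" is used; leading zeros omitted."""
    fields = ipaddress.IPv6Address(address).exploded.split(":")
    fields = [format(int(f, 16), "x") for f in fields]
    forms = [":".join(fields)]
    for start in range(8):
        for end in range(start + 1, 9):
            # "::" may replace any run made up only of zero fields.
            if any(f != "0" for f in fields[start:end]):
                break
            head = ":".join(fields[:start])
            tail = ":".join(fields[end:])
            forms.append(head + "::" + tail)
    return forms

forms = compressed_forms("2001:db8:0:0:aaaa:0:0:1")
```
]]></artwork></figure>
          <t>For the address used in the example above, this produces seven spellings, including both 2001:db8::aaaa:0:0:1 and 2001:db8:0:0:aaaa::1, before leading zeros and letter case are even considered.</t>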
        </section>

        <section title="Uppercase or Lowercase">
          <t>RFC4291 does not state a preference for uppercase or lowercase. Various flavors are shown below.
          </t>
          <t><list style="empty">
          <t>2001:db8:aaaa:bbbb:cccc:dddd:eeee:aaaa</t>
          <t>2001:db8:aaaa:bbbb:cccc:dddd:eeee:AAAA</t>
          <t>2001:db8:aaaa:bbbb:cccc:dddd:eeee:AaAa</t>
          <t>... more combinations</t>
          </list></t>
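          <t>As a rough illustration of how many flavors exist, each hexadecimal letter can independently be written in either case. The sketch below assumes Python; both function names are hypothetical.</t>
          <figure><artwork><![CDATA[
```python
def case_variants(address):
    """Count the spellings that differ only in letter case."""
    letters = sum(1 for ch in address if ch.isalpha())
    return 2 ** letters

def same_ignoring_case(left, right):
    """Compare two spellings without regard to letter case."""
    return left.lower() == right.lower()

count = case_variants("2001:db8:aaaa:bbbb:cccc:dddd:eeee:aaaa")
```
]]></artwork></figure>
          <t>The first address in the list above contains 26 letters, so letter case alone allows 2 to the power of 26 spellings of it. A case-insensitive comparison removes this one source of variation, but not the others.</t>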
        </section>

    </section>

    <section title="Problems Encountered with the Flexible Model">
        <section title="Searching">
            <section title="General Summary">
              <t>A search for an IPv6 address conducted on a UNIX system is usually case sensitive, and extended options that allow the use of regular expressions come in handy. However, many applications on the Internet today do not provide this capability. When searching for an IPv6 address in such systems, the system engineer has to try each and every possible representation. This has a critical impact, especially when deploying IPv6 over an enterprise network.
              </t>
            </section>
            <section title="Searching Spreadsheets and Text Files">
              <t>Spreadsheet applications and text editors on GUI systems rarely have the ability to search for text using regular expressions. Moreover, many non-engineers (who are not aware of case sensitivity and regular expression use) use these applications to manage IP addresses. This has worked quite well with IPv4, since text representation in IPv4 has very little flexibility. There is no incentive for these non-engineers to change their tools or learn regular expressions when they decide to go dual-stack. If the entry in the spreadsheet reads 2001:db8::1:0:0:1 but the search is conducted for 2001:db8:0:0:1::1, the result will be no match. One example where this causes a problem is when a new address is being assigned from a pool and a search is done to check that it is not already in use. A false 'no match' may lead to a duplicate assignment, causing problems for the end-hosts or end-users. This type of address management is very often seen in enterprise networks and also in ISPs.
              </t>
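              <t>The address pool check described above can be sketched as follows. This is an illustration only, assuming Python and its standard ipaddress module; the sheet contents and function names are hypothetical.</t>
              <figure><artwork><![CDATA[
```python
import ipaddress

# Hypothetical contents of an address-management sheet.
ASSIGNED = ["2001:db8::1:0:0:1", "2001:db8::2"]

def in_use_textual(candidate, assigned):
    """Plain text search, as a spreadsheet would do it."""
    return candidate in assigned

def in_use_by_value(candidate, assigned):
    """Compare parsed addresses instead of their spellings."""
    wanted = ipaddress.IPv6Address(candidate)
    return any(ipaddress.IPv6Address(a) == wanted
               for a in assigned)

candidate = "2001:db8:0:0:1::1"
```
]]></artwork></figure>
              <t>With these inputs the textual search reports the candidate as free, while the comparison by value finds that it is already assigned. A tool without such parsing has no way of noticing the conflict.</t>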
            </section>
            <section title="Searching with Whois">
              <t>The whois utility is used by a wide range of people today. When a record is registered in the whois database, one will likely check the output to see if the entry is correct. If an entry was recorded as 2001:db8::/48 but the whois output showed 2001:0db8:0000::/48, most non-engineers would think that their input was wrong, and would likely retry several times or make a frustrated call to the database hostmaster. If the same address had to be registered on different systems, and each system showed a different text representation, this would confuse people even more. Although this document focuses on addresses rather than prefixes, this is worth mentioning since the problems encountered are mostly the same.
              </t>
            </section>
            <section title="Searching for an Address in a Network Diagram">
              <t>Network diagrams and blueprints contain IP addresses of systems. When troubleshooting, there may be a need to search through a diagram to find the point of failure (for example, if a traceroute stopped at 2001:db8::1, one would search the diagram for that address). This technique is used quite often in enterprise networks and managed services. Again, the different flavors of text representation will result in a time-consuming search, leading to a longer MTTR in times of trouble.
              </t>
            </section>
        </section>

        <section title="Parsing and Modifying">
            <section title="General Summary">
              <t>With all the possible ways of representing an address in text, each application must include a module, object, link, etc. to a function that parses IPv6 addresses such that, no matter how an address is represented, it is interpreted as the same address. This is not much of a problem if the output is just to be 'read' or 'managed' by a network engineer. However, many system engineers who integrate complex computer systems for corporate customers will find that their favorite tool does not have this function, or will encounter difficulties such as having to rewrite their macros or scripts for their customers. It must be noted that each additional line of a program results in increased development fees that are charged to the customers.
              </t>
            </section>
            <section title="Logging">
              <t>If an application were to output a log summary that represented the address in full (such as 2001:0db8:0000:0000:1111:2222:3333:4444), the output would be highly unreadable compared to the IPv4 output. The address would have to be parsed and reformatted to make it useful for human reading. This requires additional code in the applications, which results in extra fees charged to the customers. Sometimes, logging for critical systems is done by mirroring the same traffic to two different systems. Care must be taken that, no matter what form the log output takes, the logs are parsed so that the same address is read from both.
              </t>
            </section>
            <section title="Auditing. Case 1">
              <t>When the configuration of a router or any other network appliance is audited, there are many methods to compare the configuration information of a node. Sometimes, auditing is done by just comparing the changes made each day. In this case, if 2001:db8::1 was changed to 2001:0db8:0000:0000:0000:0000:0000:0001 in the configuration just because the new engineer on the block felt it was better, a simple diff will report that a different address was configured. If this were done on a large-scale network, people would focus on 'why the extra zeros were put in' instead of doing any real auditing. Many tools are just plain diffs that do not take address representation rules into account.
              </t>
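              <t>One way to keep a diff from reporting such cosmetic changes is to normalize addresses before comparing. The sketch below assumes Python and its standard ipaddress module; the configuration lines and the helper normalize_line are hypothetical, and the helper collapses runs of whitespace as a side effect.</t>
              <figure><artwork><![CDATA[
```python
import ipaddress

def normalize_line(line):
    """Rewrite every word that parses as an IPv6 address into
    one fixed spelling; other words are kept as written."""
    words = []
    for word in line.split():
        try:
            word = str(ipaddress.IPv6Address(word))
        except ValueError:
            pass  # not an IPv6 address
        words.append(word)
    return " ".join(words)

# Hypothetical configuration lines from two consecutive days.
old = "ipv6 address 2001:db8::1"
new = "ipv6 address 2001:0db8:0000:0000:0000:0000:0000:0001"

changed_as_text = old != new
changed_after_normalizing = (
    normalize_line(old) != normalize_line(new))
```
]]></artwork></figure>
              <t>A plain comparison flags the line as changed, whereas the normalized comparison does not, which lets the audit concentrate on real changes.</t>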
            </section>
            <section title="Auditing. Case 2">
              <t>Node configurations will be matched against an information system that manages IP addresses. If the output notation differs, a script will have to be implemented to cover for this. The result of an SNMP GET of an interface address and the text representation in a manually written text file are highly unlikely to match on the first try.
              </t>
            </section>
            <section title="Unexpected Modifying">
              <t>Sometimes, a system will take an address and modify it as a convenience. For example, a system may take an input of 2001:0db8:0::1 and output 2001:db8::1 (which is seen in some RIR databases). If the zeros were entered for a reason, the outcome may be somewhat unexpected.
              </t>
            </section>
        </section>

        <section title="Operating">
            <section title="General Summary">
              <t>When an operator sets the IPv6 address of a system as 2001:db8:0:0:1:0:0:1, the system may take the address and show the configuration result as 2001:DB8::1:0:0:1. A distinguished engineer will know that the right address is set, but an operator, or a customer communicating with the operator to solve a problem, is usually not as distinguished as we would like. Again, the extra load of checking that the IP address is the one intended results in fees that are charged to the customers.
              </t>
            </section>
            <section title="Customer Calls">
              <t>When a customer calls to inquire about a suspected outage, IPv6 address representation should be handled with care. Not all customers are engineers, nor do they all have the same skill in IPv6 technology. The NOC will have to take extra steps to parse the address by hand to avoid having to explain to the customer that 2001:db8:0:1::1 is the same as 2001:db8::1:0:0:0:1. This will never happen in IPv4 because an IPv4 address cannot be abbreviated.
              </t>
            </section>
            <section title="Abuse">
              <t>Network abuse is reported along with the abusing IP address. This 'reporting' could take any shape or form of the flexible model. A team that handles network abuse must be able to tell the difference between 2001:db8::1:0:1 and 2001:db8:1::0:1. A mistake in the placement of the "::" will result in a critical situation. A system that handles these incidents should be able to accept any type of input and parse it correctly. Incidents are also reported over the phone. It is unnecessary to report whether a letter is uppercase or lowercase; however, when a letter is written in uppercase, people tend to clarify that it is uppercase, which is unnecessary information.
              </t>
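              <t>Expanding both addresses in full makes the difference visible. The following sketch assumes Python and its standard ipaddress module; the variable names are illustrative.</t>
              <figure><artwork><![CDATA[
```python
import ipaddress

first = ipaddress.IPv6Address("2001:db8::1:0:1")
second = ipaddress.IPv6Address("2001:db8:1::0:1")

# Write out all eight fields with four digits each.
first_full = first.exploded
second_full = second.exploded

same_address = first == second
```
]]></artwork></figure>
              <t>The first address expands to 2001:0db8:0000:0000:0000:0001:0000:0001 and the second to 2001:0db8:0001:0000:0000:0000:0000:0001, so the two reports concern different hosts even though the strings look alike.</t>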
            </section>
        </section>


        <section title="Other Minor Problems">
            <section title="Changing Platforms">
              <t>When an engineer decides to change the platform of a running service, the same code may not work as expected due to the difference in IPv6 address text representation. Usually, a change of platform (e.g. Unix to Windows, Cisco to Juniper) results in a major change of code, but flexibility in address representation increases the work load, which again results in fees charged to the customers and also in longer down time of systems.
              </t>
            </section>
            <section title="Preference in Documentation">
              <t>A document edited by more than one author may become harder to read, since each author may prefer a different text representation.
              </t>
            </section>
            <section title="Legibility">
              <t>Capital D and 0 are quite often misread for each other. Capital B and 8 can also be misread.
              </t>
            </section>
        </section>
    </section>

    <section title="A Recommendation for IPv6 Text Representation">

        <t>A recommendation for a canonical text representation format of IPv6 addresses is presented in this section. The recommendation in this document is one that complies fully with RFC 4291, is implemented by various operating systems, and is human friendly.</t>

        <section title="Handling Leading Zeros">
          <t>Leading zeros should be omitted for human legibility and easier searching. Also, a single 16-bit 0000 field should be represented as just 0. Placeholder zeros are often a cause of misreading.
          </t>
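          <t>Applied to a single field, this rule can be sketched as follows. Python is an assumption of this example, and strip_leading_zeros is a hypothetical helper name.</t>
          <figure><artwork><![CDATA[
```python
def strip_leading_zeros(field):
    """Return a 16-bit hexadecimal field without leading zeros;
    an all-zero field becomes a single "0"."""
    return field.lstrip("0") or "0"

fields = "2001:0db8:0000:0001".split(":")
shortened = [strip_leading_zeros(f) for f in fields]
```
]]></artwork></figure>
          <t>Only zeros at the start of a field are removed; a field such as 1000 is left untouched.</t>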
        </section>

        <section title="&quot;::&quot; usage">
            <section title="shorten as much as possible">
          <t>"::" should be used to its maximum capability (e.g. 2001:db8::0:1 is not very clean, since the remaining 0 field could also have been shortened).</t>
            </section>

            <section title="one 16bit 0 field">
          <t>"::" should not be used to shorten just one 16bit 0 field, for it would tend to mislead the reader into thinking that more than one 16bit field has been shortened.</t>
            </section>

            <section title="when &quot;::&quot; can be used twice">
          <t>In cases where it is possible to use "::" in two or more different parts of an address, implementations that shorten the part with more 16bit 0 fields are more common (e.g. the latter is shortened in 2001:0:0:1:0:0:0:1). When the numbers of 16bit 0 fields are equal (e.g. 2001:db8:0:0:1:0:0:1), the former is usually shortened. One idea to avoid any confusion is for the operator not to use a 16bit 0 field in the first 64 bits. By nature, IPv6 addresses are usually assigned or allocated to end-users in blocks longer than 32 bits (typically 48 bits or longer).
          </t>
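          <t>The selection described in this section (longest run, former run on a tie, never a single field) can be sketched as below. This is one possible reading of the rules, written in Python as an assumption of this example; zero_run_to_compress is a hypothetical name, and the input is a list of fields whose leading zeros have already been omitted.</t>
          <figure><artwork><![CDATA[
```python
def zero_run_to_compress(fields):
    """Return (start, length) of the run of "0" fields to replace
    with "::", or None when no run is at least two fields long."""
    best = None
    start = None
    # The extra sentinel entry closes a run that reaches the end.
    for index, field in enumerate(fields + ["end"]):
        if field == "0":
            if start is None:
                start = index
        elif start is not None:
            length = index - start
            # Strictly greater: on a tie the earlier run is kept.
            if length >= 2 and (best is None or length > best[1]):
                best = (start, length)
            start = None
    return best

longer_wins = zero_run_to_compress("2001:0:0:1:0:0:0:1".split(":"))
tie = zero_run_to_compress("2001:db8:0:0:1:0:0:1".split(":"))
```
]]></artwork></figure>
          <t>For the two example addresses in the paragraph above, the sketch picks the three-field run starting at the fifth field and, in the tie, the run starting at the third field.</t>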
            </section>

        </section>

        <section title="Lower Case">
          <t>Recent implementations tend to represent IPv6 addresses in lower case. It is better to use lower case to avoid problems such as those described in sections 3.3.3 and 3.4.3.
          </t>
        </section>

    </section>

    <section title="Conclusion">
          <t>The recommended format of text representing an IPv6 address is summarized as follows.
          </t>
         <t><list style="empty">
          <t>(1) omit leading zeros</t>
          <t>(2) use "::" to its maximum extent whenever possible</t>
          <t>(3) use "::" where it shortens the address the most</t>
          <t>(4) use "::" on the former run of 0 fields in case of a tie</t>
          <t>(5) do not use "::" to shorten just one 16bit 0 field</t>
          <t>(6) use lower case</t>
          </list></t>
          <t>Hints for developers are written in the Appendix section.</t>
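          <t>Taken together, rules (1) to (6) can be sketched as a single formatting routine. The code below is an illustration under stated assumptions rather than a reference implementation: it is written in Python, uses the standard ipaddress module only to parse the input, and recommended_text is a hypothetical name. Addresses with an embedded IPv4 part are not treated specially.</t>
          <figure><artwork><![CDATA[
```python
import ipaddress

def recommended_text(address):
    """Format an IPv6 address following rules (1) to (6)."""
    value = int(ipaddress.IPv6Address(address))
    # Rules (1) and (6): "x" formatting yields lower case hex
    # without leading zeros.
    fields = [format((value >> shift) & 0xFFFF, "x")
              for shift in range(112, -16, -16)]
    # Rules (3) and (4): find the longest run of zero fields,
    # keeping the earliest run on a tie.
    best_start, best_len, index = 0, 0, 0
    while index < 8:
        if fields[index] != "0":
            index += 1
            continue
        end = index
        while end < 8 and fields[end] == "0":
            end += 1
        if end - index > best_len:
            best_start, best_len = index, end - index
        index = end
    # Rule (5): a single zero field is written as "0".
    if best_len < 2:
        return ":".join(fields)
    # Rule (2): the whole run is replaced, never part of it.
    head = ":".join(fields[:best_start])
    tail = ":".join(fields[best_start + best_len:])
    return head + "::" + tail

text = recommended_text("2001:DB8:0:0:1:0:0:1")
```
]]></artwork></figure>
          <t>With the input shown, the routine returns 2001:db8::1:0:0:1. Any of the other spellings of the same address gives the same result, which is the property that makes searching and comparing as plain text workable.</t>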
    </section>

    <section anchor="Security" title="Security Considerations">
      <t>None.</t>
    </section>

    <section anchor="IANA" title="IANA Considerations">
      <t>None.</t>
    </section>

    <section anchor="Acknowledgements" title="Acknowledgements">
      <t>The authors would like to thank Jan Zorz, Randy Bush, Yuichi Minami, and Toshimitsu Matsuura for their generous and helpful comments.</t>
    </section>

    <!-- Possibly a 'Contributors' section ... -->

  </middle>

  <!--  *****BACK MATTER ***** -->

  <back>

    <references title="Normative References">
      <!--?rfc include="http://xml.resource.org/public/rfc/bibxml/reference.RFC.4291.xml"?-->
      &RFC4291;

    </references>

    <references title="Informative References">
      &RFC2119;
      <!--?rfc include="http://xml.resource.org/public/rfc/bibxml/reference.RFC.3493.xml"?-->
      &RFC3493;
      &RFC4038;
      &RFC5156;
    </references>

    <section anchor="app-additional-1" title="IPv6 Addresses with Embedded IPv4 Addresses">
      <t>IPv4-Compatible IPv6 addresses and IPv4-Mapped IPv6 addresses are defined to carry an IPv4 address in the low-order 32 bits of the address. These addresses have a special representation that combines hexadecimal and decimal notations. The IPv4-Compatible IPv6 address is deprecated. Although the use of the IPv4-Mapped IPv6 address is not recommended due to security and portability problems, the text representation method noted in this document should be applied to the hexadecimal part.</t>
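      <t>For an IPv4-Mapped IPv6 address, the hexadecimal part is five zero fields followed by ffff, which the rules of this document reduce to "::ffff:". A minimal sketch of the resulting mixed notation follows, assuming Python and the ipv4_mapped property of its standard ipaddress module; mapped_text is a hypothetical name.</t>
      <figure><artwork><![CDATA[
```python
import ipaddress

def mapped_text(address):
    """Write an IPv4-mapped IPv6 address with a shortened, lower
    case hexadecimal part and a dotted decimal IPv4 part."""
    embedded = ipaddress.IPv6Address(address).ipv4_mapped
    if embedded is None:
        raise ValueError("not an IPv4-mapped IPv6 address")
    return "::ffff:" + str(embedded)

text = mapped_text("0:0:0:0:0:FFFF:192.0.2.1")
```
]]></artwork></figure>
      <t>The input above, as well as the all-hexadecimal spelling ::ffff:c000:201, is written as ::ffff:192.0.2.1.</t>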
    </section>

    <section anchor="app-additional-2" title="For developers">
      <t>We recommend that developers use display routines that conform to these rules. For example, getnameinfo() with NI_NUMERICHOST in the flags argument gives conforming output on FreeBSD 7.0. The inet_ntop() function of FreeBSD 7.0 is a good C code reference, but it should not be called directly. See RFC4038 for details.</t>
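      <t>The same getnameinfo() call is reachable from scripting languages. The sketch below assumes Python, whose standard socket module wraps the platform's getnameinfo(); numeric_host is a hypothetical name. The spelling returned depends on the platform's C library, so the sketch only checks that the value is preserved.</t>
      <figure><artwork><![CDATA[
```python
import ipaddress
import socket

def numeric_host(address):
    """Ask the platform for the numeric text form of address."""
    flags = socket.NI_NUMERICHOST | socket.NI_NUMERICSERV
    host, _port = socket.getnameinfo((address, 0), flags)
    return host

shown = numeric_host("2001:DB8:0:0:1::1")

# The spelling is platform dependent, so only the value is
# compared here.
same = (ipaddress.IPv6Address(shown) ==
        ipaddress.IPv6Address("2001:db8:0:0:1:0:0:1"))
```
]]></artwork></figure>
      <t>Whether the returned spelling follows every rule in this document has to be verified on each target platform.</t>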
    </section>

    <section anchor="app-additional-3" title="Prefix Issues">
      <t>Problems with prefixes are the same as the problems encountered with addresses. The text representation of IPv6 prefixes should be no different
         from that of IPv6 addresses. </t>
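      <t>As an illustration, two spellings of the same documentation prefix can be compared by value. The sketch assumes Python and its standard ipaddress module; the variable names are illustrative.</t>
      <figure><artwork><![CDATA[
```python
import ipaddress

long_form = ipaddress.IPv6Network("2001:0db8:0000::/48")
short_form = ipaddress.IPv6Network("2001:db8::/48")

# Both spellings denote the same prefix.
same_prefix = long_form == short_form

# The module prints the prefix with a shortened address part.
shown = str(long_form)
```
]]></artwork></figure>
      <t>Here the long spelling is printed as 2001:db8::/48, matching the short one.</t>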
    </section>

    <section anchor="app-additional-4" title="Phonetic Alphabet and Figure Code">
      <t>The use of a phonetic alphabet is essential for complete accuracy in voice communications. For example, the ITU phonetic alphabet and figure code are as follows (only the hexadecimal characters are extracted):</t>
    <texttable>
      <ttcol align='left'>Character</ttcol>
      <ttcol align='left'>Word</ttcol>
      <ttcol align='left'>Pronunciation</ttcol>
      <c>0</c>
      <c>Nadazero</c>
      <c>NAH-DAH-ZAY-ROH</c>
      <c>1</c>
      <c>Unaone</c>
      <c>OO-NAH-WUN</c>
      <c>2</c>
      <c>Bissotwo</c>
      <c>BEES-SOH-TOO</c>
      <c>3</c>
      <c>Terrathree</c>
      <c>TAY-RAH-TREE</c>
      <c>4</c>
      <c>Kartefour</c>
      <c>KAR-TAY-FOWER</c>
      <c>5</c>
      <c>Pantafive</c>
      <c>PAN-TAH-FIVE</c>
      <c>6</c>
      <c>Soxisix</c>
      <c>SOK-SEE-SIX</c>
      <c>7</c>
      <c>Setteseven</c>
      <c>SAY-TAY-SEVEN</c>
      <c>8</c>
      <c>Oktoeight</c>
      <c>OK-TOH-AIT</c>
      <c>9</c>
      <c>Novenine</c>
      <c>NO-VAY-NINER</c>
      <c>A</c>
      <c>Alfa</c>
      <c>AL FAH</c>
      <c>B</c>
      <c>Bravo</c>
      <c>BRAH VOH</c>
      <c>C</c>
      <c>Charlie</c>
      <c>CHAR LEE or SHAR LEE</c>
      <c>D</c>
      <c>Delta</c>
      <c>DELL TAH</c>
      <c>E</c>
      <c>Echo</c>
      <c>ECK OH</c>
      <c>F</c>
      <c>Foxtrot</c>
      <c>FOKS TROT</c>
    </texttable>

    </section>

  </back>
</rfc>


