Independent Submission                                       N. G. Huang
Internet Draft                              Wuxi Institute of Technology
Intended status: Experimental                          November 21, 2014
Expires: May 2015


                Universally Traceable Identifier (UTID)
                       draft-huangng-utid-04.txt

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   This document may contain material from IETF Documents or IETF
   Contributions published or made publicly available before November
   10, 2008. The person(s) controlling the copyright in some of this
   material may not have granted the IETF Trust the right to allow
   modifications of such material outside the IETF Standards Process.
   Without obtaining an adequate license from the person(s) controlling
   the copyright in such materials, this document may not be modified
   outside the IETF Standards Process, and derivative works of it may
   not be created outside the IETF Standards Process, except to format
   it for publication as an RFC or to translate it into languages other
   than English.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This Internet-Draft will expire on May 21, 2015.

N. G. Huang              Expires May 21, 2015                  [Page 1]

Internet-Draft                   UTID                     November 2014

Copyright Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.

Abstract

   A Universally Traceable Identifier (UTID) is a compact string of
   characters for identifying an abstract or physical object. A unique
   feature of UTID is that it contains two types of forwarding messages
   to achieve traceability. UTIDs are designed specially for the
   Identifier Tracing Protocol (IDTP) [I-D-IDTP]. This document defines
   the generic syntax of UTID, a generative grammar for UTID, and
   guidelines for their use.

Table of Contents

   1. Introduction
      1.1. Overview of UTIDs
      1.2. Terminology
      1.3. UTIDs and URIs
   2. Conventions Used in This Document
   3. Characters
      3.1. Reserved Characters
      3.2. Unreserved Characters
   4. Syntax Components
      4.1. Dns
      4.2. Catalog
      4.3. Id
   5. Formal Syntax
      5.1. UTID Syntax
         5.1.1. Maximum Length of UTIDs
         5.1.2. Dns
         5.1.3. Catalog
         5.1.4. Id
         5.1.5. Nested UTID
      5.2. Reserved catalog
      5.3. Examples
   6. Usage
      6.1. Components Omitted
         6.1.1. Catalog Omitted
         6.1.2. Id Omitted
         6.1.3. Dns Only
      6.2. Without Dns
      6.3. Spaces
      6.4. Carrier of UTIDs
      6.5. Use of UTID
   7. Normalization and Comparison
   8. Security Considerations
   9. IANA Considerations
   10. Change log of this document
   11. References
      11.1. Normative References
      11.2. Informative References
   12. Acknowledgments
   Appendix A. Parsing a UTID with a Regular Expression
   Appendix B. Delimiting a UTID in Context

1. Introduction

   A Universally Traceable Identifier (UTID) provides a simple and
   extensible means for identifying an abstract object or a physical
   object. This specification of UTID syntax and semantics is derived
   from concepts introduced by the Identifier Tracing Protocol (IDTP)
   [I-D-IDTP], a communication protocol designed for tracing an object
   that was first presented in [Huang2011]. The syntax and semantics of
   UTID are designed to meet the requirements of IDTP operation.

   Note 1: This version of this document has an important change
   compared to the previous version. The major change is to the syntax
   of UTID, which affects much of this document and the IDTP protocol
   [I-D-IDTP].

   Note 2: A reference implementation of UTID and IDTP, called
   "busilet", has been developed as open source software and can be
   downloaded from http://sourceforge.net/projects/busilet/. For more
   information, please visit http://www.utid.org.

1.1. Overview of UTIDs

   UTIDs are characterized as follows:

   o  Universality: Universality provides several benefits. It allows
      UTIDs to be used for different types of objects in various
      contexts with the same syntax. It allows UTIDs to be compatible
      with preexisting identifiers defined in various existing
      identification systems.

   o  Traceability: UTIDs contain forwarding messages used by IDTP
      [I-D-IDTP] to trace the origin of the information associated with
      the objects identified by the UTIDs. There are two types of
      forwarding messages contained in a UTID and used by IDTP. This is
      a unique feature of UTIDs and IDTP compared to URIs and other
      communication protocols, and it is why a new identification
      system and communication protocol are proposed.

   o  Identifier: An identifier embodies the information required to
      distinguish what is being identified from all other objects
      within its scope of identification. A UTID is an identifier
      consisting of a sequence of characters matching the syntax rule
      named in Section 5.1. This specification does not place any
      limits on the nature of an object. UTIDs have a global scope and
      are interpreted consistently regardless of context.

1.2. Terminology

   This specification uses a number of terms related to UTID that help
   in understanding the concept of UTID.

   o  Traceability: The ability to trace the history, application, or
      location of an entity by means of recorded identifications
      [ISO8402]. The concept of entity in this document is extended to
      abstract objects and physical objects.

   o  Object: An abstract object or a physical object.

   o  IDTP: The Identifier Tracing Protocol, as defined in [I-D-IDTP].
      UTIDs are designed specially for IDTP.

   o  UTID suffix: The last part of a UTID starting from a given
      position.

   o  Tracing: The process of tracing a request to its origin server by
      forwarding the request. It is a special kind of forwarding for
      the purpose of traceability.

1.3. UTIDs and URIs

   A UTID is similar to a Uniform Resource Identifier (URI) as defined
   by [RFC3986]. URIs are uniform identifiers for general purposes.
   UTIDs defined in this document are universal identifiers for IDTP
   only. Their differences are as follows:

   1. Traceability: A UTID contains two types of forwarding messages to
      achieve traceability.

   2. Syntax: The syntax is different from that of URIs. No
      percent-encoded characters are used to represent predefined
      delimiters in the components of UTIDs.

   3. Nested: A UTID can nest another UTID as its component, even with
      two or three nested levels, as long as the whole UTID is not
      longer than 96 characters.

   4. Usage: URIs are uniform identifiers for general purposes, while
      UTIDs are universal identifiers designed for IDTP only.

   UTIDs are not compatible with URIs because of the differences
   described above. Therefore, UTIDs cannot become a scheme under the
   URI scheme architecture.

2. Conventions Used in This Document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC-2119 [RFC2119].

   In this document, these words will appear with that interpretation
   only when in ALL CAPS. Lower case uses of these words are not to be
   interpreted as carrying RFC-2119 significance.

   This specification uses the term "character" in accordance with the
   definitions provided in [BCP19].

3. Characters

   A UTID is composed from a set of characters consisting of digits,
   letters, and graphic symbols. A reserved subset of those characters
   may be used to delimit syntax components within a UTID, while the
   remaining characters, including both the unreserved set and those
   reserved characters not acting as delimiters, define each
   component's identifying data.

3.1. Reserved Characters

   UTIDs include components that are delimited by characters in the
   "reserved" set. These characters are called "reserved" because they
   are defined as delimiters by the UTID syntax.

      reserved = "~" / "$"

3.2. Unreserved Characters

   Characters that are allowed in a UTID but do not have a reserved
   purpose are called unreserved. These include lowercase letters,
   decimal digits, hyphen, and period.

      unreserved = alpha / DIGIT / "-" / "."
      alpha      = %x61-7A   ; a-z

4. Syntax Components

   A UTID consists of three components referred to as the dns, catalog,
   and id.

4.1. Dns

   A dns component is the domain name, in the Domain Name System (DNS)
   defined by [RFC1034] and [RFC1035], of the party that defines the
   UTID and assigns it to an object. It is required and MUST NOT be
   empty (no characters).

   The dns expresses one type of forwarding message of a UTID, which is
   used directly by the TCP/IP network.

4.2. Catalog

   An organization's dns may contain many kinds of ids. These ids
   should be grouped into catalogs. A catalog component expresses the
   catalog message of the object identified by a UTID. It is not
   required and may be empty (no characters).

   The catalog message is used by IDTP to map a UTID to a set of data
   format standards for communication, as defined in Section 6.2 of
   [I-D-IDTP].

4.3. Id

   An id component is a string of characters for identifying an object.
   It MUST be unique in the context defined by the catalog and dns
   components. It may be a serial number, a unique string with meaning
   (such as a user name), or a unique number or string generated by a
   machine. It is not required but usually is not empty.

5. Formal Syntax

   The following syntax specification uses the Augmented Backus-Naur
   Form (ABNF) as described in RFC-5234 [RFC5234].

5.1. UTID Syntax

   The UTID syntax is as follows:

      UTID      = [ id ] "~" [ catalog ] "$" dns
      id        = 1*92graphic
      catalog   = label 0*30dot-label 0*1label
      dns       = label 0*60dot-label "." 2*7alpha
      label     = alpha / DIGIT
      dot-label = alpha / DIGIT / "-" / "."
      alpha     = %x61-7A   ; a-z
      graphic   = %x21-7E / <graphic Unicode character>

   If one component (catalog) is empty, the delimiter ('~') before the
   component SHOULD NOT be omitted.

5.1.1. Maximum Length of UTIDs

   The maximum length of a UTID is 96 bytes, including all delimiters,
   although the sum of the maximum lengths of the individual components
   is larger than 96. If one component is long, the other components
   should be short so that the length of the UTID does not exceed 96
   bytes.

5.1.2. Dns

   The dns component MUST be a real DNS name registered with a domain
   name registrar and MUST NOT be an IP address or "localhost", the
   loopback name of the local machine. That is to say, there must be at
   least one dot ('.') in the dns component, and the part after the
   last dot must be a top-level domain.

   The Domain Name System (DNS) [RFC1034] [RFC1035] defines the maximum
   length of a domain name as 255. However, to simplify
   implementations, the maximum length of a domain name used in a UTID
   is limited to 64, which is longer than nearly all domain names
   actually used in the real world.

   In the DNS definition, characters used in domain names may be lower
   case or upper case without any significance attached to the case.
   However, to simplify implementations, the characters used in the dns
   component of a UTID MUST be lower case.

   The Internationalizing Domain Names in Applications (IDNA) system
   allows user applications, such as web browsers, to map Unicode
   strings into the valid DNS character set using Punycode; the results
   are called internationalized domain names. However, to simplify
   implementations, internationalized domain names are not allowed in
   UTIDs.

5.1.3. Catalog

   The catalog component consists of lowercase letters, decimal digits,
   hyphen, and period only. A hyphen or a period should not appear at
   the beginning or the end of the catalog component.

5.1.4. Id

   Like the catalog component, the id component consists of lowercase
   letters, decimal digits, hyphen, and period. A hyphen or a period
   should not appear at the beginning or the end of the id component.

   However, for the best compatibility, the id component may consist of
   graphic characters, which include all graphic characters defined in
   ISO/IEC 646 [ISO646] and all graphic characters defined in Unicode
   [Unicode] except white space. Unicode characters SHOULD be encoded
   in UTF-8 [STD63].

   The id component is case sensitive, while the catalog and dns
   components support lower case only.

   It is recommended that the id component use lowercase letters,
   decimal digits, hyphen, and period only. The graphic characters are
   allowed only for compatibility with existing code systems.

   Space (%x20) and any character less than %x20 are not supported in
   the id, catalog, or dns components of UTIDs.

5.1.5. Nested UTID

   A nested UTID is a UTID whose id component is itself another UTID.
   Therefore, the id component of a nested UTID MUST contain a dns
   component and follow the UTID syntax.

   Conversely, a UTID whose id component contains a dns component is
   not necessarily a nested UTID. Whether a UTID is a nested UTID is
   determined both by the syntax and by the usage context of the UTID.

5.2. Reserved catalog

   Some catalogs are reserved for future use. These catalogs include
   'u', 'v', 'w', 'x', 'y', and 'z', and all catalogs that end with
   '.u', '.v', '.w', '.x', '.y', and '.z'.

5.3. Examples

   Examples of UTIDs are as follows:

      123~cat$abc.example
      123~$abc.example
      ~cat$abc.example
      ~$abc.example

   Examples of nested UTIDs are as follows:

      123~cat$abc.example~log$zyx.example
      123~cat$abc.example~$zyx.example
      123~$abc.example~log$zyx.example
      ~$abc.example~$zyx.example

   Examples of invalid UTIDs are as follows:

      123$abc.example

         There is no ~ symbol in the string, so it is ambiguous whether
         123 is an id or a catalog.

      123~abc.example

         There is no $ symbol in the string, so no dns can be
         identified.

      cat$abc.example

         There is no ~ symbol in the string.

      $abc.example

         There is no ~ symbol in the string.

      123~abc$abcexample

         The dns component is not a valid DNS name.

      123~abc$ABC.example

         The dns component has an upper case letter.

   Examples of invalid nested UTIDs are as follows:

      123~$abc~$zyx.example

         The id component is not a valid UTID because "abc" is not a
         valid dns.

      123~$abc.com$zyx.example

      123$abc.test~$zyx.example

6. Usage

   A UTID should be used as a whole. There is no concept like the
   relative reference of URIs. One or more components of a UTID may be
   omitted (empty), as discussed in the following subsections.

6.1. Components Omitted

6.1.1. Catalog Omitted

   The catalog component is used by IDTP to map a UTID to a set of data
   format standards for communication. If the catalog component is
   omitted, the default catalog is used. The default catalog of a UTID
   is an empty string with a length of zero.
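
   As an illustration only, the following Python sketch splits a UTID
   at its delimiters and shows that an omitted id or catalog comes back
   as an empty string. The names Utid and split_utid are hypothetical
   and not part of this specification, and the sketch checks only the
   delimiters, not the character sets or lengths of Section 5.1.

```python
from typing import NamedTuple


class Utid(NamedTuple):
    id: str       # empty when the id component is omitted
    catalog: str  # empty string is the default catalog (Section 6.1.1)
    dns: str      # always required


def split_utid(text: str) -> Utid:
    """Split a UTID at its last '$' and at the last '~' before it.

    Only the delimiters are checked; character sets and lengths
    are not validated in this sketch.
    """
    head, dollar, dns = text.rpartition("$")
    if not dollar or not dns:
        raise ValueError("missing '$' delimiter or empty dns component")
    ident, tilde, catalog = head.rpartition("~")
    if not tilde:
        raise ValueError("missing '~' delimiter")
    return Utid(ident, catalog, dns)


# The four valid examples from Section 5.3.
for example in ("123~cat$abc.example", "123~$abc.example",
                "~cat$abc.example", "~$abc.example"):
    print(example, "->", split_utid(example))
```

   Splitting from the right means that a nested UTID such as
   123~cat$abc.example~log$zyx.example yields the inner UTID
   123~cat$abc.example as its id component.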

   A UTID always has a catalog, whether or not the catalog component is
   omitted: it is either defined explicitly by the catalog component or
   defined implicitly as the default catalog, the empty string.

6.1.2. Id Omitted

   The id component is rarely omitted. It may be omitted in the
   following two cases:

   o  The UTID identifies all objects in the catalog specified in the
      UTID.

   o  The object that the UTID identifies is not of concern. The UTID
      is then used only as forwarding messages by IDTP.

6.1.3. Dns Only

   Occasionally, a UTID consists of only the dns component, without id
   and catalog components. In this case, the delimiters '~' and '$'
   SHOULD NOT be omitted, the catalog component is the default catalog
   (the empty string), and the id component is not of concern.

   A dns-only UTID is used only as forwarding messages by IDTP; that
   is, the UTID refers to the origin server named by the dns component.

6.2. Without Dns

   A UTID without a dns component is invalid. However, UTIDs without a
   dns component may exist in some special usage contexts. Such UTIDs
   are strictly limited to the interior of a server and MUST NOT be
   transmitted in communications.

   For example, UTIDs are typically used as primary keys in a database.
   In this case, all the primary key values have the same dns component
   or even the same catalog component. Therefore, the redundant
   components need not be saved in the database, and a mechanism for
   mapping primary keys to UTIDs should be established in the interior
   of the server.

6.3. Spaces

   Space (%x20) and any character less than %x20 are not supported in
   the id, catalog, or dns components of UTIDs.

   The lack of support for the space character is acceptable for
   catalog components. However, space characters occasionally occur in
   ids in existing identification systems. In this case, the space
   characters should be replaced by some other visible characters for
   compatibility; the replacement is not defined in this document.

6.4. Carrier of UTIDs

   UTIDs may be used in RFID tags or two-dimensional bar codes to
   identify physical objects. UTIDs may also be used in a database as
   primary keys or foreign keys to identify physical or abstract
   objects. In such cases, the length of UTIDs is critical, especially
   for RFID and for the data structure design of databases. To simplify
   implementations, the maximum length of UTIDs is limited to 96 bytes.

   If a UTID is used as a primary key in a database, the primary key
   values are usually saved without the dns component, or even without
   the catalog component, and there should be a mechanism in the server
   interior to convert the primary keys to UTIDs.

   If a UTID is used as a foreign key in a database, the foreign key
   values must be saved as whole UTIDs following the UTID syntax. There
   is no strict foreign key constraint on the referenced table. In this
   case, the foreign key in a table usually refers to many referenced
   tables distributed over many databases on local or remote hosts,
   which are identified by the forwarding messages (dns and catalog
   components) of the UTIDs stored as foreign key values.

6.5. Use of UTID

   A UTID may be used to identify any physical or conceptual thing in
   the real world or the conceptual world. For example, a UTID of
   "101.room102~sensor$sample.test" indicates sensor number 101 in room
   102 at the company sample.test.

   UTIDs are also compatible with existing code systems. For example,
   the ISBN of a book, "9787111316275", may be coded in a UTID as
   "9787111316275~isbn$id.cmpbook.com", where cmpbook.com is the
   publisher of the book. Another example is an email address. The
   email address "test@gmail.com" may be coded in a UTID as
   "test@gmail.com~email$example.com".

   A UTID can be read as "id of catalog in dns".
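
   The following is a minimal Python sketch, not a definitive
   implementation, of the server-side mapping from stored primary keys
   back to full UTIDs described in Section 6.4, and of the ISBN
   encoding above. The function make_utid, the list stored_keys, and
   the two SERVER_ constants are hypothetical names used only for
   illustration; the sketch assumes UTF-8 encoding when checking the
   96-byte limit.

```python
MAX_UTID_BYTES = 96  # limit from Section 5.1.1


def make_utid(ident: str, catalog: str, dns: str) -> str:
    """Join the three components with the '~' and '$' delimiters."""
    utid = f"{ident}~{catalog}${dns}"
    if len(utid.encode("utf-8")) > MAX_UTID_BYTES:
        raise ValueError("UTID longer than 96 bytes")
    return utid


# Hypothetical table: only the id part is stored as the primary key,
# because every row shares the same catalog and dns.
stored_keys = ["101", "102"]
SERVER_CATALOG = "sensor"    # assumed to be fixed for this table
SERVER_DNS = "sample.test"   # assumed to be the server's own domain

for key in stored_keys:
    print(make_utid(key, SERVER_CATALOG, SERVER_DNS))

# An existing code system (ISBN) carried in the id component.
print(make_utid("9787111316275", "isbn", "id.cmpbook.com"))
```

   Keeping the shared catalog and dns outside the table shortens the
   stored keys, while values that leave the server are always rebuilt
   as whole UTIDs.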

   For example, the UTID "101~sensor$sample.test" can be read as "101
   of sensor in sample.test".

7. Normalization and Comparison

   One of the most common operations on UTIDs is comparison to
   determine whether two UTIDs are equivalent. All UTIDs are already
   normalized, so two UTIDs are compared with a simple case-sensitive
   comparison.

8. Security Considerations

   This section is meant to inform application developers, information
   providers, and users of the security limitations of UTID as
   described by this document. The discussion does not include
   definitive solutions to the problems revealed, though it does make
   some suggestions for reducing security risks.

   Although a UTID does not in itself pose a security threat, a UTID is
   transmitted over the network in plaintext without encryption. It is
   therefore exposed to network sniffing tools, which can result in
   unintentional leakage of this information in transit. There are
   potential risks if the id or catalog components contain sensitive
   information. Therefore, UTID components should be designed carefully
   to avoid leaking sensitive information.

9. IANA Considerations

   No IANA actions are required by this document.

10. Change log of this document

   draft-huangng-utid-01: Added two links in Section 1, "Introduction":
   one to the web site of the reference implementation of UTID and
   IDTP, and one to the official web site of UTID and IDTP.

   draft-huangng-utid-02: Added Section 6.5, "Use of UTID".

   draft-huangng-utid-03: An important change. The syntax of UTID was
   changed from id~cat@loc$dns to id~cat$dns, which caused many changes
   in this document.

   draft-huangng-utid-04: No changes; the Internet-Draft was
   resubmitted only to keep it active.

11. References

11.1. Normative References

   [BCP19]   Freed, N. and J. Postel, "IANA Charset Registration
             Procedures", BCP 19, RFC 2978, October 2000.

   [ISO646]  International Organization for Standardization (ISO),
             "Information Technology: ISO 7-bit Coded Character Set for
             Information Interchange", International Standard, Ref. No.
             ISO/IEC 646:1991.

   [ISO8402] International Organization for Standardization (ISO),
             "Quality Management and Quality Assurance - Vocabulary",
             ISO 8402:1994, 1994.

   [RFC1034] Mockapetris, P., "Domain names - concepts and
             facilities", RFC 1034, November 1987;
             www.ietf.org/rfc/rfc1034.txt.

   [RFC1035] Mockapetris, P., "Domain names - implementation and
             specification", RFC 1035, November 1987;
             www.ietf.org/rfc/rfc1035.txt.

   [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
             Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3986] Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
             Resource Identifier (URI): Generic Syntax", RFC 3986,
             January 2005; www.ietf.org/rfc/rfc3986.txt.

   [RFC5234] Crocker, D., Ed., "Augmented BNF for Syntax
             Specifications: ABNF", RFC 5234, January 2008;
             www.ietf.org/rfc/rfc5234.txt.

   [STD63]   Yergeau, F., "UTF-8, a transformation format of ISO
             10646", STD 63, RFC 3629, November 2003.

   [Unicode] Allen, Julie D., "The Unicode Standard, Version 6.0", The
             Unicode Consortium, Mountain View, 2011.

11.2. Informative References

   [Huang2011] Huang, Neng-Geng, Zhang, Bing-Liang, and Zhi-Yuan Huang,
             "Concept and design of a things mail system", 2011 IEEE
             International Conference on Signal Processing,
             Communications and Computing (ICSPCC), 2011.
             DOI: 10.1109/ICSPCC.2011.6061741

   [I-D-IDTP] Huang, N. G., "Identifier Tracing Protocol (IDTP)",
             Internet-Draft, draft-huangng-idtp-04.txt, June 2014.

12. Acknowledgments

   The author of this document thanks Mr.
   Zhang Bing-Liang for his innovative idea of things mail, which
   inspired the concept of UTID and IDTP.

   This document was prepared using 2-Word-v2.0.template.dot.

Appendix A. Parsing a UTID with a Regular Expression

   The following regular expression (a single line, wrapped here for
   display) breaks a UTID down into its components.

      ^([\u0021-\u007e\u0080-\uffff]{1,92})?~([a-z0-9][a-z0-9\-\.]{0,30}
      [a-z0-9]?)?\$([a-z0-9][a-z0-9\-\.]{0,60}\.[a-z]{2,7})$

   For example, matching the above expression against

      123~cat$abc.example

   results in the following subexpression matches:

      $0 = 123~cat$abc.example
      $1 = 123
      $2 = cat
      $3 = abc.example

   Therefore, the three components are determined as follows:

      id      = $1
      catalog = $2
      dns     = $3

   When the above regular expression is used, any string that does not
   have the delimiters '~' and '$' will cause an exception.

Appendix B. Delimiting a UTID in Context

   UTIDs are usually transmitted through formats that provide a clear
   context for their interpretation. However, it is important to be
   able to delimit a UTID from the rest of the text if the UTID is in a
   plain text file for printing or transmitting.

   In such cases, it is recommended that UTIDs be delimited by double
   quotes. For example, "123~$abc.example" is a UTID.

   Copyright (c) 2014 IETF Trust and the persons identified as authors
   of the code. All rights reserved.

   Redistribution and use in source and binary forms, with or without
   modification, is permitted pursuant to, and subject to the license
   terms contained in, the Simplified BSD License set forth in Section
   4.c of the IETF Trust's Legal Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info).

Authors' Addresses

   Neng Geng Huang
   School of the Internet of Things
   Wuxi Institute of Technology
   Wuxi, Jiangsu, China, 214121

   Phone: 86-13921501950
   Email: huangng@gmail.com
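
   For illustration only, the regular expression of Appendix A can be
   exercised with Python's re module as sketched below. The names
   UTID_RE, parse_utid, and looks_nested are hypothetical and not part
   of this specification. The sketch assumes UTF-8 when applying the
   96-byte limit of Section 5.1.1, which the pattern alone does not
   enforce, and its nesting check is purely syntactic, whereas Section
   5.1.5 also depends on the usage context.

```python
import re

# Appendix A pattern without the ^ and $ anchors; fullmatch() below
# requires the whole string to match instead.
UTID_RE = re.compile(
    r"([\u0021-\u007e\u0080-\uffff]{1,92})?"       # $1: id
    r"~([a-z0-9][a-z0-9\-.]{0,30}[a-z0-9]?)?"      # $2: catalog
    r"\$([a-z0-9][a-z0-9\-.]{0,60}\.[a-z]{2,7})"   # $3: dns
)

MAX_UTID_BYTES = 96  # Section 5.1.1; not enforced by the pattern


def parse_utid(text):
    """Return (id, catalog, dns), or None if text is not a UTID."""
    if len(text.encode("utf-8")) > MAX_UTID_BYTES:
        return None
    match = UTID_RE.fullmatch(text)
    if match is None:
        return None
    # Omitted components are reported as empty strings.
    return tuple(group or "" for group in match.groups())


def looks_nested(text):
    """Syntactic check only: is the id component itself a UTID?"""
    parts = parse_utid(text)
    return parts is not None and parse_utid(parts[0]) is not None


for candidate in [
    "123~cat$abc.example",
    "~$abc.example",
    "123~cat$abc.example~log$zyx.example",
    "123$abc.example",        # invalid: no '~'
    "123~abc$ABC.example",    # invalid: upper case in dns
]:
    print(candidate, "->", parse_utid(candidate), looks_nested(candidate))
```

   Because the id group is matched greedily, a nested UTID is split at
   its last '~', so the inner UTID is returned whole as the id
   component.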