Network Working Group                                     P. Saint-Andre
Internet-Draft                                                     Cisco
Intended status: Standards Track                         October 6, 2010
Expires: April 9, 2011


   Extensible Messaging and Presence Protocol (XMPP): Address Format
                       draft-ietf-xmpp-address-05

Abstract

   This document defines the format for addresses used in the Extensible
   Messaging and Presence Protocol (XMPP), including support for
   non-ASCII characters.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 9, 2011.

Copyright Notice

   Copyright (c) 2010 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Addresses
     2.1.  Fundamentals
     2.2.  Domainpart
     2.3.  Localpart
     2.4.  Resourcepart
   3.  Internationalization Considerations
   4.  Security Considerations
     4.1.  Reuse of Stringprep
     4.2.  Reuse of Unicode
     4.3.  Confusable Characters
     4.4.  Address Spoofing
       4.4.1.  Address Forging
       4.4.2.  Address Mimicking
   5.  IANA Considerations
     5.1.  Nodeprep Profile of Stringprep
     5.2.  Resourceprep Profile of Stringprep
   6.  Conformance Requirements
   7.  References
     7.1.  Normative References
     7.2.  Informative References
   Appendix A.  Nodeprep
     A.1.  Introduction
     A.2.  Character Repertoire
     A.3.  Mapping
     A.4.  Normalization
     A.5.  Prohibited Output
     A.6.  Bidirectional Characters
     A.7.  Notes
   Appendix B.  Resourceprep
     B.1.  Introduction
     B.2.  Character Repertoire
     B.3.  Mapping
     B.4.  Normalization
     B.5.  Prohibited Output
     B.6.  Bidirectional Characters
   Appendix C.  Differences From RFC 3920
   Author's Address

1.  Introduction

   The Extensible Messaging and Presence Protocol (XMPP) is an
   application profile of the Extensible Markup Language [XML] for
   streaming XML data in close to real time between any two or more
   network-aware entities.  The address format for XMPP entities was
   originally developed in the Jabber open-source community in 1999,
   first described by [XEP-0029] in 2002, and defined canonically by
   [RFC3920] in 2004.

   As specified in RFC 3920, the XMPP address format re-uses the
   "stringprep" technology for preparation of non-ASCII characters
   [STRINGPREP], including the Nameprep profile for internationalized
   domain names as specified in [NAMEPREP] and [IDNA2003] along with two
   XMPP-specific profiles for the localpart and resourcepart.

   Since the publication of RFC 3920, IDNA2003 has been superseded by
   IDNA2008 (see [IDNA-PROTO] and related documents), which is not based
   on stringprep.  Following the lead of the IDNA community, other
   technology communities that use stringprep have begun discussions
   about migrating away from stringprep toward more "modern" approaches.
   The XMPP community is participating in those discussions in order to
   find a replacement for the Nodeprep and Resourceprep profiles of
   stringprep defined in RFC 3920.  However, work on improved handling
   of internationalized addresses is currently in progress within the
   PRECIS Working Group and at the time of this writing it seems that
   such work might take several years to complete.  Because all other
   aspects of revised documentation for XMPP have been incorporated into
   [rfc3920bis], the XMPP Working Group decided to split the XMPP
   address format into a separate specification so as not to
   significantly delay publication of improved documentation for XMPP
   while awaiting the conclusion of work on improved handling of
   internationalized addresses.

   Therefore, this specification provides corrected documentation of the
   XMPP address format using the internationalization technologies
   available in 2004 (when RFC 3920 was published), with the intent that
   this specification will be superseded as soon as work on a new
   approach to preparation and comparison of internationalized strings
   has been defined by the PRECIS Working Group and applied to the
   specific cases of XMPP localparts and resourceparts.

2.  Addresses

2.1.  Fundamentals

   An XMPP entity is anything that is network-addressable and that can
   communicate using XMPP.  For historical reasons, the native address
   of an XMPP entity is called a Jabber Identifier or JID.  A valid JID
   is a string of [UNICODE] code points, encoded using [UTF-8], and
   structured as an ordered sequence of localpart, domainpart, and
   resourcepart (where the first two parts are demarcated by the '@'
   character used as a separator, and the last two parts are similarly
   demarcated by the '/' character).

   The syntax for a JID is defined as follows using the Augmented
   Backus-Naur Form as specified in [ABNF].

      jid           = [ localpart "@" ] domainpart [ "/" resourcepart ]
      localpart     = 1*(nodepoint)
                      ; a "nodepoint" is a UTF-8 encoded Unicode code
                      ; point that satisfies the Nodeprep profile of
                      ; stringprep
      domainpart    = IP-literal / IPv4address / ifqdn
                      ; the "IPv4address" and "IP-literal" rules are
                      ; defined in RFC 3986, and the first-match-wins
                      ; (a.k.a. "greedy") algorithm described in RFC
                      ; 3986 applies to the matching process
      ifqdn         = 1*(namepoint)
                      ; a "namepoint" is a UTF-8 encoded Unicode
                      ; code point that satisfies the Nameprep
                      ; profile of stringprep
      resourcepart  = 1*(resourcepoint)
                      ; a "resourcepoint" is a UTF-8 encoded Unicode
                      ; code point that satisfies the Resourceprep
                      ; profile of stringprep

   All JIDs are based on the foregoing structure.  One common use of
   this structure is to identify a messaging and presence account, the
   server that hosts the account, and a connected resource (e.g., a
   specific device) in the form of <localpart@domainpart/resourcepart>.
   However, localparts can represent entities other than clients; for
   example, a specific chat room offered by a multi-user chat service
   (see [XEP-0045]) is addressed as <room@service> (where "room" is the
   name of the chat room and "service" is the hostname of the multi-user
   chat service) and a specific occupant of such a room could be
   addressed as <room@service/nick> (where "nick" is the occupant's room
   nickname).  Many other JID types are possible (e.g.,
   <domainpart/resourcepart> could be a server-side script or service).

   Each allowable portion of a JID (localpart, domainpart, and
   resourcepart) MUST NOT be zero bytes in length and MUST NOT be more
   than 1023 bytes in length, resulting in a maximum total size
   (including the '@' and '/' separators) of 3071 bytes.
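
   The structure and length limits above can be illustrated with a
   minimal Python sketch.  It is not part of the specification: the
   function name split_jid and the choice to split at the first '/' and
   then at the first '@' of what remains are assumptions made for
   illustration, and the sketch checks only the separators and the
   per-part byte limits, not the Nodeprep, Nameprep, or Resourceprep
   profiles.

```python
import unicodedata

MAX_PART_BYTES = 1023  # per-part limit stated in the text above


def split_jid(jid: str):
    """Split a JID into (localpart, domainpart, resourcepart).

    Hypothetical helper: splits on the raw '/' and '@' characters
    before any normalization is applied. Absent optional parts are
    returned as None.
    """
    # Assumption: everything after the first '/' is the resourcepart,
    # which may itself contain further '/' or '@' characters.
    rest, slash, resource = jid.partition("/")
    # Within what remains, text before the first '@' is the localpart.
    local, at, domain = rest.partition("@")
    if not at:
        local, domain = None, rest
    resourcepart = resource if slash else None

    parts = (("localpart", local), ("domainpart", domain),
             ("resourcepart", resourcepart))
    for name, part in parts:
        if part is None:
            continue
        size = len(part.encode("utf-8"))  # limits are in bytes, not code points
        if size == 0 or size > MAX_PART_BYTES:
            raise ValueError(
                f"{name} must be 1..{MAX_PART_BYTES} bytes, got {size}")
    return local, domain, resourcepart


# Three maximal parts plus the two separators give the stated total.
assert 3 * MAX_PART_BYTES + 2 == 3071

# Compatibility normalization can produce a separator character, which
# is why this sketch splits on the raw string first.
assert unicodedata.normalize("NFKC", "\ufe6b") == "@"
```

   For example, split_jid("juliet@example.com/balcony") returns the
   three parts separately, while split_jid("example.com") returns None
   for both optional parts.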

   An entity's address on an XMPP network MUST be represented as a JID
   (without a URI scheme) and not a [URI] or [IRI] as specified in
   [XMPP-URI]; the latter specification is provided only for
   identification and interaction outside the context of XMPP itself.

      Implementation Note: When dividing a JID into its component parts,
      an implementation needs to match the separator characters '@' and
      '/' before applying any transformation algorithms, which might
      decompose certain Unicode code points to the separator characters
      (e.g., U+FE6B SMALL COMMERCIAL AT might decompose into U+0040
      COMMERCIAL AT).

2.2.  Domainpart

   The DOMAINPART of a JID is that portion after the '@' character (if
   any) and before the '/' character (if any); it is the primary
   identifier and is the only REQUIRED element of a JID (a mere
   domainpart is a valid JID).  Typically a domainpart identifies the
   "home" server to which clients connect for XML routing and data
   management functionality.  However, it is not necessary for an XMPP
   domainpart to identify an entity that provides core XMPP server
   functionality (e.g., a domainpart can identify an entity such as a
   multi-user chat service, a publish-subscribe service, or a user
   directory).

   The domainpart for every server or service that will communicate over
   a network SHOULD be a fully qualified domain name or "FQDN" (see
   [DNS]); although the domainpart is allowed to be either an Internet
   Protocol (IPv4 or IPv6) address or a text label that is resolvable on
   a local network (commonly called an "unqualified hostname"), it is
   possible that domainparts that are IP addresses will not be
   acceptable to other services for the sake of interdomain
   communication.  Furthermore, domainparts that are unqualified
   hostnames MUST NOT be used on public networks but MAY be used on
   private networks.
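
   As a rough illustration of the three kinds of domainpart described
   above, the following Python sketch sorts a domainpart into an IP
   address, a dotted name, or an unqualified hostname.  The function
   name classify_domainpart and its return labels are hypothetical, the
   bracket test covers only the IPv6 form of the RFC 3986 "IP-literal"
   rule, and treating any name that contains a dot as fully qualified is
   a simplifying assumption rather than a rule from this document.

```python
import ipaddress


def classify_domainpart(domainpart: str) -> str:
    """Return 'ipv6', 'ipv4', 'fqdn', or 'unqualified' (illustrative labels)."""
    # IP-literal (RFC 3986): an IPv6 address enclosed in square brackets.
    if domainpart.startswith("[") and domainpart.endswith("]"):
        # Raises ValueError if the bracketed text is not a valid IPv6 address.
        ipaddress.IPv6Address(domainpart[1:-1])
        return "ipv6"
    try:
        ipaddress.IPv4Address(domainpart)
        return "ipv4"
    except ValueError:
        pass
    # Heuristic (assumption): a dotted name is treated as fully qualified.
    return "fqdn" if "." in domainpart else "unqualified"
```

   A deployment could use such a check to refuse unqualified hostnames
   on a public network, or to flag IP-address domainparts that a peer
   service might not accept.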

      Note: If the domainpart includes a final character considered to
      be a label separator (dot) by [IDNA2003] or [DNS], this character
      MUST be stripped from the domainpart before the JID of which it is
      a part is used for the purpose of routing an XML stanza, comparing
      against another JID, or constructing an [XMPP-URI]; in particular,
      the character MUST be stripped before any other canonicalization
      steps are taken, such as application of the [NAMEPREP] profile of
      [STRINGPREP] or completion of the ToASCII operation as described
      in [IDNA2003].

   A domainpart MUST NOT be zero bytes in length and MUST NOT be more
   than 1023 bytes in length.

   A domainpart consisting of a fully qualified domain name MUST be an
   "internationalized domain name" as defined in [IDNA2003], that is, it
   MUST be "a domain name in which every label is an internationalized
   label" and MUST follow the rules for construction of
   internationalized domain names specified in [IDNA2003].  When
   preparing a text label (consisting of a sequence of UTF-8 encoded
   Unicode code points) for representation as an internationalized label
   in the process of constructing an XMPP domainpart or comparing two
   XMPP domainparts, an application MUST ensure that for each text label
   it is possible to apply without failing the ToASCII operation
   specified in [IDNA2003] with the UseSTD3ASCIIRules flag set (thus
   forbidding ASCII code points other than letters, digits, and
   hyphens).  If the ToASCII operation can be applied without failing,
   then the label is an internationalized label.  (Note: The ToASCII
   operation includes application of the [NAMEPREP] profile of
   [STRINGPREP] and encoding using the algorithm specified in
   [PUNYCODE]; for details, see [IDNA2003].)  Although XMPP applications
   do not communicate the output of the ToASCII operation (called an
   "ACE label") over the wire, it MUST be possible to apply that
   operation without failing to each internationalized label.  If an
   XMPP application receives as input an ACE label, it SHOULD convert
   that ACE label to an internationalized label using the ToUnicode
   operation (see [IDNA2003]) before including the label in an XMPP
   domainpart that will be communicated over the wire on an XMPP network
   (however, instead of converting the label, there are legitimate
   reasons why an application might instead refuse the input altogether
   and return an error to the entity that provided the offending data).

   In the terms of IDNA2008 [IDNA-DEFS], the domainpart of a JID is a
   "domain name slot".

2.3.  Localpart

   The LOCALPART of a JID is an optional identifier placed before the
   domainpart and separated from the latter by the '@' character.
   Typically a localpart uniquely identifies the entity requesting and
   using network access provided by a server (i.e., a local account),
   although it can also represent other kinds of entities (e.g., a chat
   room associated with a multi-user chat service).  The entity
   represented by an XMPP localpart is addressed within the context of a
   specific domain.

   A localpart MUST NOT be zero bytes in length and MUST NOT be more
   than 1023 bytes in length.

   A localpart MUST be formatted such that the Nodeprep profile of
   [STRINGPREP] can be applied without failing (see Appendix A).  Before
   comparing two localparts, an application MUST first ensure that the
   Nodeprep profile has been applied to each identifier (the profile
   need not be applied each time a comparison is made, as long as it has
   been applied before comparison).

2.4.  Resourcepart

   The resourcepart of a JID is an optional identifier placed after the
   domainpart and separated from the latter by the '/' character.  A
   resourcepart can modify either a <localpart@domainpart> address or a
   mere <domainpart> address.  Typically a resourcepart uniquely
   identifies a specific connection (e.g., a device or location) or
   object (e.g., an occupant in a multi-user chat room) belonging to the
   entity associated with an XMPP localpart at a local domain.

   When an XMPP address does not include a resourcepart (i.e., when it
   is of the form <domainpart> or <localpart@domainpart>), it is
   referred to as a BARE JID.  When an XMPP address includes a
   resourcepart (i.e., when it is of the form <domainpart/resourcepart>
   or <localpart@domainpart/resourcepart>), it is referred to as a FULL
   JID.

   A resourcepart MUST NOT be zero bytes in length and MUST NOT be more
   than 1023 bytes in length.

   A resourcepart MUST be formatted such that the Resourceprep profile
   of [STRINGPREP] can be applied without failing (see Appendix B).
   Before comparing two resourceparts, an application MUST first ensure
   that the Resourceprep profile has been applied to each identifier
   (the profile need not be applied each time a comparison is made, as
   long as it has been applied before comparison).

      Note: For historical reasons, the term "resource identifier" is
      often used in XMPP to refer to the optional portion of an XMPP
      address that follows the domainpart and the "/" separator
      character; to help prevent confusion between an XMPP "resource
      identifier" and the meanings of "resource" and "identifier"
      provided in Section 1.1 of [URI], this specification uses the term
      "resourcepart" instead of "resource identifier" (as in RFC 3920).

   XMPP entities SHOULD consider resourceparts to be opaque strings and
   SHOULD NOT impute meaning to any given resourcepart.
   In particular:

   o  Use of the '/' character as a separator between the domainpart and
      the resourcepart does not imply that XMPP addresses are
      hierarchical in the way that, say, HTTP addresses are
      hierarchical; thus for example an XMPP address of the form
      <localpart@domainpart/foo/bar> does not identify a resource "bar"
      that exists below a resource "foo" in a hierarchy of resources
      associated with the entity "localpart@domain".

   o  The '@' character is allowed in the resourcepart, and is often
      used in the "nick" shown in XMPP chatrooms.  For example, the JID
      <room@chat.example.com/user@host> describes an entity who is an
      occupant of the room <room@chat.example.com> with an (asserted)
      nick of <user@host>.  However, chatroom services do not
      necessarily check such an asserted nick against the occupant's
      real JID.

3.  Internationalization Considerations

   XMPP servers MUST, and XMPP clients SHOULD, support [IDNA2003] for
   domainparts (including the [NAMEPREP] profile of [STRINGPREP]), the
   Nodeprep (Appendix A) profile of [STRINGPREP] for localparts, and the
   Resourceprep (Appendix B) profile of [STRINGPREP] for resourceparts;
   this enables XMPP addresses to include a wide variety of characters
   outside the US-ASCII range.  Rules for enforcement of the XMPP
   address format are provided in [rfc3920bis].

4.  Security Considerations

4.1.  Reuse of Stringprep

   The security considerations described in [STRINGPREP] apply to the
   Nodeprep (Appendix A) and Resourceprep (Appendix B) profiles defined
   in this document for XMPP localparts and resourceparts.  The security
   considerations described in [STRINGPREP] and [NAMEPREP] apply to the
   Nameprep profile that is re-used here for XMPP domainparts.

4.2.  Reuse of Unicode

   The security considerations described in [UNICODE-SEC] apply to the
   use of Unicode characters in XMPP addresses.

4.3.  Confusable Characters

   The Unicode and ISO/IEC 10646 repertoires have many characters that
   look similar (so-called "confusable characters").  In many cases,
   users of security protocols might perform visual matching, such as
   when comparing the names of trusted third parties.  Because it is
   impossible to map similar-looking characters without a great deal of
   context (such as knowing the fonts used), stringprep does nothing to
   map similar-looking characters together, nor to prohibit some
   characters because they look like others.  Some specific suggestions
   about identification and handling of confusable characters appear in
   the Unicode Security Considerations [UNICODE-SEC].

   A localpart can be employed as one part of an entity's address in
   XMPP.  One common usage is as the username of an instant messaging
   user; another is as the name of a multi-user chat room; and many
   other kinds of entities could use localparts as part of their
   addresses.  The security of such services could be compromised based
   on different interpretations of the internationalized localpart; for
   example, a user entering a single internationalized localpart could
   access another user's account information, or a user could gain
   access to a hidden or otherwise restricted chat room or service.

   A resourcepart can be employed as one part of an entity's address in
   XMPP.  One common usage is as the name for an instant messaging
   user's connected resource; another is as the nickname of a user in a
   multi-user chat room; and many other kinds of entities could use
   resourceparts as part of their addresses.  The security of such
   services could be compromised based on different interpretations of
   the internationalized resourcepart; for example, a user could attempt
   to initiate multiple connections with the same name, or a user could
   send a message to someone other than the intended recipient in a
   multi-user chat room.

4.4.  Address Spoofing

   There are two forms of address spoofing: forging and mimicking.

4.4.1.  Address Forging

   In the context of XMPP technologies, address forging occurs when an
   entity is able to generate an XML stanza whose 'from' address does
   not correspond to the account credentials with which the entity
   authenticated onto the network (or an authorization identity provided
   during SASL negotiation).  For example, address forging occurs if an
   entity that authenticated as "juliet@im.example.com" is able to send
   XML stanzas from "nurse@im.example.com" or "romeo@example.net".

   Address forging is difficult in XMPP systems, given the requirement
   for sending servers to stamp 'from' addresses and for receiving
   servers to verify sending domains via server-to-server authentication
   (see [rfc3920bis]).  However, address forging is not impossible,
   since a rogue server could forge JIDs at the sending domain by
   ignoring the stamping requirement.  Therefore, an entity outside the
   security perimeter of a particular server cannot reliably distinguish
   between bare JIDs of the form <localpart@domainpart> at that server
   and thus can authenticate only the domainpart of such JIDs with any
   level of assurance.  This specification does not define methods for
   discovering or counteracting such rogue servers.

   Furthermore, it is possible for an attacker to forge JIDs at other
   domains by means of a DNS poisoning attack if DNS security extensions
   [DNSSEC] are not used.

4.4.2.  Address Mimicking

   Address mimicking occurs when an entity provides legitimate
   authentication credentials for and sends XML stanzas from an account
   whose JID appears to a human user to be the same as another JID.  For
   example, in some XMPP clients the address "paypa1@example.org"
   (spelled with the number one as the final character of the localpart)
   might appear to be the same as "paypal@example.org" (spelled with the
   lower-case version of the letter "L"), especially on casual visual
   inspection; this phenomenon is sometimes called "typejacking".  A
   more sophisticated example of address mimicking might involve the use
   of characters from outside the US-ASCII range, such as the Cherokee
   characters U+13DA U+13A2 U+13B5 U+13AC U+13A2 U+13AC U+13D2 instead
   of the US-ASCII characters "STPETER".

   In some examples of address mimicking, it is unlikely that the
   average user could tell the difference between the real JID and the
   fake JID.  (Indeed, there is no way to distinguish with full
   certainty which is the fake JID and which is the real JID; in some
   communication contexts, the JID with Cherokee characters might be the
   real JID and the JID with US-ASCII characters might thus appear to be
   the fake JID.)  Because JIDs can contain almost any Unicode
   character, it can be relatively easy to mimic some JIDs in XMPP
   systems.  The possibility of address mimicking introduces security
   vulnerabilities of the kind that have also plagued the World Wide
   Web, specifically the phenomenon known as phishing.

   As noted in [IDNA-DEFS], "there are no comprehensive technical
   solutions to the problems of confusable characters".
   Mimicked JIDs that involve characters from only one character set or
   from the character set typically employed by a particular user are
   not easy to combat (e.g., the simple typejacking attack previously
   described, which relies on a surface similarity between the
   characters "1" and "l" in some presentations).  However, mimicked
   addresses that involve characters from more than one character set,
   or from a character set not typically employed by a particular user,
   can be mitigated somewhat through intelligent presentation.  In
   particular, every human user of an XMPP technology presumably has a
   preferred language (or, in some cases, a small set of preferred
   languages), which an XMPP application SHOULD gather either explicitly
   from the user or implicitly via the operating system of the user's
   device.

   Furthermore, every language has a range (or a small set of ranges) of
   characters normally used to represent that language in textual form.
   Therefore, an XMPP application SHOULD warn the user when presenting a
   JID that mixes characters from more than one character set or that
   uses characters outside the normal range of the user's preferred
   language(s).  This recommendation is not intended to discourage
   communication across language communities; instead, it recognizes the
   existence of such language communities and encourages due caution
   when presenting unfamiliar character sets to human users.

5.  IANA Considerations

   The following sections update the registrations provided in
   [RFC3920].

5.1.  Nodeprep Profile of Stringprep

   The Nodeprep profile of stringprep is defined under Nodeprep
   (Appendix A).  The IANA has registered Nodeprep in the stringprep
   profile registry.

   Name of this profile:

      Nodeprep

   RFC in which the profile is defined:

      XXXX

   Indicator whether or not this is the newest version of the profile:

      This is the first version of Nodeprep

5.2.  Resourceprep Profile of Stringprep

   The Resourceprep profile of stringprep is defined under Resourceprep
   (Appendix B).  The IANA has registered Resourceprep in the stringprep
   profile registry.

   Name of this profile:

      Resourceprep

   RFC in which the profile is defined:

      XXXX

   Indicator whether or not this is the newest version of the profile:

      This is the first version of Resourceprep

6.  Conformance Requirements

   This section describes a protocol feature set that summarizes the
   conformance requirements of this specification.  This feature set is
   appropriate for use in software certification, interoperability
   testing, and implementation reports.  For each feature, this section
   provides the following information:

   o  A human-readable name

   o  An informational description

   o  A reference to the particular section of this document that
      normatively defines the feature

   o  Whether the feature applies to the Client role, the Server role,
      or both (where "N/A" signifies that the feature is not applicable
      to the specified role)

   o  Whether the feature MUST or SHOULD be implemented, where the
      capitalized terms are to be understood as described in [KEYWORDS]

   The feature set specified here attempts to adhere to the concepts and
   formats proposed by Larry Masinter within the IETF's NEWTRK Working
   Group in 2005, as captured in [INTEROP].  Although this feature set
   is more detailed than called for by [REPORTS], it provides a suitable
   basis for the generation of implementation reports to be submitted in
   support of advancing this specification from Proposed Standard to
   Draft Standard in accordance with [PROCESS].
549 Feature: address-domain-length 550 Description: Ensure that the domainpart of an XMPP address is at 551 least one byte in length and at most 1023 bytes in length. 552 Section: Section 2.2 553 Roles: Both MUST. 555 Feature: address-domain-prep 556 Description: Ensure that the domainpart of an XMPP address conforms 557 to the Nameprep profile of Stringprep. 558 Section: Section 2.2 559 Roles: Client SHOULD, Server MUST. 561 Feature: address-localpart-length 562 Description: Ensure that the localpart of an XMPP address is at 563 least one byte in length and at most 1023 bytes in length. 564 Section: Section 2.3 565 Roles: Both MUST. 567 Feature: address-localpart-prep 568 Description: Ensure that the localpart of an XMPP address conforms 569 to the Nodeprep profile of Stringprep. 570 Section: Section 2.3 571 Roles: Client SHOULD, Server MUST. 573 Feature: address-resource-length 574 Description: Ensure that the resourcepart of an XMPP address is at 575 least one byte in length and at most 1023 bytes in length. 576 Section: Section 2.4 577 Roles: Both MUST. 579 Feature: address-resource-prep 580 Description: Ensure that the resourcepart of an XMPP address 581 conforms to the Resourceprep profile of Stringprep. 582 Section: Section 2.2 583 Roles: Client SHOULD, Server MUST. 585 7. References 587 7.1. Normative References 589 [ABNF] Crocker, D. and P. Overell, "Augmented BNF for Syntax 590 Specifications: ABNF", STD 68, RFC 5234, January 2008. 592 [IDNA2003] 593 Faltstrom, P., Hoffman, P., and A. Costello, 594 "Internationalizing Domain Names in Applications (IDNA)", 595 RFC 3490, March 2003. 597 See Section 1 for an explanation of why the normative 598 reference to an obsoleted specification is needed. 600 [KEYWORDS] 601 Bradner, S., "Key words for use in RFCs to Indicate 602 Requirement Levels", BCP 14, RFC 2119, March 1997. 604 [NAMEPREP] 605 Hoffman, P. and M. 
Blanchet, "Nameprep: A Stringprep 606 Profile for Internationalized Domain Names (IDN)", 607 RFC 3491, March 2003. 609 See Section 1 for an explanation of why the normative 610 reference to an obsoleted specification is needed. 612 [rfc3920bis] 613 Saint-Andre, P., "Extensible Messaging and Presence 614 Protocol (XMPP): Core", draft-ietf-xmpp-3920bis-17 (work 615 in progress), October 2010. 617 [STRINGPREP] 618 Hoffman, P. and M. Blanchet, "Preparation of 619 Internationalized Strings ("stringprep")", RFC 3454, 620 December 2002. 622 [UNICODE] The Unicode Consortium, "The Unicode Standard, Version 623 3.2.0", 2000. 625 The Unicode Standard, Version 3.2.0 is defined by The 626 Unicode Standard, Version 3.0 (Reading, MA, Addison- 627 Wesley, 2000. ISBN 0-201-61633-5), as amended by the 628 Unicode Standard Annex #27: Unicode 3.1 629 (http://www.unicode.org/reports/tr27/) and by the Unicode 630 Standard Annex #28: Unicode 3.2 631 (http://www.unicode.org/reports/tr28/). 633 [UNICODE-SEC] 634 The Unicode Consortium, "Unicode Technical Report #36: 635 Unicode Security Considerations", 2008. 637 [UTF-8] Yergeau, F., "UTF-8, a transformation format of ISO 638 10646", STD 63, RFC 3629, November 2003. 640 7.2. Informative References 642 [DNS] Mockapetris, P., "Domain names - implementation and 643 specification", STD 13, RFC 1035, November 1987. 645 [DNSSEC] Arends, R., Austein, R., Larson, M., Massey, D., and S. 646 Rose, "DNS Security Introduction and Requirements", 647 RFC 4033, March 2005. 649 [IDNA-DEFS] 650 Klensin, J., "Internationalized Domain Names for 651 Applications (IDNA): Definitions and Document Framework", 652 RFC 5890, August 2010. 654 [IDNA-PROTO] 655 Klensin, J., "Internationalized Domain Names in 656 Applications (IDNA): Protocol", RFC 5891, August 2010. 658 [INTEROP] Masinter, L., "Formalizing IETF Interoperability 659 Reporting", draft-ietf-newtrk-interop-reports-00 (work in 660 progress), October 2005. 662 [IRI] Duerst, M. and M. 
              Suignard, "Internationalized Resource
              Identifiers (IRIs)", RFC 3987, January 2005.

   [PROCESS]  Bradner, S., "The Internet Standards Process -- Revision
              3", BCP 9, RFC 2026, October 1996.

   [PUNYCODE]
              Costello, A., "Punycode: A Bootstring encoding of Unicode
              for Internationalized Domain Names in Applications
              (IDNA)", RFC 3492, March 2003.

   [REPORTS]  Dusseault, L. and R. Sparks, "Guidance on Interoperation
              and Implementation Reports for Advancement to Draft
              Standard", BCP 9, RFC 5657, September 2009.

   [RFC3920]  Saint-Andre, P., Ed., "Extensible Messaging and Presence
              Protocol (XMPP): Core", RFC 3920, October 2004.

   [URI]      Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
              Resource Identifier (URI): Generic Syntax", STD 66,
              RFC 3986, January 2005.

   [XEP-0029]
              Kaes, C., "Definition of Jabber Identifiers (JIDs)", XSF
              XEP 0029, October 2003.

   [XEP-0030]
              Hildebrand, J., Millard, P., Eatmon, R., and
              P. Saint-Andre, "Service Discovery", XSF XEP 0030,
              June 2008.

   [XEP-0045]
              Saint-Andre, P., "Multi-User Chat", XSF XEP 0045,
              in progress, last updated January 2010.

   [XEP-0060]
              Millard, P., Saint-Andre, P., and R. Meijer,
              "Publish-Subscribe", XSF XEP 0060, September 2008.

   [XML]      Paoli, J., Maler, E., Sperberg-McQueen, C., Yergeau, F.,
              and T. Bray, "Extensible Markup Language (XML) 1.0 (Fourth
              Edition)", World Wide Web Consortium Recommendation
              REC-xml-20060816, August 2006.

   [XMPP-URI]
              Saint-Andre, P., "Internationalized Resource Identifiers
              (IRIs) and Uniform Resource Identifiers (URIs) for the
              Extensible Messaging and Presence Protocol (XMPP)",
              RFC 5122, February 2008.

Appendix A. Nodeprep

A.1. Introduction

   This appendix defines the "Nodeprep" profile of stringprep.
   As such, it specifies processing rules that will enable users to
   enter internationalized localparts in the Extensible Messaging and
   Presence Protocol (XMPP) and have the highest chance of getting the
   content of the strings correct. (An XMPP localpart is the optional
   portion of an XMPP address that precedes an XMPP domainpart and the
   '@' separator; it is often but not exclusively associated with an
   instant messaging username.) These processing rules are intended
   only for XMPP localparts and are not intended for arbitrary text or
   any other aspect of an XMPP address.

   This profile defines the following, as required by [STRINGPREP]:

   o  The intended applicability of the profile: internationalized
      localparts within XMPP
   o  The character repertoire that is the input and output to
      stringprep: Unicode 3.2, specified in Appendix A.2
   o  The mappings used: specified in Appendix A.3
   o  The Unicode normalization used: specified in Appendix A.4
   o  The characters that are prohibited as output: specified in
      Appendix A.5
   o  Bidirectional character handling: specified in Appendix A.6

A.2. Character Repertoire

   This profile uses Unicode 3.2 with the list of unassigned code points
   being Table A.1, both defined in Appendix A of [STRINGPREP].

A.3. Mapping

   This profile specifies mapping using the following tables from
   [STRINGPREP]:

      Table B.1
      Table B.2

A.4. Normalization

   This profile specifies the use of Unicode normalization form KC, as
   described in [STRINGPREP].

A.5. Prohibited Output

   This profile prohibits as output the characters listed in the
   following tables from [STRINGPREP]:
      Table C.1.1
      Table C.1.2
      Table C.2.1
      Table C.2.2
      Table C.3
      Table C.4
      Table C.5
      Table C.6
      Table C.7
      Table C.8
      Table C.9

   In addition, the following Unicode characters are prohibited:

      U+0022 (QUOTATION MARK), i.e., "
      U+0026 (AMPERSAND), i.e., &
      U+0027 (APOSTROPHE), i.e., '
      U+002F (SOLIDUS), i.e., /
      U+003A (COLON), i.e., :
      U+003C (LESS-THAN SIGN), i.e., <
      U+003E (GREATER-THAN SIGN), i.e., >
      U+0040 (COMMERCIAL AT), i.e., @

A.6. Bidirectional Characters

   This profile specifies checking bidirectional strings, as described
   in Section 6 of [STRINGPREP].

A.7. Notes

   Because the additional characters prohibited by Nodeprep are
   prohibited after normalization, an implementation MUST NOT enable a
   human user to input any Unicode code point whose decomposition
   includes those characters; such code points include but are not
   necessarily limited to the following (refer to [UNICODE] for complete
   information).

   o  U+2100 (ACCOUNT OF)
   o  U+2101 (ADDRESSED TO THE SUBJECT)
   o  U+2105 (CARE OF)
   o  U+2106 (CADA UNA)
   o  U+226E (NOT LESS-THAN)
   o  U+226F (NOT GREATER-THAN)
   o  U+2A74 (DOUBLE COLON EQUAL)
   o  U+FE55 (SMALL COLON)
   o  U+FE60 (SMALL AMPERSAND)
   o  U+FE64 (SMALL LESS-THAN SIGN)
   o  U+FE65 (SMALL GREATER-THAN SIGN)
   o  U+FE6B (SMALL COMMERCIAL AT)
   o  U+FF02 (FULLWIDTH QUOTATION MARK)
   o  U+FF06 (FULLWIDTH AMPERSAND)
   o  U+FF07 (FULLWIDTH APOSTROPHE)
   o  U+FF0F (FULLWIDTH SOLIDUS)
   o  U+FF1A (FULLWIDTH COLON)
   o  U+FF1C (FULLWIDTH LESS-THAN SIGN)
   o  U+FF1E (FULLWIDTH GREATER-THAN SIGN)
   o  U+FF20 (FULLWIDTH COMMERCIAL AT)

Appendix B. Resourceprep

B.1. Introduction

   This appendix defines the "Resourceprep" profile of stringprep.
   As such, it specifies processing rules that will enable users to
   enter internationalized resourceparts in the Extensible Messaging and
   Presence Protocol (XMPP) and have the highest chance of getting the
   content of the strings correct. (An XMPP resourcepart is the
   optional portion of an XMPP address that follows an XMPP domainpart
   and the '/' separator.) These processing rules are intended only for
   XMPP resourceparts and are not intended for arbitrary text or any
   other aspect of an XMPP address.

   This profile defines the following, as required by [STRINGPREP]:

   o  The intended applicability of the profile: internationalized
      resourceparts within XMPP
   o  The character repertoire that is the input and output to
      stringprep: Unicode 3.2, specified in Appendix B.2
   o  The mappings used: specified in Appendix B.3
   o  The Unicode normalization used: specified in Appendix B.4
   o  The characters that are prohibited as output: specified in
      Appendix B.5
   o  Bidirectional character handling: specified in Appendix B.6

B.2. Character Repertoire

   This profile uses Unicode 3.2 with the list of unassigned code points
   being Table A.1, both defined in Appendix A of [STRINGPREP].

B.3. Mapping

   This profile specifies mapping using the following tables from
   [STRINGPREP]:

      Table B.1

B.4. Normalization

   This profile specifies the use of Unicode normalization form KC, as
   described in [STRINGPREP].

B.5. Prohibited Output

   This profile prohibits as output the characters listed in the
   following tables from [STRINGPREP]:

      Table C.1.2
      Table C.2.1
      Table C.2.2
      Table C.3
      Table C.4
      Table C.5
      Table C.6
      Table C.7
      Table C.8
      Table C.9

B.6. Bidirectional Characters

   This profile specifies checking bidirectional strings, as described
   in Section 6 of [STRINGPREP].

Appendix C.
Differences From RFC 3920

   Based on consensus derived from implementation and deployment
   experience as well as formal interoperability testing, the following
   substantive modifications were made from RFC 3920.

   o  Corrected the ABNF syntax to (1) ensure consistency with [URI] and
      [IRI], and (2) prevent zero-length localparts, domainparts, and
      resourceparts.
   o  To avoid confusion with the term "node" as used in [XEP-0030] and
      [XEP-0060], changed the term "node identifier" to "localpart" (but
      retained the name "Nodeprep" for backward compatibility).
   o  To avoid confusion with the terms "resource" and "identifier" as
      used in [URI], changed the term "resource identifier" to
      "resourcepart".
   o  Corrected the nameprep processing rules to require use of the
      UseSTD3ASCIIRules flag.

Author's Address

   Peter Saint-Andre
   Cisco
   1899 Wynkoop Street, Suite 600
   Denver, CO 80202
   USA

   Phone: +1-303-308-3282
   Email: psaintan@cisco.com
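
Non-normative illustration. The Nodeprep steps of Appendix A (mapping, normalization, prohibited output, bidirectional check) together with the localpart length limits can be sketched as follows. This is a minimal sketch, assuming Python and its standard-library stringprep module; it is not part of the specification. The names nodeprep, UCD32, EXTRA_PROHIBITED, PROHIBITED_TABLES, and MAX_BYTES are hypothetical. The sketch also assumes that length is measured on the UTF-8 encoding of the prepared string, and it omits the unassigned code point check (Table A.1).

```python
import stringprep
import unicodedata

# Unicode 3.2 database, the repertoire named in Appendix A.2.
UCD32 = unicodedata.ucd_3_2_0

# Additional characters prohibited by Appendix A.5: " & ' / : < > @
EXTRA_PROHIBITED = frozenset("\"&'/:<>@")

# Membership tests for the [STRINGPREP] tables listed in Appendix A.5.
PROHIBITED_TABLES = (
    stringprep.in_table_c11, stringprep.in_table_c12,
    stringprep.in_table_c21, stringprep.in_table_c22,
    stringprep.in_table_c3, stringprep.in_table_c4,
    stringprep.in_table_c5, stringprep.in_table_c6,
    stringprep.in_table_c7, stringprep.in_table_c8,
    stringprep.in_table_c9,
)

MAX_BYTES = 1023  # upper bound from the address-localpart-length feature


def nodeprep(localpart: str) -> str:
    """Hypothetical helper: return the prepared localpart or raise ValueError."""
    # A.3: map Table B.1 characters to nothing, then case-fold with Table B.2.
    mapped = "".join(
        stringprep.map_table_b2(ch)
        for ch in localpart
        if not stringprep.in_table_b1(ch)
    )
    # A.4: Unicode normalization form KC.
    prepared = UCD32.normalize("NFKC", mapped)
    # A.5: reject prohibited output, checked after normalization (see A.7).
    for ch in prepared:
        if ch in EXTRA_PROHIBITED or any(test(ch) for test in PROHIBITED_TABLES):
            raise ValueError(f"prohibited code point U+{ord(ch):04X}")
    # A.6: bidirectional check from Section 6 of [STRINGPREP].
    is_randal = [stringprep.in_table_d1(ch) for ch in prepared]
    if any(is_randal):
        if any(stringprep.in_table_d2(ch) for ch in prepared):
            raise ValueError("RandALCat and LCat characters are mixed")
        if not (is_randal[0] and is_randal[-1]):
            raise ValueError("RandALCat character must be first and last")
    # Length limits: 1 to 1023 bytes (assumed to apply to the UTF-8 form).
    size = len(prepared.encode("utf-8"))
    if not 1 <= size <= MAX_BYTES:
        raise ValueError(f"localpart is {size} bytes; need 1..{MAX_BYTES}")
    return prepared
```

For example, nodeprep("Juliet") returns "juliet", while nodeprep("\uff20") raises ValueError: U+FF20 (FULLWIDTH COMMERCIAL AT) normalizes to U+0040, which is the situation described in Appendix A.7.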