XMPP                                                      P. Saint-Andre
Internet-Draft                                                     Cisco
Updates: 3920 (if approved)                            December 10, 2010
Intended status: Standards Track
Expires: June 13, 2011


   Extensible Messaging and Presence Protocol (XMPP): Address Format
                       draft-ietf-xmpp-address-08

Abstract

   This document defines the format for addresses used in the Extensible
   Messaging and Presence Protocol (XMPP), including support for non-
   ASCII characters. This document updates RFC 3920.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on June 13, 2011.

Copyright Notice

   Copyright (c) 2010 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Introduction
     1.1. Overview
     1.2. Terminology
   2. Addresses
     2.1. Fundamentals
     2.2. Domainpart
     2.3. Localpart
     2.4. Resourcepart
   3. Internationalization Considerations
   4. Security Considerations
     4.1. Reuse of Stringprep
     4.2. Reuse of Unicode
     4.3. Address Spoofing
       4.3.1. Address Forging
       4.3.2. Address Mimicking
   5. IANA Considerations
     5.1. Nodeprep Profile of Stringprep
     5.2. Resourceprep Profile of Stringprep
   6. Conformance Requirements
   7. References
     7.1. Normative References
     7.2. Informative References
   Appendix A. Nodeprep
     A.1. Introduction
     A.2. Character Repertoire
     A.3. Mapping
     A.4. Normalization
     A.5. Prohibited Output
     A.6. Bidirectional Characters
     A.7. Notes
   Appendix B. Resourceprep
     B.1. Introduction
     B.2. Character Repertoire
     B.3. Mapping
     B.4. Normalization
     B.5. Prohibited Output
     B.6. Bidirectional Characters
   Appendix C. Differences From RFC 3920
   Appendix D. Acknowledgements
   Author's Address

1. Introduction

1.1. Overview

   The Extensible Messaging and Presence Protocol (XMPP) is an
   application profile of the Extensible Markup Language [XML] for
   streaming XML data in close to real time between any two or more
   network-aware entities. The address format for XMPP entities was
   originally developed in the Jabber open-source community in 1999,
   first described by [XEP-0029] in 2002, and defined canonically by
   [RFC3920] in 2004.

   As specified in RFC 3920, the XMPP address format re-uses the
   "stringprep" technology for preparation of non-ASCII characters
   [STRINGPREP], including the Nameprep profile for internationalized
   domain names as specified in [NAMEPREP] and [IDNA2003] along with two
   XMPP-specific profiles for the localpart and resourcepart.

   Since the publication of RFC 3920, IDNA2003 has been superseded by
   IDNA2008 (see [IDNA-PROTO] and related documents), which is not based
   on stringprep. Following the lead of the IDNA community, other
   technology communities that use stringprep have begun discussions
   about migrating away from stringprep toward more "modern" approaches.
   The XMPP community is participating in those discussions in order to
   find a replacement for the Nodeprep and Resourceprep profiles of
   stringprep defined in RFC 3920. However, work on updated handling of
   internationalized addresses is currently in progress within the
   PRECIS Working Group and at the time of this writing it seems that
   such work might take several years to complete. Because all other
   aspects of revised documentation for XMPP have been incorporated into
   [XMPP], the XMPP Working Group decided to split the XMPP address
   format into a separate specification so as not to significantly delay
   publication of improved documentation for XMPP while awaiting the
   conclusion of work on updated handling of internationalized
   addresses.

   Therefore, this specification provides corrected documentation of the
   XMPP address format using the internationalization technologies
   available in 2004 (when RFC 3920 was published), with the intent that
   this specification will be superseded as soon as a new approach to
   preparation and comparison of internationalized strings has been
   defined by the PRECIS Working Group and applied to the specific
   cases of XMPP localparts and resourceparts.
   In the meantime, this document normatively references [IDNA2003] and
   [NAMEPREP]; XMPP software implementations are encouraged to begin
   migrating to IDNA2008 (see [IDNA-PROTO] and related documents)
   because it is nearly certain that the specification superseding this
   one will re-use IDNA2008.

   This document updates RFC 3920.

1.2. Terminology

   Many important terms used in this document are defined in [IDNA2003],
   [STRINGPREP], [UNICODE], and [XMPP].

   The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   [KEYWORDS].

2. Addresses

2.1. Fundamentals

   An XMPP entity is anything that is network-addressable and that can
   communicate using XMPP. For historical reasons, the native address
   of an XMPP entity is called a Jabber Identifier or JID. A valid JID
   is a string of [UNICODE] code points, encoded using [UTF-8], and
   structured as an ordered sequence of localpart, domainpart, and
   resourcepart (where the first two parts are demarcated by the '@'
   character used as a separator, and the last two parts are similarly
   demarcated by the '/' character).

   The syntax for a JID is defined as follows using the Augmented
   Backus-Naur Form as specified in [ABNF].

      jid          = [ localpart "@" ] domainpart [ "/" resourcepart ]
      localpart    = 1*(nodepoint)
                     ;
                     ; a "nodepoint" is a UTF-8 encoded Unicode code
                     ; point that satisfies the Nodeprep profile of
                     ; stringprep
                     ;
      domainpart   = IP-literal / IPv4address / ifqdn
                     ;
                     ; the "IPv4address" and "IP-literal" rules are
                     ; defined in RFC 3986, and the first-match-wins
                     ; (a.k.a. "greedy") algorithm described in RFC
                     ; 3986 applies to the matching process
                     ;
                     ; note well that re-use of the IP-literal rule
                     ; from RFC 3986 implies that IPv6 addresses are
                     ; enclosed in square brackets (i.e., beginning
                     ; with '[' and ending with ']'), which was not
                     ; the case in RFC 3920
                     ;
      ifqdn        = 1*(namepoint)
                     ;
                     ; a "namepoint" is a UTF-8 encoded Unicode
                     ; code point that satisfies the Nameprep
                     ; profile of stringprep
                     ;
      resourcepart = 1*(resourcepoint)
                     ;
                     ; a "resourcepoint" is a UTF-8 encoded Unicode
                     ; code point that satisfies the Resourceprep
                     ; profile of stringprep
                     ;

   All JIDs are based on the foregoing structure.

   Each allowable portion of a JID (localpart, domainpart, and
   resourcepart) MUST NOT be zero bytes in length and MUST NOT be more
   than 1023 bytes in length, resulting in a maximum total size
   (including the '@' and '/' separators) of 3071 bytes.

   For the purpose of communication over an XMPP network (e.g., in the
   'to' or 'from' address of an XMPP stanza), an entity's address MUST
   be represented as a JID, not as a Uniform Resource Identifier [URI]
   or Internationalized Resource Identifier [IRI]. An XMPP IRI
   [XMPP-URI] is in essence a JID prepended with 'xmpp:'; however, the
   native addressing format used in XMPP is that of a mere JID without a
   URI scheme. [XMPP-URI] is provided only for identification and
   interaction outside the context of XMPP itself, for example when
   linking to a JID from a web page. See [XMPP-URI] for a description
   of the process for securely extracting a JID from an XMPP URI or IRI.
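
   As a non-normative illustration of the jid rule and the per-part
   length limits in this section, the following Python sketch splits a
   JID string on its raw separator characters and checks each part
   against the 1..1023 byte range. The function name split_jid and the
   constant MAX_PART_BYTES are hypothetical names chosen for this
   example. Splitting at the first '/' before looking for '@' is an
   assumption of the sketch, chosen because Section 2.4 permits '@'
   inside a resourcepart. No stringprep processing is performed, so the
   length check here is only a pre-check on the raw input.

```python
MAX_PART_BYTES = 1023  # per-part limit stated in Section 2.1


def split_jid(jid: str):
    """Split a JID string into (localpart, domainpart, resourcepart).

    Absent optional parts are returned as None. Only the raw separators
    are matched; no stringprep profile is applied (illustrative sketch).
    """
    # Assumption: the first '/' ends the domainpart, so a resourcepart
    # may itself contain '/' or '@'.
    bare, slash, resource = jid.partition("/")
    if "@" in bare:
        local, _, domain = bare.partition("@")
    else:
        local, domain = None, bare
    parts = {
        "localpart": local,
        "domainpart": domain,
        "resourcepart": resource if slash else None,
    }
    for name, value in parts.items():
        if value is None:
            continue
        size = len(value.encode("utf-8"))  # limits are counted in bytes
        if size == 0 or size > MAX_PART_BYTES:
            raise ValueError(
                f"{name} is {size} bytes; allowed range is 1..{MAX_PART_BYTES}"
            )
    return parts["localpart"], parts["domainpart"], parts["resourcepart"]
```

   For instance, split_jid("juliet@example.com/balcony") yields the
   three parts "juliet", "example.com", and "balcony", while an input
   such as "@example.com" is rejected because its localpart is empty.
   The 3071-byte total follows from three parts of 1023 bytes plus the
   two one-byte separators.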

      Implementation Note: When dividing a JID into its component parts,
      an implementation needs to match the separator characters '@' and
      '/' before applying any transformation algorithms, which might
      decompose certain Unicode code points to the separator characters
      (e.g., U+FE6B SMALL COMMERCIAL AT might decompose into U+0040
      COMMERCIAL AT).

2.2. Domainpart

   The domainpart of a JID is that portion after the '@' character (if
   any) and before the '/' character (if any); it is the primary
   identifier and is the only REQUIRED element of a JID (a mere
   domainpart is a valid JID). Typically a domainpart identifies the
   "home" server to which clients connect for XML routing and data
   management functionality. However, it is not necessary for an XMPP
   domainpart to identify an entity that provides core XMPP server
   functionality (e.g., a domainpart can identify an entity such as a
   multi-user chat service, a publish-subscribe service, or a user
   directory).

   The domainpart for every XMPP service MUST be a fully qualified
   domain name ("FQDN"; see [DNS]), IPv4 address, IPv6 address, or
   unqualified hostname (i.e., a text label that is resolvable on a
   local network).

      Interoperability Note: Domainparts that are IP addresses might not
      be accepted by other services for the sake of server-to-server
      communication, and domainparts that are unqualified hostnames
      cannot be used on public networks because they are resolvable only
      on a local network.
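
   The kinds of domainpart listed above can be told apart with a
   first-match check in the order given by the domainpart ABNF rule
   (IP-literal, then IPv4address, then a name). The following Python
   sketch is non-normative and relies on the standard ipaddress module;
   the function name classify_domainpart and its return labels are
   hypothetical. It treats a bracketed value as an IPv6 address only,
   which is a simplification of the RFC 3986 IP-literal rule, and it
   does not distinguish an FQDN from an unqualified hostname.

```python
import ipaddress


def classify_domainpart(domainpart: str) -> str:
    """Return 'ip-literal', 'ipv4', or 'name' for a domainpart string.

    Illustrative only: 'name' covers both FQDNs and unqualified
    hostnames, and no Nameprep or ToASCII processing is attempted.
    """
    # IP-literal: IPv6 addresses are enclosed in square brackets.
    if domainpart.startswith("[") and domainpart.endswith("]"):
        try:
            ipaddress.IPv6Address(domainpart[1:-1])
        except ValueError:
            raise ValueError("bracketed domainpart is not an IPv6 address") from None
        return "ip-literal"
    # IPv4address: tried before falling back to a name (first match wins).
    try:
        ipaddress.IPv4Address(domainpart)
        return "ipv4"
    except ValueError:
        return "name"
```

   A service might use such a check to decide, for example, whether to
   accept an IP-address domainpart for server-to-server communication.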

   If the domainpart includes a final character considered to be a label
   separator (dot) by [IDNA2003] or [DNS], this character MUST be
   stripped from the domainpart before the JID of which it is a part is
   used for the purpose of routing an XML stanza, comparing against
   another JID, or constructing an [XMPP-URI]; in particular, the
   character MUST be stripped before any other canonicalization steps
   are taken, such as application of the [NAMEPREP] profile of
   [STRINGPREP] or completion of the ToASCII operation as described in
   [IDNA2003].

   A domainpart consisting of a fully qualified domain name MUST be an
   "internationalized domain name" as defined in [IDNA2003], that is, it
   MUST be "a domain name in which every label is an internationalized
   label" and MUST follow the rules for construction of
   internationalized domain names specified in [IDNA2003]. When
   preparing a text label (consisting of a sequence of UTF-8 encoded
   Unicode code points) for representation as an internationalized label
   in the process of constructing an XMPP domainpart or comparing two
   XMPP domainparts, an application MUST ensure that for each text label
   it is possible to apply without failing the ToASCII operation
   specified in [IDNA2003] with the UseSTD3ASCIIRules flag set (thus
   forbidding ASCII code points other than letters, digits, and
   hyphens). If the ToASCII operation can be applied without failing,
   then the label is an internationalized label. (Note: The ToASCII
   operation includes application of the [NAMEPREP] profile of
   [STRINGPREP] and encoding using the algorithm specified in
   [PUNYCODE]; for details, see [IDNA2003].) Although XMPP applications
   do not communicate the output of the ToASCII operation (called an
   "ACE label") over the wire, it MUST be possible to apply that
   operation without failing to each internationalized label.
   If an XMPP application receives as input an ACE label, it SHOULD
   convert that ACE label to an internationalized label using the
   ToUnicode operation (see [IDNA2003]) before including the label in an
   XMPP domainpart that will be communicated over the wire on an XMPP
   network (however, instead of converting the label, there are
   legitimate reasons why an application might instead refuse the input
   altogether and return an error to the entity that provided the
   offending data).

   A domainpart MUST NOT be zero bytes in length and MUST NOT be more
   than 1023 bytes in length. This rule is to be enforced after any
   mapping or normalization resulting from application of the Nameprep
   profile of stringprep (e.g., in Nameprep some characters can be
   mapped to nothing, which might result in a string of zero length).
   Naturally, the length limits of [DNS] apply, and nothing in this
   document is to be interpreted as overriding those more fundamental
   limits.

   In the terms of IDNA2008 [IDNA-DEFS], the domainpart of a JID is a
   "domain name slot".

2.3. Localpart

   The localpart of a JID is an optional identifier placed before the
   domainpart and separated from the latter by the '@' character.
   Typically a localpart uniquely identifies the entity requesting and
   using network access provided by a server (i.e., a local account),
   although it can also represent other kinds of entities (e.g., a chat
   room associated with a multi-user chat service). The entity
   represented by an XMPP localpart is addressed within the context of a
   specific domain (i.e., <localpart@domainpart>).

   A localpart MUST be formatted such that the Nodeprep profile of
   [STRINGPREP] can be applied without failing (see Appendix A).
   Before comparing two localparts, an application MUST first ensure
   that the Nodeprep profile has been applied to each identifier (the
   profile need not be applied each time a comparison is made, as long
   as it has been applied before comparison).

   A localpart MUST NOT be zero bytes in length and MUST NOT be more
   than 1023 bytes in length. This rule is to be enforced after any
   mapping or normalization resulting from application of the Nodeprep
   profile of stringprep (e.g., in Nodeprep some characters can be
   mapped to nothing, which might result in a string of zero length).

2.4. Resourcepart

   The resourcepart of a JID is an optional identifier placed after the
   domainpart and separated from the latter by the '/' character. A
   resourcepart can modify either a <localpart@domainpart> address or a
   mere <domainpart> address. Typically a resourcepart uniquely
   identifies a specific connection (e.g., a device or location) or
   object (e.g., an occupant in a multi-user chat room) belonging to the
   entity associated with an XMPP localpart at a domain (i.e.,
   <localpart@domainpart/resourcepart>).

   A resourcepart MUST be formatted such that the Resourceprep profile
   of [STRINGPREP] can be applied without failing (see Appendix B).
   Before comparing two resourceparts, an application MUST first ensure
   that the Resourceprep profile has been applied to each identifier
   (the profile need not be applied each time a comparison is made, as
   long as it has been applied before comparison).

   A resourcepart MUST NOT be zero bytes in length and MUST NOT be more
   than 1023 bytes in length. This rule is to be enforced after any
   mapping or normalization resulting from application of the
   Resourceprep profile of stringprep (e.g., in Resourceprep some
   characters can be mapped to nothing, which might result in a string
   of zero length).
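
   The requirement to enforce the length rule after mapping and
   normalization can be illustrated with a short, non-normative Python
   sketch. It uses the standard stringprep and unicodedata modules and
   performs only two steps: removal of the characters that [STRINGPREP]
   Table B.1 maps to nothing, and NFKC normalization against the
   Unicode 3.2 database. This is deliberately NOT a full Nodeprep or
   Resourceprep implementation (case folding, prohibited-output checks,
   and bidirectional checks are omitted), and the function name
   map_then_check_length is hypothetical.

```python
import stringprep
import unicodedata

MAX_PART_BYTES = 1023  # per-part limit stated in this section


def map_then_check_length(part: str) -> str:
    """Apply a partial stringprep-style mapping, then enforce the length rule.

    Sketch only: just the "map to nothing" step (Table B.1) and NFKC
    normalization are performed; the remaining profile steps are omitted.
    """
    # Drop characters that stringprep Table B.1 maps to nothing.
    mapped = "".join(ch for ch in part if not stringprep.in_table_b1(ch))
    # stringprep is defined against Unicode 3.2, hence ucd_3_2_0.
    normalized = unicodedata.ucd_3_2_0.normalize("NFKC", mapped)
    # The length rule is checked on the result, counted in UTF-8 bytes.
    size = len(normalized.encode("utf-8"))
    if size == 0 or size > MAX_PART_BYTES:
        raise ValueError(f"part is {size} bytes after mapping")
    return normalized
```

   With this sketch, an input consisting solely of U+00AD SOFT HYPHEN
   (a Table B.1 character) is rejected, because it is empty once the
   mapping has been applied even though the raw input was not.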

      Informational Note: For historical reasons, the term "resource
      identifier" is often used in XMPP to refer to the optional portion
      of an XMPP address that follows the domainpart and the "/"
      separator character; to help prevent confusion between an XMPP
      "resource identifier" and the meanings of "resource" and
      "identifier" provided in Section 1.1 of [URI], this specification
      uses the term "resourcepart" instead of "resource identifier" (as
      in RFC 3920).

   XMPP entities SHOULD consider resourceparts to be opaque strings and
   SHOULD NOT impute meaning to any given resourcepart. In particular:

   o  Use of the '/' character as a separator between the domainpart and
      the resourcepart does not imply that XMPP addresses are
      hierarchical in the way that, say, HTTP addresses are
      hierarchical; thus for example an XMPP address of the form
      <localpart@domainpart/foo/bar> does not identify a resource "bar"
      that exists below a resource "foo" in a hierarchy of resources
      associated with the entity "localpart@domain".

   o  The '@' character is allowed in the resourcepart, and is often
      used in the "nick" shown in XMPP chatrooms. For example, the JID
      <room@chat.example.com/user@host> describes an entity who is an
      occupant of the room <room@chat.example.com> with an (asserted)
      nick of <user@host>. However, chatroom services do not
      necessarily check such an asserted nick against the occupant's
      real JID.

3. Internationalization Considerations

   XMPP servers MUST, and XMPP clients SHOULD, support [IDNA2003] for
   domainparts (including the [NAMEPREP] profile of [STRINGPREP]), the
   Nodeprep (Appendix A) profile of [STRINGPREP] for localparts, and the
   Resourceprep (Appendix B) profile of [STRINGPREP] for resourceparts;
   this enables XMPP addresses to include a wide variety of characters
   outside the US-ASCII range. Rules for enforcement of the XMPP
   address format are provided in [XMPP].

4. Security Considerations

4.1. Reuse of Stringprep

   The security considerations described in [STRINGPREP] apply to the
   Nodeprep (Appendix A) and Resourceprep (Appendix B) profiles defined
   in this document for XMPP localparts and resourceparts. The security
   considerations described in [STRINGPREP] and [NAMEPREP] apply to the
   Nameprep profile that is re-used here for XMPP domainparts.

4.2. Reuse of Unicode

   The security considerations described in [UNICODE-SEC] apply to the
   use of Unicode characters in XMPP addresses.

4.3. Address Spoofing

   There are two forms of address spoofing: forging and mimicking.

4.3.1. Address Forging

   In the context of XMPP technologies, address forging occurs when an
   entity is able to generate an XML stanza whose 'from' address does
   not correspond to the account credentials with which the entity
   authenticated onto the network (or an authorization identity provided
   during negotiation of SASL authentication [SASL] as described in
   [XMPP]). For example, address forging occurs if an entity that
   authenticated as "juliet@im.example.com" is able to send XML stanzas
   from "nurse@im.example.com" or "romeo@example.net".

   Address forging is difficult in XMPP systems, given the requirement
   for sending servers to stamp 'from' addresses and for receiving
   servers to verify sending domains via server-to-server authentication
   (see [XMPP]). However, address forging is possible if:

   o  A poorly implemented server ignores the requirement for stamping
      the 'from' address. This would enable any entity that
      authenticated with the server to send stanzas from any
      localpart@domainpart as long as the domainpart matches the sending
      domain of the server.

   o  An actively malicious server generates stanzas on behalf of any
      registered account.

   Therefore, an entity outside the security perimeter of a particular
   server cannot reliably distinguish between JIDs of the form
   <localpart@domainpart> at that server and thus can authenticate only
   the domainpart of such JIDs with any level of assurance. This
   specification does not define methods for discovering or
   counteracting such poorly implemented or rogue servers. However, the
   end-to-end authentication or signing of XMPP stanzas could help to
   mitigate this risk, since it would require the rogue server to
   generate false credentials in addition to modifying 'from' addresses.

   Furthermore, it is possible for an attacker to forge JIDs at other
   domains by means of a DNS poisoning attack if DNS security extensions
   [DNSSEC] are not used.

4.3.2. Address Mimicking

   Address mimicking occurs when an entity provides legitimate
   authentication credentials for and sends XML stanzas from an account
   whose JID appears to a human user to be the same as another JID. For
   example, in some XMPP clients the address "ju1iet@example.org"
   (spelled with the number one as the third character of the localpart)
   might appear to be the same as "juliet@example.org" (spelled with the
   lower-case version of the letter "L"), especially on casual visual
   inspection; this phenomenon is sometimes called "typejacking". A
   more sophisticated example of address mimicking might involve the use
   of characters from outside the familiar Latin extended-A block of
   Unicode code points, such as the characters U+13DA U+13A2 U+13B5
   U+13AC U+13A2 U+13AC U+13D2 from the Cherokee block instead of the
   similar-looking US-ASCII characters "STPETER".

   In some examples of address mimicking, it is unlikely that the
   average user could tell the difference between the real JID and the
   fake JID.
   (Indeed, there is no programmatic way to distinguish with
   full certainty which is the fake JID and which is the real JID; in
   some communication contexts, the JID formed of Cherokee characters
   might be the real JID and the JID formed of US-ASCII characters might
   thus appear to be the fake JID.) Because JIDs can contain almost any
   properly-encoded Unicode code point, it can be relatively easy to
   mimic some JIDs in XMPP systems. The possibility of address
   mimicking introduces security vulnerabilities of the kind that have
   also plagued the World Wide Web, specifically the phenomenon known as
   phishing.

   These problems arise because Unicode and ISO/IEC 10646 repertoires
   have many characters that look similar (so-called "confusable
   characters" or "confusables"). In many cases, XMPP users might
   perform visual matching, such as when comparing the JIDs of
   communication partners. Because it is impossible to map similar-
   looking characters without a great deal of context (such as knowing
   the fonts used), stringprep and stringprep-based technologies such as
   Nameprep, Nodeprep, and Resourceprep do nothing to map similar-
   looking characters together, nor do they prohibit some characters
   because they look like others. As a result, XMPP localparts and
   resourceparts could contain confusable characters, producing JIDs
   that appear to mimic other JIDs and thus leading to security
   vulnerabilities such as the following:

   o  A localpart can be employed as one part of an entity's address in
      XMPP. One common usage is as the username of an instant messaging
      user; another is as the name of a multi-user chat room; and many
      other kinds of entities could use localparts as part of their
      addresses.
      The security of such services could be compromised
      based on different interpretations of the internationalized
      localpart; for example, a user entering a single internationalized
      localpart could access another user's account information, or a
      user could gain access to a hidden or otherwise restricted chat
      room or service.

   o  A resourcepart can be employed as one part of an entity's address
      in XMPP. One common usage is as the name for an instant messaging
      user's connected resource; another is as the nickname of a user in
      a multi-user chat room; and many other kinds of entities could use
      resourceparts as part of their addresses. The security of such
      services could be compromised based on different interpretations
      of the internationalized resourcepart; for example, two or more
      confusable resources could be bound at the same time to the same
      account (resulting in inconsistent authorization decisions in an
      XMPP application that uses full JIDs), or a user could send a
      message to someone other than the intended recipient in a multi-
      user chat room.

   Despite the fact that some specific suggestions about identification
   and handling of confusable characters appear in the Unicode Security
   Considerations [UNICODE-SEC], it is also true (as noted in
   [IDNA-DEFS]) that "there are no comprehensive technical solutions to
   the problems of confusable characters". Mimicked JIDs that involve
   characters from only one script, or from the script typically
   employed by a particular user or community of language users, are not
   easy to combat (e.g., the simple typejacking attack previously
   described, which relies on a surface similarity between the
   characters "1" and "l" in some presentations).
   However, mimicked
   addresses that involve characters from more than one script, or from
   a script not typically employed by a particular user or community of
   language users, can be mitigated somewhat through the application of
   appropriate registration policies at XMPP services and presentation
   policies in XMPP client software. Therefore the following policies
   are encouraged:

   1.  Because an XMPP service that allows registration of XMPP user
       accounts (localparts) plays a role similar to that of a registry
       for DNS domain names, such a service SHOULD establish a policy
       about the scripts or blocks of characters it will allow in
       localparts at the service. Such a policy is likely to be
       informed by the languages and scripts that are used to write
       registered account names; in particular, to reduce confusion, the
       service MAY forbid registration of XMPP localparts that contain
       characters from more than one script and MAY restrict
       registrations to characters drawn from a very small number of
       scripts (e.g., scripts that are well-understood by the
       administrators of the service). Such policies are also
       appropriate for XMPP services that allow temporary or permanent
       registration of XMPP resourceparts, e.g., during resource binding
       [XMPP] or upon joining an XMPP-based chat room [XEP-0045]. For
       related considerations in the context of domain name
       registration, refer to Section 4.3 of [IDNA-PROTO] and Section
       3.2 of [IDNA-RATIONALE]. Note well that methods for enforcing
       such restrictions are out of scope for this document.

   2.  Because every human user of an XMPP client presumably has a
       preferred language (or, in some cases, a small set of preferred
       languages), an XMPP client SHOULD gather that information either
       explicitly from the user or implicitly via the operating system
       of the user's device.
       Furthermore, because most languages are
       typically represented by a single script (or a small set of
       scripts) and most scripts are typically contained in one or more
       blocks of characters, an XMPP client SHOULD warn the user when
       presenting a JID that mixes characters from more than one script
       or block, or that uses characters outside the normal range of the
       user's preferred language(s). This recommendation is not
       intended to discourage communication across different communities
       of language users; instead, it recognizes the existence of such
       communities and encourages due caution when presenting unfamiliar
       scripts or characters to human users.

5. IANA Considerations

   The following sections update the registrations provided in
   [RFC3920].

5.1. Nodeprep Profile of Stringprep

   The Nodeprep profile of stringprep is defined under Nodeprep
   (Appendix A). The IANA has registered Nodeprep in the stringprep
   profile registry.

   Name of this profile:

      Nodeprep

   RFC in which the profile is defined:

      RFC XXXX

   Indicator whether or not this is the newest version of the profile:

      This is the first version of Nodeprep

5.2. Resourceprep Profile of Stringprep

   The Resourceprep profile of stringprep is defined under Resourceprep
   (Appendix B). The IANA has registered Resourceprep in the stringprep
   profile registry.

   Name of this profile:

      Resourceprep

   RFC in which the profile is defined:

      RFC XXXX

   Indicator whether or not this is the newest version of the profile:

      This is the first version of Resourceprep

6. Conformance Requirements

   This section describes a protocol feature set that summarizes the
   conformance requirements of this specification. This feature set is
   appropriate for use in software certification, interoperability
   testing, and implementation reports.
   For each feature, this section
   provides the following information:

   o  A human-readable name

   o  An informational description

   o  A reference to the particular section of this document that
      normatively defines the feature

   o  Whether the feature applies to the Client role, the Server role,
      or both (where "N/A" signifies that the feature is not applicable
      to the specified role)

   o  Whether the feature MUST or SHOULD be implemented, where the
      capitalized terms are to be understood as described in [KEYWORDS]

   The feature set specified here attempts to adhere to the concepts and
   formats proposed by Larry Masinter within the IETF's NEWTRK Working
   Group in 2005, as captured in [INTEROP]. Although this feature set
   is more detailed than called for by [REPORTS], it provides a suitable
   basis for the generation of implementation reports to be submitted in
   support of advancing this specification from Proposed Standard to
   Draft Standard in accordance with [PROCESS].

   Feature:  address-domain-length
   Description:  Ensure that the domainpart of an XMPP address is at
      least one byte in length and at most 1023 bytes in length, and
      conforms to the underlying length limits of the DNS.
   Section:  Section 2.2
   Roles:  Both MUST.

   Feature:  address-domain-prep
   Description:  Ensure that the domainpart of an XMPP address conforms
      to the Nameprep profile of Stringprep.
   Section:  Section 2.2
   Roles:  Client SHOULD, Server MUST.

   Feature:  address-localpart-length
   Description:  Ensure that the localpart of an XMPP address is at
      least one byte in length and at most 1023 bytes in length.
   Section:  Section 2.3
   Roles:  Both MUST.

   Feature:  address-localpart-prep
   Description:  Ensure that the localpart of an XMPP address conforms
      to the Nodeprep profile of Stringprep.
   Section:  Section 2.3
   Roles:  Client SHOULD, Server MUST.

   Feature:  address-resource-length
   Description:  Ensure that the resourcepart of an XMPP address is at
      least one byte in length and at most 1023 bytes in length.
   Section:  Section 2.4
   Roles:  Both MUST.

   Feature:  address-resource-prep
   Description:  Ensure that the resourcepart of an XMPP address
      conforms to the Resourceprep profile of Stringprep.
   Section:  Section 2.4
   Roles:  Client SHOULD, Server MUST.

7.  References

7.1.  Normative References

   [ABNF]     Crocker, D. and P. Overell, "Augmented BNF for Syntax
              Specifications: ABNF", STD 68, RFC 5234, January 2008.

   [DNS]      Mockapetris, P., "Domain names - implementation and
              specification", STD 13, RFC 1035, November 1987.

   [IDNA2003]
              Faltstrom, P., Hoffman, P., and A. Costello,
              "Internationalizing Domain Names in Applications (IDNA)",
              RFC 3490, March 2003.

              See Section 1 for an explanation of why the normative
              reference to an obsoleted specification is needed.

   [KEYWORDS]
              Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [NAMEPREP]
              Hoffman, P. and M. Blanchet, "Nameprep: A Stringprep
              Profile for Internationalized Domain Names (IDN)",
              RFC 3491, March 2003.

              See Section 1 for an explanation of why the normative
              reference to an obsoleted specification is needed.

   [STRINGPREP]
              Hoffman, P. and M. Blanchet, "Preparation of
              Internationalized Strings ("stringprep")", RFC 3454,
              December 2002.

   [UNICODE]  The Unicode Consortium, "The Unicode Standard, Version
              3.2.0", 2000.

              The Unicode Standard, Version 3.2.0 is defined by The
              Unicode Standard, Version 3.0 (Reading, MA, Addison-
              Wesley, 2000. ISBN 0-201-61633-5), as amended by the
              Unicode Standard Annex #27: Unicode 3.1
              (http://www.unicode.org/reports/tr27/) and by the Unicode
              Standard Annex #28: Unicode 3.2
              (http://www.unicode.org/reports/tr28/).

   [UNICODE-SEC]
              The Unicode Consortium, "Unicode Technical Report #36:
              Unicode Security Considerations", 2008.

   [UTF-8]    Yergeau, F., "UTF-8, a transformation format of ISO
              10646", STD 63, RFC 3629, November 2003.

   [XMPP]     Saint-Andre, P., "Extensible Messaging and Presence
              Protocol (XMPP): Core", draft-ietf-xmpp-3920bis-20 (work
              in progress), December 2010.

7.2.  Informative References

   [DNSSEC]   Arends, R., Austein, R., Larson, M., Massey, D., and S.
              Rose, "DNS Security Introduction and Requirements",
              RFC 4033, March 2005.

   [IDNA-DEFS]
              Klensin, J., "Internationalized Domain Names for
              Applications (IDNA): Definitions and Document Framework",
              RFC 5890, August 2010.

   [IDNA-PROTO]
              Klensin, J., "Internationalized Domain Names in
              Applications (IDNA): Protocol", RFC 5891, August 2010.

   [IDNA-RATIONALE]
              Klensin, J., "Internationalized Domain Names for
              Applications (IDNA): Background, Explanation, and
              Rationale", RFC 5894, August 2010.

   [INTEROP]  Masinter, L., "Formalizing IETF Interoperability
              Reporting", draft-ietf-newtrk-interop-reports-00 (work in
              progress), October 2005.

   [IRI]      Duerst, M. and M. Suignard, "Internationalized Resource
              Identifiers (IRIs)", RFC 3987, January 2005.

   [PROCESS]  Bradner, S., "The Internet Standards Process -- Revision
              3", BCP 9, RFC 2026, October 1996.

   [PUNYCODE]
              Costello, A., "Punycode: A Bootstring encoding of Unicode
              for Internationalized Domain Names in Applications
              (IDNA)", RFC 3492, March 2003.

   [REPORTS]  Dusseault, L. and R. Sparks, "Guidance on Interoperation
              and Implementation Reports for Advancement to Draft
              Standard", BCP 9, RFC 5657, September 2009.

   [RFC3920]  Saint-Andre, P., Ed., "Extensible Messaging and Presence
              Protocol (XMPP): Core", RFC 3920, October 2004.

   [SASL]     Melnikov, A. and K. Zeilenga, "Simple Authentication and
              Security Layer (SASL)", RFC 4422, June 2006.

   [URI]      Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
              Resource Identifier (URI): Generic Syntax", STD 66,
              RFC 3986, January 2005.

   [XEP-0029]
              Kaes, C., "Definition of Jabber Identifiers (JIDs)", XSF
              XEP 0029, October 2003.

   [XEP-0030]
              Hildebrand, J., Millard, P., Eatmon, R., and P. Saint-
              Andre, "Service Discovery", XSF XEP 0030, June 2008.

   [XEP-0045]
              Saint-Andre, P., "Multi-User Chat", XSF XEP 0045,
              July 2008.

   [XEP-0060]
              Millard, P., Saint-Andre, P., and R. Meijer, "Publish-
              Subscribe", XSF XEP 0060, September 2008.

   [XEP-0165]
              Saint-Andre, P., "Best Practices to Discourage JID
              Mimicking", XSF XEP 0165, December 2007.

   [XML]      Paoli, J., Maler, E., Sperberg-McQueen, C., Yergeau, F.,
              and T. Bray, "Extensible Markup Language (XML) 1.0 (Fourth
              Edition)", World Wide Web Consortium Recommendation REC-
              xml-20060816, August 2006.

   [XMPP-URI]
              Saint-Andre, P., "Internationalized Resource Identifiers
              (IRIs) and Uniform Resource Identifiers (URIs) for the
              Extensible Messaging and Presence Protocol (XMPP)",
              RFC 5122, February 2008.

Appendix A.  Nodeprep

A.1.  Introduction

   This appendix defines the "Nodeprep" profile of stringprep. As such,
   it specifies processing rules that will enable users to enter
   internationalized localparts in the Extensible Messaging and Presence
   Protocol (XMPP) and have the highest chance of getting the content of
   the strings correct. (An XMPP localpart is the optional portion of
   an XMPP address that precedes an XMPP domainpart and the '@'
   separator; it is often but not exclusively associated with an instant
   messaging username.) These processing rules are intended only for
   XMPP localparts and are not intended for arbitrary text or any other
   aspect of an XMPP address.
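
   As a non-normative illustration, the Nodeprep steps that the rest of
   this appendix defines (mapping, normalization, prohibited output,
   and bidirectional checking) might be applied as in the following
   Python sketch. It assumes the Python standard-library "stringprep"
   module, whose functions expose the [STRINGPREP] tables, and the
   Unicode 3.2 database available as unicodedata.ucd_3_2_0. The names
   nodeprep(), EXTRA_PROHIBITED, and PROHIBITED_TABLES are hypothetical
   and are not defined by this document. The sketch also assumes the
   input is a stored string, so it rejects unassigned code points.

```python
import stringprep
from unicodedata import ucd_3_2_0 as unicode32  # Unicode 3.2 database

# Characters that Nodeprep prohibits in addition to the stringprep tables.
EXTRA_PROHIBITED = frozenset("\"&'/:<>@")

# Table C.1.1 through Table C.9 as exposed by the stringprep module.
PROHIBITED_TABLES = (
    stringprep.in_table_c11, stringprep.in_table_c12,
    stringprep.in_table_c21, stringprep.in_table_c22,
    stringprep.in_table_c3, stringprep.in_table_c4,
    stringprep.in_table_c5, stringprep.in_table_c6,
    stringprep.in_table_c7, stringprep.in_table_c8,
    stringprep.in_table_c9,
)


def nodeprep(localpart: str) -> str:
    """Hypothetical helper: apply the Nodeprep steps to a localpart."""
    # Mapping: drop Table B.1 characters, then case-fold with Table B.2.
    mapped = "".join(
        stringprep.map_table_b2(ch)
        for ch in localpart
        if not stringprep.in_table_b1(ch)
    )
    # Normalization: form KC, computed against Unicode 3.2.
    prepared = unicode32.normalize("NFKC", mapped)
    # Unassigned code points (Table A.1) and prohibited output.
    for ch in prepared:
        if stringprep.in_table_a1(ch):
            raise ValueError(f"unassigned code point U+{ord(ch):04X}")
        if ch in EXTRA_PROHIBITED or any(t(ch) for t in PROHIBITED_TABLES):
            raise ValueError(f"prohibited code point U+{ord(ch):04X}")
    # Bidirectional check from Section 6 of [STRINGPREP].
    if any(stringprep.in_table_d1(ch) for ch in prepared):
        if any(stringprep.in_table_d2(ch) for ch in prepared):
            raise ValueError("RandALCat and LCat characters are mixed")
        if not (stringprep.in_table_d1(prepared[0])
                and stringprep.in_table_d1(prepared[-1])):
            raise ValueError("string must start and end with RandALCat")
    return prepared
```

   Because the prohibited-output check runs after normalization, a code
   point such as U+FF20 (FULLWIDTH COMMERCIAL AT) is rejected once it
   has been normalized to U+0040. Length limits on the localpart are
   outside the scope of this sketch.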

   This profile defines the following, as required by [STRINGPREP]:

   o  The intended applicability of the profile: internationalized
      localparts within XMPP
   o  The character repertoire that is the input and output to
      stringprep: Unicode 3.2, specified in Section A.2
   o  The mappings used: specified in Section A.3
   o  The Unicode normalization used: specified in Section A.4
   o  The characters that are prohibited as output: specified in Section
      A.5
   o  Bidirectional character handling: specified in Section A.6

A.2.  Character Repertoire

   This profile uses Unicode 3.2 with the list of unassigned code points
   being Table A.1, both defined in Appendix A of [STRINGPREP].

A.3.  Mapping

   This profile specifies mapping using the following tables from
   [STRINGPREP]:

      Table B.1
      Table B.2

A.4.  Normalization

   This profile specifies the use of Unicode normalization form KC, as
   described in [STRINGPREP].

A.5.  Prohibited Output

   This profile prohibits as output the characters in the following
   tables from [STRINGPREP]:

      Table C.1.1
      Table C.1.2
      Table C.2.1
      Table C.2.2
      Table C.3
      Table C.4
      Table C.5
      Table C.6
      Table C.7
      Table C.8
      Table C.9

   In addition, the following Unicode characters are prohibited:

      U+0022 (QUOTATION MARK), i.e., "
      U+0026 (AMPERSAND), i.e., &
      U+0027 (APOSTROPHE), i.e., '
      U+002F (SOLIDUS), i.e., /
      U+003A (COLON), i.e., :
      U+003C (LESS-THAN SIGN), i.e., <
      U+003E (GREATER-THAN SIGN), i.e., >
      U+0040 (COMMERCIAL AT), i.e., @

A.6.  Bidirectional Characters

   This profile specifies checking bidirectional strings, as described
   in Section 6 of [STRINGPREP].

A.7.
Notes

   Because the additional characters prohibited by Nodeprep are
   prohibited after normalization, an implementation MUST NOT enable a
   human user to input any Unicode code point whose decomposition
   includes those characters; such code points include but are not
   necessarily limited to the following (refer to [UNICODE] for complete
   information):

   o  U+2100 (ACCOUNT OF)
   o  U+2101 (ADDRESSED TO THE SUBJECT)
   o  U+2105 (CARE OF)
   o  U+2106 (CADA UNA)
   o  U+226E (NOT LESS-THAN)
   o  U+226F (NOT GREATER-THAN)
   o  U+2A74 (DOUBLE COLON EQUAL)
   o  U+FE55 (SMALL COLON)
   o  U+FE60 (SMALL AMPERSAND)
   o  U+FE64 (SMALL LESS-THAN SIGN)
   o  U+FE65 (SMALL GREATER-THAN SIGN)
   o  U+FE6B (SMALL COMMERCIAL AT)
   o  U+FF02 (FULLWIDTH QUOTATION MARK)
   o  U+FF06 (FULLWIDTH AMPERSAND)
   o  U+FF07 (FULLWIDTH APOSTROPHE)
   o  U+FF0F (FULLWIDTH SOLIDUS)
   o  U+FF1A (FULLWIDTH COLON)
   o  U+FF1C (FULLWIDTH LESS-THAN SIGN)
   o  U+FF1E (FULLWIDTH GREATER-THAN SIGN)
   o  U+FF20 (FULLWIDTH COMMERCIAL AT)

Appendix B.  Resourceprep

B.1.  Introduction

   This appendix defines the "Resourceprep" profile of stringprep. As
   such, it specifies processing rules that will enable users to enter
   internationalized resourceparts in the Extensible Messaging and
   Presence Protocol (XMPP) and have the highest chance of getting the
   content of the strings correct. (An XMPP resourcepart is the
   optional portion of an XMPP address that follows an XMPP domainpart
   and the '/' separator.) These processing rules are intended only for
   XMPP resourceparts and are not intended for arbitrary text or any
   other aspect of an XMPP address.
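
   As a non-normative illustration, the Resourceprep steps that the
   rest of this appendix defines might be applied as in the following
   Python sketch. It rests on the same assumptions as any stringprep
   implementation built on the Python standard library: the
   "stringprep" module supplies the [STRINGPREP] tables and
   unicodedata.ucd_3_2_0 supplies Unicode 3.2. The names resourceprep()
   and PROHIBITED_TABLES are hypothetical, and the input is assumed to
   be a stored string, so unassigned code points are rejected.
   Resourceprep maps with Table B.1 only and does not prohibit Table
   C.1.1, so case and ASCII spaces are preserved.

```python
import stringprep
from unicodedata import ucd_3_2_0 as unicode32  # Unicode 3.2 database

# Table C.1.2 through Table C.9; Table C.1.1 (ASCII space) is not prohibited.
PROHIBITED_TABLES = (
    stringprep.in_table_c12,
    stringprep.in_table_c21, stringprep.in_table_c22,
    stringprep.in_table_c3, stringprep.in_table_c4,
    stringprep.in_table_c5, stringprep.in_table_c6,
    stringprep.in_table_c7, stringprep.in_table_c8,
    stringprep.in_table_c9,
)


def resourceprep(resourcepart: str) -> str:
    """Hypothetical helper: apply the Resourceprep steps to a resourcepart."""
    # Mapping: drop Table B.1 characters; no case folding is applied.
    mapped = "".join(
        ch for ch in resourcepart if not stringprep.in_table_b1(ch)
    )
    # Normalization: form KC, computed against Unicode 3.2.
    prepared = unicode32.normalize("NFKC", mapped)
    # Unassigned code points (Table A.1) and prohibited output.
    for ch in prepared:
        if stringprep.in_table_a1(ch):
            raise ValueError(f"unassigned code point U+{ord(ch):04X}")
        if any(table(ch) for table in PROHIBITED_TABLES):
            raise ValueError(f"prohibited code point U+{ord(ch):04X}")
    # Bidirectional check from Section 6 of [STRINGPREP].
    if any(stringprep.in_table_d1(ch) for ch in prepared):
        if any(stringprep.in_table_d2(ch) for ch in prepared):
            raise ValueError("RandALCat and LCat characters are mixed")
        if not (stringprep.in_table_d1(prepared[0])
                and stringprep.in_table_d1(prepared[-1])):
            raise ValueError("string must start and end with RandALCat")
    return prepared
```

   Length limits on the resourcepart are outside the scope of this
   sketch.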

   This profile defines the following, as required by [STRINGPREP]:

   o  The intended applicability of the profile: internationalized
      resourceparts within XMPP
   o  The character repertoire that is the input and output to
      stringprep: Unicode 3.2, specified in Section B.2
   o  The mappings used: specified in Section B.3
   o  The Unicode normalization used: specified in Section B.4
   o  The characters that are prohibited as output: specified in Section
      B.5
   o  Bidirectional character handling: specified in Section B.6

B.2.  Character Repertoire

   This profile uses Unicode 3.2 with the list of unassigned code points
   being Table A.1, both defined in Appendix A of [STRINGPREP].

B.3.  Mapping

   This profile specifies mapping using the following tables from
   [STRINGPREP]:

      Table B.1

B.4.  Normalization

   This profile specifies the use of Unicode normalization form KC, as
   described in [STRINGPREP].

B.5.  Prohibited Output

   This profile prohibits as output the characters in the following
   tables from [STRINGPREP]:

      Table C.1.2
      Table C.2.1
      Table C.2.2
      Table C.3
      Table C.4
      Table C.5
      Table C.6
      Table C.7
      Table C.8
      Table C.9

B.6.  Bidirectional Characters

   This profile specifies checking bidirectional strings, as described
   in Section 6 of [STRINGPREP].

Appendix C.  Differences From RFC 3920

   Based on consensus derived from implementation and deployment
   experience as well as formal interoperability testing, the following
   substantive modifications were made from RFC 3920.

   o  Corrected the ABNF syntax to ensure consistency with [URI] and
      [IRI], including consistency with RFC 3986 and RFC 5952 with
      regard to IPv6 addresses (e.g., enclosing the IPv6 address in
      square brackets '[' and ']').
   o  Corrected the ABNF syntax to prevent zero-length localparts,
      domainparts, and resourceparts (and also noted that the underlying
      length limits from the DNS apply to domainparts).
   o  To avoid confusion with the term "node" as used in [XEP-0030] and
      [XEP-0060], changed the term "node identifier" to "localpart" (but
      retained the name "Nodeprep" for backward compatibility).
   o  To avoid confusion with the terms "resource" and "identifier" as
      used in [URI], changed the term "resource identifier" to
      "resourcepart".
   o  Corrected the nameprep processing rules to require use of the
      UseSTD3ASCIIRules flag.

Appendix D.  Acknowledgements

   Thanks to Ben Campbell, Waqas Hussain, Jehan Pages and Florian Zeitz
   for their feedback. Thanks also to Richard Barnes and Elwyn Davies
   for their reviews on behalf of the Security Directorate and the
   General Area Review Team, respectively.

   The Working Group chairs were Ben Campbell and Joe Hildebrand. The
   responsible Area Director was Gonzalo Camarillo.

   Some text in this document was borrowed or adapted from [IDNA-DEFS],
   [IDNA-PROTO], [IDNA-RATIONALE], and [XEP-0165].

Author's Address

   Peter Saint-Andre
   Cisco
   1899 Wynkoop Street, Suite 600
   Denver, CO 80202
   USA

   Phone: +1-303-308-3282
   Email: psaintan@cisco.com