idnits 2.17.1 draft-zeilenga-ldapbis-strmatch-02.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** Looks like you're using RFC 2026 boilerplate. This must be updated to follow RFC 3978/3979, as updated by RFC 4748. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- ** The document seems to lack a 1id_guidelines paragraph about 6 months document validity -- however, there's a paragraph with a matching beginning. Boilerplate error? ** The document seems to lack a 1id_guidelines paragraph about the list of current Internet-Drafts -- however, there's a paragraph with a matching beginning. Boilerplate error? ** The document seems to lack a 1id_guidelines paragraph about the list of Shadow Directories. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack an IANA Considerations section. (See Section 2.2 of https://www.ietf.org/id-info/checklist for how to handle the case when there are no actions for IANA.) ** The abstract seems to contain references ([UNICODE], [CONTROLCHARACTERS], [ISO10646], [RFC2119], [UTR17], [GLOSSARY]), which it shouldn't. Please replace those with straight textual mentions of the documents in question. Miscellaneous warnings: ---------------------------------------------------------------------------- == Line 701 has weird spacing: '...for the purpo...' == The document seems to lack the recommended RFC 2119 boilerplate, even if it appears to use RFC 2119 keywords -- however, there's a paragraph with a matching beginning. Boilerplate error? (The document does seem to have the reference to RFC 2119 which the ID-Checklist requires). 
-- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (3 March 2003) is 7725 days in the past. Is this intentional? Checking references for intended status: Informational ---------------------------------------------------------------------------- -- Looks like a reference, but probably isn't: '0' on line 449 -- Looks like a reference, but probably isn't: '1' on line 450 -- Looks like a reference, but probably isn't: '2' on line 451 ** Obsolete normative reference: RFC 3454 (Obsoleted by RFC 7564) -- Obsolete informational reference (is this intentional?): RFC 3356 (ref. 'IETF-ITU') (Obsoleted by RFC 6756) Summary: 7 errors (**), 0 flaws (~~), 2 warnings (==), 6 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Internet-Draft Editor: Kurt D. Zeilenga 3 Intended Category: Informational OpenLDAP Foundation 4 Expires in six months 3 March 2003 6 Internationalized String Matching Rules for X.500 7 9 Status of this Memo 11 This document is an Internet-Draft and is in full conformance with all 12 provisions of Section 10 of RFC2026. 14 This document is intended to be submitted to the ITU for publication 15 as an amendment to X.520 and published as an Informational RFC. 16 Distribution of this memo is unlimited. Technical discussion of this 17 document will take place on the IETF LDAP Revision Working Group 18 mailing list . Please send editorial 19 comments directly to the author . 
21 Internet-Drafts are working documents of the Internet Engineering Task 22 Force (IETF), its areas, and its working groups. Note that other 23 groups may also distribute working documents as Internet-Drafts. 24 Internet-Drafts are draft documents valid for a maximum of six months 25 and may be updated, replaced, or obsoleted by other documents at any 26 time. It is inappropriate to use Internet-Drafts as reference 27 material or to cite them other than as ``work in progress.'' 29 The list of current Internet-Drafts can be accessed at 30 . The list of 31 Internet-Draft Shadow Directories can be accessed at 32 . 34 Copyright 2003, The Internet Society. All Rights Reserved. 36 Please see the Copyright section near the end of this document for 37 more information. 39 Abstract 41 The existing X.500 Directory Service technical specifications do not 42 precisely define how string matching is to be performed. This has 43 led to a number of interoperability problems. This document provides 44 string preparation profiles for standard syntaxes and matching rules 45 defined in X.520. 47 This document is intended to be submitted to the ITU-T for publication 48 as an amendment to X.520 and published as an Informational RFC. 50 Conventions 52 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 53 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 54 document are to be interpreted as described in BCP 14 [RFC2119]. 56 Character names in this document use the notation for code points and 57 names from the Unicode Standard [UNICODE] and ISO/IEC 10646-1 58 [ISO10646]. For example, the letter "a" may be represented as either 59 or . In the lists of mappings and the 60 prohibited characters, the "U+" is left off to make the lists easier 61 to read. The comments for character ranges are shown in square 62 brackets (such as "[CONTROL CHARACTERS]") and do not come from the 63 standards. 
65 Note: a glossary of terms used in Unicode and ISO/IEC 10646 can be 66 found in [GLOSSARY]. Information on the ISO/IEC 10646/Unicode 67 character encoding model can be found in [UTR17]. 69 1. Introduction 71 1.1. Background 73 An X.500 matching rule [X.501] defines an algorithm for determining 74 whether a presented value matches an attribute value in accordance 75 with the criteria defined for the rule. The proposition may be 76 evaluated to True, False, or Undefined. 78 True - the attribute contains a matching value, 80 False - the attribute contains no matching value, 82 Undefined - it cannot be determined whether the attribute contains 83 a matching value or not. 85 For instance, the caseIgnoreMatch matching rule may be used to compare 86 whether the commonName attribute contains a particular value without 87 regard for case and insignificant spaces. 89 1.2. X.500 String Matching Rules 90 "X.520: Selected attribute types" [X.520] provides (amongst other 91 things) value syntaxes and matching rules for comparing values 92 commonly used in the Directory [X.500]. These specifications are 93 inadequate for strings composed of characters from the Universal 94 Character Set (UCS) [ISO10646], a superset of Unicode [UNICODE]. 96 The caseIgnoreMatch matching rule, for example, is simply defined as 97 being a case insensitive comparison where insignificant spaces are 98 ignored. For printableString, there is only one space character and 99 case mapping is bijective, hence this definition is sufficient. 100 However, for UCS-based string types such as universalString, this is 101 not sufficient. For example, a case insensitive matching 102 implementation which folded lower case characters to upper case would 103 yield different results than an implementation which used 104 upper case to lower case folding. 
Or one implementation may view 105 space as referring to only SPACE (U+0020), a second implementation may 106 view any character with the space separator (Zs) property as a space, 107 and another implementation may view any character with the whitespace 108 (WS) category as a space. 110 The lack of precise specification for string matching has led to 111 significant interoperability problems. When used in certificate chain 112 validation, security vulnerabilities can arise. To address these 113 problems, this document updates X.520 [X.520] with a detailed 114 specification of string syntax and matching rule requirements. 116 1.3. Relationship to "stringprep" 118 The matching rule algorithms described in this document are based upon 119 the "stringprep" approach [RFC3454]. In "stringprep", presented and 120 stored values are first prepared for comparison so that a 121 character-by-character comparison yields the "correct" result. 123 The algorithm used here is a refinement of the "stringprep" [RFC3454] 124 approach. The algorithm involves two additional preparation steps. 126 a) prior to applying the Unicode string preparation steps outlined in 127 "stringprep", the string is transcoded to Unicode; 129 b) after applying the Unicode string preparation steps outlined in 130 "stringprep", characters insignificant to the matching rules are 131 removed. 133 Hence, preparation of strings for X.500 matching involves the 134 following steps: 136 1) Transcode 137 2) Map 138 3) Normalize 139 4) Prohibit 140 5) Check Bidi (Bidirectional) 141 6) Insignificant Character Removal 143 These steps are described in Section 3. Section 2 details design 144 considerations. 146 1.4. Relationship to X.500 148 This document updates X.520 [X.520] with additional normative and 149 informative information. Sections 3, 4, and 5 are normative parts of 150 this update. Other sections are informative. 152 Section 3 provides a specification for X.500 string preparation. 
It 153 is intended to be added as a new section in X.520. 155 Section 4 replaces section 6.1 of X.520 [X.520]. It updates select 156 string matching rules. 158 Section 5 replaces portions of section 6.2 of X.520 [X.520]. It 159 updates select syntax-based matching rules. 161 2. Design Considerations 163 The X.500 string matching rule specification provided in Section 3 is 164 designed to leverage the "stringprep" framework [RFC3454] for 165 comparing strings. As noted above, transcoding and space removal 166 steps have been added. 168 This section describes the rationale for these and other design 169 decisions. 171 2.1. Transcode 173 In the past, transcoding only occurred when the input strings 174 were not all encoded in the same character set. If all were encoded in 175 the same character set, no transcoding was to be performed. 176 Otherwise, all of the strings would be transcoded to one of the character 177 sets used. 179 As mappings between character sets, such as T.61 and UCS, are not 180 bijective, this specification requires transcoding of all strings 181 to a common character encoding set. UCS was the logical choice as all 182 other character sets (used in X.500) can be transcoded to it without 183 information loss. None of the other character sets (used in X.500) 184 offer this property. 186 2.2. Map 188 Code points which have no semantic meaning in normal text are mapped 189 to nothing. Code points which are semantically equivalent in normal 190 text are mapped to a single code point. 192 "Normal text", in this context, is viewed as text commonly held in 193 attributes of Directory String syntax, such as identifiers, common 194 names, and short descriptive text. 196 2.3. Normalize 198 Normalization is performed to ensure that comparison is always done 199 between canonical-equivalent strings. 
As directory strings are often 200 used as identifiers, we selected Form KC (compatibility composed) as 201 it allows a greater number of strings to be treated as equivalent. 203 Unfortunately, this choice is not best for all applications. 204 Additional matching rules which use different string preparation 205 algorithms may be introduced in the future to better support these 206 applications. In particular, matching rules which use Form C 207 (composed) normalization instead of Form KC would also be generally 208 useful. It may be desirable to add additional matching rules to X.500 209 which use Form C normalization. 211 2.4. Prohibit 213 TBD 215 2.5. Check bidi 217 TBD 219 2.6. Insignificant Character Removal 221 This step is used to remove insignificant characters from the string. 222 Unlike the map step, which supports mapping of characters to nothing, 223 this step allows removal of characters based upon their location in 224 the string, surrounding characters in the string, and other factors. 226 3. String Preparation 228 The following six-step process SHALL be applied to each presented and 229 attribute value in preparation for string match rule evaluation. 231 1) Transcode 232 2) Map 233 3) Normalize 234 4) Prohibit 235 5) Check bidi 236 6) Insignificant Character Removal 238 Failure in any step causes the assertion to be Undefined. 240 The character repertoire of this process is Unicode 3.2 [UNICODE]. 242 3.1. Transcode 244 Each non-Unicode string value is transcoded to Unicode. 246 TeletexString values are transcoded to Unicode as described in 247 [T61-UCS]. 249 PrintableString values are transcoded directly to Unicode. 251 UniversalString, UTF8String, and bmpString values need not be 252 transcoded as they are Unicode-based strings (in the case of 253 bmpString, restricted to a subset of Unicode). 
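As an illustration only, the transcode step for the string types named above might be sketched in Python as follows. The function name transcode, the codec table, and the use of None to stand for a failed step (an Undefined assertion) are assumptions of this sketch and are not defined by X.520 or by this document. The sketch further assumes that BMPString and UniversalString values arrive as big-endian UCS-2 and UCS-4 octets, and it does not verify that a PrintableString value stays within the PrintableString repertoire. TeletexString is treated as unsupported because [T61-UCS] is not yet specified.

```python
from typing import Optional

# Hypothetical table: ASN.1 string type -> Python codec used to obtain
# Unicode code points. TeletexString is absent on purpose ([T61-UCS] is TBD).
_CODECS = {
    "PrintableString": "ascii",      # simplification: accepts all of ASCII
    "UTF8String": "utf-8",
    "BMPString": "utf-16-be",        # assumed big-endian UCS-2 octets
    "UniversalString": "utf-32-be",  # assumed big-endian UCS-4 octets
}


def transcode(value: bytes, string_type: str) -> Optional[str]:
    """Return the value as a Unicode string, or None if the step fails.

    None models "this step fails and the assertion is Undefined".
    """
    codec = _CODECS.get(string_type)
    if codec is None:
        # Unable or unwilling to transcode this string type.
        return None
    try:
        return value.decode(codec)
    except UnicodeDecodeError:
        # Malformed input: the transcoding fails.
        return None
```

A caller would stop string preparation as soon as transcode returns None and evaluate the assertion to Undefined; any other result is passed on to the Map step.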
255 If the implementation is unable or unwilling to perform the 256 transcoding as described above, or the transcoding fails, this step 257 fails and the assertion is evaluated to Undefined. 259 The transcoded string is the output string. 261 3.2. Map 263 SOFT HYPHEN (U+00AD) and MONGOLIAN TODO SOFT HYPHEN (U+1806) code 264 points are mapped to nothing. COMBINING GRAPHEME JOINER (U+034F) and 265 VARIATION SELECTORs (U+180B-180D,FE00-FE0F) code points are also 266 mapped to nothing. The OBJECT REPLACEMENT CHARACTER (U+FFFC) is 267 mapped to nothing. 269 CHARACTER TABULATION (U+0009), LINE FEED (LF) (U+000A), LINE 270 TABULATION (U+000B), FORM FEED (FF) (U+000C), CARRIAGE RETURN (CR) 271 (U+000D), and NEXT LINE (NEL) (U+0085) are mapped to SPACE (U+0020). 273 All other control code points (e.g., Cc) or code points with a control 274 function (e.g., Cf) are mapped to nothing. 276 ZERO WIDTH SPACE (U+200B) is mapped to nothing. All other code points 277 with Separator (space, line, or paragraph) property (e.g., Zs, Zl, or 278 Zp) are mapped to SPACE (U+0020). 280 For case ignore, numeric, and stored prefix string matching rules, 281 characters are case folded per B.2 of [RFC3454]. 283 3.3. Normalize 285 The input string is normalized to Unicode Form KC (compatibility 286 composed) as described in [UAX15]. 288 3.4. Prohibit 290 All Unassigned, Private Use, and non-character code points are 291 prohibited. Surrogate codes (U+D800-DFFF) are prohibited. 293 The REPLACEMENT CHARACTER (U+FFFD) code point is prohibited. 295 The first code point of a string is prohibited from being a combining 296 character. 298 Empty strings are prohibited. 300 The step fails and the assertion is evaluated to Undefined if the 301 input string contains any prohibited code point. The output string is 302 the input string. 304 3.5. Check bidi 306 There are no bidirectional restrictions. The output string is the 307 input string. 309 3.6. 
Insignificant Character Removal 311 In this step, characters insignificant to the matching rule are to be 312 removed. The characters to be removed differ from matching rule to 313 matching rule. 315 Section 3.6.1 applies to case ignore and exact string matching. 316 Section 3.6.2 applies to numericString matching. 318 Section 3.6.3 applies to telephoneNumber matching. 320 3.6.1. Insignificant Space Removal 322 For the purposes of this section, a space is defined to be the SPACE 323 (U+0020) code point followed by no combining marks. 325 NOTE - The previous steps ensure that the string cannot contain 326 any code points in the separator class, other than SPACE 327 (U+0020). 329 The following spaces are regarded as not significant and are to be 330 removed: 331 - leading spaces (i.e. those preceding the first character that is 332 not a space); 333 - trailing spaces (i.e. those following the last character that is 334 not a space); 335 - multiple consecutive spaces (these are taken as equivalent to a 336 single space character). 338 (A string consisting entirely of spaces is equivalent to a string 339 containing exactly one space.) 341 For example, removal of spaces from the Form KC string: 342 "foobar" would result in 343 the output string: 344 "foobar". 346 and the Form KC string: 347 "" would result in the output string: 348 "". 350 3.6.2. NumericString Insignificant Character Removal 352 For the purposes of this section, a space is defined to be the SPACE 353 (U+0020) code point followed by no combining marks. 355 All spaces are regarded as not significant and are to be removed. 357 For example, removal of spaces from the Form KC string: 358 "123456" would result in 359 the output string: 360 "123456". 362 and the Form KC string: 363 "" would result in an empty output string. 365 3.6.3. 
TelephoneNumber Insignificant Character Removal 367 For the purposes of this section, a hyphen is defined to be 368 HYPHEN-MINUS (U+002D), ARMENIAN HYPHEN (U+058A), HYPHEN (U+2010), 369 NON-BREAKING HYPHEN (U+2011), MINUS SIGN (U+2212), SMALL HYPHEN-MINUS 370 (U+FE63), or FULLWIDTH HYPHEN-MINUS (U+FF0D) code point followed by no 371 combining marks and a space is defined to be the SPACE (U+0020) code 372 point followed by no combining marks. 374 All hyphens and spaces are regarded as not significant and are to be 375 removed. 377 4. String Matching Rules 379 In the matching rules specified in this section, all presented and 380 stored string values are prepared for matching as described in 381 Section 3. String preparation produces strings suitable for 382 character-by-character matching. 384 4.1. Case Exact / Ignore Match 386 The Case Exact Match rule compares for equality a presented string 387 with an attribute value of type DirectoryString or one of the data 388 types appearing in the choice type DirectoryString, e.g. UTF8String, 389 without regard to insignificant spaces (3.6.1). 391 caseExactMatch MATCHING-RULE ::= { 392 SYNTAX DirectoryString {ub-match} 393 ID id-mr-caseExactMatch } 395 The Case Ignore Match rule compares for equality a presented string 396 with an attribute value of type DirectoryString or one of the data 397 types appearing in the choice type DirectoryString, e.g. UTF8String, 398 without regard to the case (upper or lower) of the strings (e.g. 399 "Dundee" and "DUNDEE" match) and insignificant spaces (3.6.1). The 400 rule is identical to the caseExactMatch rule except upper case 401 characters are folded to lower case during string preparation as 402 discussed in 3.2. 404 caseIgnoreMatch MATCHING-RULE ::= { 405 SYNTAX DirectoryString {ub-match} 406 ID id-mr-caseIgnoreMatch } 408 Both rules return TRUE if the prepared strings are the same length and 409 corresponding characters are identical. 411 4.2. 
Case Exact / Ignore Ordering Match 413 The Case Exact Ordering Match rule compares the collation order of a 414 presented string with an attribute value of type DirectoryString or 415 one of the data types appearing in the choice type DirectoryString, 416 e.g. UTF8String, without regard to insignificant spaces (3.6.1). 418 caseExactOrderingMatch MATCHING-RULE ::= { 419 SYNTAX DirectoryString {ub-match} 420 ID id-mr-caseExactOrderingMatch } 422 The Case Ignore Ordering Match rule compares the collation order of a 423 presented string with an attribute value of type DirectoryString or one of 424 the data types appearing in the choice type DirectoryString, e.g. 425 UTF8String, without regard to the case (upper or lower) of the strings 426 and insignificant spaces (3.6.1). The rule is identical to the 427 caseExactOrderingMatch rule except upper case characters are folded to 428 lower case during string preparation as discussed in 3.2. 430 caseIgnoreOrderingMatch MATCHING-RULE ::= { 431 SYNTAX DirectoryString {ub-match} 432 ID id-mr-caseIgnoreOrderingMatch } 434 Both rules return TRUE if the attribute value is "less" or appears 435 earlier than the presented value, when the prepared strings are 436 compared using Unicode code point collation order. 438 4.3. Case Exact / Ignore Substrings Match 440 The Case Exact Substrings Match rule determines whether a presented 441 value is a substring of an attribute value of type DirectoryString or 442 one of the data types appearing in the choice type DirectoryString, 443 e.g. UTF8String, without regard to insignificant spaces (3.6.1). 
445 caseExactSubstringsMatch MATCHING-RULE ::= { 446 SYNTAX SubstringAssertion 447 ID id-mr-caseExactSubstringsMatch } 448 SubstringAssertion ::= SEQUENCE OF CHOICE { 449 initial [0] DirectoryString {ub-match}, 450 any [1] DirectoryString {ub-match}, 451 final [2] DirectoryString {ub-match}, 452 control Attribute } 453 -- Used to specify interpretation of the following items 454 -- at most one initial and one final component 456 The Case Ignore Substrings Match rule determines whether a presented 457 value is a substring of an attribute value of type DirectoryString or 458 one of the data types appearing in the choice type DirectoryString, 459 e.g. UTF8String, without regard to the case (upper or lower) of the 460 strings and insignificant spaces (3.6.1). The rule is identical to 461 the caseExactSubstringsMatch rule except upper case characters are 462 folded to lower case during string preparation as discussed in 3.2. 464 caseIgnoreSubstringsMatch MATCHING-RULE ::= { 465 SYNTAX SubstringAssertion 466 ID id-mr-caseIgnoreSubstringsMatch } 468 Both rules return TRUE if there is a partitioning of the prepared 469 attribute value (into portions) such that: 470 - the specified substrings (initial, any, final) match different 471 portions of the value in the order of the strings sequence. 472 - initial, if present, matches the first portion of the value; 473 - any, if present, matches some arbitrary portion of the value; 474 - final, if present, matches the last portion of the value. 475 - control is not used for the caseExactSubstringsMatch, 476 caseIgnoreSubstringsMatch, telephoneNumberSubstringsMatch, or any 477 other form of substring match for which only initial, any, or 478 final elements are used in the matching algorithm; if a control 479 element is encountered, it is ignored. The control element is 480 only used for matching rules that explicitly specify its use in 481 the matching algorithm. 
Such a matching rule may also redefine the 482 semantics of the initial, any and final substrings. 483 NOTE - The generalWordMatch matching rule is an example of such 484 a matching rule. 486 There shall be at most one initial, and at most one final in the 487 SubstringAssertion. If initial is present, it shall be the first 488 element. If final is present, it shall be the last element. There 489 shall be zero or more any elements. 491 For a component of substrings to match a portion of the attribute 492 value, corresponding characters must be identical (including all 493 combining characters in the combining character sequences). 495 4.4. Numeric String Match 497 The Numeric String Match rule compares for equality a presented 498 numeric string with an attribute value of type NumericString. 500 numericStringMatch MATCHING-RULE ::= { 501 SYNTAX NumericString 502 ID id-mr-numericStringMatch } 504 The rule is identical to the caseIgnoreMatch rule (case is irrelevant 505 as characters are numeric) except that all space characters are 506 removed during string preparation as detailed in Section 3.6.2. 508 4.5. Numeric String Ordering Match 510 The Numeric String Ordering Match rule compares the collation order of 511 a presented string with an attribute value of type NumericString. 513 numericStringOrderingMatch MATCHING-RULE ::= { 514 SYNTAX NumericString 515 ID id-mr-numericStringOrderingMatch } 517 The rule is identical to the caseIgnoreOrderingMatch rule (case is 518 irrelevant as characters are numeric) except that all space characters 519 are removed during string preparation as detailed in Section 3.6.2. 521 4.6. Numeric String Substrings Match 523 The Numeric String Substrings Match rule determines whether a 524 presented value is a substring of an attribute value of type 525 NumericString. 
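The following Python sketch illustrates, under stated assumptions, how a substrings assertion could be evaluated against a numeric string. The names remove_spaces and substrings_match are hypothetical and do not come from X.520. The sketch covers only the removal of SPACE (U+0020) from Section 3.6.2 and the initial, any, and final components of Section 4.3. It omits the Transcode, Map, Normalize, and Prohibit steps, ignores control elements, and compares code point by code point without regard to combining character sequences, which is adequate for strings of digits and spaces only.

```python
from typing import List, Optional


def remove_spaces(s: str) -> str:
    """Simplified Section 3.6.2: drop every SPACE (U+0020)."""
    return s.replace(" ", "")


def substrings_match(value: str,
                     initial: Optional[str],
                     any_parts: List[str],
                     final: Optional[str]) -> bool:
    """Check that the components match distinct portions of value, in order."""
    pos = 0  # index just past the portion matched so far
    if initial is not None:
        if not value.startswith(initial):
            return False
        pos = len(initial)
    for part in any_parts:
        # Taking the leftmost occurrence leaves the most room for the rest.
        idx = value.find(part, pos)
        if idx < 0:
            return False
        pos = idx + len(part)
    if final is not None:
        # final must be the last portion and must not overlap earlier ones.
        if len(value) - pos < len(final) or not value.endswith(final):
            return False
    return True
```

For example, the stored value "12 34 56" prepares to "123456", which matches an assertion with initial "12", any "34", and final "56", but not one with initial "123" and final "3456", since those two components would have to overlap.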
527 numericStringSubstringsMatch MATCHING-RULE ::= { 528 SYNTAX SubstringAssertion 529 ID id-mr-numericStringSubstringsMatch } 531 The rule is identical to the caseIgnoreSubstringsMatch rule (case is 532 irrelevant as characters are numeric) except that all space characters 533 are removed during string preparation as detailed in Section 3.6.2. 535 4.7. Case Ignore List Match 537 The Case Ignore List Match rule compares for equality a presented 538 sequence of strings with an attribute value which is a sequence of 539 DirectoryStrings, without regard to the case (upper or lower) of the 540 strings and insignificant spaces (3.6.1). 542 caseIgnoreListMatch MATCHING-RULE ::= { 543 SYNTAX CaseIgnoreList 544 ID id-mr-caseIgnoreListMatch } 545 CaseIgnoreList ::= SEQUENCE OF DirectoryString {ub-match} 547 The rule returns TRUE if and only if the number of strings in each is 548 the same, and corresponding strings match. The latter matching is as 549 for the caseIgnoreMatch matching rule. 551 4.8. Case Ignore List Substrings Match 553 The Case Ignore List Substrings Match rule compares a presented substring 554 with an attribute value which is a sequence of DirectoryStrings, but 555 without regard for the case (upper or lower) of the strings and 556 insignificant spaces (3.6.1). 558 caseIgnoreListSubstringsMatch MATCHING-RULE ::= { 559 SYNTAX SubstringAssertion 560 ID id-mr-caseIgnoreListSubstringsMatch } 562 A presented value matches a stored value if and only if the presented 563 value matches the string formed by concatenating the strings of the 564 stored value. This matching is done according to the 565 caseIgnoreSubstringsMatch rule; however, none of the initial, any, or 566 final values of the presented value are considered to match a 567 substring of the concatenated string which spans more than one of the 568 strings of the stored value. 570 4.9. 
Stored Prefix Match 572 The Stored Prefix Match rule determines whether an attribute value, 573 whose syntax is DirectoryString, is a prefix (i.e. initial substring) 574 of the presented value, without regard to the case (upper or lower) of 575 the strings and insignificant spaces (3.6.1). 577 NOTE - It can be used, for example, to compare values in 578 the Directory which are telephone area codes with a value 579 which is a purported telephone number. 581 storedPrefixMatch MATCHING-RULE ::= { 582 SYNTAX DirectoryString {ub-match} 583 ID id-mr-storedPrefixMatch } 585 The rule returns TRUE if the attribute value is an initial substring 586 of the presented value with corresponding characters identical except 587 with regard to case. 589 5. Other changes to X.520 591 This document makes the following changes to X.520: 593 The section 6.2.8 (Telephone Number Match) sentence: 594 The rules for matching are identical to those for caseIgnoreMatch, 595 except that all space and "-" characters are skipped during the 596 comparison. 598 is replaced with: 599 The rules for matching are identical to those for caseIgnoreMatch, 600 except that all hyphens and spaces are insignificant (3.6.3) and 601 removed during the insignificant character removal step. 603 The section 6.2.9 (Telephone Number Substrings Match) sentence: 604 The rules for matching are identical to those for 605 caseExactSubstringsMatch, except that all space and "-" characters 606 are skipped during the comparison. 608 is replaced with: 609 The rules for matching are identical to those for 610 caseExactSubstringsMatch, except that all hyphens and spaces are 611 insignificant (3.6.3) and removed during the insignificant 612 character removal step. 614 6. Security Considerations 616 See [RFC3454]. 618 7. 
Acknowledgments 620 The approach used in this document is based upon design principles and 621 algorithms described in "Preparation of Internationalized Strings 622 ('stringprep')" [RFC3454] by Paul Hoffman and Marc Blanchet. Some 623 additional guidance was drawn from Unicode Technical Standards, 624 Technical Reports, and Notes. 626 Sections 3.3 and 4 of this document are derived from Section 6.1 of 627 [X.520]. Additionally, some text was borrowed from [RFC3454]. 629 This document is the product of IETF and ITU-T collaboration [IETF- 630 ITU]. 632 8. Editor's Address 634 Kurt Zeilenga 635 E-mail: 637 9. References 639 9.1. Normative References 641 [RFC2119] S. Bradner, "Key words for use in RFCs to Indicate 642 Requirement Levels", BCP 14 (also RFC 2119), March 1997. 644 [RFC3454] P. Hoffman, M. Blanchet, "Preparation of Internationalized 645 Strings ('stringprep')", RFC 3454, December 2002. 647 [X.501] International Telecommunication Union, "The Directory: The Models", 648 X.501, 2000. 650 [X.520] International Telecommunication Union, "The Directory: Selected 651 Attribute Types", X.520, 2000. 653 [ISO10646] Universal Multiple-Octet Coded Character Set (UCS) - 654 Architecture and Basic Multilingual Plane, ISO/IEC 10646-1 655 : 1993. 657 [UNICODE] The Unicode Consortium, "The Unicode Standard, Version 658 3.2.0" is defined by "The Unicode Standard, Version 3.0" 659 (Reading, MA, Addison-Wesley, 2000. ISBN 0-201-61633-5), as 660 amended by the "Unicode Standard Annex #27: Unicode 3.1" 661 (http://www.unicode.org/reports/tr27/) and by the "Unicode 662 Standard Annex #28: Unicode 3.2" 663 (http://www.unicode.org/reports/tr28/). 665 [UAX15] M. Davis, M. Duerst, "Unicode Standard Annex #15: Unicode 666 Normalization Forms, Version 3.2.0". 667 , 668 March 2002. 670 [T61-UCS] TBD 672 9.2. Informative References 674 [X.500] International Telecommunication Union, "The Directory: Overview of 675 Concepts, Models and Service", X.500, 2000. 677 [IETF-ITU] G. Fishman, S. 
Bradner, "Internet Engineering Task Force 678 and International Telecommunication Union - 679 Telecommunications Standardization Sector Collaboration 680 Guidelines", TSAG A-Series Supplement 3, November 2001 681 (also RFC 3356, published August 2002). 683 [GLOSSARY] The Unicode Consortium, "Unicode Glossary", 684 . 686 [UTR17] K. Whistler, M. Davis, "Unicode Technical Report 687 #17, Character Encoding Model", UTR17, 688 , August 689 2000. 691 Copyright 2003, The Internet Society. All Rights Reserved. 693 This document and translations of it may be copied and furnished to 694 others, and derivative works that comment on or otherwise explain it 695 or assist in its implementation may be prepared, copied, published and 696 distributed, in whole or in part, without restriction of any kind, 697 provided that the above copyright notice and this paragraph are 698 included on all such copies and derivative works. However, this 699 document itself may not be modified in any way, such as by removing 700 the copyright notice or references to the Internet Society or other 701 Internet organizations, except as needed for the purpose of 702 developing Internet standards in which case the procedures for 703 copyrights defined in the Internet Standards process must be followed, 704 or as required to translate it into languages other than English. 706 The limited permissions granted above are perpetual and will not be 707 revoked by the Internet Society or its successors or assigns. 709 This document and the information contained herein is provided on an 710 "AS IS" basis and THE AUTHORS, THE INTERNET SOCIETY, AND THE INTERNET 711 ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, 712 INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE 713 INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED 714 WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.