idnits 2.17.1 draft-ietf-find-cip-tagged-06.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in this document. Expected boilerplate is as follows today (2024-04-25) according to https://trustee.ietf.org/license-info : IETF Trust Legal Provisions of 28-dec-2009, Section 6.a: This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2: Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved. IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3: This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- ** Missing expiration date. The document expiration date should appear on the first and last page. ** The document seems to lack a 1id_guidelines paragraph about Internet-Drafts being working documents. ** The document seems to lack a 1id_guidelines paragraph about 6 months document validity -- however, there's a paragraph with a matching beginning. Boilerplate error? ** The document seems to lack a 1id_guidelines paragraph about the list of current Internet-Drafts. 
** The document seems to lack a 1id_guidelines paragraph about the list of Shadow Directories. ** The document is more than 15 pages and seems to lack a Table of Contents. == No 'Intended status' indicated for this document; assuming Proposed Standard == The page length should not exceed 58 lines per page, but there was 19 longer pages, the longest (page 14) being 64 lines == It seems as if not all pages are separated by form feeds - found 0 form feeds but 21 pages Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack an IANA Considerations section. (See Section 2.2 of https://www.ietf.org/id-info/checklist for how to handle the case when there are no actions for IANA.) ** The document seems to lack separate sections for Informative/Normative References. All references will be assumed normative when checking for downward references. ** The document seems to lack a both a reference to RFC 2119 and the recommended RFC 2119 boilerplate, even if it appears to use RFC 2119 keywords. RFC 2119 keyword, line 693: '... are not changed SHOULD not be present...' Miscellaneous warnings: ---------------------------------------------------------------------------- == Line 238 has weird spacing: '...re only modif...' == Line 241 has weird spacing: '... should sign...' == Line 259 has weird spacing: '...ed when excha...' == Line 800 has weird spacing: '...ing the two...' == Line 801 has weird spacing: '...lists of val...' == (26 more instances...) == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD', or 'RECOMMENDED' is not an accepted usage according to RFC 2119. Please use uppercase 'NOT' together with RFC 2119 keywords (if that is what you mean). Found 'SHOULD not' in this paragraph: Note that in the above record, the attributes dn, cn and sn are modified from the original record. 
The attributes that do not change from the original are objectclass, uid, telephonenumber and description. Any attributes that are not changed SHOULD not be present in UPDATE block. Notice the title attribute has been removed from Barbara Jensen-Smith's entry. -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- Couldn't find a document date in the document -- date freshness check skipped. -- Found something which looks like a code comment -- if you have code sections in the document, please surround them with '<CODE BEGINS>' and '<CODE ENDS>' lines. Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) -- Possible downref: Non-RFC (?) normative reference: ref. '1' ** Downref: Normative reference to an Historic RFC: RFC 1913 (ref. '2') -- Possible downref: Non-RFC (?) normative reference: ref. '3' -- Possible downref: Non-RFC (?) normative reference: ref. '4' -- Possible downref: Non-RFC (?) normative reference: ref. '5' -- Possible downref: Non-RFC (?) normative reference: ref. '6' -- Possible downref: Non-RFC (?) normative reference: ref. '7' -- Possible downref: Non-RFC (?) normative reference: ref. '8' -- Possible downref: Non-RFC (?) normative reference: ref. '10' -- Possible downref: Non-RFC (?) normative reference: ref. '11' Summary: 11 errors (**), 0 flaws (~~), 10 warnings (==), 12 comments (--). Run idnits with the --verbose option for more detailed information about the items above. 
-------------------------------------------------------------------------------- 2 Network Working Group Roland Hedberg 3 Internet Draft Bruce Greenblatt 4 Ryan Moats 5 Expires in six months Mark Wahl 7 A Tagged Index Object for use in the Common Indexing Protocol 9 Status of this Memo 11 This document is an Internet-Draft. Internet-Drafts are working 12 documents of the Internet Engineering Task Force (IETF), its areas, and 13 its working groups. Note that other groups may also distribute working 14 documents as Internet-Drafts. 16 Internet-Drafts are draft documents valid for a maximum of six 17 months. Internet-Drafts may be updated, replaced, or made obsolete by 18 other documents at any time. It is not appropriate to use Internet- 19 Drafts as reference material or to cite them other than as a "working 20 draft" or "work in progress". 22 To view the entire list of current Internet-Drafts, please check 23 the "1id-abstracts.txt" listing contained in the Internet-Drafts 24 Shadow Directories on ftp.is.co.za (Africa), ftp.nordu.net 25 (Northern Europe), ftp.nis.garr.it (Southern Europe), munnari.oz.au 26 (Pacific Rim), ftp.ietf.org (US East Coast), or ftp.isi.edu 27 (US West Coast). 29 Distribution of this document is unlimited. 31 Abstract 33 This document defines a mechanism by which information servers can 34 exchange indices of information from their databases by making use of 35 the Common Indexing Protocol (CIP). This document defines the structure 36 of the index information being exchanged, as well as the appropriate 37 meanings for the headers that are defined in the Common Indexing Proto- 38 col. It is assumed that the structures defined here can be used by 39 X.500 DSAs, LDAP servers, Whois++ servers, CCSO servers and many others. 41 1. 
Introduction 43 The Common Indexing Protocol (CIP) as defined in [1] proposes a 44 mechanism for distributing searches across several instances of a single 45 type of search engine with a view to creating a global directory. CIP 46 provides a scalable, flexible scheme to tie individual databases into 47 distributed data warehouses that can scale gracefully with the growth of 48 the Internet. CIP provides a mechanism for meeting these goals that is 49 independent of the access method that is used to access the actual data 50 that underlies the indices. Separate from CIP is the definition of the 51 Index Object that is used to contain the information that is exchanged 52 among Index Servers. One such Index Object that has already been 53 defined is the Centroid that is derived from the Whois++ protocol [2]. 55 The Centroid does not meet all of the requirements for the exchange 56 of index information amongst information servers. For example, it does 57 not support the notion of incremental updates natively. For information 58 servers that contain millions of records in their database, constant 59 exchange of complete dredges of the database is bandwidth intensive. 60 The Tagged Index Object is specifically designed to support the exchange 61 of index update information. This design comes at the cost of an 62 increase in the size of the index object being exchanged. The Centroid 63 is also not tailored to always be able to give boolean answers to 64 queries. In the Centroid Model, "an index server will take a query in 65 standard Whois++ format, search its collections of centroids and other 66 forward information, determine which servers hold records which may fill 67 that query, and then notifies the user's client of the next servers to 68 contact to submit the query." [2] Thus, the exchange of Centroids 69 amongst index servers allows hints to be given as to which information 70 server actually contains the information. 
The Tagged Index Object 71 labels the various pieces of information with identifiers that tie the 72 individual object attributes back to an object as a whole. This "tag- 73 ging" of information allows an index server to be more capable of 74 directing a specific query to the appropriate information server. 75 Again, this feature is added to the Tagged Index Object at the expense 76 of an increase in the size of the index object. 78 2. Background 80 The Lightweight Directory Access Protocol (LDAP) is defined in [3], 81 and it defines a mechanism for accessing a collection of information 82 arranged hierarchically in such a manner as to provide a globally 83 distributed database which is normally called the Directory Information 84 Tree (DIT). Some distinguishing characteristics of LDAP servers are 85 that it is normally the case that several servers cooperate to manage a 86 common subtree of the DIT. LDAP servers are expected to respond to 87 requests that pertain to portions of the DIT for which they have data, 88 as well as for those portions for which they have no information in 89 their database. For example, the LDAP server for a portion of the DIT in 90 the United States (c=US) must be able to provide a response to a Search 91 operation that pertains to a portion of the DIT in Sweden (c=se). Nor- 92 mally, the response given will be a referral to another LDAP server that 93 is expected to be more knowledgeable about the appropriate subtree. 94 However, there is no mechanism that currently enables these LDAP servers 95 to refer the LDAP client to the supposedly more knowledgeable server. 96 Typically, an LDAP (v3) server is configured with the name of exactly 97 one other LDAP server to which all LDAP clients are referred when their 98 requests fall outside the subtree of the DIT for which that LDAP server 99 has knowledge. 
This specification defines a mechanism whereby LDAP 100 servers can exchange index information that will allow referrals to point 101 towards a clearly accurate destination. 103 The X.500 series of recommendations defines the Directory 104 Information Shadowing Protocol (DISP) [4], which allows X.500 DSAs to 105 exchange actual information in the DIT. Shadowing allows various infor- 106 mation from various portions of the DIT to be replicated amongst partic- 107 ipating DSAs. The design point of DISP is optimized at the exchange of 108 entire portions of the DIT, whereas the design point of CIP and the 109 Tagged Index Object is optimized at the exchange of structural index 110 information about the DIT, and improving the performance of tree naviga- 111 tion amongst various information servers. The Tagged Index Object is 112 more appropriate for the exchange of index information than is DISP. 113 DISP is more targeted at DIT distribution and fault tolerance. DISP is 114 thus more appropriate for the exchange of the actual data in order to 115 spread the load amongst several information servers. DISP is tailored 116 specifically to X.500 (and other hierarchical directory systems), while 117 the Tagged Index Object and CIP can be used in a wide variety of infor- 118 mation server environments. 120 While DISP allows an individual directory server to collect infor- 121 mation about large parts of the DIT, it would require a huge database to 122 collect all of the replicas for a meaningful portion of the DIT. Fur- 123 thermore, as X.525 states: "Before shadowing can occur, an agreement, 124 covering the conditions under which shadowing may occur is required. 125 Although such agreements may be established in a variety of ways, such 126 as policy statements covering all DSAs within a given DMD ...", where a 127 DMD is a Directory Management Domain.
This is because the 128 actual data in the DIT is being exchanged amongst DSAs rather than only 129 the information required to maintain an Index. In many environments 130 such an agreement is not appropriate, and in order to collect informa- 131 tion for a meaningful portion of the DIT, a large number of agreements 132 may need to be arranged. 134 3. Object 136 What is desired is to have an information server (or network of 137 information servers) that can quickly respond to real world requests, 138 like: 140 - What is Tim Howes' email address? This is much harder than, What 141 is Tim Howes at Netscape's email address? 143 - What is the X.509 certificate for Fred Smith at compuserve.com? 144 One certainly doesn't want to search CompuServe's entire directory 145 tree to find out this one piece of information. I also don't want 146 to have to shadow the entire CompuServe directory subtree onto my 147 server. If this request is being made because Fred is trying to 148 log into my server, I'd certainly want to be able to respond to the 149 BIND in real time. 151 - Who are all of the people at Novell that have a title of program- 152 mer? 154 All of these requests can reasonably be translated into LDAP, 155 Whois++, and other directory access protocol queries. They can also be 156 serviced in a straightforward manner by the user's home information 157 server if it has the appropriate reference information into the database 158 that contains the source data. In this situation, the first server 159 would be able to "chain" the request on behalf of the user. Alterna- 160 tively, a precise referral could be returned.
If the home information 161 server wants to service (i.e., chain) the request based on the index 162 information that it has on hand, this servicing could be done by any 163 number of means: 165 - issuing LDAP operations to the remote directory server 167 - issuing DSP operations to the remote directory server 169 - issuing DAP operations to the remote directory server 170 - issuing Whois++ operations to the remote Whois++ server 172 - ... 174 4. The Tagged Index Object 176 This section defines a Tagged Index Object that can be exchanged by 177 Information Servers using CIP. While in many cases it is acceptable for 178 Information Servers to make use of the Centroid construct (as defined in 179 [2]) to exchange index information, the goals in defining a new con- 180 struct are multi-pronged: 182 - When the Information Server receives a search request that warrants 183 that a referral be returned, allow the server to return a referral 184 that will point the client to a server that is most likely able to 185 answer the request correctly. False positive referrals (the search 186 turns up hits in the index object that generate referrals to 187 servers that don't hold the desired information) can be reduced, 188 depending on the choice of attribute tokenization types that are 189 used. 191 - When the Information Server receives a search request that is not 192 operating against local data, allow the Information Server itself 193 to "chain" the request to the appropriate remote Information 194 Server. Note that LDAP itself does not define how Chaining works, 195 but X.500 does. This seems very similar to the first "prong". 197 - Finally, when a collection of Information Servers are operating 198 against a large distributed directory, allow them to distribute 199 index information amongst themselves (a la CIP) so that their own 200 searches can be carried out with some degree of efficiency. 202 4.1.
The Agreement 204 Before a Tagged Index Object can be exchanged, the organization 205 which administers the object supplier and the organization which admin- 206 isters the object consumer must reach an agreement on how the servers 207 will communicate. This agreement contains the following: 209 - "version": The version of the agreement and the index type. This 210 specification describes the index type "x-tagged-index-1". 212 - "dsi": An OID which uniquely identifies the subtree and scope. 213 This field is not explicitly necessary, as it may not provide 214 information beyond that which is contained in the "base-uri" below. 216 - "base-uri": One or more URIs which will form the base of any 217 referrals created based upon the index object that is governed by 218 this agreement. For example, in the LDAP URL format [8] the base- 219 uri would specify (among other items): the LDAP host, the base 220 object to which this index object refers (e.g. c=SE), and the scope 221 of the index object (e.g. single container). 223 - "supplier": The hostname and listening port number of the supplier 224 server, as well as any alternative servers holding the same naming 225 contexts, in case the supplier is unavailable. 227 - "consumeraddr": This is a URI of the "mailto:" form, with the RFC 228 822 email address of the consumer server. Subsequent versions of 229 this draft allow other forms of URI, so that the consumer may 230 retrieve the update via the WWW, FTP or CIP. 232 - "updateinterval": The maximum duration in seconds between occur- 233 rences of the supplier server generating an update. If the con- 234 sumer server has not received an update from the supplier server 235 after waiting this long since the previous update, it is likely 236 that the index information is now out of date. A typical value for 237 a server with frequent updates would be 604800 seconds, or every 238 week. Servers whose DITs are only modified annually could have a 239 much longer update interval.
241 - "securityoption": Whether and how the supplier server should sign 242 and encrypt the update before sending it to the consumer server. 243 Options for this version of the specification are: 245 "none": the update is sent in plaintext 247 "PGP/MIME": the update is digitally signed and encrypted using 248 PGP [9] 250 "S/MIME": the update is digitally signed and encrypted using 251 S/MIME [10] 253 "SSLv3": the update is digitally signed and encrypted using an 254 SSLv3 connection [11] 256 "Fortezza": the update is digitally signed and encrypted using 257 Fortezza [5] 259 It is recommended that the "PGP/MIME" option be used when exchang- 260 ing sensitive information across public networks, and both the supplier 261 and consumer have PGP keys. The "Fortezza" option is intended for use in 262 environments where security protocols are based on Fortezza-compatible 263 devices. The "S/MIME" option can be used when both the supplier and 264 consumer have RSA keys and can make use of the PKCS protocols defined in 265 the S/MIME specification. The "SSLv3" option can be used when both the 266 supplier and consumer have access to SSL services, have server certifi- 267 cates, and can mutually authenticate each other. Should these be IANA 268 registered things??? 270 - Security Credentials: The long-term cryptographic credentials used 271 for key exchange and authentication of the consumer and supplier 272 servers, if a security option was selected. For "PGP/MIME", this 273 will be the trusted public keys of both servers. For "Fortezza", 274 this will be the certificate paths of both servers to a common 275 point of trust. For "S/MIME" and "SSLv3" these will be the certifi- 276 cates of the supplier and consumer.
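The agreement fields listed above can be pictured as a small record. The following Python sketch is illustrative only: the class name, the method names and the validation rules are assumptions made for this example, not part of the specification, and the Security Credentials are omitted. It shows one way a consumer might hold an agreement and use "updateinterval" to decide that its index information is probably out of date.

```python
from dataclasses import dataclass, field
from typing import List

# Security options named in this version of the specification.
SECURITY_OPTIONS = {"none", "PGP/MIME", "S/MIME", "SSLv3", "Fortezza"}


@dataclass
class Agreement:
    """Hypothetical in-memory form of a supplier/consumer agreement."""
    version: str
    base_uri: List[str]
    supplier: str            # "host:port" of the supplier server (format assumed)
    consumeraddr: str        # "mailto:" URI of the consumer server
    updateinterval: int      # maximum seconds between updates
    securityoption: str = "none"
    dsi: str = ""            # optional OID naming the subtree and scope

    def validate(self) -> None:
        """Raise ValueError if a field contradicts the text of section 4.1."""
        if self.version != "x-tagged-index-1":
            raise ValueError("unsupported index type: " + self.version)
        if not self.base_uri:
            raise ValueError("at least one base-uri is required")
        if not self.consumeraddr.startswith("mailto:"):
            raise ValueError("consumeraddr must be a mailto: URI")
        if self.updateinterval <= 0:
            raise ValueError("updateinterval must be a positive number of seconds")
        if self.securityoption not in SECURITY_OPTIONS:
            raise ValueError("unknown securityoption: " + self.securityoption)

    def is_stale(self, last_received: int, now: int) -> bool:
        """True if more than updateinterval seconds have passed since the
        previous update, i.e. the index is likely out of date."""
        return now - last_received > self.updateinterval


example = Agreement(
    version="x-tagged-index-1",
    base_uri=["ldap://sup.com/dc=sup,dc=com"],
    supplier="sup.com:389",
    consumeraddr="mailto:consumer@consumer.com",
    updateinterval=604800,   # one week, the typical value mentioned above
    securityoption="PGP/MIME",
)
example.validate()
```

Timestamps here are plain integers in seconds, so the caller decides which clock to use.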
278 Note that if the index server maintains the information that would 279 appear in the agreement in a directory according to the definitions in 280 [7], then no real formal agreement between the two parties needs to be 281 put in place, and the information that is required for communication 282 between the two index servers is derived automatically from the direc- 283 tory. 285 4.2. Content Type 287 The update consists of a MIME object of type application/cip-index- 288 object. The parameters are: 290 "type": this has value "application/index.obj.tagged". 292 "dsi": the DSI (if any) from the agreement. 294 "base-uri": a set of URIs, separated by spaces. In each URI, the 295 hostname/portno must be distinct, and based on the "supplier" part 296 of the agreement. 298 The payload is mostly textual data but may include bytes with the 299 high bit set. The originating information server should set the con- 300 tent-transfer-encoding as appropriate for the information included in 301 the payload. 303 This object may be encapsulated in a wrapper content (such as mul- 304 tipart/signed) or be encrypted as part of the security procedures. The 305 resulting content can then be distributed, for example via electronic mail. 306 For example, 307 From: supplier@sup.com Date: Thu, 16 Jan 1997 13:50:37 -0500 308 Message-Id: <199701161850.NAA29295@sup.com>; 309 To: consumer@consumer.com <<-- from consumer server address 310 Reply-to: supplier-admin@sup.com 311 MIME-Version: 1.0 312 Content-Type: application/index.obj.tagged; 313 dsi=1.3.6.1.4.1.1466.85.85.1.2.3.4.5.6.7.8.9.10.11.12.13.14.15.16; 314 base-uri="ldap://sup.com/dc=sup,dc=com ldap://alt.com/dc=sup,dc=com" 316 The payload is a series of CRLF-terminated lines. The payload only 317 includes characters from a subset of the printable US-ASCII subset of 318 UTF-8. Attribute values that occur outside of this subset are encoded 319 as defined below.
As more experience is gained with index objects and 320 UTF-8 data, a future version of this specification may allow for the 321 native transfer of UTF-8 data without requiring this special encoding. 322 No other character sets are permitted by this version of the specifica- 323 tion. Some supplier servers may only be able to generate the printable 324 US-ASCII subset, but all consumer servers must be able to handle the 325 full range of Unicode characters when decoding the attribute values (in 326 the "attr-value" field in the BNF below). 328 4.3. Tagged Index BNF 330 The Tagged Index object has the following grammar, expressed in 331 modified BNF format: 333 index-object = 0*(io-part SEP) io-part 334 io-part = header SEP schema-spec SEP index-info 335 header = version-spec SEP update-type SEP this-update SEP 336 last-update SEP context-size 337 version-spec = "version:" *SPACE "x-tagged-index-1" 338 update-type = "updatetype:" *SPACE ( "total" | "incremental") 339 this-update = "thisupdate:" *SPACE TIMESTAMP 340 last-update = [ "lastupdate:" *SPACE TIMESTAMP ] 341 context-size = [ "contextsize:" *SPACE 1*DIGIT ] 342 schema-spec = "BEGIN IO-Schema" SEP 1*(schema-line SEP) 343 "END IO-Schema" 344 schema-line = attribute-name ":" token-type 345 token-type = "FULL" | "TOKEN" | "RFC822" | "UUCP" | "DNS" 346 index-info = full-index | incremental-index 347 full-index = "BEGIN Index-Info" SEP 1*(index-block SEP) 348 "END Index-Info" 349 incremental-index = 1*(add-block | delete-block | update-block) 350 add-block = "BEGIN Add Block" SEP 1*(index-block SEP) 351 "END Add Block" 352 delete-block = "BEGIN Delete Block" SEP 1*(index-block SEP) 353 "END Delete Block" 354 update-block = "BEGIN Update Block" SEP 1*(index-block SEP) 355 "END Update Block" 356 index-block = first-line 0*(SEP cont-line) 357 first-line = attr-name ":" *SPACE taglist "/" attr-value 358 cont-line = "-" taglist "/" attr-value 359 taglist = tag 0*("," tag) | "*" 360 tag = 1*DIGIT ["-" 1*DIGIT] 361 
attr-value = 0*(UTF8) 362 attr-name = 1*(NAMECHAR) 363 UTF8 = ASCII | "%" HEX HEX 364 TIMESTAMP = 1*DIGIT 365 ASCII = DIGIT | UPPER | LOWER | OTHER 366 NAMECHAR = DIGIT | UPPER | LOWER | "-" | ";" | "." 367 SPACE = <ASCII space, %x20>; 368 SEP = (CR LF) | LF 369 CR = <ASCII CR, carriage return, %x0D>; 370 LF = <ASCII LF, line feed, %x0A>; 371 HEX = "a" | "b" | "c" | "d" | "e" | "f" | DIGIT 372 DIGIT = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | 373 "8" | "9" 374 UPPER = "A" | "B" | "C" | "D" | "E" | "F" | "G" | "H" | 375 "I" | "J" | "K" | "L" | "M" | "N" | "O" | "P" | 376 "Q" | "R" | "S" | "T" | "U" | "V" | "W" | "X" | 377 "Y" | "Z" 378 LOWER = "a" | "b" | "c" | "d" | "e" | "f" | "g" | "h" | 379 "i" | "j" | "k" | "l" | "m" | "n" | "o" | "p" | 380 "q" | "r" | "s" | "t" | "u" | "v" | "w" | "x" | 381 "y" | "z" 382 OTHER = "(" | ")" | "+" | "," | "-" | "." | "/" | ":" | 383 "=" | "?" | "@" | ";" | "$" | "_" | "!" | "~" | 384 "*" | "'" | "\" | """ | "#" | "&" | "<" | ">" | 385 "[" | "]" | "^" | "`" | "{" | "|" | "}" 387 Characters that are allowed to appear unescaped in attr-values are 388 the printable subset of (low) ASCII minus the "%" character, i.e. hex 389 21 through hex 7e inclusive with the exception of hex 25 (which is the 390 "%" character). Any other UTF-8 encoding of a character that appears in 391 an attr-value must be escaped by using the "%" character and two hex 392 digits that encode the character. For example, the UCS-2 sequence 393 "A<NOT IDENTICAL TO><ALPHA>." (0041, 2262, 0391, 002E) may be encoded in 394 UTF-8 as follows: 395 41 E2 89 A2 CE 91 2E 397 If this character sequence appears in an attribute that is in a 398 Tagged Index Object attr-value, then it is encoded as: 399 41 25 65 32 25 38 39 25 61 32 25 63 65 25 39 31 2E 401 When viewed as a character string the encoding appears as: 402 "A%e2%89%a2%ce%91." 403 The set of characters allowed to appear in the attr-name field is 404 limited to the set of characters used in LDAP and WHOIS++ attribute 405 names.
For other services that have attribute name character sets that 406 are larger than these, it is suggested that those services create a pro- 407 file that maps the names onto object identifiers, and the sequence of 408 digits and periods is used by those services in creating the attr-name 409 fields for their Tagged Index Objects. 411 Note that the attribute value may only be empty in the case of an 412 incremental update that contains an "Update Block" in which the index 413 object indicates that certain attributes of objects are being removed. 414 This specification only supports the replacement of entire attributes, 415 so that in the case of a multi-valued attribute, all of the values must 416 be specified in the Update Block, not just the newly added values. The 417 intention of the Tagged Index Object is to supply a snapshot of the cur- 418 rent index of the directory. 420 4.3.1. Header Descriptions 422 The header section consists of one or more "header lines". The 423 following header lines are defined: 425 "version": This line must always be present, and have the value "x- 426 tagged-index-1" for this version of the specification. 428 "updatetype": This line must always be present. It takes as the 429 value either "total" or 431 "incremental". The first update sent by a supplier server to a 432 consumer server for a DSI must be a "total" update (why?). 434 "thisupdate": This line must always be present. The value is the 435 number of seconds from 00:00:00 UTC January 1, 1970 at which the 436 supplier constructed this update. 438 "lastupdate": This line must be present if the "updatetype" line 439 has the value 441 "incremental". The value is the number of seconds from 00:00:00 442 UTC January 1, 1970 at which the supplier constructed the previous 443 update sent to the consumer. This field allows the consumer to 444 determine if a previous update was missed. 446 "contextsize": This line may be present at the supplier's option.
447 The value is a number, which is the approximate total number of 448 entries in the subtree. This information is provided for statisti- 449 cal purposes only. 451 4.3.2. Tokenization Types 453 The Tagged Index Object inherits the "TOKEN" scheme for tokeniza- 454 tion as specified in [2]. In addition, there are several other tok- 455 enization schemes defined for the Tagged Index Object. The following 456 table presents these schemes and what character(s) are used to delimit 457 tokens. 459 Token Type Tokenization Characters 460 FULL none 461 TOKEN white space, "@" 462 RFC822 white space, ".", "@" 463 UUCP white space, "!" 464 DNS any character not a number, letter, or "-" 466 4.3.3. Tag Conventions 468 In the tag list, multiple consecutive tags may be shortened by 469 using "#-#". For example, the list "3,4,5,6,7,8,9,10" may be shortened 470 to "3-10". Tags are to be applied to the data on a per entry level. 471 Thus, if two index lines in the same index object contain the same tag, 472 then it is always the case that those two lines refer back to the same 473 "record" in the directory. In LDAP terminology, the two lines would 474 refer back to the same directory object. Additionally if two index 475 lines in the same index object contain different tags, then it is always 476 the case that those two lines refer back to different records in the 477 directory. 479 The tags in the index object are meaningful only in the context of 480 that transmission. The tag applied to the same underlying record in two 481 separate transmissions of a full-update may be different. Thus, receiv- 482 ing index servers should make no assumptions about the values of the 483 tags across index object boundaries.
If the receiving index server is 484 implemented in such a way that it maintains a structure similar to the 485 one that exists in the tagged index object with numbered tags attached 486 to various records, then these "internal" tags are distinct from the 487 tags that appear in the index object as created by the transmitting 488 index server. 490 4.4. Incremental Indexing 492 The tagged index object format supports the ability of information 493 servers to distribute only delta index data, rather than distributing 494 total index information each time. This scenario, known as incremental 495 indexing, supports three basic types of operations: add, delete and 496 replace. If the incremental updatetype is specified in the tagged index 497 object, then the index object contains a snapshot of only the changes 498 that have been made since the index object specified in the lastupdate 499 header was distributed. If the receiving index server did not receive 500 that index object, it should request a total index object. If the CIP 501 protocol supports it, the index server may request the specific index 502 object that it missed. 504 If the tagged index object contains an Add Block, then the lines in 505 the Add Block refer to new records that were added to the information 506 base of the transmitting index server. It can be guaranteed that those 507 records did not exist in any previously received tagged index object, 508 and the receiving index server can insert this index information in the 509 index that it already maintains for the transmitting index server. If 510 the receiving index server is maintaining internal tags, then a new 511 internal tag should be created for each tag in the Add Block.
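As a rough illustration of the two rules above (detecting a missed update through "lastupdate", and creating fresh internal tags for an Add Block), consider the following Python sketch. Every name in it (ConsumerIndex, expand_taglist, needs_total, apply_add_block) is hypothetical, the input is assumed to be already parsed into (attr-name, taglist, attr-value) triples, and the "*" taglist is deliberately not handled.

```python
from typing import Dict, List, Optional, Set, Tuple


def expand_taglist(taglist: str) -> List[int]:
    """Expand a taglist such as "1,3-4" into [1, 3, 4].

    The "*" form (all tags) is not handled in this sketch.
    """
    tags: List[int] = []
    for part in taglist.split(","):
        if "-" in part:
            low, high = part.split("-")
            tags.extend(range(int(low), int(high) + 1))
        else:
            tags.append(int(part))
    return tags


class ConsumerIndex:
    """Hypothetical consumer-side state for one supplier."""

    def __init__(self) -> None:
        self.last_update: Optional[int] = None   # thisupdate of the last object applied
        self.next_internal = 1                   # next unused internal tag
        self.records: Dict[int, Dict[str, Set[str]]] = {}

    def needs_total(self, updatetype: str, lastupdate: Optional[int]) -> bool:
        """True if an incremental object cannot be applied because the
        object it builds on was never received."""
        if updatetype == "total":
            return False
        return self.last_update is None or lastupdate != self.last_update

    def apply_add_block(self, lines: List[Tuple[str, str, str]]) -> Dict[int, int]:
        """Insert Add Block lines, creating a new internal tag for each
        tag seen in the block. Returns the wire-tag to internal-tag map,
        which is only meaningful for this one transmission."""
        wire_to_internal: Dict[int, int] = {}
        for attr_name, taglist, attr_value in lines:
            for tag in expand_taglist(taglist):
                if tag not in wire_to_internal:
                    wire_to_internal[tag] = self.next_internal
                    self.records[self.next_internal] = {}
                    self.next_internal += 1
                record = self.records[wire_to_internal[tag]]
                record.setdefault(attr_name, set()).add(attr_value)
        return wire_to_internal
```

Keeping the wire-to-internal map local to one call reflects the statement in section 4.3.3 that tags carry no meaning across index object boundaries.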
513 If the tagged index object contains a Delete Block, then the Delete 514 Block contains lines each of which refers to the "key" field (in the 515 attr-name area of the index line) from a record in the information 516 server that has been deleted since the last update (specified in the 517 lastupdate header field). This key field is assumed to be the unique 518 identifier on the transmitting information server for the record that 519 has been deleted. In the case of LDAP servers, this field would have an 520 attr-name of "dn". Other forms of information servers would use the 521 appropriate unique identifier. Thus, the unique identifier must have 522 previously been sent by the transmitting index server. If the receiving 523 index server has never received information for the record referred to by 524 a line in the Delete Block, then that line should be ignored, with the proviso 525 that the receiving index server has more than likely "lost" some infor- 526 mation previously distributed by the transmitting index server. If the 527 receiving index server is maintaining internal tags, then after process- 528 ing the Delete Block, the internal tag numbers may be reordered so as to 529 not have "holes" in the sequence. 531 If the tagged index object contains an Update Block, then the lines 532 in the Update Block refer to records that were changed in the informa- 533 tion base of the transmitting index server. As was mentioned in clause 534 4.3, if any portion of an attribute in the information server has been 535 changed, then the entire attribute must be specified, and all index 536 information from all values of a multi-valued attribute must be speci- 537 fied. If the attribute was removed from the record in the information 538 server, the attribute value specified in the attr-value field should be 539 empty. Attributes which have not been changed in the record are not 540 specified.
The Update Block also supports the idea of indexing new 541 attributes which were not previously included in the tagged index 542 object. For example, if the transmitting index server began including 543 index information on postal addresses, then it could include an Update 544 Block in the index object that included all of the index information on 545 postal addresses for all records in its information base, and indicate 546 that nothing else has changed. If the receiving index server is main- 547 taining internal tags, then after processing the Update Block, the 548 internal tag numbers should remain the same. 550 5. Example 552 As an example, the following LDIF [6] entries and the resulting 553 Tagged Index Object are presented. 555 dn: cn=Barbara Jensen, ou=Product Development, o=Ace 556 Industry, c=US 557 objectclass: top 558 objectclass: person 559 objectclass: organizationalPerson 560 cn: Barbara Jensen 561 cn: Barbara J Jensen 562 cn: Babs Jensen 563 sn: Jensen 564 uid: bjensen 565 telephonenumber: +1 408 555 1212 566 description: A big sailing fan. 
567  dn: cn=Bjorn Jensen, ou=Accounting, o=Ace Industry, c=US
568  objectclass: top
569  objectclass: person
570  objectclass: organizationalPerson
571  cn: Bjorn Jensen
572  sn: Jensen
573  telephonenumber: +1 408 555 1212
574  dn: cn=Gern Jensen, ou=Product Testing, o=Ace Industry, c=US
575  objectclass: top
576  objectclass: person
577  objectclass: organizationalPerson
578  cn: Gern Jensen
579  cn: Gern O Jensen
580  sn: Jensen
581  uid: gernj
582  telephonenumber: +1 408 555 1212
583  dn: cn=Horatio Jensen, ou=Product Testing, o=Ace Industry,
584   c=US
585  objectclass: top
586  objectclass: person
587  objectclass: organizationalPerson
588  cn: Horatio Jensen
589  cn: Horatio N Jensen
590  sn: Jensen
591  uid: hjensen
592  telephonenumber: +1 408 555 1212

594  The Tagged Index Object for this example would be:

596  version: x-tagged-index-1
597  updatetype: total
598  thisupdate: 855938804
599  BEGIN IO-Schema
600  dn: FULL
601  ou: TOKEN
602  o: TOKEN
603  c: TOKEN
604  objectclass: FULL
605  cn: TOKEN
606  sn: FULL
607  uid: FULL
608  title: TOKEN
609  END IO-Schema
610  BEGIN Index-Info
611  dn: 1/cn=Barbara Jensen,ou=Product
612   Development,o=Ace Industry,c=US
613  -2/cn=Bjorn Jensen,ou=Accounting,o=Ace
614   Industry,c=US
615  -3/cn=Gern Jensen,ou=Product Testing,o=Ace
616   Industry,c=US
617  -4/cn=Horatio Jensen,ou=Product Testing,o=Ace
618   Industry,c=US
619  ou: 1,3-4/Product
620  -1/Development
621  -2/Accounting
622  -3-4/Testing
623  o: */Ace
624  -*/Industry
625  c: */US
626  objectclass: */top
627  -*/person
628  -*/organizationalPerson
629  cn: 1/Barbara
630  -1/J
631  -1/Babs
632  -*/Jensen
633  -2/Bjorn
634  -3/Gern
635  -3/O
636  -4/Horatio
637  -4/N
638  sn: */Jensen
639  uid: 1/bjensen
640  -3/gernj
641  -4/hjensen
642  title: 1/product
643  -1/manager
644  -1/rod
645  -1/and
646  -1/reel
647  -1/division
648  END Index-Info

650  As an example of the Incremental Index Object, consider an update
651  that occurs when Barbara Jensen's entry above changes to:

653  dn: cn=Barbara Jensen-Smith, ou=Product Development, o=Ace
654   Industry, c=US
655  objectclass: top
656  objectclass: person
657  objectclass: organizationalPerson
658  cn: Barbara Jensen-Smith
659  cn: Barbara J Jensen-Smith
660  cn: Babs Jensen-Smith
661  sn: Jensen-Smith
662  uid: bjensen
663  telephonenumber: +1 408 555 1212
664  description: A big sailing fan.

666  The Tagged Index Object for this example would be:

668  version: x-tagged-index-1
669  updatetype: incremental
670  lastupdate: 855940000
671  thisupdate: 855938804
672  BEGIN IO-Schema
673  dn: FULL
674  rdn: FULL
675  cn: TOKEN
676  sn: FULL
677  title: FULL
678  END IO-Schema
679  BEGIN Update Block
680  dn: 1/cn=Barbara Jensen,ou=Product
681   Development,o=Ace Industry,c=US
682  rdn: 1/rdn=Barbara Jensen-Smith
683  cn: 1/ Barbara
684  cn: 1/ Babs
685  cn: 1/Jensen-Smith
686  sn: 1/Jensen-Smith
687  title: 1/
688  END Update Block

690  Note that in the above record, the attributes dn, cn and sn are
691  modified from the original record. The attributes that do not change
692  from the original are objectclass, uid, telephonenumber and description.
693  Any attributes that are not changed SHOULD NOT be present in the Update
694  Block. Notice that the title attribute has been removed from Barbara
695  Jensen-Smith's entry.

697  In this next example, consider an LDIF file containing a series of
698  change records and comments.
700  # Add a new entry
701  dn: cn=Fiona Jensen, ou=Marketing, o=Ace Industry, c=US
702  changetype: add
703  objectclass: top
704  objectclass: person
705  objectclass: organizationalPerson
706  cn: Fiona Jensen
707  sn: Jensen
708  uid: fiona
709  telephonenumber: +1 408 555 1212
710  jpegphoto:< /usr/local/directory/photos/fiona.jpg
711  # Delete an existing entry
712  dn: cn=Robert Jensen, ou=Marketing, o=Ace Industry, c=US
713  changetype: delete
714  # Modify an entry's relative distinguished name
715  dn: cn=Paul Jensen, ou=Product Development, o=Ace Industry, c=US
716  changetype: modrdn
717  newrdn: cn=Paula Jensen
718  deleteoldrdn: 1
719  # Rename an entry and move all of its children to a new location in
720  # the directory tree (only implemented by LDAPv3 servers).
721  dn: ou=PD Accountants, ou=Product Development, o=Ace Industry, c=US
722  changetype: modrdn
723  newrdn: ou=Product Development Accountants
724  deleteoldrdn: 0
725  newsuperior: ou=Accounting, o=Ace Industry, c=US
726  # Modify an entry: add an additional value to the postaladdress
727   attribute,
728  # completely delete the description attribute, replace the
729   telephonenumber
730  # attribute with two values, and delete a specific value from the
731  # facsimiletelephonenumber attribute
732  dn: cn=Paula Jensen, ou=Product Development, o=Ace Industry, c=US
733  changetype: modify
734  add: postaladdress
735  postaladdress: 123 Anystreet $ Sunnyvale, CA $ 94086
736  -
737  delete: description
738  -
739  replace: telephonenumber
740  telephonenumber: +1 408 555 1234
741  telephonenumber: +1 408 555 5678
742  -
743  delete: facsimiletelephonenumber
744  facsimiletelephonenumber: +1 408 555 9876
745  -
746  The Tagged Index Object for this example would be:

748  version: x-tagged-index-1
749  updatetype: incremental
750  thisupdate: 855938804
751  lastupdate: 855912345
752  BEGIN IO-Schema
753  dn: FULL
754  ou: TOKEN
755  o: TOKEN
756  c: TOKEN
757  objectclass: FULL
758  cn: TOKEN
759  sn: FULL
760  uid: FULL
761  title: TOKEN
762  END IO-Schema
763  BEGIN Add Block
764  objectclass: top
765  objectclass: person
766  objectclass: organizationalPerson
767  c: 1/us
768  o: 1/Ace
769  o: 1/Industry
770  ou: 1/Marketing
771  cn: 1/Fiona
772  cn: 1/Jensen
773  sn: 1/Jensen
774  uid: 1/Fiona
775  END Add Block

777  BEGIN Delete Block
778  dn: 1/cn=Robert Jensen, ou=Marketing, o=Ace Industry, c=us
779  END Delete Block

781  BEGIN Update Block
782  dn: 1/ou=PD Accountants, ou=Product Development, o=Ace Industry, c=US
783  -2/cn=Paula Jensen, ou=Product Development, o=Ace Industry, c=US
784  rdn: 1/Product Development Accountants
785  description: 2/
786  telephonenumber: 2/+1 408 555 5678
787  facsimiletelephonenumber: 2/
788  postaladdress: 2/123
789  -2/Anystreet
790  -2/Sunnyvale
791  -2/CA
792  -2/94086
793  END Update Block
794  END Index-Info

796  6. Aggregation

798  6.1. Aggregation of Tagged Index Objects

800  Aggregation of two tagged index objects is done by merging the two
801  lists of values and rewriting each tag list. The tag list rewriting
802  process is done so that the resulting index object appears as if it came
803  from a single source. Tags from one of the two tagged index objects are
804  "mapped" to the number space above that used by the other tagged index
805  object. An index server that aggregates tagged index objects for export
806  MUST ensure that the export URL (i.e., the base-uri of the CIP object)
807  for the aggregate index object will route all queries that have "hits"
808  on the index object to that server (otherwise, query routing will not
809  succeed).

811  7. Security Considerations

813  This specification provides a protocol for transferring information
814  between two servers. The actual information transferred may be protected
815  by laws in many countries, so care must be taken in the methods used to
816  tokenize the data in order to ensure that protected data cannot be
817  reconstructed in full by the receiving server. This protocol does not
818  have any inherent protection against spoofing or eavesdropping.
819  However, since this protocol is transported in MIME messages (as are all
820  CIP index objects), it inherits all of the security capabilities and
821  liabilities of other MIME messages. Specifically, those wanting to
822  prevent eavesdropping or spoofing may use some of the various techniques
823  for signing and encrypting MIME messages.

825  Information Server administrators must decide what portions of
826  their databases are appropriate for inclusion in the Tagged Index
827  Object. For distribution of information outside of the enterprise,
828  information server developers are encouraged to allow for facilities
829  that hide the organizational structure when generating the Tagged Index
830  Object from the underlying information database. In order to allow for
831  the secure transmission of Tagged Index Objects across the Internet,
832  Index Servers should make use of SSL to carry out the connection. In
833  order to strongly verify the identity of the peer index server on the
834  other side of the connection, SSL version 3 certificate exchange should
835  be implemented, and the identity in the peer's certificate verified
836  against the Public Key Infrastructure. If electronic mail is used to
837  exchange the Tagged Index Objects, then a secure messaging facility, such
838  as PGP/MIME or S/MIME, should be used to sign or encrypt (or both) the
839  information.

841  8. References

843  [1] J. Allen, M. Mealling, "The Architecture of the Common Indexing
844  Protocol (CIP)", Internet Draft (work in progress), June 1997.

846  [2] C. Weider, J. Fullton, S. Spero, "Architecture of the Whois++ Index
847  Service", RFC 1913, February 1996.

849  [3] M. Wahl, T. Howes, S. Kille, "Lightweight Directory Access Protocol
850  (v3)", Internet Draft (work in progress), June 1997.

852  [4] ITU, "X.525 Information Technology - Open Systems Interconnection -
853  The Directory: Replication", November 1993.
855  [5] "FORTEZZA Application Implementors Guide for the FORTEZZA Crypto
856  Card (Production Version)", Document #PD4002102-1.01, SPYRUS, 1995.

858  [6] "The LDAP Data Interchange Format (LDIF)", Internet Draft (work in
859  progress), 25 November 1996.

861  [7] R. Hedberg, "LDAPv2 client Vs the Index Mesh", Internet Draft (work
862  in progress), November 1997.

864  [8] T. Howes, M. Smith, "The LDAP URL Format", Internet Draft (work in
865  progress), June 1997.

867  [9] M. Elkins, "MIME Security with Pretty Good Privacy (PGP)", RFC 2015,
868  October 1996.

870  [10] B. Ramsdell, "S/MIME Version 3 Message Specification", Internet
871  Draft (work in progress), May 1997.

873  [11] C. Allen, T. Dierks, "The TLS Protocol Version 1.0", Internet
874  Draft (work in progress), November 1997.

876  9. Authors' Addresses

878  Roland Hedberg
879  Umdac
880  Umea University
881  901 87 Umea
882  Sweden
883  Email: Roland.Hedberg@umdac.umu.se

885  Bruce Greenblatt
886  RSA Data Security
887  100 Marine Parkway
888  Suite 500
889  Redwood City, CA 94065
890  USA
891  Email: bgreenblatt@rsa.com
892  Phone: +1-650-595-8782

894  Ryan Moats
895  AT&T
896  15621 Drexel Circle
897  Omaha, NE 68135-2358
898  USA
899  Email: jayhawk@ds.internic.net
900  Phone: +1 402 894-9456

902  Mark Wahl
903  Critical Angle, Inc.
904  4815 W Braker Lane #502-385
905  Austin, TX 78759
906  Email: M.Wahl@critical-angle.com
907  Table of Contents

909  1. Introduction  . . . . . . . . . . . . . . . . . . . . . . . . .   2
910  2. Background  . . . . . . . . . . . . . . . . . . . . . . . . . .   2
911  3. Object  . . . . . . . . . . . . . . . . . . . . . . . . . . . .   4
912  4. The Tagged Index Object . . . . . . . . . . . . . . . . . . . .   5
913  4.1. The Agreement . . . . . . . . . . . . . . . . . . . . . . . .   5
914  4.2. Content Type  . . . . . . . . . . . . . . . . . . . . . . . .   7
915  4.3. Tagged Index BNF  . . . . . . . . . . . . . . . . . . . . . .   8
916  4.3.1. Header Descriptions . . . . . . . . . . . . . . . . . . . .  10
917  4.3.2. Tokenization types  . . . . . . . . . . . . . . . . . . . .  11
918  4.3.3. Tag Conventions . . . . . . . . . . . . . . . . . . . . . .  11
919  4.4. Incremental Indexing  . . . . . . . . . . . . . . . . . . . .  11
920  5. Example . . . . . . . . . . . . . . . . . . . . . . . . . . . .  13
921  6. Aggregation . . . . . . . . . . . . . . . . . . . . . . . . . .  18
922  6.1. Aggregation of Tagged Index Objects . . . . . . . . . . . . .  18
923  7. Security Considerations . . . . . . . . . . . . . . . . . . . .  18
924  8. References  . . . . . . . . . . . . . . . . . . . . . . . . . .  19
925  9. Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . .  20