idnits 2.17.1

draft-ietf-find-cip-tagged-07.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in
     this document.
     Expected boilerplate is as follows today (2024-04-25) according to
     https://trustee.ietf.org/license-info :

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.a:

        This Internet-Draft is submitted in full conformance with the
        provisions of BCP 78 and BCP 79.

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2:

        Copyright (c) 2024 IETF Trust and the persons identified as the
        document authors. All rights reserved.

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3:

        This document is subject to BCP 78 and the IETF Trust's Legal
        Provisions Relating to IETF Documents
        (https://trustee.ietf.org/license-info) in effect on the date of
        publication of this document. Please review these documents
        carefully, as they describe your rights and restrictions with respect
        to this document. Code Components extracted from this document must
        include Simplified BSD License text as described in Section 4.e of
        the Trust Legal Provisions and are provided without warranty as
        described in the Simplified BSD License.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** Missing expiration date. The document expiration date should appear on
     the first and last page.

  ** The document seems to lack a 1id_guidelines paragraph about
     Internet-Drafts being working documents.

  ** The document seems to lack a 1id_guidelines paragraph about 6 months
     document validity -- however, there's a paragraph with a matching
     beginning. Boilerplate error?

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     current Internet-Drafts.
  ** The document seems to lack a 1id_guidelines paragraph about the list of
     Shadow Directories.

  ** The document is more than 15 pages and seems to lack a Table of Contents.

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard

  == The page length should not exceed 58 lines per page, but there was 22
     longer pages, the longest (page 21) being 83 lines

  == It seems as if not all pages are separated by form feeds - found 0 form
     feeds but 23 pages

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section. (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack separate sections for Informative/Normative
     References. All references will be assumed normative when checking for
     downward references.

  ** There are 13 instances of too long lines in the document, the longest one
     being 3 characters in excess of 72.

  ** The document seems to lack a both a reference to RFC 2119 and the
     recommended RFC 2119 boilerplate, even if it appears to use RFC 2119
     keywords.

     RFC 2119 keyword, line 424: '...objects MUST be performed in the order...'

     RFC 2119 keyword, line 930: '...jects for export MUST ensure that the ...'

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == Line 230 has weird spacing: '...re only modif...'

  == Line 262 has weird spacing: '... should sign...'

  == Line 926 has weird spacing: '...ing the two...'

  == Line 927 has weird spacing: '...lists of val...'

  == Line 937 has weird spacing: '... This speci...'

  == (21 more instances...)

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008. If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment. If not, you may need to add the pre-RFC5378 disclaimer.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- Couldn't find a document date in the document -- date freshness check
     skipped.

  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '<CODE BEGINS>' and
     '<CODE ENDS>' lines.

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Possible downref: Non-RFC (?) normative reference: ref. '1'

  ** Downref: Normative reference to an Historic RFC: RFC 1913 (ref. '2')

  ** Obsolete normative reference: RFC 2251 (ref. '3') (Obsoleted by RFC 4510,
     RFC 4511, RFC 4512, RFC 4513)

  -- Possible downref: Non-RFC (?) normative reference: ref. '4'

  -- Possible downref: Non-RFC (?) normative reference: ref. '5'

  -- Possible downref: Non-RFC (?) normative reference: ref. '6'

  -- Possible downref: Non-RFC (?) normative reference: ref. '7'

  ** Obsolete normative reference: RFC 2255 (ref. '8') (Obsoleted by RFC 4510,
     RFC 4516)

  -- Possible downref: Non-RFC (?) normative reference: ref. '10'

  -- Possible downref: Non-RFC (?) normative reference: ref. '11'

     Summary: 14 errors (**), 0 flaws (~~), 9 warnings (==), 10 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

2 Network Working Group                                     Roland Hedberg
3 Internet Draft                                           Bruce Greenblatt
4                                                                Ryan Moats
5 Expires in six months                                           Mark Wahl

7 A Tagged Index Object for use in the Common Indexing Protocol

9 Status of this Memo

11 This document is an Internet-Draft.
Internet-Drafts are working
12 documents of the Internet Engineering Task Force (IETF), its areas, and
13 its working groups. Note that other groups may also distribute working
14 documents as Internet-Drafts.

16 Internet-Drafts are draft documents valid for a maximum of six
17 months. Internet-Drafts may be updated, replaced, or made obsolete by
18 other documents at any time. It is not appropriate to use Internet-
19 Drafts as reference material or to cite them other than as a "working
20 draft" or "work in progress".

22 To view the entire list of current Internet-Drafts, please check
23 the "1id-abstracts.txt" listing contained in the Internet-Drafts
24 Shadow Directories on ftp.is.co.za (Africa), ftp.nordu.net
25 (Northern Europe), ftp.nis.garr.it (Southern Europe), munnari.oz.au
26 (Pacific Rim), ftp.ietf.org (US East Coast), or ftp.isi.edu
27 (US West Coast).

29 Distribution of this document is unlimited.

31 Abstract

33 This document defines a mechanism by which information servers can
34 exchange indices of information from their databases by making use of
35 the Common Indexing Protocol (CIP). This document defines the structure
36 of the index information being exchanged, as well as the appropriate
37 meanings for the headers that are defined in the Common Indexing Protocol.
38 It is assumed that the structures defined here can be used by
39 X.500 DSAs, LDAP servers, Whois++ servers, CSO Ph servers and many others.

41 1. Introduction

43 The Common Indexing Protocol (CIP) as defined in [1] proposes a
44 mechanism for distributing searches across several instances of a single
45 type of search engine to create a global directory. CIP
46 provides a scalable, flexible scheme to tie individual databases into
47 distributed data warehouses that can scale gracefully with the growth of
48 the Internet. CIP provides a mechanism for meeting these goals that is
49 independent of the access method that is used to access the data
50 that underlies the indices.
Separate from CIP is the definition of the
51 Index Object that is used to contain the information that is exchanged
52 among Index Servers. One such Index Object that has already been
53 defined is the Centroid that is derived from the Whois++ protocol [2].

55 The Centroid does not meet all the requirements for the exchange
56 of index information amongst information servers. For example, it does
57 not support the notion of incremental updates natively. For information
58 servers that contain millions of records in their database, constant
59 exchange of complete dredges of the database is bandwidth intensive.
60 The Tagged Index Object is specifically designed to support the exchange
61 of index update information. This design comes at the cost of an
62 increase in the size of the index object being exchanged. The Centroid
63 is also not tailored to always be able to give boolean answers to
64 queries. In the Centroid Model, "an index server will take a query in
65 standard Whois++ format, search its collections of centroids and other
66 forward information, determine which servers hold records which may fill
67 that query, and then notifies the user's client of the next servers to
68 contact to submit the query." [2] Thus, the exchange of Centroids
69 amongst index servers allows hints to be given about which information
70 server actually contains the information. The Tagged Index Object
71 labels the various pieces of information with identifiers that tie the
72 individual object attributes back to an object as a whole. This "tagging"
73 of information allows an index server to be more capable of
74 directing a specific query to the appropriate information server.
75 Again, this feature is added to the Tagged Index Object at the expense
76 of an increase in the size of the index object.

78 2. Background

80 The Lightweight Directory Access Protocol (LDAP) is defined in [3],
81 and it defines a mechanism for accessing a collection of information
82 arranged hierarchically in such a way as to provide a globally
83 distributed database which is normally called the Directory Information
84 Tree (DIT). A distinguishing characteristic of LDAP servers is
85 that, normally, several servers cooperate to manage a
86 common subtree of the DIT. LDAP servers are expected to respond to
87 requests that pertain to portions of the DIT for which they have data,
88 as well as for those portions for which they have no information in
89 their database. For example, the LDAP server for a portion of the DIT in
90 the United States (c=US) must be able to provide a response to a Search
91 operation that pertains to a portion of the DIT in Sweden (c=se).
92 Normally, the response given will be a referral to another LDAP server that
93 is expected to be more knowledgeable about the appropriate subtree.
94 However, there is no mechanism that currently enables these LDAP servers
95 to refer the LDAP client to the supposedly more knowledgeable server.
96 Typically, an LDAP (v3) server is configured with the name of exactly
97 one other LDAP server to which all LDAP clients are referred when their
98 requests fall outside the subtree of the DIT for which that LDAP server
99 has knowledge. This specification defines a mechanism whereby LDAP
100 servers can exchange index information that will allow referrals to point
101 towards an accurate destination.

103 The X.500 series of recommendations defines the Directory
104 Information Shadowing Protocol (DISP) [4] which allows X.500 DSAs to
105 exchange information in the DIT. Shadowing allows various
106 information from various portions of the DIT to be replicated amongst
107 participating DSAs.
DISP is designed for the exchange
108 of entire portions of the DIT, whereas CIP and the
109 Tagged Index Object are designed for the exchange of structural index
110 information about the DIT and for improving the performance of tree
111 navigation amongst various information servers. The Tagged Index Object is
112 more appropriate for the exchange of index information than is DISP.
113 DISP is more targeted at DIT distribution and fault tolerance. DISP is
114 thus more appropriate for the exchange of the data in order to
115 spread the load amongst several information servers. DISP is tailored
116 specifically to X.500 (and other hierarchical directory systems), while
117 the Tagged Index Object and CIP can be used in a wide variety of
118 information server environments.

120 While DISP allows an individual directory server to collect
121 information about large parts of the DIT, it would require a huge database to
122 collect all the replicas for a significant portion of the DIT.
123 Furthermore, as X.525 states: "Before shadowing can occur, an agreement,
124 covering the conditions under which shadowing may occur is required.
125 Although such agreements may be established in a variety of ways, such
126 as policy statements covering all DSAs within a given DMD ...", where a
127 DMD is a Directory Management Domain. This is because the
128 data in the DIT is being exchanged amongst DSAs rather than only
129 the information required to maintain an Index. In many environments
130 such an agreement is not appropriate, and to collect information
131 for a meaningful portion of the DIT, many agreements
132 may need to be arranged.

134 3. Object

136 What is desired is to have an information server (or network of
137 information servers) that can quickly respond to real world requests,
138 like:

140 - What is Tim Howes's email address? This is much harder than: What
141 email address does Tim Howes at Netscape have?

143 - What is the X.509 certificate for Fred Smith at compuserve.com?
144 One certainly doesn't want to search CompuServe's entire directory
145 tree to find out this one piece of information. I also don't want
146 to have to shadow the entire CompuServe directory subtree onto my
147 server. If this request is being made because Fred is trying to
148 log into my server, I'd certainly want to be able to respond to the
149 BIND in real time.

151 - Who are all the people at Novell that have a title of programmer?

153 All these requests can reasonably be translated into LDAP,
154 Whois++, or other directory access protocol queries. They can also be
155 serviced in a straightforward way by the user's home information
156 server if it has the appropriate reference information into the database
157 that contains the source data. Here, the first server
158 would be able to "chain" the request for the user.
159 Alternatively, a precise referral could be returned. If the home information
160 server wants to service (i.e., chain) the request based on the index
161 information that it has on hand, this servicing could be done by several
162 different means:

164 - issuing LDAP operations to the remote directory server

166 - issuing DSP operations to the remote directory server

168 - issuing DAP operations to the remote directory server
169 - issuing Whois++ operations to the remote Whois++ server

171 - ...

173 4. The Tagged Index Object

175 This section defines a Tagged Index Object that can be exchanged by
176 Information Servers using CIP.
While often it is acceptable for
177 Information Servers to make use of the Centroid definition (from
178 [2]) to exchange index information, the goals in defining a new
179 construct are twofold:

181 - When the Information Server receives a search request that warrants
182 that a referral be returned, allow the server to return a referral
183 that will point the client to a server that is most likely able to
184 answer the request correctly. False positive referrals (the search
185 turns up hits in the index object that generate referrals to
186 servers that don't hold the desired information) can be reduced,
187 depending on the choice of attribute tokenization types that are
188 used.

190 - Potentially allow incremental updates that will then consume
191 substantially less bandwidth than if full updates always had to be
192 used.

194 4.1. The Agreement

196 Before a Tagged Index Object can be exchanged, the organization
197 that administers the object supplier and the organization that
198 administers the object consumer must reach an agreement on how the servers
199 will communicate. This agreement contains the following:

201 - "index-type": This specification describes the index type
202 "x-tagged-index-1"

204 - "dsi": An OID that uniquely identifies the subtree and scope.
205 This field is not explicitly necessary, as it may not provide
206 information beyond what is contained in the "base-uri" below.

208 - "base-uri": One or more URIs that will form the base of any
209 referrals created based on the index object that is governed by
210 this agreement. For example, in the LDAP URL format [8] the
211 base-uri would specify (among other items): the LDAP host, the base
212 object to which this index object refers (e.g. c=SE), and the scope
213 of the index object (e.g. single container).

215 - "supplier": The hostname and listening port number of the supplier
216 server, as well as any alternative servers holding the same naming
217 contexts, for use if the supplier is unavailable.

219 - "consumeraddr": This is a URI of the "mailto:" form, with the RFC
220 822 email address of the consumer server. Future versions of
221 this draft may allow other forms of URI, so that the consumer may
222 retrieve the update via the WWW, FTP or CIP.

224 - "updateinterval": The maximum duration in seconds between
225 occurrences of the supplier server generating an update. If the
226 consumer server has not received an update from the supplier server
227 after waiting this long since the previous update, it is likely
228 that the index information is now out of date. A typical value for
229 a server with frequent updates would be 604800 seconds, or every
230 week. Servers whose DITs are only modified annually could have a
231 much longer update interval.

233 - "attributeNamespace": Every set of index servers that together
234 want to support a specific usage of indices has to agree on which
235 attribute names to use in the index objects. The participating
236 directory servers also have to agree on the mapping from local
237 attribute names to the attribute names used in the index. Since one
238 specific index server might be involved in several such sets, it
239 has to have some way to connect an update to the proper set of
240 indexes. One possible solution to this would be to use different
241 DSIs.

243 - "consistencybase": How consistency of the index is maintained over
244 incremental updates:

246    "complete" - every change or delete concerning one object has
247    to contain all tokens connected to that object. This method
248    must be supported by any server that wants to comply with this
249    standard.

251    "tag" - starting at a full update, every incremental update
252    referring back to this full update has to maintain state
253    information regarding tags, such that an object within the
254    original database is assigned the same tag number every time.
255    This method is optional.

257    "unique" - every object in the Dataset has to have a unique
258    value for a specific attribute in the index. An example of such
259    an attribute could be the distinguishedName attribute. This
260    method is also optional.

262 - "securityoption": Whether and how the supplier server should sign
263 and encrypt the update before sending it to the consumer server.
264 Options for this version of the specification are:

266    "none" - the update is sent in plaintext

268    "PGP/MIME": the update is digitally signed and encrypted using
269    PGP [9]

271    "S/MIME": the update is digitally signed and encrypted using
272    S/MIME [10]

274    "SSLv3": the update is digitally signed and encrypted using an
275    SSLv3 connection [11]

277    "Fortezza": the update is digitally signed and encrypted using
278    Fortezza [5]

280 It is recommended that the "PGP/MIME" option be used when exchanging
281 sensitive information across public networks and both the supplier
282 and consumer have PGP keys. The "Fortezza" option is intended for use in
283 environments where security protocols are based on Fortezza-compatible
284 devices. The "S/MIME" option can be used when both the supplier and
285 consumer have RSA keys and can make use of the PKCS protocols defined in
286 the S/MIME specification. The "SSLv3" option can be used when both the
287 supplier and consumer have access to SSL services, have server
288 certificates, and can mutually authenticate each other.

290 - Security Credentials: The long-term cryptographic credentials used
291 for key exchange and authentication of the consumer and supplier
292 servers, if a security option was selected. For "PGP/MIME," this
293 will be the trusted public keys of both servers. For "Fortezza,"
294 this will be the certificate paths of both servers to a common
295 point of trust. For "S/MIME" and "SSLv3" these will be the
296 certificates of the supplier and consumer.

298 Note that if the index server maintains the information that would
299 appear in the agreement in a directory according to the definitions in
300 [7], then no real formal agreement between the two parties needs to be
301 put in place, and the information that is required for communication
302 between the two index servers is derived automatically from the
303 directory.

305 4.2. Content Type

307 The update consists of a MIME object of type
308 application/cip-index-object. The parameters are:

310 "type": this has value "application/index.obj.tagged".

312 "dsi": the DSI (if any) from the agreement.

314 "base-uri": a set of URIs, separated by spaces. In each URI, the
315 hostname/portno must be distinct, and based on the "supplier" part
316 of the agreement.

318 The payload is mostly textual data but may include bytes with the
319 high bit set. The originating information server should set the
320 content-transfer-encoding as appropriate for the information included in
321 the payload.

323 This object may be encapsulated in a wrapper content (such as
324 multipart/signed) or be encrypted as part of the security procedures. The
325 resulting content can then be distributed, for example via electronic mail.
326 For example,
327 From: supplier@sup.com Date: Thu, 16 Jan 1997 13:50:37 -0500
328 Message-Id: <199701161850.NAA29295@sup.com>;
329 To: consumer@consumer.com <<-- from consumer server address

331 Reply-to: supplier-admin@sup.com
332 MIME-Version: 1.0
333 Content-Type: application/index.obj.tagged;
334 dsi=1.3.6.1.4.1.1466.85.85.1.2.3.4.5.6.7.8.9.10.11.12.13.14.15.16;
335 base-uri="ldap://sup.com/dc=sup,dc=com ldap://alt.com/dc=sup,dc=com"

337 The payload is a series of CRLF-terminated lines. The payload is
338 UTF-8.
339 Some supplier servers may only be able to generate the printable
340 US-ASCII subset of UTF-8, but all consumer servers must be able to
341 handle the full range of Unicode characters when decoding the attribute
342 values (in the "attr-value" field in the BNF below).

344 4.3. Tagged Index BNF

346 The Tagged Index object has the following grammar, expressed in
347 modified BNF format:

349 index-object      = 0*(io-part SEP) io-part
350 io-part           = header SEP schema-spec SEP index-info
351 header            = version-spec SEP update-type SEP this-update SEP
352                     last-update context-size name-space SEP
353 version-spec      = "version:" *SPACE "x-tagged-index-1"
354 update-type       = "updatetype:" *SPACE ( "total" |
355                     ( "incremental" [*SPACE "tagbased"|"uniqueIDbased" ] ) )
356 this-update       = "thisupdate:" *SPACE TIMESTAMP
357 last-update       = [ "lastupdate:" *SPACE TIMESTAMP SEP]
358 context-size      = [ "contextsize:" *SPACE 1*DIGIT SEP]
359 schema-spec       = "BEGIN IO-Schema" SEP 1*(schema-line SEP)
360                     "END IO-Schema"
361 schema-line       = attribute-name ":" token-type
362 token-type        = "FULL" | "TOKEN" | "RFC822" | "UUCP" | "DNS"
363 index-info        = full-index | incremental-index
364 full-index        = "BEGIN Index-Info" SEP 1*(index-block SEP)
365                     "END Index-Info"
366 incremental-index = 1*(add-block | delete-block | update-block)
367 add-block         = "BEGIN Add Block" SEP 1*(index-block SEP)
368                     "END Add Block"
369 delete-block      = "BEGIN Delete Block" SEP 1*(index-block SEP)
370                     "END Delete Block"
371 update-block      = "BEGIN Update Block" SEP
372                     0*(old-index-block SEP)
373                     1*(new-index-block SEP)
374                     "END Update Block"
375 old-index-block   = "BEGIN Old" SEP 1*(index-block SEP)
376                     "END Old"
377 new-index-block   = "BEGIN New" SEP 1*(index-block SEP)
378                     "END New"
379 index-block       = first-line 0*(SEP cont-line)
380 first-line        = attr-name ":" *SPACE taglist "/" attr-value
381 cont-line         = "-" taglist "/" attr-value
382 taglist           = tag 0*("," tag) | "*"
383 tag               = 1*DIGIT ["-" 1*DIGIT]
384 attr-value        = 1*(UTF8)
385 attr-name         = 1*(NAMECHAR)
386 TIMESTAMP         = 1*DIGIT
387 NAMECHAR          = DIGIT | UPPER | LOWER | "-" | ";" | "."
388 SPACE             = <ASCII space, %x20>;
389 SEP               = (CR LF) | LF
390 CR                = <ASCII CR, carriage return, %x0D>;
391 LF                = <ASCII LF, line feed, %x0A>;
392 DIGIT             = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" |
393                     "8" | "9"

395 UPPER             = "A" | "B" | "C" | "D" | "E" | "F" | "G" | "H" |
396                     "I" | "J" | "K" | "L" | "M" | "N" | "O" | "P" |
397                     "Q" | "R" | "S" | "T" | "U" | "V" | "W" | "X" |
398                     "Y" | "Z"
399 LOWER             = "a" | "b" | "c" | "d" | "e" | "f" | "g" | "h" |
400                     "i" | "j" | "k" | "l" | "m" | "n" | "o" | "p" |
401                     "q" | "r" | "s" | "t" | "u" | "v" | "w" | "x" |
402                     "y" | "z"

404 US-ASCII-SAFE     = %x01-09 / %x0B-0C / %x0E-7F
405                     ;; US-ASCII except CR, LF, NUL
406 UTF8              = US-ASCII-SAFE / UTF8-1 / UTF8-2 / UTF8-3
407                     / UTF8-4 / UTF8-5
408 UTF8-CONT         = %x80-BF
409 UTF8-1            = %xC0-DF UTF8-CONT
410 UTF8-2            = %xE0-EF 2UTF8-CONT
411 UTF8-3            = %xF0-F7 3UTF8-CONT
412 UTF8-4            = %xF8-FB 4UTF8-CONT
413 UTF8-5            = %xFC-FD 5UTF8-CONT

415 The set of characters allowed to appear in the attr-name field is
416 limited to the set of characters used in LDAP and WHOIS++ attribute
417 names. Services whose attribute name character sets
418 are larger than these should create a profile
419 that maps the names onto object identifiers; the resulting sequence of
420 digits and periods is then used by those services in creating the attr-name
421 fields for their Tagged Index Objects.

423 Updates to an index based on tagged index
424 objects MUST be performed in the order specified by the tagged index
425 object itself.

427 4.3.1. Header Descriptions

429 The header section consists of one or more "header lines". The
430 following header lines are defined:

432 "version": This line must always be present, and have the value
433 "x-tagged-index-1" for this version of the specification.

435 "updatetype": This line must always be present. Its
436 value is either "total" or "incremental".
The first update sent by
437 a supplier server to a consumer server for a DSI must be a "total"
438 update.

440 "thisupdate": This line must always be present. The value is the
441 number of seconds from 00:00:00 UTC January 1, 1970 at which the
442 supplier constructed this update.

444 "lastupdate": This line must be present if the "updatetype" line
445 has the value "incremental". The value is the number of seconds
446 from 00:00:00 UTC January 1, 1970 at which the supplier constructed
447 the previous update sent to the consumer. This field allows the
448 consumer to determine if a previous update was missed.

450 "contextsize": This line may be present at the supplier's option.
451 The value is a number, which is the approximate total number of
452 entries in the subtree. This information is provided for
453 statistical purposes only.

455 4.3.2. Tokenization Types

457 The Tagged Index Object inherits the "TOKEN" scheme for
458 tokenization as specified in [2]. In addition, there are several other
459 tokenization schemes defined for the Tagged Index Object. The following
460 table presents these schemes and what character(s) are used to delimit
461 tokens.

463 Token Type    Tokenization Characters
464 FULL          none
465 TOKEN         white space, "@"
466 RFC822        white space, ".", "@"
467 UUCP          white space, "!"
468 DNS           any character not a number, letter, or "-"

470 4.3.3. Tag Conventions

472 In the tag list, multiple consecutive tags may be shortened by
473 using "#-#". For example, the list "3,4,5,6,7,8,9,10" may be shortened
474 to "3-10". Tags are to be applied to the data on a per entry level.
475 Thus, if two index lines in the same index object contain the same tag,
476 then those two lines always refer to the same
477 "record" in the directory. In LDAP terminology, the two lines would
478 refer to the same directory object. Additionally, if two index
479 lines in the same index object contain different tags, then it is always
480 the case that those two lines refer back to different records in the
481 directory. The meaning of '*' in the tag position is that that specific
482 token appears in every record in the directory.

484 The tag applied to the same underlying record in two separate
485 transmissions of a full-index may be different. Thus, receiving index
486 servers should make no assumptions about the values of the tags across
487 index object boundaries.

489 4.4. Incremental Indexing

491 The tagged index object format supports the ability of information
492 servers to distribute only delta index data, rather than distributing
493 total index information each time. This scenario, known as incremental
494 indexing, supports three basic types of operations: add, delete and
495 update. If the incremental updatetype is specified in the tagged index
496 object, then the index object contains a snapshot of only the changes
497 that have been made since the index object specified in the lastupdate
498 header was distributed. If the receiving index server did not receive
499 that index object, it should request a total index object. If the CIP
500 protocol supports it, the index server may request the specific index
501 object that it missed.

503 If the tagged index object contains an Add Block, then the lines in
504 the Add Block refer to new records that were added to the information
505 base of the transmitting index server. It can be guaranteed that those
506 records did not exist in any previously received tagged index object,
507 and the receiving index server can insert this index information in the
508 index that it already maintains for the transmitting index server.
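
As an illustration only, the tag list shorthand of Section 4.3.3 and the
missed-update check that a consumer can make with the "lastupdate" header
might be sketched in Python as follows. This is a minimal sketch under stated
assumptions, not part of this specification: the function names
(expand_taglist, parse_headers, must_request_total) are hypothetical, header
parsing is simplified to "name: value" lines preceding the first BEGIN line,
and the consumer is assumed to remember the "thisupdate" value of the last
index object it applied.

```python
from typing import Dict, Iterable, List, Optional


def expand_taglist(taglist: str) -> Optional[List[int]]:
    """Expand a taglist such as "1,3-5" into [1, 3, 4, 5].

    Returns None for "*", which marks a token present in every record.
    """
    if taglist == "*":
        return None
    tags: List[int] = []
    for part in taglist.split(","):
        first, dash, last = part.partition("-")
        if dash:
            # "#-#" shorthand: an inclusive range of consecutive tags.
            tags.extend(range(int(first), int(last) + 1))
        else:
            tags.append(int(first))
    return tags


def parse_headers(lines: Iterable[str]) -> Dict[str, str]:
    """Collect header lines up to the first BEGIN line (simplified)."""
    headers: Dict[str, str] = {}
    for line in lines:
        if line.startswith("BEGIN "):
            break
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers


def must_request_total(headers: Dict[str, str],
                       previous_thisupdate: Optional[int]) -> bool:
    """Decide whether an incremental update can be applied.

    previous_thisupdate is the "thisupdate" value of the last index
    object the consumer applied (an assumption of this sketch), or None
    if no total update has been received yet.
    """
    if not headers.get("updatetype", "").startswith("incremental"):
        return False  # a total update can always be applied
    if previous_thisupdate is None or "lastupdate" not in headers:
        return True
    # A mismatch means at least one intermediate update was missed.
    return int(headers["lastupdate"]) != previous_thisupdate
```

With these definitions, expand_taglist("3-10") yields the tags 3 through 10,
matching the example in Section 4.3.3, and must_request_total returns True
whenever the "lastupdate" value of an incremental update differs from the
timestamp of the last update the consumer holds.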

If the tagged index object contains a Delete Block, then the
structure of the Delete Block depends on how the consistency is
maintained:

- "completeRecord": all the tokens connected to the record to be
  deleted have to be included. The tag used to connect tokens in this
  message has no relation to tags used in previously sent tagged index
  objects.

- "uniqueIDBased": only the unique identifier has to be defined.

- "tagBased": all the tokens connected to the record have to be
  included, each preceded by the tag that was used for this specific
  record in the last full update and in the incremental updates that
  followed it.

If the tagged index object contains an Update Block, then the lines
in the Update Block refer to records that were changed in the information
base of the transmitting index server. Again, the specific content of
the block depends on how the consistency is maintained:

- "completeRecord": all the tokens representing the old version of the
  record, as well as those representing the new version, have to be
  included.

- "uniqueIDBased": the unique ID has to be included together with the
  tokens that have changed.

- "tagBased": only the changed tokens are included, but both in their
  old version, if there was one, and in their new version, if there is
  one.

The Update Block also supports the idea of indexing new
attributes that were not previously included in the tagged index
object. For example, if the transmitting index server began including
index information on postal addresses, then it could include an Update
Block in the index object that included all the index information on
postal addresses for all records in its information base, and indicate
that nothing else has changed.

5. Example

In the following sections, the tagged index object is shown for each
consistencybase type in the following scenario: the example starts
with one full update, which is followed by a set of incremental
updates. The underlying information is presented in the LDIF [6] format.

5.1 The original database

dn: cn=Barbara Jensen, ou=Product Development, o=Ace Industry, c=US
objectclass: top
objectclass: person
objectclass: organizationalPerson
cn: Barbara Jensen
cn: Barbara J Jensen
cn: Babs Jensen
sn: Jensen
uid: bjensen

dn: cn=Bjorn Jensen, ou=Accounting, o=Ace Industry, c=US
objectclass: top
objectclass: person
objectclass: organizationalPerson
cn: Bjorn Jensen
sn: Jensen
title: Accounting manager

dn: cn=Gern Jensen, ou=Product Testing, o=Ace Industry, c=US
objectclass: top
objectclass: person
objectclass: organizationalPerson
cn: Gern Jensen
cn: Gern O Jensen
sn: Jensen
title: testpilot

dn: cn=Horatio Jensen, ou=Product Testing, o=Ace Industry, c=US
objectclass: top
objectclass: person
objectclass: organizationalPerson
cn: Horatio Jensen
cn: Horatio N Jensen
sn: Jensen
title: testpilot

5.1.1 "complete" consistency based full update

version: x-tagged-index-1
updatetype: total
thisupdate: 855938804
BEGIN IO-Schema
cn: TOKEN
sn: FULL
title: TOKEN
END IO-Schema
BEGIN Index-Info
cn: 1/Barbara
-1/J
-1/Babs
-*/Jensen
-2/Bjorn
-3/Gern
-3/O
-4/Horatio
-4/N
sn: */Jensen
title: 1/product
-1-2/manager
-1/accounting
-3,4/testpilot
END Index-Info

5.1.2 "tag" consistency based full update

version: x-tagged-index-1
updatetype: total
thisupdate: 855938804
BEGIN IO-Schema
cn: TOKEN
sn: FULL
title: TOKEN
END IO-Schema
BEGIN Index-Info
cn: 1/Barbara
-1/J
-1/Babs
-*/Jensen
-2/Bjorn
-3/Gern
-3/O
-4/Horatio
-4/N
sn: */Jensen
title: 1/product
-1-2/manager
-1/accounting
-3,4/testpilot
END Index-Info

5.1.3 "unique" consistency based full update

version: x-tagged-index-1
updatetype: total
thisupdate: 855938804
BEGIN IO-Schema
dn: FULL
cn: TOKEN
sn: FULL
title: TOKEN
END IO-Schema
BEGIN Index-Info
dn: 1/cn=Barbara Jensen, ou=Product Development, o=Ace Industry, c=US
-2/cn=Bjorn Jensen, ou=Accounting, o=Ace Industry, c=US
-3/cn=Gern Jensen, ou=Product Testing, o=Ace Industry, c=US
-4/cn=Horatio Jensen, ou=Product Testing, o=Ace Industry, c=US
cn: 1/Barbara
-1/J
-1/Babs
-*/Jensen
-2/Bjorn
-3/Gern
-3/O
-4/Horatio
-4/N
sn: */Jensen
title: 1/product
-1-2/manager
-1/accounting
-3,4/testpilot
END Index-Info

5.2 First update

Gern Jensen's entry above changes to:

dn: cn=Gern Jensen, ou=Product Testing, o=Ace Industry, c=US
objectclass: top
objectclass: person
objectclass: organizationalPerson
cn: Gern Jensen
cn: Gern O Jensen
sn: Jensen
title: chiefpilot

5.2.1 First update using "complete"

version: x-tagged-index-1
updatetype: incremental
lastupdate: 855940000
thisupdate: 855938804
BEGIN IO-schema
cn: TOKEN
sn: FULL
title: FULL
END IO-Schema
BEGIN Update Block
BEGIN Old
cn: 1/Gern
cn: 1/O
cn: 1/Jensen
sn: 1/Jensen
title: 1/testpilot
END Old
BEGIN New
cn: 1/Gern
cn: 1/O
cn: 1/Jensen
sn: 1/Jensen
title: 1/chiefpilot
END New
END Update Block

5.2.2 First update using "tag" consistency

version: x-tagged-index-1
updatetype: incremental
lastupdate: 855940000
thisupdate: 855938804
BEGIN IO-schema
cn: TOKEN
sn: FULL
title: FULL
END IO-Schema
BEGIN Update Block
BEGIN Old
title: 3/testpilot
END Old
BEGIN New
title: 3/chiefpilot
END New
END Update Block

5.2.3 First update using "unique" IDs

version: x-tagged-index-1
updatetype: incremental
lastupdate: 855940000
thisupdate: 855938804
BEGIN IO-schema
cn: TOKEN
sn: FULL
title: FULL
END IO-Schema
BEGIN Update Block
BEGIN Old
dn: 1/cn=Gern Jensen, ou=Product Testing, o=Ace Industry, c=US
title: 1/testpilot
END Old
BEGIN New
dn: 1/cn=Gern Jensen, ou=Product Testing, o=Ace Industry, c=US
title: 1/chiefpilot
END New
END Update Block

5.3 Second update

# Add a new entry
dn: cn=Bo Didley, ou=Marketing, o=Ace Industry, c=US
changetype: add
objectclass: top
objectclass: person
objectclass: organizationalPerson
cn: Bo Didley
sn: Didley
title: Policy Maker

# Delete an existing entry
dn: cn=Bjorn Jensen, ou=Accounting, o=Ace Industry, c=US
changetype: delete

# Modify all other entries: adding an additional locality value
dn: cn=Barbara Jensen, ou=Product Development, o=Ace Industry, c=US
changetype: modify
add: locality
locality: New Jersey

dn: cn=Gern Jensen, ou=Product Testing, o=Ace Industry, c=US
changetype: modify
add: locality
locality: New Orleans

dn: cn=Horatio Jensen, ou=Product Testing, o=Ace Industry, c=US
changetype: modify
add: locality
locality: New Caledonia

5.3.1 "complete"

version: x-tagged-index-1
updatetype: incremental
lastupdate: 855938804
thisupdate: 855939525
BEGIN IO-schema
cn: TOKEN
sn: FULL
title: FULL
locality: TOKEN
END IO-Schema
BEGIN Add Block
cn: 1/Bo
-1/Didley
sn: 1/Didley
title: 1/Policy
-1/maker
locality: 1/New
-1/York
END Add Block
BEGIN Delete Block
cn: 1/Bjorn
-1/Jensen
sn: 1/Jensen
title: 1/Accounting
-1/Manager
END Delete Block
BEGIN Update Block
BEGIN Old
cn: 1/Barbara
-1/J
-1-3/Jensen
-2/Gern
-2/O
-3/Horatio
sn: 1-3/Jensen
title: 1/Production
-1/Manager
-2/Testpilot
-3/Chiefpilot
END Old
BEGIN New
cn: 1/Barbara
-1/J
-1-3/Jensen
-2/Gern
-2/O
-3/Horatio
sn: 1-3/Jensen
title: 1/Production
-1/Manager
-2/Testpilot
-3/Chiefpilot
locality: 1/Jersey
-2/Orleans
-3/Caledonia
-1-3/New
END New
END Update Block

5.3.2 "tag"

version: x-tagged-index-1
updatetype: incremental
lastupdate: 855938804
thisupdate: 855939525
BEGIN IO-schema
cn: TOKEN
sn: FULL
title: FULL
locality: TOKEN
END IO-Schema
BEGIN Add Block
cn: 5/Bo
-5/Didley
sn: 5/Didley
title: 5/Policy
-5/maker
locality: 5/New
-5/York
END Add Block
BEGIN Delete Block
cn: 2/Bjorn
-2/Jensen
sn: 2/Jensen
title: 2/Accounting
-2/Manager
END Delete Block
BEGIN Update Block
BEGIN New
locality: 1/Jersey
-3/Orleans
-4/Caledonia
-1,3,4/New
END New
END Update Block

5.3.3 "unique"

version: x-tagged-index-1
updatetype: incremental
lastupdate: 855938804
thisupdate: 855939525
BEGIN IO-schema
cn: TOKEN
sn: FULL
title: FULL
locality: TOKEN
END IO-Schema
BEGIN Add Block
dn: 1/cn=Bo Didley, ou=Marketing, o=Ace Industry, c=US
cn: 1/Bo
-1/Didley
sn: 1/Didley
title: 1/Policy
-1/maker
locality: 1/New
-1/York
END Add Block
BEGIN Delete Block
dn: 1/cn=Bjorn Jensen, ou=Accounting, o=Ace Industry, c=US
END Delete Block
BEGIN Update Block
BEGIN New
dn: 1/cn=Barbara Jensen, ou=Product Development, o=Ace Industry, c=US
-2/cn=Gern Jensen, ou=Product Testing, o=Ace Industry, c=US
-3/cn=Horatio Jensen, ou=Product Testing, o=Ace Industry, c=US
locality: 1/Jersey
-2/Orleans
-3/Caledonia
-1-3/New
END New
END Update Block

6. Aggregation

6.1. Aggregation of Tagged Index Objects

Aggregation of two tagged index objects is done by merging the two
lists of values and rewriting each tag list. The tag list rewriting
process is done so that the resulting index object appears as if it came
from a single source. An index server that aggregates tagged index
objects for export MUST ensure that the export URL (i.e. the base-uri of
the CIP object) for the aggregate index object will route all queries
that have "hits" on the index object to that server (otherwise, query
routing will not succeed).

7. Security Considerations

This specification provides a protocol for transferring information
between two servers. The information transferred may be protected
by laws in many countries, so care must be taken in the methods used to
tokenize the data to ensure that protected data cannot be
reconstructed in full by the receiving server. This protocol does not
have any inherent protection against spoofing or eavesdropping.
However, since this protocol is transported in MIME messages (as are all
CIP index objects), it inherits all the security capabilities and
liabilities of other MIME messages. Specifically, those wanting to
prevent eavesdropping or spoofing may use some of the various techniques
for signing and encrypting MIME messages.

Information Server administrators must decide what portions of
their databases are appropriate for inclusion in the Tagged Index
Object. For distribution of information outside the enterprise,
information server developers are encouraged to allow for facilities
that hide the organizational structure when generating the Tagged Index
Object from the underlying information database. To allow for
the secure transmission of Tagged Index Objects across the Internet,
Index Servers should make use of SSL when completing the connection.
In order to strongly verify the identity of the peer index server on the
other side of the connection, SSL version 3 certificate exchange should
be implemented, and the identity in the peer's certificate should be
verified against the Public Key Infrastructure. If electronic mail is
used to exchange the Tagged Index Objects, then a secure messaging
facility, such as PGP/MIME or S/MIME, should be used to sign or encrypt
(or both) the information.

8. References

[1] J. Allen, M. Mealling, "The Architecture of the Common Indexing
Protocol (CIP)", Internet Draft (work in progress), June 1997.

[2] C. Weider, J. Fullton, S. Spero, "Architecture of the Whois++ Index
Service", RFC 1913, February 1996.

[3] M. Wahl, T. Howes, S. Kille, "Lightweight Directory Access Protocol
(v3)", RFC 2251, December 1997.

[4] ITU, "X.525 Information Technology - Open Systems Interconnection -
The Directory: Replication", November 1993.

[5] "FORTEZZA Application Implementors Guide for the FORTEZZA Crypto
Card (Production Version)", Document #PD4002102-1.01, SPYRUS, 1995.

[6] G. Good, "The LDAP Data Interchange Format (LDIF) - Technical
Specification", Internet Draft (work in progress), November 1998.

[7] R. Hedberg, "LDAPv2 client Vs the Index Mesh", Internet Draft (work
in progress), November 1997.

[8] T. Howes, M. Smith, "The LDAP URL Format", RFC 2255, December 1997.

[9] M. Elkins, "MIME Security with Pretty Good Privacy (PGP)", RFC 2015,
October 1996.

[10] B. Ramsdell, "S/MIME Version 3 Message Specification", Internet
Draft (work in progress), August 1998.

[11] C. Allen, T. Dierks, "The TLS Protocol Version 1.0", Internet
Draft (work in progress), November 1997.

9. Author's Addresses

Roland Hedberg
Catalogix
Dalsveien 53
0387 Oslo
Norway
Email: roland@catalogix.ac.se

Bruce Greenblatt
6841 Heaton Moor Drive
San Jose, CA 95119
USA
Email: bruceg@innetix.com
Phone: +1-408-224-5349

Ryan Moats
AT&T
15621 Drexel Circle
Omaha, NE 68135-2358
USA
Email: jayhawk@att.com
Phone: +1 402 894-9456

Mark Wahl
Innosoft International, Inc.
8911 Capital of Texas Hwy, Suite 4140
Austin, TX 78759
USA
Phone: +1 626 919 3600
Email: Mark.Wahl@innosoft.com

Table of Contents

1. Introduction
2. Background
3. Object
4. The Tagged Index Object
4.1. The Agreement
4.2. Content Type
4.3 Tagged Index BNF
4.3.1. Header Descriptions
4.3.2. Tokenization types
4.3.3. Tag Conventions
4.4. Incremental Indexing
5. Examples
5.1 The original database
5.1.1 "complete" consistency based full update
5.1.2 "tag" consistency based full update
5.1.3 "unique" consistency based full update
5.2 First update
5.2.1 "complete" consistency based incremental update
5.2.2 "tag" consistency based incremental update
5.2.3 "unique" consistency based incremental update
5.3 Second update
5.3.1 "complete" consistency based incremental update
5.3.2 "tag" consistency based incremental update
5.3.3 "unique" consistency based incremental update
6. Aggregation
6.1 Aggregation of Tagged Index Objects
7. Security Considerations
8. References
9. Author's Addresses