idnits 2.17.1 

draft-ietf-find-cip-tagged-06.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in
     this document.

     Expected boilerplate is as follows today (2024-04-25) according to
     https://trustee.ietf.org/license-info :

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.a:
        This Internet-Draft is submitted in full conformance with the provisions
        of BCP 78 and BCP 79.

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2:
        Copyright (c) 2024 IETF Trust and the persons identified as the document
        authors.  All rights reserved.

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3:
        This document is subject to BCP 78 and the IETF Trust's Legal Provisions
        Relating to IETF Documents
        (https://trustee.ietf.org/license-info) in effect on the date of
        publication of this document.  Please review these documents
        carefully, as they describe your rights and restrictions with
        respect to this document.  Code Components extracted from this
        document must include Simplified BSD License text as described in
        Section 4.e of the Trust Legal Provisions and are provided
        without warranty as described in the Simplified BSD License.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** Missing expiration date.  The document expiration date should appear on
     the first and last page.

  ** The document seems to lack a 1id_guidelines paragraph about
     Internet-Drafts being working documents. 

  ** The document seems to lack a 1id_guidelines paragraph about 6 months
     document validity -- however, there's a paragraph with a matching
     beginning. Boilerplate error?

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     current Internet-Drafts. 

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     Shadow Directories. 

  ** The document is more than 15 pages and seems to lack a Table of Contents.

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard

  == The page length should not exceed 58 lines per page, but there was 19
     longer pages, the longest (page 14) being 64 lines

  == It seems as if not all pages are separated by form feeds - found 0 form
     feeds but 21 pages


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack separate sections for Informative/Normative
     References.  All references will be assumed normative when checking for
     downward references.

  ** The document seems to lack a both a reference to RFC 2119 and the
     recommended RFC 2119 boilerplate, even if it appears to use RFC 2119
     keywords. 

     RFC 2119 keyword, line 693: '... are not changed SHOULD not be present...'


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == Line 238 has weird spacing: '...re only  modif...'

  == Line 241 has weird spacing: '... should  sign...'

  == Line 259 has weird spacing: '...ed when  excha...'

  == Line 800 has weird spacing: '...ing the  two...'

  == Line 801 has weird spacing: '...lists  of  val...'

  == (26 more instances...)

  == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD',
     or 'RECOMMENDED' is not an accepted usage according to RFC 2119.  Please
     use uppercase 'NOT' together with RFC 2119 keywords (if that is what you
     mean).
     
     Found 'SHOULD not' in this paragraph:
     
     Note that in the above record, the attributes dn, cn and sn are
     modified from the original record.  The attributes that do not change
     from the original are objectclass, uid, telephonenumber and description.
     Any attributes that are not changed SHOULD not be present in UPDATE
     block.  Notice the title attribute has been removed from Barbara
     Jensen-Smith's entry.

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- Couldn't find a document date in the document -- date freshness check
     skipped.

  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '<CODE BEGINS>' and
     '<CODE ENDS>' lines.


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Possible downref: Non-RFC (?) normative reference: ref. '1'

  ** Downref: Normative reference to an Historic RFC: RFC 1913 (ref. '2')

  -- Possible downref: Non-RFC (?) normative reference: ref. '3'

  -- Possible downref: Non-RFC (?) normative reference: ref. '4'

  -- Possible downref: Non-RFC (?) normative reference: ref. '5'

  -- Possible downref: Non-RFC (?) normative reference: ref. '6'

  -- Possible downref: Non-RFC (?) normative reference: ref. '7'

  -- Possible downref: Non-RFC (?) normative reference: ref. '8'

  -- Possible downref: Non-RFC (?) normative reference: ref. '10'

  -- Possible downref: Non-RFC (?) normative reference: ref. '11'


     Summary: 11 errors (**), 0 flaws (~~), 10 warnings (==), 12 comments
     (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Network Working Group                                     Roland Hedberg
3	Internet Draft                                          Bruce Greenblatt
4	<draft-ietf-find-cip-tagged-06.txt>                           Ryan Moats
5	Expires in six months                                          Mark Wahl

7	     A Tagged Index Object for use in the Common Indexing Protocol

9	Status of this Memo

11	     This document is an Internet-Draft.  Internet-Drafts are working
12	documents of the Internet Engineering Task Force (IETF), its areas, and
13	its working groups. Note that other groups may also distribute working
14	documents as Internet-Drafts.

16	     Internet-Drafts are draft documents valid for a maximum of six
17	months.  Internet-Drafts may be updated, replaced, or made obsolete by
18	other documents at any time.  It is not appropriate to use  Internet-
19	Drafts as reference material or to cite them other than as a "working
20	draft" or "work in progress".

22	To view the entire list of current Internet-Drafts, please check
23	the "1id-abstracts.txt" listing contained in the Internet-Drafts
24	Shadow Directories on ftp.is.co.za (Africa), ftp.nordu.net
25	(Northern Europe), ftp.nis.garr.it (Southern Europe), munnari.oz.au
26	(Pacific Rim), ftp.ietf.org (US East Coast), or ftp.isi.edu
27	(US West Coast).

29	     Distribution of this document is unlimited.

31	     Abstract

33	     This document defines a mechanism by which information servers can
34	exchange indices of information from their databases by making use of
35	the Common Indexing Protocol (CIP).  This document defines the structure
36	of the index information being exchanged, as well as the appropriate
37	meanings for the headers that are defined in the Common Indexing Proto-
38	col.  It is assumed that the structures defined here can be used by
39	X.500 DSAs, LDAP servers, Whois++ servers, CCSO servers and many others.

41	1. Introduction

43	     The Common Indexing Protocol (CIP) as defined in [1] proposes a
44	mechanism for distributing searches across several instances of a single
45	type of search engine with a view to creating a global directory.  CIP
46	provides a scalable, flexible scheme to tie individual databases into
47	distributed data warehouses that can scale gracefully with the growth of
48	the Internet.  CIP provides a mechanism for meeting these goals that is
49	independent of the access method that is used to access the actual data
50	that underlies the indices.  Separate from CIP is the definition of the
51	Index Object that is used to contain the information that is exchanged
52	among Index Servers.  One such Index Object that has already been
53	defined is the Centroid that is derived from the Whois++ protocol [2].

55	     The Centroid does not meet all of the requirements for the exchange
56	of index information amongst information servers.  For example, it does
57	not support the notion of incremental updates natively.  For information
58	servers that contain millions of records in their database, constant
59	exchange of complete dredges of the database is bandwidth intensive.
60	The Tagged Index Object is specifically designed to support the exchange
61	of index update information.  This design comes at the cost of an
62	increase in the size of the index object being exchanged.  The Centroid
63	is also not tailored to always be able to give boolean answers to
64	queries.  In the Centroid Model, "an index server will take a query in
65	standard Whois++ format, search its collections of centroids and other
66	forward information, determine which servers hold records which may fill
67	that query, and then notifies the user's client of the next servers to
68	contact to submit the query." [2] Thus, the exchange of Centroids
69	amongst index servers allows hints to be given as to which information
70	server actually contains the information.  The Tagged Index Object
71	labels the various pieces of information with identifiers that tie the
72	individual object attributes back to an object as a whole.  This "tag-
73	ging" of information allows an index server to be more capable of
74	directing a specific query to the appropriate information server.
75	Again, this feature is added to the Tagged Index Object at the expense
76	of an increase in the size of the index object.

78	2. Background

80	     The Lightweight Directory Access Protocol (LDAP) is defined in [3],
81	and it defines a mechanism for accessing a collection of information
82	arranged hierarchically in such a manner as to provide a globally
83	distributed database which is normally called the Directory Information
84	Tree (DIT).  Some distinguishing characteristics of LDAP servers are
85	that it is normally the case that several servers cooperate to manage a
86	common subtree of the DIT.  LDAP servers are expected to respond to
87	requests that pertain to portions of the DIT for which they have data,
88	as well as for those portions for which they have no information in
89	their database. For example, the LDAP server for a portion of the DIT in
90	the United States (c=US) must be able to provide a response to a Search
91	operation that pertains to a portion of the DIT in Sweden (c=se).  Nor-
92	mally, the response given will be a referral to another LDAP server that
93	is expected to be more knowledgeable about the appropriate subtree.
94	However, there is no mechanism that currently enables these LDAP servers
95	to refer the LDAP client to the supposedly more knowledgeable server.
96	Typically, an LDAP (v3) server is configured with the name of exactly
97	one other LDAP server to which all LDAP clients are referred when their
98	requests fall outside the subtree of the DIT for which that LDAP server
99	has knowledge.  This specification defines a mechanism whereby LDAP
100	servers can exchange index information that will allow referrals to point
101	towards a clearly accurate destination.

103	     The X.500 series of recommendations defines the Directory
104	Information Shadowing Protocol (DISP) [4], which allows X.500 DSAs to
105	exchange actual information in the DIT.  Shadowing allows
106	information from various portions of the DIT to be replicated amongst
107	participating DSAs.  DISP is optimized for the exchange of
108	entire portions of the DIT, whereas CIP and the
109	Tagged Index Object are optimized for the exchange of structural index
110	information about the DIT and for improving the performance of tree
111	navigation amongst various information servers.  The Tagged Index Object is
112	more appropriate for the exchange of index information than is DISP.
113	DISP is more targeted at DIT distribution and fault tolerance.  DISP is
114	thus more appropriate for the exchange of the actual data in order to
115	spread the load amongst several information servers.  DISP is tailored
116	specifically to X.500 (and other hierarchical directory systems), while
117	the Tagged Index Object and CIP can be used in a wide variety of infor-
118	mation server environments.

120	     While DISP allows an individual directory server to collect infor-
121	mation about large parts of the DIT, it would require a huge database to
122	collect all of the replicas for a meaningful portion of the DIT.  Fur-
123	thermore, as X.525 states: "Before shadowing can occur, an agreement,
124	covering the conditions under which shadowing may occur is required.
125	Although such agreements may be established in a variety of ways, such
126	as policy statements covering all DSAs within a given DMD ...", where a
127	DMD is a Directory Management Domain.  This is due to the fact that the
128	actual data in the DIT is being exchanged amongst DSAs rather than only
129	the information required to maintain an Index.  In many environments
130	such an agreement is not appropriate, and in order to collect informa-
131	tion for a meaningful portion of the DIT, a large number of agreements
132	may need to be arranged.

134	3. Object

136	     What is desired is to have an information server (or network of
137	information servers) that can quickly respond to real world requests,
138	like:

140	-    What is Tim Howes' email address?  This is much harder than "What
141	     is the email address of Tim Howes at Netscape?"

143	-    What is the X.509 certificate for Fred Smith at compuserve.com?
144	     One certainly doesn't want to search CompuServe's entire directory
145	     tree to find out this one piece of information.  I also don't want
146	     to have to shadow the entire CompuServe directory subtree onto my
147	     server.  If this request is being made because Fred is trying to
148	     log into my server, I'd certainly want to be able to respond to the
149	     BIND in real time.

151	-    Who are all of the people at Novell that have a title of program-
152	     mer?

154	     All of these requests can reasonably be translated into LDAP or
155	Whois++, and other directory access protocol queries.  They can also be
156	serviced in a straightforward manner by the user's home information
157	server if it has the appropriate reference information into the database
158	that contains the source data.  In this situation, the first server
159	would be able to "chain" the request on behalf of the user.  Alterna-
160	tively, a precise referral could be returned.  If the home information
161	server wants to service (i.e., chain) the request based on the index
162	information that it has on hand, this servicing could be done by any
163	number of means:

165	-    issuing LDAP operations to the remote directory server

167	-    issuing DSP operations to the remote directory server

169	-    issuing DAP operations to the remote directory server
170	-    issuing Whois++ operations to the remote Whois++ server

172	-     ...

174	4. The Tagged Index Object

176	     This section defines a Tagged Index Object that can be exchanged by
177	Information Servers using CIP.  While in many cases it is acceptable for
178	Information Servers to make use of the Centroid construct (as defined in
179	[2]) to exchange index information, the goals in defining a new con-
180	struct are multi-pronged:

182	-    When the Information Server receives a search request that warrants
183	     that a referral be returned, allow the server to return a referral
184	     that will point the client to a server that is most likely able to
185	     answer the request correctly.  False positive referrals (the search
186	     turns up hits in the index object that generate referrals to
187	     servers that don't hold the desired information) can be reduced,
188	     depending on the choice of attribute tokenization types that are
189	     used.

191	-    When the Information Server receives a search request that is not
192	     operating against local data, allow the Information Server itself
193	     to "chain" the request to the appropriate remote Information
194	     Server.  Note that LDAP itself does not define how Chaining works,
195	     but X.500 does.  This seems very similar to the first "prong".

197	-    Finally, when a collection of Information Servers are operating
198	     against a large distributed directory, allow them to distribute
199	     index information amongst themselves (a la CIP) so that their own
200	     searches can be carried out with some degree of efficiency.

202	4.1. The Agreement

204	     Before a Tagged Index Object can be exchanged, the organization
205	which administers the object supplier and the organization which admin-
206	isters the object consumer must reach an agreement on how the servers
207	will communicate. This agreement contains the following:

209	-    "version": The version of the agreement and the index type.  This
210	     specification describes the index type "x-tagged-index-1".

212	-    "dsi": An OID which uniquely identifies the subtree and scope.
213	     This field is not explicitly necessary, as it may not provide
214	     information beyond that which is contained in the "base-uri" below.

216	-    "base-uri": One or more URI's which will form the base of any
217	     referrals created based upon the index object that is governed by
218	     this agreement.  For example, in the LDAP URL format [8] the base-
219	     uri would specify (among other items): the LDAP host,  the base
220	     object to which this index object refers (e.g. c=SE), and the scope
221	     of the index object (e.g. single container).

223	-    "supplier": The hostname and listening port number of the supplier
224	     server, as well as any alternative servers holding the same naming
225	     contexts, in case the supplier is unavailable.

227	-    "consumeraddr": This is a URI of the "mailto:" form, with the RFC
228	     822 email address of the consumer server.  Subsequent versions of
229	     this draft allow other forms of URI, so that the consumer may
230	     retrieve the update via the WWW, FTP or CIP.

232	-    "updateinterval": The maximum duration in seconds between
233	     occurrences of the supplier server generating an update.  If the
234	     consumer server has not received an update from the supplier server
235	     after waiting this long since the previous update, it is likely
236	     that the index information is now out of date.  A typical value for
237	     a server with frequent updates would be 604800 seconds, or every
238	     week.  Servers whose DITs are only modified annually could have a
239	     much longer update interval.

241	-    "securityoption": Whether and how the supplier server should sign
242	     and encrypt the update before sending it to the consumer server.
243	     Options for this version of the specification are:

245	          "none": the update is sent in plaintext

247	          "PGP/MIME": the update is digitally signed and encrypted using
248	          PGP [9]

250	          "S/MIME": the update is digitally signed and encrypted using
251	          S/MIME [10]

253	          "SSLv3": the update is digitally signed and encrypted using an
254	          SSLv3 connection [11]

256	          "Fortezza": the update is digitally signed and encrypted using
257	          Fortezza [5]

259	     It is recommended that the "PGP/MIME" option be used when
260	exchanging sensitive information across public networks, and both the supplier
261	and consumer have PGP keys. The "Fortezza" option is intended for use in
262	environments where security protocols are based on Fortezza-compatible
263	devices. The "S/MIME" option can be used when both the supplier and
264	consumer have RSA keys and can make use of the PKCS protocols defined in
265	the S/MIME specification. The "SSLv3" option can be used when both the
266	supplier and consumer have access to SSL services, have server certifi-
267	cates, and can mutually authenticate each other.  Should these be IANA
268	registered things???

270	-    Security Credentials: The long-term cryptographic credentials used
271	     for key exchange and authentication of the consumer and supplier
272	     servers, if a security option was selected.  For "PGP/MIME", this
273	     will be the trusted public keys of both servers.  For "Fortezza",
274	     this will be the certificate paths of both servers to a common
275	     point of trust. For "S/MIME" and "SSLv3" these will be the certifi-
276	     cates of the supplier and consumer.

278	     Note that if the index server maintains the information that would
279	appear in the agreement in a directory according to the definitions in
280	[7], then no real formal agreement between the two parties needs to be
281	put in place, and the information that is required for communication
282	between the two index servers is derived automatically from the direc-
283	tory.
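
     The agreement fields listed in this section can be pictured as a
simple record.  The following Python sketch is illustrative only and is
not part of this specification: the class name, the field types, the
is_stale helper and the example host names are all assumptions made for
the illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Agreement:
    """Illustrative container for the agreement fields of section 4.1."""
    version: str                      # e.g. "x-tagged-index-1"
    base_uri: List[str]               # one or more referral base URIs
    supplier: List[Tuple[str, int]]   # (hostname, port); alternates follow
    consumeraddr: str                 # "mailto:" URI of the consumer server
    updateinterval: int               # maximum seconds between updates
    securityoption: str = "none"      # "none", "PGP/MIME", "S/MIME", ...
    dsi: Optional[str] = None         # OID naming the subtree and scope

    def is_stale(self, last_update: int, now: int) -> bool:
        """True if no update arrived within updateinterval seconds."""
        return now - last_update > self.updateinterval


# Hypothetical example values, not taken from any real deployment.
agreement = Agreement(
    version="x-tagged-index-1",
    base_uri=["ldap://sup.example/dc=sup,dc=example"],
    supplier=[("sup.example", 389)],
    consumeraddr="mailto:consumer@consumer.example",
    updateinterval=604800,            # one week, as in the text above
)
```

     A consumer could call is_stale with the time of the last received
update to decide whether its copy of the index is likely out of date.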

285	4.2. Content Type

287	     The update consists of a MIME object of type application/cip-index-
288	object.  The parameters are:

290	     "type": this has value "application/index.obj.tagged".

292	     "dsi": the DSI (if any) from the agreement.

294	     "base-uri": A set of URIs, separated by spaces. In each URI, the
295	     hostname/portno must be distinct, and based on the "supplier" part
296	     of the agreement.

298	     The payload is mostly textual data but may include bytes with the
299	high bit set.  The originating information server should set the con-
300	tent-transfer-encoding as appropriate for the information included in
301	the payload.

303	     This object may be encapsulated in a wrapper content (such as
304	multipart/signed) or be encrypted as part of the security procedures.  The
305	resulting content can then be distributed, for example via electronic mail.
306	For example,
307	From: supplier@sup.com
	Date: Thu, 16 Jan 1997 13:50:37 -0500
308	Message-Id: <199701161850.NAA29295@sup.com>
309	To: consumer@consumer.com       <<-- from consumer server address
310	Reply-to: supplier-admin@sup.com
311	MIME-Version: 1.0
312	Content-Type: application/index.obj.tagged;
313	dsi=1.3.6.1.4.1.1466.85.85.1.2.3.4.5.6.7.8.9.10.11.12.13.14.15.16;
314	base-uri="ldap://sup.com/dc=sup,dc=com ldap://alt.com/dc=sup,dc=com"

316	     The payload is a series of CRLF-terminated lines.  The payload only
317	includes characters from a subset of the printable US-ASCII subset of
318	UTF-8.  Attribute values that occur outside of this subset are encoded
319	as defined below.  As more experience is gained with index objects and
320	UTF-8 data, a future version of this specification may allow for the
321	native transfer of UTF-8 data without requiring this special encoding.
322	No other character sets are permitted by this version of the specifica-
323	tion.  Some supplier servers may only be able to generate the printable
324	US-ASCII subset, but all consumer servers must be able to handle the
325	full range of Unicode characters when decoding the attribute values (in
326	the "attr-value" field in the BNF below).

328	4.3.  Tagged Index BNF

330	     The Tagged Index object has the following grammar, expressed in
331	modified BNF format:

333	index-object = 0*(io-part SEP) io-part
334	io-part      = header SEP schema-spec SEP index-info
335	header       = version-spec SEP update-type SEP this-update SEP
336	                last-update SEP context-size
337	version-spec = "version:" *SPACE "x-tagged-index-1"
338	update-type  = "updatetype:" *SPACE ( "total" | "incremental")
339	this-update  = "thisupdate:" *SPACE TIMESTAMP
340	last-update  = [ "lastupdate:" *SPACE TIMESTAMP ]
341	context-size = [ "contextsize:" *SPACE 1*DIGIT ]
342	schema-spec  = "BEGIN IO-Schema" SEP 1*(schema-line SEP)
343	               "END IO-Schema"
344	schema-line  = attribute-name ":" token-type
345	token-type   = "FULL" | "TOKEN" | "RFC822" | "UUCP" | "DNS"
346	index-info   = full-index | incremental-index
347	full-index   = "BEGIN Index-Info" SEP 1*(index-block SEP)
348	               "END Index-Info"
349	incremental-index = 1*(add-block | delete-block | update-block)
350	add-block    = "BEGIN Add Block" SEP 1*(index-block SEP)
351	               "END Add Block"
352	delete-block = "BEGIN Delete Block" SEP 1*(index-block SEP)
353	               "END Delete Block"
354	update-block = "BEGIN Update Block" SEP 1*(index-block SEP)
355	               "END Update Block"
356	index-block  = first-line 0*(SEP cont-line)
357	first-line   = attr-name ":" *SPACE taglist "/" attr-value
358	cont-line    = "-" taglist "/" attr-value
359	taglist      = tag 0*("," tag) | "*"
360	tag          = 1*DIGIT ["-" 1*DIGIT]
361	attr-value   = 0*(UTF8)
362	attr-name    = 1*(NAMECHAR)
363	UTF8         = ASCII | "%" HEX HEX
364	TIMESTAMP    = 1*DIGIT
365	ASCII        = DIGIT | UPPER | LOWER | OTHER
366	NAMECHAR     = DIGIT | UPPER | LOWER | "-" | ";" | "."
367	SPACE        = <ASCII space, hex 20>
368	SEP          = (CR LF) | LF
369	CR           = <ASCII CR, carriage return, hex 0D>
370	LF           = <ASCII LF, line feed, hex 0A>
371	HEX          = "a" | "b" | "c" | "d" | "e" | "f" | DIGIT
372	DIGIT        = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" |
373	               "8" | "9"
374	UPPER        = "A" | "B" | "C" | "D" | "E" | "F" | "G" | "H" |
375	               "I" | "J" | "K" | "L" | "M" | "N" | "O" | "P" |
376	               "Q" | "R" | "S" | "T" | "U" | "V" | "W" | "X" |
377	               "Y" | "Z"
378	LOWER        = "a" | "b" | "c" | "d" | "e" | "f" | "g" | "h" |
379	               "i" | "j" | "k" | "l" | "m" | "n" | "o" | "p" |
380	               "q" | "r" | "s" | "t" | "u" | "v" | "w" | "x" |
381	               "y" | "z"
382	OTHER        = "(" | ")" | "+" | "," | "-" | "." | "/" | ":" |
383	               "=" | "?" | "@" | ";" | "$" | "_" | "!" | "~" |
384	               "*" | "'" | "\" | """ | "#" | "&" | "<" | ">" |
385	               "[" | "]" | "^" | "`" | "{" | "|" | "}"
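
     To make the index-block productions concrete, the following Python
sketch splits a first-line or cont-line into its parts.  It is a rough
illustration rather than a validating parser for the full grammar.  The
function name and the sample lines in the usage note are hypothetical,
and the sketch assumes that attribute names never begin with "-", so
that a leading "-" always marks a cont-line.

```python
from typing import Optional, Tuple


def parse_index_line(line: str) -> Tuple[Optional[str], str, str]:
    """Split an index line into (attr-name, taglist, attr-value).

    attr-name is None for a cont-line, which starts with "-" and
    continues the attribute named by the preceding first-line.
    """
    if line.startswith("-"):
        name = None
        rest = line[1:]
    else:
        # attr-name cannot contain ":", so the first ":" ends it.
        name, rest = line.split(":", 1)
        rest = rest.lstrip(" ")
    # A taglist cannot contain "/", so the first "/" ends it; the
    # attr-value may itself contain further "/" characters.
    taglist, value = rest.split("/", 1)
    return name, taglist, value
```

     For instance, parse_index_line("cn: 1/Barbara") yields the name
"cn", the taglist "1" and the value "Barbara".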

387	     Characters that are allowed to appear unescaped in attr-values are
388	the printable subset of (low) ASCII minus the "%" characters, i.e. hex
389	21 through hex 7e inclusive with the exception of hex 25 (which is the
390	"%" character).  Any other character that appears in an attr-value
391	must be escaped: each byte of its UTF-8 encoding is written as the "%"
392	character followed by two hex digits.  For example, the UCS-2 sequence
393	"A<NOT IDENTICAL TO><ALPHA>." (0041, 2262, 0391, 002E) may be encoded in
394	UTF-8 as follows:
395	   41 E2 89 A2 CE 91 2E

397	     If this character sequence appears in an attribute that is in a
398	Tagged Index Object attr-value, then it is encoded as:
399	   41 25 65 32 25 38 39 25 61 32 25 63 65 25 39 31 2E

401	     When viewed as a character string the encoding appears as:
402	   "A%e2%89%a2%ce%91."
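
     The escaping rule can be sketched in Python as follows.  The
function names are hypothetical, and the decoder assumes well-formed
input (every "%" is followed by two hex digits).

```python
def encode_attr_value(value: str) -> str:
    """Escape a string for use as a Tagged Index Object attr-value."""
    out = []
    for byte in value.encode("utf-8"):
        # Printable ASCII 0x21-0x7e passes through, except "%" (0x25).
        if 0x21 <= byte <= 0x7E and byte != 0x25:
            out.append(chr(byte))
        else:
            out.append("%{:02x}".format(byte))
    return "".join(out)


def decode_attr_value(encoded: str) -> str:
    """Reverse encode_attr_value; assumes well-formed input."""
    raw = bytearray()
    i = 0
    while i < len(encoded):
        if encoded[i] == "%":
            # The two characters after "%" are the hex value of one byte.
            raw.append(int(encoded[i + 1:i + 3], 16))
            i += 3
        else:
            raw.append(ord(encoded[i]))
            i += 1
    return raw.decode("utf-8")
```

     Note that under this rule the space character (hex 20) is also
escaped, since it lies outside the range hex 21 through hex 7e.
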
403	     The set of characters allowed to appear in the attr-name field is
404	limited to the set of characters used in LDAP and WHOIS++ attribute
405	names.  For other services that have attribute name character sets that
406	are larger than these, it is suggested that those services create a pro-
407	file that maps the names onto object identifiers, and the sequence of
408	digits and periods is used by those services in creating the attr-name
409	fields for their Tagged Index Objects.

411	     Note that the attribute value may only be empty in the case of an
412	incremental update that contains an "Update Block" in which the index
413	object indicates that certain attributes of objects are being removed.
414	This specification only supports the replacement of entire attributes,
415	so that in the case of a multi-valued attribute, all of the values must
416	be specified in the Update Block, not just the newly added values.  The
417	intention of the Tagged Index Object is to supply a snapshot of the cur-
418	rent index of the directory.

420	4.3.1.  Header Descriptions

422	     The header section consists of one or more "header lines".  The
423	following header lines are defined:

425	     "version": This line must always be present, and have the value "x-
426	     tagged-index-1" for this version of the specification.

428	     "updatetype": This line must always be present.  It takes as the
429	     value either "total" or "incremental".  The first update sent by a
431	     supplier server to a consumer server for a DSI must be a "total"
432	     update (why?).

434	     "thisupdate": This line must always be present. The value is the
435	     number of seconds from 00:00:00 UTC January 1, 1970 at which the
436	     supplier constructed this update.

438	     "lastupdate": This line must be present if the "updatetype" line
439	     has the value "incremental".  The value is the number of seconds
441	     from 00:00:00 UTC January 1, 1970 at which the supplier constructed
442	     the previous update sent to the consumer.  This field allows the
443	     consumer to determine if a previous update was missed.

446	     "contextsize": This line may be present at the supplier's option.
447	     The value is a number, which is the approximate total number of
448	     entries in the subtree.  This information is provided for statisti-
449	     cal purposes only.
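
     The presence rules for the header lines can be checked with a few
lines of Python.  This is a minimal sketch under stated assumptions: the
function name is hypothetical, the input is assumed to be the header
lines already split on SEP, and no check is made of the TIMESTAMP or
version values themselves.

```python
def parse_header(lines):
    """Parse "name: value" header lines into a dict and check presence."""
    header = {}
    for line in lines:
        name, _, value = line.partition(":")
        header[name] = value.strip(" ")
    # "version", "updatetype" and "thisupdate" must always be present.
    for required in ("version", "updatetype", "thisupdate"):
        if required not in header:
            raise ValueError("missing header line: " + required)
    if header["updatetype"] not in ("total", "incremental"):
        raise ValueError("bad updatetype: " + header["updatetype"])
    # "lastupdate" is required only for incremental updates.
    if header["updatetype"] == "incremental" and "lastupdate" not in header:
        raise ValueError("incremental update requires lastupdate")
    return header
```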

451	4.3.2.  Tokenization Types

453	     The Tagged Index Object inherits the "TOKEN" scheme for tokeniza-
454	tion as specified in [2].  In addition, there are several other tok-
455	enization schemes defined for the Tagged Index Object.  The following
456	table presents these schemes and what character(s) are used to delimit
457	tokens.

459	        Token Type      Tokenization Characters
460	        FULL            none
461	        TOKEN           white space, "@"
462	        RFC822          white space, ".", "@"
463	        UUCP            white space, "!"
464	        DNS             any character not a number, letter, or "-"
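
     The table can be read as a set of delimiter patterns.  The Python
sketch below follows the table only; it does not attempt to reproduce
any further detail of the "TOKEN" scheme in [2].  The function name is
hypothetical, "letter" is assumed to mean an ASCII letter, and runs of
adjacent delimiters are assumed to produce no empty tokens.

```python
import re

# Delimiter patterns per token type, following the table above.
_DELIMITERS = {
    "TOKEN": r"[\s@]+",
    "RFC822": r"[\s.@]+",
    "UUCP": r"[\s!]+",
    "DNS": r"[^0-9A-Za-z-]+",
}


def tokenize(value, token_type):
    """Split an attribute value into tokens for the given token type."""
    if token_type == "FULL":
        return [value]               # the whole value is a single token
    parts = re.split(_DELIMITERS[token_type], value)
    return [p for p in parts if p]   # drop empty strings at the edges
```

     With these patterns, a hypothetical value "bjensen@example.com"
gives two tokens under TOKEN and three under RFC822.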

466	4.3.3.  Tag Conventions

468	     In the tag list, multiple consecutive tags may be shortened by
469	using "#-#".  For example, the list "3,4,5,6,7,8,9,10" may be shortened
470	to "3-10".  Tags are to be applied to the data on a per entry level.
471	Thus, if two index lines in the same index object contain the same tag,
472	then it is always the case that those two lines refer back to the same
473	"record" in the directory.  In LDAP terminology, the two lines would
474	refer back to the same directory object.  Additionally if two index
475	lines in the same index object contain different tags, then it is always
476	the case that those two lines refer back to different records in the
477	directory.
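
     The "#-#" shorthand can be sketched in Python as a pair of helper
functions.  The names are hypothetical, and the "*" form of the taglist
is deliberately not handled in this sketch.

```python
def expand_taglist(taglist):
    """Expand "3-5,9" into [3, 4, 5, 9].  "*" is not handled here."""
    tags = []
    for part in taglist.split(","):
        if "-" in part:
            low, high = part.split("-")
            tags.extend(range(int(low), int(high) + 1))
        else:
            tags.append(int(part))
    return tags


def compress_tags(tags):
    """Collapse runs of consecutive tags into "low-high" ranges."""
    tags = sorted(set(tags))
    parts = []
    i = 0
    while i < len(tags):
        j = i
        # Advance j to the end of the current run of consecutive tags.
        while j + 1 < len(tags) and tags[j + 1] == tags[j] + 1:
            j += 1
        if i == j:
            parts.append(str(tags[i]))
        else:
            parts.append("{}-{}".format(tags[i], tags[j]))
        i = j + 1
    return ",".join(parts)
```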

479	     The tags in the index object are meaningful only in the context of
480	that transmission.  The tag applied to the same underlying record in two
481	separate transmissions of a full-update may be different.  Thus, receiv-
482	ing index servers should make no assumptions about the values of the
483	tags across index object boundaries.  If the receiving index server is
484	implemented in such a way that it maintains a structure similar to the
485	one that exists in the tagged index object with numbered tags attached
486	to various records, then these "internal" tags are distinct from the
487	tags that appear in the index object as created by the transmitting
488	index server.

490	4.4. Incremental Indexing

492	     The tagged index object format supports the ability of information
493	servers to distribute only delta index data, rather than distributing
494	total index information each time.  This scenario, known as incremental
495	indexing, supports three basic types of operations: add, delete and
496	replace.  If the incremental updatetype is specified in the tagged index
497	object, then the index object contains a snapshot of only the changes
498	that have been made since the index object specified in the lastupdate
499	header was distributed.  If the receiving index server did not receive
500	that index object, it should request a total index object.  If the CIP
501	protocol supports it, the index server may request the specific index
502	object that it missed.
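
     The missed-update check described above amounts to comparing the
"lastupdate" header with the "thisupdate" value of the last index
object the receiver processed.  The following Python sketch shows one
way a receiver might make that decision; the function name and the
returned labels are hypothetical, and the headers are assumed to have
been parsed into a dictionary already.

```python
def next_action(header, last_seen_thisupdate):
    """Decide how a receiver should treat an incoming tagged index object.

    header               -- dict of header names to values
    last_seen_thisupdate -- "thisupdate" of the last object processed from
                            this supplier, or None if there was none
    """
    if header.get("updatetype") != "incremental":
        # A total update does not depend on any earlier index object.
        return "replace"
    if last_seen_thisupdate is None:
        return "request-total"
    if int(header["lastupdate"]) != int(last_seen_thisupdate):
        # The update this one builds on was never received.
        return "request-total"
    return "apply"
```

     A receiver that last processed an object with "thisupdate:
855912345" would apply an incremental object carrying "lastupdate:
855912345" and would request a total index object for any other
"lastupdate" value.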

504	     If the tagged index object contains an Add Block, then the lines in
505	the Add Block refer to new records that were added to the information
506	base of the transmitting index server.  It can be guaranteed that those
507	records did not exist in any previously received tagged index object,
508	and the receiving index server can insert this index information in the
509	index that it already maintains for the transmitting index server.  If
510	the receiving index server is maintaining internal tags, then a new
511	internal tag should be created for each tag in the Add Block.

513	     If the tagged index object contains a Delete Block, then the Delete
514	Block contains lines each of which refers to the "key" field (in the
515	attr-name area of the index line) from a record in the information
516	server that has been deleted since the last update (specified in the
517	lastupdate header field).  This key field is assumed to be the unique
518	identifier on the transmitting information server for the record that
519	has been deleted.  In the case of LDAP servers, this field would have an
520	attr-name of "dn".  Other forms of information servers would use the
521	appropriate unique identifier.  Thus, the unique identifier must have
522	previously been sent by the transmitting index server.  If the receiving
523	index server has never received information for the record referred to by
524	a line in the Delete Block, then it should be ignored, with the proviso
525	that the receiving index server has more than likely "lost" some infor-
526	mation previously distributed by the transmitting index server.  If the
527	receiving index server is maintaining internal tags, then after process-
528	ing the Delete Block, the internal tag numbers may be reordered so as to
529	not have "holes" in the sequence.

531	     If the tagged index object contains an Update Block, then the lines
532	in the Update Block refer to records that were changed in the informa-
533	tion base of the transmitting index server.  As was mentioned in clause
534	4.3, if any portion of an attribute in the information server has been
535	changed, then the entire attribute must be specified, and all index
536	information from all values of a multi-valued attribute must be speci-
537	fied.  If the attribute was removed from the record in the information
538	server, the attribute value specified in the attr-value field should be
539	empty.  Attributes which have not been changed in the record are not
540	specified.  The Update Block also supports the idea of indexing new
541	attributes which were not previously included in the tagged index
542	object.  For example, if the transmitting index server began including
543	index information on postal addresses, then it could include an Update
544	Block in the index object that included all of the index information on
545	postal addresses for all records in its information base, and indicate
546	that nothing else has changed.  If the receiving index server is main-
547	taining internal tags, then after processing the Update Block, the
548	internal tag numbers should remain the same.
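
     The handling of the three block types can be summarized with a
short Python sketch.  It is a simplification under stated assumptions
and not a definitive implementation: the blocks are assumed to have
been parsed into dictionaries keyed by the record's unique identifier
(for example a DN), the function and parameter names are hypothetical,
and an Update Block entry for an unknown record is assumed to be
treated like an unknown Delete Block entry, which the text above does
not specify.

```python
def apply_incremental(index, added, deleted_keys, updated):
    """Apply pre-parsed Add, Delete and Update Blocks to a local index.

    index        -- dict: record key -> {attr-name: [tokens]}
    added        -- dict of new records in the same shape as index
    deleted_keys -- iterable of record keys taken from the Delete Block
    updated      -- dict: record key -> {attr-name: [tokens]}; an empty
                    token list means the attribute was removed
    Returns the keys that were referenced but unknown locally.
    """
    unknown = []
    for key, attrs in added.items():
        index[key] = {name: list(tokens) for name, tokens in attrs.items()}
    for key in deleted_keys:
        if index.pop(key, None) is None:
            # Ignored, but it suggests an earlier update was lost.
            unknown.append(key)
    for key, attrs in updated.items():
        record = index.get(key)
        if record is None:
            unknown.append(key)  # assumption: handled like a delete miss
            continue
        for name, tokens in attrs.items():
            if tokens:
                record[name] = list(tokens)  # whole attribute is replaced
            else:
                record.pop(name, None)       # empty value: attribute removed
    return unknown
```

     Attributes that do not appear in an Update Block entry are left
untouched, which matches the rule that unchanged attributes are not
specified.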

550	5. Example

552	     As an example, the following LDIF [6] entries and the resulting
553	Tagged Index Object are presented.

555	           dn: cn=Barbara Jensen, ou=Product Development, o=Ace
556	             Industry, c=US
557	           objectclass: top
558	           objectclass: person
559	           objectclass: organizationalPerson
560	           cn: Barbara Jensen
561	           cn: Barbara J Jensen
562	           cn: Babs Jensen
563	           sn: Jensen
564	           uid: bjensen
565	           telephonenumber: +1 408 555 1212
566	           description: A big sailing fan.
567	           dn: cn=Bjorn Jensen, ou=Accounting, o=Ace Industry, c=US
568	           objectclass: top
569	           objectclass: person
570	           objectclass: organizationalPerson
571	           cn: Bjorn Jensen
572	           sn: Jensen
573	           telephonenumber: +1 408 555 1212
574	           dn: cn=Gern Jensen, ou=Product Testing, o=Ace Industry, c=US
575	           objectclass: top
576	           objectclass: person
577	           objectclass: organizationalPerson
578	           cn: Gern Jensen
579	           cn: Gern O Jensen
580	           sn: Jensen
581	           uid: gernj
582	           telephonenumber: +1 408 555 1212
583	           dn: cn=Horatio Jensen, ou=Product Testing, o=Ace Industry,
584	             c=US
585	           objectclass: top
586	           objectclass: person
587	           objectclass: organizationalPerson
588	           cn: Horatio Jensen
589	           cn: Horatio N Jensen
590	           sn: Jensen
591	           uid: hjensen
592	           telephonenumber: +1 408 555 1212

594	     The Tagged Index Object for this example would be:

596	                      version: x-tagged-index-1
597	                      updatetype: total
598	                      thisupdate: 855938804
599	                      BEGIN IO-Schema
600	                      dn: FULL
601	                      ou: TOKEN
602	                      o: TOKEN
603	                      c: TOKEN
604	                      objectclass: FULL
605	                      cn: TOKEN
606	                      sn: FULL
607	                      uid: FULL
608	                      title: TOKEN
609	                      END IO-Schema
610	                      BEGIN Index-Info
611	                      dn: 1/cn=Barbara Jensen,ou=Product
612	Development,o=Ace Industry,c=US
613	                      -2/cn=Bjorn Jensen,ou=Accounting,o=Ace
614	Industry,c=US
615	                      -3/cn=Gern Jensen,ou=Product Testing,o=Ace
616	Industry,c=US
617	                      -4/cn=Horatio Jensen,ou=Product Testing,o=Ace
618	Industry,c=US
619	                      ou: 1,3-4/Product
620	                      -1/Development
621	                      -2/Accounting
622	                      -3-4/Testing
623	                      o: */Ace
624	                      -*/Industry
625	                      c: */US
626	                      objectclass: */top
627	                      -*/person
628	                      -*/organizationalPerson
629	                      cn: 1/Barbara
630	                      -1/J
631	                      -1/Babs
632	                      -*/Jensen
633	                      -2/Bjorn
634	                      -3/Gern
635	                      -3/O
636	                      -4/Horatio
637	                      -4/N
638	                      sn: */Jensen
639	                      uid: 1/bjensen
640	                      -3/gernj
641	                      -4/hjensen
642	                      title: 1/product
643	                      -1/manager
644	                      -1/rod
645	                      -1/and
646	                      -1/reel
647	                      -1/division
648	                      END Index-Info
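
     The data lines between "BEGIN Index-Info" and "END Index-Info" can
be read with a small parser.  The following Python sketch reflects one
reading of the example above, in which a line starting with "-"
carries a further value for the attribute named on the most recent
"attr-name:" line.  The function name is hypothetical, and lines that
were wrapped for display are assumed to have been rejoined beforehand.

```python
def parse_data_lines(lines):
    """Return (attr-name, taglist, value) triples from index data lines."""
    triples = []
    current_attr = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("-"):
            # Continuation: same attribute as the preceding named line.
            if current_attr is None:
                raise ValueError("continuation line before any attribute line")
            body = line[1:]
        else:
            current_attr, _, body = line.partition(":")
            body = body.lstrip()
        # The tag list ends at the first "/"; the rest is the value.
        taglist, _, value = body.partition("/")
        triples.append((current_attr, taglist, value))
    return triples
```

     Applied to the "ou" lines of the example, this yields the value
"Product" with tag list "1,3-4", "Development" with "1", "Accounting"
with "2" and "Testing" with "3-4", all under the attribute name "ou".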

650	     As an example of the Incremental Index Object, consider an update
651	that occurs when Barbara Jensen's entry above changes to:

653	           dn: cn=Barbara Jensen-Smith, ou=Product Development, o=Ace
654	             Industry, c=US
655	           objectclass: top
656	           objectclass: person
657	           objectclass: organizationalPerson
658	           cn: Barbara Jensen-Smith
659	           cn: Barbara J Jensen-Smith
660	           cn: Babs Jensen-Smith
661	           sn: Jensen-Smith
662	           uid: bjensen
663	           telephonenumber: +1 408 555 1212
664	           description: A big sailing fan.

666	     The Tagged Index Object for this example would be:

668	                      version: x-tagged-index-1
669	                      updatetype: incremental
670	                      lastupdate: 855938804
671	                      thisupdate: 855940000
672	                      BEGIN IO-Schema
673	                      dn: FULL
674	                      rdn: FULL
675	                      cn: TOKEN
676	                      sn: FULL
677	                      title: FULL
678	                      END IO-Schema
679	                      BEGIN Update Block
680	                      dn: 1/cn=Barbara Jensen,ou=Product
681	Development,o=Ace Industry,c=US
682	                      rdn: 1/rdn=Barbara Jensen-Smith
683	                      cn: 1/Barbara
684	                      cn: 1/Babs
685	                      cn: 1/Jensen-Smith
686	                      sn: 1/Jensen-Smith
687	                      title: 1/
688	                      END Update Block

690	     Note that in the above record, the attributes dn, cn and sn are
691	modified from the original record.  The attributes that do not change
692	from the original are objectclass, uid, telephonenumber and description.
693	Any attributes that are not changed SHOULD NOT be present in the Update
694	Block.  Notice the title attribute has been removed from Barbara Jensen-
695	Smith's entry.

697	     In this next example, consider an LDIF file containing a series of
698	change records and comments.

700	   # Add a new entry
701	   dn: cn=Fiona Jensen, ou=Marketing, o=Ace Industry, c=US
702	   changetype: add
703	   objectclass: top
704	   objectclass: person
705	   objectclass: organizationalPerson
706	   cn: Fiona Jensen
707	   sn: Jensen
708	   uid: fiona
709	   telephonenumber: +1 408 555 1212
710	   jpegphoto:< /usr/local/directory/photos/fiona.jpg
711	   # Delete an existing entry
712	   dn: cn=Robert Jensen, ou=Marketing, o=Ace Industry, c=US
713	   changetype: delete
714	   # Modify an entry's relative distinguished name
715	   dn: cn=Paul Jensen, ou=Product Development, o=Ace Industry, c=US
716	   changetype: modrdn
717	   newrdn: cn=Paula Jensen
718	   deleteoldrdn: 1
719	   # Rename an entry and move all of its children to a new location in
720	   # the directory tree (only implemented by LDAPv3 servers).
721	   dn: ou=PD Accountants, ou=Product Development, o=Ace Industry, c=US
722	   changetype: modrdn
723	   newrdn: ou=Product Development Accountants
724	   deleteoldrdn: 0
725	   newsuperior: ou=Accounting, o=Ace Industry, c=US
726	   # Modify an entry: add an additional value to the
727	   # postaladdress attribute,
728	   # completely delete the description attribute, replace the
729	   # telephonenumber
730	   # attribute with two values, and delete a specific value from the
731	   # facsimiletelephonenumber attribute
732	   dn: cn=Paula Jensen, ou=Product Development, o=Ace Industry, c=US
733	   changetype: modify
734	   add: postaladdress
735	   postaladdress: 123 Anystreet $ Sunnyvale, CA $ 94086
736	   -
737	   delete: description
738	   -
739	   replace: telephonenumber
740	   telephonenumber: +1 408 555 1234
741	   telephonenumber: +1 408 555 5678
742	   -
743	   delete: facsimiletelephonenumber
744	   facsimiletelephonenumber: +1 408 555 9876
745	   -
746	     The Tagged Index Object for this example would be:

748	version: x-tagged-index-1
749	updatetype: incremental
750	thisupdate: 855938804
751	lastupdate: 855912345
752	BEGIN IO-Schema
753	dn: FULL
754	ou: TOKEN
755	o: TOKEN
756	c: TOKEN
757	objectclass: FULL
758	cn: TOKEN
759	sn: FULL
760	uid: FULL
761	title: TOKEN
762	END IO-Schema
763	BEGIN Add Block
764	objectclass: 1/top
765	objectclass: 1/person
766	objectclass: 1/organizationalPerson
767	c: 1/US
768	o: 1/Ace
769	o: 1/Industry
770	ou: 1/Marketing
771	cn: 1/Fiona
772	cn: 1/Jensen
773	sn: 1/Jensen
774	uid: 1/fiona
775	END Add Block

777	BEGIN Delete Block
778	dn: 1/cn=Robert Jensen, ou=Marketing, o=Ace Industry, c=US
779	END Delete Block

781	BEGIN Update Block
782	dn: 1/ou=PD Accountants, ou=Product Development, o=Ace Industry, c=US
783	-2/cn=Paula Jensen, ou=Product Development, o=Ace Industry, c=US
784	rdn: 1/Product Development Accountants
785	description: 2/
786	telephonenumber: 2/+1 408 555 5678
787	facsimiletelephonenumber: 2/
788	postaladdress: 2/123
789	-2/Anystreet
790	-2/Sunnyvale
791	-2/CA
792	-2/94086
793	END Update Block

796	6. Aggregation

798	6.1. Aggregation of Tagged Index Objects

800	     Aggregation of two tagged index objects is done by merging the two
801	lists of values and rewriting each tag list.  The tag list rewriting
802	process is done so that the resulting index object appears as if it came
803	from a single source.  Tags from one of the two tagged index objects are
804	"mapped" to the number space above that used by the other tagged index
805	object.  An index server that aggregates tagged index objects for export
806	MUST ensure that the export URL (i.e. the base-uri of the CIP object)
807	for the aggregate index object will route all queries that have "hits"
808	on the index object to that server (otherwise, query routing will not
809	succeed).
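
     The tag rewriting step can be sketched in Python as follows.  This
is a minimal illustration rather than a definitive implementation: each
index object is assumed to have been parsed into a dictionary mapping
an (attr-name, token) pair to a set of integer tags, with every tag
list already expanded to explicit numbers, and the function name is
hypothetical.

```python
def aggregate(first, second):
    """Merge two parsed index objects, keeping their tag spaces disjoint.

    Each argument maps (attr-name, token) to a set of integer tags.
    Tags from `second` are shifted above the highest tag in `first`.
    """
    offset = max((tag for tags in first.values() for tag in tags), default=0)
    merged = {key: set(tags) for key, tags in first.items()}
    for key, tags in second.items():
        merged.setdefault(key, set()).update(tag + offset for tag in tags)
    return merged
```

     If the first object uses tags 1 and 2, tag 1 of the second object
becomes tag 3 in the aggregate, so a value present in both objects
ends up with a single merged tag list.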

811	7. Security Considerations

813	     This specification provides a protocol for transferring information
814	between two servers.  The actual information transferred may be protected
815	by laws in many countries, so care must be taken in the methods used to
816	tokenize the data in order to ensure that protected data may not be
817	reconstructed in full by the receiving server.  This protocol does not
818	have any inherent protection against spoofing or eavesdropping.
819	However, since this protocol is transported in MIME messages (as are all
820	CIP index objects), it inherits all of the security capabilities and
821	liabilities of other MIME messages.  Specifically, those wanting to
822	prevent eavesdropping or spoofing may use some of the various techniques
823	for signing and encrypting MIME messages.

825	     Information Server administrators must decide what portions of
826	their databases are appropriate for inclusion in the Tagged Index
827	Object.  For distribution of information outside of the enterprise,
828	information server developers are encouraged to allow for facilities
829	that hide the organizational structure when generating the Tagged Index
830	Object from the underlying information database.  In order to allow for
831	the secure transmission of Tagged Index Objects across the Internet,
832	Index Servers should make use of SSL to carry out the connection.  In
833	order to strongly verify the identity of the peer index server on the
834	other side of the connection, SSL version 3 certificate exchange should
835	be implemented, and the identity in the peer's certificate verified
836	against the Public Key Infrastructure.  If electronic mail is used to
837	exchange the Tagged Index Objects, then a secure messaging facility,
838	such as PGP/MIME or S/MIME, should be used to sign or encrypt (or both)
839	the information.

841	8. References

843	[1]  J. Allen, M. Mealling, "The Architecture of the Common Indexing
844	     Protocol (CIP)", Internet Draft (work in progress), June 1997.

846	[2]  C. Weider, J. Fullton, S. Spero, "Architecture of the Whois++ Index
847	     Service", RFC 1913, February 1996.

849	[3]  M. Wahl, T. Howes, S. Kille, "Lightweight Directory Access Protocol
850	     (v3)", Internet Draft (work in progress), June 1997.

852	[4]  ITU, "X.525 Information Technology - Open Systems Interconnection -
853	     The Directory: Replication", November 1993.

855	[5]  "FORTEZZA Application Implementors Guide for the FORTEZZA Crypto
856	     Card (Production Version)", Document #PD4002102-1.01, SPYRUS, 1995.

858	[6]  "The LDAP Data Interchange Format (LDIF)", Internet Draft (work in
859	     progress), 25 November 1996.

861	[7]  R. Hedberg, "LDAPv2 client Vs the Index Mesh", Internet Draft (work
862	     in progress), November 1997.

864	[8]  T. Howes, M. Smith, "The LDAP URL Format", Internet Draft (work in
865	     progress), June 1997.

867	[9]  M. Elkins, "MIME Security with Pretty Good Privacy (PGP)", RFC 2015,
868	     October 1996.

870	[10] B. Ramsdell, "S/MIME Version 3 Message Specification", Internet
871	     Draft (work in progress), May 1997.

873	[11] C. Allen, T. Dierks, "The TLS Protocol Version 1.0", Internet
874	     Draft (work in progress), November 1997.

876	9.  Author's Addresses

878	     Roland Hedberg
879	     Umdac
880	     Umea University
881	     901 87 Umea
882	     Sweden
883	     Email:  Roland.Hedberg@umdac.umu.se

885	     Bruce Greenblatt
886	     RSA Data Security
887	     100 Marine Parkway
888	     Suite 500
889	     Redwood City, CA 94065
890	     USA
891	     Email: bgreenblatt@rsa.com
892	     Phone: +1-650-595-8782

894	     Ryan Moats
895	     AT&T
896	     15621 Drexel Circle
897	     Omaha, NE 68135-2358
898	     USA
899	     EMail:  jayhawk@ds.internic.net
900	     Phone:  +1 402 894-9456

902	     Mark Wahl
903	     Critical Angle, Inc.
904	     4815 W Braker Lane #502-385
905	     Austin, TX 78759
906	     Email: M.Wahl@critical-angle.com
907	                           Table of Contents

909	1. Introduction
910	2. Background
911	3. Object
912	4. The Tagged Index Object
913	4.1. The Agreement
914	4.2. Content Type
915	4.3. Tagged Index BNF
916	4.3.1. Header Descriptions
917	4.3.2. Tokenization Types
918	4.3.3. Tag Conventions
919	4.4. Incremental Indexing
920	5. Example
921	6. Aggregation
922	6.1. Aggregation of Tagged Index Objects
923	7. Security Considerations
924	8. References
925	9. Author's Addresses