Inter-Domain Policy Routing Working Group                     M. Steenstrup
Internet Draft                                 BBN Systems and Technologies
May 1992                                          Expires 30 November 1992

          Inter-Domain Policy Routing Protocol Specification:
                                Version 1

Status of this Memo

This document is an Internet Draft. Internet Drafts are working documents
of the Internet Engineering Task Force (IETF), its Areas, and its Working
Groups. Note that other groups may also distribute working documents as
Internet Drafts.

Internet Drafts are draft documents valid for a maximum of six months.
Internet Drafts may be updated, replaced, or obsoleted by other documents
at any time. It is not appropriate to use Internet Drafts as reference
material or to cite them other than as a ``working draft'' or ``work in
progress''.
Please check the 1id-abstracts.txt listing contained in the
internet-drafts Shadow Directories on nic.ddn.mil, nnsc.nsf.net,
nic.nordu.net, ftp.nisc.sri.com, or munnari.oz.au to learn the current
status of any Internet Draft.

This Internet Draft will be submitted to the RFC editor as a protocol
specification. Distribution of this Internet Draft is unlimited. Please
send comments to idpr-wg@bbn.com.

Abstract

We present the set of protocols and procedures that constitute
inter-domain policy routing (IDPR). IDPR includes the virtual gateway
protocol, the flooding protocol, the route server query protocol, the
route generation procedure, the path control protocol, and the data
message forwarding procedure.

Contributors

The following people have contributed to the protocols and procedures
described in this document: Helen Bowns, Lee Breslau, Ken Carlberg,
Isidro Castineyra, Deborah Estrin, Tony Li, Mike Little, Katia Obraczka,
Sam Resheff, Martha Steenstrup, Gene Tsudik, and Robert Woodburn.

Contents

1 Introduction
  1.1 Domain Elements
  1.2 Policy
  1.3 IDPR Functions
      1.3.1 IDPR Entities
  1.4 Policy Semantics
      1.4.1 Source Policies
      1.4.2 Transit Policies
  1.5 IDPR Message Encapsulation
      1.5.1 IDPR Data Message Format
  1.6 Security
  1.7 Timestamps and Clock Synchronization
  1.8 Network Management
      1.8.1 Policy Gateway Configuration
      1.8.2 Route Server Configuration
2 Control Message Transport Protocol
  2.1 Message Transmission
  2.2 Message Reception
  2.3 Message Validation
  2.4 CMTP Message Formats
3 Virtual Gateway Protocol
  3.1 Message Scope
      3.1.1 Pair-PG Messages
      3.1.2 Intra-VG Messages
      3.1.3 Inter-VG Messages
      3.1.4 VG Representatives
  3.2 Up/Down Protocol
      3.2.1 Implementation
  3.3 Policy Gateway Connectivity
      3.3.1 Within a Virtual Gateway
      3.3.2 Between Virtual Gateways
      3.3.3 Communication Complexity
  3.4 VGP Message Formats
      3.4.1 Up/Down
      3.4.2 PG Connect
      3.4.3 PG Policy
      3.4.4 VG Connect
      3.4.5 VG Policy
      3.4.6 Negative Acknowledgements
4 Routing Information Distribution
  4.1 AD Representatives
  4.2 Flooding Protocol
      4.2.1 Message Generation
      4.2.2 Sequence Numbers
      4.2.3 Message Acceptance
      4.2.4 Message Incorporation
      4.2.5 Routing Information Database
  4.3 Routing Information Message Formats
      4.3.1 Configuration
      4.3.2 Dynamic
      4.3.3 Negative Acknowledgements
5 Route Server Query Protocol
  5.1 Message Exchange
      5.1.1 Routing Information
      5.1.2 Routes
  5.2 Remote Route Server Communication
  5.3 Route Server Message Formats
      5.3.1 Routing Information Request
      5.3.2 Route Request
      5.3.3 Route Response
      5.3.4 Negative Acknowledgements
6 Route Generation
  6.1 Searching
      6.1.1 Implementation
  6.2 Route Database
      6.2.1 Cache Maintenance
7 Path Control Protocol and Data Message Forwarding Procedure
  7.1 An Example of Path Setup
  7.2 Path Identifiers
  7.3 Path Control Messages
  7.4 Setting Up and Tearing Down a Path
      7.4.1 Validating Path Identifiers
      7.4.2 Path Consistency with Configured Transit Policies
      7.4.3 Path Consistency with Virtual Gateway Reachability
      7.4.4 Obtaining Resources
      7.4.5 Target Response
      7.4.6 Originator Response
      7.4.7 Path Life
  7.5 Path Failure and Recovery
      7.5.1 Handling Implicit Path Failures
      7.5.2 Local Path Repair
      7.5.3 Repairing a Path
  7.6 Path Control Message Formats
      7.6.1 Setup
      7.6.2 Accept
      7.6.3 Refuse
      7.6.4 Teardown
      7.6.5 Error
      7.6.6 Repair
      7.6.7 Negative Acknowledgements

1 Introduction

In this document, we specify the protocols and procedures that compose
inter-domain policy routing (IDPR). The objective of IDPR is to construct
and maintain routes between source and destination administrative domains
that provide user traffic with the services requested within the
constraints stipulated for the domains transited. IDPR supports link state
routing information distribution and route generation in conjunction with
source-specified message forwarding. Refer to [5] for a detailed
justification of our approach to inter-domain policy routing.

1.1 Domain Elements

The IDPR architecture has been designed to accommodate an Internet with
tens of thousands of administrative domains collectively containing
hundreds of thousands of local networks. Inter-domain policy routes are
constructed using information about the services offered by, and the
connectivity between, administrative domains. The intra-domain details --
gateways, networks, and links traversed -- of an inter-domain policy route
are the responsibility of intra-domain routing and are thus outside the
scope of IDPR.

An administrative domain (AD) is a collection of contiguous hosts,
gateways, networks, and links managed by a single administrative authority
that defines service restrictions for transit traffic and service
requirements for locally-generated traffic, and selects the addressing
schemes and routing procedures that apply within the domain. Each domain
has a unique numeric identifier within the Internet.

Virtual gateways (VGs) are the only IDPR-recognized connecting points
between adjacent domains. Each virtual gateway is a collection of
directly-connected policy gateways (see below) in two adjoining domains,
whose existence has been sanctioned by the administrators of both domains.
The domain administrators may agree to establish more than one virtual
gateway between the two domains. For each such virtual gateway, the two
administrators together assign a local numeric identifier, unique within
the set of virtual gateways connecting the two domains. To produce a
virtual gateway identifier unique within its domain, a domain
administrator concatenates the mutually assigned local virtual gateway
identifier with the adjacent domain's identifier.

Policy gateways (PGs) are the physical gateways within a virtual gateway.
Each policy gateway enforces service restrictions on IDPR transit traffic,
as stipulated by the domain administrator, and forwards the traffic
accordingly. Within a domain, two policy gateways are neighbors if they
are in different virtual gateways. A single policy gateway may belong to
multiple virtual gateways. Within a virtual gateway, two policy gateways
are peers if they are in the same domain and are adjacent if they are in
different domains.
Adjacent policy gateways are directly connected if the only
Internet-addressable entities attached to the connecting medium are policy
gateways in the virtual gateways. Note that this definition implies that
not only point-to-point links but also networks may serve as direct
connections between adjacent policy gateways. The domain administrator
assigns to each of its policy gateways a numeric identifier, unique within
that domain.

A domain component is a subset of a domain's entities such that all
entities within the subset are mutually reachable via intra-domain routes,
but no entities outside the subset are reachable via intra-domain routes
from entities within the subset. Normally, a domain consists of a single
component, namely itself; however, when partitioned, a domain consists of
multiple components. Each domain component has an identifier, unique
within the Internet, composed of the domain identifier together with the
identifier of the lowest-numbered operational policy gateway within the
component. All operational policy gateways within a domain component can
discover mutual reachability through intra-domain routing information.
Hence, all such policy gateways can consistently determine, without
explicit negotiation, which of them has the lowest number.
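The identifier constructions above are simple enough to state precisely.
The following Python sketch illustrates them; the tuple representations
are our own illustrative choice, since this specification defines only the
components of each identifier, not a concrete encoding.

   # Illustrative only: the specification defines the components of each
   # identifier but not a concrete encoding; the tuple forms are assumed.

   def virtual_gateway_identifier(adjacent_ad_id, local_vg_id):
       # A VG identifier unique within a domain: the mutually assigned
       # local VG identifier concatenated with the adjacent domain's
       # identifier.
       return (adjacent_ad_id, local_vg_id)

   def domain_component_identifier(ad_id, operational_pg_ids):
       # A domain component identifier unique within the Internet: the
       # domain identifier together with the identifier of the lowest-
       # numbered operational policy gateway in the component.  No
       # explicit negotiation is needed, because every operational PG in
       # the component can compute the same minimum.
       return (ad_id, min(operational_pg_ids))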
1.2 Policy

With IDPR, each domain administrator sets transit policies that dictate
how and by whom the resources in its domain should be used. Transit
policies are usually public, and they specify offered services comprising:

Access restrictions: e.g., applied to traffic to or from certain domains
   or classes of users.

Quality: e.g., delay, throughput, or error characteristics.

Monetary cost: e.g., charge per byte, message, or unit time.

Each domain administrator also sets source policies for traffic
originating in its domain. Source policies are usually private, and they
specify requested services comprising:

Access restrictions: e.g., domains to favor or avoid in routes.

Quality: e.g., acceptable delay, throughput, and reliability.

Monetary cost: e.g., acceptable session cost.

1.3 IDPR Functions

IDPR comprises the following functions:

1. Collecting and distributing routing information including domain
   transit policies and inter-domain connectivity.

2. Generating and selecting policy routes based on the routing
   information distributed and on the source policies configured or
   requested.

3. Setting up paths across the Internet using the policy routes
   generated.

4. Forwarding messages across and between domains along the established
   paths.

5. Maintaining databases of routing information, inter-domain policy
   routes, forwarding information, and configuration information.

1.3.1 IDPR Entities

Several different entities are responsible for performing the IDPR
functions.

Policy gateways, the only IDPR-recognized connecting points between
adjacent domains, collect and distribute routing information, participate
in path setup, forward data messages along established paths, and maintain
forwarding information databases.

Path agents, resident within policy gateways and within route servers (see
below), act on behalf of hosts to select policy routes, to set up and
manage paths, and to maintain forwarding information databases. Any
Internet host can reap the benefits of IDPR, as long as there exists a
path agent configured to act on its behalf and a means by which the host's
messages can reach the path agent.

Route servers maintain both the routing information database and the route
database, and they generate policy routes using the routing information
collected and the source policies requested by the path agents. A route
server may reside within a policy gateway, or it may exist as an
autonomous entity. Separating the route server functions from the policy
gateways frees the policy gateways from both the memory-intensive task of
routing information and route database maintenance and the computationally
intensive task of route generation. Route servers, like policy gateways,
each have a unique numeric identifier within their domain, assigned by the
domain administrator.

Given the size of the current Internet, each policy gateway can perform
the route server functions, in addition to its message forwarding
functions, with little or no degradation in message forwarding
performance. Aggregating the routing functions into policy gateways
simplifies implementation; one need only install IDPR protocols in policy
gateways. Moreover, it simplifies communication between routing functions,
as all functions reside within each policy gateway. As the Internet grows,
the processing and memory required to perform the route server functions
may become a burden for the policy gateways. When this happens, each
domain administrator should separate the route server functions from the
policy gateways in its domain.

Mapping servers maintain the database of mappings that resolve Internet
names and addresses to domain identifiers. The mapping server function
will be integrated into the existing DNS name service.

Configuration servers maintain the databases of configured information
that apply to IDPR entities within their domains. Configuration
information for a given domain includes transit policies (i.e., service
offerings), source policies (i.e., service requirements), and mappings
between local IDPR entities and their names and addresses. The
configuration server function will be integrated into a domain's existing
network management system.

1.4 Policy Semantics

The source and transit policies supported by IDPR are intended to
accommodate a wide range of services available throughout the Internet. We
describe the semantics of these policies, concentrating on the access
restriction aspects. To express these policies in this document, we have
chosen to use a syntactic variant of Clark's policy term notation [1].
However, we provide a more succinct syntax (see [6]) for actually
configuring source and transit policies.

1.4.1 Source Policies

Each source policy takes the form of a collection of sets as follows:

{((H11,s11),...,(H1f1,s1f1)),...,((Hn1,sn1),...,(Hnfn,snfn))}: The set of
   groups of source/destination traffic flows to which the source policy
   applies. Each traffic flow group ((Hi1,si1),...,(Hifi,sifi)) contains
   a set of source hosts and corresponding destination hosts. Here, Hij
   represents a host, and sij, an element of {source,destination},
   indicates whether Hij is to be considered as a source or as a
   destination.

{(AD1,x1),...,(ADm,xm)}: The set of transit domains that the traffic flows
   should favor, avoid, or exclude. Here, ADi represents a set of domains,
   and xi, an element of {favor,avoid,exclude}, indicates whether routes
   including members of ADi are to be favored, avoided if possible, or
   unconditionally excluded.

UCI: The user class applied to the traffic flows listed.

Requested: The set of requested services not related to access
   restrictions, i.e., service quality and monetary cost.

The path agent honoring such a source policy will select a route for a
traffic flow from any source host Hij to any destination host Hik, where
1 <= i <= n and 1 <= j,k <= fi, provided that:

1. For each domain ADp contained in the route, ADp is not a member of any
   set ADk for which xk = exclude, where 1 <= k <= m.

2. The route provides the services listed in the set Requested.

1.4.2 Transit Policies

Each transit policy takes the form of a collection of sets as follows:

{((H11,AD11,s11),...,(H1f1,AD1f1,s1f1)),...,((Hn1,ADn1,sn1),...,
(Hnfn,ADnfn,snfn))}: The set of groups of source and destination hosts and
   domains to which the transit policy applies. Each host/domain group
   ((Hi1,ADi1,si1),...,(Hifi,ADifi,sifi)) contains a set of source and
   destination hosts and domains such that this transit domain will carry
   traffic from each source listed to each destination listed. Here, Hij
   represents a set of hosts, ADij represents a set of domains containing
   Hij, and sij, a subset of {source,destination}, indicates whether
   (Hij,ADij) is to be considered as a set of sources, destinations, or
   both.

Time: The set of time intervals during which the transit policy applies.

UCI: The set of user classes to which the transit policy applies.

Offered: The set of offered services not related to access restrictions,
   i.e., service quality and monetary cost.

{((VG11,e11),...,(VG1g1,e1g1)),...,((VGm1,em1),...,(VGmgm,emgm))}: The set
   of groups of entry and exit virtual gateways to which the transit
   policy applies. Each virtual gateway group
   ((VGi1,ei1),...,(VGigi,eigi)) contains a set of domain entry and exit
   points such that each entry virtual gateway can reach (barring any
   intra-domain routing failure) each exit virtual gateway via an
   intra-domain route supporting the transit policy. Here, VGij
   represents a virtual gateway, and eij, a subset of {entry,exit},
   indicates whether VGij is to be considered as a domain entry point,
   exit point, or both.

The domain advertising such a transit policy will carry traffic from any
host in the set Hij in ADij to any host in the set Hik in ADik, where
1 <= i <= n and 1 <= j,k <= fi, provided that:

1. source is an element of sij.

2. destination is an element of sik.

3. Traffic from Hij enters the domain during one of the intervals in the
   set Time.

4. Traffic from Hij carries one of the user class identifiers in the set
   UCI.

5. Traffic from Hij enters via any VGuv such that entry is an element of
   euv, where 1 <= u <= m and 1 <= v <= gu.

6. Traffic to Hik leaves via any VGuw such that exit is an element of
   euw, where 1 <= w <= gu.
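To make the selection conditions above concrete, the following Python
sketch tests a candidate route against the access restriction portion of a
source policy. The data representations (a list of domain identifiers for
the route, (set, indicator) pairs for the policy) are illustrative
assumptions, not part of this specification.

   # Sketch of the access-restriction test a path agent might apply.
   # Data representations are assumed for illustration.

   def route_satisfies_source_policy(route_domains, domain_sets,
                                     services_on_route, requested):
       # route_domains: ordered list of AD identifiers on the route.
       # domain_sets: list of (set_of_ADs, indicator) pairs, where the
       # indicator is 'favor', 'avoid', or 'exclude'.
       for ad_set, indicator in domain_sets:
           # Condition 1: no domain on the route may belong to a set
           # marked 'exclude'.
           if indicator == 'exclude':
               if any(ad in ad_set for ad in route_domains):
                   return False
       # Condition 2: the route must provide every requested service.
       return requested <= services_on_route

The 'favor' and 'avoid' indicators express preferences rather than hard
filters, so a route generator would apply them when ranking candidate
routes rather than in a test of this form.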
1.5 IDPR Message Encapsulation

There are two kinds of IDPR messages:

1. Data messages containing user data generated by hosts.

2. Control messages containing IDPR protocol-related control information
   generated by policy gateways and route servers.

Within the Internet, only policy gateways and route servers are able to
generate, recognize, and process IDPR messages. The existence of IDPR is
invisible to all other gateways and hosts, including mapping servers and
configuration servers. Mapping servers and configuration servers perform
necessary but ancillary functions for IDPR, and thus they are not required
to handle IDPR messages.

An IDPR entity places IDPR-specific information in each IDPR control
message it originates; this information is significant only to recipient
IDPR entities. Using encapsulation across each domain, an IDPR message
tunnels from source to destination across the Internet through domains
that may employ disparate intra-domain addressing schemes and routing
procedures.

As an alternative to encapsulation, we had considered embedding IDPR in
IP, as a set of IP options. However, this approach has the following
disadvantages:

1. Only domains that support IP would be able to participate in IDPR;
   domains that do not support IP would be excluded.

2. Each gateway, policy or other, in a participating domain would at
   least have to recognize the IDPR option, even if it did not execute
   the IDPR protocols. However, most commercial routers are not optimized
   for IP options processing, and so IDPR message handling might require
   significant processing at each gateway.

3. For some IDPR protocols, in particular path control, the size
   restrictions on IP options would preclude inclusion of all of the
   necessary protocol-related information.

For these reasons, we decided against the IP option approach and in favor
of encapsulation.

An IDPR message travels from source to destination between consecutive
policy gateways. Each policy gateway encapsulates the IDPR message with
information, for example an IP header, that will enable the message to
reach the next policy gateway. Note that the encapsulating header and the
IDPR-specific information may increase message size beyond the MTU of the
given domain. However, message fragmentation and reassembly are the
responsibility of the protocol, for example IP, that encapsulates IDPR
messages for transport between successive policy gateways; they are not
the responsibility of IDPR itself.

A policy gateway, when forwarding an IDPR message to a peer or a neighbor
policy gateway, encapsulates the message in accordance with the addressing
scheme and routing procedure of the given domain and indicates in the
protocol field of the encapsulating header that the message is indeed an
IDPR message. Intermediate gateways between the two policy gateways
forward the IDPR message as they would any other message, using the
information in the encapsulating header. Only the recipient policy gateway
interprets the protocol field, strips off the encapsulating header, and
processes the IDPR message.

A policy gateway, when forwarding an IDPR message to a directly-connected
adjacent policy gateway, encapsulates the message in accordance with the
addressing scheme of the entities within the virtual gateway and indicates
in the protocol field of the encapsulating header that the message is
indeed an IDPR message. The recipient policy gateway strips off the
encapsulating header and processes the IDPR message. We recommend that the
recipient policy gateway perform the following validation check on the
encapsulating header, prior to stripping it off. Specifically, the
recipient policy gateway should verify that the source address and the
destination address in the encapsulating header match the adjacent policy
gateway's address and its own address, respectively. Moreover, the
recipient policy gateway should verify that the message arrived on the
interface designated for the direct connection to the adjacent policy
gateway. These checks help to ensure that IDPR traffic that crosses domain
boundaries does so only over direct connections between adjacent policy
gateways.
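The recommended check on the encapsulating header can be summarized in a
short Python sketch; the parameter names are ours, and the addresses and
interface designations would come from the recipient's configuration (see
section 1.8.1).

   # Sketch of the recommended validation of the encapsulating header on
   # a message from a directly-connected adjacent policy gateway.

   def accept_encapsulated_message(encap_src, encap_dst, arrival_if,
                                   adjacent_pg_addr, own_addr,
                                   direct_connection_if):
       # The source must be the adjacent policy gateway, and the
       # destination must be this policy gateway itself.
       if encap_src != adjacent_pg_addr or encap_dst != own_addr:
           return False
       # The message must arrive on the interface designated for the
       # direct connection to that adjacent policy gateway.
       return arrival_if == direct_connection_if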
Policy gateways forward IDPR data messages according to a forwarding
information database which maps path identifiers into next policy
gateways, and they forward IDPR control messages according to next policy
gateways selected by the particular IDPR control protocol. Distinguishing
IDPR data messages and IDPR control messages at the encapsulating protocol
level, instead of at the IDPR protocol level, eliminates an extra level of
dispatching and hence makes IDPR message forwarding more efficient. When
encapsulated within IP messages, IDPR data messages and IDPR control
messages carry the IP protocol numbers 35 and 38, respectively.

1.5.1 IDPR Data Message Format

The path agents at a source domain determine which data messages generated
by local hosts are to be handled by IDPR. To each data message selected
for IDPR handling, a source path agent prepends the following header:

   0_________8________16________24_____31__
   |_VERSION_|__PROTO__|______LENGTH______|
   |               PATH ID                |
   |______________________________________|
   |_____________TIMESTAMP________________|
   |               INT/AUTH               |
   |______________________________________|

VERSION (8 bits) Version number for IDPR data messages, currently equal
   to 1.

PROTO (8 bits) Numeric identifier for the protocol with which to process
   the contents of the IDPR data message. Only the path agent at the
   destination interprets and acts upon the contents of the PROTO field.

LENGTH (16 bits) Length of the entire IDPR data message in bytes.

PATH ID (64 bits) Path identifier assigned by the source path agent and
   consisting of the numeric identifier of the path agent's domain (16
   bits), the numeric identifier of the path agent's policy gateway (16
   bits), and the path agent's local path identifier (32 bits) (see
   section 7.2).

TIMESTAMP (32 bits) Number of seconds elapsed since 1 January 1970
   0:00 GMT.

INT/AUTH (variable) Computed integrity/authentication value, dependent on
   the type of integrity/authentication requested during path setup.

We describe the IDPR control message header in section 2.4.
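As an illustration of this layout, the following Python sketch packs the
fixed portion of the IDPR data message header. We assume network
(big-endian) byte order and treat the INT/AUTH value as precomputed opaque
bytes; neither assumption is stated in this section.

   import struct
   import time

   IDPR_DATA_VERSION = 1

   def pack_idpr_data_header(proto, src_ad, src_pg, local_path_id,
                             int_auth, payload_length):
       # LENGTH covers the entire IDPR data message in bytes: the 16
       # fixed header bytes, the variable INT/AUTH value, and the data.
       length = 16 + len(int_auth) + payload_length
       fixed = struct.pack('>BBHHHII',
                           IDPR_DATA_VERSION,  # VERSION   (8 bits)
                           proto,              # PROTO     (8 bits)
                           length,             # LENGTH    (16 bits)
                           src_ad,             # PATH ID: domain   (16)
                           src_pg,             # PATH ID: gateway  (16)
                           local_path_id,      # PATH ID: local id (32)
                           int(time.time()))   # TIMESTAMP (32 bits)
       return fixed + int_auth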
1.6 Security

IDPR contains mechanisms for verifying message integrity and source
authenticity and for protecting against certain types of denial of service
attacks. It is particularly important to keep IDPR control messages
intact, because they carry control information critical to the
construction and use of viable policy routes between domains.

All IDPR messages carry a single piece of information, referred to in the
IDPR documentation as the integrity/authentication value, which may be
used not only to detect message corruption but also to verify the
authenticity of the message source. The Internet coordinator(1) sanctions
the set of valid algorithms which may be used to compute the
integrity/authentication values. This set may include algorithms that
perform only message integrity checks, such as n-bit cyclic redundancy
checksums (CRCs), as well as algorithms that perform both message
integrity and source authentication checks, such as signed hash functions
of message contents.

____________________________
(1) Throughout this document, we use the term Internet coordinator to
refer to a coordinating body that makes administrative decisions about the
Internet as a whole. There may actually be separate bodies responsible for
separate aspects of the Internet. However, for simplicity, we use the
single term Internet coordinator.

Each domain administrator is free to select any integrity/authentication
algorithm, from the set specified by the Internet coordinator, for
computing the integrity/authentication values contained in its domain's
messages. However, we recommend that IDPR entities in each domain be
capable of executing all of the valid algorithms so that an IDPR control
message originating at an entity in one domain can be properly checked by
an entity in another domain.

IDPR control messages must carry a non-null integrity/authentication
value. We recommend that the integrity/authentication algorithm be a
digital signature, in particular one based on a hash algorithm such as
MD4 [15] or MD5 [16], which simultaneously verifies message integrity and
source authenticity. The digital signature may be based on either
public-key or private-key cryptography. Our approach to digital signature
use in IDPR is based on the privacy-enhanced Internet electronic mail
service [12]-[14], already available in the Internet.

We do not require IDPR data messages to carry a non-null
integrity/authentication value. In fact, we recommend that a higher layer
(end-to-end) protocol, and not IDPR, assume responsibility for checking
the integrity and authenticity of data messages, because of the amount of
computation required.
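The specification leaves the choice and exact procedure of each algorithm
to the Internet coordinator. Purely as an illustration of a private-key
style scheme built on MD5 (one of the algorithms cited above), the
integrity/authentication value might be computed as a keyed hash over the
message with the INT/AUTH field zeroed; the zeroing convention and the
keyed construction here are our assumptions, not requirements of this
document.

   import hashlib

   def compute_int_auth(message_with_zeroed_ia, shared_key):
       # Illustrative keyed-MD5 construction: hash the shared key
       # together with the message contents, the INT/AUTH field having
       # been set to zero before computation.  The valid algorithms and
       # their exact procedures are sanctioned by the Internet
       # coordinator; this is merely a sketch of one possibility.
       return hashlib.md5(shared_key + message_with_zeroed_ia).digest()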
1.7 Timestamps and Clock Synchronization

Each IDPR message carries a timestamp (expressed in seconds elapsed since
1 January 1970 0:00 GMT, following the UNIX precedent) supplied by the
source IDPR entity, which serves to indicate the age of the message. IDPR
entities use the absolute value of the timestamp to confirm that a message
is current and use the relative difference between timestamps to determine
which message contains the more recent information.

All IDPR entities must possess internal clocks that are synchronized to
some degree, in order for the absolute value of a message timestamp to be
meaningful. The synchronization granularity required by IDPR is on the
order of minutes and can be achieved manually. Thus, a synchronization
protocol operating among all IDPR entities in all domains, while useful,
is not necessary.

An IDPR entity can determine whether to accept or reject a message based
on the discrepancy between the message's timestamp and the entity's own
internal clock time. Any IDPR message whose timestamp lies outside of the
acceptable range may contain stale or corrupted information or may have
been issued by a source whose internal clock has lost synchronization with
the message recipient's internal clock. Timestamp checks are required for
control messages because of the consequences of propagating and acting
upon incorrect control information. Timestamp checks are discretionary for
data messages but may be invoked during problem diagnosis, for example,
when checking for suspected message replays.

We note that none of the IDPR protocols contains explicit provisions for
dealing with an exhausted timestamp space. As timestamp space exhaustion
will not occur until well into the next century, we expect timestamp space
viability to outlast the IDPR protocols.
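Combining the absolute check described here with the CMTP bound cmtp_new
defined in section 2.3 gives a test of roughly the following form; the
variable names and the treatment of the per-protocol age limit are
illustrative.

   CMTP_NEW = 5 * 60  # recommended cmtp_new value: 5 minutes, in seconds

   def timestamp_acceptable(msg_timestamp, clock_now, max_age_seconds):
       # Reject messages too far in the future; cmtp_new allows for some
       # clock drift between IDPR entities.
       if msg_timestamp > clock_now + CMTP_NEW:
           return False
       # Reject messages older than the protocol-specific maximum age
       # (each IDPR protocol sets its own limit).
       return msg_timestamp >= clock_now - max_age_seconds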
1.8 Network Management

In this document, we do not describe how to configure and manage IDPR.
However, in this section, we do provide a list of the types of IDPR
configuration information required. Also, in later sections describing the
IDPR protocols, we briefly note the types of exceptional events that must
be logged for network management. Complete descriptions of IDPR entity
configuration and IDPR managed objects appear in [6] and [7],
respectively.

To participate in inter-domain policy routing, policy gateways and route
servers within a domain each require configuration information. Some of
the configuration information is specifically defined within the given
domain, while some of it is universally defined throughout the Internet. A
domain administrator determines domain-specific information, and the
Internet coordinator determines globally significant information. (Refer
to [6] for detailed instructions on configuring an administrative domain
to support IDPR.)

To produce valid domain configurations, the domain administrators must
receive the following global information from the Internet coordinator:

1. For each Internet integrity/authentication type, the numeric
   identifier, syntax, and semantics. Available integrity and
   authentication types include but are not limited to:

   (a) public-key based signatures;
   (b) private-key based signatures;
   (c) cyclic redundancy checksums;
   (d) no integrity/authentication.

2. For each Internet user class, the numeric identifier, syntax, and
   semantics. Available user classes include but are not limited to:

   (a) federal (and if necessary, agency-specific such as NSF, DOD, DOE,
       etc.);
   (b) research;
   (c) commercial;
   (d) support.

3. For each Internet offered service that may be advertised in transit
   policies, the numeric identifier, syntax, and semantics. Available
   offered services include but are not limited to:

   (a) average message delay;
   (b) message delay variation;
   (c) average bandwidth available;
   (d) bandwidth variation;
   (e) maximum transfer unit (MTU);
   (f) charge per byte;
   (g) charge per message;
   (h) charge per unit time.

4. For transit policy applicability time periods, the syntax and
   semantics.

5. For each Internet requested service that may appear within a path
   setup message, the numeric identifier, syntax, and semantics.
   Available requested services include but are not limited to:

   (a) maximum path life in minutes, messages, or bytes;
   (b) integrity/authentication algorithms to be used on data messages
       sent over the path;
   (c) path delay;
   (d) minimum delay for path;
   (e) path delay variation;
   (f) minimum delay variation path;
   (g) path bandwidth;
   (h) maximum bandwidth path;
   (i) session monetary cost;
   (j) minimum session monetary cost path;
   (k) billing address;
   (l) charge number.

In an Internet-wide implementation of IDPR, the set of global
configuration parameters and their syntax and semantics must be consistent
across all participating domains. The Internet coordinator, responsible
for establishing the full set of global configuration parameters, relies
on the cooperation of the administrator of each participating domain to
ensure that the global parameters are consistent with the desired transit
policies and user service requirements of each domain. Moreover, as the
syntax and semantics of the global parameters affect the syntax and
semantics of the corresponding IDPR software, the Internet coordinator
must carefully define each global parameter so that it is unlikely to
require future modifications.

The Internet coordinator distributes configured global information to
configuration servers in all domains participating in IDPR. Each domain
administrator uses the configured global information maintained by its
configuration servers to develop configurations for each IDPR entity
within its domain. Each configuration server retains a copy of the
configuration for each local IDPR entity and also distributes the
configuration to that entity using, for example, SNMP.

1.8.1 Policy Gateway Configuration

Each policy gateway must contain sufficient configuration information to
perform its IDPR functions, which subsume those of the path agent. These
functions include: validating IDPR control messages; generating and
distributing virtual gateway connectivity and routing information messages
to peer, neighbor, and adjacent policy gateways; distributing routing
information messages to route servers in its domain; resolving destination
addresses; requesting policy routes from route servers; selecting policy
routes and initiating path setup; ensuring consistency of a path with its
domain's transit policies; establishing path forwarding information; and
forwarding IDPR data messages along existing paths. The necessary
configuration information, collected into a single record in the sketch
following this list, includes the following:

1. For each integrity/authentication type, the numeric identifier,
   syntax, and semantics.

2. For each policy gateway and route server in the given domain, the
   numeric identifier and set of addresses or names.

3. For each virtual gateway connected to the given domain, the numeric
   identifier, the numeric identifiers of the constituent peer policy
   gateways, and the numeric identifier of the adjacent domain.

4. For each virtual gateway of which the given policy gateway is a
   member, the numeric identifiers and set of addresses of the
   constituent adjacent policy gateways.

5. For each policy gateway directly-connected and adjacent to the given
   policy gateway, the local connecting interface.

6. For each local route server to which the given policy gateway
   distributes routing information, the numeric identifier.

7. For each source policy applicable to hosts within the given domain,
   the syntax and semantics.

8. For each transit policy applicable to the domain, the numeric
   identifier, syntax, and semantics.

9. For each requested service that may appear within a path setup
   message, the numeric identifier, syntax, and semantics.

10. For each source user class, the numeric identifier, syntax, and
    semantics.
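As noted above, these items can be collected into a single record. The
field names and Python types below are our own; the specification defines
what must be configured, not how it is stored.

   from dataclasses import dataclass
   from typing import Dict, List, Set

   @dataclass
   class PolicyGatewayConfig:
       # Assumed in-memory form of configuration items 1-10 above.
       int_auth_types: Dict[int, str]        # 1: id -> syntax/semantics
       domain_entities: Dict[int, Set[str]]  # 2: PG/RS id -> addresses
       virtual_gateways: Dict[int, tuple]    # 3: VG id -> (peer PG ids,
                                             #    adjacent domain id)
       adjacent_pgs: Dict[int, Set[str]]     # 4: per member VG, the
                                             #    adjacent PGs' addresses
       direct_interfaces: Dict[int, str]     # 5: adjacent PG id -> local
                                             #    connecting interface
       route_servers: List[int]              # 6: local RSs receiving
                                             #    routing information
       source_policies: List[str]            # 7
       transit_policies: Dict[int, str]      # 8
       requested_services: Dict[int, str]    # 9
       user_classes: Dict[int, str]          # 10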
1.8.2 Route Server Configuration

Each route server must contain sufficient configuration information to
perform its IDPR functions, which subsume those of the path agent. These
functions include: validating IDPR control messages; deciphering and
storing the contents of routing information messages; exchanging routing
information with other route servers and policy gateways; generating
policy routes that respect transit policy restrictions and source service
requirements; distributing policy routes to path agents in policy
gateways; resolving destination addresses; selecting policy routes and
initiating path setup; establishing path forwarding information; and
forwarding IDPR data messages along existing paths. The necessary
configuration information includes the following:

1. For each integrity/authentication type, the numeric identifier,
   syntax, and semantics.

2. For each policy gateway and route server in the given domain, the
   numeric identifier and set of addresses or names.

3. For each source policy applicable to hosts within the given domain,
   the syntax and semantics.

4. For each offered service that may be advertised in transit policies,
   the numeric identifier, syntax, and semantics.

5. For each requested service that may appear within a path setup
   message, the numeric identifier, syntax, and semantics.

6. For each source user class, the numeric identifier, syntax, and
   semantics.

2 Control Message Transport Protocol

IDPR control messages convey routing-related information that directly
affects the policy routes generated and the paths set up across the
Internet. Errors in IDPR control messages can have widespread, deleterious
effects on inter-domain policy routing, and so the IDPR protocols have
been designed to minimize loss and corruption of control messages. For
every control message it transmits, each IDPR protocol expects to receive
notification as to whether the control message successfully reached the
intended IDPR recipient. Moreover, the IDPR recipient of a control message
first verifies that the message appears to be well-formed, before acting
on its contents.

All IDPR protocols use the control message transport protocol (CMTP), a
connectionless, transaction-based transport layer protocol, for
communication with intended recipients of control messages. CMTP
retransmits unacknowledged control messages and applies integrity and
authenticity checks to received control messages.

There are three types of CMTP messages:

datagram: Contains IDPR control messages.

ack: Positive acknowledgement in response to a datagram message.

nak: Negative acknowledgement in response to a datagram message.

Each CMTP message contains several pieces of information supplied by the
sender that allow the recipient to test the integrity and authenticity of
the message.
The integrity and authenticity checks performed after CMTP message
reception are collectively referred to as the validation checks and are
described in section 2.3.

When we first designed the IDPR protocols, CMTP as a distinct protocol did
not exist. Instead, CMTP-equivalent functionality was embedded in each
IDPR protocol. To provide a cleaner implementation, we later decided to
provide a single transport protocol that could be used by all IDPR
protocols. We originally considered using an existing transport protocol,
but rejected this approach for the following reasons:

1. The existing reliable transport protocols do not provide all of the
   validation checks, in particular the timestamp and authenticity
   checks, required by the IDPR protocols. Hence, if we were to use one
   of these protocols, we would still have to provide a separate protocol
   on top of the transport protocol to force retransmission of IDPR
   messages that failed to pass the required validation checks.

2. Many of the existing reliable transport protocols are window-based and
   hence can result in increased message delay and resource use when, as
   is the case with IDPR, multiple independent messages use the same
   transport connection. A single message experiencing transmission
   problems and requiring retransmission can prevent the window from
   advancing, forcing all subsequent messages to queue behind the given
   message. Moreover, many of the window-based protocols do not support
   selective retransmission of failed messages but instead require
   retransmission of not only the failed message but also all preceding
   messages within the window.

2.1 Message Transmission

At the transmitting entity, when an IDPR protocol is ready to issue a
control message, it passes a copy of the message to CMTP; it also passes a
set of parameters to CMTP for inclusion in the CMTP header and for proper
CMTP message handling. In turn, CMTP converts the control message and
associated parameters into a datagram by prepending the appropriate header
to the control message. The CMTP header contains several pieces of
information to aid the message recipient in detecting errors (see
section 2.4). Each IDPR protocol can specify all of the following CMTP
parameters applicable to its control message:

1. IDPR protocol and message type.

2. Destination.

3. Integrity/authentication scheme.

4. Timestamp.

5. Maximum number of transmissions allotted.

6. Retransmission interval in microseconds.

One of these parameters, the timestamp, could be specified directly by
CMTP as the internal clock time at which the message is transmitted.
However, two of the IDPR protocols, namely flooding and path control,
themselves require message generation timestamps for proper protocol
operation. Thus, instead of requiring CMTP to pass back a timestamp to the
IDPR protocol, we simplify the service interface between the two protocols
by allowing the IDPR protocol to specify the timestamp in the first place.

Using the control message and accompanying parameters supplied by the IDPR
protocol, CMTP constructs a datagram, adding to the header CMTP-specific
parameters. In particular, CMTP assigns a transaction identifier to each
datagram generated, used to associate acknowledgements with datagram
messages. Each datagram recipient includes the received transaction
identifier in its returned ack or nak, and each datagram sender uses the
transaction identifier to match the received ack or nak with the original
datagram.

A single datagram, for example, a routing information message or a path
control message, may be handled by CMTP at many different policy gateways.
Within a pair of consecutive IDPR entities, the datagram sender expects to
receive an acknowledgement from the datagram recipient. However, only the
IDPR entity that actually generated the original CMTP datagram has control
over the transaction identifier. The intermediate policy gateways that
transmit the datagram do not change the transaction identifier.
Nevertheless, at each intermediate policy gateway, the transaction
identifier must uniquely distinguish the datagram so that only one
acknowledgement from the next policy gateway matches the original
datagram.

The transaction identifier consists of the numeric identifiers for the
domain and IDPR entity (policy gateway or route server) issuing the
original datagram, together with a 32-bit local identifier assigned by
CMTP operating within that IDPR entity. We recommend implementing the
32-bit local identifier either as a simple counter incremented for each
datagram generated or as a fine granularity clock. The former always
guarantees uniqueness of transaction identifiers; the latter guarantees
uniqueness of transaction identifiers, provided the clock granularity is
finer than the minimum possible interval between datagram generations and
the clock wrapping period is longer than the maximum round-trip delay to
and from any Internet destination.
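The counter-based implementation of the 32-bit local identifier
recommended above is straightforward; the class below is an illustrative
Python sketch, with the 32-bit wrap made explicit.

   import itertools

   class TransactionIdGenerator:
       # A transaction identifier: the domain and entity identifiers of
       # the issuing IDPR entity plus a 32-bit local identifier.
       def __init__(self, ad_id, entity_id):
           self.ad_id = ad_id
           self.entity_id = entity_id
           self._counter = itertools.count()

       def next_id(self):
           # Simple counter, incremented for each datagram generated;
           # masked to 32 bits (uniqueness holds until the counter
           # wraps).
           local_id = next(self._counter) & 0xFFFFFFFF
           return (self.ad_id, self.entity_id, local_id)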
Before transmitting a datagram, CMTP computes the length of the entire
message, taking into account the prescribed integrity/authentication
scheme, and then computes the integrity/authentication value over the
whole message. CMTP includes both of these quantities, which are crucial
for checking message integrity and authenticity at the recipient, in the
datagram header. After sending a datagram, CMTP saves a copy and sets an
associated retransmission timer, as directed by the IDPR protocol
parameters. If the retransmission timer fires and CMTP has received
neither an ack nor a nak for the datagram, CMTP then retransmits the
datagram, provided this retransmission does not exceed the transmission
allotment. Whenever a datagram exhausts its transmission allotment, CMTP
discards the datagram, informs the IDPR protocol that the control message
transmission was not successful, and logs the event for network
management. In this case, the IDPR protocol may either resubmit its
control message to CMTP, specifying an alternate destination, or discard
the control message altogether.
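The retransmission discipline just described might be organized as
follows; the cmtp helper methods used (send, schedule_timer,
notify_failure, log_event) are hypothetical, not interfaces defined by
this specification.

   # Sketch of per-datagram retransmission state.

   class PendingDatagram:
       def __init__(self, datagram, max_transmissions, interval):
           self.datagram = datagram
           self.transmissions_left = max_transmissions - 1  # initial
                                                            # send done
           self.interval = interval

       def on_retransmission_timer(self, cmtp):
           # The timer fired with neither an ack nor a nak received.
           if self.transmissions_left > 0:
               self.transmissions_left -= 1
               cmtp.send(self.datagram)
               cmtp.schedule_timer(self, self.interval)
           else:
               # Transmission allotment exhausted: discard the datagram,
               # inform the IDPR protocol of the failure, and log the
               # event for network management.
               cmtp.notify_failure(self.datagram)
               cmtp.log_event('CMTP: transmission allotment exhausted')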
2.2 Message Reception

At the receiving entity, when CMTP obtains a datagram, it takes one of the
following actions, depending upon the outcome of the message validation
checks:

1. The datagram passes the CMTP validation checks. CMTP then delivers the
   datagram, with enclosed IDPR control message, to the appropriate IDPR
   protocol, which in turn applies its own integrity checks to the
   control message before acting on the contents. The recipient IDPR
   protocol, except in one case,(2) directs CMTP to generate an ack and
   return the ack to the sender. In addition, the IDPR protocol may pass
   control information to CMTP for inclusion in the ack, depending on the
   contents of the original control message. For example, a route server
   unable to fill a request for routing information may inform the
   requesting IDPR entity to place its request elsewhere, through an ack
   for the initial request.

2. The datagram fails at least one of the CMTP validation checks. CMTP
   then generates a nak, returns the nak to the sender, and discards the
   datagram, regardless of the type of IDPR control message contained in
   the datagram. The nak indicates the nature of the validation failure
   and serves to help the sender establish communication with the
   recipient. In particular, the CMTP nak provides a mechanism for
   negotiation of IDPR version and integrity/authentication scheme, two
   parameters crucial for establishing communication between IDPR
   entities.

____________________________
(2) The up/down protocol (see section 3.2) determines reachability of
adjacent policy gateways and does not use CMTP ack messages to notify the
sender of message reception. Instead, the protocol messages themselves
carry implicit information about message reception at the adjacent policy
gateway.

Upon receiving an ack or a nak, CMTP immediately discards the message if
at least one of the validation checks fails or if it is unable to locate
the associated datagram. CMTP logs the latter event for network
management. Otherwise, if all of the validation checks pass and if it is
able to locate the associated datagram, CMTP clears the associated
retransmission timer and then takes one of the following actions,
depending upon the message type:

1. The message is an ack. CMTP discards the associated datagram and
   delivers the ack, which may contain IDPR control information, to the
   appropriate IDPR protocol.

2. The message is a nak. If the associated datagram has exhausted its
   transmission allotment, CMTP discards the datagram, informs the
   appropriate IDPR protocol that the control message transmission was
   not successful, and logs the event for network management. Otherwise,
   if the associated datagram has not yet exhausted its transmission
   allotment, CMTP first checks its copy of the datagram against the
   failure indication contained in the nak. If its datagram copy appears
   to be intact, CMTP retransmits the datagram and sets the associated
   retransmission timer. However, if its datagram copy appears to be
   corrupted, CMTP discards the datagram, informs the IDPR protocol that
   the control message transmission was not successful, and logs the
   event for network management.

2.3 Message Validation

On every CMTP message received, CMTP performs a set of validation checks
to test message integrity and authenticity. The order in which these tests
are executed is important. CMTP must first determine if it can parse
enough of the message to compute the integrity/authentication value.
(Refer to section 2.4 for a description of CMTP message formats.) Then,
CMTP must immediately compute the integrity/authentication value before
checking other header information. An incorrect integrity/authentication
value means that the message is corrupted, and so it is likely that CMTP
header information is incorrect.
Checking specific header fields before computing the 1096 integrity/authentication value not only may waste time and resources, but 1097 also may lead to incorrect diagnoses of a validation failure. 1099 The CMTP validation checks are as follows: 1101 1. CMTP verifies that it can recognize both the control message version 1102 and type contained in the header. Failure to recognize either one of 1103 these values means that CMTP cannot continue to parse the message. 1105 20 1106 2. CMTP verifies that it can recognize and accept the 1107 integrity/authentication type contained in the header; no 1108 integrity/authentication is not an acceptable type for CMTP. 1110 3. CMTP computes the integrity/authentication value and verifies that it 1111 equals the integrity/authentication value contained in the header. For 1112 key-based integrity/authentication schemes, CMTP may use the source 1113 domain identifier contained in the CMTP header to index the correct 1114 key. Failure to index a key means that CMTP cannot compute the 1115 integrity/authentication value. 1117 4. CMTP computes the message length in bytes and verifies that it equals 1118 the length value contained in the header. 1120 5. CMTP verifies that the message timestamp is in the acceptable range. 1121 The message should be no more recent than cmtp_new (5) minutes ahead of 1122 the entity's current internal clock time.(3) The cmtp_new value allows 1123 some clock drift between IDPR entities. Moreover, each IDPR protocol 1124 has its own limit on the maximum age of its control messages. The 1125 message should be no less recent than a prescribed number of minutes 1126 behind the entity's current internal clock time. Hence, each IDPR 1127 protocol performs its own message timestamp check in addition to that 1128 performed by CMTP. 1130 6. CMTP verifies that it can recognize the IDPR protocol designated for 1131 the enclosed control message. 1133 Whenever CMTP encounters a failure while performing any of these validation 1134 checks, it logs the event for network management. If the failure occurs on 1135 a datagram, CMTP immediately generates a nak containing the reason for the 1136 failure, returns the nak to the sender, and discards the datagram message. 1137 If the failure occurs on an ack or a nak, CMTP discards the ack or nak 1138 message. 1140 2.4 CMTP Message Formats 1142 In designing the format of IDPR control messages, we have attempted to 1143 strike a balance between efficiency of link bandwidth usage and efficiency 1144 of message processing. In general, we have chosen compact representations 1145 ____________________________ 1146 (3)In this document, when we present an IDPR system configuration 1147 parameter, such as cmtp_new, we usually follow it with a recommended value 1148 in parentheses. 1150 21 1151 for IDPR information in order to minimize the link bandwidth consumed by 1152 IDPR-specific information. However, we have also organized IDPR information 1153 in order to speed message processing, which does not always result in 1154 minimum link bandwidth usage. 1156 To limit link bandwidth usage, we currently use fixed-length identifier 1157 fields in IDPR messages; domains, virtual gateways, policy gateways, and 1158 route servers are all represented by fixed-length identifiers. To simplify 1159 message processing, we currently align fields containing an even number of 1160 bytes on even-byte boundaries within a message. 
In the future, if the 1161 Internet adopts the use of super domains, we will offer hierarchical, 1162 variable-length identifier fields in an updated version of IDPR. 1164 The header of each CMTP message contains the following information: 1166 0_________8________16________24_____31__ 1167 |_VERSION_|_PRT_MSG_|_DPR_DMS_|I/A_TYP_| 1168 |_____SOURCE_AD_____|___SOURCE_ENT_____| 1169 |______________TRANS_ID________________| 1170 |______________TIMESTAMP_______________| 1171 |______LENGTH_______|_message_specific_| 1172 |____DATAGRAM_AD____|___DATAGRAM_ENT___| 1173 |_______________INFORM_________________| 1174 | INT/AUTH | 1175 |______________________________________| 1177 VERSION (8 bits) Version number for IDPR control messages, currently equal 1178 to 1. 1180 PRT (4 bits) Numeric identifier for the control message transport protocol, 1181 equal to 0 for CMTP. 1183 MSG (4 bits) Numeric identifier for the CMTP message type, equal to 0 for a 1184 datagram, 1 for an ack, and 2 for a nak. 1186 DPR(4 bits) Numeric identifier for the original datagram's IDPR protocol 1187 type. 1189 DMS(4 bits) Numeric identifier for the original datagram's IDPR message 1190 type. 1192 I/A TYP (8 bits) Numeric identifier for the integrity/authentication scheme 1193 used. CMTP requires the use of an integrity/authentication scheme; 1194 this value must not be set equal to 0, indicating no 1195 integrity/authentication in use. 1197 22 1198 SOURCE AD (16 bits) Numeric identifier for the domain containing the IDPR 1199 entity that generated the message. 1201 SOURCE ENT (16 bits) Numeric identifier for the IDPR entity that generated 1202 the message. 1204 TRANSACTION ID (32 bits) Local transaction identifier assigned by the IDPR 1205 entity that generated the original datagram. 1207 TIMESTAMP (32 bits) Number of seconds elapsed since 1 January 1970 0:00 GMT. 1209 LENGTH (16 bits) Length of the entire IDPR control message, including the 1210 CMTP header, in bytes. 1212 message specific (16 bits) Dependent upon CMTP message type. 1214 For datagram and ack messages: 1215 RESERVED (16 bits) Reserved for future use and currently set equal 1216 to 0. 1218 For nak messages: 1219 ERR TYP (8 bits) Numeric identifier for the type of CMTP validation 1220 failure encountered. Validation failures include the following 1221 types: 1222 1. Unrecognized IDPR control message version number. 1224 2. Unrecognized CMTP message type. 1226 3. Unrecognized integrity/authentication type. 1228 4. Unacceptable integrity/authentication type. 1230 5. Unable to locate key using source domain. 1232 6. Incorrect integrity/authentication value. 1234 7. Incorrect message length. 1236 8. Message timestamp out of range. 1238 9. Unrecognized IDPR protocol designated for the enclosed 1239 control message. 1241 ERR INFO (8 bits) CMTP supplies the following additional 1243 23 1244 information for the designated types of validation failures: 1245 Type 1: Acceptable IDPR version number. 1246 Types 2 and 3: Acceptable integrity/authentication type. 1248 DATAGRAM AD (16 bits) Numeric identifier for the domain containing the IDPR 1249 entity that generated the original datagram. Present only in ack and 1250 nak messages. 1252 DATAGRAM ENT (16 bits) Numeric identifier for the IDPR entity that generated 1253 the original datagram. Present only in ack and nak messages. 1255 INFORM (optional, variable) Information to be interpreted by the IDPR 1256 protocol that issued the original datagram. 
Present only in ack 1257 messages and dependent on the original datagram's IDPR protocol type. 1259 INT/AUTH (variable) Computed integrity/authentication value, dependent on 1260 type of integrity/authentication scheme used. 1262 24 1263 3 Virtual Gateway Protocol 1265 Every policy gateway within a domain participates in gathering 1266 information about connectivity within and between virtual gateways of which 1267 it is a member and in distributing this information to other virtual 1268 gateways in its domain. We refer to these functions collectively as the 1269 virtual gateway protocol (VGP). 1271 The information collected through VGP has both local and global 1272 significance for IDPR. Virtual gateway connectivity information, distributed 1273 to policy gateways within a single domain, aids those policy gateways in 1274 selecting routes across and between virtual gateways connecting their domain 1275 to adjacent domains. Inter-domain connectivity information, distributed 1276 throughout the Internet in routing information messages, aids route servers 1277 in constructing feasible policy routes. 1279 Provided that a domain contains simple virtual gateway and transit policy 1280 configurations, one need only implement a small subset of the VGP functions. 1281 The connectivity among policy gateways within a virtual gateway and the 1282 heterogeneity of transit policies within a domain determine which VGP 1283 functions must be implemented, as we explain toward the end of this section. 1285 3.1 Message Scope 1287 Policy gateways generate VGP messages containing information about 1288 perceived changes in virtual gateway connectivity and distribute these 1289 messages to other policy gateways within the same domain and within the same 1290 virtual gateway. We classify VGP messages into three distinct categories: 1291 pair-PG, intra-VG, and inter-VG, depending upon the scope of message 1292 distribution. 1294 Policy gateways use CMTP for reliable transport of VGP messages. The 1295 issuing policy gateway must communicate to CMTP the maximum number of 1296 transmissions per VGP message, vgp_ret, and the interval between VGP message 1297 retransmissions, vgp_int microseconds. The recipient policy gateway must 1298 determine VGP message acceptability; conditions of acceptability depend on 1299 the type of VGP message, as we describe below. 1301 Policy gateways store, act upon, and in the case of inter-VG messages, 1302 forward the information contained in acceptable VGP messages. VGP messages 1303 that pass the CMTP validation checks but fail a specific VGP message 1304 acceptability check are considered to be unacceptable and are hence 1305 discarded by recipient policy gateways. A policy gateway that receives an 1307 25 1308 unacceptable VGP message also logs the event for network management. 1310 3.1.1 Pair-PG Messages 1312 Pair-PG message communication occurs between the two members of a pair of 1313 adjacent, peer, or neighbor policy gateways. With IDPR, the only pair-PG 1314 messages are those periodically generated by the up/down protocol and used 1315 to monitor mutual reachability between policy gateways. 1317 A pair-PG message is acceptable if: 1319 1. It passes the CMTP validation checks. 1321 2. Its timestamp is less than vgp_old (300) seconds behind the recipient's 1322 internal clock time. 1324 3. Its destination policy gateway identifier coincides with the identifier 1325 of the recipient policy gateway. 1327 4. 
Its source policy gateway identifier coincides with the identifier of a 1328 policy gateway configured for the recipient's domain or associated 1329 virtual gateway. 1331 3.1.2 Intra-VG Messages 1333 Intra-VG message communication occurs between one policy gateway and all of 1334 its peers. Whenever a policy gateway discovers that its connectivity to an 1335 adjacent or neighbor policy gateway has changed, it issues an intra-VG 1336 message indicating the connectivity change to all of its reachable peers. 1337 Whenever a policy gateway detects that a previously unreachable peer is now 1338 reachable, it issues, to that peer, intra-VG messages indicating 1339 connectivity to adjacent and neighbor policy gateways. If the issuing 1340 policy gateway fails to receive an analogous intra-VG message from the newly 1341 reachable peer within twice the configured VGP retransmission interval, 1342 vgp_int microseconds, it actively requests the intra-VG message from that 1343 peer. These message exchanges ensure that peers maintain a consistent view 1344 of each other's connectivity to adjacent and neighbor policy gateways. 1346 An intra-VG message is acceptable if: 1348 1. It passes the CMTP validation checks. 1350 26 1351 2. Its timestamp is less than vgp_old (300) seconds behind the recipient's 1352 internal clock time. 1354 3. Its virtual gateway identifier coincides with that of a virtual gateway 1355 configured for the recipient's domain. 1357 3.1.3 Inter-VG Messages 1359 Inter-VG message communication occurs between one policy gateway and all of 1360 its neighbors. Whenever the lowest-numbered operational policy gateway in a 1361 set of mutually reachable peers discovers that its virtual gateway's 1362 connectivity to the adjacent domain or to another virtual gateway has 1363 changed, it issues an inter-VG message indicating the connectivity change to 1364 all of its neighbors. Specifically, the policy gateway distributes an 1365 inter-VG message to a VG-representative policy gateway (see section 3.1.4 1366 below) in each virtual gateway in the domain. Each VG representative in 1367 turn propagates the inter-VG message to each of its peers. 1369 Whenever the lowest-numbered operational policy gateway in a set of 1370 mutually reachable peers detects that one or more previously unreachable peers are now 1371 reachable, it issues, to the lowest-numbered operational policy gateway in 1372 all other virtual gateways, requests for inter-VG information indicating 1373 connectivity to adjacent domains and to other virtual gateways. The 1374 recipient policy gateways return the requested inter-VG messages to the 1375 issuing policy gateway, which in turn distributes the messages to the newly 1376 reachable peers. These message exchanges ensure that virtual gateways 1377 maintain a consistent view of each other's connectivity, while consuming 1378 minimal domain resources in distributing connectivity information. 1380 An inter-VG message contains information about the entire virtual 1381 gateway, not just about the issuing policy gateway. Thus, when virtual 1382 gateway connectivity changes happen in rapid succession, recipients of the 1383 resultant inter-VG messages should be able to determine the most recent 1384 message, and that message must contain the current virtual gateway 1385 connectivity information.
To ensure that the connectivity information 1386 distributed is consistent and unambiguous, we designate a single policy 1387 gateway, namely the lowest-numbered operational peer, for generating and 1388 distributing inter-VG messages. It is a simple procedure for a set of 1389 mutually reachable peers to determine the lowest-numbered member. 1391 To understand why a single member of a virtual gateway must issue 1392 inter-VG messages, consider the following example. Suppose that two peers 1394 27 1395 in a virtual gateway each detect a different connectivity change and 1396 generate a separate inter-VG message. Recipients may not be able to 1397 determine which message is more recent, as policy gateway internal clocks 1398 may not be synchronized to the necessary granularity. Moreover, even if the 1399 clocks were synchronized so that recipients could determine message recency, 1400 it is possible for each peer to issue its inter-VG message before receiving 1401 current information from the other. As a result, neither inter-VG message 1402 contains the correct connectivity for the virtual gateway. However, these 1403 problems are eliminated if all inter-VG messages are generated by a single 1404 peer within a virtual gateway, in particular the lowest-numbered operational 1405 policy gateway. 1407 An inter-VG message is acceptable if: 1409 1. It passes the CMTP validation checks. 1411 2. Its timestamp is less than vgp_old (300) seconds behind the recipient's 1412 internal clock time. 1414 3. Its virtual gateway identifier coincides with that of a virtual gateway 1415 configured for the recipient's domain. 1417 4. Its source policy gateway identifier represents the lowest numbered 1418 operational member of the issuing virtual gateway, reachable from the 1419 recipient. 1421 Distribution of intra-VG messages among peers often triggers generation 1422 and distribution of inter-VG messages among virtual gateways. Usually, the 1423 lowest-numbered operational policy gateway in a virtual gateway generates 1424 and distributes an inter-VG message immediately after detecting a change in 1425 virtual gateway connectivity, through receipt or generation of an intra-VG 1426 message. However, if this policy gateway is also waiting for an intra-VG 1427 message from a newly reachable peer, it does not immediately generate and 1428 distribute the inter-VG message. 1430 Waiting for intra-VG messages enables the lowest-numbered operational 1431 policy gateway in a virtual gateway to gather the most recent connectivity 1432 information for inclusion in the inter-VG message. However, under unusual 1433 circumstances, the policy gateway may fail to receive an intra-VG message 1434 from a newly reachable peer, even after actively requesting such a message. 1435 To accommodate this case, VGP uses an upper bound of four times the 1436 configured retransmission interval, vgp_int microseconds, on the amount of 1437 time to wait before generating and distributing an inter-VG message, when 1438 receipt of an intra-VG message is pending. 1440 28 1441 3.1.4 VG Representatives 1443 When distributing an inter-VG message, the issuing policy gateway selects as 1444 recipients one neighbor, the VG representative, from each virtual gateway in 1445 the domain. To be selected as a VG representative, a policy gateway must be 1446 reachable from the issuing policy gateway via intra-domain routing. The 1447 issuing policy gateway gives preference to neighbors that are members of 1448 more than one virtual gateway. 
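This preference is straightforward to realize. A minimal sketch, with data
structures of our own choosing rather than anything prescribed by VGP, picks
one reachable representative per virtual gateway and favors the neighbor
belonging to the most virtual gateways:

    def select_vg_representatives(vgs, reachable, membership):
        # membership: {policy gateway: set of virtual gateways it
        # belongs to}; reachable: policy gateways reachable via
        # intra-domain routing (illustrative structures, names ours).
        reps = {}
        for vg in vgs:
            candidates = [pg for pg, vgset in membership.items()
                          if vg in vgset and pg in reachable]
            if candidates:
                # A neighbor in several virtual gateways can represent
                # all of them with a single message copy.
                reps[vg] = max(candidates,
                               key=lambda pg: len(membership[pg]))
        return reps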
Such a neighbor acts as a VG representative 1449 for all virtual gateways of which it is a member and restricts inter-VG 1450 message distribution as follows: any policy gateway that is a peer in more 1451 than one of the represented virtual gateways receives at most one copy of 1452 the inter-VG message. This message distribution strategy minimizes the 1453 number of message copies required for disseminating inter-VG information. 1455 3.2 Up/Down Protocol 1457 Directly-connected adjacent policy gateways execute the up/down protocol 1458 to determine mutual reachability. Pairs of peer or neighbor policy gateways 1459 can determine mutual reachability through information provided by the 1460 intra-domain routing procedure or through execution of the up/down protocol. 1461 In general, we do not recommend implementing the up/down protocol between 1462 each pair of policy gateways in a domain, as it results in O(n^2) 1463 communications complexity, where n is the number of policy gateways within 1464 the domain. However, if the intra-domain routing procedure is slow to detect 1465 connectivity changes or is unable to report reachability at the IDPR entity 1466 level, the reachability information obtained through the up/down protocol 1467 may well be worth the extra communications cost. In the remainder of this 1468 section, we describe the up/down protocol from the perspective of adjacent 1469 policy gateways, but we note that the identical protocol can be applied to 1470 peer and neighbor policy gateways as well. 1472 The up/down protocol determines whether the direct connection between 1473 adjacent policy gateways is acceptable for data traffic transport. A direct 1474 connection is presumed to be down (unacceptable for data traffic transport) 1475 until the up/down protocol declares it to be up (acceptable for data traffic 1476 transport). We say that a virtual gateway is up if there exists at least 1477 one pair of adjacent policy gateways whose direct connection is acceptable 1478 for data traffic transport, and that a virtual gateway is down if there 1479 exists no such pair of adjacent policy gateways. 1481 When executing the up/down protocol, policy gateways exchange up/down 1482 messages every ud_per (1) second. All policy gateways use the same default 1484 29 1485 period of ud_per initially and then negotiate a preferred period through 1486 exchange of up/down messages. A policy gateway reports its desired value 1487 for ud_per within its up/down messages. It then chooses the larger of its 1488 desired value and that of the adjacent policy gateway as the period for 1489 exchanging subsequent up/down messages. Policy gateways also exchange, in 1490 up/down messages, information about the identity of their respective domain 1491 components. This information assists the policy gateways in selecting 1492 routes across virtual gateways to partitioned domains. 1494 Each up/down message is transported using CMTP and hence is covered by 1495 the CMTP validation checks. However, unlike other IDPR control messages, 1496 up/down messages do not require reliable transport. Specifically, the 1497 up/down protocol requires only a single transmission per up/down message and 1498 never directs CMTP to return an ack. As pair-PG messages, up/down messages 1499 are acceptable under the conditions described in section 3.1.1.
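For instance, the period negotiation described above reduces to each side
taking the maximum of the two reported values, so that both adjacent policy
gateways converge on the same exchange period. A sketch, with names of our
own:

    UD_PER_DEFAULT = 1  # initial default up/down period, in seconds

    def negotiate_ud_per(my_desired, peer_reported):
        # Each policy gateway adopts the larger of its own desired
        # period and the one reported by the adjacent policy gateway,
        # so both sides compute the same result.
        return max(my_desired, peer_reported)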
1501 Each policy gateway assesses the state of its direct connection, to the 1502 adjacent policy gateway, by counting the number of acceptable up/down 1503 messages received within a set of consecutive periods. A policy gateway 1504 communicates its perception of the state of the direct connection through 1505 its up/down messages. Initially, a policy gateway indicates the down state 1506 in each of its up/down messages. Only when the direct connection appears to 1507 be up from its perspective does a policy gateway indicate the up state in 1508 its up/down messages. 1510 A policy gateway can begin to transport data traffic over a direct 1511 connection only after both of the following conditions are satisfied: 1513 1. The policy gateway receives from the adjacent policy gateway at least j 1514 acceptable up/down messages within the last m consecutive periods. 1515 From the recipient policy gateway's perspective, this event constitutes 1516 a state transition of the direct connection from down to up. Hence, 1517 the policy gateway indicates the up state in its subsequent up/down 1518 messages. 1520 2. The up/down message most recently received from the adjacent policy 1521 gateway indicates the up state, signifying that the adjacent policy 1522 gateway considers the direct connection to be up as well. 1524 A policy gateway must cease to transport data traffic over a direct 1525 connection whenever either of the following conditions is satisfied: 1527 1. The policy gateway receives from the adjacent policy gateway at most k 1529 30 1530 acceptable up/down messages within the last n consecutive periods. 1532 2. The up/down message most recently received from the adjacent policy 1533 gateway indicates the down state, signifying that the adjacent policy 1534 gateway considers the direct connection to be down. 1536 From the recipient policy gateway's perspective, either of these events 1537 constitutes a state transition of the direct connection from up to down. 1538 Hence, the policy gateway indicates the down state in its subsequent up/down 1539 messages. 1541 3.2.1 Implementation 1543 We recommend implementing the up/down protocol using a sliding window. Each 1544 window slot indicates the up/down message activity during a given period, 1545 containing either a hit for receipt of an acceptable up/down message or a 1546 miss for failure to receive an acceptable up/down message, within the given 1547 period. In addition to the sliding window, the implementation should 1548 include a tally of hits recorded during the current period and a tally of 1549 misses recorded over the current window. 1551 When the direct connection moves to the down state, the initial values of 1552 the up/down protocol parameters must be set as follows: 1554 o The sliding window size is equal to m. 1556 o Each window slot contains a miss. 1558 o The current period hit tally is equal to 0. 1560 o The current window miss tally is equal to m. 1562 When the direct connection moves to the up state, the initial values of 1563 the up/down protocol parameters must be set as follows: 1565 o The sliding window size is equal to n. 1567 o Each window slot contains a hit. 1569 o The current period hit tally is equal to 0. 1571 o The current window miss tally is equal to 0. 1573 At the conclusion of each period, a policy gateway computes the miss 1575 31 1576 tally and determines whether there has been a state transition of the direct 1577 connection to the adjacent policy gateway. 
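These initialization rules map directly onto a small data structure. A
sketch of the recommended sliding-window state, with class and field names
of our own:

    class UpDownWindow:
        def to_down(self, m):
            # Transition to the down state: m slots, all misses.
            self.slots = ["miss"] * m
            self.period_hits = 0    # hit tally for the current period
            self.window_misses = m  # miss tally over the current window

        def to_up(self, n):
            # Transition to the up state: n slots, all hits.
            self.slots = ["hit"] * n
            self.period_hits = 0
            self.window_misses = 0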
In the down state, a miss tally 1578 of no more than m-j signals a transition to the up state. In the up 1579 state, a miss tally of no less than n-k signals a transition to the down 1580 state. 1582 Computing the correct miss tally involves several steps. First, the 1583 policy gateway prepares to slide the window by one slot so that the oldest 1584 slot disappears, making room for the newest slot. However, before sliding 1585 the window, the policy gateway checks the contents of the oldest window 1586 slot. If this slot contains a miss, the policy gateway decrements the miss 1587 tally by 1, as this slot is no longer part of the current window. 1589 After sliding the window, the policy gateway initially records a miss in 1590 the newest window slot and then determines what the proper slot contents 1591 should be. If the hit tally for the current period equals 0, a miss is the 1592 correct value for the newest slot, and so the policy gateway increments the 1593 miss tally by 1. Otherwise, if the hit tally for the current period is 1594 greater than 0, the policy gateway applies the hits to any slot containing a 1595 miss, beginning with the newest and progressing to the oldest such slot. 1596 For each such slot, the policy gateway records a hit in that slot and 1597 decrements the hit tally by 1. If the selected slot is not the newest slot, 1598 the hit cancels out an actual miss, and so the policy gateway decrements the 1599 miss tally by 1 as well. The policy gateway continues to apply each 1600 remaining hit tallied to any slot containing a miss, until either all such 1601 hits are exhausted or all such slots are accounted for. Before beginning 1602 the next up/down period, the policy gateway resets the hit tally to 0. 1604 Although we expect the hit tally, within any given period, to be no 1605 greater than 1, we do anticipate the occasional period in which a policy 1606 gateway receives more than one up/down message from an adjacent policy 1607 gateway. The most common reasons for this occurrence are message delay and 1608 clock drift. When an up/down message is delayed, the receiving policy 1609 gateway observes a miss in one period followed by two hits in the next 1610 period, one of which cancels the previous miss. However, excess hits 1611 remaining in the tally after miss cancellation indicate a problem, such as 1612 clock drift. Thus, whenever a policy gateway accumulates excess hits, it 1613 logs the event for network management. 1615 When clock drift occurs between two adjacent policy gateways, it causes 1616 the period of one policy gateway to grow with respect to the period of the 1617 other policy gateway. Let pX be the period for PG X, let pY be the period 1618 for PG Y, and let g and h be the smallest positive integers such that 1620 32 1621 g*pX = h*pY (that is, g periods of PG X span the same interval as h periods of PG Y). Suppose that pX < pY because of clock drift. In this case, PG 1622 X observes g-h misses in g consecutive periods, while PG Y observes g-h 1623 surplus hits in h consecutive periods. As long as (g-h)/g < (n-k)/n and 1624 (g-h)/g <= (m-j)/m, the clock drift itself will not cause the direct 1625 connection to enter or remain in the down state. 1627 3.3 Policy Gateway Connectivity 1629 Policy gateways collect connectivity information through the intra-domain 1630 routing procedure and through VGP, and they distribute connectivity changes 1631 through VGP in both intra-VG messages to peers and inter-VG messages to 1632 neighbors.
Locally, this connectivity information assists policy gateways 1633 in selecting routes, not only across a virtual gateway to an adjacent domain 1634 but also across a domain between two virtual gateways. Moreover, changes in 1635 connectivity between domains are distributed, in routing information 1636 messages, to route servers throughout the Internet. 1638 3.3.1 Within a Virtual Gateway 1640 Each policy gateway within a virtual gateway constantly monitors its 1641 connectivity to all adjacent and to all peer policy gateways. To determine 1642 the state of its direct connection to an adjacent policy gateway, a policy 1643 gateway uses reachability information supplied by the up/down protocol. To 1644 determine the state of its intra-domain routes to a peer policy gateway, a 1645 policy gateway uses reachability information supplied by either the 1646 intra-domain routing procedure or the up/down protocol. 1648 When a policy gateway detects a change, in state or adjacent domain 1649 component, associated with its direct connection to an adjacent policy 1650 gateway, or when a policy gateway detects that a previously unreachable peer 1651 is now reachable, it generates a PG connect message. In the first case, it 1652 distributes a copy to each peer reachable via intra-domain routing, and in 1653 the second case, it distributes a copy to the newly reachable peer. A PG 1654 connect message is an intra-VG message that includes information about each 1655 adjacent policy gateway directly connected to the issuing policy gateway. 1656 Specifically, the PG connect message contains the adjacent policy gateway's 1657 identifier, status (reachable or unreachable), and domain component 1658 identifier. If a PG connect message contains a request, each peer that 1659 receives the message responds to the sender with its own PG connect message. 1661 All mutually reachable peers monitor policy gateway connectivity within 1663 33 1664 their virtual gateway, through the up/down protocol, the intra-domain 1665 routing procedure, and the exchange of PG connect messages. Within a given 1666 virtual gateway, each constituent policy gateway maintains the following 1667 information about each configured adjacent policy gateway: 1669 1. The identifier for the adjacent policy gateway. 1671 2. The status of the adjacent policy gateway: reachable/unreachable, 1672 directly connected/not directly connected. 1674 3. The local exit interfaces used to reach the adjacent policy gateway, 1675 provided it is reachable. 1677 4. The identifier for the adjacent policy gateway's domain component. 1679 5. The set of peers to which the adjacent policy gateway is 1680 directly-connected. 1682 Hence, all mutually reachable peers can detect changes in connectivity 1683 across the virtual gateway to adjacent domain components. 1685 When the lowest-numbered operational policy gateway within a virtual 1686 gateway detects a change in the set of adjacent domain components reachable 1687 through direct connections across the given virtual gateway, it generates a 1688 VG connect message and distributes a copy to a VG representative in all 1689 other virtual gateways connected to its domain. A VG connect message is an 1690 inter-VG message that includes information about each peer's connectivity 1691 across the given virtual gateway. Specifically, the VG connect message 1692 contains, for each peer, its identifier and the identifiers of the domain 1693 components reachable through its direct connections to adjacent policy 1694 gateways. 
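Given that per-peer information, a recipient can derive the state of the
issuing virtual gateway with a single test, for example (representation
ours):

    def vg_is_up(peer_components):
        # peer_components: {peer: set of adjacent domain components
        # reachable through that peer's direct connections}, as carried
        # in a VG connect message (representation ours).
        # The virtual gateway is up if at least one peer retains a
        # direct connection to some adjacent domain component.
        return any(peer_components.values())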
Moreover, the VG connect message gives each recipient enough 1695 information to determine the state, up or down, of the issuing virtual 1696 gateway. 1698 The issuing policy gateway, namely the lowest-numbered operational peer, 1699 may have to wait up to four times vgp_int microseconds after detecting the 1700 connectivity change, before generating and distributing the VG connect 1701 message, as described in section 3.1.3. Each recipient VG representative in 1702 turn distributes a copy of the VG connect message to each of its peers 1703 reachable via intra-domain routing. If a VG connect message contains a 1704 request, then in each recipient virtual gateway, the lowest-numbered 1705 operational peer that receives the message responds to the original sender 1706 with its own VG connect message. 1708 34 1709 3.3.2 Between Virtual Gateways 1711 At present, we expect transit policies to be uniform over all intra-domain 1712 routes between any pair of policy gateways within a domain. However, when 1713 tariffed qualities of service become prevalent offerings for intra-domain 1714 routing, we can no longer expect uniformity of transit policies throughout a 1715 domain. To monitor the transit policies supported on intra-domain routes 1716 between virtual gateways requires both a policy-sensitive intra-domain 1717 routing procedure and a VGP exchange of policy information between neighbor 1718 policy gateways. 1720 Each policy gateway within a domain constantly monitors its connectivity 1721 to all peer and neighbor policy gateways, including the transit policies 1722 supported on intra-domain routes to these policy gateways. To determine the 1723 state of its intra-domain connection to a peer or neighbor policy gateway, a 1724 policy gateway uses reachability information supplied by either the 1725 intra-domain routing procedure or the up/down protocol. To determine the 1726 transit policies supported on intra-domain routes to a peer or neighbor 1727 policy gateway, a policy gateway uses policy-sensitive reachability 1728 information supplied by the intra-domain routing procedure. We note that 1729 when transit policies are uniform over a domain, reachability and 1730 policy-sensitive reachability are equivalent. 1732 Within a virtual gateway, each constituent policy gateway maintains the 1733 following information about each configured peer and neighbor policy 1734 gateway: 1736 1. The identifier for the peer or neighbor policy gateway. 1738 2. The identifiers corresponding to the transit policies configured to be 1739 supported by intra-domain routes to the peer or neighbor policy 1740 gateway. 1742 3. For each transit policy, the status of the peer or neighbor policy 1743 gateway: reachable/unreachable. 1745 4. For each transit policy, the local exit interfaces used to reach the 1746 peer or neighbor policy gateway, provided it is reachable. 1748 5. The identifiers for the adjacent domain components reachable through 1749 direct connections from the peer or neighbor policy gateway, obtained 1750 through VG connect messages. 1752 Using this information, a policy gateway can detect changes in its 1754 35 1755 connectivity to a neighboring domain component, with respect to a given 1756 transit policy and through a given neighbor. 
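The state enumerated above might be kept in a table keyed by policy gateway
and transit policy; an illustrative entry, with field names of our own:

    neighbor_state = {
        ("PG7", "TP2"): {
            "reachable": True,                # via routes of TP2
            "exit_interfaces": ["if1"],       # local exits toward PG7
            "adjacent_components": {"CMP4"},  # from VG connect messages
        },
    }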
Moreover, combining the 1757 information obtained for all neighbors within a given virtual gateway, the 1758 policy gateway can detect changes in its connectivity, with respect to a 1759 given transit policy, to another virtual gateway and to adjacent domain 1760 components reachable through that virtual gateway. 1762 Policy gateways that are mutually reachable via intra-domain routes supporting 1763 a configured transit policy need not exchange information about perceived 1764 changes in connectivity with respect to the given transit policy. In this 1765 case, each policy gateway can infer another's policy-sensitive reachability 1766 to a third, using the intra-domain reachability information about mutual peers 1767 provided by the intra-domain routing procedure. However, whenever two or 1768 more policy gateways are no longer mutually reachable with respect to a 1769 given transit policy, these policy gateways can no longer infer each other's 1770 reachability to other policy gateways, with respect to that transit policy. 1771 In this case, these policy gateways must exchange explicit information about 1772 changes in connectivity to other policy gateways, with respect to that 1773 transit policy. 1775 When a policy gateway detects a change in its connectivity to another 1776 virtual gateway, with respect to a configured transit policy, or to an 1777 adjacent domain component reachable through that virtual gateway, or when a 1778 policy gateway detects that a previously unreachable peer is now reachable, 1779 it generates a PG policy message. In the first case, it distributes a copy 1780 to each peer reachable via intra-domain routing but not currently reachable 1781 via any intra-domain routes of the given transit policy, and in the second 1782 case, it distributes a copy to the newly reachable peer. A PG policy 1783 message is an intra-VG message that includes information about each 1784 configured transit policy and each virtual gateway configured to be 1785 reachable from the issuing policy gateway via intra-domain routes of the 1786 given transit policy. Specifically, the PG policy message contains, for 1787 each configured transit policy: 1789 1. The identifier of the transit policy. 1791 2. The identifiers of the virtual gateways associated with the given 1792 transit policy and currently reachable, with respect to that transit 1793 policy, from the issuing policy gateway. 1795 3. The identifiers of the domain components reachable from and adjacent to 1796 the members of the given virtual gateways. 1798 If a PG policy message contains a request, each peer that receives the 1800 36 1801 message responds to the original sender with its own PG policy message. 1803 In addition to connectivity between itself and its neighbors, each policy 1804 gateway also monitors the connectivity between domain components adjacent 1805 to its virtual gateway and domain components adjacent to other virtual 1806 gateways, through its domain and with respect to the configured transit 1807 policies. For each member of each of its virtual gateways, a policy gateway 1808 monitors: 1810 1. The set of adjacent domain components currently reachable through 1811 direct connections across the given virtual gateway. The policy 1812 gateway obtains this information through PG connect messages from 1813 reachable peers and through up/down messages from adjacent policy 1814 gateways. 1816 2.
For each configured transit policy, the set of virtual gateways 1817 currently reachable from the given virtual gateway with respect to that 1818 transit policy and the set of neighboring domain components currently 1819 reachable through direct connections across those virtual gateways. 1820 The policy gateway obtains this information through PG policy messages 1821 from peers, VG connect messages from neighbors, and the intra-domain 1822 routing procedure. Using this information, a policy gateway can detect 1823 connectivity changes, through its domain and with respect to a given 1824 transit policy, between neighboring domain components. 1826 When the lowest-numbered operational policy gateway within a virtual 1827 gateway detects a change in the connectivity between a domain component 1828 adjacent to its virtual gateway and a domain component adjacent to another 1829 virtual gateway in its domain, with respect to a configured transit policy, 1830 it generates a VG policy message and distributes a copy to a VG 1831 representative in selected virtual gateways connected to its domain. In 1832 particular, the lowest-numbered operational policy gateway distributes a VG 1833 policy message to a VG representative in every other virtual gateway 1834 containing a member reachable via intra-domain routing but not currently 1835 reachable via any routes of the given transit policy. A VG policy message 1836 is an inter-VG message that includes information about the connectivity 1837 between domain components adjacent to the issuing virtual gateway and domain 1838 components adjacent to the other virtual gateways in the domain, with 1839 respect to configured transit policies. Specifically, the VG policy message 1840 contains, for each transit policy: 1842 1. The identifier of the transit policy. 1844 2. The identifiers of the virtual gateways associated with the given 1846 37 1847 transit policy and currently reachable, with respect to that transit 1848 policy, from the issuing virtual gateway. 1850 3. The identifiers of the domain components reachable from and adjacent to 1851 the members of the given virtual gateways. 1853 The issuing policy gateway, namely the lowest-numbered operational peer, 1854 may have to wait up to four times vgp_int microseconds after detecting the 1855 connectivity change, before generating and distributing the VG policy 1856 message, as described in section 3.1.3. Each recipient VG representative in 1857 turn distributes a copy of the VG policy message to each of its peers 1858 reachable via intra-domain routing. If a VG policy message contains a 1859 request, then in each recipient virtual gateway, the lowest-numbered 1860 operational peer that receives the message responds to the original sender 1861 with its own VG policy message. 1863 3.3.3 Communication Complexity 1865 We offer an example, to provide an estimate of the number of VGP messages 1866 exchanged within a domain, AD X, after a detected change in policy gateway 1867 connectivity. Suppose that an adjacent domain, AD Y, partitions such that 1868 the partition is detectable through the exchange of up/down messages across 1869 a virtual gateway connecting AD X and AD Y. Let V be the number of 1870 virtual gateways in AD X, and let P be the number of peer policy gateways 1871 within each virtual gateway. Within AD X, the detected partition will 1872 result in the following VGP message exchanges: 1874 1. P-1 policy gateways each receive one PG connect message. 
The policy 1875 gateway detecting the adjacent domain partition generates a PG connect 1876 message and distributes it to each peer in the virtual gateway. 1878 2. P(V-1) policy gateways each receive one VG connect message. The 1879 lowest-numbered operational policy gateway in the virtual gateway 1880 detecting the partition of the adjacent domain generates a VG connect 1881 message and distributes it to a VG representative in all other virtual 1882 gateways connected to the domain. In turn, each VG representative 1883 distributes the VG connect message to each peer within its virtual 1884 gateway. 1886 3. P(V-1) policy gateways each receive at most P-1 PG policy messages, 1887 and only if the domain has more than a single, uniform transit policy. 1888 Each policy gateway in each virtual gateway generates a PG policy 1890 38 1891 message and distributes it to all reachable peers not currently 1892 reachable with respect to the given transit policy. 1894 4. PV policy gateways each receive at most V-1 VG policy messages, and 1895 only if the domain has more than a single, uniform transit policy. The 1896 lowest-numbered operational policy gateway in each virtual gateway 1897 generates a VG policy message and distributes it to a VG representative 1898 in all other virtual gateways containing at least one reachable member 1899 not currently reachable with respect to the given transit policy. In 1900 turn, each VG representative distributes a VG policy message to each 1901 peer within its virtual gateway. 1903 3.4 VGP Message Formats 1905 The virtual gateway protocol number is equal to 0. We describe the 1906 contents of each type of VGP message below. 1908 3.4.1 Up/Down 1910 The up/down message type is equal to 0. 1912 0_________8________16________24_____31__ 1913 |_____SRC_CMP_______|_____DST_AD_______| 1914 |______DST_PG_______|_PERIOD__|_STATE__| 1916 SRC CMP (16 bits) Numeric identifier for the domain component containing the 1917 issuing policy gateway. 1919 DST AD (16 bits) Numeric identifier for the destination domain. 1921 DST PG (16 bits) Numeric identifier for the destination policy gateway. 1923 PERIOD (8 bits) Length in seconds of the up/down message generation period. 1925 STATE (8 bits) Perceived state (1 up, 0 down) of the direct connection in 1926 the direction from the destination policy gateway to the issuing policy 1927 gateway, contained in the right-most bit. 1929 39 1930 3.4.2 PG Connect 1932 The PG connect message type is equal to 1. PG connect messages are not 1933 required for any virtual gateway containing exactly two policy gateways. 1935 0_________8________16________24_____31__ 1936 |______ADJ_AD_______|___VG___|__RQST___| 1937 |_____NUM_RCH_______|___NUM_UNRCH______| 1938 |_______________________________________ 1939 For each reachable PG: |______ADJ_PG_______|_____ADJ_CMP______| 1940 |____________________ 1941 For each unreachable PG: |______ADJ_PG_______| 1943 ADJ AD(16 bits) Numeric identifier for the adjacent domain. 1945 VG (8 bits) Numeric identifier for the virtual gateway associated with the 1946 adjacent domain. 1948 RQST (8 bits) Request for a PG connect message (1 request, 0 no request) 1949 from each recipient peer, contained in the right-most bit. 1951 NUM RCH (16 bits) Number of adjacent policy gateways within the virtual 1952 gateway, which are directly-connected to and currently reachable from 1953 the issuing policy gateway. 
1955 NUM UNRCH (16 bits) Number of adjacent policy gateways within the virtual 1956 gateway, which are directly-connected to but not currently reachable 1957 from the issuing policy gateway. 1959 ADJ PG (16 bits) Numeric identifier for a directly-connected adjacent policy 1960 gateway. 1962 ADJ CMP (16 bits) Numeric identifier for the domain component containing the 1963 reachable, directly-connected adjacent policy gateway. 1965 3.4.3 PG Policy 1967 The PG policy message type is equal to 2. PG policy messages are not 1968 required for any virtual gateway containing exactly two policy gateways or 1969 for any domain with a single, uniform transit policy. 1971 40 1972 0_________8_________16________24_____31__ 1973 |_______ADJ_AD_______|___VG____|__RQST__| 1974 |_______NUM_TP_______| 1975 |________________________________________ 1976 For each TP: |_________TP_________|_____NUM_VG_______| 1977 |________________________________________ 1978 For each VG reachable via TP: |_______ADJ_AD_______|____VG___|_UNUSED_| 1979 |______NUM_CMP_______|______CMP_________| 1981 ADJ AD (16 bits) Numeric identifier for the adjacent domain. 1983 VG (8 bits) Numeric identifier for the virtual gateway associated with the 1984 adjacent domain. 1986 RQST (8 bits) Request for a PG policy message (1 request, 0 no request) from 1987 each recipient peer, contained in the right-most bit. 1989 NUM TP (16 bits) Number of transit policies configured to include the virtual 1990 gateway. 1992 TP (16 bits) Numeric identifier for a transit policy associated with the 1993 virtual gateway. 1995 NUM VG (16 bits) Number of virtual gateways reachable from the issuing 1996 policy gateway, via intra-domain routes supporting the transit policy. 1998 UNUSED (8 bits) Not currently used; must be set equal to 0. 2000 NUM CMP (16 bits) Number of adjacent domain components reachable via direct 2001 connections through the virtual gateway. 2003 CMP (16 bits) Numeric identifier for a reachable adjacent domain component. 2005 3.4.4 VG Connect 2007 The VG connect message type is equal to 3. 2009 0_________8________16________24_____31__ 2010 |______ADJ_AD_______|___VG____|__RQST__| 2011 |______NUM_PG_______| 2012 |_______________________________________ 2013 For each PG: |________PG_________|_____NUM_CMP______| 2014 |______ADJ_CMP______| 2016 41 2017 ADJ AD (16 bits) Numeric identifier for the adjacent domain. 2019 VG (8 bits) Numeric identifier for the virtual gateway associated with the 2020 adjacent domain. 2022 RQST (8 bits) Request for a VG connect message (1 request, 0 no request) 2023 from a recipient in all other virtual gateways, contained in the 2024 right-most bit. 2026 NUM PG (16 bits) Number of mutually-reachable peer policy gateways in the 2027 virtual gateway. 2029 PG (16 bits) Numeric identifier for a peer policy gateway. 2031 NUM CMP (16 bits) Number of components of the adjacent domain reachable via 2032 direct connections from the policy gateway. 2034 ADJ CMP (16 bits) Numeric identifier for a reachable adjacent domain 2035 component. 2037 3.4.5 VG Policy 2039 The VG policy message type is equal to 4. VG policy messages are not 2040 required for any domain with a single, uniform transit policy.
2042 0_________8________16________24_____31__ 2043 |______ADJ_AD_______|___VG____|__RQST__| 2044 |______NUM_TP_______| 2045 |_______________________________________ 2046 For each TP: |________TP_________|_____NUM_GRP______| 2047 |_______________________________________ 2048 For each VG Group reachable via TP: |______NUM_VG_______|_____ADJ_AD_______| 2049 |___VG____|_UNUSED__|_____NUM_CMP______| 2050 |________CMP________| 2052 ADJ AD (16 bits) Numeric identifier for the adjacent domain. 2054 VG (8 bits) Numeric identifier for a virtual gateway associated with the 2055 adjacent domain. 2057 RQST (8 bits) Request for a VG policy message (1 request, 0 no request) from 2058 a recipient in all other virtual gateways, contained in the right-most 2059 bit. 2061 NUM TP (16 bits) Number of transit policies configured to include the 2062 virtual gateway. 2064 42 2065 TP (16 bits) Numeric identifier for a transit policy associated with the 2066 virtual gateway. 2068 NUM GRP (16 bits) Number of groups of virtual gateways, such that all 2069 members in a group are reachable from the issuing virtual gateway via 2070 intra-domain routes supporting the given transit policy. 2072 NUM VG (16 bits) Number of virtual gateways in a virtual gateway group. 2074 UNUSED (8 bits) Not currently used; must be set equal to 0. 2076 NUM CMP (16 bits) Number of adjacent domain components reachable via direct 2077 connections through the virtual gateway. 2079 CMP (16 bits) Numeric identifier for a reachable adjacent domain component. 2081 Normally, each VG policy message will contain a single virtual gateway 2082 group. However, if the issuing virtual gateway becomes partitioned such 2083 that peers are mutually reachable with respect to some transit policies but 2084 not others, virtual gateway groups may be necessary. For example, let 2085 PG X and PG Y be two peers in VG1. Suppose that PG X and PG Y are 2086 reachable with respect to transit policy A but not with respect to transit 2087 policy B. Furthermore, suppose that PG X can reach members of VG2 via 2088 intra-domain routes of transit policy B and that PG Y can reach members of 2089 VG3 via intra-domain routes of transit policy B. Then the entry in the VG 2090 policy message issued by VG1 will include, for transit policy B, two 2091 groups of virtual gateways, one containing VG2 and one containing VG3. 2093 3.4.6 Negative Acknowledgements 2095 When a policy gateway receives an unacceptable VGP message that passes the 2096 CMTP validation checks, it includes, in its CMTP ack, an appropriate VGP 2097 negative acknowledgement. This information is placed in the INFORM field of 2098 the CMTP ack (described in section 2.4); the numeric identifier for each 2099 type of VGP negative acknowledgement is contained in the left-most 8 bits of 2100 the INFORM field. Negative acknowledgements associated with VGP include the 2101 following types: 2103 1. Unrecognized VGP message type. Numeric identifier for the unrecognized 2104 message type (8 bits). 2106 2. Out-of-date VGP message. 2108 3. Unrecognized virtual gateway source. Numeric identifier for the 2109 unrecognized virtual gateway including adjacent administrative domain 2110 (16 bits) and local identifier (8 bits). 2112 43 2113 4 Routing Information Distribution 2115 Each domain participating in IDPR generates and distributes its routing 2116 information messages to route servers throughout the Internet. 
IDPR routing 2117 information messages contain information about the transit policies in 2118 effect across the given domain and the virtual gateway connectivity to 2119 adjacent domains. Route servers in turn use IDPR routing information to 2120 generate policy routes between source and destination domains. 2122 There are two different procedures for distributing IDPR routing 2123 information: the flooding protocol and the route server query protocol. 2124 With the flooding protocol, a representative policy gateway in each domain 2125 floods its routing information messages to all other domains. With the 2126 route server query protocol, a policy gateway or route server requests 2127 routing information from another route server, which in turn responds with 2128 routing information from its database. The route server query protocol can 2129 be used for quickly updating the routing information maintained by a policy 2130 gateway or route server that has just been connected or reconnected to the 2131 Internet. In this section, we describe the flooding protocol only; in 2132 section 5, we describe the route server query protocol. 2134 Policy gateways and route servers use CMTP for reliable transport of IDPR 2135 routing information messages flooded between peer, neighbor, and adjacent 2136 policy gateways and between those policy gateways and route servers. The 2137 issuing policy gateway must communicate to CMTP the maximum number of 2138 transmissions per routing information message, flood_ret, and the interval 2139 between routing information message retransmissions, flood_int microseconds. 2140 The recipient policy gateway or route server must determine routing 2141 information message acceptability, as we describe in section 4.2.3 below. 2143 4.1 AD Representatives 2145 We designate a single policy gateway, the AD representative, for 2146 generating and distributing IDPR routing information about its domain, to 2147 ensure that the routing information distributed is consistent and 2148 unambiguous and to minimize the communication required for routing 2149 information distribution. There is usually only a single AD representative 2150 per domain, namely the lowest-numbered operational policy gateway in the 2151 domain. Within a domain, policy gateways need no explicit election 2152 procedure to determine the AD representative. Instead, all members of a set 2153 of policy gateways mutually reachable via intra-domain routes can agree on 2154 set membership and therefore on which member has the lowest number. 2156 44 2157 A partitioned domain has as many AD representatives as it does domain 2158 components. In fact, the numeric identifier for an AD representative is 2159 identical to the numeric identifier for a domain component. One cannot 2160 normally predict when and where a domain partition will occur, and thus any 2161 policy gateway within a domain may become an AD representative at any time. 2162 To prepare for the role of AD representative in the event of a domain 2163 partition, every policy gateway must continually monitor its domain's IDPR 2164 routing information, through VGP and through the intra-domain routing 2165 procedure. 2167 4.2 Flooding Protocol 2169 An AD representative policy gateway uses unrestricted flooding among all 2170 domains to distribute its domain's IDPR routing information messages to 2171 route servers in the Internet. There are two kinds of IDPR routing 2172 information messages issued by each AD representative: configuration and 2173 dynamic messages. 
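The contents of the two kinds of messages are detailed below; roughly, they
might be modeled as follows, with field names of our own rather than the
wire format:

    configuration_msg = {
        "transit_policies": [
            {"id": "TP1",
             "specification": "...",  # configured by the administrator
             # virtual gateways configured as mutually reachable;
             # entry/exit indications omitted in this sketch
             "vg_groups": [["VG1", "VG2"]]},
        ],
        "route_servers": ["RS1"],  # available to clients in other domains
    }

    dynamic_msg = {
        "transit_policies": [
            {"id": "TP1",
             "vg_groups": [{"vgs": ["VG1", "VG2"],
                            "components": ["CMP3"]}]},
        ],
        "unavailable_vgs": ["VG5"],  # down or unreachable
    }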
Each configuration message contains the transit policy 2174 information configured by the domain administrator, including, for each 2175 transit policy, its identifier, its specification, and the set of virtual 2176 gateways configured as mutually reachable via intra-domain routes supporting 2177 the given transit policy. Each dynamic message contains information about 2178 current virtual gateway connectivity to adjacent domains and about which 2179 members of the sets of virtual gateways are at present mutually reachable 2180 via intra-domain routes supporting the configured transit policies. 2182 The IDPR flooding protocol is similar to the flooding procedures 2183 described in [8]-[10]. Through flooding, the AD representative distributes 2184 its routing information messages to route servers in its own domain and in 2185 adjacent domains. After generating a routing information message, the AD 2186 representative distributes a copy to each of its peers, to a selected VG 2187 representative (see section 3.1.4) in all other virtual gateways connected 2188 to the domain, and to each route server to which it has been configured to 2189 deliver routing information. We recommend that for each route server not 2190 contained within a policy gateway, the domain administrator should configure 2191 at least two distinct policy gateways to deliver routing information to that 2192 route server. Thus, the route server will continue to receive routing 2193 information messages, even when one of its associated policy gateways 2194 becomes unreachable; the route server will, however, normally receive 2195 duplicate copies of a routing information message. 2197 Each VG representative in turn distributes a copy of the routing 2198 information message to each member of its configured set of route servers 2200 45 2201 and to each of its peers. We note that distribution of routing information 2202 messages among virtual gateways and among peers within a virtual gateway is 2203 identical to distribution of inter-VG messages in VGP, as described in 2204 section 3.1.3. 2206 Within a virtual gateway, each policy gateway distributes a copy of the 2207 routing information message to each member of its configured set of route 2208 servers and to certain directly-connected adjacent policy gateways, selected 2209 as follows. Each policy gateway knows, through information provided by VGP, 2210 which peers have direct connections to which components of the adjacent 2211 domain. Only when it is the lowest-numbered operational peer with a direct 2212 connection to a given adjacent domain component does a policy gateway 2213 distribute a routing information message to a directly-connected adjacent 2214 policy gateway in that domain component. If the policy gateway has direct 2215 connections to more than one adjacent policy gateway in that domain 2216 component, it selects the routing information message recipient according to 2217 the order in which the adjacent policy gateways appear in its database, 2218 choosing the first encountered. This selection procedure ensures that a 2219 copy of the routing information message reaches each component of the 2220 adjacent domain, while limiting the number of copies distributed across the 2221 virtual gateway. 2223 Once a routing information message reaches an adjacent policy gateway, 2224 that policy gateway distributes copies of the message throughout its domain.
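The selection rule above amounts to a few comparisons. A sketch, with data
structures of our own, that returns the adjacent policy gateways to which a
given policy gateway forwards a routing information message:

    def forwarding_targets(my_id, direct_conns, adj_pgs):
        # direct_conns: {adjacent domain component: set of numeric peer
        # identifiers, my_id included, with direct connections to it};
        # adj_pgs: {component: adjacent PGs in database order}.
        targets = []
        for cmp_id, peers in direct_conns.items():
            # Forward into a component only when we are its
            # lowest-numbered directly-connected peer, choosing the
            # first adjacent policy gateway encountered in the database.
            if my_id == min(peers) and adj_pgs.get(cmp_id):
                targets.append(adj_pgs[cmp_id][0])
        return targets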
Once a routing information message reaches an adjacent policy gateway, that policy gateway distributes copies of the message throughout its domain. The adjacent policy gateway, acting as the first recipient of the routing information message in its domain, follows the same message distribution procedure as the AD representative in the source domain, as described above. The flooding procedure terminates when all reachable route servers in the Internet receive a copy of the routing information message.

Neighbor policy gateways may receive copies of the same routing information message from different neighboring domains. If two neighbor policy gateways receive the message copies simultaneously, they will distribute them to VG representatives in other virtual gateways within their domain, resulting in duplicate message distribution. However, each policy gateway stops the spread of duplicate routing information messages as soon as it detects them, as described in section 4.2.3 below. Moreover, we expect simultaneous message receptions to be the exception rather than the rule, given the hierarchical structure of the current Internet topology.

4.2.1 Message Generation

The AD representative generates and distributes a configuration message whenever there is a change in a transit policy or virtual gateway configured for its domain. This ensures that route servers maintain an up-to-date view of the domain's configured transit policies and adjacencies. The AD representative may also distribute a configuration message at a configurable period of conf_per (500) hours. A configuration message contains, for each configured transit policy, the identifier assigned by the domain administrator, the specification, and the set of associated virtual gateway groups. Each virtual gateway group comprises virtual gateways configured to be mutually reachable via intra-domain routes of the given transit policy. Accompanying each virtual gateway listed is an indication of whether the virtual gateway is configured to be a domain entry point, a domain exit point, or both according to the given transit policy. The configuration message also contains the set of local route servers that the domain administrator has configured to be available to IDPR clients in other domains.

The AD representative generates and distributes a dynamic message whenever there is a change in transit policy currently supported across the given domain or in current virtual gateway connectivity to an adjacent domain. This ensures that route servers maintain an up-to-date view of supported transit policies and existing domain adjacencies and how they differ from those configured for the domain. Specifically, the AD representative generates a dynamic message whenever there is a change in the connectivity, through the given domain and with respect to a configured transit policy, between two neighboring domain components. The AD representative may also distribute a dynamic message at a configurable period of dyn_per (24) hours. A dynamic message contains, for each configured transit policy, its identifier, associated virtual gateway groups, and domain components reachable through virtual gateways in each group. Each dynamic message also contains the set of currently unavailable, either down or unreachable, virtual gateways in the domain.
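The variable contents of a dynamic message can be pictured as the output of a routine like the one below. This is a minimal sketch, not normative: the per-policy reachability predicate is an assumed interface onto the intra-domain routing procedure, and the example in the next paragraph traces the same computation by hand.

   def dynamic_message_body(configured_vgs, up_vgs, reachable):
       # configured_vgs: transit policy -> set of configured virtual gateways
       # up_vgs: virtual gateways currently up and reachable
       # reachable(a, b, tp): true if a and b are mutually reachable via
       #   intra-domain routes supporting transit policy tp
       all_vgs = set().union(*configured_vgs.values())
       unavailable = sorted(all_vgs - up_vgs)       # down or unreachable VGs
       groups = {}
       for tp, vgs in configured_vgs.items():
           tp_groups = []
           for vg in sorted(vgs & up_vgs):
               for group in tp_groups:
                   if reachable(vg, group[0], tp):
                       group.append(vg)
                       break
               else:
                   tp_groups.append([vg])
           groups[tp] = tp_groups                   # current VG groups for tp
       return groups, unavailable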
We note that each virtual gateway group expressed in a dynamic message may be a proper subset of one of the corresponding virtual gateway groups expressed in a configuration message. For example, suppose that, for a given domain, the virtual gateway group (VG1,...,VG5) were configured for a transit policy such that each virtual gateway was both a domain entry and exit point. Thus, all virtual gateways in this group are configured to be mutually reachable via intra-domain routes of the given transit policy. Now suppose that VG5 becomes unreachable because of a power failure and furthermore that the remaining virtual gateways form two distinct groups, (VG1,VG2) and (VG3,VG4), such that although virtual gateways in both groups are still mutually reachable via some intra-domain routes, they are no longer mutually reachable via any intra-domain routes of the given transit policy. In this case, the virtual gateway groups for the given transit policy now become (VG1,VG2) and (VG3,VG4); VG5 is listed as an unavailable virtual gateway.

A route server uses information about the set of unavailable virtual gateways to determine which of its routes are no longer viable, and it subsequently removes such routes from its route database. Although route servers could determine the set of unavailable virtual gateways using information about configured virtual gateways and currently reachable virtual gateways, the associated processing cost is high. In particular, a route server would have to examine all virtual gateway groups listed in a dynamic message to determine whether there are any unavailable virtual gateways in the given domain. To reduce the message processing at each route server, we have chosen to include the set of unavailable virtual gateways in each dynamic message.

In order to construct a dynamic message, the AD representative assembles information gathered from intra-domain routing and from VGP. Specifically, the AD representative uses the following information:

1. VG connect and up/down messages to determine the state, up or down, of each of its domain's virtual gateways and the adjacent domain components reachable through a given virtual gateway.

2. Intra-domain routing information to determine, for each of its domain's transit policies, whether a given virtual gateway in the domain is reachable with respect to that transit policy.

3. VG policy messages to determine the connectivity of neighboring domain components, across the given domain and with respect to a configured transit policy, such that these components are adjacent to virtual gateways not currently reachable from the AD representative's virtual gateway according to the given transit policy.

4.2.2 Sequence Numbers

Each IDPR routing information message carries a sequence number which, when used in conjunction with the timestamp carried in the CMTP message header, determines the recency of the message. The AD representative assigns a sequence number to each routing information message it generates, depending upon its internal clock time:

1. The AD representative sets the sequence number to 0, if its internal clock time is greater than the timestamp in its previously generated routing information message.
2. The AD representative sets the sequence number to 1 greater than the sequence number in its previously generated routing information message, if its internal clock time equals the timestamp for its previously generated routing information message and if the previous sequence number is less than the maximum value. If the previous sequence number equals the maximum value, the AD representative waits until its internal clock time exceeds the timestamp in its previously generated routing information message and then sets the sequence number to 0.

In general, we do not expect generation of multiple distinct IDPR routing information messages carrying identical timestamps, and so the sequence number may seem superfluous. However, the sequence number may become necessary during synchronization of the AD representative's internal clock. In particular, the AD representative may need to freeze the clock value during synchronization, and thus distinct sequence numbers serve to distinguish routing information messages generated during the clock synchronization interval.

4.2.3 Message Acceptance

Before a policy gateway forwards a routing information message, and before a route server incorporates routing information into its routing information database, the policy gateway or route server assesses routing information message acceptability. An IDPR routing information message is acceptable if:

1. It passes the CMTP validation checks.

2. Its timestamp is less than conf_old (530) hours behind the recipient's internal clock time for configuration messages and less than dyn_old (25) hours behind the recipient's internal clock time for dynamic messages.

3. Its timestamp and sequence number indicate that it is more recent than the currently-stored routing information from the given domain. If there is no routing information currently stored from the given domain, then the routing information message contains, by default, the more recent information.

Policy gateways acknowledge and forward acceptable IDPR routing information messages, according to the flooding protocol described in section 4.2 above. Moreover, each policy gateway retains the timestamp and sequence number of the most recently accepted routing information message from each domain and uses these values to determine acceptability of routing information messages received in the future. Route servers acknowledge the receipt of acceptable routing information messages and incorporate the contents of these messages into their routing information databases, contingent upon criteria discussed in section 4.2.4 below; however, they do not participate in the flooding protocol. We note that when a policy gateway or route server first returns to service, it immediately updates its routing information database with routing information obtained from another route server, using the route server query protocol described in section 5.
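Sequence number assignment and the recency test of rule 3 reduce to comparisons on (timestamp, sequence number) pairs. The sketch below is illustrative only; it assumes a maximum sequence number of 2^16 - 1, matching the 16-bit SEQ field of section 4.3, and clock is a hypothetical accessor for the internal clock.

   MAX_SEQ = 0xFFFF   # SEQ is a 16-bit field (section 4.3)

   def next_stamp(clock, prev_timestamp, prev_seq):
       # Assign (timestamp, sequence number) to a newly generated
       # routing information message, per the rules of section 4.2.2.
       now = clock()
       if now > prev_timestamp:
           return now, 0
       if prev_seq < MAX_SEQ:            # clock frozen at prev_timestamp
           return prev_timestamp, prev_seq + 1
       while now <= prev_timestamp:      # wait out the frozen clock
           now = clock()
       return now, 0

   def more_recent(timestamp, seq, stored):
       # Rule 3 above: stored is None, or the (timestamp, seq) pair
       # retained for the given domain; tuple comparison orders first
       # by timestamp, then by sequence number.
       return stored is None or (timestamp, seq) > stored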
An AD representative takes special action upon receiving an acceptable routing information message, supposedly generated by itself but originally obtained from a policy gateway or route server other than itself. There are at least three possible reasons for the occurrence of this event:

1. The routing information message has been corrupted in a way that is not detectable by the integrity/authentication value.

2. The AD representative has experienced a memory error.

3. Some other entity is generating routing information messages on behalf of the AD representative.

In any case, the AD representative logs the event for network management. Moreover, the AD representative must reestablish its own routing information messages as the most recent for its domain. To do so, the AD representative waits until its internal clock time exceeds the value of the timestamp in the received routing information message and then generates a new routing information message using the currently-stored domain routing information supplied by VGP and by the intra-domain routing procedure. Note that the length of time the AD representative must wait to generate the new message is at most cmtp_new (5) minutes, the maximum CMTP-tolerated difference between the received message's timestamp and the AD representative's internal clock time.

IDPR routing information messages that pass the CMTP validity checks but appear less recent than stored routing information are unacceptable. Policy gateways do not forward unacceptable routing information messages, and route servers do not incorporate the contents of unacceptable routing information messages into their routing information databases. Instead, the recipient of an unacceptable routing information message acknowledges the message in one of two ways:

1. If the routing information message timestamp and sequence number are equal to the timestamp and sequence number associated with the stored routing information for the given domain, the recipient assumes that the routing information message is a duplicate and acknowledges the message.

2. If the routing information message timestamp and sequence number indicate that the message is less recent than the stored routing information for the domain, the recipient acknowledges the message with an indication that it is out-of-date. Such a negative acknowledgement is a signal to the sender of the routing information message to request more up-to-date routing information from a route server, using the route server query protocol. Furthermore, if the recipient of the out-of-date routing information message is a route server, it regenerates a routing information message from its own routing information database and forwards the message to the sender. The sender may in turn propagate this more recent routing information message to other policy gateways and route servers.

4.2.4 Message Incorporation

A route server usually stores the entire contents of an acceptable IDPR routing information message in its routing information database, so that it has access to all advertised transit policies when generating a route and so that it can regenerate the routing information message at a later point in time if requested to do so by another route server or policy gateway. However, the route server may elect not to store all routing information message contents.
In particular, the route server need not store any transit policy that excludes the route server's domain as a source, nor any routing information from a domain that the route server's domain's source policies exclude for transit. Selective storing of routing information message contents simplifies the route generation procedure, since it reduces the search space of possible routes, and it limits the amount of route server memory devoted to routing information. However, selective storing of routing information also means that the route server cannot always regenerate the original routing information message, if requested to do so by another route server or policy gateway.

An acceptable IDPR routing information message may contain transit policy information that is not well-defined according to the route server's perception. A configuration message may contain an unrecognized domain, virtual gateway, or other attribute, such as user class or offered service. In this case, unrecognized means that the value in the routing information message is not listed in the route server's configuration database, as described in section 1.8.2. A dynamic message may contain an unrecognized transit policy or virtual gateway. In this case, unrecognized means that the transit policy or virtual gateway was not listed in the most recent configuration message for the given domain.

Each route server can always parse an acceptable routing information message, even if some of the information is not well-defined, and thus can always use the information that it does recognize. Therefore, a route server can store the contents of acceptable routing information messages from domains in which it is interested, regardless of whether all contents appear to be well-defined at present. In this case, the route server attempts to obtain the additional information it needs to decipher unrecognized information. For a configuration message, the route server requests updated configuration information; for a dynamic message, the route server requests, from another route server, the most recent configuration message for the given domain.

When a domain is partitioned, each domain component has its own AD representative, which generates routing information messages on behalf of that component. Discovery of a domain partition prompts the AD representative for each domain component to generate and distribute a dynamic message. In this case, a route server receives and stores more than one routing information message at a time for the given domain, namely one for each domain component. When the partition heals, the AD representative for the entire domain generates and distributes a dynamic message. In each route server's routing information database, the new dynamic message does not automatically replace all of the currently-stored dynamic messages for the given domain. Instead, the new message replaces only that message whose AD representative matches the AD representative for the new message. The other dynamic messages remaining from the period during which the partition occurred will be removed from the routing information database when they attain their maximum age, as described in section 4.2.5 below.
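This replacement rule amounts to keying stored dynamic messages on the domain component identifier carried in the AD CMP field (section 4.3.2). A minimal sketch, with db a plain dictionary standing in for the routing information database:

   def store_dynamic(db, domain, ad_cmp, message):
       # db: domain -> {AD CMP value: stored dynamic message}.  A new
       # message replaces only the entry whose AD representative (domain
       # component) matches; entries left over from a healed partition
       # persist until they reach their maximum age (section 4.2.5).
       db.setdefault(domain, {})[ad_cmp] = message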
In a future version of IDPR, we may include mechanisms for removing partition-related dynamic messages immediately after the partition disappears.

4.2.5 Routing Information Database

We expect that most of the IDPR routing information stored in a routing information database will remain viable for long periods of time, perhaps until a domain reconfiguration occurs. However, to reduce the probability of retaining stale routing information, a route server imposes a maximum lifetime on each database entry, initialized when it incorporates an accepted entry into its routing information database. The maximum routing information database entry lifetime should be longer than the corresponding routing information message generation period, so that the database entry is likely to be refreshed before it expires.

Each configuration message stored in the routing information database remains viable for a maximum of conf_old (530) hours; each dynamic message stored in the routing information database remains viable for a maximum of dyn_old (25) hours. By viable, we mean that the message contents may be used in generating policy routes. Configuring periodic generation of routing information messages makes it unlikely that any routing information message will remain in a routing information database for its full life span. However, a routing information message may attain its maximum age in a route server that is separated from the Internet for a long period of time.

When an IDPR routing information message attains its maximum age in a routing information database, the route server removes the message contents from its database, so that it will not generate new routes with the outdated routing information nor distribute old routing information in response to requests from other route servers or policy gateways. Nevertheless, the route server continues to dispense routes previously generated with the old routing information, as long as path setup (see section 7) for these routes succeeds.

The route server treats routing information message expiration differently, depending on the type of routing information message. When a configuration message expires, the route server requests, from another route server, the most recent configuration message issued by the given domain. When a dynamic message expires, the route server does not initially attempt to obtain more recent routing information. Instead, if route generation is necessary, the route server uses the routing information contained in the corresponding configuration message for the given domain. Only if there is a path setup failure (see section 7.4) involving the given domain does the route server request, from another route server, the most recent dynamic message issued by the given domain.
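The expiration behavior condenses to one routine. In this sketch, the database interface and request_from_route_server, an RSQP query (section 5), are hypothetical names introduced only for illustration.

   CONF_OLD_HOURS = 530   # maximum configuration message lifetime
   DYN_OLD_HOURS = 25     # maximum dynamic message lifetime

   def expire_entry(db, domain, entry, request_from_route_server):
       # Remove the expired contents so they are neither used for new
       # routes nor redistributed to other IDPR entities.
       db.remove(domain, entry)
       if entry.kind == "configuration":
           # Configuration expiry: immediately fetch the most recent
           # configuration message from another route server.
           request_from_route_server(domain, kind="configuration")
       # Dynamic expiry: do nothing yet.  Routes are generated from the
       # stored configuration message alone; the most recent dynamic
       # message is requested only after a path setup failure involving
       # this domain (section 7.4).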
4.3 Routing Information Message Formats

The flooding protocol number is equal to 1. We describe the contents of each type of routing information message below.

4.3.1 Configuration

The configuration message type is equal to 0.

    0         8         16        24     31
    |______AD_CMP_______|______SEQ_________|
    |______NUM_TP_______|_____NUM_RS_______|
    |________RS_________|

    For each TP:
    |________TP_________|_____NUM_ATR______|

    For each attribute:
    |_____ATR_TYP_______|_____ATR_LEN______|

    |____NUM_AD_GRP_____|

    For each AD group:
    |______NUM_AD_______|_______AD_________|
    |_AD_FLGS_|_NUM_HST_|_HST_SET_|

    |______NUM_TIM______|

    For each set of times:
    |TIM_FLGS_|_________DURATION___________|
    |_________________START________________|
    |______PERIOD_______|_____ACTIVE_______|

    |______NUM_UCI______|

    For each UCI:
    |___UCI___|

    For each offered service:
    |________________OFR_SRV_______________|

    |____NUM_VG_GRP_____|

    For each VG group:
    |______NUM_VG_______|_____ADJ_AD_______|
    |___VG___|_VG_FLGS__|

AD CMP (16 bits) Numeric identifier for the domain component containing the AD representative policy gateway.

SEQ (16 bits) Routing information message sequence number.

NUM TP (16 bits) Number of transit policy specifications contained in the routing information message.

NUM RS (16 bits) Number of route servers advertised to serve clients outside of the domain.

RS (16 bits) Numeric identifier for a route server.

TP (16 bits) Numeric identifier for a transit policy specification.

NUM ATR (16 bits) Number of attributes associated with the transit policy.

ATR TYP (16 bits) Numeric identifier for a type of attribute. Valid attributes include the following types:

1. Set of virtual gateway groups (see section 1.4.2) associated with the transit policy (variable); must be included.

2. Set of source/destination domain groups (see section 1.4.2) associated with the transit policy (variable); may be omitted. Absence of this attribute implies that traffic from any source domain to any destination domain is acceptable.

3. Set of time specifications (see section 1.4.2) associated with the transit policy (variable); may be omitted. Absence of this attribute implies that the transit policy always applies.

4. Set of user classes (see section 1.4.2) associated with the transit policy (variable); may be omitted. Absence of this attribute implies that traffic of any user class is acceptable.

5. Average delay in milliseconds (16 bits); may be omitted.

6. Delay variation in milliseconds (16 bits); may be omitted.

7. Average available bandwidth in bits per second (48 bits); may be omitted.

8. Available bandwidth variation in bits per second (48 bits); may be omitted.

9. MTU in bytes (16 bits); may be omitted.

10. Charge per byte in thousandths of a cent (16 bits); may be omitted.

11. Charge per message in thousandths of a cent (16 bits); may be omitted.

12. Charge for session time in thousandths of a cent per second (16 bits); may be omitted.

Absence of any charge attributes implies that the domain provides free transit service.

ATR LEN (16 bits) Length of an attribute in bytes, beginning with the next field.
NUM AD GRP (16 bits) Number of source/destination domain groups associated with the transit policy.

NUM AD (16 bits) Number of domains or domain sets (see section 1.4.2) in a domain group.

AD (16 bits) Numeric identifier for a domain or domain set.

AD FLGS (8 bits) Set of five flags indicating how to interpret the AD field and contained in the right-most bits. Proceeding left to right, the first flag indicates whether the transit policy applies to all domains or to specific domains (1 all, 0 specific), and when set to 1, causes the second and third flags to be ignored. The second flag indicates whether the domain identifier signifies a single domain or a domain set (1 single, 0 set). The third flag indicates whether the transit policy applies to the given domain or domain set (1 applies, 0 does not apply) and is used for representing complements of sets of domains. The fourth flag indicates whether the domain is a source (1 source, 0 not source). The fifth flag indicates whether the domain is a destination (1 destination, 0 not destination). At least one of the fourth and fifth flags must be set to 1.

NUM HST (8 bits) Number of host sets (see section 1.4.2) associated with a particular domain. The value 0 indicates that all hosts in the given domain are acceptable sources or destinations, as specified by the fourth and fifth AD flags.

HST (8 bits) Numeric identifier for a host set.

NUM TIM (16 bits) Number of time specifications associated with the transit policy. Each time specification is split into a set of contiguous identical periods.

TIM FLGS (8 bits) Set of two flags indicating how to combine the time specifications and contained in the right-most bits. Proceeding left to right, the first flag indicates whether the transit policy applies during the periods specified in the time specification (1 applies, 0 does not apply) and is used for representing complements of transit policy applicability periods. The second flag indicates whether the time specification takes precedence over the previous time specifications listed (1 precedence, 0 no precedence). Precedence is equivalent to the boolean OR and AND operators in the following sense. At any given instant, a transit policy either applies or does not apply according to a given time specification, and so we can assign a boolean value to the state of transit policy applicability according to that time specification. If the second flag assumes the value 1 for a given time specification, the boolean operator OR should be applied to the value of transit policy applicability according to the given time specification and the value according to all previous time specifications. If the second flag assumes the value 0 for a given time specification, the boolean operator AND should be applied instead.

DURATION (24 bits) Length of time during which the time specification applies, in minutes. A value of 0 indicates the time specification applies forever.

START (32 bits) Time at which the time specification first takes effect, in seconds elapsed since 1 January 1970 0:00 GMT.

PERIOD (16 bits) Length of each period within the time specification, in minutes.
ACTIVE (16 bits) Length of time the transit policy is applicable during each period, in minutes from the beginning of the period.

NUM UCI (16 bits) Number of user classes associated with the transit policy.

UCI (8 bits) Numeric identifier for a user class.

NUM VG GRP (16 bits) Number of virtual gateway groups associated with the transit policy.

NUM VG (16 bits) Number of virtual gateways in a virtual gateway group.

ADJ AD (16 bits) Numeric identifier for the adjacent domain to which a virtual gateway connects.

VG (8 bits) Numeric identifier for a virtual gateway.

VG FLGS (8 bits) Set of two flags indicating how to interpret the VG field and contained in the right-most bits. Proceeding left to right, the first flag indicates whether the virtual gateway is a domain entry point (1 entry, 0 not entry) for the transit policy. The second flag indicates whether the virtual gateway is a domain exit point (1 exit, 0 not exit) for the transit policy. At least one of the first and second flags must be set to 1.
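To illustrate how the time specification fields combine, the following sketch decides whether a transit policy applies at a given instant. It is one reading of the DURATION, START, PERIOD, ACTIVE, and TIM FLGS definitions above, not normative text.

   def timespec_active(t, start, duration, period, active):
       # One time specification; t and START in seconds since
       # 1 January 1970 0:00 GMT, the remaining fields in minutes.
       if t < start:
           return False
       elapsed = (t - start) // 60
       if duration != 0 and elapsed >= duration:   # DURATION 0: forever
           return False
       # Within each PERIOD-minute period, the specification covers the
       # first ACTIVE minutes.
       return elapsed % period < active

   def policy_applies(t, timespecs):
       # timespecs: list of (applies_flag, precedence_flag, start,
       # duration, period, active) in the order listed in the message.
       result = True
       for i, (applies, precedence, start, dur, per, act) in enumerate(timespecs):
           covered = timespec_active(t, start, dur, per, act)
           value = covered if applies else not covered   # complement when 0
           if i == 0:
               result = value
           elif precedence:
               result = result or value    # precedence flag 1: OR
           else:
               result = result and value   # precedence flag 0: AND
       return result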
4.3.2 Dynamic

The dynamic message type is equal to 1.

    0         8         16        24     31
    |______AD_CMP_______|_______SEQ________|
    |_____UNAVL_VG______|______NUM_PS______|

    For each unavailable VG:
    |______ADJ_AD_______|___VG____|_UNUSED_|

    For each TP set:
    |______NUM_TP_______|____NUM_VG_GRP____|
    |________TP_________|

    For each VG group:
    |______NUM_VG_______|_____ADJ_AD_______|
    |___VG___|_VG_FLGS__|_____NUM_CMP______|
    |________CMP________|

AD CMP (16 bits) Numeric identifier for the domain component containing the AD representative policy gateway.

SEQ (16 bits) Routing information message sequence number.

UNAVL VG (16 bits) Number of virtual gateways in the domain component that are currently unavailable via any intra-domain routes.

NUM PS (16 bits) Number of sets of transit policies listed. A single set of virtual gateway groups applies to all transit policies in a given set. Hence, transit policy sets provide a mechanism for reducing the size of dynamic messages.

ADJ AD (16 bits) Numeric identifier for the adjacent domain to which a virtual gateway connects.

VG (8 bits) Numeric identifier for a virtual gateway.

UNUSED (8 bits) Not currently used; must be set equal to 0.

NUM TP (16 bits) Number of transit policies in a set.

NUM VG GRP (16 bits) Number of virtual gateway groups currently associated with the transit policy set.

TP (16 bits) Numeric identifier for a transit policy.

NUM VG (16 bits) Number of virtual gateways in a virtual gateway group.

VG FLGS (8 bits) Set of two flags indicating how to interpret the VG field and contained in the right-most bits. Proceeding left to right, the first flag indicates whether the virtual gateway is a domain entry point (1 entry, 0 not entry) for the transit policies. The second flag indicates whether the virtual gateway is a domain exit point (1 exit, 0 not exit) for the transit policies. At least one of the first and second flags must be set to 1.

NUM CMP (16 bits) Number of adjacent domain components reachable via direct connections through the virtual gateway.

CMP (16 bits) Numeric identifier for a reachable adjacent domain component.

4.3.3 Negative Acknowledgements

When a policy gateway or route server receives an unacceptable IDPR routing information message that passes the CMTP validation checks, it includes, in its CMTP ack, an appropriate negative acknowledgement. This information is placed in the INFORM field of the CMTP ack (described in section 2.4); the numeric identifier for each type of routing information message negative acknowledgement is contained in the left-most 8 bits of the INFORM field. Negative acknowledgements associated with routing information messages include the following types:

1. Unrecognized IDPR routing information message type. Numeric identifier for the unrecognized message type (8 bits).

2. Out-of-date IDPR routing information message. This is a signal to the sender that it may not have the most recent routing information for the given domain.

5 Route Server Query Protocol

Each route server is responsible for maintaining both the routing information and route databases and for responding to database information requests from policy gateways and other route servers. These requests and their responses are the messages exchanged via the route server query protocol (RSQP).

Policy gateways and route servers normally invoke RSQP to replace absent, outdated, or corrupted information in their own routing information or route databases. In section 4, we discussed some of the situations in which RSQP must be invoked; in sections 6 and 7, we discuss other such situations.

5.1 Message Exchange

Policy gateways and route servers use CMTP for reliable transport of route server requests and responses. RSQP must communicate to CMTP the maximum number of transmissions per request/response message, rsqp_ret, and the interval between request/response message retransmissions, rsqp_int microseconds. A route server request/response message is acceptable if:

1. It passes the CMTP validation checks.

2. Its timestamp is less than rsqp_old (300) seconds behind the recipient's internal clock time.

With RSQP, a requesting entity expects to receive an acknowledgement from the queried route server indicating whether the route server can accommodate the request. The route server may fail to fill a given request, either because its corresponding database contains no entry or only a partial entry for the requested information, or because it is governed by special message distribution rules, imposed by the domain administrator, that preclude it from releasing the requested information. For all requests that it cannot fill, the route server responds with a negative acknowledgement message carried in a CMTP acknowledgement, indicating the set of unfulfilled requests (see section 5.3.4).

If the requesting entity either receives a negative acknowledgement or does not receive any acknowledgement after rsqp_ret attempts directed at the same route server, it queries a different route server, as long as the number of attempted requests to different route servers does not exceed rsqp_try (3). Specifically, the requesting entity proceeds in round-robin order through its list of addressable route servers. However, if the requesting entity is unsuccessful after rsqp_try attempts, it abandons the request altogether and logs the event for network management.
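This retry discipline can be sketched as follows. Here send_request is a hypothetical CMTP-backed call that retransmits up to rsqp_ret times and returns a positive acknowledgement, a negative acknowledgement, or None when no acknowledgement arrives.

   RSQP_TRY = 3   # maximum number of distinct route servers to query

   def query_route_servers(route_servers, request, send_request, log):
       # Proceed round-robin through the list of addressable route
       # servers; stop at the first one that positively acknowledges
       # the request, or give up after rsqp_try servers.
       for rs in route_servers[:RSQP_TRY]:
           ack = send_request(rs, request)
           if ack is not None and ack.positive:
               return rs          # this route server will fill the request
       log("RSQP request abandoned after %d route servers" % RSQP_TRY)
       return None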
A policy gateway or a route server can request information from any route server that it can address. Addresses for local route servers within a domain are part of the configuration for each IDPR entity within a domain; addresses for remote route servers in other domains are obtained through flooded configuration messages, as described in section 4.2.1. However, requesting entities always query local route servers before remote route servers, in order to contain the costs associated with the query and response. If the requesting entity and the queried route server are in the same domain, they can communicate over intra-domain routes, whereas if the requesting entity and the queried route server are in different domains, they must obtain a policy route and establish a path before they can communicate, as described in section 5.2 below.

5.1.1 Routing Information

Policy gateways and route servers request routing information from route servers, in order to update their routing information databases. To obtain routing information from a route server, the requesting entity issues a routing information request message containing the type of routing information requested -- configuration messages, dynamic messages, or both -- and the set of domains from which the routing information is requested.

Upon receiving a routing information request message, a route server first assesses message acceptability before proceeding to act on the contents. If the routing information request message is deemed acceptable, the route server determines how much of the request it can fulfill and then instructs CMTP to generate an acknowledgement, indicating its ability to fulfill the request. The route server proceeds to fulfill as much of the request as possible by reconstructing individual routing information messages, one per requested message type and domain, from its routing information database. We note that only a regenerated routing information message whose entire contents match that of the original routing information message can pass the CMTP integrity/authentication checks.

5.1.2 Routes

Path agents request routes from route servers when they require policy routes for path setup. To obtain routes from a route server, the requesting path agent issues a route request message containing the destination domain and applicable service requirements, the maximum number of routes requested, a directive indicating whether to generate the routes or retrieve them from the route database, and a directive indicating whether to refresh the routing information database with the most recent configuration or dynamic message from a given domain, before generating the routes. To refresh its routing information database, a route server must obtain routing information from another route server. The path agent usually issues routing information database refresh directives in response to a failed path setup. We discuss the application of these directives in more detail in section 7.4.

Upon receiving a route request message, a route server first assesses message acceptability before proceeding to act on the contents.
If the route request message is deemed acceptable, the route server determines whether it can fulfill the request and then instructs CMTP to generate an acknowledgement, indicating its ability to fulfill the request. The route server proceeds to fulfill the request with policy routes, either retrieved from its route database or generated from its routing information database if necessary, returned in a route response message.

5.2 Remote Route Server Communication

Communication with a remote route server requires a policy route and accompanying path setup (see section 7) between the requesting and queried entities, as these entities reside in different domains. After generating a request message, the requesting entity hands to CMTP its request message along with the remote route server's entity and domain identifiers. CMTP encloses the request in a datagram and hands the datagram and remote route server information to the path agent. Using the remote route server information, the path agent obtains, and if necessary sets up, a path to the remote route server. Once the path to the remote route server has been successfully established, the path agent encapsulates the datagram within an IDPR data message and forwards the data message along the designated path.

When the path agent in the remote route server receives the IDPR data message, it extracts the datagram and hands it to CMTP. In addition, the path agent, using the requesting entity and domain identifiers contained in the path identifier, obtains, and if necessary sets up, a path back to the requesting entity.

If the datagram fails any of the CMTP validation checks, CMTP returns a nak to the requesting entity. If the datagram passes all of the CMTP validation checks, the remote route server assesses the acceptability of the request message. Provided the request message is acceptable, the remote route server determines whether it can fulfill the request and directs CMTP to return an ack to the requesting entity. The ack may contain a negative acknowledgement if the entire request cannot be fulfilled.

The remote route server generates responses for all requests that it can fulfill and returns the responses to the requesting entity. Specifically, the remote route server hands to CMTP its response and the requesting entity information. CMTP in turn encloses the response in a datagram.

When returning an ack, a nak, or a response to the requesting entity, the remote route server hands the corresponding CMTP message and requesting entity information to the path agent. Using the requesting entity information, the path agent retrieves the path to the requesting entity, encapsulates the CMTP message within an IDPR data message, and forwards the data message along the designated path.

The requesting entity, upon receiving an ack, nak, or response to its request, performs the CMTP validation checks for that message. In the case of a response message, the requesting entity assesses message acceptability before incorporating the contents into the appropriate database.

5.3 Route Server Message Formats

The route server query protocol number is equal to 2. We describe the contents of each type of RSQP message below.
5.3.1 Routing Information Request

The routing information request message type is equal to 0.

    0         8         16        24     31
    |______QRY_AD_______|______QRY_RS______|
    |______NUM_AD_______|________AD________|
    |_RIM_FLGS_|_UNUSED_|

QRY AD (16 bits) Numeric identifier for the domain containing the queried route server.

QRY RS (16 bits) Numeric identifier for the queried route server.

NUM AD (16 bits) Number of domains about which information is requested. The value 0 indicates a request for routing information from all domains.

AD (16 bits) Numeric identifier for a domain. This field is absent when NUM AD equals 0.

RIM FLGS (8 bits) Set of two flags indicating the type of routing information messages requested and contained in the right-most bits. Proceeding left to right, the first flag indicates whether the request is for a configuration message (1 configuration, 0 no configuration). The second flag indicates whether the request is for a dynamic message (1 dynamic, 0 no dynamic). At least one of the first and second flags must be set to 1.

UNUSED (8 bits) Not currently used; must be set equal to 0.
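As a concreteness check, the sketch below packs the fields above for a request. The byte order and the exact placement of the two flags within the right-most bits are assumptions, since the text above does not pin them down; network (big-endian) order is used here.

   import struct

   def pack_rim_request(qry_ad, qry_rs, domains, config, dynamic):
       # An empty domain list (NUM AD = 0) requests routing information
       # from all domains, and the AD fields are then absent.
       body = struct.pack("!HHH", qry_ad, qry_rs, len(domains))
       for ad in domains:
           body += struct.pack("!H", ad)
       rim_flgs = (int(config) << 1) | int(dynamic)   # assumed bit layout
       return body + struct.pack("!BB", rim_flgs, 0)  # RIM FLGS, UNUSED

   # Example: ask route server 2 of domain 7 for the configuration
   # messages of domains 12 and 33.
   message_body = pack_rim_request(7, 2, [12, 33], config=True, dynamic=False)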
5.3.2 Route Request

The route request message type is equal to 1.

    0         8         16        24     31
    |_______QRY_AD______|______QRY_RS______|
    |_______DST_AD______|_NUM_RTS_|GEN_FLGS|
    |_______RFS_AD______|___UCI___|_UNUSED_|
    |_______NUM_AD______|_____NUM_RQS______|

    For each AD:
    |_________AD________|_AD_FLGS_|_UNUSED_|

    For each requested service:
    |______RQS_TYP______|_____RQS_LEN______|
    |_________________RQS_SRV______________|

QRY AD (16 bits) Numeric identifier for the domain containing the queried route server.

QRY RS (16 bits) Numeric identifier for the queried route server.

DST AD (16 bits) Numeric identifier for the route's destination domain.

NUM RTS (8 bits) Number of policy routes requested.

GEN FLGS (8 bits) Set of three flags indicating how to obtain the requested routes and contained in the right-most bits. Proceeding left to right, the first flag indicates whether the route server should retrieve existing routes from its route database or generate new routes (1 retrieve, 0 generate). The second flag indicates whether the route server should refresh its routing information database before generating the requested routes (1 refresh, 0 no refresh) and when set to 1, causes the third flag and the RFS AD field to become significant. The third flag indicates whether the routing information database refresh should include configuration messages or dynamic messages (1 configuration, 0 dynamic).

RFS AD (16 bits) Numeric identifier for the domain for which routing information should be refreshed. This field is meaningful only if the second flag in the GEN FLGS field is set to 1.

UCI (8 bits) Numeric identifier of the source user class. The value 0 indicates that there is no particular source user class.

UNUSED (8 bits) Not currently used; must be set equal to 0.

NUM AD (16 bits) Number of transit domains that are to be favored, avoided, or excluded during route selection.

NUM RQS (16 bits) Number of requested services. The value 0 indicates that there is no special service requested.

AD (16 bits) Numeric identifier for the transit domain to be favored, avoided, or excluded.

AD FLGS (8 bits) Set of three flags indicating how to interpret the AD field and contained in the right-most bits. Proceeding left to right, the first flag indicates whether the domain should be favored (1 favored, 0 not favored). The second flag indicates whether the domain should be avoided (1 avoided, 0 not avoided). The third flag indicates whether the domain should be excluded (1 excluded, 0 not excluded). At most one of the first, second, and third flags may be set to 1.

RQS TYP (16 bits) Numeric identifier for a type of requested service. Valid requested services include the following types:

1. Delay in milliseconds (16 bits); may be omitted.

2. Minimum delay route; may be omitted.

3. Delay variation in milliseconds (16 bits); may be omitted.

4. Minimum delay variation route; may be omitted.

5. Bandwidth in bits per second (48 bits); may be omitted.

6. Maximum bandwidth route; may be omitted.

7. Session monetary cost in cents (32 bits); may be omitted.

8. Minimum session monetary cost route; may be omitted.

9. Path lifetime in minutes (16 bits); may be omitted but must be present if types 7 or 8 are present.

10. Path lifetime in messages (16 bits); may be omitted but must be present if types 7 or 8 are present.

11. Path lifetime in bytes (48 bits); may be omitted but must be present if types 7 or 8 are present.

12. MD4 data message authentication; relevant only to setup messages (see section 7.4).

13. MD5 data message authentication; relevant only to setup messages.

14. Billing address (variable); relevant only to setup messages.

15. Charge number (variable); relevant only to setup messages.

Route servers use path lifetime information together with domain charging method to compute expected session monetary cost over a given domain.

RQS LEN (16 bits) Length of the requested service in bytes, beginning with the next field.

RQS SRV (variable) Description of the requested service.

5.3.3 Route Response

The route response message type is equal to 2.

    0         8         16        24     31
    |_NUM_RTS_|

    For each route:
    |_NUM_AD__|_RTE_FLGS_|

    For each AD in route:
    |_AD_LEN__|____VG____|________AD________|
    |________CMP________|______NUM_TP______|
    |_________TP________|

NUM RTS (8 bits) Number of policy routes contained in the message.

RTE FLGS (8 bits) Set of two flags indicating the directions in which a route can be used and contained in the right-most bits. Refer to sections 6.1.1, 7.2, and 7.4 for detailed discussions of path directionality. Proceeding left to right, the first flag indicates whether the route can be used from source to destination (1 from source, 0 not from source). The second flag indicates whether the route can be used from destination to source (1 from destination, 0 not from destination). At least one of the first and second flags must be set to 1, if NUM RTS is greater than 0.

NUM AD (8 bits) Number of domains in the policy route, not including the source domain.
AD LEN (8 bits) Length of the information associated with a particular domain in bytes, beginning with the next field.

VG (8 bits) Numeric identifier for an entry virtual gateway.

AD (16 bits) Numeric identifier for an adjacent administrative domain.

CMP (16 bits) Numeric identifier for an adjacent domain component. Used by policy gateways to select a route across a virtual gateway connecting to a partitioned domain.

NUM TP (16 bits) Number of transit policies that apply to the section of the route traversing the domain.

TP (16 bits) Numeric identifier for a transit policy.

5.3.4 Negative Acknowledgements

When a policy gateway receives an unacceptable RSQP message that passes the CMTP validation checks, it includes, in its CMTP ack, an appropriate negative acknowledgement. This information is placed in the INFORM field of the CMTP ack (described in section 2.4); the numeric identifier for each type of RSQP negative acknowledgement is contained in the left-most 8 bits of the INFORM field. Negative acknowledgements associated with RSQP include the following types:

1. Unrecognized RSQP message type. Numeric identifier for the unrecognized message type (8 bits).

2. Out-of-date RSQP message.

3. Unable to fill requests for routing information from the following domains. Number of domains for which requests cannot be filled (16 bits); a value of 0 indicates that the route server cannot fill any of the requests. Numeric identifier for each domain for which a request cannot be filled (16 bits).

4. Unable to fill requests for routes to the following destination domain. Numeric identifier for the destination domain (16 bits).

6 Route Generation

Route generation is the most computationally complex part of IDPR, because of the number of domains and the number and heterogeneity of policies that it must accommodate. Route servers must generate policy routes that satisfy the requested services of the source domain and respect the offered services of the transit domains.

We distinguish requested qualities of service and route generation with respect to them as follows:

1. Optimal requested services include minimum route delay, minimum route delay variation, minimum session monetary cost, and maximum available route bandwidth. In the worst case, the computational complexity of generating a route that is optimal with respect to a given requested service is O(N+L) for breadth-first (BF) search and O((N+L) log N) for Dijkstra's shortest path first (SPF) search, where N is the number of nodes and L is the number of links in the search graph. Multi-criteria optimization, for example finding a route with minimal delay variation and minimal session monetary cost, may be defined in several ways. One approach to multi-criteria optimization is to assign each link a single value equal to a weighted sum of the values of the individual offered qualities of service and generate a route that is optimal with respect to this new criterion. However, selecting the weights that yield the desired route generation behavior is itself an optimization procedure and hence not trivial.

2. Requested service limits include upper bounds on route delay, route delay variation, and session monetary cost and lower bounds on available route bandwidth.
Generating a route that must satisfy more than one quality of service constraint, for example route delay of no more than X seconds and available route bandwidth of no less than Y bits per second, is an NP-complete problem.

To contain the combinatorial explosion of processing and memory costs associated with route generation, we supply the following guidelines for generation of suitable policy routes:

1. Each route server should only generate policy routes from the perspective of its own domain as source; it need not generate policy routes for arbitrary source/destination domain pairs. Thus, we can distribute the computational burden over all route servers.

2. Route servers should precompute routes for which they anticipate requests and should generate routes on demand only in order to satisfy unanticipated route requests. Hence, a single route server can distribute its computational burden over time.

3. Route servers should cache the results of route generation, in order to minimize the computation associated with responding to future route requests.

4. To handle multi-criteria optimization in route selection, a route server should generate routes that are optimal with respect to the first optimal requested service specified in the route request message. The route server should resolve ties between otherwise equivalent routes by evaluating these routes according to the other optimal requested services contained in the route request message, in the order in which they are specified. With respect to the route server's routing information database, the selected route is optimal according to the first optimal requested service specified in the route request message but is not necessarily optimal according to any other optimal requested service specified in the route request message.

5. To handle requested service limits, a route server should always select the first route generated that satisfies all of the requested service limits.

6. To handle a mixture of requested service limits and optimal requested services, a route server should generate routes that satisfy all of the requested service limits. The route server should resolve ties between otherwise equivalent routes by evaluating these routes as described in the multi-criteria optimization case.

7. All else being equal, a route server should always prefer minimum-hop routes, because they minimize the amount of network resources consumed by the routes.

8. A route server should generate at least one route to each component of a partitioned destination domain, because it does not know in which domain component the destination host resides. Hence, a route server can maximize the chances of providing a feasible route to a destination within a partitioned domain.

6.1 Searching

We do not require that all route servers execute identical procedures for generating routes. Each domain administrator is free to specify the IDPR route generation procedure for route servers in its own domain, making the procedure as simple or as complex as desired.

We offer an IDPR route generation procedure as a model.
This procedure can be used either to generate a single policy route from the source domain to a specified destination domain or to generate a set of policy routes from the source domain to all destination domains. With slight modification, this procedure can be made to search in either BF or SPF order.

For high-bandwidth traffic flows, BF search is the recommended search technique, because it produces minimum-hop routes. For low-bandwidth traffic flows, the route server may use either BF search or SPF search. We recommend using SPF search only for optimal requested services and never in response to a request for a maximum bandwidth route.

6.1.1 Implementation

Data Structures: The routing information database contains the graph of the Internet, in which virtual gateways are the nodes and intra-domain routes between virtual gateways are the links. During route generation, each route is represented as a sequence of domains and relevant transit policies, together with a list of route characteristics, stored in a temporary array and indexed by destination domain.

Initialization:

1. Execute the Policy Consistency routine, first with the source domain as the given domain and second with the destination domain as the given domain. If any policy inconsistency precludes the requested traffic flow, go to Exit.

2. For each domain, initialize a null route, set the route bandwidth to 0, and set the following route characteristic values to infinity: route delay, route delay variation, session monetary cost, and route length in hops.

3. With each operational virtual gateway in the source domain, associate the route characteristics of the source domain.

4. Initialize a next-node data structure which will contain, for each route in progress, the virtual gateway at the current endpoint of the route together with the associated route characteristics. The next-node data structure determines the order in which routes get expanded.

   BF: A fifo queue.
   SPF: A heap, ordered according to the first optimal requested service listed in the route request message.

Remove Next Node: These steps are performed for each virtual gateway in the next-node data structure.

1. If there are no more virtual gateways in the next-node data structure, go to Exit.

2. Extract a virtual gateway and its associated route characteristics from the next-node data structure, obtain the adjacent domain, and:

   SPF: Remake the heap.

3. If there is a specific destination domain and if, for the primary optimal service:

   BF: Route length in hops.
   SPF: First optimal requested service listed in the route request message.

   the extracted virtual gateway's associated route characteristic is no better than that of the destination domain, go to Remove Next Node.

4. Execute the Policy Consistency routine with the adjacent domain as the given domain. If any policy inconsistency precludes the requested traffic flow, go to Remove Next Node.

5. Check that the source domain's transit policies do not preclude traffic generated by the source host with the specified user class and requested services, from flowing to the adjacent domain as destination.
This check is necessary because the route server 3366 caches all feasible routes to intermediate domains generated 3367 during the computation of the requested route. If there are no 3368 policy inconsistencies, associate the route and its characteristics 3369 with the adjacent domain. 3371 72 3372 6. If there is a specific destination domain and if the adjacent 3373 domain is that destination domain, go to Remove Next 3374 Node. 3376 7. Record the set of all exit virtual gateways in the adjacent domain 3377 for which the adjacent domain's transit policies permit the 3378 requested traffic flow and which are currently reachable from the 3379 entry virtual gateway. 3381 Next Node: These steps are performed for all exit virtual gateways in 3382 the above set. 3383 1. If there are no exit virtual gateways in the set, go to Remove Next 3384 Node. 3386 2. Compute the characteristics for the route to the exit virtual 3387 gateway, and check that all of the route characteristic values are 3388 within the requested service limits. If any of the route 3389 characteristic values are outside of these limits, go to Next Node. 3391 3. Compare these route characteristic values with those already 3392 associated with the exit virtual gateway (there may be none, if 3393 this is the first time the exit virtual gateway has been visited in 3394 the search), according to the primary optimal service. 3396 4. Select the route with the optimal value of the primary optimal 3397 service, resolve ties by considering optimality according to the 3398 other optimal requested services in rank order, and associate the 3399 selected route and its characteristics with the exit virtual 3400 gateway. 3402 5. Add the virtual gateway to the next-node structure: 3404 BF: Add to the end of the fifo queue. 3405 SPF: Add to the heap. 3407 and go to Next Node. 3409 Exit: Return a response to the route request, consisting of either a set 3410 of candidate policy routes or an indication that the route request 3411 cannot be fulfilled. 3413 Policy Consistency: Check policy consistency for the given domain. 3414 1. Check that the given domain is not specified as an excluded domain 3415 in the route request. 3417 73 3418 2. Check that the given domain's transit policies do not preclude 3419 traffic generated by the source host with the specified user class 3420 and requested services, from flowing to the destination host and 3421 domain. 3423 A path agent may wish to set up a bidirectional path using a route 3424 supplied by a route server. (Refer to sections 7.2 and 7.4 for detailed 3425 discussions of path directionality.) However, a route server can only 3426 guarantee that the routes it supplies are feasible if used in the direction 3427 from source to destination. The reason is that the route server, which 3428 resides in the source domain, does not have access to, and thus cannot 3429 account for, the source policies of the destination domain. Nevertheless, 3430 the route server can provide the path agent with an indication of its 3431 assessment of route feasibility in the direction from destination to source. 3433 A necessary but insufficient condition for a route to be feasible in the 3434 direction from destination to source is as follows. The route must be 3435 consistent, in the direction from destination to source, with the transit 3436 policies of the domains that compose the route.
The transit policy 3437 consistency checks performed by the route server during route generation 3438 account for the direction from source to destination but not for the 3439 direction from destination to source. Only after a route server generates a 3440 feasible route from source to destination does it perform the transit policy 3441 consistency checks for the route in the direction from destination to 3442 source. Following these checks, the route server includes in its route 3443 response message to the path agent an indication of its assessment of route 3444 feasibility in each direction. 3446 6.2 Route Database 3448 A policy route, as originally specified by a route server, is an ordered 3449 list of virtual gateways, domains, and transit policies: 3450 VG1 - AD1 - TP1 -...- VGn - ADn - TPn, where VGi is the virtual gateway 3451 that serves as exit from ADi-1 and entry to ADi, and TPi is the set of 3452 transit policies associated with ADi and relevant to the particular route. 3453 Route servers and path agents store policy routes in route databases 3454 maintained as caches whose entries must be periodically flushed to avoid 3455 retention of stale policy routes. A route server's route database is the 3456 set of all routes it has generated on behalf of its domain as source. A 3457 path agent's route database is the set of all routes it has requested and 3458 received from route servers on behalf of hosts within its domain. 3460 74 3461 When attempting to locate a feasible route for a traffic flow, a path 3462 agent first consults its own route database before querying a route server, 3463 provided that the source policy associated with the source host does not 3464 include any requested qualities of service. In this case, if its route 3465 database contains one or more routes between the given source and 3466 destination domains, the path agent checks each such route against the set 3467 of excluded domains listed in the source policy. The path agent either 3468 selects the first route encountered that does not include the excluded 3469 domains, or, if no such route exists in its route database, requests a route 3470 from a route server. 3472 The path agent must query a route server for routes when the source 3473 policy includes requested qualities of service. The reason is that the path 3474 agent retains no transit policy information, and in particular, no offered 3475 service information about other domains. Hence, the path agent cannot 3476 determine whether an entry in its route database satisfies the requested 3477 services. 3479 When responding to a path agent's request for a policy route, a route 3480 server first consults its route database, unless the route request message 3481 contains an explicit directive to generate a new route. If its route 3482 database contains one or more routes between the given source and 3483 destination domains, the route server checks each such route against the 3484 services requested by the path agent and the services offered by the domains 3485 composing the route. To obtain the offered services information, the route 3486 server consults its routing information database. The route server either 3487 selects the first route encountered that is consistent with both the 3488 requested and offered services, or, if no such route exists in its route 3489 database, attempts to generate a new route.
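To make the route database consultation above concrete, we sketch it below in Python. The sketch is illustrative only: the class and field names are our own invention, not part of this specification, and it covers only the case in which the source policy includes no requested qualities of service.

   from typing import Dict, List, Optional, Set, Tuple

   class PathAgentRouteCache:
       """Sketch of a path agent's route database (hypothetical names)."""

       def __init__(self) -> None:
           # Cached routes, keyed by (source domain, destination domain);
           # each route is recorded as the ordered list of domains it crosses.
           self.routes: Dict[Tuple[int, int], List[List[int]]] = {}

       def add(self, src_ad: int, dst_ad: int, domains: List[int]) -> None:
           self.routes.setdefault((src_ad, dst_ad), []).append(domains)

       def lookup(self, src_ad: int, dst_ad: int,
                  excluded: Set[int]) -> Optional[List[int]]:
           # Select the first cached route that avoids every excluded
           # domain; a return value of None sends the path agent to a
           # route server for a new route.
           for route in self.routes.get((src_ad, dst_ad), []):
               if not excluded.intersection(route):
                   return route
           return None

   cache = PathAgentRouteCache()
   cache.add(1, 7, [1, 3, 5, 7])
   cache.add(1, 7, [1, 2, 6, 7])
   assert cache.lookup(1, 7, {3}) == [1, 2, 6, 7]
   assert cache.lookup(1, 7, {3, 6}) is None   # must ask a route server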
3491 6.2.1 Cache Maintenance 3493 Each route stored in a route database has a finite cache lifetime equal to 3494 rdb_rs minutes for a route server and rdb_pa minutes for a path agent. 3495 Route servers and path agents reclaim cache space by flushing expired 3496 entries. Moreover, path agents reclaim cache space for routes whose paths 3497 have failed to be successfully set up or have been torn down (see 3498 section 7.4). 3500 Nevertheless, cache space may become scarce, even with reclamation of 3501 entries. If the cache fills, the route server or path agent logs the event 3502 for network management. To obtain a cache entry when the cache is full, the 3504 75 3505 route server or path agent deletes the oldest entry from the cache. 3507 76 3508 7 Path Control Protocol and Data Message Forwarding Procedure 3510 Two entities in different domains can exchange IDPR data messages only 3511 if there exists an IDPR path set up between the two domains. Path setup 3512 requires cooperation among path agents and intermediate policy gateways. 3513 Path agents locate policy routes, initiate the path control protocol (PCP), 3514 and manage existing paths between administrative domains. Intermediate 3515 policy gateways verify that a given policy route is consistent with their 3516 domains' transit policies, establish the forwarding information, and forward 3517 messages along existing paths. 3519 Each policy gateway and each route server contains a path agent. The 3520 path agent that initiates path setup in the source domain is the originator, 3521 and the path agent that handles the originator's path setup message in the 3522 destination domain is the target. Every path has two possible directions of 3523 traffic flow: from originator to target and from target to originator. 3524 Path control messages are free to travel in either direction, but data 3525 messages may be restricted to only one direction. 3527 Once a path for a policy route is set up, its physical realization is a 3528 set of consecutive policy gateways, with policy gateways or route servers 3529 forming the endpoints. Two successive entities in this set belong to either 3530 the same domain or the same virtual gateway. A policy gateway or route 3531 server may, at any time, recover the resources dedicated to a path that goes 3532 through it by tearing down that path. For example, a policy gateway may 3533 decide to tear down a path that has not been used for some period of time. 3535 PCP may build multiple paths between source and destination domains, but 3536 it is not responsible for managing such paths as a group or for eliminating 3537 redundant paths. 3539 7.1 An Example of Path Setup 3541 We illustrate how path setup works by stepping through an example. 3542 Suppose host HX in domain AD X wants to communicate with host HY in 3543 domain AD Y. HX need not know the identity of its own domain or of HY's 3544 domain in order to send messages to HY. Instead, HX simply forwards a 3545 message bound for HY to one of the gateways on its local network, according 3546 to its local forwarding information only. If the recipient gateway is a 3547 policy gateway, the resident path agent determines how to forward the 3548 message outside of the domain. Otherwise, the recipient gateway forwards 3549 the message to another gateway in AD X, according to its local forwarding 3550 information.
Eventually, the message will arrive at a policy gateway in 3552 77 3553 AD X, as policy gateways are the only egress points to other 3554 administrative domains, in domains that support IDPR. 3556 The path agent resident in the recipient policy gateway uses the message 3557 header, including source and destination addresses and any requested service 3558 information (for example, type of service), in order to determine whether it 3559 is an intra-domain or inter-domain message, and if inter-domain, whether it 3560 requires an IDPR policy route. Specifically, the path agent attempts to 3561 locate a forwarding information database entry for the given traffic flow, 3562 from the information contained in the message header. In the future, for IP 3563 messages, the relevant header information may also include special 3564 service-specific IP options or even information from higher layer protocols. 3566 Forwarding database entries exist for all of the following: 3568 1. All intra-domain traffic flows. Intra-domain forwarding information is 3569 integrated into the forwarding information database as soon as it is 3570 received. 3572 2. Inter-domain traffic flows that do not require IDPR policy routes. 3573 Non-IDPR forwarding information is integrated into the forwarding 3574 database as soon as it is received. 3576 3. IDPR inter-domain traffic flows for which a path has already been set 3577 up. IDPR forwarding information is integrated into the forwarding 3578 database only during path setup. 3580 The path agent uses the message header contents to guide the search for a 3581 forwarding information database entry for a traffic flow. We recommend a 3582 radix search to locate such an entry. When the search terminates, it 3583 produces either an entry, or, in the case of a new IDPR traffic flow, a 3584 directive to generate an entry. If the search terminates in an existing 3585 forwarding information database entry, the path agent forwards the message 3586 according to that entry. 3588 Suppose that the search terminates indicating that the traffic flow from 3589 HX to HY requires an IDPR policy route and that no entry in the forwarding 3590 information database yet exists for that flow. In this case, the path agent 3591 first determines the source and destination domains associated with the 3592 message's source and destination addresses, before attempting to obtain a 3593 policy route. The path agent relies on the mapping servers to supply the 3594 domain information, but it caches all mapping server responses locally to 3595 limit the number of future queries. When attempting to resolve an address 3596 to a domain, the path agent always checks its local cache before contacting 3598 78 3599 a mapping server. 3601 After obtaining the source and destination domain information, the path 3602 agent attempts to obtain a policy route to carry the traffic from HX to 3603 HY. The path agent relies on route servers to supply policy routes, but it 3604 caches all route server responses locally to limit the number of future 3605 queries. When attempting to locate a suitable policy route, the path agent 3606 usually consults its local cache before contacting a route server, as 3607 described in section 6.2. 3609 If no suitable cache entry exists, the path agent queries the route 3610 server, providing it with the source and destination domains together with 3611 source policy information carried in the host message and specified through 3612 configuration. 
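We note that the mapping server and route server interactions described above follow the same cache-before-query discipline. The following Python sketch captures that discipline; the names are hypothetical, the entry lifetime is illustrative, and the server exchange itself is abstracted behind a callable.

   import time
   from typing import Callable, Dict, Hashable, Tuple

   class CachedQuery:
       """Sketch of the cache-before-query discipline (hypothetical names)."""

       def __init__(self, query: Callable[[Hashable], object],
                    lifetime: float) -> None:
           self.query = query          # stand-in for a server exchange
           self.lifetime = lifetime    # cache entry lifetime in seconds
           self.cache: Dict[Hashable, Tuple[object, float]] = {}

       def resolve(self, key: Hashable) -> object:
           # Always check the local cache before contacting the server,
           # limiting the number of future queries; an expired entry is
           # refreshed by a new query.
           now = time.monotonic()
           entry = self.cache.get(key)
           if entry is None or now - entry[1] > self.lifetime:
               self.cache[key] = (self.query(key), now)
           return self.cache[key][0]

   # A stand-in mapping server: resolves an address to a domain identifier.
   resolver = CachedQuery(query=lambda addr: hash(addr) % 65536,
                          lifetime=600.0)
   ad = resolver.resolve("192.0.2.1")
   assert resolver.resolve("192.0.2.1") == ad   # served from the local cache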
Upon receiving a policy route query, a route server consults 3613 its route database. If it cannot locate a suitable route in its route 3614 database, the route server attempts to generate at least one route to 3615 AD Y , consistent with the requested services for HX. 3617 The route server always returns a response to the path agent, regardless 3618 of whether it is successful in locating a suitable policy route. The 3619 response to a successful route query consists of a set of candidate routes, 3620 from which the path agent makes its selection. We expect that a path agent 3621 will normally choose a single route from a candidate set. Nevertheless, 3622 IDPR does not preclude a path agent from selecting multiple routes from the 3623 candidate set. A path agent may desire multiple routes to support features 3624 such as fault tolerance or load balancing; however, IDPR does not specify 3625 how the path agent should use multiple routes. 3627 If the policy route is a new route provided by the route server, there 3628 will be no existing path for the route, and thus the path agent must set up 3629 such a path. However, if the policy route is an existing route extracted 3630 from the path agent's cache, there may well be an existing path for the 3631 route, set up to accommodate a different host traffic flow. IDPR permits 3632 multiple host traffic flows to use the same path, provided that all flows 3633 sharing the path travel between the same endpoint domains and have the same 3634 service requirements. Nevertheless, IDPR does not preclude a path agent 3635 from setting up distinct paths along the same policy route to preserve the 3636 distinction between the host traffic flows. 3638 The path agent associates an identifier with the path, which is included 3639 in each message that travels down the path and is used by the policy 3640 gateways along the path in order to determine how to forward the message. 3641 If the path already exists, the path agent uses the preexisting identifier. 3643 79 3644 However, for new paths, the path agent chooses a path identifier that is 3645 different from those of all other paths that it manages. The path agent 3646 also updates its forwarding information database to reference the path 3647 identifier and modifies its search procedure to yield the correct entry in 3648 the forwarding information database given the data message header. 3650 For new paths, the path agent initiates path setup, communicating the 3651 policy route, in terms of requested services, constituent domains, relevant 3652 transit policies, and the connecting virtual gateways, to policy gateways in 3653 intermediate domains. Using this information, an intermediate policy 3654 gateway determines whether to accept or refuse the path and to which policy 3655 gateway to forward the path setup information. The path setup procedure 3656 allows policy gateways to set up a path in both directions simultaneously. 3657 Each intermediate policy gateway, after path acceptance, updates its 3658 forwarding information database to include an entry that associates the path 3659 identifier with the appropriate previous and next hop policy gateways. 3661 When a policy gateway in AD Y accepts a path, it notifies the source 3662 path agent in AD X. We expect that the source path agent will normally 3663 wait until a path has been successfully established before using it to 3664 transport data traffic. 
However, PCP does not preclude a path agent from 3665 forwarding messages along a path prior to confirmation of successful path 3666 establishment. Paths remain in place until they are torn down because of 3667 failure, expiration, or, when resources are scarce, preemption in favor of 3668 other paths. 3670 We note that data communication between HX and HY may occur over two 3671 separate IDPR paths: one from AD X to AD Y and one from AD Y to 3672 AD X. The reasons are that within a domain, hosts know nothing about 3673 policy gateways or IDPR paths, and policy gateways know nothing about other 3674 policy gateways' existing IDPR paths. Thus, in AD Y, the policy gateway 3675 that terminates the path from AD X may not be the same as the policy 3676 gateway that receives traffic from HY destined for HX. In this case, 3677 receipt of traffic from HY forces the second policy gateway to set up an 3678 independent path from AD Y to AD X. 3680 7.2 Path Identifiers 3682 Each path has an associated path identifier, unique throughout the 3683 Internet. Every IDPR data message travelling along that path includes the 3684 path identifier, used for message forwarding. The path identifier is the 3685 concatenation of three items: the identifier of the originator's domain; 3686 the identifier of the originator's policy gateway or route server; and a 3688 80 3689 32-bit local path identifier specified by the originator. The path 3690 identifier and the CMTP transaction identifier have analogous syntax and 3691 play analogous roles in their respective protocols. 3693 When issuing a new path identifier, the originator always assigns a local 3694 path identifier that is different from that of any other active or recently 3695 torn-down path originally set up by that path agent. This helps to 3696 distinguish new paths from replays. Hence, the originator must keep a 3697 record of each extinct path for long enough that all policy gateways on the 3698 path will have eliminated any reference to it from their memories. The 3699 right-most 30 bits of the local identifier are the same for each path 3700 direction, as they are assigned by the originator. The left-most 2 bits of 3701 the local identifier indicate the path direction. 3703 At path setup time, the originator specifies which of the path directions 3704 to enable, contingent upon the information received from the route server in 3705 the route response message. By enable, we mean that each path agent and 3706 each intermediate policy gateway establishes an association between the path 3707 identifier and the previous and next policy gateways on the path, which it 3708 uses for forwarding data messages along that path. IDPR data messages may 3709 travel in the enabled path directions only, but path control messages are 3710 always free to travel in either path direction. The originator may enable 3711 neither path direction, if the entire data transaction can be carried in the 3712 path setup message itself. In this case, the path agents and the 3713 intermediate policy gateways do not establish forwarding associations for 3714 the path, but they do verify consistency of the policy information contained 3715 in the path setup message, with their own transit policies, before 3716 forwarding the setup message on to the next policy gateway. 3718 The path direction portion of the local path identifier has different 3719 interpretations, depending upon message type.
In an IDPR path setup 3720 message, the path direction indicates the directions in which the path 3721 should be enabled: the value 01 denotes originator to target; the value 10 3722 denotes target to originator; the value 11 denotes both directions; and the 3723 value 00 denotes neither direction. Each policy gateway along the path 3724 interprets the path direction in the setup message and sets up the 3725 forwarding information as directed. In an IDPR data message, the path 3726 direction indicates the current direction of traffic flow: either 01 for 3727 originator to target or 10 for target to originator. Thus, if for example, 3728 an originator sets up a path enabling only the direction from target to 3729 originator, the target sends data messages containing the path identifier 3730 selected by the originator together with the path direction set equal to 10. 3732 81 3733 Instead of using path identifiers that are unique throughout the 3734 Internet, we could have used path identifiers that are unique only between a 3735 pair of consecutive policy gateways and that change from one policy gateway 3736 pair to the next. The advantage of locally unique path identifiers is that 3737 they can be much shorter than global identifiers and hence consume less 3738 bandwidth on links. However, the disadvantage is that the path identifier 3739 carried in each IDPR data message must be modified at each policy gateway, 3740 and hence if the integrity/authentication information covers the path 3741 identifier, it must be recomputed at each policy gateway. For security 3742 reasons, we have chosen to include the path identifier in the set of 3743 information covered by the integrity/authentication value, and moreover, we 3744 advocate public-key based signatures for authentication. Thus, it is not 3745 possible for intermediate policy gateways to modify the path identifier and 3746 then recompute the correct integrity/authentication value. Therefore, we 3747 have decided in favor of path identifiers that do not change from hop to hop 3748 and hence must be globally unique. To speed forwarding of IDPR data 3749 messages with long path identifiers, policy gateways hash the path 3750 identifiers in order to index IDPR forwarding information. 3752 7.3 Path Control Messages 3754 Messages exchanged by the path control protocol are classified into 3755 requests: setup, teardown, repair; and responses: accept, refuse, error. 3756 These messages have significance for intermediate policy gateways as well as 3757 for path agents. 3759 setup: Establishes a path by linking together pairs of policy gateways. 3760 The setup message is generated by the originator and propagates to the 3761 target. In response to a setup message, the originator expects to 3762 receive an accept, refuse, or error message. The setup message carries 3763 all information necessary to set up the path including path identifier, 3764 requested services, transit policy information relating to each domain 3765 traversed, and optionally, expedited data. 3766 accept: Signals successful path establishment. The accept message is 3767 generated by the target, in response to a setup message, and propagates 3768 back to the originator. Reception of an accept message by the 3769 originator indicates that the originator can now safely proceed to send 3770 data along the path. The accept message contains the path identifier 3771 and an optional reason for conditional acceptance. 
3772 refuse: Signals that the path could not be successfully established, either 3773 because resources were not available or because there was an 3775 82 3776 inconsistency between the services requested and the services offered. 3777 The refuse message is generated by the target or by any intermediate 3778 policy gateway, in response to a setup message, and propagates back to 3779 the originator. All recipients of a refuse message usually recover the 3780 resources dedicated to the given path. The refuse message contains the 3781 path identifier and the reason for path refusal. 3783 teardown: Tears down a path, typically when a non-recoverable failure is 3784 detected. The teardown message may be generated by any path agent or 3785 policy gateway in the path and usually propagates in both path 3786 directions. All recipients of a teardown message recover the resources 3787 dedicated to the given path. The teardown message contains the path 3788 identifier and the reason for path teardown. 3790 repair: Establishes a repaired path by linking together pairs of policy 3791 gateways. The repair message is generated by a policy gateway after 3792 detecting that the next policy gateway on one of its existing paths is 3793 unreachable. A policy gateway that generates a repair message 3794 propagates the message forward at most two policy gateways. In 3795 response to a repair message, the policy gateway expects to receive an 3796 accept, refuse, teardown, or error message. The repair message carries 3797 the original setup message. 3799 error: Transports information about a path error back to the originator, 3800 when a PCP message contains unrecognized information. The error 3801 message may be generated by the target or by any intermediate policy 3802 gateway and propagates back to the originator. Most but not all 3803 error messages are generated in response to errors encountered during 3804 path setup. The error message includes the path identifier and an 3805 explanation of the error detected. 3807 Policy gateways use CMTP for reliable transport of PCP messages, between 3808 path agents and policy gateways and between consecutive policy gateways on a 3809 path. PCP must communicate to CMTP the maximum number of transmissions per 3810 path control message, pcp_ret, and the interval between path control message 3811 retransmissions, pcp_int microseconds. All path control messages, except 3812 error messages, may be transmitted up to pcp_ret times; error messages are 3813 never retransmitted. A path control message is acceptable if: 3815 1. It passes the CMTP validation checks. 3817 2. Its timestamp is less than pcp_old (300) seconds behind the recipient's 3818 internal clock time. 3820 83 3821 3. It carries a recognized path identifier, provided it is not a setup 3822 message. 3824 The path control message age limit reduces the likelihood of denial of 3825 service attacks based on message replay. An intermediate policy gateway 3826 forwards acceptable PCP messages. As we describe in section 7.4 below, 3827 setup messages must undergo additional tests at each intermediate policy 3828 gateway prior to forwarding. Moreover, receipt of an acceptable accept, 3829 refuse, teardown, or error message at either path agent or at any 3830 intermediate policy gateway indirectly cancels any active local CMTP 3831 retransmissions of the original setup message.
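By way of illustration, the three acceptability tests above may be composed as in the following Python sketch. All names are illustrative, and the CMTP validation checks are abstracted to a boolean supplied by the caller.

   import time
   from typing import Optional, Set

   PCP_OLD = 300   # pcp_old: maximum acceptable timestamp age, in seconds

   def pcp_message_acceptable(msg: dict, known_path_ids: Set[int],
                              cmtp_valid: bool,
                              now: Optional[float] = None) -> bool:
       if now is None:
           now = time.time()
       # 1. The message must pass the CMTP validation checks
       #    (abstracted here as a boolean supplied by the caller).
       if not cmtp_valid:
           return False
       # 2. Its timestamp must be less than pcp_old seconds behind the
       #    recipient's internal clock time.
       if now - msg["timestamp"] >= PCP_OLD:
           return False
       # 3. It must carry a recognized path identifier, unless it is a
       #    setup message.
       if msg["type"] != "setup" and msg["path_id"] not in known_path_ids:
           return False
       return True

   msg = {"type": "accept", "path_id": 42, "timestamp": time.time()}
   assert pcp_message_acceptable(msg, known_path_ids={42}, cmtp_valid=True)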
When a path agent or 3832 intermediate policy gateway receives an unacceptable path control message, 3833 it discards the message and logs the event for network management. 3835 7.4 Setting Up and Tearing Down a Path 3837 Path setup begins when the originator generates a setup message 3838 containing: 3840 1. The path identifier, including path directions to enable. 3842 2. An indication of whether the message includes expedited data. 3844 3. The source user class. 3846 4. The requested services for the path (see section 5.3.2). 3848 5. For each domain on the path, the domain component, applicable transit 3849 policies, and entry and exit virtual gateways. 3851 The only mandatory requested services are the maximum path lifetime, 3852 pth_lif, and the data message integrity/authentication type. If these are 3853 not specified in the path setup message, each recipient policy gateway 3854 assigns them default values: (60) minutes for pth_lif and no authentication 3855 for integrity/authentication type. Each path agent and intermediate policy 3856 gateway tears down a path when the path lifetime is exceeded. Hence, no 3857 single source can indefinitely monopolize policy gateway resources or still 3858 functioning parts of partially broken paths. 3860 After generating the setup message and establishing the proper local 3861 forwarding information, the originator selects the next policy gateway on 3862 the path and forwards the setup message to the selected policy gateway. The 3863 next policy gateway selection procedure described below applies when either the 3864 originator or an intermediate policy gateway is making the selection. 3866 84 3867 We have elected to describe the procedure from the perspective of a 3868 selecting intermediate policy gateway. 3870 The policy gateway selects the next policy gateway on a path in 3871 round-robin order from its list of policy gateways contained in the next hop 3872 virtual gateway. In selecting the next policy gateway, the policy gateway 3873 uses information contained in the setup message and information provided by 3874 VGP and by the intra-domain routing procedure. 3876 If the selecting policy gateway is a domain entry point, the next policy 3877 gateway must be: 3879 1. A member of the next virtual gateway listed in the setup message. 3881 2. Reachable according to intra-domain routes supporting the transit 3882 policies listed in the setup message. 3884 3. Able to reach, according to VGP, the next domain component listed in 3885 the setup message. 3887 If the selecting policy gateway is a domain exit point, the next policy 3888 gateway must be: 3890 1. A member of the current virtual gateway listed in the setup message 3891 (which is also the selecting policy gateway's virtual gateway). 3893 2. Reachable according to VGP. 3895 3. A member of the next domain component listed in the setup message. 3897 In addition, the selecting policy gateway may use the requested services 3898 listed in the setup message to resolve ties between otherwise equivalent 3899 next policy gateways in the same domain. In particular, the selecting 3900 policy gateway may use any quality of service information supplied by 3901 intra-domain routing, to select the next policy gateway whose connecting 3902 intra-domain route is optimal according to the requested services. 3904 Once the originator or intermediate policy gateway selects a next policy 3905 gateway, it forwards the setup message to the selected policy gateway.
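The following Python sketch illustrates the selection discipline above. The names are our own, and the entry-point and exit-point criteria are abstracted into a single eligibility test supplied by the caller.

   import itertools
   from typing import Callable, List, Optional

   class NextHopSelector:
       """Round-robin next policy gateway selection (hypothetical names)."""

       def __init__(self, peers: List[str]) -> None:
           # Policy gateways of the relevant virtual gateway, cycled
           # round-robin across successive selections.
           self._count = len(peers)
           self._cycle = itertools.cycle(peers)

       def select(self, eligible: Callable[[str], bool]) -> Optional[str]:
           # Return the next candidate satisfying the eligibility test,
           # standing in for the entry-point or exit-point criteria above;
           # None means no acceptable next policy gateway exists.
           for _ in range(self._count):
               pg = next(self._cycle)
               if eligible(pg):
                   return pg
           return None

   selector = NextHopSelector(["PG1", "PG2", "PG3"])
   assert selector.select(lambda pg: pg != "PG2") == "PG1"
   assert selector.select(lambda pg: pg != "PG2") == "PG3"   # round-robin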
Each 3906 recipient (policy gateway or target) of an acceptable setup message performs 3907 several checks on the contents of the message, in order to determine whether 3908 to establish or reject the path. We describe these checks in detail below 3909 from the perspective of a policy gateway as setup message recipient. 3911 85 3912 7.4.1 Validating Path Identifiers 3914 The recipient of a setup message first checks the path identifier, to make 3915 sure that it does not correspond to that of an already existing or recently 3916 extinct path. To detect replays, malicious or otherwise, path agents and 3917 policy gateways maintain a record of each path that they establish, for 3918 max{pth_lif, pcp_old} seconds. If the path identifier and timestamp 3919 carried in the setup message match a stored path identifier and timestamp, 3920 the policy gateway considers the message to be a retransmission and does not 3921 forward the message. If the path identifier carried in the setup message 3922 matches a stored path identifier but the two timestamps do not agree, the 3923 policy gateway abandons path setup, logs the event for network management, 3924 and returns an error message to the originator via the previous policy 3925 gateway. 3927 7.4.2 Path Consistency with Configured Transit Policies 3929 Provided the path identifier in the setup message appears to be new, the 3930 policy gateway proceeds to determine whether the information contained 3931 within the setup message is consistent with the transit policies configured 3932 for its domain. The policy gateway must locate the source and destination 3933 domains, the source user class, and its domain-specific information, within 3934 the setup message, in order to evaluate path consistency. If the policy 3935 gateway fails to recognize the source user class (or one or more of the 3936 requested services), it logs the event for network management but continues 3937 with path setup. If the policy gateway fails to locate its domain within 3938 the setup message, it abandons path setup, logs the event for network 3939 management, and returns an error message to the originator via the previous 3940 policy gateway. The originator responds by tearing down the path and 3941 subsequently removing the route from its cache. 3943 Once the policy gateway locates its domain-specific portion of the setup 3944 message, it may encounter the following problems with the contents: 3946 1. The domain-specific portion lists a transit policy not configured for 3947 the domain. 3949 2. The domain-specific portion lists a virtual gateway not configured for 3950 the domain. 3952 In each case, the policy gateway abandons path setup, logs the event for 3953 network management, and returns an error message to the originator via the 3955 86 3956 previous policy gateway. These types of error messages indicate to the 3957 originator that the route may have been generated using information from an 3958 out-of-date configuration message. 3960 The originator responds to the receipt of such an error message as 3961 follows. First, it tears down the path and removes the route from its 3962 cache. Then, it issues to a route server a route request message containing 3963 a directive to refresh the routing information database with the most recent 3964 configuration message from the domain that issued the error message, before 3965 generating a new route.
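The replay check of section 7.4.1 reduces to a comparison of the path identifier and timestamp carried in the setup message against the record of established paths. A minimal Python sketch follows; the flat dictionary and the names are illustrative assumptions, not mandated structures.

   from typing import Dict

   def classify_setup(path_id: int, timestamp: float,
                      established: Dict[int, float]) -> str:
       # established: path identifier -> timestamp recorded at path
       # establishment, retained for max{pth_lif, pcp_old} seconds.
       if path_id not in established:
           return "new"               # proceed with path setup
       if established[path_id] == timestamp:
           return "retransmission"    # do not forward the setup message
       # Matching identifier, different timestamp: abandon path setup,
       # log the event, and return an error message to the originator.
       return "replay"

   established = {0xABCD0001: 1000.0}
   assert classify_setup(0xABCD0001, 1000.0, established) == "retransmission"
   assert classify_setup(0xABCD0001, 1500.0, established) == "replay"
   assert classify_setup(0x00010001, 1000.0, established) == "new"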
3967 Once it verifies that its domain-specific information in the setup 3968 message is recognizable, the policy gateway then checks that the information 3969 contained within the setup message is consistent with the transit policies 3970 configured for its domain. A policy gateway at the entry to a domain checks 3971 path consistency in the direction from originator to target, if the enabled 3972 path directions include originator to target. A policy gateway at the exit 3973 to a domain checks path consistency in the direction from target to 3974 originator, if the enabled path directions include target to originator. 3976 When evaluating the consistency of the path with the configured transit 3977 policies, the policy gateway may encounter any of the following problems 3978 with setup message contents: 3980 1. A listed transit policy does not apply between the listed virtual 3981 gateways in the given direction. 3983 2. A listed transit policy denies access to traffic between the listed 3984 source and destination domains. 3986 3. A listed transit policy denies access to traffic of the listed user 3987 class. 3989 4. A listed transit policy denies access to traffic at the current time. 3991 In each case, the policy gateway abandons path setup, logs the event for 3992 network management, and returns a refuse message to the originator via the 3993 previous policy gateway. These types of refuse messages indicate to the 3994 originator that the route may have been generated using information from an 3995 out-of-date configuration message. The refuse message also serves to 3996 tear down the path. 3998 The originator responds to such a refuse message first by removing the 3999 route from its cache. Then, it issues to a route server a route request 4001 87 4002 message containing a directive to refresh the routing information database 4003 with the most recent configuration message from the domain that issued the 4004 refuse message, before generating a new route. 4006 7.4.3 Path Consistency with Virtual Gateway Reachability 4008 Provided the information contained in the setup message is consistent with 4009 the transit policies configured for its domain, the policy gateway proceeds 4010 to determine whether the path is consistent with the reachability of the 4011 virtual gateway containing the potential next hop. To determine virtual 4012 gateway reachability, the policy gateway uses information provided by VGP 4013 and by the intra-domain routing procedure. 4015 When evaluating the consistency of the path with virtual gateway 4016 reachability, the policy gateway may encounter any of the following 4017 problems: 4019 1. The virtual gateway containing the potential next hop is down. 4021 2. The virtual gateway containing the potential next hop is not reachable 4022 via any intra-domain routes supporting the transit policies listed in 4023 the setup message. 4025 3. The next domain component listed in the setup message is not reachable. 4027 Each of these determinations is made from the perspective of a single policy 4028 gateway and may not reflect actual reachability. In each case, the policy 4029 gateway encountering such a problem returns a refuse message to the previous 4030 policy gateway, which then selects a different next policy gateway as 4031 described in section 7.4 above.
If the policy gateway receives the same 4032 response from all next policy gateways selected, it abandons path setup, 4033 logs the event for network management, and returns the refuse message to the 4034 originator via the previous policy gateway. These types of refuse messages 4035 indicate to the originator that the route may have been generated using 4036 information from an out-of-date dynamic message. The refuse message also 4037 serves to tear down the path. 4039 The originator first responds to such a refuse message by removing the 4040 route from its cache. Then, it issues to a route server a route request 4041 message containing a directive to refresh the routing information database 4042 with the most recent dynamic message from the domain that issued the refuse 4043 message, before generating a new route. 4045 88 4046 7.4.4 Obtaining Resources 4048 Once the policy gateway determines that the setup message contents are 4049 consistent with the transit policies and virtual gateway reachability of the 4050 recipient's domain, it attempts to gain resources for the new path. For 4051 this version of IDPR, path resources consist of memory in the local 4052 forwarding information database. However, in the future, path resources may 4053 also include reserved link bandwidth. 4055 If the policy gateway does not have resources to establish the new path, 4056 it uses the following algorithm to determine whether to generate a refuse 4057 message for the new path or a teardown message for an existing path in favor 4058 of the new path. There are two cases: 4060 1. No paths have been idle for more than pcp_idle (300) seconds. In this 4061 case, the policy gateway returns a refuse message to the previous 4062 policy gateway. This policy gateway then tries to select a different 4063 next policy gateway, as described in section 7.4 above, provided the 4064 policy gateway that issued the refuse message was not the target. If 4065 the refuse message was issued by the target or if there is no available 4066 next policy gateway, the policy gateway returns the refuse message to 4067 the originator via the previous policy gateway and logs the event for 4068 network management. The refuse message serves to tear down the path. 4070 2. At least one path has been idle for more than pcp_idle seconds. In 4071 this case, the policy gateway tears down an older path in order to 4072 accommodate the newer path and logs the event for network management. 4073 Specifically, the entity tears down the least recently used path of 4074 those that have been idle for longer than pcp_idle seconds, resolving 4075 ties by choosing the oldest such path. 4077 If the policy gateway has sufficient resources to establish the path, it 4078 attempts to update its local forwarding information database with 4079 information about the path identifier, previous and next policy gateways on 4080 the path, and directions in which the path should be enabled for data 4081 traffic transport. 4083 7.4.5 Target Response 4085 When an acceptable setup message successfully reaches an entry policy 4086 gateway in the destination domain, this policy gateway performs all of 4087 the checks described in the above sections. Provided no problems are 4089 89 4090 encountered, the policy gateway's path agent becomes the target, unless 4091 there is an explicit target specified in the setup message, as with RSQP 4092 messages exchanged between remote route servers (see section 5.2).
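Returning briefly to the resource-reclamation rule of section 7.4.4 above, the choice between refusing a new path and preempting an idle one admits a compact statement. The following Python sketch uses hypothetical names and represents each path by its setup time and time of last use.

   from typing import List, Optional, Tuple

   PCP_IDLE = 300   # pcp_idle, in seconds

   def select_path_to_preempt(paths: List[Tuple[str, float, float]],
                              now: float) -> Optional[str]:
       # paths: (path identifier, setup time, time of last use) tuples.
       # Only paths idle for more than pcp_idle seconds are expendable;
       # among those, tear down the least recently used, resolving ties
       # by choosing the oldest. None directs the caller to return a
       # refuse message instead.
       idle = [p for p in paths if now - p[2] > PCP_IDLE]
       if not idle:
           return None
       return min(idle, key=lambda p: (p[2], p[1]))[0]

   paths = [("P1", 0.0, 100.0), ("P2", 50.0, 100.0), ("P3", 0.0, 900.0)]
   assert select_path_to_preempt(paths, now=1000.0) == "P1"  # oldest of ties
   assert select_path_to_preempt(paths, now=150.0) is None   # refuse instead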
If the 4093 policy gateway is not the target, it attempts to forward the setup message 4094 to the target along an intra-domain route. However, if the target is not 4095 reachable via intra-domain routing, the policy gateway abandons path setup, 4096 logs the event for network management, and returns a refuse message to the 4097 originator via the previous policy gateway. The refuse message serves to 4098 tear down the path. 4100 Once the setup message reaches the target, the target determines whether 4101 it has sufficient path resources. Provided the target does have sufficient 4102 resources to establish the path, it generates an accept message. The target 4103 then determines whether the destination host is reachable via intra-domain 4104 routing and includes this information in the accept message, before 4105 returning the accept message to the originator via the previous policy 4106 gateway. Destination host reachability information aids the originator in 4107 determining if the path can be used to reach the destination host. 4109 The target may choose to use the reverse path to transport data traffic 4110 to the source domain, if the enabled path directions include 10 or 11. 4111 However, the target must first verify the consistency of the reverse path 4112 with its domain's configured source and transit policies. 4114 7.4.6 Originator Response 4116 The originator expects to receive an accept, refuse, or error message in 4117 response to a setup message. There are three cases: 4119 1. The originator receives an accept message, confirming successful path 4120 establishment. To expedite data delivery, the originator may forward 4121 data messages along the path prior to receiving an accept message, with 4122 the understanding that there is no guarantee that the path actually 4123 exists. 4125 2. The originator receives a refuse message or an error message, implying 4126 that the path could not be successfully established. In response, the 4127 originator attempts to set up a different path to the same destination, 4128 as long as the number of selected different paths does not exceed 4129 setup_try (3). If the originator is unsuccessful after setup_try 4130 attempts, it abandons path setup and logs the event for network 4131 management. 4133 90 4134 3. The originator fails to receive any response to the setup message 4135 within setup_int microseconds after transmission. In this case, the 4136 originator attempts path setup using the same policy route and a new 4137 path identifier, as long as the number of path setup attempts using the 4138 same route does not exceed setup_ret (2). If the originator fails to 4139 receive a response to a setup message after setup_ret attempts, it logs 4140 the event for network management and then proceeds as though it 4141 received a negative response, namely a refuse or an error, to the setup 4142 message. Specifically, it attempts to set up a different path to the 4143 same destination, or it abandons path setup altogether, depending on 4144 the value of setup_try. 4146 7.4.7 Path Life 4148 Once set up, a path does not live forever. A path agent or policy gateway 4149 may tear down an existing path, provided any of the following conditions are 4150 true: 4152 1. The maximum path lifetime (in minutes, bytes, or messages) has been 4153 exceeded. An originator path agent generates a teardown message for 4154 propagation toward the target. A target path agent generates a 4155 teardown message for propagation toward the originator. 
An 4156 intermediate policy gateway generates two teardown messages, one for 4157 propagation toward the originator and one for propagation toward the 4158 target. In all cases, the IDPR entity detecting path expiration logs 4159 the event for network management. 4161 2. The previous or next policy gateway becomes unreachable, across a 4162 virtual gateway or across a domain according to a given transit policy, 4163 and the path is not repairable. If the previous policy gateway is 4164 unreachable, a policy gateway generates a teardown message for 4165 propagation to the target. If the next policy gateway is unreachable, 4166 a policy gateway generates a teardown message for propagation to the 4167 originator. In either case, the policy gateway detecting the 4168 reachability problem logs the event for network management. 4170 3. All of the policy gateway's path resources are in use, a new path 4171 requires resources, and the given existing path is expendable, 4172 according to the least recently used criterion discussed in 4173 section 7.4.4 above. A target path agent generates a teardown message 4174 for propagation toward the originator. An intermediate policy gateway 4175 generates two teardown messages, one for propagation toward the 4177 91 4178 originator and one for propagation toward the target. In either case, 4179 the IDPR entity initiating path preemption logs the event for network 4180 management. 4182 Path teardown at a path agent or policy gateway, whether initiated by one of 4183 the above events or by receipt of a teardown message (or a refuse message 4184 during path setup, as discussed in the previous sections), causes the path 4185 agent or policy gateway to release all resources devoted to both directions 4186 of the path. 4188 7.5 Path Failure and Recovery 4190 When a policy gateway fails, it may not be able to save information 4191 pertaining to its established paths. Thus, when the policy gateway returns 4192 to service, it has no recollection of the paths set up through it and can no 4193 longer forward data messages along these paths. We expect that when a 4194 policy gateway fails, it will usually be out of service for long enough that 4195 the up/down protocol and the intra-domain routing procedure can detect that 4196 the particular policy gateway is no longer reachable. In this case, 4197 adjacent or neighbor policy gateways that have set up paths through the 4198 failed policy gateway and that have detected the failure, attempt local 4199 route repair (see section 7.5.2 below), and if unsuccessful, issue teardown 4200 messages for all affected paths. 4202 7.5.1 Handling Implicit Path Failures 4204 Nevertheless, policy gateways along a path must be able to handle the case 4205 in which a policy gateway fails and subsequently returns to service without 4206 either the up/down protocol or the intra-domain routing procedure detecting 4207 the failure, although we do not expect this event to occur often. If the 4208 policy gateway previously contained forwarding information for several 4209 established paths, it may now receive many IDPR data messages containing 4210 unrecognized path identifiers. This policy gateway must alert the data 4211 sources that their paths through the given policy gateway are no longer 4212 viable. 4214 Policy gateways that receive IDPR data messages with unrecognized path 4215 identifiers take one of the following two actions, depending upon their past 4216 failure record: 4218 1. 
The policy gateway has not failed in the past pg_up (24) hour period. 4219 In this case, there are at least four possible reasons for the 4221 92 4222 unrecognized path identifier in the data message: 4224 (a) The data message path identifier has been corrupted in a way that 4225 is not detectable by the integrity/authentication value, if one is 4226 present. 4228 (b) The policy gateway has experienced a memory error. 4230 (c) The policy gateway failed sometime during the life of the path and 4231 the source sent no data on the path for a period of pg_up hours 4232 following the failure. Although paths may persist for more than 4233 pg_up hours, we expect that they will also be used more frequently 4234 than once every pg_up hours. 4236 (d) The path was not successfully established, and the originator sent 4237 data messages down the path prior to receiving a response to its 4238 setup message. 4240 In all cases, the policy gateway discards the data message and logs the 4241 event for network management. 4243 2. The policy gateway has failed at least once in the past pg_up hour 4244 period. Thus, the policy gateway assumes that the unrecognized path 4245 identifier in the data message can be attributed to its failure. In 4246 response to the data message, the policy gateway generates an error 4247 message containing the unrecognized path identifier. The policy 4248 gateway then sends the error message back to the entity from which it 4249 received the data message, which should be equivalent to the previous 4250 policy gateway on the path. 4252 When the previous policy gateway receives the error message, it decides 4253 whether the message is acceptable. If the policy gateway does not recognize 4254 the path identifier contained in the error message, it does not find the 4255 error message acceptable and subsequently discards the message. However, if 4256 the policy gateway does find the error message acceptable, it then 4257 determines whether it has already received an accept message for the given 4258 path. If the policy gateway has not received an accept message for that 4259 path, it discards the error message and takes no further action. 4261 If the policy gateway has received an accept message for that path, it 4262 then attempts path repair, as described in section 7.5.2 below. Only if 4263 path repair is unsuccessful does the previous policy gateway generate a 4264 teardown message for the path and return it to the originator. The teardown 4265 message includes the domain and virtual gateway containing the policy 4267 93 4268 gateway that failed, which aids the originator in selecting a new path that 4269 does not include the domain containing the failed policy gateway. This 4270 mechanism ensures that path agents quickly discover and recover from 4271 disrupted paths, while guarding against unwarranted path teardown. 4273 7.5.2 Local Path Repair 4275 Failure of one or more entities on a given path may render the path 4276 unusable. If the failure is within a domain, IDPR relies on the 4277 intra-domain routing procedure to find an alternate route across the domain, 4278 which leaves the path unaffected. If the failure is in a virtual gateway, 4279 policy gateways must bear the responsibility of repairing the path. Policy 4280 gateways nearest to the failure are the first to recognize its existence and 4281 hence can react most quickly to repair the path.
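We note that the failure-record rule of section 7.5.1 above, on which the repair decision builds, can be stated compactly. The Python sketch below captures only the decision, not the surrounding message handling; the names are illustrative.

   from typing import Optional

   PG_UP = 24 * 3600   # pg_up (24 hours), expressed in seconds

   def handle_unrecognized_path_id(last_failure: Optional[float],
                                   now: float) -> str:
       # A policy gateway that has not failed within the past pg_up hours
       # discards the data message and logs the event; one that has
       # failed attributes the unrecognized identifier to its own failure
       # and returns an error message to the previous policy gateway.
       if last_failure is None or now - last_failure > PG_UP:
           return "discard and log"
       return "send error message"

   assert handle_unrecognized_path_id(None, now=1000.0) == "discard and log"
   assert handle_unrecognized_path_id(500.0, now=1000.0) == "send error message"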
4283 Relinquishing control over path repair to policy gateways in other 4284 domains may be unacceptable to some domain administrators. The reason is 4285 that these policy gateways cannot guarantee construction of a path that 4286 satisfies the source policies of the source domain, as they have no 4287 knowledge of other domains' source policies. 4289 Nevertheless, limited local path repair is feasible, without distributing 4290 either source policy information throughout the Internet or detailed path 4291 information among policy gateways in the same domain or in the same virtual 4292 gateway. We say that a path is locally repairable if there exists an 4293 alternate route between two policy gateways, separated by at most one policy 4294 gateway, on the path. This definition covers path repair in the presence of 4295 failed routes between consecutive policy gateways as well as failed policy 4296 gateways themselves. 4298 An IDPR entity attempts local repair of an established path, in the 4299 direction from originator to target, immediately after detecting that the 4300 next policy gateway on the path is no longer reachable. To prevent multiple 4301 path repairs in response to the same failure, we have stipulated that path 4302 repair can only be initiated in the direction from originator to target. 4303 The entity initiating local path repair attempts to find an alternate path 4304 to the IDPR entity immediately following the unreachable policy gateway on 4305 the path, hence the adjective ``local''. 4307 Local path repair minimizes the disruption of data traffic flow caused by 4308 certain types of failures along an established path. Specifically, local 4309 path repair can accommodate an individual failed policy gateway or failed 4311 94 4312 direct connection between two adjacent policy gateways. However, it can 4313 only be attempted through virtual gateways containing multiple peer policy 4314 gateways. Local path repair is not designed to repair paths traversing 4315 failed virtual gateways or domain partitions. Whenever local path repair is 4316 impossible, the failing path must be torn down. 4318 7.5.3 Repairing a Path 4320 When an entity detects through an error message that the next policy gateway 4321 has no knowledge of a given path, it generates a repair message and forwards 4322 it to the next policy gateway. This repair message will reestablish the 4323 path through the next policy gateway. 4325 When an entity detects that the next policy gateway on a path is no 4326 longer reachable, it takes one of the following actions, depending upon 4327 whether the entity is a member of the next policy gateway's virtual gateway. 4328 If the entity is not a member of the next policy gateway's virtual gateway, 4329 then one of the following two conditions must be true: 4331 1. The next policy gateway has a peer that is reachable via an 4332 intra-domain route consistent with the requested services. In this 4333 case, the entity generates a repair message containing the original 4334 setup message and forwards it to the next policy gateway's peer. 4336 2. The next policy gateway has no peers that are reachable via 4337 intra-domain routes consistent with the requested services. In this 4338 case, the entity tears down the path back to the originator. 4340 If the entity is a member of the next policy gateway's virtual gateway, then 4341 one of the following four conditions must be true: 4343 1. 
The next policy gateway has a peer that belongs to the same domain 4344 component and is directly-connected to and reachable from the entity. 4345 In this case, the entity generates a repair message and forwards it to 4346 the next policy gateway's peer. 4348 2. The next policy gateway has a peer that belongs to the same domain 4349 component, is not directly-connected to the entity, but is 4350 directly-connected to and reachable from one of the entity's peers, 4351 which in turn is reachable from the entity via an intra-domain route 4352 consistent with the requested services. In this case, the entity 4353 generates a repair message and forwards it to its peer. 4355 95 4356 3. The next policy gateway has no operational peers within its domain 4357 component, but is directly-connected to and reachable from one of the 4358 entity's peers, which in turn is reachable from the entity via an 4359 intra-domain route consistent with the requested services. In this 4360 case, the entity generates a repair message and forwards it to its 4361 peer. 4363 4. The next policy gateway has no operational peers within its domain 4364 component, and the entity has no operational peers which are both 4365 reachable via intra-domain routes consistent with the requested 4366 services and directly-connected to and reachable from the next policy 4367 gateway. In this case, the entity tears down the path back to the 4368 originator. 4370 A recipient of a repair message takes the following steps, depending upon 4371 its relationship to the entity that issued the repair message. If the 4372 recipient and the issuing entity are in the same domain or in the same 4373 virtual gateway, the recipient extracts the setup message contained within 4374 the repair message and treats the message as it would any other setup 4375 message. Specifically, the recipient checks consistency of the path with 4376 its domain's transit policies and virtual gateway reachability. If there 4377 are unrecognized portions of the setup message, the recipient generates an 4378 error message, and if there are path inconsistencies, the recipient 4379 generates a refuse message. In either case, the recipient returns the 4380 message to the entity that issued the repair message. Otherwise, if the 4381 recipient accepts the repair message, it updates its local forwarding 4382 information database accordingly and forwards the repair message to a 4383 potential next hop, according to the information contained in the enclosed 4384 setup message. 4386 If the recipient and the issuing entity are in different domains and in 4387 different virtual gateways, the recipient extracts the setup message from 4388 the repair message and determines whether the associated path matches any of 4389 its established paths. If the path does not match an established path, the 4390 recipient generates a refuse message and returns it to the previous policy 4391 gateway. In response to this refuse message, the previous policy gateway 4392 tries a different next policy gateway. 4394 The path is irreparable if all potential next policy gateways have been 4395 exhausted and a path match has yet to be discovered. In this case, the 4396 previous policy gateway issues a teardown message to return to the 4397 originator. 4399 96 4400 The path is repairable, if a path match is discovered. 
An IDPR entity expects to receive an accept, teardown, refuse, or error
message in response to a repair message and reacts to each of these
responses differently.  The entity always forwards a teardown message back
to the originator via the previous policy gateway.  It does not forward an
accept message; receipt of an accept message simply indicates that the path
has been successfully repaired.  Upon receipt of a refuse or an error
message, or when no response to the repair message arrives within setup_int
microseconds, the entity infers that the path is irreparable; it then tears
down the path and logs the event for network management.

When an entity detects that the previous policy gateway on a path has
become unreachable, it expects to receive a repair message within setup_wait
microseconds.  If the entity does not receive a repair message for the path
within that time, it infers that the path is irreparable; it then tears down
the path and logs the event for network management.

7.6 Path Control Message Formats

The path control protocol number is equal to 3.  We describe the contents
of each type of PCP message below.

7.6.1 Setup

The setup message type is equal to 0.

   0_________8_________16________24_____31__
   |                PATH ID                 |
   |________________________________________|
   |_______TGT_AD_______|______TGT_ENT______|
   |_______AD_PTR_______|___UCI___|_UNUSED__|
   |_______NUM_RQS______|
   |________________________________________
   For each requested service:
   |______RQS_TYP_______|______RQS_LEN______|
   |_________________RQS_SRV________________|
   |________________________________________
   For each AD:
   |_AD_LEN__|____VG____|________AD_________|
   |________CMP_________|______NUM_TP_______|
   |_________TP_________|

PATH ID (64 bits) Path identifier consisting of the numeric identifier of
   the originator's domain (16 bits), the numeric identifier of the
   originator policy gateway or route server (16 bits), the path direction
   (2 bits), and the local path identifier (30 bits).  A packing sketch
   follows the field list.

TGT AD (16 bits) Numeric identifier for the target domain.

TGT ENT (16 bits) Numeric identifier for the target entity.  A value of 0
   indicates that there is no specific target entity.

AD PTR (16 bits) The right-most 15 bits contain the byte offset, from the
   beginning of the message, of the beginning of the domain-specific
   information.  The left-most bit indicates whether the message includes
   expedited data (1 for expedited data, 0 for no expedited data).

UCI (8 bits) Numeric identifier for the source user class.  The value 0
   indicates that there is no particular source user class.

UNUSED (8 bits) Not currently used; must be set equal to 0.

NUM RQS (16 bits) Number of requested services.

RQS TYP (16 bits) Numeric identifier for a type of requested service.  Valid
   requested services are described in section 5.3.2.

RQS LEN (16 bits) Length of the requested service in bytes, beginning with
   the next field.

RQS SRV (variable) Description of the requested service.

AD LEN (8 bits) Length of the information associated with a particular
   domain in bytes, beginning with the next field.

VG (8 bits) Numeric identifier for an entry virtual gateway.

AD (16 bits) Numeric identifier for a domain.

CMP (16 bits) Numeric identifier for a domain component, used to aid a
   policy gateway in routing across a virtual gateway connected to a
   partitioned domain.

NUM TP (16 bits) Number of transit policies that apply to the section of the
   path traversing the given domain.

TP (16 bits) Numeric identifier for a transit policy.
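The two packed fields above, PATH ID and AD PTR, are the easiest to get
wrong.  The following C sketch builds them, assuming the subfields are
packed in the order listed with the first subfield occupying the most
significant bits; the helper names are illustrative.  Written to the wire
most significant byte first, this packing matches the field order shown in
the diagram.

    #include <stdint.h>

    /* PATH ID: originating domain (16 bits), originating policy gateway
     * or route server (16 bits), path direction (2 bits), and local
     * path identifier (30 bits). */
    static uint64_t
    make_path_id(uint16_t origin_ad, uint16_t origin_ent,
                 unsigned direction, uint32_t local_id)
    {
        return ((uint64_t)origin_ad  << 48) |
               ((uint64_t)origin_ent << 32) |
               ((uint64_t)(direction & 0x3) << 30) |
               (uint64_t)(local_id & 0x3fffffff);
    }

    /* AD PTR: expedited-data flag in the left-most bit, byte offset of
     * the domain-specific information in the right-most 15 bits. */
    static uint16_t
    make_ad_ptr(int expedited, unsigned offset)
    {
        return (uint16_t)((expedited ? 0x8000u : 0u) | (offset & 0x7fffu));
    }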
7.6.2 Accept

The accept message type is equal to 1.

   0_________8________16________24_____31__
   |               PATH ID                 |
   |_______________________________________|
   |_RSN_TYP_|__________REASON_____________|

PATH ID (64 bits) Path identifier contained in the original setup message.

RSN TYP (8 bits) Numeric identifier for the reason for conditional path
   acceptance.

REASON (variable) Description of the reason for conditional path acceptance.
   Valid reasons include the following types:

   1. Destination host is not currently reachable via intra-domain
      routing.

7.6.3 Refuse

The refuse message type is equal to 2.

   0_________8________16________24_____31__
   |               PATH ID                 |
   |_______________________________________|
   |_RSN_TYP_|__________REASON_____________|

PATH ID (64 bits) Path identifier contained in the original setup message.

RSN TYP (8 bits) Numeric identifier for the reason for path refusal.

REASON (variable) Description of the reason for path refusal.  Valid reasons
   include the following types (a parsing sketch follows the list):

   1. Transit policy does not apply between the virtual gateways in a
      given direction.  Numeric identifier for the transit policy (16
      bits).

   2. Transit policy denies access to traffic between the source and
      destination domains.  Numeric identifier for the transit policy
      (16 bits).

   3. Transit policy denies access to traffic of the given user class.
      Numeric identifier for the transit policy (16 bits).

   4. Transit policy denies access to traffic at the current time.
      Numeric identifier for the transit policy (16 bits).

   5. Virtual gateway is down.  Numeric identifier for the virtual
      gateway (8 bits) and associated adjacent domain (16 bits).

   6. Virtual gateway is not reachable according to the given transit
      policy.  Numeric identifier for the virtual gateway (8 bits),
      associated adjacent domain (16 bits), and transit policy (16 bits).

   7. Domain component is not reachable.  Numeric identifier for the
      domain (16 bits) and the component (16 bits).

   8. Insufficient resources to establish the path.

   9. Target is not reachable via intra-domain routing.

   10. No existing path with the given path identifier; sent in response
       to a repair message only.
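The accept and refuse messages above, and the teardown message below, share
a fixed prefix of PATH ID and RSN TYP followed by a variable-length REASON.
A minimal C parsing sketch follows, assuming each field arrives most
significant byte first; the struct and function names are illustrative.

    #include <stddef.h>
    #include <stdint.h>

    struct pcp_reason {
        uint64_t       path_id;    /* PATH ID from the original setup */
        uint8_t        rsn_typ;    /* reason code                     */
        const uint8_t *reason;     /* reason-specific data, if any    */
        size_t         reason_len;
    };

    /* Returns 0 on success, -1 if the message is too short to hold the
     * 8-byte PATH ID and 1-byte RSN TYP. */
    static int
    parse_pcp_reason(const uint8_t *buf, size_t len, struct pcp_reason *out)
    {
        uint64_t id = 0;
        int      i;

        if (len < 9)
            return -1;

        for (i = 0; i < 8; i++)    /* network byte order */
            id = (id << 8) | buf[i];

        out->path_id    = id;
        out->rsn_typ    = buf[8];
        out->reason     = buf + 9;
        out->reason_len = len - 9;
        return 0;
    }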
7.6.4 Teardown

The teardown message type is equal to 3.

   0_________8________16________24_____31__
   |               PATH ID                 |
   |_______________________________________|
   |_RSN_TYP_|__________REASON_____________|

PATH ID (64 bits) Path identifier contained in the original setup message.

RSN TYP (8 bits) Numeric identifier for the reason for path teardown.

REASON (variable) Description of the reason for path teardown.  Valid
   reasons include the following types:

   1. Virtual gateway is down.  Numeric identifier for the virtual
      gateway (8 bits) and associated adjacent domain (16 bits).

   2. Virtual gateway is not reachable according to the given transit
      policy.  Numeric identifier for the virtual gateway (8 bits),
      associated adjacent domain (16 bits), and transit policy (16 bits).

   3. Domain component is not reachable.  Numeric identifier for the
      domain (16 bits) and the component (16 bits).

   4. Maximum path lifetime exceeded.

   5. Preempted path.

   6. Unable to repair path.

7.6.5 Error

The error message type is equal to 4.

   0_________8________16________24_____31__
   |               PATH ID                 |
   |_______________________________________|
   |__MSG____|_RSN_TYP_|______REASON_______|

PATH ID (64 bits) Path identifier contained in the path control message in
   error.

MSG (8 bits) Numeric identifier for the type of path control message in
   error.  This field is ignored for error type 8.

RSN TYP (8 bits) Numeric identifier for the reason for the PCP message
   error.

REASON (variable) Description of the reason for the PCP message error.
   Valid reasons include the following types:

   1. Path identifier is already active.

   2. Domain does not appear in the setup message.

   3. Transit policy not configured for the domain.  Numeric identifier
      for the transit policy (16 bits).

   4. Virtual gateway not configured for the domain.  Numeric identifier
      for the virtual gateway (8 bits) and associated adjacent domain
      (16 bits).

   5. Unrecognized path identifier in IDPR data message.

7.6.6 Repair

The repair message type is equal to 5.  A repair message contains the
original setup message only.

7.6.7 Negative Acknowledgements

When a policy gateway receives an unacceptable PCP message that passes the
CMTP validation checks, it includes an appropriate negative acknowledgement
in its CMTP ack.  This information is placed in the INFORM field of the CMTP
ack (described in section 2.4); the numeric identifier for each type of PCP
negative acknowledgement is contained in the left-most 8 bits of the INFORM
field.  Negative acknowledgements associated with PCP include the following
types (an encoding sketch follows the list):

1. Unrecognized PCP message type.  Numeric identifier for the unrecognized
   message type (8 bits).

2. Out-of-date PCP message.

3. Unrecognized path identifier (for all PCP messages except setup).
   Numeric identifier for the unrecognized path (64 bits).
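To make the INFORM layout concrete, the following C sketch fills the field
as described above: the negative acknowledgement type in the left-most 8
bits, followed by any type-specific detail.  The buffer-based interface and
the function name are assumptions, since only the bit positions are
specified here.

    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    /* Writes a PCP negative acknowledgement into the INFORM field of a
     * CMTP ack.  For type 1 the detail is the unrecognized message type
     * (1 byte); for type 3 it is the unrecognized path identifier (8
     * bytes); type 2 carries no detail.  Returns the bytes written; the
     * caller must supply a buffer large enough for 1 + detail_len. */
    static size_t
    encode_pcp_nak(uint8_t *inform, uint8_t nak_type,
                   const uint8_t *detail, size_t detail_len)
    {
        inform[0] = nak_type;          /* left-most 8 bits of INFORM */
        if (detail_len > 0)
            memcpy(inform + 1, detail, detail_len);
        return 1 + detail_len;
    }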
Expires 30 November 1992