Network Working Group                                           J. Abley
Internet-Draft                                                       ISC
Expires: January 19, 2006                                   K. Lindqvist
                                                Netnod Internet Exchange
                                                           July 18, 2005

                     Operation of Anycast Services
                       draft-ietf-grow-anycast-01

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.
   It is inappropriate to use Internet-Drafts as reference material or
   to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on January 19, 2006.

Copyright Notice

   Copyright (C) The Internet Society (2005).

Abstract

   As the Internet has grown, and as systems and networked services
   within enterprises have become more pervasive, many services with
   high availability requirements have emerged.  These requirements
   have increased the demands on the reliability of the infrastructure
   on which those services rely.

   Various techniques have been employed to increase the availability
   of services deployed on the Internet.  This document presents
   commentary and recommendations for distribution of services using
   anycast.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  Anycast Service Distribution
     3.1  General Description
     3.2  Goals
   4.  Design
     4.1  Protocol Suitability
     4.2  Node Placement
     4.3  Routing Systems
       4.3.1  Anycast within an IGP
       4.3.2  Anycast within the Global Internet
     4.4  Routing Considerations
       4.4.1  Signalling Service Availability
       4.4.2  Covering Prefix
       4.4.3  Equal-Cost Paths
       4.4.4  Route Dampening
       4.4.5  Reverse Path Forwarding Checks
       4.4.6  Propagation Scope
       4.4.7  Other Peoples' Networks
       4.4.8  Aggregation Risks
     4.5  Addressing Considerations
     4.6  Data Synchronisation
     4.7  Node Autonomy
     4.8  Multi-Service Nodes
       4.8.1  Multiple Covering Prefixes
       4.8.2  Pessimistic Withdrawal
       4.8.3  Intra-Node Interior Connectivity
   5.  Service Management
     5.1  Monitoring
   6.  Security Considerations
     6.1  Denial-of-Service Attack Mitigation
     6.2  Service Compromise
     6.3  Service Hijacking
   7.  Protocol Considerations
   8.  IANA Considerations
   9.  Acknowledgements
   10.  References
     10.1  Normative References
     10.2  Informative References
       Authors' Addresses
   A.  Change History
       Intellectual Property and Copyright Statements

1.  Introduction

   To distribute a service using anycast, the service is first
   associated with a stable set of IP addresses, and reachability to
   those addresses is advertised in a routing system from multiple,
   independent service nodes.  Various techniques for anycast
   deployment of services are discussed in [RFC1546], [ISC-TN-2003-1]
   and [ISC-TN-2004-1].

   Anycast has in recent years become increasingly popular for adding
   redundancy to DNS servers, complementing the redundancy which the
   DNS architecture itself already provides.  Several root DNS server
   operators have distributed their servers widely around the
   Internet, and both resolver and authority servers are commonly
   distributed within the networks of service providers.  Anycast
   distribution has been used by commercial DNS authority server
   operators for several years.  The use of anycast is not limited to
   the DNS, although anycast does impose some additional limitations
   on the nature of the service being distributed, including
   transaction longevity, transaction state held on servers and data
   synchronisation capabilities.

   Although anycast is conceptually simple, its implementation
   introduces some pitfalls for operation of services.  For example,
   monitoring the availability of the service becomes more difficult:
   the observed availability changes according to the location of the
   client within the network, and the client catchment of individual
   anycast nodes is neither static nor reliably deterministic.

   This document describes the use of anycast for both local-scope
   distribution of services using an Interior Gateway Protocol (IGP)
   and global distribution using BGP [RFC1771].  Many of the issues
   for monitoring and data synchronisation are common to both, but
   deployment issues differ substantially.

2.  Terminology

   Service Address: an IP address associated with a particular service
      (e.g.
      the destination address used by DNS resolvers to reach a
      particular authority server).

   Anycast: the practice of making a particular Service Address
      available in multiple, discrete, autonomous locations, such that
      datagrams sent to that address are routed to one of several
      available locations.

   Anycast Node: an internally-connected collection of hosts and
      routers which together provide service for an anycast Service
      Address.  An Anycast Node might be as simple as a single host
      participating in a routing protocol with adjacent routers, or it
      might include a number of hosts connected in some more elaborate
      fashion; in either case, to the routing system across which the
      service is being anycast, each Anycast Node presents a unique
      path to the Service Address.  The entire anycast system for the
      service consists of two or more separate Anycast Nodes.

   Local-Scope Anycast: reachability information for the anycast
      Service Address is propagated through a routing system in such a
      way that a particular anycast node is only visible to a subset
      of the whole routing system.

   Local Node: an Anycast Node providing service using a Local-Scope
      Anycast address.

   Global-Scope Anycast: reachability information for the anycast
      Service Address is propagated through a routing system in such a
      way that a particular anycast node is potentially visible to the
      whole routing system.

   Global Node: an Anycast Node providing service using a Global-Scope
      Anycast address.

3.  Anycast Service Distribution

3.1  General Description

   Anycast is the name given to the practice of making a Service
   Address available to a routing system at Anycast Nodes in two or
   more discrete locations.  The service provided by each node is
   consistent regardless of the particular node chosen by the routing
   system to handle a particular request.
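   The node-selection behaviour described above can be illustrated
   with a toy model (illustrative Python only; the node names, client
   names and metrics are invented, and in a real deployment the
   selection is performed by the routing system, not by hosts):

```python
# Toy model of anycast node selection: each client is routed to the
# topologically nearest node advertising the shared Service Address.
# Node names, client names and metrics are invented for illustration.

def select_node(metrics_from_client):
    """Return the node with the lowest routing metric from this client."""
    return min(metrics_from_client, key=metrics_from_client.get)

# Hypothetical per-client routing metrics to three Anycast Nodes.
clients = {
    "client-eu": {"node-ams": 2, "node-nyc": 8, "node-hkg": 11},
    "client-us": {"node-ams": 9, "node-nyc": 1, "node-hkg": 12},
    "client-ap": {"node-ams": 12, "node-nyc": 10, "node-hkg": 3},
}

# Each client reaches a different node via the same Service Address;
# the set of clients reaching a given node is that node's catchment.
catchments = {c: select_node(m) for c, m in clients.items()}
```

   The catchments computed here shift whenever the metrics change,
   which is the model's version of the non-deterministic catchment
   behaviour noted in the Introduction.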
   For services distributed using anycast, there is no inherent
   requirement for referrals to other servers or name-based service
   distribution ("round-robin DNS"), although those techniques could
   be combined with anycast service distribution if an application
   required it.  The routing system decides which node is used for
   each request, based on the topological design of the routing system
   and the point in the network at which the request originates.

   The Anycast Node chosen to service a particular query can be
   influenced by the traffic engineering capabilities of the routing
   protocols which make up the routing system.  The degree of
   influence available to the operator of the node depends on the
   scale of the routing system within which the Service Address is
   anycast.

   Load-balancing between Anycast Nodes is typically difficult to
   achieve (load distribution between nodes is generally unbalanced in
   terms of request and traffic load).  Distribution of load between
   nodes for the purposes of reliability, and coarse-grained
   distribution of load for the purposes of making popular services
   scalable, can often be achieved, however.

   The scale of the routing system through which a service is anycast
   can vary from a small Interior Gateway Protocol (IGP) connecting a
   handful of components, to the Border Gateway Protocol (BGP)
   [RFC1771] connecting the global Internet, depending on the nature
   of the service distribution that is required.

3.2  Goals

   A service may be anycast for a variety of reasons.  A number of
   common objectives are:

   1.  Coarse ("unbalanced") distribution of load across nodes, to
       allow infrastructure to scale to increased numbers of queries
       and to accommodate transient query peaks;

   2.  Mitigation of non-distributed denial of service attacks by
       localising damage to single anycast nodes;

   3.  Constraint of distributed denial of service attacks or flash
       crowds to local regions around anycast nodes (perhaps
       restricting query traffic to local peering links, rather than
       paid transit circuits);

   4.  Provision of additional information to help locate the sources
       of attack (or query) traffic which incorporates spoofed source
       addresses.  This information is derived from the property of
       anycast service distribution that the selection of the Anycast
       Node used to service a particular query may be related to the
       topological source of the request;

   5.  Improvement of query response time, by reducing the network
       distance between client and server with the provision of a
       local Anycast Node.  The extent to which query response time is
       improved depends on the way that nodes are selected for clients
       by the routing system.  Topological nearness within the routing
       system does not, in general, correlate with round-trip
       performance across a network; in some cases response times may
       see no reduction, and may even increase;

   6.  Reduction of a list of servers to a single, distributed
       address.  For example, a large number of authoritative
       nameservers for a zone may be deployed using a small set of
       anycast Service Addresses; this approach can increase the
       accessibility of zone data in the DNS without increasing the
       size of a referral response from a nameserver authoritative for
       the parent zone.

4.  Design

4.1  Protocol Suitability

   When a service is anycast between two or more nodes, the routing
   system makes the node selection decision on behalf of a client.
   Since it is usually a requirement that a single client-server
   interaction is carried out between a client and the same server
   node for the duration of the transaction, it follows that the
   routing system's node selection decision ought to be stable for
   substantially longer than the expected transaction time, if the
   service is to be provided reliably.

   Some services have very short transaction times, and may even be
   carried out using a single packet request and a single packet reply
   in some cases (e.g. DNS transactions over UDP transport).  Other
   services involve far longer-lived transactions (e.g. bulk file
   downloads and audio-visual media streaming).

   Some anycast deployments have very predictable routing systems,
   which can remain stable for long periods of time (e.g. anycast
   within a well-managed and topologically-simple IGP, where node
   selection changes occur only in response to node failures).  Other
   deployments have far less predictable characteristics (see
   Section 4.4.7).

   The stability of the routing system and the transaction time of the
   service should be carefully compared when deciding whether a
   service is suitable for distribution using anycast.  In some cases,
   for new protocols, it may be practical to split large transactions
   into an initialisation phase which is handled by anycast servers,
   and a sustained phase which is provided by non-anycast servers,
   perhaps chosen during the initialisation phase.

   This document deliberately avoids prescribing rules as to which
   protocols or services are suitable for distribution by anycast; to
   attempt to do so would be presumptuous.

4.2  Node Placement

   Decisions as to where Anycast Nodes should be placed will depend to
   a large extent on the goals of the service distribution.
   For example:

   o  A DNS recursive resolver service might be distributed within an
      ISP's network, one Anycast Node per site.

   o  A root DNS server service might be distributed throughout the
      Internet, with nodes located in regions with poor external
      connectivity, to ensure that the DNS functions adequately within
      the region during times of external network failure.

   o  An FTP mirror service might include local nodes located at
      exchange points, so that ISPs connected to that exchange point
      could download bulk data more cheaply than if they had to use
      expensive transit circuits.

   In general, node placement decisions should be made with
   consideration of likely traffic requirements, the potential for
   flash crowds or denial-of-service traffic, the stability of the
   local routing system, and the failure modes with respect to node
   failure or local routing system failure.

4.3  Routing Systems

4.3.1  Anycast within an IGP

   There are several common motivations for the distribution of a
   Service Address within the scope of an IGP:

   1.  to improve service response times, by hosting a service close
       to other users of the network;

   2.  to improve service reliability by providing automatic fail-over
       to backup nodes; and

   3.  to keep service traffic local, to avoid congesting wide-area
       links.

   In each case the decisions as to where and how services are
   provisioned can be made by network engineers without requiring such
   operational complexities as regional variances in the configuration
   of client computers, or deliberate DNS incoherence (causing DNS
   queries to yield different answers depending on where the queries
   originate).
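   The automatic fail-over behaviour in motivation 2 can be sketched
   with a toy model (illustrative Python; node names and metrics are
   invented, and in a real IGP the shortest-path computation performs
   this selection):

```python
# Toy model of IGP anycast fail-over: each node advertises a host route
# for the Service Address with an IGP metric; the routing system uses
# the lowest-metric advertisement, and withdrawal of a failed node's
# route moves traffic to the next-best node automatically.

def best_node(advertisements):
    """advertisements: {node_name: igp_metric} for nodes currently
    advertising the Service Address.  Return the selected node."""
    if not advertisements:
        raise RuntimeError("service unreachable: no node is advertising")
    return min(advertisements, key=advertisements.get)

routes = {"node-a": 10, "node-b": 20}   # hypothetical metrics
primary = best_node(routes)             # "node-a" wins on metric

del routes["node-a"]                    # node-a fails and withdraws
backup = best_node(routes)              # traffic fails over to "node-b"
```

   The coupling between service health and the advertisement (the
   `del` step here) is the subject of Section 4.4.1.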
   When a service is anycast within an IGP, the routing system is
   typically under the control of the same organisation that is
   providing the service, and hence the relationship between service
   transaction characteristics and network stability is likely to be
   well-understood.  This technique is consequently applicable to a
   larger number of applications than Internet-wide anycast service
   distribution (see Section 4.1).

   An IGP will generally have no inherent restriction on the length of
   prefix that can be introduced to it.  There may therefore be no
   need to construct a covering prefix for particular Service
   Addresses; host routes corresponding to the Service Address can
   instead be introduced to the routing system.  See Section 4.4.2 for
   more discussion of the requirement for a covering prefix.

   IGPs often feature little or no aggregation of routes, partly due
   to algorithmic complexities in supporting aggregation.  There is
   little motivation for aggregation in many networks' IGPs in any
   case, since the amount of routing information carried in the IGP is
   small enough that scaling concerns in routers do not arise.  For
   discussion of aggregation risks in other routing systems, see
   Section 4.4.8.

   By reducing the scope of the IGP to just the hosts providing
   service (together with one or more gateway routers), this technique
   can be applied to the construction of server clusters.  This
   application is discussed in some detail in [ISC-TN-2004-1].

4.3.2  Anycast within the Global Internet

   Service Addresses may be anycast within the global Internet routing
   system in order to distribute services across the entire network.
   The principal differences between this application and the
   IGP-scope distribution discussed in Section 4.3.1 are that:

   1.  the routing system is, in general, controlled by other people;
       and

   2.  the routing protocol concerned (BGP), and commonly-accepted
       practices in its deployment, impose some additional constraints
       (see Section 4.4).

4.4  Routing Considerations

4.4.1  Signalling Service Availability

   When a routing system is provided with reachability information for
   a Service Address from an individual node, packets addressed to
   that Service Address will start to arrive at the node.  Since it is
   essential for the node to be ready to accept requests before they
   start to arrive, a coupling between the routing information and the
   availability of the service at a particular node is desirable.

   Where a routing advertisement from a node corresponds to a single
   Service Address, this coupling might be such that availability of
   the service triggers the route advertisement, and non-availability
   of the service triggers a route withdrawal.  This can be achieved
   by running routing protocol implementations on the same servers
   that provide the distributed service, configured to advertise and
   withdraw the route in conjunction with the availability (and
   health) of the software on the host which processes service
   requests.  An example of such an arrangement for a DNS service is
   included in [ISC-TN-2004-1].

   Where a routing advertisement from a node corresponds to two or
   more Service Addresses, it may not be appropriate to trigger a
   route withdrawal due to the non-availability of a single service.
   Another approach is to route requests for the service which is down
   at one Anycast Node to a different Anycast Node at which the
   service is up.  This approach is discussed in Section 4.8.

   Rapid advertisement/withdrawal oscillations can cause operational
   problems, and nodes should be configured such that rapid
   oscillations are avoided (e.g. by implementing a minimum delay
   following a withdrawal before the service can be re-advertised).
   See Section 4.4.4 for a discussion of route oscillations in BGP.

4.4.2  Covering Prefix

   In some routing systems (e.g. the BGP-based routing system of the
   global Internet) it is not possible, in general, to propagate a
   host route with confidence that the route will propagate throughout
   the network.  This is a consequence of operational policy, not a
   protocol restriction.

   In such cases it is necessary to propagate a route which covers the
   Service Address, and which has a sufficiently short prefix that it
   will not be discarded by commonly-deployed import policies.  For
   IPv4 Service Addresses, this is often a 24-bit prefix, but there
   are other well-documented examples of IPv4 import policies which
   filter on Regional Internet Registry (RIR) allocation boundaries,
   and hence some experimentation may be prudent.  Corresponding
   import policies for IPv6 prefixes also exist.  See Section 4.5 for
   more discussion of IPv6 Service Addresses and corresponding anycast
   routes.

   The propagation of a single route per service has some associated
   scaling issues which are discussed in Section 4.4.8.

   Where multiple Service Addresses are covered by the same covering
   route, there is no longer a tight coupling between the
   advertisement of that route and the individual services associated
   with the covered host routes.  The resulting impact on signalling
   availability of individual services is discussed in Section 4.4.1
   and Section 4.8.

4.4.3  Equal-Cost Paths

   Some routing systems support equal-cost paths to the same
   destination.  Where multiple, equal-cost paths exist and lead to
   different anycast nodes, there is a risk that different request
   packets associated with a single transaction might be delivered to
   more than one node.  Services provided over TCP [RFC0793]
   necessarily involve transactions with multiple request packets, due
   to the TCP setup handshake.
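   One mitigation for this risk, hash-based per-flow path selection,
   can be sketched as follows (a simplified illustration in Python,
   not any particular vendor's ECMP algorithm; addresses and path
   names are invented):

```python
import hashlib

# Simplified sketch of hash-based ECMP path selection: all packets of
# a flow share a 5-tuple, so they hash to the same next hop and hence
# reach the same anycast node.  Real routers use their own hash inputs
# and algorithms; this is an illustration only.

def select_path(src, sport, dst, dport, proto, paths):
    """Pick one of several equal-cost paths, stably per flow."""
    key = f"{src}:{sport}->{dst}:{dport}/{proto}".encode()
    digest = int.from_bytes(hashlib.sha256(key).digest()[:4], "big")
    return paths[digest % len(paths)]

paths = ["via-node-a", "via-node-b"]   # equal-cost paths to two nodes

# Every packet of a single TCP connection selects the same path, so
# the whole handshake and transaction reach a single node.
first = select_path("192.0.2.1", 33000, "198.51.100.1", 53, 6, paths)
again = select_path("192.0.2.1", 33000, "198.51.100.1", 53, 6, paths)
assert first == again
```

   Per-packet (round-robin) ECMP, by contrast, would spread the same
   flow across both paths, which is exactly the failure mode described
   above.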
   Equal-cost paths are commonly supported in IGPs.  Multi-node
   selection for a single transaction can be avoided in most cases by
   careful consideration of IGP link metrics, or by applying
   equal-cost multi-path (ECMP) selection algorithms which cause a
   single node to be selected for a single multi-packet transaction.
   For an example of the use of hash-based ECMP selection in anycast
   service distribution, see [ISC-TN-2004-1].

   For services which are distributed across the global Internet using
   BGP, equal-cost paths are normally not a consideration: BGP's exit
   selection algorithm usually selects a single, consistent exit for a
   single destination regardless of whether multiple candidate paths
   exist.  Implementations of BGP exist that support multi-path exit
   selection, however, and corner cases where dual selected exits
   route to different nodes are possible.  Analysis of the likely
   incidence of such corner cases for particular distributions of
   Anycast Nodes is recommended for services which involve
   multi-packet transactions.

4.4.4  Route Dampening

   Frequent advertisements and withdrawals of individual prefixes in
   BGP are known as flaps.  Rapid flapping can lead to CPU exhaustion
   on routers quite remote from the source of the instability, and for
   this reason rapid route oscillations are frequently "dampened", as
   described in [RFC2439].

   A dampened path will be suppressed by routers for an interval which
   increases according to the frequency of the observed oscillation; a
   suppressed path will not propagate.  Hence a single router can
   prevent the propagation of a flapping prefix to the rest of an
   autonomous system, affording other routers in the network
   protection from the instability.

   Some implementations of flap dampening penalise oscillating
   advertisements based on the observed AS_PATH, and not on the NLRI.
   For this reason, network instability which leads to route flapping
   from a single anycast node ought not to cause advertisements from
   other nodes (which have different AS_PATH attributes) to be
   dampened.

   To limit the opportunity for such implementations to penalise
   advertisements originating from different Anycast Nodes in response
   to oscillations from just a single node, care should be taken to
   arrange that the AS_PATH attributes on routes from different nodes
   are as diverse as possible.  For example, Anycast Nodes should use
   the same origin AS for their advertisements, but might have
   different upstream ASes.

   Where different implementations of flap dampening are prevalent,
   individual nodes' instability may result in stable nodes becoming
   unavailable.  In mitigation, the following measures may be useful:

   1.  Judicious deployment of Local Nodes in combination with
       especially stable Global Nodes (with high inter-AS path splay,
       redundant hardware, power, etc.) may help limit oscillation
       problems to the Local Nodes' limited regions of influence;

   2.  Aggressive flap-dampening of the service prefix close to the
       origin (e.g. within an Anycast Node, or in ASes adjacent to
       each Anycast Node) may also help reduce the opportunity for
       remote ASes to see oscillations at all.

4.4.5  Reverse Path Forwarding Checks

   Reverse Path Forwarding (RPF) checks, first described in [RFC2267],
   are commonly deployed as part of ingress interface packet filters
   on routers in the Internet, in order to deny packets whose source
   addresses are spoofed (see also RFC 2827 [RFC2827]).  Deployed
   implementations of RPF make several modes of operation available
   (e.g. "loose" and "strict").
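   The difference between the two modes can be sketched as follows
   (illustrative Python; the routing table and interface names are
   invented for the example):

```python
import ipaddress

# Illustrative strict vs. loose RPF check.  The table maps a source
# prefix to the interface the router would use to reach it; the
# prefixes and interface names here are invented for the example.

ROUTES = {  # destination prefix -> egress interface
    ipaddress.ip_network("192.0.2.0/24"): "eth0",
    ipaddress.ip_network("198.51.100.0/24"): "eth1",
}

def reverse_route(src):
    """Return the interface used to reach src, or None if unrouted."""
    addr = ipaddress.ip_address(src)
    for prefix, ifname in ROUTES.items():
        if addr in prefix:
            return ifname
    return None

def rpf_accept(src, ingress, mode="strict"):
    back = reverse_route(src)
    if mode == "loose":
        return back is not None          # any route back will do
    return back == ingress               # strict: must match ingress

# A multi-homed (or anycast) source can legitimately arrive on eth1
# while the best route back points at eth0: strict RPF drops it.
assert rpf_accept("192.0.2.10", "eth1", "loose") is True
assert rpf_accept("192.0.2.10", "eth1", "strict") is False
```

   The strict-mode failure shown in the last line is the risk for
   anycast nodes discussed in the following paragraphs.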
   Some modes of RPF can cause non-spoofed packets to be denied when
   they originate from a multi-homed site, since selected paths might
   legitimately not correspond with the ingress interface of
   non-spoofed packets from the multi-homed site.  This issue is
   discussed in [RFC3704].

   A collection of anycast nodes deployed across the Internet is
   largely indistinguishable from a distributed, multi-homed site to
   the routing system, and hence this risk also exists for anycast
   nodes, even if individual nodes are not multi-homed.  Care should
   be taken to ensure that each anycast node is treated as a
   multi-homed network, and that the corresponding recommendations in
   [RFC3704] with respect to RPF checks are heeded.

4.4.6  Propagation Scope

   In the context of anycast service distribution across the global
   Internet, Global Nodes are those which are capable of providing
   service to clients anywhere in the network; reachability
   information for the service is propagated globally, without
   restriction, by advertising the routes covering the Service
   Addresses for global transit to one or more providers.

   More than one Global Node can exist for a single service (and
   indeed this is often the case, for reasons of redundancy and
   load-sharing).

   In contrast, it is sometimes desirable to deploy an Anycast Node
   which only provides services to a local catchment of autonomous
   systems, and which is deliberately not available to the entire
   Internet; such nodes are referred to in this document as Local
   Nodes.  An example of circumstances in which a Local Node may be
   appropriate is a node designed to serve a region with rich internal
   connectivity but unreliable, congested or expensive access to the
   rest of the Internet.

   Local Nodes advertise covering routes for Service Addresses in such
   a way that their propagation is restricted.  This might be done
   using well-known community string attributes such as NO_EXPORT
   [RFC1997] or NOPEER [RFC3765], by arranging with peers to apply a
   conventional "peering" import policy instead of a "transit" import
   policy, or by some suitable combination of measures.

   Advertisements of reachability to Service Addresses from Local
   Nodes should ideally be made using a routing policy that requires
   the presence of explicit attributes for propagation, rather than
   relying on implicit (default) policy.  Inadvertent propagation of a
   route beyond its intended horizon can result in capacity problems
   for Local Nodes which might degrade service performance
   network-wide.

4.4.7  Other Peoples' Networks

   When anycast services are deployed across networks operated by
   others, their reachability is dependent on routing policies and
   topology changes (planned and unplanned) which are unpredictable
   and sometimes difficult to identify.  Since the routing system may
   include networks operated by multiple, unrelated organisations, the
   possibility of unforeseen interactions resulting from the
   combination of unrelated changes also exists.

   The stability and predictability of such a routing system should be
   taken into consideration when assessing the suitability of anycast
   as a distribution strategy for particular services and protocols
   (see also Section 4.1).

   By way of mitigation, routing policies used by Anycast Nodes across
   such routing systems should be conservative, individual nodes'
   internal and external/connecting infrastructure should be scaled to
   support loads far in excess of the average, and the service should
   be monitored proactively from many points in order to avoid
   unpleasant surprises (see Section 5.1).
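   The explicit-attribute export policy recommended in Section 4.4.6
   can be modelled as follows (a toy Python model; the EXPORT_OK
   community value and the export rule are invented for the example,
   while NO_EXPORT's value 0xFFFFFF41 is the well-known one from
   RFC 1997):

```python
# Toy model of a "propagate only with an explicit attribute" export
# policy for Local Node routes.  NO_EXPORT (0xFFFFFF41) is the
# well-known community from RFC 1997; EXPORT_OK is an invented local
# community standing in for an explicit "allowed to propagate" tag.

NO_EXPORT = 0xFFFFFF41           # well-known: keep within this AS
EXPORT_OK = (64512 << 16) | 100  # hypothetical community 64512:100

def may_export(route_communities):
    """Export only routes explicitly tagged; never NO_EXPORT ones."""
    if NO_EXPORT in route_communities:
        return False
    return EXPORT_OK in route_communities

# A Local Node's covering route, tagged to stay within the catchment:
local_route = {NO_EXPORT}
# A Global Node's route, explicitly tagged for propagation:
global_route = {EXPORT_OK}
# An untagged route is not propagated either; with implicit (default)
# policy set to drop, accidental leaks require an explicit mistake
# rather than a mere omission.
untagged_route = set()
```

   The point of the untagged case is the fail-safe property: a route
   that escapes its intended horizon by default is the inadvertent
   propagation failure described above.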
584 4.4.8 Aggregation Risks 586 The propagation of a single route for each anycast service does not 587 scale well for routing systems in which the load of routing 588 information which must be carried is a concern, and where there are 589 potentially many services to distribute. For example, an autonomous 590 system which provides services to the Internet with N Service 591 Addresses covered by a single exported route would need to advertise 592 (N+1) routes if each of those services were to be distributed using 593 anycast. 595 The common practice of applying minimum prefix-length filters in 596 import policies on the Internet (see Section 4.4.2) means that for a 597 route covering a Service Address to be usefully propagated the prefix 598 length must be substantially less than that required to advertise 599 just the host route. Widespread advertisement of short prefixes for 600 individual services hence also has a negative impact on address 601 conservation. 603 Both of these issues can be mitigated to some extent by the use of a 604 single covering prefix to accommodate multiple Service Addresses, as 605 described in Section 4.8. This implies a decoupling of the route 606 advertisement from individual service availability (see 607 Section 4.4.1), however, with attendant risks to the stability of the 608 service as a whole (see Section 4.7). 610 In general, the scaling problems described here prevent anycast from 611 being a useful, general approach for service distribution on the 612 global Internet. It remains, however, a useful technique for 613 distributing a limited number of Internet-critical services, as well 614 as in smaller networks where the aggregation concerns discussed here 615 do not apply. 617 4.5 Addressing Considerations 619 Service Addresses should be unique within the routing system that 620 connects all Anycast Nodes to all possible clients of the service.
621 Service Addresses must also be chosen so that corresponding routes 622 will be allowed to propagate within that routing system. 624 For an IPv4-numbered service deployed across the Internet, for 625 example, an address might be chosen from a block where the minimum 626 RIR allocation size is 24 bits, and reachability to that address 627 might be provided by originating the covering 24-bit prefix. 629 For an IPv4-numbered service deployed within a private network, a 630 locally-unused [RFC1918] address might be chosen, and reachability to 631 that address might be signalled using a (32-bit) host route. 633 For IPv6-numbered services, Anycast Addresses are not scoped 634 differently from unicast addresses [RFC3513]. As such, the guidelines 635 presented for IPv4 with respect to address suitability apply equally 636 to IPv6. 638 4.6 Data Synchronisation 640 Although some services have been deployed in localised form (such 641 that clients from particular regions are presented with regionally- 642 relevant content) many services have the property that responses to 643 client requests should be consistent, regardless of where the request 644 originates. For a service distributed using anycast, that implies 645 that different Anycast Nodes must operate in a consistent manner and, 646 where that consistent behaviour is based on a data set, that the data 647 concerned be synchronised between nodes. 649 The mechanism by which data is synchronised depends on the nature of 650 the service; examples are zone transfers for authoritative DNS 651 servers and rsync for FTP archives. In general, the synchronisation 652 of data between Anycast Nodes will involve transactions between non- 653 anycast addresses. 655 Data synchronisation across public networks should be carried out 656 with appropriate authentication and encryption.
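The synchronisation requirement of Section 4.6 suggests a simple operational check: compare a version marker across all nodes and flag laggards. The sketch below assumes a hypothetical `fetch_serial` callable standing in for a service-specific probe (for example, querying a zone's SOA serial over each node's non-anycast management address); the addresses and serial values are invented for illustration.

```python
# Sketch of a data-consistency check between Anycast Nodes: fetch a
# version marker (e.g. a DNS zone serial) from each node via its
# non-anycast address, and report any node whose data set lags behind
# the most recent version seen anywhere.

def find_stale_nodes(fetch_serial, node_addresses):
    """Return the (sorted) addresses of nodes behind the newest serial.

    fetch_serial(addr) -> comparable version marker for that node.
    """
    serials = {addr: fetch_serial(addr) for addr in node_addresses}
    newest = max(serials.values())
    return sorted(addr for addr, s in serials.items() if s < newest)


# toy data: one node has not yet received the latest zone transfer
fake_serials = {
    "192.0.2.1": 2005071801,
    "192.0.2.2": 2005071801,
    "192.0.2.3": 2005071800,   # stale
}
stale = find_stale_nodes(fake_serials.get, fake_serials)
```

Here `stale` identifies `192.0.2.3` as lagging; in practice the probe would run over an authenticated channel, per the recommendation above.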
658 4.7 Node Autonomy 660 For an Anycast deployment whose goals include improved reliability 661 through redundancy, it is important to minimise the opportunity for a 662 single defect to compromise many (or all) nodes, or for the failure 663 of one node to trigger a cascading failure that brings down 664 successive nodes until the service as a whole is defeated. 666 Co-dependencies are avoided by making each node as autonomous and 667 self-sufficient as possible. The degree to which nodes can survive 668 failure elsewhere depends on the nature of the service being 669 delivered, but for services which accommodate disconnected operation 670 (e.g. the timed propagation of changes between master and slave 671 servers in the DNS) a high degree of autonomy can be achieved. 673 The possibility of cascading failure due to load can also be reduced 674 by the deployment of both Global and Local Nodes for a single 675 service, since the effective fail-over path of traffic is, in 676 general, from Local Node to Global Node; traffic that might sink one 677 Local Node is unlikely to sink all Local Nodes, except in the most 678 degenerate cases. 680 The chance of cascading failure due to a software defect in an 681 operating system or server can be reduced in many cases by deploying 682 nodes running different implementations of operating system, server 683 software, routing protocol software, etc., such that a defect which 684 appears in a single component does not affect the whole system. 686 4.8 Multi-Service Nodes 688 For a service distributed across a routing system where covering 689 prefixes are required to announce reachability to a single Service 690 Address (see Section 4.4.2), special consideration is required in the 691 case where multiple services need to be distributed across a single 692 set of nodes.
This results from the requirement to signal 693 availability of individual services to the routing system so that 694 requests for service are not received by nodes which are not able to 695 process them (see Section 4.4.1). 697 Several approaches are described in the following sections. 699 4.8.1 Multiple Covering Prefixes 701 Each Service Address is chosen such that only one Service Address is 702 covered by each advertised prefix. Advertisement and withdrawal of a 703 single covering prefix can be tightly coupled to the availability of 704 the single associated service. 706 This is the most straightforward approach. However, since it makes 707 very poor utilisation of globally-unique addresses, it is only 708 suitable for use for a small number of critical, infrastructural 709 services such as root DNS servers. General Internet-wide deployment 710 of services using this approach will not scale. 712 4.8.2 Pessimistic Withdrawal 714 Multiple Service Addresses are chosen such that they are covered by a 715 single prefix. Advertisement and withdrawal of the single covering 716 prefix is coupled to the availability of all associated services; if 717 any individual service becomes unavailable, the covering prefix is 718 withdrawn. 720 The coupling between service availability and advertisement of the 721 covering prefix is complicated by the requirement that all Service 722 Addresses must be available -- the announcement needs to be triggered 723 by the presence of all component routes, and not just a single 724 covered route. 726 The fact that a single malfunctioning service causes all deployed 727 services in a node to be taken off-line may make this approach 728 unsuitable for many applications. 730 4.8.3 Intra-Node Interior Connectivity 732 Multiple Service Addresses are chosen such that they are covered by a 733 single prefix. Advertisement and withdrawal of the single covering 734 prefix is coupled to the availability of any one service.
Nodes have 735 interior connectivity, e.g. using tunnels, and host routes for 736 service addresses are distributed using an IGP which extends to 737 include routers at all nodes. 739 In the event that a service is unavailable at one node, but available 740 at other nodes, a request may be routed over the interior network 741 from the receiving node towards some other node for processing. 743 In the event that some local services in a node are down and the node 744 is disconnected from other nodes, continued advertisement of the 745 covering prefix might cause requests to become black-holed. 747 This approach allows reasonable address utilisation of the netblock 748 covered by the announced prefix, at the expense of reduced autonomy 749 of individual nodes; the IGP in which all nodes participate can be 750 viewed as a single point of failure. 752 5. Service Management 754 5.1 Monitoring 756 Monitoring a service which is distributed is more complex than 757 monitoring a non-distributed service, since the observed accuracy and 758 availability of the service is, in general, different when viewed 759 from clients attached to different parts of the network. When a 760 problem is identified, it is also not always obvious which node 761 served the request, and hence which node is malfunctioning. 763 It is recommended that distributed services are monitored from probes 764 distributed representatively across the routing system, and, where 765 possible, the identity of the node answering individual requests is 766 recorded along with performance and availability statistics. The 767 RIPE NCC DNSMON service [1] is an example of such monitoring for the 768 DNS. 770 Monitoring the routing system (from a variety of places, in the case 771 of routing systems where perspective is relevant) can also provide 772 useful diagnostics for troubleshooting service availability. 
This 773 can be achieved using dedicated probes, or public route measurement 774 facilities on the Internet such as the RIPE NCC Routing Information 775 Service [2] and the University of Oregon Route Views Project [3]. 777 Monitoring the health of the component devices in an Anycast 778 deployment of a service (hosts, routers, etc) is straightforward, and 779 can be achieved using the same tools and techniques commonly used to 780 manage other network-connected infrastructure, without the additional 781 complexity involved in monitoring Anycast service addresses. 783 6. Security Considerations 785 6.1 Denial-of-Service Attack Mitigation 787 This document describes mechanisms for deploying services on the 788 Internet which can be used to mitigate vulnerability to attack: 790 1. An Anycast Node can act as a sink for attack traffic originated 791 within its sphere of influence, preventing nodes elsewhere from 792 having to deal with that traffic; 794 2. The task of dealing with attack traffic whose sources are widely 795 distributed is itself distributed across all the nodes which 796 contribute to the service. Since the problem of sorting between 797 legitimate and attack traffic is distributed, this may lead to 798 better scaling properties than a service which is not 799 distributed. 801 6.2 Service Compromise 803 The distribution of a service across several (or many) autonomous 804 nodes imposes increased monitoring as well as an increased systems 805 administration burden on the operator of the service which might 806 reduce the effectiveness of host and router security. 808 The potential benefit of being able to take compromised servers off- 809 line without compromising the service can only be realised if there 810 are working procedures to do so quickly and reliably. 
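The monitoring recommendation of Section 5.1 (record which node answered each probe, alongside availability and latency, so that a malfunctioning node can be identified afterwards) might be sketched as follows. `probe_service` is a hypothetical stand-in for a real check, such as a DNS query for a node identifier; the probe and node names are invented.

```python
# Sketch of monitoring a distributed service from multiple probes,
# recording the identity of the answering node with each observation.

import time


def monitor(probes, probe_service):
    """Run one round of checks; probe_service(probe) -> (node_id, ok)."""
    records = []
    for probe in probes:
        start = time.monotonic()
        node_id, ok = probe_service(probe)
        records.append({
            "probe": probe,
            "node": node_id,                  # which node answered
            "ok": ok,                         # service availability
            "rtt": time.monotonic() - start,  # observed latency
        })
    return records


# toy catchments: probes in Europe reach node "ams1", others "iad1"
def fake_probe(probe):
    return ("ams1" if probe.startswith("eu-") else "iad1"), True


results = monitor(["eu-probe1", "us-probe1"], fake_probe)
```

With per-node identity recorded in `results`, a failure reported by one probe can be attributed to a specific node rather than to the service as a whole, which is exactly the diagnostic gap described above.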
812 6.3 Service Hijacking 814 It is possible that an unauthorised party might advertise routes 815 corresponding to anycast Service Addresses across a network, and by 816 doing so capture legitimate request traffic or process requests in a 817 manner which compromises the service (or both). A rogue Anycast Node 818 might be difficult to detect by clients or by the operator of the 819 service. 821 The risk of service hijacking by manipulation of the routing system 822 exists regardless of whether a service is distributed using anycast. 823 However, the fact that legitimate Anycast Nodes are observable in the 824 routing system may make it more difficult to detect rogue nodes. 826 7. Protocol Considerations 828 This document does not impose any protocol considerations. 830 8. IANA Considerations 832 This document requests no action from IANA. 834 9. Acknowledgements 836 The authors gratefully acknowledge the contributions from various 837 participants of the grow working group, and in particular Geoff 838 Huston, Pekka Savola, Danny McPherson and Ben Black. 840 This work was supported by the US National Science Foundation 841 (research grant SCI-0427144) and DNS-OARC. 843 10. References 845 10.1 Normative References 847 [RFC0793] Postel, J., "Transmission Control Protocol", STD 7, 848 RFC 793, September 1981. 850 [RFC1771] Rekhter, Y. and T. Li, "A Border Gateway Protocol 4 851 (BGP-4)", RFC 1771, March 1995. 853 [RFC1918] Rekhter, Y., Moskowitz, R., Karrenberg, D., Groot, G., and 854 E. Lear, "Address Allocation for Private Internets", 855 BCP 5, RFC 1918, February 1996. 857 [RFC1997] Chandrasekeran, R., Traina, P., and T. Li, "BGP 858 Communities Attribute", RFC 1997, August 1996. 860 [RFC2439] Villamizar, C., Chandra, R., and R. Govindan, "BGP Route 861 Flap Damping", RFC 2439, November 1998. 863 [RFC2827] Ferguson, P. and D.
Senie, "Network Ingress Filtering: 864 Defeating Denial of Service Attacks which employ IP Source 865 Address Spoofing", BCP 38, RFC 2827, May 2000. 867 [RFC3513] Hinden, R. and S. Deering, "Internet Protocol Version 6 868 (IPv6) Addressing Architecture", RFC 3513, April 2003. 870 [RFC3704] Baker, F. and P. Savola, "Ingress Filtering for Multihomed 871 Networks", BCP 84, RFC 3704, March 2004. 873 10.2 Informative References 875 [ISC-TN-2003-1] 876 Abley, J., "Hierarchical Anycast for Global Service 877 Distribution", March 2003, 878 . 880 [ISC-TN-2004-1] 881 Abley, J., "A Software Approach to Distributing Requests 882 for DNS Service using GNU Zebra, ISC BIND 9 and FreeBSD", 883 March 2004, 884 . 886 [RFC1546] Partridge, C., Mendez, T., and W. Milliken, "Host 887 Anycasting Service", RFC 1546, November 1993. 889 [RFC2267] Ferguson, P. and D. Senie, "Network Ingress Filtering: 890 Defeating Denial of Service Attacks which employ IP Source 891 Address Spoofing", RFC 2267, January 1998. 893 [RFC3765] Huston, G., "NOPEER Community for Border Gateway Protocol 894 (BGP) Route Scope Control", RFC 3765, April 2004. 896 URIs 898 [1] 900 [2] 902 [3] 904 Authors' Addresses 906 Joe Abley 907 Internet Systems Consortium, Inc. 908 950 Charter Street 909 Redwood City, CA 94063 910 USA 912 Phone: +1 650 423 1317 913 Email: jabley@isc.org 914 URI: http://www.isc.org/ 916 Kurt Erik Lindqvist 917 Netnod Internet Exchange 918 Bellmansgatan 30 919 118 47 Stockholm 920 Sweden 922 Email: kurtis@kurtis.pp.se 923 URI: http://www.netnod.se/ 925 Appendix A. Change History 927 This section should be removed before publication. 929 draft-kurtis-anycast-bcp-00: Initial draft. Discussed at IETF 61 in 930 the grow meeting and adopted as a working group document shortly 931 afterwards. 933 draft-ietf-grow-anycast-00: Missing and empty sections completed; 934 some structural reorganisation; general wordsmithing. Document 935 discussed at IETF 62. 
937 draft-ietf-grow-anycast-01: This appendix added; acknowledgements 938 section added; commentary on [RFC3513] prohibition of anycast on 939 hosts removed; minor sentence re-casting and related jiggery- 940 pokery. This revision published for discussion at IETF 63. 942 Intellectual Property Statement 944 The IETF takes no position regarding the validity or scope of any 945 Intellectual Property Rights or other rights that might be claimed to 946 pertain to the implementation or use of the technology described in 947 this document or the extent to which any license under such rights 948 might or might not be available; nor does it represent that it has 949 made any independent effort to identify any such rights. Information 950 on the procedures with respect to rights in RFC documents can be 951 found in BCP 78 and BCP 79. 953 Copies of IPR disclosures made to the IETF Secretariat and any 954 assurances of licenses to be made available, or the result of an 955 attempt made to obtain a general license or permission for the use of 956 such proprietary rights by implementers or users of this 957 specification can be obtained from the IETF on-line IPR repository at 958 http://www.ietf.org/ipr. 960 The IETF invites any interested party to bring to its attention any 961 copyrights, patents or patent applications, or other proprietary 962 rights that may cover technology that may be required to implement 963 this standard. Please address the information to the IETF at 964 ietf-ipr@ietf.org.
966 Disclaimer of Validity 968 This document and the information contained herein are provided on an 969 "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS 970 OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET 971 ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, 972 INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE 973 INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED 974 WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. 976 Copyright Statement 978 Copyright (C) The Internet Society (2005). This document is subject 979 to the rights, licenses and restrictions contained in BCP 78, and 980 except as set forth therein, the authors retain all their rights. 982 Acknowledgment 984 Funding for the RFC Editor function is currently provided by the 985 Internet Society.