2 Internet Research Task Force J. Levine 3 Internet-Draft Taughannock Networks 4 Intended status: Experimental March 14, 2011 5 Expires: September 15, 2011 7 An efficient method to publish ranges of IP addresses in the DNS 8 draft-levine-iprangepub-02 10 Abstract 12 The DNS has long been used to publish lists of IPv4 address ranges in 13 blacklists and whitelists. The size of the IPv6 address space makes 14 the entry-per-IP approach used for IPv4 lists impractical. A new 15 technique for publishing IP address ranges is described. It is 16 intended to permit efficient publishing and querying, and to have 17 good DNS cache behavior. 19 Status of this Memo 21 This Internet-Draft is submitted in full conformance with the 22 provisions of BCP 78 and BCP 79. 24 Internet-Drafts are working documents of the Internet Engineering 25 Task Force (IETF). Note that other groups may also distribute 26 working documents as Internet-Drafts. The list of current Internet- 27 Drafts is at http://datatracker.ietf.org/drafts/current/. 29 Internet-Drafts are draft documents valid for a maximum of six months 30 and may be updated, replaced, or obsoleted by other documents at any 31 time. It is inappropriate to use Internet-Drafts as reference 32 material or to cite them other than as "work in progress." 34 This Internet-Draft will expire on September 15, 2011. 36 Copyright Notice 38 Copyright (c) 2011 IETF Trust and the persons identified as the 39 document authors. All rights reserved. 41 This document is subject to BCP 78 and the IETF Trust's Legal 42 Provisions Relating to IETF Documents 43 (http://trustee.ietf.org/license-info) in effect on the date of 44 publication of this document. Please review these documents 45 carefully, as they describe your rights and restrictions with respect 46 to this document. Code Components extracted from this document must 47 include Simplified BSD License text as described in Section 4.e of 48 the Trust Legal Provisions and are provided without warranty as 49 described in the Simplified BSD License.
51 Table of Contents 53 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 54 2. Assumptions and Goals . . . . . . . . . . . . . . . . . . . . 4 55 3. B-tree data structure . . . . . . . . . . . . . . . . . . . . 5 56 4. DNS record format . . . . . . . . . . . . . . . . . . . . . . 6 57 5. Handling enclosing ranges . . . . . . . . . . . . . . . . . . 7 58 6. Lookup algorithm . . . . . . . . . . . . . . . . . . . . . . . 7 59 7. Details of block representation . . . . . . . . . . . . . . . 8 60 7.1. Return values . . . . . . . . . . . . . . . . . . . . . . 9 61 8. Building and updating DNSxLs . . . . . . . . . . . . . . . . . 9 62 8.1. Building static DNSxLs . . . . . . . . . . . . . . . . . . 9 63 8.2. Building and updating dynamic DNSxLs . . . . . . . . . . . 10 64 9. Estimated performance . . . . . . . . . . . . . . . . . . . . 11 65 10. Security considerations . . . . . . . . . . . . . . . . . . . 12 66 11. Topics for further consideration . . . . . . . . . . . . . . . 12 67 12. IANA considerations . . . . . . . . . . . . . . . . . . . . . 13 68 13. References . . . . . . . . . . . . . . . . . . . . . . . . . . 13 69 13.1. References - Normative . . . . . . . . . . . . . . . . . . 13 70 13.2. References - Informative . . . . . . . . . . . . . . . . . 13 71 Appendix A. Change Log . . . . . . . . . . . . . . . . . . . . . 13 72 A.1. Changes from -01 to -02 . . . . . . . . . . . . . . . . . 14 73 A.2. Changes from -00 to -01 . . . . . . . . . . . . . . . . . 14 74 Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 14 76 1. Introduction 78 For many years, the Domain Name System[RFC1034] [RFC1035] has been 79 the Internet's de facto distributed database. Blacklists and 80 whitelists of IPv4 addresses have been published in the DNS using a 81 simple system adapted from rDNS[RFC5782]. A DNSxL (a DNSBL or DNSWL) 82 is a DNS sub-tree, typically also a DNS zone, with each listed IP 83 having an A and/or TXT record at a name corresponding to the IP 84 address. While this publication method has worked well for IPv4 85 addresses, the size of the IPv6 address space makes an analogous 86 approach unworkable. 88 In an IPv4 Internet, each network is typically limited to a few 89 thousand or at most a few million addresses. A single host typically 90 has a single address, or at most a few hundred addresses. The 91 limited size of each network forces a host to use its assigned 92 address or addresses. In IPv6 networks, hosts typically use 93 Stateless Address Autoconfiguration [RFC4862] to select an IP 94 address, with the low 64 bits of the address being almost entirely 95 arbitrary. A hostile host sending mail can easily switch to a new 96 address for every message it sends, never reusing an address, due to 97 the vast size of the IPv6 address space. 99 An IPv6 DNSxL organized like an IPv4 DNSxL with a record per address 100 could use wildcards or a specialized server such as rbldnsd [RBLDNSD] 101 to list entire /64 or larger ranges, but that does not help DNS 102 performance. Since wildcards are expanded by the DNS server, every 103 query for a unique IP address causes a unique query to the DNSxL's 104 server. Moreover, every unique query will take up a cache entry in 105 the client's local DNS cache, either a regular entry if the DNSxL 106 lists the query's address, or a negative entry[RFC2308] if it 107 doesn't. 
In the event that hostile mailers (which we will call 108 "spammers" for short) use a unique address per message, the normal 109 DNSxL query traffic will both flood the DNSxL's server and fill local 110 caches with useless single-use entries, forcing out other cached data 111 and causing excess traffic to all the other servers the caches query 112 as well. 114 For blacklists, an obvious approach would be to limit the granularity 115 of DNSxLs, so that, say, each /64 had a separate listing, and the 116 queries only used the high 64 bits of each address. This arguably 117 could limit the damage from DNSxL queries (although even a single /64 118 contains plenty of addresses to swamp caches), but it is not helpful for DNS whitelists, 119 which by their nature list individual IP addresses, and are likely to 120 be far more popular with IPv6 mail than they have been with IPv4 121 mail. 123 The problem of looking up an address in a sorted list of IP addresses 124 stored on a remote DNS server is not unlike the problem of searching 125 for a key in a sorted list of keys stored on a disk, a problem that 126 is common in database applications. An approach called _B-trees_ has 127 been widely used in databases since the 1970s, and is described in 128 standard references such as [KNUTHV3]. The technique in this 129 document stores ordered sets of IP address ranges in DNS records, 130 using TXT records as containers for binary data. The technique is 131 a straightforward adaptation of the way that B-trees store ordered 132 sets of key strings in disk blocks. 134 2. Assumptions and Goals 136 This design is intended to meet this set of design goals: 138 1. Handle arbitrary mixtures of prefix ranges and individual IPs. 140 2. Handle overlapping ranges, and exception entries. (It is typical 141 for a single instance of rbldnsd to serve copies of several 142 DNSxLs and to return answers from all of the DNSxLs that have 143 entries matching a query.) Each range can be tagged with an 144 entry type, and the answer includes the tag. 146 3. Work well with existing DNS servers and caches. The number of 147 different queries should be limited, both to limit cache->server 148 traffic and the number of cache entries the DNSxL uses. I assume 149 that client->cache queries are cheap, while cache->server queries 150 are much more expensive, so it is a worthwhile tradeoff to 151 increase the number of the former to decrease the number of the 152 latter. 154 4. Don't assume senders will be well behaved. In particular, 155 spammers may use a unique IPv6 address for every message, hopping 156 around within each /64 (or whatever the network size is) in which 157 a zombie lives, and there will be many zombies active at the same 158 time. 160 5. Don't assume MTAs remember query results from one lookup to the 161 next, so the necessary info should be available in the DNS cache. 162 (If they do, it doesn't hurt.) 164 6. Be compatible with DNSSEC, but don't depend on DNSSEC. 166 7. Don't attempt to be compatible with existing DNSxLs at the query 167 level, but do provide a client API similar to the one for 168 existing DNSxLs. I expect these lists will mostly be served from 169 something like rbldnsd, although it would also be possible to 170 have an external program create the zone files and serve them 171 from a conventional DNS server such as BIND. 173 3. B-tree data structure 175 The goal is to see if a particular address is in one of the ranges.
176 If we could retrieve the whole list in one query, it'd be easy to do 177 a binary search of the list, but any useful DNSxL is much too big to 178 store in one block. So a b-tree breaks the list into a tree of 179 blocks, so you can do the binary search block by block. 181 As an example, the figure below shows a simple two-level b-tree. It's 182 a tree of blocks, where each block contains an ordered set of values. 183 (The number of ranges can vary from one block to another, since I use 184 a variable length encoding.) The example below is two levels; DNSxLs 185 represented as B-trees turn out to need between 2 and 5 levels. 187 In each non-leaf record, there is conceptually a lower level record 188 between each pair of entries. So in the example below, if you were 189 looking for 29, you'd look in the top record, see it's not there but 190 it's between 25 and 34, so you look at the record after 25, then see 191 it's between 28 and 30. If this weren't a leaf, there'd be another 192 record under 28, but since it is a leaf, you know you didn't find it.

194                               0
195                               |
196                       10 16 25 34 50
197                        |  |  |  |
198    +-----------------+  |  |  +-------------------+
199    |               +----+  +-------+              |
200    |               |               |              |
201 11 12 13 14 15   17 18 19 20 21   27 28 30 32   37 38 39 44 45

203 In a conventional B-tree, each record is a disk block, and the 204 pointers are explicitly stored in each record, typically as a disk 205 block number. But this b-tree is stored in the DNS, where each 206 record has a name. Rather than invent pseudo-block numbers to use as 207 names, I use the entry that points to the block. So the lower blocks 208 in the example above would be named 10, 16, 25, and 34. The root 209 block is always named 0. 211 Since the entries in a DNSxL are actually ranges of IP addresses, 212 each one is represented as the base address of the range, and a 213 bitmask length. For an IPv4 DNSxL, the ranges are CIDR ranges, and 214 the bitmask length is the CIDR size, e.g., in 192.168.40.0/23 the 215 base address is 192.168.40.0 and the mask length is 23. If the mask 216 length is the length of the address (32 for IPv4 or 128 for IPv6), the 217 range is a single address. 219 4. DNS record format 221 A DNSxL is logically an ordered list of IP address ranges. The 222 ranges are in address order from lowest to highest. If one range 223 encloses another, the ranges are ordered by the base address of each 224 range. Ranges with the same base address are ordered from shortest 225 to longest mask length. 227 Each DNS record is a block of bytes representing an ordered sub-list 228 of IP address ranges. Each range also includes a byte saying what 229 kind of entry it is, i.e., what value to return to a client, and a 230 one-bit exception flag saying that this entry is an exception to an 231 enclosing range with the same value. 233 Since each DNS record has to have a name, the name is a text version 234 of the IP address of the range immediately preceding the first range 235 in the record. In this context, there is no advantage to doing rDNS- 236 style nibble reversed naming, so the name is just 32 ASCII characters 237 for an IPv6 DNSxL or 8 characters for an IPv4 DNSxL, the address as a 238 hex number. One block is the root of the B-tree, which is named with 239 all zeros, such as 00000000000000000000000000000000.dnsxl.example or 240 00000000.dnsxl.example.
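As a non-normative illustration of this naming scheme, the following short Python sketch builds the owner name of a block under the placeholder zone dnsxl.example used above (the function name is chosen here for illustration only):

    def block_name(base_addr, zone="dnsxl.example", addrbits=128):
        # A block is named by the base address of the entry that points to
        # it, written as a fixed-width hex number (32 digits for IPv6,
        # 8 for IPv4), not nibble-reversed as in rDNS.
        return "%0*x.%s" % (addrbits // 4, base_addr, zone)

    # The root block is always named with all zeros:
    #   block_name(0)              == "00000000000000000000000000000000.dnsxl.example"
    #   block_name(0, addrbits=32) == "00000000.dnsxl.example"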
242 In order to improve performance, each block representing a list of 243 ranges is stored in the DNS in a binary form using prefix and suffix 244 compression. The simplest way to store a block of arbitrary bytes in 245 the DNS is as a TXT record, with all the strings in the record 246 concatenated. The more entries that fit in a block, the better this 247 will work, so to make the best use of space, each of the strings 248 other than the last should be the maximum length, 255 bytes. Note 249 that despite its name, the contents of the strings in a TXT record 250 can be arbitrary binary data, so although the text form of the 251 records may look rather ugly in a master file, they will work fine in 252 the DNS. (If this design catches on, it might be worth defining an 253 RR type for it to avoid scaring people who expect TXT records to be 254 readable text.) 256 Each block contains a set of the ranges that are in the DNSxL. For 257 blocks that are not leaves of the B-tree (identified by a per-block 258 flag, described below), the entries in the block also partition 259 the address space, with the ranges that fall between the ones in the current 260 block stored in sub-blocks, each named by an entry in the current block. 262 To minimize edge cases, the root block always contains the lowest and 263 highest entries. In other blocks, there can be sub-blocks between 264 each pair of ranges, but not before the first range or after the 265 last. In other words, the last range in a block is not used as the 266 name of a sub-block. 268 5. Handling enclosing ranges 270 Since one range can enclose another, and some ranges enclose 271 exception ranges, a search has to find not just one entry that 272 contains an address, but all the ranges that contain the address. In 273 order to avoid having to walk up and down the tree of blocks, each 274 block also includes copies of any ranges that enclose the first range in 275 the block. This ensures that a search of a block will return all the 276 values for ranges within the block. 278 Note that the copied ranges have addresses less than or equal to the name 279 of the block's record, while the other ranges in the block have 280 addresses greater than the name of the block. Hence it is easy to 281 identify copied enclosing ranges, and they need not be specially 282 marked in the encoded version of the block. 284 6. Lookup algorithm 286 A client looks up an IP address in the DNSxL as follows: 288 1. Make the root block the current block and fetch it. Note that 289 there are no matches found so far. 291 2. Search through the current block for the highest-addressed range 292 entry that contains the target IP address. Also note any 293 preceding ranges that also contain the target address. If there 294 are any matches, discard any previous set of matches, and 295 remember these matches. 297 3. If the IP address is lower than the first range in the current 298 block (disregarding copied ranges), or higher than the last 299 range, stop. 301 4. If this is a leaf block, stop. 303 5. Find the range in the current block whose address is just below 304 the target IP. Use the address of that range as the name of a new 305 block, fetch that block, and make it the current one. 307 6. Go to step 2. 309 The result of this algorithm is a possibly empty set of matches. 311 Some of the matches may be exceptions; if so, delete both the 312 exception and the preceding non-exception match with the same value. 313 If there are any remaining matches, the address is present in the 314 DNSxL, so return all of the values for the remaining matches. If 315 not, the address is not present in the DNSxL.
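The following Python sketch shows one way a client might implement this algorithm. It is illustrative, not normative: fetch_block() stands in for a hypothetical helper that queries the DNS for a block and returns its decoded contents as (is_leaf, entries), where each entry is a (base, masklen, value, is_exception) tuple and addresses are handled as integers.

    def contains(base, masklen, addr, addrbits=128):
        # True if addr falls within the range base/masklen.
        shift = addrbits - masklen
        return (addr >> shift) == (base >> shift)

    def lookup(addr, fetch_block, addrbits=128):
        name = "0" * (addrbits // 4)        # step 1: start at the root block
        matches = []
        while True:
            is_leaf, entries = fetch_block(name)
            found = [e for e in entries if contains(e[0], e[1], addr, addrbits)]
            if found:
                matches = found             # step 2: keep the deepest set of matches
            # Ranges copied from enclosing blocks have bases at or below the
            # block's name; only the block's own ranges delimit its sub-blocks.
            own = [e for e in entries if e[0] > int(name, 16)] or entries
            if addr < own[0][0] or addr > own[-1][0]:
                break                       # step 3: outside this block's span
            if is_leaf:
                break                       # step 4: nothing below to descend into
            base = max(e[0] for e in own if e[0] <= addr)
            name = "%0*x" % (addrbits // 4, base)   # step 5: descend, then repeat
        # An exception cancels a non-exception match with the same value.
        values = [e[2] for e in matches if not e[3]]
        for e in matches:
            if e[3] and e[2] in values:
                values.remove(e[2])
        return values                       # an empty list means "not listed"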
317 It should be evident that this is analogous to a binary tree search, 318 except that each node has a lot more than two descendants. 320 7. Details of block representation 322 The first byte of each block is a flag byte, with the bits 323 interpreted as follows:

325    L P P P P P P P

327 The L bit is set if this is a leaf. The PPPPPPP is a seven-bit 328 number which is the implicit prefix size. If all of the addresses in 329 the block have the same initial bits as the name of the block does, 330 which is fairly likely since DNSxL entries often occur in clusters, 331 the implicit prefix size is the number of common initial bits. The 332 common prefix bits are not stored in the block's entries, but are 333 logically prefixed to each address in the block. An implicit prefix 334 size of zero means no implicit prefix. 336 After that is some number of ranges:

338    X S S S S S S S   (one byte)
339    Value             (one byte)
340    Address           (zero to 16 bytes)

342 The high bit of the first byte (X) is the eXception flag, and means that 343 this range is an exception entry. SSSSSSS is 0-127 (IPv6) or 0-31 344 (IPv4), the mask size minus one. The value is one byte, the details 345 of which are discussed below. The Address is the mask size minus the implicit prefix 346 size bits long (that is, S+1-P bits, where P is the implicit prefix size), rounded up to a byte. That is, it omits the implicit prefix 347 bits, which are obtained from the name of the block, and the bits 348 beyond the mask size. For copied ranges, the mask size may be the 349 same as or less than the implicit prefix size, in which case there 350 are no address bytes in the entry, and the address bits are all taken 351 from the implicit prefix. 353 For example, say an entry is 2001:0DB8:5678:9ABC::/64, the implicit 354 prefix size is 16 bits, and the value is hex 42. Then the entry 355 would be (in hex)

357    3f 42 0D B8 56 78 9A BC

358 The 3f is the mask size minus one (64-1), 42 is the value, the 2001 is omitted since 359 that's the common prefix, and the rest is the address. (A short sketch at the end of this section reproduces this encoding.) 361 Each block should be as large as can fit in a DNS answer. If you 362 don't believe in EDNS0, the limit is about 450 bytes. If you do 363 believe in EDNS0, it's whatever size you think clients ask for, 364 probably about 4000 bytes. 366 7.1. Return values 368 DNSBLs traditionally can return A and TXT records. The A records are 369 in the range 127.x.x.x, and the TXT records can be arbitrary text that may 370 be the same for all records, or customized to some extent per record. 371 In practice, it is rare for a DNSxL to use a large number of 372 different values. Hence the values in this encoding are a single 373 byte, used to look up the values to return. 375 To look up a return value, do a DNS lookup of a record whose name is 376 the letter V followed by the two-digit hex representation of the 377 value, e.g., V00.dnsxl.example for value zero. Use its A and TXT 378 records as the return value. The most common TXT record 379 customization is to insert the looked-up address. In keeping with 380 common software practice, a dollar sign in the TXT record is replaced 381 by the text form of the IP address before returning it to the client 382 application. 384 Note: this is not entirely satisfactory. Some DNSBLs are keyed to a 385 listing database, and use a database record ID rather than the IP 386 address to customize the TXT record. It might be necessary to invent 387 some sort of macro expansion scheme, with the macros in the TXT 388 record and the values to be inserted included in each range's encoded 389 entry.
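To make the layout concrete, here is a small Python sketch of the entry encoding described above (the function name and calling conventions are illustrative only, not part of the specification); it reproduces the worked example:

    def encode_entry(addr, masklen, value, prefix_bits, exception=False,
                     addrbits=128):
        # First byte: the X (exception) flag in the high bit and the mask
        # size minus one in the low seven bits, followed by the value byte.
        first = (0x80 if exception else 0) | (masklen - 1)
        out = bytes([first, value])
        # Address field: the bits between the implicit prefix and the mask,
        # rounded up to whole bytes.  Copied enclosing ranges whose mask is
        # no longer than the implicit prefix store no address bytes at all.
        stored = max(masklen - prefix_bits, 0)
        if stored:
            nbytes = (stored + 7) // 8
            field = (addr >> (addrbits - masklen)) & ((1 << stored) - 1)
            field <<= nbytes * 8 - stored   # pad on the right to a byte boundary
            out += field.to_bytes(nbytes, "big")
        return out

    # The example from the text: 2001:0DB8:5678:9ABC::/64, value hex 42,
    # with a 16-bit implicit prefix taken from the block name.
    entry = encode_entry(0x20010DB856789ABC << 64, 64, 0x42, 16)
    assert entry.hex() == "3f420db856789abc"

How the pad bits of a non-byte-aligned address field are aligned is an assumption in this sketch; the text above does not say.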
391 8. Building and updating DNSxLs 393 DNSxLs are compiled as a list of ranges (prefix and length), and 394 values. They must be turned into a tree of named blocks before being 395 served in the DNS. The technique varies a little depending on 396 whether the tree will be updated incrementally, or rebuilt from 397 scratch if it changes. 399 8.1. Building static DNSxLs 401 A static DNSxL should have the minimum number of blocks, each of 402 which should be as full as possible. The technique to build the tree 403 is a direct adaptation of the B-tree building technique in [WPBTREE]. 405 Start with a sorted list of prefix entries. Save one entry for the 406 next pass, then take as many entries as possible and make a tentative 407 block out of them. Repeat saving an entry and creating a block until 408 all entries are used. These will be the leaf blocks. Now, take the 409 list of saved entries and repeat to create tentative blocks at the 410 next level up. To preserve the invariant that there are no entries 411 covered by a block's range before the first entry in the block or 412 after the last entry, before creating each block, it's necessary to 413 promote the first and last entries covered by the block into the 414 block. A way to do this is to recursively "borrow" the first entry from the 415 first sub-block, which in turn borrows the first entry from its own first 416 sub-block, all the way down to the leaf. Then do the same recursive 417 procedure for the last entry in the last sub-block. Keep repeating 418 to create each level of the tree; eventually a pass will create only a 419 single block, which is the 420 root. 422 Before encoding each block, insert copies of the ranges that enclose 423 the first range in the block. A way to do this is to make a linear 424 pass through the entire sorted list of ranges before starting to 425 create the blocks, linking each entry inside an enclosing range 426 to the entry that encloses it. (This can be done in linear time 427 by keeping a stack of currently "open" enclosing ranges and linking 428 to the range at the top of the stack.) Then when encoding the block, 429 following the links from the first range in the block will find the 430 enclosing ranges that need to be copied. 432 When the list changes, rebuild it from scratch.
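As a rough, non-normative illustration of this bottom-up construction (omitting the promotion of first and last entries and the copying of enclosing ranges described above), a simplified Python sketch might look like this; the function names and the dictionary used to hold the blocks are only for illustration:

    def build_level(entries, max_per_block):
        # One pass: alternately save an entry for the level above and fill a
        # block with the entries that follow it.  Each block is keyed by the
        # saved entry that points to it.
        separators, blocks = [], []
        i = 0
        while i < len(entries):
            separators.append(entries[i])
            blocks.append((entries[i], entries[i + 1 : i + 1 + max_per_block]))
            i += 1 + max_per_block
        return separators, blocks

    def build_tree(entries, max_per_block=400):
        # entries: sorted list of (base, masklen, value) tuples.  Repeat the
        # pass until what remains fits in one block, which becomes the root
        # (the block named with all zeros in the DNS).
        tree = {}
        level = sorted(entries)
        while len(level) > max_per_block:
            level, blocks = build_level(level, max_per_block)
            tree.update(blocks)
        tree["root"] = level
        return tree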
434 8.2. Building and updating dynamic DNSxLs 436 One of the reasons that B-trees are so widely used is that it is 437 possible to update them efficiently without rebuilding them. The 438 same should apply here. 440 The general approach to updates is to add or delete an entry, then if 441 that makes a block too big or makes it empty, rebalance the tree to 442 fix the problem. If a block is too big, move entries into an 443 adjacent block if possible, otherwise split the block. This will 444 require updating the block above, which in turn might overflow if the 445 update involves adding rather than replacing an entry. (It might 446 overflow even with a replacement if it makes the compressible prefix 447 shorter.) In that case, repeat the process, potentially all the way 448 to the root. When deleting an entry, if the block becomes empty, 449 move its pointer entry from the block above up into one of the 450 adjacent blocks, then adjust the upper block as needed. Again, this 451 might cause overflow, in which case move entries between blocks or 452 split the full one. 454 A tree in which each block is completely full is quite expensive to 455 update. The first insertion will cause a leaf to overflow, with 456 overflows rippling all the way up the tree to the root. It would 457 probably be a good idea when building a list that is intended to be 458 updated to leave some slack space in each block, to limit the ripple 459 effect from changes. The enclosing ranges also need to be updated. 461 A significant difference between this design and a conventional 462 B-tree is the version skew due to DNS caching. In a normal B-tree, 463 the pages (blocks) are locked while being changed, and the changes 464 are immediately visible to all the clients. In this case, the 465 clients cache each block for the DNS TTL. If updates change the 466 entries in non-leaf blocks, they will break the links between blocks, 467 since they use the entries as pointers. A possible band-aid is to 468 add temporary CNAME records at the former names pointing to the 469 closest new name, so most (admittedly not all) of the entries can 470 still be located. Once the TTL on the old blocks has expired, the 471 CNAMEs can be deleted. 473 9. Estimated performance 475 The size of entries varies depending on the length of the prefixes 476 and the amount of common prefix compression. A /64 with no common 477 prefix would take 9 bytes, so I'll use 10 bytes as an estimate of 478 average entry size. With EDNS0 and 4K records, that would allow 400 479 entries per block. A two-level tree could hold 160,000 entries, a 480 three level tree 64 million entries, which would need 160,000 blocks. 481 Large v4 DNSxLs like the CBL have about seven million entries now, so 482 this should be adequate. If blocks have to fit in 512 byte 483 responses, that would be about 40 entries per block. A five-level 484 tree could hold 100 million entries in about 2.5 million blocks, 485 still adequate. 487 The number of queries for any particular lookup is the number of 488 levels, which is unlikely to be more than five in a DNSxL of 489 plausible size. The cache behavior obviously depends on both the 490 layout of the entries and the query pattern, but this design avoids 491 some obvious worst cases. If a /64 is either entirely listed, not 492 listed at all, or just has a single /128 listed, all queries for 493 addresses in that /64 will refetch the same four or five records. If 494 a large range of addresses is either listed in one prefix, or not 495 listed at all, all queries will refetch the same set of blocks, which 496 would be likely to be cached. 498 The total number of DNS records used is always less than the number 499 of records for a traditional entry-per-IP DNSxL for the same set of 500 entries. Since all the DNS queries are made by following the tree of 501 entries, clients shouldn't make queries that fail, so there will be 502 no negative cache entries. (This isn't quite true due to version 503 skew in updated DNSxLs, but it's hard to imagine a plausible scenario 504 in which there would be a lot of different failing queries.) This 505 suggests that the overall cache behavior will be no worse than, and 506 quite possibly much better than, the behavior of traditional IPv4 507 DNSxLs. 509 Some preliminary tests using traces of real mail server connection 510 data and 15 minute TTLs suggest that hit rates will be about 80% for 511 DNSxLs that include individual addresses, and close to 100% for 512 DNSxLs that include large ranges. 514 10.
Security considerations 516 Semantically, there is little difference between a DNSxL published 517 using this scheme and one published using the traditional entry per 518 IP approach, since both publish the operator's opinion about some 519 subset of the IP address space. 521 One significant practical difference is that it is much easier for 522 clients to obtain copies of all or part of the database. For a 523 traditional DNSxL, the only way to determine its contents is to query 524 the entire address space (or at least the active part of it) one 525 address at a time, which would require several billion queries for 526 IPv4, and is deterred by rate limiting the queries. In this scheme, 527 the names of all of the DNS records are easy for clients to 528 determine, so they can efficiently walk the tree. While rate 529 limiting is possible, it is less effective since clients fetch more 530 data with each query. It is also easy for a client to fetch all the 531 entries for a particular IP range, such as the range of a network the 532 client controls to see what parts of it are blacklisted. 534 11. Topics for further consideration 536 o There might be better ways to do prefix compression, e.g., a per- 537 entry field that says how many bits are the same as the previous 538 entry. Entries in blocks could be bit rather than byte aligned, 539 although I expect that would be a lot more work for minimal extra 540 compression. There may be clever tricks to allocate entries into 541 blocks to maximize the size of the prefix in each block. If a 542 block consists entirely of /128's it might be worth a special 543 case, leaving out the length byte on each entry. 545 o When adding a new entry to a full leaf block, another possibility 546 would be to make the block a non-leaf, and create two new leaves 547 below it. This would make updates faster and less disruptive, at 548 the cost of possibly slower lookups since some parts of the tree 549 will be deeper. Perhaps a hybrid approach would make sense, 550 rebuild or rebalance the tree when it gets too ragged, with more 551 than a one-level depth difference between the deepest and 552 shallowest leaves. 554 12. IANA considerations 556 This document makes no requests to IANA. All data are stored and 557 queried using existing DNS record types and operations. 559 13. References 561 13.1. References - Normative 563 [RFC1034] Mockapetris, P., "Domain names - concepts and facilities", 564 STD 13, RFC 1034, November 1987. 566 [RFC1035] Mockapetris, P., "Domain names - implementation and 567 specification", STD 13, RFC 1035, November 1987. 569 13.2. References - Informative 571 [KNUTHV3] Knuth, D., "The Art of Computer Programming: Volume 3, 572 Sorting and Searching", 1998. 574 [RBLDNSD] Tokarev, M., "rbldnsd: Small Daemon for DNSBLs". 576 [RFC2308] Andrews, M., "Negative Caching of DNS Queries (DNS 577 NCACHE)", RFC 2308, March 1998. 579 [RFC4862] Thomson, S., Narten, T., and T. Jinmei, "IPv6 Stateless 580 Address Autoconfiguration", RFC 4862, September 2007. 582 [RFC5782] Levine, J., "DNS Blacklists and Whitelists", RFC 5782, 583 February 2010. 585 [WPBTREE] Wikipedia, "B-tree", December 2010. 587 Appendix A. Change Log 589 *NOTE TO RFC EDITOR: This section may be removed upon publication of 590 this document as an RFC.* 592 A.1. Changes from -01 to -02 594 Fix typos 596 A.2. Changes from -00 to -01 598 Change CIDRs to prefixes. Allow for IPv4 addresses. 600 Add possible updates producing unbalanced trees. 
602 Author's Address 604 John Levine 605 Taughannock Networks 606 PO Box 727 607 Trumansburg, NY 14886 609 Phone: +1 831 480 2300 610 Email: standards@taugh.com 611 URI: http://jl.ly