ECRIT                                                         M. Thomson
Internet-Draft                                        Andrew Corporation
Intended status: Informational                                   K. Wolf
Expires: August 5, 2011                                      nic.at GmbH
                                                        February 1, 2011


               Describing Boundaries for Civic Addresses
                 draft-thomson-ecrit-civic-boundary-02

Abstract

Algorithms for decision-making based on civic address inputs are
described. This includes an algorithm for determining whether one
civic address is entirely contained within another. Other algorithms
and supplementary discussions relating to the use of civic addresses
in describing boundaries are included.

Status of this Memo

This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current
Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."

This Internet-Draft will expire on August 5, 2011.

Copyright Notice

Copyright (c) 2011 IETF Trust and the persons identified as the
document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.

Table of Contents

1. Introduction
2. Civic Address Model
3. Civic Address Boundaries
   3.1. Determining if an Address is Within a Boundary
   3.2. Algorithm Summary
   3.3. False Negatives
   3.4. False Positives
   3.5. Address Boundary Limitations
4. Boundary Combining Algorithms
   4.1. Boundary Unions
   4.2. Boundary Intersections
   4.3. Avoiding False Positives
5. Example
6. IANA Considerations
7. Security Considerations
8. Informative References
Authors' Addresses

1. Introduction

Civic address information ([RFC4776], [RFC5139]) can be used to
describe the location of an entity in terms of the human-constructed
environment. This description can be made more or less precise
through the addition or removal of labels (respectively).

A less precise civic address can be used to describe a zone or region
by only including sufficient labels to identify the region. This
method is used in the Location-to-Service Translation (LoST) protocol
[RFC5222], to convey information to a client about the extents of a
particular region where service is guaranteed to be consistent.

This information, called a service boundary, allows a client to make
decisions about civic addresses other than the one used to query. If
another civic address is determined to be "within" the service
boundary, the client does not need to request service information
from the LoST server.

LoST does not provide a definition of "within" for civic addresses.
This document describes an algorithm that provides a definition for
whether a civic address is contained within another address.

Other operations on civic addresses are described, allowing client
software to make decisions about the intersection and union of two
civic addresses.

2. Civic Address Model

The simplest model of a civic address is that which considers it as
an unordered set of labels. Each label is assigned zero or more
values; each value has an associated language (and script).

The format in RFC 4776 [RFC4776] allows for values to be given to
labels with different languages or scripts. No special
considerations apply in applying this model.

The format in RFC 5139 [RFC5139] uses multiple "civicAddress"
elements to form a single address if labels are provided in multiple
languages. Thus, when extracting address information from a PIDF-LO
[RFC4119] document, a civic address in this model is formed from all
"civicAddress" elements in the same tuple.
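
As a rough illustration of this model, the sketch below merges several
per-language "civicAddress" elements into one mapping from label to
per-language values. The dictionary layout, the (language, labels)
input pairs, and the name merge_civic_elements are assumptions made
for this example; none of them is defined by [RFC4776] or [RFC5139].
The sketch also assumes at most one value per label and language.

```python
from typing import Dict, List, Tuple

# Assumed in-memory model: label -> {language tag -> value}.
Address = Dict[str, Dict[str, str]]


def merge_civic_elements(
    elements: List[Tuple[str, Dict[str, str]]]
) -> Address:
    """Combine per-language elements from one tuple into one address.

    Each input pair is a hypothetical, already-parsed form of a single
    "civicAddress" element: (language tag, {label: value}).
    """
    address: Address = {}
    for lang, labels in elements:
        for label, value in labels.items():
            # Later elements add languages; they never remove labels.
            address.setdefault(label, {})[lang] = value
    return address


merged = merge_civic_elements([
    ("de", {"country": "AT", "A3": "Wien"}),
    ("en", {"country": "AT", "A3": "Vienna"}),
])
```

Here merged["A3"] holds one value per language, which is the shape
the comparison algorithms in this document operate on.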

A civic address describes a series of spatial partitions or regions.
Every address includes an implied partition that identifies the
habited portion of the Earth. Every label with a value describes a
partition of space at a specific scale. The intersection of the
spaces described by all the included labels is the resulting
location.

The algorithm described in this document relies on the following
rule:

   The location described by a set of civic address labels is
   entirely contained within the location described by any subset of
   those labels.

The following characteristics of civic addresses have no bearing on
the algorithms described:

1. The civic address formats of [RFC4776] and [RFC5139] include a
   limited set of hierarchical elements. The "country" and "A1"
   through "A6" labels follow a strict hierarchy. The algorithms
   described do not rely on this hierarchy.

2. The physical region described by a civic address is not
   necessarily contiguous. For instance, an address might omit a
   thoroughfare name, but include a house number of 23. Such an
   address identifies every house at number 23 within the area
   described by other labels.

More sophisticated models and algorithms are possible in the presence
of additional information about the address data. If this sort of
information is present, many more options are available for
processing addresses. The simple algorithms in this document operate
on the address information only, but do not preclude use of outside
information.

3. Civic Address Boundaries

A civic address boundary has the same format as a civic address.

A civic address boundary describes a region by containing fewer
labels than the addresses of locations contained within the boundary.
In that respect, the boundary might be considered an incomplete
address, although a boundary is actually a valid civic address that
simply describes a larger location.

A larger region is described by including fewer labels; a smaller
region might be described by including more labels, or labels that
are more specific.

For example, if the described region is the province of Zeeland in
the Netherlands, only two labels are required: a country of "NL" and
an A1 field of "ZE".

A label that is omitted from the civic address boundary indicates
that civic addresses within the boundary may have any value for the
label.

This process does not provide any assurance that a civic address
exists, only that if it does exist, it is contained entirely within
the described boundary. Determining whether such an address actually
exists usually requires additional information, and is therefore not
considered by this document.

3.1. Determining if an Address is Within a Boundary

A civic address is entirely enclosed within a boundary if every label
of the boundary that has a value has an equivalent value in the
address. A civic address boundary can entirely enclose another civic
address boundary.

Case folding is performed on values before comparison.

A label is considered equivalent if at least one value from the
boundary has the same value in the address for the same language (and
script). If values are provided in multiple languages, any language
that is present in both boundary and address can be used.

[[TBD: If the label contains different values for the same language,
does this override the above - in light of the Austrian example
below, it's probably better that the more lenient equivalence test is
used.]]

Labels that have the same value, but a different language, are not
equivalent. Without information on different translations of the
label, the label must be considered to be different.

3.2. Algorithm Summary

The algorithm for determining whether an address is contained
entirely within a given boundary can be summarized by the following
pseudocode:

   SET iswithin = true
   FOR EACH label IN boundary DO:
     IF boundary[label] exists THEN:
       SET equivalent = false
       FOR lang IN boundary[label] DO:
         IF boundary[label][lang] == address[label][lang] THEN:
           SET equivalent = true
         END
       END
       IF NOT equivalent THEN SET iswithin = false
     END
   END
   RETURN iswithin
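
The pseudocode above can be sketched in Python as follows. This is a
minimal sketch, not a normative implementation: it assumes an address
is held as a mapping from label to per-language values, with one value
per language, and it compares language tags exactly as given. The
names Address and is_within are introduced here for illustration.

```python
from typing import Dict

# Assumed model: label -> {language tag -> value}.
Address = Dict[str, Dict[str, str]]


def is_within(address: Address, boundary: Address) -> bool:
    """True if every valued boundary label has an equivalent value."""
    for label, boundary_values in boundary.items():
        if not boundary_values:
            continue  # a label without a value places no constraint
        address_values = address.get(label, {})
        # Equivalent if any language shared by both sides carries the
        # same value after case folding (Section 3.1).
        equivalent = any(
            lang in address_values
            and address_values[lang].casefold() == value.casefold()
            for lang, value in boundary_values.items()
        )
        if not equivalent:
            return False
    return True


# The Zeeland boundary from Section 3: country "NL" and A1 "ZE".
zeeland = {"country": {"nl": "NL"}, "A1": {"nl": "ZE"}}
inside = {"country": {"nl": "NL"}, "A1": {"nl": "ze"},
          "A3": {"nl": "Middelburg"}}
outside = {"country": {"nl": "NL"}, "A1": {"nl": "NH"}}
```

With these inputs, is_within(inside, zeeland) holds because "ze"
and "ZE" match after case folding, while is_within(outside, zeeland)
does not. Returning early on the first mismatch gives the same answer
as the pseudocode, which only records the failure and keeps looping.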

3.3. False Negatives

This test can produce false negatives for a number of reasons:

1. A particular label might be specified with different languages in
   the boundary and the address. This label might be considered
   equivalent if the two values have the same meaning.

   For instance, the German city Muenchen is known as Munich in
   English - knowledge of this translation is required to determine
   that these two values are equivalent.

2. A label might have equivalent values, but subtly different
   language tags [RFC5646] that result in a failed comparison.

   For instance, in many cases, a language tag of "en" is not
   significantly different from variants that use the same primary
   language subtag. Identical values with "en" and "en-US" or
   "en-Latn" would compare as different, even though the latter two
   tags are simply more specific than the first. Even "en-GB" is
   rarely different to these for text that is used in addressing.

   For creators of civic addresses and boundaries, the guidance in
   [RFC5646], Section 4.1 recommends that subtags are only added if
   they include useful distinguishing information. This is intended
   to avoid processing errors such as this. This guidance is
   particularly relevant in relation to use of CAtype 128 (script)
   in the binary encoding of [RFC4776], which is often unnecessarily
   specified.

3. The address might use different labels than the boundary to
   produce the same result.

   For instance, a boundary might use labels "A1" through "A6" to
   describe a location, whereas the same location is described using
   a postal code in place of these elements. The address is within
   the described boundary, but this cannot be determined without
   knowing that the postal code and A-labels are equivalent.

   In some countries, specific address codes can be used to replace
   some or all of the other address labels. In some instances, an
   address consisting of the country and the "ADDCODE" label can be
   sufficiently descriptive for an application; however, this would
   not be identified as being within a boundary that was specified
   using other address labels.

4. There are many cases where a value can be expressed in different
   ways. This includes abbreviations, commonly accepted
   misspellings, and generally recognized variations in addresses.

   For instance, abbreviations are common for thoroughfare suffixes,
   like "Street" ("St." or "St") or "Road" ("Rd." or "Rd").

   In another example, the Austrian addressing recommendations
   [RFC5774] let certain labels contain either a code or a
   descriptive name. Without knowing that "Oberbaumgarten" and
   "Oberbaumgarten;1208" refer to the same Katastralgemeindenamen,
   these values must be considered to be different.

By using additional information, a system might be able to identify
more equivalent labels than the basic algorithm. This can remove
some, if not all such false negatives. However, a system should not
rely on another system having and employing such knowledge.
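
The language tag case (item 2) can be illustrated with a short Python
sketch. The strict test mirrors the basic algorithm; the lenient test
is one possible local relaxation that compares only the primary
language subtag. The lenient rule is an assumption of this example,
not something this document specifies, and a system that applies it
cannot expect its peers to do the same.

```python
from typing import Dict

Values = Dict[str, str]  # assumed model: language tag -> value


def primary_subtag(tag: str) -> str:
    """Return the primary language subtag, e.g. "en" for "en-US"."""
    return tag.split("-", 1)[0].lower()


def strict_equivalent(boundary_values: Values,
                      address_values: Values) -> bool:
    """Basic test: identical language tag, same case-folded value."""
    return any(
        lang in address_values
        and address_values[lang].casefold() == value.casefold()
        for lang, value in boundary_values.items()
    )


def lenient_equivalent(boundary_values: Values,
                       address_values: Values) -> bool:
    """Hypothetical relaxation: primary subtags must agree instead."""
    return any(
        primary_subtag(b_lang) == primary_subtag(a_lang)
        and b_value.casefold() == a_value.casefold()
        for b_lang, b_value in boundary_values.items()
        for a_lang, a_value in address_values.items()
    )
```

For a boundary value {"en": "Munich"} and an address value
{"en-US": "Munich"}, the strict test reports a mismatch (a false
negative) and the lenient test reports a match. Neither test can
relate {"de": "Muenchen"} to {"en": "Munich"}; that requires
knowledge of the translation, as described in item 1.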

3.4. False Positives

For a civic address that exists, a positive result from this
algorithm guarantees that the address is entirely contained within
the boundary.

No allowance is made for addresses that do not exist. It is trivially
possible to construct a nonsensical or non-existent civic address
that is considered "within" a boundary using this algorithm. This
can be done by starting with the civic address boundary and adding
arbitrary values to labels that do not already have values.
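
The construction described above can be shown with a small Python
sketch. The is_within function is a compact rendering of the
algorithm in Section 3.2 under an assumed label-to-language-to-value
model; pad_boundary and the made-up values are hypothetical and exist
only for this illustration.

```python
from typing import Dict

Address = Dict[str, Dict[str, str]]  # label -> {language -> value}


def is_within(address: Address, boundary: Address) -> bool:
    """Compact form of the containment test from Section 3.2."""
    return all(
        any(
            lang in address.get(label, {})
            and address[label][lang].casefold() == value.casefold()
            for lang, value in values.items()
        )
        for label, values in boundary.items()
        if values
    )


def pad_boundary(boundary: Address, extra: Address) -> Address:
    """Copy a boundary and give values to labels it leaves unset."""
    padded = {label: dict(values) for label, values in boundary.items()}
    for label, values in extra.items():
        padded.setdefault(label, dict(values))
    return padded


zeeland = {"country": {"en": "NL"}, "A1": {"en": "ZE"}}
# A3 (city) and HNO (house number) values invented for this example.
made_up = pad_boundary(
    zeeland, {"A3": {"en": "Atlantis"}, "HNO": {"en": "99999"}}
)
```

The test accepts made_up as being within zeeland, even though no
such address exists. The algorithm has no way to tell the difference.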

3.5. Address Boundary Limitations

The address format allows a limited expression for address
boundaries. This representation can only be used in limited
applications. This simple boundary expression is not suitable for
any application that is sensitive to false negatives.

Boundary interchange between LoST servers
[I-D.ietf-ecrit-lost-sync] would likely require multiple specific
boundaries to describe a single boundary. A number of well-known
cases would generate a very large number of such boundaries. For
instance, if a boundary runs up the middle of a street that places
odd and even house numbers on opposite sides of the street, each
house on that street would require an individual address.
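
To give a sense of the scale involved, the following Python sketch
enumerates one boundary per house for one side of a street. It is a
sketch under assumptions: the function name, the use of the "HNO"
(house number) label to carry the number, and the street data are all
chosen for this example.

```python
from typing import Dict, Iterable, List

Address = Dict[str, Dict[str, str]]  # label -> {language -> value}


def boundaries_for_street_side(
    street: Address,
    house_numbers: Iterable[int],
    parity: int,
    lang: str = "en",
) -> List[Address]:
    """Build one boundary per house on one side of a street.

    parity is 0 for the even-numbered side and 1 for the odd side.
    """
    boundaries: List[Address] = []
    for number in house_numbers:
        if number % 2 != parity:
            continue
        # Copy the street-level labels, then pin the house number.
        boundary = {label: dict(v) for label, v in street.items()}
        boundary["HNO"] = {lang: str(number)}
        boundaries.append(boundary)
    return boundaries


main_street = {"country": {"en": "US"}, "A1": {"en": "CA"},
               "RD": {"en": "Main"}}
odd_side = boundaries_for_street_side(main_street, range(1, 101), 1)
```

A street with house numbers 1 through 100 needs 50 separate
boundaries to describe its odd side alone, where a geodetic boundary
would need a single line down the middle of the street.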

Concatenation of address data can introduce other limitations. The
limited set of address labels can mean that each field can hold
several discrete units of data. An address mapping [RFC5774] might
specify that underlying data be concatenated and mapped to a single
label.

The algorithm described here operates on entire labels only. The
algorithm and boundary expression do not allow only part of a label
to change. If concatenated data is included in the one label, a
generic processor cannot know of this and distinguish between parts
that must match and parts that do not matter. Doing so would require
knowledge of the modes of concatenation and the delimiters used, as
well as syntax that distinguishes important parts from unimportant
parts.

4. Boundary Combining Algorithms

In some cases, it might be necessary to combine boundaries. This
section describes three algorithms, including simple union and
intersection.

4.1. Boundary Unions

The union of two civic address boundaries (or addresses) is a single
boundary that contains all civic addresses that are contained within
either original boundary.

The boundary that forms the union of two other boundaries is formed
of all labels that are equivalent in both boundaries.

If labels differ in the two boundaries, then the resulting union
might also include addresses that are in neither boundary. Only use
this algorithm if false positives are acceptable.

   SET union = empty address
   FOR EACH label IN boundary1,boundary2 DO:
     IF boundary1[label] EQUIV boundary2[label] THEN:
       SET union[label] = boundary1[label]
     END
   END
   RETURN union

The value of any label in a union of boundaries should include a
value for all languages that are present in both boundaries.
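
A Python sketch of the union follows. It assumes the
label-to-language-to-value model and treats EQUIV as the equivalence
test of Section 3.1. Like the pseudocode, it copies the values from
the first boundary, which include every language the two boundaries
share for that label. The function names are illustrative.

```python
from typing import Dict

Values = Dict[str, str]      # language tag -> value
Address = Dict[str, Values]  # label -> values


def values_equivalent(a: Values, b: Values) -> bool:
    """True if some shared language has the same case-folded value."""
    return any(
        lang in b and b[lang].casefold() == value.casefold()
        for lang, value in a.items()
    )


def boundary_union(boundary1: Address, boundary2: Address) -> Address:
    """Keep only the labels on which both boundaries agree."""
    union: Address = {}
    for label, values1 in boundary1.items():
        values2 = boundary2.get(label)
        if values2 and values_equivalent(values1, values2):
            union[label] = dict(values1)
    return union


los_angeles = {"country": {"en": "US"}, "A1": {"en": "CA"},
               "A3": {"en": "Los Angeles"}}
san_diego = {"country": {"en": "US"}, "A1": {"en": "ca"},
             "A3": {"en": "San Diego"}}
california = boundary_union(los_angeles, san_diego)
```

The two inputs disagree on "A3", so the union drops that label and
keeps only "country" and "A1". The result covers both cities, but it
also covers every other address in the state, which is the false
positive risk noted above.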

4.2. Boundary Intersections

The intersection of two boundaries (or addresses) is a single
boundary that contains all addresses that are found in both original
boundaries.

The boundary that forms the intersection of two other boundaries is
formed of the combined value of the labels from both boundaries.

If the value of any label is present in both boundaries, but not
equivalent, the two boundaries do not intersect for the purposes of
this algorithm. In practice, while it is possible that two such
boundaries could overlap, this algorithm cannot detect this.
Furthermore, the civic address representation does not provide a way
to express such an overlap.

   SET intersection = empty address
   FOR EACH label IN boundary1,boundary2 DO:
     IF boundary1[label] exists AND boundary2[label] exists
         AND boundary1[label] NOT EQUIV boundary2[label] THEN:
       ERROR No overlap
     END
     IF boundary1[label] exists THEN:
       SET intersection[label] = boundary1[label]
     ELSE:
       SET intersection[label] = boundary2[label]
     END
   END
   RETURN intersection

The value of any label in an intersection of boundaries may include a
value for all languages that are present in both boundaries.
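
The intersection can be sketched in Python in the same way. This is a
sketch under the same assumed model; returning None to signal "no
overlap" is a choice made for this example, where the pseudocode
raises an error. As in the pseudocode, values from the first boundary
are preferred when both boundaries supply a label.

```python
from typing import Dict, Optional

Values = Dict[str, str]      # language tag -> value
Address = Dict[str, Values]  # label -> values


def values_equivalent(a: Values, b: Values) -> bool:
    """True if some shared language has the same case-folded value."""
    return any(
        lang in b and b[lang].casefold() == value.casefold()
        for lang, value in a.items()
    )


def boundary_intersection(
    boundary1: Address, boundary2: Address
) -> Optional[Address]:
    """Combine the labels of both boundaries, or None if they clash."""
    result: Address = {}
    labels = list(boundary1) + [l for l in boundary2 if l not in boundary1]
    for label in labels:
        values1 = boundary1.get(label)
        values2 = boundary2.get(label)
        if values1 and values2:
            if not values_equivalent(values1, values2):
                return None  # present in both, but not equivalent
            result[label] = dict(values1)
        elif values1 or values2:
            result[label] = dict(values1 or values2)
    return result


zeeland = {"country": {"nl": "NL"}, "A1": {"nl": "ZE"}}
middelburg = {"country": {"nl": "nl"}, "A3": {"nl": "Middelburg"}}
noord_holland = {"country": {"nl": "NL"}, "A1": {"nl": "NH"}}
```

Intersecting zeeland with middelburg yields a boundary carrying all
three labels. Intersecting zeeland with noord_holland yields None,
because the two "A1" values are both present and not equivalent.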

4.3. Avoiding False Positives

The service boundaries in LoST [RFC5222] rely on the absence of false
positives when determining if an address is within a boundary. False
negatives are tolerated, because this only results in the LoST client
making another request to discover an unchanged service. A false
positive is not desirable because a device could retain a service
mapping that is likely to be invalid.

[I-D.ietf-ecrit-rough-loc] describes an algorithm where an address is
formed of the intersection of multiple boundaries. If no
intersection exists, the result of this algorithm is failure, and
usable location information cannot be provided to the LoST client.

For a LoST service boundary, the goal is to provide a boundary that
contains as few labels as possible. For a rough location, the goal
is to provide an address with as few labels as possible. False
positives are not desirable in either case.

Thus, both cases have a similar goal, and both have a more precise
address to operate on. In LoST, this is the address used to query
the server; for rough locations, this is the address that is to be
hidden.

To avoid false positives in determining whether an address falls
within a boundary, labels from the more precise address are added.
Any label where the inputs to the union or intersection disagree is
given a value from the precise address. This ensures that the
resulting address or boundary entirely contains the precise address.

Alternatively, given a precise address and a number of boundaries
that are to be combined by either union or intersection, an address
or boundary can be formed by removing all labels from the precise
address that do not have a value in any boundary. This algorithm
produces an identical result.

   SET result = precise address
   FOR EACH label IN result DO:
     SET found = false
     FOR EACH boundary IN boundaries DO:
       IF boundary[label] exists THEN SET found = true
     END
     IF NOT found THEN CLEAR result[label]
   END
   RETURN result
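
In Python, this last algorithm reduces to a filter over the precise
address. The sketch below assumes the same label-to-language-to-value
model used for illustration elsewhere; the function name and sample
data are hypothetical.

```python
from typing import Dict, List

Address = Dict[str, Dict[str, str]]  # label -> {language -> value}


def restrict_to_boundary_labels(
    precise: Address, boundaries: List[Address]
) -> Address:
    """Keep the precise labels that have a value in any boundary."""
    return {
        label: dict(values)
        for label, values in precise.items()
        if any(boundary.get(label) for boundary in boundaries)
    }


precise = {"country": {"en": "US"}, "A1": {"en": "CA"},
           "A3": {"en": "Los Angeles"}, "HNO": {"en": "1313"}}
boundaries = [
    {"country": {"en": "US"}, "A1": {"en": "CA"}},
    {"country": {"en": "US"}, "A3": {"en": "San Diego"}},
]
result = restrict_to_boundary_labels(precise, boundaries)
```

The result keeps "country", "A1" and "A3" and drops "HNO". The value
for "A3" is the one from the precise address ("Los Angeles"), not the
one from the boundary, so the precise address is always contained in
the result.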

5. Example

The following civic address boundary (shown in XML form [RFC5139])
describes a region of Los Angeles, USA.
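
   <civicAddress
       xmlns="urn:ietf:params:xml:ns:pidf:geopriv10:civicAddr">
     <country>US</country>
     <A1>CA</A1>
     <A2>Orange</A2>
     <A3>Los Angeles</A3>
     <A4>Anaheim</A4>
     <PC>92802</PC>
   </civicAddress>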

The following address includes additional labels; it does not change
the value of any label.

   US
   CA
   Orange
   Los Angeles
   Anaheim
   SouthHarborBoulevard
   1313Disneyland
   92802

An equivalent address that would not be considered "within" the
boundary by the generic algorithm would be one that specified "Orange
County" for the "A2" label. It's also possible to unambiguously
describe the location without fields "A1" through "A4", since a ZIP
(postal) code of "92802" provides sufficient information.

Thus, the following variant is not considered "within" the boundary,
even if it is the same address. That is, without context-specific
knowledge, the algorithm produces a false negative on this address.

   US
   SouthHarborBoulevard
   1313Disneyland
   92802
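
The outcomes described in this section can be checked with a short
Python sketch. It models only the labels named in the text
("country", "A1" through "A4", and "PC"), tags every value as "en",
and adds an "HNO" house number label to the full address; these
choices, and the helper names, are assumptions made for illustration.

```python
from typing import Dict

Address = Dict[str, Dict[str, str]]  # label -> {language -> value}


def is_within(address: Address, boundary: Address) -> bool:
    """Compact form of the containment test from Section 3.2."""
    return all(
        any(
            lang in address.get(label, {})
            and address[label][lang].casefold() == value.casefold()
            for lang, value in values.items()
        )
        for label, values in boundary.items()
        if values
    )


def en(**labels: str) -> Address:
    """Build an address whose values are all tagged "en"."""
    return {label: {"en": value} for label, value in labels.items()}


boundary = en(country="US", A1="CA", A2="Orange", A3="Los Angeles",
              A4="Anaheim", PC="92802")
# House number label assumed; the street labels are left out.
full_address = {**boundary, **en(HNO="1313")}
county_variant = {**boundary, **en(A2="Orange County")}
postal_only = en(country="US", PC="92802")

for name, candidate in [("full", full_address),
                        ("county", county_variant),
                        ("postal", postal_only)]:
    print(name, is_within(candidate, boundary))
```

Under these assumptions the full address is reported as within the
boundary. The "Orange County" variant and the postal-code-only
variant are both rejected, which are the false negatives described
above.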

6. IANA Considerations

This document has no IANA actions.

[[RFC Editor: please remove this section prior to publication.]]

7. Security Considerations

This document describes a civic address model and algorithms for
manipulating civic addresses and boundaries in the same format.
There are no known security considerations arising from the described
application of these algorithms.

8. Informative References

[RFC4119]  Peterson, J., "A Presence-based GEOPRIV Location Object
           Format", RFC 4119, December 2005.

[RFC4776]  Schulzrinne, H., "Dynamic Host Configuration Protocol
           (DHCPv4 and DHCPv6) Option for Civic Addresses
           Configuration Information", RFC 4776, November 2006.

[RFC5139]  Thomson, M. and J. Winterbottom, "Revised Civic Location
           Format for Presence Information Data Format Location
           Object (PIDF-LO)", RFC 5139, February 2008.

[RFC5222]  Hardie, T., Newton, A., Schulzrinne, H., and H.
           Tschofenig, "LoST: A Location-to-Service Translation
           Protocol", RFC 5222, August 2008.

[RFC5646]  Phillips, A. and M. Davis, "Tags for Identifying
           Languages", BCP 47, RFC 5646, September 2009.

[RFC5774]  Wolf, K. and A. Mayrhofer, "Considerations for Civic
           Addresses in the Presence Information Data Format Location
           Object (PIDF-LO): Guidelines and IANA Registry
           Definition", BCP 154, RFC 5774, March 2010.

[I-D.ietf-ecrit-rough-loc]
           Barnes, R. and M. Lepinski, "Using Imprecise Location for
           Emergency Context Resolution",
           draft-ietf-ecrit-rough-loc-03 (work in progress),
           August 2010.

[I-D.ietf-ecrit-lost-sync]
           Schulzrinne, H. and H. Tschofenig, "Synchronizing
           Location-to-Service Translation (LoST) Protocol based
           Service Boundaries and Mapping Elements",
           draft-ietf-ecrit-lost-sync-09 (work in progress),
           March 2010.

Authors' Addresses

Martin Thomson
Andrew Corporation
Andrew Building (39)
Wollongong University Campus
Northfields Avenue
Wollongong, NSW 2522
Australia

Phone: +61 2 4221 2915
Email: martin.thomson@andrew.com

Karl Heinz Wolf
nic.at GmbH
Karlsplatz 1/2/9
Wien A-1010
Austria

Phone: +43 1 5056416 37
Email: karlheinz.wolf@nic.at
URI: http://www.nic.at/