idnits 2.17.1
draft-ietf-geopriv-uncertainty-00.txt:
Checking boilerplate required by RFC 5378 and the IETF Trust (see
https://trustee.ietf.org/license-info):
----------------------------------------------------------------------------
No issues found here.
Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
----------------------------------------------------------------------------
No issues found here.
Checking nits according to https://www.ietf.org/id-info/checklist :
----------------------------------------------------------------------------
No issues found here.
Miscellaneous warnings:
----------------------------------------------------------------------------
== The copyright year in the IETF Trust and authors Copyright Line does not
match the current year
-- The document date (January 22, 2014) is 3746 days in the past. Is this
intentional?
Checking references for intended status: Proposed Standard
----------------------------------------------------------------------------
(See RFCs 3967 and 4897 for information about using normative references
to lower-maturity documents in RFCs)
-- Looks like a reference, but probably isn't: '1' on line 1508
-- Looks like a reference, but probably isn't: '2' on line 685
-- Looks like a reference, but probably isn't: '3' on line 685
== Missing Reference: '2d' is mentioned on line 826, but not defined
== Missing Reference: '3d' is mentioned on line 826, but not defined
-- Looks like a reference, but probably isn't: '0' on line 1508
== Unused Reference: 'RFC3694' is defined on line 1385, but no explicit
reference was found in the text
-- Obsolete informational reference (is this intentional?): RFC 3825
(Obsoleted by RFC 6225)
Summary: 0 errors (**), 0 flaws (~~), 4 warnings (==), 6 comments (--).
Run idnits with the --verbose option for more detailed information about
the items above.
--------------------------------------------------------------------------------
2 GEOPRIV M. Thomson
3 Internet-Draft Mozilla
4 Intended status: Standards Track J. Winterbottom
5 Expires: July 26, 2014 Unaffiliated
6 January 22, 2014
8 Representation of Uncertainty and Confidence in PIDF-LO
9 draft-ietf-geopriv-uncertainty-00
11 Abstract
13 The key concepts of uncertainty and confidence as they pertain to
14 location information are defined. Methods for the manipulation of
15 location estimates that include uncertainty information are outlined.
17 Status of This Memo
19 This Internet-Draft is submitted in full conformance with the
20 provisions of BCP 78 and BCP 79.
22 Internet-Drafts are working documents of the Internet Engineering
23 Task Force (IETF). Note that other groups may also distribute
24 working documents as Internet-Drafts. The list of current Internet-
25 Drafts is at http://datatracker.ietf.org/drafts/current/.
27 Internet-Drafts are draft documents valid for a maximum of six months
28 and may be updated, replaced, or obsoleted by other documents at any
29 time. It is inappropriate to use Internet-Drafts as reference
30 material or to cite them other than as "work in progress."
32 This Internet-Draft will expire on July 26, 2014.
34 Copyright Notice
36 Copyright (c) 2014 IETF Trust and the persons identified as the
37 document authors. All rights reserved.
39 This document is subject to BCP 78 and the IETF Trust's Legal
40 Provisions Relating to IETF Documents
41 (http://trustee.ietf.org/license-info) in effect on the date of
42 publication of this document. Please review these documents
43 carefully, as they describe your rights and restrictions with respect
44 to this document. Code Components extracted from this document must
45 include Simplified BSD License text as described in Section 4.e of
46 the Trust Legal Provisions and are provided without warranty as
47 described in the Simplified BSD License.
49 Table of Contents
51 1. Introduction
52 1.1. Conventions and Terminology
53 2. A General Definition of Uncertainty
54 2.1. Uncertainty as a Probability Distribution
55 2.2. Deprecation of the Terms Precision and Resolution
56 2.3. Accuracy as a Qualitative Concept
57 3. Uncertainty in Location
58 3.1. Targets as Points in Space
59 3.2. Representation of Uncertainty and Confidence in PIDF-LO
60 3.3. Uncertainty and Confidence for Civic Addresses
61 3.4. DHCP Location Configuration Information and Uncertainty
62 4. Representation of Confidence in PIDF-LO
63 4.1. The "confidence" Element
64 4.2. Generating Locations with Confidence
65 4.3. Consuming and Presenting Confidence
66 5. Manipulation of Uncertainty
67 5.1. Reduction of a Location Estimate to a Point
68 5.1.1. Centroid Calculation
69 5.1.1.1. Arc-Band Centroid
70 5.1.1.2. Polygon Centroid
71 5.2. Conversion to Circle or Sphere
72 5.3. Three-Dimensional to Two-Dimensional Conversion
73 5.4. Increasing and Decreasing Uncertainty and Confidence
74 5.4.1. Rectangular Distributions
75 5.4.2. Normal Distributions
76 5.5. Determining Whether a Location is Within a Given Region
77 5.5.1. Determining the Area of Overlap for Two Circles
78 5.5.2. Determining the Area of Overlap for Two Polygons
79 6. Examples
80 6.1. Reduction to a Point or Circle
81 6.2. Increasing and Decreasing Confidence
82 6.3. Matching Location Estimates to Regions of Interest
83 6.4. PIDF-LO With Confidence Example
84 7. Confidence Schema
85 8. IANA Considerations
86 8.1. URN Sub-Namespace Registration for
87 urn:ietf:params:xml:ns:geopriv:conf
88 8.2. XML Schema Registration
89 9. Security Considerations
90 10. Acknowledgements
91 11. References
92 11.1. Normative References
93 11.2. Informative References
94 Appendix A. Conversion Between Cartesian and Geodetic
95 Coordinates in WGS84
96 Appendix B. Calculating the Upward Normal of a Polygon
97 B.1. Checking that a Polygon Upward Normal Points Up
98 Authors' Addresses
100 1. Introduction
102 Location information represents an estimation of the position of a
103 Target. Under ideal circumstances, a location estimate precisely
104 reflects the actual location of the Target. In reality, there are
105 many factors that introduce errors into the measurements that are
106 used to determine location estimates.
108 The process by which measurements are combined to generate a location
109 estimate is outside of the scope of work within the IETF. However,
110 the results of such a process are carried in IETF data formats and
111 protocols. This document outlines how uncertainty, and its
112 associated datum, confidence, are expressed and interpreted.
114 This document provides a common nomenclature for discussing
115 uncertainty and confidence as they relate to location information.
117 This document also provides guidance on how to manage location
118 information that includes uncertainty. Methods for expanding or
119 reducing uncertainty to obtain a required level of confidence are
120 described. Methods for determining the probability that a Target is
121 within a specified region based on its location estimate are
122 described. These methods are simplified by making certain
123 assumptions about the location estimate and are designed to be
124 applicable to location estimates in a relatively small area.
126 A confidence extension for the Presence Information Data Format -
127 Location Object (PIDF-LO) [RFC4119] is described.
129 1.1. Conventions and Terminology
131 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
132 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
133 document are to be interpreted as described in [RFC2119].
135 This document assumes a basic understanding of the principles of
136 mathematics, particularly statistics and geometry.
138 Some terminology is borrowed from [RFC3693] and [RFC6280].
140 Mathematical formulae are presented using the following notation: add
141 "+", subtract "-", multiply "*", divide "/", power "^" and absolute
142 value "|x|". Precedence is indicated using parentheses.
143 Mathematical functions are represented by common abbreviations:
144 square root "sqrt(x)", sine "sin(x)", cosine "cos(x)", inverse cosine
145 "acos(x)", tangent "tan(x)", inverse tangent "atan(x)", error
146 function "erf(x)", and inverse error function "erfinv(x)".
148 2. A General Definition of Uncertainty
150 Uncertainty results from the limitations of measurement. In
151 measuring any observable quantity, errors from a range of sources
152 affect the result. Uncertainty is a quantification of what is known
153 about the observed quantity, either through the limitations of
154 measurement or through inherent variability of the quantity.
156 Uncertainty is most completely described by a probability
157 distribution. A probability distribution assigns a probability to
158 possible values for the quantity.
160 A probability distribution describing a measured quantity can be
161 arbitrarily complex and so it is desirable to find a simplified
162 model. One approach commonly taken is to reduce the probability
163 distribution to a confidence interval. Many alternative models are
164 used in other areas, but study of those is not the focus of this
165 document.
167 In addition to the central estimate of the observed quantity, a
168 confidence interval is succinctly described by two values: an error
169 range and a confidence. The error range describes an interval and
170 the confidence describes an estimated upper bound on the probability
171 that a "true" value is found within the extents defined by the error.
173 In the following example, a measurement result for a length is shown
174 as a nominal value with additional information on error range (0.0043
175 meters) and confidence (95%).
177 e.g. x = 1.00742 +/- 0.0043 meters at 95% confidence
179 This result indicates that the value of "x" is between 1.00312 and
180 1.01172 meters with 95% probability. No
181 other assertion is made: in particular, this does not assert that x
182 is 1.00742.
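As an illustration only, the arithmetic behind this example can be sketched in Python. The function name "interval_bounds" is an assumption made for this sketch and is not defined by this document.

```python
def interval_bounds(nominal, uncertainty):
    """Return the (lower, upper) bounds of nominal +/- uncertainty."""
    return (nominal - uncertainty, nominal + uncertainty)


# Values taken from the example above: x = 1.00742 +/- 0.0043 meters.
lower, upper = interval_bounds(1.00742, 0.0043)
print(f"x lies between {lower:.5f} and {upper:.5f} meters at 95% confidence")
```

The confidence value is carried alongside the bounds rather than derived from them; the bounds alone say nothing about how probable it is that "x" lies inside the interval.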
184 This document uses the term _uncertainty_ to refer in general to the
185 concept as well as more specifically to refer to the error increment.
187 Uncertainty and confidence for location estimates can be derived in a
188 number of ways. This document does not attempt to enumerate the many
189 methods for determining uncertainty. [ISO.GUM] and [NIST.TN1297]
190 provide a set of general guidelines for determining and manipulating
191 measurement uncertainty. This document applies that general guidance
192 for consumers of location information.
194 2.1. Uncertainty as a Probability Distribution
196 The Probability Density Function (PDF) that is described by
197 uncertainty indicates the probability that the "true" value lies at
198 any one point. The shape of the probability distribution can vary
199 depending on the method that is used to determine the result. The
200 two probability density functions most generally applicable to
201 location information are considered in this document:
203 o The normal PDF (also referred to as a Gaussian PDF) is used where
204 a large number of small random factors contribute to errors. The
205 value used for the error range in a normal PDF is related to the
206 standard deviation of the distribution.
208 o A rectangular PDF is used where the errors are known to be
209 consistent across a limited range. A rectangular PDF can occur
210 where a single error source, such as a rounding error, is
211 significantly larger than other errors. A rectangular PDF is
212 often described by the half-width of the distribution; that is,
213 half the width of the distribution.
215 Each of these probability density functions can be characterized by
216 its center point, or mean, and its width. For a normal distribution,
217 uncertainty and confidence together are related to the standard
218 deviation (see Section 5.4). For a rectangular distribution, half of
219 the width of the distribution is used.
221 Figure 1 shows a normal and rectangular probability density function
222 with the mean (m) and standard deviation (s) labelled. The half-
223 width (h) of the rectangular distribution is also indicated.
225                             *****                 *** Normal PDF
226                           **  :  **               --- Rectangular PDF
227                         **    :    **
228                        **     :     **
229             .---------*---------------*---------.
230             |        **       :       **        |
231             |       **        :        **       |
232             |      * <-- s -->:          *      |
233             |     * :         :         : *     |
234             |    **           :           **    |
235             |   *   :         :         :   *   |
236             |  *              :              *  |
237             |**     :         :         :     **|
238            **                 :                 **
239         *** |       :         :         :       | ***
240    *****    |                 :<------ h ------>|    *****
241 .****-------+.......:.........:.........:.......+-------*****.
242                               m
244 Figure 1: Normal and Rectangular Probability Density Functions
246 For a given PDF, the value of the PDF describes the probability that
247 the "true" value is found at that point. Confidence for any given
248 interval is the total probability of the "true" value being in that
249 range, defined as the integral of the PDF over the interval.
251 The probability of the "true" value falling between two points is
252 found by finding the area under the curve between the points (that
253 is, the integral of the curve between the points). For any given
254 PDF, the area under the curve for the entire range from negative
255 infinity to positive infinity is 1 (or 100%). Therefore, the
256 confidence over any interval of uncertainty can never exceed
257 100%.
259 Figure 2 shows how confidence is determined for a normal
260 distribution. The area of the shaded region gives the confidence (c)
261 for the interval between "m-u" and "m+u".
263                             *****
264                           **:::::**
265                         **:::::::::**
266                        **:::::::::::**
267                       *:::::::::::::::*
268                      **:::::::::::::::**
269                     **:::::::::::::::::**
270                    *:::::::::::::::::::::*
271                   *:::::::::::::::::::::::*
272                  **:::::::::::::::::::::::**
273                 *:::::::::::: c ::::::::::::*
274                *:::::::::::::::::::::::::::::*
275              **|:::::::::::::::::::::::::::::|**
276            **  |:::::::::::::::::::::::::::::|  **
277         ***    |:::::::::::::::::::::::::::::|    ***
278    *****       |:::::::::::::::::::::::::::::|       *****
279 .****..........!:::::::::::::::::::::::::::::!..........*****.
280                |              |              |
281              (m-u)            m            (m+u)
283 Figure 2: Confidence as the Integral of a PDF
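A minimal sketch of how confidence follows from the PDF, assuming an interval that is symmetric about the mean. It uses the standard closed forms for the normal and rectangular distributions; the function names are hypothetical and chosen only for this illustration.

```python
import math


def normal_confidence(u, s):
    """Confidence that a normally distributed value lies within m-u..m+u.

    u is the uncertainty (half the interval width) and s is the standard
    deviation; the mean m cancels out for a symmetric interval.
    """
    return math.erf(u / (s * math.sqrt(2)))


def rectangular_confidence(u, h):
    """Confidence that a uniformly distributed value lies within m-u..m+u.

    h is the half-width of the rectangular PDF. The result is capped at 1
    because no probability lies outside the rectangle.
    """
    return min(u / h, 1.0)


print(normal_confidence(1.96, 1.0))      # close to 0.95
print(rectangular_confidence(0.5, 1.0))  # half of the rectangle is covered
```

For the normal PDF, confidence only approaches 100% as the interval grows; for the rectangular PDF, it reaches 100% once the interval covers the whole rectangle.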
285 In Section 5.4, methods are described for manipulating uncertainty if
286 the shape of the PDF is known.
288 2.2. Deprecation of the Terms Precision and Resolution
290 The terms _Precision_ and _Resolution_ are defined in RFC 3693
291 [RFC3693]. These definitions were intended to provide a common
292 nomenclature for discussing uncertainty; however, these particular
293 terms have many different uses in other fields and their definitions
294 are not sufficient to avoid confusion about their meaning. These
295 terms are unsuitable for use in relation to quantitative concepts
296 when discussing uncertainty and confidence in relation to location
297 information.
299 2.3. Accuracy as a Qualitative Concept
301 Uncertainty is a quantitative concept. The term _accuracy_ is useful
302 in describing, qualitatively, the general concepts of location
303 information. Accuracy is generally useful when describing
304 qualitative aspects of location estimates. Accuracy is not a
305 suitable term for use in a quantitative context.
307 For instance, it could be appropriate to say that a location estimate
308 with uncertainty "X" is more accurate than a location estimate with
309 uncertainty "2X" at the same confidence. It is not appropriate to
310 assign a number to "accuracy", nor is it appropriate to refer to any
311 component of uncertainty or confidence as "accuracy". That is, to
312 say that the "accuracy" for the first location estimate is "X" would
313 be an erroneous use of this term.
315 3. Uncertainty in Location
317 A _location estimate_ is the result of location determination. A
318 location estimate is subject to uncertainty like any other
319 observation. However, unlike a simple measure of a one dimensional
320 property like length, a location estimate is specified in two or
321 three dimensions.
323 Uncertainty in 2- or 3-dimensional locations can be described using
324 confidence intervals. The confidence interval for a location
325 estimate in two or three dimensional space is expressed as a subset
326 of that space. This document uses the term _region of uncertainty_
327 to refer to the area or volume that describes the confidence
328 interval.
330 Areas or volumes that describe regions of uncertainty can be formed
331 by the combination of two or three one-dimensional ranges, or more
332 complex shapes could be described.
334 3.1. Targets as Points in Space
336 This document makes a simplifying assumption that the Target of the
337 PIDF-LO occupies just a single point in space. While this is clearly
338 false in virtually all scenarios with any practical application, it
339 is often a reasonable assumption to make.
341 To a large extent, whether this simplification is valid depends on the
342 size of the target relative to the size of the uncertainty region.
343 When locating a personal device using contemporary location
344 determination techniques, the space the device occupies relative to
345 the uncertainty is proportionally quite small. Even where that
346 device is used as a proxy for a person, the proportions change
347 little.
349 This assumption is less useful as the Target of the PIDF-LO becomes
350 large relative to the uncertainty region. For instance, describing
351 the location of a football stadium or small country would include a
352 region of uncertainty that is only slightly larger than the Target
353 itself. In these cases, much of the guidance in this document is not
354 applicable. Indeed, as the accuracy of location determination
355 technology improves, it could be that the advice this document
356 contains becomes less relevant by the same measure.
358 3.2. Representation of Uncertainty and Confidence in PIDF-LO
360 A set of shapes suitable for the expression of uncertainty in
361 location estimates in the Presence Information Data Format - Location
362 Object (PIDF-LO) are described in [GeoShape]. These shapes are the
363 recommended form for the representation of uncertainty in PIDF-LO
364 [RFC4119] documents.
366 The PIDF-LO does not include an indication of confidence, but that
367 confidence is 95%, by definition in [RFC5491]. Similarly, the PIDF-
368 LO format does not provide an indication of the shape of the PDF.
369 Section 4 defines elements to convey this information.
371 Absence of uncertainty information in a PIDF-LO document does not
372 indicate that there is no uncertainty in the location estimate.
373 Uncertainty might not have been calculated for the estimate, or it
374 may be withheld for privacy purposes.
376 If the Point shape is used, confidence and uncertainty are unknown; a
377 receiver can either assume a confidence of 0% or infinite
378 uncertainty. The same principle applies on the altitude axis for
379 two-dimensional shapes like the Circle.
381 3.3. Uncertainty and Confidence for Civic Addresses
383 Civic addresses [RFC5139] inherently include uncertainty, based on
384 the area of the most precise element that is specified. Uncertainty
385 is effectively defined by the presence or absence of elements --
386 elements that are not present are deemed to be uncertain.
388 To apply the concept of uncertainty to civic addresses, it is helpful
389 to unify the conceptual models of civic address with geodetic
390 location information.
392 Note: This view is one perspective on the process of geo-coding -
393 the translation of a civic address to a geodetic location.
395 In the unified view, a civic address defines a series of (sometimes
396 non-orthogonal) spatial partitions. The first is the implicit
397 partition that identifies the surface of the earth and the space near
398 the surface. The second is the country. Each label that is included
399 in a civic address provides information about a different set of
400 spatial partitions. Some partitions require slight adjustments from a
401 standard interpretation: for instance, a road includes all properties
402 that adjoin the street. Each label might need to be interpreted with
403 other values to provide context.
405 As a value at each level is interpreted, one or more spatial
406 partitions at that level are selected, and all other partitions of
407 that type are excluded. For non-orthogonal partitions, only the
408 portion of the partition that fits within the existing space is
409 selected. This is what distinguishes King Street in Sydney from King
410 Street in Melbourne. Each defined element selects a partition of
411 space. The resulting location is the intersection of all selected
412 spaces.
414 The resulting spatial partition can be considered to represent a
415 region of uncertainty. At no stage does this process select a point;
416 although, as spaces get smaller this distinction might have no
417 practical significance and an approximation of a point could be used.
419 Uncertainty in civic addresses can be increased by removing elements.
420 This doesn't necessarily improve confidence, in the same way that
421 arbitrarily increasing uncertainty in a geodetic location doesn't
422 increase confidence.
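The unified view can be sketched as set intersection. In this illustration each spatial partition is modeled as a set of parcel identifiers; the labels, the parcel names, and the "PARTITIONS" table are all hypothetical and stand in for real geographic data.

```python
from functools import reduce

# Hypothetical partitions: each (label, value) pair selects a set of parcels.
PARTITIONS = {
    ("country", "AU"): {"syd-king-1", "syd-king-2", "syd-george-1",
                        "mel-king-1", "mel-collins-1"},
    ("city", "Sydney"): {"syd-king-1", "syd-king-2", "syd-george-1"},
    ("city", "Melbourne"): {"mel-king-1", "mel-collins-1"},
    ("road", "King Street"): {"syd-king-1", "syd-king-2", "mel-king-1"},
}


def region_of_uncertainty(civic_address):
    """Intersect the partitions selected by each element of the address."""
    selected = [PARTITIONS[element] for element in civic_address]
    return reduce(set.intersection, selected)


full = region_of_uncertainty(
    [("country", "AU"), ("city", "Sydney"), ("road", "King Street")])
reduced = region_of_uncertainty([("country", "AU"), ("city", "Sydney")])
print(sorted(full))     # only the King Street parcels in Sydney
print(sorted(reduced))  # a larger region: every parcel in Sydney
```

Removing the road element enlarges the selected region, which corresponds to increasing uncertainty; nothing in the sketch says anything about confidence.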
424 3.4. DHCP Location Configuration Information and Uncertainty
426 Location information is often measured in two or three dimensions;
427 expressions of uncertainty in one dimension only are rare. The
428 "resolution" parameters in [RFC3825] provide an indication of
429 uncertainty in one dimension.
431 [RFC3825] defines a means for representing uncertainty, but a value
432 for confidence is not specified. A default value of 95% confidence
433 can be assumed for the combination of the uncertainty on each axis.
434 That is, the confidence of the resultant rectangular polygon or prism
435 is 95%.
437 4. Representation of Confidence in PIDF-LO
439 On the whole, a fixed definition for confidence is preferable,
440 primarily because it ensures consistency between implementations.
441 Location generators that are aware of this constraint can generate
442 location information at the required confidence. Location recipients
443 are able to make sensible assumptions about the quality of the
444 information that they receive.
446 In some circumstances - particularly with pre-existing systems -
447 location generators might be unable to provide location information with
448 consistent confidence. Existing systems sometimes specify confidence
449 at 38%, 67% or 90%. Existing forms of expressing location
450 information, such as that defined in [TS-3GPP-23_032], contain
451 elements that express the confidence in the result.
453 The addition of a confidence element provides information that was
454 previously unavailable to recipients of location information.
455 Without this information, a location server or generator that has
456 access to location information with a confidence lower than 95% has
457 two options:
459 o The location server can scale regions of uncertainty in an attempt
460 to achieve 95% confidence. This scaling process significantly
461 degrades the quality of the information, because the location
462 server might not have the necessary information to scale
463 appropriately; the location server is forced to make assumptions
464 that are likely to result in either an overly conservative estimate
465 with high uncertainty or an overestimate of confidence.
467 o The location server can ignore the confidence entirely, which
468 results in giving the recipient a false impression of its quality.
470 Both of these choices degrade the quality of the information
471 provided.
473 The addition of a confidence element avoids this problem entirely if
474 a location recipient supports and understands the element. A
475 recipient that does not understand, and hence ignores, the confidence
476 element is in no worse a position than if the location server ignored
477 confidence.
479 4.1. The "confidence" Element
481 The confidence element MAY be added to the "location-info" element of
482 the Presence Information Data Format - Location Object (PIDF-LO)
483 [RFC4119] document. This element expresses the confidence in the
484 associated location information as a percentage.
486 The confidence element optionally includes an attribute that
487 indicates the shape of the probability density function (PDF) of the
488 associated region of uncertainty. Three values are possible:
489 unknown, normal and rectangular.
491 Indicating a particular PDF only indicates that the distribution
492 approximately fits the given shape based on the methods used to
493 generate the location information. The PDF is normal if there are a
494 large number of small, independent sources of error; rectangular if
495 all points within the area have roughly equal probability of being
496 the actual location of the Target; otherwise, the PDF MUST either be
497 set to unknown or omitted.
499 If a PIDF-LO does not include the confidence element, confidence is
500 95% [RFC5491]. A Point shape does not have uncertainty (or it has
501 infinite uncertainty), so confidence is meaningless for a point;
502 therefore, this element MUST be omitted if only a point is provided.
504 4.2. Generating Locations with Confidence
506 Location generators SHOULD attempt to ensure that confidence is equal
507 in each dimension when generating location information. This
508 restriction, while not always practical, allows for more accurate
509 scaling, if scaling is necessary.
511 Confidence MUST NOT be included unless location information cannot be
512 acquired with 95% confidence.
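The rules for the confidence element can be sketched with Python's standard xml.etree.ElementTree module. The namespace URN is the one registered by this document; the attribute name "pdf", the namespace prefix "con", and the function name are assumptions made for this sketch, so consult the schema in Section 7 for the normative definition.

```python
import xml.etree.ElementTree as ET

CONF_NS = "urn:ietf:params:xml:ns:geopriv:conf"
PDF_VALUES = ("unknown", "normal", "rectangular")


def confidence_element(confidence, pdf=None, is_point=False):
    """Return a confidence element, or None when it has to be omitted.

    The element is omitted for a Point shape and when confidence is 95%,
    since 95% is what a recipient assumes when the element is absent.
    """
    if is_point or confidence == 95:
        return None
    element = ET.Element(f"{{{CONF_NS}}}confidence")
    if pdf is not None:
        if pdf not in PDF_VALUES:
            raise ValueError(f"unsupported PDF shape: {pdf}")
        element.set("pdf", pdf)  # attribute name assumed for this sketch
    element.text = str(confidence)
    return element


ET.register_namespace("con", CONF_NS)  # prefix chosen for illustration
element = confidence_element(67, pdf="normal")
print(ET.tostring(element, encoding="unicode"))
```

In a complete PIDF-LO document, the returned element would be appended to the "location-info" element next to the shape it qualifies.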
514 4.3. Consuming and Presenting Confidence
516 The inclusion of confidence that is anything other than 95% presents
517 a potentially difficult usability problem for applications that use
518 location information. Effectively communicating the probability that
519 a location is incorrect to a user can be difficult.
521 It is inadvisable to simply display locations of any confidence, or
522 to display confidence in a separate or non-obvious fashion. If
523 locations with different confidence levels are displayed such that
524 the distinction is subtle or easy to overlook - such as using fine
525 graduations of color or transparency for graphical uncertainty
526 regions, or displaying uncertainty graphically, but providing
527 confidence as supplementary text - a user could fail to notice a
528 difference in the quality of the location information that might be
529 significant.
531 Depending on the circumstances, different ways of handling confidence
532 might be appropriate. Section 5 describes techniques that could be
533 appropriate for consumers that use automated processing.
535 Provided that the full implications of any choice for the
536 application are understood, some amount of automated processing could
537 be appropriate. In a simple example, applications could choose to
538 discard or suppress the display of location information if confidence
539 does not meet a pre-determined threshold.
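A minimal sketch of the threshold approach, assuming location estimates are held as dictionaries with an optional "confidence" entry in percent. The dictionary layout and the function name are hypothetical; an estimate without a stated confidence is treated as 95%, as in PIDF-LO.

```python
DEFAULT_CONFIDENCE = 95.0  # confidence assumed when none is stated


def meets_threshold(estimate, threshold=95.0):
    """Return True if the estimate's confidence reaches the threshold."""
    return estimate.get("confidence", DEFAULT_CONFIDENCE) >= threshold


estimates = [
    {"id": "a", "uncertainty_m": 30.0},                      # implied 95%
    {"id": "b", "uncertainty_m": 10.0, "confidence": 67.0},  # below threshold
]
displayed = [e["id"] for e in estimates if meets_threshold(e, threshold=90.0)]
print(displayed)
```

Estimate "b" has the smaller uncertainty but is suppressed, because its uncertainty is stated at a lower confidence and is not directly comparable with that of estimate "a".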
541 In settings where there is an opportunity for user training, some of
542 these problems might be mitigated by defining different operational
543 procedures for handling location information at different confidence
544 levels.
546 5. Manipulation of Uncertainty
548 This section deals with manipulation of location information that
549 contains uncertainty.
551 The following rules generally apply when manipulating location
552 information:
554 o Where calculations are performed on coordinate information, these
555 should be performed in Cartesian space and the results converted
556 back to latitude, longitude and altitude. A method for converting
557 to and from Cartesian coordinates is included in Appendix A.
559 While some approximation methods are useful in simplifying
560 calculations, treating latitude and longitude as Cartesian axes
561 is never advisable. The two axes are not orthogonal. Errors
562 can arise from the curvature of the earth and from the
563 convergence of longitude lines.
565 o Normal rounding rules do not apply when rounding uncertainty.
566 When rounding, the region of uncertainty always increases (that
567 is, errors are rounded up) and confidence is always rounded down
568 (see [NIST.TN1297]). This means that any manipulation of
569 uncertainty is a non-reversible operation; each manipulation can
570 result in the loss of some information.
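The rounding rule can be sketched as follows: uncertainty is rounded up and confidence is rounded down. The function names and the "decimals" parameter are assumptions made for this sketch.

```python
import math


def round_uncertainty(value, decimals=0):
    """Round an uncertainty value up to the given number of decimals."""
    factor = 10 ** decimals
    return math.ceil(value * factor) / factor


def round_confidence(value, decimals=0):
    """Round a confidence value down to the given number of decimals."""
    factor = 10 ** decimals
    return math.floor(value * factor) / factor


print(round_uncertainty(12.3))  # 13.0, never 12.0
print(round_confidence(94.7))   # 94.0, never 95.0
```

Binary floating point can make the scaled value land slightly above an exact multiple, so this sketch may occasionally round further than strictly necessary. That errs in the conservative direction; Python's decimal module could be used where exact decimal behavior matters.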
572 5.1. Reduction of a Location Estimate to a Point
574 Manipulating location estimates that include uncertainty information
575 requires additional complexity in systems. In some cases, systems
576 only operate on definitive values, that is, a single point.
578 This section describes algorithms for reducing location estimates to
579 a simple form without uncertainty information. Having a consistent
580 means for reducing location estimates allows for interaction between
581 applications that are able to use uncertainty information and those
582 that cannot.
584 Note: Reduction of a location estimate to a point constitutes a
585 reduction in information. Removing uncertainty information can
586 degrade results in some applications. Also, there is a natural
587 tendency to misinterpret a point location as representing a
588 location without uncertainty. This could lead to more serious
589 errors. Therefore, these algorithms should only be applied where
590 necessary.
592 Several different approaches can be taken when reducing a location
593 estimate to a point. Different methods each make a set of
594 assumptions about the properties of the PDF and the selected point;
595 no one method is more "correct" than any other. For any given region
596 of uncertainty, selecting an arbitrary point within the area could be
597 considered valid; however, given the aforementioned problems with
598 point locations, a more rigorous approach is appropriate.
600 Given a result with a known distribution, selecting the point within
601 the area that has the highest probability is a more rigorous method.
602 Alternatively, a point could be selected that minimizes the overall
603 error; that is, it minimizes the expected value of the difference
604 between the selected point and the "true" value.
606 If a rectangular distribution is assumed, the centroid of the area or
607 volume minimizes the overall error. Minimizing the error for a
608 normal distribution is mathematically complex. Therefore, this
609 document opts to select the centroid of the region of uncertainty
610 when selecting a point.
612 5.1.1. Centroid Calculation
614 For regular shapes, such as Circle, Sphere, Ellipse and Ellipsoid,
615 this approach equates to the center point of the region. For regions
616 of uncertainty that are expressed as regular Polygons and Prisms the
617 center point is also the most appropriate selection.
619 For the Arc-Band shape and non-regular Polygons and Prisms, selecting
620 the centroid of the area or volume minimizes the overall error. This
621 assumes that the PDF is rectangular.
623 Note: The centroid of a concave Polygon or Arc-Band shape is not
624 necessarily within the region of uncertainty.
626 5.1.1.1. Arc-Band Centroid
628 The centroid of the Arc-Band shape is found along a line that bisects
629 the arc. The centroid can be found at the following distance from
630 the starting point of the arc-band (assuming an arc-band with an
631 inner radius of "r", outer radius "R", start angle "a", and opening
632 angle "o"):
634 d = 4 * sin(o/2) * (R*R + R*r + r*r) / (3*o*(R + r))
636 This point can be found along the line that bisects the arc; that is,
637 the line at an angle of "a + (o/2)". Negative values are possible if
638 the angle of opening is greater than 180 degrees; negative values
639 indicate that the centroid is found along the angle
640 "a + (o/2) + 180".
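The arc-band formula above can be sketched in Python as follows. This is a minimal illustration, not a normative implementation; the function name is hypothetical. It assumes that the opening angle is supplied in radians, which the formula needs because "o" appears outside a trigonometric function in the denominator.

```python
import math

def arc_band_centroid(r, R, a, o):
    """Return (d, bearing) for an arc-band centroid.

    r, R: inner and outer radius (same unit as the result d)
    a, o: start angle and opening angle, both in radians (assumption)
    """
    if o <= 0 or R <= 0 or r < 0 or r > R:
        raise ValueError("expected 0 <= r <= R, R > 0 and o > 0")
    # d = 4 * sin(o/2) * (R*R + R*r + r*r) / (3*o*(R + r))
    d = 4 * math.sin(o / 2) * (R * R + R * r + r * r) / (3 * o * (R + r))
    # The centroid lies on the line that bisects the arc.
    bearing = a + o / 2
    return d, bearing
```

As a sanity check, an inner radius of 0, an outer radius of 1 and an opening angle of pi describe a half disc, for which this yields the familiar value 4/(3*pi).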
642 5.1.1.2. Polygon Centroid
644 Calculating a centroid for the Polygon and Prism shapes is more
645 complex. Polygons that are specified using geodetic coordinates are
646 not necessarily coplanar. For Polygons that are specified without an
647 altitude, choose a value for altitude before attempting this process;
648 an altitude of 0 is acceptable.
650 The method described in this section is simplified by assuming
651 that the surface of the earth is locally flat. This method
652 degrades as polygons become larger; see [GeoShape] for
653 recommendations on polygon size.
655 The polygon is translated to a new coordinate system that has an x-y
656 plane roughly parallel to the polygon. This enables the elimination
657 of z-axis values and calculating a centroid can be done using only x
658 and y coordinates. This requires that the upward normal for the
659 polygon is known.
661 To translate the polygon coordinates, apply the process described in
662 Appendix B to find the normal vector "N = [Nx,Ny,Nz]". This value
663 should be made a unit vector to ensure that the transformation matrix
664 is a special orthogonal matrix. From this vector, select two vectors
665 that are perpendicular to this vector and combine these into a
666 transformation matrix.
668 If "Nx" and "Ny" are non-zero, the matrices in Figure 3 can be used,
669 given "p = sqrt(Nx^2 + Ny^2)". More transformations are provided
670 later in this section for cases where "Nx" or "Ny" is zero.
672 [ -Ny/p Nx/p 0 ] [ -Ny/p -Nx*Nz/p Nx ]
673 T = [ -Nx*Nz/p -Ny*Nz/p p ] T' = [ Nx/p -Ny*Nz/p Ny ]
674 [ Nx Ny Nz ] [ 0 p Nz ]
675 (Transform) (Reverse Transform)
677 Figure 3: Recommended Transformation Matrices
679 To apply a transform to each point in the polygon, form a matrix from
680 the ECEF coordinates and use matrix multiplication to determine the
681 translated coordinates.
683 [ -Ny/p Nx/p 0 ] [ x[1] x[2] x[3] ... x[n] ]
684 [ -Nx*Nz/p -Ny*Nz/p p ] * [ y[1] y[2] y[3] ... y[n] ]
685 [ Nx Ny Nz ] [ z[1] z[2] z[3] ... z[n] ]
687 [ x'[1] x'[2] x'[3] ... x'[n] ]
688 = [ y'[1] y'[2] y'[3] ... y'[n] ]
689 [ z'[1] z'[2] z'[3] ... z'[n] ]
691 Figure 4: Transformation
693 Alternatively, direct multiplication can be used to achieve the same
694 result:
696 x'[i] = -Ny * x[i] / p + Nx * y[i] / p
698 y'[i] = -Nx * Nz * x[i] / p - Ny * Nz * y[i] / p + p * z[i]
700 z'[i] = Nx * x[i] + Ny * y[i] + Nz * z[i]
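The forward transform can be sketched in Python as shown below. The function names are hypothetical and the sketch only covers the case in which both "Nx" and "Ny" are non-zero. The normal is normalized first, as the text recommends.

```python
import math

def transform_matrix(normal):
    """Build the matrix T of Figure 3 from a normal vector [Nx, Ny, Nz]."""
    nx, ny, nz = normal
    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    nx, ny, nz = nx / length, ny / length, nz / length  # unit vector
    if nx == 0 or ny == 0:
        raise ValueError("use the alternative matrices when Nx or Ny is zero")
    p = math.sqrt(nx * nx + ny * ny)
    return [
        [-ny / p, nx / p, 0.0],
        [-nx * nz / p, -ny * nz / p, p],
        [nx, ny, nz],
    ]

def transform_point(T, point):
    """Multiply the 3x3 matrix T by an ECEF point (x, y, z)."""
    return tuple(sum(T[i][j] * point[j] for j in range(3)) for i in range(3))
```

Applying the transform to a point that lies along the normal produces x' and y' values of zero, which is a simple way to check that the third axis of the new coordinate system is aligned with the normal.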
702 The first and second rows of this matrix ("x'" and "y'") contain the
703 values that are used to calculate the centroid of the polygon. To
704 find the centroid of this polygon, first find the area using:
706 A = sum from i=1..n of (x'[i]*y'[i+1]-x'[i+1]*y'[i]) / 2
708 For these formulae, treat each set of coordinates as circular, that
709 is "x'[0] == x'[n]" and "x'[n+1] == x'[1]". Based on the area, the
710 centroid along each axis can be determined by:
712 Cx' = sum (x'[i]+x'[i+1]) * (x'[i]*y'[i+1]-x'[i+1]*y'[i]) / (6*A)
714 Cy' = sum (y'[i]+y'[i+1]) * (x'[i]*y'[i+1]-x'[i+1]*y'[i]) / (6*A)
716 Note: The formula for the area of a polygon will return a negative
717 value if the polygon is specified in clockwise direction. This
718 can be used to determine the orientation of the polygon.
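The area and centroid formulae can be sketched in Python as follows. The function name is hypothetical. The input is assumed to be a list of already-transformed (x', y') pairs without a repeated closing point; the circular indexing described above is handled with a modulo.

```python
def polygon_area_centroid(points):
    """Return (A, Cx', Cy') for a polygon given as [(x', y'), ...].

    A is signed: negative when the points are in clockwise order.
    """
    n = len(points)
    twice_area = 0.0
    sum_x = 0.0
    sum_y = 0.0
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]  # wrap around to the first point
        cross = x0 * y1 - x1 * y0
        twice_area += cross
        sum_x += (x0 + x1) * cross
        sum_y += (y0 + y1) * cross
    area = twice_area / 2
    if area == 0:
        raise ValueError("degenerate polygon: area is zero")
    return area, sum_x / (6 * area), sum_y / (6 * area)
```

Because the signed area appears in both the numerator and the denominator of the centroid terms, the centroid is the same regardless of the direction in which the polygon is specified.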
720 The third row contains a distance from a plane parallel to the
721 polygon. If the polygon is coplanar, then the values for "z'" are
722 identical; however, the constraints recommended in [RFC5491] mean
723 that this is rarely the case. To determine "Cz'", average these
724 values:
726 Cz' = sum z'[i] / n
728 Once the centroid is known in the transformed coordinates, these can
729 be transformed back to the original coordinate system. The reverse
730 transformation is shown in Figure 5.
732 [ -Ny/p -Nx*Nz/p Nx ] [ Cx' ] [ Cx ]
733 [ Nx/p -Ny*Nz/p Ny ] * [ Cy' ] = [ Cy ]
734 [ 0 p Nz ] [ sum of z'[i] / n ] [ Cz ]
736 Figure 5: Reverse Transformation
738 The reverse transformation can be applied directly as follows:
740 Cx = -Ny * Cx' / p - Nx * Nz * Cy' / p + Nx * Cz'
742 Cy = Nx * Cx' / p - Ny * Nz * Cy' / p + Ny * Cz'
744 Cz = p * Cy' + Nz * Cz'
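The direct form of the reverse transformation can be sketched in Python as follows. The function name is hypothetical, and the sketch again assumes that both "Nx" and "Ny" are non-zero and that the same normal is used as for the forward transform.

```python
import math

def reverse_transform(c_prime, normal):
    """Map (Cx', Cy', Cz') back to ECEF (Cx, Cy, Cz)."""
    nx, ny, nz = normal
    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    nx, ny, nz = nx / length, ny / length, nz / length  # unit vector
    if nx == 0 or ny == 0:
        raise ValueError("use the alternative matrices when Nx or Ny is zero")
    p = math.sqrt(nx * nx + ny * ny)
    cxp, cyp, czp = c_prime
    cx = -ny * cxp / p - nx * nz * cyp / p + nx * czp
    cy = nx * cxp / p - ny * nz * cyp / p + ny * czp
    cz = p * cyp + nz * czp
    return (cx, cy, cz)
```

A transformed point of (0, 0, sqrt(3)) with a normal along [1,1,1] maps back to the ECEF point (1, 1, 1), which mirrors the forward case for a point that lies along the normal.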
746 The ECEF value "[Cx,Cy,Cz]" can then be converted back to geodetic
747 coordinates. Given a polygon that is defined with no altitude or
748 equal altitudes for each point, the altitude of the result can either
749 be ignored or reset after converting back to a geodetic value.
751 The centroid of the Prism shape is found by finding the centroid of
752 the base polygon and raising the point by half the height of the
753 prism. This can be added to the altitude of the final result;
754 alternatively, this can be added to "Cz'", which ensures that
755 negative height is correctly applied to polygons that are defined in
756 a "clockwise" direction.
758 The recommended transforms only apply if "Nx" and "Ny" are non-zero.
759 If the normal vector is "[0,0,1]" (that is, along the z-axis), then
760 no transform is necessary. Similarly, if the normal vector is
761 "[0,1,0]" or "[1,0,0]", avoid the transformation and use the x and z
762 coordinates or y and z coordinates (respectively) in the centroid
763 calculation phase. If either "Nx" or "Ny" is zero, the alternative
764 transform matrices in Figure 6 can be used. The reverse transform is
765 the transpose of this matrix.
767 if Nx == 0: | if Ny == 0:
768 [ 0 -Nz Ny ] [ 0 1 0 ] | [ -Nz 0 Nx ]
769 T = [ 1 0 0 ] T' = [ -Nz 0 Ny ] | T = T' = [ 0 1 0 ]
770 [ 0 Ny Nz ] [ Ny 0 Nz ] | [ Nx 0 Nz ]
772 Figure 6: Alternative Transformation Matrices
774 5.2. Conversion to Circle or Sphere
776 The Circle and Sphere are simple shapes that suit a range of
777 applications. A circle or sphere contains fewer units of data to
778 manipulate, which simplifies operations on location estimates.
780 The simplest method for converting a location estimate to a Circle or
781 Sphere shape is to determine the centroid and then find the longest
782 distance from that point to any point in the region of uncertainty.
783 This distance can be determined based on the shape type:
785 Circle/Sphere: No conversion necessary.
787 Ellipse/Ellipsoid: The greater of either semi-major axis or altitude
788 uncertainty.
790 Polygon/Prism: The distance to the furthest vertex of the polygon
791 (for a Prism, it is only necessary to check points on the base).
793 Arc-Band: The furthest length from the centroid to the points where
794 the inner and outer arc end. This distance can be calculated by
795 finding the larger of the two following formulae:
797 X = sqrt( d*d + R*R - 2*d*R*cos(o/2) )
799 x = sqrt( d*d + r*r - 2*d*r*cos(o/2) )
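For the Arc-Band case, the two formulae can be combined as in the following Python sketch. The function name is hypothetical, the opening angle is assumed to be in radians, and the centroid distance "d" is recomputed from the formula in Section 5.1.1.1 so that the example is self-contained.

```python
import math

def arc_band_enclosing_radius(r, R, o):
    """Radius of a circle, centered on the arc-band centroid, that
    reaches the end points of the inner and outer arcs."""
    # Distance from the arc-band origin to the centroid.
    d = 4 * math.sin(o / 2) * (R * R + R * r + r * r) / (3 * o * (R + r))
    # Law of cosines: each arc end point is o/2 away from the bisector.
    outer = math.sqrt(d * d + R * R - 2 * d * R * math.cos(o / 2))
    inner = math.sqrt(d * d + r * r - 2 * d * r * math.cos(o / 2))
    return max(outer, inner)
```

For a half disc of radius 1 (inner radius 0, opening angle pi), the result is sqrt(1 + d*d) with d = 4/(3*pi), since the ends of the outer arc are at right angles to the bisector.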
801 Once the Circle or Sphere shape is found, the associated confidence
802 can be increased if the result is known to follow a normal
803 distribution. However, this is a complicated process and provides
804 limited benefit. In many cases it also violates the constraint that
805 confidence in each dimension be the same. Confidence should be
806 unchanged when performing this conversion.
808 Two dimensional shapes are converted to a Circle; three dimensional
809 shapes are converted to a Sphere.
811 5.3. Three-Dimensional to Two-Dimensional Conversion
813 A three-dimensional shape can be easily converted to a two-
814 dimensional shape by removing the altitude component. A sphere
815 becomes a circle; a prism becomes a polygon; an ellipsoid becomes an
816 ellipse. Each conversion is simple, requiring only the removal of
817 those elements relating to altitude.
819 The altitude is unspecified for a two-dimensional shape and therefore
820 has unlimited uncertainty along the vertical axis. The confidence
821 for the two-dimensional shape is thus higher than the three-
822 dimensional shape. Assuming equal confidence on each axis, the
823 confidence of the circle can be increased using the following
824 approximate formula:
826 C[2d] >= C[3d] ^ (2/3)
828 "C[2d]" is the confidence of the two-dimensional shape and "C[3d]" is
829 the confidence of the three-dimensional shape. For example, a Sphere
830 with a confidence of 95% can be simplified to a Circle of equal
831 radius with confidence of 96.6%.
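The worked example can be reproduced with a few lines of Python. The function name is hypothetical. The comment records the reasoning behind the exponent under the stated assumption of equal, independent confidence on each axis.

```python
def confidence_3d_to_2d(c3d):
    """Lower bound on 2D confidence after dropping altitude.

    With the same per-axis confidence c on three independent axes,
    C[3d] = c**3 and C[2d] = c**2, hence C[2d] = C[3d] ** (2/3).
    """
    if not 0 < c3d <= 1:
        raise ValueError("confidence must be in (0, 1]")
    return c3d ** (2 / 3)

print(round(confidence_3d_to_2d(0.95) * 100, 1))  # 96.6
```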
833 5.4. Increasing and Decreasing Uncertainty and Confidence
835 The combination of uncertainty and confidence provides a great deal of
836 information about the nature of the data that is being measured. If
837 uncertainty, confidence and PDF are all known, certain information
838 can be extrapolated. In particular, the uncertainty can be scaled to
839 meet a desired confidence or the confidence for a particular region
840 of uncertainty can be found.
842 In general, confidence decreases as the region of uncertainty
843 decreases in size and confidence increases as the region of
844 uncertainty increases in size. However, this depends on the PDF;
845 expanding the region of uncertainty for a rectangular distribution
846 has no effect on confidence without additional information. If the
847 region of uncertainty is increased during the process of obfuscation
848 (see [I-D.thomson-geopriv-location-obscuring]), then the confidence
849 cannot be increased.
851 A region of uncertainty that is reduced in size always has a lower
852 confidence.
854 A region of uncertainty that has an unknown PDF shape cannot be
855 reduced in size reliably. The region of uncertainty can be expanded,
856 but only if confidence is not increased.
858 This section makes the simplifying assumption that location
859 information is symmetrically and evenly distributed in each
860 dimension. This is not necessarily true in practice. If better
861 information is available, alternative methods might produce better
862 results.
864 5.4.1. Rectangular Distributions
866 Uncertainty that follows a rectangular distribution can only be
867 decreased in size. Since the PDF is constant over the region of
868 uncertainty, the resulting confidence is determined by the following
869 formula:
871 Cr = Co * Ur / Uo
873 Where "Uo" and "Ur" are the sizes of the original and reduced regions
874 of uncertainty (either the area or the volume of the region); "Co"
875 and "Cr" are the confidence values associated with each region.
877 Information is lost by decreasing the region of uncertainty for a
878 rectangular distribution. Once reduced in size, the uncertainty
879 region cannot subsequently be increased in size.
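The rectangular case can be sketched in Python as follows. The function name is hypothetical. The sketch rejects any attempt to enlarge the region, in line with the statement that a rectangular distribution can only be decreased in size.

```python
def reduced_confidence(co, uo, ur):
    """Cr = Co * Ur / Uo for a rectangular PDF.

    co: original confidence; uo, ur: original and reduced size
    (area or volume, in the same unit).
    """
    if uo <= 0 or ur < 0:
        raise ValueError("sizes must be positive")
    if ur > uo:
        raise ValueError("a rectangular region can only be reduced")
    return co * ur / uo
```

For example, halving the radius of a circle quarters its area, so a circle at 95% confidence becomes one at 23.75% confidence.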
881 5.4.2. Normal Distributions
883 Uncertainty and confidence can be both increased and decreased for a
884 normal distribution. However, the process is more complicated.
886 For a normal distribution, uncertainty and confidence are related to
887 the standard deviation of the function. The following function
888 defines the relationship between standard deviation, uncertainty and
889 confidence along a single axis:
891 S[x] = U[x] / ( sqrt(2) * erfinv(C[x]) )
893 Where "S[x]" is the standard deviation, "U[x]" is the uncertainty and
894 "C[x]" is the confidence along a single axis. "erfinv" is the
895 inverse error function.
897 Scaling a normal distribution in two dimensions requires several
898 assumptions. Firstly, it is assumed that the distribution along each
899 axis is independent. Secondly, the confidence for each axis is the
900 same. Therefore, the confidence along each axis can be assumed to
901 be:
903 C[x] = Co ^ (1/n)
905 Where "C[x]" is the confidence along a single axis, "Co" is the
906 overall confidence, and "n" is the number of dimensions in the
907 uncertainty.
909 Therefore, to find the uncertainty for each axis at a desired
910 confidence, "Cd", apply the following formula:
912 Ud[x] <= U[x] * (erfinv(Cd ^ (1/n)) / erfinv(Co ^ (1/n)))
914 For regular shapes, this formula can be applied as a scaling factor
915 in each dimension to reach a required confidence.
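The scaling formula can be sketched in Python as shown below. The Python standard library has no inverse error function, so this sketch derives one from "statistics.NormalDist.inv_cdf" (available from Python 3.8) using the identity erf(x) = 2*Phi(x*sqrt(2)) - 1. The function names are hypothetical, and the result is the upper bound given by the formula.

```python
import math
from statistics import NormalDist

def erfinv(y):
    """Inverse error function for 0 < y < 1, via the normal quantile."""
    return NormalDist().inv_cdf((y + 1) / 2) / math.sqrt(2)

def scale_uncertainty(u, co, cd, n):
    """Scale a per-axis uncertainty u from confidence co to cd.

    n is the number of dimensions; the per-axis confidence is
    assumed to be the overall confidence raised to 1/n.
    """
    return u * erfinv(cd ** (1 / n)) / erfinv(co ** (1 / n))
```

Scaling to the same confidence leaves the uncertainty unchanged, and scaling to a higher confidence enlarges it.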
917 5.5. Determining Whether a Location is Within a Given Region
919 A number of applications require that a judgement be made about
920 whether a Target is within a given region of interest. Given a
921 location estimate with uncertainty, this judgement can be difficult.
922 A location estimate represents a probability distribution, and the
923 true location of the Target cannot be definitively known. Therefore,
924 the judgement relies on determining the probability that the Target
925 is within the region.
927 The probability that the Target is within a particular region is
928 found by integrating the PDF over the region. For a normal
929 distribution, there are no analytical methods that can be used to
930 determine the integral of the two or three dimensional PDF over an
931 arbitrary region. The complexity of numerical methods is also too
932 great to be useful in many applications; for example, finding the
933 integral of the PDF in two or three dimensions across the overlap
934 between the uncertainty region and the target region. If the PDF is
935 unknown, no determination can be made. When judging whether a
936 location is within a given region, uncertainties using these PDFs can
937 be assumed to be rectangular. If this assumption is made, the
938 confidence should be scaled to 95%, if possible.
940 Note: The selection of confidence has a significant impact on the
941 final result. Only use a different confidence if an uncertainty
942 value for 95% confidence cannot be found.
944 Given the assumption of a rectangular distribution, the probability
945 that a Target is found within a given region is found by first
946 finding the area (or volume) of overlap between the uncertainty
947 region and the region of interest. This is multiplied by the
948 confidence of the location estimate to determine the probability.
949 Figure 7 shows an example of finding the area of overlap between the
950 region of uncertainty and the region of interest.
952 _.-""""-._
953 .' `. _ Region of
954 / \ / Uncertainty
955 ..+-"""--.. |
956 .-' | :::::: `-. |
957 ,' | :: Ao ::: `. |
958 / \ :::::::::: \ /
959 / `._ :::::: _.X
960 | `-....-' |
961 | |
962 | |
963 \ /
964 `. .' \_ Region of
965 `._ _.' Interest
966 `--..___..--'
968 Figure 7: Area of Overlap Between Two Circular Regions
970 Once the area of overlap, "Ao", is known, the probability that the
971 Target is within the region of interest, "Pi", is:
973 Pi = Co * Ao / Au
975 Here, "Au" is the area of the region of uncertainty and "Co" is the
976 confidence of the location estimate.
978 This probability is often input to a decision process that has a
979 limited set of outcomes; therefore, a threshold value needs to be
980 selected. Depending on the application, different threshold
981 probabilities might be selected. In the absence of specific
982 recommendations, this document suggests that the probability be
983 greater than 50% before a decision is made. If the decision process
984 selects between two or more regions, as is required by [RFC5222],
985 then the region with the highest probability can be selected.
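The probability calculation and the two decision rules can be sketched in Python as follows. The function names are hypothetical. The 50% default reflects the suggestion in this section, and the input to the selection function is assumed to be a mapping from a region label to its area of overlap.

```python
def probability_in_region(co, ao, au):
    """Pi = Co * Ao / Au, assuming a rectangular PDF."""
    if au <= 0:
        raise ValueError("area of uncertainty must be positive")
    return co * ao / au

def is_within(co, ao, au, threshold=0.5):
    """Decide membership; the probability must exceed the threshold."""
    return probability_in_region(co, ao, au) > threshold

def most_likely_region(co, au, overlaps):
    """Pick the region label with the highest probability.

    overlaps: dict mapping a region label to its overlap area Ao.
    """
    return max(overlaps, key=lambda name: probability_in_region(co, overlaps[name], au))
```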
987 5.5.1. Determining the Area of Overlap for Two Circles
989 Determining the area of overlap between two arbitrary shapes is a
990 non-trivial process. Reducing areas to circles (see Section 5.2)
991 enables the application of the following process.
993 Given the radius of the first circle "r", the radius of the second
994 circle "R" and the distance between their center points "d", the
995 following set of formulas provides the area of overlap "Ao".
997 o If the circles don't overlap, that is "d >= r+R", "Ao" is zero.
999 o If one of the two circles is entirely within the other, that is
1000 "d <= |r-R|", the area of overlap is the area of the smaller
1001 circle.
1003 o Otherwise, if the circles partially overlap, that is "d < r+R" and
1004 "d > |r-R|", find "Ao" using:
1006 a = (r^2 - R^2 + d^2)/(2*d)
1008 Ao = r^2*acos(a/r) + R^2*acos((d - a)/R) - d*sqrt(r^2 - a^2)
1010 A value for "d" can be determined by converting the center points to
1011 Cartesian coordinates and calculating the distance between the two
1012 center points:
1014 d = sqrt((x1-x2)^2 + (y1-y2)^2 + (z1-z2)^2)
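The three cases can be sketched in Python as follows. The function name is hypothetical. The clamping of the "acos" and "sqrt" arguments is an addition of this sketch that guards against floating-point rounding when the circles are nearly tangent; it is not part of the formulas above.

```python
import math

def circle_overlap_area(r, R, d):
    """Area of overlap Ao of two circles with radii r and R whose
    centers are a distance d apart."""
    if d >= r + R:
        return 0.0  # no overlap
    if d <= abs(r - R):
        return math.pi * min(r, R) ** 2  # one circle inside the other
    a = (r * r - R * R + d * d) / (2 * d)
    clamp = lambda v: max(-1.0, min(1.0, v))  # rounding guard (assumption)
    half_chord = math.sqrt(max(0.0, r * r - a * a))
    return (r * r * math.acos(clamp(a / r))
            + R * R * math.acos(clamp((d - a) / R))
            - d * half_chord)
```

For two unit circles whose centers are one unit apart, this gives 2*pi/3 - sqrt(3)/2, the standard area of the resulting lens.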
1016 5.5.2. Determining the Area of Overlap for Two Polygons
1018 A calculation of overlap based on polygons can give better results
1019 than the circle-based method. However, efficient calculation of
1020 overlapping area is non-trivial. Algorithms such as Vatti's clipping
1021 algorithm [Vatti92] can be used.
1023 For large polygonal areas, it might be that geodesic interpolation is
1024 used. In these cases, altitude is also frequently omitted in
1025 describing the polygon. For such shapes, a planar projection can
1026 still give a good approximation of the area of overlap if the larger
1027 area polygon is projected onto the local tangent plane of the
1028 smaller. This is only possible if the only area of interest is that
1029 contained within the smaller polygon. Where the entire area of the
1030 larger polygon is of interest, geodesic interpolation is necessary.
1032 6. Examples
1034 This section presents some examples of how to apply the methods
1035 described in Section 5.
1037 6.1. Reduction to a Point or Circle
1039 Alice receives a location estimate from her LIS that contains an
1040 ellipsoidal region of uncertainty. This information is provided at
1041 19% confidence with a normal PDF. A PIDF-LO extract for this
1042 information is shown in Figure 8.
1044
See RFCXXXX.
1310
1311
1312 END
1314 8.2. XML Schema Registration
1316 This section registers an XML schema as per the guidelines in
1317 [RFC3688].
1319 URI: urn:ietf:params:xml:schema:geopriv:conf
1321 Registrant Contact: IETF, GEOPRIV working group, (geopriv@ietf.org),
1322 Martin Thomson (martin.thomson@andrew.com).
1324 Schema: The XML for this schema can be found as the entirety of
1325 Section 7 of this document.
1327 9. Security Considerations
1329 This document describes methods for managing and manipulating
1330 uncertainty in location. No specific security concerns arise from
1331 most of the information provided.
1333 Adding confidence to location information risks misinterpretation by
1334 consumers of location that do not understand the element. This could
1335 be exploited, particularly when reducing confidence, since the
1336 resulting uncertainty region might include locations that are less
1337 likely to contain the target than the recipient expects. Since this
1338 sort of error is always a possibility, the impact of this is low.
1340 10. Acknowledgements
1342 Peter Rhodes provided assistance with some of the mathematical
1343 groundwork on this document. Dan Cornford provided a detailed review
1344 and many terminology corrections.
1346 11. References
1348 11.1. Normative References
1350 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
1351 Requirement Levels", BCP 14, RFC 2119, March 1997.
1353 [RFC3688] Mealling, M., "The IETF XML Registry", BCP 81, RFC 3688,
1354 January 2004.
1356 [RFC4119] Peterson, J., "A Presence-based GEOPRIV Location Object
1357 Format", RFC 4119, December 2005.
1359 11.2. Informative References
1361 [Convert] Burtch, R., "A Comparison of Methods Used in Rectangular
1362 to Geodetic Coordinate Transformations", April 2006.
1364 [GeoShape]
1365 Thomson, M. and C. Reed, "GML 3.1.1 PIDF-LO Shape
1366 Application Schema for use by the Internet Engineering
1367 Task Force (IETF)", Candidate OpenGIS Implementation
1368 Specification 06-142r1, Version: 1.0, April 2007.
1370 [I-D.thomson-geopriv-location-obscuring]
1371 Thomson, M., "Obscuring Location", draft-thomson-geopriv-
1372 location-obscuring-03 (work in progress), June 2011.
1374 [ISO.GUM] ISO/IEC, "Guide to the expression of uncertainty in
1375 measurement (GUM)", Guide 98:1995, 1995.
1377 [NIST.TN1297]
1378 Taylor, B. and C. Kuyatt, "Guidelines for Evaluating and
1379 Expressing the Uncertainty of NIST Measurement Results",
1380 Technical Note 1297, Sep 1994.
1382 [RFC3693] Cuellar, J., Morris, J., Mulligan, D., Peterson, J., and
1383 J. Polk, "Geopriv Requirements", RFC 3693, February 2004.
1385 [RFC3694] Danley, M., Mulligan, D., Morris, J., and J. Peterson,
1386 "Threat Analysis of the Geopriv Protocol", RFC 3694,
1387 February 2004.
1389 [RFC3825] Polk, J., Schnizlein, J., and M. Linsner, "Dynamic Host
1390 Configuration Protocol Option for Coordinate-based
1391 Location Configuration Information", RFC 3825, July 2004.
1393 [RFC5139] Thomson, M. and J. Winterbottom, "Revised Civic Location
1394 Format for Presence Information Data Format Location
1395 Object (PIDF-LO)", RFC 5139, February 2008.
1397 [RFC5222] Hardie, T., Newton, A., Schulzrinne, H., and H.
1398 Tschofenig, "LoST: A Location-to-Service Translation
1399 Protocol", RFC 5222, August 2008.
1401 [RFC5491] Winterbottom, J., Thomson, M., and H. Tschofenig, "GEOPRIV
1402 Presence Information Data Format Location Object (PIDF-LO)
1403 Usage Clarification, Considerations, and Recommendations",
1404 RFC 5491, March 2009.
1406 [RFC6280] Barnes, R., Lepinski, M., Cooper, A., Morris, J.,
1407 Tschofenig, H., and H. Schulzrinne, "An Architecture for
1408 Location and Location Privacy in Internet Applications",
1409 BCP 160, RFC 6280, July 2011.
1411 [Sunday02]
1412 Sunday, D., "Fast polygon area and Newell normal
1413 computation", Journal of Graphics Tools JGT,
1414 7(2):9-13,2002, 2002,
1415