GEOPRIV                                                       M. Thomson
Internet-Draft                                                   Mozilla
Intended status: Standards Track                         J. Winterbottom
Expires: February 15, 2015                                  Unaffiliated
                                                         August 14, 2014


        Representation of Uncertainty and Confidence in PIDF-LO
                   draft-ietf-geopriv-uncertainty-02

Abstract

   The key concepts of uncertainty and confidence as they pertain to
   location information are defined.  Methods for the manipulation of
   location estimates that include uncertainty information are outlined.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on February 15, 2015.

Copyright Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Conventions and Terminology
   2.  A General Definition of Uncertainty
     2.1.  Uncertainty as a Probability Distribution
     2.2.  Deprecation of the Terms Precision and Resolution
     2.3.  Accuracy as a Qualitative Concept
   3.  Uncertainty in Location
     3.1.  Targets as Points in Space
     3.2.  Representation of Uncertainty and Confidence in PIDF-LO
     3.3.  Uncertainty and Confidence for Civic Addresses
     3.4.  DHCP Location Configuration Information and Uncertainty
   4.  Representation of Confidence in PIDF-LO
     4.1.  The "confidence" Element
     4.2.  Generating Locations with Confidence
     4.3.  Consuming and Presenting Confidence
   5.  Manipulation of Uncertainty
     5.1.  Reduction of a Location Estimate to a Point
       5.1.1.  Centroid Calculation
         5.1.1.1.  Arc-Band Centroid
         5.1.1.2.  Polygon Centroid
     5.2.  Conversion to Circle or Sphere
     5.3.  Three-Dimensional to Two-Dimensional Conversion
     5.4.  Increasing and Decreasing Uncertainty and Confidence
       5.4.1.  Rectangular Distributions
       5.4.2.  Normal Distributions
     5.5.  Determining Whether a Location is Within a Given Region
       5.5.1.  Determining the Area of Overlap for Two Circles
       5.5.2.  Determining the Area of Overlap for Two Polygons
   6.  Examples
     6.1.  Reduction to a Point or Circle
     6.2.  Increasing and Decreasing Confidence
     6.3.  Matching Location Estimates to Regions of Interest
     6.4.  PIDF-LO With Confidence Example
   7.  Confidence Schema
   8.  IANA Considerations
     8.1.  URN Sub-Namespace Registration for
           urn:ietf:params:xml:ns:geopriv:conf
     8.2.  XML Schema Registration
   9.  Security Considerations
   10. Acknowledgements
   11. References
     11.1.  Normative References
     11.2.  Informative References
   Appendix A.  Conversion Between Cartesian and Geodetic
                Coordinates in WGS84
   Appendix B.  Calculating the Upward Normal of a Polygon
     B.1.  Checking that a Polygon Upward Normal Points Up
   Authors' Addresses

1.  Introduction

   Location information represents an estimation of the position of a
   Target [RFC6280].  Under ideal circumstances, a location estimate
   precisely reflects the actual location of the Target.  For automated
   systems that determine location, there are many factors that
   introduce errors into the measurements that are used to determine
   location estimates.

   The process by which measurements are combined to generate a location
   estimate is outside of the scope of work within the IETF.  However,
   the results of such a process are carried in IETF data formats and
   protocols.  This document outlines how uncertainty, and its
   associated datum, confidence, are expressed and interpreted.

   This document provides a common nomenclature for discussing
   uncertainty and confidence as they relate to location information.

   This document also provides guidance on how to manage location
   information that includes uncertainty.  Methods for expanding or
   reducing uncertainty to obtain a required level of confidence are
   described.  Methods for determining the probability that a Target is
   within a specified region based on its location estimate are
   described.  These methods are simplified by making certain
   assumptions about the location estimate and are designed to be
   applicable to location estimates in a relatively small geographic
   area.

   A confidence extension for the Presence Information Data Format -
   Location Object (PIDF-LO) [RFC4119] is described.

   This document describes methods that can be used in combination with
   automatically determined location information.  These are
   statistically-based methods.

1.1.  Conventions and Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

   This document assumes a basic understanding of the principles of
   mathematics, particularly statistics and geometry.

   Some terminology is borrowed from [RFC3693] and [RFC6280], in
   particular Target.

   Mathematical formulae are presented using the following notation: add
   "+", subtract "-", multiply "*", divide "/", power "^" and absolute
   value "|x|".  Precedence is indicated using parentheses.
   Mathematical functions are represented by common abbreviations:
   square root "sqrt(x)", sine "sin(x)", cosine "cos(x)", inverse cosine
   "acos(x)", tangent "tan(x)", inverse tangent "atan(x)", two-argument
   inverse tangent "atan2(y,x)", error function "erf(x)", and inverse
   error function "erfinv(x)".

2.  A General Definition of Uncertainty

   Uncertainty results from the limitations of measurement.
   In measuring any observable quantity, errors from a range of sources
   affect the result.  Uncertainty is a quantification of what is known
   about the observed quantity, either through the limitations of
   measurement or through inherent variability of the quantity.

   Uncertainty is most completely described by a probability
   distribution.  A probability distribution assigns a probability to
   possible values for the quantity.

   A probability distribution describing a measured quantity can be
   arbitrarily complex and so it is desirable to find a simplified
   model.  One approach commonly taken is to reduce the probability
   distribution to a confidence interval.  Many alternative models are
   used in other areas, but study of those is not the focus of this
   document.

   In addition to the central estimate of the observed quantity, a
   confidence interval is succinctly described by two values: an error
   range and a confidence.  The error range describes an interval and
   the confidence describes an estimated upper bound on the probability
   that a "true" value is found within the extents defined by the error.

   In the following example, a measurement result for a length is shown
   as a nominal value with additional information on error range (0.0043
   meters) and confidence (95%).

      e.g.  x = 1.00742 +/- 0.0043 meters at 95% confidence

   This result indicates that the value of "x" is between 1.00312 and
   1.01172 meters with 95% probability.  No other assertion is made: in
   particular, this does not assert that x is 1.00742.

   Uncertainty and confidence for location estimates can be derived in a
   number of ways.  This document does not attempt to enumerate the many
   methods for determining uncertainty.  [ISO.GUM] and [NIST.TN1297]
   provide a set of general guidelines for determining and manipulating
   measurement uncertainty.
   This document applies that general guidance for consumers of location
   information.

   As a statistical measure, values for uncertainty are determined based
   on information in the aggregate, across numerous individual
   estimates.  An individual estimate might be determined to be
   "correct" - by using a survey to validate the result, for example -
   without invalidating the statistical assertion.

   This understanding of estimates in the statistical sense explains why
   asserting a confidence of 100%, which might seem intuitively correct,
   is rarely advisable.

2.1.  Uncertainty as a Probability Distribution

   The Probability Density Function (PDF) that is described by
   uncertainty indicates the probability that the "true" value lies at
   any one point.  The shape of the probability distribution can vary
   depending on the method that is used to determine the result.  The
   two probability density functions most generally applicable to
   location information are considered in this document:

   o  The normal PDF (also referred to as a Gaussian PDF) is used where
      a large number of small random factors contribute to errors.  The
      value used for the error range in a normal PDF is related to the
      standard deviation of the distribution.

   o  A rectangular PDF is used where the errors are known to be
      consistent across a limited range.  A rectangular PDF can occur
      where a single error source, such as a rounding error, is
      significantly larger than other errors.  A rectangular PDF is
      often described by its half-width; that is, half of the total
      width of the distribution.

   Each of these probability density functions can be characterized by
   its center point, or mean, and its width.  For a normal distribution,
   uncertainty and confidence together are related to the standard
   deviation (see Section 5.4).  For a rectangular distribution, half of
   the width of the distribution is used.
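
   As an illustration only, the two density functions described above
   can be written out in Python.  The function names and the parameter
   names "m" (mean), "s" (standard deviation) and "h" (half-width) are
   choices made for this sketch; the formulas are the textbook
   definitions of the normal and uniform densities and are not taken
   from this document.

```python
import math


def normal_pdf(x, m, s):
    """Density of a normal (Gaussian) distribution with mean m, std dev s."""
    z = (x - m) / s
    return math.exp(-0.5 * z * z) / (s * math.sqrt(2.0 * math.pi))


def rectangular_pdf(x, m, h):
    """Density of a rectangular (uniform) distribution with mean m, half-width h."""
    # Constant density 1/(2h) inside [m - h, m + h], zero outside.
    return 1.0 / (2.0 * h) if abs(x - m) <= h else 0.0
```

   Both functions return a density, not a probability; a probability is
   only obtained by integrating the density over an interval.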

   Figure 1 shows a normal and rectangular probability density function
   with the mean (m) and standard deviation (s) labelled.  The half-
   width (h) of the rectangular distribution is also indicated.

   [Figure: a bell-shaped normal PDF and a flat-topped rectangular PDF,
   both centered on the mean (m), with the standard deviation (s) and
   the half-width (h) marked]

       Figure 1: Normal and Rectangular Probability Density Functions

   For a given PDF, the value of the PDF describes the probability
   density of the "true" value at that point.  Confidence for any given
   interval is the total probability of the "true" value being in that
   range, defined as the integral of the PDF over the interval.

   The probability of the "true" value falling between two points is
   found by finding the area under the curve between the points (that
   is, the integral of the curve between the points).  For any given
   PDF, the area under the curve for the entire range from negative
   infinity to positive infinity is 1 (or 100%).  Therefore, the
   confidence over any interval of uncertainty can never exceed 100%.

   Figure 2 shows how confidence is determined for a normal
   distribution.  The area of the shaded region gives the confidence (c)
   for the interval between "m-u" and "m+u".
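
   A minimal Python sketch of this integral for the interval from "m-u"
   to "m+u" follows.  It relies on the standard closed forms for the
   normal and uniform distributions rather than on anything defined in
   this document; the function names are hypothetical, and "s" (standard
   deviation) and "h" (half-width) are assumed to be in the same units
   as "u".

```python
import math


def confidence_normal(u, s):
    """Probability that a normal variate with std dev s lies within m +/- u."""
    # Integral of the normal PDF over [m - u, m + u].
    return math.erf(u / (s * math.sqrt(2.0)))


def confidence_rectangular(u, h):
    """Probability that a rectangular variate with half-width h lies within m +/- u."""
    # The density is constant, so the area grows linearly until u reaches h.
    return min(1.0, u / h)
```

   With these definitions, an interval of one standard deviation either
   side of the mean of a normal PDF gives a confidence of roughly 68%,
   and an interval of half the half-width of a rectangular PDF gives 50%.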

   [Figure: a normal PDF with the area under the curve between (m-u)
   and (m+u) shaded and labelled as the confidence (c)]

              Figure 2: Confidence as the Integral of a PDF

   In Section 5.4, methods are described for manipulating uncertainty if
   the shape of the PDF is known.

2.2.  Deprecation of the Terms Precision and Resolution

   The terms _Precision_ and _Resolution_ are defined in RFC 3693
   [RFC3693].  These definitions were intended to provide a common
   nomenclature for discussing uncertainty; however, these particular
   terms have many different uses in other fields and their definitions
   are not sufficient to avoid confusion about their meaning.  These
   terms are unsuitable for use in relation to quantitative concepts
   when discussing uncertainty and confidence in relation to location
   information.

2.3.  Accuracy as a Qualitative Concept

   Uncertainty is a quantitative concept.  The term _accuracy_ is useful
   for describing qualitative aspects of location estimates.  Accuracy
   is not a suitable term for use in a quantitative context.

   For instance, it could be appropriate to say that a location estimate
   with uncertainty "X" is more accurate than a location estimate with
   uncertainty "2X" at the same confidence.  It is not appropriate to
   assign a number to "accuracy", nor is it appropriate to refer to any
   component of uncertainty or confidence as "accuracy".
   That is, to say that the "accuracy" for the first location estimate
   is "X" would be an erroneous use of this term.

3.  Uncertainty in Location

   A _location estimate_ is the result of location determination.  A
   location estimate is subject to uncertainty like any other
   observation.  However, unlike a simple measure of a one-dimensional
   property like length, a location estimate is specified in two or
   three dimensions.

   Uncertainty in two- or three-dimensional locations can be described
   using confidence intervals.  The confidence interval for a location
   estimate in two- or three-dimensional space is expressed as a subset
   of that space.  This document uses the term _region of uncertainty_
   to refer to the area or volume that describes the confidence
   interval.

   Areas or volumes that describe regions of uncertainty can be formed
   by the combination of two or three one-dimensional ranges, or more
   complex shapes could be described (for example, the shapes in
   [RFC5491]).

3.1.  Targets as Points in Space

   This document makes a simplifying assumption that the Target of the
   PIDF-LO occupies just a single point in space.  While this is clearly
   false in virtually all scenarios with any practical application, it
   is often a reasonable simplifying assumption to make.

   To a large extent, whether this simplification is valid depends on
   the size of the target relative to the size of the uncertainty
   region.  When locating a personal device using contemporary location
   determination techniques, the space the device occupies relative to
   the uncertainty is proportionally quite small.  Even where that
   device is used as a proxy for a person, the proportions change
   little.

   This assumption is less useful as uncertainty becomes small relative
   to the size of the Target of the PIDF-LO (or conversely, as the
   Target becomes large relative to the uncertainty).
   For instance, describing the location of a football stadium or small
   country would include a region of uncertainty that is only marginally
   larger than the Target itself.  In these cases, much of the guidance
   in this document is not applicable.  Indeed, as the accuracy of
   location determination technology improves, it could be that the
   advice this document contains becomes less relevant by the same
   measure.

3.2.  Representation of Uncertainty and Confidence in PIDF-LO

   A set of shapes suitable for the expression of uncertainty in
   location estimates in the Presence Information Data Format - Location
   Object (PIDF-LO) are described in [GeoShape].  These shapes are the
   recommended form for the representation of uncertainty in PIDF-LO
   [RFC4119] documents.

   The PIDF-LO can contain uncertainty, but does not include an
   indication of confidence.  [RFC5491] defines a fixed value of 95%.
   Similarly, the PIDF-LO format does not provide an indication of the
   shape of the PDF.  Section 4 defines elements to convey this
   information in PIDF-LO.

   Absence of uncertainty information in a PIDF-LO document does not
   indicate that there is no uncertainty in the location estimate.
   Uncertainty might not have been calculated for the estimate, or it
   may be withheld for privacy purposes.

   If the Point shape is used, confidence and uncertainty are unknown; a
   receiver can either assume a confidence of 0% or infinite
   uncertainty.  The same principle applies on the altitude axis for
   two-dimensional shapes like the Circle.

3.3.  Uncertainty and Confidence for Civic Addresses

   Automatically determined civic addresses [RFC5139] inherently include
   uncertainty, based on the area of the most precise element that is
   specified.  In this case, uncertainty is effectively described by the
   presence or absence of elements -- elements that are not present are
   deemed to be uncertain.

   To apply the concept of uncertainty to civic addresses, it is helpful
   to unify the conceptual models of civic address with geodetic
   location information.  This is particularly useful when considering
   civic addresses that are determined using reverse geocoding (that is,
   the process of translating geodetic information into civic
   addresses).

   In the unified view, a civic address defines a series of (sometimes
   non-orthogonal) spatial partitions.  The first is the implicit
   partition that identifies the surface of the earth and the space near
   the surface.  The second is the country.  Each label that is included
   in a civic address provides information about a different set of
   spatial partitions.  Some partitions require slight adjustments from
   a standard interpretation: for instance, a road includes all
   properties that adjoin the street.  Each label might need to be
   interpreted with other values to provide context.

   As a value at each level is interpreted, one or more spatial
   partitions at that level are selected, and all other partitions of
   that type are excluded.  For non-orthogonal partitions, only the
   portion of the partition that fits within the existing space is
   selected.  This is what distinguishes King Street in Sydney from King
   Street in Melbourne.  Each defined element selects a partition of
   space.  The resulting location is the intersection of all selected
   spaces.

   The resulting spatial partition can be considered as a region of
   uncertainty.

      Note: This view is a potential perspective on the process of
      geocoding - the translation of a civic address to a geodetic
      location.

   Uncertainty in civic addresses can be increased by removing elements.
   This does not increase confidence unless additional information is
   used.  Similarly, arbitrarily increasing uncertainty in a geodetic
   location does not increase confidence.

3.4.  DHCP Location Configuration Information and Uncertainty

   Location information is often measured in two or three dimensions;
   expressions of uncertainty in one dimension only are rare.  The
   "resolution" parameters in [RFC6225] provide an indication of how
   many bits of a number are valid, which could be interpreted as an
   expression of uncertainty in one dimension.

   [RFC6225] defines a means for representing uncertainty, but a value
   for confidence is not specified.  A default value of 95% confidence
   is assumed for the combination of the uncertainty on each axis.  This
   is consistent with the transformation of those forms into the
   uncertainty representations from [RFC5491].  That is, the confidence
   of the resultant rectangular polygon or prism is assumed to be 95%.

4.  Representation of Confidence in PIDF-LO

   On the whole, a fixed definition for confidence is preferable,
   primarily because it ensures consistency between implementations.
   Location generators that are aware of this constraint can generate
   location information at the required confidence.  Location recipients
   are able to make sensible assumptions about the quality of the
   information that they receive.

   In some circumstances - particularly with pre-existing systems -
   location generators might be unable to provide location information
   with consistent confidence.  Existing systems sometimes specify
   confidence at 38%, 67% or 90%.  Existing forms of expressing location
   information, such as that defined in [TS-3GPP-23_032], contain
   elements that express the confidence in the result.

   The addition of a confidence element provides information that was
   previously unavailable to recipients of location information.
   Without this information, a location server or generator that has
   access to location information with a confidence lower than 95% has
   two options:

   o  The location server can scale regions of uncertainty in an attempt
      to achieve 95% confidence.  This scaling process significantly
      degrades the quality of the information, because the location
      server might not have the necessary information to scale
      appropriately; the location server is forced to make assumptions
      that are likely to result in either an overly conservative
      estimate with high uncertainty or an overestimate of confidence.

   o  The location server can ignore the confidence entirely, which
      results in giving the recipient a false impression of its quality.

   Both of these choices degrade the quality of the information
   provided.

   The addition of a confidence element avoids this problem entirely if
   a location recipient supports and understands the element.  A
   recipient that does not understand - and hence ignores - the
   confidence element is in no worse a position than if the location
   server ignored confidence.

4.1.  The "confidence" Element

   The confidence element MAY be added to the "location-info" element of
   the Presence Information Data Format - Location Object (PIDF-LO)
   [RFC4119] document.  This element expresses the confidence in the
   associated location information as a percentage.  A special "unknown"
   value is reserved to indicate that confidence is supported, but not
   known to the Location Generator.

   The confidence element optionally includes an attribute that
   indicates the shape of the probability density function (PDF) of the
   associated region of uncertainty.  Three values are possible:
   unknown, normal and rectangular.

   Indicating a particular PDF only indicates that the distribution
   approximately fits the given shape based on the methods used to
   generate the location information.
   The PDF is normal if there are a large number of small, independent
   sources of error; rectangular if all points within the area have
   roughly equal probability of being the actual location of the Target;
   otherwise, the PDF MUST either be set to unknown or omitted.

   If a PIDF-LO does not include the confidence element, the confidence
   of the location estimate is 95%, as defined in [RFC5491].

   A Point shape does not have uncertainty (or it has infinite
   uncertainty), so confidence is meaningless for a point; therefore,
   this element MUST be omitted if only a point is provided.

4.2.  Generating Locations with Confidence

   Location generators SHOULD attempt to ensure that confidence is equal
   in each dimension when generating location information.  This
   restriction, while not always practical, allows for more accurate
   scaling, if scaling is necessary.

   A confidence element MUST be included with all location information
   that includes uncertainty (that is, all forms other than a point).  A
   special "unknown" value MAY be used if confidence is not known.

4.3.  Consuming and Presenting Confidence

   The inclusion of confidence that is anything other than 95% presents
   a potentially difficult usability problem for applications that use
   location information.  Effectively communicating the probability that
   a location is incorrect to a user can be difficult.

   It is inadvisable to simply display locations of any confidence, or
   to display confidence in a separate or non-obvious fashion.
   If locations with different confidence levels are displayed such that
   the distinction is subtle or easy to overlook - such as using fine
   graduations of color or transparency for graphical uncertainty
   regions, or displaying uncertainty graphically, but providing
   confidence as supplementary text - a user could fail to notice a
   difference in the quality of the location information that might be
   significant.

   Depending on the circumstances, different ways of handling confidence
   might be appropriate.  Section 5 describes techniques that could be
   appropriate for consumers that use automated processing.

   Provided that the full implications of any choice for the application
   are understood, some amount of automated processing could be
   appropriate.  In a simple example, applications could choose to
   discard or suppress the display of location information if confidence
   does not meet a pre-determined threshold.

   In settings where there is an opportunity for user training, some of
   these problems might be mitigated by defining different operational
   procedures for handling location information at different confidence
   levels.

5.  Manipulation of Uncertainty

   This section deals with manipulation of location information that
   contains uncertainty.

   The following rules generally apply when manipulating location
   information:

   o  Where calculations are performed on coordinate information, these
      should be performed in Cartesian space and the results converted
      back to latitude, longitude and altitude.  A method for converting
      to and from Cartesian coordinates is included in Appendix A.

         While some approximation methods are useful in simplifying
         calculations, treating latitude and longitude as Cartesian axes
         is never advisable.  The two axes are not orthogonal.  Errors
         can arise from the curvature of the earth and from the
         convergence of longitude lines.

   o  Normal rounding rules do not apply when rounding uncertainty.
      When rounding, the region of uncertainty always increases (that
      is, errors are rounded up) and confidence is always rounded down
      (see [NIST.TN1297]).  This means that any manipulation of
      uncertainty is a non-reversible operation; each manipulation can
      result in the loss of some information.

5.1.  Reduction of a Location Estimate to a Point

   Manipulating location estimates that include uncertainty information
   requires additional complexity in systems.  In some cases, systems
   only operate on definitive values, that is, a single point.

   This section describes algorithms for reducing location estimates to
   a simple form without uncertainty information.  Having a consistent
   means for reducing location estimates allows for interaction between
   applications that are able to use uncertainty information and those
   that cannot.

      Note: Reduction of a location estimate to a point constitutes a
      reduction in information.  Removing uncertainty information can
      degrade results in some applications.  Also, there is a natural
      tendency to misinterpret a point location as representing a
      location without uncertainty.  This could lead to more serious
      errors.  Therefore, these algorithms should only be applied where
      necessary.

   Several different approaches can be taken when reducing a location
   estimate to a point.  Different methods each make a set of
   assumptions about the properties of the PDF and the selected point;
   no one method is more "correct" than any other.  For any given region
   of uncertainty, selecting an arbitrary point within the area could be
   considered valid; however, given the aforementioned problems with
   point locations, a more rigorous approach is appropriate.

   Given a result with a known distribution, selecting the point within
   the area that has the highest probability is a more rigorous method.
630 Alternatively, a point could be selected that minimizes the overall 631 error; that is, it minimizes the expected value of the difference 632 between the selected point and the "true" value. 634 If a rectangular distribution is assumed, the centroid of the area or 635 volume minimizes the overall error. Minimizing the error for a 636 normal distribution is mathematically complex. Therefore, this 637 document opts to select the centroid of the region of uncertainty 638 when selecting a point. 640 5.1.1. Centroid Calculation 642 For regular shapes, such as Circle, Sphere, Ellipse and Ellipsoid, 643 this approach equates to the center point of the region. For regions 644 of uncertainty that are expressed as regular Polygons and Prisms the 645 center point is also the most appropriate selection. 647 For the Arc-Band shape and non-regular Polygons and Prisms, selecting 648 the centroid of the area or volume minimizes the overall error. This 649 assumes that the PDF is rectangular. 651 Note: The centroid of a concave Polygon or Arc-Band shape is not 652 necessarily within the region of uncertainty. 654 5.1.1.1. Arc-Band Centroid 656 The centroid of the Arc-Band shape is found along a line that bisects 657 the arc. The centroid can be found at the following distance from 658 the starting point of the arc-band (assuming an arc-band with an 659 inner radius of "r", outer radius "R", start angle "a", and opening 660 angle "o"): 662 d = 4 * sin(o/2) * (R*R + R*r + r*r) / (3*o*(R + r)) 664 This point can be found along the line that bisects the arc; that is, 665 the line at an angle of "a + (o/2)". 667 5.1.1.2. Polygon Centroid 669 Calculating a centroid for the Polygon and Prism shapes is more 670 complex. Polygons that are specified using geodetic coordinates are 671 not necessarily coplanar. For Polygons that are specified without an 672 altitude, choose a value for altitude before attempting this process; 673 an altitude of 0 is acceptable. 
675 The method described in this section is simplified by assuming 676 that the surface of the earth is locally flat. This method 677 degrades as polygons become larger; see [GeoShape] for 678 recommendations on polygon size. 680 The polygon is translated to a new coordinate system that has an x-y 681 plane roughly parallel to the polygon. This enables the elimination 682 of z-axis values and calculating a centroid can be done using only x 683 and y coordinates. This requires that the upward normal for the 684 polygon is known. 686 To translate the polygon coordinates, apply the process described in 687 Appendix B to find the normal vector "N = [Nx,Ny,Nz]". This value 688 should be made a unit vector to ensure that the transformation matrix 689 is a special orthogonal matrix. From this vector, select two vectors 690 that are perpendicular to this vector and combine these into a 691 transformation matrix. 693 If "Nx" and "Ny" are non-zero, the matrices in Figure 3 can be used, 694 given "p = sqrt(Nx^2 + Ny^2)". More transformations are provided 695 later in this section for cases where "Nx" or "Ny" are zero. 697 [ -Ny/p Nx/p 0 ] [ -Ny/p -Nx*Nz/p Nx ] 698 T = [ -Nx*Nz/p -Ny*Nz/p p ] T' = [ Nx/p -Ny*Nz/p Ny ] 699 [ Nx Ny Nz ] [ 0 p Nz ] 700 (Transform) (Reverse Transform) 702 Figure 3: Recommended Transformation Matrices 704 To apply a transform to each point in the polygon, form a matrix from 705 the ECEF coordinates and use matrix multiplication to determine the 706 translated coordinates. 708 [ -Ny/p Nx/p 0 ] [ x[1] x[2] x[3] ... x[n] ] 709 [ -Nx*Nz/p -Ny*Nz/p p ] * [ y[1] y[2] y[3] ... y[n] ] 710 [ Nx Ny Nz ] [ z[1] z[2] z[3] ... z[n] ] 712 [ x'[1] x'[2] x'[3] ... x'[n] ] 713 = [ y'[1] y'[2] y'[3] ... y'[n] ] 714 [ z'[1] z'[2] z'[3] ... 
z'[n] ] 716 Figure 4: Transformation 718 Alternatively, direct multiplication can be used to achieve the same 719 result: 721 x'[i] = -Ny * x[i] / p + Nx * y[i] / p 723 y'[i] = -Nx * Nz * x[i] / p - Ny * Nz * y[i] / p + p * z[i] 725 z'[i] = Nx * x[i] + Ny * y[i] + Nz * z[i] 727 The first and second rows of this matrix ("x'" and "y'") contain the 728 values that are used to calculate the centroid of the polygon. To 729 find the centroid of this polygon, first find the area using: 731 A = sum from i=1..n of (x'[i]*y'[i+1]-x'[i+1]*y'[i]) / 2 733 For these formulae, treat each set of coordinates as circular, that 734 is "x'[0] == x'[n]" and "x'[n+1] == x'[1]". Based on the area, the 735 centroid along each axis can be determined by: 737 Cx' = sum (x'[i]+x'[i+1]) * (x'[i]*y'[i+1]-x'[i+1]*y'[i]) / (6*A) 739 Cy' = sum (y'[i]+y'[i+1]) * (x'[i]*y'[i+1]-x'[i+1]*y'[i]) / (6*A) 741 Note: The formula for the area of a polygon will return a negative 742 value if the polygon is specified in clockwise direction. This 743 can be used to determine the orientation of the polygon. 745 The third row contains a distance from a plane parallel to the 746 polygon. If the polygon is coplanar, then the values for "z'" are 747 identical; however, the constraints recommended in [RFC5491] mean 748 that this is rarely the case. To determine "Cz'", average these 749 values: 751 Cz' = sum z'[i] / n 753 Once the centroid is known in the transformed coordinates, these can 754 be transformed back to the original coordinate system. The reverse 755 transformation is shown in Figure 5. 
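The area and centroid sums above can be sketched in Python as follows. This is a minimal illustration rather than part of the specification: the function name "planar_centroid" and the use of zero-based lists holding the already-transformed x' and y' values are assumptions made for the example.

```python
def planar_centroid(xs, ys):
    """Return (signed area, Cx', Cy') for a simple polygon in the x'-y' plane.

    xs and ys hold the transformed coordinates, without a repeated
    closing point.  Indices wrap around, matching the circular
    treatment of the coordinate arrays described in the text.
    """
    n = len(xs)
    twice_area = 0.0
    cx = 0.0
    cy = 0.0
    for i in range(n):
        j = (i + 1) % n  # next vertex, wrapping to the first
        cross = xs[i] * ys[j] - xs[j] * ys[i]
        twice_area += cross
        cx += (xs[i] + xs[j]) * cross
        cy += (ys[i] + ys[j]) * cross
    area = twice_area / 2.0
    # A zero area (degenerate polygon) would raise ZeroDivisionError here.
    return area, cx / (6.0 * area), cy / (6.0 * area)
```

The sign of the returned area follows the vertex order, so a negative value indicates a polygon given in clockwise direction; the centroid is the same for either order.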
757 [ -Ny/p -Nx*Nz/p Nx ] [ Cx' ] [ Cx ] 758 [ Nx/p -Ny*Nz/p Ny ] * [ Cy' ] = [ Cy ] 759 [ 0 p Nz ] [ sum of z'[i] / n ] [ Cz ] 761 Figure 5: Reverse Transformation 763 The reverse transformation can be applied directly as follows: 765 Cx = -Ny * Cx' / p - Nx * Nz * Cy' / p + Nx * Cz' 767 Cy = Nx * Cx' / p - Ny * Nz * Cy' / p + Ny * Cz' 769 Cz = p * Cy' + Nz * Cz' 771 The ECEF value "[Cx,Cy,Cz]" can then be converted back to geodetic 772 coordinates. Given a polygon that is defined with no altitude or 773 equal altitudes for each point, the altitude of the result can either 774 be ignored or reset after converting back to a geodetic value. 776 The centroid of the Prism shape is found by finding the centroid of 777 the base polygon and raising the point by half the height of the 778 prism. This can be added to altitude of the final result; 779 alternatively, this can be added to "Cz'", which ensures that 780 negative height is correctly applied to polygons that are defined in 781 a "clockwise" direction. 783 The recommended transforms only apply if "Nx" and "Ny" are non-zero. 784 If the normal vector is "[0,0,1]" (that is, along the z-axis), then 785 no transform is necessary. Similarly, if the normal vector is 786 "[0,1,0]" or "[1,0,0]", avoid the transformation and use the x and z 787 coordinates or y and z coordinates (respectively) in the centroid 788 calculation phase. If either "Nx" or "Ny" are zero, the alternative 789 transform matrices in Figure 6 can be used. The reverse transform is 790 the transpose of this matrix. 792 if Nx == 0: | if Ny == 0: 793 [ 0 -Nz Ny ] [ 0 1 0 ] | [ -Nz 0 Nx ] 794 T = [ 1 0 0 ] T' = [ -Nz 0 Ny ] | T = T' = [ 0 1 0 ] 795 [ 0 Ny Nz ] [ Ny 0 Nz ] | [ Nx 0 Nz ] 797 Figure 6: Alternative Transformation Matrices 799 5.2. Conversion to Circle or Sphere 801 The Circle or Sphere are simple shapes that suit a range of 802 applications. 
A circle or sphere contains fewer units of data to 803 manipulate, which simplifies operations on location estimates. 805 The simplest method for converting a location estimate to a Circle or 806 Sphere shape is to determine the centroid and then find the longest 807 distance to any point in the region of uncertainty to that point. 808 This distance can be determined based on the shape type: 810 Circle/Sphere: No conversion necessary. 812 Ellipse/Ellipsoid: The greater of either semi-major axis or altitude 813 uncertainty. 815 Polygon/Prism: The distance to the furthest vertex of the polygon 816 (for a Prism, it is only necessary to check points on the base). 818 Arc-Band: The furthest length from the centroid to the points where 819 the inner and outer arc end. This distance can be calculated by 820 finding the larger of the two following formulae: 822 X = sqrt( d*d + R*R - 2*d*R*cos(o/2) ) 824 x = sqrt( d*d + r*r - 2*d*r*cos(o/2) ) 826 Once the Circle or Sphere shape is found, the associated confidence 827 can be increased if the result is known to follow a normal 828 distribution. However, this is a complicated process and provides 829 limited benefit. In many cases it also violates the constraint that 830 confidence in each dimension be the same. Confidence should be 831 unchanged when performing this conversion. 833 Two dimensional shapes are converted to a Circle; three dimensional 834 shapes are converted to a Sphere. 836 5.3. Three-Dimensional to Two-Dimensional Conversion 838 A three-dimensional shape can be easily converted to a two- 839 dimensional shape by removing the altitude component. A sphere 840 becomes a circle; a prism becomes a polygon; an ellipsoid becomes an 841 ellipse. Each conversion is simple, requiring only the removal of 842 those elements relating to altitude. 844 The altitude is unspecified for a two-dimensional shape and therefore 845 has unlimited uncertainty along the vertical axis. 
The confidence 846 for the two-dimensional shape is thus higher than the three- 847 dimensional shape. Assuming equal confidence on each axis, the 848 confidence of the circle can be increased using the following 849 approximate formula: 851 C[2d] >= C[3d] ^ (2/3) 853 "C[2d]" is the confidence of the two-dimensional shape and "C[3d]" is 854 the confidence of the three-dimensional shape. For example, a Sphere 855 with a confidence of 95% can be simplified to a Circle of equal 856 radius with confidence of 96.6%. 858 5.4. Increasing and Decreasing Uncertainty and Confidence 860 The combination of uncertainty and confidence provide a great deal of 861 information about the nature of the data that is being measured. If 862 uncertainty, confidence and PDF are known, certain information can be 863 extrapolated. In particular, the uncertainty can be scaled to meet a 864 desired confidence or the confidence for a particular region of 865 uncertainty can be found. 867 In general, confidence decreases as the region of uncertainty 868 decreases in size and confidence increases as the region of 869 uncertainty increases in size. However, this depends on the PDF; 870 expanding the region of uncertainty for a rectangular distribution 871 has no effect on confidence without additional information. If the 872 region of uncertainty is increased during the process of obfuscation 873 (see [RFC6772]), then the confidence cannot be increased. 875 A region of uncertainty that is reduced in size always has a lower 876 confidence. 878 A region of uncertainty that has an unknown PDF shape cannot be 879 reduced in size reliably. The region of uncertainty can be expanded, 880 but only if confidence is not increased. 882 This section makes the simplifying assumption that location 883 information is symmetrically and evenly distributed in each 884 dimension. This is not necessarily true in practice. If better 885 information is available, alternative methods might produce better 886 results. 
5.4.1.  Rectangular Distributions

   Uncertainty that follows a rectangular distribution can only be
   decreased in size.  Increasing uncertainty has no value, since it has
   no effect on confidence.  Since the PDF is constant over the region
   of uncertainty, the resulting confidence is determined by the
   following formula:

      Cr = Co * Ur / Uo

   Where "Uo" and "Ur" are the sizes of the original and reduced regions
   of uncertainty (either the area or the volume of the region); "Co"
   and "Cr" are the confidence values associated with each region.

   Information is lost by decreasing the region of uncertainty for a
   rectangular distribution.  Once reduced in size, the uncertainty
   region cannot subsequently be increased in size.

5.4.2.  Normal Distributions

   Uncertainty and confidence can be both increased and decreased for a
   normal distribution.  This calculation depends on the number of
   dimensions of the uncertainty region.

   For a normal distribution, uncertainty and confidence are related to
   the standard deviation of the function.  The following function
   defines the relationship between standard deviation, uncertainty, and
   confidence along a single axis:

      S[x] = U[x] / ( sqrt(2) * erfinv(C[x]) )

   Where "S[x]" is the standard deviation, "U[x]" is the uncertainty,
   and "C[x]" is the confidence along a single axis.  "erfinv" is the
   inverse error function.

   Scaling a normal distribution in two or more dimensions requires
   several assumptions.  Firstly, it is assumed that the distribution
   along each axis is independent.  Secondly, the confidence for each
   axis is assumed to be the same.  Therefore, the confidence along each
   axis can be assumed to be:

      C[x] = Co ^ (1/n)

   Where "C[x]" is the confidence along a single axis, "Co" is the
   overall confidence, and "n" is the number of dimensions in the
   uncertainty.
935 Therefore, to find the uncertainty for each axis at a desired 936 confidence, "Cd", apply the following formula: 938 Ud[x] <= U[x] * (erfinv(Cd ^ (1/n)) / erfinv(Co ^ (1/n))) 940 For regular shapes, this formula can be applied as a scaling factor 941 in each dimension to reach a required confidence. 943 5.5. Determining Whether a Location is Within a Given Region 945 A number of applications require that a judgment be made about 946 whether a Target is within a given region of interest. Given a 947 location estimate with uncertainty, this judgment can be difficult. 948 A location estimate represents a probability distribution, and the 949 true location of the Target cannot be definitively known. Therefore, 950 the judgment relies on determining the probability that the Target is 951 within the region. 953 The probability that the Target is within a particular region is 954 found by integrating the PDF over the region. For a normal 955 distribution, there are no analytical methods that can be used to 956 determine the integral of the two or three dimensional PDF over an 957 arbitrary region. The complexity of numerical methods is also too 958 great to be useful in many applications; for example, finding the 959 integral of the PDF in two or three dimensions across the overlap 960 between the uncertainty region and the target region. If the PDF is 961 unknown, no determination can be made without a simplifying 962 assumption. 964 When judging whether a location is within a given region, this 965 document assumes that uncertainties are rectangular. This introduces 966 errors, but simplifies the calculations significantly. Prior to 967 applying this assumption, confidence should be scaled to 95%. 969 Note: The selection of confidence has a significant impact on the 970 final result. Only use a different confidence if an uncertainty 971 value for 95% confidence cannot be found. 
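The scaling to 95% confidence relies on the formula from Section 5.4.2. The following Python sketch shows one way to evaluate it. It is illustrative only: the names "erfinv" and "normal_scale" are chosen for the example, and because the Python standard library has no inverse error function, the sketch derives one from the normal quantile function (statistics.NormalDist, Python 3.8 or later).

```python
from math import sqrt
from statistics import NormalDist


def erfinv(y):
    """Inverse error function for -1 < y < 1.

    Uses the identity erf(x) = 2 * Phi(x * sqrt(2)) - 1, where Phi is
    the standard normal cumulative distribution function.
    """
    return NormalDist().inv_cdf((y + 1.0) / 2.0) / sqrt(2.0)


def normal_scale(c_orig, c_desired, dims):
    """Per-axis scale factor that takes confidence c_orig to c_desired.

    Confidence values are fractions (0.95, not 95); dims is the number
    of dimensions "n" in the uncertainty region.
    """
    return erfinv(c_desired ** (1.0 / dims)) / erfinv(c_orig ** (1.0 / dims))


# Inputs used in the Section 6.2 example: 19% to 95% in three dimensions.
scale = normal_scale(0.19, 0.95, 3)
```

Section 6.2 reports a factor of 2.9937 for these inputs. Multiplying each axis of a regular shape by the factor gives the uncertainty at the desired confidence; any rounding of the result should increase uncertainty.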
973 Given the assumption of a rectangular distribution, the probability 974 that a Target is found within a given region is found by first 975 finding the area (or volume) of overlap between the uncertainty 976 region and the region of interest. This is multiplied by the 977 confidence of the location estimate to determine the probability. 978 Figure 7 shows an example of finding the area of overlap between the 979 region of uncertainty and the region of interest. 981 _.-""""-._ 982 .' `. _ Region of 983 / \ / Uncertainty 984 ..+-"""--.. | 985 .-' | :::::: `-. | 986 ,' | :: Ao ::: `. | 987 / \ :::::::::: \ / 988 / `._ :::::: _.X 989 | `-....-' | 990 | | 991 | | 992 \ / 993 `. .' \_ Region of 994 `._ _.' Interest 995 `--..___..--' 997 Figure 7: Area of Overlap Between Two Circular Regions 999 Once the area of overlap, "Ao", is known, the probability that the 1000 Target is within the region of interest, "Pi", is: 1002 Pi = Co * Ao / Au 1004 Given that the area of the region of uncertainty is "Au" and the 1005 confidence is "Co". 1007 This probability is often input to a decision process that has a 1008 limited set of outcomes; therefore, a threshold value needs to be 1009 selected. Depending on the application, different threshold 1010 probabilities might be selected. In the absence of specific 1011 recommendations, this document suggests that the probability be 1012 greater than 50% before a decision is made. If the decision process 1013 selects between two or more regions, as is required by [RFC5222], 1014 then the region with the highest probability can be selected. 1016 5.5.1. Determining the Area of Overlap for Two Circles 1018 Determining the area of overlap between two arbitrary shapes is a 1019 non-trivial process. Reducing areas to circles (see Section 5.2) 1020 enables the application of the following process. 
1022 Given the radius of the first circle "r", the radius of the second 1023 circle "R" and the distance between their center points "d", the 1024 following set of formulas provide the area of overlap "Ao". 1026 o If the circles don't overlap, that is "d >= r+R", "Ao" is zero. 1028 o If one of the two circles is entirely within the other, that is 1029 "d <= |r-R|", the area of overlap is the area of the smaller 1030 circle. 1032 o Otherwise, if the circles partially overlap, that is "d < r+R" and 1033 "d > |r-R|", find "Ao" using: 1035 a = (r^2 - R^2 + d^2)/(2*d) 1037 Ao = r^2*acos(a/r) + R^2*acos((d - a)/R) - d*sqrt(r^2 - a^2) 1039 A value for "d" can be determined by converting the center points to 1040 Cartesian coordinates and calculating the distance between the two 1041 center points: 1043 d = sqrt((x1-x2)^2 + (y1-y2)^2 + (z1-z2)^2) 1045 5.5.2. Determining the Area of Overlap for Two Polygons 1047 A calculation of overlap based on polygons can give better results 1048 than the circle-based method. However, efficient calculation of 1049 overlapping area is non-trivial. Algorithms such as Vatti's clipping 1050 algorithm [Vatti92] can be used. 1052 For large polygonal areas, it might be that geodesic interpolation is 1053 used. In these cases, altitude is also frequently omitted in 1054 describing the polygon. For such shapes, a planar projection can 1055 still give a good approximation of the area of overlap if the larger 1056 area polygon is projected onto the local tangent plane of the 1057 smaller. This is only possible if the only area of interest is that 1058 contained within the smaller polygon. Where the entire area of the 1059 larger polygon is of interest, geodesic interpolation is necessary. 1061 6. Examples 1063 This section presents some examples of how to apply the methods 1064 described in Section 5. 1066 6.1. 
Reduction to a Point or Circle

   Alice receives a location estimate from her LIS that contains an
   ellipsoidal region of uncertainty.  This information is provided at
   19% confidence with a normal PDF.  A PIDF-LO extract for this
   information is shown in Figure 8.

      [XML markup not preserved in this copy.  Surviving values, in
      document order:
         -34.407242 150.882518 34    (position)
         7.7156                      (semi-major axis)
         3.31                        (semi-minor axis)
         28.7                        (vertical axis)
         43]

                                 Figure 8

   This information can be reduced to a point simply by extracting the
   center point, that is [-34.407242, 150.882518, 34].

   If some limited uncertainty were required, the estimate could be
   converted into a circle or sphere.  To convert to a sphere, the
   radius is the largest of the semi-major, semi-minor and vertical
   axes; in this case, 28.7 meters.

   However, if only a circle is required, the altitude can be dropped as
   can the altitude uncertainty (the vertical axis of the ellipsoid),
   resulting in a circle at [-34.407242, 150.882518] of radius 7.7156
   meters.

   Bob receives a location estimate with a Polygon shape.  This
   information is shown in Figure 9.

      [XML markup not preserved in this copy.  Surviving polygon
      coordinates:
         -33.856625 151.215906 -33.856299 151.215343
         -33.856326 151.214731 -33.857533 151.214495
         -33.857720 151.214613 -33.857369 151.215375
         -33.856625 151.215906]

                                 Figure 9

   To convert this to a point, each point of the polygon is first
   assigned an altitude of zero and converted to ECEF coordinates (see
   Appendix A).  Then a normal vector for this polygon is found (see
   Appendix B).  The result of each of these stages is shown in
   Figure 10.  Note that the numbers shown are rounded for display only;
   intermediate values must not be rounded during this process, since
   rounding would contribute significant errors.
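The first of these stages, converting a vertex with an altitude of zero to ECEF coordinates, can be sketched in Python using the forward relationships from Appendix A. This is a minimal illustration: the function and constant names are assumptions made for the example, and angles are taken in degrees as they appear in the polygon.

```python
from math import cos, radians, sin, sqrt

WGS84_R = 6378137.0                    # radius of ellipsoid, meters
WGS84_F = 1.0 / 298.257223563          # flattening
WGS84_E2 = WGS84_F * (2.0 - WGS84_F)   # first eccentricity squared


def geodetic_to_ecef(lat_deg, lng_deg, alt_m=0.0):
    """Convert WGS84 latitude/longitude (degrees) and altitude (meters)
    to ECEF X, Y, Z in meters."""
    lat = radians(lat_deg)
    lng = radians(lng_deg)
    n = WGS84_R / sqrt(1.0 - WGS84_E2 * sin(lat) ** 2)
    x = (n + alt_m) * cos(lat) * cos(lng)
    y = (n + alt_m) * cos(lat) * sin(lng)
    z = (n * (1.0 - WGS84_E2) + alt_m) * sin(lat)
    return x, y, z


# First vertex of Bob's polygon, with altitude set to zero.
x, y, z = geodetic_to_ecef(-33.856625, 151.215906)
```

The values returned for this vertex can be compared with the first row of the ECEF polygon in Figure 10, which is shown rounded to five significant digits.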
      Polygon in ECEF coordinate space
         (repeated point omitted and transposed to fit):
                [ -4.6470e+06  2.5530e+06  -3.5333e+06 ]
                [ -4.6470e+06  2.5531e+06  -3.5332e+06 ]
      pecef =   [ -4.6470e+06  2.5531e+06  -3.5332e+06 ]
                [ -4.6469e+06  2.5531e+06  -3.5333e+06 ]
                [ -4.6469e+06  2.5531e+06  -3.5334e+06 ]
                [ -4.6469e+06  2.5531e+06  -3.5333e+06 ]

      Normal Vector: n = [ -0.72782  0.39987  -0.55712 ]

      Transformation Matrix:
                [ -0.48152  -0.87643   0.00000 ]
          t =   [ -0.48828   0.26827   0.83043 ]
                [ -0.72782   0.39987  -0.55712 ]

      Transformed Coordinates:
                [  8.3206e+01  1.9809e+04  6.3715e+06 ]
                [  3.1107e+01  1.9845e+04  6.3715e+06 ]
      pecef' =  [ -2.5528e+01  1.9842e+04  6.3715e+06 ]
                [ -4.7367e+01  1.9708e+04  6.3715e+06 ]
                [ -3.6447e+01  1.9687e+04  6.3715e+06 ]
                [  3.4068e+01  1.9726e+04  6.3715e+06 ]

      Two-dimensional polygon area: A = 12600 m^2
      Two-dimensional polygon centroid: C' = [ 8.8184e+00  1.9775e+04 ]

      Average of pecef' z coordinates: 6.3715e+06

      Reverse Transformation Matrix:
                [ -0.48152  -0.48828  -0.72782 ]
          t' =  [ -0.87643   0.26827   0.39987 ]
                [  0.00000   0.83043  -0.55712 ]

      Polygon centroid (ECEF): C = [ -4.6470e+06  2.5531e+06  -3.5333e+06 ]
      Polygon centroid (Geo): Cg = [ -33.856926  151.215102  -4.9537e-04 ]

                                 Figure 10

   The point conversion for the polygon uses the final result, "Cg",
   ignoring the altitude since the original shape did not include
   altitude.

   To convert this to a circle, take the maximum distance in ECEF
   coordinates from the center point to each of the points.  This
   results in a radius of 99.1 meters.  Confidence is unchanged.

6.2.  Increasing and Decreasing Confidence

   Assume that confidence is known to be 19% for Alice's location
   information.  This is a typical value for a three-dimensional
   ellipsoid uncertainty of normal distribution where the standard
   deviation is used directly for uncertainty in each dimension.
   The confidence associated with Alice's location estimate is quite low
   for many applications.  Since the estimate is known to follow a
   normal distribution, the method in Section 5.4.2 can be used.  Each
   axis can be scaled by:

      scale = erfinv(0.95^(1/3)) / erfinv(0.19^(1/3)) = 2.9937

   Ensuring that rounding always increases uncertainty, the location
   estimate at 95% includes a semi-major axis of 23.1, a semi-minor axis
   of 10 and a vertical axis of 86.

   Bob's location estimate (from the previous example) covers an area of
   approximately 12600 square meters.  If the estimate follows a
   rectangular distribution, the region of uncertainty can be reduced in
   size.  Here we find the confidence that Bob is within the smaller
   area of the concert hall.  For the concert hall, the polygon
   [-33.856473, 151.215257; -33.856322, 151.214973;
   -33.856424, 151.21471; -33.857248, 151.214753;
   -33.857413, 151.214941; -33.857311, 151.215128] is used.  To use this
   new region of uncertainty, find its area using the same translation
   method described in Section 5.1.1.2, which produces 4566.2 square
   meters.  Given that the concert hall is entirely within Bob's
   original location estimate, the confidence associated with the
   smaller area is therefore 95% * 4566.2 / 12600 = 34%.

6.3.  Matching Location Estimates to Regions of Interest

   Suppose that a circular area is defined centered at
   [-33.872754, 151.20683] with a radius of 1950 meters.  To determine
   whether Bob is found within this area - given that Bob is at
   [-33.856926, 151.215102] with an uncertainty radius of 99.1 meters -
   we apply the method in Section 5.5.  Using the converted Circle shape
   for Bob's location, the distance between these points is found to be
   1915.26 meters.
   The area of overlap between Bob's location estimate and the region of
   interest is therefore approximately 22019 square meters and the area
   of Bob's location estimate is 30853 square meters.  This gives the
   estimated probability that Bob is less than 1950 meters from the
   selected point as 67.8%.

   Note that if 1920 meters were chosen for the distance from the
   selected point, the area of overlap would be only 16196 square meters
   and the confidence would be 49.8%.  Therefore, it is marginally more
   likely that Bob is outside the region of interest, despite the center
   point of his location estimate being within the region.

6.4.  PIDF-LO With Confidence Example

   The PIDF-LO document in Figure 11 includes a representation of
   uncertainty as a circular area.  The confidence element (on the line
   marked with a comment) indicates that the confidence is 67% and that
   it follows a normal distribution.

      [XML markup not preserved in this copy.  Surviving values, in
      document order:
         42.5463 -73.2512
         850.24
         67
         mac:010203040506]

               Figure 11: Example PIDF-LO with Confidence

7.  Confidence Schema

      [XML schema markup not preserved in this copy.  Surviving
      annotation text:

         PIDF-LO Confidence

         This schema defines an element that is used for indicating
         confidence in PIDF-LO documents.]

8.  IANA Considerations

8.1.  URN Sub-Namespace Registration for
      urn:ietf:params:xml:ns:geopriv:conf

   This section registers a new XML namespace,
   "urn:ietf:params:xml:ns:geopriv:conf", as per the guidelines in
   [RFC3688].

      URI: urn:ietf:params:xml:ns:geopriv:conf

      Registrant Contact: IETF, GEOPRIV working group,
      (geopriv@ietf.org), Martin Thomson (martin.thomson@gmail.com).

      XML:

         BEGIN
           [XHTML markup not preserved in this copy; text content
           follows.]

           PIDF-LO Confidence Attribute

Namespace for PIDF-LO Confidence Attribute


urn:ietf:params:xml:ns:geopriv:conf

[[NOTE TO IANA/RFC-EDITOR: Please update RFC URL and replace XXXX
with the RFC number for this specification.]]

See RFCXXXX.

1358 1359 1360 END 1362 8.2. XML Schema Registration 1364 This section registers an XML schema as per the guidelines in 1365 [RFC3688]. 1367 URI: urn:ietf:params:xml:schema:geopriv:conf 1369 Registrant Contact: IETF, GEOPRIV working group, (geopriv@ietf.org), 1370 Martin Thomson (martin.thomson@gmail.com). 1372 Schema: The XML for this schema can be found as the entirety of 1373 Section 7 of this document. 1375 9. Security Considerations 1377 This document describes methods for managing and manipulating 1378 uncertainty in location. No specific security concerns arise from 1379 most of the information provided. 1381 Adding confidence to location information risks misinterpretation by 1382 consumers of location that do not understand the element. This could 1383 be exploited, particularly when reducing confidence, since the 1384 resulting uncertainty region might include locations that are less 1385 likely to contain the target than the recipient expects. Since this 1386 sort of error is always a possibility, the impact of this is low. 1388 10. Acknowledgements 1390 Peter Rhodes provided assistance with some of the mathematical 1391 groundwork on this document. Dan Cornford provided a detailed review 1392 and many terminology corrections. 1394 11. References 1396 11.1. Normative References 1398 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 1399 Requirement Levels", BCP 14, RFC 2119, March 1997. 1401 [RFC3688] Mealling, M., "The IETF XML Registry", BCP 81, RFC 3688, 1402 January 2004. 1404 [RFC3693] Cuellar, J., Morris, J., Mulligan, D., Peterson, J., and 1405 J. Polk, "Geopriv Requirements", RFC 3693, February 2004. 1407 [RFC4119] Peterson, J., "A Presence-based GEOPRIV Location Object 1408 Format", RFC 4119, December 2005. 1410 [RFC5139] Thomson, M. and J. Winterbottom, "Revised Civic Location 1411 Format for Presence Information Data Format Location 1412 Object (PIDF-LO)", RFC 5139, February 2008. 
1414 [RFC5491] Winterbottom, J., Thomson, M., and H. Tschofenig, "GEOPRIV 1415 Presence Information Data Format Location Object (PIDF-LO) 1416 Usage Clarification, Considerations, and Recommendations", 1417 RFC 5491, March 2009. 1419 [RFC6225] Polk, J., Linsner, M., Thomson, M., and B. Aboba, "Dynamic 1420 Host Configuration Protocol Options for Coordinate-Based 1421 Location Configuration Information", RFC 6225, July 2011. 1423 [RFC6280] Barnes, R., Lepinski, M., Cooper, A., Morris, J., 1424 Tschofenig, H., and H. Schulzrinne, "An Architecture for 1425 Location and Location Privacy in Internet Applications", 1426 BCP 160, RFC 6280, July 2011. 1428 11.2. Informative References 1430 [Convert] Burtch, R., "A Comparison of Methods Used in Rectangular 1431 to Geodetic Coordinate Transformations", April 2006. 1433 [GeoShape] 1434 Thomson, M. and C. Reed, "GML 3.1.1 PIDF-LO Shape 1435 Application Schema for use by the Internet Engineering 1436 Task Force (IETF)", Candidate OpenGIS Implementation 1437 Specification 06-142r1, Version: 1.0, April 2007. 1439 [ISO.GUM] ISO/IEC, "Guide to the expression of uncertainty in 1440 measurement (GUM)", Guide 98:1995, 1995. 1442 [NIST.TN1297] 1443 Taylor, B. and C. Kuyatt, "Guidelines for Evaluating and 1444 Expressing the Uncertainty of NIST Measurement Results", 1445 Technical Note 1297, Sep 1994. 1447 [RFC5222] Hardie, T., Newton, A., Schulzrinne, H., and H. 1448 Tschofenig, "LoST: A Location-to-Service Translation 1449 Protocol", RFC 5222, August 2008. 1451 [RFC6772] Schulzrinne, H., Tschofenig, H., Cuellar, J., Polk, J., 1452 Morris, J., and M. Thomson, "Geolocation Policy: A 1453 Document Format for Expressing Privacy Preferences for 1454 Location Information", RFC 6772, January 2013. 1456 [Sunday02] 1457 Sunday, D., "Fast polygon area and Newell normal 1458 computation", Journal of Graphics Tools JGT, 1459 7(2):9-13,2002, 2002, 1460 . 
   [TS-3GPP-23_032]
              3GPP, "Universal Geographic Area Description (GAD)", 3GPP
              TS 23.032 11.0.0, September 2012.

   [Vatti92]  Vatti, B., "A generic solution to polygon clipping",
              Communications of the ACM Vol35, Issue7, pp56-63, 1992.

   [WGS84]    US National Imagery and Mapping Agency, "Department of
              Defense (DoD) World Geodetic System 1984 (WGS 84), Third
              Edition", NIMA TR8350.2, January 2000.

Appendix A.  Conversion Between Cartesian and Geodetic Coordinates in
             WGS84

   The process of conversion from geodetic (latitude, longitude and
   altitude) to earth-centered, earth-fixed (ECEF) Cartesian coordinates
   is relatively simple.

   In this section, the following constants and derived values are used
   from the definition of WGS84 [WGS84]:

      {radius of ellipsoid} R = 6378137 meters

      {inverse flattening} 1/f = 298.257223563

      {first eccentricity squared} e^2 = f * (2 - f)

      {second eccentricity squared} e'^2 = e^2 / (1 - e^2)

   To convert geodetic coordinates (latitude, longitude, altitude) to
   ECEF coordinates (X, Y, Z), use the following relationships:

      N = R / sqrt(1 - e^2 * sin(latitude)^2)

      X = (N + altitude) * cos(latitude) * cos(longitude)

      Y = (N + altitude) * cos(latitude) * sin(longitude)

      Z = (N*(1 - e^2) + altitude) * sin(latitude)

   The reverse conversion requires more complex computation and most
   methods introduce some error in latitude and altitude.  A range of
   techniques is described in [Convert].
   A variant on the method originally proposed by Bowring, which results
   in an acceptably small error, is described by the following:

      p = sqrt(X^2 + Y^2)

      r = sqrt(X^2 + Y^2 + Z^2)

      u = atan((1-f) * Z * (1 + e'^2 * (1-f) * R / r) / p)

      latitude = atan((Z + e'^2 * (1-f) * R * sin(u)^3)
                      / (p - e^2 * R * cos(u)^3))

      longitude = atan2(Y, X)

      altitude = sqrt((p - R * cos(u))^2 + (Z - (1-f) * R * sin(u))^2)

   If the point is near the poles, that is "p < 1", the value for
   altitude that this method produces is unstable.  A simpler method for
   determining the altitude of a point near the poles is:

      altitude = |Z| - R * (1 - f)

Appendix B.  Calculating the Upward Normal of a Polygon

   For a polygon that is guaranteed to be convex and coplanar, the
   upward normal can be found by finding the vector cross product of
   adjacent edges.

   For more general cases, the Newell method of approximation described
   in [Sunday02] may be applied.  In particular, this method can be used
   if the points are only approximately coplanar, and for non-convex
   polygons.

   This process requires a Cartesian coordinate system.  Therefore,
   convert the geodetic coordinates of the polygon to Cartesian, ECEF
   coordinates (Appendix A).  If no altitude is specified, assume an
   altitude of zero.

   This method can be condensed to the following set of equations:

      Nx = sum from i=1..n of (y[i] * (z[i+1] - z[i-1]))

      Ny = sum from i=1..n of (z[i] * (x[i+1] - x[i-1]))

      Nz = sum from i=1..n of (x[i] * (y[i+1] - y[i-1]))

   For these formulae, the polygon is made of points
   "(x[1], y[1], z[1])" through "(x[n], y[n], z[n])".  Each array is
   treated as circular, that is, "x[0] == x[n]" and "x[n+1] == x[1]".
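The three sums above can be sketched in Python as follows. This is an illustration under stated assumptions, not a normative implementation: the function name "newell_normal" and the representation of the polygon as a zero-based list of (x, y, z) tuples in ECEF coordinates are choices made for the example.

```python
def newell_normal(points):
    """Return the unnormalized Newell normal (Nx, Ny, Nz) of a polygon.

    points is a list of (x, y, z) tuples.  Indices wrap around, which
    implements the circular treatment of the coordinate arrays.
    """
    n = len(points)
    nx = 0.0
    ny = 0.0
    nz = 0.0
    for i in range(n):
        x, y, z = points[i]
        px, py, pz = points[i - 1]        # previous point; i == 0 wraps to the last
        qx, qy, qz = points[(i + 1) % n]  # next point; wraps to the first
        nx += y * (qz - pz)
        ny += z * (qx - px)
        nz += x * (qy - py)
    return nx, ny, nz
```

For a planar polygon the length of this vector is twice the polygon area, and reversing the order of the points negates every component.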
   To translate this into a unit vector, divide each component by the
   length of the vector:

      Nx' = Nx / sqrt(Nx^2 + Ny^2 + Nz^2)

      Ny' = Ny / sqrt(Nx^2 + Ny^2 + Nz^2)

      Nz' = Nz / sqrt(Nx^2 + Ny^2 + Nz^2)

B.1.  Checking that a Polygon Upward Normal Points Up

   RFC 5491 [RFC5491] stipulates that polygons be presented in
   anti-clockwise direction so that the upward normal is in an upward
   direction.  Accidental reversal of points can invert this vector.
   This error can be hard to detect just by looking at the series of
   coordinates that form the polygon.

   Calculate the dot product of the upward normal of the polygon
   (Appendix B) and any vector that points away from the center of the
   Earth from the location of the polygon.  If this product is positive,
   then the polygon upward normal also points away from the center of
   the Earth.

   If both vectors are unit vectors, the inverse cosine of this product
   indicates the angle between the horizontal plane and the approximate
   plane of the polygon.

   A unit vector for the upward direction at any point can be found
   based on the latitude (lat) and longitude (lng) of the point, as
   follows:

      Up = [ cos(lat) * cos(lng) ; cos(lat) * sin(lng) ; sin(lat) ]

   For polygons that span less than half the globe, any point in the
   polygon - including the centroid - can be selected to generate an
   approximate up vector for comparison with the upward normal.

Authors' Addresses

   Martin Thomson
   Mozilla
   331 E Evelyn Street
   Mountain View, CA  94041
   US

   Email: martin.thomson@gmail.com


   James Winterbottom
   Unaffiliated
   AU

   Email: a.james.winterbottom@gmail.com