GEOPRIV                                                       M. Thomson
Internet-Draft                                                   Mozilla
Updates: 3693,4119,5491 (if approved)                    J. Winterbottom
Intended status: Standards Track                            Unaffiliated
Expires: March 20, 2015                               September 16, 2014

         Representation of Uncertainty and Confidence in PIDF-LO
                    draft-ietf-geopriv-uncertainty-03

Abstract

   The key concepts of uncertainty and confidence as they pertain to
   location information are defined.  Methods for the manipulation of
   location estimates that include uncertainty information are outlined.

   This draft normatively updates the definition of location information
   representations defined in RFC 4119 and RFC 5491.  It also deprecates
   related terminology defined in RFC 3693.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.
   The list of current Internet-Drafts is at
   http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on March 20, 2015.

Copyright Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Conventions and Terminology
   2.  A General Definition of Uncertainty
     2.1.  Uncertainty as a Probability Distribution
     2.2.  Deprecation of the Terms Precision and Resolution
     2.3.  Accuracy as a Qualitative Concept
   3.  Uncertainty in Location
     3.1.  Targets as Points in Space
     3.2.  Representation of Uncertainty and Confidence in PIDF-LO
     3.3.  Uncertainty and Confidence for Civic Addresses
     3.4.  DHCP Location Configuration Information and Uncertainty
   4.  Representation of Confidence in PIDF-LO
     4.1.  The "confidence" Element
     4.2.  Generating Locations with Confidence
     4.3.  Consuming and Presenting Confidence
   5.  Manipulation of Uncertainty
     5.1.  Reduction of a Location Estimate to a Point
       5.1.1.  Centroid Calculation
         5.1.1.1.  Arc-Band Centroid
         5.1.1.2.  Polygon Centroid
     5.2.  Conversion to Circle or Sphere
     5.3.  Three-Dimensional to Two-Dimensional Conversion
     5.4.  Increasing and Decreasing Uncertainty and Confidence
       5.4.1.  Rectangular Distributions
       5.4.2.  Normal Distributions
     5.5.  Determining Whether a Location is Within a Given Region
       5.5.1.  Determining the Area of Overlap for Two Circles
       5.5.2.  Determining the Area of Overlap for Two Polygons
   6.  Examples
     6.1.  Reduction to a Point or Circle
     6.2.  Increasing and Decreasing Confidence
     6.3.  Matching Location Estimates to Regions of Interest
     6.4.  PIDF-LO With Confidence Example
   7.  Confidence Schema
   8.  IANA Considerations
     8.1.  URN Sub-Namespace Registration for
           urn:ietf:params:xml:ns:geopriv:conf
     8.2.  XML Schema Registration
   9.  Security Considerations
   10. Acknowledgements
   11. References
     11.1.  Normative References
     11.2.  Informative References
   Appendix A.  Conversion Between Cartesian and Geodetic
                Coordinates in WGS84
   Appendix B.  Calculating the Upward Normal of a Polygon
     B.1.  Checking that a Polygon Upward Normal Points Up
   Authors' Addresses

1.  Introduction

   Location information represents an estimation of the position of a
   Target [RFC6280].  Under ideal circumstances, a location estimate
   precisely reflects the actual location of the Target.  For automated
   systems that determine location, there are many factors that
   introduce errors into the measurements that are used to determine
   location estimates.

   The process by which measurements are combined to generate a location
   estimate is outside of the scope of work within the IETF.  However,
   the results of such a process are carried in IETF data formats and
   protocols.  This document outlines how uncertainty, and its
   associated datum, confidence, are expressed and interpreted.

   This document provides a common nomenclature for discussing
   uncertainty and confidence as they relate to location information.

   This document also provides guidance on how to manage location
   information that includes uncertainty.  Methods for expanding or
   reducing uncertainty to obtain a required level of confidence are
   described.  Methods for determining the probability that a Target is
   within a specified region based on its location estimate are
   described.  These methods are simplified by making certain
   assumptions about the location estimate and are designed to be
   applicable to location estimates in a relatively small geographic
   area.

   A confidence extension for the Presence Information Data Format -
   Location Object (PIDF-LO) [RFC4119] is described.

   This document describes methods that can be used in combination with
   automatically determined location information.  These are
   statistically-based methods.

1.1.  Conventions and Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

   This document assumes a basic understanding of the principles of
   mathematics, particularly statistics and geometry.

   Some terminology is borrowed from [RFC3693] and [RFC6280], in
   particular Target.

   Mathematical formulae are presented using the following notation: add
   "+", subtract "-", multiply "*", divide "/", power "^" and absolute
   value "|x|".  Precedence is indicated using parentheses.
   Mathematical functions are represented by common abbreviations:
   square root "sqrt(x)", sine "sin(x)", cosine "cos(x)", inverse cosine
   "acos(x)", tangent "tan(x)", inverse tangent "atan(x)", two-argument
   inverse tangent "atan2(y,x)", error function "erf(x)", and inverse
   error function "erfinv(x)".

2.  A General Definition of Uncertainty

   Uncertainty results from the limitations of measurement.  In
   measuring any observable quantity, errors from a range of sources
   affect the result.  Uncertainty is a quantification of what is known
   about the observed quantity, either through the limitations of
   measurement or through inherent variability of the quantity.

   Uncertainty is most completely described by a probability
   distribution.  A probability distribution assigns a probability to
   possible values for the quantity.

   A probability distribution describing a measured quantity can be
   arbitrarily complex and so it is desirable to find a simplified
   model.  One approach commonly taken is to reduce the probability
   distribution to a confidence interval.
   Many alternative models are used in other areas, but study of those
   is not the focus of this document.

   In addition to the central estimate of the observed quantity, a
   confidence interval is succinctly described by two values: an error
   range and a confidence.  The error range describes an interval and
   the confidence describes an estimated upper bound on the probability
   that a "true" value is found within the extents defined by the error.

   In the following example, a measurement result for a length is shown
   as a nominal value with additional information on error range (0.0043
   meters) and confidence (95%).

      e.g. x = 1.00742 +/- 0.0043 meters at 95% confidence

   This result indicates that the value of "x" is between 1.00312 and
   1.01172 meters with 95% probability.  No other assertion is made: in
   particular, this does not assert that x is 1.00742.

   Uncertainty and confidence for location estimates can be derived in a
   number of ways.  This document does not attempt to enumerate the many
   methods for determining uncertainty.  [ISO.GUM] and [NIST.TN1297]
   provide a set of general guidelines for determining and manipulating
   measurement uncertainty.  This document applies that general guidance
   for consumers of location information.

   As a statistical measure, values for uncertainty are determined
   based on information in the aggregate, across numerous individual
   estimates.  An individual estimate might be determined to be
   "correct" - by using a survey to validate the result, for example -
   without invalidating the statistical assertion.

   This understanding of estimates in the statistical sense explains why
   asserting a confidence of 100%, which might seem intuitively correct,
   is rarely advisable.

2.1.  Uncertainty as a Probability Distribution

   The Probability Density Function (PDF) that is described by
   uncertainty indicates the probability that the "true" value lies at
   any one point.  The shape of the probability distribution can vary
   depending on the method that is used to determine the result.  The
   two probability density functions most generally applicable to
   location information are considered in this document:

   o  The normal PDF (also referred to as a Gaussian PDF) is used where
      a large number of small random factors contribute to errors.  The
      value used for the error range in a normal PDF is related to the
      standard deviation of the distribution.

   o  A rectangular PDF is used where the errors are known to be
      consistent across a limited range.  A rectangular PDF can occur
      where a single error source, such as a rounding error, is
      significantly larger than other errors.  A rectangular PDF is
      often described by the half-width of the distribution; that is,
      half the width of the distribution.

   Each of these probability density functions can be characterized by
   its center point, or mean, and its width.  For a normal distribution,
   uncertainty and confidence together are related to the standard
   deviation of the function (see Section 5.4).  For a rectangular
   distribution, the half-width of the distribution is used.

   Figure 1 shows a normal and rectangular probability density function
   with the mean (m) and standard deviation (s) labelled.  The half-
   width (h) of the rectangular distribution is also indicated.

   [Figure: bell-shaped normal PDF (drawn with "*") overlaid on a
   rectangular PDF (drawn with "-" and "|"), both centered on the mean
   "m"; the standard deviation "s" and the half-width "h" are marked.]

     Figure 1: Normal and Rectangular Probability Density Functions

   For a given PDF, the value of the PDF describes the probability that
   the "true" value is found at that point.  Confidence for any given
   interval is the total probability of the "true" value being in that
   range, defined as the integral of the PDF over the interval.

   The probability of the "true" value falling between two points is
   found by finding the area under the curve between the points (that
   is, the integral of the curve between the points).  For any given
   PDF, the area under the curve for the entire range from negative
   infinity to positive infinity is 1 (or 100%).  Therefore, the
   confidence over any interval of uncertainty is always less than or
   equal to 100%.

   Figure 2 shows how confidence is determined for a normal
   distribution.  The area of the shaded region gives the confidence (c)
   for the interval between "m-u" and "m+u".

   [Figure: normal PDF with the area under the curve between "(m-u)"
   and "(m+u)" shaded and labelled "c"; the mean "m" is at the center.]

              Figure 2: Confidence as the Integral of a PDF

   In Section 5.4, methods are described for manipulating uncertainty if
   the shape of the PDF is known.

2.2.  Deprecation of the Terms Precision and Resolution

   The terms _Precision_ and _Resolution_ are defined in RFC 3693
   [RFC3693].  These definitions were intended to provide a common
   nomenclature for discussing uncertainty; however, these particular
   terms have many different uses in other fields and their definitions
   are not sufficient to avoid confusion about their meaning.  These
   terms are unsuitable for use in relation to quantitative concepts
   when discussing uncertainty and confidence in relation to location
   information.

2.3.  Accuracy as a Qualitative Concept

   Uncertainty is a quantitative concept.  The term _accuracy_ is useful
   in describing, qualitatively, the general concepts of location
   information and the qualitative aspects of location estimates.
   Accuracy is not a suitable term for use in a quantitative context.

   For instance, it could be appropriate to say that a location estimate
   with uncertainty "X" is more accurate than a location estimate with
   uncertainty "2X" at the same confidence.  It is not appropriate to
   assign a number to "accuracy", nor is it appropriate to refer to any
   component of uncertainty or confidence as "accuracy".  That is, to
   say that the "accuracy" for the first location estimate is "X" would
   be an erroneous use of this term.

3.  Uncertainty in Location

   A _location estimate_ is the result of location determination.  A
   location estimate is subject to uncertainty like any other
   observation.  However, unlike a simple measure of a one-dimensional
   property like length, a location estimate is specified in two or
   three dimensions.

   Uncertainty in two- or three-dimensional locations can be described
   using confidence intervals.  The confidence interval for a location
   estimate in two- or three-dimensional space is expressed as a subset
   of that space.
   This document uses the term _region of uncertainty_ to refer to the
   area or volume that describes the confidence interval.

   Areas or volumes that describe regions of uncertainty can be formed
   by the combination of two or three one-dimensional ranges, or more
   complex shapes could be described (for example, the shapes in
   [RFC5491]).

3.1.  Targets as Points in Space

   This document makes a simplifying assumption that the Target of the
   PIDF-LO occupies just a single point in space.  While this is clearly
   false in virtually all scenarios with any practical application, it
   is often a reasonable simplifying assumption to make.

   To a large extent, whether this simplification is valid depends on
   the size of the Target relative to the size of the uncertainty
   region.  When locating a personal device using contemporary location
   determination techniques, the space the device occupies relative to
   the uncertainty is proportionally quite small.  Even where that
   device is used as a proxy for a person, the proportions change
   little.

   This assumption is less useful as uncertainty becomes small relative
   to the size of the Target of the PIDF-LO (or conversely, as the
   Target becomes large relative to the uncertainty).  For instance,
   describing the location of a football stadium or small country would
   include a region of uncertainty that is infinitesimally larger than
   the Target itself.  In these cases, much of the guidance in this
   document is not applicable.  Indeed, as the accuracy of location
   determination technology improves, it could be that the advice this
   document contains becomes less relevant by the same measure.

3.2.  Representation of Uncertainty and Confidence in PIDF-LO

   A set of shapes suitable for the expression of uncertainty in
   location estimates in the Presence Information Data Format - Location
   Object (PIDF-LO) are described in [GeoShape].
   These shapes are the recommended form for the representation of
   uncertainty in PIDF-LO [RFC4119] documents.

   The PIDF-LO can contain uncertainty, but does not include an
   indication of confidence.  [RFC5491] defines a fixed value of 95%.
   Similarly, the PIDF-LO format does not provide an indication of the
   shape of the PDF.  Section 4 defines elements to convey this
   information in PIDF-LO.

   Absence of uncertainty information in a PIDF-LO document does not
   indicate that there is no uncertainty in the location estimate.
   Uncertainty might not have been calculated for the estimate, or it
   may be withheld for privacy purposes.

   If the Point shape is used, confidence and uncertainty are unknown; a
   receiver can either assume a confidence of 0% or infinite
   uncertainty.  The same principle applies on the altitude axis for
   two-dimensional shapes like the Circle.

3.3.  Uncertainty and Confidence for Civic Addresses

   Automatically determined civic addresses [RFC5139] inherently include
   uncertainty, based on the area of the most precise element that is
   specified.  In this case, uncertainty is effectively described by the
   presence or absence of elements.  To the recipient of location
   information, elements that are not present are uncertain.

   To apply the concept of uncertainty to civic addresses, it is helpful
   to unify the conceptual models of civic address with geodetic
   location information.  This is particularly useful when considering
   civic addresses that are determined using reverse geocoding (that is,
   the process of translating geodetic information into civic
   addresses).

   In the unified view, a civic address defines a series of (sometimes
   non-orthogonal) spatial partitions.  The first is the implicit
   partition that identifies the surface of the earth and the space near
   the surface.  The second is the country.
   Each label that is included in a civic address provides information
   about a different set of spatial partitions.  Some partitions require
   slight adjustments from a standard interpretation: for instance, a
   road includes all properties that adjoin the street.  Each label
   might need to be interpreted with other values to provide context.

   As a value at each level is interpreted, one or more spatial
   partitions at that level are selected, and all other partitions of
   that type are excluded.  For non-orthogonal partitions, only the
   portion of the partition that fits within the existing space is
   selected.  This is what distinguishes King Street in Sydney from King
   Street in Melbourne.  Each defined element selects a partition of
   space.  The resulting location is the intersection of all selected
   spaces.

   The resulting spatial partition can be considered as a region of
   uncertainty.

      Note: This view is a potential perspective on the process of geo-
      coding - the translation of a civic address to a geodetic
      location.

   Uncertainty in civic addresses can be increased by removing elements.
   This does not increase confidence unless additional information is
   used.  Similarly, arbitrarily increasing uncertainty in a geodetic
   location does not increase confidence.

3.4.  DHCP Location Configuration Information and Uncertainty

   Location information is often measured in two or three dimensions;
   expressions of uncertainty in one dimension only are rare.  The
   "resolution" parameters in [RFC6225] provide an indication of how
   many bits of a number are valid, which could be interpreted as an
   expression of uncertainty in one dimension.

   [RFC6225] defines a means for representing uncertainty, but a value
   for confidence is not specified.  A default value of 95% confidence
   should be assumed for the combination of the uncertainty on each
   axis.
   This is consistent with the transformation of those forms into the
   uncertainty representations from [RFC5491].  That is, the confidence
   of the resultant rectangular polygon or prism is assumed to be 95%.

4.  Representation of Confidence in PIDF-LO

   On the whole, a fixed definition for confidence is preferable,
   primarily because it ensures consistency between implementations.
   Location generators that are aware of this constraint can generate
   location information at the required confidence.  Location recipients
   are able to make sensible assumptions about the quality of the
   information that they receive.

   In some circumstances - particularly with pre-existing systems -
   location generators might be unable to provide location information
   with consistent confidence.  Existing systems sometimes specify
   confidence at 38%, 67% or 90%.  Existing forms of expressing location
   information, such as that defined in [TS-3GPP-23_032], contain
   elements that express the confidence in the result.

   The addition of a confidence element provides information that was
   previously unavailable to recipients of location information.
   Without this information, a location server or generator that has
   access to location information with a confidence lower than 95% has
   two options:

   o  The location server can scale regions of uncertainty in an attempt
      to achieve 95% confidence.  This scaling process significantly
      degrades the quality of the information, because the location
      server might not have the necessary information to scale
      appropriately; the location server is forced to make assumptions
      that are likely to result in either an overly conservative
      estimate with high uncertainty or an overestimate of confidence.

   o  The location server can ignore the confidence entirely, which
      results in giving the recipient a false impression of its quality.

   Both of these choices degrade the quality of the information
   provided.

   The addition of a confidence element avoids this problem entirely if
   a location recipient supports and understands the element.  A
   recipient that does not understand - and hence ignores - the
   confidence element is in no worse a position than if the location
   server ignored confidence.

4.1.  The "confidence" Element

   The confidence element MAY be added to the "location-info" element of
   the Presence Information Data Format - Location Object (PIDF-LO)
   [RFC4119] document.  This element expresses the confidence in the
   associated location information as a percentage.  A special "unknown"
   value is reserved to indicate that confidence is supported, but not
   known to the Location Generator.

   The confidence element optionally includes an attribute that
   indicates the shape of the probability density function (PDF) of the
   associated region of uncertainty.  Three values are possible:
   unknown, normal and rectangular.

   Indicating a particular PDF only indicates that the distribution
   approximately fits the given shape based on the methods used to
   generate the location information.  The PDF is normal if there are a
   large number of small, independent sources of error; rectangular if
   all points within the area have roughly equal probability of being
   the actual location of the Target; otherwise, the PDF MUST either be
   set to unknown or omitted.

   If a PIDF-LO does not include the confidence element, the confidence
   of the location estimate is 95%, as defined in [RFC5491].

   A Point shape does not have uncertainty (or it has infinite
   uncertainty), so confidence is meaningless for a point; therefore,
   this element MUST be omitted if only a point is provided.

4.2.  Generating Locations with Confidence

   Location generators SHOULD attempt to ensure that confidence is equal
   in each dimension when generating location information.  This
   restriction, while not always practical, allows for more accurate
   scaling, if scaling is necessary.

   A confidence element MUST be included with all location information
   that includes uncertainty (that is, all forms other than a point).
   The special "unknown" value MAY be used if confidence is not known.

4.3.  Consuming and Presenting Confidence

   The inclusion of confidence that is anything other than 95% presents
   a potentially difficult usability problem for applications that use
   location information.  Effectively communicating the probability that
   a location is incorrect to a user can be difficult.

   It is inadvisable to simply display locations of any confidence, or
   to display confidence in a separate or non-obvious fashion.  If
   locations with different confidence levels are displayed such that
   the distinction is subtle or easy to overlook - such as using fine
   graduations of color or transparency for graphical uncertainty
   regions, or displaying uncertainty graphically, but providing
   confidence as supplementary text - a user could fail to notice a
   difference in the quality of the location information that might be
   significant.

   Depending on the circumstances, different ways of handling confidence
   might be appropriate.  Section 5 describes techniques that could be
   appropriate for consumers that use automated processing.

   Providing that the full implications of any choice for the
   application are understood, some amount of automated processing could
   be appropriate.  In a simple example, applications could choose to
   discard or suppress the display of location information if confidence
   does not meet a pre-determined threshold.
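
   The threshold check described above could be sketched as follows.
   This is only an illustration: the function name, the representation
   of confidence as a number, None, or the string "unknown", and the
   choice to suppress locations with unknown confidence are all
   assumptions made for the example.  The 95% default for an absent
   confidence element follows Section 4.1.

```python
from typing import Optional, Union

# Confidence assumed when the confidence element is absent (Section 4.1).
DEFAULT_CONFIDENCE = 95.0


def meets_threshold(confidence: Optional[Union[float, str]],
                    threshold: float = 95.0) -> bool:
    """Return True if a location estimate may be displayed.

    confidence is a percentage, None when the confidence element is
    absent, or the string "unknown" (hypothetical representation).
    """
    if confidence is None:
        # No confidence element: the estimate is at 95% confidence.
        confidence = DEFAULT_CONFIDENCE
    if confidence == "unknown":
        # Assumption for this sketch: unknown confidence is suppressed.
        return False
    return float(confidence) >= threshold
```

   An application with a stricter or looser policy would pass a
   different threshold, or treat "unknown" differently.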

   In settings where there is an opportunity for user training, some of
   these problems might be mitigated by defining different operational
   procedures for handling location information at different confidence
   levels.

5.  Manipulation of Uncertainty

   This section deals with manipulation of location information that
   contains uncertainty.

   The following rules generally apply when manipulating location
   information:

   o  Where calculations are performed on coordinate information, these
      should be performed in Cartesian space and the results converted
      back to latitude, longitude and altitude.  A method for converting
      to and from Cartesian coordinates is included in Appendix A.

         While some approximation methods are useful in simplifying
         calculations, treating latitude and longitude as Cartesian axes
         is never advisable.  The two axes are not orthogonal.  Errors
         can arise from the curvature of the earth and from the
         convergence of longitude lines.

   o  Normal rounding rules do not apply when rounding uncertainty.
      When rounding, the region of uncertainty always increases (that
      is, errors are rounded up) and confidence is always rounded down
      (see [NIST.TN1297]).  This means that any manipulation of
      uncertainty is a non-reversible operation; each manipulation can
      result in the loss of some information.

5.1.  Reduction of a Location Estimate to a Point

   Manipulating location estimates that include uncertainty information
   requires additional complexity in systems.  In some cases, systems
   only operate on definitive values, that is, a single point.

   This section describes algorithms for reducing location estimates to
   a simple form without uncertainty information.  Having a consistent
   means for reducing location estimates allows for interaction between
   applications that are able to use uncertainty information and those
   that cannot.
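
   As an illustration of the rounding rule given at the start of
   Section 5 (uncertainty is rounded up, confidence is rounded down),
   the following minimal sketch uses decimal arithmetic.  The function
   names and the "places" parameter are assumptions made for the
   example and do not appear elsewhere in this document.

```python
from decimal import Decimal, ROUND_CEILING, ROUND_FLOOR


def round_uncertainty_up(value, places=0):
    """Round an uncertainty value up to the given number of decimal places."""
    quantum = Decimal(1).scaleb(-places)  # e.g. 0.01 for places=2
    rounded = Decimal(str(value)).quantize(quantum, rounding=ROUND_CEILING)
    return float(rounded)


def round_confidence_down(value, places=0):
    """Round a confidence percentage down to the given number of places."""
    quantum = Decimal(1).scaleb(-places)
    rounded = Decimal(str(value)).quantize(quantum, rounding=ROUND_FLOOR)
    return float(rounded)
```

   For instance, an uncertainty of 12.301 meters rounded to two places
   becomes 12.31, while a confidence of 94.99% rounded to a whole
   number becomes 94.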
617 Note: Reduction of a location estimate to a point constitutes a 618 reduction in information. Removing uncertainty information can 619 degrade results in some applications. Also, there is a natural 620 tendency to misinterpret a point location as representing a 621 location without uncertainty. This could lead to more serious 622 errors. Therefore, these algorithms should only be applied where 623 necessary. 625 Several different approaches can be taken when reducing a location 626 estimate to a point. Different methods each make a set of 627 assumptions about the properties of the PDF and the selected point; 628 no one method is more "correct" than any other. For any given region 629 of uncertainty, selecting an arbitrary point within the area could be 630 considered valid; however, given the aforementioned problems with 631 point locations, a more rigorous approach is appropriate. 633 Given a result with a known distribution, selecting the point within 634 the area that has the highest probability is a more rigorous method. 635 Alternatively, a point could be selected that minimizes the overall 636 error; that is, it minimizes the expected value of the difference 637 between the selected point and the "true" value. 639 If a rectangular distribution is assumed, the centroid of the area or 640 volume minimizes the overall error. Minimizing the error for a 641 normal distribution is mathematically complex. Therefore, this 642 document opts to select the centroid of the region of uncertainty 643 when selecting a point. 645 5.1.1. Centroid Calculation 647 For regular shapes, such as Circle, Sphere, Ellipse and Ellipsoid, 648 this approach equates to the center point of the region. For regions 649 of uncertainty that are expressed as regular Polygons and Prisms the 650 center point is also the most appropriate selection. 652 For the Arc-Band shape and non-regular Polygons and Prisms, selecting 653 the centroid of the area or volume minimizes the overall error. 
This 654 assumes that the PDF is rectangular. 656 Note: The centroid of a concave Polygon or Arc-Band shape is not 657 necessarily within the region of uncertainty. 659 5.1.1.1. Arc-Band Centroid 661 The centroid of the Arc-Band shape is found along a line that bisects 662 the arc. The centroid can be found at the following distance from 663 the starting point of the arc-band (assuming an arc-band with an 664 inner radius of "r", outer radius "R", start angle "a", and opening 665 angle "o"): 667 d = 4 * sin(o/2) * (R*R + R*r + r*r) / (3*o*(R + r)) 669 This point can be found along the line that bisects the arc; that is, 670 the line at an angle of "a + (o/2)". 672 5.1.1.2. Polygon Centroid 674 Calculating a centroid for the Polygon and Prism shapes is more 675 complex. Polygons that are specified using geodetic coordinates are 676 not necessarily coplanar. For Polygons that are specified without an 677 altitude, choose a value for altitude before attempting this process; 678 an altitude of 0 is acceptable. 680 The method described in this section is simplified by assuming 681 that the surface of the earth is locally flat. This method 682 degrades as polygons become larger; see [GeoShape] for 683 recommendations on polygon size. 685 The polygon is translated to a new coordinate system that has an x-y 686 plane roughly parallel to the polygon. This enables the elimination 687 of z-axis values and calculating a centroid can be done using only x 688 and y coordinates. This requires that the upward normal for the 689 polygon is known. 691 To translate the polygon coordinates, apply the process described in 692 Appendix B to find the normal vector "N = [Nx,Ny,Nz]". This value 693 should be made a unit vector to ensure that the transformation matrix 694 is a special orthogonal matrix. From this vector, select two vectors 695 that are perpendicular to this vector and combine these into a 696 transformation matrix. 
698 If "Nx" and "Ny" are non-zero, the matrices in Figure 3 can be used,
699 given "p = sqrt(Nx^2 + Ny^2)". More transformations are provided
700 later in this section for cases where "Nx" or "Ny" are zero.

702         [   -Ny/p      Nx/p     0  ]        [ -Ny/p  -Nx*Nz/p  Nx ]
703     T = [ -Nx*Nz/p  -Ny*Nz/p    p  ]   T' = [  Nx/p  -Ny*Nz/p  Ny ]
704         [    Nx        Ny       Nz ]        [   0       p      Nz ]
705          (Transform)                         (Reverse Transform)

707 Figure 3: Recommended Transformation Matrices

709 To apply a transform to each point in the polygon, form a matrix from
710 the ECEF coordinates and use matrix multiplication to determine the
711 translated coordinates.

713     [   -Ny/p      Nx/p     0  ]   [ x[1]  x[2]  x[3]  ...  x[n] ]
714     [ -Nx*Nz/p  -Ny*Nz/p    p  ] * [ y[1]  y[2]  y[3]  ...  y[n] ]
715     [    Nx        Ny       Nz ]   [ z[1]  z[2]  z[3]  ...  z[n] ]

717         [ x'[1]  x'[2]  x'[3]  ...  x'[n] ]
718       = [ y'[1]  y'[2]  y'[3]  ...  y'[n] ]
719         [ z'[1]  z'[2]  z'[3]  ...  z'[n] ]

721 Figure 4: Transformation

723 Alternatively, direct multiplication can be used to achieve the same
724 result:

726    x'[i] = -Ny * x[i] / p + Nx * y[i] / p

728    y'[i] = -Nx * Nz * x[i] / p - Ny * Nz * y[i] / p + p * z[i]

730    z'[i] = Nx * x[i] + Ny * y[i] + Nz * z[i]

732 The first and second rows of this matrix ("x'" and "y'") contain the
733 values that are used to calculate the centroid of the polygon. To
734 find the centroid of this polygon, first find the area using:

736    A = sum from i=1..n of (x'[i]*y'[i+1]-x'[i+1]*y'[i]) / 2

738 For these formulae, treat each set of coordinates as circular, that
739 is "x'[0] == x'[n]" and "x'[n+1] == x'[1]". Based on the area, the
740 centroid along each axis can be determined by:

742    Cx' = sum (x'[i]+x'[i+1]) * (x'[i]*y'[i+1]-x'[i+1]*y'[i]) / (6*A)

744    Cy' = sum (y'[i]+y'[i+1]) * (x'[i]*y'[i+1]-x'[i+1]*y'[i]) / (6*A)

746 Note: The formula for the area of a polygon will return a negative
747 value if the polygon is specified in clockwise direction. This
748 can be used to determine the orientation of the polygon.
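The area and centroid formulae above can be sketched in Python as follows. This is an illustration only: it assumes the points have already been transformed so that only "x'" and "y'" remain, and that the repeated closing point of the polygon has been removed. The function names are not taken from any specification.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def signed_area(pts: List[Point]) -> float:
    """Signed polygon area; negative when the points run clockwise."""
    n = len(pts)
    total = 0.0
    for i in range(n):
        x0, y0 = pts[i]
        x1, y1 = pts[(i + 1) % n]  # circular indexing, as in the text
        total += x0 * y1 - x1 * y0
    return total / 2.0


def centroid(pts: List[Point]) -> Point:
    """Centroid (Cx', Cy') of a simple polygon in the transformed plane."""
    area = signed_area(pts)
    if area == 0.0:
        raise ValueError("degenerate polygon: zero area")
    cx = cy = 0.0
    n = len(pts)
    for i in range(n):
        x0, y0 = pts[i]
        x1, y1 = pts[(i + 1) % n]
        cross = x0 * y1 - x1 * y0
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    return cx / (6.0 * area), cy / (6.0 * area)


square = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
```

Because the signed area appears in both the numerator terms and the divisor, the centroid is the same for either winding direction; only the sign of the area changes.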
750 The third row contains a distance from a plane parallel to the 751 polygon. If the polygon is coplanar, then the values for "z'" are 752 identical; however, the constraints recommended in [RFC5491] mean 753 that this is rarely the case. To determine "Cz'", average these 754 values: 756 Cz' = sum z'[i] / n 758 Once the centroid is known in the transformed coordinates, these can 759 be transformed back to the original coordinate system. The reverse 760 transformation is shown in Figure 5. 762 [ -Ny/p -Nx*Nz/p Nx ] [ Cx' ] [ Cx ] 763 [ Nx/p -Ny*Nz/p Ny ] * [ Cy' ] = [ Cy ] 764 [ 0 p Nz ] [ sum of z'[i] / n ] [ Cz ] 766 Figure 5: Reverse Transformation 768 The reverse transformation can be applied directly as follows: 770 Cx = -Ny * Cx' / p - Nx * Nz * Cy' / p + Nx * Cz' 772 Cy = Nx * Cx' / p - Ny * Nz * Cy' / p + Ny * Cz' 774 Cz = p * Cy' + Nz * Cz' 776 The ECEF value "[Cx,Cy,Cz]" can then be converted back to geodetic 777 coordinates. Given a polygon that is defined with no altitude or 778 equal altitudes for each point, the altitude of the result can either 779 be ignored or reset after converting back to a geodetic value. 781 The centroid of the Prism shape is found by finding the centroid of 782 the base polygon and raising the point by half the height of the 783 prism. This can be added to altitude of the final result; 784 alternatively, this can be added to "Cz'", which ensures that 785 negative height is correctly applied to polygons that are defined in 786 a "clockwise" direction. 788 The recommended transforms only apply if "Nx" and "Ny" are non-zero. 789 If the normal vector is "[0,0,1]" (that is, along the z-axis), then 790 no transform is necessary. Similarly, if the normal vector is 791 "[0,1,0]" or "[1,0,0]", avoid the transformation and use the x and z 792 coordinates or y and z coordinates (respectively) in the centroid 793 calculation phase. If either "Nx" or "Ny" are zero, the alternative 794 transform matrices in Figure 6 can be used. 
The reverse transform is
795 the transpose of this matrix.

797     if Nx == 0:                                | if Ny == 0:
798         [ 0  -Nz  Ny ]        [  0   1  0  ]   |            [ -Nz  0  Nx ]
799     T = [ 1   0   0  ]   T' = [ -Nz  0  Ny ]   |   T = T' = [  0   1  0  ]
800         [ 0   Ny  Nz ]        [  Ny  0  Nz ]   |            [  Nx  0  Nz ]

802 Figure 6: Alternative Transformation Matrices

804 5.2. Conversion to Circle or Sphere

806 The Circle and Sphere are simple shapes that suit a range of
807 applications. A circle or sphere contains fewer units of data to
808 manipulate, which simplifies operations on location estimates.

810 The simplest method for converting a location estimate to a Circle or
811 Sphere shape is to determine the centroid and then find the longest
812 distance from that point to any point in the region of uncertainty.
813 This distance can be determined based on the shape type:

815 Circle/Sphere: No conversion necessary.

817 Ellipse/Ellipsoid: The greater of the semi-major axis and the
818 altitude uncertainty.

820 Polygon/Prism: The distance to the furthest vertex of the polygon
821 (for a Prism, it is only necessary to check points on the base).

823 Arc-Band: The furthest length from the centroid to the points where
824 the inner and outer arc end. This distance can be calculated by
825 finding the larger of the following two values:

827    X = sqrt( d*d + R*R - 2*d*R*cos(o/2) )

829    x = sqrt( d*d + r*r - 2*d*r*cos(o/2) )

831 Once the Circle or Sphere shape is found, the associated confidence
832 can be increased if the result is known to follow a normal
833 distribution. However, this is a complicated process and provides
834 limited benefit. In many cases it also violates the constraint that
835 confidence in each dimension be the same. Confidence should be
836 unchanged when performing this conversion.

838 Two-dimensional shapes are converted to a Circle; three-dimensional
839 shapes are converted to a Sphere.

841 5.3.
Three-Dimensional to Two-Dimensional Conversion 843 A three-dimensional shape can be easily converted to a two- 844 dimensional shape by removing the altitude component. A sphere 845 becomes a circle; a prism becomes a polygon; an ellipsoid becomes an 846 ellipse. Each conversion is simple, requiring only the removal of 847 those elements relating to altitude. 849 The altitude is unspecified for a two-dimensional shape and therefore 850 has unlimited uncertainty along the vertical axis. The confidence 851 for the two-dimensional shape is thus higher than the three- 852 dimensional shape. Assuming equal confidence on each axis, the 853 confidence of the circle can be increased using the following 854 approximate formula: 856 C[2d] >= C[3d] ^ (2/3) 858 "C[2d]" is the confidence of the two-dimensional shape and "C[3d]" is 859 the confidence of the three-dimensional shape. For example, a Sphere 860 with a confidence of 95% can be simplified to a Circle of equal 861 radius with confidence of 96.6%. 863 5.4. Increasing and Decreasing Uncertainty and Confidence 865 The combination of uncertainty and confidence provide a great deal of 866 information about the nature of the data that is being measured. If 867 uncertainty, confidence and PDF are known, certain information can be 868 extrapolated. In particular, the uncertainty can be scaled to meet a 869 desired confidence or the confidence for a particular region of 870 uncertainty can be found. 872 In general, confidence decreases as the region of uncertainty 873 decreases in size and confidence increases as the region of 874 uncertainty increases in size. However, this depends on the PDF; 875 expanding the region of uncertainty for a rectangular distribution 876 has no effect on confidence without additional information. If the 877 region of uncertainty is increased during the process of obfuscation 878 (see [RFC6772]), then the confidence cannot be increased. 
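The approximate formula in Section 5.3 is simple enough to check numerically. The sketch below treats confidence as a fraction rather than a percentage; the function name is an assumption made for illustration.

```python
def confidence_2d(c3d: float) -> float:
    """Lower bound on two-dimensional confidence, C[2d] >= C[3d] ^ (2/3).

    c3d is the three-dimensional confidence as a fraction in (0, 1],
    and equal confidence on each axis is assumed, as in the text.
    """
    return c3d ** (2.0 / 3.0)


c2d = confidence_2d(0.95)
```

For a Sphere at 95% confidence, this evaluates to a little over 0.966, which is consistent with the 96.6% quoted in Section 5.3 once confidence is rounded down.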
880 A region of uncertainty that is reduced in size always has a lower 881 confidence. 883 A region of uncertainty that has an unknown PDF shape cannot be 884 reduced in size reliably. The region of uncertainty can be expanded, 885 but only if confidence is not increased. 887 This section makes the simplifying assumption that location 888 information is symmetrically and evenly distributed in each 889 dimension. This is not necessarily true in practice. If better 890 information is available, alternative methods might produce better 891 results. 893 5.4.1. Rectangular Distributions 895 Uncertainty that follows a rectangular distribution can only be 896 decreased in size. Increasing uncertainty has no value, since it has 897 no effect on confidence. Since the PDF is constant over the region 898 of uncertainty, the resulting confidence is determined by the 899 following formula: 901 Cr = Co * Ur / Uo 903 Where "Uo" and "Ur" are the sizes of the original and reduced regions 904 of uncertainty (either the area or the volume of the region); "Co" 905 and "Cr" are the confidence values associated with each region. 907 Information is lost by decreasing the region of uncertainty for a 908 rectangular distribution. Once reduced in size, the uncertainty 909 region cannot subsequently be increased in size. 911 5.4.2. Normal Distributions 913 Uncertainty and confidence can be both increased and decreased for a 914 normal distribution. This calculation depends on the number of 915 dimensions of the uncertainty region. 917 For a normal distribution, uncertainty and confidence are related to 918 the standard deviation of the function. The following function 919 defines the relationship between standard deviation, uncertainty, and 920 confidence along a single axis: 922 S[x] = U[x] / ( sqrt(2) * erfinv(C[x]) ) 924 Where "S[x]" is the standard deviation, "U[x]" is the uncertainty, 925 and "C[x]" is the confidence along a single axis. "erfinv" is the 926 inverse error function. 
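Many standard libraries do not provide "erfinv" directly. The Python sketch below derives it from the quantile function of the standard normal distribution and then applies the single-axis relationship above. It is an illustration under the assumption that confidence is given as a fraction strictly between 0 and 1; the function names are not part of this document.

```python
from math import sqrt
from statistics import NormalDist


def erfinv(y: float) -> float:
    """Inverse error function for 0 <= y < 1.

    Uses the identity erf(x) = 2 * Phi(x * sqrt(2)) - 1, where Phi is
    the standard normal cumulative distribution function.
    """
    return NormalDist().inv_cdf((1.0 + y) / 2.0) / sqrt(2.0)


def std_dev(uncertainty: float, confidence: float) -> float:
    """S[x] = U[x] / (sqrt(2) * erfinv(C[x])) along a single axis."""
    return uncertainty / (sqrt(2.0) * erfinv(confidence))


# An uncertainty of 10 at roughly 68.27% confidence is one standard deviation.
s = std_dev(10.0, 0.6827)
```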
928 Scaling a normal distribution in two dimensions requires several 929 assumptions. Firstly, it is assumed that the distribution along each 930 axis is independent. Secondly, the confidence for each axis is 931 assumed to be the same. Therefore, the confidence along each axis 932 can be assumed to be: 934 C[x] = Co ^ (1/n) 936 Where "C[x]" is the confidence along a single axis and "Co" is the 937 overall confidence and "n" is the number of dimensions in the 938 uncertainty. 940 Therefore, to find the uncertainty for each axis at a desired 941 confidence, "Cd", apply the following formula: 943 Ud[x] <= U[x] * (erfinv(Cd ^ (1/n)) / erfinv(Co ^ (1/n))) 945 For regular shapes, this formula can be applied as a scaling factor 946 in each dimension to reach a required confidence. 948 5.5. Determining Whether a Location is Within a Given Region 950 A number of applications require that a judgment be made about 951 whether a Target is within a given region of interest. Given a 952 location estimate with uncertainty, this judgment can be difficult. 953 A location estimate represents a probability distribution, and the 954 true location of the Target cannot be definitively known. Therefore, 955 the judgment relies on determining the probability that the Target is 956 within the region. 958 The probability that the Target is within a particular region is 959 found by integrating the PDF over the region. For a normal 960 distribution, there are no analytical methods that can be used to 961 determine the integral of the two or three dimensional PDF over an 962 arbitrary region. The complexity of numerical methods is also too 963 great to be useful in many applications; for example, finding the 964 integral of the PDF in two or three dimensions across the overlap 965 between the uncertainty region and the target region. If the PDF is 966 unknown, no determination can be made without a simplifying 967 assumption. 
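Where the PDF is known to be normal, the scaling formula from Section 5.4.2 can be used to bring an estimate to a common confidence before it is compared with a region. The following Python sketch computes the per-axis scaling factor under the same assumptions as that section (independent axes, equal confidence per axis). The function name and the use of fractional confidence values are illustrative assumptions.

```python
from statistics import NormalDist


def scale_factor(c_orig: float, c_desired: float, dims: int) -> float:
    """Per-axis factor erfinv(Cd ^ (1/n)) / erfinv(Co ^ (1/n)).

    Confidences are fractions strictly between 0 and 1. The 1/sqrt(2)
    factor in erfinv cancels in the ratio, so the standard normal
    quantile function is used directly.
    """
    quantile = NormalDist().inv_cdf

    def per_axis(c: float) -> float:
        # erfinv(y) is proportional to quantile((1 + y) / 2)
        return quantile((1.0 + c ** (1.0 / dims)) / 2.0)

    return per_axis(c_desired) / per_axis(c_orig)


# Three-dimensional estimate scaled from 19% to 95% confidence.
scale = scale_factor(0.19, 0.95, 3)
```

With these inputs the result should come out close to the factor of 2.9937 used in the example in Section 6.2. Each axis of a regular shape is multiplied by this factor, and the results are rounded up so that uncertainty never decreases through rounding.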
969 When judging whether a location is within a given region, this 970 document assumes that uncertainties are rectangular. This introduces 971 errors, but simplifies the calculations significantly. Prior to 972 applying this assumption, confidence should be scaled to 95%. 974 Note: The selection of confidence has a significant impact on the 975 final result. Only use a different confidence if an uncertainty 976 value for 95% confidence cannot be found. 978 Given the assumption of a rectangular distribution, the probability 979 that a Target is found within a given region is found by first 980 finding the area (or volume) of overlap between the uncertainty 981 region and the region of interest. This is multiplied by the 982 confidence of the location estimate to determine the probability. 983 Figure 7 shows an example of finding the area of overlap between the 984 region of uncertainty and the region of interest. 986 _.-""""-._ 987 .' `. _ Region of 988 / \ / Uncertainty 989 ..+-"""--.. | 990 .-' | :::::: `-. | 991 ,' | :: Ao ::: `. | 992 / \ :::::::::: \ / 993 / `._ :::::: _.X 994 | `-....-' | 995 | | 996 | | 997 \ / 998 `. .' \_ Region of 999 `._ _.' Interest 1000 `--..___..--' 1002 Figure 7: Area of Overlap Between Two Circular Regions 1004 Once the area of overlap, "Ao", is known, the probability that the 1005 Target is within the region of interest, "Pi", is: 1007 Pi = Co * Ao / Au 1009 Given that the area of the region of uncertainty is "Au" and the 1010 confidence is "Co". 1012 This probability is often input to a decision process that has a 1013 limited set of outcomes; therefore, a threshold value needs to be 1014 selected. Depending on the application, different threshold 1015 probabilities might be selected. In the absence of specific 1016 recommendations, this document suggests that the probability be 1017 greater than 50% before a decision is made. 
If the decision process 1018 selects between two or more regions, as is required by [RFC5222], 1019 then the region with the highest probability can be selected. 1021 5.5.1. Determining the Area of Overlap for Two Circles 1023 Determining the area of overlap between two arbitrary shapes is a 1024 non-trivial process. Reducing areas to circles (see Section 5.2) 1025 enables the application of the following process. 1027 Given the radius of the first circle "r", the radius of the second 1028 circle "R" and the distance between their center points "d", the 1029 following set of formulas provide the area of overlap "Ao". 1031 o If the circles don't overlap, that is "d >= r+R", "Ao" is zero. 1033 o If one of the two circles is entirely within the other, that is 1034 "d <= |r-R|", the area of overlap is the area of the smaller 1035 circle. 1037 o Otherwise, if the circles partially overlap, that is "d < r+R" and 1038 "d > |r-R|", find "Ao" using: 1040 a = (r^2 - R^2 + d^2)/(2*d) 1042 Ao = r^2*acos(a/r) + R^2*acos((d - a)/R) - d*sqrt(r^2 - a^2) 1044 A value for "d" can be determined by converting the center points to 1045 Cartesian coordinates and calculating the distance between the two 1046 center points: 1048 d = sqrt((x1-x2)^2 + (y1-y2)^2 + (z1-z2)^2) 1050 5.5.2. Determining the Area of Overlap for Two Polygons 1052 A calculation of overlap based on polygons can give better results 1053 than the circle-based method. However, efficient calculation of 1054 overlapping area is non-trivial. Algorithms such as Vatti's clipping 1055 algorithm [Vatti92] can be used. 1057 For large polygonal areas, it might be that geodesic interpolation is 1058 used. In these cases, altitude is also frequently omitted in 1059 describing the polygon. For such shapes, a planar projection can 1060 still give a good approximation of the area of overlap if the larger 1061 area polygon is projected onto the local tangent plane of the 1062 smaller. 
This is only possible if the only area of interest is that
1063 contained within the smaller polygon. Where the entire area of the
1064 larger polygon is of interest, geodesic interpolation is necessary.

1066 6. Examples

1068 This section presents some examples of how to apply the methods
1069 described in Section 5.

1071 6.1. Reduction to a Point or Circle

1073 Alice receives a location estimate from her LIS that contains an
1074 ellipsoidal region of uncertainty. This information is provided at
1075 19% confidence with a normal PDF. A PIDF-LO extract for this
1076 information is shown in Figure 8.

1078 [PIDF-LO Ellipsoid extract; XML markup not preserved. Surviving
1079 values:]
1081    -34.407242 150.882518 34
1083    7.7156
1086    3.31
1089    28.7
1092    43
1095    95

1100 Figure 8

1102 This information can be reduced to a point simply by extracting the
1103 center point, that is [-34.407242, 150.882518, 34].

1105 If some limited uncertainty were required, the estimate could be
1106 converted into a circle or sphere. To convert to a sphere, the
1107 radius is the largest of the semi-major, semi-minor and vertical
1108 axes; in this case, 28.7 meters.

1110 However, if only a circle is required, the altitude can be dropped as
1111 can the altitude uncertainty (the vertical axis of the ellipsoid),
1112 resulting in a circle at [-34.407242, 150.882518] of radius 7.7156
1113 meters.

1115 Bob receives a location estimate with a Polygon shape (which roughly
1116 corresponds to the location of the Sydney Opera House). This
1117 information is shown in Figure 9.

1119 [PIDF-LO Polygon extract; XML markup not preserved. Surviving
1120 position list:]
1123    -33.856625 151.215906 -33.856299 151.215343
1124    -33.856326 151.214731 -33.857533 151.214495
1125    -33.857720 151.214613 -33.857369 151.215375
1126    -33.856625 151.215906

1132 Figure 9

1134 To convert this to a point, each vertex is first assigned an
1135 altitude of zero and converted to ECEF coordinates (see Appendix A).
1136 Then a normal vector for this polygon is found (see Appendix B). The 1137 result of each of these stages is shown in Figure 10. Note that the 1138 numbers shown in this document are rounded only for formatting 1139 reasons; the actual calculations do not include rounding, which would 1140 generate significant errors in the final values. 1142 Polygon in ECEF coordinate space 1143 (repeated point omitted and transposed to fit): 1144 [ -4.6470e+06 2.5530e+06 -3.5333e+06 ] 1145 [ -4.6470e+06 2.5531e+06 -3.5332e+06 ] 1146 pecef = [ -4.6470e+06 2.5531e+06 -3.5332e+06 ] 1147 [ -4.6469e+06 2.5531e+06 -3.5333e+06 ] 1148 [ -4.6469e+06 2.5531e+06 -3.5334e+06 ] 1149 [ -4.6469e+06 2.5531e+06 -3.5333e+06 ] 1151 Normal Vector: n = [ -0.72782 0.39987 -0.55712 ] 1153 Transformation Matrix: 1154 [ -0.48152 -0.87643 0.00000 ] 1155 t = [ -0.48828 0.26827 0.83043 ] 1156 [ -0.72782 0.39987 -0.55712 ] 1158 Transformed Coordinates: 1159 [ 8.3206e+01 1.9809e+04 6.3715e+06 ] 1160 [ 3.1107e+01 1.9845e+04 6.3715e+06 ] 1161 pecef' = [ -2.5528e+01 1.9842e+04 6.3715e+06 ] 1162 [ -4.7367e+01 1.9708e+04 6.3715e+06 ] 1163 [ -3.6447e+01 1.9687e+04 6.3715e+06 ] 1164 [ 3.4068e+01 1.9726e+04 6.3715e+06 ] 1166 Two dimensional polygon area: A = 12600 m^2 1167 Two-dimensional polygon centroid: C' = [ 8.8184e+00 1.9775e+04 ] 1169 Average of pecef' z coordinates: 6.3715e+06 1171 Reverse Transformation Matrix: 1172 [ -0.48152 -0.48828 -0.72782 ] 1173 t' = [ -0.87643 0.26827 0.39987 ] 1174 [ 0.00000 0.83043 -0.55712 ] 1176 Polygon centroid (ECEF): C = [ -4.6470e+06 2.5531e+06 -3.5333e+06 ] 1177 Polygon centroid (Geo): Cg = [ -33.856926 151.215102 -4.9537e-04 ] 1179 Figure 10 1181 The point conversion for the polygon uses the final result, "Cg", 1182 ignoring the altitude since the original shape did not include 1183 altitude. 1185 To convert this to a circle, take the maximum distance in ECEF 1186 coordinates from the center point to each of the points. This 1187 results in a radius of 99.1 meters. 
Confidence is unchanged. 1189 6.2. Increasing and Decreasing Confidence 1191 Assume that confidence is known to be 19% for Alice's location 1192 information. This is a typical value for a three-dimensional 1193 ellipsoid uncertainty of normal distribution where the standard 1194 deviation is used directly for uncertainty in each dimension. The 1195 confidence associated with Alice's location estimate is quite low for 1196 many applications. Since the estimate is known to follow a normal 1197 distribution, the method in Section 5.4.2 can be used. Each axis can 1198 be scaled by: 1200 scale = erfinv(0.95^(1/3)) / erfinv(0.19^(1/3)) = 2.9937 1202 Ensuring that rounding always increases uncertainty, the location 1203 estimate at 95% includes a semi-major axis of 23.1, a semi-minor axis 1204 of 10 and a vertical axis of 86. 1206 Bob's location estimate (from the previous example) covers an area of 1207 approximately 12600 square meters. If the estimate follows a 1208 rectangular distribution, the region of uncertainty can be reduced in 1209 size. Here we find the confidence that Bob is within the smaller 1210 area of the Concert Hall. For the Concert Hall, the polygon 1211 [-33.856473, 151.215257; -33.856322, 151.214973; 1212 -33.856424, 151.21471; -33.857248, 151.214753; 1213 -33.857413, 151.214941; -33.857311, 151.215128] is used. To use this 1214 new region of uncertainty, find its area using the same translation 1215 method described in Section 5.1.1.2, which produces 4566.2 square 1216 meters. Given that the Concert Hall is entirely within Bob's 1217 original location estimate, the confidence associated with the 1218 smaller area is therefore 95% * 4566.2 / 12600 = 34%. 1220 6.3. Matching Location Estimates to Regions of Interest 1222 Suppose that a circular area is defined centered at 1223 [-33.872754, 151.20683] with a radius of 1950 meters. 
To determine
1224 whether Bob is found within this area - given that Bob is at
1225 [-33.856926, 151.215102] with an uncertainty radius of 99.1 meters -
1226 we apply the method in Section 5.5. Using the converted Circle shape
1227 for Bob's location, the distance between these points is found to be
1228 1915.26 meters. The area of overlap between Bob's location estimate
1229 and the region of interest is therefore approximately 22000 square
1230 meters and the area of Bob's location estimate is 30853 square
1231 meters. This gives the estimated probability that Bob is less than
1232 1950 meters from the selected point as 67.8%.

1234 Note that if 1920 meters were chosen for the distance from the
1235 selected point, the area of overlap is only 16196 square meters and
1236 the estimated probability is 49.8%. Therefore, it is marginally more
1237 likely that Bob is outside the region of interest, despite the center
1238 point of his location estimate being within the region.

1240 6.4. PIDF-LO With Confidence Example

1242 The PIDF-LO document in Figure 11 includes a representation of
1243 uncertainty as a circular area. The confidence element (on the line
1244 marked with a comment) indicates that the confidence is 67% and that
1245 it follows a normal distribution.

1247 [PIDF-LO document; XML markup not preserved. Surviving values:]
1260    42.5463 -73.2512
1262    850.24
1265    67
1270    mac:010203040506

1274 Figure 11: Example PIDF-LO with Confidence

1276 7. Confidence Schema

1278 [XML schema; markup not preserved. Surviving annotation text:]
1289    PIDF-LO Confidence
1294    This schema defines an element that is used for indicating
1295    confidence in PIDF-LO documents.

1336 8. IANA Considerations

1338 8.1.
URN Sub-Namespace Registration for 1339 urn:ietf:params:xml:ns:geopriv:conf 1341 This section registers a new XML namespace, 1342 "urn:ietf:params:xml:ns:geopriv:conf", as per the guidelines in 1343 [RFC3688]. 1345 URI: urn:ietf:params:xml:ns:geopriv:conf 1347 Registrant Contact: IETF, GEOPRIV working group, (geopriv@ietf.org), 1348 Martin Thomson (martin.thomson@gmail.com). 1350 XML: 1352 BEGIN 1353 1354 1356 1357 1358 PIDF-LO Confidence Attribute 1359 1360 1361

Namespace for PIDF-LO Confidence Attribute

1362

urn:ietf:params:xml:ns:geopriv:conf

1363 [[NOTE TO IANA/RFC-EDITOR: Please update RFC URL and replace XXXX 1364 with the RFC number for this specification.]] 1365

See RFCXXXX.

1366 1367 1368 END 1370 8.2. XML Schema Registration 1372 This section registers an XML schema as per the guidelines in 1373 [RFC3688]. 1375 URI: urn:ietf:params:xml:schema:geopriv:conf 1377 Registrant Contact: IETF, GEOPRIV working group, (geopriv@ietf.org), 1378 Martin Thomson (martin.thomson@gmail.com). 1380 Schema: The XML for this schema can be found as the entirety of 1381 Section 7 of this document. 1383 9. Security Considerations 1385 This document describes methods for managing and manipulating 1386 uncertainty in location. No specific security concerns arise from 1387 most of the information provided. The considerations of [RFC4119] 1388 all apply. 1390 Providing uncertainty and confidence information can reveal 1391 information about the process by which location information is 1392 generated. For instance, it might reveal information that could be 1393 used to infer that a user is using a mobile device with a GPS, or 1394 that a user is acquiring location information from a particular 1395 network-based service. A Rule Maker might choose to remove 1396 uncertainty-related fields from a location object in order to protect 1397 this information; though it is noted that this information might not 1398 be perfectly protected due to difficulties associated with location 1399 obfuscation, as described in Section 13.5 of [RFC6772]. 1401 Adding confidence to location information risks misinterpretation by 1402 consumers of location that do not understand the element. This could 1403 be exploited, particularly when reducing confidence, since the 1404 resulting uncertainty region might include locations that are less 1405 likely to contain the target than the recipient expects. Since this 1406 sort of error is always a possibility, the impact of this is low. 1408 10. Acknowledgements 1410 Peter Rhodes provided assistance with some of the mathematical 1411 groundwork on this document. Dan Cornford provided a detailed review 1412 and many terminology corrections. 
1414 11. References 1416 11.1. Normative References 1418 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 1419 Requirement Levels", BCP 14, RFC 2119, March 1997. 1421 [RFC3688] Mealling, M., "The IETF XML Registry", BCP 81, RFC 3688, 1422 January 2004. 1424 [RFC3693] Cuellar, J., Morris, J., Mulligan, D., Peterson, J., and 1425 J. Polk, "Geopriv Requirements", RFC 3693, February 2004. 1427 [RFC4119] Peterson, J., "A Presence-based GEOPRIV Location Object 1428 Format", RFC 4119, December 2005. 1430 [RFC5139] Thomson, M. and J. Winterbottom, "Revised Civic Location 1431 Format for Presence Information Data Format Location 1432 Object (PIDF-LO)", RFC 5139, February 2008. 1434 [RFC5491] Winterbottom, J., Thomson, M., and H. Tschofenig, "GEOPRIV 1435 Presence Information Data Format Location Object (PIDF-LO) 1436 Usage Clarification, Considerations, and Recommendations", 1437 RFC 5491, March 2009. 1439 [RFC6225] Polk, J., Linsner, M., Thomson, M., and B. Aboba, "Dynamic 1440 Host Configuration Protocol Options for Coordinate-Based 1441 Location Configuration Information", RFC 6225, July 2011. 1443 [RFC6280] Barnes, R., Lepinski, M., Cooper, A., Morris, J., 1444 Tschofenig, H., and H. Schulzrinne, "An Architecture for 1445 Location and Location Privacy in Internet Applications", 1446 BCP 160, RFC 6280, July 2011. 1448 11.2. Informative References 1450 [Convert] Burtch, R., "A Comparison of Methods Used in Rectangular 1451 to Geodetic Coordinate Transformations", April 2006. 1453 [GeoShape] 1454 Thomson, M. and C. Reed, "GML 3.1.1 PIDF-LO Shape 1455 Application Schema for use by the Internet Engineering 1456 Task Force (IETF)", Candidate OpenGIS Implementation 1457 Specification 06-142r1, Version: 1.0, April 2007. 1459 [ISO.GUM] ISO/IEC, "Guide to the expression of uncertainty in 1460 measurement (GUM)", Guide 98:1995, 1995. 1462 [NIST.TN1297] 1463 Taylor, B. and C. 
Kuyatt, "Guidelines for Evaluating and 1464 Expressing the Uncertainty of NIST Measurement Results", 1465 Technical Note 1297, Sep 1994. 1467 [RFC5222] Hardie, T., Newton, A., Schulzrinne, H., and H. 1468 Tschofenig, "LoST: A Location-to-Service Translation 1469 Protocol", RFC 5222, August 2008. 1471 [RFC6772] Schulzrinne, H., Tschofenig, H., Cuellar, J., Polk, J., 1472 Morris, J., and M. Thomson, "Geolocation Policy: A 1473 Document Format for Expressing Privacy Preferences for 1474 Location Information", RFC 6772, January 2013. 1476 [Sunday02] 1477 Sunday, D., "Fast polygon area and Newell normal 1478 computation", Journal of Graphics Tools JGT, 1479 7(2):9-13,2002, 2002, 1480 . 1482 [TS-3GPP-23_032] 1483 3GPP, "Universal Geographic Area Description (GAD)", 3GPP 1484 TS 23.032 11.0.0, September 2012. 1486 [Vatti92] Vatti, B., "A generic solution to polygon clipping", 1487 Communications of the ACM Vol35, Issue7, pp56-63, 1992, 1488 . 1490 [WGS84] US National Imagery and Mapping Agency, "Department of 1491 Defense (DoD) World Geodetic System 1984 (WGS 84), Third 1492 Edition", NIMA TR8350.2, January 2000. 1494 Appendix A. Conversion Between Cartesian and Geodetic Coordinates in 1495 WGS84 1497 The process of conversion from geodetic (latitude, longitude and 1498 altitude) to earth-centered, earth-fixed (ECEF) Cartesian coordinates 1499 is relatively simple. 
1501     In this section, the following constants and derived values are used
1502     from the definition of WGS84 [WGS84]:

1504        {radius of ellipsoid} R = 6378137 meters

1506        {inverse flattening} 1/f = 298.257223563

1508        {first eccentricity squared} e^2 = f * (2 - f)

1510        {second eccentricity squared} e'^2 = e^2 / (1 - e^2)

1512     To convert geodetic coordinates (latitude, longitude, altitude) to
1513     ECEF coordinates (X, Y, Z), use the following relationships:

1515        N = R / sqrt(1 - e^2 * sin(latitude)^2)

1517        X = (N + altitude) * cos(latitude) * cos(longitude)

1519        Y = (N + altitude) * cos(latitude) * sin(longitude)

1521        Z = (N*(1 - e^2) + altitude) * sin(latitude)

1523     The reverse conversion requires more complex computation, and most
1524     methods introduce some error in latitude and altitude.  A range of
1525     techniques is described in [Convert].  A variant on the method
1526     originally proposed by Bowring, which results in an acceptably small
1527     error, is described by the following:

1529        p = sqrt(X^2 + Y^2)

1531        r = sqrt(X^2 + Y^2 + Z^2)

1533        u = atan((1-f) * Z * (1 + e'^2 * (1-f) * R / r) / p)

1535        latitude = atan((Z + e'^2 * (1-f) * R * sin(u)^3)
1536                        / (p - e^2 * R * cos(u)^3))

1538        longitude = atan2(Y, X)

1540        altitude = sqrt((p - R * cos(u))^2 + (Z - (1-f) * R * sin(u))^2)

1542     If the point is near the poles, that is "p < 1", the value for
1543     altitude that this method produces is unstable.  A simpler method for
1544     determining the altitude of a point near the poles is:

1546        altitude = |Z| - R * (1 - f)

1548  Appendix B.  Calculating the Upward Normal of a Polygon

1550     For a polygon that is guaranteed to be convex and coplanar, the
1551     upward normal can be found by finding the vector cross product of
1552     adjacent edges.

1554     For more general cases, the Newell method of approximation described
1555     in [Sunday02] may be applied.  In particular, this method can be used
1556     if the points are only approximately coplanar, and for non-convex
1557     polygons.
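   The conversions in Appendix A can be sketched in Python as follows.
   This is an illustrative sketch rather than a reference
   implementation.  The function and constant names are hypothetical.
   The sketch assumes angles supplied in degrees, assumes the standard
   definition of the second eccentricity squared, e^2 / (1 - e^2), and
   uses atan2 in place of atan so that "p = 0" does not cause a
   division by zero.

```python
import math

# WGS84 constants and derived values, as listed in Appendix A.
R = 6378137.0              # radius of ellipsoid (semi-major axis), meters
F = 1.0 / 298.257223563    # flattening
E2 = F * (2.0 - F)         # first eccentricity squared
# Assumption: standard definition of the second eccentricity squared.
EP2 = E2 / (1.0 - E2)


def geodetic_to_ecef(lat_deg, lon_deg, alt=0.0):
    """Convert geodetic degrees and meters to ECEF (X, Y, Z) in meters."""
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    n = R / math.sqrt(1.0 - E2 * math.sin(lat) ** 2)
    x = (n + alt) * math.cos(lat) * math.cos(lon)
    y = (n + alt) * math.cos(lat) * math.sin(lon)
    z = (n * (1.0 - E2) + alt) * math.sin(lat)
    return x, y, z


def ecef_to_geodetic(x, y, z):
    """Approximate reverse conversion using the Bowring variant above.

    Returns (latitude, longitude) in degrees and altitude in meters.
    The point must not be the center of the Earth (r would be zero).
    """
    p = math.hypot(x, y)
    r = math.sqrt(x * x + y * y + z * z)
    # atan2(a, b) equals atan(a / b) for b > 0 and tolerates b == 0.
    u = math.atan2((1.0 - F) * z * (1.0 + EP2 * (1.0 - F) * R / r), p)
    lat = math.atan2(z + EP2 * (1.0 - F) * R * math.sin(u) ** 3,
                     p - E2 * R * math.cos(u) ** 3)
    lon = math.atan2(y, x)
    if p < 1.0:
        # Near the poles, use the simpler altitude formula.
        alt = abs(z) - R * (1.0 - F)
    else:
        # As written in the text, this is a non-negative distance.
        alt = math.sqrt((p - R * math.cos(u)) ** 2
                        + (z - (1.0 - F) * R * math.sin(u)) ** 2)
    return math.degrees(lat), math.degrees(lon), alt
```

   For example, "geodetic_to_ecef(0.0, 0.0, 0.0)" places the point on
   the X axis at a distance of R from the center of the Earth.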
1559     Calculating the upward normal requires a Cartesian coordinate
1560     system.  Therefore, convert the geodetic coordinates of the polygon
1561     to Cartesian, ECEF coordinates (Appendix A).  If no altitude is
1562     specified, assume an altitude of zero.

1564     This method can be condensed to the following set of equations:

1566        Nx = sum from i=1..n of (y[i] * (z[i+1] - z[i-1]))

1568        Ny = sum from i=1..n of (z[i] * (x[i+1] - x[i-1]))

1570        Nz = sum from i=1..n of (x[i] * (y[i+1] - y[i-1]))

1572     For these formulae, the polygon is made of points
1573     "(x[1], y[1], z[1])" through "(x[n], y[n], z[n])".  Each array is
1574     treated as circular, that is, "x[0] == x[n]" and "x[n+1] == x[1]".

1576     To convert this into a unit vector, divide each component by the
1577     length of the vector:

1579        Nx' = Nx / sqrt(Nx^2 + Ny^2 + Nz^2)

1581        Ny' = Ny / sqrt(Nx^2 + Ny^2 + Nz^2)

1583        Nz' = Nz / sqrt(Nx^2 + Ny^2 + Nz^2)

1585  B.1.  Checking that a Polygon Upward Normal Points Up

1587     RFC 5491 [RFC5491] stipulates that polygons be presented in anti-
1588     clockwise direction so that the upward normal is in an upward
1589     direction.  Accidental reversal of points can invert this vector.
1590     This error can be hard to detect just by looking at the series of
1591     coordinates that form the polygon.

1593     Calculate the dot product of the upward normal of the polygon
1594     (Appendix B) and any vector that points away from the center of the
1595     Earth from the location of the polygon.  If this product is
1596     positive, then the polygon upward normal also points away from the
1597     center of the Earth.

1599     If both vectors are unit vectors, the inverse cosine of this product
1600     gives the angle between the horizontal plane and the approximate
            plane of the polygon.
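   The Newell sums, the normalization step, and the dot product check
   can be sketched in Python as follows.  This is an illustrative
   sketch, not a reference implementation.  The function names are
   hypothetical.  The sketch assumes that the points are already
   Cartesian (for example, ECEF) tuples in polygon order, and that the
   caller supplies a unit vector "up" that points away from the center
   of the Earth.

```python
import math


def upward_normal(points):
    """Unit upward normal of a polygon given as (x, y, z) tuples.

    Each coordinate array is treated as circular, as in Appendix B.
    """
    n = len(points)
    nx = ny = nz = 0.0
    for i in range(n):
        x, y, z = points[i]
        xp, yp, zp = points[i - 1]          # previous point (wraps around)
        xn, yn, zn = points[(i + 1) % n]    # next point (wraps around)
        nx += y * (zn - zp)
        ny += z * (xn - xp)
        nz += x * (yn - yp)
    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    if length == 0.0:
        raise ValueError("degenerate polygon: the normal has zero length")
    return nx / length, ny / length, nz / length


def points_up(normal, up):
    """True if the dot product of the normal and the up vector is positive."""
    return sum(a * b for a, b in zip(normal, up)) > 0.0


def tilt_degrees(normal, up):
    """Angle between the horizontal plane and the plane of the polygon.

    Both arguments must be unit vectors.  The dot product is clamped to
    [-1, 1] to guard against rounding error before taking acos.
    """
    dot = sum(a * b for a, b in zip(normal, up))
    return math.degrees(math.acos(max(-1.0, min(1.0, dot))))
```

   For a unit square listed anti-clockwise in the x-y plane, the sketch
   yields a normal along the positive z axis.  Reversing the order of
   the points inverts the normal, which "points_up" then reports as
   false for an up vector along the positive z axis.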
1602     A unit vector for the upward direction at any point can be found
1603     based on the latitude (lat) and longitude (lng) of the point, as
1604     follows:

1606        Up = [ cos(lat) * cos(lng) ; cos(lat) * sin(lng) ; sin(lat) ]

1608     For polygons that span less than half the globe, any point in the
1609     polygon - including the centroid - can be selected to generate an
1610     approximate up vector for comparison with the upward normal.

1612  Authors' Addresses
1613     Martin Thomson
1614     Mozilla
1615     331 E Evelyn Street
1616     Mountain View, CA  94041
1617     US

1619     Email: martin.thomson@gmail.com

1621     James Winterbottom
1622     Unaffiliated
1623     AU

1625     Email: a.james.winterbottom@gmail.com