idnits 2.17.1 

draft-ietf-geopriv-uncertainty-02.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (August 14, 2014) is 3542 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Looks like a reference, but probably isn't: '1' on line 1553

  -- Looks like a reference, but probably isn't: '2' on line 710

  -- Looks like a reference, but probably isn't: '3' on line 710

  == Missing Reference: '2d' is mentioned on line 851, but not defined

  == Missing Reference: '3d' is mentioned on line 851, but not defined

  -- Looks like a reference, but probably isn't: '0' on line 1553

  ** Downref: Normative reference to an Informational RFC: RFC 3693


     Summary: 1 error (**), 0 flaws (~~), 3 warnings (==), 5 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	GEOPRIV                                                       M. Thomson
3	Internet-Draft                                                   Mozilla
4	Intended status: Standards Track                         J. Winterbottom
5	Expires: February 15, 2015                                  Unaffiliated
6	                                                         August 14, 2014

8	        Representation of Uncertainty and Confidence in PIDF-LO
9	                   draft-ietf-geopriv-uncertainty-02

11	Abstract

13	   The key concepts of uncertainty and confidence as they pertain to
14	   location information are defined.  Methods for the manipulation of
15	   location estimates that include uncertainty information are outlined.

17	Status of This Memo

19	   This Internet-Draft is submitted in full conformance with the
20	   provisions of BCP 78 and BCP 79.

22	   Internet-Drafts are working documents of the Internet Engineering
23	   Task Force (IETF).  Note that other groups may also distribute
24	   working documents as Internet-Drafts.  The list of current Internet-
25	   Drafts is at http://datatracker.ietf.org/drafts/current/.

27	   Internet-Drafts are draft documents valid for a maximum of six months
28	   and may be updated, replaced, or obsoleted by other documents at any
29	   time.  It is inappropriate to use Internet-Drafts as reference
30	   material or to cite them other than as "work in progress."

32	   This Internet-Draft will expire on February 15, 2015.

34	Copyright Notice

36	   Copyright (c) 2014 IETF Trust and the persons identified as the
37	   document authors.  All rights reserved.

39	   This document is subject to BCP 78 and the IETF Trust's Legal
40	   Provisions Relating to IETF Documents
41	   (http://trustee.ietf.org/license-info) in effect on the date of
42	   publication of this document.  Please review these documents
43	   carefully, as they describe your rights and restrictions with respect
44	   to this document.  Code Components extracted from this document must
45	   include Simplified BSD License text as described in Section 4.e of
46	   the Trust Legal Provisions and are provided without warranty as
47	   described in the Simplified BSD License.

49	Table of Contents

51	   1.  Introduction
52	     1.1.  Conventions and Terminology
53	   2.  A General Definition of Uncertainty
54	     2.1.  Uncertainty as a Probability Distribution
55	     2.2.  Deprecation of the Terms Precision and Resolution
56	     2.3.  Accuracy as a Qualitative Concept
57	   3.  Uncertainty in Location
58	     3.1.  Targets as Points in Space
59	     3.2.  Representation of Uncertainty and Confidence in PIDF-LO
60	     3.3.  Uncertainty and Confidence for Civic Addresses
61	     3.4.  DHCP Location Configuration Information and Uncertainty
62	   4.  Representation of Confidence in PIDF-LO
63	     4.1.  The "confidence" Element
64	     4.2.  Generating Locations with Confidence
65	     4.3.  Consuming and Presenting Confidence
66	   5.  Manipulation of Uncertainty
67	     5.1.  Reduction of a Location Estimate to a Point
68	       5.1.1.  Centroid Calculation
69	         5.1.1.1.  Arc-Band Centroid
70	         5.1.1.2.  Polygon Centroid
71	     5.2.  Conversion to Circle or Sphere
72	     5.3.  Three-Dimensional to Two-Dimensional Conversion
73	     5.4.  Increasing and Decreasing Uncertainty and Confidence
74	       5.4.1.  Rectangular Distributions
75	       5.4.2.  Normal Distributions
76	     5.5.  Determining Whether a Location is Within a Given Region
77	       5.5.1.  Determining the Area of Overlap for Two Circles
78	       5.5.2.  Determining the Area of Overlap for Two Polygons
79	   6.  Examples
80	     6.1.  Reduction to a Point or Circle
81	     6.2.  Increasing and Decreasing Confidence
82	     6.3.  Matching Location Estimates to Regions of Interest
83	     6.4.  PIDF-LO With Confidence Example
84	   7.  Confidence Schema
85	   8.  IANA Considerations
86	     8.1.  URN Sub-Namespace Registration for
87	           urn:ietf:params:xml:ns:geopriv:conf
88	     8.2.  XML Schema Registration
89	   9.  Security Considerations
90	   10. Acknowledgements
91	   11. References
92	     11.1.  Normative References
93	     11.2.  Informative References
94	   Appendix A.  Conversion Between Cartesian and Geodetic
95	                Coordinates in WGS84
96	   Appendix B.  Calculating the Upward Normal of a Polygon
97	     B.1.  Checking that a Polygon Upward Normal Points Up
98	   Authors' Addresses

100	1.  Introduction

102	   Location information represents an estimation of the position of a
103	   Target [RFC6280].  Under ideal circumstances, a location estimate
104	   precisely reflects the actual location of the Target.  For automated
105	   systems that determine location, there are many factors that
106	   introduce errors into the measurements that are used to determine
107	   location estimates.

109	   The process by which measurements are combined to generate a location
110	   estimate is outside of the scope of work within the IETF.  However,
111	   the results of such a process are carried in IETF data formats and
112	   protocols.  This document outlines how uncertainty, and its
113	   associated datum, confidence, are expressed and interpreted.

115	   This document provides a common nomenclature for discussing
116	   uncertainty and confidence as they relate to location information.

118	   This document also provides guidance on how to manage location
119	   information that includes uncertainty.  Methods for expanding or
120	   reducing uncertainty to obtain a required level of confidence are
121	   described.  Methods for determining the probability that a Target is
122	   within a specified region based on its location estimate are
123	   described.  These methods are simplified by making certain
124	   assumptions about the location estimate and are designed to be
125	   applicable to location estimates in a relatively small geographic
126	   area.

128	   A confidence extension for the Presence Information Data Format -
129	   Location Object (PIDF-LO) [RFC4119] is described.

131	   This document describes methods that can be used in combination with
132	   automatically determined location information.  These are
133	   statistically-based methods.

135	1.1.  Conventions and Terminology

137	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
138	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
139	   document are to be interpreted as described in [RFC2119].

141	   This document assumes a basic understanding of the principles of
142	   mathematics, particularly statistics and geometry.

144	   Some terminology is borrowed from [RFC3693] and [RFC6280], in
145	   particular Target.

147	   Mathematical formulae are presented using the following notation: add
148	   "+", subtract "-", multiply "*", divide "/", power "^" and absolute
149	   value "|x|".  Precedence is indicated using parentheses.
150	   Mathematical functions are represented by common abbreviations:
151	   square root "sqrt(x)", sine "sin(x)", cosine "cos(x)", inverse cosine
152	   "acos(x)", tangent "tan(x)", inverse tangent "atan(x)", two-argument
153	   inverse tangent "atan2(y,x)", error function "erf(x)", and inverse
154	   error function "erfinv(x)".

156	2.  A General Definition of Uncertainty

158	   Uncertainty results from the limitations of measurement.  In
159	   measuring any observable quantity, errors from a range of sources
160	   affect the result.  Uncertainty is a quantification of what is known
161	   about the observed quantity, either through the limitations of
162	   measurement or through inherent variability of the quantity.

164	   Uncertainty is most completely described by a probability
165	   distribution.  A probability distribution assigns a probability to
166	   possible values for the quantity.

168	   A probability distribution describing a measured quantity can be
169	   arbitrarily complex and so it is desirable to find a simplified
170	   model.  One approach commonly taken is to reduce the probability
171	   distribution to a confidence interval.  Many alternative models are
172	   used in other areas, but study of those is not the focus of this
173	   document.

175	   In addition to the central estimate of the observed quantity, a
176	   confidence interval is succinctly described by two values: an error
177	   range and a confidence.  The error range describes an interval and
178	   the confidence describes an estimated upper bound on the probability
179	   that a "true" value is found within the extents defined by the error.

181	   In the following example, a measurement result for a length is shown
182	   as a nominal value with additional information on error range (0.0043
183	   meters) and confidence (95%).

185	      e.g.  x = 1.00742 +/- 0.0043 meters at 95% confidence

187	   This result indicates that the value of "x" is between 1.00312 and
188	   1.01172 meters with 95% probability.  No
189	   other assertion is made: in particular, this does not assert that x
190	   is 1.00742.
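
   The arithmetic behind this example can be sketched as follows.  This
   is a minimal illustration only; the function name
   "confidence_interval" is hypothetical and is not defined by this
   document.

```python
def confidence_interval(nominal, error_range):
    """Return the (lower, upper) bounds of nominal +/- error_range."""
    return nominal - error_range, nominal + error_range


# The example from the text: x = 1.00742 +/- 0.0043 meters.
lower, upper = confidence_interval(1.00742, 0.0043)
print(f"x is between {lower:.5f} and {upper:.5f} meters at 95% confidence")
```

   The confidence value is carried alongside the interval; it is not
   derived from the bounds themselves.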

192	   Uncertainty and confidence for location estimates can be derived in a
193	   number of ways.  This document does not attempt to enumerate the many
194	   methods for determining uncertainty.  [ISO.GUM] and [NIST.TN1297]
195	   provide a set of general guidelines for determining and manipulating
196	   measurement uncertainty.  This document applies that general guidance
197	   for consumers of location information.

199	   As a statistical measure, values for uncertainty are determined
200	   based on information in the aggregate, across numerous
201	   individual estimates.  An individual estimate might be determined to
202	   be "correct" - by using a survey to validate the result, for example
203	   - without invalidating the statistical assertion.

205	   This understanding of estimates in the statistical sense explains why
206	   asserting a confidence of 100%, which might seem intuitively correct,
207	   is rarely advisable.

209	2.1.  Uncertainty as a Probability Distribution

211	   The Probability Density Function (PDF) that is described by
212	   uncertainty indicates the probability that the "true" value lies at
213	   any one point.  The shape of the probability distribution can vary
214	   depending on the method that is used to determine the result.  The
215	   two probability density functions most generally applicable to
216	   location information are considered in this document:

218	   o  The normal PDF (also referred to as a Gaussian PDF) is used where
219	      a large number of small random factors contribute to errors.  The
220	      value used for the error range in a normal PDF is related to the
221	      standard deviation of the distribution.

223	   o  A rectangular PDF is used where the errors are known to be
224	      consistent across a limited range.  A rectangular PDF can occur
225	      where a single error source, such as a rounding error, is
226	      significantly larger than other errors.  A rectangular PDF is
227	      often described by the half-width of the distribution; that is,
228	      half the width of the distribution.

230	   Each of these probability density functions can be characterized by
231	   its center point, or mean, and its width.  For a normal distribution,
232	   uncertainty and confidence together are related to the standard
233	   deviation (see Section 5.4).  For a rectangular distribution, half of
234	   the width of the distribution is used.

236	   Figure 1 shows a normal and rectangular probability density function
237	   with the mean (m) and standard deviation (s) labelled.  The half-
238	   width (h) of the rectangular distribution is also indicated.

240	                                *****             *** Normal PDF
241	                              **  :  **           --- Rectangular PDF
242	                            **    :    **
243	                           **     :     **
244	                .---------*---------------*---------.
245	                |        **       :       **        |
246	                |       **        :        **       |
247	                |      * <-- s -->:          *      |
248	                |     * :         :         : *     |
249	                |    **           :           **    |
250	                |   *   :         :         :   *   |
251	                |  *              :              *  |
252	                |**     :         :         :     **|
253	               **                 :                 **
254	            *** |       :         :         :       | ***
255	        *****   |                 :<------ h ------>|   *****
256	    .****-------+.......:.........:.........:.......+-------*****.
257	                                  m

259	      Figure 1: Normal and Rectangular Probability Density Functions
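
   As an informal illustration of the two shapes in Figure 1, the
   following sketch evaluates each density at a point.  The function
   names are hypothetical; "m", "s" and "h" follow the labels used in
   the figure.

```python
import math


def normal_pdf(x, m, s):
    """Density of a normal PDF with mean m and standard deviation s."""
    return math.exp(-((x - m) ** 2) / (2 * s ** 2)) / (s * math.sqrt(2 * math.pi))


def rectangular_pdf(x, m, h):
    """Density of a rectangular PDF centered on m with half-width h."""
    # The density is constant inside m-h..m+h and zero outside it.
    return 1 / (2 * h) if abs(x - m) <= h else 0.0
```

   The rectangular density is 1/(2*h) so that the total area under the
   curve is 1, as it is for the normal density.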

261	   For a given PDF, the value of the PDF describes the probability that
262	   the "true" value is found at that point.  Confidence for any given
263	   interval is the total probability of the "true" value being in that
264	   range, defined as the integral of the PDF over the interval.

266	      The probability of the "true" value falling between two points is
267	      found by finding the area under the curve between the points (that
268	      is, the integral of the curve between the points).  For any given
269	      PDF, the area under the curve for the entire range from negative
270	      infinity to positive infinity is 1 or (100%).  Therefore, the
271	      confidence over any interval of uncertainty is always less than
272	      100%.

274	   Figure 2 shows how confidence is determined for a normal
275	   distribution.  The area of the shaded region gives the confidence (c)
276	   for the interval between "m-u" and "m+u".

278	                                *****
279	                              **:::::**
280	                            **:::::::::**
281	                           **:::::::::::**
282	                          *:::::::::::::::*
283	                         **:::::::::::::::**
284	                        **:::::::::::::::::**
285	                       *:::::::::::::::::::::*
286	                      *:::::::::::::::::::::::*
287	                     **:::::::::::::::::::::::**
288	                    *:::::::::::: c ::::::::::::*
289	                   *:::::::::::::::::::::::::::::*
290	                 **|:::::::::::::::::::::::::::::|**
291	               **  |:::::::::::::::::::::::::::::|  **
292	            ***    |:::::::::::::::::::::::::::::|    ***
293	        *****      |:::::::::::::::::::::::::::::|      *****
294	    .****..........!:::::::::::::::::::::::::::::!..........*****.
295	                   |              |              |
296	                 (m-u)            m            (m+u)

298	               Figure 2: Confidence as the Integral of a PDF
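
   For the two distributions considered here, the integral has a simple
   closed form.  The sketch below is illustrative only and assumes a
   one-dimensional interval centered on the mean; the function names
   are hypothetical.  It uses the error function "erf(x)" from the
   notation in Section 1.1.

```python
import math


def normal_confidence(u, s):
    """Confidence for the interval m-u..m+u under a normal PDF with
    standard deviation s."""
    return math.erf(u / (s * math.sqrt(2)))


def rectangular_confidence(u, h):
    """Confidence for the interval m-u..m+u under a rectangular PDF with
    half-width h."""
    # Beyond the half-width the whole distribution is already covered.
    return min(1.0, u / h)


print(round(normal_confidence(1.0, 1.0), 4))   # one standard deviation
print(round(normal_confidence(1.96, 1.0), 4))  # close to 95%
```

   For a normal PDF, no finite value of "u" yields a confidence of
   exactly 100%.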

300	   In Section 5.4, methods are described for manipulating uncertainty if
301	   the shape of the PDF is known.

303	2.2.  Deprecation of the Terms Precision and Resolution

305	   The terms _Precision_ and _Resolution_ are defined in RFC 3693
306	   [RFC3693].  These definitions were intended to provide a common
307	   nomenclature for discussing uncertainty; however, these particular
308	   terms have many different uses in other fields and their definitions
309	   are not sufficient to avoid confusion about their meaning.  These
310	   terms are unsuitable for use in relation to quantitative concepts
311	   when discussing uncertainty and confidence in relation to location
312	   information.

314	2.3.  Accuracy as a Qualitative Concept

316	   Uncertainty is a quantitative concept.  The term _accuracy_ is useful
317	   in describing, qualitatively, the general concepts of location
318	   information.  Accuracy is generally useful when describing
319	   qualitative aspects of location estimates.  Accuracy is not a
320	   suitable term for use in a quantitative context.

322	   For instance, it could be appropriate to say that a location estimate
323	   with uncertainty "X" is more accurate than a location estimate with
324	   uncertainty "2X" at the same confidence.  It is not appropriate to
325	   assign a number to "accuracy", nor is it appropriate to refer to any
326	   component of uncertainty or confidence as "accuracy".  That is, to
327	   say that the "accuracy" for the first location estimate is "X" would
328	   be an erroneous use of this term.

330	3.  Uncertainty in Location

332	   A _location estimate_ is the result of location determination.  A
333	   location estimate is subject to uncertainty like any other
334	   observation.  However, unlike a simple measure of a one-dimensional
335	   property like length, a location estimate is specified in two or
336	   three dimensions.

338	   Uncertainty in two- or three-dimensional locations can be described
339	   using confidence intervals.  The confidence interval for a location
340	   estimate in two- or three-dimensional space is expressed as a subset
341	   of that space.  This document uses the term _region of uncertainty_
342	   to refer to the area or volume that describes the confidence
343	   interval.

345	   Areas or volumes that describe regions of uncertainty can be formed
346	   by the combination of two or three one-dimensional ranges, or more
347	   complex shapes could be described (for example, the shapes in
348	   [RFC5491]).
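
   A region formed from one-dimensional ranges can be sketched as a
   list of (low, high) pairs, one per axis.  This is a minimal
   illustration that assumes Cartesian coordinates in meters; the name
   "in_region" and the sample values are hypothetical.

```python
def in_region(point, ranges):
    """True if every coordinate of point lies within its (low, high) range."""
    return len(point) == len(ranges) and all(
        low <= value <= high for value, (low, high) in zip(point, ranges)
    )


# Two ranges combine to form a rectangular region of uncertainty.
region = [(10.0, 20.0), (-5.0, 5.0)]
print(in_region((12.0, 0.0), region))
print(in_region((12.0, 6.0), region))
```

   Adding a third range to the list would describe a rectangular prism
   in the same way.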

350	3.1.  Targets as Points in Space

352	   This document makes a simplifying assumption that the Target of the
353	   PIDF-LO occupies just a single point in space.  While this is clearly
354	   false in virtually all scenarios with any practical application, it
355	   is often a reasonable simplifying assumption to make.

357	   To a large extent, whether this simplification is valid depends on
358	   the size of the target relative to the size of the uncertainty
359	   region.  When locating a personal device using contemporary location
360	   determination techniques, the space the device occupies relative to
361	   the uncertainty is proportionally quite small.  Even where that
362	   device is used as a proxy for a person, the proportions change
363	   little.

365	   This assumption is less useful as uncertainty becomes small relative
366	   to the size of the Target of the PIDF-LO (or conversely, as the
367	   Target becomes large relative to the uncertainty).  For instance,
368	   describing the location of a football stadium or small country would
369	   include a region of uncertainty that is only slightly larger than
370	   the Target itself.  In these cases, much of the guidance in this
371	   document is not applicable.  Indeed, as the accuracy of location
372	   determination technology improves, it could be that the advice this
373	   document contains becomes less relevant by the same measure.

375	3.2.  Representation of Uncertainty and Confidence in PIDF-LO

377	   A set of shapes suitable for the expression of uncertainty in
378	   location estimates in the Presence Information Data Format - Location
379	   Object (PIDF-LO) are described in [GeoShape].  These shapes are the
380	   recommended form for the representation of uncertainty in PIDF-LO
381	   [RFC4119] documents.

383	   The PIDF-LO can contain uncertainty, but does not include an
384	   indication of confidence.  [RFC5491] defines a fixed value of 95%.
385	   Similarly, the PIDF-LO format does not provide an indication of the
386	   shape of the PDF.  Section 4 defines elements to convey this
387	   information in PIDF-LO.

389	   Absence of uncertainty information in a PIDF-LO document does not
390	   indicate that there is no uncertainty in the location estimate.
391	   Uncertainty might not have been calculated for the estimate, or it
392	   may be withheld for privacy purposes.

394	   If the Point shape is used, confidence and uncertainty are unknown; a
395	   receiver can either assume a confidence of 0% or infinite
396	   uncertainty.  The same principle applies on the altitude axis for
397	   two-dimensional shapes like the Circle.

399	3.3.  Uncertainty and Confidence for Civic Addresses

401	   Automatically determined civic addresses [RFC5139] inherently include
402	   uncertainty, based on the area of the most precise element that is
403	   specified.  In this case, uncertainty is effectively described by the
404	   presence or absence of elements -- elements that are not present are
405	   deemed to be uncertain.

407	   To apply the concept of uncertainty to civic addresses, it is helpful
408	   to unify the conceptual models of civic address with geodetic
409	   location information.  This is particularly useful when considering
410	   civic addresses that are determined using reverse geocoding (that is,
411	   the process of translating geodetic information into civic
412	   addresses).

414	   In the unified view, a civic address defines a series of (sometimes
415	   non-orthogonal) spatial partitions.  The first is the implicit
416	   partition that identifies the surface of the earth and the space near
417	   the surface.  The second is the country.  Each label that is included
418	   in a civic address provides information about a different set of
419	   spatial partitions.  Some partitions require slight adjustments from
420	   a standard interpretation: for instance, a road includes all
421	   properties that adjoin the street.  Each label might need to be
422	   interpreted with other values to provide context.

424	   As a value at each level is interpreted, one or more spatial
425	   partitions at that level are selected, and all other partitions of
426	   that type are excluded.  For non-orthogonal partitions, only the
427	   portion of the partition that fits within the existing space is
428	   selected.  This is what distinguishes King Street in Sydney from King
429	   Street in Melbourne.  Each defined element selects a partition of
430	   space.  The resulting location is the intersection of all selected
431	   spaces.

433	   The resulting spatial partition can be considered as a region of
434	   uncertainty.
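
   The selection and intersection of partitions can be sketched with
   sets.  This is a toy model only: the cell identifiers and the
   "PARTITIONS" table are invented for illustration and do not
   correspond to any real data set.

```python
# Hypothetical partitions: each civic element selects a set of cells.
PARTITIONS = {
    ("country", "AU"): {"syd-1", "syd-2", "syd-3", "mel-1", "mel-2"},
    ("city", "Sydney"): {"syd-1", "syd-2", "syd-3"},
    ("city", "Melbourne"): {"mel-1", "mel-2"},
    ("street", "King Street"): {"syd-2", "mel-1"},
}


def region_of_uncertainty(address):
    """Intersect the partitions selected by each (label, value) element."""
    # Start from every known cell, standing in for the implicit partition.
    region = set().union(*PARTITIONS.values())
    for element in address:
        region &= PARTITIONS[element]
    return region


full = [("country", "AU"), ("city", "Sydney"), ("street", "King Street")]
print(region_of_uncertainty(full))
```

   Omitting the city element from the address leaves both King Streets
   in the result, which is a larger region of uncertainty.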

436	   Note:  This view is a potential perspective on the process of geo-
437	      coding - the translation of a civic address to a geodetic
438	      location.

440	   Uncertainty in civic addresses can be increased by removing elements.
441	   This does not increase confidence unless additional information is
442	   used.  Similarly, arbitrarily increasing uncertainty in a geodetic
443	   location does not increase confidence.

445	3.4.  DHCP Location Configuration Information and Uncertainty

447	   Location information is often measured in two or three dimensions;
448	   expressions of uncertainty in one dimension only are rare.  The
449	   "resolution" parameters in [RFC6225] provide an indication of how
450	   many bits of a number are valid, which could be interpreted as an
451	   expression of uncertainty in one dimension.

453	   [RFC6225] defines a means for representing uncertainty, but a value
454	   for confidence is not specified.  A default value of 95% confidence
455	   is assumed for the combination of the uncertainty on each axis.  This
456	   is consistent with the transformation of those forms into the
457	   uncertainty representations from [RFC5491].  That is, the confidence
458	   of the resultant rectangular polygon or prism is assumed to be 95%.
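
   As a rough illustration of how a "resolution" value might be read as
   an uncertainty in one dimension, the sketch below treats the bits
   beyond the stated resolution as unknown.  It assumes the 34-bit
   fixed-point encoding with 25 fractional bits used for latitude and
   longitude in the DHCP option; the function name and its defaults are
   hypothetical.

```python
def resolution_to_range(value, resolution_bits, total_bits=34, fraction_bits=25):
    """Return the (low, high) range of values consistent with a
    fixed-point number of which only the leading resolution_bits are valid."""
    if not 0 <= resolution_bits <= total_bits:
        raise ValueError("resolution_bits out of range")
    scale = 1 << fraction_bits
    raw = int(round(value * scale))  # signed fixed-point representation
    unknown_bits = total_bits - resolution_bits
    # Clear the unknown bits for the lower bound; set them for the upper.
    low_raw = (raw >> unknown_bits) << unknown_bits
    high_raw = low_raw | ((1 << unknown_bits) - 1)
    return low_raw / scale, high_raw / scale


low, high = resolution_to_range(1.0, 9)
print(low, high)
```

   With only the 9 integer bits valid, a value of 1.0 could lie anywhere
   from 1.0 up to just under 2.0.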

460	4.  Representation of Confidence in PIDF-LO

462	   On the whole, a fixed definition for confidence is preferable,
463	   primarily because it ensures consistency between implementations.
464	   Location generators that are aware of this constraint can generate
465	   location information at the required confidence.  Location recipients
466	   are able to make sensible assumptions about the quality of the
467	   information that they receive.

469	   In some circumstances - particularly with pre-existing systems -
470	   location generators might be unable to provide location information
471	   with consistent confidence.  Existing systems sometimes specify
472	   confidence at 38%, 67% or 90%.  Existing forms of expressing location
473	   information, such as that defined in [TS-3GPP-23_032], contain
474	   elements that express the confidence in the result.

476	   The addition of a confidence element provides information that was
477	   previously unavailable to recipients of location information.
478	   Without this information, a location server or generator that has
479	   access to location information with a confidence lower than 95% has
480	   two options:

482	   o  The location server can scale regions of uncertainty in an attempt
483	      to achieve 95% confidence.  This scaling process significantly
484	      degrades the quality of the information, because the location
485	      server might not have the necessary information to scale
486	      appropriately; the location server is forced to make assumptions
487	      that are likely to result in either an overly conservative
488	      estimate with high uncertainty or an overestimate of confidence.

490	   o  The location server can ignore the confidence entirely, which
491	      results in giving the recipient a false impression of its quality.

493	   Both of these choices degrade the quality of the information
494	   provided.
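
   To illustrate the first option, the sketch below scales an
   uncertainty value to a target confidence.  It assumes a normal
   distribution in a single dimension, which is exactly the kind of
   assumption a location server might not be able to justify.  The
   function name is hypothetical.

```python
from statistics import NormalDist


def scale_normal_uncertainty(uncertainty, confidence, target=0.95):
    """Scale a one-dimensional uncertainty from one confidence to another,
    assuming a normal distribution.  Confidences are fractions in (0, 1)."""

    def z(c):
        # Number of standard deviations either side of the mean that
        # contains a fraction c of the distribution.
        return NormalDist().inv_cdf((1 + c) / 2)

    return uncertainty * z(target) / z(confidence)


# An uncertainty of 100 meters at 67% confidence, scaled to 95%.
print(round(scale_normal_uncertainty(100.0, 0.67), 1))
```

   If the actual distribution is not normal, the scaled value can
   overstate or understate the uncertainty.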

496	   The addition of a confidence element avoids this problem entirely if
497	   a location recipient supports and understands the element.  A
498	   recipient that does not understand - and hence ignores - the
499	   confidence element is in no worse a position than if the location
500	   server ignored confidence.

502	4.1.  The "confidence" Element

504	   The confidence element MAY be added to the "location-info" element of
505	   the Presence Information Data Format - Location Object (PIDF-LO)
506	   [RFC4119] document.  This element expresses the confidence in the
507	   associated location information as a percentage.  A special "unknown"
508	   value is reserved to indicate that confidence is supported, but not
509	   known to the Location Generator.

511	   The confidence element optionally includes an attribute that
512	   indicates the shape of the probability density function (PDF) of the
513	   associated region of uncertainty.  Three values are possible:
514	   unknown, normal and rectangular.

516	   Indicating a particular PDF only indicates that the distribution
517	   approximately fits the given shape based on the methods used to
518	   generate the location information.  The PDF is normal if there are a
519	   large number of small, independent sources of error; rectangular if
520	   all points within the area have roughly equal probability of being
521	   the actual location of the Target; otherwise, the PDF MUST either be
522	   set to unknown or omitted.

524	   If a PIDF-LO does not include the confidence element, the confidence
525	   of the location estimate is 95%, as defined in [RFC5491].

527	   A Point shape does not have uncertainty (or it has infinite
528	   uncertainty), so confidence is meaningless for a point; therefore,
529	   this element MUST be omitted if only a point is provided.
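
   A recipient might read the element along the following lines.  This
   is a sketch under stated assumptions: the namespace is the one
   registered in Section 8.1, the attribute name "pdf" is assumed here
   for illustration (the schema in Section 7 is authoritative), and the
   function name is hypothetical.

```python
import xml.etree.ElementTree as ET

CONF_NS = "urn:ietf:params:xml:ns:geopriv:conf"


def read_confidence(location_info):
    """Return (confidence, pdf) for a location-info element.

    confidence is a percentage, or None if the value is "unknown".
    """
    element = location_info.find(f"{{{CONF_NS}}}confidence")
    if element is None:
        # No confidence element: the default of 95% applies.
        return 95.0, "unknown"
    text = (element.text or "").strip()
    confidence = None if text == "unknown" else float(text)
    # "pdf" is an assumed attribute name; an absent attribute means unknown.
    return confidence, element.get("pdf", "unknown")


example = ET.fromstring(
    '<location-info xmlns="urn:ietf:params:xml:ns:pidf:geopriv10" '
    'xmlns:con="urn:ietf:params:xml:ns:geopriv:conf">'
    '<con:confidence pdf="normal">67</con:confidence>'
    "</location-info>"
)
print(read_confidence(example))
```

   The sketch does not check whether the location is a Point, for which
   the element is omitted.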

531	4.2.  Generating Locations with Confidence

533	   Location generators SHOULD attempt to ensure that confidence is equal
534	   in each dimension when generating location information.  This
535	   restriction, while not always practical, allows for more accurate
536	   scaling, if scaling is necessary.

538	   A confidence element MUST be included with all location information
539	   that includes uncertainty (that is, all forms other than a point).  A
540	   special "unknown" value MAY be used if confidence is not known.

542	4.3.  Consuming and Presenting Confidence

544	   The inclusion of confidence that is anything other than 95% presents
545	   a potentially difficult usability problem for applications that use
546	   location information.  Effectively communicating the probability that
547	   a location is incorrect to a user can be difficult.

549	   It is inadvisable to simply display locations of any confidence, or
550	   to display confidence in a separate or non-obvious fashion.  If
551	   locations with different confidence levels are displayed such that
552	   the distinction is subtle or easy to overlook - such as using fine
553	   graduations of color or transparency for graphical uncertainty
554	   regions, or displaying uncertainty graphically, but providing
555	   confidence as supplementary text - a user could fail to notice a
556	   difference in the quality of the location information that might be
557	   significant.

559	   Depending on the circumstances, different ways of handling confidence
560	   might be appropriate.  Section 5 describes techniques that could be
561	   appropriate for consumers that use automated processing.

563	   Provided that the full implications of any choice for the
564	   application are understood, some amount of automated processing could
565	   be appropriate.  In a simple example, applications could choose to
566	   discard or suppress the display of location information if confidence
567	   does not meet a pre-determined threshold.

569	   In settings where there is an opportunity for user training, some of
570	   these problems might be mitigated by defining different operational
571	   procedures for handling location information at different confidence
572	   levels.

574	5.  Manipulation of Uncertainty

576	   This section deals with manipulation of location information that
577	   contains uncertainty.

579	   The following rules generally apply when manipulating location
580	   information:

582	   o  Where calculations are performed on coordinate information, these
583	      should be performed in Cartesian space and the results converted
584	      back to latitude, longitude and altitude.  A method for converting
585	      to and from Cartesian coordinates is included in Appendix A.

587	         While some approximation methods are useful in simplifying
588	         calculations, treating latitude and longitude as Cartesian axes
589	         is never advisable.  The two axes are not orthogonal.  Errors
590	         can arise from the curvature of the earth and from the
591	         convergence of longitude lines.

593	   o  Normal rounding rules do not apply when rounding uncertainty.
594	      When rounding, the region of uncertainty always increases (that
595	      is, errors are rounded up) and confidence is always rounded down
596	      (see [NIST.TN1297]).  This means that any manipulation of
597	      uncertainty is a non-reversible operation; each manipulation can
598	      result in the loss of some information.
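
	   The rounding rule above can be sketched as follows.  This is a
	   minimal illustration, not part of the specification; the function
	   names and the use of Python's decimal module are assumptions made
	   for this example.

```python
from decimal import Decimal, ROUND_CEILING, ROUND_FLOOR


def round_uncertainty(value, quantum="0.1"):
    """Round an uncertainty value UP to the precision of `quantum`.

    Rounding up means the region of uncertainty never shrinks.
    """
    rounded = Decimal(str(value)).quantize(Decimal(quantum),
                                           rounding=ROUND_CEILING)
    return float(rounded)


def round_confidence(percent, quantum="1"):
    """Round a confidence percentage DOWN to the precision of `quantum`.

    Rounding down means confidence is never overstated.
    """
    rounded = Decimal(str(percent)).quantize(Decimal(quantum),
                                             rounding=ROUND_FLOOR)
    return float(rounded)
```

	   Using decimal arithmetic avoids binary floating-point artifacts
	   that could otherwise push an already-rounded value up by one more
	   step.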

600	5.1.  Reduction of a Location Estimate to a Point

602	   Manipulating location estimates that include uncertainty information
603	   requires additional complexity in systems.  In some cases, systems
604	   only operate on definitive values, that is, a single point.

606	   This section describes algorithms for reducing location estimates to
607	   a simple form without uncertainty information.  Having a consistent
608	   means for reducing location estimates allows for interaction between
609	   applications that are able to use uncertainty information and those
610	   that cannot.

612	   Note:  Reduction of a location estimate to a point constitutes a
613	      reduction in information.  Removing uncertainty information can
614	      degrade results in some applications.  Also, there is a natural
615	      tendency to misinterpret a point location as representing a
616	      location without uncertainty.  This could lead to more serious
617	      errors.  Therefore, these algorithms should only be applied where
618	      necessary.

620	   Several different approaches can be taken when reducing a location
621	   estimate to a point.  Different methods each make a set of
622	   assumptions about the properties of the PDF and the selected point;
623	   no one method is more "correct" than any other.  For any given region
624	   of uncertainty, selecting an arbitrary point within the area could be
625	   considered valid; however, given the aforementioned problems with
626	   point locations, a more rigorous approach is appropriate.

628	   Given a result with a known distribution, selecting the point within
629	   the area that has the highest probability is a more rigorous method.
630	   Alternatively, a point could be selected that minimizes the overall
631	   error; that is, it minimizes the expected value of the difference
632	   between the selected point and the "true" value.

634	   If a rectangular distribution is assumed, the centroid of the area or
635	   volume minimizes the overall error.  Minimizing the error for a
636	   normal distribution is mathematically complex.  Therefore, this
637	   document opts to select the centroid of the region of uncertainty
638	   when selecting a point.

640	5.1.1.  Centroid Calculation

642	   For regular shapes, such as Circle, Sphere, Ellipse and Ellipsoid,
643	   this approach equates to the center point of the region.  For regions
644	   of uncertainty that are expressed as regular Polygons and Prisms the
645	   center point is also the most appropriate selection.

647	   For the Arc-Band shape and non-regular Polygons and Prisms, selecting
648	   the centroid of the area or volume minimizes the overall error.  This
649	   assumes that the PDF is rectangular.

651	   Note:  The centroid of a concave Polygon or Arc-Band shape is not
652	      necessarily within the region of uncertainty.

654	5.1.1.1.  Arc-Band Centroid

656	   The centroid of the Arc-Band shape is found along a line that bisects
657	   the arc.  The centroid can be found at the following distance from
658	   the center point of the arc-band (assuming an arc-band with an
659	   inner radius of "r", outer radius "R", start angle "a", and opening
660	   angle "o"):

662	      d = 4 * sin(o/2) * (R*R + R*r + r*r) / (3*o*(R + r))

664	   This point can be found along the line that bisects the arc; that is,
665	   the line at an angle of "a + (o/2)".
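
	   As a sketch only, the formula for "d" and the bisecting angle can
	   be evaluated as shown here.  The function name is hypothetical,
	   and angles are assumed to be in radians, which the formula
	   requires for the opening angle "o".

```python
import math


def arc_band_centroid(r, R, a, o):
    """Return (d, angle) for the centroid of an Arc-Band.

    r, R: inner and outer radius; a: start angle; o: opening angle.
    Angles are in radians (an assumption of this sketch).
    d is the distance along the bisecting line; angle is a + o/2.
    """
    if R <= 0 or r < 0 or r > R or o <= 0:
        raise ValueError("expected 0 <= r <= R, R > 0 and o > 0")
    d = 4 * math.sin(o / 2) * (R * R + R * r + r * r) / (3 * o * (R + r))
    return d, a + o / 2
```

	   For a half disc ("r = 0", "o = pi") this reduces to the familiar
	   "4*R / (3*pi)".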

667	5.1.1.2.  Polygon Centroid

669	   Calculating a centroid for the Polygon and Prism shapes is more
670	   complex.  Polygons that are specified using geodetic coordinates are
671	   not necessarily coplanar.  For Polygons that are specified without an
672	   altitude, choose a value for altitude before attempting this process;
673	   an altitude of 0 is acceptable.

675	      The method described in this section is simplified by assuming
676	      that the surface of the earth is locally flat.  This method
677	      degrades as polygons become larger; see [GeoShape] for
678	      recommendations on polygon size.

680	   The polygon is translated to a new coordinate system that has an x-y
681	   plane roughly parallel to the polygon.  This enables the elimination
682	   of z-axis values and calculating a centroid can be done using only x
683	   and y coordinates.  This requires that the upward normal for the
684	   polygon is known.

686	   To translate the polygon coordinates, apply the process described in
687	   Appendix B to find the normal vector "N = [Nx,Ny,Nz]".  This value
688	   should be made a unit vector to ensure that the transformation matrix
689	   is a special orthogonal matrix.  From this vector, select two vectors
690	   that are perpendicular to this vector and combine these into a
691	   transformation matrix.

693	   If "Nx" and "Ny" are non-zero, the matrices in Figure 3 can be used,
694	   given "p = sqrt(Nx^2 + Ny^2)".  More transformations are provided
695	   later in this section for cases where "Nx" or "Ny" are zero.

697	          [   -Ny/p     Nx/p     0  ]         [ -Ny/p  -Nx*Nz/p  Nx ]
698	      T = [ -Nx*Nz/p  -Ny*Nz/p   p  ]    T' = [  Nx/p  -Ny*Nz/p  Ny ]
699	          [    Nx        Ny      Nz ]         [   0      p       Nz ]
700	                 (Transform)                    (Reverse Transform)

702	               Figure 3: Recommended Transformation Matrices

704	   To apply a transform to each point in the polygon, form a matrix from
705	   the ECEF coordinates and use matrix multiplication to determine the
706	   translated coordinates.

708	      [   -Ny/p     Nx/p     0  ]   [ x[1]  x[2]  x[3]  ...  x[n] ]
709	      [ -Nx*Nz/p  -Ny*Nz/p   p  ] * [ y[1]  y[2]  y[3]  ...  y[n] ]
710	      [    Nx        Ny      Nz ]   [ z[1]  z[2]  z[3]  ...  z[n] ]

712	          [ x'[1]  x'[2]  x'[3]  ... x'[n] ]
713	        = [ y'[1]  y'[2]  y'[3]  ... y'[n] ]
714	          [ z'[1]  z'[2]  z'[3]  ... z'[n] ]

716	                         Figure 4: Transformation

718	   Alternatively, direct multiplication can be used to achieve the same
719	   result:

721	      x'[i] = -Ny * x[i] / p + Nx * y[i] / p

723	      y'[i] = -Nx * Nz * x[i] / p - Ny * Nz * y[i] / p + p * z[i]

725	      z'[i] = Nx * x[i] + Ny * y[i] + Nz * z[i]
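
	   The transform of Figure 3 can be sketched in a few lines.  This
	   is an illustration under stated assumptions: the function names
	   are hypothetical, points are plain (x, y, z) tuples in ECEF
	   coordinates, and the normal vector is normalized inside the
	   function.

```python
import math


def transform_matrix(normal):
    """Build the matrix T of Figure 3 from a normal vector (Nx, Ny, Nz)."""
    nx, ny, nz = normal
    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    # Make the normal a unit vector so that T is orthogonal.
    nx, ny, nz = nx / length, ny / length, nz / length
    if nx == 0 or ny == 0:
        raise ValueError("this form of T needs non-zero Nx and Ny")
    p = math.sqrt(nx * nx + ny * ny)
    return [
        [-ny / p, nx / p, 0.0],
        [-nx * nz / p, -ny * nz / p, p],
        [nx, ny, nz],
    ]


def transform_points(t, points):
    """Apply T to each (x, y, z) point, giving (x', y', z') tuples."""
    return [
        tuple(row[0] * x + row[1] * y + row[2] * z for row in t)
        for (x, y, z) in points
    ]
```

	   Applied to the normal vector itself, the transform yields
	   approximately [0, 0, 1], which is a convenient sanity check.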

727	   The first and second rows of this matrix ("x'" and "y'") contain the
728	   values that are used to calculate the centroid of the polygon.  To
729	   find the centroid of this polygon, first find the area using:

731	      A = sum from i=1..n of (x'[i]*y'[i+1]-x'[i+1]*y'[i]) / 2

733	   For these formulae, treat each set of coordinates as circular, that
734	   is "x'[0] == x'[n]" and "x'[n+1] == x'[1]".  Based on the area, the
735	   centroid along each axis can be determined by:

737	      Cx' = sum (x'[i]+x'[i+1]) * (x'[i]*y'[i+1]-x'[i+1]*y'[i]) / (6*A)

739	      Cy' = sum (y'[i]+y'[i+1]) * (x'[i]*y'[i+1]-x'[i+1]*y'[i]) / (6*A)

741	   Note:  The formula for the area of a polygon will return a negative
742	      value if the polygon is specified in clockwise direction.  This
743	      can be used to determine the orientation of the polygon.
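
	   A minimal sketch of the area and centroid sums follows.  The
	   function name is hypothetical; the inputs are the transformed
	   "x'" and "y'" values, without the repeated closing point.

```python
def polygon_area_centroid(xs, ys):
    """Return (A, Cx', Cy') for a simple polygon.

    A is the signed area: negative when the points are listed in
    clockwise order.  Indices wrap around, so x'[n+1] == x'[1].
    """
    n = len(xs)
    if n < 3 or n != len(ys):
        raise ValueError("need at least three (x, y) pairs")
    twice_area = 0.0
    sum_x = 0.0
    sum_y = 0.0
    for i in range(n):
        j = (i + 1) % n  # circular indexing
        cross = xs[i] * ys[j] - xs[j] * ys[i]
        twice_area += cross
        sum_x += (xs[i] + xs[j]) * cross
        sum_y += (ys[i] + ys[j]) * cross
    area = twice_area / 2
    if area == 0:
        raise ValueError("degenerate polygon (zero area)")
    return area, sum_x / (6 * area), sum_y / (6 * area)
```

	   Because the signed area appears in both the numerator and the
	   denominator, the centroid is the same for either orientation.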

745	   The third row contains a distance from a plane parallel to the
746	   polygon.  If the polygon is coplanar, then the values for "z'" are
747	   identical; however, the constraints recommended in [RFC5491] mean
748	   that this is rarely the case.  To determine "Cz'", average these
749	   values:

751	      Cz' = sum z'[i] / n

753	   Once the centroid is known in the transformed coordinates, these can
754	   be transformed back to the original coordinate system.  The reverse
755	   transformation is shown in Figure 5.

757	      [ -Ny/p  -Nx*Nz/p  Nx ]     [       Cx'        ]   [ Cx ]
758	      [  Nx/p  -Ny*Nz/p  Ny ]  *  [       Cy'        ] = [ Cy ]
759	      [   0        p     Nz ]     [ sum of z'[i] / n ]   [ Cz ]

761	                     Figure 5: Reverse Transformation

763	   The reverse transformation can be applied directly as follows:

765	      Cx = -Ny * Cx' / p - Nx * Nz * Cy' / p + Nx * Cz'

767	      Cy = Nx * Cx' / p - Ny * Nz * Cy' / p + Ny * Cz'

769	      Cz = p * Cy' + Nz * Cz'
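
	   The reverse transformation can be sketched in the same way.  The
	   function name is hypothetical, and the normal vector is
	   normalized inside the function as an added safeguard.

```python
import math


def reverse_transform(normal, c_prime):
    """Map [Cx', Cy', Cz'] back to ECEF [Cx, Cy, Cz].

    `normal` is the polygon normal (Nx, Ny, Nz) with non-zero Nx, Ny.
    """
    nx, ny, nz = normal
    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    nx, ny, nz = nx / length, ny / length, nz / length
    if nx == 0 or ny == 0:
        raise ValueError("this form of T' needs non-zero Nx and Ny")
    p = math.sqrt(nx * nx + ny * ny)
    cxp, cyp, czp = c_prime
    cx = -ny * cxp / p - nx * nz * cyp / p + nx * czp
    cy = nx * cxp / p - ny * nz * cyp / p + ny * czp
    cz = p * cyp + nz * czp
    return cx, cy, cz
```

	   Since the transformation matrix is orthogonal, applying the
	   forward transform and then this function returns the original
	   point, up to floating-point error.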

771	   The ECEF value "[Cx,Cy,Cz]" can then be converted back to geodetic
772	   coordinates.  Given a polygon that is defined with no altitude or
773	   equal altitudes for each point, the altitude of the result can either
774	   be ignored or reset after converting back to a geodetic value.

776	   The centroid of the Prism shape is found by finding the centroid of
777	   the base polygon and raising the point by half the height of the
778	   prism.  This can be added to the altitude of the final result;
779	   alternatively, this can be added to "Cz'", which ensures that
780	   negative height is correctly applied to polygons that are defined in
781	   a "clockwise" direction.

783	   The recommended transforms only apply if "Nx" and "Ny" are non-zero.
784	   If the normal vector is "[0,0,1]" (that is, along the z-axis), then
785	   no transform is necessary.  Similarly, if the normal vector is
786	   "[0,1,0]" or "[1,0,0]", avoid the transformation and use the x and z
787	   coordinates or y and z coordinates (respectively) in the centroid
788	   calculation phase.  If either "Nx" or "Ny" is zero, the alternative
789	   transform matrices in Figure 6 can be used.  The reverse transform is
790	   the transpose of this matrix.

792	    if Nx == 0:                              | if Ny == 0:
793	        [ 0  -Nz  Ny ]       [  0   1  0  ]  |            [ -Nz  0  Nx ]
794	    T = [ 1   0   0  ]  T' = [ -Nz  0  Ny ]  |   T = T' = [  0   1  0  ]
795	        [ 0   Ny  Nz ]       [  Ny  0  Nz ]  |            [  Nx  0  Nz ]

797	               Figure 6: Alternative Transformation Matrices

799	5.2.  Conversion to Circle or Sphere

801	   The Circle or Sphere are simple shapes that suit a range of
802	   applications.  A circle or sphere contains fewer units of data to
803	   manipulate, which simplifies operations on location estimates.

805	   The simplest method for converting a location estimate to a Circle or
806	   Sphere shape is to determine the centroid and then find the longest
807	   distance from that point to any point in the region of uncertainty.
808	   This distance can be determined based on the shape type:

810	   Circle/Sphere:  No conversion necessary.

812	   Ellipse/Ellipsoid:  The greater of either semi-major axis or altitude
813	      uncertainty.

815	   Polygon/Prism:  The distance to the furthest vertex of the polygon
816	      (for a Prism, it is only necessary to check points on the base).

818	   Arc-Band:  The furthest length from the centroid to the points where
819	      the inner and outer arc end.  This distance can be calculated by
820	      finding the larger of the two following formulae:

822	         X = sqrt( d*d + R*R - 2*d*R*cos(o/2) )

824	         x = sqrt( d*d + r*r - 2*d*r*cos(o/2) )
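
	   For the Arc-Band case, the radius of the enclosing circle can be
	   sketched as follows.  The function name is hypothetical, and the
	   opening angle is assumed to be in radians.

```python
import math


def arc_band_circle_radius(r, R, o):
    """Return (d, radius) for the circle that replaces an Arc-Band.

    d is the centroid distance along the bisecting line; radius is
    the larger of the distances to the ends of the outer and inner
    arcs (the values "X" and "x" in the text).
    """
    if R <= 0 or r < 0 or r > R or o <= 0:
        raise ValueError("expected 0 <= r <= R, R > 0 and o > 0")
    d = 4 * math.sin(o / 2) * (R * R + R * r + r * r) / (3 * o * (R + r))
    half = math.cos(o / 2)
    # max(0, ...) guards against tiny negative values from rounding.
    outer = math.sqrt(max(0.0, d * d + R * R - 2 * d * R * half))
    inner = math.sqrt(max(0.0, d * d + r * r - 2 * d * r * half))
    return d, max(outer, inner)
```

	   Both formulae are the law of cosines applied to the triangle
	   formed by the arc-band origin, the centroid, and the end of the
	   arc.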

826	   Once the Circle or Sphere shape is found, the associated confidence
827	   can be increased if the result is known to follow a normal
828	   distribution.  However, this is a complicated process and provides
829	   limited benefit.  In many cases it also violates the constraint that
830	   confidence in each dimension be the same.  Confidence should be
831	   unchanged when performing this conversion.

833	   Two-dimensional shapes are converted to a Circle; three-dimensional
834	   shapes are converted to a Sphere.

836	5.3.  Three-Dimensional to Two-Dimensional Conversion

838	   A three-dimensional shape can be easily converted to a two-
839	   dimensional shape by removing the altitude component.  A sphere
840	   becomes a circle; a prism becomes a polygon; an ellipsoid becomes an
841	   ellipse.  Each conversion is simple, requiring only the removal of
842	   those elements relating to altitude.

844	   The altitude is unspecified for a two-dimensional shape and therefore
845	   has unlimited uncertainty along the vertical axis.  The confidence
846	   for the two-dimensional shape is thus higher than the three-
847	   dimensional shape.  Assuming equal confidence on each axis, the
848	   confidence of the circle can be increased using the following
849	   approximate formula:

851	      C[2d] >= C[3d] ^ (2/3)

853	   "C[2d]" is the confidence of the two-dimensional shape and "C[3d]" is
854	   the confidence of the three-dimensional shape.  For example, a Sphere
855	   with a confidence of 95% can be simplified to a Circle of equal
856	   radius with confidence of 96.6%.
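
	   The approximation is a one-line calculation.  The sketch below
	   uses a hypothetical function name and expresses confidence as a
	   fraction between 0 and 1.

```python
def confidence_2d_from_3d(c3d):
    """Lower bound on two-dimensional confidence: C[3d] ^ (2/3)."""
    if not 0.0 <= c3d <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    return c3d ** (2.0 / 3.0)
```

	   For an input of 0.95, this evaluates to roughly 0.966, matching
	   the Sphere-to-Circle example in the text.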

858	5.4.  Increasing and Decreasing Uncertainty and Confidence

860	   The combination of uncertainty and confidence provides a great deal
861	   of information about the nature of the data that is being measured.
862	   If uncertainty, confidence and PDF are known, certain information can
863	   be extrapolated.  In particular, the uncertainty can be scaled to
864	   meet a desired confidence or the confidence for a particular region
865	   of uncertainty can be found.

867	   In general, confidence decreases as the region of uncertainty
868	   decreases in size and confidence increases as the region of
869	   uncertainty increases in size.  However, this depends on the PDF;
870	   expanding the region of uncertainty for a rectangular distribution
871	   has no effect on confidence without additional information.  If the
872	   region of uncertainty is increased during the process of obfuscation
873	   (see [RFC6772]), then the confidence cannot be increased.

875	   A region of uncertainty that is reduced in size always has a lower
876	   confidence.

878	   A region of uncertainty that has an unknown PDF shape cannot be
879	   reduced in size reliably.  The region of uncertainty can be expanded,
880	   but only if confidence is not increased.

882	   This section makes the simplifying assumption that location
883	   information is symmetrically and evenly distributed in each
884	   dimension.  This is not necessarily true in practice.  If better
885	   information is available, alternative methods might produce better
886	   results.

888	5.4.1.  Rectangular Distributions

890	   Uncertainty that follows a rectangular distribution can only be
891	   decreased in size.  Increasing uncertainty has no value, since it has
892	   no effect on confidence.  Since the PDF is constant over the region
893	   of uncertainty, the resulting confidence is determined by the
894	   following formula:

896	      Cr = Co * Ur / Uo

898	   Where "Uo" and "Ur" are the sizes of the original and reduced regions
899	   of uncertainty (either the area or the volume of the region); "Co"
900	   and "Cr" are the confidence values associated with each region.
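
	   As an illustration, the formula can be written as a small
	   function.  The name is hypothetical, and the check that the
	   reduced region is no larger than the original reflects the rule
	   that a rectangular distribution can only be decreased in size.

```python
def reduced_confidence(co, uo, ur):
    """Confidence for a reduced region under a rectangular PDF.

    co: original confidence; uo, ur: sizes (area or volume) of the
    original and reduced regions, in the same units.
    """
    if uo <= 0 or ur < 0 or ur > uo:
        raise ValueError("expected 0 <= ur <= uo and uo > 0")
    return co * ur / uo
```

	   For example, reducing a 12600 square meter region at 95%
	   confidence to 4566.2 square meters gives approximately 34%.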

902	   Information is lost by decreasing the region of uncertainty for a
903	   rectangular distribution.  Once reduced in size, the uncertainty
904	   region cannot subsequently be increased in size.

906	5.4.2.  Normal Distributions

908	   Uncertainty and confidence can be both increased and decreased for a
909	   normal distribution.  This calculation depends on the number of
910	   dimensions of the uncertainty region.

912	   For a normal distribution, uncertainty and confidence are related to
913	   the standard deviation of the function.  The following function
914	   defines the relationship between standard deviation, uncertainty, and
915	   confidence along a single axis:

917	      S[x] = U[x] / ( sqrt(2) * erfinv(C[x]) )

919	   Where "S[x]" is the standard deviation, "U[x]" is the uncertainty,
920	   and "C[x]" is the confidence along a single axis.  "erfinv" is the
921	   inverse error function.

923	   Scaling a normal distribution in two or three dimensions requires
924	   several assumptions.  Firstly, it is assumed that the distribution
925	   along each axis is independent.  Secondly, the confidence for each
926	   axis is assumed to be the same.  Therefore, the confidence along each
927	   axis can be assumed to be:

929	      C[x] = Co ^ (1/n)

931	   Where "C[x]" is the confidence along a single axis and "Co" is the
932	   overall confidence and "n" is the number of dimensions in the
933	   uncertainty.

935	   Therefore, to find the uncertainty for each axis at a desired
936	   confidence, "Cd", apply the following formula:

938	      Ud[x] <= U[x] * (erfinv(Cd ^ (1/n)) / erfinv(Co ^ (1/n)))

940	   For regular shapes, this formula can be applied as a scaling factor
941	   in each dimension to reach a required confidence.
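
	   The scaling factor can be sketched as follows.  Python's
	   standard library has no inverse error function, so this example
	   derives one from the standard normal quantile function; that
	   choice, and the function names, are assumptions of this sketch.

```python
import math
from statistics import NormalDist


def erfinv(y):
    """Inverse error function for -1 < y < 1.

    Uses erf(x) = 2 * Phi(x * sqrt(2)) - 1, where Phi is the standard
    normal cumulative distribution function.
    """
    return NormalDist().inv_cdf((1.0 + y) / 2.0) / math.sqrt(2.0)


def scale_factor(co, cd, n):
    """Per-axis scaling from confidence co to desired confidence cd.

    n is the number of dimensions; confidences are fractions that
    lie strictly between 0 and 1.
    """
    if not (0.0 < co < 1.0 and 0.0 < cd < 1.0) or n < 1:
        raise ValueError("expected 0 < co, cd < 1 and n >= 1")
    return erfinv(cd ** (1.0 / n)) / erfinv(co ** (1.0 / n))
```

	   Multiplying each axis of a regular shape by this factor gives the
	   uncertainty at the desired confidence; any subsequent rounding
	   is upward.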

943	5.5.  Determining Whether a Location is Within a Given Region

945	   A number of applications require that a judgment be made about
946	   whether a Target is within a given region of interest.  Given a
947	   location estimate with uncertainty, this judgment can be difficult.
948	   A location estimate represents a probability distribution, and the
949	   true location of the Target cannot be definitively known.  Therefore,
950	   the judgment relies on determining the probability that the Target is
951	   within the region.

953	   The probability that the Target is within a particular region is
954	   found by integrating the PDF over the region.  For a normal
955	   distribution, there are no analytical methods that can be used to
956	   determine the integral of the two or three dimensional PDF over an
957	   arbitrary region.  The complexity of numerical methods is also too
958	   great to be useful in many applications; for example, finding the
959	   integral of the PDF in two or three dimensions across the overlap
960	   between the uncertainty region and the target region.  If the PDF is
961	   unknown, no determination can be made without a simplifying
962	   assumption.

964	   When judging whether a location is within a given region, this
965	   document assumes that uncertainties are rectangular.  This introduces
966	   errors, but simplifies the calculations significantly.  Prior to
967	   applying this assumption, confidence should be scaled to 95%.

969	   Note:  The selection of confidence has a significant impact on the
970	      final result.  Only use a different confidence if an uncertainty
971	      value for 95% confidence cannot be found.

973	   Given the assumption of a rectangular distribution, the probability
974	   that a Target is found within a given region is found by first
975	   finding the area (or volume) of overlap between the uncertainty
976	   region and the region of interest.  This is multiplied by the
977	   confidence of the location estimate to determine the probability.
978	   Figure 7 shows an example of finding the area of overlap between the
979	   region of uncertainty and the region of interest.

981	                    _.-""""-._
982	                  .'          `.    _ Region of
983	                 /              \  /  Uncertainty
984	              ..+-"""--..        |
985	           .-'  | :::::: `-.     |
986	         ,'     | :: Ao ::: `.   |
987	        /        \ :::::::::: \ /
988	       /          `._ :::::: _.X
989	      |              `-....-'   |
990	      |                         |
991	      |                         |
992	       \                       /
993	        `.                   .'  \_ Region of
994	          `._             _.'       Interest
995	             `--..___..--'

997	          Figure 7: Area of Overlap Between Two Circular Regions

999	   Once the area of overlap, "Ao", is known, the probability that the
1000	   Target is within the region of interest, "Pi", is:

1002	      Pi = Co * Ao / Au

1004	   Given that the area of the region of uncertainty is "Au" and the
1005	   confidence is "Co".

1007	   This probability is often input to a decision process that has a
1008	   limited set of outcomes; therefore, a threshold value needs to be
1009	   selected.  Depending on the application, different threshold
1010	   probabilities might be selected.  In the absence of specific
1011	   recommendations, this document suggests that the probability be
1012	   greater than 50% before a decision is made.  If the decision process
1013	   selects between two or more regions, as is required by [RFC5222],
1014	   then the region with the highest probability can be selected.
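
	   The decision step can be sketched as follows.  The names are
	   hypothetical, and the 50% default simply mirrors the suggestion
	   above; applications are expected to choose their own threshold.

```python
def is_within(pi, threshold=0.5):
    """True if the probability pi exceeds the decision threshold."""
    return pi > threshold


def most_likely_region(probabilities):
    """Pick the region with the highest probability.

    probabilities: mapping of region identifier -> probability.
    """
    if not probabilities:
        raise ValueError("no candidate regions")
    return max(probabilities, key=probabilities.get)
```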

1016	5.5.1.  Determining the Area of Overlap for Two Circles

1018	   Determining the area of overlap between two arbitrary shapes is a
1019	   non-trivial process.  Reducing areas to circles (see Section 5.2)
1020	   enables the application of the following process.

1022	   Given the radius of the first circle "r", the radius of the second
1023	   circle "R" and the distance between their center points "d", the
1024	   following set of formulas provides the area of overlap "Ao".

1026	   o  If the circles don't overlap, that is "d >= r+R", "Ao" is zero.

1028	   o  If one of the two circles is entirely within the other, that is
1029	      "d <= |r-R|", the area of overlap is the area of the smaller
1030	      circle.

1032	   o  Otherwise, if the circles partially overlap, that is "d < r+R" and
1033	      "d > |r-R|", find "Ao" using:

1035	         a = (r^2 - R^2 + d^2)/(2*d)

1037	         Ao = r^2*acos(a/r) + R^2*acos((d - a)/R) - d*sqrt(r^2 - a^2)

1039	   A value for "d" can be determined by converting the center points to
1040	   Cartesian coordinates and calculating the distance between the two
1041	   center points:

1043	      d = sqrt((x1-x2)^2 + (y1-y2)^2 + (z1-z2)^2)
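
	   The three cases above, together with the probability formula
	   from Section 5.5, can be sketched as follows.  The function
	   names are hypothetical, and all lengths are assumed to be in the
	   same units.

```python
import math


def ecef_distance(p1, p2):
    """Straight-line distance between two (x, y, z) center points."""
    return math.dist(p1, p2)


def circle_overlap_area(r, R, d):
    """Area of overlap "Ao" of circles with radii r and R, d apart."""
    if d >= r + R:
        return 0.0  # the circles do not overlap
    if d <= abs(r - R):
        return math.pi * min(r, R) ** 2  # one circle inside the other
    a = (r * r - R * R + d * d) / (2 * d)
    return (r * r * math.acos(a / r)
            + R * R * math.acos((d - a) / R)
            - d * math.sqrt(r * r - a * a))


def probability_in_region(co, r_uncertainty, r_interest, d):
    """Pi = Co * Ao / Au, assuming a rectangular distribution."""
    ao = circle_overlap_area(r_uncertainty, r_interest, d)
    au = math.pi * r_uncertainty ** 2
    return co * ao / au
```

	   With a 99.1 meter uncertainty radius at 95% confidence, a 1950
	   meter region of interest, and a center distance of 1915.26
	   meters, this sketch gives a probability close to the 67.8%
	   quoted in Section 6.3.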

1045	5.5.2.  Determining the Area of Overlap for Two Polygons

1047	   A calculation of overlap based on polygons can give better results
1048	   than the circle-based method.  However, efficient calculation of
1049	   overlapping area is non-trivial.  Algorithms such as Vatti's clipping
1050	   algorithm [Vatti92] can be used.

1052	   For large polygonal areas, it might be that geodesic interpolation is
1053	   used.  In these cases, altitude is also frequently omitted in
1054	   describing the polygon.  For such shapes, a planar projection can
1055	   still give a good approximation of the area of overlap if the larger
1056	   area polygon is projected onto the local tangent plane of the
1057	   smaller.  This is only possible if the only area of interest is that
1058	   contained within the smaller polygon.  Where the entire area of the
1059	   larger polygon is of interest, geodesic interpolation is necessary.

1061	6.  Examples

1063	   This section presents some examples of how to apply the methods
1064	   described in Section 5.

1066	6.1.  Reduction to a Point or Circle

1068	   Alice receives a location estimate from her LIS that contains an
1069	   ellipsoidal region of uncertainty.  This information is provided at
1070	   19% confidence with a normal PDF.  A PIDF-LO extract for this
1071	   information is shown in Figure 8.

1073	     <gp:geopriv>
1074	       <gp:location-info>
1075	         <gs:Ellipsoid srsName="urn:ogc:def:crs:EPSG::4979">
1076	           <gml:pos>-34.407242 150.882518 34</gml:pos>
1077	           <gs:semiMajorAxis uom="urn:ogc:def:uom:EPSG::9001">
1078	             7.7156
1079	           </gs:semiMajorAxis>
1080	           <gs:semiMinorAxis uom="urn:ogc:def:uom:EPSG::9001">
1081	             3.31
1082	           </gs:semiMinorAxis>
1083	           <gs:verticalAxis uom="urn:ogc:def:uom:EPSG::9001">
1084	             28.7
1085	           </gs:verticalAxis>
1086	           <gs:orientation uom="urn:ogc:def:uom:EPSG::9102">
1087	             43
1088	           </gs:orientation>
1089	         </gs:Ellipsoid>
1090	       </gp:location-info>
1091	       <gp:usage-rules/>
1092	     </gp:geopriv>

1094	                                 Figure 8

1096	   This information can be reduced to a point simply by extracting the
1097	   center point, that is [-34.407242, 150.882518, 34].

1099	   If some limited uncertainty were required, the estimate could be
1100	   converted into a circle or sphere.  To convert to a sphere, the
1101	   radius is the largest of the semi-major, semi-minor and vertical
1102	   axes; in this case, 28.7 meters.

1104	   However, if only a circle is required, the altitude can be dropped as
1105	   can the altitude uncertainty (the vertical axis of the ellipsoid),
1106	   resulting in a circle at [-34.407242, 150.882518] of radius 7.7156
1107	   meters.

1109	   Bob receives a location estimate with a Polygon shape.  This
1110	   information is shown in Figure 9.

1112	     <gml:Polygon srsName="urn:ogc:def:crs:EPSG::4326">
1113	       <gml:exterior>
1114	         <gml:LinearRing>
1115	           <gml:posList>
1116	             -33.856625 151.215906 -33.856299 151.215343
1117	             -33.856326 151.214731 -33.857533 151.214495
1118	             -33.857720 151.214613 -33.857369 151.215375
1119	             -33.856625 151.215906
1120	           </gml:posList>
1121	         </gml:LinearRing>
1122	       </gml:exterior>
1123	     </gml:Polygon>

1125	                                 Figure 9

1127	   To convert this to a point, each vertex is first assigned an
1128	   altitude of zero and converted to ECEF coordinates (see Appendix A).
1129	   Then a normal vector for this polygon is found (see Appendix B).  The
1130	   result of each of these stages is shown in Figure 10.  Note that the
1131	   numbers shown are rounded for display only; no rounding is applied
1132	   during calculation, since it would contribute significant errors.

1134	   Polygon in ECEF coordinate space
1135	      (repeated point omitted and transposed to fit):
1136	            [ -4.6470e+06  2.5530e+06  -3.5333e+06 ]
1137	            [ -4.6470e+06  2.5531e+06  -3.5332e+06 ]
1138	    pecef = [ -4.6470e+06  2.5531e+06  -3.5332e+06 ]
1139	            [ -4.6469e+06  2.5531e+06  -3.5333e+06 ]
1140	            [ -4.6469e+06  2.5531e+06  -3.5334e+06 ]
1141	            [ -4.6469e+06  2.5531e+06  -3.5333e+06 ]

1143	   Normal Vector: n = [ -0.72782  0.39987  -0.55712 ]

1145	   Transformation Matrix:
1146	        [ -0.48152  -0.87643   0.00000 ]
1147	    t = [ -0.48828   0.26827   0.83043 ]
1148	        [ -0.72782   0.39987  -0.55712 ]

1150	   Transformed Coordinates:
1151	             [  8.3206e+01  1.9809e+04  6.3715e+06 ]
1152	             [  3.1107e+01  1.9845e+04  6.3715e+06 ]
1153	    pecef' = [ -2.5528e+01  1.9842e+04  6.3715e+06 ]
1154	             [ -4.7367e+01  1.9708e+04  6.3715e+06 ]
1155	             [ -3.6447e+01  1.9687e+04  6.3715e+06 ]
1156	             [  3.4068e+01  1.9726e+04  6.3715e+06 ]

1158	   Two-dimensional polygon area: A = 12600 m^2
1159	   Two-dimensional polygon centroid: C' = [ 8.8184e+00  1.9775e+04 ]

1161	   Average of pecef' z coordinates: 6.3715e+06

1163	   Reverse Transformation Matrix:
1164	         [ -0.48152  -0.48828  -0.72782 ]
1165	    t' = [ -0.87643   0.26827   0.39987 ]
1166	         [  0.00000   0.83043  -0.55712 ]

1168	   Polygon centroid (ECEF): C = [ -4.6470e+06  2.5531e+06  -3.5333e+06 ]
1169	   Polygon centroid (Geo): Cg = [ -33.856926  151.215102  -4.9537e-04 ]

1171	                                 Figure 10

1173	   The point conversion for the polygon uses the final result, "Cg",
1174	   ignoring the altitude since the original shape did not include
1175	   altitude.

1177	   To convert this to a circle, take the maximum distance in ECEF
1178	   coordinates from the center point to each of the points.  This
1179	   results in a radius of 99.1 meters.  Confidence is unchanged.

1181	6.2.  Increasing and Decreasing Confidence

1183	   Assume that confidence is known to be 19% for Alice's location
1184	   information.  This is a typical value for a three-dimensional
1185	   ellipsoid uncertainty of normal distribution where the standard
1186	   deviation is used directly for uncertainty in each dimension.  The
1187	   confidence associated with Alice's location estimate is quite low for
1188	   many applications.  Since the estimate is known to follow a normal
1189	   distribution, the method in Section 5.4.2 can be used.  Each axis can
1190	   be scaled by:

1192	      scale = erfinv(0.95^(1/3)) / erfinv(0.19^(1/3)) = 2.9937

1194	   Ensuring that rounding always increases uncertainty, the location
1195	   estimate at 95% includes a semi-major axis of 23.1, a semi-minor axis
1196	   of 10 and a vertical axis of 86.

1198	   Bob's location estimate (from the previous example) covers an area of
1199	   approximately 12600 square meters.  If the estimate follows a
1200	   rectangular distribution, the region of uncertainty can be reduced in
1201	   size.  Here we find the confidence that Bob is within the smaller
1202	   area of the concert hall.  For the concert hall, the polygon
1203	   [-33.856473, 151.215257; -33.856322, 151.214973;
1204	   -33.856424, 151.21471; -33.857248, 151.214753;
1205	   -33.857413, 151.214941; -33.857311, 151.215128] is used.  To use this
1206	   new region of uncertainty, find its area using the same translation
1207	   method described in Section 5.1.1.2, which produces 4566.2 square
1208	   meters.  Given that the concert hall is entirely within Bob's
1209	   original location estimate, the confidence associated with the
1210	   smaller area is therefore 95% * 4566.2 / 12600 = 34%.

1212	6.3.  Matching Location Estimates to Regions of Interest

1214	   Suppose that a circular area is defined centered at
1215	   [-33.872754, 151.20683] with a radius of 1950 meters.  To determine
1216	   whether Bob is found within this area - given that Bob is at
1217	   [-33.856926, 151.215102] with an uncertainty radius of 99.1 meters -
1218	   we apply the method in Section 5.5.  Using the converted Circle shape
1219	   for Bob's location, the distance between these points is found to be
1220	   1915.26 meters.  The area of overlap between Bob's location estimate
1221	   and the region of interest is therefore approximately 22000 square
1222	   meters and the area of Bob's location estimate is 30853 square
1223	   meters.  This gives the estimated probability that Bob is less than
1224	   1950 meters from the selected point as 67.8%.

1226	   Note that if 1920 meters were chosen for the distance from the
1227	   selected point, the area of overlap is only 16196 square meters and
1228	   the confidence is 49.8%.  Therefore, it is marginally more likely
1229	   that Bob is outside the region of interest, despite the center point
1230	   of his location estimate being within the region.
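
The overlap figures in this section can be reproduced with the standard
formula for the intersection area of two circles.  In this Python
sketch, the radius of Bob's Circle is derived from the stated area of
30853 square meters (about 99.1 meters), and the estimate is assumed to
carry 95% confidence with a rectangular distribution.  The function
names are hypothetical.

```python
import math

def circle_overlap_area(d, r1, r2):
    """Area shared by two circles whose centers are d apart."""
    if d >= r1 + r2:
        return 0.0                              # disjoint
    if d <= abs(r1 - r2):
        return math.pi * min(r1, r2) ** 2       # one inside the other
    a1 = math.acos((d * d + r1 * r1 - r2 * r2) / (2.0 * d * r1))
    a2 = math.acos((d * d + r2 * r2 - r1 * r1) / (2.0 * d * r2))
    tri = 0.5 * math.sqrt((-d + r1 + r2) * (d + r1 - r2)
                          * (d - r1 + r2) * (d + r1 + r2))
    return r1 * r1 * a1 + r2 * r2 * a2 - tri

def confidence_in_region(d, estimate_area, region_radius,
                         confidence=95.0):
    """Return (overlap area, confidence in percent) for the region."""
    r = math.sqrt(estimate_area / math.pi)
    overlap = circle_overlap_area(d, r, region_radius)
    return overlap, confidence * overlap / estimate_area

overlap_1950, conf_1950 = confidence_in_region(1915.26, 30853.0, 1950.0)
overlap_1920, conf_1920 = confidence_in_region(1915.26, 30853.0, 1920.0)
```

Scaling by the area ratio is only valid for a rectangular distribution;
a normal distribution would need the probability mass to be integrated
over the overlap instead.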

1232	6.4.  PIDF-LO With Confidence Example

1234	   The PIDF-LO document in Figure 11 includes a representation of
1235	   uncertainty as a circular area.  The confidence element (on the line
1236	   marked with a comment) indicates that the confidence is 67% and that
1237	   it follows a normal distribution.

1239	     <pidf:presence
1240	         xmlns:pidf="urn:ietf:params:xml:ns:pidf"
1241	         xmlns:dm="urn:ietf:params:xml:ns:pidf:data-model"
1242	         xmlns:gp="urn:ietf:params:xml:ns:pidf:geopriv10"
1243	         xmlns:gs="http://www.opengis.net/pidflo/1.0"
1244	         xmlns:gml="http://www.opengis.net/gml"
1245	         xmlns:con="urn:ietf:params:xml:ns:geopriv:conf"
1246	         entity="pres:alice@example.com">
1247	       <dm:device id="sg89ab">
1248	         <pidf:status>
1249	           <gp:geopriv>
1250	             <gp:location-info>
1251	               <gs:Circle srsName="urn:ogc:def:crs:EPSG::4326">
1252	                 <gml:pos>42.5463 -73.2512</gml:pos>
1253	                 <gs:radius uom="urn:ogc:def:uom:EPSG::9001">
1254	                   850.24
1255	                 </gs:radius>
1256	               </gs:Circle>
1257	   <!-- c -->  <con:confidence pdf="normal">67</con:confidence>
1258	             </gp:location-info>
1259	             <gp:usage-rules/>
1260	           </gp:geopriv>
1261	         </pidf:status>
1262	         <dm:deviceID>mac:010203040506</dm:deviceID>
1263	       </dm:device>
1264	     </pidf:presence>

1266	                Figure 11: Example PIDF-LO with Confidence
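
A recipient might extract the confidence element along the following
lines.  This Python sketch uses only the standard library and the
namespace defined by the schema in Section 7.  The function name
read_confidence and the trimmed sample document are assumptions made for
illustration; a missing pdf attribute is read as "unknown", matching the
schema default.

```python
import xml.etree.ElementTree as ET

CONF_NS = "urn:ietf:params:xml:ns:geopriv:conf"
GP_NS = "urn:ietf:params:xml:ns:pidf:geopriv10"

def read_confidence(xml_text):
    """Return (confidence, pdf) from the first location-info element.

    confidence is a float, or None when the element is absent or
    carries the value "unknown".
    """
    root = ET.fromstring(xml_text)
    info = root.find(".//{%s}location-info" % GP_NS)
    if info is None:
        return None, "unknown"
    elem = info.find("{%s}confidence" % CONF_NS)
    if elem is None:
        return None, "unknown"
    text = (elem.text or "").strip()
    value = None if text == "unknown" else float(text)
    return value, elem.get("pdf", "unknown")

# Trimmed sample, not a complete PIDF-LO document.
SAMPLE = """\
<gp:geopriv xmlns:gp="urn:ietf:params:xml:ns:pidf:geopriv10"
            xmlns:con="urn:ietf:params:xml:ns:geopriv:conf">
  <gp:location-info>
    <con:confidence pdf="normal">67</con:confidence>
  </gp:location-info>
</gp:geopriv>
"""

confidence, pdf = read_confidence(SAMPLE)
```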

1268	7.  Confidence Schema

1270	   <?xml version="1.0"?>
1271	   <xs:schema
1272	       xmlns:conf="urn:ietf:params:xml:ns:geopriv:conf"
1273	       xmlns:xs="http://www.w3.org/2001/XMLSchema"
1274	       targetNamespace="urn:ietf:params:xml:ns:geopriv:conf"
1275	       elementFormDefault="qualified"
1276	       attributeFormDefault="unqualified">

1278	     <xs:annotation>
1279	       <xs:appinfo
1280	           source="urn:ietf:params:xml:schema:geopriv:conf">
1281	         PIDF-LO Confidence
1282	       </xs:appinfo>
1283	       <xs:documentation source="http://www.ietf.org/rfc/rfcXXXX.txt">
1284	         <!-- [[NOTE TO RFC-EDITOR: Please replace above URL with URL of
1285	              published RFC and remove this note.]] -->
1286	         This schema defines an element that is used for indicating
1287	         confidence in PIDF-LO documents.
1288	       </xs:documentation>
1289	     </xs:annotation>

1291	     <xs:element name="confidence" type="conf:confidenceType"/>

1293	     <xs:complexType name="confidenceType">
1294	       <xs:simpleContent>
1295	         <xs:extension base="conf:confidenceBase">
1296	           <xs:attribute name="pdf" type="conf:pdfType"
1297	                         default="unknown"/>
1298	         </xs:extension>
1299	       </xs:simpleContent>
1300	     </xs:complexType>

1302	     <xs:simpleType name="confidenceBase">
1303	       <xs:union>
1304	         <xs:simpleType>
1305	           <xs:restriction base="xs:decimal">
1306	             <xs:minExclusive value="0.0"/>
1307	             <xs:maxExclusive value="100.0"/>
1308	           </xs:restriction>
1309	         </xs:simpleType>
1310	         <xs:simpleType>
1311	           <xs:restriction base="xs:token">
1312	             <xs:enumeration value="unknown"/>
1313	           </xs:restriction>
1314	         </xs:simpleType>
1315	       </xs:union>
1316	     </xs:simpleType>

1318	     <xs:simpleType name="pdfType">
1319	       <xs:restriction base="xs:token">
1320	         <xs:enumeration value="unknown"/>
1321	         <xs:enumeration value="normal"/>
1322	         <xs:enumeration value="rectangular"/>
1323	       </xs:restriction>
1324	     </xs:simpleType>

1326	   </xs:schema>
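
Where a schema validator is not available, the value constraints above
can be checked directly.  The following Python sketch mirrors the
confidenceBase and pdfType definitions; it is not a substitute for
schema validation, and the function names and the regular expression
for the xs:decimal lexical form are assumptions made for illustration.

```python
import re
from decimal import Decimal

# xs:decimal allows an optional sign and a decimal point, but no exponent.
XS_DECIMAL = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)")
PDF_VALUES = ("unknown", "normal", "rectangular")

def collapse(text):
    """Apply whitespace collapsing, as for xs:token."""
    return " ".join(text.split())

def valid_confidence(text):
    """True for "unknown" or a decimal strictly between 0 and 100."""
    token = collapse(text)
    if token == "unknown":
        return True
    if XS_DECIMAL.fullmatch(token) is None:
        return False
    return 0 < Decimal(token) < 100

def valid_pdf(text):
    """True if the value is one of the pdfType enumeration values."""
    return collapse(text) in PDF_VALUES
```

Both bounds are exclusive, so neither "0" nor "100" is accepted.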

1328	8.  IANA Considerations

1330	8.1.  URN Sub-Namespace Registration for
1331	      urn:ietf:params:xml:ns:geopriv:conf

1333	   This section registers a new XML namespace,
1334	   "urn:ietf:params:xml:ns:geopriv:conf", as per the guidelines in
1335	   [RFC3688].

1337	   URI:  urn:ietf:params:xml:ns:geopriv:conf

1339	   Registrant Contact:  IETF, GEOPRIV working group, (geopriv@ietf.org),
1340	      Martin Thomson (martin.thomson@gmail.com).

1342	   XML:

1344	         BEGIN
1345	           <?xml version="1.0"?>
1346	           <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
1347	             "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
1348	           <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
1349	             <head>
1350	               <title>PIDF-LO Confidence Attribute</title>
1351	             </head>
1352	             <body>
1353	               <h1>Namespace for PIDF-LO Confidence Attribute</h1>
1354	               <h2>urn:ietf:params:xml:ns:geopriv:conf</h2>
1355	   [[NOTE TO IANA/RFC-EDITOR: Please update RFC URL and replace XXXX
1356	       with the RFC number for this specification.]]
1357	               <p>See <a href="[[RFC URL]]">RFCXXXX</a>.</p>
1358	             </body>
1359	           </html>
1360	         END

1362	8.2.  XML Schema Registration

1364	   This section registers an XML schema as per the guidelines in
1365	   [RFC3688].

1367	   URI:  urn:ietf:params:xml:schema:geopriv:conf

1369	   Registrant Contact:  IETF, GEOPRIV working group, (geopriv@ietf.org),
1370	      Martin Thomson (martin.thomson@gmail.com).

1372	   Schema:  The XML for this schema can be found as the entirety of
1373	      Section 7 of this document.

1375	9.  Security Considerations

1377	   This document describes methods for managing and manipulating
1378	   uncertainty in location.  No specific security concerns arise from
1379	   most of the information provided.

1381	   Adding confidence to location information risks misinterpretation by
1382	   consumers of location that do not understand the element.  This could
1383	   be exploited, particularly when reducing confidence, since the
1384	   resulting uncertainty region might include locations that are less
1385	   likely to contain the target than the recipient expects.  Since this
1386	   sort of error is always a possibility, the impact of this is low.

1388	10.  Acknowledgements

1390	   Peter Rhodes provided assistance with some of the mathematical
1391	   groundwork on this document.  Dan Cornford provided a detailed review
1392	   and many terminology corrections.

1394	11.  References

1396	11.1.  Normative References

1398	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
1399	              Requirement Levels", BCP 14, RFC 2119, March 1997.

1401	   [RFC3688]  Mealling, M., "The IETF XML Registry", BCP 81, RFC 3688,
1402	              January 2004.

1404	   [RFC3693]  Cuellar, J., Morris, J., Mulligan, D., Peterson, J., and
1405	              J. Polk, "Geopriv Requirements", RFC 3693, February 2004.

1407	   [RFC4119]  Peterson, J., "A Presence-based GEOPRIV Location Object
1408	              Format", RFC 4119, December 2005.

1410	   [RFC5139]  Thomson, M. and J. Winterbottom, "Revised Civic Location
1411	              Format for Presence Information Data Format Location
1412	              Object (PIDF-LO)", RFC 5139, February 2008.

1414	   [RFC5491]  Winterbottom, J., Thomson, M., and H. Tschofenig, "GEOPRIV
1415	              Presence Information Data Format Location Object (PIDF-LO)
1416	              Usage Clarification, Considerations, and Recommendations",
1417	              RFC 5491, March 2009.

1419	   [RFC6225]  Polk, J., Linsner, M., Thomson, M., and B. Aboba, "Dynamic
1420	              Host Configuration Protocol Options for Coordinate-Based
1421	              Location Configuration Information", RFC 6225, July 2011.

1423	   [RFC6280]  Barnes, R., Lepinski, M., Cooper, A., Morris, J.,
1424	              Tschofenig, H., and H. Schulzrinne, "An Architecture for
1425	              Location and Location Privacy in Internet Applications",
1426	              BCP 160, RFC 6280, July 2011.

1428	11.2.  Informative References

1430	   [Convert]  Burtch, R., "A Comparison of Methods Used in Rectangular
1431	              to Geodetic Coordinate Transformations", April 2006.

1433	   [GeoShape]
1434	              Thomson, M. and C. Reed, "GML 3.1.1 PIDF-LO Shape
1435	              Application Schema for use by the Internet Engineering
1436	              Task Force (IETF)", Candidate OpenGIS Implementation
1437	              Specification 06-142r1, Version: 1.0, April 2007.

1439	   [ISO.GUM]  ISO/IEC, "Guide to the expression of uncertainty in
1440	              measurement (GUM)", Guide 98:1995, 1995.

1442	   [NIST.TN1297]
1443	              Taylor, B. and C. Kuyatt, "Guidelines for Evaluating and
1444	              Expressing the Uncertainty of NIST Measurement Results",
1445	              Technical Note 1297, September 1994.

1447	   [RFC5222]  Hardie, T., Newton, A., Schulzrinne, H., and H.
1448	              Tschofenig, "LoST: A Location-to-Service Translation
1449	              Protocol", RFC 5222, August 2008.

1451	   [RFC6772]  Schulzrinne, H., Tschofenig, H., Cuellar, J., Polk, J.,
1452	              Morris, J., and M. Thomson, "Geolocation Policy: A
1453	              Document Format for Expressing Privacy Preferences for
1454	              Location Information", RFC 6772, January 2013.

1456	   [Sunday02]
1457	              Sunday, D., "Fast polygon area and Newell normal
1458	              computation", Journal of Graphics Tools (JGT),
1459	              7(2):9-13, 2002,
1460	              <http://www.acm.org/jgt/papers/Sunday02/>.

1462	   [TS-3GPP-23_032]
1463	              3GPP, "Universal Geographic Area Description (GAD)", 3GPP
1464	              TS 23.032 11.0.0, September 2012.

1466	   [Vatti92]  Vatti, B., "A generic solution to polygon clipping",
1467	              Communications of the ACM, Vol. 35, Issue 7, pp. 56-63,
1468	              1992, <http://portal.acm.org/citation.cfm?id=129906>.

1470	   [WGS84]    US National Imagery and Mapping Agency, "Department of
1471	              Defense (DoD) World Geodetic System 1984 (WGS 84), Third
1472	              Edition", NIMA TR8350.2, January 2000.

1474	Appendix A.  Conversion Between Cartesian and Geodetic Coordinates in
1475	             WGS84

1477	   The process of conversion from geodetic (latitude, longitude and
1478	   altitude) to earth-centered, earth-fixed (ECEF) Cartesian coordinates
1479	   is relatively simple.

1481	   In this section, the following constants and derived values are used
1482	   from the definition of WGS84 [WGS84]:

1484	      {radius of ellipsoid} R = 6378137 meters

1486	      {inverse flattening} 1/f = 298.257223563

1488	      {first eccentricity squared} e^2 = f * (2 - f)

1490	      {second eccentricity squared} e'^2 = e^2 / (1 - e^2)

1492	   To convert geodetic coordinates (latitude, longitude, altitude) to
1493	   ECEF coordinates (X, Y, Z), use the following relationships:

1495	      N = R / sqrt(1 - e^2 * sin(latitude)^2)

1497	      X = (N + altitude) * cos(latitude) * cos(longitude)

1499	      Y = (N + altitude) * cos(latitude) * sin(longitude)

1501	      Z = (N*(1 - e^2) + altitude) * sin(latitude)
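
The forward conversion translates directly into code.  The following
Python sketch is one possible transcription; the function name and the
use of degrees for input angles are assumptions.

```python
import math

R = 6378137.0                 # radius of ellipsoid, meters
F = 1.0 / 298.257223563       # flattening
E2 = F * (2.0 - F)            # first eccentricity squared

def geodetic_to_ecef(lat_deg, lon_deg, altitude=0.0):
    """Convert WGS84 latitude, longitude (degrees) and altitude
    (meters) to ECEF X, Y, Z in meters."""
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    n = R / math.sqrt(1.0 - E2 * math.sin(lat) ** 2)
    x = (n + altitude) * math.cos(lat) * math.cos(lon)
    y = (n + altitude) * math.cos(lat) * math.sin(lon)
    z = (n * (1.0 - E2) + altitude) * math.sin(lat)
    return x, y, z
```

At the equator and prime meridian this yields a point on the X axis at
distance R; at the poles X and Y vanish and Z is R * (1 - f).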

1503	   The reverse conversion requires more complex computation and most
1504	   methods introduce some error in latitude and altitude.  A range of
1505	   techniques are described in [Convert].  A variant on the method
1506	   originally proposed by Bowring, which results in an acceptably small
1507	   error, is described by the following:

1509	      p = sqrt(X^2 + Y^2)

1511	      r = sqrt(X^2 + Y^2 + Z^2)

1513	      u = atan((1-f) * Z * (1 + e'^2 * (1-f) * R / r) / p)

1515	      latitude = atan((Z + e'^2 * (1-f) * R * sin(u)^3)
1516	                      / (p - e^2 * R * cos(u)^3))
1517	      longitude = atan2(Y, X)

1519	      altitude = sqrt((p - R * cos(u))^2 + (Z - (1-f) * R * sin(u))^2)

1521	   If the point is near the poles, that is "p < 1", the value for
1522	   altitude that this method produces is unstable.  A simpler method for
1523	   determining the altitude of a point near the poles is:

1525	      altitude = |Z| - R * (1 - f)
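
The reverse conversion might be sketched in Python as follows.  Several
choices here are assumptions rather than part of the method as written:
atan2 replaces atan so that p = 0 does not cause a division by zero; the
second eccentricity squared uses the standard definition
e'^2 = e^2 / (1 - e^2); and, away from the poles, altitude is computed
from the final latitude with the common closed form
p * cos(lat) + Z * sin(lat) - R * sqrt(1 - e^2 * sin(lat)^2), which is
signed, in place of the expression in u, which yields only a magnitude.

```python
import math

R = 6378137.0                 # radius of ellipsoid, meters
F = 1.0 / 298.257223563       # flattening
E2 = F * (2.0 - F)            # first eccentricity squared
EP2 = E2 / (1.0 - E2)         # second eccentricity squared
B = R * (1.0 - F)             # semi-minor axis, meters

def ecef_to_geodetic(x, y, z):
    """Convert ECEF meters to (latitude deg, longitude deg, altitude m)."""
    p = math.hypot(x, y)
    r = math.sqrt(x * x + y * y + z * z)
    if r == 0.0:
        raise ValueError("the center of the Earth has no geodetic position")
    u = math.atan2((1.0 - F) * z * (1.0 + EP2 * B / r), p)
    lat = math.atan2(z + EP2 * B * math.sin(u) ** 3,
                     p - E2 * R * math.cos(u) ** 3)
    lon = math.atan2(y, x)
    if p < 1.0:
        # Near the poles, use the simpler expression from the text.
        altitude = abs(z) - B
    else:
        # Swapped-in closed form based on the final latitude (assumption).
        altitude = (p * math.cos(lat) + z * math.sin(lat)
                    - R * math.sqrt(1.0 - E2 * math.sin(lat) ** 2))
    return math.degrees(lat), math.degrees(lon), altitude
```

Longitude is arbitrary at the poles; atan2(0, 0) returns zero there.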

1527	Appendix B.  Calculating the Upward Normal of a Polygon

1529	   For a polygon that is guaranteed to be convex and coplanar, the
1530	   upward normal can be found by taking the vector cross product of
1531	   adjacent edges.

1533	   For more general cases the Newell method of approximation described
1534	   in [Sunday02] may be applied.  In particular, this method can be used
1535	   if the points are only approximately coplanar, and for non-convex
1536	   polygons.

1538	   This process requires a Cartesian coordinate system.  Therefore,
1539	   convert the geodetic coordinates of the polygon to Cartesian, ECEF
1540	   coordinates (Appendix A).  If no altitude is specified, assume an
1541	   altitude of zero.

1543	   This method can be condensed to the following set of equations:

1545	      Nx = sum from i=1..n of (y[i] * (z[i+1] - z[i-1]))

1547	      Ny = sum from i=1..n of (z[i] * (x[i+1] - x[i-1]))

1549	      Nz = sum from i=1..n of (x[i] * (y[i+1] - y[i-1]))

1551	   For these formulae, the polygon is made of points
1552	   "(x[1], y[1], z[1])" through "(x[n], y[n], z[n])".  Each array is
1553	   treated as circular, that is, "x[0] == x[n]" and "x[n+1] == x[1]".

1555	   To convert this into a unit vector, divide each component by the
1556	   length of the vector:

1558	      Nx' = Nx / sqrt(Nx^2 + Ny^2 + Nz^2)

1560	      Ny' = Ny / sqrt(Nx^2 + Ny^2 + Nz^2)

1562	      Nz' = Nz / sqrt(Nx^2 + Ny^2 + Nz^2)
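
The summations and normalization above can be written compactly.  In
this Python sketch the function name is an assumption, points are
Cartesian (x, y, z) tuples, and zero-based indexing with wraparound
stands in for the circular arrays described in the text.

```python
import math

def newell_unit_normal(points):
    """Unit normal of a polygon given as Cartesian (x, y, z) tuples."""
    n = len(points)
    if n < 3:
        raise ValueError("a polygon needs at least three points")
    nx = ny = nz = 0.0
    for i in range(n):
        x, y, z = points[i]
        px, py, pz = points[i - 1]         # index -1 wraps to the last point
        qx, qy, qz = points[(i + 1) % n]   # wraps back to the first point
        nx += y * (qz - pz)
        ny += z * (qx - px)
        nz += x * (qy - py)
    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    if length == 0.0:
        raise ValueError("degenerate polygon has no normal")
    return nx / length, ny / length, nz / length
```

For a square in the z = 0 plane listed anti-clockwise when viewed from
positive z, the result points along positive z; reversing the point
order flips the sign.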

1564	B.1.  Checking that a Polygon Upward Normal Points Up

1566	   RFC 5491 [RFC5491] stipulates that polygons be presented in anti-
1567	   clockwise direction so that the upward normal is in an upward
1568	   direction.  Accidental reversal of points can invert this vector.
1569	   This error can be hard to detect just by looking at the series of
1570	   coordinates that form the polygon.

1572	   Calculate the dot product of the upward normal of the polygon
1573	   (Appendix B) and any vector that points away from the center of the
1574	   Earth from the location of the polygon.  If this product is
1575	   positive, then the polygon upward normal also points away from the
1576	   center of the Earth.

1578	      For unit vectors, the inverse cosine of this value is the angle
1579	      between the horizontal plane and the polygon's approximate plane.

1581	   A unit vector for the upward direction at any point can be found
1582	   based on the latitude (lat) and longitude (lng) of the point, as
1583	   follows:

1585	      Up = [ cos(lat) * cos(lng) ; cos(lat) * sin(lng) ; sin(lat) ]

1587	   For polygons that span less than half the globe, any point in the
1588	   polygon - including the centroid - can be selected to generate an
1589	   approximate up vector for comparison with the upward normal.
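
The check described in this section reduces to a dot product.  The
following Python sketch assumes the polygon normal has already been
computed in ECEF coordinates; the function names are hypothetical, and
the tilt calculation additionally assumes a unit-length normal.

```python
import math

def up_vector(lat_deg, lng_deg):
    """Unit vector pointing up at the given latitude and longitude."""
    lat = math.radians(lat_deg)
    lng = math.radians(lng_deg)
    return (math.cos(lat) * math.cos(lng),
            math.cos(lat) * math.sin(lng),
            math.sin(lat))

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def normal_points_up(normal, lat_deg, lng_deg):
    """True if the normal points away from the center of the Earth."""
    return dot(normal, up_vector(lat_deg, lng_deg)) > 0.0

def tilt_degrees(unit_normal, lat_deg, lng_deg):
    """Angle between the horizontal plane and the polygon plane."""
    d = dot(unit_normal, up_vector(lat_deg, lng_deg))
    d = max(-1.0, min(1.0, d))   # guard against rounding outside [-1, 1]
    return math.degrees(math.acos(d))
```

If normal_points_up() returns False, reversing the order of the polygon
points restores the anti-clockwise presentation.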

1591	Authors' Addresses

1593	   Martin Thomson
1594	   Mozilla
1595	   331 E Evelyn Street
1596	   Mountain View, CA  94041
1597	   US

1599	   Email: martin.thomson@gmail.com

1601	   James Winterbottom
1602	   Unaffiliated
1603	   AU

1605	   Email: a.james.winterbottom@gmail.com