
Re: [Simple] Strength of XML validity requirements in SIMPLE documents



Jonathan Rosenberg <jdrosen at dynamicsoft.com> writes:
>I'm saying that a document of type application/pidf+xml could be valid
>against the PIDF schema, but it contains extensions (which are allowed in
>the pidf schema), and those extensions make use of elements defined in
>another schema. However, those elements are not valid according to that
>other schema.

>In this case, do we consider this document "valid"?

The term "valid" has at least three different senses in XML:

1) 

The W3C XML specification defines an XML document as valid when it
has a DTD and complies with the constraints set forth in that DTD.

That is not what we want. No <!DOCTYPE> for presence, no "MUST be
valid", please.

(When RFC 3863 says that the application/pidf+xml document MUST
be well-formed and SHOULD be valid, I guess it just says that
extensions are allowed, even if they are not shown in the DTD.)

2) 

The XML Schema specification uses slightly different terminology.
Schemas don't describe documents but trees (an item and all its
descendants). You can assess the validity of a particular tree,
for instance everything within the <presence> element. The
validator can also skip certain elements (like the wildcard
elements used for extending pidf). So the validator can return
different results: the tree and the items within it are fully
validated, partially validated, or not validated at all. In
addition, each item within the tree is valid or invalid, or its
validity is not known.

I'd say that the XML document is schema-valid if its root element
is fully validated and it and all its descendants are valid. This
is not the official definition, however.
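As a rough illustration of that unofficial definition, here is a
minimal Python sketch. The Item class and the is_schema_valid()
helper are hypothetical names invented for this example; the two
enums only mirror the outcomes listed above (fully, partially, or
not validated; valid, invalid, or not known) and are not tied to any
particular validator's API.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Iterator, List


class Attempted(Enum):
    """How much of a tree the validator actually assessed."""
    FULL = "full"
    PARTIAL = "partial"
    NONE = "none"


class Validity(Enum):
    """Outcome for a single item."""
    VALID = "valid"
    INVALID = "invalid"
    NOT_KNOWN = "notKnown"


@dataclass
class Item:
    """Hypothetical record of a validator's verdict on one element."""
    name: str
    attempted: Attempted
    validity: Validity
    children: List["Item"] = field(default_factory=list)


def walk(item: Item) -> Iterator[Item]:
    """Yield the item and all of its descendants."""
    yield item
    for child in item.children:
        yield from walk(child)


def is_schema_valid(root: Item) -> bool:
    """Root fully validated, and root plus all descendants valid."""
    return root.attempted is Attempted.FULL and all(
        i.validity is Validity.VALID for i in walk(root)
    )
```

Under this sketch, a <presence> tree with one skipped extension
element is only partially validated, so is_schema_valid() returns
False for it even though nothing in it is known to be invalid.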

Requiring this is infeasible for the intermediate nodes (or
processes), too. Unless the node knows the schema (and semantics)
of all the extensions, it cannot assess the validity of composed or
filtered documents. I think filtering and composing are precisely
the things that we want to do in the server nodes.

I can also imagine some applications that would just use an
existing presence service and add their non-standard,
application-specific status to the published presence document.
Now imagine two of those applications running on the same node.

3)

Besides validity assessment, the XML Schema specification also has
the concept of "local schema-validity":

<http://www.w3.org/TR/xmlschema-1/#section-Overview-of-XML-Schema>

I must admit that I don't fully understand what this
means, but I have a good hunch that it is what most of us
want. ;)

An element or attribute is valid if it satisfies the constraints
set by the XML schema. So the <presence> element is valid if it
has an entity attribute and contains zero or more <tuple> elements,
then zero or more <note> elements, and after that any wildcarded
content (I guess that is anything outside the pidf namespace). I
don't understand how the datatype checking of evil types like
xs:ID or xs:IDREF is done when the "local schema-validity" is
determined. I *guess* that an ID/IDREF attribute in a tuple is
valid if its value is syntactically correct: id="_123" is valid,
but id="123" is not. No check across all the ids within a document
(or an item and all of its descendants) is required.
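The local check guessed at above could look roughly like the
following Python sketch. It is an approximation under stated
assumptions, not a schema validator: the function names are
hypothetical, the NCName test is ASCII-only, every child outside the
pidf namespace is treated as wildcard content, and the contents of
<tuple> and <note> are not inspected at all. The namespace URI is
the one from RFC 3863.

```python
import re
import xml.etree.ElementTree as ET

PIDF_NS = "urn:ietf:params:xml:ns:pidf"

# ASCII-only approximation of the NCName syntax used by xs:ID:
# a letter or underscore, then letters, digits, '.', '-' or '_'.
NCNAME_ASCII = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*\Z")


def id_is_syntactically_valid(value: str) -> bool:
    """Syntax check only; no uniqueness check across the document."""
    return NCNAME_ASCII.match(value) is not None


def presence_is_locally_valid(xml_text: str) -> bool:
    """Check only <presence> itself: entity attribute, child order,
    and the syntax of each tuple's id attribute."""
    root = ET.fromstring(xml_text)
    if root.tag != f"{{{PIDF_NS}}}presence" or "entity" not in root.attrib:
        return False
    stage = 0  # 0 = tuples, 1 = notes, 2 = wildcard content
    for child in root:
        if child.tag == f"{{{PIDF_NS}}}tuple":
            rank = 0
        elif child.tag == f"{{{PIDF_NS}}}note":
            rank = 1
        elif child.tag.startswith(f"{{{PIDF_NS}}}"):
            return False  # unexpected pidf-namespace element
        else:
            rank = 2  # assumed to match the extension wildcard
        if rank < stage:
            return False  # e.g. a <tuple> after a <note>
        stage = rank
        if rank == 0 and not id_is_syntactically_valid(child.get("id", "")):
            return False
    return True
```

With this sketch, a document whose tuple carries id="_123" passes
and one with id="123" fails, while two tuples sharing the same id
would still pass, because no document-wide uniqueness check is made.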

So, if we have an application/pidf+xml document, we can simply say
that it MUST conform to its schema and the other requirements set
forth in RFC 3863. If we want to be more verbose, we could say
that the root node in the document MUST be partially or fully
validated and that none of the validated nodes may be invalid. Or
we could invent our own term and say that documents MUST be
locally-schema-valid against the schema XXX.
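The more verbose wording can be read as a simple predicate over a
validator's results. The sketch below is one possible reading, not
a definition from any specification: each node is a hypothetical
(name, validation_attempted, validity, children) tuple, with the
strings "full", "partial", "none" and "valid", "invalid",
"notKnown" standing for the outcomes described under 2).

```python
from typing import Iterator, Tuple

# Hypothetical node shape: (name, validation_attempted, validity, children)
Node = Tuple[str, str, str, list]


def walk(node: Node) -> Iterator[Node]:
    """Yield the node and all of its descendants."""
    yield node
    for child in node[3]:
        yield from walk(child)


def meets_proposed_requirement(root: Node) -> bool:
    """Root partially or fully validated, and no node found invalid.

    Nodes whose validity is "notKnown" (for example skipped
    extensions) do not cause a failure.
    """
    _, attempted, _, _ = root
    if attempted not in ("partial", "full"):
        return False
    return all(validity != "invalid" for _, _, validity, _ in walk(root))
```

Unlike a "fully validated and everything valid" rule, this accepts
a document that carries unknown extensions, which is what an
intermediate node that filters or composes documents would need.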

--Pekka

_______________________________________________
Simple mailing list
Simple at ietf.org
https://www1.ietf.org/mailman/listinfo/simple