RE: [Speechsc] Consolidated comments on NLSML
See below for assignments to issue tracker, etc.
-----Original Message-----
From: Dave Burke [mailto:david.burke at voxpilot.com]
Sent: Tuesday, March 21, 2006 1:03 PM
To: speechsc at ietf.org
Subject: [Speechsc] Consolidated comments on NLSML
Hello,
I wanted to gather together comments I've made previously on NLSML (and
several new ones) into a single thread to make them easier to follow and,
consensus permitting, to apply.
Dave
General:
----------
o Namespace: We have a namespace for enrollment and verification elements
but the rest are assigned to no namespace. That's because NLSML is not a
completed standard and has no namespace defined. MRCP has imported and
extended NLSML based on a discontinued working draft from the W3C. I believe
MRCP's version of NLSML should live completely in the MRCP namespace.
Furthermore, a version attribute really should be added to indicate the
version of MRCP NLSML. Thus the basic structure should be:
<?xml version="1.0"?>
<result version="1.0" xmlns="http://www.ietf.org/xml/ns/mrcpv2">
...
</result>
[Dan>>] Added to issue tracker as issue 68.
o Xforms: Comments made in previous thread - data model not needed nor
defined correctly in NLSML - remove.
[Dan>>] Already tracked as issue 55. See intended resolution at
https://www.softarmor.com/roundup/speechsc/issue55.
o Type of results: Today, speech recognition results are contained in
<result>, voice enrollment results in <enrollment-result> under <result>,
and speaker verification results in <verification-result>. There is also a
redundant <result-type> element present only for the latter two results.
Why not simplify and just have a type attribute on <result> to parameterise
the type of results? For example (NB 'speech' could be the default type):
<?xml version="1.0"?>
<result version="1.0" xmlns="http://www.ietf.org/xml/ns/mrcpv2"
type="speech">
...
</result>
<?xml version="1.0"?>
<result version="1.0" xmlns="http://www.ietf.org/xml/ns/mrcpv2"
type="enrollment">
...
</result>
<?xml version="1.0"?>
<result version="1.0" xmlns="http://www.ietf.org/xml/ns/mrcpv2"
type="verification">
...
</result>
[DanB>>] Tracked in issue tracker as issue 66.
o Section 16: The NLSML schema is defined across one XML schema document and
two RelaxNG documents. This should be consolidated into a single RelaxNG
document IMO.
[DanB>>] Added to issue tracker as issue 72.
o Delete erroneous element <extensions> from NLSML examples.
[DanB>>] Addressed as issue 65-10.
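To illustrate how the namespace, version, type and single-schema suggestions
above could fit together, a consolidated RelaxNG document might begin roughly
as follows. This is only a sketch: the wildcard content model is a
placeholder, and the attribute names and values simply follow the examples
proposed above rather than any agreed schema.

```xml
<?xml version="1.0"?>
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
         ns="http://www.ietf.org/xml/ns/mrcpv2">
  <start>
    <element name="result">
      <!-- Version of MRCP NLSML, as proposed above -->
      <attribute name="version"/>
      <!-- Optional type attribute; 'speech' would be the assumed default -->
      <optional>
        <attribute name="type">
          <choice>
            <value>speech</value>
            <value>enrollment</value>
            <value>verification</value>
          </choice>
        </attribute>
      </optional>
      <!-- Placeholder: accepts any child elements until the real
           content model is written -->
      <zeroOrMore>
        <ref name="anyElement"/>
      </zeroOrMore>
    </element>
  </start>
  <define name="anyElement">
    <element>
      <anyName/>
      <zeroOrMore>
        <choice>
          <attribute>
            <anyName/>
          </attribute>
          <text/>
          <ref name="anyElement"/>
        </choice>
      </zeroOrMore>
    </element>
  </define>
</grammar>
```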
Speech Recognition Related:
-------------------------------------
o For Completion-Cause of no-match or no-input-timeout, do we still
get NLSML results with the <nomatch> or <noinput> elements? This is not
explicitly called out anywhere.
[DanB>>] Added to issue tracker as issue 73.
o If there are multiple grammars activated and a noinput or nomatch occurs,
which grammar is placed in the grammar attribute of <result>?
[DanB>>] Appended to issue 73.
o Clarify that <instance> is present and empty if <nomatch> or <noinput>
occurred.
[DanB>>] Appended to issue 73.
o Section 9.6.3.3: State that when W3C Semantic Interpretation for Speech
Recognition is used, the contents of <instance> are the XML serialisation of
the ECMAScript results (discussed in earlier thread).
[DanB>>] Already tracked as issue 55. See intended resolution at
https://www.softarmor.com/roundup/speechsc/issue55.
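For the no-match questions above, a result carrying a present-but-empty
<instance> might look like the sketch below. The <interpretation> wrapper and
the placement of <nomatch> inside <input> are assumptions based on the
structure of the NLSML working draft; the version attribute and namespace
follow the proposal in the General section; and the grammar attribute is left
out because its value in this case is exactly the open question.

```xml
<?xml version="1.0"?>
<result version="1.0" xmlns="http://www.ietf.org/xml/ns/mrcpv2">
  <interpretation>
    <!-- Present but empty: nothing was recognised -->
    <instance/>
    <input>
      <nomatch/>
    </input>
  </interpretation>
</result>
```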
Voice Enrollment Related:
---------------------------------
o What does <input> contain for recognition from a voice-enrolled grammar?
[DanB>>] Added to issue tracker as issue 74.
o Section 9.7.x: Editorial - don't capitalise words in XML element names,
e.g. change <Confusable-Phrases> -> <confusable-phrases> etc.
[DanB>>] These should be consistent with the rest. Will lowercase all
elements in NLSML. Applied to draft.
o What is the format for <transcription> contents? If it's
platform-specific then we should say so.
[DanB>>] Added to issue tracker as issue 75.
o The element <item> is apparently used as a delimiter of contents of
<transcriptions>, <clash-phrase-ids>, and <confusable-phrases> but this
element is not specified anywhere.
[DanB>>] Added to issue tracker as issue 76.
o Section 9.7.4: <consistency-status> values should be lowercase (or
otherwise <decision> should use uppercase). Requires changes to RelaxNG
schema.
[DanB>>] Agree that these should be consistent. Will lowercase values
in <consistency-status>. Applied to draft.
o Section 9.7.x: Indicate in text which elements are required / optional
and when.
[DanB>>] Added to issue tracker as issue 77.
o Section 9.7: Editorial - replace "???" with the correct MIME type
application/nlsml+xml
[DanB>>] Applied to draft.
Speaker Verification Related:
-------------------------------------
o Clarify that the grammar attribute on <result> is not present for
verification results.
[DanB>>] Tracked as part of issue 69.
o Element <needmoredata> is not specified in the text but is specified in
the schema. Obviously needed for training voiceprints. Note it should be
called <need-more-data> for consistency. Why not have it as a childless
element instead of containing a boolean, so that its presence implies more
data is needed? Shouldn't <need-more-data> be at the same level as
<cumulative> and <incremental> as opposed to being duplicated in both? Or at
least just be contained in <cumulative>?
[DanB>>] Added to issue tracker as issue 78.
o Element <num-frames> is present in examples but not specified in text.
Perhaps examples should use <utterance-length> instead?
[DanB>>] Addressed as issue 65-5.
o Section 11.5: Introduce what these two examples are supposed to be
conveying.
[DanB>>] Added to issue tracker as issue 79.
o Section 11.5.3: Why is <incremental> limited to the first voiceprint?
Doesn't it make sense to have it for each voiceprint (say for
identification or multi-verification)?
[DanB>>] Added to issue tracker as issue 80.
o Section 11.5.4: Clarify that <decision> is not present for training
results.
[DanB>>] Tracked via issue 71.
o Section 11.5.x: Indicate in text which elements are required / optional
and when.
[DanB>>] Tracked via issue 67.
o Section 11.5.6: Delete et-phoned-home and explain the remaining types,
e.g. what exactly is an electret-phone, etc.
[DanB>>] Addressed in earlier email and via issue 65-7.
o Section 11.5.9: During training, is <verification-score> duplicated under
<incremental> and <cumulative>? Perhaps it should just appear under
<cumulative>.
[DanB>>] Added to issue tracker as issue 81.
o Section 11.5.6 / 11.5.7: It seems that <device> and <gender> are
duplicated across <voiceprint>s for identification / multi-verification. Is
this redundancy a known issue?
[DanB>>] Added to issue tracker as issue 82.
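As a sketch of the <need-more-data> suggestion earlier in this section, the
childless form placed alongside <cumulative> and <incremental> might look
like the fragment below. The id value and the scores are made up for
illustration, and the surrounding structure is assumed from the current
draft rather than agreed.

```xml
<verification-result>
  <voiceprint id="johnsmith">
    <!-- Childless: presence alone signals that more data is needed -->
    <need-more-data/>
    <incremental>
      <verification-score>0.5</verification-score>
    </incremental>
    <cumulative>
      <verification-score>0.5</verification-score>
    </cumulative>
  </voiceprint>
</verification-result>
```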
_______________________________________________
Speechsc mailing list
Speechsc at ietf.org
https://www1.ietf.org/mailman/listinfo/speechsc