
RE: [Speechsc] Consolidated comments on NLSML



See below for assignments to issue tracker, etc.

-----Original Message-----
From: Dave Burke [mailto:david.burke at voxpilot.com] 
Sent: Tuesday, March 21, 2006 1:03 PM
To: speechsc at ietf.org
Subject: [Speechsc] Consolidated comments on NLSML

Hello,

I wanted to gather together comments I've made previously on NLSML (and
several new ones) into a single thread to make them easier to follow and,
consensus permitting, to apply.

Dave

General:
----------

o Namespace: We have a namespace for enrollment and verification elements
but the rest are assigned to no namespace. That's because NLSML is not a
completed standard and has no namespace defined. MRCP has imported and
extended NLSML based on a discontinued working draft from the W3C. I believe
MRCP's version of NLSML should live completely in the MRCP namespace.
Furthermore, a version attribute really should be added to indicate the
version of MRCP NLSML. Thus the basic structure should be:

<?xml version="1.0"?>
<result version="1.0" xmlns="http://www.ietf.org/xml/ns/mrcpv2">
    ...
</result>

[Dan>>] Added to issue tracker as issue 68.
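
As a rough sketch of what the proposed structure would mean for a
consumer, the Python below parses a namespaced <result> and reads its
version attribute. The namespace URI and the version attribute are only
the ones proposed above, not part of any published schema, and the helper
name read_result_version is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace URI as proposed in this thread (an assumption, not a final value).
MRCP_NS = "http://www.ietf.org/xml/ns/mrcpv2"

SAMPLE = """<?xml version="1.0"?>
<result version="1.0" xmlns="http://www.ietf.org/xml/ns/mrcpv2">
</result>"""


def read_result_version(document: str) -> str:
    """Return the version attribute of a <result> in the MRCP namespace."""
    root = ET.fromstring(document)
    # ElementTree spells a namespaced tag as "{namespace-uri}local-name".
    expected_tag = "{%s}result" % MRCP_NS
    if root.tag != expected_tag:
        raise ValueError("root element is not <result> in the MRCP namespace")
    version = root.get("version")
    if version is None:
        raise ValueError("<result> has no version attribute")
    return version


print(read_result_version(SAMPLE))
```

A <result> with no namespace, as in the current draft, would be rejected
by this check, which is the practical effect of moving everything into
one namespace.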

o XForms: As noted in a previous thread, the data model is neither needed
nor defined correctly in NLSML - remove.

[Dan>>] Already tracked as issue 55.  See intended resolution at
https://www.softarmor.com/roundup/speechsc/issue55.

o Type of results: Today, speech recognition results are contained in
<result>, voice enrollment results in <enrollment-result> under <result>,
and speaker verification results in <verification-result>. There is also a
redundant <result-type> element present only for the latter two results.
Why not simplify and just have a type attribute on <result> to parameterise
the type of results (NB 'speech' could be the default type), e.g.:

<?xml version="1.0"?>
<result version="1.0" xmlns="http://www.ietf.org/xml/ns/mrcpv2"
             type="speech">
    ...
</result>

<?xml version="1.0"?>
<result version="1.0" xmlns="http://www.ietf.org/xml/ns/mrcpv2"
             type="enrollment">
    ...
</result>

<?xml version="1.0"?>
<result version="1.0" xmlns="http://www.ietf.org/xml/ns/mrcpv2"
             type="verification">
    ...
</result>

[DanB>>] Tracked in issue tracker as issue 66.
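
To illustrate how a client might dispatch on the proposed type attribute,
here is a small Python sketch. It assumes the three type values shown
above and that 'speech' is the default when the attribute is absent, as
suggested; neither is settled. The function name result_type and the
helper make_result are hypothetical.

```python
import xml.etree.ElementTree as ET

# Values taken from the proposal above; the default is only a suggestion there.
KNOWN_TYPES = {"speech", "enrollment", "verification"}
DEFAULT_TYPE = "speech"


def make_result(type_value=None):
    """Build a sample <result> document, optionally with a type attribute."""
    type_attr = "" if type_value is None else ' type="%s"' % type_value
    return (
        '<result version="1.0" '
        'xmlns="http://www.ietf.org/xml/ns/mrcpv2"%s></result>' % type_attr
    )


def result_type(document: str) -> str:
    """Return the result type, falling back to the assumed default."""
    root = ET.fromstring(document)
    kind = root.get("type", DEFAULT_TYPE)
    if kind not in KNOWN_TYPES:
        raise ValueError("unknown result type: %r" % kind)
    return kind


for value in (None, "enrollment", "verification"):
    print(result_type(make_result(value)))
```

With a single attribute, the client needs one lookup on the root element
rather than probing for <enrollment-result>, <verification-result>, or
<result-type> children.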

o Section 16: The NLSML schema is defined across one XML schema document
and two RelaxNG documents. This should be consolidated into a single
RelaxNG document IMO.

[DanB>>] Added to issue tracker as issue 72.

o Delete erroneous element <extensions> from NLSML examples.

[DanB>>] Addressed as issue 65-10.

Speech Recognition Related:
-------------------------------------

o For a Completion-Cause of no-match or no-input-timeout, do we still
get NLSML results with the <nomatch> or <noinput> elements? This is not
explicitly called out anywhere.

[DanB>>] Added to issue tracker as issue 73.

o If there are multiple grammars activated and a noinput or nomatch
occurs, which grammar is placed in the grammar attribute of <result>?

[DanB>>] Appended to issue 73.

o Clarify that <instance> is present and empty if <nomatch> or <noinput>
occurred.

[DanB>>] Appended to issue 73.
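
For concreteness, the Python sketch below checks the condition asked for
in the last item: an <instance> that is present but empty alongside a
<nomatch> or <noinput>. The layout used here (<interpretation> holding
<instance> and <input>, with <nomatch> inside <input>) is an assumption
made for illustration, since the draft text is exactly what is in
question, and namespaces are omitted to keep the fragment short. The
function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed layout for a no-match result; not confirmed by the draft.
NOMATCH_SAMPLE = """<result>
  <interpretation>
    <instance/>
    <input><nomatch/></input>
  </interpretation>
</result>"""


def is_no_result(interpretation) -> bool:
    """True if <input> holds a <nomatch> or <noinput> child."""
    user_input = interpretation.find("input")
    if user_input is None:
        return False
    return any(child.tag in ("nomatch", "noinput") for child in user_input)


def instance_present_and_empty(interpretation) -> bool:
    """True if <instance> exists and has no children and no text."""
    instance = interpretation.find("instance")
    if instance is None:
        return False
    has_text = bool((instance.text or "").strip())
    return len(instance) == 0 and not has_text


root = ET.fromstring(NOMATCH_SAMPLE)
interp = root.find("interpretation")
print(is_no_result(interp), instance_present_and_empty(interp))
```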

o Section 9.6.3.3: State that when W3C Semantic Interpretation for Speech
Recognition is used, the contents of <instance> are the XML serialisation
of the ECMAScript results (discussed in earlier thread).

[DanB>>] Already tracked as issue 55.  See intended resolution at
https://www.softarmor.com/roundup/speechsc/issue55.

Voice Enrollment Related:
---------------------------------

o What does <input> contain for recognition from a voice-enrolled grammar?

[DanB>>] Added to issue tracker as issue 74.

o Section 9.7.x: Editorial - don't capitalise words within XML element
names, e.g. change <Confusable-Phrases> -> <confusable-phrases> etc.

[DanB>>] These should be consistent with the rest.  Will lowercase all
elements in NLSML.  Applied to draft.

o What is the format for <transcription> contents? If it's
platform-specific then we should say so.

[DanB>>] Added to issue tracker as issue 75.

o The element <item> is apparently used as a delimiter of contents of
<transcriptions>, <clash-phrase-ids>, and <confusable-phrases> but this
element is not specified anywhere.

[DanB>>] Added to issue tracker as issue 76.

o Section 9.7.4.: <consistency-status> values should be lowercase (or
otherwise <decision> should use uppercase). Requires changes to RelaxNG
schema.

[DanB>>] Agree that these should be consistent.  Will lowercase values
in <consistency-status>.  Applied to draft.

o Section 9.7.x: Indicate in text which elements are required / optional
and when.

[DanB>>] Added to issue tracker as issue 77.

o Section 9.7: Editorial - fix "???" with correct MIME type
application/nlsml+xml

[DanB>>] Applied to draft.

Speaker Verification Related:
-------------------------------------

o Clarify that the grammar attribute on <result> is not present for
verification results.

[DanB>>] Tracked as part of issue 69.

o Element <needmoredata> is not specified in the text but is specified in
the schema. Obviously needed for training voiceprints. Note it should be
called <need-more-data> for consistency. Why not have it as a childless
element instead of containing a boolean - its presence implies more data
needed? Shouldn't <need-more-data> be at the same level as <cumulative> and
<incremental> as opposed to being duplicated in both? Or at least
just be contained in <cumulative>?

[DanB>>] Added to issue tracker as issue 78.
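
The difference between the two styles can be sketched in Python as
follows. Both fragments are illustrative only: the first assumes the
current boolean-content form using the existing <needmoredata> name, the
second the suggested childless <need-more-data> placed in <cumulative>.
Namespaces are omitted and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Boolean-content style (assumed shape of the current schema).
BOOLEAN_STYLE = "<cumulative><needmoredata>true</needmoredata></cumulative>"

# Presence-only style suggested above: the element has no content.
PRESENCE_STYLE = "<cumulative><need-more-data/></cumulative>"


def needs_more_data_boolean(cumulative_xml: str) -> bool:
    """Boolean style: the element must exist and its text must be 'true'."""
    cumulative = ET.fromstring(cumulative_xml)
    flag = cumulative.find("needmoredata")
    if flag is None:
        return False
    return (flag.text or "").strip().lower() == "true"


def needs_more_data_presence(cumulative_xml: str) -> bool:
    """Presence style: the element existing is the whole signal."""
    cumulative = ET.fromstring(cumulative_xml)
    return cumulative.find("need-more-data") is not None


print(needs_more_data_boolean(BOOLEAN_STYLE))
print(needs_more_data_presence(PRESENCE_STYLE))
```

The presence-only form leaves no room for an element that is present but
carries an unexpected or empty value.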

o Element <num-frames> is present in examples but not specified in text.
Perhaps examples should use <utterance-length> instead?

[DanB>>] Addressed as issue 65-5.

o Section 11.5: Introduce what these two examples are supposed to be
conveying.

[DanB>>] Added to issue tracker as issue 79.

o Section 11.5.3: Why is <incremental> limited to the first voiceprint?
Doesn't it make sense to have it for each voiceprint (say for
identification or multi-verification)?

[DanB>>] Added to issue tracker as issue 80.

o Section 11.5.4: Clarify that <decision> is not present for training
results.

[DanB>>] Tracked via issue 71.

o Section 11.5.x: Indicate in text which elements are required / optional
and when.

[DanB>>] Tracked via issue 67.

o Section 11.5.6: Delete et-phoned-home and explain the remaining types,
e.g. what exactly is an electret-phone etc.

[DanB>>] Addressed in earlier email and via issue 65-7.

o Section 11.5.9: During training, is <verification-score> duplicated
under <incremental> and <cumulative>, or should it perhaps just appear
under <cumulative>?

[DanB>>] Added to issue tracker as issue 81.

o Section 11.5.6 / 11.5.7: Seems like <device> and <gender> are duplicated
across <voiceprint>s for identification / multi-verification? Is this
redundancy a known issue?

[DanB>>] Added to issue tracker as issue 82.



_______________________________________________
Speechsc mailing list
Speechsc at ietf.org
https://www1.ietf.org/mailman/listinfo/speechsc
