
[Speechsc] Speaker Verification - Insufficient or Noisy Speech



I previously sent an email asking how a speaker verification system
implementing MRCPv2 should cope when insufficient or poor-quality speech
arrives on the RTP audio stream.  It seemed to me that this was an area of
some deficiency in the specification.  I received no feedback other than one
response saying that, to the sender's knowledge, there were no other
implementers of Speaker Verification.

Below I outline the MRCPv2 exchanges for a training operation:

   C->S:  MRCP/2.0 207 START-SESSION 314161
          Channel-Identifier:32AECB23433801 at speakverify
          Repository-URI:http://www.example.com/voiceprintdbase/
          Voiceprint-Mode:train
          Voiceprint-Identifier:johnsmith.voiceprint

   S->C:  MRCP/2.0 82 314161 200 COMPLETE
          Channel-Identifier:32AECB23433801 at speakverify

   C->S:  MRCP/2.0 76 VERIFY 314162
          Channel-Identifier:32AECB23433801 at speakverify

   S->C:  MRCP/2.0 85 314162 200 IN-PROGRESS
          Channel-Identifier:32AECB23433801 at speakverify

The end-point detector shows insufficient data (which is buffered) or bad
signal quality (a poor SNR, for example).  Note that START-OF-INPUT has NOT
been sent, although speech has begun.

   S->C:  MRCP/2.0 140 VERIFICATION-COMPLETE 314162 COMPLETE
          Channel-Identifier:32AECB23433801 at speakverify
          Completion-Cause:002 no-input-timeout

This is undesirable from my perspective: it gives the client the impression
that no data has been received (untrue in the insufficient-data case), and it
provides no distinction between this and the "bad data" case.  That
distinction might be useful to a call-flow designer in an IVR system.
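
To make the problem concrete, here is a minimal sketch of what a client can
extract from the VERIFICATION-COMPLETE event shown above.  The function name
parse_completion_cause is only illustrative and is not part of any MRCP
library; the message text is copied from the exchange above.

```python
def parse_completion_cause(message: str):
    """Return (code, name) from the Completion-Cause header, or None if absent."""
    for line in message.splitlines()[1:]:  # skip the MRCP start line
        name, sep, value = line.strip().partition(":")
        if sep and name.strip().lower() == "completion-cause":
            code, _, cause_name = value.strip().partition(" ")
            return int(code), cause_name.strip()
    return None


EVENT = """MRCP/2.0 140 VERIFICATION-COMPLETE 314162 COMPLETE
Channel-Identifier:32AECB23433801 at speakverify
Completion-Cause:002 no-input-timeout"""

cause = parse_completion_cause(EVENT)
print(cause)  # (2, 'no-input-timeout')
```

The pair (2, 'no-input-timeout') is all the client learns, whether the caller
said nothing, said too little, or spoke over a noisy line.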

I also note that text-independent verifiers may require several turns' worth
of data for a verification.  Several rounds of "no input" timeouts would
surely be confusing to the client, yet this class of verifier may be unable to
generate an nlsml+xml response on the nth dialog turn.
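
As a rough sketch of how a client might drive such a multi-turn verification,
assume the server could report a distinct "insufficient speech" cause (numbered
005 in the proposal later in this message; it is not in the current draft).
The run_turn callable is hypothetical: it stands for "send one VERIFY, wait for
VERIFICATION-COMPLETE, return the numeric Completion-Cause".

```python
from typing import Callable, Tuple

SUCCESS = 0
INSUFFICIENT_SPEECH = 5  # assumption: proposed cause code, not in the current draft


def verify_over_turns(run_turn: Callable[[], int], max_turns: int = 5) -> Tuple[int, int]:
    """Repeat VERIFY turns while the server reports insufficient speech.

    Returns (turns_used, final_cause).
    """
    cause = INSUFFICIENT_SPEECH
    for turn in range(1, max_turns + 1):
        cause = run_turn()
        if cause != INSUFFICIENT_SPEECH:
            return turn, cause  # a result (or a different failure) arrived
    return max_turns, cause  # gave up while still short of speech


# Simulated server: two turns without enough speech, then a result.
scripted = iter([INSUFFICIENT_SPEECH, INSUFFICIENT_SPEECH, SUCCESS])
result = verify_over_turns(lambda: next(scripted))
print(result)  # (3, 0)
```

With only no-input-timeout available, the same loop could not tell "keep
prompting, data is accumulating" apart from "the caller is silent".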

The enrolment might then continue:

   C->S:  MRCP/2.0 76 VERIFY 314163
          Channel-Identifier:32AECB23433801 at speakverify

   S->C:  MRCP/2.0 85 314163 200 IN-PROGRESS
          Channel-Identifier:32AECB23433801 at speakverify

   S->C:  MRCP/2.0 96 START-OF-INPUT 314163 IN-PROGRESS
          Channel-Identifier:32AECB23433801 at speakverify

   S->C:  MRCP/2.0 131 VERIFICATION-COMPLETE 314163 COMPLETE
          Channel-Identifier:32AECB23433801 at speakverify
          Completion-Cause:000 success

   C->S:  MRCP/2.0 76 VERIFY 314164
          Channel-Identifier:32AECB23433801 at speakverify

   S->C:  MRCP/2.0 85 314164 200 IN-PROGRESS
          Channel-Identifier:32AECB23433801 at speakverify

   S->C:  MRCP/2.0 96 START-OF-INPUT 314164 IN-PROGRESS
          Channel-Identifier:32AECB23433801 at speakverify

   S->C:  MRCP/2.0 131 VERIFICATION-COMPLETE 314164 COMPLETE
          Channel-Identifier:32AECB23433801 at speakverify
          Completion-Cause:000 success

   C->S:  MRCP/2.0 81 END-SESSION 314174
          Channel-Identifier:32AECB23433801 at speakverify

   S->C:  MRCP/2.0 82 314174 200 COMPLETE
          Channel-Identifier:32AECB23433801 at speakverify

Since I received no responses (perhaps because it was close to the holiday
season), I will venture a proposal for extending the RFC to cover the bad
signal cases (+ indicates an addition, * a modification):

   +------------+--------------------------+---------------------------+
   | Cause-Code | Cause-Name               | Description               |
   +------------+--------------------------+---------------------------+
   | 000        | success                  | VERIFY or                 |
   |            |                          | VERIFY-FROM-BUFFER        |
   |            |                          | request completed         |
   |            |                          | successfully.  The verify |
   |            |                          | decision can be           |
   |            |                          | "accepted", "rejected",   |
   |            |                          | or "undecided".           |
   | 001        | error                    | VERIFY or                 |
   |            |                          | VERIFY-FROM-BUFFER        |
   |            |                          | request terminated        |
   |            |                          | prematurely due to a      |
   |            |                          | verification resource or  |
   |            |                          | system error.             |
   | 002        | no-input-timeout         | VERIFY request completed  |
   |            |                          | with no result due to a   |
   |            |                          | no-input-timeout.         |
   | 003        | too-much-speech-timeout  | VERIFY request completed  |
   |            |                          | with no result due to too |
   |            |                          | much speech.              |
   | 004        | speech-too-early         | VERIFY request completed  |
   |            |                          | with no result due to     |
   |            |                          | speech too soon.          |
 + | 005        | insufficient-speech      | VERIFY or                 |
 + |            |                          | VERIFY-FROM-BUFFER        |
 + |            |                          | request completed         |
 + |            |                          | successfully but had      |
 + |            |                          | insufficient speech to    |
 + |            |                          | complete.  More speech    |
 + |            |                          | will complete the current |
 + |            |                          | incremental operation.    |
 + | 006        | bad-speech               | VERIFY or                 |
 + |            |                          | VERIFY-FROM-BUFFER        |
 + |            |                          | request completed         |
 + |            |                          | unsuccessfully because    |
 + |            |                          | the speech quality was    |
 + |            |                          | too poor.                 |
 * | 007        | buffer-empty             | VERIFY-FROM-BUFFER        |
   |            |                          | request completed with no |
   |            |                          | result due to empty       |
   |            |                          | buffer.                   |
 * | 008        | out-of-sequence          | Verification operation    |
   |            |                          | failed due to             |
   |            |                          | out-of-sequence method    |
   |            |                          | invocations.  For example |
   |            |                          | calling VERIFY before     |
   |            |                          | QUERY-VOICEPRINT.         |
 * | 009        | repository-uri-failure   | Failure accessing         |
   |            |                          | Repository URI.           |
 * | 010        | repository-uri-missing   | Repository-uri is not     |
   |            |                          | specified.                |
 * | 011        | voiceprint-id-missing    | Voiceprint-identification |
   |            |                          | is not specified.         |
 * | 012        | voiceprint-id-not-exist  | Voiceprint-identification |
   |            |                          | does not exist in the     |
   |            |                          | voiceprint repository.    |
   +------------+--------------------------+---------------------------+

Alternatively, the new entries could be appended to the end of the table for
compatibility, so that the existing codes keep their current values.  The only
disadvantage of doing so would be that the entries would no longer be grouped
by category in the table.
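
For illustration, a server might choose between the causes along the following
lines.  This is only a sketch using the renumbered table above: the thresholds
MIN_SPEECH_SECONDS and MIN_SNR_DB and the function names are hypothetical, and
a real verifier would take its measurements from its own end-point detector.

```python
from typing import Optional

# Proposed numbering from the table above (005 and 006 are the additions).
PROPOSED_CAUSES = {
    0: "success",
    1: "error",
    2: "no-input-timeout",
    3: "too-much-speech-timeout",
    4: "speech-too-early",
    5: "insufficient-speech",
    6: "bad-speech",
    7: "buffer-empty",
    8: "out-of-sequence",
    9: "repository-uri-failure",
    10: "repository-uri-missing",
    11: "voiceprint-id-missing",
    12: "voiceprint-id-not-exist",
}

# Hypothetical thresholds; real values depend on the verifier.
MIN_SPEECH_SECONDS = 2.0
MIN_SNR_DB = 10.0


def classify_input(speech_seconds: float, snr_db: Optional[float]) -> int:
    """Map end-point detector measurements to a proposed cause code."""
    if speech_seconds <= 0.0:
        return 2  # nothing heard at all: a genuine no-input-timeout
    if snr_db is not None and snr_db < MIN_SNR_DB:
        return 6  # speech present but too noisy to use
    if speech_seconds < MIN_SPEECH_SECONDS:
        return 5  # usable speech, but more is needed
    return 0  # enough clean speech for the verifier to proceed


def completion_cause_header(code: int) -> str:
    """Format the Completion-Cause header line for a cause code."""
    return f"Completion-Cause:{code:03d} {PROPOSED_CAUSES[code]}"


print(completion_cause_header(classify_input(0.8, 25.0)))
# Completion-Cause:005 insufficient-speech
```

A call-flow designer could then prompt differently for each outcome, for
example asking the caller to keep talking on 005 and to move somewhere quieter
on 006.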

I'll happily accept any corrections to my understanding, in case I have
misread the spec, or feedback on my suggestions.




NIK WALDRON
