RE: [speechsc] Fallback <audio> and 003 uri-failure
Dave:
You may misunderstand the capabilities of SSML. SSML supports arbitrarily
deep fallback for each logical segment. The result is that multiple errors
can occur for a single, successfully played SSML prompt.
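To illustrate the point about nested fallback, here is a rough sketch (not part of SSML or MRCP) that walks a document and applies <audio> fallback. The file names, the simulate() helper, and the "reachable" set standing in for real fetches are all assumptions made up for the example.

```python
import xml.etree.ElementTree as ET

# SSML namespace, as used in the examples in this thread.
NS = "{http://www.w3.org/2001/10/synthesis}"


def render(element, fetch_ok, played, errors):
    """Walk SSML in document order, applying <audio> fallback."""
    for child in element:
        if child.tag == NS + "audio":
            src = child.get("src")
            if fetch_ok(src):
                # Fetch succeeded: play it and skip the fallback content.
                played.append(src)
            else:
                # Fetch failed: record the error, then render the fallback.
                errors.append(src)
                render(child, fetch_ok, played, errors)
        else:
            render(child, fetch_ok, played, errors)


def simulate(ssml, reachable):
    """Hypothetical helper: 'reachable' stands in for real URI fetches."""
    played, errors = [], []
    render(ET.fromstring(ssml), lambda src: src in reachable, played, errors)
    return played, errors


SSML = """<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis">
  <audio src="first.wav">
    <audio src="second.wav">
      <audio src="third.wav"/>
    </audio>
  </audio>
</speak>"""

played, errors = simulate(SSML, reachable={"third.wav"})
print(played)  # ['third.wav']
print(errors)  # ['first.wav', 'second.wav']
```

The prompt plays to completion, yet two separate URI failures occurred along the way.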
Tracking these errors to understand exactly what the caller heard is an
important part of post-call analysis. If it "isn't the job of MRCP to
provide the (data for) forensics" in a standard and interoperable manner,
this greatly devalues MRCP for use in realistic deployments.
It's not as if providing for the analysis is particularly burdensome. I've
already noted that a combination of errors and markers is sufficient without
invoking unnecessary and proprietary opaque data.
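A minimal sketch of how a client might combine the two, assuming it has already parsed the server's messages into (kind, value) tuples in arrival order. The tuple format and the locate_failures() name are illustrative only and are not defined by MRCP.

```python
def locate_failures(events):
    """Pair each URI failure with the last marker seen before it.

    'events' is an ordered list of ("error", uri) and ("marker", name)
    tuples; this representation is an assumption made for the example.
    """
    last_marker = None
    located = []
    for kind, value in events:
        if kind == "marker":
            last_marker = value
        elif kind == "error":
            located.append((value, last_marker))
    return located


# Event order taken from the quoted example below: the failure arrives
# before any marker, so it belongs to the outer <audio> element.
events = [
    ("error", "http://www.example.com/uri.wav"),
    ("marker", "inside first"),
    ("marker", "before second"),
]
print(locate_failures(events))
# [('http://www.example.com/uri.wav', None)]
```

Had the same URI failed after the "before second" marker instead, the pairing would identify the final <audio> element, even though the URI string is identical.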
-=- Jerry
> -----Original Message-----
> From: David R Oran [mailto:oran at cisco.com]
> Sent: Tuesday, May 09, 2006 8:34 AM
> To: Jerry Carter
> Cc: IETF SPEECHSC (E-mail); Andrew Wahbe; Dave Burke
> Subject: Re: [speechsc] Fallback <audio> and 003 uri-failure
>
>
> On May 8, 2006, at 5:43 PM, Jerry Carter wrote:
>
> > Experience leads me to believe that the simpler approach does not
> > work for arbitrary SSML documents. Because URIs may point to
> > streaming data or return content based on cookies, SSML may use the
> > same URI to reference different data. Having explicit marker
> > events and error reporting (either as events or messages) may allow
> > the client to determine the exact URI instance that failed -- for
> > those rare cases where this is necessary.
> >
> Why can't the client re-reference the URI itself to see if it's an
> aliasing problem? Strikes me this is a general issue with any content
> indirection scheme and it isn't the job of MRCP to provide the
> forensics. On the other hand, giving the client a pointer into the
> SSML document for where the synthesizer "gave up" would seem to be
> useful and not much of a burden on the server. If we do something
> like this, it's important to not go *too* far and wind up with
> something complex like compiler tracebacks syntactically and
> semantically mandated by MRCP. Here's one possible approach:
>
> 1) we allow some opaque (to MRCP) data to be returned on the URI
> failure event.
> 2) we suggest to people the server can use this to provide forensics
> to the client for where in parsing/playing the SSML the server barfed
> and possibly why.
>
> Dave.
>
>
> > C->S: MRCP/2.0 489 SPEAK 543257
> > Channel-Identifier:32AECB23433802 at speechsynth
> > Content-Type:application/ssml+xml
> > Content-Length:???
> >
> > <?xml version="1.0"?>
> > <speak version="1.0"
> > xmlns="http://www.w3.org/2001/10/synthesis"
> > xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
> > xsi:schemaLocation="http://www.w3.org/2001/10/synthesis
> > http://www.w3.org/TR/speech-synthesis/synthesis.xsd"
> > xml:lang="en-US" xml:base="http://www.example.com/">
> > <audio src="uri.wav"> <!-- invalid URI -->
> > <mark name="inside first"/>
> > <audio src="uri.wav"/> <!-- valid URI -->
> > </audio>
> > <mark name="before second"/>
> > <audio src="uri.wav"/>
> > </speak>
> >
> > S->C: MRCP/2.0 28 543257 200 IN-PROGRESS
> > Channel-Identifier:32AECB23433802 at speechsynth
> >
> > S->C: MRCP/2.0 543257 407 IN-PROGRESS
> > Channel-Identifier:32AECB23433802 at speechsynth
> > Completion-Cause:009 uri resolution problem
> > Failed-URI-Cause:404
> > Failed-URI:http://www.example.com/uri.wav
> >
> > S->C: MRCP/2.0 SPEECH-MARKER 543257 IN-PROGRESS
> > Channel-Identifier:32AECB23433802 at speechsynth
> > Speech-Marker:timestamp=;inside first
> >
> > S->C: MRCP/2.0 SPEECH-MARKER 543257 IN-PROGRESS
> > Channel-Identifier:32AECB23433802 at speechsynth
> > Speech-Marker:timestamp=;before second
> >
> > S->C: MRCP/2.0 SPEAK-COMPLETE 543257 COMPLETE
> > Channel-Identifier:32AECB23433802 at speechsynth
> > Completion-Cause:000 normal
> >
> >
> >
> > On May 8, 2006, at 4:39 PM, Dave Burke wrote:
> >
> >> At this late stage, I think changing the message exchange pattern
> >> is too incisive (I also quite like the patter...). Though, I
> >> understand your concern for a proliferation of events.... With that
> >> in mind, how about taking a variation of what's been discussed for
> >> grammars and go with:
> >>
> >> 1. Allow Failed-URI to appear multiple times in SPEAK-COMPLETE
> >> (reporting the failed <audio>s) with return type 003 uri-failure.
> >> These headers cannot be combined into one comma-separated list
> >> because commas are valid reserved URI tokens.
> >>
> >> 2. Combine the reason in the Failed-URI as you suggested so
> >> that we can have multiple Failed-URIs.
> >>
> >> This is sufficient for the MRCP client to detect what has been
> >> played and what hasn't and is consistent with SSML.
> >>
> >> Dave
> >>
> >> ----- Original Message -----
> >> From: "Carter, Jerry" <jerry.carter at nuance.com>
> >> To: "Andrew Wahbe" <awahbe at voicegenie.com>; "Dave Burke"
> >> <david.burke at voxpilot.com>
> >> Cc: <speechsc at ietf.org>
> >> Sent: Monday, May 08, 2006 8:34 PM
> >> Subject: RE: [speechsc] Fallback <audio> and 003 uri-failure
> >>
> >>
> >>> I agree that the current text is clear. As described in section 5,
> >>> there is a single response delivered for each message.
> >>> Unfortunately, as this case and a similar analysis for grammar
> >>> definitions shows [1], error handling is an area of weakness in
> >>> the -09 draft.
> >>>
> >>> There are two solutions that come to mind.
> >>>
> >>> * Add additional events which can be used for error reporting. This
> >>> seems to be the direction that you and Dave are endorsing.
> >>>
> >>> * Alternatively, relax the single response requirement so that
> >>> requests follow a natural progression from PENDING to IN-PROGRESS
> >>> to COMPLETE. Each request would generate exactly one COMPLETE
> >>> response. This final response might be preceded by zero or more
> >>> IN-PROGRESS messages which would in turn be preceded by zero or
> >>> more PENDING messages.
> >>>
> >>> I fear the events approach leads to a proliferation of events and
> >>> confuses the semantics of the language. Conversely, the clear
> >>> progression in message handling states is easily described by
> >>> adding a paragraph or two to section 5.
> >>>
> >>>
> >>> [1] http://www1.ietf.org/mail-archive/web/speechsc/current/msg01797.html
> >>>
> >>>
> >>>> -----Original Message-----
> >>>> From: Andrew Wahbe [mailto:awahbe at voicegenie.com]
> >>>> Sent: Monday, May 08, 2006 3:08 PM
> >>>> To: Dave Burke
> >>>> Cc: Carter, Jerry; speechsc at ietf.org
> >>>> Subject: Re: [speechsc] Fallback <audio> and 003 uri-failure
> >>>>
> >>>> I was going to give a similar reply but I wanted to reference the
> >>>> restriction text in the spec. Unfortunately, I haven't been able
> >>>> to find it though the term "response" does imply it... of course
> >>>> PENDING and IN-PROGRESS could be interpreted as a kind of
> >>>> "provisional" response.... though the examples paint a different
> >>>> picture (only 1 response to each request).
> >>>>
> >>>> Section 5.3 says:
> >>>>
> >>>> After receiving and interpreting the request message for a method,
> >>>> the server resource responds with an MRCPv2 response message.
> >>>>
> >>>> and
> >>>>
> >>>> A PENDING or IN-PROGRESS status indicates that further Event
> >>>> messages may be delivered with that request-id.
> >>>>
> >>>> Perhaps the limit of one response to each request should be stated
> >>>> explicitly somewhere (sorry if I missed it).
> >>>>
> >>>> Andrew
> >>>>
> >>>> Dave Burke wrote:
> >>>> > That works if we change the second response to an event as
> >>>> > suggested by Andrew (the MRCP message exchange pattern rightly
> >>>> > restricts one response to each request).
> >>>> >
> >>>> > Dave
> >>>> >
> >>>> > ----- Original Message -----
> >>>> > From: "Carter, Jerry" <jerry.carter at nuance.com>
> >>>> > To: "Dave Burke" <david.burke at voxpilot.com>; <speechsc at ietf.org>
> >>>> > Sent: Monday, May 08, 2006 4:45 PM
> >>>> > Subject: RE: [speechsc] Fallback <audio> and 003 uri-failure
> >>>> >
> >>>> >
> >>>> > Would not notification along these lines be appropriate? If so,
> >>>> > perhaps adding this example to the specification would be useful.
> >>>> >
> >>>> > C->S: MRCP/2.0 489 SPEAK 543257
> >>>> > Channel-Identifier:32AECB23433802 at speechsynth
> >>>> > Content-Type:application/ssml+xml
> >>>> > Content-Length:???
> >>>> >
> >>>> > <?xml version="1.0"?>
> >>>> > <speak version="1.0"
> >>>> > xmlns="http://www.w3.org/2001/10/synthesis"
> >>>> > xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
> >>>> > xsi:schemaLocation="http://www.w3.org/2001/10/synthesis
> >>>> > http://www.w3.org/TR/speech-synthesis/synthesis.xsd"
> >>>> > xml:lang="en-US" xml:base="http://www.example.com/">
> >>>> > <audio src="baduri.wav"> <!-- invalid URI -->
> >>>> > <audio src="gooduri.wav"/> <!-- valid URI -->
> >>>> > </audio>
> >>>> > </speak>
> >>>> >
> >>>> >
> >>>> > S->C: MRCP/2.0 28 543257 200 IN-PROGRESS
> >>>> > Channel-Identifier:32AECB23433802 at speechsynth
> >>>> >
> >>>> > S->C: MRCP/2.0 543260 407 IN-PROGRESS
> >>>> > Channel-Identifier:32AECB23433802 at speechsynth
> >>>> > Completion-Cause:009 uri resolution problem
> >>>> > Failed-URI-Cause:404
> >>>> > Failed-URI:http://www.example.com/baduri.wav
> >>>> >
> >>>> > S->C: MRCP/2.0 79 SPEAK-COMPLETE 543257 COMPLETE
> >>>> > Channel-Identifier:32AECB23433802 at speechsynth
> >>>> > Completion-Cause:000 normal
> >>>> >
> >>>> >
> >>>> > Dave Burke wrote:
> >>>> >> If I have some SSML along the lines of
> >>>> >>
> >>>> >> <speak>
> >>>> >> <audio src="baduri.wav"> <!-- invalid URI -->
> >>>> >> <audio src="gooduri.wav"/> <!-- valid URI -->
> >>>> >> </audio>
> >>>> >> </speak>
> >>>> >>
> >>>> >> will I get a SPEAK-COMPLETE with 000 normal or 003 uri-failure?
> >>>> >>
> >>>> >> SSML requires that processing continues but that the
> >>>> >> hosting environment be notified. It would be useful to
> >>>> >> clarify that this is indeed the case with the basicsynth
> >>>> >> / speechsynth and that 003 uri-failure will be returned.
> >>>> >> Without this, the most trivial of media server applications
> >>>> >> (i.e. playing announcements) cannot be implemented
> >>>> >> robustly.
> >>>> >>
> >>>> >> Dave
> >>>> >
> >>>> >
> >>>> > _______________________________________________
> >>>> > Speechsc mailing list
> >>>> > Speechsc at ietf.org
> >>>> > https://www1.ietf.org/mailman/listinfo/speechsc