@@ -515,11 +515,11 @@ class StreamingDetectIntentResponse(proto.Message):
Multiple response messages can be returned in order:
- 1. If the input was set to streaming audio, the first one or more
- messages contain ``recognition_result``. Each
- ``recognition_result`` represents a more complete transcript of
- what the user said. The last ``recognition_result`` has
- ``is_final`` set to ``true``.
+ 1. If the ``StreamingDetectIntentRequest.input_audio`` field was
+ set, the ``recognition_result`` field is populated for one or
+ more messages. See the
+ [StreamingRecognitionResult][google.cloud.dialogflow.v2.StreamingRecognitionResult]
+ message for details about the result message sequence.
2. The next message contains ``response_id``, ``query_result`` and
optionally ``webhook_status`` if a WebHook was called.
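
The ordering described in this docstring can be illustrated with a small sketch. It does not call the Dialogflow client library; `Response` is a hypothetical stand-in that mirrors only the field names mentioned above (`recognition_result`, `response_id`, `query_result`), and plain strings stand in for the real message types.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass
class Response:
    """Hypothetical stand-in for StreamingDetectIntentResponse (assumption)."""
    recognition_result: Optional[str] = None
    response_id: Optional[str] = None
    query_result: Optional[str] = None


def split_stream(responses: Iterable[Response]) -> Tuple[List[str], Optional[str]]:
    """Separate the leading recognition results from the later query result."""
    recognition_results: List[str] = []
    query_result: Optional[str] = None
    for response in responses:
        if response.recognition_result is not None:
            # Step 1: one or more messages carry recognition_result.
            recognition_results.append(response.recognition_result)
        elif response.query_result is not None:
            # Step 2: the next message carries response_id and query_result.
            query_result = response.query_result
    return recognition_results, query_result


stream = [
    Response(recognition_result="to be"),
    Response(recognition_result="to be or not to be"),
    Response(response_id="r-1", query_result="matched intent"),
]
```

Calling `split_stream(stream)` on this made-up stream returns the two recognition results in arrival order together with the single query result.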
@@ -570,33 +570,42 @@ class StreamingRecognitionResult(proto.Message):
the audio that is currently being processed or an indication that
this is the end of the single requested utterance.
- Example:
-
- 1. transcript: "tube"
-
- 2. transcript: "to be a"
-
- 3. transcript: "to be"
-
- 4. transcript: "to be or not to be" is_final: true
-
- 5. transcript: " that's"
-
- 6. transcript: " that is"
-
- 7. message_type: ``END_OF_SINGLE_UTTERANCE``
-
- 8. transcript: " that is the question" is_final: true
-
- Only two of the responses contain final results (#4 and #8 indicated
- by ``is_final: true``). Concatenating these generates the full
- transcript: "to be or not to be that is the question".
-
- In each response we populate:
-
- - for ``TRANSCRIPT``: ``transcript`` and possibly ``is_final``.
-
- - for ``END_OF_SINGLE_UTTERANCE``: only ``message_type``.
+ While end-user audio is being processed, Dialogflow sends a series
+ of results. Each result may contain a ``transcript`` value. A
+ transcript represents a portion of the utterance. While the
+ recognizer is processing audio, transcript values may be interim
+ values or finalized values. Once a transcript is finalized, the
+ ``is_final`` value is set to true and processing continues for the
+ next transcript.
+
+ If
+ ``StreamingDetectIntentRequest.query_input.audio_config.single_utterance``
+ was true, and the recognizer has completed processing audio, the
+ ``message_type`` value is set to ``END_OF_SINGLE_UTTERANCE`` and the
+ following (last) result contains the last finalized transcript.
+
+ The complete end-user utterance is determined by concatenating the
+ finalized transcript values received for the series of results.
+
+ In the following example, single utterance is enabled. In the case
+ where single utterance is not enabled, result 7 would not occur.
+
+ ::
+
+     Num | transcript              | message_type            | is_final
+     --- | ----------------------- | ----------------------- | --------
+     1   | "tube"                  | TRANSCRIPT              | false
+     2   | "to be a"               | TRANSCRIPT              | false
+     3   | "to be"                 | TRANSCRIPT              | false
+     4   | "to be or not to be"    | TRANSCRIPT              | true
+     5   | " that's"               | TRANSCRIPT              | false
+     6   | " that is"              | TRANSCRIPT              | false
+     7   | unset                   | END_OF_SINGLE_UTTERANCE | unset
+     8   | " that is the question" | TRANSCRIPT              | true
+
+ Concatenating the finalized transcripts with ``is_final`` set to
+ true, the complete utterance becomes "to be or not to be that is the
+ question".
Attributes:
message_type (google.cloud.dialogflow_v2.types.StreamingRecognitionResult.MessageType):
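
The concatenation rule in the new docstring text can be checked against its own example. The sketch below is an illustration only: `Result` is a hypothetical stand-in for `StreamingRecognitionResult` that uses plain strings for `message_type`, and the interim transcripts for results 5 and 6 are written with the leading space used in the removed version of the example.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Result:
    """Hypothetical stand-in for StreamingRecognitionResult (assumption)."""
    message_type: str
    transcript: Optional[str] = None
    is_final: Optional[bool] = None


def full_utterance(results: Iterable[Result]) -> str:
    """Concatenate only the finalized transcripts, in arrival order."""
    return "".join(
        r.transcript
        for r in results
        if r.message_type == "TRANSCRIPT" and r.is_final and r.transcript
    )


# The eight results from the docstring example (single utterance enabled).
EXAMPLE = [
    Result("TRANSCRIPT", "tube", False),
    Result("TRANSCRIPT", "to be a", False),
    Result("TRANSCRIPT", "to be", False),
    Result("TRANSCRIPT", "to be or not to be", True),
    Result("TRANSCRIPT", " that's", False),
    Result("TRANSCRIPT", " that is", False),
    Result("END_OF_SINGLE_UTTERANCE"),  # transcript and is_final unset
    Result("TRANSCRIPT", " that is the question", True),
]

print(full_utterance(EXAMPLE))  # to be or not to be that is the question
```

Interim results are skipped because a later finalized result supersedes them, and the `END_OF_SINGLE_UTTERANCE` result contributes nothing because it carries no transcript.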