Extract interaction analytics from a media file
Interaction analytics is used to understand a conversation between two or more people in a meeting and to extract meaningful insights from it at scale. This API is comprehensive: in addition to its unique capabilities, it bundles functionality found in our other APIs. When processing a media file, this API provides multiple levels of insight, including:
- Conversation insights
    - transcription with smart punctuation
    - content summaries
    - keywords and conversation metrics
- Speaker-level insights
- Utterance-level insights
    - emotion recognition
Let's say we want to analyze a twenty-minute meeting between a sales rep and a customer. Here are some of the insights we can extract using this API:
- Speaker contribution, e.g. the sales rep spoke for twelve minutes and the customer spoke for eight minutes.
- Speaker pace, e.g. words spoken per minute (see the sketch after this list).
- Speaker emotions, e.g. the tone or emotional context of every utterance.
- Auto-generated meeting summary
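To make the first two metrics concrete, here is a back-of-the-envelope sketch (not an API call) of how speaker contribution and pace fall out of utterance timings. The numbers are invented sample data for illustration only:

```python
# Illustration only: deriving speaker contribution and pace from
# utterance timings. The sample data below is invented.
utterances = [
    {"speakerId": "rep",      "start": 0.0,   "end": 720.0,  "words": 1800},
    {"speakerId": "customer", "start": 720.0, "end": 1200.0, "words": 960},
]

total = sum(u["end"] - u["start"] for u in utterances)
for u in utterances:
    talk = u["end"] - u["start"]
    print(u["speakerId"],
          f"contribution={talk / total:.0%}",          # e.g. rep spoke 60% of the time
          f"pace={u['words'] / (talk / 60):.0f} wpm")  # words spoken per minute
```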
Extracting interaction analytics
For the best results we recommend following these guidelines.

- The `audioType` parameter provides the system with a hint about the nature of the meeting, which helps improve accuracy. We recommend setting this parameter to `CallCenter` when 2-3 speakers are expected to be identified and to `Meeting` when 4-6 speakers are expected.
- Set the `enableVoiceActivityDetection` parameter to `True` if you want silence and noise segments removed from the diarization output. We suggest setting it to `True` in most circumstances.
- Setting the `source` parameter helps optimize the diarization process by selecting a specialized acoustic model built specifically for the corresponding audio source.
- For proper speaker identification, make sure you have previously enrolled all speakers in the media file and include them in the `speakerIds` parameter.
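Putting these guidelines together, a request body might look like the following sketch. The `contentUri` and speaker IDs are placeholders, and `speakerIds` assumes those speakers were enrolled beforehand:

```python
# A sketch of a request body that follows the guidelines above.
# The contentUri and speaker IDs are placeholders, not real resources.
payload = {
    "contentUri": "https://example.com/recordings/team-meeting.wav",
    "encoding": "Wav",
    "languageCode": "en-US",
    "source": "RingCentral",               # lets the service pick a matching acoustic model
    "audioType": "Meeting",                # 4-6 speakers expected; use "CallCenter" for 2-3
    "enableVoiceActivityDetection": True,  # drop silence/noise from the diarization output
    "speakerIds": ["enrolled-speaker-1", "enrolled-speaker-2"],  # previously enrolled speakers
    "insights": ["All"],
}
```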
Request parameters
Parameter | Type | Description |
---|---|---|
`encoding` | String | Encoding of the audio file, e.g. MP3, WAV. |
`sampleRate` | Number | Sample rate of the audio file. Optional. |
`languageCode` | String | Language spoken in the audio file. Default: "en-US". |
`separateSpeakerPerChannel` | Boolean | Set to `True` if the input audio is multi-channel and each channel has a separate speaker. Optional. Default: `False`. |
`speakerCount` | Number | Number of speakers in the file. Optional. |
`audioType` | String | Type of the audio, based on the number of speakers. Optional. Permitted values: `CallCenter`, `Meeting`, `EarningsCalls`, `Interview`, `PressConference`. |
`speakerIds` | List[String] | Set of speakers to be identified from the call. Optional. |
`enableVoiceActivityDetection` | Boolean | Apply voice activity detection. Optional. Default: `False`. |
`contentUri` | String | Publicly accessible URL of the media file. |
`source` | String | Source of the audio file, e.g. `Phone`, `RingCentral`, `GoogleMeet`, `Zoom`. Optional. |
`insights` | List[String] | List of insights to be computed. Send `['All']` to extract all analytics. Permitted values: `All`, `KeyPhrases`, `Emotion`, `AbstractiveSummaryLong`, `AbstractiveSummaryShort`, `ExtractiveSummary`, `TalkToListenRatio`, `Energy`, `Pace`, `QuestionsAsked`, `Title`, `Tasks`. |
Example code
After you have set up a simple web server to process the response (a minimal receiver sketch appears after the Python example below), copy and paste the code below into index.js
and make sure to edit the variables in ALL CAPS so that your code runs properly.
const RC = require('@ringcentral/sdk').SDK;
require('dotenv').config();

const MEDIA_URL = process.env.RC_MEDIA_URL;
const WEBHOOK_URL = '<INSERT YOUR WEBHOOK URL>';

// Initialize the RingCentral SDK and platform
const rcsdk = new RC({
    'server': process.env.RC_SERVER_URL,
    'clientId': process.env.RC_CLIENT_ID,
    'clientSecret': process.env.RC_CLIENT_SECRET
});
const platform = rcsdk.platform();

// Log in to the developer platform using the developer's JWT credential
platform.login({
    'jwt': process.env.RC_JWT
});

// Call the Interaction Analysis API right after login, asynchronously
platform.on(platform.events.loginSuccess, () => {
    analyzeInteraction();
});

async function analyzeInteraction() {
    try {
        // URL-encode the webhook address so it survives as a query parameter
        let resp = await platform.post("/ai/insights/v1/async/analyze-interaction?webhook=" + encodeURIComponent(WEBHOOK_URL), {
            "contentUri": MEDIA_URL,
            "encoding": "Wav",
            "languageCode": "en-US",
            "source": "RingCentral",
            "audioType": "Meeting",
            "insights": [ "All" ],
            "enableVoiceActivityDetection": true,
            "enablePunctuation": true,
            "enableSpeakerDiarization": false
        });
        console.log("Job is " + resp.statusText + " with HTTP status code " + resp.status);
    }
    catch (e) {
        console.log("An error occurred: " + e.message);
    }
}
You are almost done. Now run your script to make the request and receive the response.
$ node index.js
The same request in Python. Save the code below to app.py and make sure the same environment variables are set.

import os
from ringcentral import SDK
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

# Invoke the Interaction Analysis API
def analyzeInteractions():
    # Endpoint of the interaction analysis API
    endpoint = os.getenv('RC_SERVER_URL') + "/ai/insights/v1/async/analyze-interaction"
    # Webhook passed as a query string
    querystring = {"webhook": os.getenv('WEBHOOK_ADDRESS')}
    # Payload
    payload = {
        "contentUri": "https://github.com/suyashjoshi/ringcentral-ai-demo/blob/master/public/audio/sample1.wav?raw=true",
        "encoding": "Wav",
        "languageCode": "en-US",
        "source": "RingCentral",
        "audioType": "Meeting",
        "insights": ["All"],
        "enableVoiceActivityDetection": True,
        "enablePunctuation": True,
        "enableSpeakerDiarization": False
    }
    try:
        # Instantiate the RingCentral SDK
        rcsdk = SDK(os.getenv('RC_CLIENT_ID'), os.getenv('RC_CLIENT_SECRET'), os.getenv('RC_SERVER_URL'))
        platform = rcsdk.platform()
        # Log in using a JWT credential
        platform.login(jwt=os.getenv('RC_JWT'))
        # POST to the interaction analysis endpoint with the query string and payload
        response = platform.post(endpoint, payload, querystring)
        print(response.json())
    except Exception as e:
        print(e)

try:
    analyzeInteractions()
except Exception as e:
    print(e)
Run Your Code
You are almost done. Now run your script to make the request and receive the response.
$ python3 app.py
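Both examples assume a web server is already listening at the webhook URL to receive the result. Below is a minimal sketch of such a receiver using only the Python standard library; the port and path are assumptions, and a production receiver would need validation and error handling:

```python
# Minimal webhook receiver sketch using only the standard library.
# The port (8080) is an assumption; point WEBHOOK_ADDRESS at this server.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read and parse the JSON body delivered by the API
        length = int(self.headers.get("Content-Length", 0))
        body = json.loads(self.rfile.read(length) or b"{}")
        print(json.dumps(body, indent=2))  # inspect the analysis result
        self.send_response(200)
        self.end_headers()

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), WebhookHandler).serve_forever()
```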
Example response
{
"status": "Success",
"response": {
"utteranceInsights": [
{
"start": 2.52,
"end": 6.53,
"text": "Could produce large hail isolated tornadoes and heavy rain.",
"confidence": 0.93,
"speakerId": "1",
"insights": [
{
"name": "Emotion",
"value": "Neutral",
"confidence": 0.7
}
]
}
],
"speakerInsights": {
"speakerCount": 2,
"insights": [
{
"name": "Energy",
"values": [
{
"speakerId": "0",
"value": 86.64
},
{
"speakerId": "1",
"value": 62.69
}
]
},
{
"name": "TalkToListenRatio",
"values": [
{
"speakerId": "0",
"value": "32:68"
},
{
"speakerId": "1",
"value": "68:32"
}
]
},
{
"name": "QuestionsAsked",
"values": [
{
"speakerId": "0",
"value": 0,
"questions": []
},
{
"speakerId": "1",
"value": 0,
"questions": []
}
]
}
]
},
"conversationalInsights": [
{
"name": "KeyPhrases",
"values": []
},
{
"name": "ExtractiveSummary",
"values": [
{
"value": "Could produce large hail isolated tornadoes and heavy rain.",
"start": 2.52,
"end": 6.53,
"speakerId": "1",
"confidence": 0.51
}
]
},
{
"name": "Topics",
"values": []
},
{
"name": "Tasks",
"values": []
},
{
"name": "AbstractiveSummaryLong",
"values": []
},
{
"name": "AbstractiveSummaryShort",
"values": []
}
]
}
}
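Once a payload like the one above arrives at your webhook, pulling insights out of it is plain dictionary traversal. A short sketch, assuming the JSON has been parsed into a `result` dict shaped like the example:

```python
# Sketch: walk the example response, assuming it is parsed into `result`.
def summarize(result):
    resp = result["response"]
    # Utterance-level insights: transcript text plus per-utterance emotion
    for u in resp["utteranceInsights"]:
        emotions = [i["value"] for i in u["insights"] if i["name"] == "Emotion"]
        print(f'[{u["start"]:.2f}-{u["end"]:.2f}] speaker {u["speakerId"]}: '
              f'{u["text"]} (emotion: {", ".join(emotions) or "n/a"})')
    # Speaker-level insights: one value per speaker per insight
    for insight in resp["speakerInsights"]["insights"]:
        for v in insight["values"]:
            print(f'{insight["name"]} for speaker {v["speakerId"]}: {v["value"]}')
```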
NOTES:
- In the case of `ExtractiveSummary`, the start and end times refer to the exact time of the segment.
- In the case of `AbstractiveSummaryLong` and `AbstractiveSummaryShort`, the start and end times refer to the time span of the text blob that was abstracted.
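Since both summary types carry start and end offsets in seconds, a small helper can turn them into clock positions for jumping around the recording. A sketch, using the extractive summary entry from the example response:

```python
# Sketch: format start/end offsets (seconds) as H:MM:SS positions.
def clock(seconds):
    m, s = divmod(int(seconds), 60)
    h, m = divmod(m, 60)
    return f"{h}:{m:02d}:{s:02d}"

values = [{"start": 2.52, "end": 6.53,
           "value": "Could produce large hail isolated tornadoes and heavy rain."}]
for item in values:
    print(f'{clock(item["start"])} - {clock(item["end"])}: {item["value"]}')
```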
Interaction-Analytics-Object
Parameter | Type | Description |
---|---|---|
`utteranceInsights` | List[Utterance-Insights-Object] | List of utterances and the insights computed for each utterance. |
`speakerInsights` | Object | The set of insights computed for each speaker separately. |
`conversationalInsights` | List[Conversational-Insights-Object] | List of insights computed by analyzing the conversation as a whole. |
Utterance-Insights-Object
Parameter | Type | Description |
---|---|---|
`speakerId` | String | The speaker ID for the corresponding audio segment. |
`start` | Number | Start time of the audio segment in seconds. |
`end` | Number | End time of the audio segment in seconds. |
`text` | String | The transcription output corresponding to the segment. |
`confidence` | Number | The confidence score for the transcribed segment. |
`insights` | List[Utterance-Insights-Unit] | List of utterance-level insights. |
Utterance-Insights-Unit
Parameter | Type | Description |
---|---|---|
`name` | String Enum | Name of the insight. Possible value: `Emotion`. |
`value` | String | Value corresponding to the insight. For `Emotion`, possible values: Anger, Anticipation, Disgust, Fear, Joy, Sadness, Surprise, Trust, Neutral. |
`confidence` | Number | Confidence score. Optional. |
Speaker-Insights-Object
Parameter | Type | Description |
---|---|---|
`speakerCount` | Number | Number of speakers detected. If `speakerCount` isn't set in the request, the number of speakers is estimated algorithmically. |
`insights` | List[Speaker-Insights-Unit] | List of speaker-level insights. Each insight is computed separately for each speaker. |
Speaker-Insights-Unit
Parameter | Type | Description |
---|---|---|
`name` | String Enum | Name of the insight. Possible values: `Energy`, `Pace`, `TalkToListenRatio`, `QuestionsAsked` (the last also appears per speaker in the example response above). |
`values` | List[Speaker-Insights-Value-Unit] | Values corresponding to the insight, one per speaker. |
Speaker-Insights-Value-Unit
Parameter | Type | Description |
---|---|---|
`speakerId` | String | The speaker ID for whom the insight is computed. |
`value` | Number | The computed value of the insight for this speaker (a String ratio such as "32:68" in the case of `TalkToListenRatio`). |
Timed-Segment
Parameter | Type | Description |
---|---|---|
`start` | Number | Start time of the audio segment in seconds. |
`end` | Number | End time of the audio segment in seconds. |
Conversational-Insights-Object
Parameter | Type | Description |
---|---|---|
`name` | String Enum | Name of the insight. Possible values: `AbstractiveSummaryLong`, `AbstractiveSummaryShort`, `ExtractiveSummary`, `KeyPhrases`, `Tasks`, `Titles`, `QuestionsAsked` |
`values` | List[Conversational-Insights-Value-Unit] | Values corresponding to the insight. |
Conversational-Insights-Value-Unit
Parameter | Type | Description |
---|---|---|
`start` | Number | Start time of the audio segment in seconds. |
`end` | Number | End time of the audio segment in seconds. |
`value` | String | The output corresponding to the insight. |
`confidence` | Number | The confidence score for the computed insight. |
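If you want type hints over these objects, the tables above map naturally onto TypedDicts. A partial sketch covering a few of the objects; field optionality follows the tables:

```python
# Partial sketch: some of the object tables above expressed as TypedDicts.
from typing import List, TypedDict

class UtteranceInsightUnit(TypedDict, total=False):
    name: str          # e.g. "Emotion"
    value: str
    confidence: float  # optional per the table

class UtteranceInsight(TypedDict):
    speakerId: str
    start: float       # seconds
    end: float         # seconds
    text: str
    confidence: float
    insights: List[UtteranceInsightUnit]

class TimedSegment(TypedDict):
    start: float       # seconds
    end: float         # seconds
```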