Conversation analytics
Conversations are rich with insights beyond words alone. By quantifying who spoke when, how long they spoke, how fast they talked, how much silence occurred, and if multiple people spoke simultaneously (overlap), organizations gain a clearer picture of conversational dynamics.
Who Benefits:
- Project Managers & Team Leads: Understand team communication patterns, ensuring balanced participation and effective information exchange.
- CX Teams: Verify that customers get ample talk time and agents aren’t rushing or dominating the discussion.
- Business Analysts & Strategists: Identify communication bottlenecks and improve training and workflows based on measured engagement.
- Compliance & QA Officers: Check whether calls adhere to standards of fairness, responsiveness, and clarity.
Value Proposition:
By turning raw conversation audio into quantifiable metrics, teams can quickly diagnose interaction issues, highlight improvements, and ensure more productive, customer-centered communications.
Data Dictionary & Schema
Objective:
Extract key conversation metrics—duration, silence, overlap, speaker ratios, talk times, and pace—from a timestamped, diarized conversation. The userId
field is optional and should only be included if provided as part of the input data.
JSON Schema:
{
"analytics": {
"duration": <number>,
"silence": {
"seconds": <number>
},
"overlap": {
"percent": <number>,
"seconds": <number>
},
"speakers": [
{
"id": "<string>",
"name": "<string>",
"ratio": <number>,
"talkTime": {
"seconds": <number>
},
"listenTime": {
"seconds": <number>
},
"pace": {
"wpm": <number>
}
// Include "userId": "<string>" only if it is provided in the input
}
]
}
}
Data Dictionary:
Field | Meaning for Your Business | Example |
---|---|---|
analytics | Encapsulates all conversation metrics | N/A |
duration | Total conversation duration in seconds | 3600 (60 mins) |
silence.seconds | Amount of silence in seconds | 600 (10 mins) |
overlap.percent | Percent of conversation where multiple people spoke at once | 10.0 |
overlap.seconds | Total overlap time in seconds | 360 |
speakers | Array of speaker-level metrics | [ {...}, {...} ] |
speakers[].id | Unique identifier for a speaker | "speaker-1" |
speakers[].name | The speaker’s display name | "John" |
speakers[].ratio | Speaker’s talk ratio compared to others | 4.0 (John spoke 4x more than Arya) |
speakers[].talkTime.seconds | Total talk time in seconds per speaker | 2400 (40 mins) |
speakers[].listenTime.seconds | Total listen time in seconds (if applicable) | 0 if none |
speakers[].pace.wpm | Words per minute for that speaker | 100 wpm |
speakers[].userId | Optional unique user identifier if provided in input | "user-123" |
If No Analytics:
{
"analytics": {
"duration": 0,
"silence": { "seconds": 0 },
"overlap": { "percent": 0.0, "seconds": 0 },
"speakers": []
}
}
Confidence & Calibration
Confidence Considerations:
- Duration, silence, and overlap are time-based and usually straightforward, leading to high confidence.
- Ratio, talk time, and pace depend on accurate speaker diarization and word counting. If speech-to-text or speaker identification is uncertain, consider assigning lower confidence scores internally (though not explicitly required in the schema).
- Iterative testing with known samples will refine accuracy. Gathering feedback from PMs, CX leads, or compliance officers will guide threshold adjustments or metric definitions.
Calibration Steps:
- Test on Known Conversations: Validate metrics against manually timed samples.
- Adjust Pace Logic: If wpm seems off, refine speech recognition or pause handling.
- Solicit Feedback: Adjust metrics if stakeholders need more granularity or different interpretations.
- Continuous Improvement: Update logic as conversation patterns shift (e.g., shorter standups, longer support calls).
Prompt Construction & Instructions
Role Specification & Reiteration:
- The system is a “highly experienced assistant” focused on extracting these analytics.
- Reiterate the schema and instructions multiple times to avoid confusion.
- Output only the JSON—no reasoning steps, no extra text.
No Hallucination:
- Only use data derivable from the conversation (e.g., timestamps, speech durations).
- If uncertain about any metric, default to zero or omit optional fields.
Strict Formatting:
- Return only the JSON structure.
- Include
userId
for a speaker only if provided in the input data.
Prompt Implementation
System Message (Role: System):
You are a highly experienced assistant that extracts conversation analytics (duration, silence, overlap, speaker metrics) from a timestamped, diarized transcript. Your output must follow this JSON schema:
{
"analytics": {
"duration": <number>,
"silence": {
"seconds": <number>
},
"overlap": {
"percent": <number>,
"seconds": <number>
},
"speakers": [
{
"id": "<string>",
"name": "<string>",
"ratio": <number>,
"talkTime": {
"seconds": <number>
},
"listenTime": {
"seconds": <number>
},
"pace": {
"wpm": <number>
}
// Include "userId": "<string>" only if provided in the input
}
]
}
}
Instructions (Reiterated):
- Calculate total
duration
in seconds. - Identify
silence
duration in seconds. - Determine
overlap
percent and seconds. - For each speaker:
- Assign
id
,name
. - Calculate
ratio
of their talk time compared to others. talkTime.seconds
andlistenTime.seconds
as applicable.pace.wpm
= words per minute if available.- Include
userId
only if present in input data.
- Assign
- Return only the JSON—no extra text or reasoning.
- If no metrics can be derived, return zeros or empty arrays.
Chain-of-Thought (Hidden):
- Reason silently, don’t display reasoning steps.
No Hallucination:
- Stick to data that can be derived from the transcript’s timeline and speaker info.
- If uncertain, use defaults (e.g., 0 seconds, 0.0 percent).
System Summary:
Read the transcript, compute analytics, and return them strictly in the defined JSON format. No additional commentary."
User Message (Role: User):
"Analyze the following conversation and return the analytics as instructed:
[TRANSCRIPT_JSON][TRANSCRIPT_JSON]"
(Replace [TRANSCRIPT_JSON]
with your actual conversation data.)
Updated about 2 months ago