Questions
Conversations often contain explicit or implicit questions that indicate information gaps, decision points, or areas requiring follow-up. Identifying these questions helps teams understand which issues remain unresolved, what additional data is needed, and who might need to provide more input.
Stakeholders & Benefits:
- Product Managers & Team Leads: Quickly surface unresolved questions that need answers to move projects forward.
- CX Teams: Identify customer inquiries that must be addressed, improving responsiveness and satisfaction.
- Business Analysts: Highlight knowledge gaps that require research or additional data, informing more accurate analysis.
- Compliance & QA Officers: Confirm that all critical inquiries—internal or customer-facing—have been acknowledged.
Value Proposition:
Structured extraction of questions ensures nothing falls through the cracks. Teams can track all inquiries, direct them to the right experts, and ensure timely follow-up actions.
Data Dictionary & Schema
Objective:
Extract all questions from a timestamped, diarized transcript and return them in a standardized JSON format. Questions may be explicit (“What features are most relevant?”) or implicit (“Do you think we should...?”). They can be directed at a specific person or general.
Required JSON Schema:
```json
{
  "questions": [
    {
      "id": "<string>",
      "text": "<string>",
      "type": "question",
      "score": <number>,
      "entities": [
        {
          "type": "<string>",
          "text": "<string>",
          "value": {
            "channel": "<string>"
          }
        }
      ]
    }
  ]
}
```
Data Dictionary:
| Field | Meaning for Your Business | Example |
|---|---|---|
| questions | A list of recognized questions from the conversation | N/A |
| id | Unique identifier for the question | "question-1" |
| text | The textual representation of the question | "What features are most relevant?" |
| type | The classification label; always "question" | "question" |
| score | Confidence that this statement is a question (0.0–1.0) | 0.9 |
| entities | Extracted details about who or what the question references (if any) | [{"type":"channel","text":"Ken","value":{"channel":"Ken"}}] |
If No Questions Found:
Return:
```json
{
  "questions": []
}
```
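Before trusting model output downstream, it helps to check it against the schema above. The following is a minimal validation sketch; the function name `validate_questions` is illustrative, but the field names and constraints come directly from the schema and data dictionary.

```python
import json

def validate_questions(raw: str) -> list[dict]:
    """Parse a model response and verify it matches the questions schema.

    Raises ValueError on any structural violation; returns the list of
    question objects (possibly empty) on success.
    """
    data = json.loads(raw)
    questions = data.get("questions")
    if not isinstance(questions, list):
        raise ValueError('top-level "questions" must be a list')
    for q in questions:
        # Required scalar fields per the data dictionary.
        if not isinstance(q.get("id"), str) or not isinstance(q.get("text"), str):
            raise ValueError('"id" and "text" must be strings')
        if q.get("type") != "question":
            raise ValueError('"type" must be "question"')
        score = q.get("score")
        if not isinstance(score, (int, float)) or not 0.0 <= score <= 1.0:
            raise ValueError('"score" must be a number in [0.0, 1.0]')
        # Each entity carries a type, the matched text, and a value object.
        for ent in q.get("entities", []):
            if "type" not in ent or "text" not in ent:
                raise ValueError('each entity needs "type" and "text"')
    return questions
```

Note that the empty case shown above (`{"questions": []}`) passes validation and yields an empty list.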
Confidence & Calibration
Confidence Scoring:
- High (0.8–1.0): Clearly phrased questions or direct inquiries (“What is the deadline?”, “Does Ken know more?”).
- Medium (0.5–0.8): Slightly ambiguous questions (“Should we consider...?”, “Do you think...?”).
- Low (<0.5): Vague or potentially rhetorical phrases that may not warrant classification as a question.
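The bands above can be applied mechanically when triaging extracted questions. A minimal sketch follows; the function name is illustrative, and since the stated ranges share the boundary 0.8, this sketch assumes a score of exactly 0.8 counts as High.

```python
def confidence_band(score: float) -> str:
    """Map a question's score to the High/Medium/Low bands.

    Assumes inclusive lower bounds: >= 0.8 is high, >= 0.5 is medium,
    and anything below 0.5 is low.
    """
    if score >= 0.8:
        return "high"
    if score >= 0.5:
        return "medium"
    return "low"
```

Low-band questions are candidates for manual review or exclusion rather than automatic routing.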
Calibration Tips:
- Test on Sample Data: Run the prompt on sample transcripts and verify that it captures all genuine questions while excluding non-question chatter.
- Refine Entity Recognition: If the model struggles to identify the channel (person or team) referenced in a question, provide additional guidance or domain-specific examples.
- Seek Stakeholder Feedback: Ensure that product managers, CX leads, and compliance officers find the extracted questions useful and actionable.
- Iterative Improvement: As communication styles evolve, update the instructions to recognize new phrasings of questions.
Prompt Construction & Instructions
Role Specification & Reiteration:
- The system is a “highly experienced assistant” specialized in identifying questions within transcripts.
- Reiterate instructions to ensure compliance.
- Include no commentary or reasoning steps in the final output.
Avoid Hallucination:
- Only label what is clearly a question based on the conversation text.
- If uncertain, assign a lower score or omit the question entirely.
Strict Formatting:
- Return only the JSON structure, with no extra text or explanations.
- If no questions are found, return `{"questions": []}`.
No `messageIds` or `beginOffset`:
- Keep the schema simple and focused on `id`, `text`, `type`, `score`, and `entities`.
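One way to enforce the reduced schema is to prune any extra keys (such as `messageIds` or `beginOffset`) from each question object before downstream use. A hedged sketch, where the helper name is illustrative:

```python
# The five fields the schema permits on a question object.
ALLOWED_KEYS = {"id", "text", "type", "score", "entities"}

def prune_question(question: dict) -> dict:
    """Keep only the allowed fields, silently discarding extras
    such as messageIds or beginOffset if the model emits them."""
    return {k: v for k, v in question.items() if k in ALLOWED_KEYS}
```

This keeps stray model output from leaking provider-specific fields into stored results.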
Prompt for Implementation
System Message (Role: System):
You are a highly experienced assistant specializing in identifying questions from a timestamped, diarized conversation. A question is any utterance seeking information, confirmation, or clarity. Your output must follow this JSON schema:
```json
{
  "questions": [
    {
      "id": "<string>",
      "text": "<string>",
      "type": "question",
      "score": <number>,
      "entities": [
        {
          "type": "<string>",
          "text": "<string>",
          "value": {
            "channel": "<string>"
          }
        }
      ]
    }
  ]
}
```
Instructions (Reiterated):
- Identify all questions (explicit or implicit) from the transcript.
- Assign a unique `id` to each question (e.g., `"question-1"`).
- `text` should capture the entire question phrase.
- `type` must be `"question"`.
- Assign a `score` reflecting confidence.
- If the question references a person, team, or role, represent that entity with `type: "channel"` and specify `"channel": "<Name>"`.
- Return only the JSON object, with no explanations and no reasoning steps.
- If no questions are found, return `{"questions": []}`.
Chain-of-Thought (Hidden):
- Reason internally about which utterances are questions.
- Do not include reasoning in the final answer.
No Hallucination:
- Only extract actual questions present in the transcript.
- If uncertain, lower the score or omit the question.
System Summary:
Read the transcript, identify questions, and return them as per the schema. No extra text beyond the JSON.
User Message (Role: User):
"Analyze the following conversation and return the extracted questions as instructed:
[TRANSCRIPT_JSON]"
(Replace `[TRANSCRIPT_JSON]` with your actual conversation data.)
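The system and user messages above can be assembled into a standard chat-style request payload. A minimal sketch follows; the model name and the exact request shape depend on your provider, so treat both as placeholders rather than a definitive integration.

```python
# Abbreviated for the sketch; use the full system message from above.
SYSTEM_PROMPT = (
    "You are a highly experienced assistant specializing in identifying "
    "questions from a timestamped, diarized conversation. ..."
)

def build_request(transcript_json: str, model: str = "your-model-name") -> dict:
    """Substitute the transcript into the user message and return a
    chat-style payload. The 'model' value is a placeholder."""
    user_prompt = (
        "Analyze the following conversation and return the extracted "
        f"questions as instructed:\n{transcript_json}"
    )
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_prompt},
        ],
    }
```

The returned dictionary can be serialized to JSON and sent to whichever chat-completion endpoint you use; pair it with the validation step described earlier before consuming the response.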