Persona

Persona Context Page - User Manual

Overview

The Persona page (located at Context → Persona) allows you to create and manage AI personality templates for your voice agents. Each persona defines comprehensive voice characteristics, behavioral parameters, and technical settings based on OpenAI's Realtime API. Personas are reusable templates that can be assigned to multiple agents, ensuring consistent personality and performance across your fleet.

Page Layout

Information Banner

At the top of the page, you'll find a dismissible blue information alert that explains the purpose of persona templates and how they connect to OpenAI's Realtime API parameters.

Persona Library Table

The main content area displays all configured personas in a sortable table with the following columns:

Persona ID: Unique identifier (e.g., PER-001) in monospace blue text
Name: Persona name with description displayed below in smaller gray text
Voice & Speech: Shows selected voice character (e.g., "shimmer") and speech speed multiplier
Model Settings: Displays temperature value and latency hint setting
Used By: Number of agents currently using this persona (sortable)
Created: Date the persona was created
Actions: View, Edit, and Delete buttons for each persona

The table supports pagination (10 personas per page) and can be sorted by clicking column headers.

Creating a New Persona

Step 1: Open Creation Drawer

Click the "Create Persona" button (purple, located in the top-right corner of the Persona Library card) to open the persona configuration drawer on the right side of the screen.

Step 2: Basic Information

Persona Name (Required)

Enter a descriptive name that clearly identifies the persona's purpose and personality type.

Example: "Professional & Empathetic Support Agent"
Guidelines: Use clear, descriptive names that indicate tone and use case

Description (Required)

Provide a 2-3 sentence description of the personality, ideal use cases, and scenarios where this persona excels.

Example: "Warm, professional tone with high empathy. Ideal for customer support and service recovery."

Step 3: Voice & Speech Configuration

This section controls how your agent sounds and speaks. Click the section header to expand/collapse.

Voice Character (Required)

Select the AI voice that will be used for text-to-speech synthesis. Each voice has distinct characteristics:

Alloy - Neutral, balanced tone
Ash - Smooth, professional
Ballad - Warm, storytelling quality
Coral - Bright, energetic
Echo - Clear, authoritative
Sage - Wise, calm
Shimmer - Friendly, approachable (Default)
Verse - Expressive, dynamic
Marin - Natural, conversational
Cedar - Deep, reassuring (newest)

Best Practice: Test different voices in the Playground to find the best match for your brand and use case.

Speech Speed

Use the slider to control how fast the AI speaks:

Range: 0.25× (very slow) to 4.0× (very fast)
Default: 1.0× (normal speed)
Recommended: 0.9× - 1.15× for most use cases
Use Cases:
- 0.8× - 0.9×: Technical support, complex instructions
- 1.0×: Standard customer service
- 1.1× - 1.2×: Sales, energetic personalities
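The ranges above could be encoded as a small validation helper. This is a minimal sketch: the function name `check_speech_speed` and the labels it returns are hypothetical and not part of the product; only the numeric limits come from this page.

```python
# Hypothetical helper; the limits are copied from the ranges listed above.
SPEED_MIN, SPEED_MAX = 0.25, 4.0
RECOMMENDED_MIN, RECOMMENDED_MAX = 0.9, 1.15


def check_speech_speed(speed: float) -> str:
    """Classify a speech speed multiplier against the documented ranges."""
    if not SPEED_MIN <= speed <= SPEED_MAX:
        raise ValueError(
            f"speed must be between {SPEED_MIN} and {SPEED_MAX}, got {speed}"
        )
    if RECOMMENDED_MIN <= speed <= RECOMMENDED_MAX:
        return "recommended"
    # Inside the slider range but outside the recommended band.
    return "allowed"
```

A check like this can help catch a persona that was saved with an extreme speed by accident.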

System Instructions (Required)

This is the most important field—it defines your agent's personality, behavior, and communication style. Be specific and detailed.

Required Elements:

Role definition ("You are a...")
Tone and style guidance
Response length constraints
Language complexity level
Interaction approach

Example:

You are a helpful and professional customer service agent. Speak in a warm, friendly tone while being efficient. Show empathy for customer concerns and offer proactive solutions. Keep responses to 2-3 sentences per turn. Use clear, simple language and avoid technical jargon unless necessary. Always confirm understanding before proceeding with complex tasks.

Best Practices:

Be explicit about desired behavior
Include response length guidelines to prevent rambling
Specify when to escalate or transfer
Define brand voice characteristics
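One way to make sure no required element is forgotten is to assemble the instructions from five separate parts. The sketch below is illustrative only: `InstructionParts` and `build_system_instructions` are hypothetical names, and the product itself accepts a single free-text field.

```python
from dataclasses import dataclass


@dataclass
class InstructionParts:
    """The five required elements listed above (hypothetical structure)."""
    role: str
    tone: str
    response_length: str
    language_level: str
    interaction: str


def build_system_instructions(parts: InstructionParts) -> str:
    """Join the five elements into one instruction string, rejecting blanks."""
    fields = [
        parts.role,
        parts.tone,
        parts.response_length,
        parts.language_level,
        parts.interaction,
    ]
    if any(not field.strip() for field in fields):
        raise ValueError("all five required elements must be non-empty")
    return " ".join(field.strip() for field in fields)


instructions = build_system_instructions(InstructionParts(
    role="You are a helpful and professional customer service agent.",
    tone="Speak in a warm, friendly tone while being efficient.",
    response_length="Keep responses to 2-3 sentences per turn.",
    language_level="Use clear, simple language and avoid technical jargon unless necessary.",
    interaction="Always confirm understanding before proceeding with complex tasks.",
))
```

The resulting string can be pasted into the System Instructions field.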

Step 4: Model Behavior

This section controls the AI's response patterns and technical performance.

Temperature

Controls response randomness and creativity:

Range: 0.6 to 1.2
Default: 0.8 (optimal for voice)
Lower values (0.6-0.7): More focused, deterministic, consistent responses
Higher values (0.9-1.2): More creative, varied, conversational responses

Recommendation: Keep at 0.8 for most use cases. Adjust only if responses feel too rigid (increase) or too unpredictable (decrease).

Max Response Tokens

Limits the maximum length of agent responses:

Unlimited (Default): No token limit
512 tokens: Very concise responses (1-2 sentences)
1024 tokens: Standard responses (2-4 sentences)
2048 tokens: Detailed responses (full paragraph)
4096 tokens: Comprehensive responses (multiple paragraphs)

Best Practice: Set explicit limits to prevent verbose responses and control costs. Use 1024 for most customer service scenarios.
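The Temperature and Max Response Tokens settings could be validated and expressed as a session payload along these lines. This is a sketch under assumptions: the field names `temperature` and `max_response_output_tokens` (with `"inf"` for no limit) follow OpenAI's Realtime API session object, but this manual does not confirm that the product sends them in this form, and `model_behavior_payload` is a hypothetical helper.

```python
from typing import Optional, Union

# Token limits offered in the Max Response Tokens dropdown described above.
ALLOWED_TOKEN_LIMITS = (512, 1024, 2048, 4096)


def model_behavior_payload(temperature: float = 0.8,
                           max_tokens: Optional[int] = None) -> dict:
    """Validate the two settings and return them as a dict.

    max_tokens=None stands for the "Unlimited" option.
    """
    if not 0.6 <= temperature <= 1.2:
        raise ValueError(f"temperature must be in [0.6, 1.2], got {temperature}")
    if max_tokens is not None and max_tokens not in ALLOWED_TOKEN_LIMITS:
        raise ValueError(f"max_tokens must be one of {ALLOWED_TOKEN_LIMITS} or None")
    limit: Union[int, str] = "inf" if max_tokens is None else max_tokens
    # Assumed field names, borrowed from the Realtime API session object.
    return {"temperature": temperature, "max_response_output_tokens": limit}
```

For the customer service recommendation above, `model_behavior_payload(0.8, 1024)` keeps the default temperature and applies the 1024-token limit.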

Latency Hints

Balances response quality versus speed:

Low - Fastest: Optimizes for speed, minimal latency (use for high-volume, simple interactions)
Balanced (Default): Equal priority to speed and quality
High - Most Coherent: Prioritizes response quality and thoughtfulness (use for complex problem-solving)

Output Modalities

Determines response format:

Text: Agent generates text-only responses
Audio: Agent generates voice-only responses
Text + Audio (Default): Agent generates both (recommended for full functionality)

Context Truncation

Manages conversation history when approaching token limits:

Auto-truncate (Default - ON): Automatically removes oldest messages when nearing limit
Error on full: Throws error when context is full (requires manual intervention)

Recommendation: Keep enabled for production to prevent conversation failures.
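The difference between the two modes can be pictured with a toy model. The sketch below is not the product's implementation: how tokens are counted and exactly when truncation starts are not documented here, so it simply drops the oldest messages once the total exceeds a limit. The name `fit_context` and the `(text, token_count)` message shape are assumptions.

```python
from typing import List, Tuple

Message = Tuple[str, int]  # (text, token_count), a hypothetical shape


def fit_context(messages: List[Message], token_limit: int,
                auto_truncate: bool = True) -> List[Message]:
    """Return the messages that fit within token_limit.

    With auto_truncate on, the oldest messages are dropped first.
    With it off, an over-limit context raises an error instead.
    """
    total = sum(count for _, count in messages)
    if total <= token_limit:
        return list(messages)
    if not auto_truncate:
        raise OverflowError("context is full")
    kept = list(messages)
    while kept and total > token_limit:
        _, count = kept.pop(0)  # remove the oldest message
        total -= count
    return kept


history = [("greeting", 50), ("order details", 40), ("follow-up", 30)]
trimmed = fit_context(history, token_limit=80)
```

Here the 120-token history is trimmed to the two most recent messages (70 tokens), while the same call with `auto_truncate=False` raises an error.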

Step 5: Turn Detection (Voice Activity)

This section controls how the system detects when the user has finished speaking—critical for natural conversation flow.

Detection Type

Choose how the system identifies speech boundaries:

Server VAD - Automatic voice detection (Default): Server-side Voice Activity Detection, fastest and most reliable
Semantic VAD - Meaning-based: Uses AI to detect when a complete thought is expressed (best for noisy environments)
None - Manual/Push-to-talk: User must manually indicate when finished speaking

Recommendation: Use Server VAD for standard deployments. Use Semantic VAD if customers are in noisy environments (call centers, outdoor settings).

Detection Threshold

Controls sensitivity for detecting speech:

Range: 0 (most sensitive) to 1.0 (least sensitive)
Default: 0.5 (balanced)
Lower values (0.3-0.4): Picks up quiet speech, may trigger on background noise
Higher values (0.6-0.8): Ignores background noise better, may miss soft-spoken customers

Best Practice: Start at 0.5, adjust based on testing. Increase if false triggers occur, decrease if missing customer speech.

Prefix Padding

Amount of audio (in milliseconds) captured before the detected start of speech that is included in the input:

Range: 0ms to 1000ms (1 second)
Default: 300ms
Purpose: Prevents cutting off the first few words when customer starts speaking

Recommendation: Keep at 300ms to ensure natural conversation starts.

Silence Duration

Milliseconds of silence required before the system considers the user's speech to have ended:

Range: 100ms (0.1 seconds) to 2000ms (2 seconds)
Default: 200ms
Shorter durations (100-200ms): Faster responses, agent may interrupt mid-thought
Longer durations (500-1000ms): More patient, fewer interruptions, slower response time

Best Practice: Use 200-300ms for most cases. Increase to 500ms+ if customers need time to think or pause frequently.
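To see how Detection Threshold, Prefix Padding, and Silence Duration interact, consider the toy detector below. It is a simplified illustration, not how Server VAD is actually implemented: it assumes audio arrives as fixed 100ms frames, each with a speech likelihood between 0 and 1, and `detect_turn` is a hypothetical name.

```python
from typing import Optional, Sequence, Tuple


def detect_turn(levels: Sequence[float], frame_ms: int = 100,
                threshold: float = 0.5, prefix_padding_ms: int = 300,
                silence_duration_ms: int = 200) -> Optional[Tuple[int, int]]:
    """Return (start_frame, end_frame) of the first turn, end exclusive.

    Returns None if no complete turn is found.
    """
    pad_frames = prefix_padding_ms // frame_ms
    silence_needed = max(1, silence_duration_ms // frame_ms)
    start = None
    silent_run = 0
    for i, level in enumerate(levels):
        if level >= threshold:
            if start is None:
                # Reach back to include audio before the first detected frame.
                start = max(0, i - pad_frames)
            silent_run = 0
        elif start is not None:
            silent_run += 1
            if silent_run >= silence_needed:
                # The turn ends at the first frame of the closing silence.
                return start, i - silent_run + 1
    return None


# The speaker pauses briefly at frame 6, then continues.
levels = [0.1, 0.1, 0.1, 0.1, 0.9, 0.8, 0.2, 0.9, 0.9, 0.1, 0.1, 0.1]
patient = detect_turn(levels)                          # (1, 9)
eager = detect_turn(levels, silence_duration_ms=100)   # (1, 6)
```

With 200ms of required silence the brief pause is tolerated and the whole utterance is captured; with 100ms the turn is cut at the pause, which is the mid-thought interruption described above.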

Auto-create Response

When enabled (default: ON), the system automatically generates an AI response once it detects that the user has finished speaking. Disable only for push-to-talk scenarios.

Allow Interruptions

When enabled (default: ON), allows users to interrupt the AI mid-response. Recommended for natural conversation flow.
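Taken together, the settings in this step resemble the `turn_detection` object of OpenAI's Realtime API. The sketch below builds such an object from the values on this page. The field names (`type`, `threshold`, `prefix_padding_ms`, `silence_duration_ms`, `create_response`, `interrupt_response`) come from that API; whether the product maps its controls onto them in exactly this way is an assumption, and `turn_detection_payload` is a hypothetical helper.

```python
from typing import Optional


def turn_detection_payload(detection_type: str = "server_vad",
                           threshold: float = 0.5,
                           prefix_padding_ms: int = 300,
                           silence_duration_ms: int = 200,
                           create_response: bool = True,
                           interrupt_response: bool = True) -> Optional[dict]:
    """Build a turn-detection dict, or None for manual/push-to-talk."""
    if detection_type == "none":
        return None
    if detection_type not in ("server_vad", "semantic_vad"):
        raise ValueError(f"unknown detection type: {detection_type}")
    payload = {
        "type": detection_type,
        "create_response": create_response,
        "interrupt_response": interrupt_response,
    }
    if detection_type == "server_vad":
        # Ranges below are the ones documented in this step.
        if not 0.0 <= threshold <= 1.0:
            raise ValueError("threshold must be between 0 and 1.0")
        if not 0 <= prefix_padding_ms <= 1000:
            raise ValueError("prefix padding must be between 0 and 1000 ms")
        if not 100 <= silence_duration_ms <= 2000:
            raise ValueError("silence duration must be between 100 and 2000 ms")
        payload.update(
            threshold=threshold,
            prefix_padding_ms=prefix_padding_ms,
            silence_duration_ms=silence_duration_ms,
        )
    return payload
```

Calling the function with no arguments reproduces the defaults described in this step.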

Step 6: Audio Settings

This section configures audio input/output formats and transcription.

Input Audio Format

Format for audio received from the user:

PCM16 - 16-bit PCM at 24kHz (Default): Standard format, widely compatible
G.711 μ-law: Telephony standard (North America/Japan)
G.711 A-law: Telephony standard (Europe/rest of world)
Opus - Lowest latency: Compressed format with minimal delay (best for production)

Recommendation: Use PCM16 for testing, switch to Opus for production deployments.
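For a rough sense of the bandwidth difference between these formats, the raw bitrate of a fixed-rate format is sample rate times bits per sample times channels. The sketch below assumes mono audio and the standard 8kHz, 8-bit encoding for G.711; Opus is omitted because its bitrate is variable. `raw_bitrate_kbps` is a hypothetical helper.

```python
def raw_bitrate_kbps(sample_rate_hz: int, bits_per_sample: int,
                     channels: int = 1) -> float:
    """Bitrate in kilobits per second for a fixed-rate audio format."""
    return sample_rate_hz * bits_per_sample * channels / 1000


pcm16_kbps = raw_bitrate_kbps(24_000, 16)  # PCM16 at 24kHz, mono: 384.0
g711_kbps = raw_bitrate_kbps(8_000, 8)     # G.711 at 8kHz, 8-bit: 64.0
```

PCM16 at 24kHz therefore needs six times the bandwidth of G.711, which is one reason a compressed format is preferred for production traffic.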

Output Audio Format

Format for audio sent to the user:

PCM16 - Standard: Uncompressed, high quality
G.711 μ-law / A-law: Telephony formats
Opus - Best for production (Recommended): Low latency, efficient bandwidth
MP3: Compressed, universal compatibility
WAV: Uncompressed, largest file size

Recommendation: Use Opus in production for optimal latency and quality.

Enable Transcription

When enabled, uses OpenAI Whisper to transcribe user audio to text:

Default: Disabled (OFF)
Use Cases: Better conversation logging, sentiment analysis, compliance recording
Note: Adds minimal latency and additional API costs

Transcription Language (Optional)

Provide a language hint (ISO code) for improved transcription accuracy:

Auto-detect (no selection) works for most cases
Specify language for better accuracy with accents or multilingual customers
Available: English, Spanish, French, German, Italian, Portuguese, Chinese, Japanese, Korean
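If you set the language hint programmatically, a lookup like the one below may be useful. It assumes the "ISO code" mentioned above means the two-letter ISO 639-1 code, which this page does not state explicitly; `language_hint` is a hypothetical helper.

```python
from typing import Optional

# ISO 639-1 codes for the languages listed above.
LANGUAGE_CODES = {
    "English": "en",
    "Spanish": "es",
    "French": "fr",
    "German": "de",
    "Italian": "it",
    "Portuguese": "pt",
    "Chinese": "zh",
    "Japanese": "ja",
    "Korean": "ko",
}


def language_hint(name: Optional[str]) -> Optional[str]:
    """Return the code for a language name, or None for auto-detect."""
    if name is None:
        return None  # no selection means auto-detect
    try:
        return LANGUAGE_CODES[name]
    except KeyError:
        raise ValueError(f"unsupported transcription language: {name}") from None
```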

Editing Existing Personas

1. Locate the persona in the Persona Library table.
2. Click the "Edit" button in the Actions column. The configuration drawer opens with all current settings pre-populated.
3. Modify any fields as needed.
4. Click "Update Persona" to save changes, or click "Cancel" to discard them.

Important: Changes to personas affect all agents currently using that persona. Test thoroughly in Mirror Mode before updating production personas.

Deleting Personas

1. Locate the persona in the table.
2. Click the red "Delete" button in the Actions column.
3. If no agents are using the persona, it is deleted immediately (no confirmation dialog).

Warning: You cannot delete personas that are currently in use by active agents. The system will prevent deletion and display the number of agents using the persona.

Best Practice: Before deleting a persona, verify it's not assigned to any agents by checking the "Used By" column.
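The deletion rule can be summarized as a guard on the "Used By" count. The sketch below only models the behavior described above; the `personas` dictionary shape and the `delete_persona` function are hypothetical and do not represent a product API.

```python
def delete_persona(personas: dict, persona_id: str) -> None:
    """Remove a persona unless agents still use it.

    `personas` maps a persona ID to a dict with a "used_by" count
    (a hypothetical shape chosen for this example).
    """
    used_by = personas[persona_id]["used_by"]
    if used_by > 0:
        raise RuntimeError(
            f"{persona_id} is used by {used_by} agent(s) and cannot be deleted"
        )
    # No confirmation step: the persona is removed immediately.
    del personas[persona_id]


library = {
    "PER-001": {"name": "Support Agent", "used_by": 3},
    "PER-002": {"name": "Draft Persona", "used_by": 0},
}
delete_persona(library, "PER-002")
```

After this call only `PER-001` remains; attempting to delete it raises an error until its three agents are reassigned.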

Viewing Persona Details

Click the "View" button to see a read-only summary of the persona's complete configuration without opening the edit drawer.

Best Practices

Persona Design

Start with templates: Use the four default personas as starting points
Test iteratively: Create persona → test in Playground → refine → test again
Document use cases: Use clear descriptions to help team members select appropriate personas
Maintain consistency: Create distinct personas for different use cases rather than trying to make one persona do everything

System Instructions

Be specific: Vague instructions lead to unpredictable behavior
Set boundaries: Explicitly state what the agent should and shouldn't do
Control verbosity: Always include response length guidelines ("Keep responses to 2-3 sentences")
Include escalation triggers: Define when to transfer to human agents
Test edge cases: Verify behavior with difficult or unusual customer inputs

Technical Settings

Don't over-tune: Start with defaults, adjust only based on observed behavior
Test in production conditions: Silence thresholds perform differently in quiet offices vs. noisy call centers
Monitor latency: Higher quality settings may introduce delay—balance quality vs. responsiveness
Use Opus in production: Switch to the Opus audio format before deploying; it reduces latency compared with uncompressed formats

Maintenance

Version control: When making significant changes, create a new persona instead of modifying existing ones
A/B test: Use Mirror Mode to compare persona variations before rolling out changes
Monitor performance: Track quality scores and customer satisfaction by persona to identify optimization opportunities
Regular reviews: Review all personas quarterly to ensure they align with current brand voice and business needs
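The "create a new persona instead of modifying" practice amounts to copying a persona under a fresh ID. The sketch below illustrates the idea with an in-memory dictionary. The ID pattern follows the PER-001 example shown in the Persona Library table, but how the product actually assigns IDs is not documented here, and `next_persona_id` and `clone_persona` are hypothetical names.

```python
import copy
import re
from typing import Iterable


def next_persona_id(existing_ids: Iterable[str]) -> str:
    """Return the next free ID in the assumed PER-NNN pattern."""
    numbers = [
        int(match.group(1))
        for pid in existing_ids
        if (match := re.fullmatch(r"PER-(\d+)", pid))
    ]
    return f"PER-{max(numbers, default=0) + 1:03d}"


def clone_persona(personas: dict, source_id: str, new_name: str) -> str:
    """Copy a persona under a new ID so the original stays untouched."""
    new_id = next_persona_id(personas)
    clone = copy.deepcopy(personas[source_id])
    clone["name"] = new_name
    clone["used_by"] = 0  # no agents use the new version yet
    personas[new_id] = clone
    return new_id


library = {
    "PER-001": {"name": "Support Agent", "temperature": 0.8, "used_by": 3},
    "PER-004": {"name": "Sales Agent", "temperature": 0.9, "used_by": 1},
}
new_id = clone_persona(library, "PER-001", "Support Agent v2")
library[new_id]["temperature"] = 0.85
```

The copy can then be tuned and compared against the original before any agents are switched over.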

Common Issues and Solutions

Issue | Likely Cause | Solution
Agent interrupts customers frequently | Silence duration too short | Increase silence duration to 500-700ms
Agent misses first word when customer starts speaking | Prefix padding too low | Increase prefix padding to 400-500ms
Responses are too robotic | Temperature too low | Increase temperature to 0.85-0.9
Responses are inconsistent or random | Temperature too high | Decrease temperature to 0.7-0.75
Agent talks too much | No token limit set | Set max response tokens to 1024
Slow response time | Latency hints set to "High" | Change to "Balanced" or "Low"
Background noise triggers false detections | VAD threshold too low | Increase threshold to 0.6-0.7

Related Features

Agent Playground (Tools → Playground): Test personas in real-time conversations before deployment
Create Agent Wizard (Agent Management → Create Agents): Assign personas during agent creation
Mirror Mode (Agent Management → Deploy Agents): Compare persona performance with production agents

Next Steps: After creating your persona, navigate to Agent Management → Create Agents to build an agent using this personality template.


