Data Flow
Understand how data flows through urvo during voice conversations
During a voice conversation, urvo passes data through several components. Understanding this flow is essential for security-conscious organizations.
Overview
This guide explains:
- The complete voice pipeline architecture
- What data passes through each component
- What data is stored on urvo's infrastructure
- How to control call recording and transcription storage
Understanding Log Types
urvo generates two distinct types of logs during calls:
| Log Type | Description | Visibility |
|---|---|---|
| System Logs | Internal operational logs used by urvo for debugging, monitoring, and system health | urvo internal only — never shared with customers |
| Call Logs | Conversation data including transcripts, recordings, and call metadata | Available to customers via the dashboard |
Note: System logs are strictly internal to urvo and are never shared with customers or uploaded to external storage. They contain infrastructure-level data used for urvo's operational purposes only.
Voice Pipeline Architecture
urvo orchestrates a voice pipeline with multiple modular components. Each component handles a specific part of the voice conversation flow.
Complete Pipeline Flow
The following describes the end-to-end flow of a voice call through urvo:
- Transport Layer — Audio enters via SIP or Twilio telephony
- Speech-to-Text (Transcriber) — User audio is converted to text in real-time
- Orchestration Layer — urvo's proprietary models handle endpointing, interruption detection, emotion detection, and backchanneling
- Language Model (LLM) — Generates conversational responses based on transcribed user input
- Text-to-Speech (Voice) — Converts LLM responses into spoken audio
- Transport Layer — Synthesized audio is streamed back to the user
Throughout this pipeline, artifacts are generated: call recordings, transcripts, and call logs.
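The stage order above can be sketched as a chain of functions. This is an illustrative model only, not urvo's implementation: every name here (`CallArtifacts`, `handle_turn`, and the stage functions) is hypothetical, each stage is a trivial stand-in, and the orchestration step is reduced to a text-only placeholder even though the real models also work on audio.

```python
from dataclasses import dataclass, field


@dataclass
class CallArtifacts:
    """Hypothetical container for the artifacts produced during one call."""
    recording: list[bytes] = field(default_factory=list)
    transcript: list[tuple[str, str]] = field(default_factory=list)  # (speaker, text)
    call_log: list[str] = field(default_factory=list)  # stage names, in order


def transcribe(audio: bytes) -> str:
    # Stand-in for streaming speech-to-text: the "audio" is just UTF-8 text.
    return audio.decode("utf-8")


def orchestrate(text: str) -> dict:
    # Stand-in for endpointing, interruption detection, and emotion detection.
    return {"text": text, "emotion": "neutral", "turn_complete": True}


def generate_reply(turn: dict) -> str:
    # Stand-in for the LLM.
    return f"You said: {turn['text']}"


def synthesize(text: str) -> bytes:
    # Stand-in for text-to-speech.
    return text.encode("utf-8")


def handle_turn(inbound_audio: bytes, artifacts: CallArtifacts) -> bytes:
    """Run one user turn through the stages in the documented order."""
    artifacts.call_log.append("transport_in")
    artifacts.recording.append(inbound_audio)

    user_text = transcribe(inbound_audio)
    artifacts.call_log.append("transcriber")
    artifacts.transcript.append(("user", user_text))

    turn = orchestrate(user_text)
    artifacts.call_log.append("orchestration")

    reply_text = generate_reply(turn)
    artifacts.call_log.append("llm")
    artifacts.transcript.append(("agent", reply_text))

    outbound_audio = synthesize(reply_text)
    artifacts.call_log.append("voice")

    artifacts.recording.append(outbound_audio)
    artifacts.call_log.append("transport_out")
    return outbound_audio
```

Calling `handle_turn(b"hello", CallArtifacts())` returns `b"You said: hello"` and leaves the recording, transcript, and call log populated, which mirrors how artifacts accumulate as a side effect of the live conversation.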
Pipeline Components
1. Transport Layer
The transport layer handles real-time audio streaming between users and urvo.
| Transport Type | Description | Use Case |
|---|---|---|
| SIP | Session Initiation Protocol | Traditional phone systems, PBX integration, SIP trunking |
| Twilio | Twilio telephony integration | PSTN calls, phone numbers, outbound dialing |
2. Speech-to-Text (Transcriber)
Converts user audio into text in real time using streaming recognition. urvo uses its own speech-to-text infrastructure; there is no bring-your-own-key option for transcription.
3. Orchestration Layer
urvo runs proprietary real-time models that make conversations feel natural. These models run exclusively on urvo's infrastructure and are not customizable.
| Model | Purpose |
|---|---|
| Endpointing | Uses audio-text fusion to detect when the user has finished speaking |
| Interruption Detection | Distinguishes barge-in from affirmations like "uh-huh" |
| Background Noise Filtering | Removes ambient sounds in real time |
| Background Voice Filtering | Isolates primary speaker from TVs, echoes, and other voices |
| Backchanneling | Adds natural affirmations ("uh-huh", "yeah", "got it") |
| Emotion Detection | Analyzes emotional tone and passes it to the LLM |
| Filler Injection | Adds natural speech patterns ("um", "like", "so") |
Note: Orchestration models process data in real time and do not persist audio or intermediate results; all of their processing is ephemeral. Only the final call artifacts (recordings, transcripts, and call logs) are stored, subject to your storage settings.
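To make the interruption-detection row concrete, here is a deliberately simple toy that labels speech heard while the agent is talking. It is not how urvo's proprietary models work (those run on urvo's infrastructure and are not described in this guide); it only illustrates the distinction between an affirmation and a barge-in. The function name `classify_overlap`, the labels, and the use of a fixed phrase list are all assumptions; the phrases themselves are the examples from the table above.

```python
import re

# Example phrases taken from the table above; a fixed list is an assumption
# made for illustration and is far cruder than a real model.
AFFIRMATIONS = {"uh-huh", "yeah", "got it"}


def classify_overlap(utterance: str) -> str:
    """Label user speech that overlaps the agent's speech.

    Returns "affirmation" when every phrase is a known affirmation,
    "barge-in" when the user says anything else, and "silence" when
    there is nothing to classify.
    """
    # Split on sentence punctuation so "Yeah, got it." becomes two phrases.
    segments = [s.strip() for s in re.split(r"[,.!?;]+", utterance.lower())]
    segments = [s for s in segments if s]
    if not segments:
        return "silence"
    if all(segment in AFFIRMATIONS for segment in segments):
        return "affirmation"
    return "barge-in"
```

With this toy, `classify_overlap("Yeah, got it.")` is labeled an affirmation, so the agent would keep talking, while `classify_overlap("Wait, that's not my account")` is labeled a barge-in, so the agent would stop and listen.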
4. Language Model (LLM)
Generates conversational responses based on transcribed user input. You can choose from urvo's available LLMs when configuring your agent.
Note: Bring-your-own-key is not supported for LLMs by default. If you need to use your own API key for a specific LLM, contact support@urvo.io.
5. Text-to-Speech (Voice)
Converts LLM responses into spoken audio. urvo uses its own text-to-speech infrastructure — there is no bring-your-own-key option for voice synthesis.
Default Data Flow
In the default configuration, urvo handles all pipeline components and stores artifacts on urvo's infrastructure.
What Is Stored by Default
- Call recordings — Audio recordings of the full conversation
- Transcripts — Full transcriptions with timestamps
- Call logs — Metadata and component-level details for each call
- Product usage metrics — Internal analytics (urvo only, not customer-accessible)
- System logs — Operational logs (urvo only, not customer-accessible)
Controlling Data Storage
You can control whether call recordings and transcriptions are stored by adjusting settings on your agent's Configure page.
Disabling Call Recordings
To disable call recordings:
- Go to your agent's Configure page
- Scroll down to the Advanced section
- Turn off "Enable Recordings"
When this setting is off, urvo no longer stores audio recordings for that agent's calls.
Disabling Transcriptions and Recordings
To disable both transcriptions and recordings:
- Go to your agent's Configure page
- Scroll down to the Advanced section
- Set "Conversations Retention Period" to 0
Setting the retention period to 0 prevents urvo from storing both transcriptions and recordings for that agent's calls.
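The combined effect of the two settings can be summarized in a few lines. This is a minimal sketch of the behavior this guide documents, not urvo code: `stored_artifacts`, its parameter names, and the dictionary keys are hypothetical, and the retention period is treated as a plain non-negative number because this guide does not state its unit.

```python
def stored_artifacts(enable_recordings: bool, retention_period: int) -> dict[str, bool]:
    """Return which artifacts end up stored for an agent's calls.

    enable_recordings models the "Enable Recordings" toggle and
    retention_period models "Conversations Retention Period".
    """
    if retention_period < 0:
        raise ValueError("retention_period must be zero or greater")

    # A retention period of 0 prevents storing transcriptions and recordings.
    conversations_kept = retention_period > 0

    return {
        # Recordings need the toggle on and a non-zero retention period.
        "call_recordings": enable_recordings and conversations_kept,
        "transcripts": conversations_kept,
        "call_logs": conversations_kept,
        # Internal to urvo; cannot be disabled by customers.
        "product_usage_metrics": True,
        "system_logs": True,
    }
```

For example, `stored_artifacts(False, 30)` keeps transcripts and call logs but no recordings, while `stored_artifacts(True, 0)` keeps none of the three customer-facing artifacts even though the recordings toggle is on.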
Custom Storage
If you need call data stored in your own cloud storage, contact support@urvo.io to discuss custom storage options.
What Data Passes Through urvo
The following describes what data is processed and how it is retained:
| Data Type | Processing | Retention |
|---|---|---|
| Raw audio streams | Real-time routing to Transcriber / Voice | Ephemeral in the pipeline; kept only as a call recording when recordings are enabled |
| Transcribed text | Orchestration analysis, LLM routing | Call logs (unless disabled) |
| LLM responses | Filler injection, Voice routing | Call logs (unless disabled) |
| Emotion metadata | Passed to LLM context | Ephemeral |
| Call signaling | SIP / telephony management | Metadata only |
Artifacts Storage Summary
| Artifact | Default Location | Can Be Disabled |
|---|---|---|
| Call Recordings | urvo | Yes — Turn off "Enable Recordings" in the Advanced section |
| Transcripts | urvo | Yes — Set "Conversations Retention Period" to 0 |
| Call Logs | urvo | Yes — Set "Conversations Retention Period" to 0 |
| Product Usage Metrics | urvo | No — Internal to urvo |
| System Logs | urvo | No — Internal to urvo |
Infrastructure Summary
The following summarizes what runs on urvo's infrastructure and what you can control:
| Component | Infrastructure | Customizable |
|---|---|---|
| Transport | SIP / Twilio | Choose SIP or Twilio |
| Transcriber | urvo | urvo only |
| Orchestration | urvo | urvo only |
| LLM | urvo (multiple providers available) | Choose from available LLMs; contact support for BYOK |
| Voice | urvo | urvo only |
| Storage | urvo | Contact support for custom storage |
Note: The Orchestration Layer (endpointing, interruption detection, emotion detection, backchanneling, filler injection) is urvo's core technology and runs exclusively on urvo infrastructure. Audio processed by these models is ephemeral and is not stored.
Questions?
If you have questions about urvo's data flow or storage practices, or need custom storage arrangements, reach out to support@urvo.io.