# Data Flow

Understand how data flows through urvo during voice conversations.

During a voice conversation, data flows through multiple urvo components. Understanding this flow is essential for security-conscious organizations.

## Overview

This guide explains:

- The complete voice pipeline architecture
- What data passes through each component
- What data is stored on urvo's infrastructure
- How to control call recording and transcription storage

### Understanding Log Types

urvo generates two distinct types of logs during calls:

| Log Type | Description | Visibility |
|---|---|---|
| **System Logs** | Internal operational logs used by urvo for debugging, monitoring, and system health | urvo internal only — never shared with customers |
| **Call Logs** | Conversation data including transcripts, recordings, and call metadata | Available to customers via the dashboard |

**Note:** System logs are strictly internal to urvo and are never shared with customers or uploaded to external storage. They contain infrastructure-level data used for urvo's operational purposes only.

## Voice Pipeline Architecture

urvo orchestrates a voice pipeline with multiple modular components. Each component handles a specific part of the voice conversation flow.

### Complete Pipeline Flow

The following describes the end-to-end flow of a voice call through urvo:

1. **Transport Layer** — Audio enters via SIP or Twilio telephony
2. **Speech-to-Text (Transcriber)** — User audio is converted to text in real-time
3. **Orchestration Layer** — urvo's proprietary models handle endpointing, interruption detection, emotion detection, and backchanneling
4. **Language Model (LLM)** — Generates conversational responses based on transcribed user input
5. **Text-to-Speech (Voice)** — Converts LLM responses into spoken audio
6. **Transport Layer** — Synthesized audio is streamed back to the user

Throughout this pipeline, **artifacts** are generated: call recordings, transcripts, and call logs.
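
As a rough mental model, the six steps above can be sketched as a chain of functions that appends to the call's artifacts as it goes. This is an illustration only, not urvo's implementation or API: `CallArtifacts`, `run_turn`, and the stand-in stage functions are all hypothetical names, and recordings are left out to keep the sketch short.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class CallArtifacts:
    """Hypothetical container for the artifacts a call produces."""
    transcript: List[str] = field(default_factory=list)
    call_log: List[str] = field(default_factory=list)


def run_turn(
    audio_in: bytes,
    transcribe: Callable[[bytes], str],
    orchestrate: Callable[[str], str],
    generate: Callable[[str], str],
    synthesize: Callable[[str], bytes],
    artifacts: CallArtifacts,
) -> bytes:
    """Run one conversational turn through steps 2-5 of the pipeline."""
    user_text = transcribe(audio_in)  # 2. Speech-to-Text
    artifacts.transcript.append(f"user: {user_text}")
    artifacts.call_log.append("transcriber: ok")

    prompt = orchestrate(user_text)  # 3. Orchestration Layer
    artifacts.call_log.append("orchestration: ok")

    reply = generate(prompt)  # 4. Language Model
    artifacts.transcript.append(f"agent: {reply}")
    artifacts.call_log.append("llm: ok")

    audio_out = synthesize(reply)  # 5. Text-to-Speech
    artifacts.call_log.append("voice: ok")
    return audio_out  # 6. handed back to the transport layer


# Stand-in stages so the sketch runs without any real speech or LLM service.
artifacts = CallArtifacts()
audio_out = run_turn(
    b"hello",
    transcribe=lambda audio: audio.decode("utf-8"),
    orchestrate=lambda text: text.strip(),
    generate=lambda prompt: f"You said: {prompt}",
    synthesize=lambda text: text.encode("utf-8"),
    artifacts=artifacts,
)
```

The point of the sketch is the shape of the flow: each stage consumes the previous stage's output, while transcripts and call logs accumulate alongside the audio path rather than inside it.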

## Pipeline Components

### 1. Transport Layer

The transport layer handles real-time audio streaming between users and urvo.

| Transport Type | Description | Use Case |
|---|---|---|
| **SIP** | Session Initiation Protocol | Traditional phone systems, PBX integration, SIP trunking |
| **Twilio** | Twilio telephony integration | PSTN calls, phone numbers, outbound dialing |

### 2. Speech-to-Text (Transcriber)

Converts user audio into text in real time using streaming recognition. urvo uses its own speech-to-text infrastructure, and there is no bring-your-own-key option for transcription.

### 3. Orchestration Layer

urvo runs proprietary real-time models that make conversations feel natural. These models run exclusively on urvo's infrastructure and are not customizable.

| Model | Purpose |
|---|---|
| **Endpointing** | Detects when the user finishes speaking using audio-text fusion |
| **Interruption Detection** | Distinguishes barge-in from affirmations like "uh-huh" |
| **Background Noise Filtering** | Removes ambient sounds in real time |
| **Background Voice Filtering** | Isolates primary speaker from TVs, echoes, and other voices |
| **Backchanneling** | Adds natural affirmations ("uh-huh", "yeah", "got it") |
| **Emotion Detection** | Analyzes emotional tone and passes it to the LLM |
| **Filler Injection** | Adds natural speech patterns ("um", "like", "so") |

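**Note:** Orchestration models process data in real time but do **not persist** audio or intermediate results; all of their processing is **ephemeral**. What urvo stores are the call's final artifacts (recordings, transcripts, and call logs), subject to the settings described under Controlling Data Storage.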
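
To make the barge-in versus affirmation distinction from the table concrete, here is a deliberately simplified, text-only sketch. urvo's actual models are proprietary and their internals are not described here; the word list and the `classify_user_speech` function are assumptions made purely for illustration.

```python
# Toy illustration only: a fixed word list stands in for a real model.
BACKCHANNELS = {"uh-huh", "mm-hmm", "yeah", "right", "ok", "okay", "got it"}


def classify_user_speech(text: str, agent_is_speaking: bool) -> str:
    """Label user speech as a normal turn, a backchannel, or a barge-in."""
    normalized = text.lower().strip(" .,!?")
    if not agent_is_speaking:
        # Nothing to interrupt, so this is just the user's turn.
        return "turn"
    if normalized in BACKCHANNELS:
        # Short affirmation: the agent can keep talking.
        return "backchannel"
    # Anything else spoken over the agent is treated as an interruption.
    return "barge_in"


affirmation = classify_user_speech("Uh-huh.", agent_is_speaking=True)
interruption = classify_user_speech("Wait, I need to change that", agent_is_speaking=True)
```

A production system would rely on far richer signals than an exact-match word list; the sketch only shows why the two cases need different handling.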

### 4. Language Model (LLM)

Generates conversational responses based on transcribed user input. You can choose from urvo's available LLMs when configuring your agent.

**Note:** Bring-your-own-key (BYOK) is not supported for LLMs by default. If you need to use your own API key for a specific LLM, contact support@urvo.io.

### 5. Text-to-Speech (Voice)

Converts LLM responses into spoken audio. urvo uses its own text-to-speech infrastructure — there is no bring-your-own-key option for voice synthesis.

## Default Data Flow

In the default configuration, urvo handles all pipeline components and stores artifacts on urvo's infrastructure.

### What Is Stored by Default

- **Call recordings** — Audio recordings of the full conversation
- **Transcripts** — Full transcriptions with timestamps
- **Call logs** — Metadata and component-level details for each call
- **Product usage metrics** — Internal analytics (urvo only, not customer-accessible)
- **System logs** — Operational logs (urvo only, not customer-accessible)

## Controlling Data Storage

You can control whether call recordings and transcripts are stored by adjusting settings in the **Advanced** section of your agent's **Configure** page.

### Disabling Call Recordings

To disable call recordings:

1. Go to your agent's **Configure** page
2. Scroll down to the **Advanced** section
3. Turn off **"Enable Recordings"**

When this setting is off, urvo no longer stores audio recordings for that agent's calls.

### Disabling Transcriptions and Recordings

To disable both transcriptions and recordings:

1. Go to your agent's **Configure** page
2. Scroll down to the **Advanced** section
3. Set **"Conversations Retention Period"** to **0**

Setting the retention period to 0 prevents urvo from storing both transcripts and recordings for that agent's calls. As the Artifacts Storage Summary shows, it also disables call log storage.

### Custom Storage

If you need call data stored in your own cloud storage, contact support@urvo.io to discuss custom storage options.

## What Data Passes Through urvo

The following describes what data is processed and how it is retained:

| Data Type | Processing | Retention |
|---|---|---|
| Raw audio streams | Real-time routing to Transcriber / Voice | **Ephemeral** (not stored) |
| Transcribed text | Orchestration analysis, LLM routing | Call logs (unless disabled) |
| LLM responses | Filler injection, Voice routing | Call logs (unless disabled) |
| Emotion metadata | Passed to LLM context | **Ephemeral** |
| Call signaling | SIP / telephony management | Metadata only |

## Artifacts Storage Summary

| Artifact | Default Location | Can Be Disabled |
|---|---|---|
| **Call Recordings** | urvo | Yes — Turn off "Enable Recordings" in the Advanced section |
| **Transcripts** | urvo | Yes — Set "Conversations Retention Period" to 0 |
| **Call Logs** | urvo | Yes — Set "Conversations Retention Period" to 0 |
| **Product Usage Metrics** | urvo | No — Internal to urvo |
| **System Logs** | urvo | No — Internal to urvo |
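
The table can be read as a small decision rule. The sketch below encodes it in Python so you can check which artifacts a given combination of settings would leave on urvo's infrastructure. `AgentStorageSettings` and `stored_artifacts` are hypothetical names, not part of any urvo API; the real settings are changed on the **Configure** page. The unit of the retention period is not specified in this guide, so the sketch only distinguishes 0 from non-zero values.

```python
from dataclasses import dataclass
from typing import Set


@dataclass(frozen=True)
class AgentStorageSettings:
    """Hypothetical mirror of the two Advanced-section settings."""
    conversations_retention_period: int  # 0 disables conversation storage
    enable_recordings: bool = True


def stored_artifacts(settings: AgentStorageSettings) -> Set[str]:
    """Return the artifacts that would be stored under the given settings."""
    # Internal to urvo; these cannot be disabled by customers.
    stored = {"product_usage_metrics", "system_logs"}
    if settings.conversations_retention_period == 0:
        # Retention of 0 disables transcripts, call logs, and recordings.
        return stored
    stored.update({"transcripts", "call_logs"})
    if settings.enable_recordings:
        stored.add("call_recordings")
    return stored


# 7 is an arbitrary non-zero value chosen for the example.
no_recordings = stored_artifacts(
    AgentStorageSettings(conversations_retention_period=7, enable_recordings=False)
)
nothing_retained = stored_artifacts(
    AgentStorageSettings(conversations_retention_period=0)
)
```

With recordings turned off, transcripts and call logs are still stored; with a retention period of 0, only urvo's internal metrics and system logs remain.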

## Infrastructure Summary

The following summarizes what runs on urvo's infrastructure and what you can control:

| Component | Infrastructure | Customizable |
|---|---|---|
| **Transport** | SIP / Twilio | Choose SIP or Twilio |
| **Transcriber** | urvo | urvo only |
| **Orchestration** | urvo | urvo only |
| **LLM** | urvo (multiple providers available) | Choose from available LLMs; contact support for BYOK |
| **Voice** | urvo | urvo only |
| **Storage** | urvo | Contact support for custom storage |

**Note:** The **Orchestration Layer** (endpointing, interruption detection, emotion detection, backchanneling, filler injection) is urvo's core technology and runs exclusively on urvo infrastructure. Audio processed by these models is **ephemeral** and is not stored.

## Questions?

If you have questions about urvo's data flow or storage practices, or need custom storage arrangements, reach out to support@urvo.io.