# Data Flow

Understand how data flows through urvo during voice conversations.

When using urvo, data flows through multiple components during a voice conversation. Understanding this flow is essential for security-conscious organizations.

## Overview

This guide explains:

- The complete voice pipeline architecture
- What data passes through each component
- What data is stored on urvo's infrastructure
- How to control call recording and transcription storage

### Understanding Log Types

urvo generates two distinct types of logs during calls:

| Log Type | Description | Visibility |
|---|---|---|
| **System Logs** | Internal operational logs used by urvo for debugging, monitoring, and system health | urvo internal only; never shared with customers |
| **Call Logs** | Conversation data including transcripts, recordings, and call metadata | Available to customers via the dashboard |

**Note:** System logs are strictly internal to urvo and are never shared with customers or uploaded to external storage. They contain infrastructure-level data used for urvo's operational purposes only.

## Voice Pipeline Architecture

urvo orchestrates a voice pipeline with multiple modular components. Each component handles a specific part of the voice conversation flow.

### Complete Pipeline Flow

The following describes the end-to-end flow of a voice call through urvo:

1. **Transport Layer**: Audio enters via SIP or Twilio telephony
2. **Speech-to-Text (Transcriber)**: User audio is converted to text in real-time
3. **Orchestration Layer**: urvo's proprietary models handle endpointing, interruption detection, emotion detection, and backchanneling
4. **Language Model (LLM)**: Generates conversational responses based on transcribed user input
5. **Text-to-Speech (Voice)**: Converts LLM responses into spoken audio
6. **Transport Layer**: Synthesized audio is streamed back to the user

Throughout this pipeline, **artifacts** are generated: call recordings, transcripts, and call logs.
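
The six steps of the pipeline flow can be pictured as an ordered chain of stages, each handing its output to the next. The sketch below is only an illustration of that ordering: the `Turn` class, the stage functions, and `run_turn` are hypothetical names, and each stage body is a trivial stand-in rather than anything urvo actually runs.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Turn:
    """State for one conversational turn (hypothetical model, not urvo's schema)."""
    audio_in: bytes
    transcript: str = ""
    reply_text: str = ""
    audio_out: bytes = b""
    trace: List[str] = field(default_factory=list)


def transport_in(turn: Turn) -> Turn:
    # Step 1: audio arrives over SIP or Twilio; nothing to transform in this sketch.
    return turn


def transcriber(turn: Turn) -> Turn:
    # Step 2: stand-in for streaming speech-to-text (decodes the bytes as UTF-8).
    turn.transcript = turn.audio_in.decode("utf-8")
    return turn


def orchestration(turn: Turn) -> Turn:
    # Step 3: stand-in for endpointing, interruption detection, and similar models.
    turn.transcript = turn.transcript.strip()
    return turn


def llm(turn: Turn) -> Turn:
    # Step 4: stand-in for response generation from the transcribed input.
    turn.reply_text = f"You said: {turn.transcript}"
    return turn


def voice(turn: Turn) -> Turn:
    # Step 5: stand-in for text-to-speech (encodes the reply text as bytes).
    turn.audio_out = turn.reply_text.encode("utf-8")
    return turn


def transport_out(turn: Turn) -> Turn:
    # Step 6: synthesized audio is streamed back to the caller.
    return turn


# Stage order mirrors the numbered list in "Complete Pipeline Flow".
PIPELINE: List[Callable[[Turn], Turn]] = [
    transport_in, transcriber, orchestration, llm, voice, transport_out,
]


def run_turn(audio_in: bytes) -> Turn:
    """Pass one turn through every stage in order and record the path taken."""
    turn = Turn(audio_in=audio_in)
    for stage in PIPELINE:
        turn = stage(turn)
        turn.trace.append(stage.__name__)
    return turn
```

Calling `run_turn(b" hello ")` and inspecting `.trace` shows the order in which a turn visits each component, which is a convenient way to reason about where a given piece of data exists at any moment.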

## Pipeline Components

### 1. Transport Layer

The transport layer handles real-time audio streaming between users and urvo.

| Transport Type | Description | Use Case |
|---|---|---|
| **SIP** | Session Initiation Protocol | Traditional phone systems, PBX integration, SIP trunking |
| **Twilio** | Twilio telephony integration | PSTN calls, phone numbers, outbound dialing |

### 2. Speech-to-Text (Transcriber)

Converts user audio into text in real-time using streaming recognition.

urvo uses its own speech-to-text infrastructure; there is no bring-your-own-key option for transcription.

### 3. Orchestration Layer

urvo runs proprietary real-time models that make conversations feel natural. These models run exclusively on urvo's infrastructure and are not customizable.

| Model | Purpose |
|---|---|
| **Endpointing** | Detects when the user finishes speaking using audio-text fusion |
| **Interruption Detection** | Distinguishes barge-in from affirmations like "uh-huh" |
| **Background Noise Filtering** | Removes ambient sounds in real-time |
| **Background Voice Filtering** | Isolates primary speaker from TVs, echoes, and other voices |
| **Backchanneling** | Adds natural affirmations ("uh-huh", "yeah", "got it") |
| **Emotion Detection** | Analyzes emotional tone and passes it to the LLM |
| **Filler Injection** | Adds natural speech patterns ("um", "like", "so") |

**Note:** Orchestration models process data in real-time but do **not persist** the audio or intermediate results. All processing is **ephemeral**. Only final transcripts and call logs are stored.

### 4. Language Model (LLM)

Generates conversational responses based on transcribed user input. You can choose from urvo's available LLMs when configuring your agent.

**Note:** Bring-your-own-key is not supported for LLMs by default. If you need to use your own API key for a specific LLM, contact support@urvo.io.

### 5. Text-to-Speech (Voice)

Converts LLM responses into spoken audio.
urvo uses its own text-to-speech infrastructure; there is no bring-your-own-key option for voice synthesis.

## Default Data Flow

In the default configuration, urvo handles all pipeline components and stores artifacts on urvo's infrastructure.

### What Is Stored by Default

- **Call recordings**: Audio recordings of the full conversation
- **Transcripts**: Full transcriptions with timestamps
- **Call logs**: Metadata and component-level details for each call
- **Product usage metrics**: Internal analytics (urvo only, not customer-accessible)
- **System logs**: Operational logs (urvo only, not customer-accessible)

## Controlling Data Storage

You can control whether call recordings and transcriptions are stored by adjusting settings on your agent's **Configure** page.

### Disabling Call Recordings

To disable call recordings:

1. Go to your agent's **Configure** page
2. Scroll down to the **Advanced** section
3. Turn off **"Enable Recordings"**

When disabled, urvo will no longer store audio recordings for that agent's calls.

### Disabling Transcriptions and Recordings

To disable both transcriptions and recordings:

1. Go to your agent's **Configure** page
2. Scroll down to the **Advanced** section
3. Set **"Conversations Retention Period"** to **0**

Setting the retention period to 0 prevents urvo from storing both transcriptions and recordings for that agent's calls.

### Custom Storage

If you need call data stored in your own cloud storage, contact support@urvo.io to discuss custom storage options.
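
The interaction between the two settings can be easier to audit when written out as a small decision function. The following is a minimal sketch of the rules as this guide describes them, not urvo's implementation: `AgentStorageSettings` and `stored_artifacts` are hypothetical names, the fields simply mirror the "Enable Recordings" and "Conversations Retention Period" controls, and the unit of the retention period is not specified here. Treating call logs as disabled by a retention period of 0 follows this guide's artifacts summary.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class AgentStorageSettings:
    """Hypothetical mirror of the two controls in the Advanced section."""
    enable_recordings: bool
    # Unit is whatever the Configure page uses; only the value 0 matters here.
    conversations_retention_period: int


def stored_artifacts(settings: AgentStorageSettings) -> Dict[str, bool]:
    """Return which artifacts would be stored under the rules in this guide."""
    retention_off = settings.conversations_retention_period == 0
    return {
        # Recordings need the toggle on AND a non-zero retention period.
        "call_recordings": settings.enable_recordings and not retention_off,
        # Transcripts and call logs depend only on the retention period.
        "transcripts": not retention_off,
        "call_logs": not retention_off,
        # Internal to urvo; this guide lists no customer control for these.
        "product_usage_metrics": True,
        "system_logs": True,
    }
```

For example, `stored_artifacts(AgentStorageSettings(enable_recordings=False, conversations_retention_period=30))` reports that transcripts and call logs are still stored while recordings are not, which matches the "Disabling Call Recordings" steps above.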

## What Data Passes Through urvo

The following describes what data is processed and how it is retained:

| Data Type | Processing | Retention |
|---|---|---|
| Raw audio streams | Real-time routing to Transcriber / Voice | **Ephemeral** (not stored) |
| Transcribed text | Orchestration analysis, LLM routing | Call logs (unless disabled) |
| LLM responses | Filler injection, Voice routing | Call logs (unless disabled) |
| Emotion metadata | Passed to LLM context | **Ephemeral** |
| Call signaling | SIP / telephony management | Metadata only |

## Artifacts Storage Summary

| Artifact | Default Location | Can Be Disabled |
|---|---|---|
| **Call Recordings** | urvo | Yes: turn off "Enable Recordings" in the Advanced section |
| **Transcripts** | urvo | Yes: set "Conversations Retention Period" to 0 |
| **Call Logs** | urvo | Yes: set "Conversations Retention Period" to 0 |
| **Product Usage Metrics** | urvo | No: internal to urvo |
| **System Logs** | urvo | No: internal to urvo |

## Infrastructure Summary

The following summarizes what runs on urvo's infrastructure and what you can control:

| Component | Infrastructure | Customizable |
|---|---|---|
| **Transport** | SIP / Twilio | Choose SIP or Twilio |
| **Transcriber** | urvo | urvo only |
| **Orchestration** | urvo | urvo only |
| **LLM** | urvo (multiple providers available) | Choose from available LLMs; contact support for BYOK |
| **Voice** | urvo | urvo only |
| **Storage** | urvo | Contact support for custom storage |

**Note:** The **Orchestration Layer** (endpointing, interruption detection, emotion detection, backchanneling, filler injection) is urvo's core technology and runs exclusively on urvo infrastructure. Audio processed by these models is **ephemeral** and is not stored.

## Questions?

If you have questions about urvo's data flow, storage practices, or need custom storage arrangements, reach out to support@urvo.io.