tts-livekit-plugin

@Okeysir198/tts-livekit-plugin

3 forks

Updated 4/12/2026

Build and deploy self-hosted Text-to-Speech API using MeloTTS from Hugging Face and create a LiveKit plugin for voice agents. Use this skill when building TTS systems, LiveKit voice agents, or self-hosted speech synthesis solutions.

Installation

$ npx agent-skills-cli install @Okeysir198/tts-livekit-plugin

Details

Repository: Okeysir198/P20251122-claude-skills

Path: tts-livekit-plugin/SKILL.md

Branch: main

Scoped Name: @Okeysir198/tts-livekit-plugin

Usage

After installing, this skill will be available to your AI coding assistant.

Verify installation:

npx agent-skills-cli list

Skill Instructions

name: tts-livekit-plugin
description: Build and deploy self-hosted Text-to-Speech API using MeloTTS from Hugging Face and create a LiveKit plugin for voice agents. Use this skill when building TTS systems, LiveKit voice agents, or self-hosted speech synthesis solutions.

TTS LiveKit Plugin Skill

This skill provides a complete solution for building self-hosted Text-to-Speech (TTS) systems integrated with LiveKit voice agents.

What This Skill Does

1. Creates a Self-Hosted TTS API Server
   - FastAPI-based REST API
   - Uses MeloTTS model from Hugging Face
   - Supports streaming audio responses
   - Multi-language and multi-voice support
   - Production-ready with proper error handling
2. Builds a LiveKit TTS Plugin
   - Fully compatible with LiveKit agents framework
   - Implements standard TTS interface
   - Streaming audio support
   - Proper error handling and retries
   - Drop-in replacement for commercial TTS providers
3. Provides Complete Testing
   - Comprehensive test suite for API
   - Integration tests for plugin
   - No mocked functions - all real implementations
   - Performance and concurrency tests
4. Includes Full Documentation
   - API documentation with examples
   - Plugin usage guide
   - Deployment guide for production
   - Multiple usage examples

Components

API Server (`api/`)

server.py: FastAPI server with MeloTTS integration
requirements.txt: Python dependencies
Endpoints:
- GET /: Health check
- GET /voices: List available voices
- POST /synthesize: Full audio synthesis
- POST /synthesize/stream: Streaming synthesis
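
As a rough illustration of calling the synthesis endpoint listed above, the sketch below builds a POST request with the standard library only. The JSON field names (`text`, `language`, `speaker`, `speed`) are an assumption borrowed from the plugin constructor arguments shown in Quick Start; the actual request schema is defined in api/server.py and docs/API.md. The helper names are hypothetical.

```python
import json
import urllib.request


def build_synthesize_request(base_url, text, language="EN", speaker="EN-US", speed=1.0):
    """Build (but do not send) a POST request for the /synthesize endpoint.

    The payload field names are assumed, not taken from the server code.
    """
    payload = {"text": text, "language": language, "speaker": speaker, "speed": speed}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/synthesize",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize_to_file(base_url, text, out_path, timeout=30.0):
    """Send the request and write the returned audio bytes to out_path."""
    request = build_synthesize_request(base_url, text)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        audio = response.read()
    with open(out_path, "wb") as out_file:
        out_file.write(audio)
    return len(audio)
```

With the server from Quick Start running, `synthesize_to_file("http://localhost:8000", "Hello", "hello.wav")` would be the typical call, provided the assumed payload matches the server.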

LiveKit Plugin (`plugin/`)

melotts_plugin.py: LiveKit TTS plugin implementation
Extends livekit.agents.tts.TTS base class
Implements ChunkedStream for audio streaming
Uses aiohttp for HTTP requests
Proper exception handling (APIConnectionError, APITimeoutError, APIStatusError)
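
One way the three exception types named above could be selected is sketched here. The classes are local stand-ins so the example runs without livekit-agents installed, and the retry policy (retry on HTTP 429 and 5xx only) is an assumption rather than the plugin's documented behaviour. `to_plugin_error` is a hypothetical helper name.

```python
import asyncio
from typing import Optional


# Local stand-ins for illustration only. In the plugin these names
# would come from the LiveKit agents package.
class APIConnectionError(Exception):
    pass


class APITimeoutError(Exception):
    pass


class APIStatusError(Exception):
    def __init__(self, message: str, status_code: int, retryable: bool) -> None:
        super().__init__(message)
        self.status_code = status_code
        self.retryable = retryable


def to_plugin_error(
    exc: Optional[BaseException] = None, status: Optional[int] = None
) -> Optional[Exception]:
    """Translate a transport failure or HTTP status into a plugin-level error."""
    # Check timeouts first: the builtin TimeoutError is also an OSError.
    if isinstance(exc, (asyncio.TimeoutError, TimeoutError)):
        return APITimeoutError("TTS API request timed out")
    if isinstance(exc, OSError):
        return APIConnectionError(f"could not reach TTS API: {exc}")
    if status is not None and status >= 400:
        # Assumed policy: only rate limiting and server errors are retried.
        retryable = status == 429 or status >= 500
        return APIStatusError(f"TTS API returned HTTP {status}", status, retryable)
    return None
```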

Tests (`tests/`)

test_api.py: API server tests
- Health checks
- Voice listing
- Synthesis (streaming and non-streaming)
- Multiple languages
- Error handling
- Concurrency
test_plugin.py: Plugin integration tests
- Plugin initialization
- Synthesis with real API
- Multiple languages
- Error handling
- Concurrency
- Timeout handling

Examples (`examples/`)

test_api_client.py: Standalone API testing script
simple_agent.py: Basic LiveKit agent example
voice_assistant.py: Complete voice assistant implementation

Documentation (`docs/`)

API.md: Complete API reference
PLUGIN.md: Plugin usage guide
DEPLOYMENT.md: Production deployment guide

Quick Start

1. Start the TTS API Server

cd api
pip install -r requirements.txt
python -m unidic download
python server.py

Server runs on http://localhost:8000

2. Test the API

cd examples
python test_api_client.py

3. Use in LiveKit Agent

from melotts_plugin import TTS

tts = TTS(
    api_base_url="http://localhost:8000",
    language="EN",
    speaker="EN-US",
    speed=1.0
)

stream = tts.synthesize("Hello from LiveKit!")

Features

✅ Self-hosted (no external API dependencies)
✅ High-quality natural speech (MeloTTS)
✅ 6 languages: English, Spanish, French, Chinese, Japanese, Korean
✅ Multiple voices per language
✅ Streaming audio for low latency
✅ CPU-friendly (optimized for real-time inference)
✅ GPU support (automatic if available)
✅ LiveKit agents framework compatible
✅ Production-ready error handling
✅ Comprehensive test coverage
✅ Full documentation

Architecture

┌─────────────────┐      HTTP POST       ┌──────────────────┐
│  LiveKit Agent  │ ──────────────────►  │   TTS API        │
│                 │                       │   Server         │
│  ┌───────────┐  │                       │                  │
│  │ MeloTTS   │  │   Audio Stream       │  ┌────────────┐  │
│  │ Plugin    │  │ ◄──────────────────  │  │  MeloTTS   │  │
│  └───────────┘  │    (WAV chunks)      │  │  Model     │  │
└─────────────────┘                       │  └────────────┘  │
                                          └──────────────────┘
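
The diagram shows audio returning as WAV chunks. How the server actually frames those chunks is not stated here, so the following is only a sketch of the general idea: decode a WAV payload with the standard `wave` module and cut the PCM data into fixed-duration frames. The 16 kHz rate, the 20 ms frame size, and both function names are arbitrary choices for the example.

```python
import io
import wave
from typing import Iterator


def make_silent_wav(sample_rate: int = 16000, num_samples: int = 1600) -> bytes:
    """Create a mono 16-bit WAV of silence, standing in for server output."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as writer:
        writer.setnchannels(1)
        writer.setsampwidth(2)  # 16-bit samples
        writer.setframerate(sample_rate)
        writer.writeframes(b"\x00\x00" * num_samples)
    return buffer.getvalue()


def iter_pcm_frames(wav_bytes: bytes, frame_ms: int = 20) -> Iterator[bytes]:
    """Yield raw PCM frames of frame_ms milliseconds from a WAV payload."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as reader:
        samples_per_frame = reader.getframerate() * frame_ms // 1000
        while True:
            pcm = reader.readframes(samples_per_frame)
            if not pcm:
                break
            yield pcm
```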

Why MeloTTS?

High Quality: Natural-sounding speech
Fast: Optimized for real-time inference
CPU-Friendly: Works well even without GPU
Multi-lingual: 6 languages supported
Low Latency: ~150-200ms TTFB
Open Source: Free to use and modify

Performance

Latency: 150-200ms time-to-first-byte
CPU Usage: Optimized for real-time on CPUs
GPU Support: Automatic acceleration if available
Streaming: Chunked delivery for low latency
Concurrent Requests: Handles multiple simultaneous requests
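
To check the time-to-first-byte figure on your own hardware, a small helper like the one below can wrap any iterable of audio chunks. It is a measurement sketch, not part of the skill; `measure_ttfb` is a hypothetical name and the chunk source is left to the caller.

```python
import time
from typing import Iterable, Tuple


def measure_ttfb(chunks: Iterable[bytes]) -> Tuple[float, float, int]:
    """Consume an iterable of audio chunks.

    Returns (seconds until the first non-empty chunk,
             total seconds, total bytes received).
    """
    start = time.perf_counter()
    first_byte_at = None
    total_bytes = 0
    for chunk in chunks:
        if first_byte_at is None and chunk:
            first_byte_at = time.perf_counter() - start
        total_bytes += len(chunk)
    elapsed = time.perf_counter() - start
    if first_byte_at is None:
        # No data arrived at all, so report the full wait.
        first_byte_at = elapsed
    return first_byte_at, elapsed, total_bytes
```

Because the timer starts when `measure_ttfb` is called, pass it a lazy iterator that issues the request on first iteration so the request latency is included.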

Supported Languages

Language	Code	Speakers
English	EN	EN-US, EN-BR, EN-AU, EN-IN
Spanish	ES	ES
French	FR	FR
Chinese	ZH	ZH
Japanese	JP	JP
Korean	KR	KR
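
The table above can be mirrored in client code to reject unsupported combinations before a request is sent. This is a sketch: the mapping is copied from the table, while `resolve_voice` and the choice of the first listed speaker as the default are assumptions.

```python
from typing import Optional, Tuple

# Language code -> speaker IDs, copied from the table above.
SPEAKERS = {
    "EN": ("EN-US", "EN-BR", "EN-AU", "EN-IN"),
    "ES": ("ES",),
    "FR": ("FR",),
    "ZH": ("ZH",),
    "JP": ("JP",),
    "KR": ("KR",),
}


def resolve_voice(language: str, speaker: Optional[str] = None) -> Tuple[str, str]:
    """Validate a language/speaker pair, defaulting to the first listed speaker."""
    code = language.upper()
    if code not in SPEAKERS:
        raise ValueError(f"unsupported language {language!r}; expected one of {sorted(SPEAKERS)}")
    available = SPEAKERS[code]
    if speaker is None:
        return code, available[0]
    if speaker not in available:
        raise ValueError(f"speaker {speaker!r} is not available for {code}; expected one of {available}")
    return code, speaker
```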

Testing

All tests use real implementations - no mocks:

# Start API server (leave it running in a separate terminal)
cd api && python server.py

# Run API tests (from the skill root)
cd tests && pytest test_api.py -v

# Run plugin tests (from the skill root)
cd tests && pytest test_plugin.py -v

Deployment

Multiple deployment options:

Standalone: Run directly with Python/Uvicorn
Docker: Containerized deployment
Kubernetes: Scalable cloud deployment
Cloud: AWS, GCP, Azure support

See docs/DEPLOYMENT.md for detailed guides.

Integration with LiveKit

The plugin is a drop-in replacement for other TTS providers:

# Instead of:
# from livekit.plugins import openai
# tts = openai.TTS()

# Use:
from melotts_plugin import TTS
tts = TTS(api_base_url="http://localhost:8000")

# Same interface, self-hosted!

Use Cases

Voice assistants
Interactive voice response (IVR) systems
Accessibility tools
Educational applications
Multilingual customer service bots
Real-time voice agents
Live streaming with voice synthesis

Requirements

API Server:

Python 3.9+
2GB+ RAM
FastAPI, MeloTTS, Uvicorn
Optional: GPU for faster inference

LiveKit Plugin:

Python 3.9+
livekit-agents >= 0.8.0
aiohttp >= 3.9.0

Security

For production:

Add API authentication
Enable HTTPS/TLS
Implement rate limiting
Configure CORS
Set up monitoring

See docs/DEPLOYMENT.md#security for details.
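
As one possible shape for the "Add API authentication" item, the sketch below checks a shared key sent in a request header, using a constant-time comparison. The header name `X-API-Key` and the function name are hypothetical; docs/DEPLOYMENT.md may recommend a different scheme.

```python
import hmac
from typing import Mapping


def is_authorized(headers: Mapping[str, str], expected_key: str) -> bool:
    """Return True when the request carries the expected API key.

    Header names are matched case-insensitively. An empty expected key
    rejects every request so a missing configuration fails closed.
    """
    if not expected_key:
        return False
    normalized = {name.lower(): value for name, value in headers.items()}
    supplied = normalized.get("x-api-key", "")
    # compare_digest avoids leaking key contents through timing differences.
    return hmac.compare_digest(supplied.encode("utf-8"), expected_key.encode("utf-8"))
```

In a FastAPI server a check like this would typically run in a dependency or middleware before the synthesis handlers.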

When to Use This Skill

Use this skill when you need to:

Build a self-hosted TTS solution
Create LiveKit voice agents with custom TTS
Avoid commercial TTS API costs
Have full control over voice synthesis
Support multiple languages
Deploy TTS in private/air-gapped environments
Build voice assistants
Integrate TTS into existing applications

Troubleshooting

Server won't start:

Run python -m unidic download
Check port 8000 is available
Verify dependencies installed

Plugin connection errors:

Ensure API server is running
Check api_base_url configuration
Verify network connectivity

Audio quality issues:

Try different voices/speakers
Adjust speed parameter
Check sample rate configuration

See documentation for more troubleshooting tips.
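
For the "Check port 8000 is available" step, a quick probe with the standard `socket` module can tell whether something is already listening. This is a diagnostic sketch with a hypothetical function name; it only tests TCP connectivity on the given host.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if a TCP listener accepts connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception.
        return probe.connect_ex((host, port)) == 0
```

If `port_in_use(8000)` is true before the server is started, another process holds the port. If it is false after starting, the server did not come up.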

License

Apache 2.0 License

Support

Check the documentation in docs/
Review examples in examples/
Run the test suite to verify setup
Check logs for error messages

This skill provides everything needed for production-ready, self-hosted TTS with LiveKit integration. All code is fully functional with no mocks or placeholders.
