VoiceLab

A local-first web app for experimenting with ElevenLabs text-to-speech and sound effects generation, with persistent history and audio playback.

React · TypeScript · Vite · FastAPI · Python · ElevenLabs API · Web Audio API

GitHub

February 2026

Overview

VoiceLab is a developer-facing tool for rapidly experimenting with ElevenLabs' audio generation capabilities. It pairs a FastAPI backend with a React + TypeScript frontend so you can generate speech and sound effects from text, audition different voices, and keep a persistent history of every experiment, all running locally. Generated audio can also be downloaded as MP3.

Features

Text-to-Speech generation with access to several ElevenLabs voices
Sound effects generation from freeform text descriptions with optional duration control
Voice browser to preview and select from the full ElevenLabs voice catalog
Persistent experiment history stored in localStorage with rename, delete, and batch-download
Web Audio API playback for reliable, low-latency audio preview
MP3 download for individual clips or the entire history
"Try Another Voice" — instantly re-run a TTS prompt with a different voice
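
The history actions and "Try Another Voice" can be pictured as small pure functions over an array of clips. This is only a sketch: the Clip shape, the field names, and the function names are assumptions made for illustration, not VoiceLab's actual code.

```typescript
// Hypothetical shape of one history item; all field names are assumptions.
interface Clip {
  id: string;
  name: string;
  prompt: string;
  voiceId: string;
}

// Hypothetical request payload for a TTS call.
interface TtsRequest {
  text: string;
  voiceId: string;
}

// Return a new history with one clip renamed; the input array is not mutated.
function renameClip(history: Clip[], id: string, name: string): Clip[] {
  return history.map((clip) => (clip.id === id ? { ...clip, name } : clip));
}

// Return a new history without the clip that has the given id.
function deleteClip(history: Clip[], id: string): Clip[] {
  return history.filter((clip) => clip.id !== id);
}

// "Try Another Voice": reuse the stored prompt, swap only the voice.
function tryAnotherVoice(clip: Clip, voiceId: string): TtsRequest {
  return { text: clip.prompt, voiceId };
}
```

Keeping these functions free of side effects means the same array can be handed to React state and to the persistence layer without either one surprising the other.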

Architecture

The app follows a clean client/server split:

Backend (FastAPI + Uvicorn): Exposes four endpoints — health check, voice listing, TTS generation, and SFX generation. Pydantic models enforce request validation. Audio is returned as MP3.
Frontend (React 19 + Vite): A tabbed UI with dedicated panels for TTS and SFX.
Storage: Generated audio is immediately converted from blobs to base64 data URLs so it can be serialized into localStorage alongside metadata (voice, prompt text, timestamps).
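
As a rough illustration of the storage step, the sketch below encodes raw audio bytes as a base64 data URL and serializes them with metadata into any store that has localStorage's getItem/setItem shape. The HistoryEntry fields, the HISTORY_KEY value, and the helper names are assumptions made for this example, not VoiceLab's actual code.

```typescript
// Hypothetical shape of one stored experiment; field names are assumptions.
interface HistoryEntry {
  id: string;
  kind: "tts" | "sfx";
  prompt: string;
  voice?: string; // only meaningful for TTS entries
  createdAt: string; // ISO 8601 timestamp
  audioDataUrl: string;
}

// Minimal subset of the Storage interface, so window.localStorage fits.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

const HISTORY_KEY = "voicelab.history"; // assumed key name

// Encode raw bytes as a data URL. The binary string is built one byte at a
// time to avoid the argument-count limit of String.fromCharCode(...bytes).
function bytesToDataUrl(bytes: Uint8Array, mimeType: string = "audio/mpeg"): string {
  let binary = "";
  bytes.forEach((b) => {
    binary += String.fromCharCode(b);
  });
  return `data:${mimeType};base64,${btoa(binary)}`;
}

// Read the whole history; an empty store yields an empty list.
function loadHistory(store: KeyValueStore): HistoryEntry[] {
  const raw = store.getItem(HISTORY_KEY);
  return raw === null ? [] : (JSON.parse(raw) as HistoryEntry[]);
}

// Append one entry and write the full list back as a single JSON string.
function appendEntry(store: KeyValueStore, entry: HistoryEntry): HistoryEntry[] {
  const next = [...loadHistory(store), entry];
  store.setItem(HISTORY_KEY, JSON.stringify(next));
  return next;
}
```

In a browser, the bytes would typically come from `new Uint8Array(await blob.arrayBuffer())`, and `localStorage` itself can be passed as the store. Because the data URL is a plain string, it can also be used directly as the `src` of an audio element or as a download link.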

Learnings

Understanding the ElevenLabs API: Working with voice selection, model variants, and audio output formats meant learning how to integrate the TTS and sound effects endpoints, handle streaming audio responses, and manage API usage effectively.
TTS is an API call, but the UX around it isn't: The actual speech generation is straightforward — send text, get audio back. The real work is in building a smooth experience around it: letting users quickly switch voices, replay clips, and keep a history of experiments without friction.
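Client-side persistence trade-offs: Storing generated audio in localStorage is a simple way to keep data between page refreshes without a database, but it fills up fast: base64 encoding inflates each clip by about a third, and browsers typically cap localStorage at around 5 MB per origin. Fine for a local tool, not for production.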
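
To put rough numbers on "fills up fast", here is a back-of-the-envelope sketch. It assumes constant-bitrate MP3 at 128 kbps and a quota of about 5 million characters; real quotas vary by browser, and the function names are made up for this example. It also shows one way to guard a write, since browsers throw an exception when the quota is exceeded.

```typescript
// Minimal write-only view of a store; window.localStorage fits this shape.
interface WritableStore {
  setItem(key: string, value: string): void;
}

// Length in characters of the padded base64 encoding of byteCount bytes.
function base64Length(byteCount: number): number {
  return 4 * Math.ceil(byteCount / 3);
}

// Approximate size in bytes of a constant-bitrate MP3 clip.
function mp3Bytes(seconds: number, kbps: number): number {
  return Math.round((seconds * kbps * 1000) / 8);
}

// How many equally sized clips fit in the quota. Metadata and the
// "data:...;base64," prefix are ignored, so this is an upper bound.
function clipsThatFit(quotaChars: number, clipBytes: number): number {
  return Math.floor(quotaChars / base64Length(clipBytes));
}

// Attempt a write and report failure instead of throwing, so the UI can
// ask the user to delete old clips when the store is full.
function trySave(store: WritableStore, key: string, value: string): boolean {
  try {
    store.setItem(key, value);
    return true;
  } catch {
    return false;
  }
}
```

Under these assumptions a 30-second clip is about 480,000 bytes, which becomes 640,000 base64 characters, so only about seven such clips fit before writes start failing.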
