Local AI TTS Guide

Read your live stream chat aloud using free, private, self-hosted AI voices — no API keys required

Overview

Social Stream Ninja supports fully local AI text-to-speech — your chat messages are converted to voice entirely on your own computer, with no data sent to external servers and no API key required.

There are two approaches:

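Path 1 — Built-in TTS (Zero Setup)

Use one of the engines bundled with Social Stream Ninja. They run in the browser and need no installation.

  • Kokoro TTS
  • Piper TTS
  • Kitten TTS
  • eSpeak-NG

Path 2 — Self-Hosted Server (Docker Required)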

Run a local TTS server on your machine and point Social Stream Ninja at it. Gives you more voice options, voice cloning, and server-side control.

  • Kokoro-FastAPI
  • openedai-speech (Piper)
  • kokoro-web

Uses Social Stream's built-in OpenAI-compatible endpoint support.

Start with Path 1. The built-in Kokoro TTS rivals cloud services in quality, runs entirely in your browser, and works with OBS out of the box. Only move to Path 2 if you need more control or different models.

Path 1 — Built-in TTS (Zero Setup)

These engines are bundled inside Social Stream Ninja and require no installation. They run in the browser using WebAssembly (WASM) or ONNX Runtime.

Provider | Quality | CPU Use | GPU/WebGPU | URL Parameter
Kokoro TTS | ⭐⭐⭐⭐⭐ Excellent | Medium | Faster with GPU | ?ttsprovider=kokoro
Piper TTS | ⭐⭐⭐⭐ Very Good | Low | CPU only | ?ttsprovider=piper
Kitten TTS | ⭐⭐⭐ Good | Very Low | CPU only | ?ttsprovider=kitten
eSpeak-NG | ⭐⭐ Robotic | Minimal | CPU only | ?ttsprovider=espeak

How to Enable

Add &ttsprovider= and &speech= to your Social Stream dock.html URL:

dock.html?session=YOUR_SESSION&speech=en-US&ttsprovider=kokoro
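If you switch providers often, it can help to generate the URL rather than edit it by hand. The following is a minimal sketch using only Python's standard library; the function name build_dock_url is hypothetical and not part of Social Stream Ninja.

```python
from urllib.parse import urlencode


def build_dock_url(session, provider, speech="en-US", **extra):
    """Return a dock.html URL with TTS enabled for the given provider."""
    params = {"session": session, "speech": speech, "ttsprovider": provider}
    params.update(extra)  # provider-specific options, e.g. voicekokoro="af_bella"
    # safe=":/" keeps ':' and '/' unescaped so any URL values stay readable.
    return "dock.html?" + urlencode(params, safe=":/")


print(build_dock_url("YOUR_SESSION", "kokoro"))
print(build_dock_url("YOUR_SESSION", "piper", pipervoice="en_US-hfc_female-medium"))
```

The first call prints the same URL as the example above; extra keyword arguments are appended in the order given.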

Kokoro TTS Options

Kokoro has 26 built-in voices. Specify one with &voicekokoro=:

  • English female: af_bella, af_sarah, af_nicole, af_sky
  • English male: am_adam, am_michael
  • British female: bf_emma, bf_isabella
  • British male: bm_george, bm_lewis

Example with a voice and a speaking speed (&kokorospeed=):

dock.html?session=YOUR_SESSION&speech=en-US&ttsprovider=kokoro&voicekokoro=af_bella&kokorospeed=1.1

Piper TTS Options

Specify a voice model with &pipervoice=:

dock.html?session=YOUR_SESSION&speech=en-US&ttsprovider=piper&pipervoice=en_US-hfc_female-medium

Kitten TTS Options

dock.html?session=YOUR_SESSION&speech=en-US&ttsprovider=kitten&kittenvoice=expr-voice-4-f

eSpeak-NG Options

dock.html?session=YOUR_SESSION&speech=en-US&ttsprovider=espeak&espeakvoice=en&espeakspeed=175

First-time load: Kokoro and Piper download their model files on first use (~50–200 MB). This happens automatically in the background, and later loads come from the browser cache.

OBS capture: All built-in TTS providers play audio directly through the browser. In OBS, add your dock.html URL as a Browser Source and enable "Control audio via OBS". No virtual audio cable is needed. See "Getting Audio into OBS" below.

Path 2 — Self-Hosted TTS Server

If you want more voice options, voice cloning, or a dedicated server you can reuse across tools, you can run a local TTS server. Social Stream Ninja connects to it using its built-in OpenAI-compatible TTS endpoint support — no API key needed for local servers.

Requirements: Docker Desktop must be installed and running. Docker is free for personal use.

Three recommended options:

Server | Model | GPU | Disk | Default Port
Kokoro-FastAPI (recommended) | Kokoro 82M | Optional | ~2 GB | 8880
openedai-speech (Piper, lightweight) | Piper TTS | CPU only | <1 GB | 8000
kokoro-web | Kokoro 82M | Optional | ~2 GB | 3000

Kokoro-FastAPI Setup

Kokoro-FastAPI runs the Kokoro 82M model as a local server with an OpenAI-compatible API. It works on CPU (no GPU required) and has excellent voice quality.

Install with Docker

Open a terminal (Command Prompt, PowerShell, or Terminal) and run one of the following:

CPU (works on any computer):

docker run -p 8880:8880 ghcr.io/remsky/kokoro-fastapi-cpu:v0.2.2

GPU (NVIDIA only — faster synthesis):

docker run --gpus all -p 8880:8880 ghcr.io/remsky/kokoro-fastapi-gpu:v0.2.0post4

First run: Docker downloads the image (~1.5–2 GB). This happens only once; after that, the server starts in a few seconds.

Verify It's Running

Open your browser and go to http://localhost:8880/web/ — you should see a web UI where you can test voices.
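Besides the web UI, you can check the speech endpoint itself from a script. The sketch below builds a request in the OpenAI /v1/audio/speech shape (model, input, voice, response_format) and saves the returned audio. It assumes the server accepts "kokoro" as the model name; other servers may expect a different value. The function names are hypothetical.

```python
import json
import urllib.request


def build_speech_request(endpoint, text, voice, model="kokoro", fmt="mp3"):
    """Build a POST request in the OpenAI /v1/audio/speech shape."""
    # model="kokoro" is an assumption; adjust it for your server.
    body = {"model": model, "input": text, "voice": voice, "response_format": fmt}
    return urllib.request.Request(
        endpoint,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # no Authorization header
        method="POST",
    )


def synthesize_to_file(endpoint, text, voice, path="test.mp3", **kwargs):
    """Send the request and write the returned audio bytes to `path`."""
    req = build_speech_request(endpoint, text, voice, **kwargs)
    with urllib.request.urlopen(req, timeout=60) as resp:
        audio = resp.read()
    with open(path, "wb") as f:
        f.write(audio)
    return len(audio)  # number of bytes written


# Example (requires the server to be running):
# synthesize_to_file("http://localhost:8880/v1/audio/speech", "Hello chat!", "af_bella")
```

If the call returns a non-zero byte count and test.mp3 plays, the endpoint is ready for Social Stream Ninja to use.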

Available Voices

67+ voices available. A few highlights:

  • American female: af_bella, af_sarah, af_nicole, af_sky, af_heart
  • American male: am_adam, am_michael
  • British female: bf_emma, bf_isabella
  • British male: bm_george, bm_lewis

Browse and test all voices at http://localhost:8880/web/ once the server is running.

SSN URL

dock.html?session=YOUR_SESSION&speech=en-US&ttsprovider=openai&openaiendpoint=http://localhost:8880/v1/audio/speech&voiceopenai=af_bella

Keep the Server Running

To keep Kokoro-FastAPI running automatically in the background, use Docker's restart flag:

docker run -d --restart unless-stopped -p 8880:8880 ghcr.io/remsky/kokoro-fastapi-cpu:v0.2.2

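The container now restarts whenever Docker Desktop starts, unless you stopped it manually. For it to come back after a reboot, Docker Desktop itself must be set to launch at login.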
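After a reboot the server can take a little while to come up. If you launch other tools from a script, a small wait loop avoids sending requests too early. This is a sketch using only the standard library; the function name and the default timings are assumptions, and the default URL is the Kokoro-FastAPI web UI address from this guide.

```python
import time
import urllib.error
import urllib.request


def wait_until_ready(url="http://localhost:8880/web/", timeout=60.0, interval=1.0):
    """Poll `url` until it answers with HTTP 200 or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # not up yet: connection refused, reset, or timed out
        time.sleep(interval)
    return False


# Example:
# if wait_until_ready():
#     print("TTS server is up")
```

An HTTP check is used rather than a bare port check because Docker can accept connections on a published port before the application inside the container is ready.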

openedai-speech Setup (Lightweight Piper)

openedai-speech is the lightest option — a CPU-only Piper TTS server under 1 GB. Good for older or less powerful computers.

Install with Docker Compose

1. Clone the repository, or create a folder containing its docker-compose.min.yml file. Alternatively, skip Compose and run the command in step 2 directly.
2. Run the minimal Piper-only image:

docker run -d --restart unless-stopped -p 8000:8000 ghcr.io/matatonic/openedai-speech-min

Available Voices

openedai-speech uses OpenAI-style voice names mapped to Piper voices:

alloy, echo, fable, onyx, nova, shimmer

SSN URL

dock.html?session=YOUR_SESSION&speech=en-US&ttsprovider=openai&openaiendpoint=http://localhost:8000/v1/audio/speech&voiceopenai=nova

Connecting to Social Stream Ninja

All self-hosted servers above use the same connection method — Social Stream's built-in OpenAI TTS endpoint support with a custom local URL.

URL Parameters

Parameter | Value | Description
ttsprovider | openai | Use the OpenAI-compatible TTS path
openaiendpoint | http://localhost:8880/v1/audio/speech | Your local server URL (change port as needed)
speech | en-US | Enables TTS for English
voiceopenai | af_bella | Voice name (depends on server)
openaiformat | mp3 | Audio format: mp3, wav, opus, flac
openaispeed | 1.0 | Speaking speed (0.5–2.0)

Full Example URLs

Kokoro-FastAPI:

dock.html?session=YOUR_SESSION&speech=en-US&ttsprovider=openai&openaiendpoint=http://localhost:8880/v1/audio/speech&voiceopenai=af_bella&openaispeed=1.1

openedai-speech:

dock.html?session=YOUR_SESSION&speech=en-US&ttsprovider=openai&openaiendpoint=http://localhost:8000/v1/audio/speech&voiceopenai=nova

kokoro-web:

dock.html?session=YOUR_SESSION&speech=en-US&ttsprovider=openai&openaiendpoint=http://localhost:3000/api/v1/audio/speech&voiceopenai=af_bella
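The three servers differ only in port and path (note that kokoro-web uses /api/v1/... rather than /v1/...). The sketch below collects those defaults in one place and checks the speed range and audio formats listed in the parameter table before building the URL. The names ENDPOINTS and local_tts_url are hypothetical helpers, not part of Social Stream Ninja.

```python
from urllib.parse import urlencode

# Default endpoints, copied from the example URLs in this guide.
ENDPOINTS = {
    "kokoro-fastapi": "http://localhost:8880/v1/audio/speech",
    "openedai-speech": "http://localhost:8000/v1/audio/speech",
    "kokoro-web": "http://localhost:3000/api/v1/audio/speech",
}
FORMATS = ("mp3", "wav", "opus", "flac")


def local_tts_url(session, server, voice, speed=1.0, fmt="mp3"):
    """Build a dock.html URL for one of the self-hosted servers."""
    if server not in ENDPOINTS:
        raise ValueError(f"unknown server: {server!r}")
    if fmt not in FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    if not 0.5 <= speed <= 2.0:
        raise ValueError("speed must be between 0.5 and 2.0")
    params = {
        "session": session,
        "speech": "en-US",
        "ttsprovider": "openai",
        "openaiendpoint": ENDPOINTS[server],
        "voiceopenai": voice,
        "openaiformat": fmt,
        "openaispeed": speed,
    }
    # safe=":/" leaves the endpoint URL unescaped, as in the examples above.
    return "dock.html?" + urlencode(params, safe=":/")


print(local_tts_url("YOUR_SESSION", "kokoro-fastapi", "af_bella", speed=1.1))
```

Rejecting out-of-range values up front gives a clear error in the script instead of a silent failure in the dock page.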

Additional TTS Options

These work with any TTS provider, including local servers:

Parameter | Example | Description
simpletts | &simpletts | Skip "says"; read the message only
simpletts2 | &simpletts2 | Skip usernames entirely
volume | &volume=0.8 | Volume level (0.0–1.0)
skipmessages | &skipmessages=3 | Read every 3rd message only
ttscommand | &ttscommand=!say | Only read messages starting with !say
readevents | &readevents | Also read subscriptions, donations, etc.

No API key needed. When using a local server (non-openai.com URL), Social Stream Ninja sends the request without an Authorization header. You do not need to configure a key.

Getting Audio into OBS

How you capture TTS audio in OBS depends on how you're running Social Stream Ninja.

Method 1 — OBS Browser Source (Recommended)

This is the simplest method and works for all TTS providers (built-in and self-hosted server).

1. In OBS, add a new Browser Source.
2. Set the URL to your dock.html URL, including the TTS parameters.
3. Check "Control audio via OBS" in the browser source settings.
4. Click OK. TTS audio now appears as an OBS audio source that you can adjust or route.
5. Click the browser source once in the preview to allow browser audio autoplay.

Why this works: Built-in TTS and self-hosted server TTS both play audio through the browser's audio context (not OS speech synthesis). OBS can capture browser audio directly when "Control audio via OBS" is checked.

Method 2 — SSN Desktop App + Desktop Audio

If you're using the Social Stream Ninja standalone desktop app (not an OBS browser source):

1. TTS audio plays from the app through your system speakers or headphones.
2. In OBS, add an Audio Input Capture or Desktop Audio Capture source.
3. If you want TTS isolated from other desktop audio, use a virtual audio cable:
  • Windows: VB-Audio Virtual Cable (free)
  • Set the virtual cable as the output for the SSN app in Windows Sound settings
  • Capture the virtual cable input in OBS

System TTS (?speech=en-US without a provider) uses OS speech synthesis, which cannot be captured by an OBS browser source. Use one of the providers above (kokoro, piper, etc.) instead.

Comparison Table

Option | Setup | Quality | Private | OBS (Browser Source) | GPU Needed | Cost
Built-in Kokoro | None | ⭐⭐⭐⭐⭐ | Yes | Yes | No (faster with) | Free
Built-in Piper | None | ⭐⭐⭐⭐ | Yes | Yes | No | Free
Built-in Kitten | None | ⭐⭐⭐ | Yes | Yes | No | Free
Built-in eSpeak | None | ⭐⭐ | Yes | Yes | No | Free
Kokoro-FastAPI | Docker | ⭐⭐⭐⭐⭐ | Yes | Yes | No (optional) | Free
openedai-speech | Docker | ⭐⭐⭐⭐ | Yes | Yes | No | Free
ElevenLabs | API Key | ⭐⭐⭐⭐⭐ | No | Yes | No | Paid tiers
System TTS | None | ⭐⭐ | Yes | No* | No | Free

* System TTS requires virtual audio cable routing for OBS capture.

Troubleshooting

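No audio playing

Check that the dock.html URL includes both &speech= and &ttsprovider=. In an OBS browser source, click the source once in the preview to allow audio autoplay. On first use, Kokoro and Piper also need time to download their models before anything is spoken.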

Kokoro / Piper takes a long time on first load

The model files are being downloaded (~50–200 MB). This only happens once — subsequent loads use the cached version. Wait for the first message before testing.

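Local server not responding (self-hosted setup)

Confirm that Docker Desktop is running and the container has started, then open http://localhost:8880/web/ (Kokoro-FastAPI) in a browser. Also check that the port in &openaiendpoint= matches your server's port (8880, 8000, or 3000 by default).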

CORS error in browser console (extension mode)

If you're using the Social Stream browser extension (not the standalone app), your local server must allow cross-origin requests from the extension.

Most FastAPI-based servers (Kokoro-FastAPI, openedai-speech) allow all origins by default. If you still see a CORS error, check the server's documentation for its cross-origin (CORS) settings.

"http://localhost" blocked in OBS browser source

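OBS browser sources can have trouble reaching localhost servers. One thing to try is replacing localhost with 127.0.0.1 in the &openaiendpoint= URL, since both refer to the same machine.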

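Audio plays but OBS doesn't capture it

Make sure "Control audio via OBS" is checked in the browser source settings. Also confirm that a provider is set with &ttsprovider=, because system TTS (?speech=en-US without a provider) cannot be captured by an OBS browser source.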

Wrong voice / voice not found

Voice names are case-sensitive and must match what the server supports. Visit http://localhost:8880/web/ (Kokoro-FastAPI) to browse and test available voices.
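Because matching is case-sensitive, a typo or a stray capital letter is enough to break a voice name. The sketch below checks a name against a list you supply and suggests the closest entry. The helper check_voice is hypothetical, and the sample list is copied from the voices named in this guide, not fetched from a server.

```python
import difflib


def check_voice(name, available):
    """Return (is_valid, suggestion) for a voice name against a voice list."""
    if name in available:
        return True, None
    # Same letters, different case: suggest the correctly cased name.
    for voice in available:
        if voice.lower() == name.lower():
            return False, voice
    # Otherwise suggest the closest spelling, if one is reasonably similar.
    close = difflib.get_close_matches(name, available, n=1)
    return False, (close[0] if close else None)


voices = ["af_bella", "af_sarah", "af_nicole", "af_sky", "am_adam", "am_michael"]
print(check_voice("af_bella", voices))  # valid name
print(check_voice("AF_Bella", voices))  # wrong case
print(check_voice("af_bela", voices))   # misspelling
```

Replace the sample list with the names your own server reports before relying on the result.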

Docker image not found

Image tags change with new releases. If the tag in this guide no longer works, check the project's GitHub page for the latest version tag.

More TTS options: For cloud-based premium TTS (ElevenLabs, Google Cloud, Speechify) and full URL parameter reference, see the TTS Voice Guide.