sag

Modern text-to-speech CLI using ElevenLabs voices

Categories: Productivity, System & Monitoring | Platforms: linux, macos, windows | Language: Go | License: MIT

Description

sag is a command-line text-to-speech tool that works like macOS's `say` command but uses ElevenLabs' modern AI voices. It streams audio to speakers by default, supports voice discovery and selection, speed/rate controls, multiple TTS engines (v3, v2, v2.5 Flash/Turbo), and can save audio to MP3 or WAV files.

Install

homebrew: brew install steipete/tap/sag

go: go install github.com/steipete/sag/cmd/sag@latest

AI Summary

Drop-in replacement for macOS `say` using ElevenLabs AI voices. Streams high-quality speech to speakers or files with voice discovery, speed control, and support for multiple TTS engines (v3, v2, v2.5 Flash/Turbo).

Capabilities

+ Stream text-to-speech audio to speakers in real-time (default behavior)
+ Save audio output to MP3 or WAV files with format auto-detection
+ Voice discovery and search with --try flag to preview voices
+ Multiple TTS engines: v3 (most expressive), v2 (stable), v2.5 Flash (75ms latency), v2.5 Turbo
+ Speed control (0.5x-2.0x), stability, similarity, and style tuning
+ SSML support for v2/v2.5 and audio tags for v3
+ Read text from arguments, files, or piped stdin
+ Multilingual support with ISO 639-1 language codes
+ Latency tier adjustment for streaming optimization
+ Performance metrics output

Use When

→ You need high-quality AI-generated speech from the command line
→ Creating voiceover audio files for presentations or content
→ Adding spoken notifications to scripts or automation
→ Previewing and selecting from a library of AI voices
→ You need lower latency than cloud TTS web interfaces

Avoid When

x No ElevenLabs API key is available
x Working offline without internet access
x macOS built-in `say` command quality is sufficient
x Processing very large texts (v3 limited to 5,000 characters per request)
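The per-request limit above can be worked around by splitting long text before it reaches sag. The following is a minimal sketch, not part of sag itself: `chunk_text` and `speak_long` are hypothetical helper names, and the chunks are fed to sag over stdin using the piping pattern shown under Usage Patterns. It assumes a POSIX shell with `tr` and `fold`; `fold` counts columns or bytes rather than characters, so chunks of non-ASCII text may come out shorter than the limit.

```shell
# chunk_text [MAX]: read text on stdin and print it as lines of at most
# MAX columns (default 5000), breaking at spaces where possible.
chunk_text() {
  # Collapse newlines into spaces so the input is one long line,
  # then let fold split it at the last blank before the width limit.
  tr '\n' ' ' | fold -s -w "${1:-5000}"
}

# speak_long VOICE [MAX]: speak stdin chunk by chunk with the given voice.
# Each chunk is piped to sag on stdin, so sag never sees more than MAX
# columns of text in a single request.
speak_long() {
  chunk_text "${2:-5000}" | while IFS= read -r chunk || [ -n "$chunk" ]; do
    [ -n "$chunk" ] || continue
    printf '%s' "$chunk" | sag speak -v "$1"
  done
}
```

Usage: `speak_long Roger < article.txt`. Each chunk is a separate request, so a short pause at chunk boundaries is likely.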

Usage Patterns

Basic speech

sag "Hello, world!"

Speaks the text aloud using the default voice and engine

Use a specific voice

sag speak -v Roger "Welcome to the demo"

Speaks using the named voice Roger

Save to file

sag -o output.mp3 "Save this as an audio file"

Generates speech and saves it as an MP3 file

Search for voices

sag voices --search english --limit 20

Lists up to 20 available voices matching the search term "english"

Preview voices

sag voices --search "narrator" --limit 5 --try

Finds narrator-style voices and plays a sample of each

Pipe text from stdin

echo "Read this text" | sag speak -v Roger

Reads and speaks text piped from stdin

Adjust speed

sag speak -v Roger --speed 1.3 "A bit faster please"

Speaks at 1.3x speed

Input / Output

stdin: Text to be spoken (when using -f - or piping)

stdout: Voice listings (for example `sag voices --json`); synthesized audio goes to the speakers (default) or to the file given with -o

stderr: Performance metrics (with --metrics) and status messages

Exit codes:

0 Success

1 Error
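Because sag reports failure through a non-zero exit code, a script can fall back to plain text when speech is unavailable (for instance, offline or without an API key). The sketch below is one way to do that; `say_or_print` and `notify_after` are hypothetical helper names, and the spoken messages are placeholders.

```shell
# say_or_print MESSAGE: speak MESSAGE with sag; if sag is not installed
# or exits non-zero, print the message to stderr instead.
say_or_print() {
  if command -v sag >/dev/null 2>&1 && sag "$1"; then
    return 0
  fi
  printf '%s\n' "$1" >&2
  return 1
}

# notify_after CMD [ARGS...]: run a command, announce the outcome,
# and return the command's own exit status to the caller.
notify_after() {
  "$@"
  rc=$?
  if [ "$rc" -eq 0 ]; then
    say_or_print "Task finished"
  else
    say_or_print "Task failed with exit code $rc"
  fi
  return "$rc"
}
```

Usage: `notify_after make test` speaks the result when sag works and prints it to stderr otherwise, while the exit status of `make test` is preserved for the surrounding script.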

Typical Pipelines

cat article.txt | sag speak -v Roger --speed 1.2

sag -o narration.mp3 -v Roger "$(cat script.txt)"

sag voices --search british --limit 10 --json | jq '.[].name'
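The save-to-file pipeline above extends naturally to a batch job. As a rough sketch, the hypothetical helper `narrate_dir` below loops over the `.txt` files in a directory and calls sag with the same `-o`, `-v`, and `"$(cat ...)"` form shown in this section; the directory layout and output naming are assumptions for illustration.

```shell
# narrate_dir DIR VOICE: for every DIR/<name>.txt, write DIR/<name>.mp3
# spoken with VOICE. Stops at the first failure and returns sag's
# non-zero exit code.
narrate_dir() {
  for txt in "$1"/*.txt; do
    [ -e "$txt" ] || continue        # no .txt files: the glob stays literal
    out="${txt%.txt}.mp3"            # same base name, .mp3 extension
    sag -o "$out" -v "$2" "$(cat "$txt")" || return $?
  done
}
```

Usage: `narrate_dir ./scripts Roger`. Files longer than the per-request limit of the chosen engine would need to be split first.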
