
sag

Modern text-to-speech CLI using ElevenLabs voices

Productivity | System & Monitoring | Linux, macOS, Windows | Go | MIT

Description

sag is a command-line text-to-speech tool that works like macOS's `say` command but uses ElevenLabs' modern AI voices. It streams audio to speakers by default, supports voice discovery and selection, speed/rate controls, multiple TTS engines (v3, v2, v2.5 Flash/Turbo), and can save audio to MP3 or WAV files.

Install

Homebrew: brew install steipete/tap/sag
Go: go install github.com/steipete/sag/cmd/sag@latest

AI Summary

Drop-in replacement for macOS `say` using ElevenLabs AI voices. Streams high-quality speech to speakers or files with voice discovery, speed control, and support for multiple TTS engines (v3, v2, v2.5 Flash/Turbo).

Capabilities

  • Stream text-to-speech audio to speakers in real time (default behavior)
  • Save audio output to MP3 or WAV files with format auto-detection
  • Voice discovery and search, with a --try flag to preview voices
  • Multiple TTS engines: v3 (most expressive), v2 (stable), v2.5 Flash (75 ms latency), v2.5 Turbo
  • Speed control (0.5x-2.0x), plus stability, similarity, and style tuning
  • SSML support for v2/v2.5 and audio tags for v3
  • Read text from arguments, files, or piped stdin
  • Multilingual support with ISO 639-1 language codes
  • Latency tier adjustment for streaming optimization
  • Performance metrics output

Use When

  • You need high-quality AI-generated speech from the command line
  • Creating voiceover audio files for presentations or content
  • Adding spoken notifications to scripts or automation
  • Previewing and selecting from a library of AI voices
  • You need lower latency than cloud TTS web interfaces provide
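
For the spoken-notification case, a minimal sketch of a wrapper that runs any command and then announces the outcome. The function names `announce` and `run_and_announce` are hypothetical, not part of sag. The sketch assumes only the `sag "text"` invocation shown under Usage Patterns, and falls back to printing the message when sag is not on the PATH.

```shell
#!/usr/bin/env bash
# Hypothetical helper: speak a message with sag, or print it to stderr
# when sag is not installed or the call fails.
announce() {
  if command -v sag >/dev/null 2>&1; then
    sag "$1" || printf 'sag failed; message was: %s\n' "$1" >&2
  else
    printf '%s\n' "$1" >&2
  fi
}

# Hypothetical helper: run a command, announce success or failure,
# and pass the command's exit status through to the caller.
run_and_announce() {
  if "$@"; then
    announce "Finished: $1"
  else
    rc=$?
    announce "Failed: $1"
    return "$rc"
  fi
}
```

A call such as `run_and_announce make test` would then speak "Finished: make" or "Failed: make" once the build ends, while the script still sees the original exit status.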

Avoid When

  • No ElevenLabs API key is available
  • Working offline without internet access
  • The quality of the built-in macOS `say` command is sufficient
  • Processing very large texts (v3 limited to 5,000 characters per request)
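
If a long text still has to be spoken, one possible workaround for the per-request character limit is to split it into smaller pieces first. The sketch below is an assumption-laden illustration, not a sag feature: `split_for_sag` is a hypothetical helper that wraps over-long lines with `fold` and then groups whole lines into chunk files that each stay within the given limit. It assumes mostly whitespace-separated text.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read text on stdin and write chunk files of at most
# $1 characters (default 5000) into directory $2 (default ./chunks).
# Chunks break only at line boundaries; fold pre-wraps lines that are
# longer than the limit so every line fits into a chunk on its own.
split_for_sag() {
  limit="${1:-5000}"
  outdir="${2:-chunks}"
  mkdir -p "$outdir"
  fold -s -w "$((limit - 1))" | awk -v limit="$limit" -v dir="$outdir" '
    function emit() {
      if (buf == "") return
      n++
      file = sprintf("%s/chunk-%03d.txt", dir, n)
      printf "%s", buf > file
      close(file)
      buf = ""
    }
    {
      line = $0 "\n"
      # Start a new chunk when adding this line would exceed the limit.
      if (length(buf) + length(line) > limit) emit()
      buf = buf line
    }
    END { emit() }
  '
}
```

Each chunk could then be piped to sag in turn, for example `for f in chunks/chunk-*.txt; do sag speak -v Roger < "$f"; done`, which relies on the stdin input described under Capabilities.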

Usage Patterns

Basic speech

sag "Hello, world!"

Speaks the text aloud using the default voice and engine

Use a specific voice

sag speak -v Roger "Welcome to the demo"

Speaks using the named voice Roger

Save to file

sag -o output.mp3 "Save this as an audio file"

Generates speech and saves it as an MP3 file

Search for voices

sag voices --search english --limit 20

Lists up to 20 available voices matching "english"

Preview voices

sag voices --query "narrator" --limit 5 --try

Finds narrator-style voices and plays a sample of each

Pipe text from stdin

echo "Read this text" | sag speak -v Roger

Reads and speaks text piped from stdin

Adjust speed

sag speak -v Roger --speed 1.3 "A bit faster please"

Speaks at 1.3x speed

Input / Output

stdin: Text to be spoken (when using -f - or piping)
stdout: Voice listings (for example with --json); audio goes to speakers (default) or to a file with -o
stderr: Performance metrics (with --metrics) and status messages
Exit codes:
0 Success
1 Error
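
Because only success (0) and a generic error (1) are listed, a calling script cannot tell from the exit code what went wrong. One simple option is a bounded retry. The `retry` function below is a hypothetical, general-purpose sketch and not something sag provides.

```shell
#!/usr/bin/env bash
# Hypothetical helper: retry a command up to $1 times, sleeping $2 seconds
# between attempts. Returns 0 on the first success, 1 if every attempt fails.
retry() {
  attempts="$1"
  delay="$2"
  shift 2
  i=1
  while :; do
    "$@" && return 0
    [ "$i" -ge "$attempts" ] && return 1
    i=$((i + 1))
    sleep "$delay"
  done
}

# Example (requires sag and an API key, so it is left commented out):
# retry 3 2 sag -o output.mp3 "Save this as an audio file" || echo "giving up" >&2
```

Retrying only helps with transient failures such as network hiccups. A missing API key will fail on every attempt, so the attempt count is best kept small.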

Typical Pipelines

cat article.txt | sag speak -v Roger --speed 1.2
sag -o narration.mp3 -v Roger "$(cat script.txt)"
sag voices --search british --limit 10 --json | jq '.[].name'
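
The file-output pipeline above extends naturally to a batch job. The following is a sketch under assumptions: `narrate_dir` and the `DRY_RUN` variable are hypothetical names, and the sag invocation simply mirrors the `sag -o ... -v ... "$(cat ...)"` form shown in the pipelines.

```shell
#!/usr/bin/env bash
# Hypothetical batch helper: narrate every .txt file in directory $1 to a
# matching .mp3 using voice $2 (default Roger).
# With DRY_RUN=1 it only prints the commands it would run.
narrate_dir() {
  dir="$1"
  voice="${2:-Roger}"
  for src in "$dir"/*.txt; do
    [ -e "$src" ] || continue          # skip when the glob matched nothing
    out="${src%.txt}.mp3"
    if [ "${DRY_RUN:-0}" = "1" ]; then
      printf 'sag -o %s -v %s "$(cat %s)"\n' "$out" "$voice" "$src"
    else
      sag -o "$out" -v "$voice" "$(cat "$src")" || return 1
    fi
  done
}
```

Running `DRY_RUN=1 narrate_dir ./scripts` first is a cheap way to check the generated file names before any API requests are made.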