Linux Text to Speech: Best CLI Tools, GUI Apps, and Online Options

Text to speech on Linux is more capable than most users realize. This guide covers CLI tools like eSpeak and Piper, the best GUI app for desktop users, and online alternatives — with install instructions for Ubuntu.

Julian Sterling

AI Content Strategist

June 27, 2026

8 min read

In This Article

Text to Speech on Linux: What to Expect

Best CLI Tools for Linux Text to Speech

How to Install eSpeak on Ubuntu — Step by Step

How to Install and Use Piper

GUI and Desktop Options for Linux TTS

Tool Comparison: CLI vs GUI vs Online

Online TTS as a Simpler Alternative

Which Tool Is Right for You?

Text to Speech on Linux: What to Expect

Text to speech on Linux is a mixed landscape. The good news: several solid tools exist, most are free, and the best ones are actively maintained. The less obvious part is that the experience varies a lot depending on what you need — a developer scripting audio output has very different requirements from a desktop user who just wants to listen to a long article.

Before diving into tools, it helps to understand the three categories you will encounter:

CLI tools — run entirely in the terminal, ideal for automation and scripting
GUI apps — graphical interface, better for everyday desktop use
Online / cloud tools — require no installation, accessible from any browser

This guide covers each category with concrete install commands (Ubuntu/Debian) and honest "best for" guidance.

Quick Tip: On Ubuntu, you can test eSpeak or Festival instantly from the terminal by piping text straight into the command, with no temp files needed. It is a quick way to hear how different voices compare.

Best CLI Tools for Linux Text to Speech

eSpeak — The Simplest Starting Point

eSpeak (and its successor eSpeak NG) is one of the most widely used TTS engines on Linux. It is lightweight, fast, and installs with a single command.

Install on Ubuntu/Debian:

sudo apt-get install espeak

Basic usage:

espeak "Hello, this is a test"
espeak -f document.txt

Save to audio file:

espeak "Hello world" -w output.wav

eSpeak uses a formant synthesis approach, which means the voice sounds robotic compared to modern neural TTS. That said, it is perfect for automation scripts, accessibility tools, or any situation where you just need reliable spoken output and don't care about voice quality.

Best for: Developers who need TTS in scripts, accessibility tools, server-side automation, users who want the fastest possible setup.
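
For the scripting use case, a small wrapper can hide which eSpeak binary is installed. The sketch below is illustrative rather than definitive: `first_available` and `say` are hypothetical helper names, and `SAY_SPEED` and `SAY_VOICE` are made-up environment variables. It relies only on the `-s` (speed in words per minute), `-v` (voice), and `--stdin` options that both `espeak` and `espeak-ng` accept.

```shell
#!/bin/sh
# Hypothetical helper: print the first command from the argument list
# that exists on this system, or return 1 if none of them do.
first_available() {
    for cmd in "$@"; do
        if command -v "$cmd" >/dev/null 2>&1; then
            printf '%s\n' "$cmd"
            return 0
        fi
    done
    return 1
}

# Hypothetical helper: read text from stdin and speak it aloud.
# SAY_SPEED and SAY_VOICE are illustrative variables, not eSpeak settings.
say() {
    engine=$(first_available espeak-ng espeak) || {
        echo "say: neither espeak-ng nor espeak is installed" >&2
        return 127
    }
    "$engine" -s "${SAY_SPEED:-160}" -v "${SAY_VOICE:-en}" --stdin
}
```

With the functions sourced into a script, `echo "Backup finished" | say` speaks the message, and setting `SAY_SPEED=120` slows it down.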

Festival — A Mature and Capable Engine

Festival is one of the oldest and most feature-complete TTS systems available on Linux. It supports multiple languages, has an extensible architecture, and includes a full scripting interface.

Install on Ubuntu/Debian:

sudo apt-get install festival

Basic usage:

echo "Hello from Festival" | festival --tts
festival --tts < document.txt

Festival's default English voices are noticeably dated — they were state-of-the-art in the 1990s but sound rough by today's standards. However, you can install better voice packages (like the HTS voices) to significantly improve quality.

Best for: Users who need a full-featured TTS framework, those building more complex audio pipelines, anyone who wants to go deeper into voice customization.
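
Festival can also write audio to a file instead of playing it. A minimal sketch, assuming the `text2wave` script that ships with the Festival package is on your PATH; `wav_name_for` and `festival_to_wav` are hypothetical helper names introduced here for illustration.

```shell
#!/bin/sh
# Hypothetical helper: derive an output name by swapping a trailing .txt
# for .wav (names without .txt just get .wav appended).
wav_name_for() {
    printf '%s\n' "${1%.txt}.wav"
}

# Hypothetical helper: render a text file to a WAV file with text2wave.
# Usage: festival_to_wav input.txt [output.wav]
festival_to_wav() {
    src=$1
    dst=${2:-$(wav_name_for "$src")}
    if ! command -v text2wave >/dev/null 2>&1; then
        echo "festival_to_wav: text2wave not found; install festival" >&2
        return 127
    fi
    text2wave -o "$dst" "$src"
}
```

Calling `festival_to_wav document.txt` would write `document.wav` next to the input file.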

Piper — Neural TTS with High-Quality Output

Piper is a fast, local neural text to speech engine developed by the team behind Home Assistant. It produces natural-sounding speech that can approach the quality of cloud services, and it runs entirely offline.

Install on Ubuntu/Debian:

Piper is not in the standard apt repositories. Install it by downloading the binary from the GitHub releases page:

# Download the latest release (x86_64 build)
wget https://github.com/rhasspy/piper/releases/latest/download/piper_linux_x86_64.tar.gz
tar -xzf piper_linux_x86_64.tar.gz
# The archive unpacks into a piper/ directory that contains the binary
cd piper

You also need to download a voice model (.onnx file) from the Piper voices repository. For example, to use the en_US-lessac-medium voice:

# Download voice model and config
wget https://huggingface.co/rhasspy/piper-voices/resolve/main/en/en_US/lessac/medium/en_US-lessac-medium.onnx
wget https://huggingface.co/rhasspy/piper-voices/resolve/main/en/en_US/lessac/medium/en_US-lessac-medium.onnx.json

Basic usage:

echo "This is Piper speaking" | ./piper --model en_US-lessac-medium.onnx --output_file output.wav
aplay output.wav

The extra setup steps are worth it if voice quality matters. Piper's output is dramatically more natural than eSpeak or Festival's default voices.

Best for: Developers building voice-enabled applications, podcast creators, anyone who wants high-quality offline TTS without cloud fees.

How to Install eSpeak on Ubuntu — Step by Step

For users new to Linux TTS, here is a complete walkthrough for getting eSpeak working on Ubuntu:

Open a terminal (Ctrl + Alt + T)
Update your package list:
```
sudo apt-get update
```
Install eSpeak:
```
sudo apt-get install espeak
```
Test it:
```
espeak "eSpeak is working correctly"
```

You should hear speech through your speakers or headphones immediately. No configuration files, no environment setup.

To install the newer espeak-ng instead (recommended for new projects):

sudo apt-get install espeak-ng
espeak-ng "This is espeak-ng"

How to Install and Use Piper

The full Piper workflow involves three steps: download the binary, download a voice model, and run inference.

Step 1: Download and extract the binary

cd ~
wget https://github.com/rhasspy/piper/releases/latest/download/piper_linux_x86_64.tar.gz
tar -xzf piper_linux_x86_64.tar.gz
cd piper

Step 2: Download a voice

Browse voices at https://rhasspy.github.io/piper-samples/ and pick one. The medium quality models are a good balance of size and quality. Download both the .onnx model file and the .onnx.json config file, and save them in ~/voices, the directory the Step 3 command reads from.
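
The download in Step 2 can be scripted. The sketch below assumes every voice follows the same URL layout as the en_US-lessac-medium files shown earlier, which may not hold for all voices, so check the generated URL against the voices repository. `voice_url`, `fetch_voice`, and the `VOICE_DIR` variable are hypothetical names, not part of Piper.

```shell
#!/bin/sh
# Hypothetical helper: build the model URL from locale, voice name, and quality.
# Assumed layout: <lang>/<locale>/<name>/<quality>/<locale>-<name>-<quality>.onnx
voice_url() {
    locale=$1
    name=$2
    quality=$3
    lang=${locale%%_*}   # en_US -> en
    printf 'https://huggingface.co/rhasspy/piper-voices/resolve/main/%s/%s/%s/%s/%s-%s-%s.onnx\n' \
        "$lang" "$locale" "$name" "$quality" "$locale" "$name" "$quality"
}

# Hypothetical helper: download the model and its .onnx.json config.
# Files go to $VOICE_DIR, defaulting to ~/voices as used in Step 3.
fetch_voice() {
    dest=${VOICE_DIR:-$HOME/voices}
    url=$(voice_url "$1" "$2" "$3")
    mkdir -p "$dest"
    # -nc skips files that already exist; -P sets the target directory
    wget -nc -P "$dest" "$url" "$url.json"
}
```

Running `fetch_voice en_US lessac medium` would place both files in ~/voices.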

Step 3: Generate speech

echo "Hello from Piper" | ./piper \
  --model ~/voices/en_US-lessac-medium.onnx \
  --output_file ~/output.wav
aplay ~/output.wav

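Piper also supports piping directly to a player. The sample rate passed to aplay must match the model; 22050 Hz is correct for the medium voices:

echo "Playing directly" | ./piper \
  --model ~/voices/en_US-lessac-medium.onnx \
  --output-raw | aplay -r 22050 -f S16_LE -t raw -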
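
To convert several text files in one go, the Step 3 command can be wrapped in a loop. This is a sketch under assumptions: `piper_batch` is a hypothetical function name, and `PIPER_BIN` is a made-up variable that defaults to the `./piper` binary from Step 1. It uses only the `--model` and `--output_file` options shown above.

```shell
#!/bin/sh
# Hypothetical helper: run Piper once per text file, writing a WAV next to it.
# Usage: piper_batch MODEL.onnx file1.txt [file2.txt ...]
piper_batch() {
    model=$1
    shift
    for txt in "$@"; do
        wav=${txt%.txt}.wav
        # Piper reads the text to speak from stdin
        "${PIPER_BIN:-./piper}" --model "$model" --output_file "$wav" < "$txt" || return 1
        echo "wrote $wav"
    done
}
```

For example, `piper_batch ~/voices/en_US-lessac-medium.onnx chapter1.txt chapter2.txt` would produce `chapter1.wav` and `chapter2.wav`. Each call reloads the model, which is acceptable for a handful of files.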

GUI and Desktop Options for Linux TTS

Not everyone wants to use the command line. If you are a desktop Linux user looking for a graphical text to speech application, your main option is Speech Note.

Speech Note

Speech Note is a Qt-based application that provides a clean interface for speech recognition and text to speech. It can use espeak-ng as its TTS backend, and can also be configured to use Piper for higher-quality output.

Install via Flatpak (recommended):

flatpak install flathub net.mkiol.SpeechNote

The interface lets you type or paste text, select a language and voice, and play back the result — no terminal required. It also supports reading clipboard contents, which is useful for listening to web articles or documents.

Best for: Desktop Linux users who want TTS without the command line, users who also want speech recognition capabilities in the same app.

Tool Comparison: CLI vs GUI vs Online

Tool	Type	Voice Quality	Ease of Setup	Best For
eSpeak / espeak-ng	CLI	Low (robotic)	Very easy (apt-get)	Scripts, automation, accessibility
Festival	CLI	Low-Medium	Easy (apt-get)	Feature-rich pipelines, customization
Piper	CLI	High (neural)	Medium (manual download)	High-quality offline TTS, apps
Speech Note	GUI	Medium-High	Easy (Flatpak)	Desktop users, non-CLI users
AI Listen	Online/iOS	High (AI)	None required	Cross-platform, no setup needed

Online TTS as a Simpler Alternative

If you are not a developer and ended up on this page because you just want articles or documents read aloud, Linux CLI tools may be more friction than they are worth.

For casual use, an online tool or mobile app gets you there without any terminal commands. AI Listen is a cross-platform option that converts text to natural-sounding speech instantly, supports multiple languages, and works on any device. You paste text, hit play, and you're done.

It is especially useful if you regularly switch between a Linux machine and a phone or tablet — there is no configuration to replicate across devices.

Which Tool Is Right for You?

Here is the short version:

You write scripts or automate things → eSpeak or espeak-ng. Install it in ten seconds, it works, and it stays out of your way.
You want the best possible voice quality locally → Piper. Spend 15 minutes on setup and you get neural TTS that sounds genuinely good.
You prefer a GUI on the desktop → Speech Note via Flatpak. No terminal required.
You want a mature, full-featured TTS framework → Festival, especially if you plan to customize voices or build complex audio pipelines.
You just want to listen to something right now, cross-platform → AI Listen. No Linux knowledge needed, works on any device.

The Linux TTS ecosystem is strong, especially with Piper raising the quality bar significantly. Pick the tool that matches your actual use case rather than the one with the most features.

Frequently Asked Questions

Is text to speech free on Linux?

Yes. eSpeak, Festival, and Piper are all free and open-source. Piper requires a one-time binary download, but there are no licensing costs or subscriptions.

What is the best text to speech tool for Linux?

It depends on your use case. eSpeak is the fastest to set up and best for scripting. Piper produces the most natural-sounding voices. Speech Note is the best option for desktop users who prefer a graphical interface.

How do I install text to speech on Ubuntu?

Run sudo apt-get install espeak for eSpeak, or sudo apt-get install festival for Festival. Piper is not in the standard apt repositories: download the binary from the Piper GitHub releases page and a voice model from the Piper voices repository.

What is the difference between eSpeak and Piper?

eSpeak uses formant synthesis — it installs in seconds via apt-get but produces robotic-sounding output. Piper uses a neural model to generate natural speech, but requires a manual binary download and a separate voice model file.

Does Linux TTS work offline?

Yes. eSpeak, Festival, and Piper all work fully offline once installed and configured. AI Listen requires an internet connection for its AI-enhanced voices.

Can I use text to speech in shell scripts on Linux?

Yes. Both eSpeak and Festival accept input from stdin or a file, making them easy to integrate into automation pipelines. You can pipe text directly to either tool without writing a temp file first.
