Linux Text to Speech: Best CLI Tools, GUI Apps, and Online Options
Text to speech on Linux is more capable than most users realize. This guide covers CLI tools like eSpeak and Piper, the best GUI app for desktop users, and online alternatives — with install instructions for Ubuntu.
Julian Sterling
AI Content Strategist
June 27, 2026
In This Article
Text to Speech on Linux: What to Expect
Best CLI Tools for Linux Text to Speech
How to Install eSpeak on Ubuntu — Step by Step
How to Install and Use Piper
GUI and Desktop Options for Linux TTS
Tool Comparison: CLI vs GUI vs Online
Online TTS as a Simpler Alternative
Which Tool Is Right for You?

Text to Speech on Linux: What to Expect

Text to speech on Linux is a mixed landscape. The good news: several solid tools exist, most are free, and the best ones are actively maintained. The less obvious part is that the experience varies a lot depending on what you need — a developer scripting audio output has very different requirements from a desktop user who just wants to listen to a long article.

Before diving into tools, it helps to understand the three categories you will encounter:

  • CLI tools — run entirely in the terminal, ideal for automation and scripting

  • GUI apps — graphical interface, better for everyday desktop use

  • Online / cloud tools — require no installation, accessible from any browser

This guide covers each category with concrete install commands (Ubuntu/Debian) and honest "best for" guidance.

Quick Tip: On Ubuntu, you can test any TTS engine instantly from the terminal by piping text directly — no temp files needed. Try eSpeak or Festival to quickly hear how different voices compare.

Best CLI Tools for Linux Text to Speech

eSpeak — The Simplest Starting Point

eSpeak (and its successor eSpeak NG) is among the most widely used TTS engines on Linux. It is lightweight, fast, and installs with a single command.

Install on Ubuntu/Debian:

sudo apt-get install espeak

Basic usage:

espeak "Hello, this is a test"
espeak -f document.txt

Save to audio file:

espeak "Hello world" -w output.wav

eSpeak uses a formant synthesis approach, which means the voice sounds robotic compared to modern neural TTS. That said, it is perfect for automation scripts, accessibility tools, or any situation where you just need reliable spoken output and don't care about voice quality.

Best for: Developers who need TTS in scripts, accessibility tools, server-side automation, users who want the fastest possible setup.
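
As a rough illustration of the scripting use case, the sketch below wraps espeak in two small shell functions. The function names say and notify_done and the variables SPEAK_VOICE and SPEAK_WPM are made up for this example; -v (voice) and -s (speed in words per minute) are standard espeak options.

```shell
# Sketch: speak a message with an adjustable voice and speed.
# SPEAK_VOICE and SPEAK_WPM are hypothetical variable names for this example.
say() {
    local voice="${SPEAK_VOICE:-en-us}"   # -v selects the voice
    local wpm="${SPEAK_WPM:-160}"         # -s sets the speed in words per minute
    espeak -v "$voice" -s "$wpm" "$*"
}

# Run a command, then announce whether it succeeded.
notify_done() {
    if "$@"; then
        say "Finished: $1"
    else
        say "Failed: $1"
        return 1
    fi
}
```

After sourcing the file, something like notify_done make would run the build and then speak the result, which is handy for long jobs running in another window.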

Festival — A Mature and Capable Engine

Festival is one of the oldest and most feature-complete TTS systems available on Linux. It supports multiple languages, has an extensible architecture, and includes a full scripting interface.

Install on Ubuntu/Debian:

sudo apt-get install festival

Basic usage:

echo "Hello from Festival" | festival --tts
festival --tts < document.txt

Festival's default English voices are noticeably dated — they were state-of-the-art in the 1990s but sound rough by today's standards. However, you can install better voice packages (like the HTS voices) to significantly improve quality.

Best for: Users who need a full-featured TTS framework, those building more complex audio pipelines, anyone who wants to go deeper into voice customization.
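
Festival can also write audio to a file instead of playing it. The sketch below uses text2wave, a helper script that ships with Festival; the wrapper name txt_to_wav and its default output naming are assumptions made for this example.

```shell
# Sketch: convert a text file to WAV with Festival's text2wave helper.
# txt_to_wav is a hypothetical function name used for this example.
txt_to_wav() {
    local src="$1"
    local dst="${2:-${src%.txt}.wav}"   # default: same name with a .wav extension
    if [ ! -r "$src" ]; then
        echo "cannot read: $src" >&2
        return 1
    fi
    # Print the output path only if the conversion succeeded.
    text2wave "$src" -o "$dst" && echo "$dst"
}
```

Calling txt_to_wav document.txt would then produce document.wav next to the source file, assuming Festival is installed.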

Piper — Neural TTS with High-Quality Output

Piper is a fast, local neural text to speech engine developed by the team behind Home Assistant. It produces genuinely natural-sounding speech that rivals cloud services — and it runs entirely offline.

Install on Ubuntu/Debian:

Piper is not in the standard apt repositories. Install it by downloading the binary from the GitHub releases page:

# Download the latest release for 64-bit x86 (other architectures are listed on the releases page)
wget https://github.com/rhasspy/piper/releases/latest/download/piper_linux_x86_64.tar.gz
tar -xzf piper_linux_x86_64.tar.gz

You also need to download a voice model (.onnx file) from the Piper voices repository. For example, to use the en_US-lessac-medium voice:

# Download voice model and config
wget https://huggingface.co/rhasspy/piper-voices/resolve/main/en/en_US/lessac/medium/en_US-lessac-medium.onnx
wget https://huggingface.co/rhasspy/piper-voices/resolve/main/en/en_US/lessac/medium/en_US-lessac-medium.onnx.json

Basic usage:

# The archive extracts into a piper/ directory that contains the binary
echo "This is Piper speaking" | ./piper/piper --model en_US-lessac-medium.onnx --output_file output.wav
aplay output.wav

The extra setup steps are worth it if voice quality matters. Piper's output is dramatically more natural than eSpeak or Festival's default voices.

Best for: Developers building voice-enabled applications, podcast creators, anyone who wants high-quality offline TTS without cloud fees.
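
For longer material, such as a folder of chapters or show notes, a small loop can convert every text file in one pass. This is a minimal sketch, not an official Piper workflow: PIPER_BIN, PIPER_MODEL, and batch_tts are hypothetical names, and the default paths assume the binary and voice model were downloaded as described above.

```shell
# Sketch: convert every .txt file in a directory to a WAV file with Piper.
# PIPER_BIN and PIPER_MODEL are hypothetical variables for this example;
# point them at your extracted binary and downloaded voice model.
PIPER_BIN="${PIPER_BIN:-./piper/piper}"
PIPER_MODEL="${PIPER_MODEL:-en_US-lessac-medium.onnx}"

batch_tts() {
    local dir="$1" f count=0
    for f in "$dir"/*.txt; do
        [ -e "$f" ] || continue            # skip when the glob matches nothing
        # Piper reads the text from stdin and writes one WAV per input file.
        "$PIPER_BIN" --model "$PIPER_MODEL" --output_file "${f%.txt}.wav" < "$f" || return 1
        count=$((count + 1))
    done
    echo "converted $count file(s)"
}
```

Running batch_tts ~/articles would leave a .wav file beside each .txt file in that directory.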

How to Install eSpeak on Ubuntu — Step by Step

For users new to Linux TTS, here is a complete walkthrough for getting eSpeak working on Ubuntu:

  1. Open a terminal (Ctrl + Alt + T)

  2. Update your package list:

    sudo apt-get update

  3. Install eSpeak:

    sudo apt-get install espeak

  4. Test it:

    espeak "eSpeak is working correctly"

You should hear speech through your speakers or headphones immediately. No configuration files, no environment setup.

To install the newer espeak-ng instead (recommended for new projects):

sudo apt-get install espeak-ng
espeak-ng "This is espeak-ng"

How to Install and Use Piper

The full Piper workflow involves three steps: download the binary, download a voice model, and run inference.

Step 1: Download and extract the binary

cd ~
wget https://github.com/rhasspy/piper/releases/latest/download/piper_linux_x86_64.tar.gz
tar -xzf piper_linux_x86_64.tar.gz
cd piper

Step 2: Download a voice

Browse voices at https://rhasspy.github.io/piper-samples/ and pick one. The medium quality models are a good balance of size and quality. Download both the .onnx model file and the .onnx.json config file into the same directory. The example in Step 3 expects them in ~/voices.

Step 3: Generate speech

echo "Hello from Piper" | ./piper \
  --model ~/voices/en_US-lessac-medium.onnx \
  --output_file ~/output.wav
aplay ~/output.wav

Piper also supports piping directly to a player:

echo "Playing directly" | ./piper \
  --model ~/voices/en_US-lessac-medium.onnx \
  --output-raw | aplay -r 22050 -f S16_LE -t raw -
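
When streaming raw audio, the rate passed to aplay has to match the voice model, and not every voice uses 22050 Hz. The sketch below reads the value from the voice's .onnx.json file instead of hard-coding it. It assumes the config contains a "sample_rate" field and that you are in the directory holding the piper binary; voice_rate and speak_with_piper are hypothetical helper names.

```shell
# Sketch: read the sample rate from a Piper voice config so the player matches the model.
# Assumes the .onnx.json file contains a "sample_rate" field.
voice_rate() {
    grep -o '"sample_rate"[[:space:]]*:[[:space:]]*[0-9][0-9]*' "$1" |
        grep -o '[0-9][0-9]*$'
}

# Speak the given words through aplay at the model's own sample rate.
speak_with_piper() {
    local model="$1"; shift
    local rate
    rate="$(voice_rate "$model.json")" || return 1
    printf '%s\n' "$*" | ./piper --model "$model" --output-raw |
        aplay -r "$rate" -f S16_LE -t raw -
}
```

A call such as speak_with_piper ~/voices/en_US-lessac-medium.onnx "Playing directly" would then pick up the rate from the matching config file.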

GUI and Desktop Options for Linux TTS

Not everyone wants to use the command line. If you are a desktop Linux user looking for a graphical text to speech application, your main option is Speech Note.

Speech Note

Speech Note is a Qt-based application that provides a clean interface for speech recognition and text to speech. It supports several TTS engines, including espeak-ng for a lightweight setup and Piper for higher-quality output.

Install via Flatpak (recommended):

flatpak install flathub net.mkiol.SpeechNote

The interface lets you type or paste text, select a language and voice, and play back the result — no terminal required. It also supports reading clipboard contents, which is useful for listening to web articles or documents.

Best for: Desktop Linux users who want TTS without the command line, users who also want speech recognition capabilities in the same app.
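
If you want a launcher that fails gracefully on a machine where Flatpak or the app is missing, a wrapper along these lines may help. It is only a sketch: launch_speech_note and APP_ID are names invented for this example, while flatpak info and flatpak run are standard Flatpak subcommands.

```shell
# Sketch: start Speech Note, or print a hint if something is missing.
# launch_speech_note and APP_ID are hypothetical names for this example.
APP_ID="net.mkiol.SpeechNote"

launch_speech_note() {
    if ! command -v flatpak >/dev/null 2>&1; then
        echo "flatpak is not installed; try: sudo apt-get install flatpak" >&2
        return 1
    fi
    # flatpak info exits non-zero when the app is not installed.
    if ! flatpak info "$APP_ID" >/dev/null 2>&1; then
        echo "Speech Note is not installed; try: flatpak install flathub $APP_ID" >&2
        return 1
    fi
    flatpak run "$APP_ID"
}
```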

Tool Comparison: CLI vs GUI vs Online

| Tool | Type | Voice Quality | Ease of Setup | Best For |
|---|---|---|---|---|
| eSpeak / espeak-ng | CLI | Low (robotic) | Very easy (apt-get) | Scripts, automation, accessibility |
| Festival | CLI | Low-Medium | Easy (apt-get) | Feature-rich pipelines, customization |
| Piper | CLI | High (neural) | Medium (manual download) | High-quality offline TTS, apps |
| Speech Note | GUI | Medium-High | Easy (Flatpak) | Desktop users, non-CLI users |
| AI Listen | Online/iOS | High (AI) | None required | Cross-platform, no setup needed |



Online TTS as a Simpler Alternative

If you are not a developer and you ended up on this page because you just want to have articles or documents read aloud — Linux CLI tools may be more friction than they are worth.

For casual use, an online tool or mobile app gets you there without any terminal commands. AI Listen is a cross-platform option that converts text to natural-sounding speech instantly, supports multiple languages, and works on any device. You paste text, hit play, and you're done.

It is especially useful if you regularly switch between a Linux machine and a phone or tablet — there is no configuration to replicate across devices.

Which Tool Is Right for You?

Here is the short version:

  • You write scripts or automate things → eSpeak or espeak-ng. Install it in ten seconds, it works, and it stays out of your way.

  • You want the best possible voice quality locally → Piper. Spend 15 minutes on setup and you get neural TTS that sounds genuinely good.

  • You prefer a GUI on the desktop → Speech Note via Flatpak. No terminal required.

  • You want a mature, full-featured TTS framework → Festival, especially if you plan to customize voices or build complex audio pipelines.

  • You just want to listen to something right now, cross-platform → AI Listen. No Linux knowledge needed, works on any device.

The Linux TTS ecosystem is strong, especially with Piper raising the quality bar significantly. Pick the tool that matches your actual use case rather than the one with the most features.
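
For scripts that need to run on machines with different setups, one option is to detect whichever apt-installable engine is present and fall back in order. The sketch below is an illustration under that assumption; pick_engine and speak are hypothetical helper names, and the preference order (espeak-ng, then espeak, then festival) is a choice made for this example, not a recommendation from any of the projects.

```shell
# Sketch: speak text with the first engine found on PATH.
# pick_engine and speak are hypothetical helper names for this example.
pick_engine() {
    local engine
    for engine in espeak-ng espeak festival; do
        if command -v "$engine" >/dev/null 2>&1; then
            echo "$engine"
            return 0
        fi
    done
    return 1
}

speak() {
    local engine
    engine="$(pick_engine)" || { echo "no TTS engine found" >&2; return 1; }
    case "$engine" in
        festival) printf '%s\n' "$*" | festival --tts ;;   # Festival reads text from stdin
        *)        "$engine" "$*" ;;                        # espeak variants take text as an argument
    esac
}
```

With this in place, speak "Backup complete" works the same way whether the machine has espeak-ng, the older espeak, or only Festival.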


Frequently Asked Questions
Is text to speech free on Linux?
Yes. eSpeak, Festival, and Piper are all free and open-source. Piper requires a one-time binary download, but there are no licensing costs or subscriptions.
What is the best text to speech tool for Linux?
It depends on your use case. eSpeak is the fastest to set up and best for scripting. Piper produces the most natural-sounding voices. Speech Note is the best option for desktop users who prefer a graphical interface.
How do I install text to speech on Ubuntu?
Run sudo apt-get install espeak for eSpeak, or sudo apt-get install festival for Festival. Piper is not in the standard apt repositories: download the binary from the Piper GitHub releases page and a voice model from the Piper voices repository.
What is the difference between eSpeak and Piper?
eSpeak uses formant synthesis — it installs in seconds via apt-get but produces robotic-sounding output. Piper uses a neural model to generate natural speech, but requires a manual binary download and a separate voice model file.
Does Linux TTS work offline?
Yes. eSpeak, Festival, and Piper all work fully offline once installed and configured. AI Listen requires an internet connection for its AI-enhanced voices.
Can I use text to speech in shell scripts on Linux?
Yes. Both eSpeak and Festival accept input from stdin or a file, making them easy to integrate into automation pipelines. You can pipe text directly to either tool without writing a temp file first.
