Using Voice RSS to Make a Raspberry Pi Talk

With this post I would like to introduce a simple-to-run text-to-speech (TTS) system for the Raspberry Pi. Yes, you type it and it gets spoken, using Voice RSS to make a Raspberry Pi talk.

The basic idea behind TTS is to hand your system's output over as sentences that are spoken aloud instead of being displayed on a screen. Much more fun, right?

The instructions in this post will enable you to run the tts command in the form tts "Hello, world." and have the text spoken to you.

Background

I was experimenting with some TTS software packages and a while ago came across Steven Hickson’s PiAUISuite. It used Google Speech to take basic voice commands through a microphone and ran basic logic that could give output in various forms, speech being one of them. Sadly, most of this functionality stopped working when Google changed or withdrew its STT and TTS services around November 2015.

I’ve also experimented with other systems like eSpeak, which is very easy to set up, does not need an active internet connection and has decent overall output. The original PiAUISuite was still better.

Since I had PiAUISuite installed on a few Raspberry Pis, I thought to take its ‘tts’ command and show how to tweak it a little to at least have a good TTS ability – we will be using Voice RSS for this.

Requirements

Apart from some additional Linux packages, which we will install during this post, you’ll need a Raspberry Pi with Raspbian fully installed. The Pi must be on your local network with an internet connection, and if you don’t have a screen, keyboard and mouse attached you will need PuTTY and/or WinSCP to do the testing and coding.

A basic, free Voice RSS account – you will be able to do up to 350 requests per day.

Superuser permissions in order to do some editing.

Very Important: I haven’t written anything about sound on the Raspberry Pi yet, but you will need working sound (ALSA at least). The newest Raspberry Pi 2 has HDMI and a 3.5 mm audio jack, either of which can be used for sound. By default Raspbian should have most things installed for ALSA to work.
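
Before going further it may be worth confirming that ALSA can actually see a sound card. The following is only a sketch: it assumes the standard alsa-utils tools (aplay and speaker-test) are present, and audio_ready is a helper name made up for this example.

```shell
#!/bin/bash
# audio_ready: succeed only if ALSA lists at least one playback device.
# (Hypothetical helper name; aplay is part of the alsa-utils package.)
audio_ready() {
    command -v aplay >/dev/null 2>&1 || return 1
    # "aplay -l" prints one "card N: ..." line per playback device
    aplay -l 2>/dev/null | grep -q '^card [0-9]'
}

# Example use (plays a short spoken channel test if a card is found):
#   audio_ready && speaker-test -t wav -c 2 -l 1
```

If the function reports no card, the sound setup needs sorting out before any of the TTS steps below will produce audible output.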

Limitations

Although not always a limitation per se, this system is mainly controlled by running terminal commands, which actually makes it great for Python and Bash scripts. As mentioned above, a free Voice RSS account will only give you a maximum of 350 requests per day.
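
If a script is going to call the service often, it might help to keep a rough count so the free daily allowance is not used up by accident. The sketch below is one possible approach and not part of PiAUISuite: the function name quota_take, the variables and the state file location are all assumptions made for illustration, and the limit of 350 is simply the figure quoted in this post.

```shell
#!/bin/bash
# Hypothetical request counter for the free Voice RSS allowance.
# State file format: "<date> <count>". /dev/shm is cleared on reboot,
# so the count also restarts then; pick another path if that matters.
QUOTA_FILE=${QUOTA_FILE:-/dev/shm/tts-quota}
DAILY_LIMIT=350   # free-tier figure quoted in this post

# quota_take: record one request; fail if today's allowance is used up.
quota_take() {
    today=$(date +%Y-%m-%d)
    day=
    count=0
    [ -r "$QUOTA_FILE" ] && read -r day count < "$QUOTA_FILE"
    # a new day (or a missing/empty file) starts again from zero
    [ "$day" = "$today" ] || count=0
    [ "$count" -lt "$DAILY_LIMIT" ] || return 1
    printf '%s %s\n' "$today" "$((count + 1))" > "$QUOTA_FILE"
}

# Example use inside a script:
#   quota_take && tts "Washing machine is done"
```

The count is only as accurate as the script using it: requests made in other ways are not seen, so treat it as a guard rail and not as the real figure from your Voice RSS account.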

A Modified Version of PiAUISuite

PiAUISuite will install a nifty command called ‘tts’. This is what we’ll be using after some modification. Start by installing some packages and then PiAUISuite from GitHub by typing the following in your terminal on a freshly booted Raspbian:

sudo apt-get install git-core
sudo apt-get install mpg123

git clone https://github.com/StevenHickson/PiAUISuite.git
cd PiAUISuite/Install
./InstallAUISuite.sh

Say yes to install the dependencies – so initially you’ll be saying yes twice. Afterwards PiAUISuite will one by one try to install and set up playvideo, downloader, gvapi, gtextcommand, youtube, youtube-safe and voicecommand. We will only be needing the last one on the list, voicecommand – so say no for installing the rest.

After all the dependencies and voicecommand are installed (which can take a while), the installer will automatically prompt you to set up voicecommand. On a fresh install, no commands will be found and it will offer to set itself up. We won’t be using this, so say no. (You can do this later by using voicecommand -s.)
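
Once the installer has finished, a quick check along these lines can confirm that the commands used in the rest of this post are on the PATH. It is a minimal sketch; missing_tools is a name invented for this example.

```shell
#!/bin/bash
# missing_tools: print the name of every given command that is not on the PATH.
# (Hypothetical helper; prints nothing when everything is installed.)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Example use after the install steps above:
#   missing_tools git mpg123 wget tts
```

An empty result suggests the install went through; any name that is printed points at the package or step to repeat.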

Next we will change some code in the newly created tts file to use Voice RSS’s TTS service instead of Google’s (advice and code courtesy of Kevin and his blog Modified.systems).

To continue we will need a Voice RSS API key, so go and get one.

To edit the original tts file, use the following command from the Raspbian terminal:

sudo nano /usr/bin/tts

This opens the original code from Steven Hickson:

#!/bin/bash

#for the Raspberry Pi, we need to insert some sort of FILLER here since it cuts off the first bit of audio

string=$@
lang="en"
if [ "$1" == "-l" ] ; then
    lang="$2"
    string=`echo "$string" | sed -r 's/^.{6}//'`
fi

#empty the original file
echo "" > "/dev/shm/speak.mp3"

len=${#string}
while [ $len -ge 100 ] ;
do
    #lets split this up so that its a maximum of 99 characters
    tmp=${string:0:100}
    string=${string:100}

    #now we need to make sure there aren't split words, let's find the last space and the string after it
    lastspace=${tmp##* }
    tmplen=${#lastspace}

    #here we are shortening the tmp string
    tmplen=`expr 100 - $tmplen`
    tmp=${tmp:0:tmplen}

    #now we concatenate and the string is reconstructed
    string="$lastspace$string"
    len=${#string}

    #get the first 100 characters
    wget -q -U Mozilla -O "/dev/shm/tmp.mp3" "https://translate.google.com/translate_tts?tl=${lang}&q=$tmp&ie=UTF-8&total=1&idx=0&client=t"
    cat "/dev/shm/tmp.mp3" >> "/dev/shm/speak.mp3"
done
#this will get the last remnants
wget -q -U Mozilla -O "/dev/shm/tmp.mp3" "https://translate.google.com/translate_tts?tl=${lang}&q=$string&ie=UTF-8&total=1&idx=0&client=t"
cat "/dev/shm/tmp.mp3" >> "/dev/shm/speak.mp3"
#now we finally say the whole thing
cat "/dev/shm/speak.mp3" | mpg123 - 1>>/dev/shm/voice.log 2>>/dev/shm/voice.log

After getting your Voice RSS API key, Keven recommends replacing the entire script with the following shorter version:

#!/bin/bash
#speak the given text using Voice RSS; an optional "-l <language>" may come first
lang="en-gb"
if [ "$1" == "-l" ] ; then
    lang="$2"
    #drop "-l" and the language code, whatever its length, so only the text is left
    shift 2
fi
string="$*"

#empty the original file
: > "/dev/shm/speak.mp3"

wget -q -U Mozilla -O "/dev/shm/tmp.mp3" "http://api.voicerss.org/?key=MYAPIKEYGOESHERE&src=$string&f=22khz_16bit_mono&hl=$lang"
cat "/dev/shm/tmp.mp3" >> "/dev/shm/speak.mp3"
#now we finally say the whole thing
cat "/dev/shm/speak.mp3" | mpg123 - 1>>/dev/shm/voice.log 2>>/dev/shm/voice.log

Simply replace MYAPIKEYGOESHERE with your own key, then exit (Ctrl + X, then y) to save.
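
If you would rather not hard-code the key in /usr/bin/tts, one alternative is to read it from the environment. This is only a sketch under assumptions: VOICERSS_KEY and voicerss_url are names invented here, and the URL parameters are just the ones already used in the script above.

```shell
#!/bin/bash
# voicerss_url: print the request URL with the API key taken from the
# environment. VOICERSS_KEY is a hypothetical variable name for this sketch.
voicerss_url() {
    text=$1
    lang=${2:-en-gb}
    if [ -z "$VOICERSS_KEY" ]; then
        echo "VOICERSS_KEY is not set" >&2
        return 1
    fi
    printf 'http://api.voicerss.org/?key=%s&src=%s&f=22khz_16bit_mono&hl=%s\n' \
        "$VOICERSS_KEY" "$text" "$lang"
}

# Example use in place of the literal URL on the wget line:
#   wget -q -U Mozilla -O "/dev/shm/tmp.mp3" "$(voicerss_url "$string" "$lang")"
```

The variable would need to be exported wherever tts is run from, for example in ~/.profile, which keeps the key out of a world-readable script.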

The command can now be used from any directory without sudo like so:

tts "Hello, world"

This converts your text to speech and plays it through your default sound card and audio output. Voice RSS allows up to 10 000 characters per call.
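
The script places the text straight into the query string, so characters such as & or # could be read as part of the URL and not as text to speak. One possible way round this is to percent-encode the text first. The sketch below is not part of the script above; urlencode is a helper name introduced here, and it is intended to work byte by byte so that accented characters are encoded too.

```shell
#!/bin/bash
# urlencode: percent-encode a string for use in a URL query.
# (Hypothetical helper; runs in a subshell so the locale change stays local.)
urlencode() (
    LC_ALL=C          # treat the text as a sequence of bytes
    rest=$1
    out=
    while [ -n "$rest" ]; do
        tail=${rest#?}            # everything after the first byte
        char=${rest%"$tail"}      # the first byte itself
        rest=$tail
        case $char in
            [a-zA-Z0-9.~_-]) out=$out$char ;;                  # safe as-is
            *) out=$out$(printf '%%%02X' "'$char") ;;          # e.g. space -> %20
        esac
    done
    printf '%s\n' "$out"
)

# Example use before the wget line in /usr/bin/tts:
#   string=$(urlencode "$string")
```

With this in place a sentence such as "Tom & Jerry" should reach the service as one piece of text instead of being cut off at the ampersand.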

(If you’re having formatting troubles with the code above, just comment and I will get back to you to help.)

You can go through the Voice RSS documentation yourself and see what languages they have available, but I was happy with the quality of the default English voice.

Conclusion

With a little effort your Raspberry Pi now has an awesome, easy to use TTS function, using PiAUISuite and Voice RSS. Thank you Steven and Keven!

About the author
Renier is a veterinarian by profession, but apart from his own pets and keeping his animal hospital afloat, he also finds himself busy with creative web design and his websites, motorcycling, photoshopping, micro electronics, non-commercialised music, superhero movies, badass series and many other things that are not interesting to most people.
View all posts by Renier Delport