I was working on a project the other day and it had a requirement for speech. Based on some pre-agreed text, the application needed to play back the text to the user. These days this sort of thing is commonplace. Alexa and Siri are well-known services based on this sort of technology.

So, if you want to convert text to speech, how can you do this?

As it turns out, there are several options. The one I like the look of the most is a web service offering. The appeal of this solution is that you don’t need to worry about setting up an environment to support speech, or about performance.

Whilst this option is the one I prefer, it is not what I am going to show you today.

The project requires a stand-alone solution that can work without an internet connection.

So, the option I have gone with is a freely available solution known as FreeTTS with the addition of MBROLA and the voices supported by this engine.

The Goal

Before we make a start, let us take a look at where we are going with this. In this example we will create a small application. The application lets the user select a voice and enter some text to run through the speech engine.

There are many settings which can be altered to affect the sound produced, and for some of the key options there will be input fields to allow the sound to be altered. Given that there are so many options to configure, each time the text is run through the speech engine the current settings should be displayed.

The application works by selecting a voice from the drop-down list, typing some text into the “Text to speak” text area and then pressing the “Ok” button. The settings for the voice are displayed in the grey text area to the right of the input fields. The other input fields on the left can be used to enter values which change how the sound is produced. By using the text area on the right, the user can see what default values are used for the voice and then alter these values by typing new values in the appropriate input field on the left.
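The way typed values override a voice's defaults can be sketched in plain Java. This is only an illustration of the idea and not the application's actual code: the class name, the setting names and the default numbers are all hypothetical, and nothing here calls the FreeTTS API. Blank or invalid input fields simply leave the default in place.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class VoiceSettingsSketch {

    // Hypothetical default settings for one voice; a real engine reports its own values.
    static Map<String, Float> defaults() {
        Map<String, Float> settings = new LinkedHashMap<>();
        settings.put("rate", 150.0f);
        settings.put("pitch", 100.0f);
        settings.put("volume", 1.0f);
        return settings;
    }

    // Apply the input fields on the left: blank or invalid fields keep the default.
    static Map<String, Float> applyOverrides(Map<String, Float> defaults,
                                             Map<String, String> fields) {
        Map<String, Float> result = new LinkedHashMap<>(defaults);
        for (Map.Entry<String, String> field : fields.entrySet()) {
            String text = field.getValue() == null ? "" : field.getValue().trim();
            if (text.isEmpty() || !result.containsKey(field.getKey())) {
                continue; // nothing typed, or not a setting we know about
            }
            try {
                result.put(field.getKey(), Float.parseFloat(text));
            } catch (NumberFormatException e) {
                // Not a number: leave the default value untouched.
            }
        }
        return result;
    }

    // Build the text shown in the read-only area on the right.
    static String describe(String voiceName, Map<String, Float> settings) {
        StringBuilder sb = new StringBuilder("voice: " + voiceName + "\n");
        for (Map.Entry<String, Float> entry : settings.entrySet()) {
            sb.append(entry.getKey()).append(": ").append(entry.getValue()).append("\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("rate", "120"); // the user typed a new rate
        fields.put("pitch", "");   // left blank, so the default is kept
        System.out.print(describe("example-voice", applyOverrides(defaults(), fields)));
    }
}
```

Keeping the defaults and the overrides separate means the settings display can always show the values actually in effect, whichever fields the user has filled in.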

Why not download the application and give it a go? The zip file contains a Windows executable file. When run, the exe will unpack a JVM, the jars and the application byte code, and then run the application. If you have a virus scanner, it may scan the files as they are unpacked to ensure there is no malicious code.

In the next post we will set about building this application, so you can see what is involved should you need to build your own.
