Text-to-speech with an Express server and Angular

Voice Synthesis Part 2

I recommend reading Voice Synthesis Part 1 first (not required): Article link here

Text-to-speech (TTS) is the generation of synthesised speech from text. The technology converts text or SSML into natural-sounding audio. We will be using the Google Cloud Text-to-Speech API. Its SDK can't be used directly in a front-end app such as Angular, because it authenticates with service-account credentials that must stay on the server, so we will run it in a Node server built with Express.

Google Cloud docs: docs link

NPM package: package link

API docs: docs link

The Text-to-Speech SDK accepts two types of input, plain text and SSML, and produces audio output from either. Speech Synthesis Markup Language (SSML) is an XML-based markup language for speech synthesis applications.
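To make the two input types concrete, here is a small sketch of how a request's input could be built from either plain text or SSML. The helper name toSynthesisInput and the rule "anything wrapped in <speak>...</speak> is SSML" are assumptions made for illustration; they are not part of the SDK.

```typescript
// The `input` of a synthesis request carries either `text` or `ssml`.
type SynthesisInput = { text: string } | { ssml: string };

// Hypothetical helper (not part of the SDK): treat a string wrapped in
// <speak>...</speak> as SSML, and anything else as plain text.
function toSynthesisInput(value: string): SynthesisInput {
  const trimmed = value.trim();
  const isSsml = trimmed.startsWith("<speak") && trimmed.endsWith("</speak>");
  return isSsml ? { ssml: trimmed } : { text: value };
}

// Plain text: read out as-is.
const plainInput = toSynthesisInput("Hello, world!");

// SSML: the markup adds a half-second pause and stresses one word.
const ssmlInput = toSynthesisInput(
  '<speak>Hello <break time="500ms"/> <emphasis level="strong">world</emphasis>!</speak>'
);

console.log(plainInput, ssmlInput);
```

SSML is worth the extra markup when you need control over pauses, emphasis, or pronunciation; for simple sentences plain text is enough.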

Before we continue with development, please do the following:

  1. Select or create a Cloud Platform project.
  2. Enable billing for your project.
  3. Enable the Google Cloud Text-to-Speech API.
  4. Set up authentication with a service account.

After completing step 4 you will have a service-account key file, named google-authentication.json here.

  5. Set the environment variable GOOGLE_APPLICATION_CREDENTIALS to the path of google-authentication.json.

Setting the path on Linux:

export GOOGLE_APPLICATION_CREDENTIALS=<path-to-jsonfile>/google-authentication.json
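A missing or mistyped credentials path usually only shows up later as an authentication error, so it can help to check the variable when the server starts. The sketch below is one possible way to do that; the function name requireCredentialsPath is hypothetical and the check is optional.

```typescript
import * as fs from "fs";

// Hypothetical startup check (not part of the SDK): returns the credentials
// path, or throws a readable error if the variable is unset or the file is missing.
function requireCredentialsPath(
  env: Record<string, string | undefined> = process.env
): string {
  const credentialsPath = env.GOOGLE_APPLICATION_CREDENTIALS;
  if (!credentialsPath) {
    throw new Error("GOOGLE_APPLICATION_CREDENTIALS is not set");
  }
  if (!fs.existsSync(credentialsPath)) {
    throw new Error(`Credentials file not found: ${credentialsPath}`);
  }
  return credentialsPath;
}
```

Calling requireCredentialsPath() once before the server starts listening makes a configuration mistake fail fast with a clear message.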

Install the package:

npm install @google-cloud/text-to-speech

After installing the package, create a file in the project folder named, say, text-to-speech-api.ts.

The convertSSMLtoAudio function in this file takes a single parameter, ssml. It converts the SSML to audio and saves the result as an mp3 file. Plain text can also be used as input instead of SSML.
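As a rough illustration, a function like this could be sketched as follows. This is not the original listing. Several things here are assumptions: the client is passed in as a second argument so the sketch can be exercised without network access (in a real server it would be a TextToSpeechClient instance from @google-cloud/text-to-speech, possibly with a type cast depending on the library version), the voice settings (en-US, NEUTRAL) are taken from common quickstart usage, and the output folder and timestamp file name are illustrative choices.

```typescript
import { promises as fsp } from "fs";
import * as path from "path";

// Request shape used by this sketch; the voice values are assumptions.
interface SsmlRequest {
  input: { ssml: string };
  voice: { languageCode: string; ssmlGender: "NEUTRAL" | "FEMALE" | "MALE" };
  audioConfig: { audioEncoding: "MP3" };
}

// Minimal shape of the one client method this sketch calls.
interface SpeechClient {
  synthesizeSpeech(
    request: SsmlRequest
  ): Promise<[{ audioContent?: Uint8Array | string | null }, ...unknown[]]>;
}

function buildSsmlRequest(ssml: string): SsmlRequest {
  return {
    input: { ssml },
    voice: { languageCode: "en-US", ssmlGender: "NEUTRAL" },
    audioConfig: { audioEncoding: "MP3" },
  };
}

// Converts SSML to speech, writes the mp3 into outputDir, and returns its path.
export async function convertSSMLtoAudio(
  ssml: string,
  client: SpeechClient,
  outputDir = "audios"
): Promise<string> {
  const [response] = await client.synthesizeSpeech(buildSsmlRequest(ssml));
  if (!response.audioContent) {
    throw new Error("No audio content returned");
  }
  await fsp.mkdir(outputDir, { recursive: true });
  // Timestamp file name: an illustrative choice, not necessarily the original one.
  const filePath = path.join(outputDir, `${Date.now()}.mp3`);
  // The audio may arrive as raw bytes or as a base64-encoded string.
  const data =
    typeof response.audioContent === "string"
      ? Buffer.from(response.audioContent, "base64")
      : response.audioContent;
  await fsp.writeFile(filePath, data);
  return filePath;
}
```

To convert plain text instead, the request's input would be { text: "..." } rather than { ssml: "..." }; the rest of the request stays the same.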

The mp3 file can then be served from a resource URL, for example http://localhost:8085/audios/655498494964.mp3
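One way to get from a saved file to a URL like the one above is to expose the output folder as static files and return the matching URL to the client. The sketch below covers the URL part; the helper name audioUrlFor and the /audios mount path are assumptions based on the example URL, and the Express line appears only as a comment.

```typescript
import * as path from "path";

// In the Express app, the output folder would typically be exposed with
// Express's built-in static-file middleware, for example:
//   app.use("/audios", express.static(path.join(__dirname, "audios")));

// Hypothetical helper: maps a saved mp3 path to the URL the front end can load.
function audioUrlFor(filePath: string, baseUrl: string): string {
  const fileName = path.basename(filePath);
  return new URL(`audios/${encodeURIComponent(fileName)}`, baseUrl).toString();
}

console.log(audioUrlFor("audios/655498494964.mp3", "http://localhost:8085"));
```

The route handler that calls convertSSMLtoAudio can then respond with this URL instead of streaming the audio itself.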

Play the generated audio file in the Angular application.
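On the Angular side, playing the file comes down to pointing an audio element at the URL returned by the server. The sketch below is framework-agnostic and is not the original component code: playAudio and PlayableAudio are hypothetical names, and the audio element is created by a factory passed in by the caller. Inside an Angular component or service, that factory would usually be () => new Audio(), the browser's HTMLAudioElement constructor.

```typescript
// Minimal shape of the browser's HTMLAudioElement that this sketch relies on.
interface PlayableAudio {
  src: string;
  play(): Promise<void>;
}

// Hypothetical helper: loads the given URL into an audio element and plays it.
async function playAudio(
  url: string,
  createAudio: () => PlayableAudio
): Promise<PlayableAudio> {
  const audio = createAudio();
  audio.src = url;
  // Browsers may reject play() if it is not triggered by a user gesture.
  await audio.play();
  return audio;
}
```

A click handler in a component could then call playAudio(audioUrl, () => new Audio()), where audioUrl is the resource URL returned by the Express server.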


Suggestions on the style of this documentation are welcome.

Happy reading!
