File Exchange

speech2text

Automatic speech-to-text conversion

69 Downloads

Updated 20 Feb 2019

Automate labeling and tagging of speech recordings, assess the performance of DSP pipelines for voice and speech enhancement, run text analytics on voice recordings, and more.
This entry enables you to convert sampled speech recordings available as MATLAB vectors into strings using a single function call.
You will need a license for Audio System Toolbox, an internet connection, and an active subscription to a speech-to-text service of your choice: Google™ Cloud Speech-to-Text API, IBM™ Watson Speech to Text API, or Microsoft™ Azure Speech Services API.

Please check out the Examples tab for detailed instructions on how to get started.

Cite As

MathWorks Audio System Toolbox Team (2019). speech2text (https://www.mathworks.com/matlabcentral/fileexchange/65266-speech2text), MATLAB Central File Exchange. Retrieved .

Comments and Ratings (36)

Thanks a lot Gabriele!
I guess I will go for a C# interface to Google streaming and will most likely port some of the MATLAB skills to that application.

Hi Danni, unfortunately we are unable to make the source code of speech2text available at this point. In any case the modifications to support the streaming speech-to-text interfaces wouldn't be trivial. If you wanted to try to develop your own MATLAB wrapper for a particular web-based service, you first want to closely review the published web API of the service that you are interested in. To script and automate the requests using MATLAB, key building blocks would be the MATLAB HTTP (https://www.mathworks.com/help/matlab/http-interface.html) and JSON (https://www.mathworks.com/help/matlab/json-format.html) interfaces. Good luck!
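As a rough illustration of those building blocks, here is a minimal sketch of a single (non-streaming) request to the Google Cloud Speech-to-Text REST endpoint using the MATLAB HTTP and JSON interfaces. It is not the speech2text implementation. The API key, the file name, and the choice of 16-bit linear PCM are assumptions made for the example, and the request fields should be checked against the current Google API documentation.

```matlab
% Sketch only: apiKey and the file name are hypothetical placeholders.
apiKey  = 'YOUR_API_KEY';
[y, fs] = audioread('speech.wav');
y       = mean(y, 2);                      % collapse to a single channel

% Convert samples to 16-bit PCM bytes (assumes a little-endian machine)
% and base64-encode them, as JSON bodies cannot carry raw binary data.
pcmBytes = typecast(int16(y * 32767), 'uint8');
body = struct( ...
    'config', struct('encoding', 'LINEAR16', ...
                     'sampleRateHertz', fs, ...
                     'languageCode', 'en-US'), ...
    'audio',  struct('content', matlab.net.base64encode(pcmBytes)));

% With a JSON content type, the struct body is passed through jsonencode.
request = matlab.net.http.RequestMessage('POST', ...
    matlab.net.http.field.ContentTypeField('application/json'), ...
    matlab.net.http.MessageBody(body));
uri     = matlab.net.URI(['https://speech.googleapis.com/v1/speech:recognize?key=' apiKey]);
options = matlab.net.http.HTTPOptions('ConnectTimeout', 30);

response = send(request, uri, options);

% A JSON response is decoded into a struct; 'results' can be absent
% when the service does not detect any speech.
if response.StatusCode == matlab.net.http.StatusCode.OK && ...
        isfield(response.Body.Data, 'results')
    disp(jsonencode(response.Body.Data.results));
else
    disp('No transcription returned.');
end
```

Checking for the `results` field before reading it avoids a "Reference to non-existent field" error when the response carries no transcription.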

Thanks a lot Gabriele, that's clear. Is there any chance of getting an idea of what happens under the hood of speech2text, so that I can modify it to work in streaming mode? Thanks a lot, Dani

Hi Danni, speech2text itself was not designed to support streaming use and it doesn't leverage any purpose-built streaming interface from the cloud services provided. In some cases (e.g. breaking up sentences on the fly using some kind of VAD) passing the segments to speech2text may still yield acceptable results. However, besides the added latency, the cloud services will transcribe each segment in isolation.
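The segment-by-segment approach could be sketched as follows. This is an offline illustration rather than a true streaming setup, and it assumes the `detectSpeech` function from Audio Toolbox (available in R2020a and later) as the VAD; the file name is a hypothetical placeholder.

```matlab
% Sketch only: the file name is hypothetical, and detectSpeech requires
% a newer release than the one this entry was created with.
[y, fs] = audioread('long-recording.wav');
y = mean(y, 2);                               % speech2text expects a vector

speechObject = speechClient('Google', 'languageCode', 'en-US');

% Each row of segments holds the [start end] sample indices of one
% detected speech region.
segments = detectSpeech(y, fs);

for k = 1:size(segments, 1)
    segment  = y(segments(k, 1):segments(k, 2));
    tableOut = speech2text(speechObject, segment, fs);
    disp(tableOut)
end
```

Every call is an independent request, so context is not carried over between segments and each request adds its own round-trip latency.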

Hi Gabriele, is it possible to use the speech2text environment to run the Google API in streaming mode, meaning transmitting samples from the microphone to Google and receiving the results back in real time?

Hi Piyush, please review the "Examples" tab of this page - "Perform Speech-to-Text using 3rd party Speech APIs" should already include all detailed steps and code samples. If you feel anything is missing, we would be grateful if you could tell us in detail what that is. Thanks in advance!

Can anyone tell me, step by step, how to create the required objects and JSON files to use the Google voice API,
so that I can use speech2text?

Feifan Jia

Hi Adnan,
Please take a look at the main example provided. Beyond the instructions on how to get things installed, under "Perform Speech-to-Text Transcription" you will find code that shows how to load a pre-recorded speech segment from a file and how to use speech2text to get a transcription.

How do I use the speech2text algorithm in MATLAB for the first time? Could someone please answer urgently?
I have an audio file that I want to convert into text. Any ideas, please?

Never mind, I have realised my mistake and corrected it. I needed to add the path to the downloaded speech2text folder, not the compiled object.
Many thanks. :)

Hello Gabriele, thank you for your quick reply.
I have tried adding the file manually following Raja's post, however either I have not understood or it has not worked as the problem remains. I tried both:
" addpath('C:\Program Files\MATLAB\R2019a\toolbox\audio\audio\compiled\') " and
" addpath(genpath('C:\Program Files\MATLAB\R2019a\toolbox\audio\audio\compiled\')) ". Have I misunderstood something?

Hello Oliver, thank you for reaching out.
We have identified an issue with the add-on installation, which prevents the submission folders from being added to the MATLAB search path.
We will aim to fix the issue in the upcoming update. In the meantime, please manually add all submission folders to the MATLAB path (add the top-level folder and include all subfolders). You may refer to Raja's post below for more info on this topic.
Thanks.
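
A minimal sketch of that workaround, assuming the submission was unpacked to the hypothetical folder shown below (replace it with the actual location on your machine):

```matlab
% Hypothetical location of the downloaded speech2text submission.
s2tFolder = 'C:\Users\me\Documents\MATLAB\speech2text';

if exist(s2tFolder, 'dir') == 7
    % genpath expands the top-level folder to include all its subfolders.
    addpath(genpath(s2tFolder));
else
    warning('Folder not found: %s', s2tFolder);
end

% Show which speech2text file MATLAB now resolves to.
which speech2text
```

Calling `savepath` afterwards keeps the change for future sessions, provided you have write access to the path definition file.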

I am trying to use the speech2text but keep getting the error below.
"Error using speechClient
Unable to access speech2text. Make sure the file is
installed. Go to File Exchange to download. For more
information, click here."
I have confirmed the speech2text add on is installed and "which speech2text" returns the sensible answer "C:\Program Files\MATLAB\R2019a\toolbox\audio\audio\compiled\speech2text.p". Does anyone have any ideas why this isn't working?

Hello. I keep getting an error. I'm using the Google Speech Recognition API, but every time I try to run it I get this error message:
Error using coder.internal.error (line 14)
Unable to access speech2text. Make sure the file is
installed. Go to File Exchange to download. For more
information, click here.

Error in speechClient

Error in speechtest (line 1)
speechObject = speechClient('Google','languageCode','en-US');

Hello Grayson,

The downloaded speech2text files may not be on your MATLAB Search path (https://www.mathworks.com/help/matlab/matlab_env/what-is-the-matlab-search-path.html)

Please addpath your downloaded speech2text folder (https://www.mathworks.com/help/matlab/ref/addpath.html) or cd to it before running the speech2text commands.

Hope this helps.

Hi there,
I am trying to use Google's Speech recognition API, but every time I try to make a speechClient I get this error:
Error using coder.internal.error (line 14)
Unable to access speech2text. Make sure the file is installed. Go to File Exchange to download. For more information, click here.

Error in speechClient

I've checked multiple times and speech2text is definitely installed. I also do have the audio system toolbox installed. Any idea what I'm doing wrong?
Thanks in advance.

Hello Oliver,

It looks like you are using the ‘languageCode’ to pass the model name for IBM, but you would need to pass it using ‘model’, something like:

transcriber = speechClient('IBM','model','es-ES_NarrowbandModel');

This is the expected Name-Value as mentioned in the IBM documentation - https://www.ibm.com/watson/developercloud/speech-to-text/api/v1/curl.html?curl#recognize

Hope this helps!

Oliver Sell

Hello,

I am using the IBM Watson Speech API. When I run this function I get the following error:

'Bad Request' 400 'This 8000hz audio input requires a narrow band model. See https://<STT_API_ENDPOINT>/v1/models for a list of available models.'

I tried 'en-US_NarrowbandModel' as 'languageCode' but it still does not work. How can I pass the variable 'model' 'en-US_NarrowbandModel' or change the model?

Thank you in advance.

Priyal Goel

Hello, I get the following error when I run:
[y,fs] = audioread('youre-on-the-right-track.wav');
speechObject = speechClient('Microsoft','recognition','interactive','language','en-US');
tableOut = speech2text(speechObject,y,fs)

Output argument "tableOut" (and maybe others) not assigned during call to "speechClient/speechTotext".

Error in speech2text

Error in sampleTesting (line 4)
tableOut = speech2text(speechObject,y,fs)

I'm using the Microsoft Azure Bing API
Can someone please help me with this?

@Sumit Mondal - Thank you for reporting this. It looks like this error was triggered by the absence of a license for Audio System Toolbox, which is required by speech2text. The lack of clarity of the actual error message will be fixed in an upcoming update.

The errors I get when trying to run the speech2text() function are the following :
Unable to find message key 'noAudio' in catalog 'signal:sigtools'.

Error in speechClient.checkoutASTLicense

Error in speechClient/speechTotext

Error in speech2text

Does anyone know what the problem could be ?

Is there any way to enable word time onsets and offsets in the Google API? See: https://cloud.google.com/speech-to-text/docs/async-time-offsets#speech-async-recognize-gcs-python

On the frequently encountered error "Expected input to be a vector" - Please note that the second input argument y of the speech2text function needs to be either a column or a row vector, i.e. an array having one of its dimensions equal to 1. It is very common for audio recordings to be stored in stereo format, so you may want to check the size of your audio array before using speech2text, for example by looking at your MATLAB workspace. If your audio array has multiple channels (typically resulting in a number of columns greater than 1), you need to select only one of them. Good options for stereo signals include either the left channel, i.e. y = readAudio(:,1), the right channel, i.e. y = readAudio(:,2), or their average across channels, i.e. y = mean(readAudio,2)
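
For example, the check and the channel averaging described above could look like this (the file name is a hypothetical placeholder):

```matlab
% Hypothetical file name; any mono or multichannel recording works.
[readAudio, fs] = audioread('recording.wav');

% audioread returns one column per channel.
if size(readAudio, 2) > 1
    y = mean(readAudio, 2);     % average across channels
else
    y = readAudio;              % already a single channel
end

% y is now a column vector, as speech2text requires.
assert(isvector(y));
```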

thank you I have emailed you @gabriele

@Sunaina Aytan - Thank you for getting in touch. Please send more information on the error you are getting, including full reproduction steps, using https://www.mathworks.com/products/audio-system/expert-contact.html

hi i keep getting this error please help

Error using speechClient/speechTotext
Expected input to be a vector.

Error in speech2text

sneha madre

Sam, you are saving the JSON file incorrectly. The JSON file for IBM should contain only the "username" and "password" obtained from your IBM Speech API account. Also, don't forget to include the braces: "{" at the beginning and "}" at the end of your JSON file.

In the downloaded folder, you should see 'writing_IBM_JSON.png' inside the HTML sub-folder. This image will help you with writing the JSON file for the IBM API.

Hope this helps!
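
As a sketch of one way to produce such a file from MATLAB as plain text (rather than with `save`, which writes a binary MAT-file): the file name and credential values below are hypothetical placeholders, so use the file name given in the instructions under the Examples tab.

```matlab
% Placeholder credentials; replace with the values from your IBM account.
creds = struct('username', 'YOUR_USERNAME', 'password', 'YOUR_PASSWORD');

% Hypothetical file name.
jsonFile = 'IBM_credentials.json';

% Write the JSON text directly; jsonencode adds the enclosing braces.
fid = fopen(jsonFile, 'w');
fprintf(fid, '%s', jsonencode(creds));
fclose(fid);

% Read the file back to confirm that it parses as JSON.
decoded = jsondecode(fileread(jsonFile));
disp(decoded)
```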

Sam Cocks

When creating the .json file using IBM, I get an error on the jsondecode function where it is reading the save information of the file first:

Error using jsondecode
JSON syntax error at line 1, column 1 (character 1): expected value but found 'MATLAB'.

Opening my json file in matlab gives this as my first row:

MATLAB 5.0 MAT-file, Platform: MACI64, Created on: Wed

Am I saving my json incorrectly?

khcy82dyc

Another issue: when I try to process larger files, MATLAB returns "The connection to URL 'https://speech.googleapis.com/v1/speech:recognize?key=XXXXXX' timed out after 10 seconds. Set HTTPOptions.Timeout to a higher value." However, I don't have the option to set the timeout value, as they are protected .m files.

khcy82dyc

Ah, I figured it out. This happens when the Google API can't detect any speech.

khcy82dyc

Hiya, I've got this error when I run:

[samples,fs] = audioread('handel.wav');
speechObject = speechClient('Google','languageCode','en-US');
tableOut = speech2text(speechObject,samples,fs)

Reference to non-existent field 'results'.

Error in speechClient/showOutput

Error in speechClient/googleAPI

Error in speechClient/speechTotext

Error in speech2text

Error in speech2textconvert (line 8)
tableOut = speech2text(speechObject,samples,fs)

Excellent One.

Updates

1.1.5.0

Addressed compatibility issues in older MATLAB releases (R2017a and R2017b)

1.1.4.0

Added support for new authentication schemes for IBM and Microsoft APIs.

1.1.3.0

Corrected path update on install

1.1.2.0

Improved handling of errors and lack of data in responses when using Microsoft API.

1.1.1.0

Updates for changes to IBM API

1.1.0.0

Added files under Files/en to enable cmd line help for p-coded files.

1.1.0.0

Added HTTPTimeOut option to allow using longer speech recordings.
Added error message to better handle a scenario where an HTTP request is successful but the API does not return any transcription data

MATLAB Release Compatibility
Created with R2018b
Compatible with R2017a to any release
Platform Compatibility
Windows macOS Linux
