Convert Speech to Text in Python


Hello friends, today I am going to show you how to create your own speech recognition program that converts your voice into text. It may surprise you how far things have come: today our devices listen to us and respond to commands given by voice. In the past this was not possible, and now it is.

This tutorial is about speech recognition in Python. We will learn the concept of speech recognition and its implementation in Python. So let’s start.

Technology changes quickly and keeps gaining new features, and speech recognition is one of them. It has evolved rapidly over the past few years and has become one of the most popular features in the computer world.

It has numerous applications: it can boost convenience, enhance security, and help law enforcement efforts, to name a few. Let’s start by understanding the concept of speech recognition, how it works, and its applications.

What is Speech Recognition?

·      Speech recognition is a process in which a computer or device records human speech and converts it into text.
·      It is also known as automatic speech recognition (ASR), computer speech recognition, or speech to text (STT).
·      Linguistics, computer science, and electrical engineering are some of the fields associated with speech recognition.

Working of Speech Recognition

Now let’s discuss how a speech recognition program works, so that we understand the concept behind the Python program we are going to write.

It is based on acoustic modeling and language modeling. So the first question that comes to mind is: what are acoustic and language modeling?

·      Acoustic modeling represents the relationship between linguistic units of speech and audio signals.
·      Language modeling matches sounds with word sequences to help distinguish between words that sound similar.
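To make the language modeling idea concrete, here is a tiny sketch of how word-sequence probabilities can separate two candidates that sound alike. The probabilities in BIGRAM_PROB are made-up numbers for illustration only, and the names log_score and pick_best are hypothetical helpers, not part of any library. Real recognizers learn these values from large amounts of text.

```python
import math

# Made-up bigram probabilities P(next word | previous word), illustration only.
BIGRAM_PROB = {
    ("<s>", "write"): 0.02,
    ("<s>", "right"): 0.01,
    ("write", "a"): 0.20,
    ("right", "a"): 0.001,
    ("a", "letter"): 0.01,
}
UNSEEN = 1e-6  # small floor for word pairs that are not in the table


def log_score(words):
    """Sum the log-probabilities of each adjacent word pair in the sentence."""
    tokens = ["<s>"] + list(words)  # "<s>" marks the start of the sentence
    return sum(
        math.log(BIGRAM_PROB.get(pair, UNSEEN))
        for pair in zip(tokens, tokens[1:])
    )


def pick_best(candidates):
    """Return the candidate word sequence with the highest score."""
    return max(candidates, key=log_score)


# "right" and "write" sound the same; the language model prefers "write a letter".
candidates = [["right", "a", "letter"], ["write", "a", "letter"]]
print(" ".join(pick_best(candidates)))
```

The acoustic model would propose both candidates because they sound the same, and the language model breaks the tie by asking which word sequence is more likely.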

Any speech recognition program is evaluated on two factors:

·      Accuracy (percentage error in converting spoken words to digital data).
·      Speed (extent to which the program can keep up with a human speaker).
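One common way to put numbers on these two factors is the word error rate (WER) for accuracy and the real-time factor for speed. The sketch below is only an illustration of those two measures, not something the libraries in this tutorial provide; the function names word_error_rate and real_time_factor are my own.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # previous[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)


def real_time_factor(processing_seconds, audio_seconds):
    """Values below 1.0 mean the program keeps up with the speaker."""
    return processing_seconds / audio_seconds


# One wrong word out of four gives a word error rate of 0.25.
print(word_error_rate("turn on the light", "turn of the light"))
# 2 seconds of processing for 4 seconds of audio gives 0.5.
print(real_time_factor(2.0, 4.0))
```

A lower word error rate means better accuracy, and a real-time factor under 1.0 means the program processes speech faster than it is spoken.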


The most frequent applications of speech recognition are the following:

·      Car systems
·      Medical documentation and therapeutic use
·      High-performance fighter aircraft, helicopters, and training of air traffic controllers
·      Telephony and other domains
·      Education and daily life
·      Assistance for people with disabilities

Speech Recognition in Python

Before we start the project, the first question that comes to mind is how to implement and use speech recognition in it. Don’t worry, I will explain it below with a program written in Python that runs on Linux or Windows.

Implementing speech recognition in Python is easy. We will use two libraries: SpeechRecognition and PyAudio.


Now create a new project folder and give it any name you like; mine is MY_Project_Python. After that, create a Python file inside the folder.

Using Libraries in Python Program

We have to install two libraries to implement speech recognition:

·      SpeechRecognition
·      PyAudio

Installing SpeechRecognition

·      Go to the terminal and type

$ pip install SpeechRecognition

SpeechRecognition is a library that helps perform speech recognition in Python. It supports several engines and APIs, online and offline, for example Google Cloud Speech API, Microsoft Bing Voice Recognition, and IBM Speech to Text.

Installing PyAudio

·      Go to the terminal and type

$ pip install pyaudio
PyAudio provides Python bindings for PortAudio, the cross-platform audio I/O library. With PyAudio, you can easily use Python to play and record audio on a variety of platforms, such as GNU/Linux, Microsoft Windows, and Apple Mac OS X / macOS.
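After installing, it can help to confirm that Python can actually find both packages. Here is a small sketch that uses only the standard library; is_installed is a hypothetical helper name of mine. Note that the import names are speech_recognition and pyaudio, which differ from the names used with pip.

```python
import importlib.util


def is_installed(module_name):
    """Return True if Python can locate the module, without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Import names, not the pip package names.
for module_name in ("speech_recognition", "pyaudio"):
    status = "installed" if is_installed(module_name) else "missing"
    print("{}: {}".format(module_name, status))
```

If either line reports missing, run the matching pip command again, making sure you use the same Python environment that runs your program.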

Speech Recognition Program

Now let’s move on to the coding part.

This is the code for speech recognition in Python. As you can see, it is quite simple.

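import speech_recognition as sr

data = sr.Recognizer()                 # recognizer that will process our voice
with sr.Microphone() as source:        # use the default microphone as the source
    print("Speak what you want in text form :")
    audio = data.listen(source)        # record until the speaker pauses

try:
    data_txt = data.recognize_google(audio)   # send the audio to Google's web speech API
    print("You said : {}".format(data_txt))
except sr.UnknownValueError:
    # raised when the speech could not be understood
    print("Sorry could not recognize your voice, speak again!!!")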

Explanation of code

So now let’s understand the code line by line:

·      First of all, we import speech_recognition as sr.
·      Notice that the module is imported as speech_recognition, whereas we installed it as SpeechRecognition. The names are case sensitive, so watch the spelling and capitalization.
·      We use the as notation because writing out speech_recognition every time is tedious.
·      Next we initialize data = sr.Recognizer(), which acts as the recognizer for our voice.
·      The line with sr.Microphone() as source: sets sr.Microphone as our audio source. We could also convert audio files into text, but in this tutorial I am using the microphone.
·      Next we print a simple statement that prompts the user to speak.
·      Then data.listen(source) listens to the source and stores the recording in audio.
·      Sometimes the audio is not clear and cannot be recognized correctly, so we put the recognition step inside a try and except block.
·      Inside the try block we write data_txt = data.recognize_google(audio). There are other options such as recognize_bing(), recognize_google_cloud(), recognize_ibm(), etc., but here I am using recognize_google(). We pass our audio to it.
·      This converts our audio into text.
·      Now we just print("You said : {}".format(data_txt)), which prints whatever you said.
·      In the except block we write print("Sorry could not recognize your voice, speak again!!!"), which tells you that your voice was not recognized clearly.
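The explanation above mentions that an audio file can be used instead of the microphone. Here is a minimal sketch of that variant, assuming a WAV, AIFF, or FLAC file. The function name transcribe_file and the file name speech.wav are placeholders of mine, and the library is imported inside the function only so that the file check runs first.

```python
import os


def transcribe_file(path):
    """Return the recognized text for an audio file, or None if unintelligible."""
    if not os.path.isfile(path):
        raise FileNotFoundError("No such audio file: {}".format(path))

    # Imported here so the path check above works even without the library.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.AudioFile(path) as source:       # WAV, AIFF, or FLAC input
        audio = recognizer.record(source)    # read the whole file into memory
    try:
        return recognizer.recognize_google(audio)
    except sr.UnknownValueError:
        return None                          # speech could not be understood
```

You would call it as transcribe_file("speech.wav") and print the result. A network problem raises sr.RequestError, which this sketch deliberately lets propagate so that you notice it.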


When you run the above code, it prints the prompt, waits for you to speak, and then prints "You said : " followed by the recognized text.

So, is your Python program working? I think it will work fine. I am enjoying explaining all of this to you in my tutorial.
One thing to remember: if possible, use a proper microphone when speaking, because with a poor mic your voice may come through unevenly.

You could use a headphone mic too. You can also find apps on the Play Store that let you use your Android phone as a mic to capture commands or your voice for further processing.

If you do not understand the concept of this program or your program does not work properly, leave a comment below, friends, and I will answer all of your questions.

If you like this tutorial or it is helpful to you, then please share it with your friends.

At last, your speech recognition program in Python is complete and working. That’s all for today; I hope you all liked my tutorial on speech recognition in Python.
