Speech Recognition Technology
May 20, 2009
Speech recognition technology converts spoken words into input that machines can understand. Its main goal is to convert a speech signal accurately and efficiently into a text message, independent of the speaker, the device and the environment. Its main applications include voice dialling, call routing, data entry and document preparation. Because speech recognition was introduced with the aim of eliminating written text, it was not readily accepted at first.
Speech recognition is an alternative to the conventional way of interacting with a computer, namely typing text on a keyboard. An effective system can replace or reduce reliance on a standard keyboard and mouse. This makes the technology useful for people who have limited keyboard skills or are slow typists. It is also useful for people with dyslexia, who may have difficulty with words or characters. Finally, it is a boon for people with physical disabilities. The technology has now been incorporated into many recent satellite navigators, laptops, mobile phones and similar devices, so that you can control your unit with voice commands.
A speech recognition system comprises four major parts: a microphone for the person to speak into, speech recognition software that converts speech into a machine-readable form, a computer that interprets the data, and a sound card for input and output.
The crux of this technology is the software, which translates spoken words into machine-readable text. The software breaks the words down into phonemes, the basic speech sounds that make up a word. These phonemes are then analysed to determine which entry in the software's dictionary of phoneme sequences best fits the input. For this reason, it is advisable to train the system before using it, so that it can learn the user's speed and pitch. Even so, the user needs to speak in a clear and measured manner for the system to comprehend correctly.
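The dictionary lookup described above can be illustrated with a minimal sketch. It assumes the audio has already been turned into a list of phoneme symbols, and it picks the dictionary word whose phoneme sequence needs the fewest edits to match. The tiny dictionary, the ARPAbet-style symbols and the function names are all hypothetical and chosen for illustration. Real recognisers use statistical acoustic and language models rather than a plain edit distance.

```python
# Toy pronunciation dictionary: word -> phoneme sequence.
# The entries and symbols are illustrative assumptions, not a real lexicon.
PRONUNCIATIONS = {
    "call": ["K", "AO", "L"],
    "dial": ["D", "AY", "AH", "L"],
    "mail": ["M", "EY", "L"],
}


def edit_distance(a, b):
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, start=1):
        curr = [i]
        for j, pb in enumerate(b, start=1):
            cost = 0 if pa == pb else 1
            curr.append(min(prev[j] + 1,          # drop a phoneme from a
                            curr[j - 1] + 1,      # insert a phoneme from b
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def best_match(phonemes, dictionary=PRONUNCIATIONS):
    """Return the dictionary word whose phonemes are closest to the input."""
    return min(dictionary, key=lambda w: edit_distance(phonemes, dictionary[w]))
```

For example, `best_match(["D", "AY", "L"])` picks "dial" even though one phoneme was dropped, which is roughly why a slightly slurred word can still be recognised while a heavily distorted one may not.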
Like many other technologies, speech recognition has come a long way from where it started but is still far from perfect, and several challenges remain. The biggest is the voice matching system, which plays a major role in the translation process. It compares the user's utterance with the entries available in its database, so only words that are in the database will be recognised. Because words are broken down into phonemes, which are sounds, background noise can interfere with the system's ability to comprehend them. Language also keeps evolving, which makes it difficult to keep the system's database up to date. Finally, most users speak informally, and the device may not be able to comprehend some colloquial terms.
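The limitation that only words in the database are recognised can be sketched as follows. This is a simplified illustration that compares text strings using Python's standard `difflib` module as a stand-in for acoustic matching. The vocabulary, the `recognise` function and the 0.6 similarity cutoff are assumptions made for the example.

```python
import difflib

# Hypothetical command vocabulary standing in for the system's database.
VOCABULARY = ["call", "home", "office", "navigate"]


def recognise(heard, vocabulary=VOCABULARY, cutoff=0.6):
    """Return the closest vocabulary word, or None if nothing is close enough.

    `heard` is a rough textual guess at what was said; anything whose
    similarity to every vocabulary entry is below `cutoff` is rejected.
    """
    matches = difflib.get_close_matches(heard, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None
```

A slightly distorted input such as `recognise("hme")` still maps to "home", whereas an input that resembles nothing in the vocabulary returns `None`. New or colloquial words fall into the second case until the database is updated to include them.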
Only by overcoming these major challenges will speech recognition technology be able to comprehend and interpret spoken input accurately.