Speech Recognition

Definition:
Speech Recognition /spiːʧ ˌrɛk.əɡˈnɪʃ.ən/ noun: In artificial intelligence and computational linguistics, speech recognition is the process of automatically converting spoken language into written text. It lets machines transcribe human speech by analyzing audio signals and mapping them to words using acoustic, phonetic, and language models. Interpreting what the transcribed words mean is a separate, downstream task.

Speech recognition systems typically involve:

Audio signal processing to clean and normalize voice input
Feature extraction to identify key sound characteristics
Acoustic modeling to relate audio features to phonemes
Language modeling to ensure grammatical coherence and context
Decoding algorithms to match speech patterns to text outputs
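
The first two stages above can be illustrated with a minimal Python sketch. It is not a production front end: it assumes audio arrives as a plain list of float samples at 16 kHz, uses peak normalization as the only cleanup step, and uses per-frame log energy as a stand-in for richer features such as MFCCs. The function names, frame length, and hop size are all illustrative choices, not part of any particular library.

```python
import math

def normalize(samples):
    """Peak-normalize so the largest absolute sample value is 1.0."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # silence: nothing to scale
    return [s / peak for s in samples]

def frame_signal(samples, frame_len, hop):
    """Split the signal into overlapping frames; a trailing partial frame is dropped."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def log_energy(frame, floor=1e-10):
    """Log of the mean squared amplitude, floored to avoid log(0) on silence."""
    energy = sum(s * s for s in frame) / len(frame)
    return math.log(max(energy, floor))

def extract_features(samples, frame_len=400, hop=160):
    """One log-energy value per frame.

    400 and 160 samples correspond to 25 ms windows with a 10 ms hop
    at the assumed 16 kHz sampling rate.
    """
    cleaned = normalize(samples)
    return [log_energy(f) for f in frame_signal(cleaned, frame_len, hop)]

# Example input: 0.1 s of a 440 Hz tone sampled at 16 kHz (1600 samples).
tone = [0.5 * math.sin(2 * math.pi * 440 * n / 16000) for n in range(1600)]
features = extract_features(tone)
```

A real system would pass a sequence of such per-frame feature vectors to the acoustic model; a single energy value per frame is used here only to keep the sketch short.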

Applications include:

Virtual assistants (e.g., Siri, Alexa, Google Assistant)
Voice search and commands
Real-time transcription and subtitles
Speech-to-text for accessibility tools
Automated call centers and dictation software

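Modern speech recognition relies on deep learning. Common architectures include Recurrent Neural Networks (RNNs) and Transformers, and many systems are trained with Connectionist Temporal Classification (CTC), a training objective rather than a model architecture, which lets a network learn from audio paired with transcripts without frame-by-frame alignments. These methods have improved accuracy across languages, accents, and noisy environments.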
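
To make the CTC idea concrete, here is a minimal Python sketch of greedy CTC decoding: take the highest-scoring label in each frame, collapse consecutive repeats, then drop the blank symbol. The function name, the toy alphabet, and the convention that index 0 is the blank are assumptions for illustration; real systems often use beam search with a language model instead of this greedy rule.

```python
def ctc_greedy_decode(frame_scores, alphabet, blank=0):
    """Greedy CTC decoding.

    frame_scores: one list of per-label scores for each audio frame.
    alphabet:     label for each index; alphabet[blank] is the CTC blank.
    """
    # Pick the best-scoring label index in every frame.
    best_path = [max(range(len(scores)), key=scores.__getitem__)
                 for scores in frame_scores]

    decoded = []
    previous = None
    for index in best_path:
        # Emit a label only when it differs from the previous frame
        # (collapsing repeats) and is not the blank symbol.
        if index != previous and index != blank:
            decoded.append(alphabet[index])
        previous = index
    return "".join(decoded)

# Toy alphabet: index 0 is the blank, written "-" here.
alphabet = ["-", "c", "a", "t"]

def one_hot(index, size=4):
    """Hypothetical helper: a score vector that puts all weight on one label."""
    return [1.0 if i == index else 0.0 for i in range(size)]

# Frame-level best path "cc-aa-t" collapses to "cat".
scores = [one_hot(i) for i in [1, 1, 0, 2, 2, 0, 3]]
text = ctc_greedy_decode(scores, alphabet)
```

The blank is what allows genuinely doubled letters: the path "a-a" decodes to "aa", while "aa" with no blank in between collapses to a single "a".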
