Definition:
Speech Recognition /spiːʧ ˌrɛk.əɡˈnɪʃ.ən/ (noun): In artificial intelligence and computational linguistics, speech recognition is the process of automatically converting spoken language into written text. It lets machines transcribe human speech by analyzing audio signals and mapping them to words using acoustic, phonetic, and language models.
Speech recognition systems typically involve:
- Audio signal processing to clean and normalize voice input
- Feature extraction to identify key sound characteristics
- Acoustic modeling to relate audio features to phonemes
- Language modeling to favor word sequences that are plausible given grammar and context
- Decoding algorithms to search for the most likely text given the acoustic and language model scores
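The first two stages in the list above can be sketched in a few lines. The following is a minimal illustration only, assuming 16 kHz audio held as a list of floats, 25 ms frames (400 samples) with a 10 ms hop (160 samples), and log energy as the single feature per frame. The function names and parameter values are hypothetical choices for this example; production systems usually compute richer features such as mel filterbank energies or MFCCs.

```python
import math

def pre_emphasis(samples, coeff=0.97):
    """Boost high frequencies: y[n] = x[n] - coeff * x[n-1]."""
    if not samples:
        return []
    return [samples[0]] + [
        samples[i] - coeff * samples[i - 1] for i in range(1, len(samples))
    ]

def frame_signal(samples, frame_len, hop_len):
    """Split samples into overlapping frames; a trailing partial frame is dropped."""
    return [
        samples[start:start + frame_len]
        for start in range(0, len(samples) - frame_len + 1, hop_len)
    ]

def log_energy(frame, eps=1e-10):
    """Log of the summed squared amplitude; eps avoids log(0) on silence."""
    return math.log(sum(x * x for x in frame) + eps)

def extract_features(samples, frame_len=400, hop_len=160):
    """Signal processing (pre-emphasis) followed by per-frame feature extraction."""
    emphasized = pre_emphasis(samples)
    return [log_energy(f) for f in frame_signal(emphasized, frame_len, hop_len)]

# Example input (assumed): one second of a 440 Hz tone sampled at 16 kHz.
tone = [math.sin(2 * math.pi * 440 * n / 16000) for n in range(16000)]
features = extract_features(tone)
```

With these assumed settings, one second of audio yields 98 frames, each reduced to one number. An acoustic model would then consume such a feature sequence rather than the raw waveform.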
Applications include:
- Virtual assistants (e.g., Siri, Alexa, Google Assistant)
- Voice search and commands
- Real-time transcription and subtitles
- Speech-to-text for accessibility tools
- Automated call centers and dictation software
Modern speech recognition relies on deep learning. Neural architectures such as Recurrent Neural Networks (RNNs) and Transformers, often trained with objectives such as Connectionist Temporal Classification (CTC), have improved accuracy across languages, accents, and noisy environments.
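To make the CTC idea concrete, here is a minimal sketch of greedy CTC decoding. It assumes a network has already produced one score per symbol for every audio frame, and that index 0 is reserved for the CTC "blank" symbol. The function name, the toy alphabet, and the scores are hypothetical; real systems often use beam search with a language model instead of this greedy rule.

```python
def ctc_greedy_decode(frame_scores, alphabet, blank=0):
    """Pick the best symbol per frame, merge consecutive repeats, drop blanks."""
    best_path = [
        max(range(len(scores)), key=scores.__getitem__) for scores in frame_scores
    ]
    decoded = []
    previous = None
    for idx in best_path:
        # Emit a symbol only when it differs from the previous frame and is not blank.
        if idx != previous and idx != blank:
            decoded.append(alphabet[idx])
        previous = idx
    return "".join(decoded)

# Toy example: "_" stands for the blank symbol at index 0.
alphabet = ["_", "c", "a", "t"]
frame_scores = [
    [0.1, 0.7, 0.1, 0.1],  # c
    [0.1, 0.6, 0.2, 0.1],  # c (repeat, merged)
    [0.8, 0.1, 0.05, 0.05],  # blank
    [0.1, 0.1, 0.7, 0.1],  # a
    [0.1, 0.1, 0.6, 0.2],  # a (repeat, merged)
    [0.9, 0.05, 0.03, 0.02],  # blank
    [0.1, 0.1, 0.1, 0.7],  # t
]
transcript = ctc_greedy_decode(frame_scores, alphabet)  # "cat"
```

The blank symbol is what lets the output contain genuine double letters: a path such as c, blank, c decodes to "cc", while c, c decodes to a single "c".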

