Speech Processing Model

Gnani.ai Launches Indic Speech-To-Text Model Under IndiaAI Mission

Gnani.ai has launched Vachana STT, a speech-to-text model built for Indian languages, under the IndiaAI Mission. The startup ...

11d

Stop using ChatGPT for everything: I use these AI models for research, coding, and more (and which I avoid)

Obsessing over model version matters less than workflow.

Earth.com

Our brain processes speech in layers, much like AI language models

Brain activity during speech follows a layered timing pattern that matches large language model steps, showing how meaning builds gradually.

winbuzzer.com

ElevenLabs Launches Scribe v2 Realtime Speech-To-Text Model with Ultra-low Latency

AI voice startup ElevenLabs today launched its Scribe v2 and Scribe v2 Realtime speech-to-text models designed for live, interactive applications. Scribe v2 delivers the highest possible accuracy in ...

Neuroscience News

Speeding Up Speech Doesn’t Speed Up Brain Processing

Summary: New research shows that when people listen to speech at different speeds, the auditory cortex does not adjust its timing but instead processes sound in a fixed time window. This discovery ...

Inc

OpenAI Just Announced GPT-Realtime, Its Most Advanced Voice AI Model Yet

OpenAI launched the Realtime API in beta in October 2024. The API, which uses the same technology as ChatGPT’s advanced voice mode, enables software developers to create voice-based AI assistants that ...

MIT Technology Review

AI text-to-speech programs could “unlearn” how to imitate certain people

New research shows models can be directly edited to hide selected voices, even when users specifically ask for them. A technique known as “machine unlearning” could teach AI models to forget specific ...

inc42

Sarvam Unveils New Speech AI Model With 11 Indian Languages Support

GenAI startup Sarvam AI, recently selected by the Centre to build India’s first homegrown LLM, has unveiled a new speech AI model that supports 11 Indian languages, including Punjabi, Marathi, Odia, ...

winbuzzer.com

Nvidia Releases High-Speed Parakeet AI Speech Recognition Model, Claims Top Spot on Leaderboard

Nvidia has entered the open-source speech recognition arena with Parakeet-TDT-0.6B-v2, an automatic speech recognition (ASR) model now hosted on Hugging Face. Beyond its accuracy ranking, Nvidia ...

IEEE

Mamba in Speech: Towards an Alternative to Self-Attention

Abstract: Transformer and its derivatives have achieved success in diverse tasks across computer vision, natural language processing, and speech processing. To reduce the complexity of computations ...
