Speech Processing Model

Gnani.ai Launches Indic Speech-To-Text Model Under IndiaAI Mission

Gnani.ai has launched Vachana STT, a speech-to-text model built for Indian languages, under the IndiaAI Mission. The startup ...

Stop using ChatGPT for everything: I use these AI models for research, coding, and more (and which I avoid)

Obsessing over model version matters less than workflow.

Earth.com

Our brain processes speech in layers, much like AI language models

Brain activity during speech follows a layered timing pattern that matches large language model steps, showing how meaning builds gradually.

IEEE

A Near-Real-Time Processing Ego Speech Filtering Pipeline Designed for Speech Interruption During Human-Robot Interaction

Abstract: With current state-of-the-art (SOTA) automatic speech recognition (ASR) systems, it is not possible to transcribe overlapping speech audio streams separately. Consequently, when these ASR ...

Microsoft

Taxonomizing Representational Harms using Speech Act Theory

Representational harms are widely recognized among fairness-related harms caused by generative language systems. However, their definitions are commonly under-specified. We present a framework, ...

GitHub

Moshi: a speech-text foundation model for real time dialogue

Finally, the code for the web UI client used in the Moshi demo is provided in the client/ directory. If you want to fine tune Moshi, head out to kyutai-labs/moshi ...

IEEE

What Are They Doing? Joint Audio-Speech Co-Reasoning

Abstract: In audio and speech processing, tasks usually focus on either the audio or speech modality, even when both sounds and human speech are present in the same audio clip. Recent Auditory Large ...
