Abstract: This paper introduces a general classifier based on WavLM features, to infer demographic characteristics, such as age, gender, native language, education, and country, from speech.
Abstract: Nowadays, smart speech devices have been widely used for human-computer interaction. However, in many environments, the reverberation caused by the multi-path effect can seriously affect the ...
The best audio processing library built on Apple's MLX framework, providing fast and efficient text-to-speech (TTS), speech-to-text (STT), and speech-to-speech (STS) on Apple Silicon. Kokoro Fast, ...
Early detection of Alzheimer’s disease (AD) through spontaneous speech analysis represents a promising, non-invasive diagnostic approach. Existing methods predominantly rely on fusion-based multimodal ...