AI Voice Agents

AI Voice Agents - Exploring the Next Generation of Human-Machine Interaction! 🎙️🤖🎧

Project List

Source	Description	Model
Bland AI	Bland AI is a platform for AI phone calling that automates inbound and outbound enterprise calls with a human-sounding voice. Its API lets you send or receive phone calls with a programmable voice agent.	API
GPT-4o	GPT-4o (“o” for “omni”) is a step towards much more natural human-computer interaction—it accepts as input any combination of text, audio, image, and video and generates any combination of text, audio, and image outputs.	API
Retell AI	Retell AI - Build Advanced Voice AI, Powered by LLM.	API

Source	Description	Code	Paper	Model
ChatTTS	ChatTTS is a text-to-speech model designed specifically for dialogue scenarios such as LLM assistants.	GitHub		Hugging Face
CosyVoice	Multilingual large voice generation model with full-stack support for inference, training, and deployment.	GitHub
ElevenLabs	ElevenLabs: Text to Speech & AI Voice Generator.			API
Matcha-TTS	Matcha-TTS: A fast TTS architecture with conditional flow matching.	GitHub	arXiv
StyleTTS 2	Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models.	GitHub	arXiv
XTTS	XTTS is a model available through 🐸TTS, a library for advanced Text-to-Speech generation.	GitHub

Source	Description	Code	Paper	Model
SenseVoice	SenseVoice is a speech foundation model with multiple speech understanding capabilities, including automatic speech recognition (ASR), spoken language identification (LID), speech emotion recognition (SER), and audio event detection (AED).	GitHub		Hugging Face
TeleSpeech-ASR	Large speech model for ASR across a very wide range of dialects.	GitHub		Hugging Face
Whisper	Whisper is a general-purpose speech recognition model.	GitHub	arXiv	Hugging Face

Source	Description	Code	Paper	Model
Make-An-Audio 3	Transforming Text into Audio via Flow-based Large Diffusion Transformers.	GitHub	arXiv	Hugging Face