
ChatTTS Introduction

ChatTTS is a voice generation model designed for conversational scenarios, such as dialogue tasks for large language model assistants and conversational audio and video introductions. It supports Chinese and English and, trained on about 100,000 hours of data, produces high-quality, natural-sounding speech. The project team also plans to open-source a base model trained on 40,000 hours of data.


What is ChatTTS

ChatTTS is a text-to-speech model designed specifically for conversational scenarios. It is suited to applications such as dialogue tasks for large language model assistants and conversational audio and video introductions. ChatTTS supports both Chinese and English and produces high-quality, natural-sounding speech, a result of training on approximately 100,000 hours of Chinese and English data. The project team plans to open-source a base model trained on 40,000 hours of data to support further research and development in the academic and developer communities.


ChatTTS Alternatives