
ChatTTS Core Features

ChatTTS is a voice generation model designed for conversational scenarios, such as dialogue tasks for large language model assistants and conversational audio and video introductions. It supports Chinese and English and, having been trained on about 100,000 hours of data, produces high-quality, natural-sounding speech. The project team also plans to open-source a basic model trained on 40,000 hours of data.


Core Features of ChatTTS

Text-to-Speech for Chat

ChatTTS is a voice generation model specifically designed for conversational scenarios. It is ideal for applications such as dialogue tasks for large language model assistants, as well as conversational audio and video introductions.

Support for Multiple Languages

The model supports both Chinese and English, demonstrating high quality and naturalness in speech synthesis.

High-Quality Speech Synthesis

The quality and naturalness of the synthesized speech come from training on approximately 100,000 hours of Chinese and English data.

Open-Sourcing a Basic Model

The project team plans to open-source a basic model trained on 40,000 hours of data to support further research and development in the academic and developer communities.


ChatTTS Alternatives