Kyrgyz Startup Unveils AI Speech Synthesis Model at CES 2026
The centerpiece of the exhibition was KaniTTS, an open-source speech synthesis model. The developers claim that it generates speech in real time three times faster and up to ten times cheaper than offerings from well-known global companies such as ElevenLabs, OpenAI, and Google. The model is released under the Apache 2.0 license, so it is free to use.
From a technical standpoint, KaniTTS can generate 15 seconds of speech in just one second on a standard NVIDIA RTX 5080 graphics card. This makes the technology practical to deploy without expensive cloud infrastructure. The model has already been downloaded over 15,000 times on the Hugging Face platform. It currently supports eight languages, including Kyrgyz, English, German, and Chinese.
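The throughput figure above is often expressed as a real-time factor (RTF): the compute time divided by the duration of the audio produced. The short sketch below only works through that arithmetic for the 15-seconds-in-1-second claim. The function names are illustrative and are not part of KaniTTS or any published benchmark.

```python
def real_time_factor(audio_seconds: float, compute_seconds: float) -> float:
    """Seconds of compute needed per second of generated audio (lower is faster)."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return compute_seconds / audio_seconds


def speedup_over_real_time(audio_seconds: float, compute_seconds: float) -> float:
    """How many times faster than playback speed the audio is generated."""
    if compute_seconds <= 0:
        raise ValueError("compute_seconds must be positive")
    return audio_seconds / compute_seconds


if __name__ == "__main__":
    # Figures taken from the article: 15 s of speech generated in 1 s.
    rtf = real_time_factor(audio_seconds=15.0, compute_seconds=1.0)
    speedup = speedup_over_real_time(audio_seconds=15.0, compute_seconds=1.0)
    print(f"Real-time factor: {rtf:.3f}")      # Real-time factor: 0.067
    print(f"Speed vs. playback: {speedup:.0f}x")  # Speed vs. playback: 15x
```

An RTF below 1.0 means speech is produced faster than it can be played back, which is the condition for real-time use on a single card.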
The startup also presented Kyrgyz Whisper, an automatic speech recognition model fine-tuned from OpenAI technology. Training on 2,000 hours of Kyrgyz speech recordings reduced the recognition error rate for the language from nearly 100% to 0.2%. The solution addresses the lack of quality support for underrepresented languages on the international stage.
The exhibition was organized by the High Technology Park of Kyrgyzstan. According to the park, the country's IT sector is growing steadily: over the past five years, the volume of service exports has increased 45-fold. In 2024, Kyrgyz specialists earned $130 million in foreign markets, with 40% of these exports (over $50 million) going to the USA.