ManaTTS is the largest open Persian speech dataset, with 100+ hours of transcribed audio. It includes the data collection pipeline and tools, and is suitable for training Persian text-to-speech models.
Updated May 21, 2025 - Jupyter Notebook
A freely licensed Persian TTS dataset including 6+ hours of audio-text pairs with subject
An implementation of Transfer Learning from Speaker Verification to Multi-speaker Text-To-Speech Synthesis (SV2TTS) for the Persian language.
A robust forced alignment tool for low-resource languages using multiple ASR models and CER-based matching. Built for noisy data and imperfect transcripts.
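As a rough illustration of what CER-based matching across several ASR models could look like, the sketch below computes the character error rate (Levenshtein edit distance divided by reference length) between a transcript chunk and each model's hypothesis, then keeps the closest one if it falls under a threshold. This is not the tool's actual code: the function names, the `max_cer` threshold, and the example model names are all hypothetical.

```python
from typing import Dict, Optional, Tuple


def edit_distance(ref: str, hyp: str) -> int:
    """Character-level Levenshtein distance, computed row by row."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance normalized by reference length."""
    if not ref:
        return 0.0 if not hyp else 1.0
    return edit_distance(ref, hyp) / len(ref)


def best_match(chunk: str,
               hypotheses: Dict[str, str],
               max_cer: float = 0.2) -> Optional[Tuple[str, float]]:
    """Return (model_name, cer) for the hypothesis closest to the transcript
    chunk, or None if no hypothesis is within max_cer (assumed threshold)."""
    if not hypotheses:
        return None
    best_cer, best_name = min(
        (cer(chunk, text), name) for name, text in hypotheses.items()
    )
    return (best_name, best_cer) if best_cer <= max_cer else None


if __name__ == "__main__":
    # Hypothetical outputs from two ASR models for the same audio segment.
    outputs = {"model_a": "salam donia", "model_b": "slm dny"}
    print(best_match("salam donya", outputs))
```

Taking the minimum over several models means a segment is accepted when any one model transcribes it closely, which is one plausible way to tolerate noisy audio; the threshold controls how imperfect a transcript may be before the segment is rejected.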
Tacotron2 Persian Text-to-Speech Model trained on ManaTTS, the largest open single-speaker Persian speech dataset with over 100 hours of high-quality audio.
An automatic pipeline for generating high-quality datasets for TTS and ASR systems.