persian_tacotron

This repository trains Tacotron2 on Persian to build a Persian text-to-speech (TTS) system. Tacotron2 is a TTS model that generates mel-spectrograms from text. This implementation takes Nvidia's Tacotron2 and adapts it for training on Persian. We clone Nvidia-Tacotron2, install its requirements, and then make the following changes:

Prepare Persian data: many audio files and a phoneme sequence for each file. We use phonemes instead of raw text because they can be written with English characters and because written Persian omits some vowels.
Change cleaner.py in tacotron2/text/ to match the characters used in the phonemes.
Change hparams.py in tacotron2/.
Create a Python file that creates the text files for the model.
Create a Python file that tests the model on a phoneme sequence.
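
The cleaner change can be illustrated with a minimal sketch. It assumes a cleaner is a plain function that takes a string and returns a string. The name `phoneme_cleaner` and the character set `ALLOWED_CHARS` are hypothetical and are not taken from this repository; replace the set with the characters your phoneme transcriptions actually use.

```python
import re
import string

# Hypothetical phoneme inventory (an assumption, not this repository's set):
# ASCII letters plus the space character.
ALLOWED_CHARS = set(string.ascii_letters + " ")

_whitespace_re = re.compile(r"\s+")


def phoneme_cleaner(text):
    """Normalize a phoneme string before it is mapped to symbol IDs."""
    # Turn tabs, newlines and repeated spaces into single spaces.
    text = _whitespace_re.sub(" ", text)
    # Drop every character that is not in the assumed phoneme inventory.
    text = "".join(ch for ch in text if ch in ALLOWED_CHARS)
    # Removing characters can leave double spaces, so collapse again and trim.
    return _whitespace_re.sub(" ", text).strip()
```

Whatever character set you choose here should match the symbol list the model uses, otherwise some phonemes will be silently dropped.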

How to use

To use this implementation:

Clone this repository.
Install the requirements in tacotron2/requirements.txt.
Add your data to files/: audio files go in files/wavs and phoneme_transcriptions.txt goes in files/.
Run create_data_file.py to create the text files for the model in files/text_files.
Move the created files from files/text_files/ to tacotron2/filelists/.
Edit hparams.py in tacotron2/ so that training matches your data: epochs=? , iters_per_checkpoint=?, training_files='filelists/name-of-your-train-data.txt', validation_files='filelists/name-of-your-test-data.txt'
Start training with the following command:
```
python tacotron2/train.py --output_directory=outdir --log_directory=logdir
```
Checkpoints will be saved in tacotron2/outdir/. During training, if you have 1000 audio files and the batch size is 16, each epoch takes roughly 1000/16, or about 62, iterations. If you get an out-of-memory error, decrease batch_size in hparams.py, for example to 8.
Edit get_results.py and set your test phoneme sequence in main: text = ?
Run get_results.py, passing the number of the last saved checkpoint file as the parameter. For example, to use 'checkpoint_32000', run the following command:
```
python get_results.py 32000
```
The mel-spectrogram plot and the audio file will be saved in results/.
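
The file-list step above can be sketched as follows. This is not the repository's create_data_file.py; it is a minimal illustration under stated assumptions. It assumes each line of phoneme_transcriptions.txt has the form `<wav file name><TAB><phonemes>` (the real format may differ), and it writes the `wav_path|text` lines that Nvidia's Tacotron2 file lists use. The function names, the 10% validation split, and the `wav_dir` default are all hypothetical.

```python
import random
from pathlib import Path


def parse_transcriptions(lines):
    """Parse lines of the ASSUMED form '<wav name>\\t<phonemes>' into pairs."""
    pairs = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        name, phonemes = line.split("\t", 1)
        pairs.append((name.strip(), phonemes.strip()))
    return pairs


def build_filelists(pairs, wav_dir="files/wavs", val_fraction=0.1, seed=0):
    """Return (train_lines, val_lines), each line as 'wav_path|phonemes'."""
    entries = [f"{wav_dir}/{name}|{phonemes}" for name, phonemes in pairs]
    random.Random(seed).shuffle(entries)  # fixed seed gives a repeatable split
    n_val = max(1, int(len(entries) * val_fraction))
    return entries[n_val:], entries[:n_val]


def write_filelists(transcription_path, out_dir):
    """Read the transcription file and write train.txt and val.txt to out_dir."""
    lines = Path(transcription_path).read_text(encoding="utf-8").splitlines()
    train, val = build_filelists(parse_transcriptions(lines))
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    (out / "train.txt").write_text("\n".join(train) + "\n", encoding="utf-8")
    (out / "val.txt").write_text("\n".join(val) + "\n", encoding="utf-8")
```

Calling `write_filelists("files/phoneme_transcriptions.txt", "files/text_files")` would produce the two files under these assumptions. The wav paths must resolve from the directory where training is started, so adjust `wav_dir` to your layout.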

results

After training the model on 2500 audio files for about 400 epochs, the result is this: You can also listen to some audio results here.


Adibian/persian_tacotron
