Commit de3fa28

【PaddleSpeech No.7】 (#4043)
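* run.sh: add a `--stage` argument to the synthesize and synthesize_e2e calls to control which vocoder model is used; README.md: document the stage argument and clarify the vocoder selection logic
* Add comments about the stage argument in run.sh
* Change HiFiGAN to MultiBand MelGAN
* Move the cmsc files back to their original location (No.15 is left unmodified); only No.6 is modified here
* fix the tts0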
1 parent ef29dbb commit de3fa28

2 files changed: 13 additions, 6 deletions

examples/csmsc/tts0/README.md

Lines changed: 7 additions & 2 deletions
@@ -99,8 +99,10 @@ pwg_baker_ckpt_0.4
 ```
 `./local/synthesize.sh` calls `${BIN_DIR}/../synthesize.py`, which can synthesize waveform from `metadata.jsonl`.
 ```bash
-CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh ${conf_path} ${train_output_path} ${ckpt_name}
+CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name}
 ```
+`--stage` controls the vocoder model during synthesis, which can use stage `0-4` to select the vocoder to use {`pwgan`, `multi band melgan`, `style melgan`, ` hifigan`, `wavernn`}
+
 ```text
 usage: synthesize.py [-h]
                      [--am {speedyspeech_csmsc,fastspeech2_csmsc,fastspeech2_ljspeech,fastspeech2_aishell3,fastspeech2_vctk,tacotron2_csmsc,tacotron2_ljspeech,tacotron2_aishell3}]
@@ -146,9 +148,12 @@ optional arguments:
 output dir.
 ```
 `./local/synthesize_e2e.sh` calls `${BIN_DIR}/../synthesize_e2e.py`, which can synthesize waveform from text file.
+
 ```bash
-CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh ${conf_path} ${train_output_path} ${ckpt_name}
+CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name}
 ```
+`--stage` controls the vocoder model during synthesis, which can use stage `0,1,3,4` to select the vocoder to use{`pwgan`, `multi band melgan`, `hifigan`, `wavernn`}
+
 ```text
 usage: synthesize_e2e.py [-h]
                          [--am {speedyspeech_csmsc,speedyspeech_aishell3,fastspeech2_csmsc,fastspeech2_ljspeech,fastspeech2_aishell3,fastspeech2_vctk,tacotron2_csmsc,tacotron2_ljspeech}]
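The stage-to-vocoder mapping described in the README changes can be summarized as a small lookup. This is only an illustrative sketch: the function name `vocoder_for_stage` and its second argument are hypothetical and do not appear in the repository. It encodes only what the README text states: stages `0-4` for `synthesize.sh`, and stages `0,1,3,4` for `synthesize_e2e.sh`.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of PaddleSpeech): print the vocoder that a
# given --stage value selects, according to the README text in this commit.
#   $1: stage number
#   $2: script name, "synthesize" (default) or "synthesize_e2e"
vocoder_for_stage() {
    local stage="$1"
    local script="${2:-synthesize}"
    case "${stage}" in
        0) echo "pwgan" ;;
        1) echo "multi band melgan" ;;
        2)
            # The README lists style melgan for synthesize.sh only.
            if [ "${script}" = "synthesize_e2e" ]; then
                echo "stage 2 is not listed for synthesize_e2e" >&2
                return 1
            fi
            echo "style melgan"
            ;;
        3) echo "hifigan" ;;
        4) echo "wavernn" ;;
        *)
            echo "unknown stage: ${stage}" >&2
            return 1
            ;;
    esac
}
```

For example, `vocoder_for_stage 3` prints `hifigan`, while `vocoder_for_stage 2 synthesize_e2e` returns a non-zero status because the README does not list stage 2 for the end-to-end script.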

examples/csmsc/tts0/run.sh

Lines changed: 6 additions & 4 deletions
@@ -27,13 +27,15 @@ if [ ${stage} -le 1 ] && [ ${stop_stage} -ge 1 ]; then
 fi

 if [ ${stage} -le 2 ] && [ ${stop_stage} -ge 2 ]; then
-    # synthesize, vocoder is pwgan
-    CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh ${conf_path} ${train_output_path} ${ckpt_name} || exit -1
+    # synthesize, vocoder is pwgan by default stage 0
+    # stage 1-4 to select the vocoder to use {multi band melgan, style melgan, hifigan, wavernn}
+    CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name} || exit -1
 fi

 if [ ${stage} -le 3 ] && [ ${stop_stage} -ge 3 ]; then
-    # synthesize_e2e, vocoder is pwgan
-    CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh ${conf_path} ${train_output_path} ${ckpt_name} || exit -1
+    # synthesize_e2e, vocoder is pwgan by default stage 0
+    # stage 1,3,4 to select the vocoder to use {multi band melgan, hifigan, wavernn}
+    CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name} || exit -1
 fi

 if [ ${stage} -le 4 ] && [ ${stop_stage} -ge 4 ]; then
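In the updated calls, `--stage 0` is passed before the three positional arguments. The diff does not show how `local/synthesize.sh` consumes that option, so the following is a rough sketch of one way a script could accept a leading `--stage N` and then the positional arguments. The function name `parse_synthesize_args` is hypothetical, and the real script may parse its options differently.

```shell
#!/usr/bin/env bash
# Hypothetical sketch, NOT the actual contents of local/synthesize.sh.
# Accepts an optional leading "--stage N", then exactly three positional
# arguments, and stores them in the variables stage, conf_path,
# train_output_path and ckpt_name.
parse_synthesize_args() {
    stage=0    # default stage; the run.sh comments say stage 0 is pwgan
    while [ $# -gt 0 ]; do
        case "$1" in
            --stage)
                if [ $# -lt 2 ]; then
                    echo "--stage needs a value" >&2
                    return 1
                fi
                stage="$2"
                shift 2
                ;;
            -*)
                echo "unknown option: $1" >&2
                return 1
                ;;
            *)
                break    # first positional argument reached
                ;;
        esac
    done
    if [ $# -ne 3 ]; then
        echo "usage: [--stage N] conf_path train_output_path ckpt_name" >&2
        return 1
    fi
    conf_path="$1"
    train_output_path="$2"
    ckpt_name="$3"
}
```

With this sketch, `parse_synthesize_args --stage 3 conf.yaml exp/default ckpt.zip` would leave `stage` set to `3`, and omitting `--stage` would leave it at the default `0`. The file names in that call are placeholders.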
