Commit fae44e7

【PaddleSpeech No.12】 (#4037)
* fix the svs1
* del stage 0
* Remove the redundant stage 0 from the Chinese README
1 parent de3fa28 commit fae44e7

3 files changed: 12 additions, 4 deletions

examples/opencpop/svs1/README.md

Lines changed: 5 additions & 1 deletion
@@ -118,6 +118,8 @@ pwgan_opencpop_ckpt_1.4.0.zip
 ```bash
 CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh ${conf_path} ${train_output_path} ${ckpt_name}
 ```
+use `pwgan` model as vocoder.
+
 ```text
 usage: synthesize.py [-h]
                      [--am {diffsinger_opencpop}]
@@ -170,8 +172,10 @@ optional arguments:
 `local/pinyin_to_phone.txt` comes from the readme of the opencpop dataset, indicating the mapping from pinyin to phonemes in opencpop.
 
 ```bash
-CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh ${conf_path} ${train_output_path} ${ckpt_name}
+CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name}
 ```
+`--stage` controls the vocoder model during synthesis, which can be `0` or `1`, use `pwgan` or `hifigan` model as vocoder.
+
 ```text
 usage: synthesize_e2e.py [-h]
                          [--am {speedyspeech_csmsc,speedyspeech_aishell3,fastspeech2_csmsc,fastspeech2_ljspeech,fastspeech2_aishell3,fastspeech2_vctk,tacotron2_csmsc,tacotron2_ljspeech}]

examples/opencpop/svs1/README_cn.md

Lines changed: 5 additions & 1 deletion
@@ -121,6 +121,8 @@ pwgan_opencpop_ckpt_1.4.0.zip
 ```bash
 CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh ${conf_path} ${train_output_path} ${ckpt_name}
 ```
+Use the `pwgan` model as the vocoder.
+
 ```text
 usage: synthesize.py [-h]
                      [--am {diffsinger_opencpop}]
@@ -173,8 +175,10 @@ optional arguments:
 `local/pinyin_to_phone.txt` comes from the README of the opencpop dataset and gives the mapping from pinyin to phonemes in opencpop.
 
 ```bash
-CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh ${conf_path} ${train_output_path} ${ckpt_name}
+CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name}
 ```
+`--stage` selects the vocoder model used during synthesis. It takes the value `0` or `1`, which use the `pwgan` or `hifigan` model as the vocoder, respectively.
+
 ```text
 usage: synthesize_e2e.py [-h]
                          [--am {speedyspeech_csmsc,speedyspeech_aishell3,fastspeech2_csmsc,fastspeech2_ljspeech,fastspeech2_aishell3,fastspeech2_vctk,tacotron2_csmsc,tacotron2_ljspeech}]

examples/opencpop/svs1/run.sh

Lines changed: 2 additions & 2 deletions
@@ -32,6 +32,6 @@ if [ ${stage} -le 2 ] && [ ${stop_stage} -ge 2 ]; then
 fi
 
 if [ ${stage} -le 3 ] && [ ${stop_stage} -ge 3 ]; then
-    # synthesize_e2e, vocoder is pwgan by default
-    CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh ${conf_path} ${train_output_path} ${ckpt_name} || exit -1
+    # synthesize_e2e, vocoder is pwgan by default, stage 1 will use hifigan as vocoder
+    CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name} || exit -1
 fi
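
The commit documents that `--stage 0` selects `pwgan` and `--stage 1` selects `hifigan`, but the diff does not show how `local/synthesize_e2e.sh` reads the flag. The following is only a minimal sketch of one way a script could do that mapping. The function names `parse_stage` and `pick_vocoder` are hypothetical and are not taken from the repository.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: NOT the actual contents of local/synthesize_e2e.sh.

# Read an optional leading `--stage N`; fall back to 0 (pwgan) when absent.
parse_stage() {
    local stage=0
    if [ "$1" = "--stage" ]; then
        stage="$2"
    fi
    echo "${stage}"
}

# Map the stage value to the vocoder named in the README text.
pick_vocoder() {
    case "$1" in
        0) echo "pwgan" ;;
        1) echo "hifigan" ;;
        *) echo "unsupported stage: $1" >&2; return 1 ;;
    esac
}
```

Under these assumptions, `pick_vocoder "$(parse_stage --stage 1 conf exp ckpt)"` would select `hifigan`, and omitting `--stage` would fall back to `pwgan`, matching the "pwgan by default" comment in `run.sh`.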
