
Commit e59a3c1

【doc】fix docs (#4068)
* fix docs * fix pwgan * fix * fix
1 parent 7883aa6 commit e59a3c1

39 files changed: 135 additions, 137 deletions
examples/aishell/asr1/run.sh

Lines changed: 2 additions & 2 deletions
@@ -24,12 +24,12 @@ fi
2424

2525
if [ ${stage} -le 1 ] && [ ${stop_stage} -ge 1 ]; then
2626
# train model, all `ckpt` under `exp` dir
27-
CUDA_VISIBLE_DEVICES=${gpus} ./local/train.sh ${conf_path} ${ckpt} ${ips}
27+
CUDA_VISIBLE_DEVICES=${gpus} ./local/train.sh ${conf_path} ${ckpt} ${ips} || exit -1
2828
fi
2929

3030
if [ ${stage} -le 2 ] && [ ${stop_stage} -ge 2 ]; then
3131
# avg n best model
32-
avg.sh best exp/${ckpt}/checkpoints ${avg_num}
32+
avg.sh best exp/${ckpt}/checkpoints ${avg_num} || exit -1
3333
fi
3434

3535
if [ ${stage} -le 3 ] && [ ${stop_stage} -ge 3 ]; then
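
The `|| exit -1` guard added in this hunk stops `run.sh` as soon as a stage command fails, instead of carrying on to later stages with missing outputs. A minimal sketch of that behavior follows; `run_guarded` is a hypothetical helper used only for illustration and is not part of the repository.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not in the repository): run a command inside a
# subshell with the `|| exit -1` guard and print the resulting status.
run_guarded() {
    # The subshell keeps `exit` from terminating the calling shell.
    ( "$@" || exit -1 )
    echo $?
}

run_guarded true    # command succeeds, so the guard is skipped
run_guarded false   # command fails, so the subshell exits via the guard
```

Under bash, `exit -1` is reported to the caller as status 255, because exit statuses are truncated to eight bits. Other shells may reject a negative argument, so `exit 1` is the more portable spelling if the scripts ever need to run outside bash.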

examples/aishell3/tts3/README.md

Lines changed: 4 additions & 4 deletions
@@ -109,9 +109,9 @@ pwg_aishell3_ckpt_0.5
109109
```
110110
`./local/synthesize.sh` calls `${BIN_DIR}/../synthesize.py`, which can synthesize waveform from `metadata.jsonl`.
111111
```bash
112-
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name}
112+
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh ${conf_path} ${train_output_path} ${ckpt_name} 0
113113
```
114-
`--stage` controls the vocoder model during synthesis, which can be `0` or `1`, use `pwgan` or `hifigan` model as vocoder.
114+
The last number controls the vocoder model during synthesis, which can be `0` or `1`, use `pwgan` or `hifigan` model as vocoder.
115115
```text
116116
usage: synthesize.py [-h]
117117
[--am {speedyspeech_csmsc,fastspeech2_csmsc,fastspeech2_ljspeech,fastspeech2_aishell3,fastspeech2_vctk,tacotron2_csmsc,tacotron2_ljspeech,tacotron2_aishell3}]
@@ -158,9 +158,9 @@ optional arguments:
158158
```
159159
`./local/synthesize_e2e.sh` calls `${BIN_DIR}/../synthesize_e2e.py`, which can synthesize waveform from text file.
160160
```bash
161-
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name}
161+
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh ${conf_path} ${train_output_path} ${ckpt_name} 0
162162
```
163-
`--stage` controls the vocoder model during synthesis, which can be `0` or `1`, use `pwgan` or `hifigan` model as vocoder.
163+
The last number controls the vocoder model during synthesis, which can be `0` or `1`, use `pwgan` or `hifigan` model as vocoder.
164164
```text
165165
usage: synthesize_e2e.py [-h]
166166
[--am {speedyspeech_csmsc,speedyspeech_aishell3,fastspeech2_csmsc,fastspeech2_ljspeech,fastspeech2_aishell3,fastspeech2_vctk,tacotron2_csmsc,tacotron2_ljspeech}]

examples/aishell3/tts3/local/synthesize.sh

Lines changed: 2 additions & 2 deletions
@@ -4,8 +4,8 @@ config_path=$1
44
train_output_path=$2
55
ckpt_name=$3
66

7-
stage=0
8-
stop_stage=0
7+
stage=${4:-0}
8+
stop_stage=${4:-0}
99

1010
# pwgan
1111
if [ ${stage} -le 0 ] && [ ${stop_stage} -ge 0 ]; then
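
The `${4:-0}` expansion introduced here reads the fourth positional argument and falls back to `0` when that argument is unset or empty. The sketch below isolates that behavior; `select_stage` and the argument values are hypothetical stand-ins for the real script and its paths.

```bash
#!/usr/bin/env bash
# Hypothetical function (not in the repository) that mirrors the
# `stage=${4:-0}` line: echo the fourth argument, or 0 if it is
# missing or empty.
select_stage() {
    echo "${4:-0}"
}

select_stage conf.yaml exp/default snapshot.zip       # no 4th argument
select_stage conf.yaml exp/default snapshot.zip 1     # explicit selector
select_stage conf.yaml exp/default snapshot.zip ""    # empty 4th argument
```

Because `stage` and `stop_stage` are both assigned from the same argument, a single number selects exactly one block of the script.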

examples/aishell3/tts3/local/synthesize_e2e.sh

Lines changed: 2 additions & 2 deletions
@@ -4,8 +4,8 @@ config_path=$1
44
train_output_path=$2
55
ckpt_name=$3
66

7-
stage=0
8-
stop_stage=0
7+
stage=${4:-0}
8+
stop_stage=${4:-0}
99

1010
# pwgan
1111
if [ ${stage} -le 0 ] && [ ${stop_stage} -ge 0 ]; then

examples/aishell3/tts3/run.sh

Lines changed: 4 additions & 4 deletions
@@ -27,13 +27,13 @@ if [ ${stage} -le 1 ] && [ ${stop_stage} -ge 1 ]; then
2727
fi
2828

2929
if [ ${stage} -le 2 ] && [ ${stop_stage} -ge 2 ]; then
30-
# synthesize, vocoder is pwgan by default stage 0, stage 1 will use hifigan as vocoder
31-
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name} || exit -1
30+
# synthesize, vocoder is pwgan by default 0, use 1 will use hifigan as vocoder
31+
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh ${conf_path} ${train_output_path} ${ckpt_name} 0 || exit -1
3232
fi
3333

3434
if [ ${stage} -le 3 ] && [ ${stop_stage} -ge 3 ]; then
35-
# synthesize_e2e, vocoder is pwgan by default stage 0, stage 1 will use hifigan as vocoder
36-
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name} || exit -1
35+
# synthesize_e2e, vocoder is pwgan by default 0, use 1 will use hifigan as vocoder
36+
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh ${conf_path} ${train_output_path} ${ckpt_name} 0 || exit -1
3737
fi
3838

3939
if [ ${stage} -le 4 ] && [ ${stop_stage} -ge 4 ]; then
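
After this change the vocoder selector is the trailing positional argument rather than a `--stage` flag. For the aishell3 recipe, the README hunk above states that `0` selects `pwgan` and `1` selects `hifigan`. The mapping can be sketched as a lookup; `vocoder_for_selector` is a hypothetical helper for illustration, since the real scripts branch on `stage` and `stop_stage` instead.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not in the repository): translate the trailing
# selector into the vocoder name documented for the aishell3 recipe.
vocoder_for_selector() {
    case "${1:-0}" in
        0) echo "pwgan" ;;
        1) echo "hifigan" ;;
        *) echo "unknown vocoder selector: $1" >&2
           return 1 ;;
    esac
}

vocoder_for_selector 0
vocoder_for_selector 1
```

The real scripts do not reject out-of-range values this way. As far as the gates in this diff show, a selector that matches no block would simply run nothing.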

examples/aishell3_vctk/ernie_sat/run.sh

Lines changed: 2 additions & 2 deletions
@@ -32,6 +32,6 @@ if [ ${stage} -le 2 ] && [ ${stop_stage} -ge 2 ]; then
3232
fi
3333

3434
if [ ${stage} -le 3 ] && [ ${stop_stage} -ge 3 ]; then
35-
# synthesize_e2e, default speech synthesis from Chinese to English, use stage1 to switch from English to Chinese
36-
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name} || exit -1
35+
# synthesize_e2e, run both speech synthesis from Chinese to English and English to Chinese
36+
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh ${conf_path} ${train_output_path} ${ckpt_name} || exit -1
3737
fi

examples/canton/tts3/local/synthesize_e2e.sh

Lines changed: 2 additions & 2 deletions
@@ -4,8 +4,8 @@ config_path=$1
44
train_output_path=$2
55
ckpt_name=$3
66

7-
stage=0
8-
stop_stage=0
7+
stage=${4:-0}
8+
stop_stage=${4:-0}
99

1010
# pwgan
1111
if [ ${stage} -le 0 ] && [ ${stop_stage} -ge 0 ]; then
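
Each block in these scripts is wrapped in the gate `[ ${stage} -le N ] && [ ${stop_stage} -ge N ]`, so block `N` runs only when `N` lies between `stage` and `stop_stage` inclusive. The sketch below evaluates that gate for three blocks; `run_stages` is a hypothetical function, and the block count of three is an assumption made for the example.

```bash
#!/usr/bin/env bash
# Hypothetical function (not in the repository): report which numbered
# blocks the stage/stop_stage gate would execute.
run_stages() {
    stage=$1
    stop_stage=$2
    ran=""
    for n in 0 1 2; do
        # Same gate as in the scripts, with quoting added.
        if [ "${stage}" -le "${n}" ] && [ "${stop_stage}" -ge "${n}" ]; then
            ran="${ran}${n}"
        fi
    done
    echo "ran:${ran}"
}

run_stages 0 0   # stage == stop_stage: only block 0 runs
run_stages 1 1   # only block 1 runs
run_stages 0 1   # a wider range runs blocks 0 and 1
```

With `stage=${4:-0}` and `stop_stage=${4:-0}` the two bounds are always equal, which is why the trailing argument picks a single vocoder block.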

examples/canton/tts3/run.sh

Lines changed: 4 additions & 4 deletions
@@ -28,13 +28,13 @@ if [ ${stage} -le 1 ] && [ ${stop_stage} -ge 1 ]; then
2828
fi
2929

3030
if [ ${stage} -le 2 ] && [ ${stop_stage} -ge 2 ]; then
31-
# synthesize, vocoder is pwgan by default stage 0, stage 1 will use hifigan as vocoder
32-
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name} || exit -1
31+
# synthesize, vocoder is pwgan by default 0, use 1 will use hifigan as vocoder
32+
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh ${conf_path} ${train_output_path} ${ckpt_name} 0 || exit -1
3333
fi
3434

3535
if [ ${stage} -le 3 ] && [ ${stop_stage} -ge 3 ]; then
36-
# synthesize_e2e, vocoder is pwgan by default stage 0, stage 1 will use hifigan as vocoder
37-
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name} || exit -1
36+
# synthesize_e2e, vocoder is pwgan by default 0, use 1 will use hifigan as vocoder
37+
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh ${conf_path} ${train_output_path} ${ckpt_name} 0 || exit -1
3838
fi
3939

4040
if [ ${stage} -le 4 ] && [ ${stop_stage} -ge 4 ]; then

examples/csmsc/tts0/README.md

Lines changed: 4 additions & 4 deletions
@@ -99,9 +99,9 @@ pwg_baker_ckpt_0.4
9999
```
100100
`./local/synthesize.sh` calls `${BIN_DIR}/../synthesize.py`, which can synthesize waveform from `metadata.jsonl`.
101101
```bash
102-
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name}
102+
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh ${conf_path} ${train_output_path} ${ckpt_name} 0
103103
```
104-
`--stage` controls the vocoder model during synthesis, which can use stage `0-4` to select the vocoder to use {`pwgan`, `multi band melgan`, `style melgan`, ` hifigan`, `wavernn`}
104+
The last number controls the vocoder model during synthesis, which can use `0-4` to select the vocoder in {`pwgan`, `multi band melgan`, `style melgan`, ` hifigan`, `wavernn`}
105105

106106
```text
107107
usage: synthesize.py [-h]
@@ -150,9 +150,9 @@ optional arguments:
150150
`./local/synthesize_e2e.sh` calls `${BIN_DIR}/../synthesize_e2e.py`, which can synthesize waveform from text file.
151151

152152
```bash
153-
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name}
153+
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh ${conf_path} ${train_output_path} ${ckpt_name} 0
154154
```
155-
`--stage` controls the vocoder model during synthesis, which can use stage `0,1,3,4` to select the vocoder to use{`pwgan`, `multi band melgan`, `hifigan`, `wavernn`}
155+
The last number controls the vocoder model during synthesis, which can use `0,1,3,4` to select the vocoder in {`pwgan`, `multi band melgan`, `hifigan`, `wavernn`}
156156

157157
```text
158158
usage: synthesize_e2e.py [-h]

examples/csmsc/tts0/local/synthesize.sh

Lines changed: 7 additions & 7 deletions
@@ -3,8 +3,8 @@
33
config_path=$1
44
train_output_path=$2
55
ckpt_name=$3
6-
stage=0
7-
stop_stage=0
6+
stage=${4:-0}
7+
stop_stage=${4:-0}
88

99
# pwgan
1010
if [ ${stage} -le 0 ] && [ ${stop_stage} -ge 0 ]; then
@@ -21,7 +21,7 @@ if [ ${stage} -le 0 ] && [ ${stop_stage} -ge 0 ]; then
2121
--voc_stat=pwg_baker_ckpt_0.4/pwg_stats.npy \
2222
--test_metadata=dump/test/norm/metadata.jsonl \
2323
--output_dir=${train_output_path}/test \
24-
--phones_dict=dump/phone_id_map.txt
24+
--phones_dict=dump/phone_id_map.txt || exit -1
2525
fi
2626

2727
# for more GAN Vocoders
@@ -40,7 +40,7 @@ if [ ${stage} -le 1 ] && [ ${stop_stage} -ge 1 ]; then
4040
--voc_stat=mb_melgan_csmsc_ckpt_0.1.1/feats_stats.npy \
4141
--test_metadata=dump/test/norm/metadata.jsonl \
4242
--output_dir=${train_output_path}/test \
43-
--phones_dict=dump/phone_id_map.txt
43+
--phones_dict=dump/phone_id_map.txt || exit -1
4444
fi
4545

4646
# style melgan
@@ -58,7 +58,7 @@ if [ ${stage} -le 2 ] && [ ${stop_stage} -ge 2 ]; then
5858
--voc_stat=style_melgan_csmsc_ckpt_0.1.1/feats_stats.npy \
5959
--test_metadata=dump/test/norm/metadata.jsonl \
6060
--output_dir=${train_output_path}/test \
61-
--phones_dict=dump/phone_id_map.txt
61+
--phones_dict=dump/phone_id_map.txt || exit -1
6262
fi
6363

6464
# hifigan
@@ -77,7 +77,7 @@ if [ ${stage} -le 3 ] && [ ${stop_stage} -ge 3 ]; then
7777
--voc_stat=hifigan_csmsc_ckpt_0.1.1/feats_stats.npy \
7878
--test_metadata=dump/test/norm/metadata.jsonl \
7979
--output_dir=${train_output_path}/test \
80-
--phones_dict=dump/phone_id_map.txt
80+
--phones_dict=dump/phone_id_map.txt || exit -1
8181
fi
8282

8383
# wavernn
@@ -96,5 +96,5 @@ if [ ${stage} -le 4 ] && [ ${stop_stage} -ge 4 ]; then
9696
--voc_stat=wavernn_csmsc_ckpt_0.2.0/feats_stats.npy \
9797
--test_metadata=dump/test/norm/metadata.jsonl \
9898
--output_dir=${train_output_path}/test \
99-
--phones_dict=dump/phone_id_map.txt
99+
--phones_dict=dump/phone_id_map.txt || exit -1
100100
fi

examples/csmsc/tts0/local/synthesize_e2e.sh

Lines changed: 7 additions & 7 deletions
@@ -4,8 +4,8 @@ config_path=$1
44
train_output_path=$2
55
ckpt_name=$3
66

7-
stage=0
8-
stop_stage=0
7+
stage=${4:-0}
8+
stop_stage=${4:-0}
99

1010
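# TODO: the tacotron2 dynamic-to-static result is not as loud as the dynamic-graph result; some function used in decode may still be misaligned between dynamic and static modes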
1111
# pwgan
@@ -25,7 +25,7 @@ if [ ${stage} -le 0 ] && [ ${stop_stage} -ge 0 ]; then
2525
--text=${BIN_DIR}/../../assets/sentences.txt \
2626
--output_dir=${train_output_path}/test_e2e \
2727
--phones_dict=dump/phone_id_map.txt \
28-
--inference_dir=${train_output_path}/inference
28+
--inference_dir=${train_output_path}/inference || exit -1
2929

3030
fi
3131

@@ -47,7 +47,7 @@ if [ ${stage} -le 1 ] && [ ${stop_stage} -ge 1 ]; then
4747
--text=${BIN_DIR}/../../assets/sentences.txt \
4848
--output_dir=${train_output_path}/test_e2e \
4949
--phones_dict=dump/phone_id_map.txt \
50-
--inference_dir=${train_output_path}/inference
50+
--inference_dir=${train_output_path}/inference || exit -1
5151
fi
5252

5353
# the pretrained models haven't release now
@@ -68,7 +68,7 @@ if [ ${stage} -le 2 ] && [ ${stop_stage} -ge 2 ]; then
6868
--lang=zh \
6969
--text=${BIN_DIR}/../../assets/sentences.txt \
7070
--output_dir=${train_output_path}/test_e2e \
71-
--phones_dict=dump/phone_id_map.txt
71+
--phones_dict=dump/phone_id_map.txt || exit -1
7272
# --inference_dir=${train_output_path}/inference
7373
fi
7474

@@ -90,7 +90,7 @@ if [ ${stage} -le 3 ] && [ ${stop_stage} -ge 3 ]; then
9090
--text=${BIN_DIR}/../../assets/sentences.txt \
9191
--output_dir=${train_output_path}/test_e2e \
9292
--phones_dict=dump/phone_id_map.txt \
93-
--inference_dir=${train_output_path}/inference
93+
--inference_dir=${train_output_path}/inference || exit -1
9494
fi
9595

9696
# wavernn
@@ -111,5 +111,5 @@ if [ ${stage} -le 4 ] && [ ${stop_stage} -ge 4 ]; then
111111
--text=${BIN_DIR}/../../assets/sentences.txt \
112112
--output_dir=${train_output_path}/test_e2e \
113113
--phones_dict=dump/phone_id_map.txt \
114-
--inference_dir=${train_output_path}/inference
114+
--inference_dir=${train_output_path}/inference || exit -1
115115
fi

examples/csmsc/tts0/run.sh

Lines changed: 6 additions & 6 deletions
@@ -27,15 +27,15 @@ if [ ${stage} -le 1 ] && [ ${stop_stage} -ge 1 ]; then
2727
fi
2828

2929
if [ ${stage} -le 2 ] && [ ${stop_stage} -ge 2 ]; then
30-
# synthesize, vocoder is pwgan by default stage 0
31-
# stage 1-4 to select the vocoder to use {multi band melgan, style melgan, hifigan, wavernn}
32-
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name} || exit -1
30+
# synthesize, vocoder is pwgan by default 0
31+
# use 1-4 to select the vocoder in {multi band melgan, style melgan, hifigan, wavernn}
32+
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh ${conf_path} ${train_output_path} ${ckpt_name} 0 || exit -1
3333
fi
3434

3535
if [ ${stage} -le 3 ] && [ ${stop_stage} -ge 3 ]; then
36-
# synthesize_e2e, vocoder is pwgan by default stage 0
37-
# stage 1,3,4 to select the vocoder to use {multi band melgan, hifigan, wavernn}
38-
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name} || exit -1
36+
# synthesize_e2e, vocoder is pwgan by default 0
37+
# use 1,3,4 to select the vocoder in {multi band melgan, hifigan, wavernn}
38+
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh ${conf_path} ${train_output_path} ${ckpt_name} 0 || exit -1
3939
fi
4040

4141
if [ ${stage} -le 4 ] && [ ${stop_stage} -ge 4 ]; then

examples/csmsc/tts2/README.md

Lines changed: 4 additions & 4 deletions
@@ -116,9 +116,9 @@ pwg_baker_ckpt_0.4
116116
```
117117
`./local/synthesize.sh` calls `${BIN_DIR}/../synthesize.py`, which can synthesize waveform from `metadata.jsonl`.
118118
```bash
119-
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name}
119+
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh ${conf_path} ${train_output_path} ${ckpt_name} 0
120120
```
121-
`--stage` controls the vocoder model during synthesis, which can use stage `0-4` to select the vocoder to use {`pwgan`, `multi band melgan`, `style melgan`, `hifigan`, `wavernn`}
121+
The last number controls the vocoder model during synthesis, which can use `0-4` to select the vocoder in {`pwgan`, `multi band melgan`, `style melgan`, `hifigan`, `wavernn`}
122122

123123
```text
124124
usage: synthesize.py [-h]
@@ -166,9 +166,9 @@ optional arguments:
166166
```
167167
`./local/synthesize_e2e.sh` calls `${BIN_DIR}/../synthesize_e2e.py`, which can synthesize waveform from text file.
168168
```bash
169-
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name}
169+
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh ${conf_path} ${train_output_path} ${ckpt_name} 0
170170
```
171-
`--stage` controls the vocoder model during synthesis, which can use stage `0,1,3,4` to select the vocoder to use {`pwgan`, `multi band melgan`, `hifigan`, `wavernn`}
171+
The last number controls the vocoder model during synthesis, which can use `0,1,3,4` to select the vocoder in {`pwgan`, `multi band melgan`, `hifigan`, `wavernn`}
172172

173173
```text
174174
usage: synthesize_e2e.py [-h]

examples/csmsc/tts2/local/synthesize.sh

Lines changed: 2 additions & 2 deletions
@@ -3,8 +3,8 @@
33
config_path=$1
44
train_output_path=$2
55
ckpt_name=$3
6-
stage=0
7-
stop_stage=0
6+
stage=${4:-0}
7+
stop_stage=${4:-0}
88

99
# pwgan
1010
if [ ${stage} -le 0 ] && [ ${stop_stage} -ge 0 ]; then

examples/csmsc/tts2/local/synthesize_e2e.sh

Lines changed: 2 additions & 2 deletions
@@ -4,8 +4,8 @@ config_path=$1
44
train_output_path=$2
55
ckpt_name=$3
66

7-
stage=0
8-
stop_stage=0
7+
stage=${4:-0}
8+
stop_stage=${4:-0}
99

1010
# pwgan
1111
if [ ${stage} -le 0 ] && [ ${stop_stage} -ge 0 ]; then

examples/csmsc/tts2/run.sh

Lines changed: 6 additions & 6 deletions
@@ -27,15 +27,15 @@ if [ ${stage} -le 1 ] && [ ${stop_stage} -ge 1 ]; then
2727
fi
2828

2929
if [ ${stage} -le 2 ] && [ ${stop_stage} -ge 2 ]; then
30-
# synthesize, vocoder is pwgan by default stage 0
31-
# use stage 1-4 to select the vocoder to use {multi band melgan, style melgan, hifigan, wavernn}
32-
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name} || exit -1
30+
# synthesize, vocoder is pwgan by default 0
31+
# use 1-4 to select the vocoder to use {multi band melgan, style melgan, hifigan, wavernn}
32+
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh ${conf_path} ${train_output_path} ${ckpt_name} 0 || exit -1
3333
fi
3434

3535
if [ ${stage} -le 3 ] && [ ${stop_stage} -ge 3 ]; then
36-
# synthesize_e2e, vocoder is pwgan by default stage 0
37-
# use stage 1,3,4 to select the vocoder to use {multi band melgan, hifigan, wavernn}
38-
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name} || exit -1
36+
# synthesize_e2e, vocoder is pwgan by default 0
37+
# use 1,3,4 to select the vocoder to use {multi band melgan, hifigan, wavernn}
38+
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh ${conf_path} ${train_output_path} ${ckpt_name} 0 || exit -1
3939
fi
4040

4141
if [ ${stage} -le 4 ] && [ ${stop_stage} -ge 4 ]; then

examples/csmsc/tts3/README.md

Lines changed: 4 additions & 4 deletions
@@ -107,9 +107,9 @@ pwg_baker_ckpt_0.4
107107
```
108108
`./local/synthesize.sh` calls `${BIN_DIR}/../synthesize.py`, which can synthesize waveform from `metadata.jsonl`.
109109
```bash
110-
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name}
110+
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh ${conf_path} ${train_output_path} ${ckpt_name} 0
111111
```
112-
`--stage` controls the vocoder model during synthesis. The parameter values range from `0-4`, corresponding to the following five vocoder models: `pwgan`, `multi band melgan`, `style melgan`, `hifigan`, and `wavernn`.
112+
The last number controls the vocoder model during synthesis, which can use `0-4` to select the vocoder in {`pwgan`, `multi band melgan`, `style melgan`, `hifigan`, `wavernn`}
113113

114114
```text
115115
usage: synthesize.py [-h]
@@ -157,9 +157,9 @@ optional arguments:
157157
```
158158
`./local/synthesize_e2e.sh` calls `${BIN_DIR}/../synthesize_e2e.py`, which can synthesize waveform from text file.
159159
```bash
160-
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name}
160+
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh ${conf_path} ${train_output_path} ${ckpt_name} 0
161161
```
162-
`--stage` controls the vocoder model during synthesis. The parameter values are {`0,1,3,4`}, corresponding to the following four vocoder models: `pwgan`, `multi band melgan`, `hifigan`, and `wavernn`.
162+
The last number controls the vocoder model during synthesis, which can use `0,1,3,4` to select the vocoder in {`pwgan`, `multi band melgan`, `hifigan`, `wavernn`}
163163

164164
```text
165165
usage: synthesize_e2e.py [-h]

examples/csmsc/tts3/README_cn.md

Lines changed: 4 additions & 4 deletions
@@ -113,9 +113,9 @@ pwg_baker_ckpt_0.4
113113
`./local/synthesize.sh` calls `${BIN_DIR}/../synthesize.py` to synthesize waveforms from `metadata.jsonl`.
114114

115115
```bash
116-
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name}
116+
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh ${conf_path} ${train_output_path} ${ckpt_name} 0
117117
```
118-
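The `--stage` parameter controls the vocoder model used during synthesis. Its value ranges over `0-4`, corresponding to the following five vocoder models: `pwgan`, `multi band melgan`, `style melgan`, `hifigan`, `wavernn`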
118+
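The final argument `0` controls the vocoder model used during synthesis. Its value ranges over `0-4`, corresponding to the following five vocoder models: `pwgan`, `multi band melgan`, `style melgan`, `hifigan`, `wavernn`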
119119

120120
```text
121121
usage: synthesize.py [-h]
@@ -164,9 +164,9 @@ optional arguments:
164164
`./local/synthesize_e2e.sh` calls `${BIN_DIR}/../synthesize_e2e.py` to synthesize waveforms from a text file.
165165

166166
```bash
167-
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name}
167+
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh ${conf_path} ${train_output_path} ${ckpt_name} 0
168168
```
169-
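The `--stage` parameter controls the vocoder model used during synthesis. Its allowed values are {`0,1,3,4`}, corresponding to the following four vocoder models: `pwgan`, `multi band melgan`, `hifigan`, `wavernn`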
169+
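The final argument `0` controls the vocoder model used during synthesis. Its allowed values are {`0,1,3,4`}, corresponding to the following four vocoder models: `pwgan`, `multi band melgan`, `hifigan`, `wavernn`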
170170

171171
```text
172172
usage: synthesize_e2e.py [-h]
