Tacotron2 + TensorFlow 1.10 + Flask speech synthesis
Background
We need to voice-announce the name of a device and its abnormal status.
Environment
- Tacotron2
- TensorFlow 1.10
- Python 3.6
- Miniconda 4.8.3
- Biaobei (BZNSYP) dataset
Installation and configuration
Install Miniconda first.
1. Download. Use the Tsinghua (TUNA) mirror; the Miniconda download pages are:
https://mirrors.tuna.tsinghua.edu.cn/help/anaconda/
https://mirrors.tuna.tsinghua.edu.cn/anaconda/miniconda/
wget https://mirrors.tuna.tsinghua.edu.cn/anaconda/miniconda/Miniconda3-py37_4.8.3-Linux-x86_64.sh
2. Install
bash Miniconda3-py37_4.8.3-Linux-x86_64.sh
3. In order to continue the installation process, please review the license
agreement.
Please, press ENTER to continue
Press ENTER, then press q to quit the license text.
4. Do you accept the license terms? [yes|no]
yes
5. Miniconda3 will now be installed into this location:
/home/aiuser/miniconda3
  - Press ENTER to confirm the location
  - Press CTRL-C to abort the installation
  - Or specify a different location below
Press ENTER
6. Do you wish the installer to initialize Miniconda3
by running conda init? [yes|no]
no
Installation complete.
7. Configure .condarc
vim ~/.condarc
# copy in the config file from https://mirror.tuna.tsinghua.edu.cn/help/anaconda/
channels:
  - defaults
show_channel_urls: true
default_channels:
  - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
  - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/r
  - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/msys2
custom_channels:
  conda-forge: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
  msys2: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
  bioconda: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
  menpo: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
  pytorch: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
  simpleitk: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
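After saving .condarc it is worth clearing conda's index cache so the mirror settings take effect; the TUNA help page recommends exactly this step:
conda clean -i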
Install TensorFlow
# activate conda
source ~/miniconda3/bin/activate
# create an environment
conda create -n tf python=3.6
# enter the environment (leave it later with: conda deactivate)
conda activate tf
# install tensorflow-gpu 1.10
conda install tensorflow-gpu==1.10.0
If a single package fails to download, you may see an error like:
CondaError: Downloaded bytes did not match Content-Length
url: https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main/linux-64/cudatoolkit-9.2-0.conda
target_path: /home/aiuser/miniconda3/pkgs/cudatoolkit-9.2-0.conda
Content-Length: 245249198
downloaded bytes: 230342317
Copy the url from the error, download the package manually, and install it from the local file:
conda install --use-local cudatoolkit-9.2-0.conda
# if several packages fail to download, repeat the steps above for each of them
# then continue installing tensorflow-gpu 1.10
conda install tensorflow-gpu==1.10.0
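Written out, the manual workaround for the failed package above is just a download plus a local install (the url and file name come from the error message shown earlier; depending on your conda version you may instead need to drop the file into the pkgs directory before re-running the install):
wget https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main/linux-64/cudatoolkit-9.2-0.conda
conda install --use-local cudatoolkit-9.2-0.conda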
# test whether the installation succeeded
vim demo.py
# paste in the following
import tensorflow as tf
version = tf.__version__
gpu_ok = tf.test.is_gpu_available()
print("tf version:", version, "\nuse GPU:", gpu_ok)
# run it
python demo.py
# if the GPU check prints True, the installation works
Download the datasets
Biaobei (BZNSYP) dataset: https://online-of-baklong.oss-cn-huhehaote.aliyuncs.com/story_resource/BZNSYP.rar?Expires=1611650858&OSSAccessKeyId=LTAI3GkKBSJFDJsp&Signature=c8ahH5BEyjEIw2wP0FmXebjNORo%3D
AISHELL (希尔贝壳): http://www.openslr.org/33/
Tacotron2
https://github.com/JasonWei512/Tacotron-2-Chinese # a pretrained model (100k steps) can be downloaded here; with it you can skip straight to step 6
Download http://github.com/JasonWei512/Tacotron-2-Chinese/archive/mandarin-biaobei.zip
1. Unzip Tacotron-2-mandarin-mel.zip
2. Extract the Biaobei dataset into the root of Tacotron-2-mandarin-mel so that the layout looks like this (an example extraction command follows the tree):
Tacotron-2-mandarin-mel
  |- BZNSYP
     |- PhoneLabeling
     |- ProsodyLabeling
     |- Wave
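One way to get this layout, assuming the archive from the Biaobei link above is saved as BZNSYP.rar and unrar is installed (both the path and the tool are only an example):
cd Tacotron-2-mandarin-mel
unrar x /path/to/BZNSYP.rar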
3. Use ffmpeg to downsample the wav files in /BZNSYP/Wave/ to 36 kHz:
import os
import subprocess

input_path = r"D:\tensorflow\Tacotron-2-mandarin-mel\Tacotron-2-mandarin-mel\BZNSYP\Wave"
output_path = r"D:\tensorflow\Tacotron-2-mandarin-mel\Tacotron-2-mandarin-mel\BZNSYP\Wave2"

os.makedirs(output_path, exist_ok=True)  # ffmpeg does not create the output directory itself

for file in os.listdir(input_path):
    file1 = os.path.join(input_path, file)
    file2 = os.path.join(output_path, file)
    # resample each wav to 36 kHz
    subprocess.call(["ffmpeg", "-i", file1, "-ar", "36000", file2])
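Save the loop above as, say, downsample.py (the file name is arbitrary) and run it with python downsample.py. Preprocessing reads from BZNSYP/Wave, so after verifying the output you will presumably want to swap the downsampled Wave2 in for the original Wave.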
4. Preprocess the data
python preprocess.py --dataset='Biaobei'
5. Train
python train.py --model='Tacotron-2'
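Checkpoints and logs from this step end up under logs-Tacotron-2 (see the repository structure below), so training can be followed with TensorBoard; the directory name here is simply taken from that structure:
tensorboard --logdir logs-Tacotron-2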
6. Synthesize
If there is no WaveNet model and only the spectrogram-prediction model, speech is generated by Griffin-Lim alone and written to the /tacotron_output/logs-eval/wavs/ folder.
If a WaveNet model is available, the speech generated by WaveNet is written to /wavenet_output/wavs/.
python synthesize.py --model='Tacotron-2' --text_list='sentences.txt'
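sentences.txt holds the text to synthesize, one utterance per line. The exact input format depends on how the model was trained; the Biaobei transcripts are tone-numbered pinyin, and the Mandarin pretrained model is normally fed pinyin rather than raw Chinese characters (check the repository README if unsure). Under that assumption, a sentences.txt for the device-announcement use case might look like this (the two lines spell out "一号水泵温度异常" and "二号风机振动过大"):
yi1 hao4 shui3 beng4 wen1 du4 yi4 chang2
er4 hao4 feng1 ji1 zhen4 dong4 guo4 da4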
Repository Structure:
Tacotron-2
├── datasets
├── en_UK (0)
│   └── by_book
│       └── female
├── en_US (0)
│   └── by_book
│       ├── female
│       └── male
├── LJSpeech-1.1 (0)
│   └── wavs
├── logs-Tacotron (2)
│   ├── eval_-dir
│   │   ├── plots
│   │   └── wavs
│   ├── mel-spectrograms
│   ├── plots
│   ├── taco_pretrained
│   ├── metas
│   └── wavs
├── logs-Wavenet (4)
│   ├── eval-dir
│   │   ├── plots
│   │   └── wavs
│   ├── plots
│   ├── wave_pretrained
│   ├── metas
│   └── wavs
├── logs-Tacotron-2 ( * )
│   ├── eval-dir
│   │   ├── plots
│   │   └── wavs
│   ├── plots
│   ├── taco_pretrained
│   ├── wave_pretrained
│   ├── metas
│   └── wavs
├── papers
├── tacotron
│   ├── models
│   └── utils
├── tacotron_output (3)
│   ├── eval
│   ├── gta
│   ├── logs-eval
│   │   ├── plots
│   │   └── wavs
│   └── natural
├── wavenet_output (5)
│   ├── plots
│   └── wavs
├── training_data (1)
│   ├── audio
│   ├── linear
│   └── mels
└── wavenet_vocoder
    └── models
Flask: load the model and call synthesis through an HTTP interface
Download https://gitee.com/mtllll/tacotron2-flask-server
Download the pretrained model and place it in the server's root directory as logs-Tacotron-2
Run the server's app.py
Run the client's app.py
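The gitee project above contains the actual server and client code; the sketch below only illustrates the general shape of such a service. The /synthesize endpoint, the synthesize_wav() helper, the request field name and the port are all hypothetical and may differ from what the downloaded project uses.

# server sketch (illustrative only): wrap the synthesizer in a small Flask app
import io
from flask import Flask, request, send_file

app = Flask(__name__)

def synthesize_wav(text):
    # Hypothetical helper: in the real project this would load the
    # Tacotron-2 checkpoint from logs-Tacotron-2 once at startup and
    # return the synthesized audio for `text` as wav bytes.
    raise NotImplementedError

@app.route("/synthesize", methods=["POST"])
def synthesize():
    text = request.form.get("text", "")
    wav_bytes = synthesize_wav(text)
    # stream the wav back to the caller
    return send_file(io.BytesIO(wav_bytes), mimetype="audio/wav")

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)

A matching client simply posts the text and saves the returned audio:

# client sketch (illustrative only)
import requests

resp = requests.post("http://127.0.0.1:5000/synthesize",
                     data={"text": "text to synthesize"})
resp.raise_for_status()
with open("result.wav", "wb") as f:
    f.write(resp.content)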