
Tacotron2+Tensorflow1.10+Flask Speech Synthesis

Programmer Article Station (程序员文章站), 2022-07-14 19:59:49

Background

Device names and their abnormal states need to be announced by voice.

Environment

  1. Tacotron2
  2. Tensorflow 1.10 (tensorflow-gpu 1.10.0)
  3. Python 3.6
  4. Miniconda 4.8.3
  5. Biaobei dataset (BZNSYP)

Installation and Configuration

First, install Miniconda.

1. Download. Using the Tsinghua mirror, go to the Miniconda download page:

https://mirrors.tuna.tsinghua.edu.cn/help/anaconda/

https://mirrors.tuna.tsinghua.edu.cn/anaconda/miniconda/

wget https://mirrors.tuna.tsinghua.edu.cn/anaconda/miniconda/Miniconda3-py37_4.8.3-Linux-x86_64.sh

2. Install

bash Miniconda3-py37_4.8.3-Linux-x86_64.sh

3. In order to continue the installation process, please review the license
agreement.
Please, press ENTER to continue

Press ENTER, then press q to quit the license viewer.

4.Do you accept the license terms? [yes|no]

​ yes

5.Miniconda3 will now be installed into this location:
/home/aiuser/miniconda3

  • Press ENTER to confirm the location

  • Press CTRL-C to abort the installation

  • Or specify a different location below

    Press ENTER

6. Do you wish the installer to initialize Miniconda3
by running conda init? [yes|no]

no

Installation complete.

7. Configure .condarc

vim ~/.condarc
# Copy in the configuration file from https://mirror.tuna.tsinghua.edu.cn/help/anaconda/

channels:
  - defaults
show_channel_urls: true
default_channels:
  - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
  - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/r
  - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/msys2
custom_channels:
  conda-forge: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
  msys2: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
  bioconda: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
  menpo: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
  pytorch: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
  simpleitk: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud

Install Tensorflow

# Activate conda
source ~/miniconda3/bin/activate
# Create the environment
conda create -n tf python=3.6
# Enter the environment (to leave it: conda deactivate)
conda activate tf
# Install tensorflow-gpu 1.10
conda install tensorflow-gpu==1.10.0

If an individual package fails to download completely, conda reports an error like this:

CondaError: Downloaded bytes did not match Content-Length
url: https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main/linux-64/cudatoolkit-9.2-0.conda
target_path: /home/aiuser/miniconda3/pkgs/cudatoolkit-9.2-0.conda
Content-Length: 245249198
downloaded bytes: 230342317

Copy the url from the error message, download the package yourself, and install it manually:
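Before installing a manually downloaded package, it can be worth confirming that the file is complete this time. The following is a minimal sketch, assuming the file was saved to the current directory; the helper name `download_is_complete` is made up for illustration, and the expected size is the Content-Length value from the error message above.

```python
import os


def download_is_complete(path, expected_bytes):
    """Return True if the file at `path` exists and has exactly `expected_bytes` bytes."""
    return os.path.isfile(path) and os.path.getsize(path) == expected_bytes


if __name__ == "__main__":
    # File name and size come from the CondaError message above.
    pkg = "cudatoolkit-9.2-0.conda"
    print(pkg, "complete:", download_is_complete(pkg, 245249198))
```

If this prints False, the download was truncated again and should be repeated before running the install command.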

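conda install --use-local cudatoolkit-9.2-0.conda
# If several packages failed to download, repeat the step above for each one
# Then continue installing tensorflow-gpu 1.10
conda install tensorflow-gpu==1.10.0
# Test whether the installation succeeded
vim demo.py
# Paste in the following
import tensorflow as tf
version = tf.__version__
gpu_ok = tf.test.is_gpu_available()
print("tf version:", version, "\nuse GPU", gpu_ok)
# Run it
python demo.py
# If it prints True for the GPU, the installation succeeded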

Download the Datasets

Biaobei dataset: https://online-of-baklong.oss-cn-huhehaote.aliyuncs.com/story_resource/BZNSYP.rar?Expires=1611650858&OSSAccessKeyId=LTAI3GkKBSJFDJsp&Signature=c8ahH5BEyjEIw2wP0FmXebjNORo%3D

AISHELL (希尔贝壳): http://www.openslr.org/33/

Tacotron2

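https://github.com/JasonWei512/Tacotron-2-Chinese # A pretrained model (100k steps) can be downloaded directly here; if you use it, skip to step 6

Download http://github.com/JasonWei512/Tacotron-2-Chinese/archive/mandarin-biaobei.zip

1. Unzip Tacotron-2-mandarin-mel.zip

2. Extract the Biaobei dataset into the root directory of Tacotron-2-mandarin-mel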

Tacotron-2-mandarin-mel
	|- BZNSYP
		|- PhoneLabeling
		|- ProsodyLabeling
		|- Wave

3. Use ffmpeg to lower the sample rate of the wav files in /BZNSYP/Wave/ to 36 kHz:

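import os
import subprocess

input_path = r"D:\tensorflow\Tacotron-2-mandarin-mel\Tacotron-2-mandarin-mel\BZNSYP\Wave"
output_path = r"D:\tensorflow\Tacotron-2-mandarin-mel\Tacotron-2-mandarin-mel\BZNSYP\Wave2"
os.makedirs(output_path, exist_ok=True)  # ffmpeg does not create the output folder itself
for file in os.listdir(input_path):
    if not file.lower().endswith(".wav"):
        continue  # skip anything that is not a wav file
    src = os.path.join(input_path, file)
    dst = os.path.join(output_path, file)
    # Passing the arguments as a list avoids quoting problems with spaces in paths
    subprocess.call(["ffmpeg", "-i", src, "-ar", "36000", dst])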
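After resampling, a quick check that every output file really is at 36 kHz can save a failed preprocessing run. This is a minimal sketch using Python's standard `wave` module; it assumes the converted files are plain PCM WAV (ffmpeg's default for .wav output), and the function names are illustrative rather than part of the Tacotron-2 repository.

```python
import os
import wave


def wav_sample_rate(path):
    """Return the sample rate in Hz stored in the header of a PCM WAV file."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def files_with_wrong_rate(folder, expected_hz=36000):
    """List the .wav files in `folder` whose sample rate differs from `expected_hz`."""
    wrong = []
    for name in sorted(os.listdir(folder)):
        if name.lower().endswith(".wav"):
            if wav_sample_rate(os.path.join(folder, name)) != expected_hz:
                wrong.append(name)
    return wrong
```

Calling `files_with_wrong_rate(output_path)` on the converted folder should return an empty list if every file was resampled.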

4. Preprocess the data

python preprocess.py --dataset='Biaobei'

5. Train

python train.py --model='Tacotron-2'

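6. Synthesize

If there is no WaveNet model and only the spectrogram prediction model is available, the speech is generated with Griffin-Lim alone and written to the /tacotron_output/logs-eval/wavs/ folder.

If a WaveNet model is available, the speech generated by WaveNet is written to /wavenet_output/wavs/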

python synthesize.py --model='Tacotron-2' --text_list='sentences.txt'
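For the use case in the Background section, the file passed to `--text_list` could be generated from a list of device events. The sketch below is only an illustration: the function names, the template, and the sample device and status strings are all hypothetical, and whether the model expects Chinese characters or pinyin depends on how the training text was preprocessed, so check the repository's own sample sentences first.

```python
def build_announcements(events, template="{device}{status}"):
    """Turn (device, status) pairs into one announcement sentence each."""
    return [template.format(device=device, status=status)
            for device, status in events]


def write_sentences(sentences, path="sentences.txt"):
    """Write one sentence per line, UTF-8 encoded."""
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(sentences) + "\n")


# Hypothetical sample events (placeholders, not from the original article).
SAMPLE_EVENTS = [("1号水泵", "温度过高"), ("2号风机", "转速异常")]
```

Calling `write_sentences(build_announcements(SAMPLE_EVENTS))` would then produce a sentences.txt with one line per event.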

Repository Structure:

Tacotron-2
├── datasets
├── en_UK		(0)
│   └── by_book
│       └── female
├── en_US		(0)
│   └── by_book
│       ├── female
│       └── male
├── LJSpeech-1.1	(0)
│   └── wavs
├── logs-Tacotron	(2)
│   ├── eval_-dir
│   │   ├── plots
│   │   └── wavs
│   ├── mel-spectrograms
│   ├── plots
│   ├── taco_pretrained
│   ├── metas
│   └── wavs
├── logs-Wavenet	(4)
│   ├── eval-dir
│   │   ├── plots
│   │   └── wavs
│   ├── plots
│   ├── wave_pretrained
│   ├── metas
│   └── wavs
├── logs-Tacotron-2	( * )
│   ├── eval-dir
│   │   ├── plots
│   │   └── wavs
│   ├── plots
│   ├── taco_pretrained
│   ├── wave_pretrained
│   ├── metas
│   └── wavs
├── papers
├── tacotron
│   ├── models
│   └── utils
├── tacotron_output	(3)
│   ├── eval
│   ├── gta
│   ├── logs-eval
│   │   ├── plots
│   │   └── wavs
│   └── natural
├── wavenet_output	(5)
│   ├── plots
│   └── wavs
├── training_data	(1)
│   ├── audio
│   ├── linear
│   └── mels
└── wavenet_vocoder
    └── models
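Given this layout and the two output folders named in the synthesis step, the generated audio can be collected with a few lines of Python. This is a minimal sketch; `find_generated_wavs` is an illustrative helper, not a function from the repository, and it assumes it is run with the repository root as `repo_root`.

```python
import os

# Output folders named in the synthesis step, relative to the repository root.
OUTPUT_DIRS = ("tacotron_output/logs-eval/wavs", "wavenet_output/wavs")


def find_generated_wavs(repo_root="."):
    """Return the paths of all .wav files found in the known output folders."""
    found = []
    for rel_dir in OUTPUT_DIRS:
        folder = os.path.join(repo_root, *rel_dir.split("/"))
        if not os.path.isdir(folder):
            continue  # e.g. no WaveNet model was used, so the folder is absent
        for name in sorted(os.listdir(folder)):
            if name.lower().endswith(".wav"):
                found.append(os.path.join(folder, name))
    return found
```

With only the spectrogram prediction model, the result contains just the Griffin-Lim files from tacotron_output.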

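Loading the Model with Flask: Synthesis via an API Call

Download https://gitee.com/mtllll/tacotron2-flask-server

Download the pretrained model and place it in the server root directory under logs-Tacotron-2

Run the server's app.py

Run the client's app.py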
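To call the server from your own code instead of the bundled client, an HTTP request along the following lines may work. This is a sketch under explicit assumptions: the route `/synthesize`, the JSON field `text`, and the WAV response body are all hypothetical, so check the server's app.py for the actual route and parameters. Port 5000 is Flask's default development port.

```python
import json
import urllib.request


def build_tts_request(text, base_url="http://127.0.0.1:5000", route="/synthesize"):
    """Build (but do not send) a JSON POST request carrying the text to speak.

    The route and the payload shape are assumptions, not the server's real API.
    """
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + route,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text, out_path="out.wav", **kwargs):
    """Send the request and save the response body, assumed to be WAV bytes."""
    with urllib.request.urlopen(build_tts_request(text, **kwargs)) as resp:
        audio = resp.read()
    with open(out_path, "wb") as f:
        f.write(audio)
    return out_path
```

Keeping request construction separate from sending makes it easy to adjust the route or payload once the real interface is known.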
