[Environment Setup] TransformerTTS Model: Converting Text to Speech

Background

A Text-to-Speech Transformer in TensorFlow 2

Model repository and documentation

https://github.com/as-ideas/TransformerTTS

Installation

1. Python 3.6 environment

conda create -n TTS36 python==3.6
conda activate TTS36

2. GitHub repository

git clone https://github.com/as-ideas/TransformerTTS.git

3. pip packages

cd TransformerTTS
pip install -r requirements.txt

Running the prediction script at this point fails with the following error:

(TTS36) user@ubuntu:~/model/model3/TransformerTTS$ python predict_tts.py -t "Please, say something."
2023-06-22 23:18:30.259622: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2023-06-22 23:18:30.259658: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
Traceback (most recent call last):
  File "predict_tts.py", line 7, in <module>
    from data.audio import Audio
  File "/home/user/model/model3/TransformerTTS/data/audio.py", line 6, in <module>
    import librosa.display
  File "/home/user/anaconda3/envs/TTS36/lib/python3.6/site-packages/librosa/display.py", line 23, in <module>
    from matplotlib.cm import get_cmap
  File "/home/user/anaconda3/envs/TTS36/lib/python3.6/site-packages/matplotlib/__init__.py", line 139, in <module>
    from . import cbook, rcsetup
  File "/home/user/anaconda3/envs/TTS36/lib/python3.6/site-packages/matplotlib/rcsetup.py", line 27, in <module>
    from matplotlib.fontconfig_pattern import parse_fontconfig_pattern
  File "/home/user/anaconda3/envs/TTS36/lib/python3.6/site-packages/matplotlib/fontconfig_pattern.py", line 18, in <module>
    from pyparsing import (Literal, ZeroOrMore, Optional, Regex, StringEnd,
  File "/home/user/anaconda3/envs/TTS36/lib/python3.6/site-packages/pyparsing/__init__.py", line 130, in <module>
    __version__ = __version_info__.__version__
AttributeError: 'version_info' object has no attribute '__version__'

The traceback ends inside pyparsing, so pin that package to an older version:

pip install pyparsing==2.4.7
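To confirm that the pin took effect before rerunning the script, a small check like the one below may help. It is a minimal sketch, not part of TransformerTTS: the helper names `installed_version` and `find_mismatches` are hypothetical, and on Python 3.6 it assumes the `importlib-metadata` backport is present in the environment.

```python
from typing import Callable, Dict, Optional, Tuple


def installed_version(name: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is missing."""
    try:
        # Standard library on Python 3.8 and newer
        from importlib.metadata import PackageNotFoundError, version
    except ImportError:
        # Python 3.6/3.7: fall back to the importlib-metadata backport
        from importlib_metadata import PackageNotFoundError, version
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def find_mismatches(
    pins: Dict[str, str],
    lookup: Callable[[str], Optional[str]] = installed_version,
) -> Dict[str, Tuple[str, Optional[str]]]:
    """Map each package whose installed version differs from its pin
    to a (wanted, found) pair; found is None if the package is missing."""
    mismatches = {}
    for name, wanted in pins.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    # The only pin taken from this walkthrough; extend the dict as needed.
    pins = {"pyparsing": "2.4.7"}
    for name, (wanted, found) in find_mismatches(pins).items():
        print(f"{name}: expected {wanted}, found {found}")
```

An empty result means every pinned package matches; any printed line names a package that still needs `pip install name==version`.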

For reference, these are the installed package versions at this point:

(TTS36) user@ubuntu:~/model/model3/TransformerTTS$ pip list
Package                 Version
----------------------- ---------
absl-py                 0.15.0
astunparse              1.6.3
attrs                   22.2.0
audioread               3.0.0
cached-property         1.5.2
cachetools              4.2.4
certifi                 2023.5.7
cffi                    1.15.1
charset-normalizer      2.0.12
clang                   5.0
clldutils               3.12.0
colorlog                6.7.0
csvw                    2.0.0
cycler                  0.11.0
Cython                  0.29.35
dataclasses             0.8
decorator               5.1.1
dill                    0.3.4
flatbuffers             1.12
gast                    0.4.0
google-auth             1.35.0
google-auth-oauthlib    0.4.6
google-pasta            0.2.0
grpcio                  1.48.2
h5py                    3.1.0
idna                    3.4
importlib-metadata      4.8.3
isodate                 0.6.1
joblib                  1.1.1
keras                   2.6.0
Keras-Preprocessing     1.1.2
kiwisolver              1.3.1
librosa                 0.7.1
llvmlite                0.31.0
Markdown                3.3.7
matplotlib              3.2.2
multiprocess            0.70.12.2
numba                   0.48.0
numpy                   1.19.5
oauthlib                3.2.2
opt-einsum              3.3.0
p-tqdm                  1.3.3
pathos                  0.2.8
phonemizer              2.2.2
pip                     21.3.1
pox                     0.3.0
ppft                    1.6.6.4
protobuf                3.19.6
pyasn1                  0.5.0
pyasn1-modules          0.3.0
pycparser               2.21
pyparsing               2.4.7
python-dateutil         2.8.2
pyworld                 0.3.3
regex                   2023.6.3
requests                2.27.1
requests-oauthlib       1.3.1
resampy                 0.3.1
rfc3986                 1.5.0
rsa                     4.9
ruamel.yaml             0.17.32
ruamel.yaml.clib        0.2.7
scikit-learn            0.24.2
scipy                   1.5.4
segments                2.2.1
setuptools              59.6.0
six                     1.15.0
soundfile               0.12.1
tabulate                0.8.10
tensorboard             2.6.0
tensorboard-data-server 0.6.1
tensorboard-plugin-wit  1.8.1
tensorflow              2.6.2
tensorflow-estimator    2.6.0
termcolor               1.1.0
threadpoolctl           3.1.0
tqdm                    4.40.1
typing-extensions       3.7.4.3
uritemplate             4.1.1
urllib3                 1.26.16
webrtcvad               2.0.10
Werkzeug                2.0.3
wheel                   0.37.1
wrapt                   1.12.1
zipp                    3.6.0

4. Pretrained model

Download link

https://public-asai-dl-models.s3.eu-central-1.amazonaws.com/TransformerTTS/api_weights/bdf06b9_ljspeech/bdf06b9_ljspeech_step_90000.zip

Unzip the archive.

Run
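The download and unzip steps can also be scripted. The following is a minimal sketch using only the Python standard library; the function names and the `model` target directory are assumptions chosen for illustration, and only the URL comes from the link above.

```python
import urllib.request
import zipfile
from pathlib import Path

# URL of the pretrained LJSpeech weights, as given in this walkthrough
WEIGHTS_URL = (
    "https://public-asai-dl-models.s3.eu-central-1.amazonaws.com/"
    "TransformerTTS/api_weights/bdf06b9_ljspeech/"
    "bdf06b9_ljspeech_step_90000.zip"
)


def download(url: str, dest: Path) -> Path:
    """Download url to dest, skipping the transfer if dest already exists."""
    dest.parent.mkdir(parents=True, exist_ok=True)
    if not dest.exists():
        urllib.request.urlretrieve(url, str(dest))
    return dest


def extract(zip_path: Path, target_dir: Path) -> Path:
    """Unpack the archive into target_dir and return that directory."""
    target_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(str(zip_path)) as archive:
        archive.extractall(str(target_dir))
    return target_dir


# Example usage (performs a network download, so it is left commented out):
# archive = download(WEIGHTS_URL, Path("model") / "bdf06b9_ljspeech_step_90000.zip")
# print("Extracted to", extract(archive, Path("model")))
```

Whatever directory is used, the path passed to `-p` in the next step has to point at the extracted model folder.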

(TTS36) user@ubuntu:~/model/model3/TransformerTTS$ python predict_tts.py -t "Please, say something." -p /home/user/model/model3/TransformerTTS/model/mobel/bdf06b9_ljspeech_step_90000/bdf06b9_ljspeech_step_90000
2023-06-23 03:48:08.648046: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2023-06-23 03:48:08.648086: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
Loading model from /home/user/model/model3/TransformerTTS/model/mobel/bdf06b9_ljspeech_step_90000/bdf06b9_ljspeech_step_90000
2023-06-23 03:48:18.661162: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2023-06-23 03:48:18.661202: W tensorflow/stream_executor/cuda/cuda_driver.cc:269] failed call to cuInit: UNKNOWN ERROR (303)
2023-06-23 03:48:18.661224: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (ubuntu): /proc/driver/nvidia/version does not exist
2023-06-23 03:48:18.687299: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
WARNING: git_hash mismatch: bdf06b9(config) vs 3638055(local).
Output wav under outputs/custom_text
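The CUDA messages in the log above are warnings, and TensorFlow itself says they can be ignored on a machine without a GPU. For repeated runs, the call can be wrapped in a small script that also mutes TensorFlow's native INFO and WARNING output. This is a sketch under assumptions: the helper names are hypothetical, only the `-t` and `-p` flags come from the command shown above, and `TF_CPP_MIN_LOG_LEVEL` affects TensorFlow's C++ logging only, so Python-level messages such as the git_hash warning would still appear.

```python
import os
import subprocess
import sys


def build_command(text, model_path, script="predict_tts.py"):
    """Build the argument list for predict_tts.py with the -t and -p flags."""
    return [sys.executable, script, "-t", text, "-p", model_path]


def quiet_env(base=None):
    """Return a copy of the environment with TensorFlow's C++ logs reduced.

    TF_CPP_MIN_LOG_LEVEL: 0 = everything, 1 = hide INFO, 2 = hide INFO and WARNING.
    """
    env = dict(os.environ if base is None else base)
    env["TF_CPP_MIN_LOG_LEVEL"] = "2"
    return env


def synthesize(text, model_path, repo_dir="."):
    """Run the prediction script from the repository root.

    Raises subprocess.CalledProcessError if the script exits with an error.
    """
    subprocess.run(
        build_command(text, model_path),
        cwd=repo_dir,
        env=quiet_env(),
        check=True,
    )


# Example usage (the model path is a placeholder for your extracted weights):
# synthesize("Please, say something.", "/path/to/bdf06b9_ljspeech_step_90000")
```

As in the log above, the script reports that the generated wav is written under outputs/custom_text.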