(VALL-E X) Using it without the Web UI / Reading a text file and generating multiple audio files in one batch

March 27, 2024 / March 28, 2024

Installation

See: (VALL-E X) How to install on a Windows PC an AI that generates speech from text in any voice you like, using little time and little data

@echo off
call %~dp0\scripts\env_for_icons.bat  %*
SET PATH=%PATH%;%WINPYDIRBASE%\PortableGit;%WINPYDIRBASE%\PortableGit\bin
SET PATH=%PATH%;%WINPYDIRBASE%\ffmpeg\bin
If not exist %WINPYDIRBASE%\content mkdir  %WINPYDIRBASE%\content 

set APP_NAME=VALL-E-X
set APP_DIR=%WINPYDIRBASE%\content\%APP_NAME%
echo %APP_DIR%
cd /d "%APP_DIR%"
if not defined VENV_DIR (set "VENV_DIR=%APP_DIR%\venv")
if EXIST %VENV_DIR% goto :activate_venv

::python.exe -m venv "%VENV_DIR%" 
python.exe -m venv "%VENV_DIR%" --system-site-packages 
if %ERRORLEVEL% == 0 goto :activate_venv
echo Unable to create venv 
goto :skip_venv

:activate_venv
call "%VENV_DIR%\Scripts\activate"
::If  exist %WINPYDIRBASE%\content\%APP_NAME%\checkpoints goto :skip_cmd
"D:\WinPython\Spyder.exe"
cmd.exe /k
goto :skip_venv
:skip_cmd
python -X utf8 launch-ui.py
start http://127.0.0.1:7860/
:skip_venv

This batch file launches Spyder inside the virtual environment.
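To confirm that Spyder really picked up the virtual environment, a small check like the one below can be run from its console. This is only a sketch: the helper name `in_virtualenv` is made up for this example, and it relies solely on the standard `sys` module.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a venv."""
    # Inside a venv, sys.prefix points at the venv folder while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("executable :", sys.executable)
    print("prefix     :", sys.prefix)
    print("inside venv:", in_virtualenv())
```

If the last line prints `False`, the console is probably running on the base WinPython interpreter instead of the `venv` folder created by the batch file.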

Save the following code as "D:\WinPython\content\VALL-E-X\vallex_generation.py".

from scipy.io.wavfile import write as write_wav

from utils.generation import SAMPLE_RATE, generate_audio, preload_models
from utils.prompt_making import make_prompt

# create the "paimon" voice prompt from a reference recording
make_prompt(name="paimon", audio_prompt_path="recitation009.wav")

# download and load all models
preload_models()

# generate audio from text
text_prompt = """
     私は数日間休んで、友人や家族との時間を過ごすことができます。
"""
audio_array = generate_audio(text_prompt, prompt="paimon")

# save audio to disk
write_wav("vallex_generation.wav", SAMPLE_RATE, audio_array)

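With just a few lines of code you can register a voice you like from a sample recording, have it speak your text, and save the resulting audio.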

Creating the prompt produces the file "D:\WinPython\content\VALL-E-X\customs\paimon.npz".
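Since the prompt ends up as a file on disk, it should be possible to skip `make_prompt` on later runs when that file is already there. The sketch below only tests for the file. The helper names `prompt_file` and `needs_prompt` are hypothetical, and the `customs` folder location is taken from the path above, not from the VALL-E X documentation.

```python
from pathlib import Path


def prompt_file(name: str, app_dir: str = ".") -> Path:
    """Path where the prompt for `name` is expected (assumed layout: customs/<name>.npz)."""
    return Path(app_dir) / "customs" / f"{name}.npz"


def needs_prompt(name: str, app_dir: str = ".") -> bool:
    """Return True when no saved prompt file exists yet for `name`."""
    return not prompt_file(name, app_dir).is_file()


if __name__ == "__main__":
    # Run from the VALL-E-X folder so that "." is the application directory.
    print(prompt_file("paimon"), "missing" if needs_prompt("paimon") else "found")
```

In the script above this could be used as `if needs_prompt("paimon"): make_prompt(...)`, assuming the saved file is reused whenever `prompt="paimon"` is passed to `generate_audio`.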

Reading a text file and generating multiple audio files

import os

from scipy.io.wavfile import write as write_wav

from utils.generation import SAMPLE_RATE, generate_audio, preload_models
from utils.prompt_making import make_prompt


def generate_wav(text_line, filename):
    # synthesize one line of Japanese text with the "paimon" prompt and save it
    audio_array = generate_audio(text_line, language="ja", prompt="paimon")
    write_wav(filename, SAMPLE_RATE, audio_array)


def read_file(file_path):
    # read the input text file as UTF-8, one utterance per line
    with open(file_path, 'r', encoding='utf-8') as file:
        return file.readlines()


def main():
    wavname = 'vallex_generation.wav'  # recording used as the voice prompt
    file_path = "a.txt"  # fixed input file name
    output_dir = "dat2"  # fixed output folder
    os.makedirs(output_dir, exist_ok=True)  # write_wav does not create the folder
    preload_models()
    make_prompt(name="paimon", audio_prompt_path=wavname)
    lines = read_file(file_path)

    for index, line in enumerate(lines):
        line = line.strip()
        if line:  # skip blank lines
            output_filename = f"{output_dir}/{index:03d}.wav"  # fixed naming: 000.wav, 001.wav, ...
            generate_wav(line, output_filename)
            print(f"wav '{output_filename}' generated successfully!")


if __name__ == '__main__':
    main()
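One detail of the loop above is easy to miss: the file number comes from `enumerate`, so a blank line in the text file is skipped but still uses up a number. The dry-run sketch below shows which file each line would go to without loading any model. `plan_outputs` is a hypothetical helper written for this illustration and is not part of VALL-E X.

```python
from pathlib import Path


def plan_outputs(lines, out_dir="dat2"):
    """Pair each non-empty line with the WAV path the batch loop would use."""
    plan = []
    for index, line in enumerate(lines):
        text = line.strip()
        if text:  # blank lines are skipped but still consume an index
            wav_path = (Path(out_dir) / f"{index:03d}.wav").as_posix()
            plan.append((text, wav_path))
    return plan


if __name__ == "__main__":
    sample = ["おはようございます。\n", "\n", "今日はいい天気ですね。\n"]
    for text, wav_path in plan_outputs(sample):
        print(wav_path, text)
```

With the three sample lines, the second (blank) line is skipped, so the outputs are `dat2/000.wav` and `dat2/002.wav`. If consecutive numbering is preferred, the blank lines could be filtered out before calling `enumerate`.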

Python

Posted by eightban