AIwaifu is an open-source, fine-tunable, customizable, simpable AI waifu inspired by Neuro-sama.
The goal is simply to give everyone a foundational platform to develop their own waifu.
Powered by open-source AI models for self-hosted deployment.
To see what our waifu looks like, take a look at this video!
Open-sourced, talkable, flirtable, streamable, modifiable, fine-tunable, and even LEWDABLE! AIwaifu!!! What more can you ask for, huh?
- Inspired by Neuro-sama
YOUR ONE AND ONLY WAIFU (if you bring your own dataset or customize the personality)
- 12 GB of RAM or more
- Acceptable runtime tested on an i7-7700K (1.00-2.30 min); a faster CPU will do better
- Minimum 8 GB of VRAM required
- Will use at least 7.2 GB of VRAM
- Nvidia GPUs only
- Very fast on a K80 (tested); a faster or equivalent GPU will be pretty fast too
- Make sure you have Python 3.8.x installed, more than 10 GB of free storage, and a decent internet connection (for downloading model weights)
- Make sure you have the C/C++ build tools and CMake installed (if not, follow this issue)
- Make sure you have Git LFS installed to handle large file downloads in git
- Clone the repo & install packages
git clone https://github.com/HRNPH/AIwaifu.git
cd ./AIwaifu
- Install Poetry
- Install dependencies (assuming you're in the root of the AIwaifu repo)
poetry install
poetry shell # activate the virtual environment (you need to do this every time you use this project)
# may contain some bloated packages (since I didn't clean the requirements YET)
# so I recommend installing this in a venv
# ---- optional -----
python -m venv venv
./venv/Scripts/activate # for windows
# source ./venv/bin/activate # for linux
# --------------------
# you need to uninstall the websocket module and install websocket-client (already included in the requirements) for it to work
pip uninstall websocket
pip install -r ./requirements.txt
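# optional sanity check (an assumption on our part, not an official step):
# create_connection comes from websocket-client, so this should fail
# if the conflicting `websocket` package is still shadowing it
python -c "from websocket import create_connection; print('websocket-client OK')"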
# You need to build the monotonic_align module for Sovits to work
cd AIVoifu/voice_conversion/Sovits/monotonic_align
python setup.py build_ext --inplace && cd ../../../../
- Download and start VTube Studio (just download it from Steam)
- Install the VTS Desktop Audio plugin by Lua Lucky (CONSIDER SUBSCRIBING TO HER! She's a cute VTuber & developer), then open it and connect it to VTube Studio
- Just follow Lua Lucky's video. Open the plugin API on port 8001 in the app settings (or any other port you want, but then you'll need to modify the code); a quick reachability check is sketched below
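If you want to verify the plugin API is reachable, here's a minimal sketch using websocket-client. It assumes the default port 8001 and follows the public VTube Studio API message shape; an APIStateRequest needs no authentication.

# minimal reachability check for the VTube Studio plugin API
import json
from websocket import create_connection  # from the websocket-client package

ws = create_connection("ws://localhost:8001")
ws.send(json.dumps({
    "apiName": "VTubeStudioPublicAPI",
    "apiVersion": "1.0",
    "requestID": "ping",
    "messageType": "APIStateRequest",  # unauthenticated state request
}))
print(ws.recv())  # VTS answers with an APIStateResponse JSON message
ws.close()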
- Start the server (on a home server in your local network, or on your own computer; 12 GB of RAM is the minimum recommendation)
The software is split into an HTTP server for model inference and a client, since the models take a lot of RAM (RAM, not VRAM): > 12 GB required, >= 16 GB recommended. That's why inference can live on a separate home server.
# this runs on localhost:8267 by default
python ./api_inference_server.py
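Once the server is up, you can also poke it by hand. Here's a rough sketch with requests; the /chat route and payload fields below are hypothetical, check api_inference_server.py for the real endpoint.

# hypothetical client sketch: route and payload names are assumptions,
# see api_inference_server.py for the actual API
import requests

resp = requests.post(
    "http://localhost:8267/chat",  # hypothetical route on the default port
    json={"message": "Hi, how are you today?"},  # hypothetical payload
    timeout=300,  # CPU inference can take minutes (see the hardware notes above)
)
print(resp.json())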
- Start the client
# this will connect to all the servers (locally)
# it's possible to host the inference API on an external server, just beware of security issues
# I'm planning to make a Docker container for hosting inference on a cloud provider, but not soon
python ./main.py
- Open VTube Studio (VTS) and allow plugin access
- The current TTS model is a pretrained VITS model from https://huggingface.co/docs/hub/spaces-config-reference (this may change later for more customizable options)
- The language model we're using is Pygmalion 1.3B
- The reason the TTS is in Japanese is because it's cuter!!!! We translate the model outputs from English to Japanese using Facebook's nllb-600m model; a rough sketch of that step is below
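The translation step looks roughly like this with Hugging Face transformers. This is a sketch: the exact checkpoint, facebook/nllb-200-distilled-600M, is our guess for the "600m" model mentioned above, and the repo may pin a different variant.

# sketch of the English -> Japanese translation step
from transformers import pipeline

translator = pipeline(
    "translation",
    model="facebook/nllb-200-distilled-600M",  # assumed checkpoint
    src_lang="eng_Latn",  # NLLB language code for English
    tgt_lang="jpn_Jpan",  # NLLB language code for Japanese
)
print(translator("I missed you so much!")[0]["translation_text"])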
Sometimes shit can break (especially on the server side). If you happen to find what's broken, feel free to open an issue or a pull request!!!
- Everything We Made Is Open-Sourced, Free & Customizable To The Very Core
- We'll never include proprietary models (ChatGPT, etc.) since that would conflict with what we stated above, and they censor lewd stuff. So, no lol
Take a look at the issues & feel free to ask questions, suggest new features, or test our model to the limit! Join the discussion and comment on model performance here: #25
All help will be appreciated :DD