For users that wish to make use of Docker or another container orchestration platform, see this document first.
For users operating on Windows 10 or newer, an installation guide based on Docker and WSL is available here this document.
Clone the SimpleTuner repository and set up the python venv:
git clone --branch=release https://github.com/bghira/SimpleTuner.git
cd SimpleTuner
# if python --version shows 3.11 you can just also use the 'python' command here.
python3.11 -m venv .venv
source .venv/bin/activate
pip install -U poetry pip
# Necessary on some systems to prevent it from deciding it knows better than us.
poetry config virtualenvs.create false
ℹ️ You can use your own custom venv path by setting
export VENV_PATH=/path/to/.venv
in yourconfig/config.env
file.
Note: We're currently installing the release
branch here; the main
branch may contain experimental features that might have better results or lower memory use.
Depending on your system, you will run one of 3 commands:
# MacOS
poetry install -C install/apple
# Linux
poetry install
# Linux with ROCM
poetry install -C install/rocm
Optionally, Hopper (or newer) equipment can make use of FlashAttention3 for improved inference and training performance when making use of torch.compile
You'll need to run the following sequence of commands from your SimpleTuner directory, with your venv active:
git clone https://github.com/Dao-AILab/flash-attention
pushd flash-attention
pushd hopper
python setup.py install
popd
popd
⚠️ Managing the flash_attn build is poorly-supported in SimpleTuner, currently. This can break on updates, requiring you to re-run this build procedure manually from time-to-time.
The following must be executed for an AMD MI300X to be useable:
apt install amd-smi-lib
pushd /opt/rocm/share/amd_smi
python3 -m pip install --upgrade pip
python3 -m pip install .
popd
- 2a. Option One (Recommended): Run
configure.py
- 2b. Option Two: Copy
config/config.json.example
toconfig/config.json
and then fill in the details.
⚠️ For users located in countries where Hugging Face Hub is not readily accessible, you should addHF_ENDPOINT=https://hf-mirror.com
to your~/.bashrc
or~/.zshrc
depending on which$SHELL
your system uses.
Note: For MultiGPU setup, you will have to set all of these variables in config/config.env
TRAINING_NUM_PROCESSES=1
TRAINING_NUM_MACHINES=1
TRAINING_DYNAMO_BACKEND='no'
# this is auto-detected, and not necessary. but can be set explicitly.
CONFIG_BACKEND='json'
Any missing values from your user config will fallback to the defaults.
- If you are using
--report_to='wandb'
(the default), the following will help you report your statistics:
wandb login
Follow the instructions that are printed, to locate your API key and configure it.
Once that is done, any of your training sessions and validation data will be available on Weights & Biases.
ℹ️ If you would like to disable Weights & Biases or Tensorboard reporting entirely, use
--report-to=none
- Launch the
train.sh
script; logs will be written todebug.log
./train.sh
⚠️ At this point, if you usedconfigure.py
, you are done! If not - these commands will work, but further configuration is required. See the tutorial for more information.
To run unit tests to ensure that installation has completed successfully:
poetry run python -m unittest discover tests/
For users who train multiple models or need to quickly switch between different datasets or settings, two environment variables are inspected at startup.
To use them:
env ENV=default CONFIG_BACKEND=env bash train.sh
ENV
will default todefault
, which points to the typicalSimpleTuner/config/
directory that this guide helped you configure- Using
ENV=pixart ./train.sh
would useSimpleTuner/config/pixart
directory to findconfig.env
- Using
CONFIG_BACKEND
will default toenv
, which uses the typicalconfig.env
file this guide helped you configure- Supported options:
env
,json
,toml
, orcmd
if you rely on runningtrain.py
manually - Using
CONFIG_BACKEND=json ./train.sh
would search forSimpleTuner/config/config.json
instead ofconfig.env
- Similarly,
CONFIG_BACKEND=toml
will useconfig.env
- Supported options:
You can create config/config.env
that contains one or both of these values:
ENV=default
CONFIG_BACKEND=json
They will be remembered upon subsequent runs. Note that these can be added in addition to the multiGPU options described above.