Creating Shadow LoRAs with the Copy-Machine Learning Method

Hi, I'm Toriniku (とりにく), and I'm completely hooked on copy-machine LoRAs.
The other day, Matcha Monaka (抹茶もなか), a mutual follower of mine, posted a tweet that caught my attention, so I decided to try it out myself.

Data used

Based on the data above, I made images like the following.

Image by ももろみさん


Base image
First-shadow (1影) LoRA
Second-shadow (2影) LoRA
Highlight
Everything combined, no line art

Line art used for the verification

Verification

SD1.5

prompt:
monochrome, greyscale, animal ears, solo, tail, 1girl, necktie, closed eyes, pants, hat, shirt, footwear, pants, full body, white background, shoes, jacket, headwear, smile, simple background, hand in pocket, jacket, long hair, necktie, collared shirt, fox tail, loafers, long sleeves, low ponytail, fox girl, top hat, ponytail, vest, closed mouth, fox ears, formal, suit, official alternate costume, vest, bangs

Negative prompt: EasyNegative

Steps: 20, Sampler: Euler a, CFG scale: 7, Seed: 1404191844, Size: 448x576, Model hash: 1749adc5b4, Model: SDHK04, VAE hash: 538255c0d5, VAE: kl-f8-anime2.ckpt, ControlNet 0: "Module: lineart_standard (from white bg & black line), Model: control_v11p_sd15_lineart [43d4be0d], Weight: 1, Resize Mode: Crop and Resize, Low Vram: False, Processor Res: 512, Guidance Start: 0, Guidance End: 1, Pixel Perfect: True, Control Mode: Balanced, Save Detected Map: True", Lora hashes: "1kage: 0c917d990d51", TI hashes: "EasyNegative: c74b4e810b03", Version: v1.6.0-2-g4afaaf8a

First-shadow (1影) LoRA
Second-shadow (2影) LoRA
Highlight LoRA
Everything combined, no line art

Hmm... it's not bad, but the resolution isn't high enough, so fine details get crushed. I think that's just how SD1.5 is, so it can't really be helped.

SDXL

prompt:
monochrome, greyscale, animal ears, solo, tail, 1girl, necktie, closed eyes, pants, hat, shirt, footwear, pants, full body, white background, shoes, jacket, headwear, smile, simple background, hand in pocket, jacket, long hair, necktie, collared shirt, fox tail, loafers, long sleeves, low ponytail, fox girl, top hat, ponytail, vest, closed mouth, fox ears, formal, suit, official alternate costume, bangs SimplepositiveXLv1

Negative prompt:
negativeXL_D sdxl-negprompt8-v1m unaestheticXL_AYv1

Steps: 20, Sampler: Euler a, CFG scale: 7, Seed: 2883886449, Size: 896x1152, Model hash: 9025c79626, Model: CrazyXL, ControlNet 0: "Module: canny, Model: sai_xl_canny_256lora [566f20af], Weight: 1, Resize Mode: Crop and Resize, Low Vram: False, Processor Res: 512, Threshold A: 100, Threshold B: 200, Guidance Start: 0, Guidance End: 1, Pixel Perfect: True, Control Mode: Balanced, Save Detected Map: True", Lora hashes: "SDXL_kage: bc929267287d", TI hashes: "SimplepositiveXLv1: 049fb42b64c9, SimplepositiveXLv1: 049fb42b64c9, negativeXL_D: fff5d51ab655, sdxl-negprompt8-v1m: 24350b43a034, unaestheticXL_AYv1: 8a94b6725117, negativeXL_D: fff5d51ab655, sdxl-negprompt8-v1m: 24350b43a034, unaestheticXL_AYv1: 8a94b6725117", Version: v1.6.0-2-g4afaaf8a

First shadow (1影)
Second shadow (2影)
Highlight
Everything combined, no line art

It looked difficult to use the results above as they were, so I kept retouching and correcting the AI output by hand, and ended up with something like this.

Quick recipe

Training a single LoRA takes about 30 minutes for SD1.5 and about 70 minutes for SDXL.

git clone -b sdxl https://github.com/kohya-ss/sd-scripts.git
cd sd-scripts

python -m venv venv
.\venv\Scripts\activate

pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117
pip install -U xformers
pip install --upgrade -r requirements.txt
python -m pip install bitsandbytes==0.41.1 --prefer-binary --extra-index-url=https://jllllll.github.io/bitsandbytes-windows-webui 
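Before running accelerate config, it may be worth a quick sanity check that the CUDA build of PyTorch and xformers imported correctly; a minimal check, unrelated to sd-scripts itself:

# check_env.py -- run with: python check_env.py
import torch
import xformers

print(torch.__version__)          # should show a CUDA (+cu117) build
print(torch.cuda.is_available())  # should print True if the GPU is visible
print(xformers.__version__)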

accelerate config

- This machine
- No distributed training
- NO
- NO
- NO
- all
- fp16

The folder name can be anything (the repeat count can also be something arbitrary like 10000, since the number of training steps is specified explicitly); the expected layout is sketched below.
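For reference, when --train_data_dir is passed directly (no dataset definition in the toml), sd-scripts expects the images to sit in a subfolder named <repeats>_<name> under that directory. The layout below is only an illustration matching the paths used later in this article; the subfolder name 10000_base is an assumption:

C:\sd-scripts\user_config\1kage\base
└─ 10000_base
   ├─ 0001.png
   ├─ 0001.txt
   └─ ...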

Resize each image to 1024, 768, and 512 and save all three sizes. Give them rough captions.
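As a convenience, here is a rough Pillow sketch of that preprocessing step. It assumes "resize to 1024, 768, 512" means scaling the long side down to each of those sizes and that every image gets the same caption; the paths, file names, and caption text are all placeholders:

# resize_dataset.py -- rough sketch; assumes Pillow is installed (pip install pillow)
from pathlib import Path
from PIL import Image

SRC = Path(r"C:\work\shadow_src")                               # hypothetical source folder
DST = Path(r"C:\sd-scripts\user_config\1kage\base\10000_base")  # training subfolder
CAPTION = "monochrome, greyscale"                               # placeholder caption

DST.mkdir(parents=True, exist_ok=True)

for img_path in SRC.glob("*.png"):
    for long_side in (1024, 768, 512):
        img = Image.open(img_path)
        img.thumbnail((long_side, long_side))   # shrinks so the longer edge fits long_side
        out = DST / f"{img_path.stem}_{long_side}.png"
        img.save(out)
        out.with_suffix(".txt").write_text(CAPTION, encoding="utf-8")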

copi-ki.toml (adjust train_batch_size to suit your GPU)

pretrained_model_name_or_path = "C:/stable-diffusion-webui/models/Stable-diffusion/SDHK04.safetensors"
network_module = "networks.lora"
xformers = true
persistent_data_loader_workers = true
max_data_loader_n_workers = 12
enable_bucket = true
save_model_as = "safetensors"
lr_scheduler_num_cycles = 4
mixed_precision = "fp16"
learning_rate = 0.0001
resolution = "512,512"
train_batch_size = 12
network_dim = 128
network_alpha = 128
optimizer_type = "AdamW8bit"
bucket_no_upscale = true
clip_skip = 2
save_precision = "fp16"
lr_scheduler = "linear"
min_bucket_reso = 64
max_bucket_reso = 1024
caption_extension = ".txt"
seed = 42
network_train_unet_only = true

copi-ki_SDXL.toml (adjust train_batch_size to suit your GPU)

pretrained_model_name_or_path = "C:/stable-diffusion-webui/models/Stable-diffusion/blue_pencil-XL-v2.0.0.safetensors"
network_module = "networks.lora"
xformers = true
gradient_checkpointing = true
persistent_data_loader_workers = true
max_data_loader_n_workers = 12
enable_bucket = true
save_model_as = "safetensors"
lr_scheduler_num_cycles = 4
mixed_precision = "fp16"
learning_rate = 0.0001
resolution = "512,512"
train_batch_size = 12
network_dim = 64
network_alpha = 64
optimizer_type = "AdamW8bit"
bucket_no_upscale = true
save_precision = "fp16"
lr_scheduler = "linear"
min_bucket_reso = 64
max_bucket_reso = 1024
caption_extension = ".txt"
seed = 42
network_train_unet_only = true
no_half_vae = true
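Compared with the SD1.5 config, the SDXL one additionally enables gradient_checkpointing and no_half_vae, drops clip_skip, and halves network_dim/network_alpha to 64; presumably this is to keep VRAM usage in check (and avoid fp16 VAE issues) with the much larger SDXL model.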

Command list

# SD1.5 training
accelerate launch ^
  --num_cpu_threads_per_process 12 ^
  train_network.py ^
  --config_file="C:\sd-scripts\user_config\1kage\copi-ki.toml" ^
  --train_data_dir="C:\sd-scripts\user_config\1kage\base" ^
  --output_dir="C:\sd-scripts\user_config\output\1kage" ^
  --output_name=copi-ki-base ^
  --max_train_steps 1500

# SDXL training
accelerate launch ^
  --num_cpu_threads_per_process 12 ^
  sdxl_train_network.py ^
  --config_file="C:\sd-scripts\user_config\1kage\copi-ki_SDXL.toml" ^
  --train_data_dir="C:\sd-scripts\user_config\1kage\HL" ^
  --output_dir="C:\sd-scripts\user_config\output\1kage" ^
  --output_name=SDXL_copi-ki-HL ^
  --max_train_steps 1500

# Extract the style difference
python .\networks\merge_lora.py ^
--save_to "C:\sd-scripts\user_config\output\style\〇〇.safetensors" ^
--models "C:\sd-scripts\user_config\output\style\copi-ki-〇〇.safetensors" "C:\sd-scripts\user_config\output\style\copi-ki-base.safetensors" ^
--ratios 1.4 -1.4 ^
--concat ^
--shuffle ^
--save_precision fp16
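This difference extraction is the heart of the copy-machine method: with --ratios 1.4 -1.4, the merged LoRA approximates 1.4 × (trained LoRA) − 1.4 × (base copy-machine LoRA), so the behaviour the two have in common cancels out and only the learned change survives, amplified by 1.4. A purely conceptual numpy sketch of that per-weight arithmetic (the real script works on the factored down/up matrices, and --concat keeps the two LoRAs as separate low-rank blocks):

import numpy as np

# Hypothetical effective weight deltas of the two LoRAs
# (in reality each is factored into low-rank down/up matrices).
delta_base    = np.random.randn(4, 4) * 0.01                 # base "copy machine" LoRA
delta_trained = delta_base + np.random.randn(4, 4) * 0.002   # base + learned change

# merge_lora.py with --ratios 1.4 -1.4 effectively computes:
delta_diff = 1.4 * delta_trained - 1.4 * delta_base

# The shared component cancels; only the learned difference remains, scaled by 1.4.
print(np.allclose(delta_diff, 1.4 * (delta_trained - delta_base)))  # True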

# Resize the LoRA
python .\networks\svd_merge_lora.py ^
--save_to "C:\sd-scripts\user_config\output\style\〇〇_128.safetensors" ^
--models "C:\sd-scripts\user_config\output\style\〇〇.safetensors" ^
--ratios 1 ^
--new_rank 128 ^
--device cuda ^
--save_precision fp16
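One note on why this resize is here: if I read the options right, --concat in the previous step roughly doubles the LoRA's rank (the two source LoRAs are kept side by side), so this final SVD pass brings it back down to rank 128 and keeps the file size in line with the original network_dim.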
