Daily Image Generation AI (September 24, 2022)

September 27, 2022, 12:16

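The world of image generation AI is evolving at a pace that is hard to keep up with.
With the release of DALL·E 2 and Midjourney, and with Stable Diffusion published as open source, the pace keeps accelerating and things change at an extraordinary speed every day.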

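For people who have no time and cannot keep up with the news at all, I go through all kinds of media every day and summarize industry changes, new forms of expression, ideas, issues, and technology.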

Due to various circumstances, posts are running a little late. Sorry about that.
I will get back on schedule somehow.

Posts up to yesterday are here

Development

NovelAI's improvements to Stable Diffusion

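A service called NovelAI appears to be customizing Stable Diffusion to build its own model, and it is gradually gaining attention. According to the information released on this day, the AI can now focus on the intent of the prompt regardless of the aspect ratio in use. Hands also appear to be improved.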

Like Waifu Diffusion, it is trained on Danbooru, and the quality for anime-style images is quite high. The kind of distortion seen in Waifu Diffusion is almost entirely absent.

Updates on #NovelAI's latest image generation development: Our own model of #stablediffusion now knows how to resist some of the occasional strange cropping behaviors.

This helps the AI zero in on the intent of your prompt, regardless of the aspect ratios you're working with. pic.twitter.com/jN6T6KMQsK
— NovelAI (@novelaiofficial) September 23, 2022

This is what too many A100s do to a man. Yes these are generated images. @novelaiofficial pic.twitter.com/UMmzDG1YHP
— Eren Doğan (@kurumuz) September 20, 2022

Just added "red hair" to the prompt. Our model is too OP fr fr pic.twitter.com/pVLpPz732Z
— Eren Doğan (@kurumuz) September 20, 2022

Hands are almost there, it's getting it pic.twitter.com/TuaU4dK4Oq
— Eren Doğan (@kurumuz) September 20, 2022

As mentioned previously, our new Image Generation Model further improves what we've learned from our Anime Module.

Who else needs an AI Generated Madoka Magica x Evangelion collab in their life? pic.twitter.com/ntZ6SFfXmr
— NovelAI (@novelaiofficial) September 22, 2022

MyAi.art in development

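Altryne is building a service called MyAi.art. It appears to be an app that works together with a Chrome extension: right-click selected text in the browser and choose from the menu to send those words straight to Stable Diffusion, or right-click an image to recover a prompt from it.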

I'm working on a tool for #promptEngineer #aiart 👀
This will blow your mind, you'll be able to store prompts and generations from Dalle2, DreamStudio and MJ (from browser)
If you want early access, signup here: https://t.co/GRTC2Nw4cJ https://t.co/gb2pxszMLF pic.twitter.com/S8iJWykUH6
— Altryne - dreaming with #stableDiffusion (@altryne) September 23, 2022

Here are the details of the reverse prompt lookup. From the right-click menu on an image, you can apparently send it to the WebUI, to CLIP Interrogator, or to MyAI.art.
This looks really handy.

⭐️ CLIP interrogator will help you "analyze" any picture and spit out a prompt
This is a POC (so it injects the image to Replicate, but this will be baked into my service very soon)
Right click on Img -> Prompt with Interrogate -> img2img generation pipeline is 🔥🔥🔥 pic.twitter.com/ZZmn0HwIKu
— Altryne - dreaming with #stableDiffusion (@altryne) September 23, 2022

Speeding up Stable Diffusion by up to 50% with Flash Attention

By replacing most of the cross-attention operations in the U-Net with flash attention, they reportedly achieved a speedup of close to 50% on an A6000.

Speed up stable diffusion by ~50% using flash attention

📝Annotated implementation: https://t.co/steDj7jdHQ
🖥 Github: https://t.co/GSSFZ93p9P

We got close to 50% speedup on A6000 by replacing most of cross attention operations in the U-Net with flash attention

🧶👇 pic.twitter.com/DrLYmyB9Do
— labml.ai (@labmlai) September 24, 2022
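To give a feel for what flash attention changes, here is a minimal pure-Python sketch of the block-wise "online softmax" idea it builds on. This is not labml.ai's implementation, and the real thing is a fused GPU kernel; the function names and the single-query setup are assumptions made for illustration only. The point is that keys and values can be visited one block at a time while keeping just a running maximum, a running normalizer, and a running output, so the full row of attention scores never has to be stored.

```python
import math


def naive_attention(q, keys, values):
    """Reference version: compute every score first, then softmax and sum."""
    scale = 1.0 / math.sqrt(len(q))
    scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) / total
            for i in range(dim)]


def tiled_attention(q, keys, values, block_size=2):
    """Same result, but keys/values are processed block by block."""
    scale = 1.0 / math.sqrt(len(q))
    dim = len(values[0])
    running_max = float("-inf")  # largest score seen so far
    running_sum = 0.0            # softmax normalizer under running_max
    acc = [0.0] * dim            # unnormalized weighted sum of values
    for start in range(0, len(keys), block_size):
        k_block = keys[start:start + block_size]
        v_block = values[start:start + block_size]
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in k_block]
        new_max = max(running_max, max(scores))
        # Rescale what was accumulated under the previous maximum.
        correction = math.exp(running_max - new_max)
        running_sum *= correction
        acc = [a * correction for a in acc]
        for s, v in zip(scores, v_block):
            w = math.exp(s - new_max)
            running_sum += w
            acc = [a + w * x for a, x in zip(acc, v)]
        running_max = new_max
    return [a / running_sum for a in acc]
```

Both functions return the same vector up to floating-point error for any block size. Any speedup on a real GPU comes from keeping each block in fast on-chip memory, which this sketch does not model.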

Stable Diffusion UI (cmdr2) v2.17 released

Stable Diffusion UI (cmdr2), which can be installed with one click on Windows and Linux, has been updated. Mac support is reportedly coming soon. The update includes the following.

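1. Additional samplers for text2image
2. Inpainting and masking
3. Live preview: watch the image come to life while the AI is painting it
4. Progress bar
5. Many improvements to reduce memory usage
6. A cleaner UI with a larger area for images
7. The SD fork in use has been updated to the latest version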

Looks quite handy.

It includes image modifiers, which looks convenient.

UnstableFusion released

UnstableFusion, a Stable Diffusion desktop app with inpainting, img2img, and more, has been released. (The name may be changed because it clashes with Unstable Diffusion, which specializes in adult content.) It appears to support Windows and Linux.

generrated.com

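A site that catalogs more than 7,000 images generated with DALL·E along with their classifications. It is really good, yet surprisingly little known.
It feels like a DALL·E version of promptmania, although it is not a prompt builder.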

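I did not know about this site, but found it through someone's tweet on this day. It seems to have been public since September 13. It was reportedly created by Davey Barker, a creative technologist in London.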

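You can browse a wide variety of examples. Overseas, this kind of thing is usually compiled in a Google Sheets spreadsheet that is really hard to read, so this is a big help.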

The creator's tweet

I finally created something and made it live!

Here's 7,000+ images I created with DALL•E 2 to act as a reference/inspiration resource for your prompt writing.https://t.co/M9MktSiCmW
— Davey Barker (@dvyio) September 13, 2022

Expression

aiplague's work is attracting attention

It was reportedly generated at 1280x720. The quality is high. According to aiplague, size matters as much as the prompt. Output at this level is said to be possible only on an A100 GPU.

Warpfusion by @devdef (thank you very much)#Stablediffusion #wrpfusion #AIart
Full 4K Version YouTube Link 👇 pic.twitter.com/g4N7Lw2QTs
— aiplague (@aiplague) September 23, 2022
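As a rough back-of-the-envelope illustration of why resolution is so demanding, the sketch below counts latent positions at 512x512 versus 1280x720 and compares the size of a single full self-attention score matrix. It assumes the standard 8x downsampling of the Stable Diffusion autoencoder and a plain (not memory-efficient) attention implementation. The helper names are hypothetical, and this is only one of several factors behind real VRAM usage, not a statement about aiplague's actual setup.

```python
def latent_tokens(width, height, downsample=8):
    """Spatial positions in the latent, assuming an 8x autoencoder downsample."""
    return (width // downsample) * (height // downsample)


def attention_matrix_ratio(width, height, base=(512, 512)):
    """How much larger one full self-attention score matrix is than at `base`.

    The score matrix has tokens x tokens entries, so it grows quadratically.
    """
    tokens = latent_tokens(width, height)
    base_tokens = latent_tokens(*base)
    return (tokens / base_tokens) ** 2


print(latent_tokens(512, 512))                       # 4096
print(latent_tokens(1280, 720))                      # 14400
print(round(attention_matrix_ratio(1280, 720), 1))   # 12.4
```

Under these assumptions, going from 512x512 to 1280x720 gives about 3.5 times as many latent positions but roughly 12 times as many attention scores at the highest-resolution layer.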

Stable Diffusion × physics simulation (openFrameworks)

Someone is using img2img to turn simple physics simulation footage into apples. It was an interesting form of expression, so I am noting it here.

https://www.reddit.com/r/StableDiffusion/comments/xmwcvq/apple_rendering_system/

A dog hatching from its egg

Generated with DALL·E 2. It was getting attention on Reddit and I found it interesting, so I am noting it here.

https://www.reddit.com/r/dalle2/comments/xmfc4l/a_dog_hatching_from_its_egg/

Good AI animations from this day

'Drowning'
currently trying to keep my head above the water in the vast merciless sea of technology.. pic.twitter.com/syZ0kAYwpm
— Glenn Marshall (@GlennIsZen) September 24, 2022

🔊#aiart #stablediffusion pic.twitter.com/T80B4oOZFw
— Infinite Vibes (@Infinite__Vibes) September 24, 2022

Research and verification

More and more Dreambooth tests

Posts testing Dreambooth have been appearing one after another lately.

Since it's AI, I tried training it on the boss (Oyabun) (╹◡╹)#stablediffusion #WaifuDiffusion #Dreambooth pic.twitter.com/iT1vTok1xo
— LangLang (@Langx02) September 24, 2022

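So here is an AI Maou-sama trained with Dreambooth for 4000 steps
It is amazing that txt2img can get this far with no retouching
It seems able to change the hair color, make her look a little more grown-up, and do the Nendoroid style that was often tried with SD#stablediffusion #Dreambooth #しもべさん pic.twitter.com/ejOFQMuUHQ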
— Qlion (@QlionEw) September 24, 2022

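Regarding running Dreambooth on Stable Diffusion on an RTX 3090 with 24 GB of VRAM: it worked just by changing ddp to gpu on line 611 of main.py.
trainer_config["accelerator"] = "gpu" # "ddp"
This is apparently the setting for multi-GPU distributed training, so when running on a single GPU, no distribution (= gpu) seems to be fine.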
— Kohya S. (@kohya_ss) September 23, 2022
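The change described in the tweet above amounts to choosing the accelerator from the number of GPUs. A minimal sketch of that choice follows. `pick_accelerator` is a hypothetical helper that is not part of the Dreambooth repository, and the two string values are simply the ones quoted in the tweet.

```python
def pick_accelerator(num_gpus):
    """Return a value for trainer_config["accelerator"] (hypothetical helper).

    "ddp" is the distributed multi-GPU setting mentioned in the tweet;
    "gpu" is the single-GPU replacement that reportedly worked on one RTX 3090.
    """
    if num_gpus < 1:
        raise ValueError("at least one GPU is required")
    return "gpu" if num_gpus == 1 else "ddp"


# Mirrors the one-line edit from the tweet for a single-GPU machine.
trainer_config = {"accelerator": "ddp"}
trainer_config["accelerator"] = pick_accelerator(num_gpus=1)
```

Editing the line by hand as in the tweet is just as good; the helper only makes the single-GPU versus multi-GPU reasoning explicit.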

The output is far more stable than with Textual Inversion. pic.twitter.com/WNMM06qDfn
— Kohya S. (@kohya_ss) September 23, 2022

Can Stable Diffusion produce Midjourney-like results too?

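There is talk that Midjourney probably has some secret-sauce prompts set behind the scenes. Someone in this Reddit thread has reportedly researched and found them. Using the prompt below is said to give somewhat similar results.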

prompt:
magnificent, elegant, beautiful, dynamic lighting, killian eng, ocellus, theme park, fantastical, light dust, elegant, diffuse, grimmer, intricate, light dust, orange and teal contrast volumetric lighting, triadic colors, and perhaps the most powerful, splash art.

Settings: k_euler_a, 30 steps, CFG 6-9

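In the end, though, this alone will not give the same results. Emad has reportedly confirmed that Midjourney puts a tuned model on top of an SD base and applies pre-processing and post-processing.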
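If you want to try the prompt and settings above systematically, one option is to generate one run per CFG value in the 6-9 range. The sketch below only builds the list of settings. The dictionary keys and the `settings_grid` name are assumptions for illustration and would need to be mapped onto whatever Stable Diffusion front end you actually use.

```python
def settings_grid(prompt, cfg_values=(6, 7, 8, 9), sampler="k_euler_a", steps=30):
    """Build one settings dict per CFG value (key names are hypothetical)."""
    return [
        {"prompt": prompt, "sampler": sampler, "steps": steps, "cfg_scale": cfg}
        for cfg in cfg_values
    ]


# A shortened subset of the style keywords quoted above, used as an example.
style = "magnificent, elegant, beautiful, dynamic lighting, splash art"
for run in settings_grid(style):
    print(run["sampler"], run["steps"], run["cfg_scale"])
```

Comparing the four outputs side by side makes it easier to see how much the CFG value alone shifts the look.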

How to create an image like this from photos taken with a smartphone

First composite the photos, then run img2img with the following prompt.

post apocalyptic monster mecha truck with machingun on top, digital art, concept art, dusty environment, foggy, dirty group, 3/4 view, greg rustkowski style

From there, the author reportedly repeats prompt changes and touch-ups with a Photoshop brush until satisfied.

https://www.reddit.com/r/StableDiffusion/comments/xlun11/i_combined_these_two_pictures_i_took_with_my/

Speeding up the startup of Stable Diffusion (WebUI by AUTOMATIC1111)

Ideas and movements

"How AI images are disrupting the art world"

Someone was painting with AI art as the underdrawing.

Been wanting to try this ever since I started seeing @midjourney images - doing paintings based on AI art. Not going to share said art because it will only make my painting look juvenile af in comparison 😅 but the prompt involved whale skeletons & bioluminescence. #midjourney pic.twitter.com/432szFeqno
— rupa d. (@dasRupa) September 23, 2022

In closing

On Twitter, I post things I make every day, the latest news, and experiments. I would be happy if you followed me.

Tweets by Yamkaz

My image generation AI experiments and roundups of the latest news are here

The previous issue is here

The next issue is here

I would be delighted to receive your support. I love reading books, so anything I receive will go toward them.
