Trying photorealistic generative AI with blueqat cloud + Stable Diffusion + a fine-tuned model

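It seems that recent models can produce photorealistic images as well as illustrations. If this works well, it could be useful for things like advertising, since hiring a model and arranging a photo shoot takes a lot of effort.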

This time, instead of the Stable Diffusion 2.0 model I have been using so far, I will try a model that has been fine-tuned for a specific purpose.

https://huggingface.co/SG161222/Realistic_Vision_V1.4

Specifically, I will use Realistic Vision V1.4, the latest version at the time of writing, which generates photo-like images.

I also imported a few samplers this time.

from diffusers import StableDiffusionPipeline, DPMSolverMultistepScheduler, EulerDiscreteScheduler

I will create two pipelines and compare them.

import torch
from diffusers import StableDiffusionPipeline, DPMSolverMultistepScheduler, EulerDiscreteScheduler

#model_id0 = "CompVis/stable-diffusion-v1-4"
model_id1 = "stabilityai/stable-diffusion-2"  # standard Stable Diffusion 2
model_id2 = "SG161222/Realistic_Vision_V1.4"  # fine-tuned photorealistic model
use_auth_token = r'TOKEN'  # replace with your Hugging Face access token
device = "cuda"

# pipeline1: standard Stable Diffusion 2, loaded from the fp16 branch in half precision
pipeline1 = StableDiffusionPipeline.from_pretrained(model_id1, revision="fp16", torch_dtype=torch.float16, use_auth_token=use_auth_token)
pipeline1.to(device)

# pipeline2: Realistic Vision V1.4, with the scheduler replaced by DPMSolverMultistepScheduler
pipeline2 = StableDiffusionPipeline.from_pretrained(model_id2, use_auth_token=use_auth_token)
pipeline2.scheduler = DPMSolverMultistepScheduler.from_config(pipeline2.scheduler.config)
pipeline2.to(device)

pipeline1 uses the standard Stable Diffusion 2 model, and pipeline2 uses Realistic Vision V1.4.

For pipeline2, the one set up with Realistic Vision V1.4, I switched the sampler to DPMSolver (DPMSolverMultistepScheduler).
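The scheduler swap above follows a pattern that should work for other diffusers schedulers too: build the new scheduler from the current scheduler's config. Here is a minimal sketch of that pattern. `swap_scheduler` is a hypothetical helper name, not part of diffusers, and it assumes only that the pipeline exposes `.scheduler` with a `.config` and that the scheduler class has a `from_config` classmethod.

```python
def swap_scheduler(pipeline, scheduler_cls):
    """Replace pipeline.scheduler with scheduler_cls, reusing the existing config.

    Hypothetical helper for illustration. It relies only on the `from_config`
    classmethod that diffusers schedulers provide.
    """
    previous = pipeline.scheduler
    pipeline.scheduler = scheduler_cls.from_config(previous.config)
    return previous  # returned so the caller can restore it later


# Usage with the objects from this article (needs diffusers and a loaded pipeline):
#   swap_scheduler(pipeline2, EulerDiscreteScheduler)
#   swap_scheduler(pipeline2, DPMSolverMultistepScheduler)
```

Returning the previous scheduler makes it easy to try the imported EulerDiscreteScheduler and then switch back.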

The NSFW filter is triggered fairly often, so whenever an all-black image came out I regenerated it.
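As an alternative to checking for a black image, the pipeline output in diffusers also carries an `nsfw_content_detected` field with one flag per image. A minimal sketch, assuming the pipeline has a safety checker enabled (the field is None otherwise). `safe_images` is a hypothetical helper name.

```python
def safe_images(output):
    """Return only the images that the safety checker did not flag.

    Assumes `output` has `.images` and `.nsfw_content_detected`, like the
    Stable Diffusion pipeline output in diffusers. When no safety checker
    ran, `nsfw_content_detected` is None and all images are kept.
    """
    flags = output.nsfw_content_detected
    if flags is None:
        return list(output.images)
    return [img for img, flagged in zip(output.images, flags) if not flagged]


# Usage (hypothetical): keep generating while safe_images(pipeline2(prompt)) is empty.
```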

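import os
import random
from PIL import Image

steps = 16

# prompt and negative_prompt must be defined before this point (the prompts used are not listed in this article)

# Start from an all-black image so the loop body runs at least once
image = Image.new("L", (768, 768))

# getbbox() returns None for an all-black image, so loop until a non-black image is generated
while not image.getbbox():
    seed = random.randint(0, 9999999)
    generator = torch.Generator(device).manual_seed(seed)
    image = pipeline2(prompt, negative_prompt=negative_prompt, num_inference_steps=steps, generator=generator).images[0]

print(seed)
os.makedirs("outputs", exist_ok=True)  # image.save fails if the directory does not exist
image.save(f"outputs/{seed}.png")
image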

The seed is generated randomly, and once an image is generated successfully it is saved with the seed as its file name.
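The loop above keeps retrying until a non-black image appears, so it never ends if every attempt is filtered. Here is a minimal sketch of a bounded version. `generate_with_retries` is a hypothetical helper, and `generate` and `is_valid` are callables supplied by the caller, for example a wrapper around the pipeline call and a check such as `lambda img: img.getbbox() is not None`.

```python
import random


def generate_with_retries(generate, is_valid, max_tries=10, rng=random):
    """Call generate(seed) with fresh random seeds until is_valid(result) is true.

    Hypothetical helper for illustration. Raises RuntimeError instead of
    looping forever when every attempt fails.
    """
    for _ in range(max_tries):
        seed = rng.randint(0, 9999999)  # same seed range as the loop in this article
        result = generate(seed)
        if is_valid(result):
            return seed, result
    raise RuntimeError(f"no valid image after {max_tries} attempts")
```

The seed is returned together with the result, so it can still be used as the file name.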

For the prompt, I used the standard prompt I found on the Realistic Vision V1.4 user forum.

The number of steps is set to 16, and one image takes roughly 2 seconds to generate.
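To check a per-image time like this on your own machine, a small wall-clock wrapper is enough. This is a minimal sketch: `time_call` is a hypothetical helper, and the first call to a pipeline may be slower than later ones because of warm-up.

```python
import time


def time_call(fn, *args, **kwargs):
    """Run fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


# Usage with the objects from this article (hypothetical):
#   output, seconds = time_call(pipeline2, prompt, num_inference_steps=steps)
#   print(f"{seconds:.2f} s")
```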

The second image has been cropped a little, but the model consistently produces quite high-quality images.

By the way, when I generated with the same prompt using the standard Stable Diffusion 2...

...it came out like this, which shows just how powerful a purpose-built model with additional training can be.
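For a comparison like this, it helps to give every pipeline the same prompt, step count, and seed. Here is a minimal sketch. `run_same_settings` is a hypothetical helper, and `make_generator` is a caller-supplied factory, for example `lambda s: torch.Generator("cuda").manual_seed(s)`. The pipelines are assumed to accept the same keyword arguments used earlier in this article.

```python
def run_same_settings(pipelines, prompt, negative_prompt, seed, steps, make_generator):
    """Run each pipeline with identical settings and return {name: first image}.

    Hypothetical helper for illustration. A fresh generator is created per
    pipeline so that each one starts from the same seed.
    """
    results = {}
    for name, pipe in pipelines.items():
        generator = make_generator(seed)
        output = pipe(
            prompt,
            negative_prompt=negative_prompt,
            num_inference_steps=steps,
            generator=generator,
        )
        results[name] = output.images[0]
    return results


# Usage with the objects from this article (hypothetical):
#   images = run_same_settings(
#       {"sd2": pipeline1, "realistic_vision": pipeline2},
#       prompt, negative_prompt, seed=1234, steps=16,
#       make_generator=lambda s: torch.Generator("cuda").manual_seed(s),
#   )
```

Because the two models have different weights, the same seed does not produce the same composition. It only removes the seed as a source of variation between runs.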

With DreamBooth or LoRA you can build your own diffusion model, so purpose-specific generative AI like this looks set to develop further.

Using blueqat cloud made image generation easy.

Yuichiro Minato