Benchmarking Stable Diffusion with a Gradio interface on the RTX 3060 and RTX 3090

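Compared with LLMs, image-generation AI gets by with relatively little GPU memory (VRAM). Here we use two consumer machines, one with an RTX 3060 (12 GB VRAM) and one with an RTX 3090 (24 GB VRAM), to look at everyday image-generation usage.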
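To put the VRAM claim in rough numbers, here is a back-of-the-envelope sketch. The parameter counts are approximate, commonly cited figures for Stable Diffusion 1.5 (not measured in this article), and the 7B LLM is a hypothetical comparison point. Only the weights are counted; activations and framework overhead come on top.

```python
# Rough fp16 weight-memory estimate. All parameter counts are approximate
# assumptions, not measurements from this article.
BYTES_PER_FP16_PARAM = 2

sd15_params = {
    "unet": 860e6,          # approximate
    "text_encoder": 123e6,  # approximate
    "vae": 84e6,            # approximate
}
llm_params = 7e9  # hypothetical 7B-parameter LLM, for comparison only


def weights_gb(num_params):
    """Memory needed for the weights alone, in GB (10^9 bytes)."""
    return num_params * BYTES_PER_FP16_PARAM / 1e9


sd15_gb = weights_gb(sum(sd15_params.values()))
llm_gb = weights_gb(llm_params)

print(f"Stable Diffusion 1.5 weights: about {sd15_gb:.1f} GB")
print(f"7B LLM weights:               about {llm_gb:.1f} GB")
```

Under these assumptions the Stable Diffusion weights fit comfortably in 12 GB, while the hypothetical 7B LLM in fp16 would not.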

The model used here is Realistic Vision V6.0, which is based on Stable Diffusion 1.5.

First, download the safetensors file to the local machine.
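Before loading a multi-gigabyte download, it can be worth checking that the file is intact. The following is a minimal sketch, using only the standard library, that reads the safetensors header (an 8-byte little-endian length followed by a JSON header) without loading any tensors. The helper names `read_safetensors_header` and `summarize` are made up for this illustration and are not part of any library.

```python
import json
import struct
from pathlib import Path


def read_safetensors_header(path):
    """Return the JSON header of a safetensors file without loading tensors."""
    with open(path, "rb") as f:
        raw_len = f.read(8)
        if len(raw_len) != 8:
            raise ValueError("file is too short to be a safetensors file")
        # The first 8 bytes hold the header length as a little-endian uint64.
        (header_len,) = struct.unpack("<Q", raw_len)
        header_bytes = f.read(header_len)
        if len(header_bytes) != header_len:
            raise ValueError("header is truncated; the download may be incomplete")
        return json.loads(header_bytes)


def summarize(path):
    """Report the file size and the number of tensors listed in the header."""
    header = read_safetensors_header(path)
    # "__metadata__" is an optional free-form entry, not a tensor.
    tensors = {k: v for k, v in header.items() if k != "__metadata__"}
    return {
        "size_gb": Path(path).stat().st_size / 1e9,
        "num_tensors": len(tensors),
    }


# Usage, assuming the file from this article is in the current directory:
# print(summarize("real.safetensors"))
```

A truncated or corrupted download usually fails here with a clear error instead of failing later inside the pipeline loader.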

Next, install diffusers and the other required libraries.

pip install --quiet diffusers transformers accelerate gradio

Here is the code. It loads the safetensors file and uses it to run inference.

from diffusers import StableDiffusionPipeline
import torch

# Load the single-file checkpoint in half precision and move it to the GPU.
pipe = StableDiffusionPipeline.from_single_file(
  "real.safetensors",
  load_safety_checker=True,
  extract_ema=True,
  torch_dtype=torch.float16
  ).to("cuda")

def txt2img(prompt):
  # Generate one image from the prompt using 20 denoising steps.
  return pipe(prompt, num_inference_steps=20).images[0]

We start with the RTX 3060.

Let's feed it an actual prompt and time the call.

import time

start = time.time()
txt2img("closeup face photo of man in black clothes, night city street, bokeh")
print(time.time()-start)

2.92753529548645

The call took about 2.9 seconds.
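A single timed call can be skewed by one-time warm-up costs, so a slightly more robust measurement repeats the call several times. The sketch below is one way to do that; `benchmark` is a hypothetical helper, not something the article used, and its `warmup` and `runs` defaults are arbitrary choices.

```python
import statistics
import time


def benchmark(fn, *args, warmup=1, runs=5):
    """Call fn(*args) repeatedly and return timing statistics in seconds."""
    # Warm-up calls are executed but not timed.
    for _ in range(warmup):
        fn(*args)

    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn(*args)
        timings.append(time.perf_counter() - start)

    return {
        "mean": statistics.mean(timings),
        "min": min(timings),
        "max": max(timings),
        "runs": runs,
    }


# Usage, assuming txt2img from this article is already defined:
# stats = benchmark(txt2img, "closeup face photo of man in black clothes")
# print(stats)
```

Because the pipeline call returns a finished image, the GPU work should already be complete when each timed call returns, so no extra synchronization is added here.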

To run it through Gradio, we reuse the function defined above.

import gradio as gr

app = gr.Interface(
  txt2img,
  "text",
  gr.Image(),
)

app.launch(share=True)
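Since this is a benchmark, it may be handy to see the generation time in the interface itself. The following is a sketch of one possible variant, not the setup used for the numbers in this article: `timed` and `build_app` are hypothetical helpers, and the component labels are arbitrary.

```python
import time


def timed(fn):
    """Wrap a prompt -> image function so it also returns elapsed seconds."""
    def wrapper(prompt):
        start = time.perf_counter()
        image = fn(prompt)
        return image, round(time.perf_counter() - start, 3)
    return wrapper


def build_app(fn):
    """Build a Gradio interface that shows the image and the time taken."""
    # Imported lazily so that `timed` can be used without Gradio installed.
    import gradio as gr

    return gr.Interface(
        fn=timed(fn),
        inputs=gr.Textbox(label="Prompt"),
        outputs=[gr.Image(label="Result"), gr.Number(label="Seconds")],
    )


# Usage, assuming txt2img from this article is already defined:
# build_app(txt2img).launch(share=True)
```

The measured time covers only the function call on the server, not the network transfer to the browser.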

That was very easy.

Next, we run the same file on the RTX 3090.

import time

start = time.time()
txt2img("closeup face photo of man in black clothes, night city street, bokeh")
print(time.time()-start)

1.0783894062042236

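That is about 2.7 times faster, close to a threefold speedup. For image generation, though, the difference did not feel as large in practice as the numbers suggest.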
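The ratio can be worked out directly from the two timings reported above. This small sketch also computes the absolute time saved per image, which is one possible reason the gap feels smaller than the ratio suggests (an interpretation, not something measured here).

```python
# Timings reported in this article, in seconds (one run each).
t_3060 = 2.92753529548645
t_3090 = 1.0783894062042236

speedup = t_3060 / t_3090
saved = t_3060 - t_3090

print(f"speedup: {speedup:.2f}x")               # speedup: 2.71x
print(f"time saved per image: {saved:.2f} s")   # time saved per image: 1.85 s
```

Both figures come from a single run on each GPU, so they should be read as indicative rather than precise.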


Yuichiro Minato