Benchmarking Latent Consistency Models with Stable Diffusion on CPUs and H100.

Yuichiro Minato

2024/03/20 03:00

The Latent Consistency Model (LCM) extends earlier diffusion models by learning to predict the solution of the underlying ordinary differential equation directly, which reduces the number of diffusion steps required. This makes image generation significantly faster.

https://github.com/luosiallen/latent-consistency-model

It can also be run through Automatic1111, but I encountered an error there, so I'll run it directly from the console.

I'll prepare both a CPU and a GPU, and run the computation in float32.

*I have not yet sufficiently verified whether the CPU or the GPU is actually being used, so for now I treat this trial as a single CPU + GPU (H100) environment.

from diffusers import DiffusionPipeline
import torch

pipe = DiffusionPipeline.from_pretrained("SimianLuo/LCM_Dreamshaper_v7")

To save GPU memory, torch.float16 can be used, but it may compromise image quality.
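As a rough illustration of why float16 saves memory: each float32 value takes 4 bytes and each float16 value takes 2, so the weights alone shrink by half. The sketch below is only arithmetic; the parameter count is a hypothetical placeholder, not a measured figure for this model, and activations and other buffers are ignored.

```python
# Bytes used by one element of each dtype.
BYTES_PER_ELEMENT = {"float32": 4, "float16": 2}


def weights_size_gib(num_parameters, dtype):
    """Approximate size of the weights alone, in GiB."""
    return num_parameters * BYTES_PER_ELEMENT[dtype] / 1024**3


# Hypothetical parameter count, chosen only for illustration.
hypothetical_params = 1_000_000_000

for dtype in ("float32", "float16"):
    size = weights_size_gib(hypothetical_params, dtype)
    print(f"{dtype}: {size:.2f} GiB")
```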

pipe.to(torch_device="cuda", torch_dtype=torch.float32)
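Since I was not sure which device was really in use, one way to check after moving the pipeline is to look at where the parameters of its components live. This is a minimal sketch, assuming the pipeline exposes `unet`, `vae`, and `text_encoder` attributes that behave like PyTorch modules; `report_devices` is a hypothetical helper name, not part of diffusers.

```python
def report_devices(pipe, component_names=("unet", "vae", "text_encoder")):
    """Return {component name: device string} for components with parameters.

    The component names are assumptions about a Stable Diffusion style
    pipeline; components that are missing are skipped.
    """
    devices = {}
    for name in component_names:
        module = getattr(pipe, name, None)
        if module is None or not hasattr(module, "parameters"):
            continue
        # The device of the first parameter is taken as the module's device.
        first_param = next(iter(module.parameters()), None)
        if first_param is not None:
            devices[name] = str(first_param.device)
    return devices


# Usage with the pipeline from this post (not executed here):
# print(report_devices(pipe))
```

If every component reports a `cuda` device, the denoising steps are running on the GPU; a `cpu` entry would mean that component was never moved.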

I used a standard prompt. First, I'll try with 4 inference steps.

prompt = "portrait photo of a girl, photograph, highly detailed face, depth of field, moody light, golden hour, style by Dan Winters, Russell James, Steve McCurry, centered, extremely detailed, Nikon D850, award winning photography"

num_inference_steps = 4

images = pipe(prompt=prompt, num_inference_steps=num_inference_steps, guidance_scale=8.0, lcm_origin_steps=50, output_type="pil").images

images[0].save("output.png")

The progress bar reported about 1.52 seconds per step, or roughly 6 seconds for the 4 steps:

[00:06<00:00, 1.52s/it]
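The progress bar gives only a single reading, and the first call often includes one-time setup cost. For a steadier number, the call can be wrapped in a small timing helper. This is a sketch under my own assumptions: `benchmark` is a hypothetical helper, the warmup and run counts are arbitrary, and the stand-in workload exists only so the snippet runs without a GPU.

```python
import statistics
import time


def benchmark(fn, warmup=1, runs=3, sync=None):
    """Call fn() repeatedly and return the wall-clock time of each timed run.

    sync is an optional callable run before and after each timed call,
    for example torch.cuda.synchronize when measuring GPU work.
    """
    for _ in range(warmup):
        fn()  # warmup calls are not timed
    timings = []
    for _ in range(runs):
        if sync is not None:
            sync()
        start = time.perf_counter()
        fn()
        if sync is not None:
            sync()
        timings.append(time.perf_counter() - start)
    return timings


# Stand-in workload so the example runs anywhere.
timings = benchmark(lambda: sum(range(100_000)))
print(f"median: {statistics.median(timings):.4f} s over {len(timings)} runs")

# With the pipeline from this post, the call would look like (not executed here):
# timings = benchmark(
#     lambda: pipe(prompt=prompt, num_inference_steps=4),
#     sync=torch.cuda.synchronize,
# )
```

Reporting the median rather than the mean keeps one slow run from dominating the result.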