LCM / Latent Consistency Models and LCM-LoRA benchmark

Latent Consistency Models (LCM) are an attempt to sharply reduce the number of denoising steps a diffusion model needs, by training a neural network (NN) to solve the ordinary differential equation (ODE) behind the denoising process.
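As a rough sketch of what "solving the ODE" means here, the idea can be written as a consistency condition. The notation below follows the usual consistency-model formulation and is an assumption, not something taken from this article: z_t is the noisy latent at time t, c is the prompt conditioning, and f_θ is the trained network.

```latex
f_\theta(z_t, c, t) \approx z_0
\quad \text{for every } t \text{ on the trajectory},
\qquad
f_\theta(z_t, c, t) = f_\theta(z_{t'}, c, t')
\quad \text{for } z_t,\ z_{t'} \text{ on the same ODE trajectory}
```

Because the network is trained to jump from any point on the trajectory straight to its clean endpoint, a handful of steps can be enough instead of dozens.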

Building on this, LCM-LoRA has emerged: it brings LoRA, a low-rank adapter, to LCM in order to speed up generation further by reducing the denoising steps.

LCM is fast enough that an expensive GPU is not necessary. Here, we run LCM and LCM-LoRA on a Google Colab T4 GPU.

https://github.com/luosiallen/latent-consistency-model

!pip install -q diffusers accelerate peft

Either fp16 or fp32 works here, but we will use fp16 because it is faster.
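Besides speed, fp16 also halves the storage per weight, which helps on a 16 GB card like the T4. The following is a back-of-the-envelope sketch only. The parameter count is an assumption (roughly the size of a Stable Diffusion v1.5 UNet), not a number measured in this article.

```python
# Rough weight-memory estimate for one network at two precisions.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2}

# ASSUMPTION: about 860 million parameters (approximate SD v1.5 UNet size).
ASSUMED_UNET_PARAMS = 860_000_000


def weight_memory_gib(num_params: int, dtype: str) -> float:
    """Return the memory needed to hold the weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


for dtype in ("fp32", "fp16"):
    gib = weight_memory_gib(ASSUMED_UNET_PARAMS, dtype)
    print(f"{dtype}: {gib:.2f} GiB")
```

This counts weights only. Activations, the VAE, and the text encoder add to the real footprint.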

We will use the weights from LCM.

from diffusers import StableDiffusionPipeline, LCMScheduler
import torch

# Load the LCM Dreamshaper v7 weights in half precision and move the pipeline to the GPU.
pipe = StableDiffusionPipeline.from_pretrained("SimianLuo/LCM_Dreamshaper_v7", torch_dtype=torch.float16).to("cuda")

We will use the prompt provided in the example.

prompt = "portrait photo of a girl, photograph, highly detailed face, depth of field, moody light, golden hour, style by Dan Winters, Russell James, Steve McCurry, centered, extremely detailed, Nikon D850, award winning photography"

First, let's look at the results without LoRA, using 2, 4, and 8 inference steps.

%%time

# Rerun this cell with num_inference_steps set to 2, 4, and 8.
image = pipe(prompt=prompt, num_inference_steps=2, guidance_scale=8.0, lcm_origin_steps=50).images[0]

image.save("image.png")
image
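Instead of editing the cell by hand for each step count, the runs could be timed in a loop. This is a minimal sketch, not the code used for the numbers in this article. `time_per_step_count` and `fake_run` are hypothetical names, and the fake workload only exists so the snippet runs without a GPU.

```python
import time
from typing import Callable, Dict, Iterable


def time_per_step_count(run: Callable[[int], object],
                        step_counts: Iterable[int] = (2, 4, 8)) -> Dict[int, float]:
    """Call run(steps) once per step count and return wall-clock seconds."""
    timings = {}
    for steps in step_counts:
        start = time.perf_counter()
        run(steps)
        timings[steps] = time.perf_counter() - start
    return timings


def fake_run(steps: int) -> None:
    """Stand-in workload whose cost grows with the step count."""
    time.sleep(0.01 * steps)


timings = time_per_step_count(fake_run)
for steps, seconds in timings.items():
    print(f"steps {steps}: {seconds:.3f}s")
```

With the pipeline loaded above, `run` could be something like `lambda steps: pipe(prompt=prompt, num_inference_steps=steps, guidance_scale=8.0)`. The first call usually includes warm-up cost, so a throwaway run beforehand gives fairer numbers.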

LCM without LoRA / steps 2 / 768x768 / 1.48s

LCM without LoRA / steps 4 / 768x768 / 1.94s

LCM without LoRA / steps 8 / 768x768 / 2.94s
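To see how the time splits into fixed overhead and per-step cost, a straight line can be fitted to the three timings above. This is only a sketch: each timing is a single measurement, so the fit is indicative at best. `fit_line` is a hypothetical helper.

```python
# Timings reported above for LCM without LoRA (steps -> seconds).
timings = {2: 1.48, 4: 1.94, 8: 2.94}


def fit_line(points: dict) -> tuple:
    """Least-squares fit of seconds = overhead + per_step * steps."""
    n = len(points)
    mean_x = sum(points) / n              # iterating a dict yields its keys
    mean_y = sum(points.values()) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points.items())
    sxx = sum((x - mean_x) ** 2 for x in points)
    per_step = sxy / sxx
    overhead = mean_y - per_step * mean_x
    return overhead, per_step


overhead, per_step = fit_line(timings)
print(f"fixed overhead ~ {overhead:.2f}s, per step ~ {per_step:.3f}s")
```

For these numbers the fit comes out at roughly a quarter of a second per step on top of about one second of fixed cost, which is why going from 2 to 8 steps only doubles the total time.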

Next, we will attach LCM-LoRA.

While a LoRA is typically used to change the style, this one reduces the number of inference steps.

https://huggingface.co/latent-consistency/lcm-lora-sdv1-5

The recommended guidance scale is either 0 or a value between 1 and 2, so we will set it to 0.

# Attach the LCM-LoRA weights, then switch the pipeline to the LCM scheduler.
pipe.load_lora_weights("latent-consistency/lcm-lora-sdv1-5")
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)
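The generation call for this configuration is not shown above. A minimal sketch of what it might look like, reusing the same arguments as the earlier call but with the guidance scale set to 0. `generate_with_lcm_lora` and `fake_pipe` are hypothetical names, and the stand-in pipeline only exists so the snippet runs without a GPU.

```python
from types import SimpleNamespace


def generate_with_lcm_lora(pipe, prompt: str, steps: int, guidance_scale: float = 0.0):
    """Run the pipeline with the low guidance scale suggested for LCM-LoRA.

    `pipe` is assumed to already have the LoRA weights and LCMScheduler attached.
    """
    result = pipe(prompt=prompt, num_inference_steps=steps, guidance_scale=guidance_scale)
    return result.images[0]


def fake_pipe(**kwargs):
    """Stand-in for the real pipeline: echoes its arguments back as the 'image'."""
    return SimpleNamespace(images=[kwargs])


image = generate_with_lcm_lora(fake_pipe, "test prompt", steps=2)
print(image)
```

In the notebook, passing the real `pipe` and `prompt` in place of the stand-ins would return a PIL image that can be saved with `image.save("image.png")`, as before.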

LCM with LCM-LoRA / steps 2 / 768x768 / 1.41s

The style has changed slightly. Even with just 2 steps, the image emerged without being too blurry.

LCM with LCM-LoRA / steps 4 / 768x768 / 2.02s

LCM with LCM-LoRA / steps 8 / 768x768 / 3.13s
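To put the two sets of timings side by side, the relative difference at each step count can be computed directly from the figures reported in this article. `relative_change` is a hypothetical helper.

```python
# Timings reported in this article (steps -> seconds), 768x768 on a Colab T4.
without_lora = {2: 1.48, 4: 1.94, 8: 2.94}
with_lora = {2: 1.41, 4: 2.02, 8: 3.13}


def relative_change(base: dict, other: dict) -> dict:
    """Percent change of `other` relative to `base` for each shared step count."""
    shared = sorted(base.keys() & other.keys())
    return {s: 100.0 * (other[s] - base[s]) / base[s] for s in shared}


changes = relative_change(without_lora, with_lora)
for steps, pct in changes.items():
    print(f"steps {steps}: {pct:+.1f}%")
```

The differences stay within a few percent in either direction. Since each figure is a single run, they may well be measurement noise rather than a real speed difference.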

It's great that a proper image comes out even with just 2 steps.

Yuichiro Minato