LCM (Latent Consistency Models) is an attempt to cut the number of denoising steps in diffusion models significantly by training a neural network (NN) to solve the ordinary differential equation (ODE) that describes denoising.
Building on this, LCM-LoRA has emerged: it brings LoRA, a low-rank adapter, to LCM in order to accelerate generation further by reducing the denoising steps.
LCM is fast enough that an expensive GPU is not necessary. Here, we try running LCM and LCM-LoRA on a Google Colab T4 GPU.
https://github.com/luosiallen/latent-consistency-model
!pip install -q diffusers accelerate peft
Either fp16 or fp32 works here, but we'll use fp16 because it is faster.
We will use the LCM weights.
from diffusers import StableDiffusionPipeline, LCMScheduler
import torch
pipe = StableDiffusionPipeline.from_pretrained("SimianLuo/LCM_Dreamshaper_v7", torch_dtype=torch.float16).to("cuda")
We will use the prompt provided in the example.
prompt = "portrait photo of a girl, photograph, highly detailed face, depth of field, moody light, golden hour, style by Dan Winters, Russell James, Steve McCurry, centered, extremely detailed, Nikon D850, award winning photography"
First, let's look at the results without LoRA. We'll examine 2, 4, and 8 steps.
%%time
image = pipe(prompt=prompt, num_inference_steps=2, guidance_scale=8.0, lcm_origin_steps=50).images[0]
image.save("image.png")
image
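The call above passes `lcm_origin_steps=50` alongside `num_inference_steps`. As a rough illustration of how those two numbers can interact, the sketch below picks the sampling timesteps by striding through a 50-point grid. It is modeled on the original LCM pipeline as we understand it and assumes 1000 training timesteps; `lcm_timesteps` is a hypothetical helper, and the scheduler in a given diffusers release may compute its schedule differently.

```python
def lcm_timesteps(num_inference_steps, lcm_origin_steps=50, num_train_timesteps=1000):
    """Pick `num_inference_steps` timesteps from an evenly spaced origin grid."""
    # Spacing of the origin grid over the training timesteps (1000 is an assumption).
    c = num_train_timesteps // lcm_origin_steps
    # Origin grid: 19, 39, ..., 999 for the default values.
    origin = [(i + 1) * c - 1 for i in range(lcm_origin_steps)]
    # Walk the grid from the noisiest timestep downward, skipping evenly.
    skip = lcm_origin_steps // num_inference_steps
    return origin[::-skip][:num_inference_steps]

for steps in (2, 4, 8):
    print(steps, lcm_timesteps(steps))
```

Under these assumptions, fewer inference steps simply means larger jumps between the selected timesteps, while the first step always starts from the noisiest point of the grid.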
LCM without LoRA / steps 2 / 768x768 / 1.48s
LCM without LoRA / steps 4 / 768x768 / 1.94s
LCM without LoRA / steps 8 / 768x768 / 2.94s
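To compare several step counts without re-running the cell by hand, a small timing loop can help. This is a minimal sketch: `time_step_counts` and `generate` are hypothetical names, and `generate` stands for any function that takes a step count and returns an image, for example a wrapper around the `pipe(...)` call above.

```python
import time

def time_step_counts(generate, step_counts=(2, 4, 8)):
    """Call `generate(steps)` for each step count and record wall-clock seconds."""
    results = {}
    for steps in step_counts:
        start = time.perf_counter()
        image = generate(steps)
        results[steps] = (image, time.perf_counter() - start)
    return results

# Usage in the notebook (assumes `pipe` and `prompt` are defined as above):
# results = time_step_counts(
#     lambda steps: pipe(prompt=prompt, num_inference_steps=steps, guidance_scale=8.0).images[0]
# )
# for steps, (image, seconds) in results.items():
#     print(steps, f"{seconds:.2f}s")
```

The first call may include some warm-up overhead, so it can be worth running the loop twice and reading the second set of timings.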
Next, we will attach LCM-LoRA.
While LoRA is typically used to modify the style, this one reduces the number of inference steps.
https://huggingface.co/latent-consistency/lcm-lora-sdv1-5
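For readers new to LoRA, the idea of a low-rank adapter can be sketched in a few lines: a frozen weight matrix W is adjusted by the product of two thin matrices, W' = W + scale * B A. The code below is a toy illustration in plain Python with made-up numbers, not how diffusers or peft store or apply the LCM-LoRA weights; `matmul` and `apply_lora` are hypothetical helpers.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def apply_lora(weight, lora_b, lora_a, scale=1.0):
    """Return weight + scale * (lora_b @ lora_a) without modifying the inputs."""
    delta = matmul(lora_b, lora_a)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(weight, delta)]

# Toy 3x3 weight with a rank-1 adapter: B is 3x1, A is 1x3 (made-up values).
W = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
B = [[1.0], [0.0], [2.0]]
A = [[0.5, 0.0, 1.0]]
print(apply_lora(W, B, A))
```

In this toy case the adapter holds 6 numbers instead of the 9 in the full matrix; the saving grows quickly as the matrices get larger, which is why an adapter can be distributed separately from the base weights.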
The suggested guidance scale is either 0 or a value between 1 and 2, so we will set it to 0.
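As background on what the guidance scale controls: in classifier-free guidance, the unconditional and prompt-conditioned noise predictions are blended as `uncond + scale * (cond - uncond)`. The sketch below shows that formula on plain lists with made-up values; `apply_guidance` is a hypothetical helper. How a particular pipeline treats values of 1 or below (for instance, by skipping the blend altogether) depends on its implementation, so this is only the textbook formula.

```python
def apply_guidance(noise_uncond, noise_cond, guidance_scale):
    """Blend two noise predictions with the classifier-free guidance formula."""
    return [u + guidance_scale * (c - u) for u, c in zip(noise_uncond, noise_cond)]

uncond = [0.0, 1.0]  # made-up unconditional prediction
cond = [1.0, 3.0]    # made-up prompt-conditioned prediction
print(apply_guidance(uncond, cond, 1.0))  # same as the conditional prediction
print(apply_guidance(uncond, cond, 8.0))  # pushed further in the prompt's direction
```

A large scale such as the 8.0 used earlier amplifies the difference between the two predictions, whereas the small values suggested for LCM-LoRA keep the prediction close to what the network outputs directly.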
pipe.load_lora_weights("latent-consistency/lcm-lora-sdv1-5")
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)
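The generation call for this configuration is not shown here, so the following is a sketch of what it might look like with the guidance scale set to 0. `generate_with_lcm_lora` is a hypothetical wrapper; it assumes `pipe` accepts the same keyword arguments as the earlier call and returns an object with an `images` list.

```python
def generate_with_lcm_lora(pipe, prompt, steps):
    """Run the pipeline once with the guidance scale set to 0."""
    # guidance_scale=0.0 follows the suggestion in the text;
    # a value between 1 and 2 is the stated alternative.
    result = pipe(prompt=prompt, num_inference_steps=steps, guidance_scale=0.0)
    return result.images[0]

# Usage in the notebook (assumes `pipe` and `prompt` from the earlier cells):
# image = generate_with_lcm_lora(pipe, prompt, steps=2)
# image.save("image.png")
```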
LCM with LCM-LoRA / steps 2 / 768x768 / 1.41s
The style has changed slightly. Even with just 2 steps, the image comes out without being too blurry.
LCM with LCM-LoRA / steps 4 / 768x768 / 2.02s
LCM with LCM-LoRA / steps 8 / 768x768 / 3.13s
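The timings reported above grow roughly linearly with the step count, which suggests splitting each run into a fixed overhead plus a per-step cost. The sketch below fits a least-squares line to the three measurements for each setup. It is a back-of-the-envelope estimate from single runs, so the results should be read as approximate; `fit_line` is a hypothetical helper.

```python
def fit_line(xs, ys):
    """Least-squares fit of ys = intercept + slope * xs; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

steps = [2, 4, 8]
# Seconds per run at 768x768, copied from the captions above.
timings = {
    "without LoRA": [1.48, 1.94, 2.94],
    "with LCM-LoRA": [1.41, 2.02, 3.13],
}
for name, seconds in timings.items():
    slope, intercept = fit_line(steps, seconds)
    print(f"{name}: about {slope:.2f}s per step plus {intercept:.2f}s overhead")
```

On these numbers the per-step cost comes out at roughly 0.24s without LoRA and 0.29s with LCM-LoRA, with close to a second of fixed overhead in both cases.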
It's great that a proper image comes out even with just 2 steps.