NVIDIA CUDA-Q Tutorial & H100 Benchmark Part 3: Calculating Magnetization Using Suzuki–Trotter Approximation

(Note: I am not sure whether the linear scaling seen in the benchmark below is correct.)

One of the most representative applications of quantum computing is the simulation of quantum many-body time evolution. Accurately and efficiently reproducing physical quantities such as magnetization in spin system models is crucial in fields like materials science and quantum information.

In this tutorial, we’ll use CUDA-Q to simulate the magnetization of a spin chain governed by the Heisenberg model.
To efficiently track the time evolution of the quantum state under a time-dependent Hamiltonian, we apply the Suzuki–Trotter approximation.

🔗 Tutorial reference:

By leveraging CUDA-Q's state management features, the quantum state can be kept in GPU memory and evolved incrementally, with each step starting from the state produced by the previous one. This removes the need to re-simulate from scratch at every time step and, according to the referenced tutorial, achieves up to a 24x speed-up compared to the traditional re-simulation approach.
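To see why reusing the stored state helps, here is a minimal counting sketch. It is a simplified cost model, not a measurement: it assumes every Trotter step costs the same and ignores everything else in the loop. The helper name `step_applications` is hypothetical and not part of CUDA-Q.

```python
from typing import Tuple


def step_applications(n_steps: int) -> Tuple[int, int]:
    """Count Trotter-step applications needed to sample every time step.

    Returns (from_scratch, recursive). Re-simulating from t = 0 for each
    sample costs 1 + 2 + ... + n_steps applications, while reusing the
    stored state costs one application per sample.
    """
    from_scratch = n_steps * (n_steps + 1) // 2
    recursive = n_steps
    return from_scratch, recursive


for n in (10, 100):
    scratch, reuse = step_applications(n)
    print(f"{n} steps: {scratch} vs {reuse} step applications")
# 10 steps: 55 vs 10 step applications
# 100 steps: 5050 vs 100 step applications
```

In this model the advantage of state reuse grows with the number of steps, since the from-scratch cost is quadratic and the recursive cost is linear.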

We’ll walk through the entire process—constructing the Hamiltonian, initializing the state, computing magnetization expectation values, running Trotter-based time evolution, and visualizing the results—step-by-step using CUDA-Q.

Program Setup

First, import the necessary libraries:

import cudaq
import time
import numpy as np
from typing import List

Hamiltonian: Heisenberg Spin Chain

The Heisenberg Hamiltonian used in this simulation is:

H(t) = \sum_{j=1}^{N-1} \left( J_x \, \sigma_j^x \sigma_{j+1}^x + J_y \, \sigma_j^y \sigma_{j+1}^y + J_z \, \sigma_j^z \sigma_{j+1}^z \right) + \cos(\omega t) \sum_{j=1}^{N} \sigma_j^x

We convert this into Python code:

g = 1.0
Jx = 1.0
Jy = 1.0
Jz = g
dt = 0.05
n_steps = 10
n_spins = 11
omega = 2 * np.pi

def heisenbergModelHam(t: float) -> cudaq.SpinOperator:
    tdOp = cudaq.SpinOperator(num_qubits=n_spins)
    for i in range(0, n_spins - 1):
        tdOp += (Jx * cudaq.spin.x(i) * cudaq.spin.x(i + 1))
        tdOp += (Jy * cudaq.spin.y(i) * cudaq.spin.y(i + 1))
        tdOp += (Jz * cudaq.spin.z(i) * cudaq.spin.z(i + 1))
    for i in range(0, n_spins):
        tdOp += (np.cos(omega * t) * cudaq.spin.x(i))
    return tdOp
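For very small chains, the same Hamiltonian can be written as a dense matrix and inspected directly. The following is a NumPy-only sketch for cross-checking, not the CUDA-Q representation: the helpers `embed` and `dense_heisenberg` are hypothetical names, and the matrix contains only the coupling and drive terms.

```python
import numpy as np

# Single-qubit identity and Pauli matrices.
I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)


def embed(ops: dict, n: int) -> np.ndarray:
    """Kronecker product with ops[site] on the listed sites, identity elsewhere."""
    out = np.array([[1.0 + 0j]])
    for site in range(n):
        out = np.kron(out, ops.get(site, I2))
    return out


def dense_heisenberg(t, n, jx=1.0, jy=1.0, jz=1.0, omega=2 * np.pi):
    """Dense open-chain Heisenberg Hamiltonian with a cos(omega * t) X drive."""
    dim = 2**n
    h = np.zeros((dim, dim), dtype=complex)
    for j in range(n - 1):  # nearest-neighbour couplings
        h += jx * embed({j: X, j + 1: X}, n)
        h += jy * embed({j: Y, j + 1: Y}, n)
        h += jz * embed({j: Z, j + 1: Z}, n)
    for j in range(n):  # time-dependent transverse drive
        h += np.cos(omega * t) * embed({j: X}, n)
    return h


h = dense_heisenberg(t=0.0, n=3)
print(h.shape)                      # (8, 8)
print(np.allclose(h, h.conj().T))   # True: the matrix is Hermitian
```

The matrix has 2^n rows and columns, so this check is only practical for a handful of spins.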

Initial Quantum State

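We set the initial state by applying an X gate to every even-indexed qubit, which gives an alternating pattern of flipped and unflipped spins: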

@cudaq.kernel
def getInitState(numSpins: int):
    q = cudaq.qvector(numSpins)
    for qId in range(0, numSpins, 2):
        x(q[qId])
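As a quick sanity check, the bit pattern this kernel prepares can be worked out in plain Python. This sketch only mirrors the loop above; the helper names are hypothetical, and it uses the usual convention that Z has eigenvalue +1 on |0> and -1 on |1>.

```python
from typing import List


def initial_bits(num_spins: int) -> List[int]:
    """Bit value of each qubit after applying X to every even-indexed qubit."""
    return [1 if q % 2 == 0 else 0 for q in range(num_spins)]


def z_magnetization(bits: List[int]) -> float:
    """Average Z value of a computational basis state (+1 for 0, -1 for 1)."""
    return sum(1 - 2 * b for b in bits) / len(bits)


bits = initial_bits(11)
print("".join(str(b) for b in bits))  # 10101010101
print(sum(bits), "flipped,", len(bits) - sum(bits), "unflipped")  # 6 flipped, 5 unflipped
print(z_magnetization(bits))          # -1/11, about -0.0909
```

With 11 spins, six qubits are flipped and five are not, so the average Z magnetization before any time evolution is -1/11.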

Time Evolution with Trotter Decomposition

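The terms of the Hamiltonian do not all commute, so the time-evolution operator cannot be written exactly as a product of simple Pauli rotations. We therefore apply a first-order Suzuki–Trotter decomposition, in which each step of length dt applies the exponential of every Pauli term in turn, and simulate the time evolution one step at a time: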

@cudaq.kernel
def trotter(state: cudaq.State, coefficients: List[complex],
            words: List[cudaq.pauli_word], dt: float):
    q = cudaq.qvector(state)
    for i in range(len(coefficients)):
        exp_pauli(coefficients[i].real * dt, q, words[i])
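The effect of the approximation can be illustrated on a toy problem. The sketch below is NumPy only and independent of CUDA-Q: it uses a single qubit with H = X + Z (an assumption chosen because X and Z do not commute) and compares the exact propagator with the first-order Trotter product. The helper names are hypothetical.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)


def expm_hermitian(h: np.ndarray, t: float) -> np.ndarray:
    """Return exp(-i * h * t) for a Hermitian matrix h via eigendecomposition."""
    evals, evecs = np.linalg.eigh(h)
    return (evecs * np.exp(-1j * evals * t)) @ evecs.conj().T


def trotter_error(total_time: float, n_steps: int) -> float:
    """Spectral-norm distance between exact and first-order Trotter evolution."""
    dt = total_time / n_steps
    exact = expm_hermitian(X + Z, total_time)
    one_step = expm_hermitian(X, dt) @ expm_hermitian(Z, dt)
    approx = np.linalg.matrix_power(one_step, n_steps)
    return float(np.linalg.norm(exact - approx, ord=2))


for n in (1, 10, 100):
    print(n, trotter_error(0.5, n))
```

For a fixed total time, the error shrinks as the number of steps grows, which is why a small dt is used in the simulation.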

Extracting Hamiltonian Terms

We extract coefficients and operators from the Hamiltonian:

def termCoefficients(op: cudaq.SpinOperator) -> List[complex]:
    # Collect the coefficient of every term of the operator passed in.
    result = []
    op.for_each_term(lambda term: result.append(term.get_coefficient()))
    return result

def termWords(op: cudaq.SpinOperator) -> List[str]:
    # Collect the Pauli word of every term, in the same order as the coefficients.
    result = []
    op.for_each_term(lambda term: result.append(term.to_string(False)))
    return result
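To make the shape of these two lists concrete, here is a plain-Python sketch that builds matching coefficient and Pauli-word lists by hand for a small chain. It does not call CUDA-Q; `heisenberg_terms` is a hypothetical helper, and the exact string format and term ordering returned by CUDA-Q may differ from what is shown here.

```python
import math
from typing import Dict, List, Tuple


def heisenberg_terms(t: float, n: int, jx: float = 1.0, jy: float = 1.0,
                     jz: float = 1.0,
                     omega: float = 2 * math.pi) -> Tuple[List[complex], List[str]]:
    """Parallel lists of coefficients and Pauli words for the open chain."""

    def word(paulis: Dict[int, str]) -> str:
        # One letter per site, "I" where the term acts trivially.
        return "".join(paulis.get(site, "I") for site in range(n))

    coefficients: List[complex] = []
    words: List[str] = []
    for j in range(n - 1):  # nearest-neighbour couplings
        for letter, coupling in (("X", jx), ("Y", jy), ("Z", jz)):
            coefficients.append(complex(coupling))
            words.append(word({j: letter, j + 1: letter}))
    for j in range(n):  # time-dependent drive
        coefficients.append(complex(math.cos(omega * t)))
        words.append(word({j: "X"}))
    return coefficients, words


coeffs, words = heisenberg_terms(0.0, 3)
print(len(words))   # 9 terms: 3 * (n - 1) couplings + n drive terms
print(words[:3])    # ['XXI', 'YYI', 'ZZI']
```

Each entry of the coefficient list pairs with the word at the same index, which is the layout the Trotter kernel iterates over.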

Magnetization Calculation

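We calculate the average magnetization along the Z-axis, (1/N) \sum_j \sigma_j^z. The final subtraction of 1.0 removes the identity term that the operator is assumed to start with when it is constructed this way: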

average_magnetization = cudaq.SpinOperator(num_qubits=n_spins)
for i in range(0, n_spins):
    average_magnetization += ((1.0 / n_spins) * cudaq.spin.z(i))
average_magnetization -= 1.0
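For small systems, the same expectation value can be computed directly from a state vector, which is useful for cross-checking. This is a NumPy sketch under the assumption that the state is a normalized array of 2^n amplitudes; `average_z_magnetization` is a hypothetical helper, not a CUDA-Q function. Because the observable is an average over all sites, the result does not depend on the qubit ordering convention.

```python
import numpy as np


def average_z_magnetization(state: np.ndarray) -> float:
    """Expectation of (1/n) * sum_j Z_j for a normalized n-qubit state vector."""
    n = state.size.bit_length() - 1          # state.size == 2**n
    probs = np.abs(state) ** 2                # probability of each basis state
    ones = np.array([bin(k).count("1") for k in range(state.size)])
    per_state = (n - 2 * ones) / n            # +1 per 0 bit, -1 per 1 bit, averaged
    return float(np.dot(probs, per_state))


all_up = np.zeros(8, dtype=complex)
all_up[0] = 1.0                               # |000>
uniform = np.full(8, 1 / np.sqrt(8), dtype=complex)

print(average_z_magnetization(all_up))        # 1.0
print(round(average_z_magnetization(uniform), 12))  # 0.0
```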

Running the Simulation

Now we put everything together and run the simulation.

Initialize State

state = cudaq.get_state(getInitState, n_spins)

Time Evolution Loop

results = []
times = []
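for i in range(0, n_steps):
    start_time = time.perf_counter()
    # Rebuild the Hamiltonian at the current time, since the drive term depends on t.
    ham = heisenbergModelHam(i * dt)
    coefficients = termCoefficients(ham)
    words = termWords(ham)
    # Apply one Trotter step to the stored state and measure the magnetization.
    magnetization_exp_val = cudaq.observe(trotter, average_magnetization, state,
                                          coefficients, words, dt)
    result = magnetization_exp_val.expectation()
    results.append(result)
    # Advance the stored state by the same step so the next iteration starts from it.
    state = cudaq.get_state(trotter, state, coefficients, words, dt)
    stepTime = time.perf_counter() - start_time
    times.append(stepTime)
    print(f"Step {i}: time [s]: {stepTime}, result: {result}")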

print(f"Step times: {times}")
print(f"Results: {results}")

Benchmark Results: CPU vs GPU

With this setup:

CPU (AMD EPYC 9654): simulations up to ~11 spins
GPU (NVIDIA H100): simulations up to ~25 spins

We tested from 6 to 11 spins on CPU and 6 to 25 spins on GPU.

CPU vs GPU spin simulation comparison

As shown, the GPU handles significantly larger systems with ease.
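One reason system size matters so much is that a full state vector holds 2^n amplitudes. The arithmetic below is a rough sketch, assuming 8 bytes per amplitude (single-precision complex); with double precision the figures double. It counts only the state vector itself, and `statevector_bytes` is a hypothetical helper.

```python
def statevector_bytes(n_qubits: int, bytes_per_amplitude: int = 8) -> int:
    """Memory for a full state vector: 2**n amplitudes of the given size."""
    return (2 ** n_qubits) * bytes_per_amplitude


for n in (11, 20, 25):
    mib = statevector_bytes(n) / 2 ** 20
    print(f"{n} spins: {mib} MiB")
# 11 spins: 0.015625 MiB
# 20 spins: 8.0 MiB
# 25 spins: 256.0 MiB
```

Each additional spin doubles both the memory and the work per gate, so run time grows quickly with system size.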

Next, keeping the number of spins fixed at 20, we ran benchmarks on the GPU with the step count increasing from 10 to 100 in increments of 10:

GPU simulation time vs steps (20 spins)

The results show consistently high speed on GPU, even with larger step counts.

Yuichiro Minato
