NVIDIA CUDA-Q Tutorial & H100 Benchmark Part 3: Calculating Magnetization Using Suzuki–Trotter Approximation
(I am not sure if it is correct that the benchmark scales linearly...)
One of the most representative applications of quantum computing is the simulation of quantum many-body time evolution. Accurately and efficiently reproducing physical quantities such as magnetization in spin system models is crucial in fields like materials science and quantum information.
In this tutorial, we’ll use CUDA-Q to simulate the magnetization of a spin chain governed by the Heisenberg model.
To efficiently track the time evolution of the quantum state under a time-dependent Hamiltonian, we apply the Suzuki–Trotter approximation.
🔗 Tutorial reference:
By leveraging CUDA-Q's state management features, the quantum state can be kept in GPU memory and evolved iteratively, with each step starting from the state produced by the previous one. This eliminates the need to re-simulate from scratch at each time step and, according to this example, achieves up to a 24x speed-up compared to traditional methods.
We’ll walk through the entire process—constructing the Hamiltonian, initializing the state, computing magnetization expectation values, running Trotter-based time evolution, and visualizing the results—step-by-step using CUDA-Q.
Program Setup
First, import the necessary libraries:
import cudaq
import time
import numpy as np
from typing import List
Hamiltonian: Heisenberg Spin Chain
The Heisenberg Hamiltonian used in this simulation is:
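Reading the Hamiltonian back from the Python implementation in this section, it can be written as follows. This is a reconstruction from the code rather than a quoted formula: X_i, Y_i, Z_i denote the Pauli operators on spin i, N stands for n_spins, and J_z is set equal to g.

```latex
H(t) = \sum_{i=0}^{N-2} \left( J_x X_i X_{i+1} + J_y Y_i Y_{i+1} + J_z Z_i Z_{i+1} \right)
       + \cos(\omega t) \sum_{i=0}^{N-1} X_i
```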
We convert this into Python code:
g = 1.0
Jx = 1.0
Jy = 1.0
Jz = g
dt = 0.05
n_steps = 10
n_spins = 11
omega = 2 * np.pi
def heisenbergModelHam(t: float) -> cudaq.SpinOperator:
    tdOp = cudaq.SpinOperator(num_qubits=n_spins)
    for i in range(0, n_spins - 1):
        tdOp += (Jx * cudaq.spin.x(i) * cudaq.spin.x(i + 1))
        tdOp += (Jy * cudaq.spin.y(i) * cudaq.spin.y(i + 1))
        tdOp += (Jz * cudaq.spin.z(i) * cudaq.spin.z(i + 1))
    for i in range(0, n_spins):
        tdOp += (np.cos(omega * t) * cudaq.spin.x(i))
    return tdOp
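Because the coefficient cos(omega * t) on the X terms changes with time, the Hamiltonian has to be rebuilt at every step. As a small illustration using only the standard library (the list name `drive` is introduced for this sketch and is not part of the tutorial code), the drive amplitude at each of the ten time points can be tabulated like this:

```python
import math

# Same values as the tutorial parameters above.
dt = 0.05
n_steps = 10
omega = 2 * math.pi

# Coefficient multiplying every X_i term at time t = i * dt.
drive = [math.cos(omega * i * dt) for i in range(n_steps)]

for i, c in enumerate(drive):
    print(f"step {i}: t = {i * dt:.2f}, cos(omega * t) = {c:+.4f}")
```

The amplitude starts at 1, passes through zero at t = 0.25 (a quarter period), and is negative for the remaining steps, so the ten steps cover a little under half a drive period.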
Initial Quantum State
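We set the initial state by applying an X gate to every even-indexed qubit, which flips those spins and leaves the odd-indexed ones untouched, giving an alternating pattern along the chain: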
@cudaq.kernel
def getInitState(numSpins: int):
    q = cudaq.qvector(numSpins)
    for qId in range(0, numSpins, 2):
        x(q[qId])
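To make the resulting pattern concrete, here is a plain-Python sketch that writes out the basis-state label this kernel prepares. The helper `init_bitstring` is hypothetical and exists only for illustration; it writes qubit 0 as the leftmost character, which is a choice made here for readability.

```python
def init_bitstring(num_spins: int) -> str:
    """Basis-state label after applying X to every even-indexed qubit."""
    return "".join("1" if q % 2 == 0 else "0" for q in range(num_spins))

print(init_bitstring(11))  # 10101010101
```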
Time Evolution with Trotter Decomposition
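The terms of the Hamiltonian do not all commute with one another, so the evolution operator cannot simply be factored into one exponential per term. We therefore apply a first-order Suzuki–Trotter decomposition: over each small step dt, the evolution is approximated by applying the exponential of every Pauli term in sequence, and the simulation advances one step at a time: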
@cudaq.kernel
def trotter(state: cudaq.State, coefficients: List[complex],
            words: List[cudaq.pauli_word], dt: float):
    q = cudaq.qvector(state)
    for i in range(len(coefficients)):
        exp_pauli(coefficients[i].real * dt, q, words[i])
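To see why splitting the step into per-term exponentials is only an approximation, and how it improves with smaller steps, here is a single-qubit toy model in plain Python. It is a sketch under stated assumptions, not part of the CUDA-Q workflow: it uses H = X + Z, the textbook convention exp(-i * theta * P) (not meant to mirror the sign convention of `exp_pauli`), and hypothetical helper names.

```python
import math

# 2x2 complex matrices as nested lists; enough for a single-qubit toy model.
I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def exp_pauli_matrix(theta, p):
    """exp(-i*theta*P) = cos(theta)*I - i*sin(theta)*P, valid whenever P*P = I."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c * I2[i][j] - 1j * s * p[i][j] for j in range(2)] for i in range(2)]

def exact(t):
    """exp(-i*t*(X+Z)); (X+Z)/sqrt(2) squares to I, so the same identity applies."""
    r = math.sqrt(2.0)
    h = [[(X[i][j] + Z[i][j]) / r for j in range(2)] for i in range(2)]
    return exp_pauli_matrix(r * t, h)

def trotter_product(t, n):
    """First-order Trotter: n repetitions of exp(-i*(t/n)*X) * exp(-i*(t/n)*Z)."""
    step = matmul(exp_pauli_matrix(t / n, X), exp_pauli_matrix(t / n, Z))
    u = I2
    for _ in range(n):
        u = matmul(u, step)
    return u

def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

# Error of the Trotter product against the exact propagator at t = 1.
errors = {n: max_abs_diff(trotter_product(1.0, n), exact(1.0)) for n in (1, 10, 100)}
for n, err in errors.items():
    print(f"n = {n:3d} steps: max element error = {err:.4f}")
```

The error shrinks as the number of steps grows, which is the reason the tutorial uses a small dt.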
Extracting Hamiltonian Terms
We extract coefficients and operators from the Hamiltonian:
def termCoefficients(op: cudaq.SpinOperator) -> List[complex]:
    result = []
    # Use the argument `op`, not the global `ham`.
    op.for_each_term(lambda term: result.append(term.get_coefficient()))
    return result

def termWords(op: cudaq.SpinOperator) -> List[str]:
    result = []
    op.for_each_term(lambda term: result.append(term.to_string(False)))
    return result
Magnetization Calculation
We calculate the average magnetization along the Z-axis:
average_magnetization = cudaq.SpinOperator(num_qubits=n_spins)
for i in range(0, n_spins):
    average_magnetization += ((1.0 / n_spins) * cudaq.spin.z(i))
# Remove the identity term the operator starts with, leaving only the Z terms.
average_magnetization -= 1.0
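As a sanity check on this observable, its value on a computational basis state can be worked out by hand: each spin in |0> contributes +1 and each spin in |1> contributes -1. The following plain-Python sketch does that for the alternating 11-spin initial state; the helper name is hypothetical, and the values printed by the simulation loop are taken after at least one Trotter step, so they need not match this number exactly.

```python
def basis_state_magnetization(bits: str) -> float:
    """Average <Z> of a computational basis state: '0' gives +1, '1' gives -1."""
    return sum(1 if b == "0" else -1 for b in bits) / len(bits)

initial = "10101010101"  # X applied to every even-indexed qubit of 11 spins
m0 = basis_state_magnetization(initial)
print(m0)  # -1/11, since six spins are flipped and five are not
```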
Running the Simulation
Now we put everything together and run the simulation.
Initialize State
state = cudaq.get_state(getInitState, n_spins)
Time Evolution Loop
results = []
times = []
for i in range(0, n_steps):
    start_time = time.time()
    ham = heisenbergModelHam(i * dt)
    coefficients = termCoefficients(ham)
    words = termWords(ham)
    magnetization_exp_val = cudaq.observe(trotter, average_magnetization, state,
                                          coefficients, words, dt)
    result = magnetization_exp_val.expectation()
    results.append(result)
    state = cudaq.get_state(trotter, state, coefficients, words, dt)
    stepTime = time.time() - start_time
    times.append(stepTime)
    print(f"Step {i}: time [s]: {stepTime}, result: {result}")

print(f"Step times: {times}")
print(f"Results: {results}")
Benchmark Results: CPU vs GPU
With this setup:
- CPU (AMD EPYC 9654): simulations up to ~11 spins
- GPU (NVIDIA H100): simulations up to ~25 spins
We tested from 6 to 11 spins on CPU and 6 to 25 spins on GPU.
As shown, the GPU handles significantly larger systems with ease.
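One reason system size matters so much is that a full state-vector simulation stores 2^n complex amplitudes, so memory doubles with every added spin. The sketch below estimates that footprint. The default of 8 bytes per amplitude assumes single-precision complex numbers, which is an assumption about the backend rather than something measured here; pass 16 for double precision.

```python
def statevector_bytes(n_qubits: int, bytes_per_amplitude: int = 8) -> int:
    """Memory needed to hold 2**n_qubits complex amplitudes."""
    return (2 ** n_qubits) * bytes_per_amplitude

for n in (11, 20, 25):
    mib = statevector_bytes(n) / 2**20
    print(f"{n} spins: {mib:.3f} MiB")
```

At 25 spins this is 256 MiB in single precision (512 MiB in double), which is small next to the memory of an H100, so memory alone does not set the limits quoted above.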
Next, keeping the number of spins fixed at 20, we ran benchmarks on GPU with step counts increasing from 10 to 100 (in steps of 10):
The results show consistently high speed on GPU, even with larger step counts.
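One rough way to reason about how run time should depend on the step count is to count `exp_pauli` applications. This is a counting model, not a measurement, and the helper names are hypothetical. It counts only the coupling and drive terms of the Hamiltonian, ignores any identity term, and ignores the fact that each loop iteration invokes the Trotter kernel in both `cudaq.observe` and `cudaq.get_state`.

```python
def pauli_terms(n_spins: int) -> int:
    """Three coupling terms per neighbouring pair plus one X drive term per spin."""
    return 3 * (n_spins - 1) + n_spins

def exp_pauli_calls(n_spins: int, n_steps: int) -> int:
    """Pauli exponentials applied when the Trotter kernel runs once per step."""
    return pauli_terms(n_spins) * n_steps

for steps in range(10, 101, 10):
    print(f"{steps:3d} steps: {exp_pauli_calls(20, steps)} exp_pauli applications")
```

Under this model the work grows in direct proportion to the number of steps at a fixed spin count, so a roughly linear trend in total run time versus step count would be consistent with it.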