Benchmarking NVIDIA cuQuantum, a quantum computer simulation SDK, with multiple tests on a consumer-grade RTX 4090 GPU and an Intel Core i9-13900KF CPU.

Yuichiro Minato

2023/10/10 06:37

Today, I'd like to conduct various benchmarks using Qiskit + cuQuantum. Given that cuQuantum is free, one would expect more articles about it. However, there aren't many, probably due to the niche nature of the quantum computing industry.

First, regarding installation: set up a conda environment, then install cuQuantum and cuQuantum Python. For Qiskit, install qiskit-aer and qiskit-aer-gpu with pip.
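For reference, a typical installation looks roughly like the following. These are the package names published on conda-forge and PyPI at the time of writing; adjust for your own CUDA version and Python environment.

# assumed environment setup; package names per conda-forge / PyPI
conda create -n cuquantum python=3.10
conda activate cuquantum
conda install -c conda-forge cuquantum cuquantum-python
pip install qiskit qiskit-aer-gpu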

I referred to the following article:

https://medium.com/qiskit/improve-quantum-simulations-with-qiskit-aer-cuquantum-9cd4bf69f042

from qiskit import *
from qiskit.circuit.library import *
from qiskit_aer import *
import time
import numpy as np
import matplotlib.pyplot as plt

def exp_qv(qubits=15, depth=10, device_num=0):
  # device_num: 0 = CPU, 1 = Aer GPU backend, 2 = GPU with NVIDIA cuStateVec
  if device_num == 0:
    sim = AerSimulator(method='statevector', device='CPU')
  elif device_num == 1:
    sim = AerSimulator(method='statevector', device='GPU')
  else:
    sim = AerSimulator(method='statevector', device='GPU', cuStateVec_enable=True)

  # Quantum Volume circuit with fixed seeds for reproducibility
  circuit = QuantumVolume(qubits, depth, seed=0)
  circuit.measure_all()
  circuit = transpile(circuit, sim)

  # time a single-shot statevector simulation
  start = time.time()
  result = sim.run(circuit, shots=1, seed_simulator=12345).result()
  time_val = time.time() - start
  return time_val

First, we will compare the CPU, GPU, and cuQuantum's cuStateVec. We'll run the same problem on different configurations.

1. We will use the CPU: an Intel Core i9-13900KF, which is fairly high-performance.

2. We will use the GPU, with acceleration from IBM's own Qiskit Aer GPU backend.

3. We will use the GPU, but with acceleration from NVIDIA's cuStateVec library.

Only the first configuration uses the CPU; the second and third both use the GPU. The third is our main focus for this session: NVIDIA's cuQuantum. The two GPU configurations run the same program on the same machine; the only difference is the SDK.
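Before running the benchmark, it is worth checking that this build of Aer actually sees the GPU. Qiskit Aer's AerSimulator exposes available_devices() and available_methods() for this; a quick sanity check, not part of the benchmark itself:

from qiskit_aer import AerSimulator

# list the devices and methods this build of Aer supports
sim = AerSimulator()
print(sim.available_devices())   # should include 'GPU' if qiskit-aer-gpu is installed correctly
print(sim.available_methods())   # 'statevector' should be listed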

To begin,

# number of qubits
x = np.arange(10, 30)

# QV depth
dp = 10

# lists for results
arr = [[],[],[]]

# labels
label = ['Qiskit + CPU', 'Qiskit + GPU', 'Qiskit + cuStateVec GPU']

# benchmark loop
for j in range(3):
  for i in x:
    r = exp_qv(i,dp,j)
    arr[j].append(r)
    print(i, r)

plt.xlabel('qubits')
plt.ylabel('sec')
for i in range(3):
  plt.plot(x, arr[i], label=label[i])
plt.title('Benchmark QV depth=' + str(dp))
plt.legend()
plt.show()

Already, with 29 qubits, the situation is quite challenging for the CPU. Next, we'll set the QV depth to 30.

The resulting graph looked essentially the same as before. Given this, using the CPU for QV simulation seems impractical, so from here on I'll compare only the two GPU configurations.

I'll try setting the QV depth to 100.
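The loop is the same as before, just with the CPU case dropped and the depth changed; a sketch:

# GPU-only comparison (device_num 1 and 2), QV depth = 100
dp = 100
arr_gpu = [[], []]
label_gpu = ['Qiskit + GPU', 'Qiskit + cuStateVec GPU']

for j in (1, 2):
  for i in x:
    r = exp_qv(i, dp, j)
    arr_gpu[j - 1].append(r)
    print(i, r)

plt.xlabel('qubits')
plt.ylabel('sec')
for i in range(2):
  plt.plot(x, arr_gpu[i], label=label_gpu[i])
plt.title('Benchmark QV depth=' + str(dp))
plt.legend()
plt.show()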

Essentially, cuStateVec consistently gives about a 1.4x speedup over the standard Aer GPU backend. The result looked the same even at QV depth=500.

Overall, computing large problems on a CPU is indeed quite challenging. And even on a GPU, simply switching to cuQuantum gives roughly a 1.5x speed boost with exactly the same machine and program.
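One rough way to quantify that gap is to take the elementwise ratio of the recorded times, for example with the arrays from the depth-100 sketch above (arr_gpu[0] is the plain GPU backend, arr_gpu[1] is cuStateVec):

# speedup of cuStateVec over the plain Aer GPU backend, per qubit count
ratio = np.array(arr_gpu[0]) / np.array(arr_gpu[1])
print(ratio)            # per-size speedup factors
print(np.mean(ratio))   # average speedup across the sweep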

While I'd like to try various other examples, for this session NVIDIA cuQuantum proved to be the best choice for simulating quantum computers. That's all.
