Today, I'd like to conduct various benchmarks using Qiskit + cuQuantum. Given that cuQuantum is free, one would expect more articles about it. However, there aren't many, probably due to the niche nature of the quantum computing industry.
First, regarding installation: set up a conda environment and install cuQuantum and cuQuantum Python into it. For Qiskit, install qiskit and qiskit-aer-gpu with pip; qiskit-aer-gpu is the GPU-enabled build of the Aer simulator, so it takes the place of plain qiskit-aer.
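As a quick sanity check that the GPU build was actually picked up, Aer can report which devices it sees (a minimal sketch; on a working install, available_devices() should include 'GPU'):
# Sanity check: list the devices this Aer build can target
from qiskit_aer import AerSimulator

print(AerSimulator().available_devices())   # expect something like ('CPU', 'GPU')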
I used the following article as a reference:
https://medium.com/qiskit/improve-quantum-simulations-with-qiskit-aer-cuquantum-9cd4bf69f042
from qiskit import transpile
from qiskit.circuit.library import QuantumVolume
from qiskit_aer import AerSimulator
import time
import numpy as np
import matplotlib.pyplot as plt
def exp_qv(qubits=15, depth=10, device_num=0):
    """Time one Quantum Volume simulation on the chosen backend."""
    if device_num == 0:
        # Plain CPU statevector simulation
        sim = AerSimulator(method='statevector', device='CPU')
    elif device_num == 1:
        # Aer's built-in GPU statevector simulation
        sim = AerSimulator(method='statevector', device='GPU')
    else:
        # GPU statevector simulation backed by NVIDIA's cuStateVec
        sim = AerSimulator(method='statevector', device='GPU',
                           cuStateVec_enable=True)
    circuit = QuantumVolume(qubits, depth, seed=0)
    circuit.measure_all()
    circuit = transpile(circuit, sim)
    start = time.time()
    result = sim.run(circuit, shots=1, seed_simulator=12345).result()
    return time.time() - start
First, we will compare the CPU, the GPU, and cuQuantum's cuStateVec by running the same problem on three configurations:
1. CPU: an Intel Core i9-13900KF, which is fairly high-performance.
2. GPU, accelerated by Qiskit Aer's own (IBM-developed) GPU backend.
3. GPU, accelerated by NVIDIA's cuStateVec.
Only the first configuration uses the CPU; the second and third both run on the GPU. The third is our main focus this time: NVIDIA's cuQuantum. The second and third use the same machine and the same program; the only difference is which SDK does the simulation.
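A quick single-run check per configuration might look like this (my own example, reusing the exp_qv function defined above):
# Quick smoke test: one timed 20-qubit, depth-10 run per configuration
for dev, name in enumerate(['CPU', 'GPU', 'cuStateVec GPU']):
    print(name, exp_qv(qubits=20, depth=10, device_num=dev))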
To begin, we sweep the number of qubits across all three configurations:
# number of qubits
x = np.arange(10, 30)
# QV depth
dp = 10
# lists for the results
arr = [[], [], []]
# plot labels
label = ['Qiskit + CPU', 'Qiskit + GPU', 'Qiskit + cuStateVec GPU']
# benchmark loop
for j in range(3):
    for i in x:
        r = exp_qv(i, dp, j)
        arr[j].append(r)
        print(i, r)

plt.xlabel('qubits')
plt.ylabel('sec')
for i in range(3):
    plt.plot(x, arr[i], label=label[i])
plt.title('Benchmark QV depth=' + str(dp))
plt.legend()
plt.show()
Already at 29 qubits, things are quite tough for the CPU. Next, we'll set the QV depth to 30.
The resulting graph looked exactly the same. Given that, running QV on the CPU seems impractical, so from here on I'll compare only the two GPU configurations.
I'll try setting the QV depth to 100.
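Here's a minimal sketch of that GPU-only run, reusing exp_qv, x, and label from above (the variable name gpu_arr is my own; only the depth and the two-backend loop change):
# GPU-only rerun at greater depth
dp = 100
gpu_arr = [[], []]                  # [Aer GPU, cuStateVec GPU]
for j, dev in enumerate((1, 2)):    # skip device_num=0 (CPU)
    for i in x:
        gpu_arr[j].append(exp_qv(i, dp, dev))

plt.xlabel('qubits')
plt.ylabel('sec')
for j in range(2):
    plt.plot(x, gpu_arr[j], label=label[j + 1])
plt.title('Benchmark QV depth=' + str(dp))
plt.legend()
plt.show()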
Essentially, cuStateVec gives a consistent speedup of roughly 1.4x. It looked much the same even at QV depth = 500.
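For reference, the per-size speedup can be read straight off the timing lists (a sketch based on the gpu_arr from the snippet above):
# Speedup: Aer-GPU time divided by cuStateVec time at each qubit count
speedup = np.array(gpu_arr[0]) / np.array(gpu_arr[1])
print('mean speedup: %.2fx' % speedup.mean())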
Overall, computing large problems on a CPU really is quite challenging. And even on a GPU, simply enabling cuQuantum gives roughly a 1.4-1.5x speed boost with an otherwise identical configuration.
While I'd like to try various other examples, for this round NVIDIA cuQuantum proved to be an excellent choice for simulating quantum computers. That's all.