Today I ran some benchmarks of Torch Tytan. To keep the number of cases from getting out of hand, I timed moderately sized problems of 5,000 and 10,000 qubits and compared the results across different GPUs.
from tytan import *
import random
import time

N = 5000

# qubits
q = symbols_list(N, 'q{}')

# hamiltonian
H = 0

# biases
for i in range(N):
    H += random.randint(-10, 10) * q[i]

# Jij: set only N interaction terms, since setting all pairwise connections takes too long to finish
for i in range(N):
    H += random.choice([-1, 1]) * q[random.randint(0, N - 1)] * q[random.randint(0, N - 1)]

# compile
qubo, offset = Compile(H).get_qubo()

# sampler
solver = sampler.ArminSampler(seed=None, mode='GPU', device='cuda:0', verbose=1)

start = time.time()

# sampling
result = solver.run(qubo, shots=1)

print(time.time() - start)
In this way, the 5,000-qubit problem gets a moderate 5,000 interaction terms and the 10,000-qubit problem gets 10,000 interaction terms. All measurements were taken on a single GPU.
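For reference, the two problem sizes can be generated and timed from a single parameterized loop. The sketch below is my own restructuring, not part of the original script; the build_random_qubo helper and the loop over sizes are my additions, and it reuses only the tytan calls shown above.

from tytan import *
import random
import time

def build_random_qubo(N):
    # N qubits with random biases and N random pairwise interactions,
    # mirroring the benchmark setup above
    q = symbols_list(N, 'q{}')
    H = 0
    for i in range(N):
        H += random.randint(-10, 10) * q[i]
    for i in range(N):
        H += random.choice([-1, 1]) * q[random.randint(0, N - 1)] * q[random.randint(0, N - 1)]
    qubo, offset = Compile(H).get_qubo()
    return qubo

for N in (5000, 10000):
    qubo = build_random_qubo(N)
    solver = sampler.ArminSampler(seed=None, mode='GPU', device='cuda:0', verbose=1)
    start = time.time()
    result = solver.run(qubo, shots=1)
    print(N, 'qubits:', time.time() - start, 's')

As in the original script, only the sampling step is inside the timed interval; QUBO construction and compilation happen before the timer starts, so the numbers below reflect solver time only.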
5,000 qubits
H100 : 3.7187023162841797 s
RTX 6000 Ada : 3.443608045578003 s
T4 : 14.257462501525879 s

10,000 qubits
H100 : 12.083187103271484 s
RTX 6000 Ada : 11.568817615509033 s
T4 : 62.8189423084259 s
It went quite well. The H100 could probably go even faster, but I'll leave that for another time.