I Tried Running the Full DeepSeek-R1

Yuichiro Minato

2025/02/25 23:25

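I couldn't find many blog posts of this kind (?), so I wrote one.

I recently had a chance to run a lightweight version of DeepSeek on an AMD MI210, and that got me interested in large-scale models.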

https://blueqat.com/yuichiro_minato2/3de2f45c-9fab-449b-8545-d1a6f49fd5c9

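Until then I had thought such models were "too big to be practical," but something clicked and I started to feel that models with more parameters really do seem to perform better.

Come to think of it, I had tried this before with Grok, but I wasn't impressed by its performance at the time and lost interest.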

https://blueqat.com/yuichiro_minato2/958e0802-9d7b-4875-8891-28e47a510e2a

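A huge model with 685B parameters

The full DeepSeek-R1 model has 685B (685 billion) parameters and needs roughly the same amount of VRAM, about 685GB (at one byte per parameter in FP8). Eight NVIDIA H100 (80GB) cards give only 640GB, which falls just short. Eight H100 NVL cards (94GB × 8 = 752GB) should have been enough, so I looked for them in the cloud, but none were available. So I fell back to plan B:

H200 (140GB × 8 = 1120GB)

That was the conclusion I reached. For this run I prepared eight H200s and tried inference. VRAM should be more than enough. I just happened to be able to get access to them.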

|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA H200                    On  |   00000000:18:00.0 Off |                    0 |
| N/A   26C    P0             75W /  700W |       1MiB / 143771MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA H200                    On  |   00000000:29:00.0 Off |                    0 |
| N/A   28C    P0             77W /  700W |       1MiB / 143771MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   2  NVIDIA H200                    On  |   00000000:3A:00.0 Off |                    0 |
| N/A   25C    P0             75W /  700W |       1MiB / 143771MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   3  NVIDIA H200                    On  |   00000000:4B:00.0 Off |                    0 |
| N/A   27C    P0             77W /  700W |       1MiB / 143771MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   4  NVIDIA H200                    On  |   00000000:9A:00.0 Off |                    0 |
| N/A   25C    P0             74W /  700W |       1MiB / 143771MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   5  NVIDIA H200                    On  |   00000000:AA:00.0 Off |                    0 |
| N/A   27C    P0             75W /  700W |       1MiB / 143771MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   6  NVIDIA H200                    On  |   00000000:BA:00.0 Off |                    0 |
| N/A   25C    P0             74W /  700W |       1MiB / 143771MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   7  NVIDIA H200                    On  |   00000000:CA:00.0 Off |                    0 |
| N/A   27C    P0             74W /  700W |       1MiB / 143771MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
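
The VRAM arithmetic in this section can be checked with a few lines of Python. This is only a rough sketch: it assumes one byte per parameter (FP8) and ignores the KV cache and activations, so real headroom is smaller. The per-GPU capacities are the ones quoted in the text.

```python
# Rough check of which 8-GPU node can hold the weights.
# Assumption: 1 byte per parameter (FP8); KV cache and activations are ignored.
PARAMS_BILLION = 685
BYTES_PER_PARAM = 1
NUM_GPUS = 8

# Per-GPU capacities in GB, as quoted in the text.
NODES = {
    "H100": 80,
    "H100 NVL": 94,
    "H200": 140,
}


def headroom_gb(per_gpu_gb: int) -> int:
    """Total node VRAM minus the weight footprint, in GB."""
    weights_gb = PARAMS_BILLION * BYTES_PER_PARAM  # 1e9 params * 1 byte = 1 GB
    return per_gpu_gb * NUM_GPUS - weights_gb


for name, per_gpu in NODES.items():
    spare = headroom_gb(per_gpu)
    verdict = "fits" if spare >= 0 else "does not fit"
    print(f"{name} x{NUM_GPUS}: {per_gpu * NUM_GPUS} GB total, "
          f"{spare:+d} GB headroom -> {verdict}")
```

Under these assumptions the H100 node is 45GB short, the H100 NVL node has 67GB to spare, and the H200 node has 435GB to spare.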

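The impact of FP8 and the choice of F8_E4M3

In theory, the H200 delivers about 53 TFLOPS at standard floating-point precision, and FP8 is faster still. FP8 comes in two formats, F8_E4M3 (4 exponent bits, 3 mantissa bits) and F8_E5M2 (5 exponent bits, 2 mantissa bits). This model is specified to use F8_E4M3, which seems to be the more common choice.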
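
To make the difference between the two FP8 formats concrete, here is a small pure-Python decoder for a single byte in each layout. It is a sketch, not code from the model: it assumes F8_E4M3 means the common "fn" variant (exponent bias 7, no infinities, a single NaN pattern) and that F8_E5M2 is IEEE-like (exponent bias 15, with infinities and NaN). The function names are made up for illustration.

```python
import math


def decode_e4m3(byte: int) -> float:
    """Decode one FP8 E4M3 byte (assumed 'fn' variant: bias 7, no infinities)."""
    sign = -1.0 if byte & 0x80 else 1.0
    exp = (byte >> 3) & 0x0F   # 4 exponent bits
    man = byte & 0x07          # 3 mantissa bits
    if exp == 0x0F and man == 0x07:
        return math.nan        # the only NaN bit pattern in this variant
    if exp == 0:
        return sign * (man / 8) * 2.0 ** -6   # subnormal numbers
    return sign * (1 + man / 8) * 2.0 ** (exp - 7)


def decode_e5m2(byte: int) -> float:
    """Decode one FP8 E5M2 byte (IEEE-like: bias 15, with inf and NaN)."""
    sign = -1.0 if byte & 0x80 else 1.0
    exp = (byte >> 2) & 0x1F   # 5 exponent bits
    man = byte & 0x03          # 2 mantissa bits
    if exp == 0x1F:
        return sign * math.inf if man == 0 else math.nan
    if exp == 0:
        return sign * (man / 4) * 2.0 ** -14  # subnormal numbers
    return sign * (1 + man / 4) * 2.0 ** (exp - 15)


print(decode_e4m3(0x7E))  # largest finite E4M3 value
print(decode_e5m2(0x7B))  # largest finite E5M2 value
```

The largest finite values come out as 448 for E4M3 and 57344 for E5M2: E4M3 trades range for one extra bit of precision, which tends to suit weights.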

Downloading the model and setting up the environment

Downloading from Hugging Face took 4 to 5 hours. Once loaded into VRAM, the model ended up occupying 652GB.

https://huggingface.co/deepseek-ai/DeepSeek-R1
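
One common way to fetch a whole repository like this is `snapshot_download` from the `huggingface_hub` package. The sketch below is not necessarily how the download in this post was done. It assumes `huggingface_hub` is installed and that the target disk has room for several hundred GB; `download_model` and the example directory are hypothetical names.

```python
REPO_ID = "deepseek-ai/DeepSeek-R1"  # repository from the URL above


def download_model(local_dir: str) -> str:
    """Download every file in the repo into local_dir and return the local path.

    Assumes `pip install huggingface_hub` and enough free disk space.
    """
    # Imported lazily so this file loads even without the package installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=REPO_ID, local_dir=local_dir)


# Example call (hypothetical directory; this starts a very large download):
# path = download_model("./DeepSeek-R1")
```

The call is left commented out on purpose, since running it starts a multi-hour transfer.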

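The difficulty of offering this as a service

What struck me most from running the model myself is that a huge model is hard work just to get running. What kind of engineering does it take to operate something like this stably as an API service, the way GPT-4 and DeepSeek are offered? The sheer scale amazed me all over again.

The actual inference test

After working through various errors, I finally got it running.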

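Prompt

量子コンピュータにおける量子古典ハイブリッドアルゴリズムについて詳細に解説してください。
(In English: "Please explain quantum-classical hybrid algorithms in quantum computing in detail.")

I set max_new_tokens = 2048 and started inference.

Checking the output token by token

Before the full run, I checked the output one token at a time. The result: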

具体的,
な,
アル,
ゴ,
リ,
ズム,
の,
例,
や,
その,
応,
用,
分野,
、,
今,
後の,
展望,
など,
も,
含,
めて,
教,
えて,
ください,
。

,
Okay,
,,
 so,
 I,
 need,
 to,
 explain,
 quantum,
-class,
ical,
 hybrid,
 algorithms,
 in,
 quantum,
 computing,

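This shows that Japanese and English are split into tokens very differently. English is mostly split at word boundaries, while Japanese tends to be split almost character by character.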
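
The difference can be quantified directly from the tokens listed above by counting characters per token. This is a small sketch over that sample only; the token lists are copied from the output (leading spaces in the English tokens are kept as part of the token), and `chars_per_token` is a helper name chosen for illustration.

```python
# Tokens copied from the token-by-token output above.
JA_TOKENS = ["具体的", "な", "アル", "ゴ", "リ", "ズム", "の", "例", "や", "その",
             "応", "用", "分野", "、", "今", "後の", "展望", "など", "も", "含",
             "めて", "教", "えて", "ください", "。"]
EN_TOKENS = ["Okay", ",", " so", " I", " need", " to", " explain", " quantum",
             "-class", "ical", " hybrid", " algorithms", " in", " quantum",
             " computing"]


def chars_per_token(tokens: list[str]) -> float:
    """Average number of characters carried by one token."""
    return sum(len(t) for t in tokens) / len(tokens)


print(f"Japanese: {chars_per_token(JA_TOKENS):.2f} chars/token")
print(f"English:  {chars_per_token(EN_TOKENS):.2f} chars/token")
```

On this sample, a Japanese token carries about 1.6 characters and an English token about 5.5. The sample is tiny, so treat the numbers as an illustration rather than a measurement of the tokenizer.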

One split in particular caught my eye:

Vari,
ational,
 Quantum,
 E,
igens,
olver,
 (,
V,
QE,
),
 and,
 Quantum,
 Appro,
ximate,
 Optimization,
 Algorithm,
 (,
QA,
OA,
).

Some words are not split at word boundaries, which shows the tokenizer at work. QAOA still comes through reasonably well, but I feel a bit sorry for VQE...

Final output

Generating Response

具体的なアルゴリズムの例やその応用分野、今後の展望なども含めて教えてください。

Okay so I need to explain quantum-classical hybrid algorithms in quantum computing. Let me start by recalling what I know. Quantum computers aren't fully capable yet right? They have limited qubits and error-prone. So maybe hybrid algorithms use classical computers to handle parts that are too resource-heavy for current quantum devices.

First what's a hybrid algorithm? I think it's where some parts are run on a quantum processor and others on a classical one. Like variational algorithms where you optimize parameters classically. The most famous example might be the Variational Quantum Eigensolver (VQE) and Quantum Approximate Optimization Algorithm (QAOA).

VQE is used for finding the ground state energy of molecules which is important in chemistry. The quantum part prepares a trial state and measures the expectation value of the Hamiltonian. The classical part then adjusts the parameters to minimize this expectation value. That makes sense because optimizing parameters can be done with classical methods like gradient descent.

QAOA is for combinatorial optimization problems. It uses a parameterized quantum circuit to create a state that approximates the solution and the classical optimizer tunes the parameters to maximize the probability of measuring a good solution. Applications could be in logistics finance maybe even machine learning.

Another example might be quantum machine learning algorithms like the Quantum Neural Networks (QNN). Here the quantum circuit acts as a neural network where parameters are adjusted classically. But I'm not too sure about the specifics here. Maybe using parameterized quantum circuits as part of a larger machine learning model.

Then there's the field of error mitigation. Since current quantum devices are noisy hybrid approaches might involve classical post-processing to reduce errors. For example extrapolating results from different noise levels to estimate the zero-noise limit.

Applications: Chemistry simulations optimization problems machine learning material science finance for portfolio optimization maybe even in cryptography for certain tasks.

Future prospects: As quantum hardware improves these hybrid algorithms could handle larger problems. They might bridge the gap until fault-tolerant quantum computers are available. Also combining with classical AI techniques could enhance performance. Maybe developing more efficient classical optimizers tailored for quantum circuits or better ansatz designs to reduce the parameter space.

Wait but I should check if there are other algorithms. There's also the Quantum Gradient Descent where gradients are computed on a quantum computer to speed up optimization. Or maybe some hybrid algorithms for solving linear systems like the HHL algorithm but that's purely quantum? Hmm maybe HHL is not hybrid.

Also the concept of "classical shadows" in quantum tomography uses classical post-processing to reconstruct quantum states efficiently. That might be part of a hybrid approach.

I need to structure this. Start with an introduction explaining why hybrid algorithms are necessary (NISQ era limitations). Then explain the general framework: quantum processing for specific tasks (state preparation expectation estimation) and classical for optimization error correction etc. Then go into specific algorithms: VQE QAOA Quantum Machine Learning. For each explain how they work their applications. Then discuss applications in different fields. Finally future directions like better hardware integration improved error mitigation hybrid algorithms for more complex problems and maybe the role of quantum machine learning in enhancing classical algorithms.

Wait but I should make sure I'm accurate. For example VQE is specifically for ground state problems. QAOA for combinatorial optimization. Are there others? Maybe the Quantum Autoencoder for data compression? Or algorithms for quantum chemistry beyond VQE like DMET (density matrix embedding theory) combined with quantum algorithms?

Also in terms of applications material design drug discovery for chemistry applications. Optimization could be route planning scheduling. Finance for risk analysis portfolio management. Machine learning could be classification pattern recognition.

Future directions: Co-design of quantum and classical algorithms leveraging quantum advantages in specific subroutines. Integration with classical high-performance computing. Developing standardized frameworks and libraries (like PennyLane Qiskit Cirq) that facilitate hybrid algorithm development. Also exploring quantum advantage in hybrid settings where even a small quantum contribution provides a significant speedup.

Potential challenges: Parameter optimization can get stuck in local minima especially with many parameters. The barren plateau problem where gradients vanish exponentially with qubit count. Error mitigation techniques are still under development. Scalability of the ansatz circuits as problem size increases.

I should also mention specific companies or research groups working on these. For example IBM's Qiskit has implementations of VQE and QAOA. Google's TensorFlow Quantum integrates with quantum machine learning. Startups like Zapata Computing focus on hybrid algorithms for enterprise applications.

In summary the answer should cover:

  1. Introduction to hybrid algorithms (why they're used in NISQ era)
  2. General structure: quantum and classical parts working together
  3. Examples: VQE QAOA Quantum Machine Learning models
  4. Applications in various fields
  5. Current challenges and future directions

Make sure to explain each algorithm's mechanism what the quantum part does what the classical part does and their applications. Also mention error mitigation as part of the classical processing. Future directions could include hardware improvements algorithm co-design error correction integration and expanding application areas.</think>

量子古典ハイブリッドアルゴリズムは、現在のノイズあり中規模量子(NISQ)デバイスの制約を克服するために開発された手法で

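(rest omitted)

At first I thought the answer was in English, but that was just the thinking section. The actual answer came out properly in Japanese.

Running the full DeepSeek-R1 model gave me a renewed sense of how practical large models are. It is hard to put together this kind of environment in Japan, and I was struck by how formidable the US-China battle of sheer scale is.

Also, even with that much memory, long texts still run out of it.

Having seen this, I felt that even if quantum computing turns into a battle of scale too, I want to keep at it.
Unrelated, but companies like TOYOTA and UNIQLO that compete worldwide on sheer volume are impressive too.

The money also melted away like ice cream.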

© 2025, blueqat Inc. All rights reserved