Model Merging in LLM (Large Language Models)

Yuichiro Minato

2024/03/28 04:01

Hello, I previously tried merging models for an image generation AI, but this time, I've attempted to merge language models.

I believe that merging language models well can yield a model that performs better than the models it was built from.

In tutorials on merging language models, mergekit seems to be the well-known choice.

It allows for relatively easy model merging.

https://github.com/arcee-ai/mergekit

I will try merging according to the example provided.

mergekit seems to offer several methods for combining the weight tensors, not just simple linear interpolation.
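To make the difference concrete, here is a minimal NumPy sketch comparing linear interpolation with spherical linear interpolation (slerp), the method the merged model's name refers to. The function names `lerp` and `slerp` and the toy vectors are my own illustration; this shows the general idea, not mergekit's actual implementation.

```python
import numpy as np


def lerp(t, a, b):
    """Linear interpolation: a straight-line blend of two tensors."""
    return (1.0 - t) * a + t * b


def slerp(t, a, b, eps=1e-8):
    """Spherical linear interpolation between two tensors of equal shape.

    The angle between the tensors is measured on their flattened,
    normalized copies; the blend weights are then applied to the
    original tensors.
    """
    a_flat = a.ravel().astype(np.float64)
    b_flat = b.ravel().astype(np.float64)
    a_unit = a_flat / (np.linalg.norm(a_flat) + eps)
    b_unit = b_flat / (np.linalg.norm(b_flat) + eps)
    dot = np.clip(np.dot(a_unit, b_unit), -1.0, 1.0)

    # Nearly parallel tensors: the angle is ~0, so fall back to lerp
    # to avoid dividing by a vanishing sine.
    if abs(dot) > 1.0 - 1e-6:
        return lerp(t, a, b)

    theta = np.arccos(dot)
    sin_theta = np.sin(theta)
    w_a = np.sin((1.0 - t) * theta) / sin_theta
    w_b = np.sin(t * theta) / sin_theta
    return w_a * a + w_b * b


# Toy example: two orthogonal unit vectors standing in for weight tensors.
a = np.array([1.0, 0.0])
b = np.array([0.0, 1.0])
print(np.linalg.norm(lerp(0.5, a, b)))   # about 0.707: the midpoint shrinks
print(np.linalg.norm(slerp(0.5, a, b)))  # about 1.0: the magnitude is preserved
```

The point of the comparison is that a straight-line blend of two differently oriented tensors loses magnitude, while slerp follows the arc between them.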

https://note.com/npaka/n/nc8bc297f517d

Following that article, I merged the two models listed below.

Because of VRAM limitations, I chose 7B models.

  - model: OpenPipe/mistral-ft-optimized-1218
    layer_range: [0, 32]
  - model: mlabonne/NeuralHermes-2.5-Mistral-7B

Once started, the merge computation runs to completion. It seems to take a few minutes.

Since it's possible to upload directly to Hugging Face, I went ahead and uploaded it.

https://huggingface.co/minatoyuichiro/NeuralPipe-7B-slerp

The upload even comes with a README, which makes it simple to use. I'll give it a try.

prompt = "量子コンピュータとはなんですか?日本語で答えてください"  # "What is a quantum computer? Please answer in Japanese."
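For reference, a prompt like the one above could be sent to the merged model roughly as follows. This is a sketch under assumptions, not necessarily how the post's output was produced: it assumes the `transformers`, `torch`, and `accelerate` packages are installed and that a GPU with enough memory is available. The helper name `generate` and the sampling settings are illustrative choices of mine.

```python
MODEL_ID = "minatoyuichiro/NeuralPipe-7B-slerp"
PROMPT = "量子コンピュータとはなんですか?日本語で答えてください"


def generate(model_id: str, prompt: str, max_new_tokens: int = 256) -> str:
    """Run one prompt through a Hugging Face text-generation pipeline."""
    # Imported here so that defining the helper does not require the
    # heavy dependencies to be installed.
    import torch
    from transformers import pipeline

    generator = pipeline(
        "text-generation",
        model=model_id,
        torch_dtype=torch.float16,  # half precision to fit a 7B model in less VRAM
        device_map="auto",          # needs the accelerate package
    )
    outputs = generator(
        prompt,
        max_new_tokens=max_new_tokens,
        do_sample=True,    # illustrative sampling settings, not from the post
        temperature=0.7,
        top_p=0.95,
    )
    # By default the returned text contains the prompt followed by the continuation.
    return outputs[0]["generated_text"]


# Example call (downloads the model weights on first use):
# print(generate(MODEL_ID, PROMPT))
```

Passing a different model ID, such as one of the source models, to the same helper allows a like-for-like comparison on the same prompt.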

Quantum computer is a computer that uses quantum mechanics to perform calculations. It is a new type of computer that can solve problems that are difficult or impossible for classical computers to solve.

量子コンピュータは、クラシカルコンピュータにとって解決できない問題を解くことができる新しいタイプのコンピュータです。その基礎となる物理法則は、量子力学です。

量子コンピュータは、クラシカルコンピュータよりも、特定の問題を解く速度が非常に高いことが期待されています。ただし、現在は実用的な量子コンピュータが存在しないため、実際の利用はまだ遠いです。

量子コンピュータの基本的な原理として、量子ビット(qubit)が使われています。

Well, I think it's quite good.

Next, I'll ask the same question to one of the models from before the merge.

mlabonne/NeuralHermes-2.5-Mistral-7B

Trying this model gives the following:

Quantum computer is a type of computer that uses quantum mechanics to perform calculations. It is expected to be able to solve problems that classical computers cannot solve efficiently.

量子コンピュータは、クラシカルコンピュータが効率的に解決できない問題を解決することが期待される、量子物理学を用いた計算を行うコンピュータです。

Quantum computers use quantum bits, or qubits, instead of classical bits. Qubits can exist in multiple states at the same time, allowing for parallel processing and potentially faster calculations.

量子コンピュータは、クラシカルコンピュータのビットを代わりに量子ビット(クイビット)を使用しています。クイビットは、同時に複数の状態を持つことができるため、並列処理が可能で、潜在的にはより速い計算が行えます。

On this prompt, at least, the merged version seems considerably better. Seeing that merging works for language models too, not just for diffusion models, and that it can improve performance makes me want to use it more and more. That's all.

© 2025, blueqat Inc. All rights reserved