Trying RetrievalQA with Llama 2 + LangChain locally

npaka

July 19, 2023 21:12

I tried RetrievalQA with "Llama 2 + LangChain" on a local machine, so here is a summary.

・macOS 13.4.1
・Python 3.10.10

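1. Models used

This time I use "llama-2-7b-chat.ggmlv3.q4_0.bin" (4-bit quantized GGML) as the LLM and "multilingual-e5-large" as the embedding model.

2. Preparing the document

For the document, I prepared an English translation of the Mangapedia synopsis of "Bocchi the Rock!".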

・bocchi_en.txt

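3. Running locally

The steps for running locally are as follows.

(1) Prepare a Python virtual environment.

(2) Prepare "Llama 2" (llama-2-7b-chat.ggmlv3.q4_0.bin).
This is the same as in the previous article.

(3) Install the packages.
GPU support on macOS looked like a hassle, so I am running on the CPU.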

$ pip install llama-cpp-python
$ pip install langchain
$ pip install faiss-cpu
$ pip install sentence_transformers
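
As a quick sanity check after installation, the following minimal sketch reports which of the four packages cannot be found in the active environment. The helper name `find_missing` is made up for illustration; the module names are the import names of the packages installed above.

```python
import importlib.util

# Import names of the four packages installed above. The import name can
# differ from the pip name (llama-cpp-python -> llama_cpp, faiss-cpu -> faiss).
REQUIRED_MODULES = ["llama_cpp", "langchain", "faiss", "sentence_transformers"]


def find_missing(modules):
    """Return the module names that cannot be located in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

This only checks that the modules can be located; it does not load the model or verify versions.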

(4) Write the code.

・hello_qa.py

import logging
import sys

from langchain.chains import RetrievalQA
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.llms import LlamaCpp
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores.faiss import FAISS

# Set the log level
logging.basicConfig(stream=sys.stdout, level=logging.DEBUG, force=True)

# Load the document
with open("bocchi_en.txt") as f:
    text_all = f.read()

# Split into chunks
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=300,  # maximum number of characters per chunk
    chunk_overlap=20,  # maximum number of overlapping characters
)
texts = text_splitter.split_text(text_all)

# Check the chunks
print(len(texts))
for text in texts:
    print(text[:10].replace("\n", "\\n"), ":", len(text))


# Build the index
index = FAISS.from_texts(
    texts=texts,
    embedding=HuggingFaceEmbeddings(model_name="intfloat/multilingual-e5-large"),
)
index.save_local("storage")

# Load the index
# index = FAISS.load_local(
#    "storage", HuggingFaceEmbeddings(model_name="intfloat/multilingual-e5-large")
# )

# Prepare the LLM
# llm = OpenAI(temperature=0, verbose=True)
# NOTE: the answers shown in the results run well past 32 tokens, which
# suggests the settings nested under `input` are not applied. LlamaCpp also
# accepts `max_tokens` and `stop` as top-level arguments.
llm = LlamaCpp(
    model_path="./llama-2-7b-chat.ggmlv3.q4_0.bin",
    input={
        "max_tokens": 32,
        "stop": ["System:", "User:", "Assistant:", "\n"],
    },
    verbose=True,
)

# Build the question-answering chain
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=index.as_retriever(search_kwargs={"k": 4}),
    verbose=True,
)

# Run the question-answering chain
print("A1:", qa_chain.run("What kind of person is Hitori Goto?"))
print("A2:", qa_chain.run("What instrument is Hitori Goto good at?"))
print("A3:", qa_chain.run("What did Hitori Goto do at the school festival?"))
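
To make the last two steps of the script more concrete, here is a rough sketch of what a retriever with k results followed by a "stuff" chain does conceptually: rank the chunks by similarity to the question, then concatenate the top ones into a single prompt. Everything in it is an assumption for illustration: a toy word-count embedding with cosine similarity stands in for multilingual-e5-large and FAISS, the prompt template is hypothetical, and `embed`, `cosine`, `retrieve`, and `build_stuff_prompt` are made-up names, not LangChain APIs.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: a bag-of-words count vector (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(chunks, question, k=4):
    """Return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(embed(c), q), reverse=True)
    return ranked[:k]


def build_stuff_prompt(chunks, question, k=4):
    """Concatenate ("stuff") the retrieved chunks into a single prompt."""
    context = "\n\n".join(retrieve(chunks, question, k))
    # Hypothetical template, not the one LangChain uses internally.
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


# Made-up chunks for illustration only (not taken from bocchi_en.txt).
chunks = [
    "The guitarist practices alone every night.",
    "The school festival takes place in autumn.",
    "The band plays at a small live house.",
]
print(build_stuff_prompt(chunks, "Who is the guitarist?", k=1))
```

In the real script, the resulting prompt is what gets sent to the Llama 2 model, so the answer quality depends on both the retrieved chunks and the model.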

I used RecursiveCharacterTextSplitter so that every chunk is at most 300 characters long. The document is split into 108 chunks.

108
kessoku ba : 298
. One of t : 295
of his gui : 292
Kika Ijich : 293
the uneven : 297
prepares f : 299
a sudden s : 152
school fes : 299
who were e : 299
    :
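
To illustrate what chunk_size and chunk_overlap mean, here is a simplified sketch that cuts text at fixed offsets. It is not how RecursiveCharacterTextSplitter works internally: the real splitter prefers to break at separators such as paragraph breaks, newlines, and spaces, which is why the chunks above come out slightly under 300 characters rather than exactly 300. The function name `split_with_overlap` is made up for this example.

```python
def split_with_overlap(text, chunk_size=300, chunk_overlap=20):
    """Cut text into windows of at most chunk_size characters.

    Consecutive windows share chunk_overlap characters. This is a simplified
    stand-in: it cuts at fixed offsets and does not look for separators.
    """
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current window already reached the end of the text
        start += step
    return chunks


sample = "abcdefghij" * 100  # 1000 characters of dummy text
sample_chunks = split_with_overlap(sample)
print(len(sample_chunks), [len(c) for c in sample_chunks])
# prints: 4 [300, 300, 300, 160]
```

The overlap means the last 20 characters of one chunk reappear at the start of the next, so a sentence cut at a chunk boundary is less likely to lose its context.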

(5) Run the code.

$ python hello_qa.py

The results are as follows.

Q1: What kind of person is Hitori Goto?
A1:  Based on the context provided, it seems that Hitori Goto is a shy and introverted person who struggles with social interactions and communication. They have a strong desire for approval but often feel timid and unsure of themselves around others. They have a particular interest in exploring ruins and vintage clothing stores, and they seem to be happy when people refer to them as a "weirdo". However, it's important to note that this is based on a limited amount of information, and it's possible that there may be more to Hitori's personality than what is described here.

Q2: What instrument is Hitori Goto good at?
A2:  Based on the given context, it can be inferred that Hitori Goto is good at playing the guitar. This is because she spends time practicing the guitar and is interested in Hitori Goto's guitar technique and her talent. Additionally, the fact that she was in a band in the past and was in charge of guitar suggests that she has some level of proficiency with the instrument.

Q3: What did Hitori Goto do at the school festival?
A3:  She performed ad lib and bottleneck playing, leading the live to success.

Even the 7B model (with 13B and 70B still left to try) seems to understand context far better than other local LLMs.
