Trying Llama-2-70B-chat-GGUF on Google Colab
I tried "Llama-2-70B-chat-GGUF" on "Google Colab", so here is a summary.
1. Llama-2-70B-chat-GGUF
This uses "TheBloke/Llama-2-70B-chat-GGUF". As of September 12, 2023, 70B is the largest-parameter model in the "Llama 2" family.
2. Running on Colab
The steps to run it on Colab are as follows.
(1) Open a Colab notebook and, from the menu "Edit → Notebook settings", select "A100" under "GPU".
(2) Clone llama.cpp.
# Clone llama.cpp
!git clone https://github.com/ggerganov/llama.cpp
%cd llama.cpp
(3) Build llama.cpp.
This produces the main executable.
# Build llama.cpp
!mkdir build
%cd build
!cmake .. -DLLAMA_CUBLAS=ON
!cmake --build . --config Release
%cd ..
!cp ./build/bin/main main
(4) Download the model.
This time, we use "llama-2-70b-chat.Q4_K_M.gguf".
!wget https://huggingface.co/TheBloke/Llama-2-70B-chat-GGUF/resolve/main/llama-2-70b-chat.Q4_K_M.gguf
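Because a quantized 70B model is a very large download, it may help to check the free disk space on the Colab instance before running wget. The following is a minimal sketch using only the Python standard library. The helper name has_free_space and the REQUIRED_GB value are hypothetical and not part of the original steps; check the model card for the actual file size.

```python
import shutil

def has_free_space(path: str, required_gb: float) -> bool:
    """Return True if the filesystem holding `path` has at least `required_gb` GiB free."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * 1024**3

# Hypothetical threshold; replace it with the file size listed on the model card.
REQUIRED_GB = 45

print(has_free_space(".", REQUIRED_GB))
```

If this prints False, the download is likely to fail partway through, so free some space or pick a smaller quantization first.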
(5) Prepare the prompt.
It is saved in prompt.txt.
%%writefile prompt.txt
[INST] <<SYS>>
You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature. If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.
<</SYS>>
What is Bocchi-chan's personality from BOCCHI THE ROCK?[/INST]
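The prompt file above follows the single-turn layout [INST] <<SYS>> system text <</SYS>> question [/INST]. As a sketch, the same text can be assembled in Python so the system message and the question can be swapped easily. The function name build_llama2_prompt, the shortened system message, and the output file name prompt_generated.txt are assumptions made for illustration.

```python
def build_llama2_prompt(system: str, user: str) -> str:
    """Assemble a single-turn prompt in the same layout as prompt.txt above."""
    return f"[INST] <<SYS>>\n{system}\n<</SYS>>\n{user}[/INST]"

# Shortened system message for illustration; the article uses the full default text.
system = "You are a helpful, respectful and honest assistant."
user = "What is Bocchi-chan's personality from BOCCHI THE ROCK?"

prompt = build_llama2_prompt(system, user)

# Hypothetical file name, chosen so the original prompt.txt is not overwritten.
with open("prompt_generated.txt", "w", encoding="utf-8") as f:
    f.write(prompt)

print(prompt)
```

The generated file can then be passed to main with -f in place of prompt.txt.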
(6) Question answering.
With GPU offload set to 80 of 83 layers, generation ran at 11.77 tokens/second.
# Question answering
!./main -m llama-2-70b-chat.Q4_K_M.gguf --temp 0.1 -ngl 80 -c 256 -b 256 -f prompt.txt
system_info: n_threads = 6 / 12 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 |
sampling: repeat_last_n = 64, repeat_penalty = 1.100000, presence_penalty = 0.000000, frequency_penalty = 0.000000, top_k = 40, tfs_z = 1.000000, top_p = 0.950000, typical_p = 1.000000, temp = 0.100000, mirostat = 0, mirostat_lr = 0.100000, mirostat_ent = 5.000000
generate: n_ctx = 256, n_batch = 256, n_predict = -1, n_keep = 0
[INST] <<SYS>>
You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature. If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.
<</SYS>>
What is Bocchi-chan's personality from BOCCHI THE ROCK?[/INST] Based on the information available, Bocchi-chan's personality from BOCCHI THE ROCK can be described as follows:
* She is a shy and introverted girl who has difficulty interacting with others.
* She is a talented musician and plays the guitar well.
* Despite her shyness, she is determined to pursue her passion for music and become a professional musician.
* She is shown to be kind and considerate towards others, especially her bandmates.
* She has a tendency to get nervous and flustered easily, especially when interacting with people she admires or in high-pressure situations.
* She values her relationships with her friends and bandmates and is willing to work hard to support them and achieve their shared goals.
* She has a strong sense of determination and perseverance, as seen in her efforts to overcome her shyness and become a successful musician.
Overall, Bocchi-chan's personality can be described as shy, introverted, considerate, determined, and passionate about music. [end of text]
llama_print_timings: load time = 98781.53 ms
llama_print_timings: sample time = 327.34 ms / 244 runs ( 1.34 ms per token, 745.41 tokens per second)
llama_print_timings: prompt eval time = 5187.90 ms / 413 tokens ( 12.56 ms per token, 79.61 tokens per second)
llama_print_timings: eval time = 20472.63 ms / 241 runs ( 84.95 ms per token, 11.77 tokens per second)
llama_print_timings: total time = 26165.92 ms
The content of the answer is correct.
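The tokens/second figure comes from the eval time line of llama_print_timings: the number of runs divided by the elapsed time. The following sketch recomputes it from the log text. The regular expression is an assumption based on the log format shown in this article and may need adjusting for other llama.cpp versions.

```python
import re

# Matches lines such as:
# "llama_print_timings: eval time = 20472.63 ms / 241 runs ( ... )"
TIMING_RE = re.compile(
    r"llama_print_timings:\s+(?P<name>[a-z ]+?) time\s+=\s+"
    r"(?P<ms>[\d.]+) ms\s+/\s+(?P<count>\d+) (?:runs|tokens)"
)

def tokens_per_second(line: str) -> float:
    """Recompute throughput from one per-token timing line of the log."""
    m = TIMING_RE.search(line)
    if m is None:
        raise ValueError("not a per-token timing line")
    seconds = float(m.group("ms")) / 1000.0
    return int(m.group("count")) / seconds

eval_line = (
    "llama_print_timings: eval time = 20472.63 ms / 241 runs "
    "( 84.95 ms per token, 11.77 tokens per second)"
)
print(round(tokens_per_second(eval_line), 2))  # 11.77
```

The same function applied to the prompt eval time line gives the prompt processing speed, which is much higher because prompt tokens are processed in batches.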