【Ollama：ChatGPTを自分で作る】英語解説を日本語で読む【2023年11月11日｜@Matthew Berman】

2023年11月11日 18:32

この動画では、Ollamaを使ってゼロからChatGPTを構築する方法について詳しく説明しています。Ollamaは、簡単に大規模な言語モデルをコンピュータ上で実行し、複数のモデルを並行して動かすことができるツールです。このプラットフォームには多様なオープンソースモデルが含まれており、Webやデスクトップ統合など様々な機能が提供されています。動画では、Ollamaを用いてChatGPTの構築プロセスと、さらに進んだアプリケーション開発方法についても触れています。
公開日：2023年11月11日
※動画を再生してから読むのがオススメです。

I'm going to show you how to build ChatGPT from scratch using any open-source model that you want.

オープンソースのモデルを使って、ChatGPTをゼロから構築する方法を紹介します。

Ollama is the easiest way to run large language models on your computer and build incredible applications on top of them.

Ollamaは、コンピュータ上で大規模な言語モデルを実行し、その上に素晴らしいアプリケーションを構築する最も簡単な方法です。

Ollama powers you to run multiple models in parallel.

Ollamaは複数のモデルを並列に実行することができます。

It absolutely blew me away when I first saw it, so I'm going to show you that too.

初めて見たときは本当に驚かされたので、それもお見せします。

So let's go.

では行こう。

So this is the Ollama homepage, ollama.ai, and all you need to do is click Download now.

これがOllamaのホームページ、ollama.aiで、「今すぐダウンロード」をクリックするだけです。

Right now, it's only for Mac OS and Linux, but they are making a Windows version and it's coming soon.

今はMac OSとLinux用だけだけど、Windows版も作っていて、もうすぐリリースされるよ。

But you could probably use WSL for Windows to get it working on Windows if you want to use it right now.

でも、今すぐ使いたいなら、WSL for Windowsを使えばWindowsでも使えるようになるだろう。

So just click download and once you do that, just open it up.

ダウンロードをクリックして、それを開けばいい。

That's it, you're done.

それだけです、終わりです。

And once you open it up, you're going to see this little icon in your taskbar, right there.

そして一度開くと、タスクバーにこの小さなアイコンが表示されます。

That's it, that's how lightweight it is.

それだけです。それくらい軽量なんです。

And everything else is done through the command line or code itself.

あとはすべて、コマンドラインやコードそのものを使って行う。

So if you click over to this little models link, you can see the models that are available.

この小さなモデルのリンクをクリックすると、利用可能なモデルが表示されます。

And they have all the most popular open-source models right now.

現在最も人気のあるオープンソースのモデルがすべて揃っている。

Here's Code LLaMA, here's LLaMA 2, Mistral, and they have a ton.

ここにはCode LLaMA、LLaMA 2、Mistralなど、たくさんのモデルがあります。

So go ahead and look through it.

さあ、見てみてください。

Here's Zephyr, here's Falcon.

Zephyr、Falcon。

They even have Dolphin 2.2 Mistral.

Dolphin 2.2 Mistralもある。

So they really do have a ton of great models that you can use, and they're adding more all the time.

というわけで、本当にたくさんの素晴らしいモデルがあり、常に追加されているんだ。

So now I'm going to show you how to run it through the command line.

コマンドラインから実行する方法をお見せしましょう。

Then I'm going to show you having multiple models up and running, ready to go at the same time.

そして、複数のモデルを同時に立ち上げ、実行できるようにする方法をお見せします。

And then we're going to actually build something with it.

そして、それを使って何かを作ります。

Okay, so now that we have Ollama running in our taskbar, all we have to type is ollama run and then the model name that you want to run.

Ollamaがタスクバーで起動しているので、あとはollama runに続けて実行したいモデル名を入力するだけです。

And so we're going to run Mistral.

ここではMistralを実行します。

Now, I already have this downloaded, but if you don't, it will download it for you.

私はすでにダウンロード済みですが、まだの場合は自動的にダウンロードしてくれます。

So then I just hit enter, and that's it.

Enterを押せば完了です。

We have it up and running.

これで起動しました。

Let's give it a test.

テストしてみましょう。

Tell me a joke.

ジョークを言ってみてください。

And look how fast that is.

見てください、この速さ。

Why was the math book sad?

どうして算数の本は悲しかったの？

Because it had too many problems.

それは多くの問題を抱えていたからです。

So perfect.

完璧だ。

And it is blazing fast, and that is a function of both Ollama and Mistral.

しかも驚くほど高速で、これはOllamaとMistralの両方のおかげです。

But let me blow your mind.

でも、驚かせてやろう。

Now I'm going to open up a second window.

2つ目のウィンドウを開こう。

I'm going to put these windows side by side.

このウィンドウを横に並べてみる。

And now I still have Mistral running, and now I'm going to use ollama run llama2.

そしてMistralを起動させたまま、ollama run llama2を実行する。

And now I'm going to have LLaMA 2 running at the same time.

そして今度はLLaMA 2を同時に走らせる。

Now, I have a pretty high-end Mac, but the way it handles it is absolutely blazing fast.

私のMacはかなりハイエンドだが、その処理速度は驚くほど速い。

So we have Mistral on the left, we have LLaMA 2 on the right.

つまり、左側にMistral、右側にLLaMA 2があります。

I'm going to give them a prompt that requires them to write a long response and do it both at the same time.

私は彼らに、長い応答を書く必要があるプロンプトを与え、それを同時に行うようにします。

And let's see what happens.

どうなるか見てみよう。

Okay, so on the left, I'm writing, "Write a thousand-word essay about AI."

左側には「AIについて1000語のエッセイを書いてください」と入力します。

And then on the right, with LLaMA 2, "Write a thousand-word essay about AI."

そして右側のLLaMA 2にも「AIについて1000語のエッセイを書いてください」と入力します。

So, the first thing is, let's trigger Mistral.

まず最初に、Mistralを実行します。

And then, at the same time, I'm going to trigger LLaMA 2.

そして同時にLLaMA 2を起動させる。

So, let's see what happens.

どうなるか見てみよう。

All right, on the left side, it goes first, and it is blazing fast.

よし、左側が最初に動き出し、しかも驚くほど高速だ。

It is writing that essay about AI.

AIについてのエッセイを書いているところだ。

On the right side, LLaMA 2 is waiting, and as soon as it's done, it starts writing it with LLaMA 2.

右側ではLLaMA 2が待機していて、左側が終わるとすぐにLLaMA 2が書き始める。

How incredible is that?

なんてすごいんだろう。

So, it swapped out the models in a mere maybe one and a half seconds.

つまり、モデルの入れ替えにかかったのはわずか1.5秒ほどです。

It is absolutely mind-blowing how they were able to do that.

どうやってそんなことができるのか、本当に驚かされる。

So, I had Mistral run it on the left, LLaMA run it on the right, and they just ran sequentially.

だから、左側でMistralを実行し、右側でLLaMAを実行し、それらは順番に実行されました。

You can have four, eight, 10, as many models as you want running at the same time, and they'll queue up and run sequentially.

4つ、8つ、10個と、好きな数のモデルを同時に起動しておくことができ、それらはキューに並んで順番に実行される。

And the swapping between the models is lightning fast.

モデル間の切り替えは、非常に高速です。
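The side-by-side demo can be approximated in code. What follows is a minimal sketch, not the presenter's code: `run_side_by_side` and `fake_generate` are hypothetical names, and the stand-in backend only echoes the prompt so the example runs without Ollama installed. In a real setup the `generate` argument would send the prompt to the local Ollama server, which, as described above, queues the two requests and serves them one after the other.

```python
from concurrent.futures import ThreadPoolExecutor


def run_side_by_side(generate, jobs):
    """Submit one prompt per model at the same time and collect the replies.

    `generate(model, prompt)` is a caller-supplied function (hypothetical here)
    that sends the prompt to a model and returns its text.
    """
    with ThreadPoolExecutor(max_workers=len(jobs)) as pool:
        futures = {model: pool.submit(generate, model, prompt)
                   for model, prompt in jobs}
        return {model: future.result() for model, future in futures.items()}


def fake_generate(model, prompt):
    # Stand-in backend so the sketch runs without a local model server.
    return f"[{model}] {prompt}"


results = run_side_by_side(
    fake_generate,
    [
        ("mistral", "Write a thousand-word essay about AI."),
        ("llama2", "Write a thousand-word essay about AI."),
    ],
)
print(results["mistral"])
print(results["llama2"])
```

Both requests are issued together; any ordering between them is decided by the server, not by this client code.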

And you're probably asking yourself, okay, that's really cool, but when would this be useful?

おそらく、これは本当にクールだが、どんなときに役に立つのだろうかと自問していることだろう。

Well, I can think of two use cases.

まあ、2つのユースケースが考えられる。

One, just being able to have the right model for the right task is incredible.

ひとつは、適切なタスクに適切なモデルを用意できることだ。

This allows us to have a centralized model that can almost act as a dispatch model, dispatching different tasks to the models that are most appropriate for that task.

これによって、ほぼディスパッチ・モデルとして機能する一元化されたモデルを持つことができ、さまざまなタスクをそのタスクに最も適したモデルにディスパッチすることができる。
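As a rough illustration of the dispatch idea, here is a small sketch in which a plain lookup table stands in for the dispatch model. The task labels, the `ROUTES` table, and the choice of model names are all assumptions made for the example; the video describes a model doing the dispatching, which this does not implement.

```python
# Hypothetical mapping from a task label to the model best suited to it.
ROUTES = {
    "code": "codellama",
    "chat": "mistral",
}
DEFAULT_MODEL = "llama2"  # assumed fallback when no route matches


def pick_model(task_type):
    """Return the model name to use for a given kind of task."""
    return ROUTES.get(task_type, DEFAULT_MODEL)


for task in ("code", "chat", "summarize"):
    print(task, "->", pick_model(task))
```

Since the models run one at a time and swap quickly, the dispatcher can simply call each chosen model in turn.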

And what does that remind us of?

それで思い出すのは何だろう？

AutoGen.

AutoGenだ。

We can have a bunch of different models running with AutoGen, all running on the same computer, powered with Ollama.

Ollamaを使えば、すべて同じコンピューター上で、AutoGenとともにさまざまなモデルを動かすことができる。

And since AutoGen runs sequentially, it is actually a perfect fit for that kind of work.

AutoGenは順番に処理を進めるので、この種の作業にはまさにぴったりなんだ。

And there we go, there's two of them.

そして、これで2つあります。

So now that you can see that you can have as many open as you want, I'm going to close LLaMA 2.

好きなだけ開くことができることがお分かりいただけたと思うので、LLaMA 2を閉じようと思う。

And let's say we want to adjust the prompt of the system message.

システムメッセージのプロンプトを調整したいとしよう。

We can easily do that.

それは簡単にできます。

Let me show you how to do that now.

その方法をお見せしましょう。

So, I switched over to Visual Studio Code.

では、Visual Studio Codeに切り替えてみましょう。

And what we're going to need to do is create what's called a Modelfile.

そして、ここで必要なのは、Modelfileと呼ばれるものを作成することです。

And so, to start the Modelfile, we write FROM and then llama2.

Modelfileの先頭には、FROMに続けてllama2と書きます。

And we're going to change that to mistral because that's the model we're using right now.

そしてこれをmistralに変更します。これが今使っているモデルだからです。

I click save and it recognizes this as Python, which is why you're seeing all those underlines.

保存をクリックするとPythonとして認識されるので、あちこちに下線が表示されています。

But it's not Python.

でもこれはPythonではありません。

And I'm going to leave it as plain text for now.

今はプレーンテキストのままにしておきます。

And then, we can set the temperature right here.

そして、ここで温度を設定できます。

So, let's set the temperature to 0.5.

温度を0.5に設定しよう。

And then, we can set the system prompt.

そしてシステム・プロンプトを設定する。

The one in the example is "You are Mario from Super Mario Brothers.

例のプロンプトは「あなたはスーパーマリオブラザーズのマリオです。

Answer as Mario, the assistant only."

アシスタントのマリオとしてのみ答えてください」です。
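Putting the three pieces together (base model, temperature, system prompt), the Modelfile might look like the text below. This is a sketch reconstructed from the narration, not a copy of what is on screen; the helper `write_modelfile` is a hypothetical convenience that just writes that text to disk from Python.

```python
from pathlib import Path

# Modelfile contents as described in the walkthrough: base model,
# a temperature setting, and the Mario system prompt.
MODELFILE = '''FROM mistral
PARAMETER temperature 0.5
SYSTEM """
You are Mario from Super Mario Brothers. Answer as Mario, the assistant only.
"""
'''


def write_modelfile(directory="."):
    """Write the Modelfile into `directory` and return its path."""
    path = Path(directory) / "Modelfile"
    path.write_text(MODELFILE, encoding="utf-8")
    return path
```

Calling `write_modelfile()` creates a file named `Modelfile` in the current directory; typing the same three directives into an editor by hand works equally well.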

So, let's do that.

では、そうしてみましょう。

Let's see if it works.

うまくいくか見てみましょう。

Now that we have this Modelfile, okay, we're back in our terminal.

Modelfileができたので、ターミナルに戻ります。

And now, we have to create the model from that Modelfile.

そして、そのModelfileからモデルを作成する必要があります。

What this is doing is creating a model, a profile of a model, using that Modelfile.

これは、そのModelfileを使ってモデル（モデルのプロファイル）を作成しているのです。

So, it says ollama create mario -f and then we point to the Modelfile and then hit enter.

ollama create mario -fと入力し、続けてModelfileを指定してエンターキーを押します。

And there we go, parsing model file, looking for the model.

モデルファイルを解析し、モデルを探します。

So, it did everything correctly.

すべて正しく行われました。

Then we do ollama run mario and hit enter.

次にollama run marioを実行し、エンターキーを押す。
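For anyone scripting these two terminal steps instead of typing them, the sketch below builds the same commands from Python. The helper names are hypothetical, and the model name `mario` and the file name `Modelfile` are taken from the walkthrough; `create_model` only works when the `ollama` CLI is installed and on the PATH.

```python
import shutil
import subprocess


def create_command(name, modelfile="Modelfile"):
    """Command that builds a custom model from a Modelfile."""
    return ["ollama", "create", name, "-f", modelfile]


def run_command(name):
    """Command that starts an interactive session with a model."""
    return ["ollama", "run", name]


def create_model(name, modelfile="Modelfile"):
    """Run the create step, failing early if the CLI is missing."""
    if shutil.which("ollama") is None:
        raise RuntimeError("ollama CLI not found on PATH")
    subprocess.run(create_command(name, modelfile), check=True)


print(" ".join(create_command("mario")))
print(" ".join(run_command("mario")))
```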

And there it is, up and running.

そうすると、マリオが起動する。

Who are you?

あなたは誰ですか？

I am Mario, the assistant.

アシスタントのマリオです。

It's great to meet you.

お会いできてうれしいです。

How can I help you today?

今日はどのようなご用件でしょうか？

Tell me about where you live.

お住まいを教えてください。

Okay, so now it's going to answer as Mario, and that's it.

では、今度はMarioとして回答するようになります、それだけです。

And we can give it complex system prompts if we want, and we can do all the other customizations that we want to do in that model file.

そして、必要であれば、複雑なシステムプロンプトを与えることもできますし、そのモデルファイルでやりたい他のすべてのカスタマイズを行うことができます。

And another nice thing is Ollama has a ton of Integrations.

そしてもうひとつ、Ollamaにはたくさんの統合機能がある。

So, here's web and desktop Integrations.

ウェブとデスクトップのインテグレーションです。

We have an HTML UI, a chatbot UI.

HTMLのUIやチャットボットのUIもあります。

We have all these different UIs.

これらすべての異なるUIがあります。

We have terminal Integrations.

ターミナル・インテグレーションもあります。

We have different libraries, including LangChain and LlamaIndex.

LangChainやLlamaIndexなどのライブラリもあります。

And then we have a bunch of extensions and plugins.

拡張機能やプラグインもたくさんあります。

So, we can use like the Discord AI bot, for example.

例えば、DiscordのAIボットを使うことができます。

All of these are really, really easy to use.

これらはすべて本当に、本当に使いやすいです。

But I think I want to do that all myself.

でも、僕は全部自分でやりたいんだ。

Let's build on top of Ollama now.

Ollamaの上に構築していきましょう。

So, the first thing I'm going to do is create a new folder for this project.

まず最初に、このプロジェクト用に新しいフォルダを作成します。

So, let's right-click, create a new folder, and I'm going to call it open chat because we're making a ChatGPT clone that's using open source models.

右クリックして新しいフォルダを作り、オープンソースのモデルを使ったChatGPTクローンを作るので、open chatと呼ぶことにします。

Next, I opened up Visual Studio Code, opening the Open chat folder.

次に、Visual Studio Codeを開いて、Open chatフォルダを開きます。

So, there's nothing in it yet, but we're going to put something in it.

まだ何も入っていませんが、これから何かを入れます。

So, we're going to create a new Python file.

新しいPythonファイルを作成します。

We'll save it.

保存します。

We'll call it main.py, in open chat.

open chatフォルダに、main.pyという名前で保存します。

Okay, so let's do something really, really simple first.

さて、まずは本当に簡単なことをやってみましょう。

We're just going to generate a completion, which means get a response.

まずは補完（completion）を生成するだけ、つまり応答を1つ得るだけです。

And since we're doing this in Python, we're going to need two things.

Pythonでやっているので、2つのものが必要です。

We're going to need to import requests and import json.

import requestsとimport jsonだ。

These two libraries, alright.

この2つのライブラリだ。

And then we have the URL, and it's localhost because this is all running on my local computer, and we're going to use port 11434.

次にURLです。すべて私のローカルコンピューター上で動いているのでlocalhostになり、ポートは11434を使います。

We're going to hit the API and the generate endpoint.

APIのgenerateエンドポイントにアクセスします。

We have our headers right here and then our data.

ここにヘッダーがあり、次にデータがあります。

We're not going to use LLaMA 2, we're actually going to be using Mistral 7B and I think that's the right syntax.

LLaMA 2は使用しません、実際にはMistral 7Bを使用する予定です、そしてそれが正しい構文だと思います。

We'll try it.

試してみよう。

And then the prompt will be why is the sky blue just as a test.

そして、プロンプトはテストとして「なぜ空は青いのか」となります。

And then we're going to ask requests to do a POST to the URL with the headers and the data.

そして、requestsを使って、ヘッダーとデータを付けてURLにPOSTします。

We're going to collect the response.

応答を収集する予定です。

If we get a 200, we will print it.

もし200が返ってきたら、それを表示します。

Otherwise, we're going to print the error.

そうでなければ、エラーを表示します。
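The request described so far can be sketched as follows. Note one deliberate swap: the video uses the third-party requests library, while this sketch uses the standard library's urllib so it runs without extra installs. The helper names `build_request` and `send` are hypothetical, and the model name `mistral` is an assumption for the example.

```python
import json
import urllib.error
import urllib.request

URL = "http://localhost:11434/api/generate"
HEADERS = {"Content-Type": "application/json"}


def build_request(prompt, model="mistral"):
    """Build the POST request carrying the model name and the prompt."""
    data = {"model": model, "prompt": prompt}
    body = json.dumps(data).encode("utf-8")
    return urllib.request.Request(URL, data=body, headers=HEADERS, method="POST")


def send(prompt, model="mistral"):
    """Send the request; return (status code, raw response text)."""
    try:
        with urllib.request.urlopen(build_request(prompt, model)) as response:
            return response.status, response.read().decode("utf-8")
    except urllib.error.HTTPError as error:
        # Non-2xx answers (for example an unknown model name) land here.
        return error.code, error.read().decode("utf-8")
```

With the Ollama app running, `status, text = send("why is the sky blue")` followed by printing `text` when `status == 200`, and the error otherwise, mirrors the flow in the narration. If the server is not running at all, `urlopen` raises `URLError`, which this sketch leaves unhandled.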

Let's see if this works.

うまくいくか見てみましょう。

I'll save and I'll click play and let's run it.

保存して、再生をクリックして、実行してみよう。

Alright, Mistral 7B not found.

おっと、Mistral 7Bが見つからないとのことです。

So I think maybe if I just delete that part and try again.

この部分を削除してもう一度やってみよう。

Let's see.

見てみよう。

Okay, interesting.

よし、面白い。

So it looks like it streamed the response because we got a ton of little pieces of it.

応答がストリーミングされたようですね、なぜなら小さなピースがたくさん得られたからです。

Let's see how we can put that all together now.

では、それをどうまとめるかやってみよう。

Okay, looking at the documentation, it says right here a stream of JSON objects is returned.

ドキュメントを見ると、ここにJSONオブジェクトのストリームが返されると書いてある。

Okay, so then the final response in the Stream also includes additional data about the generation.

では、ストリームの最終的なレスポンスには、生成に関する追加データも含まれます。
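One way to stitch a streamed reply back together is to parse each line as its own JSON object and join the text pieces. This is a sketch under the assumption that each object carries a `response` text fragment and a `done` flag, which is how I understand the Ollama API documentation; the sample lines are made up for illustration and `join_stream` is a hypothetical helper.

```python
import json


def join_stream(lines):
    """Concatenate the text fragments from a stream of JSON lines."""
    parts = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines between objects
        chunk = json.loads(line)
        parts.append(chunk.get("response", ""))
        if chunk.get("done"):
            break  # the final object marks the end of the generation
    return "".join(parts)


# Made-up sample in the assumed shape of the stream.
sample = [
    '{"response": "The sky", "done": false}',
    '{"response": " is blue.", "done": false}',
    '{"response": "", "done": true}',
]
print(join_stream(sample))
```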

Okay, so we get a bunch of information and if we don't want to stream it, we actually just turn stream false.

では、たくさんの情報が得られますが、ストリーミングしない場合は、単にstream falseにします。

So let's do that right here.

では、ここでそれをやってみよう。

I'm going to add stream false and then let's try it again.

stream falseを追加して、もう一度やってみよう。

Let's see what happens.

どうなるか見てみよう。

Oh, false is not a string.

falseは文字列ではありません。

Okay, fixed it and let's run it again.

オーケー、修正してもう一度実行してみよう。

It looks like false needs to be capitalized.

Falseのように、先頭を大文字にする必要があるようだ。
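The two errors above come from the difference between Python and JSON spellings. A short sketch of the point: in the Python dictionary the value is the boolean `False` (capitalized, no quotes), and `json.dumps` turns it into the lowercase `false` that JSON expects. The payload values here are assumptions for the example.

```python
import json

# Python boolean False, not the string "false" and not lowercase false.
payload = {"model": "mistral", "prompt": "why is the sky blue", "stream": False}

encoded = json.dumps(payload)
print(encoded)
```

The printed text ends with `"stream": false}`, which is the form the server receives.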

Okay, push play and it looks like it worked that time.

再生ボタンを押して、今回は動作したようです。

Here we go.

さあ、始めよう。

The sky appears blue because of a phenomenon called Rayleigh scattering.

空が青く見えるのは、レイリー散乱と呼ばれる現象のためです。

This occurs when, okay, there we go.

これが起こるのは……はい、こんな具合です。

We got it absolutely perfect.

完璧にできました。

So I don't really want all of this additional information.

というわけで、私はこのような付加的な情報はあまり欲しくありません。

What I really want is just the answer.

私が本当に欲しいのは答えだけなのです。

So now let's make that adjustment.

では、調整してみましょう。

Okay, so I made a few changes here.

さて、ここでいくつか変更を加えました。

First, we get the response text.

まず、レスポンス・テキストを取得します。

Then we load the JSON.

次にJSONを読み込みます。

Then we parse the JSON right here.

次にJSONをパースします。

Then we get the actual response, the response from the model, from this JSON.

そして、このJSONから実際のレスポンス（モデルからのレスポンス）を取得します。

Then we print it.

そしてそれを表示します。
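The extraction steps just listed might look like this. It is a sketch, with `extract_answer` as a hypothetical helper and a made-up sample body; it assumes the non-streaming reply is a single JSON object whose `response` field holds the model's text.

```python
import json


def extract_answer(response_text):
    """Parse the raw response body and return only the model's answer."""
    data = json.loads(response_text)
    return data["response"]


# Made-up response body in the assumed shape.
sample_text = '{"model": "mistral", "response": "Because of Rayleigh scattering.", "done": true}'
print(extract_answer(sample_text))
```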

Let's try one more time.

もう一回やってみましょう。

There it is, perfect.

それです、完璧です。

Now we have the response.

これでレスポンスが得られました。

Okay, now that we got the basics working, let's add a Gradio front end so we can actually use it in the browser.

さて、基本的なことはできたので、実際にブラウザで使えるようにGradioのフロントエンドを追加してみましょう。

And then, we're going to make sure that the user can go back and forth and actually have a conversation.

そして、ユーザーが行き来して実際に会話をすることができるようにします。

All right, funny enough, I'm actually going to use the Mistral model to help me write this code.

面白いことに、このコードを書くのにMistralモデルに手伝ってもらいます。

So that's what I've done.

これが私のやったことだ。

I basically pasted in the code that I had and said, Let's add gradio and then let's also allow for a back and forth between the user and the model.

基本的には、持っていたコードを貼り付けて、Gradioを追加し、ユーザーとモデルの間でのやり取りも許可しましょう、と言った感じです。

So it generated this generate_response method.

すると、このgenerate_responseメソッドが生成されました。

Okay, so I moved a bunch of stuff into this generate_response method, including this data object, and then the response comes through here.

このgenerate_responseメソッドに、このdataオブジェクトを含むいろいろなものを移動させました。レスポンスはここで返ってきます。

So everything is going to run through this generate_response method from now on.

ですから、今後はすべてこのgenerate_responseメソッドを通して実行することになります。

Then we're going to actually open up Gradio.

それから、実際にGradioを開きます。

So we have gr.Interface, and we're going to have this function, generate_response.

gr.Interfaceを用意し、関数としてこのgenerate_responseを渡します。

The input is going to be the prompt that somebody enters, and then the output will be the function response.

入力は誰かが入力するプロンプトであり、出力は関数の応答です。
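A minimal sketch of the Gradio wiring described here, not the code shown in the video: the function body is a placeholder that echoes the prompt instead of calling the model, and `build_interface` is a hypothetical helper. Gradio is a third-party package, so it is imported inside the helper to let the rest of the file load without it.

```python
def generate_response(prompt):
    """Placeholder backend: a real version would call the local model."""
    return f"(model reply to: {prompt})"


def build_interface():
    """Create the web UI: one text box in, the function's text out."""
    import gradio as gr  # third-party; imported lazily on purpose

    return gr.Interface(
        fn=generate_response,
        inputs=gr.Textbox(lines=2, placeholder="Enter your prompt here..."),
        outputs="text",
    )
```

Calling `build_interface().launch()` starts a local web server and prints the URL to open in the browser.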

Let's run it.

実行してみよう。

Let's see what happens.

何が起こるか見てみよう。

Then we launch it.

そして起動します。

All right, here we go.

さあ、始めましょう。

There's the local URL with it running.

ローカルのURLで実行されています。

Let's click on it.

クリックしてみましょう。

We're going to open it up, and here it is.

それを開いて、ここにあります。

We have a working gradio interface.

gradioのインターフェイスができました。

Let's make sure it works.

動作することを確認しましょう。

Now, tell me a joke.

では、ジョークを言ってください。

There it is.

それです、ここにあります。

Why was the math book sad?

なぜ数学の本は悲しかったのか？

Because it had too many problems.

問題が多すぎたからです。

In just a few minutes, we were able to build our own ChatGPT powered by Mistral.

ほんの数分で、私たちはMistralを使った独自のChatGPTを作ることができた。

This is absolutely incredible.

これは本当に信じられない。

But let's not stop there.

しかし、そこで立ち止まってはいけない。

Let's take it a little bit further because I don't think it has any memory of the previous conversations that we've had.

というのも、Mistralは私たちが以前に交わした会話の記憶を持っていないと思うからです。

So let's say, Tell me another one.

では、もう1つ言ってみよう。

Let's see if it actually works here.

実際にここで動作するか見てみましょう。

So it's giving me something completely different now.

そうすると、今度はまったく違うものが返ってくる。

Let's make sure it has the history of the previous messages, as many as it can fit in there.

前のメッセージの履歴を、そこに入るだけの数だけ持っていることを確認しよう。

Okay, so to do that, we're going to store the conversation history, and we're going to try to store as much as we can and fit it into the model.

オーケー、ではそのために会話の履歴を保存することにしましょう。できる限り多くの履歴を保存し、モデルに収まるようにしましょう。

And I'm sure there's better ways to do this, but we're just going to keep it simple and just assume we can store as much of the memory as we want.

もっといい方法があると思うが、ここではシンプルに、好きなだけメモリーを保存できることにしておこう。

Obviously, it's going to get cut off when we hit that token limit.

もちろん、トークンの上限に達したらカットされる。
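The video leaves the history unbounded. If the token limit becomes a problem, one simple option is to keep only the most recent messages. The sketch below is an assumption-laden illustration, not something from the video: `trim_history` is a hypothetical helper, and it uses a word count as a rough stand-in for a real token count.

```python
def trim_history(history, max_words=1000):
    """Keep the newest messages whose combined word count fits the budget."""
    kept = []
    total = 0
    # Walk from newest to oldest and stop once the budget would be exceeded.
    for message in reversed(history):
        words = len(message.split())
        if total + words > max_words:
            break
        kept.append(message)
        total += words
    return list(reversed(kept))  # restore chronological order


print(trim_history(["a b c", "d e", "f"], max_words=3))
```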

So let's add conversation history right here as an array.

そこで、会話履歴を配列としてここに追加しよう。

And then, the first thing we're going to do when we go to generate_response is append to the conversation history.

そして、generate_responseに入ったら最初にすることは、会話履歴への追加です。

So, conversation history append.

会話履歴を追加します。

And then, we're going to add the prompt.

そしてプロンプトを追加します。

Then, the next thing we're going to do is add a new line, and we're going to join by this new line the conversation history.

そして次にすることは、新しい行を追加して、この新しい行で会話履歴を結合することです。

And then, we're going to add it to full prompt.

そして、それをフルプロンプトに追加します。

So, it basically takes the entire conversation history and puts it in this full prompt.

つまり、基本的に会話履歴全体をこのフル・プロンプトに入れます。

We're going to pass in the full prompt now, just like that.

このように、フルプロンプトを入力します。

And then, the last thing we need to do is when we get the full response, we want to add that to the history.

そして、最後に必要なのは、完全な応答が返ってきたら、それを履歴に追加することです。

So, down here, when we get the response right before we return it, we're going to add conversation history append.

そこで、この下のほうで、応答を受け取って返す直前に、conversation historyへのappendを追加します。

And then, the actual response.

そして実際のレスポンス。
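The history steps above can be sketched as follows. The names `conversation_history`, `full_prompt`, and `generate_response` follow the narration; `ask_model` and `echo_model` are hypothetical additions so the example runs without a model: the stand-in only reports how many lines of history it received.

```python
conversation_history = []


def generate_response(prompt, ask_model):
    """Send the whole conversation so far and remember the reply."""
    conversation_history.append(prompt)
    # Join every stored message with newlines to form the full prompt.
    full_prompt = "\n".join(conversation_history)
    answer = ask_model(full_prompt)
    # Store the reply too, right before returning it.
    conversation_history.append(answer)
    return answer


def echo_model(full_prompt):
    # Stand-in for the real model call.
    return f"seen {len(full_prompt.splitlines())} line(s)"


print(generate_response("Tell me a joke.", echo_model))
print(generate_response("Another one.", echo_model))
```

On the second call the stand-in receives three lines (the first prompt, the first reply, and the new prompt), which is what lets a real model understand a follow-up like "Another one."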

And then, I'm going to save.

そして、保存します。

So, let's quit out of gradio, clear, and then hit play.

gradioを終了して、クリアして、再生ボタンを押してください。

There we are.

そうです。

Let's open it up.

さあ、開きましょう。

All right, now tell me a joke.

では、ジョークを言ってくれ。

Why don't scientists trust atoms?

なぜ科学者は原子を信用しないのか？

Because they make up everything.

すべてを作り上げている（でっち上げている）からだ。

Very funny.

面白い。

Another one.

もう1つ。

And let's see if it knows what I'm talking about.

そして、私が何を言っているのかを知っているかどうか見てみましょう。

Now, what do you get when you mix hot water with salt?

さて、お湯に塩を混ぜると何ができる？

A boiling solution.

沸騰した溶液だ。

There it is.

それです。

Now it has the history of the previous messages, powered by open source model, completely written from scratch by myself or yourself.

今までのメッセージの履歴があり、オープンソースのモデルによって動かされ、私自身かあなた自身によってゼロから書かれたものだ。

So now you know how to build with Ollama.

これでOllamaを使った作り方はわかっただろう。

If you want me to do an even deeper dive and continue to build something more sophisticated, let me know in the comments.

もっと深く掘り下げて、もっと洗練されたものを作り続けてほしいなら、コメントで教えてほしい。

If you liked this video, please consider giving a like and subscribe, and I'll see you in the next one.

この動画が気に入ったら、いいねと登録を考えていただけると嬉しいです。次の動画でお会いしましょう。
