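The Falcon model, an open-source alternative, is introduced and compared with OpenAI's text-davinci-003 model on a text summarization task. In the demonstration, OpenAI's model performs better in this experiment. The video emphasizes the importance of understanding and comparing models based on your specific use case.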

So, if you are serious about learning how to work with large language models, then this is an essential skill.


You must be able to work with different models.


If you are into creating applications with large language models, then you know that the current gold standard is to use the models from OpenAI and then pair them with a framework like LangChain to hook it all up and create your applications.


And that is something I've been covering on my YouTube channel.


Lots of examples there.


But there are some downsides to using just the models from OpenAI.


First of all, the API costs money.


That's one.


And the second one is that you might not want to share all data that you want to send to these large language models with OpenAI.


So these could be sensitive private information or company information that you want to keep secret.


So I've also been getting tons of questions on how to counter this, how to go about that.


So in this video, I'm going to show you how you can work with open-source large language models as well.


And in particular, the Falcon model, which has been getting a lot of attention and hype lately because it's been outperforming all the other open-source models that are currently out there if we look at the leaderboards from Hugging Face.


So, I've prepared an example for you which we will run through together.


So, there are some instructions and there is some code that you can access, but what we are going to do is we are going to test the Falcon, and then in particular, the 7 billion parameter one.


So that's not the full-size one with the 40 billion parameters, because that one takes a really long time to load and forever to answer questions.


And I found that the 7 billion one, and particularly the Instruct version, is something you can work really well with using the code that I will provide to you.


So we are going to do some experiments where we compare it to the text-davinci-003 model from OpenAI and then put it to the test in a text summarization task.


So this is going to be pretty interesting.


You'll learn how to use open-source large language models from Hugging Face, then you'll learn how to set them up and compare them to OpenAI models.


And as a bonus of this video, you'll also learn how to summarize large text using LangChain and the summarization methods that are in there.


So this is going to be a very important video, because as we progress in the era of AI, more open-source models will come out.


The gap between where OpenAI currently is with their models and open-source models will likely get smaller and smaller and smaller until, potentially, and many believe this will be the case, open-source models will be better than what OpenAI is currently offering.


So if you're serious about learning how to work with large language models, then I would say this is an essential skill.


That means knowing how to work with various models: paid models, different APIs, but also the open-source models that are, for example, available on Hugging Face. And then setting up quick little experiments to compare and validate the models and their results, in order to pick the model that is best suited to your specific use case.


So, that is what we will do in this video.


Let's get into it.


So, in order to follow along, you first of all need access to this repository over here (the link will be in the description), and we will be working from the models directory in here where the example is, and also the Python file that we will be using.


Now, in order to follow along, you need a Python installation and some basic understanding about LangChain.


If you don't have that already, I would recommend you watch my previous video on working with LangChain.


That will set you up and explain how everything works.


But if you already understand this, then we can dive straight into it.


What you can do is, first of all, clone this repository.


Then, more specifically, go into the models folder within your favorite IDE (so your IDE of choice, for me that is VSCode). I've opened up the file over here, and we're in falconmodel.py.


So if I load up a Python interactive session over here (which is something I've also been getting tons of questions about; a link on how to set that up in VSCode will also be in the description), we can do some basic imports.


And then, to configure everything, the only thing you need if you want to run the open-source models from Hugging Face is a Hugging Face API token.


So if we go to huggingface.co, then log in, create an account (it's free if you don't have one already), and you go to the settings and then the access tokens, you can create a new API key over here.


So, as you can see, I've got one for LangChain already with read access, but you can create a new one, say what it's for, and then pick either read or write (read would be sufficient), generate the token, show it, and then copy it.


And then, the next step that you have to do is set that up in an environment file.


So, within the GitHub repository, there is an example already (it's an empty one), but what you can do is change this file name into .env like you see over here, and then just replace the Hugging Face API token placeholder with the token you've just generated, and then you're good to go.

What this basically allows us to do is use the dotenv library to load the key from our environment file into our variable over here, and that will allow us to communicate with Hugging Face.

Alright, so there will also be a requirements.txt for this project to make sure you have all the dependencies and pip libraries installed, but it's python-dotenv, LangChain, and I think that's it, and then you're good to go.


Alright, so with those instructions out of the way, we can now get started with the models from Hugging Face.


So first, what we do is call load_dotenv(find_dotenv()) to load the environment variable that we just set, and then make sure that the token is accessible over here.
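To make that step concrete, here is a minimal sketch of loading the token. The variable name `HUGGINGFACEHUB_API_TOKEN` is the one LangChain's Hugging Face Hub wrapper reads from the environment; whether the repository's example file uses exactly that name is an assumption, and `get_hf_token` is just a hypothetical helper for illustration.

```python
import os

try:
    # python-dotenv provides load_dotenv / find_dotenv.
    from dotenv import find_dotenv, load_dotenv
except ImportError:
    # Without python-dotenv we rely on variables already set in the shell.
    find_dotenv = load_dotenv = None


def get_hf_token(var_name: str = "HUGGINGFACEHUB_API_TOKEN") -> str:
    """Load .env (when python-dotenv is installed) and return the token."""
    if load_dotenv is not None:
        # find_dotenv() searches the current and parent directories for .env
        load_dotenv(find_dotenv())
    token = os.environ.get(var_name)
    if not token:
        raise RuntimeError(f"{var_name} is not set; add it to your .env file")
    return token
```

If the call raises, the .env file is probably missing, misnamed, or uses a different variable name.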


And then we can continue to the actual interesting part.


And that is using the Hugging Face hub from LangChain.


So, you can see we import that over here, and then what we can do is I'll show you how to interact with these models.


So it starts off with a repository ID.


And if we come back to Hugging Face, you can see that each model basically that we're looking at right now has a repo, and you can basically copy that, and then we can use that to interact with the model.


Now, like I've said, the 40 billion one, I've tried it, but it takes forever to load.


I don't know if it's even possible to do it through the API, or that you have to set it up on some kind of heavy server.


But the 7 billion one, and like I said, the Instruct version works really well.


So we are going to copy that.


It's already in there, but you can basically put that in here.


But you can also swap that around with some of the other models that are available on Hugging Face or in the leaderboards, so that's really an interesting part.


So that's how you can swap them out, and then basically we are going to create our large language model through a LangChain object, basically like we normally do, and we can give some model parameters in there as well.


So through this, we basically specify the repository ID, and for now, we set a pretty low temperature, and we set the max new tokens to 500, meaning that the response that we get from the model will have a maximum of 500 tokens.
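As a rough sketch of that setup, assuming the LangChain release from around the time of this video (where `HuggingFaceHub` can be imported from the top-level `langchain` package; newer releases have moved it): the repo ID below is the Falcon 7B Instruct repository on Hugging Face, the temperature of 0.1 and the 500 new tokens are the values used in this walkthrough, and the helper names are hypothetical.

```python
FALCON_REPO_ID = "tiiuae/falcon-7b-instruct"  # copied from the model page


def falcon_settings(temperature: float = 0.1, max_new_tokens: int = 500) -> dict:
    """Collect the repo ID and model parameters in one place."""
    return {
        "repo_id": FALCON_REPO_ID,
        "model_kwargs": {
            "temperature": temperature,
            "max_new_tokens": max_new_tokens,
        },
    }


def build_falcon_llm(token: str):
    """Create the LangChain LLM object (needs LangChain and network access)."""
    # Imported lazily so the settings above can be used without LangChain.
    from langchain import HuggingFaceHub

    return HuggingFaceHub(huggingfacehub_api_token=token, **falcon_settings())
```

Swapping in another model from the leaderboard would then only mean changing `FALCON_REPO_ID`.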


So let's run that and store that within our interactive session.


You can refer to the Hugging Face Hub page in the LangChain documentation to get a bit of an understanding of how this works, but it's basically the same example that I've just walked you through.

But this is where the original documentation is from, and here you can also see that they're using a Google model in this example, so that is how that would work.


So the next step over here is to create a prompt template and a large language model chain, and we're just following along with the sample from LangChain.


The template basically says, "Here's a question," then comes the actual prompt that we fill in later, and then it says, "Answer: let's think step by step."


So we create a prompt template and then a large language model chain.
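Here is a small sketch of that template and chain. The template wording follows the LangChain sample as I remember it, so treat the exact text as an assumption; `render_prompt` and `build_chain` are hypothetical helpers, and the imports again assume the LangChain version from the time of the video.

```python
TEMPLATE = """Question: {question}

Answer: Let's think step by step."""


def render_prompt(question: str) -> str:
    """Show what the model will actually receive for a given question."""
    return TEMPLATE.format(question=question)


def build_chain(llm):
    """Wrap the template and the model in an LLMChain (needs LangChain)."""
    from langchain import LLMChain, PromptTemplate

    prompt = PromptTemplate(template=TEMPLATE, input_variables=["question"])
    return LLMChain(prompt=prompt, llm=llm)


# Rendering the prompt locally is a cheap way to check the template.
print(render_prompt("How do I make a sandwich?"))
```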


Again, if all of these concepts are new to you and you don't understand them, I refer back to my previous video on the basics of working with LangChain, and then it will make sense for you.


I'm assuming you already understand that, and then we're going to run this chain.


Let me first quickly complete or store this code, and then we have a question.


So we start off with a very interesting, very hard question: How do I make a sandwich?


And we are going to check, like, okay, how is the Falcon 7 billion model going to respond to this?


So we have the question.


We basically first get the response, goes pretty quickly, and then we can wrap that.


That is basically just to make it nice and pretty when we print it, and there we go, we have our first response.
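The wrapping step only needs the standard library. A minimal sketch, assuming the response is a plain string (for example what `chain.run(question)` returns in the LangChain version used here); `pretty` is a hypothetical helper name and the width of 100 is an arbitrary choice.

```python
import textwrap


def pretty(text: str, width: int = 100) -> str:
    """Re-flow a long model response so no line is wider than `width`."""
    return textwrap.fill(text.strip(), width=width)


# Example with a stand-in response instead of a real model call.
print(pretty("You will need bread, meat, cheese, condiments, and toppings. " * 3, width=60))
```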


So you need to gather the ingredients.


You'll need bread, meat, cheese, condiments, and toppings.


Once you have all your ingredients, you need to take a slice of bread.


Okay, sounds like a pretty solid instruction on how to create a sandwich.


This is pretty cool, right?


Because this is a completely free model, like no credit card required, no data sharing with OpenAI.


Not so interesting question, but hey, we're just getting started.


So we already have it up and running with just these few lines of code, and to me, that is still like so amazing.


How these open source models, if they are made available and if they work really well, you can just access them with a couple lines of code, and you can basically create cool applications with them, completely for free.


I'm really excited to see how these models will develop over time, and when we reach the crossover point where these open-source models are really better than the ones from OpenAI.


But now let's take it a step further and set up a quick little experiment that we can do.


So I'm going to download one of the transcripts from my YouTube videos, in particular the one that I did recently on Flowise.


And through the YouTube loader, also from LangChain, we are going to put in that video URL and then we're going to get the transcript.


So, this is how that works, and again, that's also covered in one of my previous videos.


But we can basically load the transcript, and here you can see we just have the whole transcript basically of that video, which is automatically generated by YouTube.


So, for every video, this is publicly available information.


You can plug in any URL, and you can get the transcription based on the YouTube algorithm.


And then we are going to split this up into different documents, basically because, as you might know, when you work with large language models and these APIs, you are limited by the amount of tokens that you can send to these APIs.


So, we're going to set up a text splitter, and basically what this will do is we'll just chunk this whole document, this whole transcript, up into various documents, and we can have a look over here.


So, there's about six documents in total, seems to be yes, so six splits basically.


So, we took the whole transcript and we chopped it up.
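To show the idea behind the chunking, here is a simplified stand-in written in plain Python. It is not LangChain's algorithm: LangChain's text splitters try to break on separators such as newlines, while this sketch just slides a fixed window over the characters. The chunk size and overlap are made-up values, not the ones from the video.

```python
def split_text(text: str, chunk_size: int = 3000, overlap: int = 300) -> list[str]:
    """Cut `text` into chunks of at most `chunk_size` characters.

    Neighbouring chunks share `overlap` characters, so a sentence cut at a
    boundary still appears in full in at least one chunk (if short enough).
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")

    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this chunk already reaches the end of the text
        start += chunk_size - overlap
    return chunks
```

The overlap is one way to soften the problem of a split landing in the middle of a talking point.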


And now, this next part is pretty interesting, and this is summarization.


So, this again is something that large language models currently struggle with due to these limitations in the amount of tokens that you can send to the API.


So, LangChain has a cool little built-in method, load_summarize_chain basically, where you can do or create this summarization iteratively.


So, you first, like, split the whole transcript and then create a little summarization for each of the splits basically that you've created.


And then you throw those onto one pile again and then go iteratively basically on that to create your final summarization.
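The map-reduce idea described above can be sketched without any library. This is only an illustration of the pattern, not LangChain's implementation: `summarize` stands for any function that sends text to a model and returns a summary, and this sketch does a single combine step, whereas a real chain may need more rounds if the combined summaries are still too long.

```python
from typing import Callable, Sequence


def map_reduce_summarize(chunks: Sequence[str], summarize: Callable[[str], str]) -> str:
    """Summarize each chunk on its own, then summarize the summaries."""
    # Map step: one independent summary per chunk.
    partial_summaries = [summarize(chunk) for chunk in chunks]
    # Reduce step: put the partial summaries on one pile and summarize again.
    return summarize("\n\n".join(partial_summaries))


def first_sentence(text: str) -> str:
    """Toy 'model' for trying the flow offline: keep the first sentence."""
    return text.split(".")[0].strip() + "."


print(map_reduce_summarize(["Intro here. More talk.", "Demo part. Details."], first_sentence))
```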


Now, that is exactly what we will be testing this Falcon 7 billion parameter model with and then compare it to OpenAI to see how it performs.


So, we've split up the documents, and now the next step is to load this chain.


And a quick little side tip that you can do if you print the prompt template and also the combine template, you can see how LangChain is handling this.


So, it basically is a prompt to say, like, Write a concise summary of the following, and then you put in the text, and then it will give you the concise summary.


And by default, as you can see, this template is basically the same for first creating the initial summarizations and then also when you combine everything and create a summarization of that; it will use the same prompt template.


But you can change that by using the parameters, as you can see it's in the comments over here, but you can just add a map prompt over here and then just change that.


So, map prompt equals and then you give your map prompt over here, and you can also change the combined prompt.


So, that is how you would do that.
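A sketch of what overriding both prompts could look like, assuming the LangChain version from the time of the video. The template wording here is my own example, not the library default; the `{text}` placeholder is the variable name the summarization chain fills in, and `build_summary_chain` is a hypothetical helper.

```python
MAP_TEMPLATE = """Write a concise summary of the following transcript excerpt:

"{text}"

CONCISE SUMMARY:"""

COMBINE_TEMPLATE = """Combine the following partial summaries into one summary of the whole video:

"{text}"

FINAL SUMMARY:"""


def build_summary_chain(llm, verbose: bool = True):
    """Create a map-reduce summarization chain with custom prompts."""
    # Imported lazily; needs LangChain installed.
    from langchain import PromptTemplate
    from langchain.chains.summarize import load_summarize_chain

    return load_summarize_chain(
        llm,
        chain_type="map_reduce",
        map_prompt=PromptTemplate(template=MAP_TEMPLATE, input_variables=["text"]),
        combine_prompt=PromptTemplate(template=COMBINE_TEMPLATE, input_variables=["text"]),
        verbose=verbose,  # print the intermediate prompts while running
    )
```

Leaving out `map_prompt` and `combine_prompt` falls back to the default template for both steps.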


But for now, we're just going to leave it at default, and we're just going to run it on the docs, and we set verbose to True, so it will also show us kind of what's going on under the hood.


So, here you can see all the chunks, basically, first write a concise summary of the following, and then this is the first introduction sentence that I basically start with, and then this is somewhere later in the video, and so on and so on.


And then it basically combines all of that, and then finally, if we look at the output summary, we can see that we have a string of text over here, and then we can just wrap that so that if we print it, we have that on one line.


And what we have is: "Flowise is a visual UI builder that allows users to build large language model apps in minutes."


That's a good one, that's basically how I start the video.


The tutorial covers setting up a free API key and cloning the Flowise repository.


I believe that's also correct, or at least partly: we use a free Pinecone API key, but we also use the OpenAI API key, which is free to set up but costs money once we interact with the API.

We do clone the Flowise repository, so that is correct, and then: "The project is then cloned in a LangChain experiment."

The project is opened in a terminal, that's also correct.


The tutorial concludes with a step-by-step guide on how to integrate ChatGPT with company data.

That's not entirely true, that's basically an example of me explaining kind of projects that I'm currently working on for my freelance work, but okay, it seems to be at least the start is quite accurate.


And remember, this is a totally free model, and also text summarization using this method is pretty hard because we are splitting up all the context basically.


And if the cut or the split is right in the middle of like a talking point, then the ending and the beginning of the other chunk doesn't really make sense.


So if you summarize it separately, it could get messy.


So it's a pretty interesting task to compare these models on.


So now let's see what we get if we use OpenAI.


So we have another text splitter over here, which is kind of redundant because we're using the same one, but now we're just going to say, "Hey, we are going to create a new large language model object," basically a new LLM, but now we use OpenAI.


So let's run this, and again, if you want to follow along, this does require you to put your OpenAI API key in the .env file as well. The key is free to set up, but you do get charged once you query the API.


So we set up the model and then we create another chain and then we plug the OpenAI LLM in here.


Everything else works the same, that's the nice thing about using LangChain.


It's unified in a way that you just specify the model and then once you have that in place, you have that object.


You can just, like, completely copy your code.


That's really awesome.
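A sketch of how the swap could look, assuming the LangChain version from the time of the video and that the API keys are already in the environment (`HUGGINGFACEHUB_API_TOKEN` and `OPENAI_API_KEY`). The OpenAI temperature and token limit below are assumptions chosen to mirror the Falcon settings, not values confirmed in the video, and `build_llm` is a hypothetical helper.

```python
LLM_SPECS = {
    "falcon": {
        "repo_id": "tiiuae/falcon-7b-instruct",
        "model_kwargs": {"temperature": 0.1, "max_new_tokens": 500},
    },
    "openai": {
        "model_name": "text-davinci-003",
        "temperature": 0.1,  # assumed, to match the Falcon run
        "max_tokens": 500,   # assumed, to match the Falcon run
    },
}


def build_llm(provider: str):
    """Return a LangChain LLM object for 'falcon' or 'openai'."""
    spec = LLM_SPECS[provider]  # raises KeyError for unknown providers
    if provider == "falcon":
        from langchain import HuggingFaceHub

        return HuggingFaceHub(**spec)
    from langchain import OpenAI

    return OpenAI(**spec)
```

Everything downstream (the splitter, the summarization chain, the wrapping) takes the returned object without caring which provider is behind it.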


So we can again create the output summary by running it, and here you can see basically the same thing.


So now the model will go to action, and we will iteratively query the OpenAI API to get the summaries of all the chunks, and then boom, it's finished.


Over here we can wrap the text again and then print it.


Okay, so what do we have over here?


So: "This article discusses Flowise AI, an open-source visual UI builder that allows users to quickly build large language model apps."

So the first sentence is kind of the same, although this one says that it's an article, but it does talk about Flowise and it recognizes that it's an open-source visual UI builder to quickly create large language model apps.


So those are the same, and then it refers back to like the article, which is the video.


But okay, "it explains how to set up Flowise, connect it to data, and build a conversational AI."


This I would say is a lot better already because this is really what we do in this video.


It also explains how to use LangChain, a comprehensive framework for developing applications powered by large language models.

So it's LangChain under the hood.

So yeah, in some way, I talk about LangChain and how to use Flowise AI to quickly prototype AI projects.

Finally, it shows how to use Flowise AI to sell AI services to clients as a freelancer.

Okay, so this is definitely the winner because really the main message of that video was how you can use Flowise to quickly prototype AI projects.


And then I conclude this video by basically saying, Hey, I work as a freelancer.


I work with clients, and this, I could see how I could use this as a tool to quickly spin up demos and prototypes and then go from there, basically.


So overall, the OpenAI model is still the clear winner.


But if we compare the workflow and what we're getting, it's getting close.


So let's see.


The temperature is set pretty low, but let's just see if we run it one more time.


Let's also see how deterministic this one is.


So I load the Falcon model again.


I will create a summary one more time.


Go over it.


It is really fast though, which is quite interesting.


It's faster than OpenAI for sure right now.


So what do we have?


Do we have kind of a similar answer?


So: Flowise, visual UI builder, yeah, seems to be the same.


Okay, so that's good because we set the temperature pretty low, so to 0.1.


So we would expect no creative styles here, basically, and just giving us the results.


Basically, it is good in that sense.
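If you want something a bit more systematic than eyeballing two runs, one option is to measure how similar repeated outputs are. This is only a sketch using the standard library: `generate` stands for any zero-argument function that calls the model and returns text, and a character-level similarity ratio is a crude measure chosen here for simplicity.

```python
from difflib import SequenceMatcher
from typing import Callable


def run_similarity(generate: Callable[[], str], runs: int = 2) -> float:
    """Call `generate` several times and return the lowest similarity
    (0.0 to 1.0) between the first output and each later output."""
    if runs < 2:
        raise ValueError("need at least two runs to compare")
    outputs = [generate() for _ in range(runs)]
    first = outputs[0]
    return min(SequenceMatcher(None, first, other).ratio() for other in outputs[1:])
```

A value of 1.0 means every run produced exactly the same text; with a low temperature you would expect values close to that.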


But I would say, at least from this little experiment, it's nowhere near the capabilities of OpenAI's models right now.


But I will definitely run more experiments with this.


And also, keep in mind that this is the 7 billion parameter model, and the one that has been getting all the hype is the 40 billion parameter one.


So if you know a way how to run this effectively on your local machine or through some server, please let me know in the comments because I do want to experiment with this one.


Alright, so you now know how to work with the open-source models from Hugging Face, like I've said, very valuable skill to have if you're serious about learning how to work with large language models.


We've looked at the Falcon 7 billion parameter model.


We set up a quick little experiment and created some text summaries, all to give you a good idea of how to go about working with different models and how you can compare them.


And it's really up to you and the use case at hand to determine what the best model is.


Is there a budget?


Can you share data with OpenAI?


How fast does the model have to be?


These are all questions that you have to ask yourself if you're working on a project with large language models.


Alright, and that's it for this video.


