
[gpt-3.5-turbo price cut and overview of the new models] English commentary [January 26, 2024 | @Wes Roth]


So, there were just a few big announcements from OpenAI about ChatGPT and the API.


They are launching a new generation of embedding models, new GPT-4 Turbo, and moderation models.


As well as new API usage management tools and soon lower pricing on GPT-3.5 Turbo.


And there are some big fixes and improvements in this one.


They're saying, We are releasing new models, reducing prices for GPT-3.5 Turbo, and introducing new ways for developers to manage API keys and understand API usage.


A lot of people have been asking for that.


The new models include two new embedding models, an updated gpt-4-turbo-preview model, an updated gpt-3.5-turbo model, and an updated text moderation model.


So, first, they have new embedding models with lower pricing.


If you're a developer or a researcher in the machine learning field, this might be a huge deal.


They're saying, We are introducing two new embedding models, a smaller and highly efficient text-embedding-3-small model, and a larger and more powerful text-embedding-3-large model.


They give a link to an embedding resource talking about the introduction of text and code embeddings.


They're saying an embedding is a sequence of numbers that represents the concepts within content, such as natural language or code.


Embeddings make it easy for machine learning models and other algorithms to understand the relationships between content and to perform tasks like clustering or retrieval.


They power applications like knowledge retrieval in both ChatGPT and the Assistants API, and many retrieval augmented generation (RAG) developer tools.


In other words, the embedding model takes text and stores that information as vectors, representing complex data like words, sentences, and documents in a more compact numerical form that captures their semantic meaning.


So, humans understand the text better, while LLMs understand the numerical form better.


Embedding models transform one into the other.
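To make the retrieval idea concrete, here is a minimal sketch. The three-dimensional vectors are made up for illustration and do not come from any real embedding model (real embeddings have far more dimensions); cosine similarity is one common way to compare such vectors, not necessarily what OpenAI uses internally.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up toy "embeddings"; the numbers are assumptions for illustration only.
toy_embeddings = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "invoice": [0.0, 0.2, 0.9],
}


def most_similar(query, embeddings):
    """Return the other key whose vector is closest to the query's vector."""
    return max(
        (key for key in embeddings if key != query),
        key=lambda key: cosine_similarity(embeddings[query], embeddings[key]),
    )


print(most_similar("cat", toy_embeddings))  # prints "kitten"
```

Texts with related meanings end up with vectors pointing in similar directions, which is why "kitten" ranks above "invoice" here.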


So, the small text embedding model, text-embedding-3-small, is our new highly efficient embedding model and provides a significant upgrade over its predecessor, which was released in December 2022.


It has stronger performance and reduced price.


The older models will be available, even though they recommend using the newer ones.


And the large model, text-embedding-3-large, is our new best performing model, and they're showcasing pretty big results.


We might do a full video diving deep into embedding models, but this will be useful for developers and researchers who are dealing with natural language and code tasks like semantic search, clustering, topic modeling, and classification.


Also, if you're doing things like sentiment analysis, machine translation, or various kinds of information retrieval, these models seem to be both better and cheaper to use for that.


Here's sort of a visualization of the embedding space.


I'll leave a link down in the show description.


Basically, the idea is to group things like sentences or different meanings of words into clusters for better understanding and retrieval.
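A rough sketch of that grouping idea, assuming we already have a vector for each sentence. The two-dimensional vectors, the cluster labels, and the fixed centers below are all invented for illustration; a real pipeline would get vectors from an embedding model and typically learn the centers with an algorithm such as k-means.

```python
import math


def nearest_center(vec, centers):
    """Return the label of the center closest to vec (Euclidean distance)."""
    return min(centers, key=lambda label: math.dist(vec, centers[label]))


# Hypothetical cluster centers and sentence vectors (made-up numbers).
centers = {"animals": (1.0, 0.0), "finance": (0.0, 1.0)}
sentences = {
    "The cat sat on the mat": (0.9, 0.2),
    "Dogs love long walks": (0.8, 0.1),
    "Interest rates rose again": (0.1, 0.9),
}

# Assign every sentence to the cluster whose center is nearest.
clusters = {}
for text, vec in sentences.items():
    clusters.setdefault(nearest_center(vec, centers), []).append(text)

print(clusters)
```

Sentences that land in the same cluster are the ones a retrieval system would treat as being about the same thing.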


But let's move on to stuff that I think is going to be more applicable to the general public.


I'll leave a quick survey down below.


I'm curious how many people use something like this, how many people are going to be using these text and code embedding models.


I'm curious.


My guess is it's going to be a few percent of people watching this.


But we also have an updated gpt-3.5-turbo model with lower pricing.


So, it looks like input prices are reduced by 50% and output prices are reduced by 25%. So, if you're doing high volume tasks with GPT-3.5 Turbo, that would be quite a substantial savings.
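To see what a 50% input cut and a 25% output cut could mean for a bill, here is a small worked example. The baseline prices and token volumes are hypothetical placeholders, not figures from the announcement; only the two percentages come from the video.

```python
def cost(input_tokens, output_tokens, input_price, output_price):
    """Cost in dollars, with prices given per 1,000 tokens."""
    return input_tokens / 1000 * input_price + output_tokens / 1000 * output_price


# Hypothetical baseline prices per 1K tokens (assumptions, not official numbers).
OLD_INPUT, OLD_OUTPUT = 0.0010, 0.0020

# Apply the reductions mentioned in the video: 50% on input, 25% on output.
NEW_INPUT = OLD_INPUT * (1 - 0.50)
NEW_OUTPUT = OLD_OUTPUT * (1 - 0.25)

# Hypothetical monthly volume: 10M input tokens and 2M output tokens.
old = cost(10_000_000, 2_000_000, OLD_INPUT, OLD_OUTPUT)
new = cost(10_000_000, 2_000_000, NEW_INPUT, NEW_OUTPUT)
savings_pct = (old - new) / old * 100

print(f"old=${old:.2f} new=${new:.2f} savings={savings_pct:.1f}%")
```

The overall saving depends on the mix: the more input-heavy the workload, the closer it gets to 50%.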


And for the updated gpt-4-turbo-preview: over 70% of requests from GPT-4 API customers have transitioned to GPT-4 Turbo since its release, as developers take advantage of its updated knowledge cutoff, larger 128k context window, and lower prices.


So, I believe the previous model is gpt-4-1106-preview.


So, today they're releasing an updated gpt-4-turbo-preview model, gpt-4-0125-preview.


So, they usually put the month and day of its release in the name of the model.
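As a quick illustration of that naming pattern, the sketch below pulls the four digits out of a model name and reads them as month and day. Treating the digits as MMDD is an assumption inferred from the two names mentioned here (gpt-4-1106-preview and gpt-4-0125-preview); the helper name is made up.

```python
import re


def release_month_day(model_name):
    """Extract (month, day) from a name like 'gpt-4-0125-preview', or None."""
    # Assumption: a four-digit group between hyphens encodes MMDD.
    match = re.search(r"-(\d{2})(\d{2})(?:-|$)", model_name)
    if match is None:
        return None
    month, day = int(match.group(1)), int(match.group(2))
    if not (1 <= month <= 12 and 1 <= day <= 31):
        return None
    return month, day


print(release_month_day("gpt-4-0125-preview"))  # prints (1, 25)
print(release_month_day("gpt-4-1106-preview"))  # prints (11, 6)
```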


This model completes tasks like code generation more thoroughly than the previous preview model and is intended to reduce cases of laziness where the model doesn't complete a task.


The new model also includes the fix for the bug impacting non-English UTF-8 generations.


Many people complained about this, and some even shared examples where the effort GPT-4 put into generating code responses seemed to change depending on what day of the week it thought it was, which seems bizarre.


But the point is, it could do a lot more.


For some reason it just didn't reach its potential, and laziness is the perfect word for it.


For those who want to be automatically upgraded to new GPT-4 Turbo preview versions, we are also introducing a new gpt-4-turbo-preview model name alias, which will always point to our latest GPT-4 Turbo preview model.


So, kind of like a redirect link.


It sounds like this will just point to whatever is the latest.
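The redirect analogy can be sketched as a simple lookup. The table below is a hypothetical stand-in written for illustration; in reality the alias is resolved on OpenAI's side and callers never see a mapping like this.

```python
# Hypothetical alias table (an assumption for illustration, not a real API).
ALIASES = {"gpt-4-turbo-preview": "gpt-4-0125-preview"}


def resolve_model(name, aliases=ALIASES):
    """Return the dated snapshot an alias points to, or the name unchanged."""
    return aliases.get(name, name)


print(resolve_model("gpt-4-turbo-preview"))  # prints "gpt-4-0125-preview"
print(resolve_model("gpt-4-1106-preview"))   # prints "gpt-4-1106-preview"
```

The trade-off is the usual one with redirects: the alias picks up new versions automatically, while a dated name keeps behavior pinned until you change it yourself.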


We plan to launch GPT-4 Turbo with vision in general availability in the coming months.


And we also have an updated moderation model.


The free moderation API allows developers to identify potentially harmful text.


As part of our ongoing safety work, we are releasing text-moderation-007, our most robust moderation model to date.


You can learn more about building safe AI systems through our safety best practices here.


And of course, new ways to understand API usage and manage API keys.


This is very useful if you have multiple people on the account or you have multiple different API keys that you're using.


Many people on Twitter, for example, have asked for a more granular look into where their API costs are going.


So, for example, here the laundry buddy key is off the charts, which I'm guessing they kind of put that in as a joke.


They're saying, We are launching two platform improvements to give developers both more visibility into their usage and control over API keys.


First, developers can now assign permissions to API keys from the API Keys page.


For example, a key could be assigned read-only access to power an internal tracking dashboard, or restricted to only access certain endpoints.
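Here is one way to picture how such permissions might be checked. Everything in this sketch is hypothetical: the ApiKey class, the rule that read-only means GET requests only, and the "/v1/usage" path are assumptions made for illustration, not a description of how OpenAI implements key permissions.

```python
from dataclasses import dataclass, field


@dataclass
class ApiKey:
    """Hypothetical model of a key's permissions (not OpenAI's real schema)."""
    name: str
    read_only: bool = False
    # An empty set means "no endpoint restriction" in this sketch.
    allowed_endpoints: set = field(default_factory=set)


def is_allowed(key, method, endpoint):
    """Decide whether a request would be permitted for this key."""
    # Assumption: a read-only key may only issue GET requests.
    if key.read_only and method != "GET":
        return False
    # A restricted key may only reach the endpoints it was granted.
    if key.allowed_endpoints and endpoint not in key.allowed_endpoints:
        return False
    return True


dashboard_key = ApiKey("internal-dashboard", read_only=True)
chat_key = ApiKey("chat-only", allowed_endpoints={"/v1/chat/completions"})

print(is_allowed(dashboard_key, "POST", "/v1/chat/completions"))  # prints False
print(is_allowed(chat_key, "POST", "/v1/chat/completions"))       # prints True
```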


So, it sounds like if I am doing an AI tutorial video and one of you sees my API key and then tries to use it somewhere, I will be protected.


Second, the usage dashboard and usage export function now expose metrics on an API key level.


After turning on tracking, this makes it simple to view usage on a per feature, team, product, or project level simply by having separate API keys for each.
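The "separate key per feature" idea boils down to grouping usage by key. The rows below are made up, and the field names api_key_name and tokens are assumptions rather than the real export format.

```python
from collections import defaultdict

# Made-up rows standing in for a usage export; field names are assumptions.
usage_rows = [
    {"api_key_name": "laundry-buddy", "tokens": 120_000},
    {"api_key_name": "internal-dashboard", "tokens": 8_000},
    {"api_key_name": "laundry-buddy", "tokens": 30_000},
]


def tokens_per_key(rows):
    """Sum token usage for each API key name."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["api_key_name"]] += row["tokens"]
    return dict(totals)


print(tokens_per_key(usage_rows))
```

With one key per feature, team, or project, the per-key totals double as per-feature, per-team, or per-project totals.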


And there will be more improvements over the next couple of months.


And here's kind of what that new screen looks like.


So, you can type in a name for your API key if you want to.


And then, you have various restriction options: which models and capabilities the key has access to, and whether it can call Assistants, Threads, Fine-tuning, Files, etc.


As well as a read-only key.


Once I get a chance to dive into all this stuff, I will give you a further update.


Thank you so much for watching.


Make sure you're subscribed.


My name is Wes Roth and I'll see you next time.

