OpenAIの最新モデル「GPT-4o mini」の考察

Masayuki Abe

2024年7月21日 11:36

OpenAIから最新モデル「GPT-4o mini」がリリースされたとのことで、早速調べてみました。

「GPT-4o mini」のリリースメールによると、GPT-3.5 Tubroと比較して優れている面があるとのことです。

知能: GPT-4o miniはGPT-3.5 Turboよりもテキストの知能（MMLUで82%のスコアを達成し、GPT-3.5 Turboの69.8%に対して）やマルチモーダル推論において優れています。

価格: GPT-4o miniはGPT-3.5 Turboよりも60%以上安価で、1M入力トークンあたり0.15ドル、1M出力トークンあたり0.60ドルです（これはおおよそ標準的な本で2500ページに相当します）。

モダリティ: GPT-4o miniは現在、テキストと視覚機能をサポートしており、将来的には音声とビデオの入力・出力もサポートする予定です。

言語: GPT-4o miniは、GPT-3.5 Turboよりも広範な非英語言語において多言語理解が向上しています。

「GPT-4o mini」は、GPT-3.5 Tubroの後継という位置付けの印象を受けました。

どこかのタイミングで、APIに関して、GPT-3.5 Turboから「GPT-4o mini」にユーザを切り替えたい意図が見えます。

「GPT-4o mini」は、API利用料やマルチモーダル機能もあり、画像、将来的には動画、音声も取り扱えるということで、モデルの最低限の機能の底上げをしています。

既に、ChatGPTのウェブ画面からは、GPT-3.5 Turboが消えています。

現在、ウェブ画面で使用できるモデルは、GPT-4o、GPT-4o mini、GPT-4の3種類となっています。

但し、APIに関しては、GPT-3.5シリーズは利用できるようです。

GPT-3.5シリーズのAPIをいきなり廃止してしまうと、API利用業者は、コード修正などが発生して、OpenAIのAPIを利用したサービスが動かなくなってしまいます。

OpenAIというプラットフォーマーの力が強すぎるというリスクを感じて、OpenAI以外のサービスを利用しようと考えてしまうのではないでしょうかということを避けるためにも、GPT-3.5シリーズはまだ使用できるようにしているのでしょう。

「GPT-4o mini」をPlaygroundで簡単に使ってみます。

https://platform.openai.com/playground/chat?models=gpt-4o-mini

次のOpenAIからのリリースメールの英訳を「GPT-4o mini」にしてもらいます。

Hi there,

Exciting news! Today we introduced GPT-4o mini—our new affordable and intelligent small model that’s significantly smarter, cheaper, and just as fast as GPT-3.5 Turbo—and launched it in the API. Here’s what you need to know:Intelligence: GPT-4o mini outperforms GPT-3.5 Turbo in textual intelligence (scoring 82% on MMLU compared to 69.8%) and multimodal reasoning.
Price: GPT-4o mini is more than 60% cheaper than GPT-3.5 Turbo, priced at $0.15 per 1M input tokens and $0.60 per 1M output tokens (roughly the equivalent of 2500 pages in a standard book).
Modalities: GPT-4o mini currently supports text and vision capabilities, and we plan to add support for audio and video inputs and outputs in the future.
Languages: GPT-4o mini has improved multilingual understanding over GPT-3.5 Turbo across a wide range of non-English languages.

With its low cost and latency, GPT-4o mini works well for high-volume tasks (e.g., passing a full code base or conversation history to the model), cost-sensitive tasks (e.g., summarizing large documents), and tasks that require fast responses (e.g., customer support chatbots). Like GPT-4o, GPT-4o mini has a 128k context window, supports up to 16k output tokens per request, and has a knowledge cut-off date of October 2023. We plan to launch fine-tuning for GPT-4o mini in the coming days.

We recommend developers using GPT-3.5 Turbo switch to GPT-4o mini to unlock higher intelligence at a lower cost. You can use GPT-4o mini in the Chat Completions API and Assistants API, or in the Batch API where you get a 50% discount on batch jobs completed asynchronously within 24 hours.

To get started, test the model in Playground and check out our API documentation. To learn how to use vision with GPT-4o mini, check out the Introduction to GPT-4o and GPT-4o mini in the cookbook. If you have questions, please reach out in the OpenAI developer forum.

Happy building!

—The OpenAI team

和訳の結果は、次のとおりです。

こんにちは、

素晴らしいニュースです！本日、GPT-4o miniを発表しました。これは、GPT-3.5 Turboと同じくらい速く、さらに賢く、そして安価な小型モデルです。以下がその概要です：

知能: GPT-4o miniは、テキスト知能（MMLUで82%のスコア、GPT-3.5 Turboの69.8%に対して）とマルチモーダル推論でGPT-3.5 Turboを上回ります。

価格: GPT-4o miniは、GPT-3.5 Turboよりも60%以上安価で、1M入力トークンあたり$0.15、1M出力トークンあたり$0.60（標準的な本の2500ページ相当）です。

モダリティ: 現在、GPT-4o miniはテキストと視覚機能をサポートしており、将来的には音声や動画の入力および出力のサポートも予定しています。

言語: GPT-4o miniは、GPT-3.5 Turboに比べて幅広い非英語言語での多言語理解が向上しています。

低コストと低レイテンシーにより、GPT-4o miniは高ボリュームのタスク（例：完全なコードベースや会話履歴をモデルに渡す）、コスト感度の高いタスク（例：大規模な文書の要約）、迅速な応答が求められるタスク（例：カスタマーサポートのチャットボット）に適しています。GPT-4o miniもGPT-4oと同様に、128kのコンテキストウィンドウを持ち、リクエストごとに最大16kの出力トークンをサポートし、知識のカットオフ日は2023年10月です。GPT-4o miniのファインチューニングは、今後数日以内に開始する予定です。

GPT-3.5 Turboを使用している開発者には、より高い知能をより低いコストで提供するGPT-4o miniへの移行をお勧めします。GPT-4o miniは、Chat Completions APIやAssistants API、または24時間以内に非同期で完了するバッチジョブに対して50%の割引が受けられるBatch APIで使用できます。

始めるには、Playgroundでモデルを試してみて、APIドキュメントをご覧ください。GPT-4o miniでの視覚機能の使用方法については、クックブックの「GPT-4oとGPT-4o miniの紹介」をご確認ください。質問がある場合は、OpenAI開発者フォーラムでお問い合わせください。

楽しい開発をお祈りします！ —OpenAIチーム

英語の原文とほぼほぼニュアンスを変えることなく和訳できています。

次に、先ほどの英語のメールを画像にして、ここに記載されている文字を読み取ってもらいます。

こんな感じで、聞いてみます。

実行結果は、次の通りです。

所感としては、「GPT-4o mini」という最安のモデルがマルチモーダルのモデルになり、画像、動画、音声と扱えるようになり、技術の進歩を感じられました。

この記事が気に入ったらサポートをしてみませんか？