
OpenAI Announces New Conversational AI Service 'GPT-4o' with Natural Interaction Capabilities




On May 13th (local time), OpenAI announced a new interactive generative AI service called "GPT-4o."

Overview of GPT-4o

GPT-4o is now available to subscribers of the paid "ChatGPT Plus" and "ChatGPT Team" plans, and an enterprise version is planned. Free ChatGPT users can also use it, subject to a limit on the number of messages per day.

Naturalness of ChatGPT's Voice and Reactions

One of the most noteworthy aspects is the "naturalness of ChatGPT's voice and reactions." While other companies offer generative AI services that can handle multiple types of input (multimodal), the demo of ChatGPT equipped with GPT-4o was remarkably natural.

For example, in the iPhone version of the ChatGPT app, you can start a conversation by saying "Hey, ChatGPT." The AI can keep up even if the topic changes mid-conversation, and it also responds appropriately if another person joins the conversation.

User Impressions

Both the response speed and the fluency were striking. The demo was conducted in English, and the voice sounded almost like a native speaker. It could also sing songs tailored to the topic and handle unusual requests such as "sing like a robot."

These features are useful in business settings as well as in everyday life. The demo showcased the voice assistant feature in the macOS app, where it reviewed on-screen source code by voice.

Future Development

One downside is that the natural conversation feature shown in the demo is "not yet in its final version." GPT-4o itself, with its multimodal analysis capabilities, will be rolled out gradually, but the "new Voice Mode" for natural conversation will first be released as an alpha (an early test version) for Plus users.

Comparison with Competitors

Impressive as the naturalness and response speed are, other companies such as Google already offer similar features. For instance, on May 2nd, Google launched the "Gemini" app, which supports languages other than English, including Japanese. The app lets users ask questions by voice and search using their smartphone camera (still images) at no charge.

Movements of Major Companies

OpenAI's announcement came one day before Google's annual developer conference, "Google I/O." Just before OpenAI's event, Google posted a demo on social media of an AI app that analyzes video in real time, a sign of how closely the two companies are competing.

More announcements from major companies will follow, including Microsoft's "Microsoft Build" on May 21st and Apple's "WWDC" on June 10th. With OpenAI having gone first, attention now turns to what competing services the other companies will reveal.

#OpenAI #GPT4o #AItechnology
