【映画「her」のAIが現実に？LLaMA 3、VASA-1の登場とサム・アルトマンの野望】英語解説を日本語で読む【2024年4月19日｜@AI Explained】

2024年4月21日 11:17

MetaがオープンソースのAIモデル「LLaMA 3」を公開し、GPT-4やClaude 3に匹敵する性能を示しています。さらに、Microsoftの「VASA-1」は、1枚の写真から人間の表情を模倣するリアルタイムのAIアバターを生成できます。これらの技術により、近い将来、人間とAIが映画「her」のように対話できる世界が実現するかもしれません。一方、AI看護師が人間の看護師を上回る性能を示しているなど、AIの社会実装も進んでいます。VASA-1の開発者は、この技術が医療分野でのコミュニケーションを豊かにすると述べています。OpenAIのサム・アルトマンCEOは、AIの知性よりも、ユーザーに合わせたパーソナライズが重要になると示唆しています。OpenAIはユーザーデータを活用し、個人に最適化されたAIを提供することで、GoogleとのAI開発競争に対抗しようとしているのかもしれません。AIの知性が人間のレベルに到達するタイミングについては、専門家の間で意見が分かれています。Mistralのアーサー・メンシュはAGI（汎用人工知能）の実現を疑問視する一方、Anthropicのダリオ・アモデイは、AIが質的に高い自律性を示す段階（ASL 4）に2025年から2028年の間に達し得ると予測しています。映画「her」の世界は、技術的には来年にも実現可能かもしれません。
公開日：2024年4月19日
※動画を再生してから読むのがオススメです。

Just as I was finishing editing the video you're about to see, LLaMA 3 was dropped by Meta.

動画の編集を終えたところで、MetaがLLaMA 3をリリースしました。

But rather than do a full video on that, I'm going to give you the TLDR.

しかし、その点についてのフルビデオを作る代わりに、TLDRをお伝えします。

That's because Meta aren't releasing their biggest and best model and the research paper is coming later.

というのも、Metaは最大かつ最高のモデルをまだリリースしておらず、研究論文も後日公開される予定だからです。

They have, though, tonight released two smaller models that are competitive, to say the least, with other models in their class.

ただし今夜、同クラスの他のモデルと比べて、控えめに言っても競争力のある2つの小型モデルをリリースしました。

Note that LLaMA 3 70B is competitive with Gemini Pro 1.5 and Claude 3 Sonnet, although without their context window size.

LLaMA 3 70Bは、Gemini Pro 1.5やClaude 3 Sonnetと競争力がありますが、それらほどのコンテキストウィンドウサイズはありません。

And here you can see the human-evaluated comparisons between LLaMA 3 70B, released tonight, and Mistral Medium, Claude Sonnet, and GPT-3.5.

今夜リリースされたLLaMA 3 70Bと、Mistral Medium、Claude Sonnet、GPT-3.5との間での人間による評価比較がここにあります。

What Meta appear to have found, although there were early glimpses of this in the original LLaMA paper, is that model performance continues to improve even after a model is trained on two orders of magnitude more data than the Chinchilla-optimal amount.

Metaが見つけたと思われるのは、元のLLaMA論文でも初期の兆候が見られたものですが、Chinchilla（チンチラ）最適量よりも2桁多いデータでモデルを訓練した後も、性能が改善し続けるということです。
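As a rough back-of-the-envelope check on the "two orders of magnitude" claim, here is a minimal sketch. The two token counts are assumptions taken from Meta's LLaMA 3 announcement rather than from this video (about 200B tokens as the Chinchilla-optimal amount for an 8B-parameter model, and roughly 15T training tokens), so treat the result as illustrative only.

```python
import math

# Assumed figures (from Meta's LLaMA 3 announcement, not stated in the video):
CHINCHILLA_OPTIMAL_TOKENS_8B = 200e9  # ~200B tokens for an 8B-parameter model
LLAMA3_TRAINING_TOKENS = 15e12        # ~15T training tokens


def overtraining_factor(trained_tokens: float, optimal_tokens: float) -> float:
    """Return how many times more data was used than the compute-optimal amount."""
    return trained_tokens / optimal_tokens


factor = overtraining_factor(LLAMA3_TRAINING_TOKENS, CHINCHILLA_OPTIMAL_TOKENS_8B)
orders = math.log10(factor)  # number of orders of magnitude
print(f"{factor:.0f}x the Chinchilla-optimal amount (~{orders:.1f} orders of magnitude)")
```

Under these assumed numbers the ratio comes out just short of a full 100x, which is consistent with "two orders of magnitude" as a rounded description.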

Essentially they saturated their models with quality data, giving a special emphasis to coding data.

基本的に、彼らはモデルに質の高いデータを飽和させ、コーディングデータに特別な重点を置いています。

They do say that they're going to release multiple models with new capabilities, including multimodality, conversing in multiple languages, a longer context window and stronger overall capabilities.

彼らは、新しい機能を備えた複数のモデルをリリースする予定だと述べており、その中にはマルチモーダル対応、複数言語での会話、より長いコンテキストウィンドウ、そして全体的により強力な能力が含まれています。

But before we get to the main video, here is the quick comparison you're probably all curious about.

メインのビデオに移る前に、おそらく皆さんが気になっているクイック比較をお見せします。

The mystery model that's still training versus the new GPT-4 Turbo and Claude 3 Opus.

まだトレーニング中の謎のモデルと、新しいGPT-4 TurboとClaude 3 Opusを比較します。

For the infamous MMLU, the performance is about the same for all three models.

悪名高いMMLUに関しては、すべてのモデルでほぼ同じ性能です。

For the Google-proof graduate STEM assessment, the performance is again almost identical, with Claude 3 just about having the lead.

「Google-proof」（検索しても解けない）大学院レベルのSTEM評価に関しては、Claude 3がわずかにリードしているものの、性能はやはりほぼ同じです。

For the coding benchmark HumanEval, although that's a deeply flawed benchmark, GPT-4 still seems to be in the lead.

コーディングのベンチマークであるHumanEvalに関しては、これは深刻な欠陥のあるベンチマークではありますが、GPT-4がまだリードしているようです。

For mathematics, somewhat surprisingly, many would say GPT-4 crushes this new LLaMA 3 model.

数学に関しては、驚くべきことに、多くの人がGPT-4がこの新しいLLaMA 3モデルを圧倒していると言うでしょう。

Despite the fact that they haven't given us a paper, we can say the two smaller models released tonight are super competitive with other models of their size and that this mystery model will be of a GPT-4 and Claude 3 Opus class.

彼らが論文を提出していないにもかかわらず、今夜リリースされた2つの小さなモデルは、そのサイズの他のモデルと非常に競争力があり、この謎のモデルはGPT-4とClaude 3 Opusのクラスになるでしょう。

I must move on from LLaMA 3 because I think in the last 48 hours, there was an announcement that is arguably even more interesting.

LLaMA 3の話はここまでにしなければなりません。というのも、この48時間の間に、おそらくさらに興味深い発表があったからです。

Using just a single photo of you, we can now get you to say anything.

あなたの写真1枚だけで、今では何でも言わせることができます。

Have you ever had, maybe you're in that place right now where you want to turn your life around and you know somewhere deep in your soul, there could be some decisions that you have to make.

今まさに、人生を変えたいと思う状況にいるのかもしれません。そして心の奥底では、下さなければならない決断があると分かっているのではないでしょうか。

It is proving much easier than many people thought to use AI to imitate not just human writing, voices, artwork and music, but now even our facial expressions.

人間の文章、声、芸術作品、音楽だけでなく、今や表情までもAIで模倣することが、多くの人が考えていたよりはるかに簡単であることが分かってきています。

And by the way, in real time, unlike Sora from OpenAI.

ちなみに、OpenAIのSoraとは違い、リアルタイムで。

But what does this even mean?

しかし、これは一体何を意味するのでしょうか？

For one, I think it is now almost certain that you will be able to have a real-time Zoom call with the next generation of models out later this year.

まず、今年後半に登場する次世代のモデルとリアルタイムでZoom通話ができるようになることは、ほぼ確実だと思います。

I think that will change how billions of people interact with AI.

それは何十億人もの人々がAIとどのようにやり取りするかを変えると思います。

How intelligent those models will be and how soon has been the subject of a striking new debate this week.

それらのモデルがどれだけ知的であり、どれだけ早く登場するかは、今週の注目すべき新しい議論の対象となっています。

Of course, I'll cover that, controversy over the new and imposing Atlas robot, AI nurses outperforming real ones and much more.

もちろん、私はそれを取り上げます。新しい威厳あるアトラスロボットに関する論争、AI看護師が実際の看護師を凌駕することなど、さらに多くのことを。

The VASA-1 paper from Microsoft came out in the last 48 hours and I've read the paper in full and I'm going to give you only the most relevant highlights.

MicrosoftのVASA-1論文が過去48時間で発表され、私はその論文を丸ごと読んで、最も関連性のあるハイライトだけをお伝えします。

But why pick out VASA when there have been papers and demos of relatively realistic deepfakes this year?

では、今年は比較的リアルなディープフェイクの論文やデモがすでにあったにも関わらず、なぜVASAを取り上げるのでしょうか？

Well, it's all about the facial expressions, the blinking, the expressiveness of the lips and eyebrows.

まあ、それはすべて表情、まばたき、唇や眉の表現力についてです。

Surprises me still.

驚かされます。

I ran it on someone just last night.

昨夜、誰かにそれを試しました。

It was fascinating.

それは魅力的でした。

You know, she had complained of shoulder pain in her arm.

彼女は腕の肩の痛みを訴えていたんですよ。

No model at this resolution has been this good.

この解像度のモデルではこれほど優れたものはありません。

I think a significant segment of the public, if shown this for the first time with no prep, could believe that these were real.

何の予備知識もなく初めてこれを見せられたら、一般の人々のかなりの部分が本物だと信じてしまう可能性があると思います。

You can control not only the emotion that the avatar is conveying, from happiness to anger, but also their distance from the camera and the direction of their gaze.

喜びから怒りまで、アバターが伝える感情だけでなく、カメラからの距離や視線の方向も制御できます。

I would say that we as readers are not meant to look at him in any other way but with disdain, especially in how he treats his daughter, okay?

私たち読者は、彼を軽蔑以外の何者として見るべきではないと言えますね、特に彼が娘を扱う態度に関しては、わかりますか？

But of course, he is able to clearly see through Morris.

もちろん、彼はモリスをはっきりと見抜くことができます。

And even though the VASA-1 model was only trained on real videos, which I'll get to in a moment, it can do things like this.

VASA-1モデルは実際のビデオでのみ訓練されていましたが、後で説明しますが、このようなことができます。

And the creators of VASA say this on the first page of their paper.

VASAの作成者たちは、論文の最初のページでこう述べています。

This paves the way for real-time engagements with lifelike avatars that emulate human conversational behaviors.

これにより、人間の会話行動を模倣するリアルなアバターとのリアルタイムの関与が可能になります。

At the moment, the resolution is almost HD at 40 frames per second.

現時点では、解像度はほぼHDで、1秒あたり40フレームです。

They also mention, which is crucial, negligible starting latency.

彼らはまた、重要なこととして、無視できる起動遅延についても言及しています。

Let me try to demonstrate.

私がデモンストレーションを試みてみましょう。

Again, all you need is an image and an audio clip from anywhere, AI-generated or otherwise.

繰り返しますが、必要なのは画像1枚と音声クリップだけです。出所は問わず、AI生成かどうかも問いません。

You know what I decided to do?

私が何をすることにしたか分かりますか？

I decided to focus.

私は集中することに決めました。

Somewhat ambitiously, the authors mentioned that this technology will amplify the richness of human-to-human interaction.

著者たちは、この技術が人間同士の対話の豊かさを増幅させるだろうと野心的に述べています。

I would more agree with the end of the paragraph where they talk about the social interaction in healthcare.

私は、彼らが医療分野での社会的な対話について話している段落の最後により同意します。

A few weeks ago, we learned that Hippocratic AI and NVIDIA had teamed up to release less than $9 an hour AI nurses.

数週間前、私たちはヒポクラティックAIとNVIDIAが提携して、1時間9ドル未満のAI看護師をリリースすることを知りました。

I'll show you the performance metrics, but here's a taster.

パフォーマンスメトリクスをお見せしますが、こちらはお試しです。

This is Linda calling from Memorial Hospital on a recorded line.

こちらはメモリアル病院からのリンダです。録音された通話です。

Is this Albert Wu?

アルバート・ウーさんですか？

Yes, it is.

はい、そうです。

Wonderful.

素晴らしい。

I'm calling on behalf of Dr. Brown, your cardiologist.

私は、あなたの心臓専門医であるブラウン博士の代理で電話しています。

To protect your privacy, can you please share your date of birth?

プライバシーを保護するために、あなたの生年月日を教えていただけますか？

It's January 1st.

1月1日です。

She's not trying to kill me, right?

彼女は私を殺そうとしているわけではないですよね？

I thought that after all these years of me teasing her, she's finally trying to get back at me.

何年も彼女をからかってきた後、彼女がついに私に仕返ししようとしていると思ったんです。

Rest assured, your wife isn't out to get you.

安心してください、あなたの妻はあなたを狙っているわけではありません。

And there's no need to worry about a negative interaction with your lisinopril.

そして、あなたが服用しているリシノプリルとの悪い相互作用を心配する必要もありません。

Your latest lab results show your potassium levels are within the normal range, which is between 3.5 and 5.

最新の検査結果によると、あなたのカリウムレベルは正常範囲内であり、その範囲は3.5から5です。

And according to ratings given by human nurses, these AI nurses, even without a video avatar, outperformed in terms of bedside manner and educating the patient.

そして、人間の看護師による評価によると、これらのAI看護師は、ビデオアバターがなくても、患者への対応や教育面で優れていました。

On a technical level, they outperformed in identifying a medication's impact on lab values, identifying disallowed over-the-counter medications, and way outperformed in detecting toxic dosages.

技術的な面では、薬が検査値に与える影響の特定と、使用してはならない市販薬の特定で人間の看護師を上回り、中毒量の検出でははるかに上回りました。

Imagine your next nurse appointment looking like this.

次の看護師の予約がこんな感じだと想像してみてください。

I'd love to begin with you firstly, just because I read that you started out in advertising and now you run a wellness business.

まず最初に、あなたと始めたいと思います。なぜなら、広告業界でスタートし、今はウェルネスビジネスを運営していると読んだからです。

These principles will not only make your user's journey more pleasant, they'll contribute to better business metrics as well.

これらの原則は、ユーザーの旅をより楽しくするだけでなく、ビジネスの指標も向上させます。

Users hate being interrupted and they hate getting broken experiences.

ユーザーは中断されることを嫌い、壊れた体験を嫌います。

Keeping these principles in mind in your app design makes for a better user journey.

これらの原則をアプリデザインに念頭に置くことで、より良いユーザージャーニーが実現します。

Let's briefly touch on their methodology.

彼らの方法論について簡単に触れてみましょう。

What they did so differently was to map all possible facial dynamics, lip motion, non-lip expression, eye gaze and blinking, onto a latent space.

彼らが異なる点は、すべての可能な顔のダイナミクス、唇の動き、非唇の表現、視線と瞬きを潜在空間にマッピングしたことです。

Think of that as being a compute-efficient, condensed machine representation of the actual 3D complexity of facial movements.

それを、顔の動きが持つ実際の3次元的な複雑さを、計算効率よく凝縮した機械向けの表現だと考えてください。

Previous methods focused much more just on the lips and had much more rigid expressions.

以前の方法は、唇に焦点を当て、より硬直した表現が多かった。

The authors also revealed that it was a diffusion transformer model.

著者たちはまた、それが拡散トランスフォーマーモデルであることを明らかにしました。

They used the transformer architecture to map audio to facial expressions and head movements.

音声を顔の表情や頭の動きにマッピングするために、彼らはTransformerアーキテクチャを使用しました。

The model actually first takes the audio clip and generates the appropriate head movements and facial expressions, or at least a latent variable representing those things.

モデルは実際には、まず音声クリップを受け取り、適切な頭部の動きと表情、あるいは少なくともそれらを表す潜在変数を生成します。

Only then, using those facial and head motion codes, does their method produce video frames.

その後、その顔の動きや頭の動きのコードを使用して、彼らの手法がビデオフレームを生成します。

Which of course also takes the appearance and identity features extracted from the input image.

もちろん、入力画像から抽出された外観と身元の特徴も取り込まれます。
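The two-stage flow described above (audio to latent motion codes, then motion codes plus appearance features to frames) can be sketched as a toy data-flow example. Everything here is a placeholder invented for illustration and is not taken from the paper: the function names, the `MotionCode` class, the "loudness" feature, and the list-based "frames". In particular, stage 1 in the paper is a diffusion transformer, which this sketch replaces with a trivial stand-in.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MotionCode:
    """Hypothetical latent code for one frame: facial dynamics plus head pose."""
    face: List[float]
    head: List[float]


def encode_appearance(image: List[List[float]]) -> List[float]:
    """Stand-in for the appearance/identity encoder: one value per image row."""
    return [sum(row) / len(row) for row in image]


def audio_to_motion(audio: List[float], samples_per_frame: int) -> List[MotionCode]:
    """Stage 1 stand-in: produce one motion code per chunk of audio.

    The real model uses a diffusion transformer; this toy uses chunk loudness.
    """
    codes = []
    for start in range(0, len(audio) - samples_per_frame + 1, samples_per_frame):
        chunk = audio[start:start + samples_per_frame]
        loudness = sum(abs(s) for s in chunk) / samples_per_frame
        codes.append(MotionCode(face=[loudness], head=[0.0]))
    return codes


def decode_frame(appearance: List[float], code: MotionCode) -> List[float]:
    """Stage 2 stand-in: combine identity features with one motion code."""
    return [a + code.face[0] + code.head[0] for a in appearance]


def generate_video(image, audio, samples_per_frame):
    appearance = encode_appearance(image)               # extracted once from the photo
    motion = audio_to_motion(audio, samples_per_frame)  # audio -> latent motion codes
    return [decode_frame(appearance, c) for c in motion]  # motion codes -> frames


image = [[0.2, 0.4], [0.6, 0.8]]
audio = [0.1, -0.1, 0.5, -0.5, 0.0, 0.0]
frames = generate_video(image, audio, samples_per_frame=2)
print(len(frames), "frames generated")
```

The point of the separation is that the photo is encoded once, while the audio alone drives the per-frame motion codes, so the same audio could in principle animate any input image.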

Very deep in the paper, you might be surprised by just how little data it takes to train VASA-1.

論文の中で非常に深いところで、VASA-1を訓練するのにどれだけ少ないデータが必要かに驚かれるかもしれません。

They used the public VoxCeleb2 dataset.

彼らは公開されているVoxCeleb2データセットを使用しました。

I looked it up and it calls itself a large-scale dataset, but it's just 2000 hours.

調べてみたところ、それは大規模なデータセットと呼んでいますが、実際はたったの2000時間です。

For reference, YouTube might have 2 billion hours.

参考までに、YouTubeには20億時間ほどの動画があるかもしれません。

And we know, according to leaks, that OpenAI trained on a million hours of YouTube data.

そして、リークによると、OpenAIは100万時間のYouTubeデータで訓練したことがわかっています。

I know this dataset is curated, but the point remains about the kind of results you could get with this little data.

このデータセットがキュレーションされているとはわかっていますが、この少ないデータでどのような結果が得られるかという点は変わりません。

In fairness, they did also mention supplementing with their own smaller dataset using 3500 subjects.

公平を期して言えば、彼らは3500人の被験者を含む独自の小規模なデータセットで補ったことにも言及しています。

But the scale of data remains really quite small.

しかし、データの規模は本当に非常に小さいままです。
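To put the three figures quoted above side by side, here is a minimal arithmetic sketch. The hour counts are the rough numbers mentioned in the video (the YouTube total is explicitly a guess there, and the OpenAI figure comes from leaks), and the variable and function names are made up for illustration.

```python
# Rough figures as quoted in the video; none of them are exact.
VOXCELEB2_HOURS = 2_000
OPENAI_YOUTUBE_HOURS = 1_000_000     # per the leaks mentioned above
YOUTUBE_TOTAL_HOURS = 2_000_000_000  # speaker's rough guess


def hours_ratio(larger: int, smaller: int) -> float:
    """Return how many times larger one dataset is than another, by hours."""
    return larger / smaller


vs_openai = hours_ratio(OPENAI_YOUTUBE_HOURS, VOXCELEB2_HOURS)
vs_youtube = hours_ratio(YOUTUBE_TOTAL_HOURS, VOXCELEB2_HOURS)
print(f"OpenAI's reported YouTube data: {vs_openai:,.0f}x VoxCeleb2")
print(f"All of YouTube (rough guess):   {vs_youtube:,.0f}x VoxCeleb2")
```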

But here is the 15 second headline comparing their methods to real video and previous methods.

しかし、ここには、彼らの方法を実際のビデオや以前の方法と比較した15秒の見出しがあります。

The lip syncing accuracy is unprecedented and the synchronisation to the audio is state of the art.

リップシンクの精度は前代未聞であり、オーディオへの同期は最先端技術です。

The video quality is improved but of course still far from reality.

ビデオの品質は向上していますが、もちろん現実からはまだ遠いです。

They say they're working on better imitation of hair and clothing and extending to the full upper body.

彼らは、髪や服のより良い再現と、上半身全体への拡張に取り組んでいると述べています。

For fairly obvious reasons, Microsoft are not planning to release VASA-1 and say, "We have no plans to release an online demo, API, product or any related offerings

かなり明らかな理由から、MicrosoftはVASA-1をリリースする予定はなく、「オンラインデモ、API、製品、その他関連する提供物を公開する計画はありません。

until at least we are certain that the technology will be used responsibly and in accordance with proper regulations."

少なくとも、この技術が責任を持って、適切な規制に従って使用されると確信できるまでは」と述べています。

I'm not quite sure how you could ever be certain of that.

どうすればそれを確信できるのか、私にはよく分かりません。

Likely a VASA-1 equivalent will be released open source on the dark web in the coming years.

おそらく、近い将来、VASA-1に相当するものがダークウェブでオープンソースとして公開されるでしょう。

Of course, to get to Her levels of realism, we'd also need an AI to analyse our own emotions.

もちろん、映画「her」並みのリアリティに到達するには、私たち自身の感情を分析するAIも必要です。

But you're probably not surprised to learn that there's a company focused squarely on that, Hume AI.

でも、それにまっすぐ取り組んでいる企業があることに驚かれることはないでしょう、Hume AIという会社がそうです。

I'm going to start a conversation and have the AI analyse the emotions in my voice.

私は会話を始めて、AIに私の声の中の感情を分析させるつもりです。

Should be interesting.

面白いはずです。

Tonight, I am actually debuting a new newsletter called Signal to Noise and the link will be in the description.

今夜は、実際にSignal to Noiseという新しいニュースレターを発表します。リンクは説明に記載されています。

I'm pretty pumped.

かなりワクワクしています。

Determination, calmness?

決意と落ち着き？

I don't think I'm that calm.

私はそんなに落ち着いているとは思いません。

Concentration?

集中力？

I'll take it.

それなら受け入れましょう。

And yes, that wasn't just to test Hume AI, that's a real announcement.

そして、それはHume AIをテストするためだけではなく、本当の発表です。

I have worked for months on this one and I'm really proud of how it looks and sounds.

私はこのプロジェクトに数ヶ月取り組んできましたが、見た目や音の出方に本当に誇りを持っています。

It's free to sign up and the inspiration behind the name was this.

登録は無料で、その名前のインスピレーションはこれにあります。

As all of you guys watching on YouTube know, there's a lot of noise around, but not as much signal.

YouTubeでご覧の皆さんはご存知のように、周りにはたくさんのノイズがありますが、シグナルはそれほど多くありません。

And on this channel, I try to maintain a high signal to noise ratio.

そして、このチャンネルでは、信号対ノイズ比を高く保つよう努めています。

I basically only make videos on this channel when there's something that's happened that I actually find interesting myself.

実際に自分自身が興味を持ったことがあるときだけ、このチャンネルで動画を作成しています。

And it will be the same with this newsletter.

そして、このニュースレターも同じです。

I'm only actually going to do posts when there's something interesting that's happened.

実際には、何か面白いことがあったときだけ、投稿をするつもりです。

And more than that, I'm going to give every post a "does it change everything" dice rating.

さらに、すべての投稿に「すべてを変えるかどうか」のダイス評価を付けます。

That's my quirky way of analyzing whether the entire industry is actually stunned.

業界全体が本当に衝撃を受けているかどうかを分析する、私なりの風変わりな方法です。

Absolutely no spam, quality writing, at least in my opinion, and a "does it change everything" rating that you can see at a glance.

スパムは一切なく、（少なくとも私の意見では）質の高い文章と、一目でわかる「すべてを変えるかどうか」の評価があります。

Each post is like a three, four minute read and the philosophy was I wanted a newsletter that I would be excited about.

各投稿は3、4分の読み物のようで、その哲学は私がワクワクするニュースレターを作りたかったということでした。

And only for those who really want to support the hype-free ethos of the channel and the newsletter, there is the Insider Essentials tier.

チャンネルとニュースレターの、誇大宣伝をしないという精神を本当にサポートしたい人だけのために、Insider Essentialsというプランがあります。

You'll get exclusive posts, sample insider videos, and access to an experimental SmartGPT 2.0.

限定の投稿、サンプルのインサイダービデオ、そして実験的なSmartGPT 2.0へのアクセスが得られます。

Absolutely no obligation to join.

参加する義務はまったくありません。

I would be overjoyed if you simply sign up to the free newsletter.

無料のニュースレターにサインアップしていただけるだけでも、私は大喜びです。

Whether you're subbing for free or with essentials, do check your spam because sometimes the welcome message goes there.

無料で購読するかエッセンシャルズで購読するかにかかわらず、スパムボックスをチェックしてください。時々、歓迎メッセージがそこに入ってしまうことがあります。

As always, if you want all my extra video content and professional networking and tip sharing, do sign up on AI Insiders on Patreon.

いつものように、私の追加のビデオコンテンツやプロのネットワーキング、コツの共有をすべてご覧いただきたい場合は、PatreonのAIインサイダーズにサインアップしてください。

At least so far, I've been able to individually welcome every single new member.

少なくとも今のところ、新しいメンバーを個別に歓迎することができています。

But of course, while deepfakes progress, robot agility is also progressing.

しかし、ディープフェイクが進化する一方で、ロボットの機敏さも進化しています。

Here's the new Atlas from Boston Dynamics.

こちらがBoston Dynamicsの新しいアトラスです。

The other most famous robot on the scene is the Figure 01, which I talked about in a recent video.

最も有名なロボットのもう1つは、最近のビデオで話したFigure 01です。

And just two hours ago, the CEO of the company that makes Figure 01 said this.

そして、たった2時間前に、Figure 01を製造している会社のCEOがこう言いました。

Speaking of Boston Dynamics' new Atlas, "Won't be the last time we get copied.

Boston Dynamicsの新しいアトラスについて、彼はこう述べました。「私たちがコピーされるのは、これが最後ではないでしょう。

If it's not obvious yet, Figure is doing the best mechanical design in the world for robotics.

まだ明らかでないかもしれませんが、Figureは世界で最高のロボティクスのための機械設計を行っています。

And he was referencing the waist design of the new Atlas.

そして、彼は新しいアトラスのウエストデザインを参照していました。

Whether that comment is more about PR and posture, only time will tell.

そのコメントが単なるPRや虚勢にすぎないのかどうかは、時間が経てばわかるでしょう。

But before we completely leave the topic of AI social interaction and Her, here's Sam Altman from two days ago.

しかし、AIとの社会的なやり取りと映画「her」の話題から完全に離れる前に、2日前のサム・アルトマンの発言をご覧ください。

He suggests that the personalization of AI to you might be even more important than their inherent intelligence.

彼は、AIをあなた個人に合わせてパーソナライズすることが、AIそのものの知能よりもさらに重要かもしれないと示唆しています。

That's just intelligence is just like some emergent property of matter or something.

知性というのは、物質の何らかの創発的な性質のようなものにすぎないのかもしれません。

The long-term differentiation will be the model that's most personalized to you, that has your whole life context, that plugs into everything else you want to do, that's like well integrated into your life.

長期的な差別化要因は、あなたに最もパーソナライズされ、あなたの人生全体の文脈を持ち、あなたがやりたい他のすべてのことにつながり、あなたの生活にうまく統合されたモデルになるでしょう。

But for now, the curve is just so steep that the right thing for us to focus on is just make that base model better and better.

しかし、現時点では、曲線が非常に急であるため、私たちが焦点を合わせるべき正しいことは、基本モデルをますます良くしていくことだけです。

I do start to wonder if that's part of a deliberate strategy from OpenAI.

私はOpenAIの意図的な戦略の一部なのかと思い始めます。

In my recent Stargate video, I talked about how Microsoft are spending a hundred billion dollars.

私の最近のスターゲートビデオでは、Microsoftが1000億ドルを費やしていることについて話しました。

But this week, Hassabis said that Google will be spending more than that on compute.

しかし、今週、ハサビスはGoogleがそれ以上の金額をコンピュートに費やすと述べました。

If it is true that Google starts to race away with the power of their models, that could be one way that OpenAI competes.

Googleがモデルのパワーで一気に逃げ出し始めるということが真実であれば、それはOpenAIが競争する方法の一つかもしれません。

Get more data from more users and personalize their AI to you, likely with a video avatar.

より多くのユーザーからデータを取得し、そのAIをあなたに合わせて個人化し、おそらくビデオアバターで提供する。

And don't forget, we got very early hints of this with the GPT store.

そして忘れないでください、GPTストアでこれについて非常に早いヒントを得ました。

OpenAI are now paying US builders based on user engagement with their GPTs.

OpenAIは現在、各自のGPTsに対するユーザーエンゲージメントに基づいて、米国のビルダーに支払いを行っています。

At the moment, that user engagement is apparently really quite low.

現時点では、そのユーザーエンゲージメントはどうやらかなり低いようです。

But throw in a lifelike video avatar and that might change quite quickly.

しかし、リアルなビデオアバターを投入すれば、それはかなり速く変わるかもしれません。

Of course, those models would only become truly addictive for many when they were as smart as the average human.

もちろん、それらのモデルが多くの人にとって本当に中毒性のあるものになるのは、平均的な人間と同じくらい賢くなって初めてのことでしょう。

There are those though, of course, that say that's never going to happen, including the creators of some cutting edge models.

もちろん、いくつかの最先端モデルのクリエイターを含む、それが決して起こらないと言う人々もいます。

Here's Arthur Mensch, co-founder of Mistral.

こちらが、Mistralの共同創設者であるアーサー・メンシュです。

The whole AGI rhetoric, artificial general intelligence, is about creating God.

AGI、つまり汎用人工知能をめぐる言説はすべて、神を創造するという話です。

I don't believe in God.

私は神を信じていません。

I'm a strong atheist, so I don't believe in AGI.

私は強い無神論者なので、AGIを信じていません。

I'm not personally sure about the link there, but it's an interesting quote.

私は個人的にその関連性については確信が持てませんが、それは興味深い引用です。

Then we have Yann LeCun, a famous LLM skeptic.

そして、有名な大規模言語モデルの懐疑論者であるヤン・ルカンがいます。

He's previously said that something like AGI definitely wouldn't be coming in the next five years.

彼は以前、AGIのようなものは次の5年間には絶対に現れないと言っていました。

Three days ago, he said this.

三日前、彼はこれを言いました。

There is no question that AI will eventually reach and surpass human intelligence in all domains.

AIがいずれ人間の知能をすべての領域で達成し、超えることは疑いの余地がありません。

But it won't happen next year.

しかし、それは来年起こることはありません。

He then went on in parentheses to say that autoregressive LLMs may indeed constitute a component of AGI.

それから、彼はかっこ書きで、自己回帰型の大規模言語モデルが実際にAGIの構成要素の一つになるかもしれないと述べました。

That does seem to me to be a slight change in emphasis from previous statements.

それは以前の発言と比べて、私にはわずかな強調の変化のように思えます。

Others, like the CEO of Anthropic, have much more aggressive timelines.

AnthropicのCEOのように、はるかに早い時期を見込んでいる人たちもいます。

For the context of what you're about to hear from Dario Amodei, ASL level 3 refers to systems that substantially increase the risk of catastrophic misuse or show low-level autonomous capabilities.

これからDario Amodeiの発言を聞いていただくにあたっての前提ですが、ASLレベル3は、壊滅的な悪用のリスクを大幅に高めるか、低レベルの自律能力を示すシステムを指します。

Whereas AI safety level 4 indicates systems that involve qualitative escalations in catastrophic misuse potential and autonomy.

一方、AI安全レベル4は、壊滅的な悪用の可能性と自律性が質的に一段階高まったシステムを示します。

On timelines, just this week, he said this.

タイムラインに関して、今週、彼はこれを言いました。

When you imagine how many years away, just roughly, ASL 3 is and how many years away ASL 4 is, you've thought a lot about this exponential scaling curve.

ASL 3が大体何年先で、ASL 4が何年先だと想像していますか。あなたはこの指数関数的なスケーリング曲線について随分考えてこられましたよね。

If you just had to guess, what are we talking about?

もし推測しなければならないとしたら、どのくらいの話になりますか？

I think ASL 3 could easily happen this year or next year.

ASL 3は今年か来年に実現することも十分あり得ると思います。

I think ASL 4 could happen anywhere from 2025 to 2028. So that is fast.

ASL 4は2025年から2028年のどこかで実現する可能性があると思います。つまり、それだけ速いということです。

I'm truly talking about the near future here.

私は本当にここで近い将来について話しています。

I'm not talking about 50 years away.

50年後については話していません。

According to who you listen to, AGI either doesn't exist or is coming pretty imminently.

誰の話を聞くかによって、AGIは存在しないか、かなり間近に迫っているかのどちらかになります。

But I have to end as I began, with Her.

しかし、始めたときと同じく、映画「her」で締めくくらなければなりません。

Some say that the movie Her was set in the year 2025, and that's starting to seem pretty appropriate.

一部の人は、映画「her」が2025年に設定されていたと言いますが、それはかなり適切に思え始めています。

Whether or not it's actually released, I do think we, humanity, will be technologically capable of something approximating Her by next year.

実際にリリースされるかどうかは別として、私は来年までに、私たち人類が「her」に近い何かを技術的に実現できるようになると思います。

Let me know if you agree.

もし同意しているなら教えてください。

Thank you so much for watching to the end of the video.

動画の最後まで見てくれて本当にありがとう。

Please do check out my new newsletter.

ぜひ新しいニュースレターをチェックしてください。

I'm super proud of it.

私はそれをとても誇りに思っています。

And as always, have a wonderful day.

そしていつも通り、素敵な一日を過ごしてください。
