

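Multiple sources claim that a privately owned company has achieved a major technical breakthrough very similar to the Q-Star model. The breakthrough is said to potentially enable active reasoning capabilities comparable to the Q-Star model developed by OpenAI last year. The pace of this progress is surprising. In addition, former GitHub CEO Nat Friedman and his investment partner Daniel Gross made news by investing 100 million dollars in Magic, the developer of an AI coding assistant. Magic aims to go beyond existing coding assistants such as GitHub Copilot and provide a fully automated coding co-worker. Magic claims to have developed a new type of Large Language Model (LLM) that can process 3.5 million words of text input, five times as much as Google's latest Gemini LLM.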

It has been a crazy February, and it's about to get crazier because a privately owned company has just achieved a crazy technical breakthrough that is very, very similar to the Q-Star model, and that's what multiple sources are claiming.


I'm going to be giving you guys the scoop on this because this is absolutely insane, and I was actually shocked at how quickly AI technology is evolving.


So here, you can see that it says that just as important, Magic also privately claimed to have made a technical breakthrough.


This breakthrough could enable active reasoning capabilities similar to the Q-Star model developed by OpenAI last year, according to a person familiar with its technology.


And that, ladies and gentlemen, is an absolutely astounding statement.


Because we do know that Q-Star was a model/system that we didn't really know much about.


But there were so many leaks, so many theories, so many capabilities.


And at that time in which Q-Star was leaked, there were so many things going on at OpenAI.


That led us to believe that Q-Star was really true.


Now, later on in the video, I'm going to dive into a bit more about Q-Star.


Because there is actually a document that kind of holds a lot of knowledge on the Q-Star stuff and why it's such a big deal with whatever this company, this privately owned company Magic, has done.


And some of the details I'm going to show you guys are going to really shock you.


Because this goes to show how quickly we are moving in this space.


So essentially, what actually happened, okay, because I'm going to actually come back to this active reasoning because there's a lot to dissect here.


So essentially, it says that former GitHub CEO Nat Friedman and his investment partner Daniel Gross raised eyebrows last week by writing a 100 million dollar check to Magic, the developer of an artificial intelligence coding assistant.


There are loads of coding assistants already, and the top dog among them is Microsoft's GitHub Copilot.


So what did Friedman and Gross see in Magic?


Remember, these guys wrote a 100 million dollar check to this company out of nowhere because they saw something that they were like, okay, if we invest 100 million, we are certainly going to make our money back, and they basically think that this is probably the best thing ever.


I'm going to get into more of that, okay, I'm just going to skim this bit quickly because I want to show you guys how all of this ties in.


So essentially, they state that the answer goes beyond the company's claim that it will soon be able to furnish its customers with a fully automated coding co-worker, not just a semi-automated assistant that finishes fragments of code as GitHub Copilot does.


So if you don't know, that's exactly what GitHub Copilot does, it kind of just finishes the fragments of code, and it's not like a fully automated co-worker.


So essentially, I'm guessing that what they're moving towards is more of an agent framework, and that the breakthrough they've made is clearly a really, really insane one, and essentially they say that the startup has created a new type of Large Language Model that can process huge amounts of data at once, which is what's known as its context window.


Now, what I want to talk to you guys about is like this insane week, I think this month is probably going to be the biggest month of AI, probably this year.


I don't know that AI is on an exponential, but I think that with the election coming up later on in the year, I think that stuff might drown out the AI stuff, but this is crazy, okay, like this is probably the biggest thing, okay.


So essentially, and I know some of you guys didn't see this because this was overshadowed, so essentially it talks about Magic claims to be able to process 3.5 million words worth of text input, five times as much information as Google's latest Gemini LLM, which in turn was a big advance on OpenAI's GPT-4.


In other words, Magic's model essentially has an unlimited context window, perhaps bringing it closest to the way humans process information.


Now, why is this crazy?


Well, of course, there are just several higher-reaching implications that are just absolutely insane.


But the first thing that I think is absolutely insane, and as you watch this video, you're going to understand how crazy this breakthrough is.


It's because if you haven't been paying attention, basically, Google's latest Gemini LLM was huge, okay?


And many people did miss it.


And I'm about to dive into it.


Essentially, they're basically stating that it is five times, okay, five times as much as Gemini, okay?


And I'm about to show you guys how crazy Gemini is.


And when you see that, you're going to be like, what on earth are you talking about?


But I think the craziest thing here is, of course, the potentially unlimited context window.


I think if that is true, and you know, potentially because when they've spoken about the kind of breakthrough, essentially that could be absolutely game-changing.


Now, one thing that I also want to talk about, okay, because this is something that you guys need to know, is that they're essentially saying that they were able to process five times as much information as Google's latest Gemini LLM, and if you haven't seen this because it got overshadowed by OpenAI's Sora product, which is of course something that was absolutely incredible, you guys are going to go ahead and see this.


So essentially, Google's Gemini 1.5 Pro was released a couple days ago, and I know nobody paid attention to this unless you are super deep in the AI technology space because it was something that most people just didn't see.


It probably would have been the main headline in the AI space if it wasn't for OpenAI's Sora text-to-video technology, and that is of course because it is great and amazing, but Sora stole the show.


So what you're seeing on screen right now is essentially Google's Gemini 1.5 Pro, and the main thing about this model, if you didn't really know about it, is that it's able to process huge, huge context length.


I'm talking one hour of video, 11 hours of audio, 30,000 lines of code, and 700,000 words.


That is, I don't even know how many novels that is, but it's a lot.
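As a rough sanity check on those numbers, here is a small sketch of the arithmetic. The 700,000 and 3.5 million word figures come from the claims quoted in this video. The novel length of 90,000 words and the rate of 0.7 words per token are assumptions for illustration only (the second one simply follows from Google quoting about 1 million tokens alongside 700,000 words).

```python
# Figures quoted in the video.
GEMINI_WORDS = 700_000      # Gemini 1.5 Pro, as presented by Google
MAGIC_WORDS = 3_500_000     # Magic's claimed input size

# Assumptions for illustration only, not from the source.
WORDS_PER_NOVEL = 90_000    # a typical novel length, assumed
WORDS_PER_TOKEN = 0.7       # roughly 700,000 words per 1,000,000 tokens


def words_to_tokens(words, words_per_token=WORDS_PER_TOKEN):
    """Estimate a token count from a word count."""
    return round(words / words_per_token)


ratio = MAGIC_WORDS / GEMINI_WORDS
novels = GEMINI_WORDS / WORDS_PER_NOVEL

print(f"Magic vs Gemini: {ratio:.0f}x")
print(f"Gemini window in novels: about {novels:.1f}")
print(f"Gemini window in tokens: about {words_to_tokens(GEMINI_WORDS):,}")
print(f"Magic claim in tokens: about {words_to_tokens(MAGIC_WORDS):,}")
```

Under those assumptions, 700,000 words works out to a little under eight novels, and the claimed 3.5 million words to roughly 5 million tokens.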


You can see it compared to Gemini, you can see it compared to GPT-4 Turbo, and you can also see compared to Claude 2.1.


This thing is absolutely insane, this is a killer, okay, of everything else that we knew, and we knew longer context windows were coming because we saw multiple different research papers that were just continually increasing the context window.


Now, the thing is, you might be wondering, okay, they've got longer context windows, they can analyze an hour of video, they can analyze 11 hours of audio, they can analyze 30,000 lines of code, 700,000 words, that's good and all, but is it even accurate? Because we know that some of the companies that have done that before, they've done it, but it wasn't that accurate in whatever they were doing.


So one thing that they did in order to test this and why I'm showing you guys about Google is because if Magic's claiming that they've done something better than Google have done, that is incredible, so you need to first understand how Google's Gemini Pro works.


And then when you understand if Magic's beaten it, why the implications of that are so crazy.


So essentially, what Google did was, you know how I just basically said that Google had a super long context window.


You can put in like 11 or even 22 hours of audio, three hours of video, and seven million words, or 10 million tokens.


Essentially, what you know they did was, in order to test this because people were like, hmm, I wonder if this is accurate or not, basically they hid a secret phrase in the video.


And for the video, it was just one frame out of all of them for like, I think it was around two hours or so.


That's what they did.


They hid one frame.


They asked AI to find it and it did.


In audio, they hid like one sentence, or like three words, and it found it.


It was like, what is the secret word?


It managed to find it.


And then in text, it was able to do that as well.


And I think there were just some very, very small errors on text, but overall, you can see that on successful retrieval versus the unsuccessful retrieval, how crazy it is.


It's very, very accurate.


I think it was like 99.9 percent, something like that, pretty much perfect.
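To make that test concrete, here is a minimal sketch of how a "hide one phrase and ask for it back" check could be set up for text. This is not Google's actual evaluation code. The filler sentence, the needle phrase, and `stand_in_model` are all hypothetical, and `stand_in_model` is just a string search standing in for a real model call.

```python
FILLER = "The grass is green and the sky is blue."
NEEDLE = "The secret word is 'needle'."


def build_haystack(n_sentences, depth):
    """Return filler text with NEEDLE inserted at a fractional depth (0.0 to 1.0)."""
    sentences = [FILLER] * n_sentences
    sentences.insert(int(depth * n_sentences), NEEDLE)
    return " ".join(sentences)


def stand_in_model(context, question):
    """Hypothetical stand-in for a real long-context model: scan for the phrase."""
    marker = "The secret word is '"
    start = context.find(marker)
    if start == -1:
        return ""
    start += len(marker)
    return context[start:context.find("'", start)]


def retrieval_rate(model, lengths, depths):
    """Fraction of (length, depth) combinations where the model returns the needle."""
    hits = 0
    total = 0
    for n in lengths:
        for d in depths:
            answer = model(build_haystack(n, d), "What is the secret word?")
            hits += answer == "needle"
            total += 1
    return hits / total


rate = retrieval_rate(stand_in_model, lengths=[100, 1_000, 10_000],
                      depths=[0.0, 0.25, 0.5, 0.75, 1.0])
print(f"retrieval rate: {rate:.1%}")
```

To test a real model you would swap `stand_in_model` for a function that sends the context and question to that model. The grid of lengths and depths is what gives you the successful versus unsuccessful retrieval picture.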


So essentially, we're moving towards an era where these super long context windows are going to be absolutely insane.


And that implication, okay, if Magic's done something where it's essentially beaten Google with like an unlimited context window, I don't even know if that's even possible.


I think the crazy implications of this is that of course we know that, remember, okay, OpenAI are going to be forced to pull something crazy out the back.


Now what I want to show you guys as well is, of course, two clips from Gemini, because you might be thinking, okay, this AI has got a long context window, but who even cares if it's not smarter than GPT-4 and it's not smarter than Claude 2.1?


Just to let you guys know on the benchmarks, Gemini 1.5 Pro is actually surpassing GPT-4 Turbo on all the benchmarks, by the way, just to put that out there.


And of course, if it can analyze an hour of video, 11 hours of audio, and all this other stuff, what you can actually do is different tasks, and that's where I'm going to show you guys these two demos.


And then, I'm going to get back to exactly how that works.


This is a demo of long context understanding, an experimental feature in our newest model, Gemini 1.5 Pro.


We'll walk through some example prompts using the Three.js example code, which comes out to over 800,000 tokens.


We extracted the code for all of the Three.js examples and put it together into this text file, which we brought into Google AI Studio.


Over here, we asked the model to find three examples for learning about character animation.


The model looked across hundreds of examples and picked out these three: one about blending skeletal animations, one about poses, and one about morph targets for facial animations.


All good choices based on our prompt.


In this test, the model took around 60 seconds to respond to each of these prompts, but keep in mind that latency times might be higher or lower as this is an experimental feature we're optimizing.


Next, we asked what controls the animations on the Littlest Tokyo demo.


As you can see here, the model was able to find that demo and it explained that the animations are embedded within the glTF model.


Next, we wanted to see if it could customize this code for us.


So we asked, Show me some code to add a slider to control the speed of the animation, use that kind of GUI the other demos have.


This is what it looked like before on the original Three.js site, and here's the modified version.


It's the same scene, but it added this little slider to speed up, slow down, or even stop the animation on the fly.


It used this GUI library the other demos have.


Set a parameter called animation speed and wired it up to the mixer in the scene.


Like all generative models, responses aren't always perfect.


There's actually not an init function in this demo like there is in most of the others.


However, the code it gave us did exactly what we wanted.


Next, we tried a multimodal input by giving it a screenshot of one of the demos.


We didn't tell it anything about this screenshot and just asked where we could find the code for this demo seen over here.


As you can see, the model was able to look through the hundreds of demos and find the one that matched the image.


Next, we asked the model to make a change to the scene.


Asking, How can I modify the code to make the terrain flatter?


The model was able to zero in on one particular function called generateHeight and showed us the exact line to tweak.


Below the code, it clearly explained how the change works.


In the updated version, you can see that the terrain is indeed flatter, just like we asked.


We tried one more code modification task using this 3D text demo over here.


We asked, I'm looking at the text geometry demo and I want to make a few tweaks.


How can I change the text to say 'goldfish' and make the mesh materials look really shiny and metallic?


You can see the model identified the correct demo and showed the precise lines in it that need to be tweaked.


Further down, it explained these material properties, metalness and roughness, and how to change them to get a shiny effect.


You can see that it definitely pulled off the task and the text looks a lot shinier now.


These are just a couple examples of what's possible with a context window of up to 1 million multimodal tokens in Gemini 1.5 Pro.


So now that you've seen that bit right there, okay, and you understand how crazy that is, okay, you can understand that this kind of breakthrough, whatever they did that enabled them to potentially beat Google with an essentially unlimited context window or 3.5 million words or five times as much as Google's latest Gemini.


I'm not sure if the comparison was with the latest Gemini 1.5 Pro, but even if they only slightly surpassed it, these guys invested like 100 million.


That is no small amount, okay?


That is a really, really huge amount.


This is definitely something crazy, okay?


We have to take a look at, even if it's not the context window, we have to take a look at this, okay?


Because this is the crazy part, okay?


Like I said, okay?


They claimed, okay, a breakthrough that could enable active reasoning capabilities similar to the Q-Star model developed by OpenAI, according to a person familiar with its technology, and that could help solve one of the major gripes with Large Language Models, which is that they mimic what they've seen in their training data rather than using logic to solve new problems.


As for how Magic develops its LLM, this person said it took some elements of transformers, a type of AI that powers consumer products like ChatGPT and coding assistants like Copilot, and fused them with other kinds of deep learning models.


And that is something that we'll be exploring later because different architectures are something that people haven't realized are a real, real thing, okay?


The transformer architecture has just dominated the space since it was invented, and of course, it is something that is now being challenged by a few different ones.


So essentially, what we also have here is the using logic to solve new problems.


Now, why does active reasoning change the game?


Well, essentially, with logical problem solving, active reasoning involves the AI system engaging in a form of logical reasoning or deduction to solve problems.


This means that the system can, in theory, apply principles of logic to come up with solutions to problems it hasn't explicitly been trained on by understanding the underlying relationships and rules.


And this actually does go beyond pattern matching.


So instead of relying solely on statistical patterns in the data it was trained on, a system capable of active reasoning would be able to infer new information or make predictions based on logical deductions.


This capability would allow it to essentially think more like a human in terms of applying general principles to specific and unseen scenarios.
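To illustrate the difference being described, here is a toy sketch that contrasts looking up a memorized answer with deducing one from rules. It says nothing about how Magic's or OpenAI's systems actually work. The facts, rules, and the `MEMORIZED` table are made up for illustration.

```python
# Hypothetical rules: if all premises hold, the conclusion holds.
RULES = [
    ({"rainy"}, "wet_ground"),
    ({"wet_ground", "freezing"}, "icy_ground"),
]

# Hypothetical "training data": situations seen before, with their answers.
MEMORIZED = {
    frozenset({"rainy"}): "wet_ground",
}


def lookup(situation):
    """Pattern matching: only answers situations that were seen exactly."""
    return MEMORIZED.get(frozenset(situation))


def forward_chain(facts, rules):
    """Deduction: keep applying rules until no new facts can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


new_situation = {"rainy", "freezing"}
print(lookup(new_situation))                               # never seen, so None
print("icy_ground" in forward_chain(new_situation, RULES))  # derived in two steps
```

The lookup fails on a combination it has never seen, while the rule-based version chains two rules together to reach a conclusion that was never stored anywhere.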


Now, what's also cool about this is that active reasoning also implies the ability to dynamically update and adapt to new problems and situations by applying learned concepts in novel ways, not just recalling or recombining information from the training data.


So that right there, guys, that dynamic adaptation, actively reasoning and adapting to new problems and situations is a kind of intelligence that only humans have possessed.


And the difference is that the current LLM capabilities are that they're essentially pattern recognition and generators.


LLMs primarily operate by recognizing patterns in vast amounts of text data and generating responses based on statistical likelihood.


So they excel at producing text that is coherent and contextually appropriate based on the examples they have seen during training.


And essentially, they mimic human-like responses.


LLMs can generate responses that mimic human-like text across various domains and styles.


However, their understanding is a little bit limited to correlating input with similar context they've encountered in their training data without true comprehension.


And of course, they do have limited deductive reasoning.


So essentially, while LLMs can sometimes appear to be reasoning, their process is more about matching patterns than actual logical deduction.


And they can struggle with certain tasks that require genuine understanding, causality, and complex logical inference, especially if those tasks are not well represented in their training data.


So the dynamic adaptation, the beyond pattern matching, being able to infer new information or make predictions based on logical deductions, and actively apply principles of logic to come up with new solutions or solutions to problems it hasn't been explicitly trained on by understanding underlying relationships and rules is definitely a true game changer.


And this is something that we've seen over the internet from the release of Gemini 1.5 Pro.


Because essentially what people are now able to do is they're able to now solve a lot of long-form problems.


If you have 30,000 lines of code versus just I guess you could say like 500 lines of code or even like 10 or like 100 lines of code, we're able to solve vastly different problems.


You're able to essentially get a lot more from when an AI system kind of understands all of that text and is able to digest all of that.


It just completely changes the game and it leads us more towards a human in terms of how a human thinks due to that active reasoning combined with that as well.


So that is why this is going to be a complete game changer.


Now, something that I did want to dive into that I did find kind of fascinating was the fact that Magic actually talked about proprietary architecture.


Okay, so here's this kind of like small presentation thing from Magic and this is about to get pretty crazy.


It says, Magic is working on frontier-scale code models to build a co-worker, not just a copilot.


And it says, Things we believe: code generation is both a product and a path to AGI.


And it says, AGI safety matters and is solvable.


And it says, To build a great AI product, we need to train our own frontier-scale model, which is essentially what they're doing.


And the last point, and the top point, are going to be essentially points why I'm going to be talking about this.


And the last point is something that I do want to touch on, is that they state that transformers aren't the final architecture.


We have something with a multi-million token context window.


So that is something that is pretty crazy.


Now, here's the thing.


This tweet, I read it at first and I kind of saw it and was like, It doesn't really mean anything.


And then I kind of read it again, and then I was like, oh okay, this is actually bigger than I thought.


So Nat Friedman, the guy who invested 100 million dollars, stated that Magic.dev has trained a groundbreaking model with many millions of tokens of context that performed far better in our evals than anything we've tried before.


Okay, I want you guys to see what he said there.


Okay, this thing performed far better in our evaluations than anything we've tried before.


Okay, and although he did tweet this before Google's Gemini 1.5 Pro release, him saying that it's far better in our evals than anything we've tried before is something that is quite shocking because he's not saying it's slightly better, he's saying that it's far better.


Okay, he's saying they're using it to build an advanced AI programmer that can reason over your entire code base and the transitive closure of your dependency tree.


And if this sounds like Magic, well, you get it.


Okay, so essentially he's stating that he was so impressed that we are investing 100 million dollars in the company today.
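If "the transitive closure of your dependency tree" sounds abstract, it just means everything your code depends on, plus everything those things depend on, and so on. Here is a minimal sketch of computing it. The package names and the `DEPENDENCIES` graph are hypothetical, and this is not anything Magic has published.

```python
# Hypothetical dependency graph: each package maps to its direct dependencies.
DEPENDENCIES = {
    "my_app": ["web_framework", "database_client"],
    "web_framework": ["http_parser"],
    "database_client": ["socket_pool"],
    "http_parser": ["socket_pool"],
    "socket_pool": [],
}


def transitive_dependencies(graph, root):
    """Return every package reachable from root, directly or indirectly."""
    seen = set()
    stack = list(graph.get(root, []))
    while stack:
        package = stack.pop()
        if package in seen:
            continue  # already visited, which also guards against cycles
        seen.add(package)
        stack.extend(graph.get(package, []))
    return seen


closure = transitive_dependencies(DEPENDENCIES, "my_app")
print(sorted(closure))
```

Here `my_app` directly depends on two packages, but the closure contains four. For a real project that set can be far larger than the code base itself, which is why reasoning over all of it at once needs a very long context window.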


So I think that that is pretty insane that they saw something in that pitch deck.


Okay, these guys were working on their products and whatever, and they saw something that they tried and they were like, this is so crazy, we are putting 100 million dollars into it.


This, you have to understand that these guys are going up against Microsoft's GitHub Copilot, okay, and essentially something that is backed by OpenAI, and this guy put 100 million dollars into it.


So he's basically betting that what they have is really, really good.


And like I said, the best way to really understand if someone is really, really on the ball, or I guess you could say backing their position, is money, okay, and money does talk, and these guys are putting in 100 million dollars.


They're putting the money where their mouth is, and they're saying, look, we think this thing's so good, we're gonna put 100 million dollars of our own money in there.


Not just 10, not just 20, not just 30. 100 million dollars is quite a lot, and that's a pretty, pretty big thing.


Okay, they're saying, better in our evals than anything we tried before, that is pretty crazy.


So, the next point is, of course, here, and I wondered, okay, I was wondering for some time, okay, and of course, this is just pure speculation, and of course, I don't know, but I'm guessing that potentially they maybe could have used this architecture.


Okay, so if you don't know what this is, okay, this is Mamba.


Now, essentially, around two months ago, there was this paper, okay, Mamba: Linear Time Sequence Modeling with Selective State Spaces, and essentially, this was touted to be a replacement for transformers with an exceptional performance on long context windows.


So, it's not a direct replacement for transformers, but it is an alternative architecture that addresses some of the inefficiencies found in the transformer models that power ChatGPT and all of the other LLMs that we know.


Okay, so Mamba uses state space models, which are SSMs, to achieve linear time complexity for input computation, which can particularly be beneficial for processing long sequences efficiently.


Now, it's been shown to actually outperform transformers in inference speed and efficiency, especially on larger context sizes.
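To give a feel for why a state space model scales linearly, here is a heavily simplified sketch. It is not the Mamba implementation: it uses a single scalar state with fixed parameters `a`, `b`, and `c`, all chosen for illustration, whereas Mamba makes these parameters depend on the input (the "selective" part) and uses a hardware-aware scan. The attention count is just the number of token pairs a causal transformer compares.

```python
def ssm_scan(inputs, a=0.9, b=1.0, c=1.0):
    """Run a one-dimensional linear recurrence: h = a*h + b*x, y = c*h.

    One constant-cost update per token, so total work grows linearly
    with sequence length and the state stays a fixed size.
    """
    h = 0.0
    outputs = []
    for x in inputs:
        h = a * h + b * x
        outputs.append(c * h)
    return outputs


def causal_attention_pairs(n_tokens):
    """Number of (query, key) pairs in causal self-attention: n * (n + 1) / 2."""
    return n_tokens * (n_tokens + 1) // 2


print(ssm_scan([1.0, 0.0, 0.0], a=0.5))

for n in (1_000, 10_000, 100_000):
    print(f"{n:>7} tokens: {n:>7} recurrence steps "
          f"vs {causal_attention_pairs(n):,} attention pairs")
```

Going from 1,000 to 100,000 tokens multiplies the recurrence steps by 100 but the attention pairs by roughly 10,000, which is the intuition behind the efficiency claims for long sequences.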


And it's important to know that while Mamba has demonstrated impressive performance on language modeling and tasks involving audio and DNA sequences, it's not superior in all aspects.


That's why I said I'm not really sure.


For example, there was actually a study from the Kempner Institute at Harvard University that showed that transformers are better than Mamba at tasks that involve copying and retrieval from the input context, such as few-shot learning and retrieval tasks that are common in foundation models.


However, Mamba models do particularly well against transformers on tasks that involve processing long sequences efficiently and in scenarios where computational efficiency is crucial.


And the architecture of Mamba, which combines elements of state space models, SSMs, and recurrent neural networks, allows it to excel in several specific areas.


So it can excel in language modeling.


It's shown impressive performance in language modeling tasks surpassing similarly sized transformers and even competing on par with transformers that are twice its size in both pre-training and downstream evaluation tasks.


And of course, long sequences.


This thing can handle long sequences like a champ due to its efficient sequence modeling technique.


Mamba is actually better suited for tasks that require processing information over extended sequences.


And this is actually attributed to its ability to scale linearly with sequence length, making it particularly beneficial for applications where long context sizes are involved, like coding.


And it also demonstrates exceptional performance across varied domains.


Of course, in audio and genomics, like we already talked about.


And of course, it actually does address the computational limits in long context scenarios with transformers.


And another thing that it actually does have is, of course, in-context learning.


So while Mamba matches the performance of transformer models for in-context learning, it's particularly noted for scaling well with the number of in-context examples.


And this actually suggests that Mamba maintains a considerable performance edge in scenarios where leveraging context information is crucial for task performance.


So it's clear that whatever kind of architecture that these guys do have, because they said that transformers aren't the final architecture, and we have something with the multi-million token context window, it could be Mamba.


I'm not entirely sure if it is this thing.


I mean, it wouldn't surprise me, but then again, of course, this thing does have some limitations.


There's not that much of a good ecosystem.


But then again, I do find it kind of crazy that this paper was released two months ago.


And then all of a sudden Google comes out with, I don't know, I think Google is still working with transformers, but all of a sudden Google comes out with a 10 million token context window, and then these guys come out with something that's got a multi-million token context window, and it's pretty much unlimited.


I'm wondering if they're using Mamba to do any of this, or if they're just using a completely different architecture that they've developed, combined with the essence of LLMs.


I'm not entirely sure what architecture they're working on, but I think that once the new architecture does get I guess you could say, into the wider community because of course these guys are a private company, they're going to try and protect whatever it is that they have, whatever proprietary architecture that they're using.


I think it will be kind of fascinating and of course, essentially the guy who's, so now here we have the CEO of the Magic company, okay, the Magic AI labs, and essentially he states that we are writing code on a mission to build a safe super intelligence.


So it's clear that his goal is super intelligence and in the article it states that Magic's co-founder and CEO Eric Steinberger has grappled with the problem of getting AI models to reason before.


He previously worked at Meta Platforms, conducting research on how reinforcement learning, the machine learning technique behind much of the great performance of OpenAI's LLMs, can help AI models find the optimal solutions to problems even with imperfect information.


And his ambition is bigger than a coding co-worker.


Remember this company's goal, okay, is to develop AI super intelligence the same way that Google do, and that is the kicker, guys.


So the fact that they've made a breakthrough that's very similar to Q-Star and the fact that they are working on super intelligence is pretty, pretty incredible because I think the fact that they're both heading in the same direction means that they're eventually going to stumble across the same roadblocks and eventually they're going to get across some of the same roadblocks that they do now.


This has some real, real ramifications, and one of the things I do want to know is, of course, what is the product? Because essentially they say some of Friedman's former colleagues at GitHub have joined him at Magic, and they include Max Schoening, Vice President of Design at GitHub, as well as some other GitHub designers.


According to a person with knowledge of the hires, they'll likely be crucial to developing the company's first commercially available product, which I'm hearing is set to be released in the next few months.


So I'm guessing that potentially what we're going to be having is something that blows I'm guessing GitHub Copilot out of the water.


And think about it like this, guys.


If these guys, okay, actually let's say, for example, these guys actually did this, okay.


So they have something that has active reasoning, which is similar to Q-Star.


Remember that at OpenAI there was this whole debacle about Q-Star and Sam Altman getting fired, which we're going to dive into in a moment.


But if they have something that has active reasoning, something that has an active unlimited context window, something that dwarfs Google's Gemini 1.5 Pro, and let's say it's so good.


And these guys, they said it performed better in their evals than anything that they've tried before.


I think that if they release that product and that product is better than GitHub's Copilot, which I think potentially it's going to be, I think that we have a situation on our hands.


Because if that product just blows it out of the water, remember, GitHub is owned by Microsoft.


I'm guessing that Copilot is running on OpenAI's models.


We're going to have a problem because what will happen is these guys will release their product, they're going to take the industry by storm.


And then OpenAI are probably going to release GPT-5, or maybe an even more advanced version, because, well, they don't want to lose the race, because everyone knows about ChatGPT.


And these guys are developing their own proprietary frontier model, which they stated that they will. Remember, they stated they're going to release their own proprietary model, and their goal is to build superintelligence.


It's not just a coding buddy, you know.


Okay, it says, you know, to build a great product we need to train our own frontier-scale model, and transformers aren't the final architecture.


So we have something with a multi-million-token context window that now apparently has active reasoning, which means, okay, that the race could be on, guys.


The race could be on, this could be an insane race.


Okay, and this is why I'm stating that this could be absolutely incredible.


Now, if you remember, Q-Star was pretty crazy, because the day before Sam Altman was fired, okay, he alluded to a technical advance the company had made that allowed it to push the veil of ignorance back and the frontier of discovery forward.


Okay, and there was an interview where he said that.


And of course, with Q-Star, the story was essentially that OpenAI made a breakthrough before Altman's firing, stoking excitement and concern.


I know that a lot of people were essentially wondering if this leak was true, but essentially it was because Sam Altman did comment on it himself.


And I think people forget that OpenAI has around 770 employees.


I think it was around 740 of them that signed the letter.


And when you think about a company of that size, I don't think it's impossible or implausible for two people to essentially go to the board and say, "Look, this is crazy."


Because something that you might not know is that, I'm guessing that now OpenAI is compartmentalized because when they released Sora, I'm not even sure that the entire company knew about it.


Okay, because I remember some of the employees tweeting about it, saying that, Wow, I saw some of the demos from Sora today.


This thing is absolutely incredible.


So really and truly, it could be something.


Some people are asking, how come Q-Star wasn't leaked by the entire OpenAI team?


Guys, OpenAI is compartmentalized. An organization where pieces of information are kept separate to prevent leaks is typically referred to as employing a compartmentalization strategy.


And essentially, this approach involves dividing up the organization into discrete sections or compartments where information is tightly controlled and only shared on a need-to-know basis.


And it's used in places like the military, intelligence agencies, and some corporate environments to enhance security and to protect sensitive information.


Of course, with what we're doing now, all these kind of breakthroughs need to be protected.


So I'm not surprised that OpenAI would have such a strategy because it does make sense.


And like I said, if they're doing that, okay, then I'm guessing that the leaks are definitely possible because it's a giant company.


Okay we don't really know who it's going to be.


And even if they are compartmentalized, okay, and maybe it's like, it's just 100 people or 50 people or whatever, you're still not going to know who exactly did the leaks or whatever.


So I think that Q-Star is really crazy, because OpenAI, which is of course also trying to work on superintelligence, may have made some real, real strides there.


And essentially, as it says here, it was an innovation by the company's researchers earlier this year that would allow them to develop far more powerful AI models.


And of course, there were concerns among some staff that the company didn't have proper safeguards in place to commercialize such advanced AI models, this person said.


So of course, this Q-Star innovation, which was essentially able to solve math problems it hadn't seen before, an important technical milestone, is something that will change the game whenever it does come out.


Now, of course, there were some more bits on Q-Star.


And then I'm going to get onto a really, really, really big issue that I don't think enough people are talking about.


And whilst these developments are good, of course, there is a lot of stuff that is unfortunately quite bad.


Okay, so essentially, remember, okay, that OpenAI said that while superintelligence seems far off now, we believe that it could arrive this decade.


And that means that we could get ASI by 2030.


I mean, the fact that OpenAI is saying that superintelligence seems far off, but could arrive this decade, is not surprising, because apparently some people have stated that once you get AGI, it's not long before you get ASI.


So in addition, there was also this: earlier this year, Sutskever and his team discovered a variation of that method that prompted greater results in their efforts to train more sophisticated models.


And of course, essentially, OpenAI is dedicating a fifth of its compute to solving superintelligence alignment.


And essentially, one last thing that I'm gonna cover is something that I want to talk to you guys about because this is really, really important and not enough people are talking about it.


And essentially, there is this Moloch concept, okay, and Moloch has come to signify a condition in which we humans are coerced to make futile efforts and compete with each other in such ways that we are eventually driven to our demise.


And this is really true.


And if you think I'm just adding this to the video for the sake of it, trust me, I'm not.


You guys are gonna want to see this because this could spell disaster.


So essentially, Liv Boeree, I'm not exactly sure how you say her name, but she actually did a recent TED talk about the Moloch problem.

And it's a really big problem that's only going to get worse as things go on because as systems become more powerful, we need more security.


But of course, as they become more powerful, people will be deploying them even more.


So I'm going to show you guys a clip from this TED talk, because it's actually really important to understand this issue. I know some people are really happy about AI innovation, but the existential risk is there.


Like literally, like 40% of AI researchers say that we should slow down with AI research.


And of course, that's because of the clear fact that superintelligence poses a really, really serious risk.


So I'm going to show you guys the clip.


Those influencers are sacrificing their happiness for likes.


Those news editors are sacrificing their integrity for clicks.


And polluters are sacrificing the biosphere for profit.


In all these examples, the short-term incentives of the games themselves are pushing, they're tempting their players to sacrifice more and more of their future, trapping them in a death spiral where they all lose in the end.


That's Moloch's trap, the mechanism of unhealthy competition.


And the same is now happening in the AI industry.


We're all aware of the race that's heating up between companies right now over who can score the most compute, who can get the biggest funding round or get the top talent.


Well, as more and more companies enter this race, the greater the pressure for everyone to go as fast as possible and sacrifice other important stuff like safety testing.


This has all the hallmarks of a Moloch trap, because, like, imagine you're a CEO who in your heart of hearts believes that your team is the best to be able to safely build extremely powerful AI.


Well, if you go too slowly, then you run the risk of other much less cautious teams getting there first and deploying their systems before you can.


So that in turn pushes you to be more reckless yourself.


And given how many different experts and researchers both within these companies but also completely independent ones have been warning us about the extreme risks of rushed AI, this approach is absolutely mad.


Plus, almost all AI companies are beholden to satisfying their investors, a short-term incentive which over time will inevitably start to conflict with any benevolent mission.


And this wouldn't be a big deal if this was really just toasters we're talking about here, but AI and especially AGI is said to be a bigger paradigm shift than the agricultural or industrial revolutions, a moment in time so pivotal it's deserving of reverence and reflection, not something to be reduced to a corporate rat race of who can score the most daily active users.


I'm not saying I know what the right trade-off between acceleration and safety is, but I do know that we'll never find out what that right trade-off is if we let Moloch dictate it for us.


So in that clip, essentially, she's talking about the problem of how things are just developing too quickly.


And I think this goes to show why this is such a serious issue, because if this smaller company has pretty much got the same kind of Q-Star technology that OpenAI had, think about it from OpenAI's side.


Okay, with GPT-4, they safety tested it for six months, and maybe with this AGI-level system they're going to have to safety test it for a year and a half.


But what if this other company, like they said, is deploying its system within a couple of months?


Is that going to lead OpenAI to essentially, you know, just forget about the guardrails and then deploy their system?


Are we going to have some major ramifications of that in the future?


So essentially, it's a problem of, I guess you could kind of say a race to the bottom.
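To make that race-to-the-bottom idea concrete, here's a minimal sketch of the Moloch trap as a two-player game. The payoff numbers are completely made up for illustration (they're not from the TED talk or from any company), and the lab choices "careful" and "rush" are just labels assumed for this example.

```python
# Hypothetical payoffs (higher is better) for two AI labs that each choose
# to be "careful" or to "rush". The numbers are invented for illustration.
# Each value is (payoff for lab A, payoff for lab B).
PAYOFFS = {
    ("careful", "careful"): (3, 3),
    ("careful", "rush"): (0, 4),
    ("rush", "careful"): (4, 0),
    ("rush", "rush"): (1, 1),
}
CHOICES = ("careful", "rush")


def best_response(other_choice: str) -> str:
    """Return lab A's highest-payoff choice given what lab B does."""
    return max(CHOICES, key=lambda mine: PAYOFFS[(mine, other_choice)][0])


for other in CHOICES:
    print(f"If the other lab plays {other!r}, best response: {best_response(other)!r}")

print("Both rush:   ", PAYOFFS[("rush", "rush")])
print("Both careful:", PAYOFFS[("careful", "careful")])
```

With these assumed numbers, rushing is the best response no matter what the other lab does, yet both labs rushing leaves each of them worse off than both being careful. That's the trap the clip is describing.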


And this chart here is from ARK Investment Management, and they've talked about how every year, with these breakthroughs, the forecast is like a stock that just keeps dropping until we get to AGI.


And you can see right here that with GPT-3, it went from 50 years all the way down to like 40.


Then you can see Google's advanced conversational agent, LaMDA 2, boom, we got down to 18 years.

ChatGPT, boom, went down again.


GPT-4 launches, boom, now it's eight years.


Like with this Q-Star breakthrough, are we just going to drop down again, boom, like are we going to be like four years away?


You can see that the forecast they had was like, okay, we're going to be getting there by 2030.


But the chart is basically saying that, with all these kinds of breakthroughs, if the forecast error continues, we get there sooner than that.


And you remember that ChatGPT was like, I wish I could like show you guys how much more AI stuff went on here because there has just been so much more engagement in terms of the AI uptick.


So right here is like an acceleration point.


So we could argue that this is going to come straight down, which means that's this year.


If you actually think about it, guys, like if this point right here is exponential growth, that's going to come down into this year, you know what I mean?


So that's something that's not surprising.


And if their forecast error continues, which to be honest, humans are very bad at extrapolating exponential growth, I wouldn't be surprised if, of course in 2025, something crazy happens.


So I mean, with all of these companies going on, and I think something that you guys do want to know as well, I mean, that's kind of weird, but OpenAI did actually talk about this.


They actually said that, in a sense, we essentially won't get superintelligence, and of course we kind of will, but what they said about their safety testing is basically that they specify four safety risk levels, and only models with a post-mitigation score of medium or below can be deployed.


Essentially what they mean is that only a model whose risk, after mitigations, is at or below a certain level is going to be deployed.


And if they have a system that they believe is too smart, they're just not going to put that into the wild.
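Just to show how simple that kind of deployment rule is in principle, here's a rough sketch of it. This is not OpenAI's actual tooling: the level names, the category names, and the idea of taking the worst score across categories are all assumptions made for this example. The only parts taken from what they said are "four risk levels" and "medium or below can be deployed."

```python
from enum import IntEnum


class RiskLevel(IntEnum):
    """Four ordered risk levels; the names are assumed for this sketch."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


def can_deploy(post_mitigation_scores: dict[str, RiskLevel]) -> bool:
    """Return True only if the worst post-mitigation score is MEDIUM or below.

    Assumes at least one category is scored.
    """
    worst = max(post_mitigation_scores.values())
    return worst <= RiskLevel.MEDIUM


# Hypothetical scorecards; categories and scores are made up for illustration.
cautious_model = {"category_a": RiskLevel.LOW, "category_b": RiskLevel.MEDIUM}
risky_model = {"category_a": RiskLevel.LOW, "category_b": RiskLevel.HIGH}

print(can_deploy(cautious_model))  # True
print(can_deploy(risky_model))     # False
```

The point of the sketch is that a single category scoring above medium is enough to block deployment, which is the "too risky, so it stays in the lab" behavior being described.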


But this is the thing, this is OpenAI's safety scorecard, this is their safety mitigation.


What are other companies going to do?


Are the other companies going to abide by this?


And if they don't, I mean, you know what's going to happen to the world.


I mean, it's going to be a crazy place to be living in, because so far they're basically saying AGI by 2026, and we're now in 2024.


That's a year and a half away.


And I remember when people were stating that AGI in 18 months was crazy.


Now it's seeming like it's realistic.


So I mean, with the amount of investment going into these companies, with the amount of things that are happening with Q-Star breakthroughs only a couple months ago, with active reasoning capabilities apparently recently discovered, with this huge context length just being there, with new architectures popping up, and with these massive investments, and these guys saying that this is better than anything we've seen before, I have no idea what they've developed.


But either way, I'm excited, scared, frightened.


I mean, so many words to describe such groundbreaking technology.


And I'm excited to see it evolve.


So what do you guys think about this?


Do you guys think this is crazy?


Do you think it's boring?


Do you think this is a good update?


Let me know what you guys think.


This has been an insane February, and I'll see you guys for another update tomorrow.

