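Mustafa Suleyman, co-founder of Inflection AI, has decided to join Microsoft as CEO of Microsoft AI. He was one of the founders of Google DeepMind and left Google last year to start Inflection AI, whose chatbot "Pi" was highly regarded. Microsoft did not acquire Inflection AI, but it agreed to pay $650 million to hire its staff. The move further accelerates the competition between Google and Microsoft in the AI race. At the NVIDIA GTC conference, the next-generation Blackwell GPU was unveiled, with a focus on training AI and robotics. Innovations such as digital twin technology and the new AI platform "GR00T" were also introduced. These technologies make it possible to train large language models more efficiently and at lower cost. Meanwhile, Elon Musk open-sourced Grok-1, and Apple is reportedly exploring partnerships with Google and OpenAI. There were also various other developments in the AI industry, including the departure of key researchers from Stability AI and a change to Midjourney's terms of service.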

There's actually been quite a bit of AI news that came out this week, but I just got back from the NVIDIA GTC conference, so I've been playing a little catch-up to keep you up to speed.


But I want to start with what I found to be probably the most shocking news of this week, and that's that Mustafa Suleyman, this guy right here on the right who founded Inflection AI with Reid Hoffman, just decided to join Microsoft as the CEO of Microsoft AI.


If you're not familiar with the name Mustafa Suleyman, it's kind of interesting and kind of shocking, and here's why.


Suleyman was one of the original founders of Google DeepMind, well, DeepMind before Google bought it.


Last year, he left Google DeepMind to start Inflection AI, which you may be familiar with if you've ever used the Pi chatbot.


Pi was made by Inflection AI, and in my opinion, it was a pretty good chatbot.


It's one of the best ones out there for having a voice-to-voice conversation on your phone.


It's just this really interesting turn of events: somebody who started at Google left to start their own company and then ended up at Microsoft. And right now, if you're paying attention, Google and Microsoft are pretty much going head to head to try to be the winners of the AI race.


While it doesn't sound like Inflection AI is completely dead, it does sound like they're going to stop progress on the Pi chatbot.


Microsoft didn't technically acquire Inflection.


However, Microsoft did agree to pay Inflection $650 million to hire its staff.


Some people are speculating that Microsoft actually structured it this way to avoid a lengthy review from antitrust regulators.


And if you invested in Inflection, well, the deal cushioned the blow for investors.


Inflection has promised to pay them more than the value of their original investment while allowing them to retain equity in the startup.


Anyway, it's just a really shocking move. Somebody who was basically with Google for a long time left to go start their own thing, presumably because Google just didn't move fast enough and put too many restrictions on the growth of their platform. Then they decided to go to Microsoft, which I would imagine has a lot of the same red tape, but also seems to be kind of the front-runner in the race for AI right now.


Crazy story in my eyes.


I think part of the fun of this whole AI world is seeing all of this drama unfold among all of the key figures in the AI world, sort of playing chess against each other, but like stealing each other's pieces also.


But the majority of the news this week came out of NVIDIA's GTC conference, the conference I just mentioned that I just got back from.


And if you're not familiar with the GTC conference, some have called it Woodstock for AI nerds.


When Jensen Huang gave his first keynote of the event, he did it in the same arena where the San Jose Sharks play, an arena that holds around 17,000 people.


And looking around the arena, it didn't look like there were any empty seats.


It was nuts.


It felt like a rock concert all for Jensen Huang to give a keynote.


During the keynote, Jensen unveiled the next-gen Blackwell GPU.

This new GPU has up to a 30 times performance increase for LLM inference.


And these new chips will enable organizations everywhere to build and run real-time generative AI on trillion-parameter large language models at 25 times less cost and energy consumption than its predecessor, Hopper. You may have heard of the GH100s and GH200s.


Those were the Grace Hopper GPUs, which preceded these new Blackwell GPUs.

I don't want to get too in the weeds with this, I just want to kind of give you the high level, but essentially these new chips are going to make it more energy efficient, but also faster at the same time, to train larger large language models.


The faster and cheaper these large language models are to train, the bigger and bigger these models can get, and the more compute can be pushed into them to get better results.


A big theme at NVIDIA's GTC was training robots as well, and they announced a new AI platform called GR00T.


GR00T is a general-purpose foundation model for humanoid robots, so it's a model that can be fine-tuned by the various robot developers and is specifically optimized for robotics.


There was a term that Jensen Huang of NVIDIA was absolutely obsessed with.


This is a term you're probably going to start hearing a lot of over the coming months and years, and that is the concept of a digital twin.


And a digital twin is essentially a virtual environment that is designed to mimic the real world environment that these AIs will be operating in, but you can sort of simulate and test things out in this digital twin, this virtual environment before bringing them into the real world.


And as part of this announcement, they announced Earth 2, which is a digital twin of the entire Earth.


Earth 2 offers groundbreaking APIs designed to simulate and visualize weather and climate at an unprecedented scale, paving the way for more accurate forecasts and timely warnings.


The idea being that we can get way ahead of things like hurricanes, tornadoes, earthquakes.


Really, with any extreme weather, we'll be able to predict it much further in advance using these simulations inside of the digital twins, so that we can do a better job of preventing and/or preparing for these events when they do happen.


And another note during Jensen's speech, he talked about quantum computing.


Jensen claims that NVIDIA is the largest quantum computing company on the planet that doesn't own a quantum computer.


And he said that because they can actually create a digital twin of a quantum computer.


They can create an emulator, a simulation of a quantum computer and effectively test things that a quantum computer would output, but using this emulator of a quantum computer.


This concept to me is still a little vague because I am not super well versed on quantum computing.


It is actually a rabbit hole I am planning on diving down in the coming weeks and months.


It's something I want to learn more about.


I will turn around and talk about it on videos because if I learn about quantum computing, obviously I want to turn around and share about quantum computing.


But the way Jensen described it was that they can build a quantum computing emulator.


And I feel like if that's true, then what's the point of spending all the time and money on real quantum computers?


I need to get more clarity on that and better understand it myself, but it is a fascinating concept that I'm excited to dive deeper on.


I'm old school.


I brought a little teeny notepad with me because I don't like carrying around big backpacks when I go to those events, and I took some notes during the keynotes.


And there were three key terms that kept on coming up through a ton of the different presentations that I sat in on, during Jensen's keynotes, and during the private press Q&A that I got to sit in on with Jensen.


And those three keywords are digital twin, synthetic data, and multimodality.


Those three keywords.


Listen for those this year.


You're probably going to hear them a lot.


Digital twin refers to a sort of clone of something that's in the real world, but a simulated version where you can test and experiment within the simulation before bringing things into the real world.


You've got synthetic data, which is data that was essentially created by computers, not created by humans, and then trained into the AI.


For example, synthetic data might be like a video simulation. Sora may have trained on a whole bunch of publicly available video, but then people may have also created video inside of Unreal Engine that looks realistic but wasn't real, and then used that as part of the training as well.


Well, the Unreal Engine generated video would be technically considered synthetic data because it wasn't created in the real world.


It was created within this game engine and then fed into the AI to let the AI think this was the real world.


And the third is multimodality, a term I'm sure you're probably already familiar with.


I just think we're going to see it used more and more and more as more advanced language models get released.


But multimodality basically means we're going to see these language models not only operate with text, but also operate more with audio, with video, with images, with various modalities of media.


Those are the three big terms, the big takeaways from NVIDIA's GTC.


Those are things we're probably going to see a lot of this year.


Digital twins, synthetic data, and more multimodality within the Large Language Models.


This GTC event was also special for me because I got to hang out with some of my favorite creators as well.


I got to spend time with the other AI Matt, MattVidPro; my good buddy Bilawal Sidhu, who I seem to keep bumping into at multiple events; Pete Huang, one of the creators of The Neuron AI newsletter with over 400,000 subscribers; Igor Pogany, the creator of The AI Advantage YouTube channel; and Mariya from Python Simplified.

I also got to hang out with a whole bunch of other creators in the AI space.


And it was so much fun to talk and nerd out about that overlap of AI and tech and YouTube.


Also this week, Elon Musk made good on his promise and open-sourced Grok-1.


By open-sourcing Grok-1, he released the largest open-source model publicly available.


It's a 314-billion-parameter model, and it uses a mixture-of-experts architecture, which I've talked quite a bit about in previous videos.
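For anyone who wants a feel for what "mixture of experts" means, here is a minimal sketch in plain Python. It is a toy under stated assumptions, not Grok-1's actual code: the four experts, the gate vectors, and the choice of routing each input to the top two experts are all hypothetical values picked for illustration. The idea it shows is that a small gate scores every expert, only the best-scoring few actually run, and their outputs are blended by weight.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def moe_forward(x, experts, gate_weights, top_k=2):
    """Route input x through the top_k best-scoring experts only."""
    # Gate score per expert: dot product of x with that expert's gate vector.
    scores = [sum(w * xi for w, xi in zip(gw, x)) for gw in gate_weights]
    # Keep the indices of the top_k experts; the rest are never evaluated.
    top = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:top_k]
    weights = softmax([scores[i] for i in top])
    out = [0.0] * len(x)
    for weight, idx in zip(weights, top):
        y = experts[idx](x)
        out = [o + weight * yi for o, yi in zip(out, y)]
    return out, top


# Hypothetical toy experts: each one just scales the input by a constant.
experts = [lambda v, k=k: [k * vi for vi in v] for k in (1.0, 2.0, 3.0, 4.0)]
# Hypothetical gate vectors, one per expert.
gate_weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, -1.0]]

output, chosen = moe_forward([1.0, 2.0], experts, gate_weights)
print("experts used:", chosen)
print("blended output:", output)
```

The point of the design is that only `top_k` experts do any work per input, so a model can hold a very large number of parameters while spending compute on a fraction of them each time.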


It was released under the Apache 2.0 license, meaning that you can build off of it, iterate, sell it, use it commercially, pretty much do whatever you want with it.


We did learn when he open sourced it that it was 314 billion parameters.


That was something we hadn't previously known.


We also didn't know that it used that mixture of experts model.


The size of this model still makes it pretty difficult for most people to run on a local machine, but if you wanted to build around Grok-1, you'll be able to.
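To put that size in perspective, here is a rough back-of-the-envelope calculation. The 314 billion figure comes from the release; the bytes-per-parameter values (2 for 16-bit, 1 for 8-bit, 0.5 for 4-bit) are common storage assumptions, not details from the video, and the estimate covers the weights only, ignoring everything else needed at run time.

```python
PARAMS = 314e9  # parameter count mentioned for Grok-1


def weight_memory_gb(num_params, bytes_per_param):
    """Approximate memory needed just to hold the weights, in gigabytes."""
    return num_params * bytes_per_param / 1e9


# Assumed storage precisions; real deployments may differ.
for label, bytes_per_param in [("16-bit", 2), ("8-bit", 1), ("4-bit", 0.5)]:
    gb = weight_memory_gb(PARAMS, bytes_per_param)
    print(f"{label}: about {gb:.0f} GB for the weights alone")
```

Even under the most aggressive assumption here, the weights alone come out well beyond what a typical consumer GPU holds, which is why running it on a local machine is out of reach for most people.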


And also we'll likely start seeing a lot more other products, things like Perplexity and MindStudio and some of these tools that use APIs and open-source models, building this model into their platforms.

And another piece of pretty shocking news this week, Apple is reportedly exploring a partnership with Google for a Gemini powered feature on iPhones.


Again, these are just reports and rumors.


There's no actual confirmation from either of these companies that this is happening.


It's quite possible that at WWDC this year, we're going to get a huge update on Siri and maybe Gemini 1.5 will be the model under the hood for Siri.


Nobody really knows yet.


This article also says that the company has also held discussions with OpenAI to potentially use GPT models.


We don't totally know what Apple is going to use, but it's looking fairly likely that the first iteration of AI inside of iPhones might not actually be AI developed by Apple, which is interesting because Apple has actually acquired more AI companies than any other company.


So fascinating that after acquiring all of these AI companies to help them develop AI tech in-house, they still might go use Google or OpenAI to power their AI on their phones and in their ecosystem.


This week, Stability AI introduced Stable Video 3D.

This is a generative model based on Stable Video Diffusion, advancing the field of 3D technology and delivering greatly improved quality and view consistency.


Basically, it'll generate what they call orbital videos based off of an input image.


You can see here's the input image.


Here's what Zero123-XL generated: different angles of this image.

Here's what Stable Zero123 did.


And here's what Stable Video 3D did.


You can tell that it seems like it's much more accurate than the previous models at taking a 2D image and sort of guessing what it looks like from other angles.


And then down here, we have some examples of other 3D objects that were generated with this platform.


Right now it's available for non-commercial use, but the model weights are available over on Hugging Face.


And if you do want to use it for commercial purposes, you can sign up for the Stability AI membership, which I believe is like 20 bucks a month to pretty much use any of their models.


And while we're on the topic of Stable Diffusion, here's a headline from this week: "Key Stable Diffusion researchers leave Stability AI as company flounders."

Basically the people that helped develop Stable Diffusion that worked at Stability AI have kind of all left.


Stability AI has been kind of in this weird spot lately where people have been calling out the CEO, Emad Mostaque, saying that the company's almost bankrupt and that he's making a lot of mistakes.

Recently, they sold Clipdrop, which is a company that they bought like less than a year ago.


They sold it to Jasper recently.


The people that helped develop Stable Diffusion, the thing that Stability AI is probably most known for, have kind of all left.

Emad Mostaque, the CEO, wasn't super involved in building Stable Diffusion, but he did have some of the researchers that worked on it there.

Some of the other researchers went to work for Runway.


Some of them were still at Stability AI.


It seems that almost nobody that helped build Stable Diffusion is still actually at Stability AI.


It kind of feels like their future is a little more uncertain now, but Stable Diffusion itself is open source.

Most of the weights that you use for it are publicly available.


Stable Diffusion isn't going anywhere, but who knows about the future of Stability AI?


We'll just have to watch and see how that one plays out. More drama to follow.


Midjourney this week made a tweak to their terms and conditions, basically saying that if you get sued for creating art with Midjourney, you're kind of on your own. A lot of the other companies have gone in the other direction.


Companies like Google, OpenAI, Adobe, have all basically said that if you get sued for using our platform, we'll actually help work with you.


Midjourney's new policies say this: "To the extent permitted by law, you will indemnify and hold us harmless, our affiliates and our personnel, from and against any costs, losses, liabilities, and expenses, including attorney's fees, from third party claims arising out of, or relating to, your use of the service and assets or any violations of these terms."


If you're using Midjourney and you generate an image that might have something copyrighted in it, that's on you.


Whereas if you use any of the other image generators, they might help protect you a little bit.


But Midjourney is saying, you use our service at your own risk.


I want to touch on this really quickly.


You may have seen a lot of other YouTube videos talking about Q-Star and this new Q-Star leak, but from everything I can tell, it's probably fake.


It seems very unlikely that anything we've seen recently has much merit.


Basically the leak claims that it uses energy-based models.


This approach is different from what most AI systems use today.


Instead of guessing the next word one at a time, energy-based models look at the whole response at once.


They try to find the answer that fits best with the question, like finding the missing piece of a puzzle.


This supposedly makes it easier for AI to understand and respond to complex questions, just like a human would.
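To make the "score the whole response at once" idea concrete, here is a toy sketch. It says nothing about how Q-Star works (the leak is unverified), and the `energy` function is a made-up stand-in: it simply counts how many words from the question are missing from a candidate answer. The only thing it illustrates is the general energy-based pattern of assigning each complete candidate a score and picking the lowest one, rather than building an answer one word at a time.

```python
def energy(question, response):
    """Hypothetical toy energy: lower means the response fits the question better.

    Counts the question words that do not appear in the response.
    """
    question_words = set(question.lower().split())
    response_words = set(response.lower().split())
    return len(question_words - response_words)


def pick_response(question, candidates):
    """Score every complete candidate and return the lowest-energy one."""
    return min(candidates, key=lambda r: energy(question, r))


question = "capital of france"
candidates = [
    "i like turtles",
    "france is large",
    "paris is the capital of france",
]

best = pick_response(question, candidates)
print(best)
```

A real energy-based model would learn its scoring function from data instead of counting words, but the selection step has this same shape.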


But again, what we've seen around Q-Star has all been pretty much total speculation.


Sam Altman did somewhat confirm that Q-Star exists and even kind of confirmed it again in a recent interview with Lex Fridman, but there's still no real accurate info around it.


Can you speak to what Q-Star is?


We are not ready to talk about that.


See, but an answer like that means there's something to talk about.


It's very mysterious, Sam.


We also got a few more details around Sora from both this Mira Murati interview with the Wall Street Journal, as well as the Sam Altman interview we just took a peek at.

One thing neither of these tech leads would say is what Sora was actually trained on.


What data was used to train Sora?


We used publicly available data and licensed data.


Videos on YouTube?


I'm actually not sure about that.


Videos from Facebook, Instagram?


You know, if they were publicly available, publicly available to use, there might be the data, but I'm not sure.


I'm not confident about it.


What about Shutterstock?


I know you guys have a deal with them.


I'm just not going to go into the details of the data that was used, but it was publicly available or licensed data.


We also got some slight hints as to when we might actually see Sora, although there's not really much to go on here.


You said eventually.


When is eventually?


I'm hoping definitely this year, but could be a few months.


There's an election in November, you think before or after that?


You know, that's certainly a consideration dealing with the issues of misinformation and harmful bias.


And we will not be releasing anything that we don't feel confident on when it comes to how it might affect global elections or other, other issues.


Some of the creators of Sora actually went on the MKBHD podcast recently, and Marques asked them when we can expect Sora.


I know you're not giving timelines, but you're in the testing phase now.


Do you think it's going to be in an available-for-public-use phase anytime soon?


Not anytime soon, I think.


Again, not really much to go off of. Mira says maybe within a few months, but they are taking elections into consideration.


The team behind Sora is saying not publicly available anytime soon.


Who knows?


And since we're on the topic of OpenAI, let's talk about the GPT store for a second.


It's filling up with junk apps.


I personally kind of saw this coming.


I made a tweet back when the GPT store came out that custom GPTs will be really cool for a few minutes, and then everybody's going to stop talking about them.


That kind of seems to be what happened, but don't get me wrong.


I do actually think custom GPTs are super powerful.


I've just personally never found a good reason to go use a custom GPT that somebody else has built.


I love building my own custom GPTs.


I love making a GPT where I give it my own custom data, my own custom system prompt, and basically create a chatbot that's designed to do specifically what I ask it to do.


But I've never really wanted to go and use other GPTs that other people have made.


I've tested some of them.


I've made past videos about them.


There are some good ones out there, but I find myself often just going into the GPT store, building my own GPT, and making custom-built workflows for things that I need in my business, as opposed to searching out GPTs that other people made.


And part of the reason is there's just a lot of junk GPTs in there.


A lot of them even have the same names and it's very difficult to sift through and find the ones you're looking for.


Yeah, right now, the GPT store has a lot of junk in it.


In other news, this week, Tennessee becomes the first state to protect musicians and other artists against AI.


They created a new act called the Ensuring Likeness, Voice, and Image Security Act, cleverly creating the acronym ELVIS.

This was an existing law that was already in Tennessee.


However, the old law protected an artist's name, photograph, and likeness.


The new legislation includes AI-specific protections, now prohibiting people from using AI to mimic an artist's voice without permission.


This to me seems kind of bizarre.


I understand why they're doing it.


However, a lot of artists just sound similar.


If my voice randomly sounds exactly like Drake's voice, and I went and made a song where I sang and I just happened to sound like Drake, are they going to be able to sue me because my voice sounds like his? It just seems very unclear and very much a gray area still.


And also it being a Tennessee law, how does that apply outside of Tennessee?


I don't know.


I do think we will see more and more laws pop up like this because artists don't want to be copied, but I also think it's going to be very difficult to enforce and not going to stop people from just creating music anyway and just not saying it was generated with AI.


If you're a YouTuber like me, you now have to disclose when your video was generated with AI.


However, there are some caveats.


If it's obviously generated with AI, or if it's like an animated series or something like that, you don't actually have to disclose that it was made with AI.


Just anything that looks realistic, that people may not know is AI, you have to label as AI. But as AI gets better, how is YouTube going to know that it was AI if you didn't disclose it?


I guess it's an honor system kind of thing, but that's a new rule on YouTube.


And I'm sure they'll get that figured out over time and how to better enforce that.


But just some interesting news for people that are creating YouTube videos.


Jumping back to robotics real quick.


I came across this on Reddit.


Unitree created the first humanoid robot to do a backflip without any hydraulics because obviously this is exactly what we need our robots to do.


I mean, it is pretty cool looking, but I don't know the practical use cases other than possibly just for entertainment purposes.


And finally there is a new AI gadget coming out from Open Interpreter called the Open Interpreter 01.


It's basically this little circle device that you hold in your hand.


And when you're at a computer, you train it to do stuff like, go check the weather for me or send a message to somebody or respond to emails or send a Slack message.


You train it on your computer, how to do stuff.


And then later on, when you're away from your computer, you say send a message in Slack that says whatever into this little circle device.


And because you already pre-trained it, it knows how to go do those steps.


To me, it looks like it's designed to compete with the Rabbit R1.

It does a lot of similar stuff where you train it.


Once it's trained, it knows how to do that stuff in the future.


However, the big difference for this one is that it is completely open source.


The code and even the plans, the CAD designs for it are all free and open source.


You can build the device.


You can program it in the same way.


It's this cool little AI device that you can actually use right now, or you can preorder one from Open Interpreter for 99 bucks.


But if you want to try to build one yourself, all the plans are available and open source to build one.


It's pretty cool.


I will link up to their Twitter demo video.


It's an eight and a half minute demo.


I didn't want to just run the whole thing in this video and turn this video into a crazy long video, although it's already getting pretty long, but it does look pretty cool.


And I highly recommend watching the demo if you're interested in AI enabled devices that you can kind of take with you anywhere that will control things back at your computer.


I do think eventually stuff like this will just be an app on your phone, but that's just my speculation.


Those are my thoughts on that.


That's all I got for you today.


I hope you enjoyed this video.


There was a lot that happened this week.


I probably even missed some of it while I was over at NVIDIA's GTC conference.


I'm considering making a full breakdown of the GTC conference, if you're interested.


I feel like I covered the biggest highlights from the conference in this video.


But if you want kind of that behind-the-scenes, "what the experience was like" video, sort of like what I did for the CES event and the Augmented World Expo event, I might make one for NVIDIA GTC as well, if there seems to be interest.

But what I've learned about this channel is a lot of you guys just want that high level news.


Like, you just want that "here's everything you need to know for this week" in the quickest possible way.


And that's what I'm trying to provide for you.


But I do also love making those deep dive videos for you.


